Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130

2010-01-04 Thread Ross Walker
On Sun, Jan 3, 2010 at 1:59 AM, Brent Jones  wrote:
> On Wed, Dec 30, 2009 at 9:35 PM, Ross Walker  wrote:
>> On Dec 30, 2009, at 11:55 PM, "Steffen Plotner" 
>> wrote:
>>
>> Hello,
>>
>> I was doing performance testing, validating zvol performance in
>> particularly, and found that zvol write performance to be slow ~35-44MB/s at
>> 1MB blocksize writes. I then tested the underlying zfs file system with the
>> same test and got 121MB/s.  Is there any way to fix this? I really would
>> like to have compatible performance between the zfs filesystem and the zfs
>> zvols.
>>
>> Been there.
>> ZVOLs were changed a while ago to make each operation synchronous so to
>> provide data consistency in the event of a system crash or power outage,
>> particularly when used as backing stores for iscsitgt or comstar.
>> While I think that the change is necessary I think they should have made the
>> cooked 'dsk' device node run with caching enabled to provide an alternative
>> for those willing to take the risk, or modify iscsitgt/comstar to issue a
>> sync after every write if write-caching is enabled on the backing device and
>> the user doesn't want to write cache, or advertise WCE on the mode page to
>> the initiators and let them sync.
>> I also believe performance can be better. When using zvols with iscsitgt and
>> comstar I was unable to break 30MB/s with 4k sequential read workload to a
>> zvol with a 128k recordsize (recommended for sequential IO), not very good.
>> To the same hardware running Linux and iSCSI Enterprise Target I was able to
>> drive over 50MB/s with the same workload. This isn't writes, just reads. I
>> was able to do somewhat better going to the physical device with iscsitgt
>> and comstar, but not as good as Linux, so I kept on using Linux for iSCSI
>> and Solaris for NFS which performed better.
>>
>
> I also noticed that using ZVOLS instead of files, for 20MB/sec read
> I/O, I saw as many as 900 iops to the disks themselves.
> When using file based luns to Comstar, doing 20MB/sec read I/O will
> just issue a couple hundred iops.
> It seemed that, to get decent performance, I had to either
> throw away my X4540s and switch to 7000s with expensive SSDs, or
> switch to file-based Comstar LUNs and disable the ZIL  :(
>
> Sad when a $50k piece of equipment requires such sacrifice.

Yes, the ZVOLs seem to issue each recordsize IO synchronously without
providing any way for the IOs to coalesce. If the IO scheduler
underneath or even the HBA driver could give it 40-100us between
flushes to allow a couple more IOs to merge then it would make a world
of difference.
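
As a rough back-of-the-envelope check on the figures Brent quotes above,
the average size of each disk I/O can be estimated by dividing throughput
by IOPS. This is only a sketch: the helper name avg_io_kib is made up for
illustration, 1 MB is taken as 2^20 bytes, and "a couple hundred" is
assumed to mean 200.

```shell
#!/bin/sh
# Hypothetical helper: average I/O size in KiB implied by a throughput
# figure (MB/s, with 1 MB = 2^20 bytes) and an IOPS figure.
avg_io_kib() {
    awk -v mbps="$1" -v iops="$2" 'BEGIN { printf "%.1f\n", mbps * 1024 / iops }'
}

avg_io_kib 20 900   # zvol-backed LUN: prints 22.8
avg_io_kib 20 200   # file-backed LUN, assuming 200 IOPS: prints 102.4
```

On those assumptions the zvol-backed LUN averages roughly 23 KiB per disk
read against roughly 100 KiB for the file-backed LUN, which would be
consistent with little or no merging on the zvol path.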

Even better, have iscsitgt/comstar manage the cache: provide async
writes with cache flush support when write cache is enabled, and
provide FUA support using fully synchronous IO. If the admin wants
write caching, they enable it in the iSCSI target and let the
initiator manage its own data integrity needs. If they don't, they
disable it in the target and all IO is synchronous, as it is now.

This means providing both an async and sync interface to ZVOLs with
cache flushing capabilities and modifying the software to use it as
appropriate.
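
To illustrate the difference between the two write modes, here is a
minimal sketch that can be run against an ordinary scratch file. It is
not a zvol test and says nothing about how iscsitgt/comstar behave; it
assumes GNU dd (for oflag=dsync and conv=fsync), and the function name
write_test is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write 4 MiB in 64 KiB chunks to a scratch file.
#   sync  -> every write is synchronous (GNU dd oflag=dsync)
#   async -> writes are cached, with a single flush at the end (conv=fsync)
write_test() {
    mode=$1
    target=$2
    case "$mode" in
        sync)  dd if=/dev/zero of="$target" bs=64k count=64 oflag=dsync 2>/dev/null ;;
        async) dd if=/dev/zero of="$target" bs=64k count=64 conv=fsync 2>/dev/null ;;
        *)     echo "usage: write_test sync|async FILE" >&2; return 2 ;;
    esac
}

# Example run against a temporary directory, cleaned up afterwards.
scratch=$(mktemp -d)
write_test sync  "$scratch/sync.img"
write_test async "$scratch/async.img"
rm -rf "$scratch"
```

Timing the two calls (for example with the bash "time" keyword) shows how
much a flush on every write costs on a given device. Both modes leave the
data on stable storage once dd returns. What differs is when the flushes
happen.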

-Ross
___
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130

2010-01-02 Thread Brent Jones
On Wed, Dec 30, 2009 at 9:35 PM, Ross Walker  wrote:
> On Dec 30, 2009, at 11:55 PM, "Steffen Plotner" 
> wrote:
>
> Hello,
>
> I was doing performance testing, validating zvol performance in
> particularly, and found that zvol write performance to be slow ~35-44MB/s at
> 1MB blocksize writes. I then tested the underlying zfs file system with the
> same test and got 121MB/s.  Is there any way to fix this? I really would
> like to have compatible performance between the zfs filesystem and the zfs
> zvols.
>
> Been there.
> ZVOLs were changed a while ago to make each operation synchronous so to
> provide data consistency in the event of a system crash or power outage,
> particularly when used as backing stores for iscsitgt or comstar.
> While I think that the change is necessary I think they should have made the
> cooked 'dsk' device node run with caching enabled to provide an alternative
> for those willing to take the risk, or modify iscsitgt/comstar to issue a
> sync after every write if write-caching is enabled on the backing device and
> the user doesn't want to write cache, or advertise WCE on the mode page to
> the initiators and let them sync.
> I also believe performance can be better. When using zvols with iscsitgt and
> comstar I was unable to break 30MB/s with 4k sequential read workload to a
> zvol with a 128k recordsize (recommended for sequential IO), not very good.
> To the same hardware running Linux and iSCSI Enterprise Target I was able to
> drive over 50MB/s with the same workload. This isn't writes, just reads. I
> was able to do somewhat better going to the physical device with iscsitgt
> and comstar, but not as good as Linux, so I kept on using Linux for iSCSI
> and Solaris for NFS which performed better.
> -Ross
>

I also noticed that using ZVOLS instead of files, for 20MB/sec read
I/O, I saw as many as 900 iops to the disks themselves.
When using file based luns to Comstar, doing 20MB/sec read I/O will
just issue a couple hundred iops.
It seemed that, to get decent performance, I had to either
throw away my X4540s and switch to 7000s with expensive SSDs, or
switch to file-based Comstar LUNs and disable the ZIL  :(

Sad when a $50k piece of equipment requires such sacrifice.

-- 
Brent Jones
[email protected]


Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130

2010-01-02 Thread Steffen Plotner
-----Original Message-----
From: Ross Walker [mailto:[email protected]]
Sent: Thu 12/31/2009 12:35 AM
To: Steffen Plotner
Cc: 
Subject: Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130
 
Been there.

ZVOLs were changed a while ago to make each operation synchronous so to provide 
data consistency in the event of a system crash or power outage, particularly 
when used as backing stores for iscsitgt or comstar.

While I think that the change is necessary I think they should have made the 
cooked 'dsk' device node run with caching enabled to provide an alternative for 
those willing to take the risk, or modify iscsitgt/comstar to issue a sync 
after every write if write-caching is enabled on the backing device and the 
user doesn't want to write cache, or advertise WCE on the mode page to the 
initiators and let them sync. 

I also believe performance can be better. When using zvols with iscsitgt and 
comstar I was unable to break 30MB/s with 4k sequential read workload to a zvol 
with a 128k recordsize (recommended for sequential IO), not very good. To the 
same hardware running Linux and iSCSI Enterprise Target I was able to drive 
over 50MB/s with the same workload. This isn't writes, just reads. I was able 
to do somewhat better going to the physical device with iscsitgt and comstar, 
but not as good as Linux, so I kept on using Linux for iSCSI and Solaris for 
NFS which performed better.

-Ross
 

Thank you for the information. I guess the grass is not always greener on the
other side. I currently run Linux IET+LVM and was looking for improved snapshot
capabilities. Comstar is extremely well engineered from a SCSI/iSCSI/FC
perspective. It is sad to see that ZVOLs have such a performance issue. I have
tried changing the WCE setting in the Comstar LU and it barely made a
difference.

Steffen


Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130

2009-12-30 Thread Richard Elling


On Dec 30, 2009, at 9:35 PM, Ross Walker wrote:

On Dec 30, 2009, at 11:55 PM, "Steffen Plotner"  
 wrote:



Hello,

I was doing performance testing, validating zvol performance in  
particularly, and found that zvol write performance to be slow  
~35-44MB/s at 1MB blocksize writes. I then tested the underlying  
zfs file system with the same test and got 121MB/s.  Is there any  
way to fix this? I really would like to have compatible performance  
between the zfs filesystem and the zfs zvols.



Been there.

ZVOLs were changed a while ago to make each operation synchronous so  
to provide data consistency in the event of a system crash or power  
outage, particularly when used as backing stores for iscsitgt or  
comstar.


While I think that the change is necessary I think they should have  
made the cooked 'dsk' device node run with caching enabled to  
provide an alternative for those willing to take the risk, or modify  
iscsitgt/comstar to issue a sync after every write if write-caching  
is enabled on the backing device and the user doesn't want to write  
cache, or advertise WCE on the mode page to the initiators and let  
them sync.


CR 6794730, need zvol support for DKIOCSETWCE and friends, was
integrated into b113.  Unfortunately, OpenSolaris 2009.06 is b111, where
zvol performance will stink.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6794730

This still requires that the client implements WCE (or WCD, as some
developers like double negatives :-( ). This is optional for Solaris
iSCSI clients and, IIRC, the default has changed over time.  See the
above CR for more info.
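
A quick way to tell which side of b113 a given box is on is to compare
its build string against that number. The sketch below is illustrative
only: the function name has_zvol_wce is made up, and it assumes the
build string has the form "snv_NNN" (as in the "snv_130" reported by
uname earlier in this thread), optionally with a trailing letter.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the build string (e.g. "snv_130")
# is at or after b113, the build this message names for CR 6794730.
has_zvol_wce() {
    build=${1#snv_}            # strip the "snv_" prefix
    build=${build%%[a-z]*}     # tolerate a trailing letter suffix
    case "$build" in
        ''|*[!0-9]*) return 2 ;;   # not a recognisable build number
    esac
    [ "$build" -ge 113 ]
}

for b in snv_111 snv_113 snv_130; do
    if has_zvol_wce "$b"; then
        echo "$b: at or after b113"
    else
        echo "$b: before b113"
    fi
done
```

On a live system the argument would typically come from "uname -v".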
 -- richard



Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130

2009-12-30 Thread Ross Walker
On Dec 30, 2009, at 11:55 PM, "Steffen Plotner"  
 wrote:



Hello,

I was doing performance testing, validating zvol performance in  
particularly, and found that zvol write performance to be slow  
~35-44MB/s at 1MB blocksize writes. I then tested the underlying zfs  
file system with the same test and got 121MB/s.  Is there any way to  
fix this? I really would like to have compatible performance between  
the zfs filesystem and the zfs zvols.



Been there.

ZVOLs were changed a while ago to make each operation synchronous so
as to provide data consistency in the event of a system crash or power
outage, particularly when used as backing stores for iscsitgt or
comstar.


While I think that the change is necessary, I think they should have
made the cooked 'dsk' device node run with caching enabled to provide
an alternative for those willing to take the risk, or modify
iscsitgt/comstar to issue a sync after every write if write-caching is
enabled on the backing device and the user doesn't want to write cache,
or advertise WCE on the mode page to the initiators and let them sync.


I also believe performance can be better. When using zvols with
iscsitgt and comstar I was unable to break 30MB/s with a 4k sequential
read workload to a zvol with a 128k recordsize (recommended for
sequential IO), which is not very good. Against the same hardware
running Linux and iSCSI Enterprise Target I was able to drive over
50MB/s with the same workload. This isn't writes, just reads. I was
able to do somewhat better going to the physical device with iscsitgt
and comstar, but not as well as with Linux, so I kept on using Linux
for iSCSI and Solaris for NFS, which performed better.


-Ross


Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130

2009-12-30 Thread Brent Jones
On Wed, Dec 30, 2009 at 8:55 PM, Steffen Plotner  wrote:
> Hello,
>
> I was doing performance testing, validating zvol performance in
> particularly, and found that zvol write performance to be slow ~35-44MB/s at
> 1MB blocksize writes. I then tested the underlying zfs file system with the
> same test and got 121MB/s.  Is there any way to fix this? I really would
> like to have compatible performance between the zfs filesystem and the zfs
> zvols.
>
> # first test is a file test at the root of the zpool vg_satabeast8_vol0
> dd if=/dev/zero of=/vg_satabeast8_vol0/testing bs=1M count=32768
> 32768+0 records in
> 32768+0 records out
> 34359738368 bytes (34 GB) copied, 285.037 s, 121 MB/s
>
> # create zvol
> zfs create -V 100G -b 4k vg_satabeast8_vol0/lv_test
>
> # test zvol with 'dsk' device
>> dd if=/dev/zero of=/dev/zvol/dsk/vg_satabeast8_vol0/lv_test bs=1M
>> count=32768
> 32768+0 records in
> 32768+0 records out
> 34359738368 bytes (34 GB) copied, 981.219 s, 35.0 MB/s
>
> # test zvol with 'rdsk' device (results are better than 'dsk', however, not
> as good as a regular file)
> dd if=/dev/zero of=/dev/zvol/rdsk/vg_satabeast8_vol0/lv_test bs=1M
> count=32768
> 32768+0 records in
> 32768+0 records out
> 34359738368 bytes (34 GB) copied, 766.247 s, 44.8 MB/s
>
>
>>uname -a
> SunOS zfs-debug-node 5.11 snv_130 i86pc i386 i86pc Solaris
>
> I believe this problem is affecting performance tests others are doing with
> Comstar and exported zvol logical units.
>
> Steffen
> ___
> Steffen Plotner    Amherst College    Tel
> (413) 542-2348
> Systems/Network Administrator/Programmer   PO BOX 5000    Fax
> (413) 542-2626
> Systems & Networking   Amherst, MA 01002-5000
> [email protected]
>
>

Why did you make the ZFS file system have 4k blocks?
I'd let ZFS manage that for you; the default, I believe, is 128K.
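
For a sense of scale on the block-size question: the zvol in the quoted
test was created with -b 4k, while dd writes 1M at a time, so each write
spans many volume blocks. A quick sketch of the arithmetic (the helper
name blocks_per_write is hypothetical, and sizes are given in bytes):

```shell
#!/bin/sh
# Hypothetical helper: number of volume blocks covered by one write.
#   $1 = write size in bytes, $2 = block size in bytes
blocks_per_write() {
    echo $(( $1 / $2 ))
}

blocks_per_write 1048576 4096     # 1M write, 4k blocks   -> 256
blocks_per_write 1048576 8192     # 1M write, 8k blocks   -> 128
blocks_per_write 1048576 131072   # 1M write, 128k blocks -> 8
```

Whether a larger block size actually closes the gap on this hardware
would still need to be measured by rerunning the same dd test.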

-- 
Brent Jones
[email protected]