Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130
On Sun, Jan 3, 2010 at 1:59 AM, Brent Jones wrote:
> On Wed, Dec 30, 2009 at 9:35 PM, Ross Walker wrote:
>> On Dec 30, 2009, at 11:55 PM, "Steffen Plotner" wrote:
>>
>>> Hello,
>>>
>>> I was doing performance testing, validating zvol performance in
>>> particular, and found zvol write performance to be slow, ~35-44MB/s
>>> at 1MB blocksize writes. I then tested the underlying zfs file
>>> system with the same test and got 121MB/s. Is there any way to fix
>>> this? I really would like to have comparable performance between
>>> the zfs filesystem and the zfs zvols.
>>
>> Been there.
>>
>> ZVOLs were changed a while ago to make each operation synchronous so
>> as to provide data consistency in the event of a system crash or
>> power outage, particularly when used as backing stores for iscsitgt
>> or comstar.
>>
>> While I think that the change is necessary, I think they should have
>> made the cooked 'dsk' device node run with caching enabled to provide
>> an alternative for those willing to take the risk, or modify
>> iscsitgt/comstar to issue a sync after every write if write-caching
>> is enabled on the backing device and the user doesn't want to write
>> cache, or advertise WCE on the mode page to the initiators and let
>> them sync.
>>
>> I also believe performance can be better. When using zvols with
>> iscsitgt and comstar I was unable to break 30MB/s with a 4k
>> sequential read workload to a zvol with a 128k recordsize
>> (recommended for sequential IO), not very good. To the same hardware
>> running Linux and iSCSI Enterprise Target I was able to drive over
>> 50MB/s with the same workload. This isn't writes, just reads. I was
>> able to do somewhat better going to the physical device with iscsitgt
>> and comstar, but not as good as Linux, so I kept on using Linux for
>> iSCSI and Solaris for NFS, which performed better.
>
> I also noticed that using ZVOLs instead of files, for 20MB/sec read
> I/O, I saw as many as 900 iops to the disks themselves.
> When using file-based LUNs to Comstar, doing 20MB/sec read I/O will
> just issue a couple hundred iops.
> It seemed that to get decent performance, I was required to either
> throw away my X4540's and switch to 7000's with expensive SSDs, or
> switch to file-based Comstar LUNs and disable the ZIL :(
>
> Sad when a $50k piece of equipment requires such sacrifice.

Yes, the ZVOLs seem to issue each recordsize IO synchronously without
providing any way for the IOs to coalesce. If the IO scheduler
underneath, or even the HBA driver, could give it 40-100us between
flushes to allow a couple more IOs to merge, it would make a world of
difference.

Even better, have iscsitgt/comstar manage the cache: provide async
writes with cache flush support when write cache is enabled, and
provide FUA support utilizing fully synchronous IO. If the admin wants
write caching, they enable it in the iSCSI target, letting the
initiator manage its own data integrity needs; if they don't, they
disable it in the target and all IO is synchronous, as it is now.

This means providing both an async and a sync interface to ZVOLs with
cache flushing capabilities, and modifying the software to use it as
appropriate.

-Ross

___
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
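One way to read the IOPS figures in this exchange: at a fixed 20MB/sec, the IOPS count implies the average size of each IO reaching the disks, which is what coalescing would change. The sketch below is only illustrative arithmetic. The helper name avg_io_kb is made up here, MB is taken as decimal, and 200 is an assumed stand-in for "a couple hundred".

```shell
#!/bin/sh
# avg_io_kb MB_PER_SEC IOPS
# Average IO size in KB (decimal units) implied by a throughput and an
# IOPS count. Hypothetical helper, not part of any ZFS or COMSTAR tool.
avg_io_kb() {
    awk -v mb="$1" -v iops="$2" 'BEGIN { printf "%.1f\n", mb * 1000 / iops }'
}

avg_io_kb 20 900   # zvol-backed LUN, 900 iops as reported
avg_io_kb 20 200   # file-backed LUN; 200 is an assumed value
```

Under those assumptions the zvol case works out to roughly 22 KB per disk IO against roughly 100 KB for the file-backed case, which fits the description of small IOs being issued without a chance to merge.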
Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130
On Wed, Dec 30, 2009 at 9:35 PM, Ross Walker wrote:
> On Dec 30, 2009, at 11:55 PM, "Steffen Plotner" wrote:
>
>> Hello,
>>
>> I was doing performance testing, validating zvol performance in
>> particular, and found zvol write performance to be slow, ~35-44MB/s
>> at 1MB blocksize writes. I then tested the underlying zfs file system
>> with the same test and got 121MB/s. Is there any way to fix this? I
>> really would like to have comparable performance between the zfs
>> filesystem and the zfs zvols.
>
> Been there.
>
> ZVOLs were changed a while ago to make each operation synchronous so
> as to provide data consistency in the event of a system crash or power
> outage, particularly when used as backing stores for iscsitgt or
> comstar.
>
> While I think that the change is necessary, I think they should have
> made the cooked 'dsk' device node run with caching enabled to provide
> an alternative for those willing to take the risk, or modify
> iscsitgt/comstar to issue a sync after every write if write-caching is
> enabled on the backing device and the user doesn't want to write
> cache, or advertise WCE on the mode page to the initiators and let
> them sync.
>
> I also believe performance can be better. When using zvols with
> iscsitgt and comstar I was unable to break 30MB/s with a 4k sequential
> read workload to a zvol with a 128k recordsize (recommended for
> sequential IO), not very good. To the same hardware running Linux and
> iSCSI Enterprise Target I was able to drive over 50MB/s with the same
> workload. This isn't writes, just reads. I was able to do somewhat
> better going to the physical device with iscsitgt and comstar, but not
> as good as Linux, so I kept on using Linux for iSCSI and Solaris for
> NFS, which performed better.
>
> -Ross

I also noticed that using ZVOLs instead of files, for 20MB/sec read
I/O, I saw as many as 900 iops to the disks themselves.
When using file-based LUNs to Comstar, doing 20MB/sec read I/O will
just issue a couple hundred iops.
It seemed that to get decent performance, I was required to either
throw away my X4540's and switch to 7000's with expensive SSDs, or
switch to file-based Comstar LUNs and disable the ZIL :(

Sad when a $50k piece of equipment requires such sacrifice.

--
Brent Jones
[email protected]
Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130
-----Original Message-----
From: Ross Walker [mailto:[email protected]]
Sent: Thu 12/31/2009 12:35 AM
To: Steffen Plotner
Cc:
Subject: Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130

> Been there.
>
> ZVOLs were changed a while ago to make each operation synchronous so
> as to provide data consistency in the event of a system crash or power
> outage, particularly when used as backing stores for iscsitgt or
> comstar.
>
> While I think that the change is necessary, I think they should have
> made the cooked 'dsk' device node run with caching enabled to provide
> an alternative for those willing to take the risk, or modify
> iscsitgt/comstar to issue a sync after every write if write-caching is
> enabled on the backing device and the user doesn't want to write
> cache, or advertise WCE on the mode page to the initiators and let
> them sync.
>
> I also believe performance can be better. When using zvols with
> iscsitgt and comstar I was unable to break 30MB/s with a 4k sequential
> read workload to a zvol with a 128k recordsize (recommended for
> sequential IO), not very good. To the same hardware running Linux and
> iSCSI Enterprise Target I was able to drive over 50MB/s with the same
> workload. This isn't writes, just reads. I was able to do somewhat
> better going to the physical device with iscsitgt and comstar, but not
> as good as Linux, so I kept on using Linux for iSCSI and Solaris for
> NFS, which performed better.
>
> -Ross

Thank you for the information. I guess the grass is not always greener
on the other side. I currently run Linux IET+LVM and was looking for
improved snapshot capabilities. Comstar is extremely well engineered
from a scsi/iscsi/fc perspective. It is sad to see that ZVOLs have such
a performance issue. I have tried changing the WCE setting in the
comstar LU and it made barely a difference.

Steffen
Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130
On Dec 30, 2009, at 9:35 PM, Ross Walker wrote:
> On Dec 30, 2009, at 11:55 PM, "Steffen Plotner" wrote:
>
>> Hello,
>>
>> I was doing performance testing, validating zvol performance in
>> particular, and found zvol write performance to be slow, ~35-44MB/s
>> at 1MB blocksize writes. I then tested the underlying zfs file system
>> with the same test and got 121MB/s. Is there any way to fix this? I
>> really would like to have comparable performance between the zfs
>> filesystem and the zfs zvols.
>
> Been there.
>
> ZVOLs were changed a while ago to make each operation synchronous so
> as to provide data consistency in the event of a system crash or power
> outage, particularly when used as backing stores for iscsitgt or
> comstar.
>
> While I think that the change is necessary, I think they should have
> made the cooked 'dsk' device node run with caching enabled to provide
> an alternative for those willing to take the risk, or modify
> iscsitgt/comstar to issue a sync after every write if write-caching is
> enabled on the backing device and the user doesn't want to write
> cache, or advertise WCE on the mode page to the initiators and let
> them sync.

CR 6794730, "need zvol support for DKIOCSETWCE and friends", was
integrated into b113. Unfortunately, OpenSolaris 2009.06 is b111, where
zvol performance will stink.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6794730

This still requires that the client implements WCE (or WCD, as some
developers like double negatives :-(. This is optional for Solaris
iSCSI clients and, IIRC, the default has changed over time. See the
above CR for more info.

 -- richard
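The build numbers in this message lend themselves to a quick check of whether a given build is at or past b113. The following is a minimal sketch, assuming the build string has the snv_NNN form seen in the uname output from the original post. The function name has_zvol_wce is hypothetical, and the check says nothing about whether the initiator side actually uses WCE.

```shell
#!/bin/sh
# has_zvol_wce BUILD
# Succeeds if BUILD (e.g. "snv_130") is at or past b113, the build in
# which CR 6794730 was integrated according to the message above.
# Hypothetical helper; assumes the "snv_NNN" naming convention.
has_zvol_wce() {
    build=${1#snv_}          # strip the "snv_" prefix, leaving the number
    [ "$build" -ge 113 ]
}

for b in snv_111 snv_113 snv_130; do
    if has_zvol_wce "$b"; then
        echo "$b: at or past b113"
    else
        echo "$b: predates b113"
    fi
done
```

Note that snv_130, the build used for the tests in this thread, passes the check, so the presence of the CR on its own would not account for the numbers reported there.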
Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130
On Dec 30, 2009, at 11:55 PM, "Steffen Plotner" wrote:

> Hello,
>
> I was doing performance testing, validating zvol performance in
> particular, and found zvol write performance to be slow, ~35-44MB/s at
> 1MB blocksize writes. I then tested the underlying zfs file system
> with the same test and got 121MB/s. Is there any way to fix this? I
> really would like to have comparable performance between the zfs
> filesystem and the zfs zvols.

Been there.

ZVOLs were changed a while ago to make each operation synchronous so as
to provide data consistency in the event of a system crash or power
outage, particularly when used as backing stores for iscsitgt or
comstar.

While I think that the change is necessary, I think they should have
made the cooked 'dsk' device node run with caching enabled to provide
an alternative for those willing to take the risk, or modify
iscsitgt/comstar to issue a sync after every write if write-caching is
enabled on the backing device and the user doesn't want to write cache,
or advertise WCE on the mode page to the initiators and let them sync.

I also believe performance can be better. When using zvols with
iscsitgt and comstar I was unable to break 30MB/s with a 4k sequential
read workload to a zvol with a 128k recordsize (recommended for
sequential IO), not very good. To the same hardware running Linux and
iSCSI Enterprise Target I was able to drive over 50MB/s with the same
workload. This isn't writes, just reads. I was able to do somewhat
better going to the physical device with iscsitgt and comstar, but not
as good as Linux, so I kept on using Linux for iSCSI and Solaris for
NFS, which performed better.

-Ross
Re: [zfs-discuss] zvol (slow) vs file (fast) performance snv_130
On Wed, Dec 30, 2009 at 8:55 PM, Steffen Plotner wrote:
> Hello,
>
> I was doing performance testing, validating zvol performance in
> particular, and found zvol write performance to be slow, ~35-44MB/s at
> 1MB blocksize writes. I then tested the underlying zfs file system
> with the same test and got 121MB/s. Is there any way to fix this? I
> really would like to have comparable performance between the zfs
> filesystem and the zfs zvols.
>
> # first test is a file test at the root of the zpool vg_satabeast8_vol0
> dd if=/dev/zero of=/vg_satabeast8_vol0/testing bs=1M count=32768
> 32768+0 records in
> 32768+0 records out
> 34359738368 bytes (34 GB) copied, 285.037 s, 121 MB/s
>
> # create zvol
> zfs create -V 100G -b 4k vg_satabeast8_vol0/lv_test
>
> # test zvol with 'dsk' device
> dd if=/dev/zero of=/dev/zvol/dsk/vg_satabeast8_vol0/lv_test bs=1M count=32768
> 32768+0 records in
> 32768+0 records out
> 34359738368 bytes (34 GB) copied, 981.219 s, 35.0 MB/s
>
> # test zvol with 'rdsk' device (results are better than 'dsk',
> # however, not as good as a regular file)
> dd if=/dev/zero of=/dev/zvol/rdsk/vg_satabeast8_vol0/lv_test bs=1M count=32768
> 32768+0 records in
> 32768+0 records out
> 34359738368 bytes (34 GB) copied, 766.247 s, 44.8 MB/s
>
> uname -a
> SunOS zfs-debug-node 5.11 snv_130 i86pc i386 i86pc Solaris
>
> I believe this problem is affecting performance tests others are doing
> with Comstar and exported zvol logical units.
>
> Steffen
> ___
> Steffen Plotner                           Amherst College         Tel (413) 542-2348
> Systems/Network Administrator/Programmer  PO BOX 5000             Fax (413) 542-2626
> Systems & Networking                      Amherst, MA 01002-5000
> [email protected]

Why did you make the ZFS file system have 4k blocks?
I'd let ZFS manage that for you, which by default I believe is 128K.

--
Brent Jones
[email protected]
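For reference, the dd figures quoted above can be re-derived from the byte counts and elapsed times, and the effect of the block size question can be put in numbers. In zfs create, -b sets the volume block size (volblocksize) of the zvol. The sketch below is plain arithmetic only, and the helper names mbps and blocks_per_write are made up for illustration.

```shell
#!/bin/sh
# mbps BYTES SECONDS
# Throughput in decimal MB/s, the unit dd uses in the output above.
mbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# blocks_per_write WRITE_BYTES BLOCK_BYTES
# Number of volume blocks covered by a single aligned write.
blocks_per_write() {
    echo $(( $1 / $2 ))
}

mbps 34359738368 285.037          # file at the root of the pool
mbps 34359738368 981.219          # zvol through the 'dsk' node
mbps 34359738368 766.247          # zvol through the 'rdsk' node

blocks_per_write 1048576 4096     # one bs=1M write onto 4k blocks
blocks_per_write 1048576 131072   # the same write onto 128k blocks
```

Each 1M write from dd spans 256 blocks at the 4k setting used in the test, against 8 blocks at 128k. Whether that difference explains the gap between the zvol and file results is not something this arithmetic can show; rerunning the zvol test with a larger -b value would be the way to find out.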
