On Dec 27, 2012, at 3:39 PM, George Herman <[email protected]> wrote:
> On Dec 22, 2012, at 4:07 PM, Richard Elling <[email protected]> wrote:
>
>> Hi George,
>>
>> On Dec 19, 2012, at 6:52 PM, George Herman <[email protected]> wrote:
>>
>>> I'm observing some strange behavior that I hope someone might be able to
>>> explain...
>>>
>>> I run the following script:
>>>
>>> zpool create zpool c0t5000C50041ABABABd0
>>> zfs create -V 200G -b 4k zpool/zvol
>>> zfs set logbias=throughput zpool/zvol
>>
>> This has no effect unless you have a slog.
>
> I understood that if the logbias property is set to 'throughput', then
> intent log blocks will be allocated from the main pool instead of any
> separate intent log devices. The data will be written immediately to
> spread the write load, thus making for quicker subsequent transaction
> group commits to the pool. Is this not correct?

You are correct, but if you have no slog, the ZIL is in the pool and
logbias is ignored.

>>> zfs set primarycache=none zpool/zvol
>>> zfs set secondarycache=none zpool/zvol
>>
>> primarycache and secondarycache affect reads, not writes.
>
> Understood, I'm doing a test of both reads and writes.
>
>>> ./zvol_wce /dev/zvol/rdsk/zpool/zvol 1
>>
>> This enables write caching, so we then expect to not use the ZIL unless
>> the application issues a SYNCHRONIZE_CACHE-like command.
>
> What enabling write caching does seems to be an area that is
> misunderstood... or at least misunderstood by me. I understood that if it
> was disabled (the default), every I/O operation to the disk was followed
> by a synchronize cache operation. With it on, I understood that this no
> longer happened for async operations, but for applications that specified
> sync transactions the synchronize cache operation would still happen as
> usual. Turning it on evidently has much more impact than this. I'm being
> told that it impacts synchronous operations, in that it treats all sync
> operations as async. If this is indeed the case, is there any way to turn
> off synchronize cache for async operations only?

SYNCHRONIZE_CACHE is a SCSI command and has nothing to do with write cache
enablement. The short answer is that disabling the write cache on a zvol
causes the write I/O to be of the synchronous variety (similar to NFS
synchronous writes) and thus use the ZIL. For the default write cache
enabled configuration, the writes will be flushed to the ZIL upon demand.
In the case of COMSTAR, an initiator issuing SYNCHRONIZE_CACHE to the
target will cause a cache flush to the ZIL.
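For anyone who doesn't have the zvol_wce program at hand, a minimal sketch
of what such a tool might look like, assuming the illumos
DKIOCGETWCE/DKIOCSETWCE ioctls from <sys/dkio.h>. This is a reconstruction,
not necessarily the program from the blog:

    /*
     * zvol_wce.c -- toggle the write cache enable (WCE) bit on a zvol.
     * Sketch only; assumes the DKIOCGETWCE/DKIOCSETWCE ioctls, which the
     * illumos zvol driver honors.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/dkio.h>

    int
    main(int argc, char **argv)
    {
            int fd, wce;

            if (argc != 3) {
                    (void) fprintf(stderr, "usage: %s <raw-device> 0|1\n",
                        argv[0]);
                    return (1);
            }
            wce = atoi(argv[2]);

            if ((fd = open(argv[1], O_RDWR)) < 0) {
                    perror("open");
                    return (1);
            }
            /* set the write cache enable bit */
            if (ioctl(fd, DKIOCSETWCE, &wce) < 0) {
                    perror("DKIOCSETWCE");
                    (void) close(fd);
                    return (1);
            }
            /* read it back to confirm */
            if (ioctl(fd, DKIOCGETWCE, &wce) == 0)
                    (void) printf("write cache %s\n",
                        wce ? "enabled" : "disabled");
            (void) close(fd);
            return (0);
    }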
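To see the effect at runtime, the same approach Richard suggests below for
the write throttle can be pointed at the ZIL: zil_commit() is the entry
point for synchronous-write flushes, so with the write cache disabled this
should tick over constantly, and with it enabled it should mostly go quiet
(a one-liner sketch):

    dtrace -n 'fbt:zfs:zil_commit:entry { @commits = count(); }
        tick-1s { printa(@commits); clear(@commits); }'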
>>
>>> vdbench -f Zvol_var
>>>
>>> I'm using the zvol_wce program that is described on Milak's blog to
>>> enable write caching. (I understand that it disables the synchronize
>>> cache operation on each write.)
>>> I turn off IO throttling (via 'echo zfs_no_write_throttle/w1 |mdb -kw').
>>
>> The write throttle is a little tricky to manage. For a small RAM machine,
>> it will behave very differently than a large RAM machine. Similarly, for
>> slow disks, it will behave differently. Have you tried to measure its use
>> prior to disabling? I left some scripts to do that at Nexenta, but for
>> the benefit of the community, try
>>
>> dtrace -n 'fbt:zfs:txg_delay:entry {printf("%Y\n", walltimestamp)}'
>>
>> If that doesn't tick over, then you aren't seeing any write throttle
>> requests. NB, at the last illumos hackathon, Basil was working on turning
>> this into a kstat counter :-)

> It's the combination of both that is causing the confusing behavior. If
> what I've described above is indeed true, I understand why I'm seeing the
> strange behavior. As for measurements, I'm seeing the following:
>
> WCE disabled and IO throttling on (default): <10 MB/sec
> WCE enabled and IO throttling on: ~130 MB/sec
> WCE enabled and IO throttling off: ~230 MB/sec
>
> I did testing of the drive as a raw device, and I get a maximum transfer
> rate of 145 MB/sec... the specs say that it is 147 MB/sec.

The joys of benchmarking :-)
 -- richard

> -George
>
>>> I run write sequential operations using xfersize=4k and specify o_dsync
>>> and o_rsync in vdbench.
>>>
>>> I'm getting 200+ MB/sec reported as the throughput (via vdbench) when
>>> the drive can only do 147 MB/sec.
>>
>> Sure, you're caching.
>>
>>> Another observation... when I control-C the program, it takes 10+ secs
>>> to terminate, and there continues to be I/O to the drive during the
>>> termination.
>>
>> Sure, you're caching.
>>
>>> (Seen via iostat.) When either IO throttling is disabled or I don't turn
>>> on WCE, vdbench reports <10 MB/sec and the program will terminate
>>> normally.
>>
>> That is 2,500 IOPS @ 4k, not bad for a disk that can do maybe 80.
>>  -- richard

--
[email protected]
+1-760-896-4422
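For anyone trying to reproduce this: the Zvol_var parameter file itself
wasn't posted, but a rough sketch of a vdbench config matching the
description above might look like the following. The sd/wd/rd names,
elapsed time, and interval are assumptions; only the device path,
xfersize=4k, sequential writes, and the o_dsync/o_rsync open flags come
from the thread:

    * Zvol_var (sketch): 4k sequential synchronous writes to the zvol
    sd=sd1,lun=/dev/zvol/rdsk/zpool/zvol,openflags=(o_dsync,o_rsync)
    * seekpct=0 gives sequential I/O; rdpct=0 gives pure writes
    wd=wd1,sd=sd1,xfersize=4k,seekpct=0,rdpct=0
    rd=rd1,wd=wd1,iorate=max,elapsed=60,interval=1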
