Re: [zfs-discuss] Looking for some hardware answers, maybe someone on this list could help
On Wed, Oct 15, 2008 at 9:13 PM, Al Hopper [EMAIL PROTECTED] wrote:
> The exception to the rule of multiple 12v output sections is PC Power & Cooling - who claim that there is no technical advantage to having multiple 12v outputs (and that this feature is only a marketing gimmick). But now that they have merged with OCZ - who always claimed that there are advantages to multiple 12v output sections ... I'm not sure where they stand today. In any case the PC Power & Cooling PSUs are premium, reliable, high-performance parts in my personal experience - although their Silencer products are far from silent in my experience! :)

It's good to have that vote of confidence, as I picked that brand. :)
Re: [zfs-discuss] Looking for some hardware answers, maybe someone on this list could help
On Wed, Oct 15, 2008 at 9:13 PM, Al Hopper [EMAIL PROTECTED] wrote:
> The exception to the rule of multiple 12v output sections is PC Power & Cooling - who claim that there is no technical advantage to having multiple 12v outputs [snip - full text quoted above]

Well, that depends: you can build a power supply with multiple isolated 12V rails. I would hope this is what they mean when they specify multiple 12V outputs with equal/different current/load ratings.
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
Well, obviously recovery scenarios need testing, but I still don't see it being that bad. My thinking on this is:

1. Loss of a server is very much the worst case scenario. Disk errors are much more likely, and with raid-z2 pools on the individual servers this should not pose a problem. I also would not expect to see disk failures downing an entire x4500. Sun have sold an awful lot of these now, enough for me to feel any such problems should be a thing of the past.

2. Even when a server does fail, the nature of ZFS is such that you would not expect to lose your data, nor should you be expecting to resilver the entire 28TB. A motherboard / backplane / PSU failure will offline that server, but once the faulted components are replaced your pool will come back online. Once the pool is online, ZFS has the ability to resilver just the changed data, meaning that your rebuild time will be simply proportional to the time the server was down.

Of course these failure modes would need testing, as would rebuild times. I don't see 'zfs send' performance being an issue though, not unless Gray has another 150TB of storage lying around that he's not telling us about. :-)

There are always going to be some tradeoffs between risk, capacity and price, but I expect that the benefits of this setup far outweigh the negatives.

Ross
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
Howdy! Very valuable advice here (and from Bob, who made similar comments - thanks, Bob!). I think, then, we'll generally stick to 128K recordsizes. In the case of databases, we'll stray as appropriate, and we may also stray with the HPC compute cluster if we can demonstrate that it is worth it.

To answer your questions below... Currently, we have a single pool, in a load share configuration (no raidz), that collects all the storage (which answers Ross' question too). From that we carve filesystems on demand. There are many more tests planned for that construction, though, so we are not married to it.

Redundancy abounds. ;) Since the pool doesn't employ raidz, it isn't internally redundant, but we plan to replicate the pool's data to an identical system (which is not yet built) at another site. Our initial userbase doesn't need the replication, however, because they use the system for little more than scratch space. Huge genomic datasets are dumped on the storage, analyzed, and the results (which are much smaller) get sent elsewhere. Everything is wiped out soon after that and the process starts again. Future projected uses of the storage, however, would be far less tolerant of loss, so I expect we'll want to reconfigure the pool in raidz.

I see that Archie and Miles have shared some harrowing concerns which we take very seriously. I don't think I'll be able to reply to them today, but I certainly will in the near future (particularly once we've completed some more of our induced failure scenarios).

Sidenote: Today we made eight network/iSCSI related tweaks that, in aggregate, have resulted in dramatic performance improvements (some I just hadn't gotten around to yet, others suggested by Sun's Mertol Ozyoney)...

- disabling the Nagle algorithm on the head node
- setting each iSCSI target block size to match the ZFS record size of 128K
- disabling thin provisioning on the iSCSI targets
- enabling jumbo frames everywhere (each switch and NIC)
- raising ddi_msix_alloc_limit to 8
- raising ip_soft_rings_cnt to 16
- raising tcp_deferred_acks_max to 16
- raising tcp_local_dacks_max to 16

Rerunning the same tests, we now see...

[1GB file size, 1KB record size]
Command: iozone -i 0 -i 1 -i 2 -r 1k -s 1g -f /data-das/perftest/1gbtest
Write: 143373
Rewrite: 183170
Read: 433205
Reread: 435503
Random Read: 90118
Random Write: 19488

[8GB file size, 512KB record size]
Command: iozone -i 0 -i 1 -i 2 -r 512k -s 8g -f /volumes/data-iscsi/perftest/8gbtest
Write: 463260
Rewrite: 449280
Read: 1092291
Reread: 881044
Random Read: 442565
Random Write: 565565

[64GB file size, 1MB record size]
Command: iozone -i 0 -i 1 -i 2 -r 1m -s 64g -f /data-das/perftest/64gbtest
Write: 357199
Rewrite: 342788
Read: 609553
Reread: 645618
Random Read: 218874
Random Write: 339624

Thanks so much to everyone for all their great contributions!
-Gray

On Thu, Oct 16, 2008 at 2:20 AM, Akhilesh Mritunjai [EMAIL PROTECTED] wrote:
> Hi Gray, You've got a nice setup going there, few comments:
> 1. Do not tune ZFS without a proven test-case to show otherwise, except...
> 2. For databases. Tune recordsize for that particular FS to match DB recordsize.
> Few questions...
> * How are you divvying up the space?
> * How are you taking care of redundancy?
> * Are you aware that each layer of ZFS needs its own redundancy?
> Since you have got a mixed use case here, I would be surprised if a general config would cover all, though it might do with some luck.

-- Gray Carper MSIS Technical Services University of Michigan Medical School [EMAIL PROTECTED] | skype: graycarper | 734.418.8506 http://www.umms.med.umich.edu/msis/
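[Editor's note: for readers wanting to reproduce the eight tweaks above, a rough sketch of how such settings are commonly applied on Solaris/OpenSolaris of this vintage. Parameter names are taken from Gray's list; the values and their safety are assumptions to validate on your own hardware. ndd settings do not survive a reboot; /etc/system entries require one.]

  # runtime TCP/IP tuning via ndd:
  ndd -set /dev/tcp tcp_naglim_def 1           # disable the Nagle algorithm
  ndd -set /dev/tcp tcp_deferred_acks_max 16
  ndd -set /dev/tcp tcp_local_dacks_max 16

  # persistent entries in /etc/system (take effect after reboot):
  set ddi_msix_alloc_limit = 8
  set ip:ip_soft_rings_cnt = 16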
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
Miles makes a good point here, you really need to look at how this copes with various failure modes. Based on my experience, iSCSI is something that may cause you problems. When I tested this kind of setup last year I found that the entire pool hung for 3 minutes any time an iSCSI volume went offline. It looked like a relatively simple thing to fix if you can recompile the iSCSI driver, and there is talk about making the timeout adjustable, but for me that was enough to put our project on hold for now. Ross
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
Oops - one thing I meant to mention: We only plan to cross-site replicate data for those folks who require it. The HPC data crunching would have no use for it, so that filesystem wouldn't be replicated. In reality, we only expect a select few users, with relatively small filesystems, to actually need replication. (Which begs the question: why build an identical 150TB system to support that? Good question. I think we'll reevaluate. ;) -Gray

On Thu, Oct 16, 2008 at 3:50 PM, Gray Carper [EMAIL PROTECTED] wrote:
> Howdy! Very valuable advice here (and from Bob, who made similar comments - thanks, Bob!). [snip - full text quoted above]

-- Gray Carper MSIS Technical Services University of Michigan Medical School [EMAIL PROTECTED] | skype: graycarper | 734.418.8506 http://www.umms.med.umich.edu/msis/
Re: [zfs-discuss] Tuning for a file server, disabling data cache (almost)
On 15 October, 2008 - Richard Elling sent me these 4,3K bytes:
> Tomas Ögren wrote:
>> Hello. Executive summary: I want arc_data_limit (like arc_meta_limit, but for data) and set it to 0.5G or so. Is there any way to simulate it?
> We describe how to limit the size of the ARC cache in the Evil Tuning Guide. http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide

Will that limit the _data_ portion only, or the metadata as well?

/Tomas
-- Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se
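[Editor's note: for reference, the Evil Tuning Guide approach referred to above caps the whole ARC - data and metadata together - rather than data alone, which is exactly Tomas's complaint. A minimal sketch, assuming a 512 MB cap:]

  * /etc/system entry; 0x20000000 bytes = 512 MB; requires a reboot
  set zfs:zfs_arc_max = 0x20000000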
Re: [zfs-discuss] Tuning for a file server, disabling data cache (almost)
Tomas Ögren wrote:
> On 15 October, 2008 - Richard Elling sent me these 4,3K bytes:
>> We describe how to limit the size of the ARC cache in the Evil Tuning Guide. http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
> Will that limit the _data_ portion only, or the metadata as well?

Recent builds of OpenSolaris have the ability to control, on a per-dataset basis, what is put into the ARC and L2ARC using the primarycache and secondarycache dataset properties:

primarycache=all | none | metadata
  Controls what is cached in the primary cache (ARC). If this property is set to all, then both user data and metadata is cached. If this property is set to none, then neither user data nor metadata is cached. If this property is set to metadata, then only metadata is cached. The default value is all.

secondarycache=all | none | metadata
  Controls what is cached in the secondary cache (L2ARC). If this property is set to all, then both user data and metadata is cached. If this property is set to none, then neither user data nor metadata is cached. If this property is set to metadata, then only metadata is cached. The default value is all.

-- Darren J Moffat
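[Editor's note: a short usage sketch for these properties, with a hypothetical pool/dataset name:]

  zfs set primarycache=metadata tank/backup    # keep only metadata in the ARC
  zfs set secondarycache=all tank/backup       # let the L2ARC cache everything
  zfs get primarycache,secondarycache tank/backup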
[zfs-discuss] Lost Disk Space
I've been struggling to fully understand why disk space seems to vanish. I've dug through bits of code and reviewed all the mails on the subject that I can find, but I still don't have a proper understanding of what's going on. I did a test with a local zpool on snv_97... zfs list, zpool list, and zdb all seem to disagree on how much space is available. In this case it's only a discrepancy of about 20G or so, but I've got Thumpers that have a discrepancy of over 6TB! Can someone give a really detailed explanation about what's going on?

block traversal size 670225837056 != alloc 720394438144 (leaked 50168601088)

bp count:          15182232
bp logical:    672332631040  avg: 44284
bp physical:   669020836352  avg: 44066  compression: 1.00
bp allocated:  670225837056  avg: 44145  compression: 1.00
SPA allocated: 720394438144  used: 96.40%

Blocks  LSIZE  PSIZE  ASIZE    avg   comp  %Total  Type
    12   120K  26.5K  79.5K  6.62K   4.53    0.00  deferred free
     1    512    512  1.50K  1.50K   1.00    0.00  object directory
     3  1.50K  1.50K  4.50K  1.50K   1.00    0.00  object array
     1    16K  1.50K  4.50K  4.50K  10.67    0.00  packed nvlist
     -      -      -      -      -      -       -  packed nvlist size
    72  8.45M   889K  2.60M  37.0K   9.74    0.00  bplist
     -      -      -      -      -      -       -  bplist header
     -      -      -      -      -      -       -  SPA space map header
   974  4.48M  2.65M  7.94M  8.34K   1.70    0.00  SPA space map
     -      -      -      -      -      -       -  ZIL intent log
 96.7K  1.51G   389M   777M  8.04K   3.98    0.12  DMU dnode
    17  17.0K  8.50K  17.5K  1.03K   2.00    0.00  DMU objset
     -      -      -      -      -      -       -  DSL directory
    13  6.50K  6.50K  19.5K  1.50K   1.00    0.00  DSL directory child map
    12  6.00K  6.00K  18.0K  1.50K   1.00    0.00  DSL dataset snap map
    14  38.0K  10.0K  30.0K  2.14K   3.80    0.00  DSL props
     -      -      -      -      -      -       -  DSL dataset
     -      -      -      -      -      -       -  ZFS znode
     2     1K     1K     2K     1K   1.00    0.00  ZFS V0 ACL
 5.81M   558G   557G   557G  95.8K   1.00   89.27  ZFS plain file
  382K   301M   200M   401M  1.05K   1.50    0.06  ZFS directory
     9  4.50K  4.50K  9.00K     1K   1.00    0.00  ZFS master node
    12   482K  20.0K  40.0K  3.33K  24.10    0.00  ZFS delete queue
 8.20M  66.1G  65.4G  65.8G  8.03K   1.01   10.54  zvol object
     1    512    512     1K     1K   1.00    0.00  zvol prop
     -      -      -      -      -      -       -  other uint8[]
     -      -      -      -      -      -       -  other uint64[]
     -      -      -      -      -      -       -  other ZAP
     -      -      -      -      -      -       -  persistent error log
     1   128K  10.5K  31.5K  31.5K  12.19    0.00  SPA history
     -      -      -      -      -      -       -  SPA history offsets
     -      -      -      -      -      -       -  Pool properties
     -      -      -      -      -      -       -  DSL permissions
     -      -      -      -      -      -       -  ZFS ACL
     -      -      -      -      -      -       -  ZFS SYSACL
     -      -      -      -      -      -       -  FUID table
     -      -      -      -      -      -       -  FUID table size
     5  3.00K  2.50K  7.50K  1.50K   1.20    0.00  DSL dataset next clones
     -      -      -      -      -      -       -  scrub work queue
 14.5M   626G   623G   624G  43.1K   1.00  100.00  Total

real    21m16.862s
user    0m36.984s
sys     0m5.757s

=== Looking at the data:

[EMAIL PROTECTED] ~$ zfs list backup ; zpool list backup
NAME     USED  AVAIL  REFER  MOUNTPOINT
backup   685G   237K    27K  /backup
NAME     SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
backup   696G   671G  25.1G   96%  ONLINE  -

So zdb says 626GB is used, zfs list says 685GB is used, and zpool list says 671GB is used. The pool was filled to 100% capacity via dd; this is confirmed, I can't write data, but yet zpool list says it's only 96%.

benr.
Re: [zfs-discuss] Strange result when syncing between SPARC and x86
> Hello. Today I've suddenly noticed that symlinks (at least) are corrupted when syncing ZFS from SPARC to x86 (zfs send | ssh | zfs recv). Example is:
>
> [EMAIL PROTECTED] ls -la /data/zones/testfs/root/etc/services
> lrwxrwxrwx 1 root root 15 Oct 13 14:35 /data/zones/testfs/root/etc/services -> ./inet/services
>
> [EMAIL PROTECTED] ls -la /data/zones/testfs/root/etc/services
> lrwxrwxrwx 1 root root 15 Oct 13 14:35 /data/zones/testfs/root/etc/services -> s/teni/.ervices
>
> Firstly I thought it's because the original FS on SPARC is compressed... so I've just synced it locally on the same machine and all was OK, just a different FS size since the destination was not compressed. Then I've synced that copy again to x86 but the result was the same - symlinks were corrupted... so it's not compression. SPARC is snv_85 and x86 snv_82, I haven't got a chance yet to test on the latest OpenSolaris. Any suggestions?

Looks like the first 8 bytes aren't reversed.

Casper
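[Editor's note: the corruption pattern is consistent with byte-swapping of the first aligned 8-byte word of the link target: reversing "./inet/s" gives "s/teni/.", while the trailing 7 bytes "ervices" - short of a full word - are untouched. On a system that has the common rev(1) utility, this is easy to check:]

  $ echo './inet/s' | rev
  s/teni/.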
Re: [zfs-discuss] Strange result when syncing between SPARC and x86
Hi

Just checked with snv_99 on x86 (VMware install) - same result :(

Regards
Mike

[EMAIL PROTECTED] wrote:
> Hello. Today I've suddenly noticed that symlinks (at least) are corrupted when syncing ZFS from SPARC to x86 (zfs send | ssh | zfs recv). [snip - full text quoted above]
> Looks like the first 8 bytes aren't reversed.
> Casper
[zfs-discuss] Improving zfs send performance
On Wed, Oct 15, 2008 at 9:37 PM, Brent Jones [EMAIL PROTECTED] wrote:
> Scott, Can you tell us the configuration that you're using that is working for you? Were you using RaidZ, or RaidZ2? I'm wondering what the sweetspot is to get a good compromise in vdevs and usable space/performance.

I used RaidZ with 4x5 disk and 4x6 disk vdevs in one pool, with two hot spares. This is very similar to how the pre-installed OS shipped from Sun. Also note that I am using ssh as the transfer method. I have not tried mbuffer with this configuration, as in testing with initial home directories of ~14GB in size it was not needed. This configuration seems to be similar to Carsten Aulbert's evaluation, without mbuffer in the pipe.
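[Editor's note: a sketch of the plain ssh transfer method described above, with hypothetical snapshot, host and pool names. The -d flag makes the receive recreate the sent path (minus its pool name) below the target filesystem.]

  zfs send tank/home@snap | ssh backuphost zfs recv -d backup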
Re: [zfs-discuss] Improving zfs send performance
Ok, I'm not entirely sure this is the same problem, but it does sound fairly similar. Apologies for hijacking the thread if this does turn out to be something else. After following the advice here to get mbuffer working with zfs send / receive, I found I was only getting around 10MB/s throughput. Thinking it was a network problem I started the below thread in the OpenSolaris help forum: http://www.opensolaris.org/jive/thread.jspa?messageID=294846 Now though I don't think it's network at all. The end result from that thread is that we can't see any errors in the network setup, and using nicstat and NFS I can show that the server is capable of 50-60MB/s over the gigabit link. Nicstat also shows clearly that both zfs send / receive and mbuffer are only sending 1/5 of that amount of data over the network. I've completely run out of ideas of my own (but I do half expect there's a simple explanation I haven't thought of). Can anybody think of a reason why both zfs send / receive and mbuffer would be so slow?
Re: [zfs-discuss] Improving zfs send performance
> Try to separate the two things:
> (1) Try /dev/zero -> mbuffer --- network --- mbuffer -> /dev/null
> That should give you wirespeed

I tried that already. It still gets just 10-11MB/s from this server. I can get zfs send / receive and mbuffer working at 30MB/s though from a couple of test servers (with much lower specs).

> (2) Try zfs send | mbuffer > /dev/null
> That should give you an idea how fast zfs send really is locally.

Hmm, that's better than 10MB/s, but the average is still only around 20MB/s:
summary: 942 MByte in 47.4 sec - average of 19.9 MB/s

I think that points to another problem though, as the send mbuffer is 100% full. Certainly the pool itself doesn't appear under any strain at all while this is going on:

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rc-pool      732G  1.55T    171     85  21.3M  1.01M
  mirror     144G   320G     38      0  4.78M      0
    c1t1d0      -      -      6      0   779K      0
    c1t2d0      -      -     17      0  2.17M      0
    c2t1d0      -      -     14      0  1.85M      0
  mirror     146G   318G     39      0  4.89M      0
    c1t3d0      -      -     20      0  2.50M      0
    c2t2d0      -      -     13      0  1.63M      0
    c2t0d0      -      -      6      0   779K      0
  mirror     146G   318G     34      0  4.35M      0
    c2t3d0      -      -     19      0  2.39M      0
    c1t5d0      -      -      7      0  1002K      0
    c1t4d0      -      -      7      0  1002K      0
  mirror     148G   316G     23      0  2.93M      0
    c2t4d0      -      -      8      0  1.09M      0
    c2t5d0      -      -      6      0   890K      0
    c1t6d0      -      -      7      0  1002K      0
  mirror     148G   316G     35      0  4.35M      0
    c1t7d0      -      -      6      0   779K      0
    c2t6d0      -      -     12      0  1.52M      0
    c2t7d0      -      -     17      0  2.07M      0
  c3d1p0       12K   504M      0     85      0  1.01M
----------  -----  -----  -----  -----  -----  -----

Especially when compared to the zfs send stats on my backup server which managed 30MB/s via mbuffer (being received on a single virtual SATA disk):

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       5.12G  42.6G      0      5      0  27.1K
  c4t0d0s0  5.12G  42.6G      0      5      0  27.1K
----------  -----  -----  -----  -----  -----  -----
zfspool      431G  4.11T    261      0  31.4M      0
  raidz2     431G  4.11T    261      0  31.4M      0
    c4t1d0      -      -    155      0  6.28M      0
    c4t2d0      -      -    155      0  6.27M      0
    c4t3d0      -      -    155      0  6.27M      0
    c4t4d0      -      -    155      0  6.27M      0
    c4t5d0      -      -    155      0  6.27M      0
----------  -----  -----  -----  -----  -----  -----

The really ironic thing is that the 30MB/s send / receive was sending to a virtual SATA disk which is stored (via sync NFS) on the server I'm having problems with...

Ross
Re: [zfs-discuss] Improving zfs send performance
Oh dear god. Sorry folks, it looks like the new hotmail really doesn't play well with the list. Trying again in plain text: [snip - plain-text resend of the previous message]
Re: [zfs-discuss] Tuning for a file server, disabling data cache (almost)
On 16 October, 2008 - Darren J Moffat sent me these 1,7K bytes:
> Recent builds of OpenSolaris have the ability to control on a per dataset basis what is put into the ARC and L2ARC using the primarycache and secondarycache dataset properties: [snip - property descriptions quoted above]

Yeah, the problem is (like I wrote in the first post), if I set primarycache=metadata, then ZFS prefetch will go into horribly inefficient mode where it will do lots of prefetching, but the prefetched data will be discarded immediately. 128k prefetch for a 32k read will throw away the other 96k immediately. Followed by another 128k prefetch for the next 32k read, throwing away the other 96k. So ZFS needs to have _some_ data cache, but I want to limit it for short term data only.. Setting data cache limit to 512M or something should work fine, but I want to leave the rest to metadata as that's the place where it can help the most. Unless I can do some trickery with a ram disk and put that as secondarycache with data cache as well..

/Tomas
-- Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se
Re: [zfs-discuss] zfs cp hangs when the mirrors are removed ..
Karthik Krishnamoorthy wrote:
> We did try with the zpool set failmode=continue pool option and the wait option before running the cp command and pulling out the mirrors, and in both cases there was a hang, and I have a core dump of the hang as well.

You have to wait for the I/O drivers to declare that the device is dead. This can be up to several minutes, depending on the driver.

> Any pointers to the bug opening process?

http://bugs.opensolaris.org, or bugster if you have an account. Be sure to indicate which drivers you are using, as this is not likely a ZFS bug, per se. Output from prtconf -D should be a minimum.
-- richard

> Thanks Karthik
>
> On 10/15/08 22:27, Neil Perrin wrote:
>> On 10/15/08 23:12, Karthik Krishnamoorthy wrote:
>>> Neil, Thanks for the quick suggestion, the hang seems to happen even with the zpool set failmode=continue pool option. Any other way to recover from the hang?
>> You should set the property before you remove the devices. This should prevent the hang. It isn't used to recover from it. If you did do that then it seems like a bug somewhere in ZFS or the IO stack below it. In which case you should file a bug. Neil.
>>> thanks and regards, Karthik
>>> On 10/15/08 22:03, Neil Perrin wrote:
>>>> Karthik, The pool failmode property as implemented governs the behaviour when all the devices needed are unavailable. The default behaviour is to wait (block) until the IO can continue - perhaps by re-enabling the device(s). The behaviour you expected can be achieved by zpool set failmode=continue pool, as shown in the link you indicated below. Neil.
>>>> On 10/15/08 22:38, Karthik Krishnamoorthy wrote:
>>>>> Hello All, Summary: cp command for mirrored zfs hung when all the disks in the mirrored pool were unavailable.
>>>>> Detailed description: The cp command (copy a 1GB file from nfs to zfs) hung when all the disks in the mirrored pool (both c1t0d9 and c2t0d9) were removed physically.
>>>>>
>>>>>   NAME       STATE   READ WRITE CKSUM
>>>>>   test       ONLINE     0     0     0
>>>>>     mirror   ONLINE     0     0     0
>>>>>       c1t0d9 ONLINE     0     0     0
>>>>>       c2t0d9 ONLINE     0     0     0
>>>>>
>>>>> We think that if all the disks in the pool are unavailable, the cp command should fail with an error (not cause a hang). Our request: please investigate the root cause of this issue.
>>>>> How to reproduce: 1. create a zfs mirrored pool. 2. execute a cp command from somewhere to the zfs mirrored pool. 3. remove both disks physically while the cp command is working => a hang happens (the cp command never returns and we can't kill it).
>>>>> One engineer pointed me to this page http://opensolaris.org/os/community/arc/caselog/2007/567/onepager/ and indicated that if all the mirrors are removed, zfs enters a hang-like state to prevent the kernel from going into a panic mode, and this type of feature would be an RFE.
>>>>> My questions are: Is there any documentation of the mirror configuration of zfs that explains what happens when the underlying drivers detect problems in one of the mirror devices? It seems that the traditional view of mirroring (RAID-1) would expect that the mirror would be able to proceed without interruption, and that does not seem to be the case in ZFS. What is the purpose of the mirror in zfs? Is it more like an instant backup? If so, what can the user do to recover when there is an IO error on one of the devices?
>>>>> Appreciate any pointers and help, Thanks and regards, Karthik
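[Editor's note: for reference, the property under discussion is set per pool and, as Neil notes, must be in place before the devices disappear. A minimal sketch with a hypothetical pool name:]

  zpool set failmode=continue tank
  zpool get failmode tank     # wait | continue | panic; the default is wait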
[zfs-discuss] Strange result when syncing between SPARC and x86
Hello

Today I've suddenly noticed that symlinks (at least) are corrupted when syncing ZFS from SPARC to x86 (zfs send | ssh | zfs recv). Example is:

[EMAIL PROTECTED] ls -la /data/zones/testfs/root/etc/services
lrwxrwxrwx 1 root root 15 Oct 13 14:35 /data/zones/testfs/root/etc/services -> ./inet/services

[EMAIL PROTECTED] ls -la /data/zones/testfs/root/etc/services
lrwxrwxrwx 1 root root 15 Oct 13 14:35 /data/zones/testfs/root/etc/services -> s/teni/.ervices

Firstly I thought it's because the original FS on SPARC is compressed... so I've just synced it locally on the same machine and all was OK, just a different FS size since the destination was not compressed. Then I've synced that copy again to x86 but the result was the same - symlinks were corrupted... so it's not compression. SPARC is snv_85 and x86 snv_82, I haven't got a chance yet to test on the latest OpenSolaris. Any suggestions?

Thanks
Mike
Re: [zfs-discuss] Tuning for a file server, disabling data cache (almost)
I might be misunderstanding here, but I don't see how you're going to improve on zfs set primarycache=metadata. You complain that ZFS throws away 96kb of data if you're only reading 32kb at a time, but then also complain that you are IO/s bound and that this is restricting your maximum transfer rate. If it's IO/s that is limiting you, it makes no difference that ZFS is throwing away 96kb of data; you're going to get the same iops and the same throughput at your application whether you're using 32k or 128k zfs record sizes. Also, you're asking on one hand for each disk to get larger IO blocks, and on the other you are complaining that with large block sizes a lot of data is wasted. That looks like a contradictory argument to me, as you can't have both of these. You just need to pick whichever one is more suited to your needs. Like I said, I may be misunderstanding, but I think you might be looking for something that you don't actually need.
[zfs-discuss] ZFS pool not imported on boot on Solaris Xen PV DomU
Hi, I am trying a setup with a Linux Xen Dom0 on which runs an OpenSolaris 2008.05 DomU. I have 8 hard disk partitions that I exported to the DomU (they are visible as c4d[1-8]p0), and I have created a raidz2 pool on these virtual disks. Now, if I shut down the system and start it again, the pool is not automatically imported during boot. If I type zpool status, I can't see it, so I do a zpool import and I see that my pool is available for import, so I import it and it works. But I wonder why it isn't imported automatically. How is the pool import managed during boot? Does Solaris try to import every single pool that's available, or does it read some list from a file somewhere (possibly the boot_archive)?
Re: [zfs-discuss] Tuning for a file server, disabling data cache (almost)
Tomas Ögren wrote:
> Yeah, the problem is (like I wrote in the first post), if I set primarycache=metadata, then ZFS prefetch will go into horribly inefficient mode where it will do lots of prefetching, but the prefetched data will be discarded immediately. 128k prefetch for a 32k read will throw away the other 96k immediately. Followed by another 128k prefetch for the next 32k read, throwing away the other 96k. [snip]

Are you sure this is a prefetch, or is it just the recordsize? The checksum is based on the record, so to validate the checksum the entire record must be read. If you have a fixed-record-size workload where the size < 128 kBytes, then you might adjust the recordsize parameter.
-- richard
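[Editor's note: if the workload really is fixed-size 32k accesses, the tuning Richard alludes to would look like this sketch - dataset name hypothetical; note that recordsize only affects files written after the change:]

  zfs set recordsize=32k tank/data
  zfs get recordsize tank/data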
Re: [zfs-discuss] Improving zfs send performance
Hi Scott,

Scott Williamson wrote:
> You seem to be using dd for write testing. In my testing I noted that there was a large difference in write speed between using dd to write from /dev/zero and using other files. Writing from /dev/zero always seemed to be fast, reaching the maximum of ~200MB/s, while cp would perform more poorly the fewer the vdevs.

You are right, the write benchmarks were done with dd just to have some bulk figures, since usually zeros can be generated fast enough.

> This also impacted the zfs send speed, as with fewer vdevs in RaidZ2 the disks seemed to spend most of their time seeking during the send.

That seems a bit too simplistic to me. If you compare raidz with raidz2, it seems that raidz2 is not too bad with fewer vdevs. I wish there was a way for zfs send to avoid so many seeks. The 1 TB file system is still being sent with zfs send, now close to 48 hours in.

Cheers Carsten

PS: We still have a spare thumper sitting around, maybe I'll give it a try with 5 vdevs.
Re: [zfs-discuss] Improving zfs send performance
Hi Ross

Ross wrote:
> Now though I don't think it's network at all. The end result from that thread is that we can't see any errors in the network setup, and using nicstat and NFS I can show that the server is capable of 50-60MB/s over the gigabit link. Nicstat also shows clearly that both zfs send / receive and mbuffer are only sending 1/5 of that amount of data over the network. I've completely run out of ideas of my own (but I do half expect there's a simple explanation I haven't thought of). Can anybody think of a reason why both zfs send / receive and mbuffer would be so slow?

Try to separate the two things:

(1) Try /dev/zero -> mbuffer --- network --- mbuffer -> /dev/null
That should give you wirespeed.

(2) Try zfs send | mbuffer > /dev/null
That should give you an idea how fast zfs send really is locally.

Carsten
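[Editor's note: concretely, the two tests might be run as below - host name, port, and dataset names are hypothetical, and mbuffer prints its own throughput summary when it finishes:]

  # (1) raw wire speed, no ZFS involved
  #     on the receiver:
  mbuffer -I 9090 > /dev/null
  #     on the sender:
  dd if=/dev/zero bs=128k | mbuffer -O receiver:9090

  # (2) local zfs send speed, no network involved
  zfs send tank/fs@snap | mbuffer > /dev/null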
Re: [zfs-discuss] Improving zfs send performance
Hi Carsten,

You seem to be using dd for write testing. In my testing I noted that there was a large difference in write speed between using dd to write from /dev/zero and using other files. Writing from /dev/zero always seemed to be fast, reaching the maximum of ~200MB/s, while cp would perform more poorly the fewer the vdevs. This also impacted the zfs send speed, as with fewer vdevs in RaidZ2 the disks seemed to spend most of their time seeking during the send.

On Thu, Oct 16, 2008 at 1:27 AM, Carsten Aulbert [EMAIL PROTECTED] wrote:
> Some time ago I made some tests to find this:
> (1) create a new zpool
> (2) copy a user's home to it (always the same ~25 GB IIRC)
> (3) zfs send to /dev/null
> (4) evaluate, continue loop
> I did this for fully mirrored setups, raidz as well as raidz2; the results were mixed: https://n0.aei.uni-hannover.de/cgi-bin/twiki/view/ATLAS/ZFSBenchmarkTest#ZFS_send_performance_relevant_fo
> The culprit here might be that in retrospect this seemed like a good home filesystem, i.e. one which was quite fast. If you don't want to bother with the table: the mirrored setup never exceeded 58 MB/s and was getting faster the more small mirrors you used. RaidZ had its sweetspot with a configuration of '6 6 6 6 6 6 5 5', i.e. 6 or 5 disks per RaidZ and 8 vdevs. RaidZ2 finally was best at '10 9 9 9 9', i.e. 5 vdevs, but not much worse with only 3, i.e. what we are currently using to get more storage space (gains us about 2 TB/box).
> Cheers Carsten
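[Editor's note: a sketch of the two write tests being compared - paths are hypothetical. /dev/zero produces data at memory speed, and highly compressible data at that, while cp is also bounded by reading the source files:]

  # streaming zeros:
  ptime dd if=/dev/zero of=/tank/fs/zerofile bs=128k count=81920   # ~10 GB
  # copying real files:
  ptime cp -r /export/home/someuser /tank/fs/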
Re: [zfs-discuss] ZFS pool not imported on boot on Solaris Xen PV DomU
Francois Goudal wrote:
> But I wonder why it isn't imported automatically. How is the pool import managed during boot? Does Solaris try to import every single pool that's available, or does it read some list from a file somewhere (possibly the boot_archive)? [snip]

The file is /etc/zfs/zpool.cache. Unfortunately, it is not human readable, but zdb -C can be used to examine its contents.
-- richard
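[Editor's note: a quick sketch of poking at the cache; the pool name is hypothetical, and the cachefile pool property only exists on builds recent enough to have it:]

  zdb -C                    # dump the cached pool configurations
  zpool import tank         # a successful import re-adds the pool to the cache
  zpool get cachefile tank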
Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!
Tano wrote:
> I'm not sure if this is a problem with the iscsitarget or zfs. I'd greatly appreciate it if it gets moved to the proper list. Well, I'm just about out of ideas on what might be wrong..
> Quick history: I installed OS 2008.05 when it was SNV_86 to try out ZFS with VMWare. Found out that multiple LUNs were being treated as multipaths, so waited till SNV_94 came out to fix the issues with VMWare and iscsitadm/zfs shareiscsi=on. I installed OS 2008.05 on a virtual machine as a test bed, pkg image-update to SNV_94 a month ago, made some thin provisioned partitions, shared them with iscsitadm and mounted on VMWare without any problems. Ran Storage VMotion and all went well. So with this success I purchased a Dell 1900 with a PERC 5/i controller and 6 x 15K SAS drives in a ZFS RAIDZ1 configuration. I shared the zfs partitions and mounted them on VMWare. Everything is great till I have to write to the disks. It won't write!

What's the error exactly? What step are you performing to get the error? Creating the vmfs3 filesystem? Accessing the mountpoint?

> Steps I took creating the disks:
> 1) Installed mega_sas drivers.
> 2) zpool create tank raidz c5t0d0 c5t1d0 c5t2d0 c5t3d0 c5t4d0 c5t5d0
> 3) zfs create -V 1TB tank/disk1
> 4) zfs create -V 1TB tank/disk2
> 5) iscsitadm create target -b /dev/zvol/rdsk/tank/disk1 LABEL1
> 6) iscsitadm create target -b /dev/zvol/rdsk/tank/disk2 LABEL2
> Now both drives are LUN 0 but with unique VMHBA device identifiers, so they are detected as separate drives. I then redid (deleted) steps 5 and 6 and changed them to:
> 5) iscsitadm create target -u 0 -b /dev/zvol/rdsk/tank/disk1 LABEL1
> 6) iscsitadm create target -u 1 -b /dev/zvol/rdsk/tank/disk2 LABEL1
> VMWare discovers the separate LUNs on the device identifier, but is still unable to write to the iscsi LUNs. Why is it that the steps I've conducted in SNV_94 work but in SNV_97, 98, or 99 don't? Any ideas?? Any log files I can check? I am still an ignorant linux user so I only know to look in /var/log :)

The relevant errors from /var/log/vmkernel on the ESX server would be helpful. Also, iscsitadm list target -v.

Also, I blogged a bit on OpenSolaris iSCSI and VMware ESX. I was using b98 on a X4500. http://blogs.sun.com/rarneson/entry/zfs_clones_iscsi_and_vmware

-- Ryan Arneson Sun Microsystems, Inc. 303-223-6264 [EMAIL PROTECTED] http://blogs.sun.com/rarneson
Re: [zfs-discuss] am I screwed?
On Mon, Oct 13, 2008 at 10:25 PM, dick hoogendijk [EMAIL PROTECTED] wrote:
> We have to dig deeper with kmdb. But before we do that, tell me please what is an easy way to transfer the messages from the failsafe login on the problematic machine to e.g. this S10u5 server. All former screen output had to be typed in by hand. I didn't know of another way.

If you say no to mounting the pool on /a, does it still hang? Just to ask the obvious question, did you try to press ENTER or anything else where it was hanging? What build are you booting into failsafe mode? Something older, or b99? Do you have a build-99 DVD to boot from, from which you can get a proper running system with networking, etc?

-- Any sufficiently advanced technology is indistinguishable from magic. Arthur C. Clarke My blog: http://initialprogramload.blogspot.com
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
"r" == Ross [EMAIL PROTECTED] writes:

r> 1. Loss of a server is very much the worst case scenario. Disk errors are much more likely, and with raid-z2 pools on the individual servers

yeah, it kind of sucks that the slow resilvering speed enforces this two-tier scheme. Also if you're going to have 1000 spinning platters you'll have a drive failure every four days or so---you need to be able to do more than one resilver at a time, and you need to do resilvers without interrupting scrubs, which could take so long to run that you run them continuously. The ZFS-on-zvol hack lets you do both to a point, but I think it's an ugly workaround for lack of scalability in flat ZFS, not the ideal way to do things.

r> A motherboard / backplane / PSU failure will offline that server, but once the faulted components are replaced your pool will come back online. Once the pool is online, ZFS has the ability to resilver just the changed data,

except that is not what actually happens for my iSCSI setup. If I 'zpool offline' the target before taking it down, it usually does work as you describe---a relatively fast resilver kicks off, and no CKSUM errors appear later. I've used it gently. I haven't offlined a raidz2 device for three weeks while writing gigabytes to the pool in the mean time, but for my gentle use it does seem to work.

But if the iSCSI target goes down unexpectedly---ex., because I pull the network cord---it does come back online and does resilver, but latent CKSUM errors show up weeks later. Also, if the head node reboots during a resilver, ZFS totally forgets what it was doing, and upon reboot just blindly mounts the unclean component as if it were clean, later calling all the differences CKSUM errors. The same thing happens if you offline a device, then reboot. The ``persistent'' offlining doesn't seem to work, and in any case the device comes online without a proper resilver. SVM had dirty-region logging stored in the metadb so that resilvers could continue where they left off across reboots. I believe SVM usually did a full resilver when a component disappeared, but am not sure this was always the case. Anyway, ZFS doesn't seem to have a similar capability, at least not one that works.

So, in practice, whenever any iSCSI component goes away unexpectedly---target server failure, power failure, kernel panic, L2 spanning tree reconfiguration, whatever---you have to scrub the whole pool from the head node.

It's interesting how the speed and optimisation of these maintenance activities limit pool size. It's not just full scrubs. If the filesystem is subject to corruption, you need a backup. If the filesystem takes two months to back up / restore, then you need really solid incremental backup/restore features, and the backup needs to be a cold spare, not just a backup---restoring means switching the roles of the primary and backup system, not actually moving data.

Finally, for really big pools, even O(n) might be too slow. The ZFS best practice guide for converting UFS to ZFS says ``start multiple rsync's in parallel,'' but I think we're finding zpool scrubs and zfs sends are not well-parallelized. These reliability limitations and performance characteristics of maintenance tasks seem to make a sort of max-pool-size Wall beyond which you end up painted into corners.

If they were made better, I think you'd later hit another wall at the maximum amount of data you could push through one head node and would have to switch to some QFS/GFS/OCFS-type separate-data-and-metadata filesystem, and to match ZFS this filesystem would have to do scrubs, resilvers, and backups in a distributed way, not just distribute normal data access. A month ago I might have ranted, ``head node speed puts a cap on how _busy_ the filesystem can be, not how big it can be, so ZFS (modulo a lot of bug fixes) could be fantastic for data sets of virtually unlimited size even with its single-initiator, single-head-node limitation, so long as the pool gets very light access.'' Now, I don't think so, because scrubbing/resilvering/backup-restore has to flow through the head node, too. This observation also means my preference for a ``recovery tool'' that treats corrupt pools as read-only over fsck (online or offline) isn't very scalable. The original zfs kool-aid ``online maintenance'' model of doing a cheap fsck at import time and a long O(n) fsck through online scrubs is the only one with a future in a world where maintenance activities can take months.
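[Editor's note: in sketch form, the planned-maintenance sequence Miles describes versus the recovery needed after an unplanned outage - pool and device names hypothetical:]

  zpool offline tank c5t0d0     # before taking the iSCSI target down
  # ... maintenance on the target server ...
  zpool online tank c5t0d0      # kicks off a resilver of just the changed data
  zpool status -x tank

  # after an unclean outage, a full verification pass is the only safe option:
  zpool scrub tank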
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
[EMAIL PROTECTED] said:
> It's interesting how the speed and optimisation of these maintenance activities limit pool size. It's not just full scrubs. If the filesystem is subject to corruption, you need a backup. If the filesystem takes two months to back up / restore, then you need really solid incremental backup/restore features, and the backup needs to be a cold spare, not just a backup---restoring means switching the roles of the primary and backup system, not actually moving data.

I'll chime in here with feeling uncomfortable with such a huge ZFS pool, and also with my discomfort of the ZFS-over-ISCSI-on-ZFS approach. There just seem to be too many moving parts depending on each other, any one of which can make the entire pool unavailable. For the stated usage of the original poster, I think I would aim toward turning each of the Thumpers into an NFS server, configure the head-node as a pNFS/NFSv4.1 metadata server, and let all the clients speak parallel-NFS to the cluster of file servers. You'll end up with a huge logical pool, but a Thumper outage should result only in loss of access to the data on that particular system. The work of scrub/resilver/replication can be divided among the servers rather than all living on a single head node.

Regards, Marion
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
pNFS is NFS-centric of course, and it is not yet stable, is it? Btw, what is the ETA for the pNFS putback?

On Thu, 2008-10-16 at 12:20 -0700, Marion Hakanson wrote:
> I'll chime in here with feeling uncomfortable with such a huge ZFS pool, and also with my discomfort of the ZFS-over-ISCSI-on-ZFS approach. [snip - full text quoted above]
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
On Thu, Oct 16, 2008 at 12:20:36PM -0700, Marion Hakanson wrote:

> I'll chime in here with feeling uncomfortable with such a huge ZFS pool, and also with my discomfort with the ZFS-over-iSCSI-on-ZFS approach. There just seem to be too many moving parts depending on each other, any one of which can make the entire pool unavailable.

But does it work well enough? It may be faster than NFS if there's only one client for each volume (unless you have fast slog devices for the ZIL; see the sketch below). And it'd have better semantics, too (e.g., no need for the client and server to agree on identities/domains).

Nico
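By ``fast slog devices'' I mean a dedicated low-latency log device for the pool. A minimal sketch, assuming a pool named tank and a hypothetical SSD at c6t0d0:

    zpool add tank log c6t0d0     # dedicate c6t0d0 to the ZIL
    zpool status tank             # the device appears under a separate 'logs' section

With the ZIL on a device like that, the synchronous writes that iSCSI and NFS generate stop waiting on the main data vdevs.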
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
[EMAIL PROTECTED] said: In general, such tasks would be better served by a T5220 (or the new T5440 :-) and J4500s. This would change the data path from:

    client --net-- T5220 --net-- X4500 --SATA-- disks

to:

    client --net-- T5440 --SAS-- disks

With the J4500 you get the same storage density as the X4500, but with SAS access (some would call this direct access). You will have much better bandwidth and lower latency between the T5440 (server) and the disks, while still having the ability to multi-head the disks.

There's an odd economic factor here, if you're in the .edu sector: the Sun Education Essentials promotional price list has the X4540 priced lower than a bare J4500 (not on the promotional list, but with a standard EDU discount). We have a project under development right now which might be served well by one of these EDU X4540's with a J4400 attached to it.

The spec sheets for the J4400 and J4500 say you can chain together enough of them to make a pool of 192 drives. I'm unsure about the bandwidth of these daisy-chained SAS interconnects, though (a rough estimate below). Any thoughts as to how high one might scale an X4540-plus-J4x00 solution? How does the X4540's internal disk bandwidth compare to that of the (non-RAID) SAS HBA?

Regards, Marion
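For what it's worth, my back-of-envelope numbers, assuming SAS-1 signaling (3 Gb/s per lane, x4 wide ports) and roughly 75 MB/s sustained per drive; both figures are assumptions, not spec-sheet values for these particular boxes:

    # one x4 wide port at 3 Gb/s per lane, ~10 bits per byte after 8b/10b encoding
    echo $(( 4 * 3000 / 10 ))   # => 1200 (MB/s per SAS cable, best case)
    # 48 drives in a J4500 at ~75 MB/s sustained each
    echo $(( 48 * 75 ))         # => 3600 (MB/s of aggregate platter bandwidth)

So a single daisy-chained wide port would saturate at roughly a third of what one fully loaded tray can stream, and each additional chained tray dilutes that further; multiple HBA ports per chain look essential for streaming workloads.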
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
nw == Nicolas Williams [EMAIL PROTECTED] writes:

nw But does it work well enough? It may be faster than NFS if

You're talking about different things. Gray is using NFS, period, between the storage cluster and the compute cluster; no iSCSI.

Gray's (``does it work well enough''): iSCSI within the storage cluster, NFS to storage consumers.

Marion's (less ``uncomfortable''): nothing(?) within the storage cluster, pNFS to storage consumers.

But Marion's is not really possible at all, and won't be for a while with other groups' choice of storage-consumer platform, so it'd have to be GlusterFS or some other goofy fringe FUSEy thing, or a not-very-general crude in-house hack.

I guess since Gray is copying data in and out all the time, he doesn't have to worry about the glacial-restore problem and the corruption problem. If it were my worry, I'd definitely include NFS clients in the performance test, because iSCSI is high-latency and the NFS clients could be more latency-sensitive than the local benchmark. I might test coalescing in the big data separately from running the crunching, because maybe the big data can be copied in with pax-over-netcat or something else other than NFS (a sketch below), and maybe the crunching could treat the big data as read-only and write its small result to a fast standalone ZFS server, which would make NFS faster. And I'd get the small important data that needs backup off this mess (but please let us know how the failure-simulation testing goes!).
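By pax-over-netcat I mean something like the following minimal sketch; the hostname and port are made up, and note the listen flags vary between netcat variants (GNU netcat wants -l -p):

    # on the receiving storage node: listen and unpack
    nc -l 9000 | pax -r -v

    # on the sending side: stream the data set's tree across
    pax -w -x ustar bigdata | nc storage-head 9000

One long-lived TCP stream instead of per-file NFS round trips, so it can keep a fat pipe full; the tradeoff is that nothing is readable until the copy finishes.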
Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!
Thank you, Ryan, for your response. I have included all the information you requested inline in this document. I will also be testing snv_86 again to see whether the problem persists; maybe it's my hardware. I will confirm that soon enough.

On Thu, October 16, 2008 10:31 am, Ryan Arneson wrote:

> Tano wrote: I'm not sure if this is a problem with the iscsi target or zfs. I'd greatly appreciate it if it gets moved to the proper list. Well, I'm just about out of ideas on what might be wrong.

Quick history: I installed OS 2008.05 when it was snv_86 to try out ZFS with VMware. I found that multiple LUNs were being treated as multipaths, so I waited till snv_94 came out to fix the issues with VMware and iscsitadm / zfs shareiscsi=on. I installed OS 2008.05 on a virtual machine as a test bed, did a pkg image-update to snv_94 a month ago, made some thin-provisioned partitions, shared them with iscsitadm, and mounted them on VMware without any problems. Ran Storage VMotion and all went well.

So with this success I purchased a Dell 1900 with a PERC 5/i controller and 6 x 15K SAS drives in a ZFS raidz1 configuration. I shared the zfs partitions and mounted them on VMware. Everything is great till I have to write to the disks. It won't write!

> What's the error exactly?

From the VMware Infrastructure front end, everything looks like it is in order. I send targets to the iSCSI IP, then rescan the HBA, and it detects all the LUNs and targets.

> What step are you performing to get the error? Creating the vmfs3 filesystem? Accessing the mountpoint?

The error occurs when attempting to write large data sets to the mount point. Formatting the drive VMFS3 works; manually copying 5 megabytes of data to the target works. Running cp -a of the VM folder or a cold VM migration will hang the Infrastructure client, and the ESX host lags. No timeouts of any sort occur; I waited up to an hour.

Steps I took creating the disks:

1) Installed mega_sas drivers.
2) zpool create tank raidz c5t0d0 c5t1d0 c5t2d0 c5t3d0 c5t4d0 c5t5d0
3) zfs create -V 1TB tank/disk1
4) zfs create -V 1TB tank/disk2
5) iscsitadm create target -b /dev/zvol/rdsk/tank/disk1 LABEL1
6) iscsitadm create target -b /dev/zvol/rdsk/tank/disk2 LABEL2

Now both drives are LUN 0, but with unique VMHBA device identifiers, so they are detected as separate drives. I then redid (deleted) steps 5 and 6 and changed them to:

5) iscsitadm create target -u 0 -b /dev/zvol/rdsk/tank/disk1 LABEL1
6) iscsitadm create target -u 1 -b /dev/zvol/rdsk/tank/disk2 LABEL1

VMware discovers the separate LUNs on the device identifier, but I am still unable to write to the iSCSI LUNs. Why is it that the steps I've conducted on snv_94 work, but on snv_97, 98, or 99 don't? Any ideas? Any log files I can check? I am still an ignorant Linux user, so I only know to look in /var/log :)

> The relevant errors from /var/log/vmkernel on the ESX server would be helpful.

So I weeded out, as best I could, the logs from /var/log/vmkernel. Every time I initiated a command from VMware I captured the logs, and I have broken down what I was doing at each point in the logs.

Again, the complete breakdown of both systems:

VMware ESX 3.5 Update 2:

    [EMAIL PROTECTED] log]# uname -a
    Linux vmware-860-1.ucr.edu 2.4.21-57.ELvmnix #1 Tue Aug 12 17:28:03 PDT 2008 i686 i686 i386 GNU/Linux
    [EMAIL PROTECTED] log]# arch
    i686

OpenSolaris: Dell PowerEdge 1900, PERC 5/i, 6 disks of 450GB each, SAS 15K RPM, Broadcom BNX driver: no conflicts.
Quad-core 1600 MHz, 1066 FSB, 8 GB RAM.

    [EMAIL PROTECTED]:~# uname -a
    SunOS iscsi-sas 5.11 snv_99 i86pc i386 i86pc Solaris

    [EMAIL PROTECTED]:~# isainfo -v
    64-bit amd64 applications
            ssse3 cx16 mon sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu
    32-bit i386 applications
            ssse3 ahf cx16 mon sse3 sse2 sse fxsr mmx cmov sep cx8 tsc fpu

    [EMAIL PROTECTED]:~# zpool status -v
      pool: rpool
     state: ONLINE
     scrub: none requested
    config:

            NAME          STATE     READ WRITE CKSUM
            rpool         ONLINE       0     0     0
              mirror      ONLINE       0     0     0
                c3t0d0s0  ONLINE       0     0     0
                c3t1d0    ONLINE       0     0     0

    errors: No known data errors

      pool: vdrive
     state: ONLINE
     scrub: none requested
    config:

            NAME        STATE     READ WRITE CKSUM
            vdrive      ONLINE       0     0     0
              raidz1    ONLINE       0     0     0
                c5t0d0  ONLINE       0     0     0
                c5t1d0  ONLINE       0     0     0
                c5t2d0  ONLINE       0     0     0
                c5t3d0  ONLINE       0     0     0
                c5t4d0  ONLINE       0     0     0
                c5t5d0  ONLINE       0     0     0

    errors: No known data errors

    [EMAIL PROTECTED]:~# zfs create -V 750G vdrive/LUNA
    [EMAIL PROTECTED]:~# zfs create -V 1250G vdrive/LUNB
    [EMAIL PROTECTED]:~# zfs list
    NAME    USED  AVAIL  REFER  MOUNTPOINT
    rpool
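For the retest, the shareiscsi route mentioned above is the other way to export these zvols; a minimal sketch, using the zvols just created:

    zfs set shareiscsi=on vdrive/LUNA    # the iscsi target daemon creates and exports a target for the zvol
    zfs set shareiscsi=on vdrive/LUNB

That would show whether only the iscsitadm-created targets misbehave.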
Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!
Also, I had read your blog post previously. I will be taking advantage of the cloning/snapshot section of your blog once I am successfully writing to the targets. Thanks again!
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
On Thu, Oct 16, 2008 at 04:30:28PM -0400, Miles Nordin wrote:

> nw == Nicolas Williams [EMAIL PROTECTED] writes:
> nw But does it work well enough? It may be faster than NFS if
> You're talking about different things. Gray is using NFS, period, between the storage cluster and the compute cluster; no iSCSI.

I was replying to Marion's comment about ZFS-over-iSCSI-on-ZFS, not to Gray. I can see why one might worry about ZFS-over-iSCSI-on-ZFS: two layers of copy-on-write might interact in odd ways that kill performance. But if you want ZFS-over-iSCSI in the first place, then ZFS-over-iSCSI-on-ZFS sounds like the correct approach IF it can perform well enough. ZFS-over-iSCSI could certainly perform better than NFS, but again, it may depend on what kind of ZIL devices you have.

Nico
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
nw == Nicolas Williams [EMAIL PROTECTED] writes:
mh == Marion Hakanson [EMAIL PROTECTED] writes:

nw I was replying to Marion's [...]
nw ZFS-over-iSCSI could certainly perform better than NFS,

Better than what, ZFS-over-'mkfile'-files-on-NFS? No one was suggesting that. Do you mean better than pNFS? It sounded at first like you meant iSCSI-over-ZFS should perform better than NFS, but no one's suggesting that either.

    Gray:   NFS over ZFS over iSCSI over ZFS over disk
    Marion: pNFS over ZFS over disk

They are both using the same amount of {,p}NFS.
Re: [zfs-discuss] Enable compression on ZFS root
No, the last arguments are not options. Unfortunately, the syntax doesn't provide a way to specify compression at creation time. It should, though. Or perhaps compression should be the default.

Should I submit an RFE somewhere then?
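In the meantime, the property can be set right after installation; a minimal sketch, assuming the usual rpool name, and noting that it only affects data written from then on:

    zfs set compression=on rpool
    zfs get -r compression rpool    # the children inherit the setting

Whatever the installer already laid down stays uncompressed.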
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
On Oct 16, 2008, at 15:20, Marion Hakanson wrote:

> For the stated usage of the original poster, I think I would aim toward turning each of the Thumpers into an NFS server, configure the head-node as a pNFS/NFSv4.1

It's a shame that Lustre isn't available on Solaris yet, either.
Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!
I googled on some sub-strings from your ESX logs and found these threads on the VMware forum, which list similar error messages and suggest some actions to try on the ESX server:
http://communities.vmware.com/message/828207
Also, see this thread:
http://communities.vmware.com/thread/131923

Are you using multiple Ethernet connections between the OpenSolaris box and the ESX server?

Your 'iscsitadm list target -v' is showing Connections: 0, so run that command after the ESX server initiator has successfully connected to the OpenSolaris iscsi target, and post that output.

The log files seem to show the iscsi session has dropped out, and the initiator is auto-retrying to connect to the target, but failing. It may help to get a packet capture at this stage to try to see why the logon is failing.

Regards, Nigel Smith
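A minimal capture sketch, assuming the target's interface is e1000g0 (substitute your interface name):

    # on the OpenSolaris target: capture iSCSI traffic (port 3260) to a file
    snoop -d e1000g0 -o /tmp/iscsi.cap port 3260
    # inspect afterwards (or load the file into Wireshark)
    snoop -i /tmp/iscsi.cap -v | less

The login PDUs at the start of a session should show whether the initiator is stumbling on authentication, a redirect, or something else.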
Re: [zfs-discuss] ZFS-over-iSCSI performance testing (with low random access results)...
[EMAIL PROTECTED] said: but Marion's is not really possible at all, and won't be for a while with other groups' choice of storage-consumer platform, so it'd have to be GlusterFS or some other goofy fringe FUSEy thing or not-very-general crude in-house hack.

Well, of course the magnitude of the fringe factor is in the eye of the beholder. I didn't intend to make pNFS seem like a done deal. I don't quite yet think of OpenSolaris as a done deal either, still using Solaris 10 here in production, but since this is an OpenSolaris mailing list I should be more careful.

Anyway, from looking over the wiki/blog info, apparently the sticking point with pNFS may be client-side availability: there are only Linux and (Open)Solaris NFSv4.1 clients just yet. Still, pNFS claims to be backwards compatible with NFSv3 clients: if you point a traditional NFS client at the pNFS metadata server, the MDS is supposed to relay the data from the backend data servers.

[EMAIL PROTECTED] said: It's a shame that Lustre isn't available on Solaris yet either.

Actually, that may not be so terribly fringey, either. Lustre and Sun's Scalable Storage product can make use of Thumpers:
http://www.sun.com/software/products/lustre/
http://www.sun.com/servers/cr/scalablestorage/

Apparently it's possible to have a Solaris/ZFS data server for Lustre backend storage:
http://wiki.lustre.org/index.php?title=Lustre_OSS/MDS_with_ZFS_DMU

I see they do not yet have anything other than Linux clients, so that's a limitation. But you can share out a Lustre filesystem over NFS, potentially from multiple Lustre clients, and maybe via CIFS/Samba as well.

Lastly, I've considered the idea of using Shared-QFS to glue together multiple Thumper-hosted iSCSI LUNs. You could add Shared-QFS clients (acting as NFS/CIFS servers) if the client load needed more than one. Then SAM-FS would be a possibility for backup/replication.

Anyway, I do feel that none of this stuff is quite there yet. But my experience with ZFS on fibre-channel SAN storage, that sinking feeling I've had when a little connectivity glitch resulted in a ZFS panic, makes me wonder whether non-redundant ZFS on an iSCSI SAN is there yet, either. So far none of our lost-connection incidents has resulted in pool corruption, but we have only 4TB or so. Restoring that much from tape is feasible, but even if Gray's 150TB of data can be recreated, it would take weeks to reload it.

If it's decided that the clustered-filesystem solutions aren't feasible yet, the suggestion I've seen that I liked best was Richard's, with a bad-boy server SAS-connected to multiple J4500s. But since Gray's project already has the X4500's, I guess they'd have to find another use for them (:-).

Regards, Marion
Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!
Nigel Smith wrote:

> I googled on some sub-strings from your ESX logs and found these threads on the VMware forum, which list similar error messages and suggest some actions to try on the ESX server:
> http://communities.vmware.com/message/828207
> Also, see this thread:
> http://communities.vmware.com/thread/131923
> Are you using multiple Ethernet connections between the OpenSolaris box and the ESX server?

Indeed, I think there might be some notion of 2 separate interfaces. I see the 0.0.0.0 and the 138.xx.xx.xx networks:

    Oct 16 06:38:29 vmware-860-1 vmkernel: 0:02:03:00.166 cpu1:1080)iSCSI: bus 0 target 40 trying to establish session 0x9a684e0 to portal 0, address 0.0.0.0 port 3260 group 1
    Oct 16 06:16:30 vmware-860-1 vmkernel: 0:01:41:01.021 cpu1:1076)iSCSI: bus 0 target 38 established session 0x9a402c0 #1 to portal 0, address 138.23.117.32 port 3260 group 1, alias luna

Do you have an active interface on the OpenSolaris box that is configured for 0.0.0.0 right now? By default, since you haven't configured a tpgt on the iscsi target, Solaris will advertise all active interfaces in its SendTargets response. On the ESX side, ESX will attempt to log into all addresses in that SendTargets response, even though you may only put one address in the software-initiator config. If that is the case, you have a few options:

a) disable that bogus interface
b) fully configure it, and also create a vmkernel interface that can connect to it
c) configure a tpgt mask on the iscsi target (iscsitadm create tpgt) so that only the valid address is used (a sketch of this follows below)

Also, I never see target 40 log into anything... is that still a valid target number? You may want to delete everything in /var/lib/iscsi and reboot the host. The vmkbinding and vmkdiscovery files will be rebuilt and it will start over with target 0. Sometimes things get a bit crufty.

-ryan
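To flesh out option (c), a minimal sketch, assuming the valid portal is 138.23.117.32 and the target is named LABEL1 as in the earlier steps:

    iscsitadm create tpgt 1
    iscsitadm modify tpgt -i 138.23.117.32 1    # bind the portal address to tpgt 1
    iscsitadm modify target -p 1 LABEL1         # restrict the target to tpgt 1

After that, SendTargets should advertise only the 138.23.117.32 portal, and the sessions to 0.0.0.0 should stop appearing.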
Re: [zfs-discuss] 200805 Grub problems
Ok, I managed to get my grub menu (and splash image) back by following:
http://www.genunix.org/wiki/index.php/ZFS_rpool_Upgrade_and_GRUB

Initially I just did it for the boot environment I wanted to use, but that didn't seem to work, so I also did it for the previous boot environment. I'm not sure what it did, but it gave me the grub menu back (and the splash image). However, it would kernel panic (when trying to mount the ZFS rpool, I'm guessing). I eventually followed the exact procedure for all my boot environments, doing them in order from oldest to newest; I also didn't export rpool before I rebooted out of the LiveCD, and now it boots up fine.

Anyone know what's going on? I've had this happen to me twice now, on separate machines.
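If it helps anyone else: as far as I can tell, the step in that procedure that does the real work is reinstalling the GRUB boot blocks; a sketch, assuming the root pool lives on c0t0d0s0 (your slice will differ):

    installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t0d0s0

That has to land on the slice the BIOS actually boots from, which may be why doing it for only one boot environment wasn't enough for me.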
Re: [zfs-discuss] Enable compression on ZFS root
Vincent Fox wrote: Or perhaps compression should be the default.

No way, please! Things taking even more memory should never be the default. An installation switch would be nice, though. Freedom of choice ;-)

-- Dick Hoogendijk -- PGP/GnuPG key: F86289CE
++ http://nagual.nl/ | SunOS 10u5 05/08 ++