Re: [zfs-discuss] SPARC SATA, please.
Richard Elling wrote:
> Miles Nordin wrote:
>> "ave" == Andre van Eyssen an...@purplecow.org writes:
>> "et"  == Erik Trimble erik.trim...@sun.com writes:
>> "ea"  == Erik Ableson eable...@mac.com writes:
>> "edm" == Eric D. Mudama edmud...@bounceswoosh.org writes:
>>
>> ave> The LSI SAS controllers with SATA ports work nicely with SPARC.
>>
>> I think what you mean is ``some LSI SAS controllers work nicely with SPARC''. It would help if you said exactly which one you're using. I thought the LSI 1068 does not work with SPARC (mfi driver, x86 only).
>
> Sun has been using the LSI 1068[E] and its cousin, the 1064[E], in SPARC machines for many years. In fact, I can't think of a SPARC machine in the current product line that does not use either the 1068 or 1064 (I'm sure someone will correct me, though ;-)
> -- richard

Might be worth having a look at the T1000 to see what's in there. We used to ship those with SATA drives in.

cheers,
--justin

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Purpose of zfs_acl_split_ace
Hi, in nfs-discuss, Andrew Watkins has brought up the question of why an inheritable ACE is split into two ACEs when a descendant directory is created. Ref: http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/zfs_acl.c#1506

I must admit that I had observed this behavior many times, but never asked myself why ACE inheritance is implemented like this. The best explanation I can come up with is that chmod calls on the mode bits should not change inheritable ACEs, and splitting inheritable (non-inherit-only) ACEs is an easy way to achieve this. Does this interpretation match the original intention, or are there other or better reasons? Is there a reason why inheritable ACEs are always split, even if the particular chmod call would not require splitting them?

Thank you, Nils ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
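For anyone who wants to see the split in action, a minimal sketch (the pool/dataset name and group are hypothetical, and the exact ls -v output varies by build):

    zfs create tank/acltest
    chmod A+group:staff:read_data/write_data:file_inherit/dir_inherit:allow /tank/acltest
    mkdir /tank/acltest/sub
    ls -dv /tank/acltest       # one inheritable ACE on the parent
    ls -dv /tank/acltest/sub   # on the child the inherited ACE appears twice: an inherit-only copy plus an effective copy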
[zfs-discuss] Increase size of ZFS mirror
Hi all, I have a ZFS mirror of two 500GB disks, and I'd like to move up to 1TB disks. How can I do this? I must break the mirror as I don't have enough controller ports on my system board. My current mirror looks like this:

    r...@beleg-ia:/share/media# zpool status share
      pool: share
     state: ONLINE
     scrub: none requested
    config:

            NAME        STATE     READ WRITE CKSUM
            share       ONLINE       0     0     0
              mirror    ONLINE       0     0     0
                c5d0s0  ONLINE       0     0     0
                c5d1s0  ONLINE       0     0     0

    errors: No known data errors

If I detach c5d1s0, add a 1TB drive, attach that, wait for it to resilver, then detach c5d0s0 and add another 1TB drive and attach that to the zpool, will that up the storage of the pool?

Thanks very much, Ben -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput?
Thank you for your reply. I had read the blog. The most interesting thing is WHY there is no performance improvement when any compression is set. The compressed read I/O is smaller than the uncompressed data, and decompression is faster than compression, so if lzjb writes are better than uncompressed writes, shouldn't lzjb reads be better than writes? Do the ARC or L2ARC play any tricks here? Thanks

From: David Pacheco david.pach...@sun.com
To: Chookiex hexcoo...@yahoo.com
Cc: zfs-discuss@opensolaris.org
Sent: Wednesday, June 24, 2009 4:53:37 AM
Subject: Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput?

Chookiex wrote:
> Hi all. Because the property compression could decrease the file size, the file I/O will be decreased also. So, would compression increase the ZFS I/O throughput? For example, I turn on gzip-9 on a server with 2*4-core Xeon and 8GB RAM. It compresses my files with a compressratio of 2.5x+. Could it be? Or I turn on lzjb, about 1.5x with the same files. Could it be? Does anyone have an idea? Thanks

It's possible, but it depends on a lot of factors, including what your bottleneck is to begin with, how compressible your data is, and how hard you want the system to work compressing it. With gzip-9, I'd be shocked if you saw bandwidth improved. It seems more common with lzjb: http://blogs.sun.com/dap/entry/zfs_compression (skip down to the results)
-- Dave

-- David Pacheco, Sun Microsystems Fishworks. http://blogs.sun.com/dap/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Increase size of ZFS mirror
On Wed, 24 Jun 2009 03:14:52 PDT Ben no-re...@opensolaris.org wrote: If I detach c5d1s0, add a 1TB drive, attach that, wait for it to resilver, then detach c5d0s0 and add another 1TB drive and attach that to the zpool, will that up the storage of the pool? That will do the trick perfectly. I just did the same last week ;-) -- Dick Hoogendijk -- PGP/GnuPG key: 01D2433D + http://nagual.nl/ | nevada / OpenSolaris 2009.06 release + All that's really worth doing is what we do for others (Lewis Carrol) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zpool status -x output
Hi, in the company where I'm working we use the zpool status -x command output in monitoring scripts to check the health of all ZFS pools. Everything is OK except for a few systems where the zpool status -x output is exactly the same as zpool status. I'm not sure, but it looks like this behavior is not OS-version specific (I observe it on one of the latest OpenSolaris builds but also on some previous ones, and on two boxes with Solaris 10). I found that in all these cases the command output contains some additional notes like:

    status: The pool is formatted using an older on-disk format. The pool can still be used, but some features are unavailable.
    action: Upgrade the pool using 'zpool upgrade'. Once this is done, the pool will no longer be accessible on older software versions.

If the zpool status -x output is not the single-line "all pools are healthy" message just because a pool uses an old on-disk format, IMO this behavior is incorrect, because the zpool(1M) description for the -x option says:

    -x    Only display status for pools that are exhibiting errors or are otherwise unavailable.

In this case there are no errors in the pools and all resources are still available. Comments? Should I open a case for this?

Tomasz -- Wydział Zarządzania i Ekonomii Politechnika Gdańska http://www.zie.pg.gda.pl/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Turn off the time slider on some zpools
How to turn off the timeslider snapshots on certain file systems?
http://wikis.sun.com/display/OpenSolarisInfo/How+to+Manage+the+Automatic+ZFS+Snapshot+Service

Thank you, very handy stuff! BTW, will ZFS automatically delete snapshots when I run low on disk space?

-- With respect, Nik Maslov ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
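For the per-filesystem case, the procedure on that page boils down to a ZFS user property; a minimal sketch (dataset names are hypothetical):

    # exclude one filesystem from all automatic snapshots
    zfs set com.sun:auto-snapshot=false rpool/export/scratch
    # or exclude it from just one schedule
    zfs set com.sun:auto-snapshot:frequent=false rpool/export/scratch
    # check what is currently set
    zfs get -r com.sun:auto-snapshot rpool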
[zfs-discuss] Migration: 1 x 160GB IDE boot drive --- 2 x 30GB SATA SSDs
Hi, I have OpenSolaris 2009.06 currently installed on a 160 GB IDE drive. I want to replace this with a 2-way mirror 30 GB SATA SSD boot setup. I found these 2 threads which seem to answer some questions I had, but I still have some questions.
http://opensolaris.org/jive/thread.jspa?messageID=386577
http://opensolaris.org/jive/thread.jspa?threadID=104656

FIRST QUESTION: Although it seems possible to add a drive to form a mirror for the ZFS boot pool 'rpool', the main problem I see is that in my case, I would be attempting to form a mirror using a smaller drive (30GB) than the initial 160GB drive. Is there an easy solution to this problem, or would it be simpler to just do a reinstall of OpenSolaris 2009.06 onto 2 brand new 30GB SSDs? I have the option of the fresh install, as I haven't invested much time in configuring this OS2009.06 boot environment yet.

SECOND QUESTION: I also want the possibility to have multiple boot environments within OpenSolaris 2009.06 to allow easy rollback to a working boot environment in case of an IPS update problem. I presume this will not cause any additional complications?

THIRD QUESTION: This is for a home fileserver so I don't want to spend too much, but does anyone see any problem with having the OS installed on MLC SSDs, which are cheaper than SLC SSDs? I'm thinking here specifically about wearing out the SSD if the OS does too many writes to the SSDs. I agree SSDs are a bit overkill, and using standard spinning metal would be cheaper, but the case is vibrating like crazy as I ran out of drive slots and had to use non-grommeted attachments for the boot drive. But the SSDs should be silent and should certainly speed up boot and shutdown times dramatically :)

Thanks, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/ -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Increase size of ZFS mirror
dick hoogendijk schrieb:
> On Wed, 24 Jun 2009 03:14:52 PDT Ben no-re...@opensolaris.org wrote:
>> If I detach c5d1s0, add a 1TB drive, attach that, wait for it to resilver, then detach c5d0s0 and add another 1TB drive and attach that to the zpool, will that up the storage of the pool?
> That will do the trick perfectly. I just did the same last week ;-)

Doesn't detaching leave the detached disk unassociated with the pool? I think it might be better to import the pool with only one half of the mirror, without detaching the disk, and then do a zpool replace. In that case, if something goes wrong during the resilver, you still have the other half of the mirror to bring your pool back up again. If you detach the disk up front this won't be possible. Just an idea...

- Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Increase size of ZFS mirror
Thomas, Could you post an example of what you mean (ie commands in the order to use them)? I've not played with ZFS that much and I don't want to muck my system up (I have data backed up, but am more concerned about getting myself in a mess and having to reinstall, thus losing my configurations). Many thanks for both of your replies, Ben -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS for iSCSI based SAN
Hi, I'm getting involved in a pre-production test and want to be sure of the means I'll have to use. Take:

- 2 SunFire X4150
- 1 Cisco 3750 Gb switch
- 1 private VLAN on the Gb ports of the switch

One X4150 is going to be the ESX4 (aka vSphere) server (1 hardware mirror of 146G, 32G RAM), booting ESX 4 on local disk. The other is going to be used as a poor man's SAN: 8 x 146G SAS 15k, 8GB RAM, Solaris 10. The first 2 disks are a hardware mirror of 146GB with a Sol10 UFS filesystem on it. The other 6 will be used as a raidz2 ZFS pool of 535G, with compression and shareiscsi=on. I'm going to CHAP-protect it soon... I'm going to put two ZFS volumes on it:

    zfs create -V 250G SAN/ESX1
    zfs create -V 250G SAN/ESX2

and use them for VMFS. Oh, by the way, I have no VMotion plugin. In my tests ESX4 seems to work fine with this, but I haven't stressed it yet ;-) I don't know if the 1Gb full duplex per port will be enough, and I don't know whether I'll have to put in some sort of redundant access from ESX to the SAN, etc. Is my configuration OK? It's only a preprod install; I'm able to break almost everything if necessary. Thanks for all your answers. Yours, faithfully.

-- Cordialement. Lycée Alfred Nobel, Clichy sous bois http://www.lyceenobel.org KeyID 0x46EA1D16 FingerPrint 997B164F4F606A61E7B1FC61961A821646EA1D16 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
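For reference, a minimal sketch of the Solaris 10 side of such a setup, assuming the legacy iscsitgt target daemon and hypothetical device names (CHAP setup via iscsitadm is not shown):

    zpool create SAN raidz2 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0
    zfs set compression=on SAN
    zfs create -V 250G SAN/ESX1
    zfs set shareiscsi=on SAN/ESX1
    iscsitadm list target -v    # confirm the target and note its IQN for the ESX initiator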
Re: [zfs-discuss] Turn off the time slider on some zpools
cindy.swearin...@sun.com writes:
> Hi Harry, Are you attempting this change when logged in as yourself or as root?

My user.

> The top section of this procedure describes how to add yourself to the zfssnap role. Otherwise, if you are doing this step as a non-root user, it probably won't work.

My user is in the zfssnap role, and in the `root' role:

    $ roles
    postgres,root,zfssnap

I'm not sure how a user can access the GUI tool without being logged in as themselves, since the tool is on the System dropdown menu. Do you mean root has to log into X? But anyway, the command lines shown on that page are much handier. I just wondered if the GUI tool was working as expected. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Increase size of ZFS mirror
Many thanks Thomas, I have a test machine so I shall try it on that before I try it on my main system. Thanks very much once again, Ben -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Purpose of zfs_acl_split_ace
Nils Goroll wrote:
> Hi, I just noticed that Mark Shellenbaum has replied to the same question in the thread "ACL not being inherited correctly" on zfs-discuss. Sorry for the noise. Out of curiosity, I would still be interested in answers to this question: Is there a reason why inheritable ACEs are always split, even if the particular chmod call would not require splitting them? For instance, a mode bit change would never influence n...@owner/@group/@everyone ACEs, and even for the @owner/@group/@everyone entries, one could check whether the mode bits are actually changed by the chmod call.

Any group entry could have its permissions modified in some situations (i.e. the group has greater permissions than the owner). It's true that a user entry wouldn't necessarily need it, but in order to keep the algorithm simpler we just always do the split. It would be simple enough to exclude user entries from splitting. Feel free to open a bug on this.

> Does this make any sense? Thank you, Nils

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Increase size of ZFS mirror
On 24.06.09 17:10, Thomas Maier-Komor wrote:
> Ben schrieb:
>> Thomas, Could you post an example of what you mean (ie commands in the order to use them)? I've not played with ZFS that much and I don't want to muck my system up (I have data backed up, but am more concerned about getting myself in a mess and having to reinstall, thus losing my configurations). Many thanks for both of your replies, Ben
>
> I'm not an expert on this, and I haven't tried it, so beware:
>
> 1) If the pool you want to expand is not the root pool:
>
> $ zpool export mypool
> # now replace one of the disks with a new disk
> $ zpool import mypool
> # zpool status will show that mypool is in degraded state because of a missing disk
> $ zpool replace mypool replaceddisk
> # now the pool will start resilvering
> # Once it is done with resilvering:
> $ zpool detach mypool otherdisk
> # now physically replace otherdisk
> $ zpool replace mypool otherdisk

The last command would fail, as there would no longer be an otherdisk in mypool. Though you can always play with files first (or with VirtualBox etc):

    # preparation
    mkdir -p /var/tmp/disks/removed
    mkfile -n 64m /var/tmp/disks/disk0
    mkfile -n 64m /var/tmp/disks/disk1
    mkfile -n 128m /var/tmp/disks/bigdisk0
    mkfile -n 128m /var/tmp/disks/bigdisk1
    zpool create test mirror /var/tmp/disks/disk0 /var/tmp/disks/disk1
    zpool list test

    # let's start by making sure there are no latent errors:
    zpool scrub test
    while zpool status -v test | grep % ; do sleep 1; done
    zpool status -v test

    zpool export test
    mv /var/tmp/disks/disk0 /var/tmp/disks/removed/disk0
    # you don't need '-d /path' with real disks
    zpool import -d /var/tmp/disks test
    zpool status -v test

    # insert new disk
    mv /var/tmp/disks/bigdisk0 /var/tmp/disks/disk0
    zpool replace test /var/tmp/disks/disk0
    while zpool status -v test | grep % ; do sleep 1; done
    zpool status -v test
    # make sure that resilvering is complete
    zpool detach test /var/tmp/disks/disk1
    mv /var/tmp/disks/disk1 /var/tmp/disks/removed/disk1

    # insert new disk
    mv /var/tmp/disks/bigdisk1 /var/tmp/disks/disk1
    zpool attach test /var/tmp/disks/disk0 /var/tmp/disks/disk1
    while zpool status -v test | grep % ; do sleep 1; done
    zpool status -v test
    zpool list test

hth, victor ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for iSCSI based SAN
On Wed, June 24, 2009 08:42, Philippe Schwarz wrote: In my tests ESX4 seems to work fine with this, but i haven't already stressed it ;-) Therefore, i don't know if the 1Gb FDuplex per port will be enough, i don't know either i'have to put sort of redundant access form ESX to SAN,etc Is my configuration OK ? It's only a preprod install, i'm able to break almost everything if it's necessary. At least in 3.x, VMware had a limitation of only being able to use one connection per iSCSI target (even if there were multiple LUNs on it): http://mail.opensolaris.org/pipermail/zfs-discuss/2009-June/028731.html Not sure if that's changed in 4.x, so if you're going to have more than one LUN, then having more than one target may be advantageous. See also: http://www.vmware.com/files/pdf/iSCSI_design_deploy.pdf You may want to go to the VMware lists / forums to see what the people there say as well. Out of curiosity, any reason why went with iSCSI and not NFS? There seems to be some debate on which is better under which circumstances. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool status -x output
It might be easier to look for the pool status thusly:

    zpool get health poolname

-- richard

Tomasz Kłoczko wrote:
> [...]

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
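As a rough sketch of how a monitoring script could use this instead of parsing 'zpool status -x' (the awk field position is an assumption; check the output columns on your release):

    #!/bin/sh
    # Flag any pool whose health property is not ONLINE.
    for p in `zpool list -H -o name`; do
        state=`zpool get health "$p" | awk 'NR > 1 { print $3 }'`
        if [ "$state" != "ONLINE" ]; then
            echo "pool $p is $state"
        fi
    done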
Re: [zfs-discuss] ZFS for iSCSI based SAN
2 first disks Hardware mirror of 146Go with Sol10 UFS filesystem on it. The next 6 others will be used as a raidz2 ZFS volume of 535G, compression and shareiscsi=on. I'm going to CHAP protect it soon... you're not going to get the random read write performance you need for a vm backend out of any kind of parity raid. just go with 3 sets of mirrors. unless you're ok with subpar performance (and if you think you are, you should really reconsider). also you might get significant mileage out of putting an ssd in and using it for zil. here's a good post from roch's blog about parity vs mirrored setups: http://blogs.sun.com/roch/entry/when_to_and_not_to ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
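A minimal sketch of the mirrored layout suggested above, with an optional separate log device (device names are hypothetical, and log devices require a pool version that supports them):

    zpool create vmpool mirror c0t2d0 c0t3d0 mirror c0t4d0 c0t5d0 mirror c0t6d0 c0t7d0
    zpool add vmpool log c1t0d0    # optional: put the ZIL on an SSD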
Re: [zfs-discuss] ZFS for iSCSI based SAN
See this thread for information on load testing for vmware: http://communities.vmware.com/thread/73745?tstart=0start=0 Within the thread there are instructions for using iometer to load test your storage. You should test out your solution before going live, and compare what you get with what you need. Just because striping 3 mirrors *will* give you more performance than raidz2 doesn't always mean that is the best solution. Choose the best solution for your use case. You should have at least two NICs per connection to storage and LAN (4 total in this simple example), for redundancy if nothing else. Performance wise, vsphere can now have multiple SW iSCSI connections to a single LUN. My testing showed compression increased iSCSI performance by 1.7x, so I like compression. But again, these are my tests in my situation. Your results may differ from mine. Regarding ZIL usage, from what I have read you will only see benefits if you are using NFS backed storage, but that it can be significant. Remove the ZIL for testing to see the max benefit you could get. Don't do this in production! -Scott -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
http://opensolaris.org/jive/thread.jspa?threadID=105702tstart=0 Yes, this does sound very similar. It looks to me like data from read files is clogging the ARC so that there is no more room for more writes when ZFS periodically goes to commit unwritten data. I'm wondering if changing txg_time to a lower value might help. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
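For anyone who wants to experiment with that, a hedged sketch, assuming the txg_time variable named in Roch's write-throttle write-up is exposed by the zfs kernel module on your build (verify the symbol first; this is a test knob, not a recommendation):

    echo "txg_time/W 0t1" | mdb -kw    # sync a txg roughly every 1 second instead of 5, on the live kernel
    # or persistently, via /etc/system:
    set zfs:txg_time = 1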
Re: [zfs-discuss] ZFS for iSCSI based SAN
Bottom line with virtual machines is that your IO will be random by definition since it all goes into the same pipe. If you want to be able to scale, go with RAID 1 vdevs. And don't skimp on the memory. Our current experience hasn't shown a need for an SSD for the ZIL, but it might be useful for L2ARC (using iSCSI for VMs, NFS for templates and ISO images).

Cordialement, Erik Ableson +33.6.80.83.58.28 Envoyé depuis mon iPhone

On 24 juin 2009, at 18:56, milosz mew...@gmail.com wrote:
>> Within the thread there are instructions for using iometer to load test your storage. You should test out your solution before going live, and compare what you get with what you need. Just because striping 3 mirrors *will* give you more performance than raidz2 doesn't always mean that is the best solution. Choose the best solution for your use case.
>
> multiple vm disks that have any kind of load on them will bury a raidz or raidz2. out of a 6x raidz2 you are going to get the iops and random seek latency of a single drive (realistically the random seek will probably be slightly worse, actually). how could that be adequate for a virtual machine backend? if you set up a raidz2 with 6x15k drives, for the majority of use cases, you are pretty much throwing your money away. you are going to roll your own san, buy a bunch of 15k drives, use 2-3u of rackspace and four (or more) switchports, and what you're getting out of it is essentially a 500gb 15k drive with a high mttdl and a really huge theoretical transfer speed for sequential operations (which you won't be able to saturate anyway because you're delivering over gige)? for this particular setup i can't really think of a situation where that would make sense.
>
>> Regarding ZIL usage, from what I have read you will only see benefits if you are using NFS backed storage, but that it can be significant.
>
> link?

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
On Wed, 24 Jun 2009, Ethan Erchinger wrote: http://opensolaris.org/jive/thread.jspa?threadID=105702tstart=0 Yes, this does sound very similar. It looks to me like data from read files is clogging the ARC so that there is no more room for more writes when ZFS periodically goes to commit unwritten data. I'm wondering if changing txg_time to a lower value might help. There is no doubt that having ZFS sync the written data more often would help. However, it should not be necessary to tune the OS for such a common task as batch processing a bunch of files. A more appropriate solution is for ZFS to notice that more than XXX megabytes are uncommitted, so maybe it should wake up and go write some data. It is useful for ZFS to defer data writes in case the same file is updated many times. In the case where the same file is updated many times, the total uncommitted data is still limited by the amount of data which is re-written and so the 30 second cycle is fine. In my case the amount of uncommitted data is limited by available RAM and how fast my application is able to produce new data to write. The problem is very much related to how fast the data is output. If the new data is created at a slower rate (output files are smaller) then the problem just goes away. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Narrow escape!
Ok, this is getting weird. I just ran a zpool clear, and now it says:

    # zpool clear zfspool
    # zpool status
      pool: zfspool
     state: ONLINE
    status: The pool is formatted using an older on-disk format. The pool can
            still be used, but some features are unavailable.
    action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
            pool will no longer be accessible on older software versions.
     scrub: scrub completed after 6h35m with 0 errors on Wed Jun 24 02:46:58 2009
    config:

            NAME        STATE     READ WRITE CKSUM
            zfspool     ONLINE       0     0     0
              raidz2    ONLINE       0     0     0
                c1t1d0  ONLINE       0     0     0  107G repaired
                c1t2d0  ONLINE       0     0     0
                c1t3d0  ONLINE       0     0     0  688K repaired
                c1t4d0  ONLINE       0     0     0  774K repaired
                c1t5d0  ONLINE       0     0     0

    errors: No known data errors

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Increase size of ZFS mirror
Ben wrote:
> [...]
> If I detach c5d1s0, add a 1TB drive, attach that, wait for it to resilver, then detach c5d0s0 and add another 1TB drive and attach that to the zpool, will that up the storage of the pool? Thanks very much, Ben

The following changes, which went into snv_116, change this behavior:

    PSARC 2008/353 zpool autoexpand property
    6475340 when lun expands, zfs should expand too
    6563887 in-place replacement allows for smaller devices
    6606879 should be able to grow pool without a reboot or export/import
    6844090 zfs should be able to mirror to a smaller disk

With this change we introduced a new property ('autoexpand') which you must enable if you want devices to automatically grow (this includes replacing them with larger ones). You can alternatively use the '-e' (expand) option to 'zpool online' to grow individual drives even if 'autoexpand' is disabled. The reason we made this change was so that all device expansion would be managed in the same way. I'll try to blog about this soon, but for now be aware that post snv_116 the typical method of growing pools by replacing devices will require at least one additional step.

Thanks, George ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
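In other words, on snv_116 or later the extra step looks roughly like this (the pool and device names follow Ben's example and are otherwise assumptions):

    zpool set autoexpand=on share    # let the pool grow automatically as larger devices arrive
    # or, after a replacement, expand just one device by hand:
    zpool online -e share c5d1s0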
Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput?
Chookiex wrote:
> Thank you for your reply. I had read the blog. The most interesting thing is WHY is there no performance improvement when any compression is set.

There are many potential reasons, so I'd first try to identify what your current bandwidth limiter is. If you're running out of CPU on your current workload, for example, adding compression is not going to help performance. If this is over a network, you could be saturating the link. Or you might not have enough threads to drive the system to bandwidth. Compression will only help performance if you've got plenty of CPU and other resources but you're out of disk bandwidth. But even if that's the case, it's possible that compression doesn't save enough space that you actually decrease the number of disk I/Os that need to be done.

> The compressed read I/O is less than uncompressed data, and decompress is faster than compress.

Out of curiosity, what's the compression ratio?

-- Dave

-- David Pacheco, Sun Microsystems Fishworks. http://blogs.sun.com/dap/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
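For checking that, a small sketch (the dataset name is hypothetical):

    zfs get compression,compressratio tank/data    # report the current algorithm and the achieved ratio
    zfs set compression=lzjb tank/data             # note: only blocks written after the change are compressed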
Re: [zfs-discuss] SPARC SATA, please.
I think this is the board that shipped in the original T2000 machines before they began putting the sas/sata onboard: LSISAS3080X-R. Can anyone verify this?

Justin Stringfellow wrote:
> Might be worth having a look at the T1000 to see what's in there. We used to ship those with SATA drives in.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs on 32 bit?
Dennis is correct in that there are significant areas where 32-bit systems will remain the norm for some time to come. Think of the hundreds of thousands of VMware ESX/Workstation/Player/Server installations on non-VT-capable CPUs - even if the CPU has 64-bit capability, a VM cannot run in 64-bit mode if the CPU is missing VT support. And VT hasn't been available for very long; there are even recent CPUs which don't have VT support. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
jr == Jacob Ritorto jacob.rito...@gmail.com writes:

jr> I think this is the board that shipped in the original T2000 machines before they began putting the sas/sata onboard: LSISAS3080X-R
jr> Can anyone verify this?

can't verify, but FWIW I fucked it up:

> I thought the LSI 1068 do not work with SPARC (mfi driver, x86 only).

^ me. this is wrong.

mega_sas, the open source driver for 1078/PERC, is x86-only: http://mail.opensolaris.org/pipermail/zfs-discuss/2009-March/027338.html

mpt is the 1068 driver, proprietary, works on x86 and SPARC.

mfi is some other (abandoned?) random third-party open-source driver for some of these cards that no one's mentioned using yet, at https://svn.itee.uq.edu.au/repo/mfi/

then there is also itmpt, the third-party-downloadable closed-source driver from LSI Logic; dunno much about it but someone here used it. sorry.

There's also been talk of two tools, MegaCli and lsiutil, which are both binary only and exist for both Linux and Solaris, and I think are used only with the 1078 cards, but maybe not.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Increase size of ZFS mirror
Thomas Maier-Komor wrote:
> [...]
> $ zpool detach mypool otherdisk
> # now physically replace otherdisk
> $ zpool replace mypool otherdisk

This will all work well. But I have a couple of suggestions for you as well. If you are using mirrored vdevs then you can also grow the vdev by making it a 3- or 4-way mirror. This way you don't lose your resiliency in the vdev whilst you are migrating to larger disks. Of course you have to be able to take the extra device in your system, either via a spare drive bay in a storage enclosure or via SAN- or iSCSI-based LUNs. When you have a lot of data and the business requires you to minimize any risk as much as possible, this is a good idea. The pool was only offline for 14 seconds to gain the extra space, and at all times there were *always* two devices in my mirror vdev.

Here is a cut and paste from this process from just the other day with a live production server where the maintenance window was only 5 minutes. This pool was increased from 300 to 500 GB on LUNs from two disparate datacentres.

    2009-06-17.13:57:05 zpool attach blackboard c4t600C0FF00924686710D4CF02d0 c4t600C0FF00082CA2312B99E05d0
    2009-06-17.18:12:14 zpool detach blackboard c4t600C0FF00080797CC7A87F02d0
    2009-06-17.18:12:57 zpool attach blackboard c4t600C0FF00924686710D4CF02d0 c4t600C0FF00086136F22B65F05d0
    2009-06-17.20:02:00 zpool detach blackboard c4t600C0FF00924686710D4CF02d0
    2009-06-18.05:58:52 zpool export blackboard
    2009-06-18.05:59:06 zpool import blackboard

For home users this is probably overkill, but I thought I would mention it for more enterprise-type people that are maybe familiar with DiskSuite and not so much with ZFS.

-- Scott Lawson, Systems Architect, Manukau Institute of Technology, Information Communication Technology Services, Private Bag 94006, Manukau City, Auckland, New Zealand. Phone: +64 09 968 7611 Fax: +64 09 968 7641 Mobile: +64 27 568 7611 mailto:sc...@manukau.ac.nz http://www.manukau.ac.nz
perl -e 'print $i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);'
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
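Generalized into a sketch with generic device names (the final export/import is only needed on builds that predate the autoexpand property mentioned elsewhere in this thread):

    zpool attach tank c1t0d0 c2t0d0    # grow the mirror to 3-way with the first larger disk
    # wait for the resilver to finish (check zpool status)
    zpool detach tank c1t1d0           # drop one of the old disks
    zpool attach tank c2t0d0 c2t1d0    # add the second larger disk
    # wait for the resilver to finish
    zpool detach tank c1t0d0           # drop the last old disk
    zpool export tank ; zpool import tank    # re-import so the pool picks up the new size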
Re: [zfs-discuss] Best controller card for 8 SATA drives ?
Hey sbreden! :o) No, I haven't tried to tinker with my drives. They have been functioning all the time. I suspect (can not remember) that each SATA slot on the card has a number attached to it? Can anyone confirm this? If I am right, OpenSolaris will say something like "disc 6 is broken" and on the card there is a number 6? Then you can identify the disc?

I thought of exchanging my PCI card with a PCIe card variant instead, to reach higher speeds. PCI-X is legacy. The problem with PCIe cards is that soon SSD drives will be common. A ZFS raid with SSDs would need maybe PCIe x16 or so to reach max bandwidth. The PCIe cards today are all PCIe x4 or something. I need a PCIe x16 card to make it future-proof for the SSD discs. Maybe the best bet would be to attach an SSD disc directly to a PCIe slot, to reach max transfer speed? Or wait for SATA 3? I don't know. I want to wait until SSD raids are tested out. Then I will buy an appropriate card capable of SSD raids. Maybe SSD discs should never be used in conjunction with a card, and always connect directly to the SATA port?

Until I know more on this, my PCI card will be fine. 150MB/sec is OK for my personal needs. (My ZFS raid is connected to my desktop PC. I don't have a server that is on 24/7 using power. I want to save power. Save the earth! :o) All my 5 ZFS raid discs are connected to one Molex. That Molex has a power switch. So I just turn on the ZFS raid and copy all files I need to my system disc (which is 500GB) and then immediately reboot and turn off the ZFS raid. This way I only have one disc active, which I use as a cache. When my data are ready, I copy them to the ZFS raid and then shut down the power to the ZFS raid discs.)

However, I have a question: which speed will I get with this solution? I have 2 SSD discs in a PCI slot = 150MB/sec. Now I add 1 SSD disc into a SATA slot and another SSD disc into another SATA slot. Then I have 5 discs on PCI = 150MB/sec, 1 disc on SATA = 300MB/sec (I assume SATA reaches 300MB/sec?), and 1 disc on SATA = 300MB/sec. I connect all 7 discs into one ZFS raid. Which speed will I get? Will I get 150 + 300 + 300MB/sec? Or will the PCI slot strangle the SATA ports? Or will the fastest speed win and I will only get 300MB/sec?

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
Wouldn't it make sense for the timing technique to be used if the data is coming in at a rate slower than the underlying disk storage? But then if the data starts to come at a faster rate, ZFS needs to start streaming to disk as quickly as it can, and instead of re-ordering writes in blocks, it should just do the best it can with whatever is currently in memory. And when that mode activates, inbound data should be throttled to match the current throughput to disk. That preserves the efficient write ordering that ZFS was originally designed for, but means a more graceful degradation under load, with the system tending towards a steady state of throughput that matches what you would expect from other filesystems on those physical disks. Of course, I have no idea how difficult this is technically. But the idea seems reasonable to me. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
Bob Friesenhahn wrote:
> On Wed, 24 Jun 2009, Marcelo Leal wrote:
>> Hello Bob, I think that is related to my post about zio_taskq_threads and TXG sync: ( http://www.opensolaris.org/jive/thread.jspa?threadID=105703tstart=0 ) Roch did say that this is on top of the performance problems, and in the same email I did talk about the change from 5s to 30s, which I think makes this problem worse, if the txg sync interval is fixed.
>
> The problem is that basing disk writes on a simple timeout and available memory does not work. It is easy for an application to write considerable amounts of new data in 30 seconds, or even 5 seconds. If the application blocks while the data is being committed, then the application is not performing any useful function during that time. Current ZFS write behavior makes it not very useful for the creative media industries even though otherwise it should be a perfect fit, since hundreds of terabytes of working disk (or even petabytes) are normal for this industry. For example, when data is captured to disk from film via a datacine (real time = 24 files/second and 6MB to 50MB per file), or captured to disk from a high-definition video camera, there is little margin for error and blocking on writes will result in missed frames or other malfunction. Current ZFS write behavior is based on timing and the amount of system memory, and it does not seem that throwing more storage hardware at the problem solves anything at all.

I wonder whether a filesystem property "streamed" might be appropriate? This could act as a hint to ZFS that the data is sequential and should be streamed direct to disk.

-- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for iSCSI based SAN
milosz wrote:
> [...] multiple vm disks that have any kind of load on them will bury a raidz or raidz2. [...] for this particular setup i can't really think of a situation where that would make sense.

Ouch! Pretty direct answer. That's very interesting, however. Let me focus on a few more points:

- The hardware can't really be extended any more. No budget ;-(
- The VMs will mostly be low-IO systems:
  -- WS2003 with Trend OfficeScan, WSUS (for 300 XP clients) and RDP
  -- Solaris 10 with SRSS 4.2 (Sun Ray server)

(File and DB servers won't move to VM+SAN in the near future.) I thought - but I could be wrong - that those systems could live with high-latency IO.

> what you're getting out of it is essentially a 500gb 15k drive with a high mttdl

That's what I wanted: a rock-solid disk area, despite random IO that is not as good as I'd like. I'll give it a try with sequential transfer. However, thanks for your answer.

> Regarding ZIL usage, from what I have read you will only see benefits if you are using NFS backed storage, but that it can be significant.
>
> link?

-- Cordialement. Lycée Alfred Nobel, Clichy sous bois http://www.lyceenobel.org KeyID 0x46EA1D16 FingerPrint 997B164F4F606A61E7B1FC61961A821646EA1D16 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for iSCSI based SAN
David Magda wrote:
> At least in 3.x, VMware had a limitation of only being able to use one connection per iSCSI target (even if there were multiple LUNs on it): http://mail.opensolaris.org/pipermail/zfs-discuss/2009-June/028731.html Not sure if that's changed in 4.x, so if you're going to have more than one LUN, then having more than one target may be advantageous. See also: http://www.vmware.com/files/pdf/iSCSI_design_deploy.pdf You may want to go to the VMware lists / forums to see what the people there say as well.
>
> Out of curiosity, any reason why you went with iSCSI and not NFS? There seems to be some debate on which is better under which circumstances.

iSCSI instead of NFS? Because of the overwhelming difference in transfer rate between them - at least, that's what I read. And setting up an iSCSI target is so simple that I didn't even look for another solution. Thanks for your answer.

-- Cordialement. Lycée Alfred Nobel, Clichy sous bois http://www.lyceenobel.org KeyID 0x46EA1D16 FingerPrint 997B164F4F606A61E7B1FC61961A821646EA1D16 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
On Wed, 24 Jun 2009, Marcelo Leal wrote: I think that is the purpose of the current implementation: http://blogs.sun.com/roch/entry/the_new_zfs_write_throttle But seems like is not that easy... as i did understand what Roch said, seems like the cause is not always a hardy writer. I see this: The new code keeps track of the amount of data accepted in a TXG and the time it takes to sync. It dynamically adjusts that amount so that each TXG sync takes about 5 seconds (txg_time variable). It also clamps the limit to no more than 1/8th of physical memory. It is interesting that it was decided that a TXG sync should take 5 seconds by default. That does seem to be about what I am seeing here. There is no mention of the devastation to the I/O channel which occurs if the kernel writes 5 seconds worth of data (e.g. 2GB) as fast as possible on a system using mirroring (2GB becomes 4GB of writes). If it writes 5 seconds of data as fast as possible, then it seems that this blocks any opportunity to read more data so that application processing can continue during the TXG sync. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
On Thu, 25 Jun 2009, Ian Collins wrote:
> I wonder whether a filesystem property "streamed" might be appropriate? This could act as a hint to ZFS that the data is sequential and should be streamed direct to disk.

ZFS does not seem to offer an ability to stream direct to disk other than perhaps via the special raw mode known to database developers. It seems that current ZFS behavior works as designed. The write transaction time is currently tuned for 5 seconds, and so it writes data intensely for 5 seconds while either starving the readers and/or blocking the writers. Notice that by the end of the TXG write, zpool iostat is reporting zero reads:

    % zpool iostat Sun_2540 1
                  capacity     operations    bandwidth
    pool        used  avail   read  write   read  write
    ---------  -----  -----  -----  -----  -----  -----
    Sun_2540    456G  1.18T     14      0  1.86M      0
    Sun_2540    456G  1.18T      0     19      0  1.47M
    Sun_2540    456G  1.18T      0  3.11K      0   385M
    Sun_2540    456G  1.18T      0  3.00K      0   385M
    Sun_2540    456G  1.18T      0  3.34K      0   387M
    Sun_2540    456G  1.18T      0  3.01K      0   386M
    Sun_2540    458G  1.18T     19  1.87K  30.2K   220M
    Sun_2540    458G  1.18T      0      0      0      0
    Sun_2540    458G  1.18T    275      0  34.4M      0
    Sun_2540    458G  1.18T    448      0  56.1M      0
    Sun_2540    458G  1.18T    468      0  58.5M      0
    Sun_2540    458G  1.18T    425      0  53.2M      0
    Sun_2540    458G  1.18T    402      0  50.4M      0
    Sun_2540    458G  1.18T    364      0  45.5M      0
    Sun_2540    458G  1.18T    339      0  42.4M      0
    Sun_2540    458G  1.18T    376      0  47.0M      0
    Sun_2540    458G  1.18T    307      0  38.5M      0
    Sun_2540    458G  1.18T    380      0  47.5M      0
    Sun_2540    458G  1.18T    148  1.35K  18.3M   117M
    Sun_2540    458G  1.18T     20  3.01K  2.60M   385M
    Sun_2540    458G  1.18T     15  3.00K  1.98M   384M
    Sun_2540    458G  1.18T      4  3.03K   634K   388M
    Sun_2540    458G  1.18T      0  3.01K      0   386M
    Sun_2540    460G  1.18T    142    792  15.8M  82.7M
    Sun_2540    460G  1.18T    375      0  46.9M      0

Here is an interesting discussion thread on another list that I had not seen before: http://opensolaris.org/jive/thread.jspa?messageID=347212

Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for iSCSI based SAN
- - the VM will be mostly few IO systems : - -- WS2003 with Trend Officescan, WSUS (for 300 XP) and RDP - -- Solaris10 with SRSS 4.2 (Sunray server) (File and DB servers won't move in a nearby future to VM+SAN) I thought -but could be wrong- that those systems could afford a high latency IOs data rate. might be fine most of the time... rdp in particular is vulnerable to io spiking and disk latency. depends on how many users you have on that rdp vm. also wsus is surprisingly (or not, given it's a microsoft production) resource-hungry. if those servers are on physical boxes right now i'd do some perfmon caps and add up the iops. what you're getting out of it is essentially a 500gb 15k drive with a high mttdl That's what i wanted, a rock-solid disk area, despite a not-as-good-as-i'd-like random IO. fair enough. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best controller card for 8 SATA drives ?
On Wed, Jun 24 at 15:38, Bob Friesenhahn wrote: On Wed, 24 Jun 2009, Orvar Korvar wrote: I thought of exchanging my PCI card with a PCIe card variant instead to reach higher speeds. PCI-X is legacy. The problem with PCIe cards is that soon SSD drives will be common. A ZFS raid with SSD would need maybe PCIe x 16 or so, to reach max band width. The PCIe cards are all PCIe x 4 or something of today. I need a PCIe x 16 card to make it future proof for the SSD discs. Maybe the best bet would be to attach a SSD disc directly to a PCIe slot, to reach max transfer speed? Or wait for SATA 3? I dont know. I want to wait until SSD I don't think this is valid thinking because it assumes that write rates for SSDs are higher than for traditional hard drives. This assumption is not often correct. Maybe someday. SSDs offer much lower write latencies (no head seek!) but their bulk sequential data transfer properties are not yet better than hard drives. The main purpose for using SSDs with ZFS is to reduce latencies for synchronous writes required by network file service and databases. In the available 5 months ago category, the Intel X25-E will write sequentially at ~170MB/s according to the datasheets. That is faster than most, if not all rotating media today. --eric -- Eric D. Mudama edmud...@mail.bounceswoosh.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for iSCSI based SAN
On Jun 24, 2009, at 16:54, Philippe Schwarz wrote: Out of curiosity, any reason why went with iSCSI and not NFS? There seems to be some debate on which is better under which circumstances. iSCSI instead of NFS ? Because of the overwhelming difference in transfer rate between them, In fact, that's what i read. That would depend on I/O pattern, wouldn't it? If you have mostly random I/O then it's unlikely you'd saturate a GigE as you're not streaming. Well, this is with 3.x. I don't have any experience with 4.x so I guess it's best to test. Everyone's going to have to build up all their knowledge from scratch with the new software. :) http://tinyurl.com/d8urpx http://vmetc.com/2009/05/01/reasons-for-using-nfs-with-vmware-virtual-infrastructure/ Cloning Windows images (assuming one VMDK per FS) would be a possibility as well. Either way, you may want to tweak some of the TCP settings for best results: http://serverfault.com/questions/13190 And setting isCSI target is so simple, that i didn't even search another solution. # zfs set sharenfs=on mypool/myfs1 http://docs.sun.com/app/docs/doc/819-5461/gamnd ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
Bob Friesenhahn wrote: On Wed, 24 Jun 2009, Marcelo Leal wrote: I think that is the purpose of the current implementation: http://blogs.sun.com/roch/entry/the_new_zfs_write_throttle But seems like is not that easy... as i did understand what Roch said, seems like the cause is not always a hardy writer. I see this: The new code keeps track of the amount of data accepted in a TXG and the time it takes to sync. It dynamically adjusts that amount so that each TXG sync takes about 5 seconds (txg_time variable). It also clamps the limit to no more than 1/8th of physical memory. hmmm... methinks there is a chance that the 1/8th rule might not work so well for machines with lots of RAM and slow I/O. I'm also reasonably sure that that sort of machine is not what Sun would typically build for performance lab testing, as a rule. Hopefully Roch will comment when it is morning in Europe. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best controller card for 8 SATA drives ?
On Wed, 24 Jun 2009, Eric D. Mudama wrote: The main purpose for using SSDs with ZFS is to reduce latencies for synchronous writes required by network file service and databases. In the available 5 months ago category, the Intel X25-E will write sequentially at ~170MB/s according to the datasheets. That is faster than most, if not all rotating media today. Sounds good. Is that is after the whole device has been re-written a few times or just when you first use it? How many of these devices do you own and use? Seagate Cheetah drives can now support a sustained data rate of 204MB/second. That is with 600GB capacity rather than 64GB and at a similar price point (i.e. 10X less cost per GB). Or you can just RAID-0 a few cheaper rotating rust drives and achieve a huge sequential data rate. I see that the Intel X25-E claims a sequential read performance of 250 MB/s. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
On Wed, 24 Jun 2009, Richard Elling wrote: The new code keeps track of the amount of data accepted in a TXG and the time it takes to sync. It dynamically adjusts that amount so that each TXG sync takes about 5 seconds (txg_time variable). It also clamps the limit to no more than 1/8th of physical memory. hmmm... methinks there is a chance that the 1/8th rule might not work so well for machines with lots of RAM and slow I/O. I'm also reasonably sure that that sort of machine is not what Sun would typically build for performance lab testing, as a rule. Hopefully Roch will comment when it is morning in Europe. Slow I/O is relative. If I install more memory does that make my I/O even slower? I did some more testing. I put the input data on a different drive and sent application output to the ZFS pool. I no longer noticed any stalls in the execution even though the large ZFS flushes are taking place. This proves that my application is seeing stalled reads rather than stalled writes. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Turn off the time slider on some zpools
Hi Mykola, Yes, if you are speaking of the automatic TimeSlider snapshots, the snapshots are rotated. I think the threshold is 80% full disk space. Cheers, Cindy Mykola Maslov wrote: How to turn off the timeslider snapshots on certain file systems? http://wikis.sun.com/display/OpenSolarisInfo/How+to+Manage+the+Automatic+ZFS+Snapshot+Service Thank you, very handy stuff! BTW - will zfs automatically delete snapshots, when I`ll go low on disk space? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
Bob Friesenhahn wrote:
> [...]
> I did some more testing. I put the input data on a different drive and sent application output to the ZFS pool. I no longer noticed any stalls in the execution even though the large ZFS flushes are taking place. This proves that my application is seeing stalled reads rather than stalled writes.

There is a bug in the database about reads blocked by writes which may be related: http://bugs.opensolaris.org/view_bug.do?bug_id=6471212 The symptom is that sometimes reducing queue depth makes reads perform better.

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] BugID formally known as 6746456
Does anyone know whether the problems related to the panics dismissed as duplicates of 6746456 ever resulted in Solaris 10 patches? It sounds like they were actually solved in OpenSolaris, but S10 still panics predictably when Linux NFS clients try to change a nobody UID/GID on a ZFS-exported filesystem. Specifically, the NFS-induced panics relate to the nobody id not mapping correctly; or, more precisely, attempts to change user/group ID nobody cause S10u7 to blow chunks in zfs_fuid.c's zfs_fuid_table_load ASSERT. While the workaround of changing the IDs on the server is possible, it pretty much torpedoes management's view of Solaris' stability and sends fileserver duty back to Linux... :( Without this being patched, anybody could create a nobody-owned file and put the system into endless boot loops. I'm hoping further work on this issue was done on the S10 side of the house and there is a stealthy patch ID that can fix the issue. Thanks, -Rob -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Migration: 1 x 160GB IDE boot drive --- 2 x 30GB SATA SSDs
On Wed, Jun 24, 2009 at 6:32 PM, Simon Bredenno-re...@opensolaris.org wrote: FIRST QUESTION: Although, it seems possible to add a drive to form a mirror for the ZFS boot pool 'rpool', the main problem I see is that in my case, I would be attempting to form a mirror using a smaller drive (30GB) than the initial 160GB drive. Is there an easy solution to this problem, or would it be simpler to just do a reinstall of OpenSolaris 2009.06 onto 2 brand new 30GB SSDs? I have the option of the fresh install, as I haven't invested much time in configuring this OS2009.06 boot environment yet. Depends on how you define easy. Due to the smaller new drive, you can't use zpool replace. Some people will find this easy enough : http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#ZFS_Root_Pool_Recovery Others might find it's too complicated, and opt for reinstall plus zpool attach SECOND QUESTION: I also want the possibility to have multiple boot environments within OpenSolaris 2009.06 to allow easy rollback to a working boot environment in case of an IPS update problem. I presume this will not cause any additional complications? Correct. THIRD QUESTION: This is for a home fileserver so I don't want to spend too much, but does anyone see any problem with having the OS installed on MLC SSDs, which are cheaper than SLC SSDs. I'm thinking here specifically about wearing out the SSD if the OS does too many writes to the SSDs. zfs is SSD-friendly due to it's copy-on-write nature. Having a mirror also provide additional level of protection. -- Fajar ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
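If you go the reinstall-plus-attach route, a rough sketch of mirroring the new rpool afterwards (device names are hypothetical; on SPARC you would use installboot instead of installgrub):

    # after installing OpenSolaris 2009.06 onto the first SSD (c1t0d0s0):
    zpool attach -f rpool c1t0d0s0 c1t1d0s0
    # wait for the resilver to complete (zpool status rpool), then make the second SSD bootable:
    installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t1d0s0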