Re: [zfs-discuss] rpool on ssd. endurance question.
If anybody has used an SSD for rpool for more than half a year, can you post SMART information about the HostWrites attribute? I want to see how SSDs wear when used as system disks. I'd be happy to; exactly what commands shall I run? Hm. I'm experimenting with OpenSolaris in a virtual machine now, so unfortunately I can't give you an exact how-to. But I think it is possible to compile smartmontools http://smartmontools.sourceforge.net and get the SMART attributes with something like: smartctl -a /dev/rdsk/c0t0d0s0 You need to install the SUNWgcc package to compile. Take a look at http://opensolaris.org/jive/thread.jspa?threadID=120402 and http://opensolaris.org/jive/thread.jspa?threadID=124372 . ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
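A rough sketch of that build-and-run sequence, assuming an OpenSolaris box, an already unpacked smartmontools 5.39.1 source tarball, and that the rpool SSD really is at c0t0d0 (all three are assumptions; adjust to your system):

# pkg install SUNWgcc                 # on Solaris 10, add the gcc package with pkgadd instead
$ cd smartmontools-5.39.1             # unpacked source tree
$ ./configure && gmake
# ./smartctl -a /dev/rdsk/c0t0d0s0    # look for a host-writes style attribute in the output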
[zfs-discuss] MPT issues strikes back
Hi all, Yet another story regarding mpt issues, and to make a long story short: every time a Dell R710 running snv_134 logs the message scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,3...@4/pci1028,1...@0 (mpt0): , the system freezes and only a hard reset fixes the issue. Is there any sort of parameter that can be used to minimize/avoid this issue? Machine specs: Dell R710, 16 GB memory, 2 Intel Quad-Core E5506, SunOS san01 5.11 snv_134 i86pc i386 i86pc Solaris, Dell Integrated SAS 6/i Controller (mpt0 firmware version v0.25.47.0 (IR)) with 2 disks attached without RAID. Thanks in advance, Bruno -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
- Daniel Carosone d...@geek.com.au skrev: On Mon, Apr 26, 2010 at 10:02:42AM -0700, Chris Du wrote:
SAS: full duplex. SATA: half duplex.
SAS: dual port. SATA: single port (some enterprise SATA has dual port).
SAS: 2 active channels - 2 concurrent writes, or 2 reads, or 1 write and 1 read. SATA: 1 active channel - 1 read or 1 write.
SAS: full error detection and recovery on both read and write. SATA: error detection and recovery on write, only error detection on read.
SAS: full SCSI TCQ. SATA: lame ATA NCQ.
What's so lame about NCQ?
roy
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS deduplication ratio on Server 2008 backup VHD files
The problem is that the Windows Server backup seems to choose dynamic VHD (which would make sense in most cases) and I don't know if there is a way to change that. Using iSCSI volumes won't help in my case since the servers are running on physical hardware. Am 27.04.2010 01:54, schrieb Brandon High: On Mon, Apr 26, 2010 at 8:51 AM, tim Kries tim.kr...@gmx.de wrote: I am kinda confused over the change of dedup ratio from changing the record size, since it should dedup 256-bit blocks. Dedup works on blocks of either recordsize or volblocksize. The checksum is made per block written, and those checksums are used to dedup the data. With a recordsize of 128k, two blocks with a one-byte difference would not dedup. With an 8k recordsize, 15 out of 16 blocks would dedup. Repeat over the entire VHD. Setting the record size equal to a multiple of the VHD's internal block size and ensuring that the internal filesystem is block aligned will probably help to improve dedup ratios. So for an NTFS guest with 4k blocks, use a 4k, 8k or 16k record size and ensure that when you install in the VHD, its partitions are block aligned for the recordsize you're using. VHD supports fixed-size and dynamic-size images. If you're using a fixed image, the space is pre-allocated. This doesn't mean you'll waste unused space on ZFS with compression, since all those zeros will take up almost no space. Your VHD file should remain block-aligned however. I'm not sure that a dynamic-size image will block align if there is empty space. Using compress=zle will only compress the zeros with almost no CPU penalty. Using a COMSTAR iSCSI volume is probably an even better idea, since you won't have the POSIX layer in the path, and you won't have the VHD file header throwing off your block alignment. -B ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
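A sketch of the settings Brandon describes, using a made-up filesystem tank/vhd for the backup images; note that recordsize only applies to files written after it is set, so existing VHDs would need to be copied in again:

# zfs create -o recordsize=8k -o compression=zle -o dedup=on tank/vhd
# zfs get recordsize,compression,dedup tank/vhd

For the COMSTAR variant he mentions, the equivalent knob on a zvol is volblocksize, which can only be set at creation time:

# zfs create -V 200G -o volblocksize=8k tank/vhdvol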
Re: [zfs-discuss] Making ZFS better: zfshistory
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey If something like this already exists, please let me know. Otherwise, I plan to: Create zfshistory command, written in python. (open source, public, free) So, I decided to rename this zhist and started a project on google code. I'm not very far along yet, except, based on all the discussion in this thread, have a very good idea how it should all be implemented. Particular thanks to Richard Elling, whose in-depth discussion of path renames and moves made me think a lot about implementation, and settle on inode tracking. If anyone would like to contribute, please let me know off-list. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS deduplication ratio on Server 2008 backup VHD files
- Tim.Kreis tim.kr...@gmx.de skrev: The problem is that the windows server backup seems to choose dynamic vhd (which would make sense in most cases) and I don't know if there is a way to change that. Using iSCSI volumes won't help in my case since the servers are running on physical hardware. It should work well anyway, if you (a) fill up the server with memory and (b) reduce the block size to 8k or even less. But do (a) before (b). Dedup is very memory hungry. roy ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Mapping inode numbers to file names
Let's suppose you rename a file or directory. /tank/widgets/a/rel2049_773.13-4/somefile.txt Becomes /tank/widgets/b/foogoo_release_1.9/README Let's suppose you are now working on widget B, and you want to look at the past zfs snapshot of README, but you don't remember where it came from. That is, you don't know the previous name or location where that file used to be. One way you could do it would be: Look up the inode number of README. (for example, ls -i README) (suppose it's inode 12345) find /tank/.zfs/snapshot -inum 12345 Problem is, the find command will run for a long time. Is there any faster way to find the file name(s) when all you know is the inode number? (Actually, all you know is all the info that's in the present directory, which is not limited to inode number; but, inode number is the only information that I personally know could be useful.) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
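A small sketch of the brute-force search described above, walking one snapshot at a time so it can stop at the first hit; README and the tank paths are just the example names from the post:

ino=$(ls -i README | awk '{print $1}')
for snap in /tank/.zfs/snapshot/*; do
    hit=$(find "$snap" -xdev -inum "$ino" 2>/dev/null)
    [ -n "$hit" ] && { echo "$hit"; break; }   # stop at the first snapshot that still holds the file
done

As far as I know it is still a linear search; there is no faster reverse index short of poking at the dataset with zdb.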
Re: [zfs-discuss] MPT issues strikes back
Bruno Sousa on Tue, Apr 27, 2010 at 09:16:08AM +0200 wrote: Hi all, Yet another story regarding mpt issues, and to make a long story short: every time a Dell R710 running snv_134 logs the message scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,3...@4/pci1028,1...@0 (mpt0): , the system freezes and only a hard reset fixes the issue. Is there any sort of parameter to be used to minimize/avoid this issue? We had the same problem on an X4600; it turned out to be a bad SSD and/or connection at the location listed in the error message. Since removing that drive, we have not encountered that issue. You might want to look at http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6894775 too. -Mark Machine specs: Dell R710, 16 GB memory, 2 Intel Quad-Core E5506, SunOS san01 5.11 snv_134 i86pc i386 i86pc Solaris, Dell Integrated SAS 6/i Controller (mpt0 firmware version v0.25.47.0 (IR)) with 2 disks attached without RAID. Thanks in advance, Bruno -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Performance drop during scrub?
Hi all, I have a test system with snv_134 and 8x2TB drives in RAIDZ2 and currently no separate ZIL or L2ARC device. I noticed the I/O speed to NFS shares on the test pool drops to something hardly usable while scrubbing the pool. How can I address this? Will adding a ZIL or L2ARC device help? Is it possible to tune down scrub's priority somehow? Best regards roy -- Roy Sigurd Karlsbakk (+47) 97542685 r...@karlsbakk.net http://blogg.karlsbakk.net/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
On Mon, April 26, 2010 17:21, Edward Ned Harvey wrote: Also, if you've got all those disks in an array, and their MTBF is ... let's say 25,000 hours ... then 3 yrs later when they begin to fail, they have a tendency to all fail around the same time, which increases the probability of exceeding your designed level of redundancy.

It's useful to consider this when doing mid-life upgrades. Unfortunately there's not too much useful to be done right now with RAID setups. With mirrors, when adding some disks mid-life (it seems like a common, though by no means universal, scenario to not fully populate the chassis at first and add more 1/3 to 1/2 of the way through the projected life), with some extra trouble one can attach a new disk as an n+1st disk in an existing mirror, wait for the resilver, and detach an old disk. That mirror is now one new disk and one old disk, rather than two disks of the same age. Then build a new mirror out of the freed disk plus another new disk. Now you've got both mirrors consisting of disks of different ages, less prone to failing at the same time. (Of course this doesn't work when you're using bigger drives for the mid-life kicker, and most of the time it would make sense to do so.) Even buying different (mixed) brands initially doesn't help against aging; only against batch or design problems.

Hey, you know what might be helpful? Being able to add redundancy to a raid vdev. Being able to go from RAIDZ2 to RAIDZ3 by adding another drive of suitable size. Also being able to go the other way. This lets you do the trick of temporarily adding redundancy to a vdev while swapping out devices one at a time to eventually upgrade the size (since you're deliberately creating a fault situation, increasing redundancy before you do it makes loads of sense!).

I recently bought 2x 1 TB disks for my Sun server, for $650 each. This was enough to make me do the analysis: why am I buying Sun-branded overpriced disks? Here is the abridged version: No argument that, in the existing market, with various levels of need, this is often the right choice. I find it deeply frustrating and annoying that this dilemma exists entirely due to bad behavior by the disk companies, though. First they sell deliberately-defective drives (lie about cache flush, for example) and then they (in conspiracy with an accomplice company) charge us many times the cost of the physical hardware for fixed versions. This MUST be stopped. This is EXACTLY what standards exist for -- so we can buy known-quantity products in a competitive market.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
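A sketch of that mirror-rotation trick in zpool terms, with hypothetical device names; redundancy never drops because the new disk is attached as an extra side of the mirror before an old one is detached:

# zpool attach tank c1t0d0 c3t0d0      # new disk becomes a third side of the existing mirror
# zpool status tank                    # wait for the resilver to finish
# zpool detach tank c1t1d0             # retire one of the original disks
# zpool add tank mirror c1t1d0 c3t1d0  # freed old disk plus a second new disk form the next vdev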
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
On Tue, 27 Apr 2010, David Dyer-Bennet wrote: Hey, you know what might be helpful? Being able to add redundancy to a raid vdev. Being able to go from RAIDZ2 to RAIDZ3 by adding another drive of suitable size. Also being able to go the other way. This lets you do the trick of temporarily adding redundancy to a vdev while swapping out devices one at a time to eventually upgrade the size (since you're deliberately creating a fault situation, increasing redundancy before you do it makes loads of sense!). You can already replace one drive with another (zpool replace) so as long as there is space for the new drive, it is not necessary to degrade the array and lose redundancy while replacing a device. As long as you can physically add a drive to the system (even temporarily) it is not necessary to deliberately create a fault situation. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
On Tue, April 27, 2010 10:38, Bob Friesenhahn wrote: On Tue, 27 Apr 2010, David Dyer-Bennet wrote: Hey, you know what might be helpful? Being able to add redundancy to a raid vdev. Being able to go from RAIDZ2 to RAIDZ3 by adding another drive of suitable size. Also being able to go the other way. This lets you do the trick of temporarily adding redundancy to a vdev while swapping out devices one at a time to eventually upgrade the size (since you're deliberately creating a fault situation, increasing redundancy before you do it makes loads of sense!). You can already replace one drive with another (zpool replace) so as long as there is space for the new drive, it is not necessary to degrade the array and lose redundancy while replacing a device. As long as you can physically add a drive to the system (even temporarily) it is not necessary to deliberately create a fault situation. I don't think I understand your scenario here. The docs online at http://docs.sun.com/app/docs/doc/819-5461/gazgd?a=view describe uses of zpool replace that DO run the array degraded for a while, and don't seem to mention any other. Could you be more detailed? -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance drop during scrub?
On Tue, 27 Apr 2010, Roy Sigurd Karlsbakk wrote: I have a test system with snv134 and 8x2TB drives in RAIDz2 and currently no Zil or L2ARC. I noticed the I/O speed to NFS shares on the testpool drops to something hardly usable while scrubbing the pool. How can I address this? Will adding Zil or L2ARC help? Is it possible to tune down scrub's priority somehow? Does the NFS performance problem seem to be mainly read performance, or write performance? If it is primarily a read performance issue, then adding lots more RAM and/or a L2ARC device should help since that would reduce the need to (re-)read the underlying disks during the scrub. Likewise, adding an intent log SSD would help with NFS write performance. Zfs scrub needs to access all written data on all disks and is usually disk-seek or disk I/O bound so it is difficult to keep it from hogging the disk resources. A pool based on mirror devices will behave much more nicely while being scrubbed than one based on RAIDz2. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
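For reference, adding the devices Bob mentions is a one-liner each; c3t0d0 and c3t1d0 below are hypothetical SSDs:

# zpool add testpool log c3t0d0      # separate intent log, helps synchronous NFS writes
# zpool add testpool cache c3t1d0    # L2ARC, lets repeated reads bypass the busy data disks
# zpool iostat -v testpool 10        # watch how scrub and NFS I/O share the spindles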
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
On Tue, 27 Apr 2010, David Dyer-Bennet wrote: I don't think I understand your scenario here. The docs online at http://docs.sun.com/app/docs/doc/819-5461/gazgd?a=view describe uses of zpool replace that DO run the array degraded for a while, and don't seem to mention any other. Could you be more detailed? If a disk has failed, then it makes sense to physically remove the old disk, insert a new one, and do 'zpool replace tank c1t1d0'. However if the disk has not failed, then you can install a new disk in another location and use the two argument form of replace like 'zpool replace tank c1t1d0 c1t1d7'. If I understand things correctly, this allows you to replace one good disk with another without risking the data in your pool. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] rpool on ssd. endurance question.
On 04/27/10 03:55 AM, Yuri Vorobyev wrote: If anybody uses SSD for rpool more than half-year, can you post SMART information about HostWrites attribute? I want to see how SSD wear for system disk purposes. I'd be happy to, exactly what commands shall I run? Hm. I'm experimenting with OpenSolaris in virtual machine now. Unfortunately I can't give you exactly how-to. But i think it is possible to compile Smartmontools http://smartmontools.sourceforge.net and get SMART attributes something like that: smartctl -a /dev/rdsk/c0t0d0s0 You need install SUNWgcc package to compile. Take a look at http://opensolaris.org/jive/thread.jspa?threadID=120402 and http://opensolaris.org/jive/thread.jspa?threadID=124372 .

I tried compiling the smartmontools, using OracleSolarisStudio U1. I can't get any usable data:

bash-4.0$ pfexec /Download_Files/SmartMonTools/smartmontools-5.39.1/smartctl -a /dev/rdsk/c8t1d0s0 -d ata
smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
### ATA command routine ata_command_interface() NOT IMPLEMENTED under Solaris.
Please contact smartmontools-supp...@lists.sourceforge.net if you want to help in porting smartmontools to Solaris.
### Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
bash-4.0$

Next I tried this command to get some info:

bash-4.0$ /usr/bin/iostat -En
c8t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: OCZ-VERTEX-EX Revision: 1.21 Serial No:
Size: 128.04GB 128035676160 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 10 Predictive Failure Analysis: 0
c8t1d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: OCZ-VERTEX Revision: 1.3 Serial No:
Size: 256.06GB 256060514304 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 6 Predictive Failure Analysis: 0
c8t2d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: INTEL SSDSA2MH16 Revision: 8820 Serial No:
Size: 160.04GB 160041885696 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 6 Predictive Failure Analysis: 0
c7t0d0 Soft Errors: 0 Hard Errors: 10 Transport Errors: 0
Vendor: MATSHITA Product: BD-MLT UJ-220S Revision: 1.01 Serial No:
Size: 0.00GB 0 bytes
Media Error: 0 Device Not Ready: 10 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
bash-4.0$

If you can come up with a way I can get you more info, post a response. Paul
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs destroy hangs machine if snapshot exists- workaround found
Hi - was there any progress on this issue? I'd be interested to know if any bugs were filed regarding it and whether there's a way to follow up on the progress. Cheers, Alasdair -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS version information changes (heads up)
Hi everyone, Please review the information below regarding access to ZFS version information. Let me know if you have questions. Thanks, Cindy CR 6898657: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6898657 ZFS commands zpool upgrade -v and zfs upgrade -v refer to URLs that are no longer redirected to the correct location after April 30, 2010. Description The opensolaris.org site has moved to hub.opensolaris.org and the opensolaris.org site is no longer redirected to the new site after April 30, 2010. The zpool upgrade and zfs upgrade commands in the Solaris 10 releases and the OpenSolaris release refer to opensolaris.org URLs that no longer exist. For example: # zpool upgrade -v . . . For more information on a particular version, including supported releases, see: http://www.opensolaris.org/os/community/zfs/version/N # zfs upgrade -v . . . For more information on a particular version, including supported releases, see: http://www.opensolaris.org/os/community/zfs/version/zpl/N Workaround Access either of the replacement URLs as follows. 1. For zpool upgrade -v, use this URL: http://hub.opensolaris.org/bin/view/Community+Group+zfs/N 2. For zfs upgrade -v, use this URL: http://hub.opensolaris.org/bin/view/Community+Group+zfs/N-1 Resolution CR 6898657 identifies the replacement hub.opensolaris.org URLs and the longer term fix, which is that the zfs upgrade and zpool upgrade commands will provide the following new text: For more information on a particular version, including supported releases, see the ZFS Administration Guide. The revised ZFS Administration Guide describes the ZFS version descriptions and the Solaris OS releases that provide the version and feature, starting on page 293, here: http://hub.opensolaris.org/bin/view/Community+Group+zfs/docs ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
On Tue, April 27, 2010 11:17, Bob Friesenhahn wrote: On Tue, 27 Apr 2010, David Dyer-Bennet wrote: I don't think I understand your scenario here. The docs online at http://docs.sun.com/app/docs/doc/819-5461/gazgd?a=view describe uses of zpool replace that DO run the array degraded for a while, and don't seem to mention any other. Could you be more detailed? If a disk has failed, then it makes sense to physically remove the old disk, insert a new one, and do 'zpool replace tank c1t1d0'. However if the disk has not failed, then you can install a new disk in another location and use the two argument form of replace like 'zpool replace tank c1t1d0 c1t1d7'. If I understand things correctly, this allows you to replace one good disk with another without risking the data in your pool. I don't see any reason to think the old device remains in use until the new device is resilvered, and if it doesn't, then you're down one level of redundancy the instant the old device goes out of service. I don't have a RAIDZ group, but trying this while there's significant load on the group, it should be easy to see if there's traffic on the old drive after the resilver starts. If there is, that would seem to be evidence that it's continuing to use the old drive while resilvering to the new one, which would be good. -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
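One way to run that observation, assuming the two-argument replace from earlier in the thread and a pool named tank:

# zpool replace tank c1t1d0 c1t1d7
# zpool iostat -v tank 5    # if c1t1d0 keeps showing read traffic while c1t1d7 resilvers,
                            # the old disk is evidently still serving the vdev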
Re: [zfs-discuss] Thoughts on drives for ZIL/L2ARC?
I've got an OCZ Vertex 30 GB drive with a 1 GB slice used for the slog and the rest used for the L2ARC, which for ~$100 has been a nice boost to NFS writes. What about the Intel X25-V? I know it will likely be fine for L2ARC, but what about ZIL/slog? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
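For anyone reproducing that layout, the split between slog and L2ARC on a single SSD looks roughly like this once the drive has been sliced with format(1M); the device and slice names are hypothetical:

# zpool add tank log c4t0d0s0      # small slice (e.g. 1 GB) as the separate intent log
# zpool add tank cache c4t0d0s1    # the rest of the drive as L2ARC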
Re: [zfs-discuss] Thoughts on drives for ZIL/L2ARC?
For the L2ARC you want iops, pure and simple. For this I think the Intel SSDs are still king. The slog however has a gotcha: you want iops, but you also want something that doesn't say it's done writing until the write is safely nonvolatile. The Intel drives fail in this regard. So far I'm thinking the best bet will likely be one of the SandForce SF-1500 based drives with the supercap on it, something like the Vertex 2 Pro. These are of course just my thoughts on the matter as I work towards designing a SQL storage backend. Your mileage may vary. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
On Tue, 27 Apr 2010, David Dyer-Bennet wrote: I don't have a RAIDZ group, but trying this while there's significant load on the group, it should be easy to see if there's traffic on the old drive after the resilver starts. If there is, that would seem to be evidence that it's continuing to use the old drive while resilvering to the new one, which would be good. If you have a pool on just a single drive and you use 'zpool replace foo bar' to move the pool data from drive 'foo' to drive 'bar', does it stop reading drive 'foo' immediately when the transfer starts? Please do me a favor and check this for me. Thanks, Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
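Bob's question can be tried without touching real disks by using file vdevs; a sketch, assuming a few hundred MB free under /var/tmp:

# mkfile 256m /var/tmp/foo /var/tmp/bar
# zpool create testrepl /var/tmp/foo
# cp <some large file> /testrepl/                     # seed the pool with data
# zpool replace testrepl /var/tmp/foo /var/tmp/bar
# zpool iostat -v testrepl 1                          # does /var/tmp/foo still show reads during the resilver?
# zpool destroy testrepl && rm /var/tmp/foo /var/tmp/bar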
Re: [zfs-discuss] rpool on ssd. endurance question.
On 04/27/10 03:24 PM, Miles Nordin wrote: http://opensolaris.org/jive/thread.jspa?messageID=473727 -- 'smartctl -d sat,12 ...' is the incantation to use on solaris for ahci

OK, getting somewhere. I have a total of 3 SSD's in my laptop. Laptop is a Clevo D901C.

*** Boot Disk
# ./smartctl -d sat,12 -i /dev/rdsk/c8t0d0s0
smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: OCZ Vertex SSD
Device Model: OCZ-VERTEX-EX
Serial Number: 8AF2FQS5Z5010V377CP5
Firmware Version: 1.21
User Capacity: 128,035,676,160 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Tue Apr 27 16:21:07 2010 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

# ./smartctl -d sat,12 -A /dev/rdsk/c8t0d0s0
smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x 010 000 000 Old_age Offline - 0
9 Power_On_Hours 0x --- --- --- Old_age Offline - 1860
12 Power_Cycle_Count 0x --- --- --- Old_age Offline - 516
184 Initial_Bad_Block_Count 0x --- --- --- Old_age Offline - 630
195 Program_Failure_Blk_Ct 0x --- --- --- Old_age Offline - 0
196 Erase_Failure_Blk_Ct 0x --- --- --- Old_age Offline - 0
197 Read_Failure_Blk_Ct 0x --- --- --- Old_age Offline - 0
198 Read_Sectors_Tot_Ct 0x --- --- --- Old_age Offline - 2641425802
199 Write_Sectors_Tot_Ct 0x --- --- --- Old_age Offline - 1860174045
200 Read_Commands_Tot_Ct 0x --- --- --- Old_age Offline - 19326129
201 Write_Commands_Tot_Ct 0x --- --- --- Old_age Offline - 26514568
202 Error_Bits_Flash_Tot_Ct 0x --- --- --- Old_age Offline - 1004
203 Corr_Read_Errors_Tot_Ct 0x --- --- --- Old_age Offline - 1004
204 Bad_Block_Full_Flag 0x --- --- --- Old_age Offline - 0
205 Max_PE_Count_Spec 0x --- --- --- Old_age Offline - 1
206 Min_Erase_Count 0x --- --- --- Old_age Offline - 1
207 Max_Erase_Count 0x --- --- --- Old_age Offline - 62
208 Average_Erase_Count 0x --- --- --- Old_age Offline - 10
209 Remaining_Lifetime_Perc 0x --- --- --- Old_age Offline - 100
210 Unknown_Attribute 0x 207 000 000 Old_age Offline - 0
211 Unknown_Attribute 0x 000 000 000 Old_age Offline - 0
#

*** 2nd Disk
./smartctl -d sat,12 -i /dev/rdsk/c8t1d0s0
smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: OCZ Vertex SSD
Device Model: OCZ-VERTEX
Serial Number: 407BM8W7WR6QSSNDTVRU
Firmware Version: 1.3
User Capacity: 256,060,514,304 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Tue Apr 27 16:10:12 2010 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

# ./smartctl -d sat,12 -A /dev/rdsk/c8t1d0s0
smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x 007 000 000 Old_age Offline - 0
9 Power_On_Hours 0x --- --- --- Old_age Offline - 1838
12 Power_Cycle_Count 0x --- --- --- Old_age Offline - 511
184 Initial_Bad_Block_Count 0x --- --- --- Old_age Offline - 39
195 Program_Failure_Blk_Ct 0x --- --- --- Old_age
Re: [zfs-discuss] HELP! zpool corrupted data
Cindy, Thanks for your help as it got me on the right track. The OpenSolaris Live CD wasn't reading the GUID/GPT partition tables properly, which was causing the Assertion failed errors. I relabeled the disks using the partition information I was able to get from the FreeBSD Live CD, and then was able to import/repair the zpool using the latest OpenSolaris Live CD. Thanks! -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Migrate ZFS volume to new pool
We would like to delete and recreate our existing zfs pool without losing any data. The way we thought we could do this was to attach a few HDDs and create a new temporary pool, migrate our existing zfs volume to the new pool, delete and recreate the old pool, and migrate the zfs volumes back. The big problem we have is we need to do all this live, without any downtime. We have 2 volumes taking up around 11TB and they are shared out to a couple of Windows servers with COMSTAR. Anyone have any good ideas? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
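If the pool is not mirrored (so the zpool split suggestion further down the thread does not apply), an incremental zfs send/receive keeps the outage down to one short final switchover; the pool and volume names below are made up:

# zfs snapshot oldpool/vol1@mig1
# zfs send oldpool/vol1@mig1 | zfs recv tmppool/vol1            # bulk copy while the LUs stay online
# zfs snapshot oldpool/vol1@mig2
# zfs send -i @mig1 oldpool/vol1@mig2 | zfs recv tmppool/vol1   # small catch-up increment

Then briefly take the COMSTAR LU offline, send one last increment, repoint the LU at the copy, and repeat in the other direction once the old pool has been rebuilt.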
Re: [zfs-discuss] ZFS version information changes (heads up)
On Tue, Apr 27, 2010 at 11:29:04AM -0600, Cindy Swearingen wrote: The revised ZFS Administration Guide describes the ZFS version descriptions and the Solaris OS releases that provide the version and feature, starting on page 293, here: http://hub.opensolaris.org/bin/view/Community+Group+zfs/docs It's not entirely clear how much of the text above you're quoting as the addition, but surely referring to a page number is even more volatile than a url? -- Dan. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Migrate ZFS volume to new pool
Hi Wolf, Which Solaris release is this? If it is an OpenSolaris system running a recent build, you might consider the zpool split feature, which splits a mirrored pool into two separate pools, while the original pool is online. If possible, attach the spare disks to create the mirrored pool as a first step. See the example below. Thanks, Cindy

You can attach the spare disks to the existing pool to create the mirrored pool:
# zpool attach tank disk-1 spare-disk-1
# zpool attach tank disk-2 spare-disk-2

Which gives you a pool like this:
# zpool status tank
pool: tank
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Tue Apr 27 14:36:28 2010
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
  mirror-0 ONLINE 0 0 0
    c2t9d0 ONLINE 0 0 0
    c2t5d0 ONLINE 0 0 0
  mirror-1 ONLINE 0 0 0
    c2t10d0 ONLINE 0 0 0
    c2t6d0 ONLINE 0 0 0 56.5K resilvered
errors: No known data errors

Then, split the mirrored pool, like this:
# zpool split tank tank2
# zpool import tank2
# zpool status tank tank2
pool: tank
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Tue Apr 27 14:36:28 2010
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
  c2t9d0 ONLINE 0 0 0
  c2t10d0 ONLINE 0 0 0
errors: No known data errors

pool: tank2
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
tank2 ONLINE 0 0 0
  c2t5d0 ONLINE 0 0 0
  c2t6d0 ONLINE 0 0 0
errors: No known data errors

On 04/27/10 15:06, Wolfraider wrote: We would like to delete and recreate our existing zfs pool without losing any data. The way we though we could do this was attach a few HDDs and create a new temporary pool, migrate our existing zfs volume to the new pool, delete and recreate the old pool and migrate the zfs volumes back. The big problem we have is we need to do all this live, without any downtime. We have 2 volumes taking up around 11TB and they are shared out to a couple windows servers with comstar. Anyone have any good ideas?
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
On Tue, Apr 27, 2010 at 10:36:37AM +0200, Roy Sigurd Karlsbakk wrote: - Daniel Carosone d...@geek.com.au skrev: SAS: Full SCSI TCQ SATA: Lame ATA NCQ What's so lame about NCQ? Primarily, the meager number of outstanding requests; write cache is needed to pretend the writes are done straight away and free up the slots for reads. If you want throughput, you want to hand the disk controller as many requests as possible, so it can optimise seek order. If you have especially latency-sensitive requests, you need to manage the queue carefully with either system. -- Dan. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS version information changes (heads up)
The OSOL ZFS Admin Guide PDF is pretty stable, even if the page number isn't, but I wanted to provide an interim solution. When this information is available on docs.sun.com very soon now, the URL will be stable. cs On 04/27/10 15:32, Daniel Carosone wrote: On Tue, Apr 27, 2010 at 11:29:04AM -0600, Cindy Swearingen wrote: The revised ZFS Administration Guide describes the ZFS version descriptions and the Solaris OS releases that provide the version and feature, starting on page 293, here: http://hub.opensolaris.org/bin/view/Community+Group+zfs/docs It's not entirely clear how much of the text above you're quoting as the addition, but surely referring to a page number is even more volatile than a url? -- Dan. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance drop during scrub?
On 04/28/10 03:17 AM, Roy Sigurd Karlsbakk wrote: Hi all I have a test system with snv134 and 8x2TB drives in RAIDz2 and currently no Zil or L2ARC. I noticed the I/O speed to NFS shares on the testpool drops to something hardly usable while scrubbing the pool. Is that small random or block I/O? I've found latency to be the killer rather than throughput, at least when receiving snapshots. In normal operation, receiving an empty snapshot is a sub-second operation. While resilvering, it can take up to 30 seconds. The write speed on bigger snapshots is still acceptable. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Migrate ZFS volume to new pool
It's unclear what you want to do. What's the goal for this exercise? If you want to replace the pool with larger disks and the pool is a mirror or raidz, you just replace one disk at a time and allow the pool to rebuild itself. Once all the disks have been replaced, it will automatically recognize the size increase and expand the pool. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance drop during scrub?
On 04/28/10 10:01 AM, Bob Friesenhahn wrote: On Wed, 28 Apr 2010, Ian Collins wrote: On 04/28/10 03:17 AM, Roy Sigurd Karlsbakk wrote: Hi all I have a test system with snv134 and 8x2TB drives in RAIDz2 and currently no Zil or L2ARC. I noticed the I/O speed to NFS shares on the testpool drops to something hardly usable while scrubbing the pool. Is that small random or block I/O? I've found latency to be the killer rather than throughput, at least when receiving snapshots. In normal operation, receiving an empty snapshot is a sub-second operation. While resilvering, it can take up to 30 seconds. The write speed on bigger snapshots is still acceptable. zfs scrub != zfs send Where did I say it did? I didn't even mention zfs send. My observation concerned poor performance (latency) during a scrub/resilver. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS vs SATA: Same size, same speed, why SAS?
On Tue, Apr 27, 2010 at 2:47 PM, Daniel Carosone d...@geek.com.au wrote: What's so lame about NCQ? Primarily, the meager number of outstanding requests; write cache is needed to pretend the writes are done straight away and free up the slots for reads. NCQ handles 32 outstanding operations. Considering that ZFS limits the outstanding requests to 10 (as of snv_125 I think?), that's not an issue. TCQ supports between 16 and 64 bits for the tags, depending on the implementation and underlying protocol. TCQ allows a command to be queued at the head of the queue, ordered, or simple. I don't believe that NCQ allows multiple queuing methods, and I can't be bothered to check. -B ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
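The ZFS-side queue depth Brandon refers to is the zfs_vdev_max_pending tunable; a hedged sketch of inspecting and changing it live with mdb, assuming the symbol exists on your build (check the read before you write):

# echo zfs_vdev_max_pending/D | mdb -k        # print the current per-vdev queue depth
# echo zfs_vdev_max_pending/W0t16 | mdb -kw   # set it to 16 (decimal) until the next reboot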
[zfs-discuss] Solaris 10 default caching segmap/vpm size
What's the default size of the file system cache for Solaris 10 x86, and can it be tuned? I read various posts on the subject and it's confusing. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Solaris 10 default caching segmap/vpm size
ZFS does not use segmap. The ZFS ARC (Adaptive Replacement Cache) will consume what's available, memory-wise, based on the workload. There's an upper limit if zfs_arc_max has not been set, but I forget what it is. If other memory consumers (applications, other kernel subsystems) need memory, ZFS will release memory being used by the ARC. But, if no one else wants it /jim On Apr 27, 2010, at 9:07 PM, Brad wrote: Whats the default size of the file system cache for Solaris 10 x86 and can it be tuned? I read various posts on the subject and its confusing.. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
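If you do want to cap the ARC, the usual recipe is a zfs_arc_max entry in /etc/system plus a reboot; the 4 GB value below is only an example, and arcstats shows what the ARC is actually doing:

* /etc/system (example: cap the ARC at 4 GB)
set zfs:zfs_arc_max = 0x100000000

# kstat -p zfs:0:arcstats:size zfs:0:arcstats:c_max   # current ARC size and its ceiling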
[zfs-discuss] Compellent announces zNAS
Today, Compellent announced their zNAS addition to their unified storage line. zNAS uses ZFS behind the scenes. http://www.compellent.com/Community/Blog/Posts/2010/4/Compellent-zNAS.aspx Congrats, Compellent! -- richard ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Zpool errors
I had problems with a UFS file system on a hardware RAID controller. It was spitting out errors like crazy, so I rsynced it to a ZFS volume on the same machine. There were a lot of read errors during the transfer and the RAID controller alarm was going off constantly. Rsync was copying the corrupted files to the ZFS volume, and running zpool status -v reported the full path names of the affected files. Sometimes only an inode number appears instead of a file path. Is there any way to figure out exactly what files were affected with these inodes? disk_old/some/path/to/a/file disk_old:0x41229e disk_old:0x4124bf disk_old:0x4126a4 disk_old:0x41276f -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
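The hex values after the dataset name are ZFS object numbers, and for plain files the object number matches the inode number, so one hedged way to chase them is below (files that have since been deleted simply will not be found); this assumes disk_old is mounted at /disk_old and the files live in its root dataset:

# find /disk_old -xdev -inum 4268702        # 0x41229e is 4268702 in decimal
# zdb -dddd disk_old 4268702                # should dump the object, including its path, if it still exists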
Re: [zfs-discuss] rpool on ssd. endurance question.
Hello. Is all this data what you're looking for? Yes, thank you, Paul. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Extremely slow raidz resilvering
Just in case any of you want to jump in, I created case #100426-001820 with WD to ask for a firmware update to the WD??EARS drives without any 512-byte emulation, just the 4K sectors directly exposed. The WD forum thread: http://community.wdc.com/t5/Desktop/Poor-performace-in-OpenSolaris-with-4K-sector-drive-WD10EARS-in/m-p/20947#M1263 is being checked by the dev department and would carry a lot more weight if more comments/signatures were added by you guys. Have a good night. Regards, Leandro. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss