Re: [zfs-discuss] ATA UDMA data parity error
For the archive, I swapped the mobo and all is good now... (I copied 100GB into the pool without a crash.)

One problem I had was that Solaris would hang whenever booting - even when all the AOC-SAT2-MV8 cards were pulled out. Turns out that switching the BIOS field "USB 2.0 Controller Mode" from HiSpeed to FullSpeed makes the difference - any ideas why?

Thanks,
Kent
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] problem with nfs share of zfs storage
Hello Francois,

Monday, January 21, 2008, 9:51:22 PM, you wrote:

FD> I have a need to stream video over nfs. The video is stored on zfs. Every 10
FD> minutes or so, the video will freeze, and then 1 minute later it resumes.
FD> This doesn't happen from an nfs mount on ufs. The zfs server is a 32-bit P4
FD> box with 512MB, running Nexenta in plain text mode, and nothing else, really.
FD> Tried playback from different OSes and the same is happening. The network has
FD> more than 10x the capacity that is required; no compression on zfs.
FD> Any idea what is going on? The cpu is not pegged on the server or the playback
FD> client. Not sure what to look for.

Try to do 'iostat -xnz 1' while you are streaming and catch the moment you experience the problem. Also run 'vmstat -p 1' at the same time and catch the same moment.

--
Best regards,
Robert                          mailto:[EMAIL PROTECTED]
                                http://milek.blogspot.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
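For reference, a rough way to follow that suggestion for a whole playback session (a sketch only; the log paths are illustrative):

    # Record disk and paging activity once per second, note the wall-clock
    # time of each freeze, then inspect the matching intervals in the logs.
    iostat -xnz 1 > /tmp/iostat.log &
    vmstat -p 1 > /tmp/vmstat.log &
    # ... stream the video until it freezes at least once ...
    kill %1 %2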
[zfs-discuss] Ditto blocks in S10U4 ?
bash-3.00# cat /etc/release
                        Solaris 10 8/07 s10x_u4wos_12b X86
           Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                            Assembled 16 August 2007
(with all the latest patches)

bash-3.00# zpool list
NAME     SIZE   USED   AVAIL   CAP   HEALTH   ALTROOT
zpool1   20.8T  5.44G  20.8T   0%    ONLINE   -

bash-3.00# zpool upgrade -v
This system is currently running ZFS version 4.

The following versions are supported:

VER  DESCRIPTION
---  --------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history

For more information on a particular version, including supported releases, see:
http://www.opensolaris.org/os/community/zfs/version/N
Where 'N' is the version number.

bash-3.00# zfs set copies=2 zpool1
cannot set property for 'zpool1': invalid property 'copies'

From http://www.opensolaris.org/os/community/zfs/version/2/ ...
"This version includes support for Ditto Blocks, or replicated metadata."

Can anybody shed any light on it?

Regards
przemol
--
http://przemol.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Ditto blocks in S10U4 ?
On 22 January, 2008 - [EMAIL PROTECTED] sent me these 1,6K bytes:

> bash-3.00# cat /etc/release
>                         Solaris 10 8/07 s10x_u4wos_12b X86
>            Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
>                         Use is subject to license terms.
>                             Assembled 16 August 2007
> (with all the latest patches)
>
> bash-3.00# zpool list
> NAME     SIZE   USED   AVAIL   CAP   HEALTH   ALTROOT
> zpool1   20.8T  5.44G  20.8T   0%    ONLINE   -
>
> bash-3.00# zpool upgrade -v
> This system is currently running ZFS version 4.
>
> The following versions are supported:
>
> VER  DESCRIPTION
> ---  --------------------------------------
>  1   Initial ZFS version
>  2   Ditto blocks (replicated metadata)
>  3   Hot spares and double parity RAID-Z
>  4   zpool history
>
> For more information on a particular version, including supported releases, see:
> http://www.opensolaris.org/os/community/zfs/version/N
> Where 'N' is the version number.
>
> bash-3.00# zfs set copies=2 zpool1
> cannot set property for 'zpool1': invalid property 'copies'
>
> From http://www.opensolaris.org/os/community/zfs/version/2/ ...
> "This version includes support for Ditto Blocks, or replicated metadata."
>
> Can anybody shed any light on it?

The 'copies' thing in zfs set is ditto blocks for data.. the one in ver 2 is for metadata only..

/Tomas
--
Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
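For reference, on a build whose zfs(1M) already exposes the data ditto-block property, the usage would look roughly like this (the dataset name is illustrative; as shown above, S10U4 does not accept the property):

    # Keep two copies of every data block in this dataset; metadata ditto
    # blocks are created automatically and need no property.
    zfs set copies=2 zpool1/data
    zfs get copies zpool1/data   # verify; affects newly written data only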
[zfs-discuss] ZFS vdev_cache
Hi All,

Is any DTrace script available to figure out the vdev_cache (or software track buffer) reads in kilobytes? The documentation says the default size of the read is 128k; however, the vdev_cache source code implementation says the default size is 64k.

Thanks
Manoj Nayak
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Sparc zfs root/boot status ?
Back in October/November 2007 when I asked about Sparc zfs boot and root capabilities, I got a reply indicating late December 2007 for a possible release. I was wondering what the status is right now - will this feature make it into build 79?

This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS vq_max_pending value ?
Hi All,

The ZFS documentation says ZFS schedules its I/O in such a way that it manages to saturate a single disk's bandwidth using enough concurrent 128K I/Os. The number of concurrent I/Os is decided by vq_max_pending; the default value for vq_max_pending is 35.

We have created a 4-disk raid-z group inside a ZFS pool on a Thumper. The ZFS record size is set to 128k. When we read/write a 128K record, it issues a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.

We need to saturate all three data disks' bandwidth in the raid-z group. Is it required to set the vq_max_pending value to 35*3=105?

Thanks
Manoj Nayak
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Updated ZFS Automatic Snapshot Service - version 0.10.
Hi all,

I've got a slightly updated version of the ZFS Automatic Snapshot SMF Service on my blog. This version contains a few bugfixes (many thanks to Reid Spencer and Breandan Dezendorf!) as well as a small new feature - by default we now avoid taking snapshots for any datasets that are on a pool that's currently being scrubbed or resilvered, to avoid running into 6343667.

More at: http://blogs.sun.com/timf/entry/zfs_automatic_snapshots_0_10

Is this service something that we'd like to put into OpenSolaris, or are there plans for something similar that achieves the same goal (and perhaps integrates more neatly with the rest of ZFS)? Otherwise, should I start filling in an ARC one-pager template, or is this sort of utility something that's better left to sysadmins to implement themselves, rather than baking it into the OS?

cheers,
tim
--
Tim Foster, Sun Microsystems Inc, Solaris Engineering Ops
http://blogs.sun.com/timf
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Swap on ZVOL safe to use?
Lori Alt wrote:
> The bug is being actively worked at this time (it just got a boost in
> urgency as a result of the issues it was causing for the zfs boot project).
> It is likely that there will be a fix soon (sooner than zfs boot will be
> available). In the meantime, I know of no workaround. Maybe someone else
> does.

Is the fix to make it safe to swap on a ZVOL, or is it the introduction of the raw (non-COW) volumes mentioned previously?

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sparc zfs root/boot status ?
Mauro Mozzarelli wrote:
> Back in October/November 2007 when I asked about Sparc zfs boot and root
> capabilities, I got a reply indicating late December 2007 for a possible
> release. I was wondering what the status is right now - will this feature
> make it into build 79?

No, build 79 has long since closed and SPARC ZFS Boot isn't in it.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS vdev_cache
Manoj Nayak writes:

> Hi All,
> Is any DTrace script available to figure out the vdev_cache (or software
> track buffer) reads in kilobytes? The documentation says the default size
> of the read is 128k; however, the vdev_cache source code implementation
> says the default size is 64k.

Which document? It's 64K when it applies. Nevada won't use the vdev_cache for data blocks anymore.

-r
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
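One possible DTrace sketch for measuring the actual vdev_cache read sizes; the fbt probe name and the zio_t field are assumptions taken from the OpenSolaris source of that era, so verify them against your build before relying on the numbers:

    # Histogram of I/O sizes passing through vdev_cache_read(), in bytes.
    # Needs CTF type info for args[0] (zio_t *); run as root, Ctrl-C to end.
    dtrace -n 'fbt::vdev_cache_read:entry
    {
            @["vdev_cache read size (bytes)"] = quantize(args[0]->io_size);
    }'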
Re: [zfs-discuss] Sparc zfs root/boot status ?
zfs boot on sparc will not be putback on its own. It will be putback with the rest of zfs boot support, sometime around build 86.

Lori

Mauro Mozzarelli wrote:
> Back in October/November 2007 when I asked about Sparc zfs boot and root
> capabilities, I got a reply indicating late December 2007 for a possible
> release. I was wondering what the status is right now - will this feature
> make it into build 79?
>
> This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Are there, or does it make any sense to try to find, RAID cards with battery backup that will ignore the ZFS commit commands when the battery is able to guarantee stable storage?

I don't know if they do this, but I've recently had good non-ZFS performance with the IBM ServeRAID 8k RAID controller that was in an xSeries server I was using. The 8k has 256MB of battery-backed cache. The server it was in only had 6 drive bays, and I'm not looking to have it do RAID5 for ZFS, but I just had the idea: hey, I wonder if I could set up the card with 5 (single drive) RAID 0 LUNs, and gain the advantage of the 256MB battery-backed cache, when I tell ZFS to do RAIDZ across them?

I know battery-backed cache, and the proper commit semantics, are generally found only on higher end RAID controllers and arrays (right?) But I'm wondering now if I couldn't get an 8 port SATA controller that would let me map each single drive as a RAID 0 LUN and use its cache to boost performance.

My primary use case is NFS-based storage for a farm of software build servers and developer desktops.

Anyone searched for this already? Anyone found any reasons why it wouldn't work already?

-Kyle
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
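If the controller will export each disk as its own single-drive RAID 0 LUN, the ZFS side of the experiment is just an ordinary raidz pool over those LUNs - a sketch, with made-up pool and device names:

    # Five single-drive RAID 0 LUNs exported by the controller, pooled as
    # raidz; ZFS provides the redundancy, the card only contributes its
    # battery-backed write cache.
    zpool create buildpool raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0
    zpool status buildpool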
Re: [zfs-discuss] ZFS vq_max_pending value ?
Manoj Nayak wrote:
> Hi All,
>
> The ZFS documentation says ZFS schedules its I/O in such a way that it
> manages to saturate a single disk's bandwidth using enough concurrent 128K
> I/Os. The number of concurrent I/Os is decided by vq_max_pending; the
> default value for vq_max_pending is 35.
>
> We have created a 4-disk raid-z group inside a ZFS pool on a Thumper. The
> ZFS record size is set to 128k. When we read/write a 128K record, it issues
> a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.

Yes, this is how it works for a read without errors. For a write, you should see 4 writes, each 128KBytes/3. Writes may also be coalesced, so you may see larger physical writes.

> We need to saturate all three data disks' bandwidth in the raid-z group. Is
> it required to set the vq_max_pending value to 35*3=105?

No. vq_max_pending applies to each vdev. Use iostat to see what the device load is.

For the commonly used Hitachi 500 GByte disks in a Thumper, the read media bandwidth is 31-64.8 MBytes/s. Writes will be about 80% of reads, or 24.8-51.8 MBytes/s. In a Thumper, the disk bandwidth will be the limiting factor for the hardware.

-- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
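A way to sanity-check this before touching anything (a sketch; the tunable name is the one the zfs module exposed on builds of that era, so confirm it on your kernel):

    # Watch the per-device queue: actv is the number of outstanding I/Os,
    # %b the busy time.  If actv sits well below 35 while %b is pegged,
    # the disks - not the queue limit - are the bottleneck.
    iostat -xnz 1

    # Only if the queue depth really is the limit, the per-vdev default can
    # be raised via /etc/system (takes effect after a reboot):
    #   set zfs:zfs_vdev_max_pending = 70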
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
> My primary use case is NFS-based storage for a farm of software build
> servers and developer desktops.

For the above environment, you'll probably see a noticeable improvement with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive cards exist for the common consumer (with ECC memory anyways). If you convince http://www.micromemory.com/ to sell you one, let us know :)

Set 'set zfs:zil_disable = 1' in /etc/system to gauge the type of improvement you can expect. Don't use this in production though.

--
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
> My primary use case is NFS-based storage for a farm of software build
> servers and developer desktops.
>
> For the above environment, you'll probably see a noticeable improvement
> with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive cards
> exist for the common consumer (with ECC memory anyways). If you convince
> http://www.micromemory.com/ to sell you one, let us know :)

I know, but for that card you need a driver to make it appear as a device. Plus it would take a PCI slot. I was hoping to make use of the battery-backed RAM on a RAID card that I already have (but can't use, since I want to let ZFS do the redundancy.)

If I had a card with battery-backed RAM, how would I go about testing the commit semantics to see if it is only obeying ZFS commits when the battery is bad? Does anyone know if the IBM ServeRAID 7k or 8k do this correctly? If not, any chance of getting IBM to 'fix' the firmware? The Solaris Redbooks I've read seem to think highly of ZFS.

Back on the subject of NVRAM for ZIL devices: what are people using for ZIL devices on the budget-limited side of things? I've found some SATA flash drives, and a bunch that are IDE. Unfortunately the HW I'd like to stick this in is a little older... it's got a U320 SCSI controller in it. Has anyone found a good U320 flash disk that's not overkill size-wise, and not outrageously expensive? Google found what appear to be a few OEM vendors, but no resellers for the quantities I'd be interested in.

Anyone using a USB flash drive? Is USB fast enough to gain any benefits?

-Kyle

> Set 'set zfs:zil_disable = 1' in /etc/system to gauge the type of
> improvement you can expect. Don't use this in production though.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sparc zfs root/boot status ?
> zfs boot on sparc will not be putback on its own. It will be putback with
> the rest of zfs boot support, sometime around build 86.

Since we already have ZFS boot on x86, what else will be added in addition to ZFS boot for SPARC?

Thanks
Andrew.

This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zpool attach problem
On a V240 running s10u4 (no additional patches), I had a pool which looked like this:

# zpool status
  pool: pool01
 state: ONLINE
 scrub: none requested
config:

        NAME                               STATE     READ WRITE CKSUM
        pool01                             ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c8t600C0FF0082668310F838000d0  ONLINE       0     0     0
            c8t600C0FF007E4BE4C38F4ED00d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c8t600C0FF008266812A0877700d0  ONLINE       0     0     0
            c8t600C0FF007E4BE2BEDBC9600d0  ONLINE       0     0     0

errors: No known data errors

Since this system is not in production yet, I wanted to do a little disk juggling as follows:

# zpool detach pool01 c8t600C0FF007E4BE4C38F4ED00d0
# zpool detach pool01 c8t600C0FF007E4BE2BEDBC9600d0

New pool status:

# zpool status
  pool: pool01
 state: ONLINE
 scrub: none requested
config:

        NAME                               STATE     READ WRITE CKSUM
        pool01                             ONLINE       0     0     0
          c8t600C0FF0082668310F838000d0    ONLINE       0     0     0
          c8t600C0FF008266812A0877700d0    ONLINE       0     0     0

errors: No known data errors

Finally, I wanted to re-establish mirrors, but am seeing the following errors:

# zpool attach pool01 c8t600C0FF008266812A0877700d0 c8t600C0FF007E4BE4C38F4ED00d0
cannot attach c8t600C0FF007E4BE4C38F4ED00d0 to c8t600C0FF008266812A0877700d0: device is too small
# zpool attach pool01 c8t600C0FF0082668310F838000d0 c8t600C0FF007E4BE2BEDBC9600d0
cannot attach c8t600C0FF007E4BE2BEDBC9600d0 to c8t600C0FF0082668310F838000d0: device is too small

Is this expected behavior? The 'zpool' man page says:

    If device is not currently part of a mirrored configuration, device
    automatically transforms into a two-way mirror of device and new_device.

But this isn't what I'm seeing . . . did I do something wrong? Here's the format output for the disks:

       4. c8t600C0FF007E4BE2BEDBC9600d0 <SUN-StorEdge 3510-421F-545.91GB>
          /scsi_vhci/[EMAIL PROTECTED]
       5. c8t600C0FF007E4BE4C38F4ED00d0 <SUN-StorEdge 3510-421F-545.91GB>
          /scsi_vhci/[EMAIL PROTECTED]
       6. c8t600C0FF008266812A0877700d0 <SUN-StorEdge 3510-421F-545.91GB>
          /scsi_vhci/[EMAIL PROTECTED]
       7. c8t600C0FF0082668310F838000d0 <SUN-StorEdge 3510-421F-545.91GB>
          /scsi_vhci/[EMAIL PROTECTED]

Rob

This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
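A quick way to check whether the 'device is too small' complaint is real (a sketch that only compares what the OS reports for each LUN; device names are taken from the post above):

    # The device being attached must be at least as large as the one
    # already in the pool - compare the reported capacities.
    iostat -En c8t600C0FF0082668310F838000d0 c8t600C0FF007E4BE2BEDBC9600d0 | grep -i size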
Re: [zfs-discuss] ZFS vq_max_pending value ?
> Manoj Nayak wrote:
> Hi All,
>
> The ZFS documentation says ZFS schedules its I/O in such a way that it
> manages to saturate a single disk's bandwidth using enough concurrent 128K
> I/Os. The number of concurrent I/Os is decided by vq_max_pending; the
> default value for vq_max_pending is 35.
>
> We have created a 4-disk raid-z group inside a ZFS pool on a Thumper. The
> ZFS record size is set to 128k. When we read/write a 128K record, it issues
> a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.
>
> Yes, this is how it works for a read without errors. For a write, you
> should see 4 writes, each 128KBytes/3. Writes may also be coalesced, so you
> may see larger physical writes.
>
> We need to saturate all three data disks' bandwidth in the raid-z group. Is
> it required to set the vq_max_pending value to 35*3=105?
>
> No. vq_max_pending applies to each vdev.

The 4-disk raidz group issues a 128k/3 = 42.6k I/O to each individual data disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev), then 35*3 = 105 concurrent 42k I/Os will be required to saturate the same disk.

Thanks
Manoj Nayak

> Use iostat to see what the device load is. For the commonly used Hitachi
> 500 GByte disks in a Thumper, the read media bandwidth is 31-64.8 MBytes/s.
> Writes will be about 80% of reads, or 24.8-51.8 MBytes/s. In a Thumper, the
> disk bandwidth will be the limiting factor for the hardware.
>
> -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS vdev_cache
> Manoj Nayak writes:
> Hi All,
> Is any DTrace script available to figure out the vdev_cache (or software
> track buffer) reads in kilobytes? The documentation says the default size
> of the read is 128k; however, the vdev_cache source code implementation
> says the default size is 64k.
>
> Which document? It's 64K when it applies. Nevada won't use the vdev_cache
> for data blocks anymore.

How is readahead (the software track buffer) going to be used in Nevada without the vdev_cache? Any pointers to documents regarding that?

Thanks
Manoj Nayak

> -r
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS vq_max_pending value ?
manoj nayak wrote:
> Manoj Nayak wrote:
> Hi All,
>
> The ZFS documentation says ZFS schedules its I/O in such a way that it
> manages to saturate a single disk's bandwidth using enough concurrent 128K
> I/Os. The number of concurrent I/Os is decided by vq_max_pending; the
> default value for vq_max_pending is 35.
>
> We have created a 4-disk raid-z group inside a ZFS pool on a Thumper. The
> ZFS record size is set to 128k. When we read/write a 128K record, it issues
> a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.
>
> Yes, this is how it works for a read without errors. For a write, you
> should see 4 writes, each 128KBytes/3. Writes may also be coalesced, so you
> may see larger physical writes.
>
> We need to saturate all three data disks' bandwidth in the raid-z group. Is
> it required to set the vq_max_pending value to 35*3=105?
>
> No. vq_max_pending applies to each vdev.
>
> The 4-disk raidz group issues a 128k/3 = 42.6k I/O to each individual data
> disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev), then
> 35*3 = 105 concurrent 42k I/Os will be required to saturate the same disk.

ZFS doesn't know anything about disk saturation. It will send up to vq_max_pending I/O requests per vdev (usually a vdev is a disk). It will try to keep vq_max_pending I/O requests queued to the vdev.

For writes, you should see them become coalesced, so rather than sending 3 42.6kByte write requests to a vdev, you might see one 128kByte write request. In other words, ZFS has an I/O scheduler which is responsible for sending I/O requests to vdevs.

-- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sparc zfs root/boot status ?
On Jan 22, 2008, at 18:24, Lori Alt wrote:
> ZFS boot supported by the installation software, plus support for having
> swap and dump be zvols within the root pool (i.e., no longer requiring a
> separate swap/dump slice), plus various other features, such as support
> for failsafe-archive booting.

Will there be any support for tying into patching / Live Upgrade with the ZFS boot putback, or is that a separate project?

Thanks for any info.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Carson Gaspar wrote:
> Kyle McDonald wrote:
> ...
> I know, but for that card you need a driver to make it appear as a device.
> Plus it would take a PCI slot. I was hoping to make use of the
> battery-backed RAM on a RAID card that I already have (but can't use, since
> I want to let ZFS do the redundancy.) If I had a card with battery-backed
> RAM, how would I go about testing the commit semantics to see if it is only
> obeying ZFS commits when the battery is bad?
>
> Any _sane_ controller that supports battery backed cache will disable its
> write cache if its battery goes bad. It should also log this. I'd check the
> docs or contact your vendor's tech support to verify the card you have is
> sane, and if it reports the error to its monitoring tools so you find out
> about it quickly.

You're right. I forgot that. Not only would the commits need to happen right away, but the cache should be disabled completely. Now that you mention it, I know from experience that for the ServeRAID 7k/8k controllers the cache is disabled if/when the battery fails. Good point.

Now I just need to determine if a) the cache is used by the card even when using the disks on it as JBOD, or b) the card will allow me to make 5 or 6 RAID 0 LUNs with only 1 disk in each, to simulate (a) and activate the write cache.

Anyone know the answer to this? I'll be ordering 2 of the 7k's for my x346's this week. If neither A nor B will work, I'm not sure there's any advantage to using the 7k card, considering I want ZFS to do the mirroring.

If this all does work, it should speed up all the writes to the disk, including the ZIL writes. Is there still an advantage to investigating a solid state disk or flash drive device to relocate the ZIL to?

> Now you'll probably _still_ need to disable the ZFS cache flushes, which is
> a global option, so you'd need to make sure that _all_ your ZFS devices had
> battery backed write caches or no write caches at all.

I guess this is a better solution than chasing down firmware authors to get them to ignore flush requests. It's just too bad it's not settable on a pool-by-pool basis rather than server by server. Won't affect me though; this will be the only pool on this machine.

-Kyle
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
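For what it's worth, the global switch being discussed is a kernel tunable rather than a per-pool property. A sketch of what enabling it would look like - the tunable name is from builds of that era (older builds used zil_noflush), and it is only safe when every device behind every pool on the host has non-volatile write cache:

    # Stop ZFS from sending SYNCHRONIZE CACHE commands; system-wide,
    # requires a reboot to take effect.
    echo 'set zfs:zfs_nocacheflush = 1' >> /etc/system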
Re: [zfs-discuss] ZFS vq_max_pending value ?
manoj nayak wrote:
> ----- Original Message -----
> From: Richard Elling [EMAIL PROTECTED]
> To: manoj nayak [EMAIL PROTECTED]
> Cc: zfs-discuss@opensolaris.org
> Sent: Wednesday, January 23, 2008 7:20 AM
> Subject: Re: [zfs-discuss] ZFS vq_max_pending value ?
>
> manoj nayak wrote:
> Manoj Nayak wrote:
> Hi All,
>
> The ZFS documentation says ZFS schedules its I/O in such a way that it
> manages to saturate a single disk's bandwidth using enough concurrent 128K
> I/Os. The number of concurrent I/Os is decided by vq_max_pending; the
> default value for vq_max_pending is 35.
>
> We have created a 4-disk raid-z group inside a ZFS pool on a Thumper. The
> ZFS record size is set to 128k. When we read/write a 128K record, it issues
> a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.
>
> Yes, this is how it works for a read without errors. For a write, you
> should see 4 writes, each 128KBytes/3. Writes may also be coalesced, so you
> may see larger physical writes.
>
> We need to saturate all three data disks' bandwidth in the raid-z group. Is
> it required to set the vq_max_pending value to 35*3=105?
>
> No. vq_max_pending applies to each vdev.
>
> The 4-disk raidz group issues a 128k/3 = 42.6k I/O to each individual data
> disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev), then
> 35*3 = 105 concurrent 42k I/Os will be required to saturate the same disk.
>
> ZFS doesn't know anything about disk saturation. It will send up to
> vq_max_pending I/O requests per vdev (usually a vdev is a disk). It will
> try to keep vq_max_pending I/O requests queued to the vdev.
>
> I can see the avg pending I/Os hitting my vq_max_pending limit, so raising
> the limit would be a good thing. I think it's due to the many 42k read I/Os
> to each individual disk in the 4-disk raidz group.

You're dealing with a queue here. iostat's average pending I/Os represents the queue depth. Some devices can't handle a large queue. In any case, queuing theory applies.

Note that for reads, the disk will likely have a track cache, so it is not a good assumption that a read I/O will require a media access.

-- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
> Anyone know the answer to this? I'll be ordering 2 of the 7k's for my
> x346's this week. If neither A nor B will work, I'm not sure there's any
> advantage to using the 7k card, considering I want ZFS to do the mirroring.
>
> Why even bother with a H/W RAID array when you won't use the H/W RAID?
> Better to find a decent SAS/FC JBOD with cache. Would definitely be
> cheaper.

I've never heard of such a thing. Do you have any links (cheap or not)? Do they exist for less than $350? That's what the 7k will run me. Do they include an enclosure for at least 6 disks? The 7k will use the 6 U320 hot-swap bays already in my IBM x346 chassis.

I'm not being sarcastic; if something better exists, even for a little more, I'm interested. I'd especially love to switch to SATA, as I'm about to pay about $550 each for 300GB U320 drives, and with SATA I could go bigger, save money, or both. :)

-Kyle
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS vq_max_pending value ?
> The 4-disk raidz group issues a 128k/3 = 42.6k I/O to each individual data
> disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev), then
> 35*3 = 105 concurrent 42k I/Os will be required to saturate the same disk.
>
> ZFS doesn't know anything about disk saturation. It will send up to
> vq_max_pending I/O requests per vdev (usually a vdev is a disk). It will
> try to keep vq_max_pending I/O requests queued to the vdev.
>
> I can see the avg pending I/Os hitting my vq_max_pending limit, so raising
> the limit would be a good thing. I think it's due to the many 42k read I/Os
> to each individual disk in the 4-disk raidz group.
>
> You're dealing with a queue here. iostat's average pending I/Os represents
> the queue depth. Some devices can't handle a large queue. In any case,
> queuing theory applies.
>
> Note that for reads, the disk will likely have a track cache, so it is not
> a good assumption that a read I/O will require a media access.

My workload issues around 5000 MB of read I/O. iopattern says around 55% of the I/Os are random in nature. I don't know how much prefetching through the track cache is going to help here. Probably I can try disabling the vdev_cache by setting 'zfs_vdev_cache_max' to 1.

Thanks
Manoj Nayak
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
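A sketch of that experiment, with the tunable name as it appears in the OpenSolaris source of the time; reads larger than zfs_vdev_cache_max bypass the vdev cache, so shrinking the limit to 1 byte effectively disables the device-level prefetch:

    # Persistent - add to /etc/system and reboot:
    #   set zfs:zfs_vdev_cache_max = 1

    # Or flip it live on a test box with mdb (not persistent across reboots):
    echo 'zfs_vdev_cache_max/W 0t1' | mdb -kw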