Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?
On Tue, Feb 1, 2011 at 11:34 PM, Richard Elling wrote:
> There is a failure going on here. It could be a cable or it could be a bad
> disk or firmware. The actual fault might not be in the disk reporting the
> errors (!)
> It is not a media error.

Errors were as follows:

Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered 0x269213b01d700401
Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered 0x269213b01d700401
Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered 0x269213b01d700401
Feb 01 19:33:04.9969 ereport.io.scsi.cmd.disk.tran      0x269f99ef0b300401
Feb 01 19:33:04.9970 ereport.io.scsi.cmd.disk.tran      0x269f9a165a400401

Verbose output of one message:

Feb 01 2011 19:33:04.996932283 ereport.io.scsi.cmd.disk.tran
nvlist version: 0
        class = ereport.io.scsi.cmd.disk.tran
        ena = 0x269f99ef0b300401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0
        (end detector)
        devid = id1,sd@n5000c50010ed6a31
        driver-assessment = fail
        op-code = 0x0
        cdb = 0x0 0x0 0x0 0x0 0x0 0x0
        pkt-reason = 0x18
        pkt-state = 0x1
        pkt-stats = 0x0
        __ttl = 0x1
        __tod = 0x4d48a640 0x3b6bfabb

It was a cable error, but why didn't fault management tell me about it?

What do you mean by "The actual fault might not be in the disk reporting the
errors (!) It is not a media error."? Could the fault be coming from my SATA
controller or something like that?
Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?
On Feb 1, 2011, at 6:49 PM, Krunal Desai wrote:
>> The output of fmdump is explicit. I am interested to know if you saw
>> aborts and timeouts or some other errors.
>
> I have the machine off atm while I install new disks (18x ST32000542AS), but
> IIRC they appeared as transport errors (scsi..transport, I can paste the
> exact errors in a little bit). A slew of transfer/soft errors followed by the
> drive disappearing. I assume that my HBA took it offline, and the mpt driver
> reported that to the OS as an admin disconnecting, not as a "failure" per se.

There is a failure going on here. It could be a cable or it could be a bad
disk or firmware. The actual fault might not be in the disk reporting the
errors (!) It is not a media error.

>> The open-source version of smartmontools seems to be slightly out
>> of date and somewhat finicky. Does anyone know of a better SMART
>> implementation?
>
> That SUNWhd I mentioned seemed interesting, but I assume licensing means I
> can only get it if I purchase Sun hardware.
>
>> Nice idea, except that the X4500 was EOL years ago and the replacement,
>> X4540, uses LSI HBAs. I think you will find better Solaris support for the LSI
>> chipsets because Oracle's Sun products use them from the top (M9000) all
>> the way down the product line.
>
> Oops, forgot that the X4500s are actually kind of "old". I'll have to look up
> what LSI controllers the newer models are using (the LSI 2xx8 something IIRC?
> Will have to Google).

No, they aren't that new. The LSI 2008 are 6 Gbps HBAs and the older 1064/1068
series are 3 Gbps.
 -- richard
Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?
> The output of fmdump is explicit. I am interested to know if you saw
> aborts and timeouts or some other errors.

I have the machine off atm while I install new disks (18x ST32000542AS), but
IIRC they appeared as transport errors (scsi..transport, I can paste the exact
errors in a little bit). A slew of transfer/soft errors followed by the drive
disappearing. I assume that my HBA took it offline, and the mpt driver
reported that to the OS as an admin disconnecting, not as a "failure" per se.

> The open-source version of smartmontools seems to be slightly out
> of date and somewhat finicky. Does anyone know of a better SMART
> implementation?

That SUNWhd I mentioned seemed interesting, but I assume licensing means I can
only get it if I purchase Sun hardware.

> Nice idea, except that the X4500 was EOL years ago and the replacement,
> X4540, uses LSI HBAs. I think you will find better Solaris support for the LSI
> chipsets because Oracle's Sun products use them from the top (M9000) all
> the way down the product line.

Oops, forgot that the X4500s are actually kind of "old". I'll have to look up
what LSI controllers the newer models are using (the LSI 2xx8 something IIRC?
Will have to Google).

--khd
Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?
On Feb 1, 2011, at 5:52 PM, Krunal Desai wrote:
> On Tue, Feb 1, 2011 at 6:11 PM, Cindy Swearingen wrote:
>> I misspoke and should clarify:
>>
>> 1. fmdump identifies fault reports that explain system issues
>>
>> 2. fmdump -eV identifies errors or problem symptoms
>
> Gotcha; fmdump -eV gives me the information I need. It appears to have
> been a loose cable. I'm hitting the machine with some heavy I/O load,
> and the pool resilvered itself; the drive has not dropped out.

The output of fmdump is explicit. I am interested to know if you saw
aborts and timeouts or some other errors.

> SMART status was reported healthy as well (got smartctl kind of
> working), but I cannot read the SMART data of my disks behind the
> 1068E due to limitations of smartmontools, I guess (e.g. 'smartctl -d
> scsi -a /dev/rdsk/c10t0d0' gives me serial #, model, and just a
> generic 'SMART Ok'). I assume that SUNWhd is licensed only for use on
> the X4500 Thumper and family? I'd like to see if it works with the
> 1068E.

The open-source version of smartmontools seems to be slightly out
of date and somewhat finicky. Does anyone know of a better SMART
implementation?

> It's getting kind of tempting for me to investigate doing a run of
> boards that run Marvell 88SX6081s behind a PLX PCIe <-> PCI-X bridge.
> They should have beyond excellent support, seeing as that is what the
> X4500 uses to run its SATA ports.

Nice idea, except that the X4500 was EOL years ago and the replacement,
X4540, uses LSI HBAs. I think you will find better Solaris support for the LSI
chipsets because Oracle's Sun products use them from the top (M9000) all
the way down the product line.
 -- richard
Re: [zfs-discuss] multiple disk failure (solved?)
On Feb 1, 2011, at 5:56 AM, Mike Tancsa wrote:
> On 1/31/2011 4:19 PM, Mike Tancsa wrote:
>> On 1/31/2011 3:14 PM, Cindy Swearingen wrote:
>>> Hi Mike,
>>>
>>> Yes, this is looking much better.
>>>
>>> Some combination of removing corrupted files indicated in the zpool
>>> status -v output, running zpool scrub and then zpool clear should
>>> resolve the corruption, but it depends on how bad the corruption is.
>>>
>>> First, I would try the least destructive method: try to remove the
>>> files listed below by using the rm command.
>>>
>>> This entry probably means that the metadata is corrupted or some
>>> other file (like a temp file) no longer exists:
>>>
>>> tank1/argus-data:<0xc6>
>>
>> Hi Cindy,
>> I removed the files that were listed, and now I am left with
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>     tank1/argus-data:<0xc5>
>>     tank1/argus-data:<0xc6>
>>     tank1/argus-data:<0xc7>
>>
>> I have started a scrub
>> scrub: scrub in progress for 0h48m, 10.90% done, 6h35m to go
>
> Looks like that was it! The scrub finished in the time it estimated and
> that was all I needed to do. I did not have to do zpool clear or any
> other commands. Is there anything beyond scrub to check the integrity
> of the pool?

That is exactly what scrub does. It validates all data on the disks.

> 0(offsite)# zpool status -v
>   pool: tank1
>  state: ONLINE
>  scrub: scrub completed after 7h32m with 0 errors on Mon Jan 31 23:00:46 2011
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank1       ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             ad0     ONLINE       0     0     0
>             ad1     ONLINE       0     0     0
>             ad4     ONLINE       0     0     0
>             ad6     ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             ada0    ONLINE       0     0     0
>             ada1    ONLINE       0     0     0
>             ada2    ONLINE       0     0     0
>             ada3    ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             ada5    ONLINE       0     0     0
>             ada8    ONLINE       0     0     0
>             ada7    ONLINE       0     0     0
>             ada6    ONLINE       0     0     0
>
> errors: No known data errors

Congrats!
 -- richard
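[For anyone wanting to make that validation routine, scrubs can also be run on
a schedule from cron. This is only a minimal sketch; the pool name and
schedule below are placeholders, not something taken from this thread:

# Root crontab entry: scrub the pool once a month at 03:00 on the 1st.
0 3 1 * * /usr/sbin/zpool scrub tank1
# Check progress and results afterwards with:
#   zpool status -v tank1
]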
Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?
On Tue, Feb 1, 2011 at 6:11 PM, Cindy Swearingen wrote:
> I misspoke and should clarify:
>
> 1. fmdump identifies fault reports that explain system issues
>
> 2. fmdump -eV identifies errors or problem symptoms

Gotcha; fmdump -eV gives me the information I need. It appears to have been a
loose cable. I'm hitting the machine with some heavy I/O load, and the pool
resilvered itself; the drive has not dropped out.

SMART status was reported healthy as well (got smartctl kind of working), but
I cannot read the SMART data of my disks behind the 1068E due to limitations
of smartmontools, I guess (e.g. 'smartctl -d scsi -a /dev/rdsk/c10t0d0' gives
me serial #, model, and just a generic 'SMART Ok'). I assume that SUNWhd is
licensed only for use on the X4500 Thumper and family? I'd like to see if it
works with the 1068E.

It's getting kind of tempting for me to investigate doing a run of boards
that run Marvell 88SX6081s behind a PLX PCIe <-> PCI-X bridge. They should
have beyond excellent support, seeing as that is what the X4500 uses to run
its SATA ports.
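[One thing that might be worth trying for the SMART issue, offered only as a
suggestion and not something confirmed in this thread: newer smartmontools
builds have a SAT (SCSI-to-ATA translation) passthrough device type that can
sometimes expose the full ATA attribute table for SATA disks behind a SAS HBA:

# Plain SCSI mode only reports identity and overall health for these disks:
smartctl -d scsi -a /dev/rdsk/c10t0d0

# If the installed smartmontools build supports SAT passthrough, this may show
# the full ATA SMART attribute table instead (same device name as above):
smartctl -d sat -a /dev/rdsk/c10t0d0
]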
Re: [zfs-discuss] Question regarding "zfs snapshot -r"
On Tue, Feb 1 at 10:54, Rahul Deb wrote:
> Hello All,
>
> I have two questions related to "zfs snapshot -r"
>
> 1. When the "zfs snapshot -r tank@today" command is issued, does it create
> snapshots for all the descendent file systems at the same moment? I mean to
> say, if the command is issued at 10:20:35 PM, is the creation time of all the
> snapshots for descendent file systems the same?
>
> 2. Say tank has around 5000 descendent file systems and "zfs snapshot -r
> tank@today" takes around 10 seconds to complete. If a new file system is
> created under tank within that 10-second period, does the snapshot process
> include the new file system created within those 10 seconds? Or will it
> exclude that newly created file system?
>
> Thanks,
> -- Rahul

I believe the contract is that the content of all recursive snapshots is
consistent with the instant in time at which the snapshot command was
executed.

Quoting from the ZFS Administration Guide:

    Recursive ZFS snapshots are created quickly as one atomic operation. The
    snapshots are created together (all at once) or not created at all. The
    benefit of such an operation is that the snapshot data is always taken at
    one consistent time, even across descendent file systems.

Therefore, in #2 above, the snapshot wouldn't include the new file system,
because it was created after the moment in time when the snapshot was
initiated.

In #1 above, I would guess the snapshot time is the time of the initial
command across all filesystems in the tree, even if it takes 10 seconds to
actually complete the command. However, I have no such system where I can
prove this guess as correct or not.

--eric

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org
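[One way to test the guess in #1 empirically, as a quick sketch (the dataset
names come from the question; this is not output from a real system):

# Take a recursive snapshot, then list the creation time ZFS recorded for each
# snapshot it produced; if the operation is atomic they should all agree.
zfs snapshot -r tank@today
zfs list -t snapshot -r tank -o name,creation | grep '@today'
]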
Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?
I misspoke and should clarify:

1. fmdump identifies fault reports that explain system issues

2. fmdump -eV identifies errors or problem symptoms

I'm unclear about your REMOVED status. I don't see it very often.
The ZFS Admin Guide says:

    REMOVED    The device was physically removed while the system was running.
               Device removal detection is hardware-dependent and might not be
               supported on all platforms.

I need to check if FMA generally reports on devices that are REMOVED by the
administrator, as ZFS seems to think in this case.

Thanks,

Cindy

On 02/01/11 15:47, Krunal Desai wrote:

On Tue, Feb 1, 2011 at 1:29 PM, Cindy Swearingen wrote:
> I agree that we need to get email updates for failing devices.

Definitely!

> See if fmdump generated an error report using the commands below.

Unfortunately not, see below:

movax@megatron:/root# fmdump
TIME                 UUID                                 SUNW-MSG-ID EVENT
fmdump: warning: /var/fm/fmd/fltlog is empty

--khd
Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?
On Tue, Feb 1, 2011 at 1:29 PM, Cindy Swearingen wrote:
> I agree that we need to get email updates for failing devices.

Definitely!

> See if fmdump generated an error report using the commands below.

Unfortunately not, see below:

movax@megatron:/root# fmdump
TIME                 UUID                                 SUNW-MSG-ID EVENT
fmdump: warning: /var/fm/fmd/fltlog is empty

--khd
[zfs-discuss] Question regarding "zfs snapshot -r"
Hello All,

I have two questions related to "zfs snapshot -r"

1. When the "zfs snapshot -r tank@today" command is issued, does it create
snapshots for all the descendent file systems at the same moment? I mean to
say, if the command is issued at 10:20:35 PM, is the creation time of all the
snapshots for descendent file systems the same?

2. Say tank has around 5000 descendent file systems and "zfs snapshot -r
tank@today" takes around 10 seconds to complete. If a new file system is
created under tank within that 10-second period, does the snapshot process
include the new file system created within those 10 seconds? Or will it
exclude that newly created file system?

Thanks,
-- Rahul
Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?
Hi Krunal,

It looks to me like FMA thinks that you removed the disk, so you'll need to
confirm whether the cable dropped or something else.

I agree that we need to get email updates for failing devices.

See if fmdump generated an error report using the commands below.

Thanks,

Cindy

# fmdump
TIME                 UUID                                 SUNW-MSG-ID EVENT
Jan 07 14:01:14.7839 04ee736a-b2cb-612f-ce5e-a0e43d666762 ZFS-8000-GH Diagnosed
Jan 13 10:34:32.2301 04ee736a-b2cb-612f-ce5e-a0e43d666762 FMD-8000-58 Updated

Then, review the contents:

fmdump -u 04ee736a-b2cb-612f-ce5e-a0e43d666762 -v
TIME                 UUID                                 SUNW-MSG-ID EVENT
Jan 07 14:01:14.7839 04ee736a-b2cb-612f-ce5e-a0e43d666762 ZFS-8000-GH Diagnosed
  100%  fault.fs.zfs.vdev.checksum
        Problem in: zfs://pool=c4538d8607c1e030/vdev=7954b2ff7a8383
           Affects: zfs://pool=c4538d8607c1e030/vdev=7954b2ff7a8383
               FRU: -
          Location: -
Jan 13 10:34:32.2301 04ee736a-b2cb-612f-ce5e-a0e43d666762 FMD-8000-58 Updated
  100%  fault.fs.zfs.vdev.checksum
        Problem in: zfs://pool=c4538d8607c1e030/vdev=7954b2ff7a8383
           Affects: zfs://pool=c4538d8607c1e030/vdev=7954b2ff7a8383
               FRU: -
          Location: -

Thanks,

Cindy

On 02/01/11 09:55, Krunal Desai wrote:

I recently discovered a drive failure (either that or a loose cable, I need
to investigate further) on my home fileserver. 'fmadm faulty' returns no
output, but I can clearly see a failure when I do zpool status -v:

  pool: tank
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub canceled on Tue Feb  1 11:51:58 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz2-0  DEGRADED     0     0     0
            c10t0d0 ONLINE       0     0     0
            c10t1d0 ONLINE       0     0     0
            c10t2d0 ONLINE       0     0     0
            c10t3d0 REMOVED      0     0     0
            c10t4d0 ONLINE       0     0     0
            c10t5d0 ONLINE       0     0     0
            c10t6d0 ONLINE       0     0     0
            c10t7d0 ONLINE       0     0     0

In dmesg, I see:

Feb  1 11:14:33 megatron scsi: [ID 107833 kern.warning] WARNING:
/pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0 (sd8):
Feb  1 11:14:33 megatron    Command failed to complete...Device is gone

I never had any problems with these drives + mpt in snv_134 (on snv_151a
now); the only change was adding a second 1068E-IT that's currently
unpopulated with drives. But more importantly, I guess: why can't I see this
failure in fmadm (and how would I go about setting up automatic dispatch of
an e-mail to me when stuff like this happens)? Is a pool going degraded !=
to failure?
[zfs-discuss] fmadm faulty not showing faulty/offline disks?
I recently discovered a drive failure (either that or a loose cable, I need
to investigate further) on my home fileserver. 'fmadm faulty' returns no
output, but I can clearly see a failure when I do zpool status -v:

  pool: tank
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub canceled on Tue Feb  1 11:51:58 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz2-0  DEGRADED     0     0     0
            c10t0d0 ONLINE       0     0     0
            c10t1d0 ONLINE       0     0     0
            c10t2d0 ONLINE       0     0     0
            c10t3d0 REMOVED      0     0     0
            c10t4d0 ONLINE       0     0     0
            c10t5d0 ONLINE       0     0     0
            c10t6d0 ONLINE       0     0     0
            c10t7d0 ONLINE       0     0     0

In dmesg, I see:

Feb  1 11:14:33 megatron scsi: [ID 107833 kern.warning] WARNING:
/pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0 (sd8):
Feb  1 11:14:33 megatron    Command failed to complete...Device is gone

I never had any problems with these drives + mpt in snv_134 (on snv_151a
now); the only change was adding a second 1068E-IT that's currently
unpopulated with drives. But more importantly, I guess: why can't I see this
failure in fmadm (and how would I go about setting up automatic dispatch of
an e-mail to me when stuff like this happens)? Is a pool going degraded !=
to failure?

--
--khd
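[On the e-mail question, the thread doesn't settle on a built-in mechanism;
one low-tech workaround is a cron job that mails the output of 'zpool status
-x' whenever a pool is unhealthy. This is a sketch only, assuming a working
mail transport; the script name and address are placeholders:

#!/bin/sh
# check_pools.sh - hypothetical cron helper, not from this thread.
# 'zpool status -x' prints "all pools are healthy" when nothing is wrong;
# anything else is mailed to the (placeholder) admin address.
OUT=`/usr/sbin/zpool status -x`
if [ "$OUT" != "all pools are healthy" ]; then
    echo "$OUT" | mailx -s "zpool alert on `hostname`" admin@example.com
fi
]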
Re: [zfs-discuss] multiple disk failure (solved?)
Excellent.

I think you are good for now as long as your hardware setup is stable.

You survived a severe hardware failure, so say a prayer and make sure this
doesn't happen again. Always have good backups.

Thanks,

Cindy

On 02/01/11 06:56, Mike Tancsa wrote:

On 1/31/2011 4:19 PM, Mike Tancsa wrote:
On 1/31/2011 3:14 PM, Cindy Swearingen wrote:

Hi Mike,

Yes, this is looking much better.

Some combination of removing corrupted files indicated in the zpool
status -v output, running zpool scrub and then zpool clear should
resolve the corruption, but it depends on how bad the corruption is.

First, I would try the least destructive method: try to remove the
files listed below by using the rm command.

This entry probably means that the metadata is corrupted or some
other file (like a temp file) no longer exists:

tank1/argus-data:<0xc6>

Hi Cindy,
I removed the files that were listed, and now I am left with

errors: Permanent errors have been detected in the following files:

    tank1/argus-data:<0xc5>
    tank1/argus-data:<0xc6>
    tank1/argus-data:<0xc7>

I have started a scrub
scrub: scrub in progress for 0h48m, 10.90% done, 6h35m to go

Looks like that was it! The scrub finished in the time it estimated and
that was all I needed to do. I did not have to do zpool clear or any
other commands. Is there anything beyond scrub to check the integrity
of the pool?

0(offsite)# zpool status -v
  pool: tank1
 state: ONLINE
 scrub: scrub completed after 7h32m with 0 errors on Mon Jan 31 23:00:46 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad0     ONLINE       0     0     0
            ad1     ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada5    ONLINE       0     0     0
            ada8    ONLINE       0     0     0
            ada7    ONLINE       0     0     0
            ada6    ONLINE       0     0     0

errors: No known data errors
Re: [zfs-discuss] ZFS dedup success stories?
>> Dedup is *hungry* for RAM. 8GB is not enough for your configuration,
>> most likely! First guess: double the RAM and then you might have
>> better luck.
>
> I know... that's why I use L2ARC

What is zdb -D showing?

Does this give you any clue?
http://blogs.sun.com/roch/entry/dedup_performance_considerations1

br,
syljua

--
This message posted from opensolaris.org
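[For context, 'zdb -D' summarizes the dedup table (DDT), which is what
competes for ARC space in the first place. A hedged example of reading it;
the pool name is a placeholder and the exact output layout varies between
builds:

# Summarize the DDT for a pool; -DD and -DDD add histogram detail.
zdb -D tank

# The summary typically ends with a ratio line such as:
#   dedup = 1.34, compress = 1.51, copies = 1.00, dedup * compress / copies = 2.02
# and the entry counts above it (entries x in-core size) give a rough idea of
# how much RAM or L2ARC the table itself needs.
]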
Re: [zfs-discuss] multiple disk failure (solved?)
On 1/31/2011 4:19 PM, Mike Tancsa wrote:
> On 1/31/2011 3:14 PM, Cindy Swearingen wrote:
>> Hi Mike,
>>
>> Yes, this is looking much better.
>>
>> Some combination of removing corrupted files indicated in the zpool
>> status -v output, running zpool scrub and then zpool clear should
>> resolve the corruption, but it depends on how bad the corruption is.
>>
>> First, I would try the least destructive method: try to remove the
>> files listed below by using the rm command.
>>
>> This entry probably means that the metadata is corrupted or some
>> other file (like a temp file) no longer exists:
>>
>> tank1/argus-data:<0xc6>
>
> Hi Cindy,
> I removed the files that were listed, and now I am left with
>
> errors: Permanent errors have been detected in the following files:
>
>     tank1/argus-data:<0xc5>
>     tank1/argus-data:<0xc6>
>     tank1/argus-data:<0xc7>
>
> I have started a scrub
> scrub: scrub in progress for 0h48m, 10.90% done, 6h35m to go

Looks like that was it! The scrub finished in the time it estimated and that
was all I needed to do. I did not have to do zpool clear or any other
commands. Is there anything beyond scrub to check the integrity of the pool?

0(offsite)# zpool status -v
  pool: tank1
 state: ONLINE
 scrub: scrub completed after 7h32m with 0 errors on Mon Jan 31 23:00:46 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad0     ONLINE       0     0     0
            ad1     ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada5    ONLINE       0     0     0
            ada8    ONLINE       0     0     0
            ada7    ONLINE       0     0     0
            ada6    ONLINE       0     0     0

errors: No known data errors
0(offsite)#

        ---Mike
Re: [zfs-discuss] ZFS and L2ARC memory requirements?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
>
>> Even *with* an L2ARC, your memory requirements are *substantial*,
>> because the L2ARC itself needs RAM. 8 GB is simply inadequate for your
>> test.
>
> With 50TB storage, and 1TB of L2ARC, with no dedup, what amount of ARC
> would you recommend?

Without dedup and without L2ARC, the amount of ram you require is unrelated
to the amount of storage you have. Your ram requirement depends on what
applications you run. Any excess ram you have will be used for ARC (that is,
L1 ARC) and therefore used to benefit performance. So excess ram is always
good.

Do not be a cheapskate with ram. Regardless of whether you use ZFS, or any
other filesystem, or any other OS, even windows or linux, excess ram is
always a good thing. It always improves stability and improves performance.
If you are using a laptop and not serving anything and performance is not a
major concern and you're free to reboot whenever you want, then you can
survive on 2G of ram. But a server presumably DOES stuff and you don't want
to reboot frequently. I'd recommend 4G minimally, 8G standard, and if you run
any applications (databases, web servers, symantec products) then add more.
And if you use dedup, or l2arc, then add more.

> And then, _with_ dedup, what would you recommend?

If you have dedup enabled, add slightly under 3G ram for every 1TB of unique
data in your pool, on top of whatever you've selected for your base ram
configuration.
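[To put a rough number on that last rule of thumb, here is a
back-of-the-envelope sketch assuming roughly 320 bytes per DDT entry and the
default 128 KB recordsize; the real figure depends on the actual block-size
mix in the pool:

# 1 TB of unique data at 128 KB per block:
#   2^40 / 2^17 = 8,388,608 unique blocks
# at ~320 bytes per dedup-table entry:
echo $(( (1024 * 1024 * 1024 * 1024 / 131072) * 320 / 1024 / 1024 ))   # ~2560 MB
# i.e. roughly 2.5 GB of DDT per 1 TB of unique data, hence "slightly under 3G".
]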
Re: [zfs-discuss] ZFS and TRIM
On 01/31/11 01:09 PM, Pasi Kärkkäinen wrote:
> On Mon, Jan 31, 2011 at 03:41:52PM +0100, Joerg Schilling wrote:
>> Brandon High wrote:
>>> On Sat, Jan 29, 2011 at 8:31 AM, Edward Ned Harvey wrote:
>>>> What is the status of ZFS support for TRIM?
>>>
>>> I believe it's been supported for a while now.
>>> http://www.c0t0d0s0.org/archives/6792-SATA-TRIM-support-in-Opensolaris.html
>>
>> The command is implemented in the sata driver but there does not seem to be
>> any user of the code.
>
> Btw is the SCSI equivalent also implemented? iirc it was called SCSI UNMAP
> (for SAS).

No.

    - Garrett

> -- Pasi
Re: [zfs-discuss] ZFS dedup success stories (take two)
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
>
> Sorry about the initial post - it was wrong. The hardware configuration was
> right, but for initial tests, I use NFS, meaning sync writes. This obviously
> stresses the ARC/L2ARC more than async writes, but the result remains the
> same.

I'm sorry, that's not correct. L2ARC is a read cache. ZIL is used for sync
writes. ZIL always exists. If there is no dedicated ZIL log device, then
blocks are used for ZIL in the main storage pool.
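[For completeness, the two device classes being conflated here are attached
with different zpool subcommands; a brief sketch, with placeholder pool and
device names:

# Dedicated ZIL (SLOG) device: absorbs synchronous writes, e.g. NFS traffic.
zpool add tank log c5t0d0

# L2ARC cache device: extends the read cache only; it does not help sync writes.
zpool add tank cache c5t1d0
]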
Re: [zfs-discuss] ZFS dedup success stories?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
>
>> Dedup is *hungry* for RAM. 8GB is not enough for your configuration,
>> most likely! First guess: double the RAM and then you might have
>> better luck.
>
> I know... that's why I use L2ARC

L2ARC is not a substitute for ram. In some cases it can improve disk
performance in the absence of ram, but it cannot be used for in-memory
applications and kernel. At best, what you're describing would be swap space
on a SSD. Swap space is a substitute for ram. Be aware that SSD performance
is 1/100th the performance of ram (or worse.)

Garrett is right. Add more ram, if it is physically possible. And if it is
not physically possible, think long and hard about upgrading your server so
you can add more ram.
Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of James
>
> I'm trying to select the appropriate disk spindle speed for a proposal and
> would welcome any experience and opinions (e.g. has anyone actively
> chosen 10k/15k drives for a new ZFS build and, if so, why?).

There is nothing special about ZFS in relation to spindle speed. If you get
higher rpm's, then you get higher iops, and the same is true for EXT3, NTFS,
HFS+, ZFS, etc.

One characteristic people often overlook is: when you get a disk with higher
capacity (say, 2T versus 600G), then you get more empty space and hence
typically lower fragmentation in the drive. Also, the platter density is
typically higher, so if the two drives have equal RPM's, typically the higher
capacity drive can perform faster sustained sequential operations.

Even if you use slow drives, assuming you have them in some sort of raid
configuration, they quickly add up sequential speed to reach the bus speed.
So if you expect to do large sequential operations, go for the lower rpm
disks. But if you expect to do lots of small operations, then twice the rpm's
literally means twice the performance. So for small random operations, go for
the higher rpm disks.

> ** My understanding is that ZFS will adjust the amount of data accepted into
> each "transaction" (TXG) to ensure it can be written to disk in 5s. Async data
> will stay in ARC; sync data will also go to ZIL or, if over threshold, will go
> to disk with a pointer in the ZIL (on a low-latency SLOG) - i.e. all writes
> apart from sync writes

ZFS will aggregate small random writes into larger sequential writes. So you
don't have to worry too much about rpm's and iops during writes. But of
course there's nothing you can do about the random reads. So if you do random
reads, you do indeed want higher rpm's.

Your understanding (or terminology) of arc is not correct. Arc and l2arc are
read cache. The terminology for the context you're describing would be the
write buffer. Async writes will be stored in the ram write buffer and
optimized for sequential disk blocks before writing to disk. Whenever there
are sync writes, they will be written to the ZIL (hopefully you have a
dedicated ZIL log device) immediately, and then they will join the write
buffer with all the other async writes.
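[As a rough illustration of the random-read point, ballpark figures only,
assuming average rotational latency of half a revolution plus a typical
average seek time (real drives vary):

# Random-read IOPS estimate: 1000 ms / (avg rotational latency + avg seek)
# 7,200 rpm:  half a revolution ~4.2 ms, typical seek ~8.5 ms
# 15,000 rpm: half a revolution ~2.0 ms, typical seek ~3.5 ms
echo "7.2k rpm: $(echo '1000/(4.2+8.5)' | bc) IOPS (approx.)"
echo "15k  rpm: $(echo '1000/(2.0+3.5)' | bc) IOPS (approx.)"
# i.e. roughly 75-80 vs ~180 random IOPS, a bit more than a 2x difference.
]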