Re: [zfs-discuss] ZFS still crashing after patch
Hello Richard,

Monday, May 5, 2008, 4:12:23 PM, you wrote:

RE> Rustam wrote:
>> Hello Robert,
>>
>>> Which would happen if you have a problem with HW and you're getting
>>> wrong checksums on both sides of your mirrors. Maybe the PSU?
>>>
>>> Try memtest anyway, or SunVTS
>>>
>> Unfortunately, SunVTS doesn't run on non-Sun/OEM hardware. And memtest
>> requires too much downtime, which I cannot afford right now.
>>
RE> Sometimes if you read the docs, you can get confused by people who
RE> intend to confuse you. SunVTS does work on a wide variety of
RE> hardware, though it may not be "supported." To fully understand the
RE> perspective, SunVTS is used by Sun in the manufacturing process.
RE> It is the tests run on hardware before shipping to customers. It is not
RE> intended to be a generic "test whatever hardware you find laying around"
RE> product.

Nevertheless, you can actually "persuade" it to run on non-Sun HW - it's even in the manual page, IIRC.

-- 
Best regards,
Robert Milkowski  mailto:[EMAIL PROTECTED]  http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS still crashing after patch
On May 5, 2008, at 4:43 PM, Bob Friesenhahn wrote:
> On Mon, 5 May 2008, eric kustarz wrote:
>>
>> That's not true:
>> http://blogs.sun.com/erickustarz/entry/zil_disable
>>
>> Perhaps people are using "consistency" to mean different things
>> here...
>
> Consistency means that fsync() assures that the data will be written
> to disk so no data is lost. It is not the same thing as "no
> corruption". ZFS will happily lose some data in order to avoid some
> corruption if the system loses power.

Ok, that makes more sense. You're talking from the application perspective, whereas my blog entry is from the file system's perspective (disabling the ZIL does not compromise on-disk consistency).

eric
Re: [zfs-discuss] ZFS still crashing after patch
On Mon, 5 May 2008, Marcelo Leal wrote:
> I'm calling consistency, "a coherent local view"...
> I think that was one option to debug (if not an NFS server), without
> generating a corrupted filesystem.

In other words, your flight reservation will not be lost if the system crashes.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] ZFS still crashing after patch
On Mon, 5 May 2008, eric kustarz wrote:
>
> That's not true:
> http://blogs.sun.com/erickustarz/entry/zil_disable
>
> Perhaps people are using "consistency" to mean different things here...

Consistency means that fsync() assures that the data will be written to disk so no data is lost. It is not the same thing as "no corruption". ZFS will happily lose some data in order to avoid some corruption if the system loses power.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
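[The durability guarantee Bob describes can be sketched in a few lines. This is only an illustration of the fsync() contract, not code from the thread; the helper name is made up:]

```python
import os

def durable_write(path, data):
    """Write data and force it to stable storage before returning.

    On ZFS, this is the guarantee the ZIL backs: once fsync() returns,
    the write survives a power failure. With the ZIL disabled, fsync()
    can return before the data is on disk, so recent writes may be lost
    after a crash, even though the pool itself still imports as a
    consistent (older) on-disk state.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)
        os.fsync(fd)  # push file data (and metadata) to stable storage
    finally:
        os.close(fd)
```

[In other words: the application-level property ("my reservation is durable") depends on the ZIL; the filesystem-level property ("the pool is never corrupt") does not.]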
Re: [zfs-discuss] ZFS still crashing after patch
On May 5, 2008, at 1:43 PM, Bob Friesenhahn wrote:
> On Mon, 5 May 2008, Marcelo Leal wrote:
>
>> Hello, If you believe that the problem can be related to ZIL code,
>> you can try to disable it to debug (isolate) the problem. If it is
>> not a fileserver (NFS), disabling the zil should not impact
>> consistency.
>
> In what way is NFS special when it comes to ZFS consistency? If NFS
> consistency is lost by disabling the zil then local consistency is
> also lost.

That's not true:
http://blogs.sun.com/erickustarz/entry/zil_disable

Perhaps people are using "consistency" to mean different things here...

eric
Re: [zfs-discuss] ZFS still crashing after patch
On Mon, 5 May 2008, Marcelo Leal wrote:
> Hello, If you believe that the problem can be related to ZIL code,
> you can try to disable it to debug (isolate) the problem. If it is
> not a fileserver (NFS), disabling the zil should not impact
> consistency.

In what way is NFS special when it comes to ZFS consistency? If NFS consistency is lost by disabling the zil then local consistency is also lost.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] ZFS still crashing after patch
Hello Leal,

I've already been warned (http://www.opensolaris.org/jive/message.jspa?messageID=231349) that the ZIL could be a cause, and I did test with zil_disabled. I ran a scrub and the system crashed after exactly the same period, with the same error. The ZIL is known to cause some problems on writes, while all my problems are with zio_read and checksum_verify.

This is an NFS file server, but it crashed even with NFS unshared and nfs/server disabled. So this is not an NFS problem.

I reduced the panic occasions by setting zfs_prefetch_disable. This avoids unnecessary reads and reduces the chance of reading bad checksums. For now I've had 24 hours without a crash, which is much better than a few times a day. However, I know that the bad checksums are still there and I need to fix them somehow.

-- 
Rustam

This message posted from opensolaris.org
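[For anyone following along, the usual ways to flip this tunable on Solaris 10 are sketched below; this is an aside for reference, not from Rustam's post, so verify the tunable name against your kernel before using it:]

```
# Persistent, in /etc/system (takes effect at next boot):
set zfs:zfs_prefetch_disable = 1

# Or live on the running kernel, until reboot (use mdb -kw with care):
echo "zfs_prefetch_disable/W0t1" | mdb -kw
```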
Re: [zfs-discuss] ZFS still crashing after patch
Hello,

If you believe that the problem can be related to the ZIL code, you can try to disable it to debug (isolate) the problem. If it is not a fileserver (NFS), disabling the zil should not impact consistency.

Leal.
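[For reference, the tunable on Solaris 10-era kernels was zil_disable. A sketch follows; this is for debugging only, and note it only takes effect for filesystems (re)mounted after the change:]

```
# In /etc/system (applies at next boot):
set zfs:zil_disable = 1

# Or live via mdb, then remount the affected filesystems:
echo "zil_disable/W0t1" | mdb -kw
```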
Re: [zfs-discuss] ZFS still crashing after patch
Rustam wrote:
> Hello Robert,
>
>> Which would happen if you have a problem with HW and you're getting
>> wrong checksums on both sides of your mirrors. Maybe the PSU?
>>
>> Try memtest anyway, or SunVTS
>>
> Unfortunately, SunVTS doesn't run on non-Sun/OEM hardware. And memtest
> requires too much downtime, which I cannot afford right now.
>
Sometimes if you read the docs, you can get confused by people who intend to confuse you. SunVTS does work on a wide variety of hardware, though it may not be "supported." To fully understand the perspective, SunVTS is used by Sun in the manufacturing process. It is the tests run on hardware before shipping to customers. It is not intended to be a generic "test whatever hardware you find laying around" product.

-- richard
Re: [zfs-discuss] ZFS still crashing after patch
Hello Robert,

> Which would happen if you have a problem with HW and you're getting
> wrong checksums on both sides of your mirrors. Maybe the PSU?
>
> Try memtest anyway, or SunVTS

Unfortunately, SunVTS doesn't run on non-Sun/OEM hardware. And memtest requires too much downtime, which I cannot afford right now.

However, I have some interesting observations, and I can now reproduce the crash. It seems that I have bad checksum(s) and ZFS crashes each time it tries to read them. Below are two cases:

Case 1: I got a checksum error not striped over mirrors; this time it was the checksum for a file and not <0x0>. I tried to read the file twice. The first try returned an I/O error, the second try caused a panic. Here's the log:

core# zpool status -xv
  pool: box5
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        box5        ONLINE       0     0     2
          mirror    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
          mirror    ONLINE       0     0     2
            c2d1    ONLINE       0     0     4
            c1d1    ONLINE       0     0     4

errors: Permanent errors have been detected in the following files:

        box5:<0x0>
        /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file

core# ll /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
-rw------- 1 user group 489 Apr 20 2006 /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
core# cat /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
cat: input error on /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file: I/O error

core# zpool status -xv
  pool: box5
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        box5        ONLINE       0     0     4
          mirror    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
          mirror    ONLINE       0     0     4
            c2d1    ONLINE       0     0     8
            c1d1    ONLINE       0     0     8

errors: Permanent errors have been detected in the following files:

        box5:<0x0>
        /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file

core# cat /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
(Kernel panic: BAD TRAP: type=e (#pf Page fault) rp=fe8001112490 addr=fe80882b7000)

... (after system boot-up)

core# rm /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
core# zpool status -xv
  pool: box5
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        box5        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c2d1    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        box5:<0x0>
        box5:<0x4a049a>

core# mdb unix.17 vmcore.17
Loading modules: [ unix krtld genunix specfs dtrace cpu.generic uppc pcplusmp ufs ip hook neti sctp arp usba uhci fctl nca lofs zfs random nfs ipc sppp crypto ptm ]
> ::status
debugging crash dump vmcore.17 (64-bit) from core
operating system: 5.10 Generic_127128-11 (i86pc)
panic message: BAD TRAP: type=e (#pf Page fault) rp=fe8001112490 addr=fe80882b7000
dump content: kernel pages only
> ::stack
fletcher_2_native+0x13()
zio_checksum_verify+0x27()
zio_next_stage+0x65()
zio_wait_for_children+0x49()
zio_wait_children_done+0x15()
zio_next_stage+0x65()
zio_vdev_io_assess+0x84()
zio_next_stage+0x65()
vdev_cache_read+0x14c()
vdev_disk_io_start+0x135()
vdev_io_start+0x12()
zio_vdev_io_start+0x7b()
zio_next_stage_async+0xae()
zio_nowait+9()
vdev_mirror_io_start+0xa9()
vdev_io_start+0x12()
zio_vdev_io_start+0x7b()
zio_next_stage_async+0xae()
zio_nowait+9()
vdev_mirror_io_start+0xa9()
zio_vdev_io_start+0x116()
zio_next_stage+0x65()
zio_ready+0xec()
zio_next_stage+0x65()
zio_wait_for_children+0x49()
zio_wait_chi
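[An aside, not from the thread: once the damaged file has been removed, the standard way to make ZFS re-verify every block and age out the lingering <0x0> object errors is to clear the counters and scrub (two clean scrubs are typically needed). In Rustam's case the scrub itself panics, so this only helps after the underlying fault is fixed:]

```
core# zpool clear box5
core# zpool scrub box5
core# zpool status -v box5    # watch scrub progress and the CKSUM counters
```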
Re: [zfs-discuss] ZFS still crashing after patch
Hello Rustam,

Saturday, May 3, 2008, 9:16:41 AM, you wrote:

R> I don't think that this is a hardware issue, however I can't rule it out. I'll try to explain why.

R> 1. I've replaced all memory modules which are more likely to cause such a problem.

R> 2. There are many different applications running on that server
R> (Apache, PostgreSQL, etc.). However, if you look at the four
R> different crash dump stack traces you see the same picture:

R> -- crash dump st1 --
R> mutex_enter+0xb()
R> zio_buf_alloc+0x1a()
R> zio_read+0xba()
R> spa_scrub_io_start+0xf1()
R> spa_scrub_cb+0x13d()

R> -- crash dump st2 --
R> mutex_enter+0xb()
R> zio_buf_alloc+0x1a()
R> zio_read+0xba()
R> arc_read+0x3cc()
R> dbuf_prefetch+0x11d()
R> dmu_prefetch+0x107()
R> zfs_readdir+0x408()
R> fop_readdir+0x34()

R> -- crash dump st3 --
R> mutex_enter+0xb()
R> zio_buf_alloc+0x1a()
R> zio_read+0xba()
R> arc_read+0x3cc()
R> dbuf_prefetch+0x11d()
R> dmu_prefetch+0x107()
R> zfs_readdir+0x408()
R> fop_readdir+0x34()

R> -- crash dump st4 --
R> mutex_enter+0xb()
R> zio_buf_alloc+0x1a()
R> zio_read+0xba()
R> arc_read+0x3cc()
R> dbuf_prefetch+0x11d()
R> dmu_prefetch+0x107()
R> zfs_readdir+0x408()
R> fop_readdir+0x34()

R> All four crash dumps show the problem at zio_read/zio_buf_alloc. Three
R> of these appeared during metadata prefetch (dmu_prefetch) and one
R> during scrubbing. I don't think that this is a coincidence. IMHO, the
R> checksum errors are the result of this inconsistency.

Which would happen if you have a problem with HW and you're getting wrong checksums on both sides of your mirrors. Maybe the PSU?

Try memtest anyway, or SunVTS

-- 
Best regards,
Robert Milkowski  mailto:[EMAIL PROTECTED]  http://milek.blogspot.com
Re: [zfs-discuss] ZFS still crashing after patch
I don't think that this is a hardware issue, however I can't rule it out. I'll try to explain why.

1. I've replaced all memory modules which are more likely to cause such a problem.

2. There are many different applications running on that server (Apache, PostgreSQL, etc.). However, if you look at the four different crash dump stack traces you see the same picture:

-- crash dump st1 --
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
spa_scrub_io_start+0xf1()
spa_scrub_cb+0x13d()

-- crash dump st2 --
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

-- crash dump st3 --
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

-- crash dump st4 --
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

All four crash dumps show the problem at zio_read/zio_buf_alloc. Three of these appeared during metadata prefetch (dmu_prefetch) and one during scrubbing. I don't think that this is a coincidence. IMHO, the checksum errors are the result of this inconsistency. I tend to think that the problem is in ZFS and that it exists even in the latest Solaris version (maybe OpenSolaris as well).

> Lots of CKSUM errors like you see are often indicative
> of bad hardware. Run
> memtest for 24-48 hours.
>
> -marc
Re: [zfs-discuss] ZFS still crashing after patch
Rustam code.az> writes:
>
> Didn't help. Keeps crashing.
> The worst thing is that I don't know where the problem is. Any more ideas
> on how to find the problem?

Lots of CKSUM errors like you see are often indicative of bad hardware. Run memtest for 24-48 hours.

-marc
Re: [zfs-discuss] ZFS still crashing after patch
> Seems kind of old. I am using Generic_127112-11 here.
>
> Probably many hundreds of nasty bugs have been
> eliminated since the version you are using.

I've updated to the latest available kernel, 127128-11 (from 28 Apr), which included a number of fixes to the AHCI SATA driver and ZFS.

Didn't help. Keeps crashing. The worst thing is that I don't know where the problem is. Any more ideas on how to find the problem?
Re: [zfs-discuss] ZFS still crashing after patch
On Thu, 1 May 2008, Rustam wrote:
> operating system: 5.10 Generic_127112-07 (i86pc)

Seems kind of old. I am using Generic_127112-11 here.

Probably many hundreds of nasty bugs have been eliminated since the version you are using.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
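[As a reminder for anyone comparing versions (standard Solaris commands, not from the thread): the running kernel and installed patch level can be checked with:]

```
# Kernel version string (e.g. Generic_127112-11):
uname -v

# Confirm whether a specific patch is installed:
showrev -p | grep 127112
```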
Re: [zfs-discuss] ZFS still crashing after patch
> Is your ZFS pool configured with redundancy (e.g. mirrors, raidz) or is
> it non-redundant? If non-redundant, then there is not much that ZFS
> can really do if a device begins to fail.

It's RAID 10 (more info here: http://www.opensolaris.org/jive/thread.jspa?threadID=57425):

        NAME        STATE     READ WRITE CKSUM
        box5        ONLINE       0     0     4
          mirror    ONLINE       0     0     2
            c1d0    ONLINE       0     0     4
            c2d0    ONLINE       0     0     4
          mirror    ONLINE       0     0     2
            c2d1    ONLINE       0     0     4
            c1d1    ONLINE       0     0     4

Actually, there's no damaged data so far. I don't get any "unable to read/write" kind of errors. It's just very strange checksum errors, synchronized over all disks.

> That's a bit harsh. ZFS is telling you that you have corrupted data
> based on the checksums. Other types of filesystems would likely simply
> pass the corrupted data on silently.

Checksums are good, no complaints about that.

> Do you have the panic messages? ZFS won't cause panics based on bad
> checksums. It will by default cause panic if it can't write data out to
> any device or if it completely loses access to non-redundant devices or
> loses both redundant devices at the same time.

A number of panic messages and crash dump stack traces are attached to the original post (http://www.opensolaris.org/jive/thread.jspa?threadID=57425). Here is a short snip:

> ::status
debugging crash dump vmcore.5 (64-bit) from core
operating system: 5.10 Generic_127112-07 (i86pc)
panic message: BAD TRAP: type=e (#pf Page fault) rp=fe800017f8d0 addr=238 occurred in module "unix" due to a NULL pointer dereference
dump content: kernel pages only
> ::stack
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
spa_scrub_io_start+0xf1()
spa_scrub_cb+0x13d()
traverse_callback+0x6a()
traverse_segment+0x118()
traverse_more+0x7b()
spa_scrub_thread+0x147()
thread_start+8()

> Since this seems to show the same number of checksum errors across 2
> different channels and 4 different drives, I'd assume that
> this is likely a dual-channel HBA of some sort. It would appear that
> you either have bad hardware or some sort of driver issue.

You're right, this is the dual-channel Intel ICH6 SATA controller. 10U4 has native support/drivers for this SATA controller (AHCI drivers, afaik). The thing is that this hardware and ZFS were in production for almost 2 years (ok, not the best argument). However, this problem only appeared recently (20 days ago). It's even more strange because I didn't make any OS/driver upgrade or patch during the last 2-3 months.

However, this is a good point. I've seen some new SATA/AHCI drivers available in 10U5. Maybe I should try to upgrade and see if it helps.

Thanks Phil.

-- 
Rustam
Re: [zfs-discuss] ZFS still crashing after patch
Rustam wrote:
> Today my production server crashed 4 times. THIS IS A NIGHTMARE!
> Self-healing file system?! For me ZFS is a SELF-KILLING filesystem.
>
> I cannot fsck it, there's no such tool. I cannot scrub it, it crashes
> 30-40 minutes after the scrub starts. I cannot use it, it crashes a
> number of times every day! And with every crash the number of checksum
> failures grows:
>
> NAME    STATE     READ WRITE CKSUM
> box5    ONLINE       0     0     0
> ...after a few hours...
> box5    ONLINE       0     0     4
> ...after a few hours...
> box5    ONLINE       0     0    62
> ...after another few hours...
> box5    ONLINE       0     0   120
> ...crash! and we start again...
> box5    ONLINE       0     0     0
> ...etc...
>
> actually 120 is the record, sometimes it crashed as soon as it boots.
>
> and always there's a permanent error:
>
> errors: Permanent errors have been detected in the following files:
>         box5:<0x0>
>
> and very wise self-healing advice: http://www.sun.com/msg/ZFS-8000-8A
> Restore the file in question if possible. Otherwise restore the
> entire pool from backup.
>
> Thanks, but if I restore it from backup it won't be ZFS anymore,
> that's for sure.

That's a bit harsh. ZFS is telling you that you have corrupted data based on the checksums. Other types of filesystems would likely simply pass the corrupted data on silently.

> It's not an I/O problem. AFAIK, the default ZFS I/O error behavior is "wait"
> to repair (I've got 10U4, non-configurable). Then why does it panic?

Do you have the panic messages? ZFS won't cause panics based on bad checksums. It will by default cause a panic if it can't write data out to any device, or if it completely loses access to non-redundant devices, or loses both redundant devices at the same time.

> Recently there were discussions on the failure of the OpenSolaris community.
> Now it's been more than half a month since I reported such an error.
> Nobody even posted something like "RTFM". Come on guys, I know you
> are there and busy with enterprise customers... but at least give me
> some troubleshooting ideas. I'm totally lost.
>
> just to remind, it's a heavily loaded fs with 3-4 million files and
> folders.
>
> Link to original post:
> http://www.opensolaris.org/jive/thread.jspa?threadID=57425

Since this seems to show the same number of checksum errors across 2 different channels and 4 different drives, I'd assume that this is likely a dual-channel HBA of some sort. It would appear that you either have bad hardware or some sort of driver issue.

Regards,
Phil
Re: [zfs-discuss] ZFS still crashing after patch
On Thu, 1 May 2008, Rustam wrote:
> Today my production server crashed 4 times. THIS IS A NIGHTMARE!
> Self-healing file system?! For me ZFS is a SELF-KILLING filesystem.
>
> I cannot fsck it, there's no such tool.
> I cannot scrub it, it crashes 30-40 minutes after the scrub starts.
> I cannot use it, it crashes a number of times every day! And with every crash
> the number of checksum failures grows:

Is your ZFS pool configured with redundancy (e.g. mirrors, raidz) or is it non-redundant? If non-redundant, then there is not much that ZFS can really do if a device begins to fail.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] ZFS still crashing after patch
Today my production server crashed 4 times. THIS IS A NIGHTMARE! Self-healing file system?! For me ZFS is a SELF-KILLING filesystem.

I cannot fsck it, there's no such tool.
I cannot scrub it, it crashes 30-40 minutes after the scrub starts.
I cannot use it, it crashes a number of times every day! And with every crash the number of checksum failures grows:

NAME    STATE     READ WRITE CKSUM
box5    ONLINE       0     0     0
...after a few hours...
box5    ONLINE       0     0     4
...after a few hours...
box5    ONLINE       0     0    62
...after another few hours...
box5    ONLINE       0     0   120
...crash! and we start again...
box5    ONLINE       0     0     0
...etc...

Actually 120 is the record; sometimes it crashed as soon as it booted.

And always there's a permanent error:

errors: Permanent errors have been detected in the following files:
        box5:<0x0>

And very wise self-healing advice (http://www.sun.com/msg/ZFS-8000-8A): Restore the file in question if possible. Otherwise restore the entire pool from backup.

Thanks, but if I restore it from backup it won't be ZFS anymore, that's for sure.

It's not an I/O problem. AFAIK, the default ZFS I/O error behavior is "wait" to repair (I've got 10U4, non-configurable). Then why does it panic?

Recently there were discussions on the failure of the OpenSolaris community. Now it's been more than half a month since I reported this error. Nobody even posted something like "RTFM". Come on guys, I know you are there and busy with enterprise customers... but at least give me some troubleshooting ideas. I'm totally lost.

Just to remind you: it's a heavily loaded fs with 3-4 million files and folders.

Link to original post: http://www.opensolaris.org/jive/thread.jspa?threadID=57425
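[One troubleshooting avenue not raised in the thread (a suggestion, not from the original posts): Solaris FMA records every ZFS checksum and I/O ereport with device and timestamp detail, which can show whether the errors cluster on one channel, one disk, or hit everything at once - a useful hint for separating bad disks from a bad controller or PSU:]

```
# Summarize fault management error reports:
fmdump -e

# Full detail, including the vdev and offsets for ZFS ereports:
fmdump -eV | more

# Any faults FMA has already diagnosed:
fmadm faulty
```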