[zfs-discuss] ZFS error handling - suggestion
Howdy. I have had issues several times with consumer-grade PC hardware and ZFS not getting along. The problem is not the disks but the fact that I don't have ECC and end-to-end checking on the data path. What is happening is that random memory errors and bit flips are written out to disk, and when the data is read back ZFS reports it as a checksum failure:

  pool: myth
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        myth        ONLINE       0     0    48
          raidz1    ONLINE       0     0    48
            c7t1d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /myth/tv/1504_20080216203700.mpg
        /myth/tv/1509_20080217192700.mpg

Note there are no errors against the individual disks, only at the raidz and pool level. I get the same thing on a mirror pool where both sides of the mirror have identical errors. All I can assume is that the data was corrupted after the checksum was calculated and was flushed to disk like that. In the past the cause was a motherboard capacitor that had popped - it was enough to generate these errors under load.

At any rate, ZFS is doing the right thing by telling me - what I don't like is that from that point on I can't convince ZFS to ignore it. The data in question is video files - a bit flip here or there won't matter. But if ZFS reads the affected block it returns an I/O error, and until I restore the file I have no option but to try to make the application skip over it. If it were UFS I would never have known, but ZFS makes a point of stopping anything using it - understandably, but annoyingly as well.

What I would like to see is an option to ZFS in the style of 'onerror' for UFS, i.e. the ability to tell ZFS to join fight club - let what doesn't matter truly slide. For example:

  zfs set erroraction=[iofail|log|ignore]

This would default to the current behaviour of iofail, but in the event you wanted to try to recover or repair data you could set "log" to, say, generate an FMA event that there are bad checksums, or "ignore" to get on with your day. As mentioned, I see this mostly as an option to help repair data after the underlying issue has been identified or fixed. Of course it is data-specific, but if the application can allow it or handle it, why should ZFS get in the way?

Just a thought.

Cheers,
 Adrian

PS: And yes, I am now buying some ECC memory.
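A rough sketch of what using the proposed property might look like, alongside the existing command for listing affected files (the 'erroraction' property itself is hypothetical and does not exist in ZFS today; the pool and file names are taken from the example above, and the backup destination is made up):

  # List the files that currently fail with checksum errors
  zpool status -v myth

  # Hypothetical: log checksum errors instead of failing reads while
  # the damaged recordings are copied off or re-recorded
  zfs set erroraction=log myth/tv
  cp /myth/tv/1504_20080216203700.mpg /backup/

  # Hypothetical: restore the default behaviour afterwards
  zfs set erroraction=iofail myth/tv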
Re: [zfs-discuss] Performance with Sun StorageTek 2540
Mertol Ozyoney wrote:
> The 2540 controller can achieve a maximum of 250 MB/sec on writes on the
> first 12 drives. So you are pretty close to maximum throughput already.
> RAID 5 can be a little bit slower.

I'm a bit irritated now. I have ZFS running for some Sybase ASE 12.5 databases using X4600 servers (8x dual core, 64 GB RAM, Solaris 10 11/06) and 4 Gbit/s lowest-cost Infortrend Fibre Channel JBODs, with a total of 4x 16 FC drives imported into a single mirrored zpool. I benchmarked them with tiobench, using a file size of 64 GB and 32 parallel threads. With an untweaked ZFS the average throughput I got was roughly: sequential/random read 1 GB/s, sequential write 296 MB/s, random write 353 MB/s, leading to a total of approx. 650,000 IOPS with a maximum latency of 350 ms after the databases went into production; the bottleneck is basically the FC HBAs. These are averages - the peaks flatline at the 4 Gbit/s Fibre Channel maximum pretty quickly.

I'm a bit disturbed because I am thinking about switching to 2530/2540 shelves, but a maximum of 250 MB/sec would disqualify them instantly, even with individual RAID controllers for each shelf.

So my question is: can I do the same thing I did with the IFT shelves - buy only 2501 JBODs and attach them directly to the server, thus *not* using the 2540 RAID controller and still having access to the individual drives? I'm quite nervous about this, because I'm not talking about just a single database - I'd need a total of 42 shelves, and I'm pretty sure Sun doesn't offer Try & Buy deals at that scale.

--
Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA
1&1 Internet AG, Brauerstraße 48, 76135 Karlsruhe
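For reference, a pool of the shape Ralf describes - plain JBOD drives mirrored in pairs, with no hardware RAID in between - would be built roughly like this (device names are hypothetical; the post does not show the actual layout):

  # One pool of mirrored pairs, each mirror spanning two JBOD shelves
  zpool create dbpool \
      mirror c4t0d0 c5t0d0 \
      mirror c4t1d0 c5t1d0 \
      mirror c4t2d0 c5t2d0
  zpool status dbpool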
Re: [zfs-discuss] Performance with Sun StorageTek 2540
On Mon, 18 Feb 2008, Ralf Ramge wrote:
> I'm a bit disturbed because I am thinking about switching to 2530/2540
> shelves, but a maximum of 250 MB/sec would disqualify them instantly, even

Note that this is single-file/single-thread I/O performance. I suggest that you read the formal benchmark report for this equipment, since it covers multi-thread I/O performance as well. The multi-user performance is considerably higher.

Given ZFS's smarts, the JBOD approach seems like a good one as long as the hardware provides a non-volatile cache.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] Performance with Sun StorageTek 2540
Bob Friesenhahn writes:
> On Fri, 15 Feb 2008, Roch Bourbonnais wrote:
>> What was the interlace on the LUN ?

The question was about LUN interlace, not interface. 128K to 1M works better.

> The segment size is set to 128K. The max the 2540 allows is 512K.
> Unfortunately, the StorageTek 2540 and CAM documentation does not really
> define what segment size means.
>
>> Any compression ?
>
> Compression is disabled.
>
>> Does turning off checksum help the numbers (that would point to a
>> CPU-limited throughput)?
>
> I have not tried that, but this system is loafing during the benchmark.
> It has four 3 GHz Opteron cores. Does this output from 'iostat -xnz 20'
> help to understand the issues?
>
>                     extended device statistics
>   r/s    w/s   kr/s     kw/s  wait  actv wsvc_t asvc_t  %w  %b device
>   3.0    0.7   26.4      3.5   0.0   0.0    0.0    4.2   0   2 c1t1d0
>   0.0  154.2    0.0  19680.3   0.0  20.7    0.0  134.2   0  59 c4t600A0B80003A8A0B096147B451BEd0
>   0.0  211.5    0.0  26940.5   1.1  33.9    5.0  160.5  99 100 c4t600A0B800039C9B50A9C47B4522Dd0
>   0.0  211.5    0.0  26940.6   1.1  33.9    5.0  160.4  99 100 c4t600A0B800039C9B50AA047B4529Bd0
>   0.0  154.0    0.0  19654.7   0.0  20.7    0.0  134.2   0  59 c4t600A0B80003A8A0B096647B453CEd0
>   0.0  211.3    0.0  26915.0   1.1  33.9    5.0  160.5  99 100 c4t600A0B800039C9B50AA447B4544Fd0
>   0.0  152.4    0.0  19447.0   0.0  20.5    0.0  134.5   0  59 c4t600A0B80003A8A0B096A47B4559Ed0
>   0.0  213.2    0.0  27183.8   0.9  34.1    4.2  159.9  90 100 c4t600A0B800039C9B50AA847B45605d0
>   0.0  152.5    0.0  19453.4   0.0  20.5    0.0  134.5   0  59 c4t600A0B80003A8A0B096E47B456DAd0
>   0.0  213.2    0.0  27177.4   0.9  34.1    4.2  159.9  90 100 c4t600A0B800039C9B50AAC47B45739d0
>   0.0  213.2    0.0  27195.3   0.9  34.1    4.2  159.9  90 100 c4t600A0B800039C9B50AB047B457ADd0
>   0.0  154.4    0.0  19711.8   0.0  20.7    0.0  134.0   0  59 c4t600A0B80003A8A0B097347B457D4d0
>   0.0  211.3    0.0  26958.6   1.1  33.9    5.0  160.6  99 100 c4t600A0B800039C9B50AB447B4595Fd0

Interesting that a subset of 5 disks is responding faster (which also leads to smaller actv queues and so lower service times) than the 7 others - and the slow ones are subject to more writes... If the sizes of the LUNs are different (or they have different amounts of free space), then maybe ZFS is trying to rebalance free space by targeting a subset of the disks with more new data. Pool throughput will be impacted by this.

-r
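Incidentally, the settings Roch asks about can be checked and changed with standard commands; the pool name 'tank' below is a placeholder, not the poster's actual pool:

  # Confirm compression is off and see which checksum algorithm is active
  zfs get compression,checksum tank

  # As a test only: disable checksums to see whether throughput is CPU-bound,
  # then restore the default once the benchmark has been re-run
  zfs set checksum=off tank
  zfs set checksum=on tank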
Re: [zfs-discuss] ZFS error handling - suggestion
comment below...

Adrian Saul wrote:
> Howdy, I have at several times had issues with consumer grade PC hardware
> and ZFS not getting along. The problem is not the disks but the fact I
> don't have ECC and end to end checking on the datapath.
> [...]
> What I would like to see is an option to ZFS in the style of the 'onerror'
> for UFS, i.e. the ability to tell ZFS to join fight club - let what
> doesn't matter truly slide. For example:
>
>   zfs set erroraction=[iofail|log|ignore]
>
> [...]
> PS: And yes, I am now buying some ECC memory.

I don't recall when this arrived in NV, but the failmode parameter for storage pools has already been implemented. From zpool(1m):

     failmode=wait | continue | panic

         Controls the system behavior in the event of catastrophic pool
         failure. This condition is typically a result of a loss of
         connectivity to the underlying storage device(s) or a failure of
         all devices within the pool. The behavior of such an event is
         determined as follows:

         wait        Blocks all I/O access until the device connectivity
                     is recovered and the errors are cleared. This is the
                     default behavior.

         continue    Returns EIO to any new write I/O requests but allows
                     reads to any of the remaining healthy devices. Any
                     write requests that have yet to be committed to disk
                     would be blocked.

         panic       Prints out a message to the console and generates a
                     system crash dump.

 -- richard
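For completeness, failmode is set and inspected like any other pool property (pool name taken from Adrian's example; note that, as Eric Schrock points out later in the thread, failmode does not affect normal reads):

  # Check the current setting
  zpool get failmode myth

  # Keep serving reads from healthy devices instead of blocking on failure
  zpool set failmode=continue myth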
Re: [zfs-discuss] ZFS error handling - suggestion
Richard Elling wrote:
> Adrian Saul wrote:
>> Howdy, I have at several times had issues with consumer grade PC
>> hardware and ZFS not getting along. The problem is not the disks but the
>> fact I don't have ECC and end to end checking on the datapath. What is
>> happening is that random memory errors and bit flips are written out to
>> disk and when read back again ZFS reports it as a checksum failure:
>> [zpool status output snipped]
>> Note there are no disk errors, just errors at the pool level. I get the
>> same thing on a mirror pool where both sides of the mirror have
>> identical errors. All I can assume is that it was corrupted after the
>> checksum was calculated and flushed to disk like that. In the past it
>> was a motherboard capacitor that had popped - but it was enough to
>> generate these errors under load.

I got a similar CKSUM error recently in which a block from a different file ended up in one of my files. So this was not a simple bit flip - 64K of the file was bad. However, I do not think any disk filesystem should tolerate even bit flips; even in video files, I'd want to know. In my case I hacked the ZFS source to temporarily ignore the error so I could see what was wrong.

So your error(s) might be something of this kind (except I do not understand, if so, how both of your mirrors were affected in the same way - do you know this, or did ZFS simply say that the file was not recoverable, i.e. it might have had different bad bits in the two mirrors?). For me, at least on subsequent reboots, no read or write errors were reported either, just CKSUM (I do seem to recall other errors listed - read or write - but they were cleared on reboot, so I cannot recall exactly). And I would think it's possible to get no errors if it's simply a misdirected block write. Still, I would then wonder why I didn't see *2* files with errors if this is what happened to me.

I guess I am saying that this may not be a memory glitch, but could also be an IDE cable issue (as mine turned out to be). See my post here:

http://lists.freebsd.org/pipermail/freebsd-stable/2008-February/040355.html

>> At any rate ZFS is doing the right thing by telling me - what I don't
>> like is that from that point on I can't convince ZFS to ignore it. The
>> data in question is video files - a bit flip here or there won't matter.
>> But if ZFS reads the affected block it returns an I/O error and until I
>> restore the file I have no option but to try and make the application
>> skip over it. If it was UFS for example I would have never known, but
>> ZFS makes a point of stopping anything using it - understandably, but
>> annoyingly as well.

I understand your situation, and I agree that user control might be nice (in my case, I would not have had to tweak the ZFS code). I do think that zpool status should still reveal the error, however, even if the file read does not report it (if you have set ZFS to ignore the error). I can also imagine this could be a bit dangerous if, e.g., the user forgets this option is set.
> PS: And yes, I am now buying some ECC memory.

Good practice in general - I always use ECC. There is nothing worse than silent data corruption.

> I don't recall when this arrived in NV, but the failmode parameter for
> storage pools has already been implemented. From zpool(1m):
> [failmode=wait | continue | panic description snipped]

Is "wait" the default behavior now? When I had CKSUM errors, reading the file would return EIO and stop reading at that point (returning only the good data so far). Do you mean it blocks access on the errored file, or on the whole device? I've noticed the former, but not the latter.
Re: [zfs-discuss] ZFS error handling - suggestion
On Mon, Feb 18, 2008 at 11:52:48AM -0700, Joe Peterson wrote:
> Is "wait" the default behavior now? When I had CKSUM errors, reading the
> file would return EIO and stop reading at that point (returning only the
> good data so far). Do you mean it blocks access on the errored file, or
> on the whole device? I've noticed the former, but not the latter.

The 'failmode' property only applies when writes fail, or read-during-write dependies, such as the spacemaps. It does not affect normal reads.

- Eric

--
Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock
Re: [zfs-discuss] ZFS error handling - suggestion
On Mon, Feb 18, 2008 at 11:15:34AM -0800, Eric Schrock wrote:
> The 'failmode' property only applies when writes fail, or
> read-during-write dependies, such as the spacemaps. It does not affect
                     ^
That should read 'dependencies', obviously ;-)

- Eric

--
Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock
Re: [zfs-discuss] vxfs vs ufs vs zfs
> Hello, I have just done a comparison of all the above filesystems using
> the latest filebench. If you are interested:
>
> http://przemol.blogspot.com/2008/02/zfs-vs-vxfs-vs-ufs-on-x4500-thumper.html
>
> Regards
> przemol

I would think there'd be a lot more variation based on workload, such that the overall comparison may fall far short of telling the whole story. For example, IIRC, VxFS is more or less extent-based (like mainframe storage), so serial I/O on large files should perhaps be its strongest point, while other workloads may do relatively better with the other filesystems.

The free basic edition sounds cool, though - downloading now. I could use a bit of practice with VxVM/VxFS; it's always struck me as very good when it was good (online reorgs of storage and such), and an utter terror to untangle when it got messed up, not to mention rather more complicated than DiskSuite/SVM (and of course _waay_ more complicated than zfs :-)

Any idea if it works with reasonably recent OpenSolaris (build 81)?
Re: [zfs-discuss] 'du' is not accurate on zfs
> On Sat, 16 Feb 2008, Richard Elling wrote:
>> ls -l shows the length. ls -s shows the size, which may be different
>> than the length. You probably want size rather than du.
>
> That is true. Unfortunately 'ls -s' displays in units of disk blocks and
> does not also consider the 'h' option in order to provide a value
> suitable for humans.
>
> Bob

ISTR someone already proposing to make 'ls -h -s' work in a way one might hope for.
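A quick way to see the length/size distinction in practice (the file name and dataset path are hypothetical; a sparse file makes the gap obvious):

  # Create a 1 GB sparse file: its length is 1 GB, but almost no blocks are allocated
  mkfile -n 1024m /tank/sparsefile
  ls -l /tank/sparsefile    # length in bytes
  ls -s /tank/sparsefile    # allocated size, in blocks
  du -h /tank/sparsefile    # disk usage, human-readable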
Re: [zfs-discuss] vxfs vs ufs vs zfs
> The free basic edition sounds cool, though - downloading now. I could use
> a bit of practice with VxVM/VxFS; it's always struck me as very good when
> it was good (online reorgs of storage and such), and an utter terror to
> untangle when it got messed up, not to mention rather more complicated
> than DiskSuite/SVM (and of course _waay_ more complicated than zfs :-)

Also note that Veritas has a Simple Admin Utility (beta) available that works on Storage Foundation 4.0 or higher. You can find it here:

http://www.symantec.com/business/products/agents_options.jsp?pcid=2245&pvid=203_1

I played with it briefly when they first introduced it, after folks complained that VxVM/VxFS was so much more complicated than ZFS. I don't really have a need for it myself, but it seemed to work fine.

Todd
Re: [zfs-discuss] Performance with Sun StorageTek 2540
Hello Joel,

Saturday, February 16, 2008, 4:09:11 PM, you wrote:

JM> Bob,
JM> Here is how you can tell the array to ignore cache sync commands
JM> and the force unit access bits... (Sorry if it wraps..)
JM> On a Solaris CAM install, the 'service' command is in /opt/SUNWsefms/bin
JM>
JM> To read the current settings:
JM>   service -d arrayname -c read -q nvsram region=0xf2 host=0x00
JM> save this output so you can reverse the changes below easily if needed...
JM>
JM> To set new values:
JM>   service -d arrayname -c set -q nvsram region=0xf2 offset=0x17 value=0x01 host=0x00
JM>   service -d arrayname -c set -q nvsram region=0xf2 offset=0x18 value=0x01 host=0x00
JM>   service -d arrayname -c set -q nvsram region=0xf2 offset=0x21 value=0x01 host=0x00
JM>
JM> Host region 00 is Solaris (w/Traffic Manager)
JM> You will need to reboot both controllers after making the change before
JM> it becomes active.

Is it also necessary, and does it work, on the 2530?

--
Best regards,
Robert                          mailto:[EMAIL PROTECTED]
                                http://milek.blogspot.com
[zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion
Is this kernel panic a known ZFS bug, or should I open a new ticket? Note, this happened on an X4500 running S10U4 (127112-06) with NCQ disabled. Thanks.

Feb 18 17:55:18 thumper1 ^Mpanic[cpu1]/thread=fe8000809c80:
Feb 18 17:55:18 thumper1 genunix: [ID 403854 kern.notice] assertion failed: arc_buf_remove_ref(db->db_buf, db) == 0, file: ../../common/fs/zfs/dbuf.c, line: 1692
Feb 18 17:55:18 thumper1 unix: [ID 10 kern.notice]
Feb 18 17:55:18 thumper1 genunix: [ID 802836 kern.notice] fe80008099d0 fb9c9853 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809a00 zfs:zfsctl_ops_root+2fac59f2 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809a30 zfs:dbuf_write_done+c8 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809a70 zfs:arc_write_done+13b ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809ac0 zfs:zio_done+1b8 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809ad0 zfs:zio_next_stage+65 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b00 zfs:zio_wait_for_children+49 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b10 zfs:zio_wait_children_done+15 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b20 zfs:zio_next_stage+65 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b60 zfs:zio_vdev_io_assess+84 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b70 zfs:zio_next_stage+65 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809bd0 zfs:vdev_mirror_io_done+c1 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809be0 zfs:zio_vdev_io_done+14 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809c60 genunix:taskq_thread+bc ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809c70 unix:thread_start+8 ()
Feb 18 17:55:18 thumper1 unix: [ID 10 kern.notice]

--
Stuart Anderson  [EMAIL PROTECTED]
http://www.ligo.caltech.edu/~anderson
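If a crash dump was saved by savecore, the panic details can be pulled out of it for a bug report along these lines (the /var/crash path and dump number 0 are assumptions based on a default savecore setup):

  # Open the saved kernel crash dump with mdb
  cd /var/crash/thumper1
  mdb unix.0 vmcore.0
  > ::status      # panic string, dump time, and OS release
  > ::stack       # stack trace of the panicking thread
  > ::msgbuf      # recent console messages leading up to the panic
  > $q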
Re: [zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion
The patches (127728-06 for SPARC, 127729-07 for x86) which contain the fix for this panic are in a temporary state and will be released via SunSolve soon. Please contact your support channel to get these patches.

--
Prabahar.

Stuart Anderson wrote:
> On Mon, Feb 18, 2008 at 06:28:31PM -0800, Stuart Anderson wrote:
>> Is this kernel panic a known ZFS bug, or should I open a new ticket?
>>
>> Feb 18 17:55:18 thumper1 genunix: [ID 403854 kern.notice] assertion
>> failed: arc_buf_remove_ref(db->db_buf, db) == 0, file:
>> ../../common/fs/zfs/dbuf.c, line: 1692
>
> It looks like this might be bug 6523336,
> http://sunsolve.sun.com/search/document.do?assetkey=1-66-201229-1
>
> Does anyone know when the binary relief for this and other Sol10 ZFS
> kernel panics will be released as normal kernel patches?
>
> Thanks.
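Once the patches leave the temporary state, applying and verifying them follows the usual Solaris 10 procedure (shown here with the x86 patch ID as an example; the exact revision to install may differ by the time it is released):

  # Apply the kernel patch and confirm it is installed
  patchadd 127729-07
  showrev -p | grep 127729

  # A reboot is needed for the updated kernel modules to take effect
  init 6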
[zfs-discuss] zpool shared between OSX and Solaris on a MacBook Pro
Hi,

I got my MacBook Pro set up to dual boot between Solaris and OS X, and I have created a zpool to use as shared storage for documents etc. However, I got this strange thing when trying to access the zpool from Solaris: only root can see it?? I created the zpool on OS X as it uses an older version of the on-disk format; if I create a zpool on Solaris, all users can see it. Strange. Any ideas on what might be the issue here?

Cheers,
Peter

root# zpool get all zpace
NAME   PROPERTY     VALUE   SOURCE
zpace  bootfs       -       default
zpace  autoreplace  off     default
zpace  delegation   off     default

root# zfs get all zpace/demo
NAME        PROPERTY       VALUE                  SOURCE
zpace/demo  type           filesystem             -
zpace/demo  creation       Sat Feb 16 13:25 2008  -
zpace/demo  used           66.2M                  -
zpace/demo  available      59.3G                  -
zpace/demo  referenced     66.2M                  -
zpace/demo  compressratio  1.00x                  -
zpace/demo  mounted        yes                    -
zpace/demo  quota          none                   default
zpace/demo  reservation    none                   default
zpace/demo  recordsize     128K                   default
zpace/demo  mountpoint     /Volumes/zpace/demo    default
zpace/demo  sharenfs       off                    default
zpace/demo  checksum       on                     default
zpace/demo  compression    off                    default
zpace/demo  atime          on                     default
zpace/demo  devices        on                     default
zpace/demo  exec           on                     default
zpace/demo  setuid         on                     default
zpace/demo  readonly       off                    default
zpace/demo  zoned          off                    default
zpace/demo  snapdir        hidden                 default
zpace/demo  aclmode        groupmask              default
zpace/demo  aclinherit     secure                 default
zpace/demo  canmount       on                     default
zpace/demo  shareiscsi     off                    default
zpace/demo  xattr          on                     default
zpace/demo  copies         1                      default
zpace/demo  version        2                      -
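One thing worth checking first (an assumption on my part: the mountpoint directories created under OS X may simply be readable only by root) is the ownership and mode along the mount path:

  # Compare permissions on the mountpoints created by each OS
  ls -ld /Volumes /Volumes/zpace /Volumes/zpace/demo
  zfs get mountpoint zpace/demo

  # If the directories are mode 700 and owned by root, opening them up
  # should make the pool visible to other users
  chmod 755 /Volumes/zpace /Volumes/zpace/demo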
Re: [zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion
On Mon, Feb 18, 2008 at 06:28:31PM -0800, Stuart Anderson wrote:
> Is this kernel panic a known ZFS bug, or should I open a new ticket?
>
> Feb 18 17:55:18 thumper1 genunix: [ID 403854 kern.notice] assertion
> failed: arc_buf_remove_ref(db->db_buf, db) == 0, file:
> ../../common/fs/zfs/dbuf.c, line: 1692

It looks like this might be bug 6523336,
http://sunsolve.sun.com/search/document.do?assetkey=1-66-201229-1

Does anyone know when the binary relief for this and other Sol10 ZFS kernel panics will be released as normal kernel patches?

Thanks.

--
Stuart Anderson  [EMAIL PROTECTED]
http://www.ligo.caltech.edu/~anderson
Re: [zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion
Thanks for the information. How does the temporary patch 127729-07 relate to IDR127787 (x86), which I believe also claims to fix this panic?

Thanks.

On Mon, Feb 18, 2008 at 08:32:03PM -0800, Prabahar Jeyaram wrote:
> The patches (127728-06 for SPARC, 127729-07 for x86) which contain the
> fix for this panic are in a temporary state and will be released via
> SunSolve soon. Please contact your support channel to get these patches.
> [...]

--
Stuart Anderson  [EMAIL PROTECTED]
http://www.ligo.caltech.edu/~anderson
Re: [zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion
Any IDRXX (released immediately) is interim relief (it also contains the fix) provided to customers until the official patch, which usually takes longer to be released, is available. The patch should be considered the permanent solution.

--
Prabahar.

Stuart Anderson wrote:
> Thanks for the information. How does the temporary patch 127729-07 relate
> to IDR127787 (x86), which I believe also claims to fix this panic?
> [...]
Re: [zfs-discuss] Performance with Sun StorageTek 2540
It is the same for the 2530, and I am fairly certain it is also valid for the 6130, 6140, and 6540.

-Joel

On Feb 18, 2008, at 3:51 PM, Robert Milkowski <[EMAIL PROTECTED]> wrote:
> Hello Joel,
>
> Saturday, February 16, 2008, 4:09:11 PM, you wrote:
>
> JM> Here is how you can tell the array to ignore cache sync commands
> JM> and the force unit access bits... (Sorry if it wraps..)
> JM> [nvsram 'service' commands snipped]
>
> Is it also necessary, and does it work, on the 2530?
>
> --
> Best regards,
> Robert                          mailto:[EMAIL PROTECTED]
>                                 http://milek.blogspot.com