Re: [zfs-discuss] Advanced format 4K bug in illumos? (was: [zfs-discuss] Advanced Format HDD's - are we there yet? ...)
Hello, While we are talking about Advanced Format, does anyone know if bugid 7021758 is an issue in illumos? Synopsis: zpool disk corruption detected on 4k block disks http://wesunsolve.net/bugid/id/7021758 Regards Henrik http://sparcv9.blogspot.com
[zfs-discuss] Pool faulted in a bad way
Hello, I have been asked to take a look at a pool on an old OSOL 2009.06 host. It has been left unattended for a long time and was found in a FAULTED state. Two of the disks in the raidz2 pool seem to have failed; one has been replaced by a spare, the other is UNAVAIL. The machine was restarted and the damaged disks were removed to make it possible to access the pool without it hanging on I/O errors. Now, I have no indication that more than two disks have failed, and one of them seems to have been replaced by the spare. I would then have expected the pool to be in a working state even with two failed disks and some bad data on the remaining disks, since metadata has additional replication. This is the current state of the pool, unable to be imported (at least with 2009.06):

  pool: tank
 state: FAULTED
status: One or more devices could not be opened. There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        tank           FAULTED      0     0     1  corrupted data
          raidz2       DEGRADED     0     0     6
            c12t0d0    ONLINE       0     0     0
            c12t1d0    ONLINE       0     0     0
            spare      ONLINE       0     0     0
              c12t2d0  ONLINE       0     0     0
              c12t7d0  ONLINE       0     0     0
            c12t3d0    ONLINE       0     0     0
            c12t4d0    ONLINE       0     0     0
            c12t5d0    ONLINE       0     0     0
            c12t6d0    UNAVAIL      0     0     0  cannot open

If we look at the status there is a mismatch between the status message, which states that insufficient replicas are available, and the status of the individual disks. More troublesome is the corrupted data status for the whole pool. I also get "bad config type 16 for stats" from zdb. What can possibly cause something like this, a faulty controller? Is there any way to recover (UB rollback with OI perhaps?) The server has ECC memory and another pool that is still working fine. The controller is an ARECA 1280. And some output from zdb:

# zdb tank | more
zdb: can't open tank: I/O error
    version=14
    name='tank'
    state=0
    txg=0
    pool_guid=17315487329998392945
    hostid=8783846
    hostname='storage'
    vdev_tree
        type='root'
        id=0
        guid=17315487329998392945
        bad config type 16 for stats
        children[0]
            type='raidz'
            id=0
            guid=14250359679717261360
            nparity=2
            metaslab_array=24
            metaslab_shift=37
            ashift=9
            asize=14002698321920
            is_log=0

root@storage:~# zdb tank
    version=14
    name='tank'
    state=0
    txg=0
    pool_guid=17315487329998392945
    hostid=8783846
    hostname='storage'
    vdev_tree
        type='root'
        id=0
        guid=17315487329998392945
        bad config type 16 for stats
        children[0]
            type='raidz'
            id=0
            guid=14250359679717261360
            nparity=2
            metaslab_array=24
            metaslab_shift=37
            ashift=9
            asize=14002698321920
            is_log=0
            bad config type 16 for stats
            children[0]
                type='disk'
                id=0
                guid=5644370057710608379
                path='/dev/dsk/c12t0d0s0'
                devid='id1,sd@x001b4d23002bb800/a'
                phys_path='/pci@0,0/pci8086,25f8@4/pci8086,370@0/pci17d3,1260@e/disk@0,0:a'
                whole_disk=1
                DTL=154
                bad config type 16 for stats
            children[1]
                type='disk'
                id=1
                guid=7134885674951774601
                path='/dev/dsk/c12t1d0s0'
                devid='id1,sd@x001b4d23002bb810/a'
                phys_path='/pci@0,0/pci8086,25f8@4/pci8086,370@0/pci17d3,1260@e/disk@1,0:a'
                whole_disk=1
                DTL=153
                bad config type 16 for stats
            children[2]
                type='spare'
                id=2
                guid=7434068041432431375
                whole_disk=0
                bad config type 16 for stats
                children[0]
                    type='disk'
                    id=0
                    guid=5913529661608977121
                    path='/dev/dsk/c12t2d0s0'
                    devid='id1,sd@x001b4d23002bb820/a'
Re: [zfs-discuss] S11 vs illumos zfs compatibility
On Dec 27, 2011, at 9:20 PM, Frank Cusack wrote: http://sparcv9.blogspot.com/2011/12/solaris-11-illumos-and-source.html "If I upgrade ZFS to use the new features in Solaris 11 I will be unable to import my pool using the free ZFS implementation that is available in illumos based distributions." Is that accurate? I understand if the S11 version is ahead of illumos, of course I can't use the same pools in both places, but that is the same problem as using an S11 pool on S10. The author is implying a much worse situation, that there are zfs tracks in addition to versions and that S11 is now on a different track and an S11 pool will not be usable elsewhere, ever. I hope it's just a misrepresentation. I think the author has a valid point ;) I probably should have written zpools instead of ZFS in that sentence. It is the same as always with different pool versions and features, but in this case we don't know if they will be implemented, and implemented in the same way, outside of Oracle after zpool version 28, since we do not have the source and Oracle doesn't want to play with us. Regards Henrik http://sparcv9.blogspot.com
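For reference, a rough sketch of how to check which pool versions an implementation supports and how to keep a pool importable on an older or different implementation; the pool name 'tank' and the device names are only examples:

# List the pool versions this zpool binary supports
zpool upgrade -v

# Show which version a given pool is at
zpool get version tank

# Create a pool pinned at version 28 so it stays importable elsewhere
zpool create -o version=28 tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0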
[zfs-discuss] dskinfo utility
Hello, I got tired of gathering disk information from different places when working with Solaris disks, so I wrote a small utility for summarizing the most commonly used information. It is especially tricky to work with a large set of SAN disks using MPxIO; you do not even see the logical unit number in the name of the disk, so you have to use other commands to acquire that information per disk. The focus of the first version is ZFS, so it does understand which disks are part of pools; later versions might add other volume managers or filesystems. Besides the name of the disk, size and usage, it can also show the number of FC paths to disks, whether it's labeled, driver type, logical unit number, vendor, serial and product names. Examples, mind the format, it looks good with 80 columns:

$ dskinfo list
disk                               size  use      type
c0t600144F8288C50B55BC58DB70001d0  499G  -        iscsi
c5t0d0                             149G  rpool    disk
c5t2d0                             37G   -        disk
c6t0d0                             1.4T  zpool01  disk
c6t1d0                             1.4T  zpool01  disk
c6t2d0                             1.4T  zpool01  disk

# dskinfo list-long
disk                               size  lun  use      p  spd  type  lb
c1t0d0                             136G  -    rpool    -  -    disk  y
c1t1d0                             136G  -    rpool    -  -    disk  y
c6t6879120292610822533095343732d0  100G  0x1  zpool03  4  4Gb  fc    y
c6t6879120292610822533095343734d0  100G  0x3  zpool03  4  4Gb  fc    y
c6t6879120292610822533095343736d0  404G  0x5  zpool03  4  4Gb  fc    y
c6t6879120292610822533095343745d0  5T    0xb  zpool03  4  4Gb  fc    y

# dskinfo list-full
disk    size  hex  dec  p  spd  type  lb  use      vendor   product          serial
c0t0d0  68G   -    -    -  -    disk  y   rpool    FUJITSU  MAP3735N SUN72G  -
c0t1d0  68G   -    -    -  -    disk  y   rpool    FUJITSU  MAP3735N SUN72G  -
c1t1d0  16G   -    -    -  -    disk  y   storage  SEAGATE  ST318404L SUN18G -
c1t2d0  16G   -    -    -  -    disk  y   storage  FUJITSU  MAJ3182M SUN18G  -
c1t3d0  16G   -    -    -  -    disk  y   storage  FUJITSU  MAJ3182M SUN18G  -
c1t4d0  16G   -    -    -  -    disk  y   storage  FUJITSU  MAG3182L SUN18G  -
c1t5d0  16G   -    -    -  -    disk  y   storage  FUJITSU  MAJ3182M SUN18G  -
c1t6d0  16G   -    -    -  -    disk  y   storage  FUJITSU  MAJ3182M SUN18G  -

I've been using it myself for a while now; I thought it might fill a need, so I am making the current version available for download. Download link and some other information can be found here: http://sparcv9.blogspot.com/2011/06/solaris-dskinfo-utility.html Regards Henrik http://sparcv9.blogspot.com
[zfs-discuss] Thin devices/reclamation with ZFS?
Hello, Does anyone here have experience or thoughts regarding the use of ZFS on thin devices? Since ZFS is COW it will not play nicely with this feature; it will spread its blocks all over the space it has been given, and it currently has no way to get back in contact with the storage arrays to tell them which blocks have been freed, since SCSI UNMAP/TRIM is not implemented for ZFS (but TRIM was added to SATA in b146). Reclaiming disk space also seems a bit problematic since all data is spread across the disks, including metadata, so even if you write the whole pool full of zeroes it will be mixed with non-zero data in the form of metadata. The vendor I am looking at requires 768K of zeroes to do a reclaim. I have done some initial quick tests to see if updates without increasing the size of the data on disk end up with ZFS reusing the blocks rather than spreading out to new blocks all the time, but it seems to continue to claim new blocks. (With S10U9; this may have changed since. I know ZFS in the past was supposed to reuse blocks to take advantage of the fastest parts of the disks.) There is an RFE for this, but I would like to know if someone has experience with this in its current state. http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6913905 Regards Henrik http://sparcv9.blogspot.com
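For anyone who wants to repeat this kind of zero-fill test, a minimal sketch could look like the following; the dataset and file names are made up for illustration, and whether the array actually reclaims anything depends entirely on its zero-detection settings:

# Make sure compression is off so the zeroes are really written out
zfs set compression=off tank/thintest

# Fill the free space with zero blocks, then remove the file again
dd if=/dev/zero of=/tank/thintest/zerofill bs=1024k
rm /tank/thintest/zerofill
sync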
Re: [zfs-discuss] Broadening of ZFS open source license
On May 11, 2010, at 10:29 PM, Hillel Lubman wrote: In the article about MeeGo: http://lwn.net/SubscriberLink/387196/103bbafc9266fd0d/ it is stated that Oracle (together with RedHat) contributes a bulk part of BTRFS development. Given that ZFS and BTRFS both share many similar goals, wouldn't it be reasonable for Oracle to license ZFS under a wider range of FOSS licenses (similar to how Mozilla released their code under a triple license, since MPL is incompatible with GPL)? Is there any movement in that direction (or a solid intention not to do so)? I don't think so, not in the short run at least. Oracle has an edge over the competition with Solaris, which also is the primary platform for ZFS development; they control Solaris and can use it to their advantage. Why give it away to the competition and incorporate ZFS into an OS they do not control? Oracle knows how to make money, and I don't think broadening the license for ZFS is going to do that in the near future. Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] dedupratio riddle
Hello, On 17 mar 2010, at 16.22, Paul van der Zwan paul.vanderz...@sun.com wrote: On 16 mrt 2010, at 19:48, valrh...@gmail.com wrote: Someone correct me if I'm wrong, but it could just be a coincidence. That is, perhaps the data that you copied happens to lead to a dedup ratio relative to the data that's already on there. You could test this out by copying a few gigabytes of data you know is unique (like maybe a DVD video file or something), and that should change the dedup ratio. The first copy of that data was unique and dedup is even switched off for the entire pool, so it seems to be a bug in the calculation of the dedupratio, or it uses a method that gives unexpected results. I wonder if the dedup ratio is calculated from the contents of the DDT or from all the data in the whole pool; I've only looked at the ratio for datasets which had dedup on for their whole lifetime. If the former, data added while it's switched off will never alter the ratio (until rewritten with dedup on). The source should have the answer, but I'm on mail only for a few weeks. It's probably for the whole dataset, that makes the most sense, just a thought. Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] dedupratio riddle
On 18 mar 2010, at 18.38, Craig Alder craig.al...@sun.com wrote: I remembered reading a post about this a couple of months back. This post by Jeff Bonwick confirms that the dedupratio is calculated only on the data that you've attempted to deduplicate, i.e. only the data written whilst dedup is turned on - http://mail.opensolaris.org/pipermail/zfs-discuss/2009-December/034721.html . Ah, I was on the right track with the DDT then :) I guess most people will have it turned on (or off) from the beginning until BP rewrite, to ensure everything is deduplicated (which is probably a good idea). Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] why L2ARC device is used to store files ?
Hello, On Mar 5, 2010, at 10:46 AM, Abdullah Al-Dahlawi wrote: Greetings all, I have created a pool that consists of a hard disk and an SSD as a cache:

zpool create hdd c11t0d0p3
zpool add hdd cache c8t0d0p0   <- cache device

I ran an OLTP benchmark to emulate a DBMS. Once I ran the benchmark, the pool started creating the database file on the SSD cache device??? Can anyone explain why this is happening? Isn't the L2ARC used to absorb the evicted data from the ARC? No, it is not. If we look in the source there is a very good description of the L2ARC behavior: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/arc.c 1. There is no eviction path from the ARC to the L2ARC. Evictions from the ARC behave as usual, freeing buffers and placing headers on ghost lists. The ARC does not send buffers to the L2ARC during eviction as this would add inflated write latencies for all ARC memory pressure. Regards Henrik http://sparcv9.blogspot.com
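For what it is worth, a quick way to watch how the L2ARC actually fills over time (it is populated by a feed thread scanning ARC lists, not by eviction) is to look at the l2_* counters in the arcstats kstat; this is a generic example, not tied to the pool above:

# Show L2ARC-related ARC statistics (size, hits, misses, write activity)
kstat -m zfs -n arcstats | grep l2_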
Re: [zfs-discuss] Hardware for high-end ZFS NAS file server - 2010 March edition
Hello, On 4 mar 2010, at 11.11, Robert Milkowski mi...@task.gda.pl wrote: On 04/03/2010 09:46, Dan Dascalescu wrote: Please recommend your up-to-date high-end hardware components for building a highly fault-tolerant ZFS NAS file server. 2x M5000 + 4x EMC DMX. Sorry, I couldn't resist :) I would not recommend that; you can't change boards in anything less than an M8000, so your service would have to switch nodes just to replace a CPU. ;) Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] How to verify ecc for ram is active and enabled?
Hello, On 4 mar 2010, at 10.26, ace tojakt...@gmail.com wrote: "A process will continually scrub the memory, and is capable of correcting any one error per 64-bit word of memory." at http://www.stringliterals.com/?tag=opensolaris. If this is true, what is the process and how is it accessed? No, it's a kernel thread, something like:

# echo '::threadlist ! grep scrub' | mdb -k

or

# echo 'memscrub_scans_done/U' | mdb -k

This depends on what platform you are on; some platforms do this in hardware. Google for the latter to find some good pages with more info. I'm not at my workstation so mind minor faults. Henrik http://sparcv9.blogspot.com
[zfs-discuss] Fishworks 2010Q1 and dedup bug?
Hi all, Now that the Fishworks 2010.Q1 release seems to get deduplication, does anyone know if bugid 6924824 (destroying a dedup-enabled dataset bricks system) is still valid? It has not been fixed in onnv and it is not mentioned in the release notes. This is one of the bugs I've been keeping my eyes on before using dedup for any serious work, so I was a bit surprised to see that it was in the 2010.Q1 release but not fixed in ON. It might not be an issue, just curious, both from a Fishworks perspective and from an OpenSolaris perspective. Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] flying ZFS pools
Hello, On Mar 1, 2010, at 11:57 PM, Ahmad AlTwaijiry wrote: Hi everyone, I'm preparing around 6 Solaris physical servers and I want to see if it's possible to create a zfs pool that I can make a shared pool between all 6 servers (not concurrent, just in an active-passive way). Is that possible? Is there any article that can show me how to do it? Sorry if this is a basic question but I'm new to the ZFS area; in UFS I can just create a metaset between all the servers and release and take over manually, and this is what I want to do with ZFS. It's even easier with ZFS: as long as all servers have access to all disks you can just do a zpool export of the pool and then a zpool import on another node. The easier part is that you do not need to add stuff to vfstab or have any local knowledge of the pool layout. Henrik http://sparcv9.blogspot.com
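To make the manual takeover concrete, a minimal sketch of the sequence; the pool name 'shared' and the node prompts are illustrative only, and this assumes only one node imports the pool at any time:

# On the node currently owning the pool
nodeA# zpool export shared

# On the node taking over (-f only if the pool was not exported cleanly)
nodeB# zpool import shared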
Re: [zfs-discuss] Observations about compressability of metadata L2ARC
On Feb 21, 2010, at 6:40 PM, Andrey Kuzmin wrote: I don't see why this couldn't be extended beyond metadata (+1 for the idea): if a zvol is compressed, the ARC/L2ARC could store compressed data. The gain is apparent: if the user has compression enabled for the volume, he/she expects the volume's data to be compressible at a good ratio, yielding a significant reduction of ARC memory footprint and a boost of L2ARC usable capacity. I think something similar was discussed by Jeff and Bill in the ZFS keynote at KCA: just-in-time decompression, keeping prefetched data in memory but without decompressing it. I would guess you want the data decompressed if it's going to be used, at least frequently. They also discussed that unused data in the ARC might be compressed in the future. Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] More performance questions [on zfs over nfs]
On Feb 21, 2010, at 7:47 PM, Harry Putnam wrote: Working from a remote Linux machine on a zfs fs that is an NFS-mounted share (set for NFS availability on the zfs server, mounted over NFS on Linux), I've been noticing a certain kind of sloth when messing with files. What I see: after writing a file it seems to take the fs too long to be able to display the size correctly (with du). You will not see the on-disk size of the file with du before the transaction group has been committed, which can take up to 30 seconds. ZFS does not even know how much space it will consume before writing out the data to disk, since compression might be enabled. You can test this by executing sync(1M) on your file server; when it returns you should have the final size of the file. Regards Henrik http://sparcv9.blogspot.com
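A quick way to see the effect, sketched with an arbitrary NFS-mounted path and file name:

# On the NFS client: the reported size may lag for up to ~30 seconds
client$ cp bigfile /mnt/tank/bigfile
client$ du -h /mnt/tank/bigfile

# On the ZFS server: push out the pending transaction group, then check again
server# sync
client$ du -h /mnt/tank/bigfile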
Re: [zfs-discuss] server hang with compression on, ping timeouts from remote machine
Hello Christo, On Jan 31, 2010, at 4:07 PM, Christo Kutrovsky wrote: Hello All, I am running NTFS over iSCSI on a ZFS ZVOL volume with compression=gzip-9 and blocksize=8K. The server is a 2-core P4 3.0 GHz with 5 GB of RAM. Whenever I start copying files from Windows onto the ZFS disk, after about 100-200 MB have been copied the server starts to experience freezes. I have iostat running, which freezes as well. Even pings on both of the network adapters report either 4000 ms or timeouts while the freeze is happening. I have reproduced the same behavior with a 1 GB test ZVOL. Whenever I do sequential writes of 64 KB with compression=gzip-9 I experience the freezes. With compression=off it's all good. I've also experienced similar behavior (short freezes) when running zfs send | zfs receive with compression on, LOCALLY, on ZVOLs again. I think gzip in ZFS has a reputation for being somewhat heavy on system resources; that said, it would be nice if it did not have such a large impact on low-level functions. Have a look in the archive, search for example for death-spiral or Death-spiral revisited. Have you tried using the default compression algorithm as well (lzjb, compression=on)? Regards Henrik http://sparcv9.blogspot.com
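If you want to try the lighter default algorithm on the existing zvol, something like this should do; the zvol name is just an example, and blocks already written keep their old compression until they are rewritten:

# Switch the volume from gzip-9 to the default (lzjb) algorithm
zfs set compression=on tank/iscsivol
zfs get compression,compressratio tank/iscsivol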
[zfs-discuss] Removing large holey file does not free space 6792701 (still)
Hello, I mentioned this problem a year ago here and filed 6792701, and I know it has been discussed since. It should have been fixed in snv_118, but I can still trigger the same problem. This is only triggered if the creation of a large file is aborted, for example by loss of power, crash or SIGINT to mkfile(1M). The bug should probably be reopened, but I post it here since some people were seeing something similar. Example and attached zdb output:

filer01a:/$ uname -a
SunOS filer01a 5.11 snv_130 i86pc i386 i86pc Solaris
filer01a:/$ zpool create zpool01 raidz2 c4t0d0 c4t1d0 c4t2d0 c4t4d0 c4t5d0 c4t6d0
filer01a:/$ zfs list zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   123K  5.33T  42.0K  /zpool01
filer01a:/$ df -h /zpool01
Filesystem  Size  Used  Avail  Use%  Mounted on
zpool01     5.4T   42K   5.4T    1%  /zpool01
filer01a:/$ mkfile 1024G /zpool01/largefile
^C
filer01a:/$ zfs list zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   160G  5.17T   160G  /zpool01
filer01a:/$ ls -hl /zpool01/largefile
-rw------- 1 root root 1.0T 2010-01-22 15:02 /zpool01/largefile
filer01a:/$ rm /zpool01/largefile
filer01a:/$ sync
filer01a:/$ zfs list zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   160G  5.17T   160G  /zpool01
filer01a:/$ df -h /zpool01
Filesystem  Size  Used  Avail  Use%  Mounted on
zpool01     5.4T  161G   5.2T    3%  /zpool01
filer01a:/$ ls -l /zpool01
total 0
filer01a:/$ zfs list -t all zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   160G  5.17T   160G  /zpool01
filer01a:/$ zpool export zpool01
filer01a:/$ zpool import zpool01
filer01a:/$ zfs list zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   160G  5.17T   160G  /zpool01
filer01a:/$ zdb -ddd zpool01
<cut>
Object  lvl  iblk  dblk  dsize  lsize  %full  type
     5    5   16K  128K   160G     1T  15.64  ZFS plain file
</cut>

zpool01.zdb
Description: Binary data

Henrik http://sparcv9.blogspot.com
[zfs-discuss] Zpool is a bit Pessimistic at failures
Hello, Has anyone else noticed that zpool is kind of negative when reporting back from some error conditions? Like:

cannot import 'zpool01': I/O error
Destroy and re-create the pool from a backup source.

or even worse:

cannot import 'rpool': pool already exists
Destroy and re-create the pool from a backup source.

The first one I got when doing some failure testing on my new storage node: I pulled several disks from a raidz2 to simulate loss of connectivity, lastly I pulled a third one which as expected made the pool unusable, and later exported the pool. But when I reconnected one of the previous two drives and tried an import I got this message. The pool was fine once I reconnected the last disk to fail, so the message seems a bit pessimistic. The second one I got when importing an old rpool with altroot but forgot to specify a new name for the pool; the solution of just giving the pool a new name was much better than recreating the pool and restoring from backup. I think this could scare new users, or even make them do terrible things, even though the errors could be fixed. I think I'll file a bug, agree? Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] I/O Read starvation
Hello, On Jan 11, 2010, at 6:53 PM, bank kus wrote: For example, you could set it to half your (8GB) memory so that 4GB is immediately available for other uses. * Set maximum ZFS ARC size to 4GB Capping the max sounds like a good idea. Are we still trying to solve the starvation problem? I filed a bug on the non-ZFS related urandom stall problem yesterday, primarily since it can do nasty things from inside a resource-capped zone: CR 6915579 solaris-cryp/random Large read from /dev/urandom can stall system Regards Henrik http://sparcv9.blogspot.com
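For reference, capping the ARC on Solaris/OpenSolaris of this era is done with a tunable in /etc/system and takes effect after a reboot; the 4 GB value simply mirrors the suggestion quoted above:

* Limit the ZFS ARC to 4 GB (value in bytes)
set zfs:zfs_arc_max = 0x100000000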
Re: [zfs-discuss] I/O Read starvation
Hello again, On Jan 10, 2010, at 5:39 AM, bank kus wrote: Hi Henrik, I have 16GB RAM on my system; on a lesser-RAM system dd does cause problems as I mentioned above. My __guess__ is that dd is probably sitting in some in-memory cache since du -sh doesn't show the full file size until I do a sync. At this point I'm less looking for QA-type repro questions and/or speculations, rather looking for ZFS design expectations. What is the expected behaviour: if one thread queues 100 reads and another thread comes later with 50 reads, are these 50 reads __guaranteed__ to fall behind the first 100, or is timeslicing/fairshare done between the two streams? Btw this problem is pretty serious: with 3 users using the system, one of them initiating a large copy grinds the other 2 to a halt. Linux doesn't have this problem and this is almost a switch-O/S moment for us unfortunately :-( Have you reproduced the problem without using /dev/urandom? I can only get this behavior when using dd from urandom, not using files with cp, and not even files with dd. This could then be related to the random driver spending kernel time in high-priority threads. So while I agree that this is not optimal, there is a huge difference in how bad it is; if it's urandom-generated there is no problem with copying files. Since you also found that it's not related to ZFS (also tmpfs, and perhaps only urandom?) we are on the wrong list. Please isolate the problem: can we put aside any filesystem? If so we are on the wrong list; I've added perf-discuss also. Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] I/O Read starvation
Hello Bob, On Jan 10, 2010, at 4:54 PM, Bob Friesenhahn wrote: On Sun, 10 Jan 2010, Phil Harman wrote: In performance terms, you'll probably find that block sizes beyond 128K add little benefit. So I'd suggest something like: dd if=/dev/urandom of=largefile.txt bs=128k count=65536 dd if=largefile.txt of=./test/1.txt bs=128k dd if=largefile.txt of=./test/2.txt bs=128k As an interesting aside, on my Solaris 10U8 system (plus a zfs IDR), dd (Solaris or GNU) does not produce the expected file size when using /dev/urandom as input: Do you feel this is related to the filesystem? Is there any difference between putting the data in a file on ZFS or just throwing it away? $(dd if=/dev/urandom of=/dev/null bs=1048576k count=16) gives me a quite unresponsive system too. Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] I/O Read starvation
Henrik http://sparcv9.blogspot.com On 9 jan 2010, at 04.49, bank kus kus.b...@gmail.com wrote:

dd if=/dev/urandom of=largefile.txt bs=1G count=8
cp largefile.txt ./test/1.txt
cp largefile.txt ./test/2.txt

That's it; now the system is totally unusable after launching the two 8G copies. Until these copies finish no other application is able to launch completely. Checking prstat shows them to be in the sleep state. Question: I'm guessing this is because ZFS doesn't use CFQ and one process is allowed to queue up all its I/O reads ahead of other processes? What is CFQ, a scheduler? If you are running OpenSolaris, then you do not have CFQ. Is there a concept of priority among I/O reads? I only ask because if root were to launch some GUI application it doesn't start up until both copies are done. So there is no concept of priority? Needless to say this does not exist on Linux 2.60... Probably not, but ZFS only runs in userspace on Linux with fuse so it will be quite different.
Re: [zfs-discuss] I/O Read starvation
On Jan 9, 2010, at 2:02 PM, bank kus wrote: Probably not, but ZFS only runs in userspace on Linux with fuse so it will be quite different. I wasn't clear in my description, I'm referring to ext4 on Linux. In fact on a system with low RAM even the dd command makes the system horribly unresponsive. IMHO not having fairshare or timeslicing between different processes issuing reads is frankly unacceptable, given that a lame user can bring the system to a halt with 3 large file copies. Are there ZFS settings or Project Resource Control settings one can use to limit abuse from individual processes? Are you sure this problem is related to ZFS? I have no problem with multiple threads reading and writing to my pools, they are still responsive; if I however put urandom with dd into the mix I get much more latency. Doesn't, for example, $(dd if=/dev/urandom of=/dev/null bs=1048576k count=8) give you the same problem, or using the file you already created from urandom as input to dd? Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] Solaris 10 and ZFS dedupe status
On Jan 5, 2010, at 4:38 PM, Bob Friesenhahn wrote: On Mon, 4 Jan 2010, Tony Russell wrote: I am under the impression that dedupe is still only in OpenSolaris and that support for dedupe is limited or non-existent. Is this true? I would like to use ZFS and the dedupe capability to store multiple virtual machine images. The problem is that this will be in a production environment and would probably call for Solaris 10 instead of OpenSolaris. Are my statements on this valid or am I off track? If dedup gets scheduled for Solaris 10 (I don't know), it would surely not be available until at least a year from now. Dedup in OpenSolaris still seems risky to use other than for experimental purposes. It has only recently become available. I've just written an entry about update 9; I think it will contain zpool version 19, so no dedup for this release if that's correct. Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] Help on Mailing List
http://mail.opensolaris.org/pipermail/zfs-discuss/ Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] best way to configure raidz groups
Hello, On Dec 30, 2009, at 2:08 PM, Thomas Burgess wrote: I'm about to build a ZFS based NAS and I'd like some suggestions about how to set up my drives. The case I'm using holds 20 hot-swap drives, so I plan to use either 4 vdevs with 5 drives or 5 vdevs with 4 drives each (and a hot spare inside the machine). The motherboard I'm getting has 4 PCI-X slots, 2 @ 133 MHz and 2 @ 100 MHz. I was planning on buying 3 of the famous AOC-SAT2-MV8 cards, which would give me more than enough SATA slots. I'll also have 6 onboard slots. I also plan on using 2 SATA compact flash adapters with 16 GB compact flash cards for the OS. My main question is what is the best way to lay out the vdevs? Does it really matter how I lay them out considering I only have a gigabit network? It depends; random I/O and resilver/scrubbing should be a bit faster with 5 vdevs, but for sequential data access it should not matter over gigabit. It all comes down to what you want out of the configuration: redundancy versus usable space and price. raidz2 might be a better choice than raidz, especially if you have large disks. For most of my storage needs I would probably build a pool out of 4 raidz2 vdevs. Regards Henrik http://sparcv9.blogspot.com
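As an illustration of the 4 x 5-disk raidz2 layout, pool creation would look roughly like the following; the device names are made up, so substitute your own, and the spare is optional:

zpool create tank \
  raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
  raidz2 c1t5d0 c1t6d0 c1t7d0 c2t0d0 c2t1d0 \
  raidz2 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 \
  raidz2 c2t7d0 c3t0d0 c3t1d0 c3t2d0 c3t3d0
zpool add tank spare c3t4d0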
Re: [zfs-discuss] ZFS Dedupe reporting incorrect savings
Hello, On Dec 15, 2009, at 8:02 AM, Giridhar K R wrote: Hi, I created a zpool with 64k recordsize and enabled dedup on it:

zpool create -O recordsize=64k TestPool device1
zfs set dedup=on TestPool

I copied files onto this pool over NFS from a Windows client. Here is the output of zpool list:

Prompt:~# zpool list
NAME      SIZE   ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
TestPool  696G   19.1G  677G   2%  1.13x  ONLINE  -

When I ran a dir /s command on the share from a Windows client cmd, I see the file size as 51,193,782,290 bytes. The alloc size reported by zpool along with the DEDUP ratio of 1.13x does not add up to 51,193,782,290 bytes. According to the DEDUP (dedup ratio) the amount of data copied is 21.58G (19.1G * 1.13). Are you sure this problem is related to ZFS and not a Windows, link or CIFS issue? Have you looked at the filesystem from the OpenSolaris host locally? Are you sure there are no links in the filesystems that the Windows client also counts? Henrik http://sparcv9.blogspot.com
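One way to sanity-check the numbers from the OpenSolaris side, leaving Windows and CIFS out of the picture; the commands are generic and assume nothing beyond the pool name already mentioned:

# Space accounting as ZFS sees it, without any client in between
zfs list -o name,used,referenced TestPool
du -sh /TestPool

# Pool-wide dedup ratio as a property
zpool get dedupratio TestPool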
Re: [zfs-discuss] Space not freed?
Hello, On 14 dec 2009, at 14.16, Markus Kovero markus.kov...@nebula.fi wrote: Hi, if someone running 129 could try this out: turn off compression in your pool, mkfile 10g /pool/file123, see the used space, then remove the file and see if it makes the used space available again. I'm having trouble with this; it reminds me of a similar bug that occurred in the 111 release. I filed a bug about a year ago on a similar issue, bugid 6792701, but it should have been fixed in snv_118. Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] scrub differs in execute time?
How do you do, On 13 nov 2009, at 11.07, Orvar Korvar knatte_fnatte_tja...@yahoo.com wrote: I have a raidz2 and did a scrub, it took 8h. Then I reconnected some drives to other SATA ports, and now it takes 15h to scrub?? Why is that? Could you perhaps provide some more info? Which OSOL release? Are the new disks utilized? Has the pool data changed? Is there a difference in how much data is read from the disks? Is the system otherwise idle? Which SATA controller? Does iostat show any errors? Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] ARC cache and ls query
Hello John, On Oct 30, 2009, at 9:03 PM, John wrote: Hi, On an idle server, when I do a recursive '/usr/bin/ls' on a folder, I see a lot of disk activity. This makes sense because the results (metadata/data) may not have been cached. When I do a second ls on the same folder right after the first one finished, I do see disk activity again. Can someone explain why the results are not cached in the ARC? You will have disk access again unless you have set atime to off for that filesystem. I posted something similar a few days back and wrote a summary of the ARC part of my findings: http://sparcv9.blogspot.com/2009/10/curious-case-of-strange-arc.html Here is the whole thread: http://opensolaris.org/jive/thread.jspa?messageID=430385 If that does not explain it you should probably provide some more data: how many files, some ARC statistics etc. Regards Henrik http://sparcv9.blogspot.com
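For completeness, turning off access-time updates for the filesystem in question is a one-liner; the dataset name is just an example:

# Stop ls/stat traversals from generating atime writes
zfs set atime=off tank/data
zfs get atime tank/data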
Re: [zfs-discuss] ARC cache and ls query
On Oct 30, 2009, at 10:20 PM, John wrote: Thanks Henrik. This makes perfect sense. More questions: arc_meta_limit is set to a quarter of the ARC size. What is arc_meta_max? On some systems, I have arc_meta_max > arc_meta_limit.

Example:
arc_meta_used  = 29427 MB
arc_meta_limit = 16125 MB
arc_meta_max   = 29427 MB

Example 2:
arc_meta_used  = 5885 MB
arc_meta_limit = 5885 MB
arc_meta_max   = 17443 MB

That looks very strange; the source says:

if (arc_meta_max < arc_meta_used)
        arc_meta_max = arc_meta_used;

So arc_meta_max should be the maximum amount that arc_meta_used has ever reached. The limit on the metadata is not enforced synchronously, but that seems to be quite a bit over the limit. What are these machines doing, are they quickly processing large numbers of files/directories? I do not know the exact implementation of this, but perhaps new metadata is added to the cache faster than it gets purged. Maybe someone else knows more exactly how this works? Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] sub-optimal ZFS performance
On Oct 29, 2009, at 5:23 PM, Bob Friesenhahn wrote: On Thu, 29 Oct 2009, Orvar Korvar wrote: So the solution is to never get more than 90% full disk space, damn it? Right. While UFS created artificial limits to keep the filesystem from getting so full that it became sluggish and sick, ZFS does not seem to include those protections. Don't ever run a ZFS pool for a long duration of time very close to full, since it will become excessively fragmented. Setting quotas for all datasets could perhaps be of use for some of us. An überquota property for the whole pool would have been nice until a real solution is available. Henrik http://sparcv9.blogspot.com
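One way to approximate such a pool-wide cap today is to put a quota on the pool's top-level dataset, since it applies to all descendants; the pool name and size below are only examples:

# Cap everything in the pool somewhat below its raw capacity
zfs set quota=9T tank
zfs get quota tank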
Re: [zfs-discuss] sub-optimal ZFS performance
On Oct 16, 2009, at 1:01 AM, Henrik Johansson wrote: My guess would be that this is due to fragmentation during some time when the filesystem might have been close to full, but these are still pretty terrible numbers even with 0.5M files in the structure. And while this is very bad I would at least expect the ARC to cache data and make a second run go faster: I solved this; the second run was also slow because the metadata part of the ARC was too small. Raising arc_meta_limit helped, and turning off atime also helped a lot since this directory seems to be terribly fragmented. With these changes the ARC helps so that the second run goes as fast as it should. The fragmentation can be solved by a copy if I want to keep the files. I wrote some more details about what I did if anyone is interested: http://sparcv9.blogspot.com/2009/10/curious-case-of-strange-arc.html I'll make sure to keep some more free space in my pools at all times now ;) Regards Henrik http://sparcv9.blogspot.com
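For reference, the arc_meta_limit bump can be done on a live system with mdb; the 2 GB value is only an example, and since this is a private tunable rather than a supported interface, treat it as a sketch:

# Raise the ARC metadata limit to 2 GB on the running kernel
echo "arc_meta_limit/Z 0x80000000" | mdb -kw

# Check current ARC metadata usage and limits
echo "::arc" | mdb -k | grep meta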
[zfs-discuss] sub-optimal ZFS performance
Hello, ZFS is behaving strangely on an OSOL laptop; your thoughts are welcome. I am running OSOL on my laptop, currently b124, and I found that the performance of ZFS is not optimal in all situations. If I check how much space the package cache for pkg(1) uses, it takes a bit longer on this host than on a comparable machine to which I transferred all the data.

u...@host:/var/pkg$ time du -hs download
6.4G    download

real    87m5.112s
user    0m6.820s
sys     1m46.111s

My guess would be that this is due to fragmentation during some time when the filesystem might have been close to full, but these are still pretty terrible numbers even with 0.5M files in the structure. And while this is very bad I would at least expect the ARC to cache data and make a second run go faster:

u...@host:/var/pkg$ time du -hs download
6.4G    download

real    94m14.688s
user    0m6.708s
sys     1m27.105s

Two runs on the machine to which I have transferred the directory structure:

$ time du -hs download
6.4G    download

real    2m59.60s
user    0m3.83s
sys     0m18.87s

This goes a bit faster after the initial run also:

$ time du -hs download
6.4G    download

real    0m15.40s
user    0m3.40s
sys     0m11.43s

The disks are of course very busy during the first runs on both machines, but the slow machine has to do all the work again while the disk in the fast machine gets to rest on the second run.

Slow system (OSOL b124, T61 Intel C2D laptop, root pool on 2.5" disk):

memstat pre first run:

Page Summary        Pages      MB  %Tot
Kernel             162685     635   16%
ZFS File Data       81284     317    8%
Anon                57323     223    6%
Exec and libs        3248      12    0%
Page cache          14924      58    1%
Free (cachelist)     7881      30    1%
Free (freelist)    700315    2735   68%
Total             1027660    4014
Physical          1027659    4014

memstat post first run:

Page Summary        Pages      MB  %Tot
Kernel             461153    1801   45%
ZFS File Data       83598     326    8%
Anon                58389     228    6%
Exec and libs        3215      12    0%
Page cache          14958      58    1%
Free (cachelist)     6849      26    1%
Free (freelist)    399498    1560   39%
Total             1027660    4014
Physical          1027659    4014

arcstat first run (Time read miss miss% dmis dm% pmis pm% mmis mm% arcsz c):

21:02:31 27919 7114 7 3011 10 439M 3G
21:12:31 19060 3152 28 8 9760 32 734M 3G
21:22:31 22558 2557 25 0 9458 25 873M 3G
21:32:31 20651 2451 24 0 2450 24 985M 3G
21:42:31 17543 2443 24 0 2942 24 1G 3G
21:52:31 16248 2948 29 0 5448 29 1G 3G
22:02:31 15955 3454 34 0 9055 34 1G 3G
22:12:31 16441 2541 24 0 6141 25 1G 3G
22:22:31 16140 2440 24 0 6840 24 1G 3G

arcstat second run (Time read miss miss% dmis dm% pmis pm% mmis mm% arcsz c):

22:35:52 1K 447 24 429 2317 47 436 26 1G 3G
22:45:52 16340 2440 24 0 7540 24 1G 3G
22:55:52 16140 2540 24 0 8640 25 1G 3G
23:05:52 15940 2539 25 0 7140 25 1G 3G
23:15:52 15840 2540 25 0 8640 25 1G 3G
23:25:52 15840 2540 25 0 10040 25 1G 3G
23:35:52 15740 2540 25 0 10040 25 1G 3G
23:45:52 15840 2540 25 0 10040 25 1G 3G
23:55:52 16040 2540 25 0 10040 25 1G 3G
00:05:52 15640 2540 25 0 10040 25 1G 3G

Fast system (OSOL b124, AMD Athlon X2 server, tested on root pool on 2.5" SATA disk)

memstat pre run:

Page Summary        Pages      MB  %Tot
Kernel             160338     626    8%
ZFS File Data       44875     175    2%
Anon                24388      95    1%
Exec and libs        1295       5    0%
Page cache           6490      25    0%
Free (cachelist)     4786      18    0%
Free (freelist)   1753978    6851   88%
Balloon                 0
Re: [zfs-discuss] Hot Spares spin down?
Hi there, On Oct 8, 2009, at 9:46 PM, bjbm wrote: Sorry if this is a noob question but I can't seem to find this info anywhere. Are hot spares generally spun down until they are needed? No, but have a look at power.conf(4) and the device-thresholds keyword to spin down disks. Here is a BigAdmin article as well: http://www.sun.com/bigadmin/features/articles/disk_power_saving.jsp Regards Henrik http://sparcv9.blogspot.com
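As a rough sketch, an /etc/power.conf entry that lets an idle disk spin down after 30 minutes could look like the following; the device path is made up, so look up your own with prtconf -v or format, and run pmconfig afterwards to activate the change:

# /etc/power.conf
device-thresholds   /pci@0,0/pci1022,7458@2/pci11ab,11ab@1/disk@3,0   30m

# pmconfig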
[zfs-discuss] KCA ZFS keynote available
Hello everybody, The KCA ZFS keynote by Jeff and Bill seems to be available online now: http://blogs.sun.com/video/entry/kernel_conference_australia_2009_jeff It should probably be mentioned here; I might have missed it. Regards Henrik http://sparcv9.blogspot.com
[zfs-discuss] Error recovery with replicated metadata
Hello all, I have managed to get my hands on an OSOL 2009.06 root disk which has three failed blocks on it; these three blocks make it impossible to boot from the disk and to import the pool on another machine. I have checked the disk and three blocks are inaccessible, quite close to each other. Now, should this not have a good chance of being saved by replicated metadata? The data on the disk is usable; I did a block copy of the whole disk to a new one, and the scrub works out flawlessly. I guess this could be a timeout issue, but the disk is at least a WD RE2 disk with error recovery of 7 seconds. The failing system's release was 111a, and I have tried to import it into 122. The disk was used by one of my friends whom I have converted to using Solaris and ZFS for his company storage needs, and he is a bit skeptical when three blocks make the whole pool unusable. The good part is that he uses mirrors for his rpool even on this non-critical system now ;) Anyway, can someone help to explain this? Are there any timeouts that can be tuned to import the pool, or is this a feature? Obviously all data that is needed is intact on the disk since the block copy of the pool worked fine. Also, don't we need a force option for the -e option to zdb, so that we can use it with pools that have not been exported correctly from a failing machine? The import times out after 41 seconds:

r...@arne:/usr/sbin# zpool import -f 2934589927925685355 dpool
cannot import 'rpool' as 'dpool': one or more devices is currently unavailable

r...@arne:/usr/sbin# zpool import
  pool: rpool
    id: 2934589927925685355
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        rpool       ONLINE
          c1t4d0s0  ONLINE

Damaged blocks as reported by format:

Medium error during read: block 8646022 (0x83ed86) (538/48/28) ASC: 0x11 ASCQ: 0x0
Medium error during read: block 8650804 (0x840034) (538/124/22) ASC: 0x11 ASCQ: 0x0
Medium error during read: block 8651987 (0x8404d3) (538/143/8) ASC: 0x11 ASCQ: 0x0

What I managed to get out of zdb:

r...@arne:/usr/sbin# zdb -e 2934589927925685355
WARNING: pool '2934589927925685355' could not be loaded as it was last accessed by another system (host: keeper hostid: 0xc34967).
See: http://www.sun.com/msg/ZFS-8000-EY
zdb: can't open 2934589927925685355: No such file or directory

r...@arne:/usr/sbin# zdb -l /dev/dsk/c1t4d0s0
LABEL 0
    version=14
    name='rpool'
    state=0
    txg=269696
    pool_guid=2934589927925685355
    hostid=12798311
    hostname='keeper'
    top_guid=9161928630964440615
    guid=9161928630964440615
    vdev_tree
        type='disk'
        id=0
        guid=9161928630964440615
        path='/dev/dsk/c7t1d0s0'
        devid='id1,s...@sata_wdc_wd5000ys-01m_wd-wcanu2080316/a'
        phys_path='/p...@0,0/pci8086,2...@1c,4/pci1043,8...@0/d...@1,0:a'
        whole_disk=0
        metaslab_array=23
        metaslab_shift=32
        ashift=9
        asize=500067467264
        is_log=0
LABEL 1
    version=14
    name='rpool'
    state=0
    txg=269696
    pool_guid=2934589927925685355
    hostid=12798311
    hostname='keeper'
    top_guid=9161928630964440615
    guid=9161928630964440615
    vdev_tree
        type='disk'
        id=0
        guid=9161928630964440615
        path='/dev/dsk/c7t1d0s0'
        devid='id1,s...@sata_wdc_wd5000ys-01m_wd-wcanu2080316/a'
        phys_path='/p...@0,0/pci8086,2...@1c,4/pci1043,8...@0/d...@1,0:a'
        whole_disk=0
        metaslab_array=23
        metaslab_shift=32
        ashift=9
        asize=500067467264
        is_log=0
LABEL 2
    version=14
    name='rpool'
    state=0
    txg=269696
    pool_guid=2934589927925685355
    hostid=12798311
    hostname='keeper'
    top_guid=9161928630964440615
    guid=9161928630964440615
    vdev_tree
        type='disk'
        id=0
        guid=9161928630964440615
        path='/dev/dsk/c7t1d0s0'
        devid='id1,s...@sata_wdc_wd5000ys-01m_wd-wcanu2080316/a'
        phys_path='/p...@0,0/pci8086,2...@1c,4/pci1043,8...@0/d...@1,0:a'
        whole_disk=0
        metaslab_array=23
        metaslab_shift=32
        ashift=9
        asize=500067467264
        is_log=0
LABEL 3
    version=14
    name='rpool'
    state=0
    txg=269696
    pool_guid=2934589927925685355
    hostid=12798311
    hostname='keeper'
    top_guid=9161928630964440615
    guid=9161928630964440615
    vdev_tree
Re: [zfs-discuss] Raid-Z Issue
On Sep 11, 2009, at 10:41 PM, Frank Middleton wrote: On 09/11/09 03:20 PM, Brandon Mercer wrote: They are so well known that simply by asking if you were using them suggests that they suck. :) There are actually pretty hit or miss issues with all 1.5TB drives but that particular manufacturer has had a few more than others. FWIW I have a few of them in mirrored pools and they have been working flawlessly for several months now with LSI controllers. The workload is bursty - mostly MDA driven code generation and compilation of 1M KLoC applications and they work well enough for that. Also by now probably a petabyte of zfs send/recvs and many scrubs, never a timeout and never a checksum error. They are all rev CC1H. So your mileage may vary, as they say... I've also been running three of them with SD17 in a raidz for about a year without any problems at all. Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] b122 and fake checksum errors
Hello Brian, On Sep 10, 2009, at 9:21 PM, Brian Hechinger wrote: I've hit google and it looks like this is still an issue in b122. Does this look like it will be fixed any time soon? If so, what build will it be fixed in and is there an ETA for the build to be released? Adam has integrated the fix, so if everything goes as planned it will be part of snv_124, which is probably about a month away. I'm running with the fix and so far it looks good. http://hg.genunix.org/onnv-gate.hg/rev/c383b4d6980f Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] This is the scrub that never ends...
Hello Will, On Sep 7, 2009, at 3:42 PM, Will Murnane wrote: What can cause this kind of behavior, and how can I make my pool finish scrubbing? No idea what is causing this, but did you try to stop the scrub? If so, what happened? (Might not be a good idea since this is not a normal state?) What release of OpenSolaris are you running? Maybe this could be of interest, but it is a duplicate and it should have been fixed in snv_110: running zpool scrub twice hangs the scrub. Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool
Hi Adam, On Sep 2, 2009, at 1:54 AM, Adam Leventhal wrote: Hi James, After investigating this problem a bit I'd suggest avoiding deploying RAID-Z until this issue is resolved. I anticipate having it fixed in build 124. For those of us who have already upgraded and written data to our raidz pools, are there any risks of inconsistency or wrong checksums in the pool? Is there a bug id? Regards Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool
Hello all, I have backed down to snv_117; when scrubbing this pool I got my first checksum errors ever on any build except snv_121. I wonder if this is a coincidence or if bad checksums have been generated by snv_121? So I have been running for 10 months without any checksum errors, I installed snv_121 and got plenty of them, and now I also get them after backing down to snv_117. I will check my hardware after the scrub is completed. Someone asked what hardware we were using; I have an Asus M3N78-VM (nForce 8200) with ECC-protected memory (and I think HT uses CRC?). The pool is a 3-disk raidz. Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] Expanding a raidz pool?
On Sep 2, 2009, at 7:14 PM, rarok wrote: I'm just a casual ZFS user, but you want something that doesn't exist right now. Most consumers want this, but Sun is not interested in that market. Growing an existing RAID-Z by just adding more disks to it would be great, but at this moment there isn't anything like that. I would change customers to users; many people who use ZFS for their home server would like this, but they are often not customers. Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool
Hello, On 25 aug 2009, at 14.29, Gary Gendel g...@genashor.com wrote: I have a 5 x 500GB disk RAID-Z pool that has been producing checksum errors right after upgrading SXCE to build 121. They seem to be randomly occurring on all 5 disks, so it doesn't look like a disk failure situation. Repeatedly running a scrub on the pool randomly repairs between 20 and a few hundred checksum errors. Since I hadn't physically touched the machine, it seems a very strong coincidence that it started right after I upgraded to 121. I had my first checksum errors in almost a year yesterday after upgrading to snv_121 on my filer. I blamed an eSATA device that was not part of the pool. I will do some testing tonight and see if I still get errors. The machine that got the errors has an Asus M3N78-VM MB (GF8200). Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] Shrinking a zpool?
On 6 aug 2009, at 23.52, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: I still have not seen any formal announcement from Sun regarding deduplication. Everything has been based on remarks from code developers. To be fair, the official what's new document for 2009.06 states that dedup will be part of the next OSOL release in 2010. Or at least that we should look out for it ;) We're already looking forward to the next release due in 2010. Look out for great new features like an interactive installation for SPARC, the ability to install packages directly from the repository during the install, offline IPS support, a new version of the GNOME desktop, ZFS deduplication and user quotas, cloud integration and plenty more! As always, you can follow active development by adding the dev/ repository. Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] No files but pool is full?
On 24 jul 2009, at 09.33, Markus Kovero markus.kov...@nebula.fi wrote: During our tests we noticed very disturbing behavior; what could be causing this? The system is running the latest stable OpenSolaris. Any other means to remove ghost files rather than destroying the pool and restoring from backups? This looks like a bug I filed a while ago, CR 6792701, removing large holey files does not free space. The only solution I found to clean the pool when isolating the bug was to recreate it. The fix was integrated in a build after OSOL 2009.06. mkfile of a certain size will trigger this. Henrik http://sparcv9.blogspot.com
Re: [zfs-discuss] prstat -Z and load average values in different zones give same numeric results
Hello Nobel, On Apr 23, 2009, at 1:53 AM, Nobel Shelby wrote: Folks, Perplexing question about load average display with prstat -Z. Solaris 10 OS U4 (08/07). We have 4 zones with very different processes and workloads. The prstat -Z command issued within each of the zones correctly displays the number of processes and lwps, but the load average values look exactly the same in all non-global zones; I mean all 3 values (1, 5, 15 minute load averages) are the same, which is quasi impossible given the different workloads. Is there a bug here? Thanks. No, this is correct: unless you have defined resource pools with separate processor sets on the system, all zones will share the same CPU resources and thus all have the same load average. If you bind a zone to a pool you will see the load average of the pool from inside the zone; it can also be observed from the global zone with poolstat(1M). Hope this helps. Regards Henrik Johansson http://sparcv9.blogspot.com
Re: [zfs-discuss] zfs-crypto on OpenSolaris 2009.06?
So onnv_111 is no longer the target for crypto integration since that build is supposed to be included in osol 2009.06? Regards Henrik On 5 mar 2009, at 11.06, Darren J Moffat darr...@opensolaris.org wrote: Luca Morettoni wrote: A lot of people ask me about the crypto layer over ZFS and the future integration in OpenSolaris (I read around snv_111), it may be ready for the next stable release (2009.06)? See: http://opensolaris.org/os/project/zfs-crypto/ No it won't be in 2009.06. To be in 2009.06 it would have to be finished by now and it is not. -- Darren J Moffat
Re: [zfs-discuss] Two zvol devices one volume?
Thanks for the info Dave, I filed a bug on this: 6805659. Regards Henrik On Feb 13, 2009, at 1:30 AM, Dave wrote: Henrik Johansson wrote: I tried to export the zpool also, and I got this; the strange part is that it sometimes still thinks that the ubuntu-01-dsk01 dataset exists:

# zpool export zpool01
cannot open 'zpool01/xvm/dsk/ubuntu-01-dsk01': dataset does not exist
cannot unmount '/zpool01/dump': Device busy

But:

# zfs destroy zpool01/xvm/dsk/ubuntu-01-dsk01
cannot open 'zpool01/xvm/dsk/ubuntu-01-dsk01': dataset does not exist

Regards I have seen this 'phantom dataset' with a pool on nv93. I created a zpool, created a dataset, then destroyed the zpool. When creating a new zpool on the same partitions/disks as the destroyed zpool, upon export I receive the same message as you describe above, even though I never created the dataset in the new pool. Creating a dataset of the same name and then destroying it doesn't seem to get rid of it, either. I never did remember to file a bug for it... Henrik Johansson http://sparcv9.blogspot.com
Re: [zfs-discuss] ZFS on SAN?
Hi all, OK, this might stir things up again but I would like to make this clearer. I have been reading this and other threads regarding ZFS on SAN and how well ZFS can recover from a serious error, such as a cached disk array going down or the connection to the SAN being lost. What I am hearing (Miles, ZFS-8000-72) is that sometimes you can end up in an unrecoverable state that forces you to restore the whole pool. I have been operating quite large deployments of SVM/UFS and VxVM/VxFS for some years, and while you are sometimes forced to do a filesystem check and some files might end up in lost+found, I have never lost a whole filesystem. This is despite whole arrays crashing, split-brain scenarios, etc. In the previous discussion a lot of fingers were pointed at hardware and USB connections, but then some people in this thread mentioned losing pools located on SAN storage. We are currently evaluating whether we should begin to implement ZFS on our SAN. I can see great opportunities with ZFS, but if we have a higher risk of losing entire pools that is a serious issue. I am aware that the other filesystems might not be in a correct state after a serious failure, but as stated before, that can be much better than restoring a multi-terabyte filesystem from yesterday's backup. So, what is the opinion, is this an existing problem even when using enterprise arrays? If I understand this correctly, there should be no risk of losing an entire pool if DKIOCFLUSHWRITECACHE is honored by the array? If it is a problem, will the worst case at least be on par with UFS/VxFS once 6667683 is fixed? Grateful for any additional information. Regards Henrik Johansson http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Two zvol devices one volume?
Hi, can anyone explain the following to me? Two zvol device links point at the same data. I was installing OSOL 2008.11 in xVM when I saw that there already was a partition on the installation disk. An old dataset that I deleted, because I had given it a slightly different name than I intended, has not been removed under /dev. I should not have used that name, but two device links should perhaps not point to the same device either: zfs list |grep xvm/dsk zpool01/xvm/dsk 25.0G 2.63T 24.0K /zpool01/xvm/dsk zpool01/xvm/dsk/osol01-dsk01 10G 2.64T 2.53G - zpool01/xvm/dsk/ubuntu01-dsk01 10G 2.64T 21.3K - # ls -l /dev/zvol/dsk/zpool01/xvm/dsk total 3 lrwxrwxrwx 1 root root 41 Feb 10 18:19 osol01-dsk01 -> ../../../../../../devices/pseudo/z...@0:4c lrwxrwxrwx 1 root root 41 Feb 10 18:14 ubuntu-01-dsk01 -> ../../../../../../devices/pseudo/z...@0:4c lrwxrwxrwx 1 root root 41 Feb 10 18:19 ubuntu01-dsk01 -> ../../../../../../devices/pseudo/z...@0:5c # zpool history |grep xvm 2009-02-08.22:42:12 zfs create zpool01/xvm 2009-02-08.22:42:23 zfs create zpool01/xvm/media 2009-02-08.22:42:45 zfs create zpool01/xvm/dsk 2009-02-10.18:14:41 zfs create -V 10G zpool01/xvm/dsk/ubuntu-01-dsk01 2009-02-10.18:15:10 zfs destroy zpool01/xvm/dsk/ubuntu-01-dsk01 2009-02-10.18:15:21 zfs create -V 10G zpool01/xvm/dsk/ubuntu01-dsk01 2009-02-10.18:15:33 zfs create -V 10G zpool01/xvm/dsk/osol01-dsk01 # uname -a SunOS ollespappa 5.11 snv_107 i86pc i386 i86xpv While I am writing, are there any known issues with sharemgr and ZFS in this release? svc:/network/shares/group:zfs hangs when going down since sharemgr stop zfs never returns... Thanks Henrik Johansson http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
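A possible clean-up sketch, untested against this particular case and not a fix for the underlying minor-node confusion: devfsadm in cleanup mode removes dangling /dev links, which may get rid of the stale ubuntu-01-dsk01 entry:

# devfsadm -C -v
# ls -l /dev/zvol/dsk/zpool01/xvm/dsk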
Re: [zfs-discuss] Two zvol devices one volume?
I tried to export the zpool also, and I got this, the strange part is that it sometimes still thinks that the ubuntu-01-dsk01 dataset exists: # zpool export zpool01 cannot open 'zpool01/xvm/dsk/ubuntu-01-dsk01': dataset does not exist cannot unmount '/zpool01/dump': Device busy But: # zfs destroy zpool01/xvm/dsk/ubuntu-01-dsk01 cannot open 'zpool01/xvm/dsk/ubuntu-01-dsk01': dataset does not exist Regards Henrik Johansson http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best SXCE version for ZFS Home Server
On Nov 16, 2008, at 11:23 AM, Vincent Boisard wrote: I just found this: http://www.sun.com/software/solaris/whats_new.jsp It lists Solaris 10 features and is a first hint at what features are in. Another question: my motherboard has a JMB (363, I think) SATA controller. I know support is included in SXCE now, but I don't know about S10U6. Is there a changelog for S10U6 somewhere, like for SXCE? Have a look at the bug IDs in the patches for S10U6, like the kernel patch 137137-09. There are lists of all the new patches in the documentation for the release at docs.sun.com. Henrik Johansson http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Lost space in empty pool (no snapshots)
:64652d400:400 DVA[1]=0:c80032dc00:400 fletcher4 lzjb LE contiguous birth=28 fill=5 cksum=ae2a07052:491babbaac1:f96aca21cd61:2405acfae00513

    Object  lvl  iblk  dblk  lsize  asize  type
         0    7   16K   16K    16K  20.0K  DMU dnode

    Object  lvl  iblk  dblk  lsize  asize  type
         1    1   16K   512    512  1.50K  ZFS master node
        microzap: 512 bytes, 3 entries
                ROOT = 3
                DELETE_QUEUE = 2
                VERSION = 3

    Object  lvl  iblk  dblk  lsize  asize  type
         2    1   16K   512    512  1.50K  ZFS delete queue
        microzap: 512 bytes, 1 entries
                5 = 5

    Object  lvl  iblk  dblk  lsize  asize  type
         3    1   16K   512    512  1.50K  ZFS directory
                                     264   bonus ZFS znode
        path    /
        uid     0
        gid     0
        atime   Sun Nov 16 01:11:53 2008
        mtime   Sun Nov 16 01:14:18 2008
        ctime   Sun Nov 16 01:14:18 2008
        crtime  Sun Nov 16 01:11:53 2008
        gen     4
        mode    40755
        size    2
        parent  3
        links   2
        xattr   0
        rdev    0x
        microzap: 512 bytes, 0 entries

    Object  lvl  iblk  dblk  lsize  asize  type
         5    5   16K  128K   750G  12.2G  ZFS plain file
                                     264   bonus ZFS znode
        path    ???object#5
        uid     0
        gid     0
        atime   Sun Nov 16 01:13:12 2008
        mtime   Sun Nov 16 01:14:07 2008
        ctime   Sun Nov 16 01:14:07 2008
        crtime  Sun Nov 16 01:13:12 2008
        gen     16
        mode    100600
        size    805306368000
        parent  3
        links   0
        xattr   0
        rdev    0x

Henrik Johansson http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenStorage GUI
On 13 nov 2008, at 15.15, Darren J Moffat [EMAIL PROTECTED] wrote: I believe the issue is that VirtualBox doesn't understand the multi-file format VMDK files that are used for the boot disk (Sun Storage VMware*.vmdk). I believe from googling that this could be fixed, if you have access to VMware Server, by combining them back into a single vmdk file - I don't have easy access to VMware Server so I can't try this. -- It's even possible to transfer the image to bare metal, so the same procedure should be usable to move the image to other virtualization software; once booted it's only ordinary block devices... Regards Henrik Johansson http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
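A hedged sketch of the conversion Darren mentions, assuming access to a machine with VMware's vmware-vdiskmanager tool; the file names are illustrative. The -r option converts the source disk and -t 0 produces a single growable VMDK, which a single-file-capable hypervisor (or VirtualBox) can then use directly:

$ vmware-vdiskmanager -r "Sun Storage VMware.vmdk" -t 0 combined.vmdk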
Re: [zfs-discuss] Lost space in empty pool (no snapshots)
On Nov 11, 2008, at 1:56 AM, Victor Latushkin wrote: Henrik Johansson wrote: Hello, I have a snv101 machine with a three disk raidz pool which allocation of about 1TB with for no obvious reason, no snapshot, no files, nothing. I tried to run zdb on the pool to see if I got any useful info, but it has been working for over two hours without any more output. I know when the allocation occurred, I issued a mkfile 1024G command in the background, but changed my mind and killed the process, after that the 912G was missing (don't remember if I actually removed the test file or what happened). If I copy a file to the /tank filesystem it uses even more space, but that space is reclaimed after I remove the file. I could recreate the pool, it is empty, but I created it to test the system in the first place so I would like to know what's going on. I have tried to export and import the pool, but it stays the same. Any ideas? You can try to increase zdb verbosity by adding some -v swiches. Also try dumping all the objects with 'zdb - tank' (add even more 'd' for extra verbosity). Ah, that did provide some more output, I can see the reserved space is indeed meant for the file I created earlier: Dataset tank [ZPL], ID 16, cr_txg 1, 912G, 5 objects Object lvl iblk dblk lsize asize type 0716K16K16K 20.0K DMU dnode 1116K512512 1.50K ZFS master node 2116K512512 1.50K ZFS delete queue 3116K512512 1.50K ZFS directory 6516K 128K 1T 912G ZFS plain file cut Object lvl iblk dblk lsize asize type 6516K 128K 1T 912G ZFS plain file 264 bonus ZFS znode path???object#6 uid 0 gid 0 atime Sun Nov 9 20:12:30 2008 mtime Sun Nov 9 21:50:10 2008 ctime Sun Nov 9 21:50:10 2008 crtime Sun Nov 9 20:12:30 2008 gen 69 mode 100600 size 1099511627776 parent 3 links 0 xattr 0 rdev 0x [deferred free] [L0 SPA space map] 1000L/200P DVA[0]=0:70ce8a5400:400 DVA[1]= 0:1f800421c00:400 DVA[2]=0:36800067c00:400 fletcher4 lzjb LE contiguous birth =2259 fill=0 cksum=0:0:0:0 [deferred free] [L0 SPA space map] 1000L/400P DVA[0]=0:70ce8a6000:800 DVA[1]= 0:1f800422800:800 DVA[2]=0:36800061800:800 fletcher4 lzjb LE contiguous birth =2259 fill=0 cksum=0:0:0:0 [deferred free] [L0 SPA space map] 1000L/200P DVA[0]=0:70ce8a8c00:400 DVA[1]= 0:1f800423c00:400 DVA[2]=0:36800069c00:400 fletcher4 lzjb LE contiguous birth =2259 fill=0 cksum=0:0:0:0 [deferred free] [L0 DMU dnode] 4000L/800P DVA[0]=0:70ce8a8000:c00 DVA[1]=0:1f 800423000:c00 DVA[2]=0:36800069000:c00 fletcher4 lzjb LE contiguous birth=225 9 fill=0 cksum=0:0:0:0 [deferred free] [L0 DMU dnode] 4000L/a00P DVA[0]=0:70ce8a7000:1000 DVA[1]=0:1 f800295000:1000 DVA[2]=0:36800068000:1000 fletcher4 lzjb LE contiguous birth= 2259 fill=0 cksum=0:0:0:0 objset 0 object 0 offset 0x0 [L0 DMU objset] 400L/200P DVA[0]=0:70ce8ad800:400 DVA[1]=0:1f800429000:400 DVA[2]=0:3680006e800:400 fletcher4 lzjb LE contigu ous birth=2260 fill=74 cksum=1309351a7b:687cd8ec06d: 12b694ebbc4e8:253a3515eb9248 objset 0 object 0 offset 0x0 [L0 DMU dnode] 4000L/c00P DVA[0]=0:70ce8ac400:1400 DVA[1]=0:1f800427c00:1400 DVA[2]=0:3680006d400:1400 fletcher4 lzjb LE cont iguous birth=2260 fill=27 cksum=bbcf0aa9db:13ea5e4dc8e7d: 1425e68263d46ff:f14c2da e18c61e93 cut objset 16 object 6 offset 0x12f73c [L0 ZFS plain file] 2L/ 2P DVA[0]=0:c749c:3 fletcher2 uncompressed LE contiguous birth=164 fill=1 cksum=0:0:0:0 objset 16 object 6 offset 0x12f73e [L0 ZFS plain file] 2L/ 2P DVA[0]=0:c749f:3 fletcher2 uncompressed LE contiguous birth=164 fill=1 cksum=0:0:0:0 objset 16 object 6 offset 0x12f740 [L0 ZFS plain file] 2L/ 2P DVA[0]=0:c74a2:3 fletcher2 
uncompressed LE contiguous birth=164 fill=1 cksum=0:0:0:0 objset 16 object 6 offset 0x12f742 [L0 ZFS plain file] 2L/ 2P DVA[0]=0:c74a5:3 fletcher2 uncompressed LE contiguous birth=164 fill=1 cksum=0:0:0:0 objset 16 object 6 offset 0x12f744 [L0 ZFS plain file] 2L/ 2P DVA[0]=0:c74a8:3 fletcher2 uncompressed LE contiguous birth=164 fill=1 cksum=0:0:0:0 objset 16 object 6 offset 0x12f746 [L0 ZFS plain file] 2L/ 2P DVA[0]=0:c74ab:3 fletcher2 uncompressed LE contiguous birth=164 fill=1 cksum=0:0:0:0 continue for more than 100MB of output But, why has this happened, is it any known issue? Regards Henrik Johansson http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo
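For reference, the zdb invocations being discussed, in their general form; the pool name tank and object number 6 are from this thread, and I believe an object number can be appended to restrict the dump:

# zdb -dd tank          (dump the object tables for the pool's datasets)
# zdb -dddd tank        (more d's produce progressively more detail: znode contents, block pointers)
# zdb -dddd tank 6      (dump only object 6)

Note that zdb reads on-disk structures directly as a diagnostic tool, so its output on a live, changing pool can be inconsistent.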
[zfs-discuss] Fully supported 12-port SATA cards?
I am helping a friend who is building a storage server for his company, and I have advocated Solaris and ZFS. But now we are having trouble finding any good SATA controller for 12+ disks. There are Areca cards, but they seem very flaky (24-port support, but hangs with more than 12 disks in JBOD mode, hangs on disk failures, etc). The AOC-SAT2-MV8 and AOC-USAS-L8i have been mentioned, but they had problems with device numbering and/or, as I understood it, no support for hot plugging. Isn't there any supported PCI.* SATA card with at least 12 ports that works and has a real driver supporting hot plugging/cfgadm operations? The HCL does not tell you everything, and scanning the lists did not show any consensus regarding this. This seems to have been a problem for years; I tried to find a card for myself a while back. Henrik Johansson http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
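For what it's worth, a sketch of the cfgadm operations referred to, assuming a controller whose driver uses the Solaris SATA framework; the attachment-point names such as sata1/3 vary per system:

# cfgadm -al                      (list attachment points and their state)
# cfgadm -c unconfigure sata1/3   (offline a disk before pulling it)
# cfgadm -c configure sata1/3     (bring a newly inserted disk online)

Cards that sit behind RAID firmware and only expose logical volumes typically cannot be managed this way, which is part of what makes the JBOD question above tricky.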
[zfs-discuss] Disks errors not shown by zpool?
Ok, this is not a OpenSolaris question, but it is a Solaris and ZFS question. I have a pool with three mirrored vdevs. I just got an error message from FMD that read failed from one on the disks,(c1t6d0). All with instructions on how to handle the problem and replace the devices, so far everything is good. But the zpool still thinks everything is fine. Shouldn't zpool also show errors in this state? This was run on S10U4 with 127127-11. # zpool status -x all pools are healthy # zpool status pool: storage state: ONLINE scrub: scrub completed with 0 errors on Sun Jun 29 23:16:34 2008 config: NAMESTATE READ WRITE CKSUM storage ONLINE 0 0 0 mirrorONLINE 0 0 0 c1t1d0 ONLINE 0 0 0 c1t2d0 ONLINE 0 0 0 mirrorONLINE 0 0 0 c1t3d0 ONLINE 0 0 0 c1t4d0 ONLINE 0 0 0 mirrorONLINE 0 0 0 c1t5d0 ONLINE 0 0 0 c1t6d0 ONLINE 0 0 0 errors: No known data errors # fmdump -v TIME UUID SUNW-MSG-ID Jul 08 20:14:42.6951 3780a675-96ea-6fa4-bd55-cb078a539f08 ZFS-8000-D3 100% fault.fs.zfs.device Problem in: zfs://pool=storage/vdev=83de319aad25c131 Affects: zfs://pool=storage/vdev=83de319aad25c131 FRU: - Location: - From my message log: Jul 8 20:11:53 fortress scsi: [ID 107833 kern.warning] WARNING: / [EMAIL PROTECTED],0/SUNW,[EMAIL PROTECTED],880/[EMAIL PROTECTED],0 (sd19): Jul 8 20:11:53 fortressSCSI transport failed: reason 'incomplete': retrying command Jul 8 20:12:56 fortress scsi: [ID 365881 kern.info] fas: 6.0: cdb=[ 0xa 0x0 0x1 0xda 0x2 0x0 ] Jul 8 20:12:56 fortress scsi: [ID 365881 kern.info] fas: 6.0: cdb=[ 0xa 0x0 0x3 0xda 0x2 0x0 ] Jul 8 20:12:56 fortress scsi: [ID 107833 kern.warning] WARNING: / [EMAIL PROTECTED],0/SUNW,[EMAIL PROTECTED],880 (fas1): Jul 8 20:12:56 fortressDisconnected tagged cmd(s) (2) timeout for Target 6.0 Jul 8 20:12:56 fortress scsi: [ID 107833 kern.warning] WARNING: / [EMAIL PROTECTED],0/SUNW,[EMAIL PROTECTED],880/[EMAIL PROTECTED],0 (sd19): Jul 8 20:12:56 fortressSCSI transport failed: reason 'timeout': retrying command Jul 8 20:12:56 fortress scsi: [ID 107833 kern.warning] WARNING: / [EMAIL PROTECTED],0/SUNW,[EMAIL PROTECTED],880/[EMAIL PROTECTED],0 (sd19): Jul 8 20:12:56 fortressSCSI transport failed: reason 'reset': retrying command Jul 8 20:12:59 fortress scsi: [ID 107833 kern.warning] WARNING: / [EMAIL PROTECTED],0/SUNW,[EMAIL PROTECTED],880/[EMAIL PROTECTED],0 (sd19): Jul 8 20:12:59 fortressError for Command: write(10) Error Level: Retryable Jul 8 20:12:59 fortress scsi: [ID 107833 kern.notice] Requested Block: 17672154 Error Block: 17672154 Jul 8 20:12:59 fortress scsi: [ID 107833 kern.notice] Vendor: SEAGATESerial Number: 9946626576 Jul 8 20:12:59 fortress scsi: [ID 107833 kern.notice] Sense Key: Unit Attention Jul 8 20:12:59 fortress scsi: [ID 107833 kern.notice] ASC: 0x29 (power on occurred), ASCQ: 0x1, FRU: 0x1 Jul 8 20:12:59 fortress scsi: [ID 107833 kern.warning] WARNING: / [EMAIL PROTECTED],0/SUNW,[EMAIL PROTECTED],880/[EMAIL PROTECTED],0 (sd19): Jul 8 20:12:59 fortressError for Command: write(10) Error Level: Retryable Jul 8 20:12:59 fortress scsi: [ID 107833 kern.notice] Requested Block: 17672154 Error Block: 17672154 Jul 8 20:12:59 fortress scsi: [ID 107833 kern.notice] Vendor: SEAGATESerial Number: 9946626576 Jul 8 20:12:59 fortress scsi: [ID 107833 kern.notice] Sense Key: Not Ready Jul 8 20:12:59 fortress scsi: [ID 107833 kern.notice] ASC: 0x4 (LUN is becoming ready), ASCQ: 0x1, FRU: 0x2 Jul 8 20:13:04 fortress scsi: [ID 107833 kern.warning] WARNING: / [EMAIL PROTECTED],0/SUNW,[EMAIL PROTECTED],880/[EMAIL PROTECTED],0 (sd19): Jul 8 20:13:04 fortressError for Command: 
write(10) Error Level: Retryable Jul 8 20:13:04 fortress scsi: [ID 107833 kern.notice] Requested Block: 17672154 Error Block: 17672154 Jul 8 20:13:04 fortress scsi: [ID 107833 kern.notice] Vendor: SEAGATESerial Number: 9946626576 Jul 8 20:13:04 fortress scsi: [ID 107833 kern.notice] Sense Key: Not Ready Jul 8 20:13:04 fortress scsi: [ID 107833 kern.notice] ASC: 0x4 (LUN is becoming ready), ASCQ: 0x1, FRU: 0x2 Jul 8 20:13:09 fortress scsi: [ID 107833 kern.warning] WARNING: / [EMAIL PROTECTED],0/SUNW,[EMAIL PROTECTED],880/[EMAIL PROTECTED],0 (sd19): Jul 8 20:13:09 fortressError for Command: write(10) Error Level:
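A hedged sketch of commands that can help compare what FMA saw with what ZFS reports; none of this explains the mismatch by itself, but it shows where the retryable-error telemetry ends up:

# fmadm faulty                    (resources FMA currently considers faulted)
# fmdump -eV                      (detailed dump of the underlying error events)
# zpool status -v storage         (per-vdev read/write/checksum counters and persistent data errors)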
Re: [zfs-discuss] ZFS root finally here in SNV90
On Jun 5, 2008, at 12:05 AM, Rich Teer wrote: On Wed, 4 Jun 2008, Henrik Johansson wrote: Does anyone know what the deal with /export/home is? I thought /home was the default home directory in Solaris? Nope, /export/home has always been the *physical* location for users' home directories. They're usually automounted under /home, though. You are right, it was my old habit of creating a physical directory under /home on stand-alone machines that got me confused. But filesystem(5) says: "/home Default root of a subtree for user directories." And useradd's base_dir defaults to /home, where it tries to create a directory if used with just the -m flag. I know this doesn't work when the automounter is running, but it can be disabled or reconfigured. When I think about it, /export/home was created earlier too, with UFS. It's fun how old things can get one confused in a new context ;) Regards Henrik ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
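For reference, a sketch of where the /home automount comes from and how it can be disabled so useradd -m can create directories physically under /home; the user name is made up. The stock /etc/auto_master contains a line like:

/home  auto_home  -nobrowse

Commenting that line out (or editing the auto_home map) and restarting the automounter makes /home an ordinary directory again:

# svcadm restart svc:/system/filesystem/autofs:default
# useradd -d /home/alice -m alice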