Re: [zfs-discuss] Asymmetric zpool load
Ross wrote: Aha, found it! It was this thread, also started by Carsten :) http://www.opensolaris.org/jive/thread.jspa?threadID=78921&tstart=45

Did I? Darn, I need to get a brain upgrade. But yes, there it was mainly focused on zfs send/receive being slow - but maybe these are also linked.

What I will try today/this week: put some stress on the system with bonnie and other tools, try to find slow disks and see if this could be the main problem, but also look into more vdevs and then possibly move to raidz to somehow compensate for lost disk space. Since we have 4 cold spares on the shelf plus SMS warnings on disk failures (that is, if fma catches them), the risk involved should be tolerable.

More later.

Carsten ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
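(A sketch of the kind of stress run Carsten describes, assuming the common bonnie++ variant of bonnie; the scratch directory and the 64 GB working-set size are made up for illustration:)

    # bonnie++ -d /atlashome/scratch -s 65536 -u root
    # iostat -Mnx 2     (in a second terminal, to watch per-disk latency while it runs)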
Re: [zfs-discuss] Asymmetric zpool load
Carsten Aulbert carsten.aulbert at aei.mpg.de writes: Put some stress on the system with bonnie and other tools and try to find slow disks Just run iostat -Mnx 2 (not zpool iostat) while ls is slow to find the slow disks. Look at the %b (busy) values. -marc ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
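(Building on Marc's suggestion, a quick filter for disks that stand out; a minimal sketch assuming the iostat -Mnx column order, with %b in field 10 and the device name in field 11:)

    # iostat -Mnx 2 | awk '$10 > 60 { print $11, "is", $10 "% busy" }'

(Under an even load, any disk consistently far busier than its peers is a good candidate for the slow one.)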
Re: [zfs-discuss] zfs free space
Hello Sanjeev,

Wednesday, December 3, 2008, 5:20:47 AM, you wrote:

S Hi,
S A good rough estimate would be the total of the space that is displayed
S under the USED column of zfs list for those snapshots.
S Here is an example:
S -- snip --
S [EMAIL PROTECTED] zfs list -r tank
S NAME                  USED  AVAIL  REFER  MOUNTPOINT
S tank                 24.6M  38.9M    19K  /tank
S tank/fs1             24.4M  38.9M    18K  /tank/fs1
S tank/[EMAIL PROTECTED]  24.4M      -  24.4M  -
S -- snip --
S In the above case tank/[EMAIL PROTECTED] is using 24.4M. So, if we delete
S that snapshot it would free up about 24.4M. Let's delete it and
S see what we get:
S -- snip --
S [EMAIL PROTECTED] zfs destroy tank/[EMAIL PROTECTED]
S [EMAIL PROTECTED] zfs list -r tank
S NAME       USED  AVAIL  REFER  MOUNTPOINT
S tank       220K  63.3M    19K  /tank
S tank/fs1    18K  63.3M    18K  /tank/fs1
S -- snip --
S So, we did get back the 24.4M freed (38.9M + 24.4M = 63.3M).
S Note that this could get a little complicated if there are multiple
S snapshots which refer to the same set of blocks. So, even after deleting
S one snapshot you might not see the space freed up, because a second
S snapshot may still be referring to some of those blocks.

That's what I meant by: I'm afraid you can do only one at a time. The problem is that once you have several snapshots and you want to calculate how much space you will regain if you delete two (or more) of them, you just can't calculate it by looking at zfs list output. All you can say is how much space at least you will regain, which is the sum of the USED column for the snapshots to be deleted - but if they share some blocks then you might, or might not (depending on whether yet other snapshots share some of these shared blocks), regain much more.

-- Best regards, Robert Milkowski mailto:[EMAIL PROTECTED] http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Asymmetric zpool load
Carsten Aulbert wrote:
Put some stress on the system with bonnie and other tools and try to find slow disks and see if this could be the main problem, but also look into more vdevs and then possibly move to raidz to somehow compensate for lost disk space. Since we have 4 cold spares on the shelf plus SMS warnings on disk failures (that is, if fma catches them) the risk involved should be tolerable.

First result: with bonnie during the "writing intelligently..." phase I see this in a 2 minute average.

zpool iostat:

                 capacity     operations    bandwidth
pool           used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
atlashome    1.70T  19.2T    225  1.49K   342K   107M
  raidz2      550G  6.28T     74    409   114K  32.6M
    c0t0d0       -      -      0    314  32.3K  2.51M
    c1t0d0       -      -      0    315  31.8K  2.52M
    c4t0d0       -      -      0    313  31.3K  2.52M
    c6t0d0       -      -      0    315  32.3K  2.51M
    c7t0d0       -      -      0    326  32.8K  2.50M
    c0t1d0       -      -      0    309  33.9K  2.52M
    c1t1d0       -      -      0    313  33.4K  2.51M
    c4t1d0       -      -      0    314  33.4K  2.52M
    c5t1d0       -      -      0    308  32.8K  2.52M
    c6t1d0       -      -      0    314  31.3K  2.51M
    c7t1d0       -      -      0    311  31.8K  2.52M
    c0t2d0       -      -      0    309  31.8K  2.52M
    c1t2d0       -      -      0    313  31.8K  2.51M
    c4t2d0       -      -      0    315  31.8K  2.52M
    c5t2d0       -      -      0    307  32.8K  2.52M
  raidz2      567G  6.26T     64    529  96.5K  36.3M
    c6t2d0       -      -      1    368  74.2K  2.79M
    c7t2d0       -      -      1    366  74.2K  2.80M
    c0t3d0       -      -      1    364  75.8K  2.80M
    c1t3d0       -      -      1    365  75.2K  2.80M
    c4t3d0       -      -      1    368  76.8K  2.80M
    c5t3d0       -      -      1    362  76.3K  2.80M
    c6t3d0       -      -      1    366  77.9K  2.80M
    c7t3d0       -      -      1    365  76.8K  2.80M
    c0t4d0       -      -      1    361  76.8K  2.80M
    c1t4d0       -      -      1    363  75.8K  2.80M
    c4t4d0       -      -      1    366  76.3K  2.80M
    c6t4d0       -      -      1    364  78.4K  2.80M
    c7t4d0       -      -      1    370  78.9K  2.79M
    c0t5d0       -      -      1    365  77.3K  2.80M
    c1t5d0       -      -      1    364  74.7K  2.80M
  raidz2      620G  6.64T     86    582   131K  37.9M
    c4t5d0       -      -     18    382  1.16M  2.74M
    c5t5d0       -      -     10    380   674K  2.74M
    c6t5d0       -      -     18    378  1.15M  2.73M
    c7t5d0       -      -      9    384   628K  2.74M
    c0t6d0       -      -     18    377  1.16M  2.74M
    c1t6d0       -      -     10    383   680K  2.75M
    c4t6d0       -      -     19    379  1.21M  2.73M
    c5t6d0       -      -     10    383   691K  2.75M
    c6t6d0       -      -     19    379  1.21M  2.73M
    c7t6d0       -      -     10    383   676K  2.72M
    c0t7d0       -      -     18    374  1.19M  2.75M
    c1t7d0       -      -     10    381   676K  2.74M
    c4t7d0       -      -     19    380  1.22M  2.74M
    c5t7d0       -      -     10    382   696K  2.74M
    c6t7d0       -      -     18    381  1.17M  2.74M
    c7t7d0       -      -      9    386   631K  2.75M
-----------  -----  -----  -----  -----  -----  -----

iostat -Mnx 120:

                     extended device statistics
   r/s     w/s   Mr/s  Mw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
   0.0     0.0    0.0   0.0   0.0   0.0     0.0     0.0   0   0  c2t0d0
   0.0     0.0    0.0   0.0   0.0   0.0     0.0     0.0   0   0  c3t0d0
   0.0     1.4    0.0   0.0   0.0   0.0     1.5     0.4   0   0  c5t0d0
   0.6   351.5    0.0   2.6   0.4   0.1     1.2     0.2   3   8  c7t0d0
   0.6   336.3    0.0   2.6   0.1   0.1     0.4     0.2   3   7  c0t0d0
   0.6   340.8    0.0   2.6   0.2   0.1     0.6     0.2   3   7  c1t0d0
   0.6   330.6    0.0   2.6   0.1   0.1     0.3     0.2   3   7  c5t1d0
   0.6   336.7    0.0   2.6   0.1   0.1     0.3     0.2   3   7  c4t0d0
   0.6   331.8    0.0   2.6   0.1   0.1     0.3     0.2   3   7  c0t1d0
   0.6   339.0    0.0   2.6   0.4   0.1     1.1     0.2   3   7  c7t1d0
   0.6   335.4    0.0   2.6   0.1   0.1     0.4     0.2   3   7  c1t1d0
   0.6   329.2    0.0   2.6   0.1   0.1     0.3     0.2   3   7  c5t2d0
   0.6   343.7    0.0   2.6   0.3   0.1     0.7     0.2   3   7  c4t1d0
   0.6   331.8    0.0   2.6   0.1   0.1     0.3     0.2   2   7  c0t2d0
   1.2   396.3    0.1   2.9   0.3   0.1     0.7     0.2   4   8  c7t2d0
   0.6   336.7    0.0   2.6   0.1   0.1     0.4     0.2   3   7  c1t2d0
   0.6   341.9    0.0   2.6   0.2   0.1     0.7     0.2   3   7  c4t2d0
   1.3   390.7    0.1   2.9   0.3   0.1     0.8     0.2   4   9  c5t3d0
   1.3   396.7    0.1   2.9   0.3   0.1     0.8     0.2   4   9  c7t3d0
   1.3   393.6    0.1   2.9   0.2   0.1     0.6     0.2   4   9  c0t3d0
   0.0     0.0    0.0   0.0   0.0   0.0     0.0     0.0   0   0  c5t4d0
   1.3   396.2    0.1   2.9   0.2   0.1     0.5
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
I'm having a very similar issue. Just updated to 10 u6 and upgraded my zpools. They are fine (all 3-way mirrors), but I've lost the machine around 12:30am two nights in a row. I'm booting ZFS root pools, if that makes any difference. I also don't see anything in dmesg, nothing on the console either. I'm going to go back to the logs today to see what was going on around midnight on these occasions. I know there are some built-in cron jobs that run around that time - perhaps one of them is the culprit. What I'd really like is a way to force a core dump when the machine hangs like this. scat is a very nifty tool for debugging such things - but I'm not getting a core or panic or anything :( -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
Update: It would appear that the bug I was complaining about nearly a year ago is still at play here: http://opensolaris.org/jive/thread.jspa?threadID=49372tstart=0 Unfortunate Solution: Ditch Solaris 10 and run Nevada. The nice folks in the OpenSolaris project fixed the problem a long time ago. This means that I can't have Sun support until Nevada becomes a real product, but it's better than having a silent failure every time 6GB crosses the wire. My big question is why won't they fix it in Solaris 10? Sun's depriving themselves of my support revenue stream and I'm stuck with an unsupportable box as my core filer. Bad situation on so many levels.. If it weren't for the stellar quality of the Nevada builds (b91 uptime=132 days now with no problems), I'd not be sleeping much at night.. Imagine my embarrassment had I taken the high road and spent the $$$ for a Thumper for this purpose.. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
On Wed, Dec 3, 2008 at 7:49 AM, Jacob Ritorto [EMAIL PROTECTED]wrote: Update: It would appear that the bug I was complaining about nearly a year ago is still at play here: http://opensolaris.org/jive/thread.jspa?threadID=49372tstart=0 Unfortunate Solution: Ditch Solaris 10 and run Nevada. The nice folks in the OpenSolaris project fixed the problem a long time ago. This means that I can't have Sun support until Nevada becomes a real product, but it's better than having a silent failure every time 6GB crosses the wire. My big question is why won't they fix it in Solaris 10? Sun's depriving themselves of my support revenue stream and I'm stuck with an unsupportable box as my core filer. Bad situation on so many levels.. If it weren't for the stellar quality of the Nevada builds (b91 uptime=132 days now with no problems), I'd not be sleeping much at night.. Imagine my embarrassment had I taken the high road and spent the $$$ for a Thumper for this purpose.. Can't you just run opensolaris? They've got support contracts for that, and the bug should be fixed in 2008.11. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
Ok, I've done some more testing today and I almost don't know where to start. I'll begin with the good news for Miles :)

- Rebooting doesn't appear to cause ZFS to lose the resilver status (but see 1. below)
- Resilvering appears to work fine; once complete I never saw any checksum errors when scrubbing the pool.
- Reconnecting iscsi drives causes zfs to automatically online the pool and automatically begin resilvering.

And now the bad news:

1. While rebooting doesn't seem to cause the resilver to lose its status, something's causing it problems. I saw it restart several times.
2. With iscsi, you can't reboot with sendtargets enabled; static discovery still seems to be the order of the day.
3. There appears to be a disconnect between what iscsiadm knows and what ZFS knows about the status of the devices.

And I have confirmation of some of my earlier findings too:

4. iSCSI still has a 3 minute timeout, during which time your pool will hang, no matter how many redundant drives you have available.
5. zpool status can still hang when a device goes offline, and when it finally recovers, it will then report out of date information. This could be Bug 6667199, but I've not seen anybody reporting the incorrect information part of this.
6. After one drive goes offline, during the resilver process, zpool status shows that information is being resilvered on the good drives. Does anybody know why this happens?
7. Although ZFS will automatically online a pool when iscsi devices come online, CIFS shares are not automatically remounted.

I also have a few extra notes about a couple of those:

1 - resilver losing status
===
Regarding the resilver restarting, I've seen it reported that zpool status can cause this when run as admin, but I'm not convinced that's the cause. Same for the rebooting problem. I was able to run zpool status dozens of times as an admin, but only two or three times did I see the resilver restart. Also, after rebooting, I could see that the resilver was showing that it was 66% complete, but then a second later it restarted. Now, none of this is conclusive. I really need to test with a much larger dataset to get an idea of what's really going on, but there's definitely something weird happening here.

3 - disconnect between iscsiadm and ZFS
===
I repeated my test of offlining an iscsi target, this time checking iscsiadm to see when it disconnected. What I did was wait until iscsiadm reported 0 connections to the target, and then started a CIFS file copy and ran zpool status. Zpool status hung as expected, and a minute or so later, the CIFS copy failed. It seems that although iscsiadm was aware that the target was offline, ZFS did not yet know about it. As expected, a minute or so later, zpool status completed (returning incorrect results), and I could then run the CIFS copy fine.

5 - zpool status hanging and reporting incorrect information
===
When an iSCSI device goes offline, if you immediately run zpool status, it hangs for 3-4 minutes. Also, when it finally completes, it gives incorrect information, reporting all the devices as online. If you immediately re-run zpool status, it completes rapidly and will now correctly show the offline devices.

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
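(For reference, switching an initiator from sendtargets to static discovery, per Ross's point 2, looks roughly like this; the IQN and address below are made-up placeholders:)

    # iscsiadm modify discovery --sendtargets disable
    # iscsiadm modify discovery --static enable
    # iscsiadm add static-config iqn.1986-03.com.sun:02:target0,192.168.10.5:3260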
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
I think my problem is actually different - I'm not using iSCSI at all. I will update if I find otherwise. And yes, I do think there is support available for OpenSolaris now: http://www.sun.com/service/opensolaris/faq.xml Blake On Wed, Dec 3, 2008 at 9:32 AM, Tim [EMAIL PROTECTED] wrote: On Wed, Dec 3, 2008 at 7:49 AM, Jacob Ritorto [EMAIL PROTECTED] wrote: Update: It would appear that the bug I was complaining about nearly a year ago is still at play here: http://opensolaris.org/jive/thread.jspa?threadID=49372tstart=0 Unfortunate Solution: Ditch Solaris 10 and run Nevada. The nice folks in the OpenSolaris project fixed the problem a long time ago. This means that I can't have Sun support until Nevada becomes a real product, but it's better than having a silent failure every time 6GB crosses the wire. My big question is why won't they fix it in Solaris 10? Sun's depriving themselves of my support revenue stream and I'm stuck with an unsupportable box as my core filer. Bad situation on so many levels.. If it weren't for the stellar quality of the Nevada builds (b91 uptime=132 days now with no problems), I'd not be sleeping much at night.. Imagine my embarrassment had I taken the high road and spent the $$$ for a Thumper for this purpose.. Can't you just run opensolaris? They've got support contracts for that, and the bug should be fixed in 2008.11. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun X4150/X4450
I've done some basic testing with a X4150 machine using 6 disks in a RAID 5 and RAID Z configuration. They perform very similarly, but RAIDZ definitely has more system overhead. In many cases this won't be a big deal, but if you need as many CPU cycles as you can muster, hardware RAID may be your better choice. -Aaron On Tue, Dec 2, 2008 at 4:22 AM, Vikash Gupta [EMAIL PROTECTED] wrote: Hi, Has anyone implemented the Hardware RAID 1/5 on Sun X4150/X4450 class of servers . Also any comparison between ZFS Vs H/W Raid ? I would like to know the experience (good/bad) and the pros/cons? Regards, Vikash ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs_nocacheflush, nvram, and root pools
On 12/02/08 03:47, River Tarnell wrote:
hi, i have a system connected to an external DAS (SCSI) array, using ZFS. the array has an nvram write cache, but it honours SCSI cache flush commands by flushing the nvram to disk. the array has no way to disable this behaviour. a well-known behaviour of ZFS is that it often issues cache flush commands to storage in order to ensure data integrity; while this is important with normal disks, it's useless for nvram write caches, and it effectively disables the cache. so far, i've worked around this by setting zfs_nocacheflush, as described at [1], which works fine. but now i want to upgrade this system to Solaris 10 Update 6, and use a ZFS root pool on its internal SCSI disks (previously, the root was UFS). the problem is that zfs_nocacheflush applies to all pools, which will include the root pool. my understanding of ZFS is that when run on a root pool, which uses slices (instead of whole disks), ZFS won't enable the write cache itself. i also didn't enable the write cache manually. so, it _should_ be safe to use zfs_nocacheflush, because there is no caching on the root pool. am i right, or could i encounter problems here?

Yes, you are right and this should work. You may want to check that the write cache is disabled on the root pool disks using 'format -e' -> cache -> write_cache -> display.

(the system is an NFS server, which means lots of synchronous writes (and therefore ZFS cache flushes), so i *really* want the performance benefit from using the nvram write cache.)

Indeed, performance would be bad without it.

- river.

Neil. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
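(For reference, the workaround River describes is normally a one-line /etc/system change followed by a reboot; a minimal sketch. Note it disables cache flushes for every pool on the host, which is exactly the root-pool concern raised above:)

    set zfs:zfs_nocacheflush = 1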
[zfs-discuss] Flash Archive Support for ZFS?
zfs-discuss, Now that we've finally got support for ZFS root filesystems on Solaris 10, I was wondering if anyone knows what the status is for ZFS Flash Archive. Presumably it's got to use ZFS send/receive functionality, but is rolling that into FlashArchive something that's on the roadmap? Thanks, Matthew -- Matt Walburn http://mattwalburn.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Followup to my own post. Looks like my SVM setup was having problems prior to the patch being applied. If I boot net:dhcp -s and poke around on the disks, it looks like disk0 is in a pre-patch state and disk1 is post-patch. I can get a shell if I boot disk1 -s. So I think I am in SVM hell here, not specifically the ZFS patch breaking my box. Never mind! -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
2. With iscsi, you can't reboot with sendtargets enabled, static discovery still seems to be the order of the day. I'm seeing this problem with static discovery: http://bugs.opensolaris.org/view_bug.do?bug_id=6775008. 4. iSCSI still has a 3 minute timeout, during which time your pool will hang, no matter how many redundant drives you have available. This is CR 649, http://bugs.opensolaris.org/view_bug.do?bug_id=649, which is separate from the boot time timeout, though, and also one that Sun so far has been unable to fix! -- Maurice Volaski, [EMAIL PROTECTED] Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
Hi Blake,

Blake Irvin wrote:
I'm having a very similar issue. Just updated to 10 u6 and upgraded my zpools. They are fine (all 3-way mirrors), but I've lost the machine around 12:30am two nights in a row. What I'd really like is a way to force a core dump when the machine hangs like this. scat is a very nifty tool for debugging such things - but I'm not getting a core or panic or anything :(

You can force a dump. Here are the steps:

Before the system is hung:

# mdb -K -F

-- this will load kmdb and drop into it. Don't worry if your system now seems hung. Type, carefully, with no typos:

:c

-- and carriage-return. You should get your prompt back.

Now, when the system is hung, type F1-a (that's function key F1 and the 'a' key together). This should put you into kmdb. Now type (again, no typos):

$<systemdump

This should give you a panic dump, followed by a reboot (unless your system is hard-hung).

max ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
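(Before relying on a forced dump, it's worth confirming that a dump device and savecore are configured; running dumpadm with no arguments prints the current settings. The output below is approximate, and the device path and hostname are illustrative:)

    # dumpadm
          Dump content: kernel pages
           Dump device: /dev/dsk/c1t0d0s1 (swap)
    Savecore directory: /var/crash/myhost
      Savecore enabled: yes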
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
Yeah, thanks Maurice, I just saw that one this afternoon. I guess you can't reboot with iscsi full stop... o_0 And I've seen the iscsi bug before (I was just too lazy to look it up lol), I've been complaining about that since February. In fact it's been a bad week for iscsi here, I've managed to crash the iscsi client twice in the last couple of days too (full kernel dump crashes), so I'll be filing a bug report on that tomorrow morning when I get back to the office. Ross On Wed, Dec 3, 2008 at 7:39 PM, Maurice Volaski [EMAIL PROTECTED] wrote: 2. With iscsi, you can't reboot with sendtargets enabled, static discovery still seems to be the order of the day. I'm seeing this problem with static discovery: http://bugs.opensolaris.org/view_bug.do?bug_id=6775008. 4. iSCSI still has a 3 minute timeout, during which time your pool will hang, no matter how many redundant drives you have available. This is CR 649, http://bugs.opensolaris.org/view_bug.do?bug_id=649, which is separate from the boot time timeout, though, and also one that Sun so far has been unable to fix! -- Maurice Volaski, [EMAIL PROTECTED] Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs free space
Hi Sanjeev and Milek, thanks for your replies but I'm afraid they are somewhat missing the point.

I have a situation (and I believe it would be fairly common) where early snapshots would be sharing most data with the current filesystem and more recent snapshots are holding onto data that has been deleted from the current filesystem (after a recent big deletion of unused data).

It is impossible to see what snapshots would need to be deleted to free up the space that was deleted from the current fs, without deleting the snapshots one by one. I think in any filesystem it is reasonable to expect to be able to determine what entities use up disk space. But ZFS is currently lacking this for snapshots.

See the below USED column. The total USED adds up to about 11G. However the total data consumed by the snapshots is more in the range of 100G. But from the listing below it is simply impossible to see which snapshots are using it.

NAME                   USED  AVAIL  REFER  MOUNTPOINT
storage                457G   127G  28.4K  /storage
storage/myfilesystem   457G   127G   251G  /storage/myfilesystem

NAME                         USED  AVAIL  REFER  MOUNTPOINT
[EMAIL PROTECTED]               0      -  28.4K  -
[EMAIL PROTECTED]               0      -  28.4K  -
storage/[EMAIL PROTECTED]   4.26G      -   187G  -
storage/[EMAIL PROTECTED]   61.1M      -   206G  -
storage/[EMAIL PROTECTED]    773M      -   201G  -
storage/[EMAIL PROTECTED]   33.2M      -   192G  -
storage/[EMAIL PROTECTED]   62.6M      -   212G  -
storage/[EMAIL PROTECTED]   5.29G      -   217G  -

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] OpenSolaris vs Linux
Hi list,

Does anyone have ANY new data on OpenSolaris vs Linux? I only found an old post from 2006. http://mail.opensolaris.org/pipermail/zfs-discuss/2006-January/030366.html

And any comments on whether OpenSolaris performance is about the same as Solaris 10?

Thanks! z ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
Hi Blake,

Blake Irvin wrote:
Thanks - however, the machine hangs and doesn't even accept console input when this occurs. I can't get into the kernel debugger in these cases.

Are you directly on the console, or is the console on a serial port? If you are running over X windows, the input might still get in, but X may not be displaying. If keyboard input is not getting in, your machine is probably wedged at a high level interrupt, which sounds doubtful based on your problem description.

I've enabled the deadman timer instead. I'm also using the automatic snapshot service to get a look at things like /var/adm/sa/sa** files that get overwritten after a hard reset.

If the deadman timer does not trigger, the clock is almost certainly running, and your machine is almost certainly accepting keyboard input.

Good luck, max

I'm just going to stay up late tonight and see what happens :)

Blake

Hi Blake, Blake Irvin wrote: I'm having a very similar issue. Just updated to 10 u6 and upgraded my zpools. They are fine (all 3-way mirrors), but I've lost the machine around 12:30am two nights in a row. What I'd really like is a way to force a core dump when the machine hangs like this. scat is a very nifty tool for debugging such things - but I'm not getting a core or panic or anything :( You can force a dump. Here are the steps: Before the system is hung: # mdb -K -F -- this will load kmdb and drop into it. Don't worry if your system now seems hung. Type, carefully, with no typos: :c -- and carriage-return. You should get your prompt back. Now, when the system is hung, type F1-a (that's function key F1 and the 'a' key together). This should put you into kmdb. Now type (again, no typos): $<systemdump This should give you a panic dump, followed by a reboot (unless your system is hard-hung). max ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
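(For anyone wanting to replicate Blake's setup: to the best of my recollection the deadman timer is enabled with an /etc/system line and a reboot; verify the tunable name against your Solaris release before relying on it:)

    set snooping=1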
Re: [zfs-discuss] OpenSolaris vs Linux
On Wed, Dec 3, 2008 at 2:39 PM, Joseph Zhou [EMAIL PROTECTED]wrote: Hi list, Any one has ANY new data on OpenSolaris vs Linux? I only found an old post in 2006. http://mail.opensolaris.org/pipermail/zfs-discuss/2006-January/030366.html And any comments on if OpenSolaris performance is about the same as Solaris 10? Thanks! z That's kind of open ended. What sort of performance are you looking for? NFS throughput? Software raid? What distro vs. Solaris? Opensolaris and Solaris are going to have different performance based on what exactly it is you're testing. Similar is probably accurate for a lot of things, but not everything. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenSolaris vs Linux
Thanks Tim, At this moment, I am looking into OpenStorage as NAS (file serving) vs. Linux NAS (Samba) vs. Win2008 NAS vs. NetApp (ONTAP, not GX) performance. I am also interested in block-based performance, but not as urgent as above. (Since 7000 is mainly doing NAS today, in a non-HPC-clustered fashion without Lustre. With Lustre, the performance competitive focuses are different from above). Thanks, z - Original Message - From: Tim To: Joseph Zhou Cc: zfs-discuss@opensolaris.org Sent: Wednesday, December 03, 2008 4:04 PM Subject: Re: [zfs-discuss] OpenSolaris vs Linux On Wed, Dec 3, 2008 at 2:39 PM, Joseph Zhou [EMAIL PROTECTED] wrote: Hi list, Any one has ANY new data on OpenSolaris vs Linux? I only found an old post in 2006. http://mail.opensolaris.org/pipermail/zfs-discuss/2006-January/030366.html And any comments on if OpenSolaris performance is about the same as Solaris 10? Thanks! z That's kind of open ended. What sort of performance are you looking for? NFS throughput? Software raid? What distro vs. Solaris? Opensolaris and Solaris are going to have different performance based on what exactly it is you're testing. Similar is probably accurate for a lot of things, but not everything. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenSolaris vs Linux
Joseph Zhou wrote:
Thanks Tim, At this moment, I am looking into OpenStorage as NAS (file serving) vs. Linux NAS (Samba) vs. Win2008 NAS vs. NetApp (ONTAP, not GX) performance.

There are still a number of ZFS/OpenSolaris options to compare: iSCSI, Samba, CIFS, NFS.

-- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenSolaris vs Linux
On Wed, Dec 3, 2008 at 3:11 PM, Joseph Zhou [EMAIL PROTECTED]wrote: Thanks Tim, At this moment, I am looking into OpenStorage as NAS (file serving) vs. Linux NAS (Samba) vs. Win2008 NAS vs. NetApp (ONTAP, not GX) performance. I am also interested in block-based performance, but not as urgent as above. (Since 7000 is mainly doing NAS today, in a non-HPC-clustered fashion without Lustre. With Lustre, the performance competitive focuses are different from above). Thanks, z Right, so hardware or software raid? NFS, CIFS, both? Win2k8 is going to blow serving NFS, but it can be done. Storage 7000 is going to have a COMPLETELY different performance envelope than vanilla opensolaris or solaris. With some customization using flash you might be able to get close, but if you want to know what a storage 7000 will do, you should ask for that, not just opensolaris. Here's an example of a loaded up 7000: http://blogs.sun.com/brendan/entry/a_quarter_million_nfs_iops If you want to compare it to something like NetApp though, it's tough, because how do you make your comparison? Price? What model NetApp are you going to use? What kind of server are you going to use? If you just want to use some numbers someone comes up with to make a decision on what platform to use, I'd argue you're going about it completely the wrong way. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenSolaris vs Linux
Thanks Ian, Tim, Ok, let me really hit one topic instead of trying to see in general what data are out there... Let's say OpenSolaris doing Samba vs. Linux doing Samba, in CIFS performance. (so I can link to the Win2008 CIFS numbers and NetApp CIFS numbers myself.) Is there any data to this specific point? Thanks! z - Original Message - From: Ian Collins [EMAIL PROTECTED] To: Joseph Zhou [EMAIL PROTECTED] Cc: Tim [EMAIL PROTECTED]; zfs-discuss@opensolaris.org Sent: Wednesday, December 03, 2008 4:31 PM Subject: Re: [zfs-discuss] OpenSolaris vs Linux Joseph Zhou wrote: Thanks Tim, At this moment, I am looking into OpenStorage as NAS (file serving) vs. Linux NAS (Samba) vs. Win2008 NAS vs. NetApp (ONTAP, not GX) performance. There are still a number of ZFS/OpenSOlaris options to compare, iSCSI, Samba, CIFS, NFS. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenSolaris vs Linux
On Wed, Dec 3, 2008 at 3:36 PM, Joseph Zhou [EMAIL PROTECTED]wrote: Thanks Ian, Tim, Ok, let me really hit one topic instead of trying to see in general what data are out there... Let's say OpenSolaris doing Samba vs. Linux doing Samba, in CIFS performance. (so I can link to the Win2008 CIFS numbers and NetApp CIFS numbers myself.) Is there any data to this specific point? Thanks! z So, you wouldn't use Samba on opensolaris, you'd use the native cifs stack. Then we have to look at the system itself. How much ram? How many and what kind of CPU's? How much disk on the backend? What kind of disk on the back end? I don't think you're going to find the numbers you're looking for to be quite honest. And even if you did, I don't know how usable they'd really be. I'd start by digging through the spc benchmarks. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
I am directly on the console. cde-login is disabled, so i'm dealing with direct entry.

Are you directly on the console, or is the console on a serial port? If you are running over X windows, the input might still get in, but X may not be displaying. If keyboard input is not getting in, your machine is probably wedged at a high level interrupt, which sounds doubtful based on your problem description.

Out of curiosity, why do you say that? I'm no expert on interrupts, so I'm curious. It DOES seem that keyboard entry is ignored in this situation, since I see no results from ctrl-c, for example (I had left the console running 'tail -f /var/adm/messages'). I'm not saying you are wrong, but if I should be examining interrupt issues, I'd like to know (I have 3 hard disk controllers in the box, for example...)

If the deadman timer does not trigger, the clock is almost certainly running, and your machine is almost certainly accepting keyboard input.

That's good to know. I just enabled deadman after the last freeze, so it will be a bit before I can test this (hope I don't have to). thanks! Blake

Good luck, max -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs free space
To provide a complete (meaning it is currently incomplete) disk usage report, I think zfs would need to provide the following:

Each block in a zfs fs is in general shared by one or more of either the current fs or snapshots. Call this the block's share set. E.g., (storage/fs, storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED]).

Group the blocks (from the entire fs) that have identical share sets together and call this a shared collection. For each shared collection, report the disk usage based on the total number of blocks in that shared collection. It is to be expected that a snapshot or fs could appear in more than one shared collection. E.g.,

zfs list-shared
SHARED COLLECTION                                                                  USED
(storage/fs, storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED])                 100G
(storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED])   40G

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs free space
Hello none,

Wednesday, December 3, 2008, 8:38:03 PM, you wrote:

n Hi Sanjeev and Milek, thanks for your replies but I'm afraid they are somewhat missing the point.
n I have a situation (and I believe it would be fairly common) where early snapshots would be sharing most data with the current filesystem and more recent snapshots are holding onto data that has been deleted from the current filesystem (after a recent big deletion of unused data).
n It is impossible to see what snapshots would need to be deleted to free up the space that was deleted from the current fs, without deleting the snapshots one by one. I think in any filesystem it is reasonable to expect to be able to determine what entities use up disk space. But ZFS is currently lacking this for snapshots.
n See the below USED column. The total USED adds up to about 11G. However the total data consumed by the snapshots is more in the range of 100G. But from the listing below it is simply impossible to see which snapshots are using it.

In most cases you probably want to destroy the oldest snapshot. Then check if you're happy with your free space; if not, delete the next one, and so on.

The problem is how to present the information you are asking for, like: how much space will I regain if I delete snapshots #2, #4 and #7?

-- Best regards, Robert Milkowski mailto:[EMAIL PROTECTED] http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
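(An aside for later readers: ZFS builds newer than those in this thread added a dry-run destroy that answers exactly this question per snapshot; a sketch, with the snapshot name hypothetical and output abbreviated:)

    # zfs destroy -nv storage/[EMAIL PROTECTED]
    would destroy storage/[EMAIL PROTECTED]
    would reclaim 4.26G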
Re: [zfs-discuss] OpenSolaris vs Linux
On Wed, Dec 3, 2008 at 3:51 PM, Joseph Zhou [EMAIL PROTECTED]wrote: Ok, thanks Tim, which SPC are you talking about? SPC-1 and SPC-2 don't test NAS, those are block perf. SPECsfs97 v2/v3 and sfs2008 have no OpenStorage results. If there are standard storage benchmarks out there, I would not be here asking folks. To your point, how is the OpenSolaris native CIFS vs Linux Samba then? (if you think this is more apple-to-apple than they both run Samba) Again, I am here to explore data, not to argue, if I give you a dozen configurations, could you get me the performance estimates and how the estimates come from? I didn't think that route is possible. Thanks. z Sorry, I was referring to SPEC, not SPC. Perhaps you could ask one of the folks from Sun on these mailing lists if they have plans to post results. I'd imagine they do for at least the storage 7000 series. I think native cifs on Solaris vs. Samba on Linux is fair simply because it's what someone rolling out an implementation would use. It'll never be 100% apples-to-apples, so I'd say real-world is preferred over hampering one system to make it *closer* to the other. As for configurations, I probably have access to enough hardware to do most of the benchmarking, but this time of year, being end-of-quarter, I wouldn't have the time to do so. That doesn't mean there isn't someone else lurking who does. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs free space
Hi Milek, I specifically don't want to destroy the oldest snapshot because I already know it won't free up much disk space, since it is sharing most data with the current fs. If I delete it I will have lost any old data which might be needed for recovery (e.g. if I accidentally corrupt some files). I don't know which snapshots will actually free up the space.

In most cases you probably want to destroy the oldest snapshot. Then check if you're happy with your free space; if not, delete the next one, and so on. The problem is how to present the information you are asking for, like: how much space will I regain if I delete snapshots #2, #4 and #7?

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
Hi Blake,

Blake Irvin wrote:
I am directly on the console. cde-login is disabled, so i'm dealing with direct entry.

Are you directly on the console, or is the console on a serial port? If you are running over X windows, the input might still get in, but X may not be displaying. If keyboard input is not getting in, your machine is probably wedged at a high level interrupt, which sounds doubtful based on your problem description.

Out of curiosity, why do you say that? I'm no expert on interrupts, so I'm curious. It DOES seem that keyboard entry is ignored in this situation, since I see no results from ctrl-c, for example (I had left the console running 'tail -f /var/adm/messages'). I'm not saying you are wrong, but if I should be examining interrupt issues, I'd like to know (I have 3 hard disk controllers in the box, for example...)

Typing ctrl-c and having a process killed because of it are 2 different actions. The interpretation of ctrl-c as a kill character is done in a streams module (ldterm, I believe). This is not done at the device interrupt handler. I doubt you need to examine interrupts. I was only saying that you could try what I recommended to get a dump. The F1-a is handled by the driver during interrupt handling, so it should get processed. I have done this many times, so I am sure it works.

If the deadman timer does not trigger, the clock is almost certainly running, and your machine is almost certainly accepting keyboard input.

That's good to know. I just enabled deadman after the last freeze, so it will be a bit before I can test this (hope I don't have to). thanks! Blake

Good luck, max ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs free space
n == none [EMAIL PROTECTED] writes:
rm == Robert Milkowski [EMAIL PROTECTED] writes:

 n eg,
 n zfs list-shared
 n SHARED COLLECTION    USED
 n (storage/fs, storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED])  100G

rm how much space will I regain if I delete snapshot #2 #4 #7

without dedup, I think any SHARED COLLECTION will be contiguous. It'll always be ``if I delete #2, #3, #4''. so if you have n snapshots, you'll have up to n*(n+1)/2 SHARED COLLECTIONs. kind of a lot, but not totally ridiculous.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenSolaris vs Linux
haha, Tim, yes, I see the Open spirit in this reply! ;-) As I said, I am just exploring data. The Sun J4000 SPC1 and SPC2 benchmark results were nice, just lacking other published results with the iSCSI HBA as DAS, not as a network storage device (as 7000). Though I would attempt to say those results can be a basis for 7000 block-performance... any comment? Thanks! z - Original Message - From: Tim To: Joseph Zhou Cc: Ian Collins ; zfs-discuss@opensolaris.org Sent: Wednesday, December 03, 2008 5:00 PM Subject: Re: [zfs-discuss] OpenSolaris vs Linux On Wed, Dec 3, 2008 at 3:51 PM, Joseph Zhou [EMAIL PROTECTED] wrote: Ok, thanks Tim, which SPC are you talking about? SPC-1 and SPC-2 don't test NAS, those are block perf. SPECsfs97 v2/v3 and sfs2008 have no OpenStorage results. If there are standard storage benchmarks out there, I would not be here asking folks. To your point, how is the OpenSolaris native CIFS vs Linux Samba then? (if you think this is more apple-to-apple than they both run Samba) Again, I am here to explore data, not to argue, if I give you a dozen configurations, could you get me the performance estimates and how the estimates come from? I didn't think that route is possible. Thanks. z Sorry, I was referring to SPEC, not SPC. Perhaps you could ask one of the folks from Sun on these mailing lists if they have plans to post results. I'd imagine they do for at least the storage 7000 series. I think native cifs on Solaris vs. Samba on Linux is fair simply because it's what someone rolling out an implementation would use. It'll never be 100% apples-to-apples, so I'd say real-world is preferred over hampering one system to make it *closer* to the other. As for configurations, I probably have access to enough hardware to do most of the benchmarking, but this time of year, being end-of-quarter, I wouldn't have the time to do so. That doesn't mean there isn't someone else lurking who does. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
r == Ross [EMAIL PROTECTED] writes:

rs I don't think it likes it if the iscsi targets aren't
rs available during boot.

from my cheatsheet:

-8-
ok boot -m milestone=none
[boots. enter root password for maintenance.]
bash-3.00# /sbin/mount -o remount,rw /
[-- otherwise iscsiadm won't update /etc/iscsi/*]
bash-3.00# /sbin/mount /usr
bash-3.00# /sbin/mount /var
bash-3.00# /sbin/mount /tmp
bash-3.00# iscsiadm remove discovery-address 10.100.100.135
bash-3.00# iscsiadm remove discovery-address 10.100.100.138
bash-3.00# iscsiadm remove discovery-address 10.100.100.138
iscsiadm: unexpected OS error
iscsiadm: Unable to complete operation
[-- good. it's gone.]
bash-3.00# sync
bash-3.00# lockfs -fa
bash-3.00# reboot
-8-

rs # time zpool status
[...]
rs real 3m51.774s

so, this hang may happen in fewer situations, but it is not fixed.

r 6. After one drive goes offline, during the resilver process, zpool status shows that information is being resilvered on the good drives. Does anybody know why this happens?

I don't know why. I've seen that, too, though. For me it's always been relatively short, <1min. I wonder if there are three kinds of scrub-like things, not just two (resilvers and scrubs), and 'zpool status' is ``simplifying'' for us again?

r 7. Although ZFS will automatically online a pool when iscsi devices come online, CIFS shares are not automatically remounted.

For me, even plain filesystems are not all remounted. ZFS tries to mount them in the wrong order, so it would mount /a/b/c, then try to mount /a/b and complain ``directory not empty''. I'm not sure why it mounts things in the right order at boot/import, but in haphazard order after one of these auto-onlines. Then NFS exporting didn't work either. To fix, I have to 'zfs umount /a/b/c', but then there is a b/c directory inside filesystem /a, so I have to 'rmdir /a/b/c' by hand because the '... set mountpoint' koolaid creates the directories but doesn't remove them. Then 'zfs mount -a' and 'zfs share -a'.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenSolaris vs Linux
On Wed, Dec 3, 2008 at 4:15 PM, Joseph Zhou [EMAIL PROTECTED]wrote: haha, Tim, yes, I see the Open spirit in this reply! ;-) As I said, I am just exploring data. The Sun J4000 SPC1 and SPC2 benchmark results were nice, just lacking other published results with the iSCSI HBA as DAS, not as a network storage device (as 7000). Though I would attempt to say those results can be a basis for 7000 block-performance... any comment? Thanks! z I'd imagine you'll see far better performance out of the 7000 with their use of flash. Only time will tell though :) --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OpenSolaris vs Linux
Joseph Zhou wrote: Thanks Ian, Tim, Ok, let me really hit one topic instead of trying to see in general what data are out there... Let's say OpenSolaris doing Samba vs. Linux doing Samba, in CIFS performance. (so I can link to the Win2008 CIFS numbers and NetApp CIFS numbers myself.) Is there any data to this specific point? I think what we are telling you is the only way to find the numbers you want for your configuration is to do your own tests. There are just too many variables for other people's data to be truly relevant. One of the benefits of Open Source is you only have to pay for your time to run tests. As Tim said, there's no point in limiting OpenSolaris to Samba. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs free space
Even if that maximum number of shared collections is that much, I think the information should be available if requested, even if it becomes similar to a 'scrub' operation in terms of the time taken. For a filesystem like mine that only has 10 or so snapshots, I'd really only expect 3 or 4 of the shared collections to stand out in terms of disk usage. For filesystems with 100's of snapshots, they can filter the data as desired. It would probably take a long time but at least the can get the information if they really need it. It's better than not having access to the information at all. I'm not sure if time could be saved by requesting disk usage for only a single shared collection, but if it does that could also be an option as you suggest. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool replace - choke point
I think we found the choke point. The silver lining is that it isn't the T2000 or ZFS. We think it is the new SAN, a Hitachi AMS1000, which has 7200RPM SATA disks with the cache turned off. This system has a very small cache, and when we did turn it on for one of the replacement LUNs we saw a 10x improvement - until the cache filled up about 1 minute later (we were watching with zpool iostat). Oh well. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang
Thanks Max and Chris. I don't really want the problem to occur again, of course, but I'll be prepared if it does. On Wed, Dec 3, 2008 at 6:46 PM, Chris Siebenmann [EMAIL PROTECTED] wrote: You write: | If keyboard input is not getting in, your machine is probably wedged | at a high level interrupt, which sounds doubtful based on your | problem description. | Out of curiosity, why do you say that? I'm no expert on interrupts, so | I'm curious. It DOES seem that keyboard entry is ignored in this | situation, since I see no results from ctrl-c, for example (I had left | the console running 'tail -f /var/adm/messages'. I'm not saying your | are wrong, but if I should be examining interrupt issues, I'd like to | know (I have 3 hard disk controllers in the box, for example...) ^C handling requires a great deal of high-level kernel infrastructure to be working, far beyond basic interrupt handling. To get much visible reaction in a situation where nothing is producing output, for example, the system has to be able to get all the way to running your shell so that it can notice that tail has died and print the shell prompt. By contrast, if the console echoes '^C', you have a fair amount of interrupt handling. The Solaris kernel debugger hooks in to the system at a fairly low level (I believe significantly lower than all of the things that have to be working to even echo '^C', much less get all the way to executing user-level code). Thus, you can get into it and force-crash your system even if it is otherwise fairly dead, so I think that trying is well worth it in your situation. --- I shall clasp my hands together and bow to the corners of the world. Number Ten Ox, Bridge of Birds Chris Siebenmann [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zpool cannot replace a replacing device
I have a 10 drive raidz, recently one of the disks appeared to be generating errors (this later turned out to be a cable). I removed the disk from the array and ran vendor diagnostics (which zeroed it). Upon reinstalling the disk, however, zfs will not resilver it; it gets referred to numerically instead of by device name, and when i try to replace it, i get:

# zpool replace data 17096229131581286394 c0t2d0
cannot replace 17096229131581286394 with c0t2d0: cannot replace a replacing device

if i try to detach it i get:

# zpool detach data 17096229131581286394
cannot detach 17096229131581286394: no valid replicas

current zpool output looks like:

# zpool status -v
  pool: data
 state: DEGRADED
 scrub: none requested
config:

        NAME                        STATE     READ WRITE CKSUM
        data                        DEGRADED     0     0     0
          raidz1                    DEGRADED     0     0     0
            c0t0d0                  ONLINE       0     0     0
            c0t1d0                  ONLINE       0     0     0
            replacing               UNAVAIL      0   543     0  insufficient replicas
              17096229131581286394  FAULTED      0   581     0  was /dev/dsk/c0t2d0s0/old
              11342560969745958696  FAULTED      0   582     0  was /dev/dsk/c0t2d0s0
            c0t3d0                  ONLINE       0     0     0
            c0t4d0                  ONLINE       0     0     0
            c0t5d0                  ONLINE       0     0     0
            c0t6d0                  ONLINE       0     0     0
            c0t7d0                  ONLINE       0     0     0
            c2t2d0                  ONLINE       0     0     0
            c2t3d0                  ONLINE       0     0     0

errors: No known data errors

i have also tried exporting and reimporting the pool. any help would be greatly appreciated. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Tue, Dec 02, 2008 at 12:22:49PM -0800, Vincent Fox wrote: Reviving this thread. We have a Solaris 10u4 system recently patched with 137137-09. Unfortunately the patch was applied from multi-user mode, I wonder if this may have been original posters problem as well? Anyhow we are now stuck No - in my case it was a 'not enough space' on / problem, not the multi-user mode ;-). Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] help diagnosing system hang
Hi all,

First, I'll say my intent is not to spam a bunch of lists, but after posting to opensolaris-discuss I had someone communicate with me offline that these lists would possibly be a better place to start. So here we are. For those on all three lists, sorry for the repetition.

Second, this message is meant to solicit help in diagnosing the issue described below. Any hints on how DTrace may help, or where in general to start, would be much appreciated.

Back to the subject at hand.
---
I'm testing an application which makes use of a large file mmap'd into memory, as if the application were using malloc(). The file is roughly 2x the size of physical ram. Basically, I'm seeing the system stall for long periods of time, 60+ seconds, and then resume. The file lives on an SSD (Intel x25-e) and I'm using zfs's lzjb compression to make more efficient use of the ~30G of space provided by that SSD.

The general flow of things is: start the application, ask it to use a 50G file. The file is created in a sparse manner at the location designated, then mmap is called on the entire file. All fine up to this point. I then start loading data into the application, and it starts pushing data to the file as you'd expect. Data is pushed to the file early and often, as it's mmap'd with the MAP_SHARED flag. But, when the application's resident size reaches about 80% of the physical ram on the system, the system starts paging and things are still working relatively well, though slower, as expected. Soon after, when reaching about 40G of data, I get stalls accessing the SSD (according to iostat), in other words, no IO to that drive.

When I started looking into what could be causing it, such as IO timeouts, I run dmesg and it hangs after printing a timestamp. I can ctrl-c dmesg, but subsequent runs provide no better results. I see no new messages in /var/adm/messages, as I'd expect. Eventually the system recovers; the latest case took over 10 minutes to recover, after killing the application mentioned above, and I do see disk timeouts in dmesg.

So, I can only assume that there's either a driver bug in the SATA/SAS controller I'm using and it's throwing timeouts, or the SSD is having issues. Looking at the zpool configuration, I see that failmode=wait, and since that SSD is the only member of the zpool I would expect IO to hang. But, does that mean that dmesg should hang also? Does that mean that the kernel has at least one thread stuck? Would failmode=continue be more desired, or resilient?

During the hang, load-avg is artificially high, fmd being the one process that sticks out in prstat output. But fmdump -v doesn't show anything relevant. Anyone have ideas on how to diagnose what's going on there?

Thanks, Ethan

System: Sun x4240 dual-amd2347, 32G of ram
SAS/SATA Controller: LSI3081E
OS: osol snv_98
SSD: Intel x25-e

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
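(A quick way to inspect and change the pool property Ethan mentions; "ssdpool" is a made-up pool name:)

    # zpool get failmode ssdpool
    NAME     PROPERTY  VALUE  SOURCE
    ssdpool  failmode  wait   default
    # zpool set failmode=continue ssdpool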
Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun X4150/X4450
Aaron Blew aaronblew at gmail.com writes:

I've done some basic testing with a X4150 machine using 6 disks in a RAID 5 and RAID Z configuration. They perform very similarly, but RAIDZ definitely has more system overhead.

Since hardware RAID 5 implementations usually do not checksum data (they only compute the parity, which is not the same thing), for an apples-to-apples performance comparison you should have benchmarked raidz with checksum=off. Is that what you did?

-marc ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
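(A sketch of the comparison Marc suggests; the pool and device names are hypothetical:)

    # zpool create bench raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0
    # zfs set checksum=off bench
    (run the benchmark, then repeat with the default checksum=on to measure the difference)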