Re: [zfs-discuss] Asymmetric zpool load

2008-12-03 Thread Carsten Aulbert
Ross wrote:
 Aha, found it!  It was this thread, also started by Carsten :)
 http://www.opensolaris.org/jive/thread.jspa?threadID=78921&tstart=45

Did I? Darn, I need to get a brain upgrade.

But yes, there it was mainly focused on zfs send/receive being slow -
but maybe these are also linked.

What I will try today/this week:

Put some stress on the system with bonnie and other tools and try to
find slow disks, and see if this could be the main problem, but also look
into more vdevs and then possibly move to raidz to somehow compensate
for the lost disk space. Since we have 4 cold spares on the shelf plus SMS
warnings on disk failures (that is, if fma catches them), the risk
involved should be tolerable.
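
A sketch of such a stress run, assuming bonnie++ as the benchmark; the
directory, size and user below are placeholders, and the size should
comfortably exceed physical RAM so the ARC cannot hide slow disks:

  # one sequential write/read pass against a scratch directory on the pool
  bonnie++ -d /atlashome/bench -s 65536 -u nobody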

More later.

Carsten
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Asymmetric zpool load

2008-12-03 Thread Marc Bevand
Carsten Aulbert carsten.aulbert at aei.mpg.de writes:
 
 Put some stress on the system with bonnie and other tools and try to
 find slow disks

Just run iostat -Mnx 2 (not zpool iostat) while ls is slow to find the slow 
disks. Look at the %b (busy) values.
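
As a rough sketch of reading that output, assuming the standard iostat -Mnx
column layout (%b is the second-to-last column, the device name the last) and
an arbitrary 60% busy threshold:

  # print any device busier than 60% in each 2-second sample
  iostat -Mnx 2 | awk '$1 ~ /^[0-9]/ && $10+0 > 60 { print $11, $10 "% busy" }'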

-marc

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs free space

2008-12-03 Thread Robert Milkowski
Hello Sanjeev,

Wednesday, December 3, 2008, 5:20:47 AM, you wrote:

S Hi,

S A good rough estimate would be the total of the space
S that is displayed under the USED column of zfs list for those snapshots.

S Here is an example :
S -- snip --
S [EMAIL PROTECTED] zfs list -r tank
S NAME                    USED  AVAIL  REFER  MOUNTPOINT
S tank                   24.6M  38.9M    19K  /tank
S tank/fs1               24.4M  38.9M    18K  /tank/fs1
S tank/[EMAIL PROTECTED]  24.4M      -  24.4M  -
S -- snip --

S In the above case tank/[EMAIL PROTECTED] is using 24.4M. So, if we delete
S that snapshot it would free up about 24.4M. Let's delete it and
S see what we get:

S -- snip --
S [EMAIL PROTECTED] zfs destroy tank/[EMAIL PROTECTED]
S [EMAIL PROTECTED] zfs list -r tank
S NAME       USED  AVAIL  REFER  MOUNTPOINT
S tank       220K  63.3M    19K  /tank
S tank/fs1    18K  63.3M    18K  /tank/fs1
S -- snip --

S So, we did get the 24.4M back (38.9M + 24.4M = 63.3M).

S Note that this could get a little complicated if there are multiple
S snapshots which refer to the same set of blocks. So, even after deleting
S one snapshot you might not see the space freed up, because a second
S snapshot may still be referring to some of those blocks.


That's what I meant by:

 I'm afraid you can do only one at a time.

The problem is that once you have several snapshots and you want to
calculate how much space you will re-gain if you delete two (or more)
of them, you just can't calculate it by looking at the zfs list output.
All you can say is how much space you will re-gain at least, which is
the sum of the USED column for the snapshots to be deleted - but if they
share some blocks, then you might or might not (depending on whether yet
other snapshots share some of these shared blocks) re-gain much more.
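
A minimal sketch of that lower bound, assuming a zfs get that supports -H and
-p (parseable byte counts) and using placeholder snapshot names:

  # sum the USED column of the snapshots you intend to destroy (lower bound only)
  zfs get -Hp -o value used tank/fs1@snap1 tank/fs1@snap2 | \
      awk '{ sum += $1 } END { printf("at least %.1f MB would be freed\n", sum / 1048576) }'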


-- 
Best regards,
 Robert Milkowskimailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Asymmetric zpool load

2008-12-03 Thread Carsten Aulbert
Carsten Aulbert wrote:

 Put some stress on the system with bonnie and other tools and try to
 find slow disks, and see if this could be the main problem, but also look
 into more vdevs and then possibly move to raidz to somehow compensate
 for the lost disk space. Since we have 4 cold spares on the shelf plus SMS
 warnings on disk failures (that is, if fma catches them), the risk
 involved should be tolerable.

First result with bonnie: during the Writing intelligently... phase I
see this in a 2-minute average:

zpool iostats:

                capacity     operations    bandwidth
pool          used  avail   read  write   read  write
----------  ------  -----  -----  -----  -----  -----
atlashome    1.70T  19.2T    225  1.49K   342K   107M
  raidz2      550G  6.28T     74    409   114K  32.6M
    c0t0d0       -      -      0    314  32.3K  2.51M
    c1t0d0       -      -      0    315  31.8K  2.52M
    c4t0d0       -      -      0    313  31.3K  2.52M
    c6t0d0       -      -      0    315  32.3K  2.51M
    c7t0d0       -      -      0    326  32.8K  2.50M
    c0t1d0       -      -      0    309  33.9K  2.52M
    c1t1d0       -      -      0    313  33.4K  2.51M
    c4t1d0       -      -      0    314  33.4K  2.52M
    c5t1d0       -      -      0    308  32.8K  2.52M
    c6t1d0       -      -      0    314  31.3K  2.51M
    c7t1d0       -      -      0    311  31.8K  2.52M
    c0t2d0       -      -      0    309  31.8K  2.52M
    c1t2d0       -      -      0    313  31.8K  2.51M
    c4t2d0       -      -      0    315  31.8K  2.52M
    c5t2d0       -      -      0    307  32.8K  2.52M
  raidz2      567G  6.26T     64    529  96.5K  36.3M
    c6t2d0       -      -      1    368  74.2K  2.79M
    c7t2d0       -      -      1    366  74.2K  2.80M
    c0t3d0       -      -      1    364  75.8K  2.80M
    c1t3d0       -      -      1    365  75.2K  2.80M
    c4t3d0       -      -      1    368  76.8K  2.80M
    c5t3d0       -      -      1    362  76.3K  2.80M
    c6t3d0       -      -      1    366  77.9K  2.80M
    c7t3d0       -      -      1    365  76.8K  2.80M
    c0t4d0       -      -      1    361  76.8K  2.80M
    c1t4d0       -      -      1    363  75.8K  2.80M
    c4t4d0       -      -      1    366  76.3K  2.80M
    c6t4d0       -      -      1    364  78.4K  2.80M
    c7t4d0       -      -      1    370  78.9K  2.79M
    c0t5d0       -      -      1    365  77.3K  2.80M
    c1t5d0       -      -      1    364  74.7K  2.80M
  raidz2      620G  6.64T     86    582   131K  37.9M
    c4t5d0       -      -     18    382  1.16M  2.74M
    c5t5d0       -      -     10    380   674K  2.74M
    c6t5d0       -      -     18    378  1.15M  2.73M
    c7t5d0       -      -      9    384   628K  2.74M
    c0t6d0       -      -     18    377  1.16M  2.74M
    c1t6d0       -      -     10    383   680K  2.75M
    c4t6d0       -      -     19    379  1.21M  2.73M
    c5t6d0       -      -     10    383   691K  2.75M
    c6t6d0       -      -     19    379  1.21M  2.73M
    c7t6d0       -      -     10    383   676K  2.72M
    c0t7d0       -      -     18    374  1.19M  2.75M
    c1t7d0       -      -     10    381   676K  2.74M
    c4t7d0       -      -     19    380  1.22M  2.74M
    c5t7d0       -      -     10    382   696K  2.74M
    c6t7d0       -      -     18    381  1.17M  2.74M
    c7t7d0       -      -      9    386   631K  2.75M
----------  ------  -----  -----  -----  -----  -----

iostat -Mnx 120:
extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t0d0
    0.0    1.4    0.0    0.0  0.0  0.0    1.5    0.4   0   0 c5t0d0
    0.6  351.5    0.0    2.6  0.4  0.1    1.2    0.2   3   8 c7t0d0
    0.6  336.3    0.0    2.6  0.1  0.1    0.4    0.2   3   7 c0t0d0
    0.6  340.8    0.0    2.6  0.2  0.1    0.6    0.2   3   7 c1t0d0
    0.6  330.6    0.0    2.6  0.1  0.1    0.3    0.2   3   7 c5t1d0
    0.6  336.7    0.0    2.6  0.1  0.1    0.3    0.2   3   7 c4t0d0
    0.6  331.8    0.0    2.6  0.1  0.1    0.3    0.2   3   7 c0t1d0
    0.6  339.0    0.0    2.6  0.4  0.1    1.1    0.2   3   7 c7t1d0
    0.6  335.4    0.0    2.6  0.1  0.1    0.4    0.2   3   7 c1t1d0
    0.6  329.2    0.0    2.6  0.1  0.1    0.3    0.2   3   7 c5t2d0
    0.6  343.7    0.0    2.6  0.3  0.1    0.7    0.2   3   7 c4t1d0
    0.6  331.8    0.0    2.6  0.1  0.1    0.3    0.2   2   7 c0t2d0
    1.2  396.3    0.1    2.9  0.3  0.1    0.7    0.2   4   8 c7t2d0
    0.6  336.7    0.0    2.6  0.1  0.1    0.4    0.2   3   7 c1t2d0
    0.6  341.9    0.0    2.6  0.2  0.1    0.7    0.2   3   7 c4t2d0
    1.3  390.7    0.1    2.9  0.3  0.1    0.8    0.2   4   9 c5t3d0
    1.3  396.7    0.1    2.9  0.3  0.1    0.8    0.2   4   9 c7t3d0
    1.3  393.6    0.1    2.9  0.2  0.1    0.6    0.2   4   9 c0t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t4d0
    1.3  396.2    0.1    2.9  0.2  0.1    0.5

Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread Blake Irvin
I'm having a very similar issue.  Just updated to 10u6 and upgraded my zpools.  
They are fine (all 3-way mirrors), but I've lost the machine around 12:30am two 
nights in a row.

I'm booting ZFS root pools, if that makes any difference.

I also don't see anything in dmesg, nothing on the console either.

I'm going to go back to the logs today to see what was going on around midnight 
on these occasions.  I know there are some built-in cronjobs that run around 
that time - perhaps one of them is the culprit.

What I'd really like is a way to force a core dump when the machine hangs like 
this.  scat is a very nifty tool for debugging such things - but I'm not 
getting a core or panic or anything :(
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread Jacob Ritorto
Update:  It would appear that the bug I was complaining about nearly a 
year ago is still at play here:  
http://opensolaris.org/jive/thread.jspa?threadID=49372&tstart=0

Unfortunate Solution:  Ditch Solaris 10 and run Nevada.  The nice folks 
in the OpenSolaris project fixed the problem a long time ago.

This means that I can't have Sun support until Nevada becomes a 
real product, but it's better than having a silent failure every time 
6GB crosses the wire.  My big question is why won't they fix it in 
Solaris 10?  Sun's depriving themselves of my support revenue stream and 
I'm stuck with an unsupportable box as my core filer.  Bad situation on 
so many levels..  If it weren't for the stellar quality of the Nevada 
builds (b91 uptime=132 days now with no problems), I'd not be sleeping 
much at night..  Imagine my embarrassment had I taken the high road and 
spent the $$$ for a Thumper for this purpose..








___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread Tim
On Wed, Dec 3, 2008 at 7:49 AM, Jacob Ritorto [EMAIL PROTECTED]wrote:

 Update:  It would appear that the bug I was complaining about nearly a
 year ago is still at play here:
 http://opensolaris.org/jive/thread.jspa?threadID=49372&tstart=0

 Unfortunate Solution:  Ditch Solaris 10 and run Nevada.  The nice folks
 in the OpenSolaris project fixed the problem a long time ago.

This means that I can't have Sun support until Nevada becomes a
 real product, but it's better than having a silent failure every time
 6GB crosses the wire.  My big question is why won't they fix it in
 Solaris 10?  Sun's depriving themselves of my support revenue stream and
 I'm stuck with an unsupportable box as my core filer.  Bad situation on
 so many levels..  If it weren't for the stellar quality of the Nevada
 builds (b91 uptime=132 days now with no problems), I'd not be sleeping
 much at night..  Imagine my embarrassment had I taken the high road and
 spent the $$$ for a Thumper for this purpose..



Can't you just run opensolaris?  They've got support contracts for that, and
the bug should be fixed in 2008.11.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-12-03 Thread Ross
Ok, I've done some more testing today and I almost don't know where to start.

I'll begin with the good news for Miles :)
- Rebooting doesn't appear to cause ZFS to lose the resilver status (but see 
1. below)
- Resilvering appears to work fine; once complete, I never saw any checksum 
errors when scrubbing the pool.
- Reconnecting iscsi drives causes ZFS to automatically online the pool and 
automatically begin resilvering.

And now the bad news:
1.  While rebooting doesn't seem to cause the resilver to lose its status, 
something is causing it problems.  I saw it restart several times.
2.  With iscsi, you can't reboot with sendtargets enabled; static discovery 
still seems to be the order of the day.
3.  There appears to be a disconnect between what iscsiadm knows and what ZFS 
knows about the status of the devices.  

And I have confirmation of some of my earlier findings too:
4.  iSCSI still has a 3 minute timeout, during which time your pool will hang, 
no matter how many redundant drives you have available.
5.  zpool status can still hang when a device goes offline, and when it 
finally recovers, it will then report out of date information.  This could be 
Bug 6667199, but I've not seen anybody reporting the incorrect information part 
of this.
6.  After one drive goes offline, during the resilver process, zpool status 
shows that information is being resilvered on the good drives.  Does anybody 
know why this happens?
7.  Although ZFS will automatically online a pool when iscsi devices come 
online, CIFS shares are not automatically remounted.

I also have a few extra notes about a couple of those:

1 - resilver losing status
===
Regarding the resilver restarting, I've seen it reported that zpool status 
can cause this when run as admin, but I'm not convinced that's the cause.  Same 
for the rebooting problem.  I was able to run zpool status dozens of times as 
an admin, but only two or three times did I see the resilver restart.

Also, after rebooting, I could see that the resilver was showing that it was 
66% complete, but then a second later it restarted.

Now, none of this is conclusive.  I really need to test with a much larger 
dataset to get an idea of what's really going on, but there's definitely 
something weird happening here.

3 - disconnect between iscsiadm and ZFS
=
I repeated my test of offlining an iscsi target, this time checking iscsiadm to 
see when it disconnected. 

What I did was wait until iscsiadm reported 0 connections to the target, and 
then started a CIFS file copy and ran zpool status.

Zpool status hung as expected, and a minute or so later, the CIFS copy failed.  
It seems that although iscsiadm was aware that the target was offline, ZFS did 
not yet know about it.  As expected, a minute or so later, zpool status 
completed (returning incorrect results), and I could then run the CIFS copy 
fine.
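
A small sketch of that check sequence (the pool name is a placeholder); the
point is simply to compare what the initiator reports with how zpool status
behaves at the same moment:

  # what does the initiator believe right now?
  iscsiadm list target | egrep 'Target:|Connections:'
  # and how long does ZFS take to answer?
  time zpool status tank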

5 - zpool status hanging and reporting incorrect information
===
When an iSCSI device goes offline, if you immediately run zpool status, it 
hangs for 3-4 minutes.  Also, when it finally completes, it gives incorrect 
information, reporting all the devices as online.

If you immediately re-run zpool status, it completes rapidly and will now 
correctly show the offline devices.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread Blake
I think my problem is actually different - I'm not using iSCSI at all.
 I will update if I find otherwise.

And yes, I do think there is support available for OpenSolaris now:
http://www.sun.com/service/opensolaris/faq.xml

Blake



On Wed, Dec 3, 2008 at 9:32 AM, Tim [EMAIL PROTECTED] wrote:
 On Wed, Dec 3, 2008 at 7:49 AM, Jacob Ritorto [EMAIL PROTECTED]
 wrote:

 Update:  It would appear that the bug I was complaining about nearly a
 year ago is still at play here:
 http://opensolaris.org/jive/thread.jspa?threadID=49372tstart=0

 Unfortunate Solution:  Ditch Solaris 10 and run Nevada.  The nice folks
 in the OpenSolaris project fixed the problem a long time ago.

This means that I can't have Sun support until Nevada becomes a
 real product, but it's better than having a silent failure every time
 6GB crosses the wire.  My big question is why won't they fix it in
 Solaris 10?  Sun's depriving themselves of my support revenue stream and
 I'm stuck with an unsupportable box as my core filer.  Bad situation on
 so many levels..  If it weren't for the stellar quality of the Nevada
 builds (b91 uptime=132 days now with no problems), I'd not be sleeping
 much at night..  Imagine my embarrassment had I taken the high road and
 spent the $$$ for a Thumper for this purpose..


 Can't you just run opensolaris?  They've got support contracts for that, and
 the bug should be fixed in 2008.11.

 --Tim


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun X4150/X4450

2008-12-03 Thread Aaron Blew
I've done some basic testing with a X4150 machine using 6 disks in a RAID 5
and RAID Z configuration.  They perform very similarly, but RAIDZ definitely
has more system overhead.  In many cases this won't be a big deal, but if
you need as many CPU cycles as you can muster, hardware RAID may be your
better choice.

-Aaron

On Tue, Dec 2, 2008 at 4:22 AM, Vikash Gupta [EMAIL PROTECTED] wrote:

  Hi,



 Has anyone implemented the Hardware RAID 1/5 on Sun X4150/X4450 class of
 servers .

 Also any comparison between ZFS Vs H/W Raid ?



 I would like to know the experience (good/bad) and the pros/cons?



 Regards,

 Vikash



 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs_nocacheflush, nvram, and root pools

2008-12-03 Thread Neil Perrin
On 12/02/08 03:47, River Tarnell wrote:
 hi,
 
 i have a system connected to an external DAS (SCSI) array, using ZFS.  the
 array has an nvram write cache, but it honours SCSI cache flush commands by
 flushing the nvram to disk.  the array has no way to disable this behaviour.  a
 well-known behaviour of ZFS is that it often issues cache flush commands to
 storage in order to ensure data integrity; while this is important with normal
 disks, it's useless for nvram write caches, and it effectively disables the
 cache.
 
 so far, i've worked around this by setting zfs_nocacheflush, as described at
 [1], which works fine.  but now i want to upgrade this system to Solaris 10
 Update 6, and use a ZFS root pool on its internal SCSI disks (previously, the
 root was UFS).  the problem is that zfs_nocacheflush applies to all pools,
 which will include the root pool.
 
 my understanding of ZFS is that when run on a root pool, which uses slices
 (instead of whole disks), ZFS won't enable the write cache itself.  i also
 didn't enable the write cache manually.  so, it _should_ be safe to use
 zfs_nocacheflush, because there is no caching on the root pool.
 
 am i right, or could i encounter problems here?

Yes you are right and this should work. You may want to check that
the write cache is disabled on the root pool disks
using 'format -e' + cache + write_cache + display.
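
For reference, a sketch of the workaround being discussed, assuming the
zfs_nocacheflush tunable is set the usual way via /etc/system (it affects every
pool on the host and takes effect at the next boot):

  * /etc/system - disable ZFS cache flushes globally
  set zfs:zfs_nocacheflush = 1

  # then confirm the write cache really is off on each root pool disk:
  # format -e  ->  select disk  ->  cache  ->  write_cache  ->  display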

 
 (the system is an NFS server, which means lots of synchronous writes (and
 therefore ZFS cache flushes), so i *really* want the performance benefit from
 using the nvram write cache.)

Indeed, performance would be bad without it.

 
   - river.

Neil.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Flash Archive Support for ZFS?

2008-12-03 Thread Matt Walburn
zfs-discuss,

Now that we've finally got support for ZFS root filesystems on Solaris 10, I
was wondering if anyone knows what the status is for ZFS Flash Archive.
Presumably it's got to use ZFS send/receive functionality, but is rolling
that into FlashArchive something that's on the roadmap?

Thanks,
Matthew

--
Matt Walburn
http://mattwalburn.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot

2008-12-03 Thread Vincent Fox
Followup to my own post.

Looks like my SVM setup was having problems prior to patch being applied.
If I boot net:dhcp -s and poke around on the disks, it looks like disk0 is
pre-patch state and disk1 is post-patch.

I can get a shell if I
boot disk1 -s

So I think I am in SVM hell here, not specifically the ZFS patch breaking my box.
Never mind!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-12-03 Thread Maurice Volaski
2.  With iscsi, you can't reboot with sendtargets enabled, static 
discovery still seems to be the order of the day.

I'm seeing this problem with static discovery: 
http://bugs.opensolaris.org/view_bug.do?bug_id=6775008.

4.  iSCSI still has a 3 minute timeout, during which time your pool 
will hang, no matter how many redundant drives you have available.

This is CR 649, 
http://bugs.opensolaris.org/view_bug.do?bug_id=649, which is 
separate from the boot time timeout, though, and also one that Sun so 
far has been unable to fix!
-- 

Maurice Volaski, [EMAIL PROTECTED]
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread [EMAIL PROTECTED]
Hi Blake,

Blake Irvin wrote:
 I'm having a very similar issue.  Just updated to 10u6 and upgraded my 
 zpools.  They are fine (all 3-way mirrors), but I've lost the machine around 
 12:30am two nights in a row.


 What I'd really like is a way to force a core dump when the machine hangs 
 like this.  scat is a very nifty tool for debugging such things - but I'm not 
 getting a core or panic or anything :(
   
You can force a dump.  Here are the steps:

Before the system is hung:

# mdb -K -F   -- this will load kmdb and drop into it

Don't worry if your system now seems hung.
Type, carefully, with no typos:

:c   -- and carriage-return.  You should get your prompt back

Now, when the system is hung, type F1-a (that's function key F1 and the
'a' key together).
This should put you into kmdb.  Now type (again, no typos):

$<systemdump

This should give you a panic dump, followed by a reboot (unless your 
system is hard-hung).
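
Once the box comes back from the forced panic, a sketch of pulling the dump
apart (paths assume the default dumpadm/savecore setup; ::stacks needs a
reasonably recent mdb, ::threadlist -v works on older builds):

  # extract the dump if savecore did not already run at boot
  savecore -v
  cd /var/crash/`hostname`
  # open the dump and take a first look: panic status, message buffer, ZFS stacks
  mdb unix.0 vmcore.0
  > ::status
  > ::msgbuf
  > ::stacks -m zfs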

max


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-12-03 Thread Ross Smith
Yeah, thanks Maurice, I just saw that one this afternoon.  I guess you
can't reboot with iscsi full stop... o_0

And I've seen the iscsi bug before (I was just too lazy to look it up
lol), I've been complaining about that since February.

In fact it's been a bad week for iscsi here, I've managed to crash the
iscsi client twice in the last couple of days too (full kernel dump
crashes), so I'll be filing a bug report on that tomorrow morning when
I get back to the office.

Ross


On Wed, Dec 3, 2008 at 7:39 PM, Maurice Volaski [EMAIL PROTECTED] wrote:
 2.  With iscsi, you can't reboot with sendtargets enabled, static
 discovery still seems to be the order of the day.

 I'm seeing this problem with static discovery:
 http://bugs.opensolaris.org/view_bug.do?bug_id=6775008.

 4.  iSCSI still has a 3 minute timeout, during which time your pool will
 hang, no matter how many redundant drives you have available.

 This is CR 649, http://bugs.opensolaris.org/view_bug.do?bug_id=649,
 which is separate from the boot time timeout, though, and also one that Sun
 so far has been unable to fix!
 --

 Maurice Volaski, [EMAIL PROTECTED]
 Computing Support, Rose F. Kennedy Center
 Albert Einstein College of Medicine of Yeshiva University

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs free space

2008-12-03 Thread none
Hi Sanjeev and Milek, thanks for your replies but I'm afraid they are somewhat 
missing the point.
I have a situation (and I believe it would be fairly common) where early 
snapshots would be sharing most data with the current filesystem and more 
recent snapshots are holding onto data that has been deleted from the current 
filesystem (after a recent big deletion of unused data).

It is impossible to see what snapshots would need to be deleted to free up the 
space that was deleted from the current fs, without deleting the snapshots one 
by one. I think in any filesystem it is reasonable to expect to be able to 
determine what entities use up disk space. But ZFS is currently lacking this 
for snapshots.

See the below USED column. The total USED adds up to about 11G. However the 
total data consumed by the snapshots is more in the range of 100G. But from the 
listing below it is simply impossible to see which snapshots are using it.

NAME                   USED  AVAIL  REFER  MOUNTPOINT
storage                457G   127G  28.4K  /storage
storage/myfilesystem   457G   127G   251G  /storage/myfilesystem

NAME                           USED  AVAIL  REFER  MOUNTPOINT
[EMAIL PROTECTED]                 0      -  28.4K  -
[EMAIL PROTECTED]                 0      -  28.4K  -
storage/[EMAIL PROTECTED]     4.26G      -   187G  -
storage/[EMAIL PROTECTED]     61.1M      -   206G  -
storage/[EMAIL PROTECTED]      773M      -   201G  -
storage/[EMAIL PROTECTED]     33.2M      -   192G  -
storage/[EMAIL PROTECTED]     62.6M      -   212G  -
storage/[EMAIL PROTECTED]     5.29G      -   217G  -
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Joseph Zhou
Hi list,

Any one has ANY new data on OpenSolaris vs Linux?

I only found an old post in 2006.
http://mail.opensolaris.org/pipermail/zfs-discuss/2006-January/030366.html

And any comments on if OpenSolaris performance is about the same as Solaris 
10?

Thanks!
z
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread [EMAIL PROTECTED]
Hi Blake,

Blake Irvin wrote:
 Thanks - however, the machine hangs and doesn't even accept console input 
 when this occurs.  I can't get into the kernel debugger in these cases.
   
Are you directly on the console, or is the console on a serial port?  If 
you are
running over X windows, the input might still get in, but X may not be 
displaying.
If keyboard input is not getting in, your machine is probably wedged at 
a high
level interrupt, which sounds doubtful based on your problem description.

 I've enabled the deadman timer instead.  I'm also using the automatic 
 snapshot service to get a look at things like /var/adm/sa/sa** files that get 
 overwritten after a hard reset.
   
If the deadman timer does not trigger, the clock is almost certainly 
running, and your machine is
almost certainly accepting keyboard input.

Good luck,
max

 I'm just going to stay up late tonight and see what happens :)

 Blake




   
 Hi Blake,

 Blake Irvin wrote:
 
 I'm having a very similar issue.  Just updated to
   
 10 u6 and upgrade my zpools.  They are fine (all
 3-way mirors), but I've lost the machine around
 12:30am two nights in a row.
 
 What I'd really like is a way to force a core dump
   
 when the machine hangs like this.  scat is a very
 nifty tool for debugging such things - but I'm not
 getting a core or panic or anything :(
 
   
   
 You can force a dump.  Here are the steps:

 Before the system is hung:

 # mdb -K -F   -- this will load kmdb and drop into
 it

 Don't worry if your system now seems hung.
 Type, carefully, with no typos:

 :c   -- and carriage-return.  You should get your
 prompt back

 Now, when the system is hung, type F1-a  (that's
 function key f1 and the 
 a key together.
 This should put you into kmdb.  Now, type, (again, no
 typos):

 $<systemdump

 This should give you a panic dump, followed by
 reboot,  (unless your 
 system is hard-hung).

 max


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Tim
On Wed, Dec 3, 2008 at 2:39 PM, Joseph Zhou [EMAIL PROTECTED]wrote:

  Hi list,

 Any one has ANY new data on OpenSolaris vs Linux?

 I only found an old post in 2006.
 http://mail.opensolaris.org/pipermail/zfs-discuss/2006-January/030366.html

 And any comments on if OpenSolaris performance is about the same as
 Solaris 10?

 Thanks!
 z


That's kind of open ended.  What sort of performance are you looking for?
NFS throughput?  Software raid?  What distro vs. Solaris?

Opensolaris and Solaris are going to have different performance based on
what exactly it is you're testing.  Similar is probably accurate for a lot
of things, but not everything.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Joseph Zhou
Thanks Tim,
At this moment, I am looking into OpenStorage as NAS (file serving) vs. Linux 
NAS (Samba) vs. Win2008 NAS vs. NetApp (ONTAP, not GX) performance.

I am also interested in block-based performance, but not as urgent as above. 
(Since 7000 is mainly doing NAS today, in a non-HPC-clustered fashion without 
Lustre. With Lustre, the performance competitive focuses are different from 
above).

Thanks,
z  

  - Original Message - 
  From: Tim 
  To: Joseph Zhou 
  Cc: zfs-discuss@opensolaris.org 
  Sent: Wednesday, December 03, 2008 4:04 PM
  Subject: Re: [zfs-discuss] OpenSolaris vs Linux





  On Wed, Dec 3, 2008 at 2:39 PM, Joseph Zhou [EMAIL PROTECTED] wrote:

Hi list,

Any one has ANY new data on OpenSolaris vs Linux?

I only found an old post in 2006.
http://mail.opensolaris.org/pipermail/zfs-discuss/2006-January/030366.html

And any comments on if OpenSolaris performance is about the same as 
Solaris 10?

Thanks!
z



  That's kind of open ended.  What sort of performance are you looking for?  
NFS throughput?  Software raid?  What distro vs. Solaris?

  Opensolaris and Solaris are going to have different performance based on what 
exactly it is you're testing.  Similar is probably accurate for a lot of 
things, but not everything.

  --Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Ian Collins
Joseph Zhou wrote:
 Thanks Tim,
 At this moment, I am looking into OpenStorage as NAS (file serving)
 vs. Linux NAS (Samba) vs. Win2008 NAS vs. NetApp (ONTAP, not GX)
 performance.
  
There are still a number of ZFS/OpenSolaris options to compare: iSCSI,
Samba, CIFS, NFS.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Tim
On Wed, Dec 3, 2008 at 3:11 PM, Joseph Zhou [EMAIL PROTECTED]wrote:

  Thanks Tim,
 At this moment, I am looking into OpenStorage as NAS (file serving) vs.
 Linux NAS (Samba) vs. Win2008 NAS vs. NetApp (ONTAP, not GX) performance.

 I am also interested in block-based performance, but not as urgent as
 above. (Since 7000 is mainly doing NAS today, in a non-HPC-clustered
 fashion without Lustre. With Lustre, the performance competitive focuses are
 different from above).

 Thanks,
 z



Right, so hardware or software raid?  NFS, CIFS, both?  Win2k8 is going to
blow serving NFS, but it can be done.  Storage 7000 is going to have a
COMPLETELY different performance envelope than vanilla opensolaris or
solaris.  With some customization using flash you might be able to get
close, but if you want to know what a storage 7000 will do, you should ask
for that, not just opensolaris.

Here's an example of a loaded up 7000:
http://blogs.sun.com/brendan/entry/a_quarter_million_nfs_iops

If you want to compare it to something like NetApp though, it's tough,
because how do you make your comparison?  Price?  What model NetApp are you
going to use?  What kind of server are you going to use?

If you just want to use some numbers someone comes up with to make a
decision on what platform to use, I'd argue you're going about it completely
the wrong way.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Joseph Zhou
Thanks Ian, Tim,
Ok, let me really hit one topic instead of trying to see in general what 
data are out there...

Let's say OpenSolaris doing Samba vs. Linux doing Samba, in CIFS 
performance.
(so I can link to the Win2008 CIFS numbers and NetApp CIFS numbers myself.)

Is there any data to this specific point?
Thanks!
z

- Original Message - 
From: Ian Collins [EMAIL PROTECTED]
To: Joseph Zhou [EMAIL PROTECTED]
Cc: Tim [EMAIL PROTECTED]; zfs-discuss@opensolaris.org
Sent: Wednesday, December 03, 2008 4:31 PM
Subject: Re: [zfs-discuss] OpenSolaris vs Linux


 Joseph Zhou wrote:
 Thanks Tim,
 At this moment, I am looking into OpenStorage as NAS (file serving)
 vs. Linux NAS (Samba) vs. Win2008 NAS vs. NetApp (ONTAP, not GX)
 performance.

 There are still a number of ZFS/OpenSOlaris options to compare, iSCSI,
 Samba, CIFS, NFS.

 -- 
 Ian.
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Tim
On Wed, Dec 3, 2008 at 3:36 PM, Joseph Zhou [EMAIL PROTECTED]wrote:

 Thanks Ian, Tim,
 Ok, let me really hit one topic instead of trying to see in general what
 data are out there...

 Let's say OpenSolaris doing Samba vs. Linux doing Samba, in CIFS
 performance.
 (so I can link to the Win2008 CIFS numbers and NetApp CIFS numbers myself.)

 Is there any data to this specific point?
 Thanks!
 z


So, you wouldn't use Samba on opensolaris, you'd use the native cifs stack.
Then we have to look at the system itself.  How much ram?  How many and what
kind of CPU's?  How much disk on the backend?  What kind of disk on the back
end?

I don't think you're going to find the numbers you're looking for to be
quite honest.  And even if you did, I don't know how usable they'd really
be.  I'd start by digging through the spc benchmarks.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread Blake Irvin
I am directly on the console.  cde-login is disabled, so I'm dealing with 
direct entry.

 Are you directly on the console, or is the console on
 a serial port?  If 
 you are
 running over X windows, the input might still get in,
 but X may not be 
 displaying.
 If keyboard input is not getting in, your machine is
 probably wedged at 
 a high
 level interrupt, which sounds doubtful based on your
 problem description.
Out of curiosity, why do you say that?  I'm no expert on interrupts, so I'm 
curious.  It DOES seem that keyboard entry is ignored in this situation, since 
I see no results from ctrl-c, for example (I had left the console running 'tail 
-f /var/adm/messages').  I'm not saying you are wrong, but if I should be 
examining interrupt issues, I'd like to know (I have 3 hard disk controllers in 
the box, for example...)
   
 If the deadman timer does not trigger, the clock is
 almost certainly 
 running, and your machine is
 almost certainly accepting keyboard input.
That's good to know.  I just enabled deadman after the last freeze, so it will 
be a bit before I can test this (hope I don't have to).

thanks!
Blake

 
 Good luck,
 max
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs free space

2008-12-03 Thread none
To provide a complete (meaning it is currently incomplete) disk usage report, I 
think zfs would need to provide the following:

Each block in a zfs fs is, in general, referenced by one or more of the 
current fs and its snapshots. Call this the block's share set. E.g.,
(storage/fs, storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED])

Group the blocks (from the entire fs) that have identical share sets together 
and call this a shared collection.  For each shared collection, report the 
disk usage based on the total number of blocks in that shared collection.

It is to be expected that a snapshot or fs could appear in more than one 
shared collection.

eg, 
zfs list-shared
SHARED COLLECTION                                                                    USED
(storage/fs, storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED])                   100G
(storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED])     40G
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs free space

2008-12-03 Thread Robert Milkowski
Hello none,

Wednesday, December 3, 2008, 8:38:03 PM, you wrote:

n Hi Sanjeev and Milek, thanks for your replies but I'm afraid they are 
somewhat missing the point.
n I have a situation (and I believe it would be fairly common) where
n early snapshots would be sharing most data with the current
n filesystem and more recent snapshots are holding onto data that has
n been deleted from the current filesystem (after a recent big deletion of 
unused data).

n It is impossible to see what snapshots would need to be deleted to
n free up the space that was deleted from the current fs, without
n deleting the snapshots one by one. I think in any filesystem it is
n reasonable to expect to be able to determine what entities use up
n disk space. But ZFS is currently lacking this for snapshots.

n See the below USED column. The total USED adds up to about
n 11G. However the total data consumed by the snapshots is more in
n the range of 100G. But from the listing below it is simply
n impossible to see which snapshots are using it.

In most cases you probably want to destroy the oldest snapshot. Then
check if you are happy with your free space; if not, delete the next one
and so on.

The problem is how to present the information you are asking for, like:
how much space will I regain if I delete snapshots #2, #4 and #7?



-- 
Best regards,
 Robert Milkowskimailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Tim
On Wed, Dec 3, 2008 at 3:51 PM, Joseph Zhou [EMAIL PROTECTED]wrote:

  Ok, thanks Tim, which SPC are you talking about?

 SPC-1 and SPC-2 don't test NAS, those are block perf.
 SPECsfs97 v2/v3 and sfs2008 have no OpenStorage results.

 If there are standard storage benchmarks out there, I would not be here
 asking folks.

 To your point, how is the OpenSolaris native CIFS vs Linux Samba then?  (if
 you think this is more apple-to-apple than they both run Samba)

 Again, I am here to explore data, not to argue, if I give you a dozen
 configurations, could you get me the performance estimates and how the
 estimates come from?  I didn't think that route is possible.

 Thanks.
 z


Sorry, I was referring to SPEC, not SPC.  Perhaps you could ask one of the
folks from Sun on these mailing lists if they have plans to post results.
I'd imagine they do for at least the storage 7000 series.

I think native cifs on Solaris vs. Samba on Linux is fair simply because
it's what someone rolling out an implementation would use.  It'll never be
100% apples-to-apples, so I'd say real-world is preferred over hampering one
system to make it *closer* to the other.

As for configurations, I probably have access to enough hardware to do most
of the benchmarking, but this time of year, being end-of-quarter, I wouldn't
have the time to do so.  That doesn't mean there isn't someone else lurking
who does.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs free space

2008-12-03 Thread none
Hi Milek,
I specifically don't want to destroy the oldest snapshot because I already know 
it won't free up much disk space, since it is sharing most data with the 
current fs. If I delete it I will have lost any old data which might be needed 
for recovery (e.g. if I accidentally corrupt some files). I don't know which 
snapshots would actually free up the space.

 In most cases you probably want to destroy an oldest
 snapshot. Then
 check if you happy with your free space, if not
 delete a next one and
 so on.
 
 The problem is how to present the information you are
 asking for, like
 - how much space will I regain if I delete snapshot
 #2 #4 #7
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread [EMAIL PROTECTED]
Hi Blake,

Blake Irvin wrote:
 I am directly on the console.  cde-login is disabled, so i'm dealing 
 with direct entry.
  
Are you directly on the console, or is the console on
 a serial port?  If you are
 running over X windows, the input might still get in,
 but X may not be displaying.
 If keyboard input is not getting in, your machine is
 probably wedged at a high
 level interrupt, which sounds doubtful based on your
 problem description.
 
 Out of curiosity, why do you say that?  I'm no expert on interrupts, 
 so I'm curious.  It DOES seem that keyboard entry is ignored in this 
 situation, since I see no results from ctrl-c, for example (I had left 
 the console running 'tail -f /var/adm/messages').  I'm not saying you 
 are wrong, but if I should be examining interrupt issues, I'd like to 
 know (I have 3 hard disk controllers in the box, for example...)
   
Typing ctrl-c, and having a process killed because of it, are two different 
actions.
The interpretation of ctrl-c as a kill character is done in a streams 
module
(ldterm, I believe).  This is not done at the device interrupt handler.  
I doubt
you need to examine interrupts.  I was only saying that you could try 
what I
recommended to get a dump.  The f1-a is handled at the driver during 
interrupt
handling, so it should get processed.
I have done this many times, so I am sure it works.

   If the deadman timer does not trigger, the clock is
 almost certainly running, and your machine is
 almost certainly accepting keyboard input.
 
 That's good to know.  I just enabled deadman after the last freeze, so 
 it will be a bit before I can test this (hope I don't have to).

 thanks!
 Blake

  
 Good luck,
 max
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs free space

2008-12-03 Thread Miles Nordin
 n == none  [EMAIL PROTECTED] writes:
 rm == Robert Milkowski [EMAIL PROTECTED] writes:

 n eg, 
 n zfs list-shared 
 n SHARED COLLECTIONUSED 
 n (storage/fs, storage/[EMAIL PROTECTED], storage/[EMAIL PROTECTED]) 100G

rm how much space will I regain if I delete snapshot #2 #4 #7

without dedup, I think any SHARED COLLECTION will be contiguous.
It'll always be ``if I delete #2, #3, #4''.  so if you have n
snapshots, you'll have up to

   n * (n + 1) / 2

SHARED COLLECTION's.  kind of a lot, but not totally ridiculous.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Joseph Zhou
haha, Tim, yes, I see the Open spirit in this reply!   ;-)

As I said, I am just exploring data.

The Sun J4000 SPC1 and SPC2 benchmark results were nice, just lacking other 
published results with the iSCSI HBA as DAS, not as a network storage device 
(as 7000).  Though I would attempt to say those results can be a basis for 7000 
block-performance...

any comment?
Thanks!
z
  - Original Message - 
  From: Tim 
  To: Joseph Zhou 
  Cc: Ian Collins ; zfs-discuss@opensolaris.org 
  Sent: Wednesday, December 03, 2008 5:00 PM
  Subject: Re: [zfs-discuss] OpenSolaris vs Linux





  On Wed, Dec 3, 2008 at 3:51 PM, Joseph Zhou [EMAIL PROTECTED] wrote:

Ok, thanks Tim, which SPC are you talking about?   

SPC-1 and SPC-2 don't test NAS, those are block perf.
SPECsfs97 v2/v3 and sfs2008 have no OpenStorage results.

If there are standard storage benchmarks out there, I would not be here 
asking folks.

To your point, how is the OpenSolaris native CIFS vs Linux Samba then?  (if 
you think this is more apple-to-apple than they both run Samba)

Again, I am here to explore data, not to argue, if I give you a dozen 
configurations, could you get me the performance estimates and how the 
estimates come from?  I didn't think that route is possible.

Thanks.
z

  Sorry, I was referring to SPEC, not SPC.  Perhaps you could ask one of the 
folks from Sun on these mailing lists if they have plans to post results.  I'd 
imagine they do for at least the storage 7000 series.

  I think native cifs on Solaris vs. Samba on Linux is fair simply because it's 
what someone rolling out an implementation would use.  It'll never be 100% 
apples-to-apples, so I'd say real-world is preferred over hampering one system 
to make it *closer* to the other.

  As for configurations, I probably have access to enough hardware to do most 
of the benchmarking, but this time of year, being end-of-quarter, I wouldn't 
have the time to do so.  That doesn't mean there isn't someone else lurking who 
does.

  --Tim 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-12-03 Thread Miles Nordin
 r == Ross  [EMAIL PROTECTED] writes:

rs I don't think it likes it if the iscsi targets aren't
rs available during boot.

from my cheatsheet:
-8-
ok boot -m milestone=none
[boots.  enter root password for maintenance.]
bash-3.00# /sbin/mount -o remount,rw /  [-- otherwise iscsiadm won't update 
/etc/iscsi/*]
bash-3.00# /sbin/mount /usr
bash-3.00# /sbin/mount /var
bash-3.00# /sbin/mount /tmp
bash-3.00# iscsiadm remove discovery-address 10.100.100.135
bash-3.00# iscsiadm remove discovery-address 10.100.100.138
bash-3.00# iscsiadm remove discovery-address 10.100.100.138
iscsiadm: unexpected OS error
iscsiadm: Unable to complete operation  [-- good.  it's gone.]
bash-3.00# sync
bash-3.00# lockfs -fa
bash-3.00# reboot
-8-

rs # time zpool status 
[...]
rs real 3m51.774s

so, this hang may happen in fewer situations, but it is not fixed.

 r 6.  After one drive goes offline, during the resilver process,
 r zpool status shows that information is being resilvered on the
 r good drives.  Does anybody know why this happens?  

I don't know why.

I've seen that, too, though.  For me it's always been relatively
short, 1min.  I wonder if there are three kinds of scrub-like things,
not just two (resilvers and scrubs), and 'zpool status' is
``simplifying'' for us again?

 r 7.  Although ZFS will automatically online a pool when iscsi
 r devices come online, CIFS shares are not automatically
 r remounted.

For me, even plain filesystems are not all remounted.  ZFS tries to
mount them in the wrong order, so it would mount /a/b/c, then try to
mount /a/b and complain ``directory not empty''.  I'm not sure why it
mounts things in the right order at boot/import, but in haphazard
order after one of these auto-onlines.  Then NFS exporting didn't work
either.

To fix, I have to 'zfs umount /a/b/c', but then there is a b/c
directory inside filesystem /a, so I have to 'rmdir /a/b/c' by hand
because the '... set mountpoint' koolaid creates the directories but
doesn't remove them.  Then 'zfs mount -a' and 'zfs share -a'.
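
Spelled out as a sketch (the dataset and mountpoint names are just the
placeholders used above):

  zfs umount /a/b/c     # unmount the child that came up too early
  rmdir /a/b/c          # remove the stale directory left behind in the parent
  zfs mount -a          # remount everything, parents before children this time
  zfs share -a          # re-export the NFS/CIFS shares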


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Tim
On Wed, Dec 3, 2008 at 4:15 PM, Joseph Zhou [EMAIL PROTECTED]wrote:

  haha, Tim, yes, I see the Open spirit in this reply!   ;-)

 As I said, I am just exploring data.

 The Sun J4000 SPC1 and SPC2 benchmark results were nice, just lacking other
 published results with the iSCSI HBA as DAS, not as a network storage device
 (as 7000).  Though I would attempt to say those results can be a basis for
 7000 block-performance...

 any comment?
 Thanks!
 z


I'd imagine you'll see far better performance out of the 7000 with their use
of flash.  Only time will tell though :)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris vs Linux

2008-12-03 Thread Ian Collins
Joseph Zhou wrote:
 Thanks Ian, Tim,
 Ok, let me really hit one topic instead of trying to see in general
 what data are out there...

 Let's say OpenSolaris doing Samba vs. Linux doing Samba, in CIFS
 performance.
 (so I can link to the Win2008 CIFS numbers and NetApp CIFS numbers
 myself.)

 Is there any data to this specific point?

I think what we are telling you is the only way to find the numbers you
want for your configuration is to do your own tests.  There are just too
many variables for other people's data to be truly relevant.

One of the benefits of Open Source is you only have to pay for your time
to run tests.

As Tim said, there's no point in limiting OpenSolaris to Samba.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs free space

2008-12-03 Thread none
Even if that maximum number of shared collections is that much, I think the 
information should be available if requested, even if it becomes similar to a 
'scrub' operation in terms of the time taken. For a filesystem like mine that 
only has 10 or so snapshots, I'd really only expect 3 or 4 of the shared 
collections to stand out in terms of disk usage.

For filesystems with 100's of snapshots, they can filter the data as desired. 
It would probably take a long time, but at least they can get the information if 
they really need it. It's better than not having access to the information at 
all.

I'm not sure if time could be saved by requesting disk usage for only a single 
shared collection, but if it does that could also be an option as you suggest.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool replace - choke point

2008-12-03 Thread Alan Rubin
I think we found the choke point.  The silver lining is that it isn't the T2000 
or ZFS.  We think it is the new SAN, an Hitachi AMS1000, which has 7200RPM SATA 
disks with the cache turned off.  This system has a very small cache, and when 
we did turn it on for one of the replacement LUNs we saw a 10x improvement - 
until the cache filled up about 1 minute later (was using zpool iostat).  Oh 
well.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-12-03 Thread Blake
Thanks Max and Chris.  I don't really want the problem to occur again, of
course, but I'll be prepared if it does.

On Wed, Dec 3, 2008 at 6:46 PM, Chris Siebenmann [EMAIL PROTECTED] wrote:

 You write:
 |  If keyboard input is not getting in, your machine is probably wedged
 |  at a high level interrupt, which sounds doubtful based on your
 |  problem description.
 | Out of curiosity, why do you say that?  I'm no expert on interrupts, so
 | I'm curious.  It DOES seem that keyboard entry is ignored in this
 | situation, since I see no results from ctrl-c, for example (I had left
 | the console running 'tail -f /var/adm/messages'.  I'm not saying your
 | are wrong, but if I should be examining interrupt issues, I'd like to
 | know (I have 3 hard disk controllers in the box, for example...)

  ^C handling requires a great deal of high-level kernel infrastructure
 to be working, far beyond basic interrupt handling. To get much visible
 reaction in a situation where nothing is producing output, for example,
 the system has to be able to get all the way to running your shell so
 that it can notice that tail has died and print the shell prompt. By
 contrast, if the console echoes '^C', you have a fair amount of
 interrupt handling.

  The Solaris kernel debugger hooks in to the system at a fairly low
 level (I believe significantly lower than all of the things that have
 to be working to even echo '^C', much less get all the way to executing
 user-level code). Thus, you can get into it and force-crash your system
 even if it is otherwise fairly dead, so I think that trying is well
 worth it in your situation.

 ---
I shall clasp my hands together and bow to the corners of the
 world.
Number Ten Ox, Bridge of Birds
 Chris Siebenmann
 [EMAIL PROTECTED]

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool cannot replace a replacing device

2008-12-03 Thread Courtney Malone
I have a 10-drive raidz. Recently one of the disks appeared to be generating 
errors (this later turned out to be a cable), so I removed the disk from the 
array and ran vendor diagnostics (which zeroed it). Upon reinstalling the disk, 
however, ZFS will not resilver it; it gets referred to numerically instead of 
by device name, and when I try to replace it, I get:

# zpool replace data 17096229131581286394 c0t2d0

cannot replace 17096229131581286394 with c0t2d0: cannot replace a replacing 
device

if i try to detach it i get:

# zpool detach data 17096229131581286394

cannot detach 17096229131581286394: no valid replicas


current zpool output looks like:

# zpool status -v
  pool: data
 state: DEGRADED
 scrub: none requested
config:

        NAME                         STATE     READ WRITE CKSUM
        data                         DEGRADED     0     0     0
          raidz1                     DEGRADED     0     0     0
            c0t0d0                   ONLINE       0     0     0
            c0t1d0                   ONLINE       0     0     0
            replacing                UNAVAIL      0   543     0  insufficient replicas
              17096229131581286394   FAULTED      0   581     0  was /dev/dsk/c0t2d0s0/old
              11342560969745958696   FAULTED      0   582     0  was /dev/dsk/c0t2d0s0
            c0t3d0                   ONLINE       0     0     0
            c0t4d0                   ONLINE       0     0     0
            c0t5d0                   ONLINE       0     0     0
            c0t6d0                   ONLINE       0     0     0
            c0t7d0                   ONLINE       0     0     0
            c2t2d0                   ONLINE       0     0     0
            c2t3d0                   ONLINE       0     0     0

errors: No known data errors

I have also tried exporting and reimporting the pool; any help would be greatly 
appreciated.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot

2008-12-03 Thread Jens Elkner
On Tue, Dec 02, 2008 at 12:22:49PM -0800, Vincent Fox wrote:
 Reviving this thread.
 
 We have a Solaris 10u4 system recently patched with 137137-09.
 Unfortunately the patch was applied from multi-user mode; I wonder if this
 may have been the original poster's problem as well?  Anyhow we are now stuck

No - in my case it was a 'not enough space' on / problem, not the
multi-user mode ;-). 

Regards,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] help diagnosing system hang

2008-12-03 Thread Ethan Erchinger
Hi all,

First, I'll say my intent is not to spam a bunch of lists, but after 
posting to opensolaris-discuss I had someone communicate with me offline 
that these lists would possibly be a better place to start.  So here we 
are. For those on all three lists, sorry for the repetition.

Second, this message is meant to solicit help in diagnosing the issue 
described below.  Any hints on how DTrace may help, or where in general 
to start would be much appreciated.  Back to the subject at hand.

---

I'm testing an application which makes use of a large file mmap'd into 
memory, as if the application were using malloc().  The file is 
roughly 2x the size of physical ram.  Basically, I'm seeing the system 
stall for long periods of time, 60+ seconds, and then resume.  The file 
lives on an SSD (Intel x25-e) and I'm using zfs's lzjb compression to 
make more efficient use of the ~30G of space provided by that SSD.

The general flow of things is, start application, ask it to use a 50G 
file. The file is created in a sparse manner at the location
designated, then mmap is called on the entire file.  All fine up to this
point.

I then start loading data into the application, and it starts pushing
data to the file as you'd expect.  Data is pushed to the file early and 
often, as it's mmap'd with the MAP_SHARED flag.  But, when the 
application's resident size reaches about 80% of the physical ram on the 
system, the system starts paging and things are still working relatively 
well, though slower, as expected.

Soon after, when reaching about 40G of data, I get stalls accessing the
SSD (according to iostat), in other words, no IO to that drive.  When I
started looking into what could be causing it, such as IO timeouts, I
run dmesg and it hangs after printing a timestamp.  I can ctrl-c dmesg,
but subsequent runs provide no better results.  I see no new messages in
/var/adm/messages, as I'd expect.

Eventually the system recovers, the latest case took over 10 minutes to
recover, after killing the application mentioned above, and I do see
disk timeouts in dmesg.

So, I can only assume that there's either a driver bug in the SATA/SAS
controller I'm using and it's throwing timeouts, or the SSD is having
issues.  Looking at the zpool configuration, I see that failmode=wait,
and since that SSD is the only member of the zpool I would expect IO to
hang.

But, does that mean that dmesg should hang also?  Does that mean that
the kernel has at least one thread stuck?  Would failmode=continue be
more desired, or resilient?
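
For what it's worth, the property can be inspected and changed on the fly; a
sketch, with the pool name assumed:

  # show the current failure mode, then switch it if that behaviour is preferred
  zpool get failmode ssdpool
  zpool set failmode=continue ssdpool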

During the hang, load-avg is artificially high, fmd being the one
process that sticks out in prstat output.  But fmdump -v doesn't show
anything relevant.

Anyone have ideas on how to diagnose what's going on there?

Thanks,
Ethan

System: Sun x4240 dual-amd2347, 32G of ram
SAS/SATA Controller: LSI3081E
OS: osol snv_98
SSD: Intel x25-e
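
As a starting point (not a diagnosis), a DTrace sketch that counts I/O
completions on the SSD once a second; a second that prints nothing (or zero)
during the stall points at the device/driver path.  The sd instance name is a
placeholder - check iostat -xn for the SSD's actual name:

  dtrace -n 'io:::done /args[1]->dev_statname == "sd3"/ { @ios = count(); }
             tick-1s { printa("SSD completions/s: %@d\n", @ios); trunc(@ios); }'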


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Su n X4150/X4450

2008-12-03 Thread Marc Bevand
Aaron Blew aaronblew at gmail.com writes:
 
 I've done some basic testing with a X4150 machine using 6 disks in a
 RAID 5 and RAID Z configuration.  They perform very similarly, but RAIDZ
 definitely has more system overhead.

Since hardware RAID 5 implementations usually do not checksum data (they only 
compute the parity, which is not the same thing), for an apples-to-apples 
performance comparison you should have benchmarked raidz with checksum=off. Is 
that what you did?
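
If one wanted to rerun that comparison, a throwaway dataset for the benchmark
could look like this (never disable checksums on data you care about; the
dataset name is a placeholder):

  zfs create -o checksum=off tank/bench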

-marc

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss