Re: [zfs-discuss] How long should an empty destroy take? snv_134

2011-03-06 Thread Richard Elling

On Mar 5, 2011, at 9:14 PM, Yaverot wrote:

 I'm (still) running snv_134 on a home server.  My main pool tank filled up 
 last night (< 1G free remaining).
 So today I bought new drives, adding them one at a time, running format 
 between each one to see what name they received.
 As I had a pair of names, I ran zpool create/add newtank mirror cxxt0d0 cyyt0d0 
 on them.  
 
 Then I got to the point where I needed my unused drives.  They were too small 
 to act as spares for tank, but since I didn't want to lose track of them I had 
 stuck them in another pool called others.
 
 We're heading into the 3rd hour of the zpool destroy on others.

zpool destroy or zfs destroy?
 -- richard

 The system isn't locked up, as it responds to local keyboard input, and 
 existing ssh & smb connections.
 While this destroy is running all other zpool/zfs commands appear to be 
 hung.
 
 The others pool never had more than 100G in it at one time, never had any 
 snapshots, and was empty for at least two weeks prior to the destroy command.  
 I don't think dedup was ever used on it, but that should hardly matter when 
 the pool was already empty.
 others was never shared via smb or nfs.
 
 zpool destroy on an empty pool should be on the order of seconds, right?
 
 I really don't want to reboot/power down the server, as I'll lose my current 
 connections, and if there are problems I don't know when the system will be 
 working again so I can re-establish them.  
 
 Yes, I've triple checked, I'm not destroying tank.
 While writing the email, I attempted a new ssh connection, it got to the 
 Last login: line, but hasn't made it to the prompt.  So I really don't want 
 to take the server down physically.
 
 Doing a df, others doesn't show, but rpool, tank, and newtank do.  Another 
 indication that I issued the destroy on the right pool.
 The smb connection is slower than normal, but still usable.  


Re: [zfs-discuss] Sun T3-2 and ZFS on JBODS

2011-03-06 Thread Marion Hakanson
sigbj...@nixtra.com said:
 I will do some testing with loadbalance on/off. We have nearline SAS disks,
 which do have dual paths from the disk; however, they're still just 7200rpm
 drives.
 
 Are you using SATA, SAS, or nearline SAS in your array? Do you have multiple
 SAS connections to your arrays, or do you use a single connection per array
 only? 

We have four Dell MD1200's connected to three Solaris-10 systems.  Three
of the MD1200's have nearline-SAS 2TB 7200RPM drives, and one has SAS 300GB
15000RPM drives.  All the MD1200's are connected with dual SAS modules to
a dual-port HBA on their respective servers (one setup is with two MD1200's
daisy-chained, but again using dual SAS modules & cables).

Both types of drives suffer super-slow writes (but reasonable reads) when
loadbalance=roundrobin is in effect: e.g., 280 MB/sec sequential reads and
28 MB/sec sequential writes for the 15kRPM SAS drives I tested last week.
We don't see this extreme slowness on our dual-path Sun J4000 JBOD's, but
those all have SATA drives (with the dual-port interposers inside the
drive sleds).
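
For anyone who wants to reproduce the comparison, here is a minimal sketch of
checking and changing the MPxIO load-balance policy on Solaris; the device
path below is a placeholder, and the usual caveat applies that a
scsi_vhci.conf change only takes effect after a reboot:

    # Show the policy a multipathed LUN is currently using (substitute a
    # real cXtYdZ device from "format" output).
    mpathadm show lu /dev/rdsk/c0t5000C500DEADBEEFd0s2 | grep -i 'load balanc'

    # To change it globally, edit /kernel/drv/scsi_vhci.conf and set, e.g.:
    #   load-balance="none";    # alternatives: "round-robin", "logical-block"
    # then reboot for the new policy to apply.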

Regards,

Marion




Re: [zfs-discuss] How long should an empty destroy take? snv_134

2011-03-06 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Yaverot
 
 We're heading into the 3rd hour of the zpool destroy on others.
 The system isn't locked up, as it responds to local keyboard input, and

I bet you, you're in a semi-crashed state right now, which will degrade into
a full system crash.  You'll have no choice but to power cycle.  Prove me
wrong, please.   ;-)

I bet that as soon as you type in any zpool or zfs command, even "list"
or "status", it will also hang indefinitely.  

Is your pool still 100% full?  That's probably the cause.  I suggest if
possible, immediately deleting something and destroying an old snapshot to
free up a little bit of space.  And then you can move onward...
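
As a sketch of that cleanup (pool, dataset, and snapshot names here are
placeholders), something like the following; note that on a completely full
pool a plain rm or destroy can itself fail for lack of space, in which case
truncating a large file that isn't held by any snapshot usually still works:

    # List snapshots smallest-to-largest to pick a victim.
    zfs list -t snapshot -o name,used -s used

    # Destroy an old snapshot to give back its unique blocks.
    zfs destroy tank/data@2011-01-weekly

    # Last resort on a 100%-full pool: truncate a big, un-snapshotted file
    # in place, which frees its blocks immediately.
    cp /dev/null /tank/data/some-large-file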


 While this destroy is running all other zpool/zfs commands appear to be
 hung.

Oh, sorry, didn't see this before I wrote what I wrote above.  This just
further confirms what I said above.


 zpool destroy on an empty pool should be on the order of seconds, right?

zpool destroy is instant, regardless of how much data there is in a pool.
zfs destroy is instant for an empty volume, but zfs destroy takes a long
time for a lot of data.
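
For reference, the two commands being contrasted (names are placeholders):
zpool destroy only has to rewrite the pool's labels, so it returns quickly
regardless of pool size, while zfs destroy has to free every block the
dataset references, so its run time grows with the amount of data.

    # Destroy an entire pool: marks the labels destroyed, returns fast.
    zpool destroy others

    # Destroy one dataset: frees its blocks, so time scales with the data
    # (and with any snapshots that have to go with it).
    zfs destroy -r tank/scratch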

But as mentioned above, that's irrelevant to your situation.  Because your
system is crashed, and even if you try init 0 or init 6...  They will fail.
You have no choice but to power cycle.

For the heck of it, I suggest init 0 first.  Then wait half an hour, and
power cycle.  Just to try and make the crash as graceful as possible.

As soon as it comes back up, free up a little bit of space, so you can avoid
a repeat.


 Yes, I've triple checked, I'm not destroying tank.
 While writing the email, I attempted a new ssh connection, it got to the
 "Last login:" line, but hasn't made it to the prompt.  

Oh, sorry, yet again this is confirming what I said above.  semi-crashed and
degrading into a full crash.
Right now, you cannot open any new command prompts.
Soon it will stop responding to ping.  (Maybe 2-12 hours.)



Re: [zfs-discuss] How long should an empty destroy take? snv_134

2011-03-06 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Yaverot
 
 I'm (still) running snv_134 on a home server.  My main pool tank filled up
 last night (< 1G free remaining).

There is (or was) a bug that would sometimes cause the system to crash when
100% full.
http://comments.gmane.org/gmane.os.solaris.opensolaris.zfs/41227 

In that thread, the crash was related to being 100% full, running a scrub,
and some write operations all at the same time.  By any chance were you
running a scrub?

I am curious whether or not the scrub is actually an ingredient in that
failure scenario, or if the scrub was just coincidence for me.
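
Once zpool commands respond again, the scrub question is easy to answer from
the status output; a quick sketch, using the pool name from earlier in the
thread:

    # The scrub line reports "scrub in progress", "scrub completed", or
    # "none requested".
    zpool status tank | grep -i scrub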



Re: [zfs-discuss] How long should an empty destroy take? snv_134

2011-03-06 Thread Nathan Kroenert
Why wouldn't they try a reboot -d? That would at least get some data in 
the form of a crash dump if at all possible...
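
For completeness, a sketch of that route (paths are the Solaris defaults and
may differ on this box):

    # Force a reboot that writes a crash dump on the way down.
    reboot -d

    # Or, if you'd rather not bounce the box at all, take a live dump.
    savecore -L

    # Afterwards, check the dump configuration and where the dump landed.
    dumpadm
    ls /var/crash/`hostname`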


A power cycle seems a little medieval to me... At least in the first 
instance.


The other thing I have noted is that sometimes things do get wedged, and 
if you can find where (mdb -k, and take a poke at the stacks of some of 
the zfs/zpool commands that are hung to see what they were operating on), 
a zpool clear on that zpool can sometimes get things moving again.  Not that 
I'm recommending that you should *need* to, but it has got me unwedged on 
occasion (though usually when I have done something administratively silly... ;)
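
A minimal sketch of that poking around, assuming the hung command is zpool
and that mdb -k is available (the pool name in the clear is a placeholder):

    # Dump the kernel stacks of any hung zpool processes to see which
    # pool/vdev they are blocked on.
    echo "::pgrep zpool | ::walk thread | ::findstack -v" | mdb -k

    # If the stacks point at a pool stuck on a transient device error,
    # try clearing it.
    zpool clear others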


Nathan.


Re: [zfs-discuss] How long should an empty destroy take? snv_134

2011-03-06 Thread Yaverot
Follow up, and current status:

In the morning I cut power (before receiving the 4 replies). Turning it on 
again, I got too impatient waiting for a text screen for diagnostics and 
overfilled the keyboard buffer.  I forced it off again (to stop the beeping), 
then waited longer before attempting to switch it from splash-screen to console.

When it came up, others was still there, and a disk (c16) was faulted.  Since I 
only used that pool for light testing and for holding on to device names, it was 
a plain stripe, so the whole pool was faulted.  My guess is that the disk switched 
to faulted between the zpool status and the zpool destroy others, and the destroy 
then got stuck trying to write the not-in-use label to the unavailable disk.
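
If c16 ever comes back, the stale labels can be checked before the disk is
reused; a small sketch (the path is a guess at the full device name; use
whatever format reports for that disk):

    # Print any ZFS labels still present on the disk.  If old "others"
    # labels show up, the disk still thinks it belongs to that pool.
    zdb -l /dev/rdsk/c16t0d0s0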

I was able to zpool destroy -f others and then add those disks to newtank 
(using -f on the add).
So newtank is now large enough for a send/recv from tank.  It isn't done yet, 
but a scrub on tank takes about 36 hours (newtank is mirrors instead of tank's 
raidz3).
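
For anyone following along, the copy is the usual snapshot-and-replicate
pattern; a sketch with a placeholder snapshot name (the flags shown are the
common choice, not necessarily the exact ones used here):

    # Snapshot the whole source pool recursively, then replicate the full
    # tree of datasets, snapshots, and properties into the new pool.
    zfs snapshot -r tank@migrate-20110306
    zfs send -R tank@migrate-20110306 | zfs recv -Fd newtank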
Two drives show faulted in tank.  One I found: it renamed itself from either c12 
or c14 to c21, but my attempt to add it back to the pool gave an error that c10 
is already part of tank.  Yes, c10 is part of tank, but the command line referred 
to c14 and c21, so why talk about c10?  Getting the data onto newtank seemed the 
best thing to push for, so I'm doing the send/recv with tank degraded; one more 
disk can disappear before any data is at risk. 

My power-off/reboot before running an export/import loop on newtank means all 
those drives have different names now than the ones I wrote on them. :(

rpool remains 1% in use.  tank reports 100% full (with 1.44G free).  others is 
destroyed, but I know that c16 is still physically connected and hasn't been 
zfs-delabeled, should it ever online itself.  zfs list shows data being recv'ed 
on newtank.

So:
1. The send/recv from tank to newtank is progressing, and will hopefully finish 
with no problems.
2. Two disks apparently disappeared, as they aren't part of any pool and don't 
show in format either.*
3. One disk renamed itself (now c21) and therefore can't be re-added/reattached 
to tank.
4. All drives put into newtank before the destroy showed up again, but with 
different names.  newtank imported cleanly (at the time it was still empty).

*Or I don't see them because I get lost in the ordering: comparing the output 
requires scrolling back & forth, and the two lists aren't sorted the same.  
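
A way to take the scrolling out of that comparison is to dump both device
lists sorted and diff them; a sketch, assuming the usual cXtYdZ device naming
and arbitrary temp-file names:

    # Every disk the OS currently sees (format run non-interactively).
    echo | format 2>/dev/null | awk '/c[0-9]+t[0-9]+d[0-9]+/ {print $2}' | sort > /tmp/all-disks

    # Every device name that appears in a pool config.
    zpool status | awk '/c[0-9]+t[0-9]+d[0-9]+/ {print $1}' | sort > /tmp/pool-disks

    # Lines only in the first file are disks visible to the OS but not in any pool.
    comm -23 /tmp/all-disks /tmp/pool-disks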
