Re: [zfs-discuss] Faulted raidz1 shows the same device twice ?!?

2007-12-11 Thread Richard Kowalski
To resolve this issue you need to run:

# zdb -l /dev/dsk/c18t0d0

# zpool export external
# zpool import external

# zpool clear external
# zpool scrub external
# zpool clear external
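
Once the scrub has finished, a quick sanity check (assuming the pool is still
named external, as above) is:

# zpool status -v external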


(I do not have specific answers to your questions at this time, only a fix.)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faulted raidz1 shows the same device twice ?!?

2007-12-11 Thread Jeff Thompson
Genius!  The export/import worked great.  With the new drive plugged in, I got:
# zpool status
  raidz1                  DEGRADED     0     0     0
    c18t0d0               ONLINE       0     0     0
    17017229752965797825  FAULTED      0     0     0  was /dev/dsk/c18t0d0s0

Then I was able to replace:
# zpool replace external 17017229752965797825 c19t0d0

After resilvering, I'm up and running!

(For the question of where it was showing c18t0d0, it's interesting that it 
thought the second drive was c18t0d0s0.  Can you have a raidz1 with c18t0d0 and 
c18t0d0s0?)
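
(For anyone hitting the same state, the general sequence that worked here was -
a sketch with placeholders rather than my exact names:

# zpool export <pool>
# zpool import <pool>
# zpool replace <pool> <guid-of-faulted-member> <new-device>
# zpool status <pool>

with the last command used to watch the resilver complete.)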
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Memory Sticks

2007-12-11 Thread Constantin Gonzalez
Hi Paul,

 # fdisk -E /dev/rdsk/c7t0d0s2

then

 # zpool create -f Radical-Vol /dev/dsk/c7t0d0

should work. The warnings you see are only there to double-check that you
don't overwrite a previously used pool, which you might regret. -f overrides that.
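
Afterwards you can sanity-check the result with (using the pool/device names
from your example):

 # zpool status Radical-Vol
 # zpool list Radical-Vol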

Hope this helps,
Constantin

-- 
Constantin GonzalezSun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Trial x4500, zfs with NFS and quotas.

2007-12-11 Thread Robert Milkowski
Hello Jorgen,

Tuesday, December 11, 2007, 2:22:07 AM, you wrote:


 
 I don't know... while it will work I'm not sure I would trust it.
 Maybe just use Solaris Volume Manager with Soft Partitioning + UFS and
 forget about ZFS in your case?

JL Well, the idea was to see if it could replace the existing NetApps as 
JL that was what Jonathan promised it could do, and we do use snapshots on
JL the NetApps, so having zfs snapshots would be attractive, as well as 
JL easy to grow the file-system as needed. (Although, perhaps I can growfs
JL with SVM as well.)


JL You may be correct about the trust issue though. I copied over a small 
JL volume from the NetApp:

JL Filesystem    size   used  avail capacity  Mounted on
JL               1.0T   8.7G  1005G     1%    /export/vol1

JL NAME     SIZE    USED   AVAIL    CAP  HEALTH  ALTROOT
JL zpool1  20.8T   5.00G   20.8T     0%  ONLINE  -

JL So the 8.7 GB copied onto a compressed volume takes up 5 GB. That is quite nice.
JL I enabled the same quotas for users, then ran quotacheck:

JL [snip]
JL #282759fixed:  files 0 - 4939  blocks 0 - 95888
JL #282859fixed:  files 0 - 9  blocks 0 - 144
JL Read from remote host x4500-test: Operation timed out
JL Connection to x4500-test closed.

JL and it has not come back, so not a panic, just a complete hang. I'll 
JL have to get NOC staff to go power cycle it.


JL We are bending over backwards trying to get the x4500 to work in a 
JL simple NAS design, but honestly, the x4500 is not a NAS. Nor can it 
JL compete with NetApps. As a Unix server with lots of disks, it is very nice.

JL Perhaps one day it will, mind you; it just is not there today.


Well, I can't agree with you.
While it may not be suitable in your specific case, as I stated
before, in many cases where user quotas are not needed x4500+ZFS is a
very compelling solution, and definitely cheaper and more flexible
(user quotas aside) than NetApp.

While I don't need user quotas I can understand people who do - if you
have only a couple (hundred?) of file systems and you are not constantly
creating/destroying them, then a file-system-per-user approach could work
(assuming you don't need users writing to common file systems while still
being held to a per-user quota) - nevertheless it's just a workaround; it
fits some cases and not others.
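
(For the archives, that per-user approach is basically just - a sketch with
made-up names:

# zfs create tank/home/alice
# zfs set quota=10G tank/home/alice

which gives a hard per-user limit, but only for data written under that
user's own file system.)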



-- 
Best regards,
 Robert Milkowski  mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Odd prioritisation issues.

2007-12-11 Thread Dickon Hood
On Fri, Dec 07, 2007 at 13:14:56 +, I wrote:
: On Fri, Dec 07, 2007 at 12:58:17 +, Darren J Moffat wrote:
: : Dickon Hood wrote:
: : On Fri, Dec 07, 2007 at 12:38:11 +, Darren J Moffat wrote:
: : : Dickon Hood wrote:

: : : We're seeing the writes stall in favour of the reads.  For normal
: : : workloads I can understand the reasons, but I was under the impression
: : : that real-time processes essentially trump all others, and I'm surprised
: : : by this behaviour; I had a dozen or so RT-processes sat waiting for disc
: : : for about 20s.

: : : Are the files opened with O_DSYNC or does the application call fsync ?

: : No.  O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND.  Would that help?

: : Don't know if it will help, but it will be different :-).  I suspected 
: : that since you put the processes in the RT class you would also be doing 
: : synchronous writes.

: Right.  I'll let you know on Monday; I'll need to restart it in the
: morning.

I was a tad busy yesterday and didn't have the time, but I've switched one
of our recorder processes (the one doing the HD stream; ~17Mb/s,
broadcasting a preview we don't mind trashing) to a version of the code
which opens its file O_DSYNC as suggested.
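
(For reference, the only change on the write path is the set of open(2) flags -
roughly, and not the actual recorder code:

  fd = open(path, O_WRONLY | O_CREAT | O_LARGEFILE | O_APPEND | O_DSYNC, 0644);

everything else is unchanged.)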

We've gone from ~130 write ops per second and 10MB/s to ~450 write ops per
second and 27MB/s, with a marginally higher CPU usage.  This is roughly
what I'd expect.

We've artificially throttled the reads, which has helped (but not fixed; it
isn't as effective as we'd like) the starvation problem, at the expense
of increasing a latency we'd rather keep as close to zero as possible.

Any ideas?

Thanks.

-- 
Dickon Hood

Due to digital rights management, my .sig is temporarily unavailable.
Normal service will be resumed as soon as possible.  We apologise for the
inconvenience in the meantime.

No virus was found in this outgoing message as I didn't bother looking.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Dismounting ZFS

2007-12-11 Thread David Dyer-Bennet
I've used zfs unmount on the pool on my external disk, but while the 
filesystems are no longer visible, zpool still shows the pool as 
online.  I suspect I shouldn't disconnect the external device at this 
point (or at least that it's not ideal).  What else should I do 
administratively before disconnecting an external device with a ZFS pool 
on it?  (Single device, simple pool, no redundancy).

-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dismounting ZFS

2007-12-11 Thread Kyle McDonald
David Dyer-Bennet wrote:
 I've used zfs unmount on the pool on my external disk, but while the 
 filesystems are no longer visible, zpool still shows the pool as 
 online.  I suspect I shouldn't disconnect the external device at this 
 point (or at least that it's not ideal).  What else/other should I do 
 administratively before disconnecting an external device with a ZFS pool 
 on it?  (Single device, simple pool, no redundancy).

   
'zpool export' might be a good idea. It will make things cleaner if you plug 
the disk into a different machine next time.

'zpool import' will be needed when you plug it back in.
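
In other words, roughly (assuming the pool is named, say, external):

# zpool export external
  ... disconnect the device, move it, reconnect it ...
# zpool import external

A bare 'zpool import' with no arguments lists any pools it can see, if you
forget the name.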

   -Kyle

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-11 Thread can you guess?
 Monday, December 10, 2007, 3:35:27 AM, you wrote:
 
 cyg  and it 
 made them slower
 
 cyg That's the second time you've claimed that, so you'll really at
 cyg least have to describe *how* you measured this even if the
 cyg detailed results of those measurements may be lost in the mists of time.
 
 
 cyg So far you don't really have much of a position to defend at
 cyg all:  rather, you sound like a lot of the disgruntled TOPS users
 cyg of that era.  Not that they didn't have good reasons to feel
 cyg disgruntled - but they frequently weren't very careful about aiming 
 their ire accurately.
 
 cyg Given that RMS really was *capable* of coming very close to the
 cyg performance capabilities of the underlying hardware, your
 cyg allegations just don't ring true.  Not being able to jump into
 
 And where is your proof that it was capable of coming very close to
 the...?

It's simple:  I *know* it, because I worked *with*, and *on*, it - for many 
years.  So when some bozo who worked with people with a major known chip on 
their shoulder over two decades ago comes along and knocks its capabilities, 
asking for specifics (not even hard evidence, just specific allegations which 
could be evaluated and if appropriate confronted) is hardly unreasonable.

Hell, *I* gave more specific reasons why someone might dislike RMS in 
particular and VMS in general (complex and therefore user-unfriendly low-level 
interfaces and sometimes poor *default* performance) than David did:  they just 
didn't happen to match those that he pulled out of (wherever) and that I 
challenged.

 Let me use your own words:
 
 In other words, you've got nothing, but you'd like people to believe it's 
 something.
 
 The phrase Put up or shut up comes to mind.
 
 Where are your proofs on some of your claims about ZFS?

Well, aside from the fact that anyone with even half a clue knows what the 
effects of uncontrolled file fragmentation are on sequential access performance 
(and can even estimate those effects within moderately small error bounds if 
they know what the disk characteristics are and how bad the fragmentation is), 
if you're looking for additional evidence that even someone otherwise totally 
ignorant could appreciate there's the fact that Unix has for over two decades 
been constantly moving in the direction of less file fragmentation on disk - 
starting with the efforts that FFS made to at least increase proximity and 
begin to remedy the complete disregard for contiguity that the early Unix file 
system displayed and to which ZFS has apparently regressed, through the 
additional modifications that Kleiman and McVoy introduced in the early '90s to 
group 56 KB of blocks adjacently when possible, through the extent-based 
architectures of VxFS, XFS, JFS, and the soon-to-be-released ext4 file systems 
(I'm probably missing others here):  given the relative changes between disk 
access times and bandwidth over the past decade and a half, ZFS with its max 
128 KB blocks in splendid isolation offers significantly worse sequential 
performance relative to what's attainable than the systems that used 56 KB 
aggregates back then did (and they weren't all that great in that respect).
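
(To put rough numbers on that - a back-of-envelope estimate, assuming a disk 
with ~8 ms average access time and ~60 MB/sec sustained transfer rate: a fully 
fragmented file read in isolated 128 KB pieces costs about 8 ms of positioning 
plus about 2 ms of transfer per piece, i.e. roughly 12 - 13 MB/sec, versus ~60 
MB/sec for a contiguous layout - call it a 4x - 5x penalty, and it only gets 
worse as disks get faster at streaming without getting faster at seeking.)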

Given how slow Unix was to understand and start to deal with this issue, 
perhaps it's not surprising how ignorant some Unix people still are - despite 
the fact that other platforms fully understood the problem over three decades 
ago.

Last I knew, ZFS was still claiming that it needed nothing like 
defragmentation, while describing write allocation mechanisms that could allow 
disastrous degrees of fragmentation under conditions that I've described quite 
clearly.  If ZFS made no efforts whatsoever in this respect the potential for 
unacceptable performance would probably already have been obvious even to its 
blindest supporters, so I suspect that when ZFS is given the opportunity by a 
sequentially-writing application that doesn't force every write (or by use of 
the ZIL in some cases) it aggregates blocks in a file together in cache and 
destages them in one contiguous chunk to disk (rather than just mixing blocks 
willy-nilly in its batch disk writes) - and a lot of the time there's probably 
not enough other system write activity to make this infeasible, so that people 
haven't found sequential streaming performance to be all that bad most of the 
time (especially on the read end if their systems are lightly loaded and the 
fact that their disks may be working a lot harder than they ought 
to have to is not a problem).

But the potential remains for severe fragmentation under heavily parallel access 
conditions, or when a file is updated at fine grain but then read sequentially 
(the whole basis of the recent database thread), and with that fragmentation 
comes commensurate performance degradation.  And even if you're not capable of 
understanding why yourself you should consider it significant that no one on 
the ZFS development team has piped up to say 

Re: [zfs-discuss] zpool kernel panics.

2007-12-11 Thread Edward Irvine
Hi Folks,

On 10/12/2007, at 12:22 AM, Edward Irvine wrote:

 Hi Folks,

 I've got a 3.9 TB zpool, and it is causing kernel panics on my  
 Solaris 10 280R (SPARC) server.

 The message I get on panic is this:

 panic[cpu1]/thread=2a100a95cc0: zfs: freeing free segment  
 (offset=423713792 size=1024)

 This seems to come about when the zpool is being used or being  
 scrubbed - about twice a day at the moment. After the reboot, the  
 scrub seems to have been forgotten about - I can't get a zpool  
 scrub to complete.

 Any suggestions very much appreciated...

 --- snip ---

 $ zpool status zpool1
   pool: zpool1
  state: ONLINE
  scrub: none requested
 config:

 NAME                           STATE     READ WRITE CKSUM
 zpool1                         ONLINE       0     0     0
   c7t600C0FF00B44BCE6BB00d0s2  ONLINE       0     0     0
   c7t600C0FF00B44BCE6BB01d0s2  ONLINE       0     0     0
   c7t600C0FF00B44BCE6BB02d0s0  ONLINE       0     0     0
   c7t600C0FF00B0BD10ACD00d0s3  ONLINE       0     0     0
   c7t600C0FF00B03D27D7100d0s0  ONLINE       0     0     0

 errors: No known data errors

 $ uname -a
 SunOS servername 5.10 Generic_120011-14 sun4u sparc SUNW,Sun-Fire-280R

  snip 

 Eddie




Each time the system crashes, it crashes with the same error message.  
This suggests to me that zpool corruption, rather than faulty RAM, is  
to blame.

So - is this particular zpool a lost cause?  :\

A number of folks have pointed out that this bug may have been fixed  
in a very recent build (nv-77?) of OpenSolaris. As a last-ditch  
approach, I'm thinking that I could put the current system disks  
(sol10u4) aside, do a quick install of the latest OpenSolaris, import  
the zpool, run a zpool scrub, export the zpool, shut down, swap the  
sol10u4 disks back in, reboot, and import again.
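
In command terms that would be roughly (assuming the pool imports cleanly on
the newer build):

# zpool import zpool1
# zpool scrub zpool1
# zpool status zpool1     (wait for the scrub to finish)
# zpool export zpool1

then shut down and swap the sol10u4 disks back in.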

Sigh. Does this approach sound plausible?

Eddie




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool kernel panics.

2007-12-11 Thread James C. McPherson

Hi Eddie,

Edward Irvine wrote:
 Each time the system crashes, it crashes with the same error message.  
 This suggests to me that it is zpool corruption rather than faulty  
 RAM, which is to blame.
 
 So - is this particular zpool a lost cause?  :\

It's looking that way to me, but I'm definitely no expert.

 A number of folks have pointed out that this bug may have been fixed  
 in a very recent version (nv-77?) of opensolaris.  As a last ditch  
 approach, I'm thinking that I could put the current system disks  
 (sol10u4) aside, do a quick install the latest opensolaris, import  
 the zpool, and do a zpool scrub, export the zpool, shutdown, swap in  
 the sol10u4 disks, reboot, import.
 Sigh. Does this approach sound plausible?

It's definitely worth a shot, as long as you don't have
to zpool upgrade in order to do it.

I pulled your crash dump inside Sun, thank you, but I haven't
had a chance to analyse it, so I've passed the details on to
more knowledgeable ZFS people.


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] x4500 recommendations for netbackup dsu?

2007-12-11 Thread Dave Lowenstein
Okay, my order for an x4500 went through, so sometime soon I'll be using 
it as a big honkin' storage area for DSUs and DSSUs for NetBackup.

Does anybody have any experience with using zfs compression for this 
purpose? The thought of effectively doubling 48 TB to 96 TB is enticing. Are 
there any other zfs tweaks that might aid performance for what will pretty 
much be a lot of long, large reads and writes?

I'm planning on one big chunk of space for a permanently on disk DSU, 
and another for the DSSU staging areas.

Also, I haven't looked into this but is a spare considered part of a 
zpool, or is there such a thing as a global spare?
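
For context, the knobs I have in mind are simply (with made-up pool/dataset
names):

# zfs set compression=on tank/dsu
# zpool add tank spare c5t7d0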
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-11 Thread Robert Milkowski
Hello can,

Tuesday, December 11, 2007, 6:57:43 PM, you wrote:

 Monday, December 10, 2007, 3:35:27 AM, you wrote:
 
 cyg  and it 
 made them slower
 
 cyg That's the second time you've claimed that, so you'll really at
 cyg least have to describe *how* you measured this even if the
 cyg detailed results of those measurements may be lost in the mists of time.
 
 
 cyg So far you don't really have much of a position to defend at
 cyg all:  rather, you sound like a lot of the disgruntled TOPS users
 cyg of that era.  Not that they didn't have good reasons to feel
 cyg disgruntled - but they frequently weren't very careful about aiming 
 their ire accurately.
 
 cyg Given that RMS really was *capable* of coming very close to the
 cyg performance capabilities of the underlying hardware, your
 cyg allegations just don't ring true.  Not being able to jump into
 
 And where is your proof that it was capable of coming very close to
 the...?

cyg It's simple:  I *know* it, because I worked *with*, and *on*, it
cyg - for many years.  So when some bozo who worked with people with
cyg a major known chip on their shoulder over two decades ago comes
cyg along and knocks its capabilities, asking for specifics (not even
cyg hard evidence, just specific allegations which could be evaluated
cyg and if appropriate confronted) is hardly unreasonable.

Bill, you openly criticize people (their work) who have worked on ZFS
for years... not that there's anything wrong with that, just please
realize that because you were working on it it doesn't mean it is/was
perfect - just the same as with ZFS.
I know, everyone loves their baby...

Nevertheless, just because you were working on and with it, that's not
proof. The person you were replying to was also working with it (though
not on it, I guess). Not that I'm interested in such a proof. I just
noticed that you're demanding proof while you yourself just make
statements about its performance without any actual proof.



 Let me use your own words:
 
 In other words, you've got nothing, but you'd like people to believe it's 
 something.
 
 The phrase Put up or shut up comes to mind.
 
 Where are your proofs on some of your claims about ZFS?

cyg Well, aside from the fact that anyone with even half a clue
cyg knows what the effects of uncontrolled file fragmentation are on
cyg sequential access performance (and can even estimate those
cyg effects within moderately small error bounds if they know what
cyg the disk characteristics are and how bad the fragmentation is),
cyg if you're looking for additional evidence that even someone
cyg otherwise totally ignorant could appreciate there's the fact that

I've never said there are no fragmentation problems with ZFS.
Well, actually I've been hit by the issue in one environment.
Also, you haven't done your homework properly, as one of the ZFS
developers has actually stated they are going to work on ZFS
de-fragmentation and disk removal (pool shrinking).
See http://www.opensolaris.org/jive/thread.jspa?messageID=139680#139680
Lukasz happens to be my friend who is also working with the same
environment.

The point is, and you as a long-time developer (I guess) should know it,
you can't have everything done at once (lack of resources, and it takes
some time anyway), so you must prioritize. ZFS is open source, and if
someone thinks that a given feature is more important than another,
he/she should try to fix it, or at least voice it here so the ZFS
developers can adjust their priorities if there's good enough
and justified demand.

Now the important part - quite a lot of people are using ZFS, from
desktop use on laptops to small and large production environments,
clustered environments, SAN environments, JBODs, entry-level to high-end arrays,
different applications, workloads, etc. And somehow you can't find
many complaints about ZFS fragmentation. That doesn't mean the problem
doesn't exist (and I know it first-hand) - it means that, for whatever
reason, for most people using ZFS it's not a big problem, if a problem at
all. However, they do have other issues, and many of them have already
been addressed or are being addressed. I would say that the ZFS developers
at least try to listen to the community.

Why am I asking for proof - well, given the constraints on resources, I
would say we (not that I'm a ZFS developer) should focus on the actual
problems people have with ZFS rather than theoretical problems (which
in some environments/workloads will show up, and sooner or later they
will have to be addressed too).

Then you find people like Pawel Jakub Dawidek (the guy who ported ZFS to
FreeBSD) who started experimenting with a RAID-5-like implementation
in ZFS - he even provided some numbers showing it might be worth
looking at. That's what community is about.

I don't see any point in complaining about ZFS all over again - have you
actually run into the problem with ZFS yourself? I guess not. You're just
assuming (correctly, for some usage cases). I guess your message has
been well 

Re: [zfs-discuss] Yager on ZFS

2007-12-11 Thread Toby Thain

On 11-Dec-07, at 9:44 PM, Robert Milkowski wrote:

 Hello can,
 ...

 What some people are also looking for, I guess, is a black-box
 approach - easy to use GUI on top of Solaris/ZFS/iSCSI/etc. So they
 don't have to even know it's ZFS or Solaris. Well...


Pretty soon OS X will be exactly that - a native booting zero-admin  
ZFS-based system - as used by your grandmother on her iMac, your kid  
son on his iBook, etc


...
 Wouldn't it better serve you to actually contribute to the other
 project, where developers actually get it - where no one is personally
 attacking you, where there are no fundamental bad choices made while
 in design, where RAID-5 is flawless, fragmentation problem doesn't
 exist neither all the other corner cases.

And don't forget - the perfect system doesn't waste time  
checksumming! It's unnecessary!

 Performance is best in a
 market all the time, and I can run in on commodity HW or so called big
 iron, on a well known general purpose OS. Well, I assume that project
 is open source too - maybe you share with all of us that secret so  
 we can
 join it too and forget about ZFS? ... perhaps it's time to stop  
 being Don Quixote
 and move on?

At least Sr Quixote was funny and never rude without provocation.

--Toby







 -- 
 Best regards,
  Robertmailto:[EMAIL PROTECTED]
http://milek.blogspot.com

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] how do i set a zfs mountpoint for use NEXT mount??

2007-12-11 Thread Brett
Folks,

Not sure if any of this is possible, but thought I would ask. This is all part 
of simplifying my 2 Indiana zfsboot environments.

I am wondering if there is a way to set the mountpoint of a zfs filesystem and not 
have it take effect immediately. I want this so I can set the mountpoint of my alternate 
zfs boot environment (zpl_slim/root2) to / (even though I currently have an active root 
fs of zpl_slim/root) and then have the grub bootfs parameter control booting from 
one root or the other. 

Are the zfs mountpoints stored in zpool.cache, on disk, or both? I was 
wondering if I can tweak zpool.cache or use zdb to achieve this. 

Essentially this is so the zfs filesystems under zpl_slim/root get mounted 
correctly through inheritance. Currently I achieve booting from the alternate root 
by having the roots set as legacy mountpoints referenced in /etc/vfstab. But what 
this means is that the underlying zfs filesystems (root2/opt, root2/usr, etc.) don't 
get mounted, as they inherit legacy mode.

Any assistance would be appreciated.

Rep
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Degraded zpool won't online disk device, instead resilvers spare

2007-12-11 Thread Kevin
I've got a zpool that has 4 raidz2 vdevs, each with 4 disks (750 GB), plus 4 
spares. At one point 2 disks failed (in different vdevs). The message in 
/var/adm/messages for the disks was 'device busy too long'. Then the fault 
manager printed this message:

Nov 23 04:23:51 x.x.com EVENT-TIME: Fri Nov 23 04:23:51 EST 2007
Nov 23 04:23:51 x.x.com PLATFORM: Sun Fire X4200 M2, CSN: 0734BD159F
  , HOSTNAME: x.x.com
Nov 23 04:23:51 x.x.com SOURCE: zfs-diagnosis, REV: 1.0
Nov 23 04:23:51 x.x.com EVENT-ID: bb0f6d83-0c12-6f0f-d121-99d72f7de981
Nov 23 04:23:51 x.x.com DESC: A ZFS device failed.  Refer to 
http://sun.com/msg/ZFS-8000-D3 for more information.
Nov 23 04:23:51 x.x.com AUTO-RESPONSE: No automated response will occur.
Nov 23 04:23:51 x.x.com IMPACT: Fault tolerance of the pool may be compromised.
Nov 23 04:23:51 x.x.com REC-ACTION: Run 'zpool status -x' and replace the bad 
device.

Interestingly, zfs reported the failure but did not bring two of the spare 
disks online to temporarily replace the failed disks.

Here's the zpool history output showing what happened after the failures (from 
Nov 26 on):

2007-11-21.20:56:47 zpool create tank raidz2 c5t22d0 c5t30d0 c5t23d0 c5t31d0
2007-11-21.20:57:07 zpool add tank raidz2 c5t24d0 c5t32d0 c5t25d0 c5t33d0
2007-11-21.20:57:17 zpool add tank raidz2 c5t26d0 c5t34d0 c5t27d0 c5t35d0
2007-11-21.20:57:35 zpool add tank raidz2 c5t28d0 c5t36d0 c5t29d0 c5t37d0
2007-11-21.20:57:44 zpool scrub tank
2007-11-23.02:15:38 zpool scrub tank
2007-11-26.12:16:41 zpool online tank c5t23d0
2007-11-26.12:17:48 zpool online tank c5t23d0
2007-11-26.12:18:59 zpool add tank spare c5t17d0
2007-11-26.12:29:32 zpool offline tank c5t29d0
2007-11-26.12:32:08 zpool online tank c5t29d0
2007-11-26.12:32:35 zpool scrub tank
2007-11-26.12:34:15 zpool scrub -s tank
2007-11-26.12:34:22 zpool export tank
2007-11-26.12:43:42 zpool import tank tank.2
2007-11-26.12:45:45 zpool export tank.2
2007-11-26.12:46:32 zpool import tank.2
2007-11-26.12:47:02 zpool scrub tank.2
2007-11-26.12:48:11 zpool add tank.2 spare c5t21d0 c4t17d0 c4t21d0
2007-11-26.14:02:08 zpool scrub -s tank.2
2007-11-27.01:56:35 zpool clear tank.2
2007-11-27.01:57:02 zfs set atime=off tank.2
2007-11-27.01:57:07 zfs set checksum=fletcher4 tank.2
2007-11-27.01:57:45 zfs create tank.2/a
2007-11-27.01:57:46 zfs create tank.2/b
2007-11-27.01:57:47 zfs create tank.2/c
2007-11-27.01:59:39 zpool scrub tank.2
2007-12-05.15:31:51 zpool online tank.2 c5t23d0
2007-12-05.15:32:02 zpool online tank.2 c5t29d0
2007-12-05.15:36:58 zpool online tank.2 c5t23d0
2007-12-05.16:24:56 zpool replace tank.2 c5t23d0 c5t17d0
2007-12-05.21:52:43 zpool replace tank.2 c5t29d0 c5t21d0
2007-12-06.16:12:24 zpool online tank.2 c5t29d0
2007-12-11.13:08:13 zpool online tank.2 c5t23d0
2007-12-11.19:52:38 zpool online tank.2 c5t29d0

You can see that I manually attached two of the spare devices to the pool. 
Scrubbing finished fairly quickly (probably within 5 hours).

Here is what the pool status looks like right now:

  pool: tank.2
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed with 0 errors on Tue Dec 11 19:58:17 2007
config:

        NAME           STATE     READ WRITE CKSUM
        tank.2         DEGRADED     0     0     0
          raidz2       DEGRADED     0     0     0
            c5t22d0    ONLINE       0     0     0
            c4t30d0    ONLINE       0     0     0
            spare      DEGRADED     0     0     0
              c5t23d0  UNAVAIL      0     0     0  cannot open
              c5t17d0  ONLINE       0     0     0
            c4t31d0    ONLINE       0     0     0
          raidz2       ONLINE       0     0     0
            c5t24d0    ONLINE       0     0     0
            c4t32d0    ONLINE       0     0     0
            c5t25d0    ONLINE       0     0     0
            c4t33d0    ONLINE       0     0     0
          raidz2       ONLINE       0     0     0
            c5t26d0    ONLINE       0     0     0
            c4t34d0    ONLINE       0     0     0
            c5t27d0    ONLINE       0     0     0
            c4t35d0    ONLINE       0     0     0
          raidz2       DEGRADED     0     0     0
            c5t28d0    ONLINE       0     0     0
            c4t36d0    ONLINE       0     0     0
            spare      DEGRADED     0     0     0
              c5t29d0  UNAVAIL      0     0     0  cannot open
              c5t21d0  ONLINE       0     0     0
            c4t37d0    ONLINE       0     0     0
        spares
          c5t17d0      INUSE     currently in use
          c5t21d0      INUSE     currently in use
          c4t17d0      AVAIL
          c4t21d0      AVAIL

errors: No known data errors

The disks failed because they were temporarily detached. 
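
(From the zpool man page, the two documented ways out of the spare-in-use
state appear to be - using the names above, and depending on whether the
original disk is actually healthy:

# zpool detach tank.2 c5t17d0     returns the spare to AVAIL once the original is back online
# zpool detach tank.2 c5t23d0     gives up on the original and makes the spare replacement permanent

but so far 'zpool online' on the original devices isn't taking.)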

Re: [zfs-discuss] Yager on ZFS

2007-12-11 Thread Al Hopper
On Tue, 11 Dec 2007, Robert Milkowski wrote:

 Hello can,

 Tuesday, December 11, 2007, 6:57:43 PM, you wrote:

 Monday, December 10, 2007, 3:35:27 AM, you wrote:

 cyg  and it
 made them slower

 cyg That's the second time you've claimed that, so you'll really at
 cyg least have to describe *how* you measured this even if the
 cyg detailed results of those measurements may be lost in the mists of 
 time.


 cyg So far you don't really have much of a position to defend at
 cyg all:  rather, you sound like a lot of the disgruntled TOPS users
 cyg of that era.  Not that they didn't have good reasons to feel
 cyg disgruntled - but they frequently weren't very careful about aiming 
 their ire accurately.

 cyg Given that RMS really was *capable* of coming very close to the
 cyg performance capabilities of the underlying hardware, your
 cyg allegations just don't ring true.  Not being able to jump into

 And where is your proof that it was capable of coming very close to
 the...?

 cyg It's simple:  I *know* it, because I worked *with*, and *on*, it
 cyg - for many years.  So when some bozo who worked with people with
 cyg a major known chip on their shoulder over two decades ago comes
 cyg along and knocks its capabilities, asking for specifics (not even
 cyg hard evidence, just specific allegations which could be evaluated
 cyg and if appropriate confronted) is hardly unreasonable.

 Bill, you openly criticize people (their work) who have worked on ZFS
 for years... not that there's anything wrong with that, just please
 realize that because you were working on it it doesn't mean it is/was
 perfect - just the same as with ZFS.
 I know, everyone loves their baby...

 Nevertheless, just because you were working on and with it, that's not
 proof. The person you were replying to was also working with it (though
 not on it, I guess). Not that I'm interested in such a proof. I just
 noticed that you're demanding proof while you yourself just make
 statements about its performance without any actual proof.



 Let me use your own words:

 In other words, you've got nothing, but you'd like people to believe it's 
 something.

 The phrase Put up or shut up comes to mind.

 Where are your proofs on some of your claims about ZFS?

 cyg Well, aside from the fact that anyone with even half a clue
 cyg knows what the effects of uncontrolled file fragmentation are on
 cyg sequential access performance (and can even estimate those
 cyg effects within moderately small error bounds if they know what
 cyg the disk characteristics are and how bad the fragmentation is),
 cyg if you're looking for additional evidence that even someone
 cyg otherwise totally ignorant could appreciate there's the fact that

 I've never said there are no fragmentation problems with ZFS.
 Well, actually I've been hit by the issue in one environment.
 Also, you haven't done your homework properly, as one of the ZFS
 developers has actually stated they are going to work on ZFS
 de-fragmentation and disk removal (pool shrinking).
 See http://www.opensolaris.org/jive/thread.jspa?messageID=139680#139680
 Lukasz happens to be my friend who is also working with the same
 environment.

 The point is, and you as a long time developer (I guess) should know it,
 you can't have everything done at once (lack of resources, and it takes
 some time anyway) so you must prioritize. ZFS is open source and if
 someone thinks that given feature is more important than the other
 he/she should try to fix it or at least voice it here so ZFS
 developers can possibly adjust their priorities if there's good enough
 and justified demand.

 Now the important part - quite a lot of people are using ZFS, from
 desktop usage, their laptops, small to big production environments,
 clustered environments, SAN environments, JBODs, entry-level to high-end 
 arrays,
 different applications, workloads, etc. And somehow you can't find
 many complaints about ZFS fragmentation. It doesn't mean the problem
 doesn't exist (and I know it first hand) - it means that for whatever
 reason for most people using ZFS it's not a big problem if problem at
 all. However they do have other issues and many of them were already
 addressed or are being addressed. I would say that ZFS developers at
 least try to listen to the community.

 Why am I asking for proof - well, given the constraints on resources, I
 would say we (not that I'm a ZFS developer) should focus on the actual
 problems people have with ZFS rather than theoretical problems (which
 in some environments/workloads will show up and sooner or later they
 will have to be addressed too).

 Then you find people like Pawel Jakub Dawidek (the guy who ported ZFS to
 FreeBSD) who started experimenting with a RAID-5-like implementation
 in ZFS - he even provided some numbers showing it might be worth
 looking at. That's what community is about.

 I don't see any point complaining about ZFS all over again - have you
 actually run into the problem with ZFS yourself? I