Re: [zfs-discuss] 3ware support

2008-02-12 Thread Nicolas Szalay
On Tuesday 12 February 2008 at 07:22 +0100, Johan Kooijman wrote:
 Good morning all,

Hi,

 Can anyone confirm that 3ware RAID controllers are indeed not working
 under Solaris/OpenSolaris? I can't seem to find them in the HCL.

I can confirm that they don't work.

 We're now using a 3Ware 9550SX as an S-ATA RAID controller. The
 original plan was to disable all its RAID functions and use just the
 S-ATA controller functionality for ZFS deployment.

 If indeed 3Ware isn't supported, I'll have to buy a new controller. Any
 specific controller/brand you can recommend for Solaris?

I use Areca cards, with the driver supplied by Areca (certified in the
HCL)

Have a nice day,

-- 
Nicolas Szalay

Systems & network administrator

--                      _
ASCII ribbon campaign  ( )
 - against HTML email   X
   & vCards            / \


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 3ware support

2008-02-12 Thread Lida Horn
Jason J. W. Williams wrote:
 X4500 problems seconded. Still having issues with port resets due to
 the Marvell driver, though they seem considerably more transient and
 less likely to lock up the entire system in the most recent (> b72)
 OpenSolaris builds.
   
Build 72 is pretty old.  The build date for that build was August 27, 2007.
It looks like build 75 should have been pretty good (October 8, 2007), but
for the absolute most up-to-date stuff you want build 84 (which will be
available on February 25, 2008).  The source code changes are
already visible in OpenSolaris and the marvell88sx binary is likewise
downloadable (I wish I could provide the source, but Marvell says no).

Please try something more recent than a build that is over five months old.

Regards,
Lida
 -J

 On Feb 12, 2008 9:35 AM, Carson Gaspar [EMAIL PROTECTED] wrote:
   
 Tim wrote:
 
 A much cheaper (and probably the BEST supported) card is the Supermicro
 based on the Marvell chipset.  This is the same chipset that is used in
 the Thumper X4500, so you know that the folks at Sun are doing their due
 diligence to make sure the drivers are solid.
   
 Except the drivers _aren't_ solid, at least in Solaris(tm). The
 OpenSolaris drivers may have been fixed (I know a lot of work is going
 into them, but I haven't tested them), but those fixes have not made it
 back into the supported realm.

 So if you need to run a supported OS, I'd skip the Marvell chips if
 possible, at least for now.

 --
 Carson


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 3ware support

2008-02-12 Thread Jason J. W. Williams
X4500 problems seconded. Still having issues with port resets due to
the Marvell driver, though they seem considerably more transient and
less likely to lock up the entire system in the most recent (> b72)
OpenSolaris builds.

-J

On Feb 12, 2008 9:35 AM, Carson Gaspar [EMAIL PROTECTED] wrote:
 Tim wrote:
 

  A much cheaper (and probably the BEST supported) card is the Supermicro
  based on the Marvell chipset.  This is the same chipset that is used in
  the Thumper X4500, so you know that the folks at Sun are doing their due
  diligence to make sure the drivers are solid.

 Except the drivers _aren't_ solid, at least in Solaris(tm). The
 OpenSolaris drivers may have been fixed (I know a lot of work is going
 into them, but I haven't tested them), but those fixes have not made it
 back into the supported realm.

 So if you need to run a supported OS, I'd skip the Marvell chips if
 possible, at least for now.

 --
 Carson


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Lost intermediate snapshot; incremental backup still possible?

2008-02-12 Thread Jeff Bonwick
I think so.  On your backup pool, roll back to the last snapshot that
was successfully received.  Then you should be able to send an incremental
between that one and the present.
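
Something like this should work (the dataset and snapshot names here are
just placeholders, not from Ian's setup):

  zfs rollback -r backup/home@2008-02-01      # last snapshot both sides still have
  zfs snapshot tank/home@2008-02-12           # fresh snapshot on the source
  zfs send -i tank/home@2008-02-01 tank/home@2008-02-12 | zfs receive backup/home

This assumes the sending side still has that last successfully received
snapshot; the -r on rollback just discards any snapshots the backup pool
took after it.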

Jeff

On Thu, Feb 07, 2008 at 08:38:38AM -0800, Ian wrote:
 I keep my system synchronized to a USB disk from time to time.  The script 
 works by sending incremental snapshots to a pool on the USB disk, then 
 deleting those snapshots from the source machine.
 
 A botched script ended up deleting a snapshot that was not successfully 
 received on the USB disk.  Now, I've lost the ability to send incrementally 
 since the intermediate snapshot is lost.  From what I gather, if I try to 
 send a full snapshot, it will require deleting and replacing the dataset on 
 the USB disk.  Is there any way around this?
  
  
 This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Need help with a dead disk

2008-02-12 Thread Marion Hakanson
[EMAIL PROTECTED] said:
 One thought I had was to unconfigure the bad disk with cfgadm. Would  that
 force the system back into the 'offline' response? 

In my experience (X4100 internal drive), that will make ZFS stop trying
to use it.  It's also a good idea to do this before you hot-unplug the
bad drive to replace it with a new one.
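
For the archives, the whole sequence would look something like this (the
attachment point name is a guess based on the c2t2d0 device in Brian's
posts; check cfgadm -al on your own box):

  cfgadm -al                            # find the attachment point, e.g. c2::dsk/c2t2d0
  cfgadm -c unconfigure c2::dsk/c2t2d0  # stop the OS (and ZFS) from touching it
  # ...physically swap the drive...
  cfgadm -c configure c2::dsk/c2t2d0
  zpool replace pool1 c2t2d0            # resilver onto the new disk in the same slot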

Regards,

Marion


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Computer usable output for zpool commands

2008-02-12 Thread eric kustarz

On Feb 1, 2008, at 7:17 AM, Nicolas Dorfsman wrote:

 Hi,

 I wrote a Hobbit script around the lunmap/hbamap commands to monitor
 SAN health.
 I'd like to add detail on what is being hosted by those LUNs.

 With SVM, metastat -p is helpful.

 With ZFS, the zpool status output is awful for scripting.

 Is there a utility somewhere to show zpool information in a
 scriptable format?

What exactly do you want to display?

We have the '-H' option to 'zfs list' and 'zfs get' for parsing.   
Feel free to experiment with the code to make zpool output more
scriptable:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/zpool/zpool_main.c
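
For example, something along these lines (just the zfs side, not zpool):
-H drops the headers and separates the columns with tabs, so the output
can go straight into awk or cut:

  zfs list -H -t filesystem -o name,used,available,mountpoint
  zfs get -H -o name,value available pool1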

eric

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Need help with a dead disk

2008-02-12 Thread Brian H. Nelson
Here's a bit more info. The drive appears to have failed at 22:19 EST 
but it wasn't until 1:30 EST the next day that the system finally 
decided that it was bad. (Why?) Here's some relevant log stuff (with 
lots of repeated 'device not responding' errors removed). I don't know if 
it will be useful:


Feb 11 22:19:09 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 11 22:19:09 maxwell SCSI transport failed: reason 
'incomplete': retrying command
Feb 11 22:19:10 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):

Feb 11 22:19:10 maxwell disk not responding to selection
...
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED] (isp0):

Feb 11 22:21:08 maxwell SCSI Cable/Connection problem.
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.notice]   
Hardware/Firmware error.
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED] (isp0):

Feb 11 22:21:08 maxwell Fatal error, resetting interface, flg 16

... (Why did this take so long?)

Feb 12 01:30:05 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):

Feb 12 01:30:05 maxwell offline
...
Feb 12 01:30:22 maxwell fmd: [ID 441519 daemon.error] SUNW-MSG-ID: 
ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major

Feb 12 01:30:22 maxwell EVENT-TIME: Tue Feb 12 01:30:22 EST 2008
Feb 12 01:30:22 maxwell PLATFORM: SUNW,Ultra-250, CSN: -, HOSTNAME: maxwell
Feb 12 01:30:22 maxwell SOURCE: zfs-diagnosis, REV: 1.0
Feb 12 01:30:22 maxwell EVENT-ID: 7f48f376-2eb1-ccaf-afc5-e56f5bf4576f
Feb 12 01:30:22 maxwell DESC: A ZFS device failed.  Refer to 
http://sun.com/msg/ZFS-8000-D3 for more information.

Feb 12 01:30:22 maxwell AUTO-RESPONSE: No automated response will occur.
Feb 12 01:30:22 maxwell IMPACT: Fault tolerance of the pool may be 
compromised.
Feb 12 01:30:22 maxwell REC-ACTION: Run 'zpool status -x' and replace 
the bad device.



One thought I had was to unconfigure the bad disk with cfgadm. Would 
that force the system back into the 'offline' response?


Thanks,
-Brian



Brian H. Nelson wrote:
Ok. I think I answered my own question. ZFS _didn't_ realize that the 
disk was bad/stale. I power-cycled the failed drive (external) to see if 
it would come back up and/or run diagnostics on it. As soon as I did 
that, ZFS put the disk ONLINE and started using it again! Observe:


bash-3.00# zpool status
  pool: pool1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        ONLINE       0     0     0
          raidz1     ONLINE       0     0     0
            c0t9d0   ONLINE       0     0     0
            c0t10d0  ONLINE       0     0     0
            c0t11d0  ONLINE       0     0     0
            c0t12d0  ONLINE       0     0     0
            c2t0d0   ONLINE       0     0     0
            c2t1d0   ONLINE       0     0     0
            c2t2d0   ONLINE   2.11K 20.09     0

errors: No known data errors


Now I _really_ have a problem. I can't offline the disk myself:

bash-3.00# zpool offline pool1 c2t2d0
cannot offline c2t2d0: no valid replicas


I don't understand why, as 'zpool status' says all the other drives are OK.

What's worse, if I just power off the drive in question (trying to get 
back to where I started) the zpool hangs completely! I let it go for 
about 7 minutes thinking maybe there was some timeout, but still 
nothing. Any command that would access the zpool (including 'zpool
status') hangs. The only way to fix it is to power the external disk back
on, upon which everything starts working like nothing has happened.
Nothing gets logged other than lots of these only while the drive is 
powered off:


Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):

Feb 12 11:49:32 maxwell disk not responding to selection
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):

Feb 12 11:49:32 maxwell offline or reservation conflict
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 

[zfs-discuss] Need help with a dead disk (was: ZFS keeps trying to open a dead disk: lots of logging)

2008-02-12 Thread Brian H. Nelson
Ok. I think I answered my own question. ZFS _didn't_ realize that the 
disk was bad/stale. I power-cycled the failed drive (external) to see if 
it would come back up and/or run diagnostics on it. As soon as I did 
that, ZFS put the disk ONLINE and started using it again! Observe:

bash-3.00# zpool status
  pool: pool1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        ONLINE       0     0     0
          raidz1     ONLINE       0     0     0
            c0t9d0   ONLINE       0     0     0
            c0t10d0  ONLINE       0     0     0
            c0t11d0  ONLINE       0     0     0
            c0t12d0  ONLINE       0     0     0
            c2t0d0   ONLINE       0     0     0
            c2t1d0   ONLINE       0     0     0
            c2t2d0   ONLINE   2.11K 20.09     0

errors: No known data errors


Now I _really_ have a problem. I can't offline the disk myself:

bash-3.00# zpool offline pool1 c2t2d0
cannot offline c2t2d0: no valid replicas

I don't understand why, as 'zpool status' says all the other drives are OK.

What's worse, if I just power off the drive in question (trying to get 
back to where I started) the zpool hangs completely! I let it go for 
about 7 minutes thinking maybe there was some timeout, but still 
nothing. Any command that would access the zpool (including 'zpool
status') hangs. The only way to fix it is to power the external disk back
on, upon which everything starts working like nothing has happened.
Nothing gets logged other than lots of these only while the drive is 
powered off:

Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell disk not responding to selection
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell offline or reservation conflict
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell i/o to invalid geometry


What's going on here? What can I do to make ZFS let go of the bad drive? 
This is a production machine and I'm getting concerned. I _really_ don't 
like the fact that ZFS is using a suspect drive, but I can't seem to 
make it stop!

Thanks,
-Brian


Brian H. Nelson wrote:
 This is Solaris 10U3 w/127111-05.

 It appears that one of the disks in my zpool died yesterday. I got 
 several SCSI errors finally ending with 'device not responding to 
 selection'. That seems to be all well and good. ZFS figured it out and 
 the pool is degraded:

 maxwell /var/adm zpool status
   pool: pool1
  state: DEGRADED
 status: One or more devices could not be opened.  Sufficient replicas 
 exist for
 the pool to continue functioning in a degraded state.
 action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-D3
  scrub: none requested
 config:

         NAME         STATE     READ WRITE CKSUM
         pool1        DEGRADED     0     0     0
           raidz1     DEGRADED     0     0     0
             c0t9d0   ONLINE       0     0     0
             c0t10d0  ONLINE       0     0     0
             c0t11d0  ONLINE       0     0     0
             c0t12d0  ONLINE       0     0     0
             c2t0d0   ONLINE       0     0     0
             c2t1d0   ONLINE       0     0     0
             c2t2d0   UNAVAIL  1.88K 17.98     0  cannot open

 errors: No known data errors


 My question is why does ZFS keep attempting to open the dead device? At 
 least that's what I assume is happening. About every minute, I get eight 
 of these entries in the messages log:

 Feb 12 10:15:54 maxwell scsi: [ID 107833 kern.warning] WARNING: 
 /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
 Feb 12 10:15:54 maxwell disk not responding to selection

 I also got a number of these thrown in for good measure:

 Feb 11 22:21:58 maxwell scsi: [ID 107833 kern.warning] WARNING: 
 /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
 Feb 11 22:21:58 maxwell SYNCHRONIZE CACHE command failed (5)


 Since the disk died last night (at about 11:20pm EST) I now have over 
 15K of similar entries in my log. What gives? Is this expected behavior? 
 If ZFS knows the device is having problems, why does it not just leave 
 it 

Re: [zfs-discuss] 3ware support

2008-02-12 Thread Lida Horn
Carson Gaspar wrote:
 Tim wrote:
   

   
 A much cheaper (and probably the BEST supported) card is the Supermicro
 based on the Marvell chipset.  This is the same chipset that is used in
 the Thumper X4500, so you know that the folks at Sun are doing their due
 diligence to make sure the drivers are solid.
 

 Except the drivers _aren't_ solid, at least in Solaris(tm). The 
 OpenSolaris drivers may have been fixed (I know a lot of work is going 
 into them, but I haven't tested them), but those fixes have not made it 
 back into the supported realm.

 So if you need to run a supported OS, I'd skip the Marvell chips if 
 possible, at least for now.

   
Does this mean that support still has not provided you with working code?
I am surprised if that is true.  I do not know of any reason why this
should be the case.  If you have not been given fixed code, I think you
should escalate up the support chain.

Further, the more customers push for getting the latest changes that are
in OpenSolaris into Solaris 10, the more likely it is that the individuals
responsible for evaluating what should be backported to Solaris 10 will
accept those changes.

Regards and sympathy,
Lida
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 3ware support

2008-02-12 Thread Rob Windsor
Johan Kooijman wrote:
 Good morning all,

 Can anyone confirm that 3ware RAID controllers are indeed not working
 under Solaris/OpenSolaris? I can't seem to find them in the HCL.

 We're now using a 3Ware 9550SX as an S-ATA RAID controller. The
 original plan was to disable all its RAID functions and use just the
 S-ATA controller functionality for ZFS deployment.

 If indeed 3Ware isn't supported, I'll have to buy a new controller. Any
 specific controller/brand you can recommend for Solaris?

3ware cards do not work (as previously stated).  Even under
Linux/Windows they're pretty flaky -- if you had Solaris drivers, you'd
probably shoot yourself in a month anyway.

I'm using the SuperMicro AOC-SAT2-MV8 at the recommendation of someone
else on this list.  It's a JBOD card, which is perfect for ZFS.  Also,
you won't be paying for RAID functionality that you want to
disable anyway.

Rob++
-- 
Internet: [EMAIL PROTECTED]   __o
Life: [EMAIL PROTECTED]     _`\<,_
                           (_)/ (_)
They couldn't hit an elephant at this distance.
   -- Major General John Sedgwick
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Need help with a dead disk

2008-02-12 Thread Ross
Hmm... this won't help you, but I think I'm having similar problems with an 
iSCSI target device.  If I offline the target, zfs hangs for just over 5 
minutes before it realises the device is unavailable, and even then it doesn't 
report the problem until I repeat the zpool status command.

What I see here every time is:
 - iSCSI device disconnected
 - zpool status, and all file i/o appears to hang for 5 mins
 - zpool status then finishes (reporting pools ok), and i/o carries on.
 - Immediately running zpool status again correctly shows the device as faulty 
and the pool as degraded.

It seems either ZFS or the Solaris driver stack has a problem when devices go 
offline.  Both of us have seen zpool status hang for huge amounts of time when 
there's a problem with a drive.  Not something that inspires confidence in a 
raid system.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Which DTrace provider to use

2008-02-12 Thread Jonathan Loran

Hi List,

I'm wondering if one of you expert DTrace gurus can help me.  I want to
write a DTrace script to print out a histogram of how long I/O requests
sit in the service queue.  I can output the results with the quantize()
aggregation.  I'm not sure which provider I should be using for this.  Does
anyone know?  I can easily adapt one of the DTrace Toolkit routines for
this, if I can find the provider.
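
For reference, this is the sort of thing I have in mind, adapted from the
iotime.d idiom in the DTrace Toolkit (untested sketch; it times each buf
from io:::start to io:::done, so it includes time spent queued at the
device):

  dtrace -qn '
  io:::start { ts[arg0] = timestamp; }
  io:::done /ts[arg0]/ {
          @lat["block I/O time (ns)"] = quantize(timestamp - ts[arg0]);
          ts[arg0] = 0;
  }
  tick-30s { printa(@lat); trunc(@lat); }'

The quantize() buckets would then show how the wait times are distributed
rather than just an average.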

I'll also throw out the problem I'm trying to meter.  We are using ZFS
on a large SAN array (4TB).  The pool on this array serves up a lot of
users (250 home file systems/directories) and also /usr/local and other
OTS software.  It works fine most of the time, but then gets overloaded
during busy periods.  I'm going to reconfigure the array to help with
this, but I sure would love to have some metrics to know how big a
difference my tweaks are making.  Basically, what the users
experience when the load shoots up is huge latencies.  An ls on a
non-cached directory, which usually is instantaneous, will take 20, 30,
40 seconds or more.  Then when the storage array catches up, things get
better.  My clients are not happy campers.

I know, I know, I should have gone with a JBOD setup, but it's too late
for that in this iteration of this server.  When we set this up, I had the
gear already, and it's not in my budget to get new stuff right now.

Thanks for any help anyone can offer.

Jon

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 3ware support

2008-02-12 Thread Carson Gaspar
Tim wrote:
 

 A much cheaper (and probably the BEST supported) card is the Supermicro
 based on the Marvell chipset.  This is the same chipset that is used in
 the Thumper X4500, so you know that the folks at Sun are doing their due
 diligence to make sure the drivers are solid.

Except the drivers _aren't_ solid, at least in Solaris(tm). The 
OpenSolaris drivers may have been fixed (I know a lot of work is going 
into them, but I haven't tested them), but those fixes have not made it 
back into the supported realm.

So if you need to run a supported OS, I'd skip the Marvell chips if 
possible, at least for now.

-- 
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS keeps trying to open a dead disk: lots of logging

2008-02-12 Thread Brian H. Nelson
This is Solaris 10U3 w/127111-05.

It appears that one of the disks in my zpool died yesterday. I got 
several SCSI errors finally ending with 'device not responding to 
selection'. That seems to be all well and good. ZFS figured it out and 
the pool is degraded:

maxwell /var/adm zpool status
  pool: pool1
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas 
exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        DEGRADED     0     0     0
          raidz1     DEGRADED     0     0     0
            c0t9d0   ONLINE       0     0     0
            c0t10d0  ONLINE       0     0     0
            c0t11d0  ONLINE       0     0     0
            c0t12d0  ONLINE       0     0     0
            c2t0d0   ONLINE       0     0     0
            c2t1d0   ONLINE       0     0     0
            c2t2d0   UNAVAIL  1.88K 17.98     0  cannot open

errors: No known data errors


My question is why does ZFS keep attempting to open the dead device? At 
least that's what I assume is happening. About every minute, I get eight 
of these entries in the messages log:

Feb 12 10:15:54 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 10:15:54 maxwell disk not responding to selection

I also got a number of these thrown in for good measure:

Feb 11 22:21:58 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 11 22:21:58 maxwell SYNCHRONIZE CACHE command failed (5)


Since the disk died last night (at about 11:20pm EST) I now have over 
15K of similar entries in my log. What gives? Is this expected behavior? 
If ZFS knows the device is having problems, why does it not just leave 
it alone and wait for user intervention?

Also, I noticed that the 'action' says to attach the device and 'zpool 
online' it. Am I correct in assuming that a 'zpool replace' is what 
would really be needed, as the data on the disk will be outdated?

Thanks,
-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] scrub halts

2008-02-12 Thread Lida Horn
Will Murnane wrote:
 On Feb 12, 2008 4:45 AM, Lida Horn [EMAIL PROTECTED] wrote:
   
 The latest changes to the sata and marvell88sx modules
 have been put  back to Solaris Nevada and should be
 available in the next build (build 84).  Hopefully,
 those of you who use it will find the changes helpful.
 
 I have indeed found it beneficial.  I installed the new drivers on two
 machines, both of which were intermittently giving errors about device
 resets.  One card did this so often that I believed the card was
 faulty and I would have to replace either the card or the motherboard.
   
I'm glad you find the new modules useful and am pleased with your results.
One thing of which I would like you to be aware is that some of what was
done was to suppress the messages.  In other words, some of what was
happening before is still happening, just silently.
 Since installing the new drivers I've had no issues whatsoever with
 drives on either box.  I ran zpool scrubs continuously on the flaky
 box, replaced a disk with another one, and copied data about in an
 attempt to replicate the bus errors I had previously seen, to no
 avail.  The other box has been similarly stable, as far as I can tell;
 I see no messages in the logs and the users haven't complained when I
 asked them.
   
No issues whatsoever, wonderful words to hear!
 Thank you for the work you've put into improving the state of these
 drivers; I meant to email you earlier this week and mention the great
 strides they have made, but other things took precedence.  That, to my
 mind, is the primary evolution these drivers have made: I don't have
 to worry about my HBAs any more.
   
I appreciate your taking the time to post and hope you have no further
issues with the driver.

Thank you,
Lida
 Thanks!
 Will
   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 3ware support

2008-02-12 Thread Tim
On 2/12/08, Johan Kooijman [EMAIL PROTECTED] wrote:

 Good morning all,

 Can anyone confirm that 3ware RAID controllers are indeed not working
 under Solaris/OpenSolaris? I can't seem to find them in the HCL.

 We're now using a 3Ware 9550SX as an S-ATA RAID controller. The
 original plan was to disable all its RAID functions and use just the
 S-ATA controller functionality for ZFS deployment.

 If indeed 3Ware isn't supported, I'll have to buy a new controller. Any
 specific controller/brand you can recommend for Solaris?

 --
 Met vriendelijke groeten / With kind regards,
 Johan Kooijman

 T +31(0) 6 43 44 45 27
 F +31(0) 76 201 1179
 E [EMAIL PROTECTED]



Johan,

A much cheaper (and probably the BEST supported) card is the Supermicro
based on the Marvell chipset.  This is the same chipset that is used in the
Thumper X4500, so you know that the folks at Sun are doing their due
diligence to make sure the drivers are solid.

It's also much cheaper than almost all RAID-based alternatives to boot.  If
you aren't using the RAID functionality, don't waste your money on a RAID
card :)

http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm

Here's where I purchased mine from, but I'm guessing you are not in the US
and they don't ship to your country of origin.
http://www.ewiz.com/detail.php?p=AOC-SAT2MV&c=fr&pid=84b59337aa4414aa488fdf95dfd0de1a1e2a21528d6d2fbf89732c9ed77b72a4
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] We can't import pool zfs faulted

2008-02-12 Thread Stéphane Delmotte
Yesterday we needed to stop our NFS file server.
After the restart, one ZFS pool was marked as faulted.

We exported the faulted pool.
We tried to import it (even with the -f option), but it fails.
The message from Solaris is that it cannot import the pool because one or more devices are currently unavailable.

What can I do?
Is there a way to rebuild a faulted pool?

I have no backup.

Thanks for any help.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%

2008-02-12 Thread Thomas Liesner
bda wrote:

 I haven't noticed this behavior when ZFS has (as recommended) the
 full disk.

Good to know, as I intended to use the whole disks anyway.
Thanks,
Tom
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%

2008-02-12 Thread Thomas Liesner
Ralf Ramge wrote:

 Quotas are applied to file systems, not pools, and a such are pretty 
 independent from the pool size. I found it best to give every user 
 his/her own filesystem and applying individual quotas afterwards.

Does this mean that if I have a pool of 7TB with one filesystem for all users
with a quota of 6TB, I'd be alright?
The usage of that fs would never be over 80%, right?

Like in the following example for the pool "shares", with a pool size of 228G and
one fs with a quota of 100G:

shares               228G    28K   220G    1%   /shares
shares/production    100G   8,4G    92G    9%   /shares/production

This would suit me perfectly, as this would be exactly what I wanted to do ;)

Thanks,
Tom
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%

2008-02-12 Thread Bryan Allen
+--
| On 2008-02-12 02:40:33, Thomas Liesner wrote:
| 
| Subject: Re: [zfs-discuss] Avoiding performance decrease when pool usage is
|  over 80%
| 
| Nobody out there who ever had problems with low diskspace?

Only in shared-disk setups, i.e. where ZFS lives on a slice or on
partition 2 with (typically) UFS slices or UFS on partition 1.

I've definitely tried to keep disk util under 80% for this
reason. Things become very slow as you pass that limit.

I haven't noticed this behavior when ZFS has (as recommended) the
full disk.
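
To illustrate the difference (device names made up):

  zpool create tank raidz c1t0d0 c1t1d0 c1t2d0   # whole disks; ZFS labels them itself and can enable the write cache
  zpool create tank c1t0d0s7                     # ZFS on one slice of a disk shared with UFS

The first form is the whole-disk case I mean.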
-- 
bda
Cyberpunk is dead.  Long live cyberpunk.
http://mirrorshades.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%

2008-02-12 Thread Thomas Liesner
Nobody out there who ever had problems with low diskspace?

Regards,
Tom
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 3ware support

2008-02-12 Thread James C. McPherson
Nicolas Szalay wrote:
 On Tuesday 12 February 2008 at 07:22 +0100, Johan Kooijman wrote:
 Good morning all,

 Hi,

 Can anyone confirm that 3ware RAID controllers are indeed not working
 under Solaris/OpenSolaris? I can't seem to find them in the HCL.

 I can confirm that they don't work.

 We're now using a 3Ware 9550SX as an S-ATA RAID controller. The
 original plan was to disable all its RAID functions and use just the
 S-ATA controller functionality for ZFS deployment.

 If indeed 3Ware isn't supported, I'll have to buy a new controller. Any
 specific controller/brand you can recommend for Solaris?
 
 I use Areca cards, with the driver supplied by Areca (certified in the
 HCL)


I'm working on getting arcmsr integrated into OpenSolaris,
and I hope to integrate it into build 87.

The RFE is

6614012 add Areca SAS/SATA RAID adapter driver
PSARC 2008/079 arcmsr SAS/SATA RAID driver

The existing case materials (spec and manpage) should be
visible on www.opensolaris.org/os/community/arc


cheers
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%

2008-02-12 Thread Ralf Ramge
Thomas Liesner wrote:
 Nobody out there who ever had problems with low diskspace?

   
Okay, I found your original mail :-)

Quotas are applied to file systems, not pools, and as such are pretty
independent of the pool size. I found it best to give every user
his/her own filesystem and apply individual quotas afterwards.

-- 

Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963 
[EMAIL PROTECTED] - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe

Amtsgericht Montabaur HRB 6484

Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger, 
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Norbert Lang, 
Achim Weiss 
Aufsichtsratsvorsitzender: Michael Scheeren

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%

2008-02-12 Thread Ralf Ramge
Thomas Liesner wrote:
 Does this mean that if I have a pool of 7TB with one filesystem for all
 users with a quota of 6TB, I'd be alright?
   
Yep. Although I *really* recommend creating individual file systems;
e.g. if you have 1,000 users on your server, I'd create 1,000 file
systems with a quota of 6 GB each.  Easier to handle, more flexible to
use, easier to back up, it allows better use of snapshots, and it's easier
to migrate single users to other servers.
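
Roughly like this (pool and user names are placeholders, of course):

  zfs create pool1/home
  zfs create pool1/home/alice
  zfs create pool1/home/bob
  zfs set quota=6G pool1/home/alice
  zfs set quota=6G pool1/home/bob

You can still put one big quota on pool1/home on top of that if you want a
hard cap for everybody together.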

 The usage of that fs would never be over 80%, right?

   
Nope.

Don't mix up pools and file systems. Your pool of 7TB will only be
filled to a maximum of 6TB, but the file system will be 100% full, which
shouldn't impact your overall performance.

 Like in the following example for the pool "shares", with a pool size of 228G and
 one fs with a quota of 100G:

 shares               228G    28K   220G    1%   /shares
 shares/production    100G   8,4G    92G    9%   /shares/production

 This would suit me perfectly, as this would be exactly what I wanted to do ;)

   
Yep, you got it.

-- 

Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963 
[EMAIL PROTECTED] - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe

Amtsgericht Montabaur HRB 6484

Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger, 
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Norbert Lang, 
Achim Weiss 
Aufsichtsratsvorsitzender: Michael Scheeren

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] We can't import pool zfs faulted

2008-02-12 Thread Thomas Liesner
If you can't use zpool status, you should probably check whether the system
is right and some of the devices needed for this pool really are currently
unavailable...

i.e. with format...
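
Something like this (output depends on your hardware, of course):

  zpool import          # lists pools available for import and the state of each device
  format < /dev/null    # lists the disks the OS can currently see

If a disk the pool needs doesn't show up in format, that's where to look.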

Regards,
Tom
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss