Re: [zfs-discuss] Raid-Z Issue

2009-09-13 Thread Mads Skipper
I am buying a Lycom 2-port SATA controller as I just want it to work. But the 
disks I use are Western Digital 1TB GP drives.


Re: [zfs-discuss] ZFS Export, Import = Windows sees wrong groups in ACLs

2009-09-13 Thread Owen Davies
Thanks.  I took a look and that is exactly what I was looking for.  Of course, I 
have since just reset all the permissions on all my shares, but it seems that 
the proper way to swap UIDs for users with permissions on CIFS shares is to:

Edit /etc/passwd
Edit /var/smb/smbpasswd

And to change GIDs for groups used on CIFS shares you need to both:

Edit /etc/group
Edit /var/smb/smbgroup.db

Is there a better way to do this than manually editing each file (or db)?  I 
don't think there is much of this sort of integration yet, so there are no tools 
that update things consistently on both the UNIX side and the CIFS side.
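
For the record, a rough sketch of the manual UID swap described above (this 
assumes the stock network/smb/server service name and that the flat files can be 
hand-edited; the group .db file probably needs smbadm or a similar tool rather 
than a text editor):

cp /etc/passwd /etc/passwd.bak
cp /var/smb/smbpasswd /var/smb/smbpasswd.bak
vi /etc/passwd                        # change the UID field for the user
vi /var/smb/smbpasswd                 # change the matching UID entry here as well
svcadm restart network/smb/server     # pick up the change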

Thanks,
Owen Davies


Re: [zfs-discuss] alternative hardware configurations for zfs

2009-09-13 Thread Jens Elkner
On Sat, Sep 12, 2009 at 02:37:35PM -0500, Tim Cook wrote:
>On Sat, Sep 12, 2009 at 10:17 AM, Damjan Perenic
...
>  I shopped for 1TB 7200rpm drives recently and I noticed Seagate
>  Barracuda ES.2 has 1TB version with SATA and SAS interface.
> 
>On the flip side, according to storage review, the SATA version trumps the
>SAS version in pretty much everything but throughput (which is still
>negligible).
>
> [5]http://www.storagereview.com/php/benchmark/suite_v4.php?typeID=10&testbedID=4&osID=6&raidconfigID=1&numDrives=1&devID_0=354&devID_1=362&devCnt=2
>--Tim

Just in case anyone is interested in SATA, perhaps this helps (measured on an
almost idle system):

elkner.sol /pool2 > uname -a
SunOS sol 5.11 snv_98 i86pc i386 i86xpv

elkner.sol /rpool >  prtdiag
System Configuration: Intel S5000PAL
BIOS Configuration: Intel Corporation S5000.86B.10.00.0091.081520081046 
08/15/2008
BMC Configuration: IPMI 2.0 (KCS: Keyboard Controller Style)

==== Processor Sockets ====================================

Version                          Location Tag
-------------------------------- --------------------------
Intel(R) Xeon(R) CPU   E5440  @ 2.83GHz CPU1
Intel(R) Xeon(R) CPU   E5440  @ 2.83GHz CPU2
...

elkner.sol /pool2 > + /usr/X11/bin/scanpci | grep -i sata
 Intel Corporation 631xESB/632xESB SATA AHCI Controller

elkner.sol ~ > iostat -E | \
awk '/^sd/ { print $1; getline; print; getline; print }'
sd0
Vendor: ATA  Product: ST3250310NS  Revision: SN05 Serial No:
Size: 250.06GB <250059350016 bytes>
sd1
Vendor: ATA  Product: ST3250310NS  Revision: SN04 Serial No:
Size: 250.06GB <250059350016 bytes>
sd2
Vendor: ATA  Product: ST3250310NS  Revision: SN04 Serial No:
Size: 250.06GB <250059350016 bytes>
sd3
Vendor: ATA  Product: ST3250310NS  Revision: SN05 Serial No:
Size: 250.06GB <250059350016 bytes>
sd5
Vendor: ATA  Product: ST31000340NS Revision: SN06 Serial No:
Size: 1000.20GB <1000204886016 bytes>
sd6
Vendor: ATA  Product: ST31000340NS Revision: SN06 Serial No:
Size: 1000.20GB <1000204886016 bytes>

elkner.sol ~ > zpool status | grep ONLINE
 state: ONLINE
pool1   ONLINE   0 0 0
  mirrorONLINE   0 0 0
c1t2d0  ONLINE   0 0 0
c1t3d0  ONLINE   0 0 0
 state: ONLINE
pool2   ONLINE   0 0 0
  mirrorONLINE   0 0 0
c1t4d0  ONLINE   0 0 0
c1t5d0  ONLINE   0 0 0
 state: ONLINE
rpool ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c1t0d0s0  ONLINE   0 0 0
c1t1d0s0  ONLINE   0 0 0


elkner.sol /pool2 > + time sh -c "mkfile 4g xx; sync; echo ST31000340NS"
ST31000340NS
real     3:55.2
user        0.0
sys         1.9

elkner.sol ~ > iostat -zmnx c1t4d0 c1t5d0 5 | grep -v device
0.0  154.2    0.0 19739.4  3.0 32.0   19.4  207.5 100 100 c1t4d0
0.0  125.8    0.0 16103.9  3.0 32.0   23.8  254.3 100 100 c1t5d0
0.0  133.0    0.0 16366.9  2.4 25.9   17.9  194.4  80  82 c1t4d0
0.0  158.0    0.0 19592.5  2.8 30.3   17.6  191.7  93  96 c1t5d0
0.0  159.4    0.0 20054.8  2.8 30.3   17.7  190.2  94  95 c1t4d0
0.0  140.2    0.0 17597.2  2.8 30.3   20.1  216.4  94  95 c1t5d0
0.0  134.8    0.0 16298.7  2.0 23.0   15.2  170.8  68  76 c1t4d0
0.0  154.4    0.0 18807.5  2.7 29.3   17.3  189.9  89  94 c1t5d0
0.0  188.4    0.0 24115.5  3.0 32.0   15.9  169.8 100 100 c1t4d0
0.0  159.8    0.0 20454.6  3.0 32.0   18.8  200.2 100 100 c1t5d0
0.0  120.0    0.0 14328.3  2.0 22.2   16.4  184.9  66  71 c1t4d0
0.0  143.2    0.0 17169.9  2.6 28.2   18.0  197.1  86  93 c1t5d0
0.0  157.0    0.0 19140.9  2.6 29.3   16.5  186.9  87  96 c1t4d0
0.0  169.2    0.0 20676.9  2.2 24.8   13.2  146.6  75  79 c1t5d0
0.0  156.2    0.0 19993.8  3.0 32.0   19.2  204.8 100 100 c1t4d0
0.0  140.4    0.0 17971.3  3.0 32.0   21.3  227.9 100 100 c1t5d0
0.0  138.8    0.0 16759.6  2.6 29.3   18.4  210.9  86  94 c1t4d0
0.0  146.6    0.0 17809.2  2.7 29.6   18.4  201.7  90  94 c1t5d0
0.0  133.8    0.0 16196.8  2.5 28.0   18.9  209.3  85  90 c1t4d0
0.0  134.0    0.0 16222.4  2.6 28.7   19.5  214.3  87  94 c1t5d0
 r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device

elkner.sol /pool1 > + time sh -c 'mkfile 4g xx; sync; echo ST3250310NS'
ST3250310NS
real     1:33.5
user        0.0
sys         2.0
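
Rough sequential-write throughput from the two timings above (4 GiB written,
wall-clock time):

echo 'scale=1; 4096/235.2' | bc    # ST31000340NS mirror: ~17.4 MB/s
echo 'scale=1; 4096/93.5' | bc     # ST3250310NS mirror:  ~43.8 MB/s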

elkner.sol ~ > iostat -zmnx c1t2d0 c1t3d0 5 | grep -v device
0.2  408.6    1.6 49336.8 25.7  0.8   62.8    1.9  79  79 c1t3d0
0.2  432.6    1.6 53284.4 29.9  0.9   69.0    2.1  89  89 c1t2d0
0.2  456.0    1.6 56280.0 28.6  0.9   62.6    1.9  86  86 c1t3d0
0.8  389.8   17.6 45360.7 25.8  0.8   66.0    2.1  81  80 c1t2d0
0.4  368.6    3.2 42698.0 21.1  0.6   57.3    1.8  65  65 c1t3d0
1.0  432.4    8.0 52615.8 30.

Re: [zfs-discuss] Transient permanent errors

2009-09-13 Thread Stuart Anderson
I have seen this again on a different server. Presumably not a big deal, but a 
false alarm about "data corruption" is probably not good marketing for ZFS. Is 
this fixed in an OpenSolaris build?
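
For reference, the rough sequence for the check below (standard Solaris 10 
commands, same pool as in the output):

zpool scrub rpool
zpool status -v rpool     # while the scrub runs, "permanent errors" are listed
zpool status -v rpool     # after the scrub completes, no known data errors
fmdump -eV | tail         # confirm whether FMA logged any real I/O errors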



# pca -l a -p ZFS
Using /var/tmp/patchdiag.xref from Sep/11/09
Host: samhome1 (SunOS 5.10/Generic_141415-10/i386/i86pc)
List: a (2/88)

Patch  IR   CR RSB Age Synopsis
------ -- - --- --- ---------------------------------------------------------
141105 02 = 02 ---  58 SunOS 5.10_x86: ZFS Administration Java Web Console Patch
141909 03 = 03 R--  30 SunOS 5.10_x86: ZFS patch



# zpool status -v rpool
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub in progress for 0h7m, 93.90% done, 0h0m to go
config:

NAME  STATE READ WRITE CKSUM
rpool ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c0t0d0s0  ONLINE   0 0 0
c0t1d0s0  ONLINE   0 0 0

errors: Permanent errors have been detected in the following files:

//dev/dsk/c0t0d0s0
//dev/dsk/c0t1d0s0


# zpool status -v rpool
  pool: rpool
 state: ONLINE
 scrub: scrub completed after 0h8m with 0 errors on Sun Sep 13 17:22:47 2009

config:

NAME  STATE READ WRITE CKSUM
rpool ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c0t0d0s0  ONLINE   0 0 0
c0t1d0s0  ONLINE   0 0 0

errors: No known data errors

Thanks.


On Jun 28, 2009, at 7:31 PM, Stuart Anderson wrote:

This is S10U7 fully patched and not OpenSolaris, but I would appreciate any 
advice on the following transient "Permanent error" message generated while 
running a zpool scrub.



--
Stuart Anderson  ander...@ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson





[zfs-discuss] Error recovery with replicated metadata

2009-09-13 Thread Henrik Johansson

Hello all,

I have managed to get my hands on an OSOL 2009.06 root disk which has three 
failed blocks on it; these three blocks make it impossible to boot from the 
disk and to import the pool on another machine. I have checked the disk and 
the three blocks are inaccessible, quite close to each other. Now, shouldn't 
this have a good chance of being saved by replicated metadata? The data on the 
disk is usable: I did a block copy of the whole disk to a new one, and a scrub 
of the copy completes flawlessly. I guess this could be a timeout issue, but 
the disk is at least a WD RE2 disk with error recovery of 7 seconds. The 
failing system's release was 111a, and I have tried to import it into 122.


The disk was used by one of my friends whom I have converted to using Solaris 
and ZFS for his company's storage needs, and he is a bit skeptical when three 
bad blocks make the whole pool unusable. The good part is that he now uses 
mirrors for his rpool even on this non-critical system ;)


Anyway, can someone help explain this? Are there any timeouts that can be tuned 
so the pool can be imported, or is this expected behavior? Obviously all the 
data that is needed is intact on the disk, since the block copy of the pool 
worked fine.
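
For completeness, a sketch of the kind of block copy mentioned above (device 
names here are placeholders; conv=noerror,sync makes dd continue past the three 
unreadable blocks and pad the failed reads with zeros):

dd if=/dev/rdsk/c1t4d0p0 of=/dev/rdsk/c1t5d0p0 bs=1024k conv=noerror,sync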


Also, don't we need a force option for zdb -e, so that we can use it with pools 
that have not been exported cleanly from a failing machine?


The import times out after 41 seconds:

r...@arne:/usr/sbin# zpool import -f 2934589927925685355 dpool
cannot import 'rpool' as 'dpool': one or more devices is currently unavailable


r...@arne:/usr/sbin# zpool import
  pool: rpool
id: 2934589927925685355
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

rpool   ONLINE
  c1t4d0s0  ONLINE

Damaged blocks as reported by format:
Medium error during read: block 8646022 (0x83ed86) (538/48/28)
ASC: 0x11   ASCQ: 0x0
Medium error during read: block 8650804 (0x840034) (538/124/22)
ASC: 0x11   ASCQ: 0x0
Medium error during read: block 8651987 (0x8404d3) (538/143/8)
ASC: 0x11   ASCQ: 0x0

What I managed to get out of zdb:

r...@arne:/usr/sbin# zdb -e 2934589927925685355
WARNING: pool '2934589927925685355' could not be loaded as it was last accessed by another system (host: keeper hostid: 0xc34967). See: http://www.sun.com/msg/ZFS-8000-EY

zdb: can't open 2934589927925685355: No such file or directory

r...@arne:/usr/sbin# zdb -l /dev/dsk/c1t4d0s0

LABEL 0

version=14
name='rpool'
state=0
txg=269696
pool_guid=2934589927925685355
hostid=12798311
hostname='keeper'
top_guid=9161928630964440615
guid=9161928630964440615
vdev_tree
type='disk'
id=0
guid=9161928630964440615
path='/dev/dsk/c7t1d0s0'
devid='id1,s...@sata_wdc_wd5000ys-01m_wd-wcanu2080316/a'
phys_path='/p...@0,0/pci8086,2...@1c,4/pci1043,8...@0/d...@1,0:a'

whole_disk=0
metaslab_array=23
metaslab_shift=32
ashift=9
asize=500067467264
is_log=0

LABEL 1

version=14
name='rpool'
state=0
txg=269696
pool_guid=2934589927925685355
hostid=12798311
hostname='keeper'
top_guid=9161928630964440615
guid=9161928630964440615
vdev_tree
type='disk'
id=0
guid=9161928630964440615
path='/dev/dsk/c7t1d0s0'
devid='id1,s...@sata_wdc_wd5000ys-01m_wd-wcanu2080316/a'
phys_path='/p...@0,0/pci8086,2...@1c,4/pci1043,8...@0/d...@1,0:a'

whole_disk=0
metaslab_array=23
metaslab_shift=32
ashift=9
asize=500067467264
is_log=0

LABEL 2

version=14
name='rpool'
state=0
txg=269696
pool_guid=2934589927925685355
hostid=12798311
hostname='keeper'
top_guid=9161928630964440615
guid=9161928630964440615
vdev_tree
type='disk'
id=0
guid=9161928630964440615
path='/dev/dsk/c7t1d0s0'
devid='id1,s...@sata_wdc_wd5000ys-01m_wd-wcanu2080316/a'
phys_path='/p...@0,0/pci8086,2...@1c,4/pci1043,8...@0/d...@1,0:a'

whole_disk=0
metaslab_array=23
metaslab_shift=32
ashift=9
asize=500067467264
is_log=0

LABEL 3

version=14
name='rpool'
state=0
txg=269696
pool_guid=2934589927925685355
hostid=12798311
hostname='keeper'
top_guid=9161928630964440615
guid=9161928630964440615
vdev_tree
t

[zfs-discuss] Strange problem, possibly zfs/zpool

2009-09-13 Thread Marc Emmerson
Hi all,
My OpenSolaris install is failing to boot, getting stuck just after the 
hostname is displayed during the boot process.  If I reinstall the OS it boots 
fine, but as soon as I import my RAID-Z array, subsequent boots fail as described.

Immediately after I perform the zpool import (before the next reboot) the array 
is mounted successfully and I can access its contents.  However, after the 
following, inevitable reboot I can no longer boot the OS.

This started happening after I updated from 118 to 121; I have not upgraded the 
pool to the version in 121, it is still at 118.

Any troubleshooting ideas or recommendations anyone?

Some background info:
My installation is extremely simple: I have a 160GB OS disk and a 10-disk 
RAID-Z array (consisting of 2 vdevs, each 5 x 1TB).

The server was installed with osol0906 a couple of months back and is only used 
for CIFS and as a COMSTAR iSCSI target; no other software has been installed.

Initially I upgraded directly from clean osol0906 to 118 and upgraded the pool 
version at that time.
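
One thing I may try next (a sketch only, assuming the default cachefile 
handling; 'tank' stands in for the array's pool name): import the array without 
recording it in /etc/zfs/zpool.cache, so the next boot does not try to import 
it automatically and I can see whether the hang really is tied to the import.

zpool import -o cachefile=none tank   # 'tank' is a placeholder pool name
(or, after a normal import: zpool set cachefile=none tank)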

Thanks,

marce


[zfs-discuss] USB WD Passport 500GB zfs mirror bug

2009-09-13 Thread Stefan Parvu
Evening,

I would like to report a problem with ZFS mirrors when using USB WD Passport 
drives. I have 2 x 500 GB WD Passport drives and I'm trying to use them as a 
zone pool where some zones are created. Everything is fine except when I try to 
disconnect a member of the mirror. Somehow the ordering of the members changes, 
and I'm not able to reconnect the drives and have them discovered by ZFS unless 
I export/import the pool again.

Details below:

0. Hdw:
Acer Ferrari 4000

1. OS:
# cat /etc/release
  Solaris Express Community Edition snv_103 X86
   Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
   Assembled 17 November 2008


2. zpool import
# zpool status zones
  pool: zones
 state: ONLINE
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
zones ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c7t0d0p0  ONLINE   0 0 0
c8t0d0p0  ONLINE   0 0 0

errors: No known data errors


3. Disconnecting one disk, say c7t0d0p0. The results look good:
# zpool status zones
  pool: zones
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
zones DEGRADED 0 0 0
  mirror  DEGRADED 0 0 0
c7t0d0p0  REMOVED  0   167 0
c8t0d0p0  ONLINE   0 0 0

errors: No known data errors


4. Reconnecting. Everything fine.
# zpool status zones
  pool: zones
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
zones ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c7t0d0p0  ONLINE   0   167 0
c8t0d0p0  ONLINE   0 0 0

errors: No known data errors


5. Disconnecting the other disk. Problems occur:
# zpool status zones
  pool: zones
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Sun Sep 13 20:58:02 2009
config:

NAME  STATE READ WRITE CKSUM
zones ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c7t0d0p0  ONLINE   0   167 0  294K resilvered
c7t0d0p0  ONLINE   0 0 0  208K resilvered

errors: No known data errors


# zpool status zones
  pool: zones
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid.  Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: resilver completed after 0h0m with 0 errors on Sun Sep 13 20:58:02 2009
config:

NAME  STATE READ WRITE CKSUM
zones DEGRADED 0 0 0
  mirror  DEGRADED 0 0 0
c7t0d0p0  ONLINE   0   167 0  294K resilvered
c7t0d0p0  FAULTED  0   113 0  corrupted data

errors: No known data errors


I disconnected c8t0d0p0, but ZFS reports that c7t0d0p0 is faulty!?
Any ideas what this is about? Is the bug related to kernel/zfs, or kernel/usb?
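
For what it's worth, one way to check which physical drive ZFS sees behind each 
name after a reconnect (a sketch; device names taken from the status output 
above):

zdb -l /dev/dsk/c7t0d0p0 | grep guid
zdb -l /dev/dsk/c8t0d0p0 | grep guid

If both paths report the same vdev guid, the USB re-enumeration has swapped the 
device names underneath the pool, which would explain the duplicate c7t0d0p0 
entries above.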

Thanks,
Stefan


Re: [zfs-discuss] Intel X25-E SSD in x4500 followup

2009-09-13 Thread Paul B. Henson
On Sun, 13 Sep 2009, Mike Gerdts wrote:

> August 11 they released firmware revisions 8820, 8850, and 02G9,
> depending on the drive model.

Ooooh, cool, last time I checked they only had updates for the X25-M.
Thanks for the pointer.


Re: [zfs-discuss] Problem with snv_122 Zpool issue

2009-09-13 Thread Hamed
Actually I did both :( I upgraded the zpool from version 14 to version 18; I 
did it manually after my OS installation.

Right now I'm copying all my data off the zpool, but the speed is very poor:
900 KB/s, where I usually get around 50 MB/s.
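
For reference, a sketch of how to watch the pool while the copy runs, to see 
whether the slowness comes from one device or the whole pool (pool and device 
names need adjusting):

zpool iostat -v 5    # per-vdev bandwidth and operations every 5 seconds
iostat -xnz 5        # per-device service times; a single slow disk stands out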


Re: [zfs-discuss] Problem with snv_122 Zpool issue

2009-09-13 Thread Richard Elling
Did you upgrade the OS, or did you also do a zpool (or zfs) upgrade? An OS 
upgrade will not do a zpool upgrade. If you did not do a zpool upgrade, then 
you should be able to boot into the previous boot environment.
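
A sketch of falling back, assuming OpenSolaris-style boot environments (the BE 
name below is a placeholder taken from 'beadm list'):

beadm list                     # find the pre-upgrade boot environment
beadm activate opensolaris-1   # placeholder name of the previous BE
init 6                         # reboot into it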

On Sep 12, 2009, at 8:16 AM, Hamed wrote:

Do you think that this is a bug? If it is a bug, that's okay for me; I can wait 
for future releases. But if this is happening only for me, then I really need 
help to solve this problem.


I'd suspect a driver change.  The summary of code changes for each release is 
available at the download center. For example,
http://dlc.sun.com/osol/on/downloads/b121/
 -- richard



Re: [zfs-discuss] Why is Solaris 10 ZFS performance so terrible?

2009-09-13 Thread Christian Kendi


Is a diff for the source already available?

On Sep 11, 2009, at 4:02 PM, Rich Morris wrote:


On 09/10/09 16:22, en...@businessgrade.com wrote:

Quoting Bob Friesenhahn :


On Thu, 10 Sep 2009, Rich Morris wrote:


On 07/28/09 17:13, Rich Morris wrote:

On Mon, Jul 20, 2009 at 7:52 PM, Bob Friesenhahn wrote:

Sun has opened internal CR 6859997.  It is now in Dispatched   
state at High priority.


CR 6859997 has recently been fixed in Nevada.  This fix will also be in 
Solaris 10 Update 9.  This fix speeds up the sequential prefetch pattern 
described in this CR without slowing down other prefetch patterns.  Some 
kstats have also been added to help improve the observability of ZFS file 
prefetching.


Excellent.  What level of read improvement are you seeing?  Is the
prefetch rate improved, or does the fix simply avoid losing the
prefetch?

Thanks,

Bob


Is this fixed in snv_122 or something else?


snv_124.   See 
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6859997






Re: [zfs-discuss] Intel X25-E SSD in x4500 followup

2009-09-13 Thread Eric Schrock


On Sep 12, 2009, at 10:49 PM, Paul B. Henson wrote:


In any case, I agree with you that the firmware is buggy; however I disagree 
with you as to the outcome of that bug. The drive is not returning random 
garbage, it has *one* byte wrong. Other than that all of the data seems ok, at 
least to my inexpert eyes. smartctl under Linux issues a warning about that 
invalid byte and reports everything else ok. Solaris on an x4500 evidently 
barfs over that invalid byte and returns garbage.


Actually, it's not one byte - the entire page is garbage (as we saw in  
the dtrace output).  But I'm guessing that smartctl (and hardware  
SATL) is aborting on the first invalid record, while we keep going and  
blindly "translate" one form of garbage into another.


Overall, I think the Linux approach seems more useful. Be strict in what you 
generate, and lenient in what you accept ;), or something like that. As I 
already said, it would be really really nice if the Solaris driver could be 
fixed to be a little more forgiving and deal better with the drive, but I've 
got no expectation that it should be done. But it could be :).


Absolutely.  The SATA code could definitely be cleaned up to bail when  
processing an invalid record.  I can file a CR for you if you haven't  
already done so.  Also, I'd encourage any developers out there with  
one of these drives to take a shot at fixing the issue via the  
OpenSolaris sponsor process.


- Eric

--
Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock





Re: [zfs-discuss] Intel X25-E SSD in x4500 followup

2009-09-13 Thread Eric Schrock


On Sep 12, 2009, at 11:14 PM, Paul B. Henson wrote:


On Sat, 12 Sep 2009, Paul B. Henson wrote:

On another note, my understanding is that the official Sun sold and supported 
SSD for the x4540 is basically just an OEM'd Intel X25-E. Did Sun install 
their own fixed firmware on their version of that drive, or does it have the 
same buggy firmware as the street version? It would be funny if you guys were 
shipping a drive with buggy firmware that just happens to work because the 
x4540 hardware doesn't trip over the one invalid byte :)...


The X4540 uses SAS, not SATA.  So the translation via SATL is done in  
hardware, not software.


- Eric

--
Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock





Re: [zfs-discuss] Intel X25-E SSD in x4500 followup

2009-09-13 Thread Mike Gerdts
On Sun, Sep 13, 2009 at 1:14 AM, Paul B. Henson  wrote:
> On Sat, 12 Sep 2009, Paul B. Henson wrote:
>
>> In any case, I agree with you that the firmware is buggy; however I
>> disagree with you as to the outcome of that bug. The drive is not
>> returning random garbage, it has *one* byte wrong. Other than that all of
>> the data seems ok, at least to my inexpert eyes. smartctl under Linux
>> issues a warning about that invalid byte and reports everything else ok.
>> Solaris on an x4500 evidentally barfs over that invalid byte and returns
>> garbage.
>
> On another note, my understanding is that the official Sun sold
> and supported SSD for the x4540 is basically just an OEM'd Intel X25-E. Did
> Sun install their own fixed firmware on their version of that drive, or
> does it have the same buggy firmware as the street version? It would be
> funny if you guys were shipping a drive with buggy firmware that just
> happens to work because the x4540 hardware doesn't trip over the one
> invalid byte :)...

Perhaps some of their fixes have made it upstream.  Your message at
http://mail.opensolaris.org/pipermail/fm-discuss/2009-June/000436.html
from June 10 suggests you are running firmware release (045C)8626.  On
August 11 they released firmware revisions 8820, 8850, and 02G9,
depending on the drive model.

http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&ProdId=3043&DwnldID=17485&lang=eng

-- 
Mike Gerdts
http://mgerdts.blogspot.com/


Re: [zfs-discuss] raidz replace issue

2009-09-13 Thread Mark J Musante


The device is listed with s0; did you try using c5t9d0s0 as the name?
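
Something along these lines (a sketch only; syntax may need adjusting):

zpool replace nfspool c5t9d0s0 c5t9d0      # the name with the s0 slice
zpool replace nfspool c5t9d0s0/o c5t9d0    # or exactly as shown in 'zpool status'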

On 12 Sep, 2009, at 17.44, Jeremy Kister wrote:


[sorry for the cross post to solarisx86]

One of the disks I had in a raidz configuration on a Sun V40z with Solaris 
10u5 died.  I took the bad disk out, replaced the disk, and issued 'zpool 
replace pool c5t9d0'.  The resilver process started, and before it was done I 
rebooted the system.


now, the raidz is all upset:

# zpool status
  pool: pool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed with 0 errors on Sat Sep 12 17:19:57 2009
config:

NAMESTATE READ WRITE CKSUM
nfspool DEGRADED 0 0 0
  raidz1ONLINE   0 0 0
c3t4d0  ONLINE   0 0 0
c5t4d0  ONLINE   0 0 0
c3t5d0  ONLINE   0 0 0
c5t5d0  ONLINE   0 0 0
  raidz1DEGRADED 0 0 0
c3t8d0  ONLINE   0 0 0
c5t8d0  ONLINE   0 0 0
c3t9d0  ONLINE   0 0 0
c5t9d0s0/o  UNAVAIL  0 0 0  cannot open
  raidz1ONLINE   0 0 0
c3t10d0 ONLINE   0 0 0
c5t10d0 ONLINE   0 0 0
c3t11d0 ONLINE   0 0 0
c5t11d0 ONLINE   0 0 0
spares
  c3t15d0   AVAIL
  c3t14d0   AVAIL
  c5t14d0   AVAIL

# zpool replace nfspool c5t9d0 c5t9d0
cannot replace c5t9d0 with c5t9d0: no such device in pool
# suex zpool replace nfspool c5t9d0 c5t14d0
cannot replace c5t9d0 with c5t14d0: no such device in pool


Any clues on what to do here ?

--

Jeremy Kister
http://jeremy.kister.net./








Regards,
markm

