Re: [zfs-discuss] can you recover a pool if you lose the zil (b134+)

2010-05-17 Thread Richard Skelton
Hi Geoff,
I also tested a RAM disk as a ZIL and found I could recover the pool:-
ramdiskadm -a zil 1g                      # create a 1GB ramdisk
zpool create -f tank c1t3d0 c1t4d0 log /dev/ramdisk/zil
zpool status tank                         # pool ONLINE, slog present

reboot                                    # the ramdisk does not survive this

zpool status tank                         # log device now shows as missing
ramdiskadm -a zil 1g                      # recreate the ramdisk
zpool replace -f tank /dev/ramdisk/zil    # swap it in for the lost slog
zpool status tank                         # pool healthy again
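
If I remember right, pool version 19 (snv_125) and later can also drop a
missing slog outright instead of replacing it; an untested sketch, using the
same pool and device names as above:

# Untested: assumes pool version >= 19, where log device removal is supported.
zpool remove tank /dev/ramdisk/zil   # detach the lost log device
zpool status tank                    # pool should be back to ONLINE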


Cheers
Richard.


Re: [zfs-discuss] ZFS Performance on SATA Drive

2010-04-19 Thread Richard Skelton
 On 18/03/10 08:36 PM, Kashif Mumtaz wrote:
  Hi,
  I did another test on both machines, and write
  performance on ZFS is extraordinarily slow.
 
 Which build are you running?
 
 On snv_134, 2x dual-core CPUs @ 3GHz and 8GB RAM
 (my desktop), I see these results:
 
 $ time dd if=/dev/zero of=test.dbf bs=8k count=1048576
 1048576+0 records in
 1048576+0 records out
 
 real  0m28.224s
 user  0m0.490s
 sys   0m19.061s
 
 This is a dataset on a straight mirrored pool,
 using two SATA2 drives (320GB Seagate).
On my Ultra24 with two mirrored 1TB WD drives, 8GB of memory, and snv_125,
I only get:-
rich: ptime dd if=/dev/zero of=test.dbf bs=8k count=1048576
1048576+0 records in
1048576+0 records out

real     1:44.352133699
user        0.444280089
sys        13.526079085
rich: uname -a
SunOS ultra24 5.11 snv_125 i86pc i386 i86pc
rich: zpool status tank
  pool: tank
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
pool will no longer be accessible on older software versions.
 scrub: scrub completed after 0h30m with 0 errors on Mon Apr 19 02:36:08 2010
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0

errors: No known data errors
rich: iostat -En c1t3d0
c1t3d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD1001FALS-0 Revision: 0K05 Serial No:
Size: 1000.20GB <1000204886016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 4264 Predictive Failure Analysis: 0

rich: psrinfo -v
Status of virtual processor 0 as of: 04/19/2010 14:23:42
  on-line since 12/16/2009 21:56:59.
  The i386 processor operates at 3000 MHz,
and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 04/19/2010 14:23:42
  on-line since 12/16/2009 21:57:03.
  The i386 processor operates at 3000 MHz,
and has an i387 compatible floating point processor.
Status of virtual processor 2 as of: 04/19/2010 14:23:42
  on-line since 12/16/2009 21:57:03.
  The i386 processor operates at 3000 MHz,
and has an i387 compatible floating point processor.
Status of virtual processor 3 as of: 04/19/2010 14:23:42
  on-line since 12/16/2009 21:57:03.
  The i386 processor operates at 3000 MHz,
and has an i387 compatible floating point processor.



Why are my drives so slow?
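
A couple of things that might help narrow this down while the dd is running
(standard tools on snv_125; I have not captured this output yet, so treat it
as a guess at next steps rather than a result):

# In a second terminal, per-vdev bandwidth and IOPS, 5-second samples:
zpool iostat -v tank 5
# Per-device service times and %busy:
iostat -xn 5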



 
 $ time dd if=test.dbf bs=8k of=/dev/null
 1048576+0 records in
 1048576+0 records out
 
 real  0m5.749s
 user  0m0.458s
 sys   0m5.260s
 
 
 James C. McPherson
 --
 Senior Software Engineer, Solaris
 Sun Microsystems
 http://www.jmcp.homeunix.com/blog



Re: [zfs-discuss] ZIL errors but device seems OK

2010-04-15 Thread Richard Skelton
Hi,
After a little bit more digging I found in /var/adm/messages:-
Mar 25 13:13:08 brszfs02 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1 (ata1):
Mar 25 13:13:08 brszfs02        timeout: early timeout, target=1 lun=0
Mar 25 13:13:08 brszfs02 gda: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1/c...@1,0 (Disk1):
Mar 25 13:13:08 brszfs02        Error for command 'write sector'    Error Level: Informational
Mar 25 13:13:08 brszfs02 gda: [ID 107833 kern.notice]   Sense Key: aborted command
Mar 25 13:13:08 brszfs02 gda: [ID 107833 kern.notice]   Vendor 'Gen-ATA ' error code: 0x3
Mar 25 13:13:43 brszfs02 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1 (ata1):
Mar 25 13:13:43 brszfs02        timeout: early timeout, target=1 lun=0
Mar 25 13:13:43 brszfs02 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1 (ata1):
Mar 25 13:13:43 brszfs02        timeout: early timeout, target=1 lun=0
Mar 25 13:13:43 brszfs02 gda: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1/c...@1,0 (Disk1):
Mar 25 13:13:43 brszfs02        Error for command 'read sector'     Error Level: Informational
Mar 25 13:13:43 brszfs02 gda: [ID 107833 kern.notice]   Sense Key: aborted command
Mar 25 13:13:43 brszfs02 gda: [ID 107833 kern.notice]   Vendor 'Gen-ATA ' error code: 0x3
Mar 25 13:13:43 brszfs02 gda: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1/c...@1,0 (Disk1):
Mar 25 13:13:43 brszfs02        Error for command 'read sector'     Error Level: Informational
Mar 25 13:13:43 brszfs02 gda: [ID 107833 kern.notice]   Sense Key: aborted command
Mar 25 13:13:43 brszfs02 gda: [ID 107833 kern.notice]   Vendor 'Gen-ATA ' error code: 0x3
Mar 25 13:14:18 brszfs02 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1 (ata1):
Mar 25 13:14:18 brszfs02        timeout: early timeout, target=1 lun=0
Mar 25 13:14:18 brszfs02 gda: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1/c...@1,0 (Disk1):
Mar 25 13:14:18 brszfs02        Error for command 'read sector'     Error Level: Informational
Mar 25 13:14:18 brszfs02 gda: [ID 107833 kern.notice]   Sense Key: aborted command
Mar 25 13:14:18 brszfs02 gda: [ID 107833 kern.notice]   Vendor 'Gen-ATA ' error code: 0x3
Mar 25 13:14:33 brszfs02 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1 (ata1):
Mar 25 13:14:33 brszfs02        timeout: abort request, target=0 lun=0
Mar 25 13:14:33 brszfs02 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1 (ata1):
Mar 25 13:14:33 brszfs02        timeout: abort device, target=0 lun=0
Mar 25 13:14:33 brszfs02 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1 (ata1):
Mar 25 13:14:33 brszfs02        timeout: reset target, target=0 lun=0
Mar 25 13:14:33 brszfs02 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci-...@1f,2/i...@1 (ata1):
Mar 25 13:14:33 brszfs02        timeout: reset bus, target=0 lun=0
Mar 25 13:14:34 brszfs02 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
Mar 25 13:14:34 brszfs02 EVENT-TIME: Thu Mar 25 13:14:34 GMT 2010
Mar 25 13:14:34 brszfs02 PLATFORM: HP-Compaq-dc7700-Convertible-Minitower, CSN: CZC7264JN4, HOSTNAME: brszfs02
Mar 25 13:14:34 brszfs02 SOURCE: zfs-diagnosis, REV: 1.0
Mar 25 13:14:34 brszfs02 EVENT-ID: 6c0bd163-56bf-ee92-e393-ce2063355b52
Mar 25 13:14:34 brszfs02 DESC: The number of I/O errors associated with a ZFS device exceeded acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more information.
Mar 25 13:14:34 brszfs02 AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt will be made to activate a hot spare if available.
Mar 25 13:14:34 brszfs02 IMPACT: Fault tolerance of the pool may be compromised.
Mar 25 13:14:34 brszfs02 REC-ACTION: Run 'zpool status -x' and replace the bad device.




If I remember correctly, I was thrashing this pool with Bonnie++ at the time.
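
For anyone who wants to dig further, the telemetry behind that diagnosis
should still be retrievable with the standard fmd tools (the UUID is the
EVENT-ID from the log above):

fmdump -v -u 6c0bd163-56bf-ee92-e393-ce2063355b52   # details for this fault
fmdump -eV                                          # raw error reports, verbose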

Cheers
Richard.


[zfs-discuss] ZIL errors but device seems OK

2010-04-14 Thread Richard Skelton
Hi,
I have installed OpenSolaris snv_134 from the ISO at genunix.org
("Mon Mar 8 2010 New OpenSolaris preview, based on build 134").
I created a zpool:-
        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          c7t4d0    ONLINE       0     0     0
          c7t5d0    ONLINE       0     0     0
          c7t6d0    ONLINE       0     0     0
          c7t8d0    ONLINE       0     0     0
          c7t9d0    ONLINE       0     0     0
        logs
          c5d1p1    ONLINE       0     0     0
        cache
          c5d1p2    ONLINE       0     0     0

The log device and cache are each one half of a 128GB OCZ VERTEX-TURBO flash
card.
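
For reference, a pool with this layout could have been created along these
lines (reconstructed from the status output above; I did not keep the exact
command):

zpool create tank c7t4d0 c7t5d0 c7t6d0 c7t8d0 c7t9d0 \
    log c5d1p1 cache c5d1p2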

I am getting good NFS performance but have seen this error:-
r...@brszfs02:~# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
repaired.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          c7t4d0    ONLINE       0     0     0
          c7t5d0    ONLINE       0     0     0
          c7t6d0    ONLINE       0     0     0
          c7t8d0    ONLINE       0     0     0
          c7t9d0    ONLINE       0     0     0
        logs
          c5d1p1    FAULTED      0     4     0  too many errors
        cache
          c5d1p2    ONLINE       0     0     0

errors: No known data errors

r...@brszfs02:~# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Mar 25 13:14:34 6c0bd163-56bf-ee92-e393-ce2063355b52  ZFS-8000-FD    Major

Host        : brszfs02
Platform    : HP-Compaq-dc7700-Convertible-Minitower
Chassis_id  : CZC7264JN4
Product_sn  :

Fault class : fault.fs.zfs.vdev.io
Affects     : zfs://pool=tank/vdev=4ec464b5bf74a898
                  faulted but still in service
Problem in  : zfs://pool=tank/vdev=4ec464b5bf74a898
                  faulted but still in service

Description : The number of I/O errors associated with a ZFS device exceeded
              acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD
              for more information.

Response    : The device has been offlined and marked as faulted.  An attempt
              will be made to activate a hot spare if available.

Impact      : Fault tolerance of the pool may be compromised.

Action      : Run 'zpool status -x' and replace the bad device.

r...@brszfs02:~# iostat -En c5d1
c5d1             Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: OCZ VERTEX-TURB Revision:  Serial No: 062F97G71C5T676
Size: 128.04GB <128035160064 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0


As iostat reports no hardware errors, I ran 'zpool clear tank' and a scrub on
Monday. Up to now I have seen no new errors, and I have set up a cron job to
scrub at 01:30 each day (entry below).
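
The cron entry itself is nothing special; roughly this, in root's crontab:

# Daily scrub of 'tank' at 01:30
30 1 * * * /usr/sbin/zpool scrub tank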

Is the flash card faulty or is this a ZFS problem?

Cheers
Richard