Re: [zfs-discuss] trouble adding log and cache on SSD to a pool

2011-08-05 Thread Eugen Leitl
On Thu, Aug 04, 2011 at 11:58:47PM +0200, Eugen Leitl wrote:
 On Thu, Aug 04, 2011 at 02:43:30PM -0700, Larry Liu wrote:
 
  root@nexenta:/export/home/eugen# zpool add tank log /dev/dsk/c3d1p0
 
  You should use c3d1s0 here.
 
  root@nexenta:/export/home/eugen# zpool add tank cache /dev/dsk/c3d1p1
 
  Use c3d1s1.
 
 Thanks, that did the trick!
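 
 For the archives, that simply means:
 
 root@nexenta:/export/home/eugen# zpool add tank log /dev/dsk/c3d1s0
 root@nexenta:/export/home/eugen# zpool add tank cache /dev/dsk/c3d1s1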
 
 root@nexenta:/export/home/eugen# zpool status tank
   pool: tank
  state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Fri Aug  5 03:04:57 2011
 config:
 
 NAME        STATE     READ WRITE CKSUM
 tank        ONLINE       0     0     0
   raidz2-0  ONLINE       0     0     0
     c0t0d0  ONLINE       0     0     0
     c0t1d0  ONLINE       0     0     0
     c0t2d0  ONLINE       0     0     0
     c0t3d0  ONLINE       0     0     0
 logs
   c3d1s0    ONLINE       0     0     0
 cache
   c3d1s1    ONLINE       0     0     0
 
 errors: No known data errors

Hmm, they don't seem to last more than a couple of hours
under test load (mapped as a CIFS share receiving a
bittorrent download with 10k small files in it at
about 10 MByte/s) before falling out of the pool:

root@nexenta:/export/home/eugen# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
repaired.
 scan: none requested
config:

NAME        STATE     READ WRITE CKSUM
tank        DEGRADED     0     0     0
  raidz2-0  ONLINE       0     0     0
    c0t0d0  ONLINE       0     0     0
    c0t1d0  ONLINE       0     0     0
    c0t2d0  ONLINE       0     0     0
    c0t3d0  ONLINE       0     0     0
logs
  c3d1s0    FAULTED      0     4     0  too many errors
cache
  c3d1s1    FAULTED     13 7.68K     0  too many errors

errors: No known data errors

dmesg sez

Aug  5 05:53:26 nexenta EVENT-TIME: Fri Aug  5 05:53:26 CEST 2011
Aug  5 05:53:26 nexenta PLATFORM: ProLiant-MicroServer, CSN: CN7051P024, 
HOSTNAME: nexenta
Aug  5 05:53:26 nexenta SOURCE: zfs-diagnosis, REV: 1.0
Aug  5 05:53:26 nexenta EVENT-ID: 516e9c7c-9e29-c504-a422-db37838fa676
Aug  5 05:53:26 nexenta DESC: A ZFS device failed.  Refer to 
http://sun.com/msg/ZFS-8000-D3 for more information.
Aug  5 05:53:26 nexenta AUTO-RESPONSE: No automated response will occur.
Aug  5 05:53:26 nexenta IMPACT: Fault tolerance of the pool may be compromised.
Aug  5 05:53:26 nexenta REC-ACTION: Run 'zpool status -x' and replace the bad 
device.
Aug  5 05:53:39 nexenta fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, 
TYPE: Fault, VER: 1, SEVERITY: Major
Aug  5 05:53:39 nexenta EVENT-TIME: Fri Aug  5 05:53:39 CEST 2011
Aug  5 05:53:39 nexenta PLATFORM: ProLiant-MicroServer, CSN: CN7051P024, 
HOSTNAME: nexenta
Aug  5 05:53:39 nexenta SOURCE: zfs-diagnosis, REV: 1.0
Aug  5 05:53:39 nexenta EVENT-ID: 3319749a-b6f7-c305-ec86-d94897dde85b
Aug  5 05:53:39 nexenta DESC: The number of I/O errors associated with a ZFS 
device exceeded
Aug  5 05:53:39 nexenta  acceptable levels.  Refer to 
http://sun.com/msg/ZFS-8000-FD for more information.
Aug  5 05:53:39 nexenta AUTO-RESPONSE: The device has been offlined and marked 
as faulted.  An attempt
Aug  5 05:53:39 nexenta  will be made to activate a hot spare if 
available.
Aug  5 05:53:39 nexenta IMPACT: Fault tolerance of the pool may be compromised.
Aug  5 05:53:39 nexenta REC-ACTION: Run 'zpool status -x' and replace the bad 
device.

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Kernel panic on zpool import. 200G of data inaccessible! assertion failed: zvol_get_stats(os, nv) == 0

2011-08-05 Thread Stu Whitefish
System: snv_151a 64 bit on Intel.
Error: panic[cpu0] assertion failed: zvol_get_stats(os, nv) == 0,
file: ../../common/fs/zfs/zfs_ioctl.c, line: 1815

Failure first seen on Solaris 10, update 8

History:

I recently received two 320G drives and realized from reading this list it
would have been better if I would have done the install on the small drives
but I didn't have them at the time. I added the two 320G drives and created
tank mirror.

I moved some data from other sources to the tank and then decided to go
ahead and do a new install. In preparation for that I moved all the data I
wanted to save onto the rpool mirror and then installed Solaris 10 update 8
again on the 320G drives.

When my system rebooted after the installation, I saw for some reason it
used my tank pool as root. I realize now since it was originally a root pool
and had boot blocks this didn't help. Anyway I shut down, changed the boot
order and then booted into my system. It paniced when trying to access the
tank and instantly rebooted. I had to go through this several times until I
caught a glimpse of one of the first messages:

assertion failed: zvol_get_stats(os, nv)

Here is what my system looks like when I boot into failsafe mode.

# zpool import
pool: rpool
id: 16453600103421700325
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

rpool ONLINE
mirror ONLINE
c0t2d0s0 ONLINE
c0t3d0s0 ONLINE

pool: tank
id: 12861119534757646169
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

tank ONLINE
mirror ONLINE
c0t0d0s0 ONLINE
c0t1d0s0 ONLINE

# zpool import tank
cannot import 'tank': pool may be in use from other system
use '-f' to import anyway

I installed Solaris 11 Express USB via Hiroshi-san's Windows tool. Unfortunately 
it also panics trying to import the pool although zpool import shows the pool 
online with no errors just like in the above doc.

http://imageshack.us/photo/my-images/13/zfsimportfail.jpg/

and here is an eerily identical photo capture made by somebody with a 
similar/identical error. http://prestonconnors.com/zvol_get_stats.jpg

At first I thought it was a copy of my screenshot but I see his terminal is 
white and mine is black.

Looks like the problem has been around since 2009 although my problem is with 
a newly created mirror pool that had plenty of space available (200G in use 
out of about 500G) and no snapshots were taken.

Similar discussion with discouraging lack of follow up:
http://opensolaris.org/jive/message.jspa?messageID=376366

Looks like the defect; it's closed and I see no resolution.

https://defect.opensolaris.org/bz/show_bug.cgi?id=5682

I have about 200G of data on the tank pool, about 100G or so I don't have
anywhere else. I created this pool specifically to make a safe place to
store data that I had accumulated over several years and didn't have
organized yet. I can't believe such a serious bug has been around for two years
and hasn't been fixed. Can somebody please help me get this data back?

Thank you.

Jim 


I joined the forums but I didn't see my post on the zfs-discuss mailing list, which
seems a lot more active than the forum. Sorry if this is a duplicate for people 
on the mailing list.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] trouble adding log and cache on SSD to a pool

2011-08-05 Thread Eugen Leitl

I think I've found the source of my problem: I need to reflash
the N36L BIOS to a hacked russian version (sic) which allows
AHCI in the 5th drive bay

http://terabyt.es/2011/07/02/nas-build-guide-hp-n36l-microserver-with-nexenta-napp-it/

...

Update BIOS and install hacked Russian BIOS

The HP BIOS for N36L does not support anything but legacy IDE emulation on the 
internal ODD SATA port and the external eSATA port. This is a problem for 
Nexenta which can detect false disk errors when using the ODD drive on emulated 
IDE mode. Luckily an unknown Russian hacker somewhere has modified the BIOS to 
allow AHCI mode on both the internal and eSATA ports. I have always said, “Give 
the Russians two weeks and they will crack anything” and usually that has held 
true. Huge thank you to whoever modified this BIOS, given HP's complete 
failure to do so.

I have enabled this with good results. The main one being no emails from 
Nexenta informing you that the syspool has moved to a degraded state when it 
actually hasn’t :) 

...

On Fri, Aug 05, 2011 at 09:05:07AM +0200, Eugen Leitl wrote:
 On Thu, Aug 04, 2011 at 11:58:47PM +0200, Eugen Leitl wrote:
  On Thu, Aug 04, 2011 at 02:43:30PM -0700, Larry Liu wrote:
  
   root@nexenta:/export/home/eugen# zpool add tank log /dev/dsk/c3d1p0
  
   You should use c3d1s0 here.
  
   root@nexenta:/export/home/eugen# zpool add tank cache /dev/dsk/c3d1p1
  
   Use c3d1s1.
  
  Thanks, that did the trick!
  
  root@nexenta:/export/home/eugen# zpool status tank
pool: tank
   state: ONLINE
   scan: scrub repaired 0 in 0h0m with 0 errors on Fri Aug  5 03:04:57 2011
  config:
  
  NAME        STATE     READ WRITE CKSUM
  tank        ONLINE       0     0     0
    raidz2-0  ONLINE       0     0     0
      c0t0d0  ONLINE       0     0     0
      c0t1d0  ONLINE       0     0     0
      c0t2d0  ONLINE       0     0     0
      c0t3d0  ONLINE       0     0     0
  logs
    c3d1s0    ONLINE       0     0     0
  cache
    c3d1s1    ONLINE       0     0     0
  
  errors: No known data errors
 
 Hmm, it doesn't seem to last more than a couple hours
 under test load (mapped as a CIFS share receiving a
 bittorrent download with 10 k small files in it at
 about 10 MByte/s) before falling from the pool:
 
 root@nexenta:/export/home/eugen# zpool status tank
   pool: tank
  state: DEGRADED
 status: One or more devices are faulted in response to persistent errors.
 Sufficient replicas exist for the pool to continue functioning in a
 degraded state.
 action: Replace the faulted device, or use 'zpool clear' to mark the device
 repaired.
  scan: none requested
 config:
 
 NAME        STATE     READ WRITE CKSUM
 tank        DEGRADED     0     0     0
   raidz2-0  ONLINE       0     0     0
     c0t0d0  ONLINE       0     0     0
     c0t1d0  ONLINE       0     0     0
     c0t2d0  ONLINE       0     0     0
     c0t3d0  ONLINE       0     0     0
 logs
   c3d1s0    FAULTED      0     4     0  too many errors
 cache
   c3d1s1    FAULTED     13 7.68K     0  too many errors
 
 errors: No known data errors
 
 dmesg sez
 
 Aug  5 05:53:26 nexenta EVENT-TIME: Fri Aug  5 05:53:26 CEST 2011
 Aug  5 05:53:26 nexenta PLATFORM: ProLiant-MicroServer, CSN: CN7051P024, 
 HOSTNAME: nexenta
 Aug  5 05:53:26 nexenta SOURCE: zfs-diagnosis, REV: 1.0
 Aug  5 05:53:26 nexenta EVENT-ID: 516e9c7c-9e29-c504-a422-db37838fa676
 Aug  5 05:53:26 nexenta DESC: A ZFS device failed.  Refer to 
 http://sun.com/msg/ZFS-8000-D3 for more information.
 Aug  5 05:53:26 nexenta AUTO-RESPONSE: No automated response will occur.
 Aug  5 05:53:26 nexenta IMPACT: Fault tolerance of the pool may be 
 compromised.
 Aug  5 05:53:26 nexenta REC-ACTION: Run 'zpool status -x' and replace the bad 
 device.
 Aug  5 05:53:39 nexenta fmd: [ID 377184 daemon.error] SUNW-MSG-ID: 
 ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
 Aug  5 05:53:39 nexenta EVENT-TIME: Fri Aug  5 05:53:39 CEST 2011
 Aug  5 05:53:39 nexenta PLATFORM: ProLiant-MicroServer, CSN: CN7051P024, 
 HOSTNAME: nexenta
 Aug  5 05:53:39 nexenta SOURCE: zfs-diagnosis, REV: 1.0
 Aug  5 05:53:39 nexenta EVENT-ID: 3319749a-b6f7-c305-ec86-d94897dde85b
 Aug  5 05:53:39 nexenta DESC: The number of I/O errors associated with a ZFS 
 device exceeded
 Aug  5 05:53:39 nexenta  acceptable levels.  Refer to 
 http://sun.com/msg/ZFS-8000-FD for more information.
 Aug  5 05:53:39 nexenta AUTO-RESPONSE: The device has been offlined and 
 marked as faulted.  An attempt
 Aug  5 05:53:39 nexenta  will be made to activate a hot spare if 
 available.
 Aug  5 05:53:39 nexenta IMPACT: Fault tolerance of the pool may be 
 compromised.
 Aug  5 05:53:39 nexenta REC-ACTION: Run 'zpool status -x' and replace the bad 
 device.
 
 -- 
 Eugen* Leitl <a href="http://leitl.org">leitl</a> 

[zfs-discuss] Disable ZIL - persistent

2011-08-05 Thread Edward Ned Harvey
After a certain rev, I know you can set the sync property, and it takes
effect immediately, and it's persistent across reboots.  But that doesn't
apply to Solaris 10.

 

My question:  Is there any way to make Disabled ZIL a normal mode of
operations in solaris 10?  Particularly:  

 

If I do this echo zil_disable/W0t1 | mdb -kw then I have to remount the
filesystem.  It's kind of difficult to do this automatically at boot time,
and impossible (as far as I know) for rpool.  The only solution I see is to
write some startup script which applies it to filesystems other than rpool.
Which feels kludgy.  Is there a better way?
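
(For concreteness, the kludge I have in mind is roughly the following -- the
dataset names are made up, and rpool is deliberately skipped:)

#!/sbin/sh
# hypothetical /etc/rc3.d/S99nozil: disable the ZIL, then remount the
# non-rpool filesystems so the change actually takes effect
echo zil_disable/W0t1 | mdb -kw
for fs in tank/export tank/home ; do    # made-up dataset names
        /usr/sbin/zfs umount $fs && /usr/sbin/zfs mount $fs
done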

 

Thanks...

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disable ZIL - persistent

2011-08-05 Thread Darren J Moffat

On 08/05/11 13:11, Edward Ned Harvey wrote:

After a certain rev, I know you can set the sync property, and it
takes effect immediately, and it's persistent across reboots. But that
doesn't apply to Solaris 10.

My question: Is there any way to make Disabled ZIL a normal mode of
operations in solaris 10? Particularly:

If I do this echo zil_disable/W0t1 | mdb -kw then I have to remount
the filesystem. It's kind of difficult to do this automatically at boot
time, and impossible (as far as I know) for rpool. The only solution I
see is to write some startup script which applies it to filesystems
other than rpool. Which feels kludgy. Is there a better way?


echo set zfs:zil_disable = 1 > /etc/system
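
That is, add that line to /etc/system and reboot.  After the reboot you can
sanity-check that the tunable took effect with something like:

  # echo zil_disable/D | mdb -k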

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disable ZIL - persistent

2011-08-05 Thread Tomas Ögren
On 05 August, 2011 - Darren J Moffat sent me these 0,9K bytes:

 On 08/05/11 13:11, Edward Ned Harvey wrote:
 After a certain rev, I know you can set the sync property, and it
 takes effect immediately, and it's persistent across reboots. But that
 doesn't apply to Solaris 10.

 My question: Is there any way to make Disabled ZIL a normal mode of
 operations in solaris 10? Particularly:

 If I do this echo zil_disable/W0t1 | mdb -kw then I have to remount
 the filesystem. It's kind of difficult to do this automatically at boot
 time, and impossible (as far as I know) for rpool. The only solution I
 see is to write some startup script which applies it to filesystems
 other than rpool. Which feels kludgy. Is there a better way?

 echo set zfs:zil_disable = 1 > /etc/system

Or use >> if you don't want to zap /etc/system..

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disable ZIL - persistent

2011-08-05 Thread Michael Sullivan
On 5 Aug 11, at 08:14 , Darren J Moffat wrote:

 On 08/05/11 13:11, Edward Ned Harvey wrote:
 
 My question: Is there any way to make Disabled ZIL a normal mode of
 operations in solaris 10? Particularly:
 
 If I do this echo zil_disable/W0t1 | mdb -kw then I have to remount
 the filesystem. It's kind of difficult to do this automatically at boot
 time, and impossible (as far as I know) for rpool. The only solution I
 see is to write some startup script which applies it to filesystems
 other than rpool. Which feels kludgy. Is there a better way?
 
  echo set zfs:zil_disable = 1 > /etc/system

echo set zfs:zil_disable = 1 >> /etc/system

Mike

---
Michael Sullivan   
m...@axsh.us
http://www.axsh.us/
Phone: +1-662-259-
Mobile: +1-662-202-7716


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Wrong rpool used after reinstall!

2011-08-05 Thread Brian Wilson



On 8/3/2011 5:47 PM, Ian Collins wrote:

 On 08/ 4/11 01:29 AM, Stuart James Whitefish wrote:
I have Solaris on Sparc boxes available if it would help to do a net 
install
or jumpstart. I have never done those and it looks complicated, 
although I
think I may be able to get to the point in the u9 installer on my 
Intel box where it asks me whether I want to install from DVD etc. 
But I may be wrong, and anyway the single user shell in the u9 DVD 
also panics when I try to import tank so maybe that won't help.


Put your old drive in a USB enclosure and connect it to another system 
in order to read back the data.



I'm curious - would it work to boot from a live CD, go to shell, and 
deport/import/rename the old rpool, then boot normally?
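
Concretely I'm picturing the standard import-with-rename form -- something 
like this, untested, with the numeric id of whichever pool is the stale root 
pool:

# zpool import -f <id-of-old-rpool> rpool_old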


I have only 4 sata ports on this Intel box so I have to keep pulling 
cables

to be able to boot from a DVD and then I won't have all my drives
available. I cannot move these drives to any other box because they are
consumer drives and my servers all have ultras.


Most modern boards will boot from a live USB stick.



--


---
Brian Wilson, Solaris SE, UW-Madison DoIT
Room 3114 CSS    608-263-8047
bfwilson(a)doit.wisc.edu
'I try to save a life a day. Usually it's my own.' - John Crichton
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Exapnd ZFS storage.

2011-08-05 Thread Nix
Thanks Guys... :-)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Wrong rpool used after reinstall!

2011-08-05 Thread Stuart James Whitefish
Jim wrote:

 But I may be wrong, and anyway the single user shell in the u9 DVD also 
 panics when I try to import tank so maybe that won't help.

Ian wrote:

 Put your old drive in a USB enclosure and connect it
 to another system in order to read back the data.

Given that update 9 can't import the pool, is this really worth trying?
I would have to buy the enclosures; if I had them already I would have tried it 
in desperation.

Jim wrote:

  I have only 4 sata ports on this Intel box so I have to keep pulling cables 
  to be 
  able to boot from a DVD and then I won't have all my drives available. I 
  cannot 
  move these drives to any other box because they are consumer drives and my 
  servers all have ultras.

Ian wrote:

 Most modern boards will boot from a live USB stick.

True, but I haven't found a way to get an ISO onto a USB stick that my system can 
boot from. I was using dd to copy the ISO to the USB drive. Is there some other 
way?

This is really frustrating. I haven't had any problems with Linux filesystems 
but I heard ZFS was safer. It's really ironic that I lost access to so much 
data after moving it to ZFS. Isn't there any way to get them back on my newly 
installed U8 system? If I disconnect this pool the system starts fine. 
Otherwise my questions above in my summary post might be key to getting this 
working.

Thanks,
Jim
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Kernel panic on zpool import. 200G of data inaccessible!

2011-08-05 Thread Stuart James Whitefish
I am opening a new thread since I found somebody else reported a similar 
failure in May and I didn't see a resolution; hopefully this post will be easier 
to find for people with similar problems. The original thread was 
http://opensolaris.org/jive/thread.jspa?threadID=140861

System: snv_151a 64 bit on Intel.
Error: panic[cpu0] assertion failed: zvol_get_stats(os, nv) == 0,
file: ../../common/fs/zfs/zfs_ioctl.c, line: 1815

Failure first seen on Solaris 10, update 8

History:

I recently received two 320G drives and realized from reading this list it
would have been better if I would have done the install on the small drives
but I didn't have them at the time. I added the two 320G drives and created
tank mirror.

I moved some data from other sources to the tank and then decided to go
ahead and do a new install. In preparation for that I moved all the data I
wanted to save onto the rpool mirror and then installed Solaris 10 update 8
again on the 320G drives.

When my system rebooted after the installation, I saw for some reason it
used my tank pool as root. I realize now since it was originally a root pool
and had boot blocks this didn't help. Anyway I shut down, changed the boot
order and then booted into my system. It panicked when trying to access the
tank and instantly rebooted. I had to go through this several times until I
caught a glimpse of one of the first messages:

assertion failed: zvol_get_stats(os, nv)

Here is what my system looks like when I boot into failsafe mode.

# zpool import
pool: rpool
id: 16453600103421700325
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

rpool ONLINE
mirror ONLINE
c0t2d0s0 ONLINE
c0t3d0s0 ONLINE

pool: tank
id: 12861119534757646169
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

tank ONLINE
mirror ONLINE
c0t0d0s0 ONLINE
c0t1d0s0 ONLINE

# zpool import tank
cannot import 'tank': pool may be in use from other system
use '-f' to import anyway

Here is a photo of my screen (hah hah, old-fashioned screen shot) when Sol 11 
starts; now that I have tried importing my pool, it fails constantly.

# zpool import -f tank

http://imageshack.us/photo/my-images/13/zfsimportfail.jpg/

I installed Solaris 11 Express USB via Hiroshi-san's Windows tool. 
Unfortunately it also panics trying to import the pool although zpool import 
shows the pool online with no errors just like in the above doc.

and here is an eerily identical photo capture made by somebody with a 
similar/identical error. http://prestonconnors.com/zvol_get_stats.jpg

At first I thought it was a copy of my screenshot but I see his terminal is 
white and mine is black.

Looks like the problem has been around since 2009 although my problem is with a 
newly created mirror pool that had plenty of space available (200G in use out 
of about 500G) and no snapshots were taken.

Similar discussion with discouraging lack of follow up:
http://opensolaris.org/jive/message.jspa?messageID=376366

Looks like the defect; it's closed and I see no resolution.

https://defect.opensolaris.org/bz/show_bug.cgi?id=5682

I have about 200G of data on the tank pool, about 100G or so I don't have
anywhere else. I created this pool specifically to make a safe place to
store data that I had accumulated over several years and didn't have
organized yet. I can't believe such a serious bug has been around for two years 
and hasn't been fixed. Can somebody please help me get this data back?

Thank you.

Jim
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Wrong rpool used after reinstall!

2011-08-05 Thread Stuart James Whitefish
I'm opening a new thread since the original subject was not as helpful and I 
saw a similar problem mentioned in May of this year (2011) and others going 
back to 2009. New thread is found at 
http://opensolaris.org/jive/thread.jspa?threadID=140899
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disable ZIL - persistent

2011-08-05 Thread Richard Elling
On Aug 5, 2011, at 6:14 AM, Darren J Moffat darr...@opensolaris.org wrote:

 On 08/05/11 13:11, Edward Ned Harvey wrote:
 After a certain rev, I know you can set the sync property, and it
 takes effect immediately, and it's persistent across reboots. But that
 doesn't apply to Solaris 10.
 
 My question: Is there any way to make Disabled ZIL a normal mode of
 operations in solaris 10? Particularly:
 
 If I do this echo zil_disable/W0t1 | mdb -kw then I have to remount
 the filesystem. It's kind of difficult to do this automatically at boot
 time, and impossible (as far as I know) for rpool. The only solution I
 see is to write some startup script which applies it to filesystems
 other than rpool. Which feels kludgy. Is there a better way?
 
  echo set zfs:zil_disable = 1 > /etc/system

This is a great way to cure /etc/system viruses :-)
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic on zpool import. 200G of data inaccessible!

2011-08-05 Thread Mike Gerdts
On Thu, Aug 4, 2011 at 2:47 PM, Stuart James Whitefish
swhitef...@yahoo.com wrote:
 # zpool import -f tank

 http://imageshack.us/photo/my-images/13/zfsimportfail.jpg/

I encourage you to open a support case and ask for an escalation on CR 7056738.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import starves machine of memory

2011-08-05 Thread Paul Kraus
Another update:

The configuration of the zpool is 45 x 1 TB drives in three
vdev's, each of 15 drives. We should have a net capacity of between 30
and 36 TB (and that agrees with my memory of the pool). I ran zdb -e
-d against the pool (not imported) and totaled the size of the
datasets and came up with just about 11 TB. This also agrees with my
memory (about 18 TB of data and about 1.5 compression ratio). If the
failed snapshot / zfs recv is 3 TB (like I think it should be) or
almost 8 TB (as Oracle is telling me based on some mdb -k examinations
of the dataset delete thread), I should still have almost 10 TB free.

I am making an assumption here, and that is that the size listed
for the dataset with zdb -d includes all snapshots of that dataset
(much like the SIZE field of a zfs list). If that is NOT the case,
then I need to come up with a different way to estimate the fullness
of this zpool.
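
(For reference, the tally itself was nothing fancy -- roughly the pass below.
Treat the field position as approximate, since zdb -d formatting can vary
between builds.)

zdb -e -d xxx-yy-01 | nawk -F',' '
    /^Dataset / {
        sz = $4      # size column of the "Dataset ..." summary lines (position may vary by build)
        v = sz + 0
        u = substr(sz, length(sz), 1)
        if (u == "K") v *= 1024^1
        if (u == "M") v *= 1024^2
        if (u == "G") v *= 1024^3
        if (u == "T") v *= 1024^4
        total += v
    }
    END { printf("~%.2f TB across all datasets\n", total / 1024^4) }'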

On Thu, Aug 4, 2011 at 1:25 PM, Paul Kraus p...@kraus-haus.org wrote:
 Updates to my problem:

 1. The destroy operation appears to be restarting from the same point
 after the system hangs and has to be rebooted. Oracle gave me the
 following to track progress:

 echo '::pgrep zpool$ |::walk thread|::findstack -v' | mdb -k | grep
 dsl_dataset_destroy
 then take first arg of dsl_dataset_destroy and
 echo 'ARG::print dsl_dataset_t ds_phys->ds_used_bytes' | mdb -k

 I am logging these values every minute. Yesterday when I started
 tracking this I got a value of 0x75d97516b62, my last data point
 before the system hung was 0x4ee1098bdfd. My first first data point
 today after rebooting, restarting the logging scripts, and restarting
 the zpool import is 0x7a0b0634a1b. So it looks like I've made no real
 progress.

 2. It looks like the root cause of the original system crash that left
 the incomplete zfs recv snapshot is that a zfs recv filled the
 zpool (there are two parallel zfs recv's running, one for an old
 configuration (many datasets) and one for the new (one large
 dataset)). My replication script checks for free space before starting
 the replication, but we had a huge data load and replication of it
 running (3 TB), and when it started there was room for it, but other
 (much smaller) data loads and replication may have consumed it. This
 system has no other activity on it, it is just a repository for this
 replicated data.

 So ... it looks like I have:
 - a full zpool
 - an incomplete (corrupt ?) snapshot from a zfs recv
 ... and every time I try to import this zpool I hang the system due to
 lack of memory (the box has 32 GB of RAM).

 Any suggestions how to delete / destroy this incomplete snapshot
 without running the system out of RAM ?

 On Wed, Aug 3, 2011 at 9:56 AM, Paul Kraus p...@kraus-haus.org wrote:
 An additional data point, when I try to do a zdb -e -d and find the
 incomplete zfs recv snapshot I get an error as follows:

 # sudo zdb -e -d xxx-yy-01 | grep %
 Could not open xxx-yy-01/aaa-bb-01/aaa-bb-01-01/%1309906801, error 16
 #

 Anyone know what error 16 means from zdb and how this might impact
 importing this zpool ?

 On Wed, Aug 3, 2011 at 9:19 AM, Paul Kraus p...@kraus-haus.org wrote:
    I am having a very odd problem, and so far the folks at Oracle
 Support have not provided a working solution, so I am asking the crowd
 here while still pursuing it via Oracle Support.

     The system is a T2000 running 10U9 with CPU-2010-01 and two J4400
 loaded with 1 TB SATA drives. There is one zpool on the J4400 (3 x 15
 disk vdev + 3 hot spare). This system is the target for zfs send /
 recv replication from our production server.The OS is UFS on local
 disk.

     While I was on vacation this T2000 hung with out of resource
 errors. Other staff tried rebooting, which hung the box. Then they
 rebooted off of an old BE (10U9 without CPU-2010-01). Oracle Support
 had them apply a couple patches and an IDR to address zfs stability
 and reliability problems as well as set the following in /etc/system

 set zfs:zfs_arc_max = 0x700000000 (which is 28 GB)
 set zfs:arc_meta_limit = 0x700000000 (which is 28 GB)

    The system has 32 GB RAM and 32 (virtual) CPUs. They then tried
 importing the zpool and the system hung (after many hours) with the
 same out of resource error. At this point they left the problem for
 me :-(

    I removed the zfs.cache from the 10U9 + CPU 2010-10 BE and booted
 from that. I then applied the IDR (IDR146118-12) and the zfs patch it
 depended on (145788-03). I did not include the zfs arc and zfs arc
 meta limits as I did not think they relevant. A zpool import shows the
 pool is OK and a sampling with zdb -l of the drives shows good labels.
 I started importing the zpool and after many hours it hung the system
 with out of resource errors. I had a number of tools running to see
 what was going on. The only thing this system is doing is importing
 the zpool.

 ARC had climbed to about 8 GB and then declined to 3 GB by the time
 the system hung. This tells me that 

Re: [zfs-discuss] Wrong rpool used after reinstall!

2011-08-05 Thread Bill
On Thu, Aug 04, 2011 at 03:52:39AM -0700, Stuart James Whitefish wrote:
 Jim wrote:
 
  But I may be wrong, and anyway the single user shell in the u9 DVD also 
  panics when I try to import tank so maybe that won't help.
 
 Ian wrote:
 
  Put your old drive in a USB enclosure and connect it
  to another system in order to read back the data.
 
 Given that update 9 can't import the pool is this really worth trying?
 I would have to buy the enclosures, if I had them already I would have tried 
 it in 
 desperation.
 
 Jim wrote:
 
   I have only 4 sata ports on this Intel box so I have to keep pulling 
   cables to be 
   able to boot from a DVD and then I won't have all my drives available. I 
   cannot 
   move these drives to any other box because they are consumer drives and 
   my 
   servers all have ultras.
 
 Ian wrote:
 
  Most modern boards will boot from a live USB stick.
 
 True but I haven't found a way to get an ISO onto a USB that my system can 
 boot from it. I was using DD to copy the iso to the usb drive. Is there some 
 other way?


Maybe give http://unetbootin.sourceforge.net/ a try.

Bill



 
 This is really frustrating. I haven't had any problems with Linux filesystems 
 but I heard ZFS was safer. It's really ironic that I lost access to so much 
 data after moving it to ZFS. Isn't there any way to get them back on my newly 
 installed U8 system? If I disconnect this pool the system starts fine. 
 Otherwise my questions above in my summary post might be key to getting this 
 working.
 
 Thanks,
 Jim
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Wrong rpool used after reinstall!

2011-08-05 Thread Bob Friesenhahn

On Fri, 5 Aug 2011, Bill wrote:


True but I haven't found a way to get an ISO onto a USB that my system can boot 
from it. I was using DD to copy the iso to the usb drive. Is there some other 
way?


Maybe give http://unetbootin.sourceforge.net/ a try.


This package seems to list support for most x86 OSs EXCEPT for 
*Solaris.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Wrong rpool used after reinstall!

2011-08-05 Thread Ian Collins

 On 08/ 4/11 10:52 PM, Stuart James Whitefish wrote:

Ian wrote:

Put your old drive in a USB enclosure and connect it
to another system in order to read back the data.

Given that update 9 can't import the pool is this really worth trying?


I would use a newer (express maybe) system.


Most modern boards will boot from a live USB stick.

True but I haven't found a way to get an ISO onto a USB that my system can boot 
from it. I was using DD to copy the iso to the usb drive. Is there some other 
way?


Recent OpenSolaris based builds have a handy utility usbcopy.
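
Usage is basically just usbcopy followed by the path to the USB image (note 
it wants the .usb image published alongside the .iso, not the ISO itself), e.g.

  # usbcopy /path/to/image.usb

and, if I remember right, it then prompts for which removable device to write to.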


This is really frustrating. I haven't had any problems with Linux filesystems 
but I heard ZFS was safer. It's really ironic that I lost access to so much 
data after moving it to ZFS. Isn't there any way to get them back on my newly 
installed U8 system? If I disconnect this pool the system starts fine. 
Otherwise my questions above in my summary post might be key to getting this 
working.


If you have support, badger them.  Otherwise use a newer system.

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Large scale performance query

2011-08-05 Thread Orvar Korvar
Are mirrors really a realistic alternative? I mean, if I have to resilver a raid 
with 3TB discs, it can take days, I suspect. With 4TB disks it can take a week, 
maybe. So if I use mirrors and one disk breaks, I only have a single copy left 
while the mirror resilvers. The repair will take a long time and will stress 
the disks, which means the other disk might fail too.

Therefore, I think raidz2 or raidz3 is better, since it allows 2 or 3 disks to 
break while you resilver. Hence, mirrors are not a realistic alternative when 
using large disks.

True/false? What do you guys say?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Problem booting after zfs upgrade

2011-08-05 Thread stuart anderson
After upgrading to zpool version 29/zfs version 5 on a S10 test system via the 
kernel patch 144501-19, it will now boot only as far as the grub menu.

What is a good Solaris rescue image that I can boot that will allow me to 
import this rpool to look at it (given the newer version)?

Thanks.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Large scale performance query

2011-08-05 Thread Ian Collins

 On 08/ 6/11 10:42 AM, Orvar Korvar wrote:

Is mirrors really a realistic alternative?


To what?  Some context would be helpful.


I mean, if I have to resilver a raid with 3TB discs, it can take days I 
suspect. With 4TB disks it can take a week, maybe. So, if I use mirror and one 
disk break, then I only have single redundancy while the mirror repairs. 
Reparation will take long time and will stress the disks, which means the other 
disk might malfunction.

Therefore, I think raidz2 or raidz3 that allows 2 or 3 disks to break while you 
resilver. Hence, mirror is not a realistic alternative when using large disks.

True/false? What do you guys say?


I don't have any exact like-for-like comparison data, but from what I've 
seen, a mirror resilvers a lot faster than a drive in a raidz(2) vdev.
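
A rough back-of-the-envelope shows why (my assumptions: ~100 MB/s sustained 
throughput and a reasonably full 3 TB drive):

  mirror resilver: roughly a sequential copy of one disk,
      3,000,000 MB / 100 MB/s = 30,000 s, call it 8-9 hours
  raidz(2) resilver: has to read every surviving disk in the vdev and is
      driven by metadata traversal (much more random I/O), so several
      times that -- "days" -- is plausible.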


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem booting after zfs upgrade

2011-08-05 Thread Ian Collins

 On 08/ 6/11 11:48 AM, stuart anderson wrote:

After upgrading to zpool version 29/zfs version 5 on a S10 test system via the 
kernel patch 144501-19, it will now boot only as far as the grub menu.

What is a good Solaris rescue image that I can boot that will allow me to 
import this rpool to look at it (given the newer version)?


A Solaris 11 express live CD.

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Large scale performance query

2011-08-05 Thread Rob Cohen
Generally, mirrors resilver MUCH faster than RAIDZ, and you only lose 
redundancy on that stripe, so combined, you're much closer to RAIDZ2 odds than 
you might think, especially with hot spare(s), which I'd recommend.

When you're talking about IOPS, each stripe can support 1 simultaneous user.

Writing:
Each RAIDZ group = 1 stripe.
Each mirror group = 1 stripe.
So, 216 drives can be 24 stripes or 108 stripes.

Reading:
Each RAIDZ group = 1 stripe.
Each mirror group = 1 stripe per drive.
So, 216 drives can be 24 stripes or 216 stripes.

Actually, reads from mirrors are even more efficient than reads from stripes, 
because the software can optimally load balance across mirrors.

So, back to the original poster's question, 9 stripes might be enough to 
support 5 clients, but 216 stripes could support many more.

Actually, this is an area where RAID5/6 has an advantage over RAIDZ, if I 
understand correctly, because for RAID5/6 on read-only workloads, each drive 
acts like a stripe.  For workloads with writing, though, RAIDZ is significantly 
faster than RAID5/6, but mirrors/RAID10 give the best performance for all 
workloads.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disable ZIL - persistent

2011-08-05 Thread Edward Ned Harvey
 From: Darren J Moffat [mailto:darr...@opensolaris.org]
 Sent: Friday, August 05, 2011 10:14 AM
 
  echo set zfs:zil_disable = 1 > /etc/system
 
  This is a great way to cure /etc/system viruses :-)
 
 LOL!

:-)

Thank you.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem booting after zfs upgrade

2011-08-05 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Ian Collins
 
   On 08/ 6/11 11:48 AM, stuart anderson wrote:
  After upgrading to zpool version 29/zfs version 5 on a S10 test system via
  the kernel patch 144501-19, it will now boot only as far as the grub menu.
 
  What is a good Solaris rescue image that I can boot that will allow me to
  import this rpool to look at it (given the newer version)?
 
 A Solaris 11 express live CD.

FYI.

Before a certain rev, if you zpool upgrade, you have silently invalidated
your grub boot blocks, and you simply need to know based on past experience
that you need to installgrub.

After a certain rev, the system will notify you with a helpful informative
message.  You need to installgrub or something like that.

And after yet a later rev...  It does the installgrub for you automatically.

Or maybe I'm just talking about starting a new mirror of rpool.  Maybe the
same thing is not true in regards to zpool upgrade.  I don't know for sure.


In any event...  You need to do something like this:
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0
(substitute whatever device & slice you have used for rpool)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss