Re: [zfs-discuss] Which directories must be part of rpool?

2009-09-26 Thread Toby Thain


On 26-Sep-09, at 9:56 AM, Frank Middleton wrote:


On 09/25/09 09:58 PM, David Magda wrote:
...


Similar definition for [/tmp] Linux FWIW:


Yes, but unless they fixed it recently (>= RHFC11), Linux doesn't actually
nuke /tmp, which seems to be mapped to disk. One side effect is that (like
MSWindows) AFAIK there isn't a native tmpfs, ...


Are you sure about that? My Linux systems do.

http://lxr.linux.no/linux+v2.6.31/Documentation/filesystems/tmpfs.txt

--Toby



Cheers -- Frank






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] extremely slow writes (with good reads)

2009-09-26 Thread Orvar Korvar
This controller card - you have turned off any RAID functionality on it, yes? ZFS has
total control of all disks, by itself? No hardware RAID intervening?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] raidz failure, trying to recover

2009-09-26 Thread Liam Slusser
Oh, PS: this is on Solaris 5.11 snv_99 - thanks! Liam
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] raidz failure, trying to recover

2009-09-26 Thread Liam Slusser
Long story short, my cat jumped on my server at my house, crashing two drives at 
the same time.  It was a 7-drive raidz (next time I'll do raidz2).

The server crashed complaining about a drive failure, so I rebooted into single 
user mode, not realizing that two drives had failed.  I put in a new 500 GB 
replacement and had ZFS start a replace operation, which failed at about 2% 
because there were two broken drives.  At that point I turned off the computer 
and sent both drives to a data recovery place.  They were able to recover the 
data on one of the two drives (the one that I started the replace operation on) 
- great - that should be enough to get my data back.

I popped the newly recovered drive back in; it had an older txg number than the 
other drives, so I made a backup of each drive and then modified the txg number 
to an earlier value so they all match.
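(For reference, a rough sketch of one way to back up just the ZFS labels before 
touching them - a full-device dd image is safer if you have the space.  Device, 
path and size values below are illustrative; ZFS keeps labels 0 and 1 in the 
first 512 KB of each vdev and labels 2 and 3 in the last 512 KB.)

SIZE_KB=488386584   # hypothetical device size in KB
# front labels (L0 + L1): first 512 KB of the device
dd if=/dev/dsk/c0t2d0 of=/backup/c0t2d0.labels.front bs=1k count=512
# tail labels (L2 + L3): last 512 KB of the device
dd if=/dev/dsk/c0t2d0 of=/backup/c0t2d0.labels.tail bs=1k skip=$((SIZE_KB - 512)) count=512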

However, I am still unable to import the array - I'm getting the following error 
(it doesn't matter if I use -f or -F):

bash-3.2# zpool import data
  pool: data
id: 6962146434836213226
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

data   UNAVAIL  missing device
  raidz1   DEGRADED
c0t0d0 ONLINE
c0t1d0 ONLINE
replacing  ONLINE
  c0t2d0   ONLINE
  c0t7d0   ONLINE
c0t3d0 UNAVAIL  cannot open
c0t4d0 ONLINE
c0t5d0 ONLINE
c0t6d0 ONLINE

Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.

Now, I should have enough online devices to import the pool and get my data off, 
but no luck.  I'm not really sure where to go at this point.

Do I have to fake a c0t3d0 drive so it thinks all drives are there?  Can 
somebody point me in the right direction?

thanks,
liam



p.s.  To help me find which uberblocks to modify to reset the txg, I wrote a 
little Perl program which finds and prints out information in order to revert 
to an earlier txg value.

It's a little messy since I wrote it quickly, super late at night - but maybe it 
will help somebody else out.

http://liam821.com/findUberBlock.txt (it's just a Perl script)

It's easy to run.  It pulls in 256k of data and sorts it (or skips X KB first if 
you use the -s ###) and then searches for uberblocks.  (Remember there are 4 
labels: at 0 and 256 KB, and then two at the end of the disk.  You need to 
manually figure out the end skip value...)  Calculating the GUID seems to always 
fail because the number is too large for Perl, so it returns a negative number.  
Meh, it wasn't important enough to try to figure out.
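(Working out the end skip value is just arithmetic on the device size.  A rough 
sketch, with an illustrative sector count - this is only how I read the -s flag, 
which takes a KB offset:)

SECTORS=976773168                # device size in 512-byte sectors, e.g. from prtvtoc or format
SIZE_KB=$(( SECTORS / 2 ))       # size in KB
echo "label 2 skip: $(( SIZE_KB - 512 )) KB"
echo "label 3 skip: $(( SIZE_KB - 256 )) KB"
# then, for example:
# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -s $(( SIZE_KB - 512 ))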

(The info below has NOTHING to do with my disk problem above; it's a happy and 
healthy server that I wrote the tool on.)

- find the newest txg number
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n
block=148 (0025000) transaction=15980419

- print verbose output
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -v
block=148 (0025000)
zfs_ver=3   (0003   )
transaction=15980419(d783 00f3  )
guid_sum=-14861410676147539 (7aad 2fc9 33a0 ffcb)
timestamp=1253958103(e1d7 4abd  )
(Sat Sep 26 02:41:43 2009)

raw =   0025000 b10c 00ba   0003   
0025010 d783 00f3   7aad 2fc9 33a0 ffcb
0025020 e1d7 4abd   0001   

- list all uberblocks
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -l
block=145 (0024400) transaction=15980288
block=146 (0024800) transaction=15980289
block=147 (0024c00) transaction=15980290
block=148 (0025000) transaction=15980291
block=149 (0025400) transaction=15980292
block=150 (0025800) transaction=15980293
block=151 (0025c00) transaction=15980294
block=152 (0026000) transaction=15980295
block=153 (0026400) transaction=15980296
block=154 (0026800) transaction=15980297
block=155 (0026c00) transaction=15980298
block=156 (0027000) transaction=15980299
block=157 (0027400) transaction=15980300
block=158 (0027800) transaction=15980301
.
.
.

- skip 256 KB into the disk and find the newest uberblock
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -s 256
block=507 (7ec00) transaction=15980522

Now let's say I want to go back in time on this; the program can help me 
do that.  If I wanted to go back in time to txg 15980450...

bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -t 15980450
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=180 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=181 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=182 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=183 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=184 count=1 conv=notrunc
dd 

Re: [zfs-discuss] Help! System panic when pool imported

2009-09-26 Thread Victor Latushkin

Richard Elling wrote:

Assertion failures indicate bugs. You might try another version of the OS.
In general, they are easy to search for in the bugs database.  A quick
search reveals
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6822816
but that doesn't look like it will help you.  I suggest filing a new bug at
the very least.


I have redispatched 6822816, so it needs to be reevaluated since more 
information is available now.


victor


On Sep 24, 2009, at 10:21 PM, Albert Chin wrote:


Running snv_114 on an X4100M2 connected to a 6140. Made a clone of a
snapshot a few days ago:
 # zfs snapshot a...@b
 # zfs clone a...@b tank/a
 # zfs clone a...@b tank/b

The system started panicking after I tried:
 # zfs snapshot tank/b...@backup

So, I destroyed tank/b:
 # zfs destroy tank/b
then tried to destroy tank/a
 # zfs destroy tank/a

Now, the system is in an endless panic loop, unable to import the pool
at system startup or with zpool import. The panic dump is:
  panic[cpu1]/thread=ff0010246c60: assertion failed: 0 == 
zap_remove_int(mos, ds_prev->ds_phys->ds_next_clones_obj, obj, tx) 
(0x0 == 0x2), file: ../../common/fs/zfs/dsl_dataset.c, line: 1512


 ff00102468d0 genunix:assfail3+c1 ()
 ff0010246a50 zfs:dsl_dataset_destroy_sync+85a ()
 ff0010246aa0 zfs:dsl_sync_task_group_sync+eb ()
 ff0010246b10 zfs:dsl_pool_sync+196 ()
 ff0010246ba0 zfs:spa_sync+32a ()
 ff0010246c40 zfs:txg_sync_thread+265 ()
 ff0010246c50 unix:thread_start+8 ()

We really need to import this pool. Is there a way around this? We do
have snv_114 source on the system if we need to make changes to
usr/src/uts/common/fs/zfs/dsl_dataset.c. It seems like the zfs
destroy transaction never completed and it is being replayed, causing
the panic. This cycle continues endlessly.

--
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which directories must be part of rpool?

2009-09-26 Thread Frank Middleton

On 09/25/09 09:58 PM, David Magda wrote:


The contents of /var/tmp can be expected to survive between boots (e.g.,
/var/tmp/vi.recover); /tmp is nuked on power cycles (because it's just
memory/swap):


Yes, but does mapping it to /tmp have any issues regarding booting
or image-update in the context of this thread? IMO nuking is a good
thing - /tmp and /var/tmp get really cluttered up after a few months,
the downside of robust hardware and software :-). Not sure I really
care about recovering vi edits in the case of UPS failure...


If a program is creating and deleting large numbers of files, and those
files aren't needed between reboots, then it really should be using /tmp.


Quite. But some lazy programmer of 3rd party software decided to use
the default tmpnam() function and I don't have access to the code :-(.

 tmpnam()
 The tmpnam() function always generates a file name using the
 path  prefix defined as P_tmpdir in the stdio.h header. On
 Solaris  systems,  the  default  value   for   P_tmpdir   is
 /var/tmp.


Similar definition for [/tmp] Linux FWIW:


Yes, but unless they fixed it recently (>= RHFC11), Linux doesn't actually
nuke /tmp, which seems to be mapped to disk. One side effect is that (like
MSWindows) AFAIK there isn't a native tmpfs, so programs that create and
destroy large numbers of files run orders of magnitude slower there than
on Solaris - assuming the application doesn't use /var/tmp for them :-).
Compilers and code generators are typical of applications that do this,
though they don't usually do synchronous i/o as said programmer appears
to have done.

I suppose /var/tmp on zfs would never actually write these files unless
they were written synchronously. In the context of this thread, for
those of us with space constrained boot disks/ssds, is it OK to map
/var/tmp to /tmp, and /var/crash, /var/dump, and swap to a separate
data pool in the context of being able to reboot and install new images?
I've been doing so for a long time now with no problems that I know of.
Just wondering what the gurus think...
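(For reference, a rough sketch of that layout - pool names and sizes are 
illustrative, and this is just how I set it up, not a recommendation:)

# dump and swap live on zvols, so they can sit on a data pool
zfs create -V 4g datapool/dump
zfs create -V 4g datapool/swap
dumpadm -d /dev/zvol/dsk/datapool/dump
swap -a /dev/zvol/dsk/datapool/swap
# /var/crash can be redirected the same way
zfs create -o mountpoint=/var/crash datapool/crash
dumpadm -s /var/crash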

Haven't seen any definitive response regarding /opt, which IMO should
be a good candidate since the installer makes it a separate fs anyway.
/usr/local can definitely be kept on a separate pool. I wouldn't move
/root. I keep a separate /export/home/root and have root cd to it via
a script in /root that also sets HOME, although I noticed on snv123
that logging on as root succeeded even though it couldn't find bash
(it defaulted to using sh). This may be a snv123 bug, but it is a huge
improvement on past behavior. I daresay logging on as root might
also work if root's home directory was AWOL. Haven't tried it...

Cheers -- Frank





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] raidz failure, trying to recover

2009-09-26 Thread Liam Slusser
On second thought, I used zdb -l to show each device - looks like my dd didn't 
have the desired effect I wanted.

I'm still showing a newer txg number for all of my drives except c0t2d0 (the 
replacement which they fixed).  (This is probably why it won't mount, eh?)

Is there anything else I need to do to roll back each drive so the txg numbers 
match, other than dd'ing over each newer txg entry on the disk?

(I booted OpenSolaris 2009.06, which I've read is a little more forgiving.)
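(A quick way to compare the txg across every readable member - device names as 
above; c0t3d0 is skipped since it cannot be opened:)

for d in c7t0d0s0 c7t1d0s0 c7t2d0s0 c7t4d0s0 c7t5d0s0 c7t6d0s0 c7t7d0s0; do
  echo "== $d =="
  zdb -l /dev/dsk/$d | grep -w txg | sort -u
done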

r...@opensolaris:~# zdb -l /dev/dsk/c7t0d0s0 | more

LABEL 0

version=13
name='data'
state=0
txg=778014
pool_guid=6962146434836213226
hostid=63246693
hostname='media'
top_guid=18396265026227018612
guid=6801152981449012737

r...@opensolaris:~# zdb -l /dev/dsk/c7t1d0s0 | more

LABEL 0

version=13
name='data'
state=0
txg=778014
pool_guid=6962146434836213226
hostid=63246693
hostname='media'
top_guid=18396265026227018612
guid=7077979893178320090


r...@opensolaris:~# zdb -l /dev/dsk/c7t2d0s0 | more

LABEL 0

version=13
name='data'
state=0
txg=777842
pool_guid=6962146434836213226
hostid=63246693
hostname='media'
top_guid=18396265026227018612
guid=7489495842431367457



r...@opensolaris:~# zpool status
  pool: data
 state: FAULTED
status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
data   FAULTED  0 0 1  corrupted data
  raidz1   DEGRADED 0 0 6
c7t0d0 ONLINE   0 0 0
c7t1d0 ONLINE   0 0 0
replacing  ONLINE   0 0 0
  c7t2d0   ONLINE   0 0 0
  c7t7d0   ONLINE   0 0 0
c0t3d0 UNAVAIL  0 0 0  cannot open
c7t4d0 ONLINE   0 0 0
c7t5d0 ONLINE   0 0 0
c7t6d0 ONLINE   0 0 0
r...@opensolaris:~#

thanks,
liam
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best way to convert checksums

2009-09-26 Thread Orvar Korvar
I had this same question and was recommended to use rsync or zfs send; I used 
both just to be safe. With zfs send, you create a snapshot and then send it. 
After deleting the snapshot on the target, you have identical copies. 
rsync seems to be commonly used for this task as well.
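(A rough sketch of the zfs send approach, with hypothetical pool and dataset 
names - the key point is that the new checksum property is set on the 
destination before the data is written:)

zfs create -o checksum=sha256 tank/newdata          # target inherits the new checksum
zfs snapshot tank/olddata@migrate
zfs send tank/olddata@migrate | zfs receive tank/newdata/copy
zfs destroy tank/newdata/copy@migrate               # drop the snapshot on the target
# or, with both datasets mounted, rsync works too:
# rsync -a /tank/olddata/ /tank/newdata/copy/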
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Problem: ZFS Partition rewriten, how to recover data???

2009-09-26 Thread Darko Petkovski
I had a ZFS partition of around 1.37 TB, written using zfs113 for Mac. Then,
under FreeBSD 7.2, following a guide on the wiki, I ran 'zpool create trunk',
which ended up rewriting the partition. Now the question is: how can I recover
the partition, or the data from it? Thanks



  ___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] raidz failure, trying to recover

2009-09-26 Thread Orvar Korvar
Have you considered buying support? Maybe you will get guaranteed help then?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Borked zpool, missing slog/zil

2009-09-26 Thread Erik Ableson
Hmmm - this is an annoying one.

I'm currently running an OpenSolaris install (2008.11 upgraded to 2009.06) :
SunOS shemhazai 5.11 snv_111b i86pc i386 i86pc Solaris

with a zpool made up of one raidz vdev and a small ramdisk-based ZIL (slog).  I 
usually swap out the slog for a file-based copy when I need to reboot (zpool 
replace /dev/ramdisk/slog /root/slog.tmp), but this time I had a brain fart and 
forgot to.
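(For context, a minimal sketch of that swap-out routine - device names, the pool 
name and sizes are illustrative:)

# before a planned reboot: move the slog from the ramdisk to a file
mkfile 512m /root/slog.tmp
zpool replace siovale /dev/ramdisk/slog /root/slog.tmp
# after boot: recreate the ramdisk and move the slog back
ramdiskadm -a slog 512m
zpool replace siovale /root/slog.tmp /dev/ramdisk/slog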

The server came back up and I could sort of work on the zpool, but it was 
complaining, so I did my replace command and it happily resilvered.  Then I 
restarted one more time in order to test bringing everything up cleanly, and 
this time it can't find the file-based zil.

I try importing and it comes back with:
zpool import
  pool: siovale
id: 13808783103733022257
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

siovale UNAVAIL  missing device
  raidz1ONLINE
c8d0ONLINE
c9d0ONLINE
c10d0   ONLINE
c11d0   ONLINE

Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.

Now the file still exists so I don't know why it can't seem to find it and I 
thought the missing zil issue was corrected in this version (or did I miss 
something?).

I've looked around for solutions to bring it back online and ran across this 
method: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg16545.html 
but before I jump in on this one I was hoping there was a newer, cleaner 
approach that I missed somehow.

Ideas appreciated...

Erik

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which directories must be part of rpool?

2009-09-26 Thread Frank Middleton

On 09/26/09 12:11 PM, Toby Thain wrote:


Yes, but unless they fixed it recently (>= RHFC11), Linux doesn't
actually nuke /tmp, which seems to be mapped to disk. One side
effect is that (like MSWindows) AFAIK there isn't a native tmpfs,
...


Are you sure about that? My Linux systems do.

http://lxr.linux.no/linux+v2.6.31/Documentation/filesystems/tmpfs.txt


OK, so you can mount a tmpfs on /tmp and /var/tmp, but that's
not the default, at least as of RHFC10. I have files in /tmp
going back to Feb 2008 :-). Evidently, quoting Wikipedia,
tmpfs has been supported by the Linux kernel from version 2.4 and up
(http://en.wikipedia.org/wiki/TMPFS); FC1 was 6 years ago. Solaris /tmp
has been a tmpfs since 1990...
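(A minimal example of doing it by hand on a Linux box - the size is illustrative:)

# mount a tmpfs on /tmp for the current boot
mount -t tmpfs -o size=1g tmpfs /tmp
# or make it persistent with an /etc/fstab entry such as:
# tmpfs   /tmp   tmpfs   size=1g   0 0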

Now back to the thread...



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] extremely slow writes (with good reads)

2009-09-26 Thread paul
 This controller card - you have turned off any RAID functionality on it, yes? ZFS
 has total control of all disks, by itself? No hardware RAID intervening?
 --
 This message posted from opensolaris.org



yes, it's an LSI 150-6, with the BIOS turned off, which turns it into a
dumb SATA card.

Paul

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool replace single disk with raidz

2009-09-26 Thread Chris Gerhard
Alas you need the fix for:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=4852783

Until that arrives mirror the disk or rebuild the pool.

--chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which directories must be part of rpool?

2009-09-26 Thread Ian Collins

Frank Middleton wrote:


I suppose /var/tmp on zfs would never actually write these files unless
they were written synchronously. In the context of this thread, for
those of us with space constrained boot disks/ssds, is it OK to map
/var/tmp to /tmp, and /var/crash, /var/dump, and swap to a separate
data pool in the context of being able to reboot and install new images?
I've been doing so for a long time now with no problems that I know of.
Just wondering what the gurus think...

Moving /var/tmp works OK. I had a system root pool on a CF card and 
moved busy filesystems off to another pool.  I'm not sure which 
filesystem caused the problem, but this system was impossible to live 
upgrade.  swap and dump are volumes, so they can be anywhere (they both 
have commands to add/remove devices).



Havn't seen any definitive response regrading /opt, which IMO should
be a good candidate since the installer makes it a separate fs anyway.


Most of /opt can be relocated, but as I said, I was unable to live 
upgrade the box.  I only moved staroffice and then created filesystems 
with mountpoints in /opt before adding applications that install to /opt.


See

http://www.sun.com/bigadmin/features/articles/nvm_boot.jsp

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which directories must be part of rpool?

2009-09-26 Thread Frank Middleton

On 09/26/09 05:25 PM, Ian Collins wrote:


Most of /opt can be relocated


There isn't much in there on a vanilla install (X86 snv111b)

# ls /opt
DTT  SUNWmlib


http://www.sun.com/bigadmin/features/articles/nvm_boot.jsp


You pretty much answered the OP with this link. Thanks for
posting it!

Cheers -- Frank

 
___

zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which directories must be part of rpool?

2009-09-26 Thread Toby Thain


On 26-Sep-09, at 2:55 PM, Frank Middleton wrote:


On 09/26/09 12:11 PM, Toby Thain wrote:


Yes, but unless they fixed it recently (>= RHFC11), Linux doesn't
actually nuke /tmp, which seems to be mapped to disk. One side
effect is that (like MSWindows) AFAIK there isn't a native tmpfs,
...


Are you sure about that? My Linux systems do.

http://lxr.linux.no/linux+v2.6.31/Documentation/filesystems/tmpfs.txt


OK, so you can mount a tmpfs on /tmp and /var/tmp, but that's
not the default,



It has long been the default in Gentoo. This system in particular was
installed in 2004.



at least as of RHFC10. I have files in /tmp
going back to Feb 2008 :-). Evidently, quoting Wikipedia,
tmpfs has been supported by the Linux kernel from version 2.4 and up
(http://en.wikipedia.org/wiki/TMPFS); FC1 was 6 years ago. Solaris /tmp
has been a tmpfs since 1990...


The question wasn't who was first.

--Toby



Now back to the thread...





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Borked zpool, missing slog/zil

2009-09-26 Thread Ross
Do you have a backup copy of your zpool.cache file?

If you have that file, ZFS will happily mount a pool on boot without its slog 
device - it'll just flag the slog as faulted and you can do your normal 
replace.  I used that for a long while on a test server with a ramdisk slog - 
and I never needed to swap it to a file-based slog.

However, without a backup of that file to make ZFS load the pool on boot, I 
don't believe there is any way to import that pool.
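(For reference, a rough sketch of that backup and restore - the destination path 
is illustrative:)

# while the pool is healthy, keep a copy of the cache file somewhere safe
cp /etc/zfs/zpool.cache /rpool/zpool.cache.bak
# before the next boot after losing the slog, put it back
cp /rpool/zpool.cache.bak /etc/zfs/zpool.cache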
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss