Re: zpool destroy causes panic

2010-07-25 Thread Dan Langille

On 7/25/2010 1:58 PM, Dan Langille wrote:

I'm trying to destroy a zfs array which I recently created.  It contains
nothing of value.


Oh... I left this out:

FreeBSD kraken.unixathome.org 8.0-STABLE FreeBSD 8.0-STABLE #0: Fri Mar  5 00:46:11 EST 2010
    d...@kraken.example.org:/usr/obj/usr/src/sys/KRAKEN  amd64

# zpool status
  pool: storage
 state: ONLINE
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

        NAME                      STATE     READ WRITE CKSUM
        storage                   ONLINE       0     0     0
          raidz2                  ONLINE       0     0     0
            gpt/disk01            ONLINE       0     0     0
            gpt/disk02            ONLINE       0     0     0
            gpt/disk03            ONLINE       0     0     0
            gpt/disk04            ONLINE       0     0     0
            gpt/disk05            ONLINE       0     0     0
            /tmp/sparsefile1.img  UNAVAIL      0     0     0  corrupted data
            /tmp/sparsefile2.img  UNAVAIL      0     0     0  corrupted data

errors: No known data errors

Why sparse files? See this post:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1007077+0+archive/2010/freebsd-stable/20100725.freebsd-stable


The two tmp files were created via:

dd if=/dev/zero of=/tmp/sparsefile1.img bs=1 count=0 oseek=1862g
dd if=/dev/zero of=/tmp/sparsefile2.img bs=1 count=0 oseek=1862g
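
For reference, truncate(1) should produce the same sparse files more
directly (an equivalent sketch, not the commands actually run):

truncate -s 1862g /tmp/sparsefile1.img
truncate -s 1862g /tmp/sparsefile2.img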

And the array created with:

zpool create -f storage raidz2 gpt/disk01 gpt/disk02 gpt/disk03 \
gpt/disk04 gpt/disk05 /tmp/sparsefile1.img /tmp/sparsefile2.img

The -f flag was required to avoid this message:

invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: raidz contains both files and devices


I tried to offline one of the sparse files:

zpool offline storage /tmp/sparsefile2.img

That caused a panic: http://www.langille.org/tmp/zpool-offline-panic.jpg

After rebooting, I rm'd both /tmp/sparsefile1.img and
/tmp/sparsefile2.img, forgetting that they were still in the zpool. Now
I am unable to destroy the pool: the system panics. I disabled ZFS via
/etc/rc.conf, rebooted, recreated the two sparse files, then did a
forcestart of zfs. Then I saw:

# zpool status
  pool: storage
 state: ONLINE
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

        NAME                      STATE     READ WRITE CKSUM
        storage                   ONLINE       0     0     0
          raidz2                  ONLINE       0     0     0
            gpt/disk01            ONLINE       0     0     0
            gpt/disk02            ONLINE       0     0     0
            gpt/disk03            ONLINE       0     0     0
            gpt/disk04            ONLINE       0     0     0
            gpt/disk05            ONLINE       0     0     0
            /tmp/sparsefile1.img  UNAVAIL      0     0     0  corrupted data
            /tmp/sparsefile2.img  UNAVAIL      0     0     0  corrupted data

errors: No known data errors


Another attempt to destroy the array caused a panic.

Suggestions as to how to remove this array and get started again?




--
Dan Langille - http://langille.org/


Re: zpool destroy causes panic

2010-07-25 Thread Jeremy Chadwick
On Sun, Jul 25, 2010 at 01:58:34PM -0400, Dan Langille wrote:
 [...]
 NAME                      STATE     READ WRITE CKSUM
 storage                   ONLINE       0     0     0
   raidz2                  ONLINE       0     0     0
     gpt/disk01            ONLINE       0     0     0
     gpt/disk02            ONLINE       0     0     0
     gpt/disk03            ONLINE       0     0     0
     gpt/disk04            ONLINE       0     0     0
     gpt/disk05            ONLINE       0     0     0
     /tmp/sparsefile1.img  UNAVAIL      0     0     0  corrupted data
     /tmp/sparsefile2.img  UNAVAIL      0     0     0  corrupted data
 
 [...]

 Another attempt to destroy the array created a panic.
 Suggestions as to how to remove this array and get started again?

 [...]

 FreeBSD kraken.unixathome.org 8.0-STABLE FreeBSD 8.0-STABLE #0: Fri Mar  5 00:46:11 EST 2010
     d...@kraken.example.org:/usr/obj/usr/src/sys/KRAKEN  amd64

1) Try upgrading the system (to 8.1-STABLE).  There have been numerous
changes to ZFS on RELENG_8 since March 5th.  I don't know if any of them
would address your problem.

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c#rev1.8.2.4
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c#rev1.8.2.3
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c#rev1.8.2.2
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c#rev1.8.2.1

2) Try bringing the system down into single-user mode and zeroing out
the first and last 64kbytes of each gpt/diskXX (you'll have to figure
this out on your own, I'm not familiar with GPT) so that the ZFS
metadata goes away.
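
A sketch of that zeroing step, assuming 512-byte sectors (64 kbytes =
128 sectors) and assuming the fourth field of diskinfo(8)'s default
output is the media size in sectors; repeat for each gpt/diskXX:

# SECTORS=$(diskinfo /dev/gpt/disk01 | awk '{print $4}')
# dd if=/dev/zero of=/dev/gpt/disk01 bs=512 count=128
# dd if=/dev/zero of=/dev/gpt/disk01 bs=512 count=128 oseek=$((SECTORS - 128))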

Footnote: can someone explain to me how ZFS would, upon reboot, know
that /tmp/sparsefile[12].img are part of the pool?  How would ZFS taste
metadata in this situation?

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: zpool destroy causes panic

2010-07-25 Thread Volodymyr Kostyrko

25.07.2010 23:18, Jeremy Chadwick wrote:

Footnote: can someone explain to me how ZFS would, upon reboot, know
that /tmp/sparsefile[12].img are part of the pool?  How would ZFS taste
metadata in this situation?


Just hacking it.

Each ZFS device that is part of a pool tracks all the other devices in
the pool, along with their sizes, device IDs, and last known locations.
It doesn't know that /tmp/sparsefile[12].img is part of the pool, but it
does know that the pool previously had some /tmp/sparsefile[12].img, and
that those files now either can't be found or no longer look like ZFS
devices.
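
You can inspect that stored configuration directly; zdb -l dumps the
vdev labels on a device, and each label's nvlist includes the full child
list of the top-level vdev, file paths included (an illustration, not a
command from the original thread):

# zdb -l /dev/gpt/disk01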


Can you try moving the current files to /tmp/sparsefile[34].img and then
re-adding them to the pool with zpool replace? One at a time, please.
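
Concretely, that would presumably look something like this for the first
file (my reading of the suggestion, not commands from the thread):

# mv /tmp/sparsefile1.img /tmp/sparsefile3.img
# zpool replace storage /tmp/sparsefile1.img /tmp/sparsefile3.img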


--
Sphinx of black quartz judge my vow.



Re: zpool destroy causes panic

2010-07-25 Thread Dan Langille

On 7/25/2010 4:37 PM, Volodymyr Kostyrko wrote:

25.07.2010 23:18, Jeremy Chadwick wrote:

Footnote: can someone explain to me how ZFS would, upon reboot, know
that /tmp/sparsefile[12].img are part of the pool? How would ZFS taste
metadata in this situation?


Just hacking it.

Each ZFS device that is part of a pool tracks all the other devices in
the pool, along with their sizes, device IDs, and last known locations.
It doesn't know that /tmp/sparsefile[12].img is part of the pool, but it
does know that the pool previously had some /tmp/sparsefile[12].img, and
that those files now either can't be found or no longer look like ZFS
devices.

Can you try moving the current files to /tmp/sparsefile[34].img and then
re-adding them to the pool with zpool replace? One at a time, please.


I do not know what the above paragraph means.

--
Dan Langille - http://langille.org/


Re: zpool destroy causes panic

2010-07-25 Thread Volodymyr Kostyrko

25.07.2010 20:58, Dan Langille wrote:

NAME                      STATE     READ WRITE CKSUM
storage                   ONLINE       0     0     0
  raidz2                  ONLINE       0     0     0
    gpt/disk01            ONLINE       0     0     0
    gpt/disk02            ONLINE       0     0     0
    gpt/disk03            ONLINE       0     0     0
    gpt/disk04            ONLINE       0     0     0
    gpt/disk05            ONLINE       0     0     0
    /tmp/sparsefile1.img  UNAVAIL      0     0     0  corrupted data
    /tmp/sparsefile2.img  UNAVAIL      0     0     0  corrupted data


OK, I'll try it from here. UNAVAIL means ZFS can't locate the correct
vdev for this pool member. Even if the file exists, it is not used by
ZFS because it lacks the ZFS headers/footers.


You can (I think) reinsert the empty file into the pool with:

# zpool replace storage /tmp/sparsefile1.img /tmp/sparsefile1.img
#               ^ pool  ^ old ZFS vdev name  ^ current file

If you replace both files, you can theoretically bring the pool back to
a fully consistent state.


You can also use md(4) to turn the files into devices:

# mdconfig -a -t vnode -f /tmp/sparsefile1.img
md0

And then use md0 with your pool.
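
Presumably along the lines of (using the md0 created above):

# zpool replace storage /tmp/sparsefile1.img md0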

--
Sphinx of black quartz judge my vow.


Re: zpool destroy causes panic

2010-07-25 Thread Dan Langille

On 7/25/2010 4:49 PM, Volodymyr Kostyrko wrote:

[...]

You can also use md(4) to turn the files into devices:

# mdconfig -a -t vnode -f /tmp/sparsefile1.img
md0

And then use md0 with your pool.


FYI, tried this, got a panic:

errors: No known data errors
# mdconfig -a -t vnode -f /tmp/sparsefile1.img
md0
# mdconfig -a -t vnode -f /tmp/sparsefile2.img
md1
# zpool replace storage /tmp/sparsefile1.img /dev/md0


--
Dan Langille - http://langille.org/


Re: zpool destroy causes panic

2010-07-25 Thread Dan Langille

On 7/25/2010 1:58 PM, Dan Langille wrote:

I'm trying to destroy a zfs array which I recently created.  It contains
nothing of value.

[...]

Another attempt to destroy the array caused a panic.

Suggestions as to how to remove this array and get started again?


I fixed this by:

* setting zfs_enable="NO" in /etc/rc.conf and rebooting
* rm /boot/zfs/zpool.cache
* wiping the first and last 16KB of each partition involved in the array
  (see the sketch below)
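
A sketch of that wipe step, assuming 512-byte sectors (16 KB = 32
sectors) and assuming the fourth field of diskinfo(8)'s default output
is the media size in sectors; repeat for each gpt/diskXX:

# SECTORS=$(diskinfo /dev/gpt/disk01 | awk '{print $4}')
# dd if=/dev/zero of=/dev/gpt/disk01 bs=512 count=32
# dd if=/dev/zero of=/dev/gpt/disk01 bs=512 count=32 oseek=$((SECTORS - 32))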

Now I'm trying mdconfig instead of sparse files.  Making progress, but 
not all the way there yet.  :)
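
Presumably something along these lines, with md devices backed by the
sparse files standing in for the missing disks (a sketch assembled from
the commands earlier in the thread, not the exact session):

# truncate -s 1862g /tmp/sparsefile1.img
# truncate -s 1862g /tmp/sparsefile2.img
# mdconfig -a -t vnode -f /tmp/sparsefile1.img
md0
# mdconfig -a -t vnode -f /tmp/sparsefile2.img
md1
# zpool create storage raidz2 gpt/disk01 gpt/disk02 gpt/disk03 \
    gpt/disk04 gpt/disk05 md0 md1

Since md devices are GEOM providers rather than plain files, this should
also avoid the 'raidz contains both files and devices' warning, so -f
should no longer be needed.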


--
Dan Langille - http://langille.org/