Re: [zfs-discuss] ZFS not starting

2011-12-02 Thread Gareth de Vaux
On Thu 2011-12-01 (14:19), Freddie Cash wrote:
 You will need to find a lot of extra RAM to stuff into that machine in
 order for it to boot correctly, load the dedupe tables into ARC, process
 the intent log, and then import the pool.

Thanks guys, managed to get 24GB together and it made it (looks like it used
12GB of that).

 And, you'll need that extra RAM in order to destroy the ZFS filesystem that
 has dedupe enabled.

That filesystem's gone, and it looks like I've got roughly the right amount
of free space back. Waiting to see what a scrub has to say ..
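
(I'm keeping an eye on it with the usual status command; "tank" below is a
stand-in for the real pool name:

# zpool status -v tank
)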


[zfs-discuss] ZFS not starting

2011-12-01 Thread Gareth de Vaux
Hi guys, when ZFS starts it ends up hanging the system.

We have a raidz over 5 x 2TB disks with 5 ZFS filesystems. (The
root filesystem is on separate disks).

# uname -a
FreeBSD fortinbras.XXX 8.2-STABLE FreeBSD 8.2-STABLE #0: Wed Oct 19 09:20:04 
SAST 2011 r...@storage.xxx:/usr/obj/usr/src/sys/GENERIC  amd64

ZFS filesystem version 5
ZFS storage pool version 28

The setup was working great until we decided to try out deduplication.
After testing it on a few files I set deduplication on one of the
filesystems and moved around 1.5TB of data into it. There was basically
no space saving, just a big performance hit, so we decided to take it off.
While deleting a directory on this filesystem in preparation, the system
hung. After an abnormally long half-hour bootup everything was fine,
and a scrub was clean. We then decided to rather just destroy this
filesystem, as it happened to have disposable data on it and would be 100
times quicker(?). I ran the zfs destroy command, which sat there for about
40 hours while the free space on the pool gradually increased, at which
point the system hung again. Rebooting took hours, stuck at ZFS
initialisation, after which the console returned:

pid 37 (zfs), uid 0, was killed: out of swap space
pid 38 (sh), uid 0, was killed: out of swap space
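
For reference, dedup was toggled with the standard per-filesystem property
(pool/filesystem names below are placeholders). Note that dedup=off only
affects newly written data; blocks that were already deduplicated keep
their DDT entries until they are freed:

# zfs set dedup=on tank/data
# zfs set dedup=off tank/data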

This is now the state I'm stuck in. I can naturally boot up without ZFS,
but once I start it manually the disks in the pool start flashing for
an hour or two and the system hangs before they finish doing whatever
they're doing.

The system has 6GB of RAM and a 10GB swap partition. I added a 30GB
swap file but this hasn't helped.
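
The swap file went in the usual FreeBSD way, roughly as below (file path
and md unit are arbitrary):

# dd if=/dev/zero of=/usr/swap0 bs=1m count=30720
# mdconfig -a -t vnode -f /usr/swap0 -u 0
# swapon /dev/md0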

# sysctl hw.physmem
hw.physmem: 6363394048

# sysctl vfs.zfs.arc_max
vfs.zfs.arc_max: 5045088256

(I lowered arc_max to 1GB but it hasn't helped)
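
(On this FreeBSD arc_max is a boot-time tunable, so the 1GB test went into
/boot/loader.conf, value in bytes, followed by a reboot:

vfs.zfs.arc_max="1073741824"
)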

I've included at the bottom the debug output from starting ZFS after
setting vfs.zfs.debug=1. There's no further ZFS output for the next few
hours while the disks are flashing and the system remains responsive.

A series of top outputs after starting ZFS:

last pid:  1536;  load averages:  0.00,  0.02,  0.08
21 processes:  1 running, 20 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 12M Active, 7344K Inact, 138M Wired, 44K Cache, 10M Buf, 5678M Free
Swap: 39G Total, 39G Free

last pid:  1567;  load averages:  0.13,  0.05,  0.08
25 processes:  1 running, 24 sleeping
CPU:  0.0% user,  0.0% nice,  2.3% system,  2.1% interrupt, 95.6% idle
Mem: 14M Active, 7880K Inact, 328M Wired, 44K Cache, 13M Buf, 5485M Free
Swap: 39G Total, 39G Free

last pid:  1632;  load averages:  0.06,  0.04,  0.06
25 processes:  1 running, 24 sleeping
CPU:  0.0% user,  0.0% nice,  0.5% system,  0.1% interrupt, 99.4% idle
Mem: 14M Active, 8040K Inact, 2421M Wired, 40K Cache, 13M Buf, 3392M Free
Swap: 39G Total, 39G Free

last pid:  1693;  load averages:  0.11,  0.10,  0.08
25 processes:  1 running, 24 sleeping
CPU:  0.0% user,  0.0% nice,  0.3% system,  0.1% interrupt, 99.5% idle
Mem: 14M Active, 8220K Inact, 4263M Wired, 40K Cache, 13M Buf, 1550M Free
Swap: 39G Total, 39G Free

last pid:  1767;  load averages:  0.00,  0.00,  0.00
25 processes:  1 running, 24 sleeping
CPU:  0.0% user,  0.0% nice, 27.6% system,  0.0% interrupt, 72.4% idle
Mem: 14M Active, 8212K Inact, 4380M Wired, 40K Cache, 13M Buf, 1433M Free
Swap: 39G Total, 39G Free

*sudden system freeze*

Whether it ends up utilising the swap space and/or thrashing I don't know;
I only know that the zpool disks' fancy LEDs have stopped, as has the
general disk access LED. At this point I can still ping the host, and get
a response from telnetting to port 22, but I can't ssh in or access the
console.

Any suggestions would be appreciated; we're rather fond of the data on the
other 4 innocent filesystems ;)


The debug output:

Dec  1 15:56:07 fortinbras kernel: ZFS filesystem version 5
Dec  1 15:56:07 fortinbras kernel: zvol_init:1700[1]: ZVOL Initialized.
Dec  1 15:56:07 fortinbras kernel: ZFS storage pool version 28
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada2.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada2.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:116[1]: Created geom and 
consumer for ada2.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada2...
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada2 is 
12202424374202010419
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada2.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada1.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada1.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada1.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada1...
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada1 is 

Re: [zfs-discuss] ZFS not starting

2011-12-01 Thread Garrett D'Amore
You have just learned the hard way that dedup is *highly* toxic if misused.  If 
you have a backup of your data, then you should delete the *pool*.  Trying to 
destroy the dataset (the ZFS-level filesystem) will probably never succeed 
unless you have it located on an SSD or you have an enormous amount of RAM 
(maybe 100GB?  I haven't done the math on your system).  There really isn't any 
other solution to this that I'm aware of.  (Destroying the filesystem means a 
*lot* of random I/O… your drives are probably completely swamped by the 
workload.)
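
If you do go the route of deleting the pool, the sketch is simply this
(pool name, raidz layout and device names below are placeholders, and only
do this against a verified backup):

# zpool destroy -f tank
# zpool create tank raidz ada1 ada2 ada3 ada4 ada5

then recreate your filesystems and restore.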

In general, deleting data (especially filesystems) should almost never be done 
in the face of dedup, and you should not use dedup unless you know that your 
data has a lot of natural redundancies in it *and* you have adequate memory.

In general, dedup is *wrong* for use by typical home/hobbyist users.  It can 
make sense when hosting a lot of VM images, or in some situations like backups 
with a lot of redundant copies. 

I really wish we had made it harder for end-users to enable dedup.  For the 
first year or so after Nexenta shipped it, it was the single most frequent 
source of support calls.

If you've not done the analysis already, and you're not a storage 
administrator, you probably should not enable dedup.

- Garrett

On Dec 1, 2011, at 1:43 PM, Gareth de Vaux wrote:

 Hi guys, when ZFS starts it ends up hanging the system.
 
 [rest of the original message snipped; see the full post above]

Re: [zfs-discuss] ZFS not starting

2011-12-01 Thread Freddie Cash

 The system has 6GB of RAM and a 10GB swap partition. I added a 30GB
 swap file but this hasn't helped.


ZFS doesn't use swap for the ARC (it's wired aka unswappable memory).  And
ZFS uses the ARC for dedupe support.

You will need to find a lot of extra RAM to stuff into that machine in
order for it to boot correctly, load the dedupe tables into ARC, process
the intent log, and then import the pool.

And, you'll need that extra RAM in order to destroy the ZFS filesystem that
has dedupe enabled.

Basically, your DDT (dedupe table) is running you out of ARC space and
livelocking (or is it deadlocking, never can keep those terms straight) the
box.
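
Rough numbers, assuming the commonly quoted figure of ~320 bytes of core
per DDT entry: 1.5 TB of data at an average 128 KB block size is on the
order of 12 million unique blocks, which works out to close to 4 GB of DDT
before any other ARC overhead.  That alone swamps a 6 GB box.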

You can remove the RAM once you have things working again.  Just don't
re-enable dedupe until you have at least 16 GB of RAM in the box that can
be dedicated to ZFS.  And be sure to add a cache device to the pool.
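
Adding the cache device is a one-liner once you have an SSD to spare; pool
and device names here are placeholders:

# zpool add tank cache ada6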

I just went through something similar with an 8 GB ZFS box (RAM is on
order, but the purchasing dept ordered from the wrong supplier so we're
stuck waiting for it to arrive) where I tried to destroy a dedupe'd
filesystem.  Exact same results as you.  I stole RAM out of a different
server temporarily to get things working on this box again.


 # sysctl hw.physmem
 hw.physmem: 6363394048

 # sysctl vfs.zfs.arc_max
 vfs.zfs.arc_max: 5045088256

 (I lowered arc_max to 1GB but it hasn't helped)


DO NOT LOWER THE ARC WHEN DEDUPE IS ENABLED!!

-- 
Freddie Cash
fjwc...@gmail.com


Re: [zfs-discuss] ZFS not starting

2011-12-01 Thread Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D.

FYI
http://www.oracle.com/technetwork/articles/servers-storage-admin/o11-113-size-zfs-dedup-1354231.html
never too late :-(
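
The short version: you can estimate the DDT footprint with zdb before
committing to dedup (pool name below is a placeholder).  zdb -S simulates
dedup on the existing data and prints a histogram; zdb -DD reports the
actual table on a pool that already has dedup:

# zdb -S tank
# zdb -DD tank

then figure very roughly 320 bytes of RAM per unique block.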


On 12/1/2011 5:19 PM, Freddie Cash wrote:


[quoted message snipped; see Freddie Cash's full post above]


--
Hung-Sheng Tsao, Ph.D.
Founder & Principal
HopBit GridComputing LLC
cell: 9734950840
http://laotsao.wordpress.com/
http://laotsao.blogspot.com/
