Re: [zfs-discuss] Narrow escape with FAULTED disks

2010-08-23 Thread Mark Bennett
Well, I do have a plan.

Thanks to the portability of ZFS boot disks, I'll make two new OS disks on 
another machine with the next Nexenta release, export the data pool and swap 
in the new ones.

That way, I can at least manage a zfs scrub without killing performance, and 
get the Intel SSDs I have been testing to work properly.

On the other hand, I could just use the spare 7210 Appliance boot disk I have 
lying about.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] deduplication: l2arc size

2010-08-23 Thread Frank Van Damme
Hi,

this has already been the source of a lot of interesting discussion, but so
far I haven't found a definitive conclusion. From some discussion on
this list in February, I learned that an entry in ZFS' deduplication
table takes (in practice) about half a KiB of memory. At the moment my data
looks like this (output of zdb -D)...


DDT-sha256-zap-duplicate: 3299796 entries, size 350 on disk, 163 in core
DDT-sha256-zap-unique: 9727611 entries, size 333 on disk, 151 in core

dedup = 1.73, compress = 1.20, copies = 1.00, dedup * compress / copies
= 2.07

So that means the DDT contains a total of 13,027,407 entries, which at ~512
bytes each makes it 6,670,032,384 bytes (about 6.2 GiB). Suppose our data
grows by a factor of 12: the DDT would then take about 80 GB. So it would be
best to buy a 128 GB SSD as L2ARC. Correct?
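
Spelled out as a quick sanity check (a rough Python sketch, using the ~512
bytes per DDT entry rule of thumb from the February thread; the in-core sizes
zdb reports above are actually smaller, so this should be an upper bound):

# Back-of-the-envelope DDT / L2ARC sizing from the zdb -D figures above.
# Assumption: ~512 bytes of (L2)ARC per DDT entry (rule of thumb, not a zdb value).
BYTES_PER_ENTRY = 512
GROWTH_FACTOR = 12

duplicate_entries = 3299796
unique_entries = 9727611
total_entries = duplicate_entries + unique_entries        # 13,027,407

ddt_bytes_now = total_entries * BYTES_PER_ENTRY           # 6,670,032,384 bytes
ddt_bytes_later = ddt_bytes_now * GROWTH_FACTOR

print("DDT size now:       %.1f GB" % (ddt_bytes_now / 1e9))    # ~6.7 GB
print("DDT size after 12x: %.1f GB" % (ddt_bytes_later / 1e9))  # ~80 GB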


Thanks for enlightening me,


-- 
Frank Van Damme
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS offline ZIL corruption not detected

2010-08-23 Thread StorageConcepts
Hello, 

we are currently extensively testing the DDRX1 drive for ZIL and we are going 
through all the corner cases. 

The headline question above all our tests is: do we still need to mirror the ZIL 
with all the current fixes in ZFS (ZFS can recover from a ZIL failure as long as 
you don't export the pool, and with the latest upstream you can also import a 
pool with a missing ZIL)? This question is especially interesting with RAM-based 
devices, because they don't wear out, have a very low bit error rate and use one 
PCIx slot, and those are rare. Price is another aspect here :)

During our tests we found a strange behaviour of ZFS ZIL failures which is not 
device related, and we are looking for help from the ZFS gurus here :) 

The test in question is called offline ZIL corruption. The question is: what 
happens if my ZIL data is corrupted while a server is transported or moved and 
not properly shut down? For this we do: 

- Prepare 2 OS installations (ProductOS and CorruptOS)
- Boot ProductOS and create a pool and add the ZIL 
- ProductOS: Issue synchronous I/O with an increasing transaction number (and 
print the latest committed transaction; a minimal sketch of such a workload 
generator follows these steps)
- ProductOS: Power off the server and record the last committed transaction
- Boot CorruptOS
- Write random data to the beginning of the ZIL (dd if=/dev/urandom of=ZIL, 
~300 MB from the start of the disk, overwriting the first two disk labels)
- Boot ProductOS
- Verify that the data corruption is detected by checking the file with the 
transaction number against the one recorded
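
A minimal sketch of such a workload generator (the path and output format below
are placeholders for illustration, not our actual test tool):

import os

PATH = "/testpool/txn_counter"   # placeholder file on the pool under test

# Every write is fsync()ed, so once a number has been printed we know it was
# committed (and therefore passed through the ZIL) before the power was cut.
fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
txn = 0
try:
    while True:
        txn += 1
        os.pwrite(fd, ("%d\n" % txn).encode(), 0)  # overwrite with the current number
        os.fsync(fd)                               # synchronous commit
        print("committed txn %d" % txn)
finally:
    os.close(fd)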

We ran the test, and it seems that with modern snv_134 the pool comes up after 
the corruption with everything apparently OK, while ~1 transactions (this is 
some seconds of writes with the DDRX1) are missing and nobody knows about it. 
We ran a scrub, and the scrub does not even detect this. ZFS automatically 
repairs the labels on the ZIL, but no error is reported about the missing data.

While it is clear to us that if we do not have a mirrored ZIL, the data we have 
overwritten in the ZIL is lost, we are really wondering why ZFS does not REPORT 
this corruption instead of silently ignoring it.

Is this a bug or ... ahem ... a feature :) ?

Regards, 
Robert
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS offline ZIL corruption not detected

2010-08-23 Thread Neil Perrin

This is a consequence of the performance-oriented design of the ZIL code.
Intent log blocks are dynamically allocated and chained together.
When reading the intent log we read each block and checksum it
with the embedded checksum within the same block. If we can't read
a block due to an IO error then that is reported, but if the checksum does
not match then we assume it's the end of the intent log chain.
Using this design, the minimum number of writes needed to add
an intent log record is just one write.

So corruption of an intent log is not going to generate any errors.
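
To illustrate the control flow (a toy model only, not the actual ZFS code or
on-disk format): each log block carries an embedded checksum over its own
contents and a pointer to the next block, so a mismatch simply looks like the
end of the log.

import zlib

def cksum(payload):
    return zlib.crc32(payload)

def make_chain(records):
    # Each block: payload, pointer to the next block, embedded self-checksum.
    return {i: {"payload": r,
                "next": i + 1 if i + 1 < len(records) else None,
                "cksum": cksum(r)}
            for i, r in enumerate(records)}

def replay(blocks, head=0):
    # Walk the chain; a checksum mismatch is treated as end-of-log,
    # so nothing after the first bad block is replayed and no error is raised.
    out, cur = [], head
    while cur is not None:
        blk = blocks[cur]
        if cksum(blk["payload"]) != blk["cksum"]:
            break
        out.append(blk["payload"])
        cur = blk["next"]
    return out

log = make_chain([b"create fileA", b"write fileA #1", b"write fileA #2"])
log[0]["payload"] = b"random garbage"   # offline corruption of the first block
print(replay(log))                      # [] -- later records are silently dropped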

Neil.

On 08/23/10 10:41, StorageConcepts wrote:
Hello, 

we are currently extensively testing the DDRX1 drive for ZIL and we are going through all the corner cases. 


The headline question above all our tests is: do we still need to mirror the ZIL with all 
the current fixes in ZFS (ZFS can recover from a ZIL failure as long as you don't export the pool, 
and with the latest upstream you can also import a pool with a missing ZIL)? This question is 
especially interesting with RAM-based devices, because they don't wear out, have a very 
low bit error rate and use one PCIx slot, and those are rare. Price is another aspect here :)

During our tests we found a strange behaviour of ZFS ZIL failures which is not device related, and we are looking for help from the ZFS gurus here :) 

The test in question is called offline ZIL corruption. The question is: what happens if my ZIL data is corrupted while a server is transported or moved and not properly shut down? For this we do: 


- Prepare 2 OS installations (ProductOS and CorruptOS)
- Boot ProductOS and create a pool and add the ZIL 
- ProductOS: Issue synchronous I/O with an increasing transaction number (and print the latest committed transaction)

- ProductOS: Power off the server and record the last committed transaction
- Boot CorruptOS
- Write random data to the beginning of the ZIL (dd if=/dev/urandom of=ZIL, 
~300 MB from the start of the disk, overwriting the first two disk labels)
- Boot ProductOS
- Verify that the data corruption is detected by checking the file with the 
transaction number against the one recorded

We ran the test, and it seems that with modern snv_134 the pool comes up after the 
corruption with everything apparently OK, while ~1 transactions (this is some seconds 
of writes with the DDRX1) are missing and nobody knows about it. We ran a scrub, 
and the scrub does not even detect this. ZFS automatically repairs the labels on 
the ZIL, but no error is reported about the missing data.

While it is clear to us that if we do not have a mirrored ZIL, the data we have 
overwritten in the ZIL is lost, we are really wondering why ZFS does not REPORT 
this corruption instead of silently ignoring it.

Is this a bug or ... ahem ... a feature :) ?

Regards, 
Robert
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS offline ZIL corruption not detected

2010-08-23 Thread Neil Perrin

On 08/23/10 13:12, Markus Keil wrote:

Does that mean that when the beginning of the intent log chain gets corrupted, all
other intent log data after the corrupted area is lost, because the checksum of
the first corrupted block doesn't match? 
  


- Yes, but you wouldn't want to replay the following entries in case the log 
records in the missing log block were important (e.g. a file create).

Mirroring the slog is recommended to minimise concerns about slog corruption.



 
Regards,

Markus

Neil Perrin neil.per...@oracle.com wrote on 23 August 2010 at 19:44:

  

This is a consequence of the performance-oriented design of the ZIL code.
Intent log blocks are dynamically allocated and chained together.
When reading the intent log we read each block and checksum it
with the embedded checksum within the same block. If we can't read
a block due to an IO error then that is reported, but if the checksum does
not match then we assume it's the end of the intent log chain.
Using this design, the minimum number of writes needed to add
an intent log record is just one write.

So corruption of an intent log is not going to generate any errors.

Neil.

On 08/23/10 10:41, StorageConcepts wrote:


Hello,

we are currently extensively testing the DDRX1 drive for ZIL and we are going
through all the corner cases.

The headline question above all our tests is: do we still need to mirror the ZIL with
all the current fixes in ZFS (ZFS can recover from a ZIL failure as long as you don't
export the pool, and with the latest upstream you can also import a pool with a
missing ZIL)? This question is especially interesting with RAM-based
devices, because they don't wear out, have a very low bit error rate and use
one PCIx slot, and those are rare. Price is another aspect here :)

During our tests we found a strange behaviour of ZFS ZIL failures which is
not device related, and we are looking for help from the ZFS gurus here :)

The test in question is called offline ZIL corruption. The question is:
what happens if my ZIL data is corrupted while a server is transported or
moved and not properly shut down? For this we do:

- Prepare 2 OS installations (ProductOS and CorruptOS)
- Boot ProductOS and create a pool and add the ZIL
- ProductOS: Issue synchronous I/O with an increasing transaction number (and print
the latest committed transaction)
- ProductOS: Power off the server and record the last committed transaction
- Boot CorruptOS
- Write random data to the beginning of the ZIL (dd if=/dev/urandom of=ZIL,
~300 MB from the start of the disk, overwriting the first two disk labels)
- Boot ProductOS
- Verify that the data corruption is detected by checking the file with the
transaction number against the one recorded

We ran the test, and it seems that with modern snv_134 the pool comes up after the
corruption with everything apparently OK, while ~1 transactions (this is some
seconds of writes with the DDRX1) are missing and nobody knows about it. We
ran a scrub, and the scrub does not even detect this. ZFS automatically repairs
the labels on the ZIL, but no error is reported about the missing data.

While it is clear to us that if we do not have a mirrored ZIL, the data we
have overwritten in the ZIL is lost, we are really wondering why ZFS does
not REPORT this corruption instead of silently ignoring it.

Is this a bug or ... ahem ... a feature :) ?

Regards,
Robert
   
  


--
StorageConcepts Europe GmbH
     Storage: Beratung. Realisierung. Support

Markus Keil             k...@storageconcepts.de
                        http://www.storageconcepts.de
Wiener Straße 114-116   Telefon:   +49 (351) 8 76 92-21
01219 Dresden           Telefax:   +49 (351) 8 76 92-99
Handelsregister Dresden, HRB 28281
Geschäftsführer: Robert Heinzmann, Gerd Jelinek
--
Legal notice: The contents of this e-mail and any attachments are confidential
and intended exclusively for use by the recipient, unless this message permits
otherwise in an individual case. The contents of the message may also be subject
to statutory protection rights. Unless disclosure or distribution is made
exclusively for the recipient's internal purposes, any disclosure, distribution
or other copying is prohibited. If you are not the intended recipient of this
message, please inform the sender immediately.
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss