Re: [OmniOS-discuss] r151014 users - beware of illumos 6214 - steps to check and repair...

2015-09-14 Thread Paul B. Henson
> From: Guenther Alka
> Sent: Monday, September 14, 2015 9:21 AM
> 
> 1. what is the recommended way to detect possible problems
>a. run scrub? seems useless

I don't think it is necessarily useless, it might detect a problem. However,
from what I understand there might be a problem it doesn't detect. So it can
be considered verification there is a problem, but not verification that
there isn't.

>b. run zdb pool and check for what

I ran a basic zdb and also a 'zdb -bbccsv'. The former seems to core dump
while parsing the history, but the latter ran successfully with no issues.
If I understood George correctly, 'zdb -bbccsv' should be fairly reliable
at finding metadata corruption, as it traverses all of the blocks.
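
For anyone who wants to script that check, here is a minimal sketch. The function name check_pool_metadata and the log path are made up for illustration, and grepping the output for "assert" is only a rough stand-in for Dan's advice to look for assertion failures or non-zero exits. A clean run still does not prove the pool is undamaged.

```sh
#!/bin/sh
# Sketch only: check_pool_metadata is a hypothetical helper, not an OmniOS tool.
# It runs the 'zdb -bbccsv' traversal from this thread and reports on the result.
check_pool_metadata() {
    pool=$1
    # The log location is an assumption; pass a second argument to override it.
    log=${2:-/tmp/zdb-$pool.log}

    # Keep all zdb output so it can be inspected afterwards.
    zdb -bbccsv "$pool" >"$log" 2>&1
    status=$?

    # Any non-zero exit (including a core dump) is treated as suspicious.
    if [ "$status" -ne 0 ]; then
        echo "zdb exited with status $status for pool $pool; see $log"
        return 1
    fi

    # Crude text match for assertion messages in the saved output.
    if grep -i 'assert' "$log" >/dev/null; then
        echo "zdb output for pool $pool mentions an assertion; see $log"
        return 1
    fi

    echo "zdb found nothing for pool $pool (this does not prove the pool is clean)"
    return 0
}
```

Usage would be something like 'check_pool_metadata data', run as root. On a large pool the traversal can take a long time.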

> 2. when using an L2Arc and there is no obvious error detected by scrub
> or zdb
>a. trash the pool and restore from backup  via rsync with possible
> file corruptions but ZFS structure is 100% ok then
>b. keep the pool and hope that there is no metadata corruption?
>c. some action to verify that at least the pool is ok: 

Hmm, at this point given a successful scrub and successful zdb runs I'm
going to keep my fingers crossed that I have no corruption. I was only
running the buggy code for about a month, without a particularly high
load, so hopefully I got lucky.

> 3. when using an L2Arc and there is an error detected by scrub or zdb
[...]
>b. keep the pool and hope that there is no metadata corruption

If the scrub or zdb detects errors, it is possible your box might panic at
some point, or be unable to import the pool after a reboot. So in that case,
I don't think just keeping it is advisable :). I'm not sure if there is any
way to fix it or if the best case is to try to restore it or temporarily
transfer the data elsewhere, re-create it, and put it back.

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] r151014 users - beware of illumos 6214 - steps to check and repair...

2015-09-14 Thread Guenther Alka

What is the recommended action on OmniOS 151014 regarding the L2ARC problem?

1. what is the recommended way to detect possible problems
  a. run scrub? seems useless
  b. run zdb pool and check for what
 You said:  Look for assertion failures, or other non-0 exits. Is 
this the key for a corrupt pool?


2. when using an L2Arc and there is no obvious error detected by scrub 
or zdb
  a. trash the pool and restore from backup  via rsync with possible 
file corruptions but ZFS structure is 100% ok then

  b. keep the pool and hope that there is no metadata corruption?
  c. some action to verify that at least the pool is ok: 

3. when using an L2Arc and there is an error detected by scrub or zdb
  a. trash the pool and restore from backup with possible file 
corruption but pool is 100% ok

  b. keep the pool and hope that there is no metadata corruption
  c. some action to verify that at least the pool is ok: 

Is there an alert page on the OmniOS wiki about this?

Gea


On 10.09.2015 at 13:53, Dan McDonald wrote:

If you are using a zpool with r151014 and you have an L2ARC ("cache") vdev, I 
recommend at this time disabling it.  You may disable it by uttering:

zpool remove <pool> <cache-device>

For example:

zpool remove data c2t2d0
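
If you are not sure which devices in a pool are cache vdevs, the rough sketch below pulls them out of 'zpool status' output. The helper name list_cache_devices is made up, and it assumes the usual layout where cache devices are listed under a line containing only the word "cache". Check the result against the real 'zpool status' output before removing anything.

```sh
#!/bin/sh
# Sketch only: list_cache_devices is a hypothetical helper.
# It reads 'zpool status' output on stdin and prints the device names
# found under the "cache" heading, one per line.
list_cache_devices() {
    awk '
        # A line holding only the word "cache" starts the cache section.
        $1 == "cache" && NF == 1 { in_cache = 1; next }
        # A blank line or another one-word heading (logs, spares) ends it.
        in_cache && NF <= 1 { in_cache = 0 }
        # Within the section, the first column is the device name.
        in_cache { print $1 }
    '
}
```

Usage would be 'zpool status data | list_cache_devices', and each device it prints can then be passed to 'zpool remove' as in the example above.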

The bug in question has a good analysis here:

https://www.illumos.org/issues/6214

This bug can lead to problems ranging from false-positives on zpool scrub all 
the way up to actual pool corruption.

We will be updating the package repo AND the install media once 6214 is 
upstreamed to illumos-gate, and pulled back into the r151014 branch of 
illumos-omnios.  The fix is undergoing some tests from ZFS experts right now to 
verify its correctness.

So please disable your L2ARC/cache devices for maximum data safety.  You can 
add them back after we update r151014 by uttering:

zpool add <pool> cache <cache-device>

PLEASE NOTE the "cache" indicator when you add back.  If you omit this, the 
vdev is ADDED to your pool, an operation one can't reverse.

zpool add data cache c2t2d0
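
One way to guard against forgetting the keyword is to preview the change first. The sketch below uses a made-up helper, readd_cache_cmd, which only prints a command line that always includes "cache" and uses the -n (dry run) option of 'zpool add'. It does not touch the pool.

```sh
#!/bin/sh
# Sketch only: readd_cache_cmd is a hypothetical helper.
# It prints a 'zpool add' dry-run command with the "cache" keyword
# always in place, so the keyword cannot be left out by accident.
readd_cache_cmd() {
    if [ "$#" -lt 2 ]; then
        echo "usage: readd_cache_cmd <pool> <device>..." >&2
        return 2
    fi
    pool=$1
    shift
    # -n asks zpool to display the resulting layout without changing the pool.
    echo "zpool add -n $pool cache $*"
}
```

For example, 'readd_cache_cmd data c2t2d0' prints 'zpool add -n data cache c2t2d0'. If running that command shows the device under "cache", the same command without -n performs the actual add.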

Thanks,
Dan
