I have just had an accident where random memory corruption (via DMA) was
scattered all over the place.  Fortunately, the system crashed before too much
damage had been done.  But there still was some damage to the ZFS pool.
Initially the pool was even in the faulted state, but zpool clear -F helped to
restore it to a more or less usable state.
I am now trying to assess the remaining damage.  zpool scrub has found some
errors.  Some files were irreparable, but I was able to restore them.  I realize
that not all errors can be found this way, so I am also verifying checksums for
all files for which I have them recorded (it's looking good so far).

There is one interesting problem.  A non-debug kernel is able to work with the
pool without problems.  But a debug kernel hits this assert:
        ASSERT(!claimed || !(zh->zh_flags & ZIL_CLAIM_LR_SEQ_VALID) ||
            (max_blk_seq == claim_blk_seq && max_lr_seq == claim_lr_seq));

(kgdb) p *zilog->zl_header
$2 = {
  zh_claim_txg = 11425925,
  zh_replay_seq = 0,
  zh_log = {
    blk_dva = {{
        dva_word = {0, 0}
      }, {
        dva_word = {0, 0}
      }, {
        dva_word = {0, 0}
      }},
    blk_prop = 0,
    blk_pad = {0, 0},
    blk_phys_birth = 0,
    blk_birth = 0,
    blk_fill = 0,
    blk_cksum = {
      zc_word = {0, 0, 0, 0}
    }
  },
  zh_claim_blk_seq = 1,
  zh_flags = 2,
  zh_claim_lr_seq = 0,
  zh_pad = {0, 0, 0}
}

So, zh_log is a hole (all of its DVAs are zero), but zh_flags has
ZIL_CLAIM_LR_SEQ_VALID set and zh_claim_txg and zh_claim_blk_seq are non-zero.
The txg number is reasonable.
This does not look like random corruption.  Maybe the values are incorrect,
but they look sane.
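
To make the mismatch concrete, here is a small standalone sketch of how I read
the asserted condition against the header above.  It is not the ZFS code: the
struct and function names are hypothetical, the flag value 0x2 is taken from
zh_flags in the dump, and I am assuming that with zh_log being a hole no log
blocks get parsed, so max_blk_seq and max_lr_seq both stay at 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag value; matches zh_flags = 2 in the kgdb dump. */
#define ZIL_CLAIM_LR_SEQ_VALID 0x2ULL

/* Hypothetical reduced model of the zil_header_t fields involved. */
struct zil_header_model {
	uint64_t claim_txg;     /* zh_claim_txg */
	uint64_t claim_blk_seq; /* zh_claim_blk_seq */
	uint64_t claim_lr_seq;  /* zh_claim_lr_seq */
	uint64_t flags;         /* zh_flags */
};

/*
 * Evaluate the same boolean expression as the ASSERT, given the highest
 * block and record sequence numbers the parser reached.
 */
static bool
claim_assert_holds(const struct zil_header_model *zh,
    uint64_t max_blk_seq, uint64_t max_lr_seq)
{
	bool claimed = (zh->claim_txg != 0);

	return (!claimed || !(zh->flags & ZIL_CLAIM_LR_SEQ_VALID) ||
	    (max_blk_seq == zh->claim_blk_seq &&
	    max_lr_seq == zh->claim_lr_seq));
}

/* Values copied from the kgdb dump. */
static const struct zil_header_model dumped_header = {
	.claim_txg = 11425925,
	.claim_blk_seq = 1,
	.claim_lr_seq = 0,
	.flags = 2,
};
```

Under that assumption, claim_assert_holds(&dumped_header, 0, 0) is false: the
header is claimed and the flag is set, but 0 != zh_claim_blk_seq (1).  The
condition would hold if one block had been parsed, or if the flag were clear.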

This ASSERT is triggered for several different datasets.
What all of them have in common is that they were "dormant".  They were not set
to readonly, but there should not have been any writes to them for a long time.

So, I am curious whether anything interesting can be deduced from this
information.  Perhaps it could be a consequence of the rewind done by
zpool clear -F...

-- 
Andriy Gapon

------------------------------------------
openzfs-developer
Archives: 
https://openzfs.topicbox.com/groups/developer/discussions/Te134c02e590bc7f4-M70c719cf5262cce6c681124c