Ok, since my previous thread didn't seem to attract much attention,
let me try again.
An interrupted RAID5 reshape will cause the md device in question to
contain one corrupt chunk per stripe if resumed in the wrong manner.
A testcase can be found at http://www.nagilum.de/md/ .
The first testcase can be initialized with "start.sh"; the real test
can then be run with "test.sh". The first testcase also uses dm-crypt
and xfs to show the corruption.
The second testcase uses nothing but mdadm and "testpat", a small
program that writes and verifies a simple test pattern designed to find
block data corruption. Use "v2_start.sh && v2_test.sh" to run it.
At the end it will point out all the wrong bytes on the md device.
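For readers who do not want to fetch the testcase, the idea behind such a pattern tool can be sketched roughly as follows. This is not the source of "testpat"; the block size, the pattern layout and all function names here are assumptions made for illustration. Each block is filled with 64-bit words that encode their own byte offset, so a block that is wrong or that ended up in the wrong place fails verification.

```python
import io
import struct

BLOCK = 512  # bytes per pattern block (assumed value, not taken from testpat)


def make_block(index):
    """Build one block of 64-bit little-endian words encoding their own byte offset."""
    base = index * BLOCK
    return b"".join(struct.pack("<Q", base + off) for off in range(0, BLOCK, 8))


def write_pattern(dev, nblocks):
    """Write the pattern to a seekable binary file object, starting at offset 0."""
    dev.seek(0)
    for i in range(nblocks):
        dev.write(make_block(i))


def verify_pattern(dev, nblocks):
    """Return the indices of all blocks whose content differs from the pattern."""
    bad = []
    dev.seek(0)
    for i in range(nblocks):
        if dev.read(BLOCK) != make_block(i):
            bad.append(i)
    return bad


if __name__ == "__main__":
    # Demonstration on an in-memory buffer instead of a real md device.
    buf = io.BytesIO()
    write_pattern(buf, 16)
    buf.seek(3 * BLOCK + 10)
    buf.write(b"\xff")  # simulate a single corrupted byte in block 3
    print(verify_pattern(buf, 16))
```

On a real array the file object would be the md device opened in binary mode, and the bad block indices could then be grouped by chunk to see which chunk of each stripe is affected.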
I'm not just interested in a simple behaviour fix; I'm also interested
in what actually happens and, if possible, a repair program for that
kind of data corruption.
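As a starting point for thinking about repair, here is a minimal sketch of how a single chunk in a RAID5 stripe can be recomputed by XOR from the remaining chunks. It assumes that the position of the bad chunk is known and that every other chunk in the stripe, including parity, is intact. Whether that holds for an array whose parity has since been resynced is not something this sketch can answer. The function names are hypothetical and nothing here reads from a real md device.

```python
from functools import reduce


def xor_chunks(chunks):
    """XOR a list of equally sized byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild_chunk(stripe, bad_index):
    """Recompute the chunk at bad_index from all other chunks of the stripe.

    'stripe' holds the data chunks plus the parity chunk, in any order,
    because RAID5 parity is the XOR of all data chunks.
    """
    others = [chunk for i, chunk in enumerate(stripe) if i != bad_index]
    return xor_chunks(others)


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC"]
    stripe = data + [xor_chunks(data)]  # last entry is the parity chunk
    stripe[1] = b"\x00" * 4             # simulate one corrupt chunk
    print(rebuild_chunk(stripe, 1))
```

A real repair tool would additionally need the chunk size and the on-disk layout of the array to locate each stripe's chunks on the member devices.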
The bug is architecture-agnostic. I first came across it using 2.6.23.8 on amd64, but I verified it on 2.6.23.[8-12] and 2.6.24-rc[5,6] on ppc, always using mdadm 2.6.4.
The situation in which the bug first showed up was as follows:
1. A RAID5 reshape from 5 to 6 devices was started.
2. After about 4%, one disk failed; the machine appeared unresponsive and was rebooted.
3. A spare disk was added to the array.
4. The bad drive was re-added to the array in a different bay and the reshape resumed.
5. The drive failed again but the reshape continued.
6. The reshape finished, followed by the resync. The data from about 4% onward on the md device is broken as described above.

Kind regards,
Alex.