Re: mismatch_cnt questions

2007-03-13 Thread Andre Noll
On 00:21, H. Peter Anvin wrote: I have just updated the paper at: http://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf ... with this information (in slightly different notation and with a bit more detail.) There's a typo in the new section: s/By assumption, X_z != D_n/By

Re: mismatch_cnt questions

2007-03-13 Thread H. Peter Anvin
Andre Noll wrote: On 00:21, H. Peter Anvin wrote: I have just updated the paper at: http://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf ... with this information (in slightly different notation and with a bit more detail.) There's a typo in the new section: s/By assumption, X_z !=

Re: mismatch_cnt questions - how about raid10?

2007-03-11 Thread Neil Brown
On Tuesday March 6, [EMAIL PROTECTED] wrote: I see. So basically for those of us who want to run swap on raid 1 or 10, and at the same time want to rely on mismatch_cnt for early problem detection, the only option is to create a separate md device just for the swap. Is this about

Re: mismatch_cnt questions

2007-03-08 Thread H. Peter Anvin
Bill Davidsen wrote: When last I looked at Hamming code, and that would be 1989 or 1990, I believe that I learned that the number of Hamming bits needed to cover N data bits was 1+log2(N), which for 512 bytes would be 1+12, and fit into a 16 bit field nicely. I don't know that I would go

Re: mismatch_cnt questions

2007-03-08 Thread Bill Davidsen
H. Peter Anvin wrote: Bill Davidsen wrote: When last I looked at Hamming code, and that would be 1989 or 1990, I believe that I learned that the number of Hamming bits needed to cover N data bits was 1+log2(N), which for 512 bytes would be 1+12, and fit into a 16 bit field nicely. I don't
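
[Editorial sketch, not part of the thread.] The 1+log2(N) figure quoted above can be checked against the usual single-error-correcting Hamming bound, 2^r >= m + r + 1, where m is the number of data bits and r the number of check bits. The function name `hamming_parity_bits` is hypothetical and chosen only for this illustration; the sketch assumes a plain Hamming code with no extra overall parity bit.

```python
def hamming_parity_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error-correcting Hamming code)."""
    r = 0
    # Each check bit doubles the number of distinguishable syndromes;
    # we need one syndrome per codeword bit plus one for "no error".
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


# A 512-byte sector holds 512 * 8 = 4096 data bits.
sector_bits = 512 * 8
check_bits = hamming_parity_bits(sector_bits)
```

For a 512-byte sector this gives 13 check bits, which agrees with 1+12 and fits in a 16-bit field.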

Re: mismatch_cnt questions

2007-03-07 Thread H. Peter Anvin
H. Peter Anvin wrote: Eyal Lebedinsky wrote: Neil Brown wrote: [trim Q re how resync fixes data] For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy and writing it over all other copies. For raid5 we assume the data is correct and update the parity. Can raid6 identify the

Re: mismatch_cnt questions - how about raid10?

2007-03-06 Thread Peter Rabbitson
Neil Brown wrote: When we write to a raid1, the data is DMAed from memory out to each device independently, so if the memory changes between the two (or more) DMA operations, you will get inconsistency between the devices. Does this apply to raid 10 devices too? And in case of LVM if swap is

Re: mismatch_cnt questions - how about raid10?

2007-03-06 Thread Neil Brown
On Tuesday March 6, [EMAIL PROTECTED] wrote: Neil Brown wrote: When we write to a raid1, the data is DMAed from memory out to each device independently, so if the memory changes between the two (or more) DMA operations, you will get inconsistency between the devices. Does this apply to

Re: mismatch_cnt questions - how about raid10?

2007-03-06 Thread Peter Rabbitson
Neil Brown wrote: On Tuesday March 6, [EMAIL PROTECTED] wrote: Neil Brown wrote: When we write to a raid1, the data is DMAed from memory out to each device independently, so if the memory changes between the two (or more) DMA operations, you will get inconsistency between the devices. Does

Re: mismatch_cnt questions

2007-03-06 Thread Bill Davidsen
Neil Brown wrote: On Monday March 5, [EMAIL PROTECTED] wrote: Neil Brown wrote: [trim Q re how resync fixes data] For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy and writing it over all other copies. For raid5 we assume the data is correct and update the parity.

Re: mismatch_cnt questions

2007-03-05 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote: Neil Brown wrote: [trim Q re how resync fixes data] For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy and writing it over all other copies. For raid5 we assume the data is correct and update the parity. Can raid6 identify

Re: mismatch_cnt questions

2007-03-05 Thread Paul Davidson
Hi Neil, I've been following this thread with interest and I have a few questions. Neil Brown wrote: On Monday March 5, [EMAIL PROTECTED] wrote: Neil Brown wrote: When a disk fails we know what to rewrite, but when we discover a mismatch we do not have this knowledge. It may corrupt the

Re: mismatch_cnt questions

2007-03-04 Thread Neil Brown
On Sunday March 4, [EMAIL PROTECTED] wrote: Hello, these questions apparently got buried in another thread, so here goes again ... I have a mismatch_cnt of 384 on a 2-way mirror. The box runs 2.6.17.4 and can't really be rebooted or have its kernel updated easily 1) Where does the

Re: mismatch_cnt questions

2007-03-04 Thread Eyal Lebedinsky
Neil Brown wrote: On Sunday March 4, [EMAIL PROTECTED] wrote: I have a mismatch_cnt of 384 on a 2-way mirror. [trim] 3) Is the repair sync action safe to use on the above kernel? Any other methods / additional steps for fixing this? repair is safe, though it may not be effective. repair for

Re: mismatch_cnt questions

2007-03-04 Thread Neil Brown
On Sunday March 4, [EMAIL PROTECTED] wrote: Hey, that was quick ... thanks! 1) Where does the mismatch come from? The box hasn't been down since the creation of the array. Do you have swap on the mirror at all? As a matter of fact I do, /dev/md0_p2 is a swap partition. I

Re: mismatch_cnt questions

2007-03-04 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote: Neil Brown wrote: On Sunday March 4, [EMAIL PROTECTED] wrote: I have a mismatch_cnt of 384 on a 2-way mirror. [trim] 3) Is the repair sync action safe to use on the above kernel? Any other methods / additional steps for fixing this? repair is

Re: mismatch_cnt questions

2007-03-04 Thread Eyal Lebedinsky
Neil Brown wrote: [trim Q re how resync fixes data] For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy and writing it over all other copies. For raid5 we assume the data is correct and update the parity. Can raid6 identify the bad block (two parity blocks could allow this if
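
[Editorial sketch, not part of the thread.] The two repair policies Neil Brown describes can be illustrated in a few lines. This is a toy model under stated assumptions, not the md driver's code: blocks are equal-length `bytes` objects, the "arbitrarily chosen" raid1 copy is taken to be the first one, and the names `repair_raid1` and `repair_raid5` are hypothetical.

```python
from functools import reduce


def repair_raid1(copies: list[bytes]) -> list[bytes]:
    """Pick one copy (here: the first, an arbitrary choice) and overwrite all others with it."""
    chosen = copies[0]
    return [chosen for _ in copies]


def repair_raid5(data_blocks: list[bytes]) -> bytes:
    """Assume the data blocks are correct and recompute parity as their bytewise XOR."""
    def xor_blocks(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))
    return reduce(xor_blocks, data_blocks)


# Two mirror halves that disagree: after repair both hold the first copy.
mirrors = repair_raid1([b"ab", b"ac"])

# One stripe with two data blocks: parity is rewritten from the data.
parity = repair_raid5([b"\x01\x02", b"\x03\x04"])
```

In both cases the repair restores consistency without knowing which block was actually wrong, which is the limitation the raid6 question in this thread is about.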