> -----Original Message-----
> From: Bruno Prior [mailto:[EMAIL PROTECTED]]
> Sent: Monday, November 22, 1999 8:30 AM
> To: Linux-Raid; Michel Pelletier
> Subject: RE: errors on boot
> 
> 
> Michel,
> 
> Thanks for that. It makes things much clearer.
> 
> > Nov 16 13:36:28 korak kernel: sdb4's event counter: 00000016
> > Nov 16 13:36:28 korak kernel: sda4's event counter: 00000017
> > Nov 16 13:36:28 korak kernel: md: superblock update time inconsistency -- using the most recent one
> > Nov 16 13:36:28 korak kernel: freshest: sda4
> > Nov 16 13:36:28 korak kernel: md2: kicking faulty sdb4!
> 
> This is the crucial part. I think it speaks for itself. The mirrors are out
> of sync, so the RAID code assumes that there is a problem and kicks the
> partition that was updated less recently (sdb4) out of the array. Did you
> have an unclean shutdown, or was there a problem with the RAID before the
> last shutdown such that sdb4 was kicked out of the array? Something like
> this must have happened.
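
To make the kernel's choice above concrete, here is a minimal Python sketch
that reads the "event counter" lines from a log excerpt and reports which
member is freshest and which ones are stale. The function name
stale_members is hypothetical, and the sketch assumes the counters are
printed in hexadecimal; it only mirrors what the log says and is not the md
driver's actual logic.

```python
import re

# Matches lines such as: "kernel: sdb4's event counter: 00000016"
COUNTER_RE = re.compile(r"(\w+)'s event counter: ([0-9a-fA-F]+)")


def stale_members(log_lines):
    """Return (freshest_device, sorted list of devices with lower counters)."""
    counters = {}
    for line in log_lines:
        m = COUNTER_RE.search(line)
        if m:
            # Assumption: the counter is printed as a hexadecimal number.
            counters[m.group(1)] = int(m.group(2), 16)
    if not counters:
        return None, []
    freshest = max(counters, key=counters.get)
    stale = sorted(d for d, c in counters.items() if c < counters[freshest])
    return freshest, stale


if __name__ == "__main__":
    log = [
        "Nov 16 13:36:28 korak kernel: sdb4's event counter: 00000016",
        "Nov 16 13:36:28 korak kernel: sda4's event counter: 00000017",
    ]
    print(stale_members(log))
```

Fed the two log lines quoted above, this reports sda4 as freshest and sdb4
as stale, which matches the "kicking faulty sdb4" message.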

Yes, before shutdown we were getting errors on the console about
'writing beyond the end of a device'.  This is why I rebooted the
machine.
 
> Anyway, the solution is very simple. Just do "raidhotadd /dev/md2 /dev/sdb4".
> You don't need to "raidhotremove /dev/md2 /dev/sdb4", because it has already
> been kicked out of the array at startup. This will add /dev/sdb4 back into
> the array. The RAID code should start resyncing it automatically once it has
> been added back in. Have a look in /proc/mdstat and you should see how the
> resyncing process is going. Make sure you don't shut down before resyncing
> has completed, or you will be back to square one. But you can use the array
> quite happily while it is resyncing.
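
For anyone who would rather not eyeball /proc/mdstat by hand, the following
is a rough Python sketch that pulls the resync percentage out of the mdstat
text. The layout of /proc/mdstat differs between kernel and RAID patch
versions, so the pattern here ("resync=NN%" or "recovery = NN.N%") is an
assumption, and the function name resync_progress is made up for this
example.

```python
import re

# Assumed progress format: "resync=45%" or "recovery = 12.6%".
PROGRESS_RE = re.compile(r"(resync|recovery)\s*=\s*([\d.]+)%")


def resync_progress(mdstat_text):
    """Return the first resync/recovery percentage found, or None if idle."""
    m = PROGRESS_RE.search(mdstat_text)
    return float(m.group(2)) if m else None


if __name__ == "__main__":
    try:
        with open("/proc/mdstat") as f:
            progress = resync_progress(f.read())
    except OSError:
        progress = None  # no md driver loaded, or not a Linux box
    if progress is None:
        print("no resync in progress (or format not recognised)")
    else:
        print("resync at %.1f%%" % progress)
```

A result of None only means that no progress line was recognised; it is
worth checking the raw file once to confirm that the pattern fits your
kernel before relying on it.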

You the man!  Worked like a charm.
 
> It would be a good idea to try to figure out why the mirrors were out of
> sync, in case this reveals a problem. If it was an unclean shutdown, then
> there's no problem (apart from making sure you don't do it again). But if
> sdb4 had been kicked out of the array, you need to know why to make sure it
> doesn't happen again. If this was the case, you will need to check back
> through your syslog to try to spot when it happened and what the reasons
> were. Or if you can't be bothered to do this, at least keep an eye on
> /proc/mdstat from now on (maybe using one of the monitoring scripts which are
> mentioned on this list from time to time), to make sure that you know if it
> happens again.
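
As an illustration of what such a monitoring script might do (this is not
one of the scripts mentioned on the list), here is a small Python sketch
that scans mdstat text for arrays whose status field contains an
underscore, e.g. "[U_]", which usually marks a missing member. The function
name degraded_arrays and the sample text are hypothetical, and the status
format is an assumption that should be checked against your own
/proc/mdstat.

```python
import re

ARRAY_RE = re.compile(r"^(md\d+)\s*:")   # start of an array entry
STATUS_RE = re.compile(r"\[([U_]+)\]")   # e.g. [UU] or [U_]


def degraded_arrays(mdstat_text):
    """Return names of arrays whose status field shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        m = ARRAY_RE.match(line)
        if m:
            current = m.group(1)
        s = STATUS_RE.search(line)
        # An underscore in the status field is taken to mean a missing disk.
        if s and current and "_" in s.group(1) and current not in degraded:
            degraded.append(current)
    return degraded


if __name__ == "__main__":
    # Hypothetical sample; real output varies between kernel versions.
    sample = (
        "Personalities : [raid1]\n"
        "md2 : active raid1 sda4[0]\n"
        "      1024000 blocks [2/1] [U_]\n"
        "md1 : active raid1 sdb1[1] sda1[0]\n"
        "      512000 blocks [2/2] [UU]\n"
    )
    print(degraded_arrays(sample))
```

Run from cron against the real /proc/mdstat, with the result mailed to root
whenever the list is non-empty, something like this would at least flag a
kicked partition before the next reboot.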

Well, I suspect it has to do with the 'writing beyond the end of a device'
error, which sounds like something I could have no control over.
However, there are several people here who install a bunch of
software and generally muck about with the machine, and I'll restrain
them from now on.  I'll tool through dmesg and the syslogs to see if I can
glean more than that. Thanks a lot for your help!

-Michel
