On Monday, 17 March 2003 at 10:58:28 +0000, Scott Mitchell wrote:
> On Mon, Mar 10, 2003 at 11:15:32PM +0000, Scott Mitchell wrote:
>> Hi all,
>>
>> I wonder if anyone out there can shed any light on this:
>>
>> A drive failed on one of our Vinum-powered RAID-5 arrays over the weekend.
>> This morning, we swapped out the offending drive (hot-swappable SCSI
>> hardware), disklabel-ed it and restarted the offending subdisk.  Everything
>> seemed fine at this point, with vinum happily reviving the stale subdisk.
>>
>> However, twenty minutes later, with the revive 29% complete, I got this in
>> /var/log/messages:
>>
>> Mar 10 11:39:50 kokako vinum[12708]: can't revive raid.p0.s0: Invalid argument
>>
>> 'vinum list' was also showing an error message, which I foolishly didn't
>> capture, something along the lines of 'the revive process died'.  Lacking
>> any better ideas, I started the subdisk again.  The revival seemed to pick
>> up where it left off.
>>
>> Half an hour later, the box rebooted :-(  I wasn't actually watching it at
>> the time, so I don't know if it finished reviving the subdisk or not.
>> There's no indication in the logs as to what happened, but the timing of
>> the reboot is consistent with it happening around the time the subdisk
>> would have come back to life.
>>
>> Once the box came back up, I restarted the subdisk yet again (I had to
>> create the drive again first), with the RAID volume unmounted.  This time
>> the process finished without complaints and things seem to be working as
>> well as ever since then.
> [logs, etc. snipped...]
>
>
> No takers? 

I've been intending to reply, but there's not much I can do based on
the information you've supplied.

> Maybe someone who's done this (replacing a failed Vinum drive on
> hot-swap SCSI hardware) before can at least tell me whether:
>
>       - I should have done some camcontrol magic before rebuilding
>         the drive?

I can't see anything in particular you would need to do, but then I
haven't seen the details.

>       - Rebuilding the drive without unmounting the volume first was
>         just asking for trouble?

There have been reports of this kind of problem, mainly from Vallo
Kallaste, who has also responded.  I haven't seen it myself, and I
haven't heard of panics as a result.  But yes, umounting is a good
precaution.

>       - -hackers or even -stable is a better venue for this kind of problem?

-questions will do fine.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers
