> -----Original Message-----
> From: drbd-user-boun...@lists.linbit.com [mailto:drbd-user-
> boun...@lists.linbit.com] On Behalf Of robert.koe...@knapp.com
> Sent: Wednesday, January 22, 2014 3:52 AM
> To: drbd-user@lists.linbit.com
> Subject: [DRBD-user] Antwort: Re: proto c - corrupt files - directories
> missing
> 
> Hi!
> Yes, that should do the trick. However, to be on the safe side and also
> check whether the culprit might be the RAID controller underneath, it would
> make sense to trigger a full resync by disconnecting on the Secondary
> (drbdadm disconnect resourcename), invalidating (drbdadm invalidate
> resourcename) on the Secondary and then reconnecting (drbdadm connect
> resourcename). After the ensuing resync is finished, run another verify. If
> you get OOS blocks again, chances are you are writing nonsense to the disk.
> IIRC DRBD checks the integrity of the transmission when data integrity
> checking is active, but not the actual blocks on disk. There it relies on
> the underlying layers of the storage subsystem to actually write the data
> as it was transmitted. A cronjob running a verify every once in a while and
> another one checking for any OOS blocks (parsing /proc/drbd) and triggering

A verify job can trigger an email (or some other action) when it finishes if
OOS blocks are found. I have a weekly cron job running here (r0 on Tuesday,
r1 on Wednesday, etc.) and get about 1 or 2 emails a year (crappy disks). When
that happens I disconnect, replace the failing disk, create metadata and
reconnect manually.
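
For the "parsing /proc/drbd" part, something along these lines might work as
a starting point. It is only a rough sketch, assuming the DRBD 8.x layout of
/proc/drbd, where each device starts with a line like " 0: cs:Connected ..."
and the following statistics line ends in an "oos:" counter (in KiB). The
function names (parse_oos, devices_out_of_sync, report) are made up for this
example and are not part of any DRBD tool.

```python
import re

# Start of a per-device block, e.g. " 0: cs:Connected ro:Primary/Secondary ..."
DEVICE_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")
# Out-of-sync counter (KiB) on the statistics line, e.g. "... wo:f oos:0"
OOS_RE = re.compile(r"\boos:(\d+)")


def parse_oos(proc_drbd_text):
    """Return {minor_number: oos_kib} for every device found in the text."""
    result = {}
    minor = None
    for line in proc_drbd_text.splitlines():
        dev = DEVICE_RE.match(line)
        if dev:
            # Remember which device the following statistics line belongs to.
            minor = int(dev.group(1))
            continue
        oos = OOS_RE.search(line)
        if oos and minor is not None:
            result[minor] = int(oos.group(1))
    return result


def devices_out_of_sync(proc_drbd_text):
    """Return only the devices whose oos counter is greater than zero."""
    return {m: kib for m, kib in parse_oos(proc_drbd_text).items() if kib > 0}


def report(path="/proc/drbd"):
    """Print one line per out-of-sync device; return 1 if any, else 0."""
    with open(path) as f:
        bad = devices_out_of_sync(f.read())
    for minor, kib in sorted(bad.items()):
        print(f"drbd{minor}: {kib} KiB out of sync")
    return 1 if bad else 0
```

If report() is called from a cron job, the script stays silent while
everything is in sync, and cron normally mails whatever it prints, so a
message only arrives when OOS blocks are present.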

Dan


> a disconnect/reconnect if any are found, might be a good idea. Maybe
> someone from LINBIT can comment on this and come up with confirmation or a
> better solution. Ideally OOS blocks would be fixed automatically with the
> option to disable this function using the config.
> Best Regards
> 
> Robert Köppl
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user
