We are currently running DRBD 8.0.16. The Linux kernel version we use prevents
us from upgrading to the latest DRBD release.
With this version, after 70 to 80 failovers, we see that fsck -fy fixes some
inodes on the board that becomes cluster-primary.
These inode changes are not replicated to the standby board.
If we perform a failover at this stage, the standby board that becomes primary
shows file corruption.
File corruption => the content of one file appears in another file.
Can someone tell us which DRBD version fixed this sync issue?
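
The condition described above (fsck repairing the file system on the new
primary without the peer picking up the change) can be detected from the exit
status of fsck, where bit 1 is set when errors were corrected. The following is
a minimal sketch only, not the script described in this mail. The resource name
r0, the device /dev/drbd0, the overridable FSCK and DRBDADM variables, the
helper name fsck_and_resync, and the choice of "drbdadm invalidate-remote" to
force the full sync are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: command names, resource and device are assumptions.
FSCK=${FSCK:-fsck}
DRBDADM=${DRBDADM:-drbdadm}

# Run fsck -fy on a device. If fsck reports that it corrected errors
# (bit 1 of its exit status), ask DRBD to resync the peer in full.
# Usage: fsck_and_resync <resource> <device>
fsck_and_resync() {
    res=$1
    dev=$2
    $FSCK -fy "$dev"
    rc=$?
    if [ $((rc & 1)) -ne 0 ]; then
        echo "fsck modified $dev (exit status $rc), forcing full sync of peer"
        # Assumes the resource is connected to its peer at this point.
        $DRBDADM invalidate-remote "$res"
        return $?
    fi
    # Nothing was corrected: pass the fsck status through unchanged.
    return $rc
}
```

It would be called as "fsck_and_resync r0 /dev/drbd0" on the primary while the
file system is unmounted.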
Solution:
In runlevel 3 we start a script that checks the DRBD state. If DRBD is not in
the expected state, we run recovery. With this change we have completed 500+
failovers without any issue.
Code Excerpts:
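# Decide whether the DRBD metadata has to be recreated.
create_md="NO"
# "drbdadm state" prints the roles as local/peer, e.g. Secondary/Primary.
cur_sta=`$DRBDADM state all`
pri_sec=`echo $cur_sta | awk -F/ '{print $1}'`
peer_st=`echo $cur_sta | awk -F/ '{print $2}'`
cstate=`$DRBDADM cstate all`
# Case 1: we are Secondary, the peer is Primary, and the connection
# state is WFBitMapT.
if [ "$pri_sec" = "Secondary" ] && [ "$peer_st" = "Primary" ] &&
   [ "$cstate" = "WFBitMapT" ]; then
    create_md="YES"
fi
# Case 2: we are Secondary, the peer is Unknown, we are waiting for a
# connection, and all counters in /proc/drbd are still zero.
zero_counters="ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0"
if [ "$pri_sec" = "Secondary" ] && [ "$peer_st" = "Unknown" ] &&
   [ "$cstate" = "WFConnection" ]; then
    if grep -q "$zero_counters" /proc/drbd; then
        create_md="YES"
    fi
fi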
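
The excerpt only sets create_md; the recovery step itself is not included. As a
minimal sketch, assuming that recovery means recreating the DRBD metadata on
the secondary so that it resyncs from the primary once it reconnects, it might
look like the following. The resource name r0 and the helper recover_resource
are hypothetical, and "drbdadm create-md" may ask for confirmation when it
finds existing metadata, which this sketch does not handle.

```shell
#!/bin/sh
# Sketch only: the helper name and the resource name are assumptions.
DRBDADM=${DRBDADM:-drbdadm}

# Recreate the metadata of one resource when the check asked for it.
# Usage: recover_resource <create_md flag> <resource>
recover_resource() {
    flag=$1
    res=$2
    if [ "$flag" != "YES" ]; then
        echo "no recovery needed for $res"
        return 0
    fi
    # Take the resource down before touching its metadata.
    $DRBDADM down "$res" || return 1
    # Write fresh metadata (may prompt for confirmation).
    $DRBDADM create-md "$res" || return 1
    # Bring the resource back up so it can reconnect to the peer.
    $DRBDADM up "$res" || return 1
    echo "recovery started for $res"
}
```

After the checks above it would be called as: recover_resource "$create_md" r0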
Thanks and Regards,
Lak
> Date: Sat, 18 Sep 2010 04:54:05 -0500
> From: [email protected]
> To: [email protected]
> CC: [email protected]; [email protected]
> Subject: Re: [DRBD-user] drbd and fsck
>
> On Sat, 18 Sep 2010, putcha narayana wrote:
>
> >
> >
> > FYI: If you run fsck on one node and it prints "FILE SYSTEM HAS BEEN
> > MODIFIED", use an external script to run a full sync on the other board.
> > Without this sync we see file corruption during failovers.
>
> Would you give an example of the external script?
>
> >
> > THANKS AND REGARDS
> > LAK
> >
> >
> >
> >> From: [email protected]
> >> To: [email protected]
> >> Date: Thu, 16 Sep 2010 12:37:52 -0400
> >> Subject: Re: [DRBD-user] drbd and fsck
> >>
> >> On 09/15/2010 03:06 PM, [email protected] wrote:
> >>
> >>> I have an ext3 filesystem on drbd.
> >>> When I run fsck, should I run it on all nodes?
> >>
> >> You probably want to run verify to check that the images are in sync. fsck
> >> the primary, verify the resource.
> >>
> >> It "sounds" like you want to know both copies are good.
> >>
> >> Dan in Atlanta
> >>
> >> _______________________________________________
> >> drbd-user mailing list
> >> [email protected]
> >> http://lists.linbit.com/mailman/listinfo/drbd-user
> >
>