Re: [Linux-HA] drbd primary/primary for ocfs2 and undetected split brain

Andreas Kurz Fri, 29 Jun 2012 08:01:23 -0700

On 06/29/2012 04:02 PM, EXTERNAL Konold Martin (erfrakon, RtP2/TEF72) wrote:
> Hi,
> 
> I am experiencing an error situation which is not detected by the cluster.
> 
> I created a 2-node cluster using drbd and want to use ocfs2 on both nodes 
> simultaneously. (stripped off some monitor/meta stuff)


Baaaaad idea ... the configuration is pretty useless when it is not complete,
especially without the meta attributes in this case. Please also share your
DRBD and Corosync configuration. BTW: what is your use case for starting with
a "simple" dual-primary OCFS2 setup?
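
For reference, this is a sketch of the kind of meta attributes that matter here. It is not the poster's actual configuration (that part was stripped), and the values are the ones commonly used for a two-node dual-primary setup, so treat them as an assumption:

```
ms msDRBD resDRBD \
        meta master-max="2" clone-max="2" notify="true" interleave="true"
```

Without master-max="2" Pacemaker would only ever promote one node, so whether these attributes are present changes how the rest of the configuration has to be read.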

> 
> primitive dlm ocf:pacemaker:controld
> primitive o2cb ocf:ocfs2:o2cb
> primitive resDRBD ocf:linbit:drbd \
>         params drbd_resource="r0" \
>         operations $id="resDRBD-operations"
> primitive resource-fs ocf:heartbeat:Filesystem \
>         params device="/dev/drbd_r0" directory="/SHARED" fstype="ocfs2"
> ms msDRBD resDRBD
> clone clone-dlm dlm
> clone clone-fs resource-fs
> clone clone-ocb o2cb
> colocation colocation-dlm-drbd inf: clone-dlm msDRBD:Master
> colocation colocation-fs-o2cb inf: clone-fs clone-ocb
> colocation colocation-ocation-dlm inf: clone-ocb clone-dlm
> order order-dlm-o2cb 0: clone-dlm clone-ocb
> order order-drbd-dlm 0: msDRBD:promote clone-dlm:start
> order order-o2cb-fs 0: clone-ocb clone-fs
> 
> The cluster starts up happily. (everything green in crm_gui) but
> 
> rt-lxcl9a:~ # drbd-overview
>   0:r0/0  WFConnection Primary/Unknown UpToDate/DUnknown C r-----
> rt-lxcl9b:~ # drbd-overview
>   0:r0/0  StandAlone Primary/Unknown UpToDate/DUnknown r-----
> 
> As you can see this is a split brain situation with both nodes having the 
> ocfs2 fs mounted but not in sync --> data loss will happen.
> 
> 1.      How to avoid split brain situations (I am confident that the cross 
> link using a 10GB cable was never interrupted)?

The logs should reveal what happened.

> 2.      How to resolve this?
> Switch cluster in maintenance mode and then follow 
> http://www.drbd.org/users-guide/s-resolve-split-brain.html ?

At the very least you also need to stop the filesystem, if it is running, on
the node where you want to demote one Primary ... and then follow that link.
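
The manual recovery described in that guide can be sketched roughly as follows. This is a dry-run sketch, not a tested procedure: the resource name r0 and the mount point /SHARED come from the configuration quoted above, the helper names (run, victim_steps, survivor_steps) are made up for illustration, and the drbdadm syntax assumes DRBD 8.4 (8.3 uses "drbdadm -- --discard-my-data connect r0" instead). It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of manual split-brain recovery.
# Prints the commands by default; set DRY_RUN=0 only after reviewing each step.
DRY_RUN="${DRY_RUN:-1}"
RES="r0"         # DRBD resource name (from the quoted configuration)
MNT="/SHARED"    # OCFS2 mount point (from the quoted configuration)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Run on the node whose changes will be thrown away (the "victim").
victim_steps() {
    # Keep Pacemaker from interfering (cluster-wide, needed only once).
    run crm configure property maintenance-mode=true
    # The filesystem must be unmounted before the node can be demoted.
    run umount "$MNT"
    # Leave WFConnection (if that is the current state), then demote.
    run drbdadm disconnect "$RES"
    run drbdadm secondary "$RES"
    # Reconnect and discard this node's modifications.
    run drbdadm connect --discard-my-data "$RES"
}

# Run on the node whose data is kept, if it is in StandAlone state.
survivor_steps() {
    run drbdadm connect "$RES"
}

case "${1:-}" in
    victim)   victim_steps ;;
    survivor) survivor_steps ;;
esac
```

Invoke it with "victim" on the node that loses its changes and "survivor" on the other one. Once the resync has finished, maintenance mode has to be switched off again so that the cluster can promote and mount on both nodes.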

> 3.      How to make the cluster aware of the split brain situation? (It 
> thinks everything is fine)

Set the fencing policy to "resource-and-stonith" in the DRBD configuration,
preferably using the "crm-fence-peer.sh" script as the fence-peer handler ...
Pacemaker itself, or rather the DRBD resource agent, will not react to such a
situation.
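
A minimal sketch of what that looks like in the DRBD resource definition. The script paths are the usual install location and are an assumption (check where your distribution puts them), and the unfence handler is the usual companion to the fence handler rather than something mentioned above:

```
resource r0 {
  disk {
    # Freeze I/O and call the fence-peer handler when the peer is lost.
    fencing resource-and-stonith;
  }
  handlers {
    # Paths are an assumption; adjust to your installation.
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  # ... existing net, device, and host sections stay unchanged ...
}
```

This policy only makes sense if STONITH is configured and working in Pacemaker as well.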

> 4.      Should the DRBD/OCFS2 setup be maintained outside the cluster instead?

better not ;-)

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now


> 
> Mit freundlichen Grüßen / Best regards
> 
> Martin Konold
> 
> Robert Bosch GmbH
> Automotive Electronics
> Postfach 13 42
> 72703 Reutlingen
> GERMANY
> www.bosch.com
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
