>>> Andreas Kurz <[email protected]> wrote on 20.12.2011 at 22:57 in message
<[email protected]>:
> Hello,
> 
> On 12/20/2011 02:47 PM, Ulrich Windl wrote:
> > Hi!
> > 
> > I have a dual-primary DRBD that is not working well: It was working, then I
> > shut it down and restarted it. DRBD complained about split brain and fenced
> > the other node. When coming up, the other node fenced this node. IMHO neither
> > node should have fenced the other.
> > 
> 
> no config from drbd, no cluster config, partial/filtered logs ...
> fragments ... you have _all_ information and can't find the problem ...
> sorry, but I can't see how anyone can help you based on that information.

Well,

to me the problem looks like this: when starting, both DRBD nodes talk to each
other successfully, then they say "we just talked about not being able to talk
to each other, so let's commit suicide, because afterwards we can talk to each
other better".

I think the diagnosis of "split brain" is based on disk content, not on a
communication failure, because the nodes had just talked to each other. So a
sync, not suicide, would be the proper resolution of the conflict.

And as far as the DRBD logs are concerned, they are complete for the interval 
that's interesting.

I have only heard third-party rumors that "this and that" isn't working, but 
nobody could actually tell me why. I was hoping to get some insight here.

> 
> I personally think it is part of the free community support deal to
> share as much information as possible if one wants help for free.

Well, if anybody has a dual-primary DRBD (with OCFS on top) working with 
pacemaker, would you share your configuration with me to find out what's 
different?

Here's my configuration:
# grep -v '^[      ]*#' *
global_common.conf:global {
global_common.conf:     usage-count no;
global_common.conf:}
global_common.conf:
global_common.conf:common {
global_common.conf:     protocol C;
global_common.conf:
global_common.conf:     handlers {
global_common.conf:             pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; sync; echo b > /proc/sysrq-trigger ; reboot -f";
global_common.conf:             pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; sync; echo b > /proc/sysrq-trigger ; reboot -f";
global_common.conf:             local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; sync; echo o > /proc/sysrq-trigger ; halt -f";
global_common.conf:             split-brain "/usr/lib/drbd/notify-split-brain.sh root";
global_common.conf:             out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
global_common.conf:     }
global_common.conf:
global_common.conf:     startup {
global_common.conf:             become-primary-on both;
global_common.conf:             wfc-timeout 15;
global_common.conf:     }
global_common.conf:
global_common.conf:     disk {
global_common.conf:             use-bmbv;
global_common.conf:     }
global_common.conf:
global_common.conf:     net {
global_common.conf:             allow-two-primaries;
global_common.conf:             after-sb-0pri discard-zero-changes;
global_common.conf:             after-sb-1pri discard-secondary;
global_common.conf:             after-sb-2pri disconnect;
global_common.conf:     }
global_common.conf:
global_common.conf:     syncer {
global_common.conf:     }
global_common.conf:}
r0.res:resource r0 {
r0.res: device /dev/drbd_r0 minor 0;
r0.res: disk /dev/sys/samba;
r0.res: meta-disk internal;
r0.res: on h02 {
r0.res:         address 172.20.78.2:7780;
r0.res: }
r0.res: on h06 {
r0.res:         address 172.20.78.6:7780;
r0.res: }
r0.res: syncer {
r0.res:         rate 7M;
r0.res: }
r0.res:}
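
For comparison, the fencing-related settings that the DRBD 8.3 documentation
describes for Pacemaker-managed resources are sketched below; none of them
appear in my configuration above. Treat this as an untested sketch under
assumptions: the script paths are the packaged defaults and may differ on
other installations, and resource-and-stonith is the policy usually suggested
for dual-primary setups, not something confirmed to fix the behaviour
described here.

        disk {
                # assumption: suspend I/O and call the fence-peer handler
                # when the replication link to the peer is lost
                fencing resource-and-stonith;
        }
        handlers {
                # assumption: default install paths of the Pacemaker helpers
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }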

Regards,
Ulrich


> 
> Regards,
> Andreas



 

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
