http://www.drbd.org/users-guide/s-resolve-split-brain.html
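
In case it helps when reading the handshake lines in the log quoted below, here is a minimal Python sketch that pulls the "self" and "peer" UUID sets out of the dmesg text and checks the condition that appears to trigger the split-brain verdict. It assumes the wrapped log lines have been joined back together. The reading of "rule 90" (current UUIDs differ while the bitmap UUIDs are identical and non-zero) is my assumption, not something the log itself states, and the names parse_handshake and looks_like_split_brain are made up for illustration.

```python
import re

# Matches the "self"/"peer" lines printed after drbd_sync_handshake,
# once the mail-wrapped dmesg lines have been joined back together.
UUID_LINE = re.compile(
    r"block drbd\d+: (self|peer)\s+"
    r"([0-9A-F]{16}):([0-9A-F]{16}):([0-9A-F]{16}):([0-9A-F]{16})\s+"
    r"bits:(\d+)"
)


def parse_handshake(log_text):
    """Return {'self': {...}, 'peer': {...}} with current/bitmap UUIDs and bits."""
    nodes = {}
    for who, current, bitmap, _hist1, _hist2, bits in UUID_LINE.findall(log_text):
        nodes[who] = {"current": current, "bitmap": bitmap, "bits": int(bits)}
    return nodes


def looks_like_split_brain(nodes):
    """Assumed reading of 'rule 90': the current UUIDs differ while the
    bitmap UUIDs are identical and non-zero."""
    me, peer = nodes["self"], nodes["peer"]
    return (
        me["current"] != peer["current"]
        and me["bitmap"] == peer["bitmap"]
        and int(me["bitmap"], 16) != 0
    )


# The two UUID lines from the quoted log, unwrapped.
LOG = (
    "[707152.210363] block drbd0: self "
    "8631CEC3370B5C31:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 "
    "bits:21 flags:0\n"
    "[707152.210368] block drbd0: peer "
    "73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 "
    "bits:0 flags:0\n"
)

nodes = parse_handshake(LOG)
print(looks_like_split_brain(nodes))
```

Note that the peer reports bits:0 but still carries a different current UUID than the local node, so the two sides no longer agree on the current data generation even though the peer shows no changed blocks.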



On Apr 17, 2012, at 11:06 AM, Jacek Osiecki wrote:

> Hello,
> 
> I am currently testing a dual-master setup with DRBD+OCFS2.
> I finally managed to get it working well on kernel 2.6.39.4, DRBD version 
> 8.3.10 (userland version: 8.4.1) and OCFS2 version 1.5.0.
> 
> I have had some trouble with broken replication: automatic recovery
> sometimes works and sometimes does not. What is strange is that these are
> still only tests, and while one server is fully functional, the second one
> has no processes that even touch the synchronized partition.
> 
> In dmesg on the active server it looks like this:
> 
> [707152.209885] block drbd0: Handshake successful: Agreed network protocol version 96
> [707152.209895] block drbd0: conn( WFConnection -> WFReportParams )
> [707152.210068] block drbd0: Starting asender thread (from drbd0_receiver [1096])
> [707152.210341] block drbd0: data-integrity-alg: <not-used>
> [707152.210352] block drbd0: max BIO size = 130560
> [707152.210359] block drbd0: drbd_sync_handshake:
> [707152.210363] block drbd0: self 8631CEC3370B5C31:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:21 flags:0
> [707152.210368] block drbd0: peer 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:0 flags:0
> [707152.210371] block drbd0: uuid_compare()=100 by rule 90
> [707152.210377] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
> [707152.212439] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
> [707152.212442] block drbd0: Split-Brain detected but unresolved, dropping connection!
> [707152.212445] block drbd0: helper command: /sbin/drbdadm split-brain minor-0
> [707152.214134] block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
> [707152.214137] block drbd0: conn( WFReportParams -> Disconnecting )
> [707152.214141] block drbd0: error receiving ReportState, l: 4!
> [707152.214150] block drbd0: asender terminated
> [707152.214154] block drbd0: Terminating drbd0_asender
> [707152.214177] block drbd0: Connection closed
> [707152.214180] block drbd0: conn( Disconnecting -> StandAlone )
> [707152.214188] block drbd0: receiver terminated
> [707152.214190] block drbd0: Terminating drbd0_receiver
> 
> Is there any help for this situation? I don't understand why the case isn't 
> resolved, since the second server doesn't write to drbd0; sometimes the
> partition wasn't even mounted (I can't be 100% sure, but it seems so).
> 
> I would be grateful if you could give me some hint on how to make this 
> configuration stable without sacrificing data on one of the nodes (now, in
> order to recover, I have to set the second node to slave). Any ideas what is
> wrong in my setup?
> 
> P.S. Any suggestions on how to measure the real performance (read/write/copy)
> of DRBD+OCFS2? UnixBench gives crazy results (read performance about 10% of 
> a local filesystem)...
> 
> Best regards,
> -- 
> Jacek Osiecki
> [email protected]
> 
> Silvercube s.c.
> ul. Makuszynskiego 4
> 31-752 Kraków
> +48 (12) 684 21 00
> _______________________________________________
> drbd-user mailing list
> [email protected]
> http://lists.linbit.com/mailman/listinfo/drbd-user
