http://www.drbd.org/users-guide/s-resolve-split-brain.html
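
To see why DRBD reports a split brain here, it can help to compare the "self" and "peer" generation identifiers from the dmesg output quoted below. The following is only a rough sketch, not DRBD's actual uuid_compare() logic. The slot names (current, bitmap, two history entries) are my assumption based on how the DRBD guide describes generation identifiers, and parse_uuids and looks_like_split_brain are hypothetical helper names. The idea: if the current UUIDs differ while the bitmap UUIDs match and are non-zero, both nodes appear to have changed data independently since the same point.

```python
import re

# The two UUID lines from the quoted dmesg output, re-joined after mail wrapping.
LOG = """\
block drbd0: self 8631CEC3370B5C31:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:21 flags:0
block drbd0: peer 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:0 flags:0
"""

# Assumption: the four colon-separated fields are current, bitmap and two
# history UUIDs, in that order.
SLOTS = ("current", "bitmap", "history1", "history2")

UUID_LINE = re.compile(
    r"(self|peer)\s+((?:[0-9A-F]{16}:){3}[0-9A-F]{16})\s+bits:(\d+)"
)


def parse_uuids(log):
    """Return {'self': {...}, 'peer': {...}} with UUID slots and out-of-sync bits."""
    nodes = {}
    for match in UUID_LINE.finditer(log):
        who, uuids, bits = match.groups()
        nodes[who] = {
            "uuids": dict(zip(SLOTS, uuids.split(":"))),
            "bits": int(bits),
        }
    return nodes


def looks_like_split_brain(nodes):
    """Heuristic only: different current UUIDs, same non-zero bitmap UUID."""
    mine = nodes["self"]["uuids"]
    theirs = nodes["peer"]["uuids"]
    return (
        mine["current"] != theirs["current"]
        and mine["bitmap"] == theirs["bitmap"]
        and int(mine["bitmap"], 16) != 0
    )


nodes = parse_uuids(LOG)
for slot in SLOTS:
    same = nodes["self"]["uuids"][slot] == nodes["peer"]["uuids"][slot]
    print(f"{slot:9s} {'same' if same else 'DIFFERENT'}")
print("diverged since a common point:", looks_like_split_brain(nodes))
```

In the quoted log only the current UUIDs differ, and the "self" side reports bits:21 while the peer reports bits:0, so the changes seem to sit on the active server. Whether DRBD resolves that automatically depends on the configured after-split-brain policies, which the page linked above covers.
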

On Apr 17, 2012, at 11:06 AM, Jacek Osiecki wrote:

> Hello,
>
> I am currently testing a dual-master setup with DRBD+OCFS2.
> I finally managed to get it working well on kernel 2.6.39.4, DRBD version
> 8.3.10 (userland version: 8.4.1) and OCFS2 version 1.5.0.
>
> I have had some trouble with broken replication, and I see that automatic
> recovery sometimes works and sometimes does not. What is strange is that
> these are still tests: while one server is fully functional, the second one
> has no processes that even touch the synchronized partition.
>
> In dmesg on the active server it looks like this:
>
> [707152.209885] block drbd0: Handshake successful: Agreed network protocol version 96
> [707152.209895] block drbd0: conn( WFConnection -> WFReportParams )
> [707152.210068] block drbd0: Starting asender thread (from drbd0_receiver [1096])
> [707152.210341] block drbd0: data-integrity-alg: <not-used>
> [707152.210352] block drbd0: max BIO size = 130560
> [707152.210359] block drbd0: drbd_sync_handshake:
> [707152.210363] block drbd0: self 8631CEC3370B5C31:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:21 flags:0
> [707152.210368] block drbd0: peer 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:0 flags:0
> [707152.210371] block drbd0: uuid_compare()=100 by rule 90
> [707152.210377] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
> [707152.212439] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
> [707152.212442] block drbd0: Split-Brain detected but unresolved, dropping connection!
> [707152.212445] block drbd0: helper command: /sbin/drbdadm split-brain minor-0
> [707152.214134] block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
> [707152.214137] block drbd0: conn( WFReportParams -> Disconnecting )
> [707152.214141] block drbd0: error receiving ReportState, l: 4!
> [707152.214150] block drbd0: asender terminated
> [707152.214154] block drbd0: Terminating drbd0_asender
> [707152.214177] block drbd0: Connection closed
> [707152.214180] block drbd0: conn( Disconnecting -> StandAlone )
> [707152.214188] block drbd0: receiver terminated
> [707152.214190] block drbd0: Terminating drbd0_receiver
>
> Is there any help for this situation? I don't understand why the case isn't
> resolved, since the second server doesn't write to drbd0; sometimes the
> partition wasn't even mounted (I can't be 100% sure, but it seems so).
>
> I would be grateful if you could give me some hint on how to make this
> configuration stable without sacrificing data on one of the nodes (right now,
> in order to recover, I have to set the second node to slave). Any ideas what
> is wrong in my setup?
>
> P.S. Any suggestions on how to measure the real performance (read/write/copy)
> of DRBD+OCFS2? UnixBench gives crazy results (read performance about 10% of
> the local filesystem)...
>
> Best regards,
> --
> Jacek Osiecki
> [email protected]
>
> Silvercube s.c.
> ul. Makuszynskiego 4
> 31-752 Kraków
> +48 (12) 684 21 00
> _______________________________________________
> drbd-user mailing list
> [email protected]
> http://lists.linbit.com/mailman/listinfo/drbd-user
