On Wed, Feb 23, 2011 at 05:37:45PM +0800, [email protected] wrote:
> sorry about that. here is the config
>
> ###############################
> global { usage-count no; }
> common { syncer { rate 500M; } }
>
> #DRBD FOR SERVER START
> resource r0 {
> protocol C;
> startup {
> wfc-timeout 15;
> degr-wfc-timeout 60;
> become-primary-on both;
> }
> net {
> cram-hmac-alg sha1;
> shared-secret "a4tech";
> allow-two-primaries;
> after-sb-0pri discard-zero-changes;
> after-sb-1pri discard-secondary;
Right there. What that line tells DRBD is:
"Dear DRBD,
if you happen to detect data divergence during the connection handshake,
and at that moment one side is in Primary role and the other side is in
Secondary role, please just throw away the data of the side that is
Secondary at that time."
Which matches exactly what the logs below say.
> after-sb-2pri disconnect;
> allow-two-primaries;
> }
> on vmcluster01 {
> device /dev/drbd0;
> disk /dev/vmdiskvg/server_root_disk;
> address 172.16.0.1:7788;
> meta-disk internal;
> }
> on vmcluster02 {
> device /dev/drbd0;
> disk /dev/vmdiskvg/server_root_disk;
> address 172.16.0.2:7788;
> meta-disk internal;
> }
> }
> ###############################
>
> here are the logs from vmcluster1 r0
> ###############################
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: Starting worker thread (from
> cqueue [4385])
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: disk( Diskless -> Attaching
> )
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: Found 4 transactions (94
> active extents) in activity log.
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: Method to ensure write
> ordering: barrier
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: Backing device's
> merge_bvec_fn() = ffffffff814445a0
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: max_segment_size ( = BIO
> size ) = 4096
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: drbd_bm_resize called with
> capacity == 10485368
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: resync bitmap: bits=1310671
> words=20480
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: size = 5120 MB (5242684 KB)
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: recounting of set bits took
> additional 0 jiffies
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: 0 KB (0 bits) marked
> out-of-sync by on disk bit-map.
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: Marked additional 200 MB as
> out-of-sync based on AL.
> Feb 16 08:00:09 vmcluster01 kernel: block drbd0: disk( Attaching -> UpToDate
> )
> Feb 16 08:00:11 vmcluster01 kernel: block drbd0: conn( StandAlone ->
> Unconnected )
> Feb 16 08:00:11 vmcluster01 kernel: block drbd0: Starting receiver thread
> (from drbd0_worker [4403])
> Feb 16 08:00:11 vmcluster01 kernel: block drbd0: receiver (re)started
> Feb 16 08:00:11 vmcluster01 kernel: block drbd0: conn( Unconnected ->
> WFConnection )
> Feb 16 08:00:26 vmcluster01 kernel: block drbd0: role( Secondary -> Primary )
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: Handshake successful: Agreed
> network protocol version 91
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: Peer authenticated using 20
> bytes of 'sha1' HMAC
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: conn( WFConnection ->
> WFReportParams )
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: Starting asender thread
> (from drbd0_receiver [4795])
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: data-integrity-alg:
> <not-used>
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: drbd_sync_handshake:
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: self
> 08EC53F109A4A4CF:5DB3199CA5DE92F4:DCE4E128E07FD84A:5995FC353101D3B7
> bits:51200 flags:0
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: peer
> 5453F9D5A1166380:5DB3199CA5DE92F5:DCE4E128E07FD84A:5995FC353101D3B7
> bits:145101 flags:2
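The current UUIDs (first field) differ, while the bitmap UUIDs (second
field) are identical apart from the lowest bit: diverging data sets,
both sides have been modified independently.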
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: uuid_compare()=100 by rule 90
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: Split-Brain detected, 1
> primaries, automatically solved. Sync from this node
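You configured it to discard the side that is Secondary during handshake.
And that is what DRBD does: the peer, vmcluster02, was still Secondary at
handshake time, so its changes are overwritten by the resync from this node.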
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: peer( Unknown -> Secondary )
> conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: conn( WFBitMapS ->
> SyncSource ) pdsk( UpToDate -> Inconsistent )
> Feb 16 08:01:53 vmcluster01 kernel: block drbd0: Began resync as SyncSource
> (will sync 632904 KB [158226 bits set]).
> Feb 16 08:02:17 vmcluster01 kernel: block drbd0: peer( Secondary -> Primary )
> Feb 16 08:04:50 vmcluster01 kernel: block drbd0: Resync done (total 87 sec;
> paused 0 sec; 7272 K/sec)
> Feb 16 08:04:50 vmcluster01 kernel: block drbd0: conn( SyncSource ->
> Connected ) pdsk( Inconsistent -> UpToDate )
> ###############################
> here are the logs for vmcluster2 (where the file servers are hosted)
> ###############################
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: Starting worker thread (from
> cqueue [3624])
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: disk( Diskless -> Attaching
> )
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: Found 4 transactions (136
> active extents) in activity log.
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: Method to ensure write
> ordering: barrier
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: Backing device's
> merge_bvec_fn() = ffffffffa011ccb0
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: max_segment_size ( = BIO
> size ) = 4096
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: drbd_bm_resize called with
> capacity == 10485368
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: resync bitmap: bits=1310671
> words=20480
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: size = 5120 MB (5242684 KB)
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: recounting of set bits took
> additional 0 jiffies
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: 87 MB (22299 bits) marked
> out-of-sync by on disk bit-map.
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: Marked additional 480 MB as
> out-of-sync based on AL.
> Feb 16 08:01:49 vmcluster02 kernel: block drbd0: disk( Attaching -> UpToDate
> )
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: conn( StandAlone ->
> Unconnected )
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: Starting receiver thread
> (from drbd0_worker [3647])
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: receiver (re)started
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: conn( Unconnected ->
> WFConnection )
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: Handshake successful: Agreed
> network protocol version 91
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: Peer authenticated using 20
> bytes of 'sha1' HMAC
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: conn( WFConnection ->
> WFReportParams )
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: Starting asender thread
> (from drbd0_receiver [4269])
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: data-integrity-alg:
> <not-used>
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: drbd_sync_handshake:
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: self
> 5453F9D5A1166380:5DB3199CA5DE92F5:DCE4E128E07FD84A:5995FC353101D3B7
> bits:145101 flags:0
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: peer
> 08EC53F109A4A4CF:5DB3199CA5DE92F4:DCE4E128E07FD84A:5995FC353101D3B7
> bits:51200 flags:2
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: uuid_compare()=100 by rule 90
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: Split-Brain detected, 1
> primaries, automatically solved. Sync from peer node
Exactly.
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: peer( Unknown -> Primary )
> conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: conn( WFBitMapT ->
> WFSyncUUID )
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: helper command:
> /sbin/drbdadm before-resync-target minor-0
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: helper command:
> /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: conn( WFSyncUUID ->
> SyncTarget ) disk( UpToDate -> Inconsistent )
> Feb 16 08:01:50 vmcluster02 kernel: block drbd0: Began resync as SyncTarget
> (will sync 632904 KB [158226 bits set]).
> Feb 16 08:02:13 vmcluster02 kernel: block drbd0: role( Secondary -> Primary )
> Feb 16 08:03:17 vmcluster02 kernel: block drbd0: Resync done (total 87 sec;
> paused 0 sec; 7272 K/sec)
> Feb 16 08:03:17 vmcluster02 kernel: block drbd0: conn( SyncTarget ->
> Connected ) disk( Inconsistent -> UpToDate )
> Feb 16 08:03:17 vmcluster02 kernel: block drbd0: helper command:
> /sbin/drbdadm after-resync-target minor-0
> Feb 16 08:03:17 vmcluster02 kernel: block drbd0: helper command:
> /sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0)
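If automatically throwing away the Secondary's changes is not what you
want, the conservative alternative is to have DRBD refuse the connection
and leave the decision to you. A minimal sketch of the net section,
assuming DRBD 8.3 syntax and keeping your other settings as they are;
only the after-sb-1pri line differs from what you posted (the duplicate
allow-two-primaries is dropped):

```
net {
    cram-hmac-alg sha1;
    shared-secret "a4tech";
    allow-two-primaries;
    # no Primary at handshake: auto-resolve only if one side has no changes at all
    after-sb-0pri discard-zero-changes;
    # one Primary at handshake: discard nothing automatically, drop the connection
    after-sb-1pri disconnect;
    # two Primaries at handshake: drop the connection
    after-sb-2pri disconnect;
}
```

With this, a split brain like the one in your logs would leave both nodes
disconnected until you choose by hand which side's changes to discard.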
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed