On 04/05/18 09:10, Christiaan den Besten wrote:
Hi!
Question: using DRBD 9.0.14 (latest from git), we can't get a resync to work
after a verify. We have a simple 2-node resource, created and configured 8.x style.
A "drbdadm verify" now successfully ends at 100% (thank you so much, Lars, for
fixing this!) and it notices inconsistent data blocks (self-inflicted by dd'ing some
zeros on the secondary node).
We then have:
[149702.915093] drbd r_drbd9.prolocation.net mhxen20.prolocation.net: conn(
Unconnected -> Connecting )
[149704.335863] drbd r_drbd9.prolocation.net mhxen20.prolocation.net: Handshake
to peer 0 successful: Agreed network protocol version 113
[149704.335866] drbd r_drbd9.prolocation.net mhxen20.prolocation.net: Feature
flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
[149704.336280] drbd r_drbd9.prolocation.net mhxen20.prolocation.net: Peer
authenticated using 20 bytes HMAC
[149704.336299] drbd r_drbd9.prolocation.net mhxen20.prolocation.net: Starting
ack_recv thread (from drbd_r_r_drbd9. [4924])
[149704.391726] drbd r_drbd9.prolocation.net mhxen20.prolocation.net: Preparing
remote state change 196805945
[149704.392341] drbd r_drbd9.prolocation.net mhxen20.prolocation.net:
Committing remote state change 196805945 (primary_nodes=2)
[149704.392364] drbd r_drbd9.prolocation.net mhxen20.prolocation.net: conn(
Connecting -> Connected ) peer( Unknown -> Secondary )
[149704.397800] drbd r_drbd9.prolocation.net/0 drbd11 mhxen20.prolocation.net:
drbd_sync_handshake:
[149704.397805] drbd r_drbd9.prolocation.net/0 drbd11 mhxen20.prolocation.net:
self 9E1AD7F59E5434FA:0000000000000000:B3BDA5F13EDDFCEA:EE9BDB393791EAAC bits:0
flags:120
[149704.397807] drbd r_drbd9.prolocation.net/0 drbd11 mhxen20.prolocation.net:
peer 9E1AD7F59E5434FA:0000000000000000:9E1AD7F59E5434FA:B3BDA5F13EDDFCEA bits:0
flags:120
[149704.397809] drbd r_drbd9.prolocation.net/0 drbd11 mhxen20.prolocation.net:
uuid_compare()=0 by rule 38
[149704.397830] drbd r_drbd9.prolocation.net/0 drbd11 mhxen20.prolocation.net:
repl( Off -> Established )
[149704.405793] drbd r_drbd9.prolocation.net/1 drbd12 mhxen20.prolocation.net:
drbd_sync_handshake:
[149704.405796] drbd r_drbd9.prolocation.net/1 drbd12 mhxen20.prolocation.net:
self 686DD0F922994E9C:0000000000000000:AEB10B63BD82F43A:6805740BE5A46E08
bits:1048 flags:120
[149704.405799] drbd r_drbd9.prolocation.net/1 drbd12 mhxen20.prolocation.net:
peer 686DD0F922994E9C:0000000000000000:686DD0F922994E9C:AEB10B63BD82F43A
bits:1048 flags:120
[149704.405801] drbd r_drbd9.prolocation.net/1 drbd12 mhxen20.prolocation.net:
uuid_compare()=0 by rule 38
[149704.405803] drbd r_drbd9.prolocation.net/1 drbd12: No resync, but 1048 bits
in bitmap!
[149704.405821] drbd r_drbd9.prolocation.net/1 drbd12 mhxen20.prolocation.net:
repl( Off -> Established )
and the same on the other node:
[146265.229215] drbd r_drbd9.prolocation.net/1 drbd12 mhxen10.prolocation.net:
drbd_sync_handshake:
[146265.229218] drbd r_drbd9.prolocation.net/1 drbd12 mhxen10.prolocation.net:
self 686DD0F922994E9C:0000000000000000:686DD0F922994E9C:AEB10B63BD82F43A
bits:1048 flags:120
[146265.229221] drbd r_drbd9.prolocation.net/1 drbd12 mhxen10.prolocation.net:
peer 686DD0F922994E9C:0000000000000000:AEB10B63BD82F43A:6805740BE5A46E08
bits:1048 flags:120
[146265.229223] drbd r_drbd9.prolocation.net/1 drbd12 mhxen10.prolocation.net:
uuid_compare()=0 by rule 38
[146265.229225] drbd r_drbd9.prolocation.net/1 drbd12: No resync, but 1048 bits
in bitmap!
[146265.229244] drbd r_drbd9.prolocation.net/1 drbd12 mhxen10.prolocation.net: pdsk(
DUnknown -> UpToDate ) repl( Off -> Established )
with
[root@mhxen10 ~]# grep ^ /sys/kernel/debug/drbd/resources/*/connections/*/*/proc_drbd
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/0/proc_drbd:11:
cs:Established ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/0/proc_drbd:
ns:41941724 nr:0 dw:0 dr:167767960 al:0 bm:0 lo:0 pe:[0;0] ua:0 ap:[0;0]
ep:1 wo:1 oos:0
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/0/proc_drbd:
resync: used:0/61 hits:0 misses:0 starving:0 locked:0 changed:0
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/0/proc_drbd:
act_log: used:0/1237 hits:0 misses:0 starving:0 locked:0 changed:0
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/0/proc_drbd:
blocked on activity log: 0
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/1/proc_drbd:12:
cs:Established ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/1/proc_drbd:
ns:41943040 nr:0 dw:0 dr:167773196 al:0 bm:0 lo:0 pe:[0;0] ua:0 ap:[0;0]
ep:1 wo:1 oos:4192
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/1/proc_drbd:
resync: used:0/61 hits:0 misses:0 starving:0 locked:0 changed:0
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/1/proc_drbd:
act_log: used:0/1237 hits:0 misses:0 starving:0 locked:0 changed:0
/sys/kernel/debug/drbd/resources/r_drbd9.prolocation.net/connections/mhxen20.prolocation.net/1/proc_drbd:
blocked on activity log: 0
Notice the oos:4192.
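For reference, that oos value is consistent with the "1048 bits in bitmap" message in the kernel log, assuming each bitmap bit covers 4 KiB and oos is reported in KiB. A minimal Python sketch of that check follows; the helper names `parse_oos` and `bits_to_kib` are made up for illustration, and the sample text is trimmed from the output above.

```python
import re

# Assumption: one DRBD bitmap bit tracks a 4 KiB block, and oos is in KiB.
BITMAP_BLOCK_KIB = 4


def parse_oos(text):
    """Return every oos value (KiB) found in proc_drbd-style text, in order."""
    return [int(value) for value in re.findall(r"\boos:(\d+)", text)]


def bits_to_kib(bits):
    """Convert a count of set bitmap bits to KiB of out-of-sync data."""
    return bits * BITMAP_BLOCK_KIB


# Trimmed from the debugfs output above: volume 0, then volume 1.
sample = """\
ep:1 wo:1 oos:0
ep:1 wo:1 oos:4192
"""

print(parse_oos(sample))   # [0, 4192]
print(bits_to_kib(1048))   # 4192
```

So the 1048 bits reported for volume 1 account for exactly the 4192 KiB shown as out of sync.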
Disconnecting and reconnecting one or both ends won't make it resync. Is this
something we misconfigured, or should it have worked?
A "drbdadm invalidate-remote r_drbd9.prolocation.net" on the primary node,
which forces a full resync, does get the job done.
Any advice on this?
Hi Christiaan,
I frequently come across this on 9.x. The workaround I have used for a
long time is to disconnect and then wait until some writes have come in
(or, if you have access to the upper-layer filesystem, touching a file
is enough). Then reconnect, which of course leads to a normal resync,
and I assume the oos blocks are taken care of as part of that. It seems
to work: I believe I checked in the past, and a subsequent verify pass
showed no oos.
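As a rough illustration of that sequence, here is a Python sketch that only builds and prints the commands rather than running them. The function name `workaround_plan` and the touch path are hypothetical; the path assumes a filesystem on the affected DRBD device is mounted on the primary. The final verify should only be started once the resync has finished.

```python
import shlex


def workaround_plan(resource, touch_path):
    """Build the disconnect / write / reconnect / verify command sequence."""
    return [
        ["drbdadm", "disconnect", resource],  # drop the connection to the peer
        ["touch", touch_path],                # cause a write while disconnected
        ["drbdadm", "connect", resource],     # reconnect; a normal resync should follow
        ["drbdadm", "verify", resource],      # after the resync completes, re-check for oos
    ]


# Hypothetical mount point; adjust to wherever the DRBD device is mounted.
plan = workaround_plan("r_drbd9.prolocation.net", "/mnt/drbd12/.resync-trigger")
for argv in plan:
    print(shlex.join(argv))
```

Printing the plan first makes it easy to review the steps before running them by hand on the primary.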
regards,
Eddie