Hello,

I faced this issue sporadically a few months ago, and it occurred again last 
night.
Here is what happens.



Online verification is running (on a weekly basis):

root@srv2-1:~# cat /proc/drbd 
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by root@srv2-1, 2013-05-20 13:24:15
 1: cs:VerifyT ro:Primary/Secondary ds:UpToDate/UpToDate C r---d-
    ns:1717212 nr:0 dw:1720216 dr:844934114 al:7737 bm:0 lo:1 pe:4238 ua:2048 ap:2049 ep:1 wo:b oos:0
    [=======>............] verified: 41.9% (1105704/1902544)M
    finish: 10095:28:53 speed: 28 (9,648) K/sec

The following settings are in use:
syncer {
  rate 10M;
  verify-alg crc32c;
}

During this verification, the primary's network input rate is about 3Mbps and 
its output rate about 1Mbps (out of 100Mbps).
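
To put the figures from the status output above in perspective, here is a rough sketch of the arithmetic. It assumes that the pair in parentheses on the "verified" line is (left/total) in MiB and that both speed values are in KiB/s; those unit readings are assumptions, and the helper names are made up for illustration.

```python
# Figures copied from the /proc/drbd output above.
# Assumption: "(1105704/1902544)M" means (left/total) in MiB.
LEFT_MIB = 1105704
TOTAL_MIB = 1902544
# Assumption: "speed: 28 (9,648) K/sec" means current (average) in KiB/s.
CURRENT_KIB_S = 28
AVERAGE_KIB_S = 9648


def verified_percent(left_mib, total_mib):
    """Share of the device already verified, in percent."""
    return 100.0 * (total_mib - left_mib) / total_mib


def hours_remaining(left_mib, speed_kib_s):
    """Naive estimate: remaining data divided by a constant speed."""
    seconds = left_mib * 1024 / speed_kib_s
    return seconds / 3600


if __name__ == "__main__":
    print(f"verified: {verified_percent(LEFT_MIB, TOTAL_MIB):.1f}%")
    print(f"at average speed: {hours_remaining(LEFT_MIB, AVERAGE_KIB_S):.1f} h")
    print(f"at current speed: {hours_remaining(LEFT_MIB, CURRENT_KIB_S):.0f} h")
```

Under these assumptions the verified share matches the 41.9% shown, the remaining verification would take roughly 33 hours at the average speed, and more than 10000 hours at the current speed of 28 K/sec, which is the same order of magnitude as the "finish" value. DRBD's own estimate is computed differently, so an exact match is not expected.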



Some activity then starts on the resource, using between 4Mbps and 10Mbps of 
network bandwidth (out of 100Mbps).
After about one hour, the resource hangs completely: reads and writes are 
impossible, and even a simple "ls" hangs.
Many errors like the following appear in the syslog:
Jun 19 21:08:10 srv2-1 kernel: block drbd1: [drbd1_worker/26788] sock_sendmsg time expired, ko = 4294967295
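
For anyone who wants to count these messages or correlate them with the hang, here is a minimal parsing sketch. The regular expression is written against the single log line quoted above, so it may need adjusting for other kernels or syslog formats, and the function and field names are hypothetical. The only thing it states about the ko value is arithmetic: 4294967295 equals 2^32 - 1, the largest unsigned 32-bit value. Whether that means a counter wrapped around is an assumption, not something confirmed here.

```python
import re

# The log line quoted above, joined onto one line.
SAMPLE = (
    "Jun 19 21:08:10 srv2-1 kernel: block drbd1: [drbd1_worker/26788] "
    "sock_sendmsg time expired, ko = 4294967295"
)

# Pattern derived from that one sample line (an assumption about the format).
PATTERN = re.compile(
    r"block (?P<device>drbd\d+): \[(?P<task>[^/\]]+)/(?P<pid>\d+)\] "
    r"sock_sendmsg time expired, ko = (?P<ko>\d+)"
)

U32_MAX = 2**32 - 1


def parse_timeout_line(line):
    """Return the fields of a 'sock_sendmsg time expired' line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    ko = int(match.group("ko"))
    return {
        "device": match.group("device"),
        "task": match.group("task"),
        "pid": int(match.group("pid")),
        "ko": ko,
        # True when the printed value is the largest unsigned 32-bit number.
        "ko_is_u32_max": ko == U32_MAX,
    }


def count_timeouts(lines):
    """Count how many of the given syslog lines match the pattern."""
    return sum(1 for line in lines if parse_timeout_line(line) is not None)


if __name__ == "__main__":
    print(parse_timeout_line(SAMPLE))
```

Feeding the lines of the syslog file to `count_timeouts` gives the number of such errors, which can then be compared against the time the resource became unresponsive.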



At this point, the only way I have found to bring the resource back into 
production is to block network communication between the two nodes (using 
netfilter/iptables).
I did not think of testing "drbdadm disconnect".
I initially tried "/etc/init.d/drbd stop" on the secondary node, but it hung 
until network communication was cut.



Questions:

1 - Is there a bug that makes DRBD / online verification behave as if it were 
in an infinite loop, producing "sock_sendmsg time expired" messages?
2 - Would it be possible for the DRBD team to investigate this?
3 - As a workaround, is there any DRBD configuration that would, for example, 
make the primary go StandAlone (disconnect) when this error occurs?



Thank you very much for your support!

Best regards,

Ben
