Hi,
We have been experiencing a strange replication problem since we started
using protocol B.
The scenario is the following:
Some binary files are saved to the replicated IO pair (kernel 3.0.13,
drbd-8.3.12, protocol B, ext3).
Later they are copied to another (also replicated) directory.
They are still consistent, and there is no problem until io1 (the
current Primary) is rebooted.
Strangely, it takes a reboot: a forced role change does not trigger the
symptom.
io2 takes the Primary role, and when the cluster starts using the binary
files they fail their checksums.
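
For anyone trying to reproduce this, the consistency check can be
sketched roughly as follows: record a checksum per file while io1 is
Primary, then compare after io2 has taken over. This is only a minimal
sketch. The function names and the choice of SHA-256 are assumptions
for illustration, not what our cluster software actually uses.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def diff_manifests(before, after):
    """Return sorted paths that are missing or whose digest changed."""
    return sorted(p for p in before if after.get(p) != before[p])
```

The manifest taken before the reboot has to be stored somewhere outside
the replicated device, otherwise it is exposed to the same problem.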
We have turned off the write cache on the SAS disks ( sdparam --set WCE=0
/dev/sda )
and the symptom seemed to disappear, but later it surfaced again.
The corrupted binary files each have a hole of about 40 kbytes filled with
zeros.
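
A hole like this can be located with a small script that scans a file
block by block and reports long runs of zero bytes. The following is a
minimal sketch, not a tool we actually used: the function name, the
4096-byte block size and the two-block threshold are assumptions, and it
only detects zero runs at block granularity.

```python
def find_zero_runs(path, block_size=4096, min_blocks=2):
    """Return (offset, length) for runs of all-zero blocks.

    Only runs of at least min_blocks consecutive zero blocks are reported.
    """
    runs = []
    run_start = None  # offset where the current zero run began, if any
    offset = 0
    min_len = min_blocks * block_size
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            if block.count(0) == len(block):
                # All-zero block: start a run unless one is already open
                if run_start is None:
                    run_start = offset
            else:
                # Non-zero data closes any open run
                if run_start is not None and offset - run_start >= min_len:
                    runs.append((run_start, offset - run_start))
                run_start = None
            offset += len(block)
    # A run may extend to the end of the file
    if run_start is not None and offset - run_start >= min_len:
        runs.append((run_start, offset - run_start))
    return runs
```

Legitimate zero padding inside a binary shows up as well, so the output
is only meaningful when compared against a known-good copy of the file.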
Yes, it could be a HW issue, but we did not see it with protocol C
(which is unfortunately deadly slow on our system).
Has anyone seen something similar?
Thanks,
Akos
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user