Hello everybody!

We run 4 different (but similar) pairs of servers with 5 to 50 DRBD 8 resources 
each (one resource per LXC guest, backed by LVM2 volumes, i.e. DRBD on LVM) 
with resource sizes from 2 GB to 2 TB.

3 pairs of servers run Debian's 4.9 kernel, one pair runs 4.14.
DRBD module versions are 8.4.7 (probably with bits backported) and 8.4.10, 
respectively.

For half a year now we have been regularly running 
  drbdadm verify
on all DRBD resources, one after the other.

Unfortunately, about half of the resources show anywhere from 4 to tens of 
thousands of out-of-sync blocks (i.e. non-zero numbers after oos: in 
/proc/drbd) after most verify runs.
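
For reference, a minimal sketch of how the oos: counters could be collected 
after each verify run. It assumes the /proc/drbd layout of DRBD 8.4 (a device 
line starting with the minor number, followed by a counter line that contains 
oos:). The sample text and the function names are made up for illustration 
and are not output from our servers.

```python
import re

# Illustrative sample only, shaped like DRBD 8.4's /proc/drbd.
SAMPLE = """\
version: 8.4.7 (api:1/proto:86-101)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:812 dw:812 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4
"""

MINOR_RE = re.compile(r"^\s*(\d+):\s+cs:")  # device line, e.g. " 0: cs:Connected ..."
OOS_RE = re.compile(r"\boos:(\d+)")         # counter line, e.g. "... wo:f oos:4"


def parse_oos(text):
    """Return {minor number: oos value} for /proc/drbd-style text."""
    result = {}
    minor = None
    for line in text.splitlines():
        device = MINOR_RE.match(line)
        if device:
            minor = int(device.group(1))
            continue
        counter = OOS_RE.search(line)
        if counter and minor is not None:
            result[minor] = int(counter.group(1))
    return result


def nonzero_oos(text):
    """Keep only the devices whose oos value is not zero."""
    return {minor: oos for minor, oos in parse_oos(text).items() if oos}


if __name__ == "__main__":
    try:
        with open("/proc/drbd") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on machines without DRBD
    print(nonzero_oos(text))
```

Logging this dictionary after every run would show whether the same minors 
keep turning up or whether the affected resources change from run to run.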


This may have been asked before, but does a non-zero number after oos: in 
/proc/drbd *really* mean that there is definitely inconsistent data that the 
sync process (or rather the blocks' consistency flags) would not know about 
without the verification run?

Or could the normal sync process fool the verification by writing a block on 
the primary between the reads by the verification process?

As in:

1. verification reads block on machine A

2. block device driver writes block on primary and flags it as changed.
This knowingly creates an inconsistent state (and there's nothing wrong with 
that).

3. verification reads block on machine B

4. block device driver or sync writes block on secondary, and flags it as 
clean on the primary.
This resolves the planned temporary inconsistency, but has fooled the 
verification process into a false positive, i.e. a block reported as out of 
sync although both sides end up identical.
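
The sequence above, written out as a toy model. This only illustrates the 
hypothesis; it is not a description of how DRBD's verify actually works, and 
all names in it are hypothetical. It also assumes that "machine A" is the 
secondary: in this model, if A is the primary, both reads see the old data 
and the checksums match.

```python
import hashlib


def digest(data: bytes) -> str:
    """Checksum stand-in for whatever algorithm the verify run uses."""
    return hashlib.sha1(data).hexdigest()


def run_scenario(a_is_primary: bool):
    """Replay steps 1 to 4 on a single block.

    Returns (reported_out_of_sync, really_in_sync_afterwards).
    """
    primary = {"block": b"old"}
    secondary = {"block": b"old"}
    dirty = False  # hypothetical "changed" flag kept on the primary

    if a_is_primary:
        machine_a, machine_b = primary, secondary
    else:
        machine_a, machine_b = secondary, primary

    # 1. verification reads block on machine A
    sum_a = digest(machine_a["block"])

    # 2. write hits the primary and the block is flagged as changed
    primary["block"] = b"new"
    dirty = True

    # 3. verification reads block on machine B
    sum_b = digest(machine_b["block"])

    # 4. the write reaches the secondary and the flag is cleared
    secondary["block"] = primary["block"]
    dirty = False

    reported_out_of_sync = sum_a != sum_b
    really_in_sync = primary["block"] == secondary["block"] and not dirty
    return reported_out_of_sync, really_in_sync


if __name__ == "__main__":
    print("A is secondary:", run_scenario(a_is_primary=False))
    print("A is primary:  ", run_scenario(a_is_primary=True))
```

If the real verify process can interleave with writes in this way, a 
mismatch would be reported for a block that is identical on both nodes a 
moment later.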


So, if the oos: blocks are real data errors, we are out of ideas, other than 
hoping that an update to kernel 4.16 and DRBD module 8.4.10 might solve the 
issues.
(We are going to upgrade some nodes to 4.16, which requires activating 
AppArmor, btw.)

We have thoroughly searched (filtered kernel calls) for O_DIRECT operations, 
and have found nothing except one Oracle process (one test machine only, don't 
care) and LVM's regular reads (supposedly outside the DRBD resources).

Is there any known source for out-of-sync blocks apart from O_DIRECT, faulty 
hardware, and (unknown) bugs?


Regards, Christoph


-- 

Christoph Lechleitner

Management

------------------------------------------------------------------------
ITEG IT-Engineers GmbH | Conradstr. 5, A-6020 Innsbruck
FN 365826f | Handelsgericht Innsbruck | Mobile: +43 676 3674710
Mail: [email protected] | Web: http://www.iteg.at/
------------------------------------------------------------------------
