Hi,

>> [root@nfs01 nfs]# cat /proc/drbd
>> version: 8.3.8 (api:88/proto:86-94)
> 
> Really do an upgrade! ... elrepo seems to have latest DRBD 8.3.12 packages

Thanks for the hint, we might consider that if nothing else helps :-)

Not that we don't want the newer version. It's the unofficial repository
that is the problem here. We are quite hesitant about unofficial repos,
because that system hosts hundreds of customers.

>> Why these resyncs happen and so much data is being resynced, is another
>> case. The nodes were disconnected for 3-4 Minutes which does not justify
>> so much data. Anyways...
> 
> If you adjust your resource after changing an disk option the disk is
> detached/attached ... this means syncing the complete AL when done on a
> primary ... 3833*4MB=15332MB

Great! Thanks for the insight. I'm really learning some stuff about DRBD
here!
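
Just to spell out the arithmetic from the quote for myself, a quick
shell check. The 3833 and the 4 MB per extent are taken straight from
the quoted line above; the variable names are only mine.

```shell
# Worst case after a detach/attach on the primary, per the quote above:
# every extent covered by the activity log (AL) has to be resynced.
al_extents=3833   # count taken from the quoted calculation
extent_mb=4       # MB per AL extent, as stated in the quote
resync_mb=$((al_extents * extent_mb))
echo "${resync_mb} MB to resync"
```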

>> After issueing the mentioned dd command
>>
>> $ dd if=/dev/zero of=./test-data.dd bs=4096 count=10240
>> 10240+0 records in
>> 10240+0 records out
>> 41943040 bytes (42 MB) copied, 0.11743 seconds, 357 MB/s
> 
> you benchmark your page cache here ... add oflag=direct to dd to bypass it

Now this makes me shiver and laugh at the same time (shortened the output):

####
[root@nfs01 nfs]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240
41943040 bytes (42 MB) copied, 24.7257 seconds, 1.7 MB/s

[root@nfs01 nfs]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
41943040 bytes (42 MB) copied, 25.9601 seconds, 1.6 MB/s

[root@nfs01 nfs]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
41943040 bytes (42 MB) copied, 44.4078 seconds, 944 kB/s

[root@nfs01 nfs]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
30384128 bytes (42 MB) copied, 26.9182 seconds, 1.3 MB/s
####

The load rises a little while doing this (to about 3-4), but the system
remains usable.

> looks like I/O system or network is fully saturated

To me it looks more like some DRBD cache setting is broken somewhere.

On an LVM volume without DRBD, dd works fine (I shortened the output):

####
[root@nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
41943040 bytes (42 MB) copied, 0.738741 seconds, 56.8 MB/s

[root@nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
41943040 bytes (42 MB) copied, 0.746778 seconds, 56.2 MB/s

[root@nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
41943040 bytes (42 MB) copied, 0.733518 seconds, 57.2 MB/s

[root@nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
41943040 bytes (42 MB) copied, 0.736617 seconds, 56.9 MB/s

[root@nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
41943040 bytes (42 MB) copied, 0.73078 seconds, 57.4 MB/s
####
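
To put a number on the gap between the two, here is a small sketch that
recomputes the rate from the byte count and elapsed time dd prints. The
helper name rate_mb is made up for this mail, and MB means 10^6 bytes,
the same way dd reports it.

```shell
# rate_mb BYTES SECONDS -> throughput in MB/s (1 MB = 10^6 bytes, like dd)
rate_mb() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

drbd=$(rate_mb 41943040 25.9601)    # first oflag=direct run on the DRBD device
lvm=$(rate_mb 41943040 0.738741)    # first run on the plain LVM volume

echo "DRBD: ${drbd} MB/s, LVM: ${lvm} MB/s"
# Ratio of the two rates, i.e. how much slower the DRBD-backed write is.
awk -v d="$drbd" -v l="$lvm" 'BEGIN { printf "slowdown factor: %.1f\n", l / d }'
```

Same block size, same count, same flags, so the difference has to come
from somewhere between the filesystem and the backing LVM volume.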

The network link is also just fine. We've tested it at almost
100 MB/s (that is megabytes) of throughput. The only possible limit here
would be the syncer rate of 25 MB/s, but the network link is only
saturated during a resync.

Any more ideas with this info?

best regards
volker
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user