Hey guys,
So we've recently upgraded to DRBD 8.4.2, and have been noticing some... odd
behavior. Here's a sar extract for a glitch we noticed last night:
12:00:01 AM       DEV      tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz    await  svctm  %util
10:08:01 PM  dev251-0   833.61  43883.30  13928.83     69.35      1.21     1.46   0.07   6.11
10:08:01 PM  dev147-0   829.07  43883.30  13890.15     69.68   2119.54  1126.62   0.84  69.91
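To put numbers on that, here's the per-layer delta from the two lines above (just arithmetic on the sar values, nothing DRBD-specific):

```python
# Values copied from the sar extract: backing device (dev251-0) vs DRBD device (dev147-0).
backing = {"await_ms": 1.46,    "avgqu_sz": 1.21,    "util_pct": 6.11}
drbd    = {"await_ms": 1126.62, "avgqu_sz": 2119.54, "util_pct": 69.91}

extra_await = drbd["await_ms"] - backing["await_ms"]  # extra latency per request at the DRBD layer
extra_util  = drbd["util_pct"] - backing["util_pct"]  # extra device utilization

print(f"extra await: {extra_await:.2f} ms, extra %util: {extra_util:.2f}")
# → extra await: 1125.16 ms, extra %util: 63.80
```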
dev251 is the backing device, and 147 is the corresponding DRBD device. I
understand there would be some kind of overhead with DRBD, but 60% with an
await of over a second, which drives the queue up to 2000+ items, seems a
little wrong. For reference, here's the DRBD config for that device, from
drbdsetup:
resource edb {
    options {
    }
    net {
        max-buffers       131072;
        verify-alg        "md5";
    }
    _remote_host {
        address           ipv4 10.2.128.207:7788;
    }
    _this_host {
        address           ipv4 10.2.128.208:7788;
        volume 0 {
            device        minor 0;
            disk          "/dev/fioa";
            meta-disk     internal;
            disk {
                fencing         resource-only;
                disk-flushes    no;
                md-flushes      no;
                resync-rate     307200k; # bytes/second
                c-fill-target   6144s; # bytes
                c-max-rate      307200k; # bytes/second
            }
        }
    }
}
I've adjusted the c-fill-target based on a 10G link with a 0.1ms average ping
time, and set max-buffers to its highest setting for similar reasons. While
that did clear up some of the issues we were seeing, every once in a while we
still get spikes like the one in the sar extract above.
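For reference, the bandwidth-delay product I started from works out roughly like this (a back-of-the-envelope sketch; the sector conversion assumes drbdsetup's "s" suffix means 512-byte sectors):

```python
# Bandwidth-delay product for the link described above: 10 Gbit/s, ~0.1 ms RTT.
# Purely illustrative arithmetic, not a tuning recommendation.
link_bps = 10 * 10**9            # 10 Gbit/s
rtt_s    = 0.1 / 1000            # 0.1 ms round trip
bdp_bytes = link_bps / 8 * rtt_s
bdp_sectors = bdp_bytes / 512    # assuming 512-byte sectors, as in the "6144s" above

print(f"BDP ≈ {bdp_bytes:.0f} bytes ({bdp_sectors:.0f} sectors)")
# → BDP ≈ 125000 bytes (244 sectors)
```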
Am I missing something?
--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
[email protected]
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user