Hello list,
Odd one for you today. We recently started looking at the new IBM x3650M4
server as our next high-end machine for all clients. We have deployed several
standalone installations at smaller clients running CentOS 6.2, and things have
been running very well - this is a monster of a server: 1U rack mount with 16
600GB HDDs in a RAID 10 configuration and 132GB of RAM.
A couple of days ago I began setting up a pair of machines in HA, as that's the
next logical step. Suddenly I'm running into a DRBD performance issue I do not
understand. I set this pair of machines up the same way I always did under
CentOS 5.3:
- 2 servers in active/passive configuration:
  IBM x3650M4 with 4.3TB DRBD partition across 16 600GB RAID10 2.5" HDDs
  RAID Controller: IBM MegaRAID M5110e (LSI SAS2208 Thunderbolt)
  132GB system RAM
- CentOS 6.2 on Kernel 2.6.32-220.7.1.el6.x86_64
- heartbeat v3.0.4
- pacemaker v1.1.6
- DRBD v8.4.1
- using the standard tuning I developed over the past couple of years with IBM
hardware and the handy DRBD tuning guide:
- deadline scheduler via "elevator=deadline" in kernel command line
- using this drbd.conf:
global { usage-count yes; }
common {
    handlers {
        pri-on-incon-degr "/usr/local/bin/support_drbd_deg";
        split-brain "/usr/local/bin/support_drbd_sb";
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        fence-peer "/usr/lib64/heartbeat/drbd-peer-outdater -t 5";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    disk {
        resync-rate 300M;       # limit the bandwidth which may be used by background
                                # synchronization; use 30M for 1Gb NIC, 300M for 10Gb NIC
        al-extents 3833;        # must be prime; number of active sets
        on-io-error detach;     # what to do when the lower-level device errors
        disk-barrier no;
        disk-flushes no;
        md-flushes no;
        fencing resource-only;
        #size 1000G;            # for setting exact size of DRBD resource - DO NOT uncomment this!!
        #become-primary-on node-name;  # use this for DRBD withOUT heartbeat
    }
    net {
        protocol C;
        verify-alg md5;         # can also use sha1, crc32c, etc.
        csums-alg md5;          # can also use sha1, crc32c, etc.
        #timeout 60;            # 6 seconds (unit = 0.1 seconds)
        #connect-int 10;        # 10 seconds (unit = 1 second)
        #ping-int 10;           # 10 seconds (unit = 1 second)
        #ping-timeout 5;        # 500 ms (unit = 0.1 seconds)
        unplug-watermark 131072;  # flush RAID controller buffers
        max-buffers 80000;      # data-block buffers used before writing to disk
        max-epoch-size 20000;   # set max transfer size
        sndbuf-size 0;
        rcvbuf-size 0;
        ko-count 4;             # peer is dead if this count is exceeded
        after-sb-0pri discard-zero-changes;
        after-sb-1pri consensus;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
        cram-hmac-alg "sha256";
    }
}
resource drbd0 {
    options {
        cpu-mask 0;
        on-no-data-accessible io-error;
    }
    device /dev/drbd0;
    disk /dev/sda4;
    meta-disk internal;
    on mofpeasHA1 {
        address 10.211.32.1:7789;
    }
    on mofpeasHA2 {
        address 10.211.32.2:7789;
    }
}
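To double-check that elevator=deadline actually took effect after boot, this is the kind of quick sanity check I run (a sketch; device names under /sys/block will vary per system):

```shell
#!/bin/sh
# List the active I/O scheduler for each block device; the name shown in
# [brackets] is the one in effect, so you want to see [deadline] here.
for q in /sys/block/*/queue/scheduler; do
    [ -r "$q" ] || continue
    dev=${q%/queue/scheduler}
    printf '%s: %s\n' "${dev#/sys/block/}" "$(cat "$q")"
done
```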
I did of course diligently read through all of the documentation for DRBD
v8.4.1, this being my first time above v8.3.7 on CentOS 5.3. Found these new
options that sounded flavorful:
options {
cpu-mask 0;
on-no-data-accessible io-error;
}
Also found that many options had moved around (resync-rate replacing the old
rate, no more syncer section, etc.), so I modified the config we have been
using for a few years to reflect all these changes. The above config seems to
work just fine at this point.
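For anyone else making the same 8.3 -> 8.4 jump, the part of the change that bit me looked roughly like this (a sketch of the migration, not the full config):

```
# DRBD 8.3 style - sync tuning lived in its own syncer section:
syncer {
    rate 300M;
    al-extents 3833;
}

# DRBD 8.4 style - syncer is gone; rate becomes resync-rate and both
# options move into the disk section:
disk {
    resync-rate 300M;
    al-extents 3833;
}
```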
Doing a simple test of copying 4.5GB of data from my DRBD partition directly to
memory (/dev/shm/.) - I have plenty of room there:
tmpfs 64G 4.5G 59G 8% /dev/shm
- echo 3 > /proc/sys/vm/drop_caches <- first I drop caches for an accurate test
- time cp -rp /usr/medent/tapetest/ /dev/shm/. <- here I copy a directory with
roughly 4.5GB of real-world data to system memory
real 0m57.623s
user 0m0.001s
sys 0m0.188s
Wow, that takes a long time - almost a full minute. That doesn't seem right, as
this machine is blazing fast. So I clear the cache and /dev/shm, then run the
same test, but pulling the same data from a non-DRBD partition:
- echo 3 > /proc/sys/vm/drop_caches <- first I drop caches for an accurate test
- time cp -rp /root/tapetest/ /dev/shm/.
real 0m7.625s
user 0m0.064s
sys 0m3.272s
- Quite a large difference there!!
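For reference, here is the whole test wrapped up so it can be re-run in one shot (a sketch; the default source path is the one from my runs above, and dropping caches needs root):

```shell
#!/bin/sh
# Time a cold-cache recursive copy of SRC into tmpfs, then clean up.
# SRC defaults to my DRBD-backed test directory - adjust for your system.
SRC=${1:-/usr/medent/tapetest}
DST=/dev/shm/tapetest.$$

sync
echo 3 > /proc/sys/vm/drop_caches   # drop page/dentry/inode caches (root only)
start=$(date +%s)
cp -rp "$SRC" "$DST"
end=$(date +%s)
rm -rf "$DST"
echo "copied $SRC in $((end - start))s"
```

Run it once against the DRBD-backed directory and once against a non-DRBD path to get the same comparison as above.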
I can reproduce this over and over again - it happens whether DRBD is online
and fully replicating, or if I take one node down to run without any
replication going on, just to be sure.
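A quick way to take the filesystem and page cache out of the picture entirely is a sequential direct-I/O read straight off each block device (device names are the ones from the config above; this only reads, never writes, but double-check the command line before running):

```shell
# Read 4GB sequentially from each device with O_DIRECT, bypassing the
# page cache, so the two raw read paths can be compared directly.
dd if=/dev/drbd0 of=/dev/null bs=1M count=4096 iflag=direct
dd if=/dev/sda4  of=/dev/null bs=1M count=4096 iflag=direct
```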
We did so much tuning in DRBD back on CentOS 5 with the IBM MegaRAID M5015 in
a similar RAID10 that I would hate to strip my config down to the post-install
defaults and start from scratch.
--
Kenneth DeChick
Linux Systems Administrator
-- MEDENT --
Kirk to Enterprise -- beam down yeoman Rand and a six-pack.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems