Hello list,
Odd one for you today. We recently started looking at the new IBM x3650M4
server as our next high-end machine for all clients. We have deployed several
standalone installations at smaller clients running CentOS 6.2, and things have
been running very well - this is a monster of a server: 1U rack mount with 16
600GB HDDs in a RAID 10 configuration and 132GB of RAM.
A couple of days ago I began setting up a pair of machines in HA, as that's the
next logical step. Suddenly I'm running into a DRBD performance issue I do not
understand. I set this pair of machines up the same way I always did under
CentOS 5.3:
- 2 servers in active/passive configuration:
  IBM x3650M4 with 4.3TB DRBD partition across 16 600GB RAID10 2.5" HDDs
  RAID Controller: IBM MegaRAID M5110e (LSI SAS2208 Thunderbolt)
  132GB system RAM
- CentOS 6.2 on Kernel 2.6.32-220.7.1.el6.x86_64
- heartbeat v3.0.4
- pacemaker v1.1.6
- DRBD v8.4.1
- using the standard tuning I developed over the past couple of years with IBM
hardware and the handy DRBD tuning guide:
- deadline scheduler via "elevator=deadline" in kernel command line
- using this drbd.conf:
global { usage-count yes; }
common {
    handlers {
        pri-on-incon-degr "/usr/local/bin/support_drbd_deg";
        split-brain "/usr/local/bin/support_drbd_sb";
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        fence-peer "/usr/lib64/heartbeat/drbd-peer-outdater -t 5";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    disk {
        resync-rate 300M;       # limit the bandwidth which may be used by background
                                # synchronization; use 30M for 1Gb NIC, 300M for 10Gb NIC
        al-extents 3833;        # must be prime; number of active sets
        on-io-error detach;     # what to do when the lower-level device errors
        disk-barrier no;
        disk-flushes no;
        md-flushes no;
        fencing resource-only;
        #size 1000G;            # for setting exact size of DRBD resource - DO NOT uncomment this!!
        #become-primary-on node-name;  # use this for DRBD withOUT heartbeat
    }
    net {
        protocol C;
        verify-alg md5;         # can also use sha1, crc32c, etc.
        csums-alg md5;          # can also use sha1, crc32c, etc.
        #timeout 60;            # 6 seconds (unit = 0.1 seconds)
        #connect-int 10;        # 10 seconds (unit = 1 second)
        #ping-int 10;           # 10 seconds (unit = 1 second)
        #ping-timeout 5;        # 500 ms (unit = 0.1 seconds)
        unplug-watermark 131072;  # flush RAID controller buffers
        max-buffers 80000;      # data-block buffers used before writing to disk
        max-epoch-size 20000;   # set max transfer size
        sndbuf-size 0;
        rcvbuf-size 0;
        ko-count 4;             # peer is dead if this count is exceeded
        after-sb-0pri discard-zero-changes;
        after-sb-1pri consensus;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
        cram-hmac-alg "sha256";
    }
}
resource drbd0 {
    options {
        cpu-mask 0;
        on-no-data-accessible io-error;
    }
    device /dev/drbd0;
    disk /dev/sda4;
    meta-disk internal;
    on mofpeasHA1 {
        address 10.211.32.1:7789;
    }
    on mofpeasHA2 {
        address 10.211.32.2:7789;
    }
}
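To double-check that elevator=deadline actually took effect after boot, this is the kind of quick sanity check I run (a sketch; device names under /sys/block will vary per system):

```shell
#!/bin/sh
# List the active I/O scheduler for each block device; the name shown in
# [brackets] is the one in effect, so you want to see [deadline] here.
for q in /sys/block/*/queue/scheduler; do
    [ -r "$q" ] || continue
    dev=${q%/queue/scheduler}
    printf '%s: %s\n' "${dev#/sys/block/}" "$(cat "$q")"
done
```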
I did of course diligently read through all of the documentation for DRBD
v8.4.1, this being my first time above v8.3.7 on CentOS 5.3. Found these new
options that sounded flavorful:
options {
cpu-mask 0;
on-no-data-accessible io-error;
}
Also found that many options had moved around (resync-rate replacing the old
rate, no more syncer section, etc.), so I modified the config we have been
using for a few years to reflect all these changes. The above config seems to
work just fine at this point.
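For anyone else making the same 8.3 -> 8.4 jump, the part of the change that bit me looked roughly like this (a sketch of the migration, not the full config):

```
# DRBD 8.3 style - sync tuning lived in its own syncer section:
syncer {
    rate 300M;
    al-extents 3833;
}

# DRBD 8.4 style - syncer is gone; rate becomes resync-rate and both
# options move into the disk section:
disk {
    resync-rate 300M;
    al-extents 3833;
}
```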
Doing a simple test of copying 4.5GB of data from my DRBD partition directly to
memory (/dev/shm/.) - I have plenty of room there:
tmpfs 64G 4.5G 59G 8% /dev/shm
- echo 3 > /proc/sys/vm/drop_caches <- first I drop caches for an accurate test
- time cp -rp /usr/medent/tapetest/ /dev/shm/. <- here I copy a directory with
roughly 4.5GB of real-world data to system memory
real 0m57.623s
user 0m0.001s
sys 0m0.188s
Wow, that takes a long time - almost a full minute. That doesn't seem right, as
this machine is blazing fast. So I clear the cache and /dev/shm, then run the
same test, but pulling the same data from a non-DRBD partition:
- echo 3 > /proc/sys/vm/drop_caches <- first I drop caches for an accurate test
- time cp -rp /root/tapetest/ /dev/shm/.
real 0m7.625s
user 0m0.064s
sys 0m3.272s
- Quite a large difference there!!
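For reference, here is the whole test wrapped up so it can be re-run in one shot (a sketch; the default source path is the one from my runs above, and dropping caches needs root):

```shell
#!/bin/sh
# Time a cold-cache recursive copy of SRC into tmpfs, then clean up.
# SRC defaults to my DRBD-backed test directory - adjust for your system.
SRC=${1:-/usr/medent/tapetest}
DST=/dev/shm/tapetest.$$

sync
echo 3 > /proc/sys/vm/drop_caches   # drop page/dentry/inode caches (root only)
start=$(date +%s)
cp -rp "$SRC" "$DST"
end=$(date +%s)
rm -rf "$DST"
echo "copied $SRC in $((end - start))s"
```

Run it once against the DRBD-backed directory and once against a non-DRBD path to get the same comparison as above.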
I can reproduce this over and over again - it happens whether DRBD is online
and fully replicating, or if I take one node down to run without any
replication going on, just to be sure.
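A quick way to take the filesystem and page cache out of the picture entirely is a sequential direct-I/O read straight off each block device (device names are the ones from the config above; this only reads, never writes, but double-check the command line before running):

```shell
# Read 4GB sequentially from each device with O_DIRECT, bypassing the
# page cache, so the two raw read paths can be compared directly.
dd if=/dev/drbd0 of=/dev/null bs=1M count=4096 iflag=direct
dd if=/dev/sda4  of=/dev/null bs=1M count=4096 iflag=direct
```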
We did so much tuning in DRBD back on CentOS 5 with the IBM MegaRAID M5015 in
a similar RAID10 that I would hate to strip my config down to the post-install
defaults and start from scratch.
--
Kenneth DeChick
Linux Systems Administrator
-- MEDENT --
Kirk to Enterprise -- beam down yeoman Rand and a six-pack.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems