I am forced to reply to myself here this morning after coming across this: 
http://old.nabble.com/Performance-regression-with-DRBD-8.3.12-and-newer-to33995000.html#a33995000 

which discusses big changes introduced in EL6 regarding how barriers and flushes 
are handled by the RAID controller. 

I just re-tested to be 100% sure - I already had disk barriers and flushes 
disabled, but using the older syntax (yes, I know it was written to be backward 
compatible for a while, but just in case ...). I changed to the newer syntax in 
my drbd.conf: 

disk-barrier no; 
disk-flushes no; 
md-flushes no; 
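(For anyone comparing configs: this is just my understanding of the syntax change, but the 8.3-style keywords map to the 8.4 ones roughly like this:)

```
# 8.3-style (deprecated)       # 8.4-style
no-disk-barrier;        -->    disk-barrier no;
no-disk-flushes;        -->    disk-flushes no;
no-md-flushes;          -->    md-flushes no;
```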

I still find that the cfq scheduler gives me 3X the performance of deadline (or 
noop, or anticipatory for that matter). Turning those 3 settings on or off 
makes no difference to the problem I am having. 
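In case anyone wants to reproduce the scheduler comparison without rebooting with a different elevator= on the kernel command line: the scheduler can be checked (and switched) per device at runtime through sysfs. A quick sketch - device names will of course differ on your box:

```shell
# List the scheduler in use for every block device; the active one is
# shown in brackets, e.g. "noop anticipatory deadline [cfq]".
for f in /sys/block/*/queue/scheduler; do
    [ -e "$f" ] || continue
    printf '%s: %s\n' "${f%/queue/scheduler}" "$(cat "$f")"
done

# To switch at runtime (as root), e.g. for sda:
#   echo cfq > /sys/block/sda/queue/scheduler
```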
Also this morning I had time to test the new read-balancing in 8.4.1: 

I tried: 
read-balancing when-congested-remote; 

Thinking that this means if HA1 is congested (I see 97%+ I/O wait and disk 
reads as low as 80MB/s during my cp test - well below what my RAID controller 
is capable of), the system can balance out to a degree by reading from the HA2 
node. But that didn't change things. Then I remembered I only had one node up 
and running, so I am in DRBD "standalone mode", which doesn't help me - can't 
read from the other node if it's not there! Am I at least understanding all of 
this correctly - is that how read-balancing is supposed to work? If I had my 
other node online, would I have expected to see a difference at all? 
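For what it's worth, my reading of the 8.4.1 docs is that read-balancing is a disk-section option and only does anything while the peers are actually Connected - in StandAlone everything has to come off the local disk anyway. Something like this (policy names as I understand them from the manual):

```
disk {
    # other policies: prefer-local (the default), prefer-remote,
    # round-robin, least-pending, 32K-striping ... 1M-striping
    read-balancing when-congested-remote;
}
```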

Recall that in my OP I am copying from the drbd0 partition to /dev/shm and 
seeing performance 3X lower than if I perform the same test from a non-DRBD 
partition to /dev/shm. 

-Thanks 


-- 
Kenneth DeChick 
Linux Systems Administrator 
-- MEDENT -- 

This message and any attachments may contain information that is protected by 
law as privileged and confidential, and is transmitted for the sole use of the 
intended recipient(s). If you are not the intended recipient, you are hereby 
notified that any use, dissemination, copying or retention of this e-mail or 
the information contained herein is strictly prohibited. If you received this 
e-mail in error, please immediately notify the sender by e-mail, and 
permanently delete this e-mail. 

----- Original Message -----

From: "Ken Dechick" <[email protected]> 
To: [email protected] 
Sent: Thursday, July 12, 2012 4:37:24 PM 
Subject: DRBD performance on IBM M5110e controller 


Hello list, 

Odd one for you today. We recently started looking at the new IBM x3650M4 
server as our next high-end machine for all clients. We have deployed several 
standalone installations at smaller clients, running CentOS 6.2 - things have 
been running very well - this is a monster of a server: a 1U rack mount w/ 16 
600GB HDDs in a RAID 10 configuration and 132GB RAM. 
A couple of days ago I began setting up a pair of machines in HA, as that's the 
next logical step, and suddenly ran into a DRBD performance issue I do not 
understand. I set this pair of machines up the same way I always did under 
CentOS 5.3: 

- 2 servers in active/passive configuration 
- IBM x3650M4 with 4.3TB DRBD partition across 16 600GB RAID10 2.5" HDDs 
- RAID controller: IBM MegaRAID M5110e (LSI SAS2208 Thunderbolt) 
- 132GB system RAM 
- CentOS 6.2 on kernel 2.6.32-220.7.1.el6.x86_64 
- heartbeat v3.0.4 
- pacemaker v1.1.6 
- DRBD v8.4.1 
- using the standard tuning I developed over the past couple of years with IBM 
hardware and the handy DRBD tuning guide: 
  - deadline scheduler via "elevator=deadline" on the kernel command line 
  - using this drbd.conf: 


global { usage-count yes; } 
common { 
    handlers { 
        pri-on-incon-degr "/usr/local/bin/support_drbd_deg"; 
        split-brain "/usr/local/bin/support_drbd_sb"; 
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; 
        fence-peer "/usr/lib64/heartbeat/drbd-peer-outdater -t 5"; 
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh"; 
    } 
    disk { 
        resync-rate 300M;   # limit the bandwidth which may be used by background 
                            # synchronizations; use 30M for 1Gb NIC, 300M for 10Gb NIC 
        al-extents 3833;    # must be prime; number of active sets 
        on-io-error detach; # what to do when the lower-level device errors 
        disk-barrier no; 
        disk-flushes no; 
        md-flushes no; 
        fencing resource-only; 
        #size 1000G;        # for setting exact size of DRBD resource - DO NOT uncomment this!! 
        #become-primary-on node-name  # use this for DRBD withOUT heartbeat 
    } 
    net { 
        protocol C; 
        verify-alg md5;     # other algorithms like sha1 or crc32c also work 
        csums-alg md5;      # other algorithms like sha1 or crc32c also work 
        #timeout 60;        # 6 seconds (unit = 0.1 seconds) 
        #connect-int 10;    # 10 seconds (unit = 1 second) 
        #ping-int 10;       # 10 seconds (unit = 1 second) 
        #ping-timeout 5;    # 500 ms (unit = 0.1 seconds) 
        unplug-watermark 131072; # flush RAID controller buffers 
        max-buffers 80000;  # datablock buffers used before writing to disk 
        max-epoch-size 20000; # set max transfer size 
        sndbuf-size 0; 
        rcvbuf-size 0; 
        ko-count 4;         # peer is dead if this count is exceeded 
        after-sb-0pri discard-zero-changes; 
        after-sb-1pri consensus; 
        after-sb-2pri disconnect; 
        rr-conflict disconnect; 
        cram-hmac-alg "sha256"; 
    } 
} 
resource drbd0 { 
    options { 
        cpu-mask 0; 
        on-no-data-accessible io-error; 
    } 
    device /dev/drbd0; 
    disk /dev/sda4; 
    meta-disk internal; 
    on mofpeasHA1 { 
        address 10.211.32.1:7789; 
    } 
    on mofpeasHA2 { 
        address 10.211.32.2:7789; 
    } 
} 

I did of course diligently read through all of the documentation for DRBD 
v8.4.1, this being my first time above the v8.3.7 we ran on CentOS 5.3. Found 
these new options that sounded flavorful: 

options { 
    cpu-mask 0; 
    on-no-data-accessible io-error; 
} 

Also found that many options had moved around (resync-rate replacing the old 
rate, no more syncer section, etc.), so I modified the config we have been 
using for a few years to reflect all these new changes I learned about. The 
above config seems to work just fine at this point. 

Doing a simple test of copying 4.5GB of data from my DRBD partition directly to 
memory (/dev/shm/.) - I have plenty of room there: 

tmpfs 64G 4.5G 59G 8% /dev/shm 

- echo 3 > /proc/sys/vm/drop_caches <- first I drop cache for an accurate test 
- time cp -rp /usr/medent/tapetest/ /dev/shm/. <- here I copy a dir with 
roughly 4.5GB of real-world data to system memory 

real 0m57.623s 
user 0m0.001s 
sys 0m0.188s 


Wow, that takes a long time - almost a full minute. That doesn't seem right, as 
this machine is blazing fast. So I clear the cache and /dev/shm, then try the 
same test but pulling the same data from a non-DRBD partition: 


- echo 3 > /proc/sys/vm/drop_caches <- first I drop cache for an accurate test 
- time cp -rp /root/tapetest/ /dev/shm/. 

real 0m7.625s 
user 0m0.064s 
sys 0m3.272s 

- Quite a large difference there!! 
I can reproduce this over and over again - it happens whether DRBD is online 
and fully replicating or if I take one node down to run without any replication 
going on, just to be sure. 
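For context, a quick back-of-the-envelope on those two timings (4.5GB in 57.6s vs 4.5GB in 7.6s, decimal GB assumed):

```shell
# Rough MB/s from the two cp runs above.
awk 'BEGIN {
    printf "via drbd0:  %.0f MB/s\n", 4500 / 57.6;  # ~78 MB/s
    printf "non-DRBD:   %.0f MB/s\n", 4500 / 7.6;   # ~592 MB/s
}'
```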

We did so much tuning in DRBD back on CentOS 5 with the IBM MegaRAID M5015 in a 
similar RAID10 that I would hate to strip my config down to the post-install 
defaults and start from scratch. 


-- 
Kenneth DeChick 
Linux Systems Administrator 
-- MEDENT -- 


Kirk to Enterprise -- beam down yeoman Rand and a six-pack. 


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
