https://bugzilla.redhat.com/show_bug.cgi?id=620053
testio.cpp: https://bugzilla.redhat.com/attachment.cgi?id=435768

Summary: Multithreaded I/O behaves as if serialized in some situations.

Description of problem:

DEV1 is a block device with some partitions. DEV2 is a block device without any data. Both DEV1 and DEV2 are hardware RAID-5 arrays.

Every time, just after the system reboots, random access to DEV2 by multiple threads behaves as if serialized and is very slow; at the same time DEV1 is fast. Normally, accessing a block device with 10 threads concurrently should be faster than with a single thread, given an equal total number of I/O requests. But a blank block device behaves differently.

Stranger still: if I insert a sleep between each I/O, the throughput goes up, and when I then remove the sleep, the throughput stays as high as DEV1's.

Version-Release number of selected component (if applicable):

kernel-2.6.18-194.el5
kernel-2.6.18-194.8.1.el5
kernel-2.6.18-164.15.1.el5
glibc-2.5-49.el5_5.2

How reproducible:

100% on my 2 servers. The two have different RAID controllers.

Steps to Reproduce:

]# g++ testio.cpp -o testio -g -O2 -pthread
]# dd if=/dev/zero of=/dev/DEV2 bs=1M count=1024
]# reboot
...
]# time ./testio -r /dev/DEV1 10 4096 0 1024000000 4096 32768 100
..........<<<<<<<<<<
real    0m1.457s
user    0m0.001s
sys     0m0.021s

]# time ./testio -r /dev/DEV2 10 4096 0 1024000000 4096 32768 100
..........<<<<<<<<<<
real    0m11.596s
user    0m0.004s
sys     0m0.035s

# Now, I insert a sleep of 50 ms; speed goes up
]# time ./testio -r -s50000 /dev/DEV2 10 4096 0 1024000000 4096 32768 100
real    0m6.091s
user    0m0.001s
sys     0m0.019s

# Now, I remove the sleep; speed stays high
]# time ./testio -r /dev/DEV2 10 4096 0 1024000000 4096 32768 100
..........<<<<<<<<<<
real    0m1.416s
user    0m0.002s
sys     0m0.022s

Expected results:

Additional info:

I have tried booting with a Debian 5 live CD; there everything works fine. I have tried swapping DEV1 and DEV2; the blank one is always the slow one. It seems that the threads are sometimes serialized.
You can check the strace log:

strace -f -s0 -etrace=pread ./testio -r /dev/DEV2 10 4096 0 1024000000 4096 32768 100 2>&1 | fgrep -v '<...' | cut -d ' ' -f1-4 | uniq -c | less

When it is slow:

     10 [pid 4993] pread(10,
     10 [pid 4995] pread(12,
      8 [pid 4994] pread(11,
      8 [pid 4986] pread(3,
      7 [pid 4987] pread(4,
      9 [pid 4988] pread(5,
      9 [pid 4989] pread(6,
      9 [pid 4990] pread(7,
      8 [pid 4991] pread(8,
      8 [pid 4992] pread(9,
      8 [pid 4993] pread(10,
      9 [pid 4995] pread(12,
     10 [pid 4994] pread(11,
      9 [pid 4986] pread(3,
      9 [pid 4987] pread(4,
      9 [pid 4988] pread(5,
     10 [pid 4989] pread(6,
      7 [pid 4990] pread(7,
      8 [pid 4991] pread(8,
      9 [pid 4992] pread(9,
     10 [pid 4993] pread(10,
      6 [pid 4995] pread(12,
      8 [pid 4994] pread(11,
      7 [pid 4986] pread(3,
      8 [pid 4987] pread(4,

When it is fast:

      1 [pid 5151] pread(9,
      1 [pid 5152] pread(10,
      1 [pid 5147] pread(5,
      1 [pid 5149] pread(7,
      1 [pid 5150] pread(8,
      1 [pid 5147] pread(5,
      1 [pid 5151] pread(9,
      1 [pid 5145] pread(3,
      1 [pid 5153] pread(11,
      1 [pid 5150] pread(8,
      1 [pid 5146] pread(4,
      1 [pid 5154] pread(12,
      1 [pid 5150] pread(8,
      1 [pid 5149] pread(7,
      1 [pid 5147] pread(5,
      1 [pid 5146] pread(4,
      1 [pid 5149] pread(7,
      1 [pid 5152] pread(10,
      1 [pid 5150] pread(8,
      1 [pid 5148] pread(6,
      1 [pid 5147] pread(5,
      1 [pid 5149] pread(7,
      1 [pid 5151] pread(9,
      1 [pid 5152] pread(10,
      2 [pid 5151] pread(9,
      1 [pid 5147] pread(5,
      1 [pid 5152] pread(10,
      1 [pid 5151] pread(9,
      1 [pid 5147] pread(5,
      1 [pid 5150] pread(8,
      1 [pid 5147] pread(5,
      1 [pid 5145] pread(3,

You can see that in the slow situation the threads do NOT run concurrently: one thread issues around 10 I/Os, then another. In the fast situation the threads run concurrently.

Regards,
Kirby Zhou
from SOHU-RD
+86-10-6272-8261

_______________________________________________
rhelv5-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/rhelv5-list
