[
https://issues.apache.org/jira/browse/AMQ-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853445#comment-13853445
]
Vaidya Krishnamurthy commented on AMQ-4947:
-------------------------------------------
Here is a comparison between RHEL 4 and RHEL 6 with a version of KahaDB that
relies on fsync.
RHEL 4:
Writes:
195843 writes of size 4096 written in 11.419 seconds.
17150.627 writes/second.
66.99464 megs/second.
Sync Writes:
18098 writes of size 4096 written in 10.001 seconds.
1809.619 writes/second.
7.0688243 megs/second.
Reads:
1408287 reads of size 4096 read in 10.001 seconds.
140814.62 reads/second.
550.0571 megs/second.
RHEL 6:
Writes:
767995 writes of size 4096 written in 11.289 seconds.
68030.38 writes/second.
265.74368 megs/second.
Sync Writes:
1661 writes of size 4096 written in 10.003 seconds.
166.05019 writes/second.
0.64863354 megs/second.
Reads:
7430222 reads of size 4096 read in 10.001 seconds.
742947.9 reads/second.
2902.1401 megs/second.
The current KahaDB implementation relies on file.getFD().sync() to make sure
that the contents of the db get flushed to disk.
On Unix/Linux, getFD().sync() translates to the fsync() system call. Please
see below for the output of an strace on the DiskBenchmark utility that ships
with KahaDB:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
98.47 11.206794 10817 1036 503 futex
0.67 0.075989 37995 2 unlink
0.59 0.066990 224 299 fsync
0.11 0.011999 6000 2 2 restart_syscall
0.09 0.009870 0 111479 read
0.07 0.008294 0 111778 lseek
0.01 0.001366 5 300 write
0.00 0.000051 26 2 fstat
0.00 0.000034 9 4 madvise
0.00 0.000000 0 2 open
0.00 0.000000 0 3 close
0.00 0.000000 0 4 mmap
0.00 0.000000 0 13 mprotect
0.00 0.000000 0 6 rt_sigprocmask
0.00 0.000000 0 4 fcntl
0.00 0.000000 0 1 gettid
0.00 0.000000 0 2 sched_getaffinity
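To make the sync path concrete, here is a minimal sketch (not the actual DiskBenchmark code; class and method names are illustrative) of a write loop that flushes via RandomAccessFile.getFD().sync(), which is what shows up as fsync() in the strace above:

```java
import java.io.File;
import java.io.RandomAccessFile;

public class FsyncSketch {
    // Write 4 KiB blocks for roughly `millis` ms, calling getFD().sync()
    // after each write; returns the number of synced writes completed.
    static long syncWrites(File file, long millis) throws Exception {
        byte[] block = new byte[4096];
        long deadline = System.currentTimeMillis() + millis;
        long writes = 0;
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            while (System.currentTimeMillis() < deadline) {
                raf.write(block);
                // getFD().sync() maps to fsync(2) on Linux: it flushes both
                // the file data and its metadata before returning.
                raf.getFD().sync();
                writes++;
            }
        }
        return writes;
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("fsync-bench", ".dat");
        f.deleteOnExit();
        System.out.println(syncWrites(f, 1000) + " synced 4096-byte writes in ~1s");
    }
}
```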
fsync performance over a clustered file system gets worse because two kernel
operations have to be done:
1. flush the data into the file, and
2. flush the metadata as well.
Please refer to http://linux.die.net/man/2/fdatasync
Since fdatasync still flushes any metadata that is required to handle the file
correctly, we are not changing the semantics of the write, IMHO.
Here is a sample strace output after DiskBenchmark was changed to use
getChannel().force(false):
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
98.64 13.002538 17500 743 355 futex
0.62 0.081987 40994 2 unlink
0.47 0.061993 77 804 fdatasync
0.10 0.012998 6499 2 2 restart_syscall
0.08 0.010247 0 110354 lseek
0.08 0.010162 0 109549 read
0.02 0.002545 3 806 write
0.00 0.000000 0 1 open
0.00 0.000000 0 2 close
0.00 0.000000 0 1 fstat
0.00 0.000000 0 4 mmap
0.00 0.000000 0 13 mprotect
0.00 0.000000 0 6 rt_sigprocmask
0.00 0.000000 0 4 madvise
0.00 0.000000 0 1 dup2
0.00 0.000000 0 2 fcntl
0.00 0.000000 0 1 gettid
0.00 0.000000 0 2 sched_getaffinity
------ ----------- ----------- --------- --------- ----------------
100.00 13.182470 222297 357 total
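The change itself can be sketched as follows (again illustrative, not the attached patch; names are hypothetical): the same write loop, but flushing through the FileChannel, which maps to fdatasync() in the strace above:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;

public class FdatasyncSketch {
    // Same 4 KiB write loop, but flushing via getChannel().force(false),
    // which maps to fdatasync(2) on Linux: the data (plus only the metadata
    // needed to read it back) is flushed, skipping non-essential metadata
    // updates such as the modification time.
    static long syncWrites(File file, long millis) throws Exception {
        byte[] block = new byte[4096];
        long deadline = System.currentTimeMillis() + millis;
        long writes = 0;
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            FileChannel channel = raf.getChannel();
            while (System.currentTimeMillis() < deadline) {
                raf.write(block);
                channel.force(false); // false = don't force metadata updates
                writes++;
            }
        }
        return writes;
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("fdatasync-bench", ".dat");
        f.deleteOnExit();
        System.out.println(syncWrites(f, 1000) + " synced 4096-byte writes in ~1s");
    }
}
```

Passing false to force() is the whole change: with true it behaves like fsync, with false the kernel is free to use the cheaper fdatasync path.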
And here is a DiskBenchmark run using a KahaDB snapshot version that uses fdatasync:
RHEL 6:
Writes:
767995 writes of size 4096 written in 10.831 seconds.
70907.12 writes/second.
276.98093 megs/second.
Sync Writes:
59241 writes of size 4096 written in 10.001 seconds.
5923.508 writes/second.
23.138702 megs/second.
Reads:
7528771 reads of size 4096 read in 10.001 seconds.
752801.8 reads/second.
2940.632 megs/second.
RHEL 4:
Writes:
193499 writes of size 4096 written in 11.371 seconds.
17016.885 writes/second.
66.472206 megs/second.
Sync Writes:
48167 writes of size 4096 written in 10.001 seconds.
4816.2183 writes/second.
18.813353 megs/second.
Reads:
1405618 reads of size 4096 read in 10.001 seconds.
140547.75 reads/second.
549.01465 megs/second.
> Reduce the reliance on fsync when writing to disk
> -------------------------------------------------
>
> Key: AMQ-4947
> URL: https://issues.apache.org/jira/browse/AMQ-4947
> Project: ActiveMQ
> Issue Type: Improvement
> Components: Message Store
> Affects Versions: 5.10.0
> Environment: RHEL 6 and RHEL 4
> Reporter: Vaidya Krishnamurthy
> Labels: kahadb
> Fix For: Unscheduled
>
> Attachments: changes.txt
>
>
> Moving AMQ from RHEL 4 to RHEL 6 affects performance of kahadb writes as
> seen from the DiskBenchmark
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)