[ 
https://issues.apache.org/jira/browse/AMQ-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853445#comment-13853445
 ] 

Vaidya Krishnamurthy commented on AMQ-4947:
-------------------------------------------

Here is a comparison between RHEL 4 and RHEL 6 with a version of kahadb that 
relies on fsync:
 RHEL 4:
 Writes:
   195843 writes of size 4096 written in 11.419 seconds.
   17150.627 writes/second.
   66.99464 megs/second.
 
 Sync Writes:
   18098 writes of size 4096 written in 10.001 seconds.
   1809.619 writes/second.
   7.0688243 megs/second.
 
 Reads:
   1408287 reads of size 4096 read in 10.001 seconds.
   140814.62 reads/second.
   550.0571 megs/second.
 

RHEL 6

 Writes:
   767995 writes of size 4096 written in 11.289 seconds.
   68030.38 writes/second.
   265.74368 megs/second.
 
 Sync Writes:
   1661 writes of size 4096 written in 10.003 seconds.
   166.05019 writes/second.
   0.64863354 megs/second.
 
 Reads:
   7430222 reads of size 4096 read in 10.001 seconds.
   742947.9 reads/second.
   2902.1401 megs/second.
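
(For reference, the derived figures in these benchmark blocks follow directly from 
the raw counts, assuming the 4096-byte block size shown: for the RHEL 6 writes 
above, 767995 / 11.289 ≈ 68030.4 writes/second, and 68030.4 * 4096 / (1024 * 1024) 
≈ 265.74 megs/second.)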
 
The current kahadb implementation relies on file.getFD().sync() to make sure that 
the contents of the db get flushed to disk.
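
For illustration, a minimal sketch of that style of flush (not the actual kahadb 
code; the file name and block size here are just placeholders) looks like this:

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class FsyncSketch {
        public static void main(String[] args) throws IOException {
            byte[] block = new byte[4096];
            try (RandomAccessFile file = new RandomAccessFile("db.data", "rw")) {
                file.write(block);    // lands in the OS page cache only
                file.getFD().sync();  // flushes data and metadata; shows up as fsync() in strace
            }
        }
    }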

In the case of getFD().sync(), we can see that it gets translated to fsync() on 
Unix/Linux. Please see below for the strace summary of the DiskBenchmark utility 
that is included in kahadb:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.47   11.206794       10817      1036       503 futex
  0.67    0.075989       37995         2           unlink
  0.59    0.066990         224       299           fsync
  0.11    0.011999        6000         2         2 restart_syscall
  0.09    0.009870           0    111479           read
  0.07    0.008294           0    111778           lseek
  0.01    0.001366           5       300           write
  0.00    0.000051          26         2           fstat
  0.00    0.000034           9         4           madvise
  0.00    0.000000           0         2           open
  0.00    0.000000           0         3           close
  0.00    0.000000           0         4           mmap
  0.00    0.000000           0        13           mprotect
  0.00    0.000000           0         6           rt_sigprocmask
  0.00    0.000000           0         4           fcntl
  0.00    0.000000           0         1           gettid
  0.00    0.000000           0         2           sched_getaffinity


fsync performance over a clustered file system gets worse because the kernel has 
to do two flushes:

 1. flush the data into the file, and
 2. flush the file metadata as well.

 Please refer to http://linux.die.net/man/2/fdatasync

 Since fdatasync still flushes any metadata that is required in order to handle 
the file correctly (it only skips metadata such as timestamps), we are not 
changing the semantics of the write, IMHO.
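
To illustrate the proposed direction (a sketch only, not the attached patch; the 
file name and block size are placeholders again), the same flush expressed through 
the file's channel looks like this:

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;

    public class FdatasyncSketch {
        public static void main(String[] args) throws IOException {
            byte[] block = new byte[4096];
            try (RandomAccessFile file = new RandomAccessFile("db.data", "rw")) {
                FileChannel channel = file.getChannel();
                file.write(block);     // lands in the OS page cache only
                channel.force(false);  // false = content only; shows up as fdatasync() in strace
            }
        }
    }

With force(true) the JVM asks for metadata to be flushed as well (fsync); with 
force(false) it only guarantees the file content, which is what maps to 
fdatasync() in the trace below.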


 Here is a sample strace summary taken after the DiskBenchmark was changed to use 
getChannel().force(false):
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.64   13.002538       17500       743       355 futex
  0.62    0.081987       40994         2           unlink
  0.47    0.061993          77       804           fdatasync
  0.10    0.012998        6499         2         2 restart_syscall
  0.08    0.010247           0    110354           lseek
  0.08    0.010162           0    109549           read
  0.02    0.002545           3       806           write
  0.00    0.000000           0         1           open
  0.00    0.000000           0         2           close
  0.00    0.000000           0         1           fstat
  0.00    0.000000           0         4           mmap
  0.00    0.000000           0        13           mprotect
  0.00    0.000000           0         6           rt_sigprocmask
  0.00    0.000000           0         4           madvise
  0.00    0.000000           0         1           dup2
  0.00    0.000000           0         2           fcntl
  0.00    0.000000           0         1           gettid
  0.00    0.000000           0         2           sched_getaffinity
------ ----------- ----------- --------- --------- ----------------
100.00   13.182470                222297       357 total

And here is a DiskBenchmark run using a kahadb snapshot version that uses fdatasync:
RHEL 6
 Writes:
   767995 writes of size 4096 written in 10.831 seconds.
   70907.12 writes/second.
   276.98093 megs/second.
 
 Sync Writes:
   59241 writes of size 4096 written in 10.001 seconds.
   5923.508 writes/second.
   23.138702 megs/second.
 
 Reads:
   7528771 reads of size 4096 read in 10.001 seconds.
   752801.8 reads/second.
   2940.632 megs/second.
 
RHEL 4
 Writes:
   193499 writes of size 4096 written in 11.371 seconds.
   17016.885 writes/second.
   66.472206 megs/second.
 
 Sync Writes:
   48167 writes of size 4096 written in 10.001 seconds.
   4816.2183 writes/second.
   18.813353 megs/second.
 
 Reads:
   1405618 reads of size 4096 read in 10.001 seconds.
   140547.75 reads/second.
   549.01465 megs/second.
 

> Reduce the reliance on fsync when writing to disk
> -------------------------------------------------
>
>                 Key: AMQ-4947
>                 URL: https://issues.apache.org/jira/browse/AMQ-4947
>             Project: ActiveMQ
>          Issue Type: Improvement
>          Components: Message Store
>    Affects Versions: 5.10.0
>         Environment: RHEL 6 and RHEL 4
>            Reporter: Vaidya Krishnamurthy
>              Labels: kahadb
>             Fix For: Unscheduled
>
>         Attachments: changes.txt
>
>
> Moving AMQ from  RHEL 4 to RHEL 6 affects performance of kahadb writes as 
> seen from the DiskBenchmark



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
