[
https://issues.apache.org/jira/browse/CASSANDRA-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rick Branson resolved CASSANDRA-3248.
-------------------------------------
Resolution: Fixed
Fix Version/s: 1.1
> CommitLog writer should call fdatasync instead of fsync
> -------------------------------------------------------
>
> Key: CASSANDRA-3248
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3248
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.6.13, 0.7.9, 0.8.6, 1.0.0, 1.1
> Environment: Linux
> Reporter: Zhu Han
> Assignee: Rick Branson
> Fix For: 1.1
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> CommitLogSegment uses SequentialWriter to flush the buffered data to the log
> device. It depends on FileDescriptor#sync(), which invokes fsync() and so
> also forces the file attributes to disk.
> However, at least on Linux, fdatasync() is good enough for commit log flush:
> bq. fdatasync() is similar to fsync(), but does not flush modified metadata
> unless that metadata is needed in order to allow a subsequent data retrieval
> to be correctly handled. For example, changes to st_atime or st_mtime
> (respectively, time of last access and time of last modification; see
> stat(2)) do not require flushing because they are not necessary for a
> subsequent data read to be handled correctly. On the other hand, a change to
> the file size (st_size, as made by say ftruncate(2)), would require a
> metadata flush.
> File size is synced to disk by fdatasync() as well. Although the commit log
> recovery logic sorts the commit log segments by their modification timestamp,
> that dependency can be removed safely, IMHO.
> I checked the native code of JRE 6. On Linux and Solaris,
> FileChannel#force(false) invokes fdatasync(). On Windows, the false flag does
> not have any impact.
> On my log device (commodity SATA HDD, write cache disabled), there is a large
> performance gap between fsync() and fdatasync():
> {quote}
> $ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G
> --file-fsync-all=on --file-fsync-mode={color:red}fdatasync{color}
> --file-test-mode=seqwr --max-time=600 --file-block-size=2K --max-requests=0
> run
> {color:blue}54.90{color} Requests/sec executed
> per-request statistics:
> min: 8.29ms
> avg: 18.18ms
> max: 108.36ms
> approx. 95 percentile: 25.02ms
> $ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G
> --file-fsync-all=on --file-fsync-mode={color:red}fsync{color}
> --file-test-mode=seqwr --max-time=600 --file-block-size=2K --max-requests=0
> run
> {color:blue}28.08{color} Requests/sec executed
> per-request statistics:
> min: 33.28ms
> avg: 35.61ms
> max: 911.87ms
> approx. 95 percentile: 41.69ms
> {quote}
> I do think this is a very critical performance improvement.
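For illustration only, the change proposed above can be sketched with plain NIO. This is not Cassandra's actual SequentialWriter code; the class name CommitLogSyncSketch and the helper appendAndForce are hypothetical. It appends small records and calls FileChannel#force after each one, where force(false) requests a data-only sync (fdatasync() on Linux, per the description above) and force(true) also forces file metadata.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not part of Cassandra.
class CommitLogSyncSketch {

    // Appends `count` zero-filled records of `size` bytes to `file`,
    // forcing the channel after every record. Returns elapsed nanoseconds.
    // metaData=false: data-only sync; metaData=true: data plus file metadata.
    public static long appendAndForce(Path file, int count, int size, boolean metaData)
            throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.allocate(size);
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                buf.clear();                 // reset position so the full record is written again
                while (buf.hasRemaining()) {
                    ch.write(buf);           // write() may be partial, so loop until drained
                }
                ch.force(metaData);          // block until the record is on the device
            }
            return System.nanoTime() - start;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("commitlog-sketch", ".log");
        try {
            long dataOnly = appendAndForce(file, 50, 2048, false);
            long withMeta = appendAndForce(file, 50, 2048, true);
            System.out.printf("force(false): %d us, force(true): %d us%n",
                    dataOnly / 1000, withMeta / 1000);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The 2K record size mirrors the --file-block-size used in the sysbench runs. Any timing difference depends heavily on the filesystem, the device, and its write-cache settings, so the numbers printed here should not be expected to match the figures quoted above.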