[
https://issues.apache.org/jira/browse/HBASE-19024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209147#comment-16209147
]
Vikas Vishwakarma commented on HBASE-19024:
-------------------------------------------
[~anoop.hbase] we carried out a few combinations of tests, since this was part of a
larger story to enable WAL on SSD. We looked at:
* hflush HDD vs SSD
* hsync HDD vs SSD
* HDD hflush vs SSD hsync
* HDD hflush vs HDD hsync
Each test was carried out for both small batches of a few hundred bytes and large
batches of 1 MB and 10 MB.
We used a multithreaded native HBase write loader for the tests, which does batch
puts of 100 bytes, 1 MB, and 10 MB using random data. We measured the latency of
each batch put as well as the total time taken for the loader to complete a few
million rows. Our observation:
* on HDD, using hsync instead of hflush causes a 10-15% latency degradation
The SSD results are somewhat controversial and not in line with conventional belief,
and we had a long discussion and experimentation phase on them. They also
depend on the type and grade of SSD being used (value or low-grade SSD vs
enterprise SSD) and on other factors, so I am not posting those results here, as
this jira is independent of SSD anyway :)
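
For anyone who wants to try the shape of this measurement without a cluster,
here is a minimal sketch. It is not the loader used in the tests above. It is
single-threaded, writes to a local file instead of a WAL on HDFS, and treats
FileChannel.force(true) as a rough local analog of hsync and a plain write as
the analog of hflush. The class name SyncLatencySketch and its methods are
hypothetical, and the numbers it prints say nothing about HBase itself.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Random;

public class SyncLatencySketch {

    /**
     * Writes `batches` batches of `batchBytes` random bytes each and returns
     * the latency of every batch in nanoseconds.
     */
    public static long[] timeBatches(Path file, int batches, int batchBytes,
                                     boolean forceToDisk) throws IOException {
        long[] latencies = new long[batches];
        byte[] payload = new byte[batchBytes];
        Random random = new Random(42);
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int i = 0; i < batches; i++) {
                random.nextBytes(payload);
                ByteBuffer buffer = ByteBuffer.wrap(payload);
                long start = System.nanoTime();
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                if (forceToDisk) {
                    // Assumption: used here as a local stand-in for hsync.
                    channel.force(true);
                }
                latencies[i] = System.nanoTime() - start;
            }
        }
        return latencies;
    }

    /** Mean latency in microseconds; 0.0 for an empty input. */
    public static double meanMicros(long[] latencies) {
        long total = 0;
        for (long latency : latencies) {
            total += latency;
        }
        return latencies.length == 0 ? 0.0 : total / 1000.0 / latencies.length;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("wal-sketch", ".bin");
        try {
            int[] sizes = {100, 1 << 20}; // 100 bytes and 1 MB batches
            for (int size : sizes) {
                double writeOnly = meanMicros(timeBatches(file, 20, size, false));
                double forced = meanMicros(timeBatches(file, 20, size, true));
                System.out.printf("batch=%d bytes  write-only=%.1f us  force=%.1f us%n",
                        size, writeOnly, forced);
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A real comparison would have to go through the HBase client against a running
cluster, with several writer threads and a few million rows, as described above.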
> provide a configurable option to hsync WAL edits to the disk for better
> durability
> ----------------------------------------------------------------------------------
>
> Key: HBASE-19024
> URL: https://issues.apache.org/jira/browse/HBASE-19024
> Project: HBase
> Issue Type: Improvement
> Components: wal
> Environment:
> Reporter: Vikas Vishwakarma
>
> At present we do not have an option to hsync WAL edits to the disk for better
> durability. In our local tests we see a 10-15% latency impact from using hsync
> instead of hflush, which is not very high.
> We should have a configurable option to hsync WAL edits instead of just
> sync/hflush, which will call the corresponding API on the hadoop side.
> Currently HBase handles both SYNC_WAL and FSYNC_WAL the same way, calling
> FSDataOutputStream sync/hflush on the hadoop side. This can be modified to
> let FSYNC_WAL call hsync on the hadoop side instead of sync/hflush. We can
> keep sync as the default, matching the current behavior, and hsync can be
> enabled based on explicit configuration.
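
The proposed dispatch can be sketched roughly as follows. This is an
illustration under assumptions, not the HBase implementation: the
SyncableStream interface is a local stand-in for the hflush/hsync pair on
Hadoop's FSDataOutputStream, the Durability enum here carries only the two
values discussed in this issue, and the hsyncEnabled switch is a hypothetical
placeholder for whatever configuration option is eventually chosen.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class WalSyncSketch {

    /** Only the two WAL-syncing durability values discussed in this issue. */
    public enum Durability { SYNC_WAL, FSYNC_WAL }

    /** Local stand-in for the hflush/hsync calls on the Hadoop output stream. */
    public interface SyncableStream {
        void hflush() throws IOException;
        void hsync() throws IOException;
    }

    /**
     * Chooses the sync call for a WAL edit. hsync is used only when the edit
     * asks for FSYNC_WAL and the (hypothetical) hsyncEnabled switch is on;
     * every other case keeps the current hflush behavior.
     */
    public static void sync(SyncableStream out, Durability durability,
                            boolean hsyncEnabled) throws IOException {
        if (hsyncEnabled && durability == Durability.FSYNC_WAL) {
            out.hsync();
        } else {
            out.hflush();
        }
    }

    /** Test double that records which call was made. */
    public static class RecordingStream implements SyncableStream {
        public final List<String> calls = new ArrayList<>();
        @Override public void hflush() { calls.add("hflush"); }
        @Override public void hsync() { calls.add("hsync"); }
    }

    public static void main(String[] args) throws IOException {
        RecordingStream out = new RecordingStream();
        sync(out, Durability.SYNC_WAL, true);   // SYNC_WAL always uses hflush
        sync(out, Durability.FSYNC_WAL, false); // switch off: current behavior
        sync(out, Durability.FSYNC_WAL, true);  // switch on: hsync
        System.out.println(out.calls); // prints [hflush, hflush, hsync]
    }
}
```

With the switch off, FSYNC_WAL behaves exactly as it does today, which keeps
the default unchanged as the description suggests.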
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)