[jira] [Comment Edited] (HBASE-5954) Allow proper fsync support for HBase

Lars Hofhansl (JIRA) Wed, 28 Jan 2015 21:54:47 -0800

    [ https://issues.apache.org/jira/browse/HBASE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296413#comment-14296413 ]


Lars Hofhansl edited comment on HBASE-5954 at 1/29/15 5:54 AM:
---------------------------------------------------------------

Another thought. If we had HDFS storage classes and thus the ability to place 
the WAL on SSD, would we still want to allow (and force) the application to 
decide between hflush and hsync? We could simplify and offer two options:
# hflush (what we have now)
# hsync on SSD

That would be a global HBase option.
We might want to think about a group commit option as well: wait a bit (a few 
ms, maybe, in the ballpark of a disk rotation or so), and then sync all 
accumulated edits in one go. Callers would wait until their batch is 
done. Almost all relational databases do that. In that case we can allow fsync 
even with rotating disks and lose little or none of the throughput, at the 
expense of a slight increase in tail latency (callers who got into the batch 
first have to wait until it fills up or reaches its time limit).

Here's the Postgres blurb on that: 
http://www.postgresql.org/message-id/caeylb_v5q8zdjnkb4+30_dpd3nrgfoxheurney3hsrcqsyd...@mail.gmail.com
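
To make the group commit idea concrete, here is a minimal sketch in plain Java. It is not HBase or HDFS code: the class name GroupCommitter, the maxBatch and maxWaitMillis parameters, and the syncer callback (which stands in for "write the batch, then issue one hsync") are all hypothetical names chosen for illustration. For simplicity it runs the sync while holding the lock, so new appends block while a sync is in flight; a real implementation would likely hand the batch off and sync outside the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical group-commit buffer (illustration only, not an HBase API). */
class GroupCommitter<E> implements AutoCloseable {
    private final Consumer<List<E>> syncer; // assumed: persists a batch with ONE sync
    private final int maxBatch;             // flush when this many edits accumulated
    private final long maxWaitMillis;       // ... or when the oldest edit waited this long
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    private List<E> edits = new ArrayList<>();
    private CompletableFuture<Void> done = new CompletableFuture<>();
    private long generation = 0;            // identifies the currently open batch

    GroupCommitter(Consumer<List<E>> syncer, int maxBatch, long maxWaitMillis) {
        this.syncer = syncer;
        this.maxBatch = maxBatch;
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Queue an edit; the returned future completes once its batch is synced. */
    synchronized CompletableFuture<Void> append(E edit) {
        edits.add(edit);
        CompletableFuture<Void> result = done;
        if (edits.size() >= maxBatch) {
            flush();                        // batch is full: sync right away
        } else if (edits.size() == 1) {
            // First edit of a new batch starts the time limit for that batch.
            long gen = generation;
            timer.schedule(() -> flushIfStillOpen(gen),
                    maxWaitMillis, TimeUnit.MILLISECONDS);
        }
        return result;
    }

    private synchronized void flushIfStillOpen(long gen) {
        // Skip if the batch this timer belongs to was already flushed by size.
        if (gen == generation && !edits.isEmpty()) {
            flush();
        }
    }

    /** Caller must hold the lock. Syncs the open batch and wakes its waiters. */
    private void flush() {
        List<E> batch = edits;
        CompletableFuture<Void> batchDone = done;
        edits = new ArrayList<>();
        done = new CompletableFuture<>();
        generation++;
        try {
            syncer.accept(batch);           // one sync for the whole batch
            batchDone.complete(null);
        } catch (RuntimeException e) {
            batchDone.completeExceptionally(e);
        }
    }

    @Override
    public void close() {
        synchronized (this) {
            if (!edits.isEmpty()) {
                flush();                    // do not strand waiting callers
            }
        }
        timer.shutdownNow();
    }
}
```

A caller would do something like committer.append(edit).join() and return to the client only after that. The first caller into a batch waits the longest, which is where the tail latency increase comes from.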

Just saying. I'm also happy enough with this patch for now, and this patch 
would allow the application to strategically sync some of the edits even on 
rotating media.



was (Author: lhofhansl):
Another thought. If we had HDFS storage classes and thus the ability to place 
the WAL on SSD, would we still want to allow (and force) the application to 
decide between hflush and hsync? We could simplify and offer two options:
# hflush (what we have now)
# hsync on SSD

That would be a global HBase option.
We might want to think about a group commit option as well: Wait a bit (a few 
ms, maybe, in the ballpark of a disk rotation or so), and then sync all edit 
accumulated edit in one go. Callers would wait until the accumulated batch is 
done. Almost all relational databases do that. In that case we can allow fsync 
even with rotating disks and lose less of the throughput.

Just saying. I'm also happy enough with this patch for now, and this patch 
would allow the application to strategically sync some of the edits even on 
rotating media.


> Allow proper fsync support for HBase
> ------------------------------------
>
>                 Key: HBASE-5954
>                 URL: https://issues.apache.org/jira/browse/HBASE-5954
>             Project: HBase
>          Issue Type: Improvement
>          Components: HFile, wal
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: 5954-WIP-trunk.txt, 5954-WIP-v2-trunk.txt, 
> 5954-trunk-hdfs-trunk-v2.txt, 5954-trunk-hdfs-trunk-v3.txt, 
> 5954-trunk-hdfs-trunk-v4.txt, 5954-trunk-hdfs-trunk-v5.txt, 
> 5954-trunk-hdfs-trunk-v6.txt, 5954-trunk-hdfs-trunk.txt, 5954-v3-trunk.txt, 
> hbase-hdfs-744.txt
>
>
> At least get recommendation into 0.96 doc and some numbers running w/ this 
> hdfs feature enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
