[ 
https://issues.apache.org/jira/browse/HBASE-14790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135542#comment-15135542
 ] 

stack commented on HBASE-14790:
-------------------------------

First, what you think of the 2x number? I think it is kinda cool.

On #1, would be cool if we could have something working here first, even if 
reflection (we have loads of that already). Then we go to the HDFS lads. Our 
using it and 2x perf will be a good story to tell.

On #2, how hard to do as a WALProvider? You know my story on 'options'. A flag 
in DefaultWALProvider is probably fine for a while but we should just cut over 
to async unless you can think of downsides.

Want a review now or should I wait?

> Implement a new DFSOutputStream for logging WAL only
> ----------------------------------------------------
>
>                 Key: HBASE-14790
>                 URL: https://issues.apache.org/jira/browse/HBASE-14790
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> The original {{DFSOutputStream}} is very powerful and aims to serve all 
> purposes. But in fact, we do not need most of the features if we only want to 
> log WAL. For example, we do not need pipeline recovery since we could just 
> close the old logger and open a new one. And also, we do not need to write 
> multiple blocks since we could also open a new logger if the old file is too 
> large.
> And the most important thing is that, it is hard to handle all the corner 
> cases to avoid data loss or data inconsistency(such as HBASE-14004) when 
> using original DFSOutputStream due to its complicated logic. And the 
> complicated logic also force us to use some magical tricks to increase 
> performance. For example, we need to use multiple threads to call {{hflush}} 
> when logging, and now we use 5 threads. But why 5 not 10 or 100?
> So here, I propose we should implement our own {{DFSOutputStream}} when 
> logging WAL. For correctness, and also for performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to