[ 
https://issues.apache.org/jira/browse/HBASE-14790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003560#comment-15003560
 ] 

Haohui Mai commented on HBASE-14790:
------------------------------------

Making pipeline errors visible to HBase allows HBase to detect failures and 
to recover from them much faster. This has a lot of benefits in terms of 
reducing the latency of HBase.

An Exokernel-style writer will eventually allow HBase to write to HDFS in 
parallel, which would further reduce the latency by 3x.
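
As a rough illustration of where a figure like 3x could come from, here is a 
minimal sketch. It assumes a replication factor of 3 and equal per-hop 
latency, neither of which is stated in the comment, and it models only 
latency, not the real HDFS protocol. The class and method names are 
hypothetical.

```java
import java.util.Arrays;

// Toy latency model, not HDFS code. All names here are hypothetical.
class WriteLatencyModel {

    // Pipeline write: the client sends to DN1, which forwards to DN2, which
    // forwards to DN3, so the per-hop latencies add up.
    static double pipelineLatency(double[] hopMillis) {
        return Arrays.stream(hopMillis).sum();
    }

    // Parallel (fan-out) write: the client sends to every DN at once and
    // waits for the slowest one.
    static double fanOutLatency(double[] hopMillis) {
        return Arrays.stream(hopMillis).max().orElse(0.0);
    }

    public static void main(String[] args) {
        // Assumption: three replicas, each hop costs 1 ms.
        double[] hops = {1.0, 1.0, 1.0};
        double speedup = pipelineLatency(hops) / fanOutLatency(hops);
        System.out.println(speedup); // prints 3.0
    }
}
```

Under this model the speedup equals the number of replicas only when all hops 
cost the same; with uneven hops it is smaller, because the parallel write 
still waits for the slowest DN.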

I would suggest (1) implementing the writer in the HDFS project to reduce the 
cost of maintenance, and (2) making it event-driven so that it is reusable when 
building today's {{DFSOutputStream}}. It's much harder to do so today as there 
is a lot of synchronization happening for throttling, etc.

It is relatively straightforward to implement the current client-side pipeline 
protocol without handling failures. The potential issue I see is that the DN 
might mask failures and introduce additional delays in the pipeline. Getting 
the full benefits might require changing the protocol. That being said, the 
project suddenly becomes much riskier when it requires changes on the server 
side.

A less risky route is to combine the effort with the HTTP/2 initiatives of 
HDFS, which would allow full control over both the client and the server side. 
Thoughts?

> Implement a new DFSOutputStream for logging WAL only
> ----------------------------------------------------
>
>                 Key: HBASE-14790
>                 URL: https://issues.apache.org/jira/browse/HBASE-14790
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> The original {{DFSOutputStream}} is very powerful and aims to serve all 
> purposes. But in fact, we do not need most of the features if we only want to 
> log WAL. For example, we do not need pipeline recovery since we could just 
> close the old logger and open a new one. And also, we do not need to write 
> multiple blocks since we could also open a new logger if the old file is too 
> large.
> And the most important thing is that it is hard to handle all the corner 
> cases to avoid data loss or data inconsistency (such as HBASE-14004) when 
> using the original {{DFSOutputStream}} due to its complicated logic. And the 
> complicated logic also forces us to use some magical tricks to increase 
> performance. For example, we need to use multiple threads to call {{hflush}} 
> when logging, and now we use 5 threads. But why 5, and not 10 or 100?
> So here, I propose we should implement our own {{DFSOutputStream}} when 
> logging WAL. For correctness, and also for performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
