[
https://issues.apache.org/jira/browse/HBASE-14790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135606#comment-15135606
]
Duo Zhang commented on HBASE-14790:
-----------------------------------
Fine. Let's do it in HBase.
One small problem: if we get an error, the only way to close the file is to
call recoverLease. I do not want to resend data to the datanode, and the
current DTP has no way to truncate block data from the client side, so calling
recoverLease is the only way to reach consensus on the block length. This is
not a big problem in the new design.
Also, we need to resolve HBASE-14949 before applying this patch. In the
current implementation we do not do pipeline recovery, so we are more likely
to hit a sync failure. If we simply fail the sync request, we are also more
likely to hit the inconsistency. So we need the logic described in
HBASE-14004: write the unacked entries to a new WAL file if the sync fails.
That leads to the problem described in HBASE-14949: when splitting, we may
have two WAL files with different data that map to the same name.
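To make the "write the unacked entries to a new WAL file" step concrete, here
is a minimal in-memory sketch. It is not the HBASE-14004 patch and uses no
HBase or HDFS classes: SketchWalFile, UnackedReplaySketch and syncOrRoll are
hypothetical names, and a sync failure is simulated with a flag.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a WAL file; not an HBase or HDFS class.
final class SketchWalFile {
    final String name;
    // Entries acknowledged by a successful sync.
    final List<String> synced = new ArrayList<>();
    private final List<String> buffered = new ArrayList<>();
    private final boolean failOnSync;

    SketchWalFile(String name, boolean failOnSync) {
        this.name = name;
        this.failOnSync = failOnSync;
    }

    void append(String entry) {
        buffered.add(entry);
    }

    // Returns true if every buffered entry was acked; false simulates a sync failure.
    boolean sync() {
        if (failOnSync) {
            return false;
        }
        synced.addAll(buffered);
        buffered.clear();
        return true;
    }
}

// Hypothetical writer that keeps unacked entries so they can be rewritten after a failed sync.
final class UnackedReplaySketch {
    private final Deque<String> unacked = new ArrayDeque<>();
    private SketchWalFile current;

    UnackedReplaySketch(SketchWalFile first) {
        this.current = first;
    }

    void append(String entry) {
        unacked.addLast(entry);
        current.append(entry);
    }

    // Sync the current file. On failure, switch to 'next', rewrite all unacked
    // entries in their original order, and sync again.
    boolean syncOrRoll(SketchWalFile next) {
        if (current.sync()) {
            unacked.clear();
            return true;
        }
        current = next;
        for (String entry : unacked) {
            current.append(entry);
        }
        boolean ok = current.sync();
        if (ok) {
            unacked.clear();
        }
        return ok;
    }

    SketchWalFile current() {
        return current;
    }
}
```

The sketch only covers the replay itself. It does nothing about the old file,
which still has to be closed, and it does not address the naming problem from
HBASE-14949.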
On the implementation side, the problem is that I need to share a lot of code
with the original FSHLog and related classes. This should not be a blocker,
though; let me try implementing a new WALProvider next.
As for the review, would you rather do it on GitHub, or should I upload the
patch here and to Review Board? I'm always happy to get a review. Thanks.
> Implement a new DFSOutputStream for logging WAL only
> ----------------------------------------------------
>
> Key: HBASE-14790
> URL: https://issues.apache.org/jira/browse/HBASE-14790
> Project: HBase
> Issue Type: Improvement
> Reporter: Duo Zhang
>
> The original {{DFSOutputStream}} is very powerful and aims to serve all
> purposes. But in fact, we do not need most of its features if we only want
> to log WAL. For example, we do not need pipeline recovery, since we could
> just close the old logger and open a new one. We also do not need to write
> multiple blocks, since we could open a new logger if the old file grows too
> large.
> Most importantly, it is hard to handle all the corner cases to avoid data
> loss or data inconsistency (such as HBASE-14004) when using the original
> {{DFSOutputStream}}, due to its complicated logic. The complicated logic
> also forces us to use some magical tricks to increase performance. For
> example, we need to use multiple threads to call {{hflush}} when logging,
> and we currently use 5 threads. But why 5 and not 10 or 100?
> So I propose that we implement our own {{DFSOutputStream}} for logging WAL,
> for correctness and also for performance.
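The "close the old logger and open a new one" idea in the description can be
pictured with a small sketch. It is only an illustration, not the proposed
implementation: RollingLoggerSketch and its methods are hypothetical, files
are modelled as byte counters, and the size limit is an assumed parameter
(for example, something no larger than one block).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical logger that rolls to a new file instead of recovering the
// pipeline or spanning multiple blocks. Files are modelled as byte counts only.
final class RollingLoggerSketch {
    private final long maxBytesPerFile;
    private final List<Long> closedFileSizes = new ArrayList<>();
    private long currentSize = 0;

    RollingLoggerSketch(long maxBytesPerFile) {
        this.maxBytesPerFile = maxBytesPerFile;
    }

    // Append an entry; roll first if a non-empty file would exceed the limit.
    void append(byte[] entry) {
        if (currentSize > 0 && currentSize + entry.length > maxBytesPerFile) {
            roll();
        }
        currentSize += entry.length;
    }

    // On a write error, give up on the current file and start a new one.
    void onWriteError() {
        roll();
    }

    // Close the current file (recorded by its size) and start an empty one.
    private void roll() {
        closedFileSizes.add(currentSize);
        currentSize = 0;
    }

    int closedFileCount() {
        return closedFileSizes.size();
    }

    long currentSize() {
        return currentSize;
    }
}
```

Both the size limit and the error path end in the same roll() call, which is
the point of the simplification: there is a single recovery action.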