[ 
https://issues.apache.org/jira/browse/HBASE-14790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-14790:
------------------------------
    Release Note: 
Implement FanOutOneBlockAsyncDFSOutput, an output stream for writing WAL only. 
The WAL provider that uses this class is AsyncFSWALProvider.

It is based on netty and writes to 3 DNs concurrently (fan-out), so it 
generally gives lower latency. It is also fail-fast: the stream becomes 
unwritable immediately after any read/write error, and there is no pipeline 
recovery. In that case you need to call recoverLease to force close the output. 
It only supports writing a file with a single block. For WAL this behavior is 
fine, as we can always open a new file when the old one is broken. The 
performance analysis in HBASE-16890 shows that it has better performance.

Behavior changes:
1. As we now write to 3 DNs concurrently, under the visibility guarantee of 
HDFS the data becomes visible as soon as it arrives at a DN, since every DN is 
treated as the last one in the pipeline. This means replication may read 
uncommitted data and replicate it to the remote cluster, causing data 
inconsistency. HBASE-14004 addresses this problem.
2. There will be no sync failure. When the output is broken, we open a new 
file and write all the unacked WAL entries to it. This means that we may have 
duplicated entries in WAL files. HBASE-14949 addresses this problem.

  was:
Implement FanOutOneBlockAsyncDFSOutput, an output stream for writing WAL only. 
The WAL provider that uses this class is AsyncFSWALProvider.

It is based on netty and writes to 3 DNs concurrently (fan-out), so it 
generally gives lower latency. It is also fail-fast: the stream becomes 
unwritable immediately after any read/write error, and there is no pipeline 
recovery. In that case you need to call recoverLease to force close the output. 
For WAL this behavior is fine, as we can always open a new file when the old 
one is broken. The performance analysis in HBASE-16890 shows that it has better 
performance.

Behavior changes:
1. As we now write to 3 DNs concurrently, under the visibility guarantee of 
HDFS the data becomes visible as soon as it arrives at a DN, since every DN is 
treated as the last one in the pipeline. This means replication may read 
uncommitted data and replicate it to the remote cluster, causing data 
inconsistency. HBASE-14004 addresses this problem.
2. There will be no sync failure. When the output is broken, we open a new 
file and write all the unacked WAL entries to it. This means that we may have 
duplicated entries in WAL files. HBASE-14949 addresses this problem.


> Implement a new DFSOutputStream for logging WAL only
> ----------------------------------------------------
>
>                 Key: HBASE-14790
>                 URL: https://issues.apache.org/jira/browse/HBASE-14790
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0-beta-1
>
>
> The original {{DFSOutputStream}} is very powerful and aims to serve all 
> purposes. But in fact, we do not need most of its features if we only want to 
> log WAL. For example, we do not need pipeline recovery, since we can just 
> close the old logger and open a new one. We also do not need to write 
> multiple blocks, since we can open a new logger if the old file is too 
> large.
> Most importantly, it is hard to handle all the corner cases to avoid data 
> loss or data inconsistency (such as HBASE-14004) when using the original 
> {{DFSOutputStream}}, due to its complicated logic. The complicated logic 
> also forces us to use some magical tricks to increase performance. For 
> example, we need to use multiple threads to call {{hflush}} when logging, 
> and now we use 5 threads. But why 5 and not 10 or 100?
> So here, I propose we implement our own {{DFSOutputStream}} for logging 
> WAL, for correctness and also for performance.
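
The close-and-reopen approach in the description, together with the replay of 
unacked entries mentioned in the release note, can be sketched roughly as 
follows. This is an in-memory illustration only and does not use the HBase or 
HDFS APIs: WalRollSketch, WalFile, Writer, readAll and dedupe are hypothetical 
names, entries are reduced to sequence ids, and the reader-side de-duplication 
is an assumption for illustration, not necessarily what HBASE-14949 does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;

public class WalRollSketch {

    /** Hypothetical stand-in for one single-block WAL file. */
    static final class WalFile {
        final String name;
        final List<Long> entries = new ArrayList<>(); // sequence ids written to this file
        boolean broken = false;                       // set when any read/write error occurs

        WalFile(String name) { this.name = name; }

        void append(long seqId) {
            // Fail-fast: once broken, the file never becomes writable again.
            if (broken) {
                throw new IllegalStateException(name + " is broken");
            }
            entries.add(seqId);
        }
    }

    /** Hypothetical writer that rolls to a new file instead of recovering the old one. */
    static final class Writer {
        final List<WalFile> files = new ArrayList<>();
        final Deque<Long> unacked = new ArrayDeque<>(); // appended but not yet acknowledged
        WalFile current;

        Writer() { roll(); }

        /** Open a new file and replay every unacked entry into it. */
        void roll() {
            current = new WalFile("wal-" + files.size());
            files.add(current);
            for (long seqId : unacked) {
                current.append(seqId);
            }
        }

        void append(long seqId) {
            unacked.addLast(seqId);
            if (current.broken) {
                roll(); // the replay in roll() also writes seqId
            } else {
                current.append(seqId);
            }
        }

        /** Acknowledge all entries with sequence id <= upTo. */
        void ack(long upTo) {
            while (!unacked.isEmpty() && unacked.peekFirst() <= upTo) {
                unacked.removeFirst();
            }
        }
    }

    /** Concatenate the entries of all files in creation order. */
    static List<Long> readAll(List<WalFile> files) {
        List<Long> all = new ArrayList<>();
        for (WalFile f : files) {
            all.addAll(f.entries);
        }
        return all;
    }

    /** Assumed reader-side handling: keep the first occurrence of each sequence id. */
    static List<Long> dedupe(List<Long> entries) {
        return new ArrayList<>(new LinkedHashSet<>(entries));
    }

    public static void main(String[] args) {
        Writer w = new Writer();
        w.append(1);
        w.append(2);
        w.append(3);
        w.ack(1);                 // only entry 1 is acknowledged
        w.current.broken = true;  // simulate a read/write error on the output
        w.append(4);              // triggers a roll; 2 and 3 are replayed
        List<Long> all = readAll(w.files);
        System.out.println("raw:     " + all);
        System.out.println("deduped: " + dedupe(all));
    }
}
```

In this sketch, entries 2 and 3 end up in both the broken file and the new 
one, which is the duplication the release note describes. The caller never 
sees a sync failure, because the roll happens inside append.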



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
