[
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616239#comment-16616239
]
Duo Zhang commented on HBASE-20952:
-----------------------------------
First, WALSplitter is not a separate topic; it is at the core of HBase. You can
disable replication but you cannot disable wal splitting...
And I think your approach sounds good: there is a register method (or
initialize? Or just do it in the constructor, not critical), and when the RS
starts we will call it to get the permit to write to the log system. In the
FileSystem based log system, it is just creating a directory, and for other
log systems it is
And when the master thinks the RS is dead, we call a disable method, which
prevents further appending. For FileSystem this is done by renaming and
recoverLease, and for other log systems I think there are ways to do this.
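
A minimal sketch of the register/disable idea described above. All names here
(WalFencing, InMemoryWalFencing, register, disable, append) are hypothetical
and chosen only for illustration; they are not existing HBase interfaces, and
the in-memory map merely stands in for a real log system.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical fencing surface of a wal system; not an existing HBase interface. */
interface WalFencing {
  /** Called when a region server starts, to get the permit to write. */
  void register(String serverName);

  /** Called once the master considers the server dead; prevents further appends. */
  void disable(String serverName);

  /** Appends an edit for a server; throws if the server holds no permit. */
  void append(String serverName, String edit);
}

/**
 * In-memory stand-in. A FileSystem based implementation would instead create
 * a directory in register() and rename it plus recoverLease in disable().
 */
class InMemoryWalFencing implements WalFencing {
  private final Map<String, List<String>> logs = new HashMap<>();
  private final Map<String, Boolean> permitted = new HashMap<>();

  @Override
  public synchronized void register(String serverName) {
    if (permitted.containsKey(serverName)) {
      throw new IllegalStateException(serverName + " is already registered");
    }
    logs.put(serverName, new ArrayList<>());
    permitted.put(serverName, Boolean.TRUE);
  }

  @Override
  public synchronized void disable(String serverName) {
    // Only flips the permit of a known server; the log stays readable for splitting.
    permitted.computeIfPresent(serverName, (name, old) -> Boolean.FALSE);
  }

  @Override
  public synchronized void append(String serverName, String edit) {
    if (!Boolean.TRUE.equals(permitted.get(serverName))) {
      throw new IllegalStateException(serverName + " has no permit to append");
    }
    logs.get(serverName).add(edit);
  }

  /** Returns a copy of the edits written by a server so far. */
  public synchronized List<String> edits(String serverName) {
    return new ArrayList<>(logs.getOrDefault(serverName, new ArrayList<>()));
  }
}
```

The point of the sketch is only that the permit check lives inside the wal
system, so a caller never needs to know whether fencing is a directory rename
or something else.
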
And I agree that we should have different wal splitters for different wal
systems. For FileSystem, this may be done by splitting wal files into several
recovered edits files in the region directory, and for other log systems, we
could use different ways. But the key point here is that we need to know
whether there are recovered edits when opening a region, and scan them to
reconstruct the memstore. So I think we should add another method to the WAL
system, which is used to get the recovered edits for a region when opening a
region. IIRC [~zyork] is working on deploying HBase on S3 and was struggling
with whether the recovered edits directory should be on S3 or HDFS. I do not
know what the final solution was, but after the discussion here, I think it
should be on HDFS, not S3?
So I think here we will add two methods to the wal system. One is for splitting
the wal for a region server, and the other is for getting recovered edits for a
region. If the implementation is wal per region, then the split method is just
a dummy one that does nothing; otherwise you still need to do something to make
separate wals for different regions. And if split is too heavy, you can do
filtering when getting recovered edits? Not sure, maybe.
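
The two methods above could be sketched roughly as follows. The names
(WalRecovery, WalEdit, PerServerWal, PerRegionWal) are hypothetical and the
storage is an in-memory map; this only illustrates the shape of the API, where
a per-server wal does real work in split and a per-region wal does nothing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One wal entry tagged with its region; hypothetical type for this sketch. */
final class WalEdit {
  final String region;
  final long sequenceId;
  final String payload;

  WalEdit(String region, long sequenceId, String payload) {
    this.region = region;
    this.sequenceId = sequenceId;
    this.payload = payload;
  }
}

/** Hypothetical recovery surface of a wal system; not an existing HBase interface. */
interface WalRecovery {
  /** Splits the wal of a dead region server into per-region recovered edits. */
  void split(String serverName);

  /** Returns the recovered edits to replay when opening the given region. */
  List<WalEdit> getRecoveredEdits(String region);
}

/** One log per region server: split() has to regroup the edits by region. */
class PerServerWal implements WalRecovery {
  private final Map<String, List<WalEdit>> serverLogs = new HashMap<>();
  private final Map<String, List<WalEdit>> recovered = new HashMap<>();

  void append(String serverName, WalEdit edit) {
    serverLogs.computeIfAbsent(serverName, s -> new ArrayList<>()).add(edit);
  }

  @Override
  public void split(String serverName) {
    List<WalEdit> log = serverLogs.remove(serverName);
    if (log == null) {
      return; // nothing to split, or already split
    }
    for (WalEdit edit : log) {
      recovered.computeIfAbsent(edit.region, r -> new ArrayList<>()).add(edit);
    }
  }

  @Override
  public List<WalEdit> getRecoveredEdits(String region) {
    return new ArrayList<>(recovered.getOrDefault(region, new ArrayList<>()));
  }
}

/** One log per region: split() is a dummy, the region log is read directly. */
class PerRegionWal implements WalRecovery {
  private final Map<String, List<WalEdit>> regionLogs = new HashMap<>();

  void append(WalEdit edit) {
    regionLogs.computeIfAbsent(edit.region, r -> new ArrayList<>()).add(edit);
  }

  @Override
  public void split(String serverName) {
    // Intentionally empty: edits are already separated by region.
  }

  @Override
  public List<WalEdit> getRecoveredEdits(String region) {
    return new ArrayList<>(regionLogs.getOrDefault(region, new ArrayList<>()));
  }
}
```
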
And for replication, the phrase 'subscribe/replay' above inspires me.
Replication is just another subscriber of the wals, right? It receives the wals
for specific tables (regions), and then sends them to the remote clusters. So I
think we could introduce subscribe/consume style APIs for the wal system, and
then the implementation of replication will be straightforward. I do not care
whether they are wal files or some topics on Kafka, just give me the stream to
read! And the FileSystem related code in the replication framework will also be
moved into the wal system. You can see in the code that we just use zookeeper
to record the unconsumed wal files, and try to locate them on the FileSystem as
they may have been moved to oldWALs. It is just a basic subscribe/consume
framework I think.
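
As a rough illustration of the subscribe/consume style, assuming hypothetical
names (WalSubscription, InMemoryWalStream, poll, ack) that do not exist in
HBase: a subscriber reads from a stream and acknowledges what it has consumed,
and the wal system tracks the position. In the current replication code that
position is what zookeeper records; here it is just a map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical consumer view of a wal stream; not an existing HBase interface. */
interface WalSubscription {
  /** Returns up to max unconsumed entries without advancing the position. */
  List<String> poll(int max);

  /** Marks the next count entries as consumed. */
  void ack(int count);
}

/** In-memory stand-in for a wal system that tracks one offset per subscriber. */
class InMemoryWalStream {
  private final List<String> entries = new ArrayList<>();
  private final Map<String, Integer> offsets = new HashMap<>();

  synchronized void append(String entry) {
    entries.add(entry);
  }

  /** Subscribing again with the same id resumes from the last acked position. */
  synchronized WalSubscription subscribe(String subscriberId) {
    offsets.putIfAbsent(subscriberId, 0);
    return new WalSubscription() {
      @Override
      public List<String> poll(int max) {
        if (max < 0) {
          throw new IllegalArgumentException("max must not be negative");
        }
        synchronized (InMemoryWalStream.this) {
          int from = offsets.get(subscriberId);
          // Use long arithmetic so a huge max cannot overflow.
          int to = (int) Math.min((long) entries.size(), (long) from + max);
          return new ArrayList<>(entries.subList(from, to));
        }
      }

      @Override
      public void ack(int count) {
        if (count < 0) {
          throw new IllegalArgumentException("count must not be negative");
        }
        synchronized (InMemoryWalStream.this) {
          int from = offsets.get(subscriberId);
          int to = (int) Math.min((long) entries.size(), (long) from + count);
          offsets.put(subscriberId, to);
        }
      }
    };
  }
}
```

With this shape, a replication source would poll a batch, ship it to the remote
cluster, and only then ack it, so an unshipped batch is simply read again after
a restart.
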
And for sync replication, I think we should make it work with different wal
implementations. This is another story and I will keep tracking it. To be
honest I do not know the solution yet, but I'm optimistic.
So in general, I think the problem with the current wal abstraction is that the
line is drawn too low. We should cut it at a higher place, where fencing, log
splitting, and reading recovered edits are all included in it, but now
lots of the code is outside the wal system. Thanks [~sergey.soldatov], your
post really helps.
> Re-visit the WAL API
> --------------------
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Josh Elser
> Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an
> HBase WAL API should look like. What are the primitive calls that we require
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We
> should also have a mind for what is happening in the Ratis LogService (but
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and
> backup&restore. Replication has the use-case for "tail"'ing the WAL which we
> should provide via our new API. B&R doesn't do anything fancy (IIRC). We
> should make sure all consumers are generally going to be OK with the API we
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods
> which were "bolted" on such as {{AbstractFSWAL}} and
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability
> annotations are chosen.