[
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614232#comment-16614232
]
Duo Zhang commented on HBASE-20952:
-----------------------------------
The design doc does not help; it reads like pseudo-code. What I want to know
is how we deal with several key problems if we want to remove the direct
dependency on FileSystem. Here is a short list that comes immediately to
mind:
1. How do we do fencing when an RS crashes? Today we rename the WAL
directory for the RS, and then call recoverLease on all the files to confirm
that they are all closed. And on the RS side, when creating a WAL writer, we
use createNonRecursive intentionally, so that if the WAL directory has been
renamed, we cannot create WAL writers any more. How do we want to abstract
these operations in the new WAL API? How do other log systems, such as Ratis,
deal with this?
2. For sync replication, we have a config called remote wal directory, which
exposes the file system to the user. As it was implemented by us at Xiaomi,
we can help to find a workaround for this. Sync replication also relies on
the rename operation to do fencing.
3. The replication-related code. I have been asking about this for a long
time, but no one has given an overall solution. And looking at the code on
the RB, we have already started to change things in replication? And for
RecoveredReplicationSource, we make it abstract and introduce a new
FSRecoveredReplicationSource? Then where is the FSReplicationSource?
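To make the fencing behaviour in item 1 concrete, here is a minimal sketch of
rename-based fencing. It is only an analogy under stated assumptions: it uses
java.nio.file on a local file system instead of the Hadoop FileSystem API, the
class and method names (WalFencingSketch, createWalNonRecursive, fence) are
hypothetical, and there is no equivalent of recoverLease here. It only shows
why a non-recursive create fails once the parent directory has been renamed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not HBase code and not the HDFS API.
class WalFencingSketch {

    // RS side: create a WAL file WITHOUT creating missing parent directories.
    // Files.createFile never creates parents, which mirrors the intent of
    // using createNonRecursive for WAL writers.
    static Path createWalNonRecursive(Path walDir, String name) throws IOException {
        return Files.createFile(walDir.resolve(name));
    }

    // Returns true if the writer could be created, false if the WAL
    // directory is gone (i.e. this "region server" has been fenced).
    static boolean tryCreateWal(Path walDir, String name) throws IOException {
        try {
            createWalNonRecursive(walDir, name);
            return true;
        } catch (NoSuchFileException e) {
            return false;
        }
    }

    // Master side: fence the crashed RS by renaming its WAL directory.
    // The "-splitting" suffix is used here purely for illustration.
    static Path fence(Path walDir) throws IOException {
        Path fenced = walDir.resolveSibling(walDir.getFileName() + "-splitting");
        return Files.move(walDir, fenced, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("wal-fencing");
        Path walDir = Files.createDirectory(root.resolve("rs1"));

        System.out.println("create before fencing: " + tryCreateWal(walDir, "wal.1"));
        Path fenced = fence(walDir);
        System.out.println("create after fencing:  " + tryCreateWal(walDir, "wal.2"));
        System.out.println("old WAL still present: " + Files.exists(fenced.resolve("wal.1")));
    }
}
```

A recursive create would silently re-create the old directory and defeat the
fence, which is why the non-recursive variant matters. Any replacement WAL API
would need some primitive that gives the same guarantee.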
As I always say, we should have an overall solution first, i.e., we should
know what the system will look like when we finish. Then we can start to work
things out.
Thanks.
> Re-visit the WAL API
> --------------------
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Josh Elser
> Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an
> HBase WAL API should look like. What are the primitive calls that we require
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We
> should also have a mind for what is happening in the Ratis LogService (but
> the LogService should not dictate what HBase's WAL API looks like; see
> RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and
> backup&restore. Replication has the use-case for "tail"'ing the WAL which we
> should provide via our new API. B&R doesn't do anything fancy (IIRC). We
> should make sure all consumers are generally going to be OK with the API we
> create.
> The API may be "OK" (or OK in part). We also need to consider other methods
> which were "bolted" on, such as {{AbstractFSWAL}} and
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the
> {{WALSplitter}}) should also be looked at so that they use WAL APIs only.
> We also need to make sure that adequate interface audience and stability
> annotations are chosen.