[
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614292#comment-16614292
]
Duo Zhang commented on HBASE-20952:
-----------------------------------
The API is not the first thing to decide. As I said above, we first need to
know the overall solution. See our design docs for serial replication and sync
replication:
https://docs.google.com/document/d/1LHC3IRUc5i2V4_roNw8BDAOKGM4bEapR_hefpZxDT00/edit
https://docs.google.com/document/d/193D3aOxD-muPIZuQfI4Zo3_qg6-Nepeu_kraYJVQkiE/edit#heading=h.e8l9k556m3wi
There is no API design in them, but we tried our best to describe how we plan
to implement these features in HBase.
{quote}
This is good; I hadn't thought about abstracting out fencing. We should have
API which pushes this fencing impl down into the Provider. For the Ratis
LogService, we designed api to be able to close() a Log; make it read-only. In
the context of HBase, we would close the Log before we start
recovery/re-assignment, and have the net-effect of preventing any half-dead RS
from continuing to try to add more edits to the Log. This effectively would
work like recoverLease() does now for the HDFS case.
{quote}
Yes, this is what I really want to discuss, not something like whether we
should use WALInfo or WALIdentity.
The information you described is still not enough to solve all the problems. In
the existing implementation we roll the wal writer, and the roll is done by the
RS, so closing the wal file is not enough: the RS will try to open a new one
and write to it. That's why we need to rename the wal directory. From your
words above, it seems to me that we would have only one stream open forever for
an RS. Then how do we drop the old edits after a flush? And how do we set up
the wal stream? Only once at RS startup? And if there are errors later, do we
just abort, without trying to recover or open a new stream? Or will that be
handled by ratis? And for the FileSystem, we use multi wal to increase
performance, and its logic is tangled up with WALProvider. Does ratis still
need multi wal to increase performance? And if not, what's the plan? Do we need
to refactor the multi wal related code so that it works not against the
WALProvider but directly against the FileSystem related code?
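To make the rolling problem concrete, here is a minimal sketch of a fence that
lives in the provider rather than on a single log. Everything in it is
hypothetical: FencedLogProvider, the epoch argument and the "wal-<epoch>-<seq>"
names are assumptions for illustration, not existing HBase or Ratis APIs. The
point is only that the fence has to reject the open done by a roll, not just
writes to the log that is already open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types exist in HBase or Ratis.
final class FencedLogProvider {
  // Highest RS epoch that has been fenced; -1 means nothing is fenced yet.
  private long fencedEpoch = -1;
  private final List<String> opened = new ArrayList<>();

  // Would be called before recovery/re-assignment starts.
  synchronized void fence(long epoch) {
    fencedEpoch = Math.max(fencedEpoch, epoch);
  }

  // Would be called by the RS at startup and again on every roll.
  synchronized String openLog(long epoch, long seq) {
    if (epoch <= fencedEpoch) {
      // A half-dead RS cannot escape the fence by rolling to a new log.
      throw new IllegalStateException("epoch " + epoch + " is fenced");
    }
    String name = "wal-" + epoch + "-" + seq;
    opened.add(name);
    return name;
  }

  synchronized int openedCount() {
    return opened.size();
  }
}

final class FencingDemo {
  public static void main(String[] args) {
    FencedLogProvider provider = new FencedLogProvider();
    provider.openLog(1, 0);      // RS with epoch 1 opens its first log
    provider.fence(1);           // recovery starts, epoch 1 is fenced
    try {
      provider.openLog(1, 1);    // the old RS tries to roll
    } catch (IllegalStateException e) {
      System.out.println("roll rejected: " + e.getMessage());
    }
    provider.openLog(2, 0);      // a restarted RS with a newer epoch is fine
  }
}
```

For FileSystem based wals the directory rename plays this role today; the
sketch does not answer how old edits are dropped after a flush or how errors
on the stream are handled.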
Sync replication is just a DualAsyncWriter, which writes to two HDFS clusters
at once. I think it is possible to write to other log systems, such as ratis,
if you still share the AsyncWriter interface. The problem here is how to
describe the place where we write the remote wals. For FileSystem based wals,
it is just a directory on a remote cluster, for example,
"hdfs://cluster-name/path". We need to find a way to describe other log
systems.
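One possible direction, sketched under assumptions: keep describing the remote
wal location as a URI and let the scheme select the log system. The
RemoteWalLocation class and the "ratis://group-1/rs-a" form below are invented
for illustration; only the "hdfs://cluster-name/path" form comes from what we
have today.

```java
import java.net.URI;

// Hypothetical sketch: RemoteWalLocation is not an existing HBase class.
final class RemoteWalLocation {
  final String system;  // URI scheme, e.g. "hdfs"
  final String target;  // URI authority, e.g. the remote cluster name
  final String name;    // URI path, e.g. the remote wal directory

  private RemoteWalLocation(String system, String target, String name) {
    this.system = system;
    this.target = target;
    this.name = name;
  }

  static RemoteWalLocation parse(String location) {
    URI uri = URI.create(location);
    if (uri.getScheme() == null || uri.getAuthority() == null) {
      throw new IllegalArgumentException(
          "expected <system>://<target>/<name>, got: " + location);
    }
    return new RemoteWalLocation(uri.getScheme(), uri.getAuthority(), uri.getPath());
  }
}

final class RemoteWalLocationDemo {
  public static void main(String[] args) {
    // The second location is an assumed format, not something ratis defines.
    String[] locations = {"hdfs://cluster-name/path", "ratis://group-1/rs-a"};
    for (String location : locations) {
      RemoteWalLocation parsed = RemoteWalLocation.parse(location);
      System.out.println(parsed.system + " | " + parsed.target + " | " + parsed.name);
    }
  }
}
```

Whether a scheme plus authority plus path is enough to address a log in a
system like ratis is exactly the open question; the sketch only shows that the
existing HDFS form already fits this shape.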
> Re-visit the WAL API
> --------------------
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Josh Elser
> Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an
> HBase WAL API should look like. What are the primitive calls that we require
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We
> should also have a mind for what is happening in the Ratis LogService (but
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and
> backup&restore. Replication has the use-case for "tail"'ing the WAL which we
> should provide via our new API. B&R doesn't do anything fancy (IIRC). We
> should make sure all consumers are generally going to be OK with the API we
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods
> which were "bolted" on such as {{AbstractFSWAL}} and
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability
> annotations are chosen.