[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615265#comment-16615265 ]

Sergey Soldatov commented on HBASE-20952:
-----------------------------------------

{quote}In the old time we will roll the wal writer, and it is done by RS, so 
closing the wal file is not enough, as the RS will try to open a new one and 
write to it. That's why we need to rename the wal directory.
{quote}
For a WAL provider that doesn't depend on the HDFS directory structure, there 
should be a manager that keeps information about existing logs. The internals 
are implementation specific (e.g. for a Kafka wal provider it may be a separate 
topic or some internal DB; for consensus-based logs like Ratis LogService it 
might be a separate state machine), but any new log should be registered there. 
Adding a new method to WALProvider like 'disable'/'decommission' that tells 
the manager to reject new logs for a particular RS (or even a region, if we 
consider a wal-per-region schema) is not a problem. For the existing wal 
providers, that method may simply rename the wal directory.
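A rough sketch of what such a method could look like (the type and method names 
below are made up for illustration, this is not an existing HBase API):
{code:java}
import java.io.IOException;

// Hypothetical extension of the WAL provider abstraction. Nothing here is
// existing HBase API; it only illustrates the 'disable'/'decommission' idea.
public interface ManagedWalProvider {

  /**
   * Ask the log manager to reject any new logs for the given region server
   * (or region, in a wal-per-region schema). A filesystem-based provider
   * could implement this by renaming the wal directory; a Kafka- or
   * Ratis-backed provider would instead flag the server in its own registry
   * (a separate topic, an internal DB, a state machine, ...).
   */
  void decommission(String regionServerName) throws IOException;
}
{code}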
{quote}In your words above, it seems to me that we will only have one stream 
opened forever for a RS, then how do we drop the old edits after flush? And how 
do we setup the wal stream? Only once at the RS start up? And if there are 
errors later, we just abort?
{quote}
Not necessarily. There is no problem with having a wal per region. Actually, in 
some cases it would be preferable, for example a Kafka topic per region. Any 
kind of recovery would then be a simple subscribe/replay of the particular 
topic: no log splits, less offline time. For the regular case we are not 
talking about streams; it's just a WAL implementation that supports the append 
operation. For replication/recovery we should be able to get a stream and read 
from a particular ID/offset. Error handling should be hidden by the 
implementation. A simple example for a quorum-based implementation: we have a 
3-node quorum for log 'RS1.1' (RS1, RS2, RS3). RS2 and RS3 go down for some 
reason, so we lose the majority and this quorum becomes read-only. A new log 
'RS1.2' is created with the quorum (RS1, RS4, RS5) and all writes go there. But 
if we speak about the reading stream, it would provide a single instance that 
iterates through RS1.1 and RS1.2 continuously. The same approach may be applied 
to the existing wal files as well.
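Very roughly, the reading side could look like this (made-up types, only to 
show that switching from one log segment to the next stays inside the reader):
{code:java}
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

// Illustration only: a reader that hides log re-creation (RS1.1 -> RS1.2)
// from the consumer by chaining the underlying segments. None of these
// types exist in HBase or Ratis.
public class ChainedLogReader<E> {

  public interface LogSegment<T> {
    Iterator<T> open() throws IOException;
  }

  private final Iterator<LogSegment<E>> segments;
  private Iterator<E> current;

  public ChainedLogReader(List<LogSegment<E>> orderedSegments) throws IOException {
    this.segments = orderedSegments.iterator();
    this.current = segments.hasNext() ? segments.next().open() : null;
  }

  /** Returns the next entry, transparently crossing segment boundaries. */
  public E next() throws IOException {
    while (current != null) {
      if (current.hasNext()) {
        return current.next();
      }
      // Current segment (e.g. the read-only RS1.1 quorum) is exhausted;
      // move on to the next one (RS1.2) if there is one.
      current = segments.hasNext() ? segments.next().open() : null;
    }
    return null; // end of the whole logical log
  }
}
{code}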
{quote}And for the FileSystem, we will use multi wal to increase the 
performance, and the logic is messed up with WALProvider. Does ratis still need 
multi wal to increase the performance? And if not, what's the plan? We need to 
refactor the multi wal related code, to not work against the WALProvider but 
something with the FileSystem related stuffs directly?
{quote}
That might be done in a further refactoring of multiwal. At the moment the 
approach is that we may specify a 3rd party wal provider class in WALFactory, 
and if it's there, multiwal is not used at all since that class is the 
provider. On the other hand, it could be refactored into something like a 'wal 
strategy' that works with any kind of provider.
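For example, something along these lines (assuming the hbase.wal.provider value 
may be a fully qualified class name; org.example.KafkaWALProvider is made up):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ThirdPartyWalConfigExample {
  public static Configuration withCustomProvider() {
    Configuration conf = HBaseConfiguration.create();
    // When a provider class is configured here, WALFactory instantiates it
    // directly and the multiwal provider is not involved at all.
    conf.set("hbase.wal.provider", "org.example.KafkaWALProvider");
    return conf;
  }
}
{code}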
{quote}had mentioned offline yesterday that he thinks some gaps still exist 
around WAL splitting – do you understand that well enough to suggest what needs 
to be addressed in the doc which is not already there?
{quote}
WALSplitter is a separate topic for discussion. The current implementation has 
a bunch of dependencies on file operations such as temporary files, lists of 
corrupted files, etc. From the HBase perspective, it would be much easier to 
keep it as is and make the log splitter an interface that takes a log and 
creates a list of recovery logs. But from the perspective of a 3rd party wal 
developer, it would be a nightmare to handle all possible cases and fit into 
the split-log chore logic. On the other hand, for the 1st iteration this may be 
hidden by a schema where a 3rd party wal does not use the splitter at all and 
recovery is just reading a stream of records provided by the WALProvider for a 
particular region.
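Roughly, the provider-side contract for such a splitter-free recovery path 
could look like this (again just an illustration, these interfaces don't exist):
{code:java}
import java.io.Closeable;
import java.io.IOException;

import org.apache.hadoop.hbase.wal.WAL;

// Hypothetical sketch: the provider exposes a per-region stream of edits,
// so no split-log chore is needed for recovery.
public interface RegionRecoverySource {

  /**
   * Open a reader over all WAL entries recorded for the given region,
   * starting at the given sequence id. A Kafka- or Ratis-backed provider
   * can serve this directly from its own log.
   */
  EntryReader openRecoveryStream(String encodedRegionName, long fromSequenceId)
      throws IOException;

  interface EntryReader extends Closeable {
    /** Returns the next recovered entry, or null when the log is exhausted. */
    WAL.Entry next() throws IOException;
  }
}
{code}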

> Re-visit the WAL API
> --------------------
>
>                 Key: HBASE-20952
>                 URL: https://issues.apache.org/jira/browse/HBASE-20952
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>            Reporter: Josh Elser
>            Priority: Major
>         Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like, see RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup&restore. Replication has the use-case for "tail"'ing the WAL which we 
> should provide via our new API. B&R doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.


