[
https://issues.apache.org/jira/browse/HBASE-19904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347864#comment-16347864
]
Appy commented on HBASE-19904:
------------------------------
{noformat}
- // There is a cyclic dependency between ReplicationSourceHandler and
WALFactory.
- // We use WALActionsListener to get the newly rolled WALs, so we need to
get the
- // WALActionsListeners from ReplicationSourceHandler before constructing
WALFactory. And then
- // ReplicationSourceHandler need to use WALFactory get the length of the
wal file being written.
- // So we here we need to construct WALFactory first, and then pass it to
the initialized method
- // of ReplicationSourceHandler.
{noformat}
Is this the core issue you are trying to solve?
Reading that comment...it's real bad code smell.
At system level, there's very clear strong dependency of replication on wal,
there's no way we should make wal depend on replication.
It exists right now and is very bad, we should try to get rid of it instead of
adding more scaffolding to support it.
----
Suggestion:
{code:java}
interface WALProvider {
...
// General observer/listener patterns allow adding/removing observer on the
fly. We just need adding, so let's add this.
void addWALActionsListener(WALActionsListener listener);
// This is right because currently WALFactory does
provider.getWAL().stream()....., no need of that indirection.
WALFileLengthProvider getWALFileLengthProvider(); // Can be done via
inheritance too, but let's use composition.
...
}
{code}
In HRegionServer#setupWALAndReplication
{code:java}
private void setupWALAndReplication() throws IOException {
.....
// WALFactory has no need to know about listeners, only WALProvider does.
this.walFactory = new WALFactory(conf, serverName.toString());
// No need of List<WALActionsListener> listeners and related logic.
walFactory.getWALProvider().addWALActionsListener(new MetricsWAL()); // Meta
wal provider takes care of itself, so need for it.
createNewReplicationInstance(conf, this, this.walFs, logDir, oldLogDir);
//btw, we are not even using the last 3 params in the function.
if (this.replicationSourceHandler != null) {
this.replicationSourceHandler.initialize(this, walFs, logDir, oldLogDir,
factory);
}
.....
}
{code}
In Replication#initialize
{code:java}
// Changed type of last param to WALProvider
public void initialize(Server server, FileSystem fs, Path logDir, Path
oldLogDir, WALProvider walProvider) throws IOException {
....
this.replicationManager = new ReplicationSourceManager(<other params>,
walProvider.getWALFileLengthProvider());
....
walProvider.addWALActionsListener(this);
}
{code}
After all this, WALFactory will not implement WALFileLengthProvider anymore.
I think this will work because Replication will be able to register itself
before anything calls WALFactory.getWAL(region).
Wtdy?
----
I am trying to come up with a solution considering that we'll have separate
hbase-wal and hbase-replication-implementation modules (ignoring naming for
now) in future.
I am not much familiar with Replication and WAL code, so it's possible that the
solution above is not comprehensive and may be missing/wrong on some elements,
but i certainly feel that we can have a good discussion here and come up with a
better design.
Also, since this is not a bug fix, this should not go in 2.0-beta.
[~stack] The more things we don't backport to branch-2 (since it's in 2.0
beta), the wider the gap from master becomes. Having one less branch certainly
decreases current backport work; but that also means no
improvements/maintenance stuff getting backported into 2.1, widening gap
between master and branch-2, and thus more merge conflicts in future. Please
create branch-2.0. :)
> Find a better way to deal with the cyclic dependency when initializing WAL
> and Replication
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-19904
> URL: https://issues.apache.org/jira/browse/HBASE-19904
> Project: HBase
> Issue Type: Improvement
> Components: Replication, wal
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19904.patch
>
>
> When implementing synchronous replication, I found that we need to depend
> more on replication in WAL so it is even more pain...
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)