belugabehr commented on a change in pull request #934: HBASE-23568: Improve
Threading of Replication
URL: https://github.com/apache/hbase/pull/934#discussion_r375877892
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
##########
@@ -304,31 +318,56 @@ private void initializeWALEntryFilter(UUID
peerClusterId) {
this.walEntryFilter = new ChainWALEntryFilter(filters);
}
- private void tryStartNewShipper(String walGroupId,
PriorityBlockingQueue<Path> queue) {
- workerThreads.compute(walGroupId, (key, value) -> {
- if (value != null) {
- if (LOG.isDebugEnabled()) {
- LOG.debug(
- "{} Someone has beat us to start a worker thread for wal group
{}",
- logPeerId(), key);
- }
- return value;
- } else {
- if (LOG.isDebugEnabled()) {
- LOG.debug("{} Starting up worker for wal group {}", logPeerId(),
key);
+ /**
+ * Synchronized method so that only one item is inserted at a time. Should
be the only place that
Review comment:
@bharathv Good question. The answer is: dead lock.
To simplify other parts of the code, I added Guava Future Callback. This
allows the thread to un-register itself once it has completed executing instead
of having to have a developer remember to explicitly do it in the meat of the
thread code (knowing that there can be many different exit code paths that all
need to make sure they call the `tryFinish` method). Not to mention there are
multiple subclasses that also all need to explicitly call `tryFinish`. Much
cleaner to let all the classes involved not even be aware that they are being
registered / unregistered.
However, the catch is that the callback is applied to the future after it
has been registered with the `ExecutionService` and therefore it is possible
that the thread finishes before the callback is applied. In such a scenario,
the callback is executed by the thread that tried to apply the callback in the
first place. So, what could happen is that, the main thread submits a
`Callable` to the `ExecutorService`, that `Callable` is assigned a thread,
executes, and finishes. The main thread then applies the callback and it must
run the un-register code on behalf of the `Callable` which means that it tries
to remove the `Callable` from the `workerThreads` Map. However, in the
original code, the main thread was running within the lock of the Map, so
trying to then remove an entry from the Map that is already had a lock on
caused a deadlock.
So now there is a single lock (synchronized) at the cost of using multiple
locks across the Map, but it would be hard to say that this has a measurable
negative impact on performance at the cost of much more simple code in the
thread.
https://github.com/google/guava/wiki/ListenableFutureExplained
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services