merlimat commented on a change in pull request #13130:
URL: https://github.com/apache/pulsar/pull/13130#discussion_r762358130
##########
File path:
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/AbstractMetadataStore.java
##########
@@ -317,7 +346,33 @@ public void invalidateAll() {
*/
protected void execute(Runnable task, CompletableFuture<?> future) {
try {
- executor.execute(task);
+ // Wrap the original task, so we can record the thread on which it is running
+ TaskWrapper taskWrapper = new TaskWrapper(task);
+ executorWatchDog.execute(() -> {
Review comment:
> Can you please explain what is the problem with tracking each task?
The problem is that we can have tens or hundreds of thousands of such tasks
per second, each with a different duration. Tracking them individually adds
overhead on every submission to the watcher thread.
With the other approach, we only need to submit a "dummy" task to the executor
every few seconds and check that it completes. It is also completely
independent of the timeout of those operations (eg: even if a write operation
takes 30 seconds, the thread wouldn't normally be blocked for that long).
So, I think it's much more efficient to track one dummy task every few seconds
rather than every single task.
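To make the idea concrete, here is a minimal, hypothetical sketch of the "dummy task" health check described above (names like `ExecutorHealthCheck` and the probe interval are illustrative, not from the PR): a watchdog thread periodically submits a no-op task to the monitored executor and flags the executor as stuck if the task does not complete within a deadline.

```java
import java.util.concurrent.*;

// Hypothetical sketch: probe a single-threaded executor with a periodic
// dummy task instead of timing every submitted task.
public class ExecutorHealthCheck {
    public static void main(String[] args) throws Exception {
        ExecutorService monitored = Executors.newSingleThreadExecutor();
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();

        Runnable probe = () -> {
            CompletableFuture<Void> done = new CompletableFuture<>();
            monitored.execute(() -> done.complete(null)); // the dummy task
            try {
                // If the executor thread is stuck, the probe won't finish in time.
                done.get(1, TimeUnit.SECONDS);
                System.out.println("executor healthy");
            } catch (TimeoutException e) {
                System.out.println("executor appears stuck");
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        };

        // One probe every few seconds, regardless of how many real tasks run.
        watchdog.scheduleAtFixedRate(probe, 0, 5, TimeUnit.SECONDS);

        Thread.sleep(500); // let the first probe run, then shut down
        watchdog.shutdownNow();
        monitored.shutdown();
    }
}
```

The cost is one probe per interval, independent of task throughput, which is the efficiency argument made above.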
> Dumping the stack trace of all threads to logs could result in large
volume of logs
If you capture only the stack trace of the blocked task, you're seeing an
otherwise healthy code path, while missing the other threads that actually
caused the deadlock.
Most likely the `metadata-store` thread stack trace will be the most
relevant one, though not necessarily.
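A full dump of every thread's stack (as opposed to only the blocked thread's) can be produced with the standard `Thread.getAllStackTraces()` API; the sketch below is illustrative and not code from the PR:

```java
import java.util.Map;

// Hypothetical sketch: when a stall is detected, dump all thread stacks
// so the log shows the threads holding up the deadlock, not just the
// blocked one.
public class ThreadDumper {
    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            sb.append(e.getKey().getName())
              .append(" (").append(e.getKey().getState()).append(")\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The calling thread ("main") must show up in its own dump.
        String dump = dumpAllThreads();
        System.out.println(dump.contains("main"));
    }
}
```

This is the kind of dump whose volume the quoted concern is about; it trades log size for visibility into the full deadlock.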
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]