merlimat commented on a change in pull request #13130:
URL: https://github.com/apache/pulsar/pull/13130#discussion_r762358130
##########
File path:
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/AbstractMetadataStore.java
##########
@@ -317,7 +346,33 @@ public void invalidateAll() {
*/
protected void execute(Runnable task, CompletableFuture<?> future) {
try {
- executor.execute(task);
+ // Wrap the original task, so we can record the thread on which it is running
+ TaskWrapper taskWrapper = new TaskWrapper(task);
+ executorWatchDog.execute(() -> {
Review comment:
> Can you please explain what is the problem with tracking each task?
The problem is that we can have tens or hundreds of thousands of such tasks
per second, each with a different duration. Tracking them individually adds
overhead on every submission to the watcher thread.
With the other approach, we only need to submit a "dummy" task to the executor
every few seconds and check that it completes. It is also completely
independent of the timeout of those operations (eg: even if a write operation
takes 30 seconds, the thread wouldn't normally be blocked for that long).
So, I think it's much more efficient to track one dummy task every few seconds
rather than every single task.
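To make the idea concrete, here is a minimal, hypothetical sketch of the "dummy task" health check described above (names like `ExecutorHealthCheck` and the probe interval are illustrative, not from the PR): a watchdog thread periodically submits a no-op task to the monitored executor and flags the executor as stuck if the task does not complete within a deadline.

```java
import java.util.concurrent.*;

// Hypothetical sketch: probe a single-threaded executor with a periodic
// dummy task instead of timing every submitted task.
public class ExecutorHealthCheck {
    public static void main(String[] args) throws Exception {
        ExecutorService monitored = Executors.newSingleThreadExecutor();
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();

        Runnable probe = () -> {
            CompletableFuture<Void> done = new CompletableFuture<>();
            monitored.execute(() -> done.complete(null)); // the dummy task
            try {
                // If the executor thread is stuck, the probe won't finish in time.
                done.get(1, TimeUnit.SECONDS);
                System.out.println("executor healthy");
            } catch (TimeoutException e) {
                System.out.println("executor appears stuck");
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        };

        // One probe every few seconds, regardless of how many real tasks run.
        watchdog.scheduleAtFixedRate(probe, 0, 5, TimeUnit.SECONDS);

        Thread.sleep(500); // let the first probe run, then shut down
        watchdog.shutdownNow();
        monitored.shutdown();
    }
}
```

The cost is one probe per interval, independent of task throughput, which is the efficiency argument made above.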
> Dumping the stack trace of all threads to logs could result in large
volume of logs
If you capture only the stack trace of the blocked task, you're seeing an
otherwise healthy code path, while missing the other threads that actually
caused the deadlock.
Most likely the `metadata-store` thread stack trace will be the most
relevant one, though not necessarily.
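A full dump of every thread's stack (as opposed to only the blocked thread's) can be produced with the standard `Thread.getAllStackTraces()` API; the sketch below is illustrative and not code from the PR:

```java
import java.util.Map;

// Hypothetical sketch: when a stall is detected, dump all thread stacks
// so the log shows the threads holding up the deadlock, not just the
// blocked one.
public class ThreadDumper {
    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            sb.append(e.getKey().getName())
              .append(" (").append(e.getKey().getState()).append(")\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The calling thread ("main") must show up in its own dump.
        String dump = dumpAllThreads();
        System.out.println(dump.contains("main"));
    }
}
```

This is the kind of dump whose volume the quoted concern is about; it trades log size for visibility into the full deadlock.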
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]