[ 
https://issues.apache.org/jira/browse/RATIS-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17560981#comment-17560981
 ] 

Baolong Mao commented on RATIS-1603:
------------------------------------

[~szetszwo] I'm sorry for leading this issue in the wrong direction. After 
studying `ScheduledThreadPoolExecutor` more closely, I found that it never 
grows its worker threads beyond the core pool size, no matter how many new 
tasks are submitted to the pool.

Meanwhile, I wrote a simple example to demonstrate this:

{code:java}
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import org.apache.ratis.util.Daemon;

public class Example {
  public static void main(String[] args) {
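    // Core pool size of 1; worker threads are created via Daemon::new.
    final ScheduledThreadPoolExecutor e =
        new ScheduledThreadPoolExecutor(1, (ThreadFactory) Daemon::new);
    e.setRemoveOnCancelPolicy(true);
    // Keep scheduling long-running tasks so that they pile up in the queue.
    while (true) {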
      e.schedule(() -> {
        try {
          Thread.sleep(100000L);
        } catch (InterruptedException ex) {
          ex.printStackTrace();
        }
      }, 1, TimeUnit.MILLISECONDS);
    }
  }
}
 {code}
After a few minutes, the pool still had only one thread, while a large number 
of tasks had accumulated in its queue.
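
The same behavior can also be checked with a bounded run that terminates on its own. The following is only a minimal sketch using the JDK alone: the class name `PoolSizeCheck` and the `run` helper are hypothetical, and the default thread factory is used instead of `Daemon::new`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Ratis.
class PoolSizeCheck {
  /** Schedules taskCount blocking tasks and returns {poolSize, queuedTasks}. */
  static int[] run(int corePoolSize, int taskCount) throws InterruptedException {
    final ScheduledThreadPoolExecutor e = new ScheduledThreadPoolExecutor(corePoolSize);
    final CountDownLatch started = new CountDownLatch(1);
    final CountDownLatch release = new CountDownLatch(1);
    try {
      for (int i = 0; i < taskCount; i++) {
        e.schedule(() -> {
          started.countDown();
          try {
            // Block the worker, standing in for the long Thread.sleep.
            release.await();
          } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
          }
        }, 1, TimeUnit.MILLISECONDS);
      }
      // Wait until a worker is actually running a task before sampling.
      started.await();
      return new int[] {e.getPoolSize(), e.getQueue().size()};
    } finally {
      // Unblock the running task and discard whatever is still queued.
      release.countDown();
      e.shutdownNow();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    final int[] r = run(1, 1000);
    System.out.println("pool size = " + r[0] + ", queued tasks = " + r[1]);
  }
}
```

With a core pool size of 1, the reported pool size stays at 1 and all tasks except the one currently running remain in the queue.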

So I think I may have misunderstood `ScheduledThreadPoolExecutor`: for the 
code in the screenshot I attached, there should be only one thread for that 
executor. I'm not sure yet where the problematic thread pool is...
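
To help narrow this down, one option might be to count, from inside the affected JVM, how many live threads are parked in `DelayedWorkQueue.take`, the frame seen in the thread dump. This is only a rough sketch: the class `ParkedSchedulerThreads` and the method `countThreadsIn` are hypothetical names, and the three executors in `main` exist purely for illustration. It reports how many such threads exist, not which code created them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ScheduledThreadPoolExecutor;

// Hypothetical diagnostic helper, not part of Ratis.
class ParkedSchedulerThreads {
  /** Counts live threads whose stack contains the given class and method. */
  static long countThreadsIn(String className, String methodName) {
    long count = 0;
    for (Map.Entry<Thread, StackTraceElement[]> entry
        : Thread.getAllStackTraces().entrySet()) {
      for (StackTraceElement frame : entry.getValue()) {
        if (frame.getClassName().equals(className)
            && frame.getMethodName().equals(methodName)) {
          count++;
          break; // count each thread at most once
        }
      }
    }
    return count;
  }

  public static void main(String[] args) throws InterruptedException {
    // Illustration only: three separate executors, each with one idle worker.
    final List<ScheduledThreadPoolExecutor> pools = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      final ScheduledThreadPoolExecutor p = new ScheduledThreadPoolExecutor(1);
      p.prestartAllCoreThreads();
      pools.add(p);
    }
    Thread.sleep(200); // give the workers time to park in take()
    System.out.println("threads parked in DelayedWorkQueue.take: "
        + countThreadsIn(
            "java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue",
            "take"));
    pools.forEach(ScheduledThreadPoolExecutor::shutdownNow);
  }
}
```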

> TimeoutScheduler can have a huge amount of threads and cause OOM
> ----------------------------------------------------------------
>
>                 Key: RATIS-1603
>                 URL: https://issues.apache.org/jira/browse/RATIS-1603
>             Project: Ratis
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.0.0
>            Reporter: Baolong Mao
>            Assignee: Tsz-wo Sze
>            Priority: Major
>         Attachments: image-2022-06-29-09-33-28-818.png, 
> image-2022-06-29-09-37-07-089.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> !image-2022-06-29-09-33-28-818.png!
> The call stack is the same for all 20k threads:
> {code:java}
> "java.util.concurrent.ThreadPoolExecutor$Worker@64f771ea[State = -1, empty queue]" #486880 daemon prio=5 os_prio=0 cpu=0.14ms elapsed=35.35s allocated=1696B defined_classes=0 tid=0x00007f136fb65800 nid=0x2130e1 waiting on condition  [0x00007f0f94ccf000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
>     - parking to wait for  <0x00007f1e0178ef68> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2123)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1182)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
>     at java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1054)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1114)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
>     at java.lang.Thread.run([email protected]/Thread.java:829)
> {code}
> I found that the problem is in the following code; we should give the 
> ScheduledThreadPoolExecutor a maximum size.
>  
> !image-2022-06-29-09-37-07-089.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)