[ 
https://issues.apache.org/jira/browse/RATIS-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564107#comment-17564107
 ] 

Baolong Mao commented on RATIS-1603:
------------------------------------

[~szetszwo] Sorry, for non-technical reasons we have not been able to 
reproduce this issue recently, but I will try your patch as soon as possible.

As for the screenshot, I guess there were many independent thread pools at 
that time. The following is part of the jstack output. We can see that the 
threads are waiting on different condition objects, so I guess there are 20k+ 
separate DelayedWorkQueue instances. WDYT?

 
{code:java}
"java.util.concurrent.ThreadPoolExecutor$Worker@64f771ea[State = -1, empty 
queue]" #486880 daemon prio=5 os_prio=0 cpu=0.14ms elapsed=35.35s 
allocated=1696B defined_classes=0 tid=0x00007f136fb65800 nid=0x2130e1 waiting 
on condition  [0x00007f0f94ccf000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    - parking to wait for  <0x00007f1e0178ef68> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at 
java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2123)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1182)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
    at 
java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1054)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1114)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run([email protected]/Thread.java:829)
"java.util.concurrent.ThreadPoolExecutor$Worker@2a196ae8[State = -1, empty 
queue]" #486881 daemon prio=5 os_prio=0 cpu=0.12ms elapsed=35.35s 
allocated=1736B defined_classes=0 tid=0x00007f136f20c000 nid=0x2130e2 waiting 
on condition  [0x00007f0e343ee000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    - parking to wait for  <0x00007f1e004952e0> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at 
java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2123)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1182)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
    at 
java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1054)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1114)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run([email protected]/Thread.java:829)
"java.util.concurrent.ThreadPoolExecutor$Worker@1efd8a2a[State = -1, empty 
queue]" #486883 daemon prio=5 os_prio=0 cpu=0.11ms elapsed=35.35s 
allocated=1608B defined_classes=0 tid=0x00007f136d0f2000 nid=0x2130e4 waiting 
on condition  [0x00007f0e7e6ea000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    - parking to wait for  <0x00007f1e0068b5a0> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at 
java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2123)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1182)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
    at 
java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1054)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1114)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run([email protected]/Thread.java:829)
"java.util.concurrent.ThreadPoolExecutor$Worker@1aa14531[State = -1, empty 
queue]" #486884 daemon prio=5 os_prio=0 cpu=0.11ms elapsed=35.35s 
allocated=1608B defined_classes=0 tid=0x00007f136d0e2000 nid=0x2130e5 waiting 
on condition  [0x00007f0e35905000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    - parking to wait for  <0x00007f1e00732d80> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at 
java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2123)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1182)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
    at 
java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1054)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1114)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run([email protected]/Thread.java:829)
"java.util.concurrent.ThreadPoolExecutor$Worker@7f8de832[State = -1, empty 
queue]" #486886 daemon prio=5 os_prio=0 cpu=0.11ms elapsed=35.35s 
allocated=1608B defined_classes=0 tid=0x00007f136c484000 nid=0x2130e7 waiting 
on condition  [0x00007f0e38a34000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    - parking to wait for  <0x00007f1e01449c88> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at 
java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2123)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1182)
    at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
    at 
java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1054)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1114)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run([email protected]/Thread.java:829){code}
I have uploaded the whole dump, stack.1656358202011988877, as an attachment. 
Please take a look. Thanks.
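
To check the guess about independent thread pools against the full dump, the distinct condition addresses can be counted. The following is only a rough sketch: the class and method names are hypothetical, and it assumes the dump uses the same "parking to wait for <0x...>" line format as the excerpt above. If nearly every address has exactly one parked thread, that is consistent with each thread belonging to its own ScheduledThreadPoolExecutor, since each DelayedWorkQueue creates its own condition object.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ratis: counts how many threads in a
// jstack dump are parked on each condition object.
class ParkedConditionCounter {
    // Matches lines such as:
    //   - parking to wait for  <0x00007f1e0178ef68> (a ...ConditionObject)
    private static final Pattern PARKED =
        Pattern.compile("parking to wait for\\s+<(0x[0-9a-fA-F]+)>");

    /** Maps each condition address to the number of threads parked on it. */
    static Map<String, Integer> countParkedConditions(String dump) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = PARKED.matcher(dump);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ParkedConditionCounter.java <jstack-file>");
            return;
        }
        String dump = new String(
            Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        Map<String, Integer> counts = countParkedConditions(dump);
        // Threads parked alone on their condition suggest one pool per thread.
        long alone = counts.values().stream().filter(n -> n == 1).count();
        System.out.println("distinct conditions: " + counts.size());
        System.out.println("conditions with a single parked thread: " + alone);
    }
}
```

Run against the attached dump, this would show whether the 20k threads really wait on 20k different conditions or share only a few.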

 

> TimeoutScheduler can have a huge amount of threads and cause OOM
> ----------------------------------------------------------------
>
>                 Key: RATIS-1603
>                 URL: https://issues.apache.org/jira/browse/RATIS-1603
>             Project: Ratis
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.0.0
>            Reporter: Baolong Mao
>            Assignee: Tsz-wo Sze
>            Priority: Major
>         Attachments: image-2022-06-29-09-33-28-818.png, 
> image-2022-06-29-09-37-07-089.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> !image-2022-06-29-09-33-28-818.png!
> The call stack is the same for all 20k threads:
> {code:java}
> "java.util.concurrent.ThreadPoolExecutor$Worker@64f771ea[State = -1, empty 
> queue]" #486880 daemon prio=5 os_prio=0 cpu=0.14ms elapsed=35.35s 
> allocated=1696B defined_classes=0 tid=0x00007f136fb65800 nid=0x2130e1 waiting 
> on condition  [0x00007f0f94ccf000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
>     - parking to wait for  <0x00007f1e0178ef68> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at 
> java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2123)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1182)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
>     at 
> java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1054)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1114)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
>     at java.lang.Thread.run([email protected]/Thread.java:829) {code}
> I found that the problem is in the following code; we should give the 
> ScheduledThreadPoolExecutor a maximum size.
>  
> !image-2022-06-29-09-37-07-089.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
