[
https://issues.apache.org/jira/browse/RATIS-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17560981#comment-17560981
]
Baolong Mao commented on RATIS-1603:
------------------------------------
[~szetszwo] I'm sorry for pointing this issue in the wrong direction. After
studying `ScheduledThreadPoolExecutor` more closely, I found that it never adds
worker threads beyond the core pool size when new tasks are submitted to the pool.
I wrote a simple example to confirm this:
{code:java}
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

import org.apache.ratis.util.Daemon;

public class Example {
  public static void main(String[] args) {
    // Core pool size 1, with Ratis Daemon threads as workers.
    final ScheduledThreadPoolExecutor e = new ScheduledThreadPoolExecutor(1,
        (ThreadFactory) Daemon::new);
    e.setRemoveOnCancelPolicy(true);
    // Keep scheduling long-running tasks; the loop never terminates on purpose.
    while (true) {
      e.schedule(() -> {
        try {
          Thread.sleep(100000L);
        } catch (InterruptedException ex) {
          ex.printStackTrace();
        }
      }, 1, TimeUnit.MILLISECONDS);
    }
  }
}
{code}
A few minutes later, the thread pool still had only one thread, but a very large
number of tasks had accumulated in its queue.
So I think I misunderstood `ScheduledThreadPoolExecutor`: the executor in the
screenshot I posted should also have only one thread.
I'm not sure yet which thread pool is actually causing the problem...
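For anyone who wants to reproduce the observation without an endless loop, here is a bounded sketch of the same experiment. It is only an illustration: the class name `PoolSizeCheck`, the `measure` helper, and the task count are my own choices, and it uses the default thread factory instead of the Ratis `Daemon` so that it runs with the JDK alone.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizeCheck {
  /**
   * Schedules {@code tasks} blocking tasks on an executor with core pool size 1
   * and returns {poolSize, largestPoolSize, queuedTasks}.
   * (Hypothetical helper, not part of Ratis.)
   */
  static int[] measure(int tasks) throws InterruptedException {
    final ScheduledThreadPoolExecutor e = new ScheduledThreadPoolExecutor(1);
    e.setRemoveOnCancelPolicy(true);
    final CountDownLatch started = new CountDownLatch(1);
    final CountDownLatch release = new CountDownLatch(1);
    try {
      for (int i = 0; i < tasks; i++) {
        e.schedule(() -> {
          started.countDown();
          try {
            // Block the worker so that the remaining tasks stay queued.
            release.await();
          } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
          }
        }, 1, TimeUnit.MILLISECONDS);
      }
      // Wait until the single worker has picked up the first task.
      started.await();
      return new int[] {e.getPoolSize(), e.getLargestPoolSize(), e.getQueue().size()};
    } finally {
      // Unblock the worker and discard the queued tasks so the JVM can exit.
      release.countDown();
      e.shutdownNow();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    final int[] r = measure(10_000);
    System.out.println("poolSize=" + r[0]
        + " largestPoolSize=" + r[1] + " queued=" + r[2]);
  }
}
```

With a core pool size of 1, this should report a pool size and a largest pool size of 1, with all remaining tasks waiting in the queue. That matches what I saw above: submitting more tasks grows the queue, not the number of threads.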
> TimeoutScheduler can have a huge amount of threads and cause OOM
> ----------------------------------------------------------------
>
> Key: RATIS-1603
> URL: https://issues.apache.org/jira/browse/RATIS-1603
> Project: Ratis
> Issue Type: Bug
> Components: util
> Affects Versions: 2.0.0
> Reporter: Baolong Mao
> Assignee: Tsz-wo Sze
> Priority: Major
> Attachments: image-2022-06-29-09-33-28-818.png,
> image-2022-06-29-09-37-07-089.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> !image-2022-06-29-09-33-28-818.png!
> The call stack is the same for all 20k threads:
> {code:java}
> "java.util.concurrent.ThreadPoolExecutor$Worker@64f771ea[State = -1, empty
> queue]" #486880 daemon prio=5 os_prio=0 cpu=0.14ms elapsed=35.35s
> allocated=1696B defined_classes=0 tid=0x00007f136fb65800 nid=0x2130e1 waiting
> on condition [0x00007f0f94ccf000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
> - parking to wait for <0x00007f1e0178ef68> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at
> java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2123)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1182)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
> at
> java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1054)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1114)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
> at java.lang.Thread.run([email protected]/Thread.java:829) {code}
> I think the problem is in the following code; we should set a maximum pool size
> for the ScheduledThreadPoolExecutor.
>
> !image-2022-06-29-09-37-07-089.png!