[ https://issues.apache.org/jira/browse/SLING-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628427#comment-17628427 ]
Stefan Egli commented on SLING-11662:
-------------------------------------

The endless loop seems to be due to a breach of contract by sling.commons.scheduler.QuartzThreadPool:
* The [Quartz ThreadPool javadoc|https://github.com/quartz-scheduler/quartz/blob/v2.3.2/quartz-core/src/main/java/org/quartz/spi/ThreadPool.java#L69-L82] says:
{quote}The implementation of this method should block until there is at least one available thread.{quote}
* However, [sling.commons.scheduler.QuartzThreadPool#blockForAvailableThreads|https://github.com/apache/sling-org-apache-sling-commons-scheduler/blob/a9ddf38ea9d9962c8938a381135827072fc9397f/src/main/java/org/apache/sling/commons/scheduler/impl/QuartzThreadPool.java#L80] does not guarantee a return value {{>0}}. In particular, if "maxPoolSize == queueSize", this method will return 0.
* That in turn leads Quartz to hit the [ironically commented line|https://github.com/quartz-scheduler/quartz/blob/v2.3.2/quartz-core/src/main/java/org/quartz/core/QuartzSchedulerThread.java#L411-L414]
{code}
} else { // if(availThreadCount > 0)
    // should never happen, if threadPool.blockForAvailableThreads() follows contract
    continue; // while (!halted)
}
{code}
so it will just .. continue.
* This game then repeats forever, until .. the CPU becomes too hot due to constant 100% spinning and it breaks down .. leading to damage in a datacenter and so on and so forth.

Presumably Quartz did nothing wrong here, other than perhaps not adding a safety/paranoia {{Thread.sleep(1);}} before that {{continue}} to avoid this. The problem is rather on the Sling side.

The question is how best to fix this. [~cziegeler], [~joerghoh], any suggestions?
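To make the contract concrete, here is a minimal sketch of a method that blocks until at least one slot is free and therefore never returns 0. It is not the Sling or Quartz implementation: {{AvailabilityGate}}, {{tryAcquire}}, {{release}} and the {{capacity}} parameter are hypothetical names chosen for illustration, and the sketch deliberately ignores shutdown handling and the queue that the real pool has.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper for illustration only; not part of Sling or Quartz. */
class AvailabilityGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition slotFreed = lock.newCondition();
    private final int capacity; // assumed fixed number of worker slots
    private int busy;           // slots currently in use

    AvailabilityGate(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be > 0");
        }
        this.capacity = capacity;
    }

    /** Marks one slot as in use; returns false if no slot is free. */
    boolean tryAcquire() {
        lock.lock();
        try {
            if (busy >= capacity) {
                return false;
            }
            busy++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Frees one slot and wakes up any caller waiting for availability. */
    void release() {
        lock.lock();
        try {
            if (busy > 0) {
                busy--;
                slotFreed.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /**
     * Blocks until at least one slot is free, then returns the number of
     * free slots. The loop guarantees the result is always > 0.
     */
    int blockForAvailableThreads() {
        lock.lock();
        try {
            while (busy >= capacity) {
                // Sleeps instead of spinning; woken by release().
                slotFreed.awaitUninterruptibly();
            }
            return capacity - busy;
        } finally {
            lock.unlock();
        }
    }
}
```

The point of the sketch is the {{while}} loop around the wait: the caller sleeps until {{release()}} signals, so the scheduler thread would neither see 0 nor burn CPU. Whether something like this fits the existing pool, where availability also depends on the queue, is an open question.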
> Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize
> -------------------------------------------------------------------------
>
>                 Key: SLING-11662
>                 URL: https://issues.apache.org/jira/browse/SLING-11662
>             Project: Sling
>          Issue Type: Bug
>          Components: Commons
>    Affects Versions: Commons Scheduler 2.7.12
>            Reporter: Stefan Egli
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When configuring the ThreadPool with maxPoolSize == queueSize, an endless
> loop can happen in QuartzSchedulerThread.run(), which manifests as
> follows:
> {noformat}
> "MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms elapsed=5163.45s tid=0x000012345678ff00 nid=0x1234 runnable [0x000087654321ff00]
>    java.lang.Thread.State: RUNNABLE
>         at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)