[ https://issues.apache.org/jira/browse/SLING-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carsten Ziegeler updated SLING-13130:
-------------------------------------
Labels: Concurrency (was: )
> JobQueueImpl.startJobsGuard and semaphore permit leak when
> threadPool.execute() throws
> --------------------------------------------------------------------------------------
>
> Key: SLING-13130
> URL: https://issues.apache.org/jira/browse/SLING-13130
> Project: Sling
> Issue Type: Bug
> Components: Event
> Reporter: Carsten Ziegeler
> Priority: Major
> Labels: Concurrency
>
> In JobQueueImpl.startJobs(), if threadPool.execute() throws (e.g.
> RejectedExecutionException because the pool is full or shut down), two
> resources are permanently leaked:
> 1. The startJobsGuard AtomicBoolean remains true because the exception
> propagates past startJobsGuard.set(false) at the end of the method.
> 2. A permit of the available semaphore is lost because started=true is set
> before the execute() call, so the finally block skips available.release().
> The result is a permanently dead queue -- no new jobs can start because
> startJobsGuard blocks all entry, and maintain() (every 60s) is also blocked.
> This persists until JVM restart.
> Proposed fix:
> 1. Wrap the entire while-loop in a try/finally and call
> startJobsGuard.set(false) in the finally block.
> 2. Move started=true to after threadPool.execute() succeeds, so the finally
> block correctly releases the available permit on exception.
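
The two proposed changes can be sketched as follows. This is a minimal,
simplified illustration and not the actual JobQueueImpl source: the class
name StartJobsSketch, the jobs queue and the constructor are hypothetical,
and only startJobsGuard, available, started and threadPool.execute() are
taken from the description above. The sketch also assumes that execute()
throws only when it rejects the task.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical, simplified stand-in for JobQueueImpl (not the Sling source). */
class StartJobsSketch {
    final AtomicBoolean startJobsGuard = new AtomicBoolean(false);
    final Semaphore available;
    final Queue<Runnable> jobs = new ConcurrentLinkedQueue<>(); // hypothetical job store
    private final Executor threadPool;

    StartJobsSketch(int maxParallel, Executor threadPool) {
        this.available = new Semaphore(maxParallel);
        this.threadPool = threadPool;
    }

    void startJobs() {
        // Only one thread at a time may start jobs.
        if (!startJobsGuard.compareAndSet(false, true)) {
            return;
        }
        try {
            while (available.tryAcquire()) {
                boolean started = false;
                try {
                    final Runnable job = jobs.peek();
                    if (job == null) {
                        break; // nothing queued; the finally below returns the permit
                    }
                    // Assumption: execute() throws only if the task is rejected.
                    threadPool.execute(() -> {
                        try {
                            job.run();
                        } finally {
                            available.release(); // permit is returned when the job ends
                        }
                    });
                    started = true; // fix 2: set only after execute() accepted the task
                    jobs.poll();    // remove the job only once it was handed over
                } finally {
                    if (!started) {
                        available.release(); // also reached when execute() throws
                    }
                }
            }
        } finally {
            startJobsGuard.set(false); // fix 1: always reset, even on exception
        }
    }
}
```

In this sketch the exception from execute() still propagates to the caller;
whether it should instead be caught and logged is a separate decision that
the proposal above leaves open. The rejected job stays in the queue, so a
later startJobs() call can try it again.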