[ 
https://issues.apache.org/jira/browse/SLING-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carsten Ziegeler updated SLING-13130:
-------------------------------------
    Labels: Concurrency  (was: )

> JobQueueImpl.startJobsGuard and semaphore permit leak when 
> threadPool.execute() throws
> --------------------------------------------------------------------------------------
>
>                 Key: SLING-13130
>                 URL: https://issues.apache.org/jira/browse/SLING-13130
>             Project: Sling
>          Issue Type: Bug
>          Components: Event
>            Reporter: Carsten Ziegeler
>            Priority: Major
>              Labels: Concurrency
>
> In JobQueueImpl.startJobs(), if threadPool.execute() throws (e.g. 
> RejectedExecutionException because the pool is full or shut down), two 
> resources are permanently leaked:
> 1. The startJobsGuard AtomicBoolean remains true because the exception 
> propagates past startJobsGuard.set(false) at the end of the method.
> 2. A permit of the "available" semaphore is lost because started=true is set 
> before the execute() call, so the finally block skips available.release().
> The result is a permanently dead queue -- no new jobs can start because 
> startJobsGuard blocks all entry, and maintain() (every 60s) is also blocked. 
> This persists until JVM restart.
> Proposed fix:
> 1. Move startJobsGuard.set(false) into the finally block of a try/finally 
> that wraps the entire while-loop.
> 2. Move started=true to after threadPool.execute() succeeds, so the finally 
> block correctly releases the available permit on exception.
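
The two proposed changes can be illustrated with a minimal, self-contained sketch. This is not the actual JobQueueImpl source: the class GuardedQueueSketch, its pending queue, and the add() method are hypothetical stand-ins, and only the names startJobsGuard, available, threadPool, and started are taken from the description above. It assumes the guard is an AtomicBoolean acquired via compareAndSet and that each running job holds one permit until it finishes.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the queue; not the real JobQueueImpl.
class GuardedQueueSketch {
    final AtomicBoolean startJobsGuard = new AtomicBoolean(false);
    final Semaphore available;
    private final Executor threadPool;
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();

    GuardedQueueSketch(int maxParallel, Executor threadPool) {
        this.available = new Semaphore(maxParallel);
        this.threadPool = threadPool;
    }

    void add(Runnable job) {
        pending.add(job);
    }

    void startJobs() {
        if (!startJobsGuard.compareAndSet(false, true)) {
            return; // another thread is already starting jobs
        }
        try {
            while (available.tryAcquire()) {
                boolean started = false;
                try {
                    final Runnable job = pending.poll();
                    if (job == null) {
                        break; // nothing queued; the finally below returns the permit
                    }
                    threadPool.execute(() -> {
                        try {
                            job.run();
                        } catch (RuntimeException e) {
                            // a failing job is not a failed start; swallowed in this sketch
                        } finally {
                            available.release(); // the running job gives its permit back
                        }
                    });
                    // Fix 2: set only after execute() has accepted the task.
                    started = true;
                } finally {
                    if (!started) {
                        available.release(); // permit was never handed to a job
                    }
                }
            }
        } finally {
            // Fix 1: reset the guard even if execute() throws.
            startJobsGuard.set(false);
        }
    }
}
```

With an executor that rejects every task, startJobs() still propagates the exception, but afterwards the guard is false and all permits are available again, so a later call can proceed. The sketch does not re-queue the job that was polled before the rejection; how the real queue should handle that job is outside what the issue describes.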



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
