Carsten Ziegeler created SLING-13130:
----------------------------------------

             Summary: JobQueueImpl.startJobsGuard and semaphore permit leak 
when threadPool.execute() throws
                 Key: SLING-13130
                 URL: https://issues.apache.org/jira/browse/SLING-13130
             Project: Sling
          Issue Type: Bug
          Components: Event
            Reporter: Carsten Ziegeler


In JobQueueImpl.startJobs(), if threadPool.execute() throws (e.g. 
RejectedExecutionException because the pool is full or shut down), two 
resources are permanently leaked:

1. The startJobsGuard AtomicBoolean remains true because the exception 
propagates past startJobsGuard.set(false) at the end of the method.
2. A permit of the available semaphore is lost because started=true is set 
before the execute() call, so the finally block skips available.release().
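
The two leaks above can be reproduced with a minimal sketch. This is a 
simplified, hypothetical model of the pattern described in this issue, not the 
actual JobQueueImpl source: the class name LeakyQueueSketch, the pendingJobs 
counter and the rejectingPool() helper are assumptions made for illustration.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified model of the leaking pattern (not the real JobQueueImpl).
class LeakyQueueSketch {
    final AtomicBoolean startJobsGuard = new AtomicBoolean(false);
    final Semaphore available = new Semaphore(1); // one job slot for the demo
    final Executor threadPool;
    int pendingJobs; // stand-in for the real job queue

    LeakyQueueSketch(Executor threadPool, int pendingJobs) {
        this.threadPool = threadPool;
        this.pendingJobs = pendingJobs;
    }

    // Illustration helper: an executor that rejects every task.
    static Executor rejectingPool() {
        return task -> {
            throw new RejectedExecutionException("pool is shut down");
        };
    }

    void startJobs() {
        if (!startJobsGuard.compareAndSet(false, true)) {
            return; // someone else is (apparently) already starting jobs
        }
        while (pendingJobs > 0 && available.tryAcquire()) {
            boolean started = false;
            try {
                pendingJobs--;
                started = true; // set too early
                // The job gives its permit back when it finishes.
                threadPool.execute(() -> available.release()); // may throw
            } finally {
                if (!started) {
                    available.release(); // skipped: started is already true
                }
            }
        }
        startJobsGuard.set(false); // never reached when execute() throws
    }

    public static void main(String[] args) {
        LeakyQueueSketch queue = new LeakyQueueSketch(rejectingPool(), 1);
        try {
            queue.startJobs();
        } catch (RejectedExecutionException e) {
            // expected in this demo
        }
        // Prints: guard=true permits=0
        System.out.println("guard=" + queue.startJobsGuard.get()
                + " permits=" + queue.available.availablePermits());
    }
}
```

After the rejected execute() call the guard is still true and the only permit 
is gone, so every later startJobs() call in this model returns immediately.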

The result is a permanently dead queue -- no new jobs can start because 
startJobsGuard blocks all entry to startJobs(), and the periodic maintain() 
call (every 60s) is blocked as well. The queue stays dead until the JVM is 
restarted.

Proposed fix:
1. Wrap the entire while-loop in a try/finally and call 
startJobsGuard.set(false) in the finally block.
2. Move started=true to after threadPool.execute() succeeds, so the finally 
block correctly releases the available permit on exception.
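
A minimal sketch of how the two proposed changes could look, under stated 
assumptions: the class FixedQueueSketch, the pendingJobs counter and the 
rejectingPool() helper are hypothetical and only model the pattern; this is 
not the actual patch. What should happen to the rejected job itself is outside 
the scope of this sketch.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified model of the proposed fix (not the real JobQueueImpl).
class FixedQueueSketch {
    final AtomicBoolean startJobsGuard = new AtomicBoolean(false);
    final Semaphore available = new Semaphore(1); // one job slot for the demo
    final Executor threadPool;
    int pendingJobs; // stand-in for the real job queue

    FixedQueueSketch(Executor threadPool, int pendingJobs) {
        this.threadPool = threadPool;
        this.pendingJobs = pendingJobs;
    }

    // Illustration helper: an executor that rejects every task.
    static Executor rejectingPool() {
        return task -> {
            throw new RejectedExecutionException("pool is shut down");
        };
    }

    void startJobs() {
        if (!startJobsGuard.compareAndSet(false, true)) {
            return;
        }
        try { // fix 1: the guard is reset on every exit path
            while (pendingJobs > 0 && available.tryAcquire()) {
                boolean started = false;
                try {
                    pendingJobs--;
                    // The job gives its permit back when it finishes.
                    threadPool.execute(() -> available.release()); // may throw
                    started = true; // fix 2: only set once execute() succeeded
                } finally {
                    if (!started) {
                        available.release(); // permit returned on failure
                    }
                }
            }
        } finally {
            startJobsGuard.set(false);
        }
    }

    public static void main(String[] args) {
        FixedQueueSketch queue = new FixedQueueSketch(rejectingPool(), 1);
        try {
            queue.startJobs();
        } catch (RejectedExecutionException e) {
            // still propagates, but nothing is leaked
        }
        // Prints: guard=false permits=1
        System.out.println("guard=" + queue.startJobsGuard.get()
                + " permits=" + queue.available.availablePermits());
    }
}
```

The exception still reaches the caller, but the guard is reset and the permit 
is returned, so a later startJobs() call can enter the loop again.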



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
