Carsten Ziegeler created SLING-2896:
---------------------------------------
Summary: Job might be executed twice if a topology event occurs
Key: SLING-2896
URL: https://issues.apache.org/jira/browse/SLING-2896
Project: Sling
Issue Type: Bug
Components: Extensions
Affects Versions: Extensions Event 3.2.0
Reporter: Carsten Ziegeler
Assignee: Carsten Ziegeler
Fix For: Extensions Event 3.2.0
If a parallel queue is used (either parallel or round robin) with a limit of N
parallel jobs and there are X > N jobs in the queue, job N+1 might get
processed twice if a topology change occurs:
Assume we have a parallel queue with max parallel processing set to 8 and 15
jobs entering the queue.
AbstractParallelJobQueue.start is called with the first 8 jobs. All fine: the
limit is not yet hit, so those jobs are started.
When AbstractParallelJobQueue.start is called with job #9, the acquireSlot()
method waits, since the queue is full/busy. The waiting is done in
acquireSlot(), in syncLock.wait() (line 78 of AbstractParallelJobQueue).
The calling method, start(JobHandler), keeps a reference to job #9 while it
waits!
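For illustration, a minimal sketch of the wait-for-a-slot pattern described above. This is not the actual AbstractParallelJobQueue code: the class name SlotLimiter, the freeSlot() method and the blocksWhenFull() helper are made up for this example; only acquireSlot() and syncLock are names taken from the description. It shows that the caller of acquireSlot() is parked while all slots are taken, and that nothing is recorded about the job it is holding until a slot frees up.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the slot handling of a parallel queue. */
class SlotLimiter {
    private final Object syncLock = new Object();
    private final int maxParallel;
    private int inUse;

    SlotLimiter(int maxParallel) {
        this.maxParallel = maxParallel;
    }

    /** Blocks until fewer than maxParallel slots are in use, then takes one. */
    void acquireSlot() throws InterruptedException {
        synchronized (syncLock) {
            while (inUse >= maxParallel) {
                syncLock.wait(); // the caller still holds its job reference here
            }
            inUse++;
        }
    }

    /** Gives a slot back and wakes up one waiting caller. */
    void freeSlot() {
        synchronized (syncLock) {
            inUse--;
            syncLock.notify();
        }
    }

    /** Returns true if caller N+1 stayed blocked until a slot was freed. */
    static boolean blocksWhenFull(int maxParallel) {
        try {
            SlotLimiter limiter = new SlotLimiter(maxParallel);
            for (int i = 0; i < maxParallel; i++) {
                limiter.acquireSlot(); // jobs 1..N get a slot immediately
            }
            AtomicBoolean started = new AtomicBoolean(false);
            Thread caller = new Thread(() -> {
                try {
                    limiter.acquireSlot(); // job N+1 waits here
                    started.set(true);     // only now would the job be started
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            caller.start();
            caller.join(200); // give the caller time to reach wait()
            boolean blockedWhileFull = caller.isAlive() && !started.get();
            limiter.freeSlot(); // one running job finishes
            caller.join();
            return blockedWhileFull && started.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("caller blocked while full: " + blocksWhenFull(8));
    }
}
```

The window between entering acquireSlot() and started.set(true) is where job #9 is known only to the waiting thread.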
In the meantime a TopologyEvent occurs. AFAICS this triggers 'outdating' the
existing queue and creating a new one.
BackgroundLoader.loadJobsInTheBackground starts filling the new queue.
Unfortunately, the BackgroundLoader schedules job #9 too, since it is not yet
marked as running in the repository in any way: only the previous queue holds
a reference to it, in the above-mentioned start(JobHandler) method.
=> Thus job #9 is executed the first time by the new queue.
Eventually the outdated queue finishes executing its running jobs. The above
acquireSlot() call returns, and:
=> job #9 is executed the second time (by the outdated queue).
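The sequence above can be replayed with a small single-threaded model. This is only a sketch under simplifying assumptions, not Sling code: the class DoubleExecutionReplay and its replay() method are invented for this example, the repository state is reduced to a set of job ids marked as started, and the three phases (old queue fills up, topology event reloads jobs, old queue resumes) run one after the other instead of concurrently.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Simplified, single-threaded replay of the scenario; not Sling code. */
class DoubleExecutionReplay {

    /** Replays the scenario and returns how often each job id was executed. */
    static Map<Integer, Integer> replay(int maxParallel, int totalJobs) {
        if (totalJobs <= maxParallel) {
            throw new IllegalArgumentException("scenario needs more jobs than slots");
        }
        Map<Integer, Integer> executions = new LinkedHashMap<>();
        Set<Integer> markedStarted = new HashSet<>(); // stands in for the repository

        // Old queue: the first maxParallel jobs get a slot, are marked and executed.
        for (int id = 1; id <= maxParallel; id++) {
            markedStarted.add(id);
            executions.merge(id, 1, Integer::sum);
        }

        // Old queue: start() holds the next job while acquireSlot() blocks.
        // The job is NOT marked as started at this point.
        int heldByOldQueue = maxParallel + 1;

        // Topology event: a new queue is created and the background loader
        // schedules every job that is not marked as started.
        for (int id = 1; id <= totalJobs; id++) {
            if (markedStarted.add(id)) {
                executions.merge(id, 1, Integer::sum);
            }
        }

        // Old queue: a slot frees up, acquireSlot() returns, and the held job
        // is executed without its state being checked again.
        executions.merge(heldByOldQueue, 1, Integer::sum);
        return executions;
    }

    public static void main(String[] args) {
        replay(8, 15).forEach((id, count) -> {
            if (count > 1) {
                System.out.println("job #" + id + " executed " + count + " times");
            }
        });
    }
}
```

With 8 slots and 15 jobs, the only job reported is #9, matching the description: it is run once by the new queue and once more by the outdated queue.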