[
https://issues.apache.org/jira/browse/SLING-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970568#comment-13970568
]
Marc Pfaff commented on SLING-3502:
-----------------------------------
[~cziegeler]: The attached IT still fails with your patch. The problem is that
it uses JobManager.getQueue(), which does not explicitly check for the main
queue. IMHO this reveals the deeper problem: the code does not consistently use
the same values for reading and writing the queue map, which already led to
SLING-3381. If we start treating the main queue differently from the other
queues, we need to explicitly check for the main queue name whenever the queue
map is read or written. JobManagerImpl.process() is another such candidate.
IMHO the filterName() method does not avoid name clashes between active queues
and outdated queues: filtering the same queue name always returns the same
filtered queue name. Aren't name clashes already avoided by
AbstractJobQueue.outdate()? There the queue is renamed to a unique name by
appending <outdated> (<hashcode>) to the queue name. This makes me wonder
whether filterName() is needed at all for reading and writing the queue map.
I'll try to complete your patch by covering all read/write access to the queue
map.
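To illustrate the mismatch, here is a minimal, self-contained sketch. It is
not the Sling code: the class name, the String map values and the body of
filterName() are assumptions made up for illustration (the real filterName()
may filter differently). It only shows that writing under the raw name and
looking up under the filtered name leaves a stale <main queue> entry behind,
while using one key function for every access does not.

```java
import java.util.HashMap;
import java.util.Map;

public class QueueMapKeySketch {

    static final String MAIN_QUEUE = "<main queue>";

    // Hypothetical stand-in for filterName(): replaces every character
    // outside [A-Za-z0-9_-] with '_'. The real implementation may differ.
    static String filterName(String name) {
        return name.replaceAll("[^A-Za-z0-9_-]", "_");
    }

    // Mismatched access: the entry is written under the raw name but
    // removed under the filtered name, so the removal never finds it.
    static Map<String, String> mismatched() {
        Map<String, String> queues = new HashMap<>();
        queues.put(MAIN_QUEUE, "queue instance");     // write path: unfiltered key
        queues.remove(filterName(MAIN_QUEUE));        // outdate path: filtered key
        return queues;
    }

    // Consistent access: every read and write goes through the same key
    // function, so the entry is found and removed.
    static Map<String, String> consistent() {
        Map<String, String> queues = new HashMap<>();
        queues.put(filterName(MAIN_QUEUE), "queue instance");
        queues.remove(filterName(MAIN_QUEUE));
        return queues;
    }

    public static void main(String[] args) {
        // The stale entry is still reachable under the raw main queue name.
        System.out.println(mismatched().containsKey(MAIN_QUEUE));
        // With a single key function nothing is left behind.
        System.out.println(consistent().isEmpty());
    }
}
```

Whether the shared key function filters or not matters less than using the
same one on every access to the map.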
> Main job queue is not properly outdated
> ----------------------------------------
>
> Key: SLING-3502
> URL: https://issues.apache.org/jira/browse/SLING-3502
> Project: Sling
> Issue Type: Bug
> Components: Extensions
> Affects Versions: Event 3.3.8
> Reporter: Marc Pfaff
> Assignee: Stefan Egli
> Attachments: SLING-3502-2.patch, SLING-3502-IT.patch,
> SLING-3502.patch, SLING-3502.patch
>
>
> The default job queue, called <main queue>, appears not to be properly outdated.
> The JobManager keeps an internal map of currently running job queues, indexed
> by queue name. The code to outdate a queue (JobManagerImpl.outdateQueue()) uses
> a filtered queue name to look up the queue to outdate in this map. But the
> part that creates the queue, puts it in the map and uses it
> (JobManagerImpl.process()) does not filter the queue name.
> After outdating the main queue like this, there are two or more main queue
> entries in the map, depending on the number of topology changes that happened,
> pointing to the same outdated queue instance. As one of the entries is still
> indexed with <main queue>, new jobs that use the main queue are always
> assigned an outdated queue. That's a dead end, as outdated queues do not
> appear to have a queue thread running any more.
> To reproduce:
> * Start one instance
> * Start a job that uses the main queue, so one instance of the main queue is
> created. This job passes fine.
> * Trigger a topology change, e.g. by adding a second instance to the same
> topology
> * Check the job manager in the Sling console. You should see two outdated main
> queues, properly labeled as outdated, but one of them is internally still
> indexed by <main queue>
> * Start another job that uses the main queue. This job and all following jobs
> using main queue never get executed
--
This message was sent by Atlassian JIRA
(v6.2#6252)