[ 
https://issues.apache.org/jira/browse/SLING-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carsten Ziegeler updated SLING-11095:
-------------------------------------
    Labels: Concurrency  (was: )

> Reduce repo traversals when scheduling next job
> -----------------------------------------------
>
>                 Key: SLING-11095
>                 URL: https://issues.apache.org/jira/browse/SLING-11095
>             Project: Sling
>          Issue Type: Improvement
>          Components: Event
>    Affects Versions: Event 4.2.24
>            Reporter: Joerg Hoh
>            Priority: Major
>              Labels: Concurrency
>
> In an AEM instance where a lot of Sling Job processing was ongoing I traced 
> all cases where a ResourceResolver was opened. I did this for a period of 
> around 30 seconds. 
> In 154 times the stacktrace looked like this:
> {noformat:title=loadJobs}
> [...]
>       at 
> org.apache.sling.resourceresolver.impl.ResourceResolverFactoryImpl.getServiceResourceResolver(ResourceResolverFactoryImpl.java:89)
>  [org.apache.sling.resourceresolver:1.8.0]
>       at 
> org.apache.sling.event.impl.jobs.config.JobManagerConfiguration.createResourceResolver(JobManagerConfiguration.java:298)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.QueueJobCache.loadJobs(QueueJobCache.java:224)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.QueueJobCache.getNextJob(QueueJobCache.java:180)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.JobQueueImpl.startJobs(JobQueueImpl.java:261)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.QueueManager.start(QueueManager.java:275)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.QueueManager.handleEvent(QueueManager.java:460)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:431)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.felix.eventadmin.impl.tasks.HandlerTask.runWithoutDenylistTiming(HandlerTask.java:82)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.felix.eventadmin.impl.tasks.SyncDeliverTasks.execute(SyncDeliverTasks.java:107)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.felix.eventadmin.impl.handler.EventAdminImpl.sendEvent(EventAdminImpl.java:155)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.felix.eventadmin.impl.security.EventAdminSecurityDecorator.sendEvent(EventAdminSecurityDecorator.java:96)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.sling.event.impl.jobs.notifications.NewJobSender.onChange(NewJobSender.java:121)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.resourceresolver.impl.observation.BasicObservationReporter.reportChanges(BasicObservationReporter.java:211)
>  [org.apache.sling.resourceresolver:1.8.0]
>       at 
> org.apache.sling.jcr.resource.internal.JcrResourceListener.onEvent(JcrResourceListener.java:155)
>  [org.apache.sling.jcr.resource:3.1.0]
>       at 
> org.apache.jackrabbit.commons.observation.ListenerTracker$1.onEvent(ListenerTracker.java:190)
>  [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.0]
>  {noformat}
> And in another 254 cases the stacktrace looked like this:
> {noformat:title=startProcessing}
>       at 
> org.apache.sling.resourceresolver.impl.ResourceResolverFactoryImpl.getServiceResourceResolver(ResourceResolverFactoryImpl.java:89)
>  [org.apache.sling.resourceresolver:1.8.0]
>       at 
> org.apache.sling.event.impl.jobs.config.JobManagerConfiguration.createResourceResolver(JobManagerConfiguration.java:298)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.JobHandler.persistJobProperties(JobHandler.java:217)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.JobHandler.startProcessing(JobHandler.java:74)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.QueueJobCache.getNextJob(QueueJobCache.java:190)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.JobQueueImpl.startJobs(JobQueueImpl.java:261)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.QueueManager.start(QueueManager.java:275)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.event.impl.jobs.queues.QueueManager.handleEvent(QueueManager.java:460)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:431)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.felix.eventadmin.impl.tasks.HandlerTask.runWithoutDenylistTiming(HandlerTask.java:82)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.felix.eventadmin.impl.tasks.SyncDeliverTasks.execute(SyncDeliverTasks.java:107)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.felix.eventadmin.impl.handler.EventAdminImpl.sendEvent(EventAdminImpl.java:155)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.felix.eventadmin.impl.security.EventAdminSecurityDecorator.sendEvent(EventAdminSecurityDecorator.java:96)
>  [org.apache.felix.eventadmin:1.6.2]
>       at 
> org.apache.sling.event.impl.jobs.notifications.NewJobSender.onChange(NewJobSender.java:121)
>  [org.apache.sling.event:4.2.24]
>       at 
> org.apache.sling.resourceresolver.impl.observation.BasicObservationReporter.reportChanges(BasicObservationReporter.java:211)
>  [org.apache.sling.resourceresolver:1.8.0]
>       at 
> org.apache.sling.jcr.resource.internal.JcrResourceListener.onEvent(JcrResourceListener.java:155)
>  [org.apache.sling.jcr.resource:3.1.0]
> {noformat}
> In the case of {{loadJobs}} the ResourceResolver is opened to load all Jobs, 
> although I am not sure if really all jobs are required at that point. To 
> achieve this it traverses the repository to load all jobs matching a topic 
> [1]. I wonder if this is traversal is necessary, because during that time 
> this is done around 5 times per second. Is it possible to optimize that, so 
> we don't have that much repository access.
> Not sure if we can achieve something similar in the case of {{startJobs}}, 
> maybe by holding a RR as part of the {{JobQueueCache}} and pass it to 
> {{startProcessing}}. This would allow us to use a RR for other purposes as 
> well.
> [1] 
> https://github.com/apache/sling-org-apache-sling-event/blob/dbffb437564d553dcb65d60742cbfbeccf9c587f/src/main/java/org/apache/sling/event/impl/jobs/queues/QueueJobCache.java#L288
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to