[
https://issues.apache.org/jira/browse/SLING-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242784#comment-15242784
]
Ian Boston commented on SLING-5645:
-----------------------------------
The Jobs API can be found at
https://github.com/ieb/sling/tree/jobs_28/contrib/extensions/jobs/core/src/main/java/org/apache/sling/jobs
with the implementation in the same bundle. A Crankstart-based IT bundle
starts the Jobs Subsystem inside a cut-down OSGi container. Its dependencies
are best seen by inspecting the Crankstart provisioning model:
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it/src/test/resources/provisioning-model/jobs-runtime.txt
The implementation uses the MoM API and relies on two aspects of its contract:
each message in a jobs queue is delivered once and once only, and it is
delivered to a single message listener. Assuming the MoM implementation honors
this contract, the Jobs Subsystem can define a JobConsumer API whose contract
is that implementors guarantee that any job accepted by a job consumer will be
processed. Assuming that contract is also honored, the Jobs Subsystem can
distribute jobs for execution as allowed by the MoM API.
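The "once and once only, to a single listener" property can be illustrated
with a minimal in-memory sketch. It does not use the MoM API or ActiveMQ; the
class name CompetingConsumersSketch and the method deliverAll are hypothetical,
and a java.util.concurrent queue stands in for the jobs queue.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration only: an in-memory queue stands in for the MoM jobs queue.
public class CompetingConsumersSketch {

    /** Lets several listeners compete for the queued jobs and returns how often each job was delivered. */
    static Map<String, Integer> deliverAll(int jobCount, int listenerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < jobCount; i++) {
            queue.add("job-" + i);
        }

        Map<String, Integer> deliveries = new ConcurrentHashMap<>();
        Thread[] listeners = new Thread[listenerCount];
        for (int i = 0; i < listenerCount; i++) {
            listeners[i] = new Thread(() -> {
                String jobId;
                // poll() removes the head atomically, so no two listeners receive the same job.
                while ((jobId = queue.poll()) != null) {
                    deliveries.merge(jobId, 1, Integer::sum);
                }
            });
            listeners[i].start();
        }
        for (Thread listener : listeners) {
            try {
                listener.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for listeners", e);
            }
        }
        return deliveries;
    }

    public static void main(String[] args) {
        Map<String, Integer> deliveries = deliverAll(100, 4);
        System.out.println(deliveries.size() + " jobs delivered");
    }
}
```

Every job id ends up in the map with a count of 1, whichever listener took it.
A real MoM implementation has to provide the same guarantee across JVMs, which
is the part this sketch does not attempt.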
To let components that have an interest in the status of jobs follow them
while they are queued or executing, the MoM API publish/subscribe capability
is used to stream status events. There has been some review and discussion
off-list about this aspect with people who have experience running MoM
infrastructure at scale. The consensus is that, provided the Pub/Sub
implementation can be made durable when required, there is no need to
maintain a centralised persistence store dedicated to job status. This
potentially frees job processing from the scalability limitations that such a
store would introduce, but places a further burden on the MoM API
implementation.
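The difference from a queue is that every subscriber sees every status event.
A minimal sketch of that idea follows, assuming a non-durable in-process
topic. The names JobStatusTopicSketch, StatusTopic, StatusEvent and State are
hypothetical and are not the types used in the branch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical illustration only: an in-process, non-durable topic for job status events.
public class JobStatusTopicSketch {

    enum State { QUEUED, STARTED, SUCCEEDED, FAILED }

    static final class StatusEvent {
        final String jobId;
        final State state;

        StatusEvent(String jobId, State state) {
            this.jobId = jobId;
            this.state = state;
        }
    }

    static final class StatusTopic {
        private final List<Consumer<StatusEvent>> subscribers = new CopyOnWriteArrayList<>();

        void subscribe(Consumer<StatusEvent> subscriber) {
            subscribers.add(subscriber);
        }

        /** Unlike a queue, a topic hands each event to every current subscriber. */
        void publish(StatusEvent event) {
            for (Consumer<StatusEvent> subscriber : subscribers) {
                subscriber.accept(event);
            }
        }
    }

    public static void main(String[] args) {
        StatusTopic topic = new StatusTopic();
        List<State> seenByMonitor = new ArrayList<>();
        topic.subscribe(event -> seenByMonitor.add(event.state));
        topic.publish(new StatusEvent("job-1", State.QUEUED));
        topic.publish(new StatusEvent("job-1", State.STARTED));
        System.out.println(seenByMonitor);
    }
}
```

A subscriber that joins late misses earlier events in this sketch, which is
why durability of the Pub/Sub implementation matters when no central status
store exists.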
There is an example implementation of a synchronous JobConsumer in the
integration tests:
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it-services/src/main/java/org/apache/sling/jobs/it/services/FullySyncJob.java
This is a trivial implementation, as it uses the message thread to perform
the job processing. Typically that would not be viable because it would limit
throughput on the consuming JVM, but it demonstrates the process of consuming a
job. If the execute method on the JobConsumer throws an exception, the
interface contract states that the job message will not be dequeued from the
MoM implementation, which will retry it based on configuration. ActiveMQ, the
current OOTB implementation, supports retry configuration on queues.
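The retry behaviour can be sketched as follows. This is a simplified stand-in,
not the JobConsumer interface from the branch: SimpleJobConsumer,
deliverWithRetry and maxAttempts are hypothetical names, and a loop plays the
role of the redelivery that the MoM implementation would perform.

```java
// Hypothetical illustration only: a loop stands in for MoM redelivery.
public class RetryOnFailureSketch {

    /** Simplified stand-in for a synchronous consumer; not the real JobConsumer signature. */
    interface SimpleJobConsumer {
        void execute(String jobId) throws Exception;
    }

    /**
     * Delivers the job on the calling ("message") thread.
     * Returns the attempt number on which the job succeeded, or -1 if every attempt failed.
     */
    static int deliverWithRetry(String jobId, SimpleJobConsumer consumer, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                consumer.execute(jobId);
                return attempt; // normal return: the message would now be dequeued
            } catch (Exception e) {
                // exception: the message is not dequeued and becomes eligible for redelivery
            }
        }
        return -1; // retries exhausted; a real broker would apply its own policy here
    }

    public static void main(String[] args) {
        int[] calls = {0};
        int attempt = deliverWithRetry("job-1", jobId -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
        }, 5);
        System.out.println("succeeded on attempt " + attempt);
    }
}
```

Because the work runs on the delivering thread, a slow job blocks further
deliveries to this consumer, which is the throughput limitation noted above.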
The second example implementation,
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it-services/src/main/java/org/apache/sling/jobs/it/services/AsyncJobConsumer.java
is more realistic. It uses a ThreadPoolExecutor with a fixed queue size. As
jobs are dequeued they enter the local queue, which is drained by threads from
the pool running Callables. If the local queue becomes full, job messages are
rejected and returned to the MoM queue to be retried. When the component shuts
down, the pool is drained. This implementation does not fully guarantee that
every job starts execution under every scenario, and it would not survive a
total hardware failure. However, limiting the size of the local queue bounds
the number of jobs at risk, balancing potential loss against complexity.
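The bounded-queue pattern described here can be sketched with the standard
java.util.concurrent classes. This is not the code from AsyncJobConsumer.java;
the class BoundedAsyncConsumerSketch and its methods offer and
shutdownAndDrain are hypothetical, and a false return value stands in for
handing the message back to the MoM queue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: a fixed pool with a bounded local queue.
public class BoundedAsyncConsumerSketch {

    private final ThreadPoolExecutor executor;

    BoundedAsyncConsumerSketch(int threads, int queueSize) {
        // AbortPolicy makes a full local queue surface as RejectedExecutionException.
        executor = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize), new ThreadPoolExecutor.AbortPolicy());
    }

    /** Returns true if the job was accepted locally, false if it should go back to the MoM queue. */
    <T> boolean offer(Callable<T> job) {
        try {
            executor.submit(job);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    /** Stops accepting jobs and waits for the accepted ones to finish. */
    boolean shutdownAndDrain(long timeoutMillis) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        BoundedAsyncConsumerSketch consumer = new BoundedAsyncConsumerSketch(2, 4);
        int rejected = 0;
        for (int i = 0; i < 20; i++) {
            Callable<Boolean> job = () -> {
                Thread.sleep(50);
                return true;
            };
            if (!consumer.offer(job)) {
                rejected++; // a real consumer would let the MoM redeliver this message
            }
        }
        System.out.println("rejected " + rejected + ", drained " + consumer.shutdownAndDrain(5000));
    }
}
```

With this shape, the jobs that could be lost if the JVM dies are at most the
ones running plus the ones in the local queue, so the queue size is the knob
that trades risk against throughput.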
The structure of the PoC is probably complete, although further work on testing
is ongoing.
> Provide a Jobs API and implementation suitable for widely distributed job
> processing.
> -------------------------------------------------------------------------------------
>
> Key: SLING-5645
> URL: https://issues.apache.org/jira/browse/SLING-5645
> Project: Sling
> Issue Type: New Feature
> Reporter: Ian Boston
> Assignee: Ian Boston
>
> This issue is to track work on a proof of concept to create a Jobs API and
> implementation that will work in a distributed environment where the job
> submitters and job consumers are not necessarily in the same JVM or in the
> same Sling cluster.
> Work is being done in a branch at
> https://github.com/ieb/sling/tree/jobs_28/contrib/extensions/jobs
> The implementation needs supporting APIs/capabilities not already present in
> Sling, so there are some sub-tasks associated with this issue to address
> those.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)