[
https://issues.apache.org/jira/browse/SLING-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171586#comment-15171586
]
Stefan Egli commented on SLING-5560:
------------------------------------
[~chetanm], [~cziegeler], we could have the job manager adapt the delay based
on what it knows about the cluster state before shutdown and at restart:
* assume we split the job manager's operation into two phases:
** an unstable one, which covers topology_changed and the reassignment of
jobs that follows a topology_changed
** a stable one, which starts after reassignment and covers normal operation
* upon entering a stable phase (or as the last step after reassignment), the
job manager could persist the local cluster view (i.e. all slingIds of the
local cluster)
* upon restart (i.e. topology_init), the job manager could compare the new
view with that persisted 'last stable one'
** if they match (the normal case e.g. for TarMK), then the default delay
could be very low, possibly even 0 seconds
** if they don't match, then the default delay could be 1, 2 or perhaps 5
minutes
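
To make the comparison step concrete, here is a minimal sketch of how the
delay could be chosen. It deliberately uses no Sling API: the class name, the
method name and the two delay constants are hypothetical, the 2 minute value
is just one of the candidates mentioned above, and how the view is persisted
and read back is left out entirely.

```java
import java.time.Duration;
import java.util.Set;

/** Hypothetical sketch: choose a startup delay from two cluster views. */
public final class StartupDelaySketch {

    // Assumed defaults; the comment above only suggests "very low / 0" and "1, 2 or 5 min".
    public static final Duration MATCHING_VIEW_DELAY = Duration.ZERO;
    public static final Duration CHANGED_VIEW_DELAY = Duration.ofMinutes(2);

    /**
     * @param lastStableView slingIds persisted when the job manager last
     *                       entered a stable phase (null or empty if none)
     * @param currentView    slingIds seen at topology_init
     */
    public static Duration chooseStartupDelay(Set<String> lastStableView,
                                              Set<String> currentView) {
        // No persisted view (e.g. first start): treat it like a mismatch.
        if (lastStableView == null || lastStableView.isEmpty()) {
            return CHANGED_VIEW_DELAY;
        }
        // Set equality ignores order, so only membership of slingIds matters.
        return lastStableView.equals(currentView)
                ? MATCHING_VIEW_DELAY
                : CHANGED_VIEW_DELAY;
    }

    public static void main(String[] args) {
        Set<String> persisted = Set.of("sling-a", "sling-b");
        // Same members as before shutdown: no delay needed.
        System.out.println(chooseStartupDelay(persisted, Set.of("sling-b", "sling-a")));
        // One instance not visible yet: wait before reassigning its jobs.
        System.out.println(chooseStartupDelay(persisted, Set.of("sling-a")));
    }
}
```

Treating a missing persisted view as a mismatch is a conservative choice made
for this sketch; a real implementation might prefer a separate default there.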
> Delay job processing at startup to avoid unnecessary stale job handling
> -----------------------------------------------------------------------
>
> Key: SLING-5560
> URL: https://issues.apache.org/jira/browse/SLING-5560
> Project: Sling
> Issue Type: Improvement
> Components: Extensions
> Reporter: Chetan Mehrotra
> Fix For: Event 4.1.0
>
>
> When running in a cluster (and in some cases in a non-cluster setup too), the
> topology only becomes stable after "some" time. For example, in a 2 node
> setup the second node might not have started by the time the first node
> comes up, so topology would not detect it. The first node might then conclude
> that the second node is gone and start reassigning that node's jobs to itself
> as part of stale job handling.
> Instead of doing this right at startup, job processing should start after
> "some" delay so that the topology can become stable first. This would avoid
> this unnecessary work and probably even reduce load on the master.