[jira] [Commented] (SLING-5560) Delay job processing at startup to avoid unnecessary stale job handling

Stefan Egli (JIRA) Mon, 29 Feb 2016 00:46:38 -0800

    [ 
https://issues.apache.org/jira/browse/SLING-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171586#comment-15171586
 ]


Stefan Egli commented on SLING-5560:
------------------------------------

[~chetanm], [~cziegeler], we could have the job manager adapt the delay by 
taking into account what it knows about the cluster state before shutdown and 
at restart:
* assume we split the job manager's operation into two phases: 
** an unstable one, covering topology_changed and the reassignment of jobs 
that follows a topology_changed
** a stable one, after reassignment, during normal operation
* upon entering a stable phase (or as the last step after reassignment), the 
job manager could persist the local cluster view (i.e. all slingIds of the 
local cluster)
* upon restart (i.e. topology_init), the job manager could compare the new 
view with the persisted 'last stable' one
** if they match (the normal case, e.g. for TarMK), then the default delay 
could be very low, possibly even 0 sec
** if they don't match, then the default delay could be 1, 2 or perhaps 5 min
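
To make the comparison described above concrete, here is a minimal sketch in 
plain Java. It is not existing job manager code: the class name 
StartupDelayPolicy, the method chooseDelay and both delay parameters are 
hypothetical, and treating a missing persisted view as a mismatch is an 
assumption on my side. How the view is persisted and how the slingIds are read 
from the topology is deliberately left out.

```java
import java.time.Duration;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Sling job manager.
final class StartupDelayPolicy {

    private StartupDelayPolicy() {
    }

    /**
     * Picks the startup delay by comparing two local cluster views.
     *
     * @param lastStableView  slingIds persisted when the last stable phase was entered;
     *                        null or empty if nothing was persisted yet
     * @param currentView     slingIds of the local cluster as seen at topology_init
     * @param delayOnMatch    delay to use when both views contain the same slingIds
     * @param delayOnMismatch delay to use when the views differ
     */
    static Duration chooseDelay(Set<String> lastStableView,
                                Set<String> currentView,
                                Duration delayOnMatch,
                                Duration delayOnMismatch) {
        // Assumption: without a persisted view we cannot tell whether the
        // cluster is complete, so we fall back to the longer delay.
        if (lastStableView == null || lastStableView.isEmpty()) {
            return delayOnMismatch;
        }
        // Set.equals ignores ordering, so only cluster membership matters.
        return lastStableView.equals(currentView) ? delayOnMatch : delayOnMismatch;
    }
}
```

With a persisted view of {A, B} and a view of {A, B} at restart, 
chooseDelay(..., Duration.ZERO, Duration.ofMinutes(2)) would return the zero 
delay; if only {A} is visible at restart, it would return the 2 min delay.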

> Delay job processing at startup to avoid unnecessary stale job handling
> -----------------------------------------------------------------------
>
>                 Key: SLING-5560
>                 URL: https://issues.apache.org/jira/browse/SLING-5560
>             Project: Sling
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Chetan Mehrotra
>             Fix For: Event 4.1.0
>
>
> When running in a cluster (or in some cases in a non-clustered setup as 
> well), the topology only becomes stable after "some" time. For example, in a 
> 2-node setup, the second node might not have started by the time the first 
> node comes up. The topology would then not detect it, so the first node might 
> conclude that the second node is gone and start reassigning that node's jobs 
> to itself as part of stale job handling.
> Instead of doing this right at startup, job processing should start after 
> "some" delay so that the topology can become stable. This would avoid the 
> unnecessary work and probably even reduce load on the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
