[ https://issues.apache.org/jira/browse/AURORA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maxim Khutornenko reopened AURORA-1600:
---------------------------------------

This change was unnecessarily reverted in 
https://reviews.apache.org/r/42922/ as part of the AURORA-1603 investigation. 
Reopening this ticket to un-revert it.

> Job updates with large count of instance overrides halt scheduler perf
> ----------------------------------------------------------------------
>
>                 Key: AURORA-1600
>                 URL: https://issues.apache.org/jira/browse/AURORA-1600
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Maxim Khutornenko
>            Assignee: Maxim Khutornenko
>            Priority: Critical
>             Fix For: 0.12.0
>
>
> We have observed a case where a user update with a large number of specified 
> instance overrides (updateOnlyTheseInstances) degrades performance so badly 
> that the scheduler processes almost no offers and schedules no pending tasks 
> for long periods (minutes to hours). 
> The culprit appears to be the {{selectInstructions}} query. It is unacceptably 
> slow when the number of instanceConfigs and/or instance overrides approaches 100. 
> Since it is called inside a write lock to guide individual instance updates, 
> nothing else can proceed, including status updates and offer activities. 
> I was able to replicate this in JMH. A fix is incoming.
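
The stall described above can be illustrated with a minimal sketch. It is not Aurora's storage code: the class WriteLockStall, its methods, and the sleep that stands in for the slow query are all hypothetical, and it assumes a plain ReentrantReadWriteLock. It only shows that while one thread holds the write lock for a long-running call, any other operation that needs the lock cannot proceed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative only: hypothetical names, not Aurora's storage code. */
final class WriteLockStall {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Holds the write lock for the duration of a simulated slow query. */
    void slowQueryUnderWriteLock(long queryMillis, CountDownLatch lockHeld) {
        lock.writeLock().lock();
        try {
            lockHeld.countDown();        // signal that the write lock is now held
            Thread.sleep(queryMillis);   // stand-in for the slow query
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns true if another operation can take the read lock within the timeout. */
    boolean tryRead(long timeoutMillis) {
        try {
            if (lock.readLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                lock.readLock().unlock();
                return true;
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns whether a reader could proceed while the slow write was in flight. */
    static boolean readerProceedsDuringSlowWrite(long queryMillis, long readTimeoutMillis) {
        WriteLockStall stall = new WriteLockStall();
        CountDownLatch lockHeld = new CountDownLatch(1);
        Thread writer = new Thread(() -> stall.slowQueryUnderWriteLock(queryMillis, lockHeld));
        writer.start();
        try {
            lockHeld.await();            // wait until the writer actually holds the lock
            boolean proceeded = stall.tryRead(readTimeoutMillis);
            writer.join();
            return proceeded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reader proceeded during slow write: "
                + readerProceedsDuringSlowWrite(500, 50));
    }
}
```

In this sketch the reader times out because the write lock is held for the whole simulated query. Shortening the work done under the lock, or moving it outside the lock, is what lets other operations through again.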



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
