[ 
https://issues.apache.org/jira/browse/AURORA-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685633#comment-15685633
 ] 

Mehrdad Nurolahzade commented on AURORA-137:
--------------------------------------------

In our clusters, I have noticed that the top three methods in terms of total 
wait time to acquire the storage lock are currently the following:
- {{org.apache.aurora.scheduler.BatchWorker.processBatch()}}
- {{org.apache.aurora.scheduler.reconciliation.TaskTimeout.TimedOutTaskHandler.run()}}
- {{org.apache.aurora.scheduler.TaskStatusHandlerImpl.run()}}

This is primarily due to the very high frequency with which these methods are 
invoked.

Wait times experienced by 
{{org.apache.aurora.scheduler.mesos.MesosSchedulerImpl.resourceOffers()}} are an 
order of magnitude smaller than those of the above three methods (perhaps 
because {{storage.write()}} is invoked an order of magnitude less frequently 
here). However, it might still be worthwhile to reduce this frequency even 
further if all it takes is a simple change.
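
As a rough illustration of what such a change could look like, here is a 
minimal sketch that keeps offer attributes in memory and only performs a 
storage write when a task is actually scheduled on the host. Every name in it 
({{DeferredAttributeSaver}}, {{onOffer}}, {{attributesFor}}, 
{{onTaskScheduled}}) is hypothetical and not taken from the Aurora code base, 
and the in-memory {{saved}} map merely stands in for a real 
{{storage.write()}} call.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: none of these names come from the Aurora code base.
final class DeferredAttributeSaver {
  // Attributes seen in offers, held in memory without touching storage.
  private final Map<String, Set<String>> pending = new HashMap<>();
  // Stand-in for durable storage; each put here models one storage.write().
  private final Map<String, Set<String>> saved = new HashMap<>();
  private int writeCount = 0;

  // Called for every resource offer: only records the attributes in memory.
  synchronized void onOffer(String host, Set<String> attributes) {
    pending.put(host, new HashSet<>(attributes));
  }

  // Called by the scheduling filter. Attributes from the latest offer take
  // precedence, so the very first offer for a host can still be evaluated.
  synchronized Optional<Set<String>> attributesFor(String host) {
    Set<String> fromOffer = pending.get(host);
    return fromOffer != null
        ? Optional.of(fromOffer)
        : Optional.ofNullable(saved.get(host));
  }

  // Called only when a task is actually assigned to the host. Writes once,
  // and skips the write entirely if the saved attributes are unchanged.
  synchronized void onTaskScheduled(String host) {
    Set<String> attributes = pending.remove(host);
    if (attributes != null && !attributes.equals(saved.get(host))) {
      saved.put(host, attributes);
      writeCount++;
    }
  }

  // Number of simulated storage writes performed so far.
  synchronized int writeCount() {
    return writeCount;
  }
}
```

In this sketch, any number of offers for the same host results in at most one 
write per scheduled task, and none at all when the attributes have not 
changed since the last write.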

> Save host attributes only when a task is being scheduled
> --------------------------------------------------------
>
>                 Key: AURORA-137
>                 URL: https://issues.apache.org/jira/browse/AURORA-137
>             Project: Aurora
>          Issue Type: Story
>          Components: Scheduler
>            Reporter: Bill Farner
>            Priority: Minor
>
> The scheduler currently aggressively saves host attributes when handling 
> {{resourceOffers}}; however, it seems tractable for this to happen only when a 
> task is actually scheduled.  Context: the scheduler stores host attributes to 
> satisfy scheduling constraints (like host/rack diversity).  Doing this would 
> allow us to avoid waiting for the storage write lock, and handle 
> {{resourceOffers}} in a more deterministic time frame.
> One caveat with this approach is that the Offer would need to be plumbed into 
> {{SchedulingFilterImpl}} in a way so as to ensure that the attributes are 
> available for the offer being inspected.  In other words, we need to avoid 
> the chicken-and-egg problem of trying to read the attributes for a host when 
> this is the first offer ever received for it.



