[ 
https://issues.apache.org/jira/browse/AURORA-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881652#comment-13881652
 ] 

Bill Farner commented on AURORA-116:
------------------------------------

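On a particularly heavily loaded cluster, we observed that handling 
resourceOffers took an inordinate amount of time: typically several hundred 
milliseconds per call, with spikes above one second (3.86 s in the worst case 
shown):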
{noformat}
I0125 04:06:06.030261 25455 sched.cpp:528] Scheduler::resourceOffers took 457.474384ms
I0125 04:06:07.431802 25456 sched.cpp:528] Scheduler::resourceOffers took 499.004174ms
I0125 04:06:08.715579 25452 sched.cpp:528] Scheduler::resourceOffers took 426.892948ms
I0125 04:06:10.422312 25458 sched.cpp:528] Scheduler::resourceOffers took 788.509018ms
I0125 04:06:11.531437 25450 sched.cpp:528] Scheduler::resourceOffers took 556.181547ms
I0125 04:06:12.849201 25452 sched.cpp:528] Scheduler::resourceOffers took 557.399593ms
I0125 04:06:14.131196 25446 sched.cpp:528] Scheduler::resourceOffers took 506.654534ms
I0125 04:06:15.558323 25457 sched.cpp:528] Scheduler::resourceOffers took 603.352069ms
I0125 04:06:16.797667 25454 sched.cpp:528] Scheduler::resourceOffers took 507.040296ms
I0125 04:06:18.342701 25449 sched.cpp:528] Scheduler::resourceOffers took 718.241925ms
I0125 04:06:22.795732 25445 sched.cpp:528] Scheduler::resourceOffers took 3.859263212secs
I0125 04:06:23.649204 25445 sched.cpp:528] Scheduler::resourceOffers took 838.23624ms
I0125 04:06:24.176681 25445 sched.cpp:528] Scheduler::resourceOffers took 522.324683ms
I0125 04:06:24.709750 25455 sched.cpp:528] Scheduler::resourceOffers took 328.162458ms
I0125 04:06:25.272554 25455 sched.cpp:528] Scheduler::resourceOffers took 559.136627ms
I0125 04:06:26.167621 25455 sched.cpp:528] Scheduler::resourceOffers took 875.709069ms
I0125 04:06:26.493263 25443 sched.cpp:528] Scheduler::resourceOffers took 134.088104ms
I0125 04:06:28.426606 25445 sched.cpp:528] Scheduler::resourceOffers took 420.597132ms
I0125 04:06:31.088995 25446 sched.cpp:528] Scheduler::resourceOffers took 1.262336563secs
I0125 04:06:31.934573 25456 sched.cpp:528] Scheduler::resourceOffers took 832.207275ms
I0125 04:06:33.181200 25454 sched.cpp:528] Scheduler::resourceOffers took 765.264834ms
I0125 04:06:35.013409 25457 sched.cpp:528] Scheduler::resourceOffers took 1.250660605secs
I0125 04:06:35.311099 25448 sched.cpp:528] Scheduler::resourceOffers took 230.108961ms
I0125 04:06:37.104035 25451 sched.cpp:528] Scheduler::resourceOffers took 624.894808ms
I0125 04:06:38.150378 25450 sched.cpp:528] Scheduler::resourceOffers took 427.445204ms
I0125 04:06:39.383716 25457 sched.cpp:528] Scheduler::resourceOffers took 392.365989ms
{noformat}

> Improve efficiency of saving host attributes (or avoid saving host attributes)
> ------------------------------------------------------------------------------
>
>                 Key: AURORA-116
>                 URL: https://issues.apache.org/jira/browse/AURORA-116
>             Project: Aurora
>          Issue Type: Task
>          Components: Scheduler
>            Reporter: Bill Farner
>            Priority: Critical
>
> The scheduler performs multiple write operations for every resource offer, to 
> save slave attributes:
> {noformat}
>   public void resourceOffers(SchedulerDriver driver, List<Offer> offers) {
>     Preconditions.checkState(registered, "Must be registered before receiving offers.");
>     for (final Offer offer : offers) {
>       log(Level.FINE, "Received offer: %s", offer);
>       resourceOffers.incrementAndGet();
>       storage.write(new MutateWork.NoResult.Quiet() {
>         @Override protected void execute(MutableStoreProvider storeProvider) {
>           storeProvider.getAttributeStore().saveHostAttributes(Conversions.getAttributes(offer));
>         }
>       });
> {noformat}
> This can unnecessarily block the single-threaded message dispatch in the 
> scheduler driver.  An incremental improvement would be to aggregate all slave 
> info and save it in one write operation.  Better yet would be to perform 
> writes asynchronously (taking care not to break task scheduling, since 
> attributes are expected to be present).  Best of all would be to determine 
> whether we can avoid storing host attributes altogether.
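
The incremental improvement described above (aggregating all slave info and 
saving it in one write operation) might look roughly like the sketch below. 
It is not Aurora code: HostAttributes, AttributeStore and CountingStorage are 
hypothetical stand-ins for the real offer-derived attributes, the attribute 
store and the storage layer, and they exist only to show one write per offer 
versus one write per batch of offers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class BatchedAttributeSave {
  // Hypothetical stand-in for the attributes derived from one offer.
  static final class HostAttributes {
    final String host;
    HostAttributes(String host) { this.host = host; }
  }

  // Hypothetical stand-in for the attribute store.
  static final class AttributeStore {
    final List<HostAttributes> saved = new ArrayList<>();
    void saveHostAttributes(HostAttributes attrs) { saved.add(attrs); }
  }

  // Hypothetical storage that counts how many write operations were issued.
  static final class CountingStorage {
    final AttributeStore store = new AttributeStore();
    int writeCount = 0;
    void write(Consumer<AttributeStore> work) {
      writeCount++;
      work.accept(store);
    }
  }

  // Mirrors the quoted code: one write operation per offer.
  static void savePerOffer(CountingStorage storage, List<HostAttributes> offers) {
    for (HostAttributes attrs : offers) {
      storage.write(store -> store.saveHostAttributes(attrs));
    }
  }

  // Aggregated variant: a single write operation for the whole batch.
  static void saveBatched(CountingStorage storage, List<HostAttributes> offers) {
    if (offers.isEmpty()) {
      return; // nothing to save, so do not issue a write at all
    }
    storage.write(store -> {
      for (HostAttributes attrs : offers) {
        store.saveHostAttributes(attrs);
      }
    });
  }

  public static void main(String[] args) {
    List<HostAttributes> offers = Arrays.asList(
        new HostAttributes("host-a"),
        new HostAttributes("host-b"),
        new HostAttributes("host-c"));

    CountingStorage perOffer = new CountingStorage();
    savePerOffer(perOffer, offers);

    CountingStorage batched = new CountingStorage();
    saveBatched(batched, offers);

    System.out.println("per-offer writes: " + perOffer.writeCount);
    System.out.println("batched writes: " + batched.writeCount);
  }
}
```

Both variants store the same attributes; only the number of write operations 
issued from the dispatch thread differs. Whether this helps in practice 
depends on how expensive each write operation is in the real storage layer.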



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
