Thanks for the detailed timeline of the logs. The scheduler library does not create a new connection per call. I have a suspicion that the slowness might be related to https://issues.apache.org/jira/browse/MESOS-6405 instead. To confirm this, can you change your scheduler to use the `V0Mesos` class ( https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/v1/scheduler/V0Mesos.java) and build an updated timeline of logs based on that?
-anand On Mon, Mar 20, 2017 at 6:42 AM, <assaf_waiz...@amat.com> wrote: > Hi, > > > > I added logs to my scheduler and executor and trace a specific task (task > 1) timeline (the behavior is similar to all tasks). > > It seems the most of the time (~30ms, lines 4-5 below) is ‘lost’ between > the scheduler submits the task (i.e. sending an ACCEPT call with the > submitted task) until the master gets the HTTP POST of this call (does the > scheduler library creates a new connection per call?). > > After the master gets the ACCEPT call it takes another ~10ms for materà > slaveàexecutoràslaveàmaster round trip (but maybe its reasonable for 4 > IPC network calls). > > > > Later on, I also see that scheduler ACKNOWLEDGE call for task 1 status > update (i.e. sending an ACKNOWLEDGE call with the uuid got in status > update) also seen in master only ~35ms (lines 18-19 below) after the call. > I’m starting to conclude the each call using the scheduler library (which > actually involves HTTP POST) takes ~40ms. > > > > To sum it up, it seems that the main factor for the high latency I get > here is due to HTTP POST mechanism in scheduler library. Is there a way to > improve it? Is it possible to keep a scheduleràmaster connection always > connected? > > > > > > Thanks. > > > > *From:* Benjamin Mahler [mailto:bmah...@apache.org] > *Sent:* Thursday, March 16, 2017 3:12 AM > *To:* user > *Cc:* Anand Mazumdar > > *Subject:* Re: High performance, low latency framework over mesos > > > > The breakdown of the 10ms between launching on the agent and getting > TASK_FINISHED from the agent could be seen by looking at the agent logs. Do > you have them? > > > > The 40ms it takes from forwarding the update to receiving the > acknowledgement seems excessive, since the scheduler has to do a lot less > work than the agent and that only takes 10ms. It would be great have the > scheduler do some logging so that we can see if there is network latency > here or there is inefficient processing in the v1 scheduler library. > > > > As a next step, I would suggest building the complete timeline of logs, > which includes scheduler (you likely need to do some logging here), master > and agent logs. > > > > On Tue, Mar 14, 2017 at 8:43 AM, <assaf_waiz...@amat.com> wrote: > > Thanks Benjamin, > > > > I looked into the logs and it seems the delay is between the master and > the scheduler: > > Master log: > > I0314 *18:23:59.409423* 39743 master.cpp:3776] *Processing ACCEPT call* > for offers: [ afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-O25 ] on agent > edbbafb6-4f7b-4da2-8782-8e01461906dc-S0 at slave(1)@10.201.98.16:5051 > (hadoop-master) for framework afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 > (Mesos BM Scheduler) > > W0314 18:23:59.410166 39743 validation.cpp:1064] Executor > 'MesosBMExecutorId' for task '8' uses less CPUs (None) than the minimum > required (0.01). Please update your executor, as this will be mandatory in > future releases. > > W0314 18:23:59.410221 39743 validation.cpp:1076] Executor > 'MesosBMExecutorId' for task '8' uses less memory (None) than the minimum > required (32MB). Please update your executor, as this will be mandatory in > future releases. > > I0314 18:23:59.410292 39743 master.cpp:9053] Adding task 8 with resources > cpus(*)(allocated: *):0.01 on agent edbbafb6-4f7b-4da2-8782-8e01461906dc-S0 > at slave(1)@10.201.98.16:5051 (hadoop-master) > > I0314 *18:23:59.410331* 39743 master.cpp:4426] *Launching task 8 of > framework* afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 (Mesos BM Scheduler) > with resources cpus(*)(allocated: *):0.01 on agent > edbbafb6-4f7b-4da2-8782-8e01461906dc-S0 > at slave(1)@10.201.98.16:5051 (hadoop-master) > > I0314 18:23:59.411258 39738 hierarchical.cpp:807] Updated allocation of > framework afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 on agent > edbbafb6-4f7b-4da2-8782-8e01461906dc-S0 from cpus(*)(allocated: *):0.01 > to cpus(*)(allocated: *):0.01 > > I0314 18:23:59.415060 39723 master.cpp:6992] Sending 1 offers to framework > afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 (Mesos BM Scheduler) > > I0314 *18:23:59.420624* 39757 master.cpp:6154] *Status update > TASK_FINISHED* (UUID: 583ea071-de66-4050-9513-8ff432da605f) for *task 8* > of framework afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 from agent > edbbafb6-4f7b-4da2-8782-8e01461906dc-S0 at slave(1)@10.201.98.16:5051 > (hadoop-master) > > I0314 *18:23:59.420671* 39757 master.cpp:6222] *Forwarding status update > TASK_FINISHED* (UUID: 583ea071-de66-4050-9513-8ff432da605f) for *task 8* > of framework afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 > > I0314 18:23:59.420819 39757 master.cpp:8302] Updating the state of task 8 > of framework afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 (latest state: > TASK_FINISHED, status update state: TASK_FINISHED) > > I0314 18:23:59.425354 39742 master.cpp:6992] Sending 1 offers to framework > afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 (Mesos BM Scheduler) > > I0314 18:23:59.459801 39741 http.cpp:420] HTTP POST for > /master/api/v1/scheduler from 127.0.0.1:36100 > > W0314 18:23:59.459889 39741 master.cpp:3634] Implicitly declining offers: > [ afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-O26 ] in ACCEPT call for framework > afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 as the launch operation > specified no tasks > > I0314 18:23:59.460055 39741 master.cpp:3776] Processing ACCEPT call for > offers: [ afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-O26 ] on agent > edbbafb6-4f7b-4da2-8782-8e01461906dc-S0 at slave(1)@10.201.98.16:5051 > (hadoop-master) for framework afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 > (Mesos BM Scheduler) > > I0314 18:23:59.460435 39741 http.cpp:420] HTTP POST for > /master/api/v1/scheduler from 127.0.0.1:36100 > > I0314 *18:23:59.460484* 39741 master.cpp:5092*] Processing ACKNOWLEDGE > call* 583ea071-de66-4050-9513-8ff432da605f for *task 8* of framework > afd6b67b-cac0-4b9f-baf6-2a456f4e84fa-0000 (Mesos BM Scheduler) on agent > edbbafb6-4f7b-4da2-8782-8e01461906dc-S0 > > > > From master log you can see that the time between the master received the > offer accept, until forwarding task_finished status to scheduler is ~11ms. > > For some reason the scheduler acknowledge came after ~40ms although I’m > sending it immediately upon status update. Moreover I can’t explain why the > time between receiving an offer until accepting it (even on empty accept) > take ~40ms. I think this might be related to the problem. > > > > My scheduler is written in Java, inheriting > org.apache.mesos.v1.scheduler.Scheduler > and passing it to org.apache.mesos.v1.scheduler.V1Mesos. > > This is my scheduler impl for processUpdate and processOffers (called upon > ‘received’ with Protos.Event.UPDATE/OFFERS respectively): > > private void processOffers(Mesos mesos, List<Offer> offers) { > > for (Offer offer : offers) { > > Offer.Operation.Launch.Builder launch = > Offer.Operation.Launch.newBuilder(); > > > > double offerCpus = 0; > > for (Resource resource : offer.getResourcesList()){ > > if (resource.getName().equals("cpus")){ > > offerCpus += resource.getScalar().getValue(); > > } > > } > > > > LOGGER.info("Received offer " + offer.getId().getValue() + " > from agent: " + offer.getHostname() + " [cpus: " + offerCpus + "]"); > > > > while (!pendingTasks.isEmpty()) > > { > > Task task = pendingTasks.peek(); > > if (task.requiredCpus <= offerCpus){ > > TaskID taskId = TaskID.newBuilder() > > .setValue(Integer.toString( > launchedTasks++)).build(); > > > > LOGGER.info("Launching task " + taskId.getValue() + > > " using offer " + offer.getId().getValue()); > > > > TaskInfo taskInfo = TaskInfo.newBuilder() > > .setName("task " + taskId.getValue()) > > .setTaskId(taskId) > > .setAgentId(offer.getAgentId()) > > .addResources(Resource.newBuilder() > > .setName("cpus") > > .setType(Value.Type.SCALAR) > > .setScalar(Value.Scalar.newBuilder() > > .setValue(task.requiredCpus) > > .build())) > > .setExecutor(executorInfo) > > .build(); > > > > launch.addTaskInfos(taskInfo); > > > > offerCpus -= task.requiredCpus; > > submittedTasks.put(taskId, pendingTasks.poll()); > > } > > else { > > break; > > } > > } > > > > mesos.send(Call.newBuilder() > > .setType(Call.Type.ACCEPT) > > .setFrameworkId(frameworkId) > > .setAccept(Call.Accept.newBuilder() > > .addOfferIds(offer.getId()) > > .addOperations(Offer.Operation.newBuilder() > > .setType(Offer.Operation.Type.LAUNCH) > > .setLaunch(launch) > > .build()) > > .setFilters(Filters.newBuilder() > > .setRefuseSeconds(0) > > .build())) > > .build()); > > } > > > > LOGGER.info("Completed handling offers"); > > } > > > > private void processUpdate(Mesos mesos, TaskStatus taskStatus) > > { > > LOGGER.info("Task " + taskStatus.getTaskId().getValue() + " > status " + taskStatus.getState().name()); > > > > if (taskStatus.hasUuid()) { > > mesos.send(Call.newBuilder() > > .setType(Call.Type.ACKNOWLEDGE) > > .setFrameworkId(frameworkId) > > .setAcknowledge(Call.Acknowledge.newBuilder() > > .setAgentId(taskStatus.getAgentId()) > > .setTaskId(taskStatus.getTaskId()) > > .setUuid(taskStatus.getUuid()) > > .build()) > > .build()); > > } > > } > > > > *From:* Benjamin Mahler [mailto:bmah...@apache.org] > *Sent:* Friday, March 10, 2017 12:52 AM > *To:* user > *Subject:* Re: High performance, low latency framework over mesos > > > > Have you taken a look at the logs across your scheduler / master / agent > to see where the latency is introduced? We can then discuss options to > reduce the latency. > > > > Ben > > > > On Tue, Mar 7, 2017 at 5:03 AM, <assaf_waiz...@amat.com> wrote: > > Hi, > > > > I’m implementing my own framework (scheduler + executor) over mesos. The > framework is oriented for low latency, short running tasks (i.e. task time > is hundreds of ms). > > I wrote a simple scheduler and executor and measured the time of task > submission RTT (round trip) – that is the elapsed time from submitting the > task at scheduler (by accepting and offer) and until the scheduler received > TASK_FINISHED status. > > The task on executor side does nothing but to immediately sending > TASK_FINISHED status. > > > > I see that the RTT is ~50ms which is quite a lot for my scenario, I was > wondering if this is controllable and why does it take so long to receive > status update. > > > > Thanks > > > > *Assaf.* > > > ------------------------------ > > > > > > > ------------------------------ > > > > > > ------------------------------ > >