[
https://issues.apache.org/jira/browse/MESOS-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969499#comment-16969499
]
Andrei Sekretenko commented on MESOS-6405:
------------------------------------------
Tried to run both benchmarks (existing SchedulerReconcileTasks_BENCHMARK_Test
and Anand's r53113) on the current master head.
Both show noticeable added overhead for the V1 API (~10x lower throughput
compared to the v0 scheduler driver).
It should be noted that both benchmarks run against an empty Mesos master, i.e.
what they show is basically the overhead of HTTP, (de)serialization,
authentication, etc.
It turns out that the issues exposed by these two benchmarks are totally
different, which is not surprising at all: the first sends a single call and
receives a multitude of events, whereas the second sends the same call
repeatedly but receives no API events.
The largest issue that shows up in SchedulerReconcileTasks_BENCHMARK_Test
is the inefficiency of the V1 C++ scheduler client library (spawning/terminating
an AsyncExecutor process for each event and so on).
See [^SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg]
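To illustrate why spawning and terminating a worker for each event can dominate the cost, here is a minimal standalone sketch. It is an analogy only: it uses plain std::thread instead of libprocess, and DispatchStats, dispatchPerEvent and dispatchReused are hypothetical names that do not exist in Mesos. It contrasts one worker per event with a single long-lived worker draining a queue.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical bookkeeping for the sketch; not a Mesos or libprocess type.
struct DispatchStats {
  std::size_t workersSpawned = 0;
  std::size_t eventsHandled = 0;
};

// Strategy A: spawn and join a fresh worker for every event
// (loosely analogous to spawning/terminating a process per event).
DispatchStats dispatchPerEvent(std::size_t events) {
  DispatchStats stats;
  for (std::size_t i = 0; i < events; ++i) {
    ++stats.workersSpawned;
    std::thread worker([&stats] { ++stats.eventsHandled; });
    worker.join();  // The worker is torn down before the next event.
  }
  return stats;
}

// Strategy B: a single long-lived worker drains a shared queue.
DispatchStats dispatchReused(std::size_t events) {
  DispatchStats stats;
  std::mutex mutex;
  std::condition_variable cv;
  std::queue<std::size_t> queue;
  bool done = false;

  ++stats.workersSpawned;
  std::thread worker([&] {
    std::unique_lock<std::mutex> lock(mutex);
    while (true) {
      cv.wait(lock, [&] { return done || !queue.empty(); });
      while (!queue.empty()) {
        queue.pop();
        ++stats.eventsHandled;  // Guarded by `mutex`.
      }
      if (done) {
        return;
      }
    }
  });

  for (std::size_t i = 0; i < events; ++i) {
    {
      std::lock_guard<std::mutex> lock(mutex);
      queue.push(i);
    }
    cv.notify_one();
  }

  {
    std::lock_guard<std::mutex> lock(mutex);
    done = true;
  }
  cv.notify_one();
  worker.join();

  assert(queue.empty());  // Every enqueued event was drained.
  return stats;
}
```

Called with the same event count, both functions handle the same number of events, but the first creates one worker per event while the second creates one in total. Timing the two in a loop would give a rough feel for the per-event setup/teardown cost, though the numbers would not translate directly to libprocess.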
The benchmark from r53113 (SchedulerCallIngestion_BENCHMARK_Test) shows a
surprisingly large overhead from process::Sequence (master only?) and
HttpProxy (both sides?), and also from per-request authentication (on both
sides).
Compare:
[^SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg] and
[^SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_flamegraph.svg]
> Benchmark call ingestion path on the Mesos master.
> --------------------------------------------------
>
> Key: MESOS-6405
> URL: https://issues.apache.org/jira/browse/MESOS-6405
> Project: Mesos
> Issue Type: Improvement
> Components: master, scheduler api
> Reporter: Anand Mazumdar
> Assignee: Anand Mazumdar
> Priority: Critical
> Labels: mesosphere
> Attachments:
> SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_flamegraph.svg,
> SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_stacks.gz,
> SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg,
> SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_stacks.gz,
> SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg,
> SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_stacks.gz
>
>
> [~drexin] reported on the user mailing
> [list|http://mail-archives.apache.org/mod_mbox/mesos-user/201610.mbox/%3C6B42E374-9AB7-4444-A315-A6558753E08B%40apple.com%3E]
> that there seems to be a significant regression in performance on the call
> ingestion path on the Mesos master with respect to the scheduler driver (v0 API).
> We should create a benchmark to first get a sense of the numbers and then go
> about fixing the performance issues.