Dario,

It's not immediately clear to me where the bottleneck might be. I filed MESOS-6405 <https://issues.apache.org/jira/browse/MESOS-6405> to write a benchmark that mimics your test setup, so we can then go about fixing the issues.
-anand

On Sun, Oct 16, 2016 at 6:20 PM, Dario Rexin <[email protected]> wrote:

> Hi Anand,
>
> I tested with and without pipelining and it doesn't make a difference.
> First of all, unlimited pipelining is not a good idea: we still have to
> handle the responses and need to be able to relate each request to its
> response upon return, i.e. store the context of the request until we
> receive the response. Also, we want to know as soon as possible when an
> error occurs, so early returns are very desirable. I agree that it
> shouldn't make a difference in how fast events can be processed whether
> they are queued on the master or on the client, but this observation made
> it very apparent that throughput is a problem on the master. I did not
> make any requests that would potentially block for a long time, so it's
> even weirder to me that the throughput is so low. One thing I don't
> understand, for example, is why all messages go through the master
> process. The parsing, for example, could be done in a completely separate
> process, and if every connected framework were backed by its own process,
> the check whether a framework is connected could also be done there (not
> to mention that this requirement exists only because we need to use
> multiple connections). Requiring all messages to go through a single
> process that can block indefinitely is obviously a huge bottleneck. I
> understand that this problem is not limited to the HTTP API, but I think
> it has to be fixed.
>
> —
> Dario
>
> On Oct 16, 2016, at 5:52 PM, Anand Mazumdar <[email protected]> wrote:
>
> Dario,
>
> Regarding:
>
> > This is especially concerning, as it means that accepting calls will
> > completely stall when a long running call (e.g. retrieving state.json)
> > is running.
>
> How does it help a client when it gets an early accepted response versus
> when accepting of calls is stalled, i.e. queued up on the master actor?
> The client does not need to wait for a response before pipelining its
> next request to the master anyway. In your tests, do you send the next
> REVIVE call only upon receiving the response to the current call? That
> might explain the behavior you are seeing.
>
> -anand
>
> On Sun, Oct 16, 2016 at 11:58 AM, tommy xiao <[email protected]> wrote:
>
>> Interesting topic.
>>
>> 2016-10-17 2:51 GMT+08:00 Dario Rexin <[email protected]>:
>>
>>> Hi Anand,
>>>
>>> I tested with the current HEAD. After I saw low throughput with our own
>>> HTTP API client, I wrote a small server that sends out fake events and
>>> accepts calls, and our client was able to send a lot more calls to that
>>> server. I also wrote a small tool that simply sends as many calls to
>>> Mesos as possible without handling any events, and I get similar
>>> results there. I also observe extremely high CPU usage: while my
>>> sending tool is using ~10% CPU, Mesos runs at ~185%. The calls I send
>>> for testing are all REVIVE, and I don't have any agents connected, so
>>> there should be essentially nothing happening. One reason I could think
>>> of for the reduced throughput is that all calls are processed in the
>>> master process before it sends back an ACCEPTED, leading to effectively
>>> single-threaded processing of HTTP calls, interleaved with all other
>>> calls that are sent to the master process. Libprocess, however, just
>>> forwards the messages to the master process and then immediately
>>> returns ACCEPTED. It also handles all connections in separate
>>> processes, whereas HTTP calls are effectively all handled by the master
>>> process. This is especially concerning, as it means that accepting
>>> calls will completely stall while a long-running call (e.g. retrieving
>>> state.json) is running.
>>>
>>> Thanks,
>>> Dario
>>>
>>> On Oct 16, 2016, at 11:01 AM, Anand Mazumdar <[email protected]> wrote:
>>>
>>> Dario,
>>>
>>> Thanks for reporting this. Did you test this with 1.0 or the recent
>>> HEAD?
>>> We had done performance testing prior to 1.0rc1 and had not found any
>>> substantial discrepancy on the call ingestion path. Hence, we had
>>> focused on fixing the performance issues around writing events on the
>>> stream in MESOS-5222 <https://issues.apache.org/jira/browse/MESOS-5222>
>>> and MESOS-5457 <https://issues.apache.org/jira/browse/MESOS-5457>.
>>>
>>> The numbers in the benchmark test pointed to by Haosdent (v0 vs. v1)
>>> differ due to the slowness of the client (scheduler library) in
>>> processing the status update events. We should add another benchmark
>>> that measures just the time taken by the master to write the events. I
>>> will file an issue shortly to address this.
>>>
>>> Do you mind filing an issue with more details on your test setup?
>>>
>>> -anand
>>>
>>> On Sun, Oct 16, 2016 at 12:05 AM, Dario Rexin <[email protected]> wrote:
>>>
>>>> Hi haosdent,
>>>>
>>>> thanks for the pointer! Your results show exactly what I'm
>>>> experiencing. I think this could be very problematic, especially for
>>>> bigger clusters. It would be great to get some input from the folks
>>>> working on the HTTP API, especially Anand.
>>>>
>>>> Thanks,
>>>> Dario
>>>>
>>>> On Oct 16, 2016, at 12:01 AM, haosdent <[email protected]> wrote:
>>>>
>>>> Hmm, this is an interesting topic. @anandmazumdar created a benchmark
>>>> test case to compare the v1 and v0 APIs a while back. You can run it
>>>> via
>>>>
>>>> ```
>>>> ./bin/mesos-tests.sh --benchmark \
>>>>   --gtest_filter="*SchedulerReconcileTasks_BENCHMARK_Test*"
>>>> ```
>>>>
>>>> Here is the result from running it on my machine.
>>>>
>>>> ```
>>>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/0
>>>> Reconciling 1000 tasks took 386.451108ms using the scheduler library
>>>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/0 (479 ms)
>>>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/1
>>>> Reconciling 10000 tasks took 3.389258444secs using the scheduler library
>>>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/1 (3435 ms)
>>>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/2
>>>> Reconciling 50000 tasks took 16.624603964secs using the scheduler library
>>>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/2 (16737 ms)
>>>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/3
>>>> Reconciling 100000 tasks took 33.134018718secs using the scheduler library
>>>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/3 (33333 ms)
>>>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/0
>>>> Reconciling 1000 tasks took 24.212092ms using the scheduler driver
>>>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/0 (89 ms)
>>>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/1
>>>> Reconciling 10000 tasks took 316.115078ms using the scheduler driver
>>>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/1 (385 ms)
>>>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/2
>>>> Reconciling 50000 tasks took 1.239050154secs using the scheduler driver
>>>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/2 (1379 ms)
>>>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/3
>>>> Reconciling 100000 tasks took 2.38445672secs using the scheduler driver
>>>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/3 (2711 ms)
>>>> ```
>>>>
>>>> *SchedulerLibrary* is the HTTP API; *SchedulerDriver* is the old way
>>>> based on libmesos.so.
>>>>
>>>> On Sun, Oct 16, 2016 at 2:41 PM, Dario Rexin <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I recently did some performance testing on the v1 scheduler API and
>>>>> found that throughput is around 10x lower than for the v0 API. Using
>>>>> 1 connection, I don't get much more than 1,500 calls per second,
>>>>> whereas the v0 API can do ~15,000. If I use multiple connections,
>>>>> throughput maxes out at 3 connections and ~2,500 calls/s. If I add
>>>>> any more connections, the throughput per connection drops and the
>>>>> total throughput stays around ~2,500 calls/s. Has anyone done
>>>>> performance testing on the v1 API before? It seems a little strange
>>>>> to me that it's so much slower, given that the v0 API also uses HTTP
>>>>> (well, more or less). I would be thankful for any comments and
>>>>> experience reports from other users.
>>>>>
>>>>> Thanks,
>>>>> Dario
>>>>
>>>> --
>>>> Best Regards,
>>>> Haosdent Huang
>>
>> --
>> Deshi Xiao
>> Twitter: xds2000
>> E-mail: xiaods(AT)gmail.com
>
> --
> Anand Mazumdar
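The ad-hoc harness Dario describes upthread (a small fake server that just accepts calls, plus a client hammering it with REVIVE calls over one persistent connection) can be sketched roughly as follows. This is an illustration, not Dario's actual code: the fake server here is a stand-in for Mesos, and the framework id is hypothetical; only the endpoint path, the REVIVE call shape, and the 202 Accepted reply follow the v1 scheduler API.

```python
import json
import threading
import time
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeMaster(BaseHTTPRequestHandler):
    """Stand-in for the master: drains each call and replies 202 Accepted."""

    protocol_version = "HTTP/1.1"  # keep-alive, so one connection is reused

    def do_POST(self):
        self.rfile.read(int(self.headers["Content-Length"]))  # drain the call
        self.send_response(202)  # Mesos acknowledges calls with 202 Accepted
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the benchmark output quiet


def measure_calls_per_second(port, num_calls=1000):
    """Send REVIVE calls back-to-back on one connection; return calls/s."""
    conn = HTTPConnection("127.0.0.1", port)
    body = json.dumps({
        "framework_id": {"value": "test-framework"},  # hypothetical id
        "type": "REVIVE",
    })
    headers = {"Content-Type": "application/json"}
    start = time.time()
    for _ in range(num_calls):
        conn.request("POST", "/api/v1/scheduler", body, headers)
        conn.getresponse().read()  # wait for each response (no pipelining)
    elapsed = time.time() - start
    conn.close()
    return num_calls / elapsed


if __name__ == "__main__":
    server = HTTPServer(("127.0.0.1", 0), FakeMaster)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    rate = measure_calls_per_second(server.server_address[1])
    print(f"{rate:.0f} calls/s against the fake master")
    server.shutdown()
```

Pointing `measure_calls_per_second` at a real master instead of the fake one reproduces the comparison Dario made: the gap between the two rates is the master-side cost he is reporting.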

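Dario's point about pipelining upthread, that every in-flight call's context must be held until its response returns, comes down to FIFO matching, since HTTP/1.1 delivers responses in request order. A minimal sketch of that bookkeeping, with hypothetical names:

```python
from collections import deque


class PipelinedCallTracker:
    """FIFO bookkeeping for calls in flight on one pipelined connection."""

    def __init__(self):
        self._pending = deque()

    def call_sent(self, context):
        # Hold the call's context until its response comes back.
        self._pending.append(context)

    def response_received(self, response):
        # HTTP/1.1 returns responses in request order, so a response
        # always belongs to the oldest pending call.
        context = self._pending.popleft()
        return context, response

    def in_flight(self):
        return len(self._pending)


# Usage: two pipelined calls, then the first response arrives.
tracker = PipelinedCallTracker()
tracker.call_sent("REVIVE #1")
tracker.call_sent("REVIVE #2")
context, response = tracker.response_received("202 Accepted")
print(context, "->", response)  # matches the oldest pending call
```

The queue itself is cheap; the cost Dario highlights is that with unlimited pipelining this queue (and the memory behind each context) grows without bound, and an error is only observed after everything queued ahead of it has been answered.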
