+1. We are also quite interested in this topic.
--
Cheers,
Zhitao Li

On Mon, Oct 17, 2016 at 12:34 PM, Dario Rexin <[email protected]> wrote:

Hi Anand,

thanks for creating the ticket. I will also investigate a bit more. I will probably be in SF on Thursday, so we could discuss in person.

--
Dario

On Oct 17, 2016, at 12:19 PM, Anand Mazumdar <[email protected]> wrote:

Dario,

It's not immediately clear to me where the bottleneck might be. I filed MESOS-6405 <https://issues.apache.org/jira/browse/MESOS-6405> to write a benchmark that mimics your test setup, so we can then go about fixing the issues.

-anand

On Sun, Oct 16, 2016 at 6:20 PM, Dario Rexin <[email protected]> wrote:

Hi Anand,

I tested with and without pipelining and it doesn't make a difference. First of all, unlimited pipelining is not a good idea: we still have to handle the responses and need to be able to relate each request to its response when it returns, i.e., store the context of the request until we receive the response. Also, we want to know as soon as possible when an error occurs, so early returns are very desirable. I agree that it shouldn't make a difference to how fast events can be processed whether they are queued on the master or on the client, but this observation made it very apparent that throughput is a problem on the master. I did not make any requests that would potentially block for a long time, so it's even weirder to me that the throughput is so low. One thing I don't understand, for example, is why all messages go through the master process. The parsing, for example, could be done in a completely separate process, and if every connected framework were backed by its own process, the check whether a framework is connected could also be done there (not to mention that this requirement only exists because we need to use multiple connections). Requiring all messages to go through a single process that can block indefinitely is obviously a huge bottleneck. I understand that this problem is not limited to the HTTP API, but I think it has to be fixed.

--
Dario

On Oct 16, 2016, at 5:52 PM, Anand Mazumdar <[email protected]> wrote:

Dario,

Regarding:

> This is especially concerning, as it means that accepting calls will completely stall when a long-running call (e.g. retrieving state.json) is running.

How does it help a client to get an early ACCEPTED response, versus having its calls queued up on the master actor? The client does not need to wait for a response before pipelining its next request to the master anyway. In your tests, do you send the next REVIVE call only upon receiving the response to the current call? That might explain the behavior you are seeing.

-anand

--
Anand Mazumdar

On Sun, Oct 16, 2016 at 11:58 AM, tommy xiao <[email protected]> wrote:

Interesting topic.

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com

2016-10-17 2:51 GMT+08:00 Dario Rexin <[email protected]>:

Hi Anand,

I tested with the current HEAD. After I saw low throughput with our own HTTP API client, I wrote a small server that sends out fake events and accepts calls, and our client was able to send a lot more calls to that server. I also wrote a small tool that simply sends as many calls to Mesos as possible without handling any events, and I got similar results there. I also observe extremely high CPU usage: while my sending tool uses ~10% CPU, Mesos runs at ~185%. The calls I send for testing are all REVIVE, and I don't have any agents connected, so there should be essentially nothing happening.
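A minimal sketch of such a flooding tool (not the exact one I used; it assumes a framework has already subscribed via POST /api/v1/scheduler on a separate connection, and the master address, framework id, and stream id below are placeholders):

```
# Flood the v1 scheduler API with REVIVE calls and report throughput.
# Sketch only: FRAMEWORK_ID and STREAM_ID are placeholder values taken
# from an existing subscription (SUBSCRIBED event / Mesos-Stream-Id header).
import json
import time

import requests

MASTER = "http://127.0.0.1:5050/api/v1/scheduler"  # placeholder address
FRAMEWORK_ID = "fw-placeholder-0000"
STREAM_ID = "stream-placeholder-0000"

call = json.dumps({
    "type": "REVIVE",
    "framework_id": {"value": FRAMEWORK_ID},
})
headers = {
    "Content-Type": "application/json",
    "Mesos-Stream-Id": STREAM_ID,  # required on every call after SUBSCRIBE
}

session = requests.Session()  # one persistent connection, no pipelining
sent = 0
deadline = time.time() + 10.0
while time.time() < deadline:
    response = session.post(MASTER, data=call, headers=headers)
    assert response.status_code == 202, response.text  # master ACKs with 202
    sent += 1

print("%.0f calls/s" % (sent / 10.0))
```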
One reason I could think of for the reduced throughput is that all calls are processed in the master process before it sends back an ACCEPTED, leading to effectively single-threaded processing of HTTP calls, interleaved with all the other calls that are sent to the master process. Libprocess, however, just forwards the messages to the master process and then immediately returns ACCEPTED. It also handles all connections in separate processes, whereas HTTP calls are effectively all handled by the master process. This is especially concerning, as it means that accepting calls will completely stall when a long-running call (e.g. retrieving state.json) is running.
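To make the difference concrete, here is a toy model of the two accept strategies (plain Python asyncio, not Mesos code): a queue stands in for the master actor's mailbox and a sleep for the actual call processing.

```
import asyncio

async def master_actor(mailbox: asyncio.Queue) -> None:
    # Single consumer: every call is serialized through here, so one
    # slow call delays everything queued behind it.
    while True:
        call, done = await mailbox.get()
        await asyncio.sleep(0.001)  # stand-in for parsing/validation/handling
        if done is not None:
            done.set_result(202)

async def accept_after_processing(mailbox: asyncio.Queue, call) -> int:
    # HTTP call path as described above: the ACCEPTED waits on the actor.
    done = asyncio.get_running_loop().create_future()
    await mailbox.put((call, done))
    return await done

async def accept_immediately(mailbox: asyncio.Queue, call) -> int:
    # libprocess message path: enqueue and acknowledge right away.
    await mailbox.put((call, None))
    return 202

async def main() -> None:
    mailbox: asyncio.Queue = asyncio.Queue()
    actor = asyncio.create_task(master_actor(mailbox))
    print(await accept_after_processing(mailbox, {"type": "REVIVE"}))
    print(await accept_immediately(mailbox, {"type": "REVIVE"}))
    actor.cancel()

asyncio.run(main())
```

In the first variant the response time of every call includes whatever is already queued on the actor; in the second, the cost of accepting a call is just the enqueue.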
Thanks,
Dario

On Oct 16, 2016, at 11:01 AM, Anand Mazumdar <[email protected]> wrote:

Dario,

Thanks for reporting this. Did you test this with 1.0 or the recent HEAD? We had done performance testing prior to 1.0-rc1 and had not found any substantial discrepancy on the call ingestion path. Hence, we had focused on fixing the performance issues around writing events to the stream in MESOS-5222 <https://issues.apache.org/jira/browse/MESOS-5222> and MESOS-5457 <https://issues.apache.org/jira/browse/MESOS-5457>.

The numbers in the benchmark test pointed to by haosdent (v0 vs. v1) differ due to the slowness of the client (the scheduler library) in processing the status update events. We should add another benchmark that measures just the time taken by the master to write the events. I will file an issue shortly to address this.

Do you mind filing an issue with more details on your test setup?

-anand

On Sun, Oct 16, 2016 at 12:05 AM, Dario Rexin <[email protected]> wrote:

Hi haosdent,

thanks for the pointer! Your results show exactly what I'm experiencing. I think this could be very problematic, especially for bigger clusters. It would be great to get some input from the folks working on the HTTP API, especially Anand.

Thanks,
Dario

On Oct 16, 2016, at 12:01 AM, haosdent <[email protected]> wrote:

Hmm, this is an interesting topic. @anandmazumdar created a benchmark test case to compare the v1 and v0 APIs a while back. You could run it via

```
./bin/mesos-tests.sh --benchmark --gtest_filter="*SchedulerReconcileTasks_BENCHMARK_Test*"
```

Here are the results from running it on my machine:

```
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/0
Reconciling 1000 tasks took 386.451108ms using the scheduler library
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/0 (479 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/1
Reconciling 10000 tasks took 3.389258444secs using the scheduler library
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/1 (3435 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/2
Reconciling 50000 tasks took 16.624603964secs using the scheduler library
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/2 (16737 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/3
Reconciling 100000 tasks took 33.134018718secs using the scheduler library
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/3 (33333 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/0
Reconciling 1000 tasks took 24.212092ms using the scheduler driver
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/0 (89 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/1
Reconciling 10000 tasks took 316.115078ms using the scheduler driver
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/1 (385 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/2
Reconciling 50000 tasks took 1.239050154secs using the scheduler driver
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/2 (1379 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/3
Reconciling 100000 tasks took 2.38445672secs using the scheduler driver
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/3 (2711 ms)
```

*SchedulerLibrary* is the HTTP API; *SchedulerDriver* is the old way based on libmesos.so.

--
Best Regards,
Haosdent Huang

On Sun, Oct 16, 2016 at 2:41 PM, Dario Rexin <[email protected]> wrote:

Hi all,

I recently did some performance testing on the v1 scheduler API and found that throughput is around 10x lower than for the v0 API. Using one connection, I don't get much more than 1,500 calls per second, whereas the v0 API can do ~15,000. If I use multiple connections, throughput maxes out at 3 connections and ~2,500 calls/s. If I add any more connections, the throughput per connection drops and the total throughput stays around ~2,500 calls/s. Has anyone done performance testing on the v1 API before? It seems a little strange to me that it's so much slower, given that the v0 API also uses HTTP (well, more or less). I would be thankful for any comments and experience reports from other users.

Thanks,
Dario

