+1. We are also quite interested in this topic.
--
Cheers,
Zhitao Li

On Mon, Oct 17, 2016 at 12:34 PM, Dario Rexin <[email protected]> wrote:

Hi Anand,

thanks for creating the ticket. I will also investigate a bit more. I will probably be in SF on Thursday, so we could discuss in person.

--
Dario

On Oct 17, 2016, at 12:19 PM, Anand Mazumdar <[email protected]> wrote:

Dario,

It's not immediately clear to me where the bottleneck might be. I filed MESOS-6405 <https://issues.apache.org/jira/browse/MESOS-6405> to write a benchmark that mimics your test setup, so we can then go about fixing the issues.

-anand

On Sun, Oct 16, 2016 at 6:20 PM, Dario Rexin <[email protected]> wrote:

Hi Anand,

I tested with and without pipelining and it doesn't make a difference. First of all, unlimited pipelining is not a good idea: we still have to handle the responses and need to be able to relate each request to its response when it returns, i.e., store the context of the request until we receive the response. Also, we want to know as soon as possible when an error occurs, so early returns are very desirable. I agree that it shouldn't make a difference to how fast events can be processed whether they are queued on the master or on the client, but this observation made it very apparent that throughput is a problem on the master. I did not make any requests that would potentially block for a long time, so it's even weirder to me that the throughput is so low. One thing I don't understand, for example, is why all messages go through the master process. The parsing, for example, could be done in a completely separate process, and if every connected framework were backed by its own process, the check whether a framework is connected could also be done there (not to mention that this requirement only exists because we need to use multiple connections). Requiring all messages to go through a single process that can block indefinitely is obviously a huge bottleneck. I understand that this problem is not limited to the HTTP API, but I think it has to be fixed.

--
Dario

On Oct 16, 2016, at 5:52 PM, Anand Mazumdar <[email protected]> wrote:

Dario,

Regarding:

> This is especially concerning, as it means that accepting calls will completely stall when a long-running call (e.g. retrieving state.json) is running.

How does it help a client to get an early ACCEPTED response, versus having its calls queued up on the master actor? The client does not need to wait for a response before pipelining its next request to the master anyway. In your tests, do you send the next REVIVE call only upon receiving the response to the current call? That might explain the behavior you are seeing.

-anand

--
Anand Mazumdar

On Sun, Oct 16, 2016 at 11:58 AM, tommy xiao <[email protected]> wrote:

Interesting topic.

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com

2016-10-17 2:51 GMT+08:00 Dario Rexin <[email protected]>:

Hi Anand,

I tested with the current HEAD. After I saw low throughput with our own HTTP API client, I wrote a small server that sends out fake events and accepts calls, and our client was able to send a lot more calls to that server. I also wrote a small tool that simply sends as many calls to Mesos as possible without handling any events, and I got similar results there. I also observe extremely high CPU usage: while my sending tool uses ~10% CPU, Mesos runs at ~185%. The calls I send for testing are all REVIVE, and I don't have any agents connected, so there should be essentially nothing happening.
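A minimal sketch of such a flooding tool (not the exact one I used; it assumes a framework has already subscribed via POST /api/v1/scheduler on a separate connection, and the master address, framework id, and stream id below are placeholders):

```
# Flood the v1 scheduler API with REVIVE calls and report throughput.
# Sketch only: FRAMEWORK_ID and STREAM_ID are placeholder values taken
# from an existing subscription (SUBSCRIBED event / Mesos-Stream-Id header).
import json
import time

import requests

MASTER = "http://127.0.0.1:5050/api/v1/scheduler"  # placeholder address
FRAMEWORK_ID = "fw-placeholder-0000"
STREAM_ID = "stream-placeholder-0000"

call = json.dumps({
    "type": "REVIVE",
    "framework_id": {"value": FRAMEWORK_ID},
})
headers = {
    "Content-Type": "application/json",
    "Mesos-Stream-Id": STREAM_ID,  # required on every call after SUBSCRIBE
}

session = requests.Session()  # one persistent connection, no pipelining
sent = 0
deadline = time.time() + 10.0
while time.time() < deadline:
    response = session.post(MASTER, data=call, headers=headers)
    assert response.status_code == 202, response.text  # master ACKs with 202
    sent += 1

print("%.0f calls/s" % (sent / 10.0))
```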
One reason I could think of for the reduced throughput is that all calls are processed in the master process before it sends back an ACCEPTED, leading to effectively single-threaded processing of HTTP calls, interleaved with all the other calls that are sent to the master process. Libprocess, however, just forwards the messages to the master process and then immediately returns ACCEPTED. It also handles all connections in separate processes, whereas HTTP calls are effectively all handled by the master process. This is especially concerning, as it means that accepting calls will completely stall when a long-running call (e.g. retrieving state.json) is running.
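To make the difference concrete, here is a toy model of the two accept strategies (plain Python asyncio, not Mesos code): a queue stands in for the master actor's mailbox and a sleep for the actual call processing.

```
import asyncio

async def master_actor(mailbox: asyncio.Queue) -> None:
    # Single consumer: every call is serialized through here, so one
    # slow call delays everything queued behind it.
    while True:
        call, done = await mailbox.get()
        await asyncio.sleep(0.001)  # stand-in for parsing/validation/handling
        if done is not None:
            done.set_result(202)

async def accept_after_processing(mailbox: asyncio.Queue, call) -> int:
    # HTTP call path as described above: the ACCEPTED waits on the actor.
    done = asyncio.get_running_loop().create_future()
    await mailbox.put((call, done))
    return await done

async def accept_immediately(mailbox: asyncio.Queue, call) -> int:
    # libprocess message path: enqueue and acknowledge right away.
    await mailbox.put((call, None))
    return 202

async def main() -> None:
    mailbox: asyncio.Queue = asyncio.Queue()
    actor = asyncio.create_task(master_actor(mailbox))
    print(await accept_after_processing(mailbox, {"type": "REVIVE"}))
    print(await accept_immediately(mailbox, {"type": "REVIVE"}))
    actor.cancel()

asyncio.run(main())
```

In the first variant the response time of every call includes whatever is already queued on the actor; in the second, the cost of accepting a call is just the enqueue.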
Thanks,
Dario

On Oct 16, 2016, at 11:01 AM, Anand Mazumdar <[email protected]> wrote:

Dario,

Thanks for reporting this. Did you test this with 1.0 or the recent HEAD? We had done performance testing prior to 1.0-rc1 and had not found any substantial discrepancy on the call ingestion path. Hence, we had focused on fixing the performance issues around writing events to the stream in MESOS-5222 <https://issues.apache.org/jira/browse/MESOS-5222> and MESOS-5457 <https://issues.apache.org/jira/browse/MESOS-5457>.

The numbers in the benchmark test pointed to by haosdent (v0 vs. v1) differ due to the slowness of the client (the scheduler library) in processing the status update events. We should add another benchmark that measures just the time taken by the master to write the events. I will file an issue shortly to address this.

Do you mind filing an issue with more details on your test setup?

-anand

On Sun, Oct 16, 2016 at 12:05 AM, Dario Rexin <[email protected]> wrote:

Hi haosdent,

thanks for the pointer! Your results show exactly what I'm experiencing. I think this could be very problematic, especially for bigger clusters. It would be great to get some input from the folks working on the HTTP API, especially Anand.

Thanks,
Dario

On Oct 16, 2016, at 12:01 AM, haosdent <[email protected]> wrote:

Hmm, this is an interesting topic. @anandmazumdar created a benchmark test case to compare the v1 and v0 APIs a while back. You could run it via

```
./bin/mesos-tests.sh --benchmark --gtest_filter="*SchedulerReconcileTasks_BENCHMARK_Test*"
```

Here are the results from running it on my machine:

```
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/0
Reconciling 1000 tasks took 386.451108ms using the scheduler library
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/0 (479 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/1
Reconciling 10000 tasks took 3.389258444secs using the scheduler library
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/1 (3435 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/2
Reconciling 50000 tasks took 16.624603964secs using the scheduler library
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/2 (16737 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/3
Reconciling 100000 tasks took 33.134018718secs using the scheduler library
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/3 (33333 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/0
Reconciling 1000 tasks took 24.212092ms using the scheduler driver
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/0 (89 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/1
Reconciling 10000 tasks took 316.115078ms using the scheduler driver
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/1 (385 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/2
Reconciling 50000 tasks took 1.239050154secs using the scheduler driver
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/2 (1379 ms)
[ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/3
Reconciling 100000 tasks took 2.38445672secs using the scheduler driver
[       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/3 (2711 ms)
```

*SchedulerLibrary* is the HTTP API; *SchedulerDriver* is the old way based on libmesos.so.

--
Best Regards,
Haosdent Huang

On Sun, Oct 16, 2016 at 2:41 PM, Dario Rexin <[email protected]> wrote:

Hi all,

I recently did some performance testing on the v1 scheduler API and found that throughput is around 10x lower than for the v0 API. Using one connection, I don't get much more than 1,500 calls per second, whereas the v0 API can do ~15,000. If I use multiple connections, throughput maxes out at 3 connections and ~2,500 calls/s. If I add any more connections, the throughput per connection drops and the total throughput stays around ~2,500 calls/s. Has anyone done performance testing on the v1 API before? It seems a little strange to me that it's so much slower, given that the v0 API also uses HTTP (well, more or less). I would be thankful for any comments and experience reports from other users.

Thanks,
Dario

