Interesting topic.

2016-10-17 2:51 GMT+08:00 Dario Rexin <dre...@apple.com>:
> Hi Anand,
>
> I tested with current HEAD. After I saw low throughput on our own HTTP API
> client, I wrote a small server that sends out fake events and accepts calls,
> and our client was able to send a lot more calls to that server. I also
> wrote a small tool that simply sends as many calls to Mesos as possible
> without handling any events, and got similar results there. I also observe
> extremely high CPU usage: while my sending tool is using ~10% CPU, Mesos
> runs at ~185%. The calls I send for testing are all REVIVE and I don't have
> any agents connected, so there should be essentially nothing happening.
>
> One reason I could think of for the reduced throughput is that all calls
> are processed in the master process before it sends back an ACCEPTED,
> leading to effectively single-threaded processing of HTTP calls,
> interleaved with all other messages that are sent to the master process.
> Libprocess, however, just forwards the messages to the master process and
> then immediately returns ACCEPTED. It also handles all connections in
> separate processes, whereas HTTP calls are effectively all handled by the
> master process. This is especially concerning, as it means that accepting
> calls will completely stall while a long-running call (e.g. retrieving
> state.json) is running.
>
> Thanks,
> Dario
>
> On Oct 16, 2016, at 11:01 AM, Anand Mazumdar <an...@apache.org> wrote:
>
> Dario,
>
> Thanks for reporting this. Did you test this with 1.0 or the recent HEAD?
> We had done performance testing prior to 1.0-rc1 and had not found any
> substantial discrepancy on the call ingestion path. Hence, we had focused
> on fixing the performance issues around writing events on the stream in
> MESOS-5222 <https://issues.apache.org/jira/browse/MESOS-5222> and
> MESOS-5457 <https://issues.apache.org/jira/browse/MESOS-5457>.
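The measurement setup Dario describes — a fake server that immediately answers every call with ACCEPTED, plus a tool firing calls as fast as possible — can be sketched roughly like this. This is a minimal stand-in, not his actual tool: the `/api/v1/scheduler` path and the JSON `REVIVE` call shape follow the v1 scheduler API, but the server here is a dummy, the framework ID is made up, and the numbers only illustrate the methodology.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class FakeMaster(BaseHTTPRequestHandler):
    """Answers every POST with 202 Accepted, mimicking the master's fast path."""
    def do_POST(self):
        # Drain the request body before responding.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(202)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the timing loop quiet

# Start the fake master on an ephemeral port.
server = ThreadingHTTPServer(("127.0.0.1", 0), FakeMaster)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/api/v1/scheduler"

# A REVIVE call in the v1 JSON encoding; the framework ID is hypothetical.
call = json.dumps({"type": "REVIVE",
                   "framework_id": {"value": "test-framework"}}).encode()

n = 200
start = time.time()
for _ in range(n):
    req = urllib.request.Request(
        url, data=call, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        assert resp.status == 202
elapsed = time.time() - start
print(f"{n / elapsed:,.0f} calls/s")
server.shutdown()
```

Pointing the same loop at a real master (with persistent connections, as Dario's tool presumably used) is what would surface the gap he reports.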
> The numbers in the benchmark test pointed to by Haosdent (v0 vs. v1)
> differ due to the slowness of the client (scheduler library) in processing
> the status update events. We should add another benchmark that measures
> just the time taken by the master to write the events. I will file an
> issue shortly to address this.
>
> Do you mind filing an issue with more details on your test setup?
>
> -anand
>
> On Sun, Oct 16, 2016 at 12:05 AM, Dario Rexin <dre...@apple.com> wrote:
>
>> Hi haosdent,
>>
>> Thanks for the pointer! Your results show exactly what I'm experiencing.
>> I think especially for bigger clusters this could be very problematic.
>> It would be great to get some input from the folks working on the HTTP
>> API, especially Anand.
>>
>> Thanks,
>> Dario
>>
>> On Oct 16, 2016, at 12:01 AM, haosdent <haosd...@gmail.com> wrote:
>>
>> Hmm, this is an interesting topic. @anandmazumdar created a benchmark
>> test case to compare the v1 and v0 APIs before. You could run it via
>>
>> ```
>> ./bin/mesos-tests.sh --benchmark \
>>   --gtest_filter="*SchedulerReconcileTasks_BENCHMARK_Test*"
>> ```
>>
>> Here is the result of running it on my machine.
>> ```
>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/0
>> Reconciling 1000 tasks took 386.451108ms using the scheduler library
>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/0 (479 ms)
>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/1
>> Reconciling 10000 tasks took 3.389258444secs using the scheduler library
>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/1 (3435 ms)
>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/2
>> Reconciling 50000 tasks took 16.624603964secs using the scheduler library
>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/2 (16737 ms)
>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/3
>> Reconciling 100000 tasks took 33.134018718secs using the scheduler library
>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary/3 (33333 ms)
>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/0
>> Reconciling 1000 tasks took 24.212092ms using the scheduler driver
>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/0 (89 ms)
>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/1
>> Reconciling 10000 tasks took 316.115078ms using the scheduler driver
>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/1 (385 ms)
>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/2
>> Reconciling 50000 tasks took 1.239050154secs using the scheduler driver
>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/2 (1379 ms)
>> [ RUN      ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/3
>> Reconciling 100000 tasks took 2.38445672secs using the scheduler driver
>> [       OK ] Tasks/SchedulerReconcileTasks_BENCHMARK_Test.SchedulerDriver/3 (2711 ms)
>> ```
>>
>> *SchedulerLibrary* is the HTTP API, *SchedulerDriver* is the old way
>> based on libmesos.so.
>>
>> On Sun, Oct 16, 2016 at 2:41 PM, Dario Rexin <dre...@apple.com> wrote:
>>
>>> Hi all,
>>>
>>> I recently did some performance testing on the v1 scheduler API and
>>> found that throughput is around 10x lower than for the v0 API. Using one
>>> connection, I don't get much more than 1,500 calls per second, whereas
>>> the v0 API can do ~15,000. If I use multiple connections, throughput
>>> maxes out at 3 connections and ~2,500 calls/s. If I add any more
>>> connections, the throughput per connection drops and the total
>>> throughput stays around ~2,500 calls/s. Has anyone done performance
>>> testing on the v1 API before? It seems a little strange to me that it's
>>> so much slower, given that the v0 API also uses HTTP (well, more or
>>> less). I would be thankful for any comments and experience reports from
>>> other users.
>>>
>>> Thanks,
>>> Dario
>>
>> --
>> Best Regards,
>> Haosdent Huang

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com
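For reference, the 10,000-task runs in haosdent's benchmark output imply per-call rates that line up with Dario's ~10x observation. A quick back-of-the-envelope check, using only the numbers quoted in the thread:

```python
# (tasks, seconds) transcribed from the 10,000-task benchmark runs above.
library = (10000, 3.389258444)  # v1 HTTP API (scheduler library)
driver = (10000, 0.316115078)   # v0 API (scheduler driver)

lib_rate = library[0] / library[1]
drv_rate = driver[0] / driver[1]

print(f"v1 library: {lib_rate:,.0f} calls/s")   # ≈ 2,950 calls/s
print(f"v0 driver:  {drv_rate:,.0f} calls/s")   # ≈ 31,634 calls/s
print(f"ratio:      {drv_rate / lib_rate:.1f}x")  # ≈ 10.7x
```

Note that, per Anand's reply, the library-side numbers include client-side event processing, so they overstate the master's share of the gap.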