Re: Performance regression in v1 api vs v0

2016-10-16 Thread haosdent
Hmm, this is an interesting topic. @anandmazumdar previously created a benchmark test case to compare the v1 and v0 APIs. You can run it via ``` ./bin/mesos-tests.sh --benchmark --gtest_filter="*SchedulerReconcileTasks_BENCHMARK_Test*" ``` Here is the result of running it on my machine. ``` [ RUN ]

Re: Performance regression in v1 api vs v0

2016-10-16 Thread Dario Rexin
Hi haosdent, thanks for the pointer! Your results show exactly what I’m experiencing. I think especially for bigger clusters this could be very problematic. It would be great to get some input from the folks working on the HTTP API, especially Anand. Thanks, Dario > On Oct 16, 2016, at 12:01

Re: Performance regression in v1 api vs v0

2016-10-16 Thread Anand Mazumdar
Dario, Thanks for reporting this. Did you test this with 1.0 or the recent HEAD? We had done performance testing prior to 1.0rc1 and had not found any substantial discrepancy on the call ingestion path. Hence, we had focussed on fixing the performance issues around writing events on the stream in

Re: Performance regression in v1 api vs v0

2016-10-16 Thread Dario Rexin
Hi Anand, I tested with current HEAD. After I saw low throughput on our own HTTP API client, I wrote a small server that sends out fake events and accepts calls and our client was able to send a lot more calls to that server. I also wrote a small tool that simply sends as many calls to Mesos as

Re: Performance regression in v1 api vs v0

2016-10-16 Thread tommy xiao
This is an interesting topic. 2016-10-17 2:51 GMT+08:00 Dario Rexin : > Hi Anand, > > I tested with current HEAD. After I saw low throughput on our own HTTP API > client, I wrote a small server that sends out fake events and accepts calls > and our client was able to send a lot more calls to that serve

Re: Non-checkpointing frameworks

2016-10-16 Thread Qian Zhang
> > and requires operators to enable checkpointing on the slaves. Just curious why operators need to enable checkpointing on the slaves (I do not see an agent flag for that). I think checkpointing should be enabled at the framework level rather than on the slave. Thanks, Qian Zhang On Sun, Oct 16, 2016 a

Re: Performance regression in v1 api vs v0

2016-10-16 Thread Anand Mazumdar
Dario, Regarding: >This is especially concerning, as it means that accepting calls will completely stall when a long running call (e.g. retrieving state.json) is running. How does it help a client when it gets an early accepted response versus when accepting of calls is stalled i.e., queued up o

Re: Performance regression in v1 api vs v0

2016-10-16 Thread Dario Rexin
Hi Anand, I tested with and without pipelining and it doesn’t make a difference. First of all, unlimited pipelining is not a good idea, because we still have to handle the responses and need to be able to relate the request and response upon return, i.e. store the context of the request
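
The point about relating each response to its request can be illustrated with a minimal sketch. It assumes plain HTTP/1.1 pipelining, where responses come back in the same order the requests were sent, so a bounded FIFO queue of request contexts is enough. The class name `PipelinedConnection`, its methods, the `max_in_flight` limit, and the context dictionaries are all hypothetical and are not part of Mesos or of the client discussed in this thread.

```python
from collections import deque


class PipelinedConnection:
    """Hypothetical tracker for in-flight requests on one pipelined connection."""

    def __init__(self, max_in_flight):
        # Bounding the pipeline depth limits how many contexts must be stored.
        self.max_in_flight = max_in_flight
        self.in_flight = deque()  # request contexts, oldest first

    def can_send(self):
        return len(self.in_flight) < self.max_in_flight

    def on_request_sent(self, context):
        if not self.can_send():
            raise RuntimeError("pipeline full; wait for a response first")
        self.in_flight.append(context)

    def on_response(self, status):
        # HTTP/1.1 pipelining delivers responses in request order, so the
        # oldest stored context belongs to the response that just arrived.
        context = self.in_flight.popleft()
        return context, status


# Illustrative usage with made-up call contexts.
conn = PipelinedConnection(max_in_flight=2)
conn.on_request_sent({"call": "ACCEPT", "offer": "o1"})
conn.on_request_sent({"call": "DECLINE", "offer": "o2"})
first = conn.on_response(202)
```

With the depth capped at two, a third call has to wait until `on_response` frees a slot, which is the bookkeeping cost that makes unlimited pipelining unattractive.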