Re: Implicit reconcile "pauses" offer stream in large cluster

2018-02-07 Thread Benjamin Mahler
Following up, did you gather any perf data for this?

On Sat, Dec 30, 2017 at 8:15 AM, Meghdoot bhattacharya <meghdoo...@yahoo.com.invalid> wrote:
> Zhitao any further updates on this?
>
> Thx
>
> > On Dec 13, 2017, at 1:02 PM, Benjamin Mahler wrote:
> >
> > You can check

Re: Implicit reconcile "pauses" offer stream in large cluster

2017-12-13 Thread Benjamin Mahler
You can check the diff, for example: https://github.com/apache/mesos/compare/1.3.0...1.4.0

I didn't notice any changes that look like they would cause this. What do the master logs show during the time frame? Have you profiled what the master and scheduler are doing during this time frame?

On

Implicit reconcile "pauses" offer stream in large cluster

2017-12-12 Thread Zhitao Li
Hi, We have seen some potential problems when trying to upgrade Mesos from 1.3 to 1.4: when an implicit reconciliation happened for a large framework (Aurora), the scheduler would not see any offers for several minutes. Strangely, this does not show up once we revert to 1.3. A couple of