Hi Alex, you're right, I didn't think about the other cases. Currently I would say we have about three orders of magnitude more frameworks than roles.
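As an aside on the priority-queue idea discussed below: the alternative to re-sorting can be sketched with an ordered container that keeps clients sorted across updates. This is a rough illustration with made-up types, not the actual Mesos Sorter interface; it shows why such a structure helps the role sorter (whose total only moves when agents join or leave) more than the framework sorters (where every allocation moves the total, and with it every client's share at once).

#include <set>
#include <string>

// Illustrative only: a client ordered by its (dominant) share.
struct Client {
  std::string name;
  double share;
};

struct ByShare {
  bool operator()(const Client& a, const Client& b) const {
    // Order by share; break ties by name so the ordering is total.
    return a.share != b.share ? a.share < b.share : a.name < b.name;
  }
};

class OrderedSorter {
public:
  void add(const Client& client) { clients_.insert(client); }

  // An ordered set cannot mutate keys in place, so an update is an
  // erase + insert: O(log n), versus O(n log n) for a full re-sort.
  // This only pays off when updates touch a few clients at a time;
  // when the sorter's total changes, *every* share changes and the
  // whole structure would have to be rebuilt anyway.
  void update(const Client& oldClient, const Client& newClient) {
    clients_.erase(oldClient);
    clients_.insert(newClient);
  }

  const std::set<Client, ByShare>& sorted() const { return clients_; }

private:
  std::set<Client, ByShare> clients_;
};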
> On Jul 18, 2016, at 2:31 AM, Alex Rukletsov <[email protected]> wrote:
>
> Dario,
>
> but this is true only for framework sorters, right? The total kept in the role sorter is changed not on allocations, but when an agent joins or leaves the cluster. Maintaining a priority queue for roles can make sense, but may decrease the performance for framework sorters.
>
> What is the ratio frameworks / roles in your clusters?
>
>> On Fri, Jul 8, 2016 at 6:37 PM, Dario Rexin <[email protected]> wrote:
>>
>> Hi Alex,
>>
>> thanks for your input. We originally thought about that too, but the problem is that every time resources are allocated to a framework, this method will be called:
>>
>> void DRFSorter::add(const SlaveID& slaveId, const Resources& resources)
>>
>> It will add the passed resources to the total resources of the sorter and therefore invalidate the whole sorting (i.e. set dirty=true). So we would still have to actually sort the frameworks almost every time. In fact, frameworks are already kept sorted as long as possible, it's just not possible to keep them sorted for very long because of the call to said function ;).
>>
>> --
>> Dario
>>
>>> On Jul 8, 2016, at 6:50 AM, Alex Rukletsov <[email protected]> wrote:
>>>
>>> I was not involved in the conversations around this issue, so maybe you have discussed this already (in that case, is the outcome of the discussion documented somewhere?).
>>>
>>> Though the patch seems good to me, it assumes that frameworks SUPPRESS when they don't need offers. This is not always the case. Since we now have a real-world use case with ~6k frameworks, the "right thing to do" seems to be to maintain a heap of the roles and of the frameworks within each role and avoid sorting.
>>>
>>>> On Thu, Jul 7, 2016 at 7:20 PM, Dario Rexin <[email protected]> wrote:
>>>>
>>>> A bit more context:
>>>>
>>>> We have a very high number of frameworks on our clusters, in some cases ~6k. The biggest problem is the sort method, which has a complexity of O(k log k), where k = number of frameworks, and is called n*m times, where n = number of agents and m = number of roles. So in total we have a complexity of O(n*m*k log k). I think reducing the number of frameworks that get sorted is the most promising optimization here. We have been running this patch in production for quite a while now and have seen huge improvements in general allocation time and also in failover times.
>>>>
>>>> Also, if we were to add a parameterized version of SUPPRESS, what problems do you see with just differentiating between the two cases?
>>>>
>>>> Thanks,
>>>> --
>>>> Dario
>>>>
>>>>> On Jul 7, 2016, at 8:40 AM, Dario Rexin <[email protected]> wrote:
>>>>>
>>>>> Hi Joris,
>>>>>
>>>>> I still don't really understand why we would parameterize SUPPRESS; to me that sounds like a case for filters. The idea of SUPPRESS was to completely stop getting offers.
>>>>>
>>>>> Could you please explain why you think the patch is a hack? To me it just seems logical to not sort frameworks that don't need to be considered in the allocator.
>>>>>
>>>>> Thanks,
>>>>> Dario
>>>>>
>>>>>> On 07.07.2016, at 7:38 AM, Joris Van Remoortere <[email protected]> wrote:
>>>>>>
>>>>>> The reason that SUPPRESS doesn't just deactivate is because the intent was to be able to parameterize this call. At that point the change wouldn't work without turning this into 2 cases.
>>>>>>
>>>>>> I have asked to look at what a parameterized suppress would look like and to understand the performance impact of that before we do this.
>>>>>> Have we reached consensus that there's no way to implement a generic parameterized suppress that is performant?
>>>>>>
>>>>>> There are some refactorings that we had discussed with James, Jacob, and Ian that seem like lower-hanging fruit. After those are made it might be worth reconsidering whether we need to do this hack.
>>>>>>
>>>>>> —
>>>>>> *Joris Van Remoortere*
>>>>>> Mesosphere
>>>>>>
>>>>>>> On Thu, Jul 7, 2016 at 10:15 AM, Guangya Liu <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Ben and Dario,
>>>>>>>
>>>>>>> The reason we have the "SUPPRESS" call is as follows:
>>>>>>> 1) It acts as the complement to the current REVIVE call.
>>>>>>> 2) The HTTP API does not have a call to "Deactivate" a framework; we want to use "SUPPRESS", "DECLINE" and "DECLINE_INVERSE_OFFERS" to implement the equivalent of "DeactivateFrameworkMessage".
>>>>>>>
>>>>>>> You can also refer to https://issues.apache.org/jira/browse/MESOS-3037 for details.
>>>>>>>
>>>>>>> So I think that Dario's patch is good: we should remove the framework's client from the sorter on "SUPPRESS" and add it back on "REVIVE", so that suppressed frameworks are ignored by the sorter.
>>>>>>>
>>>>>>> @Vinod, any comments on this?
>>>>>>>
>>>>>>> @Ben, regarding your concern that the benchmark results are not easy to interpret, I have filed a JIRA ticket, https://issues.apache.org/jira/browse/MESOS-5800, to track this.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Guangya
>>>>>>>
>>>>>>>> On Thu, Jul 7, 2016 at 6:01 AM, Dario Rexin <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi Vinod,
>>>>>>>>
>>>>>>>> thanks for your reply. The reason it's so much faster is that sorting is a lot faster with fewer frameworks. Looping shouldn't make a huge difference, as it used to just skip over the deactivated frameworks.
>>>>>>>>
>>>>>>>> I don't know what effects deactivating the framework in the master would have. The framework is still active and listening for events / sending calls. Could you please elaborate?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> --
>>>>>>>> Dario
>>>>>>>>
>>>>>>>> On Jul 6, 2016, at 2:56 PM, Benjamin Mahler <[email protected]> wrote:
>>>>>>>>
>>>>>>>> +implementer and shepherd of SUPPRESS
>>>>>>>>
>>>>>>>> Is there any reason we didn't already just "deactivate" frameworks that were suppressing offers? That seems to be the natural implementation, performance aside, because the meaning of "deactivated" is: not being sent any offers. The patch you posted seems to only take this half-way: suppress = deactivation in the allocator, but not in the master.
>>>>>>>>
>>>>>>>> Also, Dario, it's a bit hard to interpret these numbers without reading the benchmark code. My interpretation of these numbers is that this change makes the allocation loop complete more quickly when there are many frameworks in the suppressed state, because we have to loop over fewer clients. Is this an accurate interpretation?
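For illustration, the suppress-as-deactivation bookkeeping described above (remove the framework's client from the sorter on SUPPRESS, add it back on REVIVE) might look roughly like the toy sketch below. All class and method names are illustrative stand-ins, not the actual allocator code:

#include <set>
#include <string>

// A toy sorter holding only active clients; suppressed ones are parked
// on the side and never participate in sorting.
class ToySorter {
public:
  void add(const std::string& client) { active_.insert(client); }

  void deactivate(const std::string& client) {
    active_.erase(client);
    suppressed_.insert(client);
  }

  void activate(const std::string& client) {
    suppressed_.erase(client);
    active_.insert(client);
  }

  // Only active clients are iterated and sorted by the allocator.
  const std::set<std::string>& activeClients() const { return active_; }

private:
  std::set<std::string> active_;
  std::set<std::string> suppressed_;
};

class ToyAllocator {
public:
  // SUPPRESS: the framework keeps its allocations but stops being a
  // sort candidate, shrinking n for every subsequent sort.
  void suppressOffers(const std::string& frameworkId) {
    sorter_.deactivate(frameworkId);
  }

  // REVIVE: the framework becomes a candidate again (a real revive
  // would also clear any decline filters so offers flow again).
  void reviveOffers(const std::string& frameworkId) {
    sorter_.activate(frameworkId);
  }

private:
  ToySorter sorter_;
};

The point is that a suppressed framework keeps its allocations; it merely stops being a candidate, so every subsequent sort runs over a smaller client list.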
>>>>>>>>
>>>>>>>> On Wed, Jul 6, 2016 at 2:08 PM, Dario Rexin <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I would like to revive https://issues.apache.org/jira/browse/MESOS-4694, especially https://reviews.apache.org/r/43666/. We heavily depend on this patch and would love to see it merged. To show the value of this patch, I ran the benchmark from https://reviews.apache.org/r/49616/ first on HEAD and then with the aforementioned patch applied. I took some lines out to make it easier to see the changes over time in the patched version and to keep this email shorter ;). I would love to get some feedback and discuss any necessary changes to get this patch merged.
>>>>>>>>
>>>>>>>> Here are the results:
>>>>>>>>
>>>>>>>> Mesos HEAD:
>>>>>>>>
>>>>>>>> Using 2000 agents and 200 frameworks
>>>>>>>> round 0 allocate took 3.064665secs to make 199 offers
>>>>>>>> round 1 allocate took 3.029418secs to make 198 offers
>>>>>>>> round 2 allocate took 3.091427secs to make 197 offers
>>>>>>>> round 3 allocate took 2.955457secs to make 196 offers
>>>>>>>> round 4 allocate took 3.133789secs to make 195 offers
>>>>>>>> [...]
>>>>>>>> round 50 allocate took 3.109859secs to make 149 offers
>>>>>>>> round 51 allocate took 3.062746secs to make 148 offers
>>>>>>>> round 52 allocate took 3.146043secs to make 147 offers
>>>>>>>> round 53 allocate took 3.042948secs to make 146 offers
>>>>>>>> round 54 allocate took 3.097835secs to make 145 offers
>>>>>>>> [...]
>>>>>>>> round 100 allocate took 3.027475secs to make 99 offers
>>>>>>>> round 101 allocate took 3.021641secs to make 98 offers
>>>>>>>> round 102 allocate took 2.9853secs to make 97 offers
>>>>>>>> round 103 allocate took 3.145925secs to make 96 offers
>>>>>>>> round 104 allocate took 2.99094secs to make 95 offers
>>>>>>>> [...]
>>>>>>>> round 150 allocate took 3.080406secs to make 49 offers
>>>>>>>> round 151 allocate took 3.109412secs to make 48 offers
>>>>>>>> round 152 allocate took 2.992129secs to make 47 offers
>>>>>>>> round 153 allocate took 3.405642secs to make 46 offers
>>>>>>>> round 154 allocate took 4.153354secs to make 45 offers
>>>>>>>> [...]
>>>>>>>> round 195 allocate took 3.10015secs to make 4 offers
>>>>>>>> round 196 allocate took 3.029347secs to make 3 offers
>>>>>>>> round 197 allocate took 2.982825secs to make 2 offers
>>>>>>>> round 198 allocate took 2.934595secs to make 1 offers
>>>>>>>> round 199 allocate took 313212us to make 0 offers
>>>>>>>>
>>>>>>>> Mesos HEAD + allocator patch:
>>>>>>>>
>>>>>>>> Using 2000 agents and 200 frameworks
>>>>>>>> round 0 allocate took 3.248205secs to make 199 offers
>>>>>>>> round 1 allocate took 3.170852secs to make 198 offers
>>>>>>>> round 2 allocate took 3.135146secs to make 197 offers
>>>>>>>> round 3 allocate took 3.143857secs to make 196 offers
>>>>>>>> round 4 allocate took 3.127641secs to make 195 offers
>>>>>>>> [...]
>>>>>>>> round 50 allocate took 2.492077secs to make 149 offers
>>>>>>>> round 51 allocate took 2.435054secs to make 148 offers
>>>>>>>> round 52 allocate took 2.472204secs to make 147 offers
>>>>>>>> round 53 allocate took 2.457228secs to make 146 offers
>>>>>>>> round 54 allocate took 2.413916secs to make 145 offers
>>>>>>>> [...]
>>>>>>>> round 100 allocate took 1.645015secs to make 99 offers
>>>>>>>> round 101 allocate took 1.647373secs to make 98 offers
>>>>>>>> round 102 allocate took 1.619147secs to make 97 offers
>>>>>>>> round 103 allocate took 1.625496secs to make 96 offers
>>>>>>>> round 104 allocate took 1.580513secs to make 95 offers
>>>>>>>> [...]
>>>>>>>> round 150 allocate took 1.064716secs to make 49 offers
>>>>>>>> round 151 allocate took 1.065604secs to make 48 offers
>>>>>>>> round 152 allocate took 1.053049secs to make 47 offers
>>>>>>>> round 153 allocate took 1.041333secs to make 46 offers
>>>>>>>> round 154 allocate took 1.0461secs to make 45 offers
>>>>>>>> [...]
>>>>>>>> round 195 allocate took 569640us to make 4 offers
>>>>>>>> round 196 allocate took 562107us to make 3 offers
>>>>>>>> round 197 allocate took 547632us to make 2 offers
>>>>>>>> round 198 allocate took 530765us to make 1 offers
>>>>>>>> round 199 allocate took 24426us to make 0 offers
>>>>>>>>
>>>>>>>> --
>>>>>>>> Dario
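A rough sketch of the dirty-flag behavior Dario describes earlier in the thread may help explain these numbers: on HEAD, every round pays a full O(n log n) re-sort over all 200 frameworks, because any change to the sorter's total invalidates the cached order, while with the patch the sorted population shrinks as frameworks suppress. The stand-in below is simplified to one scalar resource, with hypothetical names; the real DRFSorter tracks full Resources objects per agent.

#include <algorithm>
#include <map>
#include <string>
#include <vector>

class ToyDRFSorter {
public:
  void addClient(const std::string& name) {
    clients_.push_back(name);
    dirty_ = true;
  }

  // Loosely mirrors the DRFSorter::add() signature quoted earlier:
  // growing the total changes every client's share at once, so the
  // cached order is invalidated.
  void add(double resources) {
    total_ += resources;
    dirty_ = true;
  }

  // Allocating to a framework also perturbs the order.
  void allocated(const std::string& client, double resources) {
    allocations_[client] += resources;
    dirty_ = true;
  }

  // Called once per agent per role in the allocation loop: when dirty,
  // this is an O(n log n) re-sort over all n clients, suppressed or not.
  std::vector<std::string> sort() {
    if (dirty_) {
      std::sort(clients_.begin(), clients_.end(),
                [this](const std::string& a, const std::string& b) {
                  return share(a) < share(b);
                });
      dirty_ = false;
    }
    return clients_;
  }

private:
  double share(const std::string& client) {
    return total_ > 0.0 ? allocations_[client] / total_ : 0.0;
  }

  std::vector<std::string> clients_;
  std::map<std::string, double> allocations_;
  double total_ = 0.0;
  bool dirty_ = true;
};

This is why the patch's win comes from shrinking the client list itself rather than from sorting more cleverly: the sort cannot be skipped, but it can be run over far fewer clients.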
