Thanks, Adam, for explaining the fair share scheduling algorithm in more detail. What is interesting is that bumping up Chronos' refuse interval to 1s didn't work. It only worked when it was bumped up to 5s (even with Marathon). Is there any explanation for that? I would think that with the 1 second refuse interval, after 5 secs when Chronos refused the offer, Marathon would receive it in the same allocation round.

Also, if this is the expected behavior, it would be great if you could leave a note on the JIRA issue and/or close it.
-Elizabeth

On Thu, Apr 9, 2015 at 11:43 AM, Adam Bordelon <[email protected]> wrote:

> If Chronos is in fact further below its fair share at any given point in
> time, then it will always get the offer. If, however, Chronos recently
> declined that offer and we're still within its refuse_interval, then it
> will have to be offered to the next framework (Marathon). In the scenario
> where Chronos drastically reduces its refuse_interval (to 0.1s), it was
> never giving other frameworks the opportunity to be offered those
> resources. Since the default mesos allocation interval is 1s, that pretty
> much guarantees that Chronos' 0.1s refuse_interval had already expired by
> the time the next allocation round occurs, so it gets offered the same
> resources again.
> The refuse_interval should never be less than the master's
> allocation_interval. This has been fixed in the latest Chronos.
>
> On Thu, Apr 9, 2015 at 10:59 AM, Elizabeth Lingg <[email protected]>
> wrote:
>
>> Hello,
>>
>> Changing the offer refuse interval to the default was done in Chronos
>> master. We are pushing out another Chronos release with some important
>> changes including that one soon.
>>
>> However, the question still remains: why does Chronos keep getting the
>> offer back after refusal, and why is it not offered to Marathon? Even with
>> a different refuse interval, I would think that with the fair share
>> algorithm, Marathon would get the offer as well. Could you possibly
>> comment on this JIRA issue as well,
>> https://issues.apache.org/jira/browse/MESOS-2546, so we can investigate?
>>
>> Thanks,
>> Elizabeth
>>
>> On Wed, Apr 8, 2015 at 9:44 PM, <[email protected]> wrote:
>>
>>> @Elizabeth, we did encounter the same issue.
>>>
>>> I just verified that both mesos-0.20.1 and the latest mesos-0.22.0 work
>>> OK if I comment out the chronos refuse filter, which changes the refuse
>>> seconds back to the default 5s; if not, both mesos versions fail.
>>> On April 9, 2015 at 8:48 AM, [email protected] wrote:
>>>
>>> @David, after receiving and declining the offer for the first time,
>>> chronos's share is still the smaller one, so mesos continues to provide
>>> the offer to chronos instead of marathon.
>>> @Elizabeth, I will try with the latest mesos a moment later.
>>> On April 9, 2015 at 8:37 AM, Elizabeth Lingg <[email protected]> wrote:
>>>
>>> Correct, the issue we were seeing is that Chronos would decline the
>>> offer, but Chronos would keep getting it back instead of it being offered
>>> to Marathon. We were reproducing this issue on a cluster with a single
>>> master/slave, which is a bit of an anti-pattern. I'm not sure if anyone was
>>> able to reproduce this with the latest version of Mesos.
>>>
>>> -Elizabeth
>>>
>>> On Wed, Apr 8, 2015 at 5:19 PM, David Greenberg <[email protected]>
>>> wrote:
>>>
>>>> I believe that DRF is more of a "right of first refusal." Even though
>>>> marathon's got the higher share, all that means is that chronos will get
>>>> the offer first; marathon will have to wait until chronos declines it.
>>>>
>>>> On Wed, Apr 8, 2015 at 5:33 PM Elizabeth Lingg <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This sounds like an issue we encountered,
>>>>> https://issues.apache.org/jira/browse/MESOS-2546. Are you able to
>>>>> reproduce this in the latest release? If so, could you add a comment to
>>>>> the issue?
>>>>>
>>>>> Thanks,
>>>>> Elizabeth
>>>>>
>>>>> On Wed, Apr 8, 2015 at 3:35 AM, <[email protected]> wrote:
>>>>>
>>>>>> Suppose I registered two frameworks, marathon and chronos (both in
>>>>>> role *), one after another, successfully deployed and ran one app
>>>>>> (app1) with marathon, and also deployed one cron app (app2) with
>>>>>> chronos. Before app2 is due or after app2 has finished, I can't deploy
>>>>>> and launch any new app with marathon although there are many resources
>>>>>> left, because mesos sends the offer to chronos all the time, as the
>>>>>> share of chronos is smaller according to DRF. So is DRF unreasonable,
>>>>>> and is there any advice on how to allocate resources in this scenario?
>>>>>> Any suggestion will be appreciated.
>>>>>> Best regards!
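
For anyone following the thread, here is a rough toy model of the interaction Adam describes between the refuse interval and the allocation interval. It is only a sketch under stated assumptions, not Mesos code: the function name `simulate`, the single-offer setup, the fixed "Chronos first" ordering, and the rule that a filter counts as expired at exactly its expiry time are all made up for illustration. Times are in integer milliseconds to avoid floating-point noise.

```python
def simulate(refuse_ms, allocation_ms=1000, rounds=5):
    """Toy model: one offer, two frameworks, Chronos always declines.

    Returns the name of the framework that receives the offer in each
    allocation round. Everything here is a simplifying assumption.
    """
    # Chronos has the smaller share in the thread's scenario, so it is
    # always first in line; Marathon is only considered if Chronos is
    # currently filtered out.
    order = ["chronos", "marathon"]
    filter_expires = {}  # framework -> time (ms) until which it is skipped
    recipients = []
    for r in range(rounds):
        now = r * allocation_ms
        for fw in order:
            # Skip a framework whose refuse filter is still active.
            # Assumption: a filter expiring exactly at `now` is expired.
            if filter_expires.get(fw, 0) > now:
                continue
            recipients.append(fw)
            if fw == "chronos":
                # Chronos declines and asks not to be re-offered for refuse_ms.
                filter_expires[fw] = now + refuse_ms
            # Marathon's reaction to the offer is not modelled.
            break
    return recipients


if __name__ == "__main__":
    print("refuse 0.1s:", simulate(100))
    print("refuse 5s:  ", simulate(5000))
```

In this model a 0.1s refuse interval has always expired before the next 1s allocation round, so Chronos receives every offer, while a 5s interval lets Marathon see the offer in the following rounds. The model does not settle the 1s question raised above: when the refuse interval equals the allocation interval, the outcome depends entirely on how the tie at the boundary is resolved, which this sketch picks arbitrarily.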

