Hi everyone,

For now (including in the current Serenity [1]), we have avoided oversubscription of non-compressible resources. If you look at a system like Heracles [2], the polling interval is on the order of seconds, which may not be fast enough to react to an OOM condition. It shouldn't be impossible, though, given all the work VMware has been putting into it. We have postponed it for now. There is, however, nothing that prevents you from doing it with the current resource estimator and QoS controller APIs.

In the first version of Serenity, we relied on slack in CPU time to determine how many resources could be oversubscribed, and then used hardware performance counters to monitor (and protect) latency-critical workloads. It turned out, however, to be extremely difficult to make this universal without access to APM data (SLIs/SLOs) from the application itself.

Niklas

[1] https://github.com/mesosphere/serenity

On Mon, Mar 21, 2016 at 10:54 AM, Zhitao Li <[email protected]> wrote:

> Hi Stephan,
>
> Glad someone is sharing interest in this topic. My company is also very
> interested in this topic. Sharing a couple of thoughts:
>
> 1. I believe the real difficulties here come from isolation: how would Mesos
> handle overcommitted memory, given that it cannot be throttled like CPU?
> 2. Handling this within one single Mesos framework could differ from the
> case of running multiple frameworks.
> 3. I know you are active on Apache Aurora. I believe right now Aurora does
> not consider RAM a revocable resource, but we can probably work together to
> expand that once we know the isolation story.
>
>
> On Mon, Mar 21, 2016 at 8:30 AM, Erb, Stephan <[email protected]
> > wrote:
>
>> Judging from the epic description, this seems to target the
>> oversubscription of reserved resources on the framework level.
>>
>>
>> However, my question was targeting the task level, where one task of a
>> framework is requesting more RAM than it actually uses, and other tasks
>> from the same framework can be started as revocable and use those slack
>> resources.
>>
>>
>> The latter is already possible with compressible resources such as CPU or
>> bandwidth. I am now interested in non-compressible resources (i.e. memory).
>>
>>
>> ------------------------------
>> *From:* Guangya Liu <[email protected]>
>> *Sent:* Monday, March 21, 2016 15:53
>> *To:* [email protected]
>> *Subject:* Re: Current state of the oversubscription feature
>>
>> https://issues.apache.org/jira/browse/MESOS-4967 is planning to
>> introduce "Oversubscription for reservation". Can you please check
>> whether this helps?
>>
>> Thanks,
>>
>> Guangya
>>
>> On Mon, Mar 21, 2016 at 8:54 PM, Erb, Stephan <
>> [email protected]> wrote:
>>
>>> Hi everyone,
>>>
>>> I am interested in the current state of the Mesos oversubscription
>>> feature [1]. In particular, I would like to know if anyone has taken a
>>> closer look at non-compressible resources such as memory.
>>>
>>> Anything I should be aware of?
>>>
>>> Thanks and Best Regards,
>>> Stephan
>>>
>>> [1] http://mesos.apache.org/documentation/latest/oversubscription/
>>
>>
>>
>
>
> --
> Cheers,
>
> Zhitao Li
>
--
Niklas

