Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Mon, Jun 16, 2014 at 04:29:15PM +0200, Michal Hocko wrote:
> > They're all in the mainline now.
>
> git grep CFTYPE_ON_ON_DFL origin/master didn't show me anything.

lol, it should have been CFTYPE_ONLY_ON_DFL.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Mon 16-06-14 10:12:33, Tejun Heo wrote:
> On Mon, Jun 16, 2014 at 04:04:48PM +0200, Michal Hocko wrote:
> > > For whatever reason, a user is stuck with thread-level granularity for
> > > controllers which work that way, the user can use the old hierarchies
> > > for them for the time being.
> >
> > So he can mount memcg with new cgroup API and others with old?
>
> Yes, you can read Documentation/cgroups/unified-hierarchy.txt for more
> details. I think I cc'd you when posting unified hierarchy patchset,
> didn't I?

OK, I've obviously pushed that out of my brain, because you are really
clear about it:
"
All controllers which are not bound to other hierarchies are
automatically bound to unified hierarchy and show up at the root of
it. Controllers which are enabled only in the root of unified
hierarchy can be bound to other hierarchies at any time. This
allows mixing unified hierarchy with the traditional multiple
hierarchies in a fully backward compatible way.
"

This of course sorts out my concerns. Sorry about the noise!

> > > Nope, some changes don't fit that model. CFTYPE_ON_ON_DFL is the
> > > opposite.
> >
> > OK, I wasn't aware of this. On which branch I find this?
>
> They're all in the mainline now.

git grep CFTYPE_ON_ON_DFL origin/master didn't show me anything.

Thanks!
--
Michal Hocko
SUSE Labs
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Mon, Jun 16, 2014 at 04:04:48PM +0200, Michal Hocko wrote:
> > For whatever reason, a user is stuck with thread-level granularity for
> > controllers which work that way, the user can use the old hierarchies
> > for them for the time being.
>
> So he can mount memcg with new cgroup API and others with old?

Yes, you can read Documentation/cgroups/unified-hierarchy.txt for more
details. I think I cc'd you when posting unified hierarchy patchset,
didn't I?

> > Nope, some changes don't fit that model. CFTYPE_ON_ON_DFL is the
> > opposite.
>
> OK, I wasn't aware of this. On which branch I find this?

They're all in the mainline now.

> > Knobs marked with the flag only appear on the default
> > hierarchy (cgroup core internally calls it the default hierarchy as
> > this is the tree all the controllers are attached to by default).
>
> I am not sure I understand. So they are visible only in the hierarchy
> mounted with the new cgroup API (sane or how is it called)?

Yeap.

Thanks.

--
tejun
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Mon 16-06-14 09:57:41, Tejun Heo wrote:
> Hello, Michal.
>
> On Mon, Jun 16, 2014 at 02:59:15PM +0200, Michal Hocko wrote:
> > > There sure is a question of how fast userland will move to the new
> > > interface.
> >
> > Yeah, I was mostly thinking about those who would need to do bigger
> > changes. AFAIR threads will no longer be distributable between groups.
>
> Thread-level granularity should go away no matter what, but this is
> completely irrelevant to memcg which can't do per-thread anyway.

Yes, I wasn't afraid about memcg. It was a setup which requires more
controllers that I was worried about.

> For whatever reason, a user is stuck with thread-level granularity for
> controllers which work that way, the user can use the old hierarchies
> for them for the time being.

So he can mount memcg with new cgroup API and others with old?

> > > is used but I don't think there's any chance of removing the knob.
> > > There's a reason why we're introducing a new version of the whole
> > > cgroup interface which can co-exist with the existing one after all.
> > > If you wanna version memcg interface separately, maybe that'd work but
> > > it sounds like a lot of extra hassle for not much gain.
> >
> > No, I didn't mean to version the interface. I just wanted to have
> > gradual transition for potential soft_limit users.
> >
> > Maybe I am misunderstanding something but I thought that new version of
> > API will contain all knobs which are not marked .flags = CFTYPE_INSANE
> > while the old API will contain all of them.
>
> Nope, some changes don't fit that model. CFTYPE_ON_ON_DFL is the
> opposite.

OK, I wasn't aware of this. On which branch I find this?

> Knobs marked with the flag only appear on the default
> hierarchy (cgroup core internally calls it the default hierarchy as
> this is the tree all the controllers are attached to by default).

I am not sure I understand. So they are visible only in the hierarchy
mounted with the new cgroup API (sane or how is it called)?
--
Michal Hocko
SUSE Labs
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
Hello, Michal.

On Mon, Jun 16, 2014 at 02:59:15PM +0200, Michal Hocko wrote:
> > There sure is a question of how fast userland will move to the new
> > interface.
>
> Yeah, I was mostly thinking about those who would need to do bigger
> changes. AFAIR threads will no longer be distributable between groups.

Thread-level granularity should go away no matter what, but this is
completely irrelevant to memcg which can't do per-thread anyway. For
whatever reason, a user is stuck with thread-level granularity for
controllers which work that way, the user can use the old hierarchies
for them for the time being.

> > is used but I don't think there's any chance of removing the knob.
> > There's a reason why we're introducing a new version of the whole
> > cgroup interface which can co-exist with the existing one after all.
> > If you wanna version memcg interface separately, maybe that'd work but
> > it sounds like a lot of extra hassle for not much gain.
>
> No, I didn't mean to version the interface. I just wanted to have
> gradual transition for potential soft_limit users.
>
> Maybe I am misunderstanding something but I thought that new version of
> API will contain all knobs which are not marked .flags = CFTYPE_INSANE
> while the old API will contain all of them.

Nope, some changes don't fit that model. CFTYPE_ON_ON_DFL is the
opposite. Knobs marked with the flag only appear on the default
hierarchy (cgroup core internally calls it the default hierarchy as
this is the tree all the controllers are attached to by default).

Thanks.

--
tejun
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Thu 12-06-14 12:51:05, Johannes Weiner wrote:
> On Thu, Jun 12, 2014 at 04:22:37PM +0200, Michal Hocko wrote:
> > On Thu 12-06-14 09:56:00, Johannes Weiner wrote:
> > > On Thu, Jun 12, 2014 at 03:22:07PM +0200, Michal Hocko wrote:
> > [...]
> > > > Anyway, the situation now is pretty chaotic. I plan to gather all the
> > > > patches posted so far and repost for the future discussion. I just need
> > > > to finish some internal tasks and will post it soon.
> > >
> > > That would be great, thanks, it's really hard to follow this stuff
> > > halfway in and halfway outside of -mm.
> > >
> > > Now that we roughly figured out what knobs and semantics we want, it
> > > would be great to figure out the merging logistics.
> > >
> > > I would prefer if we could introduce max, high, low, min in unified
> > > hierarchy, and *only* in there, so that we never have to worry about
> > > it coexisting and interacting with the existing hard and soft limit.

Btw. what is the way to introduce a knob _only_ in the new cgroup API?
I am aware only about .flags = CFTYPE_INSANE which works other way
around.

> > The primary question would be, whether this is the best transition
> > strategy. I do not know how many users apart from developers are really
> > using unified hierarchy. I would be worried that we merge a feature which
> > will not be used for a long time.
>
> Unified hierarchy is the next version of the cgroup interface, and
> once the development tag drops I consider the old memcg interface
> deprecated.

Deprecated in the unified hierarchy mount, right? There will be still
the old API around AFAIU. The deprecated knobs will be only not visible
in the new API. So we cannot simply remove all the code after unified
hierarchy drops its DEVEL status, can we?

> It makes very little sense to me to put up additional
> incentives at this point to continue the use of the old interface,
> when we already struggle with manpower to maintain even one of them.
>
> > Moreover, if somebody wants to transition from soft limit then it would
> > be really hard because switching to unified hierarchy might be a no-go.
> >
> > I think that it is clear that we should deprecate soft_limit ASAP. I
> > also think it won't hurt to have min, low, high in both old and unified
> > API and strongly warn if somebody tries to use soft_limit along with any
> > of the new APIs in the first step. Later we can even forbid any
> > combination by a hard failure.
>
> Why would somebody NOT be able to convert to unified hierarchy
> eventually?

I've mentioned that in other email. I remember people complaining about
threads not being distributable over groups in the past. Things might
have changed in the mean time, I was too busy to pay closer attention so
I might be completely wrong here.

> How big is the intersection of cases that can't convert to unified
> hierarchy AND are using the soft limit AND want to use the new low
> limit?

I am not talking about intentional usage of soft limit with new knobs.
That would be unsupported of course and I meant to complain about that
in the logs and later even fail on an attempt.

> Merging a different concept with its own naming scheme into an already
> confusing interface, spamming the dmesg if someone gets it wrong,
> potentially introducing more breakage with the hard failure, putting
> up incentives to stick with a deprecated and confusing interface...
> This is a lot of horrible stuff in an attempt to accommodate very few
> usecases - if any - when we are *already versioning the interface* and
> have the opportunity for a clean transition.
>
> The transition to min, low, high, max is effort in itself. Conflating
> the two models sounds more detrimental than anything else, with a very
> dubious upside at that.
>
> > > It would also be beneficial to introduce them all close to each other,
> > > develop them together, possibly submit them in the same patch series,
> > > so that we know the requirements and how the code should look like in
> > > the big picture and can offer a fully consistent and documented usage
> > > model in the unified hierarchy.
> >
> > Min and Low should definitely go together. High sounds like an
> > orthogonal problem (pro-active reclaim vs reclaim protection) so I think
> > it can go its own way and pace. We still have to discuss its semantic
> > and I feel it would be a bit disturbing to have everything in one
> > bundle.
> >
> > I do understand your point about the global picture, though. Do you
> > think that there is a risk that formulating semantic for High limit
> > might change the way how Min and Low would be defined?
>
> I think one of the biggest hindrances in making forward progress on
> individual limits is that we only had a laundry list of occasionally
> conflicting requirements but never a consistent big picture to design
> around and match full usecases to. It's much easier and less error
> prone to develop the concept as a whole, alongside full
> real-life configurations. They are symmetrical pieces whose semantics
> very much depend on each other, so I wouldn't like too much lag between
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Thu 12-06-14 12:17:33, Tejun Heo wrote:
> Hello, Michal.
>
> On Thu, Jun 12, 2014 at 04:22:37PM +0200, Michal Hocko wrote:
> > The primary question would be, whether this is the best transition
> > strategy. I do not know how many users apart from developers are really
> > using unified hierarchy. I would be worried that we merge a feature which
> > will not be used for a long time.
>
> I'm planning to drop __DEVEL__ mask from the unified hierarchy in a
> cycle, at most two.

OK, I am obviously behind the current cgroup core changes. I thought
that unified hierarchy will be for development only for much more time.

> The biggest hold up at the moment is
> straightening out the interfaces and interaction between memcg and
> blkcg because I think it'd be silly to have to go through another
> round of interface versioning effort right after transitioning to
> unified hierarchy. I'm not too confident whether it'd be possible to
> get blkcg completely in shape by that time, but, if that takes too
> long, I'll just leave blkcg behind temporarily. So, at least from
> kernel side, it's not gonna be too long.
>
> There sure is a question of how fast userland will move to the new
> interface.

Yeah, I was mostly thinking about those who would need to do bigger
changes. AFAIR threads will no longer be distributable between groups.

> Some are already playing with unified hierarchy and
> planning to migrate as soon as possible but there sure will be others
> who will take more time. Can't tell for sure, but the thing is that
> migration to min/low/high/max scheme is a significant migration effort
> too, so I'm not sure how much we'd gain by doing that separately.
> It'd be an extra transition step for userland (optional but still),
> more combinations of configuration to handle for memcg, and it's not
> like unified hierarchy is that difficult to transition to.
>
> > Moreover, if somebody wants to transition from soft limit then it would
> > be really hard because switching to unified hierarchy might be a no-go.
>
> Why would that be a no-go?

I remember discussions about per-thread distributions and some other
things missing from the new API.

> Its usage is mostly similar with
> traditional hierarchies and can be used with other hierarchies, so
> while it'd take some adaptation, in most cases gradual transition
> shouldn't be a big problem.

OK

> > I think that it is clear that we should deprecate soft_limit ASAP. I
> > also think it won't hurt to have min, low, high in both old and unified
> > API and strongly warn if somebody tries to use soft_limit along with any
> > of the new APIs in the first step. Later we can even forbid any
> > combination by a hard failure.
>
> I don't quite understand how you plan to deprecate it. Sure you can
> fail with -EINVAL or whatnot when the wrong combination

Yes, I was thinking that direction. First warn and then EINVAL later.

> is used but I don't think there's any chance of removing the knob.
> There's a reason why we're introducing a new version of the whole
> cgroup interface which can co-exist with the existing one after all.
> If you wanna version memcg interface separately, maybe that'd work but
> it sounds like a lot of extra hassle for not much gain.

No, I didn't mean to version the interface. I just wanted to have
gradual transition for potential soft_limit users.

Maybe I am misunderstanding something but I thought that new version of
API will contain all knobs which are not marked .flags = CFTYPE_INSANE
while the old API will contain all of them.
--
Michal Hocko
SUSE Labs
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Thu, Jun 12, 2014 at 04:22:37PM +0200, Michal Hocko wrote:
> On Thu 12-06-14 09:56:00, Johannes Weiner wrote:
> > On Thu, Jun 12, 2014 at 03:22:07PM +0200, Michal Hocko wrote:
> [...]
> > > Anyway, the situation now is pretty chaotic. I plan to gather all the
> > > patches posted so far and repost for the future discussion. I just need
> > > to finish some internal tasks and will post it soon.
> >
> > That would be great, thanks, it's really hard to follow this stuff
> > halfway in and halfway outside of -mm.
> >
> > Now that we roughly figured out what knobs and semantics we want, it
> > would be great to figure out the merging logistics.
> >
> > I would prefer if we could introduce max, high, low, min in unified
> > hierarchy, and *only* in there, so that we never have to worry about
> > it coexisting and interacting with the existing hard and soft limit.
>
> The primary question would be, whether this is the best transition
> strategy. I do not know how many users apart from developers are really
> using unified hierarchy. I would be worried that we merge a feature which
> will not be used for a long time.

Unified hierarchy is the next version of the cgroup interface, and
once the development tag drops I consider the old memcg interface
deprecated. It makes very little sense to me to put up additional
incentives at this point to continue the use of the old interface,
when we already struggle with manpower to maintain even one of them.

> Moreover, if somebody wants to transition from soft limit then it would
> be really hard because switching to unified hierarchy might be a no-go.
>
> I think that it is clear that we should deprecate soft_limit ASAP. I
> also think it won't hurt to have min, low, high in both old and unified
> API and strongly warn if somebody tries to use soft_limit along with any
> of the new APIs in the first step. Later we can even forbid any
> combination by a hard failure.

Why would somebody NOT be able to convert to unified hierarchy
eventually? How big is the intersection of cases that can't convert
to unified hierarchy AND are using the soft limit AND want to use the
new low limit?

Merging a different concept with its own naming scheme into an already
confusing interface, spamming the dmesg if someone gets it wrong,
potentially introducing more breakage with the hard failure, putting
up incentives to stick with a deprecated and confusing interface...
This is a lot of horrible stuff in an attempt to accommodate very few
usecases - if any - when we are *already versioning the interface* and
have the opportunity for a clean transition.

The transition to min, low, high, max is effort in itself. Conflating
the two models sounds more detrimental than anything else, with a very
dubious upside at that.

> > It would also be beneficial to introduce them all close to each other,
> > develop them together, possibly submit them in the same patch series,
> > so that we know the requirements and how the code should look like in
> > the big picture and can offer a fully consistent and documented usage
> > model in the unified hierarchy.
>
> Min and Low should definitely go together. High sounds like an
> orthogonal problem (pro-active reclaim vs reclaim protection) so I think
> it can go its own way and pace. We still have to discuss its semantic
> and I feel it would be a bit disturbing to have everything in one
> bundle.
>
> I do understand your point about the global picture, though. Do you
> think that there is a risk that formulating semantic for High limit
> might change the way how Min and Low would be defined?

I think one of the biggest hindrances in making forward progress on
individual limits is that we only had a laundry list of occasionally
conflicting requirements but never a consistent big picture to design
around and match full usecases to.

It's much easier and less error prone to develop the concept as a
whole, alongside full real-life configurations. They are symmetrical
pieces whose semantics very much depend on each other, so I wouldn't
like too much lag between those.
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
Hello, Michal.

On Thu, Jun 12, 2014 at 04:22:37PM +0200, Michal Hocko wrote:
> The primary question would be, whether this is the best transition
> strategy. I do not know how many users apart from developers are really
> using unified hierarchy. I would be worried that we merge a feature which
> will not be used for a long time.

I'm planning to drop __DEVEL__ mask from the unified hierarchy in a
cycle, at most two. The biggest hold up at the moment is
straightening out the interfaces and interaction between memcg and
blkcg because I think it'd be silly to have to go through another
round of interface versioning effort right after transitioning to
unified hierarchy. I'm not too confident whether it'd be possible to
get blkcg completely in shape by that time, but, if that takes too
long, I'll just leave blkcg behind temporarily. So, at least from
kernel side, it's not gonna be too long.

There sure is a question of how fast userland will move to the new
interface. Some are already playing with unified hierarchy and
planning to migrate as soon as possible but there sure will be others
who will take more time. Can't tell for sure, but the thing is that
migration to min/low/high/max scheme is a significant migration effort
too, so I'm not sure how much we'd gain by doing that separately.
It'd be an extra transition step for userland (optional but still),
more combinations of configuration to handle for memcg, and it's not
like unified hierarchy is that difficult to transition to.

> Moreover, if somebody wants to transition from soft limit then it would
> be really hard because switching to unified hierarchy might be a no-go.

Why would that be a no-go? Its usage is mostly similar with
traditional hierarchies and can be used with other hierarchies, so
while it'd take some adaptation, in most cases gradual transition
shouldn't be a big problem.

> I think that it is clear that we should deprecate soft_limit ASAP. I
> also think it won't hurt to have min, low, high in both old and unified
> API and strongly warn if somebody tries to use soft_limit along with any
> of the new APIs in the first step. Later we can even forbid any
> combination by a hard failure.

I don't quite understand how you plan to deprecate it. Sure you can
fail with -EINVAL or whatnot when the wrong combination is used but I
don't think there's any chance of removing the knob. There's a reason
why we're introducing a new version of the whole cgroup interface
which can co-exist with the existing one after all. If you wanna
version memcg interface separately, maybe that'd work but it sounds
like a lot of extra hassle for not much gain.

Thanks.

--
tejun
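For readers following the "fail with -EINVAL when the wrong combination is used" idea, here is a minimal user-space sketch of what such a check could look like. It is not kernel code: `struct knobs`, `set_soft_limit()` and `set_low_limit()` are hypothetical names invented for illustration, and the rule it enforces (soft_limit and the new low limit are mutually exclusive per group) is only one possible reading of the proposal in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-group knob state; not the kernel's struct mem_cgroup. */
struct knobs {
	unsigned long soft_limit;
	unsigned long low_limit;
	bool soft_limit_set;	/* old-style soft limit has been configured */
	bool low_limit_set;	/* new-style low limit has been configured */
};

/* Reject the old knob once the new one is in use (assumed policy). */
static int set_soft_limit(struct knobs *k, unsigned long val)
{
	assert(k != NULL);
	if (k->low_limit_set)
		return -EINVAL;
	k->soft_limit = val;
	k->soft_limit_set = true;
	return 0;
}

/* Reject the new knob while the old one is in use (assumed policy). */
static int set_low_limit(struct knobs *k, unsigned long val)
{
	assert(k != NULL);
	if (k->soft_limit_set)
		return -EINVAL;
	k->low_limit = val;
	k->low_limit_set = true;
	return 0;
}
```

Note that the old knob stays present and usable on its own in this sketch, which matches the point made above: the combination can be refused, but the knob itself cannot simply be removed from the existing interface.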
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Thu 12-06-14 09:56:00, Johannes Weiner wrote:
> On Thu, Jun 12, 2014 at 03:22:07PM +0200, Michal Hocko wrote:
[...]
> > Anyway, the situation now is pretty chaotic. I plan to gather all the
> > patches posted so far and repost for the future discussion. I just need
> > to finish some internal tasks and will post it soon.
>
> That would be great, thanks, it's really hard to follow this stuff
> halfway in and halfway outside of -mm.
>
> Now that we roughly figured out what knobs and semantics we want, it
> would be great to figure out the merging logistics.
>
> I would prefer if we could introduce max, high, low, min in unified
> hierarchy, and *only* in there, so that we never have to worry about
> it coexisting and interacting with the existing hard and soft limit.

The primary question would be, whether this is the best transition
strategy. I do not know how many users apart from developers are really
using unified hierarchy. I would be worried that we merge a feature which
will not be used for a long time.

Moreover, if somebody wants to transition from soft limit then it would
be really hard because switching to unified hierarchy might be a no-go.

I think that it is clear that we should deprecate soft_limit ASAP. I
also think it won't hurt to have min, low, high in both old and unified
API and strongly warn if somebody tries to use soft_limit along with any
of the new APIs in the first step. Later we can even forbid any
combination by a hard failure.

> It would also be beneficial to introduce them all close to each other,
> develop them together, possibly submit them in the same patch series,
> so that we know the requirements and how the code should look like in
> the big picture and can offer a fully consistent and documented usage
> model in the unified hierarchy.

Min and Low should definitely go together. High sounds like an
orthogonal problem (pro-active reclaim vs reclaim protection) so I think
it can go its own way and pace. We still have to discuss its semantic
and I feel it would be a bit disturbing to have everything in one
bundle.

I do understand your point about the global picture, though. Do you
think that there is a risk that formulating semantic for High limit
might change the way how Min and Low would be defined?

> Does that make sense?

--
Michal Hocko
SUSE Labs
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Thu, Jun 12, 2014 at 03:22:07PM +0200, Michal Hocko wrote:
> On Wed 11-06-14 11:36:31, Johannes Weiner wrote:
> [...]
> > This code is truly dreadful.
> >
> > Don't call it guarantee when it doesn't guarantee anything. I thought
> > we agreed that min, low, high, max, is reasonable nomenclature, please
> > use it consistently.
>
> I can certainly change the internal naming. I will use your wmark naming
> suggestion.

Cool, thanks.

> > With my proposed cleanups and scalability fixes in the other mail, the
> > vmscan.c changes to support the min watermark would be something like
> > the following.
>
> The semantic is, however, much different as pointed out in the other email.
> The following on top of your cleanup will lead to the same deadlock
> described in 1st patch (mm, memcg: allow OOM if no memcg is eligible
> during direct reclaim).

I'm currently reworking shrink_zones() and getting rid of
all_unreclaimable() etc. to remove the code duplication.

> Anyway, the situation now is pretty chaotic. I plan to gather all the
> patches posted so far and repost for the future discussion. I just need
> to finish some internal tasks and will post it soon.

That would be great, thanks, it's really hard to follow this stuff
halfway in and halfway outside of -mm.

Now that we roughly figured out what knobs and semantics we want, it
would be great to figure out the merging logistics.

I would prefer if we could introduce max, high, low, min in unified
hierarchy, and *only* in there, so that we never have to worry about
it coexisting and interacting with the existing hard and soft limit.

It would also be beneficial to introduce them all close to each other,
develop them together, possibly submit them in the same patch series,
so that we know the requirements and how the code should look like in
the big picture and can offer a fully consistent and documented usage
model in the unified hierarchy.

Does that make sense?
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Wed 11-06-14 11:36:31, Johannes Weiner wrote:
[...]
> This code is truly dreadful.
>
> Don't call it guarantee when it doesn't guarantee anything. I thought
> we agreed that min, low, high, max, is reasonable nomenclature, please
> use it consistently.

I can certainly change the internal naming. I will use your wmark naming
suggestion.

> With my proposed cleanups and scalability fixes in the other mail, the
> vmscan.c changes to support the min watermark would be something like
> the following.

The semantic is, however, much different as pointed out in the other email.
The following on top of your cleanup will lead to the same deadlock
described in 1st patch (mm, memcg: allow OOM if no memcg is eligible
during direct reclaim).

Anyway, the situation now is pretty chaotic. I plan to gather all the
patches posted so far and repost for the future discussion. I just need
to finish some internal tasks and will post it soon.

> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 687076b7a1a6..cee19b6d04dc 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2259,7 +2259,7 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc)
>  				 */
>  				if (priority < DEF_PRIORITY - 2)
>  					break;
> -
> +			case MEMCG_WMARK_MIN:
>  				/* XXX: skip the whole subtree */
>  				memcg = mem_cgroup_iter(root, memcg, &reclaim);
>  				continue;
>

--
Michal Hocko
SUSE Labs
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Wed, Jun 11, 2014 at 10:00:24AM +0200, Michal Hocko wrote:
> Some users (e.g. Google) would like to have stronger semantic than low
> limit offers currently. The fallback mode is not desirable and they
> prefer hitting OOM killer rather than ignoring low limit for protected
> groups.
>
> There are other possible usecases which can benefit from hard
> guarantees. There are loads which will simply start thrashing if the
> memory working set drops under certain level and it is more appropriate
> to simply kill and restart such a load if the required memory cannot
> be provided. Another usecase would be a hard memory isolation for
> containers.
>
> The min_limit is initialized to 0 and it has precedence over low_limit.
> If the reclaim is not able to find any memcg in the reclaimed hierarchy
> above min_limit then OOM killer is triggered to resolve the situation.
>
> Signed-off-by: Michal Hocko
> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 99137aecd95f..8e844bd42c51 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2220,13 +2220,12 @@ static inline bool should_continue_reclaim(struct zone *zone,
>   *
>   * @zone: zone to shrink
>   * @sc: scan control with additional reclaim parameters
> - * @honor_memcg_guarantee: do not reclaim memcgs which are within their memory
> - * guarantee
> + * @soft_guarantee: Use soft guarantee reclaim target for memcg reclaim.
>   *
>   * Returns the number of reclaimed memcgs.
>   */
>  static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
> -		bool honor_memcg_guarantee)
> +		bool soft_guarantee)
>  {
>  	unsigned long nr_reclaimed, nr_scanned;
>  	unsigned nr_scanned_groups = 0;
> @@ -2245,11 +2244,10 @@ static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
>  		memcg = mem_cgroup_iter(root, NULL, &reclaim);
>  		do {
>  			struct lruvec *lruvec;
> -			bool within_guarantee;
>
>  			/* Memcg might be protected from the reclaim */
> -			within_guarantee = mem_cgroup_within_guarantee(memcg, root);
> -			if (honor_memcg_guarantee && within_guarantee) {
> +			if (mem_cgroup_within_guarantee(memcg, root,
> +						soft_guarantee)) {
>  				/*
>  				 * It would be more optimal to skip the memcg
>  				 * subtree now but we do not have a memcg iter
> @@ -2259,8 +2257,8 @@ static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
>  				continue;
>  			}
>
> -			if (within_guarantee)
> -				mem_cgroup_guarantee_breached(memcg);
> +			if (!soft_guarantee)
> +				mem_cgroup_soft_guarantee_breached(memcg);
>
>  			lruvec = mem_cgroup_zone_lruvec(zone, memcg);
>  			nr_scanned_groups++;
> @@ -2297,20 +2295,27 @@ static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
>
>  static void shrink_zone(struct zone *zone, struct scan_control *sc)
>  {
> -	bool honor_guarantee = true;
> +	bool soft_guarantee = true;
>
> -	while (!__shrink_zone(zone, sc, honor_guarantee)) {
> +	while (!__shrink_zone(zone, sc, soft_guarantee)) {
>  		/*
>  		 * The previous round of reclaim didn't find anything to scan
>  		 * because
> -		 * a) the whole reclaimed hierarchy is within guarantee so
> -		 *    we fallback to ignore the guarantee because other option
> -		 *    would be the OOM
> +		 * a) the whole reclaimed hierarchy is within soft guarantee so
> +		 *    we are switching to the hard guarantee reclaim target
>  		 * b) multiple reclaimers are racing and so the first round
>  		 *    should be retried
>  		 */
> -		if (mem_cgroup_all_within_guarantee(sc->target_mem_cgroup))
> -			honor_guarantee = false;
> +		if (mem_cgroup_all_within_guarantee(sc->target_mem_cgroup,
> +					soft_guarantee)) {
> +			/*
> +			 * Nothing to reclaim even with hard guarantees so
> +			 * we have to OOM
> +			 */
> +			if (!soft_guarantee)
> +				break;
> +			soft_guarantee = false;
> +		}
>  	}
>  }
>
> @@ -2574,7 +2579,8 @@ out:
>  	 * If the target memcg is not eligible for reclaim then we have no option
>  	 * but OOM
>  	 */
> -	if (!sc->nr_scanned && mem_cgroup_all_within_guarantee(sc->target_mem_cgroup))
> +	if (!sc->nr_scanned &&
> +
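The two-stage fallback in the quoted shrink_zone() hunk (honour the soft guarantee first, then only the hard one, then give up and let the OOM killer run) can be modelled in user space roughly as follows. This is a simplified sketch under stated assumptions, not the patch's implementation: `struct group`, `protection()`, `scan_pass()`, `shrink()` and `enum reclaim_result` are hypothetical names, a group is treated as protected while its usage does not exceed the applicable limit, the soft pass uses the larger of the two limits to reflect "min has precedence over low", and the retry for racing reclaimers is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a memcg; not the kernel's struct. */
struct group {
	unsigned long usage;		/* current charge, in pages */
	unsigned long low_limit;	/* soft guarantee, may be ignored */
	unsigned long min_limit;	/* hard guarantee, never ignored */
};

enum reclaim_result {
	RECLAIM_SOFT,	/* some group is above its soft guarantee */
	RECLAIM_HARD,	/* soft guarantee ignored, hard one still honoured */
	RECLAIM_OOM,	/* every group is within its hard guarantee */
};

/* Limit that protects a group in the given pass (assumed semantics). */
static unsigned long protection(const struct group *g, bool soft_guarantee)
{
	if (soft_guarantee && g->low_limit > g->min_limit)
		return g->low_limit;
	return g->min_limit;
}

/* Count the groups a single pass would be allowed to scan. */
static size_t scan_pass(const struct group *groups, size_t n,
			bool soft_guarantee)
{
	size_t scanned = 0;

	for (size_t i = 0; i < n; i++)
		if (groups[i].usage > protection(&groups[i], soft_guarantee))
			scanned++;
	return scanned;
}

/* Try the soft target first, then the hard target, else report OOM. */
static enum reclaim_result shrink(const struct group *groups, size_t n)
{
	assert(groups != NULL || n == 0);
	if (scan_pass(groups, n, true) > 0)
		return RECLAIM_SOFT;
	if (scan_pass(groups, n, false) > 0)
		return RECLAIM_HARD;
	return RECLAIM_OOM;
}
```

With every limit left at its default of 0, any group that has a non-zero charge is scanned in the first pass, so the sketch degenerates to ordinary reclaim.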
[PATCH 2/2] memcg: Allow guarantee reclaim
Some users (e.g. Google) would like to have stronger semantics than low
limit offers currently. The fallback mode is not desirable and they
prefer hitting OOM killer rather than ignoring low limit for protected
groups.

There are other possible usecases which can benefit from hard
guarantees. There are loads which will simply start thrashing if the
memory working set drops under certain level and it is more appropriate
to simply kill and restart such a load if the required memory cannot be
provided. Another usecase would be a hard memory isolation for
containers.

The min_limit is initialized to 0 and it has precedence over low_limit.
If the reclaim is not able to find any memcg in the reclaimed hierarchy
above min_limit then OOM killer is triggered to resolve the situation.

Signed-off-by: Michal Hocko <mho...@suse.cz>
---
 Documentation/cgroups/memory.txt | 26 ++
 include/linux/memcontrol.h       | 14 --
 include/linux/res_counter.h      | 32 ++--
 mm/memcontrol.c                  | 18 +++---
 mm/oom_kill.c                    |  6 --
 mm/vmscan.c                      | 38 ++
 6 files changed, 93 insertions(+), 41 deletions(-)

diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index bf895d7e1363..6929a06c9e5d 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -61,6 +61,7 @@ Brief summary of control files.
  memory.low_limit_breached	 # number of times low_limit has been
 				 # ignored and the cgroup reclaimed even
 				 # when it was above the limit
+ memory.min_limit_in_bytes	 # set/show min limit for memory reclaim
  memory.memsw.limit_in_bytes	 # set/show limit of memory+Swap usage
  memory.failcnt			 # show the number of memory usage hits limits
  memory.memsw.failcnt		 # show the number of memory+Swap hits limits
@@ -248,14 +249,23 @@ global VM. Cgroups can get reclaimed basically under two conditions
 to select and kill the bulkiest task in the hiearchy. (See 10. OOM Control
 below.)
 
-Groups might be also protected from both global and limit reclaim by
-low_limit_in_bytes knob. If the limit is non-zero the reclaim logic
-doesn't include groups (and their subgroups - see 6. Hierarchy support)
-which are below the low limit if there is other eligible cgroup in the
-reclaimed hierarchy. If all groups which participate reclaim are under
-their low limits then all of them are reclaimed and the low limit is
-ignored. low_limit_breached counter in memory.stat file can be checked
-to see how many times such an event occurred.
+Groups might be also protected from both global and limit reclaim
+by low_limit_in_bytes and min_limit_in_bytes knobs. The first one
+provides an optimistic reclaim protection while the later one provides
+hard memory reclaim protection guarantee. Both limits are 0 by default
+and min watermark has always precedence to low watermark.
+
+If the low limit is non-zero the reclaim logic doesn't include
+groups (and their subgroups - see 6. Hierarchy support) which are
+below low_limit if there is other eligible cgroup in the reclaimed
+hierarchy. If all groups which participate reclaim are under their low
+limits then all of them are reclaimed and the low limit is ignored.
+low_limit_breached counter in memory.stat file can be checked to see how
+many times such an event occurred.
+
+If, however, all the groups under reclaimed hierarchy are under their min
+limits then no reclaim is done and OOM killer is triggered to resolve the
+situation. In other words low_limit is never breached by the reclaim.
 
 Note2: When panic_on_oom is set to "2", the whole system will panic.
 
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 5e2ca2163b12..ddb96729a6b6 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -93,10 +93,11 @@ bool task_in_mem_cgroup(struct task_struct *task,
 				const struct mem_cgroup *memcg);
 
 extern bool mem_cgroup_within_guarantee(struct mem_cgroup *memcg,
-		struct mem_cgroup *root);
+		struct mem_cgroup *root, bool soft_guarantee);
 
-extern void mem_cgroup_guarantee_breached(struct mem_cgroup *memcg);
-extern bool mem_cgroup_all_within_guarantee(struct mem_cgroup *root);
+extern void mem_cgroup_soft_guarantee_breached(struct mem_cgroup *memcg);
+extern bool mem_cgroup_all_within_guarantee(struct mem_cgroup *root,
+		bool soft_guarantee);
 
 extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page);
 extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
@@ -295,14 +296,15 @@ static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
 }
 
 static inline bool mem_cgroup_within_guarantee(struct mem_cgroup *memcg,
-		struct mem_cgroup *root)
+
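
For readers following the thread, the two-stage behaviour the changelog describes (honour low_limit first, fall back to min_limit only, and OOM when even that leaves nothing to reclaim) can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the kernel implementation: struct group, within_guarantee(), shrink_pass() and reclaim() are hypothetical names, the "usage < limit" comparison is an assumption, hierarchy walking is ignored, and the retry for racing reclaimers is left out.

```c
/* Userspace model only; every name here is hypothetical, not kernel API. */
#include <stdbool.h>
#include <stddef.h>

struct group {
	unsigned long usage;     /* current charge, arbitrary units */
	unsigned long low_limit; /* best-effort (soft) protection, 0 = none */
	unsigned long min_limit; /* hard protection, 0 = none */
};

/*
 * A group below min_limit is always protected. A group below low_limit
 * is protected only while the soft guarantee is still being honoured.
 * Assumption: "below" means usage < limit, so a limit of 0 never protects.
 */
static bool within_guarantee(const struct group *g, bool soft)
{
	if (g->usage < g->min_limit)
		return true;
	return soft && g->usage < g->low_limit;
}

/* One reclaim pass; returns how many groups were eligible for scanning. */
static unsigned shrink_pass(const struct group *groups, size_t n, bool soft)
{
	unsigned scanned = 0;

	for (size_t i = 0; i < n; i++) {
		if (within_guarantee(&groups[i], soft))
			continue; /* protected, skip it */
		scanned++;
	}
	return scanned;
}

enum outcome { RECLAIMED_SOFT, RECLAIMED_HARD, NEED_OOM };

/* First honour low_limit, then only min_limit, otherwise give up. */
static enum outcome reclaim(const struct group *groups, size_t n)
{
	if (shrink_pass(groups, n, true))
		return RECLAIMED_SOFT;
	if (shrink_pass(groups, n, false))
		return RECLAIMED_HARD; /* low limits had to be ignored */
	return NEED_OOM;
}
```

In this model, a hierarchy where every group sits below its low_limit but at least one is above its min_limit ends in RECLAIMED_HARD, which corresponds to the case counted by low_limit_breached; only when every group is below its min_limit does reclaim() report NEED_OOM.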
Re: [PATCH 2/2] memcg: Allow guarantee reclaim
On Wed, Jun 11, 2014 at 10:00:24AM +0200, Michal Hocko wrote:
> Some users (e.g. Google) would like to have stronger semantics than low
> limit offers currently. The fallback mode is not desirable and they
> prefer hitting OOM killer rather than ignoring low limit for protected
> groups.
>
> There are other possible usecases which can benefit from hard
> guarantees. There are loads which will simply start thrashing if the
> memory working set drops under certain level and it is more appropriate
> to simply kill and restart such a load if the required memory cannot be
> provided. Another usecase would be a hard memory isolation for
> containers.
>
> The min_limit is initialized to 0 and it has precedence over low_limit.
> If the reclaim is not able to find any memcg in the reclaimed hierarchy
> above min_limit then OOM killer is triggered to resolve the situation.
>
> Signed-off-by: Michal Hocko <mho...@suse.cz>
> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 99137aecd95f..8e844bd42c51 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2220,13 +2220,12 @@ static inline bool should_continue_reclaim(struct zone *zone,
>   *
>   * @zone: zone to shrink
>   * @sc: scan control with additional reclaim parameters
> - * @honor_memcg_guarantee: do not reclaim memcgs which are within their memory
> - * guarantee
> + * @soft_guarantee: Use soft guarantee reclaim target for memcg reclaim.
>   *
>   * Returns the number of reclaimed memcgs.
>   */
>  static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
> -		bool honor_memcg_guarantee)
> +		bool soft_guarantee)
>  {
>  	unsigned long nr_reclaimed, nr_scanned;
>  	unsigned nr_scanned_groups = 0;
> @@ -2245,11 +2244,10 @@ static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
>  		memcg = mem_cgroup_iter(root, NULL, &reclaim);
>  		do {
>  			struct lruvec *lruvec;
> -			bool within_guarantee;
>  
>  			/* Memcg might be protected from the reclaim */
> -			within_guarantee = mem_cgroup_within_guarantee(memcg, root);
> -			if (honor_memcg_guarantee && within_guarantee) {
> +			if (mem_cgroup_within_guarantee(memcg, root,
> +						soft_guarantee)) {
>  				/*
>  				 * It would be more optimal to skip the memcg
>  				 * subtree now but we do not have a memcg iter
> @@ -2259,8 +2257,8 @@ static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
>  				continue;
>  			}
>  
> -			if (within_guarantee)
> -				mem_cgroup_guarantee_breached(memcg);
> +			if (!soft_guarantee)
> +				mem_cgroup_soft_guarantee_breached(memcg);
>  
>  			lruvec = mem_cgroup_zone_lruvec(zone, memcg);
>  			nr_scanned_groups++;
> @@ -2297,20 +2295,27 @@ static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
>  
>  static void shrink_zone(struct zone *zone, struct scan_control *sc)
>  {
> -	bool honor_guarantee = true;
> +	bool soft_guarantee = true;
>  
> -	while (!__shrink_zone(zone, sc, honor_guarantee)) {
> +	while (!__shrink_zone(zone, sc, soft_guarantee)) {
>  		/*
>  		 * The previous round of reclaim didn't find anything to scan
>  		 * because
> -		 * a) the whole reclaimed hierarchy is within guarantee so
> -		 *    we fallback to ignore the guarantee because other option
> -		 *    would be the OOM
> +		 * a) the whole reclaimed hierarchy is within soft guarantee so
> +		 *    we are switching to the hard guarantee reclaim target
>  		 * b) multiple reclaimers are racing and so the first round
>  		 *    should be retried
>  		 */
> -		if (mem_cgroup_all_within_guarantee(sc->target_mem_cgroup))
> -			honor_guarantee = false;
> +		if (mem_cgroup_all_within_guarantee(sc->target_mem_cgroup,
> +					soft_guarantee)) {
> +			/*
> +			 * Nothing to reclaim even with hard guarantees so
> +			 * we have to OOM
> +			 */
> +			if (!soft_guarantee)
> +				break;
> +			soft_guarantee = false;
> +		}
>  	}
>  }
>  
> @@ -2574,7 +2579,8 @@ out:
>  	 * If the target memcg is not eligible for reclaim then we have no option
>  	 * but OOM
>  	 */
> -	if (!sc->nr_scanned && mem_cgroup_all_within_guarantee(sc->target_mem_cgroup))
> +	if (!sc->nr_scanned &&
> +			mem_cgroup_all_within_guarantee(sc->target_mem_cgroup, false))
>  		return 0;

This code is truly dreadful. Don't call it guarantee