Re: memcg/cgroup: do not fail fail on pre_destroy callbacks
On Mon 29-10-12 16:26:02, Tejun Heo wrote:
> Hello, Michal.
>
> > Tejun is planning to build on top of that and make some more cleanups
> > in the cgroup core (namely get rid of the whole retry code in
> > cgroup_rmdir).
>
> I applied 1-3 to the following branch which is based on top of v3.6.
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git cgroup-destroy-updates

OK, Andrew dropped all the patches from his tree and I set up this
branch for automerging into the -mm git tree.

> I'll follow up with updates to the destroy path which will replace #4.
> #5 and #6 should be stackable on top.

Could you take care of them and apply those two on top of the first one
which guarantees that css_tryget fails and no new task can appear in the
group (aka #4 without the follow-up cleanups)? That way Andrew doesn't
have to care about them later.

Thanks!
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: memcg/cgroup: do not fail fail on pre_destroy callbacks
Hello, Michal.

> Tejun is planning to build on top of that and make some more cleanups
> in the cgroup core (namely get rid of the whole retry code in
> cgroup_rmdir).

I applied 1-3 to the following branch which is based on top of v3.6.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git cgroup-destroy-updates

I'll follow up with updates to the destroy path which will replace #4.
#5 and #6 should be stackable on top. So, Andrew, there's likely to be
a conflict in the near future. Just dropping #4-#6 till Michal and I
sort it out should be enough.

Thanks.

--
tejun
memcg/cgroup: do not fail fail on pre_destroy callbacks
Hi,
memcg is the only controller which might fail in its pre_destroy
callback, which makes the cgroup core more complicated for no good
reason. This is an attempt to change this unfortunate state.

I have previously posted this as an RFC
https://lkml.org/lkml/2012/10/17/246 and the feedback was mostly
positive. Nobody seems to see any issues with the approach, so let's
move on from the RFC. The patchset still needs a good portion of testing
and I am working on it. I would also like to see some Acks ;)

The patchset is posted as v3 because some of the patches went through
2 revisions during the RFC.

The first two patches are just cleanups. They could be merged even
without the rest.

The real change, although the code is not changed that much, is the 3rd
patch. It changes the way we handle mem_cgroup_move_parent failures.
We have to realize that all those failures are *temporary*, because we
are either racing with the page removal or the page is temporarily off
the LRU because of migration or global reclaim. As a result we do not
fail mem_cgroup_force_empty_list if the page cannot be moved to the
parent and rather retry until the LRU is empty.

The 4th patch is for cgroup core. I have moved cgroup_call_pre_destroy
after the css are frozen and the group is marked as removed, which means
that all css_tryget calls will fail, no new task can attach to the
group and no new child group can be added.

Tejun is planning to build on top of that and make some more cleanups
in the cgroup core (namely get rid of the whole retry code in
cgroup_rmdir). This creates an unfortunate inter-tree dependency between
Andrew's and Tejun's trees, so I have based all the work on the 3.6
kernel so that it can be merged into Tejun's cgroup tree as well as into
the -mm git tree (Andrew will see all the changes from linux-next).
I do not like to push memcg changes through trees other than Andrew's,
but this seems to be easier as other cgroup changes will probably depend
on Tejun's cleanups.

Is everybody OK with this?

The last two patches are trivial follow-ups for the cgroup core change:
now we know that nobody will interfere with us, so we can drop the
"empty && no child" condition.

See the specific patches for the changelogs.

Michal Hocko (6):
  memcg: split mem_cgroup_force_empty into reclaiming and reparenting parts
  memcg: root_cgroup cannot reach mem_cgroup_move_parent
  memcg: Simplify mem_cgroup_force_empty_list error handling
  cgroups: forbid pre_destroy callback to fail
  memcg: make mem_cgroup_reparent_charges non failing
  hugetlb: do not fail in hugetlb_cgroup_pre_destroy

Cumulative diffstat:
 kernel/cgroup.c     |  30 ---
 mm/hugetlb_cgroup.c |  11 ++--
 mm/memcontrol.c     | 148 ++-
 3 files changed, 99 insertions(+), 90 deletions(-)
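To make the retry behaviour described for the 3rd patch concrete, here
is a minimal user-space sketch in C. It is not the memcg code: struct
toy_page, toy_move_parent and toy_force_empty_list are hypothetical
names, and the busy_rounds counter merely stands in for a page that is
temporarily off the LRU or racing with removal. The only point it shows
is that a transient failure leads to another pass over the list rather
than to an error return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page sitting on a per-group LRU list. */
struct toy_page {
    int busy_rounds;  /* number of attempts that will still fail transiently */
    bool moved;       /* set once the page has been moved to the parent */
};

/* Toy analogue of a move-to-parent attempt: fails while the page is busy. */
static bool toy_move_parent(struct toy_page *p)
{
    assert(!p->moved);          /* callers skip pages that are already moved */
    if (p->busy_rounds > 0) {
        p->busy_rounds--;       /* the transient condition eventually clears */
        return false;
    }
    p->moved = true;
    return true;
}

/*
 * Toy analogue of emptying one list: keep making passes until every page
 * has been moved. There is no error return; the result is only the number
 * of passes that were needed.
 */
static unsigned toy_force_empty_list(struct toy_page *pages, size_t n)
{
    unsigned passes = 0;
    size_t remaining = n;

    while (remaining > 0) {
        passes++;
        for (size_t i = 0; i < n; i++) {
            if (pages[i].moved)
                continue;
            if (toy_move_parent(&pages[i]))
                remaining--;
            /* otherwise: transient failure, revisit on the next pass */
        }
    }
    return passes;
}
```

The loop terminates only because every failure in the model is
temporary, which is exactly the assumption the cover letter states for
mem_cgroup_move_parent.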
Re: [RFC] memcg/cgroup: do not fail fail on pre_destroy callbacks
(2012/10/17 22:30), Michal Hocko wrote:
> Hi,
> memcg is the only controller which might fail in its pre_destroy
> callback, which makes the cgroup core more complicated for no good
> reason. This is an attempt to change this unfortunate state.
>
> I am sending this as an RFC because I would like to hear back whether
> the approach is correct. I thought that the changes would be more
> invasive but it seems that the current code was mostly prepared for this
> and it needs just some small tweaks (so I might be missing something
> important here).
>
> The first two patches are just cleanups. They could be merged even
> without the rest.
>
> The real change, although the code is not changed that much, is the 3rd
> patch. It changes the way we handle mem_cgroup_move_parent failures.
> We have to realize that all those failures are *temporary*, because we
> are either racing with the page removal or the page is temporarily off
> the LRU because of migration or global reclaim. As a result we do not
> fail mem_cgroup_force_empty_list if the page cannot be moved to the
> parent and rather retry until the LRU is empty.
>
> The 4th patch is for cgroup core. I have moved cgroup_call_pre_destroy
> inside the cgroup_lock, which is not very nice because the callbacks
> can take some time. Maybe we can move this call to the very end of the
> function?
> All I need for memcg is that cgroup_call_pre_destroy has been called and
> that no new cgroups can be attached to the group. The cgroup_lock is
> necessary for the latter condition, but if we move the call after the
> CGRP_REMOVED flag is set then we are safe as well.
>
> The last two patches are trivial follow-ups for the cgroup core change:
> now we know that nobody will interfere with us, so we can drop the
> "empty && no child" condition.
>
> Comments, thoughts?
>
> Michal Hocko (6):
>   memcg: split mem_cgroup_force_empty into reclaiming and reparenting parts
>   memcg: root_cgroup cannot reach mem_cgroup_move_parent
>   memcg: Simplify mem_cgroup_force_empty_list error handling
>   cgroups: forbid pre_destroy callback to fail
>   memcg: make mem_cgroup_reparent_charges non failing
>   hugetlb: do not fail in hugetlb_cgroup_pre_destroy
>
> Cumulative diffstat:
>  kernel/cgroup.c     |  30 -
>  mm/hugetlb_cgroup.c |  11 ++---
>  mm/memcontrol.c     | 124 +++
>  3 files changed, 78 insertions(+), 87 deletions(-)

Thank you very much!
The whole patch seems good to me and I like this approach.

Thanks,
-Kame
Re: [RFC] memcg/cgroup: do not fail fail on pre_destroy callbacks
On 10/17/2012 05:30 PM, Michal Hocko wrote:
> Hi,
> memcg is the only controller which might fail in its pre_destroy
> callback, which makes the cgroup core more complicated for no good
> reason. This is an attempt to change this unfortunate state.
>
> I am sending this as an RFC because I would like to hear back whether
> the approach is correct. I thought that the changes would be more
> invasive but it seems that the current code was mostly prepared for this
> and it needs just some small tweaks (so I might be missing something
> important here).
>
> The first two patches are just cleanups. They could be merged even
> without the rest.
>
> The real change, although the code is not changed that much, is the 3rd
> patch. It changes the way we handle mem_cgroup_move_parent failures.
> We have to realize that all those failures are *temporary*, because we
> are either racing with the page removal or the page is temporarily off
> the LRU because of migration or global reclaim. As a result we do not
> fail mem_cgroup_force_empty_list if the page cannot be moved to the
> parent and rather retry until the LRU is empty.
>
> The 4th patch is for cgroup core. I have moved cgroup_call_pre_destroy
> inside the cgroup_lock, which is not very nice because the callbacks
> can take some time. Maybe we can move this call to the very end of the
> function?
> All I need for memcg is that cgroup_call_pre_destroy has been called and
> that no new cgroups can be attached to the group. The cgroup_lock is
> necessary for the latter condition, but if we move the call after the
> CGRP_REMOVED flag is set then we are safe as well.
>
> The last two patches are trivial follow-ups for the cgroup core change:
> now we know that nobody will interfere with us, so we can drop the
> "empty && no child" condition.
>
> Comments, thoughts?

I personally don't see anything fundamentally wrong with this.
[RFC] memcg/cgroup: do not fail fail on pre_destroy callbacks
Hi,
memcg is the only controller which might fail in its pre_destroy
callback, which makes the cgroup core more complicated for no good
reason. This is an attempt to change this unfortunate state.

I am sending this as an RFC because I would like to hear back whether
the approach is correct. I thought that the changes would be more
invasive but it seems that the current code was mostly prepared for this
and it needs just some small tweaks (so I might be missing something
important here).

The first two patches are just cleanups. They could be merged even
without the rest.

The real change, although the code is not changed that much, is the 3rd
patch. It changes the way we handle mem_cgroup_move_parent failures.
We have to realize that all those failures are *temporary*, because we
are either racing with the page removal or the page is temporarily off
the LRU because of migration or global reclaim. As a result we do not
fail mem_cgroup_force_empty_list if the page cannot be moved to the
parent and rather retry until the LRU is empty.

The 4th patch is for cgroup core. I have moved cgroup_call_pre_destroy
inside the cgroup_lock, which is not very nice because the callbacks
can take some time. Maybe we can move this call to the very end of the
function?
All I need for memcg is that cgroup_call_pre_destroy has been called and
that no new cgroups can be attached to the group. The cgroup_lock is
necessary for the latter condition, but if we move the call after the
CGRP_REMOVED flag is set then we are safe as well.

The last two patches are trivial follow-ups for the cgroup core change:
now we know that nobody will interfere with us, so we can drop the
"empty && no child" condition.

Comments, thoughts?

Michal Hocko (6):
  memcg: split mem_cgroup_force_empty into reclaiming and reparenting parts
  memcg: root_cgroup cannot reach mem_cgroup_move_parent
  memcg: Simplify mem_cgroup_force_empty_list error handling
  cgroups: forbid pre_destroy callback to fail
  memcg: make mem_cgroup_reparent_charges non failing
  hugetlb: do not fail in hugetlb_cgroup_pre_destroy

Cumulative diffstat:
 kernel/cgroup.c     |  30 -
 mm/hugetlb_cgroup.c |  11 ++---
 mm/memcontrol.c     | 124 +++
 3 files changed, 78 insertions(+), 87 deletions(-)
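The ordering question raised for the 4th patch (run the callback only
once the group can no longer gain users) can be illustrated with a
small single-threaded C model. Everything here is an assumption made
for illustration: struct toy_group and the toy_* functions are
hypothetical, the removed flag only plays the role of CGRP_REMOVED,
and no locking or waiting for outstanding references is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cgroup and its css; not the kernel structs. */
struct toy_group {
    int nr_tasks;           /* tasks currently attached */
    int refs;               /* references handed out by toy_tryget() */
    bool removed;           /* plays the role of the CGRP_REMOVED flag */
    bool pre_destroy_done;  /* set by the callback */
};

/* Toy analogue of a tryget: refused once the group is marked removed. */
static bool toy_tryget(struct toy_group *g)
{
    if (g->removed)
        return false;
    g->refs++;
    return true;
}

/* Attaching a task is refused for a removed group as well. */
static bool toy_attach_task(struct toy_group *g)
{
    if (g->removed)
        return false;
    g->nr_tasks++;
    return true;
}

/* The callback relies on the group being closed to newcomers. */
static void toy_pre_destroy(struct toy_group *g)
{
    assert(g->removed);
    g->pre_destroy_done = true;
}

/* Returns false while the group is busy, true once it has been torn down. */
static bool toy_rmdir(struct toy_group *g)
{
    if (g->nr_tasks > 0)
        return false;       /* still has tasks: nothing is changed */
    g->removed = true;      /* first close the door to new users ... */
    toy_pre_destroy(g);     /* ... then run the callback, which cannot fail */
    return true;
}
```

Because the flag is set before the callback runs, the callback never
has to re-check for new tasks or children, which is the reason given
above for dropping the "empty && no child" condition in the last two
patches.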