Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Mon 22-05-17 10:52:44, Mikulas Patocka wrote: > > > On Mon, 22 May 2017, Michal Hocko wrote: [...] > > I am not sure I understand. OOM killer is invoked for _all_ allocations > > <= PAGE_ALLOC_COSTLY_ORDER that do not have __GFP_NORETRY as long as the > > OOM killer is not disabled (oom_killer_disable) and that only happens > > from the PM suspend path which makes sure that no userspace is active at > > the time. AFAIU this is a userspace triggered path and so the later > > shouldn't apply to it and GFP_KERNEL should be therefore sufficient. > > Relying to a portion of memory reserves to prevent from deadlock seems > > fundamentaly broken to me. > > > > The lvm2 was designed this way - it is broken, but there is not much that > can be done about it - fixing this would mean major rewrite. The only > thing we can do about it is to lower the deadlock probability with > __GFP_HIGH (or PF_MEMALLOC that was used some times ago). But let me repeat. GFP_KERNEL allocation for order-0 page will not fail. If you need non-failing semantic then just make it clear by adding __GFP_NOFAIL rather than __GFP_HIGH. Memory reserves are a scarce resource and there are users which might really need it from atomic contexts. Anyway, this is not the code I am maintaining so I will not argue more and won't nack the patch. But is smells like a pure cargo cult, to be honest. If you really insist, though, I would just ask to have a more detailed explanation why it is _believed_ the flag is needed because the vague "Use __GFP_HIGH to avoid low memory issues when a device is suspended and the ioctl is needed to resume it." doesn't really clarify much to be honest. Thanks! -- Michal Hocko SUSE Labs
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Mon 22-05-17 10:52:44, Mikulas Patocka wrote: > > > On Mon, 22 May 2017, Michal Hocko wrote: [...] > > I am not sure I understand. OOM killer is invoked for _all_ allocations > > <= PAGE_ALLOC_COSTLY_ORDER that do not have __GFP_NORETRY as long as the > > OOM killer is not disabled (oom_killer_disable) and that only happens > > from the PM suspend path which makes sure that no userspace is active at > > the time. AFAIU this is a userspace triggered path and so the later > > shouldn't apply to it and GFP_KERNEL should be therefore sufficient. > > Relying to a portion of memory reserves to prevent from deadlock seems > > fundamentaly broken to me. > > > > The lvm2 was designed this way - it is broken, but there is not much that > can be done about it - fixing this would mean major rewrite. The only > thing we can do about it is to lower the deadlock probability with > __GFP_HIGH (or PF_MEMALLOC that was used some times ago). But let me repeat. GFP_KERNEL allocation for order-0 page will not fail. If you need non-failing semantic then just make it clear by adding __GFP_NOFAIL rather than __GFP_HIGH. Memory reserves are a scarce resource and there are users which might really need it from atomic contexts. Anyway, this is not the code I am maintaining so I will not argue more and won't nack the patch. But is smells like a pure cargo cult, to be honest. If you really insist, though, I would just ask to have a more detailed explanation why it is _believed_ the flag is needed because the vague "Use __GFP_HIGH to avoid low memory issues when a device is suspended and the ioctl is needed to resume it." doesn't really clarify much to be honest. Thanks! -- Michal Hocko SUSE Labs
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Mon, 22 May 2017, Michal Hocko wrote: > On Mon 22-05-17 08:00:11, Mikulas Patocka wrote: > > > > On Mon, 22 May 2017, Michal Hocko wrote: > > > > > > Sometimes, I/O to a device mapper device is blocked until the userspace > > > > daemon dmeventd does some action (for example, when dm-mirror leg > > > > fails, > > > > dmeventd needs to mark the leg as failed in the lvm metadata and then > > > > reload the device). > > > > > > > > The dmeventd daemon mlocks itself in memory so that it doesn't generate > > > > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so > > > > that > > > > the ioctls issued by dmeventd have higher chance of succeeding if some > > > > I/O > > > > is blocked, waiting for dmeventd action. It reduces the possibility of > > > > low-memory-deadlock, though it doesn't eliminate it entirely. > > > > > > So what happens if the memory reserves are depleted. Do we deadlock? > > > > Yes, it will deadlock. > > That would be more than unfortunate and begs for a different solution. > The thing is that __GFP_HIGH is not propagated to all allocations in the > vmalloc proper. E.g. page table allocations are hardcoded GFP_KERNEL. For a typical device mapper use, the ioctl area is smaller than 4k, so the vmalloc won't happen. > > > Why is OOM killer insufficient to allow the further progress? > > > > I don't know if the OOM killer will or won't be triggered in this > > situation, it depends on the people who wrote the OOM killer. > > I am not sure I understand. OOM killer is invoked for _all_ allocations > <= PAGE_ALLOC_COSTLY_ORDER that do not have __GFP_NORETRY as long as the > OOM killer is not disabled (oom_killer_disable) and that only happens > from the PM suspend path which makes sure that no userspace is active at > the time. AFAIU this is a userspace triggered path and so the later > shouldn't apply to it and GFP_KERNEL should be therefore sufficient. > Relying to a portion of memory reserves to prevent from deadlock seems > fundamentaly broken to me. > > -- > Michal Hocko > SUSE Labs The lvm2 was designed this way - it is broken, but there is not much that can be done about it - fixing this would mean major rewrite. The only thing we can do about it is to lower the deadlock probability with __GFP_HIGH (or PF_MEMALLOC that was used some times ago). Mikulas
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Mon, 22 May 2017, Michal Hocko wrote: > On Mon 22-05-17 08:00:11, Mikulas Patocka wrote: > > > > On Mon, 22 May 2017, Michal Hocko wrote: > > > > > > Sometimes, I/O to a device mapper device is blocked until the userspace > > > > daemon dmeventd does some action (for example, when dm-mirror leg > > > > fails, > > > > dmeventd needs to mark the leg as failed in the lvm metadata and then > > > > reload the device). > > > > > > > > The dmeventd daemon mlocks itself in memory so that it doesn't generate > > > > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so > > > > that > > > > the ioctls issued by dmeventd have higher chance of succeeding if some > > > > I/O > > > > is blocked, waiting for dmeventd action. It reduces the possibility of > > > > low-memory-deadlock, though it doesn't eliminate it entirely. > > > > > > So what happens if the memory reserves are depleted. Do we deadlock? > > > > Yes, it will deadlock. > > That would be more than unfortunate and begs for a different solution. > The thing is that __GFP_HIGH is not propagated to all allocations in the > vmalloc proper. E.g. page table allocations are hardcoded GFP_KERNEL. For a typical device mapper use, the ioctl area is smaller than 4k, so the vmalloc won't happen. > > > Why is OOM killer insufficient to allow the further progress? > > > > I don't know if the OOM killer will or won't be triggered in this > > situation, it depends on the people who wrote the OOM killer. > > I am not sure I understand. OOM killer is invoked for _all_ allocations > <= PAGE_ALLOC_COSTLY_ORDER that do not have __GFP_NORETRY as long as the > OOM killer is not disabled (oom_killer_disable) and that only happens > from the PM suspend path which makes sure that no userspace is active at > the time. AFAIU this is a userspace triggered path and so the later > shouldn't apply to it and GFP_KERNEL should be therefore sufficient. > Relying to a portion of memory reserves to prevent from deadlock seems > fundamentaly broken to me. > > -- > Michal Hocko > SUSE Labs The lvm2 was designed this way - it is broken, but there is not much that can be done about it - fixing this would mean major rewrite. The only thing we can do about it is to lower the deadlock probability with __GFP_HIGH (or PF_MEMALLOC that was used some times ago). Mikulas
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Mon 22-05-17 08:00:11, Mikulas Patocka wrote: > > > On Mon, 22 May 2017, Michal Hocko wrote: > > > On Fri 19-05-17 19:43:23, Mikulas Patocka wrote: > > > > > > > > > On Fri, 19 May 2017, Michal Hocko wrote: > > > > > > > On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > > > > > (Adding back the correct linux-mm email address and also adding > > > > > linux-kernel.) > > > > > > > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > > > > [...] > > > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > > > > > assuming there was a reason to do it in the first place in two > > > > > > different > > > > > > ways. > > > > > > > > Hmm, the old PF_MEMALLOC used to have the following comment > > > > /* > > > > * Trying to avoid low memory issues when a device is > > > > * suspended. > > > > */ > > > > > > > > I am not really sure what that means but __GFP_HIGH certainly have a > > > > different semantic than PF_MEMALLOC. The later grants the full access to > > > > the memory reserves while the prior on partial access. If this is > > > > _really_ > > > > needed then it deserves a comment explaining why. > > > > -- > > > > Michal Hocko > > > > SUSE Labs > > > > > > Sometimes, I/O to a device mapper device is blocked until the userspace > > > daemon dmeventd does some action (for example, when dm-mirror leg fails, > > > dmeventd needs to mark the leg as failed in the lvm metadata and then > > > reload the device). > > > > > > The dmeventd daemon mlocks itself in memory so that it doesn't generate > > > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that > > > the ioctls issued by dmeventd have higher chance of succeeding if some > > > I/O > > > is blocked, waiting for dmeventd action. It reduces the possibility of > > > low-memory-deadlock, though it doesn't eliminate it entirely. > > > > So what happens if the memory reserves are depleted. Do we deadlock? > > Yes, it will deadlock. That would be more than unfortunate and begs for a different solution. The thing is that __GFP_HIGH is not propagated to all allocations in the vmalloc proper. E.g. page table allocations are hardcoded GFP_KERNEL. > > Why is OOM killer insufficient to allow the further progress? > > I don't know if the OOM killer will or won't be triggered in this > situation, it depends on the people who wrote the OOM killer. I am not sure I understand. OOM killer is invoked for _all_ allocations <= PAGE_ALLOC_COSTLY_ORDER that do not have __GFP_NORETRY as long as the OOM killer is not disabled (oom_killer_disable) and that only happens from the PM suspend path which makes sure that no userspace is active at the time. AFAIU this is a userspace triggered path and so the later shouldn't apply to it and GFP_KERNEL should be therefore sufficient. Relying to a portion of memory reserves to prevent from deadlock seems fundamentaly broken to me. -- Michal Hocko SUSE Labs
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Mon 22-05-17 08:00:11, Mikulas Patocka wrote: > > > On Mon, 22 May 2017, Michal Hocko wrote: > > > On Fri 19-05-17 19:43:23, Mikulas Patocka wrote: > > > > > > > > > On Fri, 19 May 2017, Michal Hocko wrote: > > > > > > > On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > > > > > (Adding back the correct linux-mm email address and also adding > > > > > linux-kernel.) > > > > > > > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > > > > [...] > > > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > > > > > assuming there was a reason to do it in the first place in two > > > > > > different > > > > > > ways. > > > > > > > > Hmm, the old PF_MEMALLOC used to have the following comment > > > > /* > > > > * Trying to avoid low memory issues when a device is > > > > * suspended. > > > > */ > > > > > > > > I am not really sure what that means but __GFP_HIGH certainly have a > > > > different semantic than PF_MEMALLOC. The later grants the full access to > > > > the memory reserves while the prior on partial access. If this is > > > > _really_ > > > > needed then it deserves a comment explaining why. > > > > -- > > > > Michal Hocko > > > > SUSE Labs > > > > > > Sometimes, I/O to a device mapper device is blocked until the userspace > > > daemon dmeventd does some action (for example, when dm-mirror leg fails, > > > dmeventd needs to mark the leg as failed in the lvm metadata and then > > > reload the device). > > > > > > The dmeventd daemon mlocks itself in memory so that it doesn't generate > > > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that > > > the ioctls issued by dmeventd have higher chance of succeeding if some > > > I/O > > > is blocked, waiting for dmeventd action. It reduces the possibility of > > > low-memory-deadlock, though it doesn't eliminate it entirely. > > > > So what happens if the memory reserves are depleted. Do we deadlock? > > Yes, it will deadlock. That would be more than unfortunate and begs for a different solution. The thing is that __GFP_HIGH is not propagated to all allocations in the vmalloc proper. E.g. page table allocations are hardcoded GFP_KERNEL. > > Why is OOM killer insufficient to allow the further progress? > > I don't know if the OOM killer will or won't be triggered in this > situation, it depends on the people who wrote the OOM killer. I am not sure I understand. OOM killer is invoked for _all_ allocations <= PAGE_ALLOC_COSTLY_ORDER that do not have __GFP_NORETRY as long as the OOM killer is not disabled (oom_killer_disable) and that only happens from the PM suspend path which makes sure that no userspace is active at the time. AFAIU this is a userspace triggered path and so the later shouldn't apply to it and GFP_KERNEL should be therefore sufficient. Relying to a portion of memory reserves to prevent from deadlock seems fundamentaly broken to me. -- Michal Hocko SUSE Labs
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Mon, 22 May 2017, Michal Hocko wrote: > On Fri 19-05-17 19:43:23, Mikulas Patocka wrote: > > > > > > On Fri, 19 May 2017, Michal Hocko wrote: > > > > > On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > > > > (Adding back the correct linux-mm email address and also adding > > > > linux-kernel.) > > > > > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > > > [...] > > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > > > > assuming there was a reason to do it in the first place in two > > > > > different > > > > > ways. > > > > > > Hmm, the old PF_MEMALLOC used to have the following comment > > > /* > > > * Trying to avoid low memory issues when a device is > > > * suspended. > > > */ > > > > > > I am not really sure what that means but __GFP_HIGH certainly have a > > > different semantic than PF_MEMALLOC. The later grants the full access to > > > the memory reserves while the prior on partial access. If this is _really_ > > > needed then it deserves a comment explaining why. > > > -- > > > Michal Hocko > > > SUSE Labs > > > > Sometimes, I/O to a device mapper device is blocked until the userspace > > daemon dmeventd does some action (for example, when dm-mirror leg fails, > > dmeventd needs to mark the leg as failed in the lvm metadata and then > > reload the device). > > > > The dmeventd daemon mlocks itself in memory so that it doesn't generate > > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that > > the ioctls issued by dmeventd have higher chance of succeeding if some I/O > > is blocked, waiting for dmeventd action. It reduces the possibility of > > low-memory-deadlock, though it doesn't eliminate it entirely. > > So what happens if the memory reserves are depleted. Do we deadlock? Yes, it will deadlock. > Why is OOM killer insufficient to allow the further progress? I don't know if the OOM killer will or won't be triggered in this situation, it depends on the people who wrote the OOM killer. > -- > Michal Hocko > SUSE Labs Mikulas
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Mon, 22 May 2017, Michal Hocko wrote: > On Fri 19-05-17 19:43:23, Mikulas Patocka wrote: > > > > > > On Fri, 19 May 2017, Michal Hocko wrote: > > > > > On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > > > > (Adding back the correct linux-mm email address and also adding > > > > linux-kernel.) > > > > > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > > > [...] > > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > > > > assuming there was a reason to do it in the first place in two > > > > > different > > > > > ways. > > > > > > Hmm, the old PF_MEMALLOC used to have the following comment > > > /* > > > * Trying to avoid low memory issues when a device is > > > * suspended. > > > */ > > > > > > I am not really sure what that means but __GFP_HIGH certainly have a > > > different semantic than PF_MEMALLOC. The later grants the full access to > > > the memory reserves while the prior on partial access. If this is _really_ > > > needed then it deserves a comment explaining why. > > > -- > > > Michal Hocko > > > SUSE Labs > > > > Sometimes, I/O to a device mapper device is blocked until the userspace > > daemon dmeventd does some action (for example, when dm-mirror leg fails, > > dmeventd needs to mark the leg as failed in the lvm metadata and then > > reload the device). > > > > The dmeventd daemon mlocks itself in memory so that it doesn't generate > > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that > > the ioctls issued by dmeventd have higher chance of succeeding if some I/O > > is blocked, waiting for dmeventd action. It reduces the possibility of > > low-memory-deadlock, though it doesn't eliminate it entirely. > > So what happens if the memory reserves are depleted. Do we deadlock? Yes, it will deadlock. > Why is OOM killer insufficient to allow the further progress? I don't know if the OOM killer will or won't be triggered in this situation, it depends on the people who wrote the OOM killer. > -- > Michal Hocko > SUSE Labs Mikulas
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Fri 19-05-17 19:43:23, Mikulas Patocka wrote: > > > On Fri, 19 May 2017, Michal Hocko wrote: > > > On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > > > (Adding back the correct linux-mm email address and also adding > > > linux-kernel.) > > > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > > [...] > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > > > assuming there was a reason to do it in the first place in two > > > > different > > > > ways. > > > > Hmm, the old PF_MEMALLOC used to have the following comment > > /* > > * Trying to avoid low memory issues when a device is > > * suspended. > > */ > > > > I am not really sure what that means but __GFP_HIGH certainly have a > > different semantic than PF_MEMALLOC. The later grants the full access to > > the memory reserves while the prior on partial access. If this is _really_ > > needed then it deserves a comment explaining why. > > -- > > Michal Hocko > > SUSE Labs > > Sometimes, I/O to a device mapper device is blocked until the userspace > daemon dmeventd does some action (for example, when dm-mirror leg fails, > dmeventd needs to mark the leg as failed in the lvm metadata and then > reload the device). > > The dmeventd daemon mlocks itself in memory so that it doesn't generate > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that > the ioctls issued by dmeventd have higher chance of succeeding if some I/O > is blocked, waiting for dmeventd action. It reduces the possibility of > low-memory-deadlock, though it doesn't eliminate it entirely. So what happens if the memory reserves are depleted. Do we deadlock? Why is OOM killer insufficient to allow the further progress? -- Michal Hocko SUSE Labs
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Fri 19-05-17 19:43:23, Mikulas Patocka wrote: > > > On Fri, 19 May 2017, Michal Hocko wrote: > > > On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > > > (Adding back the correct linux-mm email address and also adding > > > linux-kernel.) > > > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > > [...] > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > > > assuming there was a reason to do it in the first place in two > > > > different > > > > ways. > > > > Hmm, the old PF_MEMALLOC used to have the following comment > > /* > > * Trying to avoid low memory issues when a device is > > * suspended. > > */ > > > > I am not really sure what that means but __GFP_HIGH certainly have a > > different semantic than PF_MEMALLOC. The later grants the full access to > > the memory reserves while the prior on partial access. If this is _really_ > > needed then it deserves a comment explaining why. > > -- > > Michal Hocko > > SUSE Labs > > Sometimes, I/O to a device mapper device is blocked until the userspace > daemon dmeventd does some action (for example, when dm-mirror leg fails, > dmeventd needs to mark the leg as failed in the lvm metadata and then > reload the device). > > The dmeventd daemon mlocks itself in memory so that it doesn't generate > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that > the ioctls issued by dmeventd have higher chance of succeeding if some I/O > is blocked, waiting for dmeventd action. It reduces the possibility of > low-memory-deadlock, though it doesn't eliminate it entirely. So what happens if the memory reserves are depleted. Do we deadlock? Why is OOM killer insufficient to allow the further progress? -- Michal Hocko SUSE Labs
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Sat, 20 May 2017, Michal Hocko wrote: > On Fri 19-05-17 19:50:24, Mikulas Patocka wrote: > > > > > > On Thu, 18 May 2017, Junaid Shahid wrote: > > > > > d224e9381897 (drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded > > > variant) left out the __GFP_HIGH flag when converting from __vmalloc to > > > kvmalloc. This can cause the IOCTL to fail in some low memory situations > > > where it wouldn't have failed earlier. This patch adds it back to avoid > > > any potential regression. > > > > > > Signed-off-by: Junaid Shahid> > > > Acked-by: Mikulas Patocka > > Can we have a comment explaning why we need memory reserves in this case > please? Here I'm sending the patch with the comment: d224e9381897 (drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded variant) left out the __GFP_HIGH flag when converting from __vmalloc to kvmalloc. This can cause the IOCTL to fail in some low memory situations where it wouldn't have failed earlier. This patch adds it back to avoid any potential regression. Signed-off-by: Junaid Shahid Signed-off-by: Mikulas Patocka --- drivers/md/dm-ioctl.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) Index: linux-2.6/drivers/md/dm-ioctl.c === --- linux-2.6.orig/drivers/md/dm-ioctl.c +++ linux-2.6/drivers/md/dm-ioctl.c @@ -1710,12 +1710,13 @@ static int copy_params(struct dm_ioctl _ } /* -* Try to avoid low memory issues when a device is suspended. +* Use __GFP_HIGH to avoid low memory issues when a device is suspended +* and the ioctl is needed to resume it. * Use kmalloc() rather than vmalloc() when we can. */ dmi = NULL; noio_flag = memalloc_noio_save(); - dmi = kvmalloc(param_kernel->data_size, GFP_KERNEL); + dmi = kvmalloc(param_kernel->data_size, GFP_KERNEL | __GFP_HIGH); memalloc_noio_restore(noio_flag); if (!dmi) {
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Sat, 20 May 2017, Michal Hocko wrote: > On Fri 19-05-17 19:50:24, Mikulas Patocka wrote: > > > > > > On Thu, 18 May 2017, Junaid Shahid wrote: > > > > > d224e9381897 (drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded > > > variant) left out the __GFP_HIGH flag when converting from __vmalloc to > > > kvmalloc. This can cause the IOCTL to fail in some low memory situations > > > where it wouldn't have failed earlier. This patch adds it back to avoid > > > any potential regression. > > > > > > Signed-off-by: Junaid Shahid > > > > Acked-by: Mikulas Patocka > > Can we have a comment explaning why we need memory reserves in this case > please? Here I'm sending the patch with the comment: d224e9381897 (drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded variant) left out the __GFP_HIGH flag when converting from __vmalloc to kvmalloc. This can cause the IOCTL to fail in some low memory situations where it wouldn't have failed earlier. This patch adds it back to avoid any potential regression. Signed-off-by: Junaid Shahid Signed-off-by: Mikulas Patocka --- drivers/md/dm-ioctl.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) Index: linux-2.6/drivers/md/dm-ioctl.c === --- linux-2.6.orig/drivers/md/dm-ioctl.c +++ linux-2.6/drivers/md/dm-ioctl.c @@ -1710,12 +1710,13 @@ static int copy_params(struct dm_ioctl _ } /* -* Try to avoid low memory issues when a device is suspended. +* Use __GFP_HIGH to avoid low memory issues when a device is suspended +* and the ioctl is needed to resume it. * Use kmalloc() rather than vmalloc() when we can. */ dmi = NULL; noio_flag = memalloc_noio_save(); - dmi = kvmalloc(param_kernel->data_size, GFP_KERNEL); + dmi = kvmalloc(param_kernel->data_size, GFP_KERNEL | __GFP_HIGH); memalloc_noio_restore(noio_flag); if (!dmi) {
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Fri, 19 May 2017, Michal Hocko wrote: > On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > > (Adding back the correct linux-mm email address and also adding > > linux-kernel.) > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > [...] > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > > assuming there was a reason to do it in the first place in two different > > > ways. > > Hmm, the old PF_MEMALLOC used to have the following comment > /* > * Trying to avoid low memory issues when a device is > * suspended. > */ > > I am not really sure what that means but __GFP_HIGH certainly have a > different semantic than PF_MEMALLOC. The later grants the full access to > the memory reserves while the prior on partial access. If this is _really_ > needed then it deserves a comment explaining why. > -- > Michal Hocko > SUSE Labs Sometimes, I/O to a device mapper device is blocked until the userspace daemon dmeventd does some action (for example, when dm-mirror leg fails, dmeventd needs to mark the leg as failed in the lvm metadata and then reload the device). The dmeventd daemon mlocks itself in memory so that it doesn't generate any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that the ioctls issued by dmeventd have higher chance of succeeding if some I/O is blocked, waiting for dmeventd action. It reduces the possibility of low-memory-deadlock, though it doesn't eliminate it entirely. Mikulas
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Fri, 19 May 2017, Michal Hocko wrote: > On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > > (Adding back the correct linux-mm email address and also adding > > linux-kernel.) > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > [...] > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > > assuming there was a reason to do it in the first place in two different > > > ways. > > Hmm, the old PF_MEMALLOC used to have the following comment > /* > * Trying to avoid low memory issues when a device is > * suspended. > */ > > I am not really sure what that means but __GFP_HIGH certainly have a > different semantic than PF_MEMALLOC. The later grants the full access to > the memory reserves while the prior on partial access. If this is _really_ > needed then it deserves a comment explaining why. > -- > Michal Hocko > SUSE Labs Sometimes, I/O to a device mapper device is blocked until the userspace daemon dmeventd does some action (for example, when dm-mirror leg fails, dmeventd needs to mark the leg as failed in the lvm metadata and then reload the device). The dmeventd daemon mlocks itself in memory so that it doesn't generate any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that the ioctls issued by dmeventd have higher chance of succeeding if some I/O is blocked, waiting for dmeventd action. It reduces the possibility of low-memory-deadlock, though it doesn't eliminate it entirely. Mikulas
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > (Adding back the correct linux-mm email address and also adding linux-kernel.) > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: [...] > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > assuming there was a reason to do it in the first place in two different > > ways. Hmm, the old PF_MEMALLOC used to have the following comment /* * Trying to avoid low memory issues when a device is * suspended. */ I am not really sure what that means but __GFP_HIGH certainly have a different semantic than PF_MEMALLOC. The later grants the full access to the memory reserves while the prior on partial access. If this is _really_ needed then it deserves a comment explaining why. -- Michal Hocko SUSE Labs
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
On Thu 18-05-17 19:50:46, Junaid Shahid wrote: > (Adding back the correct linux-mm email address and also adding linux-kernel.) > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: [...] > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > > assuming there was a reason to do it in the first place in two different > > ways. Hmm, the old PF_MEMALLOC used to have the following comment /* * Trying to avoid low memory issues when a device is * suspended. */ I am not really sure what that means but __GFP_HIGH certainly have a different semantic than PF_MEMALLOC. The later grants the full access to the memory reserves while the prior on partial access. If this is _really_ needed then it deserves a comment explaining why. -- Michal Hocko SUSE Labs
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
(Adding back the correct linux-mm email address and also adding linux-kernel.) On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > On Thu, 18 May 2017, Michal Hocko wrote: > > > On Thu 18-05-17 11:50:40, Junaid Shahid wrote: > > > d224e9381897 (drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded > > > variant) left out the __GFP_HIGH flag when converting from __vmalloc to > > > kvmalloc. This can cause the IOCTL to fail in some low memory situations > > > where it wouldn't have failed earlier. This patch adds it back to avoid > > > any potential regression. > > > > The code previously used __GFP_HIGH only for the vmalloc fallback and > > that doesn't make that much sense with the current implementation > > because vmalloc does order-0 pages and those do not really fail and the > > oom killer is invoked to free memory. > > > > Order-0 pages certainly do fail, there is not an infinite amount of memory > nor is there a specific exemption to allow order-0 memory to be alloctable > below watermarks without this gfp flag. OOM kill is the last thing we > want for these allocations since they are very temporary. > > > There is no reason to access memory reserves from this context. > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > assuming there was a reason to do it in the first place in two different > ways. > > This decision is up to the device mapper maintainers. > > > > Signed-off-by: Junaid Shahid> > > --- > > > drivers/md/dm-ioctl.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c > > > index 0555b4410e05..bacad7637a56 100644 > > > --- a/drivers/md/dm-ioctl.c > > > +++ b/drivers/md/dm-ioctl.c > > > @@ -1715,7 +1715,7 @@ static int copy_params(struct dm_ioctl __user > > > *user, struct dm_ioctl *param_kern > > >*/ > > > dmi = NULL; > > > noio_flag = memalloc_noio_save(); > > > - dmi = kvmalloc(param_kernel->data_size, GFP_KERNEL); > > > + dmi = kvmalloc(param_kernel->data_size, GFP_KERNEL | __GFP_HIGH); > > > memalloc_noio_restore(noio_flag); > > > > > > if (!dmi) {
Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()
(Adding back the correct linux-mm email address and also adding linux-kernel.) On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote: > On Thu, 18 May 2017, Michal Hocko wrote: > > > On Thu 18-05-17 11:50:40, Junaid Shahid wrote: > > > d224e9381897 (drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded > > > variant) left out the __GFP_HIGH flag when converting from __vmalloc to > > > kvmalloc. This can cause the IOCTL to fail in some low memory situations > > > where it wouldn't have failed earlier. This patch adds it back to avoid > > > any potential regression. > > > > The code previously used __GFP_HIGH only for the vmalloc fallback and > > that doesn't make that much sense with the current implementation > > because vmalloc does order-0 pages and those do not really fail and the > > oom killer is invoked to free memory. > > > > Order-0 pages certainly do fail, there is not an infinite amount of memory > nor is there a specific exemption to allow order-0 memory to be alloctable > below watermarks without this gfp flag. OOM kill is the last thing we > want for these allocations since they are very temporary. > > > There is no reason to access memory reserves from this context. > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH, > assuming there was a reason to do it in the first place in two different > ways. > > This decision is up to the device mapper maintainers. > > > > Signed-off-by: Junaid Shahid > > > --- > > > drivers/md/dm-ioctl.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c > > > index 0555b4410e05..bacad7637a56 100644 > > > --- a/drivers/md/dm-ioctl.c > > > +++ b/drivers/md/dm-ioctl.c > > > @@ -1715,7 +1715,7 @@ static int copy_params(struct dm_ioctl __user > > > *user, struct dm_ioctl *param_kern > > >*/ > > > dmi = NULL; > > > noio_flag = memalloc_noio_save(); > > > - dmi = kvmalloc(param_kernel->data_size, GFP_KERNEL); > > > + dmi = kvmalloc(param_kernel->data_size, GFP_KERNEL | __GFP_HIGH); > > > memalloc_noio_restore(noio_flag); > > > > > > if (!dmi) {