Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
On Tue, Feb 04, 2020 at 09:45:24AM +0100, David Hildenbrand wrote: > I really hope we'll find more reviewers in general - I'm also not happy > if my patches go upstream with little/no review. However, patches > shouldn't be stuck for multiple merge windows in linux-next IMHO > (excluding exceptions of course) - then they should either be sent > upstream (and eventually fixed later) or dropped. First of all sorry for my lack of review, as lately I have been a bit disconnected of the list because lack of time. Lucky my I managed to find some time, so I went through the patches that did lack review (#6-#10). I hope this helps in moving forward the series, although Michal's review would be great as well. -- Oscar Salvador SUSE L3
Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
>> I can understand this is desirable (yet, I am >> not sure if this makes sense with the current take-and-not-give-back >> review mentality on this list). >> >> Although it will make upstreaming stuff *even harder* and *even slower*, >> maybe we should start to only queue patches that have an ACK/RB, so they >> won't get blocked by this later on? At least that makes your life easier >> and people won't have to eventually follow up on patches that have been >> in linux-next for months. > > The merge rate would still be the review rate, but the resulting merges > would be of less tested code. That's a valid point. > >> Note: the result will be that many of my patches will still not get >> reviewed, won't get queued/upstreamed, I will continuously ping and >> resend, I will lose interest because I have better things to do, I will >> lose interest in our code quality, I will lose interest to review. >> >> (side note: some people might actually enjoy me sending less cleanup >> patches, so this approach might be desirable for some ;) ) >> >> One alternative is to send patches upstream once they have been lying >> around in linux-next for $RANDOM number of months, because they >> obviously saw some testing and nobody started to yell at them once >> stumbling over them on linux-mm. > > Yes, I think that's the case with these patches and I've sent them to > Linus. Hopefully Michel will be able to find time to look them over in > the next month or so. I really hope we'll find more reviewers in general - I'm also not happy if my patches go upstream with little/no review. However, patches shouldn't be stuck for multiple merge windows in linux-next IMHO (excluding exceptions of course) - then they should either be sent upstream (and eventually fixed later) or dropped. Thanks Andrew! -- Thanks, David / dhildenb
Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
On Fri, 31 Jan 2020 10:18:34 +0100 David Hildenbrand wrote: > On 31.01.20 05:40, Andrew Morton wrote: > > On Tue, 3 Dec 2019 14:36:38 +0100 Oscar Salvador wrote: > > > >> On Mon, Dec 02, 2019 at 10:09:51AM +0100, David Hildenbrand wrote: > >>> @Michal, @Oscar, can some of you at least have a patch #5 now so we can > >>> proceed with that? (the other patches can stay in -next some time longer) > >> > >> Hi, > >> > >> I will be having a look at patch#5 shortly. > >> > >> Thanks for the reminder > > > > Things haven't improved a lot :( > > > > mm-memmap_init-update-variable-name-in-memmap_init_zone.patch > > mm-memory_hotplug-poison-memmap-in-remove_pfn_range_from_zone.patch > > mm-memory_hotplug-we-always-have-a-zone-in-find_smallestbiggest_section_pfn.patch > > mm-memory_hotplug-dont-check-for-all-holes-in-shrink_zone_span.patch > > mm-memory_hotplug-drop-local-variables-in-shrink_zone_span.patch > > mm-memory_hotplug-cleanup-__remove_pages.patch > > > > The first patch has reviews, the remainder are unloved. > > Trying hard not to rant about the review mentality on this list, but I'm > afraid I can't totally bite my tongue ... :) > > Now, this is an uncomfortable situation for you and me. You have to ping > people about review and patches are stuck in your tree. I have a growing > list of patches that are somewhat considered "done", but well, > not-upstream-at-all. I have patches that are long in RHEL and were > properly tested, but could get dropped any time because -ENOREVIEW. > > Our process nowadays seems to be, to only upstream what has an ACK/RB > (fixes/features/cleanups). Yes, we've been doing this for a couple of years now. I make an exception for Vitaly's zswap patches because he appears to be the only person who knows the code (since Harry's internship ended). I think this is the first time we've hit a significant logjam. Presumably the holiday season contributed to this. It isn't clear to me that we've gained much from this policy. But until this cycle I've seen little harm. > I can understand this is desirable (yet, I am > not sure if this makes sense with the current take-and-not-give-back > review mentality on this list). > > Although it will make upstreaming stuff *even harder* and *even slower*, > maybe we should start to only queue patches that have an ACK/RB, so they > won't get blocked by this later on? At least that makes your life easier > and people won't have to eventually follow up on patches that have been > in linux-next for months. The merge rate would still be the review rate, but the resulting merges would be of less tested code. > Note: the result will be that many of my patches will still not get > reviewed, won't get queued/upstreamed, I will continuously ping and > resend, I will lose interest because I have better things to do, I will > lose interest in our code quality, I will lose interest to review. > > (side note: some people might actually enjoy me sending less cleanup > patches, so this approach might be desirable for some ;) ) > > One alternative is to send patches upstream once they have been lying > around in linux-next for $RANDOM number of months, because they > obviously saw some testing and nobody started to yell at them once > stumbling over them on linux-mm. Yes, I think that's the case with these patches and I've sent them to Linus. Hopefully Michel will be able to find time to look them over in the next month or so.
Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
>>> The first patch has reviews, the remainder are unloved. >> >> Trying hard not to rant about the review mentality on this list, but I'm >> afraid I can't totally bite my tongue ... :) > > I am afraid this is less about mentality than the lack of man power. > This is not a new problem. We have much more code producers than > reviewers. It's part of the "take-and-not-give-back review mentality" and I hope you "smelled" that the comment was not targeted at you. [...] >> Although it will make upstreaming stuff *even harder* and *even slower*, >> maybe we should start to only queue patches that have an ACK/RB, so they >> won't get blocked by this later on? At least that makes your life easier >> and people won't have to eventually follow up on patches that have been >> in linux-next for months. > > I wouldn't mind if patched got merged to mmotm less pro-actively at all. > People tend to care less to follow up on patches that are in the queue > already from my past experience. And also it encourages to generate more > code than review. The current process will at least encourage me in the long term to generate less code (changes) as well. I consider patches that have not been merged upstream as on my TODO list - and it seems to keep on growing. (yes, there are people that fire-and-forget) Then, I much rather prefer to just get no reply on my patches, ping two times, and eventually dump them into the trash. As explained, will WHP result in me generating less code changes overall. Problem partially solved (at least one producer less) :) > > This is certainly not a black or white of course. Some areas have barely > anybody for a review except for the person actively writing code in that > area so this really needs the case by case approach. Yes. > > Anyway this is not a new discussion or a new problem we are facing. I > believe that part of the problem is that the MM subsystem doesn't really > have official maintainers so there is nobody really responsible for > particular parts of the subsystem. Sure Andrew is merging patches based > on the review feedback or his gut feeling but I am afraid this is not > enough. If we would have "official maintainers" would it really help (besides Andrew having less stuff to do of course)? E.g., if you would pick up the memory hotplug patches, you would still have to have a look at them - it's still the producer-consumer imbalance (and you would have an even higher workload). But yeah, we should most probably finally have official maintainers. For people sending patches, it's often not obvious whom to cc (and whom to ping). Smells like a "LSF/MM/BPF TOPIC", but as you said, it's not a new problem/discussion. (I won't be around, so I can't bring that topic up) Anyhow, my two cents. -- Thanks, David / dhildenb
Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
On Fri 31-01-20 10:18:34, David Hildenbrand wrote: > On 31.01.20 05:40, Andrew Morton wrote: > > On Tue, 3 Dec 2019 14:36:38 +0100 Oscar Salvador wrote: > > > >> On Mon, Dec 02, 2019 at 10:09:51AM +0100, David Hildenbrand wrote: > >>> @Michal, @Oscar, can some of you at least have a patch #5 now so we can > >>> proceed with that? (the other patches can stay in -next some time longer) > >> > >> Hi, > >> > >> I will be having a look at patch#5 shortly. > >> > >> Thanks for the reminder > > > > Things haven't improved a lot :( > > > > mm-memmap_init-update-variable-name-in-memmap_init_zone.patch > > mm-memory_hotplug-poison-memmap-in-remove_pfn_range_from_zone.patch > > mm-memory_hotplug-we-always-have-a-zone-in-find_smallestbiggest_section_pfn.patch > > mm-memory_hotplug-dont-check-for-all-holes-in-shrink_zone_span.patch > > mm-memory_hotplug-drop-local-variables-in-shrink_zone_span.patch > > mm-memory_hotplug-cleanup-__remove_pages.patch > > > > The first patch has reviews, the remainder are unloved. > > Trying hard not to rant about the review mentality on this list, but I'm > afraid I can't totally bite my tongue ... :) I am afraid this is less about mentality than the lack of man power. This is not a new problem. We have much more code producers than reviewers. In this particular case the review is expected from me and I am sorry that my bandwith doesn't scale with the email traffic in my inbox. I do very much appreciate the amount of work you are doing in the hotplug area but we need more reviewers here. > Now, this is an uncomfortable situation for you and me. You have to ping > people about review and patches are stuck in your tree. I have a growing > list of patches that are somewhat considered "done", but well, > not-upstream-at-all. I have patches that are long in RHEL and were > properly tested, but could get dropped any time because -ENOREVIEW. > > Our process nowadays seems to be, to only upstream what has an ACK/RB > (fixes/features/cleanups). I can understand this is desirable (yet, I am > not sure if this makes sense with the current take-and-not-give-back > review mentality on this list). > > Although it will make upstreaming stuff *even harder* and *even slower*, > maybe we should start to only queue patches that have an ACK/RB, so they > won't get blocked by this later on? At least that makes your life easier > and people won't have to eventually follow up on patches that have been > in linux-next for months. I wouldn't mind if patched got merged to mmotm less pro-actively at all. People tend to care less to follow up on patches that are in the queue already from my past experience. And also it encourages to generate more code than review. This is certainly not a black or white of course. Some areas have barely anybody for a review except for the person actively writing code in that area so this really needs the case by case approach. Anyway this is not a new discussion or a new problem we are facing. I believe that part of the problem is that the MM subsystem doesn't really have official maintainers so there is nobody really responsible for particular parts of the subsystem. Sure Andrew is merging patches based on the review feedback or his gut feeling but I am afraid this is not enough. -- Michal Hocko SUSE Labs
Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
On 31.01.20 05:40, Andrew Morton wrote: > On Tue, 3 Dec 2019 14:36:38 +0100 Oscar Salvador wrote: > >> On Mon, Dec 02, 2019 at 10:09:51AM +0100, David Hildenbrand wrote: >>> @Michal, @Oscar, can some of you at least have a patch #5 now so we can >>> proceed with that? (the other patches can stay in -next some time longer) >> >> Hi, >> >> I will be having a look at patch#5 shortly. >> >> Thanks for the reminder > > Things haven't improved a lot :( > > mm-memmap_init-update-variable-name-in-memmap_init_zone.patch > mm-memory_hotplug-poison-memmap-in-remove_pfn_range_from_zone.patch > mm-memory_hotplug-we-always-have-a-zone-in-find_smallestbiggest_section_pfn.patch > mm-memory_hotplug-dont-check-for-all-holes-in-shrink_zone_span.patch > mm-memory_hotplug-drop-local-variables-in-shrink_zone_span.patch > mm-memory_hotplug-cleanup-__remove_pages.patch > > The first patch has reviews, the remainder are unloved. Trying hard not to rant about the review mentality on this list, but I'm afraid I can't totally bite my tongue ... :) Now, this is an uncomfortable situation for you and me. You have to ping people about review and patches are stuck in your tree. I have a growing list of patches that are somewhat considered "done", but well, not-upstream-at-all. I have patches that are long in RHEL and were properly tested, but could get dropped any time because -ENOREVIEW. Our process nowadays seems to be, to only upstream what has an ACK/RB (fixes/features/cleanups). I can understand this is desirable (yet, I am not sure if this makes sense with the current take-and-not-give-back review mentality on this list). Although it will make upstreaming stuff *even harder* and *even slower*, maybe we should start to only queue patches that have an ACK/RB, so they won't get blocked by this later on? At least that makes your life easier and people won't have to eventually follow up on patches that have been in linux-next for months. Note: the result will be that many of my patches will still not get reviewed, won't get queued/upstreamed, I will continuously ping and resend, I will lose interest because I have better things to do, I will lose interest in our code quality, I will lose interest to review. (side note: some people might actually enjoy me sending less cleanup patches, so this approach might be desirable for some ;) ) One alternative is to send patches upstream once they have been lying around in linux-next for $RANDOM number of months, because they obviously saw some testing and nobody started to yell at them once stumbling over them on linux-mm. -- Thanks, David / dhildenb
Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
On Tue, 3 Dec 2019 14:36:38 +0100 Oscar Salvador wrote: > On Mon, Dec 02, 2019 at 10:09:51AM +0100, David Hildenbrand wrote: > > @Michal, @Oscar, can some of you at least have a patch #5 now so we can > > proceed with that? (the other patches can stay in -next some time longer) > > Hi, > > I will be having a look at patch#5 shortly. > > Thanks for the reminder Things haven't improved a lot :( mm-memmap_init-update-variable-name-in-memmap_init_zone.patch mm-memory_hotplug-poison-memmap-in-remove_pfn_range_from_zone.patch mm-memory_hotplug-we-always-have-a-zone-in-find_smallestbiggest_section_pfn.patch mm-memory_hotplug-dont-check-for-all-holes-in-shrink_zone_span.patch mm-memory_hotplug-drop-local-variables-in-shrink_zone_span.patch mm-memory_hotplug-cleanup-__remove_pages.patch The first patch has reviews, the remainder are unloved.
Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
On Mon, Dec 02, 2019 at 10:09:51AM +0100, David Hildenbrand wrote: > @Michal, @Oscar, can some of you at least have a patch #5 now so we can > proceed with that? (the other patches can stay in -next some time longer) Hi, I will be having a look at patch#5 shortly. Thanks for the reminder -- Oscar Salvador SUSE L3
Re: [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
On 06.10.19 10:56, David Hildenbrand wrote: > This series fixes the access of uninitialized memmaps when shrinking > zones/nodes and when removing memory. Also, it contains all fixes for > crashes that can be triggered when removing certain namespace using > memunmap_pages() - ZONE_DEVICE, reported by Aneesh. > > We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be > more involved (we don't have SECTION_IS_ONLINE as an indicator), and > shrinking is only of limited use (set_zone_contiguous() cannot detect > the ZONE_DEVICE as contiguous). > > We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount of > code to a minimum. Shrinking is especially necessary to keep > zone->contiguous set where possible, especially, on memory unplug of > DIMMs at zone boundaries. > > -- > > Zones are now properly shrunk when offlining memory blocks or when > onlining failed. This allows to properly shrink zones on memory unplug > even if the separate memory blocks of a DIMM were onlined to different > zones or re-onlined to a different zone after offlining. > > Example: > > :/# cat /proc/zoneinfo > Node 1, zone Movable > spanned 0 > present 0 > managed 0 > :/# echo "online_movable" > /sys/devices/system/memory/memory41/state > :/# echo "online_movable" > /sys/devices/system/memory/memory43/state > :/# cat /proc/zoneinfo > Node 1, zone Movable > spanned 98304 > present 65536 > managed 65536 > :/# echo 0 > /sys/devices/system/memory/memory43/online > :/# cat /proc/zoneinfo > Node 1, zone Movable > spanned 32768 > present 32768 > managed 32768 > :/# echo 0 > /sys/devices/system/memory/memory41/online > :/# cat /proc/zoneinfo > Node 1, zone Movable > spanned 0 > present 0 > managed 0 > > -- > > I tested this with DIMMs on x86. I didn't test the ZONE_DEVICE part. > > v4 -> v6: > - "mm/memunmap: Don't access uninitialized memmap in memunmap_pages()" > -- Minimize code changes, rephrase subject and description > - "mm/memory_hotplug: Don't access uninitialized memmaps in > shrink_zone_span()" > -- Add ifdef to make it compile without ZONE_DEVICE > > v4 -> v5: > - "mm/memory_hotplug: Don't access uninitialized memmaps in > shrink_zone_span()" > -- Add more details why ZONE_DEVICE is special > - Include two patches from Aneesh > -- "mm/memunmap: Use the correct start and end pfn when removing pages > from zone" > -- "mm/memmap_init: Update variable name in memmap_init_zone" > > v3 -> v4: > - Drop "mm/memremap: Get rid of memmap_init_zone_device()" > -- As Alexander noticed, it was messy either way > - Drop "mm/memory_hotplug: Exit early in __remove_pages() on BUGs" > - Drop "mm: Exit early in set_zone_contiguous() if already contiguous" > - Drop "mm/memory_hotplug: Optimize zone shrinking code when checking for > holes" > - Merged "mm/memory_hotplug: Remove pages from a zone before removing > memory" and "mm/memory_hotplug: Remove zone parameter from > __remove_pages()" into "mm/memory_hotplug: Shrink zones when offlining > memory" > - Added "mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone()" > - Stop shrinking ZONE_DEVICE > - Reshuffle patches, moving all fixes to the front. Add Fixes: tags. > - Change subject/description of various patches > - Minor changes (too many to mention) > > Cc: Aneesh Kumar K.V > Cc: Andrew Morton > Cc: Dan Williams > Cc: Michal Hocko > So, patch #1-#4 are already upstream. The other patches have been in -next for quite a long time, and I (+other RH people) ran excessive tests on them. Especially patch #5 is a BUG fix I want to see upstream rather sooner than later (last know "uninitialized memmap" access). Andrew decided not to send these for 5.5 due to lack of ack/review - which is unfortunate, but the right thing to do. @Michal, @Oscar, can some of you at least have a patch #5 now so we can proceed with that? (the other patches can stay in -next some time longer) -- Thanks, David / dhildenb
[PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory
This series fixes the access of uninitialized memmaps when shrinking zones/nodes and when removing memory. Also, it contains all fixes for crashes that can be triggered when removing certain namespace using memunmap_pages() - ZONE_DEVICE, reported by Aneesh. We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be more involved (we don't have SECTION_IS_ONLINE as an indicator), and shrinking is only of limited use (set_zone_contiguous() cannot detect the ZONE_DEVICE as contiguous). We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount of code to a minimum. Shrinking is especially necessary to keep zone->contiguous set where possible, especially, on memory unplug of DIMMs at zone boundaries. -- Zones are now properly shrunk when offlining memory blocks or when onlining failed. This allows to properly shrink zones on memory unplug even if the separate memory blocks of a DIMM were onlined to different zones or re-onlined to a different zone after offlining. Example: :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 :/# echo "online_movable" > /sys/devices/system/memory/memory41/state :/# echo "online_movable" > /sys/devices/system/memory/memory43/state :/# cat /proc/zoneinfo Node 1, zone Movable spanned 98304 present 65536 managed 65536 :/# echo 0 > /sys/devices/system/memory/memory43/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 32768 present 32768 managed 32768 :/# echo 0 > /sys/devices/system/memory/memory41/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 -- I tested this with DIMMs on x86. I didn't test the ZONE_DEVICE part. v4 -> v6: - "mm/memunmap: Don't access uninitialized memmap in memunmap_pages()" -- Minimize code changes, rephrase subject and description - "mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span()" -- Add ifdef to make it compile without ZONE_DEVICE v4 -> v5: - "mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span()" -- Add more details why ZONE_DEVICE is special - Include two patches from Aneesh -- "mm/memunmap: Use the correct start and end pfn when removing pages from zone" -- "mm/memmap_init: Update variable name in memmap_init_zone" v3 -> v4: - Drop "mm/memremap: Get rid of memmap_init_zone_device()" -- As Alexander noticed, it was messy either way - Drop "mm/memory_hotplug: Exit early in __remove_pages() on BUGs" - Drop "mm: Exit early in set_zone_contiguous() if already contiguous" - Drop "mm/memory_hotplug: Optimize zone shrinking code when checking for holes" - Merged "mm/memory_hotplug: Remove pages from a zone before removing memory" and "mm/memory_hotplug: Remove zone parameter from __remove_pages()" into "mm/memory_hotplug: Shrink zones when offlining memory" - Added "mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone()" - Stop shrinking ZONE_DEVICE - Reshuffle patches, moving all fixes to the front. Add Fixes: tags. - Change subject/description of various patches - Minor changes (too many to mention) Cc: Aneesh Kumar K.V Cc: Andrew Morton Cc: Dan Williams Cc: Michal Hocko Aneesh Kumar K.V (2): mm/memunmap: Don't access uninitialized memmap in memunmap_pages() mm/memmap_init: Update variable name in memmap_init_zone David Hildenbrand (8): mm/memory_hotplug: Don't access uninitialized memmaps in shrink_pgdat_span() mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span() mm/memory_hotplug: Shrink zones when offlining memory mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone() mm/memory_hotplug: We always have a zone in find_(smallest|biggest)_section_pfn mm/memory_hotplug: Don't check for "all holes" in shrink_zone_span() mm/memory_hotplug: Drop local variables in shrink_zone_span() mm/memory_hotplug: Cleanup __remove_pages() arch/arm64/mm/mmu.c| 4 +- arch/ia64/mm/init.c| 4 +- arch/powerpc/mm/mem.c | 3 +- arch/s390/mm/init.c| 4 +- arch/sh/mm/init.c | 4 +- arch/x86/mm/init_32.c | 4 +- arch/x86/mm/init_64.c | 4 +- include/linux/memory_hotplug.h | 7 +- mm/memory_hotplug.c| 186 - mm/memremap.c | 13 ++- mm/page_alloc.c| 8 +- 11 files changed, 90 insertions(+), 151 deletions(-) -- 2.21.0