On Fri 03-02-17 18:36:54, Trevor Cordes wrote:
> On 2017-02-01 Michal Hocko wrote:
> > On Wed 01-02-17 03:29:28, Trevor Cordes wrote:
> > > On 2017-01-30 Michal Hocko wrote:
> > [...]
> > > > Testing with vanilla rc6 released just yesterday would be a good
> > > > fit. There are some more fixes sitting on mmotm on top and maybe
> > > > we want some of them in final 4.10. Anyway all those pending
> > > > changes should be merged in the next merge window - aka 4.11
> > >
> > > After 30 hours of running vanilla 4.10.0-rc6, the box started to go
> > > bonkers at 3am, so vanilla does not fix the bug :-( But, the bug
> > > hit differently this time, the box just bogged down like crazy and
> > > gave really weird top output. Starting nano would take 10s, then
> > > would run full speed, then when saving a file would take 5s.
> > > Starting any prog not in cache took equally as long.
> >
> > Could you try with to_test/linus-tree/oom_hickups branch on the same
> > git tree? I have cherry-picked "mm, vmscan: consider eligible zones in
> > get_scan_count" which might be the missing part.
>
> I ran to_test/linus-tree/oom_hickups branch (4.10.0-rc6+) for 50 hours
> and it does NOT have the bug! No problems at all so far.

OK, that is definitely good to know. My other fix ("mm, vmscan: consider
eligible zones in get_scan_count") was more theoretical than bug driven.
I would add your
Tested-by: Trevor Cordes <tre...@tecnopolis.ca>
unless you have anything against that.

> So I think whatever to_test/linus-tree/oom_hickups has that since-4.9
> has that vanilla 4.10-rc6 does *not* have is indeed the fix.
>
> For my reference, and I know you guys aren't distro-specific, what is
> the best way to get this fix into Fedora 24 (currently 4.9)?

I will send this patch to 4.9+ stable as soon as it hits Linus tree.
-- 
Michal Hocko
SUSE Labs