On Sun, Jan 14, 2018 at 3:54 AM, Tetsuo Handa
<penguin-ker...@i-love.sakura.ne.jp> wrote:
> This memory corruption bug occurs even on CONFIG_SMP=n CONFIG_PREEMPT_NONE=y
> kernel. This bug highly depends on timing and thus too difficult to bisect.
> This bug seems to exist at least since Linux 4.8 (judging from the traces,
> though the cause might be different). None of debugging configuration gives
> me a clue. So far only CONFIG_HIGHMEM=y CONFIG_DEBUG_PAGEALLOC=y kernel (with
> RAM enough to use HighMem: zone) seems to hit this bug, but it might be just
> by chance caused by timings. Thus, there is no evidence that 64bit kernels
> are not affected by this bug. But I can't narrow down any more. Thus, I call
> for developers who can narrow down / identify where the memory corruption
> bug is.

Hmm.

I guess I'm still hung up on the "it does not look like a valid
'struct page *'" thing.

Can you reproduce this with CONFIG_FLATMEM=y instead of CONFIG_SPARSEMEM?

Because if you can, I think we can easily add a few more pfn and
'struct page' validation debug statements. With SPARSEMEM, it gets
pretty complicated because the whole struct page setup is much more
complex.
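
To illustrate why the flat case is easier to check, here is a rough
userspace sketch, not kernel code. It assumes only that a flat memory
model keeps every 'struct page' in one contiguous array, so a pointer
can be range- and alignment-checked directly. The toy struct page, the
MAX_MAPNR value, and the helper names pfn_looks_valid() and
page_looks_valid() are made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the kernel's struct page; the real one is far larger. */
struct page {
	unsigned long flags;
	int refcount;
};

/* Hypothetical number of page frames covered by the flat map. */
#define MAX_MAPNR 1024UL

/* Flat model: one contiguous array with one entry per page frame. */
static struct page mem_map[MAX_MAPNR];

/* A pfn is plausible only if it indexes inside the flat array. */
static bool pfn_looks_valid(unsigned long pfn)
{
	return pfn < MAX_MAPNR;
}

/*
 * A 'struct page *' is plausible only if it points into mem_map[]
 * and sits exactly on an element boundary.
 */
static bool page_looks_valid(const struct page *page)
{
	uintptr_t addr = (uintptr_t)page;
	uintptr_t start = (uintptr_t)mem_map;
	uintptr_t end = (uintptr_t)(mem_map + MAX_MAPNR);

	if (addr < start || addr >= end)
		return false;
	return (addr - start) % sizeof(struct page) == 0;
}
```

A debug statement built on this would just warn and dump the pointer
whenever page_looks_valid() returns false. With SPARSEMEM there is no
single array to compare against, so the same check needs a per-section
lookup first.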

              Linus
