On 2025/07/04 7:35, Tao Liu wrote:
> Hi Petr,
>
> On Fri, Jul 4, 2025 at 2:31 AM Petr Tesarik <ptesa...@suse.com> wrote:
>>
>> On Tue, 1 Jul 2025 19:59:53 +1200
>> Tao Liu <l...@redhat.com> wrote:
>>
>>> Hi Kazu,
>>>
>>> Thanks for your comments!
>>>
>>> On Tue, Jul 1, 2025 at 7:38 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio...@nec.com> wrote:
>>>>
>>>> Hi Tao,
>>>>
>>>> thank you for the patch.
>>>>
>>>> On 2025/06/25 11:23, Tao Liu wrote:
>>>>> A vmcore corruption issue has been noticed on the powerpc arch [1].
>>>>> It can be reproduced with upstream makedumpfile.
>>>>>
>>>>> When analyzing the corrupt vmcore using crash, the following error
>>>>> messages are output:
>>>>>
>>>>> crash: compressed kdump: uncompress failed: 0
>>>>> crash: read error: kernel virtual address: c0001e2d2fe48000 type: "hardirq thread_union"
>>>>> crash: cannot read hardirq_ctx[930] at c0001e2d2fe48000
>>>>> crash: compressed kdump: uncompress failed: 0
>>>>>
>>>>> If the vmcore is generated without the num-threads option, no such
>>>>> errors are observed.
>>>>>
>>>>> With --num-threads=N enabled, N sub-threads are created. All
>>>>> sub-threads are producers responsible for mm page processing, e.g.
>>>>> compression. The main thread is the consumer responsible for writing
>>>>> the compressed data into the file. page_flag_buf->ready is used to
>>>>> synchronize the main thread and the sub-threads. When a sub-thread
>>>>> finishes processing a page, it sets the ready flag to FLAG_READY. In
>>>>> the meantime, the main thread repeatedly checks the ready flags of
>>>>> all threads, and breaks out of the loop when it finds FLAG_READY.
>>>>
>>>> I've tried to reproduce the issue, but I couldn't on x86_64.
>>>
>>> Yes, I cannot reproduce it on x86_64 either, but the issue is very
>>> easily reproduced on the ppc64 arch, which is where our QE reported it.
>>
>> Yes, this is expected.
>> X86 implements a strongly ordered memory model,
>> so a "store-to-memory" instruction ensures that the new value is
>> immediately observed by other CPUs.
>>
>> FWIW the current code is wrong even on X86, because it does nothing to
>> prevent compiler optimizations. The compiler is then allowed to reorder
>> instructions so that the write to page_flag_buf->ready happens after
>> other writes; with a bit of bad scheduling luck, the consumer thread
>> may see an inconsistent state (e.g. read a stale page_flag_buf->pfn).
>> Note that thanks to how compilers are designed (today), this issue is
>> more or less hypothetical. Nevertheless, the use of atomics fixes it,
>> because they also serve as memory barriers.
Thank you Petr, for the information.

I was wondering whether atomic operations might be necessary for the other
members of page_flag_buf, but it looks like they won't be necessary in this
case. Then I was convinced that the issue would be fixed by removing the
inconsistency of page_flag_buf->ready. And the patch tested ok, so ack.

Thanks,
Kazu

> Thanks a lot for your detailed explanation, it's very helpful! I
> hadn't thought of the possibility of instruction reordering, or that
> atomic_rw prevents the reordering.
>
> Thanks,
> Tao Liu
>
>>
>> Petr T
>>