On Fri, Sep 23, 2022 at 11:15 AM Kellen Renshaw <1970...@bugs.launchpad.net> wrote:
>
> I have put up a draft SRU exception page at
> https://wiki.ubuntu.com/MakedumpfileUpdates. Comments/edits are welcome.
> I am working on finding an appropriate place for test vmcores to be
> stored.

Awesome! Thanks for starting this. I made some edits already, but here are
some other things I'd recommend:

 - Provide links under resources (it's a web page!)
 - Be more precise about what will be tested, ideally in checklist form.
   Make it clear enough that you could give the list to another
   developer and they would run the same tests and get the same results.
 - Testing other architectures should not be optional IMO. makedumpfile
   has architecture-specific knowledge. You could easily break one w/o
   noticing on another.
 - Mention the plan to filter a pool of saved dump files as a regression
   test. That is key IMO.
 - If you have the regression testing pool, I don't see any reason to
   do the full kdump process everywhere. Just do that once (one kernel,
   one arch) to make sure the kdump-tools<->makedumpfile interface has
   been preserved. Testing the full crash dump process is painful, but
   makedumpfile really is just used as a filter, and we can test that
   easily.

  -dann

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1970672

Title:
  makedumpfile falls back to cp with "__vtop4_x86_64: Can't get a valid
  pmd_pte."

Status in makedumpfile package in Ubuntu:
  Confirmed

Bug description:
  [Impact]

   * On Focal with an HWE (>=5.12) kernel, makedumpfile can sometimes
     fail with "__vtop4_x86_64: Can't get a valid pmd_pte."

   * makedumpfile falls back to cp for the dump, resulting in extremely
     large vmcores. This can impact both collection and analysis due to
     lack of space for the resulting vmcore.

   * This is fixed in upstream commit present in versions 1.7.0 and
     1.7.1:
     https://github.com/makedumpfile/makedumpfile/commit/646456862df8926ba10dd7330abf3bf0f887e1b6

  commit 646456862df8926ba10dd7330abf3bf0f887e1b6
  Author: Kazuhito Hagio <k-hagio...@nec.com>
  Date:   Wed May 26 14:31:26 2021 +0900

      [PATCH] Increase SECTION_MAP_LAST_BIT to 5

      * Required for kernel 5.12

      Kernel commit 1f90a3477df3 ("mm: teach pfn_to_online_page() about
      ZONE_DEVICE section collisions") added a section flag
      (SECTION_TAINT_ZONE_DEVICE) and causes makedumpfile an error on
      some machines like this:

        __vtop4_x86_64: Can't get a valid pmd_pte.
        readmem: Can't convert a virtual address(ffffe2bdc2000000) to physical address.
        readmem: type_addr: 0, addr:ffffe2bdc2000000, size:32768
        __exclude_unnecessary_pages: Can't read the buffer of struct page.
        create_2nd_bitmap: Can't exclude unnecessary pages.

      Increase SECTION_MAP_LAST_BIT to 5 to fix this. The bit had not
      been used until the change, so we can just increase the value.

      Signed-off-by: Kazuhito Hagio <k-hagio...@nec.com>

  [Test Plan]

   * Confirm that makedumpfile works as expected by triggering a kdump.

   * Confirm that the patched makedumpfile works as expected on a system
     known to experience the issue.

   * Confirm that the patched makedumpfile is able to work with a
     cp-generated known affected vmcore to compress it. The unpatched
     version fails.

  [Where problems could occur]

   * This change could adversely affect the collection/compression of
     vmcores during a kdump situation resulting in fallback to cp.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1970672/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
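
For what it's worth, the "filter a pool of saved dump files" regression test suggested in the reply could be sketched roughly as below. This is only an illustration, not an agreed test procedure: the function names, the *.vmcore naming convention, the directory layout, and the default dump level of 31 are all assumptions. The only makedumpfile options used are -c (compress) and -d (dump level), and the saved cores are assumed to be in a form makedumpfile can read directly.

```python
import subprocess
from pathlib import Path


def build_command(vmcore, output, dump_level=31, binary="makedumpfile"):
    """Return the argv used to filter one saved vmcore.

    The default dump level and the binary name are assumptions; point
    `binary` at the patched build that is under test.
    """
    return [binary, "-c", "-d", str(dump_level), str(vmcore), str(output)]


def filter_pool(pool_dir, out_dir, run=subprocess.run, **kwargs):
    """Filter every *.vmcore in pool_dir and return {file name: passed}.

    `run` is injectable so the loop can be exercised without a real
    makedumpfile binary or real dump files.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for vmcore in sorted(Path(pool_dir).glob("*.vmcore")):
        output = out_dir / (vmcore.name + ".filtered")
        # Remove any stale output so each run starts from a clean state.
        output.unlink(missing_ok=True)
        proc = run(build_command(vmcore, output, **kwargs))
        # A non-zero exit status is treated as a regression for that core.
        results[vmcore.name] = proc.returncode == 0
    return results
```

Something like filter_pool("/srv/vmcores", "/tmp/filtered") (both paths hypothetical) would then give one pass/fail entry per saved core, which could be repeated per architecture. The single end-to-end kdump run would remain a separate manual step.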