Re: [PATCH] kdump, vmcoreinfo: Export sme_me_mask value to vmcoreinfo
On 2018-10-27 06:25, Borislav Petkov wrote:
> On Fri, Oct 26, 2018 at 06:24:40PM +0200, Petr Tesarik wrote:
>> But we need the MSR value from the panic kernel environment, not while
>> the production kernel is still running, right?
>
> Actually, we need only the encryption bit number (and it should be 0
> otherwise to denote SME wasn't enabled).
>

Thanks for your comment. This patch really needs only the encryption bit number.

On an AMD machine with the SME feature, the OS or hypervisor sets bit 47 of the physical address in a page table entry to 1 to indicate that the page should be encrypted.

Thanks.
Lianbo

> I guess something like
>
> VMCOREINFO_NUMBER(sme_mask);
>
> which gets written by the kexec-ed kernel.
>
>> If so, then this reminds me that I have wanted for a long time to store
>> more of the hardware state in a vmcore NOTE after a kernel crash ...
>> control registers, MSRs and whatnot. Of course, this would be a
>> long-term project, but I wonder what other people think about it in
>> general.
>
> I guess that sounds like a good idea - the more relevant hw info for
> debugging, the better. Determining the important MSRs to save would need
> a bit of a pondering over though. For example, some MSRs are per-core,
> some per-socket, etc...
>

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
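To make the "encryption bit number" idea concrete, here is a minimal user-space sketch of how a dump tool might turn such a bit number into a mask and strip it from a page table entry. The helper names are hypothetical and the bit position (47) is only the value quoted in this thread; a real tool would read it from vmcoreinfo rather than hard-code it.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from makedumpfile or the kernel.
 * A bit number of 0 is treated as "SME not enabled", as suggested
 * in the discussion above.
 */

/* Build the encryption mask from a bit number; returns 0 if SME is off. */
static uint64_t sme_mask_from_bit(unsigned int enc_bit)
{
	if (enc_bit == 0 || enc_bit > 63)
		return 0;
	return UINT64_C(1) << enc_bit;
}

/* Clear the encryption bit from a raw page table entry value. */
static uint64_t pte_strip_sme(uint64_t pte, uint64_t sme_mask)
{
	return pte & ~sme_mask;
}
```

With a mask of 0 the entry passes through unchanged, so the same code path would work whether or not SME was active in the crashed kernel.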
Re: [PATCH] kdump, vmcoreinfo: Export sme_me_mask value to vmcoreinfo
On 2018-10-27 00:35, Borislav Petkov wrote:
> On Fri, Oct 26, 2018 at 08:32:11PM +0800, lijiang wrote:
>> If SME is enabled in the first kernel, the crash kernel's page
>> table(pgd/pud/pmd/pte) contains the memory encryption mask, so i have
>> to remove the sme mask to obtain the true physical address when dump
>> vmcore.
>
> Sorry, I have no clue what makedumpfile does exactly so you'd have to
> be more detailed (or wait until I look at it :)). Which kernel accesses
> which kernel's pagetable?
>
> /me goes and looks at the makedumpfile's manpage...
>

Sorry, I should have provided more detail about this. Thanks a lot for spending time reading the manpage.

> Ok, it uses vmcoreinfo to exclude pages which would mean, it accesses
> the first kernel's pagetable and traverses it.
>
> Am I close?
>

Yes, your explanation is perfect.

Thanks.
Lianbo
Re: [BUG] vmcore-dmesg can't read dmesg log from /proc/vmcore if log_buf is reallocated due to large number of CPUs
Hi Vadim,

On Fri, Oct 26, 2018 at 6:49 PM Vadim Lomovtsev wrote:
>
> Hi Bhupesh,
>
> On Fri, Oct 26, 2018 at 03:49:11PM +0530, Bhupesh Sharma wrote:
> >
> > Hi Vadim,
> >
> > On Fri, Oct 26, 2018 at 3:41 PM Vadim Lomovtsev wrote:
> > >
> > > Hi Bhupesh,
> > >
> > > On Fri, Oct 26, 2018 at 12:25:17PM +0530, Bhupesh Sharma wrote:
> > > >
> > > > Hi Vadim,
> > > >
> > > > On Thu, Oct 25, 2018 at 4:10 PM Vadim Lomovtsev wrote:
> > > > >
> > > > > Hello Bhupesh,
> > > > >
> > > > > On Thu, Oct 25, 2018 at 03:00:08AM +0530, Bhupesh Sharma wrote:
> > > > > > External Email
> > > > > >
> > > > > > Hello Vadim,
> > > > > >
> > > > > > On Wed, Oct 24, 2018 at 6:23 PM Lomovtsev, Vadim wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Following issue has been found for vmcore-dmesg app with latest
> > > > > > > release (94159bc3c264fa26395e56302072276a139d18af 2.0.18-rc1) of
> > > > > > > kexec-tools at CentOS 7.5 distro:
> > > > > > >
> > > > > > > While having systems with large number of CPUs (e.g. Cavium
> > > > > > > ThunderX2 has 224) the log_buf gets reallocated by
> > > > > > > memblock_virt_alloc() at the setup_log_buf routine
> > > > > > > (https://elixir.bootlin.com/linux/v4.16.18/source/kernel/printk/printk.c#L1108).
> > > > > > >
> > > > > > > Then while dumping vmcore the vmcore-dmesg can't find dmesg log
> > > > > > > at /proc/vmcore file and exits with following message:
> > > > > > > Failed to read log text of size 0 bytes: Bad address
> > > > > > >
> > > > > > > However it (vmcore-dmesg app) reads properly the log_buf symbol,
> > > > > > > it's address and eventually it's value from /proc/vmcore but
> > > > > > > fails to find dmesg data then.
> > > > > > >
> > > > > > > In the same time the makedumpfile is able to find and extract
> > > > > > > dmesg buffer from /proc/vmcore.
> > > > > > > The makedumpfile comes with kexec-tools-2.0.15-13.el7_5.2.aarch64
> > > > > > > package.
> > > > > > >
> > > > > > > The issue is not reproduced for systems with small number of CPUs
> > > > > > > and log_buf not reallocated to memblock section.
> > > > > >
> > > > > > Seems like you are hitting a known issue we saw on qualcomm
> > > > > > amberwing platforms as well.
> > > > > > I have sent a patch-series titled 'kexec-tools/arm64: Add support to
> > > > > > read PHYS_OFFSET from vmcoreinfo inside '/proc/kcore' to this list
> > > > > > just a few minutes back.
> > > > > >
> > > > > > I have Cc'ed you to the patchset as I think it might fix the issue
> > > > > > for you.
> > > > >
> > > > > Got them, thank you.
> > > > >
> > > > > > Kindly try the patchset on your platform (cavium?) and let me
> > > > > > know if this fixes the issue for you.
> > > > >
> > > > > Sure, I'd like to check them at my side, but..
> > > > > I fall into merge conflicts while trying to apply them onto
> > > > > https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/
> > > > > master, kexec-tools 2.0.18-rc1
> > > > > 94159bc3c264fa26395e56302072276a139d18af
> > > >
> > > > Hmm.. that's strange as I rebased them on kexec-tools 2.0.18-rc1
> > > > (94159bc3c264fa26395e56302072276a139d18af)
> > > > before sending out the patchset.
> > > >
> > > > > Are there any specific branch/revision for them to be applied ?
> > > > > (or it might be my mail server issues with formatting emails).
> > > > >
> > > >
> > > > Can you please try picking them up from my public github tree instead?
> > > > Here you can find the same:
> > > > https://github.com/bhupesh-sharma/kexec-tools/tree/read-phys-offset-from-kcore-upstream-v1
> > > >
> > > > Please pick the top 2 commit from here.
> > >
> > > Applied them onto commit '94159bc kexec-tools 2.0.18-rc1'.
> > >
> > > Still having following error while saving dmesg by vmcore-dmesg:
> > >
> > > kdump: saving vmcore-dmesg.txt
> > > Failed to read log text of size 0 bytes: Bad address
> > > kdump: saving vmcore-dmesg.txt failed
> > >
> > > So far tried kernels 4.14.78, 4.16.18.
> >
> > You would need kernel 4.19-rc5 or above as the same exposes VMCOREINFO
> > as '/proc/kcore'.
>
> So far with 4.19-rc6 (and updated kexec, vmcore-dmesg but having kdump
> scripts from CentOS)
> the crashkernel can't found sysroot and thus it can't dump anything, so it
> timeouts and reboot system.
>
> > If you are having issues while switching to newer kernel, please share
> > the output(s) of following on your platform:
> >
> > # kexec -p /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline -d
> >
>
> attached as kexec-start.log.xz
>
> > and,
> >
> > # readelf -l vmcore
>
> [root@2sgbt-53 vlomovts]# readelf -l vmcore
> readelf: vmcore: Error: No such file
> [root@2sgbt-53 vlomovts]# uname -r
> 4.19.0-rc6+
>
> > and,
> >
> > # cat /proc/iomem
>
> attached as cat-proc-iomem.log.xz

Just to confirm: these logs were taken after you applied my kexec-tools patches, right?

It looks likely that
Re: [BUG] Set device tree bootargs failed when DTB does not contain a chosen node.
Hi Vicenç,

On Fri, Oct 26, 2018 at 6:42 PM Vicente Bergas wrote:
>
> Hello,
> when executing
> kexec -d --dtb dtb_without_chosen_node.dtb --append 'cmdline' --load Image
> it reports
> dtb_set_property: fdt_add_subnode failed:
> kexec: Set device tree bootargs failed.
>
> It has been tested on the arm64 architecture with version v2.0.17 and
> v2.0.18-rc1

Can you share some details on the underlying platform and kernel version you are using? Looking at the logs, I am assuming you are running 'kexec' on an arm64 platform (as it depends on creating a dtb to be passed to the 2nd kernel).

My general advice is to avoid the --dtb option, as it is known to have caused issues with kexec in the past (see [0] for details).

Ideally, if you don't use the --dtb option, kexec will read the existing kernel's dtb from /proc/device-tree, and that device tree will include all the changes that the boot loader has made, including details of the available memory in the system.

[0]. https://www.spinics.net/lists/arm-kernel/msg618236.html

Hope this helps.

Regards,
Bhupesh
[PATCH] x86_64, vmcoreinfo: Append 'page_offset_base' to vmcoreinfo
Since commit 23c85094fe1895caefdd ("proc/kcore: add vmcoreinfo note to /proc/kcore"), '/proc/kcore' contains a new PT_NOTE which carries the VMCOREINFO information.

If the same is available, one can use it in user-land to retrieve machine specific symbols or strings being appended to the vmcoreinfo even for live-debugging of the primary kernel as a standard interface exposed by kernel for sharing machine specific details with the user-land.

In the past I had a discussion with James, where he suggested this approach (please see [0]) and I really liked the idea.

Since then I have been working on unifying the implementations of (at least) the commonly used user-space utilities that provide live-debugging capabilities (tools like 'makedumpfile' and 'crash-utility', see [1] for details of these tools).

For the same, when live debugging on x86_64 machines, user-space tools currently rely on different mechanisms to determine the 'page_offset_base' value (i.e. start of direct mapping of all physical memory).

One of the approaches used by the 'makedumpfile' user-space tool, for example, is to calculate the same from the last PT_LOAD available in '/proc/kcore', which can be flaky as and when new sections (e.g. KCORE_REMAP, which was added to recent kernels) are added to kcore.

For other architectures like arm64, I have already proposed using the vmcoreinfo note (in '/proc/kcore') in the user-space utilities to determine machine specific details like VA_BITS, PAGE_OFFSET, kaslr_offset() (see [2] for details), for which different user-space tools earlier used different (and at times flaky) approaches like:

- Reading kernel CONFIGs from user-space and determining CONFIG values like VA_BITS from there.
- Reading symbols from '/proc/kallsyms' and determining their values via '/dev/mem' interface.
- Reading symbols from 'vmlinux' and determining their values from reading memory.

This patch allows appending 'page_offset_base' for x86_64 platforms to vmcoreinfo, so that user-space tools can use the same as a standard interface to determine the start of direct mapping of all physical memory.

Testing:
-------
- I tested this patch (rebased on 'linux-next') on a x86_64 machine using the modified 'makedumpfile' user-space code (see [3] for my github tree which contains the same) for determining how many pages are dumpable when different dump_level is specified (which is one use-case of live-debugging via 'makedumpfile').
- I tested both the KASLR and non-KASLR boot cases with this patch.
- Here is one sample log (for KASLR boot case) on my x86_64 machine:

  < snip..>
  The kernel doesn't support mmap(),read() will be used instead.

  TYPE            PAGES     EXCLUDABLE  DESCRIPTION
  ----------------------------------------------------------------------
  ZERO            21299     yes         Pages filled with zero
  NON_PRI_CACHE   91785     yes         Cache pages without private flag
  PRI_CACHE       1         yes         Cache pages with private flag
  USER            14057     yes         User process pages
  FREE            740346    yes         Free pages
  KERN_DATA       58152     no          Dumpable kernel data

  page size:              4096
  Total pages on system:  925640
  Total size on system:   3791421440 Byte

I understand that there might be some reservations about exporting such machine-specific details in the vmcoreinfo, but to unify the implementations across user-land and archs, perhaps this would be a good starting point for a discussion.

[0]. https://www.mail-archive.com/kexec@lists.infradead.org/msg20300.html
[1]. MAN pages -> MAKEDUMPFILE(8) and CRASH(8)
[2]. https://www.spinics.net/lists/kexec/msg21608.html
     http://lists.infradead.org/pipermail/kexec/2018-October/021725.html
[3]. https://github.com/bhupesh-sharma/makedumpfile/tree/add-page-offset-base-to-vmcore-v1

Cc: Boris Petkov
Cc: Baoquan He
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Kazuhito Hagio
Cc: Dave Anderson
Cc: James Morse
Cc: Omar Sandoval
Cc: x...@kernel.org
Cc: kexec@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Signed-off-by: Bhupesh Sharma
---
 arch/x86/kernel/machine_kexec_64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 4c8acdfdc5a7..834ccefef867 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -356,6 +356,7 @@ void arch_crash_save_vmcoreinfo(void)
 	VMCOREINFO_SYMBOL(init_top_pgt);
 	vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
 			pgtable_l5_enabled());
+	VMCOREINFO_NUMBER(page_offset_base);
 
 #ifdef CONFIG_NUMA
 	VMCOREINFO_SYMBOL(node_data);
-- 
2.7.4
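For readers wondering what the user-space side of this could look like, the following is a rough sketch of looking up a `NUMBER(...)` entry in a vmcoreinfo text blob. The function name is hypothetical and this is not makedumpfile's actual parser. It assumes the value is emitted as a signed decimal long (which I believe is how VMCOREINFO_NUMBER formats it), so a kernel virtual address would appear as a negative number and is cast back to unsigned here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "NUMBER(<name>)=<value>" in a newline-separated vmcoreinfo blob.
 * Returns 0 and stores the value in *out on success, -1 if the key is
 * absent. Assumption: the value is a signed decimal, so an address with
 * the top bit set is printed as a negative number.
 */
static int vmcoreinfo_number(const char *info, const char *name,
			     unsigned long long *out)
{
	char key[128];
	const char *line = info;
	size_t klen;
	int n;

	n = snprintf(key, sizeof(key), "NUMBER(%s)=", name);
	if (n < 0 || (size_t)n >= sizeof(key))
		return -1;
	klen = (size_t)n;

	while (line && *line) {
		/* Keys are only matched at the start of a line. */
		if (strncmp(line, key, klen) == 0) {
			*out = (unsigned long long)strtoll(line + klen, NULL, 10);
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A tool would call this with the note contents extracted from '/proc/kcore' or '/proc/vmcore' and fall back to its existing heuristics when the key is missing, which keeps it working on kernels without the patch.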
Re: [PATCH] kdump, vmcoreinfo: Export sme_me_mask value to vmcoreinfo
On Fri, Oct 26, 2018 at 06:24:40PM +0200, Petr Tesarik wrote:
> But we need the MSR value from the panic kernel environment, not while
> the production kernel is still running, right?

Actually, we need only the encryption bit number (and it should be 0 otherwise to denote SME wasn't enabled).

I guess something like

	VMCOREINFO_NUMBER(sme_mask);

which gets written by the kexec-ed kernel.

> If so, then this reminds me that I have wanted for a long time to store
> more of the hardware state in a vmcore NOTE after a kernel crash ...
> control registers, MSRs and whatnot. Of course, this would be a
> long-term project, but I wonder what other people think about it in
> general.

I guess that sounds like a good idea - the more relevant hw info for debugging, the better. Determining the important MSRs to save would need a bit of a pondering over though. For example, some MSRs are per-core, some per-socket, etc...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
Re: [PATCH] makedumpfile/x86_64: Fix calculation of page_offset for kernel 4.19
Hi Kazu,

On Sat, Oct 27, 2018 at 12:34 AM Kazuhito Hagio wrote:
>
> Hi Bhupesh, Baoquan,
>
> As for x86_64, I'm going to merge the patch below for fixing the
> --mem-usage issue with kernel 4.19, if there is no objection.
> I think the same approach will also work on arm64 with regard to
> page_offset for the time being..
>

I am sorry for holding on to my reply on the patch (sent by me) which we were discussing this past week, but it has not been easy for me to get access to different types of arm64 boards (so that we can cover a broad spectrum of boards), and my x86_64 virtual machine is also giving me several weird setup issues this week.

Anyway, for now I would suggest that you hold off on applying this patch: although it works on my x86_64 vm and one type of arm64 board, it fails on a few other arm64 setups.

I think I have a cleaner approach in mind for x86_64 (which I am just thrashing out on my x86_64 vm), and along with the earlier patch I shared for arm64 it should fix issues with live debugging (i.e. the --mem-usage use case) with makedumpfile. But for that I need to send an x86_64 kernel patch and see the maintainers' opinions on the same (as Bao noted in his earlier email, the x86_64 maintainers may not like this approach, but I think we can at least try to start a discussion, as I see that the arm64 kernel maintainers are willing to accept only this approach going forward).

I would suggest that since we are looking to support newer kernels, we had better shift to a uniform approach for all the archs (which is the intent behind the kernel patchset which enabled the VMCOREINFO PT_NOTE in '/proc/kcore').

I know that you and Bao have some apprehensions about the same, so let me try and put out the solution which should fix the x86_64 part as well, and then we can thrash out a solution that probably fits all archs.

I will also come back with my replies to the review comments on the patch I posted last week in a day or two.

Thanks for your patience.

Regards,
Bhupesh

> --
> From: Kazuhito Hagio
> Date: Fri, 26 Oct 2018 14:43:22 -0400
> Subject: [PATCH] x86_64: Fix calculation of page_offset for kernel 4.19
>
> * Required for kernel 4.19
>
> Kernel commit 6855dc41b24619c3d1de3dbd27dd0546b0e45272 ("x86: Add
> entry trampolines to kcore") added program headers for PTI entry
> trampoline pages to /proc/kcore.
>
> This caused the failure of makedumpfile --mem-usage due to wrong
> calculation of page_offset.
>
>   # makedumpfile --mem-usage /proc/kcore
>   [...]
>   set_kcore_vmcoreinfo: Can't get the offset of VMCOREINFO(/proc/kcore). Success
>
>   makedumpfile Failed.
>
> Since program headers for linear maps are located after ones for
> kernel text and so on in /proc/vmcore and /proc/kcore, with this
> patch, we use the last valid one to set page_offset.
>
> Also, this patch adds a few debug messages for better debugging.
>
> Cc: Bhupesh Sharma
> Cc: Baoquan He
> Signed-off-by: Kazuhito Hagio
> ---
>  arch/x86_64.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86_64.c b/arch/x86_64.c
> index 2b3c0bb..ed2a970 100644
> --- a/arch/x86_64.c
> +++ b/arch/x86_64.c
> @@ -95,10 +95,17 @@ get_page_offset_x86_64(void)
>  			ERRMSG("Can't read page_offset_base.\n");
>  			return FALSE;
>  		}
> +		DEBUG_MSG("page_offset : %lx (from page_offset_base)\n",
> +			info->page_offset);
>  		return TRUE;
>  	}
>
>  	if (get_num_pt_loads()) {
> +		/*
> +		 * Since program headers for linear maps are located after
> +		 * ones for kernel text and so on in /proc/vmcore and
> +		 * /proc/kcore, we use the last valid one to set page_offset.
> +		 */
>  		for (i = 0;
>  			get_pt_load(i, &phys_start, NULL, &virt_start, NULL);
>  			i++) {
> @@ -106,9 +113,13 @@ get_page_offset_x86_64(void)
>  				&& virt_start < __START_KERNEL_map
>  				&& phys_start != NOT_PADDR) {
>  				info->page_offset = virt_start - phys_start;
> -				return TRUE;
>  			}
>  		}
> +		if (info->page_offset) {
> +			DEBUG_MSG("page_offset : %lx (from pt_load)\n",
> +				info->page_offset);
> +			return TRUE;
> +		}
>  	}
>
>  	if (info->kernel_version < KERNEL_VERSION(2, 6, 27)) {
> @@ -119,6 +130,7 @@ get_page_offset_x86_64(void)
>  		info->page_offset = __PAGE_OFFSET_2_6_27;
>  	}
>
> +	DEBUG_MSG("page_offset : %lx (from constant)\n", info->page_offset);
>  	return TRUE;
>  }
>
> --
> 1.8.3.1
>
[PATCH] makedumpfile/x86_64: Fix calculation of page_offset for kernel 4.19
Hi Bhupesh, Baoquan,

As for x86_64, I'm going to merge the patch below for fixing the --mem-usage issue with kernel 4.19, if there is no objection. I think the same approach will also work on arm64 with regard to page_offset for the time being..

Thanks,
Kazu

--
From: Kazuhito Hagio
Date: Fri, 26 Oct 2018 14:43:22 -0400
Subject: [PATCH] x86_64: Fix calculation of page_offset for kernel 4.19

* Required for kernel 4.19

Kernel commit 6855dc41b24619c3d1de3dbd27dd0546b0e45272 ("x86: Add entry trampolines to kcore") added program headers for PTI entry trampoline pages to /proc/kcore.

This caused the failure of makedumpfile --mem-usage due to wrong calculation of page_offset.

  # makedumpfile --mem-usage /proc/kcore
  [...]
  set_kcore_vmcoreinfo: Can't get the offset of VMCOREINFO(/proc/kcore). Success

  makedumpfile Failed.

Since program headers for linear maps are located after ones for kernel text and so on in /proc/vmcore and /proc/kcore, with this patch, we use the last valid one to set page_offset.

Also, this patch adds a few debug messages for better debugging.

Cc: Bhupesh Sharma
Cc: Baoquan He
Signed-off-by: Kazuhito Hagio
---
 arch/x86_64.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86_64.c b/arch/x86_64.c
index 2b3c0bb..ed2a970 100644
--- a/arch/x86_64.c
+++ b/arch/x86_64.c
@@ -95,10 +95,17 @@ get_page_offset_x86_64(void)
 			ERRMSG("Can't read page_offset_base.\n");
 			return FALSE;
 		}
+		DEBUG_MSG("page_offset : %lx (from page_offset_base)\n",
+			info->page_offset);
 		return TRUE;
 	}
 
 	if (get_num_pt_loads()) {
+		/*
+		 * Since program headers for linear maps are located after
+		 * ones for kernel text and so on in /proc/vmcore and
+		 * /proc/kcore, we use the last valid one to set page_offset.
+		 */
 		for (i = 0;
 			get_pt_load(i, &phys_start, NULL, &virt_start, NULL);
 			i++) {
@@ -106,9 +113,13 @@ get_page_offset_x86_64(void)
 				&& virt_start < __START_KERNEL_map
 				&& phys_start != NOT_PADDR) {
 				info->page_offset = virt_start - phys_start;
-				return TRUE;
 			}
 		}
+		if (info->page_offset) {
+			DEBUG_MSG("page_offset : %lx (from pt_load)\n",
+				info->page_offset);
+			return TRUE;
+		}
 	}
 
 	if (info->kernel_version < KERNEL_VERSION(2, 6, 27)) {
@@ -119,6 +130,7 @@ get_page_offset_x86_64(void)
 		info->page_offset = __PAGE_OFFSET_2_6_27;
 	}
 
+	DEBUG_MSG("page_offset : %lx (from constant)\n", info->page_offset);
 	return TRUE;
 }
 
-- 
1.8.3.1
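The effect of dropping the early return can be seen in isolation with a small stand-alone sketch. The struct, constants and function name below are hypothetical stand-ins rather than makedumpfile's own definitions, and the non-zero check on virt_start is my assumption about the condition line that falls outside the hunk context above.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins; not makedumpfile's real names or types. */
#define KERNEL_TEXT_BASE 0xffffffff80000000ULL	/* x86_64 __START_KERNEL_map */
#define NO_PADDR         UINT64_MAX		/* "no physical address" marker */

struct pt_load {
	uint64_t phys_start;
	uint64_t virt_start;
};

/*
 * Return virt_start - phys_start of the LAST segment that lies below the
 * kernel text mapping and has a valid physical address, or 0 if no
 * segment qualifies. Earlier matches are simply overwritten.
 */
static uint64_t page_offset_from_pt_loads(const struct pt_load *seg, size_t n)
{
	uint64_t page_offset = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (seg[i].virt_start != 0 &&
		    seg[i].virt_start < KERNEL_TEXT_BASE &&
		    seg[i].phys_start != NO_PADDR)
			page_offset = seg[i].virt_start - seg[i].phys_start;
	}
	return page_offset;
}
```

If a trampoline-like segment precedes the linear-map segments, a first-match loop would return its offset, whereas this last-match loop ends on the linear map. The approach still depends on header ordering, which is the fragility raised elsewhere in this thread.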
Re: [PATCH] kdump, vmcoreinfo: Export sme_me_mask value to vmcoreinfo
On Fri, Oct 26, 2018 at 08:32:11PM +0800, lijiang wrote:
> If SME is enabled in the first kernel, the crash kernel's page
> table(pgd/pud/pmd/pte) contains the memory encryption mask, so i have
> to remove the sme mask to obtain the true physical address when dump
> vmcore.

Sorry, I have no clue what makedumpfile does exactly so you'd have to be more detailed (or wait until I look at it :)). Which kernel accesses which kernel's pagetable?

/me goes and looks at the makedumpfile's manpage...

Ok, it uses vmcoreinfo to exclude pages which would mean, it accesses the first kernel's pagetable and traverses it.

Am I close?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
Re: [PATCH] kdump, vmcoreinfo: Export sme_me_mask value to vmcoreinfo
On Fri, 26 Oct 2018 20:32:11 +0800 lijiang wrote:

>[...]
> For AMD machine with the SME feature, the msr 'MSR_K8_SYSCFG' can examine
> whether SME is enabled in kernel, but the kexec is also userspace tool,
> it has no permission to access the msr.

But we need the MSR value from the panic kernel environment, not while the production kernel is still running, right?

If so, then this reminds me that I have wanted for a long time to store more of the hardware state in a vmcore NOTE after a kernel crash ... control registers, MSRs and whatnot. Of course, this would be a long-term project, but I wonder what other people think about it in general.

Just my 2 cents,
Petr T

> Furthermore, i also tried to read the "/dev/cpu/cpu[number]/msr", but
> the value depends on BIOS's configuration. That is to say, if SME is
> set in BIOS, the value of msr is always 0xF4 whatever the kernel
> commandline parameter is "mem_encrypt=on" or "mem_encrypt=off".
>
> If i made a mistake, please help to point it out.
>
> Thanks.
> Lianbo
>
[BUG] Set device tree bootargs failed when DTB does not contain a chosen node.
Hello,

when executing

  kexec -d --dtb dtb_without_chosen_node.dtb --append 'cmdline' --load Image

it reports

  dtb_set_property: fdt_add_subnode failed:
  kexec: Set device tree bootargs failed.

It has been tested on the arm64 architecture with version v2.0.17 and v2.0.18-rc1

Regards,
  Vicenç.
Re: [PATCH] kdump, vmcoreinfo: Export sme_me_mask value to vmcoreinfo
On 2018-10-26 17:43, Boris Petkov wrote:
> On October 26, 2018 10:36:30 AM GMT+01:00, Lianbo Jiang wrote:
>> For AMD machine with SME feature, makedumpfile tools need to know
>> whether the crash kernel was encrypted or not.
>
> Why?
>

If SME is enabled in the first kernel, the crash kernel's page tables (pgd/pud/pmd/pte) contain the memory encryption mask, so I have to remove the SME mask to obtain the true physical address when dumping the vmcore.

>> So it is necessary
>> to write the sme_me_mask to vmcoreinfo.
>>
>> Signed-off-by: Lianbo Jiang
>> ---
>> arch/x86/kernel/machine_kexec_64.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/kernel/machine_kexec_64.c
>> b/arch/x86/kernel/machine_kexec_64.c
>> index 4c8acdfdc5a7..dcfdb64d1097 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -357,6 +357,8 @@ void arch_crash_save_vmcoreinfo(void)
>> 	vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
>> 			pgtable_l5_enabled());
>>
>> +	VMCOREINFO_NUMBER(sme_me_mask);
>
> No we're not going to expose a kernel-internal mask to userspace.
>

If so, can I set a flag variable derived from 'sme_me_mask' and export that flag instead? For example:

	void arch_crash_save_vmcoreinfo(void)
	{
		if (sme_active())
			sme_enabled = 1;

		VMCOREINFO_NUMBER(sme_enabled);
	}

> If at all needed, add functions to kexec which figure out whether we are
> encrypted or not and export that result as a kexec variable.
>

On an AMD machine with the SME feature, the MSR 'MSR_K8_SYSCFG' can be examined to see whether SME is enabled in the kernel, but kexec is a userspace tool and has no permission to access the MSR.

Furthermore, I also tried to read "/dev/cpu/cpu[number]/msr", but the value depends on the BIOS configuration. That is to say, if SME is set in the BIOS, the value of the MSR is always 0xF4, regardless of whether the kernel command-line parameter is "mem_encrypt=on" or "mem_encrypt=off".

If I made a mistake, please help to point it out.

Thanks.
Lianbo
Re: [BUG] vmcore-dmesg can't read dmesg log from /proc/vmcore if log_buf is reallocated due to large number of CPUs
Hi Vadim,

On Fri, Oct 26, 2018 at 3:41 PM Vadim Lomovtsev wrote:
>
> Hi Bhupesh,
>
> On Fri, Oct 26, 2018 at 12:25:17PM +0530, Bhupesh Sharma wrote:
> >
> > Hi Vadim,
> >
> > On Thu, Oct 25, 2018 at 4:10 PM Vadim Lomovtsev wrote:
> > >
> > > Hello Bhupesh,
> > >
> > > On Thu, Oct 25, 2018 at 03:00:08AM +0530, Bhupesh Sharma wrote:
> > > > External Email
> > > >
> > > > Hello Vadim,
> > > >
> > > > On Wed, Oct 24, 2018 at 6:23 PM Lomovtsev, Vadim wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > Following issue has been found for vmcore-dmesg app with latest
> > > > > release (94159bc3c264fa26395e56302072276a139d18af 2.0.18-rc1) of
> > > > > kexec-tools at CentOS 7.5 distro:
> > > > >
> > > > > While having systems with large number of CPUs (e.g. Cavium ThunderX2
> > > > > has 224) the log_buf gets reallocated by memblock_virt_alloc() at the
> > > > > setup_log_buf routine
> > > > > (https://elixir.bootlin.com/linux/v4.16.18/source/kernel/printk/printk.c#L1108).
> > > > >
> > > > > Then while dumping vmcore the vmcore-dmesg can't find dmesg log at
> > > > > /proc/vmcore file and exits with following message:
> > > > > Failed to read log text of size 0 bytes: Bad address
> > > > >
> > > > > However it (vmcore-dmesg app) reads properly the log_buf symbol, it's
> > > > > address and eventually it's value from /proc/vmcore but fails to find
> > > > > dmesg data then.
> > > > >
> > > > > In the same time the makedumpfile is able to find and extract dmesg
> > > > > buffer from /proc/vmcore.
> > > > > The makedumpfile comes with kexec-tools-2.0.15-13.el7_5.2.aarch64
> > > > > package.
> > > > >
> > > > > The issue is not reproduced for systems with small number of CPUs and
> > > > > log_buf not reallocated to memblock section.
> > > >
> > > > Seems like you are hitting a known issue we saw on qualcomm amberwing
> > > > platforms as well.
> > > > I have sent a patch-series titled 'kexec-tools/arm64: Add support to
> > > > read PHYS_OFFSET from vmcoreinfo inside '/proc/kcore' to this list
> > > > just a few minutes back.
> > > >
> > > > I have Cc'ed you to the patchset as I think it might fix the issue for
> > > > you.
> > >
> > > Got them, thank you.
> > >
> > > > Kindly try the patchset on your platform (cavium?) and let me
> > > > know if this fixes the issue for you.
> > >
> > > Sure, I'd like to check them at my side, but..
> > > I fall into merge conflicts while trying to apply them onto
> > > https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/
> > > master, kexec-tools 2.0.18-rc1 94159bc3c264fa26395e56302072276a139d18af
> >
> > Hmm.. that's strange as I rebased them on kexec-tools 2.0.18-rc1
> > (94159bc3c264fa26395e56302072276a139d18af)
> > before sending out the patchset.
> >
> > > Are there any specific branch/revision for them to be applied ?
> > > (or it might be my mail server issues with formatting emails).
> > >
> >
> > Can you please try picking them up from my public github tree instead?
> > Here you can find the same:
> > https://github.com/bhupesh-sharma/kexec-tools/tree/read-phys-offset-from-kcore-upstream-v1
> >
> > Please pick the top 2 commit from here.
>
> Applied them onto commit '94159bc kexec-tools 2.0.18-rc1'.
>
> Still having following error while saving dmesg by vmcore-dmesg:
>
> kdump: saving vmcore-dmesg.txt
> Failed to read log text of size 0 bytes: Bad address
> kdump: saving vmcore-dmesg.txt failed
>
> So far tried kernels 4.14.78, 4.16.18.

You would need kernel 4.19-rc5 or above, as that is where VMCOREINFO is exposed in '/proc/kcore'.

If you are having issues while switching to a newer kernel, please share the output(s) of the following on your platform:

# kexec -p /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline -d

and,

# readelf -l vmcore

and,

# cat /proc/iomem

And then I can suggest a hack, which you can try and test on your platform, and we can take it forward from there.

Thanks,
Bhupesh

>
> >
> > Thanks,
> > Bhupesh
> >
> > >
> > > >
> > > > Thanks,
> > > > Bhupesh
Re: [BUG] vmcore-dmesg can't read dmesg log from /proc/vmcore if log_buf is reallocated due to large number of CPUs
Hi Bhupesh,

On Fri, Oct 26, 2018 at 12:25:17PM +0530, Bhupesh Sharma wrote:
> Hi Vadim,
>
> On Thu, Oct 25, 2018 at 4:10 PM Vadim Lomovtsev wrote:
> >
> > Hello Bhupesh,
> >
> > On Thu, Oct 25, 2018 at 03:00:08AM +0530, Bhupesh Sharma wrote:
> > > Hello Vadim,
> > >
> > > On Wed, Oct 24, 2018 at 6:23 PM Lomovtsev, Vadim wrote:
> > > >
> > > > Hi all,
> > > >
> > > > Following issue has been found for vmcore-dmesg app with latest release
> > > > (94159bc3c264fa26395e56302072276a139d18af 2.0.18-rc1) of kexec-tools at
> > > > CentOS 7.5 distro:
> > > >
> > > > While having systems with large number of CPUs (e.g. Cavium ThunderX2
> > > > has 224) the log_buf gets reallocated by memblock_virt_alloc() at the
> > > > setup_log_buf routine
> > > > (https://elixir.bootlin.com/linux/v4.16.18/source/kernel/printk/printk.c#L1108).
> > > >
> > > > Then while dumping vmcore the vmcore-dmesg can't find dmesg log at
> > > > /proc/vmcore file and exits with following message:
> > > > Failed to read log text of size 0 bytes: Bad address
> > > >
> > > > However it (vmcore-dmesg app) reads properly the log_buf symbol, it's
> > > > address and eventually it's value from /proc/vmcore but fails to find
> > > > dmesg data then.
> > > >
> > > > In the same time the makedumpfile is able to find and extract dmesg
> > > > buffer from /proc/vmcore.
> > > > The makedumpfile comes with kexec-tools-2.0.15-13.el7_5.2.aarch64
> > > > package.
> > > >
> > > > The issue is not reproduced for systems with small number of CPUs and
> > > > log_buf not reallocated to memblock section.
> > >
> > > Seems like you are hitting a known issue we saw on qualcomm amberwing
> > > platforms as well.
> > > I have sent a patch-series titled 'kexec-tools/arm64: Add support to
> > > read PHYS_OFFSET from vmcoreinfo inside '/proc/kcore' to this list
> > > just a few minutes back.
> > >
> > > I have Cc'ed you to the patchset as I think it might fix the issue for
> > > you.
> >
> > Got them, thank you.
> >
> > > Kindly try the patchset on your platform (cavium?) and let me
> > > know if this fixes the issue for you.
> >
> > Sure, I'd like to check them at my side, but..
> > I fall into merge conflicts while trying to apply them onto
> > https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/
> > master, kexec-tools 2.0.18-rc1 94159bc3c264fa26395e56302072276a139d18af
>
> Hmm.. that's strange as I rebased them on kexec-tools 2.0.18-rc1
> (94159bc3c264fa26395e56302072276a139d18af)
> before sending out the patchset.
>
> > Are there any specific branch/revision for them to be applied ?
> > (or it might be my mail server issues with formatting emails).
> >
>
> Can you please try picking them up from my public github tree instead?
> Here you can find the same:
> https://github.com/bhupesh-sharma/kexec-tools/tree/read-phys-offset-from-kcore-upstream-v1
>
> Please pick the top 2 commit from here.

Applied them onto commit '94159bc kexec-tools 2.0.18-rc1'.
Still having following error while saving dmesg by vmcore-dmesg:

kdump: saving vmcore-dmesg.txt
Failed to read log text of size 0 bytes: Bad address
kdump: saving vmcore-dmesg.txt failed

So far tried kernels 4.14.78, 4.16.18.

WBR,
Vadim

> Thanks,
> Bhupesh
>
> > >
> > > Thanks,
> > > Bhupesh
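For readers wondering why only large-CPU machines hit the reallocation path described in the original report, here is a rough sketch of the sizing rule. It assumes the kernel grows the buffer only when the per-CPU contribution exceeds half of the static buffer, and that the result is rounded up to a power of two. The two shift values are illustrative assumptions (a given distro config may differ), and `log_buf_len_for_cpus()` and `roundup_pow2()` are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>

/* Assumed config values, for illustration only: a 128 KiB static
 * buffer and a 4 KiB contribution per additional CPU. */
#define LOG_BUF_LEN         (1u << 17)
#define LOG_CPU_MAX_BUF_LEN (1u << 12)

/* Round v up to the next power of two (v must be non-zero). */
static uint32_t roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Return the size of the dynamically allocated log buffer for a
 * given CPU count, or 0 if the static buffer would be kept.
 */
static uint32_t log_buf_len_for_cpus(uint32_t cpus)
{
	uint32_t extra;

	if (cpus <= 1)
		return 0;

	extra = (cpus - 1) * LOG_CPU_MAX_BUF_LEN;

	/* Small systems stay on the static buffer. */
	if (extra <= LOG_BUF_LEN / 2)
		return 0;

	return roundup_pow2(extra + LOG_BUF_LEN);
}
```

Under these assumed values a 16-CPU system keeps the static buffer, while a 224-CPU system such as the ThunderX2 mentioned above would get a reallocated buffer, which is the case where vmcore-dmesg has to follow the log_buf pointer to memory outside the kernel image.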
Re: [PATCH] kdump, vmcoreinfo: Export sme_me_mask value to vmcoreinfo
On October 26, 2018 10:36:30 AM GMT+01:00, Lianbo Jiang wrote:
>For AMD machine with SME feature, makedumpfile tools need to know
>whether the crash kernel was encrypted or not.

Why?

> So it is necessary
>to write the sme_me_mask to vmcoreinfo.
>
>Signed-off-by: Lianbo Jiang
>---
> arch/x86/kernel/machine_kexec_64.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/arch/x86/kernel/machine_kexec_64.c
>b/arch/x86/kernel/machine_kexec_64.c
>index 4c8acdfdc5a7..dcfdb64d1097 100644
>--- a/arch/x86/kernel/machine_kexec_64.c
>+++ b/arch/x86/kernel/machine_kexec_64.c
>@@ -357,6 +357,8 @@ void arch_crash_save_vmcoreinfo(void)
> 	vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
> 			pgtable_l5_enabled());
>
>+	VMCOREINFO_NUMBER(sme_me_mask);

No we're not going to expose a kernel-internal mask to userspace. If at
all needed, add functions to kexec which figure out whether we are
encrypted or not and export that result as a kexec variable.

-- 
Sent from a small device: formatting sux and brevity is inevitable.
[PATCH] kdump, vmcoreinfo: Export sme_me_mask value to vmcoreinfo
For AMD machine with SME feature, makedumpfile tools need to know
whether the crash kernel was encrypted or not. So it is necessary
to write the sme_me_mask to vmcoreinfo.

Signed-off-by: Lianbo Jiang
---
 arch/x86/kernel/machine_kexec_64.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 4c8acdfdc5a7..dcfdb64d1097 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -357,6 +357,8 @@ void arch_crash_save_vmcoreinfo(void)
 	vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
 			pgtable_l5_enabled());
 
+	VMCOREINFO_NUMBER(sme_me_mask);
+
 #ifdef CONFIG_NUMA
 	VMCOREINFO_SYMBOL(node_data);
 	VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
-- 
2.17.1
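To illustrate what a consumer such as makedumpfile might do with the value if it were exported as this patch proposes, here is a minimal userspace sketch. It assumes the entry appears in the vmcoreinfo text as a decimal `NUMBER(sme_me_mask)=<value>` line, that the mask is a single bit (bit 47 on the machines discussed in this thread), and that the physical address occupies bits 12..51 of an x86-64 page table entry. `vmcoreinfo_number()` and `pte_to_paddr()` are hypothetical helpers, not makedumpfile functions.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "NUMBER(<name>)=<value>" at the start of a line in a
 * vmcoreinfo text blob. Returns 0 and stores the value on success,
 * -1 if the key is absent or too long.
 */
static int vmcoreinfo_number(const char *info, const char *name,
			     uint64_t *out)
{
	char key[64];
	const char *p;
	int n = snprintf(key, sizeof(key), "NUMBER(%s)=", name);

	if (n < 0 || (size_t)n >= sizeof(key))
		return -1;

	for (p = info; (p = strstr(p, key)) != NULL; p += n) {
		/* Only accept a match at the beginning of a line. */
		if (p == info || p[-1] == '\n') {
			*out = strtoull(p + n, NULL, 10);
			return 0;
		}
	}
	return -1;
}

/*
 * Strip the flag bits and the encryption mask from a page table
 * entry, leaving the page-aligned physical address.
 */
static uint64_t pte_to_paddr(uint64_t pte, uint64_t sme_mask)
{
	const uint64_t addr_bits = 0x000ffffffffff000ULL; /* bits 12..51 */

	return pte & addr_bits & ~sme_mask;
}
```

A mask of 0 leaves the address untouched, so the same code path would work for dumps from kernels where SME was not enabled.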