Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel
On Mon, 20 Jan 2020 at 23:31, Andy Shevchenko wrote: > > On Mon, Jan 20, 2020 at 9:28 PM Eric W. Biederman > wrote: > > Andy Shevchenko writes: > > > On Sat, Dec 17, 2016 at 06:57:21PM +0800, Dave Young wrote: > > >> Ccing efi people. > > >> > > >> On 12/16/16 at 02:33pm, Jean Delvare wrote: > > >> > On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote: > > >> > > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote: > > >> > > > On 12/15/16 at 12:28pm, Jean Delvare wrote: > > >> > > > > I am no kexec expert but this confuses me. Shouldn't the second > > >> > > > > kernel have access to the EFI systab as the first kernel does? It > > >> > > > > includes many more pointers than just ACPI and DMI tables, and it > > >> > > > > would seem inconvenient to have to pass all these addresses > > >> > > > > individually explicitly. > > >> > > > > > >> > > > Yes, in modern linux kernel, kexec has the support for EFI, I > > >> > > > think it > > >> > > > should work naturally at least in x86_64. > > >> > > > > >> > > Thanks for this good news! > > >> > > > > >> > > Unfortunately Intel Galileo is 32-bit platform. > > >> > > > >> > If it was done for X86_64 then maybe it can be generalized to X86? > > >> > > >> For X86_64, we have a new way for efi runtime memmory mapping, in i386 > > >> code it still use old ioremap way. It is impossible to use same way as > > >> the X86_64 since the virtual address space is limited. > > >> > > >> But maybe for 32bit, kexec kernel can run in physical mode, but I'm not > > >> sure, I would suggest Andy to do a test first with efi=noruntime for > > >> kexec 2nd kernel. > > > > > > Guys, it was quite a long no hear from you. As I told you the proposed > > > work > > > around didn't help. Today I found that Microsoft Surface 3 also affected > > > by this. > > > > > > Can we apply these patches for now until you will find better > > > solution? > > > > Not a chance. The patches don't apply to any kernel in the git history. > > > > Which may be part of your problem. You are or at least were running > > with code that has not been merged upstream. > > It's done against linux-next. > Applied clearly. (Not the version in this more than yearly old series > of course, that's why I told I can resend) > > > > P.S. I may resend them rebased on recent vanilla. > > > > Second. I looked at your test results and they don't directly make > > sense. dmidecode bypasses the kernel completely or it did last time > > I looked so I don't know why you would be using that to test if > > something in the kernel is working. > > > > However dmidecode failing suggests that the actual problem is something > > in the first kernel is stomping the dmi tables. > > See below. > > > Adding a command line option won't fix stomped tables. > > It provides a mechanism, which seems to be absent, to the second > kernel to know where to look for SMBIOS tables. > > > So what I would suggest is: > > a) Verify that dmidecode works before kexec. > > Yes, it does. > > > b) Test to see if dmidecode works after kexec. > > No, it doesn't. > > > c) Once (a) shows that dmidecode works and (b) shows that dmidecode > >fails figure out what is stomping your dmi tables during or before > >kexec and that is what should get fixed. > > The problem here as I can see it that EFI and kexec protocols are not > friendly to each other. > I'm not an expert in either. That's why I'm asking for possible > solutions. And this needs to be done in kernel to allow drivers to > work. > > Does the > > commit 4996c02306a25def1d352ec8e8f48895bbc7dea9 > Author: Takao Indoh > Date: Thu Jul 14 18:05:21 2011 -0400 > > ACPI: introduce "acpi_rsdp=" parameter for kdump > > description shed a light on this? > > > Now using a non-efi method of dmi detection relies on the > > tables being between 0xF and 0x1. AKA the last 64K > > of the first 1MiB of memory. You might check to see if your > > dmi tables are in that address range. > > # dmidecode --no-sysfs > # dmidecode 3.2 > Scanning /dev/mem for entry point. > # No SMBIOS nor DMI entry point found, sorry. > > === with patch applied === > # dmidecode > ... > Release Date: 03/10/2015 > ... > > > > > Otherwise I suspect the good solution is to give efi it's own page > > tables in the kernel and switch to it whenever efi functions are called. > > > > > But on 32bit the Linux kernel has historically been just fine directly > > accessing the hardware, and ignoring efi and all of the other BIOS's. > > It seems not only for 32-bit Linux kernel anymore. MS Surface 3 runs > 64-bit code. > > > So if that doesn't work on Intel Galileo that is probably a firmware > > problem. > > It's not only about Galileo anymore. > Looking at the x86 kexec EFI code, it seems that it has special handling for the legacy SMBIOS table address, but not for the SMBIOS3 table address, which was introduced to accommodate SMBIOS tables living in memory that is not 32-bit addressable. Could anyone check
Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel
On Mon, Jan 20, 2020 at 9:28 PM Eric W. Biederman wrote: > Andy Shevchenko writes: > > On Sat, Dec 17, 2016 at 06:57:21PM +0800, Dave Young wrote: > >> Ccing efi people. > >> > >> On 12/16/16 at 02:33pm, Jean Delvare wrote: > >> > On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote: > >> > > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote: > >> > > > On 12/15/16 at 12:28pm, Jean Delvare wrote: > >> > > > > I am no kexec expert but this confuses me. Shouldn't the second > >> > > > > kernel have access to the EFI systab as the first kernel does? It > >> > > > > includes many more pointers than just ACPI and DMI tables, and it > >> > > > > would seem inconvenient to have to pass all these addresses > >> > > > > individually explicitly. > >> > > > > >> > > > Yes, in modern linux kernel, kexec has the support for EFI, I think > >> > > > it > >> > > > should work naturally at least in x86_64. > >> > > > >> > > Thanks for this good news! > >> > > > >> > > Unfortunately Intel Galileo is 32-bit platform. > >> > > >> > If it was done for X86_64 then maybe it can be generalized to X86? > >> > >> For X86_64, we have a new way for efi runtime memmory mapping, in i386 > >> code it still use old ioremap way. It is impossible to use same way as > >> the X86_64 since the virtual address space is limited. > >> > >> But maybe for 32bit, kexec kernel can run in physical mode, but I'm not > >> sure, I would suggest Andy to do a test first with efi=noruntime for > >> kexec 2nd kernel. > > > > Guys, it was quite a long no hear from you. As I told you the proposed work > > around didn't help. Today I found that Microsoft Surface 3 also affected > > by this. > > > > Can we apply these patches for now until you will find better > > solution? > > Not a chance. The patches don't apply to any kernel in the git history. > > Which may be part of your problem. You are or at least were running > with code that has not been merged upstream. It's done against linux-next. Applied clearly. (Not the version in this more than yearly old series of course, that's why I told I can resend) > > P.S. I may resend them rebased on recent vanilla. > > Second. I looked at your test results and they don't directly make > sense. dmidecode bypasses the kernel completely or it did last time > I looked so I don't know why you would be using that to test if > something in the kernel is working. > > However dmidecode failing suggests that the actual problem is something > in the first kernel is stomping the dmi tables. See below. > Adding a command line option won't fix stomped tables. It provides a mechanism, which seems to be absent, to the second kernel to know where to look for SMBIOS tables. > So what I would suggest is: > a) Verify that dmidecode works before kexec. Yes, it does. > b) Test to see if dmidecode works after kexec. No, it doesn't. > c) Once (a) shows that dmidecode works and (b) shows that dmidecode >fails figure out what is stomping your dmi tables during or before >kexec and that is what should get fixed. The problem here as I can see it that EFI and kexec protocols are not friendly to each other. I'm not an expert in either. That's why I'm asking for possible solutions. And this needs to be done in kernel to allow drivers to work. Does the commit 4996c02306a25def1d352ec8e8f48895bbc7dea9 Author: Takao Indoh Date: Thu Jul 14 18:05:21 2011 -0400 ACPI: introduce "acpi_rsdp=" parameter for kdump description shed a light on this? > Now using a non-efi method of dmi detection relies on the > tables being between 0xF and 0x1. AKA the last 64K > of the first 1MiB of memory. You might check to see if your > dmi tables are in that address range. # dmidecode --no-sysfs # dmidecode 3.2 Scanning /dev/mem for entry point. # No SMBIOS nor DMI entry point found, sorry. === with patch applied === # dmidecode ... Release Date: 03/10/2015 ... > > Otherwise I suspect the good solution is to give efi it's own page > tables in the kernel and switch to it whenever efi functions are called. > > But on 32bit the Linux kernel has historically been just fine directly > accessing the hardware, and ignoring efi and all of the other BIOS's. It seems not only for 32-bit Linux kernel anymore. MS Surface 3 runs 64-bit code. > So if that doesn't work on Intel Galileo that is probably a firmware > problem. It's not only about Galileo anymore. -- With Best Regards, Andy Shevchenko ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel
On Mon, Jan 20, 2020 at 11:44 PM Jean Delvare wrote: > > On Mon, 20 Jan 2020 10:04:04 -0600, Eric W. Biederman wrote: > > Second. I looked at your test results and they don't directly make > > sense. dmidecode bypasses the kernel completely or it did last time > > I looked so I don't know why you would be using that to test if > > something in the kernel is working. > > That must have been long ago. A recent version of dmidecode (>= 3.0) > running on a recent kernel > (>= d7f96f97c4031fa4ffdb7801f9aae23e96170a6f, v4.2) will read the DMI > data from /sys/firmware/dmi/tables, so it is very much relying on the > kernel doing the right thing. If not, it will still try to fallback to > reading from /dev/mem directly on certain architectures. You can force > that old method with --no-sysfs. > > Hope that helps, I don't understand how it possible can help for in-kernel code, like DMI quirks in a drivers. -- With Best Regards, Andy Shevchenko ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel
On Mon, 20 Jan 2020 10:04:04 -0600, Eric W. Biederman wrote: > Second. I looked at your test results and they don't directly make > sense. dmidecode bypasses the kernel completely or it did last time > I looked so I don't know why you would be using that to test if > something in the kernel is working. That must have been long ago. A recent version of dmidecode (>= 3.0) running on a recent kernel (>= d7f96f97c4031fa4ffdb7801f9aae23e96170a6f, v4.2) will read the DMI data from /sys/firmware/dmi/tables, so it is very much relying on the kernel doing the right thing. If not, it will still try to fallback to reading from /dev/mem directly on certain architectures. You can force that old method with --no-sysfs. Hope that helps, -- Jean Delvare SUSE L3 Support ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Contact Diplomatic Agent, Mr. Mcclaine John to receive your ATM CARD valued the sum of $12.8Million United States Dollars
Attn: Dear Beneficiary, I wish to inform you that the diplomatic agent conveying your ATM CARD valued the sum of $12.8Million United States Dollars has misplaced your address and he is currently stranded at (George Bush International Airport) Houston Texas USA now We required you to reconfirm the following information's below to him so that he can deliver your Payment CARD to you today or tomorrow morning as information provided with open communications via email and telephone for security reasons. HERE IS THE DETAILS HE NEED FROM YOU URGENT YOUR FULL NAME: ADDRESS: MOBILE NO: NAME OF YOUR NEAREST AIRPORT: A COPY OF YOUR IDENTIFICATION : Note; do contact the diplomatic agent immediately through the information's listed below Contact Person: Diplomatic Agent, Mr. Mcclaine John EMAIL: mcclainejohn...@gmail.com Tel:(223) 777-7518 Contact the diplomatic agent immediately because he is waiting to hear from you today with the needed information's. NOTE: The Diplomatic agent does not know that the content of the consignment box is $12.800,000,00 Million United States Dollars and on no circumstances should you let him know the content. The consignment was moved from here as family treasures, so never allow him to open the box. Please I have paid delivery fees for you but the only money you must send to Mcclaine John is your ATM CARD delivery fee $25.00 only. text Him as you contact Him Immediately Thanks, with Regards. Prof, William Roberts Director DHL COURIER SERVICES-Benin ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel
Andy Shevchenko writes: > On Sat, Dec 17, 2016 at 06:57:21PM +0800, Dave Young wrote: >> Ccing efi people. >> >> On 12/16/16 at 02:33pm, Jean Delvare wrote: >> > On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote: >> > > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote: >> > > > On 12/15/16 at 12:28pm, Jean Delvare wrote: >> > > > > I am no kexec expert but this confuses me. Shouldn't the second >> > > > > kernel have access to the EFI systab as the first kernel does? It >> > > > > includes many more pointers than just ACPI and DMI tables, and it >> > > > > would seem inconvenient to have to pass all these addresses >> > > > > individually explicitly. >> > > > >> > > > Yes, in modern linux kernel, kexec has the support for EFI, I think it >> > > > should work naturally at least in x86_64. >> > > >> > > Thanks for this good news! >> > > >> > > Unfortunately Intel Galileo is 32-bit platform. >> > >> > If it was done for X86_64 then maybe it can be generalized to X86? >> >> For X86_64, we have a new way for efi runtime memmory mapping, in i386 >> code it still use old ioremap way. It is impossible to use same way as >> the X86_64 since the virtual address space is limited. >> >> But maybe for 32bit, kexec kernel can run in physical mode, but I'm not >> sure, I would suggest Andy to do a test first with efi=noruntime for >> kexec 2nd kernel. > > Guys, it was quite a long no hear from you. As I told you the proposed work > around didn't help. Today I found that Microsoft Surface 3 also affected > by this. > > Can we apply these patches for now until you will find better > solution? Not a chance. The patches don't apply to any kernel in the git history. Which may be part of your problem. You are or at least were running with code that has not been merged upstream. > P.S. I may resend them rebased on recent vanilla. Second. I looked at your test results and they don't directly make sense. dmidecode bypasses the kernel completely or it did last time I looked so I don't know why you would be using that to test if something in the kernel is working. However dmidecode failing suggests that the actual problem is something in the first kernel is stomping the dmi tables. Adding a command line option won't fix stomped tables. So what I would suggest is: a) Verify that dmidecode works before kexec. b) Test to see if dmidecode works after kexec. c) Once (a) shows that dmidecode works and (b) shows that dmidecode fails figure out what is stomping your dmi tables during or before kexec and that is what should get fixed. Now using a non-efi method of dmi detection relies on the tables being between 0xF and 0x1. AKA the last 64K of the first 1MiB of memory. You might check to see if your dmi tables are in that address range. Otherwise I suspect the good solution is to give efi it's own page tables in the kernel and switch to it whenever efi functions are called. But on 32bit the Linux kernel has historically been just fine directly accessing the hardware, and ignoring efi and all of the other BIOS's. So if that doesn't work on Intel Galileo that is probably a firmware problem. Eric ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec -e not working: root disk not able to detect
On Mon, Jan 13, 2020 at 10:28 AM Prabhakar Kushwaha wrote: > > [Re sending keeping mailing list and others in CC] > > On Fri, Jan 10, 2020 at 5:56 AM Bjorn Helgaas wrote: > > > > [+cc Jens, ahci.c maintainer] > > > > On Mon, Jan 06, 2020 at 05:24:44PM +0530, Prabhakar Kushwaha wrote: > > > Hi All, > > > > > > I am trying kexec -e with latest kernel i.e. Linux-5.5.0-rc4. Here > > > second kernel is not able to detect/mount hard-disk having root file > > > system (INTEL SSDSC2BB240G7). > > > > > > [ 279.690575] ata1: softreset failed (1st FIS failed) > > > [ 279.695446] ata1: limiting SATA link speed to 3.0 Gbps > > > [ 280.910020] ata1: SATA link down (SStatus 0 SControl 320) > > > [ 282.626018] ata1: SATA link down (SStatus 0 SControl 300) > > > [ 282.631409] ata1: link online but 1 devices misclassified, retrying > > > [ 282.637665] ata1: reset failed (errno=-11), retrying in 9 secs > > > [ 298.294546] ata1: failed to reset engine (errno=-5) > > > [ 302.042967] ata1: softreset failed (1st FIS failed) > > > [ 308.798609] ata1: failed to reset engine (errno=-5) > > > [ 337.546605] ata1: softreset failed (1st FIS failed) > > > [ 337.551475] ata1: limiting SATA link speed to 3.0 Gbps > > > [ 338.766022] ata1: SATA link down (SStatus 0 SControl 320) > > > [ 339.270943] ata1: EH pending after 5 tries, giving up > > > > > > I found following two workaround for this issue. > > > A) Define ".shutdown" in driver/ata/ahci.c. > > > > > > reboot --> kernel_kexec() --> kernel_restart_prepare() --> > > > device_shutdown() --> pci_device_shutdown() --> ahci_shutdown_one() > > > --> new function > > > > > > diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c > > > index 4bfd1b14b390..50a101002885 100644 > > > --- a/drivers/ata/ahci.c > > > +++ b/drivers/ata/ahci.c > > > @@ -81,6 +81,7 @@ enum board_ids { > > > > > > static int ahci_init_one(struct pci_dev *pdev, const struct > > > pci_device_id *ent); > > > static void ahci_remove_one(struct pci_dev *dev); > > > +static void ahci_shutdown_one(struct pci_dev *dev); > > > static int ahci_vt8251_hardreset(struct ata_link *link, unsigned int > > > *class, > > > unsigned long deadline); > > > static int ahci_avn_hardreset(struct ata_link *link, unsigned int > > > *class, > > > @@ -606,6 +607,7 @@ static struct pci_driver ahci_pci_driver = { > > > .id_table = ahci_pci_tbl, > > > .probe = ahci_init_one, > > > .remove = ahci_remove_one, > > > + .shutdown = ahci_shutdown_one, > > > .driver = { > > > .pm = _pci_pm_ops, > > > }, > > > > > > +static void ahci_shutdown_one(struct pci_dev *pdev) > > > +{ > > > + pm_runtime_get_noresume(>dev); > > > + ata_pci_remove_one(pdev); > > > +} > > > + > > > Note: After defining shutdown, error related to file-system write > > > seen. Looks like even after device_shutdown, file system related > > > transaction goes to disk. > > > > > > B)) Commenting of pci_clear_master() from pci_device_shutdown() > > > reboot --> kernel_kexec() --> kernel_restart_prepare() --> > > > device_shutdown() --> pci_device_shutdown() > > > > > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c > > > index 0454ca0e4e3f..ddffaa9321bb 100644 > > > --- a/drivers/pci/pci-driver.c > > > +++ b/drivers/pci/pci-driver.c > > > @@ -481,8 +481,10 @@ static void pci_device_shutdown(struct device *dev) > > > /* > > > * If this is a kexec reboot, turn off Bus Master bit on the > > > @@ -491,8 +493,16 @@ static void pci_device_shutdown(struct device *dev) > > > * If it is not a kexec reboot, firmware will hit the PCI > > > * devices with big hammer and stop their DMA any way. > > > */ > > > > > > - if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot)) > > > -pci_clear_master(pci_dev); > > > > I doubt we would remove this without a much clearer justification. > > > > > Here pci_dev current_state. It is "0" i.e. D0. > > > > > > From A and B. Looks like even after pci_clear_master(), Some DMA > > > transactions going on PCIe device causing device in unstable. > > > Not sure if this is the reason and how to solve this problem. > > > > Is it possible the ahci driver depends on receiving the device with > > bus mastering already enabled? I would guess that would be the common > > case in a non-kexec boot -- the BIOS probably hands off the device > > with bus mastering enabled. > > > > Above code clearing Bus Master. May be it try to make sure that next > kernel get PCIe in same state in which BIOS could have provided. > Not sure why this is causing issue in our case. Need to debug more. > I added more prints and added stack dump in ata_scsi_queuecmd(). I found that lots of syscall __arm64_sys_fsync() happens for file system sync during kexec -e. These calls happens even after
Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel
On Sat, Dec 17, 2016 at 06:57:21PM +0800, Dave Young wrote: > Ccing efi people. > > On 12/16/16 at 02:33pm, Jean Delvare wrote: > > On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote: > > > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote: > > > > On 12/15/16 at 12:28pm, Jean Delvare wrote: > > > > > I am no kexec expert but this confuses me. Shouldn't the second > > > > > kernel have access to the EFI systab as the first kernel does? It > > > > > includes many more pointers than just ACPI and DMI tables, and it > > > > > would seem inconvenient to have to pass all these addresses > > > > > individually explicitly. > > > > > > > > Yes, in modern linux kernel, kexec has the support for EFI, I think it > > > > should work naturally at least in x86_64. > > > > > > Thanks for this good news! > > > > > > Unfortunately Intel Galileo is 32-bit platform. > > > > If it was done for X86_64 then maybe it can be generalized to X86? > > For X86_64, we have a new way for efi runtime memmory mapping, in i386 > code it still use old ioremap way. It is impossible to use same way as > the X86_64 since the virtual address space is limited. > > But maybe for 32bit, kexec kernel can run in physical mode, but I'm not > sure, I would suggest Andy to do a test first with efi=noruntime for > kexec 2nd kernel. Guys, it was quite a long no hear from you. As I told you the proposed work around didn't help. Today I found that Microsoft Surface 3 also affected by this. Can we apply these patches for now until you will find better solution? P.S. I may resend them rebased on recent vanilla. -- With Best Regards, Andy Shevchenko ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCHv2] mm/sparse: reset section's mem_map when fully deactivated
On Mon, Jan 20, 2020 at 5:03 PM David Hildenbrand wrote: > [...] > I think you dropped my > > Acked-by: David Hildenbrand > Sorry to forget it. Thanks for your kind review. Regards, Pingfan ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] makedumpfile: cope with not-present mem section
On 01/20/2020 04:59 PM, Baoquan He wrote: > On 01/20/20 at 09:33am, David Hildenbrand wrote: >> >> >>> Am 20.01.2020 um 03:25 schrieb Pingfan Liu : >>> >>> After kernel commit ba72b4c8cf60 ("mm/sparsemem: support sub-section >>> hotplug"), when hot-removed, section_mem_map is still encoded with section >>> start pfn, not NULL. This break the current makedumpfile. >>> >>> Whatever section_mem_map coding info after hot-removed, it is reliable >>> just to work on SECTION_MARKED_PRESENT bit. Fixing makedumpfile by this >>> way. >>> >>> Signed-off-by: Pingfan Liu >>> To: kexec@lists.infradead.org >>> Cc: Kazuhito Hagio >>> Cc: Baoquan He >>> Cc: David Hildenbrand >>> Cc: Andrew Morton >>> Cc: Dan Williams >>> Cc: Oscar Salvador >>> Cc: Michal Hocko >>> Cc: Qian Cai >>> --- >>> makedumpfile.c | 6 +- >>> 1 file changed, 1 insertion(+), 5 deletions(-) >>> >>> diff --git a/makedumpfile.c b/makedumpfile.c >>> index e290fbd..ab40a58 100644 >>> --- a/makedumpfile.c >>> +++ b/makedumpfile.c >>> @@ -3406,8 +3406,6 @@ section_mem_map_addr(unsigned long addr, unsigned >>> long *map_mask) >>>map = ULONG(mem_section + OFFSET(mem_section.section_mem_map)); >>>mask = SECTION_MAP_MASK; >>>*map_mask = map & ~mask; >>> -if (map == 0x0) >>> -*map_mask |= SECTION_MARKED_PRESENT; This should be a trick to let map==0x0 survive the logic if (!(map_mask & SECTION_MARKED_PRESENT)) { return FALSE; } in validate_mem_section(). >> >> Why was that added in the first place? This looks like dome compat handling >> to me. Are we sure we can remove it? The logic introduced by commit 14876c45aef ("[PATCH makedumpfile] handle mem_section as either a pointer or an array") Combining section_mem_map_addr() and the following logic in validate_mem_section() if (mem_map == 0) { mem_map = NOT_MEMMAP_ADDR; } I reached the same conclusion as Baoquan's. > > > The original code assumes that there are only two kinds of mem_section, > one is empty value, the second is a present one. This removing behaves > the same as the old code, I don't see a problem with it. > > I tried to understand the code, but I don't know why it need call > validate_mem_section() twice and compare the returned value. I think it could be if/else, no need to call twice. Cc Thadeu Lima de Souza Cascardo, who contributes the original code, and may have some idea about it. Thanks, Pingfan > > > ___ > kexec mailing list > kexec@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kexec > ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCHv2] mm/sparse: reset section's mem_map when fully deactivated
On 20.01.20 08:29, Michal Hocko wrote: > On Mon 20-01-20 10:33:14, Pingfan Liu wrote: >> After commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug"), >> when a mem section is fully deactivated, section_mem_map still records the >> section's start pfn, which is not used any more and will be reassigned >> during re-added. >> >> In analogy with alloc/free pattern, it is better to clear all fields of >> section_mem_map. >> >> Beside this, it breaks the user space tool "makedumpfile" [1], which makes >> assumption that a hot-removed section has mem_map as NULL, instead of >> checking directly against SECTION_MARKED_PRESENT bit. (makedumpfile will be >> better to change the assumption, and need a patch) >> >> The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , >> trigger a crash, and save vmcore by makedumpfile > > While makedumpfile lives very closely to the kernel and occasional > breakage is to be expected I still believe that Fixes: ba72b4c8cf60 > is due. +1 as I expressed earlier. > >> [1]: makedumpfile, commit e73016540293 ("[v1.6.7] Update version") >> >> Signed-off-by: Pingfan Liu >> To: linux...@kvack.org >> Cc: Andrew Morton >> Cc: David Hildenbrand >> Cc: Dan Williams >> Cc: Oscar Salvador >> Cc: Michal Hocko >> Cc: Baoquan He >> Cc: Qian Cai >> Cc: kexec@lists.infradead.org >> Cc: Kazuhito Hagio > I think you dropped my Acked-by: David Hildenbrand -- Thanks, David / dhildenb ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] makedumpfile: cope with not-present mem section
On 01/20/20 at 09:33am, David Hildenbrand wrote: > > > > Am 20.01.2020 um 03:25 schrieb Pingfan Liu : > > > > After kernel commit ba72b4c8cf60 ("mm/sparsemem: support sub-section > > hotplug"), when hot-removed, section_mem_map is still encoded with section > > start pfn, not NULL. This break the current makedumpfile. > > > > Whatever section_mem_map coding info after hot-removed, it is reliable > > just to work on SECTION_MARKED_PRESENT bit. Fixing makedumpfile by this > > way. > > > > Signed-off-by: Pingfan Liu > > To: kexec@lists.infradead.org > > Cc: Kazuhito Hagio > > Cc: Baoquan He > > Cc: David Hildenbrand > > Cc: Andrew Morton > > Cc: Dan Williams > > Cc: Oscar Salvador > > Cc: Michal Hocko > > Cc: Qian Cai > > --- > > makedumpfile.c | 6 +- > > 1 file changed, 1 insertion(+), 5 deletions(-) > > > > diff --git a/makedumpfile.c b/makedumpfile.c > > index e290fbd..ab40a58 100644 > > --- a/makedumpfile.c > > +++ b/makedumpfile.c > > @@ -3406,8 +3406,6 @@ section_mem_map_addr(unsigned long addr, unsigned > > long *map_mask) > >map = ULONG(mem_section + OFFSET(mem_section.section_mem_map)); > >mask = SECTION_MAP_MASK; > >*map_mask = map & ~mask; > > -if (map == 0x0) > > -*map_mask |= SECTION_MARKED_PRESENT; > > Why was that added in the first place? This looks like dome compat handling > to me. Are we sure we can remove it? The original code assumes that there are only two kinds of mem_section, one is empty value, the second is a present one. This removing behaves the same as the old code, I don't see a problem with it. I tried to understand the code, but I don't know why it need call validate_mem_section() twice and compare the returned value. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] makedumpfile: cope with not-present mem section
> Am 20.01.2020 um 03:25 schrieb Pingfan Liu : > > After kernel commit ba72b4c8cf60 ("mm/sparsemem: support sub-section > hotplug"), when hot-removed, section_mem_map is still encoded with section > start pfn, not NULL. This break the current makedumpfile. > > Whatever section_mem_map coding info after hot-removed, it is reliable > just to work on SECTION_MARKED_PRESENT bit. Fixing makedumpfile by this > way. > > Signed-off-by: Pingfan Liu > To: kexec@lists.infradead.org > Cc: Kazuhito Hagio > Cc: Baoquan He > Cc: David Hildenbrand > Cc: Andrew Morton > Cc: Dan Williams > Cc: Oscar Salvador > Cc: Michal Hocko > Cc: Qian Cai > --- > makedumpfile.c | 6 +- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/makedumpfile.c b/makedumpfile.c > index e290fbd..ab40a58 100644 > --- a/makedumpfile.c > +++ b/makedumpfile.c > @@ -3406,8 +3406,6 @@ section_mem_map_addr(unsigned long addr, unsigned long > *map_mask) >map = ULONG(mem_section + OFFSET(mem_section.section_mem_map)); >mask = SECTION_MAP_MASK; >*map_mask = map & ~mask; > -if (map == 0x0) > -*map_mask |= SECTION_MARKED_PRESENT; Why was that added in the first place? This looks like dome compat handling to me. Are we sure we can remove it? ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec