Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel

2020-01-20 Thread Ard Biesheuvel
On Mon, 20 Jan 2020 at 23:31, Andy Shevchenko  wrote:
>
> On Mon, Jan 20, 2020 at 9:28 PM Eric W. Biederman  
> wrote:
> > Andy Shevchenko  writes:
> > > On Sat, Dec 17, 2016 at 06:57:21PM +0800, Dave Young wrote:
> > >> Ccing efi people.
> > >>
> > >> On 12/16/16 at 02:33pm, Jean Delvare wrote:
> > >> > On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote:
> > >> > > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote:
> > >> > > > On 12/15/16 at 12:28pm, Jean Delvare wrote:
> > >> > > > > I am no kexec expert but this confuses me. Shouldn't the second
> > >> > > > > kernel have access to the EFI systab as the first kernel does? It
> > >> > > > > includes many more pointers than just ACPI and DMI tables, and it
> > >> > > > > would seem inconvenient to have to pass all these addresses
> > >> > > > > individually explicitly.
> > >> > > >
> > >> > > > Yes, in modern linux kernel, kexec has the support for EFI, I 
> > >> > > > think it
> > >> > > > should work naturally at least in x86_64.
> > >> > >
> > >> > > Thanks for this good news!
> > >> > >
> > >> > > Unfortunately Intel Galileo is 32-bit platform.
> > >> >
> > >> > If it was done for X86_64 then maybe it can be generalized to X86?
> > >>
> > >> For X86_64, we have a new way for efi runtime memmory mapping, in i386
> > >> code it still use old ioremap way. It is impossible to use same way as
> > >> the X86_64 since the virtual address space is limited.
> > >>
> > >> But maybe for 32bit, kexec kernel can run in physical mode, but I'm not
> > >> sure, I would suggest Andy to do a test first with efi=noruntime for
> > >> kexec 2nd kernel.
> > >
> > > Guys, it was quite a long no hear from you. As I told you the proposed 
> > > work
> > > around didn't help. Today I found that Microsoft Surface 3 also affected
> > > by this.
> > >
> > > Can we apply these patches for now until you will find better
> > > solution?
> >
> > Not a chance.  The patches don't apply to any kernel in the git history.
> >
> > Which may be part of your problem.  You are or at least were running
> > with code that has not been merged upstream.
>
> It's done against linux-next.
> Applied clearly. (Not the version in this more than yearly old series
> of course, that's why I told I can resend)
>
> > > P.S. I may resend them rebased on recent vanilla.
> >
> > Second.  I looked at your test results and they don't directly make
> > sense.  dmidecode bypasses the kernel completely or it did last time
> > I looked so I don't know why you would be using that to test if
> > something in the kernel is working.
> >
> > However dmidecode failing suggests that the actual problem is something
> > in the first kernel is stomping the dmi tables.
>
> See below.
>
> > Adding a command line option won't fix stomped tables.
>
> It provides a mechanism, which seems to be absent, to the second
> kernel to know where to look for SMBIOS tables.
>
> > So what I would suggest is:
> > a) Verify that dmidecode works before kexec.
>
> Yes, it does.
>
> > b) Test to see if dmidecode works after kexec.
>
> No, it doesn't.
>
> > c) Once (a) shows that dmidecode works and (b) shows that dmidecode
> >fails figure out what is stomping your dmi tables during or before
> >kexec and that is what should get fixed.
>
> The problem here as I can see it that EFI and kexec protocols are not
> friendly to each other.
> I'm not an expert in either. That's why I'm asking for possible
> solutions. And this needs to be done in kernel to allow drivers to
> work.
>
> Does the
>
> commit 4996c02306a25def1d352ec8e8f48895bbc7dea9
> Author: Takao Indoh 
> Date:   Thu Jul 14 18:05:21 2011 -0400
>
> ACPI: introduce "acpi_rsdp=" parameter for kdump
>
> description shed a light on this?
>
> > Now using a non-efi method of dmi detection relies on the
> > tables being between 0xF and 0x1. AKA the last 64K
> > of the first 1MiB of memory.  You might check to see if your
> > dmi tables are in that address range.
>
> # dmidecode --no-sysfs
> # dmidecode 3.2
> Scanning /dev/mem for entry point.
> # No SMBIOS nor DMI entry point found, sorry.
>
> === with patch applied ===
> # dmidecode
> ...
> Release Date: 03/10/2015
> ...
>
> >
> > Otherwise I suspect the good solution is to give efi it's own page
> > tables in the kernel and switch to it whenever efi functions are called.
> >
>
> > But on 32bit the Linux kernel has historically been just fine directly
> > accessing the hardware, and ignoring efi and all of the other BIOS's.
>
> It seems not only for 32-bit Linux kernel anymore. MS Surface 3 runs
> 64-bit code.
>
> > So if that doesn't work on Intel Galileo that is probably a firmware
> > problem.
>
> It's not only about Galileo anymore.
>

Looking at the x86 kexec EFI code, it seems that it has special
handling for the legacy SMBIOS table address, but not for the SMBIOS3
table address, which was introduced to accommodate SMBIOS tables
living in memory that is not 32-bit addressable.

Could anyone check 

Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel

2020-01-20 Thread Andy Shevchenko
On Mon, Jan 20, 2020 at 9:28 PM Eric W. Biederman  wrote:
> Andy Shevchenko  writes:
> > On Sat, Dec 17, 2016 at 06:57:21PM +0800, Dave Young wrote:
> >> Ccing efi people.
> >>
> >> On 12/16/16 at 02:33pm, Jean Delvare wrote:
> >> > On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote:
> >> > > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote:
> >> > > > On 12/15/16 at 12:28pm, Jean Delvare wrote:
> >> > > > > I am no kexec expert but this confuses me. Shouldn't the second
> >> > > > > kernel have access to the EFI systab as the first kernel does? It
> >> > > > > includes many more pointers than just ACPI and DMI tables, and it
> >> > > > > would seem inconvenient to have to pass all these addresses
> >> > > > > individually explicitly.
> >> > > >
> >> > > > Yes, in modern linux kernel, kexec has the support for EFI, I think 
> >> > > > it
> >> > > > should work naturally at least in x86_64.
> >> > >
> >> > > Thanks for this good news!
> >> > >
> >> > > Unfortunately Intel Galileo is 32-bit platform.
> >> >
> >> > If it was done for X86_64 then maybe it can be generalized to X86?
> >>
> >> For X86_64, we have a new way for efi runtime memmory mapping, in i386
> >> code it still use old ioremap way. It is impossible to use same way as
> >> the X86_64 since the virtual address space is limited.
> >>
> >> But maybe for 32bit, kexec kernel can run in physical mode, but I'm not
> >> sure, I would suggest Andy to do a test first with efi=noruntime for
> >> kexec 2nd kernel.
> >
> > Guys, it was quite a long no hear from you. As I told you the proposed work
> > around didn't help. Today I found that Microsoft Surface 3 also affected
> > by this.
> >
> > Can we apply these patches for now until you will find better
> > solution?
>
> Not a chance.  The patches don't apply to any kernel in the git history.
>
> Which may be part of your problem.  You are or at least were running
> with code that has not been merged upstream.

It's done against linux-next.
Applied clearly. (Not the version in this more than yearly old series
of course, that's why I told I can resend)

> > P.S. I may resend them rebased on recent vanilla.
>
> Second.  I looked at your test results and they don't directly make
> sense.  dmidecode bypasses the kernel completely or it did last time
> I looked so I don't know why you would be using that to test if
> something in the kernel is working.
>
> However dmidecode failing suggests that the actual problem is something
> in the first kernel is stomping the dmi tables.

See below.

> Adding a command line option won't fix stomped tables.

It provides a mechanism, which seems to be absent, to the second
kernel to know where to look for SMBIOS tables.

> So what I would suggest is:
> a) Verify that dmidecode works before kexec.

Yes, it does.

> b) Test to see if dmidecode works after kexec.

No, it doesn't.

> c) Once (a) shows that dmidecode works and (b) shows that dmidecode
>fails figure out what is stomping your dmi tables during or before
>kexec and that is what should get fixed.

The problem here as I can see it that EFI and kexec protocols are not
friendly to each other.
I'm not an expert in either. That's why I'm asking for possible
solutions. And this needs to be done in kernel to allow drivers to
work.

Does the

commit 4996c02306a25def1d352ec8e8f48895bbc7dea9
Author: Takao Indoh 
Date:   Thu Jul 14 18:05:21 2011 -0400

ACPI: introduce "acpi_rsdp=" parameter for kdump

description shed a light on this?

> Now using a non-efi method of dmi detection relies on the
> tables being between 0xF and 0x1. AKA the last 64K
> of the first 1MiB of memory.  You might check to see if your
> dmi tables are in that address range.

# dmidecode --no-sysfs
# dmidecode 3.2
Scanning /dev/mem for entry point.
# No SMBIOS nor DMI entry point found, sorry.

=== with patch applied ===
# dmidecode
...
Release Date: 03/10/2015
...

>
> Otherwise I suspect the good solution is to give efi it's own page
> tables in the kernel and switch to it whenever efi functions are called.
>

> But on 32bit the Linux kernel has historically been just fine directly
> accessing the hardware, and ignoring efi and all of the other BIOS's.

It seems not only for 32-bit Linux kernel anymore. MS Surface 3 runs
64-bit code.

> So if that doesn't work on Intel Galileo that is probably a firmware
> problem.

It's not only about Galileo anymore.

-- 
With Best Regards,
Andy Shevchenko

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel

2020-01-20 Thread Andy Shevchenko
On Mon, Jan 20, 2020 at 11:44 PM Jean Delvare  wrote:
>
> On Mon, 20 Jan 2020 10:04:04 -0600, Eric W. Biederman wrote:
> > Second.  I looked at your test results and they don't directly make
> > sense.  dmidecode bypasses the kernel completely or it did last time
> > I looked so I don't know why you would be using that to test if
> > something in the kernel is working.
>
> That must have been long ago. A recent version of dmidecode (>= 3.0)
> running on a recent kernel
> (>= d7f96f97c4031fa4ffdb7801f9aae23e96170a6f, v4.2) will read the DMI
> data from /sys/firmware/dmi/tables, so it is very much relying on the
> kernel doing the right thing. If not, it will still try to fallback to
> reading from /dev/mem directly on certain architectures. You can force
> that old method with --no-sysfs.
>
> Hope that helps,

I don't understand how it possible can help for in-kernel code, like
DMI quirks in a drivers.

-- 
With Best Regards,
Andy Shevchenko

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel

2020-01-20 Thread Jean Delvare
On Mon, 20 Jan 2020 10:04:04 -0600, Eric W. Biederman wrote:
> Second.  I looked at your test results and they don't directly make
> sense.  dmidecode bypasses the kernel completely or it did last time
> I looked so I don't know why you would be using that to test if
> something in the kernel is working.

That must have been long ago. A recent version of dmidecode (>= 3.0)
running on a recent kernel
(>= d7f96f97c4031fa4ffdb7801f9aae23e96170a6f, v4.2) will read the DMI
data from /sys/firmware/dmi/tables, so it is very much relying on the
kernel doing the right thing. If not, it will still try to fallback to
reading from /dev/mem directly on certain architectures. You can force
that old method with --no-sysfs.

Hope that helps,
-- 
Jean Delvare
SUSE L3 Support

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Contact Diplomatic Agent, Mr. Mcclaine John to receive your ATM CARD valued the sum of $12.8Million United States Dollars

2020-01-20 Thread Prof, William Roberts
Attn: Dear Beneficiary,

I wish to inform you that the diplomatic agent conveying your ATM CARD
valued the sum of $12.8Million United States Dollars has misplaced
your address and he is currently stranded at (George Bush
International Airport) Houston Texas USA now
We required you to reconfirm the following information's below to him
so that he can deliver your Payment CARD to you today or tomorrow
morning as information provided with open communications via email and
telephone for security reasons.
HERE IS THE DETAILS  HE NEED FROM YOU URGENT
YOUR FULL NAME:
ADDRESS:
MOBILE NO:
NAME OF YOUR NEAREST AIRPORT:
A COPY OF YOUR IDENTIFICATION :

Note; do contact the diplomatic agent immediately through the
information's listed below
Contact Person: Diplomatic Agent, Mr. Mcclaine John
EMAIL: mcclainejohn...@gmail.com
Tel:(223) 777-7518

Contact the diplomatic agent immediately
because he is waiting to hear from you today with the needed information's.

NOTE: The Diplomatic agent does not know that the content of the
consignment box is $12.800,000,00 Million United States Dollars and on
no circumstances should you let him know the content. The consignment
was moved from here as family treasures, so never allow him to open
the box. Please I have paid delivery fees for you but the only money
you must send to Mcclaine John is your ATM CARD delivery fee $25.00
only. text Him as you contact Him Immediately

Thanks,
with Regards.
Prof, William Roberts
Director DHL COURIER SERVICES-Benin

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel

2020-01-20 Thread Eric W. Biederman
Andy Shevchenko  writes:

> On Sat, Dec 17, 2016 at 06:57:21PM +0800, Dave Young wrote:
>> Ccing efi people.
>> 
>> On 12/16/16 at 02:33pm, Jean Delvare wrote:
>> > On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote:
>> > > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote:
>> > > > On 12/15/16 at 12:28pm, Jean Delvare wrote:
>> > > > > I am no kexec expert but this confuses me. Shouldn't the second
>> > > > > kernel have access to the EFI systab as the first kernel does? It
>> > > > > includes many more pointers than just ACPI and DMI tables, and it
>> > > > > would seem inconvenient to have to pass all these addresses
>> > > > > individually explicitly.
>> > > > 
>> > > > Yes, in modern linux kernel, kexec has the support for EFI, I think it
>> > > > should work naturally at least in x86_64.
>> > > 
>> > > Thanks for this good news!
>> > > 
>> > > Unfortunately Intel Galileo is 32-bit platform.
>> > 
>> > If it was done for X86_64 then maybe it can be generalized to X86?
>> 
>> For X86_64, we have a new way for efi runtime memmory mapping, in i386
>> code it still use old ioremap way. It is impossible to use same way as
>> the X86_64 since the virtual address space is limited.
>> 
>> But maybe for 32bit, kexec kernel can run in physical mode, but I'm not
>> sure, I would suggest Andy to do a test first with efi=noruntime for
>> kexec 2nd kernel.
>
> Guys, it was quite a long no hear from you. As I told you the proposed work
> around didn't help. Today I found that Microsoft Surface 3 also affected
> by this.
>
> Can we apply these patches for now until you will find better
> solution?

Not a chance.  The patches don't apply to any kernel in the git history.

Which may be part of your problem.  You are or at least were running
with code that has not been merged upstream.

> P.S. I may resend them rebased on recent vanilla.

Second.  I looked at your test results and they don't directly make
sense.  dmidecode bypasses the kernel completely or it did last time
I looked so I don't know why you would be using that to test if
something in the kernel is working.

However dmidecode failing suggests that the actual problem is something
in the first kernel is stomping the dmi tables.

Adding a command line option won't fix stomped tables.

So what I would suggest is:
a) Verify that dmidecode works before kexec.
b) Test to see if dmidecode works after kexec.
c) Once (a) shows that dmidecode works and (b) shows that dmidecode
   fails figure out what is stomping your dmi tables during or before
   kexec and that is what should get fixed.

Now using a non-efi method of dmi detection relies on the
tables being between 0xF and 0x1. AKA the last 64K
of the first 1MiB of memory.  You might check to see if your
dmi tables are in that address range.

Otherwise I suspect the good solution is to give efi it's own page
tables in the kernel and switch to it whenever efi functions are called.

But on 32bit the Linux kernel has historically been just fine directly
accessing the hardware, and ignoring efi and all of the other BIOS's.
So if that doesn't work on Intel Galileo that is probably a firmware
problem.

Eric


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: kexec -e not working: root disk not able to detect

2020-01-20 Thread Prabhakar Kushwaha
On Mon, Jan 13, 2020 at 10:28 AM Prabhakar Kushwaha
 wrote:
>
> [Re sending keeping mailing list and others in CC]
>
> On Fri, Jan 10, 2020 at 5:56 AM Bjorn Helgaas  wrote:
> >
> > [+cc Jens, ahci.c maintainer]
> >
> > On Mon, Jan 06, 2020 at 05:24:44PM +0530, Prabhakar Kushwaha wrote:
> > > Hi All,
> > >
> > > I am trying kexec -e with latest kernel i.e. Linux-5.5.0-rc4.  Here
> > > second kernel is not able to detect/mount hard-disk having root file
> > > system (INTEL SSDSC2BB240G7).
> > >
> > > [  279.690575] ata1: softreset failed (1st FIS failed)
> > > [  279.695446] ata1: limiting SATA link speed to 3.0 Gbps
> > > [  280.910020] ata1: SATA link down (SStatus 0 SControl 320)
> > > [  282.626018] ata1: SATA link down (SStatus 0 SControl 300)
> > > [  282.631409] ata1: link online but 1 devices misclassified, retrying
> > > [  282.637665] ata1: reset failed (errno=-11), retrying in 9 secs
> > > [  298.294546] ata1: failed to reset engine (errno=-5)
> > > [  302.042967] ata1: softreset failed (1st FIS failed)
> > > [  308.798609] ata1: failed to reset engine (errno=-5)
> > > [  337.546605] ata1: softreset failed (1st FIS failed)
> > > [  337.551475] ata1: limiting SATA link speed to 3.0 Gbps
> > > [  338.766022] ata1: SATA link down (SStatus 0 SControl 320)
> > > [  339.270943] ata1: EH pending after 5 tries, giving up
> > >
> > > I found following two workaround for this issue.
> > > A) Define ".shutdown" in driver/ata/ahci.c.
> > >
> > > reboot --> kernel_kexec() --> kernel_restart_prepare() -->
> > > device_shutdown() --> pci_device_shutdown() --> ahci_shutdown_one()
> > > --> new function
> > >
> > > diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> > > index 4bfd1b14b390..50a101002885 100644
> > > --- a/drivers/ata/ahci.c
> > > +++ b/drivers/ata/ahci.c
> > > @@ -81,6 +81,7 @@ enum board_ids {
> > >
> > >  static int ahci_init_one(struct pci_dev *pdev, const struct
> > > pci_device_id *ent);
> > >  static void ahci_remove_one(struct pci_dev *dev);
> > >  +static void ahci_shutdown_one(struct pci_dev *dev);
> > >  static int ahci_vt8251_hardreset(struct ata_link *link, unsigned int 
> > > *class,
> > >   unsigned long deadline);
> > >   static int ahci_avn_hardreset(struct ata_link *link, unsigned int 
> > > *class,
> > >  @@ -606,6 +607,7 @@ static struct pci_driver ahci_pci_driver = {
> > >  .id_table   = ahci_pci_tbl,
> > >  .probe  = ahci_init_one,
> > >  .remove = ahci_remove_one,
> > >  +   .shutdown   = ahci_shutdown_one,
> > >  .driver = {
> > >  .pm = _pci_pm_ops,
> > >  },
> > >
> > >  +static void ahci_shutdown_one(struct pci_dev *pdev)
> > >  +{
> > >  +   pm_runtime_get_noresume(>dev);
> > >  +   ata_pci_remove_one(pdev);
> > >  +}
> > >  +
> > > Note: After defining shutdown, error related to file-system write
> > > seen. Looks like even after device_shutdown, file system related
> > > transaction goes to disk.
> > >
> > > B)) Commenting of pci_clear_master() from pci_device_shutdown()
> > > reboot --> kernel_kexec() --> kernel_restart_prepare() -->
> > > device_shutdown() --> pci_device_shutdown()
> > >
> > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > > index 0454ca0e4e3f..ddffaa9321bb 100644
> > > --- a/drivers/pci/pci-driver.c
> > > +++ b/drivers/pci/pci-driver.c
> > > @@ -481,8 +481,10 @@ static void pci_device_shutdown(struct device *dev)
> > > /*
> > >  * If this is a kexec reboot, turn off Bus Master bit on the
> > > @@ -491,8 +493,16 @@ static void pci_device_shutdown(struct device *dev)
> > >  * If it is not a kexec reboot, firmware will hit the PCI
> > >  * devices with big hammer and stop their DMA any way.
> > >  */
> > >
> > >  - if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
> > >  -pci_clear_master(pci_dev);
> >
> > I doubt we would remove this without a much clearer justification.
> >
> > > Here pci_dev current_state. It is "0" i.e. D0.
> > >
> > > From A and B. Looks like even after pci_clear_master(), Some DMA
> > > transactions going on PCIe device  causing device in unstable.
> > > Not sure if this is the reason and how to solve this problem.
> >
> > Is it possible the ahci driver depends on receiving the device with
> > bus mastering already enabled?  I would guess that would be the common
> > case in a non-kexec boot -- the BIOS probably hands off the device
> > with bus mastering enabled.
> >
>
> Above code clearing Bus Master. May be it try to make sure that next
> kernel get PCIe in same state in which BIOS could have provided.
> Not sure why this is causing issue in our case. Need to debug more.
>

I added more prints and added stack dump in ata_scsi_queuecmd().
I found that lots of  syscall __arm64_sys_fsync() happens for file
system sync during kexec -e. These calls happens even after 

Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel

2020-01-20 Thread Andy Shevchenko
On Sat, Dec 17, 2016 at 06:57:21PM +0800, Dave Young wrote:
> Ccing efi people.
> 
> On 12/16/16 at 02:33pm, Jean Delvare wrote:
> > On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote:
> > > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote:
> > > > On 12/15/16 at 12:28pm, Jean Delvare wrote:
> > > > > I am no kexec expert but this confuses me. Shouldn't the second
> > > > > kernel have access to the EFI systab as the first kernel does? It
> > > > > includes many more pointers than just ACPI and DMI tables, and it
> > > > > would seem inconvenient to have to pass all these addresses
> > > > > individually explicitly.
> > > > 
> > > > Yes, in modern linux kernel, kexec has the support for EFI, I think it
> > > > should work naturally at least in x86_64.
> > > 
> > > Thanks for this good news!
> > > 
> > > Unfortunately Intel Galileo is 32-bit platform.
> > 
> > If it was done for X86_64 then maybe it can be generalized to X86?
> 
> For X86_64, we have a new way for efi runtime memmory mapping, in i386
> code it still use old ioremap way. It is impossible to use same way as
> the X86_64 since the virtual address space is limited.
> 
> But maybe for 32bit, kexec kernel can run in physical mode, but I'm not
> sure, I would suggest Andy to do a test first with efi=noruntime for
> kexec 2nd kernel.

Guys, it was quite a long no hear from you. As I told you the proposed work
around didn't help. Today I found that Microsoft Surface 3 also affected
by this.

Can we apply these patches for now until you will find better solution?

P.S. I may resend them rebased on recent vanilla.

-- 
With Best Regards,
Andy Shevchenko



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCHv2] mm/sparse: reset section's mem_map when fully deactivated

2020-01-20 Thread Pingfan Liu
On Mon, Jan 20, 2020 at 5:03 PM David Hildenbrand  wrote:
>
[...]
> I think you dropped my
>
> Acked-by: David Hildenbrand 
>
Sorry to forget it.
Thanks for your kind review.

Regards,
Pingfan

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] makedumpfile: cope with not-present mem section

2020-01-20 Thread piliu


On 01/20/2020 04:59 PM, Baoquan He wrote:
> On 01/20/20 at 09:33am, David Hildenbrand wrote:
>>
>>
>>> Am 20.01.2020 um 03:25 schrieb Pingfan Liu :
>>>
>>> After kernel commit ba72b4c8cf60 ("mm/sparsemem: support sub-section
>>> hotplug"), when hot-removed, section_mem_map is still encoded with section
>>> start pfn, not NULL. This break the current makedumpfile.
>>>
>>> Whatever section_mem_map coding info after hot-removed, it is reliable
>>> just to work on SECTION_MARKED_PRESENT bit. Fixing makedumpfile by this
>>> way.
>>>
>>> Signed-off-by: Pingfan Liu 
>>> To: kexec@lists.infradead.org
>>> Cc: Kazuhito Hagio 
>>> Cc: Baoquan He 
>>> Cc: David Hildenbrand 
>>> Cc: Andrew Morton 
>>> Cc: Dan Williams 
>>> Cc: Oscar Salvador 
>>> Cc: Michal Hocko 
>>> Cc: Qian Cai 
>>> ---
>>> makedumpfile.c | 6 +-
>>> 1 file changed, 1 insertion(+), 5 deletions(-)
>>>
>>> diff --git a/makedumpfile.c b/makedumpfile.c
>>> index e290fbd..ab40a58 100644
>>> --- a/makedumpfile.c
>>> +++ b/makedumpfile.c
>>> @@ -3406,8 +3406,6 @@ section_mem_map_addr(unsigned long addr, unsigned 
>>> long *map_mask)
>>>map = ULONG(mem_section + OFFSET(mem_section.section_mem_map));
>>>mask = SECTION_MAP_MASK;
>>>*map_mask = map & ~mask;
>>> -if (map == 0x0)
>>> -*map_mask |= SECTION_MARKED_PRESENT;
This should be a trick to let map==0x0 survive the logic
if (!(map_mask & SECTION_MARKED_PRESENT)) {
return FALSE;
}
in validate_mem_section().
>>
>> Why was that added in the first place? This looks like dome compat handling 
>> to me. Are we sure we can remove it?
The logic introduced by commit 14876c45aef ("[PATCH makedumpfile] handle
mem_section as either a pointer or an array")
Combining section_mem_map_addr() and the following logic in
validate_mem_section()
if (mem_map == 0) {
mem_map = NOT_MEMMAP_ADDR;
}

I reached the same conclusion as Baoquan's.
> 
> 
> The original code assumes that there are only two kinds of mem_section,
> one is empty value, the second is a present one. This removing behaves
> the same as the old code, I don't see a problem with it.
> 
> I tried to understand the code, but I don't know why it need call
> validate_mem_section() twice and compare the returned value.
I think it could be if/else, no need to call twice.
Cc Thadeu Lima de Souza Cascardo, who contributes the original code, and
may have some idea about it.

Thanks,
Pingfan

> 
> 
> ___
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCHv2] mm/sparse: reset section's mem_map when fully deactivated

2020-01-20 Thread David Hildenbrand
On 20.01.20 08:29, Michal Hocko wrote:
> On Mon 20-01-20 10:33:14, Pingfan Liu wrote:
>> After commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug"),
>> when a mem section is fully deactivated, section_mem_map still records the
>> section's start pfn, which is not used any more and will be reassigned
>> during re-added.
>>
>> In analogy with alloc/free pattern, it is better to clear all fields of
>> section_mem_map.
>>
>> Beside this, it breaks the user space tool "makedumpfile" [1], which makes
>> assumption that a hot-removed section has mem_map as NULL, instead of
>> checking directly against SECTION_MARKED_PRESENT bit. (makedumpfile will be
>> better to change the assumption, and need a patch)
>>
>> The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" ,
>> trigger a crash, and save vmcore by makedumpfile
> 
> While makedumpfile lives very closely to the kernel and occasional
> breakage is to be expected I still believe that Fixes: ba72b4c8cf60
> is due.

+1 as I expressed earlier.

> 
>> [1]: makedumpfile, commit e73016540293 ("[v1.6.7] Update version")
>>
>> Signed-off-by: Pingfan Liu 
>> To: linux...@kvack.org
>> Cc: Andrew Morton 
>> Cc: David Hildenbrand 
>> Cc: Dan Williams 
>> Cc: Oscar Salvador 
>> Cc: Michal Hocko 
>> Cc: Baoquan He 
>> Cc: Qian Cai 
>> Cc: kexec@lists.infradead.org
>> Cc: Kazuhito Hagio 
> 
I think you dropped my

Acked-by: David Hildenbrand 


-- 
Thanks,

David / dhildenb


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] makedumpfile: cope with not-present mem section

2020-01-20 Thread Baoquan He
On 01/20/20 at 09:33am, David Hildenbrand wrote:
> 
> 
> > Am 20.01.2020 um 03:25 schrieb Pingfan Liu :
> > 
> > After kernel commit ba72b4c8cf60 ("mm/sparsemem: support sub-section
> > hotplug"), when hot-removed, section_mem_map is still encoded with section
> > start pfn, not NULL. This break the current makedumpfile.
> > 
> > Whatever section_mem_map coding info after hot-removed, it is reliable
> > just to work on SECTION_MARKED_PRESENT bit. Fixing makedumpfile by this
> > way.
> > 
> > Signed-off-by: Pingfan Liu 
> > To: kexec@lists.infradead.org
> > Cc: Kazuhito Hagio 
> > Cc: Baoquan He 
> > Cc: David Hildenbrand 
> > Cc: Andrew Morton 
> > Cc: Dan Williams 
> > Cc: Oscar Salvador 
> > Cc: Michal Hocko 
> > Cc: Qian Cai 
> > ---
> > makedumpfile.c | 6 +-
> > 1 file changed, 1 insertion(+), 5 deletions(-)
> > 
> > diff --git a/makedumpfile.c b/makedumpfile.c
> > index e290fbd..ab40a58 100644
> > --- a/makedumpfile.c
> > +++ b/makedumpfile.c
> > @@ -3406,8 +3406,6 @@ section_mem_map_addr(unsigned long addr, unsigned 
> > long *map_mask)
> >map = ULONG(mem_section + OFFSET(mem_section.section_mem_map));
> >mask = SECTION_MAP_MASK;
> >*map_mask = map & ~mask;
> > -if (map == 0x0)
> > -*map_mask |= SECTION_MARKED_PRESENT;
> 
> Why was that added in the first place? This looks like dome compat handling 
> to me. Are we sure we can remove it?


The original code assumes that there are only two kinds of mem_section,
one is empty value, the second is a present one. This removing behaves
the same as the old code, I don't see a problem with it.

I tried to understand the code, but I don't know why it need call
validate_mem_section() twice and compare the returned value.


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] makedumpfile: cope with not-present mem section

2020-01-20 Thread David Hildenbrand


> Am 20.01.2020 um 03:25 schrieb Pingfan Liu :
> 
> After kernel commit ba72b4c8cf60 ("mm/sparsemem: support sub-section
> hotplug"), when hot-removed, section_mem_map is still encoded with section
> start pfn, not NULL. This break the current makedumpfile.
> 
> Whatever section_mem_map coding info after hot-removed, it is reliable
> just to work on SECTION_MARKED_PRESENT bit. Fixing makedumpfile by this
> way.
> 
> Signed-off-by: Pingfan Liu 
> To: kexec@lists.infradead.org
> Cc: Kazuhito Hagio 
> Cc: Baoquan He 
> Cc: David Hildenbrand 
> Cc: Andrew Morton 
> Cc: Dan Williams 
> Cc: Oscar Salvador 
> Cc: Michal Hocko 
> Cc: Qian Cai 
> ---
> makedumpfile.c | 6 +-
> 1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index e290fbd..ab40a58 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -3406,8 +3406,6 @@ section_mem_map_addr(unsigned long addr, unsigned long 
> *map_mask)
>map = ULONG(mem_section + OFFSET(mem_section.section_mem_map));
>mask = SECTION_MAP_MASK;
>*map_mask = map & ~mask;
> -if (map == 0x0)
> -*map_mask |= SECTION_MARKED_PRESENT;

Why was that added in the first place? This looks like dome compat handling to 
me. Are we sure we can remove it?


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec