RE: [PATCH v3 1/3] open_dump_bitmap: open bitmap file in non-cyclic case

2016-09-20 Thread Atsushi Kumagai
Hello Martin,

>The logic of set_bitmap() requires that a bitmap fd exists in the
>non-cyclic case.
>
>Signed-off-by: Martin Wilck 

I'll merge this version into v1.6.1, thanks for your work.

Regards,
Atsushi Kumagai

>---
> makedumpfile.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/makedumpfile.c b/makedumpfile.c
>index d168dfd..6164468 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -1365,7 +1365,7 @@ open_dump_bitmap(void)
>
>   /* Unnecessary to open */
>   if (!info->working_dir && !info->flag_reassemble && 
> !info->flag_refiltering
>-  && !info->flag_sadump && !info->flag_mem_usage)
>+  && !info->flag_sadump && !info->flag_mem_usage && info->flag_cyclic)
>   return TRUE;
>
>   tmpname = getenv("TMPDIR");
>--
>2.9.3


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v10 RESEND 0/3] Documentation: Add description of enable multi-cpus support for kdump

2016-09-20 Thread Baoquan He
On 09/20/16 at 06:03pm, Jonathan Corbet wrote:
> On Mon, 19 Sep 2016 13:59:46 +0800
> Baoquan He  wrote:
> 
> > This is v10 post. In this patchset patch 1/3 is added to give more details
> > about nr_cpus and maxcpus in kernel-parameters.txt. This is suggested by
> > Jonathan since the description of them is unclear so that people can't see
> > what's the exact difference between them. Otherwise no big further change
> > for 2/3 and 3/3 which comprise the old post.
> 
> So somehow I'd applied 1/3 a while back, but not the other two.  All three
> are in the docs tree now, thanks.

Thanks!


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v10 RESEND 0/3] Documentation: Add description of enable multi-cpus support for kdump

2016-09-20 Thread Jonathan Corbet
On Mon, 19 Sep 2016 13:59:46 +0800
Baoquan He  wrote:

> This is v10 post. In this patchset patch 1/3 is added to give more details
> about nr_cpus and maxcpus in kernel-parameters.txt. This is suggested by
> Jonathan since the description of them is unclear so that people can't see
> what's the exact difference between them. Otherwise no big further change
> for 2/3 and 3/3 which comprise the old post.

So somehow I'd applied 1/3 a while back, but not the other two.  All three
are in the docs tree now, thanks.

jon

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATHC v2 0/9] ima: carry the measurement list across kexec

2016-09-20 Thread Eric W. Biederman

Thiago Jung Bauermann  writes:

> Am Samstag, 17 September 2016, 00:17:37 schrieb Eric W. Biederman:
>> Thiago Jung Bauermann  writes:
>> > Hello Eric,
>> > 
>> > Am Freitag, 16 September 2016, 14:47:13 schrieb Eric W. Biederman:
>> >> I can see tracking to see if the list has changed at some
>> >> point and causing a reboot(LINUX_REBOOT_CMD_KEXEC) to fail.
>> > 
>> > Yes, that is an interesting feature that I can add using the checksum-
>> > verifying part of my code. I can submit a patch for that if there's
>> > interest, adding a reboot notifier that verifies the checksum and causes
>> > a regular reboot instead of a kexec reboot if the checksum fails.
>> 
>> I was thinking an early failure instead of getting all of the way down
>> into a kernel an discovering the tpm/ima subsystem would not
>> initialized.  But where that falls in the reboot pathway I don't expect
>> there is much value in it.
>
> I'm not sure I understand. What I described doesn't involve the tpm or ima. 
> I'm suggesting that if I take the parts of patch 4/5 in the kexec hand-over 
> buffer series that verify the image checksum, I can submit a separate patch 
> that checks the integrity of the kexec image early in kernel_kexec() and 
> reverts to a regular reboot if the check fails. This would be orthogonal to 
> ima carrying its measurement list across kexec.
>
> I think there is value in that, because if the kexec image is corrupted the 
> machine will just get stuck in the purgatory and (unless it's a platform 
> where the purgatory can print to the console) without even an error message 
> explaining what is going on. Whereas if we notice the corruption before 
> jumping into the purgatory we can switch to a regular reboot and the machine 
> will boot successfully.
>
> To have an early failure, when would the checksum verification be done?
> What I can think of is to have kexec_file_load accept a new flag 
> KEXEC_FILE_VERIFY_IMAGE, which userspace could use to request an integrity 
> check when it's about to start the reboot procedure. Then it can decide to 
> either reload the kernel or use a regular reboot if the image is corrupted.
>
> Is this what you had in mind?

Sort of.

I was just thinking that instead of having the boot path verify your ima
list matches what is in the tpm and halting the boot there, we could
test that on reboot.  Which would give a clean failure without the nasty
poking into a prepared image.  The downside is that we have already run
the shutdown scripts so it wouldn't be much cleaner, than triggering a
machine reboot from elsewhere.

But I don't think we should spend too much time on that.  It was a
passing thought.  We should focus on getting a non-poked ima buffer
cleanly into kexec and we can worry about the rest later.

>> >> At least the common bootloader cases that I know of using kexec are
>> >> very
>> >> minimal distributions that live in a ramdisk and as such it should be
>> >> very straight forward to measure what is needed at or before
>> >> sys_kexec_load.  But that was completely dismissed as unrealistic so I
>> >> don't have a clue what actual problem you are trying to solve.
>> > 
>> > We are interested in solving the problem in a general way because it
>> > will be useful to us in the future for the case of an arbitrary number
>> > of kexecs (and thus not only a bootloader but also multiple full-blown
>> > distros may be involved in the chain).
>> > 
>> > But you are right that for the use case for which we currently need this
>> > feature it's feasible to measure everything upfront. We can cross the
>> > other bridge when we get there.
>> 
>> Then let's start there.  Passing the measurment list is something that
>> should not be controversial.
>
> Great!
>
>> >> If there is anyway we can start small and not with this big scary
>> >> infrastructure change I would very much prefer it.
>> > 
>> > Sounds good. If we pre-measure everything then the following patches
>> > from my buffer hand-over series are enough:
>> > 
>> > [PATCH v5 2/5] kexec_file: Add buffer hand-over support for the next
>> > kernel [PATCH v5 3/5] powerpc: kexec_file: Add buffer hand-over support
>> > for the next kernel
>> > 
>> > Would you consider including those two?
>> > 
>> > And like I mentioned in the cover letter, patch 1/5 is an interesting
>> > improvement that is worth considering.
>> 
>> So from 10,000 feet I think that is correct.
>> 
>> I am not quite certain why a new mechanism is being invented.  We have
>> other information that is already passed (much of it architecture
>> specific) like the flattened device tree.  If you remove the need to
>> update the information can you just append this information to the
>> flattened device tree without a new special mechanism to pass the data?
>> 
>> I am just reluctant to invent a new mechanism when there is an existing
>> mechanism that looks like it should work without problems.
>
> Michael Ellerman suggested putting the buffer contents inside the device 

RE: RE: Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-09-20 Thread 河合英宏 / KAWAI,HIDEHIRO
Here is the revised commit description reflecting Dave's
comment.  Cc list was copied from -mm version.

From: Hidehiro Kawai 
Subject: x86/panic: replace smp_send_stop() with kdump friendly version in 
panic path

This patch fixes a problem reported by Daniel Walker
(https://lkml.org/lkml/2015/6/24/44).

When kernel panics with crash_kexec_post_notifiers kernel parameter
enabled, other CPUs are stopped by smp_send_stop() instead of
machine_crash_shutdown() in __crash_kexec() path.

  panic()
if crash_kexec_post_notifiers == 1
  smp_send_stop()
  atomic_notifier_call_chain()
  kmsg_dump()
__crash_kexec()
  machine_crash_shutdown()

Different from smp_send_stop(), machine_crash_shutdown() stops other
CPUs with extra works for kdump.  So, if smp_send_stop() stops other
CPUs in advance, these extra works won't be done.  For x86, kdump
routines miss to save other CPUs' registers and disable virtualization
extensions.

To fix this problem, call a new kdump friendly function,
crash_smp_send_stop(), instead of the smp_send_stop() when
crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
weak function, and it just call smp_send_stop().  Architecture
codes should override it so that kdump can work appropriately.
This patch only provides x86-specific version.

For Xen's PV kernel, just keep the current behavior.
As for Dom0, it doesn't use crash_kexec routines, and it relies on
panic notifier chain.  At the end of the chain, a hypercall is
issued which requests the hypervisor to execute kdump.  This means
regardless of crash_kexec_post_notifiers setting, smp_send_stop().
For PV HVM, it would work similarly to baremetal kernels with extra
cleanups for hypervisor.  It doesn't need additional care.

Changes in V4:
- Keep to use smp_send_stop if crash_kexec_post_notifiers is not set
- Rename panic_smp_send_stop to crash_smp_send_stop
- Don't change the behavior for Xen's PV kernel

Changes in V3:
- Revise comments, description, and symbol names

Changes in V2:
- Replace smp_send_stop() call with crash_kexec version which
  saves cpu states and cleans up VMX/SVM
- Drop a fix for Problem 1 at this moment

Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
Link: 
http://lkml.kernel.org/r/20160810080948.11028.15344.st...@sysi4-13.yrl.intra.hitachi.co.jp
Signed-off-by: Hidehiro Kawai 
Reported-by: Daniel Walker 
Cc: Dave Young 
Cc: Baoquan He 
Cc: Vivek Goyal 
Cc: Eric Biederman 
Cc: Masami Hiramatsu 
Cc: Daniel Walker 
Cc: Xunlei Pang 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Borislav Petkov 
Cc: David Vrabel 
Cc: Toshi Kani 
Cc: Ralf Baechle 
Cc: David Daney 
Cc: Aaro Koskinen 
Cc: "Steven J. Hill" 
Cc: Corey Minyard 
Signed-off-by: Andrew Morton 

> -Original Message-
> From: Hidehiro Kawai
> Hi Xunlei,
> 
> > From: Xunlei Pang [mailto:xp...@redhat.com]
> > Sent: Tuesday, September 20, 2016 4:40 PM
> > On 2016/08/15/ at 19:22, Hidehiro Kawai wrote:
> > > Hi Dave,
> > >
> > > Thank you for the review.
> > >
> > >> From: Dave Young [mailto:dyo...@redhat.com]
> > >> Sent: Friday, August 12, 2016 12:17 PM
> > >>
> > >> Thanks for the update.
> > >> On 08/10/16 at 05:09pm, Hidehiro Kawai wrote:
> > >>> Daniel Walker reported problems which happens when
> > >>> crash_kexec_post_notifiers kernel option is enabled
> > >>> (https://lkml.org/lkml/2015/6/24/44).
> > >>>
> > >>> In that case, smp_send_stop() is called before entering kdump routines
> > >>> which assume other CPUs are still online.  As the result, for x86,
> > >>> kdump routines fail to save other CPUs' registers  and disable
> > >>> virtualization extensions.
> > >> Seems you simplified the changelog, but I think a little more details
> > >> will be helpful to understand the patch. You know sometimes lkml.org
> > >> does not work well.
> > > So, I'll try another archives when I post patch set next time.
> >
> > Hi Hidehiro Kawai,
> >
> > What's the status of this patch set, are you going to send an updated 
> > version?
> 
> Sorry, I misunderstood what Dave said, and I thought I don't
> need to revise the patch set.
> 
> Currently, this patch set is in -mm, then -next tree.
> What I need to fix is only commit descriptions, so I'm going to
> post revised descriptions as replies to this thread, and then
> request to Andrew to replace them if there is no objection.
> (or should I resend the whole patch set?)
> 
> Regards,
> Hidehiro Kawai
> 
> > >>> To fix this problem, call a new kdump friendly function,
> > >>> crash_smp_send_stop(), instead of the smp_send_stop() when
> > >>> crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
> > >>> weak function, and it just call smp_send_stop().  Architecture
> > >>> codes should override it so that kdump can work appropriately.
> > >>> This patch only provides x86-specific version.
> > >>>
> > >>> For Xen's PV kernel, just keep the current behavior.
> > >> Could you explain a bit about above Xen PV kernel behavior?
> > 

RE: Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-09-20 Thread 河合英宏 / KAWAI,HIDEHIRO
Dave Young suggested to me to explain the problem in more detail,
so here is the revised commit description.  The patch is now in -mm,
so I copied Cc list from -mm version.  Also I added Corey Minyard's
Tested-by and Reviewed-by.

From: Hidehiro Kawai 
Subject: mips/panic: replace smp_send_stop() with kdump friendly version in 
panic path

This patch fixes the problems reported by Daniel Walker
(https://lkml.org/lkml/2015/6/24/44).

When kernel panics with crash_kexec_post_notifiers kernel parameter
enabled, other CPUs are stopped by smp_send_stop() instead of
machine_crash_shutdown() in __crash_kexec() path.

  panic()
if crash_kexec_post_notifiers == 1
  smp_send_stop()
  atomic_notifier_call_chain()
  kmsg_dump()
__crash_kexec()
  machine_crash_shutdown()
octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs

Different from smp_send_stop(), machine_crash_shutdown() stops other
CPUs with extra works for kdump.  So, if smp_send_stop() stops other
CPUs in advance, these extra works won't be done.  As the result,
kdump routines miss to save other CPUs' registers.  Additionally for
MIPS OCTEON, it misses to stop the watchdog timer.

To fix this problem, call a new kdump friendly function,
crash_smp_send_stop(), instead of the smp_send_stop() when
crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
weak function, and it just call smp_send_stop().  Architecture
codes should override it so that kdump can work appropriately.
This patch provides MIPS version.

Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
Link: 
http://lkml.kernel.org/r/20160810080950.11028.28000.st...@sysi4-13.yrl.intra.hitachi.co.jp
Signed-off-by: Hidehiro Kawai 
Reported-by: Daniel Walker 
Tested-by: Corey Minyard 
Reviewed-by: Corey Minyard 
Cc: Dave Young 
Cc: Baoquan He 
Cc: Vivek Goyal 
Cc: Eric Biederman 
Cc: Masami Hiramatsu 
Cc: Daniel Walker 
Cc: Xunlei Pang 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Borislav Petkov 
Cc: David Vrabel 
Cc: Toshi Kani 
Cc: Ralf Baechle 
Cc: David Daney 
Cc: Aaro Koskinen 
Cc: "Steven J. Hill" 
Signed-off-by: Andrew Morton 

> From: Corey Minyard [mailto:cminy...@mvista.com]
> Sent: Friday, August 19, 2016 6:18 AM
> Sorry this took so long, but I have finally tested this, it seems to
> work fine:
> 
> Tested-by: Corey Minyard 
> Reviewed-by: Corey Minyard 
> 
> On 08/10/2016 03:09 AM, Hidehiro Kawai wrote:
> > Daniel Walker reported problems which happens when
> > crash_kexec_post_notifiers kernel option is enabled
> > (https://lkml.org/lkml/2015/6/24/44).
> >
> > In that case, smp_send_stop() is called before entering kdump routines
> > which assume other CPUs are still online.  As the result, kdump
> > routines fail to save other CPUs' registers.  Additionally for MIPS
> > OCTEON, it misses to stop the watchdog timer.
> >
> > To fix this problem, call a new kdump friendly function,
> > crash_smp_send_stop(), instead of the smp_send_stop() when
> > crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
> > weak function, and it just call smp_send_stop().  Architecture
> > codes should override it so that kdump can work appropriately.
> > This patch provides MIPS version.
> >
> > Reported-by: Daniel Walker 
> > Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" 
> > option)
> > Signed-off-by: Hidehiro Kawai 
> > Cc: Ralf Baechle 
> > Cc: David Daney 
> > Cc: Aaro Koskinen 
> > Cc: "Steven J. Hill" 
> > Cc: Corey Minyard 
> >
> > ---
> > I'm not familiar with MIPS, and I don't have a test environment and
> > just did build tests only.  Please don't apply this patch until
> > someone does enough tests, otherwise simply drop this patch.
> > ---
> >   arch/mips/cavium-octeon/setup.c  |   14 ++
> >   arch/mips/include/asm/kexec.h|1 +
> >   arch/mips/kernel/crash.c |   18 +-
> >   arch/mips/kernel/machine_kexec.c |1 +
> >   4 files changed, 33 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/mips/cavium-octeon/setup.c 
> > b/arch/mips/cavium-octeon/setup.c
> > index cb16fcc..5537f95 100644
> > --- a/arch/mips/cavium-octeon/setup.c
> > +++ b/arch/mips/cavium-octeon/setup.c
> > @@ -267,6 +267,17 @@ static void octeon_crash_shutdown(struct pt_regs *regs)
> > default_machine_crash_shutdown(regs);
> >   }
> >
> > +#ifdef CONFIG_SMP
> > +void octeon_crash_smp_send_stop(void)
> > +{
> > +   int cpu;
> > +
> > +   /* disable watchdogs */
> > +   for_each_online_cpu(cpu)
> > +   cvmx_write_csr(CVMX_CIU_WDOGX(cpu_logical_map(cpu)), 0);
> > +}
> > +#endif
> > +
> >   #endif /* CONFIG_KEXEC */
> >
> >   #ifdef CONFIG_CAVIUM_RESERVE32
> > @@ -911,6 +922,9 @@ void __init prom_init(void)
> > _machine_kexec_shutdown = octeon_shutdown;
> > _machine_crash_shutdown = octeon_crash_shutdown;
> > _machine_kexec_prepare = octeon_kexec_prepare;
> > +#ifdef CONFIG_SMP
> > +   _crash_smp_send_stop = octeon_crash_smp_send_

[PATCH v3 1/3] open_dump_bitmap: open bitmap file in non-cyclic case

2016-09-20 Thread Martin Wilck
The logic of set_bitmap() requires that a bitmap fd exists in the
non-cyclic case.

Signed-off-by: Martin Wilck 
---
 makedumpfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index d168dfd..6164468 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -1365,7 +1365,7 @@ open_dump_bitmap(void)
 
/* Unnecessary to open */
if (!info->working_dir && !info->flag_reassemble && 
!info->flag_refiltering
-   && !info->flag_sadump && !info->flag_mem_usage)
+   && !info->flag_sadump && !info->flag_mem_usage && info->flag_cyclic)
return TRUE;
 
tmpname = getenv("TMPDIR");
-- 
2.9.3


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v3 2/3] move call to open_dump_bitmap() to after call to initial()

2016-09-20 Thread Martin Wilck
When open_files_for_creating_dumpfile() is called, we don't have all
necessary information to determine whether a bitmap file is actually
needed. In particular, we don't know whether info->flag_cyclic will
ultimately be set. This patch moves the call to open_dump_bitmap()
to after initialize() when all flags are known.

For the dump_dmesg() path, no bitmap file is needed.

Signed-off-by: Martin Wilck 
---
 makedumpfile.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 6164468..30e1fa8 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -1480,9 +1480,6 @@ open_files_for_creating_dumpfile(void)
if (!open_dump_memory())
return FALSE;
 
-   if (!open_dump_bitmap())
-   return FALSE;
-
return TRUE;
 }
 
@@ -9747,6 +9744,9 @@ create_dumpfile(void)
if (!initial())
return FALSE;
 
+   if (!open_dump_bitmap())
+   return FALSE;
+
/* create an array of translations from pfn to vmemmap pages */
if (info->flag_excludevm) {
if (find_vmemmap() == FAILED) {
@@ -10917,6 +10917,9 @@ int show_mem_usage(void)
if (!initial())
return FALSE;
 
+   if (!open_dump_bitmap())
+   return FALSE;
+
if (!prepare_bitmap_buffer())
return FALSE;
 
-- 
2.9.3


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v3 3/3] close_dump_bitmap: simplify logic

2016-09-20 Thread Martin Wilck
The boolean expression replicates the logic of open_dump_bitmap().
It's simpler and less error-prone to simply check if fd_bitmap is
valid.

Signed-off-by: Martin Wilck 
---
 makedumpfile.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 30e1fa8..0f5be7f 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -8615,8 +8615,7 @@ close_dump_file(void)
 void
 close_dump_bitmap(void)
 {
-   if (!info->working_dir && !info->flag_reassemble && 
!info->flag_refiltering
-   && !info->flag_sadump && !info->flag_mem_usage)
+   if (info->fd_bitmap < 0)
return;
 
if (close(info->fd_bitmap) < 0)
-- 
2.9.3


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v3 3/3] close_dump_bitmap: simplify logic

2016-09-20 Thread Martin Wilck
Kumagai-san,
On Tue, 2016-09-20 at 09:43 +, Atsushi Kumagai wrote:
> 
> This version looks good, so could you add your Signed-off-by ?

OK, I'll resend the whole v3 series.

Martin



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: [PATCH v3 3/3] close_dump_bitmap: simplify logic

2016-09-20 Thread Atsushi Kumagai
Hello Martin,

>The boolean expression replicates the logic of open_dump_bitmap().
>It's simpler and less error-prone to simply check if fd_bitmap is
>valid.
>
>(I forgot to apply the very change that Petr had asked for in V2 of
>this patch. I'm very sorry. Please apply V3).

This version looks good, so could you add your Signed-off-by ?

Thanks,
Atsushi Kumagai

>---
> makedumpfile.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
>diff --git a/makedumpfile.c b/makedumpfile.c
>index 30e1fa8..0f5be7f 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -8615,8 +8615,7 @@ close_dump_file(void)
> void
> close_dump_bitmap(void)
> {
>-  if (!info->working_dir && !info->flag_reassemble && 
>!info->flag_refiltering
>-  && !info->flag_sadump && !info->flag_mem_usage)
>+  if (info->fd_bitmap < 0)
>   return;
>
>   if (close(info->fd_bitmap) < 0)
>--
>2.9.3

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-09-20 Thread 河合英宏 / KAWAI,HIDEHIRO
Hi Xunlei,

> From: Xunlei Pang [mailto:xp...@redhat.com]
> Sent: Tuesday, September 20, 2016 4:40 PM
> On 2016/08/15/ at 19:22, Hidehiro Kawai wrote:
> > Hi Dave,
> >
> > Thank you for the review.
> >
> >> From: Dave Young [mailto:dyo...@redhat.com]
> >> Sent: Friday, August 12, 2016 12:17 PM
> >>
> >> Thanks for the update.
> >> On 08/10/16 at 05:09pm, Hidehiro Kawai wrote:
> >>> Daniel Walker reported problems which happens when
> >>> crash_kexec_post_notifiers kernel option is enabled
> >>> (https://lkml.org/lkml/2015/6/24/44).
> >>>
> >>> In that case, smp_send_stop() is called before entering kdump routines
> >>> which assume other CPUs are still online.  As the result, for x86,
> >>> kdump routines fail to save other CPUs' registers  and disable
> >>> virtualization extensions.
> >> Seems you simplified the changelog, but I think a little more details
> >> will be helpful to understand the patch. You know sometimes lkml.org
> >> does not work well.
> > So, I'll try another archives when I post patch set next time.
> 
> Hi Hidehiro Kawai,
> 
> What's the status of this patch set, are you going to send an updated version?

Sorry, I misunderstood what Dave said, and I thought I don't
need to revise the patch set.

Currently, this patch set is in -mm, then -next tree.
What I need to fix is only commit descriptions, so I'm going to
post revised descriptions as replies to this thread, and then
request to Andrew to replace them if there is no objection.
(or should I resend the whole patch set?)

Regards,
Hidehiro Kawai

> >>> To fix this problem, call a new kdump friendly function,
> >>> crash_smp_send_stop(), instead of the smp_send_stop() when
> >>> crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
> >>> weak function, and it just call smp_send_stop().  Architecture
> >>> codes should override it so that kdump can work appropriately.
> >>> This patch only provides x86-specific version.
> >>>
> >>> For Xen's PV kernel, just keep the current behavior.
> >> Could you explain a bit about above Xen PV kernel behavior?
> >>
> >> BTW, this version looks better,  I think I'm fine with this version
> >> besides of the questions about changelog.
> > As for Dom0 kernel, it doesn't use crash_kexec routines, and
> > it relies on panic notifier chain.  At the end of the chain,
> > xen_panic_event is called, and it issues a hypercall which
> > requests Hypervisor to execute kdump.  This means whether
> > crash_kexec_panic_notifiers is set or not, panic notifiers
> > are called after smp_send_stop.  Even if we save registers
> > in Dom0 kernel, they seem to be ignored (Hypervisor is responsible
> > for that).  This is why I kept the current behavior for Xen.
> >
> > For PV DomU kernel, kdump is not supported.  For PV HVM
> > DomU, I'm not sure what will happen on panic because I
> > couldn't boot PV HVM DomU and test it.  But I think it will
> > work similarly to baremetal kernels with extra cleanups
> > for Hypervisor.
> >
> > Best regards,
> >
> > Hidehiro Kawai
> >
> >>> Changes in V4:
> >>> - Keep to use smp_send_stop if crash_kexec_post_notifiers is not set
> >>> - Rename panic_smp_send_stop to crash_smp_send_stop
> >>> - Don't change the behavior for Xen's PV kernel
> >>>
> >>> Changes in V3:
> >>> - Revise comments, description, and symbol names
> >>>
> >>> Changes in V2:
> >>> - Replace smp_send_stop() call with crash_kexec version which
> >>>   saves cpu states and cleans up VMX/SVM
> >>> - Drop a fix for Problem 1 at this moment
> >>>
> >>> Reported-by: Daniel Walker 
> >>> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" 
> >>> option)
> >>> Signed-off-by: Hidehiro Kawai 
> >>> Cc: Dave Young 
> >>> Cc: Baoquan He 
> >>> Cc: Vivek Goyal 
> >>> Cc: Eric Biederman 
> >>> Cc: Masami Hiramatsu 
> >>> Cc: Daniel Walker 
> >>> Cc: Xunlei Pang 
> >>> Cc: Thomas Gleixner 
> >>> Cc: Ingo Molnar 
> >>> Cc: "H. Peter Anvin" 
> >>> Cc: Borislav Petkov 
> >>> Cc: David Vrabel 
> >>> Cc: Toshi Kani 
> >>> Cc: Andrew Morton 
> >>> ---
> >>>  arch/x86/include/asm/kexec.h |1 +
> >>>  arch/x86/include/asm/smp.h   |1 +
> >>>  arch/x86/kernel/crash.c  |   22 +---
> >>>  arch/x86/kernel/smp.c|5 
> >>>  kernel/panic.c   |   47 
> >>> --
> >>>  5 files changed, 66 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> >>> index d2434c1..282630e 100644
> >>> --- a/arch/x86/include/asm/kexec.h
> >>> +++ b/arch/x86/include/asm/kexec.h
> >>> @@ -210,6 +210,7 @@ struct kexec_entry64_regs {
> >>>
> >>>  typedef void crash_vmclear_fn(void);
> >>>  extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
> >>> +extern void kdump_nmi_shootdown_cpus(void);
> >>>
> >>>  #endif /* __ASSEMBLY__ */
> >>>
> >>> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> >>> index ebd0c16..f70989c 100644
> >>> --- a/arch/x86/include/asm/smp.

Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-09-20 Thread Xunlei Pang
On 2016/08/15/ at 19:22, Hidehiro Kawai wrote:
> Hi Dave,
>
> Thank you for the review.
>
>> From: Dave Young [mailto:dyo...@redhat.com]
>> Sent: Friday, August 12, 2016 12:17 PM
>>
>> Thanks for the update.
>> On 08/10/16 at 05:09pm, Hidehiro Kawai wrote:
>>> Daniel Walker reported problems which happens when
>>> crash_kexec_post_notifiers kernel option is enabled
>>> (https://lkml.org/lkml/2015/6/24/44).
>>>
>>> In that case, smp_send_stop() is called before entering kdump routines
>>> which assume other CPUs are still online.  As the result, for x86,
>>> kdump routines fail to save other CPUs' registers  and disable
>>> virtualization extensions.
>> Seems you simplified the changelog, but I think a little more details
>> will be helpful to understand the patch. You know sometimes lkml.org
>> does not work well.
> So, I'll try another archives when I post patch set next time.

Hi Hidehiro Kawai,

What's the status of this patch set, are you going to send an updated version?

Regards,
Xunlei

>>> To fix this problem, call a new kdump friendly function,
>>> crash_smp_send_stop(), instead of the smp_send_stop() when
>>> crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
>>> weak function, and it just call smp_send_stop().  Architecture
>>> codes should override it so that kdump can work appropriately.
>>> This patch only provides x86-specific version.
>>>
>>> For Xen's PV kernel, just keep the current behavior.
>> Could you explain a bit about above Xen PV kernel behavior?
>>
>> BTW, this version looks better,  I think I'm fine with this version
>> besides of the questions about changelog.
> As for Dom0 kernel, it doesn't use crash_kexec routines, and
> it relies on panic notifier chain.  At the end of the chain,
> xen_panic_event is called, and it issues a hypercall which
> requests Hypervisor to execute kdump.  This means whether
> crash_kexec_panic_notifiers is set or not, panic notifiers
> are called after smp_send_stop.  Even if we save registers
> in Dom0 kernel, they seem to be ignored (Hypervisor is responsible
> for that).  This is why I kept the current behavior for Xen.
>
> For PV DomU kernel, kdump is not supported.  For PV HVM
> DomU, I'm not sure what will happen on panic because I
> couldn't boot PV HVM DomU and test it.  But I think it will
> work similarly to baremetal kernels with extra cleanups
> for Hypervisor.
>
> Best regards,
>
> Hidehiro Kawai
>
>>> Changes in V4:
>>> - Keep to use smp_send_stop if crash_kexec_post_notifiers is not set
>>> - Rename panic_smp_send_stop to crash_smp_send_stop
>>> - Don't change the behavior for Xen's PV kernel
>>>
>>> Changes in V3:
>>> - Revise comments, description, and symbol names
>>>
>>> Changes in V2:
>>> - Replace smp_send_stop() call with crash_kexec version which
>>>   saves cpu states and cleans up VMX/SVM
>>> - Drop a fix for Problem 1 at this moment
>>>
>>> Reported-by: Daniel Walker 
>>> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" 
>>> option)
>>> Signed-off-by: Hidehiro Kawai 
>>> Cc: Dave Young 
>>> Cc: Baoquan He 
>>> Cc: Vivek Goyal 
>>> Cc: Eric Biederman 
>>> Cc: Masami Hiramatsu 
>>> Cc: Daniel Walker 
>>> Cc: Xunlei Pang 
>>> Cc: Thomas Gleixner 
>>> Cc: Ingo Molnar 
>>> Cc: "H. Peter Anvin" 
>>> Cc: Borislav Petkov 
>>> Cc: David Vrabel 
>>> Cc: Toshi Kani 
>>> Cc: Andrew Morton 
>>> ---
>>>  arch/x86/include/asm/kexec.h |1 +
>>>  arch/x86/include/asm/smp.h   |1 +
>>>  arch/x86/kernel/crash.c  |   22 +---
>>>  arch/x86/kernel/smp.c|5 
>>>  kernel/panic.c   |   47 
>>> --
>>>  5 files changed, 66 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
>>> index d2434c1..282630e 100644
>>> --- a/arch/x86/include/asm/kexec.h
>>> +++ b/arch/x86/include/asm/kexec.h
>>> @@ -210,6 +210,7 @@ struct kexec_entry64_regs {
>>>
>>>  typedef void crash_vmclear_fn(void);
>>>  extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
>>> +extern void kdump_nmi_shootdown_cpus(void);
>>>
>>>  #endif /* __ASSEMBLY__ */
>>>
>>> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
>>> index ebd0c16..f70989c 100644
>>> --- a/arch/x86/include/asm/smp.h
>>> +++ b/arch/x86/include/asm/smp.h
>>> @@ -50,6 +50,7 @@ struct smp_ops {
>>> void (*smp_cpus_done)(unsigned max_cpus);
>>>
>>> void (*stop_other_cpus)(int wait);
>>> +   void (*crash_stop_other_cpus)(void);
>>> void (*smp_send_reschedule)(int cpu);
>>>
>>> int (*cpu_up)(unsigned cpu, struct task_struct *tidle);
>>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>>> index 9616cf7..650830e 100644
>>> --- a/arch/x86/kernel/crash.c
>>> +++ b/arch/x86/kernel/crash.c
>>> @@ -133,15 +133,31 @@ static void kdump_nmi_callback(int cpu, struct 
>>> pt_regs *regs)
>>> disable_local_APIC();
>>>  }
>>>
>>> -static void kdump_nmi_shootdown_cpus(void)
>>> +void kdump_nmi_shoo

Re: [PATCH v26 6/7] arm64: kdump: update a kernel doc

2016-09-20 Thread AKASHI Takahiro
On Fri, Sep 16, 2016 at 05:08:28PM +0100, James Morse wrote:
> Hi Akashi,
> 
> On 07/09/16 05:29, AKASHI Takahiro wrote:
> > This patch adds arch specific descriptions about kdump usage on arm64
> > to kdump.txt.
> 
> > diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
> 
> > @@ -249,6 +249,13 @@ Dump-capture kernel config options (Arch Dependent, 
> > arm)
> >  
> >  AUTO_ZRELADDR=y
> >  
> > +Dump-capture kernel config options (Arch Dependent, arm64)
> > +--
> > +
> > +- Please note that kvm of the dump-capture kernel will not be enabled
> > +  on non-VHE systems even if it is configured. This is because the CPU
> > +  cannot be reset to EL2 on panic.
> 
> Nit:
> cannot be -> will not be

OK.

> We could try to do this, but its more code that could prevent us reaching the
> kdump kernel, so we choose not to.
> 
> 
> > @@ -370,6 +381,9 @@ For s390x:
> >  For arm:
> > "1 maxcpus=1 reset_devices"
> >  
> > +For arm64:
> > +   "1 maxcpus=1 reset_devices"
> > +
> 
> 'maxcpus=1' is a bit fragile. Since 44dbcc93ab67145 ("arm64: Fix behavior of
> maxcpus=N") udev on ubuntu vivid (running on Juno) has taken it upon itself to
> bring the secondary cores online, even when booted with 'maxcpus=1'.
> 
> Can we change the recomendation to "1 nosmp reset_devices"?

Well, I have no strong opinion here, but I'm not quite sure whether
this change does make any difference in practice.
Looking at kernel/smp.c, the only difference is setup_max_cpus. But
given that arch_disable_smp_support() is null and smp_cpus_done()
ignores "max_cpus" on arm64, I don't think that the change is very
meaningful.

I might miss something.

Thanks,
-Takahiro AKASHI

> 
> Thanks,
> 
> James
> 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v26 3/7] arm64: kdump: add kdump support

2016-09-20 Thread AKASHI Takahiro
On Fri, Sep 16, 2016 at 03:50:24PM +0100, James Morse wrote:
> On 07/09/16 05:29, AKASHI Takahiro wrote:
> > On crash dump kernel, all the information about primary kernel's system
> > memory (core image) is available in elf core header.
> > The primary kernel will set aside this header with reserve_elfcorehdr()
> > at boot time and inform crash dump kernel of its location via a new
> > device-tree property, "linux,elfcorehdr".
> > 
> > Please note that all other architectures use traditional "elfcorehdr="
> > kernel parameter for this purpose.
> > 
> > Then crash dump kernel will access the primary kernel's memory with
> > copy_oldmem_page(), which reads one page by ioremap'ing it since it does
> > not reside in linear mapping on crash dump kernel.
> > 
> > We also need our own elfcorehdr_read() here since the header is placed
> > within crash dump kernel's usable memory.
> 
> One nit below, looks good.

Fixed.

Thanks,
-Takahiro AKASHI

> Reviewed-by: James Morse 
> 
> 
> Thanks,
> 
> James
> 
> 
> > diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
> 
> > +/**
> > + * copy_oldmem_page() - copy one page from old kernel memory
> > + * @pfn: page frame number to be copied
> > + * @buf: buffer where the copied page is placed
> > + * @csize: number of bytes to copy
> > + * @offset: offset in bytes into the page
> > + * @userbuf: if set, @buf is in a user address space
> > + *
> > + * This function copies one page from old kernel memory into buffer 
> > pointed by
> > + * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of 
> > bytes
> > + * copied or negative error in case of failure.
> > + */
> > +ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
> > +size_t csize, unsigned long offset,
> > +int userbuf)
> > +{
> > +   void *vaddr;
> > +
> > +   if (!csize)
> > +   return 0;
> > +
> > +   vaddr = memremap(__pfn_to_phys(pfn), PAGE_SIZE, MEMREMAP_WB);
> > +   if (!vaddr)
> > +   return -ENOMEM;
> > +
> > +   if (userbuf) {
> 
> > +   if (copy_to_user(buf, vaddr + offset, csize)) {
> 
> If you re-cast buf with (char __user *), it should stop sparse complaining:
> > ../arch/arm64/kernel/crash_dump.c:45:34: warning: incorrect type in 
> > argument 1
> (different address spaces)
> > ../arch/arm64/kernel/crash_dump.c:45:34:expected void [noderef] 
> > *to
> > ../arch/arm64/kernel/crash_dump.c:45:34:got char *buf
> 
> 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v26 2/7] arm64: kdump: implement machine_crash_shutdown()

2016-09-20 Thread AKASHI Takahiro
On Fri, Sep 16, 2016 at 03:49:31PM +0100, James Morse wrote:
> On 16/09/16 04:21, AKASHI Takahiro wrote:
> > On Wed, Sep 14, 2016 at 07:09:33PM +0100, James Morse wrote:
> >> On 07/09/16 05:29, AKASHI Takahiro wrote:
> >>> Primary kernel calls machine_crash_shutdown() to shut down non-boot cpus
> >>> and save registers' status in per-cpu ELF notes before starting crash
> >>> dump kernel. See kernel_kexec().
> >>> Even if not all secondary cpus have shut down, we do kdump anyway.
> >>>
> >>> As we don't have to make non-boot(crashed) cpus offline (to preserve
> >>> correct status of cpus at crash dump) before shutting down, this patch
> >>> also adds a variant of smp_send_stop().
> 
> >>> diff --git a/arch/arm64/include/asm/kexec.h 
> >>> b/arch/arm64/include/asm/kexec.h
> >>> index 04744dc..a908958 100644
> >>> --- a/arch/arm64/include/asm/kexec.h
> >>> +++ b/arch/arm64/include/asm/kexec.h
> >>> @@ -40,7 +40,46 @@
> >>>  static inline void crash_setup_regs(struct pt_regs *newregs,
> >>>   struct pt_regs *oldregs)
> >>>  {
> >>> - /* Empty routine needed to avoid build errors. */
> >>> + if (oldregs) {
> >>> + memcpy(newregs, oldregs, sizeof(*newregs));
> >>> + } else {
> >>> + u64 tmp1, tmp2;
> >>> +
> >>> + __asm__ __volatile__ (
> >>> + "stp x0,   x1, [%2, #16 *  0]\n"
> >>> + "stp x2,   x3, [%2, #16 *  1]\n"
> >>> + "stp x4,   x5, [%2, #16 *  2]\n"
> >>> + "stp x6,   x7, [%2, #16 *  3]\n"
> >>> + "stp x8,   x9, [%2, #16 *  4]\n"
> >>> + "stpx10,  x11, [%2, #16 *  5]\n"
> >>> + "stpx12,  x13, [%2, #16 *  6]\n"
> >>> + "stpx14,  x15, [%2, #16 *  7]\n"
> >>> + "stpx16,  x17, [%2, #16 *  8]\n"
> >>> + "stpx18,  x19, [%2, #16 *  9]\n"
> >>> + "stpx20,  x21, [%2, #16 * 10]\n"
> >>> + "stpx22,  x23, [%2, #16 * 11]\n"
> >>> + "stpx24,  x25, [%2, #16 * 12]\n"
> >>> + "stpx26,  x27, [%2, #16 * 13]\n"
> >>> + "stpx28,  x29, [%2, #16 * 14]\n"
> >>> + "mov %0,  sp\n"
> >>> + "stpx30,  %0,  [%2, #16 * 15]\n"
> >>> +
> >>> + "/* faked current PSTATE */\n"
> >>> + "mrs %0, CurrentEL\n"
> >>> + "mrs %1, DAIF\n"
> >>> + "orr %0, %0, %1\n"
> >>> + "mrs %1, NZCV\n"
> >>> + "orr %0, %0, %1\n"
> >>> +
> >>
> >> What about SPSEL? While we don't use it, it is correctly preserved for
> >> everything except a CPU that calls panic()...
> > 
> > My comment above might be confusing, but what I want to fake
> > here is "spsr" as pt_regs.pstate is normally set based on spsr_el1.
> > So there is no corresponding field of SPSEL in spsr.
> 
> Here is my logic, I may have missed something obvious, see what you think:
> 
> SPSR_EL{1,2} shows the CPU mode 'M' in bits 0-4, From aarch64 bit 4 is always 
> 0.
> From the register definitions in the ARM-ARM C5.2, likely values in 0-3 are:
> 0100 EL1t
> 0101 EL1h
> 1000 EL2t
> 1001 EL2h
> 
> I'm pretty sure this least significant bit is what SPSEL changes, so it does 
> get
> implicitly recorded in SPSR.
> CurrentEL returns a value in bits 0-3, of which 0-1 are RES0, so we lose the
> difference between EL?t and EL?h.

OK.
SPSel will be added assuming that CurrentEL is never 0 here.

> 
> > 
> >>
> >>> + /* pc */
> >>> + "adr %1, 1f\n"
> >>> + "1:\n"
> >>> + "stp %1, %0,   [%2, #16 * 16]\n"
> >>> + : "=r" (tmp1), "=r" (tmp2), "+r" (newregs)
> >>> + :
> >>> + : "memory"
> 
> Do you need the memory clobber? This asm only modifies values in newregs.

What about this (including the change above):
|   "/* faked current PSTATE */\n"
|   "mrs %0, CurrentEL\n"
|   "mrs %1, SPSEL\n"
|   "orr %0, %0, %1\n"
|   "mrs %1, DAIF\n"
|   "orr %0, %0, %1\n"
|   "mrs %1, NZCV\n"
|   "orr %0, %0, %1\n"
|   /* pc */
|   "adr %1, 1f\n"
|   "1:\n"
|   "stp %1, %0,   [%2, #16 * 16]\n"
|   : "+r" (tmp1), "+r" (tmp2)
|   : "r" (newregs)
|   : "memory"


> 
> >>> + );
> >>> + }
> >>>  }
> >>>  
> >>>  #endif /* __ASSEMBLY__ */
> 
> 
> >>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> 
> >>> +#ifdef CONFIG_KEXEC_CORE
> >>> +void smp_send_crash_stop(void)
> >>> +{
> >>> + cpumask_t mask;
> >>> + unsigned long timeout;
> >>> +
> >>> + if (num_online_cpus() == 1)
> >>> +