Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-15 Thread Michael Ellerman
On Fri, 2016-01-15 at 15:59 +1100, Stewart Smith wrote:
> Michael Ellerman  writes:
> > So my preference would be just an incremental patch ASAP to fix the kernel 
> > to
> > do the right thing with the new interface.
>
> I see that's merged now, which is great! Even if someone is bisecting
> back, things will be fine too.

Yep. And the bisection window is only 17 commits, so not a major problem in the
end.

I've added a check to my scripts so that when a patch touches opal-api.h it
yells at me to check skiboot. Hopefully that will prevent this case in future.

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-14 Thread Stewart Smith
Michael Ellerman  writes:
>> > > Applied to powerpc next, thanks.
>> > > 
>> > > https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f
>> > 
>> > The firmware interface changed slightly since this kernel patch[1], it
>> > added a parameter to OPAL_CONSOLE_FLUSH which accepted the terminal
>> > number to flush, theoretically allowing this to be plumbed into TTY
>> > layer or something too.
>> > 
>> > So, we'll either have to update this patch or replace it with an updated
>> > one.
>> > 
>> > [1] i'm pushing the accepted skiboot patch now.
>> > 
>> I'm working on an updated kernel patch to use the new parameter and 
>> additional
>> return values, so I suppose it's up to mpe whether or not this patch gets
>> merged now and another gets sent later to amend it, or if this patch gets
>> reverted in next and I can send a V4 adding the new stuff.
>
> Doh. I'd rather not revert it, unless we have to.
>
> Basically we're passing junk in r3, which skiboot is expecting to be the
> terminal number.

and skiboot will just return OPAL_PARAMETER and the kernel code ignores
the return value, so all will be *fine*. It may even work by accident sometimes.

> So running the current kernel code on the updated skiboot shouldn't crash and
> burn, it just won't actually work the way it's supposed to.

Right, it'll just do nothing.

> So my preference would be just an incremental patch ASAP to fix the kernel to
> do the right thing with the new interface.

I see that's merged now, which is great! Even if someone is bisecting
back, things will be fine too.

-- 
Stewart Smith
OPAL Architect, IBM.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-12 Thread Michael Ellerman
On Tue, 2016-01-12 at 15:17 +1100, Russell Currey wrote:
> On Tue, 2016-01-12 at 14:44 +1100, Stewart Smith wrote:
> > Michael Ellerman  writes:
> > > On Fri, 2015-27-11 at 06:23:07 UTC, Russell Currey wrote:
> > > > On BMC machines, console output is controlled by the OPAL firmware and 
> > > > is
> > > > only flushed when its pollers are called.  When the kernel is in a panic
> > > > state, it no longer calls these pollers and thus console output does not
> > > > completely flush, causing some output from the panic to be lost.
> > > > 
> > > > Output is only actually lost when the kernel is configured to not power
> > > > off
> > > > or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
> > > > flushes the console buffer as part of its power down routines.  Before
> > > > this
> > > > patch, however, only partial output would be printed during the timeout
> > > > wait.
> > > > 
> > > > This patch adds a new kmsg_dumper which gets called at panic time to
> > > > ensure
> > > > panic output is not lost.  It accomplishes this by calling
> > > > OPAL_CONSOLE_FLUSH
> > > > in the OPAL API, and if that is not available, the pollers are called
> > > > enough
> > > > times to (hopefully) completely flush the buffer.
> > > > 
> > > > The flushing mechanism will only affect output printed at and before the
> > > > kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel
> > > > panic"
> > > > message may still be truncated as follows:
> > > > 

> > > > > Call Trace:
> > > > > [c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc 
> > > > > (unreliable)
> > > > > [c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
> > > > > [c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
> > > > > [c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
> > > > > [c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
> > > > > [c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
> > > > > [c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
> > > > > ---[ end Kernel panic - not
> > > > 
> > > > This functionality is implemented as a kmsg_dumper as it seems to be the
> > > > most sensible way to introduce platform-specific functionality to the
> > > > panic function.
> > > > 
> > > > Signed-off-by: Russell Currey 
> > > > Reviewed-by: Andrew Donnellan 
> > > 
> > > Applied to powerpc next, thanks.
> > > 
> > > https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f
> > 
> > The firmware interface changed slightly since this kernel patch[1], it
> > added a parameter to OPAL_CONSOLE_FLUSH which accepted the terminal
> > number to flush, theoretically allowing this to be plumbed into TTY
> > layer or something too.
> > 
> > So, we'll either have to update this patch or replace it with an updated
> > one.
> > 
> > [1] i'm pushing the accepted skiboot patch now.
> > 
> I'm working on an updated kernel patch to use the new parameter and additional
> return values, so I suppose it's up to mpe whether or not this patch gets
> merged now and another gets sent later to amend it, or if this patch gets
> reverted in next and I can send a V4 adding the new stuff.

Doh. I'd rather not revert it, unless we have to.

Basically we're passing junk in r3, which skiboot is expecting to be the
terminal number.

So running the current kernel code on the updated skiboot shouldn't crash and
burn, it just won't actually work the way it's supposed to.

So my preference would be just an incremental patch ASAP to fix the kernel to
do the right thing with the new interface.

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-11 Thread Michael Ellerman
On Fri, 2015-27-11 at 06:23:07 UTC, Russell Currey wrote:
> On BMC machines, console output is controlled by the OPAL firmware and is
> only flushed when its pollers are called.  When the kernel is in a panic
> state, it no longer calls these pollers and thus console output does not
> completely flush, causing some output from the panic to be lost.
> 
> Output is only actually lost when the kernel is configured to not power off
> or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
> flushes the console buffer as part of its power down routines.  Before this
> patch, however, only partial output would be printed during the timeout wait.
> 
> This patch adds a new kmsg_dumper which gets called at panic time to ensure
> panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
> in the OPAL API, and if that is not available, the pollers are called enough
> times to (hopefully) completely flush the buffer.
> 
> The flushing mechanism will only affect output printed at and before the
> kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
> message may still be truncated as follows:
> 
> >Call Trace:
> >[c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc (unreliable)
> >[c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
> >[c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
> >[c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
> >[c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
> >[c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
> >[c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
> >---[ end Kernel panic - not
> 
> This functionality is implemented as a kmsg_dumper as it seems to be the
> most sensible way to introduce platform-specific functionality to the
> panic function.
> 
> Signed-off-by: Russell Currey 
> Reviewed-by: Andrew Donnellan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-11 Thread Stewart Smith
Michael Ellerman  writes:
> On Fri, 2015-27-11 at 06:23:07 UTC, Russell Currey wrote:
>> On BMC machines, console output is controlled by the OPAL firmware and is
>> only flushed when its pollers are called.  When the kernel is in a panic
>> state, it no longer calls these pollers and thus console output does not
>> completely flush, causing some output from the panic to be lost.
>> 
>> Output is only actually lost when the kernel is configured to not power off
>> or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
>> flushes the console buffer as part of its power down routines.  Before this
>> patch, however, only partial output would be printed during the timeout wait.
>> 
>> This patch adds a new kmsg_dumper which gets called at panic time to ensure
>> panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
>> in the OPAL API, and if that is not available, the pollers are called enough
>> times to (hopefully) completely flush the buffer.
>> 
>> The flushing mechanism will only affect output printed at and before the
>> kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
>> message may still be truncated as follows:
>> 
>> >Call Trace:
>> >[c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc (unreliable)
>> >[c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
>> >[c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
>> >[c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
>> >[c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
>> >[c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
>> >[c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
>> >---[ end Kernel panic - not
>> 
>> This functionality is implemented as a kmsg_dumper as it seems to be the
>> most sensible way to introduce platform-specific functionality to the
>> panic function.
>> 
>> Signed-off-by: Russell Currey 
>> Reviewed-by: Andrew Donnellan 
>
> Applied to powerpc next, thanks.
>
> https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f

The firmware interface changed slightly since this kernel patch[1], it
added a parameter to OPAL_CONSOLE_FLUSH which accepted the terminal
number to flush, theoretically allowing this to be plumbed into TTY
layer or something too.

So, we'll either have to update this patch or replace it with an updated
one.

[1] i'm pushing the accepted skiboot patch now.

-- 
Stewart Smith
OPAL Architect, IBM.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-11 Thread Russell Currey
On Tue, 2016-01-12 at 14:44 +1100, Stewart Smith wrote:
> Michael Ellerman  writes:
> > On Fri, 2015-27-11 at 06:23:07 UTC, Russell Currey wrote:
> > > On BMC machines, console output is controlled by the OPAL firmware and is
> > > only flushed when its pollers are called.  When the kernel is in a panic
> > > state, it no longer calls these pollers and thus console output does not
> > > completely flush, causing some output from the panic to be lost.
> > > 
> > > Output is only actually lost when the kernel is configured to not power
> > > off
> > > or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
> > > flushes the console buffer as part of its power down routines.  Before
> > > this
> > > patch, however, only partial output would be printed during the timeout
> > > wait.
> > > 
> > > This patch adds a new kmsg_dumper which gets called at panic time to
> > > ensure
> > > panic output is not lost.  It accomplishes this by calling
> > > OPAL_CONSOLE_FLUSH
> > > in the OPAL API, and if that is not available, the pollers are called
> > > enough
> > > times to (hopefully) completely flush the buffer.
> > > 
> > > The flushing mechanism will only affect output printed at and before the
> > > kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel
> > > panic"
> > > message may still be truncated as follows:
> > > 
> > > > Call Trace:
> > > > [c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc (unreliable)
> > > > [c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
> > > > [c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
> > > > [c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
> > > > [c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
> > > > [c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
> > > > [c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
> > > > ---[ end Kernel panic - not
> > > 
> > > This functionality is implemented as a kmsg_dumper as it seems to be the
> > > most sensible way to introduce platform-specific functionality to the
> > > panic function.
> > > 
> > > Signed-off-by: Russell Currey 
> > > Reviewed-by: Andrew Donnellan 
> > 
> > Applied to powerpc next, thanks.
> > 
> > https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f
> 
> The firmware interface changed slightly since this kernel patch[1], it
> added a parameter to OPAL_CONSOLE_FLUSH which accepted the terminal
> number to flush, theoretically allowing this to be plumbed into TTY
> layer or something too.
> 
> So, we'll either have to update this patch or replace it with an updated
> one.
> 
> [1] i'm pushing the accepted skiboot patch now.
> 
I'm working on an updated kernel patch to use the new parameter and additional
return values, so I suppose it's up to mpe whether or not this patch gets
merged now and another gets sent later to amend it, or if this patch gets
reverted in next and I can send a V4 adding the new stuff.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2015-11-26 Thread Russell Currey
On BMC machines, console output is controlled by the OPAL firmware and is
only flushed when its pollers are called.  When the kernel is in a panic
state, it no longer calls these pollers and thus console output does not
completely flush, causing some output from the panic to be lost.

Output is only actually lost when the kernel is configured to not power off
or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
flushes the console buffer as part of its power down routines.  Before this
patch, however, only partial output would be printed during the timeout wait.

This patch adds a new kmsg_dumper which gets called at panic time to ensure
panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
in the OPAL API, and if that is not available, the pollers are called enough
times to (hopefully) completely flush the buffer.

The flushing mechanism will only affect output printed at and before the
kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
message may still be truncated as follows:

>Call Trace:
>[c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc (unreliable)
>[c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
>[c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
>[c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
>[c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
>[c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
>[c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
>---[ end Kernel panic - not

This functionality is implemented as a kmsg_dumper as it seems to be the
most sensible way to introduce platform-specific functionality to the
panic function.

Signed-off-by: Russell Currey 
Reviewed-by: Andrew Donnellan 
---
V3: Added more detail to the commit message, no code change
V2: Fix some code style issues, no functional change

A patch to add the preferred OPAL call for flushing the console buffer,
OPAL_CONSOLE_FLUSH, was recently sent upstream to Skiboot.  You can track
its progress here: https://patchwork.ozlabs.org/patch/547379/
---
 arch/powerpc/include/asm/opal-api.h|  3 +-
 arch/powerpc/include/asm/opal.h|  3 ++
 arch/powerpc/platforms/powernv/Makefile|  1 +
 arch/powerpc/platforms/powernv/opal-kmsg.c | 68 ++
 arch/powerpc/platforms/powernv/opal-wrappers.S |  1 +
 arch/powerpc/platforms/powernv/opal.c  |  3 ++
 6 files changed, 78 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/powernv/opal-kmsg.c

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 8374afe..f8faaae 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -157,7 +157,8 @@
 #define OPAL_LEDS_GET_INDICATOR114
 #define OPAL_LEDS_SET_INDICATOR115
 #define OPAL_CEC_REBOOT2   116
-#define OPAL_LAST  116
+#define OPAL_CONSOLE_FLUSH 117
+#define OPAL_LAST  117
 
 /* Device tree flags */
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 8001159..a5fd407 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -35,6 +35,7 @@ int64_t opal_console_read(int64_t term_number, __be64 *length,
  uint8_t *buffer);
 int64_t opal_console_write_buffer_space(int64_t term_number,
__be64 *length);
+void opal_console_flush(void);
 int64_t opal_rtc_read(__be32 *year_month_day,
  __be64 *hour_minute_second_millisecond);
 int64_t opal_rtc_write(uint32_t year_month_day,
@@ -262,6 +263,8 @@ extern int opal_resync_timebase(void);
 
 extern void opal_lpc_init(void);
 
+extern void opal_kmsg_init(void);
+
 extern int opal_event_request(unsigned int opal_event_nr);
 
 struct opal_sg_list *opal_vmalloc_to_sg_list(void *vmalloc_addr,
diff --git a/arch/powerpc/platforms/powernv/Makefile 
b/arch/powerpc/platforms/powernv/Makefile
index 1c8cdb6..b9de7ef 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -2,6 +2,7 @@ obj-y   += setup.o opal-wrappers.o opal.o 
opal-async.o idle.o
 obj-y  += opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
 obj-y  += rng.o opal-elog.o opal-dump.o opal-sysparam.o 
opal-sensor.o
 obj-y  += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o
+obj-y  += opal-kmsg.o
 
 obj-$(CONFIG_SMP)  += smp.o subcore.o subcore-asm.o
 obj-$(CONFIG_PCI)  += pci.o pci-p5ioc2.o pci-ioda.o
diff --git a/arch/powerpc/platforms/powernv/opal-kmsg.c 
b/arch/powerpc/platforms/powernv/opal-kmsg.c
new file mode 100644
index 000..bd3b2ee
--- /dev/null
+++