Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On 10/5/07, Linas Vepstas [EMAIL PROTECTED] wrote: On Thu, Oct 04, 2007 at 05:01:47PM -0700, Nish Aravamudan wrote: On 10/2/07, Tony Breeds [EMAIL PROTECTED] wrote: On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote: I realise it'll make the patch bigger, but this doesn't seem like a particularly good name for the variable anymore. Sure, what about? Clarify when RTAS logging is enabled. Signed-off-by: Tony Breeds [EMAIL PROTECTED] For what it's worth, on a different ppc64 box, this resolves a similar panic for me. Tested-by: Nishanth Aravamudan [EMAIL PROTECTED] For the reasons explained, I'd really like to nack Tony's patch. I see. Can you reply in this thread with the patch you mentioned in your other reply? (or point me to a copy of it) Thanks, Nish ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On Thu, Oct 04, 2007 at 05:01:47PM -0700, Nish Aravamudan wrote: On 10/2/07, Tony Breeds [EMAIL PROTECTED] wrote: On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote: I realise it'll make the patch bigger, but this doesn't seem like a particularly good name for the variable anymore. Sure, what about? Clarify when RTAS logging is enabled. Signed-off-by: Tony Breeds [EMAIL PROTECTED] For what it's worth, on a different ppc64 box, this resolves a similar panic for me. Tested-by: Nishanth Aravamudan [EMAIL PROTECTED] For the reasons explained, I'd really like to nack Tony's patch. --linas ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On 10/2/07, Tony Breeds [EMAIL PROTECTED] wrote: On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote: I realise it'll make the patch bigger, but this doesn't seem like a particularly good name for the variable anymore. Sure, what about? Clarify when RTAS logging is enabled. Signed-off-by: Tony Breeds [EMAIL PROTECTED] For what it's worth, on a different ppc64 box, this resolves a similar panic for me. Tested-by: Nishanth Aravamudan [EMAIL PROTECTED] Thanks, Nish ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On Wed, Oct 03, 2007 at 02:09:46PM +1000, Michael Ellerman wrote: Until we initialise what exactly? Until we allocate the error log buffer. The original crash was for a null-pointer deref of the unallocated buffer. I just sent out a patch to fix this; its a bit simpler than the below. In that email, I remarked: Andy Whitcroft's crash was appearently due to firmware complaining about lost power, (actually, lost power supply redundancy!), which occurred very early during boot. Type0040 (EPOW) Status: bypassed new Residual error from previous boot. EPOW Sensor Value: 0002 EPOW warning due to loss of redundancy. EPOW general power fault. I've no clue why firmware thought it was OK to report this during one of the earliest calls to RTAS; I'm still investiigating that. --linas ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On Mon, Sep 24, 2007 at 01:35:31PM +0100, Andy Whitcroft wrote: Seeing the following from an older power LPAR, pretty sure we had this in the previous -mm also: I haven't forgetten about this ... and am looking at it now. Seems that whenever I go to reserve the machine pSeries-102, someone else is using it :-) --linas ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On Tue, Oct 02, 2007 at 06:28:19PM -0500, Linas Vepstas wrote: On Mon, Sep 24, 2007 at 01:35:31PM +0100, Andy Whitcroft wrote: Seeing the following from an older power LPAR, pretty sure we had this in the previous -mm also: I haven't forgetten about this ... and am looking at it now. Seems that whenever I go to reserve the machine pSeries-102, someone else is using it :-) This panic is caused by [POWERPC] pseries: Fix jumbled no_logging flag. (79c0108d1b9db4864ab77b2a95dfa04f2dcf264c), in the powerpc/for-2.6.24 branch. It looks to me that we have logging enabled too early now. I think the following is a reasonable fix? --- Explicitly enable RTAS error logging, when it should be ready. Signed-off-by: Tony Breeds [EMAIL PROTECTED] --- arch/powerpc/platforms/pseries/rtasd.c |7 ++- 1 files changed, 6 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c index 30925d2..0df5d0d 100644 --- a/arch/powerpc/platforms/pseries/rtasd.c +++ b/arch/powerpc/platforms/pseries/rtasd.c @@ -54,7 +54,10 @@ static unsigned int rtas_event_scan_rate; static int full_rtas_msgs = 0; /* Stop logging to nvram after first fatal error */ -static int no_more_logging; +static int no_more_logging = 1; /* Until we initialize everything, + * make sure we don't try logging + * anything */ + static int error_log_cnt; @@ -414,6 +417,8 @@ static int rtasd(void *unused) memset(logdata, 0, rtas_error_log_max); rc = nvram_read_error_log(logdata, rtas_error_log_max, err_type, error_log_cnt); + /* We can use rtas_log_buf now */ + no_more_logging = 0; if (!rc) { if (err_type != ERR_FLAG_ALREADY_LOGGED) { Yours Tony linux.conf.auhttp://linux.conf.au/ || http://lca2008.linux.org.au/ Jan 28 - Feb 02 2008 The Australian Linux Technical Conference! ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On Wed, 2007-10-03 at 10:26 +1000, Tony Breeds wrote: On Tue, Oct 02, 2007 at 06:28:19PM -0500, Linas Vepstas wrote: On Mon, Sep 24, 2007 at 01:35:31PM +0100, Andy Whitcroft wrote: Seeing the following from an older power LPAR, pretty sure we had this in the previous -mm also: I haven't forgetten about this ... and am looking at it now. Seems that whenever I go to reserve the machine pSeries-102, someone else is using it :-) This panic is caused by [POWERPC] pseries: Fix jumbled no_logging flag. (79c0108d1b9db4864ab77b2a95dfa04f2dcf264c), in the powerpc/for-2.6.24 branch. It looks to me that we have logging enabled too early now. I think the following is a reasonable fix? --- Explicitly enable RTAS error logging, when it should be ready. Signed-off-by: Tony Breeds [EMAIL PROTECTED] --- arch/powerpc/platforms/pseries/rtasd.c |7 ++- 1 files changed, 6 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c index 30925d2..0df5d0d 100644 --- a/arch/powerpc/platforms/pseries/rtasd.c +++ b/arch/powerpc/platforms/pseries/rtasd.c @@ -54,7 +54,10 @@ static unsigned int rtas_event_scan_rate; static int full_rtas_msgs = 0; /* Stop logging to nvram after first fatal error */ -static int no_more_logging; +static int no_more_logging = 1; /* Until we initialize everything, + * make sure we don't try logging + * anything */ + I realise it'll make the patch bigger, but this doesn't seem like a particularly good name for the variable anymore. cheers -- Michael Ellerman OzLabs, IBM Australia Development Lab wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person signature.asc Description: This is a digitally signed message part ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote: I realise it'll make the patch bigger, but this doesn't seem like a particularly good name for the variable anymore. Sure, what about? Clarify when RTAS logging is enabled. Signed-off-by: Tony Breeds [EMAIL PROTECTED] --- arch/powerpc/platforms/pseries/rtasd.c | 15 +-- 1 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c index 30925d2..73401c8 100644 --- a/arch/powerpc/platforms/pseries/rtasd.c +++ b/arch/powerpc/platforms/pseries/rtasd.c @@ -54,8 +54,9 @@ static unsigned int rtas_event_scan_rate; static int full_rtas_msgs = 0; /* Stop logging to nvram after first fatal error */ -static int no_more_logging; - +static int logging_enabled; /* Until we initialize everything, + * make sure we don't try logging + * anything */ static int error_log_cnt; /* @@ -217,7 +218,7 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal) } /* Write error to NVRAM */ - if (!no_more_logging !(err_type ERR_FLAG_BOOT)) + if (logging_enabled !(err_type ERR_FLAG_BOOT)) nvram_write_error_log(buf, len, err_type, error_log_cnt); /* @@ -229,8 +230,8 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal) printk_log_rtas(buf, len); /* Check to see if we need to or have stopped logging */ - if (fatal || no_more_logging) { - no_more_logging = 1; + if (fatal || !logging_enabled) { + logging_enabled = 0; spin_unlock_irqrestore(rtasd_log_lock, s); return; } @@ -302,7 +303,7 @@ static ssize_t rtas_log_read(struct file * file, char __user * buf, spin_lock_irqsave(rtasd_log_lock, s); /* if it's 0, then we know we got the last one (the one in NVRAM) */ - if (rtas_log_size == 0 !no_more_logging) + if (rtas_log_size == 0 logging_enabled) nvram_clear_error_log(); spin_unlock_irqrestore(rtasd_log_lock, s); @@ -414,6 +415,8 @@ static int rtasd(void *unused) memset(logdata, 0, rtas_error_log_max); rc = nvram_read_error_log(logdata, rtas_error_log_max, err_type, error_log_cnt); + /* We can use rtas_log_buf now */ + logging_enabled = 1; if (!rc) { if (err_type != ERR_FLAG_ALREADY_LOGGED) { Yours Tony linux.conf.auhttp://linux.conf.au/ || http://lca2008.linux.org.au/ Jan 28 - Feb 02 2008 The Australian Linux Technical Conference! ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
On Wed, 2007-10-03 at 11:19 +1000, Tony Breeds wrote: On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote: I realise it'll make the patch bigger, but this doesn't seem like a particularly good name for the variable anymore. Sure, what about? Better .. but .. :D diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c index 30925d2..73401c8 100644 --- a/arch/powerpc/platforms/pseries/rtasd.c +++ b/arch/powerpc/platforms/pseries/rtasd.c @@ -54,8 +54,9 @@ static unsigned int rtas_event_scan_rate; static int full_rtas_msgs = 0; /* Stop logging to nvram after first fatal error */ -static int no_more_logging; - +static int logging_enabled; /* Until we initialize everything, + * make sure we don't try logging + * anything */ Until we initialise what exactly? static int error_log_cnt; /* @@ -217,7 +218,7 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal) } /* Write error to NVRAM */ - if (!no_more_logging !(err_type ERR_FLAG_BOOT)) + if (logging_enabled !(err_type ERR_FLAG_BOOT)) nvram_write_error_log(buf, len, err_type, error_log_cnt); /* @@ -229,8 +230,8 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal) printk_log_rtas(buf, len); /* Check to see if we need to or have stopped logging */ - if (fatal || no_more_logging) { - no_more_logging = 1; + if (fatal || !logging_enabled) { + logging_enabled = 0; spin_unlock_irqrestore(rtasd_log_lock, s); return; } Hmmm, this routine has 4 separate lock-dropping exit paths .. @@ -302,7 +303,7 @@ static ssize_t rtas_log_read(struct file * file, char __user * buf, spin_lock_irqsave(rtasd_log_lock, s); /* if it's 0, then we know we got the last one (the one in NVRAM) */ - if (rtas_log_size == 0 !no_more_logging) + if (rtas_log_size == 0 logging_enabled) nvram_clear_error_log(); spin_unlock_irqrestore(rtasd_log_lock, s); @@ -414,6 +415,8 @@ static int rtasd(void *unused) memset(logdata, 0, rtas_error_log_max); rc = nvram_read_error_log(logdata, rtas_error_log_max, err_type, error_log_cnt); + /* We can use rtas_log_buf now */ + logging_enabled = 1; if (!rc) { if (err_type != ERR_FLAG_ALREADY_LOGGED) { What exactly happens that allows us to do logging? I don't see any ordering between anything else and the setting of the flag, and AFAICT we're not inside a spinlock or anything here. cheers -- Michael Ellerman OzLabs, IBM Australia Development Lab wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person signature.asc Description: This is a digitally signed message part ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev