Re: applesmc oops in 3.10/3.11
On Wed, Oct 09, 2013 at 10:29:39AM +0200, Henrik Rydberg wrote: > From 18fa0f55b764ad0fe5fc49f81bae281e5110ed56 Mon Sep 17 00:00:00 2001 > From: Henrik Rydberg > Date: Wed, 2 Oct 2013 19:15:03 +0200 > Subject: [PATCH] hwmon: (applesmc) Always read until end of data > > The crash reported and investigated in commit 5f4513 turned out to be > caused by a change to the read interface on newer (2012) SMCs. > > Tests by Chris show that simply reading the data valid line is enough > for the problem to go away. Additional tests show that the newer SMCs > no longer wait for the number of requested bytes, but start sending > data right away. Apparently the number of bytes to read is no longer > specified as before, but instead found out by reading until end of > data. Failure to read until end of data confuses the state machine, > which eventually causes the crash. > > As a remedy, assuming bit0 is the read valid line, make sure there is > nothing more to read before leaving the read function. > > Tested to resolve the original problem, and runtested on MBA3,1, > MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on > machines before 2012. > > Tested-by: Chris Murphy > Signed-off-by: Henrik Rydberg Applied. I'll do my usual sanity testing and send it to Linus either tomorrow or Friday. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
Hi Guenter, > > I want to make one more test tomorrow, then I'll send a proper patch. > > Thanks! > > > Ok, great. And here it is, with compiler warning fixed. Thanks everybody, Henrik >From 18fa0f55b764ad0fe5fc49f81bae281e5110ed56 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] hwmon: (applesmc) Always read until end of data The crash reported and investigated in commit 5f4513 turned out to be caused by a change to the read interface on newer (2012) SMCs. Tests by Chris show that simply reading the data valid line is enough for the problem to go away. Additional tests show that the newer SMCs no longer wait for the number of requested bytes, but start sending data right away. Apparently the number of bytes to read is no longer specified as before, but instead found out by reading until end of data. Failure to read until end of data confuses the state machine, which eventually causes the crash. As a remedy, assuming bit0 is the read valid line, make sure there is nothing more to read before leaving the read function. Tested to resolve the original problem, and runtested on MBA3,1, MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on machines before 2012. Tested-by: Chris Murphy Signed-off-by: Henrik Rydberg --- drivers/hwmon/applesmc.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..3288f13 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -230,6 +230,7 @@ static int send_argument(const char *key) static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) { + u8 status, data = 0; int i; if (send_command(cmd) || send_argument(key)) { @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } + /* This has no effect on newer (2012) SMCs */ if (send_byte(len, APPLESMC_DATA_PORT)) { pr_warn("%.4s: read len fail\n", key); return -EIO; @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) buffer[i] = inb(APPLESMC_DATA_PORT); } + /* Read the data port until bit0 is cleared */ + for (i = 0; i < 16; i++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status & 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (i) + pr_warn("flushed %d bytes, last value is: %d\n", i, data); + return 0; } -- 1.8.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
Hi Guenter, I want to make one more test tomorrow, then I'll send a proper patch. Thanks! Ok, great. And here it is, with compiler warning fixed. Thanks everybody, Henrik From 18fa0f55b764ad0fe5fc49f81bae281e5110ed56 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] hwmon: (applesmc) Always read until end of data The crash reported and investigated in commit 5f4513 turned out to be caused by a change to the read interface on newer (2012) SMCs. Tests by Chris show that simply reading the data valid line is enough for the problem to go away. Additional tests show that the newer SMCs no longer wait for the number of requested bytes, but start sending data right away. Apparently the number of bytes to read is no longer specified as before, but instead found out by reading until end of data. Failure to read until end of data confuses the state machine, which eventually causes the crash. As a remedy, assuming bit0 is the read valid line, make sure there is nothing more to read before leaving the read function. Tested to resolve the original problem, and runtested on MBA3,1, MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on machines before 2012. Tested-by: Chris Murphy ch...@cmurf.com Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..3288f13 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -230,6 +230,7 @@ static int send_argument(const char *key) static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) { + u8 status, data = 0; int i; if (send_command(cmd) || send_argument(key)) { @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } + /* This has no effect on newer (2012) SMCs */ if (send_byte(len, APPLESMC_DATA_PORT)) { pr_warn(%.4s: read len fail\n, key); return -EIO; @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) buffer[i] = inb(APPLESMC_DATA_PORT); } + /* Read the data port until bit0 is cleared */ + for (i = 0; i 16; i++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (i) + pr_warn(flushed %d bytes, last value is: %d\n, i, data); + return 0; } -- 1.8.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 09, 2013 at 10:29:39AM +0200, Henrik Rydberg wrote: From 18fa0f55b764ad0fe5fc49f81bae281e5110ed56 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] hwmon: (applesmc) Always read until end of data The crash reported and investigated in commit 5f4513 turned out to be caused by a change to the read interface on newer (2012) SMCs. Tests by Chris show that simply reading the data valid line is enough for the problem to go away. Additional tests show that the newer SMCs no longer wait for the number of requested bytes, but start sending data right away. Apparently the number of bytes to read is no longer specified as before, but instead found out by reading until end of data. Failure to read until end of data confuses the state machine, which eventually causes the crash. As a remedy, assuming bit0 is the read valid line, make sure there is nothing more to read before leaving the read function. Tested to resolve the original problem, and runtested on MBA3,1, MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on machines before 2012. Tested-by: Chris Murphy ch...@cmurf.com Signed-off-by: Henrik Rydberg rydb...@euromail.se Applied. I'll do my usual sanity testing and send it to Linus either tomorrow or Friday. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Tue, Oct 08, 2013 at 06:29:17PM +0200, Henrik Rydberg wrote: > Hi Guenter, > > > > > So, what should we do with this patch ? Apply it ? > > > > > > So far I'm getting nothing on the original machine. As of today it's > > > applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. > > > Unfortunately at the moment I'm a bit too dense to figure out how to get > > > a new kernel applied to an existing live package so I can try this on a > > > USB stick. While maybe unrelated, the oops was occurring at least 4x as > > > often booted from USB stick media than HDD. > > > > > > > Seems to me we should apply it. > > > > Henrik, > > > > what do you think ? Did you have time for additional testing ? > > If you think we should apply it, please send me a signed patch. > > I want to make one more test tomorrow, then I'll send a proper patch. Thanks! > Ok, great. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
Hi Guenter, > > > So, what should we do with this patch ? Apply it ? > > > > So far I'm getting nothing on the original machine. As of today it's > > applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately > > at the moment I'm a bit too dense to figure out how to get a new kernel > > applied to an existing live package so I can try this on a USB stick. While > > maybe unrelated, the oops was occurring at least 4x as often booted from > > USB stick media than HDD. > > > > Seems to me we should apply it. > > Henrik, > > what do you think ? Did you have time for additional testing ? > If you think we should apply it, please send me a signed patch. I want to make one more test tomorrow, then I'll send a proper patch. Thanks! Henrik -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Mon, Oct 07, 2013 at 05:46:55PM -0600, Chris Murphy wrote: > > On Oct 7, 2013, at 5:42 PM, Guenter Roeck wrote: > > > On 10/02/2013 10:24 AM, Henrik Rydberg wrote: > > > >>> From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 > >> From: Henrik Rydberg > >> Date: Wed, 2 Oct 2013 19:15:03 +0200 > >> Subject: [PATCH] applesmc remedy take 2 > >> > >> Conjectured problem: there are remnant bytes ready on the data line > >> which corrupts the read after a failure. > >> > >> Remedy: assuming bit0 is the read valid line, try to flush it before > >> starting a new command. > >> > >> Tests by Chris suggests reading the status is enough for the problem > >> to go away, which is consistent with a change in the SMC interface, > >> where the number of bytes to read is no longer specified, but found > >> out by reading until end of data. > >> > >> Tested on a MacBookAir3,1, but the original problem has not been > >> reproduced. > > > > So, what should we do with this patch ? Apply it ? > > So far I'm getting nothing on the original machine. As of today it's applied > as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the > moment I'm a bit too dense to figure out how to get a new kernel applied to > an existing live package so I can try this on a USB stick. While maybe > unrelated, the oops was occurring at least 4x as often booted from USB stick > media than HDD. > Seems to me we should apply it. Henrik, what do you think ? Did you have time for additional testing ? If you think we should apply it, please send me a signed patch. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Mon, Oct 07, 2013 at 05:46:55PM -0600, Chris Murphy wrote: On Oct 7, 2013, at 5:42 PM, Guenter Roeck li...@roeck-us.net wrote: On 10/02/2013 10:24 AM, Henrik Rydberg wrote: From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] applesmc remedy take 2 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Tests by Chris suggests reading the status is enough for the problem to go away, which is consistent with a change in the SMC interface, where the number of bytes to read is no longer specified, but found out by reading until end of data. Tested on a MacBookAir3,1, but the original problem has not been reproduced. So, what should we do with this patch ? Apply it ? So far I'm getting nothing on the original machine. As of today it's applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the moment I'm a bit too dense to figure out how to get a new kernel applied to an existing live package so I can try this on a USB stick. While maybe unrelated, the oops was occurring at least 4x as often booted from USB stick media than HDD. Seems to me we should apply it. Henrik, what do you think ? Did you have time for additional testing ? If you think we should apply it, please send me a signed patch. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
Hi Guenter, So, what should we do with this patch ? Apply it ? So far I'm getting nothing on the original machine. As of today it's applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the moment I'm a bit too dense to figure out how to get a new kernel applied to an existing live package so I can try this on a USB stick. While maybe unrelated, the oops was occurring at least 4x as often booted from USB stick media than HDD. Seems to me we should apply it. Henrik, what do you think ? Did you have time for additional testing ? If you think we should apply it, please send me a signed patch. I want to make one more test tomorrow, then I'll send a proper patch. Thanks! Henrik -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Tue, Oct 08, 2013 at 06:29:17PM +0200, Henrik Rydberg wrote: Hi Guenter, So, what should we do with this patch ? Apply it ? So far I'm getting nothing on the original machine. As of today it's applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the moment I'm a bit too dense to figure out how to get a new kernel applied to an existing live package so I can try this on a USB stick. While maybe unrelated, the oops was occurring at least 4x as often booted from USB stick media than HDD. Seems to me we should apply it. Henrik, what do you think ? Did you have time for additional testing ? If you think we should apply it, please send me a signed patch. I want to make one more test tomorrow, then I'll send a proper patch. Thanks! Ok, great. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 7, 2013, at 5:42 PM, Guenter Roeck wrote: > On 10/02/2013 10:24 AM, Henrik Rydberg wrote: > >>> From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 >> From: Henrik Rydberg >> Date: Wed, 2 Oct 2013 19:15:03 +0200 >> Subject: [PATCH] applesmc remedy take 2 >> >> Conjectured problem: there are remnant bytes ready on the data line >> which corrupts the read after a failure. >> >> Remedy: assuming bit0 is the read valid line, try to flush it before >> starting a new command. >> >> Tests by Chris suggests reading the status is enough for the problem >> to go away, which is consistent with a change in the SMC interface, >> where the number of bytes to read is no longer specified, but found >> out by reading until end of data. >> >> Tested on a MacBookAir3,1, but the original problem has not been >> reproduced. > > So, what should we do with this patch ? Apply it ? So far I'm getting nothing on the original machine. As of today it's applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the moment I'm a bit too dense to figure out how to get a new kernel applied to an existing live package so I can try this on a USB stick. While maybe unrelated, the oops was occurring at least 4x as often booted from USB stick media than HDD. Chris-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 10/02/2013 10:24 AM, Henrik Rydberg wrote: From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] applesmc remedy take 2 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Tests by Chris suggests reading the status is enough for the problem to go away, which is consistent with a change in the SMC interface, where the number of bytes to read is no longer specified, but found out by reading until end of data. Tested on a MacBookAir3,1, but the original problem has not been reproduced. So, what should we do with this patch ? Apply it ? Thanks, Guenter --- drivers/hwmon/applesmc.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..c0ff350 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -230,6 +230,7 @@ static int send_argument(const char *key) static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) { + u8 status, data; int i; if (send_command(cmd) || send_argument(key)) { @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } + /* This has no effect on newer (2012) SMCs */ if (send_byte(len, APPLESMC_DATA_PORT)) { pr_warn("%.4s: read len fail\n", key); return -EIO; @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) buffer[i] = inb(APPLESMC_DATA_PORT); } + /* Read the data port until bit0 is cleared */ + for (i = 0; i < 16; i++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status & 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (i) + pr_warn("flushed %d bytes, last value is: %d\n", i, data); + return 0; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 10/02/2013 10:24 AM, Henrik Rydberg wrote: From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] applesmc remedy take 2 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Tests by Chris suggests reading the status is enough for the problem to go away, which is consistent with a change in the SMC interface, where the number of bytes to read is no longer specified, but found out by reading until end of data. Tested on a MacBookAir3,1, but the original problem has not been reproduced. So, what should we do with this patch ? Apply it ? Thanks, Guenter --- drivers/hwmon/applesmc.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..c0ff350 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -230,6 +230,7 @@ static int send_argument(const char *key) static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) { + u8 status, data; int i; if (send_command(cmd) || send_argument(key)) { @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } + /* This has no effect on newer (2012) SMCs */ if (send_byte(len, APPLESMC_DATA_PORT)) { pr_warn(%.4s: read len fail\n, key); return -EIO; @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) buffer[i] = inb(APPLESMC_DATA_PORT); } + /* Read the data port until bit0 is cleared */ + for (i = 0; i 16; i++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (i) + pr_warn(flushed %d bytes, last value is: %d\n, i, data); + return 0; } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 7, 2013, at 5:42 PM, Guenter Roeck li...@roeck-us.net wrote: On 10/02/2013 10:24 AM, Henrik Rydberg wrote: From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] applesmc remedy take 2 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Tests by Chris suggests reading the status is enough for the problem to go away, which is consistent with a change in the SMC interface, where the number of bytes to read is no longer specified, but found out by reading until end of data. Tested on a MacBookAir3,1, but the original problem has not been reproduced. So, what should we do with this patch ? Apply it ? So far I'm getting nothing on the original machine. As of today it's applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the moment I'm a bit too dense to figure out how to get a new kernel applied to an existing live package so I can try this on a USB stick. While maybe unrelated, the oops was occurring at least 4x as often booted from USB stick media than HDD. Chris-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 03:34:41PM -0600, Chris Murphy wrote: > > On Oct 2, 2013, at 2:59 PM, Guenter Roeck wrote: > > > On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote: > >> > >> On Oct 2, 2013, at 12:02 PM, Guenter Roeck wrote: > >> > >>> On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: > On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: > > On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: > > One thing I have seen in all logs is the earlier "send_byte fail" > > message, so > > I think that is a pre-requisite. > > Not necessarily - it could be that the patch actually fixes the root > cause. One possible scenario is that on recent SMCs, some of the > commands produce more data than we actually read. This would > eventually lead to both data corruption and overflow somwhere in the > SMC internals. If the original SMC error is interpreted as a read > buffer overflow, then that problem should be fixed with this patch. > > >>> > >>> Good point. > >>> > >>> But shouldn't we at least get the "flushed %d bytes" warning message > >>> in this case ? > >> > >> The explanation I have there is that the (newer) SMC needs the > >> application to read the 'no more bytes' or it will get confused. It > >> makes sense, if the number of bytes to read is no longer specified. > >> > > You mean that just reading from APPLESMC_CMD_PORT would solve the > > problem ? > > That might make sense. > > It also points at the possibility of a smaller patch to test, but I > have not had the time to check this very deeply myself: > > >>> I like this patch much more than the previous patch. Chris, can you test > >>> it ? > >> > >> Yes. Building now. What kernel message should I be looking for? At least > >> on 2011 and 2012 laptops I have yet to see an Oops related to smc. The > >> kernel with previous patch at least is not causing problems on them so > >> far, which works well as I can test more on the 2008 model. > >> > > None, if I understand correctly and if the patch really fixes the root cause > > of the problem. > > A vast majority of the Ooops I've had are when booting from flash media, > testing Fedora installs. Is it possible the much slower kernel load and boot > time is a trigger? If so, I'll look into modifying the media to accept the > custom kernel and requisite fat initramfs. > Yes, that could be a possible trigger. I thought it might be triggered by faster boot (as one gets with 3.10 and 3.11), but slow boot is just as likely. Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 2, 2013, at 2:59 PM, Guenter Roeck wrote: > On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote: >> >> On Oct 2, 2013, at 12:02 PM, Guenter Roeck wrote: >> >>> On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: > On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: > One thing I have seen in all logs is the earlier "send_byte fail" > message, so > I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. >>> >>> Good point. >>> >>> But shouldn't we at least get the "flushed %d bytes" warning message in >>> this case ? >> >> The explanation I have there is that the (newer) SMC needs the >> application to read the 'no more bytes' or it will get confused. It >> makes sense, if the number of bytes to read is no longer specified. >> > You mean that just reading from APPLESMC_CMD_PORT would solve the problem > ? > That might make sense. It also points at the possibility of a smaller patch to test, but I have not had the time to check this very deeply myself: >>> I like this patch much more than the previous patch. Chris, can you test it >>> ? >> >> Yes. Building now. What kernel message should I be looking for? At least on >> 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel >> with previous patch at least is not causing problems on them so far, which >> works well as I can test more on the 2008 model. >> > None, if I understand correctly and if the patch really fixes the root cause > of the problem. A vast majority of the Ooops I've had are when booting from flash media, testing Fedora installs. Is it possible the much slower kernel load and boot time is a trigger? If so, I'll look into modifying the media to accept the custom kernel and requisite fat initramfs. Chris-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote: > > On Oct 2, 2013, at 12:02 PM, Guenter Roeck wrote: > > > On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: > >> On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: > >>> On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: > >>> One thing I have seen in all logs is the earlier "send_byte fail" > >>> message, so > >>> I think that is a pre-requisite. > >> > >> Not necessarily - it could be that the patch actually fixes the root > >> cause. One possible scenario is that on recent SMCs, some of the > >> commands produce more data than we actually read. This would > >> eventually lead to both data corruption and overflow somwhere in the > >> SMC internals. If the original SMC error is interpreted as a read > >> buffer overflow, then that problem should be fixed with this patch. > >> > > > > Good point. > > > > But shouldn't we at least get the "flushed %d bytes" warning message in > > this case ? > > The explanation I have there is that the (newer) SMC needs the > application to read the 'no more bytes' or it will get confused. It > makes sense, if the number of bytes to read is no longer specified. > > >>> You mean that just reading from APPLESMC_CMD_PORT would solve the problem > >>> ? > >>> That might make sense. > >> > >> It also points at the possibility of a smaller patch to test, but I > >> have not had the time to check this very deeply myself: > >> > > I like this patch much more than the previous patch. Chris, can you test it > > ? > > Yes. Building now. What kernel message should I be looking for? At least on > 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel > with previous patch at least is not causing problems on them so far, which > works well as I can test more on the 2008 model. > None, if I understand correctly and if the patch really fixes the root cause of the problem. Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 2, 2013, at 12:02 PM, Guenter Roeck wrote: > On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: >> On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: >>> On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: >>> One thing I have seen in all logs is the earlier "send_byte fail" >>> message, so >>> I think that is a pre-requisite. >> >> Not necessarily - it could be that the patch actually fixes the root >> cause. One possible scenario is that on recent SMCs, some of the >> commands produce more data than we actually read. This would >> eventually lead to both data corruption and overflow somwhere in the >> SMC internals. If the original SMC error is interpreted as a read >> buffer overflow, then that problem should be fixed with this patch. >> > > Good point. > > But shouldn't we at least get the "flushed %d bytes" warning message in > this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. >>> You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? >>> That might make sense. >> >> It also points at the possibility of a smaller patch to test, but I >> have not had the time to check this very deeply myself: >> > I like this patch much more than the previous patch. Chris, can you test it ? Yes. Building now. What kernel message should I be looking for? At least on 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel with previous patch at least is not causing problems on them so far, which works well as I can test more on the 2008 model. Chris Murphy-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: > On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: > > On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: > > > > >>One thing I have seen in all logs is the earlier "send_byte fail" > > > > >>message, so > > > > >>I think that is a pre-requisite. > > > > > > > > > >Not necessarily - it could be that the patch actually fixes the root > > > > >cause. One possible scenario is that on recent SMCs, some of the > > > > >commands produce more data than we actually read. This would > > > > >eventually lead to both data corruption and overflow somwhere in the > > > > >SMC internals. If the original SMC error is interpreted as a read > > > > >buffer overflow, then that problem should be fixed with this patch. > > > > > > > > > > > > > Good point. > > > > > > > > But shouldn't we at least get the "flushed %d bytes" warning message in > > > > this case ? > > > > > > The explanation I have there is that the (newer) SMC needs the > > > application to read the 'no more bytes' or it will get confused. It > > > makes sense, if the number of bytes to read is no longer specified. > > > > > You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? > > That might make sense. > > It also points at the possibility of a smaller patch to test, but I > have not had the time to check this very deeply myself: > I like this patch much more than the previous patch. Chris, can you test it ? Thanks, Guenter > Thanks, > Henrik > > From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 > From: Henrik Rydberg > Date: Wed, 2 Oct 2013 19:15:03 +0200 > Subject: [PATCH] applesmc remedy take 2 > > Conjectured problem: there are remnant bytes ready on the data line > which corrupts the read after a failure. > > Remedy: assuming bit0 is the read valid line, try to flush it before > starting a new command. > > Tests by Chris suggests reading the status is enough for the problem > to go away, which is consistent with a change in the SMC interface, > where the number of bytes to read is no longer specified, but found > out by reading until end of data. > > Tested on a MacBookAir3,1, but the original problem has not been > reproduced. > --- > drivers/hwmon/applesmc.c | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c > index 98814d1..c0ff350 100644 > --- a/drivers/hwmon/applesmc.c > +++ b/drivers/hwmon/applesmc.c > @@ -230,6 +230,7 @@ static int send_argument(const char *key) > > static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) > { > + u8 status, data; > int i; > > if (send_command(cmd) || send_argument(key)) { > @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, > u8 len) > return -EIO; > } > > + /* This has no effect on newer (2012) SMCs */ > if (send_byte(len, APPLESMC_DATA_PORT)) { > pr_warn("%.4s: read len fail\n", key); > return -EIO; > @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, > u8 len) > buffer[i] = inb(APPLESMC_DATA_PORT); > } > > + /* Read the data port until bit0 is cleared */ > + for (i = 0; i < 16; i++) { > + udelay(APPLESMC_MIN_WAIT); > + status = inb(APPLESMC_CMD_PORT); > + if (!(status & 0x01)) > + break; > + data = inb(APPLESMC_DATA_PORT); > + } > + if (i) > + pr_warn("flushed %d bytes, last value is: %d\n", i, data); > + > return 0; > } > > -- > 1.8.3.4 > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: > On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: > > > >>One thing I have seen in all logs is the earlier "send_byte fail" > > > >>message, so > > > >>I think that is a pre-requisite. > > > > > > > >Not necessarily - it could be that the patch actually fixes the root > > > >cause. One possible scenario is that on recent SMCs, some of the > > > >commands produce more data than we actually read. This would > > > >eventually lead to both data corruption and overflow somwhere in the > > > >SMC internals. If the original SMC error is interpreted as a read > > > >buffer overflow, then that problem should be fixed with this patch. > > > > > > > > > > Good point. > > > > > > But shouldn't we at least get the "flushed %d bytes" warning message in > > > this case ? > > > > The explanation I have there is that the (newer) SMC needs the > > application to read the 'no more bytes' or it will get confused. It > > makes sense, if the number of bytes to read is no longer specified. > > > You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? > That might make sense. It also points at the possibility of a smaller patch to test, but I have not had the time to check this very deeply myself: Thanks, Henrik >From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] applesmc remedy take 2 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Tests by Chris suggests reading the status is enough for the problem to go away, which is consistent with a change in the SMC interface, where the number of bytes to read is no longer specified, but found out by reading until end of data. Tested on a MacBookAir3,1, but the original problem has not been reproduced. --- drivers/hwmon/applesmc.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..c0ff350 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -230,6 +230,7 @@ static int send_argument(const char *key) static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) { + u8 status, data; int i; if (send_command(cmd) || send_argument(key)) { @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } + /* This has no effect on newer (2012) SMCs */ if (send_byte(len, APPLESMC_DATA_PORT)) { pr_warn("%.4s: read len fail\n", key); return -EIO; @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) buffer[i] = inb(APPLESMC_DATA_PORT); } + /* Read the data port until bit0 is cleared */ + for (i = 0; i < 16; i++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status & 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (i) + pr_warn("flushed %d bytes, last value is: %d\n", i, data); + return 0; } -- 1.8.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: > > >>One thing I have seen in all logs is the earlier "send_byte fail" > > >>message, so > > >>I think that is a pre-requisite. > > > > > >Not necessarily - it could be that the patch actually fixes the root > > >cause. One possible scenario is that on recent SMCs, some of the > > >commands produce more data than we actually read. This would > > >eventually lead to both data corruption and overflow somwhere in the > > >SMC internals. If the original SMC error is interpreted as a read > > >buffer overflow, then that problem should be fixed with this patch. > > > > > > > Good point. > > > > But shouldn't we at least get the "flushed %d bytes" warning message in > > this case ? > > The explanation I have there is that the (newer) SMC needs the > application to read the 'no more bytes' or it will get confused. It > makes sense, if the number of bytes to read is no longer specified. > You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? That might make sense. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
> >>One thing I have seen in all logs is the earlier "send_byte fail" message, > >>so > >>I think that is a pre-requisite. > > > >Not necessarily - it could be that the patch actually fixes the root > >cause. One possible scenario is that on recent SMCs, some of the > >commands produce more data than we actually read. This would > >eventually lead to both data corruption and overflow somwhere in the > >SMC internals. If the original SMC error is interpreted as a read > >buffer overflow, then that problem should be fixed with this patch. > > > > Good point. > > But shouldn't we at least get the "flushed %d bytes" warning message in this > case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. Thanks, Henrik -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 10/02/2013 02:53 AM, Henrik Rydberg wrote: Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 One thing I have seen in all logs is the earlier "send_byte fail" message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the "flushed %d bytes" warning message in this case ? Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
> > Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after > > ~dozen reboots, I'm not triggering the problem. The only items in dmesg > > with smc in it: > > > > [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 > > [ 13.833402] input: applesmc as > > /devices/platform/applesmc.768/input/input10 > > > > One thing I have seen in all logs is the earlier "send_byte fail" message, so > I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Even so, the patch does require more testing. It would be great to get coverage on all models we can find. Thanks, Henrik -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Even so, the patch does require more testing. It would be great to get coverage on all models we can find. Thanks, Henrik -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 10/02/2013 02:53 AM, Henrik Rydberg wrote: Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. Thanks, Henrik -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? That might make sense. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? That might make sense. It also points at the possibility of a smaller patch to test, but I have not had the time to check this very deeply myself: Thanks, Henrik From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] applesmc remedy take 2 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Tests by Chris suggests reading the status is enough for the problem to go away, which is consistent with a change in the SMC interface, where the number of bytes to read is no longer specified, but found out by reading until end of data. Tested on a MacBookAir3,1, but the original problem has not been reproduced. --- drivers/hwmon/applesmc.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..c0ff350 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -230,6 +230,7 @@ static int send_argument(const char *key) static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) { + u8 status, data; int i; if (send_command(cmd) || send_argument(key)) { @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } + /* This has no effect on newer (2012) SMCs */ if (send_byte(len, APPLESMC_DATA_PORT)) { pr_warn(%.4s: read len fail\n, key); return -EIO; @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) buffer[i] = inb(APPLESMC_DATA_PORT); } + /* Read the data port until bit0 is cleared */ + for (i = 0; i 16; i++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (i) + pr_warn(flushed %d bytes, last value is: %d\n, i, data); + return 0; } -- 1.8.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? That might make sense. It also points at the possibility of a smaller patch to test, but I have not had the time to check this very deeply myself: I like this patch much more than the previous patch. Chris, can you test it ? Thanks, Guenter Thanks, Henrik From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Wed, 2 Oct 2013 19:15:03 +0200 Subject: [PATCH] applesmc remedy take 2 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Tests by Chris suggests reading the status is enough for the problem to go away, which is consistent with a change in the SMC interface, where the number of bytes to read is no longer specified, but found out by reading until end of data. Tested on a MacBookAir3,1, but the original problem has not been reproduced. --- drivers/hwmon/applesmc.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..c0ff350 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -230,6 +230,7 @@ static int send_argument(const char *key) static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) { + u8 status, data; int i; if (send_command(cmd) || send_argument(key)) { @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } + /* This has no effect on newer (2012) SMCs */ if (send_byte(len, APPLESMC_DATA_PORT)) { pr_warn(%.4s: read len fail\n, key); return -EIO; @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) buffer[i] = inb(APPLESMC_DATA_PORT); } + /* Read the data port until bit0 is cleared */ + for (i = 0; i 16; i++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (i) + pr_warn(flushed %d bytes, last value is: %d\n, i, data); + return 0; } -- 1.8.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 2, 2013, at 12:02 PM, Guenter Roeck li...@roeck-us.net wrote: On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? That might make sense. It also points at the possibility of a smaller patch to test, but I have not had the time to check this very deeply myself: I like this patch much more than the previous patch. Chris, can you test it ? Yes. Building now. What kernel message should I be looking for? At least on 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel with previous patch at least is not causing problems on them so far, which works well as I can test more on the 2008 model. Chris Murphy-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote: On Oct 2, 2013, at 12:02 PM, Guenter Roeck li...@roeck-us.net wrote: On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? That might make sense. It also points at the possibility of a smaller patch to test, but I have not had the time to check this very deeply myself: I like this patch much more than the previous patch. Chris, can you test it ? Yes. Building now. What kernel message should I be looking for? At least on 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel with previous patch at least is not causing problems on them so far, which works well as I can test more on the 2008 model. None, if I understand correctly and if the patch really fixes the root cause of the problem. Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 2, 2013, at 2:59 PM, Guenter Roeck li...@roeck-us.net wrote: On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote: On Oct 2, 2013, at 12:02 PM, Guenter Roeck li...@roeck-us.net wrote: On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? That might make sense. It also points at the possibility of a smaller patch to test, but I have not had the time to check this very deeply myself: I like this patch much more than the previous patch. Chris, can you test it ? Yes. Building now. What kernel message should I be looking for? At least on 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel with previous patch at least is not causing problems on them so far, which works well as I can test more on the 2008 model. None, if I understand correctly and if the patch really fixes the root cause of the problem. A vast majority of the Ooops I've had are when booting from flash media, testing Fedora installs. Is it possible the much slower kernel load and boot time is a trigger? If so, I'll look into modifying the media to accept the custom kernel and requisite fat initramfs. Chris-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Oct 02, 2013 at 03:34:41PM -0600, Chris Murphy wrote: On Oct 2, 2013, at 2:59 PM, Guenter Roeck li...@roeck-us.net wrote: On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote: On Oct 2, 2013, at 12:02 PM, Guenter Roeck li...@roeck-us.net wrote: On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote: On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote: On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote: One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Not necessarily - it could be that the patch actually fixes the root cause. One possible scenario is that on recent SMCs, some of the commands produce more data than we actually read. This would eventually lead to both data corruption and overflow somwhere in the SMC internals. If the original SMC error is interpreted as a read buffer overflow, then that problem should be fixed with this patch. Good point. But shouldn't we at least get the flushed %d bytes warning message in this case ? The explanation I have there is that the (newer) SMC needs the application to read the 'no more bytes' or it will get confused. It makes sense, if the number of bytes to read is no longer specified. You mean that just reading from APPLESMC_CMD_PORT would solve the problem ? That might make sense. It also points at the possibility of a smaller patch to test, but I have not had the time to check this very deeply myself: I like this patch much more than the previous patch. Chris, can you test it ? Yes. Building now. What kernel message should I be looking for? At least on 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel with previous patch at least is not causing problems on them so far, which works well as I can test more on the 2008 model. None, if I understand correctly and if the patch really fixes the root cause of the problem. A vast majority of the Ooops I've had are when booting from flash media, testing Fedora installs. Is it possible the much slower kernel load and boot time is a trigger? If so, I'll look into modifying the media to accept the custom kernel and requisite fat initramfs. Yes, that could be a possible trigger. I thought it might be triggered by faster boot (as one gets with 3.10 and 3.11), but slow boot is just as likely. Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 10/01/2013 08:55 PM, Chris Murphy wrote: On Oct 1, 2013, at 9:51 PM, Guenter Roeck wrote: On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote: On Oct 1, 2013, at 10:24 AM, Guenter Roeck wrote: On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: On Oct 1, 2013, at 9:19 AM, Guenter Roeck wrote: On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). Only requirement is that it also includes the previous patch, so it would be optimal if you can apply it on top of the previous image. Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 Hi Chris, That only means that you did not hit the problem. There may be some secondary trigger (cold boot ? coffee on the cpu ?). One thing I have seen in all logs is the earlier "send_byte fail" message, so I think that is a pre-requisite. I have no idea how to trigger it. I have tried cold and warm boots. Boots between linux and OS X to linux. *shrug* I'll keep trying as I'm doing other testing, maybe I'll stumble onto it. I am sure you didn't pour coffee onto the CPU yet :) Basic rule of testing: You'll only hit the problem again after you are convinced that it was magically fixed by a completely unrelated change. Cheers, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 1, 2013, at 9:51 PM, Guenter Roeck wrote: > On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote: >> >> On Oct 1, 2013, at 10:24 AM, Guenter Roeck wrote: >> >>> On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: On Oct 1, 2013, at 9:19 AM, Guenter Roeck wrote: > On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 >>> >>> Explains the crash, but the new key count is very wrong. 1174405121 = >>> 0x4601. >>> Which I guess explains the subsequent memory allocation error in the >>> log. >>> >>> Henrik, any idea what might be going on ? Is it possible that the >>> previous >>> command failure leaves some state machine in a bad state ? >> >> I seem to recall a report on another similar state problem on newer >> machines, so maybe, yes. Older machines seem fine, I have never >> encountered the problem myself. Here is a patch to test that >> theory. It has been tested to be pretty harmless on two different >> generations. >> >> I really really do not want to add an 'if (value is insane)' check ;-) >> > Chris, > > any chance you can load this patch on an affected machine so we can get > test feedback ? This one is too experimental to submit upstream without > knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). >>> Only requirement is that it also includes the previous patch, so it would be >>> optimal if you can apply it on top of the previous image. >> >> Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after >> ~dozen reboots, I'm not triggering the problem. The only items in dmesg with >> smc in it: >> >> [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 >> [ 13.833402] input: applesmc as >> /devices/platform/applesmc.768/input/input10 >> > > Hi Chris, > > That only means that you did not hit the problem. There may be some secondary > trigger (cold boot ? coffee on the cpu ?). > > One thing I have seen in all logs is the earlier "send_byte fail" message, so > I think that is a pre-requisite. I have no idea how to trigger it. I have tried cold and warm boots. Boots between linux and OS X to linux. *shrug* I'll keep trying as I'm doing other testing, maybe I'll stumble onto it. Chris-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote: > > On Oct 1, 2013, at 10:24 AM, Guenter Roeck wrote: > > > On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: > >> > >> On Oct 1, 2013, at 9:19 AM, Guenter Roeck wrote: > >> > >>> On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: > >> Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. > >> > >> [ 10.886016] applesmc: key count changed from 261 to 1174405121 > >> > > > > Explains the crash, but the new key count is very wrong. 1174405121 = > > 0x4601. > > Which I guess explains the subsequent memory allocation error in the > > log. > > > > Henrik, any idea what might be going on ? Is it possible that the > > previous > > command failure leaves some state machine in a bad state ? > > I seem to recall a report on another similar state problem on newer > machines, so maybe, yes. Older machines seem fine, I have never > encountered the problem myself. Here is a patch to test that > theory. It has been tested to be pretty harmless on two different > generations. > > I really really do not want to add an 'if (value is insane)' check ;-) > > >>> Chris, > >>> > >>> any chance you can load this patch on an affected machine so we can get > >>> test feedback ? This one is too experimental to submit upstream without > >>> knowing that it really fixes the problem. > >> > >> Yes. What kernel.org source version should I apply it against? I'd use the > >> non-debug config file from an equivalent version Fedora kernel, unless > >> asked otherwise. And also should I test it on other vintages? I have here > >> MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). > >> > > Only requirement is that it also includes the previous patch, so it would be > > optimal if you can apply it on top of the previous image. > > Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after > ~dozen reboots, I'm not triggering the problem. The only items in dmesg with > smc in it: > > [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 > [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 > Hi Chris, That only means that you did not hit the problem. There may be some secondary trigger (cold boot ? coffee on the cpu ?). One thing I have seen in all logs is the earlier "send_byte fail" message, so I think that is a pre-requisite. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 1, 2013, at 10:24 AM, Guenter Roeck wrote: > On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: >> >> On Oct 1, 2013, at 9:19 AM, Guenter Roeck wrote: >> >>> On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: >> Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. >> >> [ 10.886016] applesmc: key count changed from 261 to 1174405121 >> > > Explains the crash, but the new key count is very wrong. 1174405121 = > 0x4601. > Which I guess explains the subsequent memory allocation error in the log. > > Henrik, any idea what might be going on ? Is it possible that the previous > command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) >>> Chris, >>> >>> any chance you can load this patch on an affected machine so we can get >>> test feedback ? This one is too experimental to submit upstream without >>> knowing that it really fixes the problem. >> >> Yes. What kernel.org source version should I apply it against? I'd use the >> non-debug config file from an equivalent version Fedora kernel, unless asked >> otherwise. And also should I test it on other vintages? I have here >> MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). >> > Only requirement is that it also includes the previous patch, so it would be > optimal if you can apply it on top of the previous image. Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 Chris-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: > > On Oct 1, 2013, at 9:19 AM, Guenter Roeck wrote: > > > On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: > Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. > > [ 10.886016] applesmc: key count changed from 261 to 1174405121 > > >>> > >>> Explains the crash, but the new key count is very wrong. 1174405121 = > >>> 0x4601. > >>> Which I guess explains the subsequent memory allocation error in the log. > >>> > >>> Henrik, any idea what might be going on ? Is it possible that the previous > >>> command failure leaves some state machine in a bad state ? > >> > >> I seem to recall a report on another similar state problem on newer > >> machines, so maybe, yes. Older machines seem fine, I have never > >> encountered the problem myself. Here is a patch to test that > >> theory. It has been tested to be pretty harmless on two different > >> generations. > >> > >> I really really do not want to add an 'if (value is insane)' check ;-) > >> > > Chris, > > > > any chance you can load this patch on an affected machine so we can get > > test feedback ? This one is too experimental to submit upstream without > > knowing that it really fixes the problem. > > Yes. What kernel.org source version should I apply it against? I'd use the > non-debug config file from an equivalent version Fedora kernel, unless asked > otherwise. And also should I test it on other vintages? I have here > MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). > Only requirement is that it also includes the previous patch, so it would be optimal if you can apply it on top of the previous image. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: > > >Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. > > > > > >[ 10.886016] applesmc: key count changed from 261 to 1174405121 > > > > > > > Explains the crash, but the new key count is very wrong. 1174405121 = > > 0x4601. > > Which I guess explains the subsequent memory allocation error in the log. > > > > Henrik, any idea what might be going on ? Is it possible that the previous > > command failure leaves some state machine in a bad state ? > > I seem to recall a report on another similar state problem on newer > machines, so maybe, yes. Older machines seem fine, I have never > encountered the problem myself. Here is a patch to test that > theory. It has been tested to be pretty harmless on two different > generations. > > I really really do not want to add an 'if (value is insane)' check ;-) > Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Thanks, Guenter > Thanks, > Henrik > > From d48a9e4e6e45dcd9c7e7ad88df714b92772a62d6 Mon Sep 17 00:00:00 2001 > From: Henrik Rydberg > Date: Tue, 1 Oct 2013 12:16:09 +0200 > Subject: [PATCH] applesmc remedy take 1 > > Conjectured problem: there are remnant bytes ready on the data line > which corrupts the read after a failure. > > Remedy: assuming bit0 is the read valid line, try to flush it before > starting a new command. > > Exception: the write-number-of-bytes-to-read command seems to differ > between models (it may not be needed on the newest), so do not try to > flush the data at that particular point. > > Tested on a MacBookPro10,1 and a MacBookAir3,1, but the original problem > has not been reproduced, so the actual effect of this patch is unknown. > --- > drivers/hwmon/applesmc.c | 29 + > 1 file changed, 21 insertions(+), 8 deletions(-) > > diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c > index 98814d1..f6eaf6d1 100644 > --- a/drivers/hwmon/applesmc.c > +++ b/drivers/hwmon/applesmc.c > @@ -186,10 +186,23 @@ static int wait_read(void) > * send_byte - Write to SMC port, retrying when necessary. Callers > * must hold applesmc_lock. > */ > -static int send_byte(u8 cmd, u16 port) > +static int send_byte(u8 cmd, u16 port, bool flush) > { > - u8 status; > - int us; > + u8 status, data; > + int us, nskip; > + > + if (flush) { > + /* read the data port until bit0 is cleared */ > + for (nskip = 0; nskip < 16; nskip++) { > + udelay(APPLESMC_MIN_WAIT); > + status = inb(APPLESMC_CMD_PORT); > + if (!(status & 0x01)) > + break; > + data = inb(APPLESMC_DATA_PORT); > + } > + if (nskip) > + pr_warn("flushed %d bytes\n", nskip); > + } > > outb(cmd, port); > for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) { > @@ -215,7 +228,7 @@ static int send_byte(u8 cmd, u16 port) > > static int send_command(u8 cmd) > { > - return send_byte(cmd, APPLESMC_CMD_PORT); > + return send_byte(cmd, APPLESMC_CMD_PORT, true); > } > > static int send_argument(const char *key) > @@ -223,7 +236,7 @@ static int send_argument(const char *key) > int i; > > for (i = 0; i < 4; i++) > - if (send_byte(key[i], APPLESMC_DATA_PORT)) > + if (send_byte(key[i], APPLESMC_DATA_PORT, true)) > return -EIO; > return 0; > } > @@ -237,7 +250,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, > u8 len) > return -EIO; > } > > - if (send_byte(len, APPLESMC_DATA_PORT)) { > + if (send_byte(len, APPLESMC_DATA_PORT, false)) { > pr_warn("%.4s: read len fail\n", key); > return -EIO; > } > @@ -262,13 +275,13 @@ static int write_smc(u8 cmd, const char *key, const u8 > *buffer, u8 len) > return -EIO; > } > > - if (send_byte(len, APPLESMC_DATA_PORT)) { > + if (send_byte(len, APPLESMC_DATA_PORT, true)) { > pr_warn("%.4s: write len fail\n", key); > return -EIO; > } > > for (i = 0; i < len; i++) { > - if (send_byte(buffer[i], APPLESMC_DATA_PORT)) { > + if (send_byte(buffer[i], APPLESMC_DATA_PORT, true)) { > pr_warn("%s: write data fail\n", key); > return -EIO; > } > -- > 1.8.3.4 > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 1, 2013, at 9:19 AM, Guenter Roeck wrote: > On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 >>> >>> Explains the crash, but the new key count is very wrong. 1174405121 = >>> 0x4601. >>> Which I guess explains the subsequent memory allocation error in the log. >>> >>> Henrik, any idea what might be going on ? Is it possible that the previous >>> command failure leaves some state machine in a bad state ? >> >> I seem to recall a report on another similar state problem on newer >> machines, so maybe, yes. Older machines seem fine, I have never >> encountered the problem myself. Here is a patch to test that >> theory. It has been tested to be pretty harmless on two different >> generations. >> >> I really really do not want to add an 'if (value is insane)' check ;-) >> > Chris, > > any chance you can load this patch on an affected machine so we can get > test feedback ? This one is too experimental to submit upstream without > knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). Chris-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
> >Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. > > > >[ 10.886016] applesmc: key count changed from 261 to 1174405121 > > > > Explains the crash, but the new key count is very wrong. 1174405121 = > 0x4601. > Which I guess explains the subsequent memory allocation error in the log. > > Henrik, any idea what might be going on ? Is it possible that the previous > command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Thanks, Henrik >From d48a9e4e6e45dcd9c7e7ad88df714b92772a62d6 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Tue, 1 Oct 2013 12:16:09 +0200 Subject: [PATCH] applesmc remedy take 1 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Exception: the write-number-of-bytes-to-read command seems to differ between models (it may not be needed on the newest), so do not try to flush the data at that particular point. Tested on a MacBookPro10,1 and a MacBookAir3,1, but the original problem has not been reproduced, so the actual effect of this patch is unknown. --- drivers/hwmon/applesmc.c | 29 + 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..f6eaf6d1 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -186,10 +186,23 @@ static int wait_read(void) * send_byte - Write to SMC port, retrying when necessary. Callers * must hold applesmc_lock. */ -static int send_byte(u8 cmd, u16 port) +static int send_byte(u8 cmd, u16 port, bool flush) { - u8 status; - int us; + u8 status, data; + int us, nskip; + + if (flush) { + /* read the data port until bit0 is cleared */ + for (nskip = 0; nskip < 16; nskip++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status & 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (nskip) + pr_warn("flushed %d bytes\n", nskip); + } outb(cmd, port); for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) { @@ -215,7 +228,7 @@ static int send_byte(u8 cmd, u16 port) static int send_command(u8 cmd) { - return send_byte(cmd, APPLESMC_CMD_PORT); + return send_byte(cmd, APPLESMC_CMD_PORT, true); } static int send_argument(const char *key) @@ -223,7 +236,7 @@ static int send_argument(const char *key) int i; for (i = 0; i < 4; i++) - if (send_byte(key[i], APPLESMC_DATA_PORT)) + if (send_byte(key[i], APPLESMC_DATA_PORT, true)) return -EIO; return 0; } @@ -237,7 +250,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } - if (send_byte(len, APPLESMC_DATA_PORT)) { + if (send_byte(len, APPLESMC_DATA_PORT, false)) { pr_warn("%.4s: read len fail\n", key); return -EIO; } @@ -262,13 +275,13 @@ static int write_smc(u8 cmd, const char *key, const u8 *buffer, u8 len) return -EIO; } - if (send_byte(len, APPLESMC_DATA_PORT)) { + if (send_byte(len, APPLESMC_DATA_PORT, true)) { pr_warn("%.4s: write len fail\n", key); return -EIO; } for (i = 0; i < len; i++) { - if (send_byte(buffer[i], APPLESMC_DATA_PORT)) { + if (send_byte(buffer[i], APPLESMC_DATA_PORT, true)) { pr_warn("%s: write data fail\n", key); return -EIO; } -- 1.8.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Thanks, Henrik From d48a9e4e6e45dcd9c7e7ad88df714b92772a62d6 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Tue, 1 Oct 2013 12:16:09 +0200 Subject: [PATCH] applesmc remedy take 1 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Exception: the write-number-of-bytes-to-read command seems to differ between models (it may not be needed on the newest), so do not try to flush the data at that particular point. Tested on a MacBookPro10,1 and a MacBookAir3,1, but the original problem has not been reproduced, so the actual effect of this patch is unknown. --- drivers/hwmon/applesmc.c | 29 + 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..f6eaf6d1 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -186,10 +186,23 @@ static int wait_read(void) * send_byte - Write to SMC port, retrying when necessary. Callers * must hold applesmc_lock. */ -static int send_byte(u8 cmd, u16 port) +static int send_byte(u8 cmd, u16 port, bool flush) { - u8 status; - int us; + u8 status, data; + int us, nskip; + + if (flush) { + /* read the data port until bit0 is cleared */ + for (nskip = 0; nskip 16; nskip++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (nskip) + pr_warn(flushed %d bytes\n, nskip); + } outb(cmd, port); for (us = APPLESMC_MIN_WAIT; us APPLESMC_MAX_WAIT; us = 1) { @@ -215,7 +228,7 @@ static int send_byte(u8 cmd, u16 port) static int send_command(u8 cmd) { - return send_byte(cmd, APPLESMC_CMD_PORT); + return send_byte(cmd, APPLESMC_CMD_PORT, true); } static int send_argument(const char *key) @@ -223,7 +236,7 @@ static int send_argument(const char *key) int i; for (i = 0; i 4; i++) - if (send_byte(key[i], APPLESMC_DATA_PORT)) + if (send_byte(key[i], APPLESMC_DATA_PORT, true)) return -EIO; return 0; } @@ -237,7 +250,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } - if (send_byte(len, APPLESMC_DATA_PORT)) { + if (send_byte(len, APPLESMC_DATA_PORT, false)) { pr_warn(%.4s: read len fail\n, key); return -EIO; } @@ -262,13 +275,13 @@ static int write_smc(u8 cmd, const char *key, const u8 *buffer, u8 len) return -EIO; } - if (send_byte(len, APPLESMC_DATA_PORT)) { + if (send_byte(len, APPLESMC_DATA_PORT, true)) { pr_warn(%.4s: write len fail\n, key); return -EIO; } for (i = 0; i len; i++) { - if (send_byte(buffer[i], APPLESMC_DATA_PORT)) { + if (send_byte(buffer[i], APPLESMC_DATA_PORT, true)) { pr_warn(%s: write data fail\n, key); return -EIO; } -- 1.8.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). Chris-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Thanks, Guenter Thanks, Henrik From d48a9e4e6e45dcd9c7e7ad88df714b92772a62d6 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Tue, 1 Oct 2013 12:16:09 +0200 Subject: [PATCH] applesmc remedy take 1 Conjectured problem: there are remnant bytes ready on the data line which corrupts the read after a failure. Remedy: assuming bit0 is the read valid line, try to flush it before starting a new command. Exception: the write-number-of-bytes-to-read command seems to differ between models (it may not be needed on the newest), so do not try to flush the data at that particular point. Tested on a MacBookPro10,1 and a MacBookAir3,1, but the original problem has not been reproduced, so the actual effect of this patch is unknown. --- drivers/hwmon/applesmc.c | 29 + 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 98814d1..f6eaf6d1 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -186,10 +186,23 @@ static int wait_read(void) * send_byte - Write to SMC port, retrying when necessary. Callers * must hold applesmc_lock. */ -static int send_byte(u8 cmd, u16 port) +static int send_byte(u8 cmd, u16 port, bool flush) { - u8 status; - int us; + u8 status, data; + int us, nskip; + + if (flush) { + /* read the data port until bit0 is cleared */ + for (nskip = 0; nskip 16; nskip++) { + udelay(APPLESMC_MIN_WAIT); + status = inb(APPLESMC_CMD_PORT); + if (!(status 0x01)) + break; + data = inb(APPLESMC_DATA_PORT); + } + if (nskip) + pr_warn(flushed %d bytes\n, nskip); + } outb(cmd, port); for (us = APPLESMC_MIN_WAIT; us APPLESMC_MAX_WAIT; us = 1) { @@ -215,7 +228,7 @@ static int send_byte(u8 cmd, u16 port) static int send_command(u8 cmd) { - return send_byte(cmd, APPLESMC_CMD_PORT); + return send_byte(cmd, APPLESMC_CMD_PORT, true); } static int send_argument(const char *key) @@ -223,7 +236,7 @@ static int send_argument(const char *key) int i; for (i = 0; i 4; i++) - if (send_byte(key[i], APPLESMC_DATA_PORT)) + if (send_byte(key[i], APPLESMC_DATA_PORT, true)) return -EIO; return 0; } @@ -237,7 +250,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) return -EIO; } - if (send_byte(len, APPLESMC_DATA_PORT)) { + if (send_byte(len, APPLESMC_DATA_PORT, false)) { pr_warn(%.4s: read len fail\n, key); return -EIO; } @@ -262,13 +275,13 @@ static int write_smc(u8 cmd, const char *key, const u8 *buffer, u8 len) return -EIO; } - if (send_byte(len, APPLESMC_DATA_PORT)) { + if (send_byte(len, APPLESMC_DATA_PORT, true)) { pr_warn(%.4s: write len fail\n, key); return -EIO; } for (i = 0; i len; i++) { - if (send_byte(buffer[i], APPLESMC_DATA_PORT)) { + if (send_byte(buffer[i], APPLESMC_DATA_PORT, true)) { pr_warn(%s: write data fail\n, key); return -EIO; } -- 1.8.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). Only requirement is that it also includes the previous patch, so it would be optimal if you can apply it on top of the previous image. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 1, 2013, at 10:24 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). Only requirement is that it also includes the previous patch, so it would be optimal if you can apply it on top of the previous image. Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 Chris-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote: On Oct 1, 2013, at 10:24 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). Only requirement is that it also includes the previous patch, so it would be optimal if you can apply it on top of the previous image. Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 Hi Chris, That only means that you did not hit the problem. There may be some secondary trigger (cold boot ? coffee on the cpu ?). One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Oct 1, 2013, at 9:51 PM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote: On Oct 1, 2013, at 10:24 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). Only requirement is that it also includes the previous patch, so it would be optimal if you can apply it on top of the previous image. Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 Hi Chris, That only means that you did not hit the problem. There may be some secondary trigger (cold boot ? coffee on the cpu ?). One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. I have no idea how to trigger it. I have tried cold and warm boots. Boots between linux and OS X to linux. *shrug* I'll keep trying as I'm doing other testing, maybe I'll stumble onto it. Chris-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 10/01/2013 08:55 PM, Chris Murphy wrote: On Oct 1, 2013, at 9:51 PM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote: On Oct 1, 2013, at 10:24 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote: On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote: On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote: Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? I seem to recall a report on another similar state problem on newer machines, so maybe, yes. Older machines seem fine, I have never encountered the problem myself. Here is a patch to test that theory. It has been tested to be pretty harmless on two different generations. I really really do not want to add an 'if (value is insane)' check ;-) Chris, any chance you can load this patch on an affected machine so we can get test feedback ? This one is too experimental to submit upstream without knowing that it really fixes the problem. Yes. What kernel.org source version should I apply it against? I'd use the non-debug config file from an equivalent version Fedora kernel, unless asked otherwise. And also should I test it on other vintages? I have here MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012). Only requirement is that it also includes the previous patch, so it would be optimal if you can apply it on top of the previous image. Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after ~dozen reboots, I'm not triggering the problem. The only items in dmesg with smc in it: [ 13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1 [ 13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10 Hi Chris, That only means that you did not hit the problem. There may be some secondary trigger (cold boot ? coffee on the cpu ?). One thing I have seen in all logs is the earlier send_byte fail message, so I think that is a pre-requisite. I have no idea how to trigger it. I have tried cold and warm boots. Boots between linux and OS X to linux. *shrug* I'll keep trying as I'm doing other testing, maybe I'll stumble onto it. I am sure you didn't pour coffee onto the CPU yet :) Basic rule of testing: You'll only hit the problem again after you are convinced that it was magically fixed by a completely unrelated change. Cheers, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 09/30/2013 06:57 PM, Chris Murphy wrote: On Sep 27, 2013, at 5:33 PM, Guenter Roeck wrote: On 09/27/2013 11:03 AM, Chris Murphy wrote: On Sep 27, 2013, at 11:59 AM, Guenter Roeck wrote: On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: On Sep 27, 2013, at 11:12 AM, Guenter Roeck wrote: On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: Suggested-by: Guenter Roeck Signed-off-by: Henrik Rydberg --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. When you do, can you possibly trigger an event based on the warning added with the patch ? This might help us to identify if the problem fixed with the patch actually happens. I don't understand the question. I'm uncertain how to trigger, and also what event. The patch includes a new warning message. pr_warn("key count changed from %d to %d\n", s->key_count, count); It would be great if there would be a means to detect if this message is seen in a kernel log, because it would show that the potential crash condition fixed with the patch was actually encountered. This would help us to determine if we actually fixed the problem or not. Of course, we'll know if is wasn't fixed if the system still crashes. Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? Guenter Attaching new full dmesg to the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11 Chris -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Sep 27, 2013, at 5:33 PM, Guenter Roeck wrote: > On 09/27/2013 11:03 AM, Chris Murphy wrote: >> >> On Sep 27, 2013, at 11:59 AM, Guenter Roeck wrote: >> >>> On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: On Sep 27, 2013, at 11:12 AM, Guenter Roeck wrote: > On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: >> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg >> wrote: >> This suggests that initialization may be attempted more than once. >> The key cache >> is allocated only once, but the number of keys is read for each >> attempt. >> >> No idea if that can happen, but if the number of keys can increase >> after >> the first initialization attempt you would have an explanation for >> the crash. > > Good idea, and easy enough to test with the patch below. > Should we apply this patch even though it may not solve the specific problem ? >>> >>> Yes, why not - it certainly won't hurt. I am running it right now, so >>> it is at least run-tested. >>> Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. >>> >>> Yes - I agree that the error state is far-fetched, but it is hard to >>> see any other logical explanation. There is of course always the >>> possibility that the problem is somewhere else completely. >>> >>> Proper patch attached. >>> >>> Thanks, >>> Henrik >>> >>> --- >>> >>> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 >>> From: Henrik Rydberg >>> Date: Thu, 26 Sep 2013 08:33:16 +0200 >>> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding >>> >>> After reports from Chris and Josh Boyer of a rare crash in applesmc, >>> Guenter pointed at the initialization problem fixed below. The patch >>> has not been verified to fix the crash, but should be applied >>> regardless. >>> >>> Reported-by: >>> Suggested-by: Guenter Roeck >>> Signed-off-by: Henrik Rydberg >>> --- >>> drivers/hwmon/applesmc.c | 11 ++- >>> 1 file changed, 10 insertions(+), 1 deletion(-) >> >> Thanks for the quick reply. I'll get this rolled into our kernels soon. >> > I sent a pull request to Linus, so you should be able to pull it from > the upstream kernel shortly. Would be great to get feedback if the patch > solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. >>> When you do, can you possibly trigger an event based on the warning added >>> with the patch ? This might help us to identify if the problem fixed >>> with the patch actually happens. >> >> I don't understand the question. I'm uncertain how to trigger, and also what >> event. >> > > The patch includes a new warning message. > > pr_warn("key count changed from %d to %d\n", >s->key_count, count); > > It would be great if there would be a means to detect if this message is seen > in a kernel log, because it would show that the potential crash condition > fixed with the patch was actually encountered. This would help us to determine > if we actually fixed the problem or not. > > Of course, we'll know if is wasn't fixed if the system still crashes. Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Attaching new full dmesg to the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11 Chris-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Sep 27, 2013, at 5:33 PM, Guenter Roeck li...@roeck-us.net wrote: On 09/27/2013 11:03 AM, Chris Murphy wrote: On Sep 27, 2013, at 11:59 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. When you do, can you possibly trigger an event based on the warning added with the patch ? This might help us to identify if the problem fixed with the patch actually happens. I don't understand the question. I'm uncertain how to trigger, and also what event. The patch includes a new warning message. pr_warn(key count changed from %d to %d\n, s-key_count, count); It would be great if there would be a means to detect if this message is seen in a kernel log, because it would show that the potential crash condition fixed with the patch was actually encountered. This would help us to determine if we actually fixed the problem or not. Of course, we'll know if is wasn't fixed if the system still crashes. Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Attaching new full dmesg to the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11 Chris-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 09/30/2013 06:57 PM, Chris Murphy wrote: On Sep 27, 2013, at 5:33 PM, Guenter Roeck li...@roeck-us.net wrote: On 09/27/2013 11:03 AM, Chris Murphy wrote: On Sep 27, 2013, at 11:59 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. When you do, can you possibly trigger an event based on the warning added with the patch ? This might help us to identify if the problem fixed with the patch actually happens. I don't understand the question. I'm uncertain how to trigger, and also what event. The patch includes a new warning message. pr_warn(key count changed from %d to %d\n, s-key_count, count); It would be great if there would be a means to detect if this message is seen in a kernel log, because it would show that the potential crash condition fixed with the patch was actually encountered. This would help us to determine if we actually fixed the problem or not. Of course, we'll know if is wasn't fixed if the system still crashes. Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. [ 10.886016] applesmc: key count changed from 261 to 1174405121 Explains the crash, but the new key count is very wrong. 1174405121 = 0x4601. Which I guess explains the subsequent memory allocation error in the log. Henrik, any idea what might be going on ? Is it possible that the previous command failure leaves some state machine in a bad state ? Guenter Attaching new full dmesg to the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11 Chris -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 09/27/2013 11:03 AM, Chris Murphy wrote: On Sep 27, 2013, at 11:59 AM, Guenter Roeck wrote: On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: On Sep 27, 2013, at 11:12 AM, Guenter Roeck wrote: On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: Suggested-by: Guenter Roeck Signed-off-by: Henrik Rydberg --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. When you do, can you possibly trigger an event based on the warning added with the patch ? This might help us to identify if the problem fixed with the patch actually happens. I don't understand the question. I'm uncertain how to trigger, and also what event. The patch includes a new warning message. pr_warn("key count changed from %d to %d\n", s->key_count, count); It would be great if there would be a means to detect if this message is seen in a kernel log, because it would show that the potential crash condition fixed with the patch was actually encountered. This would help us to determine if we actually fixed the problem or not. Of course, we'll know if is wasn't fixed if the system still crashes. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Sep 27, 2013, at 11:12 AM, Guenter Roeck wrote: > On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: >> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg wrote: >> This suggests that initialization may be attempted more than once. The >> key cache >> is allocated only once, but the number of keys is read for each attempt. >> >> No idea if that can happen, but if the number of keys can increase after >> the first initialization attempt you would have an explanation for the >> crash. > > Good idea, and easy enough to test with the patch below. > Should we apply this patch even though it may not solve the specific problem ? >>> >>> Yes, why not - it certainly won't hurt. I am running it right now, so >>> it is at least run-tested. >>> Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. >>> >>> Yes - I agree that the error state is far-fetched, but it is hard to >>> see any other logical explanation. There is of course always the >>> possibility that the problem is somewhere else completely. >>> >>> Proper patch attached. >>> >>> Thanks, >>> Henrik >>> >>> --- >>> >>> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 >>> From: Henrik Rydberg >>> Date: Thu, 26 Sep 2013 08:33:16 +0200 >>> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding >>> >>> After reports from Chris and Josh Boyer of a rare crash in applesmc, >>> Guenter pointed at the initialization problem fixed below. The patch >>> has not been verified to fix the crash, but should be applied >>> regardless. >>> >>> Reported-by: >>> Suggested-by: Guenter Roeck >>> Signed-off-by: Henrik Rydberg >>> --- >>> drivers/hwmon/applesmc.c | 11 ++- >>> 1 file changed, 10 insertions(+), 1 deletion(-) >> >> Thanks for the quick reply. I'll get this rolled into our kernels soon. >> > I sent a pull request to Linus, so you should be able to pull it from > the upstream kernel shortly. Would be great to get feedback if the patch > solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. Chris-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Sep 27, 2013, at 11:59 AM, Guenter Roeck wrote: > On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: >> >> On Sep 27, 2013, at 11:12 AM, Guenter Roeck wrote: >> >>> On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. >>> >>> Good idea, and easy enough to test with the patch below. >>> >> Should we apply this patch even though it may not solve the specific >> problem ? > > Yes, why not - it certainly won't hurt. I am running it right now, so > it is at least run-tested. > >> Again, not sure if the key count can change, but the current code is at >> the very >> least inconsistent, as it keeps reading the key count without updating or >> verifying the cache size. > > Yes - I agree that the error state is far-fetched, but it is hard to > see any other logical explanation. There is of course always the > possibility that the problem is somewhere else completely. > > Proper patch attached. > > Thanks, > Henrik > > --- > > From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 > From: Henrik Rydberg > Date: Thu, 26 Sep 2013 08:33:16 +0200 > Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding > > After reports from Chris and Josh Boyer of a rare crash in applesmc, > Guenter pointed at the initialization problem fixed below. The patch > has not been verified to fix the crash, but should be applied > regardless. > > Reported-by: > Suggested-by: Guenter Roeck > Signed-off-by: Henrik Rydberg > --- > drivers/hwmon/applesmc.c | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. >>> I sent a pull request to Linus, so you should be able to pull it from >>> the upstream kernel shortly. Would be great to get feedback if the patch >>> solves the problem (or doesn't). >> >> I'll start running it when it appears in koji. It's very transient, maybe >> one oops per week with lots of (other) testing. I'm not even sure if it >> happens on warm or cold boots or both. >> > When you do, can you possibly trigger an event based on the warning added > with the patch ? This might help us to identify if the problem fixed > with the patch actually happens. I don't understand the question. I'm uncertain how to trigger, and also what event. Chris-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: > On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg wrote: > >> > > This suggests that initialization may be attempted more than once. The > >> > > key cache > >> > > is allocated only once, but the number of keys is read for each > >> > > attempt. > >> > > > >> > > No idea if that can happen, but if the number of keys can increase > >> > > after > >> > > the first initialization attempt you would have an explanation for the > >> > > crash. > >> > > >> > Good idea, and easy enough to test with the patch below. > >> > > >> Should we apply this patch even though it may not solve the specific > >> problem ? > > > > Yes, why not - it certainly won't hurt. I am running it right now, so > > it is at least run-tested. > > > >> Again, not sure if the key count can change, but the current code is at > >> the very > >> least inconsistent, as it keeps reading the key count without updating or > >> verifying the cache size. > > > > Yes - I agree that the error state is far-fetched, but it is hard to > > see any other logical explanation. There is of course always the > > possibility that the problem is somewhere else completely. > > > > Proper patch attached. > > > > Thanks, > > Henrik > > > > --- > > > > From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 > > From: Henrik Rydberg > > Date: Thu, 26 Sep 2013 08:33:16 +0200 > > Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding > > > > After reports from Chris and Josh Boyer of a rare crash in applesmc, > > Guenter pointed at the initialization problem fixed below. The patch > > has not been verified to fix the crash, but should be applied > > regardless. > > > > Reported-by: > > Suggested-by: Guenter Roeck > > Signed-off-by: Henrik Rydberg > > --- > > drivers/hwmon/applesmc.c | 11 ++- > > 1 file changed, 10 insertions(+), 1 deletion(-) > > Thanks for the quick reply. I'll get this rolled into our kernels soon. > Patch is now upstream (commit 5f4513864). Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: > > On Sep 27, 2013, at 11:12 AM, Guenter Roeck wrote: > > > On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: > >> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg > >> wrote: > >> This suggests that initialization may be attempted more than once. The > >> key cache > >> is allocated only once, but the number of keys is read for each > >> attempt. > >> > >> No idea if that can happen, but if the number of keys can increase > >> after > >> the first initialization attempt you would have an explanation for the > >> crash. > > > > Good idea, and easy enough to test with the patch below. > > > Should we apply this patch even though it may not solve the specific > problem ? > >>> > >>> Yes, why not - it certainly won't hurt. I am running it right now, so > >>> it is at least run-tested. > >>> > Again, not sure if the key count can change, but the current code is at > the very > least inconsistent, as it keeps reading the key count without updating or > verifying the cache size. > >>> > >>> Yes - I agree that the error state is far-fetched, but it is hard to > >>> see any other logical explanation. There is of course always the > >>> possibility that the problem is somewhere else completely. > >>> > >>> Proper patch attached. > >>> > >>> Thanks, > >>> Henrik > >>> > >>> --- > >>> > >>> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 > >>> From: Henrik Rydberg > >>> Date: Thu, 26 Sep 2013 08:33:16 +0200 > >>> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding > >>> > >>> After reports from Chris and Josh Boyer of a rare crash in applesmc, > >>> Guenter pointed at the initialization problem fixed below. The patch > >>> has not been verified to fix the crash, but should be applied > >>> regardless. > >>> > >>> Reported-by: > >>> Suggested-by: Guenter Roeck > >>> Signed-off-by: Henrik Rydberg > >>> --- > >>> drivers/hwmon/applesmc.c | 11 ++- > >>> 1 file changed, 10 insertions(+), 1 deletion(-) > >> > >> Thanks for the quick reply. I'll get this rolled into our kernels soon. > >> > > I sent a pull request to Linus, so you should be able to pull it from > > the upstream kernel shortly. Would be great to get feedback if the patch > > solves the problem (or doesn't). > > I'll start running it when it appears in koji. It's very transient, maybe one > oops per week with lots of (other) testing. I'm not even sure if it happens > on warm or cold boots or both. > When you do, can you possibly trigger an event based on the warning added with the patch ? This might help us to identify if the problem fixed with the patch actually happens. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: > On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg wrote: > >> > > This suggests that initialization may be attempted more than once. The > >> > > key cache > >> > > is allocated only once, but the number of keys is read for each > >> > > attempt. > >> > > > >> > > No idea if that can happen, but if the number of keys can increase > >> > > after > >> > > the first initialization attempt you would have an explanation for the > >> > > crash. > >> > > >> > Good idea, and easy enough to test with the patch below. > >> > > >> Should we apply this patch even though it may not solve the specific > >> problem ? > > > > Yes, why not - it certainly won't hurt. I am running it right now, so > > it is at least run-tested. > > > >> Again, not sure if the key count can change, but the current code is at > >> the very > >> least inconsistent, as it keeps reading the key count without updating or > >> verifying the cache size. > > > > Yes - I agree that the error state is far-fetched, but it is hard to > > see any other logical explanation. There is of course always the > > possibility that the problem is somewhere else completely. > > > > Proper patch attached. > > > > Thanks, > > Henrik > > > > --- > > > > From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 > > From: Henrik Rydberg > > Date: Thu, 26 Sep 2013 08:33:16 +0200 > > Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding > > > > After reports from Chris and Josh Boyer of a rare crash in applesmc, > > Guenter pointed at the initialization problem fixed below. The patch > > has not been verified to fix the crash, but should be applied > > regardless. > > > > Reported-by: > > Suggested-by: Guenter Roeck > > Signed-off-by: Henrik Rydberg > > --- > > drivers/hwmon/applesmc.c | 11 ++- > > 1 file changed, 10 insertions(+), 1 deletion(-) > > Thanks for the quick reply. I'll get this rolled into our kernels soon. > I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg wrote: >> > > This suggests that initialization may be attempted more than once. The >> > > key cache >> > > is allocated only once, but the number of keys is read for each attempt. >> > > >> > > No idea if that can happen, but if the number of keys can increase after >> > > the first initialization attempt you would have an explanation for the >> > > crash. >> > >> > Good idea, and easy enough to test with the patch below. >> > >> Should we apply this patch even though it may not solve the specific problem >> ? > > Yes, why not - it certainly won't hurt. I am running it right now, so > it is at least run-tested. > >> Again, not sure if the key count can change, but the current code is at the >> very >> least inconsistent, as it keeps reading the key count without updating or >> verifying the cache size. > > Yes - I agree that the error state is far-fetched, but it is hard to > see any other logical explanation. There is of course always the > possibility that the problem is somewhere else completely. > > Proper patch attached. > > Thanks, > Henrik > > --- > > From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 > From: Henrik Rydberg > Date: Thu, 26 Sep 2013 08:33:16 +0200 > Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding > > After reports from Chris and Josh Boyer of a rare crash in applesmc, > Guenter pointed at the initialization problem fixed below. The patch > has not been verified to fix the crash, but should be applied > regardless. > > Reported-by: > Suggested-by: Guenter Roeck > Signed-off-by: Henrik Rydberg > --- > drivers/hwmon/applesmc.c | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. josh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. josh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. When you do, can you possibly trigger an event based on the warning added with the patch ? This might help us to identify if the problem fixed with the patch actually happens. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. Patch is now upstream (commit 5f4513864). Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Sep 27, 2013, at 11:59 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. When you do, can you possibly trigger an event based on the warning added with the patch ? This might help us to identify if the problem fixed with the patch actually happens. I don't understand the question. I'm uncertain how to trigger, and also what event. Chris-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. Chris-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 09/27/2013 11:03 AM, Chris Murphy wrote: On Sep 27, 2013, at 11:59 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote: On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote: On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote: On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Thanks for the quick reply. I'll get this rolled into our kernels soon. I sent a pull request to Linus, so you should be able to pull it from the upstream kernel shortly. Would be great to get feedback if the patch solves the problem (or doesn't). I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both. When you do, can you possibly trigger an event based on the warning added with the patch ? This might help us to identify if the problem fixed with the patch actually happens. I don't understand the question. I'm uncertain how to trigger, and also what event. The patch includes a new warning message. pr_warn(key count changed from %d to %d\n, s-key_count, count); It would be great if there would be a means to detect if this message is seen in a kernel log, because it would show that the potential crash condition fixed with the patch was actually encountered. This would help us to determine if we actually fixed the problem or not. Of course, we'll know if is wasn't fixed if the system still crashes. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
> >Yes - I agree that the error state is far-fetched, but it is hard to > >see any other logical explanation. There is of course always the > >possibility that the problem is somewhere else completely. > > > There are also ACPI conflicts in each of the bug reports I looked at. > Can this play a role, or is that "normal" on apple systems ? I honstestly don't know. On my own machine (MBA3,1), I have problems with the PCIe NIC somehow overwriting the graphics buffer, resulting in a corrupted screen at boot. After a suspend/resume, that problem is gone, but there is surely something fishy going on... Thanks, Henrik -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
> Applied. I'll run my sanity tests before I send it upstream. > I'll also Cc: stable. Great. > Interesting is that this started to happen with 3.10, even though > I did not find any relevant changes in the driver. Is it possible that > changed boot timing (ie reduced boot time) exposes this problem ? I was thinking the same thing - if the SMC takes some time to boot up, it could very well be the culprit. Imagine that, the kernel is too fast. ;-) Thanks, Henrik -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 09/25/2013 11:34 PM, Henrik Rydberg wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: Suggested-by: Guenter Roeck Signed-off-by: Henrik Rydberg --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Applied. I'll run my sanity tests before I send it upstream. I'll also Cc: stable. Interesting is that this started to happen with 3.10, even though I did not find any relevant changes in the driver. Is it possible that changed boot timing (ie reduced boot time) exposes this problem ? Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 09/25/2013 11:34 PM, Henrik Rydberg wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. There are also ACPI conflicts in each of the bug reports I looked at. Can this play a role, or is that "normal" on apple systems ? Thanks, Guenter Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: Suggested-by: Guenter Roeck Signed-off-by: Henrik Rydberg --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 62c2e32..98814d1 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -525,16 +525,25 @@ static int applesmc_init_smcreg_try(void) { struct applesmc_registers *s = bool left_light_sensor, right_light_sensor; + unsigned int count; u8 tmp[1]; int ret; if (s->init_complete) return 0; - ret = read_register_count(>key_count); + ret = read_register_count(); if (ret) return ret; + if (s->cache && s->key_count != count) { + pr_warn("key count changed from %d to %d\n", + s->key_count, count); + kfree(s->cache); + s->cache = NULL; + } + s->key_count = count; + if (!s->cache) s->cache = kcalloc(s->key_count, sizeof(*s->cache), GFP_KERNEL); if (!s->cache) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
> > > This suggests that initialization may be attempted more than once. The > > > key cache > > > is allocated only once, but the number of keys is read for each attempt. > > > > > > No idea if that can happen, but if the number of keys can increase after > > > the first initialization attempt you would have an explanation for the > > > crash. > > > > Good idea, and easy enough to test with the patch below. > > > Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. > Again, not sure if the key count can change, but the current code is at the > very > least inconsistent, as it keeps reading the key count without updating or > verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- >From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: Suggested-by: Guenter Roeck Signed-off-by: Henrik Rydberg --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 62c2e32..98814d1 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -525,16 +525,25 @@ static int applesmc_init_smcreg_try(void) { struct applesmc_registers *s = bool left_light_sensor, right_light_sensor; + unsigned int count; u8 tmp[1]; int ret; if (s->init_complete) return 0; - ret = read_register_count(>key_count); + ret = read_register_count(); if (ret) return ret; + if (s->cache && s->key_count != count) { + pr_warn("key count changed from %d to %d\n", + s->key_count, count); + kfree(s->cache); + s->cache = NULL; + } + s->key_count = count; + if (!s->cache) s->cache = kcalloc(s->key_count, sizeof(*s->cache), GFP_KERNEL); if (!s->cache) -- 1.8.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 62c2e32..98814d1 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -525,16 +525,25 @@ static int applesmc_init_smcreg_try(void) { struct applesmc_registers *s = smcreg; bool left_light_sensor, right_light_sensor; + unsigned int count; u8 tmp[1]; int ret; if (s-init_complete) return 0; - ret = read_register_count(s-key_count); + ret = read_register_count(count); if (ret) return ret; + if (s-cache s-key_count != count) { + pr_warn(key count changed from %d to %d\n, + s-key_count, count); + kfree(s-cache); + s-cache = NULL; + } + s-key_count = count; + if (!s-cache) s-cache = kcalloc(s-key_count, sizeof(*s-cache), GFP_KERNEL); if (!s-cache) -- 1.8.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 09/25/2013 11:34 PM, Henrik Rydberg wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. There are also ACPI conflicts in each of the bug reports I looked at. Can this play a role, or is that normal on apple systems ? Thanks, Guenter Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 62c2e32..98814d1 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -525,16 +525,25 @@ static int applesmc_init_smcreg_try(void) { struct applesmc_registers *s = smcreg; bool left_light_sensor, right_light_sensor; + unsigned int count; u8 tmp[1]; int ret; if (s-init_complete) return 0; - ret = read_register_count(s-key_count); + ret = read_register_count(count); if (ret) return ret; + if (s-cache s-key_count != count) { + pr_warn(key count changed from %d to %d\n, + s-key_count, count); + kfree(s-cache); + s-cache = NULL; + } + s-key_count = count; + if (!s-cache) s-cache = kcalloc(s-key_count, sizeof(*s-cache), GFP_KERNEL); if (!s-cache) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On 09/25/2013 11:34 PM, Henrik Rydberg wrote: This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Yes, why not - it certainly won't hurt. I am running it right now, so it is at least run-tested. Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. Proper patch attached. Thanks, Henrik --- From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001 From: Henrik Rydberg rydb...@euromail.se Date: Thu, 26 Sep 2013 08:33:16 +0200 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding After reports from Chris and Josh Boyer of a rare crash in applesmc, Guenter pointed at the initialization problem fixed below. The patch has not been verified to fix the crash, but should be applied regardless. Reported-by: jwbo...@fedoraproject.org Suggested-by: Guenter Roeck li...@roeck-us.net Signed-off-by: Henrik Rydberg rydb...@euromail.se --- drivers/hwmon/applesmc.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) Applied. I'll run my sanity tests before I send it upstream. I'll also Cc: stable. Interesting is that this started to happen with 3.10, even though I did not find any relevant changes in the driver. Is it possible that changed boot timing (ie reduced boot time) exposes this problem ? Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
Applied. I'll run my sanity tests before I send it upstream. I'll also Cc: stable. Great. Interesting is that this started to happen with 3.10, even though I did not find any relevant changes in the driver. Is it possible that changed boot timing (ie reduced boot time) exposes this problem ? I was thinking the same thing - if the SMC takes some time to boot up, it could very well be the culprit. Imagine that, the kernel is too fast. ;-) Thanks, Henrik -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
Yes - I agree that the error state is far-fetched, but it is hard to see any other logical explanation. There is of course always the possibility that the problem is somewhere else completely. There are also ACPI conflicts in each of the bug reports I looked at. Can this play a role, or is that normal on apple systems ? I honstestly don't know. On my own machine (MBA3,1), I have problems with the PCIe NIC somehow overwriting the graphics buffer, resulting in a corrupted screen at boot. After a suspend/resume, that problem is gone, but there is surely something fishy going on... Thanks, Henrik -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Sep 25, 2013 at 11:48:07PM +0200, Henrik Rydberg wrote: > On Wed, Sep 25, 2013 at 12:56:28PM -0700, Guenter Roeck wrote: > > On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote: > > > Hi All, > > > > > > Chris has reported[1] an issue with the applesmc driver in the 3.10 > > > and 3.11 kernels. This seems to be a bit transient, but the oops is > > > consistent across those releases. This is with a MacBook Pro 4,1. > > > The backtrace is below. > > > > > > Any ideas on this one? > > > > > Some of the bug reports suggest that there are related messages in at least > > some > > of the kernel logs. > > > > Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: > > 0x01 > > Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail > > > > This suggests that initialization may be attempted more than once. The key > > cache > > is allocated only once, but the number of keys is read for each attempt. > > > > No idea if that can happen, but if the number of keys can increase after > > the first initialization attempt you would have an explanation for the > > crash. > > Good idea, and easy enough to test with the patch below. > Should we apply this patch even though it may not solve the specific problem ? Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Thanks, Guenter > diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c > index 62c2e32..d962c4d 100644 > --- a/drivers/hwmon/applesmc.c > +++ b/drivers/hwmon/applesmc.c > @@ -525,16 +525,24 @@ static int applesmc_init_smcreg_try(void) > { > struct applesmc_registers *s = > bool left_light_sensor, right_light_sensor; > + unsigned int count; > u8 tmp[1]; > int ret; > > if (s->init_complete) > return 0; > > - ret = read_register_count(>key_count); > + ret = read_register_count(); > if (ret) > return ret; > > + if (s->cache && s->key_count != count) { > + pr_warn("key count changed from %d to %d\n", s->key_count, > count); > + kfree(s->cache); > + s->cache = NULL; > + } > + s->key_count = count; > + > if (!s->cache) > s->cache = kcalloc(s->key_count, sizeof(*s->cache), > GFP_KERNEL); > if (!s->cache) > > Thanks, > Henrik > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Sep 25, 2013 at 12:56:28PM -0700, Guenter Roeck wrote: > On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote: > > Hi All, > > > > Chris has reported[1] an issue with the applesmc driver in the 3.10 > > and 3.11 kernels. This seems to be a bit transient, but the oops is > > consistent across those releases. This is with a MacBook Pro 4,1. > > The backtrace is below. > > > > Any ideas on this one? > > > Some of the bug reports suggest that there are related messages in at least > some > of the kernel logs. > > Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: > 0x01 > Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail > > This suggests that initialization may be attempted more than once. The key > cache > is allocated only once, but the number of keys is read for each attempt. > > No idea if that can happen, but if the number of keys can increase after > the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 62c2e32..d962c4d 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -525,16 +525,24 @@ static int applesmc_init_smcreg_try(void) { struct applesmc_registers *s = bool left_light_sensor, right_light_sensor; + unsigned int count; u8 tmp[1]; int ret; if (s->init_complete) return 0; - ret = read_register_count(>key_count); + ret = read_register_count(); if (ret) return ret; + if (s->cache && s->key_count != count) { + pr_warn("key count changed from %d to %d\n", s->key_count, count); + kfree(s->cache); + s->cache = NULL; + } + s->key_count = count; + if (!s->cache) s->cache = kcalloc(s->key_count, sizeof(*s->cache), GFP_KERNEL); if (!s->cache) Thanks, Henrik -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote: > Hi All, > > Chris has reported[1] an issue with the applesmc driver in the 3.10 > and 3.11 kernels. This seems to be a bit transient, but the oops is > consistent across those releases. This is with a MacBook Pro 4,1. > The backtrace is below. > > Any ideas on this one? > Some of the bug reports suggest that there are related messages in at least some of the kernel logs. Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 0x01 Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Guenter > josh > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1011719 > > BUG: unable to handle kernel paging request at 880322aed005 > IP: [] applesmc_get_entry_by_index+0x25/0xf0 [applesmc] > PGD 1fe5067 PUD 0 > Oops: [#1] SMP > Modules linked in: b43 coretemp kvm_intel kvm joydev bcma mac80211 > uvcvideo snd_hda_codec_realtek applesmc(+) iTCO_wdt cfg80211 > iTCO_vendor_support microcode input_polldev snd_hda_intel > snd_hda_codec i2c_i801 videobuf2_vmalloc videobuf2_memops > videobuf2_core videodev snd_hwdep snd_seq bcm5974 btusb hid_appleir > snd_seq_device media bluetooth snd_pcm lpc_ich mfd_core rfkill > snd_page_alloc snd_timer snd apple_bl soundcore shpchp acpi_cpufreq > mperf nfsd auth_rpcgss nfs_acl lockd isofs squashfs btrfs libcrc32c > xor zlib_deflate raid6_pq usb_storage nouveau firewire_ohci > ata_generic pata_acpi firewire_core crc_itu_t sky2 mxm_wmi wmi > i2c_algo_bit ssb drm_kms_helper mmc_core ttm drm i2c_core video sunrpc > loop > CPU: 1 PID: 492 Comm: systemd-udevd Not tainted 3.11.0-300.fc20.x86_64 #1 > Hardware name: Apple Inc. MacBookPro4,1/Mac-F42C89C8, BIOS > MBP41.88Z.00C1.B03.0802271651 02/27/08 > task: 880132a407a0 ti: 8801343fe000 task.ti: 8801343fe000 > RIP: 0010:[] [] > applesmc_get_entry_by_index+0x25/0xf0 [applesmc] > RSP: 0018:8801343ffa18 EFLAGS: 00010286 > RAX: 2600 RBX: 880322aed000 RCX: 8801343ffaf4 > RDX: 7200 RSI: a05df06e RDI: 2600 > RBP: 8801343ffa38 R08: 8801343fe000 R09: 880132a6c100 > R10: R11: 00014180 R12: 2600 > R13: 4c01 R14: a05df06e R15: 8801343ffa84 > FS: 7f6aac863880() GS:88013fd0() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 880322aed005 CR3: 0001340fa000 CR4: 07e0 > Stack: > 0004deef ff10 2600 > 8801343ffa70 a05dd4f1 0032 0001 > a05df06e 880134be3810 a05e0660 8801343ffaa8 > Call Trace: > [] applesmc_get_lower_bound+0x51/0xa0 [applesmc] > [] applesmc_get_entry_by_key+0x23/0xc0 [applesmc] > [] applesmc_read_key+0x19/0x70 [applesmc] > [] applesmc_init_smcreg+0x8f/0x2d0 [applesmc] > [] applesmc_probe+0x12/0x30 [applesmc] > [] platform_drv_probe+0x3c/0x70 > [] ? driver_sysfs_add+0x82/0xb0 > [] driver_probe_device+0x87/0x390 > [] ? driver_probe_device+0x390/0x390 > [] __device_attach+0x3b/0x40 > [] bus_for_each_drv+0x63/0xa0 > [] device_attach+0x88/0xa0 > [] bus_probe_device+0x98/0xc0 > [] device_add+0x4c4/0x7a0 > [] platform_device_add+0xd1/0x2d0 > [] platform_device_register_full+0xe0/0x120 > [] ? 0xa05e2fff > [] applesmc_init+0x98/0x1000 [applesmc] > [] do_one_initcall+0xfa/0x1b0 > [] ? set_memory_nx+0x43/0x50 > [] load_module+0x1bbd/0x2660 > [] ? store_uevent+0x40/0x40 > [] SyS_finit_module+0x86/0xb0 > [] system_call_fastpath+0x16/0x1b > Code: 84 00 00 00 00 00 66 66 66 66 90 55 48 63 c7 48 8d 14 40 48 89 > e5 41 54 41 89 fc 53 48 8d 1c 90 48 83 ec 10 48 03 1d 03 33 00 00 <80> > 7b 05 00 48 89 d8 74 12 48 83 c4 10 5b 41 5c 5d c3 66 0f 1f > RIP [] applesmc_get_entry_by_index+0x25/0xf0 [applesmc] > RSP > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote: Hi All, Chris has reported[1] an issue with the applesmc driver in the 3.10 and 3.11 kernels. This seems to be a bit transient, but the oops is consistent across those releases. This is with a MacBook Pro 4,1. The backtrace is below. Any ideas on this one? Some of the bug reports suggest that there are related messages in at least some of the kernel logs. Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 0x01 Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Guenter josh [1] https://bugzilla.redhat.com/show_bug.cgi?id=1011719 BUG: unable to handle kernel paging request at 880322aed005 IP: [a05dd345] applesmc_get_entry_by_index+0x25/0xf0 [applesmc] PGD 1fe5067 PUD 0 Oops: [#1] SMP Modules linked in: b43 coretemp kvm_intel kvm joydev bcma mac80211 uvcvideo snd_hda_codec_realtek applesmc(+) iTCO_wdt cfg80211 iTCO_vendor_support microcode input_polldev snd_hda_intel snd_hda_codec i2c_i801 videobuf2_vmalloc videobuf2_memops videobuf2_core videodev snd_hwdep snd_seq bcm5974 btusb hid_appleir snd_seq_device media bluetooth snd_pcm lpc_ich mfd_core rfkill snd_page_alloc snd_timer snd apple_bl soundcore shpchp acpi_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd isofs squashfs btrfs libcrc32c xor zlib_deflate raid6_pq usb_storage nouveau firewire_ohci ata_generic pata_acpi firewire_core crc_itu_t sky2 mxm_wmi wmi i2c_algo_bit ssb drm_kms_helper mmc_core ttm drm i2c_core video sunrpc loop CPU: 1 PID: 492 Comm: systemd-udevd Not tainted 3.11.0-300.fc20.x86_64 #1 Hardware name: Apple Inc. MacBookPro4,1/Mac-F42C89C8, BIOS MBP41.88Z.00C1.B03.0802271651 02/27/08 task: 880132a407a0 ti: 8801343fe000 task.ti: 8801343fe000 RIP: 0010:[a05dd345] [a05dd345] applesmc_get_entry_by_index+0x25/0xf0 [applesmc] RSP: 0018:8801343ffa18 EFLAGS: 00010286 RAX: 2600 RBX: 880322aed000 RCX: 8801343ffaf4 RDX: 7200 RSI: a05df06e RDI: 2600 RBP: 8801343ffa38 R08: 8801343fe000 R09: 880132a6c100 R10: R11: 00014180 R12: 2600 R13: 4c01 R14: a05df06e R15: 8801343ffa84 FS: 7f6aac863880() GS:88013fd0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 880322aed005 CR3: 0001340fa000 CR4: 07e0 Stack: 0004deef ff10 2600 8801343ffa70 a05dd4f1 0032 0001 a05df06e 880134be3810 a05e0660 8801343ffaa8 Call Trace: [a05dd4f1] applesmc_get_lower_bound+0x51/0xa0 [applesmc] [a05dd563] applesmc_get_entry_by_key+0x23/0xc0 [applesmc] [a05dd9f9] applesmc_read_key+0x19/0x70 [applesmc] [a05de08f] applesmc_init_smcreg+0x8f/0x2d0 [applesmc] [a05de2e2] applesmc_probe+0x12/0x30 [applesmc] [813f41dc] platform_drv_probe+0x3c/0x70 [813f18b2] ? driver_sysfs_add+0x82/0xb0 [813f1f77] driver_probe_device+0x87/0x390 [813f2280] ? driver_probe_device+0x390/0x390 [813f22bb] __device_attach+0x3b/0x40 [813eff73] bus_for_each_drv+0x63/0xa0 [813f1e78] device_attach+0x88/0xa0 [813f11e8] bus_probe_device+0x98/0xc0 [813eeeb4] device_add+0x4c4/0x7a0 [813f3cb1] platform_device_add+0xd1/0x2d0 [813f4430] platform_device_register_full+0xe0/0x120 [a05e3000] ? 0xa05e2fff [a05e3098] applesmc_init+0x98/0x1000 [applesmc] [810020fa] do_one_initcall+0xfa/0x1b0 [81052483] ? set_memory_nx+0x43/0x50 [810cc58d] load_module+0x1bbd/0x2660 [810c8890] ? store_uevent+0x40/0x40 [810cd1a6] SyS_finit_module+0x86/0xb0 [81652cd9] system_call_fastpath+0x16/0x1b Code: 84 00 00 00 00 00 66 66 66 66 90 55 48 63 c7 48 8d 14 40 48 89 e5 41 54 41 89 fc 53 48 8d 1c 90 48 83 ec 10 48 03 1d 03 33 00 00 80 7b 05 00 48 89 d8 74 12 48 83 c4 10 5b 41 5c 5d c3 66 0f 1f RIP [a05dd345] applesmc_get_entry_by_index+0x25/0xf0 [applesmc] RSP 8801343ffa18 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Sep 25, 2013 at 12:56:28PM -0700, Guenter Roeck wrote: On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote: Hi All, Chris has reported[1] an issue with the applesmc driver in the 3.10 and 3.11 kernels. This seems to be a bit transient, but the oops is consistent across those releases. This is with a MacBook Pro 4,1. The backtrace is below. Any ideas on this one? Some of the bug reports suggest that there are related messages in at least some of the kernel logs. Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 0x01 Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 62c2e32..d962c4d 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -525,16 +525,24 @@ static int applesmc_init_smcreg_try(void) { struct applesmc_registers *s = smcreg; bool left_light_sensor, right_light_sensor; + unsigned int count; u8 tmp[1]; int ret; if (s-init_complete) return 0; - ret = read_register_count(s-key_count); + ret = read_register_count(count); if (ret) return ret; + if (s-cache s-key_count != count) { + pr_warn(key count changed from %d to %d\n, s-key_count, count); + kfree(s-cache); + s-cache = NULL; + } + s-key_count = count; + if (!s-cache) s-cache = kcalloc(s-key_count, sizeof(*s-cache), GFP_KERNEL); if (!s-cache) Thanks, Henrik -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: applesmc oops in 3.10/3.11
On Wed, Sep 25, 2013 at 11:48:07PM +0200, Henrik Rydberg wrote: On Wed, Sep 25, 2013 at 12:56:28PM -0700, Guenter Roeck wrote: On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote: Hi All, Chris has reported[1] an issue with the applesmc driver in the 3.10 and 3.11 kernels. This seems to be a bit transient, but the oops is consistent across those releases. This is with a MacBook Pro 4,1. The backtrace is below. Any ideas on this one? Some of the bug reports suggest that there are related messages in at least some of the kernel logs. Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 0x01 Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail This suggests that initialization may be attempted more than once. The key cache is allocated only once, but the number of keys is read for each attempt. No idea if that can happen, but if the number of keys can increase after the first initialization attempt you would have an explanation for the crash. Good idea, and easy enough to test with the patch below. Should we apply this patch even though it may not solve the specific problem ? Again, not sure if the key count can change, but the current code is at the very least inconsistent, as it keeps reading the key count without updating or verifying the cache size. Thanks, Guenter diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 62c2e32..d962c4d 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -525,16 +525,24 @@ static int applesmc_init_smcreg_try(void) { struct applesmc_registers *s = smcreg; bool left_light_sensor, right_light_sensor; + unsigned int count; u8 tmp[1]; int ret; if (s-init_complete) return 0; - ret = read_register_count(s-key_count); + ret = read_register_count(count); if (ret) return ret; + if (s-cache s-key_count != count) { + pr_warn(key count changed from %d to %d\n, s-key_count, count); + kfree(s-cache); + s-cache = NULL; + } + s-key_count = count; + if (!s-cache) s-cache = kcalloc(s-key_count, sizeof(*s-cache), GFP_KERNEL); if (!s-cache) Thanks, Henrik -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/