Re: applesmc oops in 3.10/3.11

2013-10-09 Thread Guenter Roeck
On Wed, Oct 09, 2013 at 10:29:39AM +0200, Henrik Rydberg wrote:
> From 18fa0f55b764ad0fe5fc49f81bae281e5110ed56 Mon Sep 17 00:00:00 2001
> From: Henrik Rydberg 
> Date: Wed, 2 Oct 2013 19:15:03 +0200
> Subject: [PATCH] hwmon: (applesmc) Always read until end of data
> 
> The crash reported and investigated in commit 5f4513 turned out to be
> caused by a change to the read interface on newer (2012) SMCs.
> 
> Tests by Chris show that simply reading the data valid line is enough
> for the problem to go away. Additional tests show that the newer SMCs
> no longer wait for the number of requested bytes, but start sending
> data right away.  Apparently the number of bytes to read is no longer
> specified as before, but instead found out by reading until end of
> data. Failure to read until end of data confuses the state machine,
> which eventually causes the crash.
> 
> As a remedy, assuming bit0 is the read valid line, make sure there is
> nothing more to read before leaving the read function.
> 
> Tested to resolve the original problem, and runtested on MBA3,1,
> MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on
> machines before 2012.
> 
> Tested-by: Chris Murphy 
> Signed-off-by: Henrik Rydberg 

Applied. I'll do my usual sanity testing and send it to Linus
either tomorrow or Friday.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-09 Thread Henrik Rydberg
Hi Guenter,

> > I want to make one more test tomorrow, then I'll send a proper patch. 
> > Thanks!
> > 
> Ok, great.

And here it is, with compiler warning fixed.

Thanks everybody,
Henrik

>From 18fa0f55b764ad0fe5fc49f81bae281e5110ed56 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg 
Date: Wed, 2 Oct 2013 19:15:03 +0200
Subject: [PATCH] hwmon: (applesmc) Always read until end of data

The crash reported and investigated in commit 5f4513 turned out to be
caused by a change to the read interface on newer (2012) SMCs.

Tests by Chris show that simply reading the data valid line is enough
for the problem to go away. Additional tests show that the newer SMCs
no longer wait for the number of requested bytes, but start sending
data right away.  Apparently the number of bytes to read is no longer
specified as before, but instead found out by reading until end of
data. Failure to read until end of data confuses the state machine,
which eventually causes the crash.

As a remedy, assuming bit0 is the read valid line, make sure there is
nothing more to read before leaving the read function.

Tested to resolve the original problem, and runtested on MBA3,1,
MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on
machines before 2012.

Tested-by: Chris Murphy 
Signed-off-by: Henrik Rydberg 
---
 drivers/hwmon/applesmc.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 98814d1..3288f13 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -230,6 +230,7 @@ static int send_argument(const char *key)
 
 static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
 {
+   u8 status, data = 0;
int i;
 
if (send_command(cmd) || send_argument(key)) {
@@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 
len)
return -EIO;
}
 
+   /* This has no effect on newer (2012) SMCs */
if (send_byte(len, APPLESMC_DATA_PORT)) {
pr_warn("%.4s: read len fail\n", key);
return -EIO;
@@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
u8 len)
buffer[i] = inb(APPLESMC_DATA_PORT);
}
 
+   /* Read the data port until bit0 is cleared */
+   for (i = 0; i < 16; i++) {
+   udelay(APPLESMC_MIN_WAIT);
+   status = inb(APPLESMC_CMD_PORT);
+   if (!(status & 0x01))
+   break;
+   data = inb(APPLESMC_DATA_PORT);
+   }
+   if (i)
+   pr_warn("flushed %d bytes, last value is: %d\n", i, data);
+
return 0;
 }
 
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-09 Thread Henrik Rydberg
Hi Guenter,

  I want to make one more test tomorrow, then I'll send a proper patch. 
  Thanks!
  
 Ok, great.

And here it is, with compiler warning fixed.

Thanks everybody,
Henrik

From 18fa0f55b764ad0fe5fc49f81bae281e5110ed56 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg rydb...@euromail.se
Date: Wed, 2 Oct 2013 19:15:03 +0200
Subject: [PATCH] hwmon: (applesmc) Always read until end of data

The crash reported and investigated in commit 5f4513 turned out to be
caused by a change to the read interface on newer (2012) SMCs.

Tests by Chris show that simply reading the data valid line is enough
for the problem to go away. Additional tests show that the newer SMCs
no longer wait for the number of requested bytes, but start sending
data right away.  Apparently the number of bytes to read is no longer
specified as before, but instead found out by reading until end of
data. Failure to read until end of data confuses the state machine,
which eventually causes the crash.

As a remedy, assuming bit0 is the read valid line, make sure there is
nothing more to read before leaving the read function.

Tested to resolve the original problem, and runtested on MBA3,1,
MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on
machines before 2012.

Tested-by: Chris Murphy ch...@cmurf.com
Signed-off-by: Henrik Rydberg rydb...@euromail.se
---
 drivers/hwmon/applesmc.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 98814d1..3288f13 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -230,6 +230,7 @@ static int send_argument(const char *key)
 
 static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
 {
+   u8 status, data = 0;
int i;
 
if (send_command(cmd) || send_argument(key)) {
@@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 
len)
return -EIO;
}
 
+   /* This has no effect on newer (2012) SMCs */
if (send_byte(len, APPLESMC_DATA_PORT)) {
pr_warn(%.4s: read len fail\n, key);
return -EIO;
@@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
u8 len)
buffer[i] = inb(APPLESMC_DATA_PORT);
}
 
+   /* Read the data port until bit0 is cleared */
+   for (i = 0; i  16; i++) {
+   udelay(APPLESMC_MIN_WAIT);
+   status = inb(APPLESMC_CMD_PORT);
+   if (!(status  0x01))
+   break;
+   data = inb(APPLESMC_DATA_PORT);
+   }
+   if (i)
+   pr_warn(flushed %d bytes, last value is: %d\n, i, data);
+
return 0;
 }
 
-- 
1.8.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-09 Thread Guenter Roeck
On Wed, Oct 09, 2013 at 10:29:39AM +0200, Henrik Rydberg wrote:
 From 18fa0f55b764ad0fe5fc49f81bae281e5110ed56 Mon Sep 17 00:00:00 2001
 From: Henrik Rydberg rydb...@euromail.se
 Date: Wed, 2 Oct 2013 19:15:03 +0200
 Subject: [PATCH] hwmon: (applesmc) Always read until end of data
 
 The crash reported and investigated in commit 5f4513 turned out to be
 caused by a change to the read interface on newer (2012) SMCs.
 
 Tests by Chris show that simply reading the data valid line is enough
 for the problem to go away. Additional tests show that the newer SMCs
 no longer wait for the number of requested bytes, but start sending
 data right away.  Apparently the number of bytes to read is no longer
 specified as before, but instead found out by reading until end of
 data. Failure to read until end of data confuses the state machine,
 which eventually causes the crash.
 
 As a remedy, assuming bit0 is the read valid line, make sure there is
 nothing more to read before leaving the read function.
 
 Tested to resolve the original problem, and runtested on MBA3,1,
 MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on
 machines before 2012.
 
 Tested-by: Chris Murphy ch...@cmurf.com
 Signed-off-by: Henrik Rydberg rydb...@euromail.se

Applied. I'll do my usual sanity testing and send it to Linus
either tomorrow or Friday.

Thanks,
Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-08 Thread Guenter Roeck
On Tue, Oct 08, 2013 at 06:29:17PM +0200, Henrik Rydberg wrote:
> Hi Guenter,
> 
> > > > So, what should we do with this patch ? Apply it ?
> > > 
> > > So far I'm getting nothing on the original machine. As of today it's 
> > > applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. 
> > > Unfortunately at the moment I'm a bit too dense to figure out how to get 
> > > a new kernel applied to an existing live package so I can try this on a 
> > > USB stick. While maybe unrelated, the oops was occurring at least 4x as 
> > > often booted from USB stick media than HDD.
> > > 
> > 
> > Seems to me we should apply it.
> > 
> > Henrik,
> > 
> > what do you think ? Did you have time for additional testing ?
> > If you think we should apply it, please send me a signed patch.
> 
> I want to make one more test tomorrow, then I'll send a proper patch. Thanks!
> 
Ok, great.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-08 Thread Henrik Rydberg
Hi Guenter,

> > > So, what should we do with this patch ? Apply it ?
> > 
> > So far I'm getting nothing on the original machine. As of today it's 
> > applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately 
> > at the moment I'm a bit too dense to figure out how to get a new kernel 
> > applied to an existing live package so I can try this on a USB stick. While 
> > maybe unrelated, the oops was occurring at least 4x as often booted from 
> > USB stick media than HDD.
> > 
> 
> Seems to me we should apply it.
> 
> Henrik,
> 
> what do you think ? Did you have time for additional testing ?
> If you think we should apply it, please send me a signed patch.

I want to make one more test tomorrow, then I'll send a proper patch. Thanks!

Henrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-08 Thread Guenter Roeck
On Mon, Oct 07, 2013 at 05:46:55PM -0600, Chris Murphy wrote:
> 
> On Oct 7, 2013, at 5:42 PM, Guenter Roeck  wrote:
> 
> > On 10/02/2013 10:24 AM, Henrik Rydberg wrote:
> > 
> >>> From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001
> >> From: Henrik Rydberg 
> >> Date: Wed, 2 Oct 2013 19:15:03 +0200
> >> Subject: [PATCH] applesmc remedy take 2
> >> 
> >> Conjectured problem: there are remnant bytes ready on the data line
> >> which corrupts the read after a failure.
> >> 
> >> Remedy: assuming bit0 is the read valid line, try to flush it before
> >> starting a new command.
> >> 
> >> Tests by Chris suggests reading the status is enough for the problem
> >> to go away, which is consistent with a change in the SMC interface,
> >> where the number of bytes to read is no longer specified, but found
> >> out by reading until end of data.
> >> 
> >> Tested on a MacBookAir3,1, but the original problem has not been
> >> reproduced.
> > 
> > So, what should we do with this patch ? Apply it ?
> 
> So far I'm getting nothing on the original machine. As of today it's applied 
> as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the 
> moment I'm a bit too dense to figure out how to get a new kernel applied to 
> an existing live package so I can try this on a USB stick. While maybe 
> unrelated, the oops was occurring at least 4x as often booted from USB stick 
> media than HDD.
> 

Seems to me we should apply it.

Henrik,

what do you think ? Did you have time for additional testing ?
If you think we should apply it, please send me a signed patch.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-08 Thread Guenter Roeck
On Mon, Oct 07, 2013 at 05:46:55PM -0600, Chris Murphy wrote:
 
 On Oct 7, 2013, at 5:42 PM, Guenter Roeck li...@roeck-us.net wrote:
 
  On 10/02/2013 10:24 AM, Henrik Rydberg wrote:
  
  From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001
  From: Henrik Rydberg rydb...@euromail.se
  Date: Wed, 2 Oct 2013 19:15:03 +0200
  Subject: [PATCH] applesmc remedy take 2
  
  Conjectured problem: there are remnant bytes ready on the data line
  which corrupts the read after a failure.
  
  Remedy: assuming bit0 is the read valid line, try to flush it before
  starting a new command.
  
  Tests by Chris suggests reading the status is enough for the problem
  to go away, which is consistent with a change in the SMC interface,
  where the number of bytes to read is no longer specified, but found
  out by reading until end of data.
  
  Tested on a MacBookAir3,1, but the original problem has not been
  reproduced.
  
  So, what should we do with this patch ? Apply it ?
 
 So far I'm getting nothing on the original machine. As of today it's applied 
 as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the 
 moment I'm a bit too dense to figure out how to get a new kernel applied to 
 an existing live package so I can try this on a USB stick. While maybe 
 unrelated, the oops was occurring at least 4x as often booted from USB stick 
 media than HDD.
 

Seems to me we should apply it.

Henrik,

what do you think ? Did you have time for additional testing ?
If you think we should apply it, please send me a signed patch.

Thanks,
Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-08 Thread Henrik Rydberg
Hi Guenter,

   So, what should we do with this patch ? Apply it ?
  
  So far I'm getting nothing on the original machine. As of today it's 
  applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately 
  at the moment I'm a bit too dense to figure out how to get a new kernel 
  applied to an existing live package so I can try this on a USB stick. While 
  maybe unrelated, the oops was occurring at least 4x as often booted from 
  USB stick media than HDD.
  
 
 Seems to me we should apply it.
 
 Henrik,
 
 what do you think ? Did you have time for additional testing ?
 If you think we should apply it, please send me a signed patch.

I want to make one more test tomorrow, then I'll send a proper patch. Thanks!

Henrik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-08 Thread Guenter Roeck
On Tue, Oct 08, 2013 at 06:29:17PM +0200, Henrik Rydberg wrote:
 Hi Guenter,
 
So, what should we do with this patch ? Apply it ?
   
   So far I'm getting nothing on the original machine. As of today it's 
   applied as the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. 
   Unfortunately at the moment I'm a bit too dense to figure out how to get 
   a new kernel applied to an existing live package so I can try this on a 
   USB stick. While maybe unrelated, the oops was occurring at least 4x as 
   often booted from USB stick media than HDD.
   
  
  Seems to me we should apply it.
  
  Henrik,
  
  what do you think ? Did you have time for additional testing ?
  If you think we should apply it, please send me a signed patch.
 
 I want to make one more test tomorrow, then I'll send a proper patch. Thanks!
 
Ok, great.

Thanks,
Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-07 Thread Chris Murphy

On Oct 7, 2013, at 5:42 PM, Guenter Roeck  wrote:

> On 10/02/2013 10:24 AM, Henrik Rydberg wrote:
> 
>>> From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001
>> From: Henrik Rydberg 
>> Date: Wed, 2 Oct 2013 19:15:03 +0200
>> Subject: [PATCH] applesmc remedy take 2
>> 
>> Conjectured problem: there are remnant bytes ready on the data line
>> which corrupts the read after a failure.
>> 
>> Remedy: assuming bit0 is the read valid line, try to flush it before
>> starting a new command.
>> 
>> Tests by Chris suggests reading the status is enough for the problem
>> to go away, which is consistent with a change in the SMC interface,
>> where the number of bytes to read is no longer specified, but found
>> out by reading until end of data.
>> 
>> Tested on a MacBookAir3,1, but the original problem has not been
>> reproduced.
> 
> So, what should we do with this patch ? Apply it ?

So far I'm getting nothing on the original machine. As of today it's applied as 
the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the moment 
I'm a bit too dense to figure out how to get a new kernel applied to an 
existing live package so I can try this on a USB stick. While maybe unrelated, 
the oops was occurring at least 4x as often booted from USB stick media than 
HDD.

Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-07 Thread Guenter Roeck

On 10/02/2013 10:24 AM, Henrik Rydberg wrote:


From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001

From: Henrik Rydberg 
Date: Wed, 2 Oct 2013 19:15:03 +0200
Subject: [PATCH] applesmc remedy take 2

Conjectured problem: there are remnant bytes ready on the data line
which corrupts the read after a failure.

Remedy: assuming bit0 is the read valid line, try to flush it before
starting a new command.

Tests by Chris suggests reading the status is enough for the problem
to go away, which is consistent with a change in the SMC interface,
where the number of bytes to read is no longer specified, but found
out by reading until end of data.

Tested on a MacBookAir3,1, but the original problem has not been
reproduced.


So, what should we do with this patch ? Apply it ?

Thanks,
Guenter


---
  drivers/hwmon/applesmc.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 98814d1..c0ff350 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -230,6 +230,7 @@ static int send_argument(const char *key)

  static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
  {
+   u8 status, data;
int i;

if (send_command(cmd) || send_argument(key)) {
@@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 
len)
return -EIO;
}

+   /* This has no effect on newer (2012) SMCs */
if (send_byte(len, APPLESMC_DATA_PORT)) {
pr_warn("%.4s: read len fail\n", key);
return -EIO;
@@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
u8 len)
buffer[i] = inb(APPLESMC_DATA_PORT);
}

+   /* Read the data port until bit0 is cleared */
+   for (i = 0; i < 16; i++) {
+   udelay(APPLESMC_MIN_WAIT);
+   status = inb(APPLESMC_CMD_PORT);
+   if (!(status & 0x01))
+   break;
+   data = inb(APPLESMC_DATA_PORT);
+   }
+   if (i)
+   pr_warn("flushed %d bytes, last value is: %d\n", i, data);
+
return 0;
  }




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-07 Thread Guenter Roeck

On 10/02/2013 10:24 AM, Henrik Rydberg wrote:


From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001

From: Henrik Rydberg rydb...@euromail.se
Date: Wed, 2 Oct 2013 19:15:03 +0200
Subject: [PATCH] applesmc remedy take 2

Conjectured problem: there are remnant bytes ready on the data line
which corrupts the read after a failure.

Remedy: assuming bit0 is the read valid line, try to flush it before
starting a new command.

Tests by Chris suggests reading the status is enough for the problem
to go away, which is consistent with a change in the SMC interface,
where the number of bytes to read is no longer specified, but found
out by reading until end of data.

Tested on a MacBookAir3,1, but the original problem has not been
reproduced.


So, what should we do with this patch ? Apply it ?

Thanks,
Guenter


---
  drivers/hwmon/applesmc.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 98814d1..c0ff350 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -230,6 +230,7 @@ static int send_argument(const char *key)

  static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
  {
+   u8 status, data;
int i;

if (send_command(cmd) || send_argument(key)) {
@@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 
len)
return -EIO;
}

+   /* This has no effect on newer (2012) SMCs */
if (send_byte(len, APPLESMC_DATA_PORT)) {
pr_warn(%.4s: read len fail\n, key);
return -EIO;
@@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
u8 len)
buffer[i] = inb(APPLESMC_DATA_PORT);
}

+   /* Read the data port until bit0 is cleared */
+   for (i = 0; i  16; i++) {
+   udelay(APPLESMC_MIN_WAIT);
+   status = inb(APPLESMC_CMD_PORT);
+   if (!(status  0x01))
+   break;
+   data = inb(APPLESMC_DATA_PORT);
+   }
+   if (i)
+   pr_warn(flushed %d bytes, last value is: %d\n, i, data);
+
return 0;
  }




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-07 Thread Chris Murphy

On Oct 7, 2013, at 5:42 PM, Guenter Roeck li...@roeck-us.net wrote:

 On 10/02/2013 10:24 AM, Henrik Rydberg wrote:
 
 From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001
 From: Henrik Rydberg rydb...@euromail.se
 Date: Wed, 2 Oct 2013 19:15:03 +0200
 Subject: [PATCH] applesmc remedy take 2
 
 Conjectured problem: there are remnant bytes ready on the data line
 which corrupts the read after a failure.
 
 Remedy: assuming bit0 is the read valid line, try to flush it before
 starting a new command.
 
 Tests by Chris suggests reading the status is enough for the problem
 to go away, which is consistent with a change in the SMC interface,
 where the number of bytes to read is no longer specified, but found
 out by reading until end of data.
 
 Tested on a MacBookAir3,1, but the original problem has not been
 reproduced.
 
 So, what should we do with this patch ? Apply it ?

So far I'm getting nothing on the original machine. As of today it's applied as 
the last patch on 3.12.0-0.rc4.git0.1.fc20.x86_64. Unfortunately at the moment 
I'm a bit too dense to figure out how to get a new kernel applied to an 
existing live package so I can try this on a USB stick. While maybe unrelated, 
the oops was occurring at least 4x as often booted from USB stick media than 
HDD.

Chris--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck
On Wed, Oct 02, 2013 at 03:34:41PM -0600, Chris Murphy wrote:
> 
> On Oct 2, 2013, at 2:59 PM, Guenter Roeck  wrote:
> 
> > On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote:
> >> 
> >> On Oct 2, 2013, at 12:02 PM, Guenter Roeck  wrote:
> >> 
> >>> On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
>  On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
> > On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
> > One thing I have seen in all logs is the earlier "send_byte fail" 
> > message, so
> > I think that is a pre-requisite.
>  
>  Not necessarily - it could be that the patch actually fixes the root
>  cause. One possible scenario is that on recent SMCs, some of the
>  commands produce more data than we actually read. This would
>  eventually lead to both data corruption and overflow somwhere in the
>  SMC internals.  If the original SMC error is interpreted as a read
>  buffer overflow, then that problem should be fixed with this patch.
>  
> >>> 
> >>> Good point.
> >>> 
> >>> But shouldn't we at least get the "flushed %d bytes" warning message 
> >>> in this case ?
> >> 
> >> The explanation I have there is that the (newer) SMC needs the
> >> application to read the 'no more bytes' or it will get confused. It
> >> makes sense, if the number of bytes to read is no longer specified.
> >> 
> > You mean that just reading from APPLESMC_CMD_PORT would solve the 
> > problem ?
> > That might make sense.
>  
>  It also points at the possibility of a smaller patch to test, but I
>  have not had the time to check this very deeply myself:
>  
> >>> I like this patch much more than the previous patch. Chris, can you test 
> >>> it ?
> >> 
> >> Yes. Building now. What kernel message should I be looking for? At least 
> >> on 2011 and 2012 laptops I have yet to see an Oops related to smc. The 
> >> kernel with previous patch at least is not causing problems on them so 
> >> far, which works well as I can test more on the 2008 model.
> >> 
> > None, if I understand correctly and if the patch really fixes the root cause
> > of the problem.
> 
> A vast majority of the Ooops I've had are when booting from flash media, 
> testing Fedora installs. Is it possible the much slower kernel load and boot 
> time is a trigger? If so, I'll look into modifying the media to accept the 
> custom kernel and requisite fat initramfs.
> 
Yes, that could be a possible trigger. I thought it might be triggered by faster
boot (as one gets with 3.10 and 3.11), but slow boot is just as likely.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Chris Murphy

On Oct 2, 2013, at 2:59 PM, Guenter Roeck  wrote:

> On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote:
>> 
>> On Oct 2, 2013, at 12:02 PM, Guenter Roeck  wrote:
>> 
>>> On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
 On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
> On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
> One thing I have seen in all logs is the earlier "send_byte fail" 
> message, so
> I think that is a pre-requisite.
 
 Not necessarily - it could be that the patch actually fixes the root
 cause. One possible scenario is that on recent SMCs, some of the
 commands produce more data than we actually read. This would
 eventually lead to both data corruption and overflow somwhere in the
 SMC internals.  If the original SMC error is interpreted as a read
 buffer overflow, then that problem should be fixed with this patch.
 
>>> 
>>> Good point.
>>> 
>>> But shouldn't we at least get the "flushed %d bytes" warning message in 
>>> this case ?
>> 
>> The explanation I have there is that the (newer) SMC needs the
>> application to read the 'no more bytes' or it will get confused. It
>> makes sense, if the number of bytes to read is no longer specified.
>> 
> You mean that just reading from APPLESMC_CMD_PORT would solve the problem 
> ?
> That might make sense.
 
 It also points at the possibility of a smaller patch to test, but I
 have not had the time to check this very deeply myself:
 
>>> I like this patch much more than the previous patch. Chris, can you test it 
>>> ?
>> 
>> Yes. Building now. What kernel message should I be looking for? At least on 
>> 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel 
>> with previous patch at least is not causing problems on them so far, which 
>> works well as I can test more on the 2008 model.
>> 
> None, if I understand correctly and if the patch really fixes the root cause
> of the problem.

A vast majority of the Ooops I've had are when booting from flash media, 
testing Fedora installs. Is it possible the much slower kernel load and boot 
time is a trigger? If so, I'll look into modifying the media to accept the 
custom kernel and requisite fat initramfs.


Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck
On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote:
> 
> On Oct 2, 2013, at 12:02 PM, Guenter Roeck  wrote:
> 
> > On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
> >> On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
> >>> On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
> >>> One thing I have seen in all logs is the earlier "send_byte fail" 
> >>> message, so
> >>> I think that is a pre-requisite.
> >> 
> >> Not necessarily - it could be that the patch actually fixes the root
> >> cause. One possible scenario is that on recent SMCs, some of the
> >> commands produce more data than we actually read. This would
> >> eventually lead to both data corruption and overflow somwhere in the
> >> SMC internals.  If the original SMC error is interpreted as a read
> >> buffer overflow, then that problem should be fixed with this patch.
> >> 
> > 
> > Good point.
> > 
> > But shouldn't we at least get the "flushed %d bytes" warning message in 
> > this case ?
>  
>  The explanation I have there is that the (newer) SMC needs the
>  application to read the 'no more bytes' or it will get confused. It
>  makes sense, if the number of bytes to read is no longer specified.
>  
> >>> You mean that just reading from APPLESMC_CMD_PORT would solve the problem 
> >>> ?
> >>> That might make sense.
> >> 
> >> It also points at the possibility of a smaller patch to test, but I
> >> have not had the time to check this very deeply myself:
> >> 
> > I like this patch much more than the previous patch. Chris, can you test it 
> > ?
> 
> Yes. Building now. What kernel message should I be looking for? At least on 
> 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel 
> with previous patch at least is not causing problems on them so far, which 
> works well as I can test more on the 2008 model.
> 
None, if I understand correctly and if the patch really fixes the root cause
of the problem.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Chris Murphy

On Oct 2, 2013, at 12:02 PM, Guenter Roeck  wrote:

> On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
>> On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
>>> On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
>>> One thing I have seen in all logs is the earlier "send_byte fail" 
>>> message, so
>>> I think that is a pre-requisite.
>> 
>> Not necessarily - it could be that the patch actually fixes the root
>> cause. One possible scenario is that on recent SMCs, some of the
>> commands produce more data than we actually read. This would
>> eventually lead to both data corruption and overflow somwhere in the
>> SMC internals.  If the original SMC error is interpreted as a read
>> buffer overflow, then that problem should be fixed with this patch.
>> 
> 
> Good point.
> 
> But shouldn't we at least get the "flushed %d bytes" warning message in 
> this case ?
 
 The explanation I have there is that the (newer) SMC needs the
 application to read the 'no more bytes' or it will get confused. It
 makes sense, if the number of bytes to read is no longer specified.
 
>>> You mean that just reading from APPLESMC_CMD_PORT would solve the problem ?
>>> That might make sense.
>> 
>> It also points at the possibility of a smaller patch to test, but I
>> have not had the time to check this very deeply myself:
>> 
> I like this patch much more than the previous patch. Chris, can you test it ?

Yes. Building now. What kernel message should I be looking for? At least on 
2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel with 
previous patch at least is not causing problems on them so far, which works 
well as I can test more on the 2008 model.

Chris Murphy--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck
On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
> On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
> > On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
> > > > >>One thing I have seen in all logs is the earlier "send_byte fail" 
> > > > >>message, so
> > > > >>I think that is a pre-requisite.
> > > > >
> > > > >Not necessarily - it could be that the patch actually fixes the root
> > > > >cause. One possible scenario is that on recent SMCs, some of the
> > > > >commands produce more data than we actually read. This would
> > > > >eventually lead to both data corruption and overflow somwhere in the
> > > > >SMC internals.  If the original SMC error is interpreted as a read
> > > > >buffer overflow, then that problem should be fixed with this patch.
> > > > >
> > > > 
> > > > Good point.
> > > > 
> > > > But shouldn't we at least get the "flushed %d bytes" warning message in 
> > > > this case ?
> > > 
> > > The explanation I have there is that the (newer) SMC needs the
> > > application to read the 'no more bytes' or it will get confused. It
> > > makes sense, if the number of bytes to read is no longer specified.
> > > 
> > You mean that just reading from APPLESMC_CMD_PORT would solve the problem ?
> > That might make sense.
> 
> It also points at the possibility of a smaller patch to test, but I
> have not had the time to check this very deeply myself:
> 
I like this patch much more than the previous patch. Chris, can you test it ?

Thanks,
Guenter

> Thanks,
> Henrik
> 
> From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001
> From: Henrik Rydberg 
> Date: Wed, 2 Oct 2013 19:15:03 +0200
> Subject: [PATCH] applesmc remedy take 2
> 
> Conjectured problem: there are remnant bytes ready on the data line
> which corrupts the read after a failure.
> 
> Remedy: assuming bit0 is the read valid line, try to flush it before
> starting a new command.
> 
> Tests by Chris suggests reading the status is enough for the problem
> to go away, which is consistent with a change in the SMC interface,
> where the number of bytes to read is no longer specified, but found
> out by reading until end of data.
> 
> Tested on a MacBookAir3,1, but the original problem has not been
> reproduced.
> ---
>  drivers/hwmon/applesmc.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
> index 98814d1..c0ff350 100644
> --- a/drivers/hwmon/applesmc.c
> +++ b/drivers/hwmon/applesmc.c
> @@ -230,6 +230,7 @@ static int send_argument(const char *key)
>  
>  static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
>  {
> + u8 status, data;
>   int i;
>  
>   if (send_command(cmd) || send_argument(key)) {
> @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
> u8 len)
>   return -EIO;
>   }
>  
> + /* This has no effect on newer (2012) SMCs */
>   if (send_byte(len, APPLESMC_DATA_PORT)) {
>   pr_warn("%.4s: read len fail\n", key);
>   return -EIO;
> @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
> u8 len)
>   buffer[i] = inb(APPLESMC_DATA_PORT);
>   }
>  
> + /* Read the data port until bit0 is cleared */
> + for (i = 0; i < 16; i++) {
> + udelay(APPLESMC_MIN_WAIT);
> + status = inb(APPLESMC_CMD_PORT);
> + if (!(status & 0x01))
> + break;
> + data = inb(APPLESMC_DATA_PORT);
> + }
> + if (i)
> + pr_warn("flushed %d bytes, last value is: %d\n", i, data);
> +
>   return 0;
>  }
>  
> -- 
> 1.8.3.4
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Henrik Rydberg
On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
> On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
> > > >>One thing I have seen in all logs is the earlier "send_byte fail" 
> > > >>message, so
> > > >>I think that is a pre-requisite.
> > > >
> > > >Not necessarily - it could be that the patch actually fixes the root
> > > >cause. One possible scenario is that on recent SMCs, some of the
> > > >commands produce more data than we actually read. This would
> > > >eventually lead to both data corruption and overflow somwhere in the
> > > >SMC internals.  If the original SMC error is interpreted as a read
> > > >buffer overflow, then that problem should be fixed with this patch.
> > > >
> > > 
> > > Good point.
> > > 
> > > But shouldn't we at least get the "flushed %d bytes" warning message in 
> > > this case ?
> > 
> > The explanation I have there is that the (newer) SMC needs the
> > application to read the 'no more bytes' or it will get confused. It
> > makes sense, if the number of bytes to read is no longer specified.
> > 
> You mean that just reading from APPLESMC_CMD_PORT would solve the problem ?
> That might make sense.

It also points at the possibility of a smaller patch to test, but I
have not had the time to check this very deeply myself:

Thanks,
Henrik

>From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg 
Date: Wed, 2 Oct 2013 19:15:03 +0200
Subject: [PATCH] applesmc remedy take 2

Conjectured problem: there are remnant bytes ready on the data line
which corrupts the read after a failure.

Remedy: assuming bit0 is the read valid line, try to flush it before
starting a new command.

Tests by Chris suggests reading the status is enough for the problem
to go away, which is consistent with a change in the SMC interface,
where the number of bytes to read is no longer specified, but found
out by reading until end of data.

Tested on a MacBookAir3,1, but the original problem has not been
reproduced.
---
 drivers/hwmon/applesmc.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 98814d1..c0ff350 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -230,6 +230,7 @@ static int send_argument(const char *key)
 
 static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
 {
+   u8 status, data;
int i;
 
if (send_command(cmd) || send_argument(key)) {
@@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 
len)
return -EIO;
}
 
+   /* This has no effect on newer (2012) SMCs */
if (send_byte(len, APPLESMC_DATA_PORT)) {
pr_warn("%.4s: read len fail\n", key);
return -EIO;
@@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
u8 len)
buffer[i] = inb(APPLESMC_DATA_PORT);
}
 
+   /* Read the data port until bit0 is cleared */
+   for (i = 0; i < 16; i++) {
+   udelay(APPLESMC_MIN_WAIT);
+   status = inb(APPLESMC_CMD_PORT);
+   if (!(status & 0x01))
+   break;
+   data = inb(APPLESMC_DATA_PORT);
+   }
+   if (i)
+   pr_warn("flushed %d bytes, last value is: %d\n", i, data);
+
return 0;
 }
 
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck
On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
> > >>One thing I have seen in all logs is the earlier "send_byte fail" 
> > >>message, so
> > >>I think that is a pre-requisite.
> > >
> > >Not necessarily - it could be that the patch actually fixes the root
> > >cause. One possible scenario is that on recent SMCs, some of the
> > >commands produce more data than we actually read. This would
> > >eventually lead to both data corruption and overflow somwhere in the
> > >SMC internals.  If the original SMC error is interpreted as a read
> > >buffer overflow, then that problem should be fixed with this patch.
> > >
> > 
> > Good point.
> > 
> > But shouldn't we at least get the "flushed %d bytes" warning message in 
> > this case ?
> 
> The explanation I have there is that the (newer) SMC needs the
> application to read the 'no more bytes' or it will get confused. It
> makes sense, if the number of bytes to read is no longer specified.
> 
You mean that just reading from APPLESMC_CMD_PORT would solve the problem ?
That might make sense.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Henrik Rydberg
> >>One thing I have seen in all logs is the earlier "send_byte fail" message, 
> >>so
> >>I think that is a pre-requisite.
> >
> >Not necessarily - it could be that the patch actually fixes the root
> >cause. One possible scenario is that on recent SMCs, some of the
> >commands produce more data than we actually read. This would
> >eventually lead to both data corruption and overflow somwhere in the
> >SMC internals.  If the original SMC error is interpreted as a read
> >buffer overflow, then that problem should be fixed with this patch.
> >
> 
> Good point.
> 
> But shouldn't we at least get the "flushed %d bytes" warning message in this 
> case ?

The explanation I have there is that the (newer) SMC needs the
application to read the 'no more bytes' or it will get confused. It
makes sense, if the number of bytes to read is no longer specified.

Thanks,
Henrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck

On 10/02/2013 02:53 AM, Henrik Rydberg wrote:

Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
smc in it:

[   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
[   13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10



One thing I have seen in all logs is the earlier "send_byte fail" message, so
I think that is a pre-requisite.


Not necessarily - it could be that the patch actually fixes the root
cause. One possible scenario is that on recent SMCs, some of the
commands produce more data than we actually read. This would
eventually lead to both data corruption and overflow somwhere in the
SMC internals.  If the original SMC error is interpreted as a read
buffer overflow, then that problem should be fixed with this patch.



Good point.

But shouldn't we at least get the "flushed %d bytes" warning message in this 
case ?

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Henrik Rydberg
> > Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
> > ~dozen reboots, I'm not triggering the problem. The only items in dmesg 
> > with smc in it:
> > 
> > [   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
> > [   13.833402] input: applesmc as 
> > /devices/platform/applesmc.768/input/input10
> > 
> 
> One thing I have seen in all logs is the earlier "send_byte fail" message, so
> I think that is a pre-requisite.

Not necessarily - it could be that the patch actually fixes the root
cause. One possible scenario is that on recent SMCs, some of the
commands produce more data than we actually read. This would
eventually lead to both data corruption and overflow somwhere in the
SMC internals.  If the original SMC error is interpreted as a read
buffer overflow, then that problem should be fixed with this patch.

Even so, the patch does require more testing. It would be great to get
coverage on all models we can find.

Thanks,
Henrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Henrik Rydberg
  Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
  ~dozen reboots, I'm not triggering the problem. The only items in dmesg 
  with smc in it:
  
  [   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
  [   13.833402] input: applesmc as 
  /devices/platform/applesmc.768/input/input10
  
 
 One thing I have seen in all logs is the earlier send_byte fail message, so
 I think that is a pre-requisite.

Not necessarily - it could be that the patch actually fixes the root
cause. One possible scenario is that on recent SMCs, some of the
commands produce more data than we actually read. This would
eventually lead to both data corruption and overflow somwhere in the
SMC internals.  If the original SMC error is interpreted as a read
buffer overflow, then that problem should be fixed with this patch.

Even so, the patch does require more testing. It would be great to get
coverage on all models we can find.

Thanks,
Henrik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck

On 10/02/2013 02:53 AM, Henrik Rydberg wrote:

Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
smc in it:

[   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
[   13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10



One thing I have seen in all logs is the earlier send_byte fail message, so
I think that is a pre-requisite.


Not necessarily - it could be that the patch actually fixes the root
cause. One possible scenario is that on recent SMCs, some of the
commands produce more data than we actually read. This would
eventually lead to both data corruption and overflow somwhere in the
SMC internals.  If the original SMC error is interpreted as a read
buffer overflow, then that problem should be fixed with this patch.



Good point.

But shouldn't we at least get the flushed %d bytes warning message in this 
case ?

Thanks,
Guenter

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Henrik Rydberg
 One thing I have seen in all logs is the earlier send_byte fail message, 
 so
 I think that is a pre-requisite.
 
 Not necessarily - it could be that the patch actually fixes the root
 cause. One possible scenario is that on recent SMCs, some of the
 commands produce more data than we actually read. This would
 eventually lead to both data corruption and overflow somwhere in the
 SMC internals.  If the original SMC error is interpreted as a read
 buffer overflow, then that problem should be fixed with this patch.
 
 
 Good point.
 
 But shouldn't we at least get the flushed %d bytes warning message in this 
 case ?

The explanation I have there is that the (newer) SMC needs the
application to read the 'no more bytes' or it will get confused. It
makes sense, if the number of bytes to read is no longer specified.

Thanks,
Henrik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck
On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
  One thing I have seen in all logs is the earlier send_byte fail 
  message, so
  I think that is a pre-requisite.
  
  Not necessarily - it could be that the patch actually fixes the root
  cause. One possible scenario is that on recent SMCs, some of the
  commands produce more data than we actually read. This would
  eventually lead to both data corruption and overflow somwhere in the
  SMC internals.  If the original SMC error is interpreted as a read
  buffer overflow, then that problem should be fixed with this patch.
  
  
  Good point.
  
  But shouldn't we at least get the flushed %d bytes warning message in 
  this case ?
 
 The explanation I have there is that the (newer) SMC needs the
 application to read the 'no more bytes' or it will get confused. It
 makes sense, if the number of bytes to read is no longer specified.
 
You mean that just reading from APPLESMC_CMD_PORT would solve the problem ?
That might make sense.

Thanks,
Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Henrik Rydberg
On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
 On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
   One thing I have seen in all logs is the earlier send_byte fail 
   message, so
   I think that is a pre-requisite.
   
   Not necessarily - it could be that the patch actually fixes the root
   cause. One possible scenario is that on recent SMCs, some of the
   commands produce more data than we actually read. This would
   eventually lead to both data corruption and overflow somwhere in the
   SMC internals.  If the original SMC error is interpreted as a read
   buffer overflow, then that problem should be fixed with this patch.
   
   
   Good point.
   
   But shouldn't we at least get the flushed %d bytes warning message in 
   this case ?
  
  The explanation I have there is that the (newer) SMC needs the
  application to read the 'no more bytes' or it will get confused. It
  makes sense, if the number of bytes to read is no longer specified.
  
 You mean that just reading from APPLESMC_CMD_PORT would solve the problem ?
 That might make sense.

It also points at the possibility of a smaller patch to test, but I
have not had the time to check this very deeply myself:

Thanks,
Henrik

From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg rydb...@euromail.se
Date: Wed, 2 Oct 2013 19:15:03 +0200
Subject: [PATCH] applesmc remedy take 2

Conjectured problem: there are remnant bytes ready on the data line
which corrupts the read after a failure.

Remedy: assuming bit0 is the read valid line, try to flush it before
starting a new command.

Tests by Chris suggests reading the status is enough for the problem
to go away, which is consistent with a change in the SMC interface,
where the number of bytes to read is no longer specified, but found
out by reading until end of data.

Tested on a MacBookAir3,1, but the original problem has not been
reproduced.
---
 drivers/hwmon/applesmc.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 98814d1..c0ff350 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -230,6 +230,7 @@ static int send_argument(const char *key)
 
 static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
 {
+   u8 status, data;
int i;
 
if (send_command(cmd) || send_argument(key)) {
@@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 
len)
return -EIO;
}
 
+   /* This has no effect on newer (2012) SMCs */
if (send_byte(len, APPLESMC_DATA_PORT)) {
pr_warn(%.4s: read len fail\n, key);
return -EIO;
@@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
u8 len)
buffer[i] = inb(APPLESMC_DATA_PORT);
}
 
+   /* Read the data port until bit0 is cleared */
+   for (i = 0; i  16; i++) {
+   udelay(APPLESMC_MIN_WAIT);
+   status = inb(APPLESMC_CMD_PORT);
+   if (!(status  0x01))
+   break;
+   data = inb(APPLESMC_DATA_PORT);
+   }
+   if (i)
+   pr_warn(flushed %d bytes, last value is: %d\n, i, data);
+
return 0;
 }
 
-- 
1.8.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck
On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
 On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
  On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
One thing I have seen in all logs is the earlier send_byte fail 
message, so
I think that is a pre-requisite.

Not necessarily - it could be that the patch actually fixes the root
cause. One possible scenario is that on recent SMCs, some of the
commands produce more data than we actually read. This would
eventually lead to both data corruption and overflow somwhere in the
SMC internals.  If the original SMC error is interpreted as a read
buffer overflow, then that problem should be fixed with this patch.


Good point.

But shouldn't we at least get the flushed %d bytes warning message in 
this case ?
   
   The explanation I have there is that the (newer) SMC needs the
   application to read the 'no more bytes' or it will get confused. It
   makes sense, if the number of bytes to read is no longer specified.
   
  You mean that just reading from APPLESMC_CMD_PORT would solve the problem ?
  That might make sense.
 
 It also points at the possibility of a smaller patch to test, but I
 have not had the time to check this very deeply myself:
 
I like this patch much more than the previous patch. Chris, can you test it ?

Thanks,
Guenter

 Thanks,
 Henrik
 
 From 4451da32414080bd0563ee9e061f19bf90463cc5 Mon Sep 17 00:00:00 2001
 From: Henrik Rydberg rydb...@euromail.se
 Date: Wed, 2 Oct 2013 19:15:03 +0200
 Subject: [PATCH] applesmc remedy take 2
 
 Conjectured problem: there are remnant bytes ready on the data line
 which corrupts the read after a failure.
 
 Remedy: assuming bit0 is the read valid line, try to flush it before
 starting a new command.
 
 Tests by Chris suggests reading the status is enough for the problem
 to go away, which is consistent with a change in the SMC interface,
 where the number of bytes to read is no longer specified, but found
 out by reading until end of data.
 
 Tested on a MacBookAir3,1, but the original problem has not been
 reproduced.
 ---
  drivers/hwmon/applesmc.c | 13 +
  1 file changed, 13 insertions(+)
 
 diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
 index 98814d1..c0ff350 100644
 --- a/drivers/hwmon/applesmc.c
 +++ b/drivers/hwmon/applesmc.c
 @@ -230,6 +230,7 @@ static int send_argument(const char *key)
  
  static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
  {
 + u8 status, data;
   int i;
  
   if (send_command(cmd) || send_argument(key)) {
 @@ -237,6 +238,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
 u8 len)
   return -EIO;
   }
  
 + /* This has no effect on newer (2012) SMCs */
   if (send_byte(len, APPLESMC_DATA_PORT)) {
   pr_warn(%.4s: read len fail\n, key);
   return -EIO;
 @@ -250,6 +252,17 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
 u8 len)
   buffer[i] = inb(APPLESMC_DATA_PORT);
   }
  
 + /* Read the data port until bit0 is cleared */
 + for (i = 0; i  16; i++) {
 + udelay(APPLESMC_MIN_WAIT);
 + status = inb(APPLESMC_CMD_PORT);
 + if (!(status  0x01))
 + break;
 + data = inb(APPLESMC_DATA_PORT);
 + }
 + if (i)
 + pr_warn(flushed %d bytes, last value is: %d\n, i, data);
 +
   return 0;
  }
  
 -- 
 1.8.3.4
 
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Chris Murphy

On Oct 2, 2013, at 12:02 PM, Guenter Roeck li...@roeck-us.net wrote:

 On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
 On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
 On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
 One thing I have seen in all logs is the earlier send_byte fail 
 message, so
 I think that is a pre-requisite.
 
 Not necessarily - it could be that the patch actually fixes the root
 cause. One possible scenario is that on recent SMCs, some of the
 commands produce more data than we actually read. This would
 eventually lead to both data corruption and overflow somwhere in the
 SMC internals.  If the original SMC error is interpreted as a read
 buffer overflow, then that problem should be fixed with this patch.
 
 
 Good point.
 
 But shouldn't we at least get the flushed %d bytes warning message in 
 this case ?
 
 The explanation I have there is that the (newer) SMC needs the
 application to read the 'no more bytes' or it will get confused. It
 makes sense, if the number of bytes to read is no longer specified.
 
 You mean that just reading from APPLESMC_CMD_PORT would solve the problem ?
 That might make sense.
 
 It also points at the possibility of a smaller patch to test, but I
 have not had the time to check this very deeply myself:
 
 I like this patch much more than the previous patch. Chris, can you test it ?

Yes. Building now. What kernel message should I be looking for? At least on 
2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel with 
previous patch at least is not causing problems on them so far, which works 
well as I can test more on the 2008 model.

Chris Murphy--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck
On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote:
 
 On Oct 2, 2013, at 12:02 PM, Guenter Roeck li...@roeck-us.net wrote:
 
  On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
  On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
  On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
  One thing I have seen in all logs is the earlier send_byte fail 
  message, so
  I think that is a pre-requisite.
  
  Not necessarily - it could be that the patch actually fixes the root
  cause. One possible scenario is that on recent SMCs, some of the
  commands produce more data than we actually read. This would
  eventually lead to both data corruption and overflow somwhere in the
  SMC internals.  If the original SMC error is interpreted as a read
  buffer overflow, then that problem should be fixed with this patch.
  
  
  Good point.
  
  But shouldn't we at least get the flushed %d bytes warning message in 
  this case ?
  
  The explanation I have there is that the (newer) SMC needs the
  application to read the 'no more bytes' or it will get confused. It
  makes sense, if the number of bytes to read is no longer specified.
  
  You mean that just reading from APPLESMC_CMD_PORT would solve the problem 
  ?
  That might make sense.
  
  It also points at the possibility of a smaller patch to test, but I
  have not had the time to check this very deeply myself:
  
  I like this patch much more than the previous patch. Chris, can you test it 
  ?
 
 Yes. Building now. What kernel message should I be looking for? At least on 
 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel 
 with previous patch at least is not causing problems on them so far, which 
 works well as I can test more on the 2008 model.
 
None, if I understand correctly and if the patch really fixes the root cause
of the problem.

Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Chris Murphy

On Oct 2, 2013, at 2:59 PM, Guenter Roeck li...@roeck-us.net wrote:

 On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote:
 
 On Oct 2, 2013, at 12:02 PM, Guenter Roeck li...@roeck-us.net wrote:
 
 On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
 On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
 On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
 One thing I have seen in all logs is the earlier send_byte fail 
 message, so
 I think that is a pre-requisite.
 
 Not necessarily - it could be that the patch actually fixes the root
 cause. One possible scenario is that on recent SMCs, some of the
 commands produce more data than we actually read. This would
 eventually lead to both data corruption and overflow somwhere in the
 SMC internals.  If the original SMC error is interpreted as a read
 buffer overflow, then that problem should be fixed with this patch.
 
 
 Good point.
 
 But shouldn't we at least get the flushed %d bytes warning message in 
 this case ?
 
 The explanation I have there is that the (newer) SMC needs the
 application to read the 'no more bytes' or it will get confused. It
 makes sense, if the number of bytes to read is no longer specified.
 
 You mean that just reading from APPLESMC_CMD_PORT would solve the problem 
 ?
 That might make sense.
 
 It also points at the possibility of a smaller patch to test, but I
 have not had the time to check this very deeply myself:
 
 I like this patch much more than the previous patch. Chris, can you test it 
 ?
 
 Yes. Building now. What kernel message should I be looking for? At least on 
 2011 and 2012 laptops I have yet to see an Oops related to smc. The kernel 
 with previous patch at least is not causing problems on them so far, which 
 works well as I can test more on the 2008 model.
 
 None, if I understand correctly and if the patch really fixes the root cause
 of the problem.

A vast majority of the Ooops I've had are when booting from flash media, 
testing Fedora installs. Is it possible the much slower kernel load and boot 
time is a trigger? If so, I'll look into modifying the media to accept the 
custom kernel and requisite fat initramfs.


Chris--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-02 Thread Guenter Roeck
On Wed, Oct 02, 2013 at 03:34:41PM -0600, Chris Murphy wrote:
 
 On Oct 2, 2013, at 2:59 PM, Guenter Roeck li...@roeck-us.net wrote:
 
  On Wed, Oct 02, 2013 at 12:33:00PM -0600, Chris Murphy wrote:
  
  On Oct 2, 2013, at 12:02 PM, Guenter Roeck li...@roeck-us.net wrote:
  
  On Wed, Oct 02, 2013 at 07:24:10PM +0200, Henrik Rydberg wrote:
  On Wed, Oct 02, 2013 at 09:47:18AM -0700, Guenter Roeck wrote:
  On Wed, Oct 02, 2013 at 06:34:18PM +0200, Henrik Rydberg wrote:
  One thing I have seen in all logs is the earlier send_byte fail 
  message, so
  I think that is a pre-requisite.
  
  Not necessarily - it could be that the patch actually fixes the root
  cause. One possible scenario is that on recent SMCs, some of the
  commands produce more data than we actually read. This would
  eventually lead to both data corruption and overflow somwhere in the
  SMC internals.  If the original SMC error is interpreted as a read
  buffer overflow, then that problem should be fixed with this patch.
  
  
  Good point.
  
  But shouldn't we at least get the flushed %d bytes warning message 
  in this case ?
  
  The explanation I have there is that the (newer) SMC needs the
  application to read the 'no more bytes' or it will get confused. It
  makes sense, if the number of bytes to read is no longer specified.
  
  You mean that just reading from APPLESMC_CMD_PORT would solve the 
  problem ?
  That might make sense.
  
  It also points at the possibility of a smaller patch to test, but I
  have not had the time to check this very deeply myself:
  
  I like this patch much more than the previous patch. Chris, can you test 
  it ?
  
  Yes. Building now. What kernel message should I be looking for? At least 
  on 2011 and 2012 laptops I have yet to see an Oops related to smc. The 
  kernel with previous patch at least is not causing problems on them so 
  far, which works well as I can test more on the 2008 model.
  
  None, if I understand correctly and if the patch really fixes the root cause
  of the problem.
 
 A vast majority of the Ooops I've had are when booting from flash media, 
 testing Fedora installs. Is it possible the much slower kernel load and boot 
 time is a trigger? If so, I'll look into modifying the media to accept the 
 custom kernel and requisite fat initramfs.
 
Yes, that could be a possible trigger. I thought it might be triggered by faster
boot (as one gets with 3.10 and 3.11), but slow boot is just as likely.

Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Guenter Roeck

On 10/01/2013 08:55 PM, Chris Murphy wrote:


On Oct 1, 2013, at 9:51 PM, Guenter Roeck  wrote:


On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote:


On Oct 1, 2013, at 10:24 AM, Guenter Roeck  wrote:


On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:


On Oct 1, 2013, at 9:19 AM, Guenter Roeck  wrote:


On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:

Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.

[   10.886016] applesmc: key count changed from 261 to 1174405121



Explains the crash, but the new key count is very wrong. 1174405121 = 
0x4601.
Which I guess explains the subsequent memory allocation error in the log.

Henrik, any idea what might be going on ? Is it possible that the previous
command failure leaves some state machine in a bad state ?


I seem to recall a report on another similar state problem on newer
machines, so maybe, yes. Older machines seem fine, I have never
encountered the problem myself. Here is a patch to test that
theory. It has been tested to be pretty harmless on two different
generations.

I really really do not want to add an 'if (value is insane)' check ;-)


Chris,

any chance you can load this patch on an affected machine so we can get
test feedback ? This one is too experimental to submit upstream without
knowing that it really fixes the problem.


Yes. What kernel.org source version should I apply it against? I'd use the 
non-debug config file from an equivalent version Fedora kernel, unless asked 
otherwise. And also should I test it on other vintages? I have here 
MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).


Only requirement is that it also includes the previous patch, so it would be
optimal if you can apply it on top of the previous image.


Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
smc in it:

[   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
[   13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10



Hi Chris,

That only means that you did not hit the problem. There may be some secondary
trigger (cold boot ? coffee on the cpu ?).

One thing I have seen in all logs is the earlier "send_byte fail" message, so
I think that is a pre-requisite.


I have no idea how to trigger it. I have tried cold and warm boots. Boots 
between linux and OS X to linux. *shrug* I'll keep trying as I'm doing other 
testing, maybe I'll stumble onto it.



I am sure you didn't pour coffee onto the CPU yet :)

Basic rule of testing: You'll only hit the problem again after you are convinced
that it was magically fixed by a completely unrelated change.

Cheers,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Chris Murphy

On Oct 1, 2013, at 9:51 PM, Guenter Roeck  wrote:

> On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote:
>> 
>> On Oct 1, 2013, at 10:24 AM, Guenter Roeck  wrote:
>> 
>>> On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:
 
 On Oct 1, 2013, at 9:19 AM, Guenter Roeck  wrote:
 
> On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
 Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
 
 [   10.886016] applesmc: key count changed from 261 to 1174405121
 
>>> 
>>> Explains the crash, but the new key count is very wrong. 1174405121 = 
>>> 0x4601.
>>> Which I guess explains the subsequent memory allocation error in the 
>>> log.
>>> 
>>> Henrik, any idea what might be going on ? Is it possible that the 
>>> previous
>>> command failure leaves some state machine in a bad state ?
>> 
>> I seem to recall a report on another similar state problem on newer
>> machines, so maybe, yes. Older machines seem fine, I have never
>> encountered the problem myself. Here is a patch to test that
>> theory. It has been tested to be pretty harmless on two different
>> generations.
>> 
>> I really really do not want to add an 'if (value is insane)' check ;-)
>> 
> Chris,
> 
> any chance you can load this patch on an affected machine so we can get
> test feedback ? This one is too experimental to submit upstream without
> knowing that it really fixes the problem.
 
 Yes. What kernel.org source version should I apply it against? I'd use the 
 non-debug config file from an equivalent version Fedora kernel, unless 
 asked otherwise. And also should I test it on other vintages? I have here 
 MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).
 
>>> Only requirement is that it also includes the previous patch, so it would be
>>> optimal if you can apply it on top of the previous image.
>> 
>> Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
>> ~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
>> smc in it:
>> 
>> [   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
>> [   13.833402] input: applesmc as 
>> /devices/platform/applesmc.768/input/input10
>> 
> 
> Hi Chris,
> 
> That only means that you did not hit the problem. There may be some secondary
> trigger (cold boot ? coffee on the cpu ?).
> 
> One thing I have seen in all logs is the earlier "send_byte fail" message, so
> I think that is a pre-requisite.

I have no idea how to trigger it. I have tried cold and warm boots. Boots 
between linux and OS X to linux. *shrug* I'll keep trying as I'm doing other 
testing, maybe I'll stumble onto it.


Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Guenter Roeck
On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote:
> 
> On Oct 1, 2013, at 10:24 AM, Guenter Roeck  wrote:
> 
> > On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:
> >> 
> >> On Oct 1, 2013, at 9:19 AM, Guenter Roeck  wrote:
> >> 
> >>> On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
> >> Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
> >> 
> >> [   10.886016] applesmc: key count changed from 261 to 1174405121
> >> 
> > 
> > Explains the crash, but the new key count is very wrong. 1174405121 = 
> > 0x4601.
> > Which I guess explains the subsequent memory allocation error in the 
> > log.
> > 
> > Henrik, any idea what might be going on ? Is it possible that the 
> > previous
> > command failure leaves some state machine in a bad state ?
>  
>  I seem to recall a report on another similar state problem on newer
>  machines, so maybe, yes. Older machines seem fine, I have never
>  encountered the problem myself. Here is a patch to test that
>  theory. It has been tested to be pretty harmless on two different
>  generations.
>  
>  I really really do not want to add an 'if (value is insane)' check ;-)
>  
> >>> Chris,
> >>> 
> >>> any chance you can load this patch on an affected machine so we can get
> >>> test feedback ? This one is too experimental to submit upstream without
> >>> knowing that it really fixes the problem.
> >> 
> >> Yes. What kernel.org source version should I apply it against? I'd use the 
> >> non-debug config file from an equivalent version Fedora kernel, unless 
> >> asked otherwise. And also should I test it on other vintages? I have here 
> >> MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).
> >> 
> > Only requirement is that it also includes the previous patch, so it would be
> > optimal if you can apply it on top of the previous image.
> 
> Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
> ~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
> smc in it:
> 
> [   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
> [   13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10
> 

Hi Chris,

That only means that you did not hit the problem. There may be some secondary
trigger (cold boot ? coffee on the cpu ?).

One thing I have seen in all logs is the earlier "send_byte fail" message, so
I think that is a pre-requisite.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Chris Murphy

On Oct 1, 2013, at 10:24 AM, Guenter Roeck  wrote:

> On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:
>> 
>> On Oct 1, 2013, at 9:19 AM, Guenter Roeck  wrote:
>> 
>>> On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
>> Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
>> 
>> [   10.886016] applesmc: key count changed from 261 to 1174405121
>> 
> 
> Explains the crash, but the new key count is very wrong. 1174405121 = 
> 0x4601.
> Which I guess explains the subsequent memory allocation error in the log.
> 
> Henrik, any idea what might be going on ? Is it possible that the previous
> command failure leaves some state machine in a bad state ?
 
 I seem to recall a report on another similar state problem on newer
 machines, so maybe, yes. Older machines seem fine, I have never
 encountered the problem myself. Here is a patch to test that
 theory. It has been tested to be pretty harmless on two different
 generations.
 
 I really really do not want to add an 'if (value is insane)' check ;-)
 
>>> Chris,
>>> 
>>> any chance you can load this patch on an affected machine so we can get
>>> test feedback ? This one is too experimental to submit upstream without
>>> knowing that it really fixes the problem.
>> 
>> Yes. What kernel.org source version should I apply it against? I'd use the 
>> non-debug config file from an equivalent version Fedora kernel, unless asked 
>> otherwise. And also should I test it on other vintages? I have here 
>> MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).
>> 
> Only requirement is that it also includes the previous patch, so it would be
> optimal if you can apply it on top of the previous image.

Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
smc in it:

[   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
[   13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10


Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Guenter Roeck
On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:
> 
> On Oct 1, 2013, at 9:19 AM, Guenter Roeck  wrote:
> 
> > On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
>  Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
>  
>  [   10.886016] applesmc: key count changed from 261 to 1174405121
>  
> >>> 
> >>> Explains the crash, but the new key count is very wrong. 1174405121 = 
> >>> 0x4601.
> >>> Which I guess explains the subsequent memory allocation error in the log.
> >>> 
> >>> Henrik, any idea what might be going on ? Is it possible that the previous
> >>> command failure leaves some state machine in a bad state ?
> >> 
> >> I seem to recall a report on another similar state problem on newer
> >> machines, so maybe, yes. Older machines seem fine, I have never
> >> encountered the problem myself. Here is a patch to test that
> >> theory. It has been tested to be pretty harmless on two different
> >> generations.
> >> 
> >> I really really do not want to add an 'if (value is insane)' check ;-)
> >> 
> > Chris,
> > 
> > any chance you can load this patch on an affected machine so we can get
> > test feedback ? This one is too experimental to submit upstream without
> > knowing that it really fixes the problem.
> 
> Yes. What kernel.org source version should I apply it against? I'd use the 
> non-debug config file from an equivalent version Fedora kernel, unless asked 
> otherwise. And also should I test it on other vintages? I have here 
> MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).
> 
Only requirement is that it also includes the previous patch, so it would be
optimal if you can apply it on top of the previous image.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Guenter Roeck
On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
> > >Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
> > >
> > >[   10.886016] applesmc: key count changed from 261 to 1174405121
> > >
> > 
> > Explains the crash, but the new key count is very wrong. 1174405121 = 
> > 0x4601.
> > Which I guess explains the subsequent memory allocation error in the log.
> > 
> > Henrik, any idea what might be going on ? Is it possible that the previous
> > command failure leaves some state machine in a bad state ?
> 
> I seem to recall a report on another similar state problem on newer
> machines, so maybe, yes. Older machines seem fine, I have never
> encountered the problem myself. Here is a patch to test that
> theory. It has been tested to be pretty harmless on two different
> generations.
> 
> I really really do not want to add an 'if (value is insane)' check ;-)
> 
Chris,

any chance you can load this patch on an affected machine so we can get
test feedback ? This one is too experimental to submit upstream without
knowing that it really fixes the problem.

Thanks,
Guenter

> Thanks,
> Henrik
> 
> From d48a9e4e6e45dcd9c7e7ad88df714b92772a62d6 Mon Sep 17 00:00:00 2001
> From: Henrik Rydberg 
> Date: Tue, 1 Oct 2013 12:16:09 +0200
> Subject: [PATCH] applesmc remedy take 1
> 
> Conjectured problem: there are remnant bytes ready on the data line
> which corrupts the read after a failure.
> 
> Remedy: assuming bit0 is the read valid line, try to flush it before
> starting a new command.
> 
> Exception: the write-number-of-bytes-to-read command seems to differ
> between models (it may not be needed on the newest), so do not try to
> flush the data at that particular point.
> 
> Tested on a MacBookPro10,1 and a MacBookAir3,1, but the original problem
> has not been reproduced, so the actual effect of this patch is unknown.
> ---
>  drivers/hwmon/applesmc.c | 29 +
>  1 file changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
> index 98814d1..f6eaf6d1 100644
> --- a/drivers/hwmon/applesmc.c
> +++ b/drivers/hwmon/applesmc.c
> @@ -186,10 +186,23 @@ static int wait_read(void)
>   * send_byte - Write to SMC port, retrying when necessary. Callers
>   * must hold applesmc_lock.
>   */
> -static int send_byte(u8 cmd, u16 port)
> +static int send_byte(u8 cmd, u16 port, bool flush)
>  {
> - u8 status;
> - int us;
> + u8 status, data;
> + int us, nskip;
> +
> + if (flush) {
> + /* read the data port until bit0 is cleared */
> + for (nskip = 0; nskip < 16; nskip++) {
> + udelay(APPLESMC_MIN_WAIT);
> + status = inb(APPLESMC_CMD_PORT);
> + if (!(status & 0x01))
> + break;
> + data = inb(APPLESMC_DATA_PORT);
> + }
> + if (nskip)
> + pr_warn("flushed %d bytes\n", nskip);
> + }
>  
>   outb(cmd, port);
>   for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) {
> @@ -215,7 +228,7 @@ static int send_byte(u8 cmd, u16 port)
>  
>  static int send_command(u8 cmd)
>  {
> - return send_byte(cmd, APPLESMC_CMD_PORT);
> + return send_byte(cmd, APPLESMC_CMD_PORT, true);
>  }
>  
>  static int send_argument(const char *key)
> @@ -223,7 +236,7 @@ static int send_argument(const char *key)
>   int i;
>  
>   for (i = 0; i < 4; i++)
> - if (send_byte(key[i], APPLESMC_DATA_PORT))
> + if (send_byte(key[i], APPLESMC_DATA_PORT, true))
>   return -EIO;
>   return 0;
>  }
> @@ -237,7 +250,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
> u8 len)
>   return -EIO;
>   }
>  
> - if (send_byte(len, APPLESMC_DATA_PORT)) {
> + if (send_byte(len, APPLESMC_DATA_PORT, false)) {
>   pr_warn("%.4s: read len fail\n", key);
>   return -EIO;
>   }
> @@ -262,13 +275,13 @@ static int write_smc(u8 cmd, const char *key, const u8 
> *buffer, u8 len)
>   return -EIO;
>   }
>  
> - if (send_byte(len, APPLESMC_DATA_PORT)) {
> + if (send_byte(len, APPLESMC_DATA_PORT, true)) {
>   pr_warn("%.4s: write len fail\n", key);
>   return -EIO;
>   }
>  
>   for (i = 0; i < len; i++) {
> - if (send_byte(buffer[i], APPLESMC_DATA_PORT)) {
> + if (send_byte(buffer[i], APPLESMC_DATA_PORT, true)) {
>   pr_warn("%s: write data fail\n", key);
>   return -EIO;
>   }
> -- 
> 1.8.3.4
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Chris Murphy

On Oct 1, 2013, at 9:19 AM, Guenter Roeck  wrote:

> On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
 Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
 
 [   10.886016] applesmc: key count changed from 261 to 1174405121
 
>>> 
>>> Explains the crash, but the new key count is very wrong. 1174405121 = 
>>> 0x4601.
>>> Which I guess explains the subsequent memory allocation error in the log.
>>> 
>>> Henrik, any idea what might be going on ? Is it possible that the previous
>>> command failure leaves some state machine in a bad state ?
>> 
>> I seem to recall a report on another similar state problem on newer
>> machines, so maybe, yes. Older machines seem fine, I have never
>> encountered the problem myself. Here is a patch to test that
>> theory. It has been tested to be pretty harmless on two different
>> generations.
>> 
>> I really really do not want to add an 'if (value is insane)' check ;-)
>> 
> Chris,
> 
> any chance you can load this patch on an affected machine so we can get
> test feedback ? This one is too experimental to submit upstream without
> knowing that it really fixes the problem.

Yes. What kernel.org source version should I apply it against? I'd use the 
non-debug config file from an equivalent version Fedora kernel, unless asked 
otherwise. And also should I test it on other vintages? I have here 
MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).

Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Henrik Rydberg
> >Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
> >
> >[   10.886016] applesmc: key count changed from 261 to 1174405121
> >
> 
> Explains the crash, but the new key count is very wrong. 1174405121 = 
> 0x4601.
> Which I guess explains the subsequent memory allocation error in the log.
> 
> Henrik, any idea what might be going on ? Is it possible that the previous
> command failure leaves some state machine in a bad state ?

I seem to recall a report on another similar state problem on newer
machines, so maybe, yes. Older machines seem fine, I have never
encountered the problem myself. Here is a patch to test that
theory. It has been tested to be pretty harmless on two different
generations.

I really really do not want to add an 'if (value is insane)' check ;-)

Thanks,
Henrik

>From d48a9e4e6e45dcd9c7e7ad88df714b92772a62d6 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg 
Date: Tue, 1 Oct 2013 12:16:09 +0200
Subject: [PATCH] applesmc remedy take 1

Conjectured problem: there are remnant bytes ready on the data line
which corrupts the read after a failure.

Remedy: assuming bit0 is the read valid line, try to flush it before
starting a new command.

Exception: the write-number-of-bytes-to-read command seems to differ
between models (it may not be needed on the newest), so do not try to
flush the data at that particular point.

Tested on a MacBookPro10,1 and a MacBookAir3,1, but the original problem
has not been reproduced, so the actual effect of this patch is unknown.
---
 drivers/hwmon/applesmc.c | 29 +
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 98814d1..f6eaf6d1 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -186,10 +186,23 @@ static int wait_read(void)
  * send_byte - Write to SMC port, retrying when necessary. Callers
  * must hold applesmc_lock.
  */
-static int send_byte(u8 cmd, u16 port)
+static int send_byte(u8 cmd, u16 port, bool flush)
 {
-   u8 status;
-   int us;
+   u8 status, data;
+   int us, nskip;
+
+   if (flush) {
+   /* read the data port until bit0 is cleared */
+   for (nskip = 0; nskip < 16; nskip++) {
+   udelay(APPLESMC_MIN_WAIT);
+   status = inb(APPLESMC_CMD_PORT);
+   if (!(status & 0x01))
+   break;
+   data = inb(APPLESMC_DATA_PORT);
+   }
+   if (nskip)
+   pr_warn("flushed %d bytes\n", nskip);
+   }
 
outb(cmd, port);
for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) {
@@ -215,7 +228,7 @@ static int send_byte(u8 cmd, u16 port)
 
 static int send_command(u8 cmd)
 {
-   return send_byte(cmd, APPLESMC_CMD_PORT);
+   return send_byte(cmd, APPLESMC_CMD_PORT, true);
 }
 
 static int send_argument(const char *key)
@@ -223,7 +236,7 @@ static int send_argument(const char *key)
int i;
 
for (i = 0; i < 4; i++)
-   if (send_byte(key[i], APPLESMC_DATA_PORT))
+   if (send_byte(key[i], APPLESMC_DATA_PORT, true))
return -EIO;
return 0;
 }
@@ -237,7 +250,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 
len)
return -EIO;
}
 
-   if (send_byte(len, APPLESMC_DATA_PORT)) {
+   if (send_byte(len, APPLESMC_DATA_PORT, false)) {
pr_warn("%.4s: read len fail\n", key);
return -EIO;
}
@@ -262,13 +275,13 @@ static int write_smc(u8 cmd, const char *key, const u8 
*buffer, u8 len)
return -EIO;
}
 
-   if (send_byte(len, APPLESMC_DATA_PORT)) {
+   if (send_byte(len, APPLESMC_DATA_PORT, true)) {
pr_warn("%.4s: write len fail\n", key);
return -EIO;
}
 
for (i = 0; i < len; i++) {
-   if (send_byte(buffer[i], APPLESMC_DATA_PORT)) {
+   if (send_byte(buffer[i], APPLESMC_DATA_PORT, true)) {
pr_warn("%s: write data fail\n", key);
return -EIO;
}
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Henrik Rydberg
 Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
 
 [   10.886016] applesmc: key count changed from 261 to 1174405121
 
 
 Explains the crash, but the new key count is very wrong. 1174405121 = 
 0x4601.
 Which I guess explains the subsequent memory allocation error in the log.
 
 Henrik, any idea what might be going on ? Is it possible that the previous
 command failure leaves some state machine in a bad state ?

I seem to recall a report on another similar state problem on newer
machines, so maybe, yes. Older machines seem fine, I have never
encountered the problem myself. Here is a patch to test that
theory. It has been tested to be pretty harmless on two different
generations.

I really really do not want to add an 'if (value is insane)' check ;-)

Thanks,
Henrik

From d48a9e4e6e45dcd9c7e7ad88df714b92772a62d6 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg rydb...@euromail.se
Date: Tue, 1 Oct 2013 12:16:09 +0200
Subject: [PATCH] applesmc remedy take 1

Conjectured problem: there are remnant bytes ready on the data line
which corrupts the read after a failure.

Remedy: assuming bit0 is the read valid line, try to flush it before
starting a new command.

Exception: the write-number-of-bytes-to-read command seems to differ
between models (it may not be needed on the newest), so do not try to
flush the data at that particular point.

Tested on a MacBookPro10,1 and a MacBookAir3,1, but the original problem
has not been reproduced, so the actual effect of this patch is unknown.
---
 drivers/hwmon/applesmc.c | 29 +
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 98814d1..f6eaf6d1 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -186,10 +186,23 @@ static int wait_read(void)
  * send_byte - Write to SMC port, retrying when necessary. Callers
  * must hold applesmc_lock.
  */
-static int send_byte(u8 cmd, u16 port)
+static int send_byte(u8 cmd, u16 port, bool flush)
 {
-   u8 status;
-   int us;
+   u8 status, data;
+   int us, nskip;
+
+   if (flush) {
+   /* read the data port until bit0 is cleared */
+   for (nskip = 0; nskip  16; nskip++) {
+   udelay(APPLESMC_MIN_WAIT);
+   status = inb(APPLESMC_CMD_PORT);
+   if (!(status  0x01))
+   break;
+   data = inb(APPLESMC_DATA_PORT);
+   }
+   if (nskip)
+   pr_warn(flushed %d bytes\n, nskip);
+   }
 
outb(cmd, port);
for (us = APPLESMC_MIN_WAIT; us  APPLESMC_MAX_WAIT; us = 1) {
@@ -215,7 +228,7 @@ static int send_byte(u8 cmd, u16 port)
 
 static int send_command(u8 cmd)
 {
-   return send_byte(cmd, APPLESMC_CMD_PORT);
+   return send_byte(cmd, APPLESMC_CMD_PORT, true);
 }
 
 static int send_argument(const char *key)
@@ -223,7 +236,7 @@ static int send_argument(const char *key)
int i;
 
for (i = 0; i  4; i++)
-   if (send_byte(key[i], APPLESMC_DATA_PORT))
+   if (send_byte(key[i], APPLESMC_DATA_PORT, true))
return -EIO;
return 0;
 }
@@ -237,7 +250,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 
len)
return -EIO;
}
 
-   if (send_byte(len, APPLESMC_DATA_PORT)) {
+   if (send_byte(len, APPLESMC_DATA_PORT, false)) {
pr_warn(%.4s: read len fail\n, key);
return -EIO;
}
@@ -262,13 +275,13 @@ static int write_smc(u8 cmd, const char *key, const u8 
*buffer, u8 len)
return -EIO;
}
 
-   if (send_byte(len, APPLESMC_DATA_PORT)) {
+   if (send_byte(len, APPLESMC_DATA_PORT, true)) {
pr_warn(%.4s: write len fail\n, key);
return -EIO;
}
 
for (i = 0; i  len; i++) {
-   if (send_byte(buffer[i], APPLESMC_DATA_PORT)) {
+   if (send_byte(buffer[i], APPLESMC_DATA_PORT, true)) {
pr_warn(%s: write data fail\n, key);
return -EIO;
}
-- 
1.8.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Chris Murphy

On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote:

 On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
 Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
 
 [   10.886016] applesmc: key count changed from 261 to 1174405121
 
 
 Explains the crash, but the new key count is very wrong. 1174405121 = 
 0x4601.
 Which I guess explains the subsequent memory allocation error in the log.
 
 Henrik, any idea what might be going on ? Is it possible that the previous
 command failure leaves some state machine in a bad state ?
 
 I seem to recall a report on another similar state problem on newer
 machines, so maybe, yes. Older machines seem fine, I have never
 encountered the problem myself. Here is a patch to test that
 theory. It has been tested to be pretty harmless on two different
 generations.
 
 I really really do not want to add an 'if (value is insane)' check ;-)
 
 Chris,
 
 any chance you can load this patch on an affected machine so we can get
 test feedback ? This one is too experimental to submit upstream without
 knowing that it really fixes the problem.

Yes. What kernel.org source version should I apply it against? I'd use the 
non-debug config file from an equivalent version Fedora kernel, unless asked 
otherwise. And also should I test it on other vintages? I have here 
MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).

Chris--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Guenter Roeck
On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
  Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
  
  [   10.886016] applesmc: key count changed from 261 to 1174405121
  
  
  Explains the crash, but the new key count is very wrong. 1174405121 = 
  0x4601.
  Which I guess explains the subsequent memory allocation error in the log.
  
  Henrik, any idea what might be going on ? Is it possible that the previous
  command failure leaves some state machine in a bad state ?
 
 I seem to recall a report on another similar state problem on newer
 machines, so maybe, yes. Older machines seem fine, I have never
 encountered the problem myself. Here is a patch to test that
 theory. It has been tested to be pretty harmless on two different
 generations.
 
 I really really do not want to add an 'if (value is insane)' check ;-)
 
Chris,

any chance you can load this patch on an affected machine so we can get
test feedback ? This one is too experimental to submit upstream without
knowing that it really fixes the problem.

Thanks,
Guenter

 Thanks,
 Henrik
 
 From d48a9e4e6e45dcd9c7e7ad88df714b92772a62d6 Mon Sep 17 00:00:00 2001
 From: Henrik Rydberg rydb...@euromail.se
 Date: Tue, 1 Oct 2013 12:16:09 +0200
 Subject: [PATCH] applesmc remedy take 1
 
 Conjectured problem: there are remnant bytes ready on the data line
 which corrupts the read after a failure.
 
 Remedy: assuming bit0 is the read valid line, try to flush it before
 starting a new command.
 
 Exception: the write-number-of-bytes-to-read command seems to differ
 between models (it may not be needed on the newest), so do not try to
 flush the data at that particular point.
 
 Tested on a MacBookPro10,1 and a MacBookAir3,1, but the original problem
 has not been reproduced, so the actual effect of this patch is unknown.
 ---
  drivers/hwmon/applesmc.c | 29 +
  1 file changed, 21 insertions(+), 8 deletions(-)
 
 diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
 index 98814d1..f6eaf6d1 100644
 --- a/drivers/hwmon/applesmc.c
 +++ b/drivers/hwmon/applesmc.c
 @@ -186,10 +186,23 @@ static int wait_read(void)
   * send_byte - Write to SMC port, retrying when necessary. Callers
   * must hold applesmc_lock.
   */
 -static int send_byte(u8 cmd, u16 port)
 +static int send_byte(u8 cmd, u16 port, bool flush)
  {
 - u8 status;
 - int us;
 + u8 status, data;
 + int us, nskip;
 +
 + if (flush) {
 + /* read the data port until bit0 is cleared */
 + for (nskip = 0; nskip  16; nskip++) {
 + udelay(APPLESMC_MIN_WAIT);
 + status = inb(APPLESMC_CMD_PORT);
 + if (!(status  0x01))
 + break;
 + data = inb(APPLESMC_DATA_PORT);
 + }
 + if (nskip)
 + pr_warn(flushed %d bytes\n, nskip);
 + }
  
   outb(cmd, port);
   for (us = APPLESMC_MIN_WAIT; us  APPLESMC_MAX_WAIT; us = 1) {
 @@ -215,7 +228,7 @@ static int send_byte(u8 cmd, u16 port)
  
  static int send_command(u8 cmd)
  {
 - return send_byte(cmd, APPLESMC_CMD_PORT);
 + return send_byte(cmd, APPLESMC_CMD_PORT, true);
  }
  
  static int send_argument(const char *key)
 @@ -223,7 +236,7 @@ static int send_argument(const char *key)
   int i;
  
   for (i = 0; i  4; i++)
 - if (send_byte(key[i], APPLESMC_DATA_PORT))
 + if (send_byte(key[i], APPLESMC_DATA_PORT, true))
   return -EIO;
   return 0;
  }
 @@ -237,7 +250,7 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, 
 u8 len)
   return -EIO;
   }
  
 - if (send_byte(len, APPLESMC_DATA_PORT)) {
 + if (send_byte(len, APPLESMC_DATA_PORT, false)) {
   pr_warn(%.4s: read len fail\n, key);
   return -EIO;
   }
 @@ -262,13 +275,13 @@ static int write_smc(u8 cmd, const char *key, const u8 
 *buffer, u8 len)
   return -EIO;
   }
  
 - if (send_byte(len, APPLESMC_DATA_PORT)) {
 + if (send_byte(len, APPLESMC_DATA_PORT, true)) {
   pr_warn(%.4s: write len fail\n, key);
   return -EIO;
   }
  
   for (i = 0; i  len; i++) {
 - if (send_byte(buffer[i], APPLESMC_DATA_PORT)) {
 + if (send_byte(buffer[i], APPLESMC_DATA_PORT, true)) {
   pr_warn(%s: write data fail\n, key);
   return -EIO;
   }
 -- 
 1.8.3.4
 
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Guenter Roeck
On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:
 
 On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote:
 
  On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
  Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
  
  [   10.886016] applesmc: key count changed from 261 to 1174405121
  
  
  Explains the crash, but the new key count is very wrong. 1174405121 = 
  0x4601.
  Which I guess explains the subsequent memory allocation error in the log.
  
  Henrik, any idea what might be going on ? Is it possible that the previous
  command failure leaves some state machine in a bad state ?
  
  I seem to recall a report on another similar state problem on newer
  machines, so maybe, yes. Older machines seem fine, I have never
  encountered the problem myself. Here is a patch to test that
  theory. It has been tested to be pretty harmless on two different
  generations.
  
  I really really do not want to add an 'if (value is insane)' check ;-)
  
  Chris,
  
  any chance you can load this patch on an affected machine so we can get
  test feedback ? This one is too experimental to submit upstream without
  knowing that it really fixes the problem.
 
 Yes. What kernel.org source version should I apply it against? I'd use the 
 non-debug config file from an equivalent version Fedora kernel, unless asked 
 otherwise. And also should I test it on other vintages? I have here 
 MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).
 
Only requirement is that it also includes the previous patch, so it would be
optimal if you can apply it on top of the previous image.

Thanks,
Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Chris Murphy

On Oct 1, 2013, at 10:24 AM, Guenter Roeck li...@roeck-us.net wrote:

 On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:
 
 On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote:
 
 On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
 Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
 
 [   10.886016] applesmc: key count changed from 261 to 1174405121
 
 
 Explains the crash, but the new key count is very wrong. 1174405121 = 
 0x4601.
 Which I guess explains the subsequent memory allocation error in the log.
 
 Henrik, any idea what might be going on ? Is it possible that the previous
 command failure leaves some state machine in a bad state ?
 
 I seem to recall a report on another similar state problem on newer
 machines, so maybe, yes. Older machines seem fine, I have never
 encountered the problem myself. Here is a patch to test that
 theory. It has been tested to be pretty harmless on two different
 generations.
 
 I really really do not want to add an 'if (value is insane)' check ;-)
 
 Chris,
 
 any chance you can load this patch on an affected machine so we can get
 test feedback ? This one is too experimental to submit upstream without
 knowing that it really fixes the problem.
 
 Yes. What kernel.org source version should I apply it against? I'd use the 
 non-debug config file from an equivalent version Fedora kernel, unless asked 
 otherwise. And also should I test it on other vintages? I have here 
 MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).
 
 Only requirement is that it also includes the previous patch, so it would be
 optimal if you can apply it on top of the previous image.

Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
smc in it:

[   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
[   13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10


Chris--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Guenter Roeck
On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote:
 
 On Oct 1, 2013, at 10:24 AM, Guenter Roeck li...@roeck-us.net wrote:
 
  On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:
  
  On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote:
  
  On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
  Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
  
  [   10.886016] applesmc: key count changed from 261 to 1174405121
  
  
  Explains the crash, but the new key count is very wrong. 1174405121 = 
  0x4601.
  Which I guess explains the subsequent memory allocation error in the 
  log.
  
  Henrik, any idea what might be going on ? Is it possible that the 
  previous
  command failure leaves some state machine in a bad state ?
  
  I seem to recall a report on another similar state problem on newer
  machines, so maybe, yes. Older machines seem fine, I have never
  encountered the problem myself. Here is a patch to test that
  theory. It has been tested to be pretty harmless on two different
  generations.
  
  I really really do not want to add an 'if (value is insane)' check ;-)
  
  Chris,
  
  any chance you can load this patch on an affected machine so we can get
  test feedback ? This one is too experimental to submit upstream without
  knowing that it really fixes the problem.
  
  Yes. What kernel.org source version should I apply it against? I'd use the 
  non-debug config file from an equivalent version Fedora kernel, unless 
  asked otherwise. And also should I test it on other vintages? I have here 
  MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).
  
  Only requirement is that it also includes the previous patch, so it would be
  optimal if you can apply it on top of the previous image.
 
 Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
 ~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
 smc in it:
 
 [   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
 [   13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10
 

Hi Chris,

That only means that you did not hit the problem. There may be some secondary
trigger (cold boot ? coffee on the cpu ?).

One thing I have seen in all logs is the earlier send_byte fail message, so
I think that is a pre-requisite.

Thanks,
Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Chris Murphy

On Oct 1, 2013, at 9:51 PM, Guenter Roeck li...@roeck-us.net wrote:

 On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote:
 
 On Oct 1, 2013, at 10:24 AM, Guenter Roeck li...@roeck-us.net wrote:
 
 On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:
 
 On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote:
 
 On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:
 Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.
 
 [   10.886016] applesmc: key count changed from 261 to 1174405121
 
 
 Explains the crash, but the new key count is very wrong. 1174405121 = 
 0x4601.
 Which I guess explains the subsequent memory allocation error in the 
 log.
 
 Henrik, any idea what might be going on ? Is it possible that the 
 previous
 command failure leaves some state machine in a bad state ?
 
 I seem to recall a report on another similar state problem on newer
 machines, so maybe, yes. Older machines seem fine, I have never
 encountered the problem myself. Here is a patch to test that
 theory. It has been tested to be pretty harmless on two different
 generations.
 
 I really really do not want to add an 'if (value is insane)' check ;-)
 
 Chris,
 
 any chance you can load this patch on an affected machine so we can get
 test feedback ? This one is too experimental to submit upstream without
 knowing that it really fixes the problem.
 
 Yes. What kernel.org source version should I apply it against? I'd use the 
 non-debug config file from an equivalent version Fedora kernel, unless 
 asked otherwise. And also should I test it on other vintages? I have here 
 MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).
 
 Only requirement is that it also includes the previous patch, so it would be
 optimal if you can apply it on top of the previous image.
 
 Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
 ~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
 smc in it:
 
 [   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
 [   13.833402] input: applesmc as 
 /devices/platform/applesmc.768/input/input10
 
 
 Hi Chris,
 
 That only means that you did not hit the problem. There may be some secondary
 trigger (cold boot ? coffee on the cpu ?).
 
 One thing I have seen in all logs is the earlier send_byte fail message, so
 I think that is a pre-requisite.

I have no idea how to trigger it. I have tried cold and warm boots. Boots 
between linux and OS X to linux. *shrug* I'll keep trying as I'm doing other 
testing, maybe I'll stumble onto it.


Chris--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-10-01 Thread Guenter Roeck

On 10/01/2013 08:55 PM, Chris Murphy wrote:


On Oct 1, 2013, at 9:51 PM, Guenter Roeck li...@roeck-us.net wrote:


On Tue, Oct 01, 2013 at 07:09:26PM -0600, Chris Murphy wrote:


On Oct 1, 2013, at 10:24 AM, Guenter Roeck li...@roeck-us.net wrote:


On Tue, Oct 01, 2013 at 09:33:13AM -0600, Chris Murphy wrote:


On Oct 1, 2013, at 9:19 AM, Guenter Roeck li...@roeck-us.net wrote:


On Tue, Oct 01, 2013 at 12:55:26PM +0200, Henrik Rydberg wrote:

Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.

[   10.886016] applesmc: key count changed from 261 to 1174405121



Explains the crash, but the new key count is very wrong. 1174405121 = 
0x4601.
Which I guess explains the subsequent memory allocation error in the log.

Henrik, any idea what might be going on ? Is it possible that the previous
command failure leaves some state machine in a bad state ?


I seem to recall a report on another similar state problem on newer
machines, so maybe, yes. Older machines seem fine, I have never
encountered the problem myself. Here is a patch to test that
theory. It has been tested to be pretty harmless on two different
generations.

I really really do not want to add an 'if (value is insane)' check ;-)


Chris,

any chance you can load this patch on an affected machine so we can get
test feedback ? This one is too experimental to submit upstream without
knowing that it really fixes the problem.


Yes. What kernel.org source version should I apply it against? I'd use the 
non-debug config file from an equivalent version Fedora kernel, unless asked 
otherwise. And also should I test it on other vintages? I have here 
MBP4,1(2008); MBP8,2(2011), and MBP10,2(2012).


Only requirement is that it also includes the previous patch, so it would be
optimal if you can apply it on top of the previous image.


Patch added on top of 3.12.0-0.rc3.git0.1.fc20.x86_64 and built. But after 
~dozen reboots, I'm not triggering the problem. The only items in dmesg with 
smc in it:

[   13.799819] applesmc: key=261 fan=2 temp=14 index=14 acc=1 lux=2 kbd=1
[   13.833402] input: applesmc as /devices/platform/applesmc.768/input/input10



Hi Chris,

That only means that you did not hit the problem. There may be some secondary
trigger (cold boot ? coffee on the cpu ?).

One thing I have seen in all logs is the earlier send_byte fail message, so
I think that is a pre-requisite.


I have no idea how to trigger it. I have tried cold and warm boots. Boots 
between linux and OS X to linux. *shrug* I'll keep trying as I'm doing other 
testing, maybe I'll stumble onto it.



I am sure you didn't pour coffee onto the CPU yet :)

Basic rule of testing: You'll only hit the problem again after you are convinced
that it was magically fixed by a completely unrelated change.

Cheers,
Guenter

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-30 Thread Guenter Roeck

On 09/30/2013 06:57 PM, Chris Murphy wrote:


On Sep 27, 2013, at 5:33 PM, Guenter Roeck  wrote:


On 09/27/2013 11:03 AM, Chris Murphy wrote:


On Sep 27, 2013, at 11:59 AM, Guenter Roeck  wrote:


On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:


On Sep 27, 2013, at 11:12 AM, Guenter Roeck  wrote:


On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:

On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  wrote:

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.


Good idea, and easy enough to test with the patch below.


Should we apply this patch even though it may not solve the specific problem ?


Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.


Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.


Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.

Thanks,
Henrik

---

 From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg 
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: 
Suggested-by: Guenter Roeck 
Signed-off-by: Henrik Rydberg 
---
drivers/hwmon/applesmc.c | 11 ++-
1 file changed, 10 insertions(+), 1 deletion(-)


Thanks for the quick reply.  I'll get this rolled into our kernels soon.


I sent a pull request to Linus, so you should be able to pull it from
the upstream kernel shortly. Would be great to get feedback if the patch
solves the problem (or doesn't).


I'll start running it when it appears in koji. It's very transient, maybe one 
oops per week with lots of (other) testing. I'm not even sure if it happens on 
warm or cold boots or both.


When you do, can you possibly trigger an event based on the warning added
with the patch ? This might help us to identify if the problem fixed
with the patch actually happens.


I don't understand the question. I'm uncertain how to trigger, and also what 
event.



The patch includes a new warning message.

pr_warn("key count changed from %d to %d\n",
s->key_count, count);

It would be great if there would be a means to detect if this message is seen
in a kernel log, because it would show that the potential crash condition
fixed with the patch was actually encountered. This would help us to determine
if we actually fixed the problem or not.

Of course, we'll know if is wasn't fixed if the system still crashes.


Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.

[   10.886016] applesmc: key count changed from 261 to 1174405121



Explains the crash, but the new key count is very wrong. 1174405121 = 
0x4601.
Which I guess explains the subsequent memory allocation error in the log.

Henrik, any idea what might be going on ? Is it possible that the previous
command failure leaves some state machine in a bad state ?

Guenter


Attaching new full dmesg to the bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11

Chris



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-30 Thread Chris Murphy

On Sep 27, 2013, at 5:33 PM, Guenter Roeck  wrote:

> On 09/27/2013 11:03 AM, Chris Murphy wrote:
>> 
>> On Sep 27, 2013, at 11:59 AM, Guenter Roeck  wrote:
>> 
>>> On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:
 
 On Sep 27, 2013, at 11:12 AM, Guenter Roeck  wrote:
 
> On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
>> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  
>> wrote:
>> This suggests that initialization may be attempted more than once. 
>> The key cache
>> is allocated only once, but the number of keys is read for each 
>> attempt.
>> 
>> No idea if that can happen, but if the number of keys can increase 
>> after
>> the first initialization attempt you would have an explanation for 
>> the crash.
> 
> Good idea, and easy enough to test with the patch below.
> 
 Should we apply this patch even though it may not solve the specific 
 problem ?
>>> 
>>> Yes, why not - it certainly won't hurt. I am running it right now, so
>>> it is at least run-tested.
>>> 
 Again, not sure if the key count can change, but the current code is 
 at the very
 least inconsistent, as it keeps reading the key count without updating 
 or
 verifying the cache size.
>>> 
>>> Yes - I agree that the error state is far-fetched, but it is hard to
>>> see any other logical explanation. There is of course always the
>>> possibility that the problem is somewhere else completely.
>>> 
>>> Proper patch attached.
>>> 
>>> Thanks,
>>> Henrik
>>> 
>>> ---
>>> 
>>> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
>>> From: Henrik Rydberg 
>>> Date: Thu, 26 Sep 2013 08:33:16 +0200
>>> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
>>> 
>>> After reports from Chris and Josh Boyer of a rare crash in applesmc,
>>> Guenter pointed at the initialization problem fixed below. The patch
>>> has not been verified to fix the crash, but should be applied
>>> regardless.
>>> 
>>> Reported-by: 
>>> Suggested-by: Guenter Roeck 
>>> Signed-off-by: Henrik Rydberg 
>>> ---
>>> drivers/hwmon/applesmc.c | 11 ++-
>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>> 
>> Thanks for the quick reply.  I'll get this rolled into our kernels soon.
>> 
> I sent a pull request to Linus, so you should be able to pull it from
> the upstream kernel shortly. Would be great to get feedback if the patch
> solves the problem (or doesn't).
 
 I'll start running it when it appears in koji. It's very transient, maybe 
 one oops per week with lots of (other) testing. I'm not even sure if it 
 happens on warm or cold boots or both.
 
>>> When you do, can you possibly trigger an event based on the warning added
>>> with the patch ? This might help us to identify if the problem fixed
>>> with the patch actually happens.
>> 
>> I don't understand the question. I'm uncertain how to trigger, and also what 
>> event.
>> 
> 
> The patch includes a new warning message.
> 
>   pr_warn("key count changed from %d to %d\n",
>s->key_count, count);
> 
> It would be great if there would be a means to detect if this message is seen
> in a kernel log, because it would show that the potential crash condition
> fixed with the patch was actually encountered. This would help us to determine
> if we actually fixed the problem or not.
> 
> Of course, we'll know if is wasn't fixed if the system still crashes.

Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. 

[   10.886016] applesmc: key count changed from 261 to 1174405121

Attaching new full dmesg to the bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11

Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-30 Thread Chris Murphy

On Sep 27, 2013, at 5:33 PM, Guenter Roeck li...@roeck-us.net wrote:

 On 09/27/2013 11:03 AM, Chris Murphy wrote:
 
 On Sep 27, 2013, at 11:59 AM, Guenter Roeck li...@roeck-us.net wrote:
 
 On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:
 
 On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote:
 
 On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
 On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se 
 wrote:
 This suggests that initialization may be attempted more than once. 
 The key cache
 is allocated only once, but the number of keys is read for each 
 attempt.
 
 No idea if that can happen, but if the number of keys can increase 
 after
 the first initialization attempt you would have an explanation for 
 the crash.
 
 Good idea, and easy enough to test with the patch below.
 
 Should we apply this patch even though it may not solve the specific 
 problem ?
 
 Yes, why not - it certainly won't hurt. I am running it right now, so
 it is at least run-tested.
 
 Again, not sure if the key count can change, but the current code is 
 at the very
 least inconsistent, as it keeps reading the key count without updating 
 or
 verifying the cache size.
 
 Yes - I agree that the error state is far-fetched, but it is hard to
 see any other logical explanation. There is of course always the
 possibility that the problem is somewhere else completely.
 
 Proper patch attached.
 
 Thanks,
 Henrik
 
 ---
 
 From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
 From: Henrik Rydberg rydb...@euromail.se
 Date: Thu, 26 Sep 2013 08:33:16 +0200
 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
 
 After reports from Chris and Josh Boyer of a rare crash in applesmc,
 Guenter pointed at the initialization problem fixed below. The patch
 has not been verified to fix the crash, but should be applied
 regardless.
 
 Reported-by: jwbo...@fedoraproject.org
 Suggested-by: Guenter Roeck li...@roeck-us.net
 Signed-off-by: Henrik Rydberg rydb...@euromail.se
 ---
 drivers/hwmon/applesmc.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)
 
 Thanks for the quick reply.  I'll get this rolled into our kernels soon.
 
 I sent a pull request to Linus, so you should be able to pull it from
 the upstream kernel shortly. Would be great to get feedback if the patch
 solves the problem (or doesn't).
 
 I'll start running it when it appears in koji. It's very transient, maybe 
 one oops per week with lots of (other) testing. I'm not even sure if it 
 happens on warm or cold boots or both.
 
 When you do, can you possibly trigger an event based on the warning added
 with the patch ? This might help us to identify if the problem fixed
 with the patch actually happens.
 
 I don't understand the question. I'm uncertain how to trigger, and also what 
 event.
 
 
 The patch includes a new warning message.
 
   pr_warn(key count changed from %d to %d\n,
s-key_count, count);
 
 It would be great if there would be a means to detect if this message is seen
 in a kernel log, because it would show that the potential crash condition
 fixed with the patch was actually encountered. This would help us to determine
 if we actually fixed the problem or not.
 
 Of course, we'll know if is wasn't fixed if the system still crashes.

Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. 

[   10.886016] applesmc: key count changed from 261 to 1174405121

Attaching new full dmesg to the bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11

Chris--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-30 Thread Guenter Roeck

On 09/30/2013 06:57 PM, Chris Murphy wrote:


On Sep 27, 2013, at 5:33 PM, Guenter Roeck li...@roeck-us.net wrote:


On 09/27/2013 11:03 AM, Chris Murphy wrote:


On Sep 27, 2013, at 11:59 AM, Guenter Roeck li...@roeck-us.net wrote:


On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:


On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote:


On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:

On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote:

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.


Good idea, and easy enough to test with the patch below.


Should we apply this patch even though it may not solve the specific problem ?


Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.


Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.


Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.

Thanks,
Henrik

---

 From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg rydb...@euromail.se
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: jwbo...@fedoraproject.org
Suggested-by: Guenter Roeck li...@roeck-us.net
Signed-off-by: Henrik Rydberg rydb...@euromail.se
---
drivers/hwmon/applesmc.c | 11 ++-
1 file changed, 10 insertions(+), 1 deletion(-)


Thanks for the quick reply.  I'll get this rolled into our kernels soon.


I sent a pull request to Linus, so you should be able to pull it from
the upstream kernel shortly. Would be great to get feedback if the patch
solves the problem (or doesn't).


I'll start running it when it appears in koji. It's very transient, maybe one 
oops per week with lots of (other) testing. I'm not even sure if it happens on 
warm or cold boots or both.


When you do, can you possibly trigger an event based on the warning added
with the patch ? This might help us to identify if the problem fixed
with the patch actually happens.


I don't understand the question. I'm uncertain how to trigger, and also what 
event.



The patch includes a new warning message.

pr_warn(key count changed from %d to %d\n,
s-key_count, count);

It would be great if there would be a means to detect if this message is seen
in a kernel log, because it would show that the potential crash condition
fixed with the patch was actually encountered. This would help us to determine
if we actually fixed the problem or not.

Of course, we'll know if is wasn't fixed if the system still crashes.


Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.

[   10.886016] applesmc: key count changed from 261 to 1174405121



Explains the crash, but the new key count is very wrong. 1174405121 = 
0x4601.
Which I guess explains the subsequent memory allocation error in the log.

Henrik, any idea what might be going on ? Is it possible that the previous
command failure leaves some state machine in a bad state ?

Guenter


Attaching new full dmesg to the bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11

Chris



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Guenter Roeck

On 09/27/2013 11:03 AM, Chris Murphy wrote:


On Sep 27, 2013, at 11:59 AM, Guenter Roeck  wrote:


On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:


On Sep 27, 2013, at 11:12 AM, Guenter Roeck  wrote:


On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:

On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  wrote:

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.


Good idea, and easy enough to test with the patch below.


Should we apply this patch even though it may not solve the specific problem ?


Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.


Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.


Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.

Thanks,
Henrik

---

 From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg 
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: 
Suggested-by: Guenter Roeck 
Signed-off-by: Henrik Rydberg 
---
drivers/hwmon/applesmc.c | 11 ++-
1 file changed, 10 insertions(+), 1 deletion(-)


Thanks for the quick reply.  I'll get this rolled into our kernels soon.


I sent a pull request to Linus, so you should be able to pull it from
the upstream kernel shortly. Would be great to get feedback if the patch
solves the problem (or doesn't).


I'll start running it when it appears in koji. It's very transient, maybe one 
oops per week with lots of (other) testing. I'm not even sure if it happens on 
warm or cold boots or both.


When you do, can you possibly trigger an event based on the warning added
with the patch ? This might help us to identify if the problem fixed
with the patch actually happens.


I don't understand the question. I'm uncertain how to trigger, and also what 
event.



The patch includes a new warning message.

pr_warn("key count changed from %d to %d\n",
s->key_count, count);

It would be great if there would be a means to detect if this message is seen
in a kernel log, because it would show that the potential crash condition
fixed with the patch was actually encountered. This would help us to determine
if we actually fixed the problem or not.

Of course, we'll know if is wasn't fixed if the system still crashes.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Chris Murphy

On Sep 27, 2013, at 11:12 AM, Guenter Roeck  wrote:

> On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
>> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  wrote:
>> This suggests that initialization may be attempted more than once. The 
>> key cache
>> is allocated only once, but the number of keys is read for each attempt.
>> 
>> No idea if that can happen, but if the number of keys can increase after
>> the first initialization attempt you would have an explanation for the 
>> crash.
> 
> Good idea, and easy enough to test with the patch below.
> 
 Should we apply this patch even though it may not solve the specific 
 problem ?
>>> 
>>> Yes, why not - it certainly won't hurt. I am running it right now, so
>>> it is at least run-tested.
>>> 
 Again, not sure if the key count can change, but the current code is at 
 the very
 least inconsistent, as it keeps reading the key count without updating or
 verifying the cache size.
>>> 
>>> Yes - I agree that the error state is far-fetched, but it is hard to
>>> see any other logical explanation. There is of course always the
>>> possibility that the problem is somewhere else completely.
>>> 
>>> Proper patch attached.
>>> 
>>> Thanks,
>>> Henrik
>>> 
>>> ---
>>> 
>>> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
>>> From: Henrik Rydberg 
>>> Date: Thu, 26 Sep 2013 08:33:16 +0200
>>> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
>>> 
>>> After reports from Chris and Josh Boyer of a rare crash in applesmc,
>>> Guenter pointed at the initialization problem fixed below. The patch
>>> has not been verified to fix the crash, but should be applied
>>> regardless.
>>> 
>>> Reported-by: 
>>> Suggested-by: Guenter Roeck 
>>> Signed-off-by: Henrik Rydberg 
>>> ---
>>> drivers/hwmon/applesmc.c | 11 ++-
>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>> 
>> Thanks for the quick reply.  I'll get this rolled into our kernels soon.
>> 
> I sent a pull request to Linus, so you should be able to pull it from
> the upstream kernel shortly. Would be great to get feedback if the patch
> solves the problem (or doesn't).

I'll start running it when it appears in koji. It's very transient, maybe one 
oops per week with lots of (other) testing. I'm not even sure if it happens on 
warm or cold boots or both.

Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Chris Murphy

On Sep 27, 2013, at 11:59 AM, Guenter Roeck  wrote:

> On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:
>> 
>> On Sep 27, 2013, at 11:12 AM, Guenter Roeck  wrote:
>> 
>>> On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
 On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  
 wrote:
 This suggests that initialization may be attempted more than once. The 
 key cache
 is allocated only once, but the number of keys is read for each 
 attempt.
 
 No idea if that can happen, but if the number of keys can increase 
 after
 the first initialization attempt you would have an explanation for the 
 crash.
>>> 
>>> Good idea, and easy enough to test with the patch below.
>>> 
>> Should we apply this patch even though it may not solve the specific 
>> problem ?
> 
> Yes, why not - it certainly won't hurt. I am running it right now, so
> it is at least run-tested.
> 
>> Again, not sure if the key count can change, but the current code is at 
>> the very
>> least inconsistent, as it keeps reading the key count without updating or
>> verifying the cache size.
> 
> Yes - I agree that the error state is far-fetched, but it is hard to
> see any other logical explanation. There is of course always the
> possibility that the problem is somewhere else completely.
> 
> Proper patch attached.
> 
> Thanks,
> Henrik
> 
> ---
> 
> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
> From: Henrik Rydberg 
> Date: Thu, 26 Sep 2013 08:33:16 +0200
> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
> 
> After reports from Chris and Josh Boyer of a rare crash in applesmc,
> Guenter pointed at the initialization problem fixed below. The patch
> has not been verified to fix the crash, but should be applied
> regardless.
> 
> Reported-by: 
> Suggested-by: Guenter Roeck 
> Signed-off-by: Henrik Rydberg 
> ---
> drivers/hwmon/applesmc.c | 11 ++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
 
 Thanks for the quick reply.  I'll get this rolled into our kernels soon.
 
>>> I sent a pull request to Linus, so you should be able to pull it from
>>> the upstream kernel shortly. Would be great to get feedback if the patch
>>> solves the problem (or doesn't).
>> 
>> I'll start running it when it appears in koji. It's very transient, maybe 
>> one oops per week with lots of (other) testing. I'm not even sure if it 
>> happens on warm or cold boots or both.
>> 
> When you do, can you possibly trigger an event based on the warning added
> with the patch ? This might help us to identify if the problem fixed
> with the patch actually happens.

I don't understand the question. I'm uncertain how to trigger, and also what 
event.

Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Guenter Roeck
On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  wrote:
> >> > > This suggests that initialization may be attempted more than once. The 
> >> > > key cache
> >> > > is allocated only once, but the number of keys is read for each 
> >> > > attempt.
> >> > >
> >> > > No idea if that can happen, but if the number of keys can increase 
> >> > > after
> >> > > the first initialization attempt you would have an explanation for the 
> >> > > crash.
> >> >
> >> > Good idea, and easy enough to test with the patch below.
> >> >
> >> Should we apply this patch even though it may not solve the specific 
> >> problem ?
> >
> > Yes, why not - it certainly won't hurt. I am running it right now, so
> > it is at least run-tested.
> >
> >> Again, not sure if the key count can change, but the current code is at 
> >> the very
> >> least inconsistent, as it keeps reading the key count without updating or
> >> verifying the cache size.
> >
> > Yes - I agree that the error state is far-fetched, but it is hard to
> > see any other logical explanation. There is of course always the
> > possibility that the problem is somewhere else completely.
> >
> > Proper patch attached.
> >
> > Thanks,
> > Henrik
> >
> > ---
> >
> > From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
> > From: Henrik Rydberg 
> > Date: Thu, 26 Sep 2013 08:33:16 +0200
> > Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
> >
> > After reports from Chris and Josh Boyer of a rare crash in applesmc,
> > Guenter pointed at the initialization problem fixed below. The patch
> > has not been verified to fix the crash, but should be applied
> > regardless.
> >
> > Reported-by: 
> > Suggested-by: Guenter Roeck 
> > Signed-off-by: Henrik Rydberg 
> > ---
> >  drivers/hwmon/applesmc.c | 11 ++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> Thanks for the quick reply.  I'll get this rolled into our kernels soon.
> 
Patch is now upstream (commit 5f4513864).

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Guenter Roeck
On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:
> 
> On Sep 27, 2013, at 11:12 AM, Guenter Roeck  wrote:
> 
> > On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
> >> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  
> >> wrote:
> >> This suggests that initialization may be attempted more than once. The 
> >> key cache
> >> is allocated only once, but the number of keys is read for each 
> >> attempt.
> >> 
> >> No idea if that can happen, but if the number of keys can increase 
> >> after
> >> the first initialization attempt you would have an explanation for the 
> >> crash.
> > 
> > Good idea, and easy enough to test with the patch below.
> > 
>  Should we apply this patch even though it may not solve the specific 
>  problem ?
> >>> 
> >>> Yes, why not - it certainly won't hurt. I am running it right now, so
> >>> it is at least run-tested.
> >>> 
>  Again, not sure if the key count can change, but the current code is at 
>  the very
>  least inconsistent, as it keeps reading the key count without updating or
>  verifying the cache size.
> >>> 
> >>> Yes - I agree that the error state is far-fetched, but it is hard to
> >>> see any other logical explanation. There is of course always the
> >>> possibility that the problem is somewhere else completely.
> >>> 
> >>> Proper patch attached.
> >>> 
> >>> Thanks,
> >>> Henrik
> >>> 
> >>> ---
> >>> 
> >>> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
> >>> From: Henrik Rydberg 
> >>> Date: Thu, 26 Sep 2013 08:33:16 +0200
> >>> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
> >>> 
> >>> After reports from Chris and Josh Boyer of a rare crash in applesmc,
> >>> Guenter pointed at the initialization problem fixed below. The patch
> >>> has not been verified to fix the crash, but should be applied
> >>> regardless.
> >>> 
> >>> Reported-by: 
> >>> Suggested-by: Guenter Roeck 
> >>> Signed-off-by: Henrik Rydberg 
> >>> ---
> >>> drivers/hwmon/applesmc.c | 11 ++-
> >>> 1 file changed, 10 insertions(+), 1 deletion(-)
> >> 
> >> Thanks for the quick reply.  I'll get this rolled into our kernels soon.
> >> 
> > I sent a pull request to Linus, so you should be able to pull it from
> > the upstream kernel shortly. Would be great to get feedback if the patch
> > solves the problem (or doesn't).
> 
> I'll start running it when it appears in koji. It's very transient, maybe one 
> oops per week with lots of (other) testing. I'm not even sure if it happens 
> on warm or cold boots or both.
> 
When you do, can you possibly trigger an event based on the warning added
with the patch ? This might help us to identify if the problem fixed
with the patch actually happens.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Guenter Roeck
On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  wrote:
> >> > > This suggests that initialization may be attempted more than once. The 
> >> > > key cache
> >> > > is allocated only once, but the number of keys is read for each 
> >> > > attempt.
> >> > >
> >> > > No idea if that can happen, but if the number of keys can increase 
> >> > > after
> >> > > the first initialization attempt you would have an explanation for the 
> >> > > crash.
> >> >
> >> > Good idea, and easy enough to test with the patch below.
> >> >
> >> Should we apply this patch even though it may not solve the specific 
> >> problem ?
> >
> > Yes, why not - it certainly won't hurt. I am running it right now, so
> > it is at least run-tested.
> >
> >> Again, not sure if the key count can change, but the current code is at 
> >> the very
> >> least inconsistent, as it keeps reading the key count without updating or
> >> verifying the cache size.
> >
> > Yes - I agree that the error state is far-fetched, but it is hard to
> > see any other logical explanation. There is of course always the
> > possibility that the problem is somewhere else completely.
> >
> > Proper patch attached.
> >
> > Thanks,
> > Henrik
> >
> > ---
> >
> > From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
> > From: Henrik Rydberg 
> > Date: Thu, 26 Sep 2013 08:33:16 +0200
> > Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
> >
> > After reports from Chris and Josh Boyer of a rare crash in applesmc,
> > Guenter pointed at the initialization problem fixed below. The patch
> > has not been verified to fix the crash, but should be applied
> > regardless.
> >
> > Reported-by: 
> > Suggested-by: Guenter Roeck 
> > Signed-off-by: Henrik Rydberg 
> > ---
> >  drivers/hwmon/applesmc.c | 11 ++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> Thanks for the quick reply.  I'll get this rolled into our kernels soon.
> 
I sent a pull request to Linus, so you should be able to pull it from
the upstream kernel shortly. Would be great to get feedback if the patch
solves the problem (or doesn't).

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Josh Boyer
On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg  wrote:
>> > > This suggests that initialization may be attempted more than once. The 
>> > > key cache
>> > > is allocated only once, but the number of keys is read for each attempt.
>> > >
>> > > No idea if that can happen, but if the number of keys can increase after
>> > > the first initialization attempt you would have an explanation for the 
>> > > crash.
>> >
>> > Good idea, and easy enough to test with the patch below.
>> >
>> Should we apply this patch even though it may not solve the specific problem 
>> ?
>
> Yes, why not - it certainly won't hurt. I am running it right now, so
> it is at least run-tested.
>
>> Again, not sure if the key count can change, but the current code is at the 
>> very
>> least inconsistent, as it keeps reading the key count without updating or
>> verifying the cache size.
>
> Yes - I agree that the error state is far-fetched, but it is hard to
> see any other logical explanation. There is of course always the
> possibility that the problem is somewhere else completely.
>
> Proper patch attached.
>
> Thanks,
> Henrik
>
> ---
>
> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
> From: Henrik Rydberg 
> Date: Thu, 26 Sep 2013 08:33:16 +0200
> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
>
> After reports from Chris and Josh Boyer of a rare crash in applesmc,
> Guenter pointed at the initialization problem fixed below. The patch
> has not been verified to fix the crash, but should be applied
> regardless.
>
> Reported-by: 
> Suggested-by: Guenter Roeck 
> Signed-off-by: Henrik Rydberg 
> ---
>  drivers/hwmon/applesmc.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)

Thanks for the quick reply.  I'll get this rolled into our kernels soon.

josh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Josh Boyer
On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote:
   This suggests that initialization may be attempted more than once. The 
   key cache
   is allocated only once, but the number of keys is read for each attempt.
  
   No idea if that can happen, but if the number of keys can increase after
   the first initialization attempt you would have an explanation for the 
   crash.
 
  Good idea, and easy enough to test with the patch below.
 
 Should we apply this patch even though it may not solve the specific problem 
 ?

 Yes, why not - it certainly won't hurt. I am running it right now, so
 it is at least run-tested.

 Again, not sure if the key count can change, but the current code is at the 
 very
 least inconsistent, as it keeps reading the key count without updating or
 verifying the cache size.

 Yes - I agree that the error state is far-fetched, but it is hard to
 see any other logical explanation. There is of course always the
 possibility that the problem is somewhere else completely.

 Proper patch attached.

 Thanks,
 Henrik

 ---

 From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
 From: Henrik Rydberg rydb...@euromail.se
 Date: Thu, 26 Sep 2013 08:33:16 +0200
 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

 After reports from Chris and Josh Boyer of a rare crash in applesmc,
 Guenter pointed at the initialization problem fixed below. The patch
 has not been verified to fix the crash, but should be applied
 regardless.

 Reported-by: jwbo...@fedoraproject.org
 Suggested-by: Guenter Roeck li...@roeck-us.net
 Signed-off-by: Henrik Rydberg rydb...@euromail.se
 ---
  drivers/hwmon/applesmc.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

Thanks for the quick reply.  I'll get this rolled into our kernels soon.

josh
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Guenter Roeck
On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
 On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote:
This suggests that initialization may be attempted more than once. The 
key cache
is allocated only once, but the number of keys is read for each 
attempt.
   
No idea if that can happen, but if the number of keys can increase 
after
the first initialization attempt you would have an explanation for the 
crash.
  
   Good idea, and easy enough to test with the patch below.
  
  Should we apply this patch even though it may not solve the specific 
  problem ?
 
  Yes, why not - it certainly won't hurt. I am running it right now, so
  it is at least run-tested.
 
  Again, not sure if the key count can change, but the current code is at 
  the very
  least inconsistent, as it keeps reading the key count without updating or
  verifying the cache size.
 
  Yes - I agree that the error state is far-fetched, but it is hard to
  see any other logical explanation. There is of course always the
  possibility that the problem is somewhere else completely.
 
  Proper patch attached.
 
  Thanks,
  Henrik
 
  ---
 
  From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
  From: Henrik Rydberg rydb...@euromail.se
  Date: Thu, 26 Sep 2013 08:33:16 +0200
  Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
 
  After reports from Chris and Josh Boyer of a rare crash in applesmc,
  Guenter pointed at the initialization problem fixed below. The patch
  has not been verified to fix the crash, but should be applied
  regardless.
 
  Reported-by: jwbo...@fedoraproject.org
  Suggested-by: Guenter Roeck li...@roeck-us.net
  Signed-off-by: Henrik Rydberg rydb...@euromail.se
  ---
   drivers/hwmon/applesmc.c | 11 ++-
   1 file changed, 10 insertions(+), 1 deletion(-)
 
 Thanks for the quick reply.  I'll get this rolled into our kernels soon.
 
I sent a pull request to Linus, so you should be able to pull it from
the upstream kernel shortly. Would be great to get feedback if the patch
solves the problem (or doesn't).

Thanks,
Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Guenter Roeck
On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:
 
 On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote:
 
  On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
  On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se 
  wrote:
  This suggests that initialization may be attempted more than once. The 
  key cache
  is allocated only once, but the number of keys is read for each 
  attempt.
  
  No idea if that can happen, but if the number of keys can increase 
  after
  the first initialization attempt you would have an explanation for the 
  crash.
  
  Good idea, and easy enough to test with the patch below.
  
  Should we apply this patch even though it may not solve the specific 
  problem ?
  
  Yes, why not - it certainly won't hurt. I am running it right now, so
  it is at least run-tested.
  
  Again, not sure if the key count can change, but the current code is at 
  the very
  least inconsistent, as it keeps reading the key count without updating or
  verifying the cache size.
  
  Yes - I agree that the error state is far-fetched, but it is hard to
  see any other logical explanation. There is of course always the
  possibility that the problem is somewhere else completely.
  
  Proper patch attached.
  
  Thanks,
  Henrik
  
  ---
  
  From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
  From: Henrik Rydberg rydb...@euromail.se
  Date: Thu, 26 Sep 2013 08:33:16 +0200
  Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
  
  After reports from Chris and Josh Boyer of a rare crash in applesmc,
  Guenter pointed at the initialization problem fixed below. The patch
  has not been verified to fix the crash, but should be applied
  regardless.
  
  Reported-by: jwbo...@fedoraproject.org
  Suggested-by: Guenter Roeck li...@roeck-us.net
  Signed-off-by: Henrik Rydberg rydb...@euromail.se
  ---
  drivers/hwmon/applesmc.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)
  
  Thanks for the quick reply.  I'll get this rolled into our kernels soon.
  
  I sent a pull request to Linus, so you should be able to pull it from
  the upstream kernel shortly. Would be great to get feedback if the patch
  solves the problem (or doesn't).
 
 I'll start running it when it appears in koji. It's very transient, maybe one 
 oops per week with lots of (other) testing. I'm not even sure if it happens 
 on warm or cold boots or both.
 
When you do, can you possibly trigger an event based on the warning added
with the patch ? This might help us to identify if the problem fixed
with the patch actually happens.

Thanks,
Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Guenter Roeck
On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
 On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote:
This suggests that initialization may be attempted more than once. The 
key cache
is allocated only once, but the number of keys is read for each 
attempt.
   
No idea if that can happen, but if the number of keys can increase 
after
the first initialization attempt you would have an explanation for the 
crash.
  
   Good idea, and easy enough to test with the patch below.
  
  Should we apply this patch even though it may not solve the specific 
  problem ?
 
  Yes, why not - it certainly won't hurt. I am running it right now, so
  it is at least run-tested.
 
  Again, not sure if the key count can change, but the current code is at 
  the very
  least inconsistent, as it keeps reading the key count without updating or
  verifying the cache size.
 
  Yes - I agree that the error state is far-fetched, but it is hard to
  see any other logical explanation. There is of course always the
  possibility that the problem is somewhere else completely.
 
  Proper patch attached.
 
  Thanks,
  Henrik
 
  ---
 
  From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
  From: Henrik Rydberg rydb...@euromail.se
  Date: Thu, 26 Sep 2013 08:33:16 +0200
  Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
 
  After reports from Chris and Josh Boyer of a rare crash in applesmc,
  Guenter pointed at the initialization problem fixed below. The patch
  has not been verified to fix the crash, but should be applied
  regardless.
 
  Reported-by: jwbo...@fedoraproject.org
  Suggested-by: Guenter Roeck li...@roeck-us.net
  Signed-off-by: Henrik Rydberg rydb...@euromail.se
  ---
   drivers/hwmon/applesmc.c | 11 ++-
   1 file changed, 10 insertions(+), 1 deletion(-)
 
 Thanks for the quick reply.  I'll get this rolled into our kernels soon.
 
Patch is now upstream (commit 5f4513864).

Guenter
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Chris Murphy

On Sep 27, 2013, at 11:59 AM, Guenter Roeck li...@roeck-us.net wrote:

 On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:
 
 On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote:
 
 On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
 On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se 
 wrote:
 This suggests that initialization may be attempted more than once. The 
 key cache
 is allocated only once, but the number of keys is read for each 
 attempt.
 
 No idea if that can happen, but if the number of keys can increase 
 after
 the first initialization attempt you would have an explanation for the 
 crash.
 
 Good idea, and easy enough to test with the patch below.
 
 Should we apply this patch even though it may not solve the specific 
 problem ?
 
 Yes, why not - it certainly won't hurt. I am running it right now, so
 it is at least run-tested.
 
 Again, not sure if the key count can change, but the current code is at 
 the very
 least inconsistent, as it keeps reading the key count without updating or
 verifying the cache size.
 
 Yes - I agree that the error state is far-fetched, but it is hard to
 see any other logical explanation. There is of course always the
 possibility that the problem is somewhere else completely.
 
 Proper patch attached.
 
 Thanks,
 Henrik
 
 ---
 
 From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
 From: Henrik Rydberg rydb...@euromail.se
 Date: Thu, 26 Sep 2013 08:33:16 +0200
 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
 
 After reports from Chris and Josh Boyer of a rare crash in applesmc,
 Guenter pointed at the initialization problem fixed below. The patch
 has not been verified to fix the crash, but should be applied
 regardless.
 
 Reported-by: jwbo...@fedoraproject.org
 Suggested-by: Guenter Roeck li...@roeck-us.net
 Signed-off-by: Henrik Rydberg rydb...@euromail.se
 ---
 drivers/hwmon/applesmc.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)
 
 Thanks for the quick reply.  I'll get this rolled into our kernels soon.
 
 I sent a pull request to Linus, so you should be able to pull it from
 the upstream kernel shortly. Would be great to get feedback if the patch
 solves the problem (or doesn't).
 
 I'll start running it when it appears in koji. It's very transient, maybe 
 one oops per week with lots of (other) testing. I'm not even sure if it 
 happens on warm or cold boots or both.
 
 When you do, can you possibly trigger an event based on the warning added
 with the patch ? This might help us to identify if the problem fixed
 with the patch actually happens.

I don't understand the question. I'm uncertain how to trigger, and also what 
event.

Chris--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Chris Murphy

On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote:

 On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
 On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote:
 This suggests that initialization may be attempted more than once. The 
 key cache
 is allocated only once, but the number of keys is read for each attempt.
 
 No idea if that can happen, but if the number of keys can increase after
 the first initialization attempt you would have an explanation for the 
 crash.
 
 Good idea, and easy enough to test with the patch below.
 
 Should we apply this patch even though it may not solve the specific 
 problem ?
 
 Yes, why not - it certainly won't hurt. I am running it right now, so
 it is at least run-tested.
 
 Again, not sure if the key count can change, but the current code is at 
 the very
 least inconsistent, as it keeps reading the key count without updating or
 verifying the cache size.
 
 Yes - I agree that the error state is far-fetched, but it is hard to
 see any other logical explanation. There is of course always the
 possibility that the problem is somewhere else completely.
 
 Proper patch attached.
 
 Thanks,
 Henrik
 
 ---
 
 From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
 From: Henrik Rydberg rydb...@euromail.se
 Date: Thu, 26 Sep 2013 08:33:16 +0200
 Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
 
 After reports from Chris and Josh Boyer of a rare crash in applesmc,
 Guenter pointed at the initialization problem fixed below. The patch
 has not been verified to fix the crash, but should be applied
 regardless.
 
 Reported-by: jwbo...@fedoraproject.org
 Suggested-by: Guenter Roeck li...@roeck-us.net
 Signed-off-by: Henrik Rydberg rydb...@euromail.se
 ---
 drivers/hwmon/applesmc.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)
 
 Thanks for the quick reply.  I'll get this rolled into our kernels soon.
 
 I sent a pull request to Linus, so you should be able to pull it from
 the upstream kernel shortly. Would be great to get feedback if the patch
 solves the problem (or doesn't).

I'll start running it when it appears in koji. It's very transient, maybe one 
oops per week with lots of (other) testing. I'm not even sure if it happens on 
warm or cold boots or both.

Chris--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-27 Thread Guenter Roeck

On 09/27/2013 11:03 AM, Chris Murphy wrote:


On Sep 27, 2013, at 11:59 AM, Guenter Roeck li...@roeck-us.net wrote:


On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:


On Sep 27, 2013, at 11:12 AM, Guenter Roeck li...@roeck-us.net wrote:


On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:

On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg rydb...@euromail.se wrote:

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.


Good idea, and easy enough to test with the patch below.


Should we apply this patch even though it may not solve the specific problem ?


Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.


Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.


Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.

Thanks,
Henrik

---

 From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg rydb...@euromail.se
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: jwbo...@fedoraproject.org
Suggested-by: Guenter Roeck li...@roeck-us.net
Signed-off-by: Henrik Rydberg rydb...@euromail.se
---
drivers/hwmon/applesmc.c | 11 ++-
1 file changed, 10 insertions(+), 1 deletion(-)


Thanks for the quick reply.  I'll get this rolled into our kernels soon.


I sent a pull request to Linus, so you should be able to pull it from
the upstream kernel shortly. Would be great to get feedback if the patch
solves the problem (or doesn't).


I'll start running it when it appears in koji. It's very transient, maybe one 
oops per week with lots of (other) testing. I'm not even sure if it happens on 
warm or cold boots or both.


When you do, can you possibly trigger an event based on the warning added
with the patch ? This might help us to identify if the problem fixed
with the patch actually happens.


I don't understand the question. I'm uncertain how to trigger, and also what 
event.



The patch includes a new warning message.

pr_warn(key count changed from %d to %d\n,
s-key_count, count);

It would be great if there would be a means to detect if this message is seen
in a kernel log, because it would show that the potential crash condition
fixed with the patch was actually encountered. This would help us to determine
if we actually fixed the problem or not.

Of course, we'll know if is wasn't fixed if the system still crashes.

Thanks,
Guenter

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Henrik Rydberg
> >Yes - I agree that the error state is far-fetched, but it is hard to
> >see any other logical explanation. There is of course always the
> >possibility that the problem is somewhere else completely.
> >
> There are also ACPI conflicts in each of the bug reports I looked at.
> Can this play a role, or is that "normal" on apple systems ?

I honstestly don't know. On my own machine (MBA3,1), I have problems
with the PCIe NIC somehow overwriting the graphics buffer, resulting
in a corrupted screen at boot. After a suspend/resume, that problem is
gone, but there is surely something fishy going on...

Thanks,
Henrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Henrik Rydberg
> Applied. I'll run my sanity tests before I send it upstream.
> I'll also Cc: stable.

Great.

> Interesting is that this started to happen with 3.10, even though
> I did not find any relevant changes in the driver. Is it possible that
> changed boot timing (ie reduced boot time) exposes this problem ?

I was thinking the same thing - if the SMC takes some time to boot up,
it could very well be the culprit. Imagine that, the kernel is too
fast. ;-)

Thanks,
Henrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Guenter Roeck

On 09/25/2013 11:34 PM, Henrik Rydberg wrote:

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.


Good idea, and easy enough to test with the patch below.


Should we apply this patch even though it may not solve the specific problem ?


Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.


Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.


Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.

Thanks,
Henrik

---


From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001

From: Henrik Rydberg 
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: 
Suggested-by: Guenter Roeck 
Signed-off-by: Henrik Rydberg 
---
  drivers/hwmon/applesmc.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)


Applied. I'll run my sanity tests before I send it upstream.
I'll also Cc: stable.

Interesting is that this started to happen with 3.10, even though
I did not find any relevant changes in the driver. Is it possible that
changed boot timing (ie reduced boot time) exposes this problem ?

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Guenter Roeck

On 09/25/2013 11:34 PM, Henrik Rydberg wrote:

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.


Good idea, and easy enough to test with the patch below.


Should we apply this patch even though it may not solve the specific problem ?


Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.


Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.


Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.


There are also ACPI conflicts in each of the bug reports I looked at.
Can this play a role, or is that "normal" on apple systems ?

Thanks,
Guenter


Proper patch attached.

Thanks,
Henrik

---


From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001

From: Henrik Rydberg 
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: 
Suggested-by: Guenter Roeck 
Signed-off-by: Henrik Rydberg 
---
  drivers/hwmon/applesmc.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 62c2e32..98814d1 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -525,16 +525,25 @@ static int applesmc_init_smcreg_try(void)
  {
struct applesmc_registers *s = 
bool left_light_sensor, right_light_sensor;
+   unsigned int count;
u8 tmp[1];
int ret;

if (s->init_complete)
return 0;

-   ret = read_register_count(>key_count);
+   ret = read_register_count();
if (ret)
return ret;

+   if (s->cache && s->key_count != count) {
+   pr_warn("key count changed from %d to %d\n",
+   s->key_count, count);
+   kfree(s->cache);
+   s->cache = NULL;
+   }
+   s->key_count = count;
+
if (!s->cache)
s->cache = kcalloc(s->key_count, sizeof(*s->cache), GFP_KERNEL);
if (!s->cache)



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Henrik Rydberg
> > > This suggests that initialization may be attempted more than once. The 
> > > key cache
> > > is allocated only once, but the number of keys is read for each attempt.
> > > 
> > > No idea if that can happen, but if the number of keys can increase after
> > > the first initialization attempt you would have an explanation for the 
> > > crash.
> > 
> > Good idea, and easy enough to test with the patch below.
> > 
> Should we apply this patch even though it may not solve the specific problem ?

Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.

> Again, not sure if the key count can change, but the current code is at the 
> very
> least inconsistent, as it keeps reading the key count without updating or
> verifying the cache size.

Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.

Thanks,
Henrik

---

>From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg 
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: 
Suggested-by: Guenter Roeck 
Signed-off-by: Henrik Rydberg 
---
 drivers/hwmon/applesmc.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 62c2e32..98814d1 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -525,16 +525,25 @@ static int applesmc_init_smcreg_try(void)
 {
struct applesmc_registers *s = 
bool left_light_sensor, right_light_sensor;
+   unsigned int count;
u8 tmp[1];
int ret;
 
if (s->init_complete)
return 0;
 
-   ret = read_register_count(>key_count);
+   ret = read_register_count();
if (ret)
return ret;
 
+   if (s->cache && s->key_count != count) {
+   pr_warn("key count changed from %d to %d\n",
+   s->key_count, count);
+   kfree(s->cache);
+   s->cache = NULL;
+   }
+   s->key_count = count;
+
if (!s->cache)
s->cache = kcalloc(s->key_count, sizeof(*s->cache), GFP_KERNEL);
if (!s->cache)
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Henrik Rydberg
   This suggests that initialization may be attempted more than once. The 
   key cache
   is allocated only once, but the number of keys is read for each attempt.
   
   No idea if that can happen, but if the number of keys can increase after
   the first initialization attempt you would have an explanation for the 
   crash.
  
  Good idea, and easy enough to test with the patch below.
  
 Should we apply this patch even though it may not solve the specific problem ?

Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.

 Again, not sure if the key count can change, but the current code is at the 
 very
 least inconsistent, as it keeps reading the key count without updating or
 verifying the cache size.

Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.

Thanks,
Henrik

---

From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg rydb...@euromail.se
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: jwbo...@fedoraproject.org
Suggested-by: Guenter Roeck li...@roeck-us.net
Signed-off-by: Henrik Rydberg rydb...@euromail.se
---
 drivers/hwmon/applesmc.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 62c2e32..98814d1 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -525,16 +525,25 @@ static int applesmc_init_smcreg_try(void)
 {
struct applesmc_registers *s = smcreg;
bool left_light_sensor, right_light_sensor;
+   unsigned int count;
u8 tmp[1];
int ret;
 
if (s-init_complete)
return 0;
 
-   ret = read_register_count(s-key_count);
+   ret = read_register_count(count);
if (ret)
return ret;
 
+   if (s-cache  s-key_count != count) {
+   pr_warn(key count changed from %d to %d\n,
+   s-key_count, count);
+   kfree(s-cache);
+   s-cache = NULL;
+   }
+   s-key_count = count;
+
if (!s-cache)
s-cache = kcalloc(s-key_count, sizeof(*s-cache), GFP_KERNEL);
if (!s-cache)
-- 
1.8.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Guenter Roeck

On 09/25/2013 11:34 PM, Henrik Rydberg wrote:

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.


Good idea, and easy enough to test with the patch below.


Should we apply this patch even though it may not solve the specific problem ?


Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.


Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.


Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.


There are also ACPI conflicts in each of the bug reports I looked at.
Can this play a role, or is that normal on apple systems ?

Thanks,
Guenter


Proper patch attached.

Thanks,
Henrik

---


From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001

From: Henrik Rydberg rydb...@euromail.se
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: jwbo...@fedoraproject.org
Suggested-by: Guenter Roeck li...@roeck-us.net
Signed-off-by: Henrik Rydberg rydb...@euromail.se
---
  drivers/hwmon/applesmc.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 62c2e32..98814d1 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -525,16 +525,25 @@ static int applesmc_init_smcreg_try(void)
  {
struct applesmc_registers *s = smcreg;
bool left_light_sensor, right_light_sensor;
+   unsigned int count;
u8 tmp[1];
int ret;

if (s-init_complete)
return 0;

-   ret = read_register_count(s-key_count);
+   ret = read_register_count(count);
if (ret)
return ret;

+   if (s-cache  s-key_count != count) {
+   pr_warn(key count changed from %d to %d\n,
+   s-key_count, count);
+   kfree(s-cache);
+   s-cache = NULL;
+   }
+   s-key_count = count;
+
if (!s-cache)
s-cache = kcalloc(s-key_count, sizeof(*s-cache), GFP_KERNEL);
if (!s-cache)



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Guenter Roeck

On 09/25/2013 11:34 PM, Henrik Rydberg wrote:

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.


Good idea, and easy enough to test with the patch below.


Should we apply this patch even though it may not solve the specific problem ?


Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.


Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.


Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.

Thanks,
Henrik

---


From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001

From: Henrik Rydberg rydb...@euromail.se
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: jwbo...@fedoraproject.org
Suggested-by: Guenter Roeck li...@roeck-us.net
Signed-off-by: Henrik Rydberg rydb...@euromail.se
---
  drivers/hwmon/applesmc.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)


Applied. I'll run my sanity tests before I send it upstream.
I'll also Cc: stable.

Interesting is that this started to happen with 3.10, even though
I did not find any relevant changes in the driver. Is it possible that
changed boot timing (ie reduced boot time) exposes this problem ?

Thanks,
Guenter

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Henrik Rydberg
 Applied. I'll run my sanity tests before I send it upstream.
 I'll also Cc: stable.

Great.

 Interesting is that this started to happen with 3.10, even though
 I did not find any relevant changes in the driver. Is it possible that
 changed boot timing (ie reduced boot time) exposes this problem ?

I was thinking the same thing - if the SMC takes some time to boot up,
it could very well be the culprit. Imagine that, the kernel is too
fast. ;-)

Thanks,
Henrik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-26 Thread Henrik Rydberg
 Yes - I agree that the error state is far-fetched, but it is hard to
 see any other logical explanation. There is of course always the
 possibility that the problem is somewhere else completely.
 
 There are also ACPI conflicts in each of the bug reports I looked at.
 Can this play a role, or is that normal on apple systems ?

I honstestly don't know. On my own machine (MBA3,1), I have problems
with the PCIe NIC somehow overwriting the graphics buffer, resulting
in a corrupted screen at boot. After a suspend/resume, that problem is
gone, but there is surely something fishy going on...

Thanks,
Henrik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-25 Thread Guenter Roeck
On Wed, Sep 25, 2013 at 11:48:07PM +0200, Henrik Rydberg wrote:
> On Wed, Sep 25, 2013 at 12:56:28PM -0700, Guenter Roeck wrote:
> > On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote:
> > > Hi All,
> > > 
> > > Chris has reported[1] an issue with the applesmc driver in the 3.10
> > > and 3.11 kernels.  This seems to be a bit transient, but the oops is
> > > consistent across those releases.  This is with a MacBook Pro 4,1.
> > > The backtrace is below.
> > > 
> > > Any ideas on this one?
> > > 
> > Some of the bug reports suggest that there are related messages in at least 
> > some
> > of the kernel logs.
> > 
> > Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 
> > 0x01
> > Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail
> > 
> > This suggests that initialization may be attempted more than once. The key 
> > cache
> > is allocated only once, but the number of keys is read for each attempt.
> > 
> > No idea if that can happen, but if the number of keys can increase after
> > the first initialization attempt you would have an explanation for the 
> > crash.
> 
> Good idea, and easy enough to test with the patch below.
> 
Should we apply this patch even though it may not solve the specific problem ?

Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.

Thanks,
Guenter

> diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
> index 62c2e32..d962c4d 100644
> --- a/drivers/hwmon/applesmc.c
> +++ b/drivers/hwmon/applesmc.c
> @@ -525,16 +525,24 @@ static int applesmc_init_smcreg_try(void)
>  {
> struct applesmc_registers *s = 
> bool left_light_sensor, right_light_sensor;
> +   unsigned int count;
> u8 tmp[1];
> int ret;
>  
> if (s->init_complete)
> return 0;
>  
> -   ret = read_register_count(>key_count);
> +   ret = read_register_count();
> if (ret)
> return ret;
>  
> +   if (s->cache && s->key_count != count) {
> +   pr_warn("key count changed from %d to %d\n", s->key_count, 
> count);
> +   kfree(s->cache);
> +   s->cache = NULL;
> +   }
> +   s->key_count = count;
> +
> if (!s->cache)
> s->cache = kcalloc(s->key_count, sizeof(*s->cache), 
> GFP_KERNEL);
> if (!s->cache)
>  
> Thanks,
> Henrik
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-25 Thread Henrik Rydberg
On Wed, Sep 25, 2013 at 12:56:28PM -0700, Guenter Roeck wrote:
> On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote:
> > Hi All,
> > 
> > Chris has reported[1] an issue with the applesmc driver in the 3.10
> > and 3.11 kernels.  This seems to be a bit transient, but the oops is
> > consistent across those releases.  This is with a MacBook Pro 4,1.
> > The backtrace is below.
> > 
> > Any ideas on this one?
> > 
> Some of the bug reports suggest that there are related messages in at least 
> some
> of the kernel logs.
> 
> Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 
> 0x01
> Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail
> 
> This suggests that initialization may be attempted more than once. The key 
> cache
> is allocated only once, but the number of keys is read for each attempt.
> 
> No idea if that can happen, but if the number of keys can increase after
> the first initialization attempt you would have an explanation for the crash.

Good idea, and easy enough to test with the patch below.

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 62c2e32..d962c4d 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -525,16 +525,24 @@ static int applesmc_init_smcreg_try(void)
 {
struct applesmc_registers *s = 
bool left_light_sensor, right_light_sensor;
+   unsigned int count;
u8 tmp[1];
int ret;
 
if (s->init_complete)
return 0;
 
-   ret = read_register_count(>key_count);
+   ret = read_register_count();
if (ret)
return ret;
 
+   if (s->cache && s->key_count != count) {
+   pr_warn("key count changed from %d to %d\n", s->key_count, 
count);
+   kfree(s->cache);
+   s->cache = NULL;
+   }
+   s->key_count = count;
+
if (!s->cache)
s->cache = kcalloc(s->key_count, sizeof(*s->cache), GFP_KERNEL);
if (!s->cache)
 
Thanks,
Henrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-25 Thread Guenter Roeck
On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote:
> Hi All,
> 
> Chris has reported[1] an issue with the applesmc driver in the 3.10
> and 3.11 kernels.  This seems to be a bit transient, but the oops is
> consistent across those releases.  This is with a MacBook Pro 4,1.
> The backtrace is below.
> 
> Any ideas on this one?
> 
Some of the bug reports suggest that there are related messages in at least some
of the kernel logs.

Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 0x01
Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.

Guenter


> josh
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1011719
> 
> BUG: unable to handle kernel paging request at 880322aed005
> IP: [] applesmc_get_entry_by_index+0x25/0xf0 [applesmc]
> PGD 1fe5067 PUD 0
> Oops:  [#1] SMP
> Modules linked in: b43 coretemp kvm_intel kvm joydev bcma mac80211
> uvcvideo snd_hda_codec_realtek applesmc(+) iTCO_wdt cfg80211
> iTCO_vendor_support microcode input_polldev snd_hda_intel
> snd_hda_codec i2c_i801 videobuf2_vmalloc videobuf2_memops
> videobuf2_core videodev snd_hwdep snd_seq bcm5974 btusb hid_appleir
> snd_seq_device media bluetooth snd_pcm lpc_ich mfd_core rfkill
> snd_page_alloc snd_timer snd apple_bl soundcore shpchp acpi_cpufreq
> mperf nfsd auth_rpcgss nfs_acl lockd isofs squashfs btrfs libcrc32c
> xor zlib_deflate raid6_pq usb_storage nouveau firewire_ohci
> ata_generic pata_acpi firewire_core crc_itu_t sky2 mxm_wmi wmi
> i2c_algo_bit ssb drm_kms_helper mmc_core ttm drm i2c_core video sunrpc
> loop
> CPU: 1 PID: 492 Comm: systemd-udevd Not tainted 3.11.0-300.fc20.x86_64 #1
> Hardware name: Apple Inc. MacBookPro4,1/Mac-F42C89C8, BIOS
> MBP41.88Z.00C1.B03.0802271651 02/27/08
> task: 880132a407a0 ti: 8801343fe000 task.ti: 8801343fe000
> RIP: 0010:[]  []
> applesmc_get_entry_by_index+0x25/0xf0 [applesmc]
> RSP: 0018:8801343ffa18  EFLAGS: 00010286
> RAX: 2600 RBX: 880322aed000 RCX: 8801343ffaf4
> RDX: 7200 RSI: a05df06e RDI: 2600
> RBP: 8801343ffa38 R08: 8801343fe000 R09: 880132a6c100
> R10:  R11: 00014180 R12: 2600
> R13: 4c01 R14: a05df06e R15: 8801343ffa84
> FS:  7f6aac863880() GS:88013fd0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 880322aed005 CR3: 0001340fa000 CR4: 07e0
> Stack:
>  0004deef ff10 2600 
>  8801343ffa70 a05dd4f1 0032 0001
>  a05df06e 880134be3810 a05e0660 8801343ffaa8
> Call Trace:
>  [] applesmc_get_lower_bound+0x51/0xa0 [applesmc]
>  [] applesmc_get_entry_by_key+0x23/0xc0 [applesmc]
>  [] applesmc_read_key+0x19/0x70 [applesmc]
>  [] applesmc_init_smcreg+0x8f/0x2d0 [applesmc]
>  [] applesmc_probe+0x12/0x30 [applesmc]
>  [] platform_drv_probe+0x3c/0x70
>  [] ? driver_sysfs_add+0x82/0xb0
>  [] driver_probe_device+0x87/0x390
>  [] ? driver_probe_device+0x390/0x390
>  [] __device_attach+0x3b/0x40
>  [] bus_for_each_drv+0x63/0xa0
>  [] device_attach+0x88/0xa0
>  [] bus_probe_device+0x98/0xc0
>  [] device_add+0x4c4/0x7a0
>  [] platform_device_add+0xd1/0x2d0
>  [] platform_device_register_full+0xe0/0x120
>  [] ? 0xa05e2fff
>  [] applesmc_init+0x98/0x1000 [applesmc]
>  [] do_one_initcall+0xfa/0x1b0
>  [] ? set_memory_nx+0x43/0x50
>  [] load_module+0x1bbd/0x2660
>  [] ? store_uevent+0x40/0x40
>  [] SyS_finit_module+0x86/0xb0
>  [] system_call_fastpath+0x16/0x1b
> Code: 84 00 00 00 00 00 66 66 66 66 90 55 48 63 c7 48 8d 14 40 48 89
> e5 41 54 41 89 fc 53 48 8d 1c 90 48 83 ec 10 48 03 1d 03 33 00 00 <80>
> 7b 05 00 48 89 d8 74 12 48 83 c4 10 5b 41 5c 5d c3 66 0f 1f
> RIP  [] applesmc_get_entry_by_index+0x25/0xf0 [applesmc]
>  RSP 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-25 Thread Guenter Roeck
On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote:
 Hi All,
 
 Chris has reported[1] an issue with the applesmc driver in the 3.10
 and 3.11 kernels.  This seems to be a bit transient, but the oops is
 consistent across those releases.  This is with a MacBook Pro 4,1.
 The backtrace is below.
 
 Any ideas on this one?
 
Some of the bug reports suggest that there are related messages in at least some
of the kernel logs.

Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 0x01
Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail

This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.

Guenter


 josh
 
 [1] https://bugzilla.redhat.com/show_bug.cgi?id=1011719
 
 BUG: unable to handle kernel paging request at 880322aed005
 IP: [a05dd345] applesmc_get_entry_by_index+0x25/0xf0 [applesmc]
 PGD 1fe5067 PUD 0
 Oops:  [#1] SMP
 Modules linked in: b43 coretemp kvm_intel kvm joydev bcma mac80211
 uvcvideo snd_hda_codec_realtek applesmc(+) iTCO_wdt cfg80211
 iTCO_vendor_support microcode input_polldev snd_hda_intel
 snd_hda_codec i2c_i801 videobuf2_vmalloc videobuf2_memops
 videobuf2_core videodev snd_hwdep snd_seq bcm5974 btusb hid_appleir
 snd_seq_device media bluetooth snd_pcm lpc_ich mfd_core rfkill
 snd_page_alloc snd_timer snd apple_bl soundcore shpchp acpi_cpufreq
 mperf nfsd auth_rpcgss nfs_acl lockd isofs squashfs btrfs libcrc32c
 xor zlib_deflate raid6_pq usb_storage nouveau firewire_ohci
 ata_generic pata_acpi firewire_core crc_itu_t sky2 mxm_wmi wmi
 i2c_algo_bit ssb drm_kms_helper mmc_core ttm drm i2c_core video sunrpc
 loop
 CPU: 1 PID: 492 Comm: systemd-udevd Not tainted 3.11.0-300.fc20.x86_64 #1
 Hardware name: Apple Inc. MacBookPro4,1/Mac-F42C89C8, BIOS
 MBP41.88Z.00C1.B03.0802271651 02/27/08
 task: 880132a407a0 ti: 8801343fe000 task.ti: 8801343fe000
 RIP: 0010:[a05dd345]  [a05dd345]
 applesmc_get_entry_by_index+0x25/0xf0 [applesmc]
 RSP: 0018:8801343ffa18  EFLAGS: 00010286
 RAX: 2600 RBX: 880322aed000 RCX: 8801343ffaf4
 RDX: 7200 RSI: a05df06e RDI: 2600
 RBP: 8801343ffa38 R08: 8801343fe000 R09: 880132a6c100
 R10:  R11: 00014180 R12: 2600
 R13: 4c01 R14: a05df06e R15: 8801343ffa84
 FS:  7f6aac863880() GS:88013fd0() knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: 880322aed005 CR3: 0001340fa000 CR4: 07e0
 Stack:
  0004deef ff10 2600 
  8801343ffa70 a05dd4f1 0032 0001
  a05df06e 880134be3810 a05e0660 8801343ffaa8
 Call Trace:
  [a05dd4f1] applesmc_get_lower_bound+0x51/0xa0 [applesmc]
  [a05dd563] applesmc_get_entry_by_key+0x23/0xc0 [applesmc]
  [a05dd9f9] applesmc_read_key+0x19/0x70 [applesmc]
  [a05de08f] applesmc_init_smcreg+0x8f/0x2d0 [applesmc]
  [a05de2e2] applesmc_probe+0x12/0x30 [applesmc]
  [813f41dc] platform_drv_probe+0x3c/0x70
  [813f18b2] ? driver_sysfs_add+0x82/0xb0
  [813f1f77] driver_probe_device+0x87/0x390
  [813f2280] ? driver_probe_device+0x390/0x390
  [813f22bb] __device_attach+0x3b/0x40
  [813eff73] bus_for_each_drv+0x63/0xa0
  [813f1e78] device_attach+0x88/0xa0
  [813f11e8] bus_probe_device+0x98/0xc0
  [813eeeb4] device_add+0x4c4/0x7a0
  [813f3cb1] platform_device_add+0xd1/0x2d0
  [813f4430] platform_device_register_full+0xe0/0x120
  [a05e3000] ? 0xa05e2fff
  [a05e3098] applesmc_init+0x98/0x1000 [applesmc]
  [810020fa] do_one_initcall+0xfa/0x1b0
  [81052483] ? set_memory_nx+0x43/0x50
  [810cc58d] load_module+0x1bbd/0x2660
  [810c8890] ? store_uevent+0x40/0x40
  [810cd1a6] SyS_finit_module+0x86/0xb0
  [81652cd9] system_call_fastpath+0x16/0x1b
 Code: 84 00 00 00 00 00 66 66 66 66 90 55 48 63 c7 48 8d 14 40 48 89
 e5 41 54 41 89 fc 53 48 8d 1c 90 48 83 ec 10 48 03 1d 03 33 00 00 80
 7b 05 00 48 89 d8 74 12 48 83 c4 10 5b 41 5c 5d c3 66 0f 1f
 RIP  [a05dd345] applesmc_get_entry_by_index+0x25/0xf0 [applesmc]
  RSP 8801343ffa18
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-25 Thread Henrik Rydberg
On Wed, Sep 25, 2013 at 12:56:28PM -0700, Guenter Roeck wrote:
 On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote:
  Hi All,
  
  Chris has reported[1] an issue with the applesmc driver in the 3.10
  and 3.11 kernels.  This seems to be a bit transient, but the oops is
  consistent across those releases.  This is with a MacBook Pro 4,1.
  The backtrace is below.
  
  Any ideas on this one?
  
 Some of the bug reports suggest that there are related messages in at least 
 some
 of the kernel logs.
 
 Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 
 0x01
 Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail
 
 This suggests that initialization may be attempted more than once. The key 
 cache
 is allocated only once, but the number of keys is read for each attempt.
 
 No idea if that can happen, but if the number of keys can increase after
 the first initialization attempt you would have an explanation for the crash.

Good idea, and easy enough to test with the patch below.

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 62c2e32..d962c4d 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -525,16 +525,24 @@ static int applesmc_init_smcreg_try(void)
 {
struct applesmc_registers *s = smcreg;
bool left_light_sensor, right_light_sensor;
+   unsigned int count;
u8 tmp[1];
int ret;
 
if (s-init_complete)
return 0;
 
-   ret = read_register_count(s-key_count);
+   ret = read_register_count(count);
if (ret)
return ret;
 
+   if (s-cache  s-key_count != count) {
+   pr_warn(key count changed from %d to %d\n, s-key_count, 
count);
+   kfree(s-cache);
+   s-cache = NULL;
+   }
+   s-key_count = count;
+
if (!s-cache)
s-cache = kcalloc(s-key_count, sizeof(*s-cache), GFP_KERNEL);
if (!s-cache)
 
Thanks,
Henrik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: applesmc oops in 3.10/3.11

2013-09-25 Thread Guenter Roeck
On Wed, Sep 25, 2013 at 11:48:07PM +0200, Henrik Rydberg wrote:
 On Wed, Sep 25, 2013 at 12:56:28PM -0700, Guenter Roeck wrote:
  On Wed, Sep 25, 2013 at 03:06:25PM -0400, Josh Boyer wrote:
   Hi All,
   
   Chris has reported[1] an issue with the applesmc driver in the 3.10
   and 3.11 kernels.  This seems to be a bit transient, but the oops is
   consistent across those releases.  This is with a MacBook Pro 4,1.
   The backtrace is below.
   
   Any ideas on this one?
   
  Some of the bug reports suggest that there are related messages in at least 
  some
  of the kernel logs.
  
  Jul 02 22:23:22 f19s.local kernel: applesmc: send_byte(0x04, 0x0300) fail: 
  0x01
  Jul 02 22:23:22 f19s.local kernel: applesmc: : read len fail
  
  This suggests that initialization may be attempted more than once. The key 
  cache
  is allocated only once, but the number of keys is read for each attempt.
  
  No idea if that can happen, but if the number of keys can increase after
  the first initialization attempt you would have an explanation for the 
  crash.
 
 Good idea, and easy enough to test with the patch below.
 
Should we apply this patch even though it may not solve the specific problem ?

Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.

Thanks,
Guenter

 diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
 index 62c2e32..d962c4d 100644
 --- a/drivers/hwmon/applesmc.c
 +++ b/drivers/hwmon/applesmc.c
 @@ -525,16 +525,24 @@ static int applesmc_init_smcreg_try(void)
  {
 struct applesmc_registers *s = smcreg;
 bool left_light_sensor, right_light_sensor;
 +   unsigned int count;
 u8 tmp[1];
 int ret;
  
 if (s-init_complete)
 return 0;
  
 -   ret = read_register_count(s-key_count);
 +   ret = read_register_count(count);
 if (ret)
 return ret;
  
 +   if (s-cache  s-key_count != count) {
 +   pr_warn(key count changed from %d to %d\n, s-key_count, 
 count);
 +   kfree(s-cache);
 +   s-cache = NULL;
 +   }
 +   s-key_count = count;
 +
 if (!s-cache)
 s-cache = kcalloc(s-key_count, sizeof(*s-cache), 
 GFP_KERNEL);
 if (!s-cache)
  
 Thanks,
 Henrik
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/