Re: [PATCH,FIXED] PCMCIA card inserted, five s2ram cycles, you're dead

2012-02-05 Thread Dominik Brodowski
On Sun, Jan 22, 2012 at 09:12:48AM +, Russell King - ARM Linux wrote:
 Unfortunately, the description will need a couple of edits for it to
 apply properly (I just applied it to a local private branch.)
 
 On Sun, Jan 22, 2012 at 08:51:57AM +, Russell King - ARM Linux wrote:
  @@ -546,8 +524,8 @@ static int socket_late_resume(struct pcmcia_socket *skt)
 
 This needs a space before it, otherwise git will try to take this as a
 patch to be applied.
 
  Cc: sta...@vger.kernel.org
  Tested-by: Russell King rmk+ker...@arm.linux.org.uk
  Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk
  -- 
 
 And I keep messing this marker up in hand-crafted messages.

Applied, thanks.

Best,
Dominik

___
Linux PCMCIA reimplementation list
http://lists.infradead.org/mailman/listinfo/linux-pcmcia


[PATCH,FIXED] PCMCIA card inserted, five s2ram cycles, you're dead

2012-01-22 Thread Russell King - ARM Linux
Last night, I wrote:
 static int pcmcia_bus_early_resume(struct pcmcia_socket *skt)
 {
 if (!verify_cis_cache(skt)) {
 pcmcia_put_socket(skt);
 return 0;
 }
 
 I've not been able to check that theory tonight.  Maybe someone who
 knows the code can suggest - if not, I'll try deleting that
 pcmcia_put_socket() call at some point tomorrow.

Yes, this does appear to be the problem.  Tested patch below.  It looks
like the bug has been around since July 2010 - maybe no one suspends and
resumes with their PCMCIA (not Cardbus) card inserted.

Given its age, it seems to affect many stable kernels.

Note that this is a memory-corrupting bug: not only does it cause the
warning, but as a result of dropping the refcount to zero, it causes the
pcmcia_socket0 device structure to be freed while it still has references,
causing slab caches corruption.  A fatal oops quickly follows this warning
- often even just a 'dmesg' following the warning causes the kernel to
oops.

8
From: Russell King rmk+ker...@arm.linux.org.uk
Subject: [PATCH] Fix PCMCIA socket refcount decrementing on each resume

While testing suspend/resume on an ARM device with PCMCIA support, and
a CF card inserted, I found that after five suspend and resumes, the
kernel would complain, and shortly die after with slab corruption.

WARNING: at include/linux/kref.h:41 kobject_get+0x28/0x50()

As the message doesn't give a clue about which kobject, and the built-in
debugging in drivers/base/power/main.c happens too late, this was added
right before each get_device():

printk(%s: %p [%s] %u\n, __func__, dev, kobject_name(dev-kobj), 
atomic_read(dev-kobj.kref.refcount));

and on the 3rd s2ram cycle, the following behaviour observed:

On the 3rd suspend/resume cycle:

dpm_prepare: c1a0d998 [pcmcia_socket0] 3
dpm_suspend: c1a0d998 [pcmcia_socket0] 3
dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 3
dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 3
dpm_resume: c1a0d998 [pcmcia_socket0] 3
dpm_complete: c1a0d998 [pcmcia_socket0] 2

4th:

dpm_prepare: c1a0d998 [pcmcia_socket0] 2
dpm_suspend: c1a0d998 [pcmcia_socket0] 2
dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 2
dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 2
dpm_resume: c1a0d998 [pcmcia_socket0] 2
dpm_complete: c1a0d998 [pcmcia_socket0] 1

5th:

dpm_prepare: c1a0d998 [pcmcia_socket0] 1
dpm_suspend: c1a0d998 [pcmcia_socket0] 1
dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 1
dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 1
dpm_resume: c1a0d998 [pcmcia_socket0] 1
dpm_complete: c1a0d998 [pcmcia_socket0] 0
[ cut here ]
WARNING: at include/linux/kref.h:41 kobject_get+0x28/0x50()
Modules linked in: ucb1x00_core
Backtrace:
[c0212090] (dump_backtrace+0x0/0x110) from [c04799dc] (dump_stack+0x18/0x1c)
[c04799c4] (dump_stack+0x0/0x1c) from [c021cba0] 
(warn_slowpath_common+0x50/0x68)
[c021cb50] (warn_slowpath_common+0x0/0x68) from [c021cbdc] 
(warn_slowpath_null+0x24/0x28)
[c021cbb8] (warn_slowpath_null+0x0/0x28) from [c0335374] 
(kobject_get+0x28/0x50)
[c033534c] (kobject_get+0x0/0x50) from [c03804f4] (get_device+0x1c/0x24)
[c0388c90] (dpm_complete+0x0/0x1a0) from [c0389cc0] 
(dpm_resume_end+0x1c/0x20)
...

Looking at commit 7b24e7988263 (pcmcia: split up central event handler),
the following change was made to cs.c:

@@ -546,8 +524,8 @@ static int socket_late_resume(struct pcmcia_socket *skt)
return 0;
}
 #endif
-
-   send_event(skt, CS_EVENT_PM_RESUME, CS_EVENT_PRI_LOW);
+   if (!(skt-state  SOCKET_CARDBUS)  (skt-callback))
+   skt-callback-early_resume(skt);
return 0;
 }


And the corresponding change in ds.c is from:

-static int ds_event(struct pcmcia_socket *skt, event_t event, int priority)
-{
-   struct pcmcia_socket *s = pcmcia_get_socket(skt);
...
-   switch (event) {
...
-   case CS_EVENT_PM_RESUME:
-   if (verify_cis_cache(skt) != 0) {
-   dev_dbg(skt-dev, cis mismatch - different card\n);
-   /* first, remove the card */
-   ds_event(skt, CS_EVENT_CARD_REMOVAL, CS_EVENT_PRI_HIGH);
-   mutex_lock(s-ops_mutex);
-   destroy_cis_cache(skt);
-   kfree(skt-fake_cis);
-   skt-fake_cis = NULL;
-   s-functions = 0;
-   mutex_unlock(s-ops_mutex);
-   /* now, add the new card */
-   ds_event(skt, CS_EVENT_CARD_INSERTION,
-CS_EVENT_PRI_LOW);
-   }
-   break;
...
-}

-pcmcia_put_socket(s);

-return 0;
-} /* ds_event */

to:

+static int pcmcia_bus_early_resume(struct pcmcia_socket *skt)
+{
+   if (!verify_cis_cache(skt)) {
+   pcmcia_put_socket(skt);
+   return 0;
+   }

+   dev_dbg(skt-dev, cis mismatch - different card\n);

+   /* first, remove the card */
+   

Re: [PATCH,FIXED] PCMCIA card inserted, five s2ram cycles, you're dead

2012-01-22 Thread Russell King - ARM Linux
Unfortunately, the description will need a couple of edits for it to
apply properly (I just applied it to a local private branch.)

On Sun, Jan 22, 2012 at 08:51:57AM +, Russell King - ARM Linux wrote:
 @@ -546,8 +524,8 @@ static int socket_late_resume(struct pcmcia_socket *skt)

This needs a space before it, otherwise git will try to take this as a
patch to be applied.

 Cc: sta...@vger.kernel.org
 Tested-by: Russell King rmk+ker...@arm.linux.org.uk
 Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk
 -- 

And I keep messing this marker up in hand-crafted messages.

___
Linux PCMCIA reimplementation list
http://lists.infradead.org/mailman/listinfo/linux-pcmcia