On Tuesday 20 January 2015 10:49:29 Alan Stern wrote:
> On Mon, 19 Jan 2015, Heiko Przybyl wrote:
> > On Monday 19 January 2015 11:17:59 Alan Stern wrote:
> > >
> > > That's easy enough to test. All you have to do is change the
> > > spin_lock/unlock statements to their irq_save/restore variants.
> >
> > Well, thought about that as well, but I'm not sure when to take it as
> > fixed and when to take it as issue-just-didn't-happen-yet, because of the
> > not-so- deterministic occurrence of the error. But I can try it out
> > anyway, just wanted to have some feedback before trying.
>
> By the way, failing to disable interrupts when acquiring a spinlock
> generally does not lead to data corruption -- it leads to deadlocks.
> So I doubt this is the cause of your problem. If you really want to,
> you could add a
>
> WARN_ON(!irqs_disabled());
>
> line to ohci_irq().
You're right. After thinking/reading a bit more about the topic, not disabled
IRQs would cause deadlocks, because there's already one thread in the critical
section.
>
> No idea.
>
> It might be a good idea for you to try something a little more
> invasive. How about writing a routine to check the entire
> ohci->eds_in_use list for validity (each forward pointer is matched by
> the corresponding backward pointer), and calling this routine at each
> place where the list gets modified, before the modification happens?
>
> You could also make sure that an entry being added to the list isn't on
> the list already, and whenever an entry is deleted from the list
> either it really is on the list or else its list pointers point to
> themselves.
>
I'm not 100% sure, but then it's probably a race between urb
enqueuing (duplicates?) and the watchdog orphan cleanup.
The crash log already shows the double list add in ohci_urb_enqueue
"
ohci-hcd.c:238: list_add(&ed->in_use_list, &ohci->eds_in_use);
"
This is probably due to the ed returned by ed_get() being reused before the
watchdog ran, thus the same in_use_list re-added to ohci.eds_in_use.
Entries seem to get removed in finish_unlinks()
"
ohci-q.c:1090: list_del(&ed->in_use_list);
"
with list_del() poisoning the next/prev pointers of the removed entry.
Now with the watchdog starting cleanup it iterates over the ohci.eds_in_use
list
that still has the second very same entry of in_use_list we double-added (but
now with 0xdead... pointers) and we fault on
"
ohci-hcd.c:761: if (ed->pending_td) {
"
I hope that makes any sense? I'll hook up the list checking tomorrow. Though I
haven't hit the (double-add) problem again, since the bug report. Seems pretty
specific the whole thing.
> Alan Stern
Heiko
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html