Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> it turned out ipipe_critical_enter is broken on SMP > 2 CPUs: On one
>> CPU, Linux may have acquired an rwlock for reading when being preempted
>> by the critical IPI. On some other CPU, Linux may have entered
>> write_lock_irq[save] before the IPI arrived. The reader will be stuck in
>> __ipipe_do_critical_sync, the writer in __write_lock_failed - forever.
>> First seen on real silicon (once per "few" hundreds of boots), finally
>> caught under KVM and nailed down.
>>
>> Two approaches to resolve this issue come to my mind so far. The first
>> one is to restart the whole ipipe_critical_enter after some (how many?)
>> cycles of futile waiting. The other is to accept the critical IPI even
>> if the top-most domain is stalled (as it sits in write_lock_irq), but
>> I'm not 100% sure that our optimistic IRQ mask will always allow this
>> when Linux is on the top (I assume we can safely require other domains
>> to avoid such deadlocks by design).
>>
>> Comments? Better ideas?
>
> I guess, the rwlocks are ipipe rwlocks, right?

Nope, plain Linux tasklist_lock. No Xenomai domain active at this point,
just Linux.

> I am not sure it is different from your second idea, but what about
> spinning in write_lock_irq/save with irqs on?

Hard-IRQs on (which is what my second idea would rely on) or Linux IRQs
on (which would involve patching Linux spinlock arch code and may have
side effects)?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
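
To make the first idea (restart after futile waiting) a bit more concrete,
here is a rough single-threaded user-space model of the scenario. It is
only a sketch and not I-pipe code: struct model, model_init(),
reader_step(), writer_step() and critical_enter_retry() are hypothetical
names invented for illustration, and the retry and spin limits are
arbitrary. CPU A holds the read lock and parks as soon as the sync request
is raised; CPU B spins for the write lock with IRQs off and cannot
acknowledge until the reader has dropped the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the deadlock, not actual I-pipe code. */
struct model {
    int  readers;          /* holders of the modelled rwlock (read side) */
    bool sync_pending;     /* critical IPI raised by the initiator */
    bool reader_parked;    /* CPU A sits in the sync handler */
    bool writer_irqs_off;  /* CPU B spins for the write lock, IPI masked */
};

void model_init(struct model *m)
{
    m->readers = 1;            /* CPU A already holds the read lock */
    m->sync_pending = false;
    m->reader_parked = false;
    m->writer_irqs_off = true; /* CPU B already waits in write_lock_irq */
}

/* CPU A: IRQs are on, so it parks whenever a sync request is pending. */
static void reader_step(struct model *m)
{
    m->reader_parked = m->sync_pending;
    if (!m->reader_parked && m->readers > 0)
        m->readers--;          /* models read_unlock() */
}

/* CPU B: makes progress only once no reader holds the lock. One step
 * models taking the lock, releasing it and re-enabling IRQs. */
static void writer_step(struct model *m)
{
    if (m->writer_irqs_off && m->readers == 0)
        m->writer_irqs_off = false;
}

static bool all_acked(const struct model *m)
{
    return m->reader_parked && !m->writer_irqs_off;
}

/* Returns the number of the attempt that succeeded, or -1 if none did. */
int critical_enter_retry(struct model *m, int max_attempts, int max_spins)
{
    assert(max_attempts > 0 && max_spins > 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        m->sync_pending = true;
        for (int spin = 0; spin < max_spins; spin++) {
            reader_step(m);
            writer_step(m);
            if (all_acked(m))
                return attempt;
        }
        /* Futile wait: withdraw the request and let both CPUs run one
         * step so the parked reader can drop its lock. */
        m->sync_pending = false;
        reader_step(m);
        writer_step(m);
    }
    return -1;
}
```

In this model a single attempt never completes, however large max_spins
is, which mirrors the hang. Allowing a restart lets the reader release the
lock during the back-off, so the second attempt succeeds. The open
question of how many cycles to wait before restarting is not answered by
the sketch.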
