Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> it turned out ipipe_critical_enter is broken on SMP > 2 CPUs: On one
>>>> CPU, Linux may have acquired an rwlock for reading when being preempted
>>>> by the critical IPI. On some other CPU, Linux may have entered
>>>> write_lock_irq[save] before the IPI arrived. The reader will be stuck in
>>>> __ipipe_do_critical_sync, the writer in __write_lock_failed - forever.
>>>> First seen on real silicon (once per "few" hundreds of boots), finally
>>>> caught under KVM and nailed down.
>>>>
>>>> Two approaches to resolve this issue come to my mind so far. The first
>>>> one is to restart the whole ipipe_critical_enter after some (how many?)
>>>> cycles of futile waiting. The other is to accept the critical IPI even
>>>> if the top-most domain is stalled (as it sits in write_lock_irq), but
>>>> I'm not 100% that our optimistic IRQ mask will always allow this when
>>>> Linux is on the top (I assume we can safely require other domains to
>>>> avoid such deadlocks by design).
>>>>
>>>> Comments? Better ideas?
>>> I guess, the rwlocks are ipipe rwlocks, right?
>> Nope, plain Linux tasklist_lock. No Xenomai domain active at this point,
>> just Linux.
>
> Then how could this happen? Is not the critical IPI always able to
> preempt Linux?

Obviously not if Linux is top-most. Hard IRQs are enabled on all CPUs,
but the Linux domain is stalled. So the critical IPI is not delivered by
ipipe.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
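
To illustrate the point about the stalled domain, here is a toy user-space
model of the scenario. It is not I-pipe code: struct cpu_model and all
model_* functions are made up for this sketch. Only the behaviour described
in this thread is assumed, namely that hard IRQs stay enabled, that a
critical IPI arriving while the Linux domain is stalled is merely logged
instead of delivered, and that the reader is parked in the sync handler
while the writer keeps spinning.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; not the real I-pipe data structures. */
struct cpu_model {
    bool hard_irqs_on;  /* CPU-level interrupt flag */
    bool root_stalled;  /* virtual IRQ mask of the Linux (root) domain */
    bool ipi_pending;   /* critical IPI logged but not yet delivered */
    bool in_sync;       /* CPU is parked in the critical sync handler */
};

/*
 * Model of critical IPI reception: the hardware interrupt is taken, but
 * the handler only runs if the top-most domain is not stalled. Otherwise
 * the IPI is just recorded as pending.
 */
static void model_receive_critical_ipi(struct cpu_model *cpu)
{
    if (!cpu->hard_irqs_on)
        return;                 /* not the case discussed in this thread */
    if (cpu->root_stalled) {
        cpu->ipi_pending = true;
        return;
    }
    cpu->in_sync = true;
}

/* True if every CPU except the initiator has reached the sync handler. */
static bool model_all_synced(const struct cpu_model *cpus, int n, int initiator)
{
    for (int i = 0; i < n; i++)
        if (i != initiator && !cpus[i].in_sync)
            return false;
    return true;
}

/* Replays the reported scenario on three CPUs; true means deadlock. */
static bool model_scenario_deadlocks(void)
{
    struct cpu_model cpus[3] = {
        { .hard_irqs_on = true },                       /* CPU 0: initiator */
        { .hard_irqs_on = true },                       /* CPU 1: holds the rwlock for reading */
        { .hard_irqs_on = true, .root_stalled = true }, /* CPU 2: spinning in write_lock_irq */
    };

    model_receive_critical_ipi(&cpus[1]);
    model_receive_critical_ipi(&cpus[2]);

    /*
     * CPU 1 is parked with the read lock held, so CPU 2 never gets the
     * write lock, never unstalls, and never handles its pending IPI.
     * CPU 0 therefore waits for CPU 2 forever.
     */
    bool reader_parked = cpus[1].in_sync;
    bool writer_never_syncs = cpus[2].ipi_pending && !cpus[2].in_sync;

    return reader_parked && writer_never_syncs &&
           !model_all_synced(cpus, 3, 0);
}
```

In this model the deadlock needs at least three CPUs: the initiator, a
reader that accepts the IPI, and a writer that is stalled when the IPI
arrives. That matches the "SMP > 2 CPUs" observation in the original
report.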
