Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> it turned out ipipe_critical_enter is broken on SMP > 2 CPUs: On one
>> CPU, Linux may have acquired an rwlock for reading when being preempted
>> by the critical IPI. On some other CPU, Linux may have entered
>> write_lock_irq[save] before the IPI arrived. The reader will be stuck in
>> __ipipe_do_critical_sync, the writer in __write_lock_failed - forever.
>> First seen on real silicon (roughly once every few hundred boots), finally
>> caught under KVM and nailed down.
>>
>> Two approaches to resolve this issue come to my mind so far. The first
>> one is to restart the whole ipipe_critical_enter after some (how many?)
>> cycles of futile waiting. The other is to accept the critical IPI even
>> if the top-most domain is stalled (as it sits in write_lock_irq), but
>> I'm not 100% sure that our optimistic IRQ mask will always allow this when
>> Linux is on the top (I assume we can safely require other domains to
>> avoid such deadlocks by design).
>>
>> Comments? Better ideas?
> 
> I guess, the rwlocks are ipipe rwlocks, right?

Nope, plain Linux tasklist_lock. No Xenomai domain active at this point,
just Linux.
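
To make the wait chain explicit, here is a minimal sketch that models the
scenario as a wait-for graph and checks it for a cycle. It is a userspace toy,
not I-pipe code. The role names and both functions are hypothetical, and the
third edge (the CPU that called ipipe_critical_enter waiting for every other
CPU to acknowledge the IPI) is an assumption about why more than 2 CPUs are
needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical roles for a 3-CPU model; these are not I-pipe identifiers. */
enum { CPU_INITIATOR, CPU_READER, CPU_WRITER, NR_MODEL_CPUS };
#define NOT_WAITING (-1)

/* waits_on[i] is the CPU that CPU i is blocked on, or NOT_WAITING. */
bool has_wait_cycle(const int waits_on[], int n)
{
    assert(n > 0);
    for (int start = 0; start < n; start++) {
        int cur = start;
        /* Follow at most n edges; coming back to start means a cycle. */
        for (int steps = 0; steps < n && cur != NOT_WAITING; steps++) {
            cur = waits_on[cur];
            if (cur == start)
                return true;
        }
    }
    return false;
}

bool scenario_deadlocks(bool writer_accepts_ipi)
{
    int waits_on[NR_MODEL_CPUS];

    /* Reader took the IPI and spins until the initiator releases it. */
    waits_on[CPU_READER] = CPU_INITIATOR;
    /* Writer spins until the reader drops its read lock. */
    waits_on[CPU_WRITER] = CPU_READER;
    /* Assumption: the initiator waits for every CPU to acknowledge. */
    waits_on[CPU_INITIATOR] = writer_accepts_ipi ? NOT_WAITING : CPU_WRITER;

    return has_wait_cycle(waits_on, NR_MODEL_CPUS);
}
```

In this model scenario_deadlocks(false) reports a cycle, while
scenario_deadlocks(true) does not, because a writer that can still take the
critical IPI removes the initiator's edge.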

> I am not sure it is different from your second idea, but what about
> spinning in write_lock_irq/save with irqs on?

Hard-IRQs on (which is what my second idea would rely on) or Linux IRQs
on (which would involve patching Linux spinlock arch code and may have
side effects)?
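
As a rough illustration of "spinning with irqs on", the following
single-threaded model closes the interrupt window only around each lock
attempt and reopens it between attempts. Everything here is hypothetical
(struct model, the model_* functions, the spin budget), and the irqs_on flag
is deliberately abstract: the sketch does not settle whether it would stand
for hard IRQs or Linux IRQs.

```c
#include <stdbool.h>

/* Hypothetical model state; not an I-pipe or Linux structure. */
struct model {
    int  readers;      /* read-side holders of the lock */
    bool writer;       /* true once the write side is taken */
    bool ipi_pending;  /* critical IPI not yet acknowledged by this CPU */
    bool irqs_on;      /* whether this CPU currently accepts the IPI */
};

/*
 * Acknowledge the IPI if the interrupt window is open. In this model the
 * acknowledgement lets the initiator finish, so the preempted reader
 * resumes and drops its read lock.
 */
void model_poll_ipi(struct model *m)
{
    if (m->irqs_on && m->ipi_pending) {
        m->ipi_pending = false;
        m->readers = 0;
    }
}

bool model_write_trylock(struct model *m)
{
    if (m->readers == 0 && !m->writer) {
        m->writer = true;
        return true;
    }
    return false;
}

/* Returns the number of failed attempts, or -1 if the budget ran out. */
int model_write_lock_irq(struct model *m, bool spin_with_irqs_on, int budget)
{
    for (int spins = 0; spins < budget; spins++) {
        m->irqs_on = false;            /* attempt with interrupts closed */
        if (model_write_trylock(m))
            return spins;              /* lock held, interrupts stay closed */
        if (spin_with_irqs_on) {
            m->irqs_on = true;         /* open the window while spinning */
            model_poll_ipi(m);
        }
    }
    return -1;
}

/* One reader holds the lock and a critical IPI is pending on the writer. */
int model_run(bool spin_with_irqs_on)
{
    struct model m = { .readers = 1, .writer = false,
                       .ipi_pending = true, .irqs_on = true };
    return model_write_lock_irq(&m, spin_with_irqs_on, 1000);
}
```

With the window kept closed, model_run(false) exhausts its budget. With the
window reopened between attempts, model_run(true) gets the lock on the second
try.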

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
