Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Philippe,
>>>>
>>>> this
>>>>
>>>> int fastcall __ipipe_dispatch_event (unsigned event, void *data)
>>>> ...
>>>>    ipipe_cpudom_var(next_domain, evsync) |= (1LL << event);
>>>>    local_irq_restore_hw(flags);
>>>>    propagate = !evhand(event, start_domain, data);
>>>>    local_irq_save_hw(flags);
>>>>    ipipe_cpudom_var(next_domain, evsync) &= ~(1LL << event);
>>>>
>>>> doesn't fly on SMP. While the invoked event handler is running, it may
>>>> happen that the caller gets migrated to another CPU. The result is an
>>>> inconsistent evsync state that causes ipipe_catch_event to stall (test
>>>> case: invoke Jerome's system() test a few times, then try to unload
>>>> Xenomai skins and nucleus).
>>>>
>>>> First idea (nothing implemented so far, I would happily leave it to
>>>> someone else's hands): Track event handler entry/exit with a, say,
>>>> 8-bit per-CPU counter. On event deregistration, just sum over the
>>>> per-cpu counters and wait for the sum to become 0. This has just the
>>>> drawback that it may cause livelocks on large SMP boxes when trying to
>>>> wait for a busy event. I've no perfect idea so far.
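A minimal userspace model of this per-CPU counter idea may make the migration argument concrete. Everything here is a stand-in (NR_CPUS, the function names, plain C11 atomics instead of the kernel's per-CPU accessors); the point is only that exits are counted on whatever CPU the task ends up on, so individual counters can go transiently negative while the sum across CPUs stays exact:

```c
#include <stdatomic.h>

#define NR_CPUS 4 /* illustrative; stands in for the real CPU count */

static atomic_int evhand_count[NR_CPUS];

/* Increment on the CPU the handler is entered on, decrement on the
 * CPU the caller has been migrated to by the time it leaves.  A
 * migration merely moves one count from one slot to another. */
static void handler_enter(int cpu) { atomic_fetch_add(&evhand_count[cpu], 1); }
static void handler_exit(int cpu)  { atomic_fetch_sub(&evhand_count[cpu], 1); }

/* On deregistration, sum over all per-CPU counters; the handlers are
 * quiescent once the sum drops to zero.  As noted above, spinning on
 * this sum may still livelock on large SMP boxes for a busy event. */
static int handlers_in_flight(void)
{
	int sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += atomic_load(&evhand_count[cpu]);
	return sum;
}
```

With 8-bit counters the same property would hold modulo 256, since only the sum is ever inspected.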
>>> Second approach to solve this (currently) last known ipipe issue cleanly:
>>>
>>> As long as some task in the system has __ipipe_dispatch_event (and thus
>>> some of the registered event handlers) on its stack, it is not safe to
>>> unregister any handler. For simplicity, I don't think we should
>>> distinguish which handlers are precisely busy. So, how to find
>>> out if there are such tasks?
>>>
>>> I checked the code and derived three classes of preconditions for the
>>> invocation of __ipipe_dispatch_event:
>>>
>>>  1) ipipe_init_notify and ipipe_cleanup_notify -> no preconditions
>>>
>>>  2) ipipe_trap_notify -> some Linux task has PF_EVNOTIFY set or
>>>                          there are tasks with custom stacks present
>>>
>>>  3) all other invocations -> some Linux task has PF_EVNOTIFY set
> 
> 4) ipipe_syscall_watched_p used in syscall pipelining -> PF_EVNOTIFY ||
> non-regular syscall

Damn it.

> 
>>> This means that by walking the full Linux task list and checking that
>>> no PF_EVNOTIFY tasks remain, we can easily exclude 3). In fact, 2)
>>> can also be excluded this way because we can reasonably demand that any
>>> I-pipe user fiddling with event handlers has killed all non-Linux tasks
>>> beforehand.
>>> This leaves us with 1), which should be handled via classic RCU as
>>> Linux offers it. Viable? It would even shorten the fast path a bit!
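A sketch of the proposed predicate, as a hypothetical userspace model (the PF_EVNOTIFY value, the struct layout and the function name are all chosen for illustration; the real code would walk the kernel task list and follow a negative check with an RCU grace period, e.g. synchronize_rcu(), to let in-flight case-1 notifiers drain):

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; the real flag is defined by the I-pipe patch. */
#define PF_EVNOTIFY 0x40000000UL

struct task {
	unsigned long flags;
	struct task *next;
};

/* Walk the (here: singly linked) task list and report whether any
 * task still asks for I-pipe event notification.  If none does,
 * cases 2) and 3) above are excluded and only the RCU-covered
 * case 1) remains to be waited out. */
static bool evnotify_tasks_present(const struct task *list)
{
	for (const struct task *t = list; t; t = t->next)
		if (t->flags & PF_EVNOTIFY)
			return true;
	return false;
}
```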
>>>
>>> Still no code, I would like to collect feedback on this new idea first.
>> OK, here is some code. Seems to work fine (now that I fixed the cleanup
>> race in Xenomai as well - the issue happened to pop up right after
>> applying this patch :-/).
>>
> 
> This patch changes the predicate from "a given registered handler should
> not be currently running" to "no task in the system should be currently
> asking for any kind of I-pipe notification", which has not quite the
> same scope. For instance, we could no longer deregister a handler only
> for a particular event while leaving others installed. It becomes an
> "all or nothing" test, even if that still remains compatible with the
> way Xenomai works.
> 
> However, we have a real problem with the syscall event. This one has an
> additional precondition besides PF_EVNOTIFY which is unfortunately dynamic,
> i.e. syscall_nr >= NR_SYSCALL. This is used to relay non-regular
> syscalls to whoever it may concern in the pipeline, when PF_EVNOTIFY is
> not set. Xenomai uses this property for its shadow binding syscall,
> which will eventually set PF_EVNOTIFY for the caller. IOW, an attempt to
> bind to the skin from userland while the nucleus is deregistering the
> syscall handler would escape the RCU sync with the current patch.
> 
> But holding a read-side RCU lock to postpone the quiescent RCU
> state won't fly, since the syscall handler may reschedule Linux-wise.
> 
> Looking back at the actual issue, we only have a single case to handle
> for the current code to be correct: detect when a handler appears to
> linger indefinitely on a given CPU (even though it actually does not),
> due to a migration.
> 
> This should be done in a way that does not require complex
> locking/tracking on the fast path since we are really dealing with a
> corner case here. The attached patch does this by allowing a grace
> period before assuming that the handler will never complete on the
> current CPU, looking for some quiescent state. The latter part is almost
> overkill, since clearing the handler in the domain struct already avoids
> livelocking and would prevent such a handler from being fired quite early
> anyway. The underlying assumption is that tasks which should be notified
> of system events will be killed before someone attempts to deregister
> the event handlers they rely on, so they should not be able to run/sleep
> more than the grace period on behalf of the notifier.
> 
> Does this patch also fix the case you pointed out?

May solve my particular test case, but given sleeping event handlers,
it's clear to me now that the whole bit-flag based locking scheme remains
fragile once you have more than one task on a CPU inside the same
to-be-unregistered handler. Then the first leaving task will release the
lock, shadowing the existence of further tasks inside the handler. We
would need SRCU-like locking here if it were available for our
environment...
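The shadowing problem can be shown with a toy model (names and the use of plain C11 atomics are illustrative, not the actual evsync code): with a single per-CPU bit, the first of two tasks leaving the handler clears it and hides the second task still inside; a nesting counter only reads idle once the last task has left. Real SRCU additionally flips between two counter epochs so that readers migrating mid-section remain accounted for.

```c
#include <stdatomic.h>

/* One nesting counter, conceptually per CPU, in place of the single
 * evsync bit.  Unlike the bit, it cannot be "released" by the first
 * task to leave while others are still inside the handler. */
static atomic_int evsync_depth;

static void event_handler_enter(void) { atomic_fetch_add(&evsync_depth, 1); }
static void event_handler_exit(void)  { atomic_fetch_sub(&evsync_depth, 1); }
static int  event_handler_busy(void)  { return atomic_load(&evsync_depth) != 0; }
```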

Jan


_______________________________________________
Adeos-main mailing list
[email protected]
https://mail.gna.org/listinfo/adeos-main
