----- On Jan 27, 2021, at 2:16 PM, rostedt [email protected] wrote:

> On Wed, 27 Jan 2021 13:13:22 -0500 (EST)
> Mathieu Desnoyers <[email protected]> wrote:
> 
>> > Thanks for bringing that up.
>> 
>> Requiring an RCU synchronize on element removal is quite intrusive, and can
>> be problematic if tracepoint removal is called from e.g. preempt-off context.
> 
> But how often do you remove more than one callback from the same
> tracepoint? Or should I say, from a lot of tracepoints?
> 
> This will only synchronize for the following case:
> 
> Add three callbacks to a single tracepoint.
> Remove the first one.
>    <rcu callback to update the counters>
> Remove the second one
>   <triggers a synchronization if the counters have not been finished
>    updating>
> Remove the third one.
>   <no synchronization needed, because it's being freed>
> 
> And we may be able to make this work in batch too.
> 
> More to come, but I really like this approach over the others because it
> does not increase the size of the kernel for a failure that should never
> happen in practice.

My concern with introducing a scheme based on synchronize_rcu to handle 
out-of-memory
scenarios is not about how frequently synchronize_rcu will be called, but
about the added complexity this adds to the tracepoints insertion/removal.
This has chances to explode in unlikely scenarios, which will become harder to 
test
for. This will also introduce constraints on which kernel context can perform
tracepoint removal.

I notice that the error report which leads to this v4 is against v2 of the 
patch [1],
which has known issues. I wonder whether there are any such issues with v3, 
which is
using a function stub ? [2]

The main concern I had about v3 of the patch was that the prototype of the
stub (void argument list) does not match the prototype of the called function. 
As
discussed in [2], there are other scenarios where the kernel expects this to 
work,
based on the expectation that all architectures supported by the Linux kernel 
have
a calling convention where the caller performs the clean-up.

So I would prefer the approach taken in v3 rather than adding the kind of 
complexity
introduced in v4. Ideally we should document, near the stub function, that this
expects the calling convention to have the caller perform the clean-up (cdecl 
family),
and that the returned type (void) of the function always match. It's only the 
arguments
which may differ.

There is still one thing that I'm not 100% sure about. It's related to this 
comment
from Kees Cook [3], hinting that in a CFI build the function prototype of the 
call
site and the function need to match. So do we need extra annotation of the stub 
function
to make this work in a CFI build ?

Thanks,

Mathieu

[1] https://lore.kernel.org/bpf/[email protected]/
[2] https://lore.kernel.org/bpf/[email protected]/
[3] https://lore.kernel.org/bpf/202011171330.94C6BA7E93@keescook/
-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Reply via email to