On 05/02/19 08:50, Nadav Amit wrote: >> In cases where RCU cannot be used (e.g. because some callees need to RCU >> synchronise), it might be possible to add a variant that uses >> synchronize_rcu_tasks() when updating, but this series does not attempt this. > I wonder why. Mainly because I have yet to convince myself that it's the Right Thing. Note also the following (from kernel/rcu/update.c):
/* * This is a very specialized primitive, intended only for a few uses in * tracing and other situations requiring manipulation of function * preambles and profiling hooks. The synchronize_rcu_tasks() function * is not (yet) intended for heavy use from multiple CPUs. */ > This seems like an easy solution, and according to Josh, Steven > Rostedt and the documentation appears to be valid. Will it hurt performance, though, if we end up (say) having rcu-tasks- based synchronisation for updates on every indirect call in the kernel? (As would result from a plugin-based opt-out approach.) > As I stated before, I think that the best solution is to use a GCC plugin, > [...] Such a solution will not enable the calling code to be > written in C and would require a plugin for each architecture. I'm afraid I don't see why. If we use the static_calls infrastructure, but then do a source-level transformation in the compiler plugin to turn indirect calls into dynamic_calls, it should be possible to create an opt-out system without any arch-specific code in the plugin (the arch- specific stuff being all in the static_calls code). Any reason that can't be done? (Note: I don't know much about GCC internals, maybe there's something obvious that stops a plugin doing things like that.) > Feel free to try my code and give me feedback. I did not get a feedback on my > last version. Is there a fundamental problem with my plugin? Did you try it > and got bad results, or perhaps it did not build? I didn't test your patches yet, because I was busy trying to get mine working and ready to post (and also with unrelated work). But now that that's done, next time I have cycles spare for indirect call stuff I guess testing (and reviewing) your approach will be next on my list. > Why do you prefer an approach > which requires annotation of the callers, instead of something that is much > more transparent? I'm concerned about the overhead (in both time and memory) of running learning on every indirect call site (including ones that aren't in a hot-path, and ones which have such a wide variety of callees that promotion really doesn't help) throughout the whole kernel. Also, an annotating programmer knows the locking/rcu context and can thus tell whether a given dynamic_call should use synchronise_rcu_tasks(), synchronise_rcu(), or perhaps something else (if e.g. the call always happens under a mutex, then the updater work could take that mutex). The real answer, though, is that I don't so much prefer this approach, as think that both should be tried "publicly" and evaluated by more developers than just us three. There's a reason this series is marked RFC ;-) -Ed