Re: [PATCH net-next 1/4] indirect call wrappers: helpers to speed-up indirect calls of builtin
On Tue, 2018-12-04 at 18:12 +, Edward Cree wrote: > On 04/12/18 17:44, Paolo Abeni wrote: > > On Tue, 2018-12-04 at 17:13 +, Edward Cree wrote: > > > On 03/12/18 11:40, Paolo Abeni wrote: > > > > This header define a bunch of helpers that allow avoiding the > > > > retpoline overhead when calling builtin functions via function pointers. > > > > It boils down to explicitly comparing the function pointers to > > > > known builtin functions and eventually invoke directly the latter. > > > > > > > > The macros defined here implement the boilerplate for the above schema > > > > and will be used by the next patches. > > > > > > > > rfc -> v1: > > > > - use branch prediction hint, as suggested by Eric > > > > > > > > Suggested-by: Eric Dumazet > > > > Signed-off-by: Paolo Abeni > > > > --- > > > I'm not sure I see the reason why this is done with numbers and > > > 'name ## NR', adding extra distance between the callsite and the > > > list of callees. In particular it means that each callable needs > > > to specify its index. > > > Wouldn't it be simpler just to have > > > #define 1(f, f1, ...) \ > > > (likely(f == f1) ? f1(__VA_ARGS__) : f(__VA_ARGS__)) > > > #define INDIRECT_CALL_2(f, f2, f1, ...) \ > > > (likely(f == f2) ? f2(__VA_ARGS__) : INDIRECT_CALL_1(f, f1, > > > __VA_ARGS__)) > > > etc.? Removing the need for INDIRECT_CALLABLE_DECLARE_* entirely. > > Thank you for the review! > > > > As some of the builtin symbols are static, we would still need some > > macro wrappers to properly specify the scope when retpoline is enabled. > Ah I see, it hadn't occurred to me that static callees might not be > available at the callsite. Makes sense now. In that case, have my > Acked-By for this patch, if you want it. I gave a shot to your idea, and after all I think is cleaner. So I'll send v2 with that change. Thanks, Paolo
Re: [PATCH net-next 1/4] indirect call wrappers: helpers to speed-up indirect calls of builtin
On 04/12/18 17:44, Paolo Abeni wrote: > On Tue, 2018-12-04 at 17:13 +, Edward Cree wrote: >> On 03/12/18 11:40, Paolo Abeni wrote: >>> This header define a bunch of helpers that allow avoiding the >>> retpoline overhead when calling builtin functions via function pointers. >>> It boils down to explicitly comparing the function pointers to >>> known builtin functions and eventually invoke directly the latter. >>> >>> The macros defined here implement the boilerplate for the above schema >>> and will be used by the next patches. >>> >>> rfc -> v1: >>> - use branch prediction hint, as suggested by Eric >>> >>> Suggested-by: Eric Dumazet >>> Signed-off-by: Paolo Abeni >>> --- >> I'm not sure I see the reason why this is done with numbers and >> 'name ## NR', adding extra distance between the callsite and the >> list of callees. In particular it means that each callable needs >> to specify its index. >> Wouldn't it be simpler just to have >> #define 1(f, f1, ...) \ >> (likely(f == f1) ? f1(__VA_ARGS__) : f(__VA_ARGS__)) >> #define INDIRECT_CALL_2(f, f2, f1, ...) \ >> (likely(f == f2) ? f2(__VA_ARGS__) : INDIRECT_CALL_1(f, f1, >> __VA_ARGS__)) >> etc.? Removing the need for INDIRECT_CALLABLE_DECLARE_* entirely. > Thank you for the review! > > As some of the builtin symbols are static, we would still need some > macro wrappers to properly specify the scope when retpoline is enabled. Ah I see, it hadn't occurred to me that static callees might not be available at the callsite. Makes sense now. In that case, have my Acked-By for this patch, if you want it. > This: > https://lore.kernel.org/lkml/cover.1543200841.git.jpoim...@redhat.com/T/#ma30f6b2aa655c99e93cfb267fef75b8fe9fca29b > > is possibly related to what you are planning. Yes! That looks like exactly the sort of machinery I need, thanks. My idea is to build on that something that counts stats of where each indirect call goes, then every now and then patches in a new list of possible targets to an INDIRECT_CALL_*()-style jump tree, based on the N callees that were seen the most often. (Such a solution could even cope with callees in modules: the jump table doesn't need a reference on the module because it only calls the function (from the module) if the function pointer is pointing at it, in which case (hopefully) whoever supplied the function pointer has a reference to the module.) But I'm not sure I have the mad hacker skillz to make it work. -Ed
Re: [PATCH net-next 1/4] indirect call wrappers: helpers to speed-up indirect calls of builtin
On Tue, 2018-12-04 at 17:13 +, Edward Cree wrote: > On 03/12/18 11:40, Paolo Abeni wrote: > > This header define a bunch of helpers that allow avoiding the > > retpoline overhead when calling builtin functions via function pointers. > > It boils down to explicitly comparing the function pointers to > > known builtin functions and eventually invoke directly the latter. > > > > The macros defined here implement the boilerplate for the above schema > > and will be used by the next patches. > > > > rfc -> v1: > > - use branch prediction hint, as suggested by Eric > > > > Suggested-by: Eric Dumazet > > Signed-off-by: Paolo Abeni > > --- > I'm not sure I see the reason why this is done with numbers and > 'name ## NR', adding extra distance between the callsite and the > list of callees. In particular it means that each callable needs > to specify its index. > Wouldn't it be simpler just to have > #define 1(f, f1, ...) \ > (likely(f == f1) ? f1(__VA_ARGS__) : f(__VA_ARGS__)) > #define INDIRECT_CALL_2(f, f2, f1, ...) \ > (likely(f == f2) ? f2(__VA_ARGS__) : INDIRECT_CALL_1(f, f1, > __VA_ARGS__)) > etc.? Removing the need for INDIRECT_CALLABLE_DECLARE_* entirely. Thank you for the review! As some of the builtin symbols are static, we would still need some macro wrappers to properly specify the scope when retpoline is enabled. Also, I think that f1, f2... declaration before INDIRECT_CALL_ would be uglier, as we need to list there the function names (so we would have the same list in 2 places). Anyway this sounds really one thing that will enrage guys on lklm. Suggestions for alternative solutions more than welcome ;) > PS: this has reminded me of my desire to try runtime creation of > these kinds of branch tables with self-modifying code This: https://lore.kernel.org/lkml/cover.1543200841.git.jpoim...@redhat.com/T/#ma30f6b2aa655c99e93cfb267fef75b8fe9fca29b is possibly related to what you are planning. AFAICS should work only for global function pointers, not for e.g. function ptr inside lists, so the above and this series should be complementary. Cheers, Paolo
Re: [PATCH net-next 1/4] indirect call wrappers: helpers to speed-up indirect calls of builtin
On 03/12/18 11:40, Paolo Abeni wrote: > This header define a bunch of helpers that allow avoiding the > retpoline overhead when calling builtin functions via function pointers. > It boils down to explicitly comparing the function pointers to > known builtin functions and eventually invoke directly the latter. > > The macros defined here implement the boilerplate for the above schema > and will be used by the next patches. > > rfc -> v1: > - use branch prediction hint, as suggested by Eric > > Suggested-by: Eric Dumazet > Signed-off-by: Paolo Abeni > --- I'm not sure I see the reason why this is done with numbers and 'name ## NR', adding extra distance between the callsite and the list of callees. In particular it means that each callable needs to specify its index. Wouldn't it be simpler just to have #define INDIRECT_CALL_1(f, f1, ...) \ (likely(f == f1) ? f1(__VA_ARGS__) : f(__VA_ARGS__)) #define INDIRECT_CALL_2(f, f2, f1, ...) \ (likely(f == f2) ? f2(__VA_ARGS__) : INDIRECT_CALL_1(f, f1, __VA_ARGS__)) etc.? Removing the need for INDIRECT_CALLABLE_DECLARE_* entirely. At least the commit message should explain the rationale for not doing things this way. -Ed PS: this has reminded me of my desire to try runtime creation of these kinds of branch tables with self-modifying code; is there any documentation on how to go about writing to kernel .text at runtime? Last time I had a try at it I got very confused.
Re: [PATCH net-next 1/4] indirect call wrappers: helpers to speed-up indirect calls of builtin
From: Paolo Abeni Date: Tue, 04 Dec 2018 12:27:51 +0100 > On Mon, 2018-12-03 at 10:04 -0800, Eric Dumazet wrote: >> On 12/03/2018 03:40 AM, Paolo Abeni wrote: >> > This header define a bunch of helpers that allow avoiding the >> > retpoline overhead when calling builtin functions via function pointers. >> > It boils down to explicitly comparing the function pointers to >> > known builtin functions and eventually invoke directly the latter. >> > >> > The macros defined here implement the boilerplate for the above schema >> > and will be used by the next patches. >> > >> > rfc -> v1: >> > - use branch prediction hint, as suggested by Eric >> > >> > Suggested-by: Eric Dumazet >> > Signed-off-by: Paolo Abeni >> > --- >> > include/linux/indirect_call_wrapper.h | 77 +++ >> > 1 file changed, 77 insertions(+) >> > create mode 100644 include/linux/indirect_call_wrapper.h >> >> This needs to be discussed more broadly, please include lkml > > Agreed. @David: please let me know if you prefer a repost or a v2 with > the expanded recipients list. v2 probably works better and will help me better keep track of things. Thanks for asking.
Re: [PATCH net-next 1/4] indirect call wrappers: helpers to speed-up indirect calls of builtin
On Mon, 2018-12-03 at 10:04 -0800, Eric Dumazet wrote: > On 12/03/2018 03:40 AM, Paolo Abeni wrote: > > This header define a bunch of helpers that allow avoiding the > > retpoline overhead when calling builtin functions via function pointers. > > It boils down to explicitly comparing the function pointers to > > known builtin functions and eventually invoke directly the latter. > > > > The macros defined here implement the boilerplate for the above schema > > and will be used by the next patches. > > > > rfc -> v1: > > - use branch prediction hint, as suggested by Eric > > > > Suggested-by: Eric Dumazet > > Signed-off-by: Paolo Abeni > > --- > > include/linux/indirect_call_wrapper.h | 77 +++ > > 1 file changed, 77 insertions(+) > > create mode 100644 include/linux/indirect_call_wrapper.h > > This needs to be discussed more broadly, please include lkml Agreed. @David: please let me know if you prefer a repost or a v2 with the expanded recipients list. > Also please include Paul Turner in the discussion, since > he was > the inventor of this idea. I will do. Thanks, Paolo
Re: [PATCH net-next 1/4] indirect call wrappers: helpers to speed-up indirect calls of builtin
On 12/03/2018 03:40 AM, Paolo Abeni wrote: > This header define a bunch of helpers that allow avoiding the > retpoline overhead when calling builtin functions via function pointers. > It boils down to explicitly comparing the function pointers to > known builtin functions and eventually invoke directly the latter. > > The macros defined here implement the boilerplate for the above schema > and will be used by the next patches. > > rfc -> v1: > - use branch prediction hint, as suggested by Eric > > Suggested-by: Eric Dumazet > Signed-off-by: Paolo Abeni > --- > include/linux/indirect_call_wrapper.h | 77 +++ > 1 file changed, 77 insertions(+) > create mode 100644 include/linux/indirect_call_wrapper.h This needs to be discussed more broadly, please include lkml Also please include Paul Turner in the discussion, since he was the inventor of this idea.
[PATCH net-next 1/4] indirect call wrappers: helpers to speed-up indirect calls of builtin
This header define a bunch of helpers that allow avoiding the retpoline overhead when calling builtin functions via function pointers. It boils down to explicitly comparing the function pointers to known builtin functions and eventually invoke directly the latter. The macros defined here implement the boilerplate for the above schema and will be used by the next patches. rfc -> v1: - use branch prediction hint, as suggested by Eric Suggested-by: Eric Dumazet Signed-off-by: Paolo Abeni --- include/linux/indirect_call_wrapper.h | 77 +++ 1 file changed, 77 insertions(+) create mode 100644 include/linux/indirect_call_wrapper.h diff --git a/include/linux/indirect_call_wrapper.h b/include/linux/indirect_call_wrapper.h new file mode 100644 index ..d6894ffc33f6 --- /dev/null +++ b/include/linux/indirect_call_wrapper.h @@ -0,0 +1,77 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_INDIRECT_CALL_WRAPPER_H +#define _LINUX_INDIRECT_CALL_WRAPPER_H + +#ifdef CONFIG_RETPOLINE + +/* + * INDIRECT_CALL_$NR - wrapper for indirect calls with $NR known builtin + * @f: function pointer + * @name: base name for builtin functions, see INDIRECT_CALLABLE_DECLARE_$NR + * @__VA_ARGS__: arguments for @f + * + * Avoid retpoline overhead for known builtin, checking @f vs each of them and + * eventually invoking directly the builtin function. Fallback to the indirect + * call + */ +#define INDIRECT_CALL_1(f, name, ...) \ + ({ \ + likely(f == name ## 1) ? name ## 1(__VA_ARGS__) : \ +f(__VA_ARGS__);\ + }) +#define INDIRECT_CALL_2(f, name, ...) \ + ({ \ + likely(f == name ## 2) ? name ## 2(__VA_ARGS__) : \ +INDIRECT_CALL_1(f, name, __VA_ARGS__); \ + }) + +/* + * INDIRECT_CALLABLE_DECLARE_$NR - declare $NR known builtin for + * INDIRECT_CALL_$NR usage + * @type: return type for the builtin function + * @name: base name for builtin functions, the full list is generated appending + *the numbers in the 1..@NR range + * @__VA_ARGS__: arguments type list for the builtin function + * + * Builtin with higher $NR will be checked first by INDIRECT_CALL_$NR + */ +#define INDIRECT_CALLABLE_DECLARE_1(type, name, ...) \ + extern type name ## 1(__VA_ARGS__) +#define INDIRECT_CALLABLE_DECLARE_2(type, name, ...) \ + extern type name ## 2(__VA_ARGS__); \ + INDIRECT_CALLABLE_DECLARE_1(type, name, __VA_ARGS__) + +/* + * INDIRECT_CALLABLE - allow usage of a builtin function from INDIRECT_CALL_$NR + * @f: builtin function name + * @nr: id associated with this builtin, higher values will be checked first by + * INDIRECT_CALL_$NR + * @type: function return type + * @name: base name used by INDIRECT_CALL_ to access the builtin list + * @__VA_ARGS__: arguments type list for @f + */ +#define INDIRECT_CALLABLE(f, nr, type, name, ...) \ + __alias(f) type name ## nr(__VA_ARGS__) + +#else +#define INDIRECT_CALL_1(f, name, ...) f(__VA_ARGS__) +#define INDIRECT_CALL_2(f, name, ...) f(__VA_ARGS__) +#define INDIRECT_CALLABLE_DECLARE_1(type, name, ...) +#define INDIRECT_CALLABLE_DECLARE_2(type, name, ...) +#define INDIRECT_CALLABLE(f, nr, type, name, ...) +#endif + +/* + * We can use INDIRECT_CALL_$NR for ipv6 related functions only if ipv6 is + * builtin, this macro simplify dealing with indirect calls with only ipv4/ipv6 + * alternatives + */ +#if IS_BUILTIN(CONFIG_IPV6) +#define INDIRECT_CALL_INET INDIRECT_CALL_2 +#elif IS_ENABLED(CONFIG_INET) +#define INDIRECT_CALL_INET INDIRECT_CALL_1 +#else +#define INDIRECT_CALL_INET(...) +#endif + +#endif -- 2.19.2