Re: [PATCH v9 00/36] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph

2024-04-24 Thread Florent Revest
Neat! :) I mostly had a look at the "high level" parts (fprobe and the
arm64-specific bits) and this seems to be in good shape to me.

Thanks for all that work, that is quite a refactoring :)

On Mon, Apr 15, 2024 at 2:49 PM Masami Hiramatsu (Google) wrote:
>
> Hi,
>
> Here is the 9th version of the series to re-implement fprobe on top of
> the function-graph tracer. The previous version is:
>
> https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/
>
> This version is ported to the latest kernel (v6.9-rc3 + probes/for-next),
> fixes some bugs, and adds a performance optimization patch [36/36].
>  - [12/36] Fix to clear the fgraph_array entry on registration failure, and
>    return -ENOSPC when fgraph_array is full.
>  - [28/36] Add a new store_fprobe_entry_data() for fprobe.
>  - [31/36] Remove DIV_ROUND_UP() and fix the entry data address calculation.
>  - [36/36] Add a new flag to skip timestamp recording.
>
> Overview
> 
> This series makes two major changes: it enables multiple function-graph
> tracers on ftrace (e.g. allows function-graph on sub-instances) and
> rewrites fprobe on top of this function-graph infrastructure.
>
> The former change was originally sent by Steven Rostedt 4 years ago (*);
> it allows users to configure the function-graph tracer (and other tracers
> based on function-graph) differently in each trace instance at the same
> time.
>
> (*) https://lore.kernel.org/all/20190525031633.811342...@goodmis.org/
>
> The purposes of the latter change are:
>
>  1) Remove fprobe's dependency on rethook so that we can reduce the
>     amount of return-hook code and shadow stacks.
>
>  2) Make 'ftrace_regs' the common trace interface for the function
>     boundary.
>
> 1) Currently we have 2 (or 3) different function return hook
>  implementations: the function-graph tracer and rethook (and the legacy
>  kretprobe). Since this is redundant and doubles the maintenance cost,
>  I would like to unify them. From the user's viewpoint, the function-
>  graph tracer is very useful for grasping the execution path. For this
>  purpose, it is hard to use rethook in the function-graph tracer, but
>  the opposite is possible. (Strictly speaking, kretprobe cannot use it
>  because it requires 'pt_regs' for historical reasons.)
>
> 2) Currently fprobe provides 'pt_regs' to its handlers, but that is
>  wrong for the function entry and exit. Moreover, depending on the
>  architecture, there is no way to accurately reproduce 'pt_regs'
>  outside of interrupt or exception handlers. This means fprobe should
>  not use 'pt_regs' because it does not use such exceptions.
>  (Conversely, kprobe should use 'pt_regs' because it is an abstract
>   interface of the software breakpoint exception.)
>
> This series changes fprobe to use the function-graph tracer for tracing
> function entry and exit, instead of a mixture of ftrace and rethook.
> Unlike rethook, which is a per-task list of system-wide allocated
> nodes, the function graph's ret_stack is a per-task shadow stack.
> Thus it does not need 'nr_maxactive' (the number of pre-allocated
> nodes) to be set.
> Also, the handlers will get 'ftrace_regs' instead of 'pt_regs'.
> Since eBPF multi_kprobe/multi_kretprobe events still use 'pt_regs' as
> their register interface, this series converts 'ftrace_regs' to
> 'pt_regs' for them. Of course this conversion produces an incomplete
> 'pt_regs', so users must only access the registers used for function
> parameters or the return value.
>
> Design
> --
> Instead of using ftrace's function entry hook directly, the new fprobe
> is built on top of the function-graph's entry and return callbacks
> with 'ftrace_regs'.
>
> Since fprobe requires access to 'ftrace_regs', the architecture
> must support CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS and
> CONFIG_HAVE_FTRACE_GRAPH_FUNC, which enable calling the function-graph
> entry callback with 'ftrace_regs', and also
> CONFIG_HAVE_FUNCTION_GRAPH_FREGS, which passes 'ftrace_regs' to
> return_to_handler.
>
> All fprobes share a single function-graph ops (meaning they share a
> common ftrace filter), similar to kprobe-on-ftrace. This requires
> another layer to find the corresponding fprobe in the common
> function-graph callbacks, but it scales much better, since the number
> of registered function-graph ops is limited.
>
> In the entry callback, the fprobe runs its entry_handler and saves the
> address of 'fprobe' on the function-graph's shadow stack as data. The
> return callback decodes the data to get the 'fprobe' address, and runs
> the exit_handler.
>
> The fprobe introduces two hash tables: one for the entry callback, which
> looks up the fprobes related to the function address passed to the entry
> callback, and one for the return callback, which checks whether the
> given 'fprobe' data structure pointer is still valid. Note that it is
> possible to unregister an fprobe before the return callback runs, so
> the address must be validated before it is used in the return
> callback.
>
> This series can be applied against the 
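
To make the entry/return data flow described above easier to follow, here is a
small self-contained userspace model (a sketch only; every name in it is
invented for illustration and none of it is the actual kernel code). The
"entry callback" stashes the fprobe pointer as per-call shadow-stack data, and
the "return callback" retrieves it and re-validates it against the
registration table before running the exit handler, since the fprobe may have
been unregistered in between:

#include <stdio.h>
#include <stdbool.h>

struct fprobe {
    const char *name;
    void (*exit_handler)(struct fprobe *fp);
};

/* Model of the per-task fgraph shadow stack: one slot of per-call data
 * (here, the owning fprobe pointer) for each not-yet-returned call. */
#define STACK_DEPTH 16
static struct fprobe *shadow_stack[STACK_DEPTH];
static int shadow_top;

/* Model of the hash table the return callback uses to check that the
 * fprobe was not unregistered while the traced function was running. */
static struct fprobe *registered[8];

static bool fprobe_still_registered(struct fprobe *fp)
{
    for (int i = 0; i < 8; i++)
        if (registered[i] == fp)
            return true;
    return false;
}

/* "Entry callback": stash the fprobe pointer as per-call data. */
static void entry_callback(struct fprobe *fp)
{
    if (shadow_top < STACK_DEPTH)
        shadow_stack[shadow_top++] = fp;
}

/* "Return callback": retrieve the data and validate it before use. */
static void return_callback(void)
{
    struct fprobe *fp;

    if (shadow_top == 0)
        return;
    fp = shadow_stack[--shadow_top];
    if (fprobe_still_registered(fp) && fp->exit_handler)
        fp->exit_handler(fp);
}

static void say_exit(struct fprobe *fp)
{
    printf("exit handler for %s\n", fp->name);
}

int main(void)
{
    struct fprobe fp = { .name = "vfs_read", .exit_handler = say_exit };

    registered[0] = &fp;
    entry_callback(&fp);    /* traced function entered */
    return_callback();      /* traced function returns: handler runs */

    entry_callback(&fp);
    registered[0] = NULL;   /* fprobe unregistered before the return */
    return_callback();      /* validation fails: exit handler is skipped */
    return 0;
}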

Re: [PATCH v9 01/36] tracing: Add a comment about ftrace_regs definition

2024-04-24 Thread Florent Revest
On Wed, Apr 24, 2024 at 2:23 PM Florent Revest  wrote:
>
> On Mon, Apr 15, 2024 at 2:49 PM Masami Hiramatsu (Google) wrote:
> >
> > From: Masami Hiramatsu (Google) 
> >
> > To clarify what will be expected on ftrace_regs, add a comment to the
> > architecture independent definition of the ftrace_regs.
> >
> > Signed-off-by: Masami Hiramatsu (Google) 
> > Acked-by: Mark Rutland 
> > ---
> >  Changes in v8:
> >   - Update that the saved registers depends on the context.
> >  Changes in v3:
> >   - Add instruction pointer
> >  Changes in v2:
> >   - newly added.
> > ---
> >  include/linux/ftrace.h |   26 ++
> >  1 file changed, 26 insertions(+)
> >
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index 54d53f345d14..b81f1afa82a1 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -118,6 +118,32 @@ extern int ftrace_enabled;
> >
> >  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> >
> > +/**
> > + * ftrace_regs - ftrace partial/optimal register set
> > + *
> > + * ftrace_regs represents a group of registers which is used at the
> > + * function entry and exit. There are three types of registers.
> > + *
> > + * - Registers for passing the parameters to callee, including the stack
> > + *   pointer. (e.g. rcx, rdx, rdi, rsi, r8, r9 and rsp on x86_64)
> > + * - Registers for passing the return values to caller.
> > + *   (e.g. rax and rdx on x86_64)
>
> Ooc, have we ever considered skipping argument registers that are not
return value registers in the exit code paths? For example, why would
we want to save rdi in a return handler?
>
> But if we want to avoid the situation of having "sparse ftrace_regs"
> all over again, we'd have to split ftrace_regs into a ftrace_args_regs
> and a ftrace_ret_regs which would make this refactoring even more
> painful, just to skip a few instructions. :|
>
> I don't necessarily think it's worth it, I just wanted to make sure
> this was considered.

Ah, well, I just reached patch 22 and noticed that there you add:

+ * Basically, ftrace_regs stores the registers related to the context.
+ * On function entry, registers for function parameters and hooking the
+ * function call are stored, and on function exit, registers for function
+ * return value and frame pointers are stored.

So ftrace_regs can be a sparse structure then. That's fair enough with me! ;)



Re: [PATCH v9 01/36] tracing: Add a comment about ftrace_regs definition

2024-04-24 Thread Florent Revest
On Mon, Apr 15, 2024 at 2:49 PM Masami Hiramatsu (Google) wrote:
>
> From: Masami Hiramatsu (Google) 
>
> To clarify what will be expected on ftrace_regs, add a comment to the
> architecture independent definition of the ftrace_regs.
>
> Signed-off-by: Masami Hiramatsu (Google) 
> Acked-by: Mark Rutland 
> ---
>  Changes in v8:
>   - Update that the saved registers depends on the context.
>  Changes in v3:
>   - Add instruction pointer
>  Changes in v2:
>   - newly added.
> ---
>  include/linux/ftrace.h |   26 ++
>  1 file changed, 26 insertions(+)
>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 54d53f345d14..b81f1afa82a1 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -118,6 +118,32 @@ extern int ftrace_enabled;
>
>  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
>
> +/**
> + * ftrace_regs - ftrace partial/optimal register set
> + *
> + * ftrace_regs represents a group of registers which is used at the
> + * function entry and exit. There are three types of registers.
> + *
> + * - Registers for passing the parameters to callee, including the stack
> + *   pointer. (e.g. rcx, rdx, rdi, rsi, r8, r9 and rsp on x86_64)
> + * - Registers for passing the return values to caller.
> + *   (e.g. rax and rdx on x86_64)

Ooc, have we ever considered skipping argument registers that are not
return value registers in the exit code paths? For example, why would
we want to save rdi in a return handler?

But if we want to avoid the situation of having "sparse ftrace_regs"
all over again, we'd have to split ftrace_regs into a ftrace_args_regs
and a ftrace_ret_regs which would make this refactoring even more
painful, just to skip a few instructions. :|

I don't necessarily think it's worth it, I just wanted to make sure
this was considered.

> + * - Registers for hooking the function call and return including the
> + *   frame pointer (the frame pointer is architecture/config dependent)
> + *   (e.g. rip, rbp and rsp for x86_64)
> + *
> + * Also, architecture dependent fields can be used for internal process.
> + * (e.g. orig_ax on x86_64)
> + *
> + * On the function entry, those registers will be restored except for
> + * the stack pointer, so that user can change the function parameters
> + * and instruction pointer (e.g. live patching.)
> + * On the function exit, only registers which is used for return values
> + * are restored.
> + *
> + * NOTE: user *must not* access regs directly, only do it via APIs, because
> + * the member can be changed according to the architecture.
> + */
>  struct ftrace_regs {
> struct pt_regs  regs;
>  };
>
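
For reference, the way callers are expected to honor the "only via APIs" note
above is through the ftrace_regs accessor helpers. Below is a minimal sketch
of an ftrace callback doing so (kernel-module context, assuming
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS; the filter setup and
register_ftrace_function() call are omitted):

#include <linux/ftrace.h>
#include <linux/printk.h>

static void notrace my_trace_func(unsigned long ip, unsigned long parent_ip,
                                  struct ftrace_ops *op,
                                  struct ftrace_regs *fregs)
{
    /* Never poke at the ftrace_regs members directly: the layout is
     * architecture-specific. Use the accessors instead. */
    unsigned long pc   = ftrace_regs_get_instruction_pointer(fregs);
    unsigned long arg0 = ftrace_regs_get_argument(fregs, 0);
    unsigned long sp   = ftrace_regs_get_stack_pointer(fregs);

    pr_debug("hit %ps: arg0=0x%lx sp=0x%lx\n", (void *)pc, arg0, sp);
}

/* Registered elsewhere with ftrace_set_filter_ip() + register_ftrace_function(). */
static struct ftrace_ops my_ops = {
    .func = my_trace_func,
};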



Re: [PATCH bpf-next v5 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-04-20 Thread Florent Revest
On Tue, Apr 20, 2021 at 12:54 AM Alexei Starovoitov wrote:
>
> On Mon, Apr 19, 2021 at 05:52:39PM +0200, Florent Revest wrote:
> > This type provides the guarantee that an argument is going to be a const
> > pointer to somewhere in a read-only map value. It also checks that this
> > pointer is followed by a zero character before the end of the map value.
> >
> > Signed-off-by: Florent Revest 
> > Acked-by: Andrii Nakryiko 
> > ---
> >  include/linux/bpf.h   |  1 +
> >  kernel/bpf/verifier.c | 41 +
> >  2 files changed, 42 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 77d1d8c65b81..c160526fc8bf 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -309,6 +309,7 @@ enum bpf_arg_type {
> >   ARG_PTR_TO_PERCPU_BTF_ID,   /* pointer to in-kernel percpu type */
> >   ARG_PTR_TO_FUNC,/* pointer to a bpf program function */
> >   ARG_PTR_TO_STACK_OR_NULL,   /* pointer to stack or NULL */
> > + ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only 
> > string */
> >   __BPF_ARG_TYPE_MAX,
> >  };
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 852541a435ef..5f46dd6f3383 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -4787,6 +4787,7 @@ static const struct bpf_reg_types spin_lock_types = { 
> > .types = { PTR_TO_MAP_VALU
> >  static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { 
> > PTR_TO_PERCPU_BTF_ID } };
> >  static const struct bpf_reg_types func_ptr_types = { .types = { 
> > PTR_TO_FUNC } };
> >  static const struct bpf_reg_types stack_ptr_types = { .types = { 
> > PTR_TO_STACK } };
> > +static const struct bpf_reg_types const_str_ptr_types = { .types = { 
> > PTR_TO_MAP_VALUE } };
> >
> >  static const struct bpf_reg_types 
> > *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> >   [ARG_PTR_TO_MAP_KEY]= &map_key_value_types,
> > @@ -4817,6 +4818,7 @@ static const struct bpf_reg_types 
> > *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> >   [ARG_PTR_TO_PERCPU_BTF_ID]  = &percpu_btf_ptr_types,
> >   [ARG_PTR_TO_FUNC]   = &func_ptr_types,
> >   [ARG_PTR_TO_STACK_OR_NULL]  = &stack_ptr_types,
> > + [ARG_PTR_TO_CONST_STR]  = &const_str_ptr_types,
> >  };
> >
> >  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> > @@ -5067,6 +5069,45 @@ static int check_func_arg(struct bpf_verifier_env 
> > *env, u32 arg,
> >   if (err)
> >   return err;
> >   err = check_ptr_alignment(env, reg, 0, size, true);
> > + } else if (arg_type == ARG_PTR_TO_CONST_STR) {
> > + struct bpf_map *map = reg->map_ptr;
> > + int map_off;
> > + u64 map_addr;
> > + char *str_ptr;
> > +
> > + if (reg->type != PTR_TO_MAP_VALUE || !map ||
>
> I think the 'type' check is redundant,
> since check_reg_type() did it via compatible_reg_types.
> If so it's probably better to remove it here ?
>
> '!map' looks unnecessary. Can it ever happen? If yes, it's a verifier bug.
> For example in check_mem_access() we just deref reg->map_ptr without checking
> which, I think, is correct.

I agree with all of the above. I only thought it's better to be safe
than sorry but if you'd like I could follow up with a patch that
removes some checks?

> > + !bpf_map_is_rdonly(map)) {
>
> This check is needed, of course.
>
> > + verbose(env, "R%d does not point to a readonly 
> > map'\n", regno);
> > + return -EACCES;
> > + }
> > +
> > + if (!tnum_is_const(reg->var_off)) {
> > + verbose(env, "R%d is not a constant address'\n", 
> > regno);
> > + return -EACCES;
> > + }
> > +
> > + if (!map->ops->map_direct_value_addr) {
> > + verbose(env, "no direct value access support for this 
> > map type\n");
> > + return -EACCES;
> > + }
> > +
> > + err = check_map_access(env, regno, reg->off,
> > +map->value_size - reg->off, false);
> > + if (err)
> > + return err;
> > +
> > + map_off = reg->off + reg->var_off.value;
&g

Re: [PATCH bpf-next v5 0/6] Add a snprintf eBPF helper

2021-04-20 Thread Florent Revest
On Mon, Apr 19, 2021 at 9:34 PM Andrii Nakryiko wrote:
>
> On Mon, Apr 19, 2021 at 8:52 AM Florent Revest  wrote:
> >
> > We have a usecase where we want to audit symbol names (if available) in
> > callback registration hooks. (ex: fentry/nf_register_net_hook)
> >
> > A few months back, I proposed a bpf_kallsyms_lookup series but it was
> > decided in the reviews that a more generic helper, bpf_snprintf, would
> > be more useful.
> >
> > This series implements the helper according to the feedback received in
> > https://lore.kernel.org/bpf/20201126165748.1748417-1-rev...@google.com/T/#u
> >
> > - A new arg type guarantees the NULL-termination of string arguments and
> >   lets us pass format strings in only one arg
> > - A new helper is implemented using that guarantee. Because the format
> >   string is known at verification time, the format string validation is
> >   done by the verifier
> > - To implement a series of tests for bpf_snprintf, the logic for
> >   marshalling variadic args in a fixed-size array is reworked as per:
> > https://lore.kernel.org/bpf/20210310015455.1095207-1-rev...@chromium.org/T/#u
> >
> > ---
> > Changes in v5:
> > - Fixed the bpf_printf_buf_used counter logic in try_get_fmt_tmp_buf
> > - Added a couple of extra incorrect specifiers tests
> > - Call test_snprintf_single__destroy unconditionally
> > - Fixed a C++-style comment
> >
> > ---
> > Changes in v4:
> > - Moved bpf_snprintf, bpf_printf_prepare and bpf_printf_cleanup to
> >   kernel/bpf/helpers.c so that they get built without CONFIG_BPF_EVENTS
> > - Added negative test cases (various invalid format strings)
> > - Renamed put_fmt_tmp_buf() as bpf_printf_cleanup()
> > - Fixed a mistake that caused temporary buffers to be unconditionally
> >   freed in bpf_printf_prepare
> > - Fixed a mistake that caused missing 0 character to be ignored
> > - Fixed a warning about integer to pointer conversion
> > - Misc cleanups
> >
> > ---
> > Changes in v3:
> > - Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
> > - Made zero-termination check more consistent
> > - Allowed NULL output_buffer
> > - Simplified the BPF_CAST_FMT_ARG macro
> > - Three new test cases: number padding, simple string with no arg and
> >   string length extraction only with a NULL output buffer
> > - Clarified helper's description for edge cases (eg: str_size == 0)
> > - Lots of cosmetic changes
> >
> > ---
> > Changes in v2:
> > - Extracted the format validation/argument sanitization in a generic way
> >   for all printf-like helpers.
> > - bpf_snprintf's str_size can now be 0
> > - bpf_snprintf is now exposed to all BPF program types
> > - We now preempt_disable when using a per-cpu temporary buffer
> > - Addressed a few cosmetic changes
> >
> > Florent Revest (6):
> >   bpf: Factorize bpf_trace_printk and bpf_seq_printf
> >   bpf: Add a ARG_PTR_TO_CONST_STR argument type
> >   bpf: Add a bpf_snprintf helper
> >   libbpf: Initialize the bpf_seq_printf parameters array field by field
> >   libbpf: Introduce a BPF_SNPRINTF helper macro
> >   selftests/bpf: Add a series of tests for bpf_snprintf
> >
> >  include/linux/bpf.h   |  22 ++
> >  include/uapi/linux/bpf.h  |  28 ++
> >  kernel/bpf/helpers.c  | 306 ++
> >  kernel/bpf/verifier.c |  82 
> >  kernel/trace/bpf_trace.c  | 373 ++
> >  tools/include/uapi/linux/bpf.h|  28 ++
> >  tools/lib/bpf/bpf_tracing.h   |  58 ++-
> >  .../selftests/bpf/prog_tests/snprintf.c   | 125 ++
> >  .../selftests/bpf/progs/test_snprintf.c   |  73 
> >  .../bpf/progs/test_snprintf_single.c  |  20 +
> >  10 files changed, 770 insertions(+), 345 deletions(-)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
> >
> > --
> > 2.31.1.368.gbe11c130af-goog
> >
>
> Looks great, thank you!
>
> For the series:
>
> Acked-by: Andrii Nakryiko 

Thank you for the all the fast and in-depth reviews Andrii! :)


[PATCH bpf-next v5 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-04-19 Thread Florent Revest
Two helpers (trace_printk and seq_printf) have very similar
implementations of format string parsing and a third one is coming
(snprintf). To avoid code duplication and make the code easier to
maintain, this moves the operations associated with format string
parsing (validation and argument sanitization) into one generic
function.

The implementation of the two existing helpers already drifted quite a
bit so unifying them entailed a lot of changes:

- bpf_trace_printk always expected fmt[fmt_size] to be the terminating
  NULL character; this is no longer true, the first 0 now terminates.
- bpf_trace_printk now supports %% (which produces the percentage char).
- bpf_trace_printk now skips width formatting fields.
- bpf_trace_printk now supports the X modifier (capital hexadecimal).
- bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
- argument casting on 32 bit has been simplified into one macro and
  using an enum instead of obscure int increments.

- bpf_seq_printf now uses bpf_trace_copy_string instead of
  strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
- bpf_seq_printf now prints longs correctly on 32 bit architectures.

- both were changed to use a global per-cpu tmp buffer instead of one
  stack buffer for trace_printk and 6 small buffers for seq_printf.
- to avoid per-cpu buffer usage conflict, these helpers disable
  preemption while the per-cpu buffer is in use.
- both helpers now support the %ps and %pS specifiers to print symbols.

The implementation is also moved from bpf_trace.c to helpers.c because
the upcoming bpf_snprintf helper will be made available to all BPF
programs and will need it.

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h  |  20 +++
 kernel/bpf/helpers.c | 256 +++
 kernel/trace/bpf_trace.c | 371 ---
 3 files changed, 313 insertions(+), 334 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ff8cd68c01b3..77d1d8c65b81 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2077,4 +2077,24 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type 
t,
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
+enum bpf_printf_mod_type {
+   BPF_PRINTF_INT,
+   BPF_PRINTF_LONG,
+   BPF_PRINTF_LONG_LONG,
+};
+
+/* Workaround for getting va_list handling working with different argument type
+ * combinations generically for 32 and 64 bit archs.
+ */
+#define BPF_CAST_FMT_ARG(arg_nb, args, mod)\
+   (mod[arg_nb] == BPF_PRINTF_LONG_LONG || \
+(mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64)  \
+ ? (u64)args[arg_nb]   \
+ : (u32)args[arg_nb])
+
+int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
+  u64 *final_args, enum bpf_printf_mod_type *mod,
+  u32 num_args);
+void bpf_printf_cleanup(void);
+
 #endif /* _LINUX_BPF_H */
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index f306611c4ddf..9ca57eb1fc0d 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -669,6 +669,262 @@ const struct bpf_func_proto bpf_this_cpu_ptr_proto = {
.arg1_type  = ARG_PTR_TO_PERCPU_BTF_ID,
 };
 
+static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
+   size_t bufsz)
+{
+   void __user *user_ptr = (__force void __user *)unsafe_ptr;
+
+   buf[0] = 0;
+
+   switch (fmt_ptype) {
+   case 's':
+#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+   if ((unsigned long)unsafe_ptr < TASK_SIZE)
+   return strncpy_from_user_nofault(buf, user_ptr, bufsz);
+   fallthrough;
+#endif
+   case 'k':
+   return strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
+   case 'u':
+   return strncpy_from_user_nofault(buf, user_ptr, bufsz);
+   }
+
+   return -EINVAL;
+}
+
+/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p
+ */
+#define MAX_PRINTF_BUF_LEN 512
+
+struct bpf_printf_buf {
+   char tmp_buf[MAX_PRINTF_BUF_LEN];
+};
+static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
+static DEFINE_PER_CPU(int, bpf_printf_buf_used);
+
+static int try_get_fmt_tmp_buf(char **tmp_buf)
+{
+   struct bpf_printf_buf *bufs;
+   int used;
+
+   if (*tmp_buf)
+   return 0;
+
+   preempt_disable();
+   used = this_cpu_inc_return(bpf_printf_buf_used);
+   if (WARN_ON_ONCE(used > 1)) {
+   this_cpu_dec(bpf_printf_buf_used);
+   preempt_enable();
+   return -EBUSY;
+   }
+   bufs = this_cpu_ptr(&bpf_printf_buf);
+   *tmp_buf = bufs->tmp_buf;
+
+   return 0;
+}
+
+void bpf_printf_cleanup(void)
+{
+   if (this_cpu_read(bpf_printf_buf_used)) {
+  
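
For context, this is how a printf-like helper is expected to consume the
machinery above (an illustrative sketch mirroring the reworked
bpf_trace_printk/bpf_seq_printf callers later in the series, not standalone
code):

/* Sketch only: a printf-like helper using the shared format machinery. */
static int example_printf_like_helper(char *buf, size_t bufsz, char *fmt,
                                      u32 fmt_size, const u64 *raw_args)
{
    u64 args[3];
    enum bpf_printf_mod_type mod[3];
    int err;

    /* Validate the format string and sanitize/copy the arguments. */
    err = bpf_printf_prepare(fmt, fmt_size, raw_args, args, mod, 3);
    if (err < 0)
        return err;

    /* BPF_CAST_FMT_ARG() passes each vararg as u32 or u64 so that its
     * width matches what the specifier expects, even on 32-bit archs. */
    err = snprintf(buf, bufsz, fmt, BPF_CAST_FMT_ARG(0, args, mod),
                   BPF_CAST_FMT_ARG(1, args, mod),
                   BPF_CAST_FMT_ARG(2, args, mod));

    /* Release the per-cpu temp buffer (taken for %s/%p copies, if any)
     * and re-enable preemption. */
    bpf_printf_cleanup();

    return err;
}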

[PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

2021-04-19 Thread Florent Revest
The "positive" part tests all format specifiers when things go well.

The "negative" part makes sure that incorrect format strings fail at
load time.

Signed-off-by: Florent Revest 
---
 .../selftests/bpf/prog_tests/snprintf.c   | 125 ++
 .../selftests/bpf/progs/test_snprintf.c   |  73 ++
 .../bpf/progs/test_snprintf_single.c  |  20 +++
 3 files changed, 218 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c 
b/tools/testing/selftests/bpf/prog_tests/snprintf.c
new file mode 100644
index ..a958c22aec75
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include 
+#include "test_snprintf.skel.h"
+#include "test_snprintf_single.skel.h"
+
+#define EXP_NUM_OUT  "-8 9 96 -424242 1337 DABBAD00"
+#define EXP_NUM_RET  sizeof(EXP_NUM_OUT)
+
+#define EXP_IP_OUT   "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
+#define EXP_IP_RET   sizeof(EXP_IP_OUT)
+
+/* The third specifier, %pB, depends on compiler inlining so don't check it */
+#define EXP_SYM_OUT  "schedule schedule+0x0/"
+#define MIN_SYM_RET  sizeof(EXP_SYM_OUT)
+
+/* The third specifier, %p, is a hashed pointer which changes on every reboot 
*/
+#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
+#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
+
+#define EXP_STR_OUT  "str1 longstr"
+#define EXP_STR_RET  sizeof(EXP_STR_OUT)
+
+#define EXP_OVER_OUT "%over"
+#define EXP_OVER_RET 10
+
+#define EXP_PAD_OUT "    4 000"
+#define EXP_PAD_RET 97
+
+#define EXP_NO_ARG_OUT "simple case"
+#define EXP_NO_ARG_RET 12
+
+#define EXP_NO_BUF_RET 29
+
+void test_snprintf_positive(void)
+{
+   char exp_addr_out[] = EXP_ADDR_OUT;
+   char exp_sym_out[]  = EXP_SYM_OUT;
+   struct test_snprintf *skel;
+
+   skel = test_snprintf__open_and_load();
+   if (!ASSERT_OK_PTR(skel, "skel_open"))
+   return;
+
+   if (!ASSERT_OK(test_snprintf__attach(skel), "skel_attach"))
+   goto cleanup;
+
+   /* trigger tracepoint */
+   usleep(1);
+
+   ASSERT_STREQ(skel->bss->num_out, EXP_NUM_OUT, "num_out");
+   ASSERT_EQ(skel->bss->num_ret, EXP_NUM_RET, "num_ret");
+
+   ASSERT_STREQ(skel->bss->ip_out, EXP_IP_OUT, "ip_out");
+   ASSERT_EQ(skel->bss->ip_ret, EXP_IP_RET, "ip_ret");
+
+   ASSERT_OK(memcmp(skel->bss->sym_out, exp_sym_out,
+sizeof(exp_sym_out) - 1), "sym_out");
+   ASSERT_LT(MIN_SYM_RET, skel->bss->sym_ret, "sym_ret");
+
+   ASSERT_OK(memcmp(skel->bss->addr_out, exp_addr_out,
+sizeof(exp_addr_out) - 1), "addr_out");
+   ASSERT_EQ(skel->bss->addr_ret, EXP_ADDR_RET, "addr_ret");
+
+   ASSERT_STREQ(skel->bss->str_out, EXP_STR_OUT, "str_out");
+   ASSERT_EQ(skel->bss->str_ret, EXP_STR_RET, "str_ret");
+
+   ASSERT_STREQ(skel->bss->over_out, EXP_OVER_OUT, "over_out");
+   ASSERT_EQ(skel->bss->over_ret, EXP_OVER_RET, "over_ret");
+
+   ASSERT_STREQ(skel->bss->pad_out, EXP_PAD_OUT, "pad_out");
+   ASSERT_EQ(skel->bss->pad_ret, EXP_PAD_RET, "pad_ret");
+
+   ASSERT_STREQ(skel->bss->noarg_out, EXP_NO_ARG_OUT, "no_arg_out");
+   ASSERT_EQ(skel->bss->noarg_ret, EXP_NO_ARG_RET, "no_arg_ret");
+
+   ASSERT_EQ(skel->bss->nobuf_ret, EXP_NO_BUF_RET, "no_buf_ret");
+
+cleanup:
+   test_snprintf__destroy(skel);
+}
+
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+/* Loads an eBPF object calling bpf_snprintf with up to 10 characters of fmt */
+static int load_single_snprintf(char *fmt)
+{
+   struct test_snprintf_single *skel;
+   int ret;
+
+   skel = test_snprintf_single__open();
+   if (!skel)
+   return -EINVAL;
+
+   memcpy(skel->rodata->fmt, fmt, min(strlen(fmt) + 1, 10));
+
+   ret = test_snprintf_single__load(skel);
+   test_snprintf_single__destroy(skel);
+
+   return ret;
+}
+
+void test_snprintf_negative(void)
+{
+   ASSERT_OK(load_single_snprintf("valid %d"), "valid usage");
+
+   ASSERT_ERR(load_single_snprintf("0123456789"), "no terminating zero");
+   ASSERT_ERR(load_single_snprintf("%d %d"), "too many specifiers");
+   ASSERT_ERR(load_

[PATCH bpf-next v5 3/6] bpf: Add a bpf_snprintf helper

2021-04-19 Thread Florent Revest
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated read-only map so we don't need a format string length
arg.

Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.

Signed-off-by: Florent Revest 
Acked-by: Andrii Nakryiko 
---
 include/linux/bpf.h|  1 +
 include/uapi/linux/bpf.h   | 28 +++
 kernel/bpf/helpers.c   | 50 ++
 kernel/bpf/verifier.c  | 41 
 kernel/trace/bpf_trace.c   |  2 ++
 tools/include/uapi/linux/bpf.h | 28 +++
 6 files changed, 150 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index c160526fc8bf..f8a45f109e96 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1953,6 +1953,7 @@ extern const struct bpf_func_proto 
bpf_skc_to_tcp_request_sock_proto;
 extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_proto;
 extern const struct bpf_func_proto bpf_snprintf_btf_proto;
+extern const struct bpf_func_proto bpf_snprintf_proto;
 extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index df164a44bb41..ec6d85a81744 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4708,6 +4708,33 @@ union bpf_attr {
  * Return
  * The number of traversed map elements for success, **-EINVAL** 
for
  * invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 
data_len)
+ * Description
+ * Outputs a string into the **str** buffer of size **str_size**
+ * based on a format string stored in a read-only map pointed by
+ * **fmt**.
+ *
+ * Each format specifier in **fmt** corresponds to one u64 element
+ * in the **data** array. For strings and pointers where pointees
+ * are accessed, only the pointer values are stored in the *data*
+ * array. The *data_len* is the size of *data* in bytes.
+ *
+ * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ * memory. Reading kernel memory may fail due to either invalid
+ * address or valid address but requiring a major memory fault. If
+ * reading kernel memory fails, the string for **%s** will be an
+ * empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ * Not returning error to bpf program is consistent with what
+ * **bpf_trace_printk**\ () does for now.
+ *
+ * Return
+ * The strictly positive length of the formatted string, including
+ * the trailing zero character. If the return value is greater than
+ * **str_size**, **str** contains a truncated string, guaranteed to
+ * be zero-terminated except when **str_size** is 0.
+ *
+ * Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)  \
FN(unspec), \
@@ -4875,6 +4902,7 @@ union bpf_attr {
FN(sock_from_file), \
FN(check_mtu),  \
FN(for_each_map_elem),  \
+   FN(snprintf),   \
/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 9ca57eb1fc0d..85b26ca5aacd 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -925,6 +925,54 @@ int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 
*raw_args,
return err;
 }
 
+#define MAX_SNPRINTF_VARARGS   12
+
+BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
+  const void *, data, u32, data_len)
+{
+   enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS];
+   u64 args[MAX_SNPRINTF_VARARGS];
+   int err, num_args;
+
+   if (data_len % 8 || data_len > MAX_SNPRINTF_VARARGS * 8 ||
+   (data_len && !data))
+   return -EINVAL;
+   num_args = data_len / 8;
+
+   /* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
+* can safely give an unbounded size.
+*/
+   err = bpf_printf_prepare(fmt, UINT_MAX, data, args, mod, num_args);
+   
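
As a usage illustration (a sketch that assumes a libbpf/tooling tree that
already carries this series; the program and section names are made up), a BPF
program passes a u64 array plus its size in bytes as data/data_len, and the
returned length counts the trailing NUL as documented above:

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char out[64];
char comm[16];
long ret;

SEC("tp/syscalls/sys_enter_nanosleep")
int handle(void *ctx)
{
    static const char fmt[] = "comm=%s pid=%d"; /* .rodata, NUL-terminated */
    __u64 data[2];

    bpf_get_current_comm(comm, sizeof(comm));
    data[0] = (__u64)(unsigned long)comm;       /* %s consumes a pointer value */
    data[1] = bpf_get_current_pid_tgid() >> 32;

    /* On success, ret includes the trailing NUL (see the doc above). */
    ret = bpf_snprintf(out, sizeof(out), fmt, data, sizeof(data));
    return 0;
}

char _license[] SEC("license") = "GPL";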

[PATCH bpf-next v5 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro

2021-04-19 Thread Florent Revest
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.

Signed-off-by: Florent Revest 
Acked-by: Andrii Nakryiko 
---
 tools/lib/bpf/bpf_tracing.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 1c2e91ee041d..8c954ebc0c7c 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -447,4 +447,22 @@ static __always_inline typeof(name(0)) ____##name(struct 
pt_regs *ctx, ##args)
   ___param, sizeof(___param)); \
 })
 
+/*
+ * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead 
of
+ * an array of u64.
+ */
+#define BPF_SNPRINTF(out, out_size, fmt, args...)  \
+({ \
+   static const char ___fmt[] = fmt;   \
+   unsigned long long ___param[___bpf_narg(args)]; \
+   \
+   _Pragma("GCC diagnostic push")  \
+   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   ___bpf_fill(___param, args);\
+   _Pragma("GCC diagnostic pop")   \
+   \
+   bpf_snprintf(out, out_size, ___fmt, \
+___param, sizeof(___param));   \
+})
+
 #endif
-- 
2.31.1.368.gbe11c130af-goog
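
And the same call through the macro, applied to the use case that motivated
the series (see the cover letter); a sketch that assumes a vmlinux.h/BTF-enabled
build and a libbpf carrying this patch:

// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char msg[64];

SEC("fentry/nf_register_net_hook")
int BPF_PROG(audit_nf_hook, struct net *net, const struct nf_hook_ops *reg)
{
    /* %ps resolves the hook function pointer to a symbol name. */
    BPF_SNPRINTF(msg, sizeof(msg), "netfilter hook registered: %ps",
                 reg->hook);
    return 0;
}

char _license[] SEC("license") = "GPL";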



[PATCH bpf-next v5 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field

2021-04-19 Thread Florent Revest
When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.

Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest 
Acked-by: Andrii Nakryiko 
---
 tools/lib/bpf/bpf_tracing.h | 40 +++--
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index f9ef37707888..1c2e91ee041d 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -413,20 +413,38 @@ typeof(name(0)) name(struct pt_regs *ctx) 
\
 }  \
 static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
 
+#define ___bpf_fill0(arr, p, x) do {} while (0)
+#define ___bpf_fill1(arr, p, x) arr[p] = x
+#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, 
args)
+#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, 
args)
+#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, 
args)
+#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, 
args)
+#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, 
args)
+#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, 
args)
+#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, 
args)
+#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, 
args)
+#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, 
args)
+#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 
1, args)
+#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 
1, args)
+#define ___bpf_fill(arr, args...) \
+   ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
+
 /*
  * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
  * in a structure.
  */
-#define BPF_SEQ_PRINTF(seq, fmt, args...)  \
-   ({  \
-   _Pragma("GCC diagnostic push")  \
-   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
-   static const char ___fmt[] = fmt;   \
-   unsigned long long ___param[] = { args };   \
-   _Pragma("GCC diagnostic pop")   \
-   int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),\
-   ___param, sizeof(___param));\
-   ___ret; \
-   })
+#define BPF_SEQ_PRINTF(seq, fmt, args...)  \
+({ \
+   static const char ___fmt[] = fmt;   \
+   unsigned long long ___param[___bpf_narg(args)]; \
+   \
+   _Pragma("GCC diagnostic push")  \
+   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   ___bpf_fill(___param, args);\
+   _Pragma("GCC diagnostic pop")   \
+   \
+   bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
+  ___param, sizeof(___param)); \
+})
 
 #endif
-- 
2.31.1.368.gbe11c130af-goog
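
For context, this is the kind of caller the fix matters for: a minimal
bpf_iter sketch (illustrative only, assuming a vmlinux.h/BTF build) where one
BPF_SEQ_PRINTF call has only compile-time-constant arguments, which is exactly
the case where the old one-liner initializer landed in .rodata and pointer
arguments stayed NULL:

// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("iter/task")
int dump_task(struct bpf_iter__task *ctx)
{
    static const char banner[] = "tasks:";

    /* A single, compile-time-constant pointer argument: before this fix
     * the "%s" pointer could silently end up as NULL. */
    if (ctx->meta->seq_num == 0)
        BPF_SEQ_PRINTF(ctx->meta->seq, "%s\n", banner);

    if (ctx->task)
        BPF_SEQ_PRINTF(ctx->meta->seq, "pid=%d comm=%s\n",
                       ctx->task->pid, ctx->task->comm);
    return 0;
}

char _license[] SEC("license") = "GPL";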



[PATCH bpf-next v5 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-04-19 Thread Florent Revest
This type provides the guarantee that an argument is going to be a const
pointer to somewhere in a read-only map value. It also checks that this
pointer is followed by a zero character before the end of the map value.

Signed-off-by: Florent Revest 
Acked-by: Andrii Nakryiko 
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 41 +
 2 files changed, 42 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 77d1d8c65b81..c160526fc8bf 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -309,6 +309,7 @@ enum bpf_arg_type {
ARG_PTR_TO_PERCPU_BTF_ID,   /* pointer to in-kernel percpu type */
ARG_PTR_TO_FUNC,/* pointer to a bpf program function */
ARG_PTR_TO_STACK_OR_NULL,   /* pointer to stack or NULL */
+   ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only 
string */
__BPF_ARG_TYPE_MAX,
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 852541a435ef..5f46dd6f3383 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4787,6 +4787,7 @@ static const struct bpf_reg_types spin_lock_types = { 
.types = { PTR_TO_MAP_VALU
 static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { 
PTR_TO_PERCPU_BTF_ID } };
 static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } 
};
 static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK 
} };
+static const struct bpf_reg_types const_str_ptr_types = { .types = { 
PTR_TO_MAP_VALUE } };
 
 static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
[ARG_PTR_TO_MAP_KEY]= &map_key_value_types,
@@ -4817,6 +4818,7 @@ static const struct bpf_reg_types 
*compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
[ARG_PTR_TO_PERCPU_BTF_ID]  = &percpu_btf_ptr_types,
[ARG_PTR_TO_FUNC]   = &func_ptr_types,
[ARG_PTR_TO_STACK_OR_NULL]  = &stack_ptr_types,
+   [ARG_PTR_TO_CONST_STR]  = &const_str_ptr_types,
 };
 
 static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
@@ -5067,6 +5069,45 @@ static int check_func_arg(struct bpf_verifier_env *env, 
u32 arg,
if (err)
return err;
err = check_ptr_alignment(env, reg, 0, size, true);
+   } else if (arg_type == ARG_PTR_TO_CONST_STR) {
+   struct bpf_map *map = reg->map_ptr;
+   int map_off;
+   u64 map_addr;
+   char *str_ptr;
+
+   if (reg->type != PTR_TO_MAP_VALUE || !map ||
+   !bpf_map_is_rdonly(map)) {
+   verbose(env, "R%d does not point to a readonly map'\n", 
regno);
+   return -EACCES;
+   }
+
+   if (!tnum_is_const(reg->var_off)) {
+   verbose(env, "R%d is not a constant address'\n", regno);
+   return -EACCES;
+   }
+
+   if (!map->ops->map_direct_value_addr) {
+   verbose(env, "no direct value access support for this 
map type\n");
+   return -EACCES;
+   }
+
+   err = check_map_access(env, regno, reg->off,
+  map->value_size - reg->off, false);
+   if (err)
+   return err;
+
+   map_off = reg->off + reg->var_off.value;
+   err = map->ops->map_direct_value_addr(map, &map_addr, map_off);
+   if (err) {
+   verbose(env, "direct value access on string failed\n");
+   return err;
+   }
+
+   str_ptr = (char *)(long)(map_addr);
+   if (!strnchr(str_ptr + map_off, map->value_size - map_off, 0)) {
+   verbose(env, "string is not zero-terminated\n");
+   return -EINVAL;
+   }
}
 
return err;
-- 
2.31.1.368.gbe11c130af-goog
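
From a BPF program author's point of view, the practical consequence is that
the format string must be a constant pointer into a frozen read-only section,
typically a 'static const' array. A hedged sketch (it assumes a tooling tree
that already carries this series, and the exact verifier wording may differ):

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char out[16];

static const char good_fmt[] = "x=%d";  /* .rodata: read-only map value */
char bad_fmt[] = "x=%d";                /* .data: writable, would be rejected */

SEC("tp/syscalls/sys_enter_nanosleep")
int prog(void *ctx)
{
    __u64 args[1] = { 42 };

    bpf_snprintf(out, sizeof(out), good_fmt, args, sizeof(args));

    /* Passing bad_fmt instead would fail at load time with something
     * like "R3 does not point to a readonly map". */
    return 0;
}

char _license[] SEC("license") = "GPL";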



[PATCH bpf-next v5 0/6] Add a snprintf eBPF helper

2021-04-19 Thread Florent Revest
We have a usecase where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)

A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.

This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/20201126165748.1748417-1-rev...@google.com/T/#u

- A new arg type guarantees the NULL-termination of string arguments and
  lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
  string is known at verification time, the format string validation is
  done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
  marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/20210310015455.1095207-1-rev...@chromium.org/T/#u

---
Changes in v5:
- Fixed the bpf_printf_buf_used counter logic in try_get_fmt_tmp_buf
- Added a couple of extra incorrect specifiers tests
- Call test_snprintf_single__destroy unconditionally
- Fixed a C++-style comment

---
Changes in v4:
- Moved bpf_snprintf, bpf_printf_prepare and bpf_printf_cleanup to
  kernel/bpf/helpers.c so that they get built without CONFIG_BPF_EVENTS
- Added negative test cases (various invalid format strings)
- Renamed put_fmt_tmp_buf() as bpf_printf_cleanup()
- Fixed a mistake that caused temporary buffers to be unconditionally
  freed in bpf_printf_prepare
- Fixed a mistake that caused missing 0 character to be ignored
- Fixed a warning about integer to pointer conversion
- Misc cleanups

---
Changes in v3:
- Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
- Made zero-termination check more consistent
- Allowed NULL output_buffer
- Simplified the BPF_CAST_FMT_ARG macro
- Three new test cases: number padding, simple string with no arg and
  string length extraction only with a NULL output buffer
- Clarified helper's description for edge cases (eg: str_size == 0)
- Lots of cosmetic changes

---
Changes in v2:
- Extracted the format validation/argument sanitization in a generic way
  for all printf-like helpers.
- bpf_snprintf's str_size can now be 0
- bpf_snprintf is now exposed to all BPF program types
- We now preempt_disable when using a per-cpu temporary buffer
- Addressed a few cosmetic changes

Florent Revest (6):
  bpf: Factorize bpf_trace_printk and bpf_seq_printf
  bpf: Add a ARG_PTR_TO_CONST_STR argument type
  bpf: Add a bpf_snprintf helper
  libbpf: Initialize the bpf_seq_printf parameters array field by field
  libbpf: Introduce a BPF_SNPRINTF helper macro
  selftests/bpf: Add a series of tests for bpf_snprintf

 include/linux/bpf.h   |  22 ++
 include/uapi/linux/bpf.h  |  28 ++
 kernel/bpf/helpers.c  | 306 ++
 kernel/bpf/verifier.c |  82 
 kernel/trace/bpf_trace.c  | 373 ++
 tools/include/uapi/linux/bpf.h|  28 ++
 tools/lib/bpf/bpf_tracing.h   |  58 ++-
 .../selftests/bpf/prog_tests/snprintf.c   | 125 ++
 .../selftests/bpf/progs/test_snprintf.c   |  73 
 .../bpf/progs/test_snprintf_single.c  |  20 +
 10 files changed, 770 insertions(+), 345 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c

-- 
2.31.1.368.gbe11c130af-goog



Re: [PATCH bpf-next v4 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

2021-04-16 Thread Florent Revest
On Fri, Apr 16, 2021 at 1:20 AM Andrii Nakryiko wrote:
>
> On Wed, Apr 14, 2021 at 11:54 AM Florent Revest  wrote:
> > +/* Loads an eBPF object calling bpf_snprintf with up to 10 characters of 
> > fmt */
> > +static int load_single_snprintf(char *fmt)
> > +{
> > +   struct test_snprintf_single *skel;
> > +   int ret;
> > +
> > +   skel = test_snprintf_single__open();
> > +   if (!skel)
> > +   return -EINVAL;
> > +
> > +   memcpy(skel->rodata->fmt, fmt, min(strlen(fmt) + 1, 10));
> > +
> > +   ret = test_snprintf_single__load(skel);
> > +   if (!ret)
> > +   test_snprintf_single__destroy(skel);
>
> destroy unconditionally?

sweet!

> > +void test_snprintf_negative(void)
> > +{
> > +   ASSERT_OK(load_single_snprintf("valid %d"), "valid usage");
> > +
> > +   ASSERT_ERR(load_single_snprintf("0123456789"), "no terminating 
> > zero");
> > +   ASSERT_ERR(load_single_snprintf("%d %d"), "too many specifiers");
> > +   ASSERT_ERR(load_single_snprintf("%pi5"), "invalid specifier 1");
> > +   ASSERT_ERR(load_single_snprintf("%a"), "invalid specifier 2");
> > +   ASSERT_ERR(load_single_snprintf("%"), "invalid specifier 3");
> > +   ASSERT_ERR(load_single_snprintf("\x80"), "non ascii character");
> > +   ASSERT_ERR(load_single_snprintf("\x1"), "non printable character");
>
> Some more cases that came up in my mind:
>
> 1. %123987129387192387 -- long and unterminated specified
> 2. similarly %--- or something like that
>
> Do you think they are worth checking?

well, it doesn't hurt :) and it's very easy to add so no problem

> > +++ b/tools/testing/selftests/bpf/progs/test_snprintf_single.c
> > @@ -0,0 +1,20 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2021 Google LLC. */
> > +
> > +#include 
> > +#include 
> > +
> > +// The format string is filled from the userspace side such that loading 
> > fails
>
> C++ style format

Oopsie


Re: [PATCH bpf-next v3 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

2021-04-15 Thread Florent Revest
On Thu, Apr 15, 2021 at 12:16 AM Andrii Nakryiko wrote:
>
> On Wed, Apr 14, 2021 at 2:21 AM Florent Revest  wrote:
> >
> > On Wed, Apr 14, 2021 at 1:21 AM Andrii Nakryiko wrote:
> > >
> > > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest  
> > > wrote:
> > > >
> > > > This exercises most of the format specifiers.
> > > >
> > > > Signed-off-by: Florent Revest 
> > > > Acked-by: Andrii Nakryiko 
> > > > ---
> > >
> > > As I mentioned on another patch, we probably need negative tests even
> > > more than positive ones.
> >
> > Agreed.
> >
> > > I think an easy and nice way to do this is to have a separate BPF
> > > skeleton where fmt string and arguments are provided through read-only
> > > global variables, so that user-space can re-use the same BPF skeleton
> > > to simulate multiple cases. BPF program itself would just call
> > > bpf_snprintf() and store the returned result.
> >
> > Ah, great idea! I was thinking of having one skeleton for each but it
> > would be a bit much indeed.
> >
> > Because the format string needs to be in a read-only map though, I
> > hope it can be modified from userspace before loading. I'll try it out
> > and see :) If it doesn't work, I'll just use more skeletons.
>
> You need read-only variables (const volatile my_type). Their contents
> are statically verified by BPF verifier, yet user-space can pre-setup
> it at runtime.

Thanks :) v4 has negative fmt tests


Re: [PATCH bpf-next v3 3/6] bpf: Add a bpf_snprintf helper

2021-04-15 Thread Florent Revest
On Thu, Apr 15, 2021 at 12:57 AM Andrii Nakryiko wrote:
>
> On Wed, Apr 14, 2021 at 2:46 AM Florent Revest  wrote:
> >
> > On Wed, Apr 14, 2021 at 1:16 AM Andrii Nakryiko wrote:
> > > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest  
> > > wrote:
> > > > +
> > > > +   return err + 1;
> > >
> > > snprintf() already returns string length *including* terminating zero,
> > > so this is wrong
> >
> > lib/vsprintf.c says:
> >  * The return value is the number of characters which would be
> >  * generated for the given input, excluding the trailing null,
> >  * as per ISO C99.
> >
> > Also if I look at the "no arg" test case in the selftest patch.
> > "simple case" is asserted to return 12 which seems correct to me
> > (includes the terminating zero only once). Am I missing something ?
> >
>
> no, you are right, but that means that bpf_trace_printk is broken, it
> doesn't do + 1 (which threw me off here), shall we fix that?

Answered in the 1/6 thread

> > However that makes me wonder whether it would be more appropriate to
> > return the value excluding the trailing null. On one hand it makes
> > sense to be coherent with other BPF helpers that include the trailing
> > zero (as discussed in patch v1), on the other hand the helper is
> > clearly named after the standard "snprintf" function and it's likely
> > that users will assume it works the same as the std snprintf.
>
>
> Having zero included simplifies BPF code tremendously for cases like
> bpf_probe_read_str(). So no, let's stick with including zero
> terminator in return size.

Cool :)


Re: [PATCH bpf-next v4 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-04-15 Thread Florent Revest
On Thu, Apr 15, 2021 at 2:38 AM Andrii Nakryiko wrote:
 wrote:
> On Wed, Apr 14, 2021 at 11:54 AM Florent Revest  wrote:
> > +static int try_get_fmt_tmp_buf(char **tmp_buf)
> > +{
> > +   struct bpf_printf_buf *bufs;
> > +   int used;
> > +
> > +   if (*tmp_buf)
> > +   return 0;
> > +
> > +   preempt_disable();
> > +   used = this_cpu_inc_return(bpf_printf_buf_used);
> > +   if (WARN_ON_ONCE(used > 1)) {
> > +   this_cpu_dec(bpf_printf_buf_used);
>
> this makes me uncomfortable. If used > 1, you won't preempt_enable()
> here, but you'll decrease count. Then later bpf_printf_cleanup() will
> be called (inside bpf_printf_prepare()) and will further decrease
> count (which it didn't increase, so it's a mess now).

Awkward, yes. :( This code is untested because it only covers a niche
preempt_rt usecase that is hard to reproduce but I should have thought
harder about these corner cases.

> > +   i += 2;
> > +   if (!final_args)
> > +   goto fmt_next;
> > +
> > +   if (try_get_fmt_tmp_buf(&tmp_buf)) {
> > +   err = -EBUSY;
> > +   goto out;
>
> this probably should bypass doing bpf_printf_cleanup() and
> try_get_fmt_tmp_buf() should enable preemption internally on error.

Yes. I'll fix this and spend some more brain cycles thinking about
what I'm doing. ;)

> > -static __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
> > +BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
> > +  u64, arg2, u64, arg3)
> >  {
> > +   u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
> > +   enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
> > static char buf[BPF_TRACE_PRINTK_SIZE];
> > unsigned long flags;
> > -   va_list ap;
> > int ret;
> >
> > -   raw_spin_lock_irqsave(&trace_printk_lock, flags);
> > -   va_start(ap, fmt);
> > -   ret = vsnprintf(buf, sizeof(buf), fmt, ap);
> > -   va_end(ap);
> > -   /* vsnprintf() will not append null for zero-length strings */
> > +   ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
> > +MAX_TRACE_PRINTK_VARARGS);
> > +   if (ret < 0)
> > +   return ret;
> > +
> > +   ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, 
> > mod),
> > +   BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, 
> > mod));
> > +   /* snprintf() will not append null for zero-length strings */
> > if (ret == 0)
> > buf[0] = '\0';
> > +
> > +   raw_spin_lock_irqsave(&trace_printk_lock, flags);
> > trace_bpf_trace_printk(buf);
> > raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
> >
> > -   return ret;
>
> see here, no + 1 :(

I wonder if it's a bug or a feature though. The helper documentation
says the helper returns "the number of bytes written to the buffer". I
am not familiar with the internals of trace_printk but if the
terminating \0 is not outputted in the trace_printk buffer, then it
kind of makes sense.

Also, if anyone uses this return value, I can imagine that the usecase
would be if (ret == 0) assume_nothing_was_written(). And if we
suddenly output 1 here, we might break something.

Because the helper is quite old, maybe we should improve the helper
documentation instead? Your call :)
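
For the record, the v5 version of try_get_fmt_tmp_buf() quoted earlier in this
digest resolves the counter/preemption issue by undoing its own increment and
re-enabling preemption in the error path itself, in line with the suggestion
above; condensed:

static int try_get_fmt_tmp_buf(char **tmp_buf)
{
    struct bpf_printf_buf *bufs;
    int used;

    if (*tmp_buf)
        return 0;

    preempt_disable();
    used = this_cpu_inc_return(bpf_printf_buf_used);
    if (WARN_ON_ONCE(used > 1)) {
        /* Undo our own increment and re-enable preemption before
         * returning, so the caller can bail out directly. */
        this_cpu_dec(bpf_printf_buf_used);
        preempt_enable();
        return -EBUSY;
    }
    bufs = this_cpu_ptr(&bpf_printf_buf);
    *tmp_buf = bufs->tmp_buf;

    return 0;
}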


Re: [PATCH] selftests/bpf: Fix the ASSERT_ERR_PTR macro

2021-04-15 Thread Florent Revest
On Thu, Apr 15, 2021 at 2:28 AM Andrii Nakryiko wrote:
 wrote:
> On Wed, Apr 14, 2021 at 11:58 AM Martin KaFai Lau  wrote:
> > On Wed, Apr 14, 2021 at 05:56:32PM +0200, Florent Revest wrote:
> > > It is just missing a ';'. This macro is not used by any test yet.
> > >
> > > Signed-off-by: Florent Revest 
> > Fixes: 22ba36351631 ("selftests/bpf: Move and extend ASSERT_xxx() testing 
> > macros")
> >
>
> Thanks, Martin. Added Fixes tag and applied to bpf-next.
>
> > Since it has not been used, it could be bpf-next.  Please also tag
> > it in the future.

Sorry about that, I'll make sure I remember it next time :)

> > Acked-by: Martin KaFai Lau 


[PATCH bpf-next v4 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field

2021-04-14 Thread Florent Revest
When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.

Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest 
Acked-by: Andrii Nakryiko 
---
 tools/lib/bpf/bpf_tracing.h | 40 +++--
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index f9ef37707888..1c2e91ee041d 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -413,20 +413,38 @@ typeof(name(0)) name(struct pt_regs *ctx) 
\
 }  \
 static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
 
+#define ___bpf_fill0(arr, p, x) do {} while (0)
+#define ___bpf_fill1(arr, p, x) arr[p] = x
+#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, 
args)
+#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, 
args)
+#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, 
args)
+#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, 
args)
+#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, 
args)
+#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, 
args)
+#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, 
args)
+#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, 
args)
+#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, 
args)
+#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 
1, args)
+#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 
1, args)
+#define ___bpf_fill(arr, args...) \
+   ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
+
 /*
  * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
  * in a structure.
  */
-#define BPF_SEQ_PRINTF(seq, fmt, args...)  \
-   ({  \
-   _Pragma("GCC diagnostic push")  \
-   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
-   static const char ___fmt[] = fmt;   \
-   unsigned long long ___param[] = { args };   \
-   _Pragma("GCC diagnostic pop")   \
-   int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),\
-   ___param, sizeof(___param));\
-   ___ret; \
-   })
+#define BPF_SEQ_PRINTF(seq, fmt, args...)  \
+({ \
+   static const char ___fmt[] = fmt;   \
+   unsigned long long ___param[___bpf_narg(args)]; \
+   \
+   _Pragma("GCC diagnostic push")  \
+   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   ___bpf_fill(___param, args);\
+   _Pragma("GCC diagnostic pop")   \
+   \
+   bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
+  ___param, sizeof(___param)); \
+})
 
 #endif
-- 
2.31.1.295.g9ea45b61b8-goog



[PATCH bpf-next v4 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

2021-04-14 Thread Florent Revest
The "positive" part tests all format specifiers when things go well.

The "negative" part makes sure that incorrect format strings fail at
load time.

Signed-off-by: Florent Revest 
---
 .../selftests/bpf/prog_tests/snprintf.c   | 124 ++
 .../selftests/bpf/progs/test_snprintf.c   |  73 +++
 .../bpf/progs/test_snprintf_single.c  |  20 +++
 3 files changed, 217 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c 
b/tools/testing/selftests/bpf/prog_tests/snprintf.c
new file mode 100644
index ..661ffb390b4a
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include 
+#include "test_snprintf.skel.h"
+#include "test_snprintf_single.skel.h"
+
+#define EXP_NUM_OUT  "-8 9 96 -424242 1337 DABBAD00"
+#define EXP_NUM_RET  sizeof(EXP_NUM_OUT)
+
+#define EXP_IP_OUT   "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
+#define EXP_IP_RET   sizeof(EXP_IP_OUT)
+
+/* The third specifier, %pB, depends on compiler inlining so don't check it */
+#define EXP_SYM_OUT  "schedule schedule+0x0/"
+#define MIN_SYM_RET  sizeof(EXP_SYM_OUT)
+
+/* The third specifier, %p, is a hashed pointer which changes on every reboot 
*/
+#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
+#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
+
+#define EXP_STR_OUT  "str1 longstr"
+#define EXP_STR_RET  sizeof(EXP_STR_OUT)
+
+#define EXP_OVER_OUT "%over"
+#define EXP_OVER_RET 10
+
+#define EXP_PAD_OUT "4 000"
+#define EXP_PAD_RET 97
+
+#define EXP_NO_ARG_OUT "simple case"
+#define EXP_NO_ARG_RET 12
+
+#define EXP_NO_BUF_RET 29
+
+void test_snprintf_positive(void)
+{
+   char exp_addr_out[] = EXP_ADDR_OUT;
+   char exp_sym_out[]  = EXP_SYM_OUT;
+   struct test_snprintf *skel;
+
+   skel = test_snprintf__open_and_load();
+   if (!ASSERT_OK_PTR(skel, "skel_open"))
+   return;
+
+   if (!ASSERT_OK(test_snprintf__attach(skel), "skel_attach"))
+   goto cleanup;
+
+   /* trigger tracepoint */
+   usleep(1);
+
+   ASSERT_STREQ(skel->bss->num_out, EXP_NUM_OUT, "num_out");
+   ASSERT_EQ(skel->bss->num_ret, EXP_NUM_RET, "num_ret");
+
+   ASSERT_STREQ(skel->bss->ip_out, EXP_IP_OUT, "ip_out");
+   ASSERT_EQ(skel->bss->ip_ret, EXP_IP_RET, "ip_ret");
+
+   ASSERT_OK(memcmp(skel->bss->sym_out, exp_sym_out,
+sizeof(exp_sym_out) - 1), "sym_out");
+   ASSERT_LT(MIN_SYM_RET, skel->bss->sym_ret, "sym_ret");
+
+   ASSERT_OK(memcmp(skel->bss->addr_out, exp_addr_out,
+sizeof(exp_addr_out) - 1), "addr_out");
+   ASSERT_EQ(skel->bss->addr_ret, EXP_ADDR_RET, "addr_ret");
+
+   ASSERT_STREQ(skel->bss->str_out, EXP_STR_OUT, "str_out");
+   ASSERT_EQ(skel->bss->str_ret, EXP_STR_RET, "str_ret");
+
+   ASSERT_STREQ(skel->bss->over_out, EXP_OVER_OUT, "over_out");
+   ASSERT_EQ(skel->bss->over_ret, EXP_OVER_RET, "over_ret");
+
+   ASSERT_STREQ(skel->bss->pad_out, EXP_PAD_OUT, "pad_out");
+   ASSERT_EQ(skel->bss->pad_ret, EXP_PAD_RET, "pad_ret");
+
+   ASSERT_STREQ(skel->bss->noarg_out, EXP_NO_ARG_OUT, "no_arg_out");
+   ASSERT_EQ(skel->bss->noarg_ret, EXP_NO_ARG_RET, "no_arg_ret");
+
+   ASSERT_EQ(skel->bss->nobuf_ret, EXP_NO_BUF_RET, "no_buf_ret");
+
+cleanup:
+   test_snprintf__destroy(skel);
+}
+
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+/* Loads an eBPF object calling bpf_snprintf with up to 10 characters of fmt */
+static int load_single_snprintf(char *fmt)
+{
+   struct test_snprintf_single *skel;
+   int ret;
+
+   skel = test_snprintf_single__open();
+   if (!skel)
+   return -EINVAL;
+
+   memcpy(skel->rodata->fmt, fmt, min(strlen(fmt) + 1, 10));
+
+   ret = test_snprintf_single__load(skel);
+   if (!ret)
+   test_snprintf_single__destroy(skel);
+
+   return ret;
+}
+
+void test_snprintf_negative(void)
+{
+   ASSERT_OK(load_single_snprintf("valid %d"), "valid usage");
+
+   ASSERT_ERR(load_single_snprintf("0123456789"), "no terminating zero");
+   ASSERT_ERR(load_single_snprintf("%d %d"), "too many specifiers")

[PATCH bpf-next v4 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro

2021-04-14 Thread Florent Revest
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.
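
For illustration, a minimal usage sketch from a BPF program could look
like the following (hypothetical example, not part of this patch; the
section name and trigger are arbitrary):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  char out[64] = {};
  long ret = 0;

  SEC("raw_tp/sys_enter")
  int handler(const void *ctx)
  {
          const char name[] = "foo";

          /* The macro stores 42 and the pointer to name[] in a u64 array
           * and passes that array and its size to bpf_snprintf().
           */
          ret = BPF_SNPRINTF(out, sizeof(out), "id=%d name=%s", 42, name);
          return 0;
  }

  char _license[] SEC("license") = "GPL";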

Signed-off-by: Florent Revest 
Acked-by: Andrii Nakryiko 
---
 tools/lib/bpf/bpf_tracing.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 1c2e91ee041d..8c954ebc0c7c 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -447,4 +447,22 @@ static __always_inline typeof(name(0)) ##name(struct 
pt_regs *ctx, ##args)
   ___param, sizeof(___param)); \
 })
 
+/*
+ * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead 
of
+ * an array of u64.
+ */
+#define BPF_SNPRINTF(out, out_size, fmt, args...)  \
+({ \
+   static const char ___fmt[] = fmt;   \
+   unsigned long long ___param[___bpf_narg(args)]; \
+   \
+   _Pragma("GCC diagnostic push")  \
+   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   ___bpf_fill(___param, args);\
+   _Pragma("GCC diagnostic pop")   \
+   \
+   bpf_snprintf(out, out_size, ___fmt, \
+___param, sizeof(___param));   \
+})
+
 #endif
-- 
2.31.1.295.g9ea45b61b8-goog



[PATCH bpf-next v4 3/6] bpf: Add a bpf_snprintf helper

2021-04-14 Thread Florent Revest
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated string in a read-only map value, so we don't need a
format string length arg.

Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.
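
For reference, a call without the libbpf convenience macro (added later
in this series) could look roughly like this; the fragment below assumes
a BPF program context and only uses helpers that already exist today:

  static const char fmt[] = "pid=%d cpu=%u";  /* ends up in .rodata, a read-only map */
  unsigned long long data[2];
  char out[32];
  long ret;

  data[0] = bpf_get_current_pid_tgid() >> 32;  /* one u64 element per specifier */
  data[1] = bpf_get_smp_processor_id();

  /* str, str_size, fmt (read-only map), data array, data size in bytes */
  ret = bpf_snprintf(out, sizeof(out), fmt, data, sizeof(data));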

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h|  1 +
 include/uapi/linux/bpf.h   | 28 +++
 kernel/bpf/helpers.c   | 50 ++
 kernel/bpf/verifier.c  | 41 
 kernel/trace/bpf_trace.c   |  2 ++
 tools/include/uapi/linux/bpf.h | 28 +++
 6 files changed, 150 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index c160526fc8bf..f8a45f109e96 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1953,6 +1953,7 @@ extern const struct bpf_func_proto 
bpf_skc_to_tcp_request_sock_proto;
 extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_proto;
 extern const struct bpf_func_proto bpf_snprintf_btf_proto;
+extern const struct bpf_func_proto bpf_snprintf_proto;
 extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index df164a44bb41..ec6d85a81744 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4708,6 +4708,33 @@ union bpf_attr {
  * Return
  * The number of traversed map elements for success, **-EINVAL** 
for
  * invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 
data_len)
+ * Description
+ * Outputs a string into the **str** buffer of size **str_size**
+ * based on a format string stored in a read-only map pointed by
+ * **fmt**.
+ *
+ * Each format specifier in **fmt** corresponds to one u64 element
+ * in the **data** array. For strings and pointers where pointees
+ * are accessed, only the pointer values are stored in the *data*
+ * array. The *data_len* is the size of *data* in bytes.
+ *
+ * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ * memory. Reading kernel memory may fail due to either invalid
+ * address or valid address but requiring a major memory fault. If
+ * reading kernel memory fails, the string for **%s** will be an
+ * empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ * Not returning error to bpf program is consistent with what
+ * **bpf_trace_printk**\ () does for now.
+ *
+ * Return
+ * The strictly positive length of the formatted string, including
+ * the trailing zero character. If the return value is greater than
+ * **str_size**, **str** contains a truncated string, guaranteed to
+ * be zero-terminated except when **str_size** is 0.
+ *
+ * Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)  \
FN(unspec), \
@@ -4875,6 +4902,7 @@ union bpf_attr {
FN(sock_from_file), \
FN(check_mtu),  \
FN(for_each_map_elem),  \
+   FN(snprintf),   \
/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index ff427f5b3358..9a58518d72dc 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -923,6 +923,54 @@ int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 
*raw_args,
return err;
 }
 
+#define MAX_SNPRINTF_VARARGS   12
+
+BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
+  const void *, data, u32, data_len)
+{
+   enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS];
+   u64 args[MAX_SNPRINTF_VARARGS];
+   int err, num_args;
+
+   if (data_len % 8 || data_len > MAX_SNPRINTF_VARARGS * 8 ||
+   (data_len && !data))
+   return -EINVAL;
+   num_args = data_len / 8;
+
+   /* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
+* can safely give an unbounded size.
+*/
+   err = bpf_printf_prepare(fmt, UINT_MAX, data, args, mod, num_args);
+   if (err < 0)
+   

[PATCH bpf-next v4 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-04-14 Thread Florent Revest
This type provides the guarantee that an argument is going to be a const
pointer to somewhere in a read-only map value. It also checks that this
pointer is followed by a zero character before the end of the map value.
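
A helper opts into this check through its function prototype. For
instance, the bpf_snprintf helper added later in this series uses it for
its format argument, roughly as below (sketch; the arg types shown are
the ones used by this series):

  const struct bpf_func_proto bpf_snprintf_proto = {
          .func           = bpf_snprintf,
          .gpl_only       = true,
          .ret_type       = RET_INTEGER,
          .arg1_type      = ARG_PTR_TO_MEM_OR_NULL,   /* str, may be NULL  */
          .arg2_type      = ARG_CONST_SIZE_OR_ZERO,   /* str_size          */
          .arg3_type      = ARG_PTR_TO_CONST_STR,     /* fmt, checked here */
          .arg4_type      = ARG_PTR_TO_MEM_OR_NULL,   /* data, may be NULL */
          .arg5_type      = ARG_CONST_SIZE_OR_ZERO,   /* data_len          */
  };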

Signed-off-by: Florent Revest 
Acked-by: Andrii Nakryiko 
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 41 +
 2 files changed, 42 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 77d1d8c65b81..c160526fc8bf 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -309,6 +309,7 @@ enum bpf_arg_type {
ARG_PTR_TO_PERCPU_BTF_ID,   /* pointer to in-kernel percpu type */
ARG_PTR_TO_FUNC,/* pointer to a bpf program function */
ARG_PTR_TO_STACK_OR_NULL,   /* pointer to stack or NULL */
+   ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only 
string */
__BPF_ARG_TYPE_MAX,
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 852541a435ef..5f46dd6f3383 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4787,6 +4787,7 @@ static const struct bpf_reg_types spin_lock_types = { 
.types = { PTR_TO_MAP_VALU
 static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { 
PTR_TO_PERCPU_BTF_ID } };
 static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } 
};
 static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK 
} };
+static const struct bpf_reg_types const_str_ptr_types = { .types = { 
PTR_TO_MAP_VALUE } };
 
 static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
        [ARG_PTR_TO_MAP_KEY]            = &map_key_value_types,
@@ -4817,6 +4818,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
        [ARG_PTR_TO_PERCPU_BTF_ID]      = &percpu_btf_ptr_types,
        [ARG_PTR_TO_FUNC]               = &func_ptr_types,
        [ARG_PTR_TO_STACK_OR_NULL]      = &stack_ptr_types,
+       [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
 };
 
 static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
@@ -5067,6 +5069,45 @@ static int check_func_arg(struct bpf_verifier_env *env, 
u32 arg,
if (err)
return err;
err = check_ptr_alignment(env, reg, 0, size, true);
+   } else if (arg_type == ARG_PTR_TO_CONST_STR) {
+   struct bpf_map *map = reg->map_ptr;
+   int map_off;
+   u64 map_addr;
+   char *str_ptr;
+
+   if (reg->type != PTR_TO_MAP_VALUE || !map ||
+   !bpf_map_is_rdonly(map)) {
+   verbose(env, "R%d does not point to a readonly map'\n", 
regno);
+   return -EACCES;
+   }
+
+   if (!tnum_is_const(reg->var_off)) {
+   verbose(env, "R%d is not a constant address'\n", regno);
+   return -EACCES;
+   }
+
+   if (!map->ops->map_direct_value_addr) {
+   verbose(env, "no direct value access support for this 
map type\n");
+   return -EACCES;
+   }
+
+   err = check_map_access(env, regno, reg->off,
+  map->value_size - reg->off, false);
+   if (err)
+   return err;
+
+   map_off = reg->off + reg->var_off.value;
+   err = map->ops->map_direct_value_addr(map, &map_addr, map_off);
+   if (err) {
+   verbose(env, "direct value access on string failed\n");
+   return err;
+   }
+
+   str_ptr = (char *)(long)(map_addr);
+   if (!strnchr(str_ptr + map_off, map->value_size - map_off, 0)) {
+   verbose(env, "string is not zero-terminated\n");
+   return -EINVAL;
+   }
}
 
return err;
-- 
2.31.1.295.g9ea45b61b8-goog



[PATCH bpf-next v4 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-04-14 Thread Florent Revest
Two helpers (trace_printk and seq_printf) have very similar
implementations of format string parsing and a third one is coming
(snprintf). To avoid code duplication and make the code easier to
maintain, this moves the operations associated with format string
parsing (validation and argument sanitization) into one generic
function.

The implementation of the two existing helpers already drifted quite a
bit so unifying them entailed a lot of changes:

- bpf_trace_printk always expected fmt[fmt_size] to be the terminating
  NULL character, this is no longer true, the first 0 is terminating.
- bpf_trace_printk now supports %% (which produces the percentage char).
- bpf_trace_printk now skips width formatting fields.
- bpf_trace_printk now supports the X modifier (capital hexadecimal).
- bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
- argument casting on 32 bit has been simplified into one macro and
  using an enum instead of obscure int increments.

- bpf_seq_printf now uses bpf_trace_copy_string instead of
  strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
- bpf_seq_printf now prints longs correctly on 32 bit architectures.

- both were changed to use a global per-cpu tmp buffer instead of one
  stack buffer for trace_printk and 6 small buffers for seq_printf.
- to avoid per-cpu buffer usage conflict, these helpers disable
  preemption while the per-cpu buffer is in use.
- both helpers now support the %ps and %pS specifiers to print symbols.

The implementation is also moved from bpf_trace.c to helpers.c because
the upcoming bpf_snprintf helper will be made available to all BPF
programs and will need it.
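
The resulting calling pattern for a printf-like helper is roughly the
following (sketch; the snprintf() sink and MAX_SNPRINTF_VARARGS come
from the bpf_snprintf patch later in this series, and only the first
three casts are shown):

  u64 args[MAX_SNPRINTF_VARARGS];
  enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS];
  int err;

  /* Validate fmt and sanitize raw_args: pointer arguments are copied into
   * a per-cpu temporary buffer and, while that buffer is in use,
   * preemption stays disabled until bpf_printf_cleanup() is called.
   */
  err = bpf_printf_prepare(fmt, fmt_size, raw_args, args, mod, num_args);
  if (err < 0)
          return err;

  err = snprintf(str, str_size, fmt,
                 BPF_CAST_FMT_ARG(0, args, mod),
                 BPF_CAST_FMT_ARG(1, args, mod),
                 BPF_CAST_FMT_ARG(2, args, mod) /* ..., up to 11 */);

  bpf_printf_cleanup();  /* release the per-cpu buffer */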

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h  |  20 +++
 kernel/bpf/helpers.c | 254 +++
 kernel/trace/bpf_trace.c | 371 ---
 3 files changed, 311 insertions(+), 334 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ff8cd68c01b3..77d1d8c65b81 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2077,4 +2077,24 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type 
t,
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
+enum bpf_printf_mod_type {
+   BPF_PRINTF_INT,
+   BPF_PRINTF_LONG,
+   BPF_PRINTF_LONG_LONG,
+};
+
+/* Workaround for getting va_list handling working with different argument type
+ * combinations generically for 32 and 64 bit archs.
+ */
+#define BPF_CAST_FMT_ARG(arg_nb, args, mod)\
+   (mod[arg_nb] == BPF_PRINTF_LONG_LONG || \
+(mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64)  \
+ ? (u64)args[arg_nb]   \
+ : (u32)args[arg_nb])
+
+int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
+  u64 *final_args, enum bpf_printf_mod_type *mod,
+  u32 num_args);
+void bpf_printf_cleanup(void);
+
 #endif /* _LINUX_BPF_H */
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index f306611c4ddf..ff427f5b3358 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -669,6 +669,260 @@ const struct bpf_func_proto bpf_this_cpu_ptr_proto = {
.arg1_type  = ARG_PTR_TO_PERCPU_BTF_ID,
 };
 
+static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
+   size_t bufsz)
+{
+   void __user *user_ptr = (__force void __user *)unsafe_ptr;
+
+   buf[0] = 0;
+
+   switch (fmt_ptype) {
+   case 's':
+#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+   if ((unsigned long)unsafe_ptr < TASK_SIZE)
+   return strncpy_from_user_nofault(buf, user_ptr, bufsz);
+   fallthrough;
+#endif
+   case 'k':
+   return strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
+   case 'u':
+   return strncpy_from_user_nofault(buf, user_ptr, bufsz);
+   }
+
+   return -EINVAL;
+}
+
+/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p
+ */
+#define MAX_PRINTF_BUF_LEN 512
+
+struct bpf_printf_buf {
+   char tmp_buf[MAX_PRINTF_BUF_LEN];
+};
+static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
+static DEFINE_PER_CPU(int, bpf_printf_buf_used);
+
+static int try_get_fmt_tmp_buf(char **tmp_buf)
+{
+   struct bpf_printf_buf *bufs;
+   int used;
+
+   if (*tmp_buf)
+   return 0;
+
+   preempt_disable();
+   used = this_cpu_inc_return(bpf_printf_buf_used);
+   if (WARN_ON_ONCE(used > 1)) {
+   this_cpu_dec(bpf_printf_buf_used);
+   return -EBUSY;
+   }
+   bufs = this_cpu_ptr(&bpf_printf_buf);
+   *tmp_buf = bufs->tmp_buf;
+
+   return 0;
+}
+
+void bpf_printf_cleanup(void)
+{
+   if (this_cpu_read(bpf_printf_buf_used)) {
+   this_cpu_dec(bp

[PATCH bpf-next v4 0/6] Add a snprintf eBPF helper

2021-04-14 Thread Florent Revest
We have a use case where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)

A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.

This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/20201126165748.1748417-1-rev...@google.com/T/#u

- A new arg type guarantees the NULL-termination of string arguments and
  lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
  string is known at verification time, the format string validation is
  done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
  marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/20210310015455.1095207-1-rev...@chromium.org/T/#u

---
Changes in v4:
- Moved bpf_snprintf, bpf_printf_prepare and bpf_printf_cleanup to
  kernel/bpf/helpers.c so that they get built without CONFIG_BPF_EVENTS
- Added negative test cases (various invalid format strings)
- Renamed put_fmt_tmp_buf() as bpf_printf_cleanup()
- Fixed a mistake that caused temporary buffers to be unconditionally
  freed in bpf_printf_prepare
- Fixed a mistake that caused missing 0 character to be ignored
- Fixed a warning about integer to pointer conversion
- Misc cleanups

---
Changes in v3:
- Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
- Made zero-termination check more consistent
- Allowed NULL output_buffer
- Simplified the BPF_CAST_FMT_ARG macro
- Three new test cases: number padding, simple string with no arg and
  string length extraction only with a NULL output buffer
- Clarified helper's description for edge cases (eg: str_size == 0)
- Lots of cosmetic changes

---
Changes in v2:
- Extracted the format validation/argument sanitization in a generic way
  for all printf-like helpers.
- bpf_snprintf's str_size can now be 0
- bpf_snprintf is now exposed to all BPF program types
- We now preempt_disable when using a per-cpu temporary buffer
- Addressed a few cosmetic changes

Florent Revest (6):
  bpf: Factorize bpf_trace_printk and bpf_seq_printf
  bpf: Add a ARG_PTR_TO_CONST_STR argument type
  bpf: Add a bpf_snprintf helper
  libbpf: Initialize the bpf_seq_printf parameters array field by field
  libbpf: Introduce a BPF_SNPRINTF helper macro
  selftests/bpf: Add a series of tests for bpf_snprintf

 include/linux/bpf.h   |  22 ++
 include/uapi/linux/bpf.h  |  28 ++
 kernel/bpf/helpers.c  | 304 ++
 kernel/bpf/verifier.c |  82 
 kernel/trace/bpf_trace.c  | 373 ++
 tools/include/uapi/linux/bpf.h|  28 ++
 tools/lib/bpf/bpf_tracing.h   |  58 ++-
 .../selftests/bpf/prog_tests/snprintf.c   | 124 ++
 .../selftests/bpf/progs/test_snprintf.c   |  73 
 .../bpf/progs/test_snprintf_single.c  |  20 +
 10 files changed, 767 insertions(+), 345 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c

-- 
2.31.1.295.g9ea45b61b8-goog



Re: [PATCH bpf-next v3 3/6] bpf: Add a bpf_snprintf helper

2021-04-14 Thread Florent Revest
Hey Geert! :)

On Wed, Apr 14, 2021 at 8:02 PM Geert Uytterhoeven  wrote:
> On Wed, Apr 14, 2021 at 9:41 AM Andrii Nakryiko
>  wrote:
> > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest  wrote:
> > > +   fmt = (char *)fmt_addr + fmt_map_off;
> > > +
> >
> > bot complained about lack of (long) cast before fmt_addr, please address
>
> (uintptr_t), I assume?

(uintptr_t) seems more correct to me as well. However, I just had a
look at the rest of verifier.c and (long) casts are already used
pretty much everywhere whereas uintptr_t isn't used yet.
I'll send a v4 with a long cast for the sake of consistency with the
rest of the verifier.
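
For context, the relevant check_bpf_snprintf_call() lines would then
read roughly as follows (sketch of the planned v4 change, not a quote):

  err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
                                            fmt_map_off);
  if (err)
          return err;
  fmt = (char *)(long)fmt_addr + fmt_map_off;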


[PATCH] selftests/bpf: Fix the ASSERT_ERR_PTR macro

2021-04-14 Thread Florent Revest
It is just missing a ';'. This macro is not used by any test yet.

Signed-off-by: Florent Revest 
---
 tools/testing/selftests/bpf/test_progs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_progs.h 
b/tools/testing/selftests/bpf/test_progs.h
index e87c8546230e..ee7e3b45182a 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -210,7 +210,7 @@ extern int test__join_cgroup(const char *path);
 #define ASSERT_ERR_PTR(ptr, name) ({   \
static int duration = 0;\
const void *___res = (ptr); \
-   bool ___ok = IS_ERR(___res) \
+   bool ___ok = IS_ERR(___res);\
CHECK(!___ok, (name), "unexpected pointer: %p\n", ___res);  \
___ok;  \
 })
-- 
2.31.1.295.g9ea45b61b8-goog



Re: [PATCH bpf-next v3 3/6] bpf: Add a bpf_snprintf helper

2021-04-14 Thread Florent Revest
On Mon, Apr 12, 2021 at 10:32 PM kernel test robot  wrote:
>m68k-linux-ld: kernel/bpf/verifier.o: in function 
> `check_helper_call.isra.0':
> >> verifier.c:(.text+0xf79e): undefined reference to `bpf_printf_prepare'
>m68k-linux-ld: kernel/bpf/helpers.o: in function `bpf_base_func_proto':
> >> helpers.c:(.text+0xd82): undefined reference to `bpf_snprintf_proto'

I'll move the implementation of bpf_printf_prepare/bpf_printf_cleanup
and bpf_snprintf to kernel/bpf/helpers.c so that they all get built on
kernels with CONFIG_BPF_SYSCALL but not CONFIG_BPF_EVENTS.


Re: [PATCH bpf-next v3 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-04-14 Thread Florent Revest
On Wed, Apr 14, 2021 at 11:56 AM Florent Revest  wrote:
> On Wed, Apr 14, 2021 at 1:01 AM Andrii Nakryiko
>  wrote:
> > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest  wrote:
> > > +   err = 0;
> > > +out:
> > > +   put_fmt_tmp_buf();
> >
> > so you are putting tmp_buf unconditionally, even when there was no
> > error. That seems wrong? Should this be:
> >
> > if (err)
> > put_fmt_tmp_buf()
> >
> > ?
>
> Yeah the naming is unfortunate, as discussed in the other patch, I
> will rename that to bpf_printf_cleanup instead. It's not clear from the
> name that it only "puts" if the buffer was already gotten.

Ah, sorry I see what you meant! Indeed, my mistake. :|


Re: [PATCH bpf-next v3 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-04-14 Thread Florent Revest
On Wed, Apr 14, 2021 at 1:01 AM Andrii Nakryiko
 wrote:
> On Mon, Apr 12, 2021 at 8:38 AM Florent Revest  wrote:
> > +/* Per-cpu temp buffers which can be used by printf-like helpers for %s or 
> > %p
> > + */
> > +#define MAX_PRINTF_BUF_LEN 512
> > +
> > +struct bpf_printf_buf {
> > +   char tmp_buf[MAX_PRINTF_BUF_LEN];
> > +};
> > +static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
> > +static DEFINE_PER_CPU(int, bpf_printf_buf_used);
> > +
> > +static int try_get_fmt_tmp_buf(char **tmp_buf)
> >  {
> > -   static char buf[BPF_TRACE_PRINTK_SIZE];
> > -   unsigned long flags;
> > -   va_list ap;
> > -   int ret;
> > +   struct bpf_printf_buf *bufs = this_cpu_ptr(_printf_buf);
>
> why doing this_cpu_ptr() if below (if *tmp_buf case), you will not use
> it. just a waste of CPU, no?

Sure I can move it past the conditions.

> > +   int used;
> >
> > -   raw_spin_lock_irqsave(_printk_lock, flags);
> > -   va_start(ap, fmt);
> > -   ret = vsnprintf(buf, sizeof(buf), fmt, ap);
> > -   va_end(ap);
> > -   /* vsnprintf() will not append null for zero-length strings */
> > -   if (ret == 0)
> > -   buf[0] = '\0';
> > -   trace_bpf_trace_printk(buf);
> > -   raw_spin_unlock_irqrestore(_printk_lock, flags);
> > +   if (*tmp_buf)
> > +   return 0;
> >
> > -   return ret;
> > +   preempt_disable();
> > +   used = this_cpu_inc_return(bpf_printf_buf_used);
> > +   if (WARN_ON_ONCE(used > 1)) {
> > +   this_cpu_dec(bpf_printf_buf_used);
> > +   return -EBUSY;
> > +   }
>
> get bufs pointer here instead?

Okay :)

> > +   *tmp_buf = bufs->tmp_buf;
> > +
> > +   return 0;
> > +}
> > +
> > +static void put_fmt_tmp_buf(void)
> > +{
> > +   if (this_cpu_read(bpf_printf_buf_used)) {
> > +   this_cpu_dec(bpf_printf_buf_used);
> > +   preempt_enable();
> > +   }
> >  }
> >
> >  /*
> > - * Only limited trace_printk() conversion specifiers allowed:
> > - * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s
> > + * bpf_parse_fmt_str - Generic pass on format strings for printf-like 
> > helpers
> > + *
> > + * Returns a negative value if fmt is an invalid format string or 0 
> > otherwise.
> > + *
> > + * This can be used in two ways:
> > + * - Format string verification only: when final_args and mod are NULL
> > + * - Arguments preparation: in addition to the above verification, it 
> > writes in
> > + *   final_args a copy of raw_args where pointers from BPF have been 
> > sanitized
> > + *   into pointers safe to use by snprintf. This also writes in the mod 
> > array
> > + *   the size requirement of each argument, usable by BPF_CAST_FMT_ARG for 
> > ex.
> > + *
> > + * In argument preparation mode, if 0 is returned, safe temporary buffers 
> > are
> > + * allocated and put_fmt_tmp_buf should be called to free them after use.
> >   */
> > -BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
> > -  u64, arg2, u64, arg3)
> > -{
> > -   int i, mod[3] = {}, fmt_cnt = 0;
> > -   char buf[64], fmt_ptype;
> > -   void *unsafe_ptr = NULL;
> > -   bool str_seen = false;
> > +int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
> > +   u64 *final_args, enum bpf_printf_mod_type *mod,
> > +   u32 num_args)
> > +{
> > +   int err, i, curr_specifier = 0, copy_size;
> > +   char *unsafe_ptr = NULL, *tmp_buf = NULL;
> > +   size_t tmp_buf_len = MAX_PRINTF_BUF_LEN;
> > +   enum bpf_printf_mod_type current_mod;
> > +   u64 current_arg;
>
> naming consistency: current_arg vs curr_specifier? maybe just cur_arg
> and cur_spec?

Ahah, you're right again :)

> > +   char fmt_ptype;
> > +
> > +   if ((final_args && !mod) || (mod && !final_args))
>
> nit: same check:
>
> if (!!final_args != !!mod)

Fancy! :)

> > +   return -EINVAL;
> >
> > -   /*
> > -* bpf_check()->check_func_arg()->check_stack_boundary()
> > -* guarantees that fmt points to bpf program stack,
> > -* fmt_size bytes of it were initialized and fmt_size > 0
> > -*/
> > -   if (fmt[--fmt_size] != 0)
> > +   fmt_size = (str

Re: [PATCH bpf-next v3 3/6] bpf: Add a bpf_snprintf helper

2021-04-14 Thread Florent Revest
On Wed, Apr 14, 2021 at 1:16 AM Andrii Nakryiko
 wrote:
> On Mon, Apr 12, 2021 at 8:38 AM Florent Revest  wrote:
> > +static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
> > +  struct bpf_reg_state *regs)
> > +{
> > +   struct bpf_reg_state *fmt_reg = [BPF_REG_3];
> > +   struct bpf_reg_state *data_len_reg = [BPF_REG_5];
> > +   struct bpf_map *fmt_map = fmt_reg->map_ptr;
> > +   int err, fmt_map_off, num_args;
> > +   u64 fmt_addr;
> > +   char *fmt;
> > +
> > +   /* data must be an array of u64 */
> > +   if (data_len_reg->var_off.value % 8)
> > +   return -EINVAL;
> > +   num_args = data_len_reg->var_off.value / 8;
> > +
> > +   /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
> > +* and map_direct_value_addr is set.
> > +*/
> > +   fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
> > +   err = fmt_map->ops->map_direct_value_addr(fmt_map, _addr,
> > + fmt_map_off);
> > +   if (err)
> > +   return err;
> > +   fmt = (char *)fmt_addr + fmt_map_off;
> > +
>
> bot complained about lack of (long) cast before fmt_addr, please address

Will do.

> > +   /* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
> > +* all of them to snprintf().
> > +*/
> > +   err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > +   BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, 
> > mod),
> > +   BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, 
> > mod),
> > +   BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, 
> > mod),
> > +   BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, 
> > mod),
> > +   BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, 
> > mod),
> > +   BPF_CAST_FMT_ARG(11, args, mod));
> > +
> > +   put_fmt_tmp_buf();
>
> reading this for at least 3rd time, this put_fmt_tmp_buf() looks a bit
> out of place and kind of random. I think bpf_printf_cleanup() name
> pairs with bpf_printf_prepare() better.

Yes, I thought it would be clever to name that function
put_fmt_tmp_buf() as a clear parallel to try_get_fmt_tmp_buf() but
because it only puts the buffer if it is used and because they get
called in two different contexts, it's after all maybe not such a
clever name... I'll revert to bpf_printf_cleanup(). Thank you for your
patience with my naming adventures! :)

> > +
> > +   return err + 1;
>
> snprintf() already returns string length *including* terminating zero,
> so this is wrong

lib/vsprintf.c says:
 * The return value is the number of characters which would be
 * generated for the given input, excluding the trailing null,
 * as per ISO C99.

Also, if I look at the "no arg" test case in the selftest patch,
"simple case" is asserted to return 12, which seems correct to me
(it includes the terminating zero only once). Am I missing something?

However that makes me wonder whether it would be more appropriate to
return the value excluding the trailing null. On one hand it makes
sense to be coherent with other BPF helpers that include the trailing
zero (as discussed in patch v1), on the other hand the helper is
clearly named after the standard "snprintf" function and it's likely
that users will assume it works the same as the std snprintf.


Re: [PATCH bpf-next v3 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

2021-04-14 Thread Florent Revest
On Wed, Apr 14, 2021 at 1:21 AM Andrii Nakryiko
 wrote:
>
> On Mon, Apr 12, 2021 at 8:38 AM Florent Revest  wrote:
> >
> > This exercises most of the format specifiers.
> >
> > Signed-off-by: Florent Revest 
> > Acked-by: Andrii Nakryiko 
> > ---
>
> As I mentioned on another patch, we probably need negative tests even
> more than positive ones.

Agreed.

> I think an easy and nice way to do this is to have a separate BPF
> skeleton where fmt string and arguments are provided through read-only
> global variables, so that user-space can re-use the same BPF skeleton
> to simulate multiple cases. BPF program itself would just call
> bpf_snprintf() and store the returned result.

Ah, great idea! I was thinking of having one skeleton for each but it
would be a bit much indeed.

Because the format string needs to be in a read-only map though, I
hope it can be modified from userspace before loading. I'll try it out
and see :) If it doesn't work I'll just use more skeletons.

> Whether we need to validate the verifier log is up to debate (though
> it's not that hard to do by overriding libbpf_print_fn() callback),
> I'd be ok at least knowing that some bad format strings are rejected
> and don't crash the kernel.

Alright :)

>
> >  .../selftests/bpf/prog_tests/snprintf.c   | 81 +++
> >  .../selftests/bpf/progs/test_snprintf.c   | 74 +
> >  2 files changed, 155 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> >
>
> [...]


[PATCH bpf-next v3 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field

2021-04-12 Thread Florent Revest
When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.
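
For example, after this change BPF_SEQ_PRINTF(seq, "%s", str) expands to
roughly the following (the warning-suppressing pragmas are omitted):

  static const char ___fmt[] = "%s";
  unsigned long long ___param[1];

  /* One store per argument: the value is written by the program at run
   * time instead of being baked into a constant array initializer that
   * would land in .rodata, where libbpf cannot relocate the pointer.
   */
  ___param[0] = str;

  bpf_seq_printf(seq, ___fmt, sizeof(___fmt), ___param, sizeof(___param));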

Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest 
---
 tools/lib/bpf/bpf_tracing.h | 40 +++--
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index f9ef37707888..1c2e91ee041d 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -413,20 +413,38 @@ typeof(name(0)) name(struct pt_regs *ctx) 
\
 }  \
 static __always_inline typeof(name(0)) ##name(struct pt_regs *ctx, ##args)
 
+#define ___bpf_fill0(arr, p, x) do {} while (0)
+#define ___bpf_fill1(arr, p, x) arr[p] = x
+#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args)
+#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args)
+#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args)
+#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args)
+#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args)
+#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args)
+#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args)
+#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args)
+#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args)
+#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args)
+#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args)
+#define ___bpf_fill(arr, args...) \
+   ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
+
 /*
  * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
  * in a structure.
  */
-#define BPF_SEQ_PRINTF(seq, fmt, args...)  \
-   ({  \
-   _Pragma("GCC diagnostic push")  \
-   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
-   static const char ___fmt[] = fmt;   \
-   unsigned long long ___param[] = { args };   \
-   _Pragma("GCC diagnostic pop")   \
-   int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),\
-   ___param, sizeof(___param));\
-   ___ret; \
-   })
+#define BPF_SEQ_PRINTF(seq, fmt, args...)  \
+({ \
+   static const char ___fmt[] = fmt;   \
+   unsigned long long ___param[___bpf_narg(args)]; \
+   \
+   _Pragma("GCC diagnostic push")  \
+   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   ___bpf_fill(___param, args);\
+   _Pragma("GCC diagnostic pop")   \
+   \
+   bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
+  ___param, sizeof(___param)); \
+})
 
 #endif
-- 
2.31.1.295.g9ea45b61b8-goog



[PATCH bpf-next v3 0/6] Add a snprintf eBPF helper

2021-04-12 Thread Florent Revest
We have a use case where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)

A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.

This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/20201126165748.1748417-1-rev...@google.com/T/#u

- A new arg type guarantees the NULL-termination of string arguments and
  lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
  string is known at verification time, the format string validation is
  done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
  marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/20210310015455.1095207-1-rev...@chromium.org/T/#u

---
Changes in v3:
- Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
- Made zero-termination check more consistent
- Allowed NULL output_buffer
- Simplified the BPF_CAST_FMT_ARG macro
- Three new test cases: number padding, simple string with no arg and
  string length extraction only with a NULL output buffer
- Clarified helper's description for edge cases (eg: str_size == 0)
- Lots of cosmetic changes

---
Changes in v2:
- Extracted the format validation/argument sanitization in a generic way
  for all printf-like helpers.
- bpf_snprintf's str_size can now be 0
- bpf_snprintf is now exposed to all BPF program types
- We now preempt_disable when using a per-cpu temporary buffer
- Addressed a few cosmetic changes

Florent Revest (6):
  bpf: Factorize bpf_trace_printk and bpf_seq_printf
  bpf: Add a ARG_PTR_TO_CONST_STR argument type
  bpf: Add a bpf_snprintf helper
  libbpf: Initialize the bpf_seq_printf parameters array field by field
  libbpf: Introduce a BPF_SNPRINTF helper macro
  selftests/bpf: Add a series of tests for bpf_snprintf

 include/linux/bpf.h   |   7 +
 include/uapi/linux/bpf.h  |  28 +
 kernel/bpf/helpers.c  |   2 +
 kernel/bpf/verifier.c |  82 +++
 kernel/trace/bpf_trace.c  | 579 +-
 tools/include/uapi/linux/bpf.h|  28 +
 tools/lib/bpf/bpf_tracing.h   |  58 +-
 .../selftests/bpf/prog_tests/snprintf.c   |  81 +++
 .../selftests/bpf/progs/test_snprintf.c   |  74 +++
 9 files changed, 647 insertions(+), 292 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c

-- 
2.31.1.295.g9ea45b61b8-goog



[PATCH bpf-next v3 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro

2021-04-12 Thread Florent Revest
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.

Signed-off-by: Florent Revest 
---
 tools/lib/bpf/bpf_tracing.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 1c2e91ee041d..8c954ebc0c7c 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -447,4 +447,22 @@ static __always_inline typeof(name(0)) ##name(struct 
pt_regs *ctx, ##args)
   ___param, sizeof(___param)); \
 })
 
+/*
+ * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead 
of
+ * an array of u64.
+ */
+#define BPF_SNPRINTF(out, out_size, fmt, args...)  \
+({ \
+   static const char ___fmt[] = fmt;   \
+   unsigned long long ___param[___bpf_narg(args)]; \
+   \
+   _Pragma("GCC diagnostic push")  \
+   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   ___bpf_fill(___param, args);\
+   _Pragma("GCC diagnostic pop")   \
+   \
+   bpf_snprintf(out, out_size, ___fmt, \
+___param, sizeof(___param));   \
+})
+
 #endif
-- 
2.31.1.295.g9ea45b61b8-goog



[PATCH bpf-next v3 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

2021-04-12 Thread Florent Revest
This exercises most of the format specifiers.

Signed-off-by: Florent Revest 
Acked-by: Andrii Nakryiko 
---
 .../selftests/bpf/prog_tests/snprintf.c   | 81 +++
 .../selftests/bpf/progs/test_snprintf.c   | 74 +
 2 files changed, 155 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c 
b/tools/testing/selftests/bpf/prog_tests/snprintf.c
new file mode 100644
index ..3ad1ee885273
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
@@ -0,0 +1,81 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include 
+#include "test_snprintf.skel.h"
+
+#define EXP_NUM_OUT  "-8 9 96 -424242 1337 DABBAD00"
+#define EXP_NUM_RET  sizeof(EXP_NUM_OUT)
+
+#define EXP_IP_OUT   "127.000.000.001 :::::::0001"
+#define EXP_IP_RET   sizeof(EXP_IP_OUT)
+
+/* The third specifier, %pB, depends on compiler inlining so don't check it */
+#define EXP_SYM_OUT  "schedule schedule+0x0/"
+#define MIN_SYM_RET  sizeof(EXP_SYM_OUT)
+
+/* The third specifier, %p, is a hashed pointer which changes on every reboot */
+#define EXP_ADDR_OUT " 0add4e55 "
+#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
+
+#define EXP_STR_OUT  "str1 longstr"
+#define EXP_STR_RET  sizeof(EXP_STR_OUT)
+
+#define EXP_OVER_OUT "%over"
+#define EXP_OVER_RET 10
+
+#define EXP_PAD_OUT "4 000"
+#define EXP_PAD_RET 97
+
+#define EXP_NO_ARG_OUT "simple case"
+#define EXP_NO_ARG_RET 12
+
+#define EXP_NO_BUF_RET 29
+
+void test_snprintf(void)
+{
+   char exp_addr_out[] = EXP_ADDR_OUT;
+   char exp_sym_out[]  = EXP_SYM_OUT;
+   struct test_snprintf *skel;
+
+   skel = test_snprintf__open_and_load();
+   if (!ASSERT_OK_PTR(skel, "skel_open"))
+   return;
+
+   if (!ASSERT_OK(test_snprintf__attach(skel), "skel_attach"))
+   goto cleanup;
+
+   /* trigger tracepoint */
+   usleep(1);
+
+   ASSERT_STREQ(skel->bss->num_out, EXP_NUM_OUT, "num_out");
+   ASSERT_EQ(skel->bss->num_ret, EXP_NUM_RET, "num_ret");
+
+   ASSERT_STREQ(skel->bss->ip_out, EXP_IP_OUT, "ip_out");
+   ASSERT_EQ(skel->bss->ip_ret, EXP_IP_RET, "ip_ret");
+
+   ASSERT_OK(memcmp(skel->bss->sym_out, exp_sym_out,
+sizeof(exp_sym_out) - 1), "sym_out");
+   ASSERT_LT(MIN_SYM_RET, skel->bss->sym_ret, "sym_ret");
+
+   ASSERT_OK(memcmp(skel->bss->addr_out, exp_addr_out,
+sizeof(exp_addr_out) - 1), "addr_out");
+   ASSERT_EQ(skel->bss->addr_ret, EXP_ADDR_RET, "addr_ret");
+
+   ASSERT_STREQ(skel->bss->str_out, EXP_STR_OUT, "str_out");
+   ASSERT_EQ(skel->bss->str_ret, EXP_STR_RET, "str_ret");
+
+   ASSERT_STREQ(skel->bss->over_out, EXP_OVER_OUT, "over_out");
+   ASSERT_EQ(skel->bss->over_ret, EXP_OVER_RET, "over_ret");
+
+   ASSERT_STREQ(skel->bss->pad_out, EXP_PAD_OUT, "pad_out");
+   ASSERT_EQ(skel->bss->pad_ret, EXP_PAD_RET, "pad_ret");
+
+   ASSERT_STREQ(skel->bss->noarg_out, EXP_NO_ARG_OUT, "no_arg_out");
+   ASSERT_EQ(skel->bss->noarg_ret, EXP_NO_ARG_RET, "no_arg_ret");
+
+   ASSERT_EQ(skel->bss->nobuf_ret, EXP_NO_BUF_RET, "no_buf_ret");
+
+cleanup:
+   test_snprintf__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_snprintf.c 
b/tools/testing/selftests/bpf/progs/test_snprintf.c
new file mode 100644
index ..4c36f355dfca
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_snprintf.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include 
+#include 
+#include 
+#include 
+
+char num_out[64] = {};
+long num_ret = 0;
+
+char ip_out[64] = {};
+long ip_ret = 0;
+
+char sym_out[64] = {};
+long sym_ret = 0;
+
+char addr_out[64] = {};
+long addr_ret = 0;
+
+char str_out[64] = {};
+long str_ret = 0;
+
+char over_out[6] = {};
+long over_ret = 0;
+
+char pad_out[10] = {};
+long pad_ret = 0;
+
+char noarg_out[64] = {};
+long noarg_ret = 0;
+
+long nobuf_ret = 0;
+
+extern const void schedule __ksym;
+
+SEC("raw_tp/sys_enter")
+int handler(const void *ctx)
+{
+   /* Convenient values to pretty-print */
+   const __u8 ex_ipv4[] = {127, 0, 0, 1};
+   const __u8 ex_ipv6[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1};
+   const char str1[] = "str1";
+   const char longstr[] = "longstr";
+
+   /* Int

[PATCH bpf-next v3 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-04-12 Thread Florent Revest
Two helpers (trace_printk and seq_printf) have very similar
implementations of format string parsing and a third one is coming
(snprintf). To avoid code duplication and make the code easier to
maintain, this moves the operations associated with format string
parsing (validation and argument sanitization) into one generic
function.

The implementation of the two existing helpers already drifted quite a
bit so unifying them entailed a lot of changes:

- bpf_trace_printk always expected fmt[fmt_size] to be the terminating
  NULL character, this is no longer true, the first 0 is terminating.
- bpf_trace_printk now supports %% (which produces the percentage char).
- bpf_trace_printk now skips width formatting fields.
- bpf_trace_printk now supports the X modifier (capital hexadecimal).
- bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
- argument casting on 32 bit has been simplified into one macro and
  using an enum instead of obscure int increments.

- bpf_seq_printf now uses bpf_trace_copy_string instead of
  strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
- bpf_seq_printf now prints longs correctly on 32 bit architectures.

- both were changed to use a global per-cpu tmp buffer instead of one
  stack buffer for trace_printk and 6 small buffers for seq_printf.
- to avoid per-cpu buffer usage conflict, these helpers disable
  preemption while the per-cpu buffer is in use.
- both helpers now support the %ps and %pS specifiers to print symbols.

Signed-off-by: Florent Revest 
---
 kernel/trace/bpf_trace.c | 529 ++-
 1 file changed, 248 insertions(+), 281 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 0d23755c2747..3ce9aeee6681 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -372,7 +372,7 @@ static const struct bpf_func_proto 
*bpf_get_probe_write_proto(void)
return _probe_write_user_proto;
 }
 
-static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
+static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
size_t bufsz)
 {
void __user *user_ptr = (__force void __user *)unsafe_ptr;
@@ -382,178 +382,292 @@ static void bpf_trace_copy_string(char *buf, void 
*unsafe_ptr, char fmt_ptype,
switch (fmt_ptype) {
case 's':
 #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
-   if ((unsigned long)unsafe_ptr < TASK_SIZE) {
-   strncpy_from_user_nofault(buf, user_ptr, bufsz);
-   break;
-   }
+   if ((unsigned long)unsafe_ptr < TASK_SIZE)
+   return strncpy_from_user_nofault(buf, user_ptr, bufsz);
fallthrough;
 #endif
case 'k':
-   strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
-   break;
+   return strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
case 'u':
-   strncpy_from_user_nofault(buf, user_ptr, bufsz);
-   break;
+   return strncpy_from_user_nofault(buf, user_ptr, bufsz);
}
+
+   return -EINVAL;
 }
 
 static DEFINE_RAW_SPINLOCK(trace_printk_lock);
 
-#define BPF_TRACE_PRINTK_SIZE   1024
+enum bpf_printf_mod_type {
+   BPF_PRINTF_INT,
+   BPF_PRINTF_LONG,
+   BPF_PRINTF_LONG_LONG,
+};
+
+/* Workaround for getting va_list handling working with different argument type
+ * combinations generically for 32 and 64 bit archs.
+ */
+#define BPF_CAST_FMT_ARG(arg_nb, args, mod)\
+   (mod[arg_nb] == BPF_PRINTF_LONG_LONG || \
+(mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64)  \
+ ? (u64)args[arg_nb]   \
+ : (u32)args[arg_nb])
 
-static __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
+/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p
+ */
+#define MAX_PRINTF_BUF_LEN 512
+
+struct bpf_printf_buf {
+   char tmp_buf[MAX_PRINTF_BUF_LEN];
+};
+static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
+static DEFINE_PER_CPU(int, bpf_printf_buf_used);
+
+static int try_get_fmt_tmp_buf(char **tmp_buf)
 {
-   static char buf[BPF_TRACE_PRINTK_SIZE];
-   unsigned long flags;
-   va_list ap;
-   int ret;
+   struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
+   int used;
 
-   raw_spin_lock_irqsave(_printk_lock, flags);
-   va_start(ap, fmt);
-   ret = vsnprintf(buf, sizeof(buf), fmt, ap);
-   va_end(ap);
-   /* vsnprintf() will not append null for zero-length strings */
-   if (ret == 0)
-   buf[0] = '\0';
-   trace_bpf_trace_printk(buf);
-   raw_spin_unlock_irqrestore(_printk_lock, flags);
+   if (*tmp_buf)
+   return 0;
 
-   return ret;
+   preempt_disable();
+   used = this_

[PATCH bpf-next v3 3/6] bpf: Add a bpf_snprintf helper

2021-04-12 Thread Florent Revest
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated string in a read-only map value, so we don't need a
format string length arg.

Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h|  6 
 include/uapi/linux/bpf.h   | 28 +++
 kernel/bpf/helpers.c   |  2 ++
 kernel/bpf/verifier.c  | 41 
 kernel/trace/bpf_trace.c   | 50 ++
 tools/include/uapi/linux/bpf.h | 28 +++
 6 files changed, 155 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7d3890b3..a3650fc93068 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1953,6 +1953,7 @@ extern const struct bpf_func_proto 
bpf_skc_to_tcp_request_sock_proto;
 extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_proto;
 extern const struct bpf_func_proto bpf_snprintf_btf_proto;
+extern const struct bpf_func_proto bpf_snprintf_proto;
 extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
@@ -2078,4 +2079,9 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type 
t,
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
+enum bpf_printf_mod_type;
+int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
+  u64 *final_args, enum bpf_printf_mod_type *mod,
+  u32 num_args);
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 49371eba98ba..40546d4676f1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4671,6 +4671,33 @@ union bpf_attr {
  * Return
  * The number of traversed map elements for success, **-EINVAL** 
for
  * invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 
data_len)
+ * Description
+ * Outputs a string into the **str** buffer of size **str_size**
+ * based on a format string stored in a read-only map pointed by
+ * **fmt**.
+ *
+ * Each format specifier in **fmt** corresponds to one u64 element
+ * in the **data** array. For strings and pointers where pointees
+ * are accessed, only the pointer values are stored in the *data*
+ * array. The *data_len* is the size of *data* in bytes.
+ *
+ * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ * memory. Reading kernel memory may fail due to either invalid
+ * address or valid address but requiring a major memory fault. If
+ * reading kernel memory fails, the string for **%s** will be an
+ * empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ * Not returning error to bpf program is consistent with what
+ * **bpf_trace_printk**\ () does for now.
+ *
+ * Return
+ * The strictly positive length of the formatted string, including
+ * the trailing zero character. If the return value is greater than
+ * **str_size**, **str** contains a truncated string, guaranteed to
+ * be zero-terminated except when **str_size** is 0.
+ *
+ * Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)  \
FN(unspec), \
@@ -4838,6 +4865,7 @@ union bpf_attr {
FN(sock_from_file), \
FN(check_mtu),  \
FN(for_each_map_elem),  \
+   FN(snprintf),   \
/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index f306611c4ddf..ec45c7526924 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -757,6 +757,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
                return &bpf_probe_read_kernel_str_proto;
        case BPF_FUNC_snprintf_btf:
                return &bpf_snprintf_btf_proto;
+       case BPF_FUNC_snprintf:
+               return &bpf_snprintf_proto;
default:
return NULL;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5f46dd6f3383..d4020e5f91ee 100644

[PATCH bpf-next v3 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-04-12 Thread Florent Revest
This type provides the guarantee that an argument is going to be a const
pointer to somewhere in a read-only map value. It also checks that this
pointer is followed by a zero character before the end of the map value.

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 41 +
 2 files changed, 42 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ff8cd68c01b3..7d3890b3 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -309,6 +309,7 @@ enum bpf_arg_type {
ARG_PTR_TO_PERCPU_BTF_ID,   /* pointer to in-kernel percpu type */
ARG_PTR_TO_FUNC,/* pointer to a bpf program function */
ARG_PTR_TO_STACK_OR_NULL,   /* pointer to stack or NULL */
+   ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only 
string */
__BPF_ARG_TYPE_MAX,
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 852541a435ef..5f46dd6f3383 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4787,6 +4787,7 @@ static const struct bpf_reg_types spin_lock_types = { 
.types = { PTR_TO_MAP_VALU
 static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { 
PTR_TO_PERCPU_BTF_ID } };
 static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } 
};
 static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK 
} };
+static const struct bpf_reg_types const_str_ptr_types = { .types = { 
PTR_TO_MAP_VALUE } };
 
 static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
        [ARG_PTR_TO_MAP_KEY]            = &map_key_value_types,
@@ -4817,6 +4818,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
        [ARG_PTR_TO_PERCPU_BTF_ID]      = &percpu_btf_ptr_types,
        [ARG_PTR_TO_FUNC]               = &func_ptr_types,
        [ARG_PTR_TO_STACK_OR_NULL]      = &stack_ptr_types,
+       [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
 };
 
 static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
@@ -5067,6 +5069,45 @@ static int check_func_arg(struct bpf_verifier_env *env, 
u32 arg,
if (err)
return err;
err = check_ptr_alignment(env, reg, 0, size, true);
+   } else if (arg_type == ARG_PTR_TO_CONST_STR) {
+   struct bpf_map *map = reg->map_ptr;
+   int map_off;
+   u64 map_addr;
+   char *str_ptr;
+
+   if (reg->type != PTR_TO_MAP_VALUE || !map ||
+   !bpf_map_is_rdonly(map)) {
+   verbose(env, "R%d does not point to a readonly map'\n", 
regno);
+   return -EACCES;
+   }
+
+   if (!tnum_is_const(reg->var_off)) {
+   verbose(env, "R%d is not a constant address'\n", regno);
+   return -EACCES;
+   }
+
+   if (!map->ops->map_direct_value_addr) {
+   verbose(env, "no direct value access support for this 
map type\n");
+   return -EACCES;
+   }
+
+   err = check_map_access(env, regno, reg->off,
+  map->value_size - reg->off, false);
+   if (err)
+   return err;
+
+   map_off = reg->off + reg->var_off.value;
+   err = map->ops->map_direct_value_addr(map, &map_addr, map_off);
+   if (err) {
+   verbose(env, "direct value access on string failed\n");
+   return err;
+   }
+
+   str_ptr = (char *)(long)(map_addr);
+   if (!strnchr(str_ptr + map_off, map->value_size - map_off, 0)) {
+   verbose(env, "string is not zero-terminated\n");
+   return -EINVAL;
+   }
}
 
return err;
-- 
2.31.1.295.g9ea45b61b8-goog



Re: [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-04-08 Thread Florent Revest
On Wed, Apr 7, 2021 at 11:54 PM Andrii Nakryiko
 wrote:
> On Tue, Apr 6, 2021 at 8:35 AM Florent Revest  wrote:
> > On Fri, Mar 26, 2021 at 11:51 PM Andrii Nakryiko
> >  wrote:
> > > On Fri, Mar 26, 2021 at 2:53 PM Andrii Nakryiko
> > >  wrote:
> > > > On Tue, Mar 23, 2021 at 7:23 PM Florent Revest  
> > > > wrote:
> > > > > +/* Horrid workaround for getting va_list handling working with 
> > > > > different
> > > > > + * argument type combinations generically for 32 and 64 bit archs.
> > > > > + */
> > > > > +#define BPF_CAST_FMT_ARG(arg_nb, args, mod)  
> > > > >   \
> > > > > +   ((mod[arg_nb] == BPF_PRINTF_LONG_LONG ||  
> > > > >   \
> > > > > +(mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64))   
> > > > >   \
> > > > > + ? args[arg_nb]  
> > > > >   \
> > > > > + : ((mod[arg_nb] == BPF_PRINTF_LONG ||   
> > > > >   \
> > > > > +(mod[arg_nb] == BPF_PRINTF_INT && __BITS_PER_LONG == 
> > > > > 32))  \
> > > >
> > > > is this right? INT is always 32-bit, it's only LONG that differs.
> > > > Shouldn't the rule be
> > > >
> > > > (LONG_LONG || LONG && __BITS_PER_LONG) -> (__u64)args[args_nb]
> > > > (INT || LONG && __BITS_PER_LONG == 32) -> (__u32)args[args_nb]
> > > >
> > > > Does (long) cast do anything fancy when casting from u64? Sorry, maybe
> > > > I'm confused.
> >
> > To be honest, I am also confused by that logic... :p My patch tries to
> > conserve exactly the same logic as "88a5c690b6 bpf: fix
> > bpf_trace_printk on 32 bit archs" because I was also afraid of missing
> > something and could not test it on 32 bit arches. From that commit
> > description, it is unclear to me what "u32 and long are passed
> > differently to u64, since the result of C conditional operators
> > follows the "usual arithmetic conversions" rules" means. Maybe Daniel
> > can comment on this ?
>
> Yeah, no idea. Seems like the code above should work fine for 32 and
> 64 bitness and both little- and big-endianness.

Yeah, looks good to me as well. I'll use it in v3.
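
For posterity, the case split that macro encodes (a summary only; the real
macro relies on the type of the whole conditional expression, so it cannot
be replaced by a function):

/* How BPF_CAST_FMT_ARG(i, args, mod) is meant to resolve:
 *
 *   LONG_LONG, or LONG on a 64-bit arch -> args[i] used as-is (full u64)
 *   LONG on 32-bit, or INT on 32-bit    -> (long)args[i], i.e. low 32 bits
 *   INT on a 64-bit arch                -> (u32)args[i], zero-extended
 */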

> > > > > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > > > > +   u64 *final_args, enum bpf_printf_mod_type 
> > > > > *mod,
> > > > > +   u32 num_args)
> > > > > +{
> > > > > +   struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
> > > > > +   int err, i, fmt_cnt = 0, copy_size, used;
> > > > > +   char *unsafe_ptr = NULL, *tmp_buf = NULL;
> > > > > +   bool prepare_args = final_args && mod;
> > > >
> > > > probably better to enforce that both or none are specified, otherwise
> > > > return error
> >
> > Fair :)
> >
> > > it's actually three of them: raw_args, mod, and num_args, right? All
> > > three are either NULL or non-NULL.
> >
> > It is a bit tricky to see from that patch but in "3/6 bpf: Add a
> > bpf_snprintf helper" the verifier code calls this function with
> > num_args != 0 to check whether the number of arguments is correct
> > without actually converting anything.
> >
> > Also when the helper gets called, raw_args can come from the BPF
> > program and be NULL but in that case we will also have num_args = 0
> > guaranteed by the helper so the loop will bail out if it encounters a
> > format specifier.
>
> ok, but at least final_args and mod are locked together, so should be
> enforced to be either null or not, right?

Yes :) will do.
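
(Probably something as small as this, a sketch:)

	/* final_args and mod must both be set or both be NULL */
	if (!!final_args != !!mod)
		return -EINVAL;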

> > > > > +   enum bpf_printf_mod_type current_mod;
> > > > > +   size_t tmp_buf_len;
> > > > > +   u64 current_arg;
> > > > > +   char fmt_ptype;
> > > > > +
> > > > > +   for (i = 0; i < fmt_size && fmt[i] != '\0'; i++) {
> > > >
> > > > Can we say that if the last character is not '\0' then it's a bad
> > > > format string and return -EINVAL? And if \0 is inside the format
> > > > string, then it's also a bad format string? I wonder what others think
> > > > about this?... I 

Re: [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper

2021-04-08 Thread Florent Revest
On Thu, Apr 8, 2021 at 12:03 AM Andrii Nakryiko
 wrote:
> On Tue, Apr 6, 2021 at 9:06 AM Florent Revest  wrote:
> > On Fri, Mar 26, 2021 at 11:55 PM Andrii Nakryiko
> >  wrote:
> > > On Tue, Mar 23, 2021 at 7:23 PM Florent Revest  
> > > wrote:
> > > > + * Formats **%s** and **%p{i,I}{4,6}** require to read 
> > > > kernel
> > > > + * memory. Reading kernel memory may fail due to either 
> > > > invalid
> > > > + * address or valid address but requiring a major memory 
> > > > fault. If
> > > > + * reading kernel memory fails, the string for **%s** will 
> > > > be an
> > > > + * empty string, and the ip address for **%p{i,I}{4,6}** 
> > > > will be 0.
> > >
> > > would it make sense for sleepable programs to allow memory fault when
> > > reading memory?
> >
> > Probably yes. How would you do that ? I'm guessing that in
> > bpf_trace_copy_string you would call either strncpy_from_X_nofault or
> > strncpy_from_X depending on a condition but I'm not sure which one.
>
> So you'd have different bpf_snprintf_proto definitions for sleepable
> and non-sleepable programs. And each implementation would call
> bpf_printf_prepare() with a flag specifying which copy_string variant
> to use (sleepable or not). So for BPF users it would be the same
> bpf_snprintf() helper, but it would transparently be doing different
> things depending on which BPF program it is being called from. That's
> how we do bpf_get_stack(), for example, see
> bpf_get_stack_proto_pe/bpf_get_stack_proto_raw_tp/bpf_get_stack_proto_tp.
>
> But consider that for a follow up, no need to address right now.

Ok let's keep this separate.
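
For the follow-up, the dispatch could look roughly like this (a sketch only;
the *_sleepable names and the field lookup are guesses, not the final API):

/* Hypothetical: same BPF-facing helper, two protos; the sleepable one would
 * tell bpf_printf_prepare() that faulting string copies are allowed.
 */
static const struct bpf_func_proto *
snprintf_proto(const struct bpf_prog *prog)
{
	return prog->aux->sleepable ? &bpf_snprintf_sleepable_proto
				    : &bpf_snprintf_proto;
}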

> >
> > > > + * Not returning error to bpf program is consistent with 
> > > > what
> > > > + * **bpf_trace_printk**\ () does for now.
> > > > + *
> > > > + * Return
> > > > + * The strictly positive length of the formatted string, 
> > > > including
> > > > + * the trailing zero character. If the return value is 
> > > > greater than
> > > > + * **str_size**, **str** contains a truncated string, 
> > > > guaranteed to
> > > > + * be zero-terminated.
> > >
> > > Except when str_size == 0.
> >
> > Right
> >
>
> So I assume you'll adjust the comment? I always find it confusing when
> zero case is allowed but it is not specified what's the behavior is.

Yes, sorry it wasn't clear :) I agree it's worth being explicit.

> > > > +   err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, 
> > > > mod),
> > > > +   BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, 
> > > > args, mod),
> > > > +   BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, 
> > > > args, mod),
> > > > +   BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, 
> > > > args, mod),
> > > > +   BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, 
> > > > args, mod),
> > > > +   BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, 
> > > > args, mod),
> > > > +   BPF_CAST_FMT_ARG(11, args, mod));
> > > > +   if (str_size)
> > > > +   str[str_size - 1] = '\0';
> > >
> > > hm... what if err < str_size ?
> >
> > Then there would be two zeroes, one set by snprintf in the middle and
> > one set by us at the end. :| I was a bit lazy there, I agree it would
> > be nicer if we'd do if (err >= str_size) instead.
> >
>
> snprintf() seems to be always zero-terminating the string if str_size
> > 0, and does nothing if str_size == 0, which is exactly what you
> want, so you can just drop that zero termination logic.

Oh, that's right! I was confused by snprintf's documentation ("the
resulting string is truncated") but, reading the vsnprintf
implementation, I see the output is indeed always zero-terminated. Great :)
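
(A quick userspace demonstration of the same guarantee, for anyone else
confused by the man page wording:)

#include <stdio.h>

int main(void)
{
	char buf[4] = "XXX";
	int ret;

	ret = snprintf(buf, sizeof(buf), "%d", 123456);
	printf("%d '%s'\n", ret, buf);   /* 6 '123': truncated but terminated */

	ret = snprintf(buf, 0, "%d", 123456);
	printf("%d '%s'\n", ret, buf);   /* 6 '123': size 0 writes nothing */
	return 0;
}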

> > Also makes me wonder what if str == NULL and str_size != 0. I just
> > assumed that the verifier would prevent that from happening but
> > discussions in the other patches make me unsure now.
>
>
> ARG_CONST_SIZE_OR_ZERO should make sure that ARG_PTR_TO_MEM before
> that is a valid initialized memory. But please double-check, of
> course.

Will do.


Re: [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper

2021-04-06 Thread Florent Revest
On Fri, Mar 26, 2021 at 11:55 PM Andrii Nakryiko
 wrote:
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest  wrote:
> > The implementation takes inspiration from the existing bpf_trace_printk
> > helper but there are a few differences:
> >
> > To allow for a large number of format-specifiers, parameters are
> > provided in an array, like in bpf_seq_printf.
> >
> > Because the output string takes two arguments and the array of
> > parameters also takes two arguments, the format string needs to fit in
> > one argument. But because ARG_PTR_TO_CONST_STR guarantees to point to a
> > NULL-terminated read-only map, we don't need a format string length arg.
> >
> > Because the format-string is known at verification time, we also move
> > most of the format string validation, currently done in formatting
> > helper calls, into the verifier logic. This makes debugging easier and
> > also slightly improves the runtime performance.
> >
> > Signed-off-by: Florent Revest 
> > ---
> >  include/linux/bpf.h|  6 
> >  include/uapi/linux/bpf.h   | 28 ++
> >  kernel/bpf/helpers.c   |  2 ++
> >  kernel/bpf/verifier.c  | 41 +++
> >  kernel/trace/bpf_trace.c   | 52 ++
> >  tools/include/uapi/linux/bpf.h | 28 ++
> >  6 files changed, 157 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 7b5319d75b3e..f3d9c8fa60b3 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1893,6 +1893,7 @@ extern const struct bpf_func_proto 
> > bpf_skc_to_tcp_request_sock_proto;
> >  extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
> >  extern const struct bpf_func_proto bpf_copy_from_user_proto;
> >  extern const struct bpf_func_proto bpf_snprintf_btf_proto;
> > +extern const struct bpf_func_proto bpf_snprintf_proto;
> >  extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
> >  extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
> >  extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> > @@ -2018,4 +2019,9 @@ int bpf_arch_text_poke(void *ip, enum 
> > bpf_text_poke_type t,
> >  struct btf_id_set;
> >  bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
> >
> > +enum bpf_printf_mod_type;
> > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > +   u64 *final_args, enum bpf_printf_mod_type *mod,
> > +   u32 num_args);
> > +
> >  #endif /* _LINUX_BPF_H */
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 2d3036e292a9..86af61e912c6 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4660,6 +4660,33 @@ union bpf_attr {
> >   * Return
> >   * The number of traversed map elements for success, 
> > **-EINVAL** for
> >   * invalid **flags**.
> > + *
> > + * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, 
> > u32 data_len)
> > + * Description
> > + * Outputs a string into the **str** buffer of size 
> > **str_size**
> > + * based on a format string stored in a read-only map pointed 
> > by
> > + * **fmt**.
> > + *
> > + * Each format specifier in **fmt** corresponds to one u64 
> > element
> > + * in the **data** array. For strings and pointers where 
> > pointees
> > + * are accessed, only the pointer values are stored in the 
> > *data*
> > + * array. The *data_len* is the size of *data* in bytes.
> > + *
> > + * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
> > + * memory. Reading kernel memory may fail due to either invalid
> > + * address or valid address but requiring a major memory 
> > fault. If
> > + * reading kernel memory fails, the string for **%s** will be 
> > an
> > + * empty string, and the ip address for **%p{i,I}{4,6}** will 
> > be 0.
>
> would it make sense for sleepable programs to allow memory fault when
> reading memory?

Probably yes. How would you do that ? I'm guessing that in
bpf_trace_copy_string you would call either strncpy_from_X_nofault or
strncpy_from_X depending on a condition but I'm not sure which one.

> > + * Not returning error to bpf program is consistent with what
> > + * **bpf_trace_printk**\ () does for 

Re: [PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field

2021-04-06 Thread Florent Revest
On Sat, Mar 27, 2021 at 12:01 AM Andrii Nakryiko
 wrote:
>
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest  wrote:
> >
> > When initializing the __param array with a one liner, if all args are
> > const, the initial array value will be placed in the rodata section but
> > because libbpf does not support relocation in the rodata section, any
> > pointer in this array will stay NULL.
> >
> > Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
> > Signed-off-by: Florent Revest 
> > ---
> >  tools/lib/bpf/bpf_tracing.h | 26 ++
> >  1 file changed, 22 insertions(+), 4 deletions(-)
> >
> > diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
> > index f9ef37707888..d9a4c3f77ff4 100644
> > --- a/tools/lib/bpf/bpf_tracing.h
> > +++ b/tools/lib/bpf/bpf_tracing.h
> > @@ -413,6 +413,22 @@ typeof(name(0)) name(struct pt_regs *ctx)  
> > \
> >  }  
> > \
> >  static __always_inline typeof(name(0)) ##name(struct pt_regs *ctx, 
> > ##args)
> >
> > +#define ___bpf_fill0(arr, p, x)
>
> can you please double-check that no-argument BPF_SEQ_PRINTF won't
> generate a warning about spurious ';'? Maybe it's better to have zero
> case as `do {} while(0);` ?
>
> > +#define ___bpf_fill1(arr, p, x) arr[p] = x
> > +#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 
> > 1, args)
> > +#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 
> > 1, args)
> > +#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 
> > 1, args)
> > +#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 
> > 1, args)
> > +#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 
> > 1, args)
> > +#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 
> > 1, args)
> > +#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 
> > 1, args)
> > +#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 
> > 1, args)
> > +#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p 
> > + 1, args)
> > +#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p 
> > + 1, args)
> > +#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p 
> > + 1, args)
> > +#define ___bpf_fill(arr, args...) \
> > +   ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
>
> cool. this is regular enough to easily comprehend :)
>
> > +
> >  /*
> >   * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
> >   * in a structure.
> > @@ -421,12 +437,14 @@ static __always_inline typeof(name(0)) 
> > ##name(struct pt_regs *ctx, ##args)
> > ({  
> > \
> > _Pragma("GCC diagnostic push")  
> > \
> > _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  
> > \
> > +   unsigned long long ___param[___bpf_narg(args)]; 
> > \
> > static const char ___fmt[] = fmt;   
> > \
> > -   unsigned long long ___param[] = { args };   
> > \
> > +   int __ret;  
> > \
> > +   ___bpf_fill(___param, args);
> > \
> > _Pragma("GCC diagnostic pop")   
> > \
>
> Let's clean this up a little bit;
> 1. static const char ___fmt should be the very first
> 2. _Pragma scope should be minimal necessary, which includes only
> ___bpf_fill, right?
> 3. Empty line after int __ret; and let's keep three underscores for 
> consistency.
>
>
> > -   int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),
> > \
> > -   ___param, sizeof(___param));
> > \
> > -   ___ret; 
> > \
> > +   __ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt), 
> > \
> > +  ___param, sizeof(___param)); 
> > \
> > +   __ret;  
> > \
>
> but actually you don't need __ret at all, just bpf_seq_printf() here, right?

Agreed with everything and also the indentation comment in 5/6, thanks.
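
So the macro in v3 would look roughly like this (a sketch incorporating the
three points above plus dropping ___ret; whitespace approximate):

#define BPF_SEQ_PRINTF(seq, fmt, args...)			\
({								\
	static const char ___fmt[] = fmt;			\
	unsigned long long ___param[___bpf_narg(args)];		\
								\
	_Pragma("GCC diagnostic push")				\
	_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")	\
	___bpf_fill(___param, args);				\
	_Pragma("GCC diagnostic pop")				\
								\
	bpf_seq_printf(seq, ___fmt, sizeof(___fmt),		\
		       ___param, sizeof(___param));		\
})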


Re: [PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

2021-04-06 Thread Florent Revest
On Sat, Mar 27, 2021 at 12:05 AM Andrii Nakryiko
 wrote:
>
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest  wrote:
> >
> > This exercises most of the format specifiers when things go well.
> >
> > Signed-off-by: Florent Revest 
> > ---
>
> Looks good. Please add a no-argument test case as well.

Agreed

> Acked-by: Andrii Nakryiko 
>
> >  .../selftests/bpf/prog_tests/snprintf.c   | 65 +++
> >  .../selftests/bpf/progs/test_snprintf.c   | 59 +
> >  2 files changed, 124 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> >
>
> [...]
>
> > +
> > +SEC("raw_tp/sys_enter")
> > +int handler(const void *ctx)
> > +{
> > +   /* Convenient values to pretty-print */
> > +   const __u8 ex_ipv4[] = {127, 0, 0, 1};
> > +   const __u8 ex_ipv6[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
> > 0, 1};
> > +   const char str1[] = "str1";
> > +   const char longstr[] = "longstr";
> > +   extern const void schedule __ksym;
>
> oh, fancy. I'd move it out of this function into global space, though,
> to make it more apparent. I almost missed that it's a special one.

Just schedule? Alright.
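
i.e. roughly this shape, with the extern hoisted to file scope (sketch):

extern const void schedule __ksym;

SEC("raw_tp/sys_enter")
int handler(const void *ctx)
{
	/* ... */
	sym_ret = BPF_SNPRINTF(sym_out, sizeof(sym_out), "%ps %pS %pB",
			       &schedule, &schedule, &schedule);
	/* ... */
	return 0;
}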


Re: [PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-04-06 Thread Florent Revest
On Fri, Mar 26, 2021 at 11:23 PM Andrii Nakryiko
 wrote:
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest  wrote:
> > +
> > +   map_off = reg->off + reg->var_off.value;
> > +   err = map->ops->map_direct_value_addr(map, _addr, 
> > map_off);
> > +   if (err)
> > +   return err;
> > +
> > +   str_ptr = (char *)(long)(map_addr);
> > +   if (!strnchr(str_ptr + map_off,
> > +map->value_size - reg->off - map_off, 0))
>
> you are double subtracting reg->off here. isn't map->value_size -
> map_off what you want?

Good catch!

> > +   verbose(env, "string is not zero-terminated\n");
>
> I'd prefer `return -EINVAL;`, but at least set err, otherwise what's the 
> point?

Ah yeah, absolutely.
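
So the end of the check becomes something like (same shape as the later
revision quoted at the top of this digest):

	str_ptr = (char *)(long)(map_addr);
	if (!strnchr(str_ptr + map_off, map->value_size - map_off, 0)) {
		verbose(env, "string is not zero-terminated\n");
		return -EINVAL;
	}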


Re: [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-04-06 Thread Florent Revest
[Sorry for the late replies, I'm just back from a long easter break :)]

On Fri, Mar 26, 2021 at 11:51 PM Andrii Nakryiko
 wrote:
> On Fri, Mar 26, 2021 at 2:53 PM Andrii Nakryiko
>  wrote:
> > On Tue, Mar 23, 2021 at 7:23 PM Florent Revest  wrote:
> > > Unfortunately, the implementation of the two existing helpers already
> > > drifted quite a bit and unifying them entailed a lot of changes:
> >
> > "Unfortunately" as in a lot of extra work for you? I think overall
> > though it was very fortunate that you ended up doing it, all
> > implementations are more feature-complete and saner now, no? Thanks a
> > lot for your hard work!

Ahah, "unfortunately" a bit of extra work for me, indeed. But I find
this kind of refactoring patches even harder to review than to write
so thank you too!

> > > - bpf_trace_printk always expected fmt[fmt_size] to be the terminating
> > >   NULL character, this is no longer true, the first 0 is terminating.
> >
> > You mean if you had bpf_trace_printk("bla bla\0some more bla\0", 24)
> > it would emit that zero character? If yes, I don't think it was a sane
> > behavior anyways.

The call to snprintf in bpf_do_trace_printk would eventually ignore
"some more bla" but the parsing done in bpf_trace_printk would indeed
read the whole string.

> > This is great, you already saved some lines of code! I suspect I'll
> > have some complaints about mods (it feels like this preample should
> > provide extra information about which arguments have to be read from
> > kernel/user memory, but I'll see next patches first.
>
> Disregard the last part (at least for now). I had a mental model that
> it should be possible to parse a format string once and then remember
> "instructions" (i.e., arg1 is long, arg2 is string, and so on). But
> that's too complicated, so I think re-parsing the format string is
> much simpler.

I also wanted to do that originally but realized it would keep a lot
of the complexity in the helpers themselves and not really move the
needle.

> > > +/* Horrid workaround for getting va_list handling working with different
> > > + * argument type combinations generically for 32 and 64 bit archs.
> > > + */
> > > +#define BPF_CAST_FMT_ARG(arg_nb, args, mod)\
> > > +   ((mod[arg_nb] == BPF_PRINTF_LONG_LONG ||\
> > > +(mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64)) \
> > > + ? args[arg_nb]\
> > > + : ((mod[arg_nb] == BPF_PRINTF_LONG || \
> > > +(mod[arg_nb] == BPF_PRINTF_INT && __BITS_PER_LONG == 32))  \
> >
> > is this right? INT is always 32-bit, it's only LONG that differs.
> > Shouldn't the rule be
> >
> > (LONG_LONG || LONG && __BITS_PER_LONG) -> (__u64)args[args_nb]
> > (INT || LONG && __BITS_PER_LONG == 32) -> (__u32)args[args_nb]
> >
> > Does (long) cast do anything fancy when casting from u64? Sorry, maybe
> > I'm confused.

To be honest, I am also confused by that logic... :p My patch tries to
conserve exactly the same logic as "88a5c690b6 bpf: fix
bpf_trace_printk on 32 bit archs" because I was also afraid of missing
something and could not test it on 32 bit arches. From that commit
description, it is unclear to me what "u32 and long are passed
differently to u64, since the result of C conditional operators
follows the "usual arithmetic conversions" rules" means. Maybe Daniel
can comment on this ?

> > > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > > +   u64 *final_args, enum bpf_printf_mod_type *mod,
> > > +   u32 num_args)
> > > +{
> > > +   struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
> > > +   int err, i, fmt_cnt = 0, copy_size, used;
> > > +   char *unsafe_ptr = NULL, *tmp_buf = NULL;
> > > +   bool prepare_args = final_args && mod;
> >
> > probably better to enforce that both or none are specified, otherwise
> > return error

Fair :)

> it's actually three of them: raw_args, mod, and num_args, right? All
> three are either NULL or non-NULL.

It is a bit tricky to see from that patch but in "3/6 bpf: Add a
bpf_snprintf helper" the verifier code calls this function with
num_args != 0 to check whether the number of arguments is correct
without actually converting anything.

Also when the helper gets called, raw_args can come from the BPF
program and be NULL but in that case we will also have 

[PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field

2021-03-23 Thread Florent Revest
When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.

Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest 
---
 tools/lib/bpf/bpf_tracing.h | 26 ++
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index f9ef37707888..d9a4c3f77ff4 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -413,6 +413,22 @@ typeof(name(0)) name(struct pt_regs *ctx)  
\
 }  \
 static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
 
+#define ___bpf_fill0(arr, p, x)
+#define ___bpf_fill1(arr, p, x) arr[p] = x
+#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, 
args)
+#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, 
args)
+#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, 
args)
+#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, 
args)
+#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, 
args)
+#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, 
args)
+#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, 
args)
+#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, 
args)
+#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, 
args)
+#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 
1, args)
+#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 
1, args)
+#define ___bpf_fill(arr, args...) \
+   ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
+
 /*
  * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
  * in a structure.
@@ -421,12 +437,14 @@ static __always_inline typeof(name(0)) ##name(struct 
pt_regs *ctx, ##args)
({  \
_Pragma("GCC diagnostic push")  \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   unsigned long long ___param[___bpf_narg(args)]; \
static const char ___fmt[] = fmt;   \
-   unsigned long long ___param[] = { args };   \
+   int __ret;  \
+   ___bpf_fill(___param, args);\
_Pragma("GCC diagnostic pop")   \
-   int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),\
-   ___param, sizeof(___param));\
-   ___ret; \
+   __ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
+  ___param, sizeof(___param)); \
+   __ret;  \
})
 
 #endif
-- 
2.31.0.291.g576ba9dcdaf-goog



[PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-03-23 Thread Florent Revest
This type provides the guarantee that an argument is going to be a const
pointer to somewhere in a read-only map value. It also checks that this
pointer is followed by a zero character before the end of the map value.

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 38 ++
 2 files changed, 39 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a25730eaa148..7b5319d75b3e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -308,6 +308,7 @@ enum bpf_arg_type {
ARG_PTR_TO_PERCPU_BTF_ID,   /* pointer to in-kernel percpu type */
ARG_PTR_TO_FUNC,/* pointer to a bpf program function */
ARG_PTR_TO_STACK_OR_NULL,   /* pointer to stack or NULL */
+   ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only 
string */
__BPF_ARG_TYPE_MAX,
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e26c5170c953..9e03608725b4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4601,6 +4601,7 @@ static const struct bpf_reg_types spin_lock_types = { 
.types = { PTR_TO_MAP_VALU
 static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { 
PTR_TO_PERCPU_BTF_ID } };
 static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } 
};
 static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK 
} };
+static const struct bpf_reg_types const_str_ptr_types = { .types = { 
PTR_TO_MAP_VALUE } };
 
 static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
[ARG_PTR_TO_MAP_KEY]= &map_key_value_types,
@@ -4631,6 +4632,7 @@ static const struct bpf_reg_types 
*compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
[ARG_PTR_TO_PERCPU_BTF_ID]  = &percpu_btf_ptr_types,
[ARG_PTR_TO_FUNC]   = &func_ptr_types,
[ARG_PTR_TO_STACK_OR_NULL]  = &stack_ptr_types,
+   [ARG_PTR_TO_CONST_STR]  = &const_str_ptr_types,
 };
 
 static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
@@ -4881,6 +4883,42 @@ static int check_func_arg(struct bpf_verifier_env *env, 
u32 arg,
if (err)
return err;
err = check_ptr_alignment(env, reg, 0, size, true);
+   } else if (arg_type == ARG_PTR_TO_CONST_STR) {
+   struct bpf_map *map = reg->map_ptr;
+   int map_off;
+   u64 map_addr;
+   char *str_ptr;
+
+   if (reg->type != PTR_TO_MAP_VALUE || !map ||
+   !bpf_map_is_rdonly(map)) {
+   verbose(env, "R%d does not point to a readonly map'\n", 
regno);
+   return -EACCES;
+   }
+
+   if (!tnum_is_const(reg->var_off)) {
+   verbose(env, "R%d is not a constant address'\n", regno);
+   return -EACCES;
+   }
+
+   if (!map->ops->map_direct_value_addr) {
+   verbose(env, "no direct value access support for this 
map type\n");
+   return -EACCES;
+   }
+
+   err = check_map_access(env, regno, reg->off,
+  map->value_size - reg->off, false);
+   if (err)
+   return err;
+
+   map_off = reg->off + reg->var_off.value;
+   err = map->ops->map_direct_value_addr(map, &map_addr, map_off);
+   if (err)
+   return err;
+
+   str_ptr = (char *)(long)(map_addr);
+   if (!strnchr(str_ptr + map_off,
+map->value_size - reg->off - map_off, 0))
+   verbose(env, "string is not zero-terminated\n");
}
 
return err;
-- 
2.31.0.291.g576ba9dcdaf-goog



[PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper

2021-03-23 Thread Florent Revest
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. But because ARG_PTR_TO_CONST_STR guarantees to point to a
NULL-terminated read-only map, we don't need a format string length arg.

Because the format-string is known at verification time, we also move
most of the format string validation, currently done in formatting
helper calls, into the verifier logic. This makes debugging easier and
also slightly improves the runtime performance.

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h|  6 
 include/uapi/linux/bpf.h   | 28 ++
 kernel/bpf/helpers.c   |  2 ++
 kernel/bpf/verifier.c  | 41 +++
 kernel/trace/bpf_trace.c   | 52 ++
 tools/include/uapi/linux/bpf.h | 28 ++
 6 files changed, 157 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7b5319d75b3e..f3d9c8fa60b3 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1893,6 +1893,7 @@ extern const struct bpf_func_proto 
bpf_skc_to_tcp_request_sock_proto;
 extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_proto;
 extern const struct bpf_func_proto bpf_snprintf_btf_proto;
+extern const struct bpf_func_proto bpf_snprintf_proto;
 extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
@@ -2018,4 +2019,9 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type 
t,
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
+enum bpf_printf_mod_type;
+int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
+   u64 *final_args, enum bpf_printf_mod_type *mod,
+   u32 num_args);
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2d3036e292a9..86af61e912c6 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4660,6 +4660,33 @@ union bpf_attr {
  * Return
  * The number of traversed map elements for success, **-EINVAL** 
for
  * invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 
data_len)
+ * Description
+ * Outputs a string into the **str** buffer of size **str_size**
+ * based on a format string stored in a read-only map pointed by
+ * **fmt**.
+ *
+ * Each format specifier in **fmt** corresponds to one u64 element
+ * in the **data** array. For strings and pointers where pointees
+ * are accessed, only the pointer values are stored in the *data*
+ * array. The *data_len* is the size of *data* in bytes.
+ *
+ * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ * memory. Reading kernel memory may fail due to either invalid
+ * address or valid address but requiring a major memory fault. If
+ * reading kernel memory fails, the string for **%s** will be an
+ * empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ * Not returning error to bpf program is consistent with what
+ * **bpf_trace_printk**\ () does for now.
+ *
+ * Return
+ * The strictly positive length of the formatted string, including
+ * the trailing zero character. If the return value is greater than
+ * **str_size**, **str** contains a truncated string, guaranteed to
+ * be zero-terminated.
+ *
+ * Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)  \
FN(unspec), \
@@ -4827,6 +4854,7 @@ union bpf_attr {
FN(sock_from_file), \
FN(check_mtu),  \
FN(for_each_map_elem),  \
+   FN(snprintf),   \
/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 074800226327..12f4cfb04fe7 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -750,6 +750,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
return &bpf_probe_read_kernel_str_proto;
case BPF_FUNC_snprintf_btf:
return &bpf_snprintf_btf_proto;
+   case BPF_FUNC_snprintf:
+   return &bpf_snprintf_proto;
default:
return NULL;
}
diff --git a/kernel/bpf/verifier.c b/kernel

[PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

2021-03-23 Thread Florent Revest
This exercises most of the format specifiers when things go well.

Signed-off-by: Florent Revest 
---
 .../selftests/bpf/prog_tests/snprintf.c   | 65 +++
 .../selftests/bpf/progs/test_snprintf.c   | 59 +
 2 files changed, 124 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c 
b/tools/testing/selftests/bpf/prog_tests/snprintf.c
new file mode 100644
index ..948a05e6b2cb
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include 
+#include "test_snprintf.skel.h"
+
+#define EXP_NUM_OUT  "-8 9 96 -424242 1337 DABBAD00"
+#define EXP_NUM_RET  sizeof(EXP_NUM_OUT)
+
#define EXP_IP_OUT   "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
+#define EXP_IP_RET   sizeof(EXP_IP_OUT)
+
+/* The third specifier, %pB, depends on compiler inlining so don't check it */
+#define EXP_SYM_OUT  "schedule schedule+0x0/"
+#define MIN_SYM_RET  sizeof(EXP_SYM_OUT)
+
+/* The third specifier, %p, is a hashed pointer which changes on every reboot 
*/
+#define EXP_ADDR_OUT "0000000000000000 000000000add4e55 "
+#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
+
+#define EXP_STR_OUT  "str1 longstr"
+#define EXP_STR_RET  sizeof(EXP_STR_OUT)
+
+#define EXP_OVER_OUT "%over"
+#define EXP_OVER_RET 10
+
+void test_snprintf(void)
+{
+   char exp_addr_out[] = EXP_ADDR_OUT;
+   char exp_sym_out[]  = EXP_SYM_OUT;
+   struct test_snprintf *skel;
+
+   skel = test_snprintf__open_and_load();
+   if (!ASSERT_OK_PTR(skel, "skel_open"))
+   return;
+
+   if (!ASSERT_OK(test_snprintf__attach(skel), "skel_attach"))
+   goto cleanup;
+
+   /* trigger tracepoint */
+   usleep(1);
+
+   ASSERT_STREQ(skel->bss->num_out, EXP_NUM_OUT, "num_out");
+   ASSERT_EQ(skel->bss->num_ret, EXP_NUM_RET, "num_ret");
+
+   ASSERT_STREQ(skel->bss->ip_out, EXP_IP_OUT, "ip_out");
+   ASSERT_EQ(skel->bss->ip_ret, EXP_IP_RET, "ip_ret");
+
+   ASSERT_OK(memcmp(skel->bss->sym_out, exp_sym_out,
+sizeof(exp_sym_out) - 1), "sym_out");
+   ASSERT_LT(MIN_SYM_RET, skel->bss->sym_ret, "sym_ret");
+
+   ASSERT_OK(memcmp(skel->bss->addr_out, exp_addr_out,
+sizeof(exp_addr_out) - 1), "addr_out");
+   ASSERT_EQ(skel->bss->addr_ret, EXP_ADDR_RET, "addr_ret");
+
+   ASSERT_STREQ(skel->bss->str_out, EXP_STR_OUT, "str_out");
+   ASSERT_EQ(skel->bss->str_ret, EXP_STR_RET, "str_ret");
+
+   ASSERT_STREQ(skel->bss->over_out, EXP_OVER_OUT, "over_out");
+   ASSERT_EQ(skel->bss->over_ret, EXP_OVER_RET, "over_ret");
+
+cleanup:
+   test_snprintf__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_snprintf.c 
b/tools/testing/selftests/bpf/progs/test_snprintf.c
new file mode 100644
index ..e18709055fad
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_snprintf.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include 
+#include 
+#include 
+#include 
+
+char num_out[64] = {};
+long num_ret = 0;
+
+char ip_out[64] = {};
+long ip_ret = 0;
+
+char sym_out[64] = {};
+long sym_ret = 0;
+
+char addr_out[64] = {};
+long addr_ret = 0;
+
+char str_out[64] = {};
+long str_ret = 0;
+
+char over_out[6] = {};
+long over_ret = 0;
+
+SEC("raw_tp/sys_enter")
+int handler(const void *ctx)
+{
+   /* Convenient values to pretty-print */
+   const __u8 ex_ipv4[] = {127, 0, 0, 1};
+   const __u8 ex_ipv6[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1};
+   const char str1[] = "str1";
+   const char longstr[] = "longstr";
+   extern const void schedule __ksym;
+
+   /* Integer types */
+   num_ret  = BPF_SNPRINTF(num_out, sizeof(num_out),
+   "%d %u %x %li %llu %lX",
+   -8, 9, 150, -424242, 1337, 0xDABBAD00);
+   /* IP addresses */
+   ip_ret   = BPF_SNPRINTF(ip_out, sizeof(ip_out), "%pi4 %pI6",
+   &ex_ipv4, &ex_ipv6);
+   /* Symbol lookup formatting */
+   sym_ret  = BPF_SNPRINTF(sym_out,  sizeof(sym_out), "%ps %pS %pB",
+   &schedule, &schedule, &schedule);
+   /* Kernel pointers */
+   addr_ret = BPF_SNPRINTF(addr_out, sizeof(addr_out), "%pK %px %p",
+   0, 0x0ADD4E55, 0x0ADD4E55);
+   /* Strings embedding */
+   str_ret  = BPF_SNPRINTF(str_out, sizeof(str_out), "%s %+05s",
+   str1, longstr);
+   /* Overflow */
+   over_ret = BPF_SNPRINTF(over_out, sizeof(over_out), "%%overflow");
+
+   return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.31.0.291.g576ba9dcdaf-goog



[PATCH bpf-next v2 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro

2021-03-23 Thread Florent Revest
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.

Signed-off-by: Florent Revest 
---
 tools/lib/bpf/bpf_tracing.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index d9a4c3f77ff4..e5c6ede6060b 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -447,4 +447,22 @@ static __always_inline typeof(name(0)) ##name(struct 
pt_regs *ctx, ##args)
__ret;  \
})
 
+/*
+ * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead 
of
+ * an array of u64.
+ */
+#define BPF_SNPRINTF(out, out_size, fmt, args...)  \
+   ({  \
+   _Pragma("GCC diagnostic push")  \
+   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   unsigned long long ___param[___bpf_narg(args)]; \
+   static const char ___fmt[] = fmt;   \
+   int __ret;  \
+   ___bpf_fill(___param, args);\
+   _Pragma("GCC diagnostic pop")   \
+   __ret = bpf_snprintf(out, out_size, ___fmt, \
+___param, sizeof(___param));   \
+   __ret;  \
+   })
+
 #endif
-- 
2.31.0.291.g576ba9dcdaf-goog



[PATCH bpf-next v2 0/6] Add a snprintf eBPF helper

2021-03-23 Thread Florent Revest
We have a usecase where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)
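
For illustration, with this series such an audit boils down to something
like the following (a sketch only; BPF_SNPRINTF comes from patch 5/6,
includes and CO-RE plumbing omitted):

SEC("fentry/nf_register_net_hook")
int BPF_PROG(audit_nf_hook, struct net *net, const struct nf_hook_ops *reg)
{
	char sym[64];

	/* %ps resolves the registered hook function to its symbol name */
	BPF_SNPRINTF(sym, sizeof(sym), "%ps", reg->hook);
	bpf_printk("nf hook registered: %s", sym);
	return 0;
}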

A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.

This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/20201126165748.1748417-1-rev...@google.com/T/#u

- A new arg type guarantees the NULL-termination of string arguments and
  lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
  string is known at verification time, the format string validation is
  done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
  marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/20210310015455.1095207-1-rev...@chromium.org/T/#u

---
Changes in v2:
- Extracted the format validation/argument sanitization in a generic way
  for all printf-like helpers.
- bpf_snprintf's str_size can now be 0
- bpf_snprintf is now exposed to all BPF program types
- We now preempt_disable when using a per-cpu temporary buffer
- Addressed a few cosmetic changes

Florent Revest (6):
  bpf: Factorize bpf_trace_printk and bpf_seq_printf
  bpf: Add a ARG_PTR_TO_CONST_STR argument type
  bpf: Add a bpf_snprintf helper
  libbpf: Initialize the bpf_seq_printf parameters array field by field
  libbpf: Introduce a BPF_SNPRINTF helper macro
  selftests/bpf: Add a series of tests for bpf_snprintf

 include/linux/bpf.h   |   7 +
 include/uapi/linux/bpf.h  |  28 +
 kernel/bpf/helpers.c  |   2 +
 kernel/bpf/verifier.c |  79 +++
 kernel/trace/bpf_trace.c  | 581 +-
 tools/include/uapi/linux/bpf.h|  28 +
 tools/lib/bpf/bpf_tracing.h   |  44 +-
 .../selftests/bpf/prog_tests/snprintf.c   |  65 ++
 .../selftests/bpf/progs/test_snprintf.c   |  59 ++
 9 files changed, 604 insertions(+), 289 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c

-- 
2.31.0.291.g576ba9dcdaf-goog



[PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

2021-03-23 Thread Florent Revest
Two helpers (trace_printk and seq_printf) have very similar
implementations of format string parsing and a third one is coming
(snprintf). To avoid code duplication and make the code easier to
maintain, this moves the operations associated with format string
parsing (validation and argument sanitization) into one generic
function.

Unfortunately, the implementation of the two existing helpers already
drifted quite a bit and unifying them entailed a lot of changes:

- bpf_trace_printk always expected fmt[fmt_size] to be the terminating
  NULL character, this is no longer true, the first 0 is terminating.
- bpf_trace_printk now supports %% (which produces the percentage char).
- bpf_trace_printk now skips width formatting fields.
- bpf_trace_printk now supports the X modifier (capital hexadecimal).
- bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
- argument casting on 32 bit has been simplified into one macro and
  using an enum instead of obscure int increments.

- bpf_seq_printf now uses bpf_trace_copy_string instead of
  strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
- bpf_seq_printf now prints longs correctly on 32 bit architectures.

- both were changed to use a global per-cpu tmp buffer instead of one
  stack buffer for trace_printk and 6 small buffers for seq_printf.
- to avoid per-cpu buffer usage conflict, these helpers disable
  preemption while the per-cpu buffer is in use.
- both helpers now support the %ps and %pS specifiers to print symbols.

Signed-off-by: Florent Revest 
---
 kernel/trace/bpf_trace.c | 529 ++-
 1 file changed, 244 insertions(+), 285 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 0d23755c2747..0fdca94a3c9c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -372,7 +372,7 @@ static const struct bpf_func_proto 
*bpf_get_probe_write_proto(void)
return &bpf_probe_write_user_proto;
 }
 
-static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
+static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
size_t bufsz)
 {
void __user *user_ptr = (__force void __user *)unsafe_ptr;
@@ -382,178 +382,284 @@ static void bpf_trace_copy_string(char *buf, void 
*unsafe_ptr, char fmt_ptype,
switch (fmt_ptype) {
case 's':
 #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
-   if ((unsigned long)unsafe_ptr < TASK_SIZE) {
-   strncpy_from_user_nofault(buf, user_ptr, bufsz);
-   break;
-   }
+   if ((unsigned long)unsafe_ptr < TASK_SIZE)
+   return strncpy_from_user_nofault(buf, user_ptr, bufsz);
fallthrough;
 #endif
case 'k':
-   strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
-   break;
+   return strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
case 'u':
-   strncpy_from_user_nofault(buf, user_ptr, bufsz);
-   break;
+   return strncpy_from_user_nofault(buf, user_ptr, bufsz);
}
+
+   return -EINVAL;
 }
 
 static DEFINE_RAW_SPINLOCK(trace_printk_lock);
 
-#define BPF_TRACE_PRINTK_SIZE   1024
+enum bpf_printf_mod_type {
+   BPF_PRINTF_INT,
+   BPF_PRINTF_LONG,
+   BPF_PRINTF_LONG_LONG,
+};
 
-static __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
-{
-   static char buf[BPF_TRACE_PRINTK_SIZE];
-   unsigned long flags;
-   va_list ap;
-   int ret;
+/* Horrid workaround for getting va_list handling working with different
+ * argument type combinations generically for 32 and 64 bit archs.
+ */
+#define BPF_CAST_FMT_ARG(arg_nb, args, mod)\
+   ((mod[arg_nb] == BPF_PRINTF_LONG_LONG ||\
+(mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64)) \
+ ? args[arg_nb]\
+ : ((mod[arg_nb] == BPF_PRINTF_LONG || \
+(mod[arg_nb] == BPF_PRINTF_INT && __BITS_PER_LONG == 32))  \
+ ? (long)args[arg_nb]  \
+ : (u32)args[arg_nb]))
+
+/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p
+ */
+#define MAX_PRINTF_BUF_LEN 512
 
-   raw_spin_lock_irqsave(&trace_printk_lock, flags);
-   va_start(ap, fmt);
-   ret = vsnprintf(buf, sizeof(buf), fmt, ap);
-   va_end(ap);
-   /* vsnprintf() will not append null for zero-length strings */
-   if (ret == 0)
-   buf[0] = '\0';
-   trace_bpf_trace_printk(buf);
-   raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
+struct bpf_printf_buf {
+   char tmp_buf[MAX_PRINTF_BUF_LEN];
+};
+static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
+static DEFINE_PER_CPU(int, bpf_printf_buf_used)

Re: [PATCH bpf-next 2/5] bpf: Add a bpf_snprintf helper

2021-03-23 Thread Florent Revest
On Tue, Mar 23, 2021 at 4:21 AM Alexei Starovoitov
 wrote:
>
> On Wed, Mar 10, 2021 at 11:02:08PM +0100, Florent Revest wrote:
> >
> > +struct bpf_snprintf_buf {
> > + char buf[MAX_SNPRINTF_MEMCPY][MAX_SNPRINTF_STR_LEN];
> > +};
> > +static DEFINE_PER_CPU(struct bpf_snprintf_buf, bpf_snprintf_buf);
> > +static DEFINE_PER_CPU(int, bpf_snprintf_buf_used);
> > +
> > +BPF_CALL_5(bpf_snprintf, char *, out, u32, out_size, char *, fmt, u64 *, 
> > args,
> > +u32, args_len)
> > +{
> > + int err, i, buf_used, copy_size, fmt_cnt = 0, memcpy_cnt = 0;
> > + u64 params[MAX_SNPRINTF_VARARGS];
> > + struct bpf_snprintf_buf *bufs;
> > +
> > + buf_used = this_cpu_inc_return(bpf_snprintf_buf_used);
> > + if (WARN_ON_ONCE(buf_used > 1)) {
>
> this can trigger only if the helper itself gets preempted and
> another bpf prog will run on the same cpu and will call into this helper
> again, right?
> If so, how about adding preempt_disable here to avoid this case?

Ah, neat, that sounds like a good idea indeed. This was really just
cargo-culted from bpf_seq_printf but as part of my grand unification
attempt for the various printf-like helpers, I can try to make it use
preempt_disable as well yes.

> It won't prevent the case where kprobe is inside snprintf core,
> so the counter is still needed, but it wouldn't trigger by accident.

Good point, I will keep it around then.

> Also since bufs are not used always, how about grabbing the
> buffers only when %p or %s are seen in fmt?
> After snprintf() is done it would conditionally do:
> if (bufs_were_used) {
>this_cpu_dec(bpf_snprintf_buf_used);
>preempt_enable();
> }
> This way simple bpf_snprintf won't ever hit EBUSY.

Absolutely, it would be nice. :)
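
Something along these lines, I suppose (a sketch; the fmt_needs_buf
condition and the bufs_acquired flag are made up for illustration):

	bool bufs_acquired = false;

	if (fmt_needs_buf) {	/* only when %s or %p conversions occur */
		preempt_disable();
		if (this_cpu_inc_return(bpf_snprintf_buf_used) > 1) {
			this_cpu_dec(bpf_snprintf_buf_used);
			preempt_enable();
			return -EBUSY;
		}
		bufs = this_cpu_ptr(&bpf_snprintf_buf);
		bufs_acquired = true;
	}

	/* ... format into out ... */

	if (bufs_acquired) {
		this_cpu_dec(bpf_snprintf_buf_used);
		preempt_enable();
	}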

> > + err = -EBUSY;
> > + goto out;
> > + }
> > +
> > +   bufs = this_cpu_ptr(&bpf_snprintf_buf);
> > +
> > + /*
> > +  * The verifier has already done most of the heavy-work for us in
> > +  * check_bpf_snprintf_call. We know that fmt is well formatted and 
> > that
> > +  * args_len is valid. The only task left is to convert some of the
> > +  * arguments. For the %s and %pi* specifiers, we need to read buffers
> > +  * from a kernel address during the helper call.
> > +  */
> > + for (i = 0; fmt[i] != '\0'; i++) {
> > + if (fmt[i] != '%')
> > + continue;
> > +
> > + if (fmt[i + 1] == '%') {
> > + i++;
> > + continue;
> > + }
> > +
> > + /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] 
> > */
> > + i++;
> > +
> > + /* skip optional "[0 +-][num]" width formating field */
> > + while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||
> > +fmt[i] == ' ')
> > + i++;
> > + if (fmt[i] >= '1' && fmt[i] <= '9') {
> > + i++;
> > + while (fmt[i] >= '0' && fmt[i] <= '9')
> > + i++;
> > + }
> > +
> > + if (fmt[i] == 's') {
> > + void *unsafe_ptr = (void *)(long)args[fmt_cnt];
> > +
> > + err = 
> > strncpy_from_kernel_nofault(bufs->buf[memcpy_cnt],
> > +   unsafe_ptr,
> > +   
> > MAX_SNPRINTF_STR_LEN);
> > + if (err < 0)
> > + bufs->buf[memcpy_cnt][0] = '\0';
> > + params[fmt_cnt] = (u64)(long)bufs->buf[memcpy_cnt];
>
> how about:
> char buf[512]; instead?
> instead of memcpy_cnt++ remember how many bytes of the buf were used and
> copy next arg after that.
> The scratch space would be used more efficiently.
> The helper would potentially return ENOSPC if the first string printed via %s
> consumed most of the 512 space and the second string doesn't fit.
> But the verifier-time if (memcpy_cnt >= MAX_SNPRINTF_MEMCPY) can be removed.
> Ten small %s will work fine.

Cool! That is also a good idea :)

> We can allocate a page per-cpu when this helper is used by prog and free
> that page when all progs with bpf_snprintf are unloaded.
> But extra complexity is probably not worth it. I would start with 512 per-cpu.
> It's going to be enough for most users.

Yes, let's maybe keep that for later. I think there is already enough
complexity going into the printf-like helpers unification patch.
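
(For reference, the rough shape of the single-buffer scheme in that patch,
per %s conversion -- a sketch, not the exact code:)

	if (used >= MAX_PRINTF_BUF_LEN)
		return -ENOSPC;
	/* strncpy_from_kernel_nofault() returns the copied length including
	 * the NUL, or the count on truncation, and always terminates dst
	 */
	err = strncpy_from_kernel_nofault(tmp_buf + used, unsafe_ptr,
					  MAX_PRINTF_BUF_LEN - used);
	if (err < 0) {
		tmp_buf[used] = '\0';
		err = 1;
	}
	final_args[fmt_cnt] = (u64)(long)(tmp_buf + used);
	used += err;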

> Overall looks great. Cannot wait for v2 :)

Ahah wait until you see that patch! :D


Re: [PATCH bpf-next 1/5] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-03-17 Thread Florent Revest
On Wed, Mar 17, 2021 at 2:02 AM Andrii Nakryiko
 wrote:
> On Tue, Mar 16, 2021 at 5:46 PM Florent Revest  wrote:
> > On Wed, Mar 17, 2021 at 1:35 AM Andrii Nakryiko
> >  wrote:
> > > On Tue, Mar 16, 2021 at 4:58 PM Florent Revest  
> > > wrote:
> > > > On Tue, Mar 16, 2021 at 2:03 AM Andrii Nakryiko
> > > >  wrote:
> > > > > On Wed, Mar 10, 2021 at 2:02 PM Florent Revest  
> > > > > wrote:
> > > > > > +   } else if (arg_type == ARG_PTR_TO_CONST_STR) {
> > > > > > +   struct bpf_map *map = reg->map_ptr;
> > > > > > +   int map_off, i;
> > > > > > +   u64 map_addr;
> > > > > > +   char *map_ptr;
> > > > > > +
> > > > > > +   if (!map || !bpf_map_is_rdonly(map)) {
> > > > > > +   verbose(env, "R%d does not point to a 
> > > > > > readonly map'\n", regno);
> > > > > > +   return -EACCES;
> > > > > > +   }
> > > > > > +
> > > > > > +   if (!tnum_is_const(reg->var_off)) {
> > > > > > +   verbose(env, "R%d is not a constant 
> > > > > > address'\n", regno);
> > > > > > +   return -EACCES;
> > > > > > +   }
> > > > > > +
> > > > > > +   if (!map->ops->map_direct_value_addr) {
> > > > > > +   verbose(env, "no direct value access 
> > > > > > support for this map type\n");
> > > > > > +   return -EACCES;
> > > > > > +   }
> > > > > > +
> > > > > > +   err = check_helper_mem_access(env, regno,
> > > > > > + map->value_size - 
> > > > > > reg->off,
> > > > > > + false, meta);
> > > > >
> > > > > you expect reg to be PTR_TO_MAP_VALUE, so probably better to directly
> > > > > use check_map_access(). And double-check that register is of expected
> > > > > type. just the presence of ref->map_ptr might not be sufficient?
> > > >
> > > > Sorry, just making sure I understand your comment correctly, are you
> > > > suggesting that we:
> > > > 1- skip the check_map_access_type() currently done by
> > > > check_helper_mem_access()? or did you implicitly mean that we should
> > > > call it as well next to check_map_access() ?
> > >
> > > check_helper_mem_access() will call check_map_access() for
> > > PTR_TO_MAP_VALUE and we expect only PTR_TO_MAP_VALUE, right? So why go
> > > through check_helper_mem_access() if we know we need
> > > check_map_access()? Less indirection, more explicit. So I meant
> > > "replace check_helper_mem_access() with check_map_access()".
> >
> > Mhh I suspect there's still a misunderstanding, these function names
> > are really confusing ahah.
> > What about check_map_access*_type*. which is also called by
> > check_helper_mem_access (before check_map_access):
> > https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/bpf/verifier.c#n4329
> >
> > Your message sounds like we should skip it so I was asking if that's
> > what you also implicitly meant or if you missed it?
>
> ah, you meant READ/WRITE access? ok, let's keep
> check_helper_mem_access() then, never mind me

Ah cool, then we are on the same page :)


Re: [PATCH bpf-next 1/5] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-03-16 Thread Florent Revest
On Wed, Mar 17, 2021 at 1:35 AM Andrii Nakryiko
 wrote:
> On Tue, Mar 16, 2021 at 4:58 PM Florent Revest  wrote:
> > On Tue, Mar 16, 2021 at 2:03 AM Andrii Nakryiko
> >  wrote:
> > > On Wed, Mar 10, 2021 at 2:02 PM Florent Revest  
> > > wrote:
> > > > +   } else if (arg_type == ARG_PTR_TO_CONST_STR) {
> > > > +   struct bpf_map *map = reg->map_ptr;
> > > > +   int map_off, i;
> > > > +   u64 map_addr;
> > > > +   char *map_ptr;
> > > > +
> > > > +   if (!map || !bpf_map_is_rdonly(map)) {
> > > > +   verbose(env, "R%d does not point to a readonly 
> > > > map'\n", regno);
> > > > +   return -EACCES;
> > > > +   }
> > > > +
> > > > +   if (!tnum_is_const(reg->var_off)) {
> > > > +   verbose(env, "R%d is not a constant 
> > > > address'\n", regno);
> > > > +   return -EACCES;
> > > > +   }
> > > > +
> > > > +   if (!map->ops->map_direct_value_addr) {
> > > > +   verbose(env, "no direct value access support 
> > > > for this map type\n");
> > > > +   return -EACCES;
> > > > +   }
> > > > +
> > > > +   err = check_helper_mem_access(env, regno,
> > > > + map->value_size - 
> > > > reg->off,
> > > > + false, meta);
> > >
> > > you expect reg to be PTR_TO_MAP_VALUE, so probably better to directly
> > > use check_map_access(). And double-check that register is of expected
> > > type. just the presence of ref->map_ptr might not be sufficient?
> >
> > Sorry, just making sure I understand your comment correctly, are you
> > suggesting that we:
> > 1- skip the check_map_access_type() currently done by
> > check_helper_mem_access()? or did you implicitly mean that we should
> > call it as well next to check_map_access() ?
>
> check_helper_mem_access() will call check_map_access() for
> PTR_TO_MAP_VALUE and we expect only PTR_TO_MAP_VALUE, right? So why go
> through check_helper_mem_access() if we know we need
> check_map_access()? Less indirection, more explicit. So I meant
> "replace check_helper_mem_access() with check_map_access()".

Mhh I suspect there's still a misunderstanding, these function names
are really confusing ahah.
What about check_map_access*_type*. which is also called by
check_helper_mem_access (before check_map_access):
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/bpf/verifier.c#n4329

Your message sounds like we should skip it so I was asking if that's
what you also implicitly meant or if you missed it?

> > 2- enforce (reg->type == PTR_TO_MAP_VALUE) even if currently
> > guaranteed by compatible_reg_types, just to stay on the safe side ?
>
> I can't follow compatible_reg_types :( If it does, then I guess it's
> fine without this check.

It's alright, I can keep an extra check just for safety. :)


Re: [PATCH bpf-next 1/5] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-03-16 Thread Florent Revest
On Tue, Mar 16, 2021 at 2:03 AM Andrii Nakryiko
 wrote:
> On Wed, Mar 10, 2021 at 2:02 PM Florent Revest  wrote:
> > +   } else if (arg_type == ARG_PTR_TO_CONST_STR) {
> > +   struct bpf_map *map = reg->map_ptr;
> > +   int map_off, i;
> > +   u64 map_addr;
> > +   char *map_ptr;
> > +
> > +   if (!map || !bpf_map_is_rdonly(map)) {
> > +   verbose(env, "R%d does not point to a readonly 
> > map'\n", regno);
> > +   return -EACCES;
> > +   }
> > +
> > +   if (!tnum_is_const(reg->var_off)) {
> > +   verbose(env, "R%d is not a constant address'\n", 
> > regno);
> > +   return -EACCES;
> > +   }
> > +
> > +   if (!map->ops->map_direct_value_addr) {
> > +   verbose(env, "no direct value access support for 
> > this map type\n");
> > +   return -EACCES;
> > +   }
> > +
> > +   err = check_helper_mem_access(env, regno,
> > + map->value_size - reg->off,
> > + false, meta);
>
> you expect reg to be PTR_TO_MAP_VALUE, so probably better to directly
> use check_map_access(). And double-check that register is of expected
> type. just the presence of ref->map_ptr might not be sufficient?

Sorry, just making sure I understand your comment correctly, are you
suggesting that we:
1- skip the check_map_access_type() currently done by
check_helper_mem_access()? or did you implicitly mean that we should
call it as well next to check_map_access() ?
2- enforce (reg->type == PTR_TO_MAP_VALUE) even if currently
guaranteed by compatible_reg_types, just to stay on the safe side ?


Re: [PATCH bpf-next 3/5] libbpf: Initialize the bpf_seq_printf parameters array field by field

2021-03-16 Thread Florent Revest
On Tue, Mar 16, 2021 at 5:36 AM Andrii Nakryiko
 wrote:
> On Wed, Mar 10, 2021 at 2:02 PM Florent Revest  wrote:
> > +#define ___bpf_build_param0(narg, x)
> > +#define ___bpf_build_param1(narg, x) ___param[narg - 1] = x
> > +#define ___bpf_build_param2(narg, x, args...) ___param[narg - 2] = x; \
> > + ___bpf_build_param1(narg, 
> > args)
> > +#define ___bpf_build_param3(narg, x, args...) ___param[narg - 3] = x; \
> > + ___bpf_build_param2(narg, 
> > args)
> > +#define ___bpf_build_param4(narg, x, args...) ___param[narg - 4] = x; \
> > + ___bpf_build_param3(narg, 
> > args)
> > +#define ___bpf_build_param5(narg, x, args...) ___param[narg - 5] = x; \
> > + ___bpf_build_param4(narg, 
> > args)
> > +#define ___bpf_build_param6(narg, x, args...) ___param[narg - 6] = x; \
> > + ___bpf_build_param5(narg, 
> > args)
> > +#define ___bpf_build_param7(narg, x, args...) ___param[narg - 7] = x; \
> > + ___bpf_build_param6(narg, 
> > args)
> > +#define ___bpf_build_param8(narg, x, args...) ___param[narg - 8] = x; \
> > + ___bpf_build_param7(narg, 
> > args)
> > +#define ___bpf_build_param9(narg, x, args...) ___param[narg - 9] = x; \
> > + ___bpf_build_param8(narg, 
> > args)
> > +#define ___bpf_build_param10(narg, x, args...) ___param[narg - 10] = x; \
> > +  ___bpf_build_param9(narg, 
> > args)
> > +#define ___bpf_build_param11(narg, x, args...) ___param[narg - 11] = x; \
> > +  ___bpf_build_param10(narg, 
> > args)
> > +#define ___bpf_build_param12(narg, x, args...) ___param[narg - 12] = x; \
> > +  ___bpf_build_param11(narg, 
> > args)
>
> took me some time to get why the [narg - 12] :) it makes sense, but
> then I started wondering why not
>
> #define ___bpf_build_param12(narg, x, args...)
> ___bpf_build_param11(narg, args); ___param[11] = x
>
> ? seems more straightforward, no?

Unless I'm misunderstanding something, I don't think this would work.
The awkward "narg - 12" comes from the fact that these variadic macros
work by taking the first argument out of the variadic arguments (x
followed by args) and calling another macro with what's left (args).

So if you do ___bpf_build_param(arg1, arg2) you will have
___bpf_build_param2() called with arg1 and ___bpf_build_param1() called
with arg2. And if you do ___bpf_build_param(arg1, arg2, arg3) you will
have ___bpf_build_param3() called with arg1, ___bpf_build_param2()
called with arg2, and ___bpf_build_param1() called with arg3.
Basically, things are inverted: the position at which you need to
insert in ___param evolves in the opposite direction of the X after
___bpf_build_param, which is the number of arguments left.

No matter in which order ___bpf_build_paramX calls
___bpf_build_param(X-1) (before or after setting ___param[n]), you
cannot know from the macro name alone at which cell of ___param you
need to write the argument. (___bpf_build_param12 is the exception:
because the maximum number of args is 12, if this macro gets called we
know that narg = 12 and we always write at ___param[0].)

That being said, I share your concern that this code is hard to read.
So instead of giving narg to each macro, I tried to give a pos
argument which indicates in which cell the macro should write. pos is
basically a counter that goes from 0 to narg as macros go from narg to
0.

#define ___bpf_fill0(array, pos, x)
#define ___bpf_fill1(array, pos, x) array[pos] = x
#define ___bpf_fill2(array, pos, x, args...) array[pos] = x; ___bpf_fill1(array, pos + 1, args)
#define ___bpf_fill3(array, pos, x, args...) array[pos] = x; ___bpf_fill2(array, pos + 1, args)
#define ___bpf_fill4(array, pos, x, args...) array[pos] = x; ___bpf_fill3(array, pos + 1, args)
#define ___bpf_fill5(array, pos, x, args...) array[pos] = x; ___bpf_fill4(array, pos + 1, args)
#define ___bpf_fill6(array, pos, x, args...) array[pos] = x; ___bpf_fill5(array, pos + 1, args)
#define ___bpf_fill7(array, pos, x, args...) array[pos] = x; ___bpf_fill6(array, pos + 1, args)
#define ___bpf_fill8(array, pos, x, args...) array[pos] = x; ___bpf_fill7(array, pos + 1, args)
#define ___bpf_fill9(array, pos, x, args...) array[pos] = x; ___bpf_fill8(array, pos + 1, args)
#define ___bpf_fill10(array, pos, x, args...) array[pos] = x; ___bpf_fill9(array, pos + 1, args)
#define ___bpf_fill11(array, 

Re: [PATCH bpf-next 2/5] bpf: Add a bpf_snprintf helper

2021-03-16 Thread Florent Revest
On Tue, Mar 16, 2021 at 2:25 AM Andrii Nakryiko
 wrote:
>
> On Wed, Mar 10, 2021 at 2:02 PM Florent Revest  wrote:
> >
> > The implementation takes inspiration from the existing bpf_trace_printk
> > helper but there are a few differences:
> >
> > To allow for a large number of format-specifiers, parameters are
> > provided in an array, like in bpf_seq_printf.
> >
> > Because the output string takes two arguments and the array of
> > parameters also takes two arguments, the format string needs to fit in
> > one argument. But because ARG_PTR_TO_CONST_STR guarantees to point to a
> > NULL-terminated read-only map, we don't need a format string length arg.
> >
> > Because the format-string is known at verification time, we also move
> > most of the format string validation, currently done in formatting
> > helper calls, into the verifier logic. This makes debugging easier and
> > also slightly improves the runtime performance.
> >
> > Signed-off-by: Florent Revest 
> > ---
> >  include/linux/bpf.h|   4 +
> >  include/uapi/linux/bpf.h   |  28 +++
> >  kernel/bpf/verifier.c  | 137 +
> >  kernel/trace/bpf_trace.c   | 110 ++
> >  tools/include/uapi/linux/bpf.h |  28 +++
> >  5 files changed, 307 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 7b5319d75b3e..d78175c9a887 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1902,6 +1902,10 @@ extern const struct bpf_func_proto 
> > bpf_task_storage_get_proto;
> >  extern const struct bpf_func_proto bpf_task_storage_delete_proto;
> >  extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
> >
> > +#define MAX_SNPRINTF_VARARGS   12
> > +#define MAX_SNPRINTF_MEMCPY6
> > +#define MAX_SNPRINTF_STR_LEN   128
> > +
> >  const struct bpf_func_proto *bpf_tracing_func_proto(
> > enum bpf_func_id func_id, const struct bpf_prog *prog);
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 2d3036e292a9..3cbdc8ae00e7 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4660,6 +4660,33 @@ union bpf_attr {
> >   * Return
> >   * The number of traversed map elements for success, 
> > **-EINVAL** for
> >   * invalid **flags**.
> > + *
> > + * long bpf_snprintf(char *out, u32 out_size, const char *fmt, u64 *data, 
> > u32 data_len)
>
> bpf_snprintf_btf calls out and out_size str and str_size, let's be consistent?
>
> > + * Description
> > + * Outputs a string into the **out** buffer of size 
> > **out_size**
> > + * based on a format string stored in a read-only map pointed 
> > by
> > + * **fmt**.
> > + *
> > + * Each format specifier in **fmt** corresponds to one u64 
> > element
> > + * in the **data** array. For strings and pointers where 
> > pointees
> > + * are accessed, only the pointer values are stored in the 
> > *data*
> > + * array. The *data_len* is the size of *data* in bytes.
> > + *
> > + * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
> > + * memory. Reading kernel memory may fail due to either invalid
> > + * address or valid address but requiring a major memory 
> > fault. If
> > + * reading kernel memory fails, the string for **%s** will be 
> > an
> > + * empty string, and the ip address for **%p{i,I}{4,6}** will 
> > be 0.
> > + * Not returning error to bpf program is consistent with what
> > + * **bpf_trace_printk**\ () does for now.
> > + *
> > + * Return
> > + * The strictly positive length of the printed string, 
> > including
> > + * the trailing NUL character. If the return value is greater 
> > than
> > + * **out_size**, **out** contains a truncated string, without a
> > + * trailing NULL character.
>
> this deviates from the behavior in other BPF helpers dealing with
> strings. and it's extremely inconvenient for users to get
> non-zero-terminated string. I think we should always zero-terminate.
>
> > + *
> > + * Or **-EBUSY** if the per-CPU memory copy buffer is busy.
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)  \
> > FN(unspec),   

Re: [BUG] One-liner array initialization with two pointers in BPF results in NULLs

2021-03-10 Thread Florent Revest
On Wed, Mar 10, 2021 at 10:51 PM Andrii Nakryiko
 wrote:
> On Wed, Mar 10, 2021 at 12:12 PM Andrii Nakryiko
>  wrote:
> > On Wed, Mar 10, 2021 at 8:59 AM Yonghong Song  wrote:
> > > On 3/10/21 3:48 AM, Florent Revest wrote:
> > > > On Wed, Mar 10, 2021 at 6:16 AM Yonghong Song  wrote:
> > > >> On 3/9/21 7:43 PM, Yonghong Song wrote:
> > > >>> On 3/9/21 5:54 PM, Florent Revest wrote:
> > > >>>> I noticed that initializing an array of pointers using this syntax:
> > > >>>> __u64 array[] = { (__u64), (__u64) };
> > > >>>> (which is a fairly common operation with macros such as 
> > > >>>> BPF_SEQ_PRINTF)
> > > >>>> always results in array[0] and array[1] being NULL.
> > > >>>>
> > > >>>> Interestingly, if the array is only initialized with one pointer, ex:
> > > >>>> __u64 array[] = { (__u64) };
> > > >>>> Then array[0] will not be NULL.
> > > >>>>
> > > >>>> Or if the array is initialized field by field, ex:
> > > >>>> __u64 array[2];
> > > >>>> array[0] = (__u64)
> > > >>>> array[1] = (__u64)
> > > >>>> Then array[0] and array[1] will not be NULL either.
> > > >>>>
> > > >>>> I'm assuming that this should have something to do with relocations
> > > >>>> and might be a bug in clang or in libbpf but because I don't know 
> > > >>>> much
> > > >>>> about these, I thought that reporting could be a good first step. :)
> > > >>>
> > > >>> Thanks for reporting. What you guess is correct, this is due to
> > > >>> relocations :-(
> > > >>>
> > > >>> The compiler notoriously tend to put complex initial values into
> > > >>> rodata section. For example, for
> > > >>>  __u64 array[] = { (__u64), (__u64) };
> > > >>> the compiler will put
> > > >>>  { (__u64), (__u64) }
> > > >>> into rodata section.
> > > >>>
> > > >>> But  and  themselves need relocation since they are
> > > >>> address of static variables which will sit inside .data section.
> > > >>>
> > > >>> So in the elf file, you will see the following relocations:
> > > >>>
> > > >>> RELOCATION RECORDS FOR [.rodata]:
> > > >>> OFFSET   TYPE VALUE
> > > >>> 0018 R_BPF_64_64  .data
> > > >>> 0020 R_BPF_64_64  .data
> > > >
> > > > Right :) Thank you for the explanations Yonghong!
> > > >
> > > >>> Currently, libbpf does not handle relocation inside .rodata
> > > >>> section, so they content remains 0.
> > > >
> > > > Just for my own edification, why is .rodata relocation not yet handled
> > > > in libbpf ? Is it because of a read-only mapping that makes it more
> > > > difficult ?
> > >
> > > We don't have this use case before. In general, people do not put
> > > string pointers in init code in the declaration. I think
> > > bpf_seq_printf() is special about this and hence triggering
> > > the issue.

Fair enough. The only reasonable use case that I can think of is a
selftest like the one I wrote for bpf_snprintf, and the macro in
bpf_tracing.h will be a good enough workaround for that.

> > > To support relocation of rodata section, kernel needs to be
> > > involved and this is actually more complicated as
> >
> > Exactly. It would be trivial for libbpf to support it, but it needs to
> > resolve to the actual in-kernel address of a map (plus offset), which
> > libbpf has no way of knowing.

Ah right, I see now, thanks! Indeed this would be quite complex and
probably not very useful.

> Having said that, libbpf should probably error out when such
> relocation is present, because there is no way the application with
> such relocations is going to be correct.

Good point, it would have helped me notice the problem earlier. :)


[PATCH bpf-next 5/5] selftests/bpf: Add a series of tests for bpf_snprintf

2021-03-10 Thread Florent Revest
This exercises most of the format specifiers when things go well.

Signed-off-by: Florent Revest 
---
 .../selftests/bpf/prog_tests/snprintf.c   | 71 +++
 .../selftests/bpf/progs/test_snprintf.c   | 71 +++
 2 files changed, 142 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c 
b/tools/testing/selftests/bpf/prog_tests/snprintf.c
new file mode 100644
index ..23af1dbd1eeb
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include 
+#include "test_snprintf.skel.h"
+
+static int duration;
+
+#define EXP_NUM_OUT  "-8 9 96 -424242 1337 DABBAD00"
+#define EXP_NUM_RET  sizeof(EXP_NUM_OUT)
+
+#define EXP_IP_OUT   "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
+#define EXP_IP_RET   sizeof(EXP_IP_OUT)
+
+/* The third specifier, %pB, depends on compiler inlining so don't check it */
+#define EXP_SYM_OUT  "schedule schedule+0x0/"
+#define MIN_SYM_RET  sizeof(EXP_SYM_OUT)
+
+/* The third specifier, %p, is a hashed pointer which changes on every reboot 
*/
+#define EXP_ADDR_OUT " 0add4e55 "
+#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
+
+#define EXP_STR_OUT  "str1 longstr"
+#define EXP_STR_RET  sizeof(EXP_STR_OUT)
+
+#define EXP_OVER_OUT {'%', 'o', 'v', 'e', 'r'}
+#define EXP_OVER_RET 10
+
+void test_snprintf(void)
+{
+   char exp_addr_out[] = EXP_ADDR_OUT;
+   char exp_over_out[] = EXP_OVER_OUT;
+   char exp_sym_out[]  = EXP_SYM_OUT;
+   struct test_snprintf *skel;
+   int err;
+
+   skel = test_snprintf__open_and_load();
+   if (CHECK(!skel, "skel_open", "failed to open and load skeleton\n"))
+   return;
+
+   err = test_snprintf__attach(skel);
+   if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
+   goto cleanup;
+
+   /* trigger tracepoint */
+   usleep(1);
+
+   ASSERT_STREQ(skel->bss->num_out, EXP_NUM_OUT, "num_out");
+   ASSERT_EQ(skel->bss->num_ret, EXP_NUM_RET, "num_ret");
+
+   ASSERT_STREQ(skel->bss->ip_out, EXP_IP_OUT, "ip_out");
+   ASSERT_EQ(skel->bss->ip_ret, EXP_IP_RET, "ip_ret");
+
+   ASSERT_OK(memcmp(skel->bss->sym_out, exp_sym_out,
+sizeof(exp_sym_out) - 1), "sym_out");
+   ASSERT_LT(MIN_SYM_RET, skel->bss->sym_ret, "sym_ret");
+
+   ASSERT_OK(memcmp(skel->bss->addr_out, exp_addr_out,
+sizeof(exp_addr_out) - 1), "addr_out");
+   ASSERT_EQ(skel->bss->addr_ret, EXP_ADDR_RET, "addr_ret");
+
+   ASSERT_STREQ(skel->bss->str_out, EXP_STR_OUT, "str_out");
+   ASSERT_EQ(skel->bss->str_ret, EXP_STR_RET, "str_ret");
+
+   ASSERT_OK(memcmp(skel->bss->over_out, exp_over_out,
+sizeof(exp_over_out)), "over_out");
+   ASSERT_EQ(skel->bss->over_ret, EXP_OVER_RET, "over_ret");
+
+cleanup:
+   test_snprintf__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_snprintf.c 
b/tools/testing/selftests/bpf/progs/test_snprintf.c
new file mode 100644
index ..6c8aa4988e69
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_snprintf.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include 
+#include 
+#include 
+#include 
+
+#define OUT_LEN 64
+
+/* Integer types */
+static const char num_fmt[] = "%d %u %x %li %llu %lX";
+#define NUMBERS -8, 9, 150, -424242, 1337, 0xDABBAD00
+
+char num_out[OUT_LEN] = {};
+long num_ret = 0;
+
+/* IP addresses */
+static const char ip_fmt[] = "%pi4 %pI6";
+static const __u8 dummy_ipv4[] = {127, 0, 0, 1}; /* 127.0.0.1 */
+static const __u32 dummy_ipv6[] = {0, 0, 0, bpf_htonl(1)}; /* ::1/128 */
+#define IPS &dummy_ipv4, &dummy_ipv6
+
+char ip_out[OUT_LEN] = {};
+long ip_ret = 0;
+
+/* Symbol lookup formatting */
+static const char sym_fmt[] = "%ps %pS %pB";
+extern const void schedule __ksym;
+#define SYMBOLS &schedule, &schedule, &schedule
+
+char sym_out[OUT_LEN] = {};
+long sym_ret = 0;
+
+/* Kernel pointers */
+static const char addr_fmt[] = "%pK %px %p";
+#define ADDRESSES 0, 0x0ADD4E55, 0x0ADD4E55
+
+char addr_out[OUT_LEN] = {};
+long addr_ret = 0;
+
+/* Strings embedding */
+static const char str_fmt[] = "%s %+05s";
+static const char str1[] = "str1";
+static const char longstr[] = "longstr";
+#define STRINGS str1, longstr
+
+char str_out[OUT_LEN] = {};
+long str_ret = 0;
+

[PATCH bpf-next 0/5] Add a snprintf eBPF helper

2021-03-10 Thread Florent Revest
We have a usecase where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)

A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.

This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/20201126165748.1748417-1-rev...@google.com/T/#u

- A new arg type guarantees the NULL-termination of string arguments and
  lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
  string is known at verification time, the format string validation is
  done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
  marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/20210310015455.1095207-1-rev...@chromium.org/T/#u
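
For illustration only (a sketch, not part of the series), the intended
usage for the nf_register_net_hook use case mentioned above looks
roughly like this, using the BPF_SNPRINTF macro from patch 4/5:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char out[64] = {};

SEC("fentry/nf_register_net_hook")
int BPF_PROG(audit_hook, struct net *net, const struct nf_hook_ops *reg)
{
	/* The format string must be a NUL-terminated constant in .rodata */
	static const char fmt[] = "netfilter hook fn: %ps";

	BPF_SNPRINTF(out, sizeof(out), fmt, reg->hook);
	return 0;
}

char _license[] SEC("license") = "GPL";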

Florent Revest (5):
  bpf: Add a ARG_PTR_TO_CONST_STR argument type
  bpf: Add a bpf_snprintf helper
  libbpf: Initialize the bpf_seq_printf parameters array field by field
  libbpf: Introduce a BPF_SNPRINTF helper macro
  selftests/bpf: Add a series of tests for bpf_snprintf

 include/linux/bpf.h   |   5 +
 include/uapi/linux/bpf.h  |  28 +++
 kernel/bpf/verifier.c | 178 ++
 kernel/trace/bpf_trace.c  | 110 +++
 tools/include/uapi/linux/bpf.h|  28 +++
 tools/lib/bpf/bpf_tracing.h   |  45 -
 .../selftests/bpf/prog_tests/snprintf.c   |  71 +++
 .../selftests/bpf/progs/test_snprintf.c   |  71 +++
 8 files changed, 535 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c

-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH bpf-next 3/5] libbpf: Initialize the bpf_seq_printf parameters array field by field

2021-03-10 Thread Florent Revest
When initializing the ___param array with a one-liner, if all args are
const, the initial array value will be placed in the rodata section, but
because libbpf does not support relocations in the rodata section, any
pointer in this array will stay NULL.

This is a workaround; ideally rodata relocations should be supported by
libbpf, but that would require a disproportionate amount of work given
the actual use cases. (It is very unlikely that one uses a const array
of relocated addresses.)

Signed-off-by: Florent Revest 
---
 tools/lib/bpf/bpf_tracing.h | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index f9ef37707888..f6a2deb3cd5b 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -413,6 +413,34 @@ typeof(name(0)) name(struct pt_regs *ctx)  
\
 }  \
 static __always_inline typeof(name(0)) ##name(struct pt_regs *ctx, ##args)
 
+#define ___bpf_build_param0(narg, x)
+#define ___bpf_build_param1(narg, x) ___param[narg - 1] = x
+#define ___bpf_build_param2(narg, x, args...) ___param[narg - 2] = x; \
+ ___bpf_build_param1(narg, args)
+#define ___bpf_build_param3(narg, x, args...) ___param[narg - 3] = x; \
+ ___bpf_build_param2(narg, args)
+#define ___bpf_build_param4(narg, x, args...) ___param[narg - 4] = x; \
+ ___bpf_build_param3(narg, args)
+#define ___bpf_build_param5(narg, x, args...) ___param[narg - 5] = x; \
+ ___bpf_build_param4(narg, args)
+#define ___bpf_build_param6(narg, x, args...) ___param[narg - 6] = x; \
+ ___bpf_build_param5(narg, args)
+#define ___bpf_build_param7(narg, x, args...) ___param[narg - 7] = x; \
+ ___bpf_build_param6(narg, args)
+#define ___bpf_build_param8(narg, x, args...) ___param[narg - 8] = x; \
+ ___bpf_build_param7(narg, args)
+#define ___bpf_build_param9(narg, x, args...) ___param[narg - 9] = x; \
+ ___bpf_build_param8(narg, args)
+#define ___bpf_build_param10(narg, x, args...) ___param[narg - 10] = x; \
+  ___bpf_build_param9(narg, args)
+#define ___bpf_build_param11(narg, x, args...) ___param[narg - 11] = x; \
+  ___bpf_build_param10(narg, args)
+#define ___bpf_build_param12(narg, x, args...) ___param[narg - 12] = x; \
+  ___bpf_build_param11(narg, args)
+#define ___bpf_build_param(args...) \
+   unsigned long long ___param[___bpf_narg(args)]; \
+   ___bpf_apply(___bpf_build_param, ___bpf_narg(args))(___bpf_narg(args), 
args)
+
 /*
  * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
  * in a structure.
@@ -422,7 +450,7 @@ static __always_inline typeof(name(0)) ##name(struct 
pt_regs *ctx, ##args)
_Pragma("GCC diagnostic push")  \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
static const char ___fmt[] = fmt;   \
-   unsigned long long ___param[] = { args };   \
+   ___bpf_build_param(args);   \
_Pragma("GCC diagnostic pop")   \
int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),\
___param, sizeof(___param));\
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH bpf-next 2/5] bpf: Add a bpf_snprintf helper

2021-03-10 Thread Florent Revest
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. But because ARG_PTR_TO_CONST_STR guarantees that the pointer
refers to a NUL-terminated string in a read-only map, we don't need a format
string length arg.

Because the format-string is known at verification time, we also move
most of the format string validation, currently done in formatting
helper calls, into the verifier logic. This makes debugging easier and
also slightly improves the runtime performance.
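
For illustration (a sketch based on the helper documentation below,
with out/fmt/data as placeholders), callers are expected to handle the
return value as follows:

	long n = bpf_snprintf(out, sizeof(out), fmt, data, sizeof(data));
	if (n < 0) {
		/* e.g. -EBUSY: the per-CPU conversion buffer was busy */
	} else if (n > sizeof(out)) {
		/* truncated: in this revision, out is then not NUL-terminated */
	}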

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h|   4 +
 include/uapi/linux/bpf.h   |  28 +++
 kernel/bpf/verifier.c  | 137 +
 kernel/trace/bpf_trace.c   | 110 ++
 tools/include/uapi/linux/bpf.h |  28 +++
 5 files changed, 307 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7b5319d75b3e..d78175c9a887 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1902,6 +1902,10 @@ extern const struct bpf_func_proto 
bpf_task_storage_get_proto;
 extern const struct bpf_func_proto bpf_task_storage_delete_proto;
 extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
 
+#define MAX_SNPRINTF_VARARGS   12
+#define MAX_SNPRINTF_MEMCPY6
+#define MAX_SNPRINTF_STR_LEN   128
+
 const struct bpf_func_proto *bpf_tracing_func_proto(
enum bpf_func_id func_id, const struct bpf_prog *prog);
 
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2d3036e292a9..3cbdc8ae00e7 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4660,6 +4660,33 @@ union bpf_attr {
  * Return
  * The number of traversed map elements for success, **-EINVAL** 
for
  * invalid **flags**.
+ *
+ * long bpf_snprintf(char *out, u32 out_size, const char *fmt, u64 *data, u32 
data_len)
+ * Description
+ * Outputs a string into the **out** buffer of size **out_size**
+ * based on a format string stored in a read-only map pointed by
+ * **fmt**.
+ *
+ * Each format specifier in **fmt** corresponds to one u64 element
+ * in the **data** array. For strings and pointers where pointees
+ * are accessed, only the pointer values are stored in the *data*
+ * array. The *data_len* is the size of *data* in bytes.
+ *
+ * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ * memory. Reading kernel memory may fail due to either invalid
+ * address or valid address but requiring a major memory fault. If
+ * reading kernel memory fails, the string for **%s** will be an
+ * empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ * Not returning error to bpf program is consistent with what
+ * **bpf_trace_printk**\ () does for now.
+ *
+ * Return
+ * The strictly positive length of the printed string, including
+ * the trailing NUL character. If the return value is greater than
+ * **out_size**, **out** contains a truncated string, without a
+ * trailing NULL character.
+ *
+ * Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)  \
FN(unspec), \
@@ -4827,6 +4854,7 @@ union bpf_attr {
FN(sock_from_file), \
FN(check_mtu),  \
FN(for_each_map_elem),  \
+   FN(snprintf),   \
/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c99b2b67dc8d..3ab549df817b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5732,6 +5732,137 @@ static int check_reference_leak(struct bpf_verifier_env 
*env)
return state->acquired_refs ? -EINVAL : 0;
 }
 
+int check_bpf_snprintf_call(struct bpf_verifier_env *env,
+   struct bpf_reg_state *regs)
+{
+   struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
+   struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
+   struct bpf_map *fmt_map = fmt_reg->map_ptr;
+   int err, fmt_map_off, i, fmt_cnt = 0, memcpy_cnt = 0, num_args;
+   u64 fmt_addr;
+   char *fmt;
+
+   /* data must be an array of u64 so data_len must be a multiple of 8 */
+   if (data_len_reg->var_off.value & 7)
+   return -EINVAL;
+   num_args = data_len_reg->var_off.value / 8;
+
+   /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
+* and map_direct_v

[PATCH bpf-next 4/5] libbpf: Introduce a BPF_SNPRINTF helper macro

2021-03-10 Thread Florent Revest
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.
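
Roughly (a sketch with the pragmas omitted), a call such as
BPF_SNPRINTF(out, sizeof(out), fmt, a, b) expands to:

({
	unsigned long long ___param[2];		/* ___bpf_build_param(a, b) */
	___param[0] = a;
	___param[1] = b;
	int ___ret = bpf_snprintf(out, sizeof(out), fmt,
				  ___param, sizeof(___param));
	___ret;
})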

Signed-off-by: Florent Revest 
---
 tools/lib/bpf/bpf_tracing.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index f6a2deb3cd5b..89e82da9b8a0 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -457,4 +457,19 @@ static __always_inline typeof(name(0)) ##name(struct 
pt_regs *ctx, ##args)
___ret; \
})
 
+/*
+ * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead 
of
+ * an array of u64.
+ */
+#define BPF_SNPRINTF(out, out_size, fmt, args...)  \
+   ({  \
+   _Pragma("GCC diagnostic push")  \
+   _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
+   ___bpf_build_param(args);   \
+   _Pragma("GCC diagnostic pop")   \
+   int ___ret = bpf_snprintf(out, out_size, fmt,   \
+   ___param, sizeof(___param));\
+   ___ret; \
+   })
+
 #endif
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH bpf-next 1/5] bpf: Add a ARG_PTR_TO_CONST_STR argument type

2021-03-10 Thread Florent Revest
This type provides the guarantee that an argument is going to be a const
pointer to somewhere in a read-only map value. It also checks that this
pointer is followed by a NULL character before the end of the map value.
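
For illustration (not part of the patch), the typical way for a program
to satisfy this argument type is a static const string, which clang
places in the read-only .rodata map:

	/* Accepted: lives in .rodata and is NUL-terminated */
	static const char fmt[] = "pid %d comm %s";

	/* A writable global (in .data) would be rejected because its map
	 * is not read-only, and a string that is not NUL-terminated
	 * before the end of the map value is rejected by the scan below.
	 */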

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 41 +
 2 files changed, 42 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a25730eaa148..7b5319d75b3e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -308,6 +308,7 @@ enum bpf_arg_type {
ARG_PTR_TO_PERCPU_BTF_ID,   /* pointer to in-kernel percpu type */
ARG_PTR_TO_FUNC,/* pointer to a bpf program function */
ARG_PTR_TO_STACK_OR_NULL,   /* pointer to stack or NULL */
+   ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only 
string */
__BPF_ARG_TYPE_MAX,
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f9096b049cd6..c99b2b67dc8d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4601,6 +4601,7 @@ static const struct bpf_reg_types spin_lock_types = { 
.types = { PTR_TO_MAP_VALU
 static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { 
PTR_TO_PERCPU_BTF_ID } };
 static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } 
};
 static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK 
} };
+static const struct bpf_reg_types const_str_ptr_types = { .types = { 
PTR_TO_MAP_VALUE } };
 
 static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
[ARG_PTR_TO_MAP_KEY]= &map_key_value_types,
@@ -4631,6 +4632,7 @@ static const struct bpf_reg_types 
*compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
[ARG_PTR_TO_PERCPU_BTF_ID]  = &percpu_btf_ptr_types,
[ARG_PTR_TO_FUNC]   = &func_ptr_types,
[ARG_PTR_TO_STACK_OR_NULL]  = &stack_ptr_types,
+   [ARG_PTR_TO_CONST_STR]  = &const_str_ptr_types,
 };
 
 static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
@@ -4881,6 +4883,45 @@ static int check_func_arg(struct bpf_verifier_env *env, 
u32 arg,
if (err)
return err;
err = check_ptr_alignment(env, reg, 0, size, true);
+   } else if (arg_type == ARG_PTR_TO_CONST_STR) {
+   struct bpf_map *map = reg->map_ptr;
+   int map_off, i;
+   u64 map_addr;
+   char *map_ptr;
+
+   if (!map || !bpf_map_is_rdonly(map)) {
+   verbose(env, "R%d does not point to a readonly map'\n", 
regno);
+   return -EACCES;
+   }
+
+   if (!tnum_is_const(reg->var_off)) {
+   verbose(env, "R%d is not a constant address'\n", regno);
+   return -EACCES;
+   }
+
+   if (!map->ops->map_direct_value_addr) {
+   verbose(env, "no direct value access support for this 
map type\n");
+   return -EACCES;
+   }
+
+   err = check_helper_mem_access(env, regno,
+ map->value_size - reg->off,
+ false, meta);
+   if (err)
+   return err;
+
+   map_off = reg->off + reg->var_off.value;
+   err = map->ops->map_direct_value_addr(map, &map_addr, map_off);
+   if (err)
+   return err;
+
+   map_ptr = (char *)(map_addr);
+   for (i = map_off; map_ptr[i] != '\0'; i++) {
+   if (i == map->value_size - 1) {
+   verbose(env, "map does not contain a 
NULL-terminated string\n");
+   return -EACCES;
+   }
+   }
}
 
return err;
-- 
2.30.1.766.gb4fecdf3b7-goog



Re: [BUG] One-liner array initialization with two pointers in BPF results in NULLs

2021-03-10 Thread Florent Revest
On Wed, Mar 10, 2021 at 6:16 AM Yonghong Song  wrote:
> On 3/9/21 7:43 PM, Yonghong Song wrote:
> > On 3/9/21 5:54 PM, Florent Revest wrote:
> >> I noticed that initializing an array of pointers using this syntax:
> >> __u64 array[] = { (__u64), (__u64) };
> >> (which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
> >> always results in array[0] and array[1] being NULL.
> >>
> >> Interestingly, if the array is only initialized with one pointer, ex:
> >> __u64 array[] = { (__u64) };
> >> Then array[0] will not be NULL.
> >>
> >> Or if the array is initialized field by field, ex:
> >> __u64 array[2];
> >> array[0] = (__u64)
> >> array[1] = (__u64)
> >> Then array[0] and array[1] will not be NULL either.
> >>
> >> I'm assuming that this should have something to do with relocations
> >> and might be a bug in clang or in libbpf but because I don't know much
> >> about these, I thought that reporting could be a good first step. :)
> >
> > Thanks for reporting. What you guess is correct, this is due to
> > relocations :-(
> >
> > The compiler notoriously tend to put complex initial values into
> > rodata section. For example, for
> > __u64 array[] = { (__u64), (__u64) };
> > the compiler will put
> > { (__u64), (__u64) }
> > into rodata section.
> >
> > But  and  themselves need relocation since they are
> > address of static variables which will sit inside .data section.
> >
> > So in the elf file, you will see the following relocations:
> >
> > RELOCATION RECORDS FOR [.rodata]:
> > OFFSET   TYPE VALUE
> > 0018 R_BPF_64_64  .data
> > 0020 R_BPF_64_64  .data

Right :) Thank you for the explanations Yonghong!

> > Currently, libbpf does not handle relocation inside .rodata
> > section, so they content remains 0.

Just for my own edification, why is .rodata relocation not yet handled
in libbpf ? Is it because of a read-only mapping that makes it more
difficult ?

> > That is why you see the issue with pointer as NULL.
> >
> > With array size of 1, compiler does not bother to put it into
> > rodata section.
> >
> > I *guess* that it works in the macro due to some kind of heuristics,
> > e.g., nested blocks, etc, and llvm did not promote the array init value
> > to rodata. I will double check whether llvm can complete prevent
> > such transformation.
> >
> > Maybe in the future libbpf is able to handle relocations for
> > rodata section too. But for the time being, please just consider to use
> > either macro, or the explicit array assignment.
>
> Digging into the compiler, the compiler tries to make *const* initial
> value into rodata section if the initial value size > 64, so in
> this case, macro does not work either. I think this is how you
> discovered the issue.

Indeed, I was using a macro similar to BPF_SEQ_PRINTF and this is how
I found the bug.

> The llvm does not provide target hooks to
> influence this transformation.

Oh, that is unfortunate :) Thanks for looking into it! I feel that the
real fix would be in libbpf anyway and the rest is just workarounds.

> So, there are two workarounds,
> (1).__u64 param_working[2];
>  param_working[0] = (__u64)str1;
>  param_working[1] = (__u64)str2;
> (2). BPF_SEQ_PRINTF(seq, "%s ", str1);
>   BPF_SEQ_PRINTF(seq, "%s", str2);

(2) is a bit impractical for my actual usecase. I am implementing a
bpf_snprintf helper (patch series Coming Soon TM) and I wanted to keep
the selftest short with a few BPF_SNPRINTF() calls that exercise most
format specifiers.

> In practice, if you have at least one non-const format argument,
> you should be fine. But if all format arguments are constant, then
> none of them should be strings.

Just for context, this does not only happen for strings but also for
all sorts of pointers, for example, when I try to do address lookup of
global __ksym variables, which is important for my selftest.

> Maybe we could change marco
> unsigned long long ___param[] = { args };
> to declare an array explicitly and then have a loop to
> assign each array element?

I think this would be a good workaround for now, indeed. :) I'll look
into it today and send it as part of my bpf_snprintf series.

Thanks!


[BUG] One-liner array initialization with two pointers in BPF results in NULLs

2021-03-09 Thread Florent Revest
I noticed that initializing an array of pointers using this syntax:
__u64 array[] = { (__u64), (__u64) };
(which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
always results in array[0] and array[1] being NULL.

Interestingly, if the array is only initialized with one pointer, ex:
__u64 array[] = { (__u64) };
Then array[0] will not be NULL.

Or if the array is initialized field by field, ex:
__u64 array[2];
array[0] = (__u64)
array[1] = (__u64)
Then array[0] and array[1] will not be NULL either.

I'm assuming that this should have something to do with relocations
and might be a bug in clang or in libbpf but because I don't know much
about these, I thought that reporting could be a good first step. :)

I attached below a repro with a dummy selftest that I expect should pass
but fails to pass with the latest clang and bpf-next. Hopefully, the
logic should be simple: I try to print two strings from pointers in an
array using bpf_seq_printf but depending on how the array is initialized
the helper either receives the string pointers or NULL pointers:

test_bug:FAIL:read unexpected read: actual 'str1= str2= str1=STR1
str2=STR2 ' != expected 'str1=STR1 str2=STR2 str1=STR1 str2=STR2 '

Signed-off-by: Florent Revest 
---
 tools/testing/selftests/bpf/prog_tests/bug.c | 41 +++
 tools/testing/selftests/bpf/progs/test_bug.c | 43 
 2 files changed, 84 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bug.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_bug.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bug.c 
b/tools/testing/selftests/bpf/prog_tests/bug.c
new file mode 100644
index ..4b0fafd936b7
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bug.c
@@ -0,0 +1,41 @@
+#include 
+#include "test_bug.skel.h"
+
+static int duration;
+
+void test_bug(void)
+{
+   struct test_bug *skel;
+   struct bpf_link *link;
+   char buf[64] = {};
+   int iter_fd, len;
+
+   skel = test_bug__open_and_load();
+   if (CHECK(!skel, "test_bug__open_and_load",
+ "skeleton open_and_load failed\n"))
+   goto destroy;
+
+   link = bpf_program__attach_iter(skel->progs.bug, NULL);
+   if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
+   goto destroy;
+
+   iter_fd = bpf_iter_create(bpf_link__fd(link));
+   if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
+   goto free_link;
+
+   len = read(iter_fd, buf, sizeof(buf));
+   CHECK(len < 0, "read", "read failed: %s\n", strerror(errno));
+   // BUG: We expect the strings to be printed in both cases but only the
+   // second case works.
+   // actual 'str1= str2= str1=STR1 str2=STR2 '
+   // != expected 'str1=STR1 str2=STR2 str1=STR1 str2=STR2 '
+   ASSERT_STREQ(buf, "str1=STR1 str2=STR2 str1=STR1 str2=STR2 ", "read");
+
+   close(iter_fd);
+
+free_link:
+   bpf_link__destroy(link);
+destroy:
+   test_bug__destroy(skel);
+}
+
diff --git a/tools/testing/selftests/bpf/progs/test_bug.c 
b/tools/testing/selftests/bpf/progs/test_bug.c
new file mode 100644
index ..c41e69483785
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_bug.c
@@ -0,0 +1,43 @@
+#include "bpf_iter.h"
+#include 
+#include 
+
+char _license[] SEC("license") = "GPL";
+
+SEC("iter/task")
+int bug(struct bpf_iter__task *ctx)
+{
+   struct seq_file *seq = ctx->meta->seq;
+
+   /* We want to print two strings */
+   static const char fmt[] = "str1=%s str2=%s ";
+   static char str1[] = "STR1";
+   static char str2[] = "STR2";
+
+   /*
+* Because bpf_seq_printf takes parameters to its format specifiers in
+* an array, we need to stuff pointers to str1 and str2 in a u64 array.
+*/
+
+   /* First, we try a one-liner array initialization. Note that this is
+* what the BPF_SEQ_PRINTF macro does under the hood. */
+   __u64 param_not_working[] = { (__u64)str1, (__u64)str2 };
+   /* But we also try a field by field initialization of the array. We
+* would expect the arrays and the behavior to be exactly the same. */
+   __u64 param_working[2];
+   param_working[0] = (__u64)str1;
+   param_working[1] = (__u64)str2;
+
+   /* For convenience, only print once */
+   if (ctx->meta->seq_num != 0)
+   return 0;
+
+   /* Using the one-liner array of params, it does not print the strings */
+   bpf_seq_printf(seq, fmt, sizeof(fmt),
+  param_not_working, sizeof(param_not_working));
+   /* Using the field-by-field array of params, it prints the strings */
+   bpf_seq_printf(seq, fmt, sizeof(fmt),
+  param_working, sizeof(param_working));
+
+   return 0;
+}
-- 
2.30.1.766.gb4fecdf3b7-goog



Re: [PATCH bpf-next v7 2/5] bpf: Expose bpf_get_socket_cookie to tracing programs

2021-02-10 Thread Florent Revest
On Wed, Feb 10, 2021 at 8:52 PM Andrii Nakryiko
 wrote:
>
> On Wed, Feb 10, 2021 at 3:14 AM Florent Revest  wrote:
> >
> > This needs a new helper that:
> > - can work in a sleepable context (using sock_gen_cookie)
> > - takes a struct sock pointer and checks that it's not NULL
> >
> > Signed-off-by: Florent Revest 
> > Acked-by: KP Singh 
> > ---
>
> It's customary to send cover letter with patch sets of 2 or more
> related patches. It's a good place to explain the motivation of a
> patch set. And a good place to ack all patches in one go ;)

You're right :) I first (naively!) thought it would be a short series
but it grew bigger than I originally thought. I will make sure I do in
the future. ;)

> Acked-by: Andrii Nakryiko 
>
>
> >  include/linux/bpf.h|  1 +
> >  include/uapi/linux/bpf.h   |  8 
> >  kernel/trace/bpf_trace.c   |  2 ++
> >  net/core/filter.c  | 12 
> >  tools/include/uapi/linux/bpf.h |  8 
> >  5 files changed, 31 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 321966fc35db..d212ae7d9731 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1888,6 +1888,7 @@ extern const struct bpf_func_proto 
> > bpf_per_cpu_ptr_proto;
> >  extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
> >  extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> >  extern const struct bpf_func_proto bpf_sock_from_file_proto;
> > +extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
> >
> >  const struct bpf_func_proto *bpf_tracing_func_proto(
> > enum bpf_func_id func_id, const struct bpf_prog *prog);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 0b735c2729b2..a8d9ad543300 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1673,6 +1673,14 @@ union bpf_attr {
> >   * Return
> >   * A 8-byte long unique number.
> >   *
> > + * u64 bpf_get_socket_cookie(struct sock *sk)
> > + * Description
> > + * Equivalent to **bpf_get_socket_cookie**\ () helper that 
> > accepts
> > + * *sk*, but gets socket from a BTF **struct sock**. This 
> > helper
> > + * also works for sleepable programs.
> > + * Return
> > + * A 8-byte long unique number or 0 if *sk* is NULL.
> > + *
> >   * u32 bpf_get_socket_uid(struct sk_buff *skb)
> >   * Return
> >   * The owner UID of the socket associated to *skb*. If the 
> > socket
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 6c0018abe68a..845b2168e006 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1760,6 +1760,8 @@ tracing_prog_func_proto(enum bpf_func_id func_id, 
> > const struct bpf_prog *prog)
> > return _sk_storage_delete_tracing_proto;
> > case BPF_FUNC_sock_from_file:
> > return _sock_from_file_proto;
> > +   case BPF_FUNC_get_socket_cookie:
> > +   return _get_socket_ptr_cookie_proto;
> >  #endif
> > case BPF_FUNC_seq_printf:
> > return prog->expected_attach_type == BPF_TRACE_ITER ?
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index e15d4741719a..57aaed478362 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -4631,6 +4631,18 @@ static const struct bpf_func_proto 
> > bpf_get_socket_cookie_sock_proto = {
> > .arg1_type  = ARG_PTR_TO_CTX,
> >  };
> >
> > +BPF_CALL_1(bpf_get_socket_ptr_cookie, struct sock *, sk)
> > +{
> > +   return sk ? sock_gen_cookie(sk) : 0;
> > +}
> > +
> > +const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto = {
> > +   .func   = bpf_get_socket_ptr_cookie,
> > +   .gpl_only   = false,
> > +   .ret_type   = RET_INTEGER,
> > +   .arg1_type  = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
> > +};
> > +
> >  BPF_CALL_1(bpf_get_socket_cookie_sock_ops, struct bpf_sock_ops_kern *, ctx)
> >  {
> > return __sock_gen_cookie(ctx->sk);
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 0b735c2729b2..a8d9ad543300 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -1673,6 +1673,14 @@ union bpf_attr {
> >   * Return
> >   * A 8-byte long uni

[PATCH bpf-next v7 5/5] selftests/bpf: Add a selftest for the tracing bpf_get_socket_cookie

2021-02-10 Thread Florent Revest
This builds up on the existing socket cookie test which checks whether
the bpf_get_socket_cookie helpers provide the same value in
cgroup/connect6 and sockops programs for a socket created by the
userspace part of the test.

Instead of having an update_cookie sockops program tag a socket local
storage with 0xFF, this uses both an update_cookie_sockops program and
an update_cookie_tracing program which successively tag the socket with
0x0F and then 0xF0.
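
The check on the userspace side then boils down to (sketch, with
local_port as an illustrative placeholder):

	/* set_cookie writes 0x0F, update_cookie_sockops ORs in the port,
	 * update_cookie_tracing ORs in 0xF0:
	 */
	__u32 cookie_expected_value = (local_port << 8) | 0x0F | 0xF0;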

Signed-off-by: Florent Revest 
Acked-by: KP Singh 
---
 .../selftests/bpf/prog_tests/socket_cookie.c  | 11 --
 .../selftests/bpf/progs/socket_cookie_prog.c  | 36 +--
 2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
index e12a31d3752c..232db28dde18 100644
--- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
@@ -35,9 +35,14 @@ void test_socket_cookie(void)
if (!ASSERT_OK_PTR(skel->links.set_cookie, "prog_attach"))
goto close_cgroup_fd;
 
-   skel->links.update_cookie = bpf_program__attach_cgroup(
-   skel->progs.update_cookie, cgroup_fd);
-   if (!ASSERT_OK_PTR(skel->links.update_cookie, "prog_attach"))
+   skel->links.update_cookie_sockops = bpf_program__attach_cgroup(
+   skel->progs.update_cookie_sockops, cgroup_fd);
+   if (!ASSERT_OK_PTR(skel->links.update_cookie_sockops, "prog_attach"))
+   goto close_cgroup_fd;
+
+   skel->links.update_cookie_tracing = bpf_program__attach(
+   skel->progs.update_cookie_tracing);
+   if (!ASSERT_OK_PTR(skel->links.update_cookie_tracing, "prog_attach"))
goto close_cgroup_fd;
 
server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
diff --git a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c 
b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
index fbd5eaf39720..35630a5aaf5f 100644
--- a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
+++ b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
@@ -5,6 +5,7 @@
 
 #include 
 #include 
+#include 
 
 #define AF_INET6 10
 
@@ -20,6 +21,14 @@ struct {
__type(value, struct socket_cookie);
 } socket_cookies SEC(".maps");
 
+/*
+ * These three programs get executed in a row on connect() syscalls. The
+ * userspace side of the test creates a client socket, issues a connect() on it
+ * and then checks that the local storage associated with this socket has:
+ * cookie_value == local_port << 8 | 0xFF
+ * The different parts of this cookie_value are appended by those hooks if they
+ * all agree on the output of bpf_get_socket_cookie().
+ */
 SEC("cgroup/connect6")
 int set_cookie(struct bpf_sock_addr *ctx)
 {
@@ -33,14 +42,14 @@ int set_cookie(struct bpf_sock_addr *ctx)
if (!p)
return 1;
 
-   p->cookie_value = 0xFF;
+   p->cookie_value = 0xF;
p->cookie_key = bpf_get_socket_cookie(ctx);
 
return 1;
 }
 
 SEC("sockops")
-int update_cookie(struct bpf_sock_ops *ctx)
+int update_cookie_sockops(struct bpf_sock_ops *ctx)
 {
struct bpf_sock *sk = ctx->sk;
struct socket_cookie *p;
@@ -61,9 +70,30 @@ int update_cookie(struct bpf_sock_ops *ctx)
if (p->cookie_key != bpf_get_socket_cookie(ctx))
return 1;
 
-   p->cookie_value = (ctx->local_port << 8) | p->cookie_value;
+   p->cookie_value |= (ctx->local_port << 8);
 
return 1;
 }
 
+SEC("fexit/inet_stream_connect")
+int BPF_PROG(update_cookie_tracing, struct socket *sock,
+struct sockaddr *uaddr, int addr_len, int flags)
+{
+   struct socket_cookie *p;
+
+   if (uaddr->sa_family != AF_INET6)
+   return 0;
+
+   p = bpf_sk_storage_get(&socket_cookies, sock->sk, 0, 0);
+   if (!p)
+   return 0;
+
+   if (p->cookie_key != bpf_get_socket_cookie(sock->sk))
+   return 0;
+
+   p->cookie_value |= 0xF0;
+
+   return 0;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.30.0.478.g8a0d178c01-goog



[PATCH bpf-next v7 4/5] selftests/bpf: Use vmlinux.h in socket_cookie_prog.c

2021-02-10 Thread Florent Revest
When migrating from the bpf.h's to the vmlinux.h's definition of struct
bpf_sock, an interesting LLVM behavior happened: LLVM started producing
two fetches of ctx->sk in the sockops program, which means that the
verifier could not keep track of the NULL-check on ctx->sk. Therefore,
we need to extract ctx->sk into a variable before checking and
dereferencing it.

Acked-by: KP Singh 
Acked-by: Andrii Nakryiko 
Signed-off-by: Florent Revest 
---
 .../testing/selftests/bpf/progs/socket_cookie_prog.c  | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c 
b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
index 81e84be6f86d..fbd5eaf39720 100644
--- a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
+++ b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
@@ -1,12 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2018 Facebook
 
-#include 
-#include 
+#include "vmlinux.h"
 
 #include 
 #include 
 
+#define AF_INET6 10
+
 struct socket_cookie {
__u64 cookie_key;
__u32 cookie_value;
@@ -41,7 +42,7 @@ int set_cookie(struct bpf_sock_addr *ctx)
 SEC("sockops")
 int update_cookie(struct bpf_sock_ops *ctx)
 {
-   struct bpf_sock *sk;
+   struct bpf_sock *sk = ctx->sk;
struct socket_cookie *p;
 
if (ctx->family != AF_INET6)
@@ -50,10 +51,10 @@ int update_cookie(struct bpf_sock_ops *ctx)
if (ctx->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
return 1;
 
-   if (!ctx->sk)
+   if (!sk)
return 1;
 
-   p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
+   p = bpf_sk_storage_get(&socket_cookies, sk, 0, 0);
if (!p)
return 1;
 
-- 
2.30.0.478.g8a0d178c01-goog



[PATCH bpf-next v7 2/5] bpf: Expose bpf_get_socket_cookie to tracing programs

2021-02-10 Thread Florent Revest
This needs a new helper that:
- can work in a sleepable context (using sock_gen_cookie)
- takes a struct sock pointer and checks that it's not NULL
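
For instance (an illustrative sketch modelled on the selftest in patch
5/5, not part of this patch), a tracing program can now do:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("fexit/inet_stream_connect")
int BPF_PROG(trace_connect, struct socket *sock)
{
	/* sock->sk may be NULL; the helper then returns 0 */
	__u64 cookie = bpf_get_socket_cookie(sock->sk);

	bpf_printk("socket cookie: %llu", cookie);
	return 0;
}

char _license[] SEC("license") = "GPL";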

Signed-off-by: Florent Revest 
Acked-by: KP Singh 
---
 include/linux/bpf.h|  1 +
 include/uapi/linux/bpf.h   |  8 
 kernel/trace/bpf_trace.c   |  2 ++
 net/core/filter.c  | 12 
 tools/include/uapi/linux/bpf.h |  8 
 5 files changed, 31 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 321966fc35db..d212ae7d9731 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1888,6 +1888,7 @@ extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
 extern const struct bpf_func_proto bpf_sock_from_file_proto;
+extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
 
 const struct bpf_func_proto *bpf_tracing_func_proto(
enum bpf_func_id func_id, const struct bpf_prog *prog);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0b735c2729b2..a8d9ad543300 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1673,6 +1673,14 @@ union bpf_attr {
  * Return
  * A 8-byte long unique number.
  *
+ * u64 bpf_get_socket_cookie(struct sock *sk)
+ * Description
+ * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
+ * *sk*, but gets socket from a BTF **struct sock**. This helper
+ * also works for sleepable programs.
+ * Return
+ * A 8-byte long unique number or 0 if *sk* is NULL.
+ *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
  * The owner UID of the socket associated to *skb*. If the socket
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 6c0018abe68a..845b2168e006 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1760,6 +1760,8 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const 
struct bpf_prog *prog)
return &bpf_sk_storage_delete_tracing_proto;
case BPF_FUNC_sock_from_file:
return &bpf_sock_from_file_proto;
+   case BPF_FUNC_get_socket_cookie:
+   return &bpf_get_socket_ptr_cookie_proto;
 #endif
case BPF_FUNC_seq_printf:
return prog->expected_attach_type == BPF_TRACE_ITER ?
diff --git a/net/core/filter.c b/net/core/filter.c
index e15d4741719a..57aaed478362 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4631,6 +4631,18 @@ static const struct bpf_func_proto 
bpf_get_socket_cookie_sock_proto = {
.arg1_type  = ARG_PTR_TO_CTX,
 };
 
+BPF_CALL_1(bpf_get_socket_ptr_cookie, struct sock *, sk)
+{
+   return sk ? sock_gen_cookie(sk) : 0;
+}
+
+const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto = {
+   .func   = bpf_get_socket_ptr_cookie,
+   .gpl_only   = false,
+   .ret_type   = RET_INTEGER,
+   .arg1_type  = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
+};
+
 BPF_CALL_1(bpf_get_socket_cookie_sock_ops, struct bpf_sock_ops_kern *, ctx)
 {
return __sock_gen_cookie(ctx->sk);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0b735c2729b2..a8d9ad543300 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1673,6 +1673,14 @@ union bpf_attr {
  * Return
  * A 8-byte long unique number.
  *
+ * u64 bpf_get_socket_cookie(struct sock *sk)
+ * Description
+ * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
+ * *sk*, but gets socket from a BTF **struct sock**. This helper
+ * also works for sleepable programs.
+ * Return
+ * A 8-byte long unique number or 0 if *sk* is NULL.
+ *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
  * The owner UID of the socket associated to *skb*. If the socket
-- 
2.30.0.478.g8a0d178c01-goog



[PATCH bpf-next v7 3/5] selftests/bpf: Integrate the socket_cookie test to test_progs

2021-02-10 Thread Florent Revest
Currently, the selftest for the BPF socket_cookie helpers is built and
run independently from test_progs. It's easy to forget and hard to
maintain.

This patch moves the socket cookies test into prog_tests/ and vastly
simplifies its logic by:
- rewriting the loading code with BPF skeletons
- rewriting the server/client code with network helpers
- rewriting the cgroup code with test__join_cgroup
- rewriting the error handling code with CHECKs

Signed-off-by: Florent Revest 
Acked-by: KP Singh 
Acked-by: Andrii Nakryiko 
---
 tools/testing/selftests/bpf/.gitignore|   1 -
 tools/testing/selftests/bpf/Makefile  |   3 +-
 .../selftests/bpf/prog_tests/socket_cookie.c  |  71 ++
 .../selftests/bpf/progs/socket_cookie_prog.c  |   2 -
 .../selftests/bpf/test_socket_cookie.c| 208 --
 5 files changed, 72 insertions(+), 213 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/socket_cookie.c
 delete mode 100644 tools/testing/selftests/bpf/test_socket_cookie.c

diff --git a/tools/testing/selftests/bpf/.gitignore 
b/tools/testing/selftests/bpf/.gitignore
index 9abca0616ec0..c0c48fdb9ac1 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -17,7 +17,6 @@ test_sockmap
 test_lirc_mode2_user
 get_cgroup_id_user
 test_skb_cgroup_id_user
-test_socket_cookie
 test_cgroup_storage
 test_flow_dissector
 flow_dissector_load
diff --git a/tools/testing/selftests/bpf/Makefile 
b/tools/testing/selftests/bpf/Makefile
index f0674d406f40..044bfdcf5b74 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -31,7 +31,7 @@ LDLIBS += -lcap -lelf -lz -lrt -lpthread
 # Order correspond to 'make run_tests' order
 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map 
test_progs \
test_verifier_log test_dev_cgroup \
-   test_sock test_sockmap get_cgroup_id_user test_socket_cookie \
+   test_sock test_sockmap get_cgroup_id_user \
test_cgroup_storage \
test_netcnt test_tcpnotify_user test_sysctl \
test_progs-no_alu32
@@ -185,7 +185,6 @@ $(OUTPUT)/test_dev_cgroup: cgroup_helpers.c
 $(OUTPUT)/test_skb_cgroup_id_user: cgroup_helpers.c
 $(OUTPUT)/test_sock: cgroup_helpers.c
 $(OUTPUT)/test_sock_addr: cgroup_helpers.c
-$(OUTPUT)/test_socket_cookie: cgroup_helpers.c
 $(OUTPUT)/test_sockmap: cgroup_helpers.c
 $(OUTPUT)/test_tcpnotify_user: cgroup_helpers.c trace_helpers.c
 $(OUTPUT)/get_cgroup_id_user: cgroup_helpers.c
diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
new file mode 100644
index ..e12a31d3752c
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Google LLC.
+// Copyright (c) 2018 Facebook
+
+#include 
+#include "socket_cookie_prog.skel.h"
+#include "network_helpers.h"
+
+static int duration;
+
+struct socket_cookie {
+   __u64 cookie_key;
+   __u32 cookie_value;
+};
+
+void test_socket_cookie(void)
+{
+   int server_fd = 0, client_fd = 0, cgroup_fd = 0, err = 0;
+   socklen_t addr_len = sizeof(struct sockaddr_in6);
+   struct socket_cookie_prog *skel;
+   __u32 cookie_expected_value;
+   struct sockaddr_in6 addr;
+   struct socket_cookie val;
+
+   skel = socket_cookie_prog__open_and_load();
+   if (!ASSERT_OK_PTR(skel, "skel_open"))
+   return;
+
+   cgroup_fd = test__join_cgroup("/socket_cookie");
+   if (CHECK(cgroup_fd < 0, "join_cgroup", "cgroup creation failed\n"))
+   goto out;
+
+   skel->links.set_cookie = bpf_program__attach_cgroup(
+   skel->progs.set_cookie, cgroup_fd);
+   if (!ASSERT_OK_PTR(skel->links.set_cookie, "prog_attach"))
+   goto close_cgroup_fd;
+
+   skel->links.update_cookie = bpf_program__attach_cgroup(
+   skel->progs.update_cookie, cgroup_fd);
+   if (!ASSERT_OK_PTR(skel->links.update_cookie, "prog_attach"))
+   goto close_cgroup_fd;
+
+   server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+   if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
+   goto close_cgroup_fd;
+
+   client_fd = connect_to_fd(server_fd, 0);
+   if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
+   goto close_server_fd;
+
+   err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.socket_cookies),
+ &client_fd, &val);
+   if (!ASSERT_OK(err, "map_lookup(socket_cookies)"))
+   goto close_client_fd;
+
+   err = getsockname(client_fd, (struct sockaddr *)&addr, &addr_len);
+   if (!ASSERT_OK(err, "getsockname"))
+

[PATCH bpf-next v7 1/5] bpf: Be less specific about socket cookies guarantees

2021-02-10 Thread Florent Revest
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one"
socket cookies are not guaranteed to be non-decreasing. The
bpf_get_socket_cookie helper descriptions are currently specifying that
cookies are non-decreasing but we don't want users to rely on that.

Reported-by: Daniel Borkmann 
Signed-off-by: Florent Revest 
Acked-by: KP Singh 
---
 include/uapi/linux/bpf.h   | 8 
 tools/include/uapi/linux/bpf.h | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c001766adcbc..0b735c2729b2 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1656,22 +1656,22 @@ union bpf_attr {
  * networking traffic statistics as it provides a global socket
  * identifier that can be assumed unique.
  * Return
- * A 8-byte long non-decreasing number on success, or 0 if the
- * socket field is missing inside *skb*.
+ * A 8-byte long unique number on success, or 0 if the socket
+ * field is missing inside *skb*.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * Description
  * Equivalent to bpf_get_socket_cookie() helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_addr** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * Description
  * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_ops** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c001766adcbc..0b735c2729b2 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1656,22 +1656,22 @@ union bpf_attr {
  * networking traffic statistics as it provides a global socket
  * identifier that can be assumed unique.
  * Return
- * A 8-byte long non-decreasing number on success, or 0 if the
- * socket field is missing inside *skb*.
+ * A 8-byte long unique number on success, or 0 if the socket
+ * field is missing inside *skb*.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * Description
  * Equivalent to bpf_get_socket_cookie() helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_addr** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * Description
  * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_ops** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
-- 
2.30.0.478.g8a0d178c01-goog



Re: [PATCH bpf-next v6 2/5] bpf: Expose bpf_get_socket_cookie to tracing programs

2021-02-10 Thread Florent Revest
On Mon, Feb 1, 2021 at 11:37 PM Alexei Starovoitov
 wrote:
>
> On Mon, Feb 1, 2021 at 2:32 PM Daniel Borkmann  wrote:
> >
> > On 1/30/21 12:45 PM, Florent Revest wrote:
> > > On Fri, Jan 29, 2021 at 1:49 PM Daniel Borkmann  
> > > wrote:
> > >> On 1/29/21 11:57 AM, Daniel Borkmann wrote:
> > >>> On 1/27/21 10:01 PM, Andrii Nakryiko wrote:
> > >>>> On Tue, Jan 26, 2021 at 10:36 AM Florent Revest  
> > >>>> wrote:
> > >>>>>
> > >>>>> This needs a new helper that:
> > >>>>> - can work in a sleepable context (using sock_gen_cookie)
> > >>>>> - takes a struct sock pointer and checks that it's not NULL
> > >>>>>
> > >>>>> Signed-off-by: Florent Revest 
> > >>>>> Acked-by: KP Singh 
> > >>>>> ---
> > >>>>>include/linux/bpf.h|  1 +
> > >>>>>include/uapi/linux/bpf.h   |  8 
> > >>>>>kernel/trace/bpf_trace.c   |  2 ++
> > >>>>>net/core/filter.c  | 12 
> > >>>>>tools/include/uapi/linux/bpf.h |  8 
> > >>>>>5 files changed, 31 insertions(+)
> > >>>>>
> > >>>>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > >>>>> index 1aac2af12fed..26219465e1f7 100644
> > >>>>> --- a/include/linux/bpf.h
> > >>>>> +++ b/include/linux/bpf.h
> > >>>>> @@ -1874,6 +1874,7 @@ extern const struct bpf_func_proto 
> > >>>>> bpf_per_cpu_ptr_proto;
> > >>>>>extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
> > >>>>>extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> > >>>>>extern const struct bpf_func_proto bpf_sock_from_file_proto;
> > >>>>> +extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
> > >>>>>
> > >>>>>const struct bpf_func_proto *bpf_tracing_func_proto(
> > >>>>>   enum bpf_func_id func_id, const struct bpf_prog *prog);
> > >>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > >>>>> index 0b735c2729b2..5855c398d685 100644
> > >>>>> --- a/include/uapi/linux/bpf.h
> > >>>>> +++ b/include/uapi/linux/bpf.h
> > >>>>> @@ -1673,6 +1673,14 @@ union bpf_attr {
> > >>>>> * Return
> > >>>>> * A 8-byte long unique number.
> > >>>>> *
> > >>>>> + * u64 bpf_get_socket_cookie(void *sk)
> > >>>>
> > >>>> should the type be `struct sock *` then?
> > >>>
> > >>> Checking libbpf's generated bpf_helper_defs.h it generates:
> > >>>
> > >>> /*
> > >>>* bpf_get_socket_cookie
> > >>>*
> > >>>*  If the **struct sk_buff** pointed by *skb* has a known socket,
> > >>>*  retrieve the cookie (generated by the kernel) of this socket.
> > >>>*  If no cookie has been set yet, generate a new cookie. Once
> > >>>*  generated, the socket cookie remains stable for the life of 
> > >>> the
> > >>>*  socket. This helper can be useful for monitoring per socket
> > >>>*  networking traffic statistics as it provides a global socket
> > >>>*  identifier that can be assumed unique.
> > >>>*
> > >>>* Returns
> > >>>*  A 8-byte long non-decreasing number on success, or 0 if the
> > >>>*  socket field is missing inside *skb*.
> > >>>*/
> > >>> static __u64 (*bpf_get_socket_cookie)(void *ctx) = (void *) 46;
> > >>>
> > >>> So in terms of helper comment it's picking up the description from the
> > >>> `u64 bpf_get_socket_cookie(struct sk_buff *skb)` signature. With that
> > >>> in mind it would likely make sense to add the actual `struct sock *` 
> > >>> type
> > >>> to the comment to make it more clear in here.
> > >>
> > >> One thought that still came to mind when looking over the series again, 
> > >> do
> > >> we need to blacklist certain functions from 

Re: [PATCH bpf-next v6 2/5] bpf: Expose bpf_get_socket_cookie to tracing programs

2021-01-30 Thread Florent Revest
On Fri, Jan 29, 2021 at 1:49 PM Daniel Borkmann  wrote:
>
> On 1/29/21 11:57 AM, Daniel Borkmann wrote:
> > On 1/27/21 10:01 PM, Andrii Nakryiko wrote:
> >> On Tue, Jan 26, 2021 at 10:36 AM Florent Revest  
> >> wrote:
> >>>
> >>> This needs a new helper that:
> >>> - can work in a sleepable context (using sock_gen_cookie)
> >>> - takes a struct sock pointer and checks that it's not NULL
> >>>
> >>> Signed-off-by: Florent Revest 
> >>> Acked-by: KP Singh 
> >>> ---
> >>>   include/linux/bpf.h|  1 +
> >>>   include/uapi/linux/bpf.h   |  8 
> >>>   kernel/trace/bpf_trace.c   |  2 ++
> >>>   net/core/filter.c  | 12 
> >>>   tools/include/uapi/linux/bpf.h |  8 
> >>>   5 files changed, 31 insertions(+)
> >>>
> >>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> >>> index 1aac2af12fed..26219465e1f7 100644
> >>> --- a/include/linux/bpf.h
> >>> +++ b/include/linux/bpf.h
> >>> @@ -1874,6 +1874,7 @@ extern const struct bpf_func_proto 
> >>> bpf_per_cpu_ptr_proto;
> >>>   extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
> >>>   extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> >>>   extern const struct bpf_func_proto bpf_sock_from_file_proto;
> >>> +extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
> >>>
> >>>   const struct bpf_func_proto *bpf_tracing_func_proto(
> >>>  enum bpf_func_id func_id, const struct bpf_prog *prog);
> >>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> >>> index 0b735c2729b2..5855c398d685 100644
> >>> --- a/include/uapi/linux/bpf.h
> >>> +++ b/include/uapi/linux/bpf.h
> >>> @@ -1673,6 +1673,14 @@ union bpf_attr {
> >>>* Return
> >>>* A 8-byte long unique number.
> >>>*
> >>> + * u64 bpf_get_socket_cookie(void *sk)
> >>
> >> should the type be `struct sock *` then?
> >
> > Checking libbpf's generated bpf_helper_defs.h it generates:
> >
> > /*
> >   * bpf_get_socket_cookie
> >   *
> >   *  If the **struct sk_buff** pointed by *skb* has a known socket,
> >   *  retrieve the cookie (generated by the kernel) of this socket.
> >   *  If no cookie has been set yet, generate a new cookie. Once
> >   *  generated, the socket cookie remains stable for the life of the
> >   *  socket. This helper can be useful for monitoring per socket
> >   *  networking traffic statistics as it provides a global socket
> >   *  identifier that can be assumed unique.
> >   *
> >   * Returns
> >   *  A 8-byte long non-decreasing number on success, or 0 if the
> >   *  socket field is missing inside *skb*.
> >   */
> > static __u64 (*bpf_get_socket_cookie)(void *ctx) = (void *) 46;
> >
> > So in terms of helper comment it's picking up the description from the
> > `u64 bpf_get_socket_cookie(struct sk_buff *skb)` signature. With that
> > in mind it would likely make sense to add the actual `struct sock *` type
> > to the comment to make it more clear in here.
>
> One thought that still came to mind when looking over the series again, do
> we need to blacklist certain functions from bpf_get_socket_cookie() under
> tracing e.g. when attaching to, say fexit? For example, if sk_prot_free()
> would be temporary uninlined/exported for testing and bpf_get_socket_cookie()
> was invoked from a prog upon fexit where sock was already passed back to
> allocator, I presume there's risk of mem corruption, no?

Mh, this is interesting. I can try to add a deny list in v7 but I'm
not sure whether I'll be able to catch them all. I'm assuming that
__sk_destruct, sk_destruct, __sk_free, sk_free would be other
problematic functions but potentially there would be more.
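
For the record, a purely illustrative sketch of what such a deny list could
look like, using the kernel's BTF ID set machinery (the set name and the
exact list of functions are assumptions, this is not from any posted patch):

#include <linux/btf_ids.h>

/* Socket-freeing functions after which a struct sock pointer must not be
 * passed to bpf_get_socket_cookie() from an fentry/fexit program.
 */
BTF_SET_START(bpf_sk_cookie_deny)
BTF_ID(func, sk_prot_free)
BTF_ID(func, __sk_destruct)
BTF_ID(func, sk_destruct)
BTF_ID(func, __sk_free)
BTF_ID(func, sk_free)
BTF_SET_END(bpf_sk_cookie_deny)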


Re: [PATCH bpf-next v6 2/5] bpf: Expose bpf_get_socket_cookie to tracing programs

2021-01-30 Thread Florent Revest
On Wed, Jan 27, 2021 at 10:01 PM Andrii Nakryiko
 wrote:
>
> On Tue, Jan 26, 2021 at 10:36 AM Florent Revest  wrote:
> >
> > This needs a new helper that:
> > - can work in a sleepable context (using sock_gen_cookie)
> > - takes a struct sock pointer and checks that it's not NULL
> >
> > Signed-off-by: Florent Revest 
> > Acked-by: KP Singh 
> > ---
> >  include/linux/bpf.h|  1 +
> >  include/uapi/linux/bpf.h   |  8 
> >  kernel/trace/bpf_trace.c   |  2 ++
> >  net/core/filter.c  | 12 
> >  tools/include/uapi/linux/bpf.h |  8 
> >  5 files changed, 31 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 1aac2af12fed..26219465e1f7 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1874,6 +1874,7 @@ extern const struct bpf_func_proto 
> > bpf_per_cpu_ptr_proto;
> >  extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
> >  extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> >  extern const struct bpf_func_proto bpf_sock_from_file_proto;
> > +extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
> >
> >  const struct bpf_func_proto *bpf_tracing_func_proto(
> > enum bpf_func_id func_id, const struct bpf_prog *prog);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 0b735c2729b2..5855c398d685 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1673,6 +1673,14 @@ union bpf_attr {
> >   * Return
> >   * A 8-byte long unique number.
> >   *
> > + * u64 bpf_get_socket_cookie(void *sk)
>
> should the type be `struct sock *` then?

Oh, absolutely. :) Thank you Andrii


[PATCH bpf-next v6 4/5] selftests/bpf: Use vmlinux.h in socket_cookie_prog.c

2021-01-26 Thread Florent Revest
When migrating from bpf.h's to vmlinux.h's definition of struct
bpf_sock, an interesting LLVM behavior appeared: LLVM started producing
two fetches of ctx->sk in the sockops program, which means that the
verifier could no longer keep track of the NULL check on ctx->sk.
Therefore, we need to extract ctx->sk into a variable before checking
and dereferencing it.
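
In other words (illustrative sketch of the fixed shape only, the full diff is
below), reading ctx->sk once into a local variable lets the verifier tie the
NULL check to the pointer that is later dereferenced:

SEC("sockops")
int update_cookie(struct bpf_sock_ops *ctx)
{
        struct bpf_sock *sk = ctx->sk; /* single fetch of ctx->sk */

        if (!sk) /* this NULL check now covers every later use of sk */
                return 1;

        /* ... e.g. bpf_sk_storage_get(&socket_cookies, sk, 0, 0) ... */
        return 1;
}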

Acked-by: KP Singh 
Signed-off-by: Florent Revest 
---
 .../testing/selftests/bpf/progs/socket_cookie_prog.c  | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c 
b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
index 81e84be6f86d..fbd5eaf39720 100644
--- a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
+++ b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
@@ -1,12 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2018 Facebook
 
-#include <linux/bpf.h>
-#include <sys/socket.h>
+#include "vmlinux.h"
 
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_endian.h>
 
+#define AF_INET6 10
+
 struct socket_cookie {
__u64 cookie_key;
__u32 cookie_value;
@@ -41,7 +42,7 @@ int set_cookie(struct bpf_sock_addr *ctx)
 SEC("sockops")
 int update_cookie(struct bpf_sock_ops *ctx)
 {
-   struct bpf_sock *sk;
+   struct bpf_sock *sk = ctx->sk;
struct socket_cookie *p;
 
if (ctx->family != AF_INET6)
@@ -50,10 +51,10 @@ int update_cookie(struct bpf_sock_ops *ctx)
if (ctx->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
return 1;
 
-   if (!ctx->sk)
+   if (!sk)
return 1;
 
-   p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
+   p = bpf_sk_storage_get(&socket_cookies, sk, 0, 0);
if (!p)
return 1;
 
-- 
2.30.0.280.ga3ce27912f-goog



[PATCH bpf-next v6 5/5] selftests/bpf: Add a selftest for the tracing bpf_get_socket_cookie

2021-01-26 Thread Florent Revest
This builds up on the existing socket cookie test which checks whether
the bpf_get_socket_cookie helpers provide the same value in
cgroup/connect6 and sockops programs for a socket created by the
userspace part of the test.

Instead of having an update_cookie sockops program tag a socket local
storage with 0xFF, this uses both an update_cookie_sockops program and
an update_cookie_tracing program which successively tag the socket with
0x0F and then 0xF0.
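
For reference, the value the userspace part of the test ends up checking is
built up as follows (sketch matching the description above):

/* cgroup/connect6 (set_cookie):        cookie_value  = 0x0F            */
/* sockops (update_cookie_sockops):     cookie_value |= local_port << 8 */
/* fexit   (update_cookie_tracing):     cookie_value |= 0xF0            */
cookie_expected_value = (ntohs(addr.sin6_port) << 8) | 0xFF;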

Signed-off-by: Florent Revest 
Acked-by: KP Singh 
---
 .../selftests/bpf/prog_tests/socket_cookie.c  | 11 --
 .../selftests/bpf/progs/socket_cookie_prog.c  | 36 +--
 2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
index e12a31d3752c..232db28dde18 100644
--- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
@@ -35,9 +35,14 @@ void test_socket_cookie(void)
if (!ASSERT_OK_PTR(skel->links.set_cookie, "prog_attach"))
goto close_cgroup_fd;
 
-   skel->links.update_cookie = bpf_program__attach_cgroup(
-   skel->progs.update_cookie, cgroup_fd);
-   if (!ASSERT_OK_PTR(skel->links.update_cookie, "prog_attach"))
+   skel->links.update_cookie_sockops = bpf_program__attach_cgroup(
+   skel->progs.update_cookie_sockops, cgroup_fd);
+   if (!ASSERT_OK_PTR(skel->links.update_cookie_sockops, "prog_attach"))
+   goto close_cgroup_fd;
+
+   skel->links.update_cookie_tracing = bpf_program__attach(
+   skel->progs.update_cookie_tracing);
+   if (!ASSERT_OK_PTR(skel->links.update_cookie_tracing, "prog_attach"))
goto close_cgroup_fd;
 
server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
diff --git a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c 
b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
index fbd5eaf39720..35630a5aaf5f 100644
--- a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
+++ b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
@@ -5,6 +5,7 @@
 
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_endian.h>
+#include <bpf/bpf_tracing.h>
 
 #define AF_INET6 10
 
@@ -20,6 +21,14 @@ struct {
__type(value, struct socket_cookie);
 } socket_cookies SEC(".maps");
 
+/*
+ * These three programs get executed in a row on connect() syscalls. The
+ * userspace side of the test creates a client socket, issues a connect() on it
+ * and then checks that the local storage associated with this socket has:
+ * cookie_value == local_port << 8 | 0xFF
+ * The different parts of this cookie_value are appended by those hooks if they
+ * all agree on the output of bpf_get_socket_cookie().
+ */
 SEC("cgroup/connect6")
 int set_cookie(struct bpf_sock_addr *ctx)
 {
@@ -33,14 +42,14 @@ int set_cookie(struct bpf_sock_addr *ctx)
if (!p)
return 1;
 
-   p->cookie_value = 0xFF;
+   p->cookie_value = 0xF;
p->cookie_key = bpf_get_socket_cookie(ctx);
 
return 1;
 }
 
 SEC("sockops")
-int update_cookie(struct bpf_sock_ops *ctx)
+int update_cookie_sockops(struct bpf_sock_ops *ctx)
 {
struct bpf_sock *sk = ctx->sk;
struct socket_cookie *p;
@@ -61,9 +70,30 @@ int update_cookie(struct bpf_sock_ops *ctx)
if (p->cookie_key != bpf_get_socket_cookie(ctx))
return 1;
 
-   p->cookie_value = (ctx->local_port << 8) | p->cookie_value;
+   p->cookie_value |= (ctx->local_port << 8);
 
return 1;
 }
 
+SEC("fexit/inet_stream_connect")
+int BPF_PROG(update_cookie_tracing, struct socket *sock,
+struct sockaddr *uaddr, int addr_len, int flags)
+{
+   struct socket_cookie *p;
+
+   if (uaddr->sa_family != AF_INET6)
+   return 0;
+
+   p = bpf_sk_storage_get(&socket_cookies, sock->sk, 0, 0);
+   if (!p)
+   return 0;
+
+   if (p->cookie_key != bpf_get_socket_cookie(sock->sk))
+   return 0;
+
+   p->cookie_value |= 0xF0;
+
+   return 0;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.30.0.280.ga3ce27912f-goog



[PATCH bpf-next v6 3/5] selftests/bpf: Integrate the socket_cookie test to test_progs

2021-01-26 Thread Florent Revest
Currently, the selftest for the BPF socket_cookie helpers is built and
run independently from test_progs. It's easy to forget and hard to
maintain.

This patch moves the socket cookies test into prog_tests/ and vastly
simplifies its logic by:
- rewriting the loading code with BPF skeletons
- rewriting the server/client code with network helpers
- rewriting the cgroup code with test__join_cgroup
- rewriting the error handling code with CHECKs

Signed-off-by: Florent Revest 
Acked-by: KP Singh 
---
 tools/testing/selftests/bpf/.gitignore|   1 -
 tools/testing/selftests/bpf/Makefile  |   3 +-
 .../selftests/bpf/prog_tests/socket_cookie.c  |  71 ++
 .../selftests/bpf/progs/socket_cookie_prog.c  |   2 -
 .../selftests/bpf/test_socket_cookie.c| 208 --
 5 files changed, 72 insertions(+), 213 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/socket_cookie.c
 delete mode 100644 tools/testing/selftests/bpf/test_socket_cookie.c

diff --git a/tools/testing/selftests/bpf/.gitignore 
b/tools/testing/selftests/bpf/.gitignore
index 9abca0616ec0..c0c48fdb9ac1 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -17,7 +17,6 @@ test_sockmap
 test_lirc_mode2_user
 get_cgroup_id_user
 test_skb_cgroup_id_user
-test_socket_cookie
 test_cgroup_storage
 test_flow_dissector
 flow_dissector_load
diff --git a/tools/testing/selftests/bpf/Makefile 
b/tools/testing/selftests/bpf/Makefile
index 63d6288e419c..af00fe3b7fb9 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -33,7 +33,7 @@ LDLIBS += -lcap -lelf -lz -lrt -lpthread
 # Order correspond to 'make run_tests' order
 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map 
test_progs \
test_verifier_log test_dev_cgroup \
-   test_sock test_sockmap get_cgroup_id_user test_socket_cookie \
+   test_sock test_sockmap get_cgroup_id_user \
test_cgroup_storage \
test_netcnt test_tcpnotify_user test_sysctl \
test_progs-no_alu32
@@ -187,7 +187,6 @@ $(OUTPUT)/test_dev_cgroup: cgroup_helpers.c
 $(OUTPUT)/test_skb_cgroup_id_user: cgroup_helpers.c
 $(OUTPUT)/test_sock: cgroup_helpers.c
 $(OUTPUT)/test_sock_addr: cgroup_helpers.c
-$(OUTPUT)/test_socket_cookie: cgroup_helpers.c
 $(OUTPUT)/test_sockmap: cgroup_helpers.c
 $(OUTPUT)/test_tcpnotify_user: cgroup_helpers.c trace_helpers.c
 $(OUTPUT)/get_cgroup_id_user: cgroup_helpers.c
diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
new file mode 100644
index ..e12a31d3752c
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Google LLC.
+// Copyright (c) 2018 Facebook
+
+#include <test_progs.h>
+#include "socket_cookie_prog.skel.h"
+#include "network_helpers.h"
+
+static int duration;
+
+struct socket_cookie {
+   __u64 cookie_key;
+   __u32 cookie_value;
+};
+
+void test_socket_cookie(void)
+{
+   int server_fd = 0, client_fd = 0, cgroup_fd = 0, err = 0;
+   socklen_t addr_len = sizeof(struct sockaddr_in6);
+   struct socket_cookie_prog *skel;
+   __u32 cookie_expected_value;
+   struct sockaddr_in6 addr;
+   struct socket_cookie val;
+
+   skel = socket_cookie_prog__open_and_load();
+   if (!ASSERT_OK_PTR(skel, "skel_open"))
+   return;
+
+   cgroup_fd = test__join_cgroup("/socket_cookie");
+   if (CHECK(cgroup_fd < 0, "join_cgroup", "cgroup creation failed\n"))
+   goto out;
+
+   skel->links.set_cookie = bpf_program__attach_cgroup(
+   skel->progs.set_cookie, cgroup_fd);
+   if (!ASSERT_OK_PTR(skel->links.set_cookie, "prog_attach"))
+   goto close_cgroup_fd;
+
+   skel->links.update_cookie = bpf_program__attach_cgroup(
+   skel->progs.update_cookie, cgroup_fd);
+   if (!ASSERT_OK_PTR(skel->links.update_cookie, "prog_attach"))
+   goto close_cgroup_fd;
+
+   server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+   if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
+   goto close_cgroup_fd;
+
+   client_fd = connect_to_fd(server_fd, 0);
+   if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
+   goto close_server_fd;
+
+   err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.socket_cookies),
+ &client_fd, &val);
+   if (!ASSERT_OK(err, "map_lookup(socket_cookies)"))
+   goto close_client_fd;
+
+   err = getsockname(client_fd, (struct sockaddr *)&addr, &addr_len);
+   if (!ASSERT_OK(err, "getsockname"))
+   goto close_client_fd;

[PATCH bpf-next v6 2/5] bpf: Expose bpf_get_socket_cookie to tracing programs

2021-01-26 Thread Florent Revest
This needs a new helper that:
- can work in a sleepable context (using sock_gen_cookie)
- takes a struct sock pointer and checks that it's not NULL
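
A usage sketch (hypothetical program, not part of this patch) showing the new
helper called on a BTF struct sock pointer from an fexit program, similar to
what the selftest in patch 5/5 does:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("fexit/inet_stream_connect")
int BPF_PROG(log_cookie, struct socket *sock, struct sockaddr *uaddr,
             int addr_len, int flags)
{
        /* returns 0 if sock->sk is NULL */
        __u64 cookie = bpf_get_socket_cookie(sock->sk);

        bpf_printk("connect on socket cookie %llu", cookie);
        return 0;
}

char _license[] SEC("license") = "GPL";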

Signed-off-by: Florent Revest 
Acked-by: KP Singh 
---
 include/linux/bpf.h|  1 +
 include/uapi/linux/bpf.h   |  8 
 kernel/trace/bpf_trace.c   |  2 ++
 net/core/filter.c  | 12 
 tools/include/uapi/linux/bpf.h |  8 
 5 files changed, 31 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 1aac2af12fed..26219465e1f7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1874,6 +1874,7 @@ extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
 extern const struct bpf_func_proto bpf_sock_from_file_proto;
+extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
 
 const struct bpf_func_proto *bpf_tracing_func_proto(
enum bpf_func_id func_id, const struct bpf_prog *prog);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0b735c2729b2..5855c398d685 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1673,6 +1673,14 @@ union bpf_attr {
  * Return
  * A 8-byte long unique number.
  *
+ * u64 bpf_get_socket_cookie(void *sk)
+ * Description
+ * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
+ * *sk*, but gets socket from a BTF **struct sock**. This helper
+ * also works for sleepable programs.
+ * Return
+ * A 8-byte long unique number or 0 if *sk* is NULL.
+ *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
  * The owner UID of the socket associated to *skb*. If the socket
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 6c0018abe68a..845b2168e006 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1760,6 +1760,8 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const 
struct bpf_prog *prog)
return &bpf_sk_storage_delete_tracing_proto;
case BPF_FUNC_sock_from_file:
return &bpf_sock_from_file_proto;
+   case BPF_FUNC_get_socket_cookie:
+   return &bpf_get_socket_ptr_cookie_proto;
 #endif
case BPF_FUNC_seq_printf:
return prog->expected_attach_type == BPF_TRACE_ITER ?
diff --git a/net/core/filter.c b/net/core/filter.c
index 9ab94e90d660..606e2b6115ed 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4631,6 +4631,18 @@ static const struct bpf_func_proto 
bpf_get_socket_cookie_sock_proto = {
.arg1_type  = ARG_PTR_TO_CTX,
 };
 
+BPF_CALL_1(bpf_get_socket_ptr_cookie, struct sock *, sk)
+{
+   return sk ? sock_gen_cookie(sk) : 0;
+}
+
+const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto = {
+   .func   = bpf_get_socket_ptr_cookie,
+   .gpl_only   = false,
+   .ret_type   = RET_INTEGER,
+   .arg1_type  = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
+};
+
 BPF_CALL_1(bpf_get_socket_cookie_sock_ops, struct bpf_sock_ops_kern *, ctx)
 {
return __sock_gen_cookie(ctx->sk);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0b735c2729b2..5855c398d685 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1673,6 +1673,14 @@ union bpf_attr {
  * Return
  * A 8-byte long unique number.
  *
+ * u64 bpf_get_socket_cookie(void *sk)
+ * Description
+ * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
+ * *sk*, but gets socket from a BTF **struct sock**. This helper
+ * also works for sleepable programs.
+ * Return
+ * A 8-byte long unique number or 0 if *sk* is NULL.
+ *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
  * The owner UID of the socket associated to *skb*. If the socket
-- 
2.30.0.280.ga3ce27912f-goog



[PATCH bpf-next v6 1/5] bpf: Be less specific about socket cookies guarantees

2021-01-26 Thread Florent Revest
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one"
socket cookies are not guaranteed to be non-decreasing. The
bpf_get_socket_cookie helper descriptions are currently specifying that
cookies are non-decreasing but we don't want users to rely on that.

Reported-by: Daniel Borkmann 
Signed-off-by: Florent Revest 
Acked-by: KP Singh 
---
 include/uapi/linux/bpf.h   | 8 
 tools/include/uapi/linux/bpf.h | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c001766adcbc..0b735c2729b2 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1656,22 +1656,22 @@ union bpf_attr {
  * networking traffic statistics as it provides a global socket
  * identifier that can be assumed unique.
  * Return
- * A 8-byte long non-decreasing number on success, or 0 if the
- * socket field is missing inside *skb*.
+ * A 8-byte long unique number on success, or 0 if the socket
+ * field is missing inside *skb*.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * Description
  * Equivalent to bpf_get_socket_cookie() helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_addr** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * Description
  * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_ops** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c001766adcbc..0b735c2729b2 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1656,22 +1656,22 @@ union bpf_attr {
  * networking traffic statistics as it provides a global socket
  * identifier that can be assumed unique.
  * Return
- * A 8-byte long non-decreasing number on success, or 0 if the
- * socket field is missing inside *skb*.
+ * A 8-byte long unique number on success, or 0 if the socket
+ * field is missing inside *skb*.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * Description
  * Equivalent to bpf_get_socket_cookie() helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_addr** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * Description
  * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_ops** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
-- 
2.30.0.280.ga3ce27912f-goog



Re: [PATCH bpf-next v5 4/4] selftests/bpf: Add a selftest for the tracing bpf_get_socket_cookie

2021-01-26 Thread Florent Revest
On Sat, Jan 23, 2021 at 9:45 PM Yonghong Song  wrote:
> On 1/22/21 7:34 AM, Florent Revest wrote:
> > On Wed, Jan 20, 2021 at 8:06 PM Florent Revest  wrote:
> >>
> >> On Wed, Jan 20, 2021 at 8:04 PM Alexei Starovoitov
> >>  wrote:
> >>>
> >>> On Wed, Jan 20, 2021 at 9:08 AM KP Singh  wrote:
> >>>>
> >>>> On Tue, Jan 19, 2021 at 5:00 PM Florent Revest  
> >>>> wrote:
> >>>>>
> >>>>> This builds up on the existing socket cookie test which checks whether
> >>>>> the bpf_get_socket_cookie helpers provide the same value in
> >>>>> cgroup/connect6 and sockops programs for a socket created by the
> >>>>> userspace part of the test.
> >>>>>
> >>>>> Adding a tracing program to the existing objects requires a different
> >>>>> attachment strategy and different headers.
> >>>>>
> >>>>> Signed-off-by: Florent Revest 
> >>>>
> >>>> Acked-by: KP Singh 
> >>>>
> >>>> (one minor note, doesn't really need fixing as a part of this though)
> >>>>
> >>>>> ---
> >>>>>   .../selftests/bpf/prog_tests/socket_cookie.c  | 24 +++
> >>>>>   .../selftests/bpf/progs/socket_cookie_prog.c  | 41 ---
> >>>>>   2 files changed, 52 insertions(+), 13 deletions(-)
> >>>>>
> >>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
> >>>>> b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> >>>>> index 53d0c44e7907..e5c5e2ea1deb 100644
> >>>>> --- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> >>>>> +++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> >>>>> @@ -15,8 +15,8 @@ struct socket_cookie {
> >>>>>
> >>>>>   void test_socket_cookie(void)
> >>>>>   {
> >>>>> +   struct bpf_link *set_link, *update_sockops_link, 
> >>>>> *update_tracing_link;
> >>>>>  socklen_t addr_len = sizeof(struct sockaddr_in6);
> >>>>> -   struct bpf_link *set_link, *update_link;
> >>>>>  int server_fd, client_fd, cgroup_fd;
> >>>>>  struct socket_cookie_prog *skel;
> >>>>>  __u32 cookie_expected_value;
> >>>>> @@ -39,15 +39,21 @@ void test_socket_cookie(void)
> >>>>>PTR_ERR(set_link)))
> >>>>>  goto close_cgroup_fd;
> >>>>>
> >>>>> -   update_link = 
> >>>>> bpf_program__attach_cgroup(skel->progs.update_cookie,
> >>>>> -cgroup_fd);
> >>>>> -   if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err 
> >>>>> %ld\n",
> >>>>> - PTR_ERR(update_link)))
> >>>>> +   update_sockops_link = bpf_program__attach_cgroup(
> >>>>> +   skel->progs.update_cookie_sockops, cgroup_fd);
> >>>>> +   if (CHECK(IS_ERR(update_sockops_link), 
> >>>>> "update-sockops-link-cg-attach",
> >>>>> + "err %ld\n", PTR_ERR(update_sockops_link)))
> >>>>>  goto free_set_link;
> >>>>>
> >>>>> +   update_tracing_link = bpf_program__attach(
> >>>>> +   skel->progs.update_cookie_tracing);
> >>>>> +   if (CHECK(IS_ERR(update_tracing_link), 
> >>>>> "update-tracing-link-attach",
> >>>>> + "err %ld\n", PTR_ERR(update_tracing_link)))
> >>>>> +   goto free_update_sockops_link;
> >>>>> +
> >>>>>  server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
> >>>>>  if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
> >>>>> -   goto free_update_link;
> >>>>> +   goto free_update_tracing_link;
> >>>>>
> >>>>>  client_fd = connect_to_fd(server_fd, 0);
> >>>>>  if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
> >>

Re: [PATCH bpf-next v5 4/4] selftests/bpf: Add a selftest for the tracing bpf_get_socket_cookie

2021-01-22 Thread Florent Revest
On Wed, Jan 20, 2021 at 8:06 PM Florent Revest  wrote:
>
> On Wed, Jan 20, 2021 at 8:04 PM Alexei Starovoitov
>  wrote:
> >
> > On Wed, Jan 20, 2021 at 9:08 AM KP Singh  wrote:
> > >
> > > On Tue, Jan 19, 2021 at 5:00 PM Florent Revest  
> > > wrote:
> > > >
> > > > This builds up on the existing socket cookie test which checks whether
> > > > the bpf_get_socket_cookie helpers provide the same value in
> > > > cgroup/connect6 and sockops programs for a socket created by the
> > > > userspace part of the test.
> > > >
> > > > Adding a tracing program to the existing objects requires a different
> > > > attachment strategy and different headers.
> > > >
> > > > Signed-off-by: Florent Revest 
> > >
> > > Acked-by: KP Singh 
> > >
> > > (one minor note, doesn't really need fixing as a part of this though)
> > >
> > > > ---
> > > >  .../selftests/bpf/prog_tests/socket_cookie.c  | 24 +++
> > > >  .../selftests/bpf/progs/socket_cookie_prog.c  | 41 ---
> > > >  2 files changed, 52 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
> > > > b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> > > > index 53d0c44e7907..e5c5e2ea1deb 100644
> > > > --- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> > > > +++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> > > > @@ -15,8 +15,8 @@ struct socket_cookie {
> > > >
> > > >  void test_socket_cookie(void)
> > > >  {
> > > > +   struct bpf_link *set_link, *update_sockops_link, 
> > > > *update_tracing_link;
> > > > socklen_t addr_len = sizeof(struct sockaddr_in6);
> > > > -   struct bpf_link *set_link, *update_link;
> > > > int server_fd, client_fd, cgroup_fd;
> > > > struct socket_cookie_prog *skel;
> > > > __u32 cookie_expected_value;
> > > > @@ -39,15 +39,21 @@ void test_socket_cookie(void)
> > > >   PTR_ERR(set_link)))
> > > > goto close_cgroup_fd;
> > > >
> > > > -   update_link = 
> > > > bpf_program__attach_cgroup(skel->progs.update_cookie,
> > > > -cgroup_fd);
> > > > -   if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err 
> > > > %ld\n",
> > > > - PTR_ERR(update_link)))
> > > > +   update_sockops_link = bpf_program__attach_cgroup(
> > > > +   skel->progs.update_cookie_sockops, cgroup_fd);
> > > > +   if (CHECK(IS_ERR(update_sockops_link), 
> > > > "update-sockops-link-cg-attach",
> > > > + "err %ld\n", PTR_ERR(update_sockops_link)))
> > > > goto free_set_link;
> > > >
> > > > +   update_tracing_link = bpf_program__attach(
> > > > +   skel->progs.update_cookie_tracing);
> > > > +   if (CHECK(IS_ERR(update_tracing_link), 
> > > > "update-tracing-link-attach",
> > > > + "err %ld\n", PTR_ERR(update_tracing_link)))
> > > > +   goto free_update_sockops_link;
> > > > +
> > > > server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
> > > > if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
> > > > -   goto free_update_link;
> > > > +   goto free_update_tracing_link;
> > > >
> > > > client_fd = connect_to_fd(server_fd, 0);
> > > > if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
> > > > @@ -71,8 +77,10 @@ void test_socket_cookie(void)
> > > > close(client_fd);
> > > >  close_server_fd:
> > > > close(server_fd);
> > > > -free_update_link:
> > > > -   bpf_link__destroy(update_link);
> > > > +free_update_tracing_link:
> > > > +   bpf_link__destroy(update_tracing_link);
> > >
> > > I don't think this need to block submission unless there are other
> > > issues but the
> > > bpf_link__destroy can just be called in a single clean

Re: [PATCH bpf-next v5 3/4] selftests/bpf: Integrate the socket_cookie test to test_progs

2021-01-22 Thread Florent Revest
On Thu, Jan 21, 2021 at 8:55 AM Andrii Nakryiko
 wrote:
>
> On Tue, Jan 19, 2021 at 8:00 AM Florent Revest  wrote:
> >
> > Currently, the selftest for the BPF socket_cookie helpers is built and
> > run independently from test_progs. It's easy to forget and hard to
> > maintain.
> >
> > This patch moves the socket cookies test into prog_tests/ and vastly
> > simplifies its logic by:
> > - rewriting the loading code with BPF skeletons
> > - rewriting the server/client code with network helpers
> > - rewriting the cgroup code with test__join_cgroup
> > - rewriting the error handling code with CHECKs
> >
> > Signed-off-by: Florent Revest 
> > ---
>
> Few nits below regarding skeleton and ASSERT_xxx usage.
>
> >  tools/testing/selftests/bpf/Makefile  |   3 +-
> >  .../selftests/bpf/prog_tests/socket_cookie.c  |  82 +++
> >  .../selftests/bpf/progs/socket_cookie_prog.c  |   2 -
> >  .../selftests/bpf/test_socket_cookie.c| 208 --
>
> please also update .gitignore

Good catch!

> >  4 files changed, 83 insertions(+), 212 deletions(-)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> >  delete mode 100644 tools/testing/selftests/bpf/test_socket_cookie.c
> >
>
> [...]
>
> > +
> > +   skel = socket_cookie_prog__open_and_load();
> > +   if (CHECK(!skel, "socket_cookie_prog__open_and_load",
> > + "skeleton open_and_load failed\n"))
>
> nit: ASSERT_PTR_OK

Ah great, I find the ASSERT semantic much easier to follow than CHECKs.
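
For comparison, the two styles side by side (both forms appear in later
revisions of this test; the helper that ended up being used is ASSERT_OK_PTR):

/* CHECK style */
if (CHECK(!skel, "skel_open", "skeleton open_and_load failed\n"))
        return;

/* ASSERT style */
if (!ASSERT_OK_PTR(skel, "skel_open"))
        return;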

> > +   return;
> > +
> > +   cgroup_fd = test__join_cgroup("/socket_cookie");
> > +   if (CHECK(cgroup_fd < 0, "join_cgroup", "cgroup creation failed\n"))
> > +   goto destroy_skel;
> > +
> > +   set_link = bpf_program__attach_cgroup(skel->progs.set_cookie,
> > + cgroup_fd);
>
> you can use skel->links->set_cookie here and it will be auto-destroyed
> when the whole skeleton is destroyed. More simplification.

Sick. :)
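
That is indeed the pattern the later revisions use: storing the link in the
skeleton so that socket_cookie_prog__destroy() tears it down automatically,
e.g. (as in the v6/v7 test above):

skel->links.set_cookie = bpf_program__attach_cgroup(skel->progs.set_cookie,
                                                    cgroup_fd);
if (!ASSERT_OK_PTR(skel->links.set_cookie, "prog_attach"))
        goto close_cgroup_fd;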

> > +   if (CHECK(IS_ERR(set_link), "set-link-cg-attach", "err %ld\n",
> > + PTR_ERR(set_link)))
> > +   goto close_cgroup_fd;
> > +
> > +   update_link = bpf_program__attach_cgroup(skel->progs.update_cookie,
> > +cgroup_fd);
>
> same as above, no need to maintain your link outside of skeleton
>
>
> > +   if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err %ld\n",
> > + PTR_ERR(update_link)))
> > +   goto free_set_link;
> > +
> > +   server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
> > +   if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
> > +   goto free_update_link;
> > +
> > +   client_fd = connect_to_fd(server_fd, 0);
> > +   if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
> > +   goto close_server_fd;
>
> nit: ASSERT_OK is nicer (here and in few other places)

Did you mean ASSERT_OK for the two following err checks?

ASSERT_OK does not seem right for an fd check where we want the fd to be
non-negative. ASSERT_OK does: "bool ___ok = ___res == 0;"

I will keep my "CHECK(fd < 0" but maybe there could be an
ASSERT_POSITIVE that does "bool ___ok = ___res >= 0;"
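
For illustration, a minimal sketch of such a macro, modeled on the existing
ASSERT_* helpers in test_progs.h (the name and the exact message are
assumptions, this helper does not exist yet):

#define ASSERT_POSITIVE(res, name) ({                                        \
        static int duration = 0;                                             \
        long long ___res = (res);                                            \
        bool ___ok = ___res >= 0;                                            \
        CHECK(!___ok, (name), "unexpected negative value: %lld\n", ___res);  \
        ___ok;                                                               \
})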

> > +
> > +   err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.socket_cookies),
> > + &client_fd, &val);
> > +   if (CHECK(err, "map_lookup", "err %d errno %d\n", err, errno))
> > +   goto close_client_fd;
> > +
> > +   err = getsockname(client_fd, (struct sockaddr *)&addr, &addr_len);
> > +   if (CHECK(err, "getsockname", "Can't get client local addr\n"))
> > +   goto close_client_fd;
> > +
> > +   cookie_expected_value = (ntohs(addr.sin6_port) << 8) | 0xFF;
> > +   CHECK(val.cookie_value != cookie_expected_value, "",
> > + "Unexpected value in map: %x != %x\n", val.cookie_value,
> > + cookie_expected_value);
>
> nit: ASSERT_NEQ is nicer

Indeed.

> > +
> > +close_client_fd:
> > +   close(client_fd);
> > +close_server_fd:
> > +   close(server_fd);
> > +free_update_link:
> > +   bpf_link__destroy(update_link);
> > +free_set_link:
> > +   bpf_link__destroy(set_link);
> > +close_cgroup_fd:
> > +   close(cgroup_fd);
> > +destroy_skel:
> > +   socket_cookie_prog__destroy(skel);
> > +}
>
> [...]


Re: [PATCH bpf-next v5 4/4] selftests/bpf: Add a selftest for the tracing bpf_get_socket_cookie

2021-01-20 Thread Florent Revest
On Wed, Jan 20, 2021 at 8:04 PM Alexei Starovoitov
 wrote:
>
> On Wed, Jan 20, 2021 at 9:08 AM KP Singh  wrote:
> >
> > On Tue, Jan 19, 2021 at 5:00 PM Florent Revest  wrote:
> > >
> > > This builds up on the existing socket cookie test which checks whether
> > > the bpf_get_socket_cookie helpers provide the same value in
> > > cgroup/connect6 and sockops programs for a socket created by the
> > > userspace part of the test.
> > >
> > > Adding a tracing program to the existing objects requires a different
> > > attachment strategy and different headers.
> > >
> > > Signed-off-by: Florent Revest 
> >
> > Acked-by: KP Singh 
> >
> > (one minor note, doesn't really need fixing as a part of this though)
> >
> > > ---
> > >  .../selftests/bpf/prog_tests/socket_cookie.c  | 24 +++
> > >  .../selftests/bpf/progs/socket_cookie_prog.c  | 41 ---
> > >  2 files changed, 52 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
> > > b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> > > index 53d0c44e7907..e5c5e2ea1deb 100644
> > > --- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> > > +++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> > > @@ -15,8 +15,8 @@ struct socket_cookie {
> > >
> > >  void test_socket_cookie(void)
> > >  {
> > > +   struct bpf_link *set_link, *update_sockops_link, 
> > > *update_tracing_link;
> > > socklen_t addr_len = sizeof(struct sockaddr_in6);
> > > -   struct bpf_link *set_link, *update_link;
> > > int server_fd, client_fd, cgroup_fd;
> > > struct socket_cookie_prog *skel;
> > > __u32 cookie_expected_value;
> > > @@ -39,15 +39,21 @@ void test_socket_cookie(void)
> > >   PTR_ERR(set_link)))
> > > goto close_cgroup_fd;
> > >
> > > -   update_link = 
> > > bpf_program__attach_cgroup(skel->progs.update_cookie,
> > > -cgroup_fd);
> > > -   if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err 
> > > %ld\n",
> > > - PTR_ERR(update_link)))
> > > +   update_sockops_link = bpf_program__attach_cgroup(
> > > +   skel->progs.update_cookie_sockops, cgroup_fd);
> > > +   if (CHECK(IS_ERR(update_sockops_link), 
> > > "update-sockops-link-cg-attach",
> > > + "err %ld\n", PTR_ERR(update_sockops_link)))
> > > goto free_set_link;
> > >
> > > +   update_tracing_link = bpf_program__attach(
> > > +   skel->progs.update_cookie_tracing);
> > > +   if (CHECK(IS_ERR(update_tracing_link), 
> > > "update-tracing-link-attach",
> > > + "err %ld\n", PTR_ERR(update_tracing_link)))
> > > +   goto free_update_sockops_link;
> > > +
> > > server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
> > > if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
> > > -   goto free_update_link;
> > > +   goto free_update_tracing_link;
> > >
> > > client_fd = connect_to_fd(server_fd, 0);
> > > if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
> > > @@ -71,8 +77,10 @@ void test_socket_cookie(void)
> > > close(client_fd);
> > >  close_server_fd:
> > > close(server_fd);
> > > -free_update_link:
> > > -   bpf_link__destroy(update_link);
> > > +free_update_tracing_link:
> > > +   bpf_link__destroy(update_tracing_link);
> >
> > I don't think this need to block submission unless there are other
> > issues but the
> > bpf_link__destroy can just be called in a single cleanup label because
> > it handles null or
> > erroneous inputs:
> >
> > int bpf_link__destroy(struct bpf_link *link)
> > {
> > int err = 0;
> >
> > if (IS_ERR_OR_NULL(link))
> >  return 0;
> > [...]
>
> +1 to KP's point.
>
> Also Florent, how did you test it?
> This test fails in CI and in my manual run:
> ./test_progs -t cook
> libbpf: load bpf program failed: Permission denied
> libbpf: -- BEGIN DUMP 

Re: [PATCH bpf-next v4 2/4] bpf: Expose bpf_get_socket_cookie to tracing programs

2021-01-19 Thread Florent Revest
On Wed, Dec 9, 2020 at 5:35 PM Daniel Borkmann  wrote:
>
> On 12/9/20 2:26 PM, Florent Revest wrote:
> > This needs two new helpers, one that works in a sleepable context (using
> > sock_gen_cookie which disables/enables preemption) and one that does not
> > (for performance reasons). Both take a struct sock pointer and need to
> > check it for NULLness.
> >
> > This helper could also be useful to other BPF program types such as LSM.
>
> Looks like this commit description is now stale and needs to be updated
> since we only really add one helper?
>
> > Signed-off-by: Florent Revest 
> > ---
> >   include/linux/bpf.h|  1 +
> >   include/uapi/linux/bpf.h   |  7 +++
> >   kernel/trace/bpf_trace.c   |  2 ++
> >   net/core/filter.c  | 12 
> >   tools/include/uapi/linux/bpf.h |  7 +++
> >   5 files changed, 29 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 07cb5d15e743..5a858e8c3f1a 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1860,6 +1860,7 @@ extern const struct bpf_func_proto 
> > bpf_per_cpu_ptr_proto;
> >   extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
> >   extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> >   extern const struct bpf_func_proto bpf_sock_from_file_proto;
> > +extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
> >
> >   const struct bpf_func_proto *bpf_tracing_func_proto(
> >   enum bpf_func_id func_id, const struct bpf_prog *prog);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index ba59309f4d18..9ac66cf25959 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1667,6 +1667,13 @@ union bpf_attr {
> >*  Return
> >*  A 8-byte long unique number.
> >*
> > + * u64 bpf_get_socket_cookie(void *sk)
> > + *   Description
> > + *   Equivalent to **bpf_get_socket_cookie**\ () helper that 
> > accepts
> > + *   *sk*, but gets socket from a BTF **struct sock**.
>
> Maybe add a small comment that this one also works for sleepable [tracing] 
> progs?
>
> > + *   Return
> > + *   A 8-byte long unique number.
>
> ... or 0 if *sk* is NULL.

Argh, I somehow missed this email during my holidays, I'm sending a
v5. Thank you Daniel!


[PATCH bpf-next v5 3/4] selftests/bpf: Integrate the socket_cookie test to test_progs

2021-01-19 Thread Florent Revest
Currently, the selftest for the BPF socket_cookie helpers is built and
run independently from test_progs. It's easy to forget and hard to
maintain.

This patch moves the socket cookies test into prog_tests/ and vastly
simplifies its logic by:
- rewriting the loading code with BPF skeletons
- rewriting the server/client code with network helpers
- rewriting the cgroup code with test__join_cgroup
- rewriting the error handling code with CHECKs

Signed-off-by: Florent Revest 
---
 tools/testing/selftests/bpf/Makefile  |   3 +-
 .../selftests/bpf/prog_tests/socket_cookie.c  |  82 +++
 .../selftests/bpf/progs/socket_cookie_prog.c  |   2 -
 .../selftests/bpf/test_socket_cookie.c| 208 --
 4 files changed, 83 insertions(+), 212 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/socket_cookie.c
 delete mode 100644 tools/testing/selftests/bpf/test_socket_cookie.c

diff --git a/tools/testing/selftests/bpf/Makefile 
b/tools/testing/selftests/bpf/Makefile
index 63d6288e419c..af00fe3b7fb9 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -33,7 +33,7 @@ LDLIBS += -lcap -lelf -lz -lrt -lpthread
 # Order correspond to 'make run_tests' order
 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map 
test_progs \
test_verifier_log test_dev_cgroup \
-   test_sock test_sockmap get_cgroup_id_user test_socket_cookie \
+   test_sock test_sockmap get_cgroup_id_user \
test_cgroup_storage \
test_netcnt test_tcpnotify_user test_sysctl \
test_progs-no_alu32
@@ -187,7 +187,6 @@ $(OUTPUT)/test_dev_cgroup: cgroup_helpers.c
 $(OUTPUT)/test_skb_cgroup_id_user: cgroup_helpers.c
 $(OUTPUT)/test_sock: cgroup_helpers.c
 $(OUTPUT)/test_sock_addr: cgroup_helpers.c
-$(OUTPUT)/test_socket_cookie: cgroup_helpers.c
 $(OUTPUT)/test_sockmap: cgroup_helpers.c
 $(OUTPUT)/test_tcpnotify_user: cgroup_helpers.c trace_helpers.c
 $(OUTPUT)/get_cgroup_id_user: cgroup_helpers.c
diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
new file mode 100644
index ..53d0c44e7907
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Google LLC.
+// Copyright (c) 2018 Facebook
+
+#include <test_progs.h>
+#include "socket_cookie_prog.skel.h"
+#include "network_helpers.h"
+
+static int duration;
+
+struct socket_cookie {
+   __u64 cookie_key;
+   __u32 cookie_value;
+};
+
+void test_socket_cookie(void)
+{
+   socklen_t addr_len = sizeof(struct sockaddr_in6);
+   struct bpf_link *set_link, *update_link;
+   int server_fd, client_fd, cgroup_fd;
+   struct socket_cookie_prog *skel;
+   __u32 cookie_expected_value;
+   struct sockaddr_in6 addr;
+   struct socket_cookie val;
+   int err = 0;
+
+   skel = socket_cookie_prog__open_and_load();
+   if (CHECK(!skel, "socket_cookie_prog__open_and_load",
+ "skeleton open_and_load failed\n"))
+   return;
+
+   cgroup_fd = test__join_cgroup("/socket_cookie");
+   if (CHECK(cgroup_fd < 0, "join_cgroup", "cgroup creation failed\n"))
+   goto destroy_skel;
+
+   set_link = bpf_program__attach_cgroup(skel->progs.set_cookie,
+ cgroup_fd);
+   if (CHECK(IS_ERR(set_link), "set-link-cg-attach", "err %ld\n",
+ PTR_ERR(set_link)))
+   goto close_cgroup_fd;
+
+   update_link = bpf_program__attach_cgroup(skel->progs.update_cookie,
+cgroup_fd);
+   if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err %ld\n",
+ PTR_ERR(update_link)))
+   goto free_set_link;
+
+   server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+   if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
+   goto free_update_link;
+
+   client_fd = connect_to_fd(server_fd, 0);
+   if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
+   goto close_server_fd;
+
+   err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.socket_cookies),
+ &client_fd, &val);
+   if (CHECK(err, "map_lookup", "err %d errno %d\n", err, errno))
+   goto close_client_fd;
+
+   err = getsockname(client_fd, (struct sockaddr *)&addr, &addr_len);
+   if (CHECK(err, "getsockname", "Can't get client local addr\n"))
+   goto close_client_fd;
+
+   cookie_expected_value = (ntohs(addr.sin6_port) << 8) | 0xFF;
+   CHECK(val.cookie_value != cookie_expected_value, "",
+

[PATCH bpf-next v5 4/4] selftests/bpf: Add a selftest for the tracing bpf_get_socket_cookie

2021-01-19 Thread Florent Revest
This builds up on the existing socket cookie test which checks whether
the bpf_get_socket_cookie helpers provide the same value in
cgroup/connect6 and sockops programs for a socket created by the
userspace part of the test.

Adding a tracing program to the existing objects requires a different
attachment strategy and different headers.

Signed-off-by: Florent Revest 
---
 .../selftests/bpf/prog_tests/socket_cookie.c  | 24 +++
 .../selftests/bpf/progs/socket_cookie_prog.c  | 41 ---
 2 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
index 53d0c44e7907..e5c5e2ea1deb 100644
--- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
@@ -15,8 +15,8 @@ struct socket_cookie {
 
 void test_socket_cookie(void)
 {
+   struct bpf_link *set_link, *update_sockops_link, *update_tracing_link;
socklen_t addr_len = sizeof(struct sockaddr_in6);
-   struct bpf_link *set_link, *update_link;
int server_fd, client_fd, cgroup_fd;
struct socket_cookie_prog *skel;
__u32 cookie_expected_value;
@@ -39,15 +39,21 @@ void test_socket_cookie(void)
  PTR_ERR(set_link)))
goto close_cgroup_fd;
 
-   update_link = bpf_program__attach_cgroup(skel->progs.update_cookie,
-cgroup_fd);
-   if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err %ld\n",
- PTR_ERR(update_link)))
+   update_sockops_link = bpf_program__attach_cgroup(
+   skel->progs.update_cookie_sockops, cgroup_fd);
+   if (CHECK(IS_ERR(update_sockops_link), "update-sockops-link-cg-attach",
+ "err %ld\n", PTR_ERR(update_sockops_link)))
goto free_set_link;
 
+   update_tracing_link = bpf_program__attach(
+   skel->progs.update_cookie_tracing);
+   if (CHECK(IS_ERR(update_tracing_link), "update-tracing-link-attach",
+ "err %ld\n", PTR_ERR(update_tracing_link)))
+   goto free_update_sockops_link;
+
server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
-   goto free_update_link;
+   goto free_update_tracing_link;
 
client_fd = connect_to_fd(server_fd, 0);
if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
@@ -71,8 +77,10 @@ void test_socket_cookie(void)
close(client_fd);
 close_server_fd:
close(server_fd);
-free_update_link:
-   bpf_link__destroy(update_link);
+free_update_tracing_link:
+   bpf_link__destroy(update_tracing_link);
+free_update_sockops_link:
+   bpf_link__destroy(update_sockops_link);
 free_set_link:
bpf_link__destroy(set_link);
 close_cgroup_fd:
diff --git a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c 
b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
index 81e84be6f86d..1f770b732cb1 100644
--- a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
+++ b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
@@ -1,11 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2018 Facebook
 
-#include <linux/bpf.h>
-#include <sys/socket.h>
+#include "vmlinux.h"
 
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_endian.h>
+#include <bpf/bpf_tracing.h>
+
+#define AF_INET6 10
 
 struct socket_cookie {
__u64 cookie_key;
@@ -19,6 +21,14 @@ struct {
__type(value, struct socket_cookie);
 } socket_cookies SEC(".maps");
 
+/*
+ * These three programs get executed in a row on connect() syscalls. The
+ * userspace side of the test creates a client socket, issues a connect() on it
+ * and then checks that the local storage associated with this socket has:
+ * cookie_value == local_port << 8 | 0xFF
+ * The different parts of this cookie_value are appended by those hooks if they
+ * all agree on the output of bpf_get_socket_cookie().
+ */
 SEC("cgroup/connect6")
 int set_cookie(struct bpf_sock_addr *ctx)
 {
@@ -32,14 +42,14 @@ int set_cookie(struct bpf_sock_addr *ctx)
if (!p)
return 1;
 
-   p->cookie_value = 0xFF;
+   p->cookie_value = 0xF;
p->cookie_key = bpf_get_socket_cookie(ctx);
 
return 1;
 }
 
 SEC("sockops")
-int update_cookie(struct bpf_sock_ops *ctx)
+int update_cookie_sockops(struct bpf_sock_ops *ctx)
 {
struct bpf_sock *sk;
struct socket_cookie *p;
@@ -60,9 +70,30 @@ int update_cookie(struct bpf_sock_ops *ctx)
if (p->cookie_key != bpf_get_socket_cookie(ctx))
return 1;
 
-   p->cookie_value = (ctx->local_port << 8) | p->cookie_value;
+   p->cookie_value |= (ctx->local_port << 8);
 
 

[PATCH bpf-next v5 1/4] bpf: Be less specific about socket cookies guarantees

2021-01-19 Thread Florent Revest
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one"
socket cookies are not guaranteed to be non-decreasing. The
bpf_get_socket_cookie helper descriptions are currently specifying that
cookies are non-decreasing but we don't want users to rely on that.

Reported-by: Daniel Borkmann 
Signed-off-by: Florent Revest 
---
 include/uapi/linux/bpf.h   | 8 
 tools/include/uapi/linux/bpf.h | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c001766adcbc..0b735c2729b2 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1656,22 +1656,22 @@ union bpf_attr {
  * networking traffic statistics as it provides a global socket
  * identifier that can be assumed unique.
  * Return
- * A 8-byte long non-decreasing number on success, or 0 if the
- * socket field is missing inside *skb*.
+ * A 8-byte long unique number on success, or 0 if the socket
+ * field is missing inside *skb*.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * Description
  * Equivalent to bpf_get_socket_cookie() helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_addr** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * Description
  * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_ops** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c001766adcbc..0b735c2729b2 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1656,22 +1656,22 @@ union bpf_attr {
  * networking traffic statistics as it provides a global socket
  * identifier that can be assumed unique.
  * Return
- * A 8-byte long non-decreasing number on success, or 0 if the
- * socket field is missing inside *skb*.
+ * A 8-byte long unique number on success, or 0 if the socket
+ * field is missing inside *skb*.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * Description
  * Equivalent to bpf_get_socket_cookie() helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_addr** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * Description
  * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_ops** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
-- 
2.30.0.284.gd98b1dd5eaa7-goog
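
The "unique number" wording above still allows the cookie to be used as a
stable per-socket key. A minimal sketch of that pattern, separate from the
patch itself (map name, section and sizes are illustrative):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, __u64);	/* socket cookie */
	__type(value, __u64);	/* bytes seen on that socket */
} per_sock_bytes SEC(".maps");

SEC("cgroup_skb/egress")
int count_egress(struct __sk_buff *skb)
{
	__u64 cookie = bpf_get_socket_cookie(skb);
	__u64 len = skb->len, *bytes;

	if (!cookie)	/* 0 means the socket field is missing */
		return 1;

	bytes = bpf_map_lookup_elem(&per_sock_bytes, &cookie);
	if (bytes)
		__sync_fetch_and_add(bytes, len);
	else
		bpf_map_update_elem(&per_sock_bytes, &cookie, &len, BPF_ANY);

	return 1;
}

char _license[] SEC("license") = "GPL";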



[PATCH bpf-next v5 2/4] bpf: Expose bpf_get_socket_cookie to tracing programs

2021-01-19 Thread Florent Revest
This needs a new helper that:
- can work in a sleepable context (using sock_gen_cookie)
- takes a struct sock pointer and checks that it's not NULL

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h|  1 +
 include/uapi/linux/bpf.h   |  8 
 kernel/trace/bpf_trace.c   |  2 ++
 net/core/filter.c  | 12 
 tools/include/uapi/linux/bpf.h |  8 
 5 files changed, 31 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 1aac2af12fed..26219465e1f7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1874,6 +1874,7 @@ extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
 extern const struct bpf_func_proto bpf_sock_from_file_proto;
+extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
 
 const struct bpf_func_proto *bpf_tracing_func_proto(
enum bpf_func_id func_id, const struct bpf_prog *prog);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0b735c2729b2..5855c398d685 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1673,6 +1673,14 @@ union bpf_attr {
  * Return
  * A 8-byte long unique number.
  *
+ * u64 bpf_get_socket_cookie(void *sk)
+ * Description
+ * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
+ * *sk*, but gets socket from a BTF **struct sock**. This helper
+ * also works for sleepable programs.
+ * Return
+ * A 8-byte long unique number or 0 if *sk* is NULL.
+ *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
  * The owner UID of the socket associated to *skb*. If the socket
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 6c0018abe68a..845b2168e006 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1760,6 +1760,8 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const 
struct bpf_prog *prog)
return &bpf_sk_storage_delete_tracing_proto;
case BPF_FUNC_sock_from_file:
return &bpf_sock_from_file_proto;
+   case BPF_FUNC_get_socket_cookie:
+   return &bpf_get_socket_ptr_cookie_proto;
 #endif
case BPF_FUNC_seq_printf:
return prog->expected_attach_type == BPF_TRACE_ITER ?
diff --git a/net/core/filter.c b/net/core/filter.c
index 9ab94e90d660..606e2b6115ed 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4631,6 +4631,18 @@ static const struct bpf_func_proto 
bpf_get_socket_cookie_sock_proto = {
.arg1_type  = ARG_PTR_TO_CTX,
 };
 
+BPF_CALL_1(bpf_get_socket_ptr_cookie, struct sock *, sk)
+{
+   return sk ? sock_gen_cookie(sk) : 0;
+}
+
+const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto = {
+   .func   = bpf_get_socket_ptr_cookie,
+   .gpl_only   = false,
+   .ret_type   = RET_INTEGER,
+   .arg1_type  = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
+};
+
 BPF_CALL_1(bpf_get_socket_cookie_sock_ops, struct bpf_sock_ops_kern *, ctx)
 {
return __sock_gen_cookie(ctx->sk);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0b735c2729b2..5855c398d685 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1673,6 +1673,14 @@ union bpf_attr {
  * Return
  * A 8-byte long unique number.
  *
+ * u64 bpf_get_socket_cookie(void *sk)
+ * Description
+ * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
+ * *sk*, but gets socket from a BTF **struct sock**. This helper
+ * also works for sleepable programs.
+ * Return
+ * A 8-byte long unique number or 0 if *sk* is NULL.
+ *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
  * The owner UID of the socket associated to *skb*. If the socket
-- 
2.30.0.284.gd98b1dd5eaa7-goog
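
For context, a hedged sketch of what this patch enables on the tracing side: a
fentry program that receives a BTF struct sock pointer and passes it straight
to bpf_get_socket_cookie(). The attach point (tcp_close) and the bpf_printk()
output are illustrative, not part of the patch:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("fentry/tcp_close")
int BPF_PROG(trace_tcp_close, struct sock *sk)
{
	/* Returns 0 if sk is NULL, per the helper description above. */
	__u64 cookie = bpf_get_socket_cookie(sk);

	bpf_printk("closing socket, cookie=%llu", cookie);
	return 0;
}

char _license[] SEC("license") = "GPL";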



Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper

2020-12-22 Thread Florent Revest
On Fri, Dec 18, 2020 at 4:20 AM Alexei Starovoitov
 wrote:
> As far as 6 arg issue:
> long bpf_snprintf(const char *out, u32 out_size,
>   const char *fmt, u32 fmt_size,
>   const void *data, u32 data_len);
> Yeah. It won't work as-is, but fmt_size is unnecessary nowadays.
> The verifier understands read-only data.
> Hence the helper can be:
> long bpf_snprintf(const char *out, u32 out_size,
>   const char *fmt,
>   const void *data, u32 data_len);
> The 3rd arg cannot be ARG_PTR_TO_MEM.
> Instead we can introduce ARG_PTR_TO_CONST_STR in the verifier.
> See check_mem_access() where it's doing bpf_map_direct_read().
> That 'fmt' string will be accessed through the same bpf_map_direct_read().
> The verifier would need to check that it's NUL-terminated valid string.

Ok, this works for me.

> It should probably do % specifier checks at the same time.

However, I'm still not sure whether that would work. Did you maybe
miss my comment in a previous email? Let me put it back here:

> The iteration that bpf_trace_printk does over the format string
> argument is not only used for validation. It is also used to remember
> what extra operations need to be done based on the modifier types. For
> example, it remembers whether an arg should be interpreted as 32bits or
> 64bits. In the case of string printing, it also remembers whether it is
> a kernel-space or user-space pointer so that bpf_trace_copy_string can
> be called with the right arg. If we were to run the iteration over the format
> string in the verifier, how would you recommend that we
> "remember" the modifier type until the helper gets called ?

The best solution I can think of would be to iterate over the format
string in the helper. In that case, the format string verification in
the verifier would be redundant and the format string wouldn't have to
be constant. Do you have any suggestions?

> At the end bpf_snprintf() will have 5 args and when wrapped with
> BPF_SNPRINTF() macro it will accept arbitrary number of arguments to print.
> It also will be generally useful to do all other kinds of pretty printing.

Yep this macro is a good idea, I like that. :)
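
For readers following along, a standalone C sketch of the idea being discussed
here: a fixed, 5-argument bpf_snprintf(out, out_size, fmt, data, data_len)
whose wrapper macro packs any number of caller arguments into a u64 array.
Everything below is illustrative; the stub only handles the "%u %u" case and
none of these names are the eventual kernel API:

#include <stdint.h>
#include <stdio.h>

/* Stand-in for the proposed helper: one u64 is consumed per conversion. */
static long fake_bpf_snprintf(char *out, uint32_t out_size, const char *fmt,
			      const uint64_t *data, uint32_t data_len)
{
	if (data_len < 2 * sizeof(uint64_t))
		return -1;
	return snprintf(out, out_size, fmt,
			(unsigned int)data[0], (unsigned int)data[1]);
}

/* The macro hides the packing, so call sites keep a printf-like feel. */
#define FAKE_SNPRINTF(out, out_size, fmt, args...)			\
({									\
	uint64_t ___param[] = { args };					\
	fake_bpf_snprintf(out, out_size, fmt, ___param,			\
			  sizeof(___param));				\
})

int main(void)
{
	char buf[64];

	FAKE_SNPRINTF(buf, sizeof(buf), "%u %u", 1, 2);
	printf("%s\n", buf);	/* prints "1 2" */
	return 0;
}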


Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper

2020-12-22 Thread Florent Revest
On Fri, Dec 18, 2020 at 9:47 PM Andrii Nakryiko
 wrote:
>
> On Fri, Dec 18, 2020 at 12:36 PM Alexei Starovoitov
>  wrote:
> >
> > On Fri, Dec 18, 2020 at 10:53:57AM -0800, Andrii Nakryiko wrote:
> > > On Thu, Dec 17, 2020 at 7:20 PM Alexei Starovoitov
> > >  wrote:
> > > >
> > > > On Thu, Dec 17, 2020 at 09:26:09AM -0800, Yonghong Song wrote:
> > > > >
> > > > >
> > > > > On 12/17/20 7:31 AM, Florent Revest wrote:
> > > > > > On Mon, Dec 14, 2020 at 7:47 AM Yonghong Song  wrote:
> > > > > > > On 12/11/20 6:40 AM, Florent Revest wrote:
> > > > > > > > On Wed, Dec 2, 2020 at 10:18 PM Alexei Starovoitov
> > > > > > > >  wrote:
> > > > > > > > > I still think that adopting printk/vsnprintf for this instead 
> > > > > > > > > of
> > > > > > > > > reinventing the wheel
> > > > > > > > > is more flexible and easier to maintain long term.
> > > > > > > > > Almost the same layout can be done with vsnprintf
> > > > > > > > > with exception of \0 char.
> > > > > > > > > More meaningful names, etc.
> > > > > > > > > See Documentation/core-api/printk-formats.rst
> > > > > > > >
> > > > > > > > I agree this would be nice. I finally got a bit of time to 
> > > > > > > > experiment
> > > > > > > > with this and I noticed a few things:
> > > > > > > >
> > > > > > > > First of all, because helpers only have 5 arguments, if we use 
> > > > > > > > two for
> > > > > > > > the output buffer and its size and two for the format string 
> > > > > > > > and its
> > > > > > > > size, we are only left with one argument for a modifier. This 
> > > > > > > > is still
> > > > > > > > enough for our usecase (where we'd only use "%ps" for example) 
> > > > > > > > but it
> > > > > > > > does not strictly-speaking allow for the same layout that Andrii
> > > > > > > > proposed.
> > > > > > >
> > > > > > > See helper bpf_seq_printf. It packs all arguments for format 
> > > > > > > string and
> > > > > > > puts them into an array. bpf_seq_printf will unpack them as it 
> > > > > > > parsed
> > > > > > > through the format string. So it should be doable to have more 
> > > > > > > than
> > > > > > > "%ps" in format string.
> > > > > >
> > > > > > This could be a nice trick, thank you for the suggestion Yonghong :)
> > > > > >
> > > > > > My understanding is that this would also require two extra args (one
> > > > > > for the array of arguments and one for the size of this array) so it
> > > > > > would still not fit the 5 arguments limit I described in my previous
> > > > > > email.
> > > > > > eg: this would not be possible:
> > > > > > long bpf_snprintf(const char *out, u32 out_size,
> > > > > >const char *fmt, u32 fmt_size,
> > > > > >   const void *data, u32 data_len)
> > > > >
> > > > > Right. bpf allows only up to 5 parameters.
> > > > > >
> > > > > > Would you then suggest that we also put the format string and its
> > > > > > length in the first and second cells of this array and have 
> > > > > > something
> > > > > > along the line of:
> > > > > > long bpf_snprintf(const char *out, u32 out_size,
> > > > > >const void *args, u32 args_len) ?
> > > > > > This seems like a fairly opaque signature to me and harder to 
> > > > > > verify.
> > > > >
> > > > > One way is to define an explicit type for args, something like
> > > > >struct bpf_fmt_str_data {
> > > > >   char *fmt;
> > > > >   u64 fmt_len;
> > > > >   u64 data[];
> > > > >};
> > > >
> > > > that feels a bit convoluted.
> > > >
> &

Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper

2020-12-22 Thread Florent Revest
On Tue, Dec 22, 2020 at 3:18 PM Christoph Hellwig  wrote:
>
> FYI, there is a reason why kallsyms_lookup is not exported any more.
> I don't think adding that back through a backdoor is a good idea.

Did you maybe mean kallsyms_lookup_name (the one that looks an address
up based on a symbol name)? It used to be exported but indeed isn't anymore.
However, this is not what we're trying to do. As far as I can tell,
kallsyms_lookup (the one that looks a symbol name up based on an
address) has never been exported but its close cousins sprint_symbol
and sprint_symbol_no_offset (which only call kallsyms_lookup and
pretty print the result) are still exported; they are also used by
vsprintf. Is this an issue?
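
To make the sprint_symbol() point above concrete, here is a hedged,
kernel-side sketch (not the posted patch) of how a helper could lean on that
exported function rather than on kallsyms_lookup() directly; the function name
and copy semantics are made up:

#include <linux/kallsyms.h>
#include <linux/string.h>

static long sketch_symbolize(unsigned long addr, char *buf, size_t size)
{
	char sym[KSYM_SYMBOL_LEN];

	/* Fills sym with "name+offset/size [module]" via kallsyms_lookup(). */
	sprint_symbol(sym, addr);

	return strscpy(buf, sym, size);
}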


Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper

2020-12-17 Thread Florent Revest
On Mon, Dec 14, 2020 at 7:47 AM Yonghong Song  wrote:
> On 12/11/20 6:40 AM, Florent Revest wrote:
> > On Wed, Dec 2, 2020 at 10:18 PM Alexei Starovoitov
> >  wrote:
> >> I still think that adopting printk/vsnprintf for this instead of
> >> reinventing the wheel
> >> is more flexible and easier to maintain long term.
> >> Almost the same layout can be done with vsnprintf
> >> with exception of \0 char.
> >> More meaningful names, etc.
> >> See Documentation/core-api/printk-formats.rst
> >
> > I agree this would be nice. I finally got a bit of time to experiment
> > with this and I noticed a few things:
> >
> > First of all, because helpers only have 5 arguments, if we use two for
> > the output buffer and its size and two for the format string and its
> > size, we are only left with one argument for a modifier. This is still
> > enough for our usecase (where we'd only use "%ps" for example) but it
> > does not strictly-speaking allow for the same layout that Andrii
> > proposed.
>
> See helper bpf_seq_printf. It packs all arguments for format string and
> puts them into an array. bpf_seq_printf will unpack them as it parsed
> through the format string. So it should be doable to have more than
> "%ps" in format string.

This could be a nice trick, thank you for the suggestion Yonghong :)

My understanding is that this would also require two extra args (one
for the array of arguments and one for the size of this array) so it
would still not fit the 5 arguments limit I described in my previous
email.
eg: this would not be possible:
long bpf_snprintf(const char *out, u32 out_size,
  const char *fmt, u32 fmt_size,
 const void *data, u32 data_len)

Would you then suggest that we also put the format string and its
length in the first and second cells of this array and have something
along the line of:
long bpf_snprintf(const char *out, u32 out_size,
  const void *args, u32 args_len) ?
This seems like a fairly opaque signature to me and harder to verify.


Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper

2020-12-11 Thread Florent Revest
On Wed, Dec 2, 2020 at 10:18 PM Alexei Starovoitov
 wrote:
> I still think that adopting printk/vsnprintf for this instead of
> reinventing the wheel
> is more flexible and easier to maintain long term.
> Almost the same layout can be done with vsnprintf
> with exception of \0 char.
> More meaningful names, etc.
> See Documentation/core-api/printk-formats.rst

I agree this would be nice. I finally got a bit of time to experiment
with this and I noticed a few things:

First of all, because helpers only have 5 arguments, if we use two for
the output buffer and its size and two for the format string and its
size, we are only left with one argument for a modifier. This is still
enough for our usecase (where we'd only use "%ps" for example) but it
does not strictly-speaking allow for the same layout that Andrii
proposed.

> If we force fmt to come from readonly map then bpf_trace_printk()-like
> run-time check of fmt string can be moved into load time check
> and performance won't suffer.

Regarding this bit, I have the impression that this would not be
possible, but maybe I'm missing something? :)

The iteration that bpf_trace_printk does over the format string
argument is not only used for validation. It is also used to remember
what extra operations need to be done based on the modifier types. For
example, it remembers whether an arg should be interpreted as 32bits or
64bits. In the case of string printing, it also remembers whether it is
a kernel-space or user-space pointer so that bpf_trace_copy_string can
be called with the right arg. If we were to run the iteration over the format
string in the verifier, how would you recommend that we
"remember" the modifier type until the helper gets called?

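An illustrative-only sketch of the per-argument state this paragraph refers
to; none of these names exist in the kernel, they merely show what would have
to survive from verification time to helper run time if the format string were
only parsed in the verifier:

enum fmt_arg_type {
	FMT_ARG_U32,	/* print only the low 32 bits */
	FMT_ARG_U64,	/* print the full 64 bits (%l / %ll modifiers) */
	FMT_ARG_KSTR,	/* copy the string with a kernel-space read */
	FMT_ARG_USTR,	/* copy the string with a user-space read */
};

struct fmt_plan {
	int nr_args;
	enum fmt_arg_type types[16];	/* one entry per conversion found */
};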

[PATCH bpf-next v4 4/4] selftests/bpf: Add a selftest for the tracing bpf_get_socket_cookie

2020-12-09 Thread Florent Revest
This builds up on the existing socket cookie test which checks whether
the bpf_get_socket_cookie helpers provide the same value in
cgroup/connect6 and sockops programs for a socket created by the
userspace part of the test.

Adding a tracing program to the existing objects requires a different
attachment strategy and different headers.

Signed-off-by: Florent Revest 
---
 .../selftests/bpf/prog_tests/socket_cookie.c  | 24 +++
 .../selftests/bpf/progs/socket_cookie_prog.c  | 41 ---
 2 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
index 53d0c44e7907..e5c5e2ea1deb 100644
--- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
@@ -15,8 +15,8 @@ struct socket_cookie {
 
 void test_socket_cookie(void)
 {
+   struct bpf_link *set_link, *update_sockops_link, *update_tracing_link;
socklen_t addr_len = sizeof(struct sockaddr_in6);
-   struct bpf_link *set_link, *update_link;
int server_fd, client_fd, cgroup_fd;
struct socket_cookie_prog *skel;
__u32 cookie_expected_value;
@@ -39,15 +39,21 @@ void test_socket_cookie(void)
  PTR_ERR(set_link)))
goto close_cgroup_fd;
 
-   update_link = bpf_program__attach_cgroup(skel->progs.update_cookie,
-cgroup_fd);
-   if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err %ld\n",
- PTR_ERR(update_link)))
+   update_sockops_link = bpf_program__attach_cgroup(
+   skel->progs.update_cookie_sockops, cgroup_fd);
+   if (CHECK(IS_ERR(update_sockops_link), "update-sockops-link-cg-attach",
+ "err %ld\n", PTR_ERR(update_sockops_link)))
goto free_set_link;
 
+   update_tracing_link = bpf_program__attach(
+   skel->progs.update_cookie_tracing);
+   if (CHECK(IS_ERR(update_tracing_link), "update-tracing-link-attach",
+ "err %ld\n", PTR_ERR(update_tracing_link)))
+   goto free_update_sockops_link;
+
server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
-   goto free_update_link;
+   goto free_update_tracing_link;
 
client_fd = connect_to_fd(server_fd, 0);
if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
@@ -71,8 +77,10 @@ void test_socket_cookie(void)
close(client_fd);
 close_server_fd:
close(server_fd);
-free_update_link:
-   bpf_link__destroy(update_link);
+free_update_tracing_link:
+   bpf_link__destroy(update_tracing_link);
+free_update_sockops_link:
+   bpf_link__destroy(update_sockops_link);
 free_set_link:
bpf_link__destroy(set_link);
 close_cgroup_fd:
diff --git a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c 
b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
index 81e84be6f86d..1f770b732cb1 100644
--- a/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
+++ b/tools/testing/selftests/bpf/progs/socket_cookie_prog.c
@@ -1,11 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2018 Facebook
 
-#include 
-#include 
+#include "vmlinux.h"
 
 #include 
 #include 
+#include 
+
+#define AF_INET6 10
 
 struct socket_cookie {
__u64 cookie_key;
@@ -19,6 +21,14 @@ struct {
__type(value, struct socket_cookie);
 } socket_cookies SEC(".maps");
 
+/*
+ * These three programs get executed in a row on connect() syscalls. The
+ * userspace side of the test creates a client socket, issues a connect() on it
+ * and then checks that the local storage associated with this socket has:
+ * cookie_value == local_port << 8 | 0xFF
+ * The different parts of this cookie_value are appended by those hooks if they
+ * all agree on the output of bpf_get_socket_cookie().
+ */
 SEC("cgroup/connect6")
 int set_cookie(struct bpf_sock_addr *ctx)
 {
@@ -32,14 +42,14 @@ int set_cookie(struct bpf_sock_addr *ctx)
if (!p)
return 1;
 
-   p->cookie_value = 0xFF;
+   p->cookie_value = 0xF;
p->cookie_key = bpf_get_socket_cookie(ctx);
 
return 1;
 }
 
 SEC("sockops")
-int update_cookie(struct bpf_sock_ops *ctx)
+int update_cookie_sockops(struct bpf_sock_ops *ctx)
 {
struct bpf_sock *sk;
struct socket_cookie *p;
@@ -60,9 +70,30 @@ int update_cookie(struct bpf_sock_ops *ctx)
if (p->cookie_key != bpf_get_socket_cookie(ctx))
return 1;
 
-   p->cookie_value = (ctx->local_port << 8) | p->cookie_value;
+   p->cookie_value |= (ctx->local_port << 8);
 
 

[PATCH bpf-next v4 1/4] bpf: Be less specific about socket cookies guarantees

2020-12-09 Thread Florent Revest
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one"
socket cookies are not guaranteed to be non-decreasing. The
bpf_get_socket_cookie helper descriptions are currently specifying that
cookies are non-decreasing but we don't want users to rely on that.

Reported-by: Daniel Borkmann 
Signed-off-by: Florent Revest 
---
 include/uapi/linux/bpf.h   | 8 
 tools/include/uapi/linux/bpf.h | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 30b477a26482..ba59309f4d18 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1650,22 +1650,22 @@ union bpf_attr {
  * networking traffic statistics as it provides a global socket
  * identifier that can be assumed unique.
  * Return
- * A 8-byte long non-decreasing number on success, or 0 if the
- * socket field is missing inside *skb*.
+ * A 8-byte long unique number on success, or 0 if the socket
+ * field is missing inside *skb*.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * Description
  * Equivalent to bpf_get_socket_cookie() helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_addr** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * Description
  * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_ops** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 30b477a26482..ba59309f4d18 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1650,22 +1650,22 @@ union bpf_attr {
  * networking traffic statistics as it provides a global socket
  * identifier that can be assumed unique.
  * Return
- * A 8-byte long non-decreasing number on success, or 0 if the
- * socket field is missing inside *skb*.
+ * A 8-byte long unique number on success, or 0 if the socket
+ * field is missing inside *skb*.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * Description
  * Equivalent to bpf_get_socket_cookie() helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_addr** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * Description
  * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
  * *skb*, but gets socket from **struct bpf_sock_ops** context.
  * Return
- * A 8-byte long non-decreasing number.
+ * A 8-byte long unique number.
  *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
-- 
2.29.2.576.ga3fc446d84-goog



[PATCH bpf-next v4 2/4] bpf: Expose bpf_get_socket_cookie to tracing programs

2020-12-09 Thread Florent Revest
This needs two new helpers, one that works in a sleepable context (using
sock_gen_cookie which disables/enables preemption) and one that does not
(for performance reasons). Both take a struct sock pointer and need to
check it for NULLness.

This helper could also be useful to other BPF program types such as LSM.

Signed-off-by: Florent Revest 
---
 include/linux/bpf.h|  1 +
 include/uapi/linux/bpf.h   |  7 +++
 kernel/trace/bpf_trace.c   |  2 ++
 net/core/filter.c  | 12 
 tools/include/uapi/linux/bpf.h |  7 +++
 5 files changed, 29 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 07cb5d15e743..5a858e8c3f1a 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1860,6 +1860,7 @@ extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
 extern const struct bpf_func_proto bpf_sock_from_file_proto;
+extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
 
 const struct bpf_func_proto *bpf_tracing_func_proto(
enum bpf_func_id func_id, const struct bpf_prog *prog);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index ba59309f4d18..9ac66cf25959 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1667,6 +1667,13 @@ union bpf_attr {
  * Return
  * A 8-byte long unique number.
  *
+ * u64 bpf_get_socket_cookie(void *sk)
+ * Description
+ * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
+ * *sk*, but gets socket from a BTF **struct sock**.
+ * Return
+ * A 8-byte long unique number.
+ *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
  * The owner UID of the socket associated to *skb*. If the socket
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 52ddd217d6a1..be5e96de306d 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1760,6 +1760,8 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const 
struct bpf_prog *prog)
return &bpf_sk_storage_delete_tracing_proto;
case BPF_FUNC_sock_from_file:
return &bpf_sock_from_file_proto;
+   case BPF_FUNC_get_socket_cookie:
+   return &bpf_get_socket_ptr_cookie_proto;
 #endif
case BPF_FUNC_seq_printf:
return prog->expected_attach_type == BPF_TRACE_ITER ?
diff --git a/net/core/filter.c b/net/core/filter.c
index 255aeee72402..13ad9a64f04f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4631,6 +4631,18 @@ static const struct bpf_func_proto 
bpf_get_socket_cookie_sock_proto = {
.arg1_type  = ARG_PTR_TO_CTX,
 };
 
+BPF_CALL_1(bpf_get_socket_ptr_cookie, struct sock *, sk)
+{
+   return sk ? sock_gen_cookie(sk) : 0;
+}
+
+const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto = {
+   .func   = bpf_get_socket_ptr_cookie,
+   .gpl_only   = false,
+   .ret_type   = RET_INTEGER,
+   .arg1_type  = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
+};
+
 BPF_CALL_1(bpf_get_socket_cookie_sock_ops, struct bpf_sock_ops_kern *, ctx)
 {
return __sock_gen_cookie(ctx->sk);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ba59309f4d18..9ac66cf25959 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1667,6 +1667,13 @@ union bpf_attr {
  * Return
  * A 8-byte long unique number.
  *
+ * u64 bpf_get_socket_cookie(void *sk)
+ * Description
+ * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts
+ * *sk*, but gets socket from a BTF **struct sock**.
+ * Return
+ * A 8-byte long unique number.
+ *
  * u32 bpf_get_socket_uid(struct sk_buff *skb)
  * Return
  * The owner UID of the socket associated to *skb*. If the socket
-- 
2.29.2.576.ga3fc446d84-goog



[PATCH bpf-next v4 3/4] selftests/bpf: Integrate the socket_cookie test to test_progs

2020-12-09 Thread Florent Revest
Currently, the selftest for the BPF socket_cookie helpers is built and
run independently from test_progs. It's easy to forget and hard to
maintain.

This patch moves the socket cookies test into prog_tests/ and vastly
simplifies its logic by:
- rewriting the loading code with BPF skeletons
- rewriting the server/client code with network helpers
- rewriting the cgroup code with test__join_cgroup
- rewriting the error handling code with CHECKs

Signed-off-by: Florent Revest 
---
 tools/testing/selftests/bpf/Makefile  |   3 +-
 .../selftests/bpf/prog_tests/socket_cookie.c  |  82 +++
 .../selftests/bpf/progs/socket_cookie_prog.c  |   2 -
 .../selftests/bpf/test_socket_cookie.c| 208 --
 4 files changed, 83 insertions(+), 212 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/socket_cookie.c
 delete mode 100644 tools/testing/selftests/bpf/test_socket_cookie.c

diff --git a/tools/testing/selftests/bpf/Makefile 
b/tools/testing/selftests/bpf/Makefile
index ac25ba5d0d6c..c21960d5f286 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -33,7 +33,7 @@ LDLIBS += -lcap -lelf -lz -lrt -lpthread
 # Order correspond to 'make run_tests' order
 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map 
test_progs \
test_verifier_log test_dev_cgroup \
-   test_sock test_sockmap get_cgroup_id_user test_socket_cookie \
+   test_sock test_sockmap get_cgroup_id_user \
test_cgroup_storage \
test_netcnt test_tcpnotify_user test_sysctl \
test_progs-no_alu32 \
@@ -167,7 +167,6 @@ $(OUTPUT)/test_dev_cgroup: cgroup_helpers.c
 $(OUTPUT)/test_skb_cgroup_id_user: cgroup_helpers.c
 $(OUTPUT)/test_sock: cgroup_helpers.c
 $(OUTPUT)/test_sock_addr: cgroup_helpers.c
-$(OUTPUT)/test_socket_cookie: cgroup_helpers.c
 $(OUTPUT)/test_sockmap: cgroup_helpers.c
 $(OUTPUT)/test_tcpnotify_user: cgroup_helpers.c trace_helpers.c
 $(OUTPUT)/get_cgroup_id_user: cgroup_helpers.c
diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c 
b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
new file mode 100644
index ..53d0c44e7907
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Google LLC.
+// Copyright (c) 2018 Facebook
+
+#include 
+#include "socket_cookie_prog.skel.h"
+#include "network_helpers.h"
+
+static int duration;
+
+struct socket_cookie {
+   __u64 cookie_key;
+   __u32 cookie_value;
+};
+
+void test_socket_cookie(void)
+{
+   socklen_t addr_len = sizeof(struct sockaddr_in6);
+   struct bpf_link *set_link, *update_link;
+   int server_fd, client_fd, cgroup_fd;
+   struct socket_cookie_prog *skel;
+   __u32 cookie_expected_value;
+   struct sockaddr_in6 addr;
+   struct socket_cookie val;
+   int err = 0;
+
+   skel = socket_cookie_prog__open_and_load();
+   if (CHECK(!skel, "socket_cookie_prog__open_and_load",
+ "skeleton open_and_load failed\n"))
+   return;
+
+   cgroup_fd = test__join_cgroup("/socket_cookie");
+   if (CHECK(cgroup_fd < 0, "join_cgroup", "cgroup creation failed\n"))
+   goto destroy_skel;
+
+   set_link = bpf_program__attach_cgroup(skel->progs.set_cookie,
+ cgroup_fd);
+   if (CHECK(IS_ERR(set_link), "set-link-cg-attach", "err %ld\n",
+ PTR_ERR(set_link)))
+   goto close_cgroup_fd;
+
+   update_link = bpf_program__attach_cgroup(skel->progs.update_cookie,
+cgroup_fd);
+   if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err %ld\n",
+ PTR_ERR(update_link)))
+   goto free_set_link;
+
+   server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+   if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
+   goto free_update_link;
+
+   client_fd = connect_to_fd(server_fd, 0);
+   if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
+   goto close_server_fd;
+
+   err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.socket_cookies),
+ &client_fd, &val);
+   if (CHECK(err, "map_lookup", "err %d errno %d\n", err, errno))
+   goto close_client_fd;
+
+   err = getsockname(client_fd, (struct sockaddr *)&addr, &addr_len);
+   if (CHECK(err, "getsockname", "Can't get client local addr\n"))
+   goto close_client_fd;
+
+   cookie_expected_value = (ntohs(addr.sin6_port) << 8) | 0xFF;
+   CHECK(val.cookie_value != cookie_expected_value, "",
+

Re: [PATCH bpf-next v3 2/4] bpf: Expose bpf_get_socket_cookie to tracing programs

2020-12-09 Thread Florent Revest
On Tue, 2020-12-08 at 23:08 +0100, KP Singh wrote:
> My understanding is you can simply always call sock_gen_cookie and
> not have two protos.
> 
> This will disable preemption in sleepable programs and not have any
> effect in non-sleepable programs since preemption will already be
> disabled.

Sure, that works. I thought that providing two helper implementations would
slightly improve performance for non-sleepable programs but I can send
a v4 with only one helper that calls sock_gen_cookie.


