Re: [PATCH bpf v3] x86/cpufeature: bpf hack for clang not supporting asm goto

2018-05-10 Thread Gianluca Borello
On Thu, May 10, 2018 at 9:28 AM Borislav Petkov  wrote:

> As someone already pointed out on IRC, arch/x86/include/asm/cpufeature.h
> is solely a kernel header so nothing but kernel should include it. So
> forget the userspace breakage "argument".


For what it's worth, I have the exact same problem in a relatively popular
open source system call tracer, and my attempt to fix the issue from user
space was:
https://github.com/draios/sysdig/commit/2958eb1d52e047f4b93d1238be803e7c405bdec2

While I can definitely live with that (and I'd be happy to submit a patch
to samples/ following the same approach) and absolutely respect the
technical authority of the reviewers here, it would be nice to recognize
that these changes actually affect users to a certain degree, even if from
a technical point of view they don't break userspace.

From a practical point of view, BPF is widely used from userspace programs
to access some kernel data structures to gather visibility information, and
even the simplest use case, such as including linux/sched.h to access some
task_struct members, ends up pulling in arch/x86/include/asm/cpufeature.h,
thus (ab)using that file. Adding these quirks definitely increases the
complexity a developer needs to keep in mind in order to take advantage of
BPF-based instrumentation.

Thanks


Re: [RFC bpf] bpf, x64: fix JIT emission for dead code

2018-04-25 Thread Gianluca Borello
On Wed, Apr 25, 2018 at 8:34 AM Daniel Borkmann wrote:

> I've applied this fix to bpf tree, thanks Gianluca!

Thank you all for the quick review, really appreciated!


[RFC bpf] bpf, x64: fix JIT emission for dead code

2018-04-24 Thread Gianluca Borello
Commit 2a5418a13fcf ("bpf: improve dead code sanitizing") replaced dead
code with a series of ja-1 instructions, for safety. That made JIT
compilation much more complex for some BPF programs, for example:

bool flag = false
...
/* A bunch of other code */
...
if (flag)
	do_something()

In some cases llvm is not able to remove the code for do_something() at
compile time, so the generated BPF program ends up with a large amount
of dead instructions. In one specific real life example, there are two
series of ~500 and ~1000 dead instructions in the program. When the
verifier replaces them with a series of ja-1 instructions, it causes an
interesting behavior at JIT time.

During the first pass, since all the instructions are estimated at 64
bytes, the ja-1 instructions end up being translated as 5-byte JMP
instructions (0xE9), since the jump offsets become increasingly large (>
127) as each instruction gets discovered to be 5 bytes instead of the
estimated 64.

Starting from the second pass, the first N instructions of the ja-1
sequence get translated into 2-byte JMPs (0xEB) because the jump offsets
become <= 127 this time. In particular, N is roughly 127 / (5 - 2) ~= 42.
So, each further pass will make the subsequent N JMP instructions shrink
from 5 to 2 bytes, making the image shrink every time.
This means that in order to have the entire program converge, there need
to be, in the real example above, at least ~1000 / 42 ~= 24 passes just
for translating the dead code. If we add this number to the passes needed
to translate the other non-dead code, it brings the program to 40+
passes, and JIT doesn't complete. Ultimately the userspace loader fails,
because the BPF program was supposed to be part of a prog array owner
being JITed.
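
To make the convergence math concrete, here is a toy model of the pass
loop (an illustration only, not the real JIT: it tracks a single block of
N ja-1 instructions and their addrs[] estimates). Running it with N = 1000
reproduces roughly the 42-instructions-per-pass shrinkage described above:

#include <stdio.h>

#define N 1000	/* dead ja-1 instructions, as in the real example */

int main(void)
{
	static int addrs[N + 1];
	int pass, i;

	/* First-pass estimate: every instruction is 64 bytes. */
	for (i = 0; i <= N; i++)
		addrs[i] = i * 64;

	for (pass = 1; pass <= 64; pass++) {
		int off = 0, changed = 0;

		for (i = 1; i <= N; i++) {
			/* A ja -1 jumps to its own start: the target is the
			 * end of insn i-1 (already updated in this pass),
			 * relative to the end of insn i (still the previous
			 * pass's estimate), hence the slow convergence.
			 */
			int rel = addrs[i - 1] - addrs[i];
			int len = (rel >= -128 && rel <= 127) ? 2 : 5;

			off += len;
			if (addrs[i] != off)
				changed = 1;
			addrs[i] = off;
		}
		printf("pass %2d: image size %5d bytes\n", pass, addrs[N]);
		if (!changed)
			break;	/* converged */
	}
	return 0;
}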

While it is certainly possible to try to refactor such programs to help
the compiler remove the dead code, the behavior is not really intuitive
and it puts a further burden on BPF developers who are not expecting it.
To make things worse, such programs worked just fine in all the kernel
releases prior to the ja-1 change.

A possible approach to mitigate this behavior consists in noticing that
for ja-1 instructions we don't really need to rely on the estimated size
of the previous and current instructions: we know that a -1 BPF jump
offset can be safely translated into a 0xEB instruction with a jump offset
of -2.

This fix brings the BPF program in the previous example back to completing
in ~9 passes.

Fixes: 2a5418a13fcf ("bpf: improve dead code sanitizing")
Signed-off-by: Gianluca Borello 
---
Hi

Posting this as RFC since I just wanted to report this potential bug and
propose a possible fix, although I'm not sure if:

1) There might be a better fix on the JIT side
2) We might want to replace the ja-1 instructions with something else
3) We might want to ignore this issue

If we choose option 3, I'd just like to point out that this caused a real
regression on a couple of BPF programs that are part of a larger collection
of programs that used to work fine on 4.15 and that I recently found broken
in 4.16, so there would be some value in somehow addressing this.

Thanks

 arch/x86/net/bpf_jit_comp.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index b725154182cc..abce27ceb411 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1027,7 +1027,17 @@ xadd:			if (is_imm8(insn->off))
 			break;
 
 		case BPF_JMP | BPF_JA:
-			jmp_offset = addrs[i + insn->off] - addrs[i];
+			if (insn->off == -1)
+				/* -1 jmp instructions will always jump
+				 * backwards two bytes. Explicitly handling
+				 * this case avoids wasting too many passes
+				 * when there are long sequences of replaced
+				 * dead code.
+				 */
+				jmp_offset = -2;
+			else
+				jmp_offset = addrs[i + insn->off] - addrs[i];
+
 			if (!jmp_offset)
 				/* optimize out nop jumps */
 				break;
-- 
2.17.0



Re: Issue accessing task_struct from BPF due to 4.16 stack-protector changes

2018-03-02 Thread Gianluca Borello
On Fri, Mar 2, 2018 at 12:42 PM, Alexei Starovoitov wrote:
>
> good catch!
> I wonder why sched.h is using this flag instead of relying on #defines
> from autoconf.h
> It could have been using CONFIG_HAVE_CC_STACKPROTECTOR
> instead of CONFIG_CC_STACKPROTECTOR, no?
>

Thanks for your reply Alexei. I think switching to
HAVE_CC_STACKPROTECTOR could indeed solve this particular BPF issue in
a cleaner way (I tested it), at the cost of having that struct member
always present for the supported architectures even if the stack
protector is actually disabled (e.g. CONFIG_CC_STACKPROTECTOR_NONE=y).

Not sure if this could be frowned upon by someone considering how
critical task_struct is, but on the other hand it's really just 8 bytes.

Thanks


Issue accessing task_struct from BPF due to 4.16 stack-protector changes

2018-03-02 Thread Gianluca Borello
Hello,

While testing bpf-next, I noticed that I was reading garbage when
accessing some task_struct members, and the issue seems to be caused by
the recent commit 2bc2f688fdf8 ("Makefile: move stack-protector
availability out of Kconfig"), which removes CONFIG_CC_STACKPROTECTOR
from autoconf.h.

When I compile my BPF program, offsetof(struct task_struct, files),
which is the member I'm dereferencing, returns 1768 (where the garbage
is), whereas doing it on 4.15 returns 1776 (where the correct member
is). I believe that when compiling with clang, this portion of
task_struct is no longer included:

#ifdef CONFIG_CC_STACKPROTECTOR
	/* Canary value for the -fstack-protector GCC feature: */
	unsigned long stack_canary;
#endif

I solved it by adding $(KBUILD_CPPFLAGS) to my BPF Makefile (which is
pretty similar to the one used in samples/bpf/Makefile).
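
For anyone hitting the same issue, the change amounts to something like
this (a sketch modeled on a samples/bpf-style build rule; KERNEL_SRC and
probe.c are illustrative names, not part of any existing Makefile):

# Forward the kernel's own preprocessor flags, which after commit
# 2bc2f688fdf8 carry the stack-protector define, to the BPF clang run.
$(obj)/probe.o: $(src)/probe.c
	clang $(KBUILD_CPPFLAGS) -D__KERNEL__ -D__BPF_TRACING__ \
		-I$(KERNEL_SRC)/include \
		-O2 -emit-llvm -c $< -o - | \
	llc -march=bpf -filetype=obj -o $@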

Two questions:

1) Do you confirm this is the proper way to handle this moving
forward? Or should there be a better way?

2) Would you consider a simple patch to samples/bpf/Makefile useful, so
that other developers will not be stuck in a long bisect to figure out
why they read garbage when dereferencing task_struct? I assume that
several people use that Makefile as a template to start their projects,
like I did (perhaps I'm assuming wrong, though).

Thanks


[PATCH bpf-next] bpf: fix stacksafe exploration when comparing states

2017-12-23 Thread Gianluca Borello
Commit cc2b14d51053 ("bpf: teach verifier to recognize zero initialized
stack") introduced a very relaxed check when comparing stacks of different
states, effectively returning a positive result in many cases where it
shouldn't.

This can create problems in cases such as the following C pseudocode:

long var;
long *x = bpf_map_lookup(...);
if (!x)
	return;

if (*x != 0xbeef)
	var = 0;
else
	var = 1;

/* This is the key part, calling a helper causes an explored state
 * to be saved with the information that "var" is on the stack as
 * STACK_ZERO, since the helper is first met by the verifier after
 * the "var = 0" assignment. This state will however be wrongly used
 * also for the "var = 1" case, so the verifier assumes "var" is always
 * 0 and will replace the NULL assignment with nops, because the
 * search pruning prevents it from exploring the faulty branch.
 */
bpf_ktime_get_ns();

if (var)
	*(long *)0 = 0xbeef;

Fix the issue by making sure that the stack is fully explored before
returning a positive comparison result.

Also attach a couple of tests that highlight the bad behavior. In the
first test, without this fix, instructions 16 and 17 are replaced with
nops instead of being rejected by the verifier.

The second test, instead, allows a program to make a potentially illegal
read from the stack.

Fixes: cc2b14d51053 ("bpf: teach verifier to recognize zero initialized stack")
Signed-off-by: Gianluca Borello 
Acked-by: Alexei Starovoitov 
---
 kernel/bpf/verifier.c   |  2 +-
 tools/testing/selftests/bpf/test_verifier.c | 51 +
 2 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8b442ae125d0..93e1c77dae1d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4107,7 +4107,7 @@ static bool stacksafe(struct bpf_func_state *old,
 
 		if (!(old->stack[spi].spilled_ptr.live & REG_LIVE_READ))
 			/* explored state didn't use this */
-			return true;
+			continue;
 
 		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
 			continue;
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 3bacff0d6f91..5e79515d10c5 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -9715,6 +9715,57 @@ static struct bpf_test tests[] = {
.result = REJECT,
.prog_type = BPF_PROG_TYPE_XDP,
},
+	{
+		"search pruning: all branches should be verified (nop operation)",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 11),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_3, 0xbeef, 2),
+			BPF_MOV64_IMM(BPF_REG_4, 0),
+			BPF_JMP_A(1),
+			BPF_MOV64_IMM(BPF_REG_4, 1),
+			BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_4, -16),
+			BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_5, BPF_REG_10, -16),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_5, 0, 2),
+			BPF_MOV64_IMM(BPF_REG_6, 0),
+			BPF_ST_MEM(BPF_DW, BPF_REG_6, 0, 0xdead),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map1 = { 3 },
+		.errstr = "R6 invalid mem access 'inv'",
+		.result = REJECT,
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"search pruning: all branches should be verified (invalid stack access)",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0),
+			BPF_MOV64_IMM(BPF_REG_4, 0),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_3, 0xbeef, 2),
+			BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_4, -16),
+			BPF_JMP_A(1),

Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL

2017-11-22 Thread Gianluca Borello
On Tue, Nov 21, 2017 at 2:31 PM, Alexei Starovoitov wrote:
>
> yeah sorry about this hack. Gianluca reported this issue as well.
> Yonghong fixed it for bpf_probe_read only. We will extend
> the fix to bpf_probe_read_str() and bpf_perf_event_output() asap.
> The above workaround gets too much into llvm and verifier details;
> we should strive to make bpf program writing as easy as possible.
>

Hi Arnaldo

With the help of Alexei, Daniel and Yonghong I just submitted a new
series ("bpf: fix semantics issues with helpers receiving NULL
arguments") that includes a fix in bpf_perf_event_output. This should
simplify the way you write your bpf programs, so you shouldn't be
required to write those convoluted branches anymore (there are a few
usage examples in the commit log).
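
For the archives, the pattern that becomes straightforward with the
series applied looks roughly like this (a sketch: the events perf map and
the buffer size are illustrative):

	char buf[256];
	int len = bpf_probe_read_str(buf, sizeof(buf), unsafe_ptr);

	if (len < 0)
		return 0;
	/* arg5 is now ARG_CONST_SIZE_OR_ZERO, so a simple bound (here a
	 * mask, since sizeof(buf) is a power of two) is all the verifier
	 * needs, with no special casing of len == 0.
	 */
	bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, buf,
			      len & (sizeof(buf) - 1));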

In my case it made writing the code much easier, after applying it I
haven't been surprised by the compiler output in a while, and I hope
your experience will be improved as well.

Thanks


[PATCH net 3/4] bpf: change bpf_probe_read_str arg2 type to ARG_CONST_SIZE_OR_ZERO

2017-11-22 Thread Gianluca Borello
Commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO
semantics") relaxed the treatment of ARG_CONST_SIZE_OR_ZERO due to the way
the compiler generates optimized BPF code when checking boundaries of an
argument from C code. A typical example of this optimized code can be
generated using the bpf_probe_read_str helper when operating on variable
memory:

/* len is a generic scalar */
if (len > 0 && len <= 0x7fff)
	bpf_probe_read_str(p, len, s);

251: (79) r1 = *(u64 *)(r10 -88)
252: (07) r1 += -1
253: (25) if r1 > 0x7ffe goto pc-42
254: (bf) r1 = r7
255: (79) r2 = *(u64 *)(r10 -88)
256: (bf) r8 = r4
257: (85) call bpf_probe_read_str#45
R2 min value is negative, either use unsigned or 'var &= const'

With this code, the verifier loses track of the variable.

Replacing arg2 with ARG_CONST_SIZE_OR_ZERO is thus desirable since it
avoids this quite common case which leads to usability issues, and the
compiler generates code that the verifier can more easily test:

if (len <= 0x7fff)
	bpf_probe_read_str(p, len, s);

or

bpf_probe_read_str(p, len & 0x7fff, s);

No changes to the bpf_probe_read_str helper are necessary since
strncpy_from_unsafe itself immediately returns if the size passed is 0.

Signed-off-by: Gianluca Borello 
Acked-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
---
 kernel/trace/bpf_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 728909f7951c..ed8601a1a861 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -494,7 +494,7 @@ static const struct bpf_func_proto bpf_probe_read_str_proto = {
 	.gpl_only	= true,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
-	.arg2_type	= ARG_CONST_SIZE,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
 	.arg3_type	= ARG_ANYTHING,
 };

--
2.14.1


[PATCH net 4/4] bpf: change bpf_perf_event_output arg5 type to ARG_CONST_SIZE_OR_ZERO

2017-11-22 Thread Gianluca Borello
Commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO
semantics") relaxed the treatment of ARG_CONST_SIZE_OR_ZERO due to the way
the compiler generates optimized BPF code when checking boundaries of an
argument from C code. A typical example of this optimized code can be
generated using the bpf_perf_event_output helper when operating on variable
memory:

/* len is a generic scalar */
if (len > 0 && len <= 0x7fff)
	bpf_perf_event_output(ctx, &perf_map, 0, buf, len);

110: (79) r5 = *(u64 *)(r10 -40)
111: (bf) r1 = r5
112: (07) r1 += -1
113: (25) if r1 > 0x7ffe goto pc+6
114: (bf) r1 = r6
115: (18) r2 = 0x94e5f166c200
117: (b7) r3 = 0
118: (bf) r4 = r7
119: (85) call bpf_perf_event_output#25
R5 min value is negative, either use unsigned or 'var &= const'

With this code, the verifier loses track of the variable.

Replacing arg5 with ARG_CONST_SIZE_OR_ZERO is thus desirable since it
avoids this quite common case which leads to usability issues, and the
compiler generates code that the verifier can more easily test:

if (len <= 0x7fff)
	bpf_perf_event_output(ctx, &perf_map, 0, buf, len);

or

bpf_perf_event_output(ctx, &perf_map, 0, buf, len & 0x7fff);

No changes to the bpf_perf_event_output helper are necessary since it can
handle a case where size is 0, and an empty frame is pushed.

Reported-by: Arnaldo Carvalho de Melo 
Signed-off-by: Gianluca Borello 
Acked-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
---
 kernel/trace/bpf_trace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ed8601a1a861..27d1f4ffa3de 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -403,7 +403,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = {
 	.arg2_type	= ARG_CONST_MAP_PTR,
 	.arg3_type	= ARG_ANYTHING,
 	.arg4_type	= ARG_PTR_TO_MEM,
-	.arg5_type	= ARG_CONST_SIZE,
+	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
 };

 static DEFINE_PER_CPU(struct pt_regs, bpf_pt_regs);
@@ -605,7 +605,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto_tp = {
 	.arg2_type	= ARG_CONST_MAP_PTR,
 	.arg3_type	= ARG_ANYTHING,
 	.arg4_type	= ARG_PTR_TO_MEM,
-	.arg5_type	= ARG_CONST_SIZE,
+	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
 };

 BPF_CALL_3(bpf_get_stackid_tp, void *, tp_buff, struct bpf_map *, map,
--
2.14.1


[PATCH net 2/4] bpf: remove explicit handling of 0 for arg2 in bpf_probe_read

2017-11-22 Thread Gianluca Borello
Commit 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2 type to
ARG_CONST_SIZE_OR_ZERO") changed arg2 type to ARG_CONST_SIZE_OR_ZERO to
simplify writing bpf programs by taking advantage of the new semantics
introduced for ARG_CONST_SIZE_OR_ZERO, which allows <!NULL, 0> arguments.

In order to prevent the helper from actually passing a NULL pointer to
probe_kernel_read, which can happen when <NULL, 0> is passed to the helper,
the commit also introduced an explicit check against size == 0.

After the recent introduction of the ARG_PTR_TO_MEM_OR_NULL type,
bpf_probe_read can not receive a pair of <NULL, 0> arguments anymore, thus
the check is not needed anymore and can be removed, since probe_kernel_read
can correctly handle a <!NULL, 0> call. This also fixes the semantics of
the helper before it gets officially released and bpf programs start
relying on this check.

Fixes: 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZERO")
Signed-off-by: Gianluca Borello 
Acked-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Acked-by: Yonghong Song 
---
 kernel/trace/bpf_trace.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index a5580c670866..728909f7951c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -78,16 +78,12 @@ EXPORT_SYMBOL_GPL(trace_call_bpf);

 BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr)
 {
-	int ret = 0;
-
-	if (unlikely(size == 0))
-		goto out;
+	int ret;
 
 	ret = probe_kernel_read(dst, unsafe_ptr, size);
 	if (unlikely(ret < 0))
 		memset(dst, 0, size);
 
- out:
 	return ret;
 }

--
2.14.1


[PATCH net 1/4] bpf: introduce ARG_PTR_TO_MEM_OR_NULL

2017-11-22 Thread Gianluca Borello
With the current ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM semantics, a helper
argument can be NULL when the next argument type is ARG_CONST_SIZE_OR_ZERO
and the verifier can prove the value of this next argument is 0. However,
most helpers are just interested in handling <!NULL, size>, so forcing them
to deal with <NULL, size> makes the implementation of those helpers more
complicated for no apparent benefit, requiring them to explicitly handle
those corner cases with checks that bpf programs could start relying upon,
preventing the possibility of removing them later.

Solve this by making ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM never accept NULL
even when ARG_CONST_SIZE_OR_ZERO is set, and introduce a new argument type
ARG_PTR_TO_MEM_OR_NULL to explicitly deal with the NULL case.

Currently, the only helper that needs this is bpf_csum_diff_proto(), so
change arg1 and arg3 to this new type as well.

Also add a new battery of tests that explicitly test the
!ARG_PTR_TO_MEM_OR_NULL combination: all the current ones testing the
various <NULL, size> variations are focused on bpf_csum_diff, so cover also
other helpers.
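
From a program's point of view, the resulting semantics look like this
(a sketch; the local variables and sizes are illustrative):

	__be32 to[1] = { 0 };
	char buf[8];
	int csum;

	/* bpf_csum_diff: arg1/arg3 are now ARG_PTR_TO_MEM_OR_NULL, so a
	 * NULL buffer is accepted as long as the verifier can prove the
	 * paired size is 0:
	 */
	csum = bpf_csum_diff(NULL, 0, to, sizeof(to), 0);

	/* bpf_probe_read: arg1 remains ARG_PTR_TO_UNINIT_MEM, so even with
	 * size 0 the pointer must still be a valid (non-NULL) buffer:
	 */
	bpf_probe_read(buf, 0, unsafe_ptr);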

Signed-off-by: Gianluca Borello 
Acked-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
---
 include/linux/bpf.h |   1 +
 kernel/bpf/verifier.c   |   4 +-
 net/core/filter.c   |   4 +-
 tools/testing/selftests/bpf/test_verifier.c | 113 ++--
 4 files changed, 112 insertions(+), 10 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 76c577281d78..e55e4255a210 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -78,6 +78,7 @@ enum bpf_arg_type {
 				 * functions that access data on eBPF program stack
 				 */
 	ARG_PTR_TO_MEM,		/* pointer to valid memory (stack, packet, map value) */
+	ARG_PTR_TO_MEM_OR_NULL,	/* pointer to valid memory or NULL */
 	ARG_PTR_TO_UNINIT_MEM,	/* pointer to memory does not need to be initialized,
 				 * helper function must fill all bytes or clear
 				 * them in error case.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index dd54d20ace2f..308b0638ec5d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1384,13 +1384,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 		if (type != expected_type)
 			goto err_type;
 	} else if (arg_type == ARG_PTR_TO_MEM ||
+		   arg_type == ARG_PTR_TO_MEM_OR_NULL ||
 		   arg_type == ARG_PTR_TO_UNINIT_MEM) {
 		expected_type = PTR_TO_STACK;
 		/* One exception here. In case function allows for NULL to be
 		 * passed in as argument, it's a SCALAR_VALUE type. Final test
 		 * happens during stack boundary checking.
 		 */
-		if (register_is_null(*reg))
+		if (register_is_null(*reg) &&
+		    arg_type == ARG_PTR_TO_MEM_OR_NULL)
 			/* final test in check_stack_boundary() */;
 		else if (!type_is_pkt_pointer(type) &&
 			 type != PTR_TO_MAP_VALUE &&
diff --git a/net/core/filter.c b/net/core/filter.c
index 1afa17935954..6a85e67fafce 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1646,9 +1646,9 @@ static const struct bpf_func_proto bpf_csum_diff_proto = {
 	.gpl_only	= false,
 	.pkt_access	= true,
 	.ret_type	= RET_INTEGER,
-	.arg1_type	= ARG_PTR_TO_MEM,
+	.arg1_type	= ARG_PTR_TO_MEM_OR_NULL,
 	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
-	.arg3_type	= ARG_PTR_TO_MEM,
+	.arg3_type	= ARG_PTR_TO_MEM_OR_NULL,
 	.arg4_type	= ARG_CONST_SIZE_OR_ZERO,
 	.arg5_type	= ARG_ANYTHING,
 };
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 2a5267bef160..3c64f30cf63c 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -5631,7 +5631,7 @@ static struct bpf_test tests[] = {
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
 	},
 	{
-		"helper access to variable memory: size = 0 allowed on NULL",
+		"helper access to variable memory: size = 0 allowed on NULL (ARG_PTR_TO_MEM_OR_NULL)",
 		.insns = {
 			BPF_MOV64_IMM(BPF_REG_1, 0),
 			BPF_MOV64_IMM(BPF_REG_2, 0),
@@ -5645,7 +5645,7 @@ static struct bpf_test tests[] = {
 		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	},
 	{
-		"helper access to variable memory: size > 0 not allowed on NULL",
+		"helper access to variable memory: size > 0 not allowed on NULL (ARG_PTR_TO_MEM_OR_NULL)",
 		.insns = {

[PATCH net 0/4] bpf: fix semantics issues with helpers receiving NULL arguments

2017-11-22 Thread Gianluca Borello
This set includes fixes for some semantics and usability issues that
emerged recently; it would be good to have them in net before the next
release.

In particular, ARG_CONST_SIZE_OR_ZERO semantics was recently changed in
commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO
semantics") with the goal of letting the compiler generate simpler code
that the verifier can more easily accept.

To handle this change in semantics, a few checks in some helpers were
added, like in commit 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2
type to ARG_CONST_SIZE_OR_ZERO"), and those checks are less than ideal
because once they make it into a released kernel bpf programs can start
relying on them, preventing the possibility of being removed later on.

This series tries to fix the issue by introducing a new argument type
ARG_PTR_TO_MEM_OR_NULL that can be used for helpers that can receive a
<NULL, 0> tuple. By doing so, we can fix the semantics of the other helpers
that don't need <NULL, 0> and can just handle <!NULL, size>, allowing the
code to get rid of those checks.

Gianluca Borello (4):
  bpf: introduce ARG_PTR_TO_MEM_OR_NULL
  bpf: remove explicit handling of 0 for arg2 in bpf_probe_read
  bpf: change bpf_probe_read_str arg2 type to ARG_CONST_SIZE_OR_ZERO
  bpf: change bpf_perf_event_output arg5 type to ARG_CONST_SIZE_OR_ZERO

 include/linux/bpf.h |   1 +
 kernel/bpf/verifier.c   |   4 +-
 kernel/trace/bpf_trace.c|  12 +--
 net/core/filter.c   |   4 +-
 tools/testing/selftests/bpf/test_verifier.c | 113 ++--
 5 files changed, 116 insertions(+), 18 deletions(-)

--
2.14.1


[PATCH net-next] bpf: remove tail_call and get_stackid helper declarations from bpf.h

2017-10-25 Thread Gianluca Borello
commit afdb09c720b6 ("security: bpf: Add LSM hooks for bpf object related
syscall") included linux/bpf.h in linux/security.h. As a result, bpf
programs including bpf_helpers.h and some other header that ends up
also pulling in security.h, such as several examples under samples/bpf,
fail to compile because bpf_tail_call and bpf_get_stackid are now
"redefined as different kind of symbol".

From bpf.h:

u64 bpf_tail_call(u64 ctx, u64 r2, u64 index, u64 r4, u64 r5);
u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);

Whereas in bpf_helpers.h they are:

static void (*bpf_tail_call)(void *ctx, void *map, int index);
static int (*bpf_get_stackid)(void *ctx, void *map, int flags);

Fix this by removing the unused declaration of bpf_tail_call and moving
the declaration of bpf_get_stackid to bpf_trace.c, which is the only
place where it's needed.

Signed-off-by: Gianluca Borello 
Acked-by: Alexei Starovoitov 
---
 include/linux/bpf.h  | 3 ---
 kernel/trace/bpf_trace.c | 2 ++
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 172be7faf7ba..520aeebe0d93 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -231,9 +231,6 @@ struct bpf_event_entry {
 	struct rcu_head rcu;
 };
 
-u64 bpf_tail_call(u64 ctx, u64 r2, u64 index, u64 r4, u64 r5);
-u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
-
 bool bpf_prog_array_compatible(struct bpf_array *array, const struct bpf_prog *fp);
 int bpf_prog_calc_tag(struct bpf_prog *fp);
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index b65011d320e3..136aa6bb0422 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -15,6 +15,8 @@
 #include 
 #include "trace.h"
 
+u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
+
 /**
  * trace_call_bpf - invoke BPF program
  * @call: tracepoint event
-- 
2.15.0.rc2



[PATCH net-next] bpf: add bpf_probe_read_str helper

2017-01-18 Thread Gianluca Borello
Provide a simple helper with the same semantics of strncpy_from_unsafe():

int bpf_probe_read_str(void *dst, int size, const void *unsafe_addr)

This gives more flexibility to a bpf program. A typical use case is
intercepting a file name during sys_open(). The current approach is:

SEC("kprobe/sys_open")
void bpf_sys_open(struct pt_regs *ctx)
{
	char buf[PATHLEN]; // PATHLEN is defined to 256
	bpf_probe_read(buf, sizeof(buf), ctx->di);

	/* consume buf */
}

This is suboptimal because the size of the string needs to be estimated
at compile time, often causing more memory than necessary to be copied,
and it can become more problematic if further processing on buf is done,
for example by pushing it to userspace via bpf_perf_event_output(),
since the real length of the string is unknown and the entire buffer
must be copied (and defining an unrolled strnlen() inside the bpf
program is a very inefficient and infeasible approach).

With the new helper, the code can easily operate on the actual string
length rather than the buffer size:

SEC("kprobe/sys_open")
void bpf_sys_open(struct pt_regs *ctx)
{
	char buf[PATHLEN]; // PATHLEN is defined to 256
	int res = bpf_probe_read_str(buf, sizeof(buf), ctx->di);

	/* consume buf, for example push it to userspace via
	 * bpf_perf_event_output(), but this time we can use
	 * res (the string length) as event size, after checking
	 * its boundaries.
	 */
}

Another use case is parsing individual process arguments or individual
environment variables by navigating current->mm->arg_start and
current->mm->env_start: using this helper and its return value, one can
quickly iterate at the right offset of the memory area.
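
As a concrete sketch of that use case (illustrative: the loop has to be
unrolled since bpf has no loops here, and arg_start is assumed to have
been fetched from current->mm via bpf_probe_read() beforehand):

	char buf[128];
	unsigned long ptr = arg_start;
	int i, len;

#pragma unroll
	for (i = 0; i < 16; i++) {	/* cap on the number of arguments */
		len = bpf_probe_read_str(buf, sizeof(buf), (void *)ptr);
		if (len <= 0)
			break;
		/* buf now holds one NUL terminated argument */
		ptr += len;	/* len includes the trailing NUL */
	}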

The code changes simply leverage the already existing
strncpy_from_unsafe() kernel function, which is safe to call from a
bpf program, as it is already used by bpf_trace_printk().

Signed-off-by: Gianluca Borello 
Acked-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
---
 include/uapi/linux/bpf.h | 15 ++-
 kernel/trace/bpf_trace.c | 32 
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0eb0e87dbe9f..54a5894bb4ea 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -430,6 +430,18 @@ union bpf_attr {
  * @xdp_md: pointer to xdp_md
  * @delta: An positive/negative integer to be added to xdp_md.data
  * Return: 0 on success or negative on error
+ *
+ * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
+ * Copy a NUL terminated string from unsafe address. In case the string
+ * length is smaller than size, the target is not padded with further NUL
+ * bytes. In case the string length is larger than size, just count-1
+ * bytes are copied and the last byte is set to NUL.
+ * @dst: destination address
+ * @size: maximum number of bytes to copy, including the trailing NUL
+ * @unsafe_ptr: unsafe address
+ * Return:
+ *   > 0 length of the string including the trailing NUL on success
+ *   < 0 error
  */
 #define __BPF_FUNC_MAPPER(FN)	\
 	FN(unspec),		\
@@ -476,7 +488,8 @@ union bpf_attr {
 	FN(set_hash_invalid),	\
 	FN(get_numa_node_id),	\
 	FN(skb_change_head),	\
-	FN(xdp_adjust_head),
+	FN(xdp_adjust_head),	\
+	FN(probe_read_str),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index c22a961d1a42..424daa4586d1 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -395,6 +395,36 @@ static const struct bpf_func_proto bpf_current_task_under_cgroup_proto = {
 	.arg2_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
+	   const void *, unsafe_ptr)
+{
+	int ret;
+
+	/*
+	 * The strncpy_from_unsafe() call will likely not fill the entire
+	 * buffer, but that's okay in this circumstance as we're probing
+	 * arbitrary memory anyway similar to bpf_probe_read() and might
+	 * as well probe the stack. Thus, memory is explicitly cleared
+	 * only in error case, so that improper users ignoring return
+	 * code altogether don't copy garbage; otherwise length of string
+	 * is returned that can be used for bpf_perf_event_output() et al.
+	 */
+	ret = strncpy_from_unsafe(dst, unsafe_ptr, size);
+	if (unlikely(ret < 0))
+		memset(dst, 0, size);
+
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_probe_read_str_proto = {
+   .func   = bpf_probe_read_str,
+   .gpl_only   = true,
+   .ret_t