Re: [PATCH bpf v2 2/6] bpf: powerpc64: add JIT support for multi-function programs

2018-05-18 Thread Naveen N. Rao

Daniel Borkmann wrote:

On 05/18/2018 02:50 PM, Sandipan Das wrote:

This adds support for bpf-to-bpf function calls in the powerpc64
JIT compiler. The JIT compiler converts the bpf call instructions
to native branch instructions. After a round of the usual passes,
the start addresses of the JITed images for the callee functions
are known. Finally, to fix up the branch target addresses, we need
to perform an extra pass.

Because of the address range in which JITed images are allocated
on powerpc64, the offsets of the start addresses of these images
from __bpf_call_base are as large as 64 bits. So, for a function
call, we cannot use the imm field of the instruction to determine
the callee's address. Instead, we use the alternative method of
getting it from the list of function addresses in the auxiliary
data of the caller by using the off field as an index.

Signed-off-by: Sandipan Das 
---
 arch/powerpc/net/bpf_jit_comp64.c | 79 ++-
 1 file changed, 69 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 1bdb1aff0619..25939892d8f7 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -256,7 +256,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32
 /* Assemble the body code between the prologue & epilogue */
 static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
  struct codegen_context *ctx,
- u32 *addrs)
+ u32 *addrs, bool extra_pass)
 {
const struct bpf_insn *insn = fp->insnsi;
int flen = fp->len;
@@ -712,11 +712,23 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
break;
 
 		/*
-		 * Call kernel helper
+		 * Call kernel helper or bpf function
 		 */
case BPF_JMP | BPF_CALL:
ctx->seen |= SEEN_FUNC;
-   func = (u8 *) __bpf_call_base + imm;
+
+   /* bpf function call */
+   if (insn[i].src_reg == BPF_PSEUDO_CALL && extra_pass)


Perhaps it might make sense here for !extra_pass to set func to some dummy
address as otherwise the 'kernel helper call' branch used for this is a bit
misleading in that sense. The PPC_LI64() used in bpf_jit_emit_func_call()
optimizes the immediate addr, I presume the JIT can handle situations where
in the final extra_pass the image needs to grow/shrink again (due to different
final address for the call)?


That's a good catch. We don't handle that -- we expect to get the size 
right on first pass. We could probably have PPC_FUNC_ADDR() pad the 
result with nops to make it a constant 5-instruction sequence.
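
For reference, a minimal sketch of that padding (the helper name here is made
up; PPC_LI64(), PPC_NOP(), b2p[TMP_REG_2] and ctx->idx are the existing JIT
primitives):

static void bpf_jit_emit_func_addr(u32 *image, struct codegen_context *ctx,
				   u64 func)
{
	unsigned int i, ctx_idx = ctx->idx;

	/* PPC_LI64() emits anywhere from 2 to 5 instructions depending on
	 * the value being loaded, which would change the image size between
	 * passes once the real callee address is known.
	 */
	PPC_LI64(b2p[TMP_REG_2], func);

	/* Pad to a constant 5-instruction sequence so the image size stays
	 * the same across the initial passes and the final extra pass.
	 */
	for (i = ctx->idx - ctx_idx; i < 5; i++)
		PPC_NOP();
}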





+   if (fp->aux->func && off < fp->aux->func_cnt)
+   /* use the subprog id from the off
+* field to lookup the callee address
+*/
+   func = (u8 *) fp->aux->func[off]->bpf_func;
+   else
+   return -EINVAL;
+   /* kernel helper call */
+   else
+   func = (u8 *) __bpf_call_base + imm;
 
 			bpf_jit_emit_func_call(image, ctx, (u64)func);
 
@@ -864,6 +876,14 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,

return 0;
 }
 
+struct powerpc64_jit_data {

+   struct bpf_binary_header *header;
+   u32 *addrs;
+   u8 *image;
+   u32 proglen;
+   struct codegen_context ctx;
+};
+
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 {
u32 proglen;
@@ -871,6 +891,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
u8 *image = NULL;
u32 *code_base;
u32 *addrs;
+   struct powerpc64_jit_data *jit_data;
struct codegen_context cgctx;
int pass;
int flen;
@@ -878,6 +899,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
struct bpf_prog *org_fp = fp;
struct bpf_prog *tmp_fp;
bool bpf_blinded = false;
+   bool extra_pass = false;
 
 	if (!fp->jit_requested)

return org_fp;
@@ -891,7 +913,28 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
fp = tmp_fp;
}
 
+	jit_data = fp->aux->jit_data;

+   if (!jit_data) {
+   jit_data = kzalloc(sizeof(*jit_data), GFP_KERNEL);
+   if (!jit_data) {
+   fp = org_fp;
+   goto out;
+   }
+   fp->aux->jit_data = jit_data;
+   }
+
flen = fp->len;
+   addrs = jit_data->addrs;
+   if (addrs) {
+   cgctx = jit_data->ctx;
+   

Re: [RFC][PATCH bpf] tools: bpftool: Fix tags for bpf-to-bpf calls

2018-05-03 Thread Naveen N. Rao

Alexei Starovoitov wrote:

On 3/1/18 12:51 AM, Naveen N. Rao wrote:

Daniel Borkmann wrote:


Worst case if there's nothing better, potentially what one could do in
bpf_prog_get_info_by_fd() is to dump an array of full addresses and
have the imm part as the index pointing to one of them, just unfortunate
that it's likely only needed in ppc64.


Ok. We seem to have discussed a few different aspects in this thread.
Let me summarize them:
1. Passing address of JIT'ed function to the JIT engines:
   Two approaches discussed:
   a. Existing approach, where the subprog address is encoded as an
offset from __bpf_call_base() in imm32 field of the BPF call
instruction. This requires the JIT'ed function to be within 2GB of
__bpf_call_base(), which won't be true on ppc64, at the least. So,
this won't work on ppc64 (and any other architectures where vmalloc'ed
(module_alloc()) memory is from a different, far, address range).


it looks like ppc64 doesn't guarantee today that all of module_alloc()
will be within 32-bit, but I think it should be trivial to add such
guarantee. If so, we can define another __bpf_call_base specifically
for bpf-to-bpf calls when jit is on.


Ok, we prefer not to do that for powerpc (at least, not for all of 
module_alloc()) at this point.


And since option (c) below is not preferable, I think we will implement 
what Daniel suggested above. This patchset already handles communicating 
the BPF function addresses to the JIT engine, and enhancing 
bpf_prog_get_info_by_fd() should address the concerns with bpftool.
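
For illustration, once such an array of addresses is exposed, the userspace
side could resolve a pseudo call roughly as below (field and function names
here are hypothetical, not a final uapi):

#include <stdint.h>
#include <stdio.h>

struct call_resolver {
	const uint64_t *func_addrs;	/* per-subprog start addresses dumped by the kernel */
	uint32_t nr_func_addrs;
};

/* the pseudo call's imm would then simply index into the dumped array */
static void print_call_target(const struct call_resolver *r, int32_t imm)
{
	if (imm >= 0 && (uint32_t)imm < r->nr_func_addrs)
		printf("call -> %#llx\n",
		       (unsigned long long)r->func_addrs[imm]);
	else
		printf("call -> <unknown>\n");
}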



- Naveen




Re: [RFC][PATCH bpf] tools: bpftool: Fix tags for bpf-to-bpf calls

2018-03-01 Thread Naveen N. Rao

Daniel Borkmann wrote:

On 02/27/2018 01:13 PM, Sandipan Das wrote:

With this patch, it will look like this:
   0: (85) call pc+2#bpf_prog_8f85936f29a7790a+3


(Note the +2 is the insn->off already.)


   1: (b7) r0 = 1
   2: (95) exit
   3: (b7) r0 = 2
   4: (95) exit

where 8f85936f29a7790a is the tag of the bpf program and 3 is
the offset to the start of the subprog from the start of the
program.


The problem with this approach would be that right now the name is
something like bpf_prog_5f76847930402518_F where the subprog tag is
just a placeholder so in future, this may well adapt to e.g. the actual
function name from the elf file. Note that when kallsyms is enabled
then a name like bpf_prog_5f76847930402518_F will also appear in stack
traces, perf records, etc, so for correlation/debugging it would really
help to have them the same everywhere.

Worst case if there's nothing better, potentially what one could do in
bpf_prog_get_info_by_fd() is to dump an array of full addresses and
have the imm part as the index pointing to one of them, just unfortunate
that it's likely only needed in ppc64.


Ok. We seem to have discussed a few different aspects in this thread.
Let me summarize them:

1. Passing address of JIT'ed function to the JIT engines:
   Two approaches discussed:
   a. Existing approach, where the subprog address is encoded as an 
   offset from __bpf_call_base() in imm32 field of the BPF call 
   instruction. This requires the JIT'ed function to be within 2GB of 
   __bpf_call_base(), which won't be true on ppc64, at the least. So, 
   this won't work on ppc64 (and any other architectures where vmalloc'ed 
   (module_alloc()) memory is from a different, far, address range).
   
   [As a side note, is it _actually_ guaranteed that JIT'ed functions 
   will be within 2GB (signed 32-bit...) on all other architectures 
   where BPF JIT is supported? I'm not quite sure how memory allocation 
   works on other architectures, but it looks like this can fail if 
   there are other larger allocations.]


   b. Pass the full 64-bit address of the call target in an auxiliary 
   field for the JIT engine to use (as implemented in this mail chain).  
   We can then use this to determine the call target if this is a 
   pseudo call.


   There is a third option we can consider:
   c. Convert BPF pseudo call instruction into a 2-instruction sequence 
   (similar to BPF_DW) and encode the full 64-bit call target in the 
   second bpf instruction. To distinguish this from other instruction 
   forms, we can set imm32 to -1.


   If we go with (b) or (c), we will need to take a call on whether we 
   will implement this in the same manner across all architectures, or 
   if we should have ppc64 (and any other affected architectures) work 
   differently from the rest.


   Furthermore, for (b), bpftool won't be able to derive the target 
   function call address, but approaches (a) and (c) are fine. More 
   about that below...


2. Indicating target function in bpftool:
   In the existing approach, bpftool can determine target address since 
   the offset is encoded in imm32 and is able to lookup the name from 
   kallsyms, if enabled.


   If we go with approach (b) for ppc64, this won't work and we will 
   have to minimally update bpftool to detect that the target address 
   is not available on ppc64.


   If we go with approach (c), the target address will be available and 
   we should be able to update bpftool to look that up.


   [As a side note, I suppose part of Sandipan's point with the 
   previous patch was to make the bpftool output consistent whether or 
   not JIT is enabled. It does look a bit weird that bpftool shows the 
   address of a JIT'ed function when asked to print the BPF bytecode.]


Thoughts?


- Naveen




Re: [RFC][PATCH bpf v2 1/2] bpf: allow 64-bit offsets for bpf function calls

2018-02-20 Thread Naveen N. Rao

Michael Ellerman wrote:

"Naveen N. Rao" <naveen.n@linux.vnet.ibm.com> writes:

Daniel Borkmann wrote:

On 02/15/2018 05:25 PM, Daniel Borkmann wrote:

On 02/13/2018 05:05 AM, Sandipan Das wrote:

The imm field of a bpf_insn is a signed 32-bit integer. For
JIT-ed bpf-to-bpf function calls, it stores the offset from
__bpf_call_base to the start of the callee function.

For some architectures, such as powerpc64, it was found that
this offset may be as large as 64 bits because of which this
cannot be accommodated in the imm field without truncation.

To resolve this, we additionally make aux->func within each
bpf_prog associated with the functions to point to the list
of all function addresses determined by the verifier.

We keep the value assigned to the off field of the bpf_insn
as a way to index into aux->func and also set aux->func_cnt
so that this can be used for performing basic upper bound
checks for the off field.

Signed-off-by: Sandipan Das <sandi...@linux.vnet.ibm.com>
---
v2: Make aux->func point to the list of functions determined
by the verifier rather than allocating a separate callee
list for each function.


Approach looks good to me; do you know whether s390x JIT would
have similar requirement? I think one limitation that would still
need to be addressed later with such approach would be regarding the
xlated prog dump in bpftool, see 'BPF calls via JIT' in 7105e828c087
("bpf: allow for correlation of maps and helpers in dump"). Any
ideas for this (potentially if we could use off + imm for calls,
we'd get to 48 bits, but that seems still not be enough as you say)?


All good points. I'm not really sure how s390x works, so I can't comment 
on that, but I'm copying Michael Holzheu for his consideration.


With the existing scheme, 48 bits won't be enough, so we rejected that 
approach. I can also see how this will be a problem with bpftool, but I 
haven't looked into it in detail. I wonder if we can annotate the output 
to indicate the function being referred to?




One other random thought, although I'm not sure how feasible this
is for ppc64 JIT to realize ... but idea would be to have something
like the below:

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 29ca920..daa7258 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -512,6 +512,11 @@ int bpf_get_kallsym(unsigned int symnum, unsigned long 
*value, char *type,
return ret;
 }

+void * __weak bpf_jit_image_alloc(unsigned long size)
+{
+   return module_alloc(size);
+}
+
 struct bpf_binary_header *
 bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
 unsigned int alignment,
@@ -525,7 +530,7 @@ bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
 * random section of illegal instructions.
 */
size = round_up(proglen + sizeof(*hdr) + 128, PAGE_SIZE);
-   hdr = module_alloc(size);
+   hdr = bpf_jit_image_alloc(size);
if (hdr == NULL)
return NULL;

And ppc64 JIT could override bpf_jit_image_alloc() in a similar way
like some archs would override the module_alloc() helper through a
custom implementation, usually via __vmalloc_node_range(), so we
could perhaps fit the range for BPF JITed images in a way that they
could use the 32bit imm in the end? There are not that many progs
loaded typically, so the range could be a bit narrower in such case
anyway. (Not sure if this would work out though, but I thought to
bring it up.)


That'd be a good option to consider. I don't think we want to allocate 
anything from the linear memory range since users could load 
unprivileged BPF programs and consume a lot of memory that way. I doubt 
if we can map vmalloc'ed memory into the 0xc0 address range, but I'm not 
entirely sure.


Michael,
Is the above possible? The question is if we can have BPF programs be 
allocated within 4GB of __bpf_call_base (which is a kernel symbol), so 
that calls to those programs can be encoded in a 32-bit immediate field 
in a BPF instruction.


Hmmm.

It's not technically impossible, but I don't think it's really a good
option.

The 0xc range is a linear mapping of RAM, and the kernel tends to be
near the start of RAM for reasons. That means there generally isn't a
hole in the 0xc range within 4GB for you to map BPF programs.

You could create a hole by making the 0xc mapping non linear, ie.
mapping some RAM near the kernel elsewhere in the 0xc range, to make a
hole that you can then remap BPF programs into. But I think that would
cause a lot of bugs, it's a pretty fundamental assumption that the
linear mapping is 1:1.

As an extension, we may be able to extend it to 
48-bits by combining with another BPF instruction field (offset). In 
either case, the vmalloc'ed address range won't work.


48-bits could possibly work, we don't have systems with that much RAM
*yet*. So you could remap the BPF programs at the end of the 0xc range,
or somewhere we h

Re: [RFC][PATCH bpf v2 1/2] bpf: allow 64-bit offsets for bpf function calls

2018-02-16 Thread Naveen N. Rao

Daniel Borkmann wrote:

On 02/15/2018 05:25 PM, Daniel Borkmann wrote:

On 02/13/2018 05:05 AM, Sandipan Das wrote:

The imm field of a bpf_insn is a signed 32-bit integer. For
JIT-ed bpf-to-bpf function calls, it stores the offset from
__bpf_call_base to the start of the callee function.

For some architectures, such as powerpc64, it was found that
this offset may be as large as 64 bits because of which this
cannot be accommodated in the imm field without truncation.

To resolve this, we additionally make aux->func within each
bpf_prog associated with the functions to point to the list
of all function addresses determined by the verifier.

We keep the value assigned to the off field of the bpf_insn
as a way to index into aux->func and also set aux->func_cnt
so that this can be used for performing basic upper bound
checks for the off field.

Signed-off-by: Sandipan Das 
---
v2: Make aux->func point to the list of functions determined
by the verifier rather than allocating a separate callee
list for each function.


Approach looks good to me; do you know whether s390x JIT would
have similar requirement? I think one limitation that would still
need to be addressed later with such approach would be regarding the
xlated prog dump in bpftool, see 'BPF calls via JIT' in 7105e828c087
("bpf: allow for correlation of maps and helpers in dump"). Any
ideas for this (potentially if we could use off + imm for calls,
we'd get to 48 bits, but that seems still not be enough as you say)?


All good points. I'm not really sure how s390x works, so I can't comment 
on that, but I'm copying Michael Holzheu for his consideration.


With the existing scheme, 48 bits won't be enough, so we rejected that 
approach. I can also see how this will be a problem with bpftool, but I 
haven't looked into it in detail. I wonder if we can annotate the output 
to indicate the function being referred to?




One other random thought, although I'm not sure how feasible this
is for ppc64 JIT to realize ... but idea would be to have something
like the below:

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 29ca920..daa7258 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -512,6 +512,11 @@ int bpf_get_kallsym(unsigned int symnum, unsigned long 
*value, char *type,
return ret;
 }

+void * __weak bpf_jit_image_alloc(unsigned long size)
+{
+   return module_alloc(size);
+}
+
 struct bpf_binary_header *
 bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
 unsigned int alignment,
@@ -525,7 +530,7 @@ bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
 * random section of illegal instructions.
 */
size = round_up(proglen + sizeof(*hdr) + 128, PAGE_SIZE);
-   hdr = module_alloc(size);
+   hdr = bpf_jit_image_alloc(size);
if (hdr == NULL)
return NULL;

And ppc64 JIT could override bpf_jit_image_alloc() in a similar way
like some archs would override the module_alloc() helper through a
custom implementation, usually via __vmalloc_node_range(), so we
could perhaps fit the range for BPF JITed images in a way that they
could use the 32bit imm in the end? There are not that many progs
loaded typically, so the range could be a bit narrower in such case
anyway. (Not sure if this would work out though, but I thought to
bring it up.)
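
For illustration, a ppc64-side override along those lines might look roughly
like the following (a sketch only: BPF_JIT_REGION_START/END stand in for a
hypothetical range reserved within +/-2GB of __bpf_call_base, which does not
exist today):

void *bpf_jit_image_alloc(unsigned long size)
{
	/* Carve JIT images out of a dedicated range instead of the generic
	 * vmalloc/module space, so that the offset from __bpf_call_base
	 * fits in the 32-bit imm field.
	 */
	return __vmalloc_node_range(size, PAGE_SIZE,
				    BPF_JIT_REGION_START, BPF_JIT_REGION_END,
				    GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
				    NUMA_NO_NODE,
				    __builtin_return_address(0));
}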


That'd be a good option to consider. I don't think we want to allocate 
anything from the linear memory range since users could load 
unprivileged BPF programs and consume a lot of memory that way. I doubt 
if we can map vmalloc'ed memory into the 0xc0 address range, but I'm not 
entirely sure.


Michael,
Is the above possible? The question is if we can have BPF programs be 
allocated within 4GB of __bpf_call_base (which is a kernel symbol), so 
that calls to those programs can be encoded in a 32-bit immediate field 
in a BPF instruction. As an extension, we may be able to extend it to 
48-bits by combining with another BPF instruction field (offset). In 
either case, the vmalloc'ed address range won't work.


The alternative is to pass the full 64-bit address of the BPF program in 
an auxiliary field (as proposed in this patch set) but we need to fix it 
up for 'bpftool' as well.


Thanks,
Naveen




Re: [RFC][PATCH bpf 1/2] bpf: allow 64-bit offsets for bpf function calls

2018-02-09 Thread Naveen N. Rao

Naveen N. Rao wrote:

Alexei Starovoitov wrote:

On 2/8/18 4:03 AM, Sandipan Das wrote:

The imm field of a bpf_insn is a signed 32-bit integer. For
JIT-ed bpf-to-bpf function calls, it stores the offset from
__bpf_call_base to the start of the callee function.

For some architectures, such as powerpc64, it was found that
this offset may be as large as 64 bits because of which this
cannot be accommodated in the imm field without truncation.

To resolve this, we additionally use the aux data within each
bpf_prog associated with the caller functions to store the
addresses of their respective callees.

Signed-off-by: Sandipan Das <sandi...@linux.vnet.ibm.com>
---
 kernel/bpf/verifier.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5fb69a85d967..52088b4ca02f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5282,6 +5282,19 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 * run last pass of JIT
 */
for (i = 0; i <= env->subprog_cnt; i++) {
+   u32 flen = func[i]->len, callee_cnt = 0;
+   struct bpf_prog **callee;
+
+   /* for now assume that the maximum number of bpf function
+* calls that can be made by a caller must be at most the
+* number of bpf instructions in that function
+*/
+   callee = kzalloc(sizeof(func[i]) * flen, GFP_KERNEL);
+   if (!callee) {
+   err = -ENOMEM;
+   goto out_free;
+   }
+
insn = func[i]->insnsi;
for (j = 0; j < func[i]->len; j++, insn++) {
if (insn->code != (BPF_JMP | BPF_CALL) ||
@@ -5292,6 +5305,26 @@ static int jit_subprogs(struct bpf_verifier_env *env)
insn->imm = (u64 (*)(u64, u64, u64, u64, u64))
func[subprog]->bpf_func -
__bpf_call_base;
+
+   /* the offset to the callee from __bpf_call_base
+* may be larger than what the 32 bit integer imm
+* can accomodate which will truncate the higher
+* order bits
+*
+* to avoid this, we additionally utilize the aux
+* data of each caller function for storing the
+* addresses of every callee associated with it
+*/
+   callee[callee_cnt++] = func[subprog];


can you share typical /proc/kallsyms ?
Are you saying that kernel and kernel modules are allocated from
address spaces that are always more than 32-bit apart?


Yes. On ppc64, kernel text is linearly mapped from 0xc000000000000000, 
while vmalloc'ed area starts from 0xd000000000000000 (for radix, this is 
different, but still beyond a 32-bit offset).


That would mean that all kernel calls into modules are far calls
and the other way around form .ko into kernel?
Performance is probably suffering because every call needs to be built
with full 64-bit offset. No ?


Possibly, and I think Michael can give a better perspective, but I think
this is due to our ABI. For inter-module calls, we need to set up the TOC
pointer (or the address of the function being called with ABIv2), which 
would require us to load a full address regardless.


Thinking more about this, as an optimization, for bpf-to-bpf calls, we 
could detect a near call and just emit a relative branch since we don't 
care about TOC with BPF. But, this will depend on whether the different 
BPF functions are close enough (within 32MB) of one another.
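
To make that concrete, roughly what the call emission could do (a sketch under
the assumptions noted in the comments, not something in this series):

/* Sketch only: bpf_jit_emit_call() is a made-up wrapper; EMIT(), ctx->idx and
 * bpf_jit_emit_func_call() are the existing JIT primitives. 0x48000001 is the
 * raw 'bl' encoding (major opcode 18, LK=1), with a +/-32MB reach.
 */
#define PPC_INST_BL_REL	0x48000001

static void bpf_jit_emit_call(u32 *image, struct codegen_context *ctx, u64 func)
{
	/* branch-relative offset from the instruction being emitted */
	s64 offset = (s64)func - (s64)(u64)(image + ctx->idx);

	if (image && offset >= -0x02000000 && offset < 0x02000000 &&
	    !(offset & 0x3))
		/* near call: single relative branch, no TOC juggling */
		EMIT(PPC_INST_BL_REL | (offset & 0x03fffffc));
	else
		/* far call: existing full 64-bit address load sequence */
		bpf_jit_emit_func_call(image, ctx, func);

	/* Note: the image size would still have to stay constant across
	 * passes, e.g. by nop-padding the near form, as discussed above
	 * for PPC_LI64().
	 */
}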


We can attempt that once the generic changes are finalized.

Thanks,
Naveen




Re: [RFC][PATCH bpf 1/2] bpf: allow 64-bit offsets for bpf function calls

2018-02-08 Thread Naveen N. Rao

Alexei Starovoitov wrote:

On 2/8/18 4:03 AM, Sandipan Das wrote:

The imm field of a bpf_insn is a signed 32-bit integer. For
JIT-ed bpf-to-bpf function calls, it stores the offset from
__bpf_call_base to the start of the callee function.

For some architectures, such as powerpc64, it was found that
this offset may be as large as 64 bits because of which this
cannot be accommodated in the imm field without truncation.

To resolve this, we additionally use the aux data within each
bpf_prog associated with the caller functions to store the
addresses of their respective callees.

Signed-off-by: Sandipan Das 
---
 kernel/bpf/verifier.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5fb69a85d967..52088b4ca02f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5282,6 +5282,19 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 * run last pass of JIT
 */
for (i = 0; i <= env->subprog_cnt; i++) {
+   u32 flen = func[i]->len, callee_cnt = 0;
+   struct bpf_prog **callee;
+
+   /* for now assume that the maximum number of bpf function
+* calls that can be made by a caller must be at most the
+* number of bpf instructions in that function
+*/
+   callee = kzalloc(sizeof(func[i]) * flen, GFP_KERNEL);
+   if (!callee) {
+   err = -ENOMEM;
+   goto out_free;
+   }
+
insn = func[i]->insnsi;
for (j = 0; j < func[i]->len; j++, insn++) {
if (insn->code != (BPF_JMP | BPF_CALL) ||
@@ -5292,6 +5305,26 @@ static int jit_subprogs(struct bpf_verifier_env *env)
insn->imm = (u64 (*)(u64, u64, u64, u64, u64))
func[subprog]->bpf_func -
__bpf_call_base;
+
+   /* the offset to the callee from __bpf_call_base
+* may be larger than what the 32 bit integer imm
+* can accomodate which will truncate the higher
+* order bits
+*
+* to avoid this, we additionally utilize the aux
+* data of each caller function for storing the
+* addresses of every callee associated with it
+*/
+   callee[callee_cnt++] = func[subprog];


can you share typical /proc/kallsyms ?
Are you saying that kernel and kernel modules are allocated from
address spaces that are always more than 32-bit apart?


Yes. On ppc64, kernel text is linearly mapped from 0xc000000000000000, 
while vmalloc'ed area starts from 0xd000000000000000 (for radix, this is 
different, but still beyond a 32-bit offset).


That would mean that all kernel calls into modules are far calls
and the other way around form .ko into kernel?
Performance is probably suffering because every call needs to be built
with full 64-bit offset. No ?


Possibly, and I think Michael can give a better perspective, but I think
this is due to our ABI. For inter-module calls, we need to set up the TOC
pointer (or the address of the function being called with ABIv2), which 
would require us to load a full address regardless.


- Naveen




Re: [PATCH bpf-next 01/13] bpf: xor of a/x in cbpf can be done in 32 bit alu

2018-01-28 Thread Naveen N. Rao

Daniel Borkmann wrote:

Very minor optimization; saves 1 byte per program in x86_64
JIT in cBPF prologue.


... but increases program size by 4 bytes on ppc64 :(
In general, this is an area I've been wanting to spend some time on.  
Powerpc doesn't have 32-bit sub-registers, so we need to emit an 
additional instruction to clear the higher 32-bits for all 32-bit 
operations. I need to look at the performance impact.
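
For context, the extra instruction in question is the zero-extension the ppc64
JIT already emits after 32-bit ALU ops (at its bpf_alu32_trunc label), roughly:

		/* clear the upper 32 bits after a 32-bit ALU operation
		 * (clrldi dst,dst,32), since ppc64 has no 32-bit sub-registers
		 */
		PPC_RLDICL(dst_reg, dst_reg, 0, 32);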


- Naveen



Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
---
 net/core/filter.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 18da42a..cba2f73 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -401,8 +401,8 @@ static int bpf_convert_filter(struct sock_filter *prog, int 
len,
/* Classic BPF expects A and X to be reset first. These need
 * to be guaranteed to be the first two instructions.
 */
-   *new_insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
-   *new_insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
+   *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
+   *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
 
 		/* All programs must keep CTX in callee saved BPF_REG_CTX.

 * In eBPF case it's done by the compiler, here we need to
--
2.9.5







Re: [PATCH bpf-next 08/13] bpf, ppc64: remove obsolete exception handling from div/mod

2018-01-28 Thread Naveen N. Rao

Daniel Borkmann wrote:

Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from ppc64 JIT.

Signed-off-by: Daniel Borkmann <dan...@iogearbox.net>
Cc: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit_comp64.c | 8 
 1 file changed, 8 deletions(-)


Probably too late, but none the less:
Acked-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>

Thanks,
Naveen



diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 217a78e..0a34b0c 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -381,10 +381,6 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
*image,
goto bpf_alu32_trunc;
case BPF_ALU | BPF_DIV | BPF_X: /* (u32) dst /= (u32) src */
case BPF_ALU | BPF_MOD | BPF_X: /* (u32) dst %= (u32) src */
-   PPC_CMPWI(src_reg, 0);
-   PPC_BCC_SHORT(COND_NE, (ctx->idx * 4) + 12);
-   PPC_LI(b2p[BPF_REG_0], 0);
-   PPC_JMP(exit_addr);
if (BPF_OP(code) == BPF_MOD) {
PPC_DIVWU(b2p[TMP_REG_1], dst_reg, src_reg);
PPC_MULW(b2p[TMP_REG_1], src_reg,
@@ -395,10 +391,6 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
*image,
goto bpf_alu32_trunc;
case BPF_ALU64 | BPF_DIV | BPF_X: /* dst /= src */
case BPF_ALU64 | BPF_MOD | BPF_X: /* dst %= src */
-   PPC_CMPDI(src_reg, 0);
-   PPC_BCC_SHORT(COND_NE, (ctx->idx * 4) + 12);
-   PPC_LI(b2p[BPF_REG_0], 0);
-   PPC_JMP(exit_addr);
if (BPF_OP(code) == BPF_MOD) {
PPC_DIVD(b2p[TMP_REG_1], dst_reg, src_reg);
PPC_MULD(b2p[TMP_REG_1], src_reg,
--
2.9.5






Re: [RFC PATCH] bpf: Add helpers to read useful task_struct members

2017-11-07 Thread Naveen N. Rao

Alexei Starovoitov wrote:

On 11/7/17 12:55 AM, Naveen N. Rao wrote:

I thought such struct shouldn't change layout.
If it is we need to fix include/linux/compiler-clang.h to do that
anon struct as well.


We considered that, but it looked to be very dependent on the version of
gcc used to build the kernel. But, this may be a simpler approach for
the shorter term.



why it would depend on version of gcc?


From what I can see, randomized_struct_fields_start is defined only for 
gcc >= 4.6. For older versions, it does not get mapped to an anonymous 
structure. We may not care for older gcc versions, but..


The other issue was that __randomize_layout maps to __designated_init 
when randstruct plugin is not enabled, which is in turn an attribute on 
gcc >= v5.1, but not otherwise.



We just need this, no?

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index de179993e039..4e29ab6187cb 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -15,3 +15,6 @@
   * with any version that can compile the kernel
   */
  #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), 
__COUNTER__)

+
+#define randomized_struct_fields_start struct {
+#define randomized_struct_fields_end   };

since offsets are mandated by C standard.


Yes, this is what we're testing with and is probably sufficient for our 
purposes.


- Naveen




Re: [RFC PATCH] bpf: Add helpers to read useful task_struct members

2017-11-06 Thread Naveen N. Rao

Alexei Starovoitov wrote:

On 11/5/17 2:31 AM, Naveen N. Rao wrote:

Hi Alexei,

Alexei Starovoitov wrote:

On 11/3/17 3:58 PM, Sandipan Das wrote:

For added security, the layout of some structures can be
randomized by enabling CONFIG_GCC_PLUGIN_RANDSTRUCT. One
such structure is task_struct. To build BPF programs, we
use Clang which does not support this feature. So, if we
attempt to read a field of a structure with a randomized
layout within a BPF program, we do not get the expected
value because of incorrect offsets. To observe this, it
is not mandatory to have CONFIG_GCC_PLUGIN_RANDSTRUCT
enabled because the structure annotations/members added
for this purpose are enough to cause this. So, all kernel
builds are affected.



...


diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f90860d1f897..324508d27bd2 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -338,6 +338,16 @@ union bpf_attr {
  * @skb: pointer to skb
  * Return: classid if != 0
  *
+ * u64 bpf_get_task_pid_tgid(struct task_struct *task)
+ * Return: task->tgid << 32 | task->pid
+ *
+ * int bpf_get_task_comm(struct task_struct *task)
+ * Stores task->comm into buf
+ * Return: 0 on success or negative error
+ *
+ * u32 bpf_get_task_flags(struct task_struct *task)
+ * Return: task->flags
+ *


I don't think it's a solution.
Tracing scripts read other fields too.
Making it work for these 3 fields is a drop in a bucket.


Indeed. However...


If randomization is used I think we have to accept
that existing bpf scripts won't be usable.


... the actual issue is that randomization isn't necessary for this to
show up. The annotations added to mark off the structure members results
in some structure members being moved into an anonymous structure, which
would then get padded differently. So, *all* kernels since v4.13 are
affected, afaict.
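
As a stand-alone illustration (not kernel code) of how just adding the
anonymous-struct wrapper can shift member offsets between a gcc build and a
clang build that lacks the defines:

#include <stddef.h>
#include <stdio.h>

/* What a gcc-built kernel sees: a slice of members wrapped in an anonymous
 * struct by randomized_struct_fields_start/end. */
struct with_wrapper {
	char a;
	struct {		/* randomized_struct_fields_start */
		long b;
		char c;
	};			/* randomized_struct_fields_end */
	char d;
};

/* What clang saw without the compiler-clang.h definitions: the same members
 * laid out flat. */
struct without_wrapper {
	char a;
	long b;
	char c;
	char d;
};

int main(void)
{
	/* The anonymous struct is padded out to its own size/alignment, so
	 * members following it land at different offsets in the two layouts
	 * (24 vs 17 on LP64). */
	printf("d: %zu vs %zu\n", offsetof(struct with_wrapper, d),
	       offsetof(struct without_wrapper, d));
	return 0;
}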


hmm. why would all 4.13+ be affected?
It's just an anonymous struct inside task_struct.
Are you saying that due to clang not adding this 'struct { };' treatment 
to task_struct?


Yes, that's what it looked like.


I thought such struct shouldn't change layout.
If it is we need to fix include/linux/compiler-clang.h to do that
anon struct as well.


We considered that, but it looked to be very dependent on the version of 
gcc used to build the kernel. But, this may be a simpler approach for 
the shorter term.





As such, we wanted to propose this as a short term solution, but I do
agree that this doesn't solve the real issue.


Long term solution is to support 'BPF Type Format' or BTF
(which is old C-Type Format) for kernel data structures,
so bcc scripts wouldn't need to use kernel headers and clang.
The proper offsets will be described in BTF.
We were planning to use it initially to describe map key/value,
but it applies for this case as well.
There will be a tool that will take dwarf from vmlinux and
compress it into BTF. Kernel will also be able to verify
that BTF is a valid BTF.


This is the first that I've heard about BTF. Can you share more details
about it, or point me to some place where it has been discussed?

We considered having tools derive the structure offsets from debuginfo,
but debuginfo may not always be present on production systems. So, it
isn't clear if having that dependency is fine. I'm not sure how BTF will
be different.


It was discussed at this year plumbers:
https://lwn.net/Articles/734453/

btw the name BTF is work in progress. We started with CTF, but
it conflicts with all other meanings of this abbreviation.
Likely we will call it something different at the end.

Initial goal was to describe key/map values of bpf maps to
make debugging easier, but now we want to use this compressed
type format for tracing as well, since installing kernel headers
everywhere doesn't scale well while CTF can be embedded in vmlinux


Makes sense, though I'm curious on how you're planning to have this work
without the kernel headers :)



We were also thinking to improve verifier with CTF knowledge too.
Like if CTF describes that map value is two u32, but bpf program
is doing 8-byte access then something is wrong and either warn
or reject such program.


Sounds good. I look forward to more details/patches on this front once 
you're ready to share more.


Thanks,
- Naveen




Re: [RFC PATCH] bpf: Add helpers to read useful task_struct members

2017-11-04 Thread Naveen N. Rao

Hi Alexei,

Alexei Starovoitov wrote:

On 11/3/17 3:58 PM, Sandipan Das wrote:

For added security, the layout of some structures can be
randomized by enabling CONFIG_GCC_PLUGIN_RANDSTRUCT. One
such structure is task_struct. To build BPF programs, we
use Clang which does not support this feature. So, if we
attempt to read a field of a structure with a randomized
layout within a BPF program, we do not get the expected
value because of incorrect offsets. To observe this, it
is not mandatory to have CONFIG_GCC_PLUGIN_RANDSTRUCT
enabled because the structure annotations/members added
for this purpose are enough to cause this. So, all kernel
builds are affected.

For example, considering samples/bpf/offwaketime_kern.c,
if we try to print the values of pid and comm inside the
task_struct passed to waker() by adding the following
lines of code at the appropriate place

  char fmt[] = "waker(): p->pid = %u, p->comm = %s\n";
  bpf_trace_printk(fmt, sizeof(fmt), _(p->pid), _(p->comm));

it is seen that upon rebuilding and running this sample
followed by inspecting /sys/kernel/debug/tracing/trace,
the output looks like the following

                              _-----=> irqs-off
                             / _----=> need-resched
                            | / _---=> hardirq/softirq
                            || / _--=> preempt-depth
                            ||| /     delay
           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
              | |       |   ||||       |         |
          <idle>-0     [007] d.s.  1883.443594: 0x0001: waker(): p->pid = 0, p->comm =
          <idle>-0     [018] d.s.  1883.453588: 0x0001: waker(): p->pid = 0, p->comm =
          <idle>-0     [007] d.s.  1883.463584: 0x0001: waker(): p->pid = 0, p->comm =
          <idle>-0     [009] d.s.  1883.483586: 0x0001: waker(): p->pid = 0, p->comm =
          <idle>-0     [005] d.s.  1883.493583: 0x0001: waker(): p->pid = 0, p->comm =
          <idle>-0     [009] d.s.  1883.503583: 0x0001: waker(): p->pid = 0, p->comm =
          <idle>-0     [018] d.s.  1883.513578: 0x0001: waker(): p->pid = 0, p->comm =
 systemd-journal-3140  [003] d...  1883.627660: 0x0001: waker(): p->pid = 0, p->comm =
 systemd-journal-3140  [003] d...  1883.627704: 0x0001: waker(): p->pid = 0, p->comm =
 systemd-journal-3140  [003] d...  1883.627723: 0x0001: waker(): p->pid = 0, p->comm =


To avoid this, we add new BPF helpers that read the
correct values for some of the important task_struct
members such as pid, tgid, comm and flags which are
extensively used in BPF-based analysis tools such as
bcc. Since these helpers are built with GCC, they use
the correct offsets when referencing a member.

Signed-off-by: Sandipan Das 
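
For illustration, the kernel-side shape of one such helper (a sketch along the
lines of the prototypes quoted below, not the actual patch hunk; the proto
flags are assumptions):

#include <linux/bpf.h>
#include <linux/filter.h>
#include <linux/sched.h>

/* Built with the same compiler (gcc) as the kernel, so member offsets match
 * the running kernel's task_struct layout. */
BPF_CALL_1(bpf_get_task_pid_tgid, struct task_struct *, task)
{
	return ((u64)task->tgid << 32) | task->pid;
}

static const struct bpf_func_proto bpf_get_task_pid_tgid_proto = {
	.func		= bpf_get_task_pid_tgid,
	.gpl_only	= true,
	.ret_type	= RET_INTEGER,
	.arg1_type	= ARG_ANYTHING,	/* assumption: raw task_struct pointer */
};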

...

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f90860d1f897..324508d27bd2 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -338,6 +338,16 @@ union bpf_attr {
  * @skb: pointer to skb
  * Return: classid if != 0
  *
+ * u64 bpf_get_task_pid_tgid(struct task_struct *task)
+ * Return: task->tgid << 32 | task->pid
+ *
+ * int bpf_get_task_comm(struct task_struct *task)
+ * Stores task->comm into buf
+ * Return: 0 on success or negative error
+ *
+ * u32 bpf_get_task_flags(struct task_struct *task)
+ * Return: task->flags
+ *


I don't think it's a solution.
Tracing scripts read other fields too.
Making it work for these 3 fields is a drop in a bucket.


Indeed. However...


If randomization is used I think we have to accept
that existing bpf scripts won't be usable.


... the actual issue is that randomization isn't necessary for this to 
show up. The annotations added to mark off the structure members results 
in some structure members being moved into an anonymous structure, which 
would then get padded differently. So, *all* kernels since v4.13 are 
affected, afaict.


As such, we wanted to propose this as a short term solution, but I do 
agree that this doesn't solve the real issue.



Long term solution is to support 'BPF Type Format' or BTF
(which is old C-Type Format) for kernel data structures,
so bcc scripts wouldn't need to use kernel headers and clang.
The proper offsets will be described in BTF.
We were planning to use it initially to describe map key/value,
but it applies for this case as well.
There will be a tool that will take dwarf from vmlinux and
compress it into BTF. Kernel will also be able to verify
that BTF is a valid BTF.


This is the first that I've heard about BTF. Can you share more details 
about it, or point me to some place where it has been discussed?


We considered having tools derive the structure offsets from debuginfo, 
but debuginfo may not always be present on production systems. So, it 
isn't clear if having that dependency is fine. I'm not sure how BTF will 
be different.


I'm assuming that gcc randomization plugin produces dwarf
with correct offsets, if not, it would have to be fixed.

Re: [PATCH 1/1] bpf: take advantage of stack_depth tracking in powerpc JIT

2017-09-01 Thread Naveen N. Rao
On 2017/09/02 12:23AM, Sandipan Das wrote:
> Take advantage of stack_depth tracking, originally introduced for
> x64, in powerpc JIT as well. Round up allocated stack by 16 bytes
> to make sure it stays aligned for functions called from JITed bpf
> program.
> 
> Signed-off-by: Sandipan Das <sandi...@linux.vnet.ibm.com>
> ---

LGTM, thanks!
Reviewed-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>

Michael,
Seeing as this is powerpc specific, can you please take this through 
your tree?


Thanks,
Naveen

>  arch/powerpc/net/bpf_jit64.h  |  7 ---
>  arch/powerpc/net/bpf_jit_comp64.c | 16 ++--
>  2 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
> index 62fa7589db2b..8bdef7ed28a8 100644
> --- a/arch/powerpc/net/bpf_jit64.h
> +++ b/arch/powerpc/net/bpf_jit64.h
> @@ -23,7 +23,7 @@
>   *   [   nv gpr save area] 8*8   |
>   *   [tail_call_cnt  ] 8 |
>   *   [local_tmp_var  ] 8 |
> - * fp (r31) -->  [   ebpf stack space] 512   |
> + * fp (r31) -->  [   ebpf stack space] upto 512  |
>   *   [ frame header  ] 32/112|
>   * sp (r1) --->  [stack pointer  ] --
>   */
> @@ -32,8 +32,8 @@
>  #define BPF_PPC_STACK_SAVE   (8*8)
>  /* for bpf JIT code internal usage */
>  #define BPF_PPC_STACK_LOCALS 16
> -/* Ensure this is quadword aligned */
> -#define BPF_PPC_STACKFRAME   (STACK_FRAME_MIN_SIZE + MAX_BPF_STACK + \
> +/* stack frame excluding BPF stack, ensure this is quadword aligned */
> +#define BPF_PPC_STACKFRAME   (STACK_FRAME_MIN_SIZE + \
>BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE)
> 
>  #ifndef __ASSEMBLY__
> @@ -103,6 +103,7 @@ struct codegen_context {
>*/
>   unsigned int seen;
>   unsigned int idx;
> + unsigned int stack_size;
>  };
> 
>  #endif /* !__ASSEMBLY__ */
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 6ba5d253e857..a01362c88f6a 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -69,7 +69,7 @@ static inline bool bpf_has_stack_frame(struct 
> codegen_context *ctx)
>  static int bpf_jit_stack_local(struct codegen_context *ctx)
>  {
>   if (bpf_has_stack_frame(ctx))
> - return STACK_FRAME_MIN_SIZE + MAX_BPF_STACK;
> + return STACK_FRAME_MIN_SIZE + ctx->stack_size;
>   else
>   return -(BPF_PPC_STACK_SAVE + 16);
>  }
> @@ -82,8 +82,9 @@ static int bpf_jit_stack_tailcallcnt(struct codegen_context 
> *ctx)
>  static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
>  {
>   if (reg >= BPF_PPC_NVR_MIN && reg < 32)
> - return (bpf_has_stack_frame(ctx) ? BPF_PPC_STACKFRAME : 0)
> - - (8 * (32 - reg));
> + return (bpf_has_stack_frame(ctx) ?
> + (BPF_PPC_STACKFRAME + ctx->stack_size) : 0)
> + - (8 * (32 - reg));
> 
>   pr_err("BPF JIT is asking about unknown registers");
>   BUG();
> @@ -134,7 +135,7 @@ static void bpf_jit_build_prologue(u32 *image, struct 
> codegen_context *ctx)
>   PPC_BPF_STL(0, 1, PPC_LR_STKOFF);
>   }
> 
> - PPC_BPF_STLU(1, 1, -BPF_PPC_STACKFRAME);
> + PPC_BPF_STLU(1, 1, -(BPF_PPC_STACKFRAME + ctx->stack_size));
>   }
> 
>   /*
> @@ -161,7 +162,7 @@ static void bpf_jit_build_prologue(u32 *image, struct 
> codegen_context *ctx)
>   /* Setup frame pointer to point to the bpf stack area */
>   if (bpf_is_seen_register(ctx, BPF_REG_FP))
>   PPC_ADDI(b2p[BPF_REG_FP], 1,
> - STACK_FRAME_MIN_SIZE + MAX_BPF_STACK);
> + STACK_FRAME_MIN_SIZE + ctx->stack_size);
>  }
> 
>  static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context 
> *ctx)
> @@ -183,7 +184,7 @@ static void bpf_jit_emit_common_epilogue(u32 *image, 
> struct codegen_context *ctx
> 
>   /* Tear down our stack frame */
>   if (bpf_has_stack_frame(ctx)) {
> - PPC_ADDI(1, 1, BPF_PPC_STACKFRAME);
> + PPC_ADDI(1, 1, BPF_PPC_STACKFRAME + ctx->stack_size);
>   if (ctx->seen & SEEN_FUNC) {
>   PPC_BPF_LL(0, 1, PPC_LR_STKOFF);
>   PPC_MTLR(0);
> @@ -993,6 +994,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
> 
> 	memset(&cgctx, 0, sizeof(struct codegen_context));
> 
> + /* Make sure that the stack is quadword aligned. */
> + cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
> +
>   /* Scouting faux-generate pass 0 */
> 	if (bpf_jit_build_body(fp, 0, &cgctx, addrs)) {
>   /* We hit something illegal or unsupported. */
> -- 
> 2.13.5
> 



Re: [PATCH 3/3] powerpc: bpf: implement in-register swap for 64-bit endian operations

2017-01-24 Thread 'Naveen N. Rao'
On 2017/01/24 04:13PM, David Laight wrote:
> From: 'Naveen N. Rao'
> > Sent: 23 January 2017 19:22
> > On 2017/01/15 09:00AM, Benjamin Herrenschmidt wrote:
> > > On Fri, 2017-01-13 at 23:22 +0530, 'Naveen N. Rao' wrote:
> > > > > That rather depends on whether the processor has a store to load 
> > > > > forwarder
> > > > > that will satisfy the read from the store buffer.
> > > > > I don't know about ppc, but at least some x86 will do that.
> > > >
> > > > Interesting - good to know that.
> > > >
> > > > However, I don't think powerpc does that and in-register swap is likely
> > > > faster regardless. Note also that gcc prefers this form at higher
> > > > optimization levels.
> > >
> > > Of course powerpc has a load-store forwarder these days, however, I
> > > wouldn't be surprised if the in-register form was still faster on some
> > > implementations, but this needs to be tested.
> > 
> > Thanks for clarifying! To test this, I wrote a simple (perhaps naive)
> > test that just issues a whole lot of endian swaps and in _that_ test, it
> > does look like the load-store forwarder is doing pretty well.
> ...
> > This is all in a POWER8 vm. On POWER7, the in-register variant is around
> > 4 times faster than the ldbrx variant.
> ...
> 
> I wonder which is faster on the little 1GHz embedded ppc we use here.

Worth a test, for sure.
FWIW, this patch won't matter since eBPF JIT is for ppc64.

Thanks,
Naveen



Re: [PATCH 3/3] powerpc: bpf: implement in-register swap for 64-bit endian operations

2017-01-23 Thread 'Naveen N. Rao'
On 2017/01/15 09:00AM, Benjamin Herrenschmidt wrote:
> On Fri, 2017-01-13 at 23:22 +0530, 'Naveen N. Rao' wrote:
> > > That rather depends on whether the processor has a store to load forwarder
> > > that will satisfy the read from the store buffer.
> > > I don't know about ppc, but at least some x86 will do that.
> > 
> > Interesting - good to know that.
> > 
> > However, I don't think powerpc does that and in-register swap is likely 
> > faster regardless. Note also that gcc prefers this form at higher 
> > optimization levels.
> 
> Of course powerpc has a load-store forwarder these days, however, I
> wouldn't be surprised if the in-register form was still faster on some
> implementations, but this needs to be tested.

Thanks for clarifying! To test this, I wrote a simple (perhaps naive) 
test that just issues a whole lot of endian swaps and in _that_ test, it 
does look like the load-store forwarder is doing pretty well.

The tests:

bpf-bswap.S:
---
        .file   "bpf-bswap.S"
        .abiversion 2
        .section ".text"
        .align 2
        .globl main
        .type   main, @function
main:
        mflr    0
        std     0,16(1)
        stdu    1,-32760(1)
        addi    3,1,32
        li      4,0
        li      5,32720
        li      11,32720
        mulli   11,11,8
        li      10,0
        li      7,16
1:      ldx     6,3,4
        stdx    6,1,7
        ldbrx   6,1,7
        stdx    6,3,4
        addi    4,4,8
        cmpd    4,5
        beq     2f
        b       1b
2:      addi    10,10,1
        li      4,0
        cmpd    10,11
        beq     3f
        b       1b
3:      li      3,0
        addi    1,1,32760
        ld      0,16(1)
        mtlr    0
        blr

bpf-bswap-reg.S:
---
        .file   "bpf-bswap-reg.S"
        .abiversion 2
        .section ".text"
        .align 2
        .globl main
        .type   main, @function
main:
        mflr    0
        std     0,16(1)
        stdu    1,-32760(1)
        addi    3,1,32
        li      4,0
        li      5,32720
        li      11,32720
        mulli   11,11,8
        li      10,0
1:      ldx     6,3,4
        rldicl  7,6,32,32
        rlwinm  8,6,24,0,31
        rlwimi  8,6,8,8,15
        rlwinm  9,7,24,0,31
        rlwimi  8,6,8,24,31
        rlwimi  9,7,8,8,15
        rlwimi  9,7,8,24,31
        rldicr  8,8,32,31
        or      6,8,9
        stdx    6,3,4
        addi    4,4,8
        cmpd    4,5
        beq     2f
        b       1b
2:      addi    10,10,1
        li      4,0
        cmpd    10,11
        beq     3f
        b       1b
3:      li      3,0
        addi    1,1,32760
        ld      0,16(1)
        mtlr    0
        blr

Profiling the two variants:

# perf stat ./bpf-bswap

 Performance counter stats for './bpf-bswap':

   1395.979224  task-clock (msec) #0.999 CPUs utilized  

 0  context-switches  #0.000 K/sec  

 0  cpu-migrations#0.000 K/sec  

45  page-faults   #0.032 K/sec  

 4,651,874,673  cycles#3.332 GHz
  (66.87%)
 3,141,186  stalled-cycles-frontend   #0.07% frontend cycles 
idle (50.57%)
 1,117,289,485  stalled-cycles-backend#   24.02% backend cycles 
idle  (50.57%)
 8,565,963,861  instructions  #1.84  insn per cycle 

  #0.13  stalled cycles per 
insn  (67.05%)
 2,174,029,771  branches  # 1557.351 M/sec  
  (49.69%)
   262,656  branch-misses #0.01% of all branches
  (50.05%)

   1.396893189 seconds time elapsed

# perf stat ./bpf-bswap-reg

 Performance counter stats for './bpf-bswap-reg':

   1819.758102  task-clock (msec) #0.999 CPUs utilized  

 3  context-switches  #0.002 K/sec  

 0  cpu-migrations#0.000 K/sec  

44  page-faults   #0.024 K/sec  

 6,034,777,602  cycles#3.316 GHz
  (66.83%)
 2,010,983  stalled-cycles-frontend   #0.03% frontend cycles 
idle (50.47%)
 1,024,975,759  stalled-cycles-backend#   16.98% backend cycles 
idle  (50.52%)
16,043,732,849  instructions  #2.66  insn per cycle 

  #0.06  stalled cycles per 
insn  (67.01%)
 2,148,710,750  branches  # 1180.767 M/sec  
  (49.57%)
   268,046  branch-m

Re: [PATCH 1/3] powerpc: bpf: remove redundant check for non-null image

2017-01-23 Thread Naveen N. Rao
Hi David,

On 2017/01/16 01:38PM, David Miller wrote:
> 
> I'm assuming these patches will go via the powerpc tree.
> 
> If you want them to go into net-next, I kindly ask that you always
> explicitly say so, and furthermore always submit a patch series with
> a proper "[PATCH 0/N] ..." header posting.

Sure. Sorry for the confusion. I will be more explicit next time.

Thanks,
Naveen



Re: [PATCH 3/3] powerpc: bpf: implement in-register swap for 64-bit endian operations

2017-01-13 Thread 'Naveen N. Rao'
On 2017/01/13 05:17PM, David Laight wrote:
> From: Naveen N. Rao
> > Sent: 13 January 2017 17:10
> > Generate instructions to perform the endian conversion using registers,
> > rather than generating two memory accesses.
> > 
> > The "way easier and faster" comment was obviously for the author, not
> > the processor.
> 
> That rather depends on whether the processor has a store to load forwarder
> that will satisfy the read from the store buffer.
> I don't know about ppc, but at least some x86 will do that.

Interesting - good to know that.

However, I don't think powerpc does that and in-register swap is likely 
faster regardless. Note also that gcc prefers this form at higher 
optimization levels.

Thanks,
Naveen



[PATCH 1/3] powerpc: bpf: remove redundant check for non-null image

2017-01-13 Thread Naveen N. Rao
From: Daniel Borkmann <dan...@iogearbox.net>

We have a check earlier to ensure we don't proceed if image is NULL. As
such, the redundant check can be removed.

Signed-off-by: Daniel Borkmann <dan...@iogearbox.net>
[Added similar changes for classic BPF JIT]
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit_comp.c   | 17 +
 arch/powerpc/net/bpf_jit_comp64.c | 16 
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 7e706f3..f9941b3 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -662,16 +662,17 @@ void bpf_jit_compile(struct bpf_prog *fp)
 */
bpf_jit_dump(flen, proglen, pass, code_base);
 
-   if (image) {
-   bpf_flush_icache(code_base, code_base + (proglen/4));
+   bpf_flush_icache(code_base, code_base + (proglen/4));
+
 #ifdef CONFIG_PPC64
-   /* Function descriptor nastiness: Address + TOC */
-   ((u64 *)image)[0] = (u64)code_base;
-   ((u64 *)image)[1] = local_paca->kernel_toc;
+   /* Function descriptor nastiness: Address + TOC */
+   ((u64 *)image)[0] = (u64)code_base;
+   ((u64 *)image)[1] = local_paca->kernel_toc;
 #endif
-   fp->bpf_func = (void *)image;
-   fp->jited = 1;
-   }
+
+   fp->bpf_func = (void *)image;
+   fp->jited = 1;
+
 out:
kfree(addrs);
return;
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 0fe98a5..89b6a86 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -1046,16 +1046,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog 
*fp)
 */
bpf_jit_dump(flen, proglen, pass, code_base);
 
-   if (image) {
-   bpf_flush_icache(bpf_hdr, image + alloclen);
+   bpf_flush_icache(bpf_hdr, image + alloclen);
+
 #ifdef PPC64_ELF_ABI_v1
-   /* Function descriptor nastiness: Address + TOC */
-   ((u64 *)image)[0] = (u64)code_base;
-   ((u64 *)image)[1] = local_paca->kernel_toc;
+   /* Function descriptor nastiness: Address + TOC */
+   ((u64 *)image)[0] = (u64)code_base;
+   ((u64 *)image)[1] = local_paca->kernel_toc;
 #endif
-   fp->bpf_func = (void *)image;
-   fp->jited = 1;
-   }
+
+   fp->bpf_func = (void *)image;
+   fp->jited = 1;
 
 out:
kfree(addrs);
-- 
2.10.2



[PATCH 3/3] powerpc: bpf: implement in-register swap for 64-bit endian operations

2017-01-13 Thread Naveen N. Rao
Generate instructions to perform the endian conversion using registers,
rather than generating two memory accesses.

The "way easier and faster" comment was obviously for the author, not
the processor.

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit_comp64.c | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 1e313db..0413a89 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -599,16 +599,22 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
*image,
break;
case 64:
/*
-* Way easier and faster(?) to store the value
-* into stack and then use ldbrx
+* We'll split it up into two words, swap those
+* independently and then merge them back.
 *
-* ctx->seen will be reliable in pass2, but
-* the instructions generated will remain the
-* same across all passes
+* First up, let's swap the most-significant 
word.
 */
-   PPC_STD(dst_reg, 1, bpf_jit_stack_local(ctx));
-   PPC_ADDI(b2p[TMP_REG_1], 1, 
bpf_jit_stack_local(ctx));
-   PPC_LDBRX(dst_reg, 0, b2p[TMP_REG_1]);
+   PPC_RLDICL(b2p[TMP_REG_1], dst_reg, 32, 32);
+   PPC_RLWINM(b2p[TMP_REG_2], b2p[TMP_REG_1], 8, 
0, 31);
+   PPC_RLWIMI(b2p[TMP_REG_2], b2p[TMP_REG_1], 24, 
0, 7);
+   PPC_RLWIMI(b2p[TMP_REG_2], b2p[TMP_REG_1], 24, 
16, 23);
+   /* Then, the second half */
+   PPC_RLWINM(b2p[TMP_REG_1], dst_reg, 8, 0, 31);
+   PPC_RLWIMI(b2p[TMP_REG_1], dst_reg, 24, 0, 7);
+   PPC_RLWIMI(b2p[TMP_REG_1], dst_reg, 24, 16, 23);
+   /* Merge back */
+   PPC_RLDICR(dst_reg, b2p[TMP_REG_1], 32, 31);
+   PPC_OR(dst_reg, dst_reg, b2p[TMP_REG_2]);
break;
}
break;
-- 
2.10.2



[PATCH 2/3] powerpc: bpf: flush the entire JIT buffer

2017-01-13 Thread Naveen N. Rao
With bpf_jit_binary_alloc(), we allocate at a page granularity and fill
the rest of the space with illegal instructions to mitigate BPF spraying
attacks, while having the actual JIT'ed BPF program at a random location
within the allocated space. Under this scenario, it would be better to
flush the entire allocated buffer rather than just the part containing
the actual program. We already flush the buffer from start to the end of
the BPF program. Extend this to include the illegal instructions after
the BPF program.

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit_comp64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 89b6a86..1e313db 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -1046,8 +1046,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 */
bpf_jit_dump(flen, proglen, pass, code_base);
 
-   bpf_flush_icache(bpf_hdr, image + alloclen);
-
 #ifdef PPC64_ELF_ABI_v1
/* Function descriptor nastiness: Address + TOC */
((u64 *)image)[0] = (u64)code_base;
@@ -1057,6 +1055,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
fp->bpf_func = (void *)image;
fp->jited = 1;
 
+   bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + (bpf_hdr->pages * PAGE_SIZE));
+
 out:
kfree(addrs);
 
-- 
2.10.2



Re: [PATCH 2/3] bpf powerpc: implement support for tail calls

2016-09-26 Thread Naveen N. Rao
On 2016/09/26 11:00AM, Daniel Borkmann wrote:
> On 09/26/2016 10:56 AM, Naveen N. Rao wrote:
> > On 2016/09/24 03:30AM, Alexei Starovoitov wrote:
> > > On Sat, Sep 24, 2016 at 12:33:54AM +0200, Daniel Borkmann wrote:
> > > > On 09/23/2016 10:35 PM, Naveen N. Rao wrote:
> > > > > Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF
> > > > > programs. This can be achieved either by:
> > > > > (1) retaining the stack setup by the first eBPF program and having all
> > > > > subsequent eBPF programs re-using it, or,
> > > > > (2) by unwinding/tearing down the stack and having each eBPF program
> > > > > deal with its own stack as it sees fit.
> > > > > 
> > > > > To ensure that this does not create loops, there is a limit to how 
> > > > > many
> > > > > tail calls can be done (currently 32). This requires the JIT'ed code 
> > > > > to
> > > > > maintain a count of the number of tail calls done so far.
> > > > > 
> > > > > Approach (1) is simple, but requires every eBPF program to have 
> > > > > (almost)
> > > > > the same prologue/epilogue, regardless of whether they need it. This 
> > > > > is
> > > > > inefficient for small eBPF programs which may sometimes not need a
> > > > > prologue at all. As such, to minimize impact of tail call
> > > > > implementation, we use approach (2) here which needs each eBPF program
> > > > > in the chain to use its own prologue/epilogue. This is not ideal when
> > > > > many tail calls are involved and when all the eBPF programs in the 
> > > > > chain
> > > > > have similar prologue/epilogue. However, the impact is restricted to
> > > > > programs that do tail calls. Individual eBPF programs are not 
> > > > > affected.
> > > > > 
> > > > > We maintain the tail call count in a fixed location on the stack and
> > > > > updated tail call count values are passed in through this. The very
> > > > > first eBPF program in a chain sets this up to 0 (the first 2
> > > > > instructions). Subsequent tail calls skip the first two eBPF JIT
> > > > > instructions to maintain the count. For programs that don't do tail
> > > > > calls themselves, the first two instructions are NOPs.
> > > > > 
> > > > > Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
> > > > 
> > > > Thanks for adding support, Naveen, that's really great! I think 2) seems
> > > > fine as well in this context as prologue size can vary quite a bit here,
> > > > and depending on program types likelihood of tail call usage as well 
> > > > (but
> > > > I wouldn't expect deep nesting). Thanks a lot!
> > > 
> > > Great stuff. In this circumstances approach 2 makes sense to me as well.
> > 
> > Alexei, Daniel,
> > Thanks for the quick review!
> 
> The patches would go via Michael's tree (same way as with the JIT itself
> in the past), right?

Yes, this set is contained within arch/powerpc, so Michael can take this 
through his tree.

The other set with updates to samples/bpf can probably go through 
David's tree.

- Naveen



Re: [PATCH 2/3] bpf powerpc: implement support for tail calls

2016-09-26 Thread Naveen N. Rao
On 2016/09/24 03:30AM, Alexei Starovoitov wrote:
> On Sat, Sep 24, 2016 at 12:33:54AM +0200, Daniel Borkmann wrote:
> > On 09/23/2016 10:35 PM, Naveen N. Rao wrote:
> > >Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF
> > >programs. This can be achieved either by:
> > >(1) retaining the stack setup by the first eBPF program and having all
> > >subsequent eBPF programs re-using it, or,
> > >(2) by unwinding/tearing down the stack and having each eBPF program
> > >deal with its own stack as it sees fit.
> > >
> > >To ensure that this does not create loops, there is a limit to how many
> > >tail calls can be done (currently 32). This requires the JIT'ed code to
> > >maintain a count of the number of tail calls done so far.
> > >
> > >Approach (1) is simple, but requires every eBPF program to have (almost)
> > >the same prologue/epilogue, regardless of whether they need it. This is
> > >inefficient for small eBPF programs which may not sometimes need a
> > >prologue at all. As such, to minimize impact of tail call
> > >implementation, we use approach (2) here which needs each eBPF program
> > >in the chain to use its own prologue/epilogue. This is not ideal when
> > >many tail calls are involved and when all the eBPF programs in the chain
> > >have similar prologue/epilogue. However, the impact is restricted to
> > >programs that do tail calls. Individual eBPF programs are not affected.
> > >
> > >We maintain the tail call count in a fixed location on the stack and
> > >updated tail call count values are passed in through this. The very
> > >first eBPF program in a chain sets this up to 0 (the first 2
> > >instructions). Subsequent tail calls skip the first two eBPF JIT
> > >instructions to maintain the count. For programs that don't do tail
> > >calls themselves, the first two instructions are NOPs.
> > >
> > >Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
> > 
> > Thanks for adding support, Naveen, that's really great! I think 2) seems
> > fine as well in this context as prologue size can vary quite a bit here,
> > and depending on program types likelihood of tail call usage as well (but
> > I wouldn't expect deep nesting). Thanks a lot!
> 
> Great stuff. In this circumstances approach 2 makes sense to me as well.

Alexei, Daniel,
Thanks for the quick review!

- Naveen



[PATCH 1/2] bpf samples: fix compiler errors with sockex2 and sockex3

2016-09-23 Thread Naveen N. Rao
These samples fail to compile as 'struct flow_keys' conflicts with
definition in net/flow_dissector.h. Fix the same by renaming the
structure used in the sample.

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 samples/bpf/sockex2_kern.c | 10 +-
 samples/bpf/sockex3_kern.c |  8 
 samples/bpf/sockex3_user.c |  4 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/samples/bpf/sockex2_kern.c b/samples/bpf/sockex2_kern.c
index ba0e177..44e5846 100644
--- a/samples/bpf/sockex2_kern.c
+++ b/samples/bpf/sockex2_kern.c
@@ -14,7 +14,7 @@ struct vlan_hdr {
__be16 h_vlan_encapsulated_proto;
 };
 
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -59,7 +59,7 @@ static inline __u32 ipv6_addr_hash(struct __sk_buff *ctx, 
__u64 off)
 }
 
 static inline __u64 parse_ip(struct __sk_buff *skb, __u64 nhoff, __u64 
*ip_proto,
-struct flow_keys *flow)
+struct bpf_flow_keys *flow)
 {
__u64 verlen;
 
@@ -83,7 +83,7 @@ static inline __u64 parse_ip(struct __sk_buff *skb, __u64 
nhoff, __u64 *ip_proto
 }
 
 static inline __u64 parse_ipv6(struct __sk_buff *skb, __u64 nhoff, __u64 
*ip_proto,
-  struct flow_keys *flow)
+  struct bpf_flow_keys *flow)
 {
*ip_proto = load_byte(skb,
  nhoff + offsetof(struct ipv6hdr, nexthdr));
@@ -96,7 +96,7 @@ static inline __u64 parse_ipv6(struct __sk_buff *skb, __u64 
nhoff, __u64 *ip_pro
return nhoff;
 }
 
-static inline bool flow_dissector(struct __sk_buff *skb, struct flow_keys 
*flow)
+static inline bool flow_dissector(struct __sk_buff *skb, struct bpf_flow_keys 
*flow)
 {
__u64 nhoff = ETH_HLEN;
__u64 ip_proto;
@@ -198,7 +198,7 @@ struct bpf_map_def SEC("maps") hash_map = {
 SEC("socket2")
 int bpf_prog2(struct __sk_buff *skb)
 {
-   struct flow_keys flow;
+   struct bpf_flow_keys flow;
struct pair *value;
u32 key;
 
diff --git a/samples/bpf/sockex3_kern.c b/samples/bpf/sockex3_kern.c
index 41ae2fd..95907f8 100644
--- a/samples/bpf/sockex3_kern.c
+++ b/samples/bpf/sockex3_kern.c
@@ -61,7 +61,7 @@ struct vlan_hdr {
__be16 h_vlan_encapsulated_proto;
 };
 
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -88,7 +88,7 @@ static inline __u32 ipv6_addr_hash(struct __sk_buff *ctx, 
__u64 off)
 }
 
 struct globals {
-   struct flow_keys flow;
+   struct bpf_flow_keys flow;
 };
 
 struct bpf_map_def SEC("maps") percpu_map = {
@@ -114,14 +114,14 @@ struct pair {
 
 struct bpf_map_def SEC("maps") hash_map = {
.type = BPF_MAP_TYPE_HASH,
-   .key_size = sizeof(struct flow_keys),
+   .key_size = sizeof(struct bpf_flow_keys),
.value_size = sizeof(struct pair),
.max_entries = 1024,
 };
 
 static void update_stats(struct __sk_buff *skb, struct globals *g)
 {
-   struct flow_keys key = g->flow;
+   struct bpf_flow_keys key = g->flow;
struct pair *value;
 
value = bpf_map_lookup_elem(&hash_map, &key);
diff --git a/samples/bpf/sockex3_user.c b/samples/bpf/sockex3_user.c
index d4184ab..3fcfd8c4 100644
--- a/samples/bpf/sockex3_user.c
+++ b/samples/bpf/sockex3_user.c
@@ -7,7 +7,7 @@
 #include 
 #include 
 
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -49,7 +49,7 @@ int main(int argc, char **argv)
(void) f;
 
for (i = 0; i < 5; i++) {
-   struct flow_keys key = {}, next_key;
+   struct bpf_flow_keys key = {}, next_key;
struct pair value;
 
sleep(1);
-- 
2.9.3



[PATCH 2/2] bpf samples: update tracex5 sample to use __seccomp_filter

2016-09-23 Thread Naveen N. Rao
seccomp_phase1() does not exist anymore. Instead, update sample to use
__seccomp_filter(). While at it, set max locked memory to unlimited.
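
For context, the kernel function signatures assumed here (illustrative --
verify against the kernel you are tracing), which is why the sample now
reads the syscall number from the first argument and, in the per-syscall
programs, the seccomp_data pointer from the second:

	/* old kprobe target: the seccomp_data pointer is the first argument */
	u32 seccomp_phase1(struct seccomp_data *sd);

	/* new kprobe target: syscall number first, seccomp_data second */
	static int __seccomp_filter(int this_syscall,
				    const struct seccomp_data *sd,
				    const bool recheck_after_trace);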

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
I am not completely sure if __seccomp_filter is the right place to hook
in. This works for me though. Please review.

Thanks,
Naveen


 samples/bpf/tracex5_kern.c | 16 +++-
 samples/bpf/tracex5_user.c |  3 +++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/samples/bpf/tracex5_kern.c b/samples/bpf/tracex5_kern.c
index f95f232..fd12d71 100644
--- a/samples/bpf/tracex5_kern.c
+++ b/samples/bpf/tracex5_kern.c
@@ -19,20 +19,18 @@ struct bpf_map_def SEC("maps") progs = {
.max_entries = 1024,
 };
 
-SEC("kprobe/seccomp_phase1")
+SEC("kprobe/__seccomp_filter")
 int bpf_prog1(struct pt_regs *ctx)
 {
-   struct seccomp_data sd;
-
-   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+   int sc_nr = (int)PT_REGS_PARM1(ctx);
 
/* dispatch into next BPF program depending on syscall number */
-   bpf_tail_call(ctx, , sd.nr);
+   bpf_tail_call(ctx, , sc_nr);
 
/* fall through -> unknown syscall */
-   if (sd.nr >= __NR_getuid && sd.nr <= __NR_getsid) {
+   if (sc_nr >= __NR_getuid && sc_nr <= __NR_getsid) {
char fmt[] = "syscall=%d (one of get/set uid/pid/gid)\n";
-   bpf_trace_printk(fmt, sizeof(fmt), sd.nr);
+   bpf_trace_printk(fmt, sizeof(fmt), sc_nr);
}
return 0;
 }
@@ -42,7 +40,7 @@ PROG(__NR_write)(struct pt_regs *ctx)
 {
struct seccomp_data sd;
 
-   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM2(ctx));
if (sd.args[2] == 512) {
char fmt[] = "write(fd=%d, buf=%p, size=%d)\n";
bpf_trace_printk(fmt, sizeof(fmt),
@@ -55,7 +53,7 @@ PROG(__NR_read)(struct pt_regs *ctx)
 {
struct seccomp_data sd;
 
-   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM2(ctx));
if (sd.args[2] > 128 && sd.args[2] <= 1024) {
char fmt[] = "read(fd=%d, buf=%p, size=%d)\n";
bpf_trace_printk(fmt, sizeof(fmt),
diff --git a/samples/bpf/tracex5_user.c b/samples/bpf/tracex5_user.c
index a04dd3c..36b5925 100644
--- a/samples/bpf/tracex5_user.c
+++ b/samples/bpf/tracex5_user.c
@@ -6,6 +6,7 @@
 #include 
 #include "libbpf.h"
 #include "bpf_load.h"
+#include <sys/resource.h>
 
 /* install fake seccomp program to enable seccomp code path inside the kernel,
  * so that our kprobe attached to seccomp_phase1() can be triggered
@@ -27,8 +28,10 @@ int main(int ac, char **argv)
 {
FILE *f;
char filename[256];
+   struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 
snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+   setrlimit(RLIMIT_MEMLOCK, &r);
 
if (load_bpf_file(filename)) {
printf("%s", bpf_log_buf);
-- 
2.9.3



[PATCH 3/3] bpf powerpc: add support for bpf constant blinding

2016-09-23 Thread Naveen N. Rao
In line with similar support for other architectures by Daniel Borkmann.

'MOD Default X' from test_bpf without constant blinding:
84 bytes emitted from JIT compiler (pass:3, flen:7)
d58a4688 + :
   0:   nop
   4:   nop
   8:   std r27,-40(r1)
   c:   std r28,-32(r1)
  10:   xor r8,r8,r8
  14:   xor r28,r28,r28
  18:   mr  r27,r3
  1c:   li  r8,66
  20:   cmpwi   r28,0
  24:   bne 0x0030
  28:   li  r8,0
  2c:   b   0x0044
  30:   divwu   r9,r8,r28
  34:   mullw   r9,r28,r9
  38:   subf    r8,r9,r8
  3c:   rotlwi  r8,r8,0
  40:   li  r8,66
  44:   ld  r27,-40(r1)
  48:   ld  r28,-32(r1)
  4c:   mr  r3,r8
  50:   blr

... and with constant blinding:
140 bytes emitted from JIT compiler (pass:3, flen:11)
dbd6ab24 + :
   0:   nop
   4:   nop
   8:   std r27,-40(r1)
   c:   std r28,-32(r1)
  10:   xor r8,r8,r8
  14:   xor r28,r28,r28
  18:   mr  r27,r3
  1c:   lis r2,-22834
  20:   ori r2,r2,36083
  24:   rotlwi  r2,r2,0
  28:   xori    r2,r2,36017
  2c:   xoris   r2,r2,42702
  30:   rotlwi  r2,r2,0
  34:   mr  r8,r2
  38:   rotlwi  r8,r8,0
  3c:   cmpwi   r28,0
  40:   bne 0x004c
  44:   li  r8,0
  48:   b   0x007c
  4c:   divwu   r9,r8,r28
  50:   mullw   r9,r28,r9
  54:   subf    r8,r9,r8
  58:   rotlwi  r8,r8,0
  5c:   lis r2,-17137
  60:   ori r2,r2,39065
  64:   rotlwi  r2,r2,0
  68:   xori    r2,r2,39131
  6c:   xoris   r2,r2,48399
  70:   rotlwi  r2,r2,0
  74:   mr  r8,r2
  78:   rotlwi  r8,r8,0
  7c:   ld  r27,-40(r1)
  80:   ld  r28,-32(r1)
  84:   mr  r3,r8
  88:   blr
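
Roughly, the extra lis/ori/xori/xoris/mr sequences above come from the
core blinding pass rewriting each immediate-carrying instruction through
the new scratch register BPF_REG_AX (mapped to r2 in this patch). A
sketch of the rewritten pattern, not the exact generic-code output:

	/* original:       dst = K
	 * after blinding: the real constant never appears verbatim */
	BPF_ALU32_IMM(BPF_MOV, BPF_REG_AX, K ^ rnd),	/* load masked K   */
	BPF_ALU32_IMM(BPF_XOR, BPF_REG_AX, rnd),	/* unmask with rnd */
	BPF_ALU32_REG(BPF_MOV, dst_reg, BPF_REG_AX),	/* move into dst   */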

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit64.h  |  9 +
 arch/powerpc/net/bpf_jit_comp64.c | 36 +---
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
index 038e00b..62fa758 100644
--- a/arch/powerpc/net/bpf_jit64.h
+++ b/arch/powerpc/net/bpf_jit64.h
@@ -39,10 +39,10 @@
 #ifndef __ASSEMBLY__
 
 /* BPF register usage */
-#define SKB_HLEN_REG   (MAX_BPF_REG + 0)
-#define SKB_DATA_REG   (MAX_BPF_REG + 1)
-#define TMP_REG_1  (MAX_BPF_REG + 2)
-#define TMP_REG_2  (MAX_BPF_REG + 3)
+#define SKB_HLEN_REG   (MAX_BPF_JIT_REG + 0)
+#define SKB_DATA_REG   (MAX_BPF_JIT_REG + 1)
+#define TMP_REG_1  (MAX_BPF_JIT_REG + 2)
+#define TMP_REG_2  (MAX_BPF_JIT_REG + 3)
 
 /* BPF to ppc register mappings */
 static const int b2p[] = {
@@ -62,6 +62,7 @@ static const int b2p[] = {
/* frame pointer aka BPF_REG_10 */
[BPF_REG_FP] = 31,
/* eBPF jit internal registers */
+   [BPF_REG_AX] = 2,
[SKB_HLEN_REG] = 25,
[SKB_DATA_REG] = 26,
[TMP_REG_1] = 9,
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index 3ec29d6..0fe98a5 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -974,21 +974,37 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
int pass;
int flen;
struct bpf_binary_header *bpf_hdr;
+   struct bpf_prog *org_fp = fp;
+   struct bpf_prog *tmp_fp;
+   bool bpf_blinded = false;
 
if (!bpf_jit_enable)
-   return fp;
+   return org_fp;
+
+   tmp_fp = bpf_jit_blind_constants(org_fp);
+   if (IS_ERR(tmp_fp))
+   return org_fp;
+
+   if (tmp_fp != org_fp) {
+   bpf_blinded = true;
+   fp = tmp_fp;
+   }
 
flen = fp->len;
addrs = kzalloc((flen+1) * sizeof(*addrs), GFP_KERNEL);
-   if (addrs == NULL)
-   return fp;
+   if (addrs == NULL) {
+   fp = org_fp;
+   goto out;
+   }
+
+   memset(&cgctx, 0, sizeof(struct codegen_context));
 
-   cgctx.idx = 0;
-   cgctx.seen = 0;
/* Scouting faux-generate pass 0 */
-   if (bpf_jit_build_body(fp, 0, &cgctx, addrs))
+   if (bpf_jit_build_body(fp, 0, &cgctx, addrs)) {
/* We hit something illegal or unsupported. */
+   fp = org_fp;
goto out;
+   }
 
/*
 * Pretend to build prologue, given the features we've seen.  This will
@@ -1003,8 +1019,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 
bpf_hdr = bpf_jit_binary_alloc(alloclen, &image, 4,
bpf_jit_fill_ill_insns);
-   if (!bpf_hdr)
+   if (!bpf_hdr) {
+   fp = org_fp;
goto out;
+   }
 
code_base = (u32 *)(image + FUNCTION_DESCR_SIZE);
 
@@ -1041,6 +1059,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 
 out:
kfree(addrs);
+
+   if (bpf_blinded)
+   bpf_jit_prog_release_other(fp, fp == org_fp ? tmp_fp : org_fp);
+
return fp;
 }
 
-- 
2.9.3



[PATCH 2/3] bpf powerpc: implement support for tail calls

2016-09-23 Thread Naveen N. Rao
Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF
programs. This can be achieved either by:
(1) retaining the stack setup by the first eBPF program and having all
subsequent eBPF programs re-using it, or,
(2) by unwinding/tearing down the stack and having each eBPF program
deal with its own stack as it sees fit.

To ensure that this does not create loops, there is a limit to how many
tail calls can be done (currently 32). This requires the JIT'ed code to
maintain a count of the number of tail calls done so far.

Approach (1) is simple, but requires every eBPF program to have (almost)
the same prologue/epilogue, regardless of whether they need it. This is
inefficient for small eBPF programs which may not sometimes need a
prologue at all. As such, to minimize impact of tail call
implementation, we use approach (2) here which needs each eBPF program
in the chain to use its own prologue/epilogue. This is not ideal when
many tail calls are involved and when all the eBPF programs in the chain
have similar prologue/epilogue. However, the impact is restricted to
programs that do tail calls. Individual eBPF programs are not affected.

We maintain the tail call count in a fixed location on the stack and
updated tail call count values are passed in through this. The very
first eBPF program in a chain sets this up to 0 (the first 2
instructions). Subsequent tail calls skip the first two eBPF JIT
instructions to maintain the count. For programs that don't do tail
calls themselves, the first two instructions are NOPs.
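
As a rough sketch (not the exact emitted sequence; 'tailcallcnt_offset'
is an assumed placeholder for the fixed stack slot, and the macro names
are those used later in this patch), the start of a JIT'ed program would
look something like:

	/* the first two emitted instructions zero the tail call count in
	 * its fixed stack slot; a tail call branches past them, so the
	 * count set up by the first program in the chain is carried
	 * through (programs that never tail-call emit two NOPs instead) */
	PPC_LI(b2p[TMP_REG_1], 0);
	PPC_BPF_STL(b2p[TMP_REG_1], 1, tailcallcnt_offset);
	/* ... rest of the prologue ... */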

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/ppc-opcode.h |   2 +
 arch/powerpc/net/bpf_jit.h|   2 +
 arch/powerpc/net/bpf_jit64.h  |   1 +
 arch/powerpc/net/bpf_jit_comp64.c | 149 +++---
 4 files changed, 126 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 127ebf5..54ff8ce 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -236,6 +236,7 @@
 #define PPC_INST_STWU  0x9400
 #define PPC_INST_MFLR  0x7c0802a6
 #define PPC_INST_MTLR  0x7c0803a6
+#define PPC_INST_MTCTR 0x7c0903a6
 #define PPC_INST_CMPWI 0x2c00
 #define PPC_INST_CMPDI 0x2c20
 #define PPC_INST_CMPW  0x7c00
@@ -250,6 +251,7 @@
 #define PPC_INST_SUB   0x7c50
 #define PPC_INST_BLR   0x4e800020
 #define PPC_INST_BLRL  0x4e800021
+#define PPC_INST_BCTR  0x4e800420
 #define PPC_INST_MULLD 0x7c0001d2
 #define PPC_INST_MULLW 0x7c0001d6
 #define PPC_INST_MULHWU0x7c16
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index d5301b6..89f7007 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -40,6 +40,8 @@
 #define PPC_BLR()  EMIT(PPC_INST_BLR)
 #define PPC_BLRL() EMIT(PPC_INST_BLRL)
 #define PPC_MTLR(r)EMIT(PPC_INST_MTLR | ___PPC_RT(r))
+#define PPC_BCTR() EMIT(PPC_INST_BCTR)
+#define PPC_MTCTR(r)   EMIT(PPC_INST_MTCTR | ___PPC_RT(r))
 #define PPC_ADDI(d, a, i)  EMIT(PPC_INST_ADDI | ___PPC_RT(d) |   \
 ___PPC_RA(a) | IMM_L(i))
 #define PPC_MR(d, a)   PPC_OR(d, a, a)
diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
index a1645d7..038e00b 100644
--- a/arch/powerpc/net/bpf_jit64.h
+++ b/arch/powerpc/net/bpf_jit64.h
@@ -88,6 +88,7 @@ DECLARE_LOAD_FUNC(sk_load_byte);
 #define SEEN_FUNC  0x1000 /* might call external helpers */
 #define SEEN_STACK 0x2000 /* uses BPF stack */
 #define SEEN_SKB   0x4000 /* uses sk_buff */
+#define SEEN_TAILCALL  0x8000 /* uses tail calls */
 
 struct codegen_context {
/*
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index 5f8c91f..3ec29d6 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "bpf_jit64.h"
 
@@ -77,6 +78,11 @@ static int bpf_jit_stack_local(struct codegen_context *ctx)
return -(BPF_PPC_STACK_SAVE + 16);
 }
 
+static int bpf_jit_stack_tailcallcnt(struct codegen_context *ctx)
+{
+   return bpf_jit_stack_local(ctx) + 8;
+}
+
 static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
 {
if (reg >= BPF_PPC_NVR_MIN && reg < 32)
@@ -102,33 +108,25 @@ static void bpf_jit_emit_skb_loads(u32 *image, struct 
codegen_context *ctx)
PPC_BPF_LL(b2p[SKB_DATA_REG], 3, offsetof(struct sk_buff, data));
 }
 
-static void bpf_jit_emit_func_call(u32 *image, struct codegen_context *ctx, 
u64 func)
+static void bpf_jit_bui

[PATCH 1/3] bpf powerpc: introduce accessors for using the tmp local stack space

2016-09-23 Thread Naveen N. Rao
While at it, ensure that the location of the local save area is
consistent whether or not we setup our own stackframe. This property is
utilised in the next patch that adds support for tail calls.
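
A minimal sketch of how the accessors are meant to be used (the helpers
themselves are in the diff below): callers address the local/tmp slot the
same way whether or not we set up our own stack frame, and the helper
returns either an offset into our frame or a negative red-zone offset:

	/* spill and reload a temporary via the local save area */
	PPC_BPF_STL(b2p[TMP_REG_1], 1, bpf_jit_stack_local(ctx));
	PPC_BPF_LL(b2p[TMP_REG_1], 1, bpf_jit_stack_local(ctx));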

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit64.h  | 16 +---
 arch/powerpc/net/bpf_jit_comp64.c | 79 ++-
 2 files changed, 55 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
index 5046d6f..a1645d7 100644
--- a/arch/powerpc/net/bpf_jit64.h
+++ b/arch/powerpc/net/bpf_jit64.h
@@ -16,22 +16,25 @@
 
 /*
  * Stack layout:
+ * Ensure the top half (upto local_tmp_var) stays consistent
+ * with our redzone usage.
  *
  * [   prev sp ] <-
  * [   nv gpr save area] 8*8   |
+ * [tail_call_cnt  ] 8 |
+ * [local_tmp_var  ] 8 |
  * fp (r31) -->[   ebpf stack space] 512   |
- * [  local/tmp var space  ] 16|
  * [ frame header  ] 32/112|
  * sp (r1) --->[stack pointer  ] --
  */
 
-/* for bpf JIT code internal usage */
-#define BPF_PPC_STACK_LOCALS   16
 /* for gpr non volatile registers BPG_REG_6 to 10, plus skb cache registers */
 #define BPF_PPC_STACK_SAVE (8*8)
+/* for bpf JIT code internal usage */
+#define BPF_PPC_STACK_LOCALS   16
 /* Ensure this is quadword aligned */
-#define BPF_PPC_STACKFRAME (STACK_FRAME_MIN_SIZE + BPF_PPC_STACK_LOCALS + \
-MAX_BPF_STACK + BPF_PPC_STACK_SAVE)
+#define BPF_PPC_STACKFRAME (STACK_FRAME_MIN_SIZE + MAX_BPF_STACK + \
+BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE)
 
 #ifndef __ASSEMBLY__
 
@@ -65,6 +68,9 @@ static const int b2p[] = {
[TMP_REG_2] = 10
 };
 
+/* PPC NVR range -- update this if we ever use NVRs below r24 */
+#define BPF_PPC_NVR_MIN24
+
 /* Assembly helpers */
 #define DECLARE_LOAD_FUNC(func)u64 func(u64 r3, u64 r4);   
\
u64 func##_negative_offset(u64 r3, u64 r4); 
\
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index 6073b78..5f8c91f 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -58,6 +58,35 @@ static inline bool bpf_has_stack_frame(struct 
codegen_context *ctx)
return ctx->seen & SEEN_FUNC || bpf_is_seen_register(ctx, BPF_REG_FP);
 }
 
+/*
+ * When not setting up our own stackframe, the redzone usage is:
+ *
+ * [   prev sp ] <-
+ * [ ...   ]   |
+ * sp (r1) --->[stack pointer  ] --
+ * [   nv gpr save area] 8*8
+ * [tail_call_cnt  ] 8
+ * [local_tmp_var  ] 8
+ * [   unused red zone ] 208 bytes protected
+ */
+static int bpf_jit_stack_local(struct codegen_context *ctx)
+{
+   if (bpf_has_stack_frame(ctx))
+   return STACK_FRAME_MIN_SIZE + MAX_BPF_STACK;
+   else
+   return -(BPF_PPC_STACK_SAVE + 16);
+}
+
+static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
+{
+   if (reg >= BPF_PPC_NVR_MIN && reg < 32)
+   return (bpf_has_stack_frame(ctx) ? BPF_PPC_STACKFRAME : 0)
+   - (8 * (32 - reg));
+
+   pr_err("BPF JIT is asking about unknown registers");
+   BUG();
+}
+
 static void bpf_jit_emit_skb_loads(u32 *image, struct codegen_context *ctx)
 {
/*
@@ -100,9 +129,8 @@ static void bpf_jit_emit_func_call(u32 *image, struct 
codegen_context *ctx, u64
 static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
 {
int i;
-   bool new_stack_frame = bpf_has_stack_frame(ctx);
 
-   if (new_stack_frame) {
+   if (bpf_has_stack_frame(ctx)) {
/*
 * We need a stack frame, but we don't necessarily need to
 * save/restore LR unless we call other functions
@@ -122,9 +150,7 @@ static void bpf_jit_build_prologue(u32 *image, struct 
codegen_context *ctx)
 */
for (i = BPF_REG_6; i <= BPF_REG_10; i++)
if (bpf_is_seen_register(ctx, i))
-   PPC_BPF_STL(b2p[i], 1,
-   (new_stack_frame ? BPF_PPC_STACKFRAME : 0) -
-   (8 * (32 - b2p[i])));
+   PPC_BPF_STL(b2p[i], 1, bpf_jit_stack_offsetof(ctx, 
b2p[i]));
 
/*
 * Save additional non-volatile regs if we cache skb
@@ -132,22 +158,21 @@ static void bpf_jit_build_prologue(u32 *image, struct 
codegen_context *ctx)
 */
if (ctx->seen &

[PATCHv2 6/7] ppc: bpf/jit: Isolate classic BPF JIT specifics into a separate header

2016-06-22 Thread Naveen N. Rao
Break out classic BPF JIT specifics into a separate header in
preparation for eBPF JIT implementation. Note that ppc32 will still need
the classic BPF JIT.

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h  | 121 +-
 arch/powerpc/net/bpf_jit32.h| 139 
 arch/powerpc/net/bpf_jit_asm.S  |   2 +-
 arch/powerpc/net/bpf_jit_comp.c |   2 +-
 4 files changed, 143 insertions(+), 121 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit32.h

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 9041d3f..313cfaf 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -1,4 +1,5 @@
-/* bpf_jit.h: BPF JIT compiler for PPC64
+/*
+ * bpf_jit.h: BPF JIT compiler for PPC
  *
  * Copyright 2011 Matt Evans <m...@ozlabs.org>, IBM Corporation
  *
@@ -10,66 +11,8 @@
 #ifndef _BPF_JIT_H
 #define _BPF_JIT_H
 
-#ifdef CONFIG_PPC64
-#define BPF_PPC_STACK_R3_OFF   48
-#define BPF_PPC_STACK_LOCALS   32
-#define BPF_PPC_STACK_BASIC(48+64)
-#define BPF_PPC_STACK_SAVE (18*8)
-#define BPF_PPC_STACKFRAME (BPF_PPC_STACK_BASIC+BPF_PPC_STACK_LOCALS+ \
-BPF_PPC_STACK_SAVE)
-#define BPF_PPC_SLOWPATH_FRAME (48+64)
-#else
-#define BPF_PPC_STACK_R3_OFF   24
-#define BPF_PPC_STACK_LOCALS   16
-#define BPF_PPC_STACK_BASIC(24+32)
-#define BPF_PPC_STACK_SAVE (18*4)
-#define BPF_PPC_STACKFRAME (BPF_PPC_STACK_BASIC+BPF_PPC_STACK_LOCALS+ \
-BPF_PPC_STACK_SAVE)
-#define BPF_PPC_SLOWPATH_FRAME (24+32)
-#endif
-
-#define REG_SZ (BITS_PER_LONG/8)
-
-/*
- * Generated code register usage:
- *
- * As normal PPC C ABI (e.g. r1=sp, r2=TOC), with:
- *
- * skb r3  (Entry parameter)
- * A register  r4
- * X register  r5
- * addr param  r6
- * r7-r10  scratch
- * skb->data   r14
- * skb headlen r15 (skb->len - skb->data_len)
- * m[0]r16
- * m[...]  ...
- * m[15]   r31
- */
-#define r_skb  3
-#define r_ret  3
-#define r_A4
-#define r_X5
-#define r_addr 6
-#define r_scratch1 7
-#define r_scratch2 8
-#define r_D14
-#define r_HL   15
-#define r_M16
-
 #ifndef __ASSEMBLY__
 
-/*
- * Assembly helpers from arch/powerpc/net/bpf_jit.S:
- */
-#define DECLARE_LOAD_FUNC(func)\
-   extern u8 func[], func##_negative_offset[], func##_positive_offset[]
-
-DECLARE_LOAD_FUNC(sk_load_word);
-DECLARE_LOAD_FUNC(sk_load_half);
-DECLARE_LOAD_FUNC(sk_load_byte);
-DECLARE_LOAD_FUNC(sk_load_byte_msh);
-
 #ifdef CONFIG_PPC64
 #define FUNCTION_DESCR_SIZE24
 #else
@@ -131,46 +74,6 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_BPF_STLU(r, base, i) do { PPC_STWU(r, base, i); } while(0)
 #endif
 
-/* Convenience helpers for the above with 'far' offsets: */
-#define PPC_LBZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LBZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LBZ(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LD_OFFS(r, base, i) do { if ((i) < 32768) PPC_LD(r, base, i); \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LD(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LWZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LWZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LWZ(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LHZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LHZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LHZ(r, r, IMM_L(i)); } } while(0)
-
-#ifdef CONFIG_PPC64
-#define PPC_LL_OFFS(r, base, i) do { PPC_LD_OFFS(r, base, i); } while(0)
-#else
-#define PPC_LL_OFFS(r, base, i) do { PPC_LWZ_OFFS(r, base, i); } while(0)
-#endif
-
-#ifdef CONFIG_SMP
-#ifdef CONFIG_PPC64
-#define PPC_BPF_LOAD_CPU(r)\
-   do { BUILD_BUG_ON(FIELD_SIZEOF(struct paca_struct, paca_index) != 2);   
\
-   PPC_LHZ_OFFS(r, 13, offsetof(struct paca_struct, paca_index));  
\
-   } while (0)
-#else
-#define PPC_BPF_LOAD_CPU(r) \
-   do { BUILD_BUG_ON(FIELD_SIZEOF(str

[PATCHv2 3/7] ppc: bpf/jit: Optimize 64-bit Immediate loads

2016-06-22 Thread Naveen N. Rao
Similar to the LI32() optimization, if the value can be represented
in 32-bits, use LI32(). Also handle loading a few specific forms of
immediate values in an optimum manner.
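
A few worked expansions under the optimized PPC_LI64() (illustrative,
derived from the macro below rather than captured from the JIT):

	/* PPC_LI64(r3, 0x1234)             ->  li   r3,0x1234    (LI32 path)
	 * PPC_LI64(r3, -1)                 ->  li   r3,-1        (LI32 path)
	 * PPC_LI64(r3, 0x0000123400005678) ->  li   r3,0x1234
	 *                                      sldi r3,r3,32
	 *                                      ori  r3,r3,0x5678
	 */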

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index a9882db..4c1e055 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -244,20 +244,25 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
} } while(0)
 
 #define PPC_LI64(d, i) do {  \
-   if (!((uintptr_t)(i) & 0xffffffff00000000ULL))\
+   if ((long)(i) >= -2147483648 &&   \
+   (long)(i) < 2147483648)   \
PPC_LI32(d, i);   \
else {\
-   PPC_LIS(d, ((uintptr_t)(i) >> 48));   \
-   if ((uintptr_t)(i) & 0x0000ffff00000000ULL)   \
-   PPC_ORI(d, d, \
-   ((uintptr_t)(i) >> 32) & 0xffff); \
+   if (!((uintptr_t)(i) & 0xffff800000000000ULL))\
+   PPC_LI(d, ((uintptr_t)(i) >> 32) & 0xffff);   \
+   else {\
+   PPC_LIS(d, ((uintptr_t)(i) >> 48));   \
+   if ((uintptr_t)(i) & 0x0000ffff00000000ULL)   \
+   PPC_ORI(d, d, \
+ ((uintptr_t)(i) >> 32) & 0xffff);   \
+   } \
PPC_SLDI(d, d, 32);   \
if ((uintptr_t)(i) & 0x00000000ffff0000ULL)   \
PPC_ORIS(d, d,\
 ((uintptr_t)(i) >> 16) & 0xffff);\
if ((uintptr_t)(i) & 0x000000000000ffffULL)   \
PPC_ORI(d, d, (uintptr_t)(i) & 0xffff);   \
-   } } while (0);
+   } } while (0)
 
 #ifdef CONFIG_PPC64
 #define PPC_FUNC_ADDR(d,i) do { PPC_LI64(d, i); } while(0)
-- 
2.8.2



[PATCHv2 0/7] eBPF JIT for PPC64

2016-06-22 Thread Naveen N. Rao
v2 changes:
  - Patch 1 is new and is cc'ed -stable
  - Patch 7 has 3 changes:
- Include asm/kprobes.h to resolve a build error reported by Michael
Ellerman
- Remove check for fp in bpf_int_jit_compile() as suggested by Daniel
Borkmann
- Fix a crash on Cell processor reported by Michael Ellerman by
changing image to be a u8 pointer for proper size calculation.


Earlier cover letter:
Implement extended BPF JIT for ppc64. We retain the classic BPF JIT for
ppc32 and move ppc64 BE/LE to use the new JIT. Classic BPF filters will
be converted to extended BPF (see convert_filter()) and JIT'ed with the
new compiler.

Most of the existing macros are retained and fixed/enhanced where
appropriate. Patches 1-4 are geared towards this.

Patch 5 breaks out the classic BPF JIT specifics into a separate
bpf_jit32.h header file, while retaining all the generic instruction
macros in bpf_jit.h.

Patch 6 implements eBPF JIT for ppc64.

Since the RFC patchset [1], powerpc JIT has now gained support for skb
access helpers and now passes all tests in test_bpf.ko. Review comments
on the RFC patches have been addressed (use of an ABI macro [2] and use
of bpf_jit_binary_alloc()), along with a few other generic fixes and
updates.

Prominent TODOs:
 - implement BPF tail calls
 - support for BPF constant blinding

Please note that patch [2] is a pre-requisite for this patchset, and is
not yet upstream.


- Naveen

[1] http://thread.gmane.org/gmane.linux.kernel/2188694
[2] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/96514


Naveen N. Rao (7):
  ppc bpf/jit: Disable classic BPF JIT on ppc64le
  ppc: bpf/jit: Fix/enhance 32-bit Load Immediate implementation
  ppc: bpf/jit: Optimize 64-bit Immediate loads
  ppc: bpf/jit: Introduce rotate immediate instructions
  ppc: bpf/jit: A few cleanups
  ppc: bpf/jit: Isolate classic BPF JIT specifics into a separate header
  ppc: ebpf/jit: Implement JIT compiler for extended BPF

 arch/powerpc/Kconfig  |   3 +-
 arch/powerpc/include/asm/asm-compat.h |   2 +
 arch/powerpc/include/asm/ppc-opcode.h |  22 +-
 arch/powerpc/net/Makefile |   4 +
 arch/powerpc/net/bpf_jit.h| 235 -
 arch/powerpc/net/bpf_jit32.h  | 139 +
 arch/powerpc/net/bpf_jit64.h  | 102 
 arch/powerpc/net/bpf_jit_asm.S|   2 +-
 arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
 arch/powerpc/net/bpf_jit_comp.c   |  10 +-
 arch/powerpc/net/bpf_jit_comp64.c | 954 ++
 11 files changed, 1502 insertions(+), 151 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit32.h
 create mode 100644 arch/powerpc/net/bpf_jit64.h
 create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
 create mode 100644 arch/powerpc/net/bpf_jit_comp64.c

-- 
2.8.2



[PATCHv2 5/7] ppc: bpf/jit: A few cleanups

2016-06-22 Thread Naveen N. Rao
1. Per the ISA, ADDIS actually uses RT, rather than RS. Though
the result is the same, make the usage clear.
2. The multiply instruction used is a 32-bit multiply. Rename PPC_MUL()
to PPC_MULW() to make the same clear.
3. PPC_STW[U] take the entire 16-bit immediate value and do not require
word-alignment, per the ISA. Change the macros to use IMM_L().
4. A few white-space cleanups to satisfy checkpatch.pl.

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h  | 13 +++--
 arch/powerpc/net/bpf_jit_comp.c |  8 
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 95d0e38..9041d3f 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -83,7 +83,7 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
  */
 #define IMM_H(i)   ((uintptr_t)(i)>>16)
 #define IMM_HA(i)  (((uintptr_t)(i)>>16) +   \
-(((uintptr_t)(i) & 0x8000) >> 15))
+   (((uintptr_t)(i) & 0x8000) >> 15))
#define IMM_L(i)   ((uintptr_t)(i) & 0xffff)
 
 #define PLANT_INSTR(d, idx, instr)   \
@@ -99,16 +99,16 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_MR(d, a)   PPC_OR(d, a, a)
 #define PPC_LI(r, i)   PPC_ADDI(r, 0, i)
 #define PPC_ADDIS(d, a, i) EMIT(PPC_INST_ADDIS | \
-___PPC_RS(d) | ___PPC_RA(a) | IMM_L(i))
+___PPC_RT(d) | ___PPC_RA(a) | IMM_L(i))
 #define PPC_LIS(r, i)  PPC_ADDIS(r, 0, i)
 #define PPC_STD(r, base, i)EMIT(PPC_INST_STD | ___PPC_RS(r) |\
 ___PPC_RA(base) | ((i) & 0xfffc))
 #define PPC_STDU(r, base, i)   EMIT(PPC_INST_STDU | ___PPC_RS(r) |   \
 ___PPC_RA(base) | ((i) & 0xfffc))
 #define PPC_STW(r, base, i)EMIT(PPC_INST_STW | ___PPC_RS(r) |\
-___PPC_RA(base) | ((i) & 0xfffc))
+___PPC_RA(base) | IMM_L(i))
 #define PPC_STWU(r, base, i)   EMIT(PPC_INST_STWU | ___PPC_RS(r) |   \
-___PPC_RA(base) | ((i) & 0xfffc))
+___PPC_RA(base) | IMM_L(i))
 
 #define PPC_LBZ(r, base, i)EMIT(PPC_INST_LBZ | ___PPC_RT(r) |\
 ___PPC_RA(base) | IMM_L(i))
@@ -174,13 +174,14 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_CMPWI(a, i)EMIT(PPC_INST_CMPWI | ___PPC_RA(a) | 
IMM_L(i))
 #define PPC_CMPDI(a, i)EMIT(PPC_INST_CMPDI | ___PPC_RA(a) | 
IMM_L(i))
 #define PPC_CMPLWI(a, i)   EMIT(PPC_INST_CMPLWI | ___PPC_RA(a) | IMM_L(i))
-#define PPC_CMPLW(a, b)EMIT(PPC_INST_CMPLW | ___PPC_RA(a) | 
___PPC_RB(b))
+#define PPC_CMPLW(a, b)EMIT(PPC_INST_CMPLW | ___PPC_RA(a) |
  \
+   ___PPC_RB(b))
 
 #define PPC_SUB(d, a, b)   EMIT(PPC_INST_SUB | ___PPC_RT(d) |\
 ___PPC_RB(a) | ___PPC_RA(b))
 #define PPC_ADD(d, a, b)   EMIT(PPC_INST_ADD | ___PPC_RT(d) |\
 ___PPC_RA(a) | ___PPC_RB(b))
-#define PPC_MUL(d, a, b)   EMIT(PPC_INST_MULLW | ___PPC_RT(d) |  \
+#define PPC_MULW(d, a, b)  EMIT(PPC_INST_MULLW | ___PPC_RT(d) |  \
 ___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_MULHWU(d, a, b)EMIT(PPC_INST_MULHWU | ___PPC_RT(d) | \
 ___PPC_RA(a) | ___PPC_RB(b))
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 2d66a84..6012aac 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -161,14 +161,14 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
*image,
break;
case BPF_ALU | BPF_MUL | BPF_X: /* A *= X; */
ctx->seen |= SEEN_XREG;
-   PPC_MUL(r_A, r_A, r_X);
+   PPC_MULW(r_A, r_A, r_X);
break;
case BPF_ALU | BPF_MUL | BPF_K: /* A

[PATCHv2 4/7] ppc: bpf/jit: Introduce rotate immediate instructions

2016-06-22 Thread Naveen N. Rao
Since we will be using the rotate immediate instructions for extended
BPF JIT, let's introduce macros for the same. And since the shift
immediate operations use the rotate immediate instructions, let's redo
those macros to use the newly introduced instructions.
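
For example, with these macros a constant 32-bit shift in the JIT code can
be emitted as (sketch; register naming is illustrative):

	/* dst_reg <<= 3: PPC_SLWI() now expands through PPC_RLWINM(),
	 * i.e. rlwinm dst,dst,3,0,28 */
	PPC_SLWI(dst_reg, dst_reg, 3);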

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/ppc-opcode.h |  2 ++
 arch/powerpc/net/bpf_jit.h| 20 +++-
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 1d035c1..fd8d640 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -272,6 +272,8 @@
 #define __PPC_SH(s)__PPC_WS(s)
 #define __PPC_MB(s)(((s) & 0x1f) << 6)
 #define __PPC_ME(s)(((s) & 0x1f) << 1)
+#define __PPC_MB64(s)  (__PPC_MB(s) | ((s) & 0x20))
+#define __PPC_ME64(s)  __PPC_MB64(s)
 #define __PPC_BI(s)(((s) & 0x1f) << 16)
 #define __PPC_CT(t)(((t) & 0x0f) << 21)
 
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 4c1e055..95d0e38 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -210,18 +210,20 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)   EMIT(PPC_INST_SRW | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_RLWINM(d, a, i, mb, me)EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
+   ___PPC_RS(a) | __PPC_SH(i) |  \
+   __PPC_MB(mb) | __PPC_ME(me))
+#define PPC_RLDICR(d, a, i, me)EMIT(PPC_INST_RLDICR | 
___PPC_RA(d) | \
+   ___PPC_RS(a) | __PPC_SH(i) |  \
+   __PPC_ME64(me) | (((i) & 0x20) >> 4))
+
 /* slwi = rlwinm Rx, Ry, n, 0, 31-n */
-#define PPC_SLWI(d, a, i)  EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(i) | \
-__PPC_MB(0) | __PPC_ME(31-(i)))
+#define PPC_SLWI(d, a, i)  PPC_RLWINM(d, a, i, 0, 31-(i))
 /* srwi = rlwinm Rx, Ry, 32-n, n, 31 */
-#define PPC_SRWI(d, a, i)  EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(32-(i)) |\
-__PPC_MB(i) | __PPC_ME(31))
+#define PPC_SRWI(d, a, i)  PPC_RLWINM(d, a, 32-(i), i, 31)
 /* sldi = rldicr Rx, Ry, n, 63-n */
-#define PPC_SLDI(d, a, i)  EMIT(PPC_INST_RLDICR | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(i) | \
-__PPC_MB(63-(i)) | (((i) & 0x20) >> 4))
+#define PPC_SLDI(d, a, i)  PPC_RLDICR(d, a, i, 63-(i))
+
 #define PPC_NEG(d, a)  EMIT(PPC_INST_NEG | ___PPC_RT(d) | ___PPC_RA(a))
 
 /* Long jump; (unconditional 'branch') */
-- 
2.8.2



[PATCHv2 7/7] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-22 Thread Naveen N. Rao
PPC64 eBPF JIT compiler.

Enable with:
echo 1 > /proc/sys/net/core/bpf_jit_enable
or
echo 2 > /proc/sys/net/core/bpf_jit_enable

... to see the generated JIT code. This can further be processed with
tools/net/bpf_jit_disasm.

With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]

... on both ppc64 BE and LE.

The details of the approach are documented through various comments in
the code.

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig  |   3 +-
 arch/powerpc/include/asm/asm-compat.h |   2 +
 arch/powerpc/include/asm/ppc-opcode.h |  20 +-
 arch/powerpc/net/Makefile |   4 +
 arch/powerpc/net/bpf_jit.h|  53 +-
 arch/powerpc/net/bpf_jit64.h  | 102 
 arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
 arch/powerpc/net/bpf_jit_comp64.c | 954 ++
 8 files changed, 1315 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit64.h
 create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
 create mode 100644 arch/powerpc/net/bpf_jit_comp64.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 0a9d439..ee82f9a 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -128,7 +128,8 @@ config PPC
select IRQ_FORCED_THREADING
select HAVE_RCU_TABLE_FREE if SMP
select HAVE_SYSCALL_TRACEPOINTS
-   select HAVE_CBPF_JIT if CPU_BIG_ENDIAN
+   select HAVE_CBPF_JIT if !PPC64
+   select HAVE_EBPF_JIT if PPC64
select HAVE_ARCH_JUMP_LABEL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/powerpc/include/asm/asm-compat.h 
b/arch/powerpc/include/asm/asm-compat.h
index dc85dcb..cee3aa0 100644
--- a/arch/powerpc/include/asm/asm-compat.h
+++ b/arch/powerpc/include/asm/asm-compat.h
@@ -36,11 +36,13 @@
 #define PPC_MIN_STKFRM 112
 
 #ifdef __BIG_ENDIAN__
+#define LHZX_BEstringify_in_c(lhzx)
 #define LWZX_BEstringify_in_c(lwzx)
 #define LDX_BE stringify_in_c(ldx)
 #define STWX_BEstringify_in_c(stwx)
 #define STDX_BEstringify_in_c(stdx)
 #else
+#define LHZX_BEstringify_in_c(lhbrx)
 #define LWZX_BEstringify_in_c(lwbrx)
 #define LDX_BE stringify_in_c(ldbrx)
 #define STWX_BEstringify_in_c(stwbrx)
diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index fd8d640..6a77d130 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -142,9 +142,11 @@
 #define PPC_INST_ISEL  0x7c1e
 #define PPC_INST_ISEL_MASK 0xfc3e
 #define PPC_INST_LDARX 0x7ca8
+#define PPC_INST_STDCX 0x7c0001ad
 #define PPC_INST_LSWI  0x7c0004aa
 #define PPC_INST_LSWX  0x7c00042a
 #define PPC_INST_LWARX 0x7c28
+#define PPC_INST_STWCX 0x7c00012d
 #define PPC_INST_LWSYNC0x7c2004ac
 #define PPC_INST_SYNC  0x7c0004ac
 #define PPC_INST_SYNC_MASK 0xfc0007fe
@@ -211,8 +213,11 @@
 #define PPC_INST_LBZ   0x8800
 #define PPC_INST_LD0xe800
 #define PPC_INST_LHZ   0xa000
-#define PPC_INST_LHBRX 0x7c00062c
 #define PPC_INST_LWZ   0x8000
+#define PPC_INST_LHBRX 0x7c00062c
+#define PPC_INST_LDBRX 0x7c000428
+#define PPC_INST_STB   0x9800
+#define PPC_INST_STH   0xb000
 #define PPC_INST_STD   0xf800
 #define PPC_INST_STDU  0xf801
 #define PPC_INST_STW   0x9000
@@ -221,22 +226,34 @@
 #define PPC_INST_MTLR  0x7c0803a6
 #define PPC_INST_CMPWI 0x2c00
 #define PPC_INST_CMPDI 0x2c20
+#define PPC_INST_CMPW  0x7c00
+#define PPC_INST_CMPD  0x7c20
 #define PPC_INST_CMPLW 0x7c40
+#define PPC_INST_CMPLD 0x7c200040
 #define PPC_INST_CMPLWI0x2800
+#define PPC_INST_CMPLDI0x2820
 #define PPC_INST_ADDI  0x3800
 #define PPC_INST_ADDIS 0x3c00
 #define PPC_INST_ADD   0x7c000214
 #define PPC_INST_SUB   

[PATCHv2 1/7] ppc bpf/jit: Disable classic BPF JIT on ppc64le

2016-06-22 Thread Naveen N. Rao
Classic BPF JIT was never ported completely to work on little endian
powerpc. However, it can be enabled and will crash the system when used.
As such, disable use of BPF JIT on ppc64le.

Cc: sta...@vger.kernel.org
Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Reported-by: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 01f7464..0a9d439 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -128,7 +128,7 @@ config PPC
select IRQ_FORCED_THREADING
select HAVE_RCU_TABLE_FREE if SMP
select HAVE_SYSCALL_TRACEPOINTS
-   select HAVE_CBPF_JIT
+   select HAVE_CBPF_JIT if CPU_BIG_ENDIAN
select HAVE_ARCH_JUMP_LABEL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_HAS_GCOV_PROFILE_ALL
-- 
2.8.2



[PATCHv2 2/7] ppc: bpf/jit: Fix/enhance 32-bit Load Immediate implementation

2016-06-22 Thread Naveen N. Rao
The existing LI32() macro can sometimes result in a sign-extended 32-bit
load that does not clear the top 32-bits properly. As an example,
loading 0x7fffffff results in the register containing
0xffffffff7fffffff. While this does not impact classic BPF JIT
implementation (since that only uses the lower word for all operations),
we would like to share this macro between classic BPF JIT and extended
BPF JIT, wherein the entire 64-bit value in the register matters. Fix
this by first doing a shifted LI followed by ORI.

An additional optimization is for loading values between -32768 and -1,
where we now only need a single LI.

The new implementation now generates the same number of instructions, or fewer.
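
Worked examples of the fixed PPC_LI32() (illustrative, derived from the
macro below):

	/* PPC_LI32(r3, -5)         ->  li  r3,-5                       (single LI)
	 * PPC_LI32(r3, 0x12345678) ->  lis r3,0x1234 ; ori r3,r3,0x5678
	 * PPC_LI32(r3, 0x7fffffff) ->  lis r3,0x7fff ; ori r3,r3,0xffff
	 *                              (upper 32 bits stay clear)
	 */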

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 889fd19..a9882db 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -232,10 +232,17 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 (((cond) & 0x3ff) << 16) |   \
 (((dest) - (ctx->idx * 4)) & \
  0xfffc))
-#define PPC_LI32(d, i) do { PPC_LI(d, IMM_L(i)); \
-   if ((u32)(uintptr_t)(i) >= 32768) {   \
-   PPC_ADDIS(d, d, IMM_HA(i));   \
+/* Sign-extended 32-bit immediate load */
+#define PPC_LI32(d, i) do {  \
+   if ((int)(uintptr_t)(i) >= -32768 &&  \
+   (int)(uintptr_t)(i) < 32768)  \
+   PPC_LI(d, i); \
+   else {\
+   PPC_LIS(d, IMM_H(i)); \
+   if (IMM_L(i)) \
+   PPC_ORI(d, d, IMM_L(i));  \
} } while(0)
+
 #define PPC_LI64(d, i) do {  \
if (!((uintptr_t)(i) & 0xULL))\
PPC_LI32(d, i);   \
-- 
2.8.2



Re: [PATCH] ppc: Fix BPF JIT for ABIv2

2016-06-22 Thread Naveen N. Rao
On 2016/06/22 12:42PM, Naveen N Rao wrote:
> On 2016/06/21 11:47AM, Thadeu Lima de Souza Cascardo wrote:
> > On Tue, Jun 21, 2016 at 09:15:48PM +1000, Michael Ellerman wrote:
> > > On Tue, 2016-06-21 at 14:28 +0530, Naveen N. Rao wrote:
> > > > On 2016/06/20 03:56PM, Thadeu Lima de Souza Cascardo wrote:
> > > > > On Sun, Jun 19, 2016 at 11:19:14PM +0530, Naveen N. Rao wrote:
> > > > > > On 2016/06/17 10:00AM, Thadeu Lima de Souza Cascardo wrote:
> > > > > > > 
> > > > > > > Hi, Michael and Naveen.
> > > > > > > 
> > > > > > > I noticed independently that there is a problem with BPF JIT and 
> > > > > > > ABIv2, and
> > > > > > > worked out the patch below before I noticed Naveen's patchset and 
> > > > > > > the latest
> > > > > > > changes in ppc tree for a better way to check for ABI versions.
> > > > > > > 
> > > > > > > However, since the issue described below affect mainline and 
> > > > > > > stable kernels,
> > > > > > > would you consider applying it before merging your two patchsets, 
> > > > > > > so that we can
> > > > > > > more easily backport the fix?
> > > > > > 
> > > > > > Hi Cascardo,
> > > > > > Given that this has been broken on ABIv2 since forever, I didn't 
> > > > > > bother 
> > > > > > fixing it. But, I can see why this would be a good thing to have 
> > > > > > for 
> > > > > > -stable and existing distros. However, while your patch below may 
> > > > > > fix 
> > > > > > the crash you're seeing on ppc64le, it is not sufficient -- you'll 
> > > > > > need 
> > > > > > changes in bpf_jit_asm.S as well.
> > > > > 
> > > > > Hi, Naveen.
> > > > > 
> > > > > Any tips on how to exercise possible issues there? Or what changes 
> > > > > you think
> > > > > would be sufficient?
> > > > 
> > > > The calling convention is different with ABIv2 and so we'll need 
> > > > changes 
> > > > in bpf_slow_path_common() and sk_negative_common().
> > > 
> > > How big would those changes be? Do we know?

So, this does need quite a few changes:
- the skb helpers need to emit code to setup TOC and the JIT code needs 
  to be updated to setup r12.
- the slow path code needs to be changed to store r3 elsewhere on ABIv2
- the above also means we need to change the stack macros with the 
  proper ABIv2 values
- the little endian support isn't complete as well -- some of the skb 
  helpers are not using byte swap instructions.

As such, I think we should just disable classic JIT on ppc64le.


- Naveen
 



Re: [PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-22 Thread Naveen N. Rao
On 2016/06/22 08:37PM, Michael Ellerman wrote:
> On Tue, 2016-06-07 at 19:02 +0530, Naveen N. Rao wrote:
> 
> > PPC64 eBPF JIT compiler.
> > 
> > Enable with:
> > echo 1 > /proc/sys/net/core/bpf_jit_enable
> > or
> > echo 2 > /proc/sys/net/core/bpf_jit_enable
> > 
> > ... to see the generated JIT code. This can further be processed with
> > tools/net/bpf_jit_disasm.
> > 
> > With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
> > test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]
> > 
> > ... on both ppc64 BE and LE.
> > 
> > The details of the approach are documented through various comments in
> > the code.
> 
> This is crashing for me on a Cell machine, not sure why at a glance:
> 
> 
> test_bpf: #250 JMP_JSET_X: if (0x3 & 0x) return 1 jited:1 14 PASS
> test_bpf: #251 JMP_JA: Jump, gap, jump, ... jited:1 15 PASS
> test_bpf: #252 BPF_MAXINSNS: Maximum possible literals 
> Unable to handle kernel paging request for data at address 0xd7b2
> Faulting instruction address: 0xc0667b6c
> cpu 0x0: Vector: 300 (Data Access) at [c007f83bf3a0]
> pc: c0667b6c: .flush_icache_range+0x3c/0x84
> lr: c0082354: .bpf_int_jit_compile+0x1fc/0x2c8
> sp: c007f83bf620
>msr: 9200b032
>dar: d7b2
>  dsisr: 4000
>   current = 0xc007f8249580
>   paca= 0xcfff softe: 0irq_happened: 0x01
> pid   = 1822, comm = insmod
> Linux version 4.7.0-rc3-00061-g007c99b9d8c1 (mich...@ka3.ozlabs.ibm.com) (gcc 
> version 6.1.0 (GCC) ) #3 SMP Wed Jun 22 19:22:23 AEST 2016
> enter ? for help
> [link register   ] c0082354 .bpf_int_jit_compile+0x1fc/0x2c8
> [c007f83bf620] c00822fc .bpf_int_jit_compile+0x1a4/0x2c8 
> (unreliable)
> [c007f83bf700] c013cda4 .bpf_prog_select_runtime+0x24/0x108
> [c007f83bf780] c0548918 .bpf_prepare_filter+0x9b0/0x9e8
> [c007f83bf830] c05489d4 .bpf_prog_create+0x84/0xd0
> [c007f83bf8c0] d3b21158 .test_bpf_init+0x28c/0x83c [test_bpf]
> [c007f83bfa00] c000a7b4 .do_one_initcall+0x5c/0x1c0
> [c007f83bfae0] c0669058 .do_init_module+0x80/0x21c
> [c007f83bfb80] c011e3a0 .load_module+0x2028/0x23a8
> [c007f83bfd20] c011e898 .SyS_init_module+0x178/0x1b0
> [c007f83bfe30] c0009220 system_call+0x38/0x110
> --- Exception: c01 (System Call) at 0ff5e0c4
> SP (ffde0960) is in userspace
> 0:mon> r
> R00 = c01c   R16 = 
> R01 = c007f83bf620   R17 = 024000c0
> R02 = c094ce00   R18 = 
> R03 = d7b1   R19 = d3c32df0
> R04 = d7b40338   R20 = c072b488

Wow. I can't actually understand why this did not trigger for me. We are 
sending incorrect values into flush_icache_range(). So the first page is 
being flushed properly, but we are faulting trying to access another 
page. Patch forthcoming.
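
For reference, a sketch of the suspected arithmetic (declarations as in
the current JIT; the fix is to make 'image' a byte pointer):

	u32 *image;	/* currently declared as a u32 pointer */

	bpf_flush_icache(bpf_hdr, image + alloclen);
	/* 'image + alloclen' advances by alloclen * sizeof(u32) bytes and
	 * runs past the allocation; with a u8 pointer (or an explicit byte
	 * cast) it advances by alloclen bytes and stays within bounds */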

Thanks,
Naveen
 



Re: [PATCH] ppc: Fix BPF JIT for ABIv2

2016-06-22 Thread Naveen N. Rao
On 2016/06/21 11:47AM, Thadeu Lima de Souza Cascardo wrote:
> On Tue, Jun 21, 2016 at 09:15:48PM +1000, Michael Ellerman wrote:
> > On Tue, 2016-06-21 at 14:28 +0530, Naveen N. Rao wrote:
> > > On 2016/06/20 03:56PM, Thadeu Lima de Souza Cascardo wrote:
> > > > On Sun, Jun 19, 2016 at 11:19:14PM +0530, Naveen N. Rao wrote:
> > > > > On 2016/06/17 10:00AM, Thadeu Lima de Souza Cascardo wrote:
> > > > > > 
> > > > > > Hi, Michael and Naveen.
> > > > > > 
> > > > > > I noticed independently that there is a problem with BPF JIT and 
> > > > > > ABIv2, and
> > > > > > worked out the patch below before I noticed Naveen's patchset and 
> > > > > > the latest
> > > > > > changes in ppc tree for a better way to check for ABI versions.
> > > > > > 
> > > > > > However, since the issue described below affect mainline and stable 
> > > > > > kernels,
> > > > > > would you consider applying it before merging your two patchsets, 
> > > > > > so that we can
> > > > > > more easily backport the fix?
> > > > > 
> > > > > Hi Cascardo,
> > > > > Given that this has been broken on ABIv2 since forever, I didn't 
> > > > > bother 
> > > > > fixing it. But, I can see why this would be a good thing to have for 
> > > > > -stable and existing distros. However, while your patch below may fix 
> > > > > the crash you're seeing on ppc64le, it is not sufficient -- you'll 
> > > > > need 
> > > > > changes in bpf_jit_asm.S as well.
> > > > 
> > > > Hi, Naveen.
> > > > 
> > > > Any tips on how to exercise possible issues there? Or what changes you 
> > > > think
> > > > would be sufficient?
> > > 
> > > The calling convention is different with ABIv2 and so we'll need changes 
> > > in bpf_slow_path_common() and sk_negative_common().
> > 
> > How big would those changes be? Do we know?

I don't think it'd be that much -- I will take a stab at this today.

> > 
> > How come no one reported this was broken previously? This is the first I've
> > heard of it being broken.
> > 
> 
> I just heard of it less than two weeks ago, and only could investigate it last
> week, when I realized mainline was also affected.
> 
> It looks like the little-endian support for classic JIT were done before the
> conversion to ABIv2. And as JIT is disabled by default, no one seems to have
> exercised it.

Yes, my thoughts too. I didn't previously think much about this as JIT 
wouldn't be enabled by default. It's interesting though that no one else 
reported this as an issue before.

> 
> > > However, rather than enabling classic JIT for ppc64le, are we better off 
> > > just disabling it?
> > > 
> > > --- a/arch/powerpc/Kconfig
> > > +++ b/arch/powerpc/Kconfig
> > > @@ -128,7 +128,7 @@ config PPC
> > > select IRQ_FORCED_THREADING
> > > select HAVE_RCU_TABLE_FREE if SMP
> > > select HAVE_SYSCALL_TRACEPOINTS
> > > -   select HAVE_CBPF_JIT
> > > +   select HAVE_CBPF_JIT if CPU_BIG_ENDIAN
> > > select HAVE_ARCH_JUMP_LABEL
> > > select ARCH_HAVE_NMI_SAFE_CMPXCHG
> > > select ARCH_HAS_GCOV_PROFILE_ALL
> > > 
> > > 
> > > Michael,
> > > Let me know your thoughts on whether you intend to take this patch or 
> > > Cascardo's patch for -stable before the eBPF patches. I can redo my 
> > > patches accordingly.
> > 
> > This patch sounds like the best option at the moment for something we can
> > backport. Unless the changes to fix it are minimal.

Right -- I will take a look today to see what changes would be needed.

- Naveen



Re: [6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-22 Thread Naveen N. Rao
On 2016/06/21 09:04PM, Michael Ellerman wrote:
> On Tue, 2016-06-21 at 12:28 +0530, Naveen N. Rao wrote:
> > On 2016/06/21 09:38AM, Michael Ellerman wrote:
> > > On Sun, 2016-06-19 at 23:06 +0530, Naveen N. Rao wrote:
> > > > 
> > > > #include <asm/kprobes.h>
> > > > 
> > > > in bpf_jit_comp64.c
> > > > 
> > > > Can you please check if it resolves the build error?
> > > 
> > > Can you? :D
> > 
> > :)
> > Sorry, I should have explained myself better. I did actually try your 
> > config and I was able to reproduce the build error. After the above 
> > #include, that error went away, but I saw some vdso related errors. I 
> > thought I was doing something wrong and needed a different setup for 
> > that particular kernel config, which is why I requested your help in the 
> > matter. I just didn't do a good job of putting across that message...
> 
> Ah OK. Not sure why you're seeing VDSO errors?

'Cause I wasn't paying attention. I tried your .config on a LE machine.  
It works fine on BE, as it should.

> 
> > Note to self: randconfig builds *and* more time drafting emails :)
> 
> No stress. You don't need to do randconfig builds, or even build all the
> arch/powerpc/ configs, just try to do a reasonable set, something like - 
> ppc64,
> powernv, pseries, pmac32, ppc64e.

Ok, will do.

> 
> I'm happy to catch the esoteric build failures.
> 
> > Do you want me to respin the patches?
> 
> No that's fine, I'll fix it up here.

Thanks,
Naveen



Re: [PATCH] ppc: Fix BPF JIT for ABIv2

2016-06-21 Thread Naveen N. Rao
On 2016/06/20 03:56PM, Thadeu Lima de Souza Cascardo wrote:
> On Sun, Jun 19, 2016 at 11:19:14PM +0530, Naveen N. Rao wrote:
> > On 2016/06/17 10:00AM, Thadeu Lima de Souza Cascardo wrote:
> > > 
> > > Hi, Michael and Naveen.
> > > 
> > > I noticed independently that there is a problem with BPF JIT and ABIv2, 
> > > and
> > > worked out the patch below before I noticed Naveen's patchset and the 
> > > latest
> > > changes in ppc tree for a better way to check for ABI versions.
> > > 
> > > However, since the issue described below affects mainline and stable 
> > > kernels,
> > > would you consider applying it before merging your two patchsets, so that 
> > > we can
> > > more easily backport the fix?
> > 
> > Hi Cascardo,
> > Given that this has been broken on ABIv2 since forever, I didn't bother 
> > fixing it. But, I can see why this would be a good thing to have for 
> > -stable and existing distros. However, while your patch below may fix 
> > the crash you're seeing on ppc64le, it is not sufficient -- you'll need 
> > changes in bpf_jit_asm.S as well.
> 
> Hi, Naveen.
> 
> Any tips on how to exercise possible issues there? Or what changes you think
> would be sufficient?

The calling convention is different with ABIv2 and so we'll need changes 
in bpf_slow_path_common() and sk_negative_common().
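
To give a rough idea of what is involved (a sketch only -- the constant
name below is indicative, not an actual kernel macro): the two ABIs
differ in the minimum stack frame a caller must set up, and in
particular ABIv2 drops the mandatory parameter save area that the ABIv1
frame layout in bpf_jit_asm.S currently assumes:

	/* illustrative only; see asm/ptrace.h and bpf_jit.h for the real constants */
	#if defined(_CALL_ELF) && _CALL_ELF == 2
	#define MIN_CALLER_FRAME	32	/* ELFv2: no parameter save area required */
	#else
	#define MIN_CALLER_FRAME	112	/* ELFv1: 48-byte fixed area + 64-byte parameter save area */
	#endif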

However, rather than enabling classic JIT for ppc64le, are we better off 
just disabling it?

--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -128,7 +128,7 @@ config PPC
select IRQ_FORCED_THREADING
select HAVE_RCU_TABLE_FREE if SMP
select HAVE_SYSCALL_TRACEPOINTS
-   select HAVE_CBPF_JIT
+   select HAVE_CBPF_JIT if CPU_BIG_ENDIAN
select HAVE_ARCH_JUMP_LABEL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_HAS_GCOV_PROFILE_ALL


Michael,
Let me know your thoughts on whether you intend to take this patch or 
Cascardo's patch for -stable before the eBPF patches. I can redo my 
patches accordingly.


- Naveen



Re: [6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-21 Thread Naveen N. Rao
On 2016/06/21 09:38AM, Michael Ellerman wrote:
> On Sun, 2016-06-19 at 23:06 +0530, Naveen N. Rao wrote:
> > On 2016/06/17 10:53PM, Michael Ellerman wrote:
> > > On Tue, 2016-07-06 at 13:32:23 UTC, "Naveen N. Rao" wrote:
> > > > diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
> > > > b/arch/powerpc/net/bpf_jit_comp64.c
> > > > new file mode 100644
> > > > index 000..954ff53
> > > > --- /dev/null
> > > > +++ b/arch/powerpc/net/bpf_jit_comp64.c
> > > > @@ -0,0 +1,956 @@
> > > ...
> 
> > > > +
> > > > +static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
> > > > +{
> > > > +   int *p = area;
> > > > +
> > > > +   /* Fill whole space with trap instructions */
> > > > +   while (p < (int *)((char *)area + size))
> > > > +   *p++ = BREAKPOINT_INSTRUCTION;
> > > > +}
> > > 
> > > This breaks the build for some configs, presumably you're missing a 
> > > header:
> > > 
> > >   arch/powerpc/net/bpf_jit_comp64.c:30:10: error: 
> > > 'BREAKPOINT_INSTRUCTION' undeclared (first use in this function)
> > > 
> > > http://kisskb.ellerman.id.au/kisskb/buildresult/12720611/
> > 
> > Oops. Yes, I should have caught that. I need to add:
> > 
> > #include 
> > 
> > in bpf_jit_comp64.c
> > 
> > Can you please check if it resolves the build error?
> 
> Can you? :D

:)
Sorry, I should have explained myself better. I did actually try your 
config and I was able to reproduce the build error. After the above 
#include, that error went away, but I saw some vdso related errors. I 
thought I was doing something wrong and needed a different setup for 
that particular kernel config, which is why I requested your help in the 
matter. I just didn't do a good job of putting across that message...

Note to self: randconfig builds *and* more time drafting emails :)

Do you want me to respin the patches?


Thanks,
Naveen



Re: [PATCH] ppc: Fix BPF JIT for ABIv2

2016-06-19 Thread Naveen N. Rao
On 2016/06/17 10:00AM, Thadeu Lima de Souza Cascardo wrote:
> On Fri, Jun 17, 2016 at 10:53:21PM +1000, Michael Ellerman wrote:
> > On Tue, 2016-07-06 at 13:32:23 UTC, "Naveen N. Rao" wrote:
> > > diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
> > > b/arch/powerpc/net/bpf_jit_comp64.c
> > > new file mode 100644
> > > index 000..954ff53
> > > --- /dev/null
> > > +++ b/arch/powerpc/net/bpf_jit_comp64.c
> > > @@ -0,0 +1,956 @@
> > ...
> > > +
> > > +static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
> > > +{
> > > + int *p = area;
> > > +
> > > + /* Fill whole space with trap instructions */
> > > + while (p < (int *)((char *)area + size))
> > > + *p++ = BREAKPOINT_INSTRUCTION;
> > > +}
> > 
> > This breaks the build for some configs, presumably you're missing a header:
> > 
> >   arch/powerpc/net/bpf_jit_comp64.c:30:10: error: 'BREAKPOINT_INSTRUCTION' 
> > undeclared (first use in this function)
> > 
> > http://kisskb.ellerman.id.au/kisskb/buildresult/12720611/
> > 
> > cheers
> 
> Hi, Michael and Naveen.
> 
> I noticed independently that there is a problem with BPF JIT and ABIv2, and
> worked out the patch below before I noticed Naveen's patchset and the latest
> changes in ppc tree for a better way to check for ABI versions.
> 
> However, since the issue described below affects mainline and stable kernels,
> would you consider applying it before merging your two patchsets, so that we 
> can
> more easily backport the fix?

Hi Cascardo,
Given that this has been broken on ABIv2 since forever, I didn't bother 
fixing it. But, I can see why this would be a good thing to have for 
-stable and existing distros. However, while your patch below may fix 
the crash you're seeing on ppc64le, it is not sufficient -- you'll need 
changes in bpf_jit_asm.S as well.

Regards,
Naveen



Re: [6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-19 Thread Naveen N. Rao
On 2016/06/17 10:53PM, Michael Ellerman wrote:
> On Tue, 2016-07-06 at 13:32:23 UTC, "Naveen N. Rao" wrote:
> > diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
> > b/arch/powerpc/net/bpf_jit_comp64.c
> > new file mode 100644
> > index 000..954ff53
> > --- /dev/null
> > +++ b/arch/powerpc/net/bpf_jit_comp64.c
> > @@ -0,0 +1,956 @@
> ...
> > +
> > +static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
> > +{
> > +   int *p = area;
> > +
> > +   /* Fill whole space with trap instructions */
> > +   while (p < (int *)((char *)area + size))
> > +   *p++ = BREAKPOINT_INSTRUCTION;
> > +}
> 
> This breaks the build for some configs, presumably you're missing a header:
> 
>   arch/powerpc/net/bpf_jit_comp64.c:30:10: error: 'BREAKPOINT_INSTRUCTION' 
> undeclared (first use in this function)
> 
> http://kisskb.ellerman.id.au/kisskb/buildresult/12720611/

Oops. Yes, I should have caught that. I need to add:

#include 

in bpf_jit_comp64.c

Can you please check if it resolves the build error?

Regards,
Naveen



Re: [PATCH 0/6] eBPF JIT for PPC64

2016-06-12 Thread Naveen N. Rao
On 2016/06/10 10:47PM, David Miller wrote:
> From: "Naveen N. Rao" <naveen.n@linux.vnet.ibm.com>
> Date: Tue,  7 Jun 2016 19:02:17 +0530
> 
> > Please note that patch [2] is a pre-requisite for this patchset, and is
> > not yet upstream.
>  ...
> > [1] http://thread.gmane.org/gmane.linux.kernel/2188694
> > [2] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/96514
> 
> Because of #2 I don't think I can take this directly into the networking
> tree, right?
> 
> Therefore, how would you like this to be merged?

Hi David,
Thanks for asking. Yes, I think it is better to take this through the 
powerpc tree as all the changes are contained within arch/powerpc, 
unless Michael Ellerman feels differently.

Michael?


Regards,
Naveen



Re: [PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-09 Thread Naveen N. Rao
On 2016/06/08 10:19PM, Nilay Vaish wrote:
> Naveen, can you point out where in the patch you update the variable:
> idx, a member of the codegen_context structure?  Somehow I am unable to
> figure it out.  I can only see that we set it to 0 in the
> bpf_int_jit_compile function.  Since all your test cases pass, I am
> clearly overlooking something.

Yes, that's being done in bpf_jit.h (see the earlier patches in the 
series). All the PPC_*() instruction macros are defined to EMIT() the 
respective powerpc instruction encoding.  EMIT() translates to 
PLANT_INSTR(), which actually increments idx.
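
Roughly (simplified; see bpf_jit.h for the exact definitions):

#define PLANT_INSTR(d, idx, instr)				      \
	do { if (d) { (d)[idx] = instr; } idx++; } while (0)
#define EMIT(instr)	PLANT_INSTR(image, ctx->idx, instr)

So the sizing pass simply runs with a NULL image to advance idx and
compute the program length, and a later pass writes the instructions
out for real.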

- Naveen



Re: [PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-08 Thread Naveen N. Rao
On 2016/06/07 03:56PM, Alexei Starovoitov wrote:
> On Tue, Jun 07, 2016 at 07:02:23PM +0530, Naveen N. Rao wrote:
> > PPC64 eBPF JIT compiler.
> > 
> > Enable with:
> > echo 1 > /proc/sys/net/core/bpf_jit_enable
> > or
> > echo 2 > /proc/sys/net/core/bpf_jit_enable
> > 
> > ... to see the generated JIT code. This can further be processed with
> > tools/net/bpf_jit_disasm.
> > 
> > With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
> > test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]
> > 
> > ... on both ppc64 BE and LE.
> 
> Nice. That's even better than on x64 which cannot jit one test:
> test_bpf: #262 BPF_MAXINSNS: Jump, gap, jump, ... jited:0 168 PASS
> which was designed specifically to hit x64 jit pass limit.
> ppc jit has a predictable number of passes and doesn't have this problem
> as expected. Great.

Yes, that's thanks to the clever handling of conditional branches by 
Matt -- we always emit two instructions for this reason (encoded in the 
PPC_BCC() macro).
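
Something like this, in case it helps (a simplified sketch of the idea,
not a verbatim copy of bpf_jit.h):

#define PPC_BCC(cond, dest)	do {				      \
	if (is_nearbranch((dest) - (ctx->idx * 4))) {		      \
		PPC_BCC_SHORT(cond, dest);	/* bc straight to target */   \
		PPC_NOP();					      \
	} else {	/* invert the condition and jump around */	      \
		PPC_BCC_SHORT(cond ^ COND_CMP_TRUE, (ctx->idx + 2) * 4);     \
		PPC_JMP(dest);					      \
	} } while (0)

Either way exactly two instruction slots are consumed, so branch
offsets (and hence the image size) stay stable across passes.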

> 
> > The details of the approach are documented through various comments in
> > the code.
> > 
> > Cc: Matt Evans <m...@ozlabs.org>
> > Cc: Denis Kirjanov <k...@linux-powerpc.org>
> > Cc: Michael Ellerman <m...@ellerman.id.au>
> > Cc: Paul Mackerras <pau...@samba.org>
> > Cc: Alexei Starovoitov <a...@fb.com>
> > Cc: Daniel Borkmann <dan...@iogearbox.net>
> > Cc: "David S. Miller" <da...@davemloft.net>
> > Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
> > Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/Kconfig  |   3 +-
> >  arch/powerpc/include/asm/asm-compat.h |   2 +
> >  arch/powerpc/include/asm/ppc-opcode.h |  20 +-
> >  arch/powerpc/net/Makefile |   4 +
> >  arch/powerpc/net/bpf_jit.h|  53 +-
> >  arch/powerpc/net/bpf_jit64.h  | 102 
> >  arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
> >  arch/powerpc/net/bpf_jit_comp64.c | 956 
> > ++
> >  8 files changed, 1317 insertions(+), 3 deletions(-)
> >  create mode 100644 arch/powerpc/net/bpf_jit64.h
> >  create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
> >  create mode 100644 arch/powerpc/net/bpf_jit_comp64.c
> 
> don't see any issues with the code.
> Thank you for working on this.
> 
> Acked-by: Alexei Starovoitov <a...@kernel.org>

Thanks, Alexei!


Regards,
Naveen



[PATCH 3/6] ppc: bpf/jit: Introduce rotate immediate instructions

2016-06-07 Thread Naveen N. Rao
Since we will be using the rotate immediate instructions for extended
BPF JIT, let's introduce macros for the same. And since the shift
immediate operations use the rotate immediate instructions, let's redo
those macros to use the newly introduced instructions.

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/ppc-opcode.h |  2 ++
 arch/powerpc/net/bpf_jit.h| 20 +++-
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 1d035c1..fd8d640 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -272,6 +272,8 @@
 #define __PPC_SH(s)__PPC_WS(s)
 #define __PPC_MB(s)(((s) & 0x1f) << 6)
 #define __PPC_ME(s)(((s) & 0x1f) << 1)
+#define __PPC_MB64(s)  (__PPC_MB(s) | ((s) & 0x20))
+#define __PPC_ME64(s)  __PPC_MB64(s)
 #define __PPC_BI(s)(((s) & 0x1f) << 16)
 #define __PPC_CT(t)(((t) & 0x0f) << 21)
 
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 4c1e055..95d0e38 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -210,18 +210,20 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)   EMIT(PPC_INST_SRW | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_RLWINM(d, a, i, mb, me)EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
+   ___PPC_RS(a) | __PPC_SH(i) |  \
+   __PPC_MB(mb) | __PPC_ME(me))
+#define PPC_RLDICR(d, a, i, me)EMIT(PPC_INST_RLDICR | 
___PPC_RA(d) | \
+   ___PPC_RS(a) | __PPC_SH(i) |  \
+   __PPC_ME64(me) | (((i) & 0x20) >> 4))
+
 /* slwi = rlwinm Rx, Ry, n, 0, 31-n */
-#define PPC_SLWI(d, a, i)  EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(i) | \
-__PPC_MB(0) | __PPC_ME(31-(i)))
+#define PPC_SLWI(d, a, i)  PPC_RLWINM(d, a, i, 0, 31-(i))
 /* srwi = rlwinm Rx, Ry, 32-n, n, 31 */
-#define PPC_SRWI(d, a, i)  EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(32-(i)) |\
-__PPC_MB(i) | __PPC_ME(31))
+#define PPC_SRWI(d, a, i)  PPC_RLWINM(d, a, 32-(i), i, 31)
 /* sldi = rldicr Rx, Ry, n, 63-n */
-#define PPC_SLDI(d, a, i)  EMIT(PPC_INST_RLDICR | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(i) | \
-__PPC_MB(63-(i)) | (((i) & 0x20) >> 4))
+#define PPC_SLDI(d, a, i)  PPC_RLDICR(d, a, i, 63-(i))
+
 #define PPC_NEG(d, a)  EMIT(PPC_INST_NEG | ___PPC_RT(d) | ___PPC_RA(a))
 
 /* Long jump; (unconditional 'branch') */
-- 
2.8.2



[PATCH 4/6] ppc: bpf/jit: A few cleanups

2016-06-07 Thread Naveen N. Rao
1. Per the ISA, ADDIS actually uses RT, rather than RS. Though
the result is the same, make the usage clear.
2. The multiply instruction used is a 32-bit multiply. Rename PPC_MUL()
to PPC_MULW() to make the same clear.
3. PPC_STW[U] take the entire 16-bit immediate value and do not require
word-alignment, per the ISA. Change the macros to use IMM_L().
4. A few white-space cleanups to satisfy checkpatch.pl.

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h  | 13 +++--
 arch/powerpc/net/bpf_jit_comp.c |  8 
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 95d0e38..9041d3f 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -83,7 +83,7 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
  */
 #define IMM_H(i)   ((uintptr_t)(i)>>16)
 #define IMM_HA(i)  (((uintptr_t)(i)>>16) +   \
-(((uintptr_t)(i) & 0x8000) >> 15))
+   (((uintptr_t)(i) & 0x8000) >> 15))
 #define IMM_L(i)   ((uintptr_t)(i) & 0x)
 
 #define PLANT_INSTR(d, idx, instr)   \
@@ -99,16 +99,16 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_MR(d, a)   PPC_OR(d, a, a)
 #define PPC_LI(r, i)   PPC_ADDI(r, 0, i)
 #define PPC_ADDIS(d, a, i) EMIT(PPC_INST_ADDIS | \
-___PPC_RS(d) | ___PPC_RA(a) | IMM_L(i))
+___PPC_RT(d) | ___PPC_RA(a) | IMM_L(i))
 #define PPC_LIS(r, i)  PPC_ADDIS(r, 0, i)
 #define PPC_STD(r, base, i)EMIT(PPC_INST_STD | ___PPC_RS(r) |\
 ___PPC_RA(base) | ((i) & 0xfffc))
 #define PPC_STDU(r, base, i)   EMIT(PPC_INST_STDU | ___PPC_RS(r) |   \
 ___PPC_RA(base) | ((i) & 0xfffc))
 #define PPC_STW(r, base, i)EMIT(PPC_INST_STW | ___PPC_RS(r) |\
-___PPC_RA(base) | ((i) & 0xfffc))
+___PPC_RA(base) | IMM_L(i))
 #define PPC_STWU(r, base, i)   EMIT(PPC_INST_STWU | ___PPC_RS(r) |   \
-___PPC_RA(base) | ((i) & 0xfffc))
+___PPC_RA(base) | IMM_L(i))
 
 #define PPC_LBZ(r, base, i)EMIT(PPC_INST_LBZ | ___PPC_RT(r) |\
 ___PPC_RA(base) | IMM_L(i))
@@ -174,13 +174,14 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_CMPWI(a, i)EMIT(PPC_INST_CMPWI | ___PPC_RA(a) | 
IMM_L(i))
 #define PPC_CMPDI(a, i)EMIT(PPC_INST_CMPDI | ___PPC_RA(a) | 
IMM_L(i))
 #define PPC_CMPLWI(a, i)   EMIT(PPC_INST_CMPLWI | ___PPC_RA(a) | IMM_L(i))
-#define PPC_CMPLW(a, b)EMIT(PPC_INST_CMPLW | ___PPC_RA(a) | 
___PPC_RB(b))
+#define PPC_CMPLW(a, b)EMIT(PPC_INST_CMPLW | ___PPC_RA(a) |
  \
+   ___PPC_RB(b))
 
 #define PPC_SUB(d, a, b)   EMIT(PPC_INST_SUB | ___PPC_RT(d) |\
 ___PPC_RB(a) | ___PPC_RA(b))
 #define PPC_ADD(d, a, b)   EMIT(PPC_INST_ADD | ___PPC_RT(d) |\
 ___PPC_RA(a) | ___PPC_RB(b))
-#define PPC_MUL(d, a, b)   EMIT(PPC_INST_MULLW | ___PPC_RT(d) |  \
+#define PPC_MULW(d, a, b)  EMIT(PPC_INST_MULLW | ___PPC_RT(d) |  \
 ___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_MULHWU(d, a, b)EMIT(PPC_INST_MULHWU | ___PPC_RT(d) | \
 ___PPC_RA(a) | ___PPC_RB(b))
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 2d66a84..6012aac 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -161,14 +161,14 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
*image,
break;
case BPF_ALU | BPF_MUL | BPF_X: /* A *= X; */
ctx->seen |= SEEN_XREG;
-   PPC_MUL(r_A, r_A, r_X);
+   PPC_MULW(r_A, r_A, r_X);
break;
case BPF_ALU | BPF_MUL | BPF_K: /* A *= K */
if (K < 32768)
PPC_MULI(r_A, r_A, K);
el

[PATCH 0/6] eBPF JIT for PPC64

2016-06-07 Thread Naveen N. Rao
Implement extended BPF JIT for ppc64. We retain the classic BPF JIT for
ppc32 and move ppc64 BE/LE to use the new JIT. Classic BPF filters will
be converted to extended BPF (see convert_filter()) and JIT'ed with the
new compiler.

Most of the existing macros are retained and fixed/enhanced where
appropriate. Patches 1-4 are geared towards this.

Patch 5 breaks out the classic BPF JIT specifics into a separate
bpf_jit32.h header file, while retaining all the generic instruction
macros in bpf_jit.h.

Patch 6 implements eBPF JIT for ppc64.

Since the RFC patchset [1], the powerpc JIT has gained support for skb
access helpers and now passes all tests in test_bpf.ko. Review comments
on the RFC patches have been addressed (use of an ABI macro [2] and use
of bpf_jit_binary_alloc()), along with a few other generic fixes and
updates.

Prominent TODOs:
 - implement BPF tail calls
 - support for BPF constant blinding

Please note that patch [2] is a pre-requisite for this patchset, and is
not yet upstream.


- Naveen

[1] http://thread.gmane.org/gmane.linux.kernel/2188694
[2] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/96514


Naveen N. Rao (6):
  ppc: bpf/jit: Fix/enhance 32-bit Load Immediate implementation
  ppc: bpf/jit: Optimize 64-bit Immediate loads
  ppc: bpf/jit: Introduce rotate immediate instructions
  ppc: bpf/jit: A few cleanups
  ppc: bpf/jit: Isolate classic BPF JIT specifics into a separate header
  ppc: ebpf/jit: Implement JIT compiler for extended BPF

 arch/powerpc/Kconfig  |   3 +-
 arch/powerpc/include/asm/asm-compat.h |   2 +
 arch/powerpc/include/asm/ppc-opcode.h |  22 +-
 arch/powerpc/net/Makefile |   4 +
 arch/powerpc/net/bpf_jit.h| 235 -
 arch/powerpc/net/bpf_jit32.h  | 139 +
 arch/powerpc/net/bpf_jit64.h  | 102 
 arch/powerpc/net/bpf_jit_asm.S|   2 +-
 arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
 arch/powerpc/net/bpf_jit_comp.c   |  10 +-
 arch/powerpc/net/bpf_jit_comp64.c | 956 ++
 11 files changed, 1504 insertions(+), 151 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit32.h
 create mode 100644 arch/powerpc/net/bpf_jit64.h
 create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
 create mode 100644 arch/powerpc/net/bpf_jit_comp64.c

-- 
2.8.2



[PATCH 1/6] ppc: bpf/jit: Fix/enhance 32-bit Load Immediate implementation

2016-06-07 Thread Naveen N. Rao
The existing LI32() macro can sometimes result in a sign-extended 32-bit
load that does not clear the top 32 bits properly. As an example,
loading 0x7fffffff results in the register containing
0xffffffff7fffffff. While this does not impact the classic BPF JIT
implementation (since that only uses the lower word for all operations),
we would like to share this macro between the classic BPF JIT and the
extended BPF JIT, wherein the entire 64-bit value in the register
matters. Fix this by first doing a shifted LI followed by ORI.

An additional optimization is for loading values between -32768 and -1,
where we now only need a single LI.

The new implementation generates the same number of instructions as
before, or fewer.
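
For instance, with the old macro, loading 0x7fffffff emitted (rX below
is just a placeholder register):

	li	rX, -1		# rX = 0xffffffffffffffff
	addis	rX, rX, 0x8000	# rX = 0xffffffff7fffffff

whereas the new sequence is:

	lis	rX, 0x7fff	# rX = 0x000000007fff0000
	ori	rX, rX, 0xffff	# rX = 0x000000007fffffff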

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 889fd19..a9882db 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -232,10 +232,17 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 (((cond) & 0x3ff) << 16) |   \
 (((dest) - (ctx->idx * 4)) & \
  0xfffc))
-#define PPC_LI32(d, i) do { PPC_LI(d, IMM_L(i)); \
-   if ((u32)(uintptr_t)(i) >= 32768) {   \
-   PPC_ADDIS(d, d, IMM_HA(i));   \
+/* Sign-extended 32-bit immediate load */
+#define PPC_LI32(d, i) do {  \
+   if ((int)(uintptr_t)(i) >= -32768 &&  \
+   (int)(uintptr_t)(i) < 32768)  \
+   PPC_LI(d, i); \
+   else {\
+   PPC_LIS(d, IMM_H(i)); \
+   if (IMM_L(i)) \
+   PPC_ORI(d, d, IMM_L(i));  \
} } while(0)
+
 #define PPC_LI64(d, i) do {  \
if (!((uintptr_t)(i) & 0xULL))\
PPC_LI32(d, i);   \
-- 
2.8.2



[PATCH 5/6] ppc: bpf/jit: Isolate classic BPF JIT specifics into a separate header

2016-06-07 Thread Naveen N. Rao
Break out classic BPF JIT specifics into a separate header in
preparation for eBPF JIT implementation. Note that ppc32 will still need
the classic BPF JIT.

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h  | 121 +-
 arch/powerpc/net/bpf_jit32.h| 139 
 arch/powerpc/net/bpf_jit_asm.S  |   2 +-
 arch/powerpc/net/bpf_jit_comp.c |   2 +-
 4 files changed, 143 insertions(+), 121 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit32.h

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 9041d3f..313cfaf 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -1,4 +1,5 @@
-/* bpf_jit.h: BPF JIT compiler for PPC64
+/*
+ * bpf_jit.h: BPF JIT compiler for PPC
  *
  * Copyright 2011 Matt Evans <m...@ozlabs.org>, IBM Corporation
  *
@@ -10,66 +11,8 @@
 #ifndef _BPF_JIT_H
 #define _BPF_JIT_H
 
-#ifdef CONFIG_PPC64
-#define BPF_PPC_STACK_R3_OFF   48
-#define BPF_PPC_STACK_LOCALS   32
-#define BPF_PPC_STACK_BASIC(48+64)
-#define BPF_PPC_STACK_SAVE (18*8)
-#define BPF_PPC_STACKFRAME (BPF_PPC_STACK_BASIC+BPF_PPC_STACK_LOCALS+ \
-BPF_PPC_STACK_SAVE)
-#define BPF_PPC_SLOWPATH_FRAME (48+64)
-#else
-#define BPF_PPC_STACK_R3_OFF   24
-#define BPF_PPC_STACK_LOCALS   16
-#define BPF_PPC_STACK_BASIC(24+32)
-#define BPF_PPC_STACK_SAVE (18*4)
-#define BPF_PPC_STACKFRAME (BPF_PPC_STACK_BASIC+BPF_PPC_STACK_LOCALS+ \
-BPF_PPC_STACK_SAVE)
-#define BPF_PPC_SLOWPATH_FRAME (24+32)
-#endif
-
-#define REG_SZ (BITS_PER_LONG/8)
-
-/*
- * Generated code register usage:
- *
- * As normal PPC C ABI (e.g. r1=sp, r2=TOC), with:
- *
- * skb r3  (Entry parameter)
- * A register  r4
- * X register  r5
- * addr param  r6
- * r7-r10  scratch
- * skb->data   r14
- * skb headlen r15 (skb->len - skb->data_len)
- * m[0]r16
- * m[...]  ...
- * m[15]   r31
- */
-#define r_skb  3
-#define r_ret  3
-#define r_A4
-#define r_X5
-#define r_addr 6
-#define r_scratch1 7
-#define r_scratch2 8
-#define r_D14
-#define r_HL   15
-#define r_M16
-
 #ifndef __ASSEMBLY__
 
-/*
- * Assembly helpers from arch/powerpc/net/bpf_jit.S:
- */
-#define DECLARE_LOAD_FUNC(func)\
-   extern u8 func[], func##_negative_offset[], func##_positive_offset[]
-
-DECLARE_LOAD_FUNC(sk_load_word);
-DECLARE_LOAD_FUNC(sk_load_half);
-DECLARE_LOAD_FUNC(sk_load_byte);
-DECLARE_LOAD_FUNC(sk_load_byte_msh);
-
 #ifdef CONFIG_PPC64
 #define FUNCTION_DESCR_SIZE24
 #else
@@ -131,46 +74,6 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_BPF_STLU(r, base, i) do { PPC_STWU(r, base, i); } while(0)
 #endif
 
-/* Convenience helpers for the above with 'far' offsets: */
-#define PPC_LBZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LBZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LBZ(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LD_OFFS(r, base, i) do { if ((i) < 32768) PPC_LD(r, base, i); \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LD(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LWZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LWZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LWZ(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LHZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LHZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LHZ(r, r, IMM_L(i)); } } while(0)
-
-#ifdef CONFIG_PPC64
-#define PPC_LL_OFFS(r, base, i) do { PPC_LD_OFFS(r, base, i); } while(0)
-#else
-#define PPC_LL_OFFS(r, base, i) do { PPC_LWZ_OFFS(r, base, i); } while(0)
-#endif
-
-#ifdef CONFIG_SMP
-#ifdef CONFIG_PPC64
-#define PPC_BPF_LOAD_CPU(r)\
-   do { BUILD_BUG_ON(FIELD_SIZEOF(struct paca_struct, paca_index) != 2);   
\
-   PPC_LHZ_OFFS(r, 13, offsetof(struct paca_struct, paca_index));  
\
-   } while (0)
-#else
-#define PPC_BPF_LOAD_CPU(r) \
-   do { BUILD_BUG_ON(FIELD_SIZEOF(struct thread_info, cpu) != 4);  
\
-   PPC_LHZ_OFFS(r, (1 & ~(THREAD_SIZE - 1)),   
   

[PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-07 Thread Naveen N. Rao
PPC64 eBPF JIT compiler.

Enable with:
echo 1 > /proc/sys/net/core/bpf_jit_enable
or
echo 2 > /proc/sys/net/core/bpf_jit_enable

... to see the generated JIT code. This can further be processed with
tools/net/bpf_jit_disasm.

With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]

... on both ppc64 BE and LE.

The details of the approach are documented through various comments in
the code.

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig  |   3 +-
 arch/powerpc/include/asm/asm-compat.h |   2 +
 arch/powerpc/include/asm/ppc-opcode.h |  20 +-
 arch/powerpc/net/Makefile |   4 +
 arch/powerpc/net/bpf_jit.h|  53 +-
 arch/powerpc/net/bpf_jit64.h  | 102 
 arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
 arch/powerpc/net/bpf_jit_comp64.c | 956 ++
 8 files changed, 1317 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit64.h
 create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
 create mode 100644 arch/powerpc/net/bpf_jit_comp64.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 01f7464..ee82f9a 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -128,7 +128,8 @@ config PPC
select IRQ_FORCED_THREADING
select HAVE_RCU_TABLE_FREE if SMP
select HAVE_SYSCALL_TRACEPOINTS
-   select HAVE_CBPF_JIT
+   select HAVE_CBPF_JIT if !PPC64
+   select HAVE_EBPF_JIT if PPC64
select HAVE_ARCH_JUMP_LABEL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/powerpc/include/asm/asm-compat.h 
b/arch/powerpc/include/asm/asm-compat.h
index dc85dcb..cee3aa0 100644
--- a/arch/powerpc/include/asm/asm-compat.h
+++ b/arch/powerpc/include/asm/asm-compat.h
@@ -36,11 +36,13 @@
 #define PPC_MIN_STKFRM 112
 
 #ifdef __BIG_ENDIAN__
+#define LHZX_BEstringify_in_c(lhzx)
 #define LWZX_BEstringify_in_c(lwzx)
 #define LDX_BE stringify_in_c(ldx)
 #define STWX_BEstringify_in_c(stwx)
 #define STDX_BEstringify_in_c(stdx)
 #else
+#define LHZX_BEstringify_in_c(lhbrx)
 #define LWZX_BEstringify_in_c(lwbrx)
 #define LDX_BE stringify_in_c(ldbrx)
 #define STWX_BEstringify_in_c(stwbrx)
diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index fd8d640..6a77d130 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -142,9 +142,11 @@
 #define PPC_INST_ISEL  0x7c1e
 #define PPC_INST_ISEL_MASK 0xfc3e
 #define PPC_INST_LDARX 0x7ca8
+#define PPC_INST_STDCX 0x7c0001ad
 #define PPC_INST_LSWI  0x7c0004aa
 #define PPC_INST_LSWX  0x7c00042a
 #define PPC_INST_LWARX 0x7c28
+#define PPC_INST_STWCX 0x7c00012d
 #define PPC_INST_LWSYNC0x7c2004ac
 #define PPC_INST_SYNC  0x7c0004ac
 #define PPC_INST_SYNC_MASK 0xfc0007fe
@@ -211,8 +213,11 @@
 #define PPC_INST_LBZ   0x8800
 #define PPC_INST_LD0xe800
 #define PPC_INST_LHZ   0xa000
-#define PPC_INST_LHBRX 0x7c00062c
 #define PPC_INST_LWZ   0x8000
+#define PPC_INST_LHBRX 0x7c00062c
+#define PPC_INST_LDBRX 0x7c000428
+#define PPC_INST_STB   0x9800
+#define PPC_INST_STH   0xb000
 #define PPC_INST_STD   0xf800
 #define PPC_INST_STDU  0xf801
 #define PPC_INST_STW   0x9000
@@ -221,22 +226,34 @@
 #define PPC_INST_MTLR  0x7c0803a6
 #define PPC_INST_CMPWI 0x2c00
 #define PPC_INST_CMPDI 0x2c20
+#define PPC_INST_CMPW  0x7c00
+#define PPC_INST_CMPD  0x7c20
 #define PPC_INST_CMPLW 0x7c40
+#define PPC_INST_CMPLD 0x7c200040
 #define PPC_INST_CMPLWI0x2800
+#define PPC_INST_CMPLDI0x2820
 #define PPC_INST_ADDI  0x3800
 #define PPC_INST_ADDIS 0x3c00
 #define PPC_INST_ADD   0x7c000214
 #define PPC_INST_SUB   0x7c50
 #define PPC_INST_BLR   0x4e800020
 #define PPC_INST_BLRL  0x4e800021
+#de

[PATCH 2/6] ppc: bpf/jit: Optimize 64-bit Immediate loads

2016-06-07 Thread Naveen N. Rao
Similar to the LI32() optimization, if the value can be represented
in 32-bits, use LI32(). Also handle loading a few specific forms of
immediate values in an optimum manner.
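
As a simple illustration: loading the very common immediate -1, which
previously went through the full five-instruction lis/ori/sldi/oris/ori
sequence, now collapses to a single sign-extending load

	li	rX, -1		# rX is a placeholder register

since -1 falls within the 32-bit (indeed 16-bit) range handled by LI32().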

Cc: Matt Evans <m...@ozlabs.org>
Cc: Denis Kirjanov <k...@linux-powerpc.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index a9882db..4c1e055 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -244,20 +244,25 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
} } while(0)
 
 #define PPC_LI64(d, i) do {  \
-   if (!((uintptr_t)(i) & 0xULL))\
+   if ((long)(i) >= -2147483648 &&   \
+   (long)(i) < 2147483648)   \
PPC_LI32(d, i);   \
else {\
-   PPC_LIS(d, ((uintptr_t)(i) >> 48));   \
-   if ((uintptr_t)(i) & 0xULL)   \
-   PPC_ORI(d, d, \
-   ((uintptr_t)(i) >> 32) & 0x); \
+   if (!((uintptr_t)(i) & 0x8000ULL))\
+   PPC_LI(d, ((uintptr_t)(i) >> 32) & 0x);   \
+   else {\
+   PPC_LIS(d, ((uintptr_t)(i) >> 48));   \
+   if ((uintptr_t)(i) & 0xULL)   \
+   PPC_ORI(d, d, \
+ ((uintptr_t)(i) >> 32) & 0x);   \
+   } \
PPC_SLDI(d, d, 32);   \
if ((uintptr_t)(i) & 0xULL)   \
PPC_ORIS(d, d,\
 ((uintptr_t)(i) >> 16) & 0x);\
if ((uintptr_t)(i) & 0xULL)   \
PPC_ORI(d, d, (uintptr_t)(i) & 0x);   \
-   } } while (0);
+   } } while (0)
 
 #ifdef CONFIG_PPC64
 #define PPC_FUNC_ADDR(d,i) do { PPC_LI64(d, i); } while(0)
-- 
2.8.2



Re: [net-next PATCH V4 1/5] samples/bpf: add back functionality to redefine LLC command

2016-04-28 Thread Naveen N. Rao
On 2016/04/28 04:40PM, Jesper Dangaard Brouer wrote:
> On Thu, 28 Apr 2016 18:51:33 +0530
> "Naveen N. Rao" <naveen.n@linux.vnet.ibm.com> wrote:
> 
> > > Add this feature back. Note that it is possible to redefine the LLC
> > > on the make command like:
> > > 
> > >  make samples/bpf/ LLC=~/git/llvm/build/bin/llc  
> > 
> > I don't have an objection to this patch, but you didn't explain why/how 
> > this approach is better than just doing:
> >   PATH=~/git/llvm/build/bin make samples/bpf/
> 
> It is almost the same. There is always another way to do the same.
> 
> I explicitly use this to test different combinations of LLC and CLANG,
> in order to validate Alexei's claim that older versions of CLANG could
> still work with a newer version of LLC.  Thus, one use-case your
> approach cannot cover ;-)
> 
> And clang seems to install a clang-3.9, which my solution also covers
> by explicitly specifying CLANG=clang-3.9, if several avail clang's are
> in the PATH.

Ah, so a very niche use-case.

Acked-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>

- Naveen



Re: [net-next PATCH V4 3/5] samples/bpf: add a README file to get users started

2016-04-28 Thread Naveen N. Rao
On 2016/04/28 02:21PM, Jesper Dangaard Brouer wrote:
> Getting started with using examples in samples/bpf/ is not
> straightforward.  There are several dependencies, and specific
> versions of these dependencies.
> 
> Just compiling the example tool is also slightly obscure, e.g. one
> > > needs to call make like:
> 
>  make samples/bpf/
> 
> Do notice the "/" slash after the directory name.
> 
> Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>

Acked-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>

> ---
>  samples/bpf/README.rst |   64 
> 
>  1 file changed, 64 insertions(+)
>  create mode 100644 samples/bpf/README.rst
> 
> diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst
> new file mode 100644
> index ..993d280184fa
> --- /dev/null
> +++ b/samples/bpf/README.rst
> @@ -0,0 +1,64 @@
> +eBPF sample programs
> +
> +
> +This directory contains a mini eBPF library, test stubs, verifier
> +test-suite and examples for using eBPF.
> +
> +Build dependencies
> +==
> +
> +Compiling requires having installed:
> + * clang >= version 3.4.0
> + * llvm >= version 3.7.1
> +
> +Note that LLVM's tool 'llc' must support target 'bpf', list version
> +and supported targets with command: ``llc --version``
> +
> +Kernel headers
> +--
> +
> +There are usually dependencies to header files of the current kernel.
> +To avoid installing devel kernel headers system wide, as a normal
> +user, simply call::
> +
> + make headers_install
> +
> +This will creates a local "usr/include" directory in the git/build top
> +level directory, that the make system automatically pickup first.
> +
> +Compiling
> +=
> +
> +For building the BPF samples, issue the below command from the kernel
> +top level directory::
> +
> + make samples/bpf/
> +
> +Do notice the "/" slash after the directory name.
> +
> +Manually compiling LLVM with 'bpf' support
> +--
> +
> +Since version 3.7.0, LLVM adds a proper LLVM backend target for the
> +BPF bytecode architecture.
> +
> +By default llvm will build all non-experimental backends including bpf.
> +To generate a smaller llc binary one can use::
> +
> + -DLLVM_TARGETS_TO_BUILD="BPF"
> +
> +Quick sniplet for manually compiling LLVM and clang
> +(build dependencies are cmake and gcc-c++)::
> +
> + $ git clone http://llvm.org/git/llvm.git
> + $ cd llvm/tools
> + $ git clone --depth 1 http://llvm.org/git/clang.git
> + $ cd ..; mkdir build; cd build
> + $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86"
> + $ make -j $(getconf _NPROCESSORS_ONLN)
> +
> +It is also possible to point make to the newly compiled 'llc' command
> +via redefining LLC on the make command line::
> +
> + make samples/bpf/ LLC=~/git/llvm/build/bin/llc
> +
> 



Re: [net-next PATCH V4 1/5] samples/bpf: add back functionality to redefine LLC command

2016-04-28 Thread Naveen N. Rao
On 2016/04/28 02:20PM, Jesper Dangaard Brouer wrote:
> It is practical to be-able-to redefine the location of the LLVM
> command 'llc', because not all distros have a LLVM version with bpf
> target support.  Thus, it is sometimes required to compile LLVM from
> source, and sometimes it is not desired to overwrite the distros
> default LLVM version.
> 
> This feature was removed with 128d1514be35 ("samples/bpf: Use llc in
> PATH, rather than a hardcoded value").
> 
> Add this feature back. Note that it is possible to redefine the LLC
> on the make command like:
> 
>  make samples/bpf/ LLC=~/git/llvm/build/bin/llc

I don't have an objection to this patch, but you didn't explain why/how 
this approach is better than just doing:
  PATH=~/git/llvm/build/bin make samples/bpf/

- Naveen



Re: [net-next PATCH V3 3/5] samples/bpf: add a README file to get users started

2016-04-27 Thread Naveen N. Rao
On 2016/04/27 11:16AM, Jesper Dangaard Brouer wrote:
> On Wed, 27 Apr 2016 14:05:22 +0530
> "Naveen N. Rao" <naveen.n@linux.vnet.ibm.com> wrote:
> 
> > On 2016/04/27 09:30AM, Jesper Dangaard Brouer wrote:
> > > Getting started with using examples in samples/bpf/ is not
> > > straightforward.  There are several dependencies, and specific
> > > versions of these dependencies.
> > > 
> > > Just compiling the example tool is also slightly obscure, e.g. one
> > > needs to call make like:
> > > 
> > >  make samples/bpf/
> > > 
> > > Do notice the "/" slash after the directory name.
> > > 
> > > Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
> > > ---
> > >  samples/bpf/README.rst |   75 
> > > 
> > >  1 file changed, 75 insertions(+)
> > >  create mode 100644 samples/bpf/README.rst  
> > 
> > Thanks for adding this! A few nits...
> 
> I would prefer if we could apply this patchset and you could followup
> with a patch with your nits...

... and have another patch just for that?
Regardless, I thought the reason we review is so the patch that goes in 
is already in good shape.

> 
> > > 
> > > diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst
> > > new file mode 100644
> > > index ..1fa157db905b
> > > --- /dev/null
> > > +++ b/samples/bpf/README.rst
> > > @@ -0,0 +1,75 @@
> > > +eBPF sample programs
> > > +
> > > +
> > > +This kernel samples/bpf directory contains a mini eBPF library, test  
> > ^^
> > 'This directory contains' should suffice.
> 
> The reason I formulated it like this, was that people will often hit
> this kind of documentation when searching google.

That doesn't make sense - shouldn't they be looking at a README file in 
the local samples/bpf directory first before going to google?

> 
> 
> > > +stubs, verifier test-suite and examples for using eBPF.
> > > +
> > > +Build dependencies
> > > +==
> > > +
> > > +Compiling requires having installed:
> > > + * clang >= version 3.4.0
> > > + * llvm >= version 3.7.1
> > > +
> > > +Note that LLVM's tool 'llc' must support target 'bpf', list with 
> > > command::
> > > +
> > > + $ llc --version  
> > 
> > 'llc --version | grep bpf' is probably simpler?
> 
> I wanted to give people an impression of what the output looks like.

But, that won't help someone trying to check if their installed llc has 
bpf support or not.
> 
> > > + LLVM (http://llvm.org/):
> > > +  LLVM version 3.x.y
> > > +  [...]
> > > +  Host CPU: xxx

For instance, is the above output something the user needs to see to 
ensure BPF support for llc?

> > > +
> > > +  Registered Targets:
> > > +[...]
> > > +bpf- BPF (host endian)
> > > +bpfeb  - BPF (big endian)
> > > +bpfel  - BPF (little endian)

The above is what really matters. Adding 'grep bpf' makes explicit what 
the user needs to look for.
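
For instance (output trimmed; the exact spacing varies with the LLVM
version):

	$ llc --version | grep bpf
	    bpf    - BPF (host endian)
	    bpfeb  - BPF (big endian)
	    bpfel  - BPF (little endian)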

> > > +[...]
> > > +
> > > +Kernel headers
> > > +--
> > > +
> > > +There are usually dependencies to header files of the current kernel.
> > > +To avoid installing devel kernel headers system wide, as a normal
> > > +user, simply call::
> > > +
> > > + make headers_install
> > > +
> > > +This will creates a local "usr/include" directory in the git/build top
> > > +level directory, that the make system automatically pickup first.
> > > +
> > > +Compiling
> > > +=
> > > +
> > > +For compiling goto kernel top level build directory and run make like::  
> > 
> > For building the BPF samples, issue the below command from the kernel 
> > root directory:
> 
> I like your formulation better, but is it worth a respin of the entire
> patchset? 
> 
> Notice you need the extra "::" ending of the paragraph, to make this
> document format nicely with RST (ReStructuredText).
> 
> A README with a .rst suffix will be picked up by github and
> displayed as the doc for the directory. Thus I also made sure it
> "compiles" with the rst tools. E.g. see how samples/pktgen gets auto
> documented and nicely formatted via github (scroll down):
>  https://github.com/torvalds/linux/tree/master/samples/pkt

Re: [net-next PATCH V3 2/5] samples/bpf: Makefile verify LLVM compiler avail and bpf target is supported

2016-04-27 Thread Naveen N. Rao
On 2016/04/27 09:30AM, Jesper Dangaard Brouer wrote:
> Make compiling samples/bpf more user friendly, by detecting if LLVM
> compiler tool 'llc' is available, and also detect if the 'bpf' target
> is available in this version of LLVM.
> 
> Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
> Acked-by: Alexei Starovoitov <a...@kernel.org>
> ---
>  samples/bpf/Makefile |   18 ++
>  1 file changed, 18 insertions(+)

Acked-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>

> 
> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> index 5bae9536f100..45859c99f573 100644
> --- a/samples/bpf/Makefile
> +++ b/samples/bpf/Makefile
> @@ -85,6 +85,24 @@ HOSTLOADLIBES_test_overhead += -lelf -lrt
>  #  make samples/bpf/ LLC=~/git/llvm/build/bin/llc
>  LLC ?= llc
> 
> +# Verify LLVM compiler is available and bpf target is supported
> +.PHONY: verify_cmd_llc verify_target_bpf
> +
> +verify_cmd_llc:
> + @if ! (which "${LLC}" > /dev/null 2>&1); then \
> + echo "*** ERROR: Cannot find LLVM tool 'llc' (${LLC})" ;\
> + exit 1; \
> + else true; fi
> +
> +verify_target_bpf: verify_cmd_llc
> + @if ! (${LLC} -march=bpf -mattr=help > /dev/null 2>&1); then \
> + echo "*** ERROR: LLVM (${LLC}) does not support 'bpf' target" ;\
> + echo "   NOTICE: LLVM version >= 3.7.1 required" ;\
> + exit 2; \
> + else true; fi
> +
> +$(src)/*.c: verify_target_bpf
> +
>  # asm/sysreg.h - inline assembly used by it is incompatible with llvm.
>  # But, there is no easy way to fix it, so just exclude it since it is
>  # useless for BPF samples.
> 



Re: [net-next PATCH V3 3/5] samples/bpf: add a README file to get users started

2016-04-27 Thread Naveen N. Rao
On 2016/04/27 09:30AM, Jesper Dangaard Brouer wrote:
> Getting started with using examples in samples/bpf/ is not
> straightforward.  There are several dependencies, and specific
> versions of these dependencies.
> 
> Just compiling the example tool is also slightly obscure, e.g. one
> needs to call make like:
> 
>  make samples/bpf/
> 
> Do notice the "/" slash after the directory name.
> 
> Signed-off-by: Jesper Dangaard Brouer 
> ---
>  samples/bpf/README.rst |   75 
> 
>  1 file changed, 75 insertions(+)
>  create mode 100644 samples/bpf/README.rst

Thanks for adding this! A few nits...

> 
> diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst
> new file mode 100644
> index ..1fa157db905b
> --- /dev/null
> +++ b/samples/bpf/README.rst
> @@ -0,0 +1,75 @@
> +eBPF sample programs
> +
> +
> +This kernel samples/bpf directory contains a mini eBPF library, test
^^
'This directory contains' should suffice.

> +stubs, verifier test-suite and examples for using eBPF.
> +
> +Build dependencies
> +==
> +
> +Compiling requires having installed:
> + * clang >= version 3.4.0
> + * llvm >= version 3.7.1
> +
> +Note that LLVM's tool 'llc' must support target 'bpf', list with command::
> +
> + $ llc --version

'llc --version | grep bpf' is probably simpler?

> + LLVM (http://llvm.org/):
> +  LLVM version 3.x.y
> +  [...]
> +  Host CPU: xxx
> +
> +  Registered Targets:
> +[...]
> +bpf- BPF (host endian)
> +bpfeb  - BPF (big endian)
> +bpfel  - BPF (little endian)
> +[...]
> +
> +Kernel headers
> +--
> +
> +There are usually dependencies to header files of the current kernel.
> +To avoid installing devel kernel headers system wide, as a normal
> +user, simply call::
> +
> + make headers_install
> +
> +This will creates a local "usr/include" directory in the git/build top
> +level directory, that the make system automatically pickup first.
> +
> +Compiling
> +=
> +
> +For compiling goto kernel top level build directory and run make like::

For building the BPF samples, issue the below command from the kernel 
root directory:

> +
> + make samples/bpf/
> +
> +Do notice the "/" slash after the directory name.
> +
> +Manually compiling LLVM with 'bpf' support
> +--
> +
> +Since version 3.7.0, LLVM adds a proper LLVM backend target for the
> +BPF bytecode architecture.
> +
> +By default llvm will build all non-experimental backends including bpf.
> +To generate a smaller llc binary one can use::
> +
> + -DLLVM_TARGETS_TO_BUILD="BPF;X86"

Is the X86 target really needed?

> +
> +Quick sniplet for manually compiling LLVM and clang
> +(build dependencies are cmake and gcc-c++)::
> +
> + $ git clone http://llvm.org/git/llvm.git
> + $ cd llvm/tools
> + $ git clone --depth 1 http://llvm.org/git/clang.git
> + $ cd ..; mkdir build; cd build
> + $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86"
^^^
Here too.

- Naveen

> + $ make -j $(getconf _NPROCESSORS_ONLN)
> +
> +It is also possible to point make to the newly compiled 'llc' command
> +via redefining LLC on the make command line::
> +
> + make samples/bpf/ LLC=~/git/llvm/build/bin/llc
> +
> 



Re: [net-next PATCH V3 1/5] samples/bpf: add back functionality to redefine LLC command

2016-04-27 Thread Naveen N. Rao
On 2016/04/27 09:30AM, Jesper Dangaard Brouer wrote:
> It is practical to be-able-to redefine the location of the LLVM
> command 'llc', because not all distros have a LLVM version with bpf
> target support.  Thus, it is sometimes required to compile LLVM from
> source, and sometimes it is not desired to overwrite the distros
> default LLVM version.
> 
> This feature was removed with 128d1514be35 ("samples/bpf: Use llc in
> PATH, rather than a hardcoded value").
> 
> Add this feature back. Note that it is possible to redefine the LLC
> on the make command like:
> 
>  make samples/bpf/ LLC=~/git/llvm/build/bin/llc

Why not do:
  PATH=~/git/llvm/build/bin:$PATH make samples/bpf/

..if you wish to override clang/llc only for building bpf samples?
Or, just export the updated $PATH:
  export PATH=~/git/llvm/build/bin:$PATH
  make samples/bpf


- Naveen
 



Re: [PATCH net 4/4] lib/test_bpf: Add additional BPF_ADD tests

2016-04-05 Thread Naveen N. Rao
On 2016/04/05 09:28AM, Alexei Starovoitov wrote:
> On 4/5/16 3:02 AM, Naveen N. Rao wrote:
> >Some of these tests proved useful with the powerpc eBPF JIT port due to
> >sign-extended 16-bit immediate loads. Though some of these aspects get
> >covered in other tests, it is better to have explicit tests so as to
> >quickly tag the precise problem.
> >
> >Cc: Alexei Starovoitov <a...@fb.com>
> >Cc: Daniel Borkmann <dan...@iogearbox.net>
> >Cc: "David S. Miller" <da...@davemloft.net>
> >Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
> >Cc: Michael Ellerman <m...@ellerman.id.au>
> >Cc: Paul Mackerras <pau...@samba.org>
> >Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
> 
> Makes sense. Looks like the ppc jit will be using quite a few of the
> available ppc instructions. Nice.
> 
> I'm assuming all these new tests passed with x64 jit?

Yes, all these tests pass on x86_64.

- Naveen



Re: [PATCH net 2/4] lib/test_bpf: Add tests for unsigned BPF_JGT

2016-04-05 Thread Naveen N. Rao
On 2016/04/05 09:20AM, Alexei Starovoitov wrote:
> On 4/5/16 3:02 AM, Naveen N. Rao wrote:
> >Unsigned Jump-if-Greater-Than.
> >
> >Cc: Alexei Starovoitov <a...@fb.com>
> >Cc: Daniel Borkmann <dan...@iogearbox.net>
> >Cc: "David S. Miller" <da...@davemloft.net>
> >Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
> >Cc: Michael Ellerman <m...@ellerman.id.au>
> >Cc: Paul Mackerras <pau...@samba.org>
> >Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
> 
> I think some of the tests already cover it, but extra tests are
> always great.
> Acked-by: Alexei Starovoitov <a...@kernel.org>
> 
> I think the whole set belongs in net-next.
> Next time you submit the patches please say [PATCH net-next] in subject.
> [PATCH net] is for bugfixes only.

Ah, sure. Thanks for the review!

- Naveen



[PATCH net 2/4] lib/test_bpf: Add tests for unsigned BPF_JGT

2016-04-05 Thread Naveen N. Rao
Unsigned Jump-if-Greater-Than.

Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 lib/test_bpf.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index e76fa4d..7e6fb49 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -4222,6 +4222,20 @@ static struct bpf_test tests[] = {
{ },
{ { 0, 1 } },
},
+   {
+   "JMP_JGT_K: Unsigned jump: if (-1 > 1) return 1",
+   .u.insns_int = {
+   BPF_ALU32_IMM(BPF_MOV, R0, 0),
+   BPF_LD_IMM64(R1, -1),
+   BPF_JMP_IMM(BPF_JGT, R1, 1, 1),
+   BPF_EXIT_INSN(),
+   BPF_ALU32_IMM(BPF_MOV, R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 1 } },
+   },
/* BPF_JMP | BPF_JGE | BPF_K */
{
"JMP_JGE_K: if (3 >= 2) return 1",
@@ -4404,6 +4418,21 @@ static struct bpf_test tests[] = {
{ },
{ { 0, 1 } },
},
+   {
+   "JMP_JGT_X: Unsigned jump: if (-1 > 1) return 1",
+   .u.insns_int = {
+   BPF_ALU32_IMM(BPF_MOV, R0, 0),
+   BPF_LD_IMM64(R1, -1),
+   BPF_LD_IMM64(R2, 1),
+   BPF_JMP_REG(BPF_JGT, R1, R2, 1),
+   BPF_EXIT_INSN(),
+   BPF_ALU32_IMM(BPF_MOV, R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 1 } },
+   },
/* BPF_JMP | BPF_JGE | BPF_X */
{
"JMP_JGE_X: if (3 >= 2) return 1",
-- 
2.7.4



[PATCH net 1/4] lib/test_bpf: Fix JMP_JSET tests

2016-04-05 Thread Naveen N. Rao
JMP_JSET tests incorrectly used BPF_JNE. Fix the same.

Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 lib/test_bpf.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 27a7a26..e76fa4d 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -4303,7 +4303,7 @@ static struct bpf_test tests[] = {
.u.insns_int = {
BPF_ALU32_IMM(BPF_MOV, R0, 0),
BPF_LD_IMM64(R1, 3),
-   BPF_JMP_IMM(BPF_JNE, R1, 2, 1),
+   BPF_JMP_IMM(BPF_JSET, R1, 2, 1),
BPF_EXIT_INSN(),
BPF_ALU32_IMM(BPF_MOV, R0, 1),
BPF_EXIT_INSN(),
@@ -4317,7 +4317,7 @@ static struct bpf_test tests[] = {
.u.insns_int = {
BPF_ALU32_IMM(BPF_MOV, R0, 0),
BPF_LD_IMM64(R1, 3),
-   BPF_JMP_IMM(BPF_JNE, R1, 0x, 1),
+   BPF_JMP_IMM(BPF_JSET, R1, 0x, 1),
BPF_EXIT_INSN(),
BPF_ALU32_IMM(BPF_MOV, R0, 1),
BPF_EXIT_INSN(),
@@ -4474,7 +4474,7 @@ static struct bpf_test tests[] = {
BPF_ALU32_IMM(BPF_MOV, R0, 0),
BPF_LD_IMM64(R1, 3),
BPF_LD_IMM64(R2, 2),
-   BPF_JMP_REG(BPF_JNE, R1, R2, 1),
+   BPF_JMP_REG(BPF_JSET, R1, R2, 1),
BPF_EXIT_INSN(),
BPF_ALU32_IMM(BPF_MOV, R0, 1),
BPF_EXIT_INSN(),
@@ -4489,7 +4489,7 @@ static struct bpf_test tests[] = {
BPF_ALU32_IMM(BPF_MOV, R0, 0),
BPF_LD_IMM64(R1, 3),
BPF_LD_IMM64(R2, 0x),
-   BPF_JMP_REG(BPF_JNE, R1, R2, 1),
+   BPF_JMP_REG(BPF_JSET, R1, R2, 1),
BPF_EXIT_INSN(),
BPF_ALU32_IMM(BPF_MOV, R0, 1),
BPF_EXIT_INSN(),
-- 
2.7.4



[PATCH net 3/4] lib/test_bpf: Add test to check for result of 32-bit add that overflows

2016-04-05 Thread Naveen N. Rao
BPF_ALU32 and BPF_ALU64 tests for adding two 32-bit values that result in
a 32-bit overflow.

Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 lib/test_bpf.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 7e6fb49..2fb31aa 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -2444,6 +2444,22 @@ static struct bpf_test tests[] = {
{ { 0, 4294967295U } },
},
{
+   "ALU_ADD_X: 2 + 4294967294 = 0",
+   .u.insns_int = {
+   BPF_LD_IMM64(R0, 2),
+   BPF_LD_IMM64(R1, 4294967294U),
+   BPF_ALU32_REG(BPF_ADD, R0, R1),
+   BPF_JMP_IMM(BPF_JEQ, R0, 0, 2),
+   BPF_ALU32_IMM(BPF_MOV, R0, 0),
+   BPF_EXIT_INSN(),
+   BPF_ALU32_IMM(BPF_MOV, R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 1 } },
+   },
+   {
"ALU64_ADD_X: 1 + 2 = 3",
.u.insns_int = {
BPF_LD_IMM64(R0, 1),
@@ -2467,6 +2483,23 @@ static struct bpf_test tests[] = {
{ },
{ { 0, 4294967295U } },
},
+   {
+   "ALU64_ADD_X: 2 + 4294967294 = 4294967296",
+   .u.insns_int = {
+   BPF_LD_IMM64(R0, 2),
+   BPF_LD_IMM64(R1, 4294967294U),
+   BPF_LD_IMM64(R2, 4294967296ULL),
+   BPF_ALU64_REG(BPF_ADD, R0, R1),
+   BPF_JMP_REG(BPF_JEQ, R0, R2, 2),
+   BPF_MOV32_IMM(R0, 0),
+   BPF_EXIT_INSN(),
+   BPF_MOV32_IMM(R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 1 } },
+   },
/* BPF_ALU | BPF_ADD | BPF_K */
{
"ALU_ADD_K: 1 + 2 = 3",
@@ -2502,6 +2535,21 @@ static struct bpf_test tests[] = {
{ { 0, 4294967295U } },
},
{
+   "ALU_ADD_K: 4294967294 + 2 = 0",
+   .u.insns_int = {
+   BPF_LD_IMM64(R0, 4294967294U),
+   BPF_ALU32_IMM(BPF_ADD, R0, 2),
+   BPF_JMP_IMM(BPF_JEQ, R0, 0, 2),
+   BPF_ALU32_IMM(BPF_MOV, R0, 0),
+   BPF_EXIT_INSN(),
+   BPF_ALU32_IMM(BPF_MOV, R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 1 } },
+   },
+   {
"ALU_ADD_K: 0 + (-1) = 0x",
.u.insns_int = {
BPF_LD_IMM64(R2, 0x0),
@@ -2551,6 +2599,22 @@ static struct bpf_test tests[] = {
{ { 0, 2147483647 } },
},
{
+   "ALU64_ADD_K: 4294967294 + 2 = 4294967296",
+   .u.insns_int = {
+   BPF_LD_IMM64(R0, 4294967294U),
+   BPF_LD_IMM64(R1, 4294967296ULL),
+   BPF_ALU64_IMM(BPF_ADD, R0, 2),
+   BPF_JMP_REG(BPF_JEQ, R0, R1, 2),
+   BPF_ALU32_IMM(BPF_MOV, R0, 0),
+   BPF_EXIT_INSN(),
+   BPF_ALU32_IMM(BPF_MOV, R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 1 } },
+   },
+   {
"ALU64_ADD_K: 2147483646 + -2147483647 = -1",
.u.insns_int = {
BPF_LD_IMM64(R0, 2147483646),
-- 
2.7.4



[PATCH net 4/4] lib/test_bpf: Add additional BPF_ADD tests

2016-04-05 Thread Naveen N. Rao
Some of these tests proved useful with the powerpc eBPF JIT port due to
sign-extended 16-bit immediate loads. Though some of these aspects get
covered in other tests, it is better to have explicit tests so as to
quickly tag the precise problem.
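
To illustrate the class of bug being targeted (a hypothetical mis-JIT
for illustration, not code from any of these patches): lowering
BPF_ALU64_IMM(BPF_ADD, R0, 0x8000) to a bare

	addi	r_dst, r_dst, 0x8000	# adds -32768, not 32768

goes silently wrong, because addi sign-extends its 16-bit immediate; the
32-bit BPF immediate has to be materialized without sign-extension
first. The tests below make such cases fail explicitly.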

Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: "David S. Miller" <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Paul Mackerras <pau...@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
 lib/test_bpf.c | 128 +
 1 file changed, 128 insertions(+)

diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 2fb31aa..8f22fbe 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -2566,6 +2566,70 @@ static struct bpf_test tests[] = {
{ { 0, 0x1 } },
},
{
+   "ALU_ADD_K: 0 + 0x = 0x",
+   .u.insns_int = {
+   BPF_LD_IMM64(R2, 0x0),
+   BPF_LD_IMM64(R3, 0x),
+   BPF_ALU32_IMM(BPF_ADD, R2, 0x),
+   BPF_JMP_REG(BPF_JEQ, R2, R3, 2),
+   BPF_MOV32_IMM(R0, 2),
+   BPF_EXIT_INSN(),
+   BPF_MOV32_IMM(R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 0x1 } },
+   },
+   {
+   "ALU_ADD_K: 0 + 0x7fff = 0x7fff",
+   .u.insns_int = {
+   BPF_LD_IMM64(R2, 0x0),
+   BPF_LD_IMM64(R3, 0x7fff),
+   BPF_ALU32_IMM(BPF_ADD, R2, 0x7fff),
+   BPF_JMP_REG(BPF_JEQ, R2, R3, 2),
+   BPF_MOV32_IMM(R0, 2),
+   BPF_EXIT_INSN(),
+   BPF_MOV32_IMM(R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 0x1 } },
+   },
+   {
+   "ALU_ADD_K: 0 + 0x8000 = 0x8000",
+   .u.insns_int = {
+   BPF_LD_IMM64(R2, 0x0),
+   BPF_LD_IMM64(R3, 0x8000),
+   BPF_ALU32_IMM(BPF_ADD, R2, 0x8000),
+   BPF_JMP_REG(BPF_JEQ, R2, R3, 2),
+   BPF_MOV32_IMM(R0, 2),
+   BPF_EXIT_INSN(),
+   BPF_MOV32_IMM(R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 0x1 } },
+   },
+   {
+   "ALU_ADD_K: 0 + 0x80008000 = 0x80008000",
+   .u.insns_int = {
+   BPF_LD_IMM64(R2, 0x0),
+   BPF_LD_IMM64(R3, 0x80008000),
+   BPF_ALU32_IMM(BPF_ADD, R2, 0x80008000),
+   BPF_JMP_REG(BPF_JEQ, R2, R3, 2),
+   BPF_MOV32_IMM(R0, 2),
+   BPF_EXIT_INSN(),
+   BPF_MOV32_IMM(R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 0x1 } },
+   },
+   {
"ALU64_ADD_K: 1 + 2 = 3",
.u.insns_int = {
BPF_LD_IMM64(R0, 1),
@@ -2657,6 +2721,70 @@ static struct bpf_test tests[] = {
{ },
{ { 0, 0x1 } },
},
+   {
+   "ALU64_ADD_K: 0 + 0x = 0x",
+   .u.insns_int = {
+   BPF_LD_IMM64(R2, 0x0),
+   BPF_LD_IMM64(R3, 0x),
+   BPF_ALU64_IMM(BPF_ADD, R2, 0x),
+   BPF_JMP_REG(BPF_JEQ, R2, R3, 2),
+   BPF_MOV32_IMM(R0, 2),
+   BPF_EXIT_INSN(),
+   BPF_MOV32_IMM(R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 0x1 } },
+   },
+   {
+   "ALU64_ADD_K: 0 + 0x7fff = 0x7fff",
+   .u.insns_int = {
+   BPF_LD_IMM64(R2, 0x0),
+   BPF_LD_IMM64(R3, 0x7fff),
+   BPF_ALU64_IMM(BPF_ADD, R2, 0x7fff),
+   BPF_JMP_REG(BPF_JEQ, R2, R3, 2),
+   BPF_MOV32_IMM(R0, 2),
+   BPF_EXIT_INSN(),
+   BPF_MOV32_IMM(R0, 1),
+   BPF_EXIT_INSN(),
+   },
+   INTERNAL,
+   { },
+   { { 0, 0x1 } },
+   },
+   {
+   "ALU64_ADD_K: 0 + 0x8000 = 0x8000",
+

Re: [RFC PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-04-04 Thread Naveen N. Rao
On 2016/04/01 08:34PM, Daniel Borkmann wrote:
> On 04/01/2016 08:10 PM, Alexei Starovoitov wrote:
> >On 4/1/16 2:58 AM, Naveen N. Rao wrote:
> >>PPC64 eBPF JIT compiler. Works for both ABIv1 and ABIv2.
> >>
> >>Enable with:
> >>echo 1 > /proc/sys/net/core/bpf_jit_enable
> >>or
> >>echo 2 > /proc/sys/net/core/bpf_jit_enable
> >>
> >>... to see the generated JIT code. This can further be processed with
> >>tools/net/bpf_jit_disasm.
> >>
> >>With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
> >>test_bpf: Summary: 291 PASSED, 0 FAILED, [234/283 JIT'ed]
> >>
> >>... on both ppc64 BE and LE.
> >>
> >>The details of the approach are documented through various comments in
> >>the code, as are the TODOs. Some of the prominent TODOs include
> >>implementing BPF tail calls and skb loads.
> >>
> >>Cc: Matt Evans <m...@ozlabs.org>
> >>Cc: Michael Ellerman <m...@ellerman.id.au>
> >>Cc: Paul Mackerras <pau...@samba.org>
> >>Cc: Alexei Starovoitov <a...@fb.com>
> >>Cc: "David S. Miller" <da...@davemloft.net>
> >>Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
> >>Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
> >>---
> >>  arch/powerpc/include/asm/ppc-opcode.h |  19 +-
> >>  arch/powerpc/net/Makefile |   4 +
> >>  arch/powerpc/net/bpf_jit.h|  66 ++-
> >>  arch/powerpc/net/bpf_jit64.h  |  58 +++
> >>  arch/powerpc/net/bpf_jit_comp64.c | 828 
> >> ++
> >>  5 files changed, 973 insertions(+), 2 deletions(-)
> >>  create mode 100644 arch/powerpc/net/bpf_jit64.h
> >>  create mode 100644 arch/powerpc/net/bpf_jit_comp64.c
> >...
> >>-#ifdef CONFIG_PPC64
> >>+#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF != 2)
> >
> >impressive stuff!
> 
> +1, awesome to see another one!

Thanks!

> 
> >Everything nicely documented. Could you add few words for the above
> >condition as well ?
> >Or may be a new macro, since it occurs many times?
> >What are these _CALL_ELF == 2 and != 2 conditions mean? ppc ABIs ?

Yes, there are two ABIs: ppc64 (ABIv1), which is big endian, and the
recently introduced ppc64le (ABIv2), which is currently little endian
only. There is also ppc32...

Good suggestion about using a macro. I will put out a patch for that.
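
For reference, one possible shape of such a macro is sketched below; the
name PPC64_ELF_ABI_v1 is illustrative and not necessarily what the
eventual patch will use:

/* True only for the original big-endian ppc64 ABI (ELFv1), which uses
 * function descriptors and a TOC; the ELFv2 ABI defines _CALL_ELF == 2.
 */
#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF != 2)
#define PPC64_ELF_ABI_v1
#endif

#ifdef PPC64_ELF_ABI_v1
/* ABIv1-specific handling, e.g. loading the actual entry point from the
 * function descriptor before emitting an indirect call.
 */
#endif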

> >Will there ever be v3 ?

Hope not! ;)

> 
> Minor TODO would also be to convert to use bpf_jit_binary_alloc() and
> bpf_jit_binary_free() API for the image, which is done by other eBPF
> jits, too.

Sure. I will make that change.

> 
> >So far most of the bpf jits were going via net-next tree, but if
> >in this case no changes to the core is necessary then I guess it's fine
> >to do it via powerpc tree. What's your plan?

I initially thought this had to go through the powerpc tree. I don't
really have a preference, so I'll leave it to the maintainers to take a
call on that. I do, however, need a review of the JIT code from Michael
Ellerman/Paul Mackerras.


- Naveen



[PATCHv2 net 1/3] samples/bpf: Fix build breakage with map_perf_test_user.c

2016-04-04 Thread Naveen N. Rao
Building BPF samples is failing with the below error:

samples/bpf/map_perf_test_user.c: In function ‘main’:
samples/bpf/map_perf_test_user.c:134:9: error: variable ‘r’ has initializer but incomplete type
  struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
  ^
samples/bpf/map_perf_test_user.c:134:21: error: ‘RLIM_INFINITY’ undeclared (first use in this function)
  struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
                     ^
samples/bpf/map_perf_test_user.c:134:21: note: each undeclared identifier is reported only once for each function it appears in
samples/bpf/map_perf_test_user.c:134:9: warning: excess elements in struct initializer [enabled by default]
  struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
  ^
samples/bpf/map_perf_test_user.c:134:9: warning: (near initialization for ‘r’) [enabled by default]
samples/bpf/map_perf_test_user.c:134:9: warning: excess elements in struct initializer [enabled by default]
samples/bpf/map_perf_test_user.c:134:9: warning: (near initialization for ‘r’) [enabled by default]
samples/bpf/map_perf_test_user.c:134:16: error: storage size of ‘r’ isn’t known
  struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
                ^
samples/bpf/map_perf_test_user.c:139:2: warning: implicit declaration of function ‘setrlimit’ [-Wimplicit-function-declaration]
  setrlimit(RLIMIT_MEMLOCK, &r);
  ^
samples/bpf/map_perf_test_user.c:139:12: error: ‘RLIMIT_MEMLOCK’ undeclared (first use in this function)
  setrlimit(RLIMIT_MEMLOCK, &r);
            ^
samples/bpf/map_perf_test_user.c:134:16: warning: unused variable ‘r’ [-Wunused-variable]
  struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
                ^
make[2]: *** [samples/bpf/map_perf_test_user.o] Error 1

Fix this by including the necessary header file.
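
For reference, a minimal user-space reproduction, assuming glibc: struct
rlimit, RLIM_INFINITY, RLIMIT_MEMLOCK and setrlimit() are all declared in
<sys/resource.h>, which the sample relied on being pulled in indirectly:

#include <sys/resource.h>

int main(void)
{
	/* Without <sys/resource.h>, these names are all undeclared and
	 * struct rlimit is an incomplete type, as in the build log above.
	 */
	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};

	return setrlimit(RLIMIT_MEMLOCK, &r);
}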

Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: David S. Miller <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
v2: no changes

 samples/bpf/map_perf_test_user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index 95af56e..3147377 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <sys/resource.h>
 #include "libbpf.h"
 #include "bpf_load.h"
 
-- 
2.7.4



[PATCHv2 net 2/3] samples/bpf: Use llc in PATH, rather than a hardcoded value

2016-04-04 Thread Naveen N. Rao
While at it, remove the generation of .s files and fix some typos in the
related comment.

Cc: Alexei Starovoitov <a...@fb.com>
Cc: David S. Miller <da...@davemloft.net>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
v2: removed generation of .s files

 samples/bpf/Makefile | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 502c9fc..b820cc9 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -76,16 +76,10 @@ HOSTLOADLIBES_offwaketime += -lelf
 HOSTLOADLIBES_spintest += -lelf
 HOSTLOADLIBES_map_perf_test += -lelf -lrt
 
-# point this to your LLVM backend with bpf support
-LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
-
-# asm/sysreg.h inline assmbly used by it is incompatible with llvm.
-# But, ehere is not easy way to fix it, so just exclude it since it is
+# asm/sysreg.h - inline assembly used by it is incompatible with llvm.
+# But, there is no easy way to fix it, so just exclude it since it is
 # useless for BPF samples.
 $(obj)/%.o: $(src)/%.c
	clang $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
		-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-		-O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
-	clang $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
-		-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-		-O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=asm -o $@.s
+		-O2 -emit-llvm -c $< -o -| llc -march=bpf -filetype=obj -o $@
-- 
2.7.4



[PATCHv2 net 3/3] samples/bpf: Enable powerpc support

2016-04-04 Thread Naveen N. Rao
Add the necessary definitions for building bpf samples on ppc.

Since ppc doesn't store the function return address on the stack, modify
how PT_REGS_RET() and PT_REGS_FP() are used.

Also, introduce PT_REGS_IP() to access the instruction pointer.
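
As an illustration (a hypothetical program, not part of this patch), a
kprobe handler written against these helpers stays portable across
x86_64, s390x, aarch64 and powerpc:

#include <linux/ptrace.h>
#include <linux/version.h>
#include <uapi/linux/bpf.h>
#include "bpf_helpers.h"

/* Illustrative only: record who called kfree_skb(). On powerpc,
 * BPF_KPROBE_READ_RET_IP() reads the saved link register from pt_regs;
 * on other arches it falls back to bpf_probe_read() of the return
 * address on the stack, as before.
 */
SEC("kprobe/kfree_skb")
int trace_kfree_skb_caller(struct pt_regs *ctx)
{
	long caller_ip = 0;

	BPF_KPROBE_READ_RET_IP(caller_ip, ctx);

	/* ... use caller_ip, e.g. as a key into a BPF map ... */
	return 0;
}

char _license[] SEC("license") = "GPL";
u32 _version SEC("version") = LINUX_VERSION_CODE;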

Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: David S. Miller <da...@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
---
v2: updated macros using ({ }) gcc extension as per Alexei

 samples/bpf/bpf_helpers.h   | 26 ++
 samples/bpf/spintest_kern.c |  2 +-
 samples/bpf/tracex2_kern.c  |  4 ++--
 samples/bpf/tracex4_kern.c  |  2 +-
 4 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 9363500..7904a2a 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -82,6 +82,7 @@ static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flag
 #define PT_REGS_FP(x) ((x)->bp)
 #define PT_REGS_RC(x) ((x)->ax)
 #define PT_REGS_SP(x) ((x)->sp)
+#define PT_REGS_IP(x) ((x)->ip)
 
 #elif defined(__s390x__)
 
@@ -94,6 +95,7 @@ static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flag
 #define PT_REGS_FP(x) ((x)->gprs[11]) /* Works only with CONFIG_FRAME_POINTER */
 #define PT_REGS_RC(x) ((x)->gprs[2])
 #define PT_REGS_SP(x) ((x)->gprs[15])
+#define PT_REGS_IP(x) ((x)->ip)
 
 #elif defined(__aarch64__)
 
@@ -106,6 +108,30 @@ static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flag
 #define PT_REGS_FP(x) ((x)->regs[29]) /* Works only with CONFIG_FRAME_POINTER */
 #define PT_REGS_RC(x) ((x)->regs[0])
 #define PT_REGS_SP(x) ((x)->sp)
+#define PT_REGS_IP(x) ((x)->pc)
+
+#elif defined(__powerpc__)
+
+#define PT_REGS_PARM1(x) ((x)->gpr[3])
+#define PT_REGS_PARM2(x) ((x)->gpr[4])
+#define PT_REGS_PARM3(x) ((x)->gpr[5])
+#define PT_REGS_PARM4(x) ((x)->gpr[6])
+#define PT_REGS_PARM5(x) ((x)->gpr[7])
+#define PT_REGS_RC(x) ((x)->gpr[3])
+#define PT_REGS_SP(x) ((x)->sp)
+#define PT_REGS_IP(x) ((x)->nip)
 
 #endif
+
+#ifdef __powerpc__
+#define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = (ctx)->link; })
+#define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP
+#else
+#define BPF_KPROBE_READ_RET_IP(ip, ctx)		({				\
+		bpf_probe_read(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx)); })
+#define BPF_KRETPROBE_READ_RET_IP(ip, ctx)	({				\
+		bpf_probe_read(&(ip), sizeof(ip),				\
+				(void *)(PT_REGS_FP(ctx) + sizeof(ip))); })
+#endif
+
 #endif
diff --git a/samples/bpf/spintest_kern.c b/samples/bpf/spintest_kern.c
index 4b27619..ce0167d 100644
--- a/samples/bpf/spintest_kern.c
+++ b/samples/bpf/spintest_kern.c
@@ -34,7 +34,7 @@ struct bpf_map_def SEC("maps") stackmap = {
 #define PROG(foo) \
 int foo(struct pt_regs *ctx) \
 { \
-   long v = ctx->ip, *val; \
+   long v = PT_REGS_IP(ctx), *val; \
 \
	val = bpf_map_lookup_elem(&my_map, &v); \
	bpf_map_update_elem(&my_map, &v, &v, BPF_ANY); \
diff --git a/samples/bpf/tracex2_kern.c b/samples/bpf/tracex2_kern.c
index 09c1adc..6d6eefd 100644
--- a/samples/bpf/tracex2_kern.c
+++ b/samples/bpf/tracex2_kern.c
@@ -27,10 +27,10 @@ int bpf_prog2(struct pt_regs *ctx)
long init_val = 1;
long *value;
 
-   /* x64/s390x specific: read ip of kfree_skb caller.
+   /* read ip of kfree_skb caller.
 * non-portable version of __builtin_return_address(0)
 */
-   bpf_probe_read(&loc, sizeof(loc), (void *)PT_REGS_RET(ctx));
+   BPF_KPROBE_READ_RET_IP(loc, ctx);
 
	value = bpf_map_lookup_elem(&my_map, &loc);
if (value)
diff --git a/samples/bpf/tracex4_kern.c b/samples/bpf/tracex4_kern.c
index ac46714..6dd8e38 100644
--- a/samples/bpf/tracex4_kern.c
+++ b/samples/bpf/tracex4_kern.c
@@ -40,7 +40,7 @@ int bpf_prog2(struct pt_regs *ctx)
long ip = 0;
 
/* get ip address of kmem_cache_alloc_node() caller */
-   bpf_probe_read(&ip, sizeof(ip), (void *)(PT_REGS_FP(ctx) + sizeof(ip)));
+   BPF_KRETPROBE_READ_RET_IP(ip, ctx);
 
struct pair v = {
.val = bpf_ktime_get_ns(),
-- 
2.7.4



Re: [PATCH 4/4] samples/bpf: Enable powerpc support

2016-04-01 Thread Naveen N. Rao
On 2016/03/31 10:52AM, Alexei Starovoitov wrote:
> On 3/31/16 4:25 AM, Naveen N. Rao wrote:
> ...
> >+
> >+#ifdef __powerpc__
> >+#define BPF_KPROBE_READ_RET_IP(ip, ctx) { (ip) = (ctx)->link; }
> >+#define BPF_KRETPROBE_READ_RET_IP(ip, ctx)  BPF_KPROBE_READ_RET_IP(ip, ctx)
> >+#else
> >+#define BPF_KPROBE_READ_RET_IP(ip, ctx)				\
> >+	bpf_probe_read(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx))
> >+#define BPF_KRETPROBE_READ_RET_IP(ip, ctx)				\
> >+	bpf_probe_read(&(ip), sizeof(ip),				\
> >+		(void *)(PT_REGS_FP(ctx) + sizeof(ip)))
> 
> makes sense, but please use ({ }) gcc extension instead of {} and
> open call to make sure that macro body is scoped.

To be sure I understand this right, do you mean something like this?

+
+#ifdef __powerpc__
+#define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = (ctx)->link; })
+#define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP
+#else
+#define BPF_KPROBE_READ_RET_IP(ip, ctx)		({				\
+		bpf_probe_read(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx)); })
+#define BPF_KRETPROBE_READ_RET_IP(ip, ctx)	({				\
+		bpf_probe_read(&(ip), sizeof(ip),				\
+				(void *)(PT_REGS_FP(ctx) + sizeof(ip))); })
+#endif
+


Thanks,
Naveen



Re: [PATCH 2/4] samples/bpf: Use llc in PATH, rather than a hardcoded value

2016-04-01 Thread Naveen N. Rao
On 2016/03/31 08:19PM, Daniel Borkmann wrote:
> On 03/31/2016 07:46 PM, Alexei Starovoitov wrote:
> >On 3/31/16 4:25 AM, Naveen N. Rao wrote:
> >>  clang $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
> >>  -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
> >>--O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
> >>+-O2 -emit-llvm -c $< -o -| llc -march=bpf -filetype=obj -o $@
> >>  clang $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
> >>  -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
> >>--O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=asm -o $@.s
> >>+-O2 -emit-llvm -c $< -o -| llc -march=bpf -filetype=asm -o $@.s
> >
> >that was a workaround when clang/llvm didn't have bpf support.
> >Now clang 3.7 and 3.8 have bpf built-in, so make sense to remove
> >manual calls to llc completely.
> >Just use 'clang -target bpf -O2 -D... -c $< -o $@'
> 
> +1, the clang part in that Makefile should also more correctly be called
> with '-target bpf' as it turns out (despite llc with '-march=bpf' ...).
> Better to use clang directly as suggested by Alexei.

I'm likely missing something obvious, but I cannot get this to work.  
With this diff:

 $(obj)/%.o: $(src)/%.c
	clang $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
		-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-		-O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
-	clang $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
-		-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-		-O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=asm -o $@.s
+   -O2 -target bpf -c $< -o $@

I see far too many errors thrown starting with:

clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.8.2/include \
	-I./arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated \
	-Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi \
	-I./include/uapi -Iinclude/generated/uapi -include ./include/linux/kconfig.h \
	-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
	-O2 -target bpf -c samples/bpf/map_perf_test_kern.c -o samples/bpf/map_perf_test_kern.o
In file included from samples/bpf/map_perf_test_kern.c:7:
In file included from include/linux/skbuff.h:17:
In file included from include/linux/kernel.h:10:
In file included from include/linux/bitops.h:36:
In file included from ./arch/x86/include/asm/bitops.h:500:
./arch/x86/include/asm/arch_hweight.h:31:10: error: invalid output constraint '=a' in asm
 : "="REG_OUT (res)
   ^
./arch/x86/include/asm/arch_hweight.h:59:10: error: invalid output constraint '=a' in asm
 : "="REG_OUT (res)


What am I missing?


- Naveen



Re: [PATCH 3/4] samples/bpf: Simplify building BPF samples

2016-03-31 Thread Naveen N. Rao
On 2016/03/31 10:49AM, Alexei Starovoitov wrote:
> On 3/31/16 4:25 AM, Naveen N. Rao wrote:
> >Make BPF samples build depend on CONFIG_SAMPLE_BPF. We still don't add a
> >Kconfig option since that will add a dependency on llvm for allyesconfig
> >builds which may not be desirable.
> >
> >Those who need to build the BPF samples can now just do:
> >
> >make CONFIG_SAMPLE_BPF=y
> >
> >or:
> >
> >export CONFIG_SAMPLE_BPF=y
> >make
> 
> I don't like this 'simplification'.
> make samples/bpf/
> is much easier to type than capital letters.

This started out as a patch to have the BPF samples built with a Kconfig
option. As stated in the commit description, I realised that it won't
work for allyesconfig builds. However, the reason I retained this patch
is that it gets us one step closer to building the samples as part of
the kernel build.

The 'simplification' is that I can now have the export in my .bashrc,
and the kernel build will then build the BPF samples too without
requiring an additional 'make samples/bpf/' step.

I agree this is subjective, so I am ok if this isn't taken in.


- Naveen



Re: [PATCH 1/4] samples/bpf: Fix build breakage with map_perf_test_user.c

2016-03-31 Thread Naveen N. Rao
On 2016/03/31 10:43AM, Alexei Starovoitov wrote:
> On 3/31/16 4:25 AM, Naveen N. Rao wrote:
> >Building BPF samples is failing with the below error:
> >
> >samples/bpf/map_perf_test_user.c: In function ‘main’:
> >samples/bpf/map_perf_test_user.c:134:9: error: variable ‘r’ has initializer but incomplete type
> >   struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
> >  ^
> >Fix this by including the necessary header file.
> >
> >Cc: Alexei Starovoitov <a...@fb.com>
> >Cc: David S. Miller <da...@davemloft.net>
> >Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
> >Cc: Michael Ellerman <m...@ellerman.id.au>
> >Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
> >---
> >  samples/bpf/map_perf_test_user.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> >diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
> >index 95af56e..3147377 100644
> >--- a/samples/bpf/map_perf_test_user.c
> >+++ b/samples/bpf/map_perf_test_user.c
> >@@ -17,6 +17,7 @@
> >  #include 
> >  #include 
> >  #include 
> >+#include 
> >  #include "libbpf.h"
> >  #include "bpf_load.h"
> 
> It's failing this way on powerpc? Odd.

This fails for me on x86_64 too -- RHEL 7.1.

> Such hidden header dependency was always puzzling to me. Anyway:
> Acked-by: Alexei Starovoitov <a...@kernel.org>
> 
> I'm assuming you want this set to go via 'net' tree, so please resubmit
> with [PATCH net 1/4] subjects and cc netdev.

Sure.

> 
> Reviewing your other patches...

Thanks for your review!

- Naveen