Naveen N. Rao wrote:
Alexei Starovoitov wrote:
On 2/8/18 4:03 AM, Sandipan Das wrote:
The imm field of a bpf_insn is a signed 32-bit integer. For
JIT-ed bpf-to-bpf function calls, it stores the offset from
__bpf_call_base to the start of the callee function.

For some architectures, such as powerpc64, it was found that
this offset may be as large as 64 bits because of which this
cannot be accomodated in the imm field without truncation.

To resolve this, we additionally use the aux data within each
bpf_prog associated with the caller functions to store the
addresses of their respective callees.

Signed-off-by: Sandipan Das <sandi...@linux.vnet.ibm.com>
---
 kernel/bpf/verifier.c | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5fb69a85d967..52088b4ca02f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5282,6 +5282,19 @@ static int jit_subprogs(struct bpf_verifier_env *env)
         * run last pass of JIT
         */
        for (i = 0; i <= env->subprog_cnt; i++) {
+               u32 flen = func[i]->len, callee_cnt = 0;
+               struct bpf_prog **callee;
+
+               /* for now assume that the maximum number of bpf function
+                * calls that can be made by a caller must be at most the
+                * number of bpf instructions in that function
+                */
+               callee = kzalloc(sizeof(func[i]) * flen, GFP_KERNEL);
+               if (!callee) {
+                       err = -ENOMEM;
+                       goto out_free;
+               }
+
                insn = func[i]->insnsi;
                for (j = 0; j < func[i]->len; j++, insn++) {
                        if (insn->code != (BPF_JMP | BPF_CALL) ||
@@ -5292,6 +5305,26 @@ static int jit_subprogs(struct bpf_verifier_env *env)
                        insn->imm = (u64 (*)(u64, u64, u64, u64, u64))
                                func[subprog]->bpf_func -
                                __bpf_call_base;
+
+                       /* the offset to the callee from __bpf_call_base
+                        * may be larger than what the 32 bit integer imm
+                        * can accomodate which will truncate the higher
+                        * order bits
+                        *
+                        * to avoid this, we additionally utilize the aux
+                        * data of each caller function for storing the
+                        * addresses of every callee associated with it
+                        */
+                       callee[callee_cnt++] = func[subprog];

can you share typical /proc/kallsyms ?
Are you saying that kernel and kernel modules are allocated from
address spaces that are always more than 32-bit apart?

Yes. On ppc64, kernel text is linearly mapped from 0xc000000000000000, while vmalloc'ed area starts from 0xd000000000000000 (for radix, this is
different, but still beyond a 32-bit offset).

That would mean that all kernel calls into modules are far calls
and the other way around form .ko into kernel?
Performance is probably suffering because every call needs to be built
with full 64-bit offset. No ?

Possibly, and I think Michael can give a better perspective, but I think
this is due to our ABI. For inter-module calls, we need to setup the TOC
pointer (or the address of the function being called with ABIv2), which would require us to load a full address regardless.

Thinking more about this, as an optimization, for bpf-to-bpf calls, we could detect a near call and just emit a relative branch since we don't care about TOC with BPF. But, this will depend on whether the different BPF functions are close enough (within 32MB) of one another.

We can attempt that once the generic changes are finalized on.

Thanks,
Naveen


Reply via email to