On 08/09/2018 04:02 AM, Daniel Borkmann wrote:
On 08/09/2018 06:48 AM, Alexei Starovoitov wrote:
On Wed, Aug 08, 2018 at 10:08:47PM -0500, Mauricio Vasquez wrote:
And how about adding three new helpers: push/pop/peek as well?
Reusing lookup/update is neat, but does lookup == pop
or does lookup == peek ?
I suspect it will be confusing.
Three new helpers cost nothing, but will make bpf progs easier to read.
I agree. I have one doubt here, update/lookup/delete is implemented in all
map types, if the operation is not supported it returns -EINVAL.
For push/pop/peek, should we implement it in all maps or is it better to
check the map type before invoking map->ops->push/pop/seek?
(Maybe checking if map->ops->xxx is NULL will also work)
Since push/pop/peak are only for this queue/stack I thought we won't
be adding 'ops' for them and just call the helpers from progs.
But now I'm having second thoughts, since 3 new commands for syscall
feels like overkill.
At the same time I still don't feel that lookup == pop is the right alias.
Also what peak is going to alias to ?
May be let's go back to your original idea with a tweak:
push == update
peak == lookup
pop = lookup + delete
In other words in case of stack the bpf_map_lookup_elem will return
the pointer to value of top element in the stack and
bpf_map_delete_elem will delete that top element.
Then in user space we can introduce push/pop always_inline functions like:
void bpf_push(struct bpf_map *map, void *value)
{
   bpf_map_update_elem(map, NULL/*key*/, value);
}

void *bpf_pop(struct bpf_map *map)
{
   void * val = bpf_map_lookup_elem(map, NULL/*key*/);
   bpf_map_delete_elem(map, NULL/*key*/);
   return val;
}

Thoughts?
I actually think that having new push/peak/pop BPF map helpers would
be fine, as well as having them sit in map->ops. Initially I'd probably
leave out the syscall ops counterparts so they can be used only from
program.

I also think that having the new helpers is good. I don't have a strong opinion about saving them in map->ops or not, in any case it has to be verified that push/pop/peek are only invoked in stack/queue maps. I think we could force in on the verifier, would it be ok?

Potentially array and per-cpu array could implement them even by having
an internal pointer to the current slot which we move on push/pop. This
would potentially also avoid all the RCU callbacks to free the elements,
elem allocations, list locking, and improve cache locality compared to
the current implementation.

I could change the implementation, a linked list is used when BPF_F_NO_PREALLOC is passed, otherwise a plain array with internal indexes is used. I have a question about RCU and a possible array-based implementation. It should be guarantee that a pointer from a pop() operation is valid for the whole program duration. If an eBPF program A calls pop() it will get a pointer to the head in the array, and the head index will move, it is possible that a second program B (running on a different CPU) pushes a new element overwriting the element pointed by A. I think we need to use a more sophisticated mechanism than just an array and two indexes. Am I missing something?

Agree that existing ops are not the right alias, but deferring to user
space as inline function also doesn't really seem like a good fit, imho,
so I'd prefer rather to have something native. (Aside from that, the
above inline bpf_pop() would also race between CPUs.)

I think we should have push/pop/peek syscalls as well, having a bpf_pop() that is race prone would create problems. Users expected maps operations to be safe, so having one that is not will confuse them.
Thanks,
Daniel


Thanks,
Mauricio.






Reply via email to