On 2020/05/04 1:22, William Tu wrote:
On Tue, Apr 21, 2020 at 11:47:04PM +0900, Toshiaki Makita wrote:
This adds a reference program, flowtable_afxdp.o, which can be used to
offload flows to XDP through netdev-offload-xdp.
The program will be compiled when using --enable-bpf switch.
Signed-off-by: Toshiaki Makita <[email protected]>
Hi Toshiaki,
Thanks for your patch, I haven't tried to run it but I have
questions about miniflow.
---
...
+SEC("xdp") int
+flowtable_afxdp(struct xdp_md *ctx)
+{
+ struct xdp_miniflow *pkt_mf;
+ struct xdp_subtable_mask *subtable_mask;
+ int *head;
+ struct xdp_flow_actions *xdp_actions = NULL;
+ struct nlattr *a;
+ unsigned int left;
+ int cnt, idx, zero = 0;
+
+ account_debug(0);
+
+ head = bpf_map_lookup_elem(&subtbl_masks_hd, &zero);
+ if (!head) {
+ return XDP_ABORTED;
+ }
+ if (*head == XDP_SUBTABLES_TAIL) {
+ /* Offload not enabled */
+ goto upcall;
+ }
+
+ /* Get temporary storage for storing packet miniflow */
+ pkt_mf = bpf_map_lookup_elem(&pkt_mf_tbl, &zero);
+ if (!pkt_mf) {
+ return XDP_ABORTED;
+ }
+
I start to wonder what's the benefit of using miniflow in XDP?
miniflow tries to compress the large flow key into smaller memory,
and with a flowmap indicating which offset in bits are valid.
And when adding a new subtable at flow_put, the subtable's key
size can be smaller with only the needed fields.
But in the case of XDP/eBPF, we have to statically allocated fixed
key size (the subtbl_template has key as struct xdp_flow_key), so
each subtable is always having key with full key fields. (not saving
memory space) Moreover with miniflow, it makes the 'mask_key' function
below pretty complicated.
Fixed sized subtable is restriction of map-in-map.
The benefit to use miniflow I envisioned initially was
- compress the key
- use existing function to convert flows for XDP
The first one is actually not doable due to map-in-map. I hope someday the
restriction gets loosen...
The second one is doable and it's good for userspace offload driver.
But I ended up with rewriting most helper functions of miniflow for BPF due to some
restriction.
What if we don't compress the flow key?
unlike to OVS userspace, more like the kernel's implementation of OVS
megaflow (ex: masked_flow_lookup in net/openvswitch/flow_table.c)
Without miniflow the BPF program will be simple like the original xdp_flow patch for
Linux kernel.
I can try it.
Where it has a list of masks (like your 'subtbl_masks' map) but
only use 1 hash table (so we don't need map-in-map).
We lookup the hash table multiple times, each
time applied different masks in the mask list until finding a match.
For HW offload one big hash table may help. Fow software it will be slower as key
size and entries are larger.
Maybe we can support both types, one big hash table and multiple subtables, in the
future?
As I don't have BPF offloadable NICs I prefer to keep current multiple subtables for
now.
Toshiaki Makita
However, I'm not sure whether this will hit some limitations of verifier.
Please correct me if I understand wrong here.
Thanks
William
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev