Re: netmap: extension to store user data per packet/slot?
Hi Luigi,

On 12 Nov 2014, at 00:00, Luigi Rizzo ri...@iet.unipi.it wrote:

> apparently you want some user-defined metadata to move along with the packet, but i do not think it is reasonable to put it in the slots. If we do that, what about timestamps, flow IDs, interface and queue index and all the rest of the things that we normally find in an mbuf/skbuf? This is not going to scale.

That's true. I'm only suggesting a small extension to be used freely, but would never consider increasing the slot size beyond 32 bytes in any case. Keeping it sleek is obviously important.

> Also consider that at some point you may use a different arrangement (with packets passed along VALE switches or physical interfaces etc.) i believe the most reasonable place to put the extra info is at the end of the packet and possibly bump the length in the slot so you are safe in case the packet is copied.

I don't believe dirtying the cache lines in the actual packet buffer is a wise choice, but it certainly works.

> There is no timestamp appended to the packet at the moment, it was a feature i thought somebody may want to have, but between the relative scarcity of hardware that provides per-packet timestamps, and the questionable usefulness of the same, i doubt it will be available.

It is a useful feature to have receive timestamps per packet for better accounting, but I can see it being too mystical in its current form inside the packet buffer. It's still on my TODO list to investigate the impact, but a system certainly works without that extra bit of resolution.

For now, I'll go with Adrian's suggestion and keep track of the buffer index inside the first process, away from netmap(4) itself. This setup breaks for non-circular pipe arrangements, but the load-balancing use case at hand is alright.

Cheers,
Franco

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
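Franco's plan above (tracking state by buffer index inside the balancer process) could look roughly like the sketch below. It assumes that the buffer index stays the same while a buffer travels through the pipe pair and back, which the thread leaves open. The names `pkt_origin`, `origin_table`, `origin_set` and `origin_get` are hypothetical and not part of netmap(4).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-buffer record kept privately by the balancer process. */
struct pkt_origin {
	uint16_t ifindex;	/* which attached port the packet came from */
	uint16_t ringid;	/* which RX ring of that port */
};

/* Side table with one entry per netmap buffer, indexed by buffer index. */
struct origin_table {
	struct pkt_origin *entries;
	uint32_t nbufs;		/* number of buffers the table covers (assumed known) */
};

/* Allocate a zeroed table; returns 0 on success, -1 on allocation failure. */
static int
origin_table_init(struct origin_table *t, uint32_t nbufs)
{
	t->entries = calloc(nbufs, sizeof(*t->entries));
	t->nbufs = nbufs;
	return (t->entries != NULL ? 0 : -1);
}

/* Record the origin when a buffer is taken off a hardware RX ring. */
static void
origin_set(struct origin_table *t, uint32_t buf_idx, uint16_t ifindex,
    uint16_t ringid)
{
	assert(buf_idx < t->nbufs);
	t->entries[buf_idx].ifindex = ifindex;
	t->entries[buf_idx].ringid = ringid;
}

/* Look the origin up again when the buffer comes back through the pipe. */
static struct pkt_origin
origin_get(const struct origin_table *t, uint32_t buf_idx)
{
	assert(buf_idx < t->nbufs);
	return (t->entries[buf_idx]);
}
```

The table lives only in the first process, so a worker that injects a fresh buffer (the non-circular arrangement Franco mentions) would find no meaningful entry for it.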
Re: netmap: extension to store user data per packet/slot?
On Tue, Nov 11, 2014 at 10:13:54PM +0100, Franco Fichtner wrote:

> Hi Luigi, hi all,
>
> so I was running into logistics issues with netmap(4) with regard to zero-copy and redirection through pipes: working on a load-balancing framework revealed that it is very hard to track a packet's origins to later move it onward to the respective outgoing interface, be it another device or the host stack.
>
> Long story short: user data needs to be stored for the packet buffer or slot.

I think we need configurable (by sysctl) space reserved before the packet. This may be used as user data, or for inserting VLAN/MPLS/QinQ/etc. headers. More generally: Tilera has a good API for this.

> There are three ways that I can see so far:
>
> (1) Allocate a netmap pipe pair for each interface, in case of transparent mode also a pipe for the host stack each. That's a lot of pipes and most likely insane, but it won't extend the ABI.
>
> (2) Store the additional data in the actual buffer. That is sort of ok, but seems sluggish WRT cache behaviour -- maybe the buffer won't be read but it needs to be written. Sure, we can store it at the end, but there already resides the packet timestamp if enabled (if I recall correctly). Wouldn't extend the ABI per se, but might collide with the timestamping.
>
> (3) Make room in struct netmap_slot itself like this:
>
> diff --git a/sys/net/netmap.h b/sys/net/netmap.h
> index 15ebf73..d0a9c0e 100644
> --- a/sys/net/netmap.h
> +++ b/sys/net/netmap.h
> @@ -147,6 +147,7 @@ struct netmap_slot {
>          uint16_t len;           /* length for this slot */
>          uint16_t flags;         /* buf changed, etc. */
>          uint64_t ptr;           /* pointer for indirect buffers */
> +        uint64_t userdata;      /* reserved storage for caller */
>  };
>
> It could also be broken down in two fields with uint32_t each; not sure what would be more sensible. This of course requires an API bump, although it should be backwards compatible.
>
> Any feedback on this is highly appreciated.
>
> Cheers,
> Franco
Re: netmap: extension to store user data per packet/slot?
On Wed, Nov 12, 2014 at 11:16 AM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:

> On Tue, Nov 11, 2014 at 10:13:54PM +0100, Franco Fichtner wrote:
>
>> Hi Luigi, hi all,
>>
>> so I was running into logistics issues with netmap(4) with regard to zero-copy and redirection through pipes: working on a load-balancing framework revealed that it is very hard to track a packet's origins to later move it onward to the respective outgoing interface, be it another device or the host stack.
>>
>> Long story short: user data needs to be stored for the packet buffer or slot.
>
> I think we need configurable (by sysctl) space reserved before the packet. This may be used as user data, or for inserting VLAN/MPLS/QinQ/etc. headers.

this is yet another requirement: not just metadata but also encapsulation.

For the record, the VALE switch does have TSO support (implemented through the VHOST header) so that VMs can pass large segments across a switch and they are properly split when traffic goes to a physical interface or a port that does not support the header. We also support scatter-gather I/O at least on the switch (haven't implemented this feature yet on NICs).

But please consider that following this route we end up more or less in the same complications that afflict the standard stack: everything is configurable and decided at runtime, and the code becomes a maze of conditionals or indirect function calls with little chance of optimisations.

Also, it's not that one sysctl works for all cases. Different ports typically have different encapsulation sizes, NICs may have alignment constraints (even those that don't have them suffer if buffers are not 64-byte aligned), so you'll end up with scatter-gather I/O or copying anyways.

After two years of experience with netmap i am not so sure anymore that zero copy makes much sense, except perhaps for the case of large packets (but i am not so sure about that, either).

Apart from benchmarks, if you want to do something useful with the packets you need to read the header, at which point the concerns on having data in cache or not are less significant and the cost of the copy is heavily reduced.

Tracking ownership of buffers (which is needed for zero copy) is also expensive even when they are not shared (and we have great trouble in managing the extra buffers we recently added to the netmap API to support zero-copy, to the point that I am tempted to remove the feature).

cheers
luigi
Re: netmap: extension to store user data per packet/slot?
... I'm confused. You have the slot id already, right? Why not allocate an array of userdata pointers somewhere else and just use the netmap slot id as an indirection into that?

-adrian

On 11 November 2014 13:13, Franco Fichtner fra...@lastsummer.de wrote:

> Hi Luigi, hi all,
>
> so I was running into logistics issues with netmap(4) with regard to zero-copy and redirection through pipes: working on a load-balancing framework revealed that it is very hard to track a packet's origins to later move it onward to the respective outgoing interface, be it another device or the host stack.
>
> Long story short: user data needs to be stored for the packet buffer or slot.
>
> There are three ways that I can see so far:
>
> (1) Allocate a netmap pipe pair for each interface, in case of transparent mode also a pipe for the host stack each. That's a lot of pipes and most likely insane, but it won't extend the ABI.
>
> (2) Store the additional data in the actual buffer. That is sort of ok, but seems sluggish WRT cache behaviour -- maybe the buffer won't be read but it needs to be written. Sure, we can store it at the end, but there already resides the packet timestamp if enabled (if I recall correctly). Wouldn't extend the ABI per se, but might collide with the timestamping.
>
> (3) Make room in struct netmap_slot itself like this:
>
> diff --git a/sys/net/netmap.h b/sys/net/netmap.h
> index 15ebf73..d0a9c0e 100644
> --- a/sys/net/netmap.h
> +++ b/sys/net/netmap.h
> @@ -147,6 +147,7 @@ struct netmap_slot {
>          uint16_t len;           /* length for this slot */
>          uint16_t flags;         /* buf changed, etc. */
>          uint64_t ptr;           /* pointer for indirect buffers */
> +        uint64_t userdata;      /* reserved storage for caller */
>  };
>
> It could also be broken down in two fields with uint32_t each; not sure what would be more sensible. This of course requires an API bump, although it should be backwards compatible.
>
> Any feedback on this is highly appreciated.
>
> Cheers,
> Franco
Re: netmap: extension to store user data per packet/slot?
Hi Adrian,

On 11 Nov 2014, at 22:22, Adrian Chadd adr...@freebsd.org wrote:

> ... I'm confused. You have the slot id already, right? Why not allocate an array of userdata pointers somewhere else and just use the netmap slot id as an indirection into that?

The slot id is per ring and there are a lot of them. In case of zero-copy the slot changes at least once.

Consider two processes for the load balancing case. Process 1 attaches to the devices and Process 2 only has a pipe pair for receiving and sending packets back to Process 1 after processing, because only that process has access to the real devices:

em0, em1, etc. --RX/TX-- Process 1 --pipe pair-- Process 2
(Hardware)               (Balancer)              (Worker)

There is no way to trace packet origin back to em0 or em1 after pushing the packets through the pipe pair unless either the pipes are unique for each device or there is another means to keep its state.

Should, however, the buffer id be unique, that would make it easy to do what you suggest, but I don't know the netmap(4) internals by heart.

It seems a wee unnatural to rebuild tracing of packets in userland when netmap(4) has all the infrastructure needed to deal with this effectively, but I'm not opposed to doing that to avoid API/ABI changes. Speaking of those, should volatile internals change regarding the buffer id, that would also break the attempts to deal with the issue consistently.

Cheers,
Franco
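To illustrate why the slot id is not a stable key under zero-copy, here is a minimal sketch of the usual buffer swap between an RX slot and a TX slot. The `struct slot` below is a stand-in that mirrors the fields of struct netmap_slot so the example compiles without the netmap headers, and `SLOT_BUF_CHANGED` stands in for NS_BUF_CHANGED; both names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct netmap_slot (same fields, hypothetical name). */
struct slot {
	uint32_t buf_idx;	/* buffer index */
	uint16_t len;		/* length for this slot */
	uint16_t flags;		/* buf changed, etc. */
	uint64_t ptr;		/* pointer for indirect buffers */
};

#define SLOT_BUF_CHANGED 0x0001	/* stand-in for NS_BUF_CHANGED */

/*
 * Zero-copy forward: the packet data is not copied, the two slots simply
 * exchange buffer indices. After the swap the packet sits in a different
 * slot of a different ring, but its buffer index is unchanged.
 */
static void
slot_swap(struct slot *rx, struct slot *tx)
{
	uint32_t tmp = tx->buf_idx;

	tx->buf_idx = rx->buf_idx;
	rx->buf_idx = tmp;
	tx->len = rx->len;
	/* Tell the kernel that both slots now point at different buffers. */
	tx->flags |= SLOT_BUF_CHANGED;
	rx->flags |= SLOT_BUF_CHANGED;
}
```

Anything stored per slot position is left behind by such a swap, whereas state keyed by `buf_idx` travels with the packet for as long as the buffer itself is passed along.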
Re: netmap: extension to store user data per packet/slot?
On 11 November 2014 13:41, Franco Fichtner fra...@lastsummer.de wrote:

> Hi Adrian,
>
> On 11 Nov 2014, at 22:22, Adrian Chadd adr...@freebsd.org wrote:
>
>> ... I'm confused. You have the slot id already, right? Why not allocate an array of userdata pointers somewhere else and just use the netmap slot id as an indirection into that?
>
> The slot id is per ring and there are a lot of them. In case of zero-copy the slot changes at least once.
>
> Consider two processes for the load balancing case. Process 1 attaches to the devices and Process 2 only has a pipe pair for receiving and sending packets back to Process 1 after processing, because only that process has access to the real devices:
>
> em0, em1, etc. --RX/TX-- Process 1 --pipe pair-- Process 2
> (Hardware)               (Balancer)              (Worker)
>
> There is no way to trace packet origin back to em0 or em1 after pushing the packets through the pipe pair unless either the pipes are unique for each device or there is another means to keep its state.
>
> Should, however, the buffer id be unique, that would make it easy to do what you suggest, but I don't know the netmap(4) internals by heart.
>
> It seems a wee unnatural to rebuild tracing of packets in userland when netmap(4) has all the infrastructure needed to deal with this effectively, but I'm not opposed to doing that to avoid API/ABI changes. Speaking of those, should volatile internals change regarding the buffer id, that would also break the attempts to deal with the issue consistently.

Ah, I see. You're missing some unique identifier for each netmap buffer. I thought there was one already. Silly me.

Luigi?

-adrian
Re: netmap: extension to store user data per packet/slot?
On 11 Nov 2014, at 22:48, Adrian Chadd adr...@freebsd.org wrote:

> Ah, I see. You're missing some unique identifier for each netmap buffer. I thought there was one already. Silly me.

Exactly, and, no, thank you for making clear what is needed. :)

A little more on this: I think struct netmap_slot is convenient because in zero-copy mode one wouldn't want to touch the actual buffer, for speed, and userland code already touches slot internals on each ring transition, so there is no performance degradation.

The key benefit is that if userland can use this storage freely, netmap(4) doesn't get in the way of building complex setups that require decoupled logic, and each ring hop may alter the state as required.

Cheers,
Franco
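For reference, a sketch of what the proposed slot layout would cost in size. The struct below is a stand-in (hypothetical name `slot_with_userdata`) that combines the fields from the diff with the `buf_idx` field that precedes them in struct netmap_slot. The pack/unpack helpers are also hypothetical; they only illustrate the "two uint32_t fields" variant mentioned in the original proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct netmap_slot with the proposed extra field. */
struct slot_with_userdata {
	uint32_t buf_idx;	/* buffer index */
	uint16_t len;		/* length for this slot */
	uint16_t flags;		/* buf changed, etc. */
	uint64_t ptr;		/* pointer for indirect buffers */
	uint64_t userdata;	/* reserved storage for caller */
};

/* 16 bytes today plus 8 for userdata: still below the 32-byte limit. */
_Static_assert(sizeof(struct slot_with_userdata) == 24,
    "slot grows from 16 to 24 bytes");
_Static_assert(offsetof(struct slot_with_userdata, userdata) == 16,
    "userdata is appended after the existing fields");

/* Treat the 64-bit field as two 32-bit values, e.g. port and ring. */
static uint64_t
userdata_pack(uint32_t hi, uint32_t lo)
{
	return (((uint64_t)hi << 32) | lo);
}

static uint32_t
userdata_hi(uint64_t u)
{
	return ((uint32_t)(u >> 32));
}

static uint32_t
userdata_lo(uint64_t u)
{
	return ((uint32_t)u);
}
```

Note that a 24-byte slot no longer divides a 64-byte cache line evenly, so some slots would straddle two lines; whether that matters in practice is not something this thread measures.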
Re: netmap: extension to store user data per packet/slot?
Franco,

apparently you want some user-defined metadata to move along with the packet, but i do not think it is reasonable to put it in the slots. If we do that, what about timestamps, flow IDs, interface and queue index and all the rest of the things that we normally find in an mbuf/skbuf? This is not going to scale.

Also consider that at some point you may use a different arrangement (with packets passed along VALE switches or physical interfaces etc.)

i believe the most reasonable place to put the extra info is at the end of the packet and possibly bump the length in the slot so you are safe in case the packet is copied.

There is no timestamp appended to the packet at the moment, it was a feature i thought somebody may want to have, but between the relative scarcity of hardware that provides per-packet timestamps, and the questionable usefulness of the same, i doubt it will be available.

cheers
luigi

On Tue, Nov 11, 2014 at 2:01 PM, Franco Fichtner fra...@lastsummer.de wrote:

> On 11 Nov 2014, at 22:48, Adrian Chadd adr...@freebsd.org wrote:
>
>> Ah, I see. You're missing some unique identifier for each netmap buffer. I thought there was one already. Silly me.
>
> Exactly, and, no, thank you for making clear what is needed. :)
>
> A little more on this: I think struct netmap_slot is convenient due to the fact that in zero-copy one wouldn't want to mess with the actual buffer for speed and userland code already touches slot internals for each ring transition so there is no performance degradation.
>
> The key benefit is that if userland can use this storage freely netmap(4) doesn't get in the way of building complex setups that require decoupled logic and each ring hop may alter the state as required.
>
> Cheers,
> Franco

--
-+---
Prof. Luigi RIZZO, ri...@iet.unipi.it . Dip. di Ing. dell'Informazione
http://www.iet.unipi.it/~luigi/ . Universita` di Pisa
TEL +39-050-2211611 . via Diotisalvi 2
Mobile +39-338-6809875 . 56122 PISA (Italy)
-+---
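Luigi's suggestion of carrying the extra info at the end of the packet and bumping the slot length could be sketched as follows. This is only an illustration under assumptions: the trailer layout `pkt_trailer` and the helpers `trailer_append` and `trailer_strip` are hypothetical, and the caller is assumed to know the size of the packet buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical metadata carried behind the packet data. */
struct pkt_trailer {
	uint16_t ifindex;	/* port the packet was received on */
	uint16_t ringid;	/* RX ring on that port */
};

/*
 * Copy the trailer behind the packet data. Returns the new length to store
 * in the slot, or -1 if the trailer does not fit in the buffer.
 */
static int
trailer_append(char *buf, uint16_t len, size_t bufsize,
    const struct pkt_trailer *t)
{
	size_t newlen = (size_t)len + sizeof(*t);

	if (newlen > bufsize || newlen > UINT16_MAX)
		return (-1);
	memcpy(buf + len, t, sizeof(*t));
	return ((int)newlen);
}

/*
 * Read the trailer back. Returns the original packet length, or -1 if the
 * slot length is too short to contain a trailer.
 */
static int
trailer_strip(const char *buf, uint16_t len, struct pkt_trailer *t)
{
	if (len < sizeof(*t))
		return (-1);
	memcpy(t, buf + len - sizeof(*t), sizeof(*t));
	return ((int)(len - sizeof(*t)));
}
```

Because the slot length includes the trailer, a copy of the packet carries the metadata along, which is the safety property Luigi points out. The trailer has to be stripped (length restored) before the packet leaves on a physical interface, and writing it does dirty a cache line of the buffer, which is Franco's objection.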