Re: netmap: extension to store user data per packet/slot?

2014-11-12 Thread Franco Fichtner
Hi Luigi,

On 12 Nov 2014, at 00:00, Luigi Rizzo ri...@iet.unipi.it wrote:

 apparently you want some user-defined metadata to move
 along with the packet, but i do not think it is
 reasonable to put it in the slots.
 If we do that, what about timestamps, flow IDs,
 interface and queue index and all the rest of the things that
 we normally find in an mbuf/skbuf ? This is not
 going to scale.

that's true.  I'm only suggesting a small extension to be used
freely, but would never consider increasing the slot size beyond
32 bytes in any case.  Keeping it sleek is obviously important.
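
To put numbers on the slot size: a minimal sketch, assuming the current slot consists of a leading 32-bit buf_idx followed by the len, flags and ptr fields, with the userdata field proposed at the start of this thread appended. The names slot_current and slot_extended are invented for this illustration; they only mirror struct netmap_slot so the snippet builds without netmap headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of struct netmap_slot before the proposed change (illustrative name). */
struct slot_current {
	uint32_t buf_idx;   /* buffer index */
	uint16_t len;       /* length for this slot */
	uint16_t flags;     /* buf changed, etc. */
	uint64_t ptr;       /* pointer for indirect buffers */
};

/* Same layout with the proposed userdata field appended (illustrative name). */
struct slot_extended {
	uint32_t buf_idx;
	uint16_t len;
	uint16_t flags;
	uint64_t ptr;
	uint64_t userdata;  /* reserved storage for caller */
};

/* Existing fields keep their offsets inside a single slot. */
static_assert(offsetof(struct slot_extended, ptr) ==
    offsetof(struct slot_current, ptr), "ptr offset must not move");

/* The extended slot stays within the 32-byte limit mentioned above. */
static_assert(sizeof(struct slot_extended) <= 32, "slot too large");
```

On common ABIs the two structs come out at 16 and 24 bytes. The stride of the slot array inside a ring changes accordingly, which is one reason the change would need an API bump.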

 Also consider that at some point you may use a different
 arrangement (with packets passed along VALE switches
 or physical interfaces etc.).  I believe the most
 reasonable place to put the extra info is at the end
 of the packet and possibly bump the length in the slot
 so you are safe in case the packet is copied.

I don't believe dirtying the cache lines in the actual packet
buffer is a wise choice, but it certainly works.
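
A minimal sketch of the trailer approach, assuming a fixed-size metadata record and a caller that knows the buffer size. struct pkt_meta, meta_append and meta_strip are hypothetical names, not netmap API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-packet metadata; not part of netmap. */
struct pkt_meta {
	uint16_t origin_port;   /* e.g. index of the interface the packet came from */
	uint16_t reserved;
};

/*
 * Append meta after the payload and bump the length.
 * Returns the new length, or 0 if the trailer does not fit.
 */
static uint16_t
meta_append(uint8_t *buf, uint16_t len, uint32_t buf_size,
    const struct pkt_meta *meta)
{
	size_t new_len = (size_t)len + sizeof(*meta);

	if (new_len > buf_size || new_len > UINT16_MAX)
		return 0;
	memcpy(buf + len, meta, sizeof(*meta));
	return (uint16_t)new_len;
}

/*
 * Read the trailer back and return the original payload length,
 * or 0 if the packet is too short to carry a trailer.
 */
static uint16_t
meta_strip(const uint8_t *buf, uint16_t len, struct pkt_meta *meta)
{
	if (len < sizeof(*meta))
		return 0;
	memcpy(meta, buf + len - sizeof(*meta), sizeof(*meta));
	return (uint16_t)(len - sizeof(*meta));
}
```

With netmap, buf would be the slot's packet buffer and len the slot's len field. The memcpy in meta_append is the write into the packet buffer that dirties a cache line.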

 There is no timestamp appended to the packet at the moment,
 it was a feature i thought somebody may want to have,
 but between the relative scarcity of hardware that provides
 per-packet timestamps, and the questionable usefulness
 of the same, i doubt it will be available.

It is a useful feature to have receive timestamps per packet
for better accounting, but I can see it being too mystical in
its current form inside the packet buffer.  It's still on my
TODO list to investigate the impact, but a system certainly
works without that extra bit of resolution.

For now, I'll go with Adrian's suggestion and keep track of the
buffer index inside the first process away from netmap(4)
itself.  This setup breaks for non-circular pipe arrangements,
but the load-balancing use case at hand is alright.
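
A minimal sketch of that bookkeeping, assuming buffer indices are unique among the ports the balancer has open and that the worker hands the same buffers back through the pipe. struct origin_table and its helpers are hypothetical names, not part of netmap(4).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ORIGIN_NONE UINT16_MAX

/* Hypothetical side table: one entry per netmap buffer, owned by the balancer. */
struct origin_table {
	uint32_t nbufs;     /* number of entries; must cover all buffer indices in use */
	uint16_t *origin;   /* origin[buf_idx] = interface index, ORIGIN_NONE if unset */
};

static int
origin_table_init(struct origin_table *t, uint32_t nbufs)
{
	t->origin = malloc((size_t)nbufs * sizeof(*t->origin));
	if (t->origin == NULL)
		return -1;
	t->nbufs = nbufs;
	for (uint32_t i = 0; i < nbufs; i++)
		t->origin[i] = ORIGIN_NONE;
	return 0;
}

/* Record where a buffer came from before pushing it into the pipe. */
static void
origin_set(struct origin_table *t, uint32_t buf_idx, uint16_t ifindex)
{
	assert(buf_idx < t->nbufs);
	t->origin[buf_idx] = ifindex;
}

/* Look up and clear the origin when the buffer comes back from the worker. */
static uint16_t
origin_take(struct origin_table *t, uint32_t buf_idx)
{
	uint16_t ifindex;

	assert(buf_idx < t->nbufs);
	ifindex = t->origin[buf_idx];
	t->origin[buf_idx] = ORIGIN_NONE;
	return ifindex;
}
```

The table lives only in the balancer process, so it works only when every buffer returns to that process, which is the circular arrangement described above.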


Cheers,
Franco
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: netmap: extension to store user data per packet/slot?

2014-11-12 Thread Slawa Olhovchenkov
On Tue, Nov 11, 2014 at 10:13:54PM +0100, Franco Fichtner wrote:

 Hi Luigi,
 hi all,
 
 so I was running into logistics issues with netmap(4)
 with regard to zero-copy and redirection through pipes:
 working on a load-balancing framework revealed that it
 is very hard to track a packet's origins to later move
 it onward to the respective outgoing interface, be it
 another device or the host stack.
 
 Long story short: user data needs to be stored for the
 packet buffer or slot.

I think we need configurable (by sysctl) space reserved before the
packet. This may be used as user data, or for inserting
VLAN/MPLS/QinQ/etc. headers.

More generally: Tilera has a good API for this.
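
A minimal sketch of how reserved headroom could be used to insert a VLAN header without moving the payload. No such headroom exists in netmap(4) as discussed in this thread, so struct pkt_view and vlan_push are purely hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VLAN_TAG_LEN    4
#define ETHER_ADDRS_LEN 12   /* destination + source MAC */

/* Hypothetical view of a buffer whose first `off` bytes are unused headroom. */
struct pkt_view {
	uint8_t *base;   /* start of the buffer */
	uint16_t off;    /* offset of the first frame byte (remaining headroom) */
	uint16_t len;    /* frame length */
};

/*
 * Insert an 802.1Q tag by growing the frame into the headroom.
 * Only the two MAC addresses move; the payload stays in place.
 * Returns 0 on success, -1 if the headroom or the frame is too short.
 */
static int
vlan_push(struct pkt_view *p, uint16_t tci)
{
	uint8_t *old_start, *start;

	if (p->off < VLAN_TAG_LEN || p->len < ETHER_ADDRS_LEN ||
	    p->len > UINT16_MAX - VLAN_TAG_LEN)
		return -1;
	old_start = p->base + p->off;
	start = old_start - VLAN_TAG_LEN;
	memmove(start, old_start, ETHER_ADDRS_LEN);
	start[12] = 0x81;               /* TPID 0x8100, network byte order */
	start[13] = 0x00;
	start[14] = (uint8_t)(tci >> 8);
	start[15] = (uint8_t)(tci & 0xff);
	p->off -= VLAN_TAG_LEN;
	p->len += VLAN_TAG_LEN;
	return 0;
}
```

The same headroom could hold user data instead of a tag; in that case off and len would stay unchanged and only the bytes in front of the frame would be written.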

 There are three ways that I can see so far:
 
 (1) Allocate a netmap pipe pair for each interface,
 in case of transparent mode also a pipe for the
 host stack each.  That's a lot of pipes and
 most likely insane, but it won't extend the ABI.
 
 (2) Store the additional data in the actual buffer.
 That is sort of ok, but seems sluggish WRT cache
 behaviour -- maybe the buffer won't be read but
 it needs to be written.  Sure, we can store it at
 the end, but there already resides the packet
 timestamp if enabled (if I recall correctly).
 Wouldn't extend the ABI per se, but might collide
 with the timestamping
 
 (3) Make room in struct netmap_slot itself like this:
 
 diff --git a/sys/net/netmap.h b/sys/net/netmap.h
 index 15ebf73..d0a9c0e 100644
 --- a/sys/net/netmap.h
 +++ b/sys/net/netmap.h
 @@ -147,6 +147,7 @@ struct netmap_slot {
     uint16_t len;       /* length for this slot */
     uint16_t flags;     /* buf changed, etc. */
     uint64_t ptr;       /* pointer for indirect buffers */
 +   uint64_t userdata;  /* reserved storage for caller */
  };
 
 It could also be broken down in two fields with uint32_t
 each; not sure what would be more sensible.  This of course
 requires an API bump, although it should be backwards
 compatible.
 
 Any feedback on this is highly appreciated.
 
 
 Cheers,
 Franco


Re: netmap: extension to store user data per packet/slot?

2014-11-12 Thread Luigi Rizzo
On Wed, Nov 12, 2014 at 11:16 AM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:

 On Tue, Nov 11, 2014 at 10:13:54PM +0100, Franco Fichtner wrote:

  Hi Luigi,
  hi all,
 
  so I was running into logistics issues with netmap(4)
  with regard to zero-copy and redirection through pipes:
  working on a load-balancing framework revealed that it
  is very hard to track a packet's origins to later move
  it onward to the respective outgoing interface, be it
  another device or the host stack.
 
  Long story short: user data needs to be stored for the
  packet buffer or slot.

 I think we need configurable (by sysctl) space reserved before the
 packet. This may be used as user data, or for inserting
 VLAN/MPLS/QinQ/etc. headers.


this is yet another requirement: not just metadata but
also encapsulation.

For the record, the VALE switch does have TSO support (implemented
through the VHOST header) so that VMs can pass large segments
across a switch and they are properly split when traffic goes
to a physical interface or a port that does not support the
header. We also support scatter-gather I/O at least on the switch
(haven't implemented this feature yet on NICs).

But please consider that following this route we end up with more or
less the same complications that afflict the standard stack:
everything is configurable and decided at runtime, and the code
becomes a maze of conditionals or indirect function calls
with little chance of optimisations.

Also, no single sysctl works for all cases. Different ports
typically have different encapsulation sizes, and NICs may have
alignment constraints (even those that have none suffer if buffers
are not 64-byte aligned), so you'll end up with scatter-gather I/O
or copying anyway.

After two years of experience with netmap i am not so sure
anymore that zero copy makes much sense, except perhaps for
the case of large packets (but i am not so sure about that, either).

Apart from benchmarks, if you want to do something useful with the
packets you need to read the header, at which point the concerns
on having data in cache or not are less significant and the cost
of the copy is heavily reduced. Tracking ownership of buffers
(which is needed for zero copy) is also expensive even when
they are not shared (and we have great trouble in managing the
extra buffers we recently added to the netmap API to support
zero-copy, to the point that I am tempted to remove the feature).

cheers
luigi

Re: netmap: extension to store user data per packet/slot?

2014-11-11 Thread Adrian Chadd
... I'm confused. You have the slot id already, right? Why not
allocate an array of userdata pointers somewhere else and just use the
netmap slot id as an indirection into that?



-adrian


On 11 November 2014 13:13, Franco Fichtner fra...@lastsummer.de wrote:
 Hi Luigi,
 hi all,

 so I was running into logistics issues with netmap(4)
 with regard to zero-copy and redirection through pipes:
 working on a load-balancing framework revealed that it
 is very hard to track a packet's origins to later move
 it onward to the respective outgoing interface, be it
 another device or the host stack.

 Long story short: user data needs to be stored for the
 packet buffer or slot.

 There are three ways that I can see so far:

 (1) Allocate a netmap pipe pair for each interface,
 in case of transparent mode also a pipe for the
 host stack each.  That's a lot of pipes and
 most likely insane, but it won't extend the ABI.

 (2) Store the additional data in the actual buffer.
 That is sort of ok, but seems sluggish WRT cache
 behaviour -- maybe the buffer won't be read but
 it needs to be written.  Sure, we can store it at
 the end, but there already resides the packet
 timestamp if enabled (if I recall correctly).
 Wouldn't extend the ABI per se, but might collide
 with the timestamping

 (3) Make room in struct netmap_slot itself like this:

 diff --git a/sys/net/netmap.h b/sys/net/netmap.h
 index 15ebf73..d0a9c0e 100644
 --- a/sys/net/netmap.h
 +++ b/sys/net/netmap.h
 @@ -147,6 +147,7 @@ struct netmap_slot {
     uint16_t len;       /* length for this slot */
     uint16_t flags;     /* buf changed, etc. */
     uint64_t ptr;       /* pointer for indirect buffers */
 +   uint64_t userdata;  /* reserved storage for caller */
  };

 It could also be broken down in two fields with uint32_t
 each; not sure what would be more sensible.  This of course
 requires an API bump, although it should be backwards
 compatible.

 Any feedback on this is highly appreciated.


 Cheers,
 Franco


Re: netmap: extension to store user data per packet/slot?

2014-11-11 Thread Franco Fichtner
Hi Adrian,

On 11 Nov 2014, at 22:22, Adrian Chadd adr...@freebsd.org wrote:

 ... I'm confused. You have the slot id already, right? Why not
 allocate an array of userdata pointers somewhere else and just use the
 netmap slot id as an indirection into that?

The slot id is per ring and there are a lot of them.  In case of
zero-copy the slot changes at least once.  Consider two processes
for the load balancing case.  Process 1 attaches to the devices
and Process 2 only has a pipe pair for receiving and sending
packets back to Process 1 after processing, because only that
process has access to the real devices:

em0, em1, etc. --RX/TX-- Process 1 --pipe pair-- Process 2
(Hardware)               (Balancer)               (Worker)

There is no way to trace packet origin back to em0 or em1 after
pushing the packets through the pipe pair unless either the
pipes are unique for each device or there is another means to
keep its state.

Should, however, the buffer id be unique that would make it
easy to do what you suggest, but I don't know the netmap(4)
internals by heart.
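
To illustrate why the slot id is not stable while the buffer index travels with the packet: a minimal sketch of the usual zero-copy forwarding step, using a reduced stand-in for struct netmap_slot. struct slot and slot_swap are illustrative names; only NS_BUF_CHANGED is taken from netmap.h.

```c
#include <assert.h>
#include <stdint.h>

#define NS_BUF_CHANGED 0x0001   /* same value as in netmap.h */

/* Reduced stand-in for struct netmap_slot, so the sketch builds without netmap headers. */
struct slot {
	uint32_t buf_idx;
	uint16_t len;
	uint16_t flags;
};

/*
 * Zero-copy forward: hand the RX buffer to the TX slot and take the
 * TX slot's spare buffer in exchange. The packet keeps its buf_idx
 * but now sits in a different slot, possibly on a different ring.
 */
static void
slot_swap(struct slot *rx, struct slot *tx)
{
	uint32_t spare = tx->buf_idx;

	tx->buf_idx = rx->buf_idx;
	tx->len = rx->len;
	rx->buf_idx = spare;
	/* Tell the kernel that both slots now point at different buffers. */
	tx->flags |= NS_BUF_CHANGED;
	rx->flags |= NS_BUF_CHANGED;
}
```

After the swap, anything keyed on the slot position refers to the spare buffer, while anything keyed on buf_idx still refers to the packet. Whether buf_idx is unique across all rings involved is the open question raised above.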

It seems a wee unnatural to rebuild tracing of packets in
userland when netmap(4) has all the infrastructure needed to
deal with this effectively, but I'm not opposed to doing that
to avoid API/ABI changes.  Speaking of those, should volatile
internals change regarding the buffer id that would also break
the attempts to deal with the issue consistently.


Cheers,
Franco


Re: netmap: extension to store user data per packet/slot?

2014-11-11 Thread Adrian Chadd
On 11 November 2014 13:41, Franco Fichtner fra...@lastsummer.de wrote:
 Hi Adrian,

 On 11 Nov 2014, at 22:22, Adrian Chadd adr...@freebsd.org wrote:

 ... I'm confused. You have the slot id already, right? Why not
 allocate an array of userdata pointers somewhere else and just use the
 netmap slot id as an indirection into that?

 The slot id is per ring and there are a lot of them.  In case of
 zero-copy the slot changes at least once.  Consider two processes
 for the load balancing case.  Process 1 attaches to the devices
 and Process 2 only has a pipe pair for receiving and sending
 packets back to Process 1 after processing, because only that
 process has access to the real devices:

 em0, em1, etc. --RX/TX-- Process 1 --pipe pair-- Process 2
 (Hardware)               (Balancer)               (Worker)

 There is no way to trace packet origin back to em0 or em1 after
 pushing the packets through the pipe pair unless either the
 pipes are unique for each device or there is another means to
 keep its state.

 Should, however, the buffer id be unique that would make it
 easy to do what you suggest, but I don't know the netmap(4)
 internals by heart.

 It seems a wee unnatural to rebuild tracing of packets in
 userland when netmap(4) has all the infrastructure needed to
 deal with this effectively, but I'm not opposed to doing that
 to avoid API/ABI changes.  Speaking of those, should volatile
 internals change regarding the buffer id that would also break
 the attempts to deal with the issue consistently.

Ah, I see. You're missing some unique identifier for each netmap
buffer. I thought there was one already. Silly me.

Luigi?



-adrian


Re: netmap: extension to store user data per packet/slot?

2014-11-11 Thread Franco Fichtner

On 11 Nov 2014, at 22:48, Adrian Chadd adr...@freebsd.org wrote:

 Ah, I see. You're missing some unique identifier for each netmap
 buffer. I thought there was one already. Silly me.

Exactly, and, no, thank you for making clear what is needed.  :)

A little more on this: I think struct netmap_slot is convenient
because in zero-copy one wouldn't want to touch the actual buffer,
for speed, and userland code already touches slot internals on each
ring transition, so there is no performance degradation.

The key benefit is that if userland can use this storage freely,
netmap(4) doesn't get in the way of building complex setups that
require decoupled logic, and each ring hop may alter the state
as required.


Cheers,
Franco


Re: netmap: extension to store user data per packet/slot?

2014-11-11 Thread Luigi Rizzo
Franco,
apparently you want some user-defined metadata to move
along with the packet, but i do not think it is
reasonable to put it in the slots.
If we do that, what about timestamps, flow IDs,
interface and queue index and all the rest of the things that
we normally find in an mbuf/skbuf ? This is not
going to scale.

Also consider that at some point you may use a different
arrangement (with packets passed along VALE switches
or physical interfaces etc.).  I believe the most
reasonable place to put the extra info is at the end
of the packet and possibly bump the length in the slot
so you are safe in case the packet is copied.

There is no timestamp appended to the packet at the moment,
it was a feature i thought somebody may want to have,
but between the relative scarcity of hardware that provides
per-packet timestamps, and the questionable usefulness
of the same, i doubt it will be available.

cheers
luigi


On Tue, Nov 11, 2014 at 2:01 PM, Franco Fichtner fra...@lastsummer.de wrote:


 On 11 Nov 2014, at 22:48, Adrian Chadd adr...@freebsd.org wrote:

  Ah, I see. You're missing some unique identifier for each netmap
  buffer. I thought there was one already. Silly me.

 Exactly, and, no, thank you for making clear what is needed.  :)

 A little more on this: I think struct netmap_slot is convenient
 because in zero-copy one wouldn't want to touch the actual buffer,
 for speed, and userland code already touches slot internals on each
 ring transition, so there is no performance degradation.

 The key benefit is that if userland can use this storage freely,
 netmap(4) doesn't get in the way of building complex setups that
 require decoupled logic, and each ring hop may alter the state
 as required.


 Cheers,
 Franco




-- 
-+---
 Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/. Universita` di Pisa
 TEL  +39-050-2211611   . via Diotisalvi 2
 Mobile   +39-338-6809875   . 56122 PISA (Italy)
-+---