On 17 February 2018 at 03:36, Bill Fischofer <bill.fischo...@linaro.org> wrote:
> Changed to post this to the ODP mailing list, as this is a good topic to
> discuss there.
>
> Let's first back up a bit and discuss the intent behind packet references.
> Packets typically consist of one or more headers followed by a payload.
> References are designed to allow multiple unique headers to share a common
> payload, which must be treated as read/only. This is because headers
> typically have things like unique destination addresses (for multicast) or
> IP identifiers (for retransmission). A fundamental assumption of the ODP
> programming model is that a packet handle is owned by only one thread at a
> time. If two or more threads were to make simultaneous accesses to a packet
> through the same handle, the result would be unpredictable race conditions
> and likely data corruption.This is why the shared part of any packet
> reference must be treated as read only by all parties.
>
> So any packet reference consists of a R/W prefix, which is unique to that
> reference, followed by a (potentially) shared R/O suffix. The point at which
> the R/O suffix begins is set by the offset parameter. For dynamic references
> the prefix can either start out null, which is what odp_packet_ref() does,
> or can be supplied by the caller, which is what odp_packet_ref_pkt() does.
> In both cases the R/W prefix can be modified
>
> With this background let's look at the specifics.
>
> On Fri, Feb 16, 2018 at 9:31 AM, Bogdan Pricope <bogdan.pric...@linaro.org>
> wrote:
>>
>> Hi,
>>
>> I am trying to use ODP packet reference API to improve transmission of
>> TCP data. Yet, it seems current API is not enough for this.
>>
>> Usecase:  three data areas stored inside odp_packet_t
>>
>> mb1: size 40 bytes – to send last 20 bytes
>> mb2: size 40 bytes – to send 40 bytes
>> mb3: size 40 bytes – to send first 20 bytes
>>
>
> Are you looking at three areas within the same packet or three areas that
> are in separate packets? You cannot create a reference to part of a packet.
> The only control is the starting offset within the packet that is to become
> the shared portion of the reference. That extends to the end of the packet
> on which the reference is created. We had considered adding a length
> parameter but that would pose undue implementation complications.

Each data area is in separate odp packet: this is generated by 3 calls
to an API equivalent to:

ssize_t send(int sockfd, const void *buf, size_t len, int flags);

Buffer 'buf' provided by application is stored as odp_packet_t inside
socket send buffer array.

Packetization function receives as arguments: a list of payload
buffers (odp_packet_t), offset (start of the un-ack data) and
a len (min(data available, send window size)). Length parameter is needed.

>
>>
>>
>> Resulting packet should look like:
>> mb0: len 32 bytes to hold IPv4 + TCP headers
>> reference to mb1: 20 bytes offset, 20 bytes len
>> reference to mb2: 0 bytes offset, 40 bytes len
>> reference to mb3: 0 offset, 20 bytes len
>
>
> What you're describing here is either a composite packet that is assembled
> from multiple different packets via odp_packet_concat(), or a compound
> reference where the packet on which you're creating a reference is itself a
> reference. You cannot create a reference directly to more than one packet.

I meant a compound reference.

>
>>
>>
>>
>> API calls:
>> pkt = odp_packet_alloc(pool, 32);
>> pkt = odp_packet_ref_pkt(mb1, 20, pkt);
>> pkt = odp_packet_ref_pkt(mb2, 0, pkt);
>> pkt = odp_packet_ref_pkt(mb3, 0, pkt);
>
>
> Not sure if this is an omission, but having pkt on both sides of the
> assignment here is attempting to create a circular reference (pkt refers to

I thought it works like:

pkt = pkt + new_buff;

> itself) and that is expressly forbidden by the APIs. The proper way to do
> what I think you're trying to do is:
>
> odp_packet_t mb0, mb1, mb2, mb3, refpkt;
>
> ... setup mb0 to hold 14 bytes of Ethernet header
> ... setup mb1 to hold 20 bytes of IPv4 header
> ... setup mb2 to hold 20 bytes of TCP header
> ... setup mb3 to hold your payload
>
> refpkt = odp_packet_ref_pkt(odp_packet_ref_pkt(odp_packet_ref_pkt(mb3, 0,
> mb2), 0, mb1), 0, mb0);
>

It should work (except for missing length part) on single thread
scenario, yet is an odd construction.
Not sure if single thread scenario is good enough for this usecase.


> You of course could use intermediate assignments to capture the handles for
> the intermediate reference.
>
> The result is refpkt is the handle to the composite reference {mb0 || mb1 ||
> mb2 || mb3}, where only mb0 is R/W and all others must be treated as R/O. Of
> the original packets you now have (in linux-generic, other implementations
> may be slightly different).
>
> refpkt has a ref_count of 1 since it is a unique (unreferenced) packet.
> mb0 has a ref_count of 1 since it is returned as refpkt.
> mb1 has a ref_count of 2 since is is referred by to mb0/refpkt.
> mb2 has a ref_count of 3 since it is referred to by mb1 and (indirectly) by
> mb0/refpkt.
> mb3 has a ref_count of 4 since it is referred to by mb2 and (indirectly) by
> both mb1, and mb0/refpkt.
>
>>
>>
>> Two problems:
>> 1.       Cannot add mb2 and mb3 because pkt has references after first
>> call to odp_packet_ref_pkt()
>>
>>  * The packet handle ('pkt') may itself be a (dynamic) reference to a
>> packet,
>>  * but the header packet handle ('hdr') must be unique.
>>
>> odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,
>>
>> odp_packet_t hdr);
>>
>> 2.       Although packet reference can start with an offset, it cannot
>> be less that all the rest of the packet (no size).
>
>
> Hopefully the above explanation makes sense.
>
>>
>>
>> Is there any other way to create this packet structure? Maybe with
>> odp_packet_concat() / odp_packet_split()?
>
>
> odp_packet_concat() is actually quite efficient in the current linux-generic
> implementation since it just links segments and does not do data copies.
> Since this results in a single packet the result is R/W. However it's also
> not sharable, which may or may not be what you want.
>
>>
>>
>> Br,
>> Bogdan
>
>

Reply via email to