Thanks Bill for initiating the thread.
Please check out some more details (inline) on the requirements and
proposal.
On 5/17/2017 8:50 PM, Bill Fischofer wrote:
This thread is to discuss ideas and proposed solutions to two issues
that have been raised by Sachin relating to VPP needs, as well as by
Honnappa relating to the scalable scheduler.

Background
==========

ODP handles are abstract types that implementations may define to be
of arbitrary bit width. However, for a number of reasons (e.g.,
provision of strong typing support, ABI compatibility, efficiency of
internal manipulation, etc.) these are typically represented as 64-bit
quantities. Some applications that store handles in their own
structures wish to minimize the cache footprint consumed by these
structures and so would like an option to store handles in a more
compact format that uses a smaller number of bits. To date, 32 bits
seem sufficient for application needs; however, in theory 16 or even 8
bits might be desirable in some circumstances. We already have an
example of 8-bit handles in the odp_packet_seg_t type, where odp-linux
uses an 8-bit representation of this type as a segment index when ODP
is configured with --enable-abi-compat=no, while using a 64-bit size
when configured with --enable-abi-compat=yes.

Considerations
==============

In choosing the bit width used to represent handles, there are two
main considerations that implementations must take into account.
First, to achieve strong typing in C, handles need to be of pointer
width. For development this is a very valuable feature, which is why
implementations are encouraged to provide strong typing for ODP
abstract types. Second, ABI compatibility requires that all
implementations use the same width for types that are to be ABI
compatible across different implementations. Implementations may
interpret the bits of a handle very differently, but all must agree
that handles are of the same bit width if they wish to be binary
compatible with each other.

Stated Needs
============

VPP currently packages its metadata into a vlib_mbuf struct that is
used pervasively to reference packets that are being processed by VPP
nodes. The address of this struct is desired to be held in compressed
(32-bit) format. Today the vlib_mbuf is implemented as a user area
associated with an odp_packet_t. As such, the odp_packet_user_area()
API returns a (64-bit) pointer. What is desired is a compact
representation of this address.
VPP collects a bunch of packets from the ODP/DPDK input node and looks
up the inline "struct vlib_buffer" address in each packet.
It then creates a VPP vlib frame, which is a collection (vector) of
vlib_buffers. For this, VPP converts the 64-bit address of each
vlib_buffer to a 32-bit index, saves it in the vlib frame, and passes
the frame to the next node.
In each data-path processing node where packet contents are accessed,
VPP converts this 32-bit index back to the actual 64-bit address to
get the packet data pointer.
In the current implementation, VPP converts a 32-bit index to an
address at ~900 places in the overall code via the API:

vlib_get_buffer (vlib_main_t * vm, u32 buffer_index)
Code reference:
GIT: https://git.fd.io/odp4vpp/log/
Files: vlib/vlib/buffer_funcs.h
       vlib/vlib/buffer.h
VPP on the transmit side also needs to obtain the odp_packet_t
associated with a vlib_mbuf. For the scalable scheduler, the desire is
for a compact representation of an odp_event_t that can be stored in a
space-efficient manner in queues.

Proposed Solutions
==================

Outlined here are a couple of proposed solutions to these problems.
Please feel free to propose alternate solutions as well. For the case
of the compact user area pointers needed by VPP, the suggestion has
been made that ODP pools provide an API to return pool bounds
information so that VPP can convert the user area pointers to a more
compact index. However, this makes a number of assumptions about the
internals of ODP pools that may or may not be portable or practical in
all implementations. Since the requirement is for a compact
representation of the user area address, a more direct solution may be
simply to provide a set of new APIs that address this need directly:

uint32_t odp_packet_user_area_index(odp_packet_t pkt);

This API would return a 32-bit index of the user area associated with
an odp_packet_t. Note that since user areas are mapped one-to-one with
ODP packets, this can serve effectively as a packet index as well.
With this API, applications can obtain the user area address directly
or, indirectly, in a compact form. The problem is converting the index
back into the user area address. An API of the form:

void *odp_packet_user_area_addr(uint32_t ndx);

assumes that this is a reversible mapping, which probably isn't true.
However, adding the odp_packet_t as a second argument would be
pointless since, if the application has the odp_packet_t, it can use
the existing odp_packet_user_area() API directly. So the containing
pool would seem a necessary second argument:

void *odp_packet_user_area_addr(uint32_t ndx, odp_pool_t pool);

These APIs seem awkward as well, so perhaps recasting them as a way to
get compact packet handles might be better:

uint32_t odp_packet_to_index(odp_pool_t pool, odp_packet_t pkt);
odp_packet_t odp_packet_from_index(odp_pool_t pool, uint32_t ndx);

An interesting aside is that, given this general approach, an
additional API could be envisioned that would provide even more
compact packet indexes:

uint16_t odp_packet_to_index_16(odp_pool_t pool, odp_packet_t pkt);

Note that a single odp_packet_from_index() suffices, since uint16_t
indexes will promote to a uint32_t argument without problem. The
odp_pool_capability() API could indicate whether this additional
compact form is supported, and of course this would only be possible
if the pool's pkt.num is < 64K. With these APIs, the compact vlib_mbuf
requirement would seem to be satisfied by the following routines:

uint32_t vlib_mbuf_index(odp_packet_t pkt)
{
	return odp_packet_to_index(odp_packet_pool(pkt), pkt);
}

void *vlib_mbuf_addr(odp_pool_t pool, uint32_t ndx)
{
	return odp_packet_user_area(odp_packet_from_index(pool, ndx));
}
The API "vlib_get_buffer (vlib_main_t * vm, u32 buffer_index)" is
called not only in the ODP/DPDK node's transmit path
but also in VPP's internal "vnet" nodes and the "vlib" data-path
implementation.
For example, when VPP is running as a vSwitch, the following node
functions call this API to get a buffer address from an index on the
RX-to-TX path:
1. VNET ethernet node processing
2. l2input_node_fn in l2_input processing
3. l2flood_node_fn
4. l2output_node_fn
5. vnet_interface_output_node
6. odp_packet_interface_tx
That means that changing the implementation of vlib_get_buffer() to
call odp_packet_from_index() requires changing the VPP internal
framework API, which may need discussion before it is accepted.
Also, to keep vlib_get_buffer() compatible with the existing DPDK
node, we would need to guard our code with compile-time flags, which
may look like:
always_inline vlib_buffer_t *
vlib_get_buffer (vlib_main_t * vm, u32 buffer_index)
{
#if ODP
  /* pool is assumed to be available from VPP context (e.g., stored in
   * vm at init time); odp_packet_from_index() is the proposed API. */
  return (vlib_buffer_t *)
    odp_packet_user_area (odp_packet_from_index (pool, buffer_index));
#else
  return vlib_physmem_at_offset (&vm->physmem_main,
				 ((uword) buffer_index)
				 << CLIB_LOG2_CACHE_LINE_BYTES);
#endif
}
Conversely, the need to obtain the odp_packet_t from the vlib_mbuf
index would be satisfied simply by:

odp_packet_t vlib_mbuf_to_pkt(odp_pool_t pool, uint32_t ndx)
{
	return odp_packet_from_index(pool, ndx);
}

For the scalable scheduler, since this is internal to the ODP
implementation, there doesn't seem to be a need for any new external
APIs. Internal _odp_event_to_index() and _odp_event_from_index() APIs
could, however, be modeled on this approach to achieve the same
effect.