On 2014-01-29 21:47, François-Frédéric Ozog wrote:
>>> First and easy answer: it is open source, so anyone can recompile. So,
>>> what's the issue?
>>
>> I'm talking from a pure distribution perspective here: having to
>> recompile all DPDK-based applications to distribute a bugfix or to add
>> support for a new PMD is not ideal.
>>
>> So ideally OVS would have the possibility to link against the shared
>> library long term.
>
> I agree that distribution of DPDK apps is not covered properly at
> present. Identifying the proper scheme requires a specific analysis
> based on the constraints of the telecom/cloud/networking markets.
>
> In the telecom world, if you fix the underlying framework of an app, you
> will still have to validate the whole solution, i.e. app plus framework.
> In addition, the idea of shared libraries introduces the implied
> requirement to validate apps against diverse versions of DPDK shared
> libraries. This translates into development and support costs.
>
> I also expect many DPDK applications to tackle core networking features,
> with sub-microsecond packet handling delays, even lower than 200ns
> (NAT64...). The lazy binding based on the ELF PLT represents quite a
> cost, not to mention that optimization stops at shared library
> boundaries (gcc whole-program optimization can be very effective...).
> Microsoft DLL linkage is an order of magnitude faster. If Linux were to
> provide that, I would probably revise my judgment. (I haven't checked
> the Linux dynamic linking implementation for some time, so my
> understanding of it may be outdated.)
>
>>> I get lost: do you mean ABI + API toward the PMDs or towards the
>>> applications using the librte?
>>
>> Towards the PMDs is more straightforward at first, so it seems logical
>> to focus on that first.
>
> I don't think it is so straightforward. Many recent cards such as
> Chelsio and Myricom have a very different "packet memory layout" that
> does not fit so easily into the current DPDK architecture.
>
> 1) "Traditional" architecture: the driver reserves X buffers and
> provides the card with descriptors of those buffers. Each packet is
> DMA'ed into exactly one buffer. Typically you have 2K buffers, and a
> 64-byte packet consumes exactly one buffer.
>
> 2) "Alternative" new architecture: the driver reserves a memory zone,
> say 4MB, without any structure, and provides a single zone description
> and a ring buffer to the card (there are no individual buffer
> descriptors any more). The card fills the memory zone with packets, one
> next to the other, and indicates where the packets are by updating the
> supplied ring. Among the many issues fitting this scheme into DPDK: you
> cannot free a single mbuf; you have to maintain a ref count on the
> memory zone so that, when all mbufs have been "released", the memory
> zone can be freed. That's quite a stretch from the current paradigm.
>
> Apart from this aspect, RSS management is too tied to Intel's Flow
> Director concepts and cannot directly accommodate smarter or dumber RSS
> mechanisms.
>
> That said, I fully agree the PMD API should be revisited.
Hi,

Sorry for jumping in late. Perhaps you are already aware of OpenDataPlane,
which can use DPDK as its southbound NIC interface.

>
> Cordially,
>
> François-Frédéric
>
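
On the lazy-binding cost mentioned above: one way to keep the PLT off the
fast path, short of static linking, is to resolve everything at load time
and then call through a cached function pointer. A minimal sketch, assuming
a hypothetical PMD shared object and symbol name (not real DPDK names):

    /* Resolve all symbols up front (RTLD_NOW) and cache the function
     * pointer, so the per-call lazy-binding stub never runs on the
     * datapath. Build with -ldl. */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        void *pmd = dlopen("librte_pmd_example.so", RTLD_NOW | RTLD_GLOBAL);
        if (!pmd) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        /* Casting void * to a function pointer is the usual POSIX idiom. */
        int (*rx_burst)(int port) =
            (int (*)(int))dlsym(pmd, "example_rx_burst");
        if (!rx_burst) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            return 1;
        }

        (void)rx_burst(0);      /* calls go through the cached pointer, no PLT */
        dlclose(pmd);
        return 0;
    }

The same load-time resolution can be forced globally with LD_BIND_NOW=1 or
by linking with -Wl,-z,now; that removes the first-call resolution cost but
not the indirection through the shared-library boundary itself.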
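
And on the "alternative" memory-zone layout described in 2) above: the
reference counting it calls for could look roughly like the sketch below.
All names (zone, zbuf, zone_put, ...) are made up for illustration; this is
not existing DPDK code.

    /* A large DMA region shared by many packets: freeing an individual
     * packet only drops a reference, and the region itself is released
     * when the last packet referencing it is gone. */
    #include <stdatomic.h>
    #include <stdlib.h>

    struct zone {
        void *base;            /* start of the unstructured region, e.g. 4MB */
        size_t len;
        atomic_uint refcnt;    /* one reference per outstanding packet */
    };

    struct zbuf {              /* stand-in for an mbuf carved out of a zone */
        struct zone *zone;
        void *data;            /* points somewhere inside zone->base */
        unsigned int pkt_len;
    };

    static void zone_get(struct zone *z)
    {
        atomic_fetch_add(&z->refcnt, 1);   /* called once per received packet */
    }

    static void zone_put(struct zone *z)
    {
        /* Last outstanding packet released: the whole region can be
         * returned to the allocator or reposted to the NIC. */
        if (atomic_fetch_sub(&z->refcnt, 1) == 1) {
            free(z->base);
            free(z);
        }
    }

    static void zbuf_free(struct zbuf *b)
    {
        struct zone *z = b->zone;
        free(b);               /* only per-packet metadata is freed here */
        zone_put(z);           /* zone memory goes away when refcnt hits 0 */
    }

The driver's RX path would call zone_get() for every packet it hands up,
which is exactly the extra bookkeeping that the one-descriptor-per-buffer
model does not need.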