Thanks Matt! I will try that. It seems very clean.

On Wed, Jun 8, 2016 at 9:45 AM, Matt Laswell <laswell at infinite.io> wrote:
> Hey Cliff,
>
> I have a similar use case in my application. If you're willing to
> dedicate an lcore per socket, another way to approach what you're
> describing is to create a KNI interface thread that talks to the other
> cores via message rings. That is, the cores that are interacting with the
> NIC read a bunch of packets, determine if any of them need to go to KNI
> and, if so, enqueue them using rte_ring_enqueue(). They also do a periodic
> rte_ring_dequeue() on another queue to accept back any packets that come
> back from KNI.
>
> The KNI interface process, meanwhile, just loops along, taking packets in
> from the NIC interface threads via rte_ring_dequeue() and sending them to
> KNI, and taking packets from KNI and returning them to the NIC interface
> threads via rte_ring_enqueue().
>
> I've found that this sort of scheme works well, and is reasonably clean
> architecturally. Also, I found that calls into KNI can at times be very
> slow. In my application, I would periodically see KNI calls take 50-100K
> cycles, which can cause congestion if you're handling large volumes of
> traffic. Letting a non-critical thread handle this interface was a big win
> for me.
>
> This leaves the kernel side processing out, of course. But if the traffic
> going to the kernel is lightweight, you likely don't need a dedicated core
> for the kernel-side RX and TX work.
>
> --
> Matt Laswell
> Principal Software Engineer
> infinite io
>
> On Wed, Jun 8, 2016 at 11:30 AM, Cliff Burdick <shaklee3 at gmail.com> wrote:
>
>> Hi, I have an application with two sockets where each core I'm planning to
>> transmit and receive a fairly large amount of traffic per core. Each core
>> right now handles a single queue of either TX or RX of a given port. Across
>> all the cores, I may be processing up to 12 ports. I also need to handle
>> things like ARP and ping, so I'm going to add in the KNI driver to handle
>> that. Since the amount of traffic I'm expecting that I'll need to forward
>> to Linux is very small, it seems like I should be able to dedicate one
>> lcore per socket to handle this functionality and have the dataplane cores
>> pass the traffic off to this core using rte_kni_tx_burst().
>>
>> My question is, first of all, is this possible? It seems like I can
>> configure the KNI driver to start in "single thread" mode. From that point,
>> I want to initialize one KNI device for each port, and have each kernel
>> lcore on each processor handle that traffic. I believe if I call
>> rte_kni_alloc with core_id set to the kernel lcore for each device, then in
>> the end I'll have something like 6 KNI devices on socket one being handled
>> by lcore 0, and 6 KNI devices on socket 2 being handled by lcore 31 as an
>> example. Then my threads that are handling the dataplane tx/rx can simply
>> be passed a pointer to their respective rte_kni device. Does this sound
>> correct?
>>
>> Also, the sample says the core affinity needs to be set using taskset. Is
>> that already taken care of with conf.core_id in rte_kni_alloc or do I still
>> need to set it?
>>
>> Thanks
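
For reference, the ring handoff Matt describes can be sketched in plain C. This is only an illustration under stated assumptions: struct pkt, struct ring, ring_enqueue(), ring_dequeue(), dataplane_rx() and kni_core_poll() are hypothetical stand-ins written for this sketch, not DPDK APIs. In a real application the ring would presumably be an rte_ring, the packets would be rte_mbufs, and the KNI core would pass whatever it dequeues to rte_kni_tx_burst(). The toy ring below is also not thread safe, which rte_ring is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf; only what the sketch needs. */
struct pkt {
    uint16_t port;       /* port the packet arrived on */
    bool     for_kernel; /* set by the classifier, e.g. for ARP or ping */
};

#define RING_SIZE 8u /* power of two, so index masking works */

/* Toy ring standing in for rte_ring. NOT thread safe as written. */
struct ring {
    struct pkt *slot[RING_SIZE];
    unsigned head; /* count of packets ever enqueued */
    unsigned tail; /* count of packets ever dequeued */
};

/* Returns 0 on success, -1 if the ring is full. */
static int ring_enqueue(struct ring *r, struct pkt *p)
{
    assert(p != NULL);
    if (r->head - r->tail == RING_SIZE)
        return -1;
    r->slot[r->head++ & (RING_SIZE - 1)] = p;
    return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static int ring_dequeue(struct ring *r, struct pkt **p)
{
    if (r->head == r->tail)
        return -1;
    *p = r->slot[r->tail++ & (RING_SIZE - 1)];
    return 0;
}

/*
 * One dataplane iteration: packets flagged for the kernel are diverted
 * to the KNI core's ring. Returns how many were diverted. Packets that
 * are not flagged stay on the fast path; flagged packets that do not
 * fit in the ring would be dropped by the caller.
 */
static unsigned dataplane_rx(struct ring *to_kni, struct pkt **burst,
                             unsigned n)
{
    unsigned diverted = 0;

    for (unsigned i = 0; i < n; i++) {
        if (burst[i]->for_kernel && ring_enqueue(to_kni, burst[i]) == 0)
            diverted++;
    }
    return diverted;
}

/*
 * One KNI-core iteration: drain up to max packets from the ring into
 * out[]. A real application would hand these to rte_kni_tx_burst().
 */
static unsigned kni_core_poll(struct ring *to_kni, struct pkt **out,
                              unsigned max)
{
    unsigned n = 0;

    while (n < max && ring_dequeue(to_kni, &out[n]) == 0)
        n++;
    return n;
}
```

The return path (packets coming back from KNI to the NIC-facing cores) would use a second ring of the same kind in the opposite direction, polled periodically by the dataplane cores.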
