Hey Cliff,

I have a similar use case in my application. If you're willing to dedicate an lcore per socket, another way to approach what you're describing is to create a KNI interface thread that talks to the other cores via message rings. That is, the cores that are interacting with the NIC read a bunch of packets, determine whether any of them need to go to KNI and, if so, enqueue them using rte_ring_enqueue(). They also do a periodic rte_ring_dequeue() on a second ring to accept any packets that come back from KNI.
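To make the handoff concrete, here is a minimal single-threaded sketch of the two-ring pattern. It is a model, not DPDK code: struct pkt, struct ring, ring_enqueue(), ring_dequeue(), dataplane_divert(), dataplane_collect() and kni_thread_poll() are all hypothetical stand-ins, ARP is used as the only example of kernel-bound traffic, and the kernel round trip is faked by echoing packets back. In a real application the rings would be rte_ring objects and the KNI thread would call rte_kni_tx_burst() and rte_kni_rx_burst() where noted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: just enough to classify. */
struct pkt {
    uint16_t ether_type;   /* host byte order, for simplicity */
    bool     from_kernel;  /* set once the (faked) kernel has handled it */
};

#define ETHER_TYPE_ARP 0x0806
#define RING_SIZE 8        /* must be a power of two */

/* Minimal ring standing in for rte_ring. Single-threaded model only:
 * a real cross-core ring needs atomics and memory barriers. */
struct ring {
    struct pkt *slots[RING_SIZE];
    unsigned head;  /* next slot to write */
    unsigned tail;  /* next slot to read */
};

static void ring_init(struct ring *r)
{
    assert((RING_SIZE & (RING_SIZE - 1)) == 0);
    r->head = r->tail = 0;
}

/* Returns 0 on success, negative when full (same convention as rte_ring_enqueue). */
static int ring_enqueue(struct ring *r, struct pkt *p)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;
    r->slots[r->head++ & (RING_SIZE - 1)] = p;
    return 0;
}

/* Returns 0 on success, negative when empty (same convention as rte_ring_dequeue). */
static int ring_dequeue(struct ring *r, struct pkt **p)
{
    if (r->head == r->tail)
        return -1;
    *p = r->slots[r->tail++ & (RING_SIZE - 1)];
    return 0;
}

/* Dataplane core: divert kernel-bound packets to the KNI thread and compact
 * the rest to the front of the burst. Returns the number left on the fast
 * path. If the ring is full the packet stays in the burst; a real
 * application would more likely drop it. */
static size_t dataplane_divert(struct ring *to_kni, struct pkt **burst, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (burst[i]->ether_type == ETHER_TYPE_ARP &&
            ring_enqueue(to_kni, burst[i]) == 0)
            continue;               /* handed off to the KNI thread */
        burst[kept++] = burst[i];
    }
    return kept;
}

/* Dataplane core: periodically collect packets coming back from KNI. */
static size_t dataplane_collect(struct ring *from_kni, struct pkt **out, size_t max)
{
    size_t n = 0;
    while (n < max && ring_dequeue(from_kni, &out[n]) == 0)
        n++;
    return n;
}

/* KNI thread: one loop iteration. Real code would pass the dequeued packets
 * to rte_kni_tx_burst() and fill the return ring from rte_kni_rx_burst();
 * here each packet is simply marked and echoed back. */
static size_t kni_thread_poll(struct ring *to_kni, struct ring *from_kni)
{
    struct pkt *p;
    size_t handled = 0;
    while (ring_dequeue(to_kni, &p) == 0) {
        p->from_kernel = true;      /* stand-in for the kernel round trip */
        if (ring_enqueue(from_kni, p) == 0)
            handled++;
    }
    return handled;
}
```

Each dataplane core would call dataplane_divert() on every received burst and dataplane_collect() every so often, while the dedicated lcore spins on kni_thread_poll(). The point of the split is that any slow call into KNI stalls only the dedicated lcore, never a core that is servicing a NIC queue.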
The KNI interface thread, meanwhile, just loops along, taking packets in from the NIC interface threads via rte_ring_dequeue() and sending them to KNI, and taking packets from KNI and returning them to the NIC interface threads via rte_ring_enqueue().

I've found that this sort of scheme works well, and is reasonably clean architecturally. Also, I found that calls into KNI can at times be very slow. In my application, I would periodically see KNI calls take 50-100K cycles, which can cause congestion if you're handling large volumes of traffic. Letting a non-critical thread handle this interface was a big win for me.

This leaves the kernel side processing out, of course. But if the traffic going to the kernel is lightweight, you likely don't need a dedicated core for the kernel-side RX and TX work.

--
Matt Laswell
Principal Software Engineer
infinite io

On Wed, Jun 8, 2016 at 11:30 AM, Cliff Burdick <shaklee3 at gmail.com> wrote:

> Hi, I have an application with two sockets where each core I'm planning to
> transmit and receive a fairly large amount of traffic per core. Each core
> right now handles a single queue of either TX or RX of a given port. Across
> all the cores, I may be processing up to 12 ports. I also need to handle
> things like ARP and ping, so I'm going to add in the KNI driver to handle
> that. Since the amount of traffic I'm expecting that I'll need to forward
> to Linux is very small, it seems like I should be able to dedicate one
> lcore per socket to handle this functionality and have the dataplane cores
> pass the traffic off to this core using rte_kni_tx_burst().
>
> My question is, first of all, is this possible? It seems like I can
> configure the KNI driver to start in "single thread" mode. From that point,
> I want to initialize one KNI device for each port, and have each kernel
> lcore on each processor handle that traffic.
> I believe if I call rte_kni_alloc with core_id set to the kernel lcore for
> each device, then in the end I'll have something like 6 KNI devices on
> socket one being handled by lcore 0, and 6 KNI devices on socket 2 being
> handled by lcore 31 as an example. Then my threads that are handling the
> dataplane tx/rx can simply be passed a pointer to their respective rte_kni
> device. Does this sound correct?
>
> Also, the sample says the core affinity needs to be set using taskset. Is
> that already taken care of with conf.core_id in rte_kni_alloc or do I still
> need to set it?
>
> Thanks
