On Thu, 12 Jun 2014 22:46:14 +0800 Tyrone Lau <tyronelau at gmail.com> wrote:
> Hi, all.
>
> I recently found that the Linux kernel occasionally complains about a
> deadlock when I use the rte_kni kernel module provided in DPDK. After
> reviewing the DPDK source code and googling, I found that the deadlock
> occurs because netif_receive_skb is invoked in a non-softirq context.
> The erroneous source code is listed below (in
> lib/librte_eal/linuxapp/kni/kni_net.c:kni_net_rx_normal):
>
>     /* Transfer received packets to netif */
>     for (i = 0; i < num; i++) {
>         kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
>         len = kva->data_len;
>         data_kva = kva->data - kni->mbuf_va + kni->mbuf_kva;
>
>         skb = dev_alloc_skb(len + 2);
>         if (!skb) {
>             KNI_ERR("Out of mem, dropping pkts\n");
>             /* Update statistics */
>             kni->stats.rx_dropped++;
>         }
>         else {
>             /* Align IP on 16B boundary */
>             skb_reserve(skb, 2);
>             memcpy(skb_put(skb, len), data_kva, len);
>             skb->dev = dev;
>             skb->protocol = eth_type_trans(skb, dev);
>             skb->ip_summed = CHECKSUM_UNNECESSARY;
>
>             /* Call netif interface */
>             netif_receive_skb(skb);
>
>             /* Update statistics */
>             kni->stats.rx_bytes += len;
>             kni->stats.rx_packets++;
>         }
>     }
>
> A similar bug was reported and fixed in the DPDK extension memnic. See
>
> http://comments.gmane.org/gmane.comp.networking.dpdk.devel/3151
>
> To fix this bug, we should call local_bh_disable/local_bh_enable
> around netif_receive_skb to disable and re-enable softirqs.
>
> Best Regards

Probably better to call netif_rx instead, because that will handle the
case of overrun.

Other comments:
  - This code should be using per-cpu stats.
  - It should use netdev_alloc_skb_ip_align rather than doing the
    alignment itself.
  - Even better would be bursting packets into the receive handler.
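
A rough, untested sketch of the netif_rx and netdev_alloc_skb_ip_align
suggestions follows. The helper name kni_rx_one and the use of dev->stats
in place of kni->stats are assumptions made for illustration, not code
from DPDK. The #else branch holds simplified userspace stand-ins, not the
kernel definitions, so the logic can be compiled and exercised outside
the kernel. Per-cpu stats and bursting are not shown.

```c
#ifdef __KERNEL__
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>
#else
/* Userspace stand-ins (hypothetical, heavily simplified) for the kernel
 * types and calls used below. They are NOT the real definitions. */
#include <stdlib.h>
#include <string.h>
#define NET_RX_SUCCESS 0
#define NET_RX_DROP 1
#define CHECKSUM_UNNECESSARY 1
struct net_device {
	struct { unsigned long rx_packets, rx_bytes, rx_dropped; } stats;
};
struct sk_buff {
	struct net_device *dev;
	unsigned char *data;
	unsigned int len;
	unsigned short protocol;
	unsigned char ip_summed;
};
static struct sk_buff *netdev_alloc_skb_ip_align(struct net_device *dev,
						 unsigned int len)
{
	struct sk_buff *skb = calloc(1, sizeof(*skb));

	if (!skb)
		return NULL;
	skb->data = malloc(len ? len : 1);
	if (!skb->data) {
		free(skb);
		return NULL;
	}
	skb->dev = dev;
	return skb;
}
static void *skb_put(struct sk_buff *skb, unsigned int len)
{
	void *tail = skb->data + skb->len;

	skb->len += len;
	return tail;
}
static unsigned short eth_type_trans(struct sk_buff *skb,
				     struct net_device *dev)
{
	(void)skb; (void)dev;
	return 0;
}
static int netif_rx(struct sk_buff *skb)
{
	free(skb->data);	/* stand-in: consume the buffer immediately */
	free(skb);
	return NET_RX_SUCCESS;
}
#endif

/* Hypothetical helper: copy one received frame into an skb and queue it. */
static int kni_rx_one(struct net_device *dev, const void *data,
		      unsigned int len)
{
	/* Reserves NET_IP_ALIGN bytes and sets skb->dev, which replaces the
	 * dev_alloc_skb(len + 2) / skb_reserve(skb, 2) pair in the original. */
	struct sk_buff *skb = netdev_alloc_skb_ip_align(dev, len);

	if (!skb) {
		dev->stats.rx_dropped++;
		return NET_RX_DROP;
	}
	memcpy(skb_put(skb, len), data, len);
	skb->protocol = eth_type_trans(skb, dev);
	skb->ip_summed = CHECKSUM_UNNECESSARY;

	/* netif_rx() queues the skb on the per-CPU backlog. If the backlog
	 * is full it frees the skb and returns NET_RX_DROP. */
	if (netif_rx(skb) == NET_RX_DROP) {
		dev->stats.rx_dropped++;
		return NET_RX_DROP;
	}
	dev->stats.rx_packets++;
	dev->stats.rx_bytes += len;
	return NET_RX_SUCCESS;
}
```

In the loop quoted above, the call would presumably be
kni_rx_one(dev, data_kva, len). If this path runs in process context,
kernels of this era also provide netif_rx_ni() for that case. Whether it
is the better fit here is worth checking.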