Hi,

My CPU cores are still suffering from a high number of interrupts per second generated by the transmit NICs (used only for packet transmission by ToDevice elements). I tried to reduce the number of interrupts using the transmit interrupt delay and interrupt throttling parameters of the e1000 driver, but this does not help: as soon as I reduce the number of interrupts, I start losing packets and the performance degrades.

Anyway, I was wondering whether ToDevice really needs an interrupt when a packet is sent. Does the tx DMA ring cleaning function e1000_tx_clean need an interrupt to start cleaning the DMA descriptors and buffers? What other ToDevice functions require interrupts to get triggered? I would really appreciate some explanation of the ToDevice element so I can deal with my problem.
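
For reference, this is roughly how I have been setting those moderation parameters when reloading the driver. The values are only illustrative, and these are the stock Intel e1000 module parameter names; I am not sure the batched e1000 driver used with Click takes exactly the same options:

  # array parameters take one comma-separated value per interface
  rmmod e1000
  modprobe e1000 TxIntDelay=64,64 TxAbsIntDelay=128,128 InterruptThrottleRate=8000,8000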
Thank you,
Ahmed

On Thu, Jul 14, 2011 at 3:29 PM, ahmed A. <[email protected]> wrote:
> Hi all,
>
> Thank you for the tips and feedback that you provided. Finally, I managed
> to locate the source of the problem. I found out that my output cards (the
> ones attached to the ToDevices) generate a considerable number of
> interrupts (32000 intr/sec), and Linux somehow assigns the handling of
> those interrupts to specific cores, the same cores that give me the bad
> forwarding performance. As soon as I assign the interrupt handling to a
> different core, the performance of the bad core returns to normal (the
> expected performance). I still do not know why I get this high number of
> interrupts on the transmission path. What about the Click ToDevice: does
> it allow or require these interrupts? I am now looking for a way to reduce
> the number of interrupts, so any tips would be useful.
>
> Regards,
> Ahmed
>
> On Wed, Jul 13, 2011 at 6:33 PM, Eddie Kohler <[email protected]> wrote:
>> Hey Adam,
>>
>> Have you added these papers to the wiki page for such things?
>>
>> Eddie
>>
>> On 7/13/11 2:50 AM, Adam Greenhalgh wrote:
>>> Sorry for the shameless self-plug, but a number of these papers might
>>> be of use to you; they explain some of the issues you will be seeing
>>> with multi-CPU machines and Click.
>>>
>>> http://www.comp.lancs.ac.uk/~laurent/papers/egi_npc.pdf
>>> http://www.comp.lancs.ac.uk/~laurent/papers/high_perf_vrouters-CoNEXT08.pdf
>>> http://www.comp.lancs.ac.uk/~laurent/papers/fairness_vrouters-PRESTO08.pdf
>>>
>>> Adam
>>>
>>> On 12 July 2011 15:53, ahmed A. <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I am examining the forwarding performance of Click on our four-core CPU
>>>> machine. Each time I assign two simple forwarding paths to two
>>>> different cores and watch the forwarding rate using a Click Counter.
>>>> My Click configuration file is as follows:
>>>>
>>>> pd1::PollDevice(eth24, PROMISC true, BURST 16) -> queue1::Queue(10000)
>>>>     -> c1::Counter -> td1::ToDevice(eth22);
>>>> pd2::PollDevice(eth25, PROMISC true, BURST 16) -> queue2::Queue(10000)
>>>>     -> c2::Counter -> td2::ToDevice(eth23);
>>>>
>>>> Idle -> ToDevice(eth24);
>>>> Idle -> ToDevice(eth25);
>>>>
>>>> StaticThreadSched(pd1 0, td1 0, pd2 1, td2 1);
>>>> CpuPin(0 0, 1 1);
>>>>
>>>> I was expecting almost the same forwarding rate (counter rate) for both
>>>> paths whatever the assigned cores, but I actually got different results
>>>> depending on the cores that I use. For example, with cores 0 and 1 I
>>>> got 1.0 million packets per second (MPPS) on core 0 and 1.42 MPPS on
>>>> core 1; with cores 1 and 2 I got 1.39 MPPS on both cores; and with
>>>> cores 2 and 3 I got 1.42 MPPS on core 2 and 1 MPPS on core 3. In
>>>> summary, there are always two cores with bad performance compared to
>>>> the other cores.
>>>>
>>>> By checking the monitored_empty_polls_rate of the cores, I found out
>>>> that the cores with bad performance have a monitored_empty_polls_rate
>>>> of 0.781, whereas the good cores have 207111.
>>>> The number of dropped packets on the NIC ports assigned to the bad
>>>> cores is much larger than the number dropped on the ports assigned to
>>>> the good cores. My explanation is that PollDevice is not getting enough
>>>> CPU cycles (i.e. not scheduled often enough) to poll packets and refill
>>>> the DMA ring with skb buffers, but I have no idea why.
>>>>
>>>> Does the Linux scheduler interfere with Click? I checked the load on
>>>> each core using top, but I could not see any other processes running on
>>>> the bad cores; they are idle all the time.
>>>> I would appreciate any tips or help.
>>>>
>>>> Thank you in advance,
>>>> Ahmed
>>>>
>>>> PS: I have a 2.6.18 kernel running on a Fedora filesystem with Click
>>>> 1.6 and the batched e1000 driver.
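
For completeness, this is roughly how I reassign the interrupt handling mentioned above. It is only a sketch of the standard Linux mechanism; the IRQ number 58 is a placeholder (the real numbers come from /proc/interrupts), and irqbalance may rewrite the mask if it is running:

  cat /proc/interrupts                 # find the IRQ numbers of eth22/eth23
  echo 4 > /proc/irq/58/smp_affinity   # hex mask 0x4 = CPU 2, away from the Click polling cores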
