There could be a few causes for the number of interrupts to change.
Either there was a change in the interrupt moderation scheme in use, or
the driver is simply processing packets faster and exiting polling more
frequently.
To test for a difference in interrupt moderation I would recommend using
ethtool -C <iface> rx-usecs 400. That should lock the interface in at
2500 interrupts per second. You should be able to do this on either
kernel to determine if the difference is interrupt moderation. Other
than that you might try using "perf top" like I mentioned to see where
the hot spots are in the old kernel versus the new one.
- Alex
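
As a rough check of the arithmetic behind the rx-usecs suggestion, here is a minimal sketch. It assumes rx-usecs is a fixed interval, so each vector fires at most once per interval; the helper name is made up for illustration, and the observed rates are copied from the sar output quoted below. Which NIC the 2.6 vector belongs to is not stated in the thread, so it is labeled only by IRQ number.

```python
def interrupts_per_second(rx_usecs: int) -> float:
    """Upper bound on interrupts/s for one vector when at most one
    interrupt is allowed every rx_usecs microseconds."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be a positive fixed interval")
    return 1_000_000 / rx_usecs


# Per-vector rates taken from the quoted sar samples.
observed = {
    "3.10 IRQ 112 (eth0-TxRx-0)": 4866.25,
    "3.10 IRQ 130 (eth2-TxRx-0)": 8547.25,
    "2.6  IRQ 82": 2505.54,
}

cap = interrupts_per_second(400)  # 2500.0
for name, rate in observed.items():
    print(f"{name}: {rate:.0f}/s observed, {rate / cap:.1f}x the rx-usecs 400 cap")
```

If the rates on both kernels settle near the cap after setting rx-usecs 400, that points at interrupt moderation as the difference.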
On 11/15/2014 06:03 AM, Mike Zupan wrote:
> Alexander
>
> Thanks for the reply and for clearing that up. It looks like I’m seeing
> 4-5x the number of interrupts.
>
> For example, 130 is eth2-TxRx-0 and 112 is eth0-TxRx-0.
>
> There are 2 bonds on this host, using 4 NICs in total: one to the
> external network and one to the internal network.
>
> 3.10 kernel
>
> 05:58:42 AM INTR intr/s
> 05:58:46 AM 104 0.25
> 05:58:46 AM 105 0.25
> 05:58:46 AM 106 0.25
> 05:58:46 AM 107 0.25
> 05:58:46 AM 108 0.25
> 05:58:46 AM 112 4866.25
> 05:58:46 AM 113 5007.50
> 05:58:46 AM 114 4891.75
> 05:58:46 AM 115 5171.00
> 05:58:46 AM 116 4894.00
> 05:58:46 AM 118 5253.75
> 05:58:46 AM 119 4986.00
> 05:58:46 AM 121 3.50
> 05:58:46 AM 122 6.00
> 05:58:46 AM 123 3.75
> 05:58:46 AM 124 1.25
> 05:58:46 AM 125 2.25
> 05:58:46 AM 126 2.00
> 05:58:46 AM 127 1.00
> 05:58:46 AM 128 1.25
> 05:58:46 AM 130 8547.25
> 05:58:46 AM 131 8671.50
> 05:58:46 AM 132 8620.50
> 05:58:46 AM 133 8864.00
> 05:58:46 AM 134 8508.25
> 05:58:46 AM 135 8597.25
> 05:58:46 AM 136 8742.75
> 05:58:46 AM 137 8536.25
> 05:58:46 AM 139 6.00
> 05:58:46 AM 140 6.25
> 05:58:46 AM 141 6.50
> 05:58:46 AM 142 1.75
> 05:58:46 AM 143 2.75
> 05:58:46 AM 144 1.50
> 05:58:46 AM 145 2.00
> 05:58:46 AM 146 6.25
>
>
> 2.6 kernel
>
> 05:58:38 AM INTR intr/s
> 05:58:42 AM 50 203.27
> 05:58:42 AM 82 2505.54
> 05:58:42 AM 83 2731.99
> 05:58:42 AM 84 2586.65
> 05:58:42 AM 85 2565.99
> 05:58:42 AM 86 2078.34
> 05:58:42 AM 87 2351.89
> 05:58:42 AM 88 2270.03
> 05:58:42 AM 89 2579.09
> 05:58:42 AM 91 94.71
> 05:58:42 AM 92 31.49
> 05:58:42 AM 93 37.28
> 05:58:42 AM 94 42.32
> 05:58:42 AM 95 32.24
> 05:58:42 AM 96 30.73
> 05:58:42 AM 97 39.04
> 05:58:42 AM 98 48.61
> 05:58:42 AM 100 2949.87
> 05:58:42 AM 101 3349.12
> 05:58:42 AM 102 3233.00
> 05:58:42 AM 103 2839.55
> 05:58:42 AM 105 2912.09
> 05:58:42 AM 106 2672.29
> 05:58:42 AM 107 2996.98
> 05:58:42 AM 109 91.69
> 05:58:42 AM 110 48.11
> 05:58:42 AM 111 42.32
> 05:58:42 AM 112 46.60
> 05:58:42 AM 113 46.35
> 05:58:42 AM 114 53.15
> 05:58:42 AM 115 52.90
> 05:58:42 AM 116 43.83
>
>
>
> --
> Mike Zupan
>
> On Friday, November 14, 2014 at 8:54 PM, Alexander Duyck wrote:
>
>> On 11/13/2014 11:13 AM, Mike Zupan wrote:
>>> I’m having a strange issue with the 3.10 or 3.17 kernel that
>>> I’m not seeing with 2.6. We are seeing a lot of softirq requests for
>>> network cards even on a mostly idle system. It happens on any server
>>> in the cluster if I deploy the 3.10 or 3.17 kernel.
>>>
>>> Using top we noticed this process using a lot of CPU. As soon as I
>>> give the server traffic, load spikes to well over 200 for the 1 min
>>> average.
>>>
>>> [kworker/u66:2]
>>>
>>> That led us to install `powertop`, where we saw this:
>>>
>>> Usage Events/s Category Description
>>> 1110 ms/s 2045.2 Process php-fpm: pool www
>>> 36.0 ms/s 2165.4 Timer tick_sched_timer
>>> 57.7 ms/s 1285.0 Process nginx: worker process
>>> 13.3 ms/s 416.0 Timer hrtimer_wakeup
>>> 39.1 ms/s 350.7 Interrupt [3] net_rx(softirq)
>>>
>>> This is the same view on a 2.6 series kernel getting the same amount of traffic:
>>>
>>> Usage Events/s Category Description
>>> 1795 ms/s 1654.0 Process php-fpm: pool www
>>> 45.3 ms/s 1110.4 Process nginx: worker process
>>> 562.8 µs/s 122.4 Process /usr/bin/java -Xms200m -Xmx2000m -Xss256k
>>> -XX:MaxDirectMemorySize=516m -XX:+UseParNewGC
>>> -XX:+UseConcMarkSweepGC -Dage
>>> 497.1 µs/s 59.3 Process /usr/sbin/gmond
>>> 16.0 ms/s 30.2 Process /usr/bin/redis-server 127.0.0.1:6379
>>> 4.7 ms/s 32.8 Process python /usr/bin/statsd-relay.py
>>> 81.7 ms/s 0.00 Timer tcp_delack_timer
>>> 24.8 ms/s 0.00 Timer tick_sched_timer
>>> 549.4 µs/s 9.2 Process java -Xmx6g -server -Dfile.encoding=utf-8
>>> -XX:OnOutOfMemoryError=kill -9 %p -XX:+HeapDumpOnOutOfMemoryError
>>> -XX:HeapD
>>> 15.2 ms/s 0.00 Interrupt [3] net_rx(softirq)
>>>
>>>
>>> As you can see, net_rx is 0 on 2.6 but we get as many as 4k/s on
>>> 3.10. The server specs are the same and I removed all sysctl settings.
>>> I can replicate the issue just by installing 3.10 on a server.
>>>
>>> The NICs we have installed are:
>>>
>>> 06:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network
>>> Connection (rev 01)
>>> 06:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network
>>> Connection (rev 01)
>>> 06:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network
>>> Connection (rev 01)
>>> 06:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network
>>> Connection (rev 01)
>>>
>>> --
>>> Mike Zupan
>>
>> Mike,
>>
>> I would recommend installing the "perf" tool and running "perf top"
>> instead of "powertop" to try and determine what is running on your
>> system. The powertop tool is meant to determine what is waking you up
>> out of sleep states, not what is actually making use of the system. As
>> such, with powertop you could see 0 events per second and all that would
>> mean is that the system isn't getting to sleep because it is too busy,
>> while a high count could actually mean your system is going idle,
>> resulting in a significant number of wake-ups.
>>
>> For interrupt information you might try watching the rate at which
>> /proc/interrupts increases, or you could install sysstat and then run
>> "sar -I XALL 4 500 | grep -v 0.00" to watch for the non-zero interrupt
>> rates after figuring out which interrupts belong to your network adapter.
>>
>> Thanks,
>>
>> Alex
>
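
For the suggestion above of watching how fast /proc/interrupts increases, one possible sketch follows. It assumes the usual Linux layout (a header row of CPU names, then one row per IRQ with per-CPU counts followed by a description). The function names and the "TxRx" filter are illustrative choices, not part of any tool mentioned in the thread.

```python
import time


def parse_interrupts(text: str) -> dict[str, int]:
    """Return {"<irq> <description>": total count summed over all CPUs}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        total, used = 0, 0
        # Rows such as ERR/MIS carry fewer counters than there are CPUs,
        # so stop at the first field that is not a number.
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            total += int(field)
            used += 1
        desc = " ".join(fields[used:])
        totals[f"{label.strip()} {desc}".strip()] = total
    return totals


def rates(before: dict[str, int], after: dict[str, int], seconds: float) -> dict[str, float]:
    """Interrupts per second for every IRQ present in both samples."""
    return {k: (after[k] - before[k]) / seconds for k in after if k in before}


def watch(interval: float = 4.0, match: str = "TxRx") -> None:
    """Take two samples and print non-zero rates for matching vectors."""
    with open("/proc/interrupts") as f:
        first = parse_interrupts(f.read())
    time.sleep(interval)
    with open("/proc/interrupts") as f:
        second = parse_interrupts(f.read())
    for name, rate in sorted(rates(first, second, interval).items()):
        if match in name and rate > 0:
            print(f"{rate:10.2f}/s  {name}")
```

Calling `watch()` on each kernel gives per-vector rates comparable to the sar samples, with the interface name attached to each IRQ.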
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired