OK, thanks. It’s certainly something in the kernel then, and it doesn’t look like
the network card.
16.18% php-fpm [.] 0x0000000000271803
4.60% [kernel] [k] osq_lock
4.44% [kernel] [k] mutex_spin_on_owner
3.77% opcache.so [.] 0x0000000000010cf3
3.30% [kernel] [k] mm_find_pmd
2.64% [kernel] [k] page_fault
2.40% apcu.so [.] 0x0000000000008864
2.18% [kernel] [k] _raw_spin_lock
1.89% nginx [.] 0x0000000000012655
1.87% [kernel] [k] __page_check_address
I’m not seeing any kernel calls, even in yellow, on the 2.6 kernel.
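For what it's worth, the listing above can be tallied by privilege level to check the "it's in the kernel" reading. This is only a sketch in Python: the sample is copied from the perf top lines above, while the regular expression, the name summarize, and the choice of which symbols count as lock-related are my own assumptions, not anything perf provides.

```python
import re
from collections import defaultdict

# Sample copied from the perf top lines quoted above.
PERF_TOP = """\
16.18% php-fpm [.] 0x0000000000271803
4.60% [kernel] [k] osq_lock
4.44% [kernel] [k] mutex_spin_on_owner
3.77% opcache.so [.] 0x0000000000010cf3
3.30% [kernel] [k] mm_find_pmd
2.64% [kernel] [k] page_fault
2.40% apcu.so [.] 0x0000000000008864
2.18% [kernel] [k] _raw_spin_lock
1.89% nginx [.] 0x0000000000012655
1.87% [kernel] [k] __page_check_address
"""

# percentage, object name, privilege marker ([k] kernel, [.] user), symbol
LINE_RE = re.compile(r"^\s*([\d.]+)%\s+(\S+)\s+\[(.)\]\s+(\S+)")

# Assumption: these three symbols are treated as lock/mutex spinning.
LOCK_SYMBOLS = {"osq_lock", "mutex_spin_on_owner", "_raw_spin_lock"}


def summarize(text):
    """Return (percent by level, percent by kernel symbol) for perf top lines."""
    by_level = defaultdict(float)
    kernel_symbols = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that is not a sample line
        pct, _obj, level, symbol = match.groups()
        by_level["kernel" if level == "k" else "user"] += float(pct)
        if level == "k":
            kernel_symbols[symbol] = float(pct)
    return dict(by_level), kernel_symbols


levels, ksyms = summarize(PERF_TOP)
lock_share = sum(pct for sym, pct in ksyms.items() if sym in LOCK_SYMBOLS)
print(f"kernel: {levels['kernel']:.2f}%  user: {levels['user']:.2f}%")
print(f"lock-related kernel symbols: {lock_share:.2f}%")
```

On the lines above, the three lock symbols add up to about 11% of samples, so running the same tally on both kernels might be a quick way to compare them.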
--
Mike Zupan
On Saturday, November 15, 2014 at 12:31 PM, Alexander Duyck wrote:
> There could be a few causes for the number of interrupts to change.
> Either there was a change in the interrupt moderation scheme in use, or
> the driver is simply processing packets faster and exiting polling more
> frequently.
>
> To test for a difference in interrupt moderation I would recommend using
> ethtool -C <iface> rx-usecs 400. That should lock the interface in at
> 2500 interrupts per second. You should be able to do this on either
> kernel to determine if the difference is interrupt moderation. Other
> than that you might try using "perf top" like I mentioned to see where
> the hot spots are in the old kernel versus the new one.
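The 2500 figure follows from the interval: one interrupt at most every 400 microseconds is 1,000,000 / 400 per second. A minimal Python sketch of that arithmetic, assuming rx-usecs is a plain fixed interval as in the 400 example (any driver-specific special values are not modeled, and both function names are made up for illustration):

```python
MICROSECONDS_PER_SECOND = 1_000_000


def max_interrupts_per_second(rx_usecs):
    """Upper bound on interrupts/s per queue for a fixed rx-usecs interval."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be a positive fixed interval")
    return MICROSECONDS_PER_SECOND / rx_usecs


def rx_usecs_for_rate(target_per_second):
    """Inverse: the rx-usecs value that caps a queue near the target rate."""
    if target_per_second <= 0:
        raise ValueError("target rate must be positive")
    return round(MICROSECONDS_PER_SECOND / target_per_second)


print(max_interrupts_per_second(400))
print(rx_usecs_for_rate(2500))
```

If a queue still reports well above the computed ceiling after the setting is applied, that would suggest the setting did not take effect on that interface.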
>
> - Alex
>
> On 11/15/2014 06:03 AM, Mike Zupan wrote:
> > Alexander
> >
> > Thanks for the reply and for clearing that up. It looks like I’m seeing 4-5x the
> > number of interrupts.
> >
> > For example, 130 is eth2-TxRx-0 and 112 is eth0-TxRx-0.
> >
> > There are 2 bonds on this host. One is for the external network and the
> > other is for the internal network, with a total of 4 NICs.
> >
> > 3.10 kernel
> >
> > 05:58:42 AM INTR intr/s
> > 05:58:46 AM 104 0.25
> > 05:58:46 AM 105 0.25
> > 05:58:46 AM 106 0.25
> > 05:58:46 AM 107 0.25
> > 05:58:46 AM 108 0.25
> > 05:58:46 AM 112 4866.25
> > 05:58:46 AM 113 5007.50
> > 05:58:46 AM 114 4891.75
> > 05:58:46 AM 115 5171.00
> > 05:58:46 AM 116 4894.00
> > 05:58:46 AM 118 5253.75
> > 05:58:46 AM 119 4986.00
> > 05:58:46 AM 121 3.50
> > 05:58:46 AM 122 6.00
> > 05:58:46 AM 123 3.75
> > 05:58:46 AM 124 1.25
> > 05:58:46 AM 125 2.25
> > 05:58:46 AM 126 2.00
> > 05:58:46 AM 127 1.00
> > 05:58:46 AM 128 1.25
> > 05:58:46 AM 130 8547.25
> > 05:58:46 AM 131 8671.50
> > 05:58:46 AM 132 8620.50
> > 05:58:46 AM 133 8864.00
> > 05:58:46 AM 134 8508.25
> > 05:58:46 AM 135 8597.25
> > 05:58:46 AM 136 8742.75
> > 05:58:46 AM 137 8536.25
> > 05:58:46 AM 139 6.00
> > 05:58:46 AM 140 6.25
> > 05:58:46 AM 141 6.50
> > 05:58:46 AM 142 1.75
> > 05:58:46 AM 143 2.75
> > 05:58:46 AM 144 1.50
> > 05:58:46 AM 145 2.00
> > 05:58:46 AM 146 6.25
> >
> >
> > 2.6 kernel
> >
> > 05:58:38 AM INTR intr/s
> > 05:58:42 AM 50 203.27
> > 05:58:42 AM 82 2505.54
> > 05:58:42 AM 83 2731.99
> > 05:58:42 AM 84 2586.65
> > 05:58:42 AM 85 2565.99
> > 05:58:42 AM 86 2078.34
> > 05:58:42 AM 87 2351.89
> > 05:58:42 AM 88 2270.03
> > 05:58:42 AM 89 2579.09
> > 05:58:42 AM 91 94.71
> > 05:58:42 AM 92 31.49
> > 05:58:42 AM 93 37.28
> > 05:58:42 AM 94 42.32
> > 05:58:42 AM 95 32.24
> > 05:58:42 AM 96 30.73
> > 05:58:42 AM 97 39.04
> > 05:58:42 AM 98 48.61
> > 05:58:42 AM 100 2949.87
> > 05:58:42 AM 101 3349.12
> > 05:58:42 AM 102 3233.00
> > 05:58:42 AM 103 2839.55
> > 05:58:42 AM 105 2912.09
> > 05:58:42 AM 106 2672.29
> > 05:58:42 AM 107 2996.98
> > 05:58:42 AM 109 91.69
> > 05:58:42 AM 110 48.11
> > 05:58:42 AM 111 42.32
> > 05:58:42 AM 112 46.60
> > 05:58:42 AM 113 46.35
> > 05:58:42 AM 114 53.15
> > 05:58:42 AM 115 52.90
> > 05:58:42 AM 116 43.83
> >
> >
> >
> > --
> > Mike Zupan
> >
> > On Friday, November 14, 2014 at 8:54 PM, Alexander Duyck wrote:
> >
> > > On 11/13/2014 11:13 AM, Mike Zupan wrote:
> > > > I’m having a strange issue going on with the 3.10 or 3.17 kernel that
> > > > I’m not seeing with 2.6. We are seeing a lot of softirq requests for
> > > > network cards even on a mostly idle system. It happens on any server
> > > > in the cluster if I deploy the 3.10 or 3.17 kernel.
> > > >
> > > > Using top, we noticed this process using a lot of CPU. As soon as I
> > > > give the server traffic, load spikes to well over 200 for the 1-minute
> > > > average.
> > > >
> > > > [kworker/u66:2]
> > > >
> > > > That led us to install `powertop`, and then we saw this:
> > > >
> > > > Usage Events/s Category Description
> > > > 1110 ms/s 2045.2 Process php-fpm: pool www
> > > > 36.0 ms/s 2165.4 Timer tick_sched_timer
> > > > 57.7 ms/s 1285.0 Process nginx: worker process
> > > > 13.3 ms/s 416.0 Timer hrtimer_wakeup
> > > > 39.1 ms/s 350.7 Interrupt [3] net_rx(softirq)
> > > >
> > > > This is the same view on a 2.6 series kernel getting the same amount of traffic:
> > > >
> > > > Usage Events/s Category Description
> > > > 1795 ms/s 1654.0 Process php-fpm: pool www
> > > > 45.3 ms/s 1110.4 Process nginx: worker process
> > > > 562.8 µs/s 122.4 Process /usr/bin/java -Xms200m -Xmx2000m -Xss256k
> > > > -XX:MaxDirectMemorySize=516m -XX:+UseParNewGC
> > > > -XX:+UseConcMarkSweepGC -Dage
> > > > 497.1 µs/s 59.3 Process /usr/sbin/gmond
> > > > 16.0 ms/s 30.2 Process /usr/bin/redis-server 127.0.0.1:6379
> > > > 4.7 ms/s 32.8 Process python /usr/bin/statsd-relay.py
> > > > 81.7 ms/s 0.00 Timer tcp_delack_timer
> > > > 24.8 ms/s 0.00 Timer tick_sched_timer
> > > > 549.4 µs/s 9.2 Process java -Xmx6g -server -Dfile.encoding=utf-8
> > > > -XX:OnOutOfMemoryError=kill -9 %p -XX:+HeapDumpOnOutOfMemoryError
> > > > -XX:HeapD
> > > > 15.2 ms/s 0.00 Interrupt [3] net_rx(softirq)
> > > >
> > > >
> > > > As you can see, net_rx is at 0 events/s on 2.6, but we get as many as 4k/s on
> > > > 3.10. The server specs are the same, and I removed all sysctl settings.
> > > > I can replicate the issue just by installing 3.10 on a server.
> > > >
> > > > The NICs we have installed are:
> > > >
> > > > 06:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network
> > > > Connection (rev 01)
> > > > 06:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network
> > > > Connection (rev 01)
> > > > 06:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network
> > > > Connection (rev 01)
> > > > 06:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network
> > > > Connection (rev 01)
> > > >
> > > > --
> > > > Mike Zupan
> > > >
> > >
> > >
> > > Mike,
> > >
> > > I would recommend installing the "perf" tool and running "perf top"
> > > instead of "powertop" to try and determine what is running on your
> > > system. The powertop tool is meant to determine what is waking you up
> > > out of sleep states, not what is actually making use of the system. As
> > > such, with powertop you could see 0 events per second and all that would
> > > mean is that the system isn't getting to sleep because it is too busy, while
> > > a high count could actually mean your system is going idle, resulting in
> > > a significant number of wake-ups.
> > >
> > > For interrupt information you might try watching the rate at which
> > > /proc/interrupts increases or you could install sysstat and then run
> > > "sar -I XALL 4 500 | grep -v 0.00", to watch for the non zero interrupt
> > > rates after figuring out which interrupts belong to your network adapter.
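One rough way to watch the rate at which /proc/interrupts increases is to take two snapshots and divide the difference by the interval. The Python sketch below assumes the usual layout (a header row of CPU names, then one row per interrupt with per-CPU counts and the device name in the last column). The function names and the "eth" name filter are my own choices, and the two sample snapshots are made-up numbers used only to exercise the parser.

```python
import time


def parse_interrupts(text):
    """Map (irq label, last column) to the count summed across all CPUs."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break  # rows such as ERR/MIS carry fewer counters
            counts.append(int(field))
        name = fields[-1] if len(fields) > len(counts) else ""
        totals[(label.strip(), name)] = sum(counts)
    return totals


def rates(before, after, seconds, name_filter="eth"):
    """Interrupts per second for rows whose name contains name_filter."""
    return {
        key: (after[key] - before[key]) / seconds
        for key in after
        if key in before and name_filter in key[1]
    }


def sample_live(seconds=4, path="/proc/interrupts"):
    """Take two live snapshots 'seconds' apart (Linux only; not called here)."""
    with open(path) as fh:
        first = parse_interrupts(fh.read())
    time.sleep(seconds)
    with open(path) as fh:
        second = parse_interrupts(fh.read())
    return rates(first, second, seconds)


# Made-up snapshots, 4 seconds apart, only to show the calculation.
SAMPLE_BEFORE = """\
           CPU0       CPU1
 112:        100        200   PCI-MSI-edge      eth0-TxRx-0
 NMI:          0          0   Non-maskable interrupts
"""
SAMPLE_AFTER = """\
           CPU0       CPU1
 112:       5100       5200   PCI-MSI-edge      eth0-TxRx-0
 NMI:          0          0   Non-maskable interrupts
"""

demo = rates(parse_interrupts(SAMPLE_BEFORE), parse_interrupts(SAMPLE_AFTER), 4)
for (irq, name), per_second in sorted(demo.items()):
    print(f"{irq:>4} {name:<16} {per_second:10.2f} intr/s")
```

Calling sample_live() on the server itself should give per-queue numbers comparable to the sar intr/s column.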
> > >
> > > Thanks,
> > >
> > > Alex
> _______________________________________________
> E1000-devel mailing list
> E1000-devel@lists.sourceforge.net (mailto:E1000-devel@lists.sourceforge.net)
> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> To learn more about Intel® Ethernet, visit
> http://communities.intel.com/community/wired