I haven't yet been able to deploy CentOS6 on one of the older servers.
In the meantime, I wanted to confirm that, according to turbostat (and
i7z), my cores never leave C0/C1. They also stay at a consistent frequency
(3.0-3.2GHz depending on the processor). I am fairly confident that the
information reported by those tools is accurate and that there are no
sleep/wakeup issues in terms of CPU power management.
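On the cpu_dma_latency idea from Jesse's note below: as a minimal sketch, assuming the standard /dev/cpu_dma_latency PM QoS device is present, something like the following Python would ask the kernel to avoid deep idle states for as long as the file descriptor stays open. The function name and its parameters are my own placeholders, and it needs root to run against the real device.

```python
import os
import struct


def hold_cpu_dma_latency(target_us=0, path="/dev/cpu_dma_latency"):
    """Request a maximum CPU wakeup latency via the PM QoS device.

    Returns an open file descriptor. The request only stays in effect
    while that descriptor remains open, so the caller must keep it and
    close it when the constraint is no longer needed.
    (Hypothetical helper; name and defaults are illustrative.)
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # The device accepts the latency as a native 32-bit integer,
        # expressed in microseconds.
        os.write(fd, struct.pack("=i", target_us))
    except OSError:
        os.close(fd)
        raise
    return fd
```

Calling hold_cpu_dma_latency(0) from a long-running process and closing the descriptor on exit should be enough; the request is dropped as soon as the descriptor is closed.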
Are there other sleep/wake issues on the newer hardware I need to be aware
of, other than the CPU power state? As far as I know, ASPM is also disabled
(lspci -vv reports "LnkCtl: ASPM Disabled").
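To check ASPM on every device rather than reading the lspci output by eye, a rough sketch along these lines could help. It assumes the usual lspci -vv layout (device header lines starting in column 0, with an indented "LnkCtl: ASPM ..." line under each PCIe capability); the function names are hypothetical.

```python
import re
import subprocess


def read_lspci():
    """Return the text of `lspci -vv` (run as root for full capability data)."""
    result = subprocess.run(
        ["lspci", "-vv"], capture_output=True, text=True, check=True
    )
    return result.stdout


def aspm_by_device(lspci_text):
    """Map each PCI address to the ASPM setting on its LnkCtl line."""
    results = {}
    device = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Device header lines are not indented and begin with the address.
            device = line.split()[0]
            continue
        # Match "LnkCtl:" only, so LnkCap and LnkCtl2 lines are ignored.
        match = re.search(r"LnkCtl:\s*ASPM\s+([^;]+)", line)
        if match and device is not None:
            results[device] = match.group(1).strip()
    return results
```

Running aspm_by_device(read_lspci()) and looking for any value other than "Disabled" would flag devices worth a closer look.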
Thanks,
Scott Silverman | IT | Simplex Investments | 312-360-2444
230 S. LaSalle St., Suite 4-100, Chicago, IL 60604
On Thu, Dec 19, 2013 at 5:32 PM, Brandeburg, Jesse <
jesse.brandeb...@intel.com> wrote:
> Scott, be sure to try running turbostat on both old and new servers, as I
> suspect the 50us wake latency of the C6 power state may cause drops.
>
> The new kernels enable deeper sleep.
>
> You can also try a BIOS setting to disable deep sleep states, leaving only
> C1 enabled.
>
> There was a program called cpudmalatency.c or something that may be able
> to help you keep the system more awake.
>
> --
> Jesse Brandeburg
>
>
> On Dec 19, 2013, at 2:57 PM, "Scott Silverman" <
> ssilver...@simplexinvestments.com> wrote:
>
> > Alex,
> >
> > Thanks for the response, I'll attempt to reproduce with a consistent OS
> > release and re-open the discussion at that time.
> >
> > Thanks,
> >
> > Scott Silverman
> >
> >
> > On Thu, Dec 19, 2013 at 4:52 PM, Alexander Duyck <
> > alexander.h.du...@intel.com> wrote:
> >
> >> On 12/19/2013 10:31 AM, Scott Silverman wrote:
> >>> We have three generations of servers running nearly identical software.
> >>> Each subscribes to a variety of multicast groups taking in, on average,
> >>> 200-300Mbps of data.
> >>>
> >>> The oldest generation (2x Xeon X5670, SuperMicro 6016T-NTRF, Intel
> >>> X520-DA2) has no issues handling all the incoming data. (zero
> >>> rx_no_dma_resources)
> >>>
> >>> The middle generation (2x Xeon E5-2670, SuperMicro 6017R-WRF, Intel
> >>> X520-DA2) and the newest generation (2x Xeon E5-2680v2, SuperMicro
> >>> 6017R-WRF, Intel X520-DAs) both have issues handling the incoming data
> >>> (indicated by increasing rx_no_dma_resources counter).
> >>>
> >>> The oldest generation of servers is running CentOS5 on a newer kernel
> >>> (3.4.41), the others are running CentOS6 on the exact same kernel
> >>> (3.4.41).
> >>>
> >>> The oldest generation is using ixgbe 3.13.10, the middle generation
> >>> 3.13.10, and the newest are on 3.18.7. All machines are using the
> >>> set_irq_affinity script to spread queue interrupts across available
> >>> cores. All machines are configured with C1 as the maximum C-state and
> >>> CPU clocks are all steady between 3-3.2GHz depending on the processor
> >>> model.
> >>>
> >>> On the middle/newer boxes, lowering the number of RSS queues manually
> >>> (i.e. RSS=8,8) seems to help reduce the amount of dropping, but it does
> >>> not eliminate it.
> >>>
> >>> The ring buffer drops do not seem to correlate with data rates, either.
> >>> It does not seem that it is an issue of keeping up. In addition, the
> >>> boxes are not under particularly heavy load. The CPU usage is generally
> >>> between 3-5% and rarely spikes much higher than 15%. The load average is
> >>> generally around 2.
> >>>
> >>> I am at a loss for what else to try to diagnose and/or fix this. In my
> >>> mind, the newer boxes should have no problem at all keeping up with the
> >>> older ones.
> >>>
> >>> I've attached the output of ethtool -S, one from each generation of
> >>> server.
> >>>
> >>>
> >>>
> >>> Thanks,
> >>>
> >>> Scott Silverman
> >>
> >> Scott,
> >>
> >> Have you tried running the CentOS5 w/ newer kernel on any of your newer
> >> servers, or CentOS6 on one of the older ones? I ask because this would
> >> seem to be one of the most significant differences between the
> >> servers that are not dropping frames and those that are. I suspect you
> >> may have something in the CentOS6 configuration that is responsible for
> >> the drops that is not present in the CentOS5 configuration. We really
> >> need to eliminate any OS-based issues before we can even hope to
> >> start chasing this issue down into the driver and/or device
> >> configuration.
> >>
> >> Thanks,
> >>
> >> Alex
> >
> > _______________________________________________
> > E1000-devel mailing list
> > E1000-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/e1000-devel
> > To learn more about Intel® Ethernet, visit
> http://communities.intel.com/community/wired
>