Finally I found some time to test multiple RSS queues. There's no ATR
parameter, but there is AtrSampleRate, so I set AtrSampleRate=0 (no samples,
ATR disabled).
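
For reference, the reload looked roughly like this (a sketch rather than the
exact command line; per-port values are comma-separated, the queue count is
just what I used on this box, and the other options are the ones mentioned
earlier in this thread):

  rmmod ixgbe
  modprobe ixgbe MQ=1,1 RSS=14,14 AtrSampleRate=0,0 InterruptThrottleRate=956,956 LRO=0,0 FCoE=0,0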

The RSS hash key has been changed with ethtool -X and ntuple filters are
disabled. I don't have an easy way to confirm with Suricata that it works the
way it should, but some 15-minute tests with Bro were promising.
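
Roughly, this is what that amounted to (a sketch; p1p1 is just one of my
interfaces, and the key is the repeating 0x6d5a pattern from the symmetric-RSS
paper linked further down the thread, repeated out to the 40-byte key length
the 82599 reports):

  ethtool -K p1p1 ntuple off
  ethtool -X p1p1 hkey 6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a
  ethtool -x p1p1    # shows the key and indirection table for a sanity check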

Now, one more question :-)

With 14 queues per card, rx_missed = 0. With 1 queue it's about 1% of all
packets.
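
(For anyone reproducing this, the relevant counters can be watched with
something like the following, interface name as a placeholder:

  ethtool -S p1p1 | egrep 'rx_packets|rx_missed_errors|rx_no_dma_resources|rx_no_buffer_count' )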

This suggests it's not a PCIe fault, doesn't it?

Now, rx_missed means a packet was dropped 'on the wire' because it could not
be picked up and loaded into the FIFO, since there was no room left in the
FIFO. Why wasn't any other counter growing to explain the reason behind that?

The Suricata threads have tons of breathing room, so it's not the software
that's slow here. Am I understanding this wrong?

X710 cards have just landed, so more interesting experiments will follow.


On Mon, Sep 26, 2016 at 5:15 PM, Alexander Duyck <alexander.du...@gmail.com>
wrote:

> What ATR=0 does is disable a feature called Application Targeted
> Routing.  It tries to match up the Rx queue for incoming traffic with
> the Tx queue for outgoing traffic.  If it is enabled it would be
> creating rules in the flow director filter table that would be
> rerouting Rx traffic and could cause reordering.
>
> - Alex
>
> On Mon, Sep 26, 2016 at 6:53 AM, Michał Purzyński
> <michalpurzyns...@gmail.com> wrote:
> > Thank you a lot! I think there's value in making sure the
> > driver/card/BIOS/kernel level is tuned correctly. I have learned a lot in
> > the process.
> >
> > A gold standard for Suricata configuration will follow, so that knowledge
> > is not forgotten.
> >
> > I'm in touch with the Suricata developers; this whole thread started with
> > me being surprised that despite rx_missed_errors growing (to something
> > like 38% of all packets received) there were no DMA errors or anything
> > like that.
> >
> > Now we are at 1% of rx_missed and at least know how to troubleshoot it.
> > Excellent.
> >
> > What helped most was making sure that each card sends interrupts to a
> > separate CPU, the NUMA configuration is correct, processes are pinned, and
> > the cpufreq governor is set to performance.
> > Smaller things like disabling ASPM (so the PCIe link does not go away from
> > under the card at the worst moment) and keeping the CPU somewhere near
> > C0/C1 also helped.
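> >
> > (A sketch of the sort of commands involved; the CPU numbers, IRQ number and
> > process name are placeholders for what this particular box uses:
> >
> >   for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo performance > $g; done
> >   echo 2 > /proc/irq/63/smp_affinity_list      # steer card 0's IRQ to one core on its local NUMA node
> >   taskset -cp 4-17 $(pidof suricata)           # pin the worker threads )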
> >
> >
> > af_packet copies data from the skbuff, so technically there are two
> > memcpy() calls if I understand correctly: driver buffers -> skbuff ->
> > af_packet buffers.
> >
> > Later the af_packet buffers are mmapped into userspace, so no additional
> > data copying occurs.
> >
> > I'll ask if that can be even more optimized.
> >
> >
> > Need to find some time to check out the changed RSS hash key and see how
> > it performs, also in terms of packet reordering. There were two problems
> > with RSS:
> >
> > 1. non-symmetric hash (easy to change)
> > 2. packet reordering introduced with RSS, even when hash is symmetric
> >
> > What does ATR=0 do and why is it necessary?
> >
> > Also, that's the last question in this series. Thank you a lot; the
> > Suricata community really appreciates Intel's help :-)
> >
> >
> > On Mon, Sep 26, 2016 at 3:46 AM, Alexander Duyck <alexander.du...@gmail.com> wrote:
> >>
> >> Okay, I'll just paste the bits here that I think are relevant.
> >> Specifically the symbols that are at or above 0.5% CPU utilization.
> >>
> >> # Overhead       sys       usr  Command         Shared Object      Symbol
> >> # ........  ........  ........  ..............  .................  ..............................
> >>     12.91%    12.91%     0.00%  swapper         [kernel.kallsyms]  [k] tpacket_rcv
> >>     11.22%    11.22%     0.00%  swapper         [kernel.kallsyms]  [k] memcpy_erms
> >>      4.03%     0.00%     4.03%  W#01-p1p1       libhs.so.4.2.0     [.] fdr_engine_exec
> >>      3.17%     3.17%     0.00%  W#01-p1p1       [kernel.kallsyms]  [k] tpacket_rcv
> >>      3.10%     3.10%     0.00%  W#01-p1p1       [kernel.kallsyms]  [k] memcpy_erms
> >>      2.65%     2.65%     0.00%  swapper         [kernel.kallsyms]  [k] __netif_receive_skb_core
> >>      2.61%     0.00%     2.61%  W#01-p1p1       libhs.so.4.2.0     [.] nfaExecMcClellan16_B
> >>      2.41%     2.41%     0.00%  swapper         [kernel.kallsyms]  [k] ixgbe_clean_rx_irq
> >>      2.40%     2.40%     0.00%  swapper         [kernel.kallsyms]  [k] mwait_idle
> >>      1.91%     0.00%     1.91%  W#01-p1p1       libc-2.19.so       [.] memset
> >>      1.52%     1.52%     0.00%  swapper         [kernel.kallsyms]  [k] consume_skb
> >>      1.29%     1.29%     0.00%  swapper         [kernel.kallsyms]  [k] __skb_get_hash
> >>      1.18%     1.18%     0.00%  swapper         [kernel.kallsyms]  [k] prb_fill_curr_block.isra.59
> >>      1.09%     1.09%     0.00%  swapper         [kernel.kallsyms]  [k] __skb_flow_dissect
> >>      1.06%     1.06%     0.00%  swapper         [kernel.kallsyms]  [k] __build_skb
> >>      1.04%     1.04%     0.00%  swapper         [kernel.kallsyms]  [k] packet_rcv
> >>      0.89%     0.89%     0.00%  swapper         [kernel.kallsyms]  [k] irq_entries_start
> >>      0.82%     0.82%     0.00%  ksoftirqd/0     [kernel.kallsyms]  [k] memcpy_erms
> >>      0.78%     0.78%     0.00%  ksoftirqd/0     [kernel.kallsyms]  [k] tpacket_rcv
> >>      0.72%     0.72%     0.00%  swapper         [kernel.kallsyms]  [k] skb_copy_bits
> >>      0.71%     0.00%     0.71%  W#01-p1p1       suricata           [.] SigMatchSignatures
> >>      0.69%     0.00%     0.69%  W#01-p1p1       libc-2.19.so       [.] malloc
> >>      0.66%     0.66%     0.00%  W#01-p1p1       [kernel.kallsyms]  [k] ixgbe_clean_rx_irq
> >>      0.63%     0.63%     0.00%  W#01-p1p1       [kernel.kallsyms]  [k] __netif_receive_skb_core
> >>      0.51%     0.51%     0.00%  swapper         [kernel.kallsyms]  [k] kfree_skb
> >>
> >> So looking over what you sent me, it doesn't look so much like this is
> >> a driver issue as that the kernel overhead for processing these frames is
> >> pretty significant, with at least something like 25% of the CPU time
> >> being spent handling tpacket_rcv or a memcpy in order to service
> >> tpacket_rcv.  I haven't had much experience with Suricata but you
> >> might want to try checking with experts on that if you haven't already,
> >> as it seems like some significant CPU time is getting consumed in the
> >> kernel/userspace handoff.  If nothing else you might try bringing up
> >> questions on how to improve raw socket performance on the netdev
> >> mailing list.
> >>
> >> I just remembered that you disabled RSS.  That is the reason why you
> >> are not seeing any rx_no_dma_resources errors.  In order for packets
> >> to be dropped per ring you have to have more than 1 ring enabled.  I
> >> did some quick googling on why Suricata might not support RSS and I
> >> guess it has to do with Tx and Rx traffic not ending up on the same
> >> queue.  That is actually pretty easy to fix.  All you would need to do
> >> is pass the module parameter ATR=0 in order to disable ATR and change
> >> the RSS key on the device to use a 16 bit repeating value.  You can
> >> find a paper detailing some of that here:
> >> http://www.ndsl.kaist.edu/~kyoungsoo/papers/TR-symRSS.pdf
> >>
> >> Other than these tips I don't know if there is much more info I can
> >> provide.  It looks like you will need to add more CPU power in order
> >> to be able to handle the load as you are currently maxing out the one
> >> thread you are using.
> >>
> >> - Alex
> >>
> >> On Sun, Sep 25, 2016 at 4:47 PM, Michał Purzyński
> >> <michalpurzyns...@gmail.com> wrote:
> >> >
> >> > Sent off list, because files are around a MB.
> >> >
> >> > On Mon, Sep 26, 2016 at 1:28 AM, Alexander Duyck
> >> > <alexander.du...@gmail.com> wrote:
> >> >>
> >> >> If you can just send me the output from "perf report" it would be more
> >> >> useful.  The problem is the raw data you sent me doesn't do me any good
> >> >> without the symbol tables and such and those would be too large to be
> >> >> sending over email.
> >> >>
> >> >> What I am basically looking for is a dump with the symbol names that
> >> >> are taking up the CPU time.  From there I can probably start to
> >> >> understand what is going on.
> >> >>
> >> >> - Alex
> >> >>
> >> >>
> >> >> On Sun, Sep 25, 2016 at 4:09 PM, Michał Purzyński
> >> >> <michalpurzyns...@gmail.com> wrote:
> >> >>>
> >> >>> perf record (and perf top) shows interesting results indeed. For one,
> >> >>> there was some lock function with _slowpath_ in its name, which perf
> >> >>> top -g quickly traced to cpufreq, and I ended up setting the
> >> >>> performance governor; that slowpath call is gone now.
> >> >>>
> >> >>> Some rx_missed are still here. Much less, but traffic is also far from
> >> >>> what it is on weekdays. Below you will find links to perf.data and the
> >> >>> results of perf script -D (let me know if I got it wrong).
> >> >>>
> >> >>>
> >> >>> https://drive.google.com/file/d/0B4XJBHc9i84dRXU5eE5FRFBsVUU/view?usp=sharing
> >> >>>
> >> >>> https://drive.google.com/file/d/0B4XJBHc9i84dd2ZSREUtN2Z4dDQ/view?usp=sharing
> >> >>>
> >> >>> I made triple sure that VT-d is disabled, so the IOMMU is gone with it,
> >> >>> from day one when I received this server.
> >> >>>
> >> >>>
> >> >>> On Sun, Sep 25, 2016 at 8:21 PM, Alexander Duyck
> >> >>> <alexander.du...@gmail.com> wrote:
> >> >>>>
> >> >>>> You probably don't need to bother with disabling any other
> >> >>>> prefetchers or anything like that.
> >> >>>>
> >> >>>> One thing that did occur to me is that when you are running your test
> >> >>>> you might try to capture a perf trace on the core that the interrupt
> >> >>>> is running on.  All you need to do to capture that is just run
> >> >>>> "perf record -C <cpu num> sleep 20" while your test is running.  Then
> >> >>>> dump perf report to a logfile of your choice and send us the results.
> >> >>>> That should help us to identify any hot spots that might be eating any
> >> >>>> extra CPU time.
> >> >>>>
> >> >>>> Also when you are in the BIOS you might try looking to see if you
> >> >>>> have an IOMMU or VT-d feature enabled.  If you do you might want to
> >> >>>> try disabling it to see if that gives you any performance boost.  If
> >> >>>> so, you could try booting with the kernel parameter iommu=pt which
> >> >>>> should switch the system over to identity mapping the device onto the
> >> >>>> system, which would save you some considerable time.
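> >> >>>>
> >> >>>> (A quick sanity check of what the box is actually doing, as a sketch:
> >> >>>> "cat /proc/cmdline" to see whether iommu=pt or intel_iommu is set, and
> >> >>>> "dmesg | grep -i -e DMAR -e IOMMU" to see whether the IOMMU came up.)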
> >> >>>>
> >> >>>>
> >> >>>> On Sun, Sep 25, 2016 at 9:40 AM, Michał Purzyński
> >> >>>> <michalpurzyns...@gmail.com> wrote:
> >> >>>>>
> >> >>>>> Yes, I have all kinds of offloads disabled. I'll ask HP to provide a
> >> >>>>> detailed connection scheme, the one they avoid so carefully in the
> >> >>>>> server manual.
> >> >>>>>
> >> >>>>> Supermicro publishes it all. Go figure.
> >> >>>>>
> >> >>>>> Btw, how should the prefetching be configured so it does not interfere
> >> >>>>> with DCA?
> >> >>>>>
> >> >>>>> Here's what I have. Should I disable the HW prefetcher and Adjacent
> >> >>>>> Sector Prefetch? Anything more?
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> On 25 Sep 2016, at 03:55, Alexander Duyck
> >> >>>>> <alexander.du...@gmail.com> wrote:
> >> >>>>>
> >> >>>>> On Sat, Sep 24, 2016 at 4:40 PM, Michał Purzyński
> >> >>>>> <michalpurzyns...@gmail.com> wrote:
> >> >>>>>
> >> >>>>> Thank you for being persistent with answers.
> >> >>>>>
> >> >>>>> Right after sending the previous email I noticed that I had left over
> >> >>>>> some careless IRQ assignments after experimenting with IRQ and process
> >> >>>>> CPU affinity. Both cards were hitting the same core, which (for the
> >> >>>>> second card) was on a different NUMA node, plus that core was saturated.
> >> >>>>>
> >> >>>>> The result was around 38% of packets lost, calculated by comparing
> >> >>>>> packets received with rx_missed. It's interesting that no other counter
> >> >>>>> was increasing.
> >> >>>>>
> >> >>>>> Right now I have moved card 0 to core 0 and card 1 to the first core of
> >> >>>>> the second CPU.
> >> >>>>>
> >> >>>>> Now rx_missed is around 6-7% for each card. Still way too much.
> >> >>>>>
> >> >>>>> I send a total of 8-11 Gbit/sec to both cards, so each receives around
> >> >>>>> half of that. Packet rate is 1.2 Mpps tops (also total). All kinds of
> >> >>>>> packet sizes.
> >> >>>>>
> >> >>>>>
> >> >>>>> So if you are doing packet analysis I assume you don't need LRO or
> >> >>>>> GRO.  If not you may want to look into disabling them via "ethtool -K".
> >> >>>>> I know RSC can sometimes cause packet drops due to aggregating a number
> >> >>>>> of frames before finally submitting them to the device.  Although that
> >> >>>>> usually required ASPM to be enabled as well.
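> >> >>>>>
> >> >>>>> (For example, roughly the following -- the interface name is a
> >> >>>>> placeholder, and as far as I understand RSC is what the ixgbe driver
> >> >>>>> exposes as LRO on the 82599:
> >> >>>>>
> >> >>>>>   ethtool -K p1p1 lro off gro off
> >> >>>>>   ethtool -k p1p1 | egrep 'large-receive-offload|generic-receive-offload' )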
> >> >>>>>
> >> >>>>> I'll lower rings to 512 as the next step. Good to know about the
> >> >>>>> card's limitations.
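> >> >>>>>
> >> >>>>> (Concretely I'm thinking of something like "ethtool -G p1p1 rx 512",
> >> >>>>> then "ethtool -g p1p1" to verify; interface names as on this box.)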
> >> >>>>>
> >> >>>>>
> >> >>>>> Given that InterruptThrottleRate has to be given as a 'number of
> >> >>>>> interrupts / second', what would you recommend I set it to, for a
> >> >>>>> start at least? I have a 2.6GHz Xeon E5 v3.
> >> >>>>>
> >> >>>>>
> >> >>>>> So I would recommend a value no less than 12500 for
> >> >>>>> InterruptThrottleRate.  Assuming a reasonable packet rate, that should
> >> >>>>> give you a decent trade-off in terms of performance versus latency.
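> >> >>>>>
> >> >>>>> (That lines up with the ~80 usec guideline from earlier in the thread:
> >> >>>>> 1 / 12500 interrupts per second = 80 microseconds between interrupts.)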
> >> >>>>>
> >> >>>>> I'll buy a pair of X710s for a test as well. It will be an interesting
> >> >>>>> comparison. Who knows, maybe the RSS implementation and MQ there are
> >> >>>>> good enough to be used for IDS.
> >> >>>>>
> >> >>>>> Fortunately I don't run it inline; this server receives a copy of the
> >> >>>>> traffic.
> >> >>>>>
> >> >>>>>
> >> >>>>> I'm not sure if it will get you much more throughput or not.  I still
> >> >>>>> find it odd that you're dropping packets even though the device isn't
> >> >>>>> complaining about not having ring buffer resources.  Usually that
> >> >>>>> points to a bottleneck somewhere in the PCIe bus.  You might want to
> >> >>>>> double check and verify that the devices are connected directly to the
> >> >>>>> root complex and not some secondary bus on a PCIe switch that is
> >> >>>>> actually downgrading the link between the device and the CPU socket.
> >> >>>>>
> >> >>>>> On Sat, Sep 24, 2016 at 3:20 AM, Alexander Duyck <alexander.du...@gmail.com> wrote:
> >> >>>>>
> >> >>>>>
> >> >>>>> Well as a general rule anything over about 80 usecs for
> >> >>>>> InterruptThrottleRate is a waste.  One advantage to reducing the
> >> >>>>> interrupt throttle rate is you can reduce the ring size and you might
> >> >>>>> see a slight performance improvement.  One problem with using 4096
> >> >>>>> descriptors is that it greatly increases the cache footprint and leads
> >> >>>>> to more buffer-bloat and cache thrash as you have to evict old
> >> >>>>> descriptors to pull in new ones.  I'm also sure if you are doing an
> >> >>>>> intrusion detection system (I'm assuming that is what IDS is in
> >> >>>>> reference to), then the users would appreciate it if you didn't add up
> >> >>>>> to a half dozen extra milliseconds of latency to their network (worst
> >> >>>>> case with an elephant flow of 1514 byte frames).
> >> >>>>>
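> >> >>>>> (Back-of-the-envelope, assuming a full ring of 4096 descriptors each
> >> >>>>> holding a 1514-byte frame and draining at 10 Gb/s: 4096 * 1514 * 8 bits
> >> >>>>> / 10^10 bits per second is roughly 5 ms of added queuing latency.)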
> >> >>>>>
> >> >>>>> What size packets is it you are working with?  One limitation of the
> >> >>>>> 82599 is that it can only handle an upper limit of somewhere around
> >> >>>>> 12Mpps if you are using something like 6 queues, and only a little
> >> >>>>> over 2 for a single queue.  If you exceed 12Mpps then the part will
> >> >>>>> start reporting rx_missed because the PCIe overhead for moving 64 byte
> >> >>>>> packets is great enough that it actually causes us to exceed the
> >> >>>>> limits of the x8 gen2 link.  If the memcpy is what I think it is then
> >> >>>>> it allows us to avoid having to do two different atomic operations
> >> >>>>> that would have been more expensive otherwise.
> >> >>>>>
> >> >>>>>
> >> >>>>> On Fri, Sep 23, 2016 at 12:46 PM, Michał Purzyński <michalpurzyns...@gmail.com> wrote:
> >> >>>>>
> >> >>>>> Here's what I did:
> >> >>>>>
> >> >>>>>   ethtool -A p1p1 rx off tx off
> >> >>>>>   ethtool -A p3p1 rx off tx off
> >> >>>>>
> >> >>>>> Both ethtool -a <interface> and the Arista that's pumping data show
> >> >>>>> that RX/TX pause are disabled.
> >> >>>>>
> >> >>>>> I have two cards, each connected to a separate NUMA node, threads
> >> >>>>> pinned, etc.
> >> >>>>>
> >> >>>>> One non-standard thing is that I use a single queue only, because any
> >> >>>>> form of multiqueue leads to packet reordering and confuses the IDS. An
> >> >>>>> issue that's been hidden for a while in the NSM community.
> >> >>>>>
> >> >>>>> The driver (from SourceForge) was loaded with MQ=0 DCA=2 RSS=1 VMDQ=0
> >> >>>>> InterruptThrottleRate=956 FCoE=0 LRO=0 vxvlan_rx=0 (each option's value
> >> >>>>> given enough times so it applies to all cards in this system).
> >> >>>>>
> >> >>>>> I could see the same issue sending traffic to just one card.
> >> >>>>>
> >> >>>>> Of course a single core is swamped with ACK-ing the hardware IRQ and
> >> >>>>> then doing softIRQ (which seems to be mostly memcpy?). But then again,
> >> >>>>> I don't see errors about lacking buffers (I run with 4096 descriptors).
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> On Fri, Sep 23, 2016 at 9:22 PM, Alexander Duyck <alexander.du...@gmail.com> wrote:
> >> >>>>>
> >> >>>>>
> >> >>>>> When you say you disabled flow control did you disable it on the
> >> >>>>> interface that is dropping packets or the other end?  You might try
> >> >>>>> explicitly disabling it on the interface that is dropping packets,
> >> >>>>> that in turn should enable per-queue drop instead of putting
> >> >>>>> back-pressure onto the Rx FIFO.
> >> >>>>>
> >> >>>>> With flow control disabled on the local port you should see
> >> >>>>> rx_no_dma_resources start incrementing if the issue is that one of the
> >> >>>>> Rx rings is not keeping up.
> >> >>>>>
> >> >>>>> - Alex
> >> >>>>>
> >> >>>>>
> >> >>>>> On Fri, Sep 23, 2016 at 11:09 AM, Michał Purzyński <michalpurzyns...@gmail.com> wrote:
> >> >>>>>
> >> >>>>> xoff was increasing, so I disabled flow control.
> >> >>>>>
> >> >>>>> That's an HP DL360 Gen9, and lspci -vvv tells me the cards are connected
> >> >>>>> to an x8 link, speed is 5GT/s, and ASPM is disabled.
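> >> >>>>>
> >> >>>>> (Checked roughly like this; the PCI address is a placeholder for
> >> >>>>> whatever lspci lists for the 82599 ports on this box:
> >> >>>>>
> >> >>>>>   lspci -vvv -s 04:00.0 | egrep 'LnkCap|LnkSta' )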
> >> >>>>>
> >> >>>>>
> >> >>>>> Other error counters are still zero. When I compared rx_packets and
> >> >>>>> rx_missed_errors it looks like 38% (!!) of packets are getting lost.
> >> >>>>>
> >> >>>>> Unfortunately the HP documentation is a scam and they actively avoid
> >> >>>>> publishing the motherboard layout.
> >> >>>>>
> >> >>>>> Any other place I could look for hints?
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> On Fri, Sep 23, 2016 at 7:01 PM, Alexander Duyck <alexander.du...@gmail.com> wrote:
> >> >>>>>
> >> >>>>>
> >> >>>>> On Fri, Sep 23, 2016 at 1:10 AM, Michał Purzyński <michalpurzyns...@gmail.com> wrote:
> >> >>>>>
> >> >>>>> Hello.
> >> >>>>>
> >> >>>>>
> >> >>>>> On my IDS workload with af_packet I can see rx_missed_errors growing
> >> >>>>> while rx_no_buffer_count does not. Basically every other kind of rx_
> >> >>>>> error counter is 0, including rx_no_dma_resources. It's an 82599-based
> >> >>>>> card.
> >> >>>>>
> >> >>>>> I don't know what to think about that. I went through the ixgbe source
> >> >>>>> code and the 82599 datasheet, and it seems like rx_missed_errors means
> >> >>>>> a new packet overwrote something already in the packet buffer (the
> >> >>>>> FIFO queue on the card) because there was no more space in it.
> >> >>>>>
> >> >>>>> Now, that would happen if there is no place to DMA packets into - but
> >> >>>>> that counter does not grow.
> >> >>>>>
> >> >>>>> Could you point me to where I should be looking for the problem?
> >> >>>>>
> >> >>>>> --
> >> >>>>> Michal Purzynski
> >> >>>>>
> >> >>>>>
> >> >>>>> The Rx missed count will increment if you are not able to receive a
> >> >>>>> packet because the Rx FIFO is full.  If you are not seeing any
> >> >>>>> rx_no_dma_resources problems it might indicate that the problem is not
> >> >>>>> with providing the DMA resources, but a problem on the bus itself.
> >> >>>>> You might want to double check the slot the device is connected to in
> >> >>>>> order to guarantee that there is an x8 link that supports 5GT/s all the
> >> >>>>> way through to the root complex.
> >> >>>>>
> >> >>>>> - Alex
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >> >
> >
> >
>