As the subject says, I am seeing rate spikes in long-duration tests 
using tcpreplay.  The spike occurred after roughly 12 hours when 
transmitting at a rate of 425Mb/s.

Question: Is this a known problem, or is it possible I am doing 
something wrong in my testing?  I have never heard of this behaviour 
before, none of my colleagues who have used tcpreplay extensively have 
seen it either, and we are all at a loss to explain it.

I have a work-around so that I can complete my testing.  Essentially I 
will run tcpreplay on the file once at the given rate ("-l 1" instead of 
"-l 0") and wrap *that* in a script so it repeats indefinitely.  That 
should get me the same results, but I would like to know if the 
behaviour I am seeing is expected or anomalous.
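
For reference, the wrapper amounts to something like this (sketch only; 
the rate and file name are the ones from my tests):

#!/bin/bash
# loop_replay.sh - sketch of the work-around: replay the capture once
# per iteration at the fixed rate and restart tcpreplay each pass,
# instead of relying on "-l 0"
while true; do
    tcpreplay -i p1p3 -M 425 -l 1 ethernet_all.dmp || break
done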

Thanks for any time you have to look into this.

~~~~~

Setup:

I am using tcpreplay version 4.1.0 on a CentOS 7.2.1511 operating 
system.  My goal is to send traffic at a steady rate for a long duration 
(24 hours).  In my two attempts, I have seen that after about 12 hours 
the rate that I specified was no longer enforced.

I have two servers directly connected by ethernet cable.  Conveniently, 
the connected interfaces are both named p1p3.

In my first attempt, I was sending traffic from one server to the other 
at 415Mb/s.

# tcpreplay -i p1p3 -M 415  -l 0 ethernet_all.dmp

I started this test at around 18:00.  I verified the speed on both ends 
using a script which basically takes the delta of

# cat /sys/class/net/${INTERFACE}/statistics/[tr]x_bytes

periodically to compute the speed.  I divided the byte delta by the 
interval and then by 1024^2 to get the rate in MB/s, and was seeing 
49MB/s consistently.  The source machine reported 49MB/s tx and the 
destination machine reported 49MB/s rx.
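
For reference, the measurement logic is roughly the following 
(simplified sketch, not the exact script; the interface and interval 
here are just example values):

#!/bin/bash
# rate_watch.sh - sketch: print approximate rx/tx rates once per interval
INTERFACE=${1:-p1p3}
INTERVAL=${2:-10}
STATS=/sys/class/net/${INTERFACE}/statistics
rx_prev=$(cat ${STATS}/rx_bytes); tx_prev=$(cat ${STATS}/tx_bytes)
while sleep ${INTERVAL}; do
    rx_now=$(cat ${STATS}/rx_bytes); tx_now=$(cat ${STATS}/tx_bytes)
    # divide the byte delta by the interval and by 1024^2 (1048576)
    echo "$(date '+%F %T')" \
         "rx: $(( (rx_now - rx_prev) / INTERVAL / 1048576 )) MB/s" \
         "tx: $(( (tx_now - tx_prev) / INTERVAL / 1048576 )) MB/s"
    rx_prev=${rx_now}; tx_prev=${tx_now}
done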

Things got strange at 05:26 the next day; the rate on both machines 
jumped from 49MB/s to 114MB/s.  By the time I saw it, it had been 
running at that rate for many hours.  When I stopped tcpreplay, the 
script I use to compute the rate reported 0MB/s, and when I restarted 
it (only a few seconds later), the rate was back to the normal 49MB/s.

I retried the test the next day, this time at a traffic rate of 
425Mb/s (50MB/s), and saw the same result.  I started the traffic 
around 16:00 and the spike occurred at about 6:50 the next morning 
(sorry, I don't have exact times).  Again, it began transmitting at 
114MB/s.

In an effort to isolate the problem, I repeated the procedure without 
my application: tcpreplay on one end and tcpdump on the other.  I set a 
rate of "-M 425" and left it overnight.  About 12 hours later (1:45) 
the spike occurred again, ramping traffic up to 114MB/s.
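
The two ends were essentially the following (the tcpdump invocation 
here is only an example of the kind of command used on the receiver, 
not the exact one):

Sender:
# tcpreplay -i p1p3 -M 425 -l 0 ethernet_all.dmp

Receiver (example):
# tcpdump -i p1p3 -n -w /dev/null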

Here are some additional data points:
- If I run at top speed (-t), then I also see 114MB/s, so it would seem 
that after a while tcpreplay begins transmitting at top speed.
- Originally I was accessing my pcap file over an NFS mount.  In order 
to rule out the possibility that it was somehow affected by this, I made 
a local copy of the file.
- There is no other data on that interface.  When I stop tcpreplay, the 
traffic rate drops to 0, and when I restart it, it goes back to the 
normal (pre-spike) value.  If there were another source of traffic, I 
would expect stopping tcpreplay to decrease the rate only by the amount 
tcpreplay contributes (e.g. 49MB/s), not all the way to 0.  This 
suggests that tcpreplay is the only source of traffic.

Since it takes about 12 hours for this to happen, results are slow to 
come in.  However, my next steps are to test the following:
- Reverse the flow.  See if the same problem occurs when I replay 
traffic from the destination back to the source.  I expect it will, but 
if not, then I can start investigating HW/config differences between 
the two supposedly "identical" servers.
- Try with a different pcap file.  I am not sure why the pcap file 
should have an effect, but I have no experience with the tcpreplay 
source code, so perhaps it does.
- Try with higher/lower rates.  For example, try with "-M 200" or "-M 
600" instead.  See if the problem occurs sooner/later/at all with the 
different rates.  If it is a memory problem, perhaps higher rates will 
cause it to happen sooner.  Also, if the spikes occur, see if they all 
spike to the same 114MB/s value.

I am not sure what else to try, so I am open to suggestions.  Like I 
said, I have a work-around so I don't strictly need to investigate 
this; however, I would like to know whether it is a bug or perhaps a 
user error.

If there is any additional information I can provide or anything else I 
can try, I am happy to do so.

Thanks again for your time!

Chris


TCPREPLAY version:
# tcpreplay -V
tcpreplay version: 4.1.0 (build git:v4.1.0)
Copyright 2013-2014 by Fred Klassen <tcpreplay at appneta dot com> - 
AppNeta Inc.
Copyright 2000-2012 by Aaron Turner <aturner at synfin dot net>
The entire Tcpreplay Suite is licensed under the GPLv3
Cache file supported: 04
Not compiled with libdnet.
Compiled against libpcap: 1.5.3
64 bit packet counters: enabled
Verbose printing via tcpdump: enabled
Packet editing: disabled
Fragroute engine: disabled
Injection method: PF_PACKET send()
Not compiled with netmap

Sample command line:
# tcpreplay -i p1p3 -M 425  -l 0 ethernet_all.dmp

Platform:
# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

Network information (identical on both servers):
# ethtool p1p3
Settings for p1p3:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: off (auto)
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes


# ethtool -i p1p3
driver: igb
version: 5.2.15-k
firmware-version: 1.67, 0x80000d66, 16.5.20
bus-info: 0000:01:00.2
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no


#  lshw -class network
   *-network:2
        description: Ethernet interface
        product: I350 Gigabit Network Connection
        vendor: Intel Corporation
        physical id: 0.2
        bus info: pci@0000:01:00.2
        logical name: p1p3
        version: 01
        serial: a0:36:9f:83:79:52 (ends in 78:bb on the other machine 
for what it is worth)
        size: 1Gbit/s
        capacity: 1Gbit/s
        width: 32 bits
        clock: 33MHz
        capabilities: pm msi msix pciexpress vpd bus_master cap_list rom 
ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
        configuration: autonegotiation=on broadcast=yes driver=igb 
driverversion=5.2.15-k duplex=full firmware=1.67, 0x80000d66, 16.5.20 
latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s
        resources: irq:18 memory:a2c00000-a2cfffff 
memory:a2f04000-a2f07fff memory:a0180000-a01fffff memory:a2f50000-a2f6ffff


