Looks like there is a solution in version 4.1.1.  For anyone who is 
interested, I did still run three tests last night:
- reverse direction - problem still occurred
- higher transfer rate - problem occurred sooner
- lower transfer rate - problem has not yet occurred

At 50MB/s ("-M 425"), traffic started at 19:26:16.  Problem occurred at 
07:29:40, almost exactly 12 hours later.

At 71MB/s ("-M 600"), traffic started at 19:27.  Problem occurred at 
03:59, so about 8.5 hours later.

At 35MB/s ("-M 300"), traffic started at 19:26:24.  It is now 12:00, so 
16.5 hours without a problem.

These appear to be consistent.  The problem seems to occur after 
transferring roughly 2.17TB (2,169,000MB) of data.  At 50MB/s, that 
takes 2,169,000MB / 50MB/s = 43,380s = 723min = 12.05 hours.  At 
71MB/s, it works out to 509min = 8.5 hours.
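
For anyone who wants to double-check the arithmetic, here is a quick 
shell sketch (the 2,169,000MB threshold is just my estimate from the 
two runs above, not a number taken from the code):

  for rate in 50 71 35; do
    # expected time to hit the ~2,169,000 MB threshold at this rate
    awk -v r="$rate" 'BEGIN {s = 2169000/r; printf "%d MB/s: %.0f s = %.0f min = %.2f h\n", r, s, s/60, s/3600}'
  done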

Without looking at the code, that threshold looks suspiciously like a 
signed 32-bit counter overflowing at its maximum value of 
2^31 = 2,147,483,648 (e.g. if the counter tracks something on the order 
of kilobytes or packets rather than raw bytes).  (A quick look at the 
bug fix confirms it was indeed solved by moving to 64-bit counters.)

At the current rate of 35MB/s, I would expect the problem to occur 
after 1033min = 17.2 hours.  That should happen within the hour - at 
about 12:38.  I won't flood anyone's inbox with another mundane report 
unless this last test shows something unique.

All times are UTC.

Thanks for the help!

Chris

On 16-04-26 03:30 PM, Chris wrote:
> The second one appears to be my problem, although in the bug report it
> happens much sooner - perhaps due to a higher transfer rate.  I am
> still running three more tests overnight to confirm.
>
> Thanks!
>
> On 16-04-26 02:35 PM, Fredrick Klassen wrote:
>> Fixed in 4.1.1 https://github.com/appneta/tcpreplay/releases/tag/v4.1.1
>>
>> See https://github.com/appneta/tcpreplay/issues/241 and
>> https://github.com/appneta/tcpreplay/issues/210
>>
>> Fred.
>>> On Apr 26, 2016, at 11:08 AM, Chris <ckittl...@gmail.com
>>> <mailto:ckittl...@gmail.com>> wrote:
>>>
>>> As the subject says, I am seeing rate spikes in long-duration tests using
>>> tcpreplay.  The spike occurred after roughly 12 hours when transmitting at
>>> a rate of 425Mb/s.
>>>
>>> Question: Is this a known problem?  Is it possible I am doing something
>>> wrong in my testing?  I've never heard of such a thing before and none
>>> of my colleagues who have used tcpreplay extensively have seen such a
>>> thing either, and we are all at a loss to explain it.
>>>
>>> I have a work-around so that I can complete my testing.  Essentially I
>>> will run tcpreplay on the file once at the given rate ("-l 1" instead of
>>> "-l 0") and wrap *that* in a script so it repeats indefinitely.  That
>>> should get me the same results, but I would like to know if the
>>> behaviour I am seeing is expected or anomalous.
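>>>
>>> (For concreteness, the wrapper would be something along these lines - a
>>> rough sketch rather than my exact script:
>>>
>>> # while true; do tcpreplay -i p1p3 -M 425 -l 1 ethernet_all.dmp; done
>>>
>>> i.e. let tcpreplay make a single pass over the file at the enforced rate
>>> and immediately start it again.)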
>>>
>>> Thanks for any time you have to look into this.
>>>
>>> ~~~~~
>>>
>>> Setup:
>>>
>>> I am using tcpreplay version 4.1.0 on a CentOS 7.2.1511 operating
>>> system.  My goal is to send traffic at a steady rate for a long duration
>>> (24 hours).  In my two attempts, I have seen that after about 12 hours
>>> the rate that I specified was no longer enforced.
>>>
>>> I have two servers directly connected by ethernet cable.  Conveniently,
>>> the connected interfaces are both named p1p3.
>>>
>>> In my first attempt, I was sending traffic from one server to the other
>>> at 415Mb/s.
>>>
>>> # tcpreplay -i p1p3 -M 415  -l 0 ethernet_all.dmp
>>>
>>> I started this test at around 18:00.  I verified the speed on both ends
>>> using a script which basically takes the delta of
>>>
>>> # cat /sys/class/net/${INTERFACE}/statistics/[tr]x_bytes
>>>
>>> periodically to compute the speed.  I divided the result by 1024^2 to
>>> get it in MB/s instead and was seeing 49MB/s consistently.  The source
>>> machine reported 49MB/s tx and the destination machine reported
>>> 49MB/s rx.
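>>>
>>> (Roughly equivalent to the following - a simplified sketch of the kind of
>>> thing my script does, not the script itself:
>>>
>>>   IF=p1p3
>>>   while true; do
>>>     tx0=$(cat /sys/class/net/$IF/statistics/tx_bytes)
>>>     sleep 10                               # arbitrary sample interval
>>>     tx1=$(cat /sys/class/net/$IF/statistics/tx_bytes)
>>>     echo "$(( (tx1 - tx0) / 10 / 1024 / 1024 )) MB/s"
>>>   done
>>>
>>> with rx_bytes instead of tx_bytes on the receiving side.)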
>>>
>>> Things got strange at 05:26 the next day; the rate on both machines
>>> jumped from 49MB/s to 114MB/s.  By the time I saw it, it had been
>>> running at that rate for many hours.  When I stopped it, the script I
>>> used to compute the rate reported 0MB/s, and when I restarted (only a
>>> few seconds later), it was back to the normal 49MB/s.
>>>
>>> I retried the test the next day, this time with a traffic rate of
>>> 425Mb/s (50MB/s), and saw the same result.  I started the traffic around
>>> 16:00 and the spike occurred at 6:50
>>> the next morning (sorry, I don't have exact times).  Again, it began
>>> transmitting at 114MB/s.
>>>
>>> In an effort to isolate the problem, I repeated the procedure without my
>>> application.  I used tcpreplay on one end and tcpdump on the other.  I
>>> set a rate of "-M 425" and left it overnight.  About 12 hours later
>>> (1:45) the spike occurred again, ramping traffic up to 114MB/s.
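>>>
>>> (On the receiving end, tcpdump just needs to consume the packets -
>>> something like "# tcpdump -i p1p3 -w /dev/null" would do; the exact
>>> capture options shouldn't matter for this test.)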
>>>
>>> Here are some additional data points:
>>> - If I run at top speed (-t), then I also see 114MB/s, so it would seem
>>> that after a while tcpreplay begins transmitting at top speed.
>>> - Originally I was accessing my pcap file over an NFS mount.  In order
>>> to rule out the possibility that it was somehow affected by this, I made
>>> a local copy of the file.
>>> - There is no other data on that interface.  When I stop tcpreplay, the
>>> traffic rate drops to 0 and when I restart, it goes to the normal
>>> (pre-spike) value.  If there was another source of traffic, I would
>>> expect stopping tcpreplay would decrease the rate only by the amount
>>> contributed (e.g. 49MB/s), but not all the way to 0.  This suggests that
>>> tcpreplay is the only source of traffic.
>>>
>>> Since it takes about 12 hours for this to happen, it is a little slow to
>>> get some results.  However, my next steps are to test the following:
>>> - Reverse the flow. See if this same problem occurs if I replay traffic
>>> from the destination back to the source.  Should have the same problem,
>>> but if not, then I can start investigating HW/config on the two
>>> supposedly "identical" servers.
>>> - Try with a different pcap file.  Not sure why the pcap file should
>>> have an effect, but I have no experience with the tcpreplay source code,
>>> so perhaps.
>>> - Try with higher/lower rates.  For example, try with "-M 200" or "-M
>>> 600" instead.  See if the problem occurs sooner/later/at all with the
>>> different rates.  If it is a memory problem, perhaps higher rates will
>>> cause it to happen sooner.  Also, if the spikes occur, see if they all
>>> spike to the same 114MB/s value.
>>>
>>> I am not sure what else to try, so I am open to suggestions.  Like I
>>> said, I have a work-around so I don't strictly need to investigate this;
>>> however, I would like to know whether it is a bug or a user error.
>>>
>>> If there is any additional information I can get or things I can try, I
>>> am open to doing so.
>>>
>>> Thanks again for your time!
>>>
>>> Chris
>>>
>>>
>>> TCPREPLAY version:
>>> # tcpreplay -V
>>> tcpreplay version: 4.1.0 (build git:v4.1.0)
>>> Copyright 2013-2014 by Fred Klassen <tcpreplay at appneta dot com> -
>>> AppNeta Inc.
>>> Copyright 2000-2012 by Aaron Turner <aturner at synfin dot net>
>>> The entire Tcpreplay Suite is licensed under the GPLv3
>>> Cache file supported: 04
>>> Not compiled with libdnet.
>>> Compiled against libpcap: 1.5.3
>>> 64 bit packet counters: enabled
>>> Verbose printing via tcpdump: enabled
>>> Packet editing: disabled
>>> Fragroute engine: disabled
>>> Injection method: PF_PACKET send()
>>> Not compiled with netmap
>>>
>>> Sample command line:
>>> # tcpreplay -i p1p3 -M 425  -l 0 ethernet_all.dmp
>>>
>>> Platform:
>>> # cat /etc/redhat-release
>>> CentOS Linux release 7.2.1511 (Core)
>>>
>>> Network information (identical on both servers):
>>> # ethtool p1p3
>>> Settings for p1p3:
>>> Supported ports: [ TP ]
>>> Supported link modes:   10baseT/Half 10baseT/Full
>>>                        100baseT/Half 100baseT/Full
>>>                        1000baseT/Full
>>> Supported pause frame use: Symmetric
>>> Supports auto-negotiation: Yes
>>> Advertised link modes:  10baseT/Half 10baseT/Full
>>>                        100baseT/Half 100baseT/Full
>>>                        1000baseT/Full
>>> Advertised pause frame use: Symmetric
>>> Advertised auto-negotiation: Yes
>>> Speed: 1000Mb/s
>>> Duplex: Full
>>> Port: Twisted Pair
>>> PHYAD: 1
>>> Transceiver: internal
>>> Auto-negotiation: on
>>> MDI-X: off (auto)
>>> Supports Wake-on: d
>>> Wake-on: d
>>> Current message level: 0x00000007 (7)
>>>       drv probe link
>>> Link detected: yes
>>>
>>>
>>> # ethtool -i p1p3
>>> driver: igb
>>> version: 5.2.15-k
>>> firmware-version: 1.67, 0x80000d66, 16.5.20
>>> bus-info: 0000:01:00.2
>>> supports-statistics: yes
>>> supports-test: yes
>>> supports-eeprom-access: yes
>>> supports-register-dump: yes
>>> supports-priv-flags: no
>>>
>>>
>>> #  lshw -class network
>>>   *-network:2
>>>        description: Ethernet interface
>>>        product: I350 Gigabit Network Connection
>>>        vendor: Intel Corporation
>>>        physical id: 0.2
>>>        bus info: pci@0000:01:00.2
>>>        logical name: p1p3
>>>        version: 01
>>>        serial: a0:36:9f:83:79:52 (ends in 78:bb on the other machine
>>> for what it is worth)
>>>        size: 1Gbit/s
>>>        capacity: 1Gbit/s
>>>        width: 32 bits
>>>        clock: 33MHz
>>>        capabilities: pm msi msix pciexpress vpd bus_master cap_list rom
>>> ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd
>>> autonegotiation
>>>        configuration: autonegotiation=on broadcast=yes driver=igb
>>> driverversion=5.2.15-k duplex=full firmware=1.67, 0x80000d66, 16.5.20
>>> latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s
>>>        resources: irq:18 memory:a2c00000-a2cfffff
>>> memory:a2f04000-a2f07fff memory:a0180000-a01fffff
>>> memory:a2f50000-a2f6ffff
>>>

_______________________________________________
Tcpreplay-users mailing list
Tcpreplay-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tcpreplay-users
Support Information: http://tcpreplay.synfin.net/trac/wiki/Support
