Wow, that's a lot of information. I'll add one more case I'm aware of: applications capturing on multiple interfaces and writing to a single file. Depending on who does the timestamping (the NIC or the CPU) and how you're poll()'ing the interfaces, this can lead to small jumps back in time. If the NIC is timestamping the packet (which is hardware/driver dependent), it will put the "correct time" in the packet header, but it's quite probable that the user-space application polling those NICs won't actually read the packets in the order they were received.
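Here is a toy Python sketch of the effect, not taken from any real capture tool. The timestamps, the batch size and the helper names (count_reversals, poll_in_batches) are all made up for illustration. Each simulated interface delivers its own packets in order, but draining the interfaces a batch at a time interleaves them so that timestamps go backwards, while a timestamp-ordered merge of the per-interface streams does not.

```python
import heapq

def count_reversals(timestamps):
    """Count packets whose timestamp is earlier than the previous packet's."""
    return sum(1 for prev, cur in zip(timestamps, timestamps[1:]) if cur < prev)

def poll_in_batches(queues, batch):
    """Drain per-interface queues round-robin, `batch` packets at a time,
    roughly the way a simple poll() loop over several NICs might."""
    queues = [list(q) for q in queues]  # copy so the inputs are not modified
    out = []
    while any(queues):
        for q in queues:
            out.extend(q[:batch])
            del q[:batch]
    return out

# Hypothetical hardware timestamps (seconds) from two NICs.
# Each NIC's own stream is in increasing order.
eth0 = [1.000, 1.002, 1.004, 1.006]
eth1 = [1.001, 1.003, 1.005, 1.007]

# Written to one file in polling order: small jumps back in time.
naive = poll_in_batches([eth0, eth1], batch=2)

# Merged by timestamp (each input must already be sorted): no reversals.
merged = list(heapq.merge(eth0, eth1))

print("naive reversals: ", count_reversals(naive))
print("merged reversals:", count_reversals(merged))
```

heapq.merge only works here because each per-interface stream is already sorted, which is the offline-merge situation. A live capture would instead need a bounded buffer that holds packets back for at least the largest skew you expect between interfaces before writing them out.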
Basically, userspace would have to cache packets in memory and reorder them on the fly before writing the pcap file, but I suspect a lot of people don't bother to do this. Or you could write them to separate files and merge them offline.

On Thu, Jun 28, 2012 at 10:07 AM, John McHugh <mch...@cs.unc.edu> wrote:
> The tcpreplay FAQ says "More specifically, I have seen cases where a packet
> has a timestamp before the previous packet in the capture file. I'm not sure
> how such a pcap got created, but it seems to occasionally happen."
>
> The purpose of this note is to provide a plausible explanation for timestamp
> reversals and a bit of advice for those who are capturing pcap. The posting
> is somewhat lengthy as I think the context is important. It is not specific
> to tcpreplay except in the context of the FAQ quotation, above.
>
> 1) There are two publicly available data sets that manifest the problem.
> (There are probably many more, but these are available.)
> a) The Crawdad (Dartmouth) 2003-2004 wireless packet header traces - about
> 4 months of anonymized headers from 18 wireless sniffers located on the
> Dartmouth campus. Packets were truncated after the port fields for TCP and
> UDP, after the IP header for everything else. The user agreement is fairly
> benign, but attacking the anonymization is off limits.
> b) The IARPA data set available from the DHS Predict repository. Note that
> a Predict user agreement is required which requires institutional or
> corporate level responsibility. This is artificial data with background and
> attack scenarios captured by 3 tcpdump probes attached to trunks within a
> simulated /11 network. It contains tcpdump traces as well as logs, alerts,
> etc. along with ground truth for labeled attacks.
>
> 2) I have primarily worked with NetFlow data, so packet headers are fine for
> what I do. I am paranoid about data, so I do a lot of sanity checking.
> I use the SiLK tools from CERT NETSA as my primary tool base, sometimes heavily
> modified. To convert pcap to "degenerate" flows (1 packet produces 1 flow
> record), I use a modified "rwptoflow" program.
>
> 3) The Dartmouth data is about 160GB of compressed packet header files (a
> week or more to download on a modest DSL line). I modified libpcap
> (modification is in latest source release) to allow reading of compressed
> files using a hack that supports filenames of the form "| gunzip -c <file>"
> to let pcap_open_offline use popen / pclose on such forms (thanks to Phil
> Budne's SNOBOL 4 in C) and modified rwptoflow to treat file names starting
> with "@" as a list of files to be processed so that I could process all 4
> months of data for one sensor in one pass, producing a date hierarchy of
> hourly SiLK files. (mergecap, distributed with wireshark, opens all the
> files at once and runs out of file descriptors for some sensors having
> hundreds of files.)
>
> One of the modifications to rwptoflow is a check for decreasing time. Time
> reversals of a few microseconds are scattered throughout the Dartmouth data
> for some sensors. There are also a handful of longer reversals, on the
> order of 3400 seconds. Because SiLK only carries time to the millisecond, it
> was possible to determine that none of the minor reversals in the pcap cause
> SiLK time reversals. Inspection of the reversal regions using "wireshark"
> confirmed that they are in the pcap files.
>
> My suspicion is that the longer reversals are caused by manual resetting of
> the clock on the capture machine(s) involved, but why 3400 seconds remains
> unresolved. Since the data is captured from wireless sniffers, I reasoned
> that some sources might occasionally appear simultaneously in two (or more)
> traces and that if I could find the same traffic in two traces, one with and
> one without the jump, I might be able to correct the faulty clock.
> I started by looking for the same (sip, dip, protocol, sport, dport) 5 tuples in
> pairwise traces using the SiLK rwmatch program. This revealed that, for the
> first 6 weeks of the trace, time differences of a minute or more between
> traces were common and that substantial relative drifts occurred between
> clocks. (see attached png). As far as I can tell, acquisition was started
> without setting up NTP on the sniffers which must have had really bad clocks.
> After about 6 weeks, the clocks abruptly converge, indicating that NTP was
> probably enabled. No records were kept of configuration, and, even when I
> first noticed the problem in 2006, or so, no further information was
> available from Dartmouth.
>
> 4) The IARPA trace has a small number of millisecond level reversals, all
> near the beginning of capture runs. The complete data set is about 100GB and
> took about a week to download using wget with a 200KB/sec limit.
>
> 5) For the small reversals in both sets, I suspect that NTP is setting the
> clock back, abruptly. I think that this can be avoided by having NTP slew
> rather than step the clock on machines used for capture. In the IARPA case,
> the capture was probably started on a few occasions before NTP had stabilized
> the time.
>
> 6) Thus far, I have been unable to find evidence to resolve the large jumps.
> There are not enough cases of multiple capture to set up a relative clock
> rating matrix that would remove all the drifts during the first 6 weeks. If
> that could be done, the offsets when NTP was applied could be used to
> backwards correct the earlier timestamps. This might make an interesting
> Senior or MS project for a student who wants to pay attention to small
> details.
>
> ============
>
> All of this points out the need for complete record keeping during packet
> capture. NTP should be enabled with an adequate NTP server configuration.
> Slewing corrections should be used.
> Nonetheless, the data should be analyzed
> for time reversals and unexpectedly long forward jumps. Both the NTP
> configuration and the NTP corrections log should be part of supporting
> evidence for capture validity.
>
> John McHugh
> RedJack, LLC and
> CS Department, UNC
>
> ============
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Tcpreplay-users mailing list
> Tcpreplay-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/tcpreplay-users
> Support Information: http://tcpreplay.synfin.net/trac/wiki/Support

--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"