Hi Brian,
This time-wrap is a kind of hassle since ever. It actually has been tested on
many exporters - ok mainly on Cisco's, as it seemed to be used widely.

I haven't looked into softflowd overruns, as I expected them too behave the 
same.
I am open to help debugging the stuff, you have me a pcap with an overflow,
which could be rather difficult to produce.

Alternatively, you could add some LogInfo() messages in case of a wrap around
to log the original values to see how compensation needs to be done correctly
or even better to fix softflowd.

Cheers

        - Peter



On 02.06.16 16:51, Brian Candler wrote:
> In one site, I am generating flow records from a pfSense (2.3) box 
> running softflowd (1.2.1).
> 
> On the collector, nfdump (1.6.13) is compiled with --enable-nsel because 
> there is also an ASA generating flows.
> 
> Looking at the pfSense flows, I am seeing some strange summary records 
> where the duration looks to be a 2^32 wraparound (and hence the bps 
> values are completely wrong).
> 
> $ nfdump -M /var/nfsen/profiles-data/live/lch-fw1  -T  -r 
> 2016/06/02/nfcapd.201606020300 -c 10 -s proto/flows 'src host 10.26.1.30 
> and dst host xxx.xx.xx.1'
> Top 10 Protocol ordered by flows:
> Date first seen          Duration Proto Protocol    Flows(%)     
> Packets(%)       Bytes(%) pps      bps   bpp
> 2016-04-13 09:57:55.236 4294010.415 ICMP 1        1(100.0)       
> 24(100.0)     2016(100.0) 0        0    84
> 
> Summary: total flows: 1, total bytes: 2016, total packets: 24, avg bps: 
> 0, avg pps: 0, avg bpp: 84
> Time window: 2016-04-13 09:38:02 - 2016-06-02 03:02:51
> Total flows processed: 5373, Blocks skipped: 0, Bytes read: 349768
> Sys: 0.004s flows/second: 1343250.0  Wall: 0.003s flows/second: 1704632.0
> 
> Note that (4294010.415 * 1000) = 0xfff1662f which is suspiciously near 
> to 2^32.
> 
> The pfsense box generating these records has been up for 38 days, which 
> means it booted on 23rd April. Hence the "Date first seen" of 13th April 
> for the flow record doesn't make much sense.
> 
> I purposedly chose a summary over a single flow for the example above. 
> Looking at the flow itself:
> 
> $ nfdump -M /var/nfsen/profiles-data/live/lch-fw1  -T  -r 
> 2016/06/02/nfcapd.201606020300 -c 10 -o raw 'src host 10.26.1.30 and dst 
> host xxx.xx.xx.1'
> 
> Flow Record:
>    Flags        =              0x06 FLOW, Unsampled
>    export sysid =                 1
>    size         =                56
>    first        =        1460537875 [2016-04-13 09:57:55]
>    last         =        1464831885 [2016-06-02 02:44:45]
>    msec_first   =               236
>    msec_last    =               651
>    src addr     =        10.26.1.30
>    dst addr     =       xxx.xx.xx.1
>    ICMP         =               8.0  type.code
>    fwd status   =                 0
>    tcp flags    =              0x00 ......
>    proto        =                 1 ICMP
>    (src)tos     =                 0
>    (in)packets  =                24
>    (in)bytes    =              2016
> 
> Summary: total flows: 1, total bytes: 2016, total packets: 24, avg bps: 
> 0, avg pps: 0, avg bpp: 84
> Time window: 2016-04-13 09:38:02 - 2016-06-02 03:02:51
> Total flows processed: 5373, Blocks skipped: 0, Bytes read: 349768
> Sys: 0.004s flows/second: 1343250.0  Wall: 0.004s flows/second: 1295323.0
> 
> The "first" timestamp appears to be April 13th.
> 
> If I simply add (2^32 / 1000) to the start time, I get:
> 
> new first = 1464832842
> 
> last = 1464831885
> 
> ...but this gives a "first" time which is now 957 seconds *after* the 
> last time :-(
> 
> Looking at the source code I found this (calculations are in 
> milliseconds at this point):
> 
>                          uint32_t First = data_record->first;
>                          uint32_t Last  = data_record->last;
> 
>                          if ( First > Last )
>                                  /* First in msec, in case of msec 
> overflow, between start and end */
>                                  start_time = exporter->boot_time - 
> 0x100000000LL + (uint64_t)First;
>                          else
>                                  start_time = (uint64_t)First + 
> exporter->boot_time;
> 
>                          /* end time in msecs */
>                          end_time = (uint64_t)Last + exporter->boot_time;
> 
>                          if ( (end_time - start_time) > 0xffc00000 && 
> table->bytes < 2000 ) {
>                                  dbg_printf("CISCO bugfix!\n");
>                                  start_time += 0xffc00000;
>                          }
> 
> There's certainly a subtraction of 2^32 possible there; this implies 
> that softflowd is giving a flow with First > Last.
> 
> I don't think the 0xffc00000 branch applies here: firstly it would bring 
> the start_time forward not back, and secondly the (in) bytes count is 
> greater than 2000.
> 
> However, adding (0xffc00000 / 1000) to the start time, I get a value 
> which is about 54 minutes before the finish time, which is vaguely 
> plausible.
> 
> Any suggestions for what is happening here, or how I might go about 
> debugging it? Or has anyone else seen issues like this with softflowd 
> and flow start/end timestamps?
> 
> Thanks,
> 
> Brian.
> 
> ------------------------------------------------------------------------------
> What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
> patterns at an interface-level. Reveals which users, apps, and protocols are 
> consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
> J-Flow, sFlow and other flows. Make informed decisions using capacity 
> planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
> _______________________________________________
> Nfdump-discuss mailing list
> Nfdump-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfdump-discuss
> 

-- 
Be nice to your netflow data. Use NfSen and nfdump :)

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity 
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
Nfdump-discuss mailing list
Nfdump-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfdump-discuss

Reply via email to