Re: [pmacct-discussion] Timestamps in RabbitMQ/JSON output

2014-06-05 Thread Paolo Lucente
Hi Chris,

On Tue, Jun 03, 2014 at 10:50:21PM +0300, Chris Wilson wrote:

 So at the moment I am developing this by running pmacctd (not
 nfacctd) on my own laptop to collect and graph my own traffic.
 Thanks for the suggestion of using timestamp_start and _end which I
 didn't know you could aggregate on.
 
 However when I added these to my aggregate line, I found that the
 timestamp_start is in local time (not GMT) and a human-readable date
 format, which is not great for processing in JavaScript, and
 timestamp_end doesn't appear to work properly:
 
 DEBUG ( default/amqp ): publishing [E=pmacct RK=acct DM=0]:
 {timestamp_start: 2014-06-03 22:42:00.202820, ip_dst:
 196.223.145.xxx, ip_proto: tcp, tos: 0, ip_src:
 86.30.131.xxx, bytes: 142, port_dst: 36363, packets: 1,
 port_src: 2201, timestamp_end: 1970-01-01 03:00:00.0}
 
 Is this a bug? Would it be easy to fix?

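This is not a bug. It is a result of the fact that a single packet
has a single timestamp (or two coinciding ones), hence only one of
the two values, timestamp_start, is populated. The unset
timestamp_end is zero, which is why it prints as the Unix epoch
(1970-01-01) in your local time zone. Try the following: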

* capture your own traffic with pmacctd and attach the nfprobe
  plugin (the NetFlow/IPFIX probe plugin) to it. Set the export
  destination to localhost.

* on localhost, run nfacctd so that it listens for the NetFlow/IPFIX
  packets generated by pmacctd/nfprobe and writes them wherever you
  want, with the aggregation you like. This time you will see both
  timestamp_start and timestamp_end populated, as a result of the
  flow-aware cache of nfprobe.

This is the slightly more involved solution I was proposing. I
don't know whether you will like it, but it is definitely good
for a proof of concept.
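
The two steps above could be sketched as a pair of configuration
files. This is only an illustration under assumptions: the interface
name (eth0) and the port (2100) are placeholders, the aggregate lists
are examples, and key names may differ between pmacct releases, so
check them against the CONFIG-KEYS document of your version.

```
! pmacctd.conf (probe side): capture locally, export flows to localhost
! Assumption: eth0 and port 2100 are placeholders, adjust to your setup
interface: eth0
aggregate: src_host, dst_host, src_port, dst_port, proto, tos
plugins: nfprobe
nfprobe_receiver: 127.0.0.1:2100
nfprobe_version: 9

! nfacctd.conf (collector side): receive the export, publish to AMQP
nfacctd_port: 2100
plugins: amqp
aggregate: src_host, dst_host, src_port, dst_port, proto, tos, timestamp_start, timestamp_end
amqp_exchange: pmacct
amqp_routing_key: acct
```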

 It might be. Because I'm mainly using pmacctd (not having any
 netflow-capable hardware) I don't know how that would work in
 pmacctd. Would you send every packet? That could be an awful lot of
 traffic, with some flows having a thousand packets per second.

That is why I'm insisting on the point of the flow-aware cache:
I agree you don't want to tag every packet with a timestamp and
send it over :)
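
To illustrate the idea of a flow-aware cache, here is a minimal
Python sketch. It is not the nfprobe implementation: the class names,
the timeout values and the 5-tuple key are all assumptions made for
the example. It only shows how many packets collapse into one record
that carries both a start and an end timestamp.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical flow key: (ip_src, ip_dst, port_src, port_dst, ip_proto)
FlowKey = Tuple[str, str, int, int, str]


@dataclass
class Flow:
    key: FlowKey
    timestamp_start: float  # time of the first packet seen
    timestamp_end: float    # time of the most recent packet seen
    packets: int = 0
    bytes: int = 0


class FlowCache:
    """Toy flow cache with an inactivity timeout and a maximum lifetime."""

    def __init__(self, inactive_timeout: float = 15.0,
                 max_lifetime: float = 60.0) -> None:
        # Assumed values, chosen only for the example.
        self.inactive_timeout = inactive_timeout
        self.max_lifetime = max_lifetime
        self.flows: Dict[FlowKey, Flow] = {}

    def add_packet(self, key: FlowKey, ts: float, size: int) -> None:
        """Account one packet to its flow, creating the flow if needed."""
        flow = self.flows.get(key)
        if flow is None:
            flow = self.flows[key] = Flow(key, ts, ts)
        flow.timestamp_end = ts
        flow.packets += 1
        flow.bytes += size

    def expire(self, now: float) -> List[Flow]:
        """Remove and return flows that were idle or alive for too long."""
        done = [f for f in self.flows.values()
                if now - f.timestamp_end >= self.inactive_timeout
                or now - f.timestamp_start >= self.max_lifetime]
        for f in done:
            del self.flows[f.key]
        return done
```

With such a cache, a flow of a thousand packets per second leaves the
probe as a single record once it expires, instead of a thousand
individually timestamped messages.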

Cheers,
Paolo


___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Timestamps in RabbitMQ/JSON output

2014-06-03 Thread Chris Wilson

Hi Paolo,

On Tue, 3 Jun 2014, Paolo Lucente wrote:

 What you describe for timestamps seems a good match for NetFlow, i.e.
 cast packets into flows and handle these via a flow-aware cache (so
 active/passive expiration timers, max lifetime, etc.). All of this is
 already part of the nfprobe plugin. Collecting such data back via
 nfacctd (on the same box where NetFlow is exported, or shipped to some
 central location) enables the use of the timestamp_start and
 timestamp_end aggregation primitives, which should be precisely what
 you want to achieve. The beauty is that you can have all possible time
 references at once: timestamp_start, timestamp_end, stamp_inserted,
 stamp_updated.


 Don't know how much you like/dislike the solution, but I'd encourage
 you to run a proof of concept with these tools (which are all
 available already) to see whether we are in line with your
 requirements, and take it from there.


So at the moment I am developing this by running pmacctd (not nfacctd) on 
my own laptop to collect and graph my own traffic. Thanks for the 
suggestion of using timestamp_start and _end which I didn't know you could 
aggregate on.


However when I added these to my aggregate line, I found that the 
timestamp_start is in local time (not GMT) and a human-readable date 
format, which is not great for processing in JavaScript, and timestamp_end 
doesn't appear to work properly:


DEBUG ( default/amqp ): publishing [E=pmacct RK=acct DM=0]: 
{timestamp_start: 2014-06-03 22:42:00.202820, ip_dst: 
196.223.145.xxx, ip_proto: tcp, tos: 0, ip_src: 86.30.131.xxx, 
bytes: 142, port_dst: 36363, packets: 1, port_src: 2201, 
timestamp_end: 1970-01-01 03:00:00.0}


Is this a bug? Would it be easy to fix?
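
For reference, a timestamp in the format shown above can be turned
into Unix epoch milliseconds, which JavaScript's Date accepts
directly. The following is a minimal Python sketch, not part of
pmacct. The function name is hypothetical, and the UTC+3 offset is an
assumption inferred from the zero timestamp_end being printed as
03:00:00 on 1970-01-01.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the capturing host runs at UTC+3 (see the lead-in above).
LOCAL_TZ = timezone(timedelta(hours=3))

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(stamp: str, tz: timezone = LOCAL_TZ) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS.ffffff' local time to epoch milliseconds."""
    naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    aware = naive.replace(tzinfo=tz)
    # Integer division of two timedeltas yields whole milliseconds.
    return (aware - EPOCH) // timedelta(milliseconds=1)
```

Under that assumption, to_epoch_ms("1970-01-01 03:00:00.0") gives 0,
which matches an unset timestamp_end.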

 About an sql_refresh_time of less than one second: I've not considered
 it for a simple reason. It seems to me like forcing an existing caching
 mechanism towards a real-time use case. It would then be better to
 disable the cache altogether and stream flows onto the AMQP exchange as
 they arrive. I have this on my todo list. Does it seem like what you
 are looking for?


It might be. Because I'm mainly using pmacctd (not having any 
netflow-capable hardware) I don't know how that would work in pmacctd. 
Would you send every packet? That could be an awful lot of traffic, with 
some flows having a thousand packets per second.


We could process and aggregate it all on the client side, and that has 
uses (such as drilling down into individual packets), but it would be 
great to have the option of aggregating them on the server as well, at a 
resolution chosen by the user.


It's definitely not something that I need now, but would like you to have 
it on your radar that this might be useful for some people.
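
As an illustration of aggregating at a resolution chosen by the user,
here is a minimal Python sketch. It is not pmacct code: the function
name and the (epoch milliseconds, bytes) sample format are assumptions
made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_by_resolution(samples: Iterable[Tuple[int, int]],
                            resolution_ms: int) -> Dict[int, int]:
    """Sum byte counts into fixed-width buckets of resolution_ms milliseconds.

    Each sample is (timestamp in epoch milliseconds, byte count). The
    returned dict maps the start of each bucket to its total bytes.
    """
    if resolution_ms <= 0:
        raise ValueError("resolution_ms must be positive")
    buckets: Dict[int, int] = defaultdict(int)
    for ts_ms, nbytes in samples:
        # Round the timestamp down to the start of its bucket.
        bucket_start = ts_ms - (ts_ms % resolution_ms)
        buckets[bucket_start] += nbytes
    return dict(buckets)
```

The same function serves a coarse graph (say, 60000 ms buckets) or a
fine one (100 ms buckets) by changing only the resolution argument.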


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.

