Hi Daniel,

I see the 1 minute table contains duplicates - or, to put it
better, _everything_ extracted appears twice. It could be handy
to add tags to the aggregation method and assign a different
'post_tag' to each plugin, so as to identify which plugin is
generating them.
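
A minimal sketch of what I mean, assuming your version supports
'post_tag' and names the aggregation primitive 'tag' (check the
CONFIG-KEYS of the release you run; the tag values 1-4 are
arbitrary and purely illustrative):

```
! one distinct tag per plugin, so each row reveals its writer
aggregate[inbound1]: tag, src_host, src_port, dst_host, dst_port, proto
aggregate[outbound1]: tag, src_host, src_port, dst_host, dst_port, proto
aggregate[inbound2]: tag, src_host, src_port, dst_host, dst_port, proto
aggregate[outbound2]: tag, src_host, src_port, dst_host, dst_port, proto

post_tag[inbound1]: 1
post_tag[outbound1]: 2
post_tag[inbound2]: 3
post_tag[outbound2]: 4
```

With that in place, two otherwise identical rows in the same
table should differ only by their tag.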

I also wonder: what does the primary key of the 1 min table
look like? Is it any different from the 1 hour table? With
sql_dont_try_update turned on and the default indexing,
duplicates should not be possible.

Also, looking more closely at the configuration you posted, I
see no aggregate_filter is specified (see EXAMPLES): this means
each plugin collects both inbound and outbound traffic and
tries to write it to the same table. So you can either remove
one set of plugins or craft a proper aggregate_filter so
that each plugin does only its bit of the job.
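
As a rough sketch of the second option, along the lines of what
EXAMPLES shows: the 192.0.2.0/24 prefix below is only a
placeholder I'm assuming for your local network, so replace it
with the real one.

```
! inbound plugins keep only traffic destined to the local network
aggregate_filter[inbound1]: dst net 192.0.2.0/24
aggregate_filter[inbound2]: dst net 192.0.2.0/24

! outbound plugins keep only traffic sourced from the local network
aggregate_filter[outbound1]: src net 192.0.2.0/24
aggregate_filter[outbound2]: src net 192.0.2.0/24
```

The filters use libpcap syntax, the same as pcap_filter.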

With regard to the missing tuples, in the few checks I've
done it is always the case that a tuple is in the 1 hour
table but missing from the 1 minute one. This can very
well be the result of the shared 'sql_preprocess: minb = 1000'
directive: a flow can accumulate more than 1000 bytes in 1
hour but not in 1 minute - and hence it's accounted in one
table and stripped off in the other.

Given the sql_preprocess, you should never expect counters
to match, for the same reason as above. To have a more
apples-to-apples comparison, you should consider removing it
and, once confident everything is all right, putting it back.

Finally, unrelated to the issue: for the benefit of the
public archives, please don't send attachments to the list.

Cheers,
Paolo

 

On Fri, Feb 19, 2010 at 12:11:24PM +0000, Daniel Levy wrote:
> Hi Paolo,
> 
> Here is a report with differences between the two tables.  There are a
> lot of differences between the download figures as well as some
> instances where IP addresses are only found in one table (see
> report.txt). I also have some raw data for the time periods between
> 10:00 and 11:00 on 11/02/2010, which are being sent separately via
> yousendit.
> 
> Regards
> 
> -- 
> Daniel Levy
> 
> Aptivate | http://www.aptivate.org/ | +44 (0)1223 760887
> The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES
> 
> Aptivate is a not-for-profit company registered in England and Wales
> with company number 04980791. 
> 
> 
> 
> Paolo Lucente wrote:
> > Hi Daniel,
> >
> > Going through the data and comparing traffic figures is,
> > IMHO, the more practical approach - compared to trying to
> > reproduce the issue in a controlled environment. Once you
> > discover a discrepancy, it would be great to receive the
> > contributing data of each report to see where the issue
> > comes from.
> >
> > It's also true that version 0.9.1 is almost 5 years old; I
> > would highly encourage you to refresh it. I should be correct
> > in saying Ubuntu also features versions 0.11.4 and 0.11.6, if
> > you really don't like the idea of compiling 0.12 yourself
> > (which would be my preferred approach).
> >
> > Let me know.
> >
> > Cheers,
> > Paolo
> >
> >
> >
> > On Mon, Feb 15, 2010 at 10:41:21AM +0000, Daniel Levy wrote:
> >   
> >> Hi Paolo,
> >>
> >> Thanks for getting back to me. The version of pmacct being used is
> >> 0.9.1-1ubuntu1. I'm not sure how the problem was discovered, but I have
> >> asked the person who found the problem to tell me and I will forward you
> >> the response. As for the reports, I'm not entirely sure what you need. I
> >> am considering going through the database data for each hour and
> >> comparing the total figures for uploaded and downloaded packets, per IP
> >> address, between the two tables. Would this give you the information
> >> you're looking for?
> >>
> >> -- 
> >> Daniel Levy
> >>
> >> Aptivate | http://www.aptivate.org/ | +44 (0)1223 760887
> >> The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES
> >>
> >> Aptivate is a not-for-profit company registered in England and Wales
> >> with company number 04980791. 
> >>
> >>
> >>
> >> Paolo Lucente wrote:
> >>     
> >>> Hi Daniel,
> >>>
> >>> Unfortunately the configuration doesn't make evident where the
> >>> issue can be. The 'sql_dont_try_update' very well protects against
> >>> duplicate tuples - so I'm rather inclined to exclude that reason.
> >>>
> >>> Which version are you using? How did you discover the issue - i.e.
> >>> did you upgrade recently from a previous version, or is this a fresh
> >>> installation? Finally, is it possible to get - privately - two
> >>> reports, one from each table, for the same time period? Say, one
> >>> or better two hours?
> >>>
> >>> Let me know.
> >>>
> >>> Cheers,
> >>> Paolo
> >>>
> >>>
> >>> On Fri, Feb 12, 2010 at 03:19:38PM +0000, Daniel Levy wrote:
> >>>   
> >>>       
> >>>> Hi,
> >>>>
> >>>> I'm using pmacct to store data in two tables, one containing data
> >>>> recorded on a per minute basis, the other containing data recorded on an
> >>>> hourly basis. When I get data for the first table over a period of three
> >>>> hours, the download traffic (calculated by adding up the bytes field for
> >>>> traffic where the ip_dst value is from a machine on the local network)
> >>>> for one IP address on the network is 3,719,772,656 bytes. The download
> >>>> traffic from the second table for the same IP address over a period of
> >>>> one week, including the three hour period mentioned above, is
> >>>> significantly smaller (2,114,286,512 bytes) where I would expect it to
> >>>> be much larger and I can't figure out why. A slightly modified version
> >>>> of the contents of my pmacctd.conf file is given below. Can anyone help?
> >>>>
> >>>> daemonize: true
> >>>> pidfile: /var/run/pmacctd.pid
> >>>> syslog: daemon
> >>>>
> >>>> plugins: mysql[inbound1], mysql[outbound1], mysql[inbound2],
> >>>> mysql[outbound2]
> >>>>
> >>>> aggregate[inbound1]: src_host, src_port, dst_host, dst_port, proto
> >>>> aggregate[outbound1]: src_host, src_port, dst_host, dst_port, proto
> >>>> aggregate[inbound2]: src_host, src_port, dst_host, dst_port, proto
> >>>> aggregate[outbound2]: src_host, src_port, dst_host, dst_port, proto
> >>>>
> >>>> pcap_filter: not (src and dst net 0.0.0.0/24)
> >>>>
> >>>>
> >>>> sql_db: pmacct
> >>>> sql_table[inbound1]: short_data_table
> >>>> sql_table[outbound1]: short_data_table
> >>>>
> >>>> sql_table[inbound2]: long_data_table
> >>>> sql_table[outbound2]: long_data_table
> >>>>
> >>>> sql_history[inbound1]: 1m
> >>>> sql_history[outbound1]: 1m
> >>>> sql_history[inbound2]: 1h
> >>>> sql_history[outbound2]: 1h
> >>>>
> >>>> sql_history_roundoff[inbound1]: m
> >>>> sql_history_roundoff[outbound1]: m
> >>>> sql_history_roundoff[inbound2]: h
> >>>> sql_history_roundoff[outbound2]: h
> >>>> sql_table_version: 6
> >>>> sql_host: localhost
> >>>> sql_user: auser
> >>>> sql_passwd: apass
> >>>>
> >>>> sql_refresh_time[inbound1]: 60
> >>>> sql_refresh_time[outbound1]: 60
> >>>> sql_refresh_time[inbound2]: 3600
> >>>> sql_refresh_time[outbound2]: 3600
> >>>> sql_dont_try_update: true
> >>>> sql_optimize_clauses: true
> >>>>
> >>>> sql_preprocess: minb = 1000
> >>>>
> >>>> Regards
> >>>>
> >>>> -- 
> >>>> Daniel Levy
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> pmacct-discussion mailing list
> >>>> http://www.pmacct.net/#mailinglists
> >>>>     
> >>>>         
> >>>   
> >>>       


