Hi Chris,
About the SQL INSERT conflict: are you by any chance using the
"sql_dont_try_update" directive in your configuration? And are you using
32-bit counters? The combination of these two conditions might explain it.
The SQL cache code, while summing up counters, makes a check on whether
the counter field is about to overflow. When 64bit counters are disabled
(default) this is what happens:
  #define UINT32T_THRESHOLD 4290000000UL
  #define CACHE_THRESHOLD UINT32T_THRESHOLD

  /* additional check: bytes counter overflow */
  else if (Cursor->bytes_counter > CACHE_THRESHOLD) {
    if (!staleElem && Cursor->chained) staleElem = Cursor;
    goto follow_chain;
  }
Basically, a new record is opened for the entry which is about to
overflow, and the old one is "parked". When purging the cache to the SQL
database, both records (the active and the parked one) are sent over:
the first with an INSERT, the second with an UPDATE. This mechanism
holds for any number of overflows.
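To illustrate the parking behaviour described above, here is a minimal
sketch. It is not the pmacct cache code: the flat "records" array and the
add_bytes() helper are hypothetical simplifications, and it assumes each
single increment is small enough to fit in the headroom left above the
threshold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UINT32T_THRESHOLD 4290000000UL
#define CACHE_THRESHOLD UINT32T_THRESHOLD

/*
 * Hypothetical helper, for illustration only.
 * records[0..used-1] hold the 32-bit byte counters for one key; the last
 * one is the active record, any earlier ones are "parked".
 * The overflow check runs before the addition: if the active record is
 * already above the threshold, it is parked and a fresh record is opened.
 * Returns the number of records in use afterwards.
 */
static size_t add_bytes(uint32_t *records, size_t used, size_t cap, uint32_t len)
{
  assert(cap > 0);

  if (used == 0) {
    records[0] = 0;
    used = 1;
  }

  if (records[used - 1] > CACHE_THRESHOLD && used < cap) {
    records[used] = 0; /* open a new active record */
    used++;            /* the previous one stays parked */
  }

  records[used - 1] += len;
  return used;
}
```

With this sketch, a counter that has just reached 4290000028 receives no
further bytes; the next packet lands in a second record for the same key,
so two rows with the same primary key are written at purge time.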
The above would also explain why a number of the entries above the 1GB
level are around 4GB. But it would also suggest the counters are
genuine. Another thing which suggests these are "real" is that dividing
the bytes counter by the packets counter gives a consistent average
packet size:
4290000028 / 10026264 = ~428 bytes
3943258731 / 8984686 = ~439 bytes
Any bytes counter roll-over would have greatly skewed one of the above
two ratios, highlighting an issue. But this would suggest that roughly
8GB of data were transferred in a single minute, which translates into
a fully loaded 1Gbps link. This brings me to these questions: is your
LAN (including the "192.168.0.175" host) connected at 1Gbps? Do you
think some LAN traffic could be getting spanned over?
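The two sanity checks above (average packet size and implied link rate)
can be reproduced with a short sketch. The helper names are made up for
illustration, and the 60-second window is an assumption based on the
one-minute spacing of the stamp_inserted values.

```c
#include <assert.h>
#include <stdint.h>

/* Average packet size in bytes for one record (hypothetical helper). */
static double avg_packet_size(uint64_t bytes, uint64_t packets)
{
  assert(packets > 0);
  return (double)bytes / (double)packets;
}

/* Average rate in Mbps for a byte count seen over a time window. */
static double rate_mbps(uint64_t bytes, unsigned int seconds)
{
  assert(seconds > 0);
  return (double)bytes * 8.0 / (double)seconds / 1e6;
}
```

For the two records in question, avg_packet_size() gives roughly 428 and
439 bytes, and rate_mbps(4290000028ULL + 3943258731ULL, 60) comes out at
about 1100 Mbps, which is what a saturated 1Gbps link would look like.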
Please let me know.
Cheers,
Paolo
On Sat, Mar 14, 2009 at 02:59:30PM +0000, Chris Wilson wrote:
> Hi Paolo,
>
> I'm running pmacctd 0.11.5 on a small network for traffic accounting.
> Generally it's behaving well, but occasionally I can see weird data being
> inserted:
>
> 17190 Query INSERT INTO `acct_v7` (stamp_updated, stamp_inserted, vlan,
> ip_dst, as_src, as_dst, src_port, dst_port, tcp_flags, tos, ip_proto,
> agent_id, class_id, mac_src, mac_dst, ip_src, packets, bytes, flows)
> VALUES (FROM_UNIXTIME(1236952981), FROM_UNIXTIME(1236952920), 0,
> '192.168.0.175', 0, 0, 0, 0, 0, 0, 'ip', 0, 'unknown', '0:0:0:0:0:0',
> '0:0:0:0:0:0', '0.0.0.0', 10026264, 4290000028, 0)
>
> 17190 Query INSERT INTO `acct_v7` (stamp_updated, stamp_inserted, vlan,
> ip_dst, as_src, as_dst, src_port, dst_port, tcp_flags, tos, ip_proto,
> agent_id, class_id, mac_src, mac_dst, ip_src, packets, bytes, flows)
> VALUES (FROM_UNIXTIME(1236952981), FROM_UNIXTIME(1236952920), 0,
> '192.168.0.175', 0, 0, 0, 0, 0, 0, 'ip', 0, 'unknown', '0:0:0:0:0:0',
> '0:0:0:0:0:0', '0.0.0.0', 8984686, 3943258731, 0)
>
> The byte counters look bogus to me. It's hard to imagine how anyone could
> send 4 GB of data down through my cable modem connection in just one
> minute. I might even suspect a 32-bit sign overflow, but in the second
> case that would still mean 350 MB in one minute which is 46 Mbps, more
> than four times my line rate, and my external interface graphs show no
> traffic at all during that time.
>
> What's also odd is that the second record is a primary key conflict with
> the first, so it never ended up in the database. I don't have two
> pmacctd's running this time :) but I do have two plugins configured as
> follows:
>
> plugins: mysql[inbound], mysql[outbound]
>
> aggregate[inbound]: dst_host
> aggregate_filter[inbound]: dst net 192.168.0.0/24
>
> aggregate[outbound]: src_host
> aggregate_filter[outbound]: src net 192.168.0.0/24
>
> They both insert into the same table, which is what I want in this case.
> Because of aggregation, they should never conflict with each other. But
> could this be causing memory corruption?
>
> Here is the suspicious data that I have in my database (I assume that
> MySQL is not corrupting this data):
>
> mysql> select stamp_inserted,bytes,packets from acct_v7 where bytes >
> 1000000000;
> +---------------------+------------+----------+
> | stamp_inserted | bytes | packets |
> +---------------------+------------+----------+
> | 2009-02-13 09:27:00 | 3192440953 | 3077338 |
> | 2009-02-25 15:31:00 | 1520451669 | 17845485 |
> | 2009-02-25 15:31:00 | 4290000569 | 9270610 |
> | 2009-02-25 15:32:00 | 1833044423 | 4116940 |
> | 2009-03-09 01:43:00 | 3842930106 | 4829946 |
> | 2009-03-09 01:43:00 | 4290000226 | 4202681 |
> | 2009-03-13 14:00:00 | 4290000631 | 9675501 |
> | 2009-03-13 14:01:00 | 4290000783 | 9514197 |
> | 2009-03-13 14:02:00 | 4290000028 | 10026264 |
> | 2009-03-13 14:03:00 | 4290000262 | 9798220 |
> | 2009-03-13 14:04:00 | 2777022526 | 6454405 |
> | 2009-03-14 00:08:00 | 1521800860 | 2077144 |
> | 2009-03-14 05:22:00 | 1460542448 | 3737824 |
> +---------------------+------------+----------+
>
> Do you have any ideas what might be going on here?
>
> Cheers, Chris.
> --
> Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
> The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES
>
> Aptivate is a not-for-profit company registered in England and Wales
> with company number 04980791.