Re: [pmacct-discussion] Timestamps in RabbitMQ/JSON output

2014-06-03 Thread Chris Wilson

Hi Paolo,

On Tue, 3 Jun 2014, Paolo Lucente wrote:

What you describe for timestamps seems a good match for NetFlow, ie. 
cast packets into flows and handle these via a flow-aware cache (so 
active/passive expiration timers, max lifetime, etc.). All described is 
already part of the nfprobe plugin. Collecting back such data via 
nfacctd (on the same box where NetFlow is exported or ship it to some 
central location) enables to use timestamp_start, timestamp_end 
aggregation primitives - which should be precisely what you want to 
achieve. The beauty is that you can have all time references possible at 
once: timestamp_start, timestamp_end, stamp_inserted, stamp_updated.


I don't know how much you like or dislike this solution, but I'd encourage 
you to run a proof-of-concept with these tools (which are all available 
already) to see whether we are in line with your requirements, and take 
it from there.


So at the moment I am developing this by running pmacctd (not nfacctd) on 
my own laptop to collect and graph my own traffic. Thanks for the 
suggestion of using timestamp_start and _end which I didn't know you could 
aggregate on.


However when I added these to my aggregate line, I found that the 
timestamp_start is in local time (not GMT) and a human-readable date 
format, which is not great for processing in JavaScript, and timestamp_end 
doesn't appear to work properly:


DEBUG ( default/amqp ): publishing [E=pmacct RK=acct DM=0]: 
{timestamp_start: 2014-06-03 22:42:00.202820, ip_dst: 
196.223.145.xxx, ip_proto: tcp, tos: 0, ip_src: 86.30.131.xxx, 
bytes: 142, port_dst: 36363, packets: 1, port_src: 2201, 
timestamp_end: 1970-01-01 03:00:00.0}


Is this a bug? Would it be easy to fix?
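
For client-side processing, the human-readable local-time stamps shown 
above can be converted to epoch milliseconds. This is only a minimal 
sketch in Python: it assumes the format is always 
YYYY-MM-DD HH:MM:SS.ffffff and that the collector's UTC offset is known. 
The +3 hours used here is an assumption, chosen because the timestamp_end 
above would then be exactly the Unix epoch, i.e. probably an unset value. 
The function name is illustrative and not part of pmacct.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_ms(stamp: str, utc_offset_hours: float) -> int:
    """Convert a local-time stamp string to milliseconds since the Unix epoch."""
    # Parse the naive local time; %f accepts one to six fractional digits.
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    # Attach the assumed fixed offset, then measure the distance from the epoch.
    aware = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return (aware - EPOCH) // timedelta(milliseconds=1)

start_ms = to_epoch_ms("2014-06-03 22:42:00.202820", 3)
end_ms = to_epoch_ms("1970-01-01 03:00:00.0", 3)
# An end stamp of exactly 0 suggests the field was never filled in.
end_is_unset = end_ms == 0
```

The resulting integers can be passed straight to JavaScript's Date 
constructor, which also counts milliseconds since the epoch.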

About sql_refresh_time of less than one second: I've not considered it for 
a simple reason: it seems to me like forcing an existing caching 
mechanism towards a real-time use-case. It would be better to disable the 
cache altogether and stream flows onto the AMQP exchange as they arrive. 
I have this on my todo list - is that what you are looking for?


It might be. Because I'm mainly using pmacctd (not having any 
netflow-capable hardware) I don't know how that would work in pmacctd. 
Would you send every packet? That could be an awful lot of traffic, with 
some flows having a thousand packets per second.


We could process and aggregate it all on the client side, and that has 
uses (such as drilling down into individual packets), but it would be 
great to have the option of aggregating them on the server as well, at a 
resolution chosen by the user.
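
To make the idea of aggregating at a user-chosen resolution concrete, 
here is a rough Python sketch. The per-packet record layout 
(timestamp, flow key, byte count) and the addresses are made up for 
illustration; this is not how pmacct stores or aggregates data 
internally.

```python
from collections import defaultdict

def aggregate(packets, resolution_s):
    """Sum bytes and packets per (flow key, time bucket).

    packets: iterable of (epoch_seconds, flow_key, byte_count) tuples,
             a hypothetical per-packet record layout.
    resolution_s: bucket width in seconds, chosen by the user.
    """
    totals = defaultdict(lambda: {"bytes": 0, "packets": 0})
    for ts, flow_key, nbytes in packets:
        # Round the timestamp down to the start of its bucket.
        bucket = int(ts // resolution_s) * resolution_s
        entry = totals[(flow_key, bucket)]
        entry["bytes"] += nbytes
        entry["packets"] += 1
    return dict(totals)

# Made-up flow key and packets, for illustration only.
flow = ("192.0.2.1", 2201, "198.51.100.1", 36363, "tcp")
packets = [(100.2, flow, 142), (100.9, flow, 60), (101.1, flow, 1500)]
per_second = aggregate(packets, 1)
per_minute = aggregate(packets, 60)
```

The same packet stream yields two records at one-second resolution and 
a single record at one-minute resolution, which is the trade-off 
between detail and message volume.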


It's definitely not something that I need now, but would like you to have 
it on your radar that this might be useful for some people.


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.


___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Newbie

2014-04-05 Thread Chris Wilson

Hi Mike,

On Sat, 5 Apr 2014, Mike Hammett wrote:

The OfficialConfigKeys is very verbose and no doubt holds the key (no 
pun intended) to every possible configuration, but all config examples 
I've found seem drastically simplistic or seemingly incomplete.


Try this one:

daemonize: false
debug: true
pidfile: /var/run/nfacctd.pid
! logfile: /var/log/nfacctd.log
! syslog: daemon
nfacctd_port: 4096
plugins: mysql
aggregate: src_host, src_port, dst_host, dst_port, proto
sql_db: pmacct
sql_table: acct_v8
sql_history: 1m
sql_history_roundoff: m
sql_table_version: 8
sql_host: 127.0.0.1
sql_user: pmacct
sql_passwd: X
sql_refresh_time: 60
sql_dont_try_update: true
sql_optimize_clauses: true
sql_preprocess: minb = 1


From page 47 of:

http://www.ws.afnog.org/afnog2013/tutorials/bmo/afnog-bmo-presentation.odp
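
In pmacct configuration files a line starting with ! is a comment, so 
logfile and syslog are inactive in the example above. As a quick way to 
see which directives are actually in effect, here is a small Python 
sketch. It is a hypothetical helper, not a pmacct tool, and it only 
handles the simple "key: value" lines used in this example.

```python
def active_directives(config_text):
    """Return {key: value} for lines that are not commented out with '!'."""
    directives = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # blank line or comment
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        directives[key.strip()] = value.strip()
    return directives

sample = """\
debug: true
! logfile: /var/log/nfacctd.log
nfacctd_port: 4096
sql_history: 1m
"""
config = active_directives(sample)
```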

Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] HTTP traffic classification

2014-03-24 Thread Chris Wilson

Hi Karl,

On Mon, 24 Mar 2014, Karl O. Pinc wrote:

On 03/24/2014 06:31:30 AM, Stathis Gkotsis wrote:
Concerning HTTP: I guess the thing to output would be hostname, since 
you can have multiple HTTP requests to different URLs inside one TCP 
Session. About DNS, what should be output? I guess the hostname for A 
queries is good enough to start with.


I'm not clear on where DNS would fit into this.  Offhand, DNS lookups
(and then reverse DNS lookups, etc.) should not be part of
pmacct.  There's just too much latency.  People who want that
sort of thing should work out how to do it outside of pmacct.


I'd like to see the *content* of DNS requests and responses available to 
be logged in data records by pmacct. It can be very helpful in identifying 
which website someone was trying to access, when all we have is an IP 
address. I accept that not everybody would want this, but I do.
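
As an illustration of why logged DNS content is useful, the Python 
sketch below labels flows with the names that recently resolved to 
their destination address. The record layouts, names and addresses are 
all hypothetical; pmacct does not produce DNS records in this form.

```python
def build_ip_to_names(dns_answers):
    """Map each answered IP address to the set of names that resolved to it.

    dns_answers: iterable of (query_name, answer_ip) pairs, a hypothetical
                 record layout for logged DNS responses.
    """
    ip_to_names = {}
    for name, ip in dns_answers:
        ip_to_names.setdefault(ip, set()).add(name)
    return ip_to_names

def label_flows(flows, ip_to_names):
    """Attach candidate hostnames to flows, looked up by destination IP."""
    return [
        {**flow, "names": sorted(ip_to_names.get(flow["ip_dst"], ()))}
        for flow in flows
    ]

# Documentation-range addresses and example names, not real data.
answers = [("www.example.org", "192.0.2.10"), ("cdn.example.net", "192.0.2.10")]
flows = [{"ip_dst": "192.0.2.10", "bytes": 142}, {"ip_dst": "192.0.2.99", "bytes": 60}]
labelled = label_flows(flows, build_ip_to_names(answers))
```

Several names can map to one address, so the result is a list of 
candidates rather than a single definitive hostname.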


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] HTTP traffic classification

2014-03-22 Thread Chris Wilson

Hi all,

On Sat, 22 Mar 2014, Viacheslav Dubrovskyi wrote:

22.03.2014 21:20, Stathis Gkotsis wrote:
First, I would like to thank you for the great product, pmacct has 
proven very useful to me, which brings me to my question :) I see that 
it is possible to enable traffic classification, which is about 
detecting L7 protocol. I am particularly interested in HTTP and also 
outputting the hostname or url, e.g. in exports via the print module. 
Is this somehow possible?


IMHO better use special tools https://github.com/jbittel/httpry


I'm also interested in this. Even if it's captured by a separate tool (and 
I'm not sure why it couldn't be integrated with pmacct's L7 classifiers) I 
would really like to be able to log http and https hostnames of 
connections, and correlate them with flows recorded by pmacct and DNS 
requests and responses.


It's not clear that httpry can log the source and destination host and 
port at all, let alone store it in a SQL database (no sample output is 
provided), and presumably it does nothing with https.


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.

Re: [pmacct-discussion] Error: 'sql_multi_values' is too small (100). Try with a larger value

2013-09-03 Thread Chris Wilson

Hi Edward,

On Tue, 3 Sep 2013, Edward van Kuik wrote:

Sep  2 17:59:01 microserver pmacctd[17603]: ERROR ( summary/mysql ): 
'sql_multi_values' is too small (100). Try with a larger value.


I set mine to 1000.


OK, so 1000 might work for you now. But it seems that pmacct can't split 
the inserts into multiple batches, otherwise a smaller batch size would 
work too. So one day you might have more than 1000 flows to insert at a 
time, and you'd get this error and lose data. In fact are you sure you 
haven't lost any data already?


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.

Re: [pmacct-discussion] Error: 'sql_multi_values' is too small (100). Try with a larger value

2013-09-03 Thread Chris Wilson

On Tue, 3 Sep 2013, Edward van Kuik wrote:

No, it should definitely batch the data into inserts of 1000 values 
each.


Then why would it give me this error message? The error doesn't make sense 
if pmacct does break inserts into smaller batches.


Sep  2 17:59:01 microserver pmacctd[17603]: ERROR ( 
summary/mysql ): 'sql_multi_values' is too small (100). Try with a 
larger value.


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.

Re: [pmacct-discussion] Error: 'sql_multi_values' is too small (100). Try with a larger value

2013-09-03 Thread Chris Wilson

Hi Paolo,

On Tue, 3 Sep 2013, Paolo Lucente wrote:

Maybe a bug in documentation in the release you are using? CONFIG-KEYS 
says: The value of the directive is intended to be the size (in bytes) 
of the multi-values buffer.. So 100 bytes is on the low side, and by 
default MySQL comes with a 1MB buffer - after that you should tweak 
MySQL config first, then set the sql_multi_values value accordingly. I 
can confirm statements are batched in several buffers if one can't fit 
them all.


Thanks, I understand now. I had completely missed that it was in bytes 
instead of rows.


There does seem to be a minor bug in that pmacct appears to fall over if 
the value is too small. I'm sure it could log a warning and write larger 
but valid INSERT statements, with at least one VALUES row per statement.
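
The behaviour suggested above (a byte budget per statement, but never 
fewer than one VALUES row) can be sketched in Python as follows. This 
is an illustration only: it is not pmacct's actual implementation, and 
the table layout and function name are invented.

```python
def batch_inserts(prefix, rows, max_bytes):
    """Pack VALUES rows into INSERT statements of at most max_bytes each.

    A statement always carries at least one row, even if that single row
    exceeds the budget, so no data is dropped.
    """
    statements = []
    current = []
    size = len(prefix.encode())
    for row in rows:
        # +2 accounts for the ", " separator before every row but the first.
        extra = len(row.encode()) + (2 if current else 0)
        if current and size + extra > max_bytes:
            # Budget exceeded: emit what we have and start a new statement.
            statements.append(prefix + ", ".join(current))
            current = []
            size = len(prefix.encode())
            extra = len(row.encode())
        current.append(row)
        size += extra
    if current:
        statements.append(prefix + ", ".join(current))
    return statements

prefix = "INSERT INTO acct (ip_src, bytes) VALUES "  # hypothetical table layout
rows = ["('10.0.0.1', 142)", "('10.0.0.2', 60)", "('10.0.0.3', 1500)"]
tiny = batch_inserts(prefix, rows, 10)      # budget smaller than any single row
medium = batch_inserts(prefix, rows, 80)    # room for two rows per statement
large = batch_inserts(prefix, rows, 1024)   # everything fits in one statement
```

With a budget that is too small the sketch degrades to one row per 
statement instead of failing, which is the fallback I had in mind.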


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




[pmacct-discussion] Error: 'sql_multi_values' is too small (100). Try with a larger value

2013-09-02 Thread Chris Wilson

Hi all,

I tried to enable the sql_multi_values option, setting it to a 
reasonable number of rows to insert at once (100) to avoid hitting the 
MySQL packet size limit. But I get these errors in the logs:


Sep  2 17:59:01 microserver pmacctd[17603]: ERROR ( summary/mysql ): 
'sql_multi_values' is too small (100). Try with a larger value.


Sep  2 16:57:46 microserver pmacctd[17608]: ERROR ( inbound/mysql ): You 
have an error in your SQL syntax; check the manual that corresponds to 
your MySQL server version for the right syntax to use near 'VALUES 
(FROM_UNIXTIME(1378141141), FROM_UNIXTIME(1378141080), '00:1b:21:92:98:17' 
at line 1


This looks like a bug to me? Surely it should be reasonable to insert 
up to 100 rows at a time (per SQL statement) instead of just 1?


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Build fails to find libmysqlclient on 64-bit CentOS

2013-06-26 Thread Chris Wilson

Hi Paolo,

On Tue, 25 Jun 2013, Paolo Lucente wrote:


Sure, thanks for the tip: makes sense, will do.


Also please find attached an RPM spec file to help build rpms for pmacct.

It would be great if you could include this in the tarball.

Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.
%define with_pgsql   0
%define with_sqlite  0

Summary: Promiscuous mode IP Accounting package
Name: pmacct
Version: 0.14.3
Release: 1.cw.130625
License: GPL
Group: Monitoring
Source: http://www.pmacct.net/%{name}-%{version}.tar.gz
Source1: nfacctd.init
Source2: pmacctd.init
Source3: sfacctd.init
Source4: sfacctd.conf
#Patch1: pmacct-fix_realloc.patch
URL: http://www.pmacct.net/
BuildRoot: %{_tmppath}/%{name}-root
BuildRequires: mysql-devel gcc
%if %{with_pgsql}
BuildRequires: postgresql-devel
%endif
%if %{with_sqlite}
BuildRequires: sqlite-devel >= 3.0.0
%endif
BuildRequires: libpcap-devel

%description
pmacct is a small set of passive network monitoring tools to measure, account,
classify and aggregate IPv4 and IPv6 traffic; a pluggable and flexible
architecture allows to store the collected traffic data into memory tables or
SQL (MySQL, SQLite, PostgreSQL) databases. pmacct supports fully customizable
historical data breakdown, flow sampling, filtering and tagging, recovery
actions, and triggers. Libpcap, sFlow v2/v4/v5 and NetFlow v1/v5/v7/v8/v9 are
supported, both unicast and multicast. Also, a client program makes it easy to
export data to tools like RRDtool, GNUPlot, Net-SNMP, MRTG, and Cacti.

%prep
%setup -q
#%patch1
chmod a+rx docs examples sql
find docs examples sql -type f -print0 | xargs -r0 chmod -x

%build
if [ -r /usr/lib64/mysql/libmysqlclient.so ]; then
MYSQL_LIBS='--with-mysql-libs=/usr/lib64/mysql'
fi
%configure \
--sysconfdir=%{_sysconfdir}/%{name} \
--enable-threads \
--enable-64bit \
--enable-mysql \
$MYSQL_LIBS \
%if %{with_pgsql}
--enable-pgsql \
--with-pgsql-includes=/usr/include/pgsql/ \
%endif
%if %{with_sqlite}
--enable-sqlite3 \
%endif
--enable-ulog \
--enable-ipv6 \
--enable-v4-mapped


%__make %{?jobs:-j%{jobs}}

%install
%makeinstall

%{__install} -Dp %{SOURCE1} %{buildroot}/%{_sysconfdir}/init.d/nfacctd
%{__install} -Dp %{SOURCE2} %{buildroot}/%{_sysconfdir}/init.d/pmacctd
%{__install} -Dp %{SOURCE3} %{buildroot}/%{_sysconfdir}/init.d/sfacctd
ln -sf ../../etc/init.d/nfacctd $RPM_BUILD_ROOT/usr/sbin/rcnfacctd
ln -sf ../../etc/init.d/pmacctd $RPM_BUILD_ROOT/usr/sbin/rcpmacctd
ln -sf ../../etc/init.d/sfacctd $RPM_BUILD_ROOT/usr/sbin/rcsfacctd

%{__install} -Dp examples/nfacctd-sql_v2.conf.example %{buildroot}/%{_sysconfdir}/pmacct/nfacctd.conf
%{__install} -Dp examples/pmacctd-sql_v2.conf.example %{buildroot}/%{_sysconfdir}/pmacct/pmacctd.conf
%{__install} -Dp %{SOURCE4} %{buildroot}/%{_sysconfdir}/pmacct/sfacctd.conf

rm -f $RPM_BUILD_ROOT/usr/sbin/rc*acctd

%clean
%{__rm} -rf %{buildroot}

%files
%defattr(-, root, root)
%doc AUTHORS ChangeLog CONFIG-KEYS COPYING FAQS INSTALL KNOWN-BUGS NEWS QUICKSTART README TODO TOOLS UPGRADE
%doc docs examples sql
%attr(755,root,root) %{_bindir}/*
%attr(755,root,root) %{_sbindir}/*
%{_sysconfdir}/init.d/*
%dir /etc/pmacct
%attr(600,root,root) %config(noreplace) %{_sysconfdir}/pmacct/nfacctd.conf
%attr(600,root,root) %config(noreplace) %{_sysconfdir}/pmacct/pmacctd.conf
%attr(600,root,root) %config(noreplace) %{_sysconfdir}/pmacct/sfacctd.conf


%changelog
* Thu Mar 24 2011 zamir za...@mandriva.org 0.12.5-0mdv2011.0
+ Revision: 648360
- first build
- create pmacct


[pmacct-discussion] Build fails to find libmysqlclient on 64-bit CentOS

2013-06-25 Thread Chris Wilson

Hi Paolo,

Configure fails to find /usr/lib64/mysql/libmysqlclient.so on 64-bit 
CentOS. You might want to add that to the list of search directories in 
configure.in?


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Citylife House, Sturton Street, Cambridge, CB1 2QF, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Duplicate entry for key 1 (primary key violations)

2012-07-18 Thread Chris Wilson

Hi Paolo,

On Fri, 29 Jun 2012, Paolo Lucente wrote:

On Tue, Jun 26, 2012 at 10:13:30AM +0100, Chris Wilson wrote:

OK, testing now. Would it be possible for pmacctd to log a warning if 
it exceeds any of these thresholds, to help with tuning without wasting 
memory?


In a way you reckon things go wrong from the process list: the MySQL 
plugin writer process mentions the wording 'emergency' if the write was 
due to an unscheduled event. Then you know the value of the cache 
entries is too low. It's a good idea (and easy to implement) what you 
propose: when an emergency writer is triggered - then write the event to 
the logfile as well. Adding to my todo list.


Thanks for doing that :) I'm testing the latest CVS now.

Is it possible that it either failed to remove some records from the 
cache, or calculated the timestamp of the database records incorrectly?


Well, the former case would be a bug; the latter is not really possible
unless somebody is playing with date on the system: pmacctd and uacctd
just use timestamps fed by the underlying library. Is it possible the
DB is underperforming and a commit from the previous hour is taking long
to finish? Do you see a 1:1 relationship between the MySQL plugins and
the writers when you have a look at the process list?


I don't think we have writers taking an hour to write. The system isn't 
that heavily loaded.


I did notice that restarting the daemon generates these duplicate key 
errors. Restarting isn't completely compatible with an always insert 
configuration and unique primary keys.
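
To show what I mean, here is a small Python sketch using an in-memory 
SQLite table as a stand-in for MySQL. The table is a much-reduced, 
hypothetical version of the pmacct schema. It assumes (and this is only 
my guess at the mechanism) that after a restart the daemon writes a 
second record for a flow and time bin that was already written before 
the restart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, much-reduced table: the primary key is the flow plus its time bin.
conn.execute(
    "CREATE TABLE acct (ip_src TEXT, ip_dst TEXT, stamp_inserted TEXT, "
    "bytes INTEGER, PRIMARY KEY (ip_src, ip_dst, stamp_inserted))"
)

def write_insert_only(row):
    """Always INSERT; fails if the same flow and time bin were written before."""
    conn.execute("INSERT INTO acct VALUES (?, ?, ?, ?)", row)

def write_update_then_insert(row):
    """Try UPDATE first and fall back to INSERT when no row matched."""
    ip_src, ip_dst, stamp, nbytes = row
    cur = conn.execute(
        "UPDATE acct SET bytes = bytes + ? "
        "WHERE ip_src = ? AND ip_dst = ? AND stamp_inserted = ?",
        (nbytes, ip_src, ip_dst, stamp),
    )
    if cur.rowcount == 0:
        write_insert_only(row)

row = ("10.0.156.34", "10.9.0.6", "2012-06-26 02:00:00", 500)
write_insert_only(row)          # first write, before the restart
try:
    write_insert_only(row)      # same time bin written again after the restart
    duplicate_seen = False
except sqlite3.IntegrityError:
    duplicate_seen = True
write_update_then_insert(row)   # accumulates instead of failing
total = conn.execute("SELECT bytes FROM acct").fetchone()[0]
```

The insert-only path corresponds to running with sql_dont_try_update: 
true; the update-then-insert path avoids the error at the cost of an 
extra query per record.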


I haven't reproduced the original problem yet.

On an unrelated note, how hard would it be to get the log message from 
ULOG stored in the database, for example in the classification field? I 
had a look through the code but I couldn't see any way to store this 
field from the received packet into the in-memory structure used to track 
flows.


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Duplicate entry for key 1 (primary key violations)

2012-07-18 Thread Chris Wilson

Hi Paolo,

On Wed, 18 Jul 2012, Paolo Lucente wrote:

On an unrelated note, how hard would it be to get the log message from 
ULOG stored in the database, for example in the classification field? I 
had a look through the code but I couldn't see any way to store this 
field from the received packet into the in-memory structure used to 
track flows.


For clarity: which log messages are you referring to? The original 
packet (portion) itself with (or without) ancillary netfilter 
structures? If yes - then that is not currently possible.


The log message is an option of the ULOG target in iptables. We use it to 
help us debug our QoS traffic classification by showing which packets have 
which classification:


iptables -t mangle -A POSTROUTING $@ -j CLASSIFY --set-class $class
iptables -t mangle -A POSTROUTING $@ -j ULOG --ulog-prefix $class
iptables -t mangle -A POSTROUTING $@ -j RETURN

This results in a class string such as 1:123 being included in the 
output of the ulogd user-space application which receives the logs:


  Jul 18 15:50:44 fen-fw2 1:123 IN= OUT=ppp0 MAC= SRC=10.0.156.131
DST=176.58.108.189 LEN=52 TOS=00 PREC=0x00 TTL=63 ID=54141 CE DF ...

This seems to come from ulog_packet_msg_t.prefix according to the ulogd 2 
sources.


It's always possible to embed some data in some fields, but the 
showstopper I see is that an entry in the database does not have a 1:1 
relationship with a single packet (portion): these would have to be 
concatenated or similar (which I anticipate is some work). What is the 
case study?


In our case, the classification could change mid-stream, as it depends on 
TOS flags and UDP packet sizes. I wonder whether it's possible to include 
the classification in the flow key in such cases, so we can separate out 
high and low priority traffic in the same stream and see how much traffic 
is being wrongly classified?
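
As a sketch of what including the class in the flow key would give us, 
the Python below parses ulogd lines like the one above and sums packet 
lengths per (source, destination, class). The regular expression is 
written against that single sample line and is an assumption about the 
general format; the second sample line is made up.

```python
import re
from collections import Counter

# Matches lines shaped like the sample:
# "<date> <host> <prefix> IN=... SRC=... DST=... LEN=..."
LINE_RE = re.compile(
    r"^\w{3}\s+\d+ [\d:]+ \S+ (?P<cls>\S+) IN=.*?"
    r"SRC=(?P<src>\S+)\s+DST=(?P<dst>\S+)\s+LEN=(?P<len>\d+)"
)

def bytes_per_class(lines):
    """Sum packet lengths keyed by (src, dst, class prefix)."""
    totals = Counter()
    for line in lines:
        m = LINE_RE.search(line.strip())
        if m:
            totals[(m["src"], m["dst"], m["cls"])] += int(m["len"])
    return totals

sample = [
    "Jul 18 15:50:44 fen-fw2 1:123 IN= OUT=ppp0 MAC= SRC=10.0.156.131 "
    "DST=176.58.108.189 LEN=52 TOS=00 PREC=0x00 TTL=63 ID=54141 CE DF",
    # Second line is made up: same flow, different class.
    "Jul 18 15:50:45 fen-fw2 1:124 IN= OUT=ppp0 MAC= SRC=10.0.156.131 "
    "DST=176.58.108.189 LEN=1400 TOS=00 PREC=0x00 TTL=63 ID=54142 DF",
]
totals = bytes_per_class(sample)
```

With the class as part of the key, one stream produces one total per 
class, so traffic that changes class mid-stream shows up as separate 
records.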


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Duplicate entry for key 1 (primary key violations)

2012-06-26 Thread Chris Wilson

Hi Paolo,

On Wed, 20 Jun 2012, Paolo Lucente wrote:

I'm thinking of the possibility that, given the aggregation method, the 
SQL cache configured by default is not sufficient to keep all the 
aggregates over the time period - although the time period is very 
short. Can you, as a test, add the following line to your config 
and see if it makes any difference?


sql_cache_entries: 91

About plugin_buffer_size and plugin_pipe_size, CONFIG-KEYS gives
some guidelines. I suggest to start from those and take it from
there (no error message is OK; still error message then move up
by one order of magnitude; etc.). So try starting from:

plugin_pipe_size: 1024
plugin_buffer_size: 10240


OK, testing now. Would it be possible for pmacctd to log a warning if it 
exceeds any of these thresholds, to help with tuning without wasting 
memory?


I'm still getting some duplicate values, although fewer, and I noticed 
something interesting:


Jun 26 04:00:01 fen-fw2 pmacctd[11470]: ERROR ( long/mysql ): Duplicate 
entry '10.0.156.34-10.9.0.6-443-34555-tcp-2012-06-26 02:00:00' for key 1


Jun 26 04:00:01 fen-fw2 pmacctd[11470]: ERROR ( long/mysql ): Duplicate 
entry '109.74.198.131-10.0.156.210-56505-8140-tcp-2012-06-26 02:00:00' for 
key 1


Jun 26 04:00:01 fen-fw2 pmacctd[11470]: ERROR ( long/mysql ): Duplicate 
entry '10.0.156.210-109.74.198.131-8140-56505-tcp-2012-06-26 02:00:00' for 
key 1


Jun 26 04:00:01 fen-fw2 pmacctd[11470]: ERROR ( long/mysql ): Duplicate 
entry '178.79.174.118-10.0.156.210-58250-8140-tcp-2012-06-26 02:00:00' for 
key 1


These log entries were created at 4am, and the long configuration 
aggregates over one hour, so at 4am it should have been writing database 
records for 3am-4am, with a timestamp of 3am. But the timestamp was 2am. 
Is it possible that it either failed to remove some records from the 
cache, or calculated the timestamp of the database records incorrectly?
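
To spell out the arithmetic I'm assuming: with a one-hour history 
period, each record's timestamp should be the packet time rounded down 
to the hour. A minimal Python sketch of that rounding follows; the 
function name is mine and this is not taken from the pmacct sources.

```python
from datetime import datetime, timedelta

def history_bin(ts: datetime, period: timedelta) -> datetime:
    """Round ts down to the start of its history period (e.g. 1m or 1h).

    Assumes the period divides a day evenly.
    """
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = ts - midnight
    return midnight + (elapsed // period) * period

hour = timedelta(hours=1)
# Traffic seen between 03:00 and 04:00 is expected to land in the 03:00 bin...
expected = history_bin(datetime(2012, 6, 26, 3, 59, 59), hour)
# ...whereas a 02:00 bin implies traffic stamped between 02:00 and 03:00.
observed = history_bin(datetime(2012, 6, 26, 2, 30, 0), hour)
```

So a record written at 04:00 with a 02:00 timestamp would have to come 
from packets stamped at least an hour before the period being flushed.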


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




[pmacct-discussion] Duplicate entry for key 1 (primary key violations)

2012-06-12 Thread Chris Wilson

Hi all,

We get many of these errors in our system logs:

Jun 12 10:01:01 fen-fw2 pmacctd[2153]: ERROR ( short/mysql ): Duplicate 
entry '72.232.223.58-82.68.244.70-80-46802-tcp-2012-06-12 09:56:00' for 
key 1


They usually happen in batches. E.g. we had a few hundred at 07:27, then 
another few hundred at 10:01, and a few dozen at 10:17.


In our configuration these duplicate inserts should never happen. We 
should get one INSERT per flow per minute, and the different minutes 
should result in different values of the primary key.


plugins: mysql[short], mysql[long]
aggregate[short]: src_host, src_port, dst_host, dst_port, proto
sql_db: pmacct
sql_table[short]: acct_v6
sql_history[short]: 1m
sql_history_roundoff[short]: m
sql_refresh_time[short]: 60
sql_dont_try_update: true
sql_optimize_clauses: true

It's like pmacct is not correctly finding an existing flow when 
aggregating, and creating a new one that duplicates the existing one. Is 
there some way to test that? Would the memory plugin do it?


Can anyone explain why this is happening or what I'm doing wrong?

Also, we get a lot of these:

Jun 10 05:13:01 fen-fw2 pmacctd[4070]: ERROR ( short/mysql ): We are 
missing data.
Jun 10 05:13:01 fen-fw2 pmacctd[4070]: If you see this message once in a 
while, discard it. Otherwise some solutions follow:
Jun 10 05:13:01 fen-fw2 pmacctd[4070]: - increase shared memory size, 
'plugin_pipe_size'; now: '3096576'.
Jun 10 05:13:01 fen-fw2 pmacctd[4070]: - increase buffer size, 
'plugin_buffer_size'; now: '192'.
Jun 10 05:13:01 fen-fw2 pmacctd[4070]: - increase system maximum socket 
size.


How would I know which parameter to increase? Could the writer tell us 
exactly which limit it hit?


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] nfacctd

2012-04-01 Thread Chris Wilson
Hi Johan, your nfacctd is compiled without mysql support, so it's not logging 
to the database, only the memory plugin. Please fix that and try again. Cheers, 
Chris.


-Original Message-
From: johan lotter jlct...@gmail.com
Sender: Chris Wilson ch...@aptivate.org
Date: Sun, 1 Apr 2012 16:33:50 
To: pmacct-discussion@pmacct.net; ch...@aptivate.org; pa...@pmacct.net
Cc: pmgraph-t...@aptivate.org
Subject: nfacctd

Hi Chris

1) Clean install of pmgraphs on Debian Squeeze using the Debian
package instructions at: http://www.aptivate.org/pmgraph-nstallation-2
2) Disabled Iptables Firewall Rules using these instructions:
http://www.cyberciti.biz/faq/turn-on-turn-off-firewall-in-linux/
3) Created nfacctd.conf file using known working nfacct/pmgraph
configuration, page 52: http://www.ws.afnog.org/afnog2010/bw-mgmt/

daemonize: false
debug: true
pidfile: /var/run/nfacctd.pid
logfile: /var/log/nfacctd.log
! syslog: daemon
nfacctd_port: 5678
plugins: mysql
aggregate: src_host, src_port, dst_host, dst_port, proto
sql_db: pmacct
sql_table: acct_v6
sql_history: 1m
sql_history_roundoff: m
sql_table_version: 6
sql_host: 127.0.0.1
sql_user: pmacct
sql_passwd: secret
sql_refresh_time: 60
sql_dont_try_update: true
sql_optimize_clauses: true
! sql_preprocess: minb = 1000

4) Changed the subnet in pmgraphs to that of my own (192.168.88.)
5) Configured Net-Flow (v5) on my (Mikrotik) Router to send flows to
PC running pmacct/pmgraphs: 192.168.88.150
6) Executed with: nfacctd -f nfacctd.conf

And get the following error:

root@debhome:/etc/pmacct# nfacctd -f nfacctd.conf
ERROR ( nfacctd.conf ): Unknown plugin type: mysql. Ignoring.
WARN ( nfacctd.conf ): No plugin has been activated; defaulting to
in-memory table.

gedit /var/log/nfacctd.log

(edited down quite a bit)

Apr 01 16:10:38 INFO ( default/memory ): 124928 bytes are available to
address shared memory segment; buffer size is 176 bytes.
Apr 01 16:10:38 INFO ( default/memory ): Trying to allocate a shared
memory segment of 2748416 bytes.
Apr 01 16:10:38 INFO ( default/core ): waiting for NetFlow data on 0.0.0.0:5678
Apr 01 16:10:38 DEBUG ( default/memory ): allocating a new memory segment.
Apr 01 16:10:38 DEBUG ( default/memory ): allocating a new memory segment.
Apr 01 16:10:38 OK ( default/memory ): waiting for data on: '/tmp/collect.pipe'
Apr 01 16:10:39 DEBUG ( default/memory ): Selecting bucket 4612.
Apr 01 16:10:39 DEBUG ( default/memory ): Selecting bucket 9644.
Apr 01 16:10:41 DEBUG ( default/memory ): Selecting bucket 2391.
Apr 01 16:10:41 DEBUG ( default/memory ): Selecting bucket 2391.
Apr 01 16:11:23 DEBUG ( default/memory ): Selecting bucket 31124.
Apr 01 16:11:24 INFO: Discarding unknown packet: nfacctd=0.0.0.0:5678
agent=192.168.88.1:5678
Apr 01 16:12:19 DEBUG ( default/memory ): Selecting bucket 20471.
Apr 01 16:12:24 INFO: Discarding unknown packet: nfacctd=0.0.0.0:5678
agent=192.168.88.1:5678
Apr 01 16:12:33 DEBUG ( default/memory ): Selecting bucket 2325.
Apr 01 16:12:33 DEBUG ( default/memory ): Selecting bucket 1887.

There is nothing in /var/log/daemon.log pertaining to nfacctd (even
though I have tried running with daemonize: true).

Any help very welcome (as always), thanks.


 -- Forwarded message --
 From: Chris Wilson ch...@aptivate.org
 To: pmacct-discussion@pmacct.net
 Cc: pmgraph-t...@aptivate.org
 Date: Thu, 2 Feb 2012 12:37:42 + (GMT)
 Subject: Re: [pmacct-discussion] pmacct-discussion Digest, Vol 83, Issue 1
 Hi Johan,

 On Thu, 2 Feb 2012, johan lotter wrote:

 Yet when I configure and run with mysql plugin I get no data...


 Does that mean that you get nothing in the database, or nothing graphed? I 
 notice that you mentioned pmgraph later, which is a different project (that 
 uses pmacct).

 If you get nothing in the database, please check your /var/log/syslog and 
 /var/log/daemon files for messages from pmacct.

 Created a file called nfacctd.conf
 placed it in the same directory as pmacct.conf
 edited as follows:
 !
 daemonize: true
 plugins: mysql
 aggregate: sum_host

 pmgraph will not work if you aggregate on sum_host. It requires the src_host, 
 dst_host, src_port and dst_port fields at least. It may also get confused by 
 a recent change to pmacct (which I requested) to change the names of the 
 src_port and dst_port fields, as the pmgraph package may not have been 
 updated to account for that change.

 You may find this presentation useful for a known working nfacct/pmgraph 
 configuration, especially page 52:

  http://www.ws.afnog.org/afnog2010/bw-mgmt/

 executed with nfacctd -f nfacctd.conf
 enabled Netflow (Traffic-Flow on my router) and told it to send
 traffic to IP address of listening NIC on port 5678

 Yet pmgraph is not graphing anything

 No firewall blocking inbound UDP traffic to port 5678?





Re: [pmacct-discussion] pmacct-discussion Digest, Vol 83, Issue 1

2012-02-02 Thread Chris Wilson

Hi Johan,

On Thu, 2 Feb 2012, johan lotter wrote:


Yet when I configure and run with mysql plugin I get no data...


Does that mean that you get nothing in the database, or nothing graphed? I 
notice that you mentioned pmgraph later, which is a different project 
(that uses pmacct).


If you get nothing in the database, please check your /var/log/syslog and 
/var/log/daemon files for messages from pmacct.



Created a file called nfacctd.conf
placed it in the same directory as pmacct.conf
edited as follows:
!
daemonize: true
plugins: mysql
aggregate: sum_host


pmgraph will not work if you aggregate on sum_host. It requires the 
src_host, dst_host, src_port and dst_port fields at least. It may also get 
confused by a recent change to pmacct (which I requested) to change the 
names of the src_port and dst_port fields, as the pmgraph package may not 
have been updated to account for that change.


You may find this presentation useful for a known working nfacct/pmgraph 
configuration, especially page 52:


  http://www.ws.afnog.org/afnog2010/bw-mgmt/


executed with nfacctd -f nfacctd.conf
enabled Netflow (Traffic-Flow on my router) and told it to send
traffic to IP address of listening NIC on port 5678

Yet pmgraph is not graphing anything


No firewall blocking inbound UDP traffic to port 5678?
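
One quick way to check is to stop nfacctd and listen on the port 
yourself. The Python sketch below waits for a single UDP datagram and 
reports the NetFlow version, which is carried in the first two bytes of 
the packet in network byte order. The function names are mine, and the 
port and timeout are just the values from this thread.

```python
import socket
import struct

def netflow_version(datagram: bytes) -> int:
    """Return the version field: the first two bytes, big-endian."""
    if len(datagram) < 2:
        raise ValueError("datagram too short to be NetFlow")
    return struct.unpack("!H", datagram[:2])[0]

def wait_for_datagram(port: int, timeout_s: float = 10.0):
    """Listen on a UDP port; return (sender, version) or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        sock.settimeout(timeout_s)
        try:
            data, sender = sock.recvfrom(65535)
        except socket.timeout:
            return None  # nothing arrived: suspect the firewall or the exporter
        return sender, netflow_version(data)
```

Calling wait_for_datagram(5678) while the router is exporting should 
return the router's address and the version it sends; None suggests the 
packets are not reaching the box at all.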

Please also trim your posts to remove irrelevant information, especially 
when replying to a digest that contains many emails completely unrelated 
to the one that you're replying to.


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967838
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Infinite loop in sql_cache_insert

2011-12-02 Thread Chris Wilson

Hi Paolo,

On Mon, 28 Nov 2011, Paolo Lucente wrote:


Would be great if: 1) you can upgrade to something more recent than that,
ie. issue could be related to timestamps and fix might well be in some
other parts of the code (pkt_handlers.c pops to mind)


I will probably do this soon as I'm intending to do more work on pmacct 
development. However it would be great if Ubuntu would pick up more recent 
versions of pmacct in their newer releases. I'm running the latest 
release, Oneiric. Are you in touch with the package maintainer?


I'd particularly like to add some more identifying information to the list 
of aggregation primitives, to help connect pmacct traffic logs with Squid 
logs, to associate website names to them. However I was completely 
confused about where to start on my first attempt to achieve this (adding 
new primitives). I was wondering whether it would be easier to write 
a classifier that would inspect the first packet of the stream and stuff 
the TCP ISN into the classification field? Does that seem like a 
reasonable approach?



and/or 2) manage to reproduce the issue.


I'm afraid this is probably impossible. I rarely run packet logging on my 
laptop and I wasn't at that time. It has happened a few times, but rarely.


Apart from the above, agree 100% with your thoughts about cleaning up a 
bit; i have that on my todo list (along with other related things, ie. 
creating a sql_cache_free_entries() routine).


Excellent :) Simpler and more flexible code would make it much easier to 
work on and extend pmacct.


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Broken aggregate Filter

2011-06-09 Thread Chris Wilson

Hi Bernd,

On Thu, 9 Jun 2011, Bernd Bornkessel wrote:


It works if I use:

vlan and ((vlan 365 or vlan 1337) and (dst net 192.76.141.0/24 or dst 
net 194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 
or dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 
88.215.192.0/19))


Well, but what if I also want to filter by VLAN. The following filters do not 
work :\

[...]
vlan and ((vlan 365 or vlan 1337) and (dst net 192.76.141.0/24 or dst 
net 194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 
or dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 
88.215.192.0/19))


These filters look identical to me. How come it both works and doesn't 
work?


Cheers, Chris.



Re: [pmacct-discussion] Broken aggregate Filter

2011-06-09 Thread Chris Wilson

Hi Bernd,

On Thu, 9 Jun 2011, Bernd Bornkessel wrote:


The working filter is:

vlan and (dst net 192.76.141.0/24 or dst net 194.55.246.0/23 or dst net 
195.246.160/19 or dst net 88.215.224.0/19 or dst net 62.93.212.0/23 or 
dst net 62.93.246.0/23 or dst net 88.215.192.0/19)


The non-working are:

vlan and ((vlan 365 or vlan 1337) and (dst net 192.76.141.0/24 or dst 
net 194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 
or dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 
88.215.192.0/19))


((vlan 365 or vlan 1337) and (dst net 192.76.141.0/24 or dst net 
194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 or 
dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 
88.215.192.0/19))


I think you may be falling victim to this (from man pcap-filter(7)):

   vlan [vlan_id]

  True if the packet is an IEEE 802.1Q VLAN packet.  If 
[vlan_id] is specified, only true if the packet has the specified vlan_id. 
Note that the first vlan keyword encountered in expression changes the 
decoding offsets for the remainder of expression on the assumption that 
the packet is a VLAN packet.  The vlan [vlan_id] expression may be used 
more than once, to filter on VLAN hierarchies.  Each use of that 
expression increments the filter offsets by 4.


Therefore I don't think you can use the vlan keyword more than once in 
the same expression (unless you have vlan hierarchies). This appears to be 
a limitation (and a rather unusual one) of libpcap, not pmacct.
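
To illustrate the offset shift that the man page describes, here is a minimal Python sketch. It is not libpcap code, and `l3_offset` and `build_frame` are hypothetical helpers invented for this example. It walks an Ethernet header and shows that every 802.1Q tag pushes the network-layer header a further 4 bytes into the frame, which is the shift that each use of the vlan keyword assumes:

```python
import struct

ETHERTYPE_VLAN = 0x8100  # IEEE 802.1Q tag protocol identifier
ETHERTYPE_IPV4 = 0x0800


def l3_offset(frame):
    """Return (offset of the network-layer header, list of VLAN IDs seen)."""
    offset = 12  # skip destination and source MAC addresses
    vlan_ids = []
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    while ethertype == ETHERTYPE_VLAN:
        (tci,) = struct.unpack_from("!H", frame, offset + 2)
        vlan_ids.append(tci & 0x0FFF)  # low 12 bits of the TCI hold the VLAN ID
        offset += 4                    # each tag shifts later headers by 4 bytes
        (ethertype,) = struct.unpack_from("!H", frame, offset)
    return offset + 2, vlan_ids


def build_frame(vlan_ids):
    """Build a dummy frame: zeroed MACs, optional VLAN tags, empty IPv4 header."""
    frame = bytes(12)
    for vid in vlan_ids:
        frame += struct.pack("!HH", ETHERTYPE_VLAN, vid)
    return frame + struct.pack("!H", ETHERTYPE_IPV4) + bytes(20)
```

With no tag the IP header starts at byte 14, with one tag at byte 18, and with two nested tags at byte 22. A filter that mentions vlan twice therefore looks for the IP fields 4 bytes too far into a singly tagged packet.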


If they really want to support nested vlans (and I would seriously 
question the sanity of anyone who used them) I would respectfully suggest 
that they modify the vlan keyword to not change the filter offset, and 
create a new keyword like nested-vlan which does.


Cheers, Chris.



Re: [pmacct-discussion] Aggregate not working?

2010-11-11 Thread Chris Wilson

Hi Lockywolf,

On Thu, 11 Nov 2010, Lockywolf __ wrote:


aggregate[in]: dst_host
aggregate[out]: src_host
aggregate_filter[in]: dst net 192.168.88.0/16
aggregate_filter[out]: src net 192.168.88.0/16
plugins: mysql[in], mysql[out]

Still, in MySQL i have (a lot of) lines like the following:

| 0:0:0:0:0:0 | 0:0:0:0:0:0 | 0.0.0.0 | 109.107.91.158  |
0 |0 | ip   |   1 | 309 | 2010-11-10 16:50:00 |
2010-11-10 16:59:02 |
| 0:0:0:0:0:0 | 0:0:0:0:0:0 | 0.0.0.0 | 71.228.40.130   |
0 |0 | ip   |   1 | 305 | 2010-11-10 16:50:00 |
2010-11-10 16:59:02 |
| 0:0:0:0:0:0 | 0:0:0:0:0:0 | 0.0.0.0 | 94.24.134.127   |
0 |0 | ip   |   1 | 305 | 2010-11-10 16:50:00 |
2010-11-10 16:59:02 |
| 0:0:0:0:0:0 | 0:0:0:0:0:0 | 0.0.0.0 | 188.112.79.97   |
0 |0 | ip   |   1 | 305 | 2010-11-10 16:50:00 |
2010-11-10 16:59:02 |

No MACs ? i guess it's OK with netflow.


If you don't aggregate on src_mac and dst_mac, you won't get any MACs...


Btw, anybody can tell me, why do i have so many connections to 0.0.0.0?


That's what aggregate does. It zeroes all the fields that you don't 
aggregate on (including the other side's IP address in this case).
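
As a rough illustration of that zeroing, here is a small Python sketch. It is not pmacct code: the field list, the `ZERO` placeholders and the `aggregate` helper are assumptions made for this example. Fields that are not in the aggregation key are replaced by a zero value, and rows that become identical are summed:

```python
from collections import defaultdict

ALL_FIELDS = ("src_host", "dst_host", "src_port", "dst_port")
ZERO = {"src_host": "0.0.0.0", "dst_host": "0.0.0.0", "src_port": 0, "dst_port": 0}


def aggregate(flows, keys):
    """Sum bytes per row, zeroing every field that is not an aggregation key."""
    totals = defaultdict(int)
    for flow in flows:
        row = tuple(flow[f] if f in keys else ZERO[f] for f in ALL_FIELDS)
        totals[row] += flow["bytes"]
    return dict(totals)
```

Aggregating on dst_host alone turns every source address into 0.0.0.0, so the rows do not represent connections to or from 0.0.0.0; that column simply carries no information.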



it's a router, has no brains.


It doesn't even exist, it's not a router.

But why does it log ips which have neither src_ip nor dst_ip in 
192.168.88.0/16 ?


That's a good question, I don't know. Might you have more than one 
nfacctd/pmacctd running? Or might you have changed the config without 
restarting it?


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Source port column name depends on database

2010-10-06 Thread Chris Wilson

Hi Paolo,

On Wed, 6 Oct 2010, Paolo Lucente wrote:

Just to say this work (as agreed in the shape of sql table version 8) has 
just been committed to the CVS. Please give it a try and let me know if 
it seems to work to your eyes.


Thanks for this. I haven't compiled it yet, but I noticed this line:

  if ((!strcmp(config.type, "mysql") || !strcmp(config.type, "sqlite3"))
      && config.sql_table_version != 8) {


Doesn't this mean that it will revert to the old schema when we release a 
schema version 9? Is that what you wanted? It seems surprising to me. I 
would have expected config.sql_table_version < 8 instead.
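
The difference between the two conditions only shows up for versions above 8. A minimal Python sketch, where both function names are hypothetical and stand in for the C condition quoted above:

```python
LEGACY_DATABASES = ("mysql", "sqlite3")


def legacy_names_not_equal(db_type, version):
    """Mirror of the committed check: any version other than 8 is legacy."""
    return db_type in LEGACY_DATABASES and version != 8


def legacy_names_less_than(db_type, version):
    """Alternative check: only versions before 8 are legacy."""
    return db_type in LEGACY_DATABASES and version < 8
```

Both agree for versions 7 and 8. For a hypothetical version 9 the first returns True (old column names) and the second returns False (new column names).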


By the way I've written this story up in a blog post, I hope that's OK, 
but please let me know if you want me to edit it:

http://blog.aptivate.org/2010/10/06/consistency-portability-and-backwards-compatibility/

Cheers, Chris.



Re: [pmacct-discussion] Source port column name depends on database

2010-10-06 Thread Chris Wilson

Hi Paolo,

On Wed, 6 Oct 2010, Paolo Lucente wrote:

Yes, that's intended for a couple of reasons: 1) don't expect to release 
any more table versions: you see that already happening with recently 
introduced primitives; idea is to stick to a table version (or style 
nowadays) and then customize it from there, adding (or removing) fields 
to the base schema. 2) combinations of table type/version are internally 
mapped to a number greater than 8, ie. table type BGP, table version 1.


OK, I didn't know that, thanks.

No problem with the blog entry. I believe you can change the "Luckily he 
agreed" to simply "He agreed" - i'm not such an un-cooperative beast, 
am i?


Of course not, far from it :) I've changed it.

Cheers, Chris.



Re: [pmacct-discussion] Source port column name depends on database

2010-09-15 Thread Chris Wilson
Hi Paolo,

On Wed, 15 Sep 2010, Paolo Lucente wrote:
 On Tue, Sep 14, 2010 at 09:16:37AM +0200, Chris Wilson wrote:
 
  I'm not sure about adding a new config switch, do we actually need it?
 
 Funnily enough, and that was my perspective, in this case a configuration
 switch only adds two if-then-else in the common SQL plugins code. Whereas
 impact of a new schema version you can verify it yourself by grepping the
 source code for 'sql_table_version'.

I think the code that uses sql_table_version has been well written, and 
none of these places should need to be changed at all.

The only place that should need changing, I hope, is the one line of 
sql_common.c that currently says:

  if (!strcmp(config.type, "mysql") || !strcmp(config.type, "sqlite3")) {

and would now check for sql_table_version <= 7 (or similar) instead.

So this change does not actually increase the code complexity, or the 
number of config options, at all.

 I'd target release 0.12.5 for this as 0.12.4 is planned to be out soon 
 (by end of the month). Will give a shout as soon as i get something 
 workable in the CVS.

That would be great, please do!

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] Source port column name depends on database

2010-09-14 Thread Chris Wilson
Hi Paolo,

On Tue, 14 Sep 2010, Paolo Lucente wrote:

 Agree. I seem to reckon this legacy issue is limited to the TCP/UDP 
 ports only and i'm thinking perhaps the best way to approach it is to 
 issue a true/false config switch, ie. sql_table_compat, for the purpose. 
 But for consistency with the rest, these fields should be aligned to 
 port_src and port_dst. Agree?

Agree definitely on consistency, and don't really mind which way the name 
goes.

I'm not sure about adding a new config switch, do we actually need it? I 
seem to recall some wiser counsel to avoid adding configuration options 
where possible, as each one multiplies the complexity of the software 
code exponentially and also increases the complexity of using it linearly.

If our intention is to rename the MySQL fields going forward, why not just 
use a new schema version to grandfather the old column names?

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



[pmacct-discussion] Source port column name depends on database

2010-09-13 Thread Chris Wilson
Hi all,

We just had a bug report in pmGraph because it assumed that the source 
port database column was called src_port always, as it is in MySQL. The 
user is using a postgres database, and it appears that the column is 
called port_src there instead:

if (!strcmp(config.type, "mysql") || !strcmp(config.type, "sqlite3")) {
  strncat(insert_clause, "src_port", SPACELEFT(insert_clause));
  strncat(where[primitive].string, "src_port=%u",
          SPACELEFT(where[primitive].string));
}
else {
  strncat(insert_clause, "port_src", SPACELEFT(insert_clause));
  strncat(where[primitive].string, "port_src=%u",
          SPACELEFT(where[primitive].string));
}

I would be much happier writing database-independent code around 
pmacct if it didn't do things like this.

I understand that there is a backwards compatibility issue with changing 
it, but perhaps it could be done in a new version of the mysql or postgres 
schema?
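
Until the names are unified, a client could probe the table instead of assuming a name. The following is only a sketch under stated assumptions: `port_columns` is a hypothetical helper, the table name `acct` is assumed, and `PRAGMA table_info` is SQLite-specific (MySQL and PostgreSQL would need a query against information_schema.columns instead):

```python
import sqlite3


def port_columns(conn, table="acct"):
    """Return the (source, destination) port column names used by `table`."""
    # The table name is interpolated directly, so it must come from trusted config.
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if {"port_src", "port_dst"} <= columns:
        return "port_src", "port_dst"
    if {"src_port", "dst_port"} <= columns:
        return "src_port", "dst_port"
    raise LookupError(f"no port columns found in table {table!r}")
```

The caller would then build its SELECT statements from the returned names rather than hard-coding either spelling.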

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] Pmacct data inconsistencies between tables.

2010-02-19 Thread Chris Wilson
Hi Paolo and Daniel,

(please allow me to jump in as I may be able to help here, despite 
currently being in country working on a project.)

On Fri, 19 Feb 2010, Paolo Lucente wrote:

 I also wonder: what does the primary key of the 1 min table look like? Is 
 it any different from the 1 hour table? With sql_dont_try_update 
 turned on and the default indexing, duplicates are not possible.

I deleted the primary key from that table because it should not be 
necessary (there should not be any duplicates if everything is configured 
correctly) and it makes inserts extremely slow (by a factor of 10-100) 
when the table gets large.

 Also at a closer look to the configuration you posted i see no 
 aggregate_filter is specified (see EXAMPLES): it means each plugin 
 collects and tries to write to the same table both inbound and outbound 
 traffic. So either you can remove one set of plugins or craft a proper 
 aggregate_filter so that each does only its bit of the job.

The inbound and outbound traffic are supposed to go into the same table, 
but you're right that the aggregate_filter appears to be missing and this 
is almost certainly the cause of the duplicate records in the short 
table. Daniel, could you please add something like this:

aggregate_filter[inbound1]: dst net 10.0.156.0/24
aggregate_filter[outbound1]: src net 10.0.156.0/24
aggregate_filter[inbound2]: dst net 10.0.156.0/24
aggregate_filter[outbound2]: src net 10.0.156.0/24

However, I'm surprised that this doesn't also happen in the long table?
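
To show why the missing filters would produce duplicates, here is a minimal Python sketch. It does not use pmacct itself; `matching_plugins` is a hypothetical helper, and the plugin names are shortened to "inbound" and "outbound". Without a filter every plugin sees every flow, so each flow is written once per plugin:

```python
import ipaddress

LOCAL_NET = ipaddress.ip_network("10.0.156.0/24")


def matching_plugins(src, dst, filtered=True):
    """Return the plugins that would account a flow from src to dst."""
    if not filtered:
        return ["inbound", "outbound"]  # no aggregate_filter: both plugins match
    plugins = []
    if ipaddress.ip_address(dst) in LOCAL_NET:
        plugins.append("inbound")   # equivalent of: dst net 10.0.156.0/24
    if ipaddress.ip_address(src) in LOCAL_NET:
        plugins.append("outbound")  # equivalent of: src net 10.0.156.0/24
    return plugins
```

With the filters in place, a flow from an outside host to 10.0.156.7 is accounted by the inbound plugin only. A flow between two local hosts would still match both filters.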

 With regards to the missing tuples, from the few checks i've done, it is 
 always the case that something is in the 1 hour table but can be missing 
 in the 1 minute one. This can very well be the result of a shared 
 'sql_preprocess: minb = 1000' directive: a flow can accumulate more than 
 1000 bytes in 1 hour but not in 1 minute - and hence it's accounted in 
 one table and stripped off in the other.

Yes, I would expect the long table totals to be slightly more than the 
short table ones for this reason. However, the problem that we're seeing 
is the opposite: the totals calculated from the long table are less than 
those from the short table, even though the long table includes flows that 
the short table doesn't.

And, while this might be accounted for by the duplicate flows in the short 
table, the same should apply to the long table, so I think it should have 
balanced out.
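
The expected direction of the discrepancy can be reproduced with a few lines of Python. This is a simplified model, not pmacct's implementation: it assumes that minb is applied to each time bucket separately and that an entry is kept when its byte count is at least the threshold:

```python
MINB = 1000  # threshold from 'sql_preprocess: minb = 1000'


def stored_bytes(per_minute_bytes, bucket_minutes):
    """Total bytes that survive the minb check for a given bucket length."""
    total = 0
    for start in range(0, len(per_minute_bytes), bucket_minutes):
        bucket = sum(per_minute_bytes[start:start + bucket_minutes])
        if bucket >= MINB:  # buckets below the threshold are discarded
            total += bucket
    return total


# A slow flow: 50 bytes every minute for one hour (3000 bytes in total).
slow_flow = [50] * 60
```

Here stored_bytes(slow_flow, 60) keeps all 3000 bytes while stored_bytes(slow_flow, 1) keeps none, so the hourly table should never show less than the one-minute table for this reason alone.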

 Given the sql_preprocess you should never expect counters to match for 
 the same reason as above. To have a comparison more apples to apples, 
 you should consider removing it and when confident everything is 
 all right put it back again.

Unfortunately we cannot do this in the production environment, as the 
number of rows of tiny flows (which are effectively noise) completely 
dwarfs the real data, overloads our firewall's CPU and disk space, and 
makes querying so slow that the data is useless. This is where a test lab 
environment would be useful.

Thanks for your help with this :)

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] Pmacct data inconsistencies between tables.

2010-02-19 Thread Chris Wilson
Hi Karl,

On Fri, 19 Feb 2010, Karl O. Pinc wrote:
 On 02/19/2010 07:42:08 AM, Chris Wilson wrote:
 
  I deleted the primary key from that table because it should not be 
  necessary (there should not be any duplicates if everything is 
  configured correctly) and it makes inserts extremely slow (by a factor 
  of 10-100) when the table gets large.
 
 FWIW, the automatic sequential key generation speed is unrelated
 to table size when using postgresql.

There is no sequence to generate as far as I know. The problem is the size 
of the index file, and the fact that it has to be rewritten for every 
insert (or block of inserts) that makes insertion get slower as database 
size increases.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] NAT question

2009-11-11 Thread Chris Wilson
Hi JF,

On Thu, 12 Nov 2009, JF Cliche wrote:

 I am behind two NAT routers (Linksys running DD-WRT) with port 
 forwarding up to the machine running pmacct, and yet pmacct reports SSH 
 traffic to the forwarded port with the public (external, non-NATed) 
 addresses. I thought all traffic should be seen as coming from the 
 second router private address. Is pmacct (or underlying pcap library) 
 getting the public address from extra data encapsulated in the TCP 
 packets by the routers or in the SSH protocol? I've seen the opposite 
 problem being discussed in this forum, but not this...

NAT usually affects only the source address of outbound connections, and 
the destination address of inbound ones. There's no need for it to change 
the source of your incoming (to the pmacct server) SSH connection, as its 
reply packets will still go back to the SSH client via the router, which 
is necessary in order to have their source IP natted.
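
A toy model of port forwarding may make this clearer. It is only a sketch: `dnat` is a hypothetical helper, the addresses are documentation examples, and real routers keep connection-tracking state that is left out here. Each router rewrites the destination of the forwarded packet and leaves the source alone:

```python
def dnat(packet, forwards):
    """Apply destination NAT (port forwarding); the source is left untouched."""
    target = forwards.get((packet["dst"], packet["dport"]))
    if target is None:
        return packet
    new_dst, new_dport = target
    return {**packet, "dst": new_dst, "dport": new_dport}


# Two chained routers, each forwarding port 22 one hop further inwards.
router1 = {("198.51.100.1", 22): ("192.168.1.2", 22)}
router2 = {("192.168.1.2", 22): ("192.168.2.10", 22)}

incoming = {"src": "203.0.113.5", "sport": 40000, "dst": "198.51.100.1", "dport": 22}
seen_by_pmacct = dnat(dnat(incoming, router1), router2)
```

After both hops the destination is the internal server, but the source is still the public address of the SSH client, which is what pmacct reports.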

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] timestamp rounding bug

2009-08-04 Thread Chris Wilson
Hi Paolo,

On Mon, 3 Aug 2009, Paolo Lucente wrote:

 Didn't act on it yet, being focused on some new features. My goal is to 
 do something about it in 0.12.0rc2. Basically it would be a fix for who 
 doesn't use an UTC clock on the system running pmacct. If there is 
 general interest around this story, I'll remember to briefly post here 
 about it once the code is committed to the CVS.
 
 Btw, i guess the outcome of that thread was a recommendation to run 
 pmacct on a system which is set up for UTC. Maybe this should also be 
 made slightly more visible - maybe inserted into the FAQS document.

Is any real-world system set to UTC? I'm certainly not going to run my 
firewall (where I run pmacct currently) on UTC. All my logs would be 
screwed up and much harder to interpret.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] Flexible aggregation

2009-06-14 Thread Chris Wilson
Hi Paolo and Karl,

On Sat, 13 Jun 2009, Paolo Lucente wrote:

 On Sat, Jun 13, 2009 at 03:07:01PM -0500, Karl O. Pinc wrote:
 
  We are only interested in a single table.
 
  Why can't two separate sql plugins write to the same table?
 
 What Karl is proposing here might really result in a simpler
 approach compared to the sub-aggregation scenario - which, with
 some care (ie. sql_startup_delay to avoid event synchronization
 while retaining same sql_history and sql_refresh_time settings),
 can not only achieve same results but best of all is already
 there. Let us know your thoughts!

I don't think it can. For example, how would we write the configuration? 
Let's say we just want to zero (not aggregate on) the destination IP for 
flows less than 1000 bytes. We could try:

  plugins: mysql[with_dst], mysql[without_dst]
  aggregate[with_dst]: src_host, src_port, dst_host, dst_port, proto
  aggregate[without_dst]: src_host, src_port, dst_port, proto
  sql_preprocess[with_dst]: minb = 1000
  sql_preprocess[without_dst]: maxb = 1000

but the flow aggregates are not the same for both plugins, so we can't 
ensure that any flow ends up in one plugin or the other but not both or 
neither.
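
A small Python model shows one way a flow can fall through the gap. It is a sketch, not pmacct behaviour verified against the code: it assumes that minb keeps entries with at least 1000 bytes and maxb keeps entries with at most 1000 bytes, and `aggregate` and `stored_total` are hypothetical helpers:

```python
from collections import defaultdict

WITH_DST = ("src_host", "src_port", "dst_host", "dst_port", "proto")
WITHOUT_DST = ("src_host", "src_port", "dst_port", "proto")


def aggregate(flows, keys):
    """Sum bytes per combination of the given key fields."""
    totals = defaultdict(int)
    for flow in flows:
        totals[tuple(flow[k] for k in keys)] += flow["bytes"]
    return dict(totals)


def stored_total(flows, threshold=1000):
    """Bytes that reach the database through either of the two plugins."""
    with_dst = [b for b in aggregate(flows, WITH_DST).values() if b >= threshold]
    without_dst = [b for b in aggregate(flows, WITHOUT_DST).values() if b <= threshold]
    return sum(with_dst) + sum(without_dst)


# Two 600-byte flows that differ only in destination address.
flows = [
    {"src_host": "10.0.0.1", "src_port": 5000, "dst_host": "192.0.2.1",
     "dst_port": 80, "proto": "tcp", "bytes": 600},
    {"src_host": "10.0.0.1", "src_port": 5000, "dst_host": "192.0.2.2",
     "dst_port": 80, "proto": "tcp", "bytes": 600},
]
```

In the with_dst plugin each flow is a separate 600-byte entry and fails minb. In the without_dst plugin they merge into one 1200-byte entry and fail maxb. So 1200 bytes of traffic are stored in neither place.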

How else could we do it with what we already have? We could write to 
different tables at different levels of aggregation, and let the user 
choose which one to use, and delete old data from each table to stop it 
becoming too large... but that gets more complicated for the user.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] Flexible aggregation

2009-06-13 Thread Chris Wilson
Hi Paolo,

On Sat, 13 Jun 2009, Paolo Lucente wrote:

 Good pointer. From a brief scan of the Aguri homepage, please feel free 
 to correct me if i'm wrong, i see many similarities between pmacct and 
 Aguri.

I guess so; I was thinking that Aguri seems to store its output in text 
files rather than a database, and perhaps provides more dynamic/automatic 
filtering, but seems to be a research project and not highly supported or 
maintained.

 Aguri is slightly more limited in the fact it has only a set of (4?) 
 traffic aggregation profiles whereas pmacct offers a wider range of 
 primitives. But I guess the point you wanted to make was the dynamic 
 variation of the sampling rate under increased traffic load (ie. DDoS).

OK, I didn't realise that it was just the sample rate that was varied. I 
thought it was to do with the flexible aggregation, e.g. if we have 1000 
flows with the same source IP and source port, they might be aggregated 
together as a single, more highly summarised flow.

 pmacct actually does have such feature only available to the SQL
 plugins: it's part of the SQL preprocess infrastructure (look for 
 'sql_preprocess' in the CONFIG-KEYS document or the wiki) and is
 called 'fsrc' (Flow Sampling under Resource Constraints). It is
 an implementation i did years ago loosely based on a paper coming
 from AT&T Labs. It aims at offering to the SQL database a sort of
 stream-lined number of aggregates; aggregates are weighted, ranked
 and sampled based on probability (which gives the dynamic/adaptive
 part of the approach); the resource constraint is expressed via
 the number of flows you want to end in the database (which is in
 turn seen as the constrained resource here).

We are using this feature to filter out small flows, but the problem is 
that they are not accounted for at all, so the database contents e.g. 
SUM(bytes) no longer reflect the interface totals.

What I would ideally like to see, although I realise that it's hard, is 
something like this:

Initial filter selects flows over a certain size and non-selected flows 
can either be discarded (as now) or reaggregated by zeroing a selected 
feature, e.g. the destination port, and combined into a new single record 
if there is more than one of them. These, more highly aggregated records 
then continue down the preprocess chain, and if they fail to match a later 
condition then they can be aggregated again in a different way, e.g. by 
zeroing the destination IP address, and so on, until we end up with a 
single record where all the features were aggregated.

For example, sql_preprocess might look something like this:

minb = 1, zero_dstip, minb = 1, zero_dstport, minb = 1, 
zero_srcport, minb = 1, zero_srcip

Then any flows which together do not add up to enough bytes to pass the 
minb filters, even after aggregation, end up in a record where all the 
selector fields are zeroed out. Since there is no final minb condition, 
this row would always be added to the database, never rejected, so 
SUM(bytes) would again equal the interface counters for any given time 
range.
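
The proposal can be modelled in a few lines of Python. This is a sketch of the idea only, not existing pmacct functionality: `merge` and `cascade` are hypothetical names, a record is a (key, bytes) pair, and 0 stands for a zeroed field:

```python
from collections import defaultdict

FIELDS = ("src_ip", "src_port", "dst_ip", "dst_port")


def merge(records):
    """Combine records that share the same key by summing their bytes."""
    totals = defaultdict(int)
    for key, nbytes in records:
        totals[key] += nbytes
    return list(totals.items())


def cascade(records, minb, zero_order):
    """Keep large records; zero one field at a time on the rest and re-merge."""
    kept = []
    pending = merge(records)
    for field in zero_order:
        idx = FIELDS.index(field)
        small = []
        for key, nbytes in pending:
            (kept if nbytes >= minb else small).append((key, nbytes))
        zeroed = [(key[:idx] + (0,) + key[idx + 1:], nbytes) for key, nbytes in small]
        pending = merge(zeroed)
    return kept + pending  # no final minb: the leftovers are always stored


records = [
    (("10.0.0.1", 1234, "192.0.2.1", 80), 400),
    (("10.0.0.1", 1234, "192.0.2.2", 80), 700),
    (("10.0.0.2", 999, "192.0.2.3", 22), 5000),
    (("10.0.0.3", 1, "192.0.2.9", 53), 100),
]
result = cascade(records, 1000, ("dst_ip", "dst_port", "src_port", "src_ip"))
```

The 5000-byte flow is stored unchanged, the 400-byte and 700-byte flows merge into one 1100-byte record once the destination IP is zeroed, and the 100-byte flow ends up in the all-zero record. The byte total of the result equals the byte total of the input.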

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] Flexible aggregation

2009-06-13 Thread Chris Wilson
Hi Paolo,

On Sat, 13 Jun 2009, Paolo Lucente wrote:

  minb = 1, zero_dstip, minb = 1, zero_dstport, minb = 1, 
  zero_srcport, minb = 1, zero_srcip
  
  Then any flows which together do not add up to enough bytes to pass 
  the minb filters, even after aggregation, end up in a record where all 
  the selector fields are zeroed out. Since there is no final minb 
  condition, this row would always be added to the database, never 
  rejected, so SUM(bytes) would again equal the interface counters for 
  any given time range.
 
 I explored this valid approach some time ago (years!); by zeroing some
 aggregation primitives previously selected, duplicates are likely to be
 created. The trick is to resolve such duplicates before offering them
 to the SQL database - via a sub-aggregation operation. The cache is not
 sorted - making any sub-aggregation operation very expensive (scaling
 linearly with the number of aggregates being offered); the idea here is
 to index the cache, perform the sub-aggregation and offer the result of
 this to the SQL database. 

I agree that merging duplicate records would produce the most useful 
results for us.

 In summary, it's not something quick to do but it can be done - maybe 
 something good for inclusion within the 0.12 trunk later in the year. At 
 this stage, this feature can't be included in the first pre-release 
 version (0.12.0p1) but I can plan it along the rocky way to the first 
 official release, 0.12.0. Maybe already in 0.12.0p2. How does it sound?

That sounds great! I was not expecting you to offer to implement it so 
quickly. I understand that it's difficult and may conflict with your other 
priorities.

 Let me spend a couple of words on a different aspect: the above approach 
 implies everything ends in the same SQL table - which can have pros and 
 cons; the pro is simplicity (one table for everything); the con is that 
 one might want to have sub-aggregated data clearly separated into a 
 different table to, say, apply different policies. This is something that 
 can be done today with pmacct as 'sql_preprocess' also offers the "max" 
 version of the "min" features you are using. It means having, for 
 example, two SQL plugins, writing to different SQL tables, aggregating 
 data differently and using complementary sql_preprocess features (so 
 that at the end by summing data in both tables one ends with the full 
 picture). Would this be a feasible approach to you?

We are only interested in a single table. We can show 0.0.0.0 as 
"Aggregated out" in the pmGraph user interface. I'd rather that we didn't 
have to query five separate tables to get the results at different levels 
of aggregation, and merge them all together in our code. However I can see 
that some people would prefer to keep them in separate tables.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



[pmacct-discussion] Flexible aggregation

2009-06-12 Thread Chris Wilson
Hi all,

Has anyone heard of Aguri?

Aguri is an aggregation-based traffic profiler targeted for near 
real-time, long-term, and wide-area traffic monitoring. Aguri adapts 
itself to spatial traffic distribution by aggregating small volume flows 
into aggregates, and achieves temporal aggregation by creating a summary 
of summaries applying the same algorithm to its outputs. A set of scripts 
are used for archiving and visualizing summaries in different time scales. 
Aguri does not need a predefined rule set and is capable of detecting an 
unexpected increase of unknown protocols or DoS attacks, which 
considerably simplifies the task of network monitoring.

[http://www.sonycsl.co.jp/person/kjc/kjc/software.html]

I think I remember something like this being posted to the list a while 
back, so I'm sorry if this is a duplicate.

Has anyone considered implementing anything like this flexible aggregation 
in pmacct? Could the code be taken from Aguri under BSD license?

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] timestamp rounding bug

2009-04-20 Thread Chris Wilson

Hi Paolo,

On Sun, 19 Apr 2009, Karl O. Pinc wrote:

what makes sense to me is to collect timestamps in UTC, store them in 
UTC when storing them in a database, and let whatever's pulling the data 
out of the db present the data to the user in whatever fashion makes 
sense.  Any other approach, i.e. working in local time or DST, makes 
working across time zones difficult, and computing intervals (in the 
case of DST) impossible.


I agree with Karl. Timestamps in UTC in the database make the most sense 
for me.
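
A short Python example of the interval problem that Karl mentions. It is an illustration only: the two fixed offsets stand in for UK summer and winter time, and `to_utc_epoch` is a hypothetical helper. On 2009-10-25 the wall-clock time 01:30 occurred twice in the UK, one hour apart:

```python
from datetime import datetime, timedelta, timezone

BST = timezone(timedelta(hours=1))  # UK summer time, UTC+1
GMT = timezone.utc                  # UK winter time, UTC+0


def to_utc_epoch(aware_dt):
    """Seconds since the Unix epoch for an offset-aware datetime."""
    return int(aware_dt.timestamp())


# Two samples taken one hour apart, either side of the clock change.
before = datetime(2009, 10, 25, 1, 30, tzinfo=BST)
after = datetime(2009, 10, 25, 1, 30, tzinfo=GMT)

local_gap = after.replace(tzinfo=None) - before.replace(tzinfo=None)
utc_gap = to_utc_epoch(after) - to_utc_epoch(before)
```

Stored as local time without an offset, the two samples are indistinguishable and the interval between them computes as zero. Stored as UTC, the interval is the correct 3600 seconds, and the reader of the database can still convert to local time for display.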


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Chris Wilson
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

 Apr 13 15:27:15 server kernel: pmacctd[1341]: segfault at f7002991 ip 
 f7bfa9ca sp ffb88334 error 4 in 
 libpthread-2.3.6.so[f7bf2000+e000]

 I think I got it (using a written coredump):

Yes, that's it, thanks. I'm afraid it doesn't mean much to me, but I hope 
it will help Paolo. What exact version of pmacct are you using?

 (gdb) bt
 #0  0xf7ba29ca in pthread_getspecific () from
 /lib/tls/i686/cmov/libpthread.so.0
 #1  0xf7c8bf85 in inet_ntoa () from /lib/tls/i686/cmov/libc.so.6

Paolo, this looks weird to me. pthread_getspecific() should not crash, 
that makes me think that the heap has been trashed (stack looks generally 
OK as the backtrace is OK). Perhaps a Valgrind is in order? Any static 
or fixed-size buffers in the mysql plugin that might be busted?

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Chris Wilson
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

 I'm not sure why flows is in your aggregate set since flows are 
 already aggregated into flows in all cases by pmacctd, as far as I 
 know (please correct me if I'm wrong).

 flow isn't in the primary key.

 I didn't say it was, but it is in your aggregate set and I don't 
 understand why.

 Are you sure it's flow? Between MAC and IP it could be vlan.

aggregate:
src_host,dst_host,dst_port,src_port,flows,dst_mac,proto,src_mac,vlan

It's right there before dst_mac.

 I guess you mean the SIGSEGV error has been logged in your syslog? gdb 
 should stop when it sees the SIGSEGV error, and wait for a command such 
 as bt. So I guess it's happening in another thread than the main one, 
 so it will be harder to trace.

 You could wait until pmacctd is up and running, then press Ctrl+C, 
 enter the "info threads" command, then guess a thread other than the 
 first one and switch to it with "thread xxx" and "continue", and hope 
 that that thread dies with SIGSEGV.

 Isn't pmacctd terminated when I press Ctrl+C?

It shouldn't be, gdb should intercept the SIGINT and stop it from reaching 
the process.

Cheers, Chris.


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Chris Wilson
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

 Paolo, this looks weird to me. pthread_getspecific() should not crash, 
 that makes me think that the heap has been trashed (stack looks 
 generally OK as the backtrace is OK). Perhaps a Valgrind is in order? 
 Any static or fixed-size buffers in the mysql plugin that might be 
 busted?

 No Valgrind installed.

You can probably "apt-get install valgrind" and run pmacctd through it.

 I cleared the database, and observed what happened:
 Apr 13 17:18:19 server1 pmacctd[12394]: INFO ( default/core ): Start logging 
 ...
 Apr 13 17:18:19 server1 pmacctd[12394]: OK ( default/core ): link type is: 1
 Apr 13 17:19:41 server1 pmacctd[12394]: Expiring orphan fragment:
 ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33635
 Apr 13 17:19:47 server1 pmacctd[12394]: Expiring orphan fragment:
 ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33756
 Apr 13 17:19:58 server1 pmacctd[12394]: Expiring orphan fragment:
 ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33415
 Apr 13 17:20:01 server1 pmacctd[12419]: ERROR ( default/mysql ): Duplicate 
 entry
 '0-00:1b:8f:61:55:c9-00:1c:c0:ab:8a:48-0-91.22.172.35-84.38.74.24' for key 1
 Apr 13 17:20:01 server1 kernel: pmacctd[12419]: segfault at 3827208c ip
 f7c599ca sp ffde7894 error 4 in
 libpthread-2.3.6.so[f7c51000+e000]

As this crash is so early, perhaps the thread isn't initialised properly?

Cheers, Chris.


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Chris Wilson
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

 Apr 13 17:20:01 server1 pmacctd[12419]: ERROR ( default/mysql ):
 Duplicate entry
 '0-00:1b:8f:61:55:c9-00:1c:c0:ab:8a:48-0-91.22.172.35-84.38.74.24' for
 key 1

 As this crash is so early, perhaps the thread isn't initialised properly?

 Well, the first update (into the completely empty table) was successful,
 and I think that used the same kind of thread.

 I now have a guess where the duplicate key error comes from:

 Assume the updates are done at :30 with sql_history_roundoff: 1h and
 sql_refresh_time: 3600 (1h) (that long for simplicity)

 at 0:30 for each recorded flow a row is inserted with the timestamp 0:00
 at 1:30 for log flow a row is inserted for 0:00 and 1:00 ...

 at least if I understood the documentation right, that causes the error,
 since two identical inserts would be done...

My understanding is that with those settings, a row would be inserted just 
after 0:00, with stamp_inserted = 0:00, and another one just after 1:00, 
with stamp_inserted 1:00, so there should not be a conflict.

What makes you think that anything should happen at 0:30 or 1:30? Also, 
the second insert should have stamp_inserted = 1:00 not 0:00, as far as I 
know.
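
To make the bucketing concrete, here is a minimal Python sketch of how a timestamp might map to stamp_inserted when the history period is one hour and times are rounded down to the hour. The function name history_bucket is hypothetical, and the rounding rule is an assumption based on the description above rather than something taken from the pmacct source.

```python
from datetime import datetime, timedelta

def history_bucket(ts: datetime, period: timedelta = timedelta(hours=1)) -> datetime:
    """Round ts down to the start of its history period (assumed behaviour)."""
    epoch = datetime(1970, 1, 1)
    seconds = int((ts - epoch).total_seconds())
    step = int(period.total_seconds())
    return epoch + timedelta(seconds=seconds - seconds % step)

# Traffic seen at 0:40 and at 1:10 lands in different buckets, so the two
# rows differ in stamp_inserted and should not collide on the primary key.
first = history_bucket(datetime(2009, 4, 13, 0, 40))
second = history_bucket(datetime(2009, 4, 13, 1, 10))
print(first, second)
```

Under that assumption, a conflict would need two inserts for the same key within the same bucket, not inserts from two different hours.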

Cheers, Chris.


[pmacct-discussion] pmacct weird counters

2009-03-14 Thread Chris Wilson
Hi Paolo,

I'm running pmacctd 0.11.5 on a small network for traffic accounting. 
Generally it's behaving well, but occasionally I can see weird data being 
inserted:

17190 Query INSERT INTO `acct_v7` (stamp_updated, stamp_inserted, vlan, 
ip_dst, as_src, as_dst, src_port, dst_port, tcp_flags, tos, ip_proto, 
agent_id, class_id, mac_src, mac_dst, ip_src, packets, bytes, flows) 
VALUES (FROM_UNIXTIME(1236952981), FROM_UNIXTIME(1236952920), 0, 
'192.168.0.175', 0, 0, 0, 0, 0, 0, 'ip', 0, 'unknown', '0:0:0:0:0:0', 
'0:0:0:0:0:0', '0.0.0.0', 10026264, 4290000028, 0)

17190 Query INSERT INTO `acct_v7` (stamp_updated, stamp_inserted, vlan, 
ip_dst, as_src, as_dst, src_port, dst_port, tcp_flags, tos, ip_proto, 
agent_id, class_id, mac_src, mac_dst, ip_src, packets, bytes, flows) 
VALUES (FROM_UNIXTIME(1236952981), FROM_UNIXTIME(1236952920), 0, 
'192.168.0.175', 0, 0, 0, 0, 0, 0, 'ip', 0, 'unknown', '0:0:0:0:0:0', 
'0:0:0:0:0:0', '0.0.0.0', 8984686, 3943258731, 0)

The byte counters look bogus to me. It's hard to imagine how anyone could 
send 4 GB of data down through my cable modem connection in just one 
minute. I might even suspect a 32-bit sign overflow, but in the second 
case that would still mean 350 MB in one minute which is 46 Mbps, more 
than four times my line rate, and my external interface graphs show no 
traffic at all during that time.
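
The arithmetic behind that estimate can be checked with a short Python sketch. It reinterprets the second bytes counter as a signed 32-bit value and converts both readings to an average rate over the one-minute bucket. The helper name mbps is hypothetical, and the 60-second interval is an assumption taken from the one-minute spacing of the stamp_inserted values.

```python
reported = 3943258731          # bytes counter from the second INSERT

# Reinterpret the unsigned value as a signed 32-bit integer.
as_signed = reported - 2**32 if reported >= 2**31 else reported
magnitude = abs(as_signed)     # size of the value if the sign were bogus

def mbps(num_bytes: int, seconds: int = 60) -> float:
    """Average rate in megabits per second over the interval."""
    return num_bytes * 8 / seconds / 1e6

print(as_signed)
print(round(mbps(magnitude), 1))   # rate if only the wrapped part is real
print(round(mbps(reported), 1))    # rate if the counter is taken at face value
```

Either reading gives a rate well above what a cable modem line could carry, which is the point made above.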

What's also odd is that the second record is a primary key conflict with 
the first, so it never ended up in the database. I don't have two 
pmacctd's running this time :) but I do have two plugins configured as 
follows:

plugins: mysql[inbound], mysql[outbound]

aggregate[inbound]: dst_host
aggregate_filter[inbound]: dst net 192.168.0.0/24

aggregate[outbound]: src_host
aggregate_filter[outbound]: src net 192.168.0.0/24

They both insert into the same table, which is what I want in this case. 
Because of aggregation, they should never conflict with each other. But 
could this be causing memory corruption?

Here is the suspicious data that I have in my database (I assume that 
MySQL is not corrupting this data):

mysql> select stamp_inserted,bytes,packets from acct_v7 where bytes > 1000000000;
+---------------------+------------+----------+
| stamp_inserted      | bytes      | packets  |
+---------------------+------------+----------+
| 2009-02-13 09:27:00 | 3192440953 |  3077338 |
| 2009-02-25 15:31:00 | 1520451669 | 17845485 |
| 2009-02-25 15:31:00 | 4290000569 |  9270610 |
| 2009-02-25 15:32:00 | 1833044423 |  4116940 |
| 2009-03-09 01:43:00 | 3842930106 |  4829946 |
| 2009-03-09 01:43:00 | 4290000226 |  4202681 |
| 2009-03-13 14:00:00 | 4290000631 |  9675501 |
| 2009-03-13 14:01:00 | 4290000783 |  9514197 |
| 2009-03-13 14:02:00 | 4290000028 | 10026264 |
| 2009-03-13 14:03:00 | 4290000262 |  9798220 |
| 2009-03-13 14:04:00 | 2777022526 |  6454405 |
| 2009-03-14 00:08:00 | 1521800860 |  2077144 |
| 2009-03-14 05:22:00 | 1460542448 |  3737824 |
+---------------------+------------+----------+

Do you have any ideas what might be going on here?

Cheers, Chris.


Re: [pmacct-discussion] pmacct weird counters

2009-03-14 Thread Chris Wilson
Hi Karl,

On Sat, 14 Mar 2009, Karl O. Pinc wrote:

 Do you have any ideas what might be going on here?

 Have you bound to an interface with 'interface'?

 Could be you're picking up, say, a file transfer to your gateway. 
 You'd want to monitor your external interface, or filter out traffic to 
 the box itself.

Good idea, but I am bound to interface eth0.

 As a debugging aid (or in general) you might consider putting your 
 rfc1918 network in a networks file. With an aggregate on sum_net and 
 without any other filters you get the cross product of all the 
 possibilities so can see if there's traffic from/to the local network or 
 other things you're perhaps not expecting. If nothing else a quick test 
 with the memory plugin may be revealing.

Sorry, what is an aggregate on sum_net? I'm aggregating on ip_src and 
ip_dst respectively in two different plugins.

I have been thinking about using a networks file, although I'm not sure 
how to do it yet. I have just changed my configuration as follows:

aggregate[inbound]: dst_host, src_mac, dst_mac
aggregate_filter[inbound]: dst net 192.168.0.0/24 and not src net 192.168.0.0/24

aggregate[outbound]: src_host, src_mac, dst_mac
aggregate_filter[outbound]: src net 192.168.0.0/24 and not dst net 192.168.0.0/24

to hopefully exclude local traffic and also to see if some weird MAC 
addresses are involved, e.g. multicast, spoofing. But I don't see traffic 
in the gigabytes on either interface when this happens (internal or 
external).

Cheers, Chris.


Re: [pmacct-discussion] pmacct weird counters

2009-03-14 Thread Chris Wilson
Hi Paolo,

On Sat, 14 Mar 2009, Paolo Lucente wrote:

 About the SQL INSERT conflict, are you by any chance making use of the
 sql_dont_try_update directive in your configuration?

Yes I am, because it's much more efficient.

 And are you using 32bit counters?

I think so, yes. I compiled with default options on a 32-bit host.

 The conjunction of these two conditions might explain.

 The SQL cache code, while summing up counters, makes a check on whether
 the counter field is about to overflow. When 64bit counters are disabled
 (default) this is what happens:

 #define UINT32T_THRESHOLD 4290000000UL
 #define CACHE_THRESHOLD UINT32T_THRESHOLD

 /* additional check: bytes counter overflow */
 else if (Cursor->bytes_counter > CACHE_THRESHOLD) {
  if (!staleElem && Cursor->chained) staleElem = Cursor;
  goto follow_chain;
 }

 Basically, a new record for the entry which is going to overflow is
 opened and the old one is parked. When purging the cache to the SQL
 database, both records (the active and the parked one) are sent over:
 the first with an INSERT, the second with an UPDATE. This mechanism is
 valid for any number of overflows.
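
The mechanism described above can be modelled in a few lines of Python. This is a deliberately simplified sketch and not pmacct's actual code: the function names accumulate and purge are hypothetical, the threshold is assumed to be 4290000000 (just below 2**32), and the idea that an INSERT-only mode turns the second write into another INSERT is an assumption about how sql_dont_try_update interacts with the parked record.

```python
THRESHOLD = 4_290_000_000  # assumed value of the 32-bit cache threshold

def accumulate(packet_sizes):
    """Sum sizes into records, opening a new record once the active
    one has passed THRESHOLD (simplified model of the overflow check)."""
    records = [0]
    for size in packet_sizes:
        if records[-1] > THRESHOLD:
            records.append(0)      # park the old record, start a new one
        records[-1] += size
    return records

def purge(records, dont_try_update=False):
    """Return the SQL verbs issued for one cache entry at purge time."""
    verbs = []
    for i, _ in enumerate(records):
        if i == 0 or dont_try_update:
            verbs.append("INSERT")
        else:
            verbs.append("UPDATE")
    return verbs

records = accumulate([1_500_000_000] * 5)
print(records, purge(records), purge(records, dont_try_update=True))
```

In this model the INSERT-only case writes two rows with the same key, which would surface as a primary key conflict on the second one.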

 The above would also explain why a number of the entries above the 1GB
 level are around the 4GB. But this also would suggest the counters are
 genuine. Another thing which would suggest these are real is that by
 dividing the bytes counter by the packets counter, you get a consistent
 average data size:

 4290000028 / 10026264 = ~428 bytes
 3943258731 / 8984686  = ~439 bytes

 Any bytes counter roll-over would have greatly skewed one of the above
 two ratios, highlighting an issue. But this would suggest that in
 a single minute roughly 8GB of data were transferred, which translates into
 a fully loaded 1Gbps link. This brings me to these questions: is your LAN
 (including the 192.168.0.175 host) connected at 1Gbps? Do you
 think it is possible that some LAN traffic gets spanned over?
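
The two sanity checks above (bytes per packet, and the implied link rate) can be reproduced with a short Python sketch. It takes the first bytes counter as 4290000028, which is the value the ~428 bytes figure implies, and assumes both records cover the same 60-second bucket.

```python
records = [
    (4290000028, 10026264),  # (bytes, packets) from the first INSERT
    (3943258731, 8984686),   # (bytes, packets) from the second INSERT
]

# Average packet size per record; a rolled-over counter would skew one of these.
averages = [round(num_bytes / packets) for num_bytes, packets in records]

# Combined volume expressed as an average rate over one minute.
total = sum(num_bytes for num_bytes, _ in records)
gbps = total * 8 / 60 / 1e9

print(averages, round(gbps, 2))
```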

The local machine is connected to a gigabit switch on the LAN, but this 
host is attached to another switch which is not gigabit, so that suggests 
to me that the counter is invalid. I just checked on the switch, and the 
port that this machine is attached to is currently running at 100 Mbps.

It is possible that either the switch or my firewall/router/pmacct box is 
going mental and repeating traffic.

Perhaps the best thing to do is to recompile pmacct with 64-bit counters 
to see if the issue goes away? Alternatively I planned to log all traffic 
with "tcpdump -w" to create a pcap file that I could replay into pmacctd to 
reproduce the problem if it happens again. Would that work? Does pmacctd 
honour the timestamps in the pcap file while reading it?

Cheers, Chris.


Re: [pmacct-discussion] pmacct weird counters

2009-03-14 Thread Chris Wilson
Hi Karl,

On Sat, 14 Mar 2009, Karl O. Pinc wrote:

 Sorry, what is an aggregate on sum_net? I'm aggregating on ip_src and 
 ip_dst respectively in two different plugins.

 sum_net gets you all the traffic to and from each network you list in 
 your networks file, plus to and from anywhere else. The cross product. 
 In your case, if you put only 192.168.0.0/24 in your networks file you 
 get out totals for the following possibilities.

Great, thanks, that's a very useful feature that I didn't know about. I've 
switched my configuration to use that, and we'll see if the problem goes 
away.

Cheers, Chris.


Re: [pmacct-discussion] pmacct weird counters

2009-03-14 Thread Chris Wilson
Hi Karl,

On Sat, 14 Mar 2009, Chris Wilson wrote:

 sum_net gets you all the traffic to and from each network you list in 
 your networks file, plus to and from anywhere else. The cross product. 
 In your case, if you put only 192.168.0.0/24 in your networks file you 
 get out totals for the following possibilities.

 Great, thanks, that's a very useful feature that I didn't know about. 
 I've switched my configuration to use that, and we'll see if the problem 
 goes away.

Sorry, I just realised that that only produces a summary of all traffic 
from the net, whereas I want to account by individual host within the net. 
So I can't replace my current config with sum_net, but I have added it as 
a new plugin.

Cheers, Chris.


Re: [pmacct-discussion] pmacct weird counters

2009-03-14 Thread Chris Wilson
Hi Paolo,

On Sat, 14 Mar 2009, Paolo Lucente wrote:

 Any signs of massive packet drops on any port throughout your switches? 
 I ask because the traffic reported might not have been actually 
 delivered to the end host.

The switch has been up for 12.25 days, and in that time has recorded 
2,085,458,896 octets sent and 4,161,359,962 octets received by that port 
(which seems unusually low), and 77,060,310 packets sent and 66,840,066 
packets received.

Over the same period, pmacctd logged 57,439,276,227 bytes and 105,873,327 
packets sent to that host alone, or 129,777,361 packets including another 
host which I know is on the same port.

The switch shows 242 RX errors (all CRC alignment) on that port and no 
other errors or discards. There are no errors or discards on the port that 
my router/pmacct box is attached to. Packet numbers are in the same 
region, i.e. a bit less than 100 million. I suspect that the switch's byte 
counters are wrapping.
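
A rough Python sketch shows why wrapping is plausible, assuming the switch keeps 32-bit octet counters (that width is a guess, not something read from the switch). The helper name wrap_seconds is hypothetical.

```python
def wrap_seconds(link_bps: float, counter_bits: int = 32) -> float:
    """Seconds for an octet counter to wrap at a sustained line rate."""
    return (2 ** counter_bits) / (link_bps / 8)

# Time to wrap on a saturated 100 Mbps port.
print(round(wrap_seconds(100e6)))

# How many times the pmacctd total would have wrapped a 32-bit counter,
# and what such a counter would show afterwards.
pmacct_bytes = 57_439_276_227
print(pmacct_bytes // 2**32, pmacct_bytes % 2**32)
```

A counter that wraps every few minutes under load cannot be compared directly with a 12-day total, so the packet counters are the more useful cross-check here.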

 Can you do a bit of profiling? Like: see what is the average traffic 
 download/upload for the host X; also what is the average bytes per 
 packet value. Then, when you see an huge downstream traffic rate, see 
 what happens to the upstream. Do you see any correspondence with respect 
 to the average values?

Running this query:

select a.stamp_inserted,
   a.ip_src, a.ip_dst, a.bytes, a.packets,
   b.ip_src, b.ip_dst, b.bytes, b.packets 
from acct_v7 as a
left join acct_v7 as b
on a.stamp_inserted = b.stamp_inserted
where a.bytes > 100000000
and (a.ip_src <> b.ip_src or a.ip_dst <> b.ip_dst);

to find all records with the same timestamp as the excessive ones, I can 
see that:

* when a host is accused of sending a lot of traffic, it doesn't receive a
   lot of traffic at the same time; but

* when a host is accused of sending a lot of traffic, other hosts are also
   accused of sending (but not receiving) a lot of traffic; and

* the same goes for s/sending/receiving/g and vice versa.

 Yes, enable 64-bit counters and see what happens. If you see in a single 
 entry ~8GB of traffic, then everything was correct. Otherwise something 
 must have been wrong on the pmacct side. Running tcpdump in parallel 
 would be great for double-checking. And yes, pmacct honours timestamps 
 within pcap trace files.

OK, done. I assume the default snaplen of 96 bytes is OK for pmacct?

Cheers, Chris.


Re: [pmacct-discussion] HTTP Virtual Hosts classification

2009-02-18 Thread Chris Wilson
Hi all,

On Wed, 18 Feb 2009, Paolo Lucente wrote:

 In concept, and as documentation says, what you want to achieve is 
 feasible and your understanding of the classifier() is correct - you 
 only have to write down your own patterns: re-phrased, regular 
 expressions are typically employed to recognize protocols but they can 
 be of course used to recognize virtual hosts when in presence of 
 text-based protocols (ie. HTTP, FTP or POP3).
 
 As you said this is quite innovative and interesting - so let me know if 
 i can support you somehow (feel also free to contact me privately). For 
 now i have not received any feedback which can help you dimensioning the 
 solution - so can't say how easy it would be to deploy in this sense; 
 perhaps somebody reading can fill this gap?

I have thought about doing this as well. The main problem that I had with 
using classifiers is that I ultimately would have to implement a TCP 
engine to reassemble the stream from packets (perhaps the one in snort can 
be borrowed?). Otherwise the Host: header could (accidentally or 
deliberately) be split across multiple packets. There is plenty of 
opportunity for exploitation here as well, e.g. multiple Host: headers, 
invalid characters in headers, packets that look like HTTP requests in the 
middle of streams, bad Content-Lengths, etc.
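
The split-header problem is easy to demonstrate with a small Python sketch. The pattern HOST_RE and the helper find_host are hypothetical stand-ins for a classifier pattern, not pmacct's classifier syntax, and the request is a made-up example.

```python
import re

# Naive per-payload pattern for the Host header (illustrative only).
HOST_RE = re.compile(rb"^Host:[ \t]*([^\r\n]+)\r\n", re.MULTILINE)

def find_host(payload: bytes):
    """Return the Host header value found in this payload, or None."""
    m = HOST_RE.search(payload)
    return m.group(1).decode("ascii") if m else None

request = b"GET / HTTP/1.1\r\nHost: www.example.org\r\n\r\n"
packets = [request[:19], request[19:]]   # segment boundary inside "Host"

per_packet = [find_host(p) for p in packets]
reassembled = find_host(b"".join(packets))
print(per_packet, reassembled)
```

Matching each packet on its own misses the header entirely, while matching the reassembled stream finds it, which is why some form of TCP reassembly would be needed.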

What I was planning to do, but have not done yet, is to:

* force everyone to use an HTTP proxy (transparent or not) so that dealing 
with malicious requests becomes someone else's problem;

* use the HTTP proxy's logging features to capture the full details of 
both requests (inbound to proxy and outbound from proxy) along with the 
requested URI and current time;

* save all this in a separate table in the database;

* left join from pmacct's acct_v* table to the proxy table on the unique 
quadruple (ip_src,ip_dst,src_port,dst_port) and time.
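
The join in the last step could look roughly like the following Python sketch, which uses an in-memory SQLite database so that it is self-contained. The proxy_log table, its columns and the sample rows are hypothetical, the acct table is a cut-down stand-in for acct_v*, and matching on an exact timestamp is a simplifying assumption.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE acct (ip_src TEXT, ip_dst TEXT, src_port INT, dst_port INT,
                   stamp_inserted TEXT, bytes INT);
CREATE TABLE proxy_log (ip_src TEXT, ip_dst TEXT, src_port INT, dst_port INT,
                        stamp TEXT, uri TEXT);
INSERT INTO acct VALUES
  ('10.0.0.5', '10.0.0.1', 40001, 3128, '2009-02-18 12:00:00', 5120),
  ('10.0.0.6', '10.0.0.1', 40002, 3128, '2009-02-18 12:00:00', 900);
INSERT INTO proxy_log VALUES
  ('10.0.0.5', '10.0.0.1', 40001, 3128, '2009-02-18 12:00:00',
   'http://www.example.org/');
""")

# Left join keeps every accounted flow, with NULL where no proxy entry matches.
rows = db.execute("""
SELECT a.ip_src, a.bytes, p.uri
FROM acct AS a
LEFT JOIN proxy_log AS p
  ON  a.ip_src = p.ip_src AND a.ip_dst = p.ip_dst
  AND a.src_port = p.src_port AND a.dst_port = p.dst_port
  AND a.stamp_inserted = p.stamp
ORDER BY a.ip_src
""").fetchall()
print(rows)
```

The join only works if the proxy log records the client port, since that is what makes the quadruple unique.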

This was appropriate for my situation as I wanted everyone to use a 
caching proxy anyway to save bandwidth, and hopefully to authenticate. 
However I discovered that Squid's logging formats do not provide all the 
information that I needed to reliably match up the connection (no client 
port, see http://www.visolve.com/squid/squid30/logs.php#logformat).

The external ACL program does have enough information for this
(http://www.visolve.com/squid/squid30/externalsupport.php#external_acl_type), 
so writing a program to run as an external ACL helper and log the 
information to the database is a possibility. 

In our case this also was not good enough, as it does not tell us whether 
the request will be served from the cache or not, and therefore does not 
correspond to the client's real bandwidth usage.

I would be very interested to see what you do in this space.

Cheers, Chris.


Re: [pmacct-discussion] multiple interfaces

2009-01-23 Thread Chris Wilson
Hi Mariano,

On Fri, 23 Jan 2009, Mariano Spadaccini wrote:

 Now the problem is only on the tagged port. But I have tried other 
 probes, with the same error (only unidirectional flows).
 
 However, I have resolved it with one pmacctd per interface (untagged port).

Have you tried using "any" as the interface name to capture all flows? I 
think it should work, although it will not put any interface into 
promiscuous mode. Please let us know if it does work.

Cheers, Chris.


Re: [pmacct-discussion] multiple interfaces

2009-01-07 Thread Chris Wilson
Hi Anil and Juan,

On Wed, 7 Jan 2009, Juan Rivera wrote:

 My understanding is that any one instance of the daemon can only bind to 
 a single interface.  I think that a workaround would be to run more than 
 one instance of the daemon, one per interface, and use a different 
 configuration file for each instance.

tcpdump can bind to all interfaces but it can't put them all into 
promiscuous mode at the same time. If that's OK for your application, try 
using the device "any" instead of a real device.

Cheers, Chris.



Re: [pmacct-discussion] pNRG and graphing

2008-10-21 Thread Chris Wilson
Hi Gregory,

On Tue, 21 Oct 2008, Gregory Machin wrote:

 I'm trying to configure pmacctd to graph traffic passing through the 
 public interface of a firewall. The public interface is connected to 
 an ADSL router; they share a dedicated private LAN. The firewall's IP 
 is 192.168.42.1 and the ADSL router's IP is 192.168.42.10, with the 
 firewall's default gateway configured to 192.168.42.10.
 
 Why does pNRG show traffic for 192.168.42.10 and none for 192.168.42.1?

Do you have any traffic destined for 192.168.42.1? E.g. if you run 
"tcpdump -n -i eth1 dst host 192.168.42.1" does it show anything?

I suspect that almost all your traffic is actually destined to hosts out 
on the Internet, especially as you are looking at the external interface. 
I would not expect to see any traffic destined for 192.168.42.1 arriving 
on the public interface, as your ISP should not be routing such traffic to 
your connection.

 I only want to graph all the incoming (sum of) and outgoing (sum of)
 traffic passing through eth1 / 192.168.42.1.
 In short, I want to graph the network utilisation of the public
 interface so I can see if the ADSL is being maxed out.
 How could I do this?

You should already have it. Just add up the traffic for each source 
address with a SQL SUM/GROUP BY and it will give you total traffic for all 
hosts.
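
As an illustration of that aggregation, here is a minimal Python sketch using an in-memory SQLite table. The acct table is a cut-down, hypothetical stand-in for the real pmacct table, and the rows are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE acct (ip_src TEXT, stamp_inserted TEXT, bytes INT)")
db.executemany("INSERT INTO acct VALUES (?, ?, ?)", [
    ("192.168.42.10", "2008-10-21 10:00:00", 1000),
    ("192.168.42.10", "2008-10-21 10:01:00", 2500),
    ("203.0.113.7",   "2008-10-21 10:00:00", 400),
])

# Per-source totals, then the grand total across all sources.
per_source = db.execute(
    "SELECT ip_src, SUM(bytes) FROM acct GROUP BY ip_src ORDER BY ip_src"
).fetchall()
total = db.execute("SELECT SUM(bytes) FROM acct").fetchone()[0]
print(per_source, total)
```

Grouping by stamp_inserted instead of ip_src would give one total per time bucket, which is the shape a utilisation graph needs.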

Cheers, Chris.


Re: [pmacct-discussion] MySQL and Duplicate Primary Keys

2008-10-08 Thread Chris Wilson
Hi Paolo,

On Wed, 8 Oct 2008, Paolo Lucente wrote:

 Also, i see two different PIDs logging the duplication issue in your 
 email; whereas disabling the primary key the same tuple is written three 
 times; is it possible that there are multiple (3) concurrent pmacctd 
 instances running by mistake?

Sorry, I think you're right, there were multiple instances running :(

Thanks again for your help.

Cheers, Chris.


[pmacct-discussion] MySQL and Duplicate Primary Keys

2008-10-02 Thread Chris Wilson
Hi all,

I always get a lot of errors like this when using pmacct on a MySQL 
database:

Oct  2 06:26:01 fen-fw pmacctd[16237]: ERROR ( default/mysql ): Duplicate 
entry 
'00-0-0-217.160.76.21-10.0.156.226-4949-33730-tcp-0-2008-10-0' for key 
1  
Oct  2 06:26:01 fen-fw pmacctd[16239]: ERROR ( default/mysql ): Duplicate 
entry 
'00-0-0-217.160.76.21-10.0.156.226-4949-33730-tcp-0-2008-10-0' for key 
1  

(I didn't paste that line twice, there really are two identical lines in 
the log).

After I delete the primary key, I get duplicate rows in the database, like 
this:

mysql> select ip_src,ip_dst,src_port,dst_port,stamp_inserted from acct_v6 
where ip_src="10.0.156.1" and ip_dst="10.0.156.210" and src_port=53 and 
dst_port=56556 and stamp_inserted="2008-10-02 10:43:00";
+------------+--------------+----------+----------+---------------------+
| ip_src     | ip_dst       | src_port | dst_port | stamp_inserted      |
+------------+--------------+----------+----------+---------------------+
| 10.0.156.1 | 10.0.156.210 |       53 |    56556 | 2008-10-02 10:43:00 |
| 10.0.156.1 | 10.0.156.210 |       53 |    56556 | 2008-10-02 10:43:00 |
| 10.0.156.1 | 10.0.156.210 |       53 |    56556 | 2008-10-02 10:43:00 |
+------------+--------------+----------+----------+---------------------+
3 rows in set (0.00 sec)

(I've omitted the other columns, but they really are all identical).

The configuration is:

aggregate: src_host, src_port, dst_host, dst_port, proto
sql_history: 1m
sql_history_roundoff: m
sql_table_version: 6
sql_refresh_time: 60
sql_multi_values: 1024000
sql_dont_try_update: true
sql_optimize_clauses: true

Does anyone have any ideas about what might cause this? 

I'm using pmacct 0.9.1 on this server. I know it's old, but it's what 
comes with Ubuntu Dapper.

Cheers, Chris.


Re: [pmacct-discussion] How does pmacct divide between in and outbound traffic?

2008-08-04 Thread Chris Wilson
Hi Dennis,

Dennis Kempin wrote:

 I am currently trying to set up pmacct to account traffic between my host and 
 the internet. 
 
 I account src and dst hosts without any filtering.
 aggregate[out]: dst_host,src_host
 aggregate[in]: dst_host,src_host
 
 Looking at the results, I wondered how pmacct distinguishes between inbound 
 and outbound traffic.
 My IN socket shows many connections from my IP to the internet. 

It doesn't. You have to tell it how to, e.g. by applying an appropriate 
filter to each plugin, for example:

aggregate_filter[in]: dst net 192.168.0.0/16
aggregate_filter[out]: src net 192.168.0.0/16
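
To show what those two filters select, here is a small Python sketch that mimics them with the standard ipaddress module. The function name plugins_for is hypothetical, and the sketch only models the two address tests, not pcap filter syntax in general.

```python
import ipaddress

LOCAL_NET = ipaddress.ip_network("192.168.0.0/16")

def plugins_for(src: str, dst: str):
    """Names of the plugins whose filter would match this packet."""
    matched = []
    if ipaddress.ip_address(dst) in LOCAL_NET:
        matched.append("in")      # dst net 192.168.0.0/16
    if ipaddress.ip_address(src) in LOCAL_NET:
        matched.append("out")     # src net 192.168.0.0/16
    return matched

print(plugins_for("203.0.113.9", "192.168.1.20"))
print(plugins_for("192.168.1.20", "203.0.113.9"))
print(plugins_for("192.168.1.20", "192.168.1.30"))
```

Note that traffic between two local hosts matches both filters, so it would be counted by both plugins unless the filters also exclude the local net on the other side.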

Cheers, Chris.


Re: [pmacct-discussion] mysql plugin connect problem

2008-07-23 Thread Chris Wilson
Hi anil,

Anil wrote:

 ( default/mysql ) *** Purging cache - START ***
 ERROR ( default/mysql ): PRIMARY 'mysql' backend trouble.
 ERROR ( default/mysql ): The SQL server says: Access denied for user
 'admin'@'%.domain.com' to database 'bandwidth_db'
 
 ( default/mysql ) *** Purging cache - END (QN: 0, ET: 0) ***
 
 
 But in the mysql logs, I see that it connected w/o a problem:
 
 
 080722 22:15:01   12 Connect[EMAIL PROTECTED] on bandwidth_db
12 Query LOCK TABLES `acct` WRITE
 
 
 Why does the ERROR show %.domain.com instead of host.domain.com,
 which I specifically setup in my configuration:

host.domain.com is what MySQL gets by doing a reverse lookup on the IP 
address that you connected from. [EMAIL PROTECTED] is the matching rule 
from your grant tables that was used to decide what access this user 
has, and apparently MySQL thinks that this user (pattern) doesn't have 
access to the bandwidth_db database.

Cheers, Chris.


Re: [pmacct-discussion] pmacct and nat ?

2008-07-07 Thread Chris Wilson
Hi Sebastien,

Sébastien CRAMATTE wrote:

 I'm running pmacctd on a NATted network.
 pmacctd accounts local traffic properly. My problem is that when I visit 
 a website or anything else that is beyond the NAT router (I'm 
 connected with a cable modem), the traffic is never accounted!
 
 Is this the normal behaviour? I've tested 
 with ntop too, and ntop does give me this data, which is why I ask.
 Normally, interfaces in promiscuous mode should see every kind of traffic?

Do you mean that you don't see traffic from other machines on your 
network out to the Internet? That probably means that your machine 
doesn't see the traffic. If it's in promiscuous mode, that probably 
means that you have a switch rather than a hub. Try configuring your 
switch with a mirror port, or putting your sensor inline with your NAT 
router as a transparent bridge. Also check that you can see the traffic 
with tcpdump or wireshark before blaming pmacct :)

Cheers, Chris.


Re: [pmacct-discussion] Measurement accuracy issues

2008-06-10 Thread Chris Wilson
Hi Ahmed,

On Tue, 10 Jun 2008, Ahmed Kamal wrote:

 I have setup pmacct with your help, and it's been running like a champ. I
 have also installed darkstat for comparison. I am seeing a big error (around
 30%) between the 2 tools!
...
 Here's what I am seeing:

 IP             START      END        DELTA      DARKSTAT(bytes)
 81.10.100.42   7607.7053  9477.4200  1869.7147  1,397,584,555
 81.10.100.73   3603.2834  4716.6248  1113.3414    810,169,491
 81.10.100.37   3540.3343  5698.6758  2158.3415  1,573,900,631
 81.10.100.199  3444.3568  4358.3895   914.0327    575,124,842
 81.10.100.75   2951.8349  3697.5900   745.7551    556,560,149
 81.10.100.30   2770.9552  3807.6038  1036.6486    715,830,077
 81.10.100.46   2698.5764  3987.1379  1288.5615    856,582,079
 81.10.100.44   1982.1858  2381.7297   399.5439    296,992,631
 81.10.100.71   1880.2033  2522.7183   642.5150    548,180,038
 81.10.100.201  1300.2739  2040.0713   739.7974    411,031,858

 Those are the top 10 BW users. All measurements are in MB (from SQL query),
 darkstat data is in bytes. As you can see, the first line it's 1.9GB vs
 1.4GB and so on ...

 Any ideas how to track such errors ?

My first suspicion would be that Darkstat is reporting bytes transferred
(TCP data) rather than total size of packets. You can confirm this with
some simple tests. E.g. create a file of exactly 1MB on a remote web
server and download it through your pmacct/darkstat box. If darkstat
reports that the amount downloaded is just over 1MB (e.g. 1.001 MB) then
it's reporting TCP data.

pmacct will always report packet sizes (IP data) and therefore is likely
to report more bytes downloaded. Given that the TCP/IP header overhead is
about 40 bytes per 1500 byte packet, i.e. about 2.7%, I'd expect it to
report about 1.027 MB in this case.

The overhead will be much higher for smaller packets which may explain
your observed 30% discrepancy. If so, this is arguably a bug (or
limitation) of darkstat rather than pmacct.
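
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: it assumes 40 bytes of headers per packet (20 bytes IPv4 plus 20 bytes TCP, no options), and the function names are made up for the example. Pure ACK packets, which carry no payload at all, are ignored.

```python
# Assumed header size: 20-byte IPv4 header + 20-byte TCP header, no options.
HEADER_BYTES = 40

def ip_bytes_for_payload(payload_bytes, packet_size):
    """Estimate the IP-level bytes needed to carry payload_bytes of TCP data
    when every packet is packet_size bytes long, headers included."""
    payload_per_packet = packet_size - HEADER_BYTES
    if payload_per_packet <= 0:
        raise ValueError("packet_size must exceed the header size")
    return payload_bytes * packet_size / payload_per_packet

def overhead_fraction(packet_size):
    """Extra IP-level bytes as a fraction of the TCP payload."""
    return HEADER_BYTES / (packet_size - HEADER_BYTES)

if __name__ == "__main__":
    for size in (1500, 576, 173):
        print(f"{size:5d}-byte packets: {overhead_fraction(size):.1%} overhead")
```

Under these assumptions, full 1500 byte packets give about 2.7% overhead, while the average packet would have to be as small as roughly 173 bytes for header overhead alone to reach 30%.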

Please let us know what you discover.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.




Re: [pmacct-discussion] pmacctd transparent proxy

2006-12-21 Thread Chris Wilson
Hi all,

On Thu, 21 Dec 2006, Jaime Nebrera wrote:

 I have a Linux router as the internet gateway for a small office, with 
 pmacctd running. It works well for now. But when I start the transparent 
 proxy with a permanent redirect of HTTP to it, pmacct doesn't count 
 incoming HTTP traffic. I know that it comes from the webserver to my 
 router, not to the LAN client.

 Does anybody know how to count such traffic and assign it to the LAN
 host?

  We have faced the same problem in the past and are currently
  experimenting with the only solution available.

  You need to use tproxy :) This means patching the kernel and iptables,
  patching Squid and, well, getting into there. We have made it work but
  are unsure yet of its other consequences (besides, of course, being able
  to see the internal IPs).

If I understood the problem correctly, then I think there is another 
possible solution: write your own transparent proxy (or modify an existing 
one) to intercept the X-Forwarded-For and Host headers, and all four IP 
addresses and port numbers (a pair of each for the connection into and out 
of the proxy).

You can put this information in a database table that you can link to the 
pmacct accounting tables whenever you need it. An added bonus is that you 
get the name of the remote website, not just the port number, whenever you 
want it.

The disadvantages are that your web connections are broken into two 
connections in the pmacct database (which just means that it is reflecting 
reality); and your pmacct client software needs to be modified to take 
advantage of the new table.
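
A minimal sketch of the linking idea, using an in-memory SQLite database. The `proxy_log` table and its columns are hypothetical, and `acct` here is a simplified stand-in for a pmacct accounting table rather than the real schema. The addresses are documentation examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-in for a pmacct accounting table (assumed columns).
CREATE TABLE acct (
    ip_src TEXT, src_port INTEGER,
    ip_dst TEXT, dst_port INTEGER,
    bytes INTEGER
);
-- Hypothetical table filled in by the modified proxy, one row per request.
CREATE TABLE proxy_log (
    client_ip TEXT, client_port INTEGER,   -- LAN client end of connection 1
    proxy_ip TEXT, proxy_port INTEGER,     -- proxy end of connection 2
    server_ip TEXT, server_port INTEGER,   -- remote web server
    host TEXT                              -- value of the Host header
);
""")
conn.execute(
    "INSERT INTO acct VALUES ('203.0.113.5', 80, '198.51.100.1', 40001, 52000)")
conn.execute(
    "INSERT INTO proxy_log VALUES ('192.168.0.10', 33001, "
    "'198.51.100.1', 40001, '203.0.113.5', 80, 'www.example.org')")

def bytes_per_client(db):
    """Attribute server-to-proxy traffic to the LAN client that requested it."""
    return db.execute("""
        SELECT p.client_ip, p.host, SUM(a.bytes)
        FROM acct AS a
        JOIN proxy_log AS p
          ON a.ip_src = p.server_ip AND a.src_port = p.server_port
         AND a.ip_dst = p.proxy_ip  AND a.dst_port = p.proxy_port
        GROUP BY p.client_ip, p.host
    """).fetchall()

print(bytes_per_client(conn))
```

Port numbers are reused over time, so a real version would also need timestamps in both tables to match each proxy connection against the right accounting rows.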

Cheers, Chris.
-- 
(aidworld) chris wilson | chief engineer ([EMAIL PROTECTED])



Re: [pmacct-discussion] Classification

2006-11-07 Thread Chris Wilson
Hi Paolo,

On Wed, 18 Oct 2006, Paolo Lucente wrote:

 I'd be interested to know if anyone has combined layer 7 classification
 with pmacct's traffic aggregation. For example, I would like to combine
 all Kazaa traffic (per minute) into a single counter.

 It's already there: have a look at the "VIII. Quickstart guide to
 packet classifiers" chapter in EXAMPLES.

Thanks for pointing me towards that, and apologies for the delay in 
replying. I also found a link to [http://www.pmacct.net/classification/] 
which was quite well hidden on the main pmacct web page :-) and which 
explained what I needed to know: an overview of how the existing structure 
works.

 Yes, traffic shaping between interfaces should be better done in kernel. 
 And i fully agree with you: doing the job twice is not great idea. So, 
 if you can see a way to, say, get the flows from libpcap and 
 classification infos from the kernel, just let me/us know as it sounds a 
 good idea!

OK, I have some idea of how this might work. Harald Welte, one of the 
Netfilter developers, has proposed a system for accounting flows in the 
kernel as part of Netfilter's Conntrack code. He presented a paper on this 
at LinuxTag 2005, which unfortunately is not available online in PDF form 
(since LinuxTag apparently charges for access to conference papers). I 
generated an HTML version and attached it here:

[http://bmo.aidworld.org/attach/Chris/paper.html]

Basically this means that the Linux kernel will be keeping track of flows, 
and can notify user space about flow events. Combined with IPP2P or 
L7-filter, we will have all the information that we need in the kernel, 
and efficient access to it from user space.

So what I'm considering is to create a new version of pmacctd (like 
sfacctd, nfacctd) called ctacctd, which reads flow information from the 
kernel rather than from pcap, etc. Otherwise it would have the same 
data storage backend and processing tools as the pmacct suite. I hope that 
it could be included in the pmacct suite, even if it only works on Linux.

The use of Layer 7 inspection in Netfilter gives us a powerful advantage, 
because we can monitor and shape traffic on the same box, with minimal 
reclassification. Perhaps it can be ported to the BSDs, etc, if I can 
figure out how to access the connection tracking system from user space.

I'm currently on contract to an organisation in Kenya which is 
using flowc for traffic monitoring. Flowc has a powerful user interface 
and graphs, but it's extremely difficult to set up, and only works with 
Cisco routers using Netflow. I'm considering implementing some of this 
functionality for the pmacct suite.

I'm still concerned about the performance of the MySQL plugin with 
threading, so I'm considering providing an option to disable the extra 
threads, and run updates synchronously.

I'd be very interested to hear your comments on these ideas. Thanks in 
advance.

Cheers, Chris.
-- 
(aidworld) chris wilson | chief engineer (http://www.aidworld.org)



Re: [pmacct-discussion] Classification

2006-11-07 Thread Chris Wilson
Hi Sven,

On Tue, 7 Nov 2006, Sven Anderson wrote:

 He gave the same talk at the Linux Symposium 2005; you can find the paper
 in the proceedings:

 http://www.linuxsymposium.org/2005/linuxsymposium_procv2.pdf

Great, thanks for that.

 - First, he clearly pointed out, that flow accounting in the conntrack 
 module makes sense _only_ if you use conntrack anyway (like firewall, 
 NAT, ...). To use conntrack just for flow accounting would be just 
 overkill, he wrote.

Yes, and in our case we will be doing that anyway, because we want to 
traffic shape flows.

 - Second, you are strictly bound to the classical flow keys which are 
 kept in the conntrack table anyway, that is source and destination IP 
 and port.

 So the usage of the flow-accounting module in conntrack is quite
 restricted, but as long as these restrictions don't bother, it's a good
 alternative of course. (At the moment pmacct also only has a fixed flow
 data structure, but with the propagation of IPFIX I hope we will move to a
 more flexible structure.)

But this is also how Netflow works, isn't it? The Cisco router has some 
idea about flows that isn't changeable externally, and it will send you 
updates about their state whenever it feels like it. I think that the 
kernel sending you information about its understanding of flows (which 
ctacctd would be free to reinterpret and aggregate) would work similarly.

 But for traffic-shaping based on application level analysis you have a
 problem already: You can classify packets, but you cannot store that
 information in the conntrack table as a flow key (AFAIK).

You can store it using connmark. I have to find a way to export that data 
to user space, but it shouldn't be hard once nfnetlink_conntrack exists.

 Of course you could store that information in another place and map it 
 to the flows in the conntrack table, but then the - let's call it - 
 L7ClassID is not a real flow key, since it is possible that one flow 
 (in the conntrack table) has several different L7ClassIDs over time, 
 splitting it in different flows in fact.

I don't mind that in practice. I could ignore the classification from the 
point of view of distinguishing flows. Also, I thought that pmacct had the 
ability to reclassify existing flows?

 In general you have to ask yourself the question, if having both routing
 and monitoring on the same machine is a good idea. You will probably
 always end up in a situation, where both functionalities interfere with
 each other. That's why I think, having a dedicated metering-probe is in
 most cases the better choice. And then, as the machine is not doing
 anything else with the monitored packets, handling everything in
 user-space is the better approach. Under Linux you can even optimize the
 network-adapter-user-space transition with PF_RING by Luca Deri. Of
 course, you cannot use this set-up if you want to do traffic shaping or
 similar based on the monitoring.

Yes, that is exactly what I want to do. I want to shape bittorrent, 
gnutella and skype traffic without having to know what port it's running 
on.

 I'm still concerned about the performance of the MySQL plugin with 
 threading, so I'm considering providing an option to disable the extra 
 threads, and run updates synchronously.

 Interesting. What about having also a switch to have numbers-only 
 tables, that is IP addresses, timestamps, class_id, mac addresses and 
 protocol are all stored as integers?

I don't see how that would help. It's basically just changing the constant 
multiplier cost. The problem I'm having is that when the database or the 
box is busy, pmacct starts spawning more and more threads that end up 
sleeping on the database. This eats resources and can lead to catastrophic 
failure (it has done it to me at least once). I would rather delay writing 
to the database by having it done synchronously, to limit the damage that 
it can do to the rest of the box.
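
The behaviour being asked for can be sketched as a single writer thread fed through a bounded queue. This is a generic Python illustration of the pattern, not pmacct's implementation; `run_serial_writer`, `write_batch` and `max_pending` are names invented for the example.

```python
import queue
import threading

def run_serial_writer(batches, write_batch, max_pending=4):
    """Feed batches to one writer thread through a bounded queue."""
    pending = queue.Queue(maxsize=max_pending)

    def writer():
        while True:
            batch = pending.get()
            if batch is None:      # sentinel: no more work
                break
            write_batch(batch)     # only this thread ever touches the database

    thread = threading.Thread(target=writer)
    thread.start()
    for batch in batches:
        pending.put(batch)         # blocks while max_pending batches are waiting
    pending.put(None)
    thread.join()

if __name__ == "__main__":
    written = []
    run_serial_writer([[1, 2], [3], [4, 5, 6]], written.append)
    print(written)
```

When the database is slow, the producer blocks on `put` instead of starting another writer, so the number of threads and the memory held in pending batches both stay bounded. The cost is that writes are delayed, and the producer has to cope with being stalled.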

 While on the subject of changing everything: what about a different 
 timestamp set? I would prefer three timestamps: one for the first and 
 the last packet in a flow, and one for the time the flow got closed 
 (or updated the last time) which would correspond to the time-slot the 
 flow belongs to. The third one is probably not really necessary, as you 
 can calculate it from the other timestamps and the configuration, but it 
 would give you a good index-key for the time-slots.

Sorry, I don't understand what you mean by a time slot? For me, the 
relevant information is the start and end times of the flow, which I can 
use to draw graphs, etc.

Ideally, I would like more detailed information about the flow at various 
points during its life (e.g. status every minute) and I'm not sure if I 
can get that using pmacctd, or how. I'm still working on it.
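
One possible reading of "time slot" is a fixed-width accounting interval, with each flow record indexed by the start of the interval it falls into. The sketch below assumes Unix timestamps in seconds and a 60 second slot; both function names are hypothetical.

```python
def time_slot(timestamp, slot_seconds=60):
    """Return the start of the fixed-width slot containing timestamp."""
    return timestamp - (timestamp % slot_seconds)

def slots_spanned(start, end, slot_seconds=60):
    """List the start of every slot a flow touches between start and end."""
    first = time_slot(start, slot_seconds)
    last = time_slot(end, slot_seconds)
    return list(range(first, last + slot_seconds, slot_seconds))

if __name__ == "__main__":
    # A flow running from t=125 to t=250 touches three one-minute slots.
    print(slots_spanned(125, 250))
```

Per-minute status for a long-lived flow would then amount to one record per slot the flow spans.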

Cheers, Chris.
-- 
(aidworld) chris wilson | chief engineer (http://www.aidworld.org)


[pmacct-discussion] Large number of threads

2006-10-18 Thread Chris Wilson
Hi all,

I'm running pmacct on a fairly low spec box (Celeron 366, 128 MB RAM) with 
a MySQL database. It started off fine, but as the box started to run out 
of memory (due to Apache I think), pmacctd started spawning more threads 
to write to the database. I ended up with 73 processes/threads in total, 
almost all database writers.

Is this really a good idea? Wouldn't it be better to serialise database 
writes to some extent, to degrade gracefully rather than spiralling to 
death? Or is this already possible and I missed the config option?

Thanks in advance for your help.

Cheers, Chris.
-- 
(aidworld) chris wilson | chief engineer (http://www.aidworld.org)
