Re: [pmacct-discussion] Timestamps in RabbitMQ/JSON output
Hi Paolo,

On Tue, 3 Jun 2014, Paolo Lucente wrote:

> What you describe for timestamps seems a good match for NetFlow, i.e. cast packets into flows and handle these via a flow-aware cache (so active/passive expiration timers, max lifetime, etc.). All described is already part of the nfprobe plugin. Collecting back such data via nfacctd (on the same box where NetFlow is exported or ship it to some central location) enables to use timestamp_start, timestamp_end aggregation primitives - which should be precisely what you want to achieve. The beauty is that you can have all time references possible at once: timestamp_start, timestamp_end, stamp_inserted, stamp_updated. Don't know how much you like/dislike the solution but I'd encourage to run a proof-of-concept with these tools (which are all available already) so to see we are in line with your requirements and hence take it from there.

So at the moment I am developing this by running pmacctd (not nfacctd) on my own laptop to collect and graph my own traffic. Thanks for the suggestion of using timestamp_start and timestamp_end, which I didn't know you could aggregate on. However, when I added these to my aggregate line, I found that timestamp_start is in local time (not GMT) and in a human-readable date format, which is not great for processing in JavaScript, and timestamp_end doesn't appear to work properly:

DEBUG ( default/amqp ): publishing [E=pmacct RK=acct DM=0]: {timestamp_start: 2014-06-03 22:42:00.202820, ip_dst: 196.223.145.xxx, ip_proto: tcp, tos: 0, ip_src: 86.30.131.xxx, bytes: 142, port_dst: 36363, packets: 1, port_src: 2201, timestamp_end: 1970-01-01 03:00:00.0}

Is this a bug? Would it be easy to fix?

> About sql_refresh_time less than one second. I've not considered it for a simple reason: it seems to me like forcing an existing caching mechanism towards a real-time use-case. Then better to disable it at all and stream flows as they arrive onto the AMQP exchange. I have this on my todo list - does it seem what you are looking for?

It might be. Because I'm mainly using pmacctd (not having any NetFlow-capable hardware) I don't know how that would work in pmacctd. Would you send every packet? That could be an awful lot of traffic, with some flows having a thousand packets per second. We could process and aggregate it all on the client side, and that has uses (such as drilling down into individual packets), but it would be great to have the option of aggregating them on the server as well, at a resolution chosen by the user. It's definitely not something that I need now, but I would like you to have it on your radar that this might be useful for some people.

Cheers, Chris.
-- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791.
___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
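A minimal sketch of how a consumer might normalise the timestamp strings from the debug output in this message into epoch milliseconds (the form JavaScript's Date accepts). The function name and the UTC offset parameter are assumptions for illustration: the string carries no zone information, so the caller has to supply it. The 1970-01-01 03:00:00.0 value for timestamp_end would be consistent with an unset (zero) timestamp rendered in a UTC+3 local zone, though that is a guess rather than something confirmed in the thread.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(stamp, utc_offset_hours):
    """Convert a 'YYYY-MM-DD HH:MM:SS.ffffff' local-time string to epoch ms.

    utc_offset_hours is an assumption supplied by the caller; the string
    itself carries no time zone information.
    """
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    aware = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    # Integer division of timedeltas avoids floating-point rounding.
    return (aware - EPOCH) // timedelta(milliseconds=1)


# Values taken from the debug line above, assuming a UTC+3 local zone.
start_ms = to_epoch_millis("2014-06-03 22:42:00.202820", 3)
end_ms = to_epoch_millis("1970-01-01 03:00:00.0", 3)
print(start_ms, end_ms)
```

Doing the conversion on the consumer side is only a workaround; emitting epoch or UTC timestamps from the daemon would avoid having to guess the zone.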
Re: [pmacct-discussion] Newbie
Hi Mike,

On Sat, 5 Apr 2014, Mike Hammett wrote:

> The OfficialConfigKeys is very verbose and no doubt holds the key (no pun intended) to every possible configuration, but all config examples I've found seem drastically simplistic or seemingly incomplete.

Try this one:

daemonize: false
debug: true
pidfile: /var/run/nfacctd.pid
! logfile: /var/log/nfacctd.log
! syslog: daemon
nfacctd_port: 4096
plugins: mysql
aggregate: src_host, src_port, dst_host, dst_port, proto
sql_db: pmacct
sql_table: acct_v8
sql_history: 1m
sql_history_roundoff: m
sql_table_version: 8
sql_host: 127.0.0.1
sql_user: pmacct
sql_passwd: X
sql_refresh_time: 60
sql_dont_try_update: true
sql_optimize_clauses: true
sql_preprocess: minb = 1

From page 47 of: http://www.ws.afnog.org/afnog2013/tutorials/bmo/afnog-bmo-presentation.odp

Cheers, Chris.
-- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791.
___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] HTTP traffic classification
Hi Karl,

On Mon, 24 Mar 2014, Karl O. Pinc wrote:

> On 03/24/2014 06:31:30 AM, Stathis Gkotsis wrote:
>> Concerning HTTP: I guess the thing to output would be hostname, since you can have multiple HTTP requests to different URLs inside one TCP session. About DNS, what should be output? I guess the hostname for A queries is good enough to start with.
>
> I'm not clear on where DNS would fit into this. Offhand, DNS lookups (and then reverse DNS lookups, etc.) should not be part of pmacct. There's just too much latency. People who want that sort of thing should work out how to do it outside of pmacct.

I'd like to see the *content* of DNS requests and responses available to be logged in data records by pmacct. It can be very helpful in identifying which website someone was trying to access, when all we have is an IP address. I accept that not everybody would want this, but I do.

Cheers, Chris.
-- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791.
___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] HTTP traffic classification
Hi all,

On Sat, 22 Mar 2014, Viacheslav Dubrovskyi wrote:

> 22.03.2014 21:20, Stathis Gkotsis wrote:
>> First, I would like to thank you for the great product, pmacct has proven very useful to me, which brings me to my question :) I see that it is possible to enable traffic classification, which is about detecting L7 protocol. I am particularly interested in HTTP and also outputting the hostname or url, e.g. in exports via the print module. Is this somehow possible?
>
> IMHO better use special tools https://github.com/jbittel/httpry

I'm also interested in this. Even if it's captured by a separate tool (and I'm not sure why it couldn't be integrated with pmacct's L7 classifiers) I would really like to be able to log HTTP and HTTPS hostnames of connections, and correlate them with flows recorded by pmacct and with DNS requests and responses.

It's not clear that httpry can log the source and destination host and port at all, let alone store them in a SQL database (no sample output is provided), and presumably it does nothing with HTTPS.

Cheers, Chris.
-- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791.
___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Error: 'sql_multi_values' is too small (100). Try with a larger value
Hi Edward, On Tue, 3 Sep 2013, Edward van Kuik wrote: Sep 2 17:59:01 microserver pmacctd[17603]: ERROR ( summary/mysql ): 'sql_multi_values' is too small (100). Try with a larger value. I set mine to 1000. OK, so 1000 might work for you now. But it seems that pmacct can't split the inserts into multiple batches, otherwise a smaller batch size would work too. So one day you might have more than 1000 flows to insert at a time, and you'd get this error and lose data. In fact are you sure you haven't lost any data already? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Error: 'sql_multi_values' is too small (100). Try with a larger value
On Tue, 3 Sep 2013, Edward van Kuik wrote: No, it should definitely batch the data into inserts of 1000 values each. Then why would it give me this error message? The error doesn't make sense if pmacct does break inserts into smaller batches. Sep 2 17:59:01 microserver pmacctd[17603]: ERROR ( summary/mysql ): 'sql_multi_values' is too small (100). Try with a larger value. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Error: 'sql_multi_values' is too small (100). Try with a larger value
Hi Paolo, On Tue, 3 Sep 2013, Paolo Lucente wrote: Maybe a bug in documentation in the release you are using? CONFIG-KEYS says: The value of the directive is intended to be the size (in bytes) of the multi-values buffer.. So 100 bytes is on the low side, and by default MySQL comes with a 1MB buffer - after that you should tweak MySQL config first, then set the sql_multi_values value accordingly. I can confirm statements are batched in several buffers if one can't fit them all. Thanks, I understand now. I had completely missed that it was in bytes instead of rows. There does seem to be a minor bug in that pmacct appears to fall over if the value is too small. I'm sure it could log a warning and write larger but valid INSERT statements, with at least one VALUES row per statement. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
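To make the behaviour discussed in this message concrete, here is a rough sketch of splitting rows into several multi-value INSERT statements under a byte budget, with the fallback suggested above: a statement always carries at least one VALUES row even when the budget is too small. This is not pmacct's code; the function name, table and column names are hypothetical.

```python
def batch_inserts(prefix, rows, max_bytes):
    """Group VALUES rows into INSERT statements of at most max_bytes each.

    A statement always carries at least one row, even if that single row
    already exceeds max_bytes (the fallback suggested in the message above).
    """
    def size(text):
        return len(text.encode("utf-8"))

    statements = []
    current = []
    current_size = size(prefix)
    for row in rows:
        # The ", " separator is only needed between rows.
        extra = size(row) + (2 if current else 0)
        if current and current_size + extra > max_bytes:
            statements.append(prefix + ", ".join(current))
            current, current_size = [], size(prefix)
            extra = size(row)
        current.append(row)
        current_size += extra
    if current:
        statements.append(prefix + ", ".join(current))
    return statements


# Hypothetical table and columns, for illustration only.
prefix = "INSERT INTO acct (ip_src, bytes) VALUES "
rows = ["('10.0.0.1', 142)", "('10.0.0.2', 60)", "('10.0.0.3', 1500)"]
for statement in batch_inserts(prefix, rows, max_bytes=100):
    print(statement)
```

With a budget as small as 100 bytes the sketch degrades to one or two rows per statement instead of failing, which is the behaviour the message asks for.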
[pmacct-discussion] Error: 'sql_multi_values' is too small (100). Try with a larger value
Hi all,

I tried to enable the sql_multi_values option, setting it to a reasonable number of rows to insert at once (100) to avoid hitting the MySQL packet size limit, but I get these errors in the logs:

Sep 2 17:59:01 microserver pmacctd[17603]: ERROR ( summary/mysql ): 'sql_multi_values' is too small (100). Try with a larger value.
Sep 2 16:57:46 microserver pmacctd[17608]: ERROR ( inbound/mysql ): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'VALUES (FROM_UNIXTIME(1378141141), FROM_UNIXTIME(1378141080), '00:1b:21:92:98:17' at line 1

This looks like a bug to me. Surely it should be reasonable to insert up to 100 rows at a time (per SQL statement) instead of just 1?

Cheers, Chris.
-- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791.
___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Build fails to find libmysqlclient on 64-bit CentOS
Hi Paolo,

On Tue, 25 Jun 2013, Paolo Lucente wrote:

> Sure, thanks for the tip: makes sense, will do.

Also please find attached an RPM spec file to help build rpms for pmacct. It would be great if you could include this in the tarball.

Cheers, Chris.
-- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791.

%define with_pgsql 0
%define with_sqlite 0

Summary: Promiscuous mode IP Accounting package
Name: pmacct
Version: 0.14.3
Release: 1.cw.130625
License: GPL
Group: Monitoring
Source: http://www.pmacct.net/%{name}-%{version}.tar.gz
Source1: nfacctd.init
Source2: pmacctd.init
Source3: sfacctd.init
Source4: sfacctd.conf
#Patch1: pmacct-fix_realloc.patch
URL: http://www.pmacct.net/
BuildRoot: %{_tmppath}/%{name}-root
BuildRequires: mysql-devel gcc
%if %{with_pgsql}
BuildRequires: postgresql-devel
%endif
%if %{with_sqlite}
BuildRequires: sqlite-devel = 3.0.0
%endif
BuildRequires: libpcap-devel

%description
pmacct is a small set of passive network monitoring tools to measure, account, classify and aggregate IPv4 and IPv6 traffic; a pluggable and flexible architecture allows to store the collected traffic data into memory tables or SQL (MySQL, SQLite, PostgreSQL) databases. pmacct supports fully customizable historical data breakdown, flow sampling, filtering and tagging, recovery actions, and triggers. Libpcap, sFlow v2/v4/v5 and NetFlow v1/v5/v7/v8/v9 are supported, both unicast and multicast. Also, a client program makes it easy to export data to tools like RRDtool, GNUPlot, Net-SNMP, MRTG, and Cacti.

%prep
%setup -q
#%patch1
chmod a+rx docs examples sql
find docs examples sql -type f -print0 | xargs -r0 chmod -x

%build
if [ -r /usr/lib64/mysql/libmysqlclient.so ]; then
    MYSQL_LIBS='--with-mysql-libs=/usr/lib64/mysql'
fi
%configure \
    --sysconfdir=%{_sysconfdir}/%{name} \
    --enable-threads \
    --enable-64bit \
    --enable-mysql \
    $MYSQL_LIBS \
%if %{with_pgsql}
    --enable-pgsql \
    --with-pgsql-includes=/usr/include/pgsql/ \
%endif
%if %{with_sqlite}
    --enable-sqlite3 \
%endif
    --enable-ulog \
    --enable-ipv6 \
    --enable-v4-mapped
%__make %{?jobs:-j%{jobs}}

%install
%makeinstall
%{__install} -Dp %{SOURCE1} %{buildroot}/%{_sysconfdir}/init.d/nfacctd
%{__install} -Dp %{SOURCE2} %{buildroot}/%{_sysconfdir}/init.d/pmacctd
%{__install} -Dp %{SOURCE3} %{buildroot}/%{_sysconfdir}/init.d/sfacctd
ln -sf ../../etc/init.d/nfacctd $RPM_BUILD_ROOT/usr/sbin/rcnfacctd
ln -sf ../../etc/init.d/pmacctd $RPM_BUILD_ROOT/usr/sbin/rcpmacctd
ln -sf ../../etc/init.d/sfacctd $RPM_BUILD_ROOT/usr/sbin/rcsfacctd
%{__install} -Dp examples/nfacctd-sql_v2.conf.example %{buildroot}/%{_sysconfdir}/pmacct/nfacctd.conf
%{__install} -Dp examples/pmacctd-sql_v2.conf.example %{buildroot}/%{_sysconfdir}/pmacct/pmacctd.conf
%{__install} -Dp %{SOURCE4} %{buildroot}/%{_sysconfdir}/pmacct/sfacctd.conf
rm -f $RPM_BUILD_ROOT/usr/sbin/rc*acctd

%clean
%{__rm} -rf %{buildroot}

%files
%defattr(-, root, root)
%doc AUTHORS ChangeLog CONFIG-KEYS COPYING FAQS INSTALL KNOWN-BUGS NEWS QUICKSTART README TODO TOOLS UPGRADE
%doc docs examples sql
%attr(755,root,root) %{_bindir}/*
%attr(755,root,root) %{_sbindir}/*
%{_sysconfdir}/init.d/*
%dir /etc/pmacct
%attr(600,root,root) %config(noreplace) %{_sysconfdir}/pmacct/nfacctd.conf
%attr(600,root,root) %config(noreplace) %{_sysconfdir}/pmacct/pmacctd.conf
%attr(600,root,root) %config(noreplace) %{_sysconfdir}/pmacct/sfacctd.conf

%changelog
* Thu Mar 24 2011 zamir za...@mandriva.org 0.12.5-0mdv2011.0
+ Revision: 648360
- first build
- create pmacct

___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
[pmacct-discussion] Build fails to find libmysqlclient on 64-bit CentOS
Hi Paolo, Configure fails to find /usr/lib64/mysql/libmysqlclient.so on 64-bit CentOS. You might want to add that to the list of search directories in configure.in? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Citylife House, Sturton Street, Cambridge, CB1 2QF, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Duplicate entry for key 1 (primary key violations)
Hi Paolo,

On Fri, 29 Jun 2012, Paolo Lucente wrote:

> On Tue, Jun 26, 2012 at 10:13:30AM +0100, Chris Wilson wrote:
>> OK, testing now. Would it be possible for pmacctd to log a warning if it exceeds any of these thresholds, to help with tuning without wasting memory?
>
> In a way you reckon things go wrong from the process list: the MySQL plugin writer process mentions the wording 'emergency' if the write was due to an unscheduled event. Then you know the value of the cache entries is too low. It's a good idea (and easy to implement) what you propose: when an emergency writer is triggered - then write the event to the logfile as well. Adding to my todo list.

Thanks for doing that :) I'm testing the latest CVS now.

>> Is it possible that it either failed to remove some records from the cache, or calculated the timestamp of the database records incorrectly?
>
> Well, the former case would be a bug; the latter is not really possible unless somebody is playing with date on the system: pmacctd and uacctd just use timestamps fed by the underlying library. Is it possible the DB is underperforming and a commit from the previous hour is taking long to finish? Do you see a 1:1 relationship between the MySQL plugins and the writers when you have a look at the process list?

I don't think we have writers taking an hour to write. The system isn't that heavily loaded. I did notice that restarting the daemon generates these duplicate key errors. Restarting isn't completely compatible with an always-insert configuration and unique primary keys. I haven't reproduced the original problem yet.

On an unrelated note, how hard would it be to get the log message from ULOG stored in the database, for example in the classification field? I had a look through the code but I couldn't see any way to store this field from the received packet into the in-memory structure used to track flows.

Cheers, Chris.
-- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Duplicate entry for key 1 (primary key violations)
Hi Paolo,

On Wed, 18 Jul 2012, Paolo Lucente wrote:

>> On an unrelated note, how hard would it be to get the log message from ULOG stored in the database, for example in the classification field? I had a look through the code but I couldn't see any way to store this field from the received packet into the in-memory structure used to track flows.
>
> For clarity: which log messages are you referring to? The original packet (portion) itself with (or without) ancillary netfilter structures? If yes - then that is not currently possible.

The log message is an option of the ULOG target in iptables. We use it to help us debug our QoS traffic classification by showing which packets have which classification:

iptables -t mangle -A POSTROUTING $@ -j CLASSIFY --set-class $class
iptables -t mangle -A POSTROUTING $@ -j ULOG --ulog-prefix $class
iptables -t mangle -A POSTROUTING $@ -j RETURN

This results in a class string such as 1:123 being included in the output of the ulogd user-space application which receives the logs:

Jul 18 15:50:44 fen-fw2 1:123 IN= OUT=ppp0 MAC= SRC=10.0.156.131 DST=176.58.108.189 LEN=52 TOS=00 PREC=0x00 TTL=63 ID=54141 CE DF ...

This seems to come from ulog_packet_msg_t.prefix according to the ulogd 2 sources.

> It's always possible to embed some data in some fields but the showstopper I see is an entry in the database has not 1:1 relationship with a single packet (portion): these should be concatenated or so (which I can anticipate is some work). What is the case study?

In our case, the classification could change mid-stream, as it depends on TOS flags and UDP packet sizes. I wonder whether it's possible to include the classification in the flow key in such cases, so we can separate out high and low priority traffic in the same stream and see how much traffic is being wrongly classified?

Cheers, Chris.
-- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
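As an illustration of the idea in the message above (making the classification part of the flow key), here is a small sketch that parses ulogd lines of the shape quoted there and sums bytes per (source, destination, class). It works on the text log outside pmacct and is not a pmacct feature; the regular expression and function names are assumptions based only on that one sample line.

```python
import re
from collections import Counter

# Assumed layout: syslog timestamp, host, ULOG prefix, then KEY=VALUE fields.
LINE_RE = re.compile(r"^\w{3}\s+\d+ [\d:]+ \S+ (?P<prefix>.*?)\s*IN=(?P<rest>.*)$")


def parse_ulog_line(line):
    """Return the class prefix and selected packet fields, or None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = {}
    for token in ("IN=" + match.group("rest")).split():
        key, sep, value = token.partition("=")
        if sep:  # skip bare flags such as CE or DF
            fields[key] = value
    return {
        "class": match.group("prefix"),
        "src": fields.get("SRC"),
        "dst": fields.get("DST"),
        "bytes": int(fields.get("LEN", 0)),
    }


def bytes_per_class(lines):
    """Sum packet lengths per (src, dst, class) key."""
    totals = Counter()
    for line in lines:
        packet = parse_ulog_line(line)
        if packet is not None:
            key = (packet["src"], packet["dst"], packet["class"])
            totals[key] += packet["bytes"]
    return totals


sample = ("Jul 18 15:50:44 fen-fw2 1:123 IN= OUT=ppp0 MAC= SRC=10.0.156.131 "
          "DST=176.58.108.189 LEN=52 TOS=00 PREC=0x00 TTL=63 ID=54141 CE DF")
print(bytes_per_class([sample]))
```

Because the class is part of the key, a stream whose classification changes mid-flow would show up as two entries, which is the separation the message asks about.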
Re: [pmacct-discussion] Duplicate entry for key 1 (primary key violations)
Hi Paolo, On Wed, 20 Jun 2012, Paolo Lucente wrote: I'm thinking to the possibility that given the aggregation method the SQL cache configured by default is not sufficient to keep all the aggregates over the time period - although the time period is very short. Can you as matter of test add the following line to your config and see if it makes any difference? sql_cache_entries: 91 About plugin_buffer_size and plugin_pipe_size, CONFIG-KEYS gives some guidelines. I suggest to start from those and take it from there (no error message is OK; still error message then move up by one order of magnitude; etc.). So try starting from: plugin_pipe_size: 1024 plugin_buffer_size: 10240 OK, testing now. Would it be possible for pmacctd to log a warning if it exceeds any of these thresholds, to help with tuning without wasting memory? I'm still getting some duplicate values, although fewer, and I noticed something interesting: Jun 26 04:00:01 fen-fw2 pmacctd[11470]: ERROR ( long/mysql ): Duplicate entry '10.0.156.34-10.9.0.6-443-34555-tcp-2012-06-26 02:00:00' for key 1 Jun 26 04:00:01 fen-fw2 pmacctd[11470]: ERROR ( long/mysql ): Duplicate entry '109.74.198.131-10.0.156.210-56505-8140-tcp-2012-06-26 02:00:00' for key 1 Jun 26 04:00:01 fen-fw2 pmacctd[11470]: ERROR ( long/mysql ): Duplicate entry '10.0.156.210-109.74.198.131-8140-56505-tcp-2012-06-26 02:00:00' for key 1 Jun 26 04:00:01 fen-fw2 pmacctd[11470]: ERROR ( long/mysql ): Duplicate entry '178.79.174.118-10.0.156.210-58250-8140-tcp-2012-06-26 02:00:00' for key 1 These log entries were created at 4am, and the long configuration aggregates over one hour, so at 4am it should have been writing database records for 3am-4am, with a timestamp of 3am. But the timestamp was 2am. Is it possible that it either failed to remove some records from the cache, or calculated the timestamp of the database records incorrectly? Cheers, Chris. 
[pmacct-discussion] Duplicate entry for key 1 (primary key violations)
Hi all, We get many of these errors in our system logs: Jun 12 10:01:01 fen-fw2 pmacctd[2153]: ERROR ( short/mysql ): Duplicate entry '72.232.223.58-82.68.244.70-80-46802-tcp-2012-06-12 09:56:00' for key 1 They usually happen in batches. E.g. we had a few hundred at 07:27, then another few hundred at 10:01, and a few dozen at 10:17. In our configuration these duplicate inserts should never happen. We should get one INSERT per flow per minute, and the different minutes should result in different values of the primary key. plugins: mysql[short], mysql[long] aggregate[short]: src_host, src_port, dst_host, dst_port, proto sql_db: pmacct sql_table[short]: acct_v6 sql_history[short]: 1m sql_history_roundoff[short]: m sql_refresh_time[short]: 60 sql_dont_try_update: true sql_optimize_clauses: true It's like pmacct is not correctly finding an existing flow when aggregating, and creating a new one that duplicates the existing one. Is there some way to test that? Would the memory plugin do it? Can anyone explain why this is happening or what I'm doing wrong? Also, we get a lot of these: Jun 10 05:13:01 fen-fw2 pmacctd[4070]: ERROR ( short/mysql ): We are missing data. Jun 10 05:13:01 fen-fw2 pmacctd[4070]: If you see this message once in a while, discard it. Otherwise some solutions follow: Jun 10 05:13:01 fen-fw2 pmacctd[4070]: - increase shared memory size, 'plugin_pipe_size'; now: '3096576'. Jun 10 05:13:01 fen-fw2 pmacctd[4070]: - increase buffer size, 'plugin_buffer_size'; now: '192'. Jun 10 05:13:01 fen-fw2 pmacctd[4070]: - increase system maximum socket size. How would I know which parameter to increase? Could the writer tell us exactly which limit it hit? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838 Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
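The expectation stated in this message (one INSERT per flow per minute, with different minutes giving different primary keys) can be sketched as follows. This is only a model of the reasoning, not pmacct's implementation; the function names are hypothetical, and the key layout follows the field order visible in the duplicate-entry error.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def history_bucket(ts, period_seconds):
    """Round a timestamp down to the start of its history period."""
    elapsed = (ts - EPOCH) // timedelta(seconds=1)
    return EPOCH + timedelta(seconds=elapsed - elapsed % period_seconds)


def primary_key(src, dst, sport, dport, proto, ts, period_seconds):
    """Build a key shaped like the one in the duplicate-entry error."""
    return (src, dst, sport, dport, proto, history_bucket(ts, period_seconds))


flow = ("72.232.223.58", "82.68.244.70", 80, 46802, "tcp")
a = primary_key(*flow, datetime(2012, 6, 12, 9, 56, 10), 60)
b = primary_key(*flow, datetime(2012, 6, 12, 9, 56, 50), 60)
c = primary_key(*flow, datetime(2012, 6, 12, 9, 57, 5), 60)
print(a == b, a == c)
```

Packets in the same minute map to the same key and must be aggregated into one row before writing; if the same key is flushed twice, for example after a restart or an early cache purge, an insert-only configuration would hit exactly this duplicate-key error.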
Re: [pmacct-discussion] nfacctd
Hi Johan,

Your nfacctd is compiled without MySQL support, so it's not logging to the database, only to the memory plugin. Please fix that and try again.

Cheers, Chris.

-Original Message-
From: johan lotter jlct...@gmail.com
Sender: Chris Wilson ch...@aptivate.org
Date: Sun, 1 Apr 2012 16:33:50
To: pmacct-discussion@pmacct.net; ch...@aptivate.org; pa...@pmacct.net
Cc: pmgraph-t...@aptivate.org
Subject: nfacctd

Hi Chris

1) Clean install of pmgraphs on Debian Squeeze using the Debian package instructions at: http://www.aptivate.org/pmgraph-nstallation-2

2) Disabled Iptables Firewall Rules using these instructions: http://www.cyberciti.biz/faq/turn-on-turn-off-firewall-in-linux/

3) Created nfacctd.conf file using known working nfacct/pmgraph configuration, page 52: http://www.ws.afnog.org/afnog2010/bw-mgmt/

daemonize: false
debug: true
pidfile: /var/run/nfacctd.pid
logfile: /var/log/nfacctd.log
! syslog: daemon
nfacctd_port: 5678
plugins: mysql
aggregate: src_host, src_port, dst_host, dst_port, proto
sql_db: pmacct
sql_table: acct_v6
sql_history: 1m
sql_history_roundoff: m
sql_table_version: 6
sql_host: 127.0.0.1
sql_user: pmacct
sql_passwd: secret
sql_refresh_time: 60
sql_dont_try_update: true
sql_optimize_clauses: true
! sql_preprocess: minb = 1000

4) Changed the subnet in pmgraphs to that of my own (192.168.88.)

5) Configured Net-Flow (v5) on my (Mikrotik) Router to send flows to PC running pmacct/pmgraphs: 192.168.88.150

6) Executed with: nfacctd -f nfacctd.conf

And get the following error:

root@debhome:/etc/pmacct# nfacctd -f nfacctd.conf
ERROR ( nfacctd.conf ): Unknown plugin type: mysql. Ignoring.
WARN ( nfacctd.conf ): No plugin has been activated; defaulting to in-memory table.

gedit /var/log/nfacctd.log (edited down quite a bit)

Apr 01 16:10:38 INFO ( default/memory ): 124928 bytes are available to address shared memory segment; buffer size is 176 bytes.
Apr 01 16:10:38 INFO ( default/memory ): Trying to allocate a shared memory segment of 2748416 bytes.
Apr 01 16:10:38 INFO ( default/core ): waiting for NetFlow data on 0.0.0.0:5678
Apr 01 16:10:38 DEBUG ( default/memory ): allocating a new memory segment.
Apr 01 16:10:38 DEBUG ( default/memory ): allocating a new memory segment.
Apr 01 16:10:38 OK ( default/memory ): waiting for data on: '/tmp/collect.pipe'
Apr 01 16:10:39 DEBUG ( default/memory ): Selecting bucket 4612.
Apr 01 16:10:39 DEBUG ( default/memory ): Selecting bucket 9644.
Apr 01 16:10:41 DEBUG ( default/memory ): Selecting bucket 2391.
Apr 01 16:10:41 DEBUG ( default/memory ): Selecting bucket 2391.
Apr 01 16:11:23 DEBUG ( default/memory ): Selecting bucket 31124.
Apr 01 16:11:24 INFO: Discarding unknown packet: nfacctd=0.0.0.0:5678 agent=192.168.88.1:5678
Apr 01 16:12:19 DEBUG ( default/memory ): Selecting bucket 20471.
Apr 01 16:12:24 INFO: Discarding unknown packet: nfacctd=0.0.0.0:5678 agent=192.168.88.1:5678
Apr 01 16:12:33 DEBUG ( default/memory ): Selecting bucket 2325.
Apr 01 16:12:33 DEBUG ( default/memory ): Selecting bucket 1887.

There is nothing in /var/log/daemon.log pertaining to nfacctd (even though I have tried running with daemonize: true).

Any help very welcome (as always), thanks.

___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] pmacct-discussion Digest, Vol 83, Issue 1
Hi Johan, On Thu, 2 Feb 2012, johan lotter wrote: Yet when I configure and run with mysql plugin I get no data... Does that mean that you get nothing in the database, or nothing graphed? I notice that you mentioned pmgraph later, which is a different project (that uses pmacct). If you get nothing in the database, please check your /var/log/syslog and /var/log/daemon files for messages from pmacct. Created a file called nfacctd.conf placed it in the same directory as pmacct.conf edited as follows: ! daemonize: true plugins: mysql aggregate: sum_host pmgraph will not work if you aggregate on sum_host. It requires the src_host, dst_host, src_port and dst_port fields at least. It may also get confused by a recent change to pmacct (which I requested) to change the names of the src_port and dst_port fields, as the pmgraph package may not have been updated to account for that change. You may find this presentation useful for a known working nfacct/pmgraph configuration, especially page 52: http://www.ws.afnog.org/afnog2010/bw-mgmt/ executed with nfacctd -f nfacctd.conf enabled Netflow (Traffic-Flow on my router) and told it to send traffic to IP address of listening NIC on port 5678 Yet pmgraph is not graphing anything No firewall blocking inbound UDP traffic to port 5678? Please also trim your posts to remove irrelevant information, especially when replying to a digest that contains many emails completely unrelated to the one that you're replying to. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 967838 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Infinite loop in sql_cache_insert
Hi Paolo, On Mon, 28 Nov 2011, Paolo Lucente wrote: Would be great if: 1) you can upgrade to something more recent than that, ie. issue could be related to timestamps and fix might well be in some other parts of the code (pkt_handlers.c pops to mind) I will probably do this soon as I'm intending to do more work on pmacct development. However it would be great if Ubuntu would pick up more recent versions of pmacct in their newer releases. I'm running the latest release, Oneiric. Are you in touch with the package maintainer? I'd particularly like to add some more identifying information to the list of aggregation primitives, to help connect pmacct traffic logs with Squid logs, to associate website names to them. However I was completely confused about where to start on my first attempt to achieve this (adding new primitives). I was wondering whether it would be easier to write a classifier that would inspect the first packet of the stream and stuff the TCP ISN into the classification field? Does that seem like a reasonable approach? and/or 2) manage to reproduce the issue. I'm afraid this is probably impossible. I rarely run packet logging on my laptop and I wasn't at that time. It has happened a few times, but rarely. Apart of the above, agree 100% with your thoughts about cleaning up a bit; i have that on my todo list (along with other related things, ie. creating a sql_cache_free_entries() routine). Excellent :) Simpler and more flexible code would make it much easier to work on and extend pmacct. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Broken aggregate Filter
Hi Bernd, On Thu, 9 Jun 2011, Bernd Bornkessel wrote: It works if I use: vlan and ((vlan 365 or vlan 1337) and (dst net 192.76.141.0/24 or dst net 194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 or dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 88.215.192.0/19)) Well, but what if I also want to filter by VLAN. The following filters do not work :\ [...[ vlan and ((vlan 365 or vlan 1337) and (dst net 192.76.141.0/24 or dst net 194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 or dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 88.215.192.0/19)) These filters look identical to me. How come it both works and doesn't work? Cheers, Chris. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Broken aggregate Filter
Hi Bernd, On Thu, 9 Jun 2011, Bernd Bornkessel wrote: The working filter is: vlan and (dst net 192.76.141.0/24 or dst net 194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 or dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 88.215.192.0/19) The non-working are: vlan and ((vlan 365 or vlan 1337) and (dst net 192.76.141.0/24 or dst net 194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 or dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 88.215.192.0/19)) ((vlan 365 or vlan 1337) and (dst net 192.76.141.0/24 or dst net 194.55.246.0/23 or dst net 195.246.160/19 or dst net 88.215.224.0/19 or dst net 62.93.212.0/23 or dst net 62.93.246.0/23 or dst net 88.215.192.0/19)) I think you may be falling victim to this (from man pcap-filter(7)): vlan [vlan_id] True if the packet is an IEEE 802.1Q VLAN packet. If [vlan_id] is specified, only true if the packet has the specified vlan_id. Note that the first vlan keyword encountered in expression changes the decoding offsets for the remainder of expression on the assumption that the packet is a VLAN packet. The vlan [vlan_id] expression may be used more than once, to filter on VLAN hierarchies. Each use of that expression increments the filter offsets by 4. Therefore I don't think you can use the vlan keyword more than once in the same expression (unless you have vlan hierarchies). This appears to be a limitation (and a rather unusual one) of libpcap, not pmacct. If they really want to support nested vlans (and I would seriously question the sanity of anyone who used them) I would respectfully suggest that they modify the vlan keyword to not change the filter offset, and create a new keyword like nested-vlan which does. Cheers, Chris. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
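A toy model of the behaviour quoted from pcap-filter(7) in the message above may make the failure easier to see. This is a simplified Python sketch, not libpcap's code: `make_frame` and `vlan_tests` are hypothetical helpers, and the model assumes only what the man page states, namely that every `vlan` keyword in the expression moves the decoding offset on by 4 bytes.

```python
import struct

ETH_P_8021Q = 0x8100  # EtherType of an 802.1Q VLAN tag
ETH_P_IP = 0x0800     # EtherType of IPv4


def make_frame(vlan_ids):
    """Build a minimal Ethernet header carrying the given stack of VLAN tags."""
    frame = bytes(12)  # destination and source MAC, zeroed for the example
    for vid in vlan_ids:
        frame += struct.pack("!HH", ETH_P_8021Q, vid & 0x0FFF)
    frame += struct.pack("!H", ETH_P_IP)
    return frame + bytes(20)  # zero padding standing in for the payload


def vlan_tests(frame, wanted_ids):
    """Evaluate successive 'vlan [id]' keywords in order of appearance.

    Each keyword looks for a tag at the current offset, then moves the
    offset on by 4 bytes whether or not it matched (None means bare 'vlan').
    """
    results = []
    offset = 12  # where the EtherType sits in an untagged frame
    for wanted in wanted_ids:
        ethertype, tci = struct.unpack_from("!HH", frame, offset)
        vid = tci & 0x0FFF
        results.append(ethertype == ETH_P_8021Q and wanted in (None, vid))
        offset += 4
    return results


# 'vlan and (vlan 365 ...)' against a frame with a single tag for VLAN 365
print(vlan_tests(make_frame([365]), [None, 365]))      # [True, False]
# 'vlan 365 or vlan 1337' against a frame with a single tag for VLAN 1337
print(vlan_tests(make_frame([1337]), [365, 1337]))     # [False, False]
# the same two keywords against a genuinely nested (double-tagged) frame
print(vlan_tests(make_frame([10, 365]), [None, 365]))  # [True, True]
```

In this model a single-tagged frame passes the first `vlan` test but can never pass a second one, which is consistent with the symptoms reported for the non-working filters.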
Re: [pmacct-discussion] Aggregate not working?
Hi Lockywolf, On Thu, 11 Nov 2010, Lockywolf __ wrote: aggregate[in]: dst_host aggregate[out]: src_host aggregate_filter[in]: dst net 192.168.88.0/16 aggregate_filter[out]: src net 192.168.88.0/16 plugins: mysql[in], mysql[out] Still, in MySQL i have (a lot of) lines like the following: | 0:0:0:0:0:0 | 0:0:0:0:0:0 | 0.0.0.0 | 109.107.91.158 | 0 |0 | ip | 1 | 309 | 2010-11-10 16:50:00 | 2010-11-10 16:59:02 | | 0:0:0:0:0:0 | 0:0:0:0:0:0 | 0.0.0.0 | 71.228.40.130 | 0 |0 | ip | 1 | 305 | 2010-11-10 16:50:00 | 2010-11-10 16:59:02 | | 0:0:0:0:0:0 | 0:0:0:0:0:0 | 0.0.0.0 | 94.24.134.127 | 0 |0 | ip | 1 | 305 | 2010-11-10 16:50:00 | 2010-11-10 16:59:02 | | 0:0:0:0:0:0 | 0:0:0:0:0:0 | 0.0.0.0 | 188.112.79.97 | 0 |0 | ip | 1 | 305 | 2010-11-10 16:50:00 | 2010-11-10 16:59:02 | No MACs ? i guess it's OK with netflow. If you don't aggregate on src_mac and dst_mac, you won't get any MACs... Btw, anybody can tell me, why do i have so many connections to 0.0.0.0? That's what aggregate does. It zeroes all the fields that you don't aggregate on (including the other side's IP address in this case). it's a router, has no brains. It doesn't even exist, it's not a router. But why does it log ips which have neither src_ip nor dst_ip in 192.168.88.0/16 ? That's a good question, I don't know. Might you have more than one nfacctd/pmacctd running? Or might you have changed the config without restarting it? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
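The point about aggregation zeroing every field that is not aggregated on can be illustrated with a minimal Python sketch. It is not pmacct code: the `aggregate` helper, the field names and the sample flows are assumptions chosen to resemble the rows shown in the message above.

```python
from collections import defaultdict

FIELDS = ("ip_src", "ip_dst", "src_port", "dst_port")
ZERO = {"ip_src": "0.0.0.0", "ip_dst": "0.0.0.0", "src_port": 0, "dst_port": 0}


def aggregate(flows, keep):
    """Sum counters per key, zeroing every field that is not in 'keep'."""
    totals = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for flow in flows:
        key = tuple(flow[f] if f in keep else ZERO[f] for f in FIELDS)
        totals[key]["packets"] += flow["packets"]
        totals[key]["bytes"] += flow["bytes"]
    return dict(totals)


# Two made-up flows from different sources to the same destination
flows = [
    {"ip_src": "192.168.88.10", "ip_dst": "109.107.91.158",
     "src_port": 40000, "dst_port": 80, "packets": 1, "bytes": 309},
    {"ip_src": "192.168.88.11", "ip_dst": "109.107.91.158",
     "src_port": 40001, "dst_port": 80, "packets": 2, "bytes": 100},
]

# Equivalent in spirit to 'aggregate: dst_host'
rows = aggregate(flows, keep={"ip_dst"})
for key, counters in rows.items():
    print(key, counters)
```

Both flows collapse into a single row whose source address is 0.0.0.0, so the zero address is a placeholder for "not aggregated on" rather than a real peer.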
Re: [pmacct-discussion] Source port column name depends on database
Hi Paolo,

On Wed, 6 Oct 2010, Paolo Lucente wrote:

> To say this work (as agreed in the shape of sql table version 8) has just been committed to the CVS. Please give it a try and let me know if it seems to work to your eyes.

Thanks for this. I haven't compiled it yet, but I noticed this line:

    if ((!strcmp(config.type, "mysql") || !strcmp(config.type, "sqlite3")) && config.sql_table_version != 8) {

Doesn't this mean that it will revert to the old schema when we release a schema version 9? Is that what you wanted? It seems surprising to me. I would have expected config.sql_table_version < 8 instead.

By the way, I've written this story up in a blog post. I hope that's OK, but please let me know if you want me to edit it: http://blog.aptivate.org/2010/10/06/consistency-portability-and-backwards-compatibility/

Cheers, Chris.
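The concern about a future schema version can be checked with a small Python sketch. It compares the "not equal to 8" test quoted in the message above with a "less than 8" test, which is how the suggested alternative is read here. The function names are hypothetical and the logic only mirrors the quoted C condition; it is not taken from the pmacct source.

```python
LEGACY_DATABASES = ("mysql", "sqlite3")


def legacy_names_not_equal(db_type, version):
    """Committed test: legacy column names unless the version is exactly 8."""
    return db_type in LEGACY_DATABASES and version != 8


def legacy_names_less_than(db_type, version):
    """Alternative test: legacy column names only for versions below 8."""
    return db_type in LEGACY_DATABASES and version < 8


for version in (7, 8, 9):
    print(version,
          legacy_names_not_equal("mysql", version),
          legacy_names_less_than("mysql", version))
# 7 True True
# 8 False False
# 9 True False
```

The two tests only disagree for versions above 8, which is exactly the hypothetical "version 9" case raised in the message.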
Re: [pmacct-discussion] Source port column name depends on database
Hi Paolo, On Wed, 6 Oct 2010, Paolo Lucente wrote: Yes, that's intended for a couple of reasons: 1) don't expect to release any more table versions: you see that already happening with recently introduced primitives; idea is to stick to a table version (or style nowadays) and then customize it from there, adding (or removing) fields to the base schema. 2) combinations of table type/version are internally mapped to a number greater than 8, ie. table type BGP, table version 1. OK, I didn't know that, thanks. No problem with the blog entry. I believe you can change the Luckily he agreed to simply He agreed - i'm not such of an un-cooperative beast, am i? Of course not, far from it :) I've changed it. Cheers, Chris. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Source port column name depends on database
Hi Paolo,

On Wed, 15 Sep 2010, Paolo Lucente wrote:

> On Tue, Sep 14, 2010 at 09:16:37AM +0200, Chris Wilson wrote:
> > I'm not sure about adding a new config switch, do we actually need it?
>
> Funnily enough, and that was my perspective, in this case a configuration switch only adds two if-then-else in the common SQL plugins code. Whereas the impact of a new schema version you can verify yourself by grepping the source code for 'sql_table_version'.

I think the code that uses sql_table_version has been well written, and none of these places should need to be changed at all. The only place that should need changing, I hope, is the one line of sql_common.c that currently says:

    if (!strcmp(config.type, "mysql") || !strcmp(config.type, "sqlite3")) {

and would now check for sql_table_version <= 7 (or similar) instead. So this change does not actually increase the code complexity, or the number of config options, at all.

> I'd target release 0.12.5 for this as 0.12.4 is planned to be out soon (by end of the month). Will give a shout as soon as i get something workable in the CVS.

That would be great, please do!

Cheers, Chris.
Re: [pmacct-discussion] Source port column name depends on database
Hi Paolo, On Tue, 14 Sep 2010, Paolo Lucente wrote: Agree. I seem to reckon this legacy issue is limited to the TCP/UDP ports only and i'm thinking perhaps the best way to approach it is to issue a true/false config switch, ie. sql_table_compat, for the purpose. But for consistency with the rest, these fields should be aligned to port_src and port_dst. Agree? Agree definitely on consistency, and don't really mind which way the name goes. I'm not sure about adding a new config switch, do we actually need it? I seem to recall some wiser counsel to not add configuration options where possible, as it exponentially multiplies the complexity of the software code and also linearly increases the complexity of using it. If our intention is to rename the MySQL fields going forward, why not just use a new schema version to grandfather the old column names? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
[pmacct-discussion] Source port column name depends on database
Hi all,

We just had a bug report in pmGraph because it assumed that the source port database column was always called src_port, as it is in MySQL. The user is using a postgres database, and it appears that the column is called port_src there instead:

    if (!strcmp(config.type, "mysql") || !strcmp(config.type, "sqlite3")) {
      strncat(insert_clause, "src_port", SPACELEFT(insert_clause));
      strncat(where[primitive].string, "src_port=%u", SPACELEFT(where[primitive].string));
    }
    else {
      strncat(insert_clause, "port_src", SPACELEFT(insert_clause));
      strncat(where[primitive].string, "port_src=%u", SPACELEFT(where[primitive].string));
    }

I would be much happier writing database-independent code around pmacct if it didn't do things like this. I understand that there is a backwards compatibility issue with changing it, but perhaps it could be done in a new version of the mysql or postgres schema?

Cheers, Chris.
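Until the column names are aligned, client code has to carry the mapping itself. The following is a minimal Python sketch of such a workaround, not code from pmGraph or pmacct: `port_columns`, `top_source_ports_query` and the table name `acct` are hypothetical, and the mapping encodes only what this thread describes (src_port/dst_port for MySQL and SQLite, port_src/port_dst otherwise).

```python
def port_columns(db_type, legacy=True):
    """Return the (source, destination) port column names for a database type."""
    if legacy and db_type in ("mysql", "sqlite3"):
        return ("src_port", "dst_port")
    return ("port_src", "port_dst")


def top_source_ports_query(db_type, table="acct"):
    """Build a query that stays valid whichever naming scheme is in use."""
    src, _dst = port_columns(db_type)
    return (f"SELECT {src}, SUM(bytes) AS total_bytes FROM {table} "
            f"GROUP BY {src} ORDER BY total_bytes DESC")


print(top_source_ports_query("mysql"))
print(top_source_ports_query("pgsql"))
```

Keeping the mapping in one function means that a later schema version with consistent names would only need the `legacy` flag switched off in a single place.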
Re: [pmacct-discussion] Pmacct data inconsistencies between tables.
Hi Paolo and Daniel,

(Please allow me to jump in as I may be able to help here, despite currently being in country working on a project.)

On Fri, 19 Feb 2010, Paolo Lucente wrote:

> I also wonder: what does the primary key of the 1 min table look like? Is it any different from the 1 hour table? With sql_dont_try_update turned on and the default indexing, duplicates are not possible.

I deleted the primary key from that table because it should not be necessary (there should not be any duplicates if everything is configured correctly) and it makes inserts extremely slow (by a factor of 10-100) when the table gets large.

> Also, at a closer look at the configuration you posted, i see no aggregate_filter is specified (see EXAMPLES): it means each plugin collects and tries to write both inbound and outbound traffic to the same table. So either you can remove one set of plugins or craft a proper aggregate_filter so that each does only its bit of the job.

The inbound and outbound traffic are supposed to go into the same table, but you're right that the aggregate_filter appears to be missing and this is almost certainly the cause of the duplicate records in the short table. Daniel, could you please add something like this:

    aggregate_filter[inbound1]: dst net 10.0.156.0/24
    aggregate_filter[outbound1]: src net 10.0.156.0/24
    aggregate_filter[inbound2]: dst net 10.0.156.0/24
    aggregate_filter[outbound2]: src net 10.0.156.0/24

However, I'm surprised that this doesn't also happen in the long table?

> With regards to the missing tuples, from the few checks i've done, it is always the case that something is in the 1 hour table but can be missing in the 1 minute one. This can very well be the result of a shared 'sql_preprocess: minb = 1000' directive: a flow can accumulate more than 1000 bytes in 1 hour but not in 1 minute - and hence it's accounted in one table and stripped off in the other.
Yes, I would expect the long table totals to be slightly more than the short table ones for this reason. However, the problem that we're seeing is the opposite: the totals calculated from the long table are less than those from the short table, even though the long table includes flows that the short table doesn't. And, while this might be accounted for by the duplicate flows in the short table, the same should apply to the long table, so I think it should have balanced out. Given the sql_preprocess you should never expect counters to match for the same reason as above. To have a comparison more apples to apples, you should consider removing it and when confident everything is allright put it back again. Unfortunately we cannot do this in the production environment, as the number of rows of tiny flows (which are effectively noise) completely dwarfs the real data, overloads our firewall's CPU and disk space, and makes querying so slow that the data is useless. This is where a test lab environment would be useful. Thanks for your help with this :) Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
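The effect of a shared `minb = 1000` on tables with different time resolutions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not pmacct's implementation: `stored_total` is a hypothetical helper, the filter is assumed to keep an aggregate when it has at least `minb` bytes, and the sample flow is made up.

```python
from collections import defaultdict

MINB = 1000  # threshold from 'sql_preprocess: minb = 1000'


def stored_total(samples, bucket_seconds, minb=MINB):
    """Sum the bytes that survive a minimum-bytes filter applied per time bucket."""
    buckets = defaultdict(int)
    for timestamp, nbytes in samples:
        buckets[timestamp - timestamp % bucket_seconds] += nbytes
    return sum(total for total in buckets.values() if total >= minb)


# One slow flow sending 100 bytes every minute for an hour
slow_flow = [(minute * 60, 100) for minute in range(60)]

print(stored_total(slow_flow, 60))    # 1 minute table: 0
print(stored_total(slow_flow, 3600))  # 1 hour table: 6000
```

On its own this effect can only make the long table total larger than the short one, which is why the opposite observation in this thread points to a second cause such as duplicate rows.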
Re: [pmacct-discussion] Pmacct data inconsistencies between tables.
Hi Karl, On Fri, 19 Feb 2010, Karl O. Pinc wrote: On 02/19/2010 07:42:08 AM, Chris Wilson wrote: I deleted the primary key from that table because it should not be necessary (there should not be any duplicates if everything is configured correctly) and it makes inserts extremely slow (by a factor of 10-100) when the table gets large. FWIW, the automatic sequential key generation speed is unrelated to table size when using postgresql. There is no sequence to generate as far as I know. The problem is the size of the index file, and the fact that it has to be rewritten for every insert (or block of inserts) that makes insertion get slower as database size increases. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] NAT question
Hi JF,

On Thu, 12 Nov 2009, JF Cliche wrote:

> I am behind two NAT routers (Linksys running DD-WRT) with port forwarding up to the machine running pmacct, and yet pmacct reports SSH traffic to the forwarded port with the public (external, non-NATed) addresses. I thought all traffic should be seen as coming from the second router's private address. Is pmacct (or the underlying pcap library) getting the public address from extra data encapsulated in the TCP packets by the routers or in the SSH protocol? I've seen the opposite problem being discussed in this forum, but not this...

NAT usually affects only the source address of outbound connections, and the destination address of inbound ones. There's no need for it to change the source of your incoming (to the pmacct server) SSH connection, as its reply packets will still go back to the SSH client via the router, which is necessary in order to have their source IP natted.

Cheers, Chris.
Re: [pmacct-discussion] timestamp rounding bug
Hi Paolo,

On Mon, 3 Aug 2009, Paolo Lucente wrote:

> Didn't act on it yet, being focused on some new features. My goal is to do something about it in 0.12.0rc2. Basically it would be a fix for those who don't use a UTC clock on the system running pmacct. If there is general interest around this story, I'll remember to briefly post here about it once the code is committed to the CVS.
>
> Btw, i guess the outcome of that thread was a recommendation to run pmacct on a system which is set up for UTC. Maybe this should also be made slightly more visible - maybe inserted into the FAQS document.

Is any real-world system set to UTC? I'm certainly not going to run my firewall (where I run pmacct currently) on UTC. All my logs would be screwed up and much harder to interpret.

Cheers, Chris.
Re: [pmacct-discussion] Flexible aggregation
Hi Paolo and Karl,

On Sat, 13 Jun 2009, Paolo Lucente wrote:

> On Sat, Jun 13, 2009 at 03:07:01PM -0500, Karl O. Pinc wrote:
> > > We are only interested in a single table.
> >
> > Why can't two separate sql plugins write to the same table?
>
> What Karl is proposing here might really result in a simpler approach compared to the sub-aggregation scenario - which, with some care (ie. sql_startup_delay to avoid event synchronization while retaining the same sql_history and sql_refresh_time settings), can not only achieve the same results but, best of all, is already there. Let us know your thoughts!

I don't think it can. For example, how would we write the configuration? Let's say we just want to zero (not aggregate on) the destination IP for flows less than 1000 bytes. We could try:

    plugins: mysql[with_dst], mysql[without_dst]
    aggregate[with_dst]: src_host, src_port, dst_host, dst_port, proto
    aggregate[without_dst]: src_host, src_port, dst_port, proto
    sql_preprocess[with_dst]: minb = 1000
    sql_preprocess[without_dst]: maxb = 1000

but the flow aggregates are not the same for both plugins, so we can't ensure that any flow ends up in one plugin or the other but not both or neither.

How else could we do it with what we already have? We could write to different tables at different levels of aggregation, and let the user choose which one to use, and delete old data from each table to stop it becoming too large... but that gets more complicated for the user.

Cheers, Chris.
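The objection in the message above can be made concrete with a small Python sketch of the two-plugin configuration. It is a model under stated assumptions rather than pmacct behaviour: `plugin_output` is a hypothetical helper, `minb` is assumed to keep aggregates with at least that many bytes and `maxb` those with at most that many, and the two flows are made up.

```python
from collections import defaultdict

WITH_DST = ("src_host", "src_port", "dst_host", "dst_port", "proto")
WITHOUT_DST = ("src_host", "src_port", "dst_port", "proto")


def plugin_output(flows, key_fields, minb=None, maxb=None):
    """Aggregate flows on key_fields, then apply the byte thresholds."""
    totals = defaultdict(int)
    for flow in flows:
        totals[tuple(flow[f] for f in key_fields)] += flow["bytes"]
    kept = {}
    for key, nbytes in totals.items():
        if minb is not None and nbytes < minb:
            continue
        if maxb is not None and nbytes > maxb:
            continue
        kept[key] = nbytes
    return kept


# Two 600-byte flows that differ only in their destination host
flows = [
    {"src_host": "10.0.0.1", "src_port": 5000, "dst_host": "192.0.2.1",
     "dst_port": 80, "proto": "tcp", "bytes": 600},
    {"src_host": "10.0.0.1", "src_port": 5000, "dst_host": "192.0.2.2",
     "dst_port": 80, "proto": "tcp", "bytes": 600},
]

with_dst = plugin_output(flows, WITH_DST, minb=1000)
without_dst = plugin_output(flows, WITHOUT_DST, maxb=1000)
written = sum(with_dst.values()) + sum(without_dst.values())
print(with_dst, without_dst, written)  # {} {} 0
```

Each flow is too small for the detailed plugin, while their 1200-byte sum is too large for the summarised one, so in this model 1200 bytes of traffic reach neither.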
Re: [pmacct-discussion] Flexible aggregation
Hi Paolo, On Sat, 13 Jun 2009, Paolo Lucente wrote: Good pointer. From a brief scan of the Aguri homepage, please feel free to correct whether i'm wrong, i see many similarities between pmacct and Aguri. I guess so; I was thinking that Aguri seems to store its output in text files rather than a database, and perhaps provides more dynamic/automatic filtering, but seems to be a research project and not highly supported or maintained. Aguri is slightly more limited in the fact it has only a set of (4?) traffic aggregation profiles whereas pmacct offers a wider range of primitives. But I guess the point you wanted to make was the dynamic variation of the sampling rate under increased traffic load (ie. DDoS). OK, I didn't realise that it was just the sample rate that was varied. I thought it was to do with the flexible aggregation, e.g. if we have 1000 flows with the same source IP and source port, they might be aggregated together as a single, more highly summarised flow. pmacct actually does have such feature only available to the SQL plugins: it's part of the SQL preprocess infrastructure (look for 'sql_preprocess' in the CONFIG-KEYS document or the wiki) and is called 'fsrc' (Flow Sampling under Resource Constraints). It is an implementation i did years ago loosely based on a paper coming from ATT Labs. It aims at offering to the SQL database a sort of stream-lined number of aggregates; aggregates are weighted, ranked and sampled based on probability (which gives the dynamic/adaptive part of the approach); the resource constraint is expressed via the number of flows you want to end in the database (which is in turn seen as the constrained resource here). We are using this feature to filter out small flows, but the problem is that they are not accounted for at all, so the database contents e.g. SUM(bytes) no longer reflect the interface totals. 
What I would ideally like to see, but I realise that it's hard is something like this: Initial filter selects flows over a certain size and non-selected flows can either be discarded (as now) or reaggregated by zeroing a selected feature, e.g. the destination port, and combined into a new single record if there is more than one of them. These, more highly aggregated records then continue down the preprocess chain, and if they fail to match a later condition then they can be aggregated again in a different way, e.g. by zeroing the destination IP address, and so on, until we end up with a single record where all the features were aggregated. For example, sql_preprocess might look something like this: minb = 1, zero_dstip, minb = 1, zero_dstport, minb = 1, zero_srcport, minb = 1, zero_srcip Then any flows which together do not add up to enough bytes to pass the minb filters, even after aggregation, end up in a record where all the selector fields are zeroed out. Since there is no final minb condition, this row would always be added to the database, never rejected, so SUM(bytes) would again equal the interface counters for any given time range. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
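The re-aggregation chain proposed in the message above can be sketched in Python to show that it preserves the byte total. Everything here is hypothetical: the feature is only a proposal in this thread, and `cascade`, the field names, the threshold of 1000 bytes and the sample records are assumptions made for illustration.

```python
from collections import defaultdict

FIELDS = ("src_ip", "src_port", "dst_ip", "dst_port")
ZERO = {"src_ip": "0.0.0.0", "src_port": 0, "dst_ip": "0.0.0.0", "dst_port": 0}


def cascade(records, stages):
    """Apply (minb, field_to_zero) stages in order.

    Records with at least minb bytes are written as they are; the rest have
    one more field zeroed and are merged before the next stage. Whatever is
    left after the last stage is written unconditionally.
    """
    written = defaultdict(int)
    pending = dict(records)
    for minb, field in stages:
        index = FIELDS.index(field)
        merged = defaultdict(int)
        for key, nbytes in pending.items():
            if nbytes >= minb:
                written[key] += nbytes
            else:
                zeroed = key[:index] + (ZERO[field],) + key[index + 1:]
                merged[zeroed] += nbytes
        pending = merged
    for key, nbytes in pending.items():
        written[key] += nbytes
    return dict(written)


records = {
    ("10.0.0.1", 5000, "192.0.2.1", 80): 5000,
    ("10.0.0.1", 5001, "192.0.2.2", 80): 600,
    ("10.0.0.1", 5002, "192.0.2.3", 80): 600,
    ("10.0.0.2", 6000, "192.0.2.9", 443): 50,
}
stages = [(1000, "dst_ip"), (1000, "dst_port"), (1000, "src_port"), (1000, "src_ip")]

result = cascade(records, stages)
for key, nbytes in result.items():
    print(key, nbytes)
print(sum(result.values()) == sum(records.values()))  # True
```

The two 600-byte flows only become large enough to keep once the source port has been zeroed, and the 50-byte flow ends up in the fully zeroed catch-all row, so the sum of bytes still matches the input.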
Re: [pmacct-discussion] Flexible aggregation
Hi Paolo, On Sat, 13 Jun 2009, Paolo Lucente wrote: minb = 1, zero_dstip, minb = 1, zero_dstport, minb = 1, zero_srcport, minb = 1, zero_srcip Then any flows which together do not add up to enough bytes to pass the minb filters, even after aggregation, end up in a record where all the selector fields are zeroed out. Since there is no final minb condition, this row would always be added to the database, never rejected, so SUM(bytes) would again equal the interface counters for any given time range. I explored this valid approach some time ago (years!); by zeroing some aggregation primitives previously selected, duplicates are likely to be created. The trick is to resolve such duplicates before offering them to the SQL database - via a sub-aggregation operation. The cache is not sorted - making any sub-aggregation operation very expensive (scaling linearly with the number of aggregated being offered); the idea here is to index the cache, perform the sub-aggregation and offer the result of this to the SQL database. I agree that merging duplicate records would produce the most useful results for us. In summary, it's not something quick to do but it can be done - maybe something good for inclusion within the 0.12 trunk later in the year. At this stage, this feature can't be included in the first pre-release version (0.12.0p1) but I can plan it along the rocky way to the first official release, 0.12.0. Maybe already in 0.12.0p2. How does it sound? That sounds great! I was not expecting you to offer to implement it so quickly. I understand that it's difficult and may conflict with your other priorities. Let me spend a couple of words on a different aspect: the above approach implies everything ends in the same SQL table - which can have pros and cons; the pro is simplicity (one table for everything); the con is that might want to have sub-aggregated data clearly separated into a different table to, say, apply different policies. 
This is something can be done today with pmacct as 'sql_preprocess' offers also the max version of the min features you are using. It means having, for example, two SQL plugins, writing to different SQL tables, aggregating data differently and using complementary sql_preprocess features (so that at the end by summing data in both tables one ends with the full picture). Would this be a feasible approach to you? We are only interested in a single table. We can show 0.0.0.0 as Aggregated out in the pmGraph user interface. I'd rather that we didn't have to query five separate tables to get the results at different levels of aggregation, and merge them all together in our code. However I can see that some people would prefer to keep them in separate tables. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
[pmacct-discussion] Flexible aggregation
Hi all, Has anyone heard of Aguri? Aguri is an aggregation-based traffic profiler targeted for near real-time, long-term, and wide-area traffic monitoring. Aguri adapts itself to spatial traffic distribution by aggregating small volume flows into aggregates, and achieves temporal aggregation by creating a summary of summaries applying the same algorithm to its outputs. A set of scripts are used for archiving and visualizing summaries in different time scales. Aguri does not need a predefined rule set and is capable of detecting an unexpected increase of unknown protocols or DoS attacks, which considerably simplifies the task of network monitoring. [http://www.sonycsl.co.jp/person/kjc/kjc/software.html] I think I remember something like this being posted to the list a while back, so I'm sorry if this is a duplicate. Has anyone considered implementing anything like this flexible aggregation in pmacct? Could the code be taken from Aguri under BSD license? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] timestamp rounding bug
Hi Paolo, On Sun, 19 Apr 2009, Karl O. Pinc wrote: what makes sense to me is to collect timestamps in UTC, store them in UTC when storing them in a database, and let whatever's pulling the data out of the db present the data to the user in whatever fashion makes sense. Any other approach, i.e. working in local time or DST, makes working across time zones difficult, and computing intervals (in the case of DST) impossible. I agree with Karl. Timestamps in UTC in the database make the most sense for me. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
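The argument for storing UTC and converting only for display can be illustrated with Python's standard `datetime` module. This is a sketch with assumed values: the two fixed offsets stand in for a zone before and after a daylight-saving change, and the sample timestamps are chosen to straddle such a change.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Fixed offsets standing in for local time before and after a DST change
WINTER = timezone(timedelta(hours=0), "GMT")
SUMMER = timezone(timedelta(hours=1), "BST")

# Two events stored in UTC, one hour apart in real elapsed time
start = datetime(2009, 3, 29, 0, 30, tzinfo=UTC)
end = datetime(2009, 3, 29, 1, 30, tzinfo=UTC)

# Arithmetic on the stored UTC values gives the true interval
elapsed = end - start

# Wall-clock values, as they would look if local time had been stored instead
local_start = start.astimezone(WINTER).replace(tzinfo=None)
local_end = end.astimezone(SUMMER).replace(tzinfo=None)
naive_elapsed = local_end - local_start

print(elapsed)        # 1:00:00
print(naive_elapsed)  # 2:00:00
```

The stored UTC values give the correct one-hour interval, while the local wall-clock values appear two hours apart, which is the interval problem raised in the message above.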
Re: [pmacct-discussion] Strange SQL-Error
Hi Johannes, On Mon, 13 Apr 2009, Johannes Formann wrote: Apr 13 15:27:15 server kernel: pmacctd[1341]: segfault at f7002991 ip f7bfa9ca sp ffb88334 error 4 in libpthread-2.3.6.so[f7bf2000+e000] I think I got it (using a written coredump): Yes, that's it, thanks. I'm afraid it doesn't mean much to me, but I hope it will help Paolo. What exact version of pmacct are you using? (gdb) bt #0 0xf7ba29ca in pthread_getspecific () from /lib/tls/i686/cmov/libpthread.so.0 #1 0xf7c8bf85 in inet_ntoa () from /lib/tls/i686/cmov/libc.so.6 Paolo, this looks weird to me. pthread_getspecific() should not crash, that makes me think that the heap has been trashed (stack looks generally OK as the backtrace is OK). Perhaps a Valgrind is in order? Any static or fixed-size buffers in the mysql plugin that might be busted? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Strange SQL-Error
Hi Johannes, On Mon, 13 Apr 2009, Johannes Formann wrote: I'm not sure why flows is in your aggregate set since flows are already aggregated into flows in all cases by pmacctd, as far as I know (please correct me if I'm wrong). flow isn't in the primary key. I didn't say it was, but it is in your aggregate set and I don't understand why. Are you shure its flow, between mac and IP it could be vlan? aggregate: src_host,dst_host,dst_port,src_port,flows,dst_mac,proto,src_mac,vlan It's right there before dst_mac. I guess you mean the SIGSEGV error has been logged in your syslog? gdb should stop when it sees the SIGSEGV error, and wait for a command such as bt. So I guess it's happening in another thread than the main one, so it will be harder to trace. You could wait until pmacctd is up and running, then press Ctrl+C, enter the info threads command, then guess a thread other than the first one and switch to it with thread xxx and continue, and hope that that thread dies with SIGSEGV. Is pmacctd not terminated once pressing ctrl+c? It shouldn't be, gdb should intercept the SIGINT and stop it from reaching the process. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Strange SQL-Error
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

> > Paolo, this looks weird to me. pthread_getspecific() should not crash, that makes me think that the heap has been trashed (stack looks generally OK as the backtrace is OK). Perhaps a Valgrind is in order? Any static or fixed-size buffers in the mysql plugin that might be busted?
>
> No Valgrind installed.

You can probably apt-get install valgrind and run pmacctd through it.

> I cleared the database, and observed what happened:
>
> Apr 13 17:18:19 server1 pmacctd[12394]: INFO ( default/core ): Start logging ...
> Apr 13 17:18:19 server1 pmacctd[12394]: OK ( default/core ): link type is: 1
> Apr 13 17:19:41 server1 pmacctd[12394]: Expiring orphan fragment: ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33635
> Apr 13 17:19:47 server1 pmacctd[12394]: Expiring orphan fragment: ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33756
> Apr 13 17:19:58 server1 pmacctd[12394]: Expiring orphan fragment: ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33415
> Apr 13 17:20:01 server1 pmacctd[12419]: ERROR ( default/mysql ): Duplicate entry '0-00:1b:8f:61:55:c9-00:1c:c0:ab:8a:48-0-91.22.172.35-84.38.74.24' for key 1
> Apr 13 17:20:01 server1 kernel: pmacctd[12419]: segfault at 3827208c ip f7c599ca sp ffde7894 error 4 in libpthread-2.3.6.so[f7c51000+e000]

As this crash is so early, perhaps the thread isn't initialised properly?

Cheers, Chris.
Re: [pmacct-discussion] Strange SQL-Error
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

> > > Apr 13 17:20:01 server1 pmacctd[12419]: ERROR ( default/mysql ): Duplicate entry '0-00:1b:8f:61:55:c9-00:1c:c0:ab:8a:48-0-91.22.172.35-84.38.74.24' for key 1
> >
> > As this crash is so early, perhaps the thread isn't initialised properly?
>
> Well, the first update (into the completely empty table) was successful, and I think that used the same kind of thread. I now have a guess where the duplicate key errors come from. Assume the updates are done at :30 with sql_history_roundoff: 1h and sql_refresh_time: 3600 (1h) (so long for simplification):
>
> at 0:30, for each recorded flow a row is inserted with the timestamp 0:00
> at 1:30, for a long flow a row is inserted for 0:00 and 1:00
>
> ... at least if I understood the documentation right. That causes the error, since two identical inserts would be done...

My understanding is that with those settings, a row would be inserted just after 0:00, with stamp_inserted = 0:00, and another one just after 1:00, with stamp_inserted = 1:00, so there should not be a conflict. What makes you think that anything should happen at 0:30 or 1:30? Also, the second insert should have stamp_inserted = 1:00 not 0:00, as far as I know.

Cheers, Chris.
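The relationship between an event time and the bucket it is filed under can be sketched as follows. This is a minimal Python illustration, assuming that stamp_inserted is the event time rounded down to the start of its history period on a UTC clock. `stamp_inserted` here is a hypothetical function, not the pmacct routine, and the sketch says nothing about when the writer actually runs.

```python
from datetime import datetime, timezone


def stamp_inserted(event_time, history_seconds=3600):
    """Round a timestamp down to the start of its history bucket."""
    epoch = int(event_time.timestamp())
    return datetime.fromtimestamp(epoch - epoch % history_seconds, tz=timezone.utc)


for hour, minute in ((0, 30), (0, 59), (1, 0), (1, 30)):
    event = datetime(2009, 4, 13, hour, minute, tzinfo=timezone.utc)
    print(event.strftime("%H:%M"), "->", stamp_inserted(event).strftime("%H:%M"))
# 00:30 -> 00:00
# 00:59 -> 00:00
# 01:00 -> 01:00
# 01:30 -> 01:00
```

Under this assumption, traffic seen at 0:30 and traffic seen at 1:30 belong to different buckets, so rows for them differ in stamp_inserted.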
[pmacct-discussion] pmacct weird counters
Hi Paolo,

I'm running pmacctd 0.11.5 on a small network for traffic accounting. Generally it's behaving well, but occasionally I can see weird data being inserted:

  17190 Query INSERT INTO `acct_v7` (stamp_updated, stamp_inserted, vlan, ip_dst, as_src, as_dst, src_port, dst_port, tcp_flags, tos, ip_proto, agent_id, class_id, mac_src, mac_dst, ip_src, packets, bytes, flows) VALUES (FROM_UNIXTIME(1236952981), FROM_UNIXTIME(1236952920), 0, '192.168.0.175', 0, 0, 0, 0, 0, 0, 'ip', 0, 'unknown', '0:0:0:0:0:0', '0:0:0:0:0:0', '0.0.0.0', 10026264, 429028, 0)
  17190 Query INSERT INTO `acct_v7` (stamp_updated, stamp_inserted, vlan, ip_dst, as_src, as_dst, src_port, dst_port, tcp_flags, tos, ip_proto, agent_id, class_id, mac_src, mac_dst, ip_src, packets, bytes, flows) VALUES (FROM_UNIXTIME(1236952981), FROM_UNIXTIME(1236952920), 0, '192.168.0.175', 0, 0, 0, 0, 0, 0, 'ip', 0, 'unknown', '0:0:0:0:0:0', '0:0:0:0:0:0', '0.0.0.0', 8984686, 3943258731, 0)

The byte counters look bogus to me. It's hard to imagine how anyone could send 4 GB of data down through my cable modem connection in just one minute. I might even suspect a 32-bit sign overflow, but in the second case that would still mean 350 MB in one minute which is 46 Mbps, more than four times my line rate, and my external interface graphs show no traffic at all during that time.

What's also odd is that the second record is a primary key conflict with the first, so it never ended up in the database. I don't have two pmacctd's running this time :) but I do have two plugins configured as follows:

  plugins: mysql[inbound], mysql[outbound]
  aggregate[inbound]: dst_host
  aggregate_filter[inbound]: dst net 192.168.0.0/24
  aggregate[outbound]: src_host
  aggregate_filter[outbound]: src net 192.168.0.0/24

They both insert into the same table, which is what I want in this case. Because of aggregation, they should never conflict with each other. But could this be causing memory corruption?
Here is the suspicious data that I have in my database (I assume that MySQL is not corrupting this data):

  mysql> select stamp_inserted,bytes,packets from acct_v7 where bytes > 10;
  +---------------------+------------+----------+
  | stamp_inserted      | bytes      | packets  |
  +---------------------+------------+----------+
  | 2009-02-13 09:27:00 | 3192440953 |  3077338 |
  | 2009-02-25 15:31:00 | 1520451669 | 17845485 |
  | 2009-02-25 15:31:00 |     429569 |  9270610 |
  | 2009-02-25 15:32:00 | 1833044423 |  4116940 |
  | 2009-03-09 01:43:00 | 3842930106 |  4829946 |
  | 2009-03-09 01:43:00 |     429226 |  4202681 |
  | 2009-03-13 14:00:00 |     429631 |  9675501 |
  | 2009-03-13 14:01:00 |     429783 |  9514197 |
  | 2009-03-13 14:02:00 |     429028 | 10026264 |
  | 2009-03-13 14:03:00 |     429262 |  9798220 |
  | 2009-03-13 14:04:00 | 2777022526 |  6454405 |
  | 2009-03-14 00:08:00 | 1521800860 |  2077144 |
  | 2009-03-14 05:22:00 | 1460542448 |  3737824 |
  +---------------------+------------+----------+

Do you have any ideas what might be going on here?

Cheers, Chris.
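The sign-overflow estimate in this message can be checked with a little arithmetic. The sketch below reinterprets the logged 3943258731 as a signed 32-bit value and converts the magnitude to a rate over the one-minute bucket; the helper names are made up for illustration.

```python
def as_signed_32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed one."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def rate_mbps(nbytes: int, seconds: int = 60) -> float:
    """Average rate in megabits per second for nbytes moved in `seconds`."""
    return nbytes * 8 / seconds / 1e6


logged = 3943258731                      # bytes value from the second INSERT
magnitude = abs(as_signed_32(logged))    # roughly 350 MB
print(magnitude, round(rate_mbps(magnitude), 1))
```

Either reading of the counter gives a rate well above the line rate quoted in the message, which is why the figure looked suspicious regardless of signedness.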
Re: [pmacct-discussion] pmacct weird counters
Hi Karl,

On Sat, 14 Mar 2009, Karl O. Pinc wrote:

Do you have any ideas what might be going on here?

Have you bound to an interface with 'interface'? Could be you're picking up, say, a file transfer to your gateway. You'd want to monitor your external interface, or filter out traffic to the box itself.

Good idea, but I am bound to interface eth0.

As a debugging aid (or in general) you might consider putting your rfc1918 network in a networks file. With an aggregate on sum_net and without any other filters you get the cross product of all the possibilities so can see if there's traffic from/to the local network or other things you're perhaps not expecting. If nothing else a quick test with the memory plugin may be revealing.

Sorry, what is an aggregate on sum_net? I'm aggregating on ip_src and ip_dst respectively in two different plugins. I have been thinking about using a networks file, although I'm not sure how to do it yet.

I have just changed my configuration as follows:

  aggregate[inbound]: dst_host, src_mac, dst_mac
  aggregate_filter[inbound]: dst net 192.168.0.0/24 and not src net 192.168.0.0/24
  aggregate[outbound]: src_host, src_mac, dst_mac
  aggregate_filter[outbound]: src net 192.168.0.0/24 and not dst net 192.168.0.0/24

to hopefully exclude local traffic and also to see if some weird MAC addresses are involved, e.g. multicast, spoofing. But I don't see traffic in the gigabytes on either interface when this happens (internal or external).

Cheers, Chris.
Re: [pmacct-discussion] pmacct weird counters
Hi Paolo,

On Sat, 14 Mar 2009, Paolo Lucente wrote:

About the SQL INSERT conflict, are you by any chance making use of the sql_dont_try_update directive in your configuration?

Yes I am, because it's much more efficient.

And are you using 32bit counters?

I think so, yes. I compiled with default options on a 32-bit host.

The conjunction of these two conditions might explain. The SQL cache code, while summing up counters, makes a check on whether the counter field is about to overflow. When 64bit counters are disabled (default) this is what happens:

    #define UINT32T_THRESHOLD 4290000000UL
    #define CACHE_THRESHOLD UINT32T_THRESHOLD

    /* additional check: bytes counter overflow */
    else if (Cursor->bytes_counter > CACHE_THRESHOLD) {
      if (!staleElem && Cursor->chained) staleElem = Cursor;
      goto follow_chain;
    }

Basically, a new record for the entry which is going to overflow is opened and the old one is parked. When purging the cache to the SQL database, both records (the active and the parked one) are sent over; the first with an INSERT, the second with an UPDATE. This mechanism is valid for any number of overflows - indeed. The above would also explain why a number of the entries above the 1GB level are around the 4GB. But this also would suggest the counters are genuine.

Another thing which would suggest these are real is that by dividing the bytes counter by the packets counter, you get a consistent average data size:

  429028 / 10026264 = ~428 bytes
  3943258731 / 8984686 = ~439 bytes

Any bytes counter roll-over would have greatly skewed one of the above two proportions - highlighting an issue. But this would suggest that in a single minute roughly 8GB of data were transferred. This translates to a fully loaded 1Gbps link. This brings me to these questions: is your LAN network (including the 192.168.0.175 host) connected to 1Gbps? Do you think it could be possible some LAN traffic gets spanned over?
The local machine is connected to a gigabit switch on the LAN, but this host is attached to another switch which is not gigabit, so that suggests to me that the counter is invalid. I just checked on the switch, and the port that this machine is attached to is currently running at 100mbps. It is possible that either the switch or my firewall/router/pmacct box is going mental and repeating traffic. Perhaps the best thing to do is to recompile pmacct with 64-bit counters to see if the issue goes away? Alternatively I planned to log all traffic with tcpdump -w to create a pcap file that I could replay into pmacctd to reproduce the problem if it happens again. Would that work? Does pmacctd honour the timestamps in the pcap file while reading it? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
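The overflow handling Paolo describes can be modelled roughly as follows. This is a toy reading of the thread, not pmacct's implementation: the class name, the THRESHOLD value (chosen just below 2^32), and the way dont_try_update turns every statement into an INSERT are all assumptions made for illustration.

```python
THRESHOLD = 4_290_000_000  # assumed stand-in for the cache threshold, just below 2**32


class ToyCache:
    """Hypothetical model of parking a record whose byte counter nears overflow."""

    def __init__(self):
        self.active = {}   # key -> [bytes, packets]
        self.parked = []   # (key, bytes, packets) records closed early

    def add(self, key, nbytes, npackets=1):
        rec = self.active.setdefault(key, [0, 0])
        if rec[0] > THRESHOLD:
            # Park the nearly-full record and open a fresh one for the same key
            self.parked.append((key, rec[0], rec[1]))
            rec = self.active[key] = [0, 0]
        rec[0] += nbytes
        rec[1] += npackets

    def purge(self, dont_try_update=False):
        """Return the statements a purge would issue, then empty the cache."""
        records = self.parked + [(k, b, p) for k, (b, p) in self.active.items()]
        seen, statements = set(), []
        for key, nbytes, npackets in records:
            first_for_key = key not in seen
            verb = "INSERT" if (dont_try_update or first_for_key) else "UPDATE"
            seen.add(key)
            statements.append((verb, key, nbytes, npackets))
        self.parked, self.active = [], {}
        return statements
```

In this model, a key that crossed the threshold once produces an INSERT followed by an UPDATE on purge. If updates are never attempted, the same key produces two INSERTs, and the second would hit the primary key, which matches the conflict seen in the query log.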
Re: [pmacct-discussion] pmacct weird counters
Hi Karl, On Sat, 14 Mar 2009, Karl O. Pinc wrote: Sorry, what is an aggregate on sum_net? I'm aggregating on ip_src and ip_dst respectively in two different plugins. sum_net gets you a all the traffic to and from each network you list in your networks file, plus to and from anywhere else. The cross product. In your case, if you put only 192.168.0.0/24 in your networks file you get out totals for the following possibilities. Great, thanks, that's a very useful feature that I didn't know about. I've switched my configuration to use that, and we'll see if the problem goes away. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] pmacct weird counters
Hi Karl, On Sat, 14 Mar 2009, Chris Wilson wrote: sum_net gets you a all the traffic to and from each network you list in your networks file, plus to and from anywhere else. The cross product. In your case, if you put only 192.168.0.0/24 in your networks file you get out totals for the following possibilities. Great, thanks, that's a very useful feature that I didn't know about. I've switched my configuration to use that, and we'll see if the problem goes away. Sorry, I just realised that that only produces a summary of all traffic from the net, whereas I want to account by individual host within the net. So I can't replace my current config with sum_net, but I have added it as a new plugin. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] pmacct weird counters
Hi Paolo,

On Sat, 14 Mar 2009, Paolo Lucente wrote:

Any signs of massive packet drops on any port throughout your switches? I ask because the traffic reported might not have been actually delivered to the end host.

The switch has been up for 12.25 days, and in that time has recorded 2,085,458,896 octets sent and 4,161,359,962 octets received by that port (which seems unusually low), and 77,060,310 packets sent and 66,840,066 packets received. Over the same period, pmacctd logged 57,439,276,227 bytes and 105,873,327 packets sent to that host alone, or 129,777,361 packets including another host which I know is on the same port.

The switch shows 242 RX errors (all CRC alignment) on that port and no other errors or discards. There are no errors or discards on the port that my router/pmacct box is attached to. Packet numbers are in the same region, i.e. a bit less than 100 million. I suspect that the switch's byte counters are wrapping.

Can you do a bit of profiling? Like: see what is the average traffic download/upload for the host X; also what is the average bytes per packet value. Then, when you see a huge downstream traffic rate, see what happens to the upstream. Do you see any correspondence with respect to the average values?

Running this query:

  select a.stamp_inserted, a.ip_src, a.ip_dst, a.bytes, a.packets, b.ip_src, b.ip_dst, b.bytes, b.packets from acct_v7 as a left join acct_v7 as b on a.stamp_inserted = b.stamp_inserted where a.bytes > 1 and (a.ip_src <> b.ip_src or a.ip_dst <> b.ip_dst);

to find all records with the same timestamp as the excessive ones, I can see that:

* when a host is accused of sending a lot of traffic, it doesn't receive a lot of traffic at the same time; but
* when a host is accused of sending a lot of traffic, other hosts are also accused of sending (but not receiving) a lot of traffic; and
* the same goes for s/sending/receiving/g and vice versa.

Yes, enable 64-bit counters and see what happens.
If you see in a single entry ~8GB of traffic, then everything was correct. Otherwise something must have been wrong on the pmacct side. Running tcpdump in parallel would be great for double-checking. And yes, pmacct honours timestamps within pcap trace files. OK, done. I assume the default snaplen of 96 bytes is OK for pmacct? Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
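The self-join used for the profiling above can be tried on a throwaway in-memory table. The sketch below uses Python's sqlite3 rather than MySQL, reads the comparisons in the query as > and <>, and uses made-up rows and a made-up threshold; only the column names come from the thread.

```python
import sqlite3


def co_occurring(rows, threshold):
    """For each row above `threshold`, list other rows sharing its stamp_inserted."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE acct (stamp_inserted TEXT, ip_src TEXT, ip_dst TEXT, "
        "bytes INTEGER, packets INTEGER)"
    )
    db.executemany("INSERT INTO acct VALUES (?, ?, ?, ?, ?)", rows)
    return db.execute(
        """
        SELECT a.stamp_inserted, a.ip_dst, a.bytes, b.ip_src, b.ip_dst, b.bytes
        FROM acct AS a
        LEFT JOIN acct AS b ON a.stamp_inserted = b.stamp_inserted
        WHERE a.bytes > ? AND (a.ip_src <> b.ip_src OR a.ip_dst <> b.ip_dst)
        ORDER BY b.bytes
        """,
        (threshold,),
    ).fetchall()


# Hypothetical sample data: one oversized inbound record, one small outbound
# record in the same minute, and an unrelated record in another minute.
sample = [
    ("2009-03-13 14:02:00", "0.0.0.0", "192.168.0.175", 4_000_000_000, 9_000_000),
    ("2009-03-13 14:02:00", "192.168.0.175", "0.0.0.0", 1000, 10),
    ("2009-03-13 14:05:00", "0.0.0.0", "192.168.0.20", 500, 5),
]
for row in co_occurring(sample, 1_000_000_000):
    print(row)
```

Comparing a.bytes with b.bytes in each returned pair is what shows whether the opposite direction was busy in the same interval.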
Re: [pmacct-discussion] HTTP Virtual Hosts classification
Hi all,

On Wed, 18 Feb 2009, Paolo Lucente wrote:

In concept, and as documentation says, what you want to achieve is feasible and your understanding of the classifier() is correct - you only have to write down your own patterns: re-phrased, regular expressions are typically employed to recognize protocols but they can be of course used to recognize virtual hosts when in presence of text-based protocols (ie. HTTP, FTP or POP3). As you said this is quite innovative and interesting - so let me know if i can support you somehow (feel also free to contact me privately). For now i have not received any feedback which can help you dimensioning the solution - so can't say how easy it would be to deploy in this sense; perhaps somebody reading can fill this gap?

I have thought about doing this as well. The main problem that I had with using classifiers is that I ultimately would have to implement a TCP engine to reassemble the stream from packets (perhaps the one in snort can be borrowed?). Otherwise the Host: header could (accidentally or deliberately) be split across multiple packets. There is plenty of opportunity for exploitation here as well, e.g. multiple Host: headers, invalid characters in headers, packets that look like HTTP requests in the middle of streams, bad Content-Lengths, etc.

What I was planning to do, but have not done yet, is to:

* force everyone to use a HTTP proxy (transparent or not) so that dealing with malicious requests becomes someone else's problem;
* use the HTTP proxy's logging features to capture the full details of both requests (inbound to proxy and outbound from proxy) along with the requested URI and current time;
* save all this in a separate table in the database;
* left join from pmacct's acct_v* table to the proxy table on the unique quadruple (ip_src,ip_dst,src_port,dst_port) and time.

This was appropriate for my situation as I wanted everyone to use a caching proxy anyway to save bandwidth, and hopefully to authenticate.
However I discovered that Squid's logging formats do not provide all the information that I needed to reliably match up the connection (no client port, see http://www.visolve.com/squid/squid30/logs.php#logformat). The external ACL program does have enough information for this (http://www.visolve.com/squid/squid30/externalsupport.php#external_acl_type), so writing a program to run as an external ACL helper and log the information to the database is a possibility. In our case this also was not good enough, as it does not tell us whether the request will be served from the cache or not, and therefore does not correspond to the client's real bandwidth usage. I would be very interested to see what you do in this space. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
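The left join on the connection quadruple and time that this plan relies on might look like the sketch below. It uses sqlite3 for a self-contained demonstration; the proxy_log table, its columns, and the per-minute time matching are hypothetical choices, since no proxy schema was defined in the thread.

```python
import sqlite3


def usage_with_uri(acct_rows, proxy_rows):
    """Attach the proxy-logged URI (if any) to each accounting row."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE acct (ip_src TEXT, ip_dst TEXT, src_port INTEGER,
                           dst_port INTEGER, stamp_inserted TEXT, bytes INTEGER);
        -- proxy_log is a hypothetical table filled from the proxy's own logs
        CREATE TABLE proxy_log (ip_src TEXT, ip_dst TEXT, src_port INTEGER,
                                dst_port INTEGER, logged_at TEXT, uri TEXT);
        """
    )
    db.executemany("INSERT INTO acct VALUES (?, ?, ?, ?, ?, ?)", acct_rows)
    db.executemany("INSERT INTO proxy_log VALUES (?, ?, ?, ?, ?, ?)", proxy_rows)
    return db.execute(
        """
        SELECT a.ip_src, a.src_port, a.bytes, p.uri
        FROM acct AS a
        LEFT JOIN proxy_log AS p
          ON  a.ip_src = p.ip_src AND a.ip_dst = p.ip_dst
          AND a.src_port = p.src_port AND a.dst_port = p.dst_port
          -- match the proxy timestamp to the one-minute accounting bucket
          AND a.stamp_inserted = strftime('%Y-%m-%d %H:%M:00', p.logged_at)
        ORDER BY a.src_port
        """
    ).fetchall()
```

Rows without a matching proxy entry come back with a NULL URI, so non-HTTP traffic still shows up in the totals. The join only works if the proxy log records the client port, which is the gap noted above for Squid's log formats.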
Re: [pmacct-discussion] multiple interfaces
Hi Mariano, On Fri, 23 Jan 2009, Mariano Spadaccini wrote: Now the problem is only on the tagged port. But I have tried others probe, with the same error (only unidirectional flows). However I have resolved with one pmacctd/one interface (untagged port). Have you tried using any as the interface name to capture all flows? I think it should work, although it will not put any interface into promiscuous mode. Please let us know if it does work. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] multiple interfaces
Hi Anil and Juan, On Wed, 7 Jan 2009, Juan Rivera wrote: My understanding is that any one instance of the daemon can only bind to a single interface. I think that a workaround would be to run more than one instance of the daemon, one per interface, and use a different configuration file for each instance. tcpdump can bind to all interfaces but it can't put them all into promiscuous mode at the same time. If that's OK for your application, try using the device any instead of a real device. Cheers, Chris. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] pNRG and graphing
Hi Gregory,

On Tue, 21 Oct 2008, Gregory Machin wrote:

I'm trying to configure pmacctd to graph traffic passing through the public interface of a firewall. The public interface is connected to an ADSL router; they share a dedicated private LAN. The firewall's IP is 192.168.42.1 and the ADSL router's IP is 192.168.42.10, with the firewall's default gateway configured to 192.168.42.10. Why does pNRG show traffic for 192.168.42.10 and none for 192.168.42.1?

Do you have any traffic destined for 192.168.42.1? E.g. if you run

  tcpdump -n -i eth1 dst host 192.168.42.1

does it show anything? I suspect that almost all your traffic is actually destined to hosts out on the Internet, especially as you are looking at the external interface. I would not expect to see any traffic destined for 192.168.42.1 arriving on the public interface, as your ISP should not be routing such traffic to your connection.

I only want to graph all the incoming (sum of) and outgoing (sum of) traffic passing through eth1 / 192.168.42.1. In short I want to graph the network utilisation of the public interface so I can see if the ADSL is being maxed out. How could I do this?

You should already have it. Just add up the traffic for each source address with a SQL SUM/GROUP BY and it will give you total traffic for all hosts.

Cheers, Chris.
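A minimal sketch of the SUM/GROUP BY suggested above, run against an in-memory sqlite3 table so it is self-contained. The table name acct and the sample rows are made up; a real pmacct table has more columns.

```python
import sqlite3


def totals_by_source(rows):
    """Sum bytes per source address from (stamp_inserted, ip_src, bytes) rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE acct (stamp_inserted TEXT, ip_src TEXT, bytes INTEGER)")
    db.executemany("INSERT INTO acct VALUES (?, ?, ?)", rows)
    return db.execute(
        "SELECT ip_src, SUM(bytes) FROM acct GROUP BY ip_src ORDER BY ip_src"
    ).fetchall()


# Hypothetical rows for two one-minute buckets
rows = [
    ("2008-10-21 09:00:00", "192.168.42.10", 1200),
    ("2008-10-21 09:00:00", "203.0.113.7", 300),
    ("2008-10-21 09:01:00", "192.168.42.10", 800),
]
print(totals_by_source(rows))
```

Grouping by stamp_inserted instead of ip_src would give one total per time bucket, which is closer to what a utilisation graph needs.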
Re: [pmacct-discussion] MySQL and Duplicate Primary Keys
Hi Paolo, On Wed, 8 Oct 2008, Paolo Lucente wrote: Also, i see two different PIDs logging the duplication issue in your email; whereas disabling the primary key the same tuple is written three times; is it possible that there are multiple (3) concurrent pmacctd instances running by mistake? Sorry, I think you're right, there were multiple instances running :( Thanks again for your help. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
[pmacct-discussion] MySQL and Duplicate Primary Keys
Hi all,

I always get a lot of errors like this when using pmacct on a MySQL database:

  Oct 2 06:26:01 fen-fw pmacctd[16237]: ERROR ( default/mysql ): Duplicate entry '00-0-0-217.160.76.21-10.0.156.226-4949-33730-tcp-0-2008-10-0' for key 1
  Oct 2 06:26:01 fen-fw pmacctd[16239]: ERROR ( default/mysql ): Duplicate entry '00-0-0-217.160.76.21-10.0.156.226-4949-33730-tcp-0-2008-10-0' for key 1

(I didn't paste that line twice, there really are two identical lines in the log).

After I delete the primary key, I get duplicate rows in the database, like this:

  mysql> select ip_src,ip_dst,src_port,dst_port,stamp_inserted from acct_v6 where ip_src=10.0.156.1 and ip_dst=10.0.156.210 and src_port=53 and dst_port=56556 and stamp_inserted=2008-10-02 10:43:00;
  +------------+--------------+----------+----------+---------------------+
  | ip_src     | ip_dst       | src_port | dst_port | stamp_inserted      |
  +------------+--------------+----------+----------+---------------------+
  | 10.0.156.1 | 10.0.156.210 |       53 |    56556 | 2008-10-02 10:43:00 |
  | 10.0.156.1 | 10.0.156.210 |       53 |    56556 | 2008-10-02 10:43:00 |
  | 10.0.156.1 | 10.0.156.210 |       53 |    56556 | 2008-10-02 10:43:00 |
  +------------+--------------+----------+----------+---------------------+
  3 rows in set (0.00 sec)

(I've omitted the other columns, but they really are all identical).

The configuration is:

  aggregate: src_host, src_port, dst_host, dst_port, proto
  sql_history: 1m
  sql_history_roundoff: m
  sql_table_version: 6
  sql_refresh_time: 60
  sql_multi_values: 1024000
  sql_dont_try_update: true
  sql_optimize_clauses: true

Does anyone have any ideas about what might cause this? I'm using pmacct 0.9.1 on this server. I know it's old, but it's what comes with Ubuntu Dapper.

Cheers, Chris.
Re: [pmacct-discussion] How does pmacct divide between in and outbound traffic?
Hi Dennis,

Dennis Kempin wrote:

I am currently trying to set up pmacct to account traffic between my host and the internet. I account src and dst hosts without any filtering.

  aggregate[out]: dst_host,src_host
  aggregate[in]: dst_host,src_host

Looking at the results i wondered how pmacct does divide between inbound and outbound traffic? My IN socket shows many connections from my IP to the internet.

It doesn't. You have to tell it how to, e.g. by applying an appropriate filter to each plugin, for example:

  aggregate_filter[in]: dst net 192.168.0.0/16
  aggregate_filter[out]: src net 192.168.0.0/16

Cheers, Chris.
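To make the effect of the two filters concrete, here is a small sketch that decides which plugin would see a given packet. It only mimics the address tests "dst net" and "src net" with Python's ipaddress module; the function name is made up and this is not how pmacct evaluates filters internally.

```python
import ipaddress

LOCAL_NET = ipaddress.ip_network("192.168.0.0/16")


def plugins_matching(ip_src: str, ip_dst: str) -> list:
    """Return the plugin names whose filter would accept this packet."""
    matched = []
    if ipaddress.ip_address(ip_dst) in LOCAL_NET:   # dst net 192.168.0.0/16
        matched.append("in")
    if ipaddress.ip_address(ip_src) in LOCAL_NET:   # src net 192.168.0.0/16
        matched.append("out")
    return matched


print(plugins_matching("203.0.113.9", "192.168.1.5"))   # download
print(plugins_matching("192.168.1.5", "203.0.113.9"))   # upload
print(plugins_matching("192.168.1.5", "192.168.1.6"))   # LAN to LAN
```

LAN-to-LAN packets satisfy both filters, so they are counted by both plugins unless each filter also excludes the local network on the other side.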
Re: [pmacct-discussion] mysql plugin connect problem
Hi anil, Anil wrote: ( default/mysql ) *** Purging cache - START *** ERROR ( default/mysql ): PRIMARY 'mysql' backend trouble. ERROR ( default/mysql ): The SQL server says: Access denied for user 'admin'@'%.domain.com' to database 'bandwidth_db' ( default/mysql ) *** Purging cache - END (QN: 0, ET: 0) *** But in the mysql logs, I see that it connected w/o a problem: 080722 22:15:01 12 Connect[EMAIL PROTECTED] on bandwidth_db 12 Query LOCK TABLES `acct` WRITE Why does the ERROR show %.domain.com instead of host.domain.com, which I specifically setup in my configuration: host.domain.com is what MySQL gets by doing a reverse lookup on the IP address that you connected from. [EMAIL PROTECTED] is the matching rule from your grant tables that was used to decide what access this user has, and apparently MySQL thinks that this user (pattern) doesn't have access to the bandwidth_db database. Cheers, Chris. -- Aptivate | http://www.aptivate.org | Phone: +44 1223 760887 The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES Aptivate is a not-for-profit company registered in England and Wales with company number 04980791. ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] pmacct and nat ?
Hi Sebastien,

Sébastien CRAMATTE wrote:

I'm running pmacctd on a NATted network. pmacctd accounts local traffic properly. My problem is that when I visit a website or anything else that is beyond the NAT router (I'm connected with a cable modem), the traffic is never accounted! Is this the normal behaviour? I ask because I've tested with ntop too, and ntop does give me this data. Normally, shouldn't interfaces in promiscuous mode see every kind of traffic?

Do you mean that you don't see traffic from other machines on your network out to the Internet? That probably means that your machine doesn't see the traffic. If it's in promiscuous mode, that probably means that you have a switch rather than a hub. Try configuring your switch with a mirror port, or putting your sensor inline with your NAT router as a transparent bridge.

Also check that you can see the traffic with tcpdump or wireshark before blaming pmacct :)

Cheers, Chris.
Re: [pmacct-discussion] Measurement accuracy issues
Hi Ahmed,

On Tue, 10 Jun 2008, Ahmed Kamal wrote:

I have setup pmacct with your help, and it's been running like a champ. I have also installed darkstat for comparison. I am seeing a big error (around 30%) between the 2 tools! ... Here's what I am seeing:

  IP              START       END         DELTA       DARKSTAT (bytes)
  81.10.100.42    7607.7053   9477.4200   1869.7147   1,397,584,555
  81.10.100.73    3603.2834   4716.6248   1113.3414     810,169,491
  81.10.100.37    3540.3343   5698.6758   2158.3415   1,573,900,631
  81.10.100.199   3444.3568   4358.3895    914.0327     575,124,842
  81.10.100.75    2951.8349   3697.5900    745.7551     556,560,149
  81.10.100.30    2770.9552   3807.6038   1036.6486     715,830,077
  81.10.100.46    2698.5764   3987.1379   1288.5615     856,582,079
  81.10.100.44    1982.1858   2381.7297    399.5439     296,992,631
  81.10.100.71    1880.2033   2522.7183    642.5150     548,180,038
  81.10.100.201   1300.2739   2040.0713    739.7974     411,031,858

Those are the top 10 BW users. All measurements are in MB (from SQL query), darkstat data is in bytes. As you can see, the first line it's 1.9GB vs 1.4GB and so on ... Any ideas how to track such errors ?

My first suspicion would be that Darkstat is reporting bytes transferred (TCP data) rather than total size of packets. You can confirm this with some simple tests. E.g. create a file of exactly 1MB on a remote web server and download it through your pmacct/darkstat box. If darkstat reports that the amount downloaded is just over 1MB (e.g. 1.001 MB) then it's reporting TCP data.

pmacct will always report packet sizes (IP data) and therefore is likely to report more bytes downloaded. Given that the TCP/IP header overhead is about 40 bytes per 1500 byte packet, i.e. about 2.7%, I'd expect it to report about 1.027 MB in this case. The overhead will be much higher for smaller packets, which may explain your observed 30% discrepancy. If so, this is arguably a bug (or limitation) of darkstat rather than pmacct.

Please let us know what you discover.

Cheers, Chris.
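The header-overhead estimate in this reply can be worked through numerically. The sketch assumes 40 bytes of IP plus TCP headers per packet and a fixed payload size per packet, and it ignores ACKs, handshakes and retransmissions; the function name and the 100-byte small-packet case are illustrative choices.

```python
def ip_bytes_for_payload(payload_bytes: int, mss: int = 1460, header_bytes: int = 40) -> int:
    """Bytes counted at the IP layer when payload_bytes are sent in mss-sized segments."""
    full, rest = divmod(payload_bytes, mss)
    packets = full + (1 if rest else 0)
    return payload_bytes + packets * header_bytes


one_mb = 1_000_000
large = ip_bytes_for_payload(one_mb)            # 1460-byte payloads in 1500-byte packets
small = ip_bytes_for_payload(one_mb, mss=100)   # many small packets
print(large / one_mb, small / one_mb)
```

With full-size packets the IP-level count is under 3% above the payload, while 100-byte payloads carry 40% overhead, so a mix of small packets could plausibly account for a gap of the size reported.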
Re: [pmacct-discussion] pmacctd transparent proxy
Hi all,

On Thu, 21 Dec 2006, Jaime Nebrera wrote:

I have a linux-router as internet gateway for small office with pmacctd running. It works well for now. But when I start the transparent proxy with permanent redirect of http to it, pmacct doesn't count incoming http traffic. I know that it comes from webserver to my router, not to lan client. Does anybody know how to count such traffic and assign it with lan host?

We have faced the same problem in the past and are currently experimenting with the only solution available. You need to use tproxy :) This means patching the kernel and iptables, patching Squid and well, getting into there. We have made it work but are unsure yet of its other consequences (besides of course, being able to see the internal IPs)

If I understood the problem correctly, then I think there is another possible solution: write your own transparent proxy (or modify an existing one) to intercept the X-Forwarded-For and Host headers, and all four IP addresses and port numbers (a pair of each for the connection into and out of the proxy). You can put this information in a database table that you can link to the pmacct accounting tables whenever you need it.

An added bonus is that you get the name of the remote website, not just the port number, whenever you want it. The disadvantages are that your web connections are broken into two connections in the pmacct database (which just means that it is reflecting reality); and your pmacct client software needs to be modified to take advantage of the new table.

Cheers, Chris.
-- (aidworld) chris wilson | chief engineer ([EMAIL PROTECTED])
Re: [pmacct-discussion] Classification
Hi Paolo, On Wed, 18 Oct 2006, Paolo Lucente wrote: I'd be interested to know if anyone has combined layer 7 classification with pmacct's traffic aggregation. For example, I would like to combine all Kazaa traffic (per minute) into a single counter. It's already there, you can get a look to the VIII. Quickstart guide to packet classifiers chapter in EXAMPLES. Thanks for pointing me towards that, and apologies for the delay in replying. I also found a link to [http://www.pmacct.net/classification/] which was quite well hidden on the main pmacct web page :-) and which explained what I needed to know: an overview of how the existing structure works. Yes, traffic shaping between interfaces should be better done in kernel. And i fully agree with you: doing the job twice is not great idea. So, if you can see a way to, say, get the flows from libpcap and classification infos from the kernel, just let me/us know as it sounds a good idea! OK, I have some idea of how this might work. Harald Welte, one of the Netfilter developers, has proposed a system for accounting flows in the kernel as part of Netfilter's Conntrack code. He presented a paper on this at LinuxTag 2005, which unfortunately is not available online in PDF form (since LinuxTag apparently charges for access to conference papers). I generated an HTML version and attached it here: [http://bmo.aidworld.org/attach/Chris/paper.html] Basically this means that the Linux kernel will be keeping track of flows, and can notify user space about flow events. Combined with IPP2P or L7-filter, we will have all the information that we need in the kernel, and efficient access to it from user space. So what I'm considering is to create a new version of pmacctd (like sfacctd, nfacctd) called ctacctd, which reads flow information from the kernel rather than from pcap, etc. Otherwise it would have the same data storage backend and processing tools as the pmacct suite. 
I hope that it could be included in the pmacct suite, even if it only works on Linux. The use of Layer 7 inspection in Netfilter gives us a powerful advantage, because we can monitor and shape traffic on the same box, with minimal reclassification. Perhaps it can be ported to the BSDs, etc, if I can figure out how to access the connection tracking system from user space. I'm currently on contract to an organisation in Kenya which is currently using flowc for traffic monitoring. Flowc has a powerful user interface and graphs, but it's extremely difficult to set up, and only works with Cisco routers using Netflow. I'm considering implementing some of this functionality for the pmacct suite. I'm still concerned about the performance of the MySQL plugin with threading, so I'm considering providing an option to disable the extra threads, and run updates synchronously. I'd be very interested to hear your comments on these ideas. Thanks in advance. Cheers, Chris. -- (aidworld) chris wilson | chief engineer (http://www.aidworld.org) ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] Classification
Hi Sven,

On Tue, 7 Nov 2006, Sven Anderson wrote:

He gave the same talk at the Linux Symposium 2005, you can find the paper in the proceedings: http://www.linuxsymposium.org/2005/linuxsymposium_procv2.pdf

Great, thanks for that.

- First, he clearly pointed out that flow accounting in the conntrack module makes sense _only_ if you use conntrack anyway (like firewall, NAT, ...). To use conntrack just for flow accounting would be overkill, he wrote.

Yes, and in our case we will be doing that anyway, because we want to traffic shape flows.

- Second, you are strictly bound to the classical flow keys which are kept in the conntrack table anyway, that is source and destination IP and port. So the usage of the flow-accounting module in conntrack is quite restricted, but as long as these restrictions don't bother you, it's a good alternative of course. (At the moment pmacct also only has a fixed flow data structure, but with the propagation of IPFIX I hope we will move to a more flexible structure.)

But this is also how Netflow works, isn't it? The Cisco router has some idea about flows that isn't changeable externally, and it will send you updates about their state whenever it feels like it. I think that the kernel sending you information about its understanding of flows (which ctacctd would be free to reinterpret and aggregate) would work similarly.

But for traffic-shaping based on application level analysis you have a problem already: You can classify packets, but you cannot store that information in the conntrack table as a flow key (AFAIK).

You can store it using connmark. I have to find a way to export that data to user space, but it shouldn't be hard once nfnetlink_conntrack exists.
Of course you could store that information in another place and map it to the flows in the conntrack table, but then the - let's call it - L7ClassID is not a real flow key, since it is possible that one flow (in the conntrack table) has several different L7ClassIDs over time, splitting it into different flows in fact.

I don't mind that in practice. I could ignore the classification from the point of view of distinguishing flows. Also, I thought that pmacct had the ability to reclassify existing flows?

In general you have to ask yourself the question, if having both routing and monitoring on the same machine is a good idea. You will probably always end up in a situation where both functionalities interfere with each other. That's why I think having a dedicated metering-probe is in most cases the better choice. And then, as the machine is not doing anything else with the monitored packets, handling everything in user-space is the better approach. Under Linux you can even optimize the network-adapter-user-space transition with PF_RING by Luca Deri. Of course, you cannot use this set-up if you want to do traffic shaping or similar based on the monitoring.

Yes, that is exactly what I want to do. I want to shape bittorrent, gnutella and skype traffic without having to know what port it's running on.

I'm still concerned about the performance of the MySQL plugin with threading, so I'm considering providing an option to disable the extra threads, and run updates synchronously.

Interesting. What about also having a switch to have numbers-only tables, that is IP addresses, timestamps, class_id, MAC addresses and protocol are all stored as integers?

I don't see how that would help. It's basically just changing the constant multiplier cost. The problem I'm having is that when the database or the box is busy, pmacct starts spawning more and more threads that end up sleeping on the database.
This eats resources and can lead to catastrophic failure (it has done it to me at least once). I would rather delay writing to the database by having it done synchronously, to limit the damage that it can do to the rest of the box.

While on the subject of changing everything: what about a different timestamp set? I would prefer three timestamps: one for the first and the last packet in a flow, and one for the time the flow got closed (or updated the last time) which would correspond to the time-slot the flow belongs to. The third one is probably not really necessary, as you can calculate it from the other timestamps and the configuration, but it would give you a good index-key for the time-slots.

Sorry, I don't understand what you mean by a time slot? For me, the relevant information is the start and end times of the flow, which I can use to draw graphs, etc. Ideally, I would like more detailed information about the flow at various points during its life (e.g. status every minute) and I'm not sure if I can get that using pmacctd, or how. I'm still working on it.

Cheers, Chris.
--
(aidworld) chris wilson | chief engineer (http://www.aidworld.org)
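One possible reading of the "time slot" idea above is a fixed-width accounting interval, derived from a flow's timestamps plus a configured slot width. The Python sketch below illustrates that reading only; the function names and the 60-second width are assumptions, and this is not how pmacct itself necessarily computes anything.

```python
def time_slot(ts, slot_seconds):
    """Return the start of the fixed-width slot that contains ts."""
    return int(ts) - int(ts) % slot_seconds


def slots_spanned(first_packet, last_packet, slot_seconds):
    """List the start of every slot a flow touches between first and last packet."""
    start = time_slot(first_packet, slot_seconds)
    end = time_slot(last_packet, slot_seconds)
    return list(range(start, end + 1, slot_seconds))


# Hypothetical flow: first packet 10 s into a minute, last packet 185 s later.
flow_first, flow_last = 1162900810, 1162900995
slot_of_close = time_slot(flow_last, 60)
spanned = slots_spanned(flow_first, flow_last, 60)
```

Under this reading, `slot_of_close` is the derivable "third timestamp" that could serve as an index key, while `spanned` shows the per-minute points at which a long-lived flow's status could be reported.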
[pmacct-discussion] Large number of threads
Hi all,

I'm running pmacct on a fairly low spec box (Celeron 366, 128 MB RAM) with a MySQL database. It started off fine, but as the box started to run out of memory (due to Apache I think), pmacctd started spawning more threads to write to the database. I ended up with 73 processes/threads in total, almost all database writers.

Is this really a good idea? Wouldn't it be better to serialise database writes to some extent, to degrade gracefully rather than spiralling to death? Or is this already possible and I missed the config option?

Thanks in advance for your help.

Cheers, Chris.
--
(aidworld) chris wilson | chief engineer (http://www.aidworld.org)
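The kind of serialised writing asked about above could look roughly like the following Python sketch: a single writer thread fed through a bounded queue, so a slow database applies back-pressure instead of triggering more writers. This is an illustration of the general technique, not pmacct's implementation; `run_serialised_writer`, `write_batch` and `max_pending` are hypothetical names, and `write_batch` stands in for whatever actually performs the database insert.

```python
import queue
import threading


def run_serialised_writer(batches, write_batch, max_pending=4):
    """Feed batches to one writer thread through a bounded queue.

    put() blocks once max_pending batches are waiting, so a slow database
    slows the producer down rather than causing more writers to be spawned.
    """
    pending = queue.Queue(maxsize=max_pending)

    def writer():
        while True:
            batch = pending.get()
            if batch is None:  # sentinel: no more work
                break
            write_batch(batch)

    worker = threading.Thread(target=writer)
    worker.start()
    for batch in batches:
        pending.put(batch)  # blocks while the queue is full
    pending.put(None)
    worker.join()


# Usage: a list's append method stands in for the real database write.
written = []
run_serialised_writer([[1, 2], [3], [4, 5, 6]], written.append)
```

The trade-off is that data reaches the database later when the box is busy, but the number of threads and the memory held in the queue both stay bounded.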