Hi George-Cristian,

This points in a direction but unfortunately is not of great help: it
*may* suggest there is an issue with memory availability, although
every failing malloc() in sql_cache_insert() is taken care of (and
given your aggregation method you should be exercising only one of
them).
I will have a deeper look and follow up privately with you, either with
a proposal for a setup with two nfacctd instances, each with a single
PostgreSQL plugin while keeping current functionality intact, or with
more info about my findings.

Cheers,
Paolo

On Mon, Jul 29, 2013 at 01:04:27PM +0300, George-Cristian Bîrzan wrote:
> I got a backtrace, though not from GDB. The problem is that I don't know
> which of the two plugins will crash, and as far as I know I can only attach
> to one plugin... Anyway, what I have is:
>
> *** glibc detected *** nfacctd: PostgreSQL Plugin [out]: break adjusted to
> free malloc space: 0x0000000001e2be60 ***
> ======= Backtrace: =========
> /lib/x86_64-linux-gnu/libc.so.6(+0x76d76)[0x7f151cfadd76]
> /lib/x86_64-linux-gnu/libc.so.6(+0x7a443)[0x7f151cfb1443]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70)[0x7f151cfb2b90]
> nfacctd: PostgreSQL Plugin [out](sql_cache_insert+0x822)[0x4503c2]
> nfacctd: PostgreSQL Plugin [out](pgsql_plugin+0xc60)[0x44b160]
> nfacctd: PostgreSQL Plugin [out](load_plugins+0x314)[0x4228a4]
> nfacctd: PostgreSQL Plugin [out](main+0xf53)[0x41b123]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7f151cf55ead]
> nfacctd: PostgreSQL Plugin [out][0x41baf5]
>
> The config I have is:
>
> sql_optimize_clauses: true
> sql_db: acct
> sql_table_version: 7
> sql_table: acct
> sql_refresh_time: 1
> sql_history: 1s
> sql_user: cloudsigma
> sql_dont_try_update: true
> sql_use_copy: true
> plugin_pipe_size: 573440000
> plugin_buffer_size: 20000
> aggregate: src_host,dst_host,proto,src_port,dst_port
> plugins: pgsql[in], pgsql[out]
>
> Also, I have to point out that I modified the code a bit to allow
> sql_history of 1s, but dunno if it's related.
>
> On Fri, Jun 21, 2013 at 2:39 PM, George-Cristian Bîrzan
> <[email protected]> wrote:
>
> > I'll try to, but I'm not so sure it'll be trivial to reproduce.
> >
> > On Thu, Jun 20, 2013 at 8:09 PM, Paolo Lucente <[email protected]> wrote:
> >
> >> Hi George-Cristian,
> >>
> >> One or more plugins that bail out and consequently core process that
> >> closes up after all plugins are gone (essentially, the message you
> >> posted) could be symptom of plugins crashing for some reason. It can
> >> help if you run the daemon under gdb with follow-fork-mode set to
> >> child and post the backtrace. Please follow this up with gdb outputs,
> >> etc. privately.
> >>
> >> Cheers,
> >> Paolo
> >>
> >> On Wed, Jun 19, 2013 at 11:01:30AM +0300, George-Cristian Bîrzan wrote:
> >> > Is it possible to auto-reconnect to the DB when the connection is lost?
> >> > For reasons that pass understanding, sometimes, pmacct decides it lost
> >> > the connection to the DB, at which point it just dies:
> >> >
> >> > Jun 19 04:01:53 host nfacctd[24062]: INFO: connection lost to 'out-pgsql'; closing connection.
> >> > Jun 19 04:01:53 host nfacctd[24062]: INFO: no more plugins active. Shutting down.
> >> >
> >> > At that time, our PostgreSQL server didn't log anything:
> >> >
> >> > 2013-06-17 16:51:46 UTC HINT: Consider increasing the configuration parameter "checkpoint_segments".
> >> > 2013-06-17 16:51:48 UTC LOG: checkpoints are occurring too frequently (2 seconds apart)
> >> > 2013-06-17 16:51:48 UTC HINT: Consider increasing the configuration parameter "checkpoint_segments".
> >> > 2013-06-17 17:53:52 UTC WARNING: pgstat wait timeout
> >> > 2013-06-17 23:58:02 UTC WARNING: pgstat wait timeout
> >> > 2013-06-19 07:52:18 UTC LOG: checkpoints are occurring too frequently (25 seconds apart)
> >> > 2013-06-19 07:52:18 UTC HINT: Consider increasing the configuration parameter "checkpoint_segments".
> >> >
> >> > (The 7:52 one is when I restarted it now.
> >> > And, yeah, gonna fix the psql
> >> > stuff, but so far it's been not a problem, as the load on the machine is
> >> > literally 0 as long as we don't do stupid stuff like try to read the
> >> > hundreds of GB of data)
> >> >
> >> > --
> >> > George-Cristian Bîrzan
> >>
> >
> > --
> > George-Cristian Bîrzan
>
> --
> George-Cristian Bîrzan

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
