[pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Johannes Formann
Hello,

I'll get this kind of error:
Apr 13 15:27:15 server pmacctd[1341]: ERROR ( default/mysql ): Duplicate
entry '0-00:1c:c0:74:5b:e9-00:00:0c:07:ac:6a-0-84.38.64.216-79.221.203.'
for key 1  
Apr 13 15:27:15 server kernel: pmacctd[1341]: segfault at f7002991 ip
f7bfa9ca sp ffb88334 error 4 in
libpthread-2.3.6.so[f7bf2000+e000]

Only when using sql_dont_try_update: true (without the databaseserver is
stressed too much):

But I think the configuration is so, this shoudn't happen:
! pmacctd configuration
!
!
!
daemonize: true
pidfile: /var/run/pmacctd.pid
syslog: daemon
!
! interested in in and outbound traffic
!aggregate: src_host,dst_host
aggregate:
src_host,dst_host,dst_port,src_port,flows,dst_mac,proto,src_mac,vlan
! on this interface
interface: eth0
!
post_tag: 1 
! storage methods
plugins: mysql
sql_db: pmacct
sql_table: acct_v4
sql_table_version: 4 
sql_passwd: verySecret ;)
sql_user: pmacct
! refresh the db every 5 minute
sql_refresh_time: 300
! reduce the size of the insert/update clause
sql_optimize_clauses: true
! accumulate values in each row for up to 5 minutes
sql_history: 5m 
! create new rows on the minute, hour, day boundaries
sql_history_roundoff: mhd
! Multivalue-Support, bei mysql schneller
sql_multi_values: 1024000

! nur interessante Ports, reduktion der Datenmenge
ports_file: /etc/pmacct/ports
! nur inserts
!sql_dont_try_update: true

! intern puffern
plugin_buffer_size: 2048
plugin_pipe_size: 2048000

Any hints?

regards

Johannes

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Johannes Formann
Hello,

  I'll get this kind of error:
  Apr 13 15:27:15 server pmacctd[1341]: ERROR ( default/mysql ): Duplicate
  entry '0-00:1c:c0:74:5b:e9-00:00:0c:07:ac:6a-0-84.38.64.216-79.221.203.'
  for key 1
 
 I get this as well. It means that the primary key is not long enough to
 make the records truly unique. As you can see, it has been truncated.
 Either find out how to make the key longer (and please let me know :) or
 remove some aggregation fields, e.g. mac addresses and vlans.

strange, The primary key could be up to 1000 bytes if I remember
correctly, the whole row isn't that long.

 I'm not sure why flows is in your aggregate set since flows are already
 aggregated into flows in all cases by pmacctd, as far as I know (please
 correct me if I'm wrong).

flow isn't in the primary key.

  Apr 13 15:27:15 server kernel: pmacctd[1341]: segfault at f7002991 ip
  f7bfa9ca sp ffb88334 error 4 in
  libpthread-2.3.6.so[f7bf2000+e000]
 
 And that should definitely not happen. Where did you get pmacct from? Did
 you compile it yourself? Can you build a version with debugger symbols and
 run it in gdb to get a backtrace (bt command) when it crashes?

Since I'd had similar problems with the debian version, I compiled it on
my own.

Since I hav't much experience with gdb, any hints, this seems to bee not
the right way:
(gdb) run -f /etc/pmacct/pmacctd.conf 
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /usr/sbin/pmacctd -f /etc/pmacct/pmacctd.conf
[Thread debugging using libthread_db enabled]
[New Thread -139180352 (LWP 4975)]
bt
^C
Program received signal SIGINT, Interrupt.
[Switching to Thread -139180352 (LWP 4975)]
0xe430 in __kernel_vsyscall ()
(gdb) bt
#0  0xe430 in __kernel_vsyscall ()
#1  0xf7c587f3 in write () from /lib/tls/i686/cmov/libc.so.6
#2  0x080579d4 in exec_plugins (pptrs=0xfff30fd8) at plugin_hooks.c:254
#3  0x08054331 in pcap_cb (user=0xfff31188 , pkthdr=0xfff31098,
buf=0x80d4a50 )
at pmacctd.c:604
#4  0xf7d150d3 in pcap_read_linux (handle=0x80d47b0, max_packets=-1,
callback=0x8054277 pcap_cb, 
user=0xfff31188 ) at pcap-linux.c:797
#5  0xf7d15f8b in pcap_loop (p=0x80d47b0, cnt=-1, callback=0x8054277
pcap_cb, user=0xfff31188 )
at pcap.c:107
#6  0x08054223 in main (argc=3, argv=0xfff314a4, envp=0xfff314b4) at
pmacctd.c:564

(The error has been loged bevore I pressen ctrl+c)

regards

Johannes

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Chris Wilson
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

 Apr 13 15:27:15 server kernel: pmacctd[1341]: segfault at f7002991 ip 
 f7bfa9ca sp ffb88334 error 4 in 
 libpthread-2.3.6.so[f7bf2000+e000]

 I think I got it (using a written coredump):

Yes, that's it, thanks. I'm afraid it doesn't mean much to me, but I hope 
it will help Paolo. What exact version of pmacct are you using?

 (gdb) bt
 #0  0xf7ba29ca in pthread_getspecific () from
 /lib/tls/i686/cmov/libpthread.so.0
 #1  0xf7c8bf85 in inet_ntoa () from /lib/tls/i686/cmov/libc.so.6

Paolo, this looks weird to me. pthread_getspecific() should not crash, 
that makes me think that the heap has been trashed (stack looks generally 
OK as the backtrace is OK). Perhaps a Valgrind is in order? Any static 
or fixed-size buffers in the mysql plugin that might be busted?

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.


___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Chris Wilson
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

 I'm not sure why flows is in your aggregate set since flows are 
 already aggregated into flows in all cases by pmacctd, as far as I 
 know (please correct me if I'm wrong).

 flow isn't in the primary key.

 I didn't say it was, but it is in your aggregate set and I don't 
 understand why.

 Are you shure its flow, between mac and IP it could be vlan?

aggregate:
src_host,dst_host,dst_port,src_port,flows,dst_mac,proto,src_mac,vlan

It's right there before dst_mac.

 I guess you mean the SIGSEGV error has been logged in your syslog? gdb 
 should stop when it sees the SIGSEGV error, and wait for a command such 
 as bt. So I guess it's happening in another thread than the main one, 
 so it will be harder to trace.

 You could wait until pmacctd is up and running, then press Ctrl+C, 
 enter the info threads command, then guess a thread other than the 
 first one and switch to it with thread xxx and continue, and hope 
 that that thread dies with SIGSEGV.

 Is pmacctd not terminated once pressing ctrl+c?

It shouldn't be, gdb should intercept the SIGINT and stop it from reaching 
the process.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.


___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Johannes Formann
Chris Wilson ch...@aptivate.org wrote:
 On Mon, 13 Apr 2009, Johannes Formann wrote:

Hi Chris,

  Apr 13 15:27:15 server kernel: pmacctd[1341]: segfault at f7002991 ip 
  f7bfa9ca sp ffb88334 error 4 in 
  libpthread-2.3.6.so[f7bf2000+e000]
 
  I think I got it (using a written coredump):
 
 Yes, that's it, thanks. I'm afraid it doesn't mean much to me, but I hope 
 it will help Paolo. What exact version of pmacct are you using?
 
  (gdb) bt
  #0  0xf7ba29ca in pthread_getspecific () from
  /lib/tls/i686/cmov/libpthread.so.0
  #1  0xf7c8bf85 in inet_ntoa () from /lib/tls/i686/cmov/libc.so.6
 
 Paolo, this looks weird to me. pthread_getspecific() should not crash, 
 that makes me think that the heap has been trashed (stack looks generally 
 OK as the backtrace is OK). Perhaps a Valgrind is in order? Any static 
 or fixed-size buffers in the mysql plugin that might be busted?

No Valgrind instaled.

I cleared the database, and observed what happend:
Apr 13 17:18:19 server1 pmacctd[12394]: INFO ( default/core ): Start logging ...
Apr 13 17:18:19 server1 pmacctd[12394]: OK ( default/core ): link type is: 1 
Apr 13 17:19:41 server1 pmacctd[12394]: Expiring orphan fragment:
ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33635 
Apr 13 17:19:47 server1 pmacctd[12394]: Expiring orphan fragment:
ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33756 
Apr 13 17:19:58 server1 pmacctd[12394]: Expiring orphan fragment:
ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33415 
Apr 13 17:20:01 server1 pmacctd[12419]: ERROR ( default/mysql ): Duplicate entry
'0-00:1b:8f:61:55:c9-00:1c:c0:ab:8a:48-0-91.22.172.35-84.38.74.24' for key 1  
Apr 13 17:20:01 server1 kernel: pmacctd[12419]: segfault at 3827208c ip
f7c599ca sp ffde7894 error 4 in
libpthread-2.3.6.so[f7c51000+e000]

This means starting with the secound write it fails, in the database the
follwowing stamp-updated were found (later more)
2009-04-13 17:19:42
2009-04-13 17:20:01

BTW: tried to use v7 Schema, slightly other SQL-Error (as expeted), same
problem.

cheers

Johannes

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Chris Wilson
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

 Paolo, this looks weird to me. pthread_getspecific() should not crash, 
 that makes me think that the heap has been trashed (stack looks 
 generally OK as the backtrace is OK). Perhaps a Valgrind is in order? 
 Any static or fixed-size buffers in the mysql plugin that might be 
 busted?

 No Valgrind instaled.

You can probably apt-get install valgrind and run pmacctd through it.

 I cleared the database, and observed what happend:
 Apr 13 17:18:19 server1 pmacctd[12394]: INFO ( default/core ): Start logging 
 ...
 Apr 13 17:18:19 server1 pmacctd[12394]: OK ( default/core ): link type is: 1
 Apr 13 17:19:41 server1 pmacctd[12394]: Expiring orphan fragment:
 ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33635
 Apr 13 17:19:47 server1 pmacctd[12394]: Expiring orphan fragment:
 ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33756
 Apr 13 17:19:58 server1 pmacctd[12394]: Expiring orphan fragment:
 ip_src=98.218.230.138 ip_dst=84.38.67.65 proto=17 id=33415
 Apr 13 17:20:01 server1 pmacctd[12419]: ERROR ( default/mysql ): Duplicate 
 entry
 '0-00:1b:8f:61:55:c9-00:1c:c0:ab:8a:48-0-91.22.172.35-84.38.74.24' for key 1
 Apr 13 17:20:01 server1 kernel: pmacctd[12419]: segfault at 3827208c ip
 f7c599ca sp ffde7894 error 4 in
 libpthread-2.3.6.so[f7c51000+e000]

As this crash is so early, perhaps the thread isn't initialised properly?

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.


___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Johannes Formann
Johannes Formann mlm...@formann.de wrote:

Hello,

 I'll get this kind of error:
 Apr 13 15:27:15 server pmacctd[1341]: ERROR ( default/mysql ): Duplicate
 entry '0-00:1c:c0:74:5b:e9-00:00:0c:07:ac:6a-0-84.38.64.216-79.221.203.'
 for key 1  
 Apr 13 15:27:15 server kernel: pmacctd[1341]: segfault at f7002991 ip
 f7bfa9ca sp ffb88334 error 4 in
 libpthread-2.3.6.so[f7bf2000+e000]

Could it

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Johannes Formann
Johannes Formann mlm...@formann.de wrote:

Hi Chris,

  Apr 13 17:20:01 server1 pmacctd[12419]: ERROR ( default/mysql ):
  Duplicate entry
  '0-00:1b:8f:61:55:c9-00:1c:c0:ab:8a:48-0-91.22.172.35-84.38.74.24' for
  key 1
  
  As this crash is so early, perhaps the thread isn't initialised properly?
 
 Well, the first update (into the complet empty table) was successfull,
 and I think that has used the same kid of thread.

I have now a guess whre the duplicated keys error come from:

Assume the updates are done at :30 with sql_history_roundoff: 1h and
sql_refresh_time: 3600 (1h) (so long for simplifikation)

at 0:30 for each recorded flow a row is inserted with the timestamp 0:00
at 1:30 for log flow a row is inserted for 0:00 and 1:00 ...

at least if I understood the dokumentation right, that makes the error,
since to identical inserts should be done...

greetings

Johannes

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Chris Wilson
Hi Johannes,

On Mon, 13 Apr 2009, Johannes Formann wrote:

 Apr 13 17:20:01 server1 pmacctd[12419]: ERROR ( default/mysql ):
 Duplicate entry
 '0-00:1b:8f:61:55:c9-00:1c:c0:ab:8a:48-0-91.22.172.35-84.38.74.24' for
 key 1

 As this crash is so early, perhaps the thread isn't initialised properly?

 Well, the first update (into the complet empty table) was successfull,
 and I think that has used the same kid of thread.

 I have now a guess whre the duplicated keys error come from:

 Assume the updates are done at :30 with sql_history_roundoff: 1h and
 sql_refresh_time: 3600 (1h) (so long for simplifikation)

 at 0:30 for each recorded flow a row is inserted with the timestamp 0:00
 at 1:30 for log flow a row is inserted for 0:00 and 1:00 ...

 at least if I understood the dokumentation right, that makes the error,
 since to identical inserts should be done...

My understanding is that with those settings, a row would be inserted just 
after 0:00, with stamp_inserted = 0:00, and another one just after 1:00, 
with stamp_inserted 1:00, so there should not be a conflict.

What makes you think that anything should happen at 0:30 or 1:30? Also, 
the second insert should have stamp_inserted = 1:00 not 0:00, as far as I 
know.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.


___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Paolo Lucente
Hi Johannes, Chris,

Thanks to the both of you for progressing troubleshooting of the problem.
A lot of interesting elements are already there but there is something not
clear yet about the nature of this issue:

* The SEGV appears to be in a pthread_getspecific() call while the MySQL
  plugin does not do any pthread-related call. Also a simple inet_ntoa()
  should not do that. Did you by any chance compile the package with the
  --enable-threads option? I don't think so but it's worth asking.

* Within the debugger i can see the following line: Core was generated
  by `pmacctd: SQL Plugin -- DB Writer (urgent). An urgent DB Writer can
  be triggered for two reasons - both related to the SQL cache: 1) memory
  is finished so it tries to purge the content of the cache to the DB in
  order to make room to further traffic or 2) the SQL cache size is too
  small for the traffic footprint. The value is expressed in cache entries,
  defaults to 32771 and can be tuned via the 'sql_cache_entries' config
  directive. Can you let me know about this point?

* Falling into an urgent DB writer and having the 'sql_dont_try_update'
  directive set to true can generate the duplicate SQL key message. The
  strange thing about it is that the key is reported truncated in the
  middle of an IP address '[...]-84.38.64.216-79.221.203.'. I actually
  can't think at anything which can originate this behaviour.

* Few days ago we received on this list a report from another user by
  which 'sql_multi_values' was causing a crash. Just to be sure, do you
  mind commenting this out and see if it helps? 

* If none of the above helps, i would suggest to proceed by simplifying
  the SQL-related configuration to a few statements, see if it works. If
  that's the case, then i would add statements back one by one. I would
  start with something like this:

  [ ... ]
  sql_db: pmacct
  sql_table: acct_v4
  sql_table_version: 4
  sql_passwd: verySecret ;)
  sql_user: pmacct
  sql_refresh_time: 60
  sql_history: 5m
  sql_history_roundoff: m
  ports_file: /etc/pmacct/ports
  [ ... ]

Let me know.

Cheers,
Paolo

On Mon, Apr 13, 2009 at 03:43:47PM +0200, Johannes Formann wrote:
 Hello,
 
 I'll get this kind of error:
 Apr 13 15:27:15 server pmacctd[1341]: ERROR ( default/mysql ): Duplicate
 entry '0-00:1c:c0:74:5b:e9-00:00:0c:07:ac:6a-0-84.38.64.216-79.221.203.'
 for key 1  
 Apr 13 15:27:15 server kernel: pmacctd[1341]: segfault at f7002991 ip
 f7bfa9ca sp ffb88334 error 4 in
 libpthread-2.3.6.so[f7bf2000+e000]
 
 Only when using sql_dont_try_update: true (without the databaseserver is
 stressed too much):
 
 But I think the configuration is so, this shoudn't happen:
 ! pmacctd configuration
 !
 !
 !
 daemonize: true
 pidfile: /var/run/pmacctd.pid
 syslog: daemon
 !
 ! interested in in and outbound traffic
 !aggregate: src_host,dst_host
 aggregate:
 src_host,dst_host,dst_port,src_port,flows,dst_mac,proto,src_mac,vlan
 ! on this interface
 interface: eth0
 !
 post_tag: 1 
 ! storage methods
 plugins: mysql
 sql_db: pmacct
 sql_table: acct_v4
 sql_table_version: 4 
 sql_passwd: verySecret ;)
 sql_user: pmacct
 ! refresh the db every 5 minute
 sql_refresh_time: 300
 ! reduce the size of the insert/update clause
 sql_optimize_clauses: true
 ! accumulate values in each row for up to 5 minutes
 sql_history: 5m 
 ! create new rows on the minute, hour, day boundaries
 sql_history_roundoff: mhd
 ! Multivalue-Support, bei mysql schneller
 sql_multi_values: 1024000
 
 ! nur interessante Ports, reduktion der Datenmenge
 ports_file: /etc/pmacct/ports
 ! nur inserts
 !sql_dont_try_update: true
 
 ! intern puffern
 plugin_buffer_size: 2048
 plugin_pipe_size: 2048000
 
 Any hints?
 
 regards
 
 Johannes
 
 ___
 pmacct-discussion mailing list
 http://www.pmacct.net/#mailinglists

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Strange SQL-Error

2009-04-13 Thread Johannes Formann
Paolo Lucente pa...@pmacct.net wrote:

Hi Paolo,

 * The SEGV appears to be in a pthread_getspecific() call while the MySQL
   plugin does not do any pthread-related call. Also a simple inet_ntoa()
   should not do that. Did you by any chance compile the package with the
   --enable-threads option? I don't think so but it's worth asking.

Tested once, but used only more CPU-Time, so thrown away ;)
Currently runnig:
./configure --enable-mysql   --enable-ipv6 --prefix=/usr/
--with-pcap-libs=/usr/local/lib/
--with-pcap-includes=/usr/local/src/libpcap-0.9.8.20081128
--enable-64bit   --enable-debug


 * Within the debugger i can see the following line: Core was generated
   by `pmacctd: SQL Plugin -- DB Writer (urgent). An urgent DB Writer can
   be triggered for two reasons - both related to the SQL cache: 1) memory
   is finished so it tries to purge the content of the cache to the DB in
   order to make room to further traffic or 2) the SQL cache size is too
   small for the traffic footprint. The value is expressed in cache entries,
   defaults to 32771 and can be tuned via the 'sql_cache_entries' config
   directive. Can you let me know about this point?

It's the default, so increasing sql_cache_entries might help?
I'll set it to 262168, should bee enough to prevent it to be the error.

 * Falling into an urgent DB writer and having the 'sql_dont_try_update'
   directive set to true can generate the duplicate SQL key message. The
   strange thing about it is that the key is reported truncated in the
   middle of an IP address '[...]-84.38.64.216-79.221.203.'. I actually
   can't think at anything which can originate this behaviour.

Looks like a fixed legth trukation, while testing I'd thrown out the
agent and tos-field, and after that the ip fits completly in the error.

 * Few days ago we received on this list a report from another user by
   which 'sql_multi_values' was causing a crash. Just to be sure, do you
   mind commenting this out and see if it helps? 

I try everything ;)

But that Changes seems to solve the Problem, although without
sql_multi_values the db-Server is a more loaded (and the table longer
locked).

regards

Johannes

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists