On Thu, May 2, 2013 at 4:37 PM, Paolo Lucente <[email protected]> wrote:
> If both processes drop to zero CPU utilization then it looks like the
> issue might be in what is feeding pmacct. Although pmacctd protects from
> interface flaps (ie. if the interface drops, it tries to re-bind) can
> you check your system logs to spot if there has been any link down-ups?

No, the links are stable. There are no link down/up events in the
kernel log. Both processes just hang. I tried attaching to them with
strace: one of them was stuck in a futex call, the other showed up as
being in restart_syscall.

> What OS are you running this and what mechanism are you using to feed
> pmacct (plain libpcap, libpcap-mmap, PF_RING, etc.)? What version of
> pmacct are you running - and can you find anything relevant in pmacct
> log file, if one is configured?

The OS is Ubuntu 13.04; pmacct is from packages, version 0.14.0. The
pmacct package in Ubuntu/Debian depends on the libpcap library, so I
suppose it is using libpcap. pmacct sends its logs to syslog, and I
don't see anything suspicious except lines like this:

pmacctd[30646]: INFO ( default/core ): short IPv4 packet read (37/38/frags). Snaplen issue ?

I suppose this is not a problem: only a fragment of an IPv4 packet was
received, so pmacct cannot see any protocol fields above the IP
header. When pmacct hangs, logging stops, so there is no clue as to
what is wrong.
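
To keep an eye on how often these messages show up and how much data
is missing, something like the following could be run over syslog.
This is only a sketch: I am assuming that the two numbers in the
message are the captured length and the length pmacct wanted to read,
which I have not verified, and the function name is made up.

```python
import re

# Matches the pmacctd message quoted above, e.g. "(37/38/frags)".
SHORT_READ_RE = re.compile(r"short IPv4 packet read \((\d+)/(\d+)/frags\)")

def missing_bytes(line):
    """Return how many bytes the capture fell short by, or None.

    ASSUMPTION: the first number is the captured length and the second
    is the length pmacct needed; this is a guess from the message text.
    """
    match = SHORT_READ_RE.search(line)
    if match is None:
        return None
    captured, needed = int(match.group(1)), int(match.group(2))
    return needed - captured
```

Under that assumption, the line quoted above would mean the packet
was one byte short.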

> It would be ideal, if you manage to reproduce the issue, if you could
> provide remote-access for an inspection - if i find you positive on this,
> please follow up privately.

Well, this happens several times per day. Maybe I can collect some
core dumps or whatever else is needed if you give me instructions? As
it is running on a production router (if you can call it "production"
with unstable traffic accounting ;-), we restart it as soon as the
problem appears.
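
As a starting point, this is roughly what I could run against a hung
process before restarting it. It is a sketch, not something I have
run yet: it assumes gdb and gcore are installed on the router, and
the output directory and file prefix are placeholders. The function
only builds the command lines, it does not execute them.

```python
def diagnostic_commands(pid, out_dir="/tmp"):
    """Build commands that capture the state of a hung process.

    ASSUMPTIONS: gdb and gcore are installed; out_dir and the
    "pmacctd-core" prefix are placeholders chosen for illustration.
    """
    pid = str(int(pid))  # reject anything that is not a number
    return [
        # Print a backtrace of every thread, then detach and exit.
        ["gdb", "-p", pid, "-batch", "-ex", "thread apply all bt"],
        # Write a core file of the live process; gcore appends ".<pid>".
        ["gcore", "-o", out_dir + "/pmacctd-core", pid],
    ]
```

Each returned list can be passed to subprocess.run() as root. The
thread backtrace should show which futex the stuck process is
waiting on.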

Actually, there are two pmacct instances running, because a single
instance maxes out a CPU core. We run one instance for the inbound
and one for the outbound direction, so that the load is spread across
cores. A single instance hangs the same way, so the problem is not
related to running two instances.

This is the config file for one direction; the second is the same
with the direction reversed in pcap_filter:

daemonize: false
pidfile: /var/run/pmacctd.eth0.19-in.pid
syslog: daemon

pcap_filter: src net <NET1>/20 or src net <NET2>/18
interface: eth0.19
promisc: false

plugins: nfprobe
nfprobe_receiver: 172.19.200.19:9998
nfprobe_version: 5
nfprobe_timeouts: maxlife=120:general=15:tcp=15:tcp.rst=15:tcp.fin=15:udp=15:icmp=15:expint=15
nfprobe_maxflows: 200000
nfprobe_source_ip: <IP>
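
To be explicit about what "direction reversed" means: the second
instance uses the same filter with src and dst swapped. A small
sketch of that transformation (the helper name is made up, and it
only handles the plain src/dst qualifiers used above):

```python
import re

def reverse_direction(pcap_filter):
    """Swap the 'src' and 'dst' qualifiers in a pcap filter expression."""
    swap = {"src": "dst", "dst": "src"}
    return re.sub(r"\b(src|dst)\b", lambda m: swap[m.group(1)], pcap_filter)
```

Applied to the pcap_filter above, this gives
"dst net <NET1>/20 or dst net <NET2>/18" for the other instance.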



--
Timur Irmatov

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
