Hello all,
I've been working with Mr. Strauss to resolve a problem that has plagued me
for several years: ntop segfaults at random intervals, sometimes after an hour,
sometimes after a week. The failure I see most often is related to resolving
IPs to host names. This process does not appear to be thread safe.
For several days now I've been running a single thread for this process and
have experienced zero segfaults, whereas typically I would have several per day.
During testing, increasing the number of threads from 3 to 9 resulted in much
more frequent abends.
If you experience segfaults, please try the following tests:
1.) Add "-n" (long form "--numeric-ip-addresses"), or "-n 0", where:
[-n <mode> | --numeric-ip-addresses <mode>]  Numeric IP addresses
    DNS resolution mode:
    0 - No DNS resolution at all
    1 - DNS resolution for local hosts only
    2 - DNS resolution for remote hosts only
Which options are available to you depends on your version of ntop. I THINK
4.0.3 and later support "-n <mode>", whereas earlier versions support just
"-n". You can check by running "./ntop --help" in the directory where your ntop
binary is stored. It will print all (most) of the startup options.
Run ntop in this mode for some period of time, whatever you feel is
sufficient to prove it's stable. E.g., if it usually abends once a day,
perhaps let it run for a week.
2.) If step 1 is successful, the next step is to recompile ntop with the
following change to "globals-defines.h":
- #define MAX_NUM_DEQUEUE_ADDRESS_THREADS 1
The default is 3 threads, and under certain conditions they stomp on each
other. Limiting to a single thread renders this specific fault condition
impossible.
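
To illustrate the kind of fault I mean, here is a minimal sketch. It is NOT
ntop's code, and whether this is the exact mechanism inside ntop is an
assumption on my part. Many classic resolver-style routines hand back a pointer
to one shared static buffer, so two threads calling them at the same moment can
overwrite each other's result. The names unsafe_lookup, safe_lookup and
lookup_lock are hypothetical, made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a non-reentrant resolver: every call
 * writes into the same static buffer and returns a pointer to it. */
static const char *unsafe_lookup(const char *ip) {
    static char name[64];
    snprintf(name, sizeof name, "host-%s.example", ip);
    return name;
}

/* One lock shared by all callers of the non-reentrant routine. */
static pthread_mutex_t lookup_lock = PTHREAD_MUTEX_INITIALIZER;

/* Serialize the call and copy the answer into a caller-owned buffer
 * before releasing the lock. Returns 0 on success, -1 if truncated. */
static int safe_lookup(const char *ip, char *out, size_t out_len) {
    assert(ip != NULL && out != NULL && out_len > 0);
    pthread_mutex_lock(&lookup_lock);
    const char *shared = unsafe_lookup(ip);
    int n = snprintf(out, out_len, "%s", shared);
    pthread_mutex_unlock(&lookup_lock);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

With only one worker thread there is never a second caller, which is why
dropping the thread count to 1 sidesteps this class of problem without any
locking at all.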
Obviously, running only a single thread means the "to be resolved" queue may
get quite large and possibly overflow, meaning anything above the max will be
discarded. This is just a test, so some (or mostly) resolved IPs are better than
none, right? If you wish, you COULD increase the queue size by editing the
following, again in globals-defines.h:
- #define MAX_NUM_QUEUED_ADDRESSES 16384
16K is quite large. If your DNS servers are fast you should be able to keep
the queue serviced before it overflows. However, if you wish to change this, I
ask that you do so some time after reducing the thread count to 1, to minimize
the variables and keep the test as controlled as possible.
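
For anyone wondering what "overflow and discard" looks like, here is a rough
sketch of a fixed-size queue that drops new entries once it is full. Again,
this is NOT ntop's implementation. The addr_queue type, its functions and
DEMO_MAX_QUEUED are hypothetical names, and the capacity is kept tiny so the
behavior is easy to see (the real limit discussed above is 16384).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_QUEUED 4   /* hypothetical, tiny on purpose */

typedef struct {
    uint32_t items[DEMO_MAX_QUEUED]; /* pending IPv4 addresses */
    size_t head;                     /* index of the oldest entry */
    size_t count;                    /* entries currently queued */
    unsigned long dropped;           /* entries discarded when full */
} addr_queue;

static void addr_queue_init(addr_queue *q) {
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    q->dropped = 0;
}

/* Returns 1 if queued, 0 if the queue was full and the address was dropped. */
static int addr_queue_push(addr_queue *q, uint32_t addr) {
    assert(q != NULL);
    if (q->count == DEMO_MAX_QUEUED) {
        q->dropped++;
        return 0;
    }
    q->items[(q->head + q->count) % DEMO_MAX_QUEUED] = addr;
    q->count++;
    return 1;
}

/* Returns 1 and stores the oldest address in *out, or 0 if the queue is empty. */
static int addr_queue_pop(addr_queue *q, uint32_t *out) {
    assert(q != NULL && out != NULL);
    if (q->count == 0) {
        return 0;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % DEMO_MAX_QUEUED;
    q->count--;
    return 1;
}
```

A counter like "dropped" is the sort of thing worth watching during the test:
if it stays at zero, the single thread is keeping up and there is no reason to
touch the queue size.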
It seems most people reporting this issue are running netflow, myself
included. All interface types use the same process/function to resolve IPs,
so I'm not sure why it seems more prevalent with netflow users.
Anyway, hope this helps someone. If anyone is willing to help test patches,
that would be great. I'll "alpha" test them beforehand, but it would help to
have at least a couple of others test the beta patches.
TIA!
Gary