OK, but WHICH version of ntop?

The bad magic # message is odd.  The malloc() calls should be protected, so
if ntop had run out of available memory (real+swap), it should have logged
a message and then gone into stopcap mode.

Unless this is 2.2??
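
For anyone following along, here is roughly what I mean by a "protected"
malloc().  This is only a sketch, not the actual ntop source:
checked_malloc(), CHECKED_MALLOC and the stop_capture flag are names made up
for illustration, and the flag merely stands in for switching to stopcap
mode.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag: set when an allocation fails so that the capture
 * loop can notice it and stop sniffing (a stand-in for "stopcap mode"). */
static int stop_capture = 0;

/* Hypothetical wrapper, not ntop's real code: allocate memory and, on
 * failure, log where it happened and ask for capture to stop instead of
 * letting the caller run on with a NULL pointer unnoticed. */
static void *checked_malloc(size_t size, const char *file, int line)
{
    void *p;

    assert(file != NULL);   /* callers always pass __FILE__ */

    p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "**ERROR** malloc(%zu) failed at %s:%d, stopping capture\n",
                size, file, line);
        stop_capture = 1;
    }
    return p;
}

/* Convenience macro so call sites record their own file and line. */
#define CHECKED_MALLOC(sz) checked_malloc((sz), __FILE__, __LINE__)
```

A caller would still have to check the return value; the point is only that
a failure leaves a message in the log and flips the flag.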

Now, once there's a bogus entry in the hash tables, I can see how it could
cause lots and lots of messages - the magic number test is in the free and
lookup routines (hash.c and util.c).  The real question is where the bogus
entry came from.  Given the # of repetitions, I'm betting it was in the
lookup, and the associated entry was one that had a lot of traffic.
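
To illustrate why one bad entry can produce so many messages, here is a
minimal sketch of a magic number test.  The value 1968 comes from the
"expected=1968" in the log; the struct layout and the function names
(host_entry, entry_is_valid, account_packet) are assumptions for the
example, not the real definitions in hash.c or util.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define ENTRY_MAGIC 1968   /* from the "expected=1968" log line */

/* Hypothetical entry layout; the real host structure is much larger. */
struct host_entry {
    unsigned short magic;       /* set to ENTRY_MAGIC when the entry is created */
    unsigned long  bytes_seen;  /* traffic accounted to this host */
};

/* Returns 1 if the entry carries the expected magic number, otherwise
 * logs a warning and returns 0.  A NULL pointer is treated as a caller
 * bug rather than as table corruption. */
static int entry_is_valid(const struct host_entry *e)
{
    assert(e != NULL);

    if (e->magic != ENTRY_MAGIC) {
        fprintf(stderr,
                "**WARNING** Error: bad magic number (expected=%d/real=%d)\n",
                ENTRY_MAGIC, (int)e->magic);
        return 0;
    }
    return 1;
}

/* Lookup-side use: every packet for this host runs the test, so a
 * corrupt entry for a busy host logs one warning per packet. */
static int account_packet(struct host_entry *e, unsigned long len)
{
    if (!entry_is_valid(e))
        return -1;
    e->bytes_seen += len;
    return 0;
}
```

With a check like this on the lookup path, a zeroed entry (real=0) for a
high-traffic host would explain 122076 repeats in a few seconds.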

No clue why, however, and that's the key question!

The trap code worked, but what was it trapping????

Once you have 'too many' hosts - and that depends on the cpu, memory, swap,
etc. - it's easily possible for the generation of some of the html pages to
take longer than the 15s alarm allows, hence the message:

**ERROR** http generation failed, alarm() tripped. Please report this to
ntop-dev list!

So that's not something I'm worried about - although it would be nice to
have the stats from the problem report to see how many hosts ntop 'knows'
about.
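
For reference, the alarm() guard described above works along these lines.
This is a generic POSIX sketch, not a copy of the ntop web server code:
run_with_alarm(), quick_page() and stuck_page() are hypothetical names, and
the two page functions are just stand-ins for a fast and a runaway page
generator.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <unistd.h>

static sigjmp_buf alarm_env;
static volatile sig_atomic_t pages_done = 0;

/* SIGALRM handler: abandon the page generator by jumping back out. */
static void on_alarm(int sig)
{
    (void)sig;
    siglongjmp(alarm_env, 1);
}

/* Runs generate() with a time limit.  Returns 0 if it finished in time,
 * -1 if the alarm tripped first. */
static int run_with_alarm(unsigned seconds, void (*generate)(void))
{
    struct sigaction sa, old;

    assert(generate != NULL);

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, &old);

    if (sigsetjmp(alarm_env, 1) != 0) {
        /* We arrive here from the handler: the limit was exceeded. */
        sigaction(SIGALRM, &old, NULL);
        return -1;
    }

    alarm(seconds);     /* arm the timer */
    generate();
    alarm(0);           /* finished in time, disarm */
    sigaction(SIGALRM, &old, NULL);
    return 0;
}

/* Hypothetical stand-ins for page generators. */
static void quick_page(void) { pages_done++; }
static void stuck_page(void) { for (;;) pause(); }
```

Calling run_with_alarm(15, some_page) and logging an error on -1 gives the
behaviour seen in the log: a slow page is abandoned and reported, and the
process carries on.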

For practical purposes, you should probably configure your logrotate with a
size limit too.  Or even move to a more robust logger.  I use metalog on my
Gentoo systems - it checks the log file size on each message, rather than
relying on a cron-driven logrotate.  That would have prevented your log
from filling up the system.


Thanks for the report.  Unfortunately, there's just not much we can do with
it.  Ultimately it comes down to what caused a wild/bad pointer in the hash
table, and the information to detect that - if it ever existed - is long
gone.

-----Burton




> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf
> Of Berggren Niklas
> Sent: Tuesday, April 27, 2004 1:39 AM
> To: [EMAIL PROTECTED]
> Subject: [Ntop-dev] Rather severe ntop bug
>
>
> Hello.
>
> I checked one of my ntop boxes today, and noticed that there was
> 0 bytes left on the disk. The problem was that the file
> /tmp/logwatch.XXzEMOPa/messages had grown to ~4.5 Gbytes over the night.
> In my /var/log/messages I found the following snippet:
>
> --Log cut--
> Apr 26 08:07:16 vnm ntop[16340]:   INITWEB: Starting web server
> Apr 26 08:07:16 vnm ntop[16340]:   THREADMGMT: Started thread
> (65545) for web server
> Apr 26 08:07:16 vnm ntop[17650]:   THREADMGMT: web connections
> thread (17650) started...
> Apr 26 08:07:16 vnm ntop[16340]:   THREADMGMT: Started thread
> (73738) for network packet sniffing on eth0
> Apr 26 08:07:16 vnm ntop[17651]:   THREADMGMT: pcap dispatch
> thread running...
> Apr 26 08:07:16 vnm ntop[17650]:   Note: SIGPIPE handler set (ignore)
> Apr 26 08:07:16 vnm ntop[17650]:   WEB: ntop's web server is now
> processing requests
> Apr 26 11:51:42 vnm sshd(pam_unix)[16285]: session closed for user root
> Apr 26 14:22:31 vnm sshd(pam_unix)[1615]: session opened for user
> root by (uid=0)
> Apr 26 14:27:02 vnm ntop[5670]:   **ERROR** http generation
> failed, alarm() tripped. Please report this to ntop-dev list!
> Apr 26 14:52:04 vnm ntop[27623]:   **ERROR** http generation
> failed, alarm() tripped. Please report this to ntop-dev list!
> Apr 26 15:17:08 vnm sshd(pam_unix)[1615]: session closed for user root
> Apr 26 20:37:13 vnm ntop[31733]:   **ERROR** http generation
> failed, alarm() tripped. Please report this to ntop-dev list!
> Apr 26 20:37:26 vnm ntop[513]:   **ERROR** http generation
> failed, alarm() tripped. Please report this to ntop-dev list!
> Apr 26 20:58:25 vnm ntop[17651]:   Resetting stats on user request...
> Apr 26 20:58:25 vnm ntop[17651]:   Resetting traffic statistics
> for device eth0
> Apr 26 20:58:26 vnm ntop[17651]:   Resetting traffic statistics
> for device NetFlow-device
> Apr 26 20:58:31 vnm ntop[16453]:   **WARNING** Error: bad magic
> number (expected=1968/real=0)
> Apr 26 20:58:42 vnm last message repeated 122076 times
> Apr 26 20:58:42 vnm ntop[17651]:   User requested stats reset complete
> Apr 26 20:58:42 vnm ntop[16453]:   **WARNING** Error: bad magic
> number (expected=1968/real=0)
> --Log cut--
>
> The rest of the log is filled with "This message repeated X
> times"-messages.
> The system has run out of RAM before, but then ntop was merely
> killed and forgotten by the kernel, I've not found anything
> indicating that the machine ran out of swap this time.
> Further, the only one using SSH to this box is me, and I
> experienced no problems in general when I left at 15:17. At that
> time however, the disk was not full.
>
> My starting method of ntop:
> --snip from /etc/rc.local--
> LOCALSUBNETS="148.138.0.0/255.255.240.0,148.138.16.0/255.255.248.0,148.138.64.0/255.255.240.0,148.138.80.0/255.255.248.0,192.165.181.0/255.255.255.0"
> PROTOCOLS="HTTP=http|www|https|3128|8080,FTP=ftp|ftp-data,SSH=22,TELNET=telnet|login,SNMP=snmp|snmp-trap,SMB=netbios-ns|netbios-dgm|netbios-ssm,MAIL=25|110|993,NOVELL=524"
> /usr/local/ntop/bin/ntop -4 -M -m $LOCALSUBNETS -p $PROTOCOLS --w3c -o -c -d --disable-instantsessionpurge
> --snip--
>
> This obviously eats a lot of memory, and had ~190 pages of total
> hosts, via NetFlow.
>
> No major issue for myself, but you wanted this reported, so I
> did. If you need anything else just drop me a message.
>
> Regards.
> Niklas Berggren.
> _______________________________________________
> Ntop-dev mailing list
> [EMAIL PROTECTED]
> http://listgateway.unipi.it/mailman/listinfo/ntop-dev
