Hello,
I was unable to reproduce the problem on 5.7.1, so I assume that this was
already fixed. I'll upgrade our production installation to 5.7.2 as soon as
possible.
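
For anyone who wants to repeat the test, here is a minimal sketch of the kind
of loop described in the quoted message below. It is not the script that was
actually used (that one was never posted). The function name send_hups, the
one-second pause and the injectable kill/sleep parameters are assumptions made
for illustration. The only values taken from the thread are the 4096 limit
from the error message and the roughly one second snmptrapd needs per SIGHUP.

```python
import os
import signal
import time

CONF_FILE_LIMIT = 4096  # limit quoted in the snmptrapd error message
PAUSE_SECONDS = 1.0     # assumption: snmptrapd needs about a second per SIGHUP


def send_hups(pid, count=CONF_FILE_LIMIT, pause=PAUSE_SECONDS,
              kill=os.kill, sleep=time.sleep):
    """Send `count` SIGHUP signals to process `pid`, pausing after each one.

    `kill` and `sleep` are parameters only so the loop can be dry-run
    without touching a real process. Returns the number of signals sent.
    """
    sent = 0
    for _ in range(count):
        kill(pid, signal.SIGHUP)  # ask the daemon to re-read its config
        sent += 1
        sleep(pause)              # give it time to finish before the next one
    return sent


# Hypothetical usage, with the real pid of snmptrapd filled in by hand:
#   send_hups(12345)
```

With the default one-second pause, 4096 signals take 4096 seconds, or about
68 minutes, which matches the "just over an hour" mentioned below.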
BR,
Joel Hansell


On Tue, Feb 4, 2014 at 3:09 PM, Joel Hansell <joel.hans...@gmail.com> wrote:

> Hi again,
>
> As an update, I was able to reproduce the error by writing a script that
> issues 4096 HUP signals to snmptrapd. The process starts issuing the
> "maximum conf file count (4096) exceeded" error, and forgets the configured
> logging format.
> I don't suppose this is a known bug which was fixed some time after
> 5.6.1.1? It looks like an issue with the signal handling - there is only
> one config file, which is read multiple times. Seems like the config file
> counter should be reset when SIGHUP is received.
> I can't find the source for 5.6.1.1, but in the 5.6.2 code, I don't see
> what the problem could be since "files" is in function-local scope. It
> should be zeroed every time the read_config() function is called. I guess
> there could be some scope confusion (or confusion on my part, I'm not a
> habitual C programmer), or the problem was already fixed in 5.6.2...
>
> I'm trying to reproduce the bug on 5.7.1 now, but unfortunately, since
> snmptrapd takes about a second to restart on SIGHUP, the test takes just
> over an hour to execute.
>
> BR,
> Joel Hansell
>
>
>
>
> On Tue, Feb 4, 2014 at 10:58 AM, Joel Hansell <joel.hans...@gmail.com> wrote:
>
>> Hi list,
>>
>> Here's one I've been scratching my head over lately.
>>
>> We have a bit of an oddball solution, where we've got snmptrapd logging a
>> lot of traps into a file, which is parsed by an external tool. This is all
>> running on an HP-UX 11.2 system, with Net-SNMP version 5.6.1.1 delivered
>> with the "HP-UX Internet Express" package.
>>
>> A cron job runs "logrotate" every 15 minutes, and if the file is too big,
>> it's rotated, and the postrotate script issues a SIGHUP to snmptrapd. That
>> normally triggers the daemon to re-read its config and to restart the
>> logging into a new file.
>>
>> The snmptrapd.conf file is set up to use a particular one-line trap logging
>> format.
>>
>> It seems that every so often, snmptrapd fails subtly on SIGHUP. It
>> only seems to happen after more than a couple of months have passed. 60
>> days, 91 days, 101 days, 113 days are some of the fault intervals.
>>
>> I've observed the following about the failure state after it happens:
>> - snmptrapd is executing and logging traps.
>> - The trap logging format has changed to the default trap logging format
>> (three lines per trap). This causes our parser to fail.
>> - snmptrapd logs the error "[...]/snmptrapd.conf: line 0: Error: maximum
>> conf file count (4096) exceeded" at the start of the log file.
>>
>> The flow of traps is such that the log file is usually rotated every 30
>> minutes, but it goes up and down a bit. Could this failure be happening
>> after 4096 SIGHUPs? That would explain the varying time between failures.
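
As a rough sanity check of that guess, the arithmetic can be sketched in
Python. It assumes that every rotation sends exactly one SIGHUP and that each
SIGHUP advances the config file counter by one. That is the hypothesis from
the message, not something confirmed from the Net-SNMP source. The function
names are made up for this illustration.

```python
CONF_FILE_LIMIT = 4096       # limit quoted in the error message
MINUTES_PER_DAY = 24 * 60


def days_until_limit(minutes_between_hups, limit=CONF_FILE_LIMIT):
    """Days needed to accumulate `limit` SIGHUPs at a steady interval."""
    return limit * minutes_between_hups / MINUTES_PER_DAY


def implied_interval_minutes(days, limit=CONF_FILE_LIMIT):
    """Average minutes between SIGHUPs if the limit was hit after `days`."""
    return days * MINUTES_PER_DAY / limit


# Fastest possible rotation (every cron run) and the typical 30 minutes.
for minutes in (15, 30):
    print(f"{minutes} min between rotations: "
          f"{days_until_limit(minutes):.1f} days to reach the limit")

# Average rotation interval implied by each observed failure interval.
for days in (60, 91, 101, 113):
    print(f"failure after {days} days: "
          f"{implied_interval_minutes(days):.1f} min between rotations")
```

Under those assumptions, rotating on every 15-minute cron run would hit the
limit after roughly 42.7 days, and a steady 30 minutes after roughly 85.3
days. The observed intervals of 60, 91, 101 and 113 days correspond to average
rotation intervals of roughly 21.1, 32.0, 35.5 and 39.7 minutes, all of which
sit plausibly around the "usually every 30 minutes" figure.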
>>
>> I'm grateful for any input.
>>
>> Regards,
>> Joel Hansell
>>
>>
>
_______________________________________________
Net-snmp-users mailing list
Net-snmp-users@lists.sourceforge.net
Please see the following page to unsubscribe or change other options:
https://lists.sourceforge.net/lists/listinfo/net-snmp-users