The way I'm currently considering doing it is to monitor the age (time since
last modification) of a few files (which I know are perpetually active) and,
if they go stale, stop nxlog (possibly killing it if it won't stop) and then
start it again.
In bash terms, something like:
seconds_since_last_modification=$(( $(date +%s) - $(date --reference=/path/to/busy.log +%s) ))
Performing a core dump (using gcore) would also be useful in such a script.
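
Fleshed out a little, the idea might look like the sketch below. It has not
been tested against a real nxlog hang. The pidfile location, the core-file
prefix, the 300-second threshold, the 10-second grace period and the use of
"service nxlog stop/start" are all assumptions to adjust per site; the
function names are made up for illustration.

```shell
#!/bin/bash
# Watchdog sketch: restart nxlog when a normally busy log file goes quiet.
# Every path and threshold here is an assumption, not an nxlog default.

BUSY_LOG=${BUSY_LOG:-/path/to/busy.log}          # file known to be perpetually active
MAX_AGE=${MAX_AGE:-300}                          # seconds of silence tolerated
PIDFILE=${PIDFILE:-/var/run/nxlog/nxlog.pid}     # assumed pidfile location
CORE_PREFIX=${CORE_PREFIX:-/var/tmp/nxlog-core}  # gcore writes $CORE_PREFIX.<pid>

# Print the number of seconds since FILE ($1) was last modified.
file_age_seconds() {
    local now mtime
    now=$(date +%s)
    mtime=$(date --reference="$1" +%s) || return 1
    echo $(( now - mtime ))
}

# Return 0 if FILE ($1) was last modified more than MAX ($2) seconds ago.
# Returns 2 if the age cannot be determined (e.g. the file is missing).
is_stale() {
    local age
    age=$(file_age_seconds "$1") || return 2
    [ "$age" -gt "$2" ]
}

# Dump core for later analysis, then stop nxlog (killing it if needed) and start it.
restart_nxlog() {
    local pid
    pid=$(cat "$PIDFILE" 2>/dev/null)
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        gcore -o "$CORE_PREFIX" "$pid"
        service nxlog stop
        sleep 10
        # Still alive after a polite stop: assume it is locked up.
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid"
        fi
    fi
    service nxlog start
}

# Only act when invoked as "watchdog.sh run", so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
    if is_stale "$BUSY_LOG" "$MAX_AGE"; then
        restart_nxlog
    fi
fi
```

Run from cron every few minutes with the "run" argument. Note that a missing
busy log is deliberately not treated as stale here, so a rotated or renamed
file does not trigger a restart loop.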
On Thursday, 28 August 2014, Botond Botyanszki <b...@nxlog.org> wrote:
> Hi,
>
> Ideally the watchdog would be a separate module that could be monitoring
> various things (EPS, memory, etc) and as such more tightly integrated
> than generic monitoring tools (e.g. monit). This is yet another item
> that's been sitting on the TODO list.
>
> The solution Paul posted could cover some part of it but this may fail if
> the process really locks up.
>
> Regards,
> Botond
>
> On Tue, 26 Aug 2014 20:31:40 +1200
> Cameron Kerr <cameron.kerr...@gmail.com> wrote:
>
> > I would like to have a watchdog for nxlog, as I've found a few occasions
> > recently where nxlog (on RHEL5, built from source using the provided spec
> > file) was no longer able to send to (or have its sent messages received
> > by) a receiving nxlog instance (on RHEL6).
> >
> > I'm not sure why this is (there was nothing logged at INFO level), but I
> > do know that restarting nxlog on the receiving side (or pointing the
> > sender to a different receiver) was sufficient to restore service.
> >
> > I would much prefer to have some sort of 'liveness' test (not merely
> > ensuring that there is a process called 'nxlog') that could be used as a
> > test to restart a likely failed nxlog instance.
> >
> > Ideally, I'd like to be able to take a core dump etc. when this happens
> > for further root-cause analysis.
> >
> > Considering there is never any silent time of the day, I would be happy
> > with a test that was based on the number of events processed in [some
> > small number] of minutes.
> >
> > Or I suppose I could just pass in a particular log message, and then
> > check to see that it has come through...
> >
> >
> > Has anyone done anything similar and would like to share what they have
> > done?
> >
> > Cheers,
> > Cameron
> >
> > --
> > Cameron Kerr <cameron.kerr...@gmail.com>
> > See my blog at http://distracted-it.blogspot.co.nz/ (previously
> > http://humbledown.org/)
> > Skype me on cameron.kerr.nz
>
>
> ------------------------------------------------------------------------------
> Slashdot TV.
> Video for Nerds. Stuff that matters.
> http://tv.slashdot.org/
> _______________________________________________
> nxlog-ce-users mailing list
> nxlog-ce-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nxlog-ce-users
>
--
Cameron Kerr <cameron.kerr...@gmail.com>
See my blog at http://distracted-it.blogspot.co.nz/ (previously
http://humbledown.org/)
Skype me on cameron.kerr.nz