Here's the poller.ini file I'm using:
[daemon]
#-- Global Configuration
#user=shinken ; if not set then by default it's the current user.
#group=shinken ; if not set then by default it's the current group.
# Set to 0 if you want to make this daemon NOT run
daemon_enabled=1
# Larger configurations need more threads (default is 8?)
daemon_thread_pool_size=50
#-- Path Configuration
# The daemon will chdir into the directory workdir when launched
# paths variables values, if not absolute paths, are relative to workdir.
# using default values for following config variables value:
workdir = /var/run/shinken
logdir = /var/log/shinken
pidfile=%(workdir)s/pollerd.pid
#-- Network configuration
# host=0.0.0.0
# port=7771
# http_backend=auto
# idontcareaboutsecurity=0
#-- SSL configuration --
use_ssl=0
# WARNING : Put full paths for certs
#ca_cert=/etc/shinken/certs/ca.pem
#server_cert=/etc/shinken/certs/server.cert
#server_key=/etc/shinken/certs/server.key
#hard_ssl_name_check=0
#-- Local log management --
# Enabled by default to ease troubleshooting
use_local_log=1
local_log=%(logdir)s/pollerd.log
# accepted log level values= DEBUG,INFO,WARNING,ERROR,CRITICAL
log_level=INFO
#log_level=DEBUG
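
As a quick sanity check on the %(workdir)s and %(logdir)s references above, here is a minimal sketch using Python 3's standard configparser. It assumes the daemon resolves these references the same way configparser's default interpolation does, which I have not verified against the Shinken source. INI_TEXT is a trimmed copy of the active (uncommented) settings, and load_daemon_settings is a hypothetical helper name, not part of Shinken.

```python
import configparser

# Trimmed copy of the uncommented settings from poller.ini above.
INI_TEXT = """
[daemon]
daemon_enabled=1
daemon_thread_pool_size=50
workdir = /var/run/shinken
logdir = /var/log/shinken
pidfile=%(workdir)s/pollerd.pid
use_ssl=0
use_local_log=1
local_log=%(logdir)s/pollerd.log
log_level=INFO
"""


def load_daemon_settings(text):
    """Parse the [daemon] section and return its values with %(name)s expanded."""
    parser = configparser.ConfigParser()  # default BasicInterpolation
    parser.read_string(text)
    return dict(parser["daemon"])


settings = load_daemon_settings(INI_TEXT)
print(settings["pidfile"])    # /var/run/shinken/pollerd.pid
print(settings["local_log"])  # /var/log/shinken/pollerd.log
```

If the expanded paths are not what you expect on your own box, that points at the ini file rather than at the dispatch problem.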
And here's the poller.cfg file:
#===============================================================================
# POLLER (S1_Poller)
#===============================================================================
# Description: The poller is responsible for:
# - Active data acquisition
# - Local passive data acquisition
# https://shinken.readthedocs.org/en/latest/08_configobjects/poller.html
#===============================================================================
define poller {
poller_name poller-1
address shinken1.dc1.example.com
port 7771
## Optional
spare 0 ; 1 = is a spare, 0 = is not a spare
manage_sub_realms 0 ; Does it take jobs from schedulers of sub-Realms?
min_workers 0 ; Starts with N processes (0 = 1 per CPU)
max_workers 0 ; No more than N processes (0 = 1 per CPU)
processes_by_worker 256 ; Each worker manages N checks
polling_interval 1 ; Get jobs from schedulers each N seconds
timeout 3 ; Ping timeout
data_timeout 120 ; Data send timeout
max_check_attempts 3 ; If ping fails N or more, then the node is dead
check_interval 60 ; Ping node every N seconds
## Interesting modules that can be used:
# - booster-nrpe = Replaces the check_nrpe binary. Therefore it
#   enhances performances when there are lot of NRPE calls.
# - named-pipe = Allow the poller to read a nagios.cmd named pipe.
#   This permits the use of distributed check_mk checks
#   should you desire it.
# - SnmpBooster = Snmp bulk polling module
modules named-pipe, booster-nrpe
## Advanced Features
#passive 0 ; For DMZ monitoring, set to 1 so the connections will be from scheduler -> poller.
# Poller tags are the tags that the poller will manage. Use None as tag name
# to manage untagged checks
#poller_tags None
# Enable https or not
use_ssl 0
# enable certificate/hostname check, will avoid man in the middle attacks
hard_ssl_name_check 0
realm All
}
On 5/14/15 3:13 PM, David Good wrote:
Here's another example of what I'm seeing -- In the arbiter log
I'll see something like this:
[1431641122] INFO: [Shinken] [All] Trying to send configuration to poller poller-1
[1431641242] ERROR: [Shinken] Failed sending configuration for poller-1: Connexion error to http://shinken1.dc1.example.com:7771/ : Operation timed out after 120001 milliseconds with 0 bytes received
And then just a few seconds later:
[1431641291] INFO: [Shinken] [All] Trying to send configuration to poller poller-1
[1431641291] INFO: [Shinken] [All] Dispatch OK of configuration 1 to poller poller-1
And this poller is on the same server as the arbiter. I see this
happening sporadically for pretty much every daemon, causing the
configuration to be constantly in the process of being
re-dispatched. This is especially frustrating as I'm trying to
test out some new configs adding and removing hosts and services
from monitoring. If it can't finish dispatching it makes it hard
to test :-/
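
To put numbers on the log excerpt above, here is a small sketch that pairs each "Trying to send configuration" line with the outcome that follows it and reports the elapsed seconds. The log lines are copied from this message. The function name dispatch_attempts and the pairing logic are my own assumptions for illustration and only handle a single daemon's lines in order.

```python
import re

# Arbiter log lines quoted in this message, one entry per log record.
LOG_LINES = [
    "[1431641122] INFO: [Shinken] [All] Trying to send configuration to poller poller-1",
    "[1431641242] ERROR: [Shinken] Failed sending configuration for poller-1: "
    "Connexion error to http://shinken1.dc1.example.com:7771/ : "
    "Operation timed out after 120001 milliseconds with 0 bytes received",
    "[1431641291] INFO: [Shinken] [All] Trying to send configuration to poller poller-1",
    "[1431641291] INFO: [Shinken] [All] Dispatch OK of configuration 1 to poller poller-1",
]

# "[epoch] LEVEL: message"
LINE_RE = re.compile(r"^\[(\d+)\] (\w+): (.*)$")


def dispatch_attempts(lines):
    """Return (start_epoch, elapsed_seconds, succeeded) for each send attempt."""
    attempts = []
    started = None
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a timestamped record
        timestamp, message = int(match.group(1)), match.group(3)
        if "Trying to send configuration" in message:
            started = timestamp
        elif started is not None and (
            "Failed sending configuration" in message or "Dispatch OK" in message
        ):
            attempts.append((started, timestamp - started, "Dispatch OK" in message))
            started = None
    return attempts


for start, elapsed, ok in dispatch_attempts(LOG_LINES):
    print(start, elapsed, "OK" if ok else "FAILED")
```

The failed attempt ends 120 seconds after it starts, which lines up with the 120001 ms in the error text and with data_timeout 120 in poller.cfg, while the retry completes within the same second. That is consistent with the poller not answering at all during the first attempt, though it does not show why.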
On 5/14/15 2:49 PM, David Good wrote:
I doubt that was the case -- I was careful to make sure
everything was stopped before restarting.
And now my problems have started up again. I may be forced to
upgrade to 2.4 to see if it helps any. Very frustrating. If
that doesn't fix it, I may be forced to fall back to nagios and
gearman. I'd hate to do that as we had promised that Shinken
would scale better than Nagios.
On 5/13/15 2:50 PM, Felipe openglx wrote:
Play the lotto just in case ;)
My suspicion would be that your previous "restart" to
adjust the thread pool (or other testing) didn't kill
all threads, hence why you had some very unusual
situations going on.
Let us know how it goes, and best of luck getting the project
delivered!
Regards
------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
_______________________________________________
Shinken-devel mailing list
Shinken-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shinken-devel