On 5/12/15 4:16 PM, David Good wrote:
On 5/12/15 2:46 PM, Felipe openglx wrote:
The devs will be able to give more specifics (maybe even confirm whether 
2.4 performs better for your case?) but I faced similar timeout issues 
because of the time it took to "slice and dice" the number of 
objects.
If you can enable debug mode on all nodes and provide some captures it 
would be great.
OK -- I'll see about setting that up.


Here's one set of captures.  I'm choosing timeouts between daemons running on the same server as the arbiter so there's no issue of network interference or clock skew.

Here's the arbiter:

[1431539957] INFO: [Shinken] [All] Trying to send configuration to poller poller-1
[1431539960] ERROR: [Shinken] Failed sending configuration for poller-1: Connexion error to http://shinken1.dc1.example.com:7771/ : Operation timed out after 3001 milliseconds with 0 bytes received
[1431539960] INFO: [Shinken] [All] Trying to send configuration to poller poller-4

Here are the corresponding entries from poller-1.  Note that we check a lot of servers via NRPE that don't yet have the NRPE daemon set up to allow access from the Shinken servers. Looking at the code that generates these error messages, that seems to be their cause:

[1431539922] DEBUG: [Shinken] Error on SSL shutdown : library=missing reason=missing : [] ; Traceback (most recent call last):
  File "/var/lib/shinken/modules/booster-nrpe/module.py", line 220, in close
    break
Error: []

[1431539957] DEBUG: [Shinken] Error on SSL shutdown : library=missing reason=missing : [] ; Traceback (most recent call last):
  File "/var/lib/shinken/modules/booster-nrpe/module.py", line 220, in close
    break
Error: []

[1431539957] DEBUG: [Shinken] Error on SSL shutdown : library=missing reason=missing : [] ; Traceback (most recent call last):
  File "/var/lib/shinken/modules/booster-nrpe/module.py", line 220, in close
    break
Error: []

[1431540097] DEBUG: [Shinken] socket.shutdown failed: [Errno 107] Transport endpoint is not connected
[1431540097] DEBUG: [Shinken] socket.shutdown failed: [Errno 9] Bad file descriptor
[1431540097] DEBUG: [Shinken] Error on SSL shutdown : library=missing reason=missing : [] ; Traceback (most recent call last):
  File "/var/lib/shinken/modules/booster-nrpe/module.py", line 220, in close
    break
Error: []

There doesn't seem to be anything going on with the poller at the time that the arbiter is complaining about it not responding.
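
For anyone who wants to line up the two logs the same way, here is a minimal sketch in Python. It assumes only the "[epoch] LEVEL: message" line format seen in the captures above; the helper names (parse, entries_near) and the 5-second window are made up for illustration and are not part of Shinken. The sample messages are shortened copies of the entries above.

```python
import re

# Matches lines such as "[1431539957] INFO: [Shinken] ..."
LOG_LINE = re.compile(r"^\[(\d+)\]\s+(\w+):\s+(.*)$")

def parse(lines):
    """Yield (epoch_seconds, level, message) for each timestamped line."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m:  # continuation lines (tracebacks etc.) are skipped
            yield int(m.group(1)), m.group(2), m.group(3)

def entries_near(lines, epoch, window=5):
    """Return parsed entries whose timestamp is within `window` seconds of `epoch`."""
    return [e for e in parse(lines) if abs(e[0] - epoch) <= window]

# Shortened sample data taken from the captures in this message.
arbiter = [
    "[1431539957] INFO: [Shinken] [All] Trying to send configuration to poller poller-1",
    "[1431539960] ERROR: [Shinken] Failed sending configuration for poller-1",
]
poller = [
    "[1431539922] DEBUG: [Shinken] Error on SSL shutdown",
    "[1431539957] DEBUG: [Shinken] Error on SSL shutdown",
    "[1431539957] DEBUG: [Shinken] Error on SSL shutdown",
    "[1431540097] DEBUG: [Shinken] socket.shutdown failed",
]

# For every arbiter error, show what the poller logged around the same time.
for ts, level, msg in parse(arbiter):
    if level == "ERROR":
        print(ts, msg)
        for p_ts, p_level, p_msg in entries_near(poller, ts):
            print("   poller:", p_ts, p_level, p_msg)
```

Since both daemons run on the same host here, the epoch timestamps can be compared directly; across hosts the window would have to allow for clock skew.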



  


_______________________________________________
Shinken-devel mailing list
Shinken-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shinken-devel
