Hello, I upgraded to 2.11.20161023-labs-edition a few months back to get rid of some mod_fcgid errors by using lmd. That has helped a lot. I am still seeing the errors below, though they are not nearly as bad now.

    [Wed Feb 01 04:45:42.270564 2017] [core:error] [pid 26829] [client 127.0.0.1:48720] End of script output before headers: fcgid_env.sh
    [Wed Feb 01 04:50:41.266928 2017] [fcgid:warn] [pid 26828] [client 127.0.0.1:55088] mod_fcgid: read data timeout in 40 seconds

We are up to 10 Nagios servers with about 210,000 host and service checks. We are also starting to play with NagVis and tried to use the LMD socket as its backend, but there seems to be a compatibility issue. Here is what we found.

It looks like the PHP change to strip the KeepAlive header isn't going to work on its own. That was our first stab at fixing the issue, but NagVis also requests columns that are present in Livestatus but not in lmd. Here's the actual request NagVis is making (without edits):

    GET hosts
    Columns: state plugin_output alias display_name address notes last_check next_check state_type current_attempt max_check_attempts last_state_change last_hard_state_change perf_data acknowledged scheduled_downtime_depth has_been_checked name check_command custom_variable_names custom_variable_values staleness
    Filter: name = 4457-TX-RTR
    OutputFormat: json
    KeepAlive: on
    ResponseHeader: fixed16

Now the results with different configurations based on that input...

Original query:

    OMD[fss]:~$ unixcat tmp/thruk/lmd/live.sock < nagvis_def_query.txt
    bad request: unrecognized header KeepAlive: on

Removing KeepAlive header:

    OMD[fss]:~$ unixcat tmp/thruk/lmd/live.sock < nagvis_def_query.txt
    400 49
    bad request: table hosts has no column staleness

Removing staleness from columns:

    OMD[fss]:~$ unixcat tmp/thruk/lmd/live.sock < nagvis_def_query.txt
    200 621
    [[0,"OK - 10.21.118.1: rta 0.558ms, lost 0%","GC-WatsonWise-TX-4457-RTR","4457-TX-RTR","10.21.118.1","",1.485967184e+09,1.485967784e+09,1,1,10,1.484687044e+09,1.484342519e+09,"rta=0.558ms;3000.000;5000.000;0; pl=0%;80;100;; rtmax=0.783ms;;;; rtmin=0.492ms;;;;",0,0,1,"4457-TX-RTR","check-host-alive",[],[]] ,
    [0,"OK - 10.21.118.1: rta 67.222ms, lost 0%","GC-WatsonWise-TX-4457-RTR","4457-TX-RTR","10.21.118.1","",1.485966678e+09,1.485967278e+09,1,1,10,1.485547247e+09,1.485547247e+09,"rta=67.222ms;3000.000;5000.000;0; pl=0%;80;100;; rtmax=69.266ms;;;; rtmin=66.296ms;;;;",0,0,1,"4457-TX-RTR","check-host-alive",[],[]] ]

So it looks like compatibility with lmd requires two changes: removing the KeepAlive header, and modifying the requests so that the staleness column is not requested.

Has anyone already changed or worked around this, or should we just continue to point NagVis at Livestatus on the Nagios servers directly? Or are we missing something?

We love the product and hope to expand this out to about 2500 Nagios servers feeding Thruk in OMD within the next 6 months. Any insight would be greatly appreciated.

Tom
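
P.S. To make the request rewrite described above concrete, here is a rough sketch in Python. It is for illustration only: NagVis itself is PHP, the function name sanitize_query_for_lmd is made up for this example, and it assumes the query is plain "Header: value" lines like the request shown earlier. It only drops the KeepAlive header and the staleness column; anything on the NagVis side that expects a staleness value in the result rows would still need adjusting.

```python
def sanitize_query_for_lmd(query,
                           drop_headers=("KeepAlive",),
                           drop_columns=("staleness",)):
    """Return a copy of a Livestatus query without the given headers/columns.

    Hypothetical helper, not part of NagVis, Thruk or lmd.
    """
    kept = []
    for line in query.strip().splitlines():
        name, sep, value = line.partition(":")
        header = name.strip()
        # Skip headers lmd rejects, e.g. "KeepAlive: on".
        if sep and header in drop_headers:
            continue
        # Rebuild the Columns header without the unsupported columns.
        if sep and header == "Columns":
            columns = [c for c in value.split() if c not in drop_columns]
            line = "Columns: " + " ".join(columns)
        kept.append(line)
    # A Livestatus query is terminated by an empty line.
    return "\n".join(kept) + "\n\n"


if __name__ == "__main__":
    original = (
        "GET hosts\n"
        "Columns: state name staleness\n"
        "Filter: name = 4457-TX-RTR\n"
        "OutputFormat: json\n"
        "KeepAlive: on\n"
        "ResponseHeader: fixed16\n"
    )
    print(sanitize_query_for_lmd(original), end="")
```

Since staleness is the last column in the NagVis request, dropping it just makes each result row one element shorter; a column removed from the middle would shift the positions of everything after it.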
