Hi guys,
I've searched but haven't found what I'm looking for; sorry if this has been
asked before.
Background:
I am trying to monitor puppet run success by watching the file
/var/lib/puppet/state/last_run_summary.yaml. To test the check, I deliberately
break a puppet run by temporarily removing a manifest on the puppet master
that a client needs, and then see whether the failure gets caught by our
monitoring system.
A run of puppet agent -t looks like this:
{code}
puppet agent -t
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/filesystems.rb
Info: Loading facts in /var/lib/puppet/lib/facter/postgres_default_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Loading facts in /var/lib/puppet/lib/facter/rabbitmq_erlang_cookie.rb
Info: Loading facts in /var/lib/puppet/lib/facter/ip6tables_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/iptables_persistent_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/iptables_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/os_maj_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::ouf for ov28.fqdn on node ov28.fqdn
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
{code}
I then run my monitor to see whether it detects the broken run:
{code}
sudo -u xymon sudo /usr/libexec/xymon/client/ext/check_puppet.rb -w 2000 -c 3600
CRITICAL: FAILED - Puppet failed to run. Missing dependencies? Catalog compilation failed? Last run 23 seconds ago|time_since_last_run=23s;2000;3600;0 failed_resources=99;;;0 failed_events=99;;;0
{code}
Great, the check detects that puppet has failed. The last_run_summary looks
like this after the run:
{code}
cat /var/lib/puppet/state/last_run_summary.yaml
---
version:
  config:
  puppet: "3.4.3"
time:
  last_run: 1401807503
{code}
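To make the difference concrete, here is a minimal Python sketch of how a check might classify a summary like the one above. It is not the logic of check_puppet.rb: the function name `classify_summary` and the rule "a missing resources or events section means the run failed" are assumptions for illustration. The summary is passed in as an already parsed dict (for example from PyYAML's `yaml.safe_load`, which is not part of the standard library).

```python
def classify_summary(summary):
    """Return 'failed' or 'ok' for a parsed last_run_summary.yaml dict.

    Assumption: a run that never received a catalog writes no 'resources'
    or 'events' section, as in the summary shown above.
    """
    resources = summary.get("resources")
    events = summary.get("events")
    # No resource or event counters at all: treat the run as failed.
    if resources is None or events is None:
        return "failed"
    # Counters present: fail only if something actually failed.
    if resources.get("failed", 0) > 0 or events.get("failure", 0) > 0:
        return "failed"
    return "ok"


# The summary written after the failed `puppet agent -t` run above.
failed_run = {
    "version": {"config": None, "puppet": "3.4.3"},
    "time": {"last_run": 1401807503},
}
print(classify_summary(failed_run))  # prints "failed"
```

A summary that contains both sections with zero failures would be classified as "ok" by this rule, which is why it cannot flag a run whose summary looks clean.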
However, when the puppet agent daemon performs its next scheduled run, I do
not get the same errors. The contents of last_run_summary.yaml look as if a
normal puppet run has completed successfully:
{code}
cat /var/lib/puppet/state/last_run_summary.yaml
---
changes:
  total: 0
version:
  puppet: "3.4.3"
  config: 1401798243
time:
  last_run: 1401808053
  anchor: 0.002382
  total: 227.941278069473
  exec: 0.552989
  datacat_fragment: 0.00575
  mount: 0.001974
  ssh_authorized_key: 0.025437
  schedule: 0.000933
  package: 0.542415
  datacat_collector: 0.012692
  user: 0.130179
  host: 0.000364
  filebucket: 0.000187
  file: 220.198688
  config_retrieval: 1.89250206947327
  service: 4.57266
  group: 0.002126
resources:
  changed: 0
  failed_to_restart: 0
  total: 513
  out_of_sync: 0
  skipped: 0
  restarted: 0
  failed: 0
  scheduled: 0
events:
  failure: 0
  total: 0
  success: 0
{code}
And so the monitor does not pick up the errors.
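One visible difference in the second summary is that version.config (1401798243) is 9810 seconds older than time.last_run (1401808053). Below is a minimal Python sketch of a check on that gap. It assumes version.config is an epoch timestamp, which may not hold if the config version is customized; the function name `catalog_age_seconds` and the 3600-second threshold are hypothetical. Whether a large gap really means an old catalog was applied is also an assumption, not something the output above confirms.

```python
def catalog_age_seconds(summary):
    """Seconds between the catalog version and the last run, or None.

    Assumption: version.config holds an epoch timestamp. If either value
    is missing or not an integer, the age cannot be computed.
    """
    config = summary.get("version", {}).get("config")
    last_run = summary.get("time", {}).get("last_run")
    if not isinstance(config, int) or not isinstance(last_run, int):
        return None
    return last_run - config


# Values taken from the summary written after the scheduled run above.
scheduled_run = {
    "version": {"puppet": "3.4.3", "config": 1401798243},
    "time": {"last_run": 1401808053},
}
MAX_CATALOG_AGE = 3600  # hypothetical threshold, in seconds

age = catalog_age_seconds(scheduled_run)
status = "STALE" if age is not None and age > MAX_CATALOG_AGE else "OK"
print(age, status)  # prints "9810 STALE"
```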
Any ideas? What am I doing wrong?
Thanks in advance :)