Hi,

Yes, you got it. With puppet agent -t, no cached catalog seems to be applied. The way I purposefully "broke puppet", by removing a manifest on the puppet master, causes the scheduled puppet agent runs to fall back to a cached catalog, which is the default behavior. I set usecacheonfailure = false in puppet.conf, and now the agent runs fail as expected and the monitor catches it too.

This is pretty interesting. If I break puppet and it keeps running successfully with a cached catalog, everything looks fine although it isn't. Or I disable the cached catalog and maybe put heavier load on the server. Maybe I just need to change the monitor tests I am doing.
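As a rough sketch of what a changed monitor test could look like: the check below classifies a parsed last_run_summary.yaml as failed, stale, or ok. The names classify_run and max_catalog_age are hypothetical and not part of check_puppet.rb. It assumes that a run which never received a catalog writes no resources section (as in the failed summary quoted below), and that version.config holds the default epoch-seconds catalog version, so a catalog much older than last_run hints at a cached catalog. A custom config_version would break that second assumption.

```ruby
require 'yaml'

# Hypothetical helper: classify a parsed last_run_summary.yaml.
# max_catalog_age (seconds) is an assumed threshold, not a Puppet setting.
def classify_run(summary, max_catalog_age: 3600)
  resources = summary['resources']
  time      = summary['time'] || {}
  version   = summary['version'] || {}

  # A run that could not retrieve a catalog leaves no resources section.
  return :failed if resources.nil?
  return :failed if resources['failed'].to_i > 0

  config   = version['config']
  last_run = time['last_run']

  # Assumption: version.config is the default epoch-seconds catalog version.
  # A catalog far older than the run itself suggests a cached catalog.
  if config.is_a?(Integer) && last_run.is_a?(Integer) &&
     last_run - config > max_catalog_age
    return :stale_catalog
  end

  :ok
end

# Values taken from the scheduled run quoted in this thread.
scheduled = YAML.safe_load(<<~SUMMARY)
  version:
    puppet: "3.4.3"
    config: 1401798243
  time:
    last_run: 1401808053
  resources:
    failed: 0
    total: 513
SUMMARY

puts classify_run(scheduled) # prints "stale_catalog" (catalog is 9810s older than the run)
```

With this, the cached catalog can stay enabled while the monitor still reports that the agent is no longer getting fresh catalogs from the master.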
Will think about this for a bit longer :)

Thanks for the help!

On Tuesday, June 3, 2014 5:31:47 PM UTC+2, Jose Luis Ledesma wrote:
>
> Could it be that it is using a cached catalog?
>
> I think that when you use the -t flag no cached catalog is applied, but
> without it one could be.
>
> Regards
>
> On 03/06/2014 17:22, "Steve Kilduff" <[email protected]> wrote:
>
>> Hi guys,
>>
>> I've searched but not found what I'm looking for, sorry if this has been
>> asked before.
>>
>> Background:
>> I am trying to monitor puppet run success by monitoring the file
>> /var/lib/puppet/state/last_run_summary.yaml. Then I am trying to break a
>> puppet run by temporarily removing a manifest on the puppet master which
>> is needed by a client. This is my test to see if the check works and gets
>> caught by our monitoring system.
>>
>> A puppet agent -t looks like:
>>
>> {code}
>> puppet agent -t
>> Info: Retrieving plugin
>> Info: Loading facts in /var/lib/puppet/lib/facter/filesystems.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/postgres_default_version.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/rabbitmq_erlang_cookie.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/ip6tables_version.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/iptables_persistent_version.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/iptables_version.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/os_maj_version.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
>> Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
>>
>> Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::ouf for ov28.fqdn on node ov28.fqdn
>> Warning: Not using cache on failed catalog
>> Error: Could not retrieve catalog; skipping run
>> {code}
>>
>> And then, I run my monitor to see if it detects a broken run:
>>
>> sudo -u xymon sudo /usr/libexec/xymon/client/ext/check_puppet.rb -w 2000 -c 3600
>> CRITICAL: FAILED - Puppet failed to run. Missing dependencies? Catalog compilation failed? Last run 23 seconds ago|time_since_last_run=23s;2000;3600;0 failed_resources=99;;;0 failed_events=99;;;0
>>
>> Great, the check detects that puppet has failed. The last_run_summary
>> looks like this after the run:
>>
>> cat /var/lib/puppet/state/last_run_summary.yaml
>> ---
>> version:
>>   config:
>>   puppet: "3.4.3"
>> time:
>>   last_run: 1401807503
>>
>> However, after puppet agent runs on its schedule, I do not get the same
>> errors. The contents of last_run_summary.yaml look like a normal puppet run
>> has completed successfully:
>>
>> cat /var/lib/puppet/state/last_run_summary.yaml
>> ---
>> changes:
>>   total: 0
>> version:
>>   puppet: "3.4.3"
>>   config: 1401798243
>> time:
>>   last_run: 1401808053
>>   anchor: 0.002382
>>   total: 227.941278069473
>>   exec: 0.552989
>>   datacat_fragment: 0.00575
>>   mount: 0.001974
>>   ssh_authorized_key: 0.025437
>>   schedule: 0.000933
>>   package: 0.542415
>>   datacat_collector: 0.012692
>>   user: 0.130179
>>   host: 0.000364
>>   filebucket: 0.000187
>>   file: 220.198688
>>   config_retrieval: 1.89250206947327
>>   service: 4.57266
>>   group: 0.002126
>> resources:
>>   changed: 0
>>   failed_to_restart: 0
>>   total: 513
>>   out_of_sync: 0
>>   skipped: 0
>>   restarted: 0
>>   failed: 0
>>   scheduled: 0
>> events:
>>   failure: 0
>>   total: 0
>>   success: 0
>>
>> And so the monitor does not pick up the errors.
>>
>> Any ideas? What am I doing wrong?
>>
>> Thanks in advance :)
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Puppet Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/puppet-users/0c316d9a-c636-4e28-a1d7-af20faa82558%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
