Could it be that when you change the "nagios_name.rb" file on puppetmaster A, an event is triggered so that Apache reloads the file? But since this event isn't passed over NFS in any way, the reload doesn't happen on puppetmaster B?
Have you tried restarting every component after you change a file, just to verify that it is read correctly by all of them? I have no idea whether this is any help, but it's better than nothing.

On 24 Jul, 14:37, Monachus <[email protected]> wrote:
> I'm not 100% sure if the subject correctly describes the problem I've
> been having, but it's the closest I can get with my troubleshooting.
> My setup looks like this:
>
> * 2 puppetmasters running 0.25.4 on Ubuntu, running under passenger
> * backend content (etc and var) shared over NFS
> * haproxy load balancing across the 2 puppetmasters
> * mysql for stored configs
>
> I just upgraded from 0.24.8 to 0.25.4 a couple of weeks ago. The
> setup we've been using above has worked fine since we implemented it
> months ago, so I don't believe that there is any problem with NFS or
> the load balancer. I have a handful of custom functions, and after
> updating to 0.25.4, puppetmaster started complaining about one of
> them, a simple function called nagios_name. This function takes an
> FQDN and turns it into a name we use in Nagios and mcollective
> (turning "support.arces.net" into "arces.support" for example). The
> function is basic ruby and is available for you to look at
> here: http://monachus.pastebin.com/yLF1syqU. The function works fine.
>
> The error that puppetmaster reports is:
>
> Unknown function nagios_name at /var/www/localhost/puppet/etc/manifests/outsidein_nodes.pp:16 on node some.node.com.
>
> It doesn't report this all of the time - instead it reports it about
> 40% of the time, while other nodes before and after it do not report
> the error. It seems that a node with a problem will always have the
> problem, and a node where it works will always work. This reinforces
> the fact that the function is fine - it works and has worked for
> months.
>
> My thought is that it's some sort of caching issue, and I even thought
> it might be a race condition with the backend storage being NFS - one
> puppetmaster loading a cached yaml file before the other was done
> writing it or something. I've done all of the following, all with no
> success:
>
> * turn off one puppetmaster so traffic isn't split across them
> * move yaml files for node/facts to local storage instead of NFS
> * enable IP-based persistence in haproxy so that traffic from a client
>   always goes to the same puppetmaster
> * --ignorecache in config.ru for puppetmaster
>
> What I've discovered, however, is more interesting. It appears that
> if I go into the actual nagios_name.rb file and change it in any way
> (add a single character of whitespace) and restart Apache, the error
> goes away. The file is detected as different and loaded for delivery
> to the clients, and everything works fine after that. I discovered
> this by adding debug() statements to the function 2 weeks ago, only to
> find that it worked fine from then on. The problem resurfaced today
> when I turned the 2nd puppetmaster back on, and I decided to try it
> with whitespace - same thing. Clears it right up. This tells me that
> there is some sort of caching wonkiness happening somewhere, but I'm
> not able to figure out where.
>
> Perhaps one of the variables the function is looking for (fqdn?) isn't
> available at the time it's requested, resulting in a compile error
> that isn't always visible?
>
> I'm pleased to have a workaround, but to go from "Unknown function" to
> "everything is cool" by adding a space to the file and saving it isn't
> really much of a long-term solution.
>
> I'm sending this to the list rather than filing a bug report to see if
> anyone has experienced anything like this or has any thoughts. If
> there's any further information I can give to help narrow down the
> source of the problem, I'm happy to do so.
>
> Adrian
