Issue #16753 has been updated by Daniel Pittman.

Status changed from Unreviewed to Needs Decision
Assignee set to eric sorenson

The YAML node cache was removed to support the changes that made the ENC
authoritative over the agent environment.  As part of that we had a
bootstrapping problem: the agent had its own idea of which environment to
request, used that for pluginsync, and then used it again in the catalog request.

If that idea was wrong, the catalog would be returned from the correct,
ENC-specified environment, but it would have been generated with the wrong set
of plugins - including custom facts.  So, the agent would detect that,
pluginsync to the *new* environment named in the catalog, and request a new
catalog.

That fixed the problem, but was inefficient - every agent run with an incorrect 
environment would mean *two* catalog compilations, and doubling master load in 
a common situation (ENC says !production, agent run from cron) was pretty 
unacceptable.

So, instead, the agent was changed to query the master for node data about 
itself - and to use the environment that came back from that.
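
A rough model of the two agent flows described above, to make the cost
difference concrete.  This is only a sketch: `Master`, `run_with_retry` and
`run_with_node_query` are hypothetical names for illustration, not Puppet
code, and pluginsync is reduced to a comment.

```ruby
# Hypothetical stand-in for the master: the ENC-chosen environment always wins,
# and every compile is counted.
Master = Struct.new(:enc_environment, :compilations) do
  def compile(_requested_environment)
    self.compilations += 1
    { "environment" => enc_environment }
  end

  # Cheap node lookup: returns the ENC's environment without compiling.
  def node_environment
    enc_environment
  end
end

# Old flow: pluginsync and compile with the agent's guess, then redo both if
# the catalog comes back from a different environment.
def run_with_retry(master, agent_environment)
  catalog = master.compile(agent_environment)
  if catalog["environment"] != agent_environment
    # pluginsync against catalog["environment"] would happen here
    catalog = master.compile(catalog["environment"])
  end
  catalog
end

# New flow: ask for node data first, then pluginsync and compile once.
def run_with_node_query(master, _agent_environment)
  environment = master.node_environment
  master.compile(environment)
end
```

With an ENC that says `staging` and an agent that believes `production`, the
first flow compiles twice and the second compiles once.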

This had a side effect: it also changed the sequence of indirection calls on 
the master.  Previously the first thing to ask for a node explicitly disabled 
the cache, so it went to the back-end, and then any subsequent queries would 
use the updated YAML cache data.

Now that the agent was asking for the same information first, without
disabling the cache, it was getting stale YAML cache data - a problem observed
in the real world - which led to the same inefficiency every time the ENC
changed the catalog definition.
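
The ordering problem can be shown with a small model.  This is a sketch under
assumptions: `CachedNodeLookup` and its `ignore_cache:` keyword are
illustrative names, an in-memory hash stands in for the YAML cache, and a
lambda stands in for the ENC.

```ruby
# Illustrative model only; not the indirector's real interface.
class CachedNodeLookup
  def initialize(backend)
    @backend = backend   # callable standing in for the ENC or node back-end
    @cache = {}          # stands in for the YAML cache on disk
  end

  # Answer from the cache unless asked to bypass it; every trip to the
  # back-end refreshes the cache.
  def find(name, ignore_cache: false)
    return @cache[name] if !ignore_cache && @cache.key?(name)
    @cache[name] = @backend.call(name)
  end
end

enc = { "web01" => "production" }
lookup = CachedNodeLookup.new(->(name) { enc.fetch(name) })

lookup.find("web01", ignore_cache: true)   # an earlier run fills the cache
enc["web01"] = "staging"                   # the ENC changes its answer

agent_view   = lookup.find("web01")                      # => "production" (stale)
compile_view = lookup.find("web01", ignore_cache: true)  # => "staging"
```

When the cache-bypassing request came first, every later read saw fresh data.
With the agent's cached read first, the agent acts on the old environment and
only learns of the change from the catalog.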

The root cause was cache invalidation: the client couldn't know if the master
was configured with a different (and important) node data cache, so it couldn't
just bypass the cache entirely.  We could have added more special-case code to
handle that, and made the terminus implementations aware of it, but that seemed
uglier than just eliminating the cache entirely.


The primary consumers of this don't seem to care about data being *read* from
the YAML cache as part of the operation of the master; they just want to
interact with it outside the normal compilation processes.

It should be practical to use, instead of a real YAML cache, a "write only" 
YAML cache over the node terminus in the master.  This would store the node 
data in YAML form on disk, but would *never* return anything from `find`.  This 
terminus should probably be named something other than "YAML".

That allows the following:
1. Users can just use the `yaml` terminus to read this data
2. The master writes `yaml` terminus data, but never reads from the cache
3. The agent never sees stale data
4. The `yaml` terminus can search, even if the node backend terminus is `plain`
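
A minimal sketch of the "write only" idea, in plain Ruby rather than as a real
terminus.  The class names, `lookup_node`, and the one-file-per-node layout
are all assumptions made for illustration; a real implementation would live
in the indirector and follow its conventions.

```ruby
require "yaml"
require "fileutils"
require "tmpdir"

# Hypothetical "write only" cache: saves node data as YAML, never returns it.
class WriteOnlyYamlStore
  def initialize(dir)
    @dir = dir
    FileUtils.mkdir_p(@dir)
  end

  # Persist node data so tools outside the master can read it later.
  def save(name, data)
    File.write(File.join(@dir, "#{name}.yaml"), YAML.dump(data))
  end

  # Always a miss, so callers fall through to the real back-end.
  def find(_name)
    nil
  end
end

# Hypothetical reader, playing the role of the `yaml` terminus for outside tools.
class YamlReader
  def initialize(dir)
    @dir = dir
  end

  def search(pattern = "*")
    Dir.glob(File.join(@dir, "#{pattern}.yaml")).sort.map { |path| YAML.load_file(path) }
  end
end

# Master-side lookup: try the cache (always a miss), ask the back-end, then
# record the fresh answer on disk.
def lookup_node(name, cache, backend)
  cache.find(name) || backend.call(name).tap { |data| cache.save(name, data) }
end

if __FILE__ == $PROGRAM_NAME
  Dir.mktmpdir do |dir|
    cache = WriteOnlyYamlStore.new(dir)
    backend = ->(name) { { "name" => name, "environment" => "production" } }
    lookup_node("web01", cache, backend)
    puts YamlReader.new(dir).search("*").inspect
  end
end
```

Because `find` never answers, the master cannot serve stale data from disk,
while the files it writes remain searchable by anything that reads the
directory.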

Other approaches exist, but are not awesome.  (Ideally, of course, we will just 
push users to consult PuppetDB instead of this cache hack, but until that is 
everywhere this is likely the best course.)

Also of note: the YAML cache was not configurable unless users went through
the `routes.yaml` facility, so it is unlikely that many of them have changed
this away from the default behaviour.
----------------------------------------
Bug #16753: YAML node cache disabled
https://projects.puppetlabs.com/issues/16753#change-72409

Author: James Turnbull
Status: Needs Decision
Priority: Normal
Assignee: eric sorenson
Category: indirector
Target version: 3.x
Affected Puppet version: 3.0.0
Keywords: 
Branch: 


In Puppet 3.0 we've disabled the default YAML node cache (see 
https://github.com/puppetlabs/puppet/commit/5a79d9abd96e73ff166527cdee69a30da8ab0f87).

I use this code (and a number of others in the community use similar) to return 
a list of nodes:

<pre>
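    # Read node YAML from the master's yamldir instead of the client's.
    Puppet[:clientyamldir] = Puppet[:yamldir]
    # Some Puppet versions expose terminus_class on Puppet::Node itself;
    # others only through the indirection.
    if Puppet::Node.respond_to? :terminus_class
      Puppet::Node.terminus_class = :yaml
      nodes = Puppet::Node.search("*")
    else
      Puppet::Node.indirection.terminus_class = :yaml
      nodes = Puppet::Node.indirection.search("*")
    end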
</pre>

This no longer works.

We need a method of returning the current list of nodes the master knows about.



