Re: [Puppet-dev] Environment Caching - RFC

Thomas Hallgren Mon, 21 Apr 2014 23:06:48 -0700

Would a MANUAL strategy make sense? I.e. instead of rebooting the master, just tell it to clear the cache (perhaps per environment).


- thomas


On 2014-04-21 23:29, Henrik Lindberg wrote:

Hi,
We have been looking into environment caching and have some thoughts and ideas about how this can be done. We would love to get your input on those ideas, and your thoughts about their usefulness.
There is a Google document that has the long story - it is open for commenting. It is not required reading, as the essence is outlined here. The doc is here:
https://docs.google.com/a/puppetlabs.com/document/d/1G-4Z6vi6Tv5xZtzVh7aT2zNWbOxJ3BGfJu31pAHxS7g/edit?disco=AAAAAGtMYOI#heading=h.rpgaxghcfqol
The current state of caching environments
---
A legacy environment caches the result of parsing manifests and loading functions / types, and reacts to changed files. It does this by recording the mtime of each file as it is parsed / read. Later, if the same file would be parsed again and its mtime is unchanged, it uses the already cached result. If the file is stale, the entire cache is cleared and it starts from scratch.
It does not, however, react to added files. It also does not recognize changes in files evaluated as a consequence of evaluating Ruby logic (i.e. if a function, type, etc. required something, that is not recorded).
The new directory based environments do not support caching. (And now we want to address this.)
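To make the legacy behavior described above concrete, here is a minimal sketch in Python. It is an illustration only, not Puppet's actual (Ruby) implementation; the class name `MtimeCache` and the `parse` callback are hypothetical.

```python
import os


class MtimeCache:
    """Hypothetical illustration of mtime-based caching; not Puppet's code."""

    def __init__(self, parse):
        self._parse = parse      # function: path -> parsed result
        self._entries = {}       # path -> (mtime recorded at parse time, result)

    def get(self, path):
        mtime = os.stat(path).st_mtime_ns   # one stat call per lookup
        entry = self._entries.get(path)
        if entry is not None:
            if entry[0] == mtime:
                return entry[1]             # file unchanged: reuse cached result
            self._entries.clear()           # one stale file clears the entire cache
        result = self._parse(path)
        self._entries[path] = (mtime, result)
        return result
```

Note that only files that have already been parsed are ever stat'ed, which is why a newly added file goes unnoticed in this scheme.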

The problem with caching
---
The problem with caching is that cache validity can be quite costly to compute, and we found that different scenarios benefit from different caching strategies.
In an environment where only a small fraction of the modules/manifests present is actually used by an individual node, checking the cache can be slower than starting with a clean slate every time.
Proposed Strategies
---
We think there is a core set of strategies that a user should be able to select. These should cover the typical usage scenarios.
* NONE - no caching, each catalog production starts with a clean slate.
  This is the current state of directory based environments, and it
  could also be made to apply to legacy environments. This is good in
  a very dynamic environment / development or low "signal/noise" ratio.

* REBOOT - (the opposite of NONE) - cache everything, never check for
  changes. A reboot of the master is required for it to react to
  changes.
  This is good for a static configuration, and where the organization
  always takes down the master for other reasons when there are changes.
  This strategy avoids scanning, and is thus a speed improvement for
  configurations with a large set of files.

* TIMEOUT - cache all environments with a 'time to live' (TTL). When a
  request is made for an environment where the TTL has expired it
  starts that environment with a clean slate.
  This is a compromise - it will pick up all changes (even additions),
  but it will take one "TTL" before they are picked up (say 5 minutes;
  configurable).
These three schemes are believed to cover the different usage scenarios. They all have the benefit that they do not require watching any files (thereby drastically reducing the number of stat calls).
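As a rough illustration of how the three strategies differ, here is a minimal sketch in Python. It is not a proposed implementation; the names `EnvironmentCache` and `load`, and the injectable `clock`, are hypothetical, and the 300 second default simply mirrors the "say 5 minutes" example above.

```python
import time


class EnvironmentCache:
    """Hypothetical sketch of the NONE / REBOOT / TIMEOUT strategies."""

    def __init__(self, load, strategy="TIMEOUT", ttl=300.0, clock=time.monotonic):
        if strategy not in ("NONE", "REBOOT", "TIMEOUT"):
            raise ValueError(f"unknown strategy: {strategy}")
        self._load = load          # function: environment name -> loaded environment
        self._strategy = strategy
        self._ttl = ttl            # seconds; only used by TIMEOUT
        self._clock = clock
        self._entries = {}         # name -> (time loaded, environment)

    def get(self, name):
        if self._strategy == "NONE":
            return self._load(name)        # clean slate for every request
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None:
            loaded_at, env = entry
            if self._strategy == "REBOOT" or now - loaded_at < self._ttl:
                return env                 # cache hit: no file is stat'ed
        env = self._load(name)             # first use, or TTL expired
        self._entries[name] = (now, env)
        return env
```

In this sketch REBOOT never expires an entry, so only a restart of the process picks up changes, while TIMEOUT reloads the whole environment on the first request after the TTL has passed.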
Strategy that is probably not needed:

* ENVDIRCHANGE - watches the directory that represents
  the environment. Reloads if the directory itself is stale (using
  filetimeout setting to cap the number of times it checks). Thus, it
  will react to changes to the environment root only (which typically
  does not happen when changing content in the environment, but is
  triggered if the environment's configuration file is added or removed).
  To pick up any other changes, the user would need to touch the
  directory.
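For comparison, ENVDIRCHANGE could look roughly like the sketch below (Python, illustration only). The class name `EnvDirWatcher` and the injectable `clock` are hypothetical, and the `filetimeout` default used here is an arbitrary value for the example.

```python
import os
import time


class EnvDirWatcher:
    """Hypothetical sketch of ENVDIRCHANGE; not Puppet's code."""

    def __init__(self, env_dir, filetimeout=15.0, clock=time.monotonic):
        self._dir = env_dir
        self._filetimeout = filetimeout   # minimum seconds between stat calls
        self._clock = clock
        self._last_check = clock()
        self._mtime = os.stat(env_dir).st_mtime_ns

    def needs_reload(self):
        now = self._clock()
        if now - self._last_check < self._filetimeout:
            return False                  # checked recently: skip the stat call
        self._last_check = now
        mtime = os.stat(self._dir).st_mtime_ns
        if mtime != self._mtime:
            self._mtime = mtime
            return True                   # an entry was added/removed, or the dir was touched
        return False
```

Editing a file deeper inside the environment does not change the mtime of the environment directory itself, which is why the user would need to touch the directory for such changes to be noticed.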

Strategies we think are not needed:

* SCAN - like today where every file is watched.
* CONFCHANGE - watch/scan all configuration files.

Feedback ?
---
Here are a couple of questions to start with...

* What do you think of the proposed strategies?
* If you like the scanning strategy, what use cases do you see it would benefit that the proposed strategies do not handle?
* Any other ideas?
* Any use cases you feel strongly about? Scenarios we need to consider...

Regards
- henrik

