Re: [Puppet-dev] Re: Environment Caching - RFC

Andy Parker Tue, 22 Apr 2014 17:17:35 -0700

On Tue, Apr 22, 2014 at 1:32 PM, John Bollinger
<[email protected]> wrote:


> On Monday, April 21, 2014 4:29:23 PM UTC-5, henrik lindberg wrote:
> [...]
>
>> We think there is a core set of strategies that a user should be able to
>> select. These should cover the typical usage scenarios.
>>
>> * NONE - no caching, each catalog product starts with a clean slate.
>>    This is the current state of directory based environments,
>
>
>
> Is that why a user is reporting over on puppet-users that turning on
> directory environments explodes his catalog compilation times?
>
>
It turns out that the slowdown there is related to a separate problem we
found in how puppet parses resource references, combined with how
environments are loaded. Henrik will be posting a link to a PR shortly that
should have a fix for that.


> I don't see any reason to object to this strategy, but I'm inclined to
> doubt that many sites will find it useful in production.
>
>
Hopefully nobody tries to use this in production :) The intention of this
style was for development systems, when you are in a development loop of:
edit manifests, run agent.


>
>
>> and it
>>    could also be made to apply to legacy environments. This is good in
>>    a very dynamic environment / development or low "signal/noise" ratio.
>>
>> * REBOOT - (the opposite of NONE) - cache everything, never check for
>>    changes. A reboot of the master is required for it to react to
>>    changes.
>>    This is good for a static configuration, and where the organization
>>    always takes down the master for other reasons when there are changes.
>>    This strategy avoids scanning, and is thus a speed improvement for
>>    configurations with a large set of files.
>>
>>
>
> I could see the REBOOT strategy being used at very sensitive or
> tightly-controlled sites, but I'm inclined to think that ENVDIRCHANGE would
> be preferable to many people on account of the ability it affords to
> trigger cache invalidation without restarting the master.
>
>
Or in more performance-oriented sites? It would reduce the number of stat
calls that the master ends up doing. The thinking for this one was that in
a production environment the master should only reread the manifests when
they have changed, and that should be explicit. This can be done by signaling
a graceful restart of the master for either passenger + apache or nginx +
unicorn. For the apache setup it just takes sending a HUP to apache, and for
the nginx setup it takes sending a HUP to unicorn. I think that should
provide a better deployment scenario for masters, but I might be wrong.
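
To illustrate the explicit reload, here is a rough sketch in Python of what
a deploy script might do once the new manifests are in place: read the
server's pid file and send that process a HUP. The function names and the
pid file path are hypothetical, and which process should get the signal
depends on the setup (apache for passenger, unicorn for nginx + unicorn).

```python
import os
import signal


def read_pid(pid_file):
    """Return the process id stored in a pid file (a single integer)."""
    with open(pid_file) as handle:
        return int(handle.read().strip())


def request_graceful_restart(pid_file, kill=os.kill):
    """Send SIGHUP to the process named in pid_file and return its pid.

    `kill` is injectable so the function can be tried out without
    signaling a real server.
    """
    pid = read_pid(pid_file)
    kill(pid, signal.SIGHUP)
    return pid


# Example call (hypothetical path; use the pid file your server writes):
#   request_graceful_restart("/var/run/unicorn.pid")
```

The point of doing it this way is that the reload happens exactly once, at
a moment the operator picks, instead of whenever the master notices a
change.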


>
>
>> * TIMEOUT - cache all environments with a 'time to live' (TTL). When a
>>    request is made for an environment where the TTL has expired it
>>    starts that environment with a clean slate.
>>    This is a compromise - it will pick up all changes (even additions),
>>    but it will take one "TTL" before they are picked up (say 5 minutes;
>>    configurable).
>>
>>
>
> That one makes me very nervous.  It seems like an open invitation for
> manifest version shear.  I would not even consider using it, myself; I'd
> prefer even scanning.
>
>
Yes, it does pose a bit more risk than the REBOOT strategy, since it could
decide to reload in the middle of a deploy. But the SCAN strategy, which is
pretty much what the existing environments do, can be very dangerous, or at
least it was in the past.

The old strategy was to rescan during a compile if the timeout expired on
*any* of the watched files. We changed this recently (3.5? I lose track of
what version things go out in) so that it only reevaluates this at the
*beginning* of the catalog compile run, but it still ends up scanning all
of the files. The TIMEOUT strategy would be very similar, but instead of
being based on any file timestamps it would be based only on when the
environment was loaded. So if the timeout has expired, it just throws the
environment away and reloads it.
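
As a rough model of that behavior, here is a minimal sketch in Python (not
puppet's actual code). The class name, the `loader` callable, and the
injectable `clock` are all made up for illustration. The only thing it is
meant to show is that expiry depends on when the environment was loaded,
with no file stats, and that the check happens once, when a compile asks
for the environment.

```python
import time


class EnvironmentCache:
    """Toy model of the TIMEOUT strategy: expire by load time, never stat files."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader    # callable: environment name -> loaded environment
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # name -> (loaded_at, environment)

    def get(self, name):
        """Call once at the beginning of a catalog compile.

        The compile keeps using the object returned here, so an expiry
        that happens mid-compile does not change what it sees.
        """
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Expired or never loaded: discard the old environment and reload it.
        environment = self._loader(name)
        self._entries[name] = (now, environment)
        return environment
```

In this toy model a TTL of 0 reloads on every request, which behaves like
NONE, and an infinite TTL never reloads, which behaves like REBOOT.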


>
>
>> These three schemes are believed to cover the different usage scenarios.
>> They all have the benefit that they do not require watching any files
>> (thereby drastically reducing the number of stat calls).
>>
>> Strategy that is probably not needed:
>>
>> * ENVDIRCHANGE - watches the directory that represents
>>    the environment. Reloads if the directory itself is stale (using
>>    filetimeout setting to cap the number of times it checks). Thus, it
>>    will react to changes to the environment root only (which typically
>>    does not happen when changing content in the environment, but is
>>    triggered if the environments configuration file is added or removed).
>>    To pick up any other changes, the user would need to touch the
>>    directory.
>>
>>
>
> Perhaps it's unneeded, but that's the option I like best among those
> presented.  I like having a means to manually flush the cache without
> restarting the master(s).
>
>
Wouldn't a graceful restart work just as well? I like the REBOOT + graceful
restart option because it keeps the behavior of puppet much simpler and
under the control of the user.


>
>
>> Strategies we think are not needed:
>>
>> * SCAN - like today where every file is watched.
>> * CONFCHANGE - watch/scan all configuration files.
>>
>> Feedback ?
>>
>
>
> I'm all for moving away from the SCAN approach.  As for CONFCHANGE, is the
> idea basically a more clueful variant of ENVDIRCHANGE?  I could imagine
> that being of interest, but if you're looking to streamline initial rollout
> then I could see deferring it until you can document demand for it.
>
>
>
>> ---
>> Here are a couple of questions to start with...
>>
>> * What do you think of the proposed strategies?
>>
>
>
> See above.
>
>
>
>> * If you like the scanning strategy, what use cases do you see it would
>> benefit that the proposed strategies do not handle?
>>
>
>
> Relative to scanning, they all make it a little harder to use an approach
> where the master automatically pulls manifests from VCS.
>
>
>
>> * Any other ideas?
>>
>
>
> Can the catalog compiler be induced to abandon its progress and restart
> the current catalog when the cache for its environment is flushed?  That
> might make the TIMEOUT strategy more palatable, and it would be appropriate
> for some other strategies, too.
>
>
>
>> * Any use cases you feel strongly about? Scenarios we need to consider...
>>
>>
>
> If I'm actively changing the manifest set on my master, then I know better
> than the master when I've done, and I favor being able to hold off on
> flushing the cache until then.  Also, I like being able to flush the cache
> of just one environment at a time, and without bringing down the master to
> do so.
>
>
> John
>
>  --
> You received this message because you are subscribed to the Google Groups
> "Puppet Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/puppet-dev/a80a3c7c-99cd-44ed-aef3-36eaf481604e%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Andrew Parker
[email protected]
Freenode: zaphod42
Twitter: @aparker42
Software Developer


