PuppetLabs Request For Comments: 1
Authors: Paul & Markus
Title: Semantics of autoloaded classes

Contents:

   Abstract
   Problem specification
   Proposed solutions, with pros and cons

Abstract
--------

Autoloading files on the (unverified) assumption that they contain the
class (and only the class) corresponding to the file's name presents a
number of semantic issues which can result in hard-to-diagnose
problems with complex manifests.  The results of a compilation can
depend on such factors as:

* Which nodes have previously been compiled in this environment on this
  master
* The order in which classes are autoloaded
* Which classes are (or have been) compiled by other threads on this
  master

We have several ideas for resolving this problem.  Since this is a
fairly major issue touching on architecture / principles, we don't want
to choose a solution without first getting feedback / design discussion.


Problem Specification
---------------------

Puppet principles include:

* Consistency: software behavior shouldn't depend on factors that seem
  irrelevant to an end user.
* Order independence: manifests should be declarative--their meaning
  should not depend on the order in which their contents are
  evaluated.
* Performance: software should execute quickly, particularly on the
  master when serving a large number of nodes based on the same set of
  manifest files.
* Predictability / principle of "least surprise": a user should be
  able to achieve their desired effect without reading the source code
  or having a deep understanding of the subtleties of Puppet's
  internal design.

The current behavior of autoloading (in Statler) breaks at least three
of these principles.

It introduces potential inconsistencies, since it is possible that
while servicing node A, the puppet master will autoload some manifest
files that wouldn't have been loaded while servicing node B.  If the
master later services node B, those files will remain loaded and may
affect the catalog that gets sent to node B (issue #4656).

It introduces a potential order dependency, because the order in which
manifests are evaluated determines the order in which autoloading is
attempted, and the behavior of autoloading depends on what classes
have already been loaded.  For example, if a user's site.pp refers
to autoloaded classes foo::bar and foo::baz, both of which exist in
foo.pp, and foo::bar also exists in foo/bar.pp, then the behavior
depends on whether the reference to foo::bar or foo::baz is evaluated
first.
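
The order dependency can be reproduced with a toy model.  The sketch
below is not Puppet's loader; it assumes a hypothetical search order
(most specific file first), assumes that loading a file registers every
class in it, and assumes that the first definition of a class wins.
The names FILES, candidate_files, autoload and compile_refs are
invented for illustration.

```python
# Toy model of order-dependent autoloading; not Puppet's real loader.
# Hypothetical file contents: file path -> classes defined in that file.
FILES = {
    "foo.pp": ["foo::bar", "foo::baz"],
    "foo/bar.pp": ["foo::bar"],
}


def candidate_files(name):
    """Assumed search order: most specific file first, then its parents."""
    parts = name.split("::")
    return ["/".join(parts[:i]) + ".pp" for i in range(len(parts), 0, -1)]


def autoload(name, known):
    """Load files until `name` is known, recording where each class came from."""
    if name in known:
        return
    for path in candidate_files(name):
        for cls in FILES.get(path, []):
            known.setdefault(cls, path)  # assumption: first definition wins
        if name in known:
            return


def compile_refs(refs):
    """Evaluate references in order and return class -> defining file."""
    known = {}
    for ref in refs:
        autoload(ref, known)
    return known
```

Under these assumptions, evaluating foo::bar before foo::baz takes
foo::bar from foo/bar.pp, while the opposite order takes it from
foo.pp.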

It hurts performance, because it sometimes has to access the
filesystem in order to resolve a name even if that name already
exists.  For example, if a user has classes foo and bar, and tries
to refer to bar within namespace foo, Puppet first has to go to the
filesystem to ensure that foo::bar doesn't exist before it can resolve
the reference to the toplevel bar.  (Note: this behavior is new to
Statler--it was introduced in commit 6b1dd81).
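
A minimal sketch of why the lookup costs a filesystem access, assuming
a simplified resolver that probes a single candidate path per name (the
real loader may probe several).  The function resolve and its
file_exists parameter are hypothetical and exist only to count probes.

```python
def resolve(name, namespace, known, file_exists):
    """Resolve `name` relative to `namespace`.

    Returns (resolved_name, probes), where probes counts filesystem checks.
    """
    qualified = namespace + "::" + name
    probes = 0
    if qualified in known:
        return qualified, probes
    # The more specific name might live in a file that has not been
    # loaded yet, so the disk must be checked before falling back.
    probes += 1
    if file_exists(qualified.replace("::", "/") + ".pp"):
        return qualified, probes
    if name in known:
        return name, probes
    return None, probes
```

With classes foo and bar already known and no foo/bar.pp on disk,
resolving bar inside foo still costs one probe before it falls back to
the toplevel bar.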

Of course, a user who is aware of these issues can take steps to
avoid them.  For example, to avoid inconsistencies and order
dependence, a user can follow these principles:

1. Ensure that each autoloaded file only contains classes,
   definitions, and nodes whose names match the filename (for example,
   foo/bar.pp should only contain classes such as foo::bar and
   foo::bar::baz).

2. Ensure that each autoloaded class is defined in only one file.

3. Ensure that autoloaded files do not contain toplevel resources
   (resources declared outside of any class).

However, in accordance with the principle of least surprise, we would
rather not force users to be aware of these principles in order
to write good manifests.

Note that we're initially focussing on the autoloading of .pp files;
some of the proposed solutions may need extension / modification to
account for internal DSL (.rb) files.

Proposed Solution I: rebuild known_resource_types on each compile
-----------------------------------------------------------------

Each time an agent contacts the master, rebuild known_resource_types
from scratch (using cached copies of the ASTs for the relevant files).

Pros:

- This eliminates inconsistencies caused by servicing previous nodes.

- Minimal user impact.  Manifests that worked consistently in 0.25 and
  2.6 will continue to work without modification.

- It will cause manifests which depended on the node-order issues (and
  thus failed occasionally) to fail consistently, making them much easier
  to diagnose.

Cons:

- This does nothing to address order dependencies or the existing
  performance issues with autoloading.

- It introduces a slight additional performance hit, since
  known_resource_types must be rebuilt from scratch with each compile.
  However, we estimate this performance hit to be very small in
  practice (<1% of total compilation time spent on the master).
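
The idea can be sketched as follows, assuming a hypothetical parse
cache keyed by file path and a per-compile dictionary standing in for
known_resource_types.  PARSED, AST_CACHE, PARSE_LOG, parse and
compile_node are illustrative names, not Puppet APIs.

```python
# Hypothetical stand-in for manifests on disk: path -> classes defined there.
PARSED = {"a.pp": ["a"], "b.pp": ["b"]}

AST_CACHE = {}   # survives across compiles, so each file is parsed once
PARSE_LOG = []   # records every real parse, to show the cache working


def parse(path):
    """Return the cached 'AST' for `path`, parsing only on first use."""
    if path not in AST_CACHE:
        PARSE_LOG.append(path)
        AST_CACHE[path] = PARSED[path]
    return AST_CACHE[path]


def compile_node(files):
    """Build a fresh known_resource_types for this compile only."""
    known_resource_types = {}
    for path in files:
        for cls in parse(path):
            known_resource_types[cls] = path
    return known_resource_types
```

Because the table is rebuilt for every compile, a node that only needs
b.pp never sees classes loaded for an earlier node, while the cached
ASTs keep the cost of rebuilding low.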



Proposed Solution II: restrict the contents of autoloaded files
---------------------------------------------------------------

When autoloading a file, check that it only declares things
within a namespace corresponding to its filename.  For example,
the file $module_path/foo/manifests/init.pp may declare a
class (or definition or node) foo, any class within the namespace
foo (e.g. foo::bar), and any resources within those classes.  It
may not declare any toplevel resources or any other toplevel
classes.  These same restrictions apply to a file ./foo.pp
autoloaded from the current working directory.

In a similar vein, a file $module_path/foo/manifests/bar.pp (or
./foo/bar.pp) may only declare things within the namespace
foo::bar.  Exception: it may also declare the class foo, but it
may not declare anything inside it other than bar.  These rules
are extended in the obvious way for more deeply nested
files (e.g. ./foo/bar/baz.pp).
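
A minimal sketch of the name check, assuming paths are given relative
to the working directory without a leading "./" (for example
foo/bar.pp) and that declarations have already been expanded to fully
qualified names.  The functions namespace_for and may_declare are
hypothetical.  The sketch models the exception by accepting enclosing
class names; it does not verify that such a wrapper class is otherwise
empty, and it does not cover toplevel resources.

```python
def namespace_for(path):
    """Map a relative manifest path such as 'foo/bar.pp' to 'foo::bar'."""
    if not path.endswith(".pp"):
        raise ValueError("not a manifest: " + path)
    return path[:-3].replace("/", "::")


def may_declare(path, name):
    """Return True if the file at `path` is allowed to declare `name`."""
    ns = namespace_for(path)
    if name == ns or name.startswith(ns + "::"):
        return True
    # Exception: an enclosing class (e.g. foo in foo/bar.pp) may appear,
    # but only as a wrapper around the file's own namespace.
    return ns.startswith(name + "::")
```

For example, foo/bar.pp may declare foo::bar, foo::bar::baz and the
wrapper foo, but not foo::baz.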

Also, modify the search order of autoloading so that it looks in
the order of general to specific.  For example, when trying to
autoload foo::bar::baz, if foo is a module, it looks in the order

  $module_path/foo/manifests/init.pp, then
  $module_path/foo/manifests/bar.pp, then
  $module_path/foo/manifests/bar/baz.pp.

If foo is not a module, then it looks in the order

  ./foo.pp, then
  ./foo/bar.pp, then
  ./foo/bar/baz.pp.

In either case, it stops as soon as it finds
the thing being autoloaded.
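
The general-to-specific search order can be sketched as a function
that lists candidate files for a class name.  The function search_paths
and its parameters are hypothetical; "$module_path" is kept as a
literal placeholder rather than expanded.

```python
def search_paths(name, is_module, module_path="$module_path"):
    """List candidate files for `name`, from most general to most specific."""
    parts = name.split("::")
    paths = []
    for depth in range(1, len(parts) + 1):
        if is_module:
            base = module_path + "/" + parts[0] + "/manifests/"
            rest = parts[1:depth]
            # The module's own name maps to init.pp; deeper names map
            # to files below the manifests directory.
            paths.append(base + ("/".join(rest) + ".pp" if rest else "init.pp"))
        else:
            paths.append("./" + "/".join(parts[:depth]) + ".pp")
    return paths
```

A loader following this proposal would try the returned paths in order
and stop at the first file that defines the requested name.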

Pros:

- This eliminates most sources of inconsistencies and order
  dependencies by formalizing a relationship between files and
  classes that most users are probably following anyway.

- It eliminates the remaining sources of inconsistencies and
  order dependencies by making a small change to search order
  that is unlikely to affect most users.

- It forces users to follow a naming convention that will help
  them to organize their manifests well.

Cons:

- This does not address any performance issues with autoloading.

- Potentially large user impact.  Unconventionally structured
  manifests that worked in 0.25 and 2.6 may require substantial
  renaming / relocation of classes in order to meet the new file
  organization requirements.  (However, users can work around
  this using explicit imports.)


Proposed Solution III: eliminate the autoloading feature
--------------------------------------------------------

Do not allow autoloading.  Require the user to explicitly import all
necessary manifest files either directly or indirectly from site.pp.

Pros:

- This eliminates inconsistencies and order dependencies because all
  files are imported before evaluation of any nodes begins.

- It improves performance, because it eliminates the need to go to the
  filesystem when resolving names.

- It improves predictability, because it always makes it explicitly
  clear which manifest files can contribute to a catalog.

- Minimal user impact.  Manifests that worked in 0.25 and 2.6 may
  require explicit imports in order to work properly without
  autoloading, but it will be easy to tell from the error messages
  which imports to add.

Cons:

- It eliminates a feature which users may perceive to be of high value.
