Re: [Puppet-dev] Re: Data binding for defines

Alessandro Franceschi Mon, 19 May 2014 09:43:21 -0700

On 19 May 2014, at 17:37, John Bollinger <john.bollin...@stjude.org> wrote:


> 
> 
> On Thursday, May 15, 2014 7:17:38 PM UTC-5, Alessandro Franceschi wrote:
> 
> 
> On Thursday, May 15, 2014 9:55:05 PM UTC+2, John Bollinger wrote:
> 
> 
> On Thursday, May 15, 2014 9:46:40 AM UTC-5, Alessandro Franceschi wrote:
> Hello everybody,
> I've looked through the group for past discussions on this topic but 
> haven't found any.
> If this has already been debated, please forgive me and point me in the 
> right direction.
> 
> I wonder what you think about a feature request to have data bindings also 
> for defines' parameters.
> 
> 
> I'm skeptical about their value.  We already have data bindings for classes, 
> by which resource parameters can (indirectly) be injected, and I am inclined 
> to think that classes are the right level of abstraction.  On the other hand, 
> we also have create_resources(), which already can give you resource-level 
> data binding if you want to use it that way.
> 
> I do not favor drawing distinctions between defined resource types and native 
> resource types.  On that basis, I would argue for data binding either for all 
> resource types or for none, and against data binding for only defined types.
> 
> I am concerned about the impact.  It is already somewhat costly for Puppet to 
> evaluate data bindings for class parameters, and adding bindings for resource 
> parameters (even just for resources of defined types) will magnify that.  
> Note that the cost scales with the aggregate number of defined parameters for 
> all declared resources, independent of whether any data are actually bound.  
> In fact, the cases where no data are bound are the most costly, because hiera 
> must then search the entire hierarchy.
> 
> Yes, these are valid and convincing points.
> Anyway if we find data binding useful for classes and can bear the 
> performance overhead, I suppose we can do the same for defined types.
> 
> 
> In some cases we cannot bear the overhead even just for classes.  People who 
> tried to use hiera-gpg with Puppet 3 discovered that pretty quickly.  Even if 
> there were only a few encrypted data, the encrypted file gets decrypted for 
> every class parameter that has no value specified in a higher-priority back 
> end (which usually is most of them), making catalog compilation take forever.
> 
> We're not now in a comfort zone with respect to data binding cost -- we're 
> near and sometimes over the edge of acceptable cost.
>  
> I see many wonderful use cases for such a feature and no apparent cons.
> 
> 
> I see no particularly good use cases and several cons, but I'm prepared to be 
> amazed by your vision.  Would you care to elaborate?
> 
> Yes, why not. Let me digress a bit then. It could be useful to verify if the 
> following idea seems cool only to me.
> 
> First, a premise.
> 
> [... omitted for brevity ...]
>  
> My point is that a reusable module should allow total freedom in how its 
> configurations are managed: with a file-based approach (source, template, 
> now epp_template...), with a setting-based approach (augeas, file_line...), 
> with concat, or whatever. The user should decide what's best for them, not 
> the module author. 
> 
> 
> I accept the premise up to this point, with some reservations.  In 
> particular, I think it applies mainly to fairly low-level modules, such as 
> those managing fairly narrow programs or services.  Once you move into 
> modules that compose those lower-level units, I don't think you can reliably 
> stick to the premised approach any longer.  On the other hand, it may be that 
> such mid- or high-level modules are not well suited for reuse anyway.

Just bear in mind that I don't expect this tp module to replace any existing 
module. I suppose it can be used where you have simple applications to manage 
(i.e. the typical package/service/configfile ones), or when you don't need 
specific defines to manage elements of an application (apache modules, mysql 
grants...) and just want to manage a few files on top of a vanilla 
installation of an application.


> 
>  
> Also I think that a single parameter that expects a hash (options_hash, 
> config_file_options_hash, options, config... name it as you want), whose 
> keys and values can be freely used in a [custom] template, is better than 
> dozens of totally unmaintainable parameters, one for each possible 
> configuration entry of an application (and for each one, as you pointed out, 
> a data binding lookup).
> 
> 
> 
> There is a trade-off here between expressiveness of the class/resource 
> interface and its cost.  On the expressiveness side, I prefer separate 
> parameters for resource types, at least up to a point.  Declarations are so 
> much easier to read that way.  I'm not so fond of types exposing dozens of 
> parameters, however; generally, I'm inclined to think that such types should 
> be decomposed.  The same applies to classes to some extent, but I don't worry 
> much about that end because I don't generally want to see resource-like class 
> declarations at all.
> 
>  
> Given this premise (longer than expected, sorry), I've recently thought about 
> a single module, tp (which stands for tiny puppet), that should be able to 
> manage the essential features of any application.
> Something you can use in manifests in a fashion similar to this:
> 
> To install an application (the tp module contains all the data to do it right 
> on different OSes):
>   tp::install { 'redis': }  
> 
> To configure it via a template:
>   tp::conf { 'redis.conf':
>     application => 'redis',
>     template    => 'site/redis/redis.conf.erb',
>   }  
> 
> Or, alternatively (using in the title any sane separator):
>   tp::conf { 'redis--redis.conf':
>     template    => 'site/redis/redis.conf.erb',
>   }  
> 
> To configure it via the fileserver:
>   tp::conf { 'redis.conf':
>     application => 'redis',
>     source      => 'puppet:///modules/site/redis/redis.conf',
>   }  
> 
> But also have something like (to manage single lines in a configuration file):
>   tp::line { 'redis::redis.conf::port':
>     value => '1234',
>   }  
> 
> or even something like:
>   tp::concat { 'redis':
>     target  => 'redis.conf',
>     order   => 10,
>     content => 'port 1234',
>   }  
> 
> And manage, if overrides are needed, any internally used parameter with 
> something like:
>   tp::settings { 'redis':
>     # This setting should define whether the module's data has to be merged 
>     # with user data or not
>     lookup_strategy => 'merge',
>     settings        => {
>       config_dir_path => '/opt/redis/conf',
>       tcp_port        => '3242',
>       pid_file_path   => '/opt/redis/run/redis.pid',
>     },
>   }
> 
> Also, it could be nice to have a face that allows commands like:
> puppet tp check redis  # To check if redis is running, based on settings like 
> service name, pid file, port... (Incidentally this would make integration 
> tests a "bit" easier and quicker)
> puppet tp info redis # To show info about how redis is managed by Puppet or 
> how it is working
> 
> Basically such a tp module would be the sum of the main features of most of 
> my existing modules + the puppi module (https://github.com/example42/puppi, 
> for the functionalities not related to application deployments).
> 
> Now, the real added value of such a thing would be a (Hiera like) set of yaml 
> files where all the default settings for as many applications as possible for 
> different operating systems can be defined in order to make usage of the 
> previous defines possible with a wide variety of applications and allow, at 
> the same time, extreme customization options.
> 
> To do this (hey, this is the topic of this thread), it would have been nice 
> to have a data-in-modules approach (based on defines' parameters) and to keep 
> in files like tp/data/redis/default.yaml and 
> tp/data/redis/osfamily/RedHat.yaml all the data needed for all the managed 
> applications, such as:
> ---
>   tp::settings::redis::port: '6379'
>   tp::settings::redis::config_file_path: '/etc/redis/redis.conf'
> 
> Or, alternatively, data expressed in a format like :
> ---
>   redis::packages: (or tp::install::redis::packages...)
>     redis:
>       ensure: present
>       alias: 'redis'
> 
> 
> 
> [...]
> 
> In fact that's related to, but not quite, the original topic of the thread.  
> You can have something along those lines (at least the latter) without 
> extending classes' automated data binding feature to resources.  I tried -- 
> evidently unsuccessfully -- to make that point in my first response.  You put 
> your data into a hash, nested to whatever depth it needs to be for the degree 
> of consolidation you want, and your generic 'tp' module retrieves it via an 
> ordinary hiera() call.  That will work in today's Puppet.
> 
> The question, then, is not so much about whether it is useful to be able to 
> bind data to specific resources, but about whether Puppet should do so 
> automatically.
> 
> I have misgivings, too, about exposing module implementation details in the 
> form of a data-binding interface applicable to internally-declared resources, 
> but I'll leave that aside for now.
> 
>  
> In this case the tp::install code could be as simple as this POC:
> define tp::install (
> 
>   $packages  = { } ,
>   $services  = { } ,
>   $files     = { } ,
>   $execs     = { } ,
>   $users     = { } ,
> 
>   $configs   = { } ,
> 
>   ) {
> 
>   if $packages {
>     create_resources('package', $packages)
>   }
> 
>   if $services {
>     create_resources('service', $services)
>   }
> 
>   if $files {
>     create_resources('file', $files)
>   }
> 
>   if $execs {
>     create_resources('exec', $execs)
>   }
> 
>   if $users {
>     create_resources('user', $users)
>   }
> 
> }
> 
> 
> 
> Or like this:
> 
> define tp::install () {
>   $config = hiera("tp::config::${title}", {})

John, if it were that simple I wouldn't have asked here.
The hierarchy of such a tp module has to be module-specific and should not 
depend on how data is managed in users' hiera.yaml.
Default data for the managed applications should live in the tp module itself 
and be based on a module-specific hierarchy. That hierarchy would contain 
references to osfamily/operatingsystem/etc. facts that can't be forced into the 
users' own local hierarchies (besides the fact that, imho, a sane 
/etc/puppet/hiera.yaml file should not reference OS-related facts).
The hiera function allows adding a datasource to the default hierarchy, so I 
might try to work around it that way (even though I'd need an array of extra 
datasources rather than just one, and I'm not sure this is supported, and I 
would need to change the datadir on the fly, which is almost surely not 
supported), or maybe create a custom function that mimics the hiera 
functionality in some way.
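
A minimal sketch of what such a custom lookup could look like, in Ruby since 
Puppet custom functions are written in Ruby. This is not the Hiera API: the 
name tp_lookup, its arguments and the simplified "%{fact}" expansion are 
assumptions made for illustration, and the data layout follows the 
tp/data/redis/default.yaml and tp/data/redis/osfamily/RedHat.yaml files 
mentioned earlier.

```ruby
require 'yaml'

# Hypothetical helper (not part of Puppet or Hiera): look a key up in YAML
# files shipped inside the module, using a hierarchy owned by the module.
#
# datadir   - module data directory for one application, e.g. tp/data/redis
# hierarchy - datasource names, most specific first; "%{name}" tokens are
#             replaced with the matching entry of +facts+
# facts     - hash of fact names to values, e.g. { 'osfamily' => 'RedHat' }
#
# Returns the value from the first datasource defining +key+, else +default+.
def tp_lookup(datadir, hierarchy, facts, key, default = nil)
  hierarchy.each do |level|
    # Expand "%{osfamily}" and friends; unknown facts expand to "".
    source = level.gsub(/%\{([^}]+)\}/) do
      facts.fetch(Regexp.last_match(1), '').to_s
    end
    path = File.join(datadir, "#{source}.yaml")
    next unless File.file?(path)

    data = YAML.load_file(path)
    return data[key] if data.is_a?(Hash) && data.key?(key)
  end
  default
end
```

Called as tp_lookup('tp/data/redis', ['osfamily/%{osfamily}', 'default'], 
facts, 'tp::settings::redis::port'), it would consult osfamily/RedHat.yaml 
before default.yaml on a RedHat node, without touching the user's hiera.yaml.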

> 
>   if $config['packages'] {
>     create_resources('package', $config['packages'])
>   }
> 
>   # etc...
> }
> 
> You could even go further by bundling all the data for all tp-managed 
> components into one higher-level hash, and looking it up just once, recording 
> it in a variable of class tp for later access by the various defined types of 
> the module.

How the data would be organized and namespaced is a secondary problem after 
all; my main concern is how to organize a bunch of yaml files inside a module 
in a hiera-like style and get data from them.
Using hiera directly, instead of writing a custom function that emulates it, 
seemed the most logical approach, but that still doesn't seem possible at the 
moment.
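
As a rough illustration of getting data out of such a set of files, here is a 
sketch in Ruby that merges the module's YAML files from least to most specific 
and then applies user overrides. The function name tp_settings and its 
arguments are hypothetical, the merge is shallow (nested hashes are replaced, 
not combined), and none of this is existing Puppet or Hiera functionality.

```ruby
require 'yaml'

# Hypothetical helper (not part of Puppet or Hiera): build one settings hash
# for an application from YAML files kept inside the module.
#
# datadir   - module data directory for one application, e.g. tp/data/redis
# hierarchy - datasource names, most specific first
# overrides - user-supplied settings that win over all module data
def tp_settings(datadir, hierarchy, overrides = {})
  # Walk from the least to the most specific file so later merges win.
  module_data = hierarchy.reverse.reduce({}) do |merged, source|
    path = File.join(datadir, "#{source}.yaml")
    data = File.file?(path) ? YAML.load_file(path) : nil
    # Skip missing, empty or non-hash files instead of failing.
    data.is_a?(Hash) ? merged.merge(data) : merged
  end
  module_data.merge(overrides)
end
```

For example, tp_settings('tp/data/redis', ['osfamily/RedHat', 'default'], 
'port' => '3242') would return the module defaults with the RedHat-specific 
values applied on top and the user's port applied last.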

> 
>  
> 
> Anyway, however the internal data and the relevant code are organized 
> (instead of create_resources we could use a lambda and cycle over the 
> various hashes), having the possibility to use Hiera as the backend would be 
> a good thing, also because it would make it easier for users to override the 
> default module data with custom data.
> 
> 
> I agree that providing the data via Hiera is a good approach.  I wouldn't 
> recommend anything else.
> 
>  
> 
> I guess there are better ways to obtain the same result (suggestions 
> welcome) than enabling data bindings for defined types and using data in 
> modules (neither feature exists in Puppet core). For example with a custom 
> function, but I have to figure out how to do it the right way.
> 
> 
> 
> See above for one way.

Thanks anyway for the attempt; hopefully it is now clearer why I can't do this 
with existing functionality.

Al

> 
> 
> John
> 
> 
> -- 
> You received this message because you are subscribed to a topic in the Google 
> Groups "Puppet Developers" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/puppet-dev/4lFhfChM9XM/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to 
> puppet-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/puppet-dev/285608ae-a5f7-4091-b108-5a81309d3fe7%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.


Alessandro Franceschi

site { 'Example42 Puppet modules':
  url       => 'http://www.example42.com',
  before => Service['puppet'],
}

