[Puppet-dev] Re: Data binding for defines

John Bollinger Mon, 19 May 2014 08:38:41 -0700


On Thursday, May 15, 2014 7:17:38 PM UTC-5, Alessandro Franceschi wrote:
>
>
>
> On Thursday, May 15, 2014 9:55:05 PM UTC+2, John Bollinger wrote:
>>
>>
>>
>> On Thursday, May 15, 2014 9:46:40 AM UTC-5, Alessandro Franceschi wrote:
>>>
>>> Hello everybody,
>>> I've tried to look around in the group for past discussions about this 
>>> topic but haven't found any.
>>> If this has already been debated, please forgive me and point me in the 
>>> right direction.
>>>
>>> I wonder what you think about a feature request to have data bindings 
>>> also for defines' parameters.
>>>
>>
>>
>> I'm skeptical about their value.  We already have data bindings for 
>> classes, by which resource parameters can (indirectly) be injected, and I 
>> am inclined to think that classes are the right level of abstraction.  On 
>> the other hand, we also have create_resources(), which already can give you 
>> resource-level data binding if you want to use it that way.
>>
>> I do not favor drawing distinctions between defined resource types and 
>> native resource types.  On that basis, I would argue for data binding 
>> either for all resource types or for none, and against data binding for 
>> only defined types.
>>
>> I am concerned about the impact.  It is already somewhat costly for 
>> Puppet to evaluate data bindings for class parameters, and adding bindings 
>> for resource parameters (even just for resources of defined types) will 
>> magnify that.  Note that the cost scales with the aggregate number of 
>> defined parameters for all declared resources, independent of whether any 
>> data are actually bound.  In fact, the cases where no data are bound are the 
>> most costly, because hiera must then search the entire hierarchy.
>>
>
> Yes, these are valid and convincing points.
> Anyway if we find data binding useful for classes and can bear the 
> performance overhead, I suppose we can do the same for defined types.
>



In some cases we cannot bear the overhead even just for classes.  People 
who tried to use hiera-gpg with Puppet 3 discovered that pretty quickly.  
Even if only a few data are encrypted, the encrypted file gets 
decrypted for every class parameter that has no value specified in a 
higher-priority back end (which usually is most of them), making catalog 
compilation take forever.

We're not now in a comfort zone with respect to data binding cost -- we're 
near and sometimes over the edge of acceptable cost.
 

> I see many wonderful use cases for such a feature and no apparent cons.
>>>
>>
>>
>> I see no particularly good use cases and several cons, but I'm prepared 
>> to be amazed by your vision.  Would you care to elaborate?
>>
>
> Yes, why not. Let me digress a bit then. It could be useful to verify if 
> the following idea seems cool only to me.
>
> First, a premise.
>

[... omitted for brevity ...]
 

> My point is that a reusable module should allow total freedom on how its 
> configurations should be managed: with a file-based approach 
> (source, template, now epp_template...), with a setting-based approach 
> (augeas, file_line...), with concat, or whatever. Users should decide what's 
> best for them, not the module author.
>


I accept the premise up to this point, with some reservations.  In 
particular, I think it applies mainly to fairly low-level modules, such as 
those managing fairly narrow programs or services.  Once you move into 
modules that compose those lower-level units, I don't think you can 
reliably stick to the premised approach any longer.  On the other hand, it 
may be that such mid- or high-level modules are not well suited for reuse 
anyway.

 

> Also I think that a single parameter that expects a hash (options_hash, 
> config_file_options_hash, options, config... name it as you want) whose key 
> values can be freely used in a [custom] template is better than dozens of 
> parameters, totally unmaintainable, one for each possible configuration 
> entry of an application (and for each one, as you pointed out, a data binding 
> lookup).
>
>

There is a trade-off here between expressiveness of the class/resource 
interface and its cost.  On the expressiveness side, I prefer separate 
parameters for resource types, at least up to a point.  Declarations are so 
much easier to read that way.  I'm not so fond of types exposing dozens of 
parameters, however; generally, I'm inclined to think that such types 
should be decomposed.  The same applies to classes to some extent, but I 
don't worry much about that end because I don't generally want to see 
resource-like class declarations at all.

 

> Given this premise (longer than expected, sorry), I've recently thought 
> about a single module: tp (stands for tiny puppet) which should be able to 
> manage essential features of any application.
> Something you can use in manifests in a fashion like this:
>
> To install an application (the tp module contains all the data to do it 
> right on different OS) 
>   tp::install { 'redis': }  
>
> To configure it via a template:
>   tp::conf { 'redis.conf':
>     application => 'redis',
>     template    => 'site/redis/redis.conf.erb',
>   }  
>
> Or, alternatively (using in the title any sane separator):
>   tp::conf { 'redis--redis.conf':
>     template    => 'site/redis/redis.conf.erb',
>   }  
>
> To configure it via the fileserver:
>   tp::conf { 'redis.conf':
>     application => 'redis',
>     source      => 'puppet:///modules/site/redis/redis.conf',
>   }  
>
> But also have something like (to manage single lines in a configuration 
> file):
>   tp::line { 'redis::redis.conf::port':
>     value => '1234',
>   }  
>
> or even something like:
>   tp::concat { 'redis':
>      target   => 'redis.conf',
>      order    => 10,
>      content => 'port 1234',
>   }  
>
> And manage, if overrides are needed, any internally used parameter with 
> something like:
>   tp::settings { 'redis':
>     # Such a setting should define whether the module's data has to be
>     # merged with user data or not
>     lookup_strategy => 'merge',
>     settings => {
>       config_dir_path => '/opt/redis/conf',
>       tcp_port        => '3242',
>       pid_file_path   => '/opt/redis/run/redis.pid',
>     },
>   }
>
> Also, it could be nice to have a face that allows commands like:
> puppet tp check redis  # To check if redis is running, based on settings 
> like service name, pid file, port ... (Incidentally this would make 
> integration tests a "bit" easier and quicker)
> puppet tp info redis # To show info about how redis is managed by Puppet 
> or how it is working
>
> Basically such a tp module would be a sum of most of my existing modules' 
> main features + the puppi module (https://github.com/example42/puppi, 
> for the functionalities not related to application deployments).
>
> Now, the real added value of such a thing would be a (Hiera like) set of 
> yaml files where all the default settings for as many applications as 
> possible for different operating systems can be defined in order to make 
> usage of the previous defines possible with a wide variety of applications 
> and allow, at the same time, extreme customization options.
>
> To do this (hey, this is the topic of this thread), it would have been 
> nice to have a data in modules approach (as based on defines parameters) 
> and have in files like:
> tp/data/redis/default.yaml , tp/data/redis/osfamily/RedHat.yaml all the 
> data needed for all the managed applications, such as:
> ---
>   tp::settings::redis::port: '6379'
>   tp::settings::redis::config_file_path: '/etc/redis/redis.conf'
>
> Or, alternatively, data expressed in a format like:
> ---
>   redis::packages: (or tp::install::redis::packages...)
>     redis:
>       ensure: present
>       alias: 'redis'
>
>

[...]

In fact that's related to, but not quite, the original topic of the 
thread.  You can have something along those lines (at least the latter) 
without extending classes' automated data binding feature to resources.  I 
tried -- evidently unsuccessfully -- to make that point in my first 
response.  You put your data into a hash, nested to whatever depth it needs 
to be for the degree of consolidation you want, and your generic 'tp' 
module retrieves it via an ordinary hiera() call.  That will work in 
today's Puppet.

The question, then, is not so much about whether it is useful to be able to 
bind data to specific resources, but about whether Puppet should do so 
*automatically*.
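
To make the distinction concrete, here is a minimal sketch.  The 'redis' 
class, the 'redis::instance' defined type, and the key names are hypothetical, 
made up only for illustration.  In Puppet 3, a class parameter with no value 
given in the declaration is looked up automatically under the key 
<class>::<parameter>; a defined type gets no such lookup, but it can perform 
an equivalent one explicitly:

```puppet
# Hypothetical class: if 'port' is not set in the declaration, Puppet 3
# automatically looks up the key 'redis::port' through data binding before
# falling back to the default given here.
class redis (
  $port = '6379'
) {
  notice("redis port: ${port}")
}

# Hypothetical defined type: nothing is looked up automatically for it,
# so the lookup is written out as an ordinary hiera() call with a default.
define redis::instance () {
  $port = hiera("redis::instance::${title}::port", '6379')
  notice("instance ${title} port: ${port}")
}
```

The explicit form costs a lookup only where the module author asks for one, 
rather than one per parameter of every declared resource.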

I have misgivings, too, about exposing module implementation details in the 
form of a data-binding interface applicable to internally-declared resources, 
but I'll leave that aside for now.

 

> In this case the tp::install code could be as simple as this POC:
> define tp::install (
>
>   $packages  = { } ,
>   $services  = { } ,
>   $files     = { } ,
>   $execs     = { } ,
>   $users     = { } ,
>
>   $configs   = { } ,
>
>   ) {
>
>   if $packages {
>     create_resources('package', $packages)
>   }
>
>   if $services {
>     create_resources('service', $services)
>   }
>
>   if $files {
>     create_resources('file', $files)
>   }
>
>   if $execs {
>     create_resources('exec', $execs)
>   }
>
>   if $users {
>     create_resources('user', $users)
>   }
>
> }
>
>

Or like this:

define tp::install () {
  # Look up this component's data by title; default to an empty hash
  $config = hiera("tp::config::${title}", {})

  if $config['packages'] {
    create_resources('package', $config['packages'])
  }

  # etc...
}

You could even go further by bundling all the data for all tp-managed 
components into one higher-level hash, and looking it up just once, 
recording it in a variable of class tp for later access by the various 
defined types of the module.
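
A rough sketch of that variation follows.  It assumes a single Hiera key, 
here named 'tp::config' purely for illustration, that holds one hash keyed by 
component name; neither the key name nor the data layout comes from any 
existing module:

```puppet
# Assumed data layout under the key 'tp::config':
#   { 'redis' => { 'packages' => { ... }, 'services' => { ... } } }
class tp {
  # A class is evaluated once per catalog, so this lookup happens once,
  # however many tp::* resources are declared.
  $config = hiera('tp::config', {})
}

define tp::install () {
  include tp

  # Data for this component; undef when the hash has no entry for it
  $component = $tp::config[$title]

  if $component and $component['packages'] {
    create_resources('package', $component['packages'])
  }

  # etc...
}
```

Note that hiera() returns only the highest-priority value it finds; if you 
want the hash merged across hierarchy levels, hiera_hash() would be the 
function to use instead.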

 

>
> Anyway, however the internal data and the relevant code are organized 
> (instead of create_resources we could use a lambda and cycle over the 
> various hashes), having the possibility to use Hiera for the backend would 
> be a good thing, also because it would make it easier for users to override 
> the default module data with custom data.
>


I agree that providing the data via Hiera is a good approach.  I wouldn't 
recommend anything else.

 

>
> I guess there are better ways to obtain the same (suggestions welcomed) 
> rather than enabling data bindings for defined types and using data in 
> modules (neither feature exists in Puppet core). For example with 
> a custom function, but I have to figure out how to do it in the right way.
>
>

See above for one way.


John
