Re: [Puppet Users] hiera level explosion

2017-03-21 Thread Henrik Lindberg

On 20/03/17 16:03, Darragh Bailey wrote:

> Hi,
>
>
> Looking at how our hiera levels are already exploding due to some
> preferences, I'm wondering how others use hiera.
>
> We have a preference to group related data within separate files,
> however some colleagues' concerns about using '%{module_name}' and
> '%{calling_class}' mean that for each separate application related
> class within our main module we end up with a dedicated level in hiera.
>
> While our existing hierarchy doesn't quite look like the following, once
> we've migrated to using an eyaml backend (in addition to the current yaml
> backend) instead of a separate restricted access git repo I expect to
> see it look like the following:
>
> :hierarchy:
>   - "node/%{::domain}/%{::hostname}"
>   - "gerrit"
>   - "database"
>   - "jenkins"
>   - "server"
>   - "web"
>   - "%{calling_class}"
>   - "%{module_name}"
>   - defaults
>
> Tbh, I'd favour simply doing something like the following:
>
> :hierarchy:
>   - "node/%{::domain}/%{::hostname}"
>   - "%{calling_class}"
>   - "%{module_name}"
>   - defaults
>
> And have anything in 'gerrit', 'database', 'jenkins', 'server' & 'web'
> that needs to be accessible by other classes placed in 'defaults' and
> for anything specific to that class simply put in a name that is picked
> up by '%{calling_class}'.
>
> However there are concerns that it's difficult to remember that data is
> only visible to the associated class/module when made accessible under
> '%{calling_class}' and '%{module_name}', and I think '%{module_name}'
> goes away in hiera 5 or at least it's deprecated and support for it will
> be removed in hiera 6.



Both %{module_name} and %{calling_class} are going away - you cannot use 
them when you are switching from legacy hiera 3 style hiera.yaml to
hiera 5 style hiera.yaml. Support for those variables will be removed in 
Puppet 6.0.0 where hiera 3 backwards compatibility will be dropped (at 
least that is what we think now, but it depends on several factors).


The main problem with interpolation of those "dynamic values" into paths 
is that the value for a given key changes during the course of the 
compilation.

This creates a performance problem (all caches have to be evicted), and
it makes it a lot harder to debug since the value of a key - say x::y is 
different depending on where it is obtained. Such designs should be 
avoided as they are confusing and hard to maintain.
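
To illustrate the problem, here is a minimal sketch with two hypothetical
data files that would both match a "%{calling_class}" level. The class
names are borrowed from elsewhere in this thread and the values are made
up; the point is only that the same key x::y resolves differently
depending on which class performs the lookup.

```yaml
# Hypothetical file matched while %{calling_class} is
# local_module_name::my_jenkins
---
x::y: "value seen from my_jenkins"

# Hypothetical file matched while %{calling_class} is
# local_module_name::my_gerrit
---
x::y: "value seen from my_gerrit"
```

A lookup of x::y from any other class would fall through to the lower
levels, so a third value (or none at all) is possible as well.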


There are several new mechanisms in hiera 5 that can be used for various 
purposes - hard to say exactly what you would be using as it depends on 
what you are trying to achieve with your current design (why is it there 
in the first place, who gets to change what where, how do you review and 
do QA on data, etc.).
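
One of those mechanisms, shown here only as a possible fit since the right
choice depends on the questions above, is data in modules: a module can
ship its own hiera 5 style hiera.yaml and data directory. The sketch below
assumes a hypothetical module called mymodule; the level names and the
example key are illustrative, not taken from this thread.

```yaml
# Hypothetical file: <modulepath>/mymodule/hiera.yaml
---
version: 5
defaults:
  datadir: data          # relative to the module root
  data_hash: yaml_data   # built-in YAML backend
hierarchy:
  - name: "OS family specific values"
    path: "os/%{facts.os.family}.yaml"
  - name: "Module defaults"
    path: "common.yaml"

# A matching data file, mymodule/data/common.yaml, could then contain
# keys in the module's own namespace, for example:
#   mymodule::jenkins::port: 8080
```

Module data can only provide keys in the module's own namespace, and the
value of a key does not depend on which class asks for it, so it avoids
the caching and debugging problems described above.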


Best,
- henrik



> What concerns me however is whether there is a performance impact of
> creating lots of levels to keep data nicely separated on a
> service/application basis in the name of keeping it easy to understand.
>
> Do others simply use a single file? Or do you favour use of
> '%{module_name}', '%{calling_class}', and/or '%{calling_class_path}'? If
> so what are your plans around hiera's future behaviour?
>
> Any clues on assessing the performance impact of either approach? I
> doubt it currently makes much difference, but I'm sure as we add more
> and more puppet code to manage additional services/applications and
> consequently many more levels this will have to start impacting at some
> point.
>
> Perhaps it makes more sense to have these in separate files and then an
> additional step in the deployment that combines the application specific
> files into a single yaml entry to be used by hiera, giving us separation
> at the source/review level and a simple single file at the point of usage
> to ensure good performance.
> It also seems to be more in line with hiera, as these application specific
> files are not really separate levels of hierarchy; they are just
> separated for human reading convenience.
>
> Anyone care to provide some insight:
> Have you encountered this?
> Do you just stick everything for different services/applications in the
> same file?
> Does that restrict which puppet modules/classes that data is
> accessible from?
> Do you prefer explicit isolation through using the special variables, and
> just trust that people remember these are not visible everywhere?
>
> --
> Darragh
>
> --
> You received this message because you are subscribed to the Google
> Groups "Puppet Users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to puppet-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/puppet-users/8d142857-c985-4902-9346-aaeb577dc2e6%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/


Re: [Puppet Users] hiera level explosion

2017-03-21 Thread Darragh Bailey
Hi Rob,

Thanks for some of the suggestions, I suspect part of our problem is that 
we have multiple applications/services deployed per node using docker 
containers. 

We probably can't easily map a single role per machine because we want to 
be able to move the application/services between machines relatively 
easily. This kind of suggests this should all be in a single layer such as 
the datacenter one suggested. Although our usage of the '%{calling_class}' 
is analogous to your '%{puppet_role}' above, see comment below.

On Monday, March 20, 2017 at 3:30:12 PM UTC, Rob Nelson wrote:
>
> If you're looking up hiera data based on the calling class, I'd question 
> whether that's useful to split out to hiera at all - every instance of the 
> class would get the same values, right? And would you really want ALL nodes 
> that `include jenkins` to get the same jenkins server? Even if they're in 
> DCs on opposite sides of the world supporting different groups?
>

We have a tendency to use a specific class for a specific instance that can 
then 'include jenkins'. To properly make use of calling_class across the 
board to collapse our levels, we would need to rename the yaml files so 
jenkins -> local_module_name::my_jenkins, gerrit -> 
local_module_name::my_gerrit, etc.

This would mean that the information is only available when configuring 
that specific service, and if we needed two of them, we would end up 
creating two separate classes wrapping the same specific base class in 
order to pull in the desired info.

This seems to follow the same idea as roles, where the information for a 
specific role is only available to those systems with the given role.
 

> It's more likely that you want to use some facts about the nodes - 
> datacenter, network, owning organization, etc. - to provide that data. Your 
> hierarchy should be modeled after that. Mine is:
>
> :hierarchy:
>   - "clientcert/%{clientcert}"
>   - "puppet_role/%{puppet_role}"
>   - "osfamily-release/%{osfamily}-%{operatingsystemmajrelease}"
>   - "datacenter/%{datacenter}"
>   - global
>
> Clientcert lets individual nodes override settings groups normally 
> inherit; puppet_role is a custom fact reflecting a service like AppX, AppY, 
> DHCP, DNS, etc; the next tier is OS since we run a few versions that often 
> require different values; next is the datacenter, where routes and DNS and 
> such might differ; and finally global is things standardized across the 
> board (called 'common' in default installations). IMO, the only tier that 
> should reference a single filename would be global/common, anything else 
> doing so is really just replicating that tier higher up the stack and 
> adding complexity. I'm sure there's some valid use case for it, though.
>

This is closer to the layout I had expected to be used; it was the 
desire to split up some of the information into manageable chunks that 
drove the separate files.

 

> There's some perf impact when you have more tiers, but hiera lookups don't 
> have a high enough cost for us to worry about it. There is a cost to 
> architecting and maintaining additional tiers and that's my main concern. 
> You can only keep so much in your head, so it's easy to lose track of where 
> things are configured and where they should be configured, and of course it 
> affects troubleshooting times as well.
>
> I believe that answers some of your questions and obviates the need for 
> answers to others.
>

Thanks, I've a feeling we'll have to think about the hiera layers some more 
and how data is organized in order to get a better handle on it.

 

> Rob Nelson
> rnel...@gmail.com 
>
>



Re: [Puppet Users] hiera level explosion

2017-03-20 Thread Rob Nelson
If you're looking up hiera data based on the calling class, I'd question
whether that's useful to split out to hiera at all - every instance of the
class would get the same values, right? And would you really want ALL nodes
that `include jenkins` to get the same jenkins server? Even if they're in
DCs on opposite sides of the world supporting different groups?

It's more likely that you want to use some facts about the nodes -
datacenter, network, owning organization, etc. - to provide that data. Your
hierarchy should be modeled after that. Mine is:

:hierarchy:
  - "clientcert/%{clientcert}"
  - "puppet_role/%{puppet_role}"
  - "osfamily-release/%{osfamily}-%{operatingsystemmajrelease}"
  - "datacenter/%{datacenter}"
  - global

Clientcert lets individual nodes override settings groups normally inherit;
puppet_role is a custom fact reflecting a service like AppX, AppY, DHCP,
DNS, etc; the next tier is OS since we run a few versions that often
require different values; next is the datacenter, where routes and DNS and
such might differ; and finally global is things standardized across the
board (called 'common' in default installations). IMO, the only tier that
should reference a single filename would be global/common, anything else
doing so is really just replicating that tier higher up the stack and
adding complexity. I'm sure there's some valid use case for it, though.
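
For readers on hiera 5, the hierarchy above might translate roughly as
follows. This is a sketch under assumptions: the datadir and level names
are made up, puppet_role and datacenter are treated as custom facts, the
trusted certname is swapped in for clientcert, and the structured os facts
stand in for the legacy osfamily and operatingsystemmajrelease facts.

```yaml
---
version: 5
defaults:
  datadir: data          # assumed location, relative to the environment
  data_hash: yaml_data
hierarchy:
  - name: "Per-node overrides"
    path: "clientcert/%{trusted.certname}.yaml"
  - name: "Role (custom fact)"
    path: "puppet_role/%{facts.puppet_role}.yaml"
  - name: "OS family and major release"
    path: "osfamily-release/%{facts.os.family}-%{facts.os.release.major}.yaml"
  - name: "Datacenter (custom fact)"
    path: "datacenter/%{facts.datacenter}.yaml"
  - name: "Global"
    path: "global.yaml"
```

Every path is built from facts that stay fixed for the whole compilation,
and only the last level refers to a single fixed file.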

There's some perf impact when you have more tiers, but hiera lookups don't
have a high enough cost for us to worry about it. There is a cost to
architecting and maintaining additional tiers and that's my main concern.
You can only keep so much in your head, so it's easy to lose track of where
things are configured and where they should be configured, and of course it
affects troubleshooting times as well.

I believe that answers some of your questions and obviates the need for
answers to others.


Rob Nelson
rnels...@gmail.com


[Puppet Users] hiera level explosion

2017-03-20 Thread Darragh Bailey
Hi,


Looking at how our hiera levels are already exploding due to some 
preferences, I'm wondering how others use hiera.

We have a preference to group related data within separate files, however 
some colleagues' concerns about using '%{module_name}' and '%{calling_class}' 
mean that for each separate application related class within our main 
module we end up with a dedicated level in hiera.

While our existing hierarchy doesn't quite look like the following, once 
we've migrated to using an eyaml backend (in addition to the current yaml 
backend) instead of a separate restricted access git repo I expect to see 
it look like the following:

:hierarchy:
  - "node/%{::domain}/%{::hostname}"
  - "gerrit"
  - "database"
  - "jenkins"
  - "server"
  - "web"
  - "%{calling_class}"
  - "%{module_name}"
  - defaults

Tbh, I'd favour simply doing something like the following:

:hierarchy:
  - "node/%{::domain}/%{::hostname}"
  - "%{calling_class}"
  - "%{module_name}"
  - defaults

And have anything in 'gerrit', 'database', 'jenkins', 'server' & 'web' that 
needs to be accessible by other classes placed in 'defaults' and for 
anything specific to that class simply put in a name that is picked up by 
'%{calling_class}'.

However there are concerns that it's difficult to remember that data is 
only visible to the associated class/module when made accessible under 
'%{calling_class}' and '%{module_name}', and I think '%{module_name}' goes 
away in hiera 5 or at least it's deprecated and support for it will be 
removed in hiera 6.


What concerns me however is whether there is a performance impact of 
creating lots of levels to keep data nicely separated on a 
service/application basis in the name of keeping it easy to understand.

Do others simply use a single file? Or do you favour use of 
'%{module_name}', '%{calling_class}', and/or '%{calling_class_path}'? If so 
what are your plans around hiera's future behaviour?

Any clues on assessing the performance impact of either approach? I doubt 
it currently makes much difference, but I'm sure as we add more and more 
puppet code to manage additional services/applications and consequently 
many more levels this will have to start impacting at some point.

Perhaps it makes more sense to have these in separate files and then an 
additional step in the deployment that combines the application specific 
files into a single yaml entry to be used by hiera, giving us separation at 
the source/review level and a simple single file at the point of usage to 
ensure good performance.
It also seems to be more in line with hiera, as these application specific 
files are not really separate levels of hierarchy; they are just separated 
for human reading convenience.
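
A related option that needs no extra deployment step, sketched here under
the assumption of a hiera 5 style hiera.yaml, is a single named level that
uses a glob to pick up one file per application. The apps/ directory and
the level names are hypothetical.

```yaml
---
version: 5
defaults:
  datadir: data
  data_hash: yaml_data
hierarchy:
  - name: "Per-node data"
    path: "node/%{facts.networking.domain}/%{facts.networking.hostname}.yaml"
  - name: "Per-application data, one file per application"
    # Would match e.g. data/apps/gerrit.yaml, data/apps/jenkins.yaml
    glob: "apps/*.yaml"
  - name: "Defaults"
    path: "defaults.yaml"
```

This keeps the hierarchy itself short while the data stays split for
review. The matched files are still consulted as individual files rather
than merged into one, so it addresses the configuration sprawl more than
any lookup cost.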

Anyone care to provide some insight:
Have you encountered this?
Do you just stick everything for different services/applications in the 
same file?
Does that restrict which puppet modules/classes that data is 
accessible from?
Do you prefer explicit isolation through using the special variables, and 
just trust that people remember these are not visible everywhere?

--
Darragh
