On Monday, April 14, 2014 1:41:42 PM UTC-5, Jim Donnellan wrote:
>
> Puppeteers,
>
> I'm trying to get something done in puppet/hiera, and I'm curious if it's 
> possible. A bit of background:
>
> We're using puppet and hiera to build out and maintain Apache Solr, and 
> we're using Solr in a cloud structure. What this ends up meaning 
> configuration-wise is that we have our data broken off into shards, and 
> servers that are hosting different replicas of the shards for redundancy. 
> For a simplified sense, say it looks like this...
>
> Server1 hosts:
>   shard1, replica 1
>   shard2, replica 1
>
> Server2 hosts:
>   shard1, replica 2
>   shard2, replica 2
>
> Server3 hosts:
>   shard3, replica 1
>   shard4, replica 1
>
> ...and so on. So each shard exists on multiple hosts, each host has 
> multiple shards, but not every shard is on every host. 
>
> We were able to handle this just fine by using a hiera array to list the 
> shards at the _host_.yaml level. No big deal. Done and done, works great in 
> production.
>
>
> The issue that has come up is that some of the shards (which are JVMs) 
> have grown a bit and require a heap size greater than the default. 
> Obviously this would be something we'd want to wrangle with puppet and 
> hiera. We've come up with some initial attempts that seem to work, which 
> involve just enhancing the hiera data when we declare the shards at a host 
> level. So instead of:
>
> Server1.yaml::
> shards:
> - shard1
> - shard2
>
> We have something like this:
>
> Server1.yaml::
> shards:
> - shard1: 5G
> - shard2: 7G
>
> ...which is workable. The thing I don't like about it is that I'm defining 
> the heap size at the host level, even though they should be consistent for 
> any given shard across servers. This is redundant at best, and leaves 
> things open for inconsistency across servers at worst. I kind of want to 
> raise the heap size declarations up above the host level, up to the 
> application or environment level I guess. But I would still need to declare 
> which shards are where at the host level. In short, I guess I need the 
> deployment to look for what shards should be on a host at the host level, 
> and then look up the chain a bit to see what heap size that shard should 
> have. 
>
> Does this sound doable?
>
>

At that level of abstraction, yes, it sounds doable, but the devil is in 
the details.  The per-host shard data probably need to be references (by 
name) to shard details defined at some more general level of your 
hierarchy.  You can then use a defined type to declare all the shards for 
each server, looking up the shared details for each shard by its name.  
From the data structure you describe, I suppose you probably already have 
something going in this direction.  So your data might look more like this:

Server1.yaml:
shards:
- shard1
- shard2

Server5.yaml:
shards:
- shard1
- shard5

common.yaml:
shard_details:
  shard1:
    max_heap: 5G
  shard2:
    max_heap: 7G
  shard5:
    max_heap: 4G


There are any number of ways you could tweak the data structure, but that 
general approach seems sound to me.
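
On the manifest side, a minimal sketch of the defined-type idea might look 
like the following.  The names solr_shard, solr_shards and all_details are 
made up for illustration, and the notify resource is only a placeholder for 
whatever actually configures each JVM.  It assumes Puppet 3-style hiera() 
lookups and the 'shards' / 'shard_details' keys shown above.

```puppet
# Hypothetical defined type: one instance per shard hosted on this node.
# $title is the shard name (e.g. "shard1").
define solr_shard ($all_details) {
  # Pull this shard's settings out of the shared hash by name.
  $details  = $all_details[$title]
  $max_heap = $details['max_heap']

  # Placeholder: replace with the resources that really set the heap size.
  notify { "shard ${title}: max heap ${max_heap}": }
}

# Hypothetical wrapper class that ties the two levels of data together.
class solr_shards {
  # Per-host list of shard names (from Server1.yaml, Server5.yaml, ...).
  $shards        = hiera('shards', [])
  # Shared per-shard settings (from common.yaml).
  $shard_details = hiera('shard_details', {})

  # Passing an array as the title declares one solr_shard per element.
  solr_shard { $shards:
    all_details => $shard_details,
  }
}
```

Note that this sketch does nothing sensible for a shard that is listed on a 
host but has no entry in shard_details; you would probably want a default 
heap size or an explicit failure for that case.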


John
