On Monday, April 14, 2014 1:41:42 PM UTC-5, Jim Donnellan wrote:
>
> Puppeteers,
>
> I'm trying to get something done in puppet/hiera, and I'm curious if it's
> possible. A bit of background:
>
> We're using puppet and hiera to build out and maintain Apache Solr, and
> we're using Solr in a cloud structure. What this ends up meaning
> configuration-wise is that we have our data broken off into shards, and
> servers that are hosting different replicas of the shards for redundancy.
> For a simplified sense, say it looks like this...
>
> Server1 hosts:
> shard1, replica 1
> shard2, replica 1
>
> Server2 hosts:
> shard1, replica 2
> shard2, replica 2
>
> Server3 hosts:
> shard3, replica 1
> shard4, replica 1
>
> ...and so on. So each shard exists on multiple hosts, each host has
> multiple shards, but not every shard is on every host.
>
> We were able to handle this just fine by using a hiera array to list the
> shards at the _host_.yaml level. No big deal. Done and done, works great in
> production.
>
>
> The issue that has come up is that some of the shards (which are JVMs)
> have grown a bit and require a heap size greater than the default.
> Obviously this would be something we'd want to wrangle with puppet and
> hiera. We've come up with some initial attempts that seem to work, which
> involve just enhancing the hiera data when we declare the shards at a host
> level. So instead of:
>
> Server1.yaml::
> shards:
> - shard1
> - shard2
>
> We have something like this:
>
> Server1.yaml::
> shards:
> - shard1: 5G
> - shard2: 7G
>
> ...which is workable. The thing I don't like about it is that I'm defining
> the heap size at the host level, even though they should be consistent for
> any given shard across servers. This is redundant at best, and leaves
> things open for inconsistency across servers at worst. I kind of want to
> raise the heap size declarations up above the host level, up to the
> application or environment level I guess. But I would still need to declare
> which shards are where at the host level. In short, I guess I need the
> deployment to look for what shards should be on a host at the host level,
> and then look up the chain a bit to see what heap size that shard should
> have.
>
> Does this sound doable?
>
>
At that level of abstraction, yes, it sounds doable, but the devil is in
the details. The per-host shard data probably need to be references (by
name) to shard details in some more general level of your hierarchy. You
can then use a defined type to declare all the shards for each server,
based on the shared details for each shard. From the data structure you
describe, I suppose you probably already have something going in this
direction. So your data might look more like this:

Server1.yaml:
    shards:
      - shard1
      - shard2

Server5.yaml:
    shards:
      - shard1
      - shard5

common.yaml:
    shard_details:
      shard1:
        max_heap: 5G
      shard2:
        max_heap: 7G
      shard5:
        max_heap: 4G

There are any number of ways you could tweak the data structure, but that
general approach seems sound to me.
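
To make the defined-type idea concrete, here is a rough, untested sketch of how the manifests might consume that data. The names solr::shards and solr::shard, the 1G fallback, and the file path are all made up for illustration; substitute whatever your module really uses to launch each shard's JVM. It uses the hiera() and hiera_hash() functions; on newer Puppet versions lookup() takes their place.

```puppet
# Sketch only: class, define, default heap, and file path are hypothetical.
# In a real module each of these lives in its own file so autoloading works
# (solr/manifests/shards.pp and solr/manifests/shard.pp).

# Declares one solr::shard per shard name listed for this host.
class solr::shards {
  # Host-level list, e.g. ['shard1', 'shard2'] from Server1.yaml.
  # With no default given, the lookup fails if the host defines no shards.
  $shards = hiera('shards')

  # An array title declares one resource per element.
  solr::shard { $shards: }
}

# One instance per shard; $name is the shard name from the host's list.
define solr::shard ($default_heap = '1G') {
  # Shared per-shard settings from a more general hierarchy level.
  $all_details = hiera_hash('shard_details', {})
  $details     = $all_details[$name]

  # Fall back to the (assumed) default when a shard has no entry.
  if $details {
    $max_heap = $details['max_heap']
  } else {
    $max_heap = $default_heap
  }

  # Placeholder resource: records the heap option for this shard.
  # Replace with however your shards actually receive JVM options.
  file { "/etc/solr/${name}.heap":
    ensure  => file,
    content => "-Xmx${max_heap}\n",
  }
}
```

With that in place, each host's YAML only names its shards, and changing a shard's heap size means editing one entry in common.yaml.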
John