On 28/11/09 19:30, Luke Kanies wrote:
> On Nov 26, 2009, at 12:23 PM, Markus Roberts wrote:
>
>>>> I'm thinking that it might be a better idea to solve this problem
>>>> than to hack around it. The main solution I'm thinking of is
>>>> essentially requiring some kind of shared back-end or requiring a
>>>> shared cache such as memcached.
>>>
>>> My concern here is that we don't want to cache the facts, we want to
>>> have them at the disposal of several masters. The issue with caching
>>> is that you're never sure the content will be there (that's even the
>>> whole point of a cache).
>>> So what happens if memcached decides to purge the facts for a given
>>> host and said host asks for a catalog?
>>
>> Technically, yes, but Luke's broader point stands; there are a number
>> of solutions (e.g. memcachedb and Tokyo Tyrant) that use the memcached
>> protocol but are persistent.
>
> Yep. Additionally, though, there really is a good bit of caching
> going on on the server now, and doing so with memcached makes a lot
> more sense in many cases.
>
> The complication is that most of that caching is a traditional cache
> -- we can get new data if it goes stale -- but the fact information
> isn't really a cache in that sense.
That was one of my concerns, too, and something that I always found
strange (i.e. storeconfigs being used as the cache part of the
indirector).

>>> What we need is a (more) persistent shared storage for this. And the
>>> only one we have at the moment is storeconfigs/thin_storeconfigs.
>>
>> As Luke noted, memcached (and other systems that use the protocol)
>> should be quite doable; there are also a slew of other options (such
>> as MagLev and the other NoSQL systems). This would be a prime
>> candidate for plugins.
>>
>>> Granted those are performance suckers (less of course for
>>> thin_storeconfigs), so that might not be useful for large sites
>>> (which of course need several masters).
>>
>> I suspect that the performance issues are resolvable.
>
> Probably, although I'm not convinced it's possible to do so without
> changing technologies (Brice, how's your research into TokyoCabinet et
> al going?).

I couldn't make any progress so far; my Puppet work has stalled recently
because of an activity surge at the office.

> Really, though, to store the Fact information, the performance won't
> be nearly as big a problem.

Sure. We can even use thin_storeconfigs for that.

>>>> A shared cache with memcached should be pretty close to trivial -
>>>> just another terminus type. This obviously adds another dependency,
>>>> but only in those cases where you 1) have multiple masters, 2) don't
>>>> have client binding to an individual master, and 3) aren't using
>>>> some common back-end (one of which will be available from us with
>>>> this information by the next major release).
>>
>> Having an additional dependency for an optional feature seems quite
>> reasonable.
>>
>>> * we bend the REST model to POST the facts and get the catalog as a
>>> result (i.e. one transaction like now, but posted)
>>
>> Sure. I mean, if you're willing to contort HTTP and pretend it's an
>> RPC system (which is what REST is), a little extra bending to make it
>> actually work shouldn't be that objectionable. Are there any "REST
>> purists" on this bus, and if so have you thought about how paradoxical
>> that is? If we're all pragmatists here, this may be the simplest/most
>> reliable solution.
>
> AFAIK there haven't been any purist arguments in the group, and as you
> say, that would be pretty silly.
>
> My only concern is how much it requires a change to the existing
> architecture or a one-off solution that's painful to maintain over
> time.

I didn't really check, but I didn't think it was complex... (there is a
rough sketch of what I mean at the bottom of this mail).

[snipped]

>>> * we don't care and ask users wanting to have multiple masters to
>>> use a shared filesystem (whatever it is) to share the YAML-dumped
>>> facts.
>>
>> It could work. It could also fail due to various race conditions.
>
> This basically says that multiple masters is really complicated and
> you shouldn't do it, which is not where we want to end up.
>
> IMO, the right approach is to have a node manager capable of
> functioning as an inventory server (holding all fact/node data), and
> then have the servers query that (with the same kind of caching
> they're doing now).
>
> This gets you essentially everything you need, and all it says is: if
> you want multi-master, you have to have an inventorying node manager.

But people running multiple masters mainly do this for failure
resistance, and we're just adding a single point of failure... Don't you
think this is a problem?
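To make the "just another terminus type" idea a bit more concrete, here
is a rough and completely untested sketch of what a memcached-backed
facts terminus could look like. The file/class names and the
memcache-client calls are only assumptions for illustration, not a
patch; pointed at memcachedb or Tokyo Tyrant speaking the memcached
protocol, the "the cache purged my facts" objection mostly disappears:

  # Hypothetical lib/puppet/indirector/facts/memcache.rb
  require 'puppet/node/facts'
  require 'puppet/indirector/code'
  require 'memcache'   # memcache-client gem, assumed installed on the masters

  class Puppet::Node::Facts::Memcache < Puppet::Indirector::Code
    desc "Store node facts in a shared memcached-protocol server
      (memcached, memcachedb, Tokyo Tyrant, ...) so that every master
      sees the same facts."

    # Return the facts for a node, or nil if the backend has nothing
    # for that key.
    def find(request)
      connection.get("facts/#{request.key}")
    end

    # Persist the facts the node just submitted (memcache-client
    # marshals the Puppet::Node::Facts instance for us).
    def save(request)
      connection.set("facts/#{request.key}", request.instance)
    end

    private

    # One connection per terminus; the server address would obviously
    # have to come from puppet.conf in a real implementation.
    def connection
      @connection ||= MemCache.new("127.0.0.1:11211")
    end
  end

The agents would keep submitting facts exactly as they do today; only
the server side changes, with something like facts_terminus = memcache
in puppet.conf (or whatever the setting would end up being called).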
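And for the "bend REST and POST the facts with the catalog request"
option, the exchange would look roughly like this from the agent side.
The URL layout and parameter names are invented for the example (this is
not the current wire format), and the SSL/certificate handling is
elided:

  # Toy client: send the facts in the body of the catalog request and
  # read the compiled catalog back from the response.
  require 'net/https'
  require 'yaml'
  require 'cgi'

  node_name = "agent1.example.com"                               # example node
  facts     = { "fqdn" => node_name, "memorysize" => "2.00 GB" } # example facts

  http = Net::HTTP.new("puppet.example.com", 8140)
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE  # real code would use the agent cert + CA

  body = "facts_format=yaml&facts=" + CGI.escape(YAML.dump(facts))
  response = http.post("/production/catalog/#{node_name}", body,
                       "Content-Type" => "application/x-www-form-urlencoded",
                       "Accept"       => "yaml")

  catalog = YAML.load(response.body) if response.is_a?(Net::HTTPSuccess)

The point is simply that the facts travel in the body of the very
request that asks for the catalog, so whichever master answers has
everything it needs, without any shared fact storage at all.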
-- 
Brice Figureau
My Blog: http://www.masterzen.fr/
