On 28/11/09 19:30, Luke Kanies wrote:
> On Nov 26, 2009, at 12:23 PM, Markus Roberts wrote:
>
>>>> I'm thinking that it might be a better idea to solve this problem
>>>> than to hack around it. The main solution I'm thinking of is
>>>> essentially requiring some kind of shared back-end or requiring a
>>>> shared cache such as memcached.
>>>
>>> My concern here is that we don't want to cache the facts, we want to
>>> have them at the disposal of several masters. The issue with caching
>>> is that you're never sure the content will be there (that's even the
>>> whole point of a cache).
>>> So what happens if memcached decides to purge the facts for a given
>>> host and said host asks for a catalog?
>>
>> Technically, yes, but Luke's broader point stands; there are a number
>> of solutions (e.g. memcachedb and Tokyo Tyrant) that use the memcached
>> protocol but are persistent.
>
> Yep. Additionally, though, there really is a good bit of caching
> going on on the server now, and doing so with memcached makes a lot
> more sense in many cases.
>
> The complication is that most of that caching is a traditional cache
> -- we can get new data if it goes stale -- but the fact information
> isn't really a cache in that sense.
That was one of my concerns, too, and something that I always found
strange (i.e. storeconfigs being used as the cache part of the
indirector).

>>> What we need is a (more) persistent shared storage for this. And the
>>> only one we have at the moment is storeconfigs/thin_storeconfigs.
>>
>> As Luke noted, memcached (and other systems that use the protocol)
>> should be quite doable; there are also a slew of other options (such
>> as MagLev and the other NoSQL systems). This would be a prime
>> candidate for plugins.
>>
>>> Granted those are performance suckers (less of course for
>>> thin_storeconfigs), so that might not be useful for large sites
>>> (which of course need several masters).
>>
>> I suspect that the performance issues are resolvable.
>
> Probably, although I'm not convinced it's possible to do so without
> changing technologies (Brice, how's your research into TokyoCabinet et
> al going?).

I couldn't make any progress so far; my Puppet work has stalled recently
because of an activity surge at the office.

> Really, though, to store the Fact information, the performance won't
> be nearly as big a problem.

Sure. We can even use thin_storeconfigs for that.

>>>> A shared cache with memcached should be pretty close to trivial -
>>>> just another terminus type. This obviously adds another dependency,
>>>> but only in those cases where you 1) have multiple masters, 2) don't
>>>> have client binding to an individual master, and 3) aren't using
>>>> some common back-end (one of which will be available from us with
>>>> this information by the next major release).
>>
>> Having an additional dependency for an optional feature seems quite
>> reasonable.
>>
>>> * we bend the REST model to POST the facts and get the catalog as a
>>> result (i.e. one transaction like now, but posted)
>>
>> Sure. I mean, if you're willing to contort HTTP and pretend it's an
>> RPC system (which is what REST is), a little extra bending to make it
>> actually work shouldn't be that objectionable. Are there any "REST
>> purists" on this bus, and if so have you thought about how paradoxical
>> that is? If we're all pragmatists here, this may be the simplest/most
>> reliable solution.
>
> AFAIK there haven't been any purist arguments in the group, and as you
> say, that would be pretty silly.
>
> My only concern is how much it requires a change to the existing
> architecture or a one-off solution that's painful to maintain over
> time.

I didn't really check, but I didn't think it was complex... (there is a
rough sketch of what I mean at the bottom of this mail).

[snipped]

>>> * we don't care and ask users wanting to have multiple masters to
>>> use a shared filesystem (whatever it is) to share the YAML-dumped
>>> facts.
>>
>> It could work. It could also fail due to various race conditions.
>
> This basically says that multiple masters is really complicated and
> you shouldn't do it, which is not where we want to end up.
>
> IMO, the right approach is to have a node manager capable of
> functioning as an inventory server (holding all fact/node data), and
> then have the servers query that (with the same kind of caching
> they're doing now).
>
> This gets you essentially everything you need, and all it says is: if
> you want multi-master, you have to have an inventorying node manager.

But people running multiple masters mainly do this for failure
resistance, and we're just adding a single point of failure... Don't you
think this is a problem?
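To make the "just another terminus type" idea a bit more concrete, here
is a rough and completely untested sketch of what a memcached-backed
facts terminus could look like. The file/class names and the
memcache-client calls are only assumptions for illustration, not a
patch; pointed at memcachedb or Tokyo Tyrant speaking the memcached
protocol, the "the cache purged my facts" objection mostly disappears:

  # Hypothetical lib/puppet/indirector/facts/memcache.rb
  require 'puppet/node/facts'
  require 'puppet/indirector/code'
  require 'memcache'   # memcache-client gem, assumed installed on the masters

  class Puppet::Node::Facts::Memcache < Puppet::Indirector::Code
    desc "Store node facts in a shared memcached-protocol server
      (memcached, memcachedb, Tokyo Tyrant, ...) so that every master
      sees the same facts."

    # Return the facts for a node, or nil if the backend has nothing
    # for that key.
    def find(request)
      connection.get("facts/#{request.key}")
    end

    # Persist the facts the node just submitted (memcache-client
    # marshals the Puppet::Node::Facts instance for us).
    def save(request)
      connection.set("facts/#{request.key}", request.instance)
    end

    private

    # One connection per terminus; the server address would obviously
    # have to come from puppet.conf in a real implementation.
    def connection
      @connection ||= MemCache.new("127.0.0.1:11211")
    end
  end

The agents would keep submitting facts exactly as they do today; only
the server side changes, with something like facts_terminus = memcache
in puppet.conf (or whatever the setting would end up being called).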
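And for the "bend REST and POST the facts with the catalog request"
option, the exchange would look roughly like this from the agent side.
The URL layout and parameter names are invented for the example (this is
not the current wire format), and the SSL/certificate handling is
elided:

  # Toy client: send the facts in the body of the catalog request and
  # read the compiled catalog back from the response.
  require 'net/https'
  require 'yaml'
  require 'cgi'

  node_name = "agent1.example.com"                               # example node
  facts     = { "fqdn" => node_name, "memorysize" => "2.00 GB" } # example facts

  http = Net::HTTP.new("puppet.example.com", 8140)
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE  # real code would use the agent cert + CA

  body = "facts_format=yaml&facts=" + CGI.escape(YAML.dump(facts))
  response = http.post("/production/catalog/#{node_name}", body,
                       "Content-Type" => "application/x-www-form-urlencoded",
                       "Accept"       => "yaml")

  catalog = YAML.load(response.body) if response.is_a?(Net::HTTPSuccess)

The point is simply that the facts travel in the body of the very
request that asks for the catalog, so whichever master answers has
everything it needs, without any shared fact storage at all.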
-- 
Brice Figureau
My Blog: http://www.masterzen.fr/
