I've been following this thread with interest and I think that Donavan is hitting upon something that I've also been wanting.
However, the way I was looking at it was as a set of atomic, optionally blocking semaphores across a set of parallel threads. If you look at each Puppet client as an individual thread and realize that you need things to happen cross-client that depend on the state of one or more clients, then you've deconstructed this into a classic parallel programming problem (with all the constituent nonsense).

I was toying with the idea of a, for lack of a better term, registry where your modules could place *any* type of data and other systems would retrieve that data, or data sets, when necessary. I would *not* build this into the puppetmaster, for scaling reasons, but would instead make it a separate data service, perhaps optionally backed by a distributed database.

This scenario works both for the OpenLDAP situation below and for the situation where you can't start a service, or don't want to apply part of a manifest, until something has happened on a different system.

OpenLDAP Example:

Node A
- Request lock with the data broker
- Obtain lock

Node B
- Request lock with the data broker
- Sleep, or skip on returned block

Node A
- Obtain RID list
- RID +1
- Register new RID with data broker
- Release lock

Node B
- Get notified that the lock is available, if it didn't skip
- Continue with processing
...

It's a complex situation, but I'm not entirely sure how else to do it without a kludgy web "service" or the like. If you could, perhaps, use some of the recent NoSQL abstractions, then this may be a reasonably fast operation. Of course, not all data would need to be locked; if you're just reading data, then there's no need to lock at all. But, in my opinion, this is fundamentally cross-system parallel programming, and I think that the existing techniques for dealing with that problem are best suited to the task.

I do see a growing trend in this thread and others that, no matter what you choose, someone is going to need something else.
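To make the locking flow above concrete, here's a toy sketch. It is a model only, not an implementation proposal: the `DataBroker` class, its method names, and `allocate_rid` are all hypothetical, and a real broker would be a networked service (backed by a distributed store), not an in-process object.

```python
# Toy model of the "data broker" lock/RID flow described above.
# A real deployment would put this behind a network service; this
# in-process version just illustrates the protocol.

import threading

class DataBroker:
    """Hypothetical broker holding shared data behind an atomic, blocking lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def try_lock(self, blocking=True):
        # "Request lock": either block until granted, or return False ("skip").
        return self._lock.acquire(blocking=blocking)

    def release(self):
        # "Release lock": wakes any node sleeping on the lock.
        self._lock.release()

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def allocate_rid(broker):
    """Node-side flow: obtain lock, read RID list, RID +1, register, release."""
    if not broker.try_lock(blocking=True):
        return None  # a skipping node would bail out here
    try:
        rids = broker.get("rids", [])
        new_rid = (max(rids) + 1) if rids else 1
        broker.put("rids", rids + [new_rid])
        return new_rid
    finally:
        broker.release()
```

As noted above, a node that only reads the RID list could skip the lock entirely; only the read-increment-register sequence has to be atomic.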
Such is the nature of the vast array of data that we have to pull from. For example, Person A, with 500 servers, will be happy with some GUI wrapped around YAML. However, Person B, with 5000 servers, will find the same solution tedious and slow and will want a full-on database.

Hope this helps and doesn't just make the whole thing more complicated.

Thanks,

Trevor

On 05/28/2010 03:41 PM, donavan wrote:
> On May 28, 3:00 am, Luke Kanies <[email protected]> wrote:
>> External data (that is, data specified outside of Puppet manifests)
>> seems to keep coming up. This is a relatively long description of
>> where it seems we are and where we should go from here.
>
> I'd like to +1 this discussion in general. My personal #1 wishlist
> item is the 'data from other nodes' problem that Daniel mentions.
>
> It seems to me that more separation between logic and data is needed
> in Puppet manifests. It's one of the main problems I see with module
> redistribution. Apache module A is written for Debian, Apache module B
> is written for RHEL, etc. Even if that were cleaned up, you'd still see
> $adminpassword variables, and not everyone wants the same list of
> modules loaded/installed.
>
>> * Alessandro's presentation caused someone to point out to me
>> afterward that case statements of this ilk:
>>
>> case $operatingsystem {
>>   debian: { ... }
>>   redhat: { ... }
>> }
>
> I echo Jonathan's sentiments on this. I think a better alternative to
> the above is something like this:
>
> class apache {
>   include apache::$operatingsystem
> }
>
> To add Solaris support, I add solaris.pp into the load path.
> Class[apache::solaris] can then include or inherit
> Class[apache::base], as needed. Then I extend the module and limit
> conflict with upstream manifests. It may not be ideal, but it works
> today.
>
>> * Users should probably be able to put their external data in a
>> database, preferably in their external node tool
>
> I'd like to second Daniel's comments regarding data of other nodes.
> This is pretty much required when configuring distributed services.
> Today I can get most of the way there with storeconfigs & clever
> defines, but it's not ideal. I posted a question about hacking faux
> distributed key/value storage using storeconfigs, but got crickets.
>
> A common use case for me is OpenLDAP replication (syncrepl). Each
> slave is identified by a 3-digit integer (rid), which must be unique
> in the replication group and needs to persist for the life of that
> replica. So each new slave needs to know the rid of every existing
> slave, and then pick the next available rid. The same pattern applies
> to MySQL replication, for example.
>
>> I also don't like the idea of just relying on a function - I'd like a
>> class to be able to declare that it relies on external data, so that
>> users know what they can configure in their class.
>
> Isn't this part of parameterized classes? Today I'd probably do 'if
> $value == "" { fail("must define \$value") }' where required.

-- 
Trevor Vaughan
Vice President, Onyx Point, Inc.
email: [email protected]
phone: 410-541-ONYX (6699)
pgp: 0x6C701E94

-- This account not approved for unencrypted sensitive information --
