We've been doing this for the last few years. We just built our own solution, called Heckle. The code is open source: https://trac.mcs.anl.gov/projects/Heckle
The high-level model is that each node PXE boots every time. We use gPXE and point it at Heckle for its boot configuration. Nodes are registered with the system and go through a state machine for building a given image. Each image has a series of boot steps and loads its kernel over the network. The orchestration bit is build-mechanism agnostic; we've loaded SystemImager, a home-grown tinycore imaging system, preseed, and kickstart with it.

All of these parts are pretty straightforward; depending on your use case, the user-facing part is a lot more complicated. In our case, our users need root and want to build their own configurations, sometimes based on the images we supply, but sometimes not. All of these bits mean that we need to look inside node images a fair bit. If you could make do with static images pointed at external data, things would be considerably easier. (And keep in mind that this is a configuration wonk saying that the configuration can easily get out of hand; you can easily end up supporting a *really* wide variety of configurations.)

Another major issue is build performance. If users request nodes synchronously, you don't want builds to take too long. About 4 minutes is as fast as we've been able to make node builds work on raw hardware. To get there, we needed to skip kickstart and preseed altogether in favor of a homebuilt imaging system based on tinycore. The base imaging process takes about 90s, the tinycore boot takes about 20s, and you lose a minute or more per boot to the BIOS, depending on the type of system.

We originally used power controllers for nodes, but have switched to IPMI more recently.

The devil is in the details, particularly in terms of what you're trying to provide your users with.

Oh yeah, and the thing you definitely need is a netboot rescue image that fires up a serial console and ssh.
It is worth its weight in gold, particularly if users expect to keep their allocated nodes beyond a single boot (and kernel mods, etc).
 -nld

On Apr 7, 2011, at 8:43 PM, A. Rich wrote:

> Is anyone doing cloud computing with physical hosts (re-imaging with pxeboot,
> deploy studio, etc) or, even better, a mix of physical and virtual hosts? If
> so, what products did you investigate before picking your solution, and what
> did you eventually pick and why? I'm especially interested in solutions that
> support various distros and versions of linux and windows.
>
> Thanks!
>
> _______________________________________________
> Tech mailing list
> [email protected]
> https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
> This list provided by the League of Professional System Administrators
> http://lopsa.org/
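
For what it's worth, here is a minimal sketch of the kind of per-node build state machine described in the reply above. It is not Heckle's actual code: the state names, the Node class, and the assumption of exactly two BIOS passes per build are hypothetical and only there for illustration. The timings are the rough figures quoted in the reply (about 90s imaging, about 20s tinycore boot, a minute or more per BIOS pass).

```python
from dataclasses import dataclass, field

# Hypothetical state names; Heckle's real states may differ.
TRANSITIONS = {
    "registered": "pxe_boot",
    "pxe_boot": "imaging",
    "imaging": "reboot",
    "reboot": "ready",
}

# Rough per-step cost in seconds, taken from the figures in the reply.
STEP_SECONDS = {
    "pxe_boot": 60 + 20,  # BIOS (a minute or more) plus tinycore boot
    "imaging": 90,        # base imaging process
    "reboot": 60,         # BIOS again; final OS boot time is not counted
}


@dataclass
class Node:
    name: str
    state: str = "registered"
    elapsed: int = 0
    history: list = field(default_factory=lambda: ["registered"])

    def advance(self):
        """Move to the next build state and add that state's time cost."""
        if self.state not in TRANSITIONS:
            raise ValueError(f"{self.name} is already in final state {self.state}")
        self.state = TRANSITIONS[self.state]
        self.elapsed += STEP_SECONDS.get(self.state, 0)
        self.history.append(self.state)
        return self.state


def build(node):
    """Drive a node through every step until it reaches 'ready'."""
    while node.state != "ready":
        node.advance()
    return node


node = build(Node("node01"))
print(node.history)
print(f"estimated build time: {node.elapsed}s")
```

Under these assumptions the estimate comes to 230s, which is in the same ballpark as the roughly 4 minutes mentioned for a build on raw hardware. A slower BIOS or an extra boot step pushes it past that quickly.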
