We've been doing this for the last few years. We just built our own solution, 
called heckle. The code is open source, etc. 
https://trac.mcs.anl.gov/projects/Heckle

The high level model is that each node pxe boots each time. We use gpxe, and 
point it at heckle for its boot configuration. Nodes are registered with the 
system, and go through a state machine for building a given image. Each image 
has a series of boot steps, and loads its kernel over the network. 
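
As a rough illustration of that model (this is not Heckle's actual code; every 
class, field, and step name below is hypothetical), a node can be thought of as 
a cursor moving through its image's ordered boot steps, with the boot 
configuration it is handed at each pxe boot depending on where the cursor is:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Image:
    """Hypothetical image definition: a name plus an ordered list of boot steps."""
    name: str
    boot_steps: List[str]


@dataclass
class Node:
    """Hypothetical registered node, tracked by MAC address."""
    mac: str
    image: Image
    step_index: int = 0

    def current_step(self) -> Optional[str]:
        """Return the pending boot step, or None once every step has run."""
        if self.step_index >= len(self.image.boot_steps):
            return None
        return self.image.boot_steps[self.step_index]

    def advance(self) -> None:
        """Mark the current step finished; a no-op once the build is complete."""
        if self.step_index < len(self.image.boot_steps):
            self.step_index += 1


def boot_config(node: Node) -> str:
    """Describe what the node would be told at its next network boot."""
    step = node.current_step()
    if step is None:
        return f"{node.mac}: build of '{node.image.name}' complete"
    return f"{node.mac}: run step '{step}' of '{node.image.name}'"


if __name__ == "__main__":
    node = Node("00:11:22:33:44:55", Image("demo", ["partition", "install"]))
    while node.current_step() is not None:
        print(boot_config(node))
        node.advance()
    print(boot_config(node))
```

Keeping the step list on the image rather than on the node is one way to stay 
agnostic about the build mechanism: a different imaging method is just a 
different list of steps.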

The orchestration bit is build mechanism agnostic; we've loaded system imager, 
a home-grown tinycore imaging system, preseed and kickstart with it.

All of these parts are pretty straightforward; depending on your use case, the 
user end part is a lot more complicated. In our case, our users need root, and 
want to build their own configurations, sometimes based on the images we 
supply, but sometimes not. All of these bits mean that we need to look inside 
of node images a fair bit. If you could make do with static images pointed at 
external data things would be considerably easier. (and keep in mind that this 
is a configuration wonk saying that the configuration can easily get out of 
hand; you can easily end up supporting a *really* wide variety of diverse 
configurations.)

Another major issue that comes into play is build performance. If users request 
nodes synchronously, you don't want builds to take too long. 4 minutes is about 
as fast as we've been able to make node builds work on raw hardware. In order 
to do that, we needed to skip kickstart and preseed altogether, in favor of a 
homebuilt imaging system based on tinycore. The base imaging process takes 
about 90s, the tinycore boot takes about 20s, and you lose a minute or more per 
boot to the bios, depending on the type of system. We originally used power 
controllers for nodes, but have switched to ipmi more recently. 
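
To see how those numbers add up, here is a small back-of-the-envelope sketch. 
The 90s, 20s, and 60s figures come from the paragraph above; the assumption 
that a build involves two trips through the bios (one into the imaging 
environment, one into the finished image) is mine, and the finished image's 
own boot time is not given, so it is left out:

```python
def estimated_build_seconds(imaging_s=90, tinycore_boot_s=20,
                            bios_s_per_boot=60, boots=2):
    """Rough build-time estimate in seconds.

    boots=2 is an assumption (imaging environment, then the final image);
    the final image's own boot time is unknown and not included.
    """
    return imaging_s + tinycore_boot_s + bios_s_per_boot * boots


if __name__ == "__main__":
    total = estimated_build_seconds()
    print(f"{total} s, about {total / 60:.1f} minutes")
```

Under those assumptions the bios accounts for roughly half of the total, which 
is why a slower bios on a given type of system moves the overall number so 
much.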

The devil is in the details, particularly in terms of what you're trying to 
provide your users with. 

Oh yeah, and the thing you definitely need is a netboot rescue image that fires 
up a serial console and ssh. It is worth its weight in gold, particularly if 
users expect to keep their allocated nodes beyond a single boot (and kernel 
mods, etc)
 -nld


On Apr 7, 2011, at 8:43 PM, A. Rich wrote:

> Is anyone doing cloud computing with physical hosts (re-imaging with pxeboot,
> deploy studio, etc) or, even better, a mix of physical and virtual hosts?  If
> so, what products did you investigate before picking your solution, and what
> did you eventually pick and why?  I'm especially interested in solutions that
> support various distros and versions of linux and windows.
> 
> Thanks!
> 
> _______________________________________________
> Tech mailing list
> [email protected]
> https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
> This list provided by the League of Professional System Administrators
> http://lopsa.org/
