I agree with Michael. We need to use the OpenStack tooling.
The Sahara team is running into some of the same issues we are as they
build up their Hadoop VMs/clusters.
See:
http://docs.openstack.org/developer/sahara/userdoc/vanilla_plugin.html
http://docs.openstack.org/developer/sahara/userdoc/diskim
I am investigating building scripts that use diskimage-builder
(https://github.com/openstack/diskimage-builder) to create a "purpose
built" image. This should give us some flexibility in both the base
image and the output image format (including a path to Docker).
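
As a rough illustration of what such a script might look like, here is a
minimal sketch that only assembles a disk-image-create command line and
does not run it. The helper name dib_command, the element list
("ubuntu", "vm") and the output name "purpose-built" are assumptions
chosen for the example, not something we have settled on.

```python
import shlex


def dib_command(elements, output_name, output_format="qcow2"):
    """Assemble a disk-image-create command line (hypothetical helper).

    The command is returned as a list of arguments and is not executed.
    """
    if not elements:
        raise ValueError("at least one element is required")
    # -t selects the output image format and -o the output file base name;
    # the remaining positional arguments are the elements to apply.
    return ["disk-image-create", "-t", output_format,
            "-o", output_name, *elements]


if __name__ == "__main__":
    # Illustrative element choice: an Ubuntu base built as a bootable VM image.
    print(shlex.join(dib_command(["ubuntu", "vm"], "purpose-built")))
```

Changing the base image would mean changing the element list, and
changing the output format would mean changing the value passed to -t.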
The definition of "purpose built" is op