Guys, I am trying to build a fully provisioned Docker container with Hadoop installed and configured in it. The reason is pretty simple: it will give users the ability to quickly spin up a sandbox environment with some or all of the bits in it.
Of course I am taking advantage of our deploy images and puppet recipes, but here's a snag: at image creation time the FQDN of the container cannot be set, so the image will, by definition, get set up as a worker node. I have devised a contraption where I install the master packages (like hadoop-hdfs-namenode) as part of the image build. Then, at container creation time, I run 'puppet apply' again, and in a matter of 60-70 seconds everything gets configured based on the actual hostname and the services get started.

I think this is a pretty neat workaround compared to what I've seen on the net: awful miles of sed statements to fix the configs, followed by scripted service runs, etc. Nasty...

I want to do this second run automatically, when the container gets created, so the user doesn't need to do anything extra. I've tried to tailgate ENTRYPOINT to do the second run of 'puppet apply', but understandably the container stops right after the puppet run is over. Any ideas how this can be achieved?

It would be good to have an official 1.1 Docker image once it is out. For reference, the initial version of this is sitting as a patch on BIGTOP-2296.

Looking forward to your thoughts!

Thanks,
  Cos
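P.S. To make the two-phase idea concrete, here is roughly the shape of it (the paths and file names below are illustrative placeholders, not the exact contents of the BIGTOP-2296 patch):

  # Dockerfile fragment
  # First pass, at image build time: the FQDN is not known yet, so this
  # run effectively configures the image as a worker node; the master
  # packages (e.g. hadoop-hdfs-namenode) get installed here as well.
  RUN puppet apply /etc/puppet/manifests/site.pp

  # Second pass happens at container creation, via the entrypoint.
  COPY docker-entrypoint.sh /docker-entrypoint.sh
  RUN chmod +x /docker-entrypoint.sh
  ENTRYPOINT ["/docker-entrypoint.sh"]

  # docker-entrypoint.sh
  #!/bin/bash
  # Re-apply the recipes against the actual hostname; this is the 60-70
  # second run that reconfigures everything and starts the services.
  puppet apply /etc/puppet/manifests/site.pp
  # ...and here is where I'm stuck: as soon as this script exits, the
  # container stops, even though the services were just brought up.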
