Hi Todd,
thanks for getting back to me. I was off on vacation myself, so I haven't
had a chance to play around with it over the last week. I think I have my
head around the idea now, but I am struggling to create the image. I am
currently using a python script called livecd-creator (or something like
that), but I do not have a good kickstart file for it, so I am pretty much
just fiddling around.
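For what it's worth, the direction I'm fiddling in is roughly the minimal
kickstart below (mostly adapted from the livecd-tools examples, so the repo
URL and sizes are placeholders rather than a known-good config):

  lang en_US.UTF-8
  keyboard us
  timezone UTC
  selinux --enforcing
  firewall --disabled
  part / --size 4096 --fstype ext4
  repo --name=base --baseurl=http://mirror.centos.org/centos/6/os/x86_64/

  %packages
  @core
  %end

I then build with something like "livecd-creator --config=centos-live.ks
--fslabel=hdp-live" and, if I understand the tooling correctly,
livecd-iso-to-pxeboot should turn the resulting ISO into a kernel/initrd
pair that PXE can serve. No idea yet whether that is the sane way to go
about it, hence the questions: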
How did you create your image? Do you have any documentation/scripts
you'd like to share?
Again thanks for replying. You are really helping me a lot.
Cheers,
alex.
On 08/19/2014 05:19 PM, Todd S wrote:
> Hi Alex,
>
> Sorry for the email confusion, I blame my email admin (sadly, it's me).
>
> We do run the OS from RAM only - no install on the local disk. Every
> boot is done by PXE, which then mounts the data disks and we're off to
> the races again. We only use Cobbler to hand out the PXE image, which
> is set up as a distro:
>
> Name : hdp-dn
> Architecture : x86_64
> TFTP Boot Files : {}
> Breed : redhat
> Comment :
> Fetchable Files : {}
> Initrd : /tftpboot/hdp-dn/initrd-hdp.gz
> Kernel : /tftpboot/hdp-dn/linux-hdp
> Kernel Options : {}
> Kernel Options (Post Install) : {}
> Kickstart Metadata : {}
> Management Classes : []
> OS Version : generic26
> Owners : ['admin']
> Red Hat Management Key : <<inherit>>
> Red Hat Management Server : <<inherit>>
> Template Files : {}
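>
> (In case it helps, a distro like that gets added with more or less the
> stock cobbler command - the flags below are reconstructed from the
> listing above rather than copied from our setup scripts:
>
>   cobbler distro add --name=hdp-dn --arch=x86_64 --breed=redhat \
>       --os-version=generic26 \
>       --kernel=/tftpboot/hdp-dn/linux-hdp \
>       --initrd=/tftpboot/hdp-dn/initrd-hdp.gz
>
> nothing fancy beyond that.)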
>
>
> Hopefully you got things working during my lag in replying .. vacation
> season is upon us!
>
> All the best,
>
> Todd.
>
>
>
> On Mon, Aug 11, 2014 at 12:02 PM, a. <[email protected]> wrote:
>
> Hi Todd!
>
> I've run into a snag getting the same setup to work, as we are looking
> into a "live system", i.e. one where the OS is loaded completely into
> RAM and not installed onto the hard drive before use. To be honest, I
> thought this was possible from what you wrote. Right now I have only
> managed to get Cobbler to provision CentOS onto the machines' hard
> drives via its installer, but that isn't what we want.
>
> Could you clarify: are you installing CentOS onto the machines, or are
> you booting them up live? You wrote "Our Datanodes are completely
> stateless", so that's where we got the idea from.
>
> If you managed to get a PXE live boot to work with Cobbler, I'd be very
> happy to hear how you did it. I'm pretty sure it should be doable, but
> right now I do not know how to set Cobbler up for that.
>
> Any info you could give us would be great!
>
> Also happy to talk to other folks who have the same problems we have
> over here. You are totally right, everyone wants to talk about the Data
> but never about the platforms. ;)
>
> Cheers,
> alex.
>
> P.S. I tried to email you directly before, but your address
> toddæhatescomputers.org was not valid. :(
>
> On 08/04/2014 06:08 PM, Martin Tippmann wrote:
> > ---------- Forwarded message ----------
> > From: Todd Snyder <[email protected]>
> > Date: 2014-08-04 17:54 GMT+02:00
> > Subject: Re: A few questions regarding Ambari: Stateless Imaging and
> > existing infrastructure
> > To: [email protected], [email protected]
> >
> >
> > re:
> >
> >> 1) Stateless images and Ambari
> >
> > We have just gone through the process of figuring this out, and it
> > does work great. We boot hundreds of nodes using a PXE server. Our
> > philosophy is to work in layers, so we bring up the 'platform' with
> > PXE, then use our config management to configure the node, then use
> > Ambari to configure the service. As such, we've baked our image with
> > CentOS 6.5, Puppet, Ambari Agent, and due to some frustration, we
> > ended up baking in the Yarn install as well. Our Datanodes are
> > completely stateless, except for one folder on a data disk where we
> > keep the Hadoop logs, so that if we have issues causing datanode
> > reboots, we still have logs to review.
> >
> > Everything works nicely - we use Cobbler to manage DHCP/PXE. The DHCP
> > server has been 'modified' (cfg file) to provide both the puppet
> > master and the ambari server for the environment. The DHCP exit hooks
> > configure the initial environment, then run Puppet, which further
> > configures it. This brings up Ambari, looking for 'ambari.FQDN',
> > which is (hopefully) the local cluster, assuming it's up. Ambari
> > checks in, gets its configs and tada!
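> >
> > The rough shape of the exit hook is something like the following
> > (illustrative only - the hook path and the dhclient variables are the
> > stock ones, the rest is a boiled-down sketch, not our production hook):
> >
> >   # /etc/dhcp/dhclient-exit-hooks -- sourced by dhclient-script
> >   if [ "$reason" = "BOUND" ] || [ "$reason" = "REBOOT" ]; then
> >       # first puppet run, pointed at the master in the DHCP-supplied
> >       # domain; puppet then lays down the ambari-agent config
> >       puppet agent --onetime --no-daemonize \
> >           --server "puppet.${new_domain_name}"
> >   fi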
> >
> > There are some caveats to "tada", however. The first time you join
> > the datanode to the cluster (ie: join Ambari agent to the server),
> > Ambari server will assign it its roles, bring up the service(s)
> > assigned (assuming you've assigned them on the server), and things are
> > grand. However, if you reboot the stateless server, Ambari (agent)
> > doesn't start automatically, and forgets what its roles were. As
> > such, I've written a script that we call from rc.local (after Puppet,
> > after Ambari agent) that uses the API to call out to the Ambari
> > server, and ask it to push the roles back down to the agent. The
> > server then pushes the roles and the related configs to the agent,
> > and everything comes back up and works. We've rebooted hundreds of
> > times now (across various nodes) and the approach works well.
> > Apparently support for doing this automatically is coming (having the
> > agent check in and get its roles again). It's been a few weeks since
> > I looked at all of this, so I might be mixing up words/order of
> > operations, apologies. I can likely share more details if you're
> > interested.
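> >
> > To give you a flavour (a stripped-down sketch, not our actual script -
> > CLUSTERNAME, the admin credentials and the port are placeholders), the
> > rc.local piece boils down to a couple of calls against the Ambari REST
> > API for the local host:
> >
> >   #!/bin/sh
> >   AMBARI="http://ambari.$(dnsdomainname):8080"
> >   HOST=$(hostname -f)
> >
> >   # ask the server to (re)install whatever it thinks belongs on this host
> >   curl -s -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
> >     -d '{"HostRoles": {"state": "INSTALLED"}}' \
> >     "$AMBARI/api/v1/clusters/CLUSTERNAME/hosts/$HOST/host_components"
> >
> >   # then start everything that is now installed on this host
> >   curl -s -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
> >     -d '{"HostRoles": {"state": "STARTED"}}' \
> >     "$AMBARI/api/v1/clusters/CLUSTERNAME/hosts/$HOST/host_components?HostRoles/state=INSTALLED"
> >
> > plus some wait/retry logic so the second call doesn't fire before the
> > installs from the first one have finished.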
> >
> >
> >> 3) Existing Hadoop
> >
> > We are doing a migration between versions of Hadoop, and haven't had
> > any issues, particularly with Ambari. Ambari hasn't formatted any
> > disks or anything like that - it sits above that level of things. I'd
> > suggest testing to confirm, but in our case, we're simply rebooting
> > the datanodes, switching from Ubuntu to CentOS + Ambari, and it leaves
> > all the data alone.
> >
> >> 4) Ubuntu 14.04 support
> >
> > We ditched Ubuntu and picked up CentOS 6.5 because of the Ambari
> > support. It wasn't much work, just figuring out how to make a PXE
> > image for CentOS vs Ubuntu.
> >
> >
> > Happy to talk about managing large clusters. I'm fairly new to it,
> > but there aren't enough people talking about the platform in the Big
> > Data community .. everyone wants to talk about the Data :)
> >
> > Cheers,
> >
> > t.
> >
> >
> > On Mon, Aug 4, 2014 at 11:15 AM, Martin Tippmann
> > <[email protected]> wrote:
> >>
> >> Hi!
> >>
> >> We are in the process of planning a new Hadoop 2.0 cluster. Ambari looks
> >> really great for this job but while looking through the documentation
> >> we stumbled upon a few questions:
> >>
> >> 1. Stateless images and Ambari
> >>
> >> We think about booting all machines in the Cluster using PXE +
> >> stateless images. This means the OS image will only be in memory and
> >> changes to /etc/ or files will vanish after a reboot. Is it possible
> >> to use Ambari in such a setup? In theory it should be enough to start
> >> the ambari-agent after booting the image and the agent will ensure
> >> that the configuration is correct.
> >>
> >> The idea is to use all the HDDs in the machines for HDFS storage and
> >> to avoid the maintenance burden of separate OS installs.
> >> Provisioning the OS via an automated install on the HDD is another
> >> option if stateless imaging is not compatible with Ambari.
> >>
> >> Can anyone here tell us what they are using? What are the best
> >> practices? We will have around 140 machines.
> >>
> >>
> >> 2. Existing Icinga/Nagios and Ganglia
> >>
> >> Is it possible to use an existing install of Ganglia and Nagios for
> >> Ambari? We already have a smaller Hadoop cluster with Ganglia and
> >> Icinga checks in place. We would like to avoid duplicate
> >> infrastructure and, if possible, run only one Icinga/Nagios server
> >> and only one Ganglia instance for everything.
> >>
> >> 3. Existing Hadoop
> >>
> >> Is it possible to migrate an existing HDFS to Ambari? We have 150TB
> >> of data in one HDFS that we would like to migrate, but due to the
> >> automated nature of the installation I'd like to ask whether it is
> >> safe to do so. Does Ambari format the disks on the nodes while
> >> installing? Or will the NameNode be formatted during installation?
> >>
> >> 4. Ubuntu 14.04 support
> >>
> >> We plan on using Ubuntu 14.04 LTS for the new cluster as we are only
> >> using Ubuntu in the department here. Is this a bad idea? Will there
> >> be support in the future? From looking through the requirements it
> >> shouldn't be a major problem, as Ambari is mostly Python and Java -
> >> but if it is not and will not be supported, we will probably have to
> >> change the OS.
> >>
> >>
> >> Thanks for any help!
> >>
> >> If you are already running a bigger Hadoop cluster I'd love to hear
> >> some advice and best-practices for managing the system. At the moment
> >> we plan on using xCat for provisioning the machines, Saltstack for
> >> configuration management and Ambari for managing the Hadoop
> >> configuration.
> >>
> >> regards
> >> Martin Tippmann
>
>
>