Hi Todd,
thanks for getting back to me. I was off on vacation myself, so I haven't
had a chance to play around with it over the last week. I think I have my
head around the idea now, but I am struggling to create the image. I am
currently using a python script called livecd-creator (or something like
that), but I do not have a good kickstart file for it, so I am pretty much
just fiddling around.
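For what it's worth, the direction I'm fiddling in is roughly the minimal
kickstart below (mostly adapted from the livecd-tools examples, so the repo
URL and sizes are placeholders rather than a known-good config):

  lang en_US.UTF-8
  keyboard us
  timezone UTC
  selinux --enforcing
  firewall --disabled
  part / --size 4096 --fstype ext4
  repo --name=base --baseurl=http://mirror.centos.org/centos/6/os/x86_64/

  %packages
  @core
  %end

I then build with something like "livecd-creator --config=centos-live.ks
--fslabel=hdp-live" and, if I understand the tooling correctly,
livecd-iso-to-pxeboot should turn the resulting ISO into a kernel/initrd
pair that PXE can serve. No idea yet whether that is the sane way to go
about it, hence the questions: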
How did you create your image? Do you have any documentation/scripts
you'd like to share?
Again thanks for replying. You are really helping me a lot.
Cheers,
alex.
On 08/19/2014 05:19 PM, Todd S wrote:
> Hi Alex,
>
> Sorry for the email confusion, I blame my email admin (sadly, it's me).
>
> We do run the OS from RAM only - no install on the local disk. Every
> boot is done by PXE, which then mounts the data disks and we're off to
> the races again. We only use Cobbler to hand out the PXE image, which
> is set up as a distro:
>
> Name : hdp-dn
> Architecture : x86_64
> TFTP Boot Files : {}
> Breed : redhat
> Comment :
> Fetchable Files : {}
> Initrd : /tftpboot/hdp-dn/initrd-hdp.gz
> Kernel : /tftpboot/hdp-dn/linux-hdp
> Kernel Options : {}
> Kernel Options (Post Install) : {}
> Kickstart Metadata : {}
> Management Classes : []
> OS Version : generic26
> Owners : ['admin']
> Red Hat Management Key : <<inherit>>
> Red Hat Management Server : <<inherit>>
> Template Files : {}
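>
> (In case it helps, a distro like that gets added with more or less the
> stock cobbler command - the flags below are reconstructed from the
> listing above rather than copied from our setup scripts:
>
>   cobbler distro add --name=hdp-dn --arch=x86_64 --breed=redhat \
>       --os-version=generic26 \
>       --kernel=/tftpboot/hdp-dn/linux-hdp \
>       --initrd=/tftpboot/hdp-dn/initrd-hdp.gz
>
> nothing fancy beyond that.)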
>
>
> Hopefully you got things working during my lag in replying .. vacation
> season is upon us!
>
> All the best,
>
> Todd.
>
>
>
> On Mon, Aug 11, 2014 at 12:02 PM, a. <[email protected]> wrote:
>
> Hi Todd!
>
> I've run into a snag getting the same setup to work, as we are looking
> into a "live system", i.e. one where the OS is loaded completely into
> RAM and not installed onto the hard drive before use. To be honest, I
> thought this was possible from what you wrote. Right now I have only
> managed to get Cobbler to provision CentOS onto the machines' hard
> drives via its installer, but that isn't what we want.
>
> Could you clarify: are you installing CentOS onto the machines, or are
> you booting them up live? You wrote "Our Datanodes are completely
> stateless", so that's where we got the idea from.
>
> If you managed to get a PXE live boot to work with Cobbler, I'd be very
> happy to hear how you did it. I'm pretty sure it should be doable, but
> right now I do not know how to set Cobbler up for that.
>
> Any info you could give us would be great!
>
> Also happy to talk to other folks who have the same problems we have
> over here. You are totally right, everyone wants to talk about the Data
> but never about the platforms. ;)
>
> Cheers,
> alex.
>
> P.S. I tried to email you directly before, but your address
> toddæhatescomputers.org was not valid. :(
>
> On 08/04/2014 06:08 PM, Martin Tippmann wrote:
> > ---------- Forwarded message ----------
> > From: Todd Snyder <[email protected]>
> > Date: 2014-08-04 17:54 GMT+02:00
> > Subject: Re: A few questions regarding Ambari: Stateless Imaging and
> > existing infrastructure
> > To: [email protected], [email protected]
> >
> >
> > re:
> >
> >> 1) Stateless images and Ambari
> >
> > We have just gone through the process of figuring this out, and it
> > does work great. We boot hundreds of nodes using a PXE server. Our
> > philosophy is to work in layers, so we bring up the 'platform' with
> > PXE, then use our config management to configure the node, then use
> > Ambari to configure the service. As such, we've baked our image with
> > CentOS 6.5, Puppet, Ambari Agent, and due to some frustration, we
> > ended up baking in the Yarn install as well. Our Datanodes are
> > completely stateless, except for one folder on a data disk where we
> > keep the Hadoop logs, so that if we have issues causing datanode
> > reboots, we still have logs to review.
> >
> > Everything works nicely - we use Cobbler to manage DHCP/PXE. The DHCP
> > server has been 'modified' (cfg file) to provide both the puppet
> > master and the ambari server for the environment. The DHCP exit hooks
> > configure the initial environment, then run Puppet, which further
> > configures it. This brings up Ambari, looking for 'ambari.FQDN',
> > which is (hopefully) the local cluster, assuming it's up. Ambari
> > checks in, gets its configs and tada!
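> >
> > The rough shape of the exit hook is something like the following
> > (illustrative only - the hook path and the dhclient variables are the
> > stock ones, the rest is a boiled-down sketch, not our production hook):
> >
> >   # /etc/dhcp/dhclient-exit-hooks -- sourced by dhclient-script
> >   if [ "$reason" = "BOUND" ] || [ "$reason" = "REBOOT" ]; then
> >       # first puppet run, pointed at the master in the DHCP-supplied
> >       # domain; puppet then lays down the ambari-agent config
> >       puppet agent --onetime --no-daemonize \
> >           --server "puppet.${new_domain_name}"
> >   fi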
> >
> > There are some caveats to "tada", however. The first time you join
> > the datanode to the cluster (ie: join Ambari agent to the server),
> > Ambari server will assign it its roles, bring up the service(s)
> > assigned (assuming you've assigned them on the server), and things are
> > grand. However, if you reboot the stateless server, Ambari (agent)
> > doesn't start automatically, and forgets what its roles were. As
> > such, I've written a script that we call from rc.local (after Puppet,
> > after Ambari agent) that uses the API to call out to the Ambari
> > server, and ask it to push the roles back down to the agent. The
> > server then pushes the roles and the related configs to the agent,
> > and everything comes back up and works. We've rebooted hundreds of
> > times now (across various nodes) and the approach works well.
> > Apparently support for doing this automatically is coming (having the
> > agent check in and get its roles again). It's been a few weeks since
> > I looked at all of this, so I might be mixing up words/order of
> > operations, apologies. I can likely share more details if you're
> > interested.
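> >
> > To give you a flavour (a stripped-down sketch, not our actual script -
> > CLUSTERNAME, the admin credentials and the port are placeholders), the
> > rc.local piece boils down to a couple of calls against the Ambari REST
> > API for the local host:
> >
> >   #!/bin/sh
> >   AMBARI="http://ambari.$(dnsdomainname):8080"
> >   HOST=$(hostname -f)
> >
> >   # ask the server to (re)install whatever it thinks belongs on this host
> >   curl -s -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
> >     -d '{"HostRoles": {"state": "INSTALLED"}}' \
> >     "$AMBARI/api/v1/clusters/CLUSTERNAME/hosts/$HOST/host_components"
> >
> >   # then start everything that is now installed on this host
> >   curl -s -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
> >     -d '{"HostRoles": {"state": "STARTED"}}' \
> >     "$AMBARI/api/v1/clusters/CLUSTERNAME/hosts/$HOST/host_components?HostRoles/state=INSTALLED"
> >
> > plus some wait/retry logic so the second call doesn't fire before the
> > installs from the first one have finished.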
> >
> >
> >> 3) Existing Hadoop
> >
> > We are doing a migration between versions of Hadoop, and haven't had
> > any issues, particularly with Ambari. Ambari hasn't formatted any
> > disks or anything like that - it sits above that level of things. I'd
> > suggest testing to confirm, but in our case, we're simply rebooting
> > the datanodes, switching from Ubuntu to CentOS + Ambari, and it leaves
> > all the data alone.
> >
> >> 4) Ubuntu 14.04 support
> >
> > We ditched Ubuntu and picked up CentOS 6.5 because of the Ambari
> > support. It wasn't much work, just figuring out how to make a PXE
> > image for CentOS vs Ubuntu.
> >
> >
> > Happy to talk about managing large clusters. I'm fairly new to it,
> > but there aren't enough people talking about the platform in the Big
> > Data community .. everyone wants to talk about the Data :)
> >
> > Cheers,
> >
> > t.
> >
> >
> > On Mon, Aug 4, 2014 at 11:15 AM, Martin Tippmann
> > <[email protected]> wrote:
> >>
> >> Hi!
> >>
> >> We are in the process of planning a new Hadoop 2.0 cluster. Ambari looks
> >> really great for this job but while looking through the documentation
> >> we stumbled upon a few questions:
> >>
> >> 1. Stateless images and Ambari
> >>
> >> We think about booting all machines in the Cluster using PXE +
> >> stateless images. This means the OS image will only be in memory and
> >> changes to /etc/ or files will vanish after a reboot. Is it possible
> >> to use Ambari in such a setup? In theory it should be enough to start
> >> the ambari-agent after booting the image and the agent will ensure
> >> that the configuration is correct.
> >>
> >> The idea is to use all the HDDs in the machines for HDFS storage and
> >> to avoid the maintenance burden of separate OS installs.
> >> Provisioning the OS via an automated install on the HDD is another
> >> option if stateless imaging is not compatible with Ambari.
> >>
> >> Can anyone here tell us what they are using? What are the best
> >> practices? We will have around 140 machines.
> >>
> >>
> >> 2. Existing Icinga/Nagios and Ganglia
> >>
> >> Is it possible to use an existing install of Ganglia and Nagios for
> >> Ambari? We already have a smaller Hadoop cluster with Ganglia and
> >> Icinga checks in place. We would like to avoid duplicate
> >> infrastructure and, if possible, run only one Icinga/Nagios server
> >> and only one Ganglia instance for everything.
> >>
> >> 3. Existing Hadoop
> >>
> >> Is it possible to migrate an existing HDFS to Ambari? We have 150TB
> >> of data in one HDFS that we would like to migrate, but due to the
> >> automated nature of the installation I'd like to ask whether it is
> >> safe to do so. Does Ambari format the disks on the nodes while
> >> installing? Or will the NameNode be formatted during installation?
> >>
> >> 4. Ubuntu 14.04 support
> >>
> >> We plan on using Ubuntu 14.04 LTS for the new cluster as we are only
> >> using Ubuntu in the department here. Is this a bad idea? Will there
> >> be support in the future? From looking through the requirements it
> >> shouldn't be a major problem, as Ambari is mostly Python and Java -
> >> but if it is not and will not be supported, we will probably have to
> >> change the OS.
> >>
> >>
> >> Thanks for any help!
> >>
> >> If you are already running a bigger Hadoop cluster I'd love to hear
> >> some advice and best-practices for managing the system. At the moment
> >> we plan on using xCat for provisioning the machines, Saltstack for
> >> configuration management and Ambari for managing the Hadoop
> >> configuration.
> >>
> >> regards
> >> Martin Tippmann
>
>
>