Re: [DISCUSS] Ambari Metron Configuration Management consequences and call to action

David Lyle Fri, 13 Jan 2017 06:17:20 -0800

Exactly: the reason for a change, plus who made it and when. So, your basic
authentication and audit.
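
A minimal sketch of that flow, assuming a hypothetical in-memory manager
standing in for Ambari and a plain dict standing in for zookeeper (none of
these names are real Ambari or Metron APIs): every push records who, when,
and why before the downstream copy is synced.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class Revision:
    """One audited config change: who made it, when, and why."""
    version: int
    user: str
    comment: str
    timestamp: datetime
    config: Dict[str, str]


@dataclass
class ConfigManager:
    """Hypothetical stand-in for Ambari: owns the history, then syncs out."""
    store: Dict[str, str] = field(default_factory=dict)  # stand-in for zookeeper
    history: List[Revision] = field(default_factory=list)

    def push(self, user: str, comment: str, config: Dict[str, str]) -> Revision:
        # Refuse anonymous or unexplained changes so the audit trail stays useful.
        if not user or not comment:
            raise ValueError("a change needs both an author and a reason")
        rev = Revision(
            version=len(self.history) + 1,
            user=user,
            comment=comment,
            timestamp=datetime.now(timezone.utc),
            config=dict(config),  # copy so later edits cannot rewrite history
        )
        self.history.append(rev)
        # Sync step: the downstream copy is replaced only after the audit
        # record exists.
        self.store.clear()
        self.store.update(rev.config)
        return rev


if __name__ == "__main__":
    mgr = ConfigManager()
    mgr.push("alice", "initial load", {"example.key": "1"})
    mgr.push("bob", "tune example value", {"example.key": "2"})
    for rev in mgr.history:
        print(rev.version, rev.user, rev.comment)
```

With this shape, a push that lacks an author or a comment is rejected, and
the history answers the "who and when" questions directly. Pushing straight
to the downstream store would skip that record entirely.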


On Fri, Jan 13, 2017 at 9:14 AM, Casey Stella <[email protected]> wrote:

> So, the reason to have the push operations push to ambari and then have
> ambari sync to zookeeper (btw: is this possible, do we have a hook like
> that in ambari?) is to make sure that users can specify a comment about
> what changed, correct?  If we pushed to zookeeper and had ambari listen
> (not sure it can do that either, btw) and update itself, we wouldn't be
> able to specify reasons.
>
> Casey
>
> On Fri, Jan 13, 2017 at 9:09 AM, David Lyle <[email protected]> wrote:
>
> > The only tooling I'm aware of that Ambari isn't already using is the
> > Stellar stuff. Is there more?
> >
> > Regardless, I'd always push from Ambari to zookeeper and let other tooling
> > talk to Ambari (Casey's first bullet). The only wrinkle is we have to
> > decide if we want to support manual installation. Fwiw, I do. If we did,
> > we'd need to do a bit of mode selection to support both. But the happy path
> > would be to do stuff (human or machine) via Ambari.
> >
> > -D...
> >
> >
> > On Fri, Jan 13, 2017 at 9:01 AM, Casey Stella <[email protected]> wrote:
> >
> > > Just piling on in support for Ambari.  I really, really don't like
> > > reinventing wheels, especially hard ones.  I guess my questions now are
> > > mainly around technical feasibility.  Seems to me that we can either:
> > >
> > >    - retrofit the tooling that currently manages configs to use the
> > >    Ambari APIs as well as pushing to zookeeper
> > >    - have a service listening to zookeeper and pushing changes to
> > >    ambari to keep it in sync
> > >    - something that I may have missed
> > >
> > > Each of those has pros and cons.  Thoughts?
> > >
> > > Casey
> > >
> > > On Fri, Jan 13, 2017 at 8:53 AM, David Lyle <[email protected]> wrote:
> > >
> > > > I'm in complete agreement with all the points Matt made. I think the way
> > > > forward should be to expose ALL user-modifiable configs via Ambari and let
> > > > Ambari actively manage them. We should keep the command line tools as the
> > > > backend and Ambari should continue to leverage them. This will allow manual
> > > > installation/management if desired and will ensure the command line scripts
> > > > are kept up to date.
> > > >
> > > > Fully leveraging Ambari has many beneficial effects. My top four:
> > > >    Provides proper revision control for the configurations
> > > >    Scales easily into things like rolling|quick upgrades and Kerberos
> > > >    support
> > > >    Provides other applications a RESTful endpoint to change configurations
> > > >    We get a force multiplier from the Ambari devs
> > > >
> > > > The working description Matt provided is completely consistent with my
> > > > understanding of how it works (derived from the Ambari docs, authoring
> > > > pieces of the mpack and interacting with some Ambari devs). Restarting
> > > > Ambari agent is the only circumstance I'm aware of outside of
> > > > save/start|restart that would initiate a re-write of the configs and cache;
> > > > there could be others.
> > > >
> > > > -D...
> > > >
> > > > On Thu, Jan 12, 2017 at 9:24 PM, Matt Foley <[email protected]> wrote:
> > > >
> > > > > Mike, could you try again on the image, please, making sure it is a simple
> > > > > format (gif, png, or jpeg)?  It got munched, at least in my viewer.  Thanks.
> > > > >
> > > > > Casey, responding to some of the questions you raised:
> > > > >
> > > > > I’m going to make a rather strong statement:  We already have a service
> > > > > “to intermediate and handle config update/retrieval”.
> > > > > Furthermore, it:
> > > > > - Correctly handles the problems of distributed services running on
> > > > > multi-node clusters.  (That’s a HARD problem, people, and we shouldn’t try
> > > > > to reinvent the wheel.)
> > > > > - Correctly handles Kerberos security. (That’s kinda hard too, or at least
> > > > > a lot of work.)
> > > > > - Does automatic versioning of configurations, and allows viewing,
> > > > > comparing, and reverting historical configs.
> > > > > - Has a capable REST API for all those things.
> > > > > It doesn’t natively integrate Zookeeper storage of configs, but there is a
> > > > > natural place to specify copy to/from Zookeeper for the files desired.
> > > > >
> > > > > It is Ambari.  And we should commit to it, rather than try to re-create
> > > > > such features.
> > > > > Because it has a good REST API, it is perfectly feasible to implement
> > > > > Stellar functions that call it.
> > > > > GUI configuration tools can also use the Ambari APIs, or better yet be
> > > > > integrated in an “Ambari View”. (E.g., see the “Yarn Capacity Scheduler
> > > > > Configuration Tool” example in the Ambari documentation, under “Using
> > > > > Ambari Views”.)
> > > > >
> > > > > Arguments are: Parsimony, Sufficiency, Not reinventing the wheel, and Not
> > > > > spending weeks and weeks of developer time over the next year reinventing
> > > > > the wheel while getting details wrong multiple times…
> > > > >
> > > > > Okay, off soapbox.
> > > > >
> > > > > Casey asked what the config update behavior of Ambari is, and how it will
> > > > > interact with changes made from outside Ambari.
> > > > > The following is from my experience working with the Ambari Mpack for
> > > > > Metron.  I am not otherwise an Ambari expert, so tomorrow I’ll get it
> > > > > reviewed by an Ambari development engineer.
> > > > >
> > > > > Ambari-server runs on one node, and Ambari-agent runs on each of the
> > > > > nodes.
> > > > > Ambari-server has a private set of py, xml, and template files, which
> > > > > together are used both to generate the Ambari configuration GUI, with
> > > > > defaults, and to generate configuration files (of any needed filetype) for
> > > > > the various Stack components.
> > > > > Ambari-server also has a database where it stores the schema related to
> > > > > these files, so even if you reach in and edit Ambari’s files, it will
> > > > > error out if the set of parameters or parameter names changes.  The
> > > > > historical information about configuration changes is also stored in the db.
> > > > > For each component (and in the case of Metron, for each topology), there
> > > > > is a python file which controls the logic for these actions, among others:
> > > > > - Install
> > > > > - Start / stop / restart / status
> > > > > - Configure
> > > > >
> > > > > It is actually up to this python code (which we wrote for the Metron
> > > > > Mpack) what happens in each of these API calls.  But the current code, and
> > > > > I believe this is typical of Ambari-managed components, performs a
> > > > > “Configure” action whenever you press the “Save” button after changing a
> > > > > component config in Ambari, and also on each Install and Start or Restart.
> > > > >
> > > > > The Configure action consists of approximately the following sequence (see
> > > > > disclaimer above :-)
> > > > > - Recreate the generated config files, using the template files and the
> > > > > actual configuration most recently set in Ambari
> > > > >   o Note this is also under the control of python code that we wrote, and
> > > > >   this is the appropriate place to push to ZK if desired.
> > > > > - Propagate those config files to each Ambari-agent, with a command to set
> > > > > them locally
> > > > > - The ambari-agents on each node receive the files and write them to the
> > > > > specified locations on local storage
> > > > >
> > > > > Ambari-server then whines that the updated services should be restarted,
> > > > > but does not initiate that action itself (unless of course the initiating
> > > > > action was a Start command from the administrator).
> > > > >
> > > > > Make sense?  It’s all quite straightforward in concept; there’s just an
> > > > > awful lot of stuff wrapped around that to make it all go smoothly and
> > > > > handle the problems when it doesn’t.
> > > > >
> > > > > There’s additional complexity in that the Ambari-agent also caches (on
> > > > > each node) both the template files and COMPILED forms of the python files
> > > > > (.pyc) involved in transforming them.  The pyc files incorporate some
> > > > > amount of additional info regarding parameter values, but I’m not sure of
> > > > > the form.  I don’t think that changes the above in any practical way unless
> > > > > you’re trying to cheat Ambari by reaching in and editing its files
> > > > > directly.  In that case, you also need to whack the pyc files (on each
> > > > > node) to force the data to be reloaded from Ambari-server.  Best solution
> > > > > is don’t cheat.
> > > > >
> > > > > Also, there may be circumstances under which the Ambari-agent will detect
> > > > > changes and re-write the latest version it knows of the config files, even
> > > > > without a Save or Start action at the Ambari-server.  I’m not sure of this
> > > > > and need to check with Ambari developers.  It may no longer happen,
> > > > > although I’m pretty sure change detection/reversion was a feature of early
> > > > > versions of Ambari.
> > > > >
> > > > > Hope this helps,
> > > > > --Matt
> > > > >
> > > > > ================================================
> > > > > From: Michael Miklavcic <[email protected]>
> > > > > Reply-To: "[email protected]" <[email protected]>
> > > > > Date: Thursday, January 12, 2017 at 3:59 PM
> > > > > To: "[email protected]" <[email protected]>
> > > > > Subject: Re: [DISCUSS] Ambari Metron Configuration Management consequences
> > > > > and call to action
> > > > >
> > > > > Hi Casey,
> > > > >
> > > > > Thanks for starting this thread. I believe you are correct in your
> > > > > assessment of the 4 options for updating configs in Metron. When using more
> > > > > than one of these options we can get into a split-brain scenario. A basic
> > > > > example is updating the global config on disk and using
> > > > > zk_load_configs.sh. Later, if a user decides to restart Ambari, the cached
> > > > > version stored by Ambari (it's in the MySQL or other database backing
> > > > > Ambari) will be written out to disk in the defined config directory, and
> > > > > subsequently loaded using zk_load_configs.sh under the hood. Any global
> > > > > configuration modified outside of Ambari will be lost at this point. This
> > > > > is obviously undesirable, but I also like the purpose and utility exposed
> > > > > by the multiple config management interfaces we currently have available. I
> > > > > also agree that a service would be best.
> > > > >
> > > > > For reference, here's my understanding of the current configuration
> > > > > loading mechanisms and their deps.
> > > > >
> > > > > <image>
> > > > >
> > > > > Mike
> > > > >
> > > > >
> > > > > On Thu, Jan 12, 2017 at 3:08 PM, Casey Stella <[email protected]> wrote:
> > > > >
> > > > > In the course of discussion on the PR for METRON-652
> > > > > <https://github.com/apache/incubator-metron/pull/415> something that I
> > > > > should definitely have understood better came to light, and I thought that
> > > > > it was worth bringing to the attention of the community to get
> > > > > clarification on and discuss just how we manage configs.
> > > > >
> > > > > Currently (assuming the management UI that Ryan Merriman submitted) configs
> > > > > are managed/adjusted via a couple of different mechanisms.
> > > > >
> > > > >    - zk_load_configs.sh: pushed and pulled from disk to zookeeper
> > > > >    - Stellar REPL: pushed and pulled via the CONFIG_GET/CONFIG_PUT
> > > > >    functions
> > > > >    - Ambari: initialized via the zk_load_configs script and then some of
> > > > >    them are managed directly (global config) and some indirectly
> > > > >    (sensor-specific configs).
> > > > >       - NOTE: Upon service restart, it may or may not overwrite changes on
> > > > >       disk or on zookeeper.  *Can someone more knowledgeable than me about
> > > > >       this describe precisely the semantics that we can expect on
> > > > >       service restart for Ambari? What gets overwritten on disk and what
> > > > >       gets updated in ambari?*
> > > > >    - The Management UI: manages some of the configs. *RYAN: Which configs
> > > > >    do we support here and which don't we support here?*
> > > > >
> > > > > As you can see, we have a mishmash of mechanisms to update and manage the
> > > > > configuration for Metron in zookeeper.  In the beginning the approach was
> > > > > just to edit configs on disk and push/pull them via zk_load_configs.
> > > > > Configs could be historically managed using source control, etc.  As we got
> > > > > more and more components managing the configs, we haven't taken care that
> > > > > they all work with each other in an expected way (I believe these are
> > > > > true; correct me if I'm wrong):
> > > > >
> > > > >    - If configs are modified in the management UI or the Stellar REPL and
> > > > >    someone forgets to pull the configs from zookeeper to disk before they
> > > > >    do a push via zk_load_configs, they will clobber the configs in
> > > > >    zookeeper with old configs.
> > > > >    - If the global config is changed on disk and the ambari service
> > > > >    restarts, it'll get reset with the original global config.
> > > > >    - *Ryan, in the management UI, if someone changes the zookeeper configs
> > > > >    from outside, are those configs reflected immediately in the UI?*
> > > > >
> > > > >
> > > > > It seems to me that we have a couple of options here:
> > > > >
> > > > >    - A service to intermediate and handle config update/retrieval and
> > > > >    tracking historical changes so these different mechanisms can use a
> > > > >    common component for config management/tracking and refactor the
> > > > >    existing mechanisms to use that service
> > > > >    - Standardize on exactly one component to manage the configs and regress
> > > > >    the others (that's a verb, right?   nicer than delete.)
> > > > >
> > > > > I happen to like the service approach, myself, but I wanted to put it up
> > > > > for discussion and hopefully someone will volunteer to design such a thing.
> > > > >
> > > > > To frame the debate, I want us to keep in mind a couple of things that may
> > > > > or may not be relevant to the discussion:
> > > > >
> > > > >    - We will eventually be moving to support kerberos so there should at
> > > > >    least be a path to use kerberos for any solution IMO
> > > > >    - There is value in each of the different mechanisms in place now.  If
> > > > >    there weren't, then they wouldn't have been created.  Before we try to
> > > > >    make this a "there can be only one" argument, I'd like to hear very good
> > > > >    arguments.
> > > > >
> > > > > Finally, I'd appreciate if some people might answer the questions I have in
> > > > > bold there.  Hopefully this discussion, if nothing else happens, will
> > > > > result in fodder for proper documentation of the ins and outs of each of
> > > > > the components bulleted above.
> > > > >
> > > > > Best,
> > > > >
> > > > > Casey
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
