Re: [DISCUSS] Ambari Integration

Justin Leet Wed, 21 Sep 2016 06:41:21 -0700

Hi all,

I opened up a PR at
https://github.com/apache/incubator-metron/pull/266 for everyone to take a
look at and comment on.  For reference, the original JIRA is
https://issues.apache.org/jira/browse/METRON-427


It covers nearly all of the MVP that Casey outlined and should give everyone
a good starting point to build on.

There are more details on the ticket (and in the README in the code), but
I'll try to give the abbreviated version.

The PR builds an mpack that sets up Kafka topics, a MySQL instance with
GeoIP data loaded, the Storm topologies (parsers, enrichment, and
indexing), and output to Elasticsearch and HDFS.  It also exposes
management and a lot of configuration through Ambari.  The sensors are NOT
managed by Ambari.  It includes some testing instructions for trying things
out.

Additionally, this does not replace the current Ansible infrastructure.
There's definitely good discussion to be had around what interaction these
two approaches have.

It also includes a set of limitations / caveats that we'll want to build on
as we expand out of the MVP.  I'll include them here so that everyone has a
good idea of where the MVP ends (as of the PR as it stands right now)
and where people can contribute ideas or code if they have an interest.

   - MySQL install should be optional (and allow for using an existing
   instance).
   - MySQL should not be installed on a node already running a MySQL
   instance (e.g. an Ambari Server using MySQL as its database).
   - There is currently no hosting for RPMs remotely. They will have to be
   built locally.
   - Colocation of appropriate services should be enforced by Ambari. See
   'Installing Management Pack' section in the README for more details.
   - Storm's topology.classpath is not updated with the Metron service
   install and needs to be updated separately.
   - Several configuration parameters used when installing the Metron
   service could (and should) be grabbed from Ambari. Install will require
   them to be manually entered.
   - Need to handle upgrading Metron.


Thanks,
Justin

On Fri, Sep 16, 2016 at 11:32 AM, Justin Leet <justinjl...@gmail.com> wrote:

> I went ahead and created a Jira ticket mirroring Casey's discussion of the
> MVP.  Feel free to add anything of interest there, too.
>
> https://issues.apache.org/jira/browse/METRON-427
>
>
> Justin
>
>
> On Fri, Sep 16, 2016 at 9:39 AM, Justin Leet <justinjl...@gmail.com>
> wrote:
>
>
>> ---------- Forwarded message ----------
>> From: zeo...@gmail.com <zeo...@gmail.com>
>> Date: Thu, Sep 15, 2016 at 9:02 PM
>> Subject: Re: [DISCUSS] Ambari Integration
>> To: user@metron.incubator.apache.org
>> Cc: d...@metron.incubator.apache.org
>>
>>
>> Of course I would still need a full list of the repos, and submit proxy
>> rules for the Ambari box, but happy to hear it will alleviate the need for
>> making the scripts use proxies on the cluster nodes.
>>
>> Jon
>>
>> On Thu, Sep 15, 2016, 19:34 Nick Allen <n...@nickallen.org> wrote:
>>
>> > Jon - Installing Metron on an isolated network becomes much easier with
>> > Ambari.  You would just mirror the required RPM repositories.  You can
>> > then point Ambari to where your repo lives via the installation wizard.
>> > I've done quite a few installs via Ambari on an isolated network and it
>> > worked quite well.
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Sep 15, 2016 at 6:50 PM, zeo...@gmail.com <zeo...@gmail.com>
>> > wrote:
>> >
>> >> First of all - very much looking forward to this approach.  I'm not very
>> >> familiar with management packs, but I did read some of the documentation
>> >> in the link you sent.
>> >>
>> >> Not sure if this is already included in a "minimum viable product," but
>> >> at some point I think there needs to be a method of specifying proxies
>> >> and/or internal package repos.  I recently did a Metron 0.2.0 install
>> >> behind a proxy (hence METRON-409
>> >> <https://issues.apache.org/jira/browse/METRON-409>) and it took me a
>> >> semi-lengthy amount of time to (1) find all of the destinations I needed
>> >> to request openings for in the proxy, and (2) modify the ambari scripts
>> >> to appropriately use my proxies in the correct way.
>> >>
>> >> I also have a bit of a concern with upgrades and customizations in
>> >> general (Not just how it would work with mpacks).  I have not done any of
>> >> this to date, but I have rebuilt and redeployed a couple of times and I
>> >> needed to modify some of the metron code itself before build/deploy
>> >> (because of my concern with it getting overwritten on upgrade if I just
>> >> did it directly on the cluster).  I would like to see a method of putting
>> >> in install-specific files that modify or overwrite parts of the core
>> >> metron stack, like changes to znodes, parsers, etc.
>> >>
>> >> Regarding not managing sensors with Ambari, I agree.  I run a large bro
>> >> cluster and it is maintained via Puppet and various other mechanisms - no
>> >> need for Ambari to bleed over in my case.
>> >>
>> >> Thanks for the great work.
>> >>
>> >> Jon
>> >>
>> >> On Thu, Sep 15, 2016 at 5:10 PM Casey Stella <ceste...@gmail.com>
>> >> wrote:
>> >>
>> >>> Hi Everyone,
>> >>>
>> >>> I wanted to solicit some discussion around a feature that is fast
>> >>> approaching.  A major pain point in using Metron is installation.  Thus
>> >>> far our only approach to installation has been driven by the developer's
>> >>> needs to construct a virtual environment to test out changes, which led
>> >>> us to either an ansible installation or a manual installation.
>> >>>
>> >>> Because we want to make sure that the installation of Metron is as easy
>> >>> as possible, we have had some great contributions of an additional
>> >>> approach, installation via Apache Ambari directly.  Our ansible scripts
>> >>> currently rely on Ambari blueprints to set up Hadoop on the cluster that
>> >>> it is deploying on, so it is not a new dependency, but we're working
>> >>> toward a full Ambari management pack
>> >>> <https://cwiki.apache.org/confluence/display/AMBARI/Management+Packs>
>> >>> that will lay down the relevant topologies (parser, enrichment,
>> >>> indexing), configs, bits and their infrastructural dependencies (ES and
>> >>> mysql) and allow the topologies to be started and stopped as minimum
>> >>> viable product.
>> >>>
>> >>> The beginnings of this have started with:
>> >>>
>> >>>    - Ambari Service Definitions for the Parser topologies
>> >>>    <https://github.com/apache/incubator-metron/pull/218>
>> >>>    - Ambari Service Definition for the Indexing Topology
>> >>>    <https://github.com/apache/incubator-metron/pull/222>
>> >>>    - Ambari Service Definition for Elasticsearch
>> >>>    <https://github.com/apache/incubator-metron/pull/223>
>> >>>
>> >>> There will be more to come in the near-term to realize that vision, but
>> >>> we wanted to get some reactions.  Past minimum viable product, what do
>> >>> you guys think we should have and how should it look?
>> >>>
>> >>> Currently we are treating the domain of the ambari installation as from
>> >>> kafka to the indexes, which leaves the sensors unmanaged via ambari.  Is
>> >>> that a good decision?
>> >>>
>> >>> Are there other pain points that you have had around installation that
>> >>> you'd like to see addressed?
>> >>>
>> >>> The purpose of this discussion thread is to let you guys know that we
>> >>> will soon have a new way to install metron, but also to understand what
>> >>> the future requirements are so we, as a community, can address them.
>> >>>
>> >>> Best,
>> >>>
>> >>> Casey
>> >>>
>> >> --
>> >>
>> >> Jon
>> >>
>> >
>> >
>> >
>> > --
>> > Nick Allen <n...@nickallen.org>
>> >
>> --
>>
>> Jon
>>
>>
>
