Re: [DISCUSS] Ambari Integration

Otto Fowler Wed, 21 Sep 2016 10:55:11 -0700

Thanks Justin,

So this should just replace what is currently happening if you do the full
deployment, but you have not tested it as such?
I think the difference in the AWS deployment that I saw was how it set the
nodes to roles through the script.  Sorry if I overstated it.



On September 21, 2016 at 13:45:14, Justin Leet ([email protected])
wrote:

Hi Otto,

Couple things to dig into a bit.  Let me know if I stray off what your
question is, but I think this should give you the answer.

For the mpack, it's just taking a cluster without Metron and turning it
into a cluster running Metron (regardless of how the cluster itself was
provisioned).  I wasn't clear about it in my last message, but testing on
AWS wasn't really about making sure small_cluster configuration was
compatible with the mpack changes.  It was about making sure that we could
go from a cluster without Metron to one with Metron.  The deployment was on
AWS more to ensure we had enough memory to actually run the various
services (the Docker cluster was having issues before we'd dumped enough
extra services).
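
For anyone who wants to verify that transition themselves, here is a minimal
sketch (not part of the PR) of checking whether a cluster already has a Metron
service registered.  It assumes the JSON shape returned by Ambari's REST
endpoint GET /api/v1/clusters/<cluster>/services, and it assumes the mpack
registers its service under the name METRON.  The cluster name and the sample
response bodies are made up for illustration.

```python
import json


def installed_services(services_response):
    """Return the set of service names found in the JSON body of
    GET /api/v1/clusters/<cluster>/services (shape assumed, see above)."""
    body = json.loads(services_response)
    return {item["ServiceInfo"]["service_name"] for item in body.get("items", [])}


def has_service(services_response, name="METRON"):
    """True if the named service appears in the response.
    The default name "METRON" is an assumption about the mpack."""
    return name in installed_services(services_response)


def _sample(names, cluster="metron_cluster"):
    # Hypothetical response body, built only to exercise the parser.
    items = [{"ServiceInfo": {"cluster_name": cluster, "service_name": n}}
             for n in names]
    return json.dumps({"items": items})


BEFORE = _sample(["KAFKA", "STORM"])            # cluster without Metron
AFTER = _sample(["KAFKA", "STORM", "METRON"])   # same cluster after the mpack

if __name__ == "__main__":
    print("before:", has_service(BEFORE))
    print("after: ", has_service(AFTER))
```

Against a live cluster the response body would come from an authenticated
request to the Ambari server rather than from the sample strings.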

Someone with a little more experience could probably chime in here, but the
current AWS install actually does use the small_cluster configuration if
you look at the defaults.yml under amazon-ec2 in the metron-deployment
package.  The mpack setup is independent of the Ansible stuff for right
now. How closely those two end up living together (especially because
there definitely is some overlap) is a more involved discussion.

Thanks,
Justin


On Wed, Sep 21, 2016 at 9:55 AM, Otto Fowler <[email protected]>
wrote:

> Hi Justin,
>
> Are you testing this against the small_cluster configuration?  With the
> full install ( install ambari etc ) as well as the AWS install?
> The AWS install seems like its own path, and is essentially different
> from small_cluster.
>
> I myself am interested in the whole boat deployment - where I’m providing
> CentOS nodes with only os/ssh/host setups to be totally deployed.
>
> On September 21, 2016 at 09:41:05, Justin Leet ([email protected])
> wrote:
>
> Hi all,
>
> I opened up a PR at https://github.com/apache/incubator-metron/pull/266
> for everyone to take a look at and comment on.  For reference, the
> original JIRA is https://issues.apache.org/jira/browse/METRON-427
>
> It pretty much covers the MVP that Casey outlined and should give a pretty
> good starting point for everyone to build on.
>
> There are more details on the ticket (and in the README in the code), but
> I'll try to give the abbreviated version.
>
> The PR builds an mpack that sets up Kafka topics, a MySQL instance with
> GeoIP data loaded, the Storm topologies (parsers, enrichment, and
> indexing), and output to Elasticsearch and HDFS.  It also exposes
> management and a lot of configuration through Ambari.  The sensors are NOT
> managed by Ambari.  It includes some testing instructions for trying things
> out.
>
> Additionally, this does not replace the current Ansible infrastructure.
> There's definitely good discussion to be had around what interaction these
> two approaches have.
>
> It also includes a set of limitations / caveats that we'll want to build
> on as we expand out of the MVP.  I'll include them here so that everyone
> has a good idea of where the MVP ends (as of the PR as it stands right
> now) and where people can contribute ideas or code if they have an interest.
>
>    - MySQL install should be optional (and allow for using an existing
>    instance).
>    - MySQL should not be installed on a node already running a MySQL
>    instance (e.g. an Ambari Server using MySQL as its database).
>    - There is currently no hosting for RPMs remotely. They will have to
>    be built locally.
>    - Colocation of appropriate services should be enforced by Ambari. See
>    'Installing Management Pack' section in the README for more details.
>    - Storm's topology.classpath is not updated with the Metron service
>    install and needs to be updated separately.
>    - Several configuration parameters used when installing the Metron
>    service could (and should) be grabbed from Ambari. Install will require
>    them to be manually entered.
>    - Need to handle upgrading Metron
>
>
> Thanks,
> Justin
>
> On Fri, Sep 16, 2016 at 11:32 AM, Justin Leet <[email protected]>
> wrote:
>
>> I went ahead and created a Jira ticket mirroring Casey's discussion of
>> the MVP.  Feel free to add anything of interest there, too.
>>
>> https://issues.apache.org/jira/browse/METRON-427
>>
>>
>> Justin
>>
>>
>> On Fri, Sep 16, 2016 at 9:39 AM, Justin Leet <[email protected]>
>> wrote:
>>
>>
>>> ---------- Forwarded message ----------
>>> From: [email protected] <[email protected]>
>>> Date: Thu, Sep 15, 2016 at 9:02 PM
>>> Subject: Re: [DISCUSS] Ambari Integration
>>> To: [email protected]
>>> Cc: [email protected]
>>>
>>>
>>> Of course I would still need a full list of the repos, and submit proxy
>>> rules for the Ambari box, but happy to hear it will alleviate the need
>>> for
>>> making the scripts use proxies on the cluster nodes.
>>>
>>> Jon
>>>
>>> On Thu, Sep 15, 2016, 19:34 Nick Allen <[email protected]> wrote:
>>>
>>> > Jon - Installing Metron on an isolated network becomes much easier with
>>> > Ambari.  You would just mirror the required RPM repositories.  You can
>>> then
>>> > point Ambari to where your repo lives via the installation wizard.
>>> I've
>>> > done quite a few installs via Ambari on an isolated network and it
>>> worked
>>> > quite well.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > On Thu, Sep 15, 2016 at 6:50 PM, [email protected] <[email protected]>
>>> > wrote:
>>> >
>>> >> First of all - very much looking forward to this approach.  I'm not
>>> very
>>> >> familiar with management packs, but I did read some of the
>>> documentation in
>>> >> the link you sent.
>>> >>
>>> >> Not sure if this is already included in a "minimum viable product,"
>>> but
>>> >> at some point I think there needs to be a method of specifying proxies
>>> >> and/or internal package repos.  I recently did a Metron 0.2.0 install
>>> >> behind a proxy (hence METRON-409
>>> >> <https://issues.apache.org/jira/browse/METRON-409>) and it took me a
>>> >> semi-lengthy amount of time to (1) find all of the destinations I
>>> needed to
>>> >> request openings for in the proxy, and (2) modify the ambari scripts
>>> to
>>> >> appropriately use my proxies in the correct way.
>>> >>
>>> >> I also have a bit of a concern with upgrades and customizations in
>>> >> general (Not just how it would work with mpacks).  I have not done
>>> any of
>>> >> this to date, but I have rebuilt and redeployed a couple of times and
>>> I
>>> >> needed to modify some of the metron code itself before build/deploy
>>> >> (because of my concern with it getting overwritten on upgrade if I
>>> just did
>>> >> it directly on the cluster).  I would like to see a method of putting
>>> in
>>> >> install-specific files that modify or overwrite parts of the core
>>> metron
>>> >> stack, like changes to znodes, parsers, etc.
>>> >>
>>> >> Regarding not managing sensors with Ambari, I agree.  I run a large
>>> bro
>>> >> cluster and it is maintained via Puppet and various other mechanisms
>>> - no
>>> >> need for Ambari to bleed over in my case.
>>> >>
>>> >> Thanks for the great work.
>>> >>
>>> >> Jon
>>> >>
>>> >> On Thu, Sep 15, 2016 at 5:10 PM Casey Stella <[email protected]>
>>> wrote:
>>> >>
>>> >>> Hi Everyone,
>>> >>>
>>> >>> I wanted to solicit some discussion around a feature that is fast
>>> >>> approaching.  A major pain point in using Metron is installation.
>>> Thus far
>>> >>> our only approach to installation has been driven by the developer's
>>> needs
>>> >>> to construct a virtual environment to test out changes, which led
>>> us to
>>> >>> either an ansible installation or a manual installation.
>>> >>>
>>> >>> Because we want to make sure that the installation of Metron is as
>>> easy
>>> >>> as possible, we have had some great contributions of an additional
>>> >>> approach, installation via Apache Ambari directly.  Our ansible
>>> scripts
>>> >>> currently rely on Ambari blueprints to set up Hadoop on the cluster
>>> that it
>>> >>> is deploying on, so it is not a new dependency, but we're working
>>> toward a
>>> >>> full Ambari management pack
>>> >>> <https://cwiki.apache.org/confluence/display/AMBARI/Management+Packs>
>>> >>> that will lay down the relevant topologies (parser, enrichment,
>>> indexing),
>>> >>> configs, bits and their infrastructural dependencies (ES and mysql)
>>> and
>>> >>> allow the topologies to be started and stopped as a minimum viable
>>> product.
>>> >>>
>>> >>> The beginnings of this have started with:
>>> >>>
>>> >>>    - Ambari Service Definitions for the Parser topologies
>>> >>>    <https://github.com/apache/incubator-metron/pull/218>
>>> >>>    - Ambari Service Definition for the Indexing Topology
>>> >>>    <https://github.com/apache/incubator-metron/pull/222>
>>> >>>    - Ambari Service Definition for Elasticsearch
>>> >>>    <https://github.com/apache/incubator-metron/pull/223>
>>> >>>
>>> >>> There will be more to come in the near-term to realize that vision,
>>> but
>>> >>> we wanted to get some reactions.  Past minimum viable product, what
>>> do you
>>> >>> guys think we should have and how should it look?
>>> >>>
>>> >>> Currently we are treating the domain of the ambari installation as
>>> from
>>> >>> kafka to the indexes, which leaves the sensors unmanaged via
>>> ambari.  Is
>>> >>> that a good decision?
>>> >>>
>>> >>> Are there other pain points that you have had around installation
>>> that
>>> >>> you'd like to see addressed?
>>> >>>
>>> >>> The purpose of this discussion thread is to let you guys know that we
>>> >>> will soon have a new way to install metron, but also to understand
>>> what the
>>> >>> future requirements are so we, as a community, can address them.
>>> >>>
>>> >>> Best,
>>> >>>
>>> >>> Casey
>>> >>>
>>> >> --
>>> >>
>>> >> Jon
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Nick Allen <[email protected]>
>>> >
>>> --
>>>
>>> Jon
>>>
>>>
>>
>
