How we install depends on what we choose to keep around. My concern is
getting core Metron's scope down to a supportable level. Until we properly
limit the rest of our scope, this entire conversation is probably just a
thought experiment; it's putting the cart before the horse. I want to
emphasize this because we're discussing how to install something that, in
many ways, doesn't actually exist yet.

A lot of the install complexity comes from managing so many moving parts at
once (ES/Solr, the UI, Kerberos, etc.). If we cut that down, I'm not sure
we need a big installer to manage everything. Plenty of projects trust
their users to run convenience scripts and shell commands. Again, I
think this is an academic discussion until we figure out our overall
project direction.

On Tue, Apr 21, 2020 at 10:02 AM Nick Allen <n...@nickallen.org> wrote:

> Hi Tom -
>
> >  Do you or anyone have enough experience to judge if it is possible to
> leverage Ansible as a replacement to deploy a working cluster?
>
> Yes, I worked a lot on the Ansible mechanism in the early days of Metron.
> This was the primary deployment mechanism before we had the Ambari MPack.
>
> We found it very difficult to use Ansible to create a one-size-fits-all
> deployment solution. It's possible, but it is very hard to get a solution
> that doesn't need close monitoring and manual workarounds when used
> across environments of different sizes and shapes. In terms of
> usability, the Ambari MPack was a big step up in my opinion.
>
>
> >  perhaps a dedicated docker image that is designed to connect with other
> dockerized applications such as Storm, Kafka, etc.?
>
> Yes, I think that would be the way to go for a dev environment. We would be
> able to use community supported containers for most of our underlying
> platform needs. Unfortunately, this alone would not help anyone deploy
> Metron on a cluster.
>
>
>
>
> On Tue, Apr 21, 2020 at 9:08 AM Yerex, Tom <tom.ye...@ubc.ca> wrote:
>
> > Hi Nick,
> >
> > I see there is a lot of work done using Ansible in the repository. Do you
> > or anyone have enough experience to judge if it is possible to leverage
> > Ansible as a replacement to deploy a working cluster?
> >
> > Now that I am typing this out, I wonder if Docker might be a solution
> > that would work? I don't have much experience with Docker; perhaps a
> > dedicated Docker image that is designed to connect with other dockerized
> > applications such as Storm, Kafka, etc.?
> >
> > --Tom.
> >
> > On 2020-04-17, 11:27 AM, "Nick Allen" <n...@nickallen.org> wrote:
> >
> >     This is a good discussion and one that I haven't fully grappled with
> >     in my own mind yet. I'll have more to add, but I just want to chime
> >     in on the topic of Ambari at this point.
> >
> >     ### Ambari and the Paywall
> >
> >     The problem with Ambari is that its installation mechanism requires a
> >     repository of compiled packages (RPMs, DEBs, etc.). To install the
> >     underlying platform dependencies (like Kafka, HBase, Storm, ZK, etc.),
> >     we relied on binary packages that were made freely available by
> >     Cloudera/Hortonworks. As of this past January, those packages are now
> >     behind a paywall.
> >
> >     Due to the paywall, installing your own HDP cluster with Ambari is
> >     now effectively dead. I am not sure if legacy versions of Kafka,
> >     HBase, Storm, etc. will continue to be freely available, but even if
> >     so, we cannot continue to rely on this mechanism if new versions and
> >     security updates will not be made available.
> >
> >     The Apache Metron project does not publish compiled binaries or
> >     packages either. We do make the code freely available to allow users
> >     to build and publish their own Metron packages. But even with this
> >     capability, unless you have a means to install the underlying
> >     platform dependencies via Ambari, installing Metron with Ambari has
> >     little value.
> >
> >     Unfortunately, I don't see a feasible path forward for Metron's
> >     Ambari MPack.
> >
> >     ### Dev Environment
> >
> >     This impacts not only the users of Apache Metron but also its
> >     contributors. Our primary development environment relies on that
> >     Ambari MPack. To continue development on any of the components of
> >     Apache Metron, we would need to build an alternative development
> >     environment that can function despite the paywall. That could take
> >     many shapes, but in my opinion it would be a blocker for continuing
> >     any development on Apache Metron, unfortunately.
> >
> >     Please do let me know if anyone disagrees or can think of an
> >     alternative approach that would allow the current Ambari MPack to
> >     remain viable.
> >
> >     On Thu, Apr 16, 2020 at 4:34 PM Dima Kovalyov <dimdr...@gmail.com> wrote:
> >
> >     >   - Dropping Ambari.
> >     >
> >     > I like the progress that Apache made with Ambari in 2.7, and I don't
> >     > know a better installer/manager for all the services (we use other
> >     > Hadoop ecosystem services besides Metron).
> >     >
> >     > Sometimes it's buggy: agents get stuck, the server needs a reboot from
> >     > time to time, and mpacks break some functionality. But overall I feel
> >     > this is the direction for central management and orchestration.
> >     >
> >     > - Dima
> >     >
> >     > On Wed, Apr 15, 2020, 12:45 Justin Leet <justinjl...@gmail.com> wrote:
> >     >
> >     > > This is a bit off the top of my head, but I'd agree with pretty
> >     > > much all of the points on what's bringing a lot of overhead. There's
> >     > > probably also a worthwhile discussion about what value we're
> >     > > shooting for the project to provide to people, which influences
> >     > > what stays/goes.
> >     > >
> >     > > Thinking out loud a bit
> >     > >
> >     > >    - Dropping Storm and moving to Spark drops the very hard to
> >     > >    tune/manage/troubleshoot Storm.
> >     > >    - Dropping the UIs (and making SQL the external interface)
> >     > >    pretty much implies dropping the REST APIs and ES/Solr. ES/Solr
> >     > >    have been a giant source of dev heartache on the project and
> >     > >    they exist primarily for the real time use case. People can
> >     > >    build whatever UIs or use existing tools against
> >     > >    Parquet/Hive/whatever.
> >     > >    - Dropping Ambari. It's a complex beast to install because of
> >     > >    how many components we have. Dropping the above makes our
> >     > >    install much easier and should alleviate the need for a complex
> >     > >    installer.
> >     > >
> >     > > At that point, we're basically left with
> >     > >
> >     > >    - Some Spark for parse -> enrich -> output
> >     > >    - The profiler
> >     > >    - Stellar
> >     > >    - Probably some other misc stuff (sensors, Bro Kafka plugin, etc.)
> >     > >
> >     > > At a glance, that seems almost an order of magnitude smaller than
> >     > > what we currently try to handle.
> >     > >
> >     > > I'm not really sure what an appropriate way to handle the profiler
> >     > > is. I've barely touched the code for it, so anything I say is a
> >     > > vague guess.
> >     > >
> >     > > On Wed, Apr 8, 2020 at 7:38 PM Yerex, Tom <tom.ye...@ubc.ca> wrote:
> >     > >
> >     > > > To me Metron is big and broad in the scope of technology
> >     > > > required to get it running. If things were more modular, that
> >     > > > would go a long way to reducing the learning curve or at least
> >     > > > putting it into smaller bites (and it might encourage more
> >     > > > people to get involved).
> >     > > >
> >     > > > If the UI were an add-on module in another project, it would
> >     > > > have made it easier for me, and it could also encourage my
> >     > > > hypothetical buddy who is an expert web developer to get
> >     > > > involved, since he could focus on the web-ui module instead of
> >     > > > trying to tackle all the other pieces that are probably not
> >     > > > part of his bailiwick.
> >     > > >
> >     > > > Stellar is very intriguing, maybe that is not unique to Metron?
> >     > > > The architecture of Metron with respect to parsing, enriching,
> >     > > > etc., makes a lot of sense to anyone I talk with. These two
> >     > > > aspects of Metron seem like standout examples that make for a
> >     > > > powerful platform to develop on.
> >     > > >
> >     > > > Thanks for continuing this discussion,
> >     > > >
> >     > > > Tom.
> >     > > >
> >     > > >
> >     > > > On 2020-04-08 15:32:46-07:00 Casey Stella wrote:
> >     > > >
> >     > > > As far as I know there is no minimum bar of development
> >     > > > activity to keep a project open. I think we would all be
> >     > > > grateful for any investment that you or your organization would
> >     > > > want to make.
> >     > > > It also occurs to me that your observation is absolutely spot
> >     > > > on: we have a LOT of moving parts.
> >     > > > I see some deficiencies here:
> >     > > >
> >     > > >   *   We depend on a lot of the various Hadoop ecosystem
> >     > > >       projects and they have to work together very precisely:
> >     > > >      *   This makes for a system that is hard to install.
> >     > > >      *   This also makes for a system which is hard to
> >     > > >          tune/manage
> >     > > >   *   We have a large surface area of coverage
> >     > > >      *   We have an installer, backend system and front-end UI,
> >     > > >          which stretches our developers a bit thin, especially
> >     > > >          since there isn't even interest in those systems
> >     > > >
> >     > > > Perhaps a reconsideration of the scope and technologies that we
> >     > > > use would be merited? If we were to decide to, for instance:
> >     > > >
> >     > > >   *   Consolidate scope: focus on a viable backend/API rather
> >     > > >       than a UI
> >     > > >   *   Consolidate technology: reposition ourselves on top of
> >     > > >       Spark as a consolidated streaming/batch system
> >     > > >   *   Make SQL our external interface: write out to Parquet +
> >     > > >       the Hive metastore and let users pin up Presto tables or
> >     > > >       Hive tables as they see fit
> >     > > >
> >     > > > This might reduce some of our surface area and make it more
> >     > > > viable to get started?
> >     > > > Anyway, just some thoughts.
> >     > > > Casey
> >     > > >
> >     > > > On Wed, Apr 8, 2020 at 6:20 PM Yerex, Tom <tom.ye...@ubc.ca> wrote:
> >     > > > Hi Casey,
> >     > > >
> >     > > > I'm new here and new to contributing to an open source project.
> >     > > > Thus far my contribution has been questions; however, the steep
> >     > > > learning curve has had me working to understand all the moving
> >     > > > parts for the last 18 months, and I see that as a big investment
> >     > > > by my organization.
> >     > > >
> >     > > > What is a level that would be viable?
> >     > > >
> >     > > > If my organization were to contribute, I don't know that it
> >     > > > would be soon enough or at the volume that is recognized as
> >     > > > viable, which is why I ask the question.
> >     > > >
> >     > > >
> >     > > > On 2020-04-08 15:05:51-07:00 Casey Stella wrote:
> >     > > >
> >     > > > Hi all,
> >     > > >
> >     > > > When composing the board report today, I realized that we have
> >     > > > effectively had no development in the last quarter on this
> >     > > > project. Please be aware that I say this without a shred of
> >     > > > blame or judgement (especially so considering I have not
> >     > > > contributed in a long time). That being said, I would like to
> >     > > > pose the question to the community:
> >     > > >
> >     > > > Do we feel that this project is viable? If so, how are we going
> >     > > > to spur new contributions? If not, then should we begin the
> >     > > > process to fold the project?
> >     > > >
> >     > > >
> >     > > > Best,
> >     > > >
> >     > > > Casey
> >     > > >
> >     > > >
> >     > >
> >     >
> >
>
