Hi all,

Ran through multiple rounds of testing on
https://issues.apache.org/jira/browse/METRON-2217 (PR 1483) and it has now
been merged into master. I held off a couple days before merging it because
we were dealing with some other classpath issues in the feature branch with
the revert PR that have now been addressed. Status has been updated on the
Jira epic https://issues.apache.org/jira/browse/METRON-2088. Note - the PRs
that were reverted are simply left as "done." If someone has used a
different/better protocol for handling this in the past, please feel free
to chime in :). At this point, some of the final tasks will likely roll
into the Hadoop upgrade to 3.1.1 task. I think this is where we should
focus our final rounds of testing, both E2E and with Kerberos enabled
(fingers crossed).
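
For anyone catching up on what METRON-2217 changed, here is a rough,
self-contained sketch of the pattern described further down the thread: the
provider owns one long-lived connection and every table handle comes from
it. This is not the code from the PR. `Connection` and `Table` below are
hypothetical stand-ins so the example compiles without HBase on the
classpath, and `SharedConnectionTableProvider`, `FakeConnection`, and the
table names are made up for illustration. With the real client, I'd expect
the provider to hold an `org.apache.hadoop.hbase.client.Connection` from
`ConnectionFactory.createConnection(conf)` and to return
`connection.getTable(TableName.valueOf(name))`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-ins for the HBase client types (NOT the real classes).
interface Table extends AutoCloseable {
  String name();
  @Override void close();
}

interface Connection extends AutoCloseable {
  Table getTable(String tableName);
  boolean isClosed();
  @Override void close();
}

/** Owns one long-lived connection and hands out table handles from it. */
class SharedConnectionTableProvider implements AutoCloseable {
  private final Supplier<Connection> connectionFactory;
  private Connection connection; // created lazily, then reused by all callers

  SharedConnectionTableProvider(Supplier<Connection> connectionFactory) {
    this.connectionFactory = connectionFactory;
  }

  synchronized Table getTable(String tableName) {
    // Open the connection on first use, or again if it was closed.
    if (connection == null || connection.isClosed()) {
      connection = connectionFactory.get();
    }
    return connection.getTable(tableName);
  }

  @Override
  public synchronized void close() {
    if (connection != null) {
      connection.close();
      connection = null;
    }
  }
}

/** Illustrative fake that counts how many connections were opened. */
class FakeConnection implements Connection {
  static final AtomicInteger OPENED = new AtomicInteger();
  private boolean closed;

  FakeConnection() {
    OPENED.incrementAndGet();
  }

  public Table getTable(String tableName) {
    return new Table() {
      public String name() { return tableName; }
      public void close() { /* a table handle does not close the connection */ }
    };
  }

  public boolean isClosed() { return closed; }

  public void close() { closed = true; }
}

class TableProviderSketch {
  public static void main(String[] args) {
    try (SharedConnectionTableProvider provider =
             new SharedConnectionTableProvider(FakeConnection::new)) {
      // Table names are illustrative only.
      try (Table enrichment = provider.getTable("enrichment");
           Table threatIntel = provider.getTable("threatintel")) {
        System.out.println(enrichment.name() + ", " + threatIntel.name());
      }
      System.out.println("connections opened: " + FakeConnection.OPENED.get());
    }
  }
}
```

Closing a table handle leaves the shared connection open; only closing the
provider tears it down, which fits the long-lived, fixed-thread-count case
described in the thread.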

Best,
Mike



On Tue, Sep 3, 2019 at 11:41 AM James Sirota <jsir...@apache.org> wrote:

> Thanks for the status update, Mike
>
> 28.08.2019, 16:48, "Michael Miklavcic" <michael.miklav...@gmail.com>:
> > Update on current HDP upgrade progress.
> >
> > A separate DISCUSS thread has been spun up around how we should handle
> the
> > upgrade, backwards compatibility issues, semantic versioning, new
> release,
> > etc. here -
> >
> https://lists.apache.org/thread.html/21e9caf80c0ec4d9db4cb44423befa87cd1e42d327860369a6a13273@%3Cdev.metron.apache.org%3E
> >
> > A few PRs have recently gone into the FB from Ryan Merriman
> >
> >    1. https://issues.apache.org/jira/browse/METRON-2169
> >    2. https://issues.apache.org/jira/browse/METRON-2225
> >    3. https://issues.apache.org/jira/browse/METRON-2224
> >
> > He's also currently making some progress towards the Hadoop upgrade, but
> > will be afk for a bit, so this work is expected to be handed off in a
> > branch - https://issues.apache.org/jira/browse/METRON-2232
> >
> > Nick Allen is currently working on reverting the 3 HBase FB PRs. Ran into
> > some merge conflicts with the Kafka PR, and hit some classpath issues.
> >
> > I merged master into the HBase PR for changing deprecated interfaces, but
> > ran into some classpath trouble with another recent PR change. That's
> been
> > resolved. Ran through a round of testing and all looks good. Looking for
> > some help from Anand and Mohan in getting a good multi-node deployment
> test
> > done on this PR before we merge it in. I've also reached out to Dale
> > Richardson for some additional testing validation
> > https://github.com/apache/metron/pull/1483
> >
> > We'll be looking to merge the current master branch into the FB in the
> next
> > day or so and get the 3 PRs in the FB reverted. Once we have that settled
> > down, we'll have hopefully been able to get PR 1483 into master, which
> > should then be merged again into the FB. From there on, there's a bit of
> > remaining work to polish off HBase. And then it's getting through the
> final
> > Hadoop version upgrade with some final testing with Kerberos, etc.
> >
> > Best,
> > Mike
> >
> > On Fri, Aug 23, 2019 at 10:52 AM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> >>  Hi devs,
> >>
> >>  We've been at this a while now, and I want to share an update with you.
> >>  Here's the current HDP 3.1 upgrade Jira again for reference -
> >>  https://issues.apache.org/jira/browse/METRON-2088. Nick, Ryan, and I
> had
> >>  a number of offline conversations in the past week discussing some of
> what
> >>  has been learned during the upgrade as well as how to best address
> some of
> >>  the feedback originating in the recent HBase PR reviews (
> >>  https://github.com/apache/metron/pull/1470#issuecomment-521033037).
> >>
> >>  *Prolog*
> >>
> >>  As this discuss thread shows, there's little debate from the community
> >>  regarding the need to upgrade to HDP 3.1 and our current version of
> Storm
> >>  is being eol'ed (
> >>
> https://lists.apache.org/thread.html/7dbac4e50159ec899d1965505a9844f503130b1526a0e577c705959c@%3Cdev.metron.apache.org%3E
> ).
> >>  Metron has been running on the same version of Hadoop ever since its
> >>  graduation to a TLP a few years ago. While there have been numerous
> >>  individual component/service upgrades: ElasticSearch, Solr, and Storm,
> to
> >>  name a few, this is our first major platform-wide upgrade.
> >>
> >>  The biggest challenge we've run into so far has been dealing with the
> >>  HBase client APIs that were deprecated in HBase 1.x and now removed in
> 2.x.
> >>  We were previously able to depend on HTableInterface and HTables fully
> >>  encapsulating the connection management logic, but this was decoupled
> in
> >>  the new API. It is now left to the user to manage connection lifecycle.
> >>  Nick Allen spent time exploring various options for how to move forward
> >>  with new abstractions that would accommodate a new connection strategy
> as
> >>  well as work within our existing Storm architecture. Fully extracting
> logic
> >>  for managing HBase connections independent of the Tables proved to have
> >>  dramatic ripple effects throughout the architecture (again, refer to
> >>  https://github.com/apache/metron/pull/1470#issuecomment-521033037 for
> >>  more details). We also encountered major changes in the core classes
> used
> >>  by our MockHTable implementation used in the integration tests. These
> two
> >>  problems resulted in interface changes that affected quite a bit of
> code.
> >>
> >>  Reflecting on what we learned from this refactoring push, we explored
> >>  other options to reduce the overall surface area impacted by the API
> >>  change. The main thrust of our work seems to hinge on how to deal with
> the
> >>  new connection management problem. We looked at options for how to
> leverage
> >>  the existing TableProvider abstraction and decided to try out a
> compromise
> >>  approach that allows us to:
> >>
> >>     1. Upgrade as much of the API as possible in the current version of
> >>     HBase against master
> >>     2. Manage connections within the TableProvider abstraction - this
> >>     would have an API feel that is similar to what had been
> encapsulated by the
> >>     HTableInterfaces we rely on currently, and remove a large chunk of
> the code
> >>     that had been necessary to finish the upgrade.
> >>
> >>  *Reducing Scope*
> >>
> >>  We have long-lived connections to HBase that don't need to be
> >>  opened/closed and pooled in the traditional request/response lifecycle
> >>  sense. We know at the time our application spins up how many processes
> and
> >>  threads there will be - this is static for us. I put together a PR (
> >>  https://issues.apache.org/jira/browse/METRON-2217) that migrates
> HTable
> >>  and HTableInterface classes to the new Table API and encapsulates
> >>  connection management within the TableProviders. We had some concerns
> about
> >>  risks associated with managing the connections this way, as opposed to
> >>  using a more robust connection management approach, so I reached out
> to the
> >>  HBase community to get some guidance. The feedback we received suggests
> >>  that managing our connections this way should be sufficient. And the
> HBase
> >>  connection objects are threadsafe, to boot.
> >>
> https://lists.apache.org/thread.html/6b83cd7548efb8c37899063affc97e4c5ce823a13359a49b477e3c07@%3Cdev.hbase.apache.org%3E
> >>
> >>  *A Revised Plan*
> >>
> >>  The alternative HBase client/connection approach is promising, and it
> >>  greatly reduces the overall architectural impact we will need to absorb
> >>  alongside a major upgrade. The following is my proposal after some
> coding
> >>  experiments and numerous conversations with Nick Allen, Ryan Merriman,
> >>  Casey Stella, Otto Fowler, and James Sirota.
> >>
> >>     1. Do as much refactoring in small chunks as possible in master,
> e.g.
> >>     the first phase of the HBase API changes. Reducing the overall
> number of
> >>     variables changing all at the same time in the same place should
> reduce the
> >>     overall risk of the upgrade. Prove stability with what we can in
> master and
> >>     the issues we run into in the FB should be easier to isolate and
> solve.
> >>     2. Target the upgrade feature branch as being a place where we
> >>     primarily have to deal with changes due to classpath problems.
> There will
> >>     be some other necessary code changes, e.g. HBase coprocessor,
> however the
> >>     changes should be well-isolated and narrower in scope.
> >>     3. Ryan and I have had numerous conversations surrounding the Maven
> >>     dependency classpath issues that frequently come up at runtime
> anytime even
> >>     the smallest change to our dependency tree occurs. I won't go into
> those
> >>     details now, but you can see the discussion and history here (
> >>     https://github.com/apache/metron/pull/1436). While there's inherent
> >>     risk in making big changes to our dep management, there is also a
> >>     substantial upside - this PR makes finding classpath problems and
> solving
> >>     them substantially easier. This PR is ready to go in master and
> should
> >>     greatly speed up our ability to rectify any cp problems we
> encounter in the
> >>     feature branch.
> >>     4. Find an analog for our port of the MockHTable (
> >>
> https://github.com/apache/metron/blob/master/metron-platform/metron-hbase/metron-hbase-common/src/test/java/org/apache/metron/hbase/mock/MockHTable.java)
> in
> >>     HBase 2.0.2. Nick has been working on a POC around this alongside
> my work
> >>     on the other API migration and he has been able to get to a point
> with the
> >>     integration tests passing. We had originally hoped this could be
> landed in
> >>     master, but the underlying low-level supporting classes have
> changed and
> >>     are not forwards compatible the way the Table interface
> >>     and ConnectionFactory class are. We plan to land this in the
> feature branch.
> >>     5. Manage component version changes in an HDP 3.1 profile that gets
> >>     updated as PRs are submitted. This allows the modules to be
> upgraded on a
> >>     per-component basis, while still compiling and allowing tests to
> run,
> >>     without requiring a big bang all-or-nothing upgrade. We would then
> do a
> >>     final reconciliation and deprecation of the old Hadoop versions and
> profile
> >>     at the tail of the FB.
> >>     https://issues.apache.org/jira/browse/METRON-2223
> >>     6. Upgrade Kafka, Storm, Solr, and Zeppelin. PRs from Ryan are up in
> >>     the feature branch now.
> >>     7. Revert the HBase feature branch PRs that have already gone in.
> The
> >>     new approach removes the need for the HBase client changes that have
> >>     already gone in, so we should remove them before polishing off the
> HBase
> >>     upgrade.
> >>     8. Merge in master - including the Maven and HTable migration PRs
> >>     9. Finish HBase upgrade: coprocessor, integration test changes, data
> >>     management
> >>     10. Upgrade Hadoop
> >>     11. Final dependency reconciliation
> >>     https://issues.apache.org/jira/browse/METRON-2223
> >>     12. Acceptance testing
> >>     13. Beers, Profit
> >>
> >>  I think I've covered the major tasks, but if I've missed anything
> please
> >>  reach out.
> >>
> >>  Best,
> >>  Mike Miklavcic
> >>
> >>  On Tue, Apr 23, 2019 at 8:18 AM Nick Allen <n...@nickallen.org> wrote:
> >>
> >>>  FYI - I opened a ticket to serve as an epic for this work and the
> feature
> >>>  branch.
> >>>
> >>>  https://issues.apache.org/jira/browse/METRON-2088
> >>>
> >>>  On Mon, Apr 22, 2019 at 3:32 PM Michael Miklavcic <
> >>>  michael.miklav...@gmail.com> wrote:
> >>>
> >>>  > +1 to starting a feature branch for this.
> >>>  > +1 to removing our custom implementations if the newest revs are in
> fact
> >>>  > stable now.
> >>>  >
> >>>  > Regarding the profile option - if it's possible to keep 2.6.5 for a
> bit
> >>>  and
> >>>  > not require separate branches or code trees, this is probably OK.
> >>>  > Otherwise, I'm inclined to take the approach we've taken in the past
> >>>  with
> >>>  > every other upgrade and only support 1 version. I think we should
> >>>  prepare
> >>>  > users for the likelihood that if/when we cut over, there will be no
> more
> >>>  > updates to 2.6.x.
> >>>  >
> >>>  > I talked through this a bit with Nick and Ryan Merriman offline.
> There
> >>>  are
> >>>  > a number of major version revs of components from HDP 2.6 to 3.x
> that
> >>>  are
> >>>  > likely to have backwards compatibility problems. HBase is a big one
> that
> >>>  > comes to mind - I noticed the HTable interface was deprecated while
> >>>  working
> >>>  > through the coprocessor implementation, and Ryan found that it was
> >>>  removed
> >>>  > completely in the new version. That affects our integration tests as
> >>>  well
> >>>  > bc we have a rather large mock implementation of HBase in use that
> is
> >>>  built
> >>>  > around the removed API. We will either need to migrate to the new
> API or
> >>>  > find an alternative approach to integration testing with HBase.
> >>>  >
> >>>  > I'll let Nick add more detail in the Epic/Jira and feature branch
> plan,
> >>>  but
> >>>  > here is a sampling of some of what we can expect to require some
> work to
> >>>  > upgrade:
> >>>  >
> >>>  > - Ambari - the current MPack is incompatible with Ambari 2.7.3,
> >>>  however
> >>>  > there isn't a breaking changes document, so we'll have to work
> >>>  through
> >>>  > this
> >>>  > brute force or hopefully find some help from the Ambari community.
> >>>  > - MaaS - YARN major change
> >>>  > - PCAP - HDFS, Kafka
> >>>  > - Indexing - HDFS, Solr
> >>>  > - All topologies - Kafka
> >>>  > - Stellar - HDFS, HBase
> >>>  > - Enrichment - HBase
> >>>  > - Enrichment Coprocessor (the enrichments listing) - HBase
> >>>  > - Integration tests - Kafka and HBase have changed considerably.
> >>>  > - UI, REST - Solr, HDFS, HBase
> >>>  > - Knox
> >>>  > - Kerberos (hopefully this is a kick-the-tires effort, though there
> >>>  is
> >>>  > some possible risk if Ambari and the individual components introduce
> >>>  > changes here)
> >>>  >
> >>>  > Fortunately, Zookeeper appears to have stayed the same across
> versions.
> >>>  It
> >>>  > might be worthwhile to get a chart of the versions for each platform
> >>>  added
> >>>  > to the epic Jira for reference while performing this work.
> >>>  >
> >>>  > Best,
> >>>  > Mike
> >>>  >
> >>>  >
> >>>  > On Mon, Apr 22, 2019, 12:50 PM Nick Allen <n...@nickallen.org>
> wrote:
> >>>  >
> >>>  > > We currently support running Metron on an HDP 2.6.5
> >>>  > > <
> >>>  > >
> >>>  >
> >>>
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/comp_versions.html
> >>>  > > >
> >>>  > > cluster.
> >>>  > > I'd like to get Metron updated to run in an HDP 3.1.0
> >>>  > > <
> >>>  > >
> >>>  >
> >>>
> https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/release-notes/content/comp_versions.html
> >>>  > > >
> >>>  > > cluster.
> >>>  > > This provides a number of significant updates to the core platform
> >>>  > > components that we depend on like Kafka, HBase, Ambari, etc.
> >>>  > >
> >>>  > > ### Feature Branch
> >>>  > >
> >>>  > > I'd like to create a feature branch in which to do this. This will
> >>>  take
> >>>  > a
> >>>  > > good amount of effort and multiple PRs. To avoid any impact to
> master
> >>>  as
> >>>  > we
> >>>  > > progress through this, a feature branch would make sense.
> >>>  > >
> >>>  > > If you have concerns or interest in this effort, please speak up.
> >>>  Here
> >>>  > are
> >>>  > > some relevant discussion points based on what I know so far.
> >>>  > >
> >>>  > > ### CentOS 7
> >>>  > >
> >>>  > > CentOS 6 RPMs are no longer distributed for HDP 3.1.0, only
> CentOS 7
> >>>  > RPMs.
> >>>  > > Because of this we will likely need to transition Full Dev over to
> >>>  CentOS
> >>>  > > 7. I don't see a downside to doing this since 6 is rather old and
> I
> >>>  > assume
> >>>  > > that most users run variants of 7 already anyways.
> >>>  > >
> >>>  > > ### HDP 2.6.5
> >>>  > >
> >>>  > > I'd like to try and make these changes backwards compatible with
> HDP
> >>>  > 2.6.5
> >>>  > > if possible, but only as long as that does not increase our
> ongoing
> >>>  > > development burden.
> >>>  > >
> >>>  > > For example, if I can simply define a separate build profile for
> 3.1.0
> >>>  > and
> >>>  > > things are generally backwards compatible, then I'm all for
> >>>  maintaining
> >>>  > > support for 2.6.5. On the other hand, I would not want to go as
> far
> >>>  as
> >>>  > > maintaining separate master branches for each. In my mind the
> ongoing
> >>>  > cost
> >>>  > > there is too high.
> >>>  > >
> >>>  > > ### HDP 2.5.6
> >>>  > >
> >>>  > > There are some workarounds in the code base that were introduced to
> >>>  > support
> >>>  > > HDP
> >>>  > > 2.5.6
> >>>  > > <
> >>>  > >
> >>>  >
> >>>
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.6/bk_release-notes/content/comp_versions.html
> >>>  > > >
> >>>  > > when
> >>>  > > we moved to HDP 2.6.5. There are some workarounds specifically for
> >>>  older
> >>>  > > versions of Storm like 1.0.x. Rather than maintaining this going
> >>>  forward,
> >>>  > > I'd prefer we remove this technical debt and not support anything
> >>>  older
> >>>  > > than HDP 2.6.5.
> >>>  > >
> >>>  > >
> >>>  > >
> >>>  > >
> >>>  > > Best,
> >>>  > > Nick
> >>>  > >
> >>>  >
>
> -------------------
> Thank you,
>
> James Sirota
> PMC- Apache Metron
> jsirota AT apache DOT org
>
>
