Update on current HDP upgrade progress.

A separate DISCUSS thread has been spun up around how we should handle the
upgrade, backwards compatibility issues, semantic versioning, new release,
etc. here -
https://lists.apache.org/thread.html/21e9caf80c0ec4d9db4cb44423befa87cd1e42d327860369a6a13273@%3Cdev.metron.apache.org%3E

A few PRs from Ryan Merriman have recently gone into the FB:

   1. https://issues.apache.org/jira/browse/METRON-2169
   2. https://issues.apache.org/jira/browse/METRON-2225
   3. https://issues.apache.org/jira/browse/METRON-2224

He's also currently making some progress towards the Hadoop upgrade, but
will be afk for a bit, so this work is expected to be handed off in a
branch - https://issues.apache.org/jira/browse/METRON-2232

Nick Allen is currently working on reverting the 3 HBase FB PRs. He ran into
some merge conflicts with the Kafka PR and hit some classpath issues.

I merged master into the HBase PR for changing deprecated interfaces, but
ran into some classpath trouble with another recent PR change. That's been
resolved. Ran through a round of testing and all looks good. Looking for
some help from Anand and Mohan in getting a good multi-node deployment test
done on this PR before we merge it in. I've also reached out to Dale
Richardson for some additional testing validation -
https://github.com/apache/metron/pull/1483
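
For context on what that PR changes, the connection handling described in
the quoted message below can be sketched roughly as follows. This is only an
illustration under assumptions, not code from the PR: SketchConnection,
SketchTable, SketchConnectionFactory, CachingTableProvider, and
CountingConnectionFactory are hypothetical stand-ins so the example runs
without HBase on the classpath. In the real HBase client the analogous calls
are ConnectionFactory.createConnection(conf) and
Connection.getTable(TableName).

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for HBase's Table: lightweight, closed after each use.
interface SketchTable extends Closeable {
    String name();
    @Override void close();
}

// Hypothetical stand-in for HBase's Connection: heavyweight and long-lived.
interface SketchConnection extends Closeable {
    SketchTable getTable(String tableName);
    boolean isClosed();
    @Override void close();
}

// Hypothetical factory, standing in for ConnectionFactory.createConnection(conf).
interface SketchConnectionFactory {
    SketchConnection create();
}

// Hypothetical provider: owns one shared connection and hands out tables from it,
// so callers never manage the connection lifecycle themselves.
final class CachingTableProvider implements Closeable {
    private final SketchConnectionFactory factory;
    private SketchConnection connection; // guarded by "this"

    CachingTableProvider(SketchConnectionFactory factory) {
        this.factory = factory;
    }

    // Lazily opens the shared connection on first use, then reuses it.
    synchronized SketchTable getTable(String tableName) {
        if (connection == null || connection.isClosed()) {
            connection = factory.create();
        }
        return connection.getTable(tableName);
    }

    // Called once at shutdown; releases the shared connection.
    @Override
    public synchronized void close() {
        if (connection != null) {
            connection.close();
            connection = null;
        }
    }
}

// In-memory fake that only counts how many connections were opened.
final class CountingConnectionFactory implements SketchConnectionFactory {
    final AtomicInteger opened = new AtomicInteger();

    @Override
    public SketchConnection create() {
        opened.incrementAndGet();
        return new SketchConnection() {
            private boolean closed;

            @Override public SketchTable getTable(String tableName) {
                return new SketchTable() {
                    @Override public String name() { return tableName; }
                    @Override public void close() { /* nothing to release in the fake */ }
                };
            }

            @Override public boolean isClosed() { return closed; }
            @Override public void close() { closed = true; }
        };
    }
}
```

With this shape, callers keep asking the provider for a table by name much
as before, close each table when done, and close the provider once at
shutdown. Whether a single shared connection is adequate for a given
deployment remains an assumption to validate.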

We'll be looking to merge the current master branch into the FB in the next
day or so and get the 3 PRs in the FB reverted. Once that has settled
down, we will hopefully have PR 1483 in master, and master should then be
merged into the FB again. From there on, there's a bit of
remaining work to polish off HBase. And then it's getting through the final
Hadoop version upgrade with some final testing with Kerberos, etc.

Best,
Mike


On Fri, Aug 23, 2019 at 10:52 AM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Hi devs,
>
> We've been at this a while now, and I want to share an update with you.
> Here's the current HDP 3.1 upgrade Jira again for reference -
> https://issues.apache.org/jira/browse/METRON-2088. Nick, Ryan, and I had
> a number of offline conversations in the past week discussing some of what
> has been learned during the upgrade as well as how to best address some of
> the feedback originating in the recent HBase PR reviews (
> https://github.com/apache/metron/pull/1470#issuecomment-521033037).
>
> *Prolog*
>
> As this discuss thread shows, there's little debate from the community
> regarding the need to upgrade to HDP 3.1, and our current version of Storm
> is being EOL'ed (
> https://lists.apache.org/thread.html/7dbac4e50159ec899d1965505a9844f503130b1526a0e577c705959c@%3Cdev.metron.apache.org%3E).
> Metron has been running on the same version of Hadoop ever since its
> graduation to a TLP a few years ago. While there have been numerous
> individual component/service upgrades: ElasticSearch, Solr, and Storm, to
> name a few, this is our first major platform-wide upgrade.
>
> The biggest challenge we've run into so far has been dealing with the
> HBase client APIs that were deprecated in HBase 1.x and are now removed in 2.x.
> We were previously able to depend on HTableInterface and HTables fully
> encapsulating the connection management logic, but this was decoupled in
> the new API. It is now left to the user to manage connection lifecycle.
> Nick Allen spent time exploring various options for how to move forward
> with new abstractions that would accommodate a new connection strategy as
> well as work within our existing Storm architecture. Fully extracting logic
> for managing HBase connections independent of the Tables proved to have
> dramatic ripple effects throughout the architecture (again, refer to
> https://github.com/apache/metron/pull/1470#issuecomment-521033037 for
> more details). We also encountered major changes in the core classes that
> our MockHTable implementation relies on in the integration tests. These two
> problems resulted in interface changes that affected quite a bit of code.
>
> Reflecting on what we learned from this refactoring push, we explored
> other options to reduce the overall surface area impacted by the API
> change. The main thrust of our work seems to hinge on how to deal with the
> new connection management problem. We looked at options for how to leverage
> the existing TableProvider abstraction and decided to try out a compromise
> approach that allows us to:
>
>    1. Upgrade as much of the API as possible in the current version of
>    HBase against master
>    2. Manage connections within the TableProvider abstraction - this
>    would have an API feel that is similar to what had been encapsulated by the
>    HTableInterfaces we rely on currently, and remove a large chunk of the code
>    that had been necessary to finish the upgrade.
>
>
> *Reducing Scope*
>
> We have long-lived connections to HBase that don't need to be
> opened/closed and pooled in the traditional request/response lifecycle
> sense. We know at the time our application spins up how many processes and
> threads there will be - this is static for us. I put together a PR (
> https://issues.apache.org/jira/browse/METRON-2217) that migrates HTable
> and HTableInterface classes to the new Table API and encapsulates
> connection management within the TableProviders. We had some concerns about
> risks associated with managing the connections this way, as opposed to
> using a more robust connection management approach, so I reached out to the
> HBase community to get some guidance. The feedback we received suggests
> that managing our connections this way should be sufficient. And the HBase
> connection objects are threadsafe, to boot.
> https://lists.apache.org/thread.html/6b83cd7548efb8c37899063affc97e4c5ce823a13359a49b477e3c07@%3Cdev.hbase.apache.org%3E
>
> *A Revised Plan*
>
> The alternative HBase client/connection approach is promising, and it
> greatly reduces the overall architectural impact we will need to absorb
> alongside a major upgrade. The following is my proposal after some coding
> experiments and numerous conversations with Nick Allen, Ryan Merriman,
> Casey Stella, Otto Fowler, and James Sirota.
>
>
>    1. Do as much refactoring in small chunks as possible in master, e.g.
>    the first phase of the HBase API changes. Reducing the overall number of
>    variables changing all at the same time in the same place should reduce the
>    overall risk of the upgrade. If we prove stability with what we can in
>    master, the issues we run into in the FB should be easier to isolate and
>    solve.
>    2. Target the upgrade feature branch as being a place where we
>    primarily have to deal with changes due to classpath problems. There will
>    be some other necessary code changes, e.g. the HBase coprocessor; however,
>    the changes should be well-isolated and narrower in scope.
>    3. Ryan and I have had numerous conversations surrounding the Maven
>    dependency classpath issues that frequently come up at runtime anytime even
>    the smallest change to our dependency tree occurs. I won't go into those
>    details now, but you can see the discussion and history here (
>    https://github.com/apache/metron/pull/1436). While there's inherent
>    risk in making big changes to our dep management, there is also a
>    substantial upside - this PR makes finding classpath problems and solving
>    them substantially easier. This PR is ready to go in master and should
>    greatly speed up our ability to rectify any cp problems we encounter in the
>    feature branch.
>    4. Find an analog for our port of the MockHTable (
>    https://github.com/apache/metron/blob/master/metron-platform/metron-hbase/metron-hbase-common/src/test/java/org/apache/metron/hbase/mock/MockHTable.java)
>    in HBase 2.0.2. Nick has been working on a POC around this alongside my work
>    on the other API migration, and he has gotten to the point where the
>    integration tests pass. We had originally hoped this could be landed in
>    master, but the underlying low-level supporting classes have changed and
>    are not forwards compatible the way the Table interface
>    and ConnectionFactory class are. We plan to land this in the feature
>    branch.
>    5. Manage component version changes in an HDP 3.1 profile that gets
>    updated as PRs are submitted. This allows the modules to be upgraded on a
>    per-component basis, while still compiling and allowing tests to run,
>    without requiring a big bang all-or-nothing upgrade. We would then do a
>    final reconciliation and deprecation of the old Hadoop versions and profile
>    at the tail of the FB.
>    https://issues.apache.org/jira/browse/METRON-2223
>    6. Upgrade Kafka, Storm, Solr, and Zeppelin. PRs from Ryan are up in
>    the feature branch now.
>    7. Revert the HBase feature branch PRs that have already gone in. The
>    new approach removes the need for the HBase client changes that have
>    already gone in, so we should remove them before polishing off the HBase
>    upgrade.
>    8. Merge in master - including the Maven and HTable migration PRs
>    9. Finish HBase upgrade: coprocessor, integration test changes, data
>    management
>    10. Upgrade Hadoop
>    11. Final dependency reconciliation
>    https://issues.apache.org/jira/browse/METRON-2223
>    12. Acceptance testing
>    13. Beers, Profit
>
> I think I've covered the major tasks, but if I've missed anything please
> reach out.
>
> Best,
> Mike Miklavcic
>
>
>
> On Tue, Apr 23, 2019 at 8:18 AM Nick Allen <n...@nickallen.org> wrote:
>
>> FYI - I opened a ticket to serve as an epic for this work and the feature
>> branch.
>>
>> https://issues.apache.org/jira/browse/METRON-2088
>>
>> On Mon, Apr 22, 2019 at 3:32 PM Michael Miklavcic <
>> michael.miklav...@gmail.com> wrote:
>>
>> > +1 to starting a feature branch for this.
>> > +1 to removing our custom implementations if the newest revs are in fact
>> > stable now.
>> >
>> > Regarding the profile option - if it's possible to keep 2.6.5 for a bit and
>> > not require separate branches or code trees, this is probably OK.
>> > Otherwise, I'm inclined to take the approach we've taken in the past with
>> > every other upgrade and only support 1 version. I think we should prepare
>> > users for the likelihood that if/when we cut over, there will be no more
>> > updates to 2.6.x.
>> >
>> > I talked through this a bit with Nick and Ryan Merriman offline. There are
>> > a number of major version revs of components from HDP 2.6 to 3.x that are
>> > likely to have backwards compatibility problems. HBase is a big one that
>> > comes to mind - I noticed the HTable interface was deprecated while working
>> > through the coprocessor implementation, and Ryan found that it was removed
>> > completely in the new version. That affects our integration tests as well
>> > because we have a rather large mock implementation of HBase in use that is
>> > built around the removed API. We will either need to migrate to the new API
>> > or find an alternative approach to integration testing with HBase.
>> >
>> > I'll let Nick add more detail in the Epic/Jira and feature branch plan, but
>> > here is a sampling of some of what we can expect to require some work to
>> > upgrade:
>> >
>> >    - Ambari - the current MPack is incompatible with Ambari 2.7.3; however,
>> >    there isn't a breaking changes document, so we'll have to work through
>> >    this by brute force or hopefully find some help from the Ambari community.
>> >    - MaaS - YARN major change
>> >    - PCAP - HDFS, Kafka
>> >    - Indexing - HDFS, Solr
>> >    - All topologies - Kafka
>> >    - Stellar - HDFS, HBase
>> >    - Enrichment - HBase
>> >    - Enrichment Coprocessor (the enrichments listing) - HBase
>> >    - Integration tests - Kafka and HBase have changed considerably.
>> >    - UI, REST - Solr, HDFS, HBase
>> >    - Knox
>> >    - Kerberos (hopefully this is a kick-the-tires effort, though there is
>> >    some possible risk if Ambari and the individual components introduce
>> >    changes here)
>> >
>> > Fortunately, Zookeeper appears to have stayed the same across versions. It
>> > might be worthwhile to get a chart of the versions for each platform added
>> > to the epic Jira for reference while performing this work.
>> >
>> > Best,
>> > Mike
>> >
>> >
>> > On Mon, Apr 22, 2019, 12:50 PM Nick Allen <n...@nickallen.org> wrote:
>> >
>> > > We currently support running Metron on an HDP 2.6.5
>> > > <https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/comp_versions.html>
>> > > cluster. I'd like to get Metron updated to run in an HDP 3.1.0
>> > > <https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/release-notes/content/comp_versions.html>
>> > > cluster. This provides a number of significant updates to the core platform
>> > > components that we depend on like Kafka, HBase, Ambari, etc.
>> > >
>> > > ### Feature Branch
>> > >
>> > > I'd like to create a feature branch in which to do this.  This will take a
>> > > good amount of effort and multiple PRs. To avoid any impact to master as we
>> > > progress through this, a feature branch would make sense.
>> > >
>> > > If you have concerns or interest in this effort, please speak up.  Here are
>> > > some relevant discussion points based on what I know so far.
>> > >
>> > > ### CentOS 7
>> > >
>> > > CentOS 6 RPMs are no longer distributed for HDP 3.1.0, only CentOS 7 RPMs.
>> > > Because of this we will likely need to transition Full Dev over to CentOS
>> > > 7.  I don't see a downside to doing this since 6 is rather old and I assume
>> > > that most users run variants of 7 already anyways.
>> > >
>> > > ### HDP 2.6.5
>> > >
>> > > I'd like to try and make these changes backwards compatible with HDP 2.6.5
>> > > if possible, but only as long as that does not increase our ongoing
>> > > development burden.
>> > >
>> > > For example, if I can simply define a separate build profile for 3.1.0 and
>> > > things are generally backwards compatible, then I'm all for maintaining
>> > > support for 2.6.5.  On the other hand, I would not want to go as far as
>> > > maintaining separate master branches for each.  In my mind the ongoing cost
>> > > there is too high.
>> > >
>> > > ### HDP 2.5.6
>> > >
>> > > There are some workarounds in the code base that were introduced to support
>> > > HDP 2.5.6
>> > > <https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.6/bk_release-notes/content/comp_versions.html>
>> > > when we moved to HDP 2.6.5. There are some workarounds specifically for older
>> > > versions of Storm like 1.0.x. Rather than maintaining this going forward,
>> > > I'd prefer we remove this technical debt and not support anything older
>> > > than HDP 2.6.5.
>> > >
>> > >
>> > >
>> > >
>> > > Best,
>> > > Nick
>> > >
>> >
>>
>
