Still not having looked at what Tephra does -- I'm intrigued by what Istvan has in-progress. Waiting to see what he comes up with would be my suggestion :)

On 1/14/20 1:12 PM, la...@apache.org wrote:
  Does somebody volunteer to take this up?
I can see whether I can find a resource where I work, but it's highly uncertain.
It would need a bit of digging and design work to see how we would abstract the 
HBase interface in the most effective way.
As mentioned below, Tephra did a good job at this and could serve as an example 
here. (Not dinging OMID; OMID does most of its work client side and doesn't 
need these abstractions.)
-- Lars

     On Tuesday, January 14, 2020, 01:13:36 AM PST, István Tóth 
<st...@cloudera.com.invalid> wrote:
Yes, the HBase API signatures change between versions, so we need to
compile each compat module against a specific HBase.

Whether I can define an internal compatibility API that is switchable at
run (startup) time without a performance hit remains to be seen.

István
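
A minimal sketch of what a compatibility API that is switchable at startup might look like, in plain Java. Everything here is an assumption made for illustration: `HBaseShim`, `HBase21Shim`, `HBase22Shim` and `ShimLoader` are hypothetical names, not Phoenix or HBase classes, and the version string is hard-coded rather than read from the running HBase. In a real layout each shim would live in its own module compiled against one specific HBase version, as described above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical internal compatibility API; core code would call only this. */
interface HBaseShim {
    /** HBase release line (major.minor) this shim is meant for. */
    String targetLine();
}

/** Stand-in for a module that would be compiled against HBase 2.1. */
final class HBase21Shim implements HBaseShim {
    @Override
    public String targetLine() { return "2.1"; }
}

/** Stand-in for a module that would be compiled against HBase 2.2. */
final class HBase22Shim implements HBaseShim {
    @Override
    public String targetLine() { return "2.2"; }
}

/** Picks a shim from a version string; meant to be called once at startup. */
final class ShimLoader {
    private static final Map<String, HBaseShim> SHIMS = new HashMap<>();
    static {
        SHIMS.put("2.1", new HBase21Shim());
        SHIMS.put("2.2", new HBase22Shim());
    }

    private ShimLoader() {}

    static HBaseShim forVersion(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unparseable HBase version: " + version);
        }
        // Only major.minor is used for the lookup; the patch level is ignored.
        HBaseShim shim = SHIMS.get(parts[0] + "." + parts[1]);
        if (shim == null) {
            throw new IllegalArgumentException("No compat shim for HBase " + version);
        }
        return shim;
    }
}

final class ShimDemo {
    public static void main(String[] args) {
        // Hard-coded for the sketch; a real system would ask the running HBase.
        HBaseShim shim = ShimLoader.forVersion("2.2.2");
        System.out.println("Selected shim for HBase " + shim.targetLine());
    }
}
```

Resolving the shim once and keeping the reference means each later call costs one interface dispatch. Whether that is acceptable on hot paths is exactly the open performance question and would need measuring.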

On Tue, Jan 14, 2020 at 3:21 AM Josh Elser <els...@apache.org> wrote:

Agree that trying to wrangle branches is just too frustrating and
error-prone.

It would also be great if we could have a single Phoenix jar that works
across HBase versions, but would not die on that hill :)

On 12/20/19 5:04 AM, la...@apache.org wrote:
   I said _provided_ they can be isolated easily :) (I meant it in the
sense of assuming it's easy).
As I said though, Tephra has a similar problem and they did a really
good job isolating HBase versions. We can learn from them. Sometimes they
isolate the change only, and sometimes the class needs to be copied, but
even then it's the one class that is copied, not another branch that needs
to be kept in sync.

This may also drive the desperately necessary refactoring of Phoenix to
make these things easier to isolate, or to reduce the copying to a minimum.
And we'd need to think through testing carefully.

The branch per Phoenix and HBase version is too complex, IMHO. And the
complex branch to HBase version mapping that Istvan outlines below confirms
that.

We should all take a brief look at the Tephra solution and see whether
we can apply that. (And since Tephra is part of the fold now, perhaps
someone can help there...?)
Cheers.
-- Lars

       On Thursday, December 19, 2019, 8:34:15 PM GMT+1, Geoffrey Jacoby <
gjac...@gmail.com> wrote:

   Lars,

I'm curious why you say the differences are easily isolated -- many of the
core classes of Phoenix either directly inherit HBase classes or implement
HBase interfaces, and those can vary between minor versions. (See my above
example of a new coprocessor hook on BaseRegionObserver.)

Geoffrey

On Thu, Dec 19, 2019 at 10:54 AM la...@apache.org <la...@apache.org>
wrote:

     Yep. The differences are pretty minimal - provided they can be isolated
easily.
Tephra might be a pretty good model. It supports various versions of HBase
in a single branch and has similar issues as Phoenix (coprocessors, etc).
-- Lars
       On Thursday, December 19, 2019, 7:07:51 PM GMT+1, Josh Elser <
els...@apache.org> wrote:

     To clarify, you think that compat modules are better than that
separate-branches model in 4.x?

On 12/18/19 11:29 AM, la...@apache.org wrote:
This is really hard to follow.

I think we should do the same with HBase dependencies in Phoenix that
HBase does with Hadoop dependencies.

That is:  We could have a maven module with the specific HBase version
dependent code.
Btw. Tephra does the same... A module for HBase version specific code.
-- Lars

         On Tuesday, December 17, 2019, 10:00:31 AM GMT+1, Istvan Toth <
st...@apache.org> wrote:

     What do you think about tying the minor releases to HBase minor releases
(not necessarily one-to-one)?

for example (provided 5.1 is 2020H1)

5.0.0 -> HB 2.0
5.1.0 -> HB 2.2.2 (and whatever 2.1 is API compatible with it)
5.1.x -> HB 2.2.x (treat as maintenance branch, no major new features)
5.2.0 -> HB 2.3.0 (if released by that time)
5.2.x -> HB 2.3.x (treat as maintenance branch, no major new features)
5.3.0 -> HB 2.3.x (if there is no new major/minor HBase release)
master -> latest released HBase version

Alternatively, we could stick with the same HBase version for patch
releases that we used for the first minor release.

This would limit the number of branches that we have to maintain in
parallel, while providing maintenance branches for older releases, and
timely-ish Phoenix releases.

The drawback is that users of old HBase versions won't get the latest
features, on the other hand they can expect more polish.

Istvan

On Thu, Dec 12, 2019 at 8:05 PM Geoffrey Jacoby <gjac...@apache.org>
wrote:

Since HBase 2.0 is EOM'ed, I'm +1 for not worrying about 2.0.x
compatibility with the 5.x branch going forward.

Given how coupled Phoenix is to the implementation details of HBase, though,
I'm not sure trying to abstract those away to keep one Phoenix branch per
HBase major version is practical. At the least, it would be really complex.

For example, in the new year I plan to return to working on the change data
capture and Phoenix-level replication features, both of which depend on
WALKey interface changes and a new RegionObserver coprocessor hook
introduced in HBASE-22622 and HBASE-22623. This was released in HBase 1.5
and will be in the forthcoming HBase 2.3. While the HBase community is
discussing EOMing 1.3 right now, and maybe 1.4 will go in the medium term,
I don't see all pre-2.3 branch-2's getting deprecated anytime soon.

So there will be at least two significant features that can only exist in
some but not all of our 4.x and 5.x branches.

Geoffrey
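
One way a single branch might cope with a hook that exists only in some HBase versions is to probe for it at startup and enable the dependent feature only when it is present. The sketch below is an illustration under stated assumptions: `OlderObserver`, `NewerObserver`, `newWalHook` and `HookProbe` are hypothetical stand-ins, not the real HBase classes or the hook from HBASE-22623. It only detects the hook; it does not address compiling code that references types missing from the older version.

```java
import java.lang.reflect.Method;

/** Stand-in for an observer base class in an older HBase release (hypothetical). */
class OlderObserver {
    public void existingHook() {}
}

/** Stand-in for the same class in a newer release that added a hook (hypothetical). */
class NewerObserver {
    public void existingHook() {}
    public void newWalHook() {}
}

final class HookProbe {
    private HookProbe() {}

    /** Returns true if the type exposes a public method with the given name. */
    static boolean hasHook(Class<?> type, String hookName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(hookName)) {
                return true;
            }
        }
        return false;
    }
}

final class HookProbeDemo {
    public static void main(String[] args) {
        // A feature that needs the hook would be switched on only when the probe succeeds.
        System.out.println("older has hook: " + HookProbe.hasHook(OlderObserver.class, "newWalHook"));
        System.out.println("newer has hook: " + HookProbe.hasHook(NewerObserver.class, "newWalHook"));
    }
}
```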

On Thu, Dec 12, 2019 at 8:21 AM Josh Elser <els...@apache.org> wrote:

As much as possible, I'd like to avoid us getting into another situation
with 5.x where we have multiple branches. My hope was/is that we can
keep one Phoenix5 branch that works against an acceptable set of HBase
branches.

To me, that acceptable set of HBase branches is _a_ 2.1 and 2.2 release.
I don't think we need to support all 2.1.x or 2.2.x, nor do I think we
need to keep trying to maintain 2.0.x as it's already end of support by
the HBase community.

Thanks for updating your PR. I'll add this to my review queue.

On 12/12/19 1:52 AM, Istvan Toth wrote:
Hi!

I'd like to start a conversation about supporting HBase 2.2 in the
master branch.

https://issues.apache.org/jira/browse/PHOENIX-5268 has a slightly out of
date, but functional PR for HBase 2.2 support on master. (Please review
and comment if you have the time, I'll try to update the PR in the next
few days)

The reason that it is not a straightforward decision to merge it is that
applying that patch breaks compatibility with HBase 2.0.1, the current
base.

I can see the following outcomes:

- Do nothing
- Move master to HBase 2.2.2
- Fork master to Hbase-2.0 and Hbase-2.2 branches
- Build time compatibility modules
- Run time compatibility modules
- Something that I haven't thought of


Doing nothing is obviously not a long term solution, as the current
master doesn't work with any of the currently supported HBase branches,
but we may postpone the inevitable.

Simply moving master to HBase 2.2 is the most attractive solution from a
pure developer POV, but there may be other considerations.

Having multiple masters for 2.0 and 2.2 is simple from a code
perspective, but maintaining two branches is a non-trivial amount of
additional work. (See the 4.x situation)

Moving the HBase version dependent stuff into a separate module, and
choosing at build time is not pretty from a code POV, but saves us the
hassle of maintaining multiple branches, while maintaining compatibility
with multiple HBase versions, and can handle future API changes as well
from a single branch. Doing something like this could have saved us the
effort of maintaining three separate 4.x branches.

I feel that since Phoenix is closely tied to HBase, and requires
cluster-wide HBase configuration to work anyway, handling the different
HBase versions from the same binary/JAR is not worth the effort.

Please share your thoughts!

regards
Istvan