Does anybody volunteer to take this up? I can see whether I can find a resource where I work, but it's highly uncertain. It would need a bit of digging and design work to see how we would abstract the HBase interface in the most effective way. As mentioned below, Tephra did a good job at this and could serve as an example here. (Not dinging OMID; OMID does most of its work client side and doesn't need these abstractions.)

-- Lars

On Tuesday, January 14, 2020, 01:13:36 AM PST, István Tóth <st...@cloudera.com.invalid> wrote:

Yes, the HBase API signatures change between versions, so we need to compile each compat module against a specific HBase.
Whether I can define an internal compatibility API that is switchable at run (startup) time without a performance hit remains to be seen.

István

On Tue, Jan 14, 2020 at 3:21 AM Josh Elser <els...@apache.org> wrote:

> Agree that trying to wrangle branches is just too frustrating and error-prone.
>
> It would also be great if we could have a single Phoenix jar that works across HBase versions, but would not die on that hill :)
>
> On 12/20/19 5:04 AM, la...@apache.org wrote:
> > I said _provided_ they can be isolated easily :) (I meant it in the sense of assuming it's easy).
> > As I said though, Tephra has a similar problem and they did a really good job isolating HBase versions. We can learn from them. Sometimes they isolate the change only, and sometimes the class needs to be copied, but even then it's the one class that is copied, not another branch that needs to be kept in sync.
> >
> > This may also drive the desperately necessary refactoring of Phoenix to make these things easier to isolate, or to reduce the copying to a minimum. And we'd need to think through testing carefully.
> >
> > The branch per Phoenix and HBase version is too complex, IMHO. And the complex branch to HBase version mapping that Istvan outlines below confirms that.
> >
> > We should all take a brief look at the Tephra solution and see whether we can apply that. (And since Tephra is part of the fold now, perhaps someone can help there...?)
> > Cheers.
> > -- Lars
> >
> > On Thursday, December 19, 2019, 8:34:15 PM GMT+1, Geoffrey Jacoby <gjac...@gmail.com> wrote:
> >
> > Lars,
> >
> > I'm curious why you say the differences are easily isolated -- many of the core classes of Phoenix either directly inherit HBase classes or implement HBase interfaces, and those can vary between minor versions. (See my above example of a new coprocessor hook on BaseRegionObserver.)
> >
> > Geoffrey
> >
> > On Thu, Dec 19, 2019 at 10:54 AM la...@apache.org <la...@apache.org> wrote:
> >
> >> Yep. The differences are pretty minimal - provided they can be isolated easily.
> >> Tephra might be a pretty good model. It supports various versions of HBase in a single branch and has similar issues as Phoenix (coprocessors, etc).
> >> -- Lars
> >> On Thursday, December 19, 2019, 7:07:51 PM GMT+1, Josh Elser <els...@apache.org> wrote:
> >>
> >> To clarify, you think that compat modules are better than that separate-branches model in 4.x?
> >>
> >> On 12/18/19 11:29 AM, la...@apache.org wrote:
> >>> This is really hard to follow.
> >>>
> >>> I think we should do the same with HBase dependencies in Phoenix that HBase does with Hadoop dependencies.
> >>>
> >>> That is: We could have a maven module with the specific HBase version dependent code.
> >>> Btw. Tephra does the same... A module for HBase version specific code.
> >>> -- Lars
> >>>
> >>> On Tuesday, December 17, 2019, 10:00:31 AM GMT+1, Istvan Toth <st...@apache.org> wrote:
> >>>
> >>> What do you think about tying the minor releases to Hbase minor releases (not necessarily one-to-one)
> >>>
> >>> for example (provided 5.1 is 2020H1)
> >>>
> >>> 5.0.0 -> HB 2.0
> >>> 5.1.0 -> HB 2.2.2 (and whatever 2.1 is API compatible with it)
> >>> 5.1.x -> HB 2.2.x (treat as maintenance branch, no major new features)
> >>> 5.2.0 -> HB 2.3.0 (if released by that time)
> >>> 5.2.x -> HB 2.3.x (treat as maintenance branch, no major new features)
> >>> 5.3.0 -> HB 2.3.x (if there is no new major/minor Hbase release)
> >>> master -> latest released HBase version
> >>>
> >>> Alternatively, we could stick with the same HBase version for patch releases that we used for the first minor release.
> >>>
> >>> This would limit the number of branches that we have to maintain in parallel, while providing maintenance branches for older releases, and timely-ish Phoenix releases.
> >>>
> >>> The drawback is that users of old HBase versions won't get the latest features, on the other hand they can expect more polish.
> >>>
> >>> Istvan
> >>>
> >>> On Thu, Dec 12, 2019 at 8:05 PM Geoffrey Jacoby <gjac...@apache.org> wrote:
> >>>
> >>>> Since HBase 2.0 is EOM'ed, I'm +1 for not worrying about 2.0.x compatibility with the 5.x branch going forward.
> >>>>
> >>>> Given how coupled Phoenix is to the implementation details of HBase though, I'm not sure trying to abstract those away to keep one Phoenix branch per HBase major version is practical, however. At the least, it would be really complex.
> >>>>
> >>>> For example, in the new year I plan to return to working on the change data capture and Phoenix-level replication features, both of which depend on WALKey interface changes and a new RegionObserver coprocessor hook introduced in HBASE-22622 and HBASE-22623. This was released in HBase 1.5 and will be in the forthcoming HBase 2.3. While the HBase community is discussing EOMing 1.3 right now, and maybe 1.4 will go in the medium term, I don't see all pre-2.3 branch-2's getting deprecated anytime soon.
> >>>>
> >>>> So there will be at least two significant features that can only exist in some but not all of our 4.x and 5.x branches.
> >>>>
> >>>> Geoffrey
> >>>>
> >>>> On Thu, Dec 12, 2019 at 8:21 AM Josh Elser <els...@apache.org> wrote:
> >>>>
> >>>>> As much as possible, I'd like to avoid us getting into another situation with 5.x where we have multiple branches. My hope was/is that we can keep one Phoenix5 branch that works against an acceptable set of HBase branches.
> >>>>>
> >>>>> To me, that acceptable set of HBase branches is _a_ 2.1 and 2.2 release. I don't think we need to support all 2.1.x or 2.2.x, nor do I think we need to keep trying to maintain 2.0.x as it's already end of support by the HBase community.
> >>>>>
> >>>>> Thanks for updating your PR. I'll add this to my review queue.
> >>>>>
> >>>>> On 12/12/19 1:52 AM, Istvan Toth wrote:
> >>>>>> Hi!
> >>>>>>
> >>>>>> I'd like to start a conversation about supporting HBase 2.2. in the master branch.
> >>>>>>
> >>>>>> https://issues.apache.org/jira/browse/PHOENIX-5268 has a slightly out of date, but functional PR for HBase 2.2 support on master.
> >>>>>> (Please review and comment if you have the time, I'll try to update the PR in the next few days)
> >>>>>>
> >>>>>> The reason that it is not a straightforward decision to merge it is that applying that patch breaks compatibility with HBase 2.0.1, the current base.
> >>>>>>
> >>>>>> I can see the following outcomes:
> >>>>>>
> >>>>>> - Do nothing
> >>>>>> - Move master to HBase 2.2.2
> >>>>>> - Fork master to Hbase-2.0 and Hbase-2.2 branches
> >>>>>> - Build time compatibility modules
> >>>>>> - Run time compatibility modules
> >>>>>> - Something that I haven't thought of
> >>>>>>
> >>>>>>
> >>>>>> Doing nothing is obviously not a long term solution, as the current master doesn't work with any of the currently supported HBase branches, but we may postpone the inevitable.
> >>>>>>
> >>>>>> Simply moving master to HBase 2.2 is the most attractive solution from a pure developer POV, but there may be other considerations.
> >>>>>>
> >>>>>> Having multiple masters for 2.0 and 2.2 is simple from a code perspective, but maintaining two branches is a non-trivial amount of additional work. (See the 4.x situation)
> >>>>>>
> >>>>>> Moving the HBase version dependent stuff into a separate module, and choosing at build time is not pretty from a code POV, but saves us the hassle of maintaining multiple branches, while maintaining compatibility with multiple HBase versions, and can handle future API changes as well from a single branch. Doing something like this could have saved us the effort of maintaining three separate 4.x branches.
> >>>>>>
> >>>>>> I feel that since Phoenix is closely timed to HBase, and requires cluster-wide HBase configuration to work anyway, handling the different HBase versions from the same binary/JAR is not worth the effort.
> >>>>>>
> >>>>>> Please share your thoughts!
> >>>>>>
> >>>>>> regards
> >>>>>> Istvan
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
> >
>

--
*István Tóth* | Sr. Software Engineer
t. (36) 70 283-1788
st...@cloudera.com <https://www.cloudera.com>
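
For illustration only, here is a rough sketch of what the "run time compatibility modules" option from the thread might look like: an internal compatibility interface with one implementation per HBase line, chosen once at startup from the HBase version string. Every name here (`CompatSelector`, `CompatShim`, `Hbase20Shim`, `Hbase22Shim`, the module names) is hypothetical and not taken from Phoenix, Tephra, or HBase. In a real build each shim would presumably live in its own Maven module compiled against a specific HBase, which this single-file sketch does not attempt.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative only: every name in this file is hypothetical, not Phoenix or HBase API. */
final class CompatSelector {

    /** Internal compatibility API that the rest of the code base would program against. */
    interface CompatShim {
        /** Name of the (hypothetical) compat module backing this shim. */
        String moduleName();
        // Real methods would wrap the HBase calls whose signatures differ between versions.
    }

    /** Stand-in for a module compiled against HBase 2.0.x. */
    static final class Hbase20Shim implements CompatShim {
        @Override
        public String moduleName() {
            return "compat-hbase-2.0";
        }
    }

    /** Stand-in for a module compiled against HBase 2.2.x. */
    static final class Hbase22Shim implements CompatShim {
        @Override
        public String moduleName() {
            return "compat-hbase-2.2";
        }
    }

    // Registry from "major.minor" prefix to shim factory, checked in insertion order.
    private static final Map<String, Supplier<CompatShim>> REGISTRY = new LinkedHashMap<>();

    static {
        REGISTRY.put("2.0", Hbase20Shim::new);
        REGISTRY.put("2.2", Hbase22Shim::new);
    }

    private CompatSelector() {
    }

    /** Picks the shim whose "major.minor" prefix matches the given HBase version string. */
    static CompatShim forVersion(String hbaseVersion) {
        for (Map.Entry<String, Supplier<CompatShim>> e : REGISTRY.entrySet()) {
            String prefix = e.getKey();
            // Match "2.2" or "2.2.x", but not "2.20.x".
            if (hbaseVersion.equals(prefix) || hbaseVersion.startsWith(prefix + ".")) {
                return e.getValue().get();
            }
        }
        throw new IllegalArgumentException("No compat module for HBase " + hbaseVersion);
    }

    public static void main(String[] args) {
        // The version is passed in here; a real caller would read it from the cluster or classpath.
        String version = args.length > 0 ? args[0] : "2.2.2";
        CompatShim shim = forVersion(version);
        System.out.println("HBase " + version + " -> " + shim.moduleName());
    }
}
```

If the shim is resolved once at startup and kept in a final field, later calls go through an ordinary interface dispatch. Whether that is cheap enough on hot paths is exactly the open question raised above, and this sketch does not answer it.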