Re: [DISCUSS] HBase 3 release plans

Andrew Purtell Mon, 17 Feb 2020 12:31:13 -0800

The comment on the JRuby shell sounded like a proposal to replace it based on
personal dislike. I apologize given your clarification.


JRuby may not be loved, but it’s the only basis for scripted automation we have
right now, and any proposal to replace it raises compatibility concerns.

We’ve been good about building the shell functions on admin APIs so 
alternatives can share the same foundation. 


> On Feb 17, 2020, at 11:32 AM, Lars Francke <[email protected]> wrote:
> 
> On Mon, Feb 17, 2020 at 7:53 PM Andrew Purtell <[email protected]>
> wrote:
> 
>> On the topic of JRuby, we have operational tooling based on this precisely
>> because HBase offers a shell built on it where the shell functions
>> themselves can be used as functions, per design. You want us to throw this
>> all away because you don’t like JRuby? Or want to play with some new
>> language? Have a care for the users of the software, please.
>> 
> 
> If your last sentence implies I don't care for the users then that's not
> true (and actually a bit of an unfair assumption) as I am a user myself and
> we have dozens of clients using HBase and they are users and I don't know a
> _single_ one who likes the JRuby based shell. And yes it's as simple as
> that: Ruby is unfamiliar to most people I deal with and lots of people (me
> included) have trouble remembering the syntax especially when it comes to
> extensions (e.g. Formatters etc.) so an alternative would be more than
> welcome.
> 
> But to your actual point: It's a valid one so maybe a rewrite isn't
> necessary but we could start working on an alternative. An alternative to
> the existing one doesn't mean we have to throw out the old one.
> 
> 
>>> A user can upgrade their runtime to 11 and deploy software built on 8
>> at their option at any time. Staying on 8 doesn’t affect this one way or
>> another.
>> 
> 
> You're correct and in our reality, this rarely happens if there is no
> pressure to do so.
> 
>> 
>>> Are we in the business of Java language evangelism or adoption? No. So
>> as PMC I am -1 on taking on such orthogonal missions.
>> 
> 
> Fine with me. I just listed reasons why for me and our customers this would
> be a good step, your view is different. This was just one of the reasons I
> listed and the only non-technical one.
> As a committer, user, and consultant for/supporter of/developer on HBase, I'm +1
> on this and I firmly believe this would be a good thing.
> I don't, however, have a very strong opinion here. I just wanted to list a
> few points in favor - you don't agree with any/all of them.
> 
>> 
>>> When you run ZK 3.4’s test suite on 11 about 40% fail. An example.
>> Hadoop 2 versions also have test issues. This doesn’t mean they won’t
>> operate acceptably at runtime on 11 as part of a holistic HBase
>> installation, but it does raise red flags. If we adopt Java 11 all of our
>> transitive dependencies should be testing out green too. That means we
>> upgrade those dependencies or fix them. By “we” I mean the operators who
>> have real world deployments, if nobody else, should HBase devs shirk the
>> responsibility. I would define our mission, or at least the essential
>> mission of users and operators, as conservative, stable data storage. Red
>> builds of dependencies on 11 do not provide a sense of stability. A bunch
>> of dependency upgrades to compensate does not provide a sense of stability.
>> 
> 
> That's a good reason not to upgrade yet.
> 
> 
>>> 
>>> The mandating of Java 11 or later runtimes is a much bigger investment
>> than this discussion seems to contemplate. Unless we don’t care about
>> adoption.
>>> 
>>> If you want next generation low pause GC, Shenandoah has been ported
>> back to 8, is supported and available in RedHat’s OpenJDK, and that JDK
>> ships by default in several modern Linuxen not least of which Amazon Linux
>> AMIs.
>>> 
>>> 
>>>>> On Feb 17, 2020, at 10:32 AM, Lars Francke <[email protected]>
>> wrote:
>>>> I don't have a super-strong opinion on this but one of the reasons I'm
>> in
>>>> favor of moving to 11 is actually to apply some pressure.
>>>> OpenJDK is in all the major package repositories as far as I know. This
>> is
>>>> different from the Oracle JDK. So it's relatively simple to upgrade.
>>>> Lots of our customers would love to upgrade to newer versions because
>> most
>>>> of the time it means they can use the newer versions also for their
>> Spark
>>>> jobs etc. and this way they have a reason to do so and also some
>> leverage
>>>> in internal politics. I know it's not a technical reason but for a whole
>>>> bunch of people, it's still a very valid one.
>>>> But there are also technical reasons as Sean mentioned:
>>>> * One of them being the whole module system which would be nice to try
>> out
>>>> on HBase (which has a "shady" past in this regard).
>>>> * We could also check whether the Shell could be rewritten using JShell
>>>> (I've never looked into the feasibility but I'd personally love to get
>> away
>>>> from JRuby),
>>>> * Then there's TLS 1.3 support (which honestly might also be backported
>> in
>>>> an earlier version) which is getting more important
>>>> * and we have operational improvements around Garbage Collection which
>>>> would benefit HBase.
>>>>> On Mon, Feb 17, 2020 at 6:57 PM Andrew Purtell <
>> [email protected]>
>>>>> wrote:
>>>>> I share this concern.
>>>>> I recommend we build on 8, so can run on any 8 or later runtime, and
>> test
>>>>> on 11 or whatever the desired next advertised version of the runtime
>> will
>>>>> be. Builds and unit tests would execute on 8, then whole stack
>> integration
>>>>> tests on 11 (like ITBLL).
>>>>> Only if a post 8 language feature becomes compelling should there be
>> any
>>>>> need to move up the bytecode compatibility line. Any thoughts there? I
>>>>> can’t think of one. Eventually the value types work might be worth
>> it.
>>>>> There may be differences in runtime, like we’ve seen historically. So
>> far
>>>>> we have managed to navigate them just fine by being careful or porting
>> in
>>>>> appropriately licensed support to our own util packages. I would expect
>>>>> this to continue to be possible.
>>>>>> On Feb 17, 2020, at 6:41 AM, 张铎(Duo Zhang) <[email protected]>
>>>>> wrote:
>>>>>> For me the only concern is the JDK11+.
>>>>>> As long as lots of big companies are starting to make their own OpenJDK8
>>>>>> releases (like AdoptOpenJDK from Red Hat, and Corretto from Amazon, and
>> so
>>>>>> on), at least a big part of java world will still be on JDK8, this
>> means
>>>>> we
>>>>>> still need to run HBase 2 for a very long time?
>>>>>> On Mon, Feb 17, 2020 at 6:45 PM Lars Francke <[email protected]> wrote:
>>>>>>> Sean,
>>>>>>> you didn't have any explicit questions or request for feedback in
>> your
>>>>> mail
>>>>>>> so I'll just say that from my point all the points from your mail
>> make
>>>>>>> sense to me and I'm +1 on all of them.
>>>>>>> - Timeline (GA December/January)
>>>>>>> - Start of shorter release cycle
>>>>>>> - Rolling upgrade
>>>>>>> - JDK11+
>>>>>>> - Hadoop 3 only
>>>>>>> - No more Log4j 1
>>>>>>> - Minimize Hadoop needs
>>>>>>> Thank you for starting this process!
>>>>>>>>> On Sat, Feb 15, 2020 at 2:55 AM Sean Busbey <[email protected]>
>> wrote:
>>>>>>>>> Hi folks!
>>>>>>>>> Consider this the start of my release manager duties for HBase
>> 3.0.0.
>>>>>>>>> HBase 2.0 started alpha releases in Jun 2017 and went GA in April
>>>>>>>>> 2018. I'd like to start alpha releases for HBase 3.0 next week and
>> aim
>>>>>>>>> for a GA in December or January.
>>>>>>>>> As RM, I consider the alpha releases a chance for folks to shake
>>>>>>>>> things out and for us to decide as a community what makes it in or
>> not
>>>>>>>>> for the release line. Once things start being labeled beta, I would
>>>>>>>>> like things to be feature frozen. My current goal is to set beta in
>>>>>>>>> May or June.
>>>>>>>>> I would like HBase 3.0 to be the start of us getting into practice
>> on
>>>>>>>>> tighter iteration cycles for major versions. 2.5 years is too
>> long. We
>>>>>>>>> should try to look at our version numbers as akin to Linux kernel
>>>>>>>>> version numbers. Something useful to those interested in the
>> internals
>>>>>>>>> of the project but not something where most downstream users have
>> to
>>>>>>>>> dread major bumps. To that end I would like HBase 3.0 to be rolling
>>>>>>>>> upgradable from HBase 2.y at GA. Ultimately, I would like to update
>>>>>>>>> our reference guide's section on compatibility to state that future
>>>>>>>>> major releases will similarly be rolling upgradable from some minor
>>>>>>>>> release of the prior major release line.
>>>>>>>>> Given my desire to make major upgrades less of a world changing
>> event
>>>>>>>>> for our downstream folks, I also don't have any new features that I
>>>>>>>>> feel strongly need to make it into HBase 3.0. I'll do a review of
>>>>>>>>> what's currently there so we can motivate folks, but I won't do
>> that
>>>>>>>>> until we're ready to declare beta since that will be when I'll
>> have a
>>>>>>>>> better idea of what's actually ready to ship.
>>>>>>>>> With the major version change I'd like us to shed some maintenance
>>>>>>>>> burden in the project. For each of these, doing that will require
>>>>>>>>> getting a branch-2 release out that can successfully opt-in to the
>>>>>>>>> HBase 3 requirement and run on the current HBase 2 requirement.
>> That
>>>>>>>>> way folks can do a rolling restart within HBase 2 to make the
>> change,
>>>>>>>>> then do a rolling upgrade to HBase 3.
>>>>>>>>> = Hadoop 3 only
>>>>>>>>> The Hadoop community's focus increasingly is on Hadoop 3.y
>> releases. A
>>>>>>>>> substantial amount of our dependency handling is tied to trying to
>>>>>>>>> span both Hadoop 2 and Hadoop 3. I would like us to drop Hadoop 2
>>>>>>>>> entirely. I think branch-2 is currently in a place where running on
>>>>>>>>> Hadoop 3 is reasonable.
>>>>>>>>> = JDK11+
>>>>>>>>> We've been bending over backwards on jdk versions for a while.
>> Maven
>>>>>>>>> build setups for multiple jdk builds are a PITA and our existing
>>>>>>>>> build is already too complicated. I'd like us to get branch-2 into
>> a
>>>>>>>>> solid state for running on either JDK8 or JDK11 so folks can do
>>>>>>>>> production upgrades on those releases. I'd like HBase 3 to be
>> jdk11+
>>>>>>>>> only so that we can reduce our test footprint in the project and
>> start
>>>>>>>>> to entertain features that don't work with jdk8. For example, we
>> can't
>>>>>>>>> start to figure out how we should fit in the module system as
>> things
>>>>>>>>> are.
>>>>>>>>> = No more Log4j 1
>>>>>>>>> We got through 95% of the work to make our logging system
>> pluggable,
>>>>>>>>> but we still ship with log4j 1 as the out of the box solution. I
>> want
>>>>>>>>> HBase 3 to ship with some other slf4j backend and I want that same
>>>>>>>>> backend to be a realistic option for branch-2 users to deploy in
>>>>>>>>> production.
>>>>>>>>> = Minimize Hadoop needs
>>>>>>>>> I would like to isolate the things we have that reach into Hadoop
>>>>>>>>> internals so that we can ease the logistics of changing the Hadoop
>>>>>>>>> version we run on and minimize the extra stuff we carry around for
>>>>>>>>> those who don't run on Hadoop at all. This will involve moving the
>>>>>>>>> stuff we have that reaches into internals into one or more
>> module(s)
>>>>>>>>> and updating our artifacts so that we can tell where our hadoop
>>>>>>>>> related dependencies are in an installation.
>>
