+1 for the branch idea. Just FYI, your biggest problem will be proving that
Hadoop and the downstream projects work correctly after you upgrade core
components like Protobuf.
So while branching and working on a branch is easy, merging back after you
upgrade some of these core components is insanely hard. You might want to
make sure that the community buys into upgrading these components in trunk.
That way we will get testing, and downstream components will notice when
things break.

That said, I have lobbied for the upgrade of Protobuf for a really long
time. I have argued that 2.5 is out of support and that we cannot stay on
that branch forever unless we take ownership of the Protobuf 2.5 code base.
It has been rightly pointed out to me that while all the arguments I make
are correct, upgrading Protobuf is a very complicated task, and the worst
part is that we will not even know what breaks until downstream projects
pick up these changes and work against us.

If we work off Hadoop version 3 and assume that we have "shading" in place
for all deployments, it might be possible to get there, but it is still a
daunting task.

So best of luck with the branch approach, but please remember that merging
back will be hard. Just my 2 cents.

— Anu




On Sun, Sep 1, 2019 at 7:40 PM Zhenyu Zheng <zhengzhenyul...@gmail.com>
wrote:

> Hi,
>
> Thanks Vinay for bringing this up and thanks Sheng for the idea. A separate
> branch with its own ARM CI seems like a really good idea.
> By doing this we won't break any of the ongoing development in trunk, and
> a CI can be a very good way to show what the current problems are and what
> has been fixed. It will also provide a very good view for contributors who
> are interested in working on this. We can finally merge the branch back to
> trunk once the community thinks it is good enough and stable enough. We can
> donate ARM machines to the existing CI system for the job.
>
> I wonder if this approach is possible?
>
> BR,
>
> On Thu, Aug 29, 2019 at 11:29 AM Sheng Liu <liusheng2...@gmail.com> wrote:
>
> > Hi,
> >
> > Thanks Vinay for bringing this up. I am a member of the "Openlab"
> > community mentioned by Vinay. I am working on building and testing Hadoop
> > components on an aarch64 server these days. Besides the missing ARM
> > platform dependencies (issues #1, #2 and #3 mentioned by Vinay), other
> > similar issues have also been found, such as the "PhantomJS" dependency
> > package, which is also missing for aarch64.
> >
> > To promote ARM support for Hadoop, we have discussed and hope to add an
> > ARM-specific CI to the Hadoop repo. We are not sure whether there would be
> > any potential effect or conflict on the trunk branch, so maybe creating an
> > ARM-specific branch for this work is a better choice. What do you think?
> >
> > Hope to hear thoughts from you :)
> >
> > BR,
> > Liu sheng
> >
> > On Tue, Aug 27, 2019 at 5:34 AM Vinayakumar B <vinayakum...@apache.org> wrote:
> >
> > > Hi Folks,
> > >
> > > ARM has been gaining popularity lately for its processing capability
> > > and has the potential to run big data workloads.
> > > Many users have been moving to ARM machines due to their low cost.
> > >
> > > In the past there were attempts to compile Hadoop on ARM (Raspberry Pi)
> > > for experimental purposes. Today the ARM architecture is taking on some
> > > server-side processing as well, so there is, or soon will be, a real need
> > > for Hadoop to support the ARM architecture.
> > >
> > > There are a bunch of users who are trying to build Hadoop on ARM and
> > > to add ARM CI to Hadoop, and who are facing issues [1]. Also some
> > >
> > > As of today, Hadoop does not compile on ARM due to the issues below,
> > > found in testing done in openlab [2].
> > >
> > > 1. Protobuf:
> > > -------------------
> > >      The Hadoop project (and some downstream projects) is stuck on
> > > protobuf 2.5.0 for backward compatibility reasons. Protobuf 2.5.0 is no
> > > longer maintained by the community. Protobuf 3.x is being widely and
> > > actively adopted, and it still provides wire compatibility for proto2
> > > messages. However, there are some compilation issues in the generated
> > > Java code, which can cause problems downstream. For this reason the
> > > protobuf upgrade from 2.5.0 was not taken up.
> > > From 3.0.0 onwards, Hadoop supports shading of libraries to avoid
> > > classpath problems in downstream projects.
> > >     There are patches available to fix compilation in Hadoop, but we
> > > need to find a way to upgrade protobuf to the latest version and still
> > > maintain the downstream classpath using the shading feature of the
> > > Hadoop build.
> > >
> > >      There is a Jira for the protobuf upgrade [3], created even before
> > > shade support was added to Hadoop. We now need to revisit the Jira and
> > > continue exploring possibilities.
> > >
> > > 2. leveldbjni:
> > > ---------------
> > >     The current leveldbjni used in YARN does not support the ARM
> > > architecture. We need to check whether any future version supports ARM
> > > and whether Hadoop can upgrade to that version.
> > >
> > >
> > > 3. hadoop-yarn-csi's dependency 'protoc-gen-grpc-java:1.15.1'
> > > -------------------------
> > > 'protoc-gen-grpc-java:1.15.1' does not provide an ARM executable by
> > > default in the Maven repository. The workaround is to build it locally
> > > and keep it in the local Maven repository.
> > > We need to check whether any future version of 'protoc-gen-grpc-java'
> > > provides an ARM executable and whether hadoop-yarn-csi can upgrade to it.
> > >
> > >
> > > Once the compilation issues are solved, there might be many native-code
> > > issues due to the different architectures.
> > > So to explore everything, we need to join hands and proceed.
> > >
> > >
> > > Let us discuss and check whether anybody else out there also needs
> > > Hadoop support on ARM architectures and is ready to lend their hands and
> > > time to this work.
> > >
> > >
> > > [1] https://issues.apache.org/jira/browse/HADOOP-16358
> > > [2]
> > > https://issues.apache.org/jira/browse/HADOOP-16358?focusedCommentId=16904887&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16904887
> > > [3] https://issues.apache.org/jira/browse/HADOOP-13363
> > >
> > > -Vinay
> > >
> >
>
