No, that is insufficient. HBase must be recompiled against Hadoop 3 first:

    cd /path/to/hbase
    mvn clean install assembly:single -DskipTests -Dhadoop.profile=3.0 -Dhadoop-three.version=XXX
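For example, if you are targeting the Hadoop 3.3.2 that Spark 3.3 bundles, the rebuild would look roughly like this (a sketch only; substitute whatever versions your own bill of materials actually ships):

    cd /path/to/hbase
    # Rebuild HBase with the Hadoop 3 profile and install the resulting
    # artifacts into the local Maven repository (or deploy them to your Nexus)
    mvn clean install assembly:single -DskipTests \
        -Dhadoop.profile=3.0 -Dhadoop-three.version=3.3.2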
Then, once the results are in your local Maven cache or Nexus instance, you can build the Spark connector as indicated.

On Tue, Oct 18, 2022 at 11:39 PM Lars Francke <lars.fran...@gmail.com> wrote:

> Hi Andrew,
>
> thanks for the reply.
> I should have been more specific: We only tried to compile the "client"
> part that's used in Spark itself, and we used the proper versions:
>
> mvn -Dspark.version=XXX -Dscala.version=XXX -Dhadoop-three.version=XXX
> -Dscala.binary.version=XXX -Dhbase.version=XXX clean package
>
> I assume that should pull in the correct dependencies, but I have to admit
> that I didn't check; I took it straight from the readme.
> We wanted to try the server bit for the RegionServers afterwards but didn't
> even get to it yet.
>
> We have this on our radar though and might try to work through those issues
> at some point.
> If we get started on that I'll ping the list.
>
> Cheers,
> Lars
>
> On Wed, Oct 19, 2022 at 1:41 AM Andrew Purtell <apurt...@apache.org> wrote:
>
> > Out-of-the-box use is going to be problematic without recompiling HBase
> > for Hadoop 3. Spark 3.3 ships with Hadoop 3.3.2. Apache HBase 2.4.x (and
> > all 2.x) releases are compiled against Hadoop 2. Link errors
> > (ClassNotFound, NoClassDef, etc.) are to be expected, I think, because
> > the class hierarchies of various Hadoop things have been incompatibly
> > changed in 3.x releases relative to 2.x. This is not unreasonable.
> > Semantic versioning suggests breaking changes can be expected in a major
> > version increment.
> >
> > Users probably need to do a holistic (or hermetic, if you prefer) build
> > of their bill of materials before testing it, or certainly before
> > shipping it. Build your HBase for the version of Hadoop you are actually
> > shipping it with, as opposed to whatever the upstream project picks as a
> > default build target. They are called "convenience binaries" by the
> > project and the Foundation for a reason. Convenience may vary according
> > to your circumstances. When HBase finally ships builds compiled against
> > Hadoop 3 by default, anyone still using 2.x in production will face the
> > same problem (in reverse). The Phoenix project also faces this issue, for
> > what it's worth. Their readme and build instructions walk users through
> > rebuilding HBase using -Dhadoop.profile=3.0 as a first step as well.
> >
> > On Mon, Oct 17, 2022 at 1:52 PM Lars Francke <lars.fran...@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > > we've just recently tried getting the HBase Spark connector running
> > > against Spark 3.3 and HBase 2.4.x and failed miserably. It was a mess
> > > of Scala and Java issues, classpath, NoClassDef etc.
> > >
> > > The trauma is too recent for me to dig up the details but if needed I
> > > can ;-)
> > >
> > > For now I'm just wondering if anyone has succeeded using this
> > > combination?
> > >
> > > Cheers,
> > > Lars
> >
> > --
> > Best regards,
> > Andrew
> >
> > Unrest, ignorance distilled, nihilistic imbeciles -
> > It's what we’ve earned
> > Welcome, apocalypse, what’s taken you so long?
> > Bring us the fitting end that we’ve been counting on
> >    - A23, Welcome, Apocalypse

--
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse