Out-of-the-box use is going to be problematic without recompiling HBase for
Hadoop 3. Spark 3.3 ships with Hadoop 3.3.2, while Apache HBase 2.4.x (and
all 2.x releases) is compiled against Hadoop 2. Linkage errors
(ClassNotFoundException, NoClassDefFoundError, etc.) are, I think, to be
expected, because the class hierarchies of various Hadoop components have
changed incompatibly in the 3.x releases relative to 2.x. This is not
unreasonable; semantic versioning suggests breaking changes can be expected
in a major version increment.
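
If you want to verify which Hadoop line a given HBase binary was built
against, the Hadoop jars it bundles carry the version in their file names.
A quick check, assuming a default tarball layout unpacked at $HBASE_HOME
(that path is just an example):

    # List the bundled Hadoop artifacts; the version suffix shows which
    # Hadoop line this HBase build was compiled against.
    ls $HBASE_HOME/lib | grep '^hadoop-'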

Users probably need to do a holistic (or hermetic, if you prefer) build of
their bill of materials before testing it, and certainly before shipping
it. Build your HBase for the version of Hadoop you are actually shipping
with, as opposed to whatever the upstream project picks as a default build
target. The binaries are called "convenience binaries" by the project and
the Foundation for a reason; convenience may vary according to your
circumstances. When HBase finally ships builds compiled against Hadoop 3 by
default, anyone still using Hadoop 2.x in production will face the same
problem in reverse. The Phoenix project faces this issue too, for what it's
worth: their README and build instructions walk users through rebuilding
HBase with -Dhadoop.profile=3.0 as a first step as well. A sketch of that
kind of rebuild follows.
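
To make that concrete, here is a minimal sketch, assuming HBase 2.4.x
sources and Hadoop 3.3.2 as the target. The property names are the ones I
recall from the HBase and hbase-connectors poms; the specific version
numbers are placeholders, so adjust them to your actual bill of materials:

    # Step 1: rebuild HBase itself against Hadoop 3 and install the
    # artifacts into the local Maven repository.
    mvn clean install -DskipTests \
        -Dhadoop.profile=3.0 \
        -Dhadoop-three.version=3.3.2

    # Step 2: build the Spark connector (hbase-connectors) against the
    # rebuilt HBase and the Spark/Scala versions you actually run.
    mvn clean install -DskipTests \
        -Dspark.version=3.3.0 \
        -Dscala.binary.version=2.12 \
        -Dscala.version=2.12.15 \
        -Dhbase.version=2.4.14 \
        -Dhadoop-three.version=3.3.2

Then test the resulting artifacts together as one unit before shipping.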


On Mon, Oct 17, 2022 at 1:52 PM Lars Francke <lars.fran...@gmail.com> wrote:

> Hi everyone,
>
> we've just recently tried getting the HBase Spark connector running against
> Spark 3.3 and HBase 2.4.x and failed miserably. It was a mess of Scala and
> Java issues, classpath, NoClassDef etc.
>
> The trauma is too recent for me to dig up the details but if needed I can
> ;-)
>
> For now I'm just wondering if anyone has succeeded using this combination?
>
> Cheers,
> Lars
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
    It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse
