Out-of-the-box use is going to be problematic without recompiling HBase for Hadoop 3. Spark 3.3 ships with Hadoop 3.3.2, while Apache HBase 2.4.x (and all 2.x releases) is compiled against Hadoop 2. Link errors (ClassNotFoundException, NoClassDefFoundError, etc.) are, I think, to be expected, because the class hierarchies of various Hadoop classes have changed incompatibly in the 3.x releases relative to 2.x. This is not unreasonable; semantic versioning suggests breaking changes can be expected in a major version increment.
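The rebuild itself is not complicated. A minimal sketch, assuming a Maven build from an HBase 2.4.x source checkout (the hadoop-three.version property name is from HBase's pom, if memory serves; adjust the version to match whatever Hadoop your Spark distribution actually bundles):

```shell
# Rebuild HBase against Hadoop 3 instead of the default Hadoop 2 target.
# Run from the root of an HBase 2.4.x source checkout.
mvn clean install -DskipTests \
  -Dhadoop.profile=3.0 \
  -Dhadoop-three.version=3.3.2   # e.g. the Hadoop that Spark 3.3 ships with
```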
Users probably need to do a holistic (or hermetic, if you prefer) build of their bill of materials before testing it, and certainly before shipping it. Build your HBase for the version of Hadoop you are actually shipping with, as opposed to whatever the upstream project picks as a default build target. The project and the Foundation call them "convenience binaries" for a reason; convenience may vary according to your circumstances. When HBase finally ships builds compiled against Hadoop 3 by default, anyone still running Hadoop 2.x in production will face the same problem in reverse.

The Phoenix project faces this issue too, for what it's worth. Their README and build instructions also walk users through rebuilding HBase with -Dhadoop.profile=3.0 as a first step.

On Mon, Oct 17, 2022 at 1:52 PM Lars Francke <lars.fran...@gmail.com> wrote:

> Hi everyone,
>
> we've just recently tried getting the HBase Spark connector running against
> Spark 3.3 and HBase 2.4.x and failed miserably. It was a mess of Scala and
> Java issues, classpath, NoClassDef etc.
>
> The trauma is too recent for me to dig up the details but if needed I can
> ;-)
>
> For now I'm just wondering if anyone has succeeded using this combination?
>
> Cheers,
> Lars

--
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles - It's what we've earned
Welcome, apocalypse, what's taken you so long?
Bring us the fitting end that we've been counting on
  - A23, "Welcome, Apocalypse"