My recommendation is that no Hadoop artifact be bundled in HBase; instead, the classpath should be constructed from $PREFIX/share/hadoop. There should be a primary version of Hadoop that the HBase community advises as officially supported. Communities like Bigtop can advertise community-certified releases with their patches.
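One way to realize this on the packaging side is an assembly descriptor that bundles everything except Hadoop, leaving the runtime to pick Hadoop up from $PREFIX/share/hadoop. A minimal sketch, assuming the tarball is built with the maven-assembly-plugin (illustrative only, not the actual HBASE-4337 change):

  <assembly>
    <id>bin</id>
    <formats>
      <format>tar.gz</format>
    </formats>
    <dependencySets>
      <dependencySet>
        <outputDirectory>lib</outputDirectory>
        <excludes>
          <!-- keep Hadoop jars out of lib/; the launcher script builds
               the classpath from $PREFIX/share/hadoop instead -->
          <exclude>org.apache.hadoop:*</exclude>
        </excludes>
      </dependencySet>
    </dependencySets>
  </assembly>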
regards,
Eric

On Nov 11, 2011, at 1:31 PM, Alejandro Abdelnur wrote:

> Yes, but what version of Hadoop does your published hbase artifact
> depend on? And how do you handle the pre-0.23 versus 0.23-onwards
> split there? How will developers using hbase artifacts deal with this?
>
> Thanks.
>
> Alejandro
>
> On Fri, Nov 11, 2011 at 1:24 PM, Eric Yang <[email protected]> wrote:
>> This is where separate maven profiles can be useful for toggling tests
>> with different dependency trees, for test purposes only.
>>
>> regards,
>> Eric
>>
>> On Nov 11, 2011, at 12:26 PM, Alejandro Abdelnur wrote:
>>
>>> Eric,
>>>
>>> One problem is that you cannot depend on hadoop-core (for pre-0.23)
>>> and on hadoop-common/hdfs/mapreduce* (for 0.23 onwards) at the same
>>> time.
>>>
>>> Another problem is that different versions of hadoop bring in
>>> different dependencies you want to exclude, so you have to exclude
>>> all deps from all the potential hadoop versions you don't want (to
>>> complicate things further, jetty changed its group name, so you have
>>> to exclude it twice).
>>>
>>> Thanks.
>>>
>>> Alejandro
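To make Alejandro's second point concrete: a downstream pom ends up excluding jetty under both of its group ids. A minimal sketch, with illustrative artifact ids rather than coordinates taken from an actual Hadoop pom:

  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.205.0</version>
    <exclusions>
      <!-- jetty under its original group id, as pulled in by older poms -->
      <exclusion>
        <groupId>org.mortbay.jetty</groupId>
        <artifactId>jetty</artifactId>
      </exclusion>
      <!-- jetty again under the renamed group id, hence "exclude it twice" -->
      <exclusion>
        <groupId>org.eclipse.jetty</groupId>
        <artifactId>jetty-server</artifactId>
      </exclusion>
    </exclusions>
  </dependency>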
>>> On Fri, Nov 11, 2011 at 11:54 AM, Eric Yang <[email protected]> wrote:
>>>>
>>>> On Nov 11, 2011, at 11:04 AM, Gary Helmling wrote:
>>>>
>>>>>> Some effort was put into restoring and forward-porting features to
>>>>>> ensure HBase 0.90.x and Hadoop 0.20.205.0 can work together. I
>>>>>> recommend that one HBase release be certified for one major release
>>>>>> of Hadoop, to reduce risk. Perhaps when the public Hadoop APIs are
>>>>>> rock solid, it will become feasible to have a version of HBase that
>>>>>> works across multiple versions of Hadoop.
>>>>>
>>>>> Since 0.20.205.0 is the build default, a lot of the testing will
>>>>> naturally take place on this combination. But there are clearly
>>>>> others interested in (and investing a lot of testing effort in)
>>>>> running on 0.22 and 0.23, so we can't exclude those as unsupported.
>>>>>
>>>>>> In the proposed HBase structure layout change (HBASE-4337), the
>>>>>> packaging process excludes the Hadoop jar file and picks it up from
>>>>>> the constructed class path, as part of the effort to ensure
>>>>>> Hadoop-related technologies can work together in an integrated
>>>>>> fashion (file system layout change in HADOOP-6255).
>>>>>
>>>>> This is good when the packaging system supports flexible enough
>>>>> dependencies to allow different Hadoop versions to satisfy the
>>>>> package "Depends:", but I don't think it gets us all the way there.
>>>>>
>>>>> We still want to provide tarball distributions that contain a bundled
>>>>> Hadoop jar for easy standalone setup and testing.
>>>>>
>>>>> Maven dependencies seem to be the other limiting factor. If I set up
>>>>> a java program that uses the HBase client and declare that
>>>>> dependency, I get a transitive dependency on Hadoop (good), but what
>>>>> version? If I'm running Hadoop 0.22, but the published maven artifact
>>>>> for HBase depends on 205, can I override that dependency in my POM?
>>>>> Or do we need to publish separate maven artifacts for each Hadoop
>>>>> version, so that the dependencies for each possible combination can
>>>>> be met (using versioning or the version classifier)?
>>>>>
>>>>> I really don't know enough about maven dependency management. Can we
>>>>> specify a version like (0.20.205.0|0.22|0.23)? Or is there any way
>>>>> for Hadoop to do a "Provides:" on a virtual package name that those 3
>>>>> can share?
>>>>
>>>> Maven is quite flexible in specifying dependencies. Both version
>>>> ranges and the provided scope can be defined in pom.xml to improve
>>>> compatibility. Certification of individual versions of dependent
>>>> components should be expressed in the integration-test phase of the
>>>> HBase pom.xml, so that version validations can be done in HBase
>>>> builds. If provided scope is used, there is no need for a virtual
>>>> package, i.e.:
>>>>
>>>> <dependencies>
>>>>   <dependency>
>>>>     <groupId>org.apache.hadoop</groupId>
>>>>     <artifactId>hadoop-core</artifactId>
>>>>     <version>[0.20.205.0,)</version>
>>>>     <scope>provided</scope>
>>>>   </dependency>
>>>>   <dependency>
>>>>     <groupId>org.apache.hadoop</groupId>
>>>>     <artifactId>hadoop-common</artifactId>
>>>>     <version>[0.22.0,)</version>
>>>>     <scope>provided</scope>
>>>>   </dependency>
>>>>   <dependency>
>>>>     <groupId>org.apache.hadoop</groupId>
>>>>     <artifactId>hadoop-hdfs</artifactId>
>>>>     <version>[0.22.0,)</version>
>>>>     <scope>provided</scope>
>>>>   </dependency>
>>>> </dependencies>
>>>>
>>>> The packaging proposal is to ensure the produced packages are not
>>>> fixed to a single version of Hadoop. It is useful for QA to run smoke
>>>> tests without having to change scripts for the release package.
>>>>
>>>> regards,
>>>> Eric
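For reference, the separate maven profiles Eric suggests could look roughly like this in the HBase pom.xml; the profile ids, versions, and hadoop.profile property are illustrative, not taken from an actual build file:

  <profiles>
    <!-- default: test against pre-0.23 Hadoop, which ships one hadoop-core jar -->
    <profile>
      <id>hadoop-0.20</id>
      <activation>
        <activeByDefault>true</activeByDefault>
      </activation>
      <dependencies>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-core</artifactId>
          <version>0.20.205.0</version>
        </dependency>
      </dependencies>
    </profile>
    <!-- activated with -Dhadoop.profile=23: the split 0.23-onwards artifacts -->
    <profile>
      <id>hadoop-0.23</id>
      <activation>
        <property>
          <name>hadoop.profile</name>
          <value>23</value>
        </property>
      </activation>
      <dependencies>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-common</artifactId>
          <version>0.23.0</version>
        </dependency>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-hdfs</artifactId>
          <version>0.23.0</version>
        </dependency>
      </dependencies>
    </profile>
  </profiles>

Because only one profile is active at a time, the build never declares hadoop-core and hadoop-common/hdfs simultaneously, which sidesteps the conflict Alejandro describes. Gary's override question has a similar answer on the consumer side: a project depending on the HBase artifact can pin a different Hadoop version in its own <dependencyManagement> section, which takes precedence over the transitive version the HBase pom declares.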
