Eric,

One problem is that you cannot depend on hadoop-core (for pre-0.23) and on hadoop-common/hdfs/mapreduce* (for 0.23 onwards) at the same time.
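In practice that pushes the build toward per-version profiles, so that only one Hadoop line is ever on the classpath at a time. A rough sketch (profile ids and the 0.23.0 version number are illustrative, not taken from the actual HBase pom.xml):

<profiles>
  <!-- pre-0.23: the monolithic hadoop-core artifact -->
  <profile>
    <id>hadoop-0.20</id>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>0.20.205.0</version>
      </dependency>
    </dependencies>
  </profile>
  <!-- 0.23 onwards: the split hadoop-common/hadoop-hdfs artifacts -->
  <profile>
    <id>hadoop-0.23</id>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>0.23.0</version>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>0.23.0</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>

Either line is then selected at build time with -Phadoop-0.20 or -Phadoop-0.23.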
Another problem is that different versions of Hadoop bring in different dependencies that you want to exclude, so you have to exclude all the deps from every potential Hadoop version you don't want (to complicate things further, Jetty changed its group id, so you have to exclude it twice).
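Concretely, that means exclusion lists along these lines (the jetty-server artifact id is illustrative; the point is that Jetty has to be excluded under both its old and new group ids):

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>0.20.205.0</version>
  <exclusions>
    <!-- Jetty under its old group id -->
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>jetty</artifactId>
    </exclusion>
    <!-- Jetty again under its new group id -->
    <exclusion>
      <groupId>org.eclipse.jetty</groupId>
      <artifactId>jetty-server</artifactId>
    </exclusion>
  </exclusions>
</dependency>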
Thanks.

Alejandro

On Fri, Nov 11, 2011 at 11:54 AM, Eric Yang <[email protected]> wrote:
>
> On Nov 11, 2011, at 11:04 AM, Gary Helmling wrote:
>
>>> Some effort was put into restoring and forward-porting features to
>>> ensure HBase 0.90.x and Hadoop 0.20.205.0 can work together. I
>>> recommend that one HBase release should be certified for one major
>>> release of Hadoop to reduce risk. Perhaps when the public Hadoop
>>> APIs are rock solid, it will become feasible to have a version of
>>> HBase that works across multiple versions of Hadoop.
>>
>> Since 0.20.205.0 is the build default, a lot of the testing will
>> naturally take place on this combination. But there are clearly
>> others interested in (and investing a lot of testing effort in)
>> running on 0.22 and 0.23, so we can't exclude those as unsupported.
>>
>>> In the proposed HBase structure layout change (HBASE-4337), the
>>> packaging process excludes the Hadoop jar file from the package and
>>> picks it up from the constructed classpath. This is part of the
>>> effort to ensure Hadoop-related technologies can work together in
>>> an integrated fashion (file system layout change in HADOOP-6255).
>>
>> This is good when the packaging system supports dependencies flexible
>> enough to allow different Hadoop versions to satisfy the package
>> "Depends:", but I don't think it gets us all the way there.
>>
>> We still want to provide tarball distributions that contain a bundled
>> Hadoop jar for easy standalone setup and testing.
>>
>> Maven dependencies seem to be the other limiting factor. If I set up
>> a Java program that uses the HBase client and declare that
>> dependency, I get a transitive dependency on Hadoop (good), but what
>> version? If I'm running Hadoop 0.22, but the published Maven artifact
>> for HBase depends on 205, can I override that dependency in my POM?
>> Or do we need to publish separate Maven artifacts for each Hadoop
>> version, so that the dependencies for each possible combination can
>> be met (using versioning or the version classifier)?
>>
>> I really don't know enough about Maven dependency management. Can we
>> specify a version like (0.20.205.0|0.22|0.23)? Or is there any way
>> for Hadoop to do a "Provides:" on a virtual package name that those
>> three can share?
>
> Maven is quite flexible in specifying dependencies. Both version
> ranges and provided scope can be defined in pom.xml to improve
> compatibility. Certification of individual versions of dependent
> components should be expressed in the integration-test phase of the
> HBase pom.xml, so that version test validations can be done in HBase
> builds.
>
> If provided scope is used, there is no need for a virtual package,
> i.e.:
>
> <dependencies>
>   <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-core</artifactId>
>     <version>[0.20.205.0,)</version>
>     <scope>provided</scope>
>   </dependency>
>   <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-common</artifactId>
>     <version>[0.22.0,)</version>
>     <scope>provided</scope>
>   </dependency>
>   <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-hdfs</artifactId>
>     <version>[0.22.0,)</version>
>     <scope>provided</scope>
>   </dependency>
> </dependencies>
>
> The packaging proposal is to ensure the produced packages are not tied
> to a single version of Hadoop. It is useful for QA to run smoke tests
> without having to change scripts for the release package.
>
> regards,
> Eric
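On Gary's question about overriding the transitive dependency: a downstream pom.xml can generally pin the Hadoop version itself, since a dependencyManagement entry (or a direct dependency) takes precedence over the version pulled in transitively. A minimal sketch, assuming the split 0.22 artifact names from Eric's example:

<dependencyManagement>
  <dependencies>
    <!-- force the Hadoop version resolved for the transitive HBase dependency -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>0.22.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>0.22.0</version>
    </dependency>
  </dependencies>
</dependencyManagement>

Note that this only helps while the artifact ids match; it cannot swap hadoop-common in for a transitive hadoop-core, which is exactly the pre-0.23/post-0.23 split problem raised at the top of the thread.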
