Eric,

Do you mean that the published HBase POM won't have a Hadoop artifact as a dependency?
If so, the artifact will not be usable by HBase downstream projects unless the developer adds his/her version of Hadoop explicitly. IMO this is not very kosher. Is that your idea?

Thanks.

Alejandro

On Fri, Nov 11, 2011 at 1:49 PM, Eric Yang <[email protected]> wrote:
> My recommendation is that there is no Hadoop artifact in HBase, but to construct
> the class path from $PREFIX/share/hadoop. There should be a primary version of
> Hadoop that is advised by the HBase community as officially supported.
> Communities like Bigtop can advertise community-certified releases with their
> patches.
>
> regards,
> Eric
>
> On Nov 11, 2011, at 1:31 PM, Alejandro Abdelnur wrote:
>
>> Yes, but what version of Hadoop does your published hbase artifact have? And
>> how do you handle pre-0.23 and 0.23-onwards there? How will the
>> developers using hbase artifacts deal with this?
>>
>> Thanks.
>>
>> Alejandro
>>
>> On Fri, Nov 11, 2011 at 1:24 PM, Eric Yang <[email protected]> wrote:
>>> This is where separate Maven profiles can be useful for toggling tests with
>>> different dependency trees, for test purposes only.
>>>
>>> regards,
>>> Eric
>>>
>>> On Nov 11, 2011, at 12:26 PM, Alejandro Abdelnur wrote:
>>>
>>>> Eric,
>>>>
>>>> One problem is that you cannot depend on hadoop-core (for pre-0.23)
>>>> and on hadoop-common/hdfs/mapreduce* (for 0.23 onwards) at the same
>>>> time.
>>>>
>>>> Another problem is that different versions of Hadoop bring in
>>>> different dependencies you want to exclude, so you have to exclude
>>>> all deps from all potential Hadoop versions you don't want. (To
>>>> complicate things further, Jetty changed its group name, so you have to
>>>> exclude it twice.)
>>>>
>>>> Thanks.
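As a sketch of the profile approach Eric describes, and of the double Jetty exclusion Alejandro mentions, an HBase pom.xml could toggle the two dependency trees roughly like this. The profile ids, versions, and exclusion list here are illustrative assumptions, not a tested HBase configuration:

```xml
<profiles>
  <!-- Default: pre-0.23 Hadoop, packaged as the single hadoop-core artifact -->
  <profile>
    <id>hadoop-0.20</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>0.20.205.0</version>
        <exclusions>
          <!-- Jetty changed group name, so it must be excluded under both ids -->
          <exclusion>
            <groupId>org.mortbay.jetty</groupId>
            <artifactId>jetty</artifactId>
          </exclusion>
          <exclusion>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-server</artifactId>
          </exclusion>
        </exclusions>
      </dependency>
    </dependencies>
  </profile>
  <!-- 0.23 onwards: the split hadoop-common/hdfs artifacts -->
  <profile>
    <id>hadoop-0.23</id>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>0.23.0</version>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>0.23.0</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

Tests would then be run against the alternate tree with something like `mvn test -Phadoop-0.23`, while the default build keeps the pre-0.23 tree.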
>>>>
>>>> Alejandro
>>>>
>>>> On Fri, Nov 11, 2011 at 11:54 AM, Eric Yang <[email protected]> wrote:
>>>>>
>>>>> On Nov 11, 2011, at 11:04 AM, Gary Helmling wrote:
>>>>>
>>>>>>> Some effort was put into restoring and forward-porting features to ensure
>>>>>>> HBase 0.90.x and Hadoop 0.20.205.0 can work together. I recommend that
>>>>>>> one HBase release should be certified for one major release of Hadoop
>>>>>>> to reduce risk. Perhaps when the public Hadoop APIs are rock solid, it
>>>>>>> will become feasible to have a version of HBase that works across
>>>>>>> multiple versions of Hadoop.
>>>>>>
>>>>>> Since 0.20.205.0 is the build default, a lot of the testing will
>>>>>> naturally take place on this combination. But there are clearly
>>>>>> others interested in (and investing a lot of testing effort in)
>>>>>> running on 0.22 and 0.23, so we can't exclude those as unsupported.
>>>>>>
>>>>>>> In the proposed HBase structure layout change (HBASE-4337), the packaging
>>>>>>> process excludes the Hadoop jar file and picks it up from the
>>>>>>> constructed class path, as part of the effort to ensure Hadoop-related
>>>>>>> technologies can work together in an integrated fashion (file system layout
>>>>>>> change in HADOOP-6255).
>>>>>>
>>>>>> This is good when the packaging system supports dependencies flexible enough
>>>>>> to allow different Hadoop versions to satisfy the package
>>>>>> "Depends:", but I don't think it gets us all the way there.
>>>>>>
>>>>>> We still want to provide tarball distributions that contain a bundled
>>>>>> Hadoop jar for easy standalone setup and testing.
>>>>>>
>>>>>> Maven dependencies seem to be the other limiting factor. If I set up a
>>>>>> Java program that uses the HBase client and declare that dependency, I
>>>>>> get a transitive dependency on Hadoop (good), but what version? If
>>>>>> I'm running Hadoop 0.22, but the published Maven artifact for HBase
>>>>>> depends on 205, can I override that dependency in my POM?
Or do we
>>>>>> need to publish separate Maven artifacts for each Hadoop version, so
>>>>>> that the dependencies for each possible combination can be met (using
>>>>>> versioning or the version classifier)?
>>>>>>
>>>>>> I really don't know enough about Maven dependency management. Can we
>>>>>> specify a version like (0.20.205.0|0.22|0.23)? Or is there any way
>>>>>> for Hadoop to do a "Provides:" on a virtual package name that those 3
>>>>>> can share?
>>>>>
>>>>> Maven is quite flexible in specifying dependencies. Both version ranges and
>>>>> "provided" scope can be defined in pom.xml to improve compatibility.
>>>>> Certification of individual versions of dependent components should be
>>>>> expressed in the integration test phase of the HBase pom.xml, to ensure some
>>>>> version test validations can be done in HBase builds. If "provided" is
>>>>> expressed, there is no need for a virtual package, i.e.:
>>>>>
>>>>> <dependencies>
>>>>>   <dependency>
>>>>>     <groupId>org.apache.hadoop</groupId>
>>>>>     <artifactId>hadoop-core</artifactId>
>>>>>     <version>[0.20.205.0,)</version>
>>>>>     <scope>provided</scope>
>>>>>   </dependency>
>>>>>   <dependency>
>>>>>     <groupId>org.apache.hadoop</groupId>
>>>>>     <artifactId>hadoop-common</artifactId>
>>>>>     <version>[0.22.0,)</version>
>>>>>     <scope>provided</scope>
>>>>>   </dependency>
>>>>>   <dependency>
>>>>>     <groupId>org.apache.hadoop</groupId>
>>>>>     <artifactId>hadoop-hdfs</artifactId>
>>>>>     <version>[0.22.0,)</version>
>>>>>     <scope>provided</scope>
>>>>>   </dependency>
>>>>> </dependencies>
>>>>>
>>>>> The packaging proposal is to ensure the produced packages are not fixed
>>>>> to a single version of Hadoop. It is useful for QA to run smoke tests
>>>>> without having to make changes to scripts for the release package.
>>>>>
>>>>> regards,
>>>>> Eric
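Under a provided-scope scheme like Eric's, a downstream project consuming the published HBase artifact would have to declare its chosen Hadoop explicitly (the situation Alejandro flags above); to Gary's question, an unwanted transitive Hadoop can also be swapped out with an exclusion. The HBase and Hadoop versions below are illustrative assumptions, not a certified combination:

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.90.4</version>
    <exclusions>
      <!-- Drop any default pre-0.23 Hadoop pulled in transitively -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <!-- Supply the Hadoop artifacts this deployment actually runs (0.22 here) -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>0.22.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>0.22.0</version>
  </dependency>
</dependencies>
```

The trade-off either way is the one discussed in the thread: with "provided", nothing compiles-and-runs until the consumer picks a Hadoop; with a concrete default, consumers on a different Hadoop line must exclude and re-declare.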
