Eric,

Do you mean that the published HBase POM won't have a Hadoop artifact
as a dependency?

If so, the artifact will not be usable by HBase downstream projects
unless the developer explicitly adds his/her version of Hadoop.
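
For example, a downstream project would then need something like this in
its own POM (the artifact names and versions here are only illustrative):

<dependencies>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.90.4</version>
  </dependency>
  <!-- Hadoop must be added by hand, since the HBase POM would no longer pull it in -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.205.0</version>
  </dependency>
</dependencies>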

IMO this is not very kosher.

Is that your idea?

Thanks.

Alejandro

On Fri, Nov 11, 2011 at 1:49 PM, Eric Yang <[email protected]> wrote:
> My recommendation is that there be no Hadoop artifact in HBase; instead,
> the classpath is constructed from $PREFIX/share/hadoop.  There should be
> a primary version of Hadoop that the HBase community advises as
> officially supported.  Communities like Bigtop can advertise
> community-certified releases with their patches.
>
> regards,
> Eric
>
> On Nov 11, 2011, at 1:31 PM, Alejandro Abdelnur wrote:
>
>> Yes, but which version of Hadoop does your published HBase artifact
>> depend on? And how do you handle pre-0.23 versus 0.23-onwards there?
>> How will developers using HBase artifacts deal with this?
>>
>> Thanks.
>>
>> Alejandro
>>
>> On Fri, Nov 11, 2011 at 1:24 PM, Eric Yang <[email protected]> wrote:
>>> This is where separate Maven profiles can be useful for toggling tests
>>> with different dependency trees, for test purposes only.
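>>>
>>> Roughly like this (profile ids, versions, and activation below are a
>>> sketch, not the actual HBase pom.xml):
>>>
>>> <profiles>
>>>   <profile>
>>>     <id>hadoop-0.20</id>
>>>     <activation>
>>>       <activeByDefault>true</activeByDefault>
>>>     </activation>
>>>     <dependencies>
>>>       <dependency>
>>>         <groupId>org.apache.hadoop</groupId>
>>>         <artifactId>hadoop-core</artifactId>
>>>         <version>0.20.205.0</version>
>>>         <scope>test</scope>
>>>       </dependency>
>>>     </dependencies>
>>>   </profile>
>>>   <profile>
>>>     <id>hadoop-0.23</id>
>>>     <dependencies>
>>>       <dependency>
>>>         <groupId>org.apache.hadoop</groupId>
>>>         <artifactId>hadoop-common</artifactId>
>>>         <version>0.23.0</version>
>>>         <scope>test</scope>
>>>       </dependency>
>>>       <dependency>
>>>         <groupId>org.apache.hadoop</groupId>
>>>         <artifactId>hadoop-hdfs</artifactId>
>>>         <version>0.23.0</version>
>>>         <scope>test</scope>
>>>       </dependency>
>>>     </dependencies>
>>>   </profile>
>>> </profiles>
>>>
>>> Then e.g. "mvn test -Phadoop-0.23" selects the alternate tree.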
>>>
>>> regards,
>>> Eric
>>>
>>> On Nov 11, 2011, at 12:26 PM, Alejandro Abdelnur wrote:
>>>
>>>> Eric,
>>>>
>>>> One problem is that you cannot depend on hadoop-core (for pre-0.23)
>>>> and on hadoop-common/hdfs/mapreduce* (for 0.23 onwards) at the same
>>>> time.
>>>>
>>>> Another problem is that different versions of Hadoop bring in
>>>> different transitive dependencies you want to exclude, so you have to
>>>> exclude all deps from all potential Hadoop versions you don't want (to
>>>> complicate things further, Jetty changed its group name, so you have
>>>> to exclude it twice).
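>>>>
>>>> For example, something like this ends up being needed (the artifact
>>>> lists are illustrative; the point is that Jetty has to be excluded
>>>> under both its old and new group names):
>>>>
>>>> <dependency>
>>>>   <groupId>org.apache.hadoop</groupId>
>>>>   <artifactId>hadoop-core</artifactId>
>>>>   <version>0.20.205.0</version>
>>>>   <exclusions>
>>>>     <!-- Jetty under its old coordinates (pre-rename) -->
>>>>     <exclusion>
>>>>       <groupId>org.mortbay.jetty</groupId>
>>>>       <artifactId>jetty</artifactId>
>>>>     </exclusion>
>>>>     <exclusion>
>>>>       <groupId>org.mortbay.jetty</groupId>
>>>>       <artifactId>jetty-util</artifactId>
>>>>     </exclusion>
>>>>     <!-- Jetty under its new coordinates (post-rename) -->
>>>>     <exclusion>
>>>>       <groupId>org.eclipse.jetty</groupId>
>>>>       <artifactId>jetty-server</artifactId>
>>>>     </exclusion>
>>>>   </exclusions>
>>>> </dependency>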
>>>>
>>>> Thanks.
>>>>
>>>> Alejandro
>>>>
>>>> On Fri, Nov 11, 2011 at 11:54 AM, Eric Yang <[email protected]> wrote:
>>>>>
>>>>>
>>>>> On Nov 11, 2011, at 11:04 AM, Gary Helmling wrote:
>>>>>
>>>>>>> Some effort was put into restoring and forward-porting features to
>>>>>>> ensure HBase 0.90.x and Hadoop 0.20.205.0 can work together.  I
>>>>>>> recommend that one HBase release be certified for one major release
>>>>>>> of Hadoop to reduce risk.  Perhaps when the public Hadoop APIs are
>>>>>>> rock solid, it will become feasible to have a version of HBase that
>>>>>>> works across multiple versions of Hadoop.
>>>>>>
>>>>>> Since 0.20.205.0 is the build default, a lot of the testing will
>>>>>> naturally take place on this combination.  But there are clearly
>>>>>> others interested in (and investing a lot of testing effort in)
>>>>>> running on 0.22 and 0.23, so we can't exclude those as unsupported.
>>>>>>
>>>>>>>
>>>>>>> In the proposed HBase structure layout change (HBASE-4337), the
>>>>>>> packaging process excludes the Hadoop jar file from the package and
>>>>>>> picks it up from the constructed classpath instead.  This is part
>>>>>>> of the effort to ensure Hadoop-related technologies can work
>>>>>>> together in an integrated fashion (filesystem layout change in
>>>>>>> HADOOP-6255).
>>>>>>
>>>>>> This is good when the packaging system supports dependencies
>>>>>> flexible enough to allow different Hadoop versions to satisfy the
>>>>>> package "Depends:", but I don't think it gets us all the way there.
>>>>>>
>>>>>> We still want to provide tarball distributions that contain a bundled
>>>>>> Hadoop jar for easy standalone setup and testing.
>>>>>>
>>>>>> Maven dependencies seem to be the other limiting factor.  If I set
>>>>>> up a Java program that uses the HBase client and declare that
>>>>>> dependency, I get a transitive dependency on Hadoop (good), but on
>>>>>> what version?  If I'm running Hadoop 0.22, but the published Maven
>>>>>> artifact for HBase depends on 205, can I override that dependency in
>>>>>> my POM?  Or do we need to publish separate Maven artifacts for each
>>>>>> Hadoop version, so that the dependencies for each possible
>>>>>> combination can be met (using versioning or the version classifier)?
>>>>>>
>>>>>> I really don't know enough about Maven dependency management.  Can
>>>>>> we specify a version like (0.20.205.0|0.22|0.23)?  Or is there any
>>>>>> way for Hadoop to do a "Provides:" on a virtual package name that
>>>>>> those 3 can share?
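>>>>>>
>>>>>> My rough understanding is that a downstream POM can at least pin the
>>>>>> transitive version through dependencyManagement, something like the
>>>>>> sketch below (the version is illustrative), but I could be wrong:
>>>>>>
>>>>>> <dependencyManagement>
>>>>>>   <dependencies>
>>>>>>     <dependency>
>>>>>>       <groupId>org.apache.hadoop</groupId>
>>>>>>       <artifactId>hadoop-core</artifactId>
>>>>>>       <version>0.20.204.0</version>
>>>>>>     </dependency>
>>>>>>   </dependencies>
>>>>>> </dependencyManagement>
>>>>>>
>>>>>> That only helps within the same artifact id, though; moving to the
>>>>>> split artifacts would still need an exclusion plus explicit
>>>>>> dependencies.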
>>>>>
>>>>> Maven is quite flexible in specifying dependencies.  Both version
>>>>> ranges and the provided scope can be defined in pom.xml to improve
>>>>> compatibility.  Certification of each supported version of a
>>>>> dependent component should be expressed in the integration-test phase
>>>>> of the HBase pom.xml, so that some version validations can be done in
>>>>> HBase builds.  If provided is expressed, there is no need for a
>>>>> virtual package, i.e.:
>>>>>
>>>>> <dependencies>
>>>>>   <!-- pre-0.23 line: single hadoop-core artifact, any release from 0.20.205.0 up -->
>>>>>   <dependency>
>>>>>     <groupId>org.apache.hadoop</groupId>
>>>>>     <artifactId>hadoop-core</artifactId>
>>>>>     <version>[0.20.205.0,)</version>
>>>>>     <scope>provided</scope>
>>>>>   </dependency>
>>>>>   <!-- split artifacts for the later lines, likewise provided by the runtime classpath -->
>>>>>   <dependency>
>>>>>     <groupId>org.apache.hadoop</groupId>
>>>>>     <artifactId>hadoop-common</artifactId>
>>>>>     <version>[0.22.0,)</version>
>>>>>     <scope>provided</scope>
>>>>>   </dependency>
>>>>>   <dependency>
>>>>>     <groupId>org.apache.hadoop</groupId>
>>>>>     <artifactId>hadoop-hdfs</artifactId>
>>>>>     <version>[0.22.0,)</version>
>>>>>     <scope>provided</scope>
>>>>>   </dependency>
>>>>> </dependencies>
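>>>>>
>>>>> (With provided scope, whoever runs the resulting application still
>>>>> has to supply a concrete Hadoop on the classpath; Maven will not
>>>>> bundle one into the artifact.)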
>>>>>
>>>>> The packaging proposal is to ensure the produced packages are not
>>>>> fixed to a single version of Hadoop.  It is useful for QA to run
>>>>> smoke tests without having to change scripts for the release package.
>>>>>
>>>>> regards,
>>>>> Eric
>>>
>>>
>
>
