[ 
https://issues.apache.org/jira/browse/HBASE-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741243#comment-13741243
 ] 

Sergey Shelukhin commented on HBASE-9213:
-----------------------------------------

[~eclark] Can you elaborate? Can't we do it in the respective shims/add a proxy 
class/...? Granted, HBase uses HDFS a lot more than Hive does, but it should 
still be doable.


[~stack]
bq. We need to build tgzs and publish to maven. We need to do it for hadoop1 
and hadoop2. The tgz needs to include all dependencies, of which there are 
quite a few when you are running on hadoop2. The dependencies are ill-specified 
in the associated poms, which are overly cautious and pull in way more than is 
needed in the name of "just-in-case". hadoop1 and hadoop2 and their 
dependencies likely need to be siloed (we might do this in a subdir in a tgz).
Do we really need to ship all the dependencies? Users would point us to their 
Hadoop anyway, as you mention below, so there is no need to ship our own 
Hadoop jars.

bq. Then there is publishing to maven. When we publish to maven we say what we 
depend on in the associated pom we publish. The vocabulary available to you 
when you are doing maven publishing is limited, cryptic, broken (as best as I 
can discern), and there is no means of flipping a switch to say "I am currently 
dependent on hadoop1 (as opposed to hadoop2)" when downstream dependencies are 
doing their dependency pull.
As far as I understand, you depend on both, so when somebody pulls you in for 
their own build, they also pull both. The shim is then pointed at the correct 
set based on the local build flags.
The shim always sees just one set (from wherever it is pointed in local tests, 
or from wherever the user pointed it in production), figures out which one it 
is, and initializes itself accordingly.
It is not the prettiest thing, but it is reliable (if the shim cannot recognize 
the version then chances are you are broken wrt it anyway), and it avoids two 
builds (and maven arcana :))
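
To make the "figures out which one it is and initializes itself accordingly" 
part concrete, here is a minimal sketch. Everything in it (the ShimSelector 
class, the HadoopShim interface, the Hadoop1Shim/Hadoop2Shim names, and the 
major-version-only mapping) is hypothetical and made up for illustration; it is 
not existing HBase or Hive code. It takes the version as a plain string so it 
runs without Hadoop on the classpath; in a real shim that string would 
presumably come from org.apache.hadoop.util.VersionInfo.getVersion().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ShimSelector {

    /** Hypothetical shim interface; a real one would wrap the calls that differ between versions. */
    public interface HadoopShim {
        String name();
    }

    /** Placeholder for hadoop1-specific behavior (illustrative only). */
    public static final class Hadoop1Shim implements HadoopShim {
        @Override
        public String name() { return "hadoop1"; }
    }

    /** Placeholder for hadoop2-specific behavior (illustrative only). */
    public static final class Hadoop2Shim implements HadoopShim {
        @Override
        public String name() { return "hadoop2"; }
    }

    // Matches the leading major version number, e.g. "2" in "2.0.5-alpha".
    private static final Pattern MAJOR = Pattern.compile("^(\\d+)\\.");

    /** Extracts the major version, or fails loudly if the string is not recognized. */
    public static int majorVersionOf(String version) {
        Matcher m = MAJOR.matcher(version);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized Hadoop version: " + version);
        }
        return Integer.parseInt(m.group(1));
    }

    /**
     * Picks the shim for the single Hadoop version visible at runtime.
     * Assumption: the major version alone is enough to choose; a real mapping
     * would likely need more cases.
     */
    public static HadoopShim forVersion(String version) {
        switch (majorVersionOf(version)) {
            case 1:
                return new Hadoop1Shim();
            case 2:
                return new Hadoop2Shim();
            default:
                // Unknown version: refuse to start rather than guess.
                throw new IllegalArgumentException("No shim for Hadoop version: " + version);
        }
    }

    public static void main(String[] args) {
        // With Hadoop on the classpath this would be VersionInfo.getVersion().
        String version = args.length > 0 ? args[0] : "2.0.5-alpha";
        System.out.println("Selected shim: " + forVersion(version).name());
    }
}
```

Failing on an unrecognized version is deliberate: as noted above, if the shim 
cannot recognize the version, chances are things are broken against it anyway.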

bq. On hbase-common, we could likely have a single jar that would work with 
both hadoop1 and hadoop2. As Elliott says, we haven't done the work (it could 
be just a simple hack in the script over in HBASE-8224). I've not tried it (it 
didn't occur to me – it is a good idea). The prefix-tree module could likely 
drop the hadoop1 and hadoop2 suffix.
The jars have different sizes, so it looks like they are indeed different...

bq. all artifacts will have the -hadoop2 and -hadoop1 appended but you probably 
won't have to worry because they will be pulled in for you by maven (we did 
some tests to ensure the right dependencies come in). Let us know if it isn't 
working for you. Thanks.
But you would still need to include a particular one, right?

                
> create a unified shim for hadoop 1 and 2 so that there's one build of HBase
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-9213
>                 URL: https://issues.apache.org/jira/browse/HBASE-9213
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: build
>            Reporter: Sergey Shelukhin
>             Fix For: 0.96.0
>
>
> This is a brainstorming JIRA. Working with the HBase dependency at this point 
> seems to be rather painful, from what I hear from other folks. We could 
> follow the Hive model with a unified shim, built in such a manner that it can 
> work with either version: at build time, dependencies for all 2-3 versions 
> are pulled and the appropriate one is used for tests, and when running HBase 
> you have to point at the Hadoop directory to get the dependencies. I am not 
> very proficient at maven, so I am not quite certain of the best solution yet.

