[jira] [Commented] (HBASE-9213) create a unified shim for hadoop 1 and 2 so that there's one build of HBase

stack (JIRA) Wed, 14 Aug 2013 22:22:25 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740695#comment-13740695
 ]


stack commented on HBASE-9213:
------------------------------

[~sershe] A single build would be coolio.  If you have ideas for how to make it 
work, I am all ears.

Does the hive shim work well?  It strikes me as a horror to keep up.  Our 
compat modules that Elliott made are hard enough and these are just meshing 
APIs.

We need to build tgzs and publish to maven.  We need to do it for hadoop1 and 
hadoop2.  The tgz needs to include all dependencies, of which there are quite a 
few when you are running on hadoop2.  The dependencies are ill-specified in 
the associated poms, which are overly cautious and pull in way more than is 
needed in the name of "just-in-case".  hadoop1 and hadoop2 and their 
dependencies likely need to be siloed (we might do this in a subdir in the tgz).

Remember also that the hadoop we ship w/ is likely moot anyways as it is 
unlikely to match what the user is running; user has to replace the hadoop we 
ship w/ the hadoop they are running.

That's the tgz.

Then there is publishing to maven.  When we publish to maven we say what we 
depend on in the associated pom we publish.  The vocabulary available to you 
when you are doing maven publishing is limited, cryptic, broken (as best as I 
can discern), and there is no means of flipping a switch to say "I am currently 
dependent on hadoop1 (as opposed to hadoop2)" when downstream dependencies are 
doing their dependency pull.

So, after messing w/ the maven arcana (e.g. classifiers, trying to set 
properties/profiles at publish and dependency fulfillment time, etc.), we've 
ended up w/ our current hokey system where our build can be set against a 
target hadoop using maven profiles.  This works fine for local builds or 
builds up on the build box for unit tests etc., but it is lacking when it 
comes time to publish.

Of note on the arcana: each plugin is written by a different crew w/o 
enforcement of how to name the property that refers to a particular attribute, 
so each plugin can name it as it will.  When it comes to a corner facility 
such as classifiers, plugins may implement it or not, so you have 
'interesting' cases: classifiers work for nearly all of the build pipeline but 
NOT for the final assembly plugin step, the plugin that makes the tgzs.  Or 
the release plugin doesn't even know the name of the pom that it is supposed 
to be reading; you can pass the pom on the command line to mvn and all other 
plugins are fine w/ that, but you have to tell this plugin what pom to use via 
gymnastics.

When publishing, we need to generate two different artifacts and we denote 
them by adding -hadoop1 and -hadoop2 to our version.  I could not make mvn do 
this for us so I made a script to do it, working off the committed poms.
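
To make the version-suffixing idea concrete, here is a rough sketch of the 
kind of rewrite such a script might perform.  It is NOT the actual script 
from HBASE-8224; the function names are hypothetical, and placing the suffix 
ahead of -SNAPSHOT is an assumption (maven only treats a version as a snapshot 
when it ends in -SNAPSHOT).

```python
HADOOP_SUFFIXES = ("hadoop1", "hadoop2")
SNAPSHOT = "-SNAPSHOT"


def with_hadoop_suffix(version: str, hadoop: str) -> str:
    """Return `version` with -hadoop1 or -hadoop2 appended.

    Assumption: the suffix goes before a trailing -SNAPSHOT so that
    maven still recognizes the result as a snapshot version.
    """
    if hadoop not in HADOOP_SUFFIXES:
        raise ValueError(f"unknown hadoop target: {hadoop!r}")
    if version.endswith(SNAPSHOT):
        base = version[: -len(SNAPSHOT)]
        return f"{base}-{hadoop}{SNAPSHOT}"
    return f"{version}-{hadoop}"


def rewrite_pom_version(pom_text: str, old_version: str, hadoop: str) -> str:
    """Replace every <version>old_version</version> in the pom text.

    This is a plain text substitution, so it would also rewrite an
    unrelated dependency that happens to share the same version string.
    """
    new_version = with_hadoop_suffix(old_version, hadoop)
    return pom_text.replace(
        f"<version>{old_version}</version>",
        f"<version>{new_version}</version>",
    )
```

Run once per target against a copy of each committed pom, this would give 
the two sets of poms to publish from.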

On hbase-common, we could likely have a single jar that would work with both 
hadoop1 and hadoop2.  As Elliott says, we haven't done the work (it could be 
just a simple hack in the script over in HBASE-8224).  I've not tried it (it 
didn't occur to me -- it is a good idea).  The prefix-tree module could likely 
drop the hadoop1 and hadoop2 suffix.


[~roshan_naik] Yes on 1.  See the published SNAPSHOTs for examples (let us 
know if the recent ones do not work for you -- we have heard from others that 
they do 
work as dependencies for downstreamers so we are thinking we are good here 
until we hear otherwise).  On 2., yes... all artifacts will have the -hadoop2 
and -hadoop1 appended but you probably won't have to worry because they will be 
pulled in for you by maven (we did some tests to ensure the right dependencies 
come in).  Let us know if it isn't working for you.  Thanks.
                
> create a unified shim for hadoop 1 and 2 so that there's one build of HBase
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-9213
>                 URL: https://issues.apache.org/jira/browse/HBASE-9213
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: build
>            Reporter: Sergey Shelukhin
>             Fix For: 0.96.0
>
>
> This is a brainstorming JIRA. Working with HBase dependency at this point 
> seems to be rather painful from what I hear from other folks. We could do the 
> hive model with unified shim, built in such manner that it can work with 
> either version, where at build time dependencies for all 2-3 versions are 
> pulled and the appropriate one is used for tests, and when running HBase you 
> have to point at Hadoop directory to get the dependencies. I am not very 
> proficient at maven so not quite certain of the best solution yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
