[
https://issues.apache.org/jira/browse/HBASE-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740695#comment-13740695
]
stack commented on HBASE-9213:
------------------------------
[~sershe] A single build would be coolio. If you have ideas for how to make it
work, I am all ears.
Does the hive shim work well? It strikes me as a horror to keep up. Our
compat modules that Elliott made are hard enough, and those are just meshing
APIs.
We need to build tgzs and publish to maven. We need to do it for hadoop1 and
hadoop2. The tgz needs to include all dependencies, of which there are quite a
few when you are running on hadoop2. The dependencies are ill-specified in the
associated poms, which are overly cautious, pulling in way more than is needed
in the name of "just-in-case". hadoop1 and hadoop2 and their dependencies
likely need to be siloed (we might do this in a subdir in the tgz).
Remember also that the hadoop we ship w/ is likely moot anyways, as it is
unlikely to match what the user is running; the user has to replace the hadoop
we ship w/ the hadoop they are running.
That's the tgz.
Then there is publishing to maven. When we publish to maven we say what we
depend on in the associated pom we publish. The vocabulary available to you
when you are doing maven publishing is limited, cryptic, broken (as best as I
can discern), and there is no means of flipping a switch to say "I am currently
dependent on hadoop1 (as opposed to hadoop2)" when downstream dependencies are
doing their dependency pull.
So, after messing w/ the maven arcana -- e.g. classifiers, trying to set
properties/profiles at publish and dependency-fulfillment time, etc. -- we've
ended up w/ our current hokey system where our build can be set against a
target hadoop using maven profiles. That works fine for local builds, or for
builds up on the build box for unit tests etc., but it is lacking when it
comes time to publish. (Of note: each plugin is written by a different crew
w/o enforcement of how to name the property that refers to a particular
attribute, so each plugin names it as it will. And when it comes to a corner
facility such as classifiers, plugins may implement it or not, so you get
'interesting' cases: classifiers work for near all of the build pipeline but
NOT for the final assembly-plugin step, the plugin that makes the tgzs. Or
the release plugin doesn't even know the name of the pom it is supposed to be
reading; you can pass a pom on the command line to mvn and all other plugins
are fine w/ that, but you have to tell this plugin what pom to use via
gymnastics.)
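The profile approach described above looks roughly like the following pom fragment. This is a sketch only -- the profile ids, property names, and artifact coordinates here are illustrative, not necessarily the exact ones in HBase's committed poms:

```xml
<!-- Sketch: select the target hadoop at build time via -Dhadoop.profile=... -->
<profiles>
  <profile>
    <id>hadoop-1.1</id>
    <activation>
      <property><name>hadoop.profile</name><value>1.1</value></property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>${hadoop-one.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>hadoop-2.0</id>
    <activation>
      <property><name>hadoop.profile</name><value>2.0</value></property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop-two.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

Profile selection like this is what works for local builds; as noted, it does not carry through to the published artifacts, which is where the version-suffix scheme below comes in.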
Publishing, we need to generate two different artifacts, and we denote them by
appending -hadoop1 and -hadoop2 to our version. I could not make mvn do this
for us, so I made a script to do it, working off the committed poms.
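The version-suffixing idea can be sketched in a few lines of shell. The real script lives over in HBASE-8224; this just shows the basic rewrite on a one-line pom fragment (the version string and suffix here are examples):

```shell
#!/bin/sh
# Sketch: append a -hadoop1/-hadoop2 suffix to the version in a pom,
# roughly what the release script does against the committed poms.
suffix="hadoop2"
pom='<version>0.96.0</version>'
printf '%s\n' "$pom" | sed "s#<version>\([^<]*\)</version>#<version>\1-${suffix}</version>#"
```

Run against a real pom you would sed the file in place (or write a copy) rather than a string, and do it for every module pom before deploying.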
On hbase-common, we could likely have a single jar that would work with both
hadoop1 and hadoop2. As Elliott says, we haven't done the work (it could be
just a simple hack in the script over in HBASE-8224). I've not tried it (it
didn't occur to me -- it is a good idea). The prefix-tree module could likely
drop the hadoop1 and hadoop2 suffix.
[~roshan_naik] Yes on 1. See the published SNAPSHOTs for examples (let us know
if the recent ones do not work for you -- we have heard from others that they
do work as dependencies for downstreamers, so we are thinking we are good here
until we hear otherwise). On 2., yes... all artifacts will have -hadoop1 or
-hadoop2 appended, but you probably won't have to worry because they will be
pulled in for you by maven (we did some tests to ensure the right dependencies
come in). Let us know if it isn't working for you. Thanks.
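For a downstreamer, picking up one of the suffixed artifacts would look roughly like this in a pom. The coordinates below are illustrative -- check the published poms for the exact artifact ids and versions:

```xml
<!-- Sketch: depend on the hadoop2-flavored build; transitive hadoop2
     dependencies should then be pulled in by maven. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.96.0-hadoop2</version>
</dependency>
```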
> create a unified shim for hadoop 1 and 2 so that there's one build of HBase
> ---------------------------------------------------------------------------
>
> Key: HBASE-9213
> URL: https://issues.apache.org/jira/browse/HBASE-9213
> Project: HBase
> Issue Type: Brainstorming
> Components: build
> Reporter: Sergey Shelukhin
> Fix For: 0.96.0
>
>
> This is a brainstorming JIRA. Working with the HBase dependency at this point
> seems to be rather painful from what I hear from other folks. We could do the
> hive model with a unified shim, built in such a manner that it can work with
> either version: at build time the dependencies for all 2-3 versions are
> pulled and the appropriate one is used for tests, and when running HBase you
> have to point at a Hadoop directory to get the dependencies. I am not very
> proficient at maven so not quite certain of the best solution yet.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira