[DISCUSS] Publishing and releasing jars for different hadoop version dependencies

Hitesh Shah Thu, 26 Feb 2015 11:07:39 -0800

Hi folks, 

Chris raised a good point earlier in terms of publishing jars for use against 
different versions of hadoop. For the most part, I think we have done well to 
ensure that the user-facing modules are version agnostic but the same does not 
hold for other modules which at times are needed by other applications for 
testing.


There aren’t really too many different options we can try.  The simplest option 
I can think of is just to build tez against different versions of hadoop with 
the tez.version set to something along the lines of 
“tez.version-hadoop.version”. This would imply having tez-api-0.6.0-hadoop2.4 
or tez-api-0.6.0-hadoop26. From a usability point of view, depending on the 
option we pick, users will need to switch their dependencies to point to an 
appropriate version based on what version of hadoop they are using. Apps such 
as hive and pig will need to manage picking a particular version of tez based 
on which hadoop profile they are building against. 
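To make the naming proposal concrete, here is a minimal sketch of how such a 
version string could be composed and how a downstream build might turn its 
hadoop profile into a dependency. The helper names are hypothetical, and the 
dotted "hadoop2.4" spelling is an assumption (the example above also shows 
"hadoop26"); nothing here is an agreed scheme. 

```python
def qualified_version(tez_version: str, hadoop_version: str) -> str:
    """Compose a version in the proposed "tez.version-hadoop.version" form.

    The "-hadoop" infix and the dotted hadoop version are assumptions made
    for illustration only.
    """
    return f"{tez_version}-hadoop{hadoop_version}"


def dependency_for_profile(module: str, tez_version: str, hadoop_line: str) -> str:
    """Return Maven-style coordinates for the jar matching a hadoop line.

    A downstream project (e.g. one with a profile per hadoop line) would
    call this with the hadoop line its active profile builds against.
    """
    version = qualified_version(tez_version, hadoop_line)
    return f"org.apache.tez:{module}:{version}"


if __name__ == "__main__":
    # A project building against hadoop 2.6 would depend on:
    print(dependency_for_profile("tez-api", "0.6.0", "2.6"))
```

The point of the sketch is that the hadoop line becomes part of the version 
itself, so every consumer has to carry the tez version and the hadoop line 
together when declaring the dependency. 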

Any other suggestions for publishing version dependent jars?

For binary releases, should we release only the minimal tarball, or both the 
minimal and full tarballs? The full tarball is the recommended deployment 
model as it is more robust towards compatibility on a changing cluster. It 
should work in most scenarios as long as the hadoop client libraries that Tez 
depends on are compatible with the servers running on the cluster.

General questions for the community/past release managers: 
   - Should we retain the simple version (i.e. plain x.y.z only) when 
building against the default version of hadoop as determined by Tez? This 
“default.version” will have a tendency to evolve over time :) . These simple 
version jars would be in addition to the version specific jars. 
   - What versions of hadoop should we compile against? 2.2, 2.4 and 2.6, or 
2.2, 2.3, 2.4, 2.5 and 2.6? Please note that I am ignoring the last 
(patch-level) version component, so we should pick the latest release in each 
line, i.e. 2.2.1 over 2.2.0 if 2.2.1 exists. 
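
The "latest release in each line" rule from the last question can be sketched 
as below. The function name is hypothetical, and the release list in the usage 
example is illustrative input only, not a statement of which hadoop releases 
actually exist. 

```python
def latest_per_line(releases):
    """Return the newest release in each major.minor line, lowest line first.

    Versions are compared numerically component by component, so that for
    example x.y.10 is treated as newer than x.y.9.
    """
    best = {}
    for release in releases:
        parts = tuple(int(p) for p in release.split("."))
        line = parts[:2]  # (major, minor) identifies the release line
        if line not in best or parts > best[line]:
            best[line] = parts
    return [".".join(str(p) for p in best[line]) for line in sorted(best)]


if __name__ == "__main__":
    # Illustrative input only.
    candidates = ["2.2.0", "2.2.1", "2.4.0", "2.4.1", "2.6.0"]
    print(latest_per_line(candidates))
```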
   
Any other comments? 

thanks
— Hitesh
