Thanks for the suggestions, Chris. Filed TEZ-2168 for this. At this point, I am inclined to follow option 2, mainly to retain the ability for users to compile against Hadoop 2.4. I am not sure there is a simple and performant way (without using reflection for all 2.6-specific calls) to retain compile compatibility with option 1.
Any other comments from other folks on this issue in general, or on the 2 options that Chris suggested?

thanks
-- Hitesh

On Feb 26, 2015, at 1:18 PM, Chris K Wensel <[email protected]> wrote:

> The immediate issue is having two mutually exclusive artifacts:
> tez-yarn-timeline-history and tez-yarn-timeline-history
>
> outside of ATSHistoryACLPolicyManager, the code is identical. just the
> dependencies are changed.
>
> TezClient attempts to load this Manager, under the assumption if it exists,
> it is running on hadoop 2.6. (running on 2.4 is fatal)
>
> My recommendation would be never to change artifact names (or conditionally
> choose them) inside of major releases, but accreting new, optional, ones as
> versions progress is fine.
>
> thus I would either:
>
> create a single artifact tez-yarn-timeline-history compiled with a default
> dep of hadoop 2.6, that includes the Manager. update the TezClient code to
> gracefully fail if the Manager is not applicable (the runtime env is Hadoop
> 2.4).
>
> or
>
> offer tez-yarn-timeline-history-with-acls as an optional artifact for Hadoop
> 2.6 deployments, with the single Manager class in it, which in turn requires
> the tez-yarn-timeline-history artifact -- which is sufficient for a 2.4
> runtime. if the user provides the additional -with-acls artifact, they are
> knowingly going to have problems on Hadoop 2.4.
>
> I prefer the first as it keeps my build file simple. graceful degradation of
> services per environment (with appropriate logging) is a well accepted
> practice.
>
> and you can now test Tez across multiple versions of Hadoop/Yarn at runtime
> (outside of compile time).
>
> we do this with Cascading, just simple build file modifications to verify
> binary compatibility (vendors fork this repo to verify their distributions,
> and have been known to find critical bugs):
>
> https://github.com/Cascading/cascading.compatibility
>
> ckw
>
>> On Feb 26, 2015, at 11:03 AM, Hitesh Shah <[email protected]> wrote:
>>
>> Hi folks,
>>
>> Chris raised a good point earlier in terms of publishing jars for use
>> against different versions of hadoop. For the most part, I think we have
>> done well to ensure that the user-facing modules are version agnostic but
>> the same does not hold for other modules which at times are needed by other
>> applications for testing.
>>
>> There aren’t really too many different options we can try. The simplest
>> option I can think of is just to build tez against different versions of
>> hadoop with the tez.version set to something along the lines of
>> “tez.version-hadoop.version”. This would imply having
>> tez-api-0.6.0-hadoop2.4 or tez-api-0.6.0-hadoop26. From a usability point of
>> view, depending on the option we pick, users will need to switch their
>> dependencies to point to an appropriate version based on what version of
>> hadoop they are using. For apps such as hive and pig, they will need to
>> manage picking a particular version of tez based on which hadoop profile
>> they are building against.
>>
>> Any other suggestions for publishing version dependent jars?
>>
>> For binary releases, should we release only the minimal tarball? or both the
>> minimal and full tarballs? The full tarball is the recommended deployment
>> model as it is more robust towards compatibility on a changing cluster. It
>> should work in most scenarios as long as the hadoop client libraries that
>> Tez depends on are compatible with the servers running on the cluster.
>>
>> General questions for the community/past release managers:
>> - Should we retain the simple version (i.e. only plain x.y.z) when
>> building against the default version of hadoop as determined by Tez? This
>> “default.version” will have a tendency to evolve over time :) . These simple
>> version jars would be in addition to the version specific jars.
>> - What versions of hadoop should we compile against? 2.2, 2.4 and 2.6 or
>> 2.2,2.3,2.4,2.5,2.6 ? Please note that I am ignoring the minor version so we
>> should pick the latest version in each line i.e. 2.2.1 over 2.2.0 if 2.2.1
>> exists.
>>
>> Any other comments?
>>
>> thanks
>> -- Hitesh
>
> --
> Chris K Wensel
> [email protected]
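
For reference, the "gracefully fail" behaviour in Chris's first option could look roughly like the sketch below. This is only an illustration, not the actual TezClient code: OptionalPluginLoader and tryLoad are hypothetical names, and it assumes Java 8 (for Optional) and that the optional class has a public no-arg constructor. The idea is to look the class up by name once and to treat a missing class, or a missing Hadoop 2.6 dependency behind it, as "feature unavailable" with a logged warning rather than a fatal error.

```java
import java.util.Optional;
import java.util.logging.Logger;

/** Hypothetical helper for illustration only; not part of Tez. */
final class OptionalPluginLoader {
    private static final Logger LOG =
        Logger.getLogger(OptionalPluginLoader.class.getName());

    private OptionalPluginLoader() {}

    /**
     * Tries to instantiate the named class through its public no-arg
     * constructor. Returns Optional.empty() and logs a warning when the
     * class, or something it links against, is not available at runtime.
     */
    static Optional<Object> tryLoad(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            return Optional.of(clazz.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException | LinkageError e) {
            // ClassNotFoundException: the optional class is not on the classpath.
            // LinkageError (e.g. NoClassDefFoundError): the class is present but
            // a dependency it needs is missing in this runtime environment.
            LOG.warning("Optional component " + className
                + " is unavailable; continuing without it: " + e);
            return Optional.empty();
        }
    }
}
```

A caller would pass the fully qualified name of ATSHistoryACLPolicyManager and skip the ACL setup when the result is empty. Only the single lookup is reflective; this does not by itself address the compile-time concern about 2.6-specific calls.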
