Proposal for Sqoop 1.5

Sqoop 1.4.x is the current production version of Sqoop, and it still supports 
very old versions of Hadoop (from 0.20), Hive (0.7+) and HBase (0.94), among 
others.

There is a good amount of interest in contributing to Sqoop 1, as it is the 
current production version. But continued support for Hadoop 1.x is making it 
hard to bring new features into Sqoop 1.x (for example, getting the Phoenix 
changes into Sqoop, and potentially others waiting in the wings).

Also, we have been using an Ant/Ivy based build, which is causing issues with 
component version management. We could use a Maven profile based configuration 
to support multiple component versions, which would give us more flexibility 
in builds, in packaging and in how we publish artifacts.
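
As a rough sketch of what a profile based configuration could look like (the 
profile ids, the hbase.version property name and the version numbers below are 
placeholders for illustration, not a proposal for the actual build):

```xml
<!-- Hypothetical pom.xml fragment: ids and versions are illustrative only -->
<profiles>
  <!-- Default profile: used when no -P option is given -->
  <profile>
    <id>hbase-1.0</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <properties>
      <hbase.version>1.0.0</hbase.version>
    </properties>
  </profile>
  <!-- Alternative profile: selects a different HBase version -->
  <profile>
    <id>hbase-1.1</id>
    <properties>
      <hbase.version>1.1.2</hbase.version>
    </properties>
  </profile>
</profiles>

<dependencies>
  <!-- The version is resolved from whichever profile is active -->
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>${hbase.version}</version>
  </dependency>
</dependencies>
```

With something like this, "mvn -Phbase-1.1 package" would build against the 
alternative version, while a plain "mvn package" would use the default profile.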

To that end, here is what I propose (I had a brief discussion with Jarcec last 
week), in order of priority.

Create a new Sqoop 1.5 branch where we:

1. Deprecate support for Hadoop 1 and for older versions of HBase (only support 
1.0+) and Hive (only support 1.0+)

2. Mavenize the project

3. Clean up the package jumble in the code, so that we only have 
org.apache.sqoop packages

4. Bring in all the new features that are otherwise difficult to bring in 
with the older component versions

What should we do with the 1.4.x branch? My initial thought is that we do a 
1.4.7 release with what is available and make 1.5.x the branch for further 
changes.

Thoughts?

Thanks

Venkat
