Hi all,

I also think that completely eliminating the Kite dependency from Sqoop
would be the easiest way forward. I will try to analyze this topic a bit
more next week and come up with subtasks so that we can potentially work on
it in parallel.

I am also happy with the Sqoop 3.0 scope proposal and with Bogi being its
release manager.

Szabolcs


On Fri, Apr 13, 2018 at 2:37 PM, Boglarka Egyed <b...@apache.org> wrote:

> Hi Daniel et al,
>
> Thanks for bringing up this topic and the detailed status update.
>
> I am sharing my thoughts point by point, please find them below.
>
> > 1) How to get a new Kite release? Maybe we should remove the Kite
> > dependency altogether (as Szabolcs hinted in comments of SQOOP-3171)?
>
>
> I think making a new Kite release would be a huge effort: it would
> require upgrading the versions, making the necessary code modifications,
> testing it thoroughly, etc., and then making the release itself. Meanwhile,
> Kite is a very passively maintained tool with minimal activity, so getting
> this done would definitely take a lot of effort. It would also depend on
> the Solr community, as the Morphlines module of Kite is heavily used and
> somewhat actively developed by them. Also, there is indeed a
> shorter/longer term goal to get rid of the Kite dependency in Sqoop
> entirely, i.e. all release efforts would become throw-away very soon.
>
> Focusing on the Kite removal seems more reasonable to me. However, it
> would be great to see an estimate of this effort. @Szabolcs, could you
> maybe share your thoughts on this?
>
> > 2) Should we drop support for Hadoop 2?
> >
>
> I think we can drop support for Hadoop 2, especially if we use
> straightforward versioning with the new release.
>
>
> > 3) What version number should we use? To avoid confusion with Sqoop2 I'd
> > go with 3.0.
> >
>
> I like this idea, +1 for making a 3.0 release containing these changes.
>
>
> > 4) Does (should?) this affect the 1.5 release?
>
>
> I think the answer is yes. Currently, the following breaking changes are
> on the horizon and could be part of the next Sqoop release:
> * com.cloudera package removal (done)
> * Gradle introduction (in progress)
> * Hadoop/Hive/HBase version upgrade (in progress)
> * Kite deprecation/removal (planned)
> * Bump Java version to 8 (planned)
>
> Looking at this list, I would say that a Sqoop 1.5 release containing
> only the com.cloudera package removal, the Gradle introduction and the Java
> version bump would have a rather small scope of little relevance from a
> user perspective, so maybe having two releases (1.5 and 3.0) would be a bit
> of overkill. I would instead suggest going with a Sqoop 3.0 release
> containing all the changes listed above. What do you think?
>
> To summarize, I currently see the following dependencies for the next
> Sqoop release:
> * Finishing up the Gradle patch
> * Hive 3 release
> * Kite removal - this could be the next common effort in the community
>
> Anyhow, I would be happy to take the Release Manager role for the next
> release. Please let me know if everyone would be OK with that.
>
> I am looking forward to seeing others' thoughts on this too.
>
> Many thanks,
> Bogi
>
> On Thu, Apr 12, 2018 at 5:17 PM, Dániel Vörös <daniel.vo...@gmail.com>
> wrote:
>
> > Dear All,
> >
> > After some development towards supporting Hadoop 3 (and the latest
> > versions of downstream components) I'd like to summarize the current
> > state of the upgrade and start the conversation about releasing a new
> > version of Sqoop with Hadoop 3 support.
> >
> > Here's what happened so far:
> >  - Upgraded Hadoop dependency to 3.0.0
> >  - Hive had to be upgraded, since old Hive didn't work with Hadoop 3.
> >  - HBase had to be upgraded since Hive 3 depends on HBase 2(alpha)
> >  - Dealt with a bunch of minor issues like changed Hadoop configuration
> > names and different packaging of Maven artifacts.
> >
> > For details please refer to this ticket and the attached review request:
> > https://issues.apache.org/jira/browse/SQOOP-3305
> >
> > Remaining work:
> >  - Parquet importing doesn't work. It was broken by a
> > standalone-metastore change in Hive and fixing would require a new Kite
> > version to be built against Hive 3.
> >  - Hive 3 is going to enable ACID tables by default. We should support
> > importing into these. Details:
> > https://issues.apache.org/jira/browse/SQOOP-3311
> >
> > Other blocking issues:
> >  - There's no Hive 3 release (no alpha/beta) yet.
> >
> > I'd like to kindly ask you all to share any other tasks/issues you know
> > of that we should address to support the latest versions. Also, there
> > are a couple open questions:
> >  1) How to get a new Kite release? Maybe we should remove the Kite
> > dependency altogether (as Szabolcs hinted in comments of SQOOP-3171)?
> >  2) Should we drop support for Hadoop 2?
> >  3) What version number should we use? To avoid confusion with Sqoop2 I'd
> > go with 3.0.
> >  4) Does (should?) this affect the 1.5 release?
> >
> > Regards,
> > Daniel
> >
>
