Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment
+1

On Fri, Oct 14, 2016 at 12:04 PM, Stephan Ewen wrote:
> +1
>
> On Fri, Oct 14, 2016 at 11:54 AM, Greg Hogan wrote:
> > +1
> >
> > On Fri, Oct 14, 2016 at 5:29 AM, Fabian Hueske wrote:
> > > [...]
Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment
+1

On Fri, Oct 14, 2016 at 11:54 AM, Greg Hogan wrote:
> +1
>
> On Fri, Oct 14, 2016 at 5:29 AM, Fabian Hueske wrote:
> > [...]
Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment
+1

On Fri, Oct 14, 2016 at 5:29 AM, Fabian Hueske wrote:
> [...]
[DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment
Hi everybody,

I would like to propose deprecating the utility methods to read data with Hadoop InputFormats from the (batch) ExecutionEnvironment.

The motivation for deprecating these methods is to reduce Flink's dependency on Hadoop and instead have Hadoop as an optional dependency for users that actually need it (HDFS, MapRed-Compat, ...). Eventually, we want to have a Flink distribution that does not have a hard Hadoop dependency.

One step for this is to remove the Hadoop dependency from flink-java (Flink's Java DataSet API), which is currently required due to the above utility methods (see FLINK-4315). We recently received a PR that addresses FLINK-4315 and removes the Hadoop methods from the ExecutionEnvironment. After some discussion, it was decided to defer the PR to Flink 2.0 because it breaks the API (these methods are declared @PublicEvolving).

I propose to accept this PR for Flink 1.2, but to deprecate the methods instead of removing them. This would help to migrate old code and prevent new usage of these methods. In a later Flink release (1.3 or 2.0) we could remove these methods and the Hadoop dependency of flink-java.

What do others think?

Best,
Fabian
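To illustrate the deprecate-first, remove-later approach proposed above, here is a minimal, self-contained Java sketch. It does not use Flink or Hadoop classes: EnvironmentSketch, HadoopInputsSketch, and the String return values are hypothetical stand-ins, and having the deprecated method delegate to a helper in an optional module is an assumption about how a migration path could look, not a description of the actual PR.

```java
public class DeprecationSketch {

    /** Hypothetical stand-in for a helper living in an optional Hadoop module. */
    public static final class HadoopInputsSketch {
        public static String readHadoopFile(String path) {
            return "hadoop-input:" + path;
        }
    }

    /** Hypothetical stand-in for the batch ExecutionEnvironment. */
    public static final class EnvironmentSketch {
        /**
         * Kept so that existing code still compiles, but flagged by the compiler.
         *
         * @deprecated use {@link HadoopInputsSketch#readHadoopFile(String)} instead;
         *             this method is a candidate for removal in a later release.
         */
        @Deprecated
        public String readHadoopFile(String path) {
            // Assumed migration path: the old entry point just delegates.
            return HadoopInputsSketch.readHadoopFile(path);
        }
    }

    /** Returns true if the named method carries the @Deprecated annotation. */
    public static boolean isDeprecated(Class<?> type, String name, Class<?>... params) {
        try {
            return type.getDeclaredMethod(name, params)
                       .isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + name, e);
        }
    }

    public static void main(String[] args) {
        EnvironmentSketch env = new EnvironmentSketch();

        // Old code keeps working (with a compiler deprecation warning) ...
        String viaOldApi = env.readHadoopFile("hdfs:///data/in");
        // ... and new code calls the helper directly.
        String viaNewApi = HadoopInputsSketch.readHadoopFile("hdfs:///data/in");

        System.out.println(viaOldApi.equals(viaNewApi));
        System.out.println(
            isDeprecated(EnvironmentSketch.class, "readHadoopFile", String.class));
    }
}
```

In this sketch both call paths return the same value, so users can migrate at their own pace, while the compiler warning discourages new usage until the method is actually removed.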