Github user mxm commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130677380
Thanks for the contribution @sachingoel0101!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user asfgit closed the pull request at:
https://github.com/apache/flink/pull/966
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130979827
Sure. :)
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130318970
For transformation functions, there is a clear case for thin versus
rich, for Java8 lambdas.
Input formats are a different game. They are super rich by default
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130324909
Functions also need to extend RichFunction to have access to `open()` and
`close()`.
I think the two things a different enough that any strife for consistency
is
Github user mxm commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130324770
Rich does not refer to the number of methods but the fact that it has the
RuntimeContext available. All non-rich variants do not get state inserted. This
follows a naming
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130318553
We are back to square one ;-)
- `Function`: single abstract method
- `RichFunction` = 5 methods. I see how that gets rich.
- `InputFormat`:
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130319611
The suggestion `AbstractInputFormat` was not so bad, in my opinion.
If you want a name that explains what's happening, you can always call it
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130319656
To keep it consistent with the remaining API. For functions you need to
extend a RichFunction if you want to have access to the RuntimeContext. I agree
that the name is
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130384903
Oh apologies. I only saw the first comment on the email thread. I guess
it's more or less settled. I'll leave it up to you guys to make a final
decision on
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130383984
I agree with Stephan's argument that addition of context to I/O formats is
a very marginal enhancement. He literally stole my words. :')
However, from my
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130078564
Okay. I'll hold off on making any changes right now. We should reach at a
consensus first.
---
If your project is set up for it, you can reply to this email and
Github user mxm commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-129849477
Yes, existing formats should converted to rich formats. The name
`Abstract..` makes more sense if it becomes the new default way of interfacing
with the rich input/output
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-129847518
I'm in favor of making the existing formats rich.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-129834217
Okay. I think maybe calling them `Rich` maybe an overkill. I will change
the names to `Abstract`.
Do we keep the existing formats *non-rich* or would it be
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-129830222
Okay, so how about calling the abstract base class`AbstractInputFormat` and
have it implement the runtime context and leaves the other methods abstract?
---
If your
Github user mxm commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-129826022
@StephanEwen @fhueske
The `Rich` prefix might seem a bit odd but it is a naming convention that
is consistent with the user-defined functions which have the
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130091134
@mxm, @sachingoel0101 get your point. Being consistent with the remaining
API is a valid argument. I would be OK with calling the abstract classes
`RichInputFormat` and
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36352389
--- Diff:
flink-core/src/main/java/org/apache/flink/api/common/operators/GenericDataSinkBase.java
---
@@ -328,9 +330,12 @@ public int compare(IN o1, IN o2) {
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36352428
--- Diff:
flink-core/src/main/java/org/apache/flink/api/common/operators/GenericDataSourceBase.java
---
@@ -200,14 +202,18 @@ public void
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36352731
--- Diff:
flink-core/src/test/java/org/apache/flink/api/common/io/RichInputFormatTest.java
---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36352779
--- Diff:
flink-core/src/test/java/org/apache/flink/api/common/io/RichOutputFormatTest.java
---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36355211
--- Diff:
flink-staging/flink-jdbc/src/main/java/org/apache/flink/api/java/record/io/jdbc/JDBCOutputFormat.java
---
@@ -40,7 +40,7 @@
import
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36355327
--- Diff:
flink-tests/src/test/java/org/apache/flink/test/classloading/jar/CustomInputSplitProgram.java
---
@@ -65,7 +65,7 @@ public static void
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36355535
--- Diff:
flink-staging/flink-hadoop-compatibility/src/main/java/org/apache/flink/hadoopcompatibility/mapred/record/HadoopRecordInputFormat.java
---
@@ -37,7
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r3630
--- Diff:
flink-staging/flink-hadoop-compatibility/src/main/java/org/apache/flink/hadoopcompatibility/mapred/record/HadoopRecordOutputFormat.java
---
@@
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36357066
--- Diff:
flink-core/src/main/java/org/apache/flink/api/common/io/ReplicatingInputFormat.java
---
@@ -59,7 +60,7 @@
* @see
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-128157289
Hi @sachingoel0101,
Thanks for the update! The PR looks good in general.
Regarding the interface vs. abstract class issue, I am a bit undecided with
a slight
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-128230692
I'm inclined towards the Abstract class too. If we're saying something is
context aware, we better provide access to it IMO. It doesn't make sense to ask
the user
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127581951
Okay I guess we could leave the proper usage of `RuntimeContext` to the
user. We make it explicit that the `RuntimeContext` will not be available in
`configure`
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127627512
Build failure is on unrelated tests. The recent error on
`PartitionedStateCheckpointingITCase` and the other on `KafkaITCase`.
---
If your project is set up for
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127512155
I am pretty much against `postConfigure()`. There are so many life cycle
methods already, we should really not make it more complicated (I think even
`configure()`
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36136321
--- Diff:
flink-java/src/main/java/org/apache/flink/api/java/io/DiscardingOutputFormat.java
---
@@ -27,7 +27,7 @@
*
* @param T The type of the
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36142648
--- Diff:
flink-core/src/test/java/org/apache/flink/api/common/operators/util/NonRichGenericInputFormat.java
---
@@ -0,0 +1,86 @@
+/*
+ *
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127433865
@fhueske, in the life cycle of an Input format, the `open` method is called
several times in each parallel instance of the `DataSource` operator. Consider
for
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36136514
--- Diff:
flink-java/src/test/java/org/apache/flink/api/java/type/extractor/TypeExtractorInputFormatsTest.java
---
@@ -130,7 +131,7 @@ public void
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36136228
--- Diff:
flink-core/src/test/java/org/apache/flink/api/common/operators/util/NonRichGenericInputFormat.java
---
@@ -0,0 +1,86 @@
+/*
+ * Licensed to
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127430006
@fhueske , what do we settle on then? Should I keep it as abstract class or
change it to an interface?
Also, which classes need to be Rich? Can I assume that
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127415936
Hi @sachingoel0101,
thanks for opening this PR!
I see your point that users would have to implement `setRuntimeContext()`
and `getRuntimeContext()` if we
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36142853
--- Diff:
flink-core/src/test/java/org/apache/flink/api/common/operators/util/NonRichGenericInputFormat.java
---
@@ -0,0 +1,86 @@
+/*
+ * Licensed to
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36136714
--- Diff:
flink-tests/src/test/java/org/apache/flink/test/accumulators/AccumulatorLiveITCase.java
---
@@ -307,7 +307,7 @@ public void flatMap(String value,
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36135989
--- Diff:
flink-tests/src/test/java/org/apache/flink/test/recordJobs/util/DiscardingOutputFormat.java
---
@@ -19,14 +19,14 @@
package
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36136676
--- Diff:
flink-staging/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java
---
@@ -80,7 +80,7 @@ public void
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/966#discussion_r36143535
--- Diff:
flink-core/src/test/java/org/apache/flink/api/common/operators/util/NonRichGenericInputFormat.java
---
@@ -0,0 +1,86 @@
+/*
+ *
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127037112
@StephanEwen, it would be much simpler to have an interface. But then we
leave the part about implementing the `setRuntimeContext` and
`getRuntimeContext` to the
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127032575
What is the `postConfigure` method for? Input and output formats have
already the `open()` method, that is called after `configure()`. The
`configure()` method
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127032323
Would not not be much simpler to add an interface `RuntimeContextFormat`
with a method (`setRuntimeContext()`) and call this method on the formats that
implement
Github user mxm commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-126705428
Thanks for the pull request!
You did not convert all InputFormats and OutputFormats. There are a few
more, e.g. HadoopInputFormatBase or JDBCInputFormat. Other than
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-126706999
Ah yes. I'll update them in a while. There's actually some problem with the
unit test I've written too. Travis fails sporadically.
---
If your project is set up
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/966
[FLINK-1819][core]Allow access to RuntimeContext from Input and Output
formats
1. Introduces new Rich Input and Output formats, similar to Rich Functions.
2. Makes all existing input and
50 matches
Mail list logo