[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-13 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130677380
  
Thanks for the contribution @sachingoel0101!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/966


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-13 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130979827
  
Sure. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130318970
  
For transformation functions, there is a clear case for thin versus 
rich, for Java8 lambdas.
Input formats are a different game. They are super rich by default anyways.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130324909
  
Functions also need to extend RichFunction to have access to `open()` and 
`close()`.
I think the two things a different enough that any strife for consistency 
is actually pretty random.
If your thoughts currently revolve around the RuntimeContext, it apprears 
more consistent. If you thoughts are on the life cycle methods, it seems 
inconsistent. Random.

I think you should go ahead and just call them Rich. It is just a name, 
and what matters is that the JavaDocs describe what it actually means...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130324770
  
Rich does not refer to the number of methods but the fact that it has the 
RuntimeContext available. All non-rich variants do not get state inserted. This 
follows a naming convention in Flink. `AbstractInputFormat` might be a more 
intuitive name for novices but I'm more inclined to naming consistency. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130318553
  
We are back to square one ;-)

  - `Function`: single abstract method
  - `RichFunction` = 5 methods. I see how that gets rich.

  - `InputFormat`: 8 methods
  - `RichInputFormat`: 10 methods.
We could call it `SlightlyMoreRichInputFormat` ;-) I cannot help but find 
this very confusing.

Why the urgency to stick with the Rich prefix?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130319611
  
The suggestion `AbstractInputFormat` was not so bad, in my opinion.
If you want a name that explains what's happening, you can always call it 
`InputFormatWithContext` ;-)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130319656
  
To keep it consistent with the remaining API. For functions you need to 
extend a RichFunction if you want to have access to the RuntimeContext. I agree 
that the name is not perfect (and I think everybody else got your point as 
well) but I think it's a valid point to aim for consistency.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130384903
  
Oh apologies. I  only saw the first comment on the email thread. I guess
it's more or less settled. I'll leave it up to you guys to make a final
decision on this. :')
On Aug 12, 2015 10:59 PM, Sachin Goel sachingoel0...@gmail.com wrote:

 I agree with Stephan's argument that addition of context to I/O formats is
 a very marginal enhancement. He literally stole my words. :')
 However, from my perspective, when I first started using flink, Rich meant
 runtime context. The idea of open and close wasn't as nearly exciting as
 the runtime context.
 What if we changed back to the original name mentioned on jira and make it
 `ContextAwareInputFormat`? Would everyone be okay with that?
 On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote:

 Functions also need to extend RichFunction to have access to open() and
 close().
 I think the two things a different enough that any strife for
 consistency is actually pretty random.
 If your thoughts currently revolve around the RuntimeContext, it apprears
 more consistent. If you thoughts are on the life cycle methods, it seems
 inconsistent. Random.

 I think you should go ahead and just call them Rich. It is just a name,
 and what matters is that the JavaDocs describe what it actually means...

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/flink/pull/966#issuecomment-130324909.





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130383984
  
I agree with Stephan's argument that addition of context to I/O formats is
a very marginal enhancement. He literally stole my words. :')
However, from my perspective, when I first started using flink, Rich meant
runtime context. The idea of open and close wasn't as nearly exciting as
the runtime context.
What if we changed back to the original name mentioned on jira and make it
`ContextAwareInputFormat`? Would everyone be okay with that?
On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote:

 Functions also need to extend RichFunction to have access to open() and
 close().
 I think the two things a different enough that any strife for
 consistency is actually pretty random.
 If your thoughts currently revolve around the RuntimeContext, it apprears
 more consistent. If you thoughts are on the life cycle methods, it seems
 inconsistent. Random.

 I think you should go ahead and just call them Rich. It is just a name,
 and what matters is that the JavaDocs describe what it actually means...

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/flink/pull/966#issuecomment-130324909.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130078564
  
Okay. I'll hold off on making any changes right now. We should reach at a 
consensus first.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129849477
  
Yes, existing formats should converted to rich formats. The name 
`Abstract..` makes more sense if it becomes the new default way of interfacing 
with the rich input/output formats. I still think, calling it `Rich...` makes 
more sense in terms of consistency with the user-defined functions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129847518
  
I'm in favor of making the existing formats rich.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129834217
  
Okay. I think maybe calling them `Rich` maybe an overkill. I will change 
the names to `Abstract`. 
Do we keep the existing formats *non-rich* or would it be okay to make all 
of them *rich* [which is the case right now]? The problem with not making them 
*rich* is that then we're limiting any extending classes to never be *rich*.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129830222
  
Okay, so how about calling the abstract base class`AbstractInputFormat` and 
have it implement the runtime context and leaves the other methods abstract?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129826022
  
@StephanEwen @fhueske 
The `Rich` prefix might seem a bit odd but it is a naming convention that 
is consistent with the user-defined functions which have the RuntimeContext 
available.
As for `abstract classes vs interface` I agree with @sachingoel0101 that it 
makes sense to implement access to the RuntimeContext once for the user. We may 
also add other rich functions; if we implemented that using an interface we 
would have to change it later on or add another interface.

I think Input/Ouput formats should be rich by default. However, we might 
not want to break the API for existing external implementations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130091134
  
@mxm, @sachingoel0101 get your point. Being consistent with the remaining 
API is a valid argument. I would be OK with calling the abstract classes 
`RichInputFormat` and `RichOutputFormat`.

Let's wait for at least 24 hours and merge this PR if nobody speaks up.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36352389
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/operators/GenericDataSinkBase.java
 ---
@@ -328,9 +330,12 @@ public int compare(IN o1, IN o2) {
if(format instanceof InitializeOnMaster) {
((InitializeOnMaster)format).initializeGlobal(1);
}
-   
+
+   if(format instanceof RichOutputFormat){
--- End diff --

Move after `configure()` call


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36352428
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/operators/GenericDataSourceBase.java
 ---
@@ -200,14 +202,18 @@ public void accept(VisitorOperator? visitor) {
visitor.postVisit(this);
}
}
-   
+
// 


-   protected ListOUT executeOnCollections(ExecutionConfig 
executionConfig) throws Exception {
+   protected ListOUT executeOnCollections(RuntimeContext ctx, 
ExecutionConfig executionConfig) throws Exception {
@SuppressWarnings(unchecked)
InputFormatOUT, InputSplit inputFormat = (InputFormatOUT, 
InputSplit) this.formatWrapper.getUserCodeObject();
+
+   if(inputFormat instanceof RichInputFormat){
--- End diff --

Move after `configure()` call


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36352731
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/common/io/RichInputFormatTest.java
 ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.flink.api.common.io;
+
+import java.util.HashMap;
+
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.accumulators.Accumulator;
+import org.apache.flink.api.common.functions.util.RuntimeUDFContext;
+import org.apache.flink.types.Record;
+import org.junit.Assert;
+import org.junit.Test;
+
+/**
+ * Tests runtime context access from inside an RichInputFormat class
+ */
+public class RichInputFormatTest {
+
+   @Test
+   public void testCheckRuntimeContextAccess() {
+   final SerializedInputFormatRecord inputFormat = new 
SerializedInputFormatRecord();
--- End diff --

We are trying to reduce the use of the `Record` data type which is an 
artifact of a deprecated API. Could you replace it by another data type?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36352779
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/common/io/RichOutputFormatTest.java
 ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.flink.api.common.io;
+
+import java.util.HashMap;
+
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.accumulators.Accumulator;
+import org.apache.flink.api.common.functions.util.RuntimeUDFContext;
+import org.apache.flink.types.Record;
+import org.junit.Assert;
+import org.junit.Test;
+
+/**
+ * Tests runtime context access from inside an RichOutputFormat class
+ */
+public class RichOutputFormatTest {
+
+   @Test
+   public void testCheckRuntimeContextAccess() {
+   final SerializedOutputFormatRecord inputFormat = new 
SerializedOutputFormatRecord();
--- End diff --

Please replace `Record` data type.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36355211
  
--- Diff: 
flink-staging/flink-jdbc/src/main/java/org/apache/flink/api/java/record/io/jdbc/JDBCOutputFormat.java
 ---
@@ -40,7 +40,7 @@
 import org.apache.flink.types.StringValue;
 import org.apache.flink.types.Value;
 
-public class JDBCOutputFormat implements OutputFormatRecord {
--- End diff --

Class has been removed by a recent commit.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36355327
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/classloading/jar/CustomInputSplitProgram.java
 ---
@@ -65,7 +65,7 @@ public static void main(String[] args) throws Exception {
}
// 


-   public static final class CustomInputFormat implements 
InputFormatInteger, CustomInputSplit, ResultTypeQueryableInteger {
+   public static final class CustomInputFormat extends 
RichInputFormatInteger, CustomInputSplit implements 
ResultTypeQueryableInteger {
--- End diff --

This is a test class which does not need to be adapted.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36355535
  
--- Diff: 
flink-staging/flink-hadoop-compatibility/src/main/java/org/apache/flink/hadoopcompatibility/mapred/record/HadoopRecordInputFormat.java
 ---
@@ -37,7 +37,7 @@
 import org.apache.hadoop.mapred.RecordReader;
 import org.apache.hadoop.util.ReflectionUtils;
 
-public class HadoopRecordInputFormatK, V implements InputFormatRecord, 
HadoopInputSplit {
+public class HadoopRecordInputFormatK, V extends RichInputFormatRecord, 
HadoopInputSplit {
--- End diff --

Class will be removed soon. Not strictly necessary to adapt it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r3630
  
--- Diff: 
flink-staging/flink-hadoop-compatibility/src/main/java/org/apache/flink/hadoopcompatibility/mapred/record/HadoopRecordOutputFormat.java
 ---
@@ -37,7 +37,7 @@
 import org.apache.hadoop.util.ReflectionUtils;
 
 
-public class HadoopRecordOutputFormatK,V implements OutputFormatRecord 
{
+public class HadoopRecordOutputFormatK,V extends 
RichOutputFormatRecord {
--- End diff --

Class will be removed soon. Not strictly necessary to adapt it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36357066
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/io/ReplicatingInputFormat.java
 ---
@@ -59,7 +60,7 @@
  * @see org.apache.flink.api.common.operators.base.FilterOperatorBase
  * @see org.apache.flink.api.common.operators.base.MapPartitionOperatorBase
  */
-public final class ReplicatingInputFormatOT, S extends InputSplit 
implements InputFormatOT, S {
+public final class ReplicatingInputFormatOT, S extends InputSplit 
extends RichInputFormatOT, S {
--- End diff --

The `ReplicatingInputFormat` can wrap any kind of `InputFormat` (with or 
without RuntimeContext). 
Therefore, it needs to extend `RichInputFormat` but overwrite the 
`setRuntimeContext()` method and forward the context to its wrapped IF, if this 
is also a `RichInputFormat`.

Please check the updated JavaDocs as well. The wrapped IF does not need to 
be rich.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-128157289
  
Hi @sachingoel0101,
Thanks for the update! The PR looks good in general.
Regarding the interface vs. abstract class issue, I am a bit undecided with 
a slight preference towards the abstract class. However, I agree with 
@StephanEwen that `RichInputFormat` and `RichOutputFormat` are not good names. 
How about `ContextInputFormat` and `ContextOutputFormat`?

Any other opinions on that matter?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-05 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-128230692
  
I'm inclined towards the Abstract class too. If we're saying something is 
context aware, we better provide access to it IMO. It doesn't make sense to ask 
the user to write it themselves and guarantee that we'd just call the relevant 
methods at runtime.
As for the names, wouldn't it be better to maintain consistency in naming? 
All the * rich * functions are named `RichFunctions`, which is why I decided to 
name these `RichFormat`s too. I do agree though. The IO formats are already 
pretty * rich * in the sense that they allow the user to control everything 
that happens at runtime.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-04 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127581951
  
Okay I guess we could leave the proper usage of `RuntimeContext` to the 
user. We make it explicit that the `RuntimeContext` will not be available in 
`configure` in the documentation, and if someone is overriding the `open` 
method, they likely know it'll be called several times and can take care of not 
performing, for example, duplicate `accumulator` additions. What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-04 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127627512
  
Build failure is on unrelated tests. The recent error on 
`PartitionedStateCheckpointingITCase` and the other on `KafkaITCase`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-04 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127512155
  
I am pretty much against `postConfigure()`. There are so many life cycle 
methods already, we should really not make it more complicated (I think even 
`configure()` is a problem, but that's an artifact).

It is not so hard to make it right in the `open()` method, and there are 
methods that getOrCreate accumulators, afaik.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36136321
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/io/DiscardingOutputFormat.java
 ---
@@ -27,7 +27,7 @@
  *
  * @param T The type of the elements accepted by the output format.
  */
-public class DiscardingOutputFormatT implements OutputFormatT {
+public class DiscardingOutputFormatT extends RichOutputFormatT {
--- End diff --

Doesn't need to access the RuntimeContext.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread sachingoel0101
Github user sachingoel0101 commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36142648
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/common/operators/util/NonRichGenericInputFormat.java
 ---
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.flink.api.common.operators.util;
+
+import java.io.IOException;
+
+import org.apache.flink.api.common.io.InputFormat;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+
+/**
+ * Generic base class for all inputs that are not based on files.
+ * This is copied from {@link 
org.apache.flink.api.common.io.GenericInputFormat}
+ * This class doesn't provide access to RuntimeContext.
+ */
+public abstract class NonRichGenericInputFormatOT implements 
InputFormatOT, GenericInputSplit {
--- End diff --

This was only for testing purposes. I needed to verify that I didn't break 
any functionality by making every IO format rich. That is, to verify the class 
casting behavior in the Pact tasks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127433865
  
@fhueske, in the life cycle of an Input format, the `open` method is called 
several times in each parallel instance of the `DataSource` operator. Consider 
for example, the simple example of adding an accumulator. Since access to 
`RuntimeContext` is enabled only after configure, we will end up adding the 
same accumulator several times, which of course would fail. This is why I 
decided to add a `postConfigure` method.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36136514
  
--- Diff: 
flink-java/src/test/java/org/apache/flink/api/java/type/extractor/TypeExtractorInputFormatsTest.java
 ---
@@ -130,7 +131,7 @@ public void testQueryableFormatType() {
//  Test formats
// 


-   public static final class DummyFloatInputFormat implements 
InputFormatFloat, InputSplit {
+   public static final class DummyFloatInputFormat extends 
RichInputFormatFloat, InputSplit {
--- End diff --

Test input/output formats don't need access to the RuntimeContext.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36136228
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/common/operators/util/NonRichGenericInputFormat.java
 ---
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.flink.api.common.operators.util;
+
+import java.io.IOException;
+
+import org.apache.flink.api.common.io.InputFormat;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+
+/**
+ * Generic base class for all inputs that are not based on files.
+ * This is copied from {@link 
org.apache.flink.api.common.io.GenericInputFormat}
+ * This class doesn't provide access to RuntimeContext.
+ */
+public abstract class NonRichGenericInputFormatOT implements 
InputFormatOT, GenericInputSplit {
--- End diff --

I don't think we need this class. Why would somebody need a generic input 
format without RuntimeContext, if you can have one with RC for free?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127430006
  
@fhueske , what do we settle on then? Should I keep it as abstract class or 
change it to an interface?
Also, which classes need to be Rich? Can I assume that the ones you didn't 
comment on are okay?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127415936
  
Hi @sachingoel0101,

thanks for opening this PR! 
I see your point that users would have to implement `setRuntimeContext()` 
and `getRuntimeContext()` if we would use an interface. However, I am not sure 
that this would affect many people because we could implement the methods in 
all our base Input/OutputFormats which should cover a large portion of new 
formats. Also, these methods can be implemented in less than a minute. I think 
@StephanEwen has a point that all Input/OutputFormats are always quite rich. I 
would opt for the interface solution but implement it for all relevant 
Input/OutputFormats that Flink provides.

I would also remove the `postConfigure()` method in favor of a clear 
documentation of the interface stating that the `RuntimeContext`is available in 
`open()`.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36142853
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/common/operators/util/NonRichGenericInputFormat.java
 ---
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.flink.api.common.operators.util;
+
+import java.io.IOException;
+
+import org.apache.flink.api.common.io.InputFormat;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+
+/**
+ * Generic base class for all inputs that are not based on files.
+ * This is copied from {@link 
org.apache.flink.api.common.io.GenericInputFormat}
+ * This class doesn't provide access to RuntimeContext.
+ */
+public abstract class NonRichGenericInputFormatOT implements 
InputFormatOT, GenericInputSplit {
--- End diff --

Ah, sorry. Didn't notice it is in test scope. Good have it for tests!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36136714
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/accumulators/AccumulatorLiveITCase.java
 ---
@@ -307,7 +307,7 @@ public void flatMap(String value, CollectorInteger 
out) throws Exception {
 
}
 
-   private static class WaitingOutputFormat implements 
OutputFormatInteger {
+   private static class WaitingOutputFormat extends 
RichOutputFormatInteger {
--- End diff --

Test OF does not use the RuntimeContext.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36135989
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/recordJobs/util/DiscardingOutputFormat.java
 ---
@@ -19,14 +19,14 @@
 
 package org.apache.flink.test.recordJobs.util;
 
-import org.apache.flink.api.common.io.OutputFormat;
+import org.apache.flink.api.common.io.RichOutputFormat;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.types.Record;
 
 /**
  * A simple output format that discards all data by doing nothing.
  */
-public class DiscardingOutputFormat implements OutputFormatRecord {
+public class DiscardingOutputFormat extends RichOutputFormatRecord {
--- End diff --

I don't think the DiscardingOutputFormat needs to access the RuntimeContext


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36136676
  
--- Diff: 
flink-staging/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java
 ---
@@ -80,7 +80,7 @@ public void cancel() {
 * This class implements an OutputFormat for HBase
 *
 */
-   private static class HBaseOutputFormat implements OutputFormatString {
+   private static class HBaseOutputFormat extends RichOutputFormatString 
{
--- End diff --

No need to update this class if the example doesn't use the RuntimeContext.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-03 Thread sachingoel0101
Github user sachingoel0101 commented on a diff in the pull request:

https://github.com/apache/flink/pull/966#discussion_r36143535
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/common/operators/util/NonRichGenericInputFormat.java
 ---
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.flink.api.common.operators.util;
+
+import java.io.IOException;
+
+import org.apache.flink.api.common.io.InputFormat;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+
+/**
+ * Generic base class for all inputs that are not based on files.
+ * This is copied from {@link 
org.apache.flink.api.common.io.GenericInputFormat}
+ * This class doesn't provide access to RuntimeContext.
+ */
+public abstract class NonRichGenericInputFormatOT implements 
InputFormatOT, GenericInputSplit {
--- End diff --

No problem! :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-02 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127037112
  
@StephanEwen, it would be much simpler to have an interface. But then we 
leave the part about implementing the `setRuntimeContext` and 
`getRuntimeContext` to the user. IMO that's not a very good option. So it makes 
to extend rich formats from abstract Rich formats. Doing that, I couldn't 
figure out a better solution if we want to make all existing IO formats rich. 
To sum it up:
1. If we only defined an interface, we leave the implementation of `setter` 
and `getter` methods to the user.
2. If we write a Rich abstract class, we have to make all existing IO 
formats rich since the user might want to extend one of these, and of course 
can't extend two classes.
If there's a way to achieve this, let me know. I might have missed 
something.

As for the `postConfigure` method, it was to guarantee that the 
`RuntimeContext` will be available now. I tried initializing the context in 
`DataSourceTask` and `DataSinkTask` before calling `configure`, but it seems 
the format is re-initialized somewhere as that information is lost. I could 
remove this function and we can just initialize `counters` etc. in `open` call.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-02 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127032575
  
What is the `postConfigure` method for? Input and output formats have 
already the `open()` method, that is called after `configure()`. The 
`configure()` method itself is somewhat of a baggage from the past, when all 
code was shipped as classes + configuration, rather than via closures.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-02 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-127032323
  
Would not not be much simpler to add an interface `RuntimeContextFormat` 
with a method (`setRuntimeContext()`) and call this method on the formats that 
implement this interface? That would be three changed classes, done.

I am also a bit sceptical about the difference between input format and 
rich input format. For transformation functions the difference between Function 
(a lambda) and RichFunction (many mode methods) makes sense. Input formats are 
per-se pretty rich...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-07-31 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-126705428
  
Thanks for the pull request!

You did not convert all InputFormats and OutputFormats. There are a few 
more, e.g. HadoopInputFormatBase or JDBCInputFormat. Other than that, your pull 
request looks good. I think it is fine to keep the notion of normal and rich, 
just like we do for the operator user functions. That also means that we do not 
break the existing interfaces.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-07-31 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-126706999
  
Ah yes. I'll update them in a while. There's actually some problem with the 
unit test I've written too. Travis fails sporadically.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-07-31 Thread sachingoel0101
GitHub user sachingoel0101 opened a pull request:

https://github.com/apache/flink/pull/966

[FLINK-1819][core]Allow access to RuntimeContext from Input and Output 
formats

1. Introduces new Rich Input and Output formats, similar to Rich Functions.
2. Makes all existing input and output formats rich, without any API 
breaking changes.
3. Provides RuntimeContext to GenericSink and GenericSource operators
4. DataSourceTask and DataSinkTask set DistributedUDFContext to their input 
and output formats respectively.
5. Complete access to RuntimeContext, including access to accumulators.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sachingoel0101/flink flink-1819

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/966.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #966


commit 627ee290fb88ac3e670fe86a734116b06f431ade
Author: Sachin Goel sachingoel0...@gmail.com
Date:   2015-07-31T04:36:20Z

[FLINK-1819][core]Allow access to RuntimeContext from Input and Output
formats




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---