[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853917#comment-15853917 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user FelixNeutatz closed the pull request at:

    https://github.com/apache/flink/pull/308

> A new DataSet function: extractElementFromTuple
> -----------------------------------------------
>
>                 Key: FLINK-1398
>                 URL: https://issues.apache.org/jira/browse/FLINK-1398
>             Project: Flink
>          Issue Type: Wish
>          Components: DataSet API
>            Reporter: Felix Neutatz
>            Assignee: Felix Neutatz
>            Priority: Minor
>
> This is the use case:
> {code:xml}
> DataSet<Tuple2<Integer, Double>> data = env.fromElements(new Tuple2<Integer, Double>(1,2.0));
>
> data.map(new ElementFromTuple());
>
> }
> public static final class ElementFromTuple implements MapFunction<Tuple2<Integer, Double>, Double> {
>     @Override
>     public Double map(Tuple2<Integer, Double> value) {
>         return value.f1;
>     }
> }
> {code}
> It would be awesome if we had something like this:
> {code:xml}
> data.extractElement(1);
> {code}
> This means that we implement a function for DataSet which extracts a certain element from a given Tuple.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853874#comment-15853874 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user tonycox commented on the issue:

    https://github.com/apache/flink/pull/308

    Okay, I'll add it to clean up jira list
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853856#comment-15853856 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/308

    I agree with Stephan
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853820#comment-15853820 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/308

    Playing devil's advocate here: Given that this is really easy to just solve on-the-fly in a program (especially with Java 8 and lambdas) and has not been requested another time by the community, I would suggest not adding this to Flink.

    I think that in this case, the benefit of reducing maintenance effort outweighs the benefit to users via this utility.
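The "easy to solve on-the-fly with Java 8 lambdas" argument can be illustrated with a minimal sketch. It deliberately avoids any Flink dependency: `Pair` is a hypothetical stand-in for Flink's `Tuple2` (only the public fields `f0` and `f1` are modelled), and a plain `java.util.stream` pipeline stands in for `DataSet.map`. In Flink itself, a lambda mapper may additionally need a type hint via `returns(...)`, which this sketch does not cover.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for Flink's Tuple2; not the real class.
final class Pair<A, B> {
    public final A f0;
    public final B f1;

    Pair(A f0, B f1) {
        this.f0 = f0;
        this.f1 = f1;
    }
}

final class LambdaExtractionSketch {

    // Extracts the second field of every pair with a one-line lambda,
    // instead of a dedicated MapFunction class such as ElementFromTuple.
    static <A, B> List<B> secondFields(List<Pair<A, B>> data) {
        return data.stream().map(p -> p.f1).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Pair<Integer, Double>> data = Arrays.asList(
                new Pair<Integer, Double>(1, 2.0),
                new Pair<Integer, Double>(2, 3.5));
        System.out.println(secondFields(data)); // prints [2.0, 3.5]
    }
}
```

The whole extraction is the expression `p -> p.f1`, which is the point of the objection: a dedicated API method would save very little code.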
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853807#comment-15853807 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user tonycox commented on the issue:

    https://github.com/apache/flink/pull/308

    Hi @FelixNeutatz
    Can you finish this PR ?
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547993#comment-14547993 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-103060542

    +1 for contrib

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543875#comment-14543875 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-102077233

    I think the consensus was that we don't want to have such a method in the DataSet API. We can, however, put a utility for this in flink-contrib. This utility should work for both batch and streaming?

    Any other opinions? Please correct me if I'm wrong.
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543419#comment-14543419 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user FelixNeutatz commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-101981739

    So, what is the final decision here?

    > this seems like a lot of code for something that can be achieved using a simple mapper

    That is exactly the reason why I would want this functionality: it is too much code for a simple thing :)
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509002#comment-14509002 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-95577570

    +1 for not specializing code in the `DataSet`

    A `TupleUtil` would be fine, in my opinion
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508774#comment-14508774 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-95522921

    I would put this into the `flink-contrib` module.
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508782#comment-14508782 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-95523597

    @aljoscha Those are good points. I like the idea of cleaning the Java and Scala APIs from operations that work on structured data such as projection, aggregation, etc. and support those use cases through the Table API.
    Not sure if the Table API can serve as a complete replacement at this point in time, but moving it there is the right thing to do, IMO.

    But this discussion should happen on the dev mailing list, not in some PR comment thread ;-)
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508957#comment-14508957 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-95566780

    Yes, I think we should start a discussion there. I just wanted to give the reasons for my opinion here.
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505613#comment-14505613 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-94918887

    I think this is good in general, modulo two issues:

     - There are going to be more Utils, so I would like to give it a more descriptive name, like TupleUtils, or TupleExtractors, or something along these lines.

     - I think we can omit the return type class. As with the projections, this should not be needed.
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504932#comment-14504932 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/308#discussion_r28776433

    --- Diff: flink-java/src/main/java/org/apache/flink/api/java/lib/DataSetUtil.java ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.api.java.lib;
    +
    +
    +import org.apache.flink.api.common.functions.MapFunction;
    +import org.apache.flink.api.java.DataSet;
    +import org.apache.flink.api.java.operators.MapOperator;
    +import org.apache.flink.api.java.operators.SingleInputUdfOperator;
    +import org.apache.flink.api.java.tuple.Tuple;
    +import org.apache.flink.api.java.typeutils.TupleTypeInfo;
    +import org.apache.flink.api.java.typeutils.TypeExtractor;
    +
    +
    +public class DataSetUtil {
    +
    +
    +    //
    +    // Extraction of a single field
    +    //
    +
    +    /**
    +     * Applies a single field extraction on a {@link Tuple} {@link DataSet}.<br/>
    +     * <b>Note: Can be only applied on Tuple DataSets using the corresponding field index.</b></br>
    +     * The transformation extracts of each Tuple of the DataSet a given field.</br>
    +     *
    +     *
    +     * @param ds The input DataSet.
    +     * @param fieldIndex The field index of the input tuple which is extracted.
    +     * @param outputType Class of the extracted field.
    +     * @return A SingleInputUdfOperator that represents the extracted field.
    +     *
    +     * @see Tuple
    +     * @see DataSet
    +     * @see org.apache.flink.api.java.operators.SingleInputUdfOperator
    +     */
    +    public static <IN extends Tuple, OUT> SingleInputUdfOperator<IN, OUT, MapOperator<IN, OUT>> extractSingleField(DataSet<IN> ds, int fieldIndex, Class<OUT> outputType) {
    +
    +        if(!ds.getType().isTupleType()) {
    +            throw new IllegalArgumentException("The DataSet has to contain a Tuple, not " + ds.getType().getTypeClass().getName());
    +        }
    +
    +        TupleTypeInfo<IN> tupleInfo = (TupleTypeInfo) ds.getType();
    +        if(fieldIndex >= tupleInfo.getArity() || fieldIndex < 0) {
    +            throw new IndexOutOfBoundsException("The field index has to be between 0 and " + (tupleInfo.getArity()-1));
    +        }
    +
    +        if(!tupleInfo.getTypeAt(fieldIndex).equals(TypeExtractor.createTypeInfo(outputType))) {
    +            throw new IllegalArgumentException("The output class type has to be: " + tupleInfo.getTypeAt(fieldIndex).toString());
    +        }
    +
    +        return ds.map(new ExtractElement(fieldIndex)).returns(tupleInfo.getTypeAt(fieldIndex));
    --- End diff --

    Add `.name("Extract Field "+fieldIndex)` to specify the name of the Map operator.
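The validation steps in the diff above (bounds check on the field index, then a check that the requested output class matches the field's type) can be sketched without Flink. This is an illustration under stated assumptions, not the code from the pull request: an `Object[]` stands in for a Flink `Tuple`, a `Class<?>[]` stands in for `TupleTypeInfo`, and the class name `FieldExtractorSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

final class FieldExtractorSketch {

    // Builds a reusable extractor for one field position.
    // fieldTypes is a hypothetical stand-in for Flink's TupleTypeInfo.
    static <OUT> Function<Object[], OUT> extractSingleField(
            Class<?>[] fieldTypes, int fieldIndex, Class<OUT> outputType) {
        // Bounds check, mirroring the IndexOutOfBoundsException in the diff.
        if (fieldIndex < 0 || fieldIndex >= fieldTypes.length) {
            throw new IndexOutOfBoundsException(
                    "The field index has to be between 0 and " + (fieldTypes.length - 1));
        }
        // Type check, mirroring the IllegalArgumentException in the diff.
        if (!fieldTypes[fieldIndex].equals(outputType)) {
            throw new IllegalArgumentException(
                    "The output class type has to be: " + fieldTypes[fieldIndex].getName());
        }
        return tuple -> outputType.cast(tuple[fieldIndex]);
    }

    public static void main(String[] args) {
        Class<?>[] fieldTypes = {Integer.class, Double.class};
        List<Object[]> data = new ArrayList<>();
        data.add(new Object[] {1, 2.0});
        data.add(new Object[] {2, 3.5});

        Function<Object[], Double> second = extractSingleField(fieldTypes, 1, Double.class);
        List<Double> extracted = data.stream().map(second).collect(Collectors.toList());
        System.out.println(extracted); // prints [2.0, 3.5]
    }
}
```

Both checks run once, when the extractor is built, rather than per element; that matches the intent of validating against the type information before the map operator is created.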
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504934#comment-14504934 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-94792842

    How do we proceed with this PR?
    I think it looks good and would be OK with adding this feature to a `DataSetUtils` class.

    Other opinions?
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277526#comment-14277526 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-69976328

    +1 for some utilities.
    I'm not sure however where to put it. Should we add another maven module? Make it part of the current flink-java? Or start it as a github repo outside of the main project?
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277105#comment-14277105 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

GitHub user FelixNeutatz opened a pull request:

    https://github.com/apache/flink/pull/308

    [FLINK-1398] Introduce extractSingleField() in DataSet

    This is a prototype of how we could implement extractSingleField() for DataSet. Let's discuss :)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/FelixNeutatz/incubator-flink ExtractSingleField-FLINK1398

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/308.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #308

commit c3162413b2f6979595393f20d347e6e2057620fa
Author: FelixNeutatz <neut...@googlemail.com>
Date:   2015-01-14T15:50:37Z

    [FLINK-1398] Introduce extractSingleField() in DataSet
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277141#comment-14277141 ]

ASF GitHub Bot commented on FLINK-1398:
---------------------------------------

Github user zentol commented on the pull request:

    https://github.com/apache/flink/pull/308#issuecomment-69940866

    Why did you make a new operator instead of implementing it as a simple map function?
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277226#comment-14277226 ]

Fabian Hueske commented on FLINK-1398:
--------------------------------------

I am not sure how useful or how much needed such an operator is.
Designing an API includes finding the right trade-off between conciseness and providing built-in operators.

Extracting an element can be done using a trivial MapFunction, or in Scala or Java 8 even with a lambda function. So this is just syntactic sugar for convenience. For that we would pay with two additional methods in the API (one with an Integer index for tuples and another one with a field expression String for Pojo and tuple types), and the API is already quite loaded, IMO.

My feeling is that the gain is not enough to justify extending the API, but I am open to other arguments ;-)
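The second variant mentioned above, extraction by a field expression String for POJO types, could look roughly like the sketch below. It is only an illustration under assumptions: it resolves a single public field name through plain Java reflection and does not model Flink's actual field-expression handling. `SensorReading` and `FieldExpressionSketch` are hypothetical names introduced for the example.

```java
import java.lang.reflect.Field;
import java.util.function.Function;

// Hypothetical POJO used only for this illustration.
final class SensorReading {
    public int id;
    public double value;

    SensorReading(int id, double value) {
        this.id = id;
        this.value = value;
    }
}

final class FieldExpressionSketch {

    // Resolves a public field by name once, then returns a reusable extractor.
    // Nested expressions and non-public fields are deliberately not handled.
    static <T> Function<T, Object> extractField(Class<T> type, String fieldName) {
        final Field field;
        try {
            field = type.getField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(
                    "No public field '" + fieldName + "' in " + type.getName(), e);
        }
        return value -> {
            try {
                return field.get(value); // primitive fields come back boxed
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    public static void main(String[] args) {
        Function<SensorReading, Object> byName = extractField(SensorReading.class, "value");
        System.out.println(byName.apply(new SensorReading(7, 2.5))); // prints 2.5
    }
}
```

Compared with the index-based variant, the string-based one loses static typing of the result (`Object` here), which is part of the API cost weighed in the comment above.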