[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506826#comment-14506826
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/572


 Use complete element as join key.
 -

 Key: FLINK-703
 URL: https://issues.apache.org/jira/browse/FLINK-703
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
Assignee: Chiwan Park
Priority: Trivial
  Labels: github-import
 Fix For: pre-apache


 In some situations such as semi-joins it could make sense to use a complete 
 element as join key. 
 Currently this can be done using a key-selector function, but we could offer 
 a shortcut for that.
 This is not an urgent issue, but might be helpful.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/703
 Created by: [fhueske|https://github.com/fhueske]
 Labels: enhancement, java api, user satisfaction, 
 Milestone: Release 0.6 (unplanned)
 Created at: Thu Apr 17 23:40:00 CEST 2014
 State: open
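
As a rough illustration of the semi-join case named in the description, the sketch below models the idea in plain Java collections rather than in Flink. The class and method names (`SemiJoinSketch`, `semiJoin`) are hypothetical and are not part of the Flink API; the point is only that the complete element, via its `equals()`/`hashCode()`, serves as the join key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of a semi-join keyed on the complete element.
// Hypothetical names; this is not Flink code.
final class SemiJoinSketch {

    // Keeps every left element that also occurs in right. The element itself
    // is the key, so equals()/hashCode() decide what counts as a match.
    static <T> List<T> semiJoin(List<T> left, List<T> right) {
        Set<T> rightKeys = new HashSet<>(right);
        List<T> result = new ArrayList<>();
        for (T element : left) {
            if (rightKeys.contains(element)) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> left = Arrays.asList(1, 2, 3, 4);
        List<Integer> right = Arrays.asList(2, 4, 4, 6);
        System.out.println(semiJoin(left, right)); // prints [2, 4]
    }
}
```

Duplicates on the right side do not multiply the output, which is what distinguishes a semi-join from a regular join.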



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503927#comment-14503927
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/572#issuecomment-94591244
  
Thanks for the prompt update and the good work!
Will try the PR and merge it if everything works fine :-)




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502996#comment-14502996
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/572#discussion_r28696934
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/GroupReduceITCase.java
 ---
@@ -1063,6 +1065,33 @@ public void reduce(Iterable<Tuple5<Integer, Long, Integer, String, Long>> values
 
 }
 
+   @Test
+   public void testGroupReduceWithAtomicValue() throws Exception {
+       final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+       DataSet<Integer> ds = env.fromElements(1, 1, 2, 3, 4);
+       DataSet<Integer> reduceDs = ds.groupBy("*").reduceGroup(new GroupReduceFunction<Integer, Integer>() {
+           @Override
+           public void reduce(Iterable<Integer> values, Collector<Integer> out) throws Exception {
+               Set<Integer> set = new HashSet<Integer>();
+               for (Integer i : values) {
+                   set.add(i);
+               }
+               for (Integer i : set) {
+                   out.collect(i);
+               }
+           }
+       });
+
+       env.setParallelism(1);
--- End diff --

Why do you set the parallelism to 1?
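
For readers who want to see what the test above is meant to exercise, the following is a minimal plain-Java model of grouping by the complete element and emitting one value per group. It does not use Flink; `GroupByWholeElementSketch`, `groupByElement` and `reduceGroups` are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of "group by the whole element, then reduce each group".
// Hypothetical names; this is not Flink code.
final class GroupByWholeElementSketch {

    // Uses each element as its own grouping key (the role of the "*" wildcard).
    static <T> Map<T, List<T>> groupByElement(List<T> input) {
        Map<T, List<T>> groups = new LinkedHashMap<>();
        for (T element : input) {
            groups.computeIfAbsent(element, k -> new ArrayList<>()).add(element);
        }
        return groups;
    }

    // Emits one representative per group; all members of a group are equal.
    static <T> List<T> reduceGroups(Map<T, List<T>> groups) {
        List<T> out = new ArrayList<>();
        for (List<T> group : groups.values()) {
            out.add(group.get(0));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 1, 2, 3, 4);
        System.out.println(reduceGroups(groupByElement(input))); // prints [1, 2, 3, 4]
    }
}
```

With the input `1, 1, 2, 3, 4` the two `1` values land in the same group, so the reduced result contains each distinct value once.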




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502997#comment-14502997
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/572#discussion_r28696965
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/JoinITCase.java
 ---
@@ -663,6 +663,39 @@ public void testNonPojoToVerifyNestedTupleElementSelectionWithFirstKeyFieldGreat
 "((3,2,Hello world),(3,2,Hello world)),((3,2,Hello world),(3,2,Hello world))\n";
 }
 
+   @Test
+   public void testJoinWithAtomicType1() throws Exception {
+       final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+
+       DataSet<Tuple3<Integer, Long, String>> ds1 = CollectionDataSets.getSmall3TupleDataSet(env);
+       DataSet<Integer> ds2 = env.fromElements(1, 2);
+
+       DataSet<Tuple2<Tuple3<Integer, Long, String>, Integer>> joinDs = ds1.join(ds2).where(0).equalTo("*");
+
+       joinDs.writeAsCsv(resultPath);
+       env.setParallelism(1);
--- End diff --

Why parallelism == 1?
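
The join in the test above pairs field 0 of a tuple data set with the complete element of an `Integer` data set. A rough plain-Java model of that pairing is sketched below. Nothing here is Flink API: `AtomicJoinSketch`, `Row` and `join` are hypothetical names, and the three sample rows are illustrative values only.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of joining a composite record with an atomic value.
// Hypothetical names and sample data; this is not Flink code.
final class AtomicJoinSketch {

    // Stand-in for a three-field tuple.
    static final class Row {
        final int id;
        final long num;
        final String text;

        Row(int id, long num, String text) {
            this.id = id;
            this.num = num;
            this.text = text;
        }
    }

    // Hash join: index the right side by key, then probe it with each left element.
    static <L, R, K> List<Map.Entry<L, R>> join(
            List<L> left, Function<L, K> leftKey,
            List<R> right, Function<R, K> rightKey) {
        Map<K, List<R>> index = new HashMap<>();
        for (R r : right) {
            index.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
        }
        List<Map.Entry<L, R>> out = new ArrayList<>();
        for (L l : left) {
            List<R> matches = index.get(leftKey.apply(l));
            if (matches == null) {
                continue; // no partner on the right side
            }
            for (R r : matches) {
                out.add(new SimpleImmutableEntry<>(l, r));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(1, 1L, "Hi"), new Row(2, 2L, "Hello"), new Row(3, 2L, "Hello world"));
        List<Integer> ints = Arrays.asList(1, 2);
        // Left key: field 0 of the row. Right key: the element itself.
        List<Map.Entry<Row, Integer>> joined =
                join(rows, (Row row) -> row.id, ints, (Integer i) -> i);
        for (Map.Entry<Row, Integer> pair : joined) {
            System.out.println(pair.getKey().text + " <-> " + pair.getValue());
        }
    }
}
```

The right-hand key function is the identity, which is the role the `"*"` wildcard plays in `equalTo("*")`.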




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503224#comment-14503224
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/572#issuecomment-94512484
  
I updated this PR. Thanks for the advice. :)

* Remove `setParallelism(1)` in test code.
* Simplify `testGroupReduceWithAtomicValue` in `GroupReduceITCase`.
* Add a `null` check for `expressionsIn` in constructor of `ExpressionKeys`.




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503045#comment-14503045
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/572#discussion_r28699341
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/GroupReduceITCase.java
 ---
@@ -1063,6 +1065,33 @@ public void reduce(Iterable<Tuple5<Integer, Long, Integer, String, Long>> values
 
 }
 
+   @Test
+   public void testGroupReduceWithAtomicValue() throws Exception {
+       final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+       DataSet<Integer> ds = env.fromElements(1, 1, 2, 3, 4);
+       DataSet<Integer> reduceDs = ds.groupBy("*").reduceGroup(new GroupReduceFunction<Integer, Integer>() {
+           @Override
+           public void reduce(Iterable<Integer> values, Collector<Integer> out) throws Exception {
+               Set<Integer> set = new HashSet<Integer>();
+               for (Integer i : values) {
+                   set.add(i);
+               }
+               for (Integer i : set) {
+                   out.collect(i);
+               }
+           }
+       });
+
+       env.setParallelism(1);
--- End diff --

Oh, I just copied the code from another test case. I will fix it :)




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503011#comment-14503011
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/572#discussion_r28697764
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/operators/Keys.java ---
@@ -274,24 +274,31 @@ private static int countNestedElementsBefore(CompositeType<?> compositeType, int
  * Create ExpressionKeys from String-expressions
  */
 public ExpressionKeys(String[] expressionsIn, TypeInformation<T> type) {
-   if(!(type instanceof CompositeType<?>)) {
-       throw new IllegalArgumentException("Key expressions are only supported on POJO types and Tuples. "
-               + "A type is considered a POJO if all its fields are public, or have both getters and setters defined");
-   }
-   CompositeType<T> cType = (CompositeType<T>) type;
-   
-   String[] expressions = removeDuplicates(expressionsIn);
-   if(expressionsIn.length != expressions.length) {
-       LOG.warn("The key expressions contained duplicates. They are now unique");
-   }
-   // extract the keys on their flat position
-   keyFields = new ArrayList<FlatFieldDescriptor>(expressions.length);
-   for (int i = 0; i < expressions.length; i++) {
-       List<FlatFieldDescriptor> keys = cType.getFlatFields(expressions[i]); // use separate list to do a size check
-       if(keys.size() == 0) {
-           throw new IllegalArgumentException("Unable to extract key from expression '"+expressions[i]+"' on key "+cType);
+   if (type instanceof AtomicType) {
--- End diff --

can you add a `null` check for `expressionsIn`?
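
One way the requested check could look is sketched below in plain Java, outside of Flink. The helper `normalize` and the class `KeyExpressionChecks` are hypothetical, and the use of `Objects.requireNonNull` is an assumption of this sketch, not necessarily what the PR ended up using. The duplicate removal mirrors the `removeDuplicates` step visible in the removed lines of the diff.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper illustrating a null check before key expressions are used.
// This is not the code from the pull request.
final class KeyExpressionChecks {

    // Rejects a null array up front, then removes duplicate expressions
    // while keeping their first-seen order.
    static String[] normalize(String[] expressionsIn) {
        Objects.requireNonNull(expressionsIn, "Key expressions must not be null");
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(expressionsIn));
        return unique.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] cleaned = normalize(new String[] {"f0", "f1", "f0"});
        System.out.println(Arrays.toString(cleaned)); // prints [f0, f1]
    }
}
```

Failing early with a clear message avoids a less informative `NullPointerException` later, for example when `expressionsIn.length` is read.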




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503018#comment-14503018
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/572#issuecomment-94482477
  
Thanks for the update. Looks really good! 
I have just a few inline comments. Once these are resolved, I would try the 
code and merge it if everything works.




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501501#comment-14501501
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/572#issuecomment-94188129
  
Hi. I updated this PR. The changes are as follows:

* Re-implement this feature by generalizing `ExpressionKeys`.
* Modify `CoGroupOperatorBase`, `GroupCombineOperatorBase` and 
`GroupReduceOperatorBase` to allow the use of an AtomicType as key.
* Add unit tests for invalid usage.

Because the documentation already mentions the wildcard expression with atomic 
types, I didn't modify the documentation.




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485000#comment-14485000
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/572#issuecomment-90863864
  
Looks good, @chiwanpark. Could you add some documentation besides the 
Javadoc? For example, here: 
http://ci.apache.org/projects/flink/flink-docs-master/programming_guide.html#define-keys-using-key-selector-functions
 




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485014#comment-14485014
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/572#issuecomment-90865966
  
Thanks @chiwanpark for the PR!

Using an IdentityKeySelector is not the best solution in this case. A 
`KeySelector<X,Y>` transparently converts a `DataSet<X>` into a 
`DataSet<Tuple2<Y,X>>` and uses the first tuple field as key. Since the 
KeySelector used here is an IdentityKeySelector, we end up with a 
`DataSet<Tuple2<X,X>>`, which unnecessarily doubles the amount of data. 

I will look at this PR later in detail and give some feedback how it could 
be improved. Thanks!
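
The overhead described above can be modeled in a few lines of plain Java. This is only a sketch of the wrapping idea, not Flink's internal implementation; `KeySelectorWrapSketch` and `wrapWithKey` are hypothetical names, and `Map.Entry` stands in for a two-field tuple.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of pairing each record with its extracted key.
// Hypothetical names; this is not Flink's internal code.
final class KeySelectorWrapSketch {

    // Turns every element x into the pair (key(x), x), with the key in first position.
    static <X, Y> List<Map.Entry<Y, X>> wrapWithKey(List<X> input, Function<X, Y> keySelector) {
        List<Map.Entry<Y, X>> wrapped = new ArrayList<>();
        for (X element : input) {
            wrapped.add(new SimpleImmutableEntry<>(keySelector.apply(element), element));
        }
        return wrapped;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb");
        // With an identity selector the key and the value are the same object,
        // so every record is carried twice.
        System.out.println(wrapWithKey(words, Function.<String>identity())); // prints [a=a, bb=bb]
    }
}
```

Each pair holds the element in both positions, which is the doubling the comment refers to and the reason a dedicated code path for whole-element keys is preferable.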




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481150#comment-14481150
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/572#issuecomment-90023731
  
@hsaputra Thanks for the advice! I renamed `BasicKeySelector` to 
`IdentityKeySelector` to avoid ambiguous naming, and I added JavaDoc for 
`IdentityKeySelector`.

BTW, I found some classes that duplicate `IdentityKeySelector` 
([1](https://github.com/apache/flink/blob/master/flink-optimizer/src/test/java/org/apache/flink/optimizer/testfunctions/IdentityKeyExtractor.java),
[2](https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/iterative/StaticlyNestedIterationsITCase.java#L58),
[3](https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/misc/CustomPartitioningITCase.java#L67)).
Should I replace them with `IdentityKeySelector`? That would decrease the 
amount of duplicated code.




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14407480#comment-14407480
 ] 

ASF GitHub Bot commented on FLINK-703:
--

GitHub user chiwanpark opened a pull request:

https://github.com/apache/flink/pull/572

[FLINK-703] Use complete element as join key

Hello. I am opening a pull request for FLINK-703. You can find a more detailed 
description in [JIRA](https://issues.apache.org/jira/browse/FLINK-703). This PR 
contains the following changes:

* Add `BasicKeySelector` class to use the complete element as key.
* Add `checkForAtomicType` method in `Keys` class to check the condition.
* Modify `CoGroupOperator`, `JoinOperator`, `DataSet` in the Java and Scala APIs.
* Add some unit tests and integration tests to test this modification.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chiwanpark/flink FLINK-703

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/572.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #572


commit e6804bbf4d5ec345a723d692da26b277a1cabf04
Author: Chiwan Park chiwanp...@icloud.com
Date:   2015-04-05T18:18:23Z

[FLINK-703] [java api] Use complete element as join key

commit 862c5a5608cd7fd8af22c5a781e87ef0ece79e85
Author: Chiwan Park chiwanp...@icloud.com
Date:   2015-04-05T20:07:11Z

[FLINK-703] [scala api] Use complete element as join key






[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-03-27 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384529#comment-14384529
 ] 

Fabian Hueske commented on FLINK-703:
-

Hi, that's a very good example, and it shows that the description of this issue is 
incomplete ;-)
Key expressions (such as {{*}} and {{_}}) are supported for Pojo, Tuple, 
and CaseClass types.

If you changed your example so that you join against a DataSet of 
primitive types (such as {{DataSet<Integer>}}), it would no longer work. Since 
primitive types are not composite, the only possible key expression is the 
wildcard, which selects the full type. We handle this as a special case in 
several places of the API. See, for example, the function 
{{sortLocalOutput(String fieldExpression, Order order)}} (around line 180) in 
[DataSink.java|https://github.com/apache/flink/blob/master/flink-java/src/main/java/org/apache/flink/api/java/operators/DataSink.java].

So this issue means adding similar functionality to the key definition 
functions of join, coGroup, and grouping. For that, we need to check whether the 
type is a valid key type ({{TypeInformation.isKeyType()}}) and, if the type is 
an atomic type, set the key to int[]\{0\}.
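
The rule outlined in this comment can be sketched as a small stand-alone function. The sketch below is plain Java and deliberately avoids Flink's real classes: the two boolean parameters stand in for the "is atomic" and "is key type" checks, and `AtomicKeySketch` and `keyPositions` are hypothetical names. It is an illustration of the described behaviour under those assumptions, not the implementation that was merged.

```java
import java.util.Arrays;

// Hypothetical stand-alone model of resolving a key expression on an atomic type.
// The boolean flags replace Flink's type information; this is not Flink code.
final class AtomicKeySketch {

    // For an atomic key type, only the wildcards "*" and "_" are valid,
    // and both select the complete element at flat position 0.
    static int[] keyPositions(boolean isAtomicType, boolean isKeyType, String expression) {
        if (expression == null) {
            throw new IllegalArgumentException("Key expression must not be null");
        }
        if (!isKeyType) {
            throw new IllegalArgumentException("Type cannot be used as a key");
        }
        if (!isAtomicType) {
            throw new UnsupportedOperationException("Composite types are outside this sketch");
        }
        String trimmed = expression.trim();
        if (trimmed.equals("*") || trimmed.equals("_")) {
            return new int[] {0};
        }
        throw new IllegalArgumentException(
                "Only the wildcard expressions * and _ are valid on an atomic type: " + expression);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(keyPositions(true, true, "*"))); // prints [0]
    }
}
```

Any field expression other than a wildcard, such as `f0`, is rejected for an atomic type because there is no field to select.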




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-02-09 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313163#comment-14313163
 ] 

Fabian Hueske commented on FLINK-703:
-

I propose wildcard field expressions ( {{*}} and {{_}} ) to define full 
elements (of any type) as join, group, or cogroup keys (see the "Define keys 
using Field Expressions" section in the [programming 
guide|https://github.com/apache/flink/blob/master/docs/programming_guide.md]).
