[ https://issues.apache.org/jira/browse/FLINK-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fabian Hueske resolved FLINK-1628.
----------------------------------
Resolution: Fixed
Fixed with commit b0a57c32fc68d4a1369e5ece25d3a6e986ee1e2a.
> Strange behavior of "where" function during a join
> --------------------------------------------------
>
> Key: FLINK-1628
> URL: https://issues.apache.org/jira/browse/FLINK-1628
> Project: Flink
> Issue Type: Bug
> Components: Optimizer
> Affects Versions: 0.9
> Reporter: Daniel Bali
> Assignee: Fabian Hueske
> Priority: Critical
> Labels: batch
>
> Hello!
> If I use the `where` function with a field list during a join, it exhibits
> strange behavior.
> Here is the sample code that triggers the error:
> https://gist.github.com/balidani/d9789b713e559d867d5c
> This example joins a DataSet with itself, then counts the number of rows. If
> I use `.where(0, 1)`, the result is (22), which is not correct. If I use
> `EdgeKeySelector` instead, I get the correct result (101).
> When I pass a field list to the `equalTo` function (but not to `where`),
> the result is correct again.
> If I leave out the `groupBy` and `reduceGroup` parts, the result is also correct.
> Also, when working with large DataSets, passing a field list to `where` makes
> the job extremely slow, even though I don't see any exceptions in the log (at
> DEBUG level).
> Does anybody know what might cause this problem?
> Thanks!