[
https://issues.apache.org/jira/browse/TAJO-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267905#comment-15267905
]
ASF GitHub Bot commented on TAJO-2135:
--------------------------------------
Github user jihoonson commented on a diff in the pull request:
https://github.com/apache/tajo/pull/1009#discussion_r61831252
--- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/window/Rank.java ---
@@ -48,7 +48,8 @@ public Rank() {
public static boolean checkIfDistinctValue(RankContext context, Tuple params) {
for (int i = 0; i < context.latest.length; i++) {
- if (!context.latest[i].equalsTo(params.asDatum(i)).isTrue()) {
+ if ((context.latest[i].isNotNull() || params.asDatum(i).isNotNull())
--- End diff --
Because rank() treats null values as equal to each other, unlike our other
implementations. For background on why a comparison involving null does not
otherwise evaluate to true, please refer to
https://en.wikipedia.org/wiki/Three-valued_logic.
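To illustrate the point above, here is a minimal sketch. It is not the actual Tajo code: it uses boxed Integer values with Java null standing in for SQL NULL instead of Tajo's Datum API, and the class and method names (RankNullSketch, sqlEquals, isDistinctNaive, isDistinctNullAware) are hypothetical. It only shows why a plain three-valued equality check would report two nulls as distinct values, and how a null-aware check avoids that.

```java
import java.util.Objects;

// Hypothetical illustration only; none of these names exist in Tajo.
class RankNullSketch {

    // SQL-style equality under three-valued logic: if either operand is
    // null, the result is "unknown", modeled here as a null Boolean.
    static Boolean sqlEquals(Integer a, Integer b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // Naive check: anything that is not definitely equal counts as distinct,
    // so a pair of nulls is reported as two distinct values.
    static boolean isDistinctNaive(Integer[] latest, Integer[] params) {
        for (int i = 0; i < latest.length; i++) {
            if (!Boolean.TRUE.equals(sqlEquals(latest[i], params[i]))) {
                return true;
            }
        }
        return false;
    }

    // Null-aware check: two nulls count as the same value, while a null
    // paired with a non-null value still counts as distinct.
    static boolean isDistinctNullAware(Integer[] latest, Integer[] params) {
        for (int i = 0; i < latest.length; i++) {
            if (!Objects.equals(latest[i], params[i])) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Integer[] previous = {1, null};
        Integer[] current = {1, null};
        System.out.println("naive: " + isDistinctNaive(previous, current));
        System.out.println("null-aware: " + isDistinctNullAware(previous, current));
    }
}
```

With the naive check, consecutive rows whose sort key is null would each be treated as a new value; the null-aware check keeps them in the same group, which is the behavior described for rank().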
> Invalid join result when join key columns contain nulls
> -------------------------------------------------------
>
> Key: TAJO-2135
> URL: https://issues.apache.org/jira/browse/TAJO-2135
> Project: Tajo
> Issue Type: Bug
> Affects Versions: 0.11.0
> Reporter: Jihoon Son
> Assignee: Jihoon Son
> Priority: Critical
> Fix For: 0.12.0
>
>
> You can reproduce this bug as follows. The correct answer to the query below
> is 20965674, but Tajo returns 101145653.
> {noformat}
> tpcds100> select count(*) from store_sales, store_returns
> where
> ss_customer_sk = sr_customer_sk
> and ss_item_sk = sr_item_sk
> ;
> [=========================================>] 100% 33.315 sec
> ?count
> -------------------------------
> 101145653
> (1 rows, 33.315 sec, 16 B selected)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)