[ 
https://issues.apache.org/jira/browse/HIVE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-741:
-----------------------------------------

    Attachment: patch-741-2.txt

The patch also fixes SMBMapJoinOperator. I modified
compareKeys(ArrayList<Object> k1, ArrayList<Object> k2) to do the following:
{code}
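    if (hasNullElements(k1) && hasNullElements(k2)) {
      // both keys contain a null: never equal, so just report k1 as smaller than k2
      return -1;
    } else if (hasNullElements(k1)) {
      // only k1 contains a null: k1 sorts before k2
      return -k2.size();
    } else if (hasNullElements(k2)) {
      // only k2 contains a null: k1 sorts after k2
      return k1.size();
    }
    ... // the existing code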
{code}

Does the above make sense?
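
To make the question concrete, here is a minimal, self-contained sketch of the idea behind the compareKeys change. It is not the patch itself: the class name NullAwareKeyCompare, the key(...) helper, and compareNonNullKeys (a stand-in for "the existing code") are hypothetical, and hasNullElements is assumed to simply check whether any key column is null.

```java
import java.util.ArrayList;
import java.util.Arrays;

class NullAwareKeyCompare {

  // Assumed behavior: true if any column of the join key is null.
  static boolean hasNullElements(ArrayList<Object> key) {
    for (Object col : key) {
      if (col == null) {
        return true;
      }
    }
    return false;
  }

  // Hypothetical stand-in for "the existing code": column-by-column comparison
  // of keys that are known to contain no nulls.
  @SuppressWarnings({"unchecked", "rawtypes"})
  static int compareNonNullKeys(ArrayList<Object> k1, ArrayList<Object> k2) {
    int n = Math.min(k1.size(), k2.size());
    for (int i = 0; i < n; i++) {
      int c = ((Comparable) k1.get(i)).compareTo(k2.get(i));
      if (c != 0) {
        return c;
      }
    }
    return k1.size() - k2.size();
  }

  // A key containing a null never compares equal to any other key,
  // so rows with NULL join keys are never matched.
  static int compareKeys(ArrayList<Object> k1, ArrayList<Object> k2) {
    if (hasNullElements(k1) && hasNullElements(k2)) {
      return -1;
    } else if (hasNullElements(k1)) {
      return -k2.size();
    } else if (hasNullElements(k2)) {
      return k1.size();
    }
    return compareNonNullKeys(k1, k2);
  }

  // Hypothetical helper to build a key from its column values.
  static ArrayList<Object> key(Object... cols) {
    return new ArrayList<Object>(Arrays.asList(cols));
  }

  public static void main(String[] args) {
    System.out.println(compareKeys(key((Object) null), key((Object) null))); // -1
    System.out.println(compareKeys(key((Object) null), key(18)));            // -1
    System.out.println(compareKeys(key(18), key((Object) null)));            // 1
    System.out.println(compareKeys(key(18), key(18)));                       // 0
  }
}
```

Note that the two single-null branches return a nonzero value only when the keys have at least one column, which should always hold for join keys.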

I also updated the test case with SMB join queries.

When I run the SMB join queries on my local machine (pseudo-distributed mode),
I get different results. I think that is most likely because of HIVE-1561. I
will update the issue with my findings.

> NULL is not handled correctly in join
> -------------------------------------
>
>                 Key: HIVE-741
>                 URL: https://issues.apache.org/jira/browse/HIVE-741
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Ning Zhang
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-741-1.txt, patch-741-2.txt, patch-741.txt, 
> smbjoin_nulls.q.txt
>
>
> With the following data in table input4_cb:
> Key     Value
> ------  ------
> NULL    325
> 18      NULL
> The following query:
> {code}
> select * from input4_cb a join input4_cb b on a.key = b.value;
> {code}
> returns the following result:
> NULL    325    18    NULL
> The correct result should be the empty set.
> When 'null' is replaced by '', it works.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.