[ 
https://issues.apache.org/jira/browse/DRILL-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Malashevsky updated DRILL-6121:
-----------------------------------------
    Description: 
*AFFECTED_VERSION:* drill-1.13.0-SNAPSHOT

*AFFECTED_FUNCTIONALITY:* INNER JOIN

*ISSUE_DESCRIPTION:* DRILL-5919 added new JSON data types: *NaN, 
Infinity, -Infinity*. 
During testing, the INNER JOIN operator showed strange behavior: two nearly 
identical queries return different results. 
*Query1* {code} select distinct t.name, tt.name from dfs.tmp.`ObjsX.json` t 
inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4 {code}
*Query2* {code} select distinct t.name from dfs.tmp.`ObjsX.json` t inner join 
dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4 {code}

*Query1* differs from *Query2* by one column only:
- In *Query1* - 2 columns are selected - t.name, tt.name
- In *Query2* - 1 column is selected - t.name

However, *Query1* and *Query2* return completely different results:
- *Query1* returns
        {code}
        name         name0
        object2         object2
        object2         object3
        object2         object4
        object3         object2
        object3         object3
        object3         object4
        object4         object2
        object4         object3
        object4         object4
        {code}
This result seems to be correct.

- *Query2* returns _*No result found*_, which is not expected:

        *EXPECTED_RESULT:*
        {code}
        name
        object2
        object3
        object4
        {code}
        
        *ACTUAL_RESULT*: {code}No result found{code}

*NB!:* the issue appears only if tables are _*JOINed by a column which contains 
newly-added data types (NaN, Infinity, -Infinity)*_. The issue is not 
reproducible if a user is JOINing tables by a column containing other data types.
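
As background only, the sketch below shows how the three values compare under plain IEEE 754 equality, using a nested-loop equi-join in Python. The records are hypothetical (the attached ObjsX.json is not reproduced here), `equi_join` is an illustrative helper, and nothing in this report establishes that these comparison rules are the cause of the behavior described above.

```python
import json
import math

# Hypothetical records: NOT the contents of the attached ObjsX.json.
raw = (
    '[{"name": "a", "attr4": NaN},'
    ' {"name": "b", "attr4": Infinity},'
    ' {"name": "c", "attr4": -Infinity}]'
)

# Python's json module accepts NaN, Infinity and -Infinity by default.
rows = json.loads(raw)


def equi_join(left, right, key):
    """Nested-loop inner join that compares keys with plain ==."""
    return [
        (l["name"], r["name"])
        for l in left
        for r in right
        if l[key] == r[key]
    ]


pairs = equi_join(rows, rows, "attr4")
print(pairs)  # [('b', 'b'), ('c', 'c')]
```

Under these rules Infinity and -Infinity each match themselves, while NaN never equals anything, including itself, so the NaN row drops out of a self-join on that key.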

  was:
*AFFECTED_VERSION:* drill-1.13.0-SNAPSHOT

*AFFECTED_FUNCTIONALITY:* INNER JOIN

*ISSUE_DESCRIPTION:* MD-2745/DRILL-5919 added new JSON data types: 
*NaN, Infinity, -Infinity*. 
During testing, the INNER JOIN operator showed strange behavior: two nearly 
identical queries return different results. 
*Query1* {code} select distinct t.name, tt.name from dfs.tmp.`ObjsX.json` t 
inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4 {code}
*Query2* {code} select distinct t.name from dfs.tmp.`ObjsX.json` t inner join 
dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4 {code}

*Query1* differs from *Query2* by one column only:
- In *Query1* - 2 columns are selected - t.name, tt.name
- In *Query2* - 1 column is selected - t.name

However, *Query1* and *Query2* return completely different results:
- *Query1* returns
        {code}
        name         name0
        object2         object2
        object2         object3
        object2         object4
        object3         object2
        object3         object3
        object3         object4
        object4         object2
        object4         object3
        object4         object4
        {code}
This result seems to be correct.

- *Query2* returns _*No result found*_, which is not expected:

        *EXPECTED_RESULT:*
        {code}
        name
        object2
        object3
        object4
        {code}
        
        *ACTUAL_RESULT*: {code}No result found{code}

*NB!:* the issue appears only if tables are _*JOINed by a column which contains 
newly-added data types (NaN, Infinity, -Infinity)*_. The issue is not 
reproducible if a user is JOINing tables by a column containing other data types.


> Nan/Inf data types: strange query result with INNER JOIN operator when 
> selecting 1 column
> -----------------------------------------------------------------------------------------
>
>                 Key: DRILL-6121
>                 URL: https://issues.apache.org/jira/browse/DRILL-6121
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>            Reporter: Alexander Malashevsky
>            Assignee: Volodymyr Tkach
>            Priority: Minor
>         Attachments: ObjsX.json
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
