[
https://issues.apache.org/jira/browse/DRILL-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Malashevsky updated DRILL-6121:
-----------------------------------------
Description:
*AFFECTED_VERSION:* drill-1.13.0-SNAPSHOT
*AFFECTED_FUNCTIONALITY:* INNER JOIN
*ISSUE_DESCRIPTION:* DRILL-5919 added new JSON data types: *NaN,
Infinity, -Infinity*.
During testing, inconsistent behavior of the INNER JOIN operator was
detected: two almost identical queries return different results.
*Query1*
{code}
select distinct t.name, tt.name from dfs.tmp.`ObjsX.json` t
inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4
{code}
*Query2*
{code}
select distinct t.name from dfs.tmp.`ObjsX.json` t
inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4
{code}
*Query1* differs from *Query2* by one column only:
- In *Query1* - 2 columns are selected - t.name, tt.name
- In *Query2* - 1 column is selected - t.name
However, *Query1* and *Query2* return completely different results:
- *Query1* returns:
{code}
name name0
object2 object2
object2 object3
object2 object4
object3 object2
object3 object3
object3 object4
object4 object2
object4 object3
object4 object4
{code}
This result seems to be correct.
- *Query2* returns _*No result found*_, which is not expected:
*EXPECTED_RESULT:*
{code}
name
object2
object3
object4
{code}
*ACTUAL_RESULT:* {code}No result found{code}
*NB!:* the issue appears only if tables are _*JOINed on a column which contains
the newly added data types (NaN, Infinity, -Infinity)*_. The issue is not
reproducible if a user JOINs tables on a column containing other data types.
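As background on why join keys of this kind are delicate, here is a minimal Python sketch of two ways an equality join can treat NaN and Infinity. It is an illustration only: the rows are hypothetical (they are not the contents of ObjsX.json), the helper names `ieee_join` and `bitwise_join` are made up for this example, and nothing here is claimed to describe Drill's implementation or the root cause of this issue.

```python
import struct

# Hypothetical rows for illustration; NOT the contents of ObjsX.json.
rows = [
    {"name": "a", "attr4": float("nan")},
    {"name": "b", "attr4": float("inf")},
    {"name": "c", "attr4": float("-inf")},
]


def ieee_join(left, right):
    """Nested-loop equality join using ==, i.e. IEEE 754 comparison.

    NaN never compares equal to anything, including itself, so rows
    whose key is NaN produce no matches.
    """
    return [
        (l["name"], r["name"])
        for l in left
        for r in right
        if l["attr4"] == r["attr4"]
    ]


def bitwise_join(left, right):
    """Hash-style equality join keyed on the 8-byte encoding of the double.

    Identical bit patterns land in the same bucket, so a NaN key matches
    another NaN key with the same encoding.
    """
    buckets = {}
    for r in right:
        buckets.setdefault(struct.pack(">d", r["attr4"]), []).append(r["name"])
    return [
        (l["name"], name)
        for l in left
        for name in buckets.get(struct.pack(">d", l["attr4"]), [])
    ]


ieee_pairs = ieee_join(rows, rows)
bitwise_pairs = bitwise_join(rows, rows)
print(ieee_pairs)     # NaN row is missing
print(bitwise_pairs)  # NaN row matches itself
```

The two strategies agree on Infinity and -Infinity but disagree on NaN, which shows how two code paths with different equality rules can return different row sets for the same join condition.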
> Nan/Inf data types: strange query result with INNER JOIN operator when
> selecting 1 column
> -----------------------------------------------------------------------------------------
>
> Key: DRILL-6121
> URL: https://issues.apache.org/jira/browse/DRILL-6121
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - JSON
> Reporter: Alexander Malashevsky
> Assignee: Volodymyr Tkach
> Priority: Minor
> Attachments: ObjsX.json