[jira] [Commented] (HIVE-15272) "LEFT OUTER JOIN" Is not populating correct records with Hive On Spark

2017-04-07 Thread Miklos Szurap (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961373#comment-15961373
 ] 

Miklos Szurap commented on HIVE-15272:
--

As the amount field is a DECIMAL, it is very likely that HIVE-12768 is the root 
cause.

> "LEFT OUTER JOIN" Is not populating correct records with Hive On Spark
> --
>
> Key: HIVE-15272
> URL: https://issues.apache.org/jira/browse/HIVE-15272
> Project: Hive
> Issue Type: Bug
> Components: Hive, Spark
> Affects Versions: 1.1.0
> Environment: Hive 1.1.0, CentOS, Cloudera 5.7.4
> Reporter: Vikash Pareek
> Assignee: Rui Li
>
> I ran the following Hive query multiple times, once with Hive on Spark as the
> execution engine and once with Hive on MapReduce.
> {code}
> SELECT COUNT(DISTINCT t1.region, t1.amount)
> FROM my_db.my_table1 t1
> LEFT OUTER JOIN my_db.my_table2 t2
>   ON (t1.id = t2.id AND t1.name = t2.name)
> {code}
> With Hive on Spark: the result (count) was different on every execution.
> With Hive on MapReduce: the result (count) was the same on every execution.
> It seems that Hive on Spark behaves differently on each execution and does not
> produce the correct result.
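
For anyone repeating this comparison, the engine can be switched per session before running the query. The sketch below is not from the reporter: it assumes the standard hive.execution.engine property (values spark and mr) and reuses the table names from the description above.

{code}
-- Assumption: the engine is selected per session via hive.execution.engine.
-- Run the query on Hive on Spark.
SET hive.execution.engine=spark;
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2
  ON (t1.id = t2.id AND t1.name = t2.name);

-- Run the same query on MapReduce for comparison.
SET hive.execution.engine=mr;
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2
  ON (t1.id = t2.id AND t1.name = t2.name);
{code}

Repeating each variant several times and comparing the counts should show whether the variation is specific to one engine.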



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15272) "LEFT OUTER JOIN" Is not populating correct records with Hive On Spark

2016-12-15 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753212#comment-15753212
 ] 

Rui Li commented on HIVE-15272:
---

OK, I'll look into this.
[~VPareek], I think the two tables have the same DDL, right? Do they contain the
same data? Could you upload some sample data that can reproduce the issue? Thanks!






[jira] [Commented] (HIVE-15272) "LEFT OUTER JOIN" Is not populating correct records with Hive On Spark

2016-12-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15751333#comment-15751333
 ] 

Xuefu Zhang commented on HIVE-15272:


cc: [~lirui]






[jira] [Commented] (HIVE-15272) "LEFT OUTER JOIN" Is not populating correct records with Hive On Spark

2016-12-15 Thread Vikash Pareek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750784#comment-15750784
 ] 

Vikash Pareek commented on HIVE-15272:
--

You can find the query in the issue description itself:
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2
  ON (t1.id = t2.id AND t1.name = t2.name)

For the DDL, the column types are:
region -> STRING
amount -> DECIMAL
name -> STRING







[jira] [Commented] (HIVE-15272) "LEFT OUTER JOIN" Is not populating correct records with Hive On Spark

2016-11-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703275#comment-15703275
 ] 

Xuefu Zhang commented on HIVE-15272:


[~VPareek], would you mind providing a repro case (data, DDL, and query)? Thanks.



