[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721484#comment-15721484
 ] 

Hive QA commented on HIVE-15239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12841701/HIVE-15239.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10761 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2412/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2412/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2412/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12841701 - PreCommit-HIVE-Build

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch, HIVE-15239.4.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721233#comment-15721233
 ] 

Hive QA commented on HIVE-15239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12841693/HIVE-15239.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10731 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)

[vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_interval_mapjoin.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q]
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2410/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2410/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2410/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12841693 - PreCommit-HIVE-Build

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch, HIVE-15239.4.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713840#comment-15713840
 ] 

Xuefu Zhang commented on HIVE-15239:


+1

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch, HIVE-15239.4.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712934#comment-15712934
 ] 

Xuefu Zhang commented on HIVE-15239:


+1 with minor comment on RB.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712003#comment-15712003
 ] 

Hive QA commented on HIVE-15239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12841281/HIVE-15239.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10753 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=132)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=92)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2359/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2359/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2359/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12841281 - PreCommit-HIVE-Build

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711469#comment-15711469
 ] 

Hive QA commented on HIVE-15239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12841244/HIVE-15239.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10738 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=94)

[join_cond_pushdown_unqual4.q,union_remove_7.q,join13.q,join_vc.q,groupby_cube1.q,bucket_map_join_spark2.q,sample3.q,smb_mapjoin_19.q,stats16.q,union23.q,union.q,union31.q,cbo_udf_udaf.q,ptf_decimal.q,bucketmapjoin2.q]
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2357/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2357/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2357/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12841244 - PreCommit-HIVE-Build

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710944#comment-15710944
 ] 

Xuefu Zhang commented on HIVE-15239:


[~lirui] do mind creating a RB for this? Thanks.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710435#comment-15710435
 ] 

Xuefu Zhang commented on HIVE-15239:


Sorry for the delay.

Re: my point #1, I was referring to this:
{code}
  Set firstRootOperators = first.getAllRootOperators();
  Set secondRootOperators = second.getAllRootOperators();
  if (firstRootOperators.size() != secondRootOperators.size()) {
return false;
  }

  // need to check paths and partition desc for MapWorks
  if (first instanceof MapWork && !compareMapWork((MapWork) first, 
(MapWork) second)) {
return false;
  }
{code}
I think it's better to be like the following in order to logical unit of code 
together.
{code}
  // need to check paths and partition desc for MapWorks
  if (first instanceof MapWork && !compareMapWork((MapWork) first, 
(MapWork) second)) {
return false;
  }

  Set firstRootOperators = first.getAllRootOperators();
  Set secondRootOperators = second.getAllRootOperators();
  if (firstRootOperators.size() != secondRootOperators.size()) {
return false;
  }
{code}

As to exhaustive check, your fix will solve the problem describe here. I would 
even believe there is a possibility that there are two two mapwork that works 
on different partitions of the same table, such as in case of union.

Overall, I feel more testing is needed for this feature. Of course this goes 
beyond the scope of this JIRA.



> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-29 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707619#comment-15707619
 ] 

Rui Li commented on HIVE-15239:
---

Pinging [~xuefuz]

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-27 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700706#comment-15700706
 ] 

Rui Li commented on HIVE-15239:
---

Thanks [~csun] for the explanations. I think these tests have been ignored for 
a while because some of them fail when I explicitly run with 
TestMiniSparkOnYarnCliDriver. I'll fix and re-enable them in a separate JIRA.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-25 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15696544#comment-15696544
 ] 

Chao Sun commented on HIVE-15239:
-

[~lirui] yes I added this. It's for the test cases that should ONLY run under 
HoS, like HoS dynamic partition pruning.
Part of the change is in the test node's config file, which is not part of the 
git repository. I guess much has changed since then for the qtest framework so 
this needs to be changed as well.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-24 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694624#comment-15694624
 ] 

Rui Li commented on HIVE-15239:
---

Latest failures are not related. And I guess {{spark.only.query.files}} are not 
automatically picked up. We can enable them in a follow-on.
[~xuefuz] please have another look. Thanks.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693097#comment-15693097
 ] 

Hive QA commented on HIVE-15239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840399/HIVE-15239.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10733 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=91)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 
(batchId=92)
org.apache.hive.hcatalog.api.TestHCatClientNotification.createTable 
(batchId=218)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2282/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2282/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2282/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840399 - PreCommit-HIVE-Build

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-24 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15692721#comment-15692721
 ] 

Rui Li commented on HIVE-15239:
---

Thanks [~xuefuz] for the suggestions.

1. Not sure if I'm following your point. The code is in method {{compareWork}}, 
which checks if two works are equivalent. The patch adds some special checking 
for MapWork. If the check fails, we don't have to check the operators.
2. OK I'll move the null check to compare methods. Some of them need to stay in 
the caller though, otherwise we'll get NPEs.
3. I thought about override the equals method of each corresponding classes. 
But I'm not sure how to override the hashCode methods accordingly.

The fields used in the comparison are same as those used in the clone method of 
each classes. So I think it's exhaustive. Actually I'm not sure if it's 
necessary to go this far in the comparison. We can simply compare the paths to 
solve the example problem in this JIRA - different paths mean the MapWorks are 
for different tables/partitions. I don't know if it's ever possible that two 
MapWorks point to the same path but have different PartitionDesc.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691764#comment-15691764
 ] 

Xuefu Zhang commented on HIVE-15239:


Patch looks good. Two minor comments:

1. The following code seems being inserted in the middle of other code block.
{code}
  // need to check paths and partition desc for MapWorks
  if (first instanceof MapWork && !compareMapWork((MapWork) first, 
(MapWork) second)) {
return false;
  }
{code}
2. As a custom, null check and null equal check might be better in the compare 
method itself rather than letting the caller take the responsibility. This 
applies to the few private methods introduced, but no big deal though.
3. I'm not sure if it makes sense to put these compare() methods in the 
corresponding classes. Otherwise, these comparisons can be easily broken.

One concern I have is whether the comparisons are exhaustive. That is, whether 
the condition check is sufficient. With some many noisy fields in those 
compared classes, it's hard to see which are important and which are not. 

Thoughts?

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690856#comment-15690856
 ] 

Hive QA commented on HIVE-15239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840261/HIVE-15239.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 10703 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=128)

[union_remove_15.q,bucket_map_join_tez1.q,groupby7_noskew.q,bucketmapjoin1.q,subquery_multiinsert.q,auto_join8.q,auto_join6.q,groupby2_map_skew.q,lateral_view_explode2.q,join28.q,load_dyn_part1.q,skewjoinopt17.q,skewjoin_union_remove_1.q,union_remove_20.q,bucketmapjoin5.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=94)

[parallel_join1.q,union27.q,union12.q,groupby7_map_multi_single_reducer.q,varchar_join1.q,join7.q,join_reorder4.q,skewjoinopt2.q,bucketsortoptimize_insert_2.q,smb_mapjoin_17.q,script_env_var1.q,groupby7_map.q,groupby3.q,bucketsortoptimize_insert_8.q,union20.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=43)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_part] 
(batchId=105)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_shufflejoin]
 (batchId=123)
org.apache.hive.hcatalog.api.TestHCatClient.testBasicDDLCommands (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testCreateTableLike (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testDatabaseLocation (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testDropPartitionsWithPartialSpec 
(batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testDropTableException (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testEmptyTableInstantiation 
(batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testGetMessageBusTopicName 
(batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testGetPartitionsWithPartialSpec 
(batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testObjectNotFoundException 
(batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testOtherFailure (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSchema (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionsHCatClientImpl 
(batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testRenameTable (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testReplicationTaskIter 
(batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=166)
org.apache.hive.hcatalog.api.TestHCatClient.testUpdateTableSchema (batchId=166)
org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish
 (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2262/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2262/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2262/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840261 - PreCommit-HIVE-Build

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO 

[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-18 Thread wangwenli (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15676230#comment-15676230
 ] 

wangwenli commented on HIVE-15239:
--

[~xuefu.w...@kodak.com] different problem?   what is the problem?

come to this issue, in
{code}
org.apache.hadoop.hive.ql.optimizer.spark.CombineEquivalentWorkResolver.EquivalentWorkMatcher.compareWork()
{code}
it check operator is same or not, here the tablescan operator is same base on 
the currently impl TableScanOperatorComparator,  but they are different tables 
tablescan, should not be same.
Maybe we can add one more check, check the table is same or not.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15675878#comment-15675878
 ] 

Xuefu Zhang commented on HIVE-15239:


cc: [~lirui]. I got a different problem with cdh 5.7. Currently having problem 
to set up Hive on Spark with latest code.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be none record



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)