[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509317#comment-14509317
 ] 

Ashutosh Chauhan commented on HIVE-10416:
-

Is this patch ready for commit, or does it need more work?

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.01.patch, HIVE-10416.patch


 When return path is on, if the plan's top operator is a Sort, we need to 
 produce a SelectOp that will output exactly the columns needed by the FS.
 The following query reproduces the problem:
 {noformat}
 select cbo_t3.c_int, c, count(*)
 from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1
 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0)
 group by c_float, cbo_t1.c_int, key order by a) cbo_t1
 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2
 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0)
 group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on 
 cbo_t1.a=p
 join cbo_t3 on cbo_t1.a=key
 where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0)
 group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-23 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509322#comment-14509322
 ] 

Jesus Camacho Rodriguez commented on HIVE-10416:


[~ashutoshc], not yet, I need to discuss with John about his comment.



[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-23 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510060#comment-14510060
 ] 

Laljo John Pullokkaran commented on HIVE-10416:
---

Code Formatting: two tabs.
+1

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.01.patch, HIVE-10416.02.patch, 
 HIVE-10416.patch




[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507383#comment-14507383
 ] 

Ashutosh Chauhan commented on HIVE-10416:
-

I like the new patch, since it projects only the needed columns while 
generating the SelOp, as opposed to adding an unnecessary SelOp at the top.
[~jpullokkaran], what do you think?



[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507138#comment-14507138
 ] 

Hive QA commented on HIVE-10416:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727191/HIVE-10416.01.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8728 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3526/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3526/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3526/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727191 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-22 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507739#comment-14507739
 ] 

Laljo John Pullokkaran commented on HIVE-10416:
---

[~jcamachorodriguez] Introducing the top-level Select requires traversing the 
plan recursively as long as the nodes are SortRel and not ProjectRel. In 
practice this will probably happen in very few cases (maybe an ORDER BY 
followed by a LIMIT).

Regardless, it's better to traverse down until you hit a non-Sort rel.
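
The traversal suggested in this comment could be sketched roughly as follows. This is only an illustration in Python, not the code from the patch: `PlanNode`, `add_select_below_sorts`, and the column names are hypothetical and do not correspond to Hive or Calcite classes. Placing the projection beneath the chain of Sort nodes is one reading of the comment, consistent with the earlier remark in this thread that a Select above the Sort would not preserve ordering.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; not a Hive or Calcite class."""
    kind: str                          # e.g. "sort", "project", "join"
    child: Optional["PlanNode"] = None
    columns: List[str] = field(default_factory=list)


def add_select_below_sorts(root: PlanNode, needed: List[str]) -> PlanNode:
    """Walk down consecutive sort nodes and, if the first non-sort node is
    not already a projection, insert one that keeps only `needed` columns."""
    if root.kind != "sort":
        return root
    # Descend until the current sort's child is no longer a sort.
    last_sort = root
    while last_sort.child is not None and last_sort.child.kind == "sort":
        last_sort = last_sort.child
    below = last_sort.child
    if below is not None and below.kind != "project":
        last_sort.child = PlanNode("project", child=below, columns=list(needed))
    return root


# ORDER BY followed by LIMIT, modeled here as two stacked sort nodes over a join.
plan = PlanNode("sort", PlanNode("sort", PlanNode("join")))
plan = add_select_below_sorts(plan, ["c_int", "c", "cnt"])
```

With this sketch, the projection ends up directly under the lowest Sort node, and a plan whose Sort chain already sits on a projection is left unchanged.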



[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-21 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505331#comment-14505331
 ] 

Laljo John Pullokkaran commented on HIVE-10416:
---

[~jcamachorodriguez] Introducing a Select on top of the Sort will not work, as 
Tez cannot preserve ordering across a Select.

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.patch




[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505055#comment-14505055
 ] 

Hive QA commented on HIVE-10416:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726880/HIVE-10416.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8728 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3513/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3513/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3513/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726880 - PreCommit-HIVE-TRUNK-Build
