[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-02-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853156#comment-15853156
 ] 

Matt McCline commented on HIVE-15573:
-

The new patch incorporates the review comments except for the guard-rail. It also 
includes other changes for EXPLAIN VECTORIZATION.

> Vectorization: ACID shuffle ReduceSink is not specialized 
> --
>
> Key: HIVE-15573
> URL: https://issues.apache.org/jira/browse/HIVE-15573
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions, Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Fix For: 2.2.0
>
> Attachments: acid-test.svg, HIVE-15573.01.patch, HIVE-15573.02.patch, 
> HIVE-15573.03.patch, HIVE-15573.04.patch, screenshot-1.png
>
>
> The ACID shuffle disables murmur hash for the shuffle, because the bucketing 
> requirements demand the writable hashcode for the shuffle.
> {code}
> boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM);
> if (!useUniformHash) {
>   return false;
> }
> {code}
> This check protects the fast ReduceSink ops from being used in ACID inserts.
> A specialized case for the following pattern will make ACID insert much 
> faster.
> {code}
> Reduce Output Operator
>   sort order: 
>   Map-reduce partition columns: _col0 (type: bigint)
>   value expressions:  
> {code}
> !screenshot-1.png!
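A specialized check for this pattern could look roughly like the following sketch. The class and method names are illustrative only, not Hive's actual ReduceSink vectorization code:

```java
// Illustrative sketch only: detecting the ACID-insert ReduceSink pattern
// shown above (empty sort order, a single bigint partition column) so a
// specialized vectorized ReduceSink could be selected instead of falling
// back to the row-mode operator. Not Hive's actual implementation.
import java.util.List;

public class AcidSinkPattern {
    static boolean isAcidInsertPattern(String sortOrder, List<String> partitionColTypes) {
        // Matches the EXPLAIN pattern: no sort order, one bigint partition key.
        return sortOrder.isEmpty()
            && partitionColTypes.size() == 1
            && "bigint".equals(partitionColTypes.get(0));
    }
}
```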



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-02-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15573:

Status: Patch Available  (was: In Progress)






[jira] [Updated] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-02-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15573:

Attachment: HIVE-15573.04.patch






[jira] [Updated] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-02-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15573:

Status: In Progress  (was: Patch Available)






[jira] [Commented] (HIVE-15808) Remove semijoin reduction branch if it is on bigtable along with hash join

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853138#comment-15853138
 ] 

Hive QA commented on HIVE-15808:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12851062/HIVE-15808.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10226 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3383/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3383/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3383/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12851062 - PreCommit-HIVE-Build

> Remove semijoin reduction branch if it is on bigtable along with hash join
> --
>
> Key: HIVE-15808
> URL: https://issues.apache.org/jira/browse/HIVE-15808
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15808.2.patch, HIVE-15808.patch
>
>
> If there is a semijoin branch on the same operator pipeline as a 
> hash join, then it is by design on the big table, which is not optimal. The 
> operator-level cycle detection logic may not find a cycle, as there is no cycle at 
> the operator level. However, once Tez builds its tasks, there can be a cycle at 
> the task level, causing the query to fail.
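The task-level cycle the description refers to can be found with a standard DFS reachability check. A generic sketch (not Hive's or Tez's actual code) over a task DAG:

```java
// Generic DFS cycle detection over a task-level DAG. A graph that is acyclic
// at the operator level can become cyclic once operators are grouped into
// tasks and their edges are merged. Illustrative only.
import java.util.*;

public class TaskCycleCheck {
    static boolean hasCycle(Map<String, List<String>> edges) {
        Set<String> finished = new HashSet<>();
        Set<String> onStack = new HashSet<>();
        for (String node : edges.keySet()) {
            if (dfs(node, edges, finished, onStack)) {
                return true;
            }
        }
        return false;
    }

    private static boolean dfs(String node, Map<String, List<String>> edges,
                               Set<String> finished, Set<String> onStack) {
        if (onStack.contains(node)) return true;   // back edge -> cycle
        if (finished.contains(node)) return false; // already fully explored
        onStack.add(node);
        for (String next : edges.getOrDefault(node, List.of())) {
            if (dfs(next, edges, finished, onStack)) return true;
        }
        onStack.remove(node);
        finished.add(node);
        return false;
    }
}
```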





[jira] [Commented] (HIVE-15806) Druid schema inference for Select queries might produce wrong type for metrics

2017-02-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853136#comment-15853136
 ] 

Ashutosh Chauhan commented on HIVE-15806:
-

* Post-aggregator columns in TopN, GroupBy, and timeseries queries are always 
float, but couldn't they potentially be long?
* Add a comment explaining why we are doing the metadata query only for Select but 
not for other query types?

> Druid schema inference for Select queries might produce wrong type for metrics
> --
>
> Key: HIVE-15806
> URL: https://issues.apache.org/jira/browse/HIVE-15806
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15806.01.patch, HIVE-15806.patch
>
>
> We inferred float automatically, instead of emitting a metadata query to 
> Druid and checking the type of the metric.





[jira] [Commented] (HIVE-15812) Scalar subquery with having throws exception

2017-02-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853132#comment-15853132
 ] 

Ashutosh Chauhan commented on HIVE-15812:
-

+1

> Scalar subquery with having throws exception
> 
>
> Key: HIVE-15812
> URL: https://issues.apache.org/jira/browse/HIVE-15812
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15812.1.patch
>
>
> Following query throws an exception
> {code:SQL}
> select sum(p_retailprice) from part group by p_type having sum(p_retailprice) 
> > (select max(pp.p_retailprice) from part pp);
> {code}
> {noformat}
> SemanticException [Error 10004]: Line 3:40 Invalid table alias or column 
> reference 'pp': (possible column names are: p_partkey, p_name, p_mfgr, 
> p_brand, p_type, p_size, p_container, p_retailprice, p_comment)
> {noformat}





[jira] [Commented] (HIVE-15804) Druid handler might not emit metadata query when CBO fails

2017-02-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853131#comment-15853131
 ] 

Ashutosh Chauhan commented on HIVE-15804:
-

Also, it seems the title of the jira needs to be updated.

> Druid handler might not emit metadata query when CBO fails
> --
>
> Key: HIVE-15804
> URL: https://issues.apache.org/jira/browse/HIVE-15804
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15804.01.patch, HIVE-15804.patch
>
>
> When CBO is not enabled or fails, we should still be able to run queries on 
> Druid datasources.
> This is implemented as a Select query that retrieves all data available 
> from Druid and then executes the rest of the logic on that data. However, 
> we currently might fail due to a wrongly inferred type for numerical 
> Druid datasource columns.





[jira] [Commented] (HIVE-15804) Druid handler might not emit metadata query when CBO fails

2017-02-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853130#comment-15853130
 ] 

Ashutosh Chauhan commented on HIVE-15804:
-

+1






[jira] [Commented] (HIVE-15458) Fix semi-join conversion rule for subquery

2017-02-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853126#comment-15853126
 ] 

Ashutosh Chauhan commented on HIVE-15458:
-

+1

> Fix semi-join conversion rule for subquery
> --
>
> Key: HIVE-15458
> URL: https://issues.apache.org/jira/browse/HIVE-15458
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15458.1.patch, HIVE-15458.2.patch, 
> HIVE-15458.3.patch
>
>
> Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* 
> since it doesn't work for subqueries.





[jira] [Commented] (HIVE-15815) Allow to pass some Oozie properties to Spark in HoS

2017-02-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853125#comment-15853125
 ] 

Xuefu Zhang commented on HIVE-15815:


+1. I assume that spark.hadoop.oozie properties will be interpreted correctly 
by Spark.

> Allow to pass some Oozie properties to Spark in HoS
> ---
>
> Key: HIVE-15815
> URL: https://issues.apache.org/jira/browse/HIVE-15815
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15815.patch
>
>
> Oozie passes some of its properties (e.g. oozie.job.id) to Beeline/HS2 when 
> it invokes a Hive2 action. If we allow these properties to be passed to Spark 
> in HoS, we can easily associate an Oozie workflow ID with an HoS client and 
> Spark job in the Spark history. This will be very helpful in diagnosing issues 
> involving Oozie Hive2/HoS/Spark.
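A hedged sketch of the forwarding idea (the actual mechanism in the patch may differ): Oozie-provided session properties could be mapped onto the spark.hadoop. prefix, which Spark strips and exposes as Hadoop configuration, so they become visible in the Spark history server:

```java
// Illustrative sketch only: mapping Oozie-provided session properties
// (e.g. oozie.job.id) to spark.hadoop.* entries so they show up in the
// Spark history server. Not necessarily how the actual patch does it.
import java.util.*;

public class OozieToSpark {
    static Map<String, String> forwardOozieProps(Map<String, String> hiveConf) {
        Map<String, String> sparkConf = new HashMap<>();
        for (Map.Entry<String, String> e : hiveConf.entrySet()) {
            if (e.getKey().startsWith("oozie.")) {
                // Spark strips the "spark.hadoop." prefix and exposes the
                // remainder as Hadoop configuration.
                sparkConf.put("spark.hadoop." + e.getKey(), e.getValue());
            }
        }
        return sparkConf;
    }
}
```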





[jira] [Commented] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853124#comment-15853124
 ] 

Hive QA commented on HIVE-15222:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12851061/HIVE-15222.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 128 failed/errored test(s), 10181 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=120)

[groupby4_noskew.q,groupby3_map_skew.q,join_cond_pushdown_2.q,union19.q,union24.q,union_remove_5.q,groupby7_noskew_multi_single_reducer.q,vectorization_1.q,index_auto_self_join.q,auto_smb_mapjoin_14.q,script_env_var2.q,pcr.q,auto_join_filters.q,join0.q,join37.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=126)

[ptf_seqfile.q,union_remove_23.q,parallel_join0.q,union_remove_9.q,join_nullsafe.q,skewjoinopt14.q,vectorized_mapjoin.q,union4.q,auto_join5.q,vectorized_shufflejoin.q,smb_mapjoin_20.q,groupby8_noskew.q,auto_sortmerge_join_10.q,groupby11.q,union_remove_16.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=130)

[groupby6_map.q,stats13.q,groupby2_noskew_multi_distinct.q,load_dyn_part12.q,join15.q,auto_join17.q,join_hive_626.q,tez_join_tests.q,auto_join21.q,join_view.q,join_cond_pushdown_4.q,vectorization_0.q,union_null.q,auto_join3.q,vectorization_decimal_date.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input4] (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join0] (batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parallel_join0] 
(batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] 
(batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join6] 
(batchId=38)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=137)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constprog_dpp]
 (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constprog_semijoin]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_3] 
(batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_5] 
(batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_1] 
(batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_2] 
(batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_3] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_4] 
(batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_5] 
(batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[deleteAnalyze]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[empty_join] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[nonmr_fetch_threshold]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part_all_complex]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part_all_primitive]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_table]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part]
 (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_complex]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_primitive]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_table]
 (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_cache] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_without_gby]
 (batchId=149)

[jira] [Commented] (HIVE-15808) Remove semijoin reduction branch if it is on bigtable along with hash join

2017-02-04 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853113#comment-15853113
 ] 

Deepak Jaiswal commented on HIVE-15808:
---

Patch updated.






[jira] [Commented] (HIVE-15743) vectorized text parsing: speed up double parse

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853106#comment-15853106
 ] 

Hive QA commented on HIVE-15743:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12851060/HIVE-15743.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10225 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple]
 (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=230)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3381/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3381/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3381/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12851060 - PreCommit-HIVE-Build

> vectorized text parsing: speed up double parse
> --
>
> Key: HIVE-15743
> URL: https://issues.apache.org/jira/browse/HIVE-15743
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-15743.1.patch, HIVE-15743.2.patch, 
> HIVE-15743.3.patch, HIVE-15743.4.patch, tpch-without.png
>
>
> {noformat}
> Double.parseDouble(
> new String(bytes, fieldStart, fieldLength, 
> StandardCharsets.UTF_8));{noformat}
> This takes ~25% of the query time in some cases.
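One common way to speed this up (a hedged sketch of the general technique, not the approach taken in the attached patches) is a fast path that parses plain decimal input directly from the byte buffer and falls back to Double.parseDouble for anything unusual. Note that a naive fast path like this can differ from Double.parseDouble in last-bit rounding:

```java
// Illustrative fast-path double parse from a byte buffer. Handles optional
// sign, digits, and one decimal point; anything else (exponents, overflow)
// falls back to the slow String-allocating path. Not Hive's implementation,
// and not guaranteed bit-exact with Double.parseDouble for all inputs.
import java.nio.charset.StandardCharsets;

public class FastDoubleParse {
    static double parseDouble(byte[] bytes, int start, int len) {
        long intPart = 0, fracPart = 0, fracDiv = 1;
        boolean neg = false, seenDot = false;
        int i = start, end = start + len;
        if (i < end && bytes[i] == '-') { neg = true; i++; }
        for (; i < end; i++) {
            byte b = bytes[i];
            if (b == '.' && !seenDot) { seenDot = true; continue; }
            if (b < '0' || b > '9' || fracDiv > 1_000_000_000_000_000L) {
                // Unusual input (exponent, second dot, overflow): slow path.
                return Double.parseDouble(
                    new String(bytes, start, len, StandardCharsets.UTF_8));
            }
            if (seenDot) { fracPart = fracPart * 10 + (b - '0'); fracDiv *= 10; }
            else { intPart = intPart * 10 + (b - '0'); }
        }
        double v = intPart + (double) fracPart / fracDiv;
        return neg ? -v : v;
    }
}
```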





[jira] [Updated] (HIVE-15808) Remove semijoin reduction branch if it is on bigtable along with hash join

2017-02-04 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-15808:
--
Attachment: HIVE-15808.2.patch






[jira] [Updated] (HIVE-15808) Remove semijoin reduction branch if it is on bigtable along with hash join

2017-02-04 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-15808:
--
Description: If there is a semijoin branch on the same operator pipeline 
which contains a hash join then it is by design on big table which is not 
optimal. The operator cycle detection logic may not find a cycle as there is no 
cycle at operator level. However, once Tez builds its task there can be a cycle 
at task level causing the query to fail.  (was: It is found that the current 
logic of cycle detection does not find cycles created when there is a semijoin 
branch parallel to a hash join on a reducer.
To avoid such cycles, remove the semijoin reduction optimization.)
Summary: Remove semijoin reduction branch if it is on bigtable along 
with hash join  (was: Remove Semijoin reduction branch on reducers if there is 
hash join)






[jira] [Commented] (HIVE-15223) replace org.json usage in EximUtil with some alternative

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853094#comment-15853094
 ] 

Hive QA commented on HIVE-15223:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850960/HIVE-15223.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10222 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=230)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite 
(batchId=186)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3380/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3380/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3380/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850960 - PreCommit-HIVE-Build

> replace org.json usage in EximUtil with some alternative
> 
>
> Key: HIVE-15223
> URL: https://issues.apache.org/jira/browse/HIVE-15223
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Teddy Choi
> Fix For: 2.2.0
>
> Attachments: HIVE-15223.1.patch
>
>
> The metadata is stored in JSON format, which changed lately with the advent 
> of replication v2.
> I think Jackson would be nice to have here - it could help make 
> this metadata reading/writing more resilient against future serialization 
> issues.
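A minimal sketch of what writing such metadata with Jackson instead of org.json could look like. The field names here are illustrative, not Hive's actual metadata schema:

```java
// Illustrative only: building and serializing a JSON document with Jackson's
// ObjectMapper/ObjectNode instead of org.json. Field names are hypothetical.
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class MetadataJson {
    public static String write(String table, int version) {
        ObjectMapper mapper = new ObjectMapper();
        ObjectNode root = mapper.createObjectNode();
        root.put("table", table);     // illustrative field
        root.put("version", version); // illustrative field
        try {
            return mapper.writeValueAsString(root);
        } catch (JsonProcessingException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Jackson also supports reading the document back into a tree or a typed object, which is where the resilience against future serialization changes would come from.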





[jira] [Updated] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative

2017-02-04 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-15222:
--
Status: Patch Available  (was: Open)

> replace org.json usage in ExplainTask/TezTask related classes with some 
> alternative
> ---
>
> Key: HIVE-15222
> URL: https://issues.apache.org/jira/browse/HIVE-15222
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Teddy Choi
> Fix For: 2.2.0
>
> Attachments: HIVE-15222.1.patch
>
>
> Replace org.json usage in these classes.
> It seems to me that json is probably only used to write some information - 
> but the application never reads it back.





[jira] [Updated] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative

2017-02-04 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-15222:
--
Attachment: HIVE-15222.1.patch

I replaced org.json with Jackson. However, the patch file is bigger than I 
expected: it covers 7 files and is 64KB. I tested TestExplainTask and it 
passed, but that alone may not be sufficient, so I will wait for the 
integration test results.






[jira] [Updated] (HIVE-15743) vectorized text parsing: speed up double parse

2017-02-04 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-15743:
--
Attachment: HIVE-15743.4.patch

This patch applies Sergey's feedback. It uses a trimmed string for the precision 
check, and restricts strtod(String) to testing only.






[jira] [Updated] (HIVE-15223) replace org.json usage in EximUtil with some alternative

2017-02-04 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-15223:
--
Status: Patch Available  (was: Open)

> replace org.json usage in EximUtil with some alternative
> 
>
> Key: HIVE-15223
> URL: https://issues.apache.org/jira/browse/HIVE-15223
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Teddy Choi
> Fix For: 2.2.0
>
> Attachments: HIVE-15223.1.patch
>
>
> The metadata is stored in JSON format, which changed recently with the advent 
> of replication v2.
> I think Jackson would be nice to have here - it could help make this metadata 
> reading/writing more resilient against future serialization issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15815) Allow to pass some Oozie properties to Spark in HoS

2017-02-04 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15815:
---
Status: Patch Available  (was: Open)

> Allow to pass some Oozie properties to Spark in HoS
> ---
>
> Key: HIVE-15815
> URL: https://issues.apache.org/jira/browse/HIVE-15815
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15815.patch
>
>
> Oozie passes some of its properties (e.g. oozie.job.id) to Beeline/HS2 when 
> it invokes Hive2 action. If we allow these properties to be passed to Spark 
> in HoS, we can easily associate an Oozie workflow ID with an HoS client and 
> Spark job in Spark history. It will be very helpful in diagnosing some issues 
> involving Oozie Hive2/HoS/Spark.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15815) Allow to pass some Oozie properties to Spark in HoS

2017-02-04 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15815:
---
Attachment: HIVE-15815.patch

[~xuefuz], could you review the patch and check whether it makes sense? Thanks!

> Allow to pass some Oozie properties to Spark in HoS
> ---
>
> Key: HIVE-15815
> URL: https://issues.apache.org/jira/browse/HIVE-15815
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15815.patch
>
>
> Oozie passes some of its properties (e.g. oozie.job.id) to Beeline/HS2 when 
> it invokes Hive2 action. If we allow these properties to be passed to Spark 
> in HoS, we can easily associate an Oozie workflow ID with an HoS client and 
> Spark job in Spark history. It will be very helpful in diagnosing some issues 
> involving Oozie Hive2/HoS/Spark.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-15815) Allow to pass some Oozie properties to Spark in HoS

2017-02-04 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-15815:
--


> Allow to pass some Oozie properties to Spark in HoS
> ---
>
> Key: HIVE-15815
> URL: https://issues.apache.org/jira/browse/HIVE-15815
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
>
> Oozie passes some of its properties (e.g. oozie.job.id) to Beeline/HS2 when 
> it invokes Hive2 action. If we allow these properties to be passed to Spark 
> in HoS, we can easily associate an Oozie workflow ID with an HoS client and 
> Spark job in Spark history. It will be very helpful in diagnosing some issues 
> involving Oozie Hive2/HoS/Spark.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("

2017-02-04 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852964#comment-15852964
 ] 

Pengcheng Xiong commented on HIVE-15388:


I have added lots of queries with many parentheses before, e.g., 
multi_column_in.q and multi_column_in_single.q. For intervals, we have tpcds 
queries in perfclidriver that cover the new interval syntax.

> HiveParser spends lots of time in parsing queries with lots of "("
> --
>
> Key: HIVE-15388
> URL: https://issues.apache.org/jira/browse/HIVE-15388
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, 
> HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, 
> hive-15388.stacktrace.txt
>
>
> Branch: apache-master (applicable with previous releases as well)
> Queries generated via tools can have lots of "(" for "AND/OR" conditions. 
> This causes huge delays in parsing phase when the number of expressions are 
> high.
> e.g
> {noformat}
> SELECT `iata`,
>`airport`,
>`city`,
>`state`,
>`country`,
>`lat`,
>`lon`
> FROM airports
> WHERE 
> ((`airports`.`airport`
>  = "Thigpen"
>   
>   OR `airports`.`airport` = "Astoria Regional")
>   
>  OR `airports`.`airport` = "Warsaw Municipal")
>   
> OR `airports`.`airport` = "John F Kennedy Memorial")
>  
> OR `airports`.`airport` = "Hall-Miller Municipal")
> 
> OR `airports`.`airport` = "Atqasuk")
>OR 
> `airports`.`airport` = "William B Hartsfield-Atlanta Intl")
>   OR 
> `airports`.`airport` = "Artesia Municipal")
>  OR 
> `airports`.`airport` = "Outagamie County Regional")
> OR 
> `airports`.`airport` = "Watertown Municipal")
>OR 
> `airports`.`airport` = "Augusta State")
>   OR 
> `airports`.`airport` = "Aurora Municipal")
>  OR 
> `airports`.`airport` = "Alakanuk")
> OR 
> `airports`.`airport` = "Austin Municipal")
>OR 
> `airports`.`airport` = "Auburn Municipal")
>   OR 
> `airports`.`airport` = "Auburn-Opelik")
>  OR 
> `airports`.`airport` = "Austin-Bergstrom International")
> OR 
> `airports`.`airport` = "Wausau Municipal")
>OR 
> `airports`.`airport` = "Mecklenburg-Brunswick Regional")
>   OR 
> `airports`.`airport` = "Alva Regional")
>  OR 
> `airports`.`airport` = "Asheville Regional")
> OR 
> `airports`.`airport` = "Avon Park Municipal")
>OR 
> `airports`.`airport` = "Wilkes-Barre/Scranton Intl")
>   OR 
> `airports`.`airport` = "Marana Northwest Regional")
>  OR 
> `airports`.`airport` = "Catalina")
> OR 
> `airports`.`airport` = "Washington Municipal")
>OR 
> `airports`.`airport` = "Wainwright")
>   OR `airports`.`airport` 
> = "West Memphis Municipal")
>  OR `airports`.`airport` 
> = "Arlington Municipal")
> OR `airports`.`airport` = 
> "Algona Municipal")
>  

[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("

2017-02-04 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852963#comment-15852963
 ] 

Pengcheng Xiong commented on HIVE-15388:


Here are the results and they are expected:
{code}
PREHOOK: query: select true=true in (true, false)
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
 A masked pattern was here 
POSTHOOK: query: select true=true in (true, false)
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
 A masked pattern was here 
true
PREHOOK: query: select false=true in (true, false)
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
 A masked pattern was here 
POSTHOOK: query: select false=true in (true, false)
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
 A masked pattern was here 
false
{code}
Postgres
{code}
horton=# select true=false in (true,false);
 ?column?
--
 t
(1 row)

horton=# select false=false in (true,false);
 ?column?
--
 f
(1 row)
{code}

And the error:
Hive:
{code}
2017-02-04T14:25:34,713 ERROR [a24cc02e-355b-402d-8183-43501e0edc77 main] 
ql.Driver: FAILED: SemanticException Line 0:-1 Wrong arguments 'false': The 
arguments for IN should be the same type! Types are: {int IN (boolean, boolean)}
org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Wrong arguments 
'false': The arguments for IN should be the same type! Types are: {int IN 
(boolean, boolean)}
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1367)
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
{code}
Postgres
{code}
horton=# select 1=1 in (true, false);
ERROR:  operator does not exist: integer = boolean
LINE 1: select 1=1 in (true, false);
   ^
HINT:  No operator matches the given name and argument type(s). You might need 
to add explicit type casts.
{code}


> HiveParser spends lots of time in parsing queries with lots of "("
> --
>
> Key: HIVE-15388
> URL: https://issues.apache.org/jira/browse/HIVE-15388
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, 
> HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, 
> hive-15388.stacktrace.txt
>
>
> Branch: apache-master (applicable with previous releases as well)
> Queries generated via tools can have lots of "(" for "AND/OR" conditions. 
> This causes huge delays in parsing phase when the number of expressions are 
> high.
> e.g
> {noformat}
> SELECT `iata`,
>`airport`,
>`city`,
>`state`,
>`country`,
>`lat`,
>`lon`
> FROM airports
> WHERE 
> ((`airports`.`airport`
>  = "Thigpen"
>   
>   OR `airports`.`airport` = "Astoria Regional")
>   
>  OR `airports`.`airport` = "Warsaw Municipal")
>   
> OR `airports`.`airport` = "John F Kennedy Memorial")
>  
> OR `airports`.`airport` = "Hall-Miller Municipal")
> 
> OR `airports`.`airport` = "Atqasuk")
>OR 
> `airports`.`airport` = "William B Hartsfield-Atlanta Intl")
>   OR 
> `airports`.`airport` = "Artesia Municipal")
>  OR 
> `airports`.`airport` = "Outagamie County Regional")
> OR 
> `airports`.`airport` = "Watertown Municipal")
>OR 
> `airports`.`airport` = "Augusta State")
>   OR 
> `airports`.`airport` = "Aurora Municipal")
>  OR 
> `airports`.`airport` = "Alakanuk")
> OR 
> `airports`.`airport` = "Austin Municipal")
>OR 
> `airports`.`airport` 

[jira] [Updated] (HIVE-15458) Fix semi-join conversion rule for subquery

2017-02-04 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15458:
---
Status: Open  (was: Patch Available)

> Fix semi-join conversion rule for subquery
> --
>
> Key: HIVE-15458
> URL: https://issues.apache.org/jira/browse/HIVE-15458
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15458.1.patch, HIVE-15458.2.patch, 
> HIVE-15458.3.patch
>
>
> Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* 
> since it doesn't work for subqueries.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15458) Fix semi-join conversion rule for subquery

2017-02-04 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15458:
---
Attachment: HIVE-15458.3.patch

> Fix semi-join conversion rule for subquery
> --
>
> Key: HIVE-15458
> URL: https://issues.apache.org/jira/browse/HIVE-15458
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15458.1.patch, HIVE-15458.2.patch, 
> HIVE-15458.3.patch
>
>
> Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* 
> since it doesn't work for subqueries.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("

2017-02-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852916#comment-15852916
 ] 

Gunther Hagleitner commented on HIVE-15388:
---

[~pxiong] Thanks for the clarification. When you say "1=1 in (true, false)" is 
illegal, is it a semantic error or a parse error? What happens when you run 
"select true=true in (true, false)"? Can you add that to the tests?

The problem with saying the 10k tests didn't find anything else is that I don't 
know how many tests actually have an IN clause with expressions. Probably not 
that many. Can you make sure you cover these expressions in "udf_in.q"?

For interval literals - the spec says:

{noformat}
<interval literal> ::= INTERVAL [ <sign> ] <interval string> <interval qualifier>

<interval string> ::= <quote> <unquoted interval string> <quote>
{noformat}

The "unquoted interval string" is parsed elsewhere, so it sounds like 
restricting it for now is fine, although I'm still looking at this. You are 
throwing out more tests from "interval_alt.q" than needed; some statements in 
there should still work, right?

> HiveParser spends lots of time in parsing queries with lots of "("
> --
>
> Key: HIVE-15388
> URL: https://issues.apache.org/jira/browse/HIVE-15388
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, 
> HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, 
> hive-15388.stacktrace.txt
>
>
> Branch: apache-master (applicable with previous releases as well)
> Queries generated via tools can have lots of "(" for "AND/OR" conditions. 
> This causes huge delays in parsing phase when the number of expressions are 
> high.
> e.g
> {noformat}
> SELECT `iata`,
>`airport`,
>`city`,
>`state`,
>`country`,
>`lat`,
>`lon`
> FROM airports
> WHERE 
> ((`airports`.`airport`
>  = "Thigpen"
>   
>   OR `airports`.`airport` = "Astoria Regional")
>   
>  OR `airports`.`airport` = "Warsaw Municipal")
>   
> OR `airports`.`airport` = "John F Kennedy Memorial")
>  
> OR `airports`.`airport` = "Hall-Miller Municipal")
> 
> OR `airports`.`airport` = "Atqasuk")
>OR 
> `airports`.`airport` = "William B Hartsfield-Atlanta Intl")
>   OR 
> `airports`.`airport` = "Artesia Municipal")
>  OR 
> `airports`.`airport` = "Outagamie County Regional")
> OR 
> `airports`.`airport` = "Watertown Municipal")
>OR 
> `airports`.`airport` = "Augusta State")
>   OR 
> `airports`.`airport` = "Aurora Municipal")
>  OR 
> `airports`.`airport` = "Alakanuk")
> OR 
> `airports`.`airport` = "Austin Municipal")
>OR 
> `airports`.`airport` = "Auburn Municipal")
>   OR 
> `airports`.`airport` = "Auburn-Opelik")
>  OR 
> `airports`.`airport` = "Austin-Bergstrom International")
> OR 
> `airports`.`airport` = "Wausau Municipal")
>OR 
> `airports`.`airport` = "Mecklenburg-Brunswick Regional")
>   OR 
> `airports`.`airport` = "Alva Regional")
>  OR 
> `airports`.`airport` = "Asheville Regional")
> OR 
> `airports`.`airport` = "Avon Park Municipal")
>OR 
> `airports`.`airport` = "Wilkes-Barre/Scranton Intl")
>   OR 
> `airports`.`airport` = "Marana Northwest Regional")
>  

[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("

2017-02-04 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852883#comment-15852883
 ] 

Pengcheng Xiong commented on HIVE-15388:


[~hagleitn] and [~ashutoshc], I think there is some misunderstanding here. 
Parentheses are now mandatory for the expressions in the predicate in that 
single q test only because of the "=": "in" has higher precedence than "=". 
This is not saying that every q test needs parentheses around expressions in 
predicates. Out of 10K+ q tests, I discovered only that single q test which 
needs modification, and its original form is illegal in postgres/oracle. I 
also tried "select 1+1 in (1,2,3,4)" and "select (1+1) in (1,2,3,4)" in Hive; 
both work well with my patch. Thanks.
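The grouping at issue can be sanity-checked against the q-test output posted earlier in this thread with a toy boolean evaluation (a hypothetical helper, not Hive code): since IN binds tighter than =, {{b = x in (true, false)}} evaluates as {{b = (x IN (true, false))}}.

```java
public class InPrecedence {
    // Hypothetical stand-in for SQL's IN operator over boolean values.
    static boolean in(boolean x, boolean... set) {
        for (boolean candidate : set) {
            if (candidate == x) {
                return true;
            }
        }
        return false;
    }

    // Because IN binds tighter than =, "b = x in (true, false)" groups as
    // b = (x IN (true, false)), never (b = x) IN (true, false).
    static boolean evalEqThenIn(boolean b, boolean x) {
        return b == in(x, true, false);
    }
}
```

Evaluating {{evalEqThenIn(true, true)}} gives true and {{evalEqThenIn(false, true)}} gives false, matching the "select true=true in (true, false)" and "select false=true in (true, false)" q-test results quoted above.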

> HiveParser spends lots of time in parsing queries with lots of "("
> --
>
> Key: HIVE-15388
> URL: https://issues.apache.org/jira/browse/HIVE-15388
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, 
> HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, 
> hive-15388.stacktrace.txt
>
>
> Branch: apache-master (applicable with previous releases as well)
> Queries generated via tools can have lots of "(" for "AND/OR" conditions. 
> This causes huge delays in parsing phase when the number of expressions are 
> high.
> e.g
> {noformat}
> SELECT `iata`,
>`airport`,
>`city`,
>`state`,
>`country`,
>`lat`,
>`lon`
> FROM airports
> WHERE 
> ((`airports`.`airport`
>  = "Thigpen"
>   
>   OR `airports`.`airport` = "Astoria Regional")
>   
>  OR `airports`.`airport` = "Warsaw Municipal")
>   
> OR `airports`.`airport` = "John F Kennedy Memorial")
>  
> OR `airports`.`airport` = "Hall-Miller Municipal")
> 
> OR `airports`.`airport` = "Atqasuk")
>OR 
> `airports`.`airport` = "William B Hartsfield-Atlanta Intl")
>   OR 
> `airports`.`airport` = "Artesia Municipal")
>  OR 
> `airports`.`airport` = "Outagamie County Regional")
> OR 
> `airports`.`airport` = "Watertown Municipal")
>OR 
> `airports`.`airport` = "Augusta State")
>   OR 
> `airports`.`airport` = "Aurora Municipal")
>  OR 
> `airports`.`airport` = "Alakanuk")
> OR 
> `airports`.`airport` = "Austin Municipal")
>OR 
> `airports`.`airport` = "Auburn Municipal")
>   OR 
> `airports`.`airport` = "Auburn-Opelik")
>  OR 
> `airports`.`airport` = "Austin-Bergstrom International")
> OR 
> `airports`.`airport` = "Wausau Municipal")
>OR 
> `airports`.`airport` = "Mecklenburg-Brunswick Regional")
>   OR 
> `airports`.`airport` = "Alva Regional")
>  OR 
> `airports`.`airport` = "Asheville Regional")
> OR 
> `airports`.`airport` = "Avon Park Municipal")
>OR 
> `airports`.`airport` = "Wilkes-Barre/Scranton Intl")
>   OR 
> `airports`.`airport` = "Marana Northwest Regional")
>  OR 
> `airports`.`airport` = "Catalina")
> OR 
> `airports`.`airport` = "Washington Municipal")
>   

[jira] [Commented] (HIVE-15769) Support view creation in CBO

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852814#comment-15852814
 ] 

Hive QA commented on HIVE-15769:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850982/HIVE-15769.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 130 failed/errored test(s), 10227 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_as_select] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_rename] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_8] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_cli_createtab]
 (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_cli_createtab_noauthzapi]
 (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_owner_actions]
 (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_1] 
(batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_2] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_3] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_4] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_const] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_subq_exists] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_union_view] 
(batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[concat_op] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_big_view] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_tbl_props] 
(batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_view] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_or_replace_view] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_defaultformats]
 (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_translate] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_char] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_date] (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_varchar] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cteViews] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_2] (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_4] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_drop] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_ddl1] 
(batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_query5] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_formatted_view_partitioned]
 (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_formatted_view_partitioned_json]
 (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[escape_comments] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_ddl] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_dependency] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_logical] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_view] (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view_onview] 
(batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_2] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_6] (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_7] (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[quotedid_basic] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_create_table_view] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_views] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[struct_in_view] 
(batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_exists] 
(batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_exists_having] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[unicode_comments] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[unset_table_view_property]
 

[jira] [Commented] (HIVE-15804) Druid handler might not emit metadata query when CBO fails

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852785#comment-15852785
 ] 

Hive QA commented on HIVE-15804:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850978/HIVE-15804.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10226 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple]
 (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3377/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3377/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3377/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850978 - PreCommit-HIVE-Build

> Druid handler might not emit metadata query when CBO fails
> --
>
> Key: HIVE-15804
> URL: https://issues.apache.org/jira/browse/HIVE-15804
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15804.01.patch, HIVE-15804.patch
>
>
> When CBO is not enabled/fails, we should still be able to run queries on 
> Druid datasources.
> This is implemented as a Select query that retrieves all available data 
> from Druid and then executes the rest of the logic on it. However, we 
> currently might fail due to wrongly inferred types for numerical columns of 
> the Druid datasource.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15691) Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink

2017-02-04 Thread Kalyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852773#comment-15852773
 ] 

Kalyan commented on HIVE-15691:
---

Hi [~ekoifman],

Can you please review the above patch?

Thanks 

> Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink
> -
>
> Key: HIVE-15691
> URL: https://issues.apache.org/jira/browse/HIVE-15691
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog, Transactions
>Reporter: Kalyan
>Assignee: Kalyan
> Attachments: HIVE-15691.1.patch, HIVE-15691.patch, 
> HIVE-15691-updated.patch
>
>
> Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink.
> It is similar to the StrictJsonWriter already available in Hive.
> The corresponding Flume change, FLUME-3036 (Create a RegexSerializer for Hive 
> Sink), depends on this commit.
> A patch is available for Flume; please see the link below:
> https://github.com/kalyanhadooptraining/flume/commit/1c651e81395404321f9964c8d9d2af6f4a2aaef9
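For readers unfamiliar with the pattern, what a regex-based writer does can be illustrated with a small sketch (hypothetical names, not the proposed StrictRegexWriter API): each capturing group of the configured pattern maps to one target column, and a line that does not match is rejected strictly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexLineSplitter {
    // Hypothetical pattern: three whitespace-separated fields,
    // the last one numeric.
    private static final Pattern LINE =
        Pattern.compile("(\\S+)\\s+(\\S+)\\s+(\\d+)");

    // Returns one value per capturing group, or null on a strict mismatch.
    static String[] split(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1);
        }
        return fields;
    }
}
```

For example, "alice login 42" splits into three column values, while a line with no whitespace-separated numeric tail is rejected as null.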



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15806) Druid schema inference for Select queries might produce wrong type for metrics

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852763#comment-15852763
 ] 

Hive QA commented on HIVE-15806:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850974/HIVE-15806.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10211 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=120)

[groupby4_noskew.q,groupby3_map_skew.q,join_cond_pushdown_2.q,union19.q,union24.q,union_remove_5.q,groupby7_noskew_multi_single_reducer.q,vectorization_1.q,index_auto_self_join.q,auto_smb_mapjoin_14.q,script_env_var2.q,pcr.q,auto_join_filters.q,join0.q,join37.q]
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite 
(batchId=186)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3376/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3376/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3376/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850974 - PreCommit-HIVE-Build

> Druid schema inference for Select queries might produce wrong type for metrics
> --
>
> Key: HIVE-15806
> URL: https://issues.apache.org/jira/browse/HIVE-15806
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15806.01.patch, HIVE-15806.patch
>
>
> We inferred float automatically, instead of emitting a metadata query to 
> Druid and checking the type of the metric.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15812) Scalar subquery with having throws exception

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852748#comment-15852748
 ] 

Hive QA commented on HIVE-15812:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850962/HIVE-15812.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10226 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite 
(batchId=186)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3375/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3375/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3375/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850962 - PreCommit-HIVE-Build

> Scalar subquery with having throws exception
> 
>
> Key: HIVE-15812
> URL: https://issues.apache.org/jira/browse/HIVE-15812
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15812.1.patch
>
>
> Following query throws an exception
> {code:SQL}
> select sum(p_retailprice) from part group by p_type having sum(p_retailprice) 
> > (select max(pp.p_retailprice) from part pp);
> {code}
> {noformat}
> SemanticException [Error 10004]: Line 3:40 Invalid table alias or column 
> reference 'pp': (possible column names are: p_partkey, p_name, p_mfgr, 
> p_brand, p_type, p_size, p_container, p_retailprice, p_comment)
> {noformat}





[jira] [Commented] (HIVE-15458) Fix semi-join conversion rule for subquery

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852735#comment-15852735
 ] 

Hive QA commented on HIVE-15458:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850951/HIVE-15458.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10226 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[leftsemijoin] 
(batchId=42)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_partitioner]
 (batchId=162)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3374/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3374/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3374/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850951 - PreCommit-HIVE-Build

> Fix semi-join conversion rule for subquery
> --
>
> Key: HIVE-15458
> URL: https://issues.apache.org/jira/browse/HIVE-15458
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15458.1.patch, HIVE-15458.2.patch
>
>
> Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* 
> since it doesn't work for subqueries.





[jira] [Commented] (HIVE-15810) llapstatus should wait for ZK node to become available when in wait mode

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852710#comment-15852710
 ] 

Hive QA commented on HIVE-15810:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850947/HIVE-15810.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10226 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3373/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3373/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3373/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850947 - PreCommit-HIVE-Build

> llapstatus should wait for ZK node to become available when in wait mode
> 
>
> Key: HIVE-15810
> URL: https://issues.apache.org/jira/browse/HIVE-15810
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15810.patch
>
>






[jira] [Commented] (HIVE-15797) separate the configs for gby and oby position alias usage

2017-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852690#comment-15852690
 ] 

Hive QA commented on HIVE-15797:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850950/HIVE-15797.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10226 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple]
 (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption
 (batchId=277)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[0]
 (batchId=173)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[1]
 (batchId=173)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[0]
 (batchId=173)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate (batchId=173)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate3 (batchId=173)
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3372/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3372/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3372/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850950 - PreCommit-HIVE-Build

> separate the configs for gby and oby position alias usage
> -
>
> Key: HIVE-15797
> URL: https://issues.apache.org/jira/browse/HIVE-15797
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15797.01.patch, HIVE-15797.02.patch, 
> HIVE-15797.patch
>
>






[jira] [Comment Edited] (HIVE-15277) Teach Hive how to create/delete Druid segments

2017-02-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753038#comment-15753038
 ] 

Lefty Leverenz edited comment on HIVE-15277 at 2/4/17 8:08 AM:
---

The new table property should be documented here as well as in the Druid 
Integration doc:

* [DDL -- Table Properties | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]

Also document the new configuration parameters:

*  *hive.druid.indexer.segments.granularity*
*  *hive.druid.indexer.partition.size.max*
*  *hive.druid.indexer.memory.rownum.max*
*  *hive.druid.basePersistDirectory*
*  *hive.druid.storage.storageDirectory*
*  *hive.druid.metadata.base*
*  *hive.druid.metadata.db.type*  (Edit:  see HIVE-15809 for correct values)
*  *hive.druid.metadata.username*
*  *hive.druid.metadata.password*
*  *hive.druid.metadata.uri*
*  *hive.druid.working.directory*

At this point there are enough Druid configuration parameters for a separate 
subsection in the Configuration Properties doc.  (Also see HIVE-14217 and 
HIVE-15273.)

* [Hive Configuration Properties | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveConfigurationProperties]

Added a TODOC2.2 label.



> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: file.patch, HIVE-15277.2.patch, HIVE-15277.patch, 
> HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, HIVE-15277.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate druid segment files and insert the 
> metadata to signal the handoff to druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select query>;
> {code}
> This statement stores the results of the select query in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention that is used for Druid: there needs to be a column 
> named '__time' in the result of the executed query, which will act as the 
> time dimension column in Druid. Currently, the time dimension column needs 
> to be of 'timestamp' type.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a column in Hive that contains numbers and 
> needs to be presented as a dimension, use the cast operator to cast it to 
> string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid. Users need to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid
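
Putting the description above together, a hypothetical CTAS could look like the 
following sketch (table, column, and datasource names are illustrative, not 
from the patch):

{code:sql}
CREATE TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "datasourcename")
AS
SELECT `ts` AS `__time`,                       -- mandatory time dimension, 'timestamp' type
       CAST(`zipcode` AS STRING) AS `zipcode`, -- numeric column cast to string so Druid treats it as a dimension
       `page`,                                 -- string column becomes a dimension
       `clicks`                                -- numeric column becomes a metric
FROM src_table;
{code}

Note the cast on the numeric `zipcode` column: without it, Druid would treat 
the column as a metric rather than a dimension.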





[jira] [Updated] (HIVE-15809) Typo in the PostgreSQL database name for druid service

2017-02-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15809:
--
Labels: TODOC2.2  (was: )

> Typo in the PostgreSQL database name for druid service
> --
>
> Key: HIVE-15809
> URL: https://issues.apache.org/jira/browse/HIVE-15809
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Trivial
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15809.patch
>
>






[jira] [Commented] (HIVE-15809) Typo in the PostgreSQL database name for druid service

2017-02-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852684#comment-15852684
 ] 

Lefty Leverenz commented on HIVE-15809:
---

Doc note:  This fixes a possible value of *hive.druid.metadata.db.type*, which 
was created by HIVE-15277 (also in release 2.2.0).  The wiki needs to list both 
values, so I'm adding a TODOC2.2 label to this issue.

> Typo in the PostgreSQL database name for druid service
> --
>
> Key: HIVE-15809
> URL: https://issues.apache.org/jira/browse/HIVE-15809
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Trivial
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15809.patch
>
>






[jira] [Updated] (HIVE-15809) Typo in the PostgreSQL database name for druid service

2017-02-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15809:
--
Labels:   (was: TODOC2.2)

> Typo in the PostgreSQL database name for druid service
> --
>
> Key: HIVE-15809
> URL: https://issues.apache.org/jira/browse/HIVE-15809
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Trivial
> Fix For: 2.2.0
>
> Attachments: HIVE-15809.patch
>
>



