[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853156#comment-15853156 ] Matt McCline commented on HIVE-15573: - New patch has review comment changes except guard-rail. Other changes for EXPLAIN VECTORIZATION. > Vectorization: ACID shuffle ReduceSink is not specialized > -- > > Key: HIVE-15573 > URL: https://issues.apache.org/jira/browse/HIVE-15573 > Project: Hive > Issue Type: Improvement > Components: Transactions, Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: acid-test.svg, HIVE-15573.01.patch, HIVE-15573.02.patch, > HIVE-15573.03.patch, HIVE-15573.04.patch, screenshot-1.png > > > The ACID shuffle disabled murmur hash for the shuffle, due to the bucketing > requirements demanding the writable hashcode for the shuffles. > {code} > boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM); > if (!useUniformHash) { > return false; > } > {code} > This check protects the fast ReduceSink ops from being used in ACID inserts. > A specialized case for the following pattern will make ACID insert much > faster. > {code} > Reduce Output Operator > sort order: > Map-reduce partition columns: _col0 (type: bigint) > value expressions: > {code} > !screenshot-1.png! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
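For context on why the uniform-hash guard matters: bucketed ACID tables route each row by the Writable hashCode of the bucketing column, while the specialized fast ReduceSink paths assume a murmur-style "uniform" hash. The sketch below is a hypothetical illustration, not Hive source (the class and method names are invented; the murmur constants are the MurmurHash3 fmix64 finalizer), showing how the two functions generally disagree on bucket assignment:

```java
// Hypothetical illustration, not Hive source: contrasts the two hash
// functions a ReduceSink could use when routing rows to reducers/buckets.
public class ShuffleHashSketch {
    // Writable-style hash for a long key; matches the contract of
    // org.apache.hadoop.io.LongWritable.hashCode(): (int) (v ^ (v >>> 32)).
    static int writableHash(long v) {
        return (int) (v ^ (v >>> 32));
    }

    // Murmur-style "uniform" hash (MurmurHash3 fmix64 finalizer) standing
    // in for what the fast, specialized ReduceSink paths assume.
    static int uniformHash(long v) {
        v ^= v >>> 33;
        v *= 0xff51afd7ed558ccdL;
        v ^= v >>> 33;
        v *= 0xc4ceb9fe1a85ec53L;
        v ^= v >>> 33;
        return (int) v;
    }

    // Reducer/bucket selection from a hash code.
    static int bucket(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        long key = 42L;
        // ACID writers must all agree with the Writable hashCode; the
        // uniform hash generally lands the same key in a different bucket.
        System.out.println(bucket(writableHash(key), 4));
        System.out.println(bucket(uniformHash(key), 4));
    }
}
```

A specialized ACID ReduceSink therefore has to keep the Writable hashCode while still taking the vectorized fast path, rather than switching to the uniform hash.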
[jira] [Updated] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-15573: Status: Patch Available (was: In Progress) > Vectorization: ACID shuffle ReduceSink is not specialized > -- > > Key: HIVE-15573 > URL: https://issues.apache.org/jira/browse/HIVE-15573 > Project: Hive > Issue Type: Improvement > Components: Transactions, Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: acid-test.svg, HIVE-15573.01.patch, HIVE-15573.02.patch, > HIVE-15573.03.patch, HIVE-15573.04.patch, screenshot-1.png > > > The ACID shuffle disabled murmur hash for the shuffle, due to the bucketing > requirements demanding the writable hashcode for the shuffles. > {code} > boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM); > if (!useUniformHash) { > return false; > } > {code} > This check protects the fast ReduceSink ops from being used in ACID inserts. > A specialized case for the following pattern will make ACID insert much > faster. > {code} > Reduce Output Operator > sort order: > Map-reduce partition columns: _col0 (type: bigint) > value expressions: > {code} > !screenshot-1.png! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-15573: Attachment: HIVE-15573.04.patch > Vectorization: ACID shuffle ReduceSink is not specialized > -- > > Key: HIVE-15573 > URL: https://issues.apache.org/jira/browse/HIVE-15573 > Project: Hive > Issue Type: Improvement > Components: Transactions, Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: acid-test.svg, HIVE-15573.01.patch, HIVE-15573.02.patch, > HIVE-15573.03.patch, HIVE-15573.04.patch, screenshot-1.png > > > The ACID shuffle disabled murmur hash for the shuffle, due to the bucketing > requirements demanding the writable hashcode for the shuffles. > {code} > boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM); > if (!useUniformHash) { > return false; > } > {code} > This check protects the fast ReduceSink ops from being used in ACID inserts. > A specialized case for the following pattern will make ACID insert much > faster. > {code} > Reduce Output Operator > sort order: > Map-reduce partition columns: _col0 (type: bigint) > value expressions: > {code} > !screenshot-1.png! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-15573: Status: In Progress (was: Patch Available) > Vectorization: ACID shuffle ReduceSink is not specialized > -- > > Key: HIVE-15573 > URL: https://issues.apache.org/jira/browse/HIVE-15573 > Project: Hive > Issue Type: Improvement > Components: Transactions, Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: acid-test.svg, HIVE-15573.01.patch, HIVE-15573.02.patch, > HIVE-15573.03.patch, screenshot-1.png > > > The ACID shuffle disabled murmur hash for the shuffle, due to the bucketing > requirements demanding the writable hashcode for the shuffles. > {code} > boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM); > if (!useUniformHash) { > return false; > } > {code} > This check protects the fast ReduceSink ops from being used in ACID inserts. > A specialized case for the following pattern will make ACID insert much > faster. > {code} > Reduce Output Operator > sort order: > Map-reduce partition columns: _col0 (type: bigint) > value expressions: > {code} > !screenshot-1.png! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15808) Remove semijoin reduction branch if it is on bigtable along with hash join
[ https://issues.apache.org/jira/browse/HIVE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853138#comment-15853138 ] Hive QA commented on HIVE-15808: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851062/HIVE-15808.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10226 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3383/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3383/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3383/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12851062 - PreCommit-HIVE-Build > Remove semijoin reduction branch if it is on bigtable along with hash join > -- > > Key: HIVE-15808 > URL: https://issues.apache.org/jira/browse/HIVE-15808 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-15808.2.patch, HIVE-15808.patch > > > If there is a semijoin branch on the same operator pipeline which contains a > hash join then it is by design on big table which is not optimal. 
The > operator cycle detection logic may not find a cycle as there is no cycle at > operator level. However, once Tez builds its task there can be a cycle at > task level causing the query to fail. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15806) Druid schema inference for Select queries might produce wrong type for metrics
[ https://issues.apache.org/jira/browse/HIVE-15806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853136#comment-15853136 ] Ashutosh Chauhan commented on HIVE-15806: - * Post aggregator columns in TopN, GroupBy, and timeseries queries are always float, but could they potentially be long? * Add a comment explaining why we issue the metadata query only for Select but not for other query types? > Druid schema inference for Select queries might produce wrong type for metrics > -- > > Key: HIVE-15806 > URL: https://issues.apache.org/jira/browse/HIVE-15806 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-15806.01.patch, HIVE-15806.patch > > > We inferred float automatically, instead of emitting a metadata query to > Druid and checking the type of the metric. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15812) Scalar subquery with having throws exception
[ https://issues.apache.org/jira/browse/HIVE-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853132#comment-15853132 ] Ashutosh Chauhan commented on HIVE-15812: - +1 > Scalar subquery with having throws exception > > > Key: HIVE-15812 > URL: https://issues.apache.org/jira/browse/HIVE-15812 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Labels: sub-query > Attachments: HIVE-15812.1.patch > > > Following query throws an exception > {code:SQL} > select sum(p_retailprice) from part group by p_type having sum(p_retailprice) > > (select max(pp.p_retailprice) from part pp); > {code} > {noformat} > SemanticException [Error 10004]: Line 3:40 Invalid table alias or column > reference 'pp': (possible column names are: p_partkey, p_name, p_mfgr, > p_brand, p_type, p_size, p_container, p_retailprice, p_comment) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15804) Druid handler might not emit metadata query when CBO fails
[ https://issues.apache.org/jira/browse/HIVE-15804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853131#comment-15853131 ] Ashutosh Chauhan commented on HIVE-15804: - It also seems the title of this JIRA needs to be updated. > Druid handler might not emit metadata query when CBO fails > -- > > Key: HIVE-15804 > URL: https://issues.apache.org/jira/browse/HIVE-15804 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-15804.01.patch, HIVE-15804.patch > > > When CBO is not enabled/fails, we should still be able to run queries on > Druid datasources. > This is implemented as Select query that will retrieve all data available > from Druid and then execute the rest of the logic on the data. However, > currently we might fail due to wrong inferred type for the Druid datasource > columns for numerical types. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15804) Druid handler might not emit metadata query when CBO fails
[ https://issues.apache.org/jira/browse/HIVE-15804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853130#comment-15853130 ] Ashutosh Chauhan commented on HIVE-15804: - +1 > Druid handler might not emit metadata query when CBO fails > -- > > Key: HIVE-15804 > URL: https://issues.apache.org/jira/browse/HIVE-15804 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-15804.01.patch, HIVE-15804.patch > > > When CBO is not enabled/fails, we should still be able to run queries on > Druid datasources. > This is implemented as Select query that will retrieve all data available > from Druid and then execute the rest of the logic on the data. However, > currently we might fail due to wrong inferred type for the Druid datasource > columns for numerical types. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15458) Fix semi-join conversion rule for subquery
[ https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853126#comment-15853126 ] Ashutosh Chauhan commented on HIVE-15458: - +1 > Fix semi-join conversion rule for subquery > -- > > Key: HIVE-15458 > URL: https://issues.apache.org/jira/browse/HIVE-15458 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-15458.1.patch, HIVE-15458.2.patch, > HIVE-15458.3.patch > > > Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* > since it doesn't work for subqueries. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15815) Allow to pass some Oozie properties to Spark in HoS
[ https://issues.apache.org/jira/browse/HIVE-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853125#comment-15853125 ] Xuefu Zhang commented on HIVE-15815: +1. I assume that spark.hadoop.oozie properties will be interpreted correctly by Spark. > Allow to pass some Oozie properties to Spark in HoS > --- > > Key: HIVE-15815 > URL: https://issues.apache.org/jira/browse/HIVE-15815 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15815.patch > > > Oozie passes some of its properties (e.g. oozie.job.id) to Beeline/HS2 when > it invokes Hive2 action. If we allow these properties to be passed to Spark > in HoS, we can easily associate an Oozie workflow ID to an HoS client and > Spark job in Spark history. It will be very helpful in diagnosing some issues > involving Oozie Hive2/HoS/Spark. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
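The mechanism alluded to above is Spark's spark.hadoop.* convention: Spark strips the spark.hadoop. prefix and copies the remainder into the job's Hadoop Configuration. A minimal sketch of the idea follows; the helper name and the simple oozie.* filtering rule are hypothetical, not taken from the attached patch:

```java
// Hedged sketch of the HIVE-15815 idea (helper name and filtering rule
// are hypothetical, not the actual patch): re-prefix Oozie properties
// with "spark.hadoop." so Spark copies them into the job's Hadoop
// Configuration, making e.g. oozie.job.id visible in Spark history.
public class OoziePropPassthrough {
    static java.util.Map<String, String> toSparkConf(java.util.Map<String, String> hiveConf) {
        java.util.Map<String, String> sparkConf = new java.util.HashMap<>();
        for (java.util.Map.Entry<String, String> e : hiveConf.entrySet()) {
            if (e.getKey().startsWith("oozie.")) {
                // Spark strips "spark.hadoop." and puts the rest into the
                // job's Hadoop Configuration.
                sparkConf.put("spark.hadoop." + e.getKey(), e.getValue());
            }
        }
        return sparkConf;
    }

    public static void main(String[] args) {
        java.util.Map<String, String> hiveConf = new java.util.HashMap<>();
        hiveConf.put("oozie.job.id", "0000001-170206-oozie-wf");
        hiveConf.put("hive.execution.engine", "spark");
        // Only the oozie.* entry is forwarded, as spark.hadoop.oozie.job.id.
        System.out.println(toSparkConf(hiveConf));
    }
}
```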
[jira] [Commented] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative
[ https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853124#comment-15853124 ] Hive QA commented on HIVE-15222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851061/HIVE-15222.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 128 failed/errored test(s), 10181 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=120) [groupby4_noskew.q,groupby3_map_skew.q,join_cond_pushdown_2.q,union19.q,union24.q,union_remove_5.q,groupby7_noskew_multi_single_reducer.q,vectorization_1.q,index_auto_self_join.q,auto_smb_mapjoin_14.q,script_env_var2.q,pcr.q,auto_join_filters.q,join0.q,join37.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=126) [ptf_seqfile.q,union_remove_23.q,parallel_join0.q,union_remove_9.q,join_nullsafe.q,skewjoinopt14.q,vectorized_mapjoin.q,union4.q,auto_join5.q,vectorized_shufflejoin.q,smb_mapjoin_20.q,groupby8_noskew.q,auto_sortmerge_join_10.q,groupby11.q,union_remove_16.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=130) [groupby6_map.q,stats13.q,groupby2_noskew_multi_distinct.q,load_dyn_part12.q,join15.q,auto_join17.q,join_hive_626.q,tez_join_tests.q,auto_join21.q,join_view.q,join_cond_pushdown_4.q,vectorization_0.q,union_null.q,auto_join3.q,vectorization_decimal_date.q] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input4] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join0] (batchId=54) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parallel_join0] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] (batchId=31) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] (batchId=78) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join6] (batchId=38) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constprog_dpp] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constprog_semijoin] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_3] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_5] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_1] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_2] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_3] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_4] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_5] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[deleteAnalyze] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[empty_join] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[nonmr_fetch_threshold] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part_all_complex] (batchId=139) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part_all_primitive] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_table] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_complex] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_primitive] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_table] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_cache] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_without_gby] (batchId=149)
[jira] [Commented] (HIVE-15808) Remove semijoin reduction branch if it is on bigtable along with hash join
[ https://issues.apache.org/jira/browse/HIVE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853113#comment-15853113 ] Deepak Jaiswal commented on HIVE-15808: --- Patch updated. > Remove semijoin reduction branch if it is on bigtable along with hash join > -- > > Key: HIVE-15808 > URL: https://issues.apache.org/jira/browse/HIVE-15808 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-15808.2.patch, HIVE-15808.patch > > > If there is a semijoin branch on the same operator pipeline which contains a > hash join then it is by design on big table which is not optimal. The > operator cycle detection logic may not find a cycle as there is no cycle at > operator level. However, once Tez builds its task there can be a cycle at > task level causing the query to fail. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15743) vectorized text parsing: speed up double parse
[ https://issues.apache.org/jira/browse/HIVE-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853106#comment-15853106 ] Hive QA commented on HIVE-15743: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851060/HIVE-15743.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10225 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple] (batchId=153) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=230) org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3381/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3381/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3381/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12851060 - PreCommit-HIVE-Build > vectorized text parsing: speed up double parse > -- > > Key: HIVE-15743 > URL: https://issues.apache.org/jira/browse/HIVE-15743 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Teddy Choi > Attachments: HIVE-15743.1.patch, HIVE-15743.2.patch, > HIVE-15743.3.patch, HIVE-15743.4.patch, tpch-without.png > > > {noformat} > Double.parseDouble( > new String(bytes, fieldStart, fieldLength, > StandardCharsets.UTF_8));{noformat} > This takes ~25% of the query time in some cases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15808) Remove semijoin reduction branch if it is on bigtable along with hash join
[ https://issues.apache.org/jira/browse/HIVE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-15808: -- Attachment: HIVE-15808.2.patch > Remove semijoin reduction branch if it is on bigtable along with hash join > -- > > Key: HIVE-15808 > URL: https://issues.apache.org/jira/browse/HIVE-15808 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-15808.2.patch, HIVE-15808.patch > > > If there is a semijoin branch on the same operator pipeline which contains a > hash join then it is by design on big table which is not optimal. The > operator cycle detection logic may not find a cycle as there is no cycle at > operator level. However, once Tez builds its task there can be a cycle at > task level causing the query to fail. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15808) Remove semijoin reduction branch if it is on bigtable along with hash join
[ https://issues.apache.org/jira/browse/HIVE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-15808: -- Description: If there is a semijoin branch on the same operator pipeline which contains a hash join then it is by design on big table which is not optimal. The operator cycle detection logic may not find a cycle as there is no cycle at operator level. However, once Tez builds its task there can be a cycle at task level causing the query to fail. (was: It is found that the current logic of cycle detection does not find cycles created when there is a semijoin branch parallel to a hash join on a reducer. To avoid such cycles, remove the semijoin reduction optimization.) Summary: Remove semijoin reduction branch if it is on bigtable along with hash join (was: Remove Semijoin reduction branch on reducers if there is hash join) > Remove semijoin reduction branch if it is on bigtable along with hash join > -- > > Key: HIVE-15808 > URL: https://issues.apache.org/jira/browse/HIVE-15808 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-15808.patch > > > If there is a semijoin branch on the same operator pipeline which contains a > hash join then it is by design on big table which is not optimal. The > operator cycle detection logic may not find a cycle as there is no cycle at > operator level. However, once Tez builds its task there can be a cycle at > task level causing the query to fail. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15223) replace org.json usage in EximUtil with some alternative
[ https://issues.apache.org/jira/browse/HIVE-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853094#comment-15853094 ] Hive QA commented on HIVE-15223: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12850960/HIVE-15223.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10222 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=230) org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277) org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite (batchId=186) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3380/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3380/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3380/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12850960 - PreCommit-HIVE-Build > replace org.json usage in EximUtil with some alternative > > > Key: HIVE-15223 > URL: https://issues.apache.org/jira/browse/HIVE-15223 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Teddy Choi > Fix For: 2.2.0 > > Attachments: HIVE-15223.1.patch > > > The metadata is stored in json format...which changed lately with the advent > of replication v2. > I think jackson would be nice to have here - it could possibly aid to make > this Metadata reading / writing more resilient against future serialization > issues. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative
[ https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-15222: -- Status: Patch Available (was: Open) > replace org.json usage in ExplainTask/TezTask related classes with some > alternative > --- > > Key: HIVE-15222 > URL: https://issues.apache.org/jira/browse/HIVE-15222 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Teddy Choi > Fix For: 2.2.0 > > Attachments: HIVE-15222.1.patch > > > Replace org.json usage in these classes. > It seems to me that json is probably only used to write some information - > but the application never reads it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative
[ https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-15222: -- Attachment: HIVE-15222.1.patch I replaced json.org with Jackson. However, the patch file is bigger than I thought. It covers 7 files and its size is 64KB. I tested TestExplainTask and it succeeded. But it still may not be sufficient. I will wait for integration test results. > replace org.json usage in ExplainTask/TezTask related classes with some > alternative > --- > > Key: HIVE-15222 > URL: https://issues.apache.org/jira/browse/HIVE-15222 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Teddy Choi > Fix For: 2.2.0 > > Attachments: HIVE-15222.1.patch > > > Replace org.json usage in these classes. > It seems to me that json is probably only used to write some information - > but the application never reads it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15743) vectorized text parsing: speed up double parse
[ https://issues.apache.org/jira/browse/HIVE-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-15743: -- Attachment: HIVE-15743.4.patch This patch applies Sergey's feedback. It uses a trimmed string for the precision check, and limits strtod(String) to testing only. > vectorized text parsing: speed up double parse > -- > > Key: HIVE-15743 > URL: https://issues.apache.org/jira/browse/HIVE-15743 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Teddy Choi > Attachments: HIVE-15743.1.patch, HIVE-15743.2.patch, > HIVE-15743.3.patch, HIVE-15743.4.patch, tpch-without.png > > > {noformat} > Double.parseDouble( > new String(bytes, fieldStart, fieldLength, > StandardCharsets.UTF_8));{noformat} > This takes ~25% of the query time in some cases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
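To illustrate the kind of optimization being discussed (a simplified sketch, not the attached patch): a plain decimal number can be parsed directly from the byte buffer, skipping the String allocation entirely, with a fallback to Double.parseDouble for exponents, oversized inputs, or anything else the fast path cannot handle. Edge-case validation here is deliberately minimal.

```java
// Simplified sketch (not the HIVE-15743 patch): parse a plain decimal
// double straight out of a byte buffer, avoiding the
// new String(...) + Double.parseDouble(...) allocation on the hot path.
// Exponents, oversized inputs, and malformed bytes raise
// NumberFormatException so the caller can fall back to Double.parseDouble.
public class FastDoubleParse {
    private static final double[] POW10 = {
        1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9
    };

    static double parseSimpleDouble(byte[] bytes, int start, int len) {
        // Cap the length so the long mantissa cannot overflow.
        if (len <= 0 || len > 17) {
            throw new NumberFormatException("fall back to Double.parseDouble");
        }
        int i = start;
        final int end = start + len;
        boolean negative = false;
        if (bytes[i] == '-' || bytes[i] == '+') {
            negative = bytes[i] == '-';
            i++;
        }
        if (i == end) {
            throw new NumberFormatException("fall back to Double.parseDouble");
        }
        long mantissa = 0;
        int scale = 0;           // digits seen after the decimal point
        boolean seenDot = false;
        for (; i < end; i++) {
            byte b = bytes[i];
            if (b == '.' && !seenDot) {
                seenDot = true;
            } else if (b >= '0' && b <= '9') {
                mantissa = mantissa * 10 + (b - '0');
                if (seenDot) scale++;
            } else {
                throw new NumberFormatException("fall back to Double.parseDouble");
            }
        }
        if (scale >= POW10.length) {
            throw new NumberFormatException("fall back to Double.parseDouble");
        }
        double value = mantissa / POW10[scale];
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        byte[] field = "12.5".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(parseSimpleDouble(field, 0, field.length)); // prints 12.5
    }
}
```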
[jira] [Updated] (HIVE-15223) replace org.json usage in EximUtil with some alternative
[ https://issues.apache.org/jira/browse/HIVE-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-15223: -- Status: Patch Available (was: Open) > replace org.json usage in EximUtil with some alternative > > > Key: HIVE-15223 > URL: https://issues.apache.org/jira/browse/HIVE-15223 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Teddy Choi > Fix For: 2.2.0 > > Attachments: HIVE-15223.1.patch > > > The metadata is stored in json format...which changed lately with the advent > of replication v2. > I think jackson would be nice to have here - it could possibly aid to make > this Metadata reading / writing more resilient against future serialization > issues. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15815) Allow to pass some Oozie properties to Spark in HoS
[ https://issues.apache.org/jira/browse/HIVE-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15815: --- Status: Patch Available (was: Open) > Allow to pass some Oozie properties to Spark in HoS > --- > > Key: HIVE-15815 > URL: https://issues.apache.org/jira/browse/HIVE-15815 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15815.patch > > > Oozie passes some of its properties (e.g. oozie.job.id) to Beeline/HS2 when > it invokes Hive2 action. If we allow these properties to be passed to Spark > in HoS, we can easily associate an Oozie workflow ID to an HoS client and > Spark job in Spark history. It will be very helpful in diagnosing some issues > involving Oozie Hive2/HoS/Spark. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15815) Allow to pass some Oozie properties to Spark in HoS
[ https://issues.apache.org/jira/browse/HIVE-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15815: --- Attachment: HIVE-15815.patch [~xuefuz] could you help to review the patch to see if it makes sense? thanks! > Allow to pass some Oozie properties to Spark in HoS > --- > > Key: HIVE-15815 > URL: https://issues.apache.org/jira/browse/HIVE-15815 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15815.patch > > > Oozie passes some of its properties (e.g. oozie.job.id) to Beeline/HS2 when > it invokes a Hive2 action. If we allow these properties to be passed to Spark > in HoS, we can easily associate an Oozie workflow ID with an HoS client and > Spark job in Spark history. It will be very helpful in diagnosing some issues > involving Oozie Hive2/HoS/Spark. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-15815) Allow to pass some Oozie properties to Spark in HoS
[ https://issues.apache.org/jira/browse/HIVE-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-15815: -- > Allow to pass some Oozie properties to Spark in HoS > --- > > Key: HIVE-15815 > URL: https://issues.apache.org/jira/browse/HIVE-15815 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > > Oozie passes some of its properties (e.g. oozie.job.id) to Beeline/HS2 when > it invokes a Hive2 action. If we allow these properties to be passed to Spark > in HoS, we can easily associate an Oozie workflow ID with an HoS client and > Spark job in Spark history. It will be very helpful in diagnosing some issues > involving Oozie Hive2/HoS/Spark. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852964#comment-15852964 ] Pengcheng Xiong commented on HIVE-15388: I have added lots of parentheses queries before, e.g., multi_column_in.q and multi_column_in_single.q. For interval, we have TPC-DS queries in PerfCliDriver which support the new interval syntax. > HiveParser spends lots of time in parsing queries with lots of "(" > -- > > Key: HIVE-15388 > URL: https://issues.apache.org/jira/browse/HIVE-15388 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong > Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, > HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, > hive-15388.stacktrace.txt > > > Branch: apache-master (applicable with previous releases as well) > Queries generated via tools can have lots of "(" for "AND/OR" conditions. > This causes huge delays in parsing phase when the number of expressions are > high. 
> e.g > {noformat} > SELECT `iata`, >`airport`, >`city`, >`state`, >`country`, >`lat`, >`lon` > FROM airports > WHERE > ((`airports`.`airport` > = "Thigpen" > > OR `airports`.`airport` = "Astoria Regional") > > OR `airports`.`airport` = "Warsaw Municipal") > > OR `airports`.`airport` = "John F Kennedy Memorial") > > OR `airports`.`airport` = "Hall-Miller Municipal") > > OR `airports`.`airport` = "Atqasuk") >OR > `airports`.`airport` = "William B Hartsfield-Atlanta Intl") > OR > `airports`.`airport` = "Artesia Municipal") > OR > `airports`.`airport` = "Outagamie County Regional") > OR > `airports`.`airport` = "Watertown Municipal") >OR > `airports`.`airport` = "Augusta State") > OR > `airports`.`airport` = "Aurora Municipal") > OR > `airports`.`airport` = "Alakanuk") > OR > `airports`.`airport` = "Austin Municipal") >OR > `airports`.`airport` = "Auburn Municipal") > OR > `airports`.`airport` = "Auburn-Opelik") > OR > `airports`.`airport` = "Austin-Bergstrom International") > OR > `airports`.`airport` = "Wausau Municipal") >OR > `airports`.`airport` = "Mecklenburg-Brunswick Regional") > OR > `airports`.`airport` = "Alva Regional") > OR > `airports`.`airport` = "Asheville Regional") > OR > `airports`.`airport` = "Avon Park Municipal") >OR > `airports`.`airport` = "Wilkes-Barre/Scranton Intl") > OR > `airports`.`airport` = "Marana Northwest Regional") > OR > `airports`.`airport` = "Catalina") > OR > `airports`.`airport` = "Washington Municipal") >OR > `airports`.`airport` = "Wainwright") > OR `airports`.`airport` > = "West Memphis Municipal") > OR `airports`.`airport` > = "Arlington Municipal") > OR `airports`.`airport` = > "Algona Municipal") >
[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852963#comment-15852963 ] Pengcheng Xiong commented on HIVE-15388: Here are the results and they are expected: {code} PREHOOK: query: select true=true in (true, false) PREHOOK: type: QUERY PREHOOK: Input: _dummy_database@_dummy_table A masked pattern was here POSTHOOK: query: select true=true in (true, false) POSTHOOK: type: QUERY POSTHOOK: Input: _dummy_database@_dummy_table A masked pattern was here true PREHOOK: query: select false=true in (true, false) PREHOOK: type: QUERY PREHOOK: Input: _dummy_database@_dummy_table A masked pattern was here POSTHOOK: query: select false=true in (true, false) POSTHOOK: type: QUERY POSTHOOK: Input: _dummy_database@_dummy_table A masked pattern was here false {code} Postgres {code} horton=# select true=false in (true,false); ?column? -- t (1 row) horton=# select false=false in (true,false); ?column? -- f (1 row) {code} And the error: Hive: {code} 2017-02-04T14:25:34,713 ERROR [a24cc02e-355b-402d-8183-43501e0edc77 main] ql.Driver: FAILED: SemanticException Line 0:-1 Wrong arguments 'false': The arguments for IN should be the same type! Types are: {int IN (boolean, boolean)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Wrong arguments 'false': The arguments for IN should be the same type! Types are: {int IN (boolean, boolean)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1367) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) {code} Postgres {code} horton=# select 1=1 in (true, false); ERROR: operator does not exist: integer = boolean LINE 1: select 1=1 in (true, false); ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. 
{code} > HiveParser spends lots of time in parsing queries with lots of "(" > -- > > Key: HIVE-15388 > URL: https://issues.apache.org/jira/browse/HIVE-15388 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong > Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, > HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, > hive-15388.stacktrace.txt > > > Branch: apache-master (applicable with previous releases as well) > Queries generated via tools can have lots of "(" for "AND/OR" conditions. > This causes huge delays in parsing phase when the number of expressions are > high. > e.g > {noformat} > SELECT `iata`, >`airport`, >`city`, >`state`, >`country`, >`lat`, >`lon` > FROM airports > WHERE > ((`airports`.`airport` > = "Thigpen" > > OR `airports`.`airport` = "Astoria Regional") > > OR `airports`.`airport` = "Warsaw Municipal") > > OR `airports`.`airport` = "John F Kennedy Memorial") > > OR `airports`.`airport` = "Hall-Miller Municipal") > > OR `airports`.`airport` = "Atqasuk") >OR > `airports`.`airport` = "William B Hartsfield-Atlanta Intl") > OR > `airports`.`airport` = "Artesia Municipal") > OR > `airports`.`airport` = "Outagamie County Regional") > OR > `airports`.`airport` = "Watertown Municipal") >OR > `airports`.`airport` = "Augusta State") > OR > `airports`.`airport` = "Aurora Municipal") > OR > `airports`.`airport` = "Alakanuk") > OR > `airports`.`airport` = "Austin Municipal") >OR > `airports`.`airport`
[jira] [Updated] (HIVE-15458) Fix semi-join conversion rule for subquery
[ https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-15458: --- Status: Open (was: Patch Available) > Fix semi-join conversion rule for subquery > -- > > Key: HIVE-15458 > URL: https://issues.apache.org/jira/browse/HIVE-15458 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-15458.1.patch, HIVE-15458.2.patch, > HIVE-15458.3.patch > > > Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* > since it doesn't work for subqueries. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15458) Fix semi-join conversion rule for subquery
[ https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-15458: --- Attachment: HIVE-15458.3.patch > Fix semi-join conversion rule for subquery > -- > > Key: HIVE-15458 > URL: https://issues.apache.org/jira/browse/HIVE-15458 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-15458.1.patch, HIVE-15458.2.patch, > HIVE-15458.3.patch > > > Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* > since it doesn't work for subqueries. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852916#comment-15852916 ] Gunther Hagleitner commented on HIVE-15388: --- [~pxiong] Thanks for the clarification. When you say "1=1 in (true, false)" is illegal, is it a semantic error or a parse error? What happens when you run: "select true=true in (true, false)"? Can you add that to the tests? Problem w/ saying 10k tests didn't find anything else is that I don't know how many tests actually have an in clause with expressions. Probably not that many. Can you make sure that you cover these expressions in "udf_in.q"? For interval literals - the spec says: {noformat} <interval literal> ::= INTERVAL [ <sign> ] <interval string> <interval qualifier> <interval string> ::= <quote> <unquoted interval string> <quote> {noformat} The "unquoted interval string" is parsed elsewhere. So it sounds like restricting it for now is fine, although I'm still looking at this. You are throwing out more tests from "interval_alt.q" than needed; some statements in there should still work, right? > HiveParser spends lots of time in parsing queries with lots of "(" > -- > > Key: HIVE-15388 > URL: https://issues.apache.org/jira/browse/HIVE-15388 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong > Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, > HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, > hive-15388.stacktrace.txt > > > Branch: apache-master (applicable with previous releases as well) > Queries generated via tools can have lots of "(" for "AND/OR" conditions. > This causes huge delays in parsing phase when the number of expressions are > high. 
> e.g > {noformat} > SELECT `iata`, >`airport`, >`city`, >`state`, >`country`, >`lat`, >`lon` > FROM airports > WHERE > ((`airports`.`airport` > = "Thigpen" > > OR `airports`.`airport` = "Astoria Regional") > > OR `airports`.`airport` = "Warsaw Municipal") > > OR `airports`.`airport` = "John F Kennedy Memorial") > > OR `airports`.`airport` = "Hall-Miller Municipal") > > OR `airports`.`airport` = "Atqasuk") >OR > `airports`.`airport` = "William B Hartsfield-Atlanta Intl") > OR > `airports`.`airport` = "Artesia Municipal") > OR > `airports`.`airport` = "Outagamie County Regional") > OR > `airports`.`airport` = "Watertown Municipal") >OR > `airports`.`airport` = "Augusta State") > OR > `airports`.`airport` = "Aurora Municipal") > OR > `airports`.`airport` = "Alakanuk") > OR > `airports`.`airport` = "Austin Municipal") >OR > `airports`.`airport` = "Auburn Municipal") > OR > `airports`.`airport` = "Auburn-Opelik") > OR > `airports`.`airport` = "Austin-Bergstrom International") > OR > `airports`.`airport` = "Wausau Municipal") >OR > `airports`.`airport` = "Mecklenburg-Brunswick Regional") > OR > `airports`.`airport` = "Alva Regional") > OR > `airports`.`airport` = "Asheville Regional") > OR > `airports`.`airport` = "Avon Park Municipal") >OR > `airports`.`airport` = "Wilkes-Barre/Scranton Intl") > OR > `airports`.`airport` = "Marana Northwest Regional") >
[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852883#comment-15852883 ] Pengcheng Xiong commented on HIVE-15388: [~hagleitn] and [~ashutoshc], I think there is some misunderstanding here. Parentheses are now mandatory for expressions in the predicate in that single q test. It is because "in" has higher precedence than "=". This is not saying that for every q test, we need to use parentheses for expressions in predicates. Out of 10K+ q tests, I only discovered that single q test which needs modification, and it is illegal in postgres/oracle. I also tried "select 1+1 in (1,2,3,4)" and "select (1+1) in (1,2,3,4)" in Hive. Both of them work well with my patch. Thanks. > HiveParser spends lots of time in parsing queries with lots of "(" > -- > > Key: HIVE-15388 > URL: https://issues.apache.org/jira/browse/HIVE-15388 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong > Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, > HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, > hive-15388.stacktrace.txt > > > Branch: apache-master (applicable with previous releases as well) > Queries generated via tools can have lots of "(" for "AND/OR" conditions. > This causes huge delays in parsing phase when the number of expressions are > high. 
> e.g > {noformat} > SELECT `iata`, >`airport`, >`city`, >`state`, >`country`, >`lat`, >`lon` > FROM airports > WHERE > ((`airports`.`airport` > = "Thigpen" > > OR `airports`.`airport` = "Astoria Regional") > > OR `airports`.`airport` = "Warsaw Municipal") > > OR `airports`.`airport` = "John F Kennedy Memorial") > > OR `airports`.`airport` = "Hall-Miller Municipal") > > OR `airports`.`airport` = "Atqasuk") >OR > `airports`.`airport` = "William B Hartsfield-Atlanta Intl") > OR > `airports`.`airport` = "Artesia Municipal") > OR > `airports`.`airport` = "Outagamie County Regional") > OR > `airports`.`airport` = "Watertown Municipal") >OR > `airports`.`airport` = "Augusta State") > OR > `airports`.`airport` = "Aurora Municipal") > OR > `airports`.`airport` = "Alakanuk") > OR > `airports`.`airport` = "Austin Municipal") >OR > `airports`.`airport` = "Auburn Municipal") > OR > `airports`.`airport` = "Auburn-Opelik") > OR > `airports`.`airport` = "Austin-Bergstrom International") > OR > `airports`.`airport` = "Wausau Municipal") >OR > `airports`.`airport` = "Mecklenburg-Brunswick Regional") > OR > `airports`.`airport` = "Alva Regional") > OR > `airports`.`airport` = "Asheville Regional") > OR > `airports`.`airport` = "Avon Park Municipal") >OR > `airports`.`airport` = "Wilkes-Barre/Scranton Intl") > OR > `airports`.`airport` = "Marana Northwest Regional") > OR > `airports`.`airport` = "Catalina") > OR > `airports`.`airport` = "Washington Municipal") >
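The precedence point debated above, "in" binding tighter than "=", can be illustrated with a small hypothetical Java analogue, where the in() helper stands in for SQL's IN over a literal list (names here are invented for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class InPrecedence {
  // Stand-in for SQL's "x IN (a, b, ...)" over a literal boolean list.
  static boolean in(boolean needle, List<Boolean> haystack) {
    return haystack.contains(needle);
  }

  public static void main(String[] args) {
    List<Boolean> list = Arrays.asList(true, false);
    // true = (true IN (true, false))  ->  true = true   ->  true
    System.out.println(true == in(true, list));
    // false = (true IN (true, false)) ->  false = true  ->  false
    System.out.println(false == in(true, list));
  }
}
```

Evaluated this way, "select true=true in (true, false)" yields true and "select false=true in (true, false)" yields false, matching the Hive and Postgres results quoted in the earlier comment on this issue.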
[jira] [Commented] (HIVE-15769) Support view creation in CBO
[ https://issues.apache.org/jira/browse/HIVE-15769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852814#comment-15852814 ] Hive QA commented on HIVE-15769: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12850982/HIVE-15769.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 130 failed/errored test(s), 10227 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_as_select] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_rename] (batchId=34) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_8] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_cli_createtab] (batchId=26) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_cli_createtab_noauthzapi] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_owner_actions] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_1] (batchId=18) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_2] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_3] (batchId=32) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_4] (batchId=7) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_const] (batchId=17) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_subq_exists] (batchId=71) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_union_view] (batchId=19) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[concat_op] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_big_view] (batchId=71) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_tbl_props] 
(batchId=67) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_view] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_or_replace_view] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=37) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_defaultformats] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned] (batchId=34) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_translate] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_char] (batchId=17) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_date] (batchId=1) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_varchar] (batchId=16) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cteViews] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_2] (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_4] (batchId=77) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_drop] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_ddl1] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_query5] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_formatted_view_partitioned] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_formatted_view_partitioned_json] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[escape_comments] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_ddl] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_dependency] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_logical] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_view] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view_onview] (batchId=53) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_2] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_6] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_7] (batchId=40) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[quotedid_basic] (batchId=56) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_create_table_view] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_views] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[struct_in_view] (batchId=45) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_exists] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_exists_having] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[unicode_comments] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[unset_table_view_property]
[jira] [Commented] (HIVE-15804) Druid handler might not emit metadata query when CBO fails
[ https://issues.apache.org/jira/browse/HIVE-15804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852785#comment-15852785 ] Hive QA commented on HIVE-15804: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12850978/HIVE-15804.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10226 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple] (batchId=153) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3377/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3377/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3377/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12850978 - PreCommit-HIVE-Build > Druid handler might not emit metadata query when CBO fails > -- > > Key: HIVE-15804 > URL: https://issues.apache.org/jira/browse/HIVE-15804 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-15804.01.patch, HIVE-15804.patch > > > When CBO is not enabled/fails, we should still be able to run queries on > Druid datasources. > This is implemented as Select query that will retrieve all data available > from Druid and then execute the rest of the logic on the data. However, > currently we might fail due to wrong inferred type for the Druid datasource > columns for numerical types. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15691) Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink
[ https://issues.apache.org/jira/browse/HIVE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852773#comment-15852773 ] Kalyan commented on HIVE-15691: --- Hi [~ekoifman], can you please review the above patch? Thanks > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink > - > > Key: HIVE-15691 > URL: https://issues.apache.org/jira/browse/HIVE-15691 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Transactions >Reporter: Kalyan >Assignee: Kalyan > Attachments: HIVE-15691.1.patch, HIVE-15691.patch, > HIVE-15691-updated.patch > > > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink. > It is similar to StrictJsonWriter available in Hive. > Dependency is there in Flume to commit. > FLUME-3036 : Create a RegexSerializer for Hive Sink. > Patch is available for Flume, please verify the below link > https://github.com/kalyanhadooptraining/flume/commit/1c651e81395404321f9964c8d9d2af6f4a2aaef9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15806) Druid schema inference for Select queries might produce wrong type for metrics
[ https://issues.apache.org/jira/browse/HIVE-15806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852763#comment-15852763 ] Hive QA commented on HIVE-15806: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12850974/HIVE-15806.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10211 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=120) [groupby4_noskew.q,groupby3_map_skew.q,join_cond_pushdown_2.q,union19.q,union24.q,union_remove_5.q,groupby7_noskew_multi_single_reducer.q,vectorization_1.q,index_auto_self_join.q,auto_smb_mapjoin_14.q,script_env_var2.q,pcr.q,auto_join_filters.q,join0.q,join37.q] org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277) org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite (batchId=186) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3376/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3376/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3376/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is 
automatically generated. ATTACHMENT ID: 12850974 - PreCommit-HIVE-Build > Druid schema inference for Select queries might produce wrong type for metrics > -- > > Key: HIVE-15806 > URL: https://issues.apache.org/jira/browse/HIVE-15806 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-15806.01.patch, HIVE-15806.patch > > > We inferred float automatically, instead of emitting a metadata query to > Druid and checking the type of the metric. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15812) Scalar subquery with having throws exception
[ https://issues.apache.org/jira/browse/HIVE-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852748#comment-15852748 ] Hive QA commented on HIVE-15812: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12850962/HIVE-15812.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10226 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277) org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite (batchId=186) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3375/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3375/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3375/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12850962 - PreCommit-HIVE-Build > Scalar subquery with having throws exception > > > Key: HIVE-15812 > URL: https://issues.apache.org/jira/browse/HIVE-15812 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Labels: sub-query > Attachments: HIVE-15812.1.patch > > > Following query throws an exception > {code:SQL} > select sum(p_retailprice) from part group by p_type having sum(p_retailprice) > > (select max(pp.p_retailprice) from part pp); > {code} > {noformat} > SemanticException [Error 10004]: Line 3:40 Invalid table alias or column > reference 'pp': (possible column names are: p_partkey, p_name, p_mfgr, > p_brand, p_type, p_size, p_container, p_retailprice, p_comment) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15458) Fix semi-join conversion rule for subquery
[ https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852735#comment-15852735 ] Hive QA commented on HIVE-15458: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12850951/HIVE-15458.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10226 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[leftsemijoin] (batchId=42) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_partitioner] (batchId=162) org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3374/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3374/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3374/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12850951 - PreCommit-HIVE-Build > Fix semi-join conversion rule for subquery > -- > > Key: HIVE-15458 > URL: https://issues.apache.org/jira/browse/HIVE-15458 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-15458.1.patch, HIVE-15458.2.patch > > > Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* > since it doesn't work for subqueries. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15810) llapstatus should wait for ZK node to become available when in wait mode
[ https://issues.apache.org/jira/browse/HIVE-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852710#comment-15852710 ]

Hive QA commented on HIVE-15810:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850947/HIVE-15810.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10226 tests executed

*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3373/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3373/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3373/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850947 - PreCommit-HIVE-Build

> llapstatus should wait for ZK node to become available when in wait mode
> ------------------------------------------------------------------------
>
>                 Key: HIVE-15810
>                 URL: https://issues.apache.org/jira/browse/HIVE-15810
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-15810.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-15797) separate the configs for gby and oby position alias usage
[ https://issues.apache.org/jira/browse/HIVE-15797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852690#comment-15852690 ]

Hive QA commented on HIVE-15797:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850950/HIVE-15797.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10226 tests executed

*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple] (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption (batchId=277)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[0] (batchId=173)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[1] (batchId=173)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[0] (batchId=173)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate (batchId=173)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate3 (batchId=173)
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3372/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3372/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3372/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850950 - PreCommit-HIVE-Build

> separate the configs for gby and oby position alias usage
> ---------------------------------------------------------
>
>                 Key: HIVE-15797
>                 URL: https://issues.apache.org/jira/browse/HIVE-15797
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-15797.01.patch, HIVE-15797.02.patch, HIVE-15797.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Comment Edited] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753038#comment-15753038 ]

Lefty Leverenz edited comment on HIVE-15277 at 2/4/17 8:08 AM:
---------------------------------------------------------------

The new table property should be documented here as well as in the Druid Integration doc:
* [DDL -- Table Properties | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]

Also document the new configuration parameters:
* *hive.druid.indexer.segments.granularity*
* *hive.druid.indexer.partition.size.max*
* *hive.druid.indexer.memory.rownum.max*
* *hive.druid.basePersistDirectory*
* *hive.druid.storage.storageDirectory*
* *hive.druid.metadata.base*
* *hive.druid.metadata.db.type* (Edit: see HIVE-15809 for correct values)
* *hive.druid.metadata.username*
* *hive.druid.metadata.password*
* *hive.druid.metadata.uri*
* *hive.druid.working.directory*

At this point there are enough Druid configuration parameters for a separate subsection in the Configuration Properties doc. (Also see HIVE-14217 and HIVE-15273.)
* [Hive Configuration Properties | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveConfigurationProperties]

Added a TODOC2.2 label.
was (Author: le...@hortonworks.com):
The new table property should be documented here as well as in the Druid Integration doc:
* [DDL -- Table Properties | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]

Also document the new configuration parameters:
* *hive.druid.indexer.segments.granularity*
* *hive.druid.indexer.partition.size.max*
* *hive.druid.indexer.memory.rownum.max*
* *hive.druid.basePersistDirectory*
* *hive.druid.storage.storageDirectory*
* *hive.druid.metadata.base*
* *hive.druid.metadata.db.type*
* *hive.druid.metadata.username*
* *hive.druid.metadata.password*
* *hive.druid.metadata.uri*
* *hive.druid.working.directory*

At this point there are enough Druid configuration parameters for a separate subsection in the Configuration Properties doc. (Also see HIVE-14217 and HIVE-15273.)
* [Hive Configuration Properties | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveConfigurationProperties]

Added a TODOC2.2 label.

> Teach Hive how to create/delete Druid segments
> ----------------------------------------------
>
>                 Key: HIVE-15277
>                 URL: https://issues.apache.org/jira/browse/HIVE-15277
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Druid integration
>    Affects Versions: 2.2.0
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>              Labels: TODOC2.2
>             Fix For: 2.2.0
>
>         Attachments: file.patch, HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate druid segment files and insert the metadata to signal the handoff to druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select `timecolumn` as `__time`, `dimension1`, `metric1`, `metric2`>;
> {code}
> This statement stores the results of the query in a Druid datasource named 'datasourcename'. One of the columns of the query needs to be the time dimension, which is mandatory in Druid. In particular, we use the same convention that is used for Druid: there needs to be a column named '__time' in the result of the executed query, which will act as the time dimension column in Druid. Currently, the time dimension column needs to be of 'timestamp' type.
> Metrics can be of type long, double, or float, while dimensions are strings. Keep in mind that Druid has a clear separation between dimensions and metrics; therefore, if you have a column in Hive that contains numbers and needs to be presented as a dimension, use the cast operator to cast it to string.
> This initial implementation interacts with the Druid metadata storage to add/remove the table in Druid. The user needs to supply the metadata config as:
> --hiveconf hive.druid.metadata.password=XXX --hiveconf hive.druid.metadata.username=druid --hiveconf hive.druid.metadata.uri=jdbc:mysql://host/druid

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
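[Editorial note: the cast-to-string rule described in the HIVE-15277 summary above can be sketched as follows. This is an illustrative example only; the table and column names (druid_table_2, sales_raw, sale_time, zip_code, amount) are hypothetical and do not come from the patch.]

{code:sql}
-- Hypothetical CTAS: zip_code is numeric in Hive but should act as a
-- Druid dimension, so it is cast to string; amount remains numeric and
-- is treated as a metric. sale_time supplies the mandatory __time column.
CREATE TABLE druid_table_2
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "sales_datasource")
AS
SELECT
  `sale_time` AS `__time`,                   -- time dimension, timestamp type
  CAST(`zip_code` AS STRING) AS `zip_code`,  -- numeric column exposed as a dimension
  `amount`                                   -- numeric column exposed as a metric
FROM sales_raw;
{code}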
[jira] [Updated] (HIVE-15809) Typo in the PostgreSQL database name for druid service
[ https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-15809:
----------------------------------
    Labels: TODOC2.2  (was: )

> Typo in the PostgreSQL database name for druid service
> ------------------------------------------------------
>
>                 Key: HIVE-15809
>                 URL: https://issues.apache.org/jira/browse/HIVE-15809
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>    Affects Versions: 2.2.0
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>            Priority: Trivial
>              Labels: TODOC2.2
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15809.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-15809) Typo in the PostgreSQL database name for druid service
[ https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852684#comment-15852684 ]

Lefty Leverenz commented on HIVE-15809:
---------------------------------------

Doc note: This fixes a possible value of *hive.druid.metadata.db.type*, which was created by HIVE-15277 (also in release 2.2.0). The wiki needs to list both values, so I'm adding a TODOC2.2 label to this issue.

> Typo in the PostgreSQL database name for druid service
> ------------------------------------------------------
>
>                 Key: HIVE-15809
>                 URL: https://issues.apache.org/jira/browse/HIVE-15809
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>    Affects Versions: 2.2.0
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>            Priority: Trivial
>              Labels: TODOC2.2
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15809.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (HIVE-15809) Typo in the PostgreSQL database name for druid service
[ https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-15809:
----------------------------------
    Labels:   (was: TODOC2.2)

> Typo in the PostgreSQL database name for druid service
> ------------------------------------------------------
>
>                 Key: HIVE-15809
>                 URL: https://issues.apache.org/jira/browse/HIVE-15809
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>    Affects Versions: 2.2.0
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>            Priority: Trivial
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15809.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)