[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-13860: --- Attachment: HIVE-13860-java8.patch > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch, HIVE-13860-java8.patch, > HIVE-13860-java8.patch, HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11527) bypass HiveServer2 thrift interface for query results
[ https://issues.apache.org/jira/browse/HIVE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303648#comment-15303648 ] Vaibhav Gumashta commented on HIVE-11527: - [~tasanuma0829] Thanks a lot for the work. I'll post my comments (if any) by tomorrow. > bypass HiveServer2 thrift interface for query results > - > > Key: HIVE-11527 > URL: https://issues.apache.org/jira/browse/HIVE-11527 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Sergey Shelukhin >Assignee: Takanobu Asanuma > Attachments: HIVE-11527.WIP.patch > > > Right now, HS2 reads query results and returns them to the caller via its > thrift API. > There should be an option for HS2 to return some pointer to results (an HDFS > link?) and for the user to read the results directly off HDFS inside the > cluster, or via something like WebHDFS outside the cluster > Review board link: https://reviews.apache.org/r/40867 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
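[Editor's note] The HIVE-11527 proposal - HS2 returning a pointer to result files that the client reads directly, e.g. via WebHDFS outside the cluster - can be sketched as below. All names here (the class, method, host, port, and path) are illustrative assumptions, not Hive's actual API; only the WebHDFS URL convention (`/webhdfs/v1/<path>?op=OPEN`) is real.

```java
// Hypothetical sketch of the proposed flow: instead of streaming rows through
// the Thrift API, HiveServer2 would hand the client a pointer to the result
// files, and the client would read them directly over WebHDFS.
public class ResultPointerSketch {

    /** Build a WebHDFS OPEN URL for a result file path returned by HS2. */
    static String webHdfsReadUrl(String nameNodeHost, int port, String resultPath) {
        // WebHDFS REST convention: /webhdfs/v1/<path>?op=OPEN
        return "http://" + nameNodeHost + ":" + port
                + "/webhdfs/v1" + resultPath + "?op=OPEN";
    }

    public static void main(String[] args) {
        // Hypothetical host and result path, for illustration only.
        System.out.println(webHdfsReadUrl("nn.example.com", 50070,
                "/tmp/hive/results/query123/000000_0"));
    }
}
```

A client inside the cluster could read the same path through the HDFS client API instead; the pointer indirection is what removes the Thrift round-trips.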
[jira] [Commented] (HIVE-13870) Decimal vector is not resized correctly
[ https://issues.apache.org/jira/browse/HIVE-13870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303642#comment-15303642 ] Matt McCline commented on HIVE-13870: - LGTM +1 > Decimal vector is not resized correctly > --- > > Key: HIVE-13870 > URL: https://issues.apache.org/jira/browse/HIVE-13870 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.1.0 > > Attachments: HIVE-13870.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests
[ https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13874: Status: Patch Available (was: Open) > Tighten up EOF checking in Fast DeserializeRead classes; display better > exception information; add new Unit Tests > - > > Key: HIVE-13874 > URL: https://issues.apache.org/jira/browse/HIVE-13874 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13874.01.patch > > > Tighten up EOF bounds checking in LazyBinaryDeserializeRead so bytes beyond > stated row end are never read. > Display more detailed information when an exception is thrown by > DeserializeRead classes. > Add Unit Tests, including some that catch the error in HIVE-13818. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
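[Editor's note] The kind of EOF tightening HIVE-13874 describes - never reading bytes beyond the stated row end, and reporting useful offsets on failure - can be sketched as follows. This is a minimal illustration, not Hive's actual LazyBinaryDeserializeRead code.

```java
import java.io.EOFException;

// Minimal sketch of tightened EOF checking: every read from the row buffer
// is validated against the stated row end before any bytes are touched, and
// the exception message carries the offset and row bounds for diagnosis.
public class BoundedReader {
    private final byte[] bytes;
    private final int end;   // exclusive end of the current row
    private int offset;

    BoundedReader(byte[] bytes, int start, int length) {
        this.bytes = bytes;
        this.offset = start;
        this.end = start + length;
    }

    /** Read a 4-byte big-endian int, refusing to read past the row end. */
    int readInt() throws EOFException {
        if (offset + 4 > end) {
            throw new EOFException("need 4 bytes at offset " + offset
                    + " but row ends at " + end);
        }
        int v = ((bytes[offset] & 0xFF) << 24)
              | ((bytes[offset + 1] & 0xFF) << 16)
              | ((bytes[offset + 2] & 0xFF) << 8)
              |  (bytes[offset + 3] & 0xFF);
        offset += 4;
        return v;
    }

    public static void main(String[] args) throws Exception {
        BoundedReader r = new BoundedReader(new byte[]{0, 0, 0, 5}, 0, 4);
        System.out.println(r.readInt()); // 5; a second readInt() would throw EOFException
    }
}
```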
[jira] [Updated] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests
[ https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13874: Summary: Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests (was: Tighten up EOF checking in Fast DeserializeRead classes; display better exception information)
[jira] [Updated] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests
[ https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13874: Attachment: HIVE-13874.01.patch
[jira] [Updated] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information
[ https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13874: Description: Tighten up EOF bounds checking in LazyBinaryDeserializeRead so bytes beyond stated row end are never read. Display more detailed information when an exception is thrown by DeserializeRead classes. Add Unit Tests, including some that catch the error in HIVE-13818. was: Tighten up EOF bounds checking in LazyBinaryDeserializeRead so bytes beyond stated row end are never read. Display more detailed information when an exception is thrown by DeserializeRead classes.
[jira] [Updated] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13837: --- Status: Open (was: Patch Available) Minor change from ":" to "." according to the Oracle timestamp standard.
> current_timestamp() output format is different in some cases
>
> Key: HIVE-13837
> URL: https://issues.apache.org/jira/browse/HIVE-13837
> Project: Hive
> Issue Type: Bug
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-13837.01.patch, HIVE-13837.02.patch
>
> As [~jdere] reports: the current_timestamp() UDF returns results in different formats in some cases. A standalone select returns millisecond precision:
> {noformat}
> hive> select current_timestamp();
> OK
> 2016-04-14 18:26:58.875
> Time taken: 0.077 seconds, Fetched: 1 row(s)
> {noformat}
> But the fractional seconds are dropped for the same UDF in a UNION query:
> {noformat}
> hive> select current_timestamp() from all100k union select current_timestamp() from over100k limit 5;
> Query ID = hrt_qa_20160414182956_c4ed48f2-9913-4b3b-8f09-668ebf55b3e3
> Total jobs = 1
> Launching Job 1 out of 1
> Tez session was closed. Reopening...
> Session re-established.
> Status: Running (Executing on YARN cluster with App id application_1460611908643_0624)
> [Tez vertex progress table elided: Map 1, Map 4 and Reducer 3 all SUCCEEDED; 03/03 vertices, 100%, elapsed time 0.92 s]
> OK
> 2016-04-14 18:29:56
> Time taken: 10.558 seconds, Fetched: 1 row(s)
> {noformat}
> In the explain plan for the standalone query, the constant is folded with millisecond precision:
> {noformat}
> hive> explain extended select current_timestamp();
> OK
> ABSTRACT SYNTAX TREE:
> TOK_QUERY
>   TOK_INSERT
>     TOK_DESTINATION
>       TOK_DIR
>         TOK_TMP_FILE
>     TOK_SELECT
>       TOK_SELEXPR
>         TOK_FUNCTION
>           current_timestamp
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: _dummy_table
>           Row Limit Per Split: 1
>           GatherStats: false
>           Select Operator
>             expressions: 2016-04-14 18:30:57.206 (type: timestamp)
>             outputColumnNames: _col0
>             ListSink
> Time taken: 0.062 seconds, Fetched: 30 row(s)
> {noformat}
> [explain extended output for the UNION query truncated in the original message]
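[Editor's note] The two output shapes reported above correspond to formatting the same timestamp with and without fractional seconds. The snippet below just reproduces the observed formats with standard JDK classes; it is not Hive's internal formatting code.

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

// Reproduce the two formats seen in the bug report: one with millisecond
// precision (standalone query) and one without (UNION query).
public class TimestampFormatDemo {

    static String formatWithMillis(Timestamp ts) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(ts);
    }

    static String formatSecondsOnly(Timestamp ts) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(ts);
    }

    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2016-04-14 18:26:58.875");
        System.out.println(formatWithMillis(ts));   // 2016-04-14 18:26:58.875
        System.out.println(formatSecondsOnly(ts));  // 2016-04-14 18:26:58
    }
}
```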
[jira] [Updated] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13837: --- Status: Patch Available (was: Open)
[jira] [Comment Edited] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303595#comment-15303595 ] Pengcheng Xiong edited comment on HIVE-13837 at 5/27/16 6:11 AM: - minor change from ":" to "." according to Oracle timestamp standard. Resubmit the patch. was (Author: pxiong): minor change from ":" to "." according to Oracle timestamp standard.
[jira] [Updated] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13837: --- Attachment: HIVE-13837.02.patch
[jira] [Commented] (HIVE-13564) Deprecate HIVE_STATS_COLLECT_RAWDATASIZE
[ https://issues.apache.org/jira/browse/HIVE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303593#comment-15303593 ] Hive QA commented on HIVE-13564: Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12806019/HIVE-13564.01.patch
{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 10077 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-enforce_order.q-vector_partition_diff_num_cols.q-unionDistinct_1.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby6_map.q-join13.q-union14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_complex_types.q-groupby_map_ppr_multi_distinct.q-vectorization_16.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_vc.q-input1_limit.q-join16.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-multi_insert.q-join5.q-groupby6.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats15
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby5_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_unqual4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union24
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_0
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
{noformat}
Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/400/testReport
Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/400/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-400/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12806019 - PreCommit-HIVE-MASTER-Build
> Deprecate HIVE_STATS_COLLECT_RAWDATASIZE
>
> Key: HIVE-13564
> URL: https://issues.apache.org/jira/browse/HIVE-13564
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer, Statistics
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Priority: Minor
> Attachments: HIVE-13564.01.patch
>
> Reasons: (1) It is only used in stats20.q. (2) We already have a "HIVESTATSAUTOGATHER" configuration to tell if we are going to collect rawDataSize and #rows.
[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303586#comment-15303586 ] Rajat Khandelwal commented on HIVE-13862: - Just to reaffirm the gravity of this fix, in our production, we had a box with both mysql and hive metastore running. Without this fix, both processes are continuously using 500-600 percent cpu each. After deploying this, the total cpu usage for both processes is around 50. > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter > falls back to ORM > --- > > Key: HIVE-13862 > URL: https://issues.apache.org/jira/browse/HIVE-13862 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Fix For: 2.1.0 > > Attachments: HIVE-13862.patch > > > We are seeing following exception and calls fall back to ORM which make it > costly : > {noformat} > WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, > falling back to ORM > java.lang.ClassCastException: > org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to > java.lang.Number > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
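[Editor's note] The stack trace above says the JDO query result arrived as a DataNucleus ForwardQueryResult (a List implementation) where a bare Number was expected, so the cast in extractSqlInt threw and the call fell back to ORM. A plausible defensive fix is to unwrap a single-element List before casting; the sketch below illustrates that idea with a plain List and is not the actual HIVE-13862 patch.

```java
import java.util.List;

// Sketch of the failure mode and a defensive fix: if the query result is a
// List (as DataNucleus's ForwardQueryResult is), unwrap its single element
// before casting to Number instead of casting the List itself.
public class ExtractSqlIntSketch {

    static int extractSqlInt(Object obj) {
        if (obj instanceof List) {
            List<?> result = (List<?>) obj;
            if (result.size() != 1) {
                throw new IllegalStateException("expected a single row, got " + result.size());
            }
            obj = result.get(0);
        }
        return ((Number) obj).intValue();
    }

    public static void main(String[] args) {
        System.out.println(extractSqlInt(java.util.Arrays.asList(7L))); // unwraps, then casts
    }
}
```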
[jira] [Commented] (HIVE-5999) Allow other characters for LINES TERMINATED BY
[ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303498#comment-15303498 ] Ashutosh Chauhan commented on HIVE-5999: I am not working on it. Take it over. > Allow other characters for LINES TERMINATED BY > --- > > Key: HIVE-5999 > URL: https://issues.apache.org/jira/browse/HIVE-5999 > Project: Hive > Issue Type: Improvement > Components: Beeline, Database/Schema, Hive > Affects Versions: 0.12.0 > Reporter: Mariano Dominguez > Assignee: Nemon Lou > Priority: Critical > Labels: Delimiter, Hive, Row, SerDe > > LINES TERMINATED BY only supports newline '\n' right now. > It would be nice to loosen this constraint and allow other characters. > This limitation seems to be hardcoded here: > https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L171 > The DDL definition in the Hive Language Manual shows this as a configurable property whereas it is not, which may lead to the misleading assessment that the line delimiter can be freely chosen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
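[Editor's note] The hardcoded limitation the issue points at is, in essence, a validation that rejects any line delimiter other than '\n'. The sketch below illustrates that kind of check; it is not the actual BaseSemanticAnalyzer code (see the linked source for that).

```java
// Sketch of the kind of hardcoded check described in HIVE-5999: the
// LINES TERMINATED BY delimiter is rejected unless it is exactly '\n'.
public class LineDelimCheck {

    static void validateLineDelimiter(String delim) {
        if (!"\n".equals(delim)) {
            throw new IllegalArgumentException(
                "LINES TERMINATED BY only supports newline '\\n' right now");
        }
    }

    public static void main(String[] args) {
        validateLineDelimiter("\n"); // accepted; any other delimiter throws
        System.out.println("newline accepted");
    }
}
```

Loosening the constraint would mean replacing this check with delimiter plumbing through the record readers, which is why it is an Improvement rather than a one-line fix.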
[jira] [Updated] (HIVE-5999) Allow other characters for LINES TERMINATED BY
[ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5999: --- Assignee: Nemon Lou (was: Ashutosh Chauhan)
[jira] [Commented] (HIVE-13873) Column pruning for nested fields
[ https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303491#comment-15303491 ] Ferdinand Xu commented on HIVE-13873: Thanks [~xuefuz] for reaching out to me about it. I will take a look later. > Column pruning for nested fields > > > Key: HIVE-13873 > URL: https://issues.apache.org/jira/browse/HIVE-13873 > Project: Hive > Issue Type: New Feature > Components: Logical Optimizer > Reporter: Xuefu Zhang > > Some columnar file formats such as Parquet store the fields of struct types column by column as well, using the encoding described in the Google Dremel paper. It is very common in big data for data to be stored in structs while queries need only a subset of the fields in those structs. However, Hive presently still needs to read the whole struct regardless of whether all fields are selected. Therefore, pruning unwanted sub-fields in structs (nested fields) at file-reading time would be a big performance boost for such scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
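[Editor's note] The pruning idea in the description can be illustrated with a toy model: given a struct row and the set of sub-fields a query actually references, keep only those fields instead of materializing the whole struct. In a columnar format like Parquet this would be pushed down to the reader so unreferenced sub-columns are never decoded. This is a conceptual sketch only, not Hive or Parquet code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model of nested-field pruning: a struct is a field-name -> value map,
// and pruning keeps only the sub-fields the query references.
public class NestedPruningSketch {

    static Map<String, Object> pruneStruct(Map<String, Object> struct, Set<String> wanted) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : struct.entrySet()) {
            if (wanted.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("street", "1 Main St");
        address.put("city", "Springfield");
        address.put("zip", "00000");
        // A query touching only address.city should not pay for the other fields.
        System.out.println(pruneStruct(address, java.util.Collections.singleton("city")));
    }
}
```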
[jira] [Commented] (HIVE-13873) Column pruning for nested fields
[ https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303487#comment-15303487 ] Xuefu Zhang commented on HIVE-13873: FYI, [~Ferd]
[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303484#comment-15303484 ] Amareshwari Sriramadasu commented on HIVE-13862: Yeah.. seems it was always falling to back to ORM - and never worked with directsql earlier with HIVE-11487. I dont think we have a way to test whether api is answered from directsql vs orm in unit tests. btw, we deployed the above fix in our production environment, and it is working fine. bq. IIRC some methods use a call on the query object that forces a single result, that may be a better option here. Didnt find any. Can you give more pointers? > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter > falls back to ORM > --- > > Key: HIVE-13862 > URL: https://issues.apache.org/jira/browse/HIVE-13862 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Fix For: 2.1.0 > > Attachments: HIVE-13862.patch > > > We are seeing following exception and calls fall back to ORM which make it > costly : > {noformat} > WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, > falling back to ORM > java.lang.ClassCastException: > org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to > java.lang.Number > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) > 
~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
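The ClassCastException above indicates the raw JDO query result (a DataNucleus ForwardQueryResult, which is a List) is being cast straight to Number. A minimal sketch of the kind of defensive extraction being discussed - hypothetical, not the actual HIVE-13862 patch, which might instead force a single result on the query object as suggested in the comment:

```java
import java.util.List;

// Hypothetical sketch (not the committed Hive fix): extract an int from a
// JDO query result that may arrive either as a plain Number or wrapped in
// a result list such as DataNucleus's ForwardQueryResult.
public class SqlIntExtractor {
    static int extractSqlInt(Object result) {
        Object value = result;
        if (value instanceof List) {
            List<?> rows = (List<?>) value;
            if (rows.isEmpty()) {
                throw new IllegalStateException("empty result for count query");
            }
            value = rows.get(0); // a COUNT(*) query yields exactly one row
        }
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        throw new ClassCastException("expected a numeric result, got "
            + value.getClass().getName());
    }
}
```

Unwrapping the list before the Number cast avoids the ORM fallback path entirely for the common single-row count case.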
[jira] [Comment Edited] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303480#comment-15303480 ] Gopal V edited comment on HIVE-13872 at 5/27/16 4:54 AM: - AFAIK, the issue is that the column pruner removes nearly all columns from the TableScan, but the VectorizationContext does not realize the needed columns list because there's no SEL operator in the middle to indicate the project of the 2 columns. {code} 2016-05-27T00:52:21,575 INFO [IO-Elevator-Thread-22[attempt_1462788318414_0308_24_00_02_3]]: LlapIoImpl (:()) - Processing data for hdfs://cn108-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/customer_demographics/03_0 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Object inspectors = org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Projected columns = 0, 1, 2, 3, 4, 5, 6, 7, 8, 2016-05-27T00:52:21,614 ERROR [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: tez.MapRecordSource (:()) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {code} was (Author: gopalv): AFAIK, the issue is that the column pruner removes nearly all columns from the TableScan, but the VectorizationContext does not realize the needed columns list because there's no SEL operator in the middle to indicate the project of the 3 columns. 
{code} 2016-05-27T00:52:21,575 INFO [IO-Elevator-Thread-22[attempt_1462788318414_0308_24_00_02_3]]: LlapIoImpl (:()) - Processing data for hdfs://cn108-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/customer_demographics/03_0 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Object inspectors = org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Projected columns = 0, 1, 2, 3, 4, 5, 6, 7, 8, 2016-05-27T00:52:21,614 ERROR [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: tez.MapRecordSource (:()) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap
[jira] [Commented] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303480#comment-15303480 ] Gopal V commented on HIVE-13872: AFAIK, the issue is that the column pruner removes nearly all columns from the TableScan, but the VectorizationContext does not realize the needed columns list because there's no SEL operator in the middle to indicate the project of the 3 columns. {code} 2016-05-27T00:52:21,575 INFO [IO-Elevator-Thread-22[attempt_1462788318414_0308_24_00_02_3]]: LlapIoImpl (:()) - Processing data for hdfs://cn108-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/customer_demographics/03_0 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Object inspectors = org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Projected columns = 0, 1, 2, 3, 4, 5, 6, 7, 8, 2016-05-27T00:52:21,614 ERROR [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: tez.MapRecordSource (:()) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > 
org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13872: --- Description: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} Simplified query {code} set hive.cbo.enable=false; -- explain select count(1) from store_sales ,customer_demographics where ( ( customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' )or ( customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' )) ; {code} {code} Map 3 Map Operator Tree: TableScan alias: customer_demographics Statistics: Num rows: 1920800 Data size: 717255532 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 1920800 Data size: 717255532 Basic stats: COMPLETE Column stats: NONE value expressions: cd_demo_sk (type: int), cd_marital_status (type: string) Execution mode: vectorized, llap LLAP IO: all inputs {code} was: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} Simplified query {code} set hive.cbo.enable=false; -- explain select count(1) from store_sales ,customer_demographics where ( ( customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' )or ( customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' )) ; {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = stor
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13872: --- Description: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} Simplified query {code} set hive.cbo.enable=false; -- explain select count(1) from store_sales ,customer_demographics where ( ( customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' )or ( customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' )) ; {code} was: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303474#comment-15303474 ] Hive QA commented on HIVE-13860: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806538/HIVE-13860-java8.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 56 failed/errored test(s), 9003 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestJdbcWithMiniHA - did not produce a TEST-*.xml file TestJdbcWithMiniMr - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_sortmerge_join_7.q-orc_merge9.q-tez_union_dynamic_partition.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-cte_4.q-vector_non_string_partition.q-delete_where_non_partitioned.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_dynpart_hashjoin_3.q-orc_vectorization_ppd.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_decimal_trailing.q-subquery_in.q-and-12-more - did not produce a TEST-*.xml file 
TestMiniTezCliDriver-script_pipe.q-vector_decimal_aggregate.q-vector_data_types.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-smb_cache.q-transform_ppr2.q-vector_outer_join0.q-and-5-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_coalesce.q-cbo_windowing.q-tez_join.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_distinct_2.q-tez_joins_explain.q-cte_mat_1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_interval_2.q-schema_evol_text_nonvec_mapwork_part_all_primitive.q-tez_fsstat.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vectorization_13.q-auto_sortmerge_join_13.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file TestMinimrCliDriver-bucket_num_reducers.q-table_nonprintable.q-scriptfile1.q-and-1-more - did not produce a TEST-*.xml file TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more - did not produce a TEST-*.xml file TestOperationLoggingAPIWithTez - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join30.q-join2.q-input17.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketmapjoin10.q-join_rc.q-skewjoinopt13.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby10.q-groupby4_noskew.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-vectorization_10.q-list_bucket_dml_2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_unqual4.q-bucketmapjoin12.q-avro_decimal_native.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-multi_insert.q-join5.q-groupby6.q-and-12-more - did not produce a TEST-*.xml file 
TestSparkCliDriver-ptf_rcfile.q-bucketmapjoin_negative.q-bucket_map_join_spark2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoin_noskew.q-sample2.q-skewjoinopt10.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoin_union_remove_2.q-timestamp_null.q-union32.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt3.q-union27.q-multigroupby_singlemr.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-stats13.q-stats2.q-ppd_gby_join.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-vector_distinct_2.q-join15.q-load_dyn_part3.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDr
[jira] [Commented] (HIVE-13432) ACID ORC CompactorMR job throws java.lang.ArrayIndexOutOfBoundsException: 7
[ https://issues.apache.org/jira/browse/HIVE-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303470#comment-15303470 ] Qiuzhuang Lian commented on HIVE-13432: --- Hi Matt, Since we are blocked by this issue, can you please help take a look at this? Many thanks. > ACID ORC CompactorMR job throws java.lang.ArrayIndexOutOfBoundsException: 7 > --- > > Key: HIVE-13432 > URL: https://issues.apache.org/jira/browse/HIVE-13432 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.2.1 > Environment: Hadoop 2.6.2+Hive 1.2.1 >Reporter: Qiuzhuang Lian >Assignee: Matt McCline > > After initiating HIVE ACID ORC table compaction, the CompactorMR job throws > exception: > Error: java.lang.ArrayIndexOutOfBoundsException: 7 > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.(TreeReaderFactory.java:1968) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2368) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.(TreeReaderFactory.java:1969) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2368) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.createTreeReader(RecordReaderFactory.java:69) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.(RecordReaderImpl.java:202) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.(OrcRawRecordMerger.java:183) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:466) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:1308) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:512) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:491) > at 
org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > As a result, we see hadoop exception stack, > 297 failed with state FAILED due to: Task failed > task_1458819387386_11297_m_08 > Job failed as tasks failed. failedMaps:1 failedReduces:0 > 2016-04-06 11:30:57,891 INFO [dn209006-27]: mapreduce.Job > (Job.java:monitorAndPrintJob(1392)) - Counters: 14 > Job Counters > Failed map tasks=16 > Killed map tasks=7 > Launched map tasks=23 > Other local map tasks=13 > Data-local map tasks=6 > Rack-local map tasks=4 > Total time spent by all maps in occupied slots (ms)=412592 > Total time spent by all reduces in occupied slots (ms)=0 > Total time spent by all map tasks (ms)=206296 > Total vcore-seconds taken by all map tasks=206296 > Total megabyte-seconds taken by all map tasks=422494208 > Map-Reduce Framework > CPU time spent (ms)=0 > Physical memory (bytes) snapshot=0 > Virtual memory (bytes) snapshot=0 > 2016-04-06 11:30:57,891 ERROR [dn209006-27]: compactor.Worker > (Worker.java:run(176)) - Caught exception while trying to compact > lqz.my_orc_acid_table. Marking clean to avoid repeated failures, > java.io.IOException: Job failed! 
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:186) > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:162) > 2016-04-06 11:30:57,894 ERROR [dn209006-27]: txn.CompactionTxnHandler > (CompactionTxnHandler.java:markCleaned(327)) - Expected to remove at least > one row from completed_txn_components when marking compaction entry as clean! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13872: --- Description: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 18 more {code} was: TPC-DS Q13 produces a cross-product once CBO runs through {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 18 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13490) Change itests to be part of the main Hive build
[ https://issues.apache.org/jira/browse/HIVE-13490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-13490: Attachment: HIVE-13490.03.patch I haven't modified any files in ptest2 or anything... that should work as before; but because this patch enables the execution of all tests without the need to install into the local Maven repo - maybe those installs could be removed; however, it's not entirely clear to me why it installs the artifacts for every {{ADDITIONAL_PROFILES}} entry. I don't have a working ptest2 installation to validate my assumptions, so I think it's better for me to stay on the safe side and not modify them ;) > Change itests to be part of the main Hive build > --- > > Key: HIVE-13490 > URL: https://issues.apache.org/jira/browse/HIVE-13490 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Zoltan Haindrich > Attachments: HIVE-13490.01.patch, HIVE-13490.02.patch, > HIVE-13490.03.patch > > > Instead of having to build Hive, and then itests separately. > With IntelliJ, this ends up being loaded as two separate dependencies, and > there's a lot of hops involved to make changes. > Does anyone know why these have been kept separate ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13376) HoS emits too many logs with application state
[ https://issues.apache.org/jira/browse/HIVE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303395#comment-15303395 ] Xuefu Zhang commented on HIVE-13376: Sounds good to me, [~lirui]. Disabling spark.yarn.submit.waitAppCompletion sounds good. However, I'm not sure if it has any use other than checking app aliveness. Please find out. Thanks. > HoS emits too many logs with application state > -- > > Key: HIVE-13376 > URL: https://issues.apache.org/jira/browse/HIVE-13376 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Szehon Ho >Assignee: Szehon Ho > Fix For: 2.1.0 > > Attachments: HIVE-13376.2.patch, HIVE-13376.patch > > > The logs get flooded with something like: > > Mar 28, 3:12:21.851 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:21.912 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:22.853 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:22.913 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:23.855 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:23 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > While this is good information, it is a bit much. > Seems like SparkJobMonitor hard-codes its interval to 1 second. It should be > higher and perhaps made configurable. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-5999) Allow other characters for LINES TERMINATED BY
[ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303389#comment-15303389 ] Nemon Lou edited comment on HIVE-5999 at 5/27/16 3:05 AM: -- [~ashutoshc] Do you plan to work on this? I have implemented one based on text file. And need some review from the hive community. :) was (Author: nemon): [~ashutoshc] Do you plan to work on this? I have implemented one based on text file.And nee some review from hive community. :) > Allow other characters for LINES TERMINATED BY > --- > > Key: HIVE-5999 > URL: https://issues.apache.org/jira/browse/HIVE-5999 > Project: Hive > Issue Type: Improvement > Components: Beeline, Database/Schema, Hive >Affects Versions: 0.12.0 >Reporter: Mariano Dominguez >Assignee: Ashutosh Chauhan >Priority: Critical > Labels: Delimiter, Hive, Row, SerDe > > LINES TERMINATED BY only supports newline '\n' right now. > It would be nice to loosen this constraint and allow other characters. > This limitation seems to be hardcoded here: > https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L171 > The DDL Definition on the Hive Language manual shows this as a configurable > property, whereas it is not. This may lead to a misleading assessment of being > able to choose the field delimiter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5999) Allow other characters for LINES TERMINATED BY
[ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303389#comment-15303389 ] Nemon Lou commented on HIVE-5999: - [~ashutoshc] Do you plan to work on this? I have implemented one based on text file. And need some review from the hive community. :) > Allow other characters for LINES TERMINATED BY > --- > > Key: HIVE-5999 > URL: https://issues.apache.org/jira/browse/HIVE-5999 > Project: Hive > Issue Type: Improvement > Components: Beeline, Database/Schema, Hive >Affects Versions: 0.12.0 >Reporter: Mariano Dominguez >Assignee: Ashutosh Chauhan >Priority: Critical > Labels: Delimiter, Hive, Row, SerDe > > LINES TERMINATED BY only supports newline '\n' right now. > It would be nice to loosen this constraint and allow other characters. > This limitation seems to be hardcoded here: > https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L171 > The DDL Definition on the Hive Language manual shows this as a configurable > property, whereas it is not. This may lead to a misleading assessment of being > able to choose the field delimiter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
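The hardcoded restriction the description points at amounts to a check like the following. This is an illustrative simplification, not the exact BaseSemanticAnalyzer source; the class and method names here are hypothetical.

```java
// Illustrative simplification of Hive's line-delimiter restriction:
// anything other than newline (given literally or as ASCII code 10)
// is rejected at analysis time.
public class LineDelimiterCheck {
    // Hypothetical helper; Hive performs an equivalent check inline
    // while analyzing the ROW FORMAT clause.
    public static void validate(String delim) {
        if (!"\n".equals(delim) && !"10".equals(delim)) {
            throw new IllegalArgumentException(
                "LINES TERMINATED BY only supports newline '\\n' right now");
        }
    }
}
```

Loosening the constraint would mean removing this rejection and teaching the text input format to split records on an arbitrary delimiter, which is the part Nemon's text-file-based implementation would cover.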
[jira] [Commented] (HIVE-13376) HoS emits too many logs with application state
[ https://issues.apache.org/jira/browse/HIVE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303379#comment-15303379 ] Rui Li commented on HIVE-13376: --- [~xuefuz], [~szehon] - I just ran more tests on this and want to correct some of my previous comments: # In yarn-cluster mode, {{SparkSubmit}} runs the {{Client}}. The Client keeps checking the app state and printing the logs. On the hive side, we read from SparkSubmit's input and err streams and print to the hive log. # In yarn-client mode, {{SparkSubmit}} runs our {{RemoteDriver}}. RemoteDriver waits for the app to start running and then serves the job requests from hive. It doesn't report the app state after that. # The verbose logging only happens with yarn-cluster mode. # The long interval only affects yarn-client mode. # To avoid the state reports in yarn-cluster mode, we can change the log level (e.g. WARN instead of INFO), or we can set {{spark.yarn.submit.waitAppCompletion=false}} so that {{SparkSubmit}} terminates after it submits the app to the RM. I'd prefer disabling {{spark.yarn.submit.waitAppCompletion}}, if it doesn't cause any other trouble. 
> HoS emits too many logs with application state > -- > > Key: HIVE-13376 > URL: https://issues.apache.org/jira/browse/HIVE-13376 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Szehon Ho >Assignee: Szehon Ho > Fix For: 2.1.0 > > Attachments: HIVE-13376.2.patch, HIVE-13376.patch > > > The logs get flooded with something like: > > Mar 28, 3:12:21.851 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:21.912 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:22.853 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:22.913 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:23.855 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:23 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > While this is good information, it is a bit much. > Seems like SparkJobMonitor hard-codes its interval to 1 second. It should be > higher and perhaps made configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
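Rui Li's preferred workaround boils down to passing one extra Spark property at submit time. A minimal sketch of merging it into the submit configuration map; the helper name and map plumbing are illustrative, not Hive's actual SparkClientImpl code — only the property name and value come from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

public class SubmitConfSketch {
    // Hypothetical helper: add the property Rui Li suggests so that
    // SparkSubmit exits once the app is handed off to the RM, instead of
    // polling (and logging) the application state every second.
    public static Map<String, String> withFastExit(Map<String, String> conf) {
        Map<String, String> merged = new HashMap<>(conf);
        // Respect an explicit user setting; only supply the default.
        merged.putIfAbsent("spark.yarn.submit.waitAppCompletion", "false");
        return merged;
    }
}
```

Note the putIfAbsent: if a user deliberately set the property to true (e.g. because something else relies on app-aliveness reporting, per Xuefu's concern), the workaround should not override it.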
[jira] [Commented] (HIVE-13778) DROP TABLE PURGE on S3A table with too many files does not delete the files
[ https://issues.apache.org/jira/browse/HIVE-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303387#comment-15303387 ] Aaron Fabbri commented on HIVE-13778: - [~sailesh] can you assign this to me please? I will resolve it. > DROP TABLE PURGE on S3A table with too many files does not delete the files > --- > > Key: HIVE-13778 > URL: https://issues.apache.org/jira/browse/HIVE-13778 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sailesh Mukil >Priority: Critical > Labels: metastore, s3 > > I've noticed that when we do a DROP TABLE tablename PURGE on a table on S3A > that has many files, the files never get deleted. However, the Hive metastore > logs do say that the path was deleted: > "Not moving [path] to trash" > "Deleted the diretory [path]" > I initially thought that this was due to the eventually consistent nature of > S3 for deletes, however, a week later, the files still exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13778) DROP TABLE PURGE on S3A table with too many files does not delete the files
[ https://issues.apache.org/jira/browse/HIVE-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303380#comment-15303380 ] Aaron Fabbri edited comment on HIVE-13778 at 5/27/16 3:01 AM: -- Note this is the same as [IMPALA-3558|https://issues.cloudera.org/projects/IMPALA/issues/IMPALA-3558]. See that issue for my explanation that this is expected behavior. was (Author: fabbri): Note this is the same as [IMPALA-3558|https://issues.cloudera.org/projects/IMPALA/issues/IMPALA-3558] > DROP TABLE PURGE on S3A table with too many files does not delete the files > --- > > Key: HIVE-13778 > URL: https://issues.apache.org/jira/browse/HIVE-13778 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sailesh Mukil >Priority: Critical > Labels: metastore, s3 > > I've noticed that when we do a DROP TABLE tablename PURGE on a table on S3A > that has many files, the files never get deleted. However, the Hive metastore > logs do say that the path was deleted: > "Not moving [path] to trash" > "Deleted the diretory [path]" > I initially thought that this was due to the eventually consistent nature of > S3 for deletes, however, a week later, the files still exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13778) DROP TABLE PURGE on S3A table with too many files does not delete the files
[ https://issues.apache.org/jira/browse/HIVE-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303380#comment-15303380 ] Aaron Fabbri commented on HIVE-13778: - Note this is the same as [IMPALA-3558|https://issues.cloudera.org/projects/IMPALA/issues/IMPALA-3558] > DROP TABLE PURGE on S3A table with too many files does not delete the files > --- > > Key: HIVE-13778 > URL: https://issues.apache.org/jira/browse/HIVE-13778 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sailesh Mukil >Priority: Critical > Labels: metastore, s3 > > I've noticed that when we do a DROP TABLE tablename PURGE on a table on S3A > that has many files, the files never get deleted. However, the Hive metastore > logs do say that the path was deleted: > "Not moving [path] to trash" > "Deleted the diretory [path]" > I initially thought that this was due to the eventually consistent nature of > S3 for deletes, however, a week later, the files still exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303373#comment-15303373 ] Hive QA commented on HIVE-13837: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806011/HIVE-13837.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 9975 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_sortmerge_join_7.q-orc_merge9.q-tez_union_dynamic_partition.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_dynpart_hashjoin_3.q-orc_vectorization_ppd.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-enforce_order.q-vector_partition_diff_num_cols.q-unionDistinct_1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_coalesce.q-cbo_windowing.q-tez_join.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_distinct_2.q-tez_joins_explain.q-cte_mat_1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join_reordering_values.q-ptf_seqfile.q-auto_join18.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketmapjoin3.q-enforce_order.q-union11.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-smb_mapjoin_15.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-vectorization_10.q-list_bucket_dml_2.q-and-12-more - did not produce a TEST-*.xml file 
TestSparkCliDriver-groupby3_map.q-skewjoinopt8.q-union_remove_1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_unqual4.q-bucketmapjoin12.q-avro_decimal_native.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-order.q-auto_join18_multi_distinct.q-union2.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ts org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_reflect2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec {noformat} Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/399/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/399/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-399/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 22 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12806011 - PreCommit-HIVE-MASTER-Build > current_timestamp() output format is different in some cases > > > Key: HIVE-13837 > URL: https://issues.apache.org/jira/browse/HIVE-13837 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13837.01.patch > > > As [~jdere] reports: > {code} > current_timestamp() udf returns result with different format in some cases. > select current_timestamp() returns result with decimal precision: > {noformat} > hive> select current_timestamp(); > OK > 2016-04-14 18:26:58.875 > Time taken: 0.077 seconds, Fetched: 1 row(s) > {noformat} > But output format is different for select current_timestamp() from all100k > union select current_timestamp() from over100k limit 5; > {noformat} > hive> select current_timestamp() from all100k union select > current_timestamp() from over100k limit 5; > Query ID = hrt_qa_20160414182956_c4ed48f2-9913-4b3b-8f09-668ebf55b3e3 > Total jobs = 1 > Launching Job 1 out of 1 > Tez session was closed. Reopening... > Session re-established. > Status: Running (Executing on YARN cluster with App id > application_1460611908643_0624) > -
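The symptom in the HIVE-13837 description can be reproduced outside Hive: java.sql.Timestamp.toString() keeps fractional seconds, while formatting through a pattern that lacks a fractional-seconds field silently drops them. The snippet below only illustrates that formatting mismatch; it is not Hive's actual current_timestamp() code path.

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

public class TimestampFormatSketch {
    // Timestamp.toString() retains the fractional seconds...
    public static String full(Timestamp t) {
        return t.toString();
    }

    // ...while a pattern without a fractional-seconds field truncates them,
    // producing the kind of inconsistent output the bug report shows.
    public static String truncated(Timestamp t) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(t);
    }
}
```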
[jira] [Updated] (HIVE-13443) LLAP: signing for the second state of submit (the event)
[ https://issues.apache.org/jira/browse/HIVE-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13443: Attachment: HIVE-13443.01.patch Parking the rebase... > LLAP: signing for the second state of submit (the event) > > > Key: HIVE-13443 > URL: https://issues.apache.org/jira/browse/HIVE-13443 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13443.01.patch, HIVE-13443.WIP.nogen.patch, > HIVE-13443.patch, HIVE-13443.wo.13444.13675.nogen.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13870) Decimal vector is not resized correctly
[ https://issues.apache.org/jira/browse/HIVE-13870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13870: Status: Patch Available (was: Open) > Decimal vector is not resized correctly > --- > > Key: HIVE-13870 > URL: https://issues.apache.org/jira/browse/HIVE-13870 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.1.0 > > Attachments: HIVE-13870.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13870) Decimal vector is not resized correctly
[ https://issues.apache.org/jira/browse/HIVE-13870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13870: Attachment: HIVE-13870.patch Simple patch. [~prasanth_j] [~mmccline] the same patch that we have discussed before > Decimal vector is not resized correctly > --- > > Key: HIVE-13870 > URL: https://issues.apache.org/jira/browse/HIVE-13870 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.1.0 > > Attachments: HIVE-13870.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13818) Fast Vector MapJoin Long hashtable has to handle all integral types
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13818: --- Fix Version/s: 2.1.0 Affects Version/s: 2.1.0 Status: Patch Available (was: Open) > Fast Vector MapJoin Long hashtable has to handle all integral types > --- > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Matt McCline >Assignee: Gopal V >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, > HIVE-13818.1.patch, vector_bug.q, vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
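The issue title carries the key idea: a long-keyed MapJoin hash table must map every integral key type onto the same long key space, or a tinyint probe key will miss a bigint build key holding the same value. A hedged illustration of that normalization — the class and method are invented for this sketch and are not the HIVE-13818 patch:

```java
public class IntegralKeySketch {
    // Normalize any integral boxed type to long so that equal values hash
    // and compare identically regardless of the declared column type.
    public static long toLongKey(Number key) {
        if (key instanceof Byte || key instanceof Short
                || key instanceof Integer || key instanceof Long) {
            return key.longValue();
        }
        // Non-integral keys do not belong in a long-keyed table.
        throw new IllegalArgumentException("not an integral key: " + key);
    }
}
```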
[jira] [Updated] (HIVE-13818) Fast Vector MapJoin Long hashtable has to handle all integral types
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13818: --- Attachment: HIVE-13818.1.patch > Fast Vector MapJoin Long hashtable has to handle all integral types > --- > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Matt McCline >Assignee: Gopal V >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, > HIVE-13818.1.patch, vector_bug.q, vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303315#comment-15303315 ] Sergey Shelukhin edited comment on HIVE-13084 at 5/27/16 1:36 AM: -- Read the patch, it seems to make sense. I didn't find anything, which might be evening related ;) Thanks for the comments, the code appears to do what they say. Didn't review the q files. +1 pending tests was (Author: sershe): Read the patch, it seems to make sense. I didn't find anything, which might be evening related ;) Didn't review the q files. +1 pending tests > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, HIVE-13084.04.patch, HIVE-13084.05.patch, > HIVE-13084.06.patch, HIVE-13084.07.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. 
> e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303315#comment-15303315 ] Sergey Shelukhin commented on HIVE-13084: - Read the patch, it seems to make sense. I didn't find anything which might be evening related ;) Didn't review the q files. +1 pending tests > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, HIVE-13084.04.patch, HIVE-13084.05.patch, > HIVE-13084.06.patch, HIVE-13084.07.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > 
rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303315#comment-15303315 ] Sergey Shelukhin edited comment on HIVE-13084 at 5/27/16 1:36 AM: -- Read the patch, it seems to make sense. I didn't find anything, which might be evening related ;) Didn't review the q files. +1 pending tests was (Author: sershe): Read the patch, it seems to make sense. I didn't find anything which might be evening related ;) Didn't review the q files. +1 pending tests > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, HIVE-13084.04.patch, HIVE-13084.05.patch, > HIVE-13084.06.patch, HIVE-13084.07.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. 
> e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303313#comment-15303313 ] Mohit Sabharwal commented on HIVE-13860: Updated TestSparkCliDriver.testCliDriver_join0 and TestSparkCliDriver.testCliDriver_outer_join_ppr as well, i.e., regenerated them to bring the java8 version up to date with the java7 version. > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch, HIVE-13860-java8.patch, > HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-13860: --- Attachment: HIVE-13860-java8.patch > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch, HIVE-13860-java8.patch, > HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13675) LLAP: add HMAC signatures to LLAPIF splits
[ https://issues.apache.org/jira/browse/HIVE-13675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13675: Attachment: HIVE-13675.02.patch Backing up the rebase for now > LLAP: add HMAC signatures to LLAPIF splits > -- > > Key: HIVE-13675 > URL: https://issues.apache.org/jira/browse/HIVE-13675 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13675.01.patch, HIVE-13675.02.patch, > HIVE-13675.WIP.patch, HIVE-13675.wo.13444.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13818) Fast Vector MapJoin Long hashtable has to handle all integral types
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13818: --- Summary: Fast Vector MapJoin Long hashtable has to handle all integral types (was: Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?) > Fast Vector MapJoin Long hashtable has to handle all integral types > --- > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Gopal V >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q, > vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-13818: -- Assignee: Gopal V (was: Matt McCline) > Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Gopal V >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q, > vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303293#comment-15303293 ] Jeremy Beard commented on HIVE-12721: - I did it to mimic the existing built in functions, which for strings all seemed to return Text. Looking again now I see a couple that return String but Text is much more common. > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.1.patch, HIVE-12721.2.patch, HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
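Jeremy's point is about the return-type convention, not the UUID logic itself. A minimal sketch of a UDF-style evaluate method follows — plain String is used here so the snippet stays dependency-free, whereas the convention under discussion would wrap the result in org.apache.hadoop.io.Text like the other string built-ins; the class name is invented for this sketch.

```java
import java.util.UUID;

public class UuidUdfSketch {
    // Each call produces a fresh random (version 4) UUID string, which is
    // what makes it useful for generating surrogate keys in ETL jobs.
    // A real Hive UDF would return this wrapped in org.apache.hadoop.io.Text.
    public static String evaluate() {
        return UUID.randomUUID().toString();
    }
}
```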
[jira] [Updated] (HIVE-11345) Fix formatting of Show Compactions/Transactions/Locks
[ https://issues.apache.org/jira/browse/HIVE-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11345: -- Assignee: (was: Eugene Koifman) > Fix formatting of Show Compactions/Transactions/Locks > > > Key: HIVE-11345 > URL: https://issues.apache.org/jira/browse/HIVE-11345 > Project: Hive > Issue Type: Bug > Components: CLI, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman > > all the columns of the output are variable length (in each row, based on > data), which makes it really difficult to read -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12267) Make Compaction jobs run on Tez
[ https://issues.apache.org/jira/browse/HIVE-12267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12267: -- Assignee: (was: Eugene Koifman) > Make Compaction jobs run on Tez > --- > > Key: HIVE-12267 > URL: https://issues.apache.org/jira/browse/HIVE-12267 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman > > Currently all Compaction jobs run on MR. > They should support running on Tez. > Add hive.compactor.engine, which can be set to mr, tez, or the value of the > hive.execution.engine property. The latter would be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13248) Change date_add/date_sub/to_date functions to return Date type rather than String
[ https://issues.apache.org/jira/browse/HIVE-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303280#comment-15303280 ] Hive QA commented on HIVE-13248: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806234/HIVE-13248.3.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/398/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/398/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-398/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: resource batch-exec.vm not found. {noformat} This message is automatically generated. ATTACHMENT ID: 12806234 - PreCommit-HIVE-MASTER-Build > Change date_add/date_sub/to_date functions to return Date type rather than > String > - > > Key: HIVE-13248 > URL: https://issues.apache.org/jira/browse/HIVE-13248 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-13248.1.patch, HIVE-13248.2.patch, > HIVE-13248.3.patch > > > Some of the original "date" related functions return string values rather > than Date values, because they were created before the Date type existed in > Hive. We can try to change these to return Date in the 2.x line. > Date values should be implicitly convertible to String. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11956) SHOW LOCKS should indicate what acquired the lock
[ https://issues.apache.org/jira/browse/HIVE-11956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303265#comment-15303265 ] Eugene Koifman commented on HIVE-11956: --- Failed tests with age > 1 are not related. [~wzheng], could you review, please? > SHOW LOCKS should indicate what acquired the lock > - > > Key: HIVE-11956 > URL: https://issues.apache.org/jira/browse/HIVE-11956 > Project: Hive > Issue Type: Improvement > Components: CLI, Transactions >Affects Versions: 0.14.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-11956.patch > > > This can be a queryId, Flume agent id, Storm bolt id, etc. This would > dramatically help diagnose issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11527) bypass HiveServer2 thrift interface for query results
[ https://issues.apache.org/jira/browse/HIVE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303262#comment-15303262 ] Sergey Shelukhin commented on HIVE-11527: - +1. [~vgumashta] any comments? Otherwise I will commit soon > bypass HiveServer2 thrift interface for query results > - > > Key: HIVE-11527 > URL: https://issues.apache.org/jira/browse/HIVE-11527 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Sergey Shelukhin >Assignee: Takanobu Asanuma > Attachments: HIVE-11527.WIP.patch > > > Right now, HS2 reads query results and returns them to the caller via its > thrift API. > There should be an option for HS2 to return some pointer to results (an HDFS > link?) and for the user to read the results directly off HDFS inside the > cluster, or via something like WebHDFS outside the cluster > Review board link: https://reviews.apache.org/r/40867 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started
[ https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303261#comment-15303261 ] Nachiket Vaidya commented on HIVE-13836: Thank you, [~sushanth]. I created HIVE-13869 and linked it to this issue. Can you please review the attached patch? Thank you. > DbNotifications giving an error = Invalid state. Transaction has already > started > > > Key: HIVE-13836 > URL: https://issues.apache.org/jira/browse/HIVE-13836 > Project: Hive > Issue Type: Bug >Reporter: Nachiket Vaidya >Priority: Critical > Attachments: HIVE-13836.patch > > > I used the pyhs2 Python client to create tables/partitions in hive. It was working > fine until I moved to multithreaded scripts which created 8 connections and > ran DDL queries concurrently. > I got the following error: > {noformat} > 2016-05-04 17:49:26,226 ERROR > org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-194]: > HMSHandler Fatal error: Invalid state. Transaction has already started > org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
> Transaction has already started > at > org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47) > at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131) > at > org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88) > at > org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80) > at > org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:463) > at > org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7522) > at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) > at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:123) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1483) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502) > at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) > at > com.sun.proxy.$Proxy14.create_table_with_environment_context(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9267) 
> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11956) SHOW LOCKS should indicate what acquired the lock
[ https://issues.apache.org/jira/browse/HIVE-11956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303254#comment-15303254 ] Hive QA commented on HIVE-11956: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12805953/HIVE-11956.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 98 failed/errored test(s), 10108 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vectorization_13.q-auto_sortmerge_join_13.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_3 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_5 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_3 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_4 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_5 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llapdecider org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mrr 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_tests org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_joins_explain org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_smb_main org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_multiinsert org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5 org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstCompa
[jira] [Updated] (HIVE-13444) LLAP: add HMAC signatures to LLAP; verify them on LLAP side
[ https://issues.apache.org/jira/browse/HIVE-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13444: Attachment: HIVE-13444.04.patch Added the test > LLAP: add HMAC signatures to LLAP; verify them on LLAP side > --- > > Key: HIVE-13444 > URL: https://issues.apache.org/jira/browse/HIVE-13444 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13444.01.patch, HIVE-13444.02.patch, > HIVE-13444.03.patch, HIVE-13444.04.patch, HIVE-13444.WIP.patch, > HIVE-13444.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13354) Add ability to specify Compaction options per table and per request
[ https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303240#comment-15303240 ] Eugene Koifman commented on HIVE-13354: --- +1 pending tests > Add ability to specify Compaction options per table and per request > --- > > Key: HIVE-13354 > URL: https://issues.apache.org/jira/browse/HIVE-13354 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.3.0, 2.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Labels: TODOC2.1 > Attachments: HIVE-13354.1.patch, > HIVE-13354.1.withoutSchemaChange.patch, HIVE-13354.2.patch, HIVE-13354.3.patch > > > Currently the are a few options that determine when automatic compaction is > triggered. They are specified once for the warehouse. > This doesn't make sense - some table may be more important and need to be > compacted more often. > We should allow specifying these on per table basis. > Also, compaction is an MR job launched from within the metastore. There is > currently no way to control job parameters (like memory, for example) except > to specify it in hive-site.xml for metastore which means they are site wide. > Should add a way to specify these per table (perhaps even per compaction if > launched via ALTER TABLE) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13863) Improve AnnotateWithStatistics with support for cartesian product
[ https://issues.apache.org/jira/browse/HIVE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303227#comment-15303227 ] Ashutosh Chauhan commented on HIVE-13863: - +1. I am assuming it will probably require updating a few other golden files as well. > Improve AnnotateWithStatistics with support for cartesian product > - > > Key: HIVE-13863 > URL: https://issues.apache.org/jira/browse/HIVE-13863 > Project: Hive > Issue Type: Bug > Components: Statistics >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13863.patch > > > Currently cartesian product stats based on cardinality of inputs are not > inferred correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13861) Fix up nullability issue that might be created by pull up constants rules
[ https://issues.apache.org/jira/browse/HIVE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303212#comment-15303212 ] Ashutosh Chauhan commented on HIVE-13861: - +1 pending tests > Fix up nullability issue that might be created by pull up constants rules > - > > Key: HIVE-13861 > URL: https://issues.apache.org/jira/browse/HIVE-13861 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13861.patch > > > When we pull up constants through Union or Sort operators, we might end up > rewriting the original expression into an expression whose schema has > different nullability properties for some of its columns. > This results in AssertionError of the following kind: > {noformat} > ... > org.apache.hive.service.cli.HiveSQLException: Error running query: > java.lang.AssertionError: Internal error: Cannot add expression of different > type to set: > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13808) Use constant expressions to backtrack when we create ReduceSink
[ https://issues.apache.org/jira/browse/HIVE-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303209#comment-15303209 ] Ashutosh Chauhan commented on HIVE-13808: - Can you create a RB with updated golden files for this? > Use constant expressions to backtrack when we create ReduceSink > --- > > Key: HIVE-13808 > URL: https://issues.apache.org/jira/browse/HIVE-13808 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13808.patch > > > Follow-up of HIVE-13068. > When we create a RS with constant expressions as keys/values, and immediately > after we create a SEL operator that backtracks the expressions from the RS. > Currently, we automatically create references for all the keys/values. > Before, we could rely on Hive ConstantPropagate to propagate the constants to > the SEL. However, after HIVE-13068, Hive ConstantPropagate does not get > exercised anymore. Thus, we can simply create constant expressions when we > create the SEL operator instead of a reference. > Ex. 
ql/src/test/results/clientpositive/vector_coalesce.q.out > {noformat} > EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, > cstring1, cint, cfloat, csmallint) as c > FROM alltypesorc > WHERE (cdouble IS NULL) > ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c > LIMIT 10 > {noformat} > Plan: > {noformat} > EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, > cstring1, cint, cfloat, csmallint) as c > FROM alltypesorc > WHERE (cdouble IS NULL) > ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c > LIMIT 10 > POSTHOOK: type: QUERY > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Map Reduce > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 2641964 Basic stats: > COMPLETE Column stats: NONE > Filter Operator > predicate: cdouble is null (type: boolean) > Statistics: Num rows: 6144 Data size: 1320982 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: cstring1 (type: string), cint (type: int), > cfloat (type: float), csmallint (type: smallint), > COALESCE(null,cstring1,cint,cfloat,csmallint) (type: string) > outputColumnNames: _col1, _col2, _col3, _col4, _col5 > Statistics: Num rows: 6144 Data size: 1320982 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: null (type: double), _col1 (type: string), > _col2 (type: int), _col3 (type: float), _col4 (type: smallint), _col5 (type: > string) > sort order: ++ > Statistics: Num rows: 6144 Data size: 1320982 Basic stats: > COMPLETE Column stats: NONE > TopN Hash Memory Usage: 0.1 > Execution mode: vectorized > Reduce Operator Tree: > Select Operator > expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey1 > (type: string), KEY.reducesinkkey2 (type: int), KEY.reducesinkkey3 (type: > float), KEY.reducesinkkey4 (type: smallint), KEY.reducesinkkey5 (type: string) > outputColumnNames: 
_col0, _col1, _col2, _col3, _col4, _col5 > Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE > Column stats: NONE > Limit > Number of rows: 10 > Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE > Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE > Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: 10 > Processor Tree: > ListSink > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13849) Wrong plan for hive.optimize.sort.dynamic.partition=true
[ https://issues.apache.org/jira/browse/HIVE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303190#comment-15303190 ] Ashutosh Chauhan commented on HIVE-13849: - +1 > Wrong plan for hive.optimize.sort.dynamic.partition=true > > > Key: HIVE-13849 > URL: https://issues.apache.org/jira/browse/HIVE-13849 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.1.0, 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-13849.patch > > > To reproduce: > {noformat} > set hive.support.concurrency=true; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > set hive.exec.dynamic.partition.mode=nonstrict; > set hive.optimize.sort.dynamic.partition=true; > CREATE TABLE non_acid(key string, value string) PARTITIONED BY(ds string, hr > int) CLUSTERED BY(key) INTO 2 BUCKETS STORED AS ORC; > explain insert into table non_acid partition(ds,hr) select * from srcpart > sort by value; > {noformat} > CC'ed [~ashutoshc], [~ekoifman] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13617) LLAP: support non-vectorized execution in IO
[ https://issues.apache.org/jira/browse/HIVE-13617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13617: Attachment: HIVE-13617.03.patch Fixing the issue in CliDriver case > LLAP: support non-vectorized execution in IO > > > Key: HIVE-13617 > URL: https://issues.apache.org/jira/browse/HIVE-13617 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13617-wo-11417.patch, HIVE-13617-wo-11417.patch, > HIVE-13617.01.patch, HIVE-13617.03.patch, HIVE-13617.patch, HIVE-13617.patch, > HIVE-15396-with-oi.patch > > > Two approaches - a separate decoding path, into rows instead of VRBs; or > decoding VRBs into rows on a higher level (the original LlapInputFormat). I > think the latter might be better - it's not a hugely important path, and perf > in non-vectorized case is not the best anyway, so it's better to make do with > much less new code and architectural disruption. > Some ORC patches in progress introduce an easy to reuse (or so I hope, > anyway) VRB-to-row conversion, so we should just use that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13868) Include derby.log file in the Hive ptest logs
[ https://issues.apache.org/jira/browse/HIVE-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña resolved HIVE-13868. Resolution: Fixed Fix Version/s: 2.2.0 > Include derby.log file in the Hive ptest logs > - > > Key: HIVE-13868 > URL: https://issues.apache.org/jira/browse/HIVE-13868 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Fix For: 2.2.0 > > Attachments: HIVE-13868.1.patch > > > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13868) Include derby.log file in the Hive ptest logs
[ https://issues.apache.org/jira/browse/HIVE-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303105#comment-15303105 ] Sergio Peña commented on HIVE-13868: No review is needed for this patch. I need to submit it in order to get the derby.log so that I can debug the HMS errors we're seeing. [~szehon] FYI > Include derby.log file in the Hive ptest logs > - > > Key: HIVE-13868 > URL: https://issues.apache.org/jira/browse/HIVE-13868 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-13868.1.patch > > > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13858) LLAP: A preempted task can end up waiting on completeInitialization if some part of the executing code suppressed the interrupt
[ https://issues.apache.org/jira/browse/HIVE-13858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-13858: -- Attachment: HIVE-13858.02.patch Updated patch with review comments addressed. > LLAP: A preempted task can end up waiting on completeInitialization if some > part of the executing code suppressed the interrupt > --- > > Key: HIVE-13858 > URL: https://issues.apache.org/jira/browse/HIVE-13858 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth >Priority: Critical > Labels: llap > Attachments: HIVE-13858.01.patch, HIVE-13858.02.patch > > > An interrupt along with a HiveProcessor.abort call is made when attempting to > preempt a task. > In this specific case, the task was in the middle of HDFS IO - which > 'handled' the interrupt by retrying. As a result the interrupt status on the > thread was reset - so instead of skipping the future.get in > completeInitialization - the task ended up blocking there. > End result - a single executor slot permanently blocked in LLAP. Depending on > what else is running - this can cause a cluster level deadlock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
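The failure mode HIVE-13858 describes — I/O code "handling" an interrupt by retrying and thereby clearing the thread's interrupt status, so a later check misses the preemption signal — can be sketched as below. This is a minimal illustration, not the actual LLAP or HDFS code; `ioWithRetry` and its flag are hypothetical names standing in for the retrying I/O path and the conventional fix of re-asserting the status.

```java
// Minimal sketch of the interrupt-suppression pattern described above.
// ioWithRetry() mimics I/O code that catches InterruptedException and
// retries; catching the exception clears the thread's interrupt status.
// The convention that avoids the downstream hang is to re-assert the
// status before returning, so later checks (e.g. before a blocking
// future.get) still see the preemption.
class InterruptDemo {
    static void ioWithRetry(boolean restoreStatus) {
        boolean sawInterrupt = false;
        for (int attempt = 0; attempt < 3; attempt++) {
            try {
                Thread.sleep(1);                  // stands in for blocking HDFS I/O
                break;                            // I/O "succeeded", stop retrying
            } catch (InterruptedException e) {
                sawInterrupt = true;              // interrupt status is now cleared
            }
        }
        if (sawInterrupt && restoreStatus) {
            Thread.currentThread().interrupt();   // preserve the signal for callers
        }
    }
}
```

Without the re-assert (the `false` path), a caller that relies on `Thread.isInterrupted()` after the I/O returns sees a clean status and blocks, which is exactly the stuck-executor scenario in the description.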
[jira] [Updated] (HIVE-13868) Include derby.log file in the Hive ptest logs
[ https://issues.apache.org/jira/browse/HIVE-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-13868: --- Attachment: HIVE-13868.1.patch > Include derby.log file in the Hive ptest logs > - > > Key: HIVE-13868 > URL: https://issues.apache.org/jira/browse/HIVE-13868 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-13868.1.patch > > > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started
[ https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303089#comment-15303089 ] Sushanth Sowmyan commented on HIVE-13836: - I agree. As long as we don't lose the underlying issue (and when you create a new jira for that, could you link it to this - that way, whoever works on that has an easy reproduction to work against), I'm okay with adding synchronization here to DbNotificationListener. > DbNotifications giving an error = Invalid state. Transaction has already > started > > > Key: HIVE-13836 > URL: https://issues.apache.org/jira/browse/HIVE-13836 > Project: Hive > Issue Type: Bug >Reporter: Nachiket Vaidya >Priority: Critical > Attachments: HIVE-13836.patch > > > I used the pyhs2 Python client to create tables/partitions in hive. It was working > fine until I moved to multithreaded scripts which created 8 connections and > ran DDL queries concurrently. > I got the following error: > {noformat} > 2016-05-04 17:49:26,226 ERROR > org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-194]: > HMSHandler Fatal error: Invalid state. Transaction has already started > org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
> Transaction has already started > at > org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47) > at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131) > at > org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88) > at > org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80) > at > org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:463) > at > org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7522) > at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) > at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:123) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1483) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502) > at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) > at > com.sun.proxy.$Proxy14.create_table_with_environment_context(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9267) 
> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
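The synchronization approach agreed on above can be sketched as follows. This is a hypothetical simplification, not the actual DbNotificationListener or DataNucleus code: `NotificationLog`, its `txnOpen` flag, and the method names are illustrative stand-ins for a listener whose transaction object is not thread-safe, so concurrent HMS handler threads must serialize their calls into it.

```java
// Hypothetical sketch of the race: a listener shared across metastore
// handler threads wraps transaction state that is not thread-safe, so
// enqueue() fails if two threads overlap. Serializing callers on the
// shared object (the fix discussed in the comments) avoids the
// "Transaction has already started" state error.
import java.util.ArrayList;
import java.util.List;

class NotificationLog {
    private boolean txnOpen = false;               // mimics shared JDO transaction state
    private final List<String> events = new ArrayList<>();

    private void enqueue(String event) {
        if (txnOpen) {                             // second thread arrives mid-transaction
            throw new IllegalStateException("Invalid state. Transaction has already started");
        }
        txnOpen = true;                            // "begin"
        events.add(event);
        txnOpen = false;                           // "commit"
    }

    // Serializing access trades some concurrency for correctness until the
    // underlying per-thread transaction handling is fixed.
    synchronized void enqueueSafely(String event) {
        enqueue(event);
    }

    int size() { return events.size(); }
}
```

With eight concurrent DDL connections, as in the bug report, unsynchronized calls can interleave between the "begin" and "commit" steps; the `synchronized` wrapper guarantees each notification completes before the next begins.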
[jira] [Updated] (HIVE-13867) restore HiveAuthorizer interface changes
[ https://issues.apache.org/jira/browse/HIVE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-13867: - Description: TLDR: Some of the changes to hive authorizer interface made as part of HIVE-13360 are inappropriate and need to be restored. Regarding the move of ip address from the query context object (HiveAuthzContext) to HiveAuthenticationProvider. That isn't the right place for it.​ In HS2 HTTP mode, when proxies and knox servers are between end user and HS2 , every request for single session does not have to come via a single IP address. Current assumption in hive code base is that the IP address is valid for the entire session. This might not hold true for ever. A limitation in HS2 that it holds state for the session would currently force the user configure proxies and knox to remember which next Host it was using, because they need to have state to remember the HS2 instance to be used! But that is a limitation that ideally goes away some day, and when that happens, HiveAuthzContext would be the right place for keeping the IP address! was: TLDR: Some of the changes to hive authorizer interface made as part of HIVE-13360 are inappropriate and need to be restored. Pasting comments from Thejas in an email: Regarding the plans to move ip address from the query context object (HiveAuthzContext) to HiveAuthenticationProvider. I don't think that is a clear right place for it.​ In HS2 HTTP mode, when proxies and knox servers are between end user and HS2 , every request for single session does not have to come via a single IP address. Current assumption in hive code base is that the IP address is valid for the entire session. This might not hold true for ever. A limitation in HS2 that it holds state for the session would currently force the user configure proxies and knox to remember which next Host it was using, because they need to have state to remember the HS2 instance to be used! 
But that is a limitation that ideally goes away some day, and when that happens, HiveAuthzContext would be the right place for keeping the IP address! > restore HiveAuthorizer interface changes > > > Key: HIVE-13867 > URL: https://issues.apache.org/jira/browse/HIVE-13867 > Project: Hive > Issue Type: Bug >Reporter: Thejas M Nair >Priority: Blocker > > TLDR: Some of the changes to hive authorizer interface made as part of > HIVE-13360 are inappropriate and need to be restored. > Regarding the move of ip address from the query context object > (HiveAuthzContext) to HiveAuthenticationProvider. That isn't the right place > for it.​ > In HS2 HTTP mode, when proxies and knox servers are between end user and HS2 > , every request for single session does not have to come via a single IP > address. > Current assumption in hive code base is that the IP address is valid for the > entire session. This might not hold true for ever. > A limitation in HS2 that it holds state for the session would currently force > the user configure proxies and knox to remember which next Host it was using, > because they need to have state to remember the HS2 instance to be used! But > that is a limitation that ideally goes away some day, and when that happens, > HiveAuthzContext would be the right place for keeping the IP address! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13867) restore HiveAuthorizer interface changes
[ https://issues.apache.org/jira/browse/HIVE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair reassigned HIVE-13867: Assignee: Thejas M Nair > restore HiveAuthorizer interface changes > > > Key: HIVE-13867 > URL: https://issues.apache.org/jira/browse/HIVE-13867 > Project: Hive > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > > TLDR: Some of the changes to hive authorizer interface made as part of > HIVE-13360 are inappropriate and need to be restored. > Regarding the move of ip address from the query context object > (HiveAuthzContext) to HiveAuthenticationProvider. That isn't the right place > for it.​ > In HS2 HTTP mode, when proxies and knox servers are between end user and HS2 > , every request for single session does not have to come via a single IP > address. > Current assumption in hive code base is that the IP address is valid for the > entire session. This might not hold true for ever. > A limitation in HS2 that it holds state for the session would currently force > the user configure proxies and knox to remember which next Host it was using, > because they need to have state to remember the HS2 instance to be used! But > that is a limitation that ideally goes away some day, and when that happens, > HiveAuthzContext would be the right place for keeping the IP address! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303049#comment-15303049 ] Thejas M Nair commented on HIVE-13749: -- HIVE-3098 fixes it for the metastore. This fix can be dangerous for the embedded metastore use case. bq. I think because of my test being run as a single user. Single user shouldn't matter, as the cache is based on the UGI object as I mentioned earlier. Testing using hive-cli might be better; that would also ensure the creation of a new metastore connection each time. I assume you haven't seen this in other user environments. I suspect there is something unique about their environment that would be triggering this. You might want to check if they are using any specific plugins. Is this with Kerberos enabled? > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13338) Differences in vectorized_casts.q output for vectorized and non-vectorized runs
[ https://issues.apache.org/jira/browse/HIVE-13338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303018#comment-15303018 ] Prasanth Jayachandran commented on HIVE-13338: -- lgtm, +1 > Differences in vectorized_casts.q output for vectorized and non-vectorized > runs > --- > > Key: HIVE-13338 > URL: https://issues.apache.org/jira/browse/HIVE-13338 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13338.01.patch, HIVE-13338.02.patch > > > Turn off vectorization and you get different results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13269) Simplify comparison expressions using column stats
[ https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303001#comment-15303001 ] Lefty Leverenz commented on HIVE-13269: --- Doc note: This adds *hive.optimize.filter.stats.reduction* to HiveConf.java, so it needs to be documented in the wiki for release 2.1.0. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Added a TODOC2.1 label. > Simplify comparison expressions using column stats > -- > > Key: HIVE-13269 > URL: https://issues.apache.org/jira/browse/HIVE-13269 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, > HIVE-13269.03.patch, HIVE-13269.04.patch, HIVE-13269.patch, HIVE-13269.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302992#comment-15302992 ] Sergio Peña commented on HIVE-12721: Test failures are not related. [~jbeard] Why is {{public Text evaluate}} used instead of {{public String evaluate}} if at the end we convert to String? The patch looks very simple. I just want to know if Text is needed in the class. > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.1.patch, HIVE-12721.2.patch, HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
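The value such a UDF produces needs nothing beyond the JDK; a minimal sketch of what the evaluate logic boils down to is below. The class name is illustrative, and the Hadoop Text wrapper discussed above is omitted — returning Text is a common UDF convention for reusing the Writable across rows, which may be why the patch uses it, but that is an assumption here, not something the patch states.

```java
import java.util.UUID;

// Hypothetical stand-in for the core of a UUID UDF; the real Hive UDF under
// discussion wraps this value in org.apache.hadoop.io.Text.
public class UuidSketch {
    public static String evaluate() {
        // Type 4 (random) UUID rendered in the canonical 36-character form
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        System.out.println(evaluate()); // e.g. 3f2504e0-4f89-41d3-9a0c-0305e82c3301
    }
}
```

Each call yields a fresh value, which is what makes it usable as a surrogate key in ETL jobs.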
[jira] [Commented] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302995#comment-15302995 ] Ashutosh Chauhan commented on HIVE-13857: - +1 pending tests > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch, > HIVE-13857.3.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats
[ https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-13269: -- Labels: TODOC2.1 (was: ) > Simplify comparison expressions using column stats > -- > > Key: HIVE-13269 > URL: https://issues.apache.org/jira/browse/HIVE-13269 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, > HIVE-13269.03.patch, HIVE-13269.04.patch, HIVE-13269.patch, HIVE-13269.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302965#comment-15302965 ] Naveen Gangam commented on HIVE-13749: -- Oops, just posted the patch to RB (https://reviews.apache.org/r/47918/) at the same time as this comment. 1) Isn't the shutdown() called when a HMS request is fulfilled and the executor thread is being released back to the pool? So any new calls would potentially have a new UGI and a new instance of HiveConf. Also, calling closeAll() just removes the cached element. At worst, the FileSystem object is re-cached on a miss. 2) The other fixes are to address a similar issue on the HS2 side where using the FileSystem APIs causes the Cache to grow. This issue is on the HMS side. Regarding reproducing this locally, yes and no. I ran hundreds of iterations of beeline executing a script that creates a table and then drops it while randomly toggling the value of a hive conf property. For 300 iterations, I have gotten it to retain 60 instances, which is not quite the same scale the customer is seeing. I think because of my test being run as a single user. Re-running the test with this fix, I have 8 instances retained but none in this particular cache. I have run with debug around this code, and during the drop table command I can see an element being added to the cache. I am also waiting for logs from this customer, who is running with some instrumentation + fix. I can confirm that from those logs too. Alternatively, in checkTrashPurgeCombination() we could add a close() for this FileSystem. In my testcase, this has been the primary reason for the retained instances. {code} HadoopShims.HdfsEncryptionShim shim = ShimLoader.getHadoopShims().createHdfsEncryptionShim(FileSystem.get(hiveConf), hiveConf); {code} Thoughts? 
Thanks > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13857: - Attachment: HIVE-13857.3.patch > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch, > HIVE-13857.3.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13857: - Attachment: (was: HIVE-13857.3.patch) > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch, > HIVE-13857.3.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13857: - Attachment: HIVE-13857.3.patch > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch, > HIVE-13857.3.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302925#comment-15302925 ] Thejas M Nair commented on HIVE-13749: -- Regarding the patch 1. How do you make sure that files created by this ugi are not in use in other parts? We need to do the closing only after we are sure that the ugi object is no longer going to be used. 2. I am not sure if this would fix the leak. As you can see, we have patches that deal with the closing when the UGI object is no longer used. Are you able to reproduce this in your environment? If not, you might want to add some debugging around code that adds entries in the cache, and see if the closing of files generated from those places is happening. You might also want to see if the user is using some plugins that might be creating new UGI objects. > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-13749: - Status: Patch Available (was: Open) > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-13749: - Attachment: HIVE-13749.patch > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302889#comment-15302889 ] Naveen Gangam commented on HIVE-13749: -- perhaps in the HiveMetaStore.shutdown() we clear the cache for the current UGI. Make sense? Could you please review the patch when you have a chance? I have had the customer disable the FileSystem caching by adding {{fs.hdfs.impl.disable.cache=true}} to the HMS configuration, then re-run the workloads. The same site that had 66000+ Configuration instances in their heapdump now has 80 instances, and none of them are in the cache. So it is clear that the FileSystem.CACHE is the problem. Thanks > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
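The growth mechanism described in this thread can be sketched without any Hadoop dependencies. The following is a hypothetical stand-in, assuming (as the comments suggest) that the FileSystem cache key includes the UGI and that distinct UGI instances compare by identity, so a fresh UGI per request adds a fresh cache entry even for the same logical user:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the leak: a cache keyed by identity-equal objects.
public class CacheGrowthSketch {
    // Stand-in for UserGroupInformation: no equals()/hashCode() override,
    // so two instances for the same user are distinct keys.
    static class Ugi {
        final String user;
        Ugi(String user) { this.user = user; }
    }

    static final Map<Ugi, Object> CACHE = new HashMap<>();

    // Stand-in for FileSystem.get(): caches one "filesystem" per key.
    static Object get(Ugi ugi) {
        return CACHE.computeIfAbsent(ugi, k -> new Object());
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            get(new Ugi("hive")); // same logical user, new instance per request
        }
        System.out.println(CACHE.size()); // 100 entries retained, not 1
    }
}
```

This is why disabling the cache (fs.hdfs.impl.disable.cache=true) or closing filesystems per UGI stops the retention: the Configuration objects hang off each cached entry.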
[jira] [Commented] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302876#comment-15302876 ] Hive QA commented on HIVE-13860: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806415/HIVE-13860-java8.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 106 failed/errored test(s), 9933 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestJdbcWithMiniHA - did not produce a TEST-*.xml file TestJdbcWithMiniMr - did not produce a TEST-*.xml file TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-cte_4.q-vector_non_string_partition.q-delete_where_non_partitioned.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_dynpart_hashjoin_3.q-orc_vectorization_ppd.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_interval_2.q-schema_evol_text_nonvec_mapwork_part_all_primitive.q-tez_fsstat.q-and-12-more - did not produce a TEST-*.xml file TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did not produce a TEST-*.xml file TestOperationLoggingAPIWithTez - did not 
produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-smb_mapjoin_15.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby3_map.q-skewjoinopt8.q-union_remove_1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_3.q-groupby7.q-auto_join17.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-order.q-auto_join18_multi_distinct.q-union2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt15.q-join39.q-avro_joins_native.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_int_type_promotion org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_3 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_5 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_3 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_4 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_5 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llapdecider org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mrr org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynp
[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started
[ https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302875#comment-15302875 ] Nachiket Vaidya commented on HIVE-13836: [~sushanth] Thank you for the reply. I agree with you that the issue is deep inside. The issue is easy to reproduce. I tried that and I got a different stack trace. {noformat} 2016-05-26 12:32:27,904 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-7]: MetaException(message:java.lang.NullPointerException) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5535) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_req(HiveMetaStore.java:2308) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) at com.sun.proxy.$Proxy14.add_partitions_req(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partitions_req.getResult(ThriftHiveMetastore.java:9723) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partitions_req.getResult(ThriftHiveMetastore.java:9707) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693) at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1245) at com.jolbox.bonecp.StatementHandle.executeBatch(StatementHandle.java:424) at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:372) at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:628) at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:324) at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:194) at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:640) at org.datanucleus.store.query.Query.executeQuery(Query.java:1786) at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672) at org.datanucleus.store.query.Query.execute(Query.java:1654) at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221) at org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7534) at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source) at org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261) at org.apache.hive.hcatalog.listener.DbNotificationListener.onAddPartition(DbNotificationListener.java:168) {noformat} 
It is of course a concurrency issue manifesting in a different way. It looks like db notification is using ObjectStore differently. Adding synchronization at db notification solves this issue. Given that there is not much performance implication for using synchronization, it should be ok to fix it in db notification and then file a separate jira to track the ObjectStore issue. Please let me know what you think. > DbNotifications giving an error = Invalid state. Transaction has already > started > > >
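The proposed fix — serializing the listener's access to the shared store — can be sketched as follows. All names here (EventStore, enqueue) are illustrative stand-ins, not Hive's actual classes; the point is only that one lock around the store call keeps two handler threads from interleaving inside a single transaction:

```java
// Hypothetical model of synchronizing the notification enqueue path.
public class SyncedListenerSketch {
    // Stand-in for ObjectStore: not safe for concurrent mutation by itself.
    static class EventStore {
        private int events;
        void addNotificationEvent() { events++; }
        int count() { return events; }
    }

    private final EventStore store = new EventStore();

    // The lock ensures only one thread at a time drives the store's
    // transaction, avoiding "Transaction has already started".
    public synchronized void enqueue() {
        store.addNotificationEvent();
    }

    public int count() { return store.count(); }

    public static void main(String[] args) throws InterruptedException {
        SyncedListenerSketch listener = new SyncedListenerSketch();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) listener.enqueue();
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println(listener.count()); // 8000 with the lock held
    }
}
```

The cost is serialized enqueues, which matches the comment's judgment that the performance implication is small relative to the rest of the metastore call.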
[jira] [Commented] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302844#comment-15302844 ] Gopal V commented on HIVE-13818: The bug is limited to Fast hashtables {code} hive.mapjoin.hybridgrace.hashtable=false; hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true; {code} > Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q, > vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302831#comment-15302831 ] Ashutosh Chauhan commented on HIVE-13857: - There are other callers which are passing in false. Can you also create a RB for this? > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size
[ https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302813#comment-15302813 ] Prasanth Jayachandran commented on HIVE-13751: -- Yeah. This will go into 2.1.0 > LlapOutputFormatService should have a configurable send buffer size > --- > > Key: HIVE-13751 > URL: https://issues.apache.org/jira/browse/HIVE-13751 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13751.1.patch, HIVE-13751.2.patch, > HIVE-13751.3.patch > > > Netty channel buffer size is hard-coded 128KB now. It should be made > configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13611) add jar causes beeline not to output log messages
[ https://issues.apache.org/jira/browse/HIVE-13611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam reassigned HIVE-13611: Assignee: Naveen Gangam > add jar causes beeline not to output log messages > - > > Key: HIVE-13611 > URL: https://issues.apache.org/jira/browse/HIVE-13611 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 1.1.0 >Reporter: Thomas Scott >Assignee: Naveen Gangam >Priority: Minor > > After adding a jar in beeline warning messages and job log ouptut are no > longer shown. This only occurs if you use short connection strings (e.g. > jdbc:hive2://). Example below: > {code} > 0: jdbc:hive2://nightly55-1.gce.cloudera.com:> !connect jdbc:hive2:// > Connecting to jdbc:hive2:// > Enter username for jdbc:hive2://: hive > Enter password for jdbc:hive2://: > Connected to: Apache Hive (version 1.1.0-cdh5.5.4) > Driver: Hive JDBC (version 1.1.0-cdh5.5.4) > Transaction isolation: TRANSACTION_REPEATABLE_READ > 1: jdbc:hive2://> select count(*) from sample_07 limit 1; > INFO : Number of reduce tasks determined at compile time: 1 > INFO : In order to change the average load for a reducer (in bytes): > INFO : set hive.exec.reducers.bytes.per.reducer= > INFO : In order to limit the maximum number of reducers: > INFO : set hive.exec.reducers.max= > INFO : In order to set a constant number of reducers: > INFO : set mapreduce.job.reduces= > INFO : number of splits:1 > INFO : Submitting tokens for job: job_1461621650734_0020 > INFO : The url to track the job: > http://nightly55-1.gce.cloudera.com:8088/proxy/application_1461621650734_0020/ > INFO : Starting Job = job_1461621650734_0020, Tracking URL = > http://nightly55-1.gce.cloudera.com:8088/proxy/application_1461621650734_0020/ > INFO : Kill Command = /usr/lib/hadoop/bin/hadoop job -kill > job_1461621650734_0020 > INFO : Hadoop job information for Stage-1: number of mappers: 1; number of > reducers: 1 > INFO : 2016-04-26 01:36:04,297 Stage-1 map = 0%, reduce = 
0% > INFO : 2016-04-26 01:36:11,802 Stage-1 map = 100%, reduce = 0%, Cumulative > CPU 1.52 sec > INFO : 2016-04-26 01:36:19,419 Stage-1 map = 100%, reduce = 100%, > Cumulative CPU 3.25 sec > INFO : MapReduce Total cumulative CPU time: 3 seconds 250 msec > INFO : Ended Job = job_1461621650734_0020 > +--+--+ > | _c0 | > +--+--+ > | 823 | > +--+--+ > 1 row selected (25.908 seconds) > 1: jdbc:hive2://> add jar hdfs://some_nn.com/tmp/somedir/some_jar.jar > 1: jdbc:hive2://> ; > converting to local hdfs://some_nn.com/tmp/somedir/some_jar.jar > Added [/tmp/93ca63a2-5019-4f37-b9b4-75f1740b53c8_resources/some_jar.jar] to > class path > Added resources: [hdfs://some_nn.com/tmp/somedir/some_jar.jar] > No rows affected (0.179 seconds) > 1: jdbc:hive2://> select count(*) from sample_07 limit 1; > +--+--+ > | _c0 | > +--+--+ > | 823 | > +--+--+ > 1: jdbc:hive2://> > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302745#comment-15302745 ] Sergey Shelukhin commented on HIVE-13862: - Hmm. I wonder how (and if ;)) it ever worked. Could the list result be DB-specific, or is this the bug for all DBs? IIRC some methods use a call on the query object that forces a single result, that may be a better option here. > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter > falls back to ORM > --- > > Key: HIVE-13862 > URL: https://issues.apache.org/jira/browse/HIVE-13862 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Fix For: 2.1.0 > > Attachments: HIVE-13862.patch > > > We are seeing following exception and calls fall back to ORM which make it > costly : > {noformat} > WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, > falling back to ORM > java.lang.ClassCastException: > org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to > java.lang.Number > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
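The cast failure in the trace above can be illustrated without DataNucleus: the JDO query hands back a result object that is a List wrapper (ForwardQueryResult), not a Number, so the scalar has to be pulled out of the collection (or the query forced to a single result, as Sergey suggests) before casting. A minimal plain-Java sketch, where `extractSqlInt` is a stand-in for the metastore helper and an ArrayList simulates the query result:

```java
import java.util.ArrayList;
import java.util.List;

public class ExtractSqlIntSketch {
    // Broken variant: mirrors the cast that throws ClassCastException
    // when 'result' is the query's List wrapper rather than a scalar.
    static int extractSqlIntBroken(Object result) {
        return ((Number) result).intValue();
    }

    // Fixed variant: unwrap a single-element list before casting,
    // analogous to forcing a unique result on the query object.
    static int extractSqlInt(Object result) {
        if (result instanceof List) {
            List<?> list = (List<?>) result;
            result = list.isEmpty() ? 0L : list.get(0);
        }
        return ((Number) result).intValue();
    }

    public static void main(String[] args) {
        List<Object> queryResult = new ArrayList<>();
        queryResult.add(42L); // e.g. the single COUNT(*) row

        boolean threw = false;
        try {
            extractSqlIntBroken(queryResult);
        } catch (ClassCastException e) {
            threw = true; // same failure as in the stack trace above
        }
        System.out.println("broken variant threw: " + threw);
        System.out.println("fixed variant value: " + extractSqlInt(queryResult));
    }
}
```

Whether the wrapper behaves this way on every database backend is exactly the open question in the comment; the sketch only shows why the direct cast is fragile.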
[jira] [Updated] (HIVE-13844) Invalid index handler in org.apache.hadoop.hive.ql.index.HiveIndex class
[ https://issues.apache.org/jira/browse/HIVE-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Svetozar Ivanov updated HIVE-13844: --- Description: Class org.apache.hadoop.hive.ql.index.HiveIndex has invalid handler name 'org.apache.hadoop.hive.ql.AggregateIndexHandler'. The actual FQ class name is 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' {code} public static enum IndexType { AGGREGATE_TABLE("aggregate", "org.apache.hadoop.hive.ql.AggregateIndexHandler"), COMPACT_SUMMARY_TABLE("compact", "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"), BITMAP_TABLE("bitmap","org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler"); private IndexType(String indexType, String className) { indexTypeName = indexType; this.handlerClsName = className; } private final String indexTypeName; private final String handlerClsName; public String getName() { return indexTypeName; } public String getHandlerClsName() { return handlerClsName; } } {code} Because of this, statements like 'SHOW INDEXES ON MY_TABLE' do not work when 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' is configured as the index handler; a java.lang.NullPointerException is observed in the HiveServer log. was: Class org.apache.hadoop.hive.ql.index.HiveIndex has invalid handler name 'org.apache.hadoop.hive.ql.AggregateIndexHandler'. 
The actual FQ class name is 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' {code} public static enum IndexType { AGGREGATE_TABLE("aggregate", "org.apache.hadoop.hive.ql.AggregateIndexHandler"), COMPACT_SUMMARY_TABLE("compact", "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"), BITMAP_TABLE("bitmap","org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler"); private IndexType(String indexType, String className) { indexTypeName = indexType; this.handlerClsName = className; } private final String indexTypeName; private final String handlerClsName; public String getName() { return indexTypeName; } public String getHandlerClsName() { return handlerClsName; } } {code} Because all of the above statement like 'SHOW INDEXES ON MY_TABLE' doesn't work as we got java.lang.NullPointerException. > Invalid index handler in org.apache.hadoop.hive.ql.index.HiveIndex class > > > Key: HIVE-13844 > URL: https://issues.apache.org/jira/browse/HIVE-13844 > Project: Hive > Issue Type: Bug > Components: Indexing >Affects Versions: 2.0.0 >Reporter: Svetozar Ivanov >Priority: Minor > Attachments: HIVE-13844.patch > > > Class org.apache.hadoop.hive.ql.index.HiveIndex has invalid handler name > 'org.apache.hadoop.hive.ql.AggregateIndexHandler'. 
The actual FQ class name > is 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' > {code} > public static enum IndexType { > AGGREGATE_TABLE("aggregate", > "org.apache.hadoop.hive.ql.AggregateIndexHandler"), > COMPACT_SUMMARY_TABLE("compact", > "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"), > > BITMAP_TABLE("bitmap","org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler"); > private IndexType(String indexType, String className) { > indexTypeName = indexType; > this.handlerClsName = className; > } > private final String indexTypeName; > private final String handlerClsName; > public String getName() { > return indexTypeName; > } > public String getHandlerClsName() { > return handlerClsName; > } > } > > {code} > Because of this, statements like 'SHOW INDEXES ON MY_TABLE' do not work when > 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' is configured as the > index handler; a java.lang.NullPointerException is observed in the HiveServer > log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
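The fix the report implies is a one-line change to the AGGREGATE_TABLE constant. A minimal self-contained sketch of the enum with the corrected fully-qualified handler name (class and package names taken from the report; the wrapper class is only there to make the snippet runnable):

```java
public class IndexTypeSketch {
    public enum IndexType {
        // Corrected: the handler lives under org.apache.hadoop.hive.ql.index,
        // not org.apache.hadoop.hive.ql as the current enum constant says.
        AGGREGATE_TABLE("aggregate", "org.apache.hadoop.hive.ql.index.AggregateIndexHandler"),
        COMPACT_SUMMARY_TABLE("compact", "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"),
        BITMAP_TABLE("bitmap", "org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler");

        private final String indexTypeName;
        private final String handlerClsName;

        IndexType(String indexType, String className) {
            this.indexTypeName = indexType;
            this.handlerClsName = className;
        }

        public String getName() { return indexTypeName; }
        public String getHandlerClsName() { return handlerClsName; }
    }

    public static void main(String[] args) {
        // The lookup that previously produced a class name that cannot be
        // resolved, leading to the NPE on SHOW INDEXES described above.
        System.out.println(IndexType.AGGREGATE_TABLE.getHandlerClsName());
    }
}
```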
[jira] [Commented] (HIVE-13840) Orc split generation is reading file footers twice
[ https://issues.apache.org/jira/browse/HIVE-13840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302704#comment-15302704 ] Owen O'Malley commented on HIVE-13840: -- I commented in RB, but this looks fine. +1 > Orc split generation is reading file footers twice > -- > > Key: HIVE-13840 > URL: https://issues.apache.org/jira/browse/HIVE-13840 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-13840.1.patch, HIVE-13840.2.patch > > > Recent refactorings to move orc out introduced a regression in split > generation. This leads to reading the orc file footers twice during split > generation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11693) CommonMergeJoinOperator throws exception with tez
[ https://issues.apache.org/jira/browse/HIVE-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302701#comment-15302701 ] Selina Zhang commented on HIVE-11693: - [~rajesh.balamohan], we hit the same issue recently, but I don't think the attached patch fixes the root cause. The issue is that CommonMergeJoinOperator only sets the big-table position when it receives input for the big table. {code:title=CommonMergeJoinOperator.java} @Override public void process(Object row, int tag) throws HiveException { posBigTable = (byte) conf.getBigTablePosition(); ... {code} If the input is empty, the method above is never called. In the query you listed, a subquery is involved: the generated table is tagged 0, while the left table is tagged 1. GenTezWork.java sets the big-table position to 1 for both the reduce work and the CommonJoinOperator. In the reduce phase, when ReduceRecordProcessor runs, it pulls records from the big table: {code:title=ReduceRecordProcessor.java} @Override void run() throws Exception { // run the operator pipeline while (sources[bigTablePosition].pushRecord()) { } } {code} The big-table position here is 1. If the input from the big table is empty, this is the only place pushRecord() is called to read the big table. However, because CommonMergeJoinOperator never set its big-table position, closeOp() thinks tag 1 is a small table, so another pushRecord() is issued to fetch its content, and we see the exception listed in this JIRA. Please let me know if there is a problem with my analysis; if you agree, could you update the patch? Thanks > CommonMergeJoinOperator throws exception with tez > - > > Key: HIVE-11693 > URL: https://issues.apache.org/jira/browse/HIVE-11693 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan > Attachments: HIVE-11693.1.patch > > > Got this when executing a simple query with latest hive build + tez latest > version. 
> {noformat} > Error: Failure while running task: > attempt_1439860407967_0291_2_03_45_0:java.lang.RuntimeException: > java.lang.RuntimeException: Hive Runtime Error while closing operators: > java.lang.RuntimeException: java.io.IOException: Please check if you are > invoking moveToNext() even after it returned false. > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) > at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: Hive Runtime Error while closing > operators: java.lang.RuntimeException: java.io.IOException: Please check if > you are invoking moveToNext() even after it returned false. > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:316) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162) > ... 
14 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.io.IOException: Please check if you are > invoking moveToNext() even after it returned false. > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:375) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.doFirstFetchIfNeeded(CommonMergeJoinOperator.java:482) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:434) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:384) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:61
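The analysis in the comment above can be condensed into a toy model: if the big-table position is only recorded inside process(), an empty big-table input leaves it at its default, and closeOp() then treats the big-table tag as a small table and fetches from it again. A minimal sketch in plain Java (not Hive classes; the configured big-table position is hardcoded to 1, matching the query described):

```java
public class MergeJoinPositionSketch {
    static class MergeJoin {
        final int confBigTablePosition = 1; // set by GenTezWork in the report
        int posBigTable;                    // defaults to 0

        MergeJoin(boolean setPositionAtInit) {
            // Suggested fix: record the position during initialization,
            // not lazily inside process().
            if (setPositionAtInit) {
                posBigTable = confBigTablePosition;
            }
        }

        void process(Object row) {
            // Current behavior: position is only set when a row arrives.
            posBigTable = confBigTablePosition;
        }

        // True when closeOp would issue the extra pushRecord() on this tag,
        // mistaking the (empty) big table for a small table.
        boolean closeOpFetchesTagAgain(int tag) {
            return tag != posBigTable;
        }
    }

    public static void main(String[] args) {
        // Empty big-table input: process() is never called on either join.
        MergeJoin lazy = new MergeJoin(false);
        MergeJoin eager = new MergeJoin(true);
        System.out.println("lazy refetches tag 1: " + lazy.closeOpFetchesTagAgain(1));
        System.out.println("eager refetches tag 1: " + eager.closeOpFetchesTagAgain(1));
    }
}
```

The "lazy" case models the extra pushRecord() that trips the "invoking moveToNext() even after it returned false" IOException in the trace; the "eager" case models the proposed fix.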
[jira] [Commented] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302698#comment-15302698 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13857: -- [~ashutoshc] where ever recursion=false, this makes sense and I have modified it in patch #2. When recursion=true, I dont think sending the Status object of top level directory will be of much help, so I have retained the behavior there. Thanks Hari > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13857: - Attachment: HIVE-13857.2.patch > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13566) Auto-gather column stats - phase 1
[ https://issues.apache.org/jira/browse/HIVE-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302676#comment-15302676 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13566: -- The commit for this jira removed the fix for HIVE-13810. I will add them back as part of HIVE-13857 Thanks Hari > Auto-gather column stats - phase 1 > -- > > Key: HIVE-13566 > URL: https://issues.apache.org/jira/browse/HIVE-13566 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-13566.01.patch, HIVE-13566.02.patch, > HIVE-13566.03.patch > > > This jira adds code and tests for auto-gather column stats. Golden file > update will be done in phase 2 - HIVE-11160 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13855) select INPUT__FILE__NAME throws NPE exception
[ https://issues.apache.org/jira/browse/HIVE-13855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302666#comment-15302666 ] Yongzhi Chen commented on HIVE-13855: - The change looks fine. +1 > select INPUT__FILE__NAME throws NPE exception > - > > Key: HIVE-13855 > URL: https://issues.apache.org/jira/browse/HIVE-13855 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13855.1.patch > > > The following query executes successfully > select INPUT__FILE__NAME from src limit 1; > But the following NPE is thrown > {noformat} > 16/05/25 16:49:49 ERROR exec.Utilities: Failed to load plan: null: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:407) > at > org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:299) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:315) > at > org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) > at > org.apache.hadoop.hive.ql.exec.FetchOperator$1.doNext(FetchOperator.java:340) > at > org.apache.hadoop.hive.ql.exec.FetchOperator$1.doNext(FetchOperator.java:331) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:484) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:144) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1884) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started
[ https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302652#comment-15302652 ] Sushanth Sowmyan commented on HIVE-13836: - Thanks for the catch, [~vaidyand]. I have a couple of thoughts - firstly, since the notion of a Notification Log Table is dependent on the RawStore implementation, the lock might belong more in implementations of RawStore such as ObjectStore, rather than in DbNotificationListener. Secondly, when I went and looked at ObjectStore, I see that we're correctly calling openTransaction()/commitTransaction()/rollbackTransaction(), which should serve the same purpose as the lock, and if you're getting an error that states "Transaction has already started", we're likely hitting a deeper bug with transaction semantics (with nesting allowed for) in ObjectStore. Adding a lock in DbNotificationListener will fix this bug in DbNotificationListener, but would leave that other issue undiscovered. [~alangates], thoughts? > DbNotifications giving an error = Invalid state. Transaction has already > started > > > Key: HIVE-13836 > URL: https://issues.apache.org/jira/browse/HIVE-13836 > Project: Hive > Issue Type: Bug >Reporter: Nachiket Vaidya >Priority: Critical > Attachments: HIVE-13836.patch > > > I used the pyhs2 python client to create tables/partitions in hive. It was > working fine until I moved to multithreaded scripts which created 8 connections > and ran DDL queries concurrently. > I got the error as > {noformat} > 2016-05-04 17:49:26,226 ERROR > org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-194]: > HMSHandler Fatal error: Invalid state. Transaction has already started > org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
> Transaction has already started > at > org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47) > at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131) > at > org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88) > at > org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80) > at > org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:463) > at > org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7522) > at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) > at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:123) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1483) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502) > at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) > at > com.sun.proxy.$Proxy14.create_table_with_environment_context(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9267) 
> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
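The "Transaction has already started" failure is consistent with a strict transaction begin() being reached while a transaction is already open, which correct openTransaction()/commitTransaction() nesting is supposed to prevent. A minimal plain-Java sketch of both semantics (no DataNucleus; NaiveTxn stands in for the underlying JDO transaction, and the counter mirrors the kind of nesting bookkeeping ObjectStore's open/commit pairing implies):

```java
public class NestedTxnSketch {
    // Strict transaction: rejects begin() while active, like the
    // NucleusTransactionException in the stack trace above.
    static class NaiveTxn {
        boolean active;
        void begin() {
            if (active) {
                throw new IllegalStateException("Invalid state. Transaction has already started");
            }
            active = true;
        }
        void end() { active = false; }
    }

    // Counter-based nesting: only the outermost open actually begins,
    // only the outermost commit actually ends the transaction.
    static class NestedTxn {
        private final NaiveTxn underlying = new NaiveTxn();
        private int openCalls;
        void openTransaction() {
            if (openCalls++ == 0) underlying.begin();
        }
        void commitTransaction() {
            if (--openCalls == 0) underlying.end();
        }
    }

    public static void main(String[] args) {
        boolean threw = false;
        NaiveTxn naive = new NaiveTxn();
        naive.begin();
        try { naive.begin(); } catch (IllegalStateException e) { threw = true; }
        System.out.println("naive double begin threw: " + threw);

        NestedTxn nested = new NestedTxn();
        nested.openTransaction();   // e.g. create_table_core opens
        nested.openTransaction();   // e.g. addNotificationEvent nests inside it
        nested.commitTransaction();
        nested.commitTransaction();
        System.out.println("nested open/commit completed");
    }
}
```

If the nesting bookkeeping is correct, the error should never surface, which is why the comment suspects the real bug is in the transaction-state handling under concurrent connections rather than in the absence of a lock; this sketch only illustrates the two semantics, not the actual ObjectStore code path.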
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302648#comment-15302648 ] Sean Busbey commented on HIVE-12721: can a committer with access to the QA job relaunch it to see if these failures are related? > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.1.patch, HIVE-12721.2.patch, HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302638#comment-15302638 ] Jeremy Beard commented on HIVE-12721: - I don't think so but I can't check because the test result pages seem to have been purged. > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.1.patch, HIVE-12721.2.patch, HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)