[jira] [Created] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output
Matt McCline created HIVE-20091:
-----------------------------------

             Summary: Tez: Add security credentials for FileSinkOperator output
                 Key: HIVE-20091
                 URL: https://issues.apache.org/jira/browse/HIVE-20091
             Project: Hive
          Issue Type: Bug
          Components: Hive
            Reporter: Matt McCline
            Assignee: Matt McCline


DagUtils needs to add security credentials for the output of the FileSinkOperator.


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HIVE-20090) Extend creation of semijoin reduction filters to be able to discover new opportunities
Jesus Camacho Rodriguez created HIVE-20090:
----------------------------------------------

             Summary: Extend creation of semijoin reduction filters to be able to discover new opportunities
                 Key: HIVE-20090
                 URL: https://issues.apache.org/jira/browse/HIVE-20090
             Project: Hive
          Issue Type: Improvement
          Components: Physical Optimizer
            Reporter: Jesus Camacho Rodriguez
            Assignee: Jesus Camacho Rodriguez


Assume the following plan:
{noformat}
TS[0] - RS[1] - JOIN[4] - RS[5] - JOIN[8] - FS[9]
TS[2] - RS[3] - JOIN[4]
TS[6] - RS[7] - JOIN[8]
{noformat}
Currently, {{TS[6]}} may only be reduced with the output of {{RS[5]}}, i.e., the input to the join between both subplans. However, it may be useful to consider other possibilities too, e.g., reduction by the output of {{RS[1]}} or {{RS[3]}}. For instance, this is important when, given a large plan, an edge between {{RS[5]}} and {{TS[0]}} would create a cycle, while an edge between {{RS[1]}} and {{TS[6]}} would not.

This patch comprises two parts. First, it creates additional predicates when possible. Second, it removes duplicate semijoin reduction branches/predicates, e.g., when another semijoin that consumes the output of the same expression already reduces a certain table scan operator (a heuristic, since this may not yield the most efficient plan in all cases). Ultimately, the decision on whether to use one or the other should be cost-driven (follow-up).
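The cycle constraint above can be illustrated with a toy directed graph over the operators in the plan: adding an edge u -> v closes a cycle exactly when u is already reachable from v. The sketch below is plain Java with hypothetical names, not Hive's actual planner code, and models the plan from the description at the operator level only.

```java
import java.util.*;

// Toy DAG check: does adding edge (from -> to) create a cycle?
// An edge u -> v closes a cycle exactly when u is already reachable from v.
public class SemijoinCycleCheck {
    static final Map<String, List<String>> DAG = new HashMap<>();

    static void edge(String u, String v) {
        DAG.computeIfAbsent(u, k -> new ArrayList<>()).add(v);
    }

    // Iterative DFS reachability from src to dst.
    static boolean reachable(String src, String dst) {
        Deque<String> stack = new ArrayDeque<>(List.of(src));
        Set<String> seen = new HashSet<>();
        while (!stack.isEmpty()) {
            String n = stack.pop();
            if (n.equals(dst)) return true;
            if (seen.add(n)) stack.addAll(DAG.getOrDefault(n, List.of()));
        }
        return false;
    }

    static boolean createsCycle(String from, String to) {
        return reachable(to, from);
    }

    public static void main(String[] args) {
        // The plan from the description, as operator-level edges.
        edge("TS[0]", "RS[1]"); edge("RS[1]", "JOIN[4]");
        edge("TS[2]", "RS[3]"); edge("RS[3]", "JOIN[4]");
        edge("JOIN[4]", "RS[5]"); edge("RS[5]", "JOIN[8]");
        edge("TS[6]", "RS[7]"); edge("RS[7]", "JOIN[8]");
        edge("JOIN[8]", "FS[9]");

        // RS[5] -> TS[0] would close a cycle: TS[0] already leads to RS[5].
        System.out.println(createsCycle("RS[5]", "TS[0]")); // true
        // RS[1] -> TS[6] is safe: TS[6] cannot reach RS[1].
        System.out.println(createsCycle("RS[1]", "TS[6]")); // false
    }
}
```

This is why the alternative reduction by {{RS[1]}} can be usable in plans where the reduction by {{RS[5]}} is not.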
[jira] [Created] (HIVE-20089) CTAS doesn't work into nonexisting /tmp/... directory while CT works
Laszlo Bodor created HIVE-20089:
-----------------------------------

             Summary: CTAS doesn't work into nonexisting /tmp/... directory while CT works
                 Key: HIVE-20089
                 URL: https://issues.apache.org/jira/browse/HIVE-20089
             Project: Hive
          Issue Type: Bug
    Affects Versions: 3.0.0
            Reporter: Laszlo Bodor


While checking negative qtests I've found some strange behavior regarding CREATE TABLE (CT) and CREATE TABLE AS SELECT (CTAS) statements:
ct_noperm_loc.q
ctas_noperm_loc.q

The common part of these tests is the initialization:
{code}
set hive.test.authz.sstd.hs2.mode=true;
set hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest;
set hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateConfigUserAuthenticator;
set hive.security.authorization.enabled=true;
set user.name=user1;
{code}
But while a simple 'create table' into a nonexistent directory works...
{code}
create table foo0(id int) location 'hdfs:///tmp/ct_noperm_loc_foo0';
{code}
...'create table as select' doesn't:
{code}
create table foo0 location 'hdfs:///tmp/ctas_noperm_loc_foo0' as select 1 as c1;
{code}
The expected result is:
{code}
FAILED: HiveAccessControlException Permission denied: Principal [name=user1, type=USER] does not have following privileges for operation CREATETABLE_AS_SELECT [[INSERT, DELETE] on Object [type=DFS_URI, name=hdfs://### HDFS PATH ###]]
{code}
Is this by design, or am I missing something here?
{code}
mvn test -Dtest=TestNegativeMinimrCliDriver -Dqfile=ct_noperm_loc.q -Pitests,hadoop-2 -pl itests/qtest
mvn test -Dtest=TestNegativeMinimrCliDriver -Dqfile=ctas_noperm_loc.q -Pitests,hadoop-2 -pl itests/qtest
{code}
[jira] [Created] (HIVE-20088) Beeline config location path is assembled incorrectly
Denes Bodo created HIVE-20088:
---------------------------------

             Summary: Beeline config location path is assembled incorrectly
                 Key: HIVE-20088
                 URL: https://issues.apache.org/jira/browse/HIVE-20088
             Project: Hive
          Issue Type: Bug
          Components: Beeline
    Affects Versions: 3.0.0
            Reporter: Denes Bodo
            Assignee: Denes Bodo


Checking the code in [https://github.com/apache/hive/blob/branch-3/beeline/src/java/org/apache/hive/beeline/hs2connection/UserHS2ConnectionFileParser.java] or in [https://github.com/apache/hive/blob/branch-3/beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineSiteParser.java] I see
{code}locations.add(ETC_HIVE_CONF_LOCATION + DEFAULT_BEELINE_SITE_FILE_NAME);{code}
where a file separator should be used:
{code}locations.add(ETC_HIVE_CONF_LOCATION + File.separator + DEFAULT_BEELINE_SITE_FILE_NAME);{code}
Due to this, BeeLine cannot use the configuration file when this location is the only available one. In my hadoop-3 setup, the locations list contains the following:
{code}
/home/myuser/.beeline/beeline-site.xml
/etc/hive/confbeeline-site.xml
{code}
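The effect of the missing separator is easy to reproduce with plain string concatenation. The constant names below follow the ones quoted from the parser, but their values are illustrative, not copied from Hive:

```java
import java.io.File;

// Reproduces the path assembly in BeelineSiteParser (values illustrative).
public class BeelineConfigPath {
    static final String ETC_HIVE_CONF_LOCATION = "/etc/hive/conf";
    static final String DEFAULT_BEELINE_SITE_FILE_NAME = "beeline-site.xml";

    // Current code: plain concatenation, no separator between dir and file.
    static String buggyLocation() {
        return ETC_HIVE_CONF_LOCATION + DEFAULT_BEELINE_SITE_FILE_NAME;
    }

    // Proposed fix: join the two parts with File.separator.
    static String fixedLocation() {
        return ETC_HIVE_CONF_LOCATION + File.separator + DEFAULT_BEELINE_SITE_FILE_NAME;
    }

    public static void main(String[] args) {
        System.out.println(buggyLocation()); // /etc/hive/confbeeline-site.xml
        System.out.println(fixedLocation()); // /etc/hive/conf/beeline-site.xml on Unix
    }
}
```

The buggy variant produces exactly the broken `/etc/hive/confbeeline-site.xml` entry shown in the locations list above.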
[jira] [Created] (HIVE-20087) Fix reoptimization for semijoin reduction cases
Zoltan Haindrich created HIVE-20087:
---------------------------------------

             Summary: Fix reoptimization for semijoin reduction cases
                 Key: HIVE-20087
                 URL: https://issues.apache.org/jira/browse/HIVE-20087
             Project: Hive
          Issue Type: Bug
            Reporter: Zoltan Haindrich


The real TS will get further info about the other table, which makes the physically read record count inaccurate.
[jira] [Created] (HIVE-20086) Druid-hive kafka ingestion: indexing tasks kept running even after setting 'druid.kafka.ingestion' = 'STOP'
Dileep Kumar Chiguruvada created HIVE-20086:
-----------------------------------------------

             Summary: Druid-hive kafka ingestion: indexing tasks kept running even after setting 'druid.kafka.ingestion' = 'STOP'
                 Key: HIVE-20086
                 URL: https://issues.apache.org/jira/browse/HIVE-20086
             Project: Hive
          Issue Type: Bug
          Components: Hive, StorageHandler
    Affects Versions: 3.0.0
            Reporter: Dileep Kumar Chiguruvada
             Fix For: 3.0.0
         Attachments: Screen Shot 2018-07-02 at 8.51.10 PM.png


Druid-hive kafka ingestion: indexing tasks keep running even after setting 'druid.kafka.ingestion' = 'STOP'.

When ingestion is started ('druid.kafka.ingestion' = 'START'), the indexing task starts running and is able to load rows into the Druid-Hive table. But after stopping ingestion, the indexing task keeps running instead of shutting down gracefully. As a result, every START of ingestion pools up additional indexing tasks.
[jira] [Created] (HIVE-20085) Druid-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
Dileep Kumar Chiguruvada created HIVE-20085:
-----------------------------------------------

             Summary: Druid-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
                 Key: HIVE-20085
                 URL: https://issues.apache.org/jira/browse/HIVE-20085
             Project: Hive
          Issue Type: Bug
          Components: Hive, StorageHandler
    Affects Versions: 3.0.0
            Reporter: Dileep Kumar Chiguruvada
             Fix For: 3.0.0


Druid-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional.
{code}
drop table if exists calcs;
create table calcs
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
  "druid.segment.granularity" = "MONTH",
  "druid.query.granularity" = "DAY")
AS SELECT
  cast(datetime0 as timestamp with local time zone) `__time`,
  key, str0, str1, str2, str3,
  date0, date1, date2, date3,
  time0, time1, datetime0, datetime1, zzz,
  cast(bool0 as string) bool0,
  cast(bool1 as string) bool1,
  cast(bool2 as string) bool2,
  cast(bool3 as string) bool3,
  int0, int1, int2, int3,
  num0, num1, num2, num3, num4
from tableau_orc.calcs;

2018-07-03 04:57:31,911|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Status: Running (Executing on YARN cluster with App id application_1530592209763_0009)
...
...
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_BYTES_TO_MEM: 0
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_PHASE_TIME: 330
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPILLED_RECORDS: 17
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : OUTPUT_RECORDS: 0
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : org.apache.hadoop.hive.llap.counters.LlapWmCounters:
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_QUEUED_NS: 0
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_RUNNING_NS: 0
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPECULATIVE_QUEUED_NS: 2162643606
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPECULATIVE_RUNNING_NS: 12151664909
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-0:MOVE] in serial mode
2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Moving data to directory hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/calcs from hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/.hive-staging_hive_2018-07-03_04-57-27_351_7124633902209008283-3/-ext-10002
2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-4:DDL] in serial mode
2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table druid_tableau.calcs failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.)
2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Completed executing command(queryId=hive_20180703045727_c39c40d2-7d4a-46c7-a36d-7925e7c4a788); Time taken: 6.794 seconds
2018-07-03 04:57:36,337|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table druid_tableau.calcs failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.) (state=08S01,code=1)
{code}
This does not allow Druid tables to be managed, so creating Druid tables is not straightforward. While trying to modify things to use external tables, we see the below issues:
1)