[jira] [Created] (HIVE-27195) Drop table if Exists . fails during authorization for temporary tables
Riju Trivedi created HIVE-27195:
---
Summary: Drop table if Exists . fails during authorization for temporary tables
Key: HIVE-27195
URL: https://issues.apache.org/jira/browse/HIVE-27195
Project: Hive
Issue Type: Bug
Reporter: Riju Trivedi
Assignee: Riju Trivedi

https://issues.apache.org/jira/browse/HIVE-20051 skips authorization for temporary tables. However, DROP TABLE IF EXISTS on a temporary table still fails with HiveAccessControlException.

Steps to repro:
{code:java}
use test;
CREATE TEMPORARY TABLE temp_table (id int);
drop table if exists test.temp_table;

Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [rtrivedi] does not have [DROP] privilege on [test/temp_table] (state=42000,code=4)
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27181) Remove RegexSerDe from hive-contrib, Upgrade should update changed FQN for RegexSerDe in HMS DB
Riju Trivedi created HIVE-27181:
---
Summary: Remove RegexSerDe from hive-contrib, Upgrade should update changed FQN for RegexSerDe in HMS DB
Key: HIVE-27181
URL: https://issues.apache.org/jira/browse/HIVE-27181
Project: Hive
Issue Type: Sub-task
Components: Hive
Reporter: Riju Trivedi
[jira] [Created] (HIVE-27180) Remove JsonSerde from hcatalog, Upgrade should update changed FQN for JsonSerDe in HMS DB
Riju Trivedi created HIVE-27180:
---
Summary: Remove JsonSerde from hcatalog, Upgrade should update changed FQN for JsonSerDe in HMS DB
Key: HIVE-27180
URL: https://issues.apache.org/jira/browse/HIVE-27180
Project: Hive
Issue Type: Sub-task
Components: Hive
Reporter: Riju Trivedi
Assignee: Riju Trivedi
[jira] [Created] (HIVE-27113) Documentation for hive.thrift.client.max.message.size config needs to be corrected
Riju Trivedi created HIVE-27113:
---
Summary: Documentation for hive.thrift.client.max.message.size config needs to be corrected
Key: HIVE-27113
URL: https://issues.apache.org/jira/browse/HIVE-27113
Project: Hive
Issue Type: Bug
Reporter: Riju Trivedi
Assignee: Riju Trivedi

HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE("hive.thrift.client.max.message.size", "1gb",
    new SizeValidator(-1L, true, (long) Integer.MAX_VALUE, true),
    "Thrift client configuration for max message size. 0 or -1 will use the default defined in the Thrift " +
    "library. The upper limit is 2147483648 bytes (or 2gb).")

The help text suggests setting 2147483648, while Integer.MAX_VALUE is 2147483647. A value of 2147483648 therefore actually becomes -1, and the Thrift default limit (100 MB) is used.
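The off-by-one between the documented limit and Integer.MAX_VALUE can be checked in isolation. The sketch below is plain Java with no Hive classes; the class and helper names are hypothetical, and how Hive maps an out-of-range value onto the Thrift default is as described in the report and is not modelled here.

```java
public class ThriftMaxMessageSizeSketch {
    // Value quoted in the configuration help text as the upper limit.
    static final long DOCUMENTED_LIMIT = 2147483648L;

    // True when the value can be stored in a Java int without overflow.
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE);           // 2147483647
        System.out.println(fitsInInt(DOCUMENTED_LIMIT)); // false
        System.out.println(fitsInInt(2147483647L));      // true
        // A narrowing cast of the documented limit wraps around to a negative number.
        System.out.println((int) DOCUMENTED_LIMIT);      // -2147483648
    }
}
```

The largest value that survives as an int is 2147483647, which is the number the help text would need to quote.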
[jira] [Created] (HIVE-27011) Default value of PartitionManagementTask frequency should be set to higher value
Riju Trivedi created HIVE-27011:
---
Summary: Default value of PartitionManagementTask frequency should be set to higher value
Key: HIVE-27011
URL: https://issues.apache.org/jira/browse/HIVE-27011
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Riju Trivedi
Assignee: Riju Trivedi

The default for "metastore.partition.management.task.frequency" is 5 minutes, which is less than ideal for production scenarios. When there are a large number of databases and tables, PartitionManagementTask takes a long time to scan all tables and partitions and does not complete within 5 minutes.
[jira] [Created] (HIVE-23471) Statement.executeUpdate() does not return correct affected records causing "No such lock"
Riju Trivedi created HIVE-23471:
---
Summary: Statement.executeUpdate() does not return correct affected records causing "No such lock"
Key: HIVE-23471
URL: https://issues.apache.org/jira/browse/HIVE-23471
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.1.0
Reporter: Riju Trivedi
Assignee: Denys Kuzmenko

In the TxnHandler.acquire() call, Statement.executeUpdate() does not return a number of updated HIVE_LOCKS records that matches the number of locks requested. As a result, the acquire is rolled back and fails with the error "*Couldn't find a lock we just created! No such lock(s)*".

{code:java}
int rc = stmt.executeUpdate(s);
if (rc < locksBeingChecked.size()) {
  LOG.debug("Going to rollback acquire(Connection dbConn, Statement stmt, List locksBeingChecked)");
  dbConn.rollback();
  /*select all locks for this ext ID and see which ones are missing*/
  StringBuilder sb = new StringBuilder("No such lock(s): (" + JavaUtils.lockIdToString(extLockId) + ":");
{code}
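The comparison that triggers the rollback can be isolated from JDBC. The following is a minimal sketch, not the TxnHandler code: the class and method names are hypothetical, and the row count is passed in directly instead of coming from Statement.executeUpdate().

```java
import java.util.Arrays;
import java.util.List;

public class LockAcquireCheckSketch {
    // Mirrors the check in the excerpt: roll back when the driver reports
    // fewer updated rows than the number of locks being acquired.
    static boolean shouldRollback(int rowsUpdated, List<Long> locksBeingChecked) {
        return rowsUpdated < locksBeingChecked.size();
    }

    public static void main(String[] args) {
        List<Long> locks = Arrays.asList(1L, 2L, 3L);
        // All three rows reported: the acquire proceeds.
        System.out.println(shouldRollback(3, locks)); // false
        // The driver under-reports the count: the acquire is rolled back
        // even if the rows were in fact updated.
        System.out.println(shouldRollback(1, locks)); // true
    }
}
```

The second call shows why an under-reported count is enough to produce the "No such lock(s)" error, whatever the actual state of the table.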
[jira] [Created] (HIVE-23339) SBA does not check permissions for DB location specified in Create database query
Riju Trivedi created HIVE-23339:
---
Summary: SBA does not check permissions for DB location specified in Create database query
Key: HIVE-23339
URL: https://issues.apache.org/jira/browse/HIVE-23339
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.1.0
Reporter: Riju Trivedi
Assignee: Shubham Chaurasia

With doAs=true and the StorageBasedAuthorization provider, creating a database with a specific location succeeds even if the user does not have access to that path.

{code:java}
hadoop fs -ls -d /tmp/cannot_write
drwx-- - hive hadoop 0 2020-04-01 22:53 /tmp/cannot_write

create a database under /tmp/cannot_write. We would expect it to fail, but it is actually created successfully with "hive" as the owner:

rtrivedi@bdp01:~> beeline -e "create database rtrivedi_1 location '/tmp/cannot_write/rtrivedi_1'"
INFO : OK
No rows affected (0.116 seconds)

hive@hpchdd2e:~> hadoop fs -ls /tmp/cannot_write
Found 1 items
drwx-- - hive hadoop 0 2020-04-01 23:05 /tmp/cannot_write/rtrivedi_1
{code}
[jira] [Created] (HIVE-23058) Compaction task reattempt fails with FileAlreadyExistsException
Riju Trivedi created HIVE-23058:
---
Summary: Compaction task reattempt fails with FileAlreadyExistsException
Key: HIVE-23058
URL: https://issues.apache.org/jira/browse/HIVE-23058
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.1.0
Reporter: Riju Trivedi
Assignee: Riju Trivedi

The issue occurs when a compaction task attempt is relaunched after the first attempt fails, whether due to preemption by the scheduler or for any other reason. The _tmp directory created by the first attempt is left uncleaned after the failure, so the second attempt of the task fails with FileAlreadyExistsException:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.FileAlreadyExistsException): /warehouse/tablespace/managed/hive/default.db/compaction_test/_tmp_3670bbef-ba7a-4c10-918d-9a2ee17cbd22/base_186/bucket_5 for client 10.xx.xx.xxx already exists
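The failure mode can be reproduced on a local filesystem. The sketch below uses java.nio.file as a stand-in for HDFS, which is an assumption, as are the class, method, and directory names; it is not the compactor code and does not claim to be the fix that was applied.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CompactionRetrySketch {
    // Simulates one task attempt writing a bucket file under the shared tmp directory.
    static void writeBucket(Path tmpDir) throws IOException {
        Path base = tmpDir.resolve("base_186");
        Files.createDirectories(base);
        // createFile throws FileAlreadyExistsException if the file is already there.
        Files.createFile(base.resolve("bucket_5"));
    }

    // Recursively deletes a directory tree, children before parents.
    static void deleteTree(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(dir)) {
            ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
    }

    // Runs a first attempt that leaves its output behind, then a second attempt.
    // Returns true when the second attempt fails with FileAlreadyExistsException.
    static boolean reattemptFails(boolean cleanFirst) throws IOException {
        Path root = Files.createTempDirectory("compaction_test");
        Path tmpDir = root.resolve("_tmp_attempt");
        try {
            writeBucket(tmpDir);          // first attempt, killed before cleanup
            if (cleanFirst) {
                deleteTree(tmpDir);       // remove leftovers before retrying
            }
            try {
                writeBucket(tmpDir);      // second attempt
                return false;
            } catch (FileAlreadyExistsException e) {
                return true;
            }
        } finally {
            deleteTree(root);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(reattemptFails(false)); // true
        System.out.println(reattemptFails(true));  // false
    }
}
```

Removing the leftover directory before the retry is one way to make the reattempt succeed; writing each attempt to its own directory would be another.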
[jira] [Created] (HIVE-22452) CTAS query failure at DDL task stage doesn't clean out the target directory
Riju Trivedi created HIVE-22452:
---
Summary: CTAS query failure at DDL task stage doesn't clean out the target directory
Key: HIVE-22452
URL: https://issues.apache.org/jira/browse/HIVE-22452
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.1.2, 3.1.0
Reporter: Riju Trivedi
Assignee: Riju Trivedi

A CTAS query that fails at the DDL task stage because of an HMS connection issue leaves its output file in the target directory. The DDL task stage runs after the Tez DAG has completed and after the MOVE task, so the output file has already been moved to the target directory and is not cleaned up after the query fails. Re-executing the same query creates a duplicate file under the table location, and hence duplicate data.
[jira] [Created] (HIVE-22208) Column name with reserved keyword is unescaped when query includes join on table with mask column is re-written
Riju Trivedi created HIVE-22208:
---
Summary: Column name with reserved keyword is unescaped when query includes join on table with mask column is re-written
Key: HIVE-22208
URL: https://issues.apache.org/jira/browse/HIVE-22208
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.1.0, 4.0.0
Reporter: Riju Trivedi

A join query involving a table with a masked column and another table that has a reserved keyword as a column name fails with a SemanticException while the re-written query is parsed.

Original query:
{code:java}
select a.`date`, b.nm
from sample_keyword a
join sample_mask b
on b.id = a.id;
{code}

Re-written query:
{code:java}
select a.date, b.nm
from sample_keyword a
join (SELECT `id`, CAST(mask_hash(nm) AS string) AS `nm`, BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID FROM `default`.`sample_mask` )`b`
on b.id = a.id;
{code}

The re-written query does not have escape quotes around the date column, which causes the SemanticException while parsing:
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 1:9 cannot recognize input near 'a' '.' 'date' in selection target
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:12084)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12298)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:360)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:664)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1869)
{code}
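What the rewrite would need to preserve can be sketched as a small quoting helper. This is an illustration only: the class and both methods are hypothetical and are not the Hive rewriter's API, and doubling an embedded backtick is an assumption about the quoting convention.

```java
public class IdentifierQuotingSketch {
    // Hypothetical helper: wraps an identifier in backticks and doubles any
    // embedded backtick (assumed convention) so the name stays one token.
    static String quoteIdentifier(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Builds a qualified column reference with the column name always quoted.
    static String columnRef(String alias, String column) {
        return alias + "." + quoteIdentifier(column);
    }

    public static void main(String[] args) {
        // Quoting unconditionally keeps reserved words such as "date" parseable.
        System.out.println(columnRef("a", "date")); // a.`date`
        System.out.println(columnRef("b", "nm"));   // b.`nm`
    }
}
```

Quoting every column name, reserved or not, avoids having to keep a keyword list in sync with the parser.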
[jira] [Created] (HIVE-21061) CTAS query fails with IllegalStateException for empty source
Riju Trivedi created HIVE-21061:
---
Summary: CTAS query fails with IllegalStateException for empty source
Key: HIVE-21061
URL: https://issues.apache.org/jira/browse/HIVE-21061
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.1.0
Reporter: Riju Trivedi
Assignee: Riju Trivedi

Creating a table using CTAS from an empty source table with a predicate condition evaluating to False
{code}
create table testctas1 (id int);
create table testctas2 as select * from testctas1 where 1=2;
{code}
fails with the exception below:
{code}
Caused by: java.lang.IllegalStateException: null
 at com.google.common.base.Preconditions.checkState(Preconditions.java:159)
 at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.verifyAndSetVectorPartDesc(Vectorizer.java:1312)
 at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateInputFormatAndSchemaEvolution(Vectorizer.java:1654)
 at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(Vectorizer.java:1865)
 at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:1109)
 at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:961)
 at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
 at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
 at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
 at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:2442)
 at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:717)
 at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:258)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12443)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:358)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:664)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1863)
 at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1810)
 at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1805)
 at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
 at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197)
 ... 36 more
{code}
[jira] [Created] (HIVE-20406) Nested Coalesce giving incorrect results
Riju Trivedi created HIVE-20406:
---
Summary: Nested Coalesce giving incorrect results
Key: HIVE-20406
URL: https://issues.apache.org/jira/browse/HIVE-20406
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.0.0
Reporter: Riju Trivedi

The query below returns NULL instead of 'TEST':

select coalesce(coalesce(null), 'TEST');

INFO : OK
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
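For reference, the expected semantics of the nested call can be modelled outside Hive. The sketch below is a plain Java stand-in for COALESCE (first non-null argument, else null); the class and method are hypothetical and are not the Hive UDF.

```java
public class CoalesceSketch {
    // Returns the first non-null argument, or null when every argument is null.
    static Object coalesce(Object... values) {
        for (Object v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The inner call has only a null argument, so it yields null.
        Object inner = coalesce((Object) null);
        System.out.println(inner);                   // null
        // The outer call should then skip the null and return the literal.
        System.out.println(coalesce(inner, "TEST")); // TEST
    }
}
```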
[jira] [Created] (HIVE-19761) Incorrect evaluation of OR condition
Riju Trivedi created HIVE-19761:
---
Summary: Incorrect evaluation of OR condition
Key: HIVE-19761
URL: https://issues.apache.org/jira/browse/HIVE-19761
Project: Hive
Issue Type: Bug
Components: CLI, Parser
Reporter: Riju Trivedi

An OR clause is evaluated incorrectly in a Hive query when the WHERE condition evaluates as FALSE OR TRUE. If the condition checks are reversed so that it evaluates as TRUE OR FALSE, it works fine.

Steps to repro:
{code}
CREATE TABLE `rtfnprepro1`(
  `id` int,
  `name` string,
  `zip` int,
  `city` string,
  `last` string,
  `ssn` int,
  `phone` int,
  `gender` string,
  `weight` int,
  `desc` string)
PARTITIONED BY (
  `country` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';

insert into table rtfnprepro1 partition (country='INDIA') values (1, "abdcghjhkkjkjkkkgj",1,"PIShkjkkjhkkk","abc",1,2,"malhjkkjke",2,"hello");

select * from rtfnprepro1 where ((phone=1 and last='adc') OR (phone=2 and last!='adc'));

select * from rtfnprepro1 where ( (phone=2 and last!='adc') OR (phone=1 and last='adc') );
{code}
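Both predicates can be evaluated by hand against the inserted row, which has phone=2 and last='abc'. The sketch below does this in plain Java; the class and method names are hypothetical, and it only models the two WHERE clauses, not Hive's evaluation.

```java
public class OrPredicateSketch {
    // First query: (phone=1 and last='adc') OR (phone=2 and last!='adc')
    static boolean original(int phone, String last) {
        return (phone == 1 && last.equals("adc")) || (phone == 2 && !last.equals("adc"));
    }

    // Second query: the same two branches in the opposite order.
    static boolean reversed(int phone, String last) {
        return (phone == 2 && !last.equals("adc")) || (phone == 1 && last.equals("adc"));
    }

    public static void main(String[] args) {
        // Values taken from the INSERT statement in the repro.
        int phone = 2;
        String last = "abc";
        System.out.println(original(phone, last)); // true (FALSE OR TRUE)
        System.out.println(reversed(phone, last)); // true (TRUE OR FALSE)
    }
}
```

Since OR is commutative, both queries should return the inserted row.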
[jira] [Created] (HIVE-19570) Multiple inserts using "Group by" generates incorrect results
Riju Trivedi created HIVE-19570:
---
Summary: Multiple inserts using "Group by" generates incorrect results
Key: HIVE-19570
URL: https://issues.apache.org/jira/browse/HIVE-19570
Project: Hive
Issue Type: Bug
Components: Logical Optimizer, Query Processor
Affects Versions: 1.2.0, 3.0.0
Reporter: Riju Trivedi

Repro steps:

drop database if exists ax1 cascade;
create database ax1;
use ax1;

CREATE TABLE tmp1 (
  v1 string , v2 string , v3 string )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
;

INSERT INTO tmp1
VALUES
  ('a', 'b', 'c1')
, ('a', 'b', 'c2')
, ('d', 'e', 'f')
, ('g', 'h', 'i')
;

CREATE TABLE tmp_grouped_by_one_col (
  v1 string , cnt__v2 int , cnt__v3 int )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
;

CREATE TABLE tmp_grouped_by_two_col (
  v1 string , v2 string , cnt__v3 int )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
;

CREATE TABLE tmp_grouped_by_all_col (
  v1 string , v2 string , v3 string )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
;

FROM tmp1
INSERT INTO tmp_grouped_by_one_col
  SELECT v1, count(distinct v2), count(distinct v3) GROUP BY v1
INSERT INTO tmp_grouped_by_all_col
  SELECT v1, v2, v3 GROUP BY v1, v2, v3
;

select 'tmp_grouped_by_one_col',count(*) from tmp_grouped_by_one_col
union all
select 'tmp_grouped_by_two_col',count(*) from tmp_grouped_by_two_col
union all
select 'tmp_grouped_by_all_col',count(*) from tmp_grouped_by_all_col;

select * from tmp_grouped_by_all_col;

The tmp_grouped_by_all_col table should have 4 records, but 7 records are loaded into it:

+----------------------------+----------------------------+----------------------------+
| tmp_grouped_by_all_col.v1  | tmp_grouped_by_all_col.v2  | tmp_grouped_by_all_col.v3  |
+----------------------------+----------------------------+----------------------------+
| a                          | b                          | b                          |
| a                          | c1                         | c1                         |
| a                          | c2                         | c2                         |
| d                          | e                          | e                          |
| d                          | f                          | f                          |
| g                          | h                          | h                          |
| g                          | i                          | i                          |
+----------------------------+----------------------------+----------------------------+
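The expected contents of both target tables can be derived from the four input rows without running Hive. The sketch below computes them in memory; the class and method names are hypothetical, and the code models only the two GROUP BY branches of the multi-insert, not Hive's execution plan.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MultiInsertExpectedSketch {
    // The rows inserted into tmp1 in the repro: (v1, v2, v3).
    static final List<String[]> ROWS = Arrays.asList(
        new String[] {"a", "b", "c1"},
        new String[] {"a", "b", "c2"},
        new String[] {"d", "e", "f"},
        new String[] {"g", "h", "i"});

    // GROUP BY v1, v2, v3: one output row per distinct (v1, v2, v3) triple.
    static Set<List<String>> groupByAllCols() {
        return ROWS.stream()
            .map(r -> Arrays.asList(r))
            .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // GROUP BY v1 with count(distinct v2) and count(distinct v3).
    // Each value is a two-element array: {distinct v2 count, distinct v3 count}.
    static Map<String, int[]> groupByV1() {
        Map<String, Set<String>> v2 = new TreeMap<>();
        Map<String, Set<String>> v3 = new TreeMap<>();
        for (String[] r : ROWS) {
            v2.computeIfAbsent(r[0], k -> new HashSet<>()).add(r[1]);
            v3.computeIfAbsent(r[0], k -> new HashSet<>()).add(r[2]);
        }
        Map<String, int[]> out = new TreeMap<>();
        for (String key : v2.keySet()) {
            out.put(key, new int[] {v2.get(key).size(), v3.get(key).size()});
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(groupByAllCols().size()); // 4
        for (Map.Entry<String, int[]> e : groupByV1().entrySet()) {
            System.out.println(e.getKey() + " " + Arrays.toString(e.getValue()));
        }
    }
}
```

Every input triple is already distinct, so the all-columns branch should contain exactly the four input rows, with v2 and v3 kept in their own columns.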
[jira] [Created] (HIVE-19255) Hive doesn't support column list specification in INSERT into statements with distribute by/Cluster by
Riju Trivedi created HIVE-19255:
---
Summary: Hive doesn't support column list specification in INSERT into statements with distribute by/Cluster by
Key: HIVE-19255
URL: https://issues.apache.org/jira/browse/HIVE-19255
Project: Hive
Issue Type: Bug
Components: Parser, Query Processor, SQL
Affects Versions: 1.2.0
Reporter: Riju Trivedi

INSERT into TABLE target_table_2 partition (col3) (col1, col2,col3)
SELECT col1,col2,col3
FROM source_table
DISTRIBUTE BY col1
SORT BY col1,col2;

This INSERT statement throws an error:

Error: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 4:14 Invalid table alias or column reference 'col1':

The query executes successfully with the workaround below:

INSERT into TABLE target_table_2 partition (col3) (col1, col2,col3)
select * From (SELECT col1, col2,col3
FROM source_table
DISTRIBUTE BY col1
SORT BY col1,col2) a;