[jira] [Created] (HIVE-24140) Improve materialized view authorization check
Vineet Garg created HIVE-24140:

Summary: Improve materialized view authorization check
Key: HIVE-24140
URL: https://issues.apache.org/jira/browse/HIVE-24140
Project: Hive
Issue Type: Improvement
Components: Query Planning
Reporter: Vineet Garg

Currently (with HIVE-23454), the authorization check on each materialized view (MV) is done after MV rewriting, and the rewrite is rejected if the check fails for any MV. This is inefficient because the check only runs once the rewrite has already been done. Ideally the check should be done before the rewrite (and the rewrite skipped accordingly).

One approach is to check all MVs for the tables involved in the query. (This may cause the rewrite to be skipped even though the failing MV might not have been selected for the rewrite.)

Another approach is to cache MV privileges in the registry and refresh them periodically.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
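The second approach in the message above (cache MV privileges, refresh periodically) could look roughly like the sketch below. It is only an illustration of the idea, not Hive code: `MvPrivilegeCache`, `authorized_mvs`, the `check` callback and the TTL-based refresh are all hypothetical names and choices. It also assumes that unauthorized MVs are simply dropped from the candidate set before the rewrite, which is one possible reading of "skip rewrite accordingly".

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


class MvPrivilegeCache:
    """Hypothetical cache of (user, mv) -> allowed, re-checked after ttl_seconds."""

    def __init__(self, check: Callable[[str, str], bool], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._check = check      # the real (expensive) authorization call
        self._ttl = ttl_seconds  # how long a cached answer stays valid
        self._clock = clock      # injectable so the refresh can be tested
        self._entries: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def is_allowed(self, user: str, mv: str) -> bool:
        now = self._clock()
        cached = self._entries.get((user, mv))
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]
        # Missing or stale entry: ask the authorizer again and remember the answer.
        allowed = self._check(user, mv)
        self._entries[(user, mv)] = (allowed, now)
        return allowed


def authorized_mvs(user: str, candidates: Iterable[str],
                   cache: MvPrivilegeCache) -> List[str]:
    """Filter candidate MVs *before* rewriting, so no rewrite is thrown away later."""
    return [mv for mv in candidates if cache.is_allowed(user, mv)]
```

The trade-off of a time-based refresh is that a revoked privilege can stay usable until the cached entry expires, so the TTL bounds how stale an authorization decision may be.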
[jira] [Created] (HIVE-24139) VectorGroupByOperator is not flushing hash table entries as needed
Mustafa Iman created HIVE-24139:

Summary: VectorGroupByOperator is not flushing hash table entries as needed
Key: HIVE-24139
URL: https://issues.apache.org/jira/browse/HIVE-24139
Project: Hive
Issue Type: Bug
Reporter: Mustafa Iman
Assignee: Mustafa Iman

https://issues.apache.org/jira/browse/HIVE-23975 introduced a bug where copyKey mutates some key wrappers while copying. This Jira is to fix it.
[jira] [Created] (HIVE-24138) Llap external client flow is broken due to netty shading
Shubham Chaurasia created HIVE-24138:

Summary: Llap external client flow is broken due to netty shading
Key: HIVE-24138
URL: https://issues.apache.org/jira/browse/HIVE-24138
Project: Hive
Issue Type: Bug
Components: llap
Reporter: Shubham Chaurasia

We shaded netty in hive-exec in https://issues.apache.org/jira/browse/HIVE-23073

This breaks the LLAP external client flow on the LLAP daemon side:

{code}
2020-09-09T18:22:13,413 INFO [TezTR-222977_4_0_0_0_0 (497418324441977_0004_0_00_00_0)] llap.LlapOutputFormat: Returning writer for: attempt_497418324441977_0004_0_00_00_0
2020-09-09T18:22:13,419 ERROR [TezTR-222977_4_0_0_0_0 (497418324441977_0004_0_00_00_0)] tez.MapRecordSource: java.lang.NoSuchMethodError: org.apache.arrow.memory.BufferAllocator.buffer(I)Lorg/apache/hive/io/netty/buffer/ArrowBuf;
at org.apache.hadoop.hive.llap.WritableByteChannelAdapter.write(WritableByteChannelAdapter.java:96)
at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:74)
at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:57)
at org.apache.arrow.vector.ipc.WriteChannel.writeIntLittleEndian(WriteChannel.java:89)
at org.apache.arrow.vector.ipc.message.MessageSerializer.serialize(MessageSerializer.java:88)
at org.apache.arrow.vector.ipc.ArrowWriter.ensureStarted(ArrowWriter.java:130)
at org.apache.arrow.vector.ipc.ArrowWriter.writeBatch(ArrowWriter.java:102)
at org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:85)
at org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:46)
at org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:137)
at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:172)
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:842)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

The Arrow method signature mismatch happens mainly because Arrow contains some classes that are packaged under {{io.netty.buffer.*}}:

{code}
io.netty.buffer.ArrowBuf
io.netty.buffer.ExpandableByteBuf
io.netty.buffer.LargeBuffer
io.netty.buffer.MutableWrappedByteBuf
io.netty.buffer.PooledByteBufAllocatorL
io.netty.buffer.UnsafeDirectLittleEndian
{code}

Since we have relocated netty, these classes have also been relocated to {{org.apache.hive.io.netty.buffer.*}}, so references to them (such as the ArrowBuf return type in the error above) no longer match the unrelocated Arrow classes.
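To make the mismatch concrete, here is a small sketch of what a package relocation does to a JVM method descriptor. It is a toy model, not the shading tool itself: `relocate_descriptor` is a hypothetical helper, and the relocation rule `io.netty` to `org.apache.hive.io.netty` is inferred from the relocated package named in the report rather than taken from the actual build configuration.

```python
import re


def relocate_descriptor(descriptor: str, pattern: str, shaded_pattern: str) -> str:
    """Rewrite class references in a JVM method descriptor the way a package
    relocation would: 'L<pattern>/...;' becomes 'L<shaded_pattern>/...;'."""
    src = pattern.replace(".", "/")
    dst = shaded_pattern.replace(".", "/")

    def _swap(match):
        name = match.group(1)
        # Only rewrite classes that live in (or under) the relocated package.
        if name == src or name.startswith(src + "/"):
            name = dst + name[len(src):]
        return "L" + name + ";"

    # Object types in a descriptor have the form L<internal/class/Name>;
    return re.sub(r"L([^;]+);", _swap, descriptor)


# Descriptor that an unrelocated BufferAllocator.buffer(int) would expose.
provided = "(I)Lio/netty/buffer/ArrowBuf;"
# Descriptor that relocated calling code ends up asking for.
requested = relocate_descriptor(provided, "io.netty", "org.apache.hive.io.netty")
print(requested)
```

The JVM resolves a method by its name together with its exact descriptor, so a caller asking for the relocated return type cannot link against a class that only declares the original one; that is the `NoSuchMethodError` in the trace.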
[jira] [Created] (HIVE-24137) Race condition when copying llap.tar.gz by multiple HSI
Attila Magyar created HIVE-24137:

Summary: Race condition when copying llap.tar.gz by multiple HSI
Key: HIVE-24137
URL: https://issues.apache.org/jira/browse/HIVE-24137
Project: Hive
Issue Type: Bug
Components: llap
Reporter: Attila Magyar

When both HSI instances are started simultaneously, one of them fails to start.

This appears to be a race condition: because multiple HSIs start at the same time, their DFSClients try to copy the LLAP tar package to HDFS concurrently.

Restarting the instances one after another resolves the issue, and a second restart attempt may also help. As a long-term fix, we need to change llap-server/src/main/resources/templates.py to retry copyFromLocal.
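A retry around copyFromLocal, as proposed above, might be sketched like this. This is not the code in templates.py (how that file invokes the copy is not shown in the report): `copy_with_retry`, the attempt count, the linear backoff and the use of `hdfs dfs -copyFromLocal -f` are all assumptions made for illustration.

```python
import subprocess
import time


def copy_with_retry(src, dst, attempts=3, delay_seconds=2.0,
                    run=subprocess.call, sleep=time.sleep):
    """Run copyFromLocal up to `attempts` times; return the attempt that succeeded.

    `run` and `sleep` are injectable so the retry logic can be exercised
    without a Hadoop installation.
    """
    # -f overwrites an existing destination (assumed acceptable here, since
    # every HSI would be uploading the same tarball).
    cmd = ["hdfs", "dfs", "-copyFromLocal", "-f", src, dst]
    for attempt in range(1, attempts + 1):
        if run(cmd) == 0:
            return attempt
        if attempt < attempts:
            # Back off a little longer each time so that two instances
            # which collided once are less likely to collide again.
            sleep(delay_seconds * attempt)
    raise RuntimeError("copyFromLocal failed after %d attempts: %s -> %s"
                       % (attempts, src, dst))
```

A retry only narrows the window; whether it fully removes the race depends on how the concurrent writers interact in HDFS, which the report does not detail.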
[jira] [Created] (HIVE-24136) create table table_name as: job succeeds but the table is not created
paul created HIVE-24136:

Summary: create table table_name as: job succeeds but the table is not created
Key: HIVE-24136
URL: https://issues.apache.org/jira/browse/HIVE-24136
Project: Hive
Issue Type: Bug
Affects Versions: 3.1.2
Reporter: paul

Hive version 3.1.2. When a table is created with CTAS, the execution status is reported as successful but the table is not created.

Searching the logs, I found no entries for
metastore.HiveMetaStore: 22556: create_table: Table(tableName:t_nagent_trade_water_day_temp1061
or for
exec.Task: Moving data to directory

I also checked the MySQL binlog: no create-table statement was executed against the metastore database.

create table t_nagent_trade_water_day_temp1061 as
select a.trade_water_id,mrch_no,FIRST_REPORT_SUC_TIME,trade_amt,trade_date,create_time,trans_type,a.agent_code,mrch_type,level_four,level_four_name,level_three,level_three_name,
level_two,level_two_name,level_one,level_one_name
from
(select t1.trade_water_id,t1.mrch_no,t1.trade_amt,t1.trade_date,t1.create_time,t1.trans_type,t1.agent_code,t2.level_four,t2.level_four_name,
t2.level_three,t2.level_three_name,t2.level_two,t2.level_two_name,t2.level_one,t2.level_one_name
from t_nagent_trade_water t1
left join agent_belong_temp1061 t2 on t1.agent_code=t2.level_four
where t2.level_four!='' and t1.trade_status='1'
union all
select t1.trade_water_id,t1.mrch_no,t1.trade_amt,t1.trade_date,t1.create_time,t1.trans_type,t1.agent_code,'' level_four,'' level_four_name,
t2.level_three,t2.level_three_name,t2.level_two,t2.level_two_name,t2.level_one,t2.level_one_name
from t_nagent_trade_water t1
left join (select level_three,level_three_name,level_two,level_two_name,level_one,level_one_name from agent_belong_temp1061
group by level_three,level_three_name,level_two,level_two_name,level_one,level_one_name) t2 on t1.agent_code=t2.level_three
where t2.level_three!='' and t1.trade_status='1'
union all
select t1.trade_water_id,t1.mrch_no,t1.trade_amt,t1.trade_date,t1.create_time,t1.trans_type,t1.agent_code,'' level_four,'' level_four_name,
'' level_three,'' level_three_name,t2.level_two,t2.level_two_name,t2.level_one,t2.level_one_name
from t_nagent_trade_water t1
left join (select level_two,level_two_name,level_one,level_one_name from agent_belong_temp1061
group by level_two,level_two_name,level_one,level_one_name) t2 on t1.agent_code=t2.level_two
where t2.level_two!='' and t1.trade_status='1'
union all
select t1.trade_water_id,t1.mrch_no,t1.trade_amt,t1.trade_date,t1.create_time,t1.trans_type,t1.agent_code,'' level_four,'' level_four_name,
'' level_three,'' level_three_name,'' level_two,'' level_two_name,t2.level_one,t2.level_one_name
from t_nagent_trade_water t1
left join (select level_one,level_one_name from agent_belong_temp1061
group by level_one,level_one_name) t2 on t1.agent_code=t2.level_one
where t2.level_one!='' and t1.trade_status='1'
) a
left join t_nagent_merchant_incoming b on a.mrch_no=b.merc_no
where platform_code='05' and to_date(create_time)=from_unixtime(unix_timestamp("20200829",'MMdd'),'-MM-dd')
[jira] [Created] (HIVE-24135) Drop database doesn't delete directory in managed location
Karen Coppage created HIVE-24135:

Summary: Drop database doesn't delete directory in managed location
Key: HIVE-24135
URL: https://issues.apache.org/jira/browse/HIVE-24135
Project: Hive
Issue Type: Sub-task
Reporter: Karen Coppage
Assignee: Naveen Gangam

Repro: say the default managed location is managed/hive and the default external location is external/hive.

{code:java}
create database db1; -- creates: external/hive/db1.db
create table db1.table1 (i int); -- creates: managed/hive/db1.db and managed/hive/db1.db/table1
drop database db1 cascade; -- removes: external/hive/db1.db and managed/hive/db1.db/table1
{code}

Problem: Directory managed/hive/db1.db remains.

Since HIVE-22995, databases have a managed location (managedLocationUri) and an external location (locationUri). I think the issue is that HiveMetaStore.HMSHandler#drop_database_core deletes only the db directory in the external location.
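The behavior the report expects can be stated as a tiny model: on drop, both of the database's directories are candidates for deletion. The sketch below is not metastore code; `database_dirs_to_delete` and the plain-dict representation of a database are hypothetical, and only the field names locationUri and managedLocationUri come from the report.

```python
from typing import Dict, List, Optional


def database_dirs_to_delete(db: Dict[str, Optional[str]]) -> List[str]:
    """Return every directory a 'drop database' should remove for this db."""
    dirs: List[str] = []
    # External location first, then the managed one (set since HIVE-22995).
    for field in ("locationUri", "managedLocationUri"):
        path = db.get(field)
        # Skip unset locations and avoid listing the same path twice.
        if path and path not in dirs:
            dirs.append(path)
    return dirs


db1 = {
    "locationUri": "external/hive/db1.db",
    "managedLocationUri": "managed/hive/db1.db",
}
print(database_dirs_to_delete(db1))
```

With the repro above, this yields both external/hive/db1.db and managed/hive/db1.db, whereas the reported behavior corresponds to handling only the first.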
[jira] [Created] (HIVE-24134) Revert deletion of HiveStrictManagedMigration
Aasha Medhi created HIVE-24134:

Summary: Revert deletion of HiveStrictManagedMigration
Key: HIVE-24134
URL: https://issues.apache.org/jira/browse/HIVE-24134
Project: Hive
Issue Type: Task
Reporter: Aasha Medhi
Assignee: Aasha Medhi

Partial revert of https://issues.apache.org/jira/browse/HIVE-23995 to keep the migration code.
[jira] [Created] (HIVE-24133) Hive query with HBase storage handler can give back incorrect results when predicate contains null check
Marton Bod created HIVE-24133:

Summary: Hive query with HBase storage handler can give back incorrect results when predicate contains null check
Key: HIVE-24133
URL: https://issues.apache.org/jira/browse/HIVE-24133
Project: Hive
Issue Type: Bug
Reporter: Marton Bod

It has been observed that when the HBase storage handler is used and the table contains null values, Hive can return wrong query results, depending on which columns are selected and whether the where clause predicate contains any null checks.

For example, in the HBase shell:

create 'default:hive_test', 'cf'
put 'default:hive_test', '1', 'cf:col1', 'val1'
put 'default:hive_test', '1', 'cf:col2', 'val2'
put 'default:hive_test', '2', 'cf:col1', 'val1_2'
put 'default:hive_test', '2', 'cf:col2', 'val2_2'
put 'default:hive_test', '3', 'cf:col1', 'val1_3'
put 'default:hive_test', '3', 'cf:col2', 'val2_3'
put 'default:hive_test', '3', 'cf:col3', 'val3_3'
put 'default:hive_test', '3', 'cf:col4', "\x00\x00\x00\x00\x00\x02\xC2"
put 'default:hive_test', '4', 'cf:col1', 'val1_4'
put 'default:hive_test', '4', 'cf:col2', 'val2_4'
scan 'default:hive_test'

In Hive:

CREATE EXTERNAL TABLE hbase_hive_test (
rowkey string,
col1 string,
col2 string,
col3 string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,cf:col1,cf:col2,cf:col3"
)
TBLPROPERTIES("hbase.table.name" = "default:hive_test");

query: select * from hbase_hive_test where col3 is null;

result:
Total MapReduce CPU Time Spent: 10 seconds 980 msec
OK
1 val1 val2 NULL
2 val1_2 val2_2 NULL
4 val1_4 val2_4 NULL

query: select rowkey from hbase_hive_test where col3 is null;

This does not produce any records, although it should return rows 1, 2 and 4.

However, select rowkey, col2 from hbase_hive_test where col3 is null;

This gives back the correct results again.
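As a reference for what the three queries should return, the sample data can be modeled in a few lines. This is only a model of the expected SQL semantics (filter on the full record first, then project); it says nothing about how Hive or the HBase storage handler actually evaluates the query, and `select` and `MAPPED_COLUMNS` are names made up for this sketch.

```python
# Cells written by the HBase shell commands in the report, keyed by row key.
rows = {
    "1": {"col1": "val1", "col2": "val2"},
    "2": {"col1": "val1_2", "col2": "val2_2"},
    "3": {"col1": "val1_3", "col2": "val2_3", "col3": "val3_3",
          "col4": b"\x00\x00\x00\x00\x00\x02\xc2"},
    "4": {"col1": "val1_4", "col2": "val2_4"},
}

# Columns exposed by the Hive table definition (col4 is not mapped).
MAPPED_COLUMNS = ("col1", "col2", "col3")


def select(columns, predicate):
    """Evaluate the predicate on the full mapped record, then project."""
    result = []
    for rowkey in sorted(rows):
        record = {"rowkey": rowkey}
        for name in MAPPED_COLUMNS:
            record[name] = rows[rowkey].get(name)  # missing cell -> NULL (None)
        if predicate(record):
            result.append(tuple(record[c] for c in columns))
    return result


def col3_is_null(record):
    return record["col3"] is None


print(select(["rowkey"], col3_is_null))
print(select(["rowkey", "col2"], col3_is_null))
```

In this model the set of matching rows (1, 2 and 4) is the same whichever columns are projected, which is the property the second query in the report violates.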