[jira] [Created] (HIVE-24140) Improve materialized view authorization check

2020-09-09 Thread Vineet Garg (Jira)
Vineet Garg created HIVE-24140:
--

 Summary: Improve materialized view authorization check
 Key: HIVE-24140
 URL: https://issues.apache.org/jira/browse/HIVE-24140
 Project: Hive
  Issue Type: Improvement
  Components: Query Planning
Reporter: Vineet Garg


Currently (with HIVE-23454), the authorization check on each materialized view 
(MV) is done after MV rewriting, and the rewrite is rejected if the check fails 
for any MV.
This is inefficient because the authorization check runs after the rewrite. 
Ideally, this check should be done before the rewrite (and the rewrite skipped 
accordingly).

One approach is to check all MVs for the tables involved in the query. (This may 
cause the rewrite to be skipped even though the MV might not have been selected 
for the rewrite.)
Another approach is to cache MV privileges in the registry and refresh them 
periodically.
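
The second approach could look roughly like the sketch below. It is only an illustration of the caching idea, not Hive code: the class name MvPrivilegeCache, the load_authorized callback, and the TTL-based refresh are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Iterable, List, Set, Tuple


class MvPrivilegeCache:
    """Hypothetical per-user cache of readable materialized views (not a Hive class)."""

    def __init__(self,
                 load_authorized: Callable[[str], Set[str]],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        # load_authorized stands in for a real authorization lookup (assumption).
        self._load = load_authorized
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Set[str]]] = {}

    def authorized_views(self, user: str) -> Set[str]:
        """Return the cached set for the user, reloading it once the TTL has expired."""
        now = self._clock()
        entry = self._entries.get(user)
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, set(self._load(user)))
            self._entries[user] = entry
        return entry[1]

    def filter_candidates(self, user: str, candidates: Iterable[str]) -> List[str]:
        """Keep only the MVs the user may read, so the rewrite never considers the rest."""
        allowed = self.authorized_views(user)
        return [mv for mv in candidates if mv in allowed]
```

With a cache like this, candidate MVs would be filtered before the rewrite, at the cost of privileges being stale for up to one TTL.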



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24139) VectorGroupByOperator is not flushing hash table entries as needed

2020-09-09 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-24139:
---

 Summary: VectorGroupByOperator is not flushing hash table entries 
as needed
 Key: HIVE-24139
 URL: https://issues.apache.org/jira/browse/HIVE-24139
 Project: Hive
  Issue Type: Bug
Reporter: Mustafa Iman
Assignee: Mustafa Iman


https://issues.apache.org/jira/browse/HIVE-23975 introduced a bug where 
copyKey mutates some key wrappers while copying. This Jira is to fix it.



[jira] [Created] (HIVE-24138) Llap external client flow is broken due to netty shading

2020-09-09 Thread Shubham Chaurasia (Jira)
Shubham Chaurasia created HIVE-24138:


 Summary: Llap external client flow is broken due to netty shading
 Key: HIVE-24138
 URL: https://issues.apache.org/jira/browse/HIVE-24138
 Project: Hive
  Issue Type: Bug
  Components: llap
Reporter: Shubham Chaurasia


We shaded netty in hive-exec in HIVE-23073: 
https://issues.apache.org/jira/browse/HIVE-23073

This breaks the LLAP external client flow on the LLAP daemon side:
{code}
2020-09-09T18:22:13,413  INFO [TezTR-222977_4_0_0_0_0 (497418324441977_0004_0_00_00_0)] llap.LlapOutputFormat: Returning writer for: attempt_497418324441977_0004_0_00_00_0
2020-09-09T18:22:13,419 ERROR [TezTR-222977_4_0_0_0_0 (497418324441977_0004_0_00_00_0)] tez.MapRecordSource: java.lang.NoSuchMethodError: org.apache.arrow.memory.BufferAllocator.buffer(I)Lorg/apache/hive/io/netty/buffer/ArrowBuf;
    at org.apache.hadoop.hive.llap.WritableByteChannelAdapter.write(WritableByteChannelAdapter.java:96)
    at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:74)
    at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:57)
    at org.apache.arrow.vector.ipc.WriteChannel.writeIntLittleEndian(WriteChannel.java:89)
    at org.apache.arrow.vector.ipc.message.MessageSerializer.serialize(MessageSerializer.java:88)
    at org.apache.arrow.vector.ipc.ArrowWriter.ensureStarted(ArrowWriter.java:130)
    at org.apache.arrow.vector.ipc.ArrowWriter.writeBatch(ArrowWriter.java:102)
    at org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:85)
    at org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:46)
    at org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:137)
    at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
    at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:172)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:842)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{code}

The Arrow method signature mismatch happens mainly because Arrow contains some 
classes that are packaged under {{io.netty.buffer.*}}:
{code}
io.netty.buffer.ArrowBuf
io.netty.buffer.ExpandableByteBuf
io.netty.buffer.LargeBuffer
io.netty.buffer.MutableWrappedByteBuf
io.netty.buffer.PooledByteBufAllocatorL
io.netty.buffer.UnsafeDirectLittleEndian
{code}

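Since we have relocated netty, these classes have also been relocated to 
{{org.apache.hive.io.netty.buffer.*}}. As the {{NoSuchMethodError}} shows, the 
caller now looks for a {{BufferAllocator.buffer(int)}} that returns the 
relocated {{ArrowBuf}}, and the loaded {{BufferAllocator}} does not provide one.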
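
To make the mismatch concrete, the sketch below mimics what a package relocation does to a JVM method descriptor. The helper relocate_descriptor is hypothetical and only models the string rewrite; it is not part of any shading tool. The relocation prefix is taken from the error message above.

```python
def relocate_descriptor(descriptor: str, pattern: str, shaded: str) -> str:
    """Rewrite class references in a JVM descriptor the way a relocation rule would.

    Hypothetical helper for illustration: it only handles the textual form
    'Lpkg/Class;' and does not parse bytecode.
    """
    old = "L" + pattern.replace(".", "/") + "/"
    new = "L" + shaded.replace(".", "/") + "/"
    return descriptor.replace(old, new)


# Descriptor of buffer(int) as an unshaded Arrow jar would declare it.
declared = "(I)Lio/netty/buffer/ArrowBuf;"

# Descriptor that relocated calling code ends up referencing.
referenced = relocate_descriptor(declared, "io.netty", "org.apache.hive.io.netty")

# The JVM resolves methods by name plus exact descriptor, so a difference here
# surfaces as NoSuchMethodError at the call site.
mismatch = referenced != declared
```

Here referenced equals the descriptor reported in the NoSuchMethodError, while declared is what the unrelocated class offers.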



[jira] [Created] (HIVE-24137) Race condition when copying llap.tar.gz by multiple HSI

2020-09-09 Thread Attila Magyar (Jira)
Attila Magyar created HIVE-24137:


 Summary: Race condition when copying llap.tar.gz by multiple HSI
 Key: HIVE-24137
 URL: https://issues.apache.org/jira/browse/HIVE-24137
 Project: Hive
  Issue Type: Bug
  Components: llap
Reporter: Attila Magyar


When both HSI instances are started simultaneously, one of them fails to start.

This seems to happen because multiple HSI instances are started at the same 
time and there is a race condition when DFSClient tries to copy the llap tar 
package to HDFS.

Restarting them one after another resolves the issue, and a second restart 
attempt might also help. As a long-term fix, we need to change 
llap-server/src/main/resources/templates.py to retry copyFromLocal.
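
A retry around the copy could be sketched as follows. This is not the content of templates.py; the function names retry and copy_from_local, the number of attempts, and the linear backoff are assumptions chosen for the example. Whether a retry should overwrite a partially written target is left open here.

```python
import subprocess
import time


def retry(action, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call action() until it succeeds or the attempts are used up.

    The last failure is re-raised so the caller still sees the real error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            # Linear backoff gives a concurrently starting instance time to finish.
            sleep(delay_seconds * attempt)


def copy_from_local(local_path, hdfs_path):
    """Copy a local file to HDFS with the hdfs CLI; raises if the command fails."""
    subprocess.check_call(["hdfs", "dfs", "-copyFromLocal", local_path, hdfs_path])


# Example call (paths are placeholders, not taken from the issue):
# retry(lambda: copy_from_local("llap.tar.gz", "/path/on/hdfs"))
```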



[jira] [Created] (HIVE-24136) create table table_name as: the job succeeds, but the table is not created

2020-09-09 Thread paul (Jira)
paul created HIVE-24136:
---

 Summary: create table table_name as: the job succeeds, but the table is not created
 Key: HIVE-24136
 URL: https://issues.apache.org/jira/browse/HIVE-24136
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.2
Reporter: paul


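With Hive 3.1.2, creating a table via CTAS reports a successful execution 
status, but the table is not created. Searching the logs shows no entries for 
"metastore.HiveMetaStore: 22556: create_table: 
Table(tableName:t_nagent_trade_water_day_temp1061" or for "exec.Task: Moving 
data to directory".

The MySQL binlog was also checked: no create-table statement was executed 
against the metastore database.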

 

 

 

 

create table t_nagent_trade_water_day_temp1061 as
select a.trade_water_id,mrch_no,FIRST_REPORT_SUC_TIME,trade_amt,trade_date,create_time,trans_type,a.agent_code,mrch_type,level_four,level_four_name,level_three,level_three_name,
level_two,level_two_name,level_one,level_one_name from
(select t1.trade_water_id,t1.mrch_no,t1.trade_amt,t1.trade_date,t1.create_time,t1.trans_type,t1.agent_code,t2.level_four,t2.level_four_name,
t2.level_three,t2.level_three_name,t2.level_two,t2.level_two_name,t2.level_one,t2.level_one_name
from t_nagent_trade_water t1 left join agent_belong_temp1061 t2 on t1.agent_code=t2.level_four
where t2.level_four!='' and t1.trade_status='1'
union all
select t1.trade_water_id,t1.mrch_no,t1.trade_amt,t1.trade_date,t1.create_time,t1.trans_type,t1.agent_code,'' level_four,'' level_four_name,
t2.level_three,t2.level_three_name,t2.level_two,t2.level_two_name,t2.level_one,t2.level_one_name
from t_nagent_trade_water t1 left join
(select level_three,level_three_name,level_two,level_two_name,level_one,level_one_name from agent_belong_temp1061
group by level_three,level_three_name,level_two,level_two_name,level_one,level_one_name)
t2 on t1.agent_code=t2.level_three
where t2.level_three!='' and t1.trade_status='1'
union all
select t1.trade_water_id,t1.mrch_no,t1.trade_amt,t1.trade_date,t1.create_time,t1.trans_type,t1.agent_code,'' level_four,'' level_four_name,
'' level_three,'' level_three_name,t2.level_two,t2.level_two_name,t2.level_one,t2.level_one_name
from t_nagent_trade_water t1 left join
(select level_two,level_two_name,level_one,level_one_name from agent_belong_temp1061
group by level_two,level_two_name,level_one,level_one_name)
t2 on t1.agent_code=t2.level_two
where t2.level_two!='' and t1.trade_status='1'
union all
select t1.trade_water_id,t1.mrch_no,t1.trade_amt,t1.trade_date,t1.create_time,t1.trans_type,t1.agent_code,'' level_four,'' level_four_name,
'' level_three,'' level_three_name,'' level_two,'' level_two_name,t2.level_one,t2.level_one_name
from t_nagent_trade_water t1 left join
(select level_one,level_one_name from agent_belong_temp1061
group by level_one,level_one_name)
t2 on t1.agent_code=t2.level_one
where t2.level_one!='' and t1.trade_status='1'
) a left join t_nagent_merchant_incoming b on a.mrch_no=b.merc_no where platform_code='05' and
to_date(create_time)=from_unixtime(unix_timestamp("20200829",'MMdd'),'-MM-dd')



[jira] [Created] (HIVE-24135) Drop database doesn't delete directory in managed location

2020-09-09 Thread Karen Coppage (Jira)
Karen Coppage created HIVE-24135:


 Summary: Drop database doesn't delete directory in managed location
 Key: HIVE-24135
 URL: https://issues.apache.org/jira/browse/HIVE-24135
 Project: Hive
  Issue Type: Sub-task
Reporter: Karen Coppage
Assignee: Naveen Gangam


Repro:
Say the default managed location is managed/hive and the default external 
location is external/hive.
{code:sql}
create database db1; -- creates: external/hive/db1.db
create table db1.table1 (i int); -- creates: managed/hive/db1.db and managed/hive/db1.db/table1
drop database db1 cascade; -- removes: external/hive/db1.db and managed/hive/db1.db/table1
{code}
Problem: Directory managed/hive/db1.db remains.

Since HIVE-22995, databases have a managed location (managedLocationUri) and an 
external location (locationUri). I think the issue is that 
HiveMetaStore.HMSHandler#drop_database_core deletes only the database directory 
in the external location.
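
The expected behavior can be sketched on a local filesystem as below. This is only a model of "remove both directories", not the metastore implementation; drop_database_dirs and its parameters are hypothetical names.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def drop_database_dirs(location: Path, managed_location: Optional[Path]) -> List[Path]:
    """Remove the external and, if set, the managed database directory.

    Hypothetical stand-in for the cleanup a database drop is expected to do;
    returns the directories that were actually removed.
    """
    removed = []
    for path in (location, managed_location):
        if path is not None and path.exists():
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

In terms of the repro, this would be called with external/hive/db1.db and managed/hive/db1.db, so that neither directory remains after the drop.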



[jira] [Created] (HIVE-24134) Revert deletion of HiveStrictManagedMigration

2020-09-09 Thread Aasha Medhi (Jira)
Aasha Medhi created HIVE-24134:
--

 Summary: Revert deletion of HiveStrictManagedMigration 
 Key: HIVE-24134
 URL: https://issues.apache.org/jira/browse/HIVE-24134
 Project: Hive
  Issue Type: Task
Reporter: Aasha Medhi
Assignee: Aasha Medhi


Partial revert of https://issues.apache.org/jira/browse/HIVE-23995 to keep the 
migration code



[jira] [Created] (HIVE-24133) Hive query with Hbase storagehandler can give back incorrect results when predicate contains null check

2020-09-09 Thread Marton Bod (Jira)
Marton Bod created HIVE-24133:
-

 Summary: Hive query with Hbase storagehandler can give back 
incorrect results when predicate contains null check
 Key: HIVE-24133
 URL: https://issues.apache.org/jira/browse/HIVE-24133
 Project: Hive
  Issue Type: Bug
Reporter: Marton Bod


It has been observed that when the HBase storage handler is used and the table 
contains null values, Hive can return wrong query results, depending on which 
columns are selected and whether the WHERE clause predicate contains any null 
checks.

For example:

create 'default:hive_test', 'cf'
put 'default:hive_test', '1', 'cf:col1', 'val1'
put 'default:hive_test', '1', 'cf:col2', 'val2'

put 'default:hive_test', '2', 'cf:col1', 'val1_2'
put 'default:hive_test', '2', 'cf:col2', 'val2_2'

put 'default:hive_test', '3', 'cf:col1', 'val1_3'
put 'default:hive_test', '3', 'cf:col2', 'val2_3'
put 'default:hive_test', '3', 'cf:col3', 'val3_3'
put 'default:hive_test', '3', 'cf:col4', "\x00\x00\x00\x00\x00\x02\xC2"

put 'default:hive_test', '4', 'cf:col1', 'val1_4'
put 'default:hive_test', '4', 'cf:col2', 'val2_4'

scan 'default:hive_test'

= HIVE

CREATE EXTERNAL TABLE hbase_hive_test (
rowkey string,
col1 string,
col2 string,
col3 string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,cf:col1,cf:col2,cf:col3"
)
TBLPROPERTIES("hbase.table.name" = "default:hive_test");

query: select * from hbase_hive_test where col3 is null;

result:
Total MapReduce CPU Time Spent: 10 seconds 980 msec
OK
1 val1 val2 NULL
2 val1_2 val2_2 NULL
4 val1_4 val2_4 NULL

query: select rowkey from hbase_hive_test where col3 is null;

This does not produce any records.

However, select rowkey, col2 from hbase_hive_test where col3 is null;

This gives back the correct results again.
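
For reference, the expected semantics can be modeled in a few lines: the filter on col3 should be evaluated independently of which columns are projected. The sketch below uses the rows from the example; the select helper is hypothetical and has nothing to do with how Hive or the HBase storage handler evaluate the query.

```python
# Rows as Hive should see them through the column mapping (col4 is not mapped).
ROWS = [
    {"rowkey": "1", "col1": "val1", "col2": "val2", "col3": None},
    {"rowkey": "2", "col1": "val1_2", "col2": "val2_2", "col3": None},
    {"rowkey": "3", "col1": "val1_3", "col2": "val2_3", "col3": "val3_3"},
    {"rowkey": "4", "col1": "val1_4", "col2": "val2_4", "col3": None},
]


def select(rows, columns, where):
    """Apply the predicate to the full row first, then project the requested columns."""
    return [tuple(row[c] for c in columns) for row in rows if where(row)]


def col3_is_null(row):
    return row["col3"] is None


only_rowkey = select(ROWS, ["rowkey"], col3_is_null)
rowkey_and_col2 = select(ROWS, ["rowkey", "col2"], col3_is_null)
```

Under these semantics both projections return rows 1, 2 and 4, which is what the second query in the report fails to do.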


