[jira] [Created] (HIVE-24724) Create table with LIKE operator does not work correctly

2021-02-02 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24724:
-

 Summary: Create table with LIKE operator does not work correctly
 Key: HIVE-24724
 URL: https://issues.apache.org/jira/browse/HIVE-24724
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 4.0.0
Reporter: Rajkumar Singh


Steps to repro:

{code:java}
create table atable (id int, str1 string);
alter table atable add constraint pk_atable primary key (id) disable novalidate;

create table btable like atable;

{code}

{{describe formatted btable}} lacks the constraint information.

CreateTableLikeDesc does not set/fetch the constraints for the LIKE table:

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L13594-L13616

Nor does DDLTask fetch/set the constraints for the table:

https://github.com/apache/hive/blob/5ba3dfcb6470ff42c58a3f95f0d5e72050274a42/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/like/CreateTableLikeOperation.java#L58-L83




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24723) Use ExecutorService in TezSessionPool

2021-02-02 Thread David Mollitor (Jira)
David Mollitor created HIVE-24723:
-

 Summary: Use ExecutorService in TezSessionPool
 Key: HIVE-24723
 URL: https://issues.apache.org/jira/browse/HIVE-24723
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


Currently there is some wonky home-made thread pooling going on in 
{{TezSessionPool}}. Replace it with some JDK/Guava goodness.
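For reference, a minimal JDK-only sketch of the kind of replacement meant here, built on {{ExecutorService}}. The session task is a hypothetical stand-in ("session-<i>" strings), not the real TezSessionPool API:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SessionPoolSketch {
    // Open n "sessions" in parallel on a fixed-size pool and return them
    // in submission order. A real version would submit Tez session startup
    // tasks instead of building strings.
    static List<String> openAll(int n) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, n));
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int id = i;
                pending.add(pool.submit(() -> "session-" + id));
            }
            List<String> sessions = new ArrayList<>();
            for (Future<String> f : pending) {
                sessions.add(f.get()); // blocks until that session is ready
            }
            return sessions;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(openAll(4)); // [session-0, session-1, session-2, session-3]
    }
}
{code}

The pool owns thread lifecycle and error propagation (via {{Future.get()}}), which is exactly the bookkeeping a hand-rolled pool has to reimplement.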





[jira] [Created] (HIVE-24722) LLAP cache hydration

2021-02-02 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24722:
--

 Summary: LLAP cache hydration
 Key: HIVE-24722
 URL: https://issues.apache.org/jira/browse/HIVE-24722
 Project: Hive
  Issue Type: Improvement
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Provide a way to save and reload the contents of the cache in the LLAP daemons.





[jira] [Created] (HIVE-24721) Align CacheWriter bufferSize with Llap max alloc

2021-02-02 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-24721:
-

 Summary: Align CacheWriter bufferSize with Llap max alloc
 Key: HIVE-24721
 URL: https://issues.apache.org/jira/browse/HIVE-24721
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis


Before bumping to ORC 1.6, the LLAP_ALLOCATOR_MAX_ALLOC value was also used as 
the ORC CacheWriter buffer size.

As per ORC-238, the max bufferSize argument can be up to 2^(3*8 - 1), i.e., 
less than 8MB, and since we enforce the size to be a power of 2, the next 
available size is 4MB.

In HIVE-23553 we decoupled the two configurations (LLAP max alloc and 
CacheWriter buffer size), with the former being 16MB and the latter 4MB -- this 
ticket is to investigate whether these two configs need to converge.

More details: 
https://github.com/apache/hive/pull/1823#pullrequestreview-575698916
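The power-of-2 rounding described above can be checked with a small JDK-only sketch. The limit value takes the bound quoted from ORC-238 (just under 8MB); class and method names are illustrative, not Hive code:

{code:java}
public class BufferSizeSketch {
    // Largest power of two that fits under the given buffer-size bound.
    static int alignDown(int limit) {
        return Integer.highestOneBit(limit);
    }

    public static void main(String[] args) {
        int orcLimit = (1 << 23) - 1;    // just under 8MB, per ORC-238
        int bufferSize = alignDown(orcLimit);
        System.out.println(bufferSize);  // 4194304 = 4MB
    }
}
{code}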





[jira] [Created] (HIVE-24720) Got error while join iceberg table and hive table

2021-02-02 Thread Xingxing Di (Jira)
Xingxing Di created HIVE-24720:
--

 Summary: Got error while join iceberg table and hive table
 Key: HIVE-24720
 URL: https://issues.apache.org/jira/browse/HIVE-24720
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers, StorageHandler
Affects Versions: 2.0.1
 Environment: Iceberg : 0.11

hive : 2.0.1

hadoop : 2.7.2

JDK : 1.8

 
Reporter: Xingxing Di


We got an error while joining an Iceberg table and a Hive table. Most of the 
mappers succeed, but some mappers fail with a `cannot find field` error:
{code:java}
Caused by: java.lang.RuntimeException: cannot find field log_src from [org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector$IcebergRecordStructField@8838736]
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:442)
    at org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector.getStructFieldRef(IcebergRecordObjectInspector.java:78)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:139)
    at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:980)
    at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1006)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:77)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:457)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:365)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:457)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:365)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:498)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:115)
    ... 22 more

{code}

{code:java}
2021-02-02 15:57:35,036 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: Paths:/tmp/hive/flink/25369d0b-110e-4eca-8553-f2f3953f0303/hive_2021-02-02_15-56-22_060_2459779186626852210-1/-mr-10005/0/emptyFile:0+0 InputFormatClass: org.apache.hadoop.mapred.TextInputFormat
{code}

*The split file is from the `hive table` (which is an empty table), but the 
mapper still uses `org.apache.iceberg.mr.hive.HiveIcebergSerDe`, and the table 
properties are also from the Iceberg table.*

It seems like Hive uses the wrong SerDe to deserialize the data. I am not an 
expert in Hive; I hope someone could give me some clues :).

SQL:
{code:java}
select count(1),count(a.dt),count(b.dt)
from (
select dt,concat_ws('###',wx_source, log_src, dt, hour) as str
from flink_fdm_iceberg.iceberg_table1
where dt='2021-01-29' and hour='16') a
full outer join (
select dt,concat_ws('###',wx_source, log_src, dt, hour) as str
from flink_fdm_iceberg.hive_table1
where dt='2021-01-29' and hour='16') b on a.str=b.str
where a.str is null or b.str is null;
{code}

The full log from the failed mapper:
{code:java}
2021-02-02 15:57:30,060 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2021-02-02 15:57:30,060 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2021-02-02 15:57:30,118 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2021-02-02 15:57:30,118 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2021-02-02 15:57:30,131 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2021-02-02 15:57:30,131 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1605975810748_1427, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@4b2bac3f)
2021-02-02 15:57:30,167 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ns2, Ident: (HDFS_DELEGATION_TOKEN token 151282235 for flink)
2021-02-02 15:57:30,168 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ns1, Ident: (HDFS_DELEGATION_TOKEN token 151249944 for flink)
2021-02-02 15:57:30,168 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ns4, Ident: (HDFS_DELEGATION_TOKEN token 151235141 for flink)
2021-02-02 15:57:30,169 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
{code}

[jira] [Created] (HIVE-24719) There's a getAcidState() without impersonation in compactor.Worker

2021-02-02 Thread Karen Coppage (Jira)
Karen Coppage created HIVE-24719:


 Summary: There's a getAcidState() without impersonation in 
compactor.Worker
 Key: HIVE-24719
 URL: https://issues.apache.org/jira/browse/HIVE-24719
 Project: Hive
  Issue Type: Improvement
Reporter: Karen Coppage


In Initiator and Cleaner, getAcidState is called by a proxy user (the 
table/partition dir owner) because the HS2 user might not have permission to 
list the files. In Worker getAcidState is not called by a proxy user.

It's potentially a simple fix.
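The fix would follow the same doAs shape Initiator and Cleaner already use: run the listing inside an action executed as the table/partition dir owner. A JDK-only sketch of that pattern follows; the {{doAs}} helper and {{getAcidState}} stub are hypothetical stand-ins for Hadoop's UserGroupInformation and Hive's AcidUtils, not the real APIs:

{code:java}
import java.util.concurrent.Callable;

public class ProxyListingSketch {
    // Hypothetical stand-in for UserGroupInformation.doAs: run the action
    // under the given user's credentials. This sketch only shows the shape;
    // it does not actually switch credentials.
    static <T> T doAs(String proxyUser, Callable<T> action) throws Exception {
        return action.call();
    }

    // Placeholder for the directory listing the HS2 user may lack rights for.
    static String getAcidState(String dir) {
        return "acid state of " + dir;
    }

    public static void main(String[] args) throws Exception {
        // Worker should mirror Initiator/Cleaner: list as the dir owner.
        String state = doAs("tableOwner", () -> getAcidState("/warehouse/t/p=1"));
        System.out.println(state);
    }
}
{code}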


