[jira] [Created] (HIVE-24711) Hive metastore memory leak

2021-01-31 Thread LinZhongwei (Jira)
LinZhongwei created HIVE-24711:
--

 Summary: Hive metastore memory leak
 Key: HIVE-24711
 URL: https://issues.apache.org/jira/browse/HIVE-24711
 Project: Hive
  Issue Type: Bug
  Components: Hive, Metastore
Affects Versions: 3.1.0
Reporter: LinZhongwei


hdp version:3.1.5.31-1

hive version:3.1.0.3.1.5.31-1

hadoop version:3.1.1.3.1.5.31-1

We find that the Hive metastore has a memory leak if we set 
compactor.initiator.on to true.

If we disable this configuration, the memory leak disappears.

How can we resolve this problem?

Even if we set the heap size of the Hive metastore to 40 GB, the metastore 
service goes down with an OutOfMemoryError after about one month.
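
One possible stop-gap while the leak is investigated, offered purely as a sketch of the observation above (assuming the property meant is the standard hive.compactor.initiator.on, and noting that automatic compactions will no longer be initiated while it is off), is to disable the compaction Initiator on the affected metastore instance:
{code:xml}
<!-- hive-site.xml on the metastore instance that shows the leak -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>false</value>
</property>
{code}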



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24710) PTFRowContainer could be reading more blocks than needed

2021-01-31 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-24710:
---

 Summary: PTFRowContainer could be reading more blocks than needed
 Key: HIVE-24710
 URL: https://issues.apache.org/jira/browse/HIVE-24710
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Rajesh Balamohan


PTFRowContainer can read the same block repeatedly for rows in the first block. 
The default block size is around 25000, so for the first 25000 values of rowIdx 
it re-reads the block because of the ("rowIdx < currentReadBlockStartRow") condition.

{noformat}
  public Row getAt(int rowIdx) throws HiveException {
    int blockSize = getBlockSize();
    if ( rowIdx < currentReadBlockStartRow || rowIdx >= currentReadBlockStartRow + blockSize ) {
      readBlock(getBlockNum(rowIdx));
    }
    return getReadBlockRow(rowIdx - currentReadBlockStartRow);
  }
{noformat}
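
One possible direction, sketched here purely as an illustration (the currentReadBlockNum field is an assumption, not an existing member of PTFRowContainer), is to key the guard on the block number rather than on the start row:
{code:java}
  // Sketch only: re-read a block only when the requested row falls outside
  // the block that is currently loaded.
  public Row getAt(int rowIdx) throws HiveException {
    int blockNum = getBlockNum(rowIdx);
    if (blockNum != currentReadBlockNum) { // hypothetical field kept up to date by readBlock()
      readBlock(blockNum);
    }
    return getReadBlockRow(rowIdx - currentReadBlockStartRow);
  }
{code}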

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24709) Hive Stats compute on columns failing after new columns added to AVRO tables

2021-01-31 Thread Vignesh Ilangovan (Jira)
Vignesh Ilangovan created HIVE-24709:


 Summary: Hive Stats compute on columns failing after new columns 
added to AVRO tables
 Key: HIVE-24709
 URL: https://issues.apache.org/jira/browse/HIVE-24709
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.1
Reporter: Vignesh Ilangovan


ANALYZE TABLE <> COMPUTE STATISTICS FOR COLUMNS; 

FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.ColumnStatsTask

 

When an Avro non-partitioned table is updated with new columns in its AVSC file, 
COMPUTE STATISTICS on the table works fine, but COMPUTE STATISTICS FOR COLUMNS 
fails with the above error. As a temporary workaround we dropped and recreated 
the Hive Avro table; since it is an external table there is not much impact, 
but recreating the DDL every time is not the right option.

Note:

The MR job for COMPUTE STATISTICS FOR COLUMNS succeeds, but the final step 
returns code 1.

This appears to be a bug.

 

Steps to reproduce:

#1 Create an external Avro table pointing to an AVSC file

#2 Add the new columns to the AVSC file

#3 Run 'ANALYZE TABLE <> COMPUTE STATISTICS FOR COLUMNS;'

#4 The MR job succeeds, but the statement fails with return code 1 from 
org.apache.hadoop.hive.ql.exec.ColumnStatsTask (a repro sketch follows below)
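
A minimal repro sketch of these steps (the table name, location, and schema path below are hypothetical placeholders, not taken from the report):
{code:sql}
-- Step 1: external Avro table whose schema comes from an AVSC file
CREATE EXTERNAL TABLE events_avro
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/tmp/events_avro'
TBLPROPERTIES ('avro.schema.url'='hdfs:///tmp/schemas/events.avsc');

-- Step 2: add a new field to events.avsc outside of Hive

-- Steps 3 and 4: table-level stats succeed, column stats fail with return code 1
ANALYZE TABLE events_avro COMPUTE STATISTICS;
ANALYZE TABLE events_avro COMPUTE STATISTICS FOR COLUMNS;
{code}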

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24708) org.apache.thrift.transport.TTransportException: null;Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.StatsTask

2021-01-31 Thread tiki (Jira)
tiki created HIVE-24708:
---

 Summary: org.apache.thrift.transport.TTransportException: 
null;Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.StatsTask
 Key: HIVE-24708
 URL: https://issues.apache.org/jira/browse/HIVE-24708
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.1.2
 Environment: centos7

hadoop:3.1.4

hbase:2.2.6

mysql:5.7.32

hive:3.1.2

 
Reporter: tiki
 Fix For: 0.13.0
 Attachments: hive错误日志.txt

h2. 1. Original configuration

The following was configured in hive-site.xml:

 
{code:xml}
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hadoop-server-004:9083,hadoop-server-005:9083</value>
</property>
{code}
After that, inserting data in Hive prints an Error log entry, but the data is inserted successfully. Hive's hive.log file shows the following error:
{code:java}
org.apache.thrift.transport.TTransportException: null
..
2021-01-27T20:00:17,086 ERROR [HiveServer2-Background-Pool: Thread-510] 
exec.StatsTask: Failed to run stats task 
org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.thrift.transport.TTransportException
..
Caused by: org.apache.thrift.transport.TTransportException
..
2021-01-27T20:00:17,089 ERROR [HiveServer2-Background-Pool: Thread-510] 
operation.Operation: Error running hive query: 
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.StatsTask
..
{code}
See the attachment for the full error log.
h2. 2. Modified configuration

The configuration was changed to:

 
{code:xml}
<property>
  <name>hive.metastore.uris</name>
  <value></value>
</property>
{code}
After this change, data is inserted normally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24707) Apply Sane Default for Tez Containers as Last Resort

2021-01-31 Thread David Mollitor (Jira)
David Mollitor created HIVE-24707:
-

 Summary: Apply Sane Default for Tez Containers as Last Resort
 Key: HIVE-24707
 URL: https://issues.apache.org/jira/browse/HIVE-24707
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor


{code:java|title=DagUtils.java}
  public static Resource getContainerResource(Configuration conf) {
    int memory = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCONTAINERSIZE) > 0 ?
        HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCONTAINERSIZE) :
        conf.getInt(MRJobConfig.MAP_MEMORY_MB, MRJobConfig.DEFAULT_MAP_MEMORY_MB);
    int cpus = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCPUVCORES) > 0 ?
        HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCPUVCORES) :
        conf.getInt(MRJobConfig.MAP_CPU_VCORES, MRJobConfig.DEFAULT_MAP_CPU_VCORES);
    return Resource.newInstance(memory, cpus);
  }
{code}

If the Tez container size or vcores value is invalid ( <= 0 ), the code falls 
back to the MapReduce configurations; but if the MapReduce configurations also 
hold invalid values ( <= 0 ), they are accepted regardless, and this will cause 
failures down the road.

This code should also check the MapReduce values and fall back to the MapReduce 
default values if they are <= 0.

Some logging would also be nice here, reporting where the configuration values 
came from (a sketch follows below).
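
A rough sketch of that suggestion (assumptions: this is not the committed patch, and LOG stands for the class's existing SLF4J logger):
{code:java}
  // Sketch only: validate both the Tez and the MapReduce values, fall back to
  // the MapReduce defaults as a last resort, and log where each value came from.
  public static Resource getContainerResource(Configuration conf) {
    int memory = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCONTAINERSIZE);
    String memorySource = HiveConf.ConfVars.HIVETEZCONTAINERSIZE.varname;
    if (memory <= 0) {
      memory = conf.getInt(MRJobConfig.MAP_MEMORY_MB, MRJobConfig.DEFAULT_MAP_MEMORY_MB);
      memorySource = MRJobConfig.MAP_MEMORY_MB;
    }
    if (memory <= 0) {
      memory = MRJobConfig.DEFAULT_MAP_MEMORY_MB;
      memorySource = "MapReduce default";
    }

    int cpus = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCPUVCORES);
    String cpuSource = HiveConf.ConfVars.HIVETEZCPUVCORES.varname;
    if (cpus <= 0) {
      cpus = conf.getInt(MRJobConfig.MAP_CPU_VCORES, MRJobConfig.DEFAULT_MAP_CPU_VCORES);
      cpuSource = MRJobConfig.MAP_CPU_VCORES;
    }
    if (cpus <= 0) {
      cpus = MRJobConfig.DEFAULT_MAP_CPU_VCORES;
      cpuSource = "MapReduce default";
    }

    LOG.info("Tez container resources: memory={} MB (from {}), vcores={} (from {})",
        memory, memorySource, cpus, cpuSource);
    return Resource.newInstance(memory, cpus);
  }
{code}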
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)