[jira] [Commented] (HIVE-22981) DataFileReader is not closed in AvroGenericRecordReader#extractWriterTimezoneFromMetadata

2020-12-05 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244663#comment-17244663
 ] 

Yuming Wang commented on HIVE-22981:


[~sunchao] Backport this to branch-2.3?

> DataFileReader is not closed in 
> AvroGenericRecordReader#extractWriterTimezoneFromMetadata
> -------------------------------------------------------------------------
>
> Key: HIVE-22981
> URL: https://issues.apache.org/jira/browse/HIVE-22981
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22981.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Method looks like:
> {code}
>   private ZoneId extractWriterTimezoneFromMetadata(JobConf job, FileSplit split,
>       GenericDatumReader gdr) throws IOException {
>     if (job == null || gdr == null || split == null || split.getPath() == null) {
>       return null;
>     }
>     try {
>       DataFileReader dataFileReader =
>           new DataFileReader(new FsInput(split.getPath(), job), gdr);
>       [...return...]
>       }
>     } catch (IOException e) {
>       // Can't access metadata, carry on.
>     }
>     return null;
>   }
> {code}
> The DataFileReader is never closed, which can cause a memory leak. We need a 
> try-with-resources here.
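>
> A minimal sketch of the fix with try-with-resources (illustrative only; names 
> follow the snippet above, not necessarily the committed patch):
> {code:java}
> try (DataFileReader<Object> dataFileReader =
>          new DataFileReader<>(new FsInput(split.getPath(), job), gdr)) {
>   // ... read the writer timezone from the file metadata and return it ...
>   // the reader is closed automatically on every exit path
> } catch (IOException e) {
>   // Can't access metadata, carry on.
> }
> return null;
> {code}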



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24305) avro decimal schema is not properly populating scale/precision if value is enclosed in quote

2020-12-05 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244661#comment-17244661
 ] 

Yuming Wang commented on HIVE-24305:


[~sunchao] Could we backport this to branch-2.3?
{code:sql}
spark-sql>
 >
 > CREATE TABLE test_quoted_scale_precision STORED AS AVRO 
TBLPROPERTIES 
('avro.schema.literal'='{"type":"record","name":"DecimalTest","namespace":"com.example.test","fields":[{"name":"Decimal24_6","type":["null",{"type":"bytes","logicalType":"decimal","precision":24,"scale":"6"}]}]}');
spark-sql> desc test_quoted_scale_precision;
decimal24_6 decimal(24,0)
spark-sql>
{code}

> avro decimal schema is not properly populating scale/precision if value is 
> enclosed in quote
> ---------------------------------------------------------------------------
>
> Key: HIVE-24305
> URL: https://issues.apache.org/jira/browse/HIVE-24305
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java}
> CREATE TABLE test_quoted_scale_precision STORED AS AVRO TBLPROPERTIES 
> ('avro.schema.literal'='{"type":"record","name":"DecimalTest","namespace":"com.example.test","fields":[{"name":"Decimal24_6","type":["null",{"type":"bytes","logicalType":"decimal","precision":24,"scale":"6"}]}]}');
>  
> desc test_quoted_scale_precision;
> // current output
> decimal24_6 decimal(24,0)
> // expected output
> decimal24_6 decimal(24,6){code}
>  
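>
> For comparison, a sketch of the same table with the scale left unquoted (the 
> table name here is illustrative; an integer scale is already parsed correctly):
> {code:sql}
> CREATE TABLE test_unquoted_scale_precision STORED AS AVRO TBLPROPERTIES 
> ('avro.schema.literal'='{"type":"record","name":"DecimalTest","namespace":"com.example.test","fields":[{"name":"Decimal24_6","type":["null",{"type":"bytes","logicalType":"decimal","precision":24,"scale":6}]}]}');
> desc test_unquoted_scale_precision;
> -- decimal24_6 decimal(24,6)
> {code}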



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24493) after upgrading Hive from 2.1.1 to 2.3.7, log keeps printing "Hive Schema version 2.3.0 does not match metastore's schema version 2.1.0"

2020-12-05 Thread Yongle Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongle Zhang updated HIVE-24493:

Description: 
We tried to upgrade a single-node Hive deployment from 2.1.1 to 2.3.7, using 
Derby as the metastore database. After the upgrade, the Hive log keeps printing 
the following error message repeatedly: 

 
{code:java}
2020-12-05T02:47:26,573 ERROR [main] metastore.RetryingHMSHandler: HMSHandler 
Fatal error: MetaException(message:Hive Schema version 2.3.0 does not match 
metastore's schema version 2.1.0 Metastore is not upgraded or corrupt)
  at 
org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7825)
  at 
org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7788)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
  at com.sun.proxy.$Proxy34.verifySchema(Unknown Source)
  at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:595)
  at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:588)
  at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:655)
  at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:431)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
  at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
  at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
  at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
  at 
org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6902)
  at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
  at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1707)
  at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
  at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
  at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
  at 
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3600)
  at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3652)
  at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3632)
  at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3894)
  at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:248)
  at 
org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
  at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:388)
  at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332)
  at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312)
  at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
  at 
org.apache.hadoop.hive.ql.session.SessionState.setAuthorizerV2Config(SessionState.java:917)
  at 
org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:881)
  at 
org.apache.hadoop.hive.ql.session.SessionState.applyAuthorizationPolicy(SessionState.java:1687)
  at 
org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:130)
  at org.apache.hive.service.cli.CLIService.init(CLIService.java:114)
  at org.apache.hive.service.CompositeService.init(CompositeService.java:59)
  at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:142)
  at 
org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:607)
  at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:100)
  at 
org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:855)

[jira] [Commented] (HIVE-21463) Table "partition_keys" has been specified with a primary-key to include column "TBL_ID"

2020-12-05 Thread Attila Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244565#comment-17244565
 ] 

Attila Szucs commented on HIVE-21463:
-------------------------------------

[~AndyRuby], [~yongjian.wu] - any resolution to this issue? I also face the 
same exception...

> Table "partition_keys" has been specified with a primary-key to include 
> column "TBL_ID"
> -----------------------------------------------------------------------
>
> Key: HIVE-21463
> URL: https://issues.apache.org/jira/browse/HIVE-21463
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 2.3.4
>Reporter: yongjian.wu
>Priority: Major
>
> Hi, when I use Hive 2.3.4 with MariaDB 10.2.14 as the metastore database, I 
> get the error message below:
> hive> create table jian(ii char(1));
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table 
> "partition_keys" has been specified with a primary-key to include column 
> "TBL_ID" but this column is not found in the table. Please check your 
>  column specification.)
> And here is the table definition in my metastore database:
> 13:41:25 (root@localhost) [jian]> show create table partition_keys;
> | partition_keys | CREATE TABLE `partition_keys` (
>  `TBL_ID` bigint(20) NOT NULL,
>  `PKEY_COMMENT` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT 
> NULL,
>  `PKEY_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
>  `PKEY_TYPE` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
>  `INTEGER_IDX` int(11) NOT NULL,
>  PRIMARY KEY (`TBL_ID`,`PKEY_NAME`),
>  KEY `PARTITION_KEYS_N49` (`TBL_ID`),
>  CONSTRAINT `PARTITION_KEYS_FK1` FOREIGN KEY (`TBL_ID`) REFERENCES `tbls` 
> (`TBL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
> Creating a database works, but the error occurs when creating a table:
> hive> create database jiantest;
> OK
> Time taken: 6.783 seconds
> hive> show databases;
> OK
> default
> jiantest
> Time taken: 0.236 seconds, Fetched: 2 row(s)
>
> Here is my config file, if needed:
> [root@hadoop hive-2.3.4]# cat conf/hive-site.xml 
> <configuration>
>  <property>
>   <name>hive.metastore.local</name>
>   <value>true</value>
>  </property>
>  <property>
>   <name>javax.jdo.option.ConnectionURL</name>
>   <value>jdbc:mysql://172.17.0.5:3306/jian?characterEncoding=latin1</value>
>  </property>
>  <property>
>   <name>javax.jdo.option.ConnectionDriverName</name>
>   <value>org.mariadb.jdbc.Driver</value>
>   <description>mariadb-java-client-2.4.0.jar</description>
>  </property>
>  <property>
>   <name>javax.jdo.option.ConnectionUserName</name>
>   <value>jian</value>
>  </property>
>  <property>
>   <name>javax.jdo.option.ConnectionPassword</name>
>   <value>123456</value>
>  </property>
>  <property>
>   <name>hive.metastore.schema.verification</name>
>   <value>false</value>
>  </property>
> </configuration>
>  
> Waiting for your reply, thank you.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24475) Generalize fixacidkeyindex utility

2020-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24475?focusedWorklogId=520513&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-520513
 ]

ASF GitHub Bot logged work on HIVE-24475:
------------------------------------------

Author: ASF GitHub Bot
Created on: 05/Dec/20 16:50
Start Date: 05/Dec/20 16:50
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on a change in pull request #1730:
URL: https://github.com/apache/hive/pull/1730#discussion_r536821113



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/FixAcidKeyIndex.java
##########
@@ -163,11 +144,52 @@ static void checkFile(Configuration conf, Path inputPath) throws IOException {
       return;
     }
 
-    boolean validIndex = isAcidKeyIndexValid(reader);
+    AcidKeyIndexValidationResult validationResult = validate(conf, inputPath);
+    boolean validIndex = validationResult.isValid;
     System.out.println("Checking " + inputPath + " - acid key index is " +
         (validIndex ? "valid" : "invalid"));
   }
 
+  public static AcidKeyIndexValidationResult validate(Configuration conf, Path inputPath) throws IOException {
+    AcidKeyIndexValidationResult result = new AcidKeyIndexValidationResult();
+    FileSystem fs = inputPath.getFileSystem(conf);
+    Reader reader = OrcFile.createReader(fs, inputPath);
+    List<StripeInformation> stripes = reader.getStripes();
+    RecordIdentifier[] keyIndex = OrcRecordUpdater.parseKeyIndex(reader);
+    StructObjectInspector soi = (StructObjectInspector) reader.getObjectInspector();
+    // struct<operation, originalTransaction, bucket, rowId, currentTransaction, row>
+    List<? extends StructField> structFields = soi.getAllStructFieldRefs();
+
+    StructField transactionField = structFields.get(1);
+    StructField bucketField = structFields.get(2);

Review comment:
   Can this be moved out of the loop ?

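For context, the suggestion amounts to hoisting the two StructField lookups out
of the per-stripe loop. A minimal sketch (illustrative only; the names and
indices follow the diff above, and the loop body is elided):

{code:java}
import java.util.List;
import org.apache.hadoop.hive.serde2.objectinspector.StructField;

class HoistedLookupSketch {
  static void validateStripes(List<? extends StructField> structFields, int stripeCount) {
    // Looked up once, before the loop, instead of once per stripe:
    StructField transactionField = structFields.get(1);
    StructField bucketField = structFields.get(2);
    for (int i = 0; i < stripeCount; i++) {
      // ... per-stripe key index validation using transactionField/bucketField ...
    }
  }
}
{code}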




This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

Worklog Id: (was: 520513)
Time Spent: 1h 10m  (was: 1h)

> Generalize fixacidkeyindex utility
> ----------------------------------
>
> Key: HIVE-24475
> URL: https://issues.apache.org/jira/browse/HIVE-24475
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC, Transactions
>Affects Versions: 3.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There is a utility in hive which can validate/fix corrupted 
> hive.acid.key.index.
> hive --service fixacidkeyindex
> Unfortunately it is only tailored for a specific problem 
> (https://issues.apache.org/jira/browse/HIVE-18907), instead of generally 
> validating and recovering the hive.acid.key.index from the stripe data itself.
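>
> For reference, a sketch of how the utility is invoked (the option names below 
> follow the HIVE-18907 version of the tool and are assumptions, not part of 
> this patch):
> {code}
> # check a file's hive.acid.key.index without modifying it
> hive --service fixacidkeyindex --check-only /path/to/acid/bucket_00000
> # fix the index, backing up the original file first
> hive --service fixacidkeyindex --recover /path/to/acid/bucket_00000 --backup-path /tmp
> {code}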



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24475) Generalize fixacidkeyindex utility

2020-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24475?focusedWorklogId=520512&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-520512
 ]

ASF GitHub Bot logged work on HIVE-24475:
------------------------------------------

Author: ASF GitHub Bot
Created on: 05/Dec/20 16:47
Start Date: 05/Dec/20 16:47
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #1730:
URL: https://github.com/apache/hive/pull/1730#issuecomment-739319574


   +1 LGTM 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

Worklog Id: (was: 520512)
Time Spent: 1h  (was: 50m)

> Generalize fixacidkeyindex utility
> ----------------------------------
>
> Key: HIVE-24475
> URL: https://issues.apache.org/jira/browse/HIVE-24475
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC, Transactions
>Affects Versions: 3.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> There is a utility in hive which can validate/fix corrupted 
> hive.acid.key.index.
> hive --service fixacidkeyindex
> Unfortunately it is only tailored for a specific problem 
> (https://issues.apache.org/jira/browse/HIVE-18907), instead of generally 
> validating and recovering the hive.acid.key.index from the stripe data itself.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24492) SharedCache not able to estimate size for null field of TableWrapper

2020-12-05 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24492:
---------------------------------------
Summary: SharedCache not able to estimate size for null field of 
TableWrapper  (was: SharedCache not able to estimate size for location field of 
TableWrapper)

> SharedCache not able to estimate size for null field of TableWrapper
> ---------------------------------------------------------------------
>
> Key: HIVE-24492
> URL: https://issues.apache.org/jira/browse/HIVE-24492
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> The following message appears several times in the logs, indicating an error 
> while estimating the size of some field of TableWrapper:
> {noformat}
> 2020-12-04T15:54:18,551 ERROR [CachedStore-CacheUpdateService: Thread-266] 
> cache.SharedCache: Not able to estimate size
> java.lang.NullPointerException: null
> at 
> sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57)
>  ~[?:1.8.0_261]
> at 
> sun.reflect.UnsafeQualifiedObjectFieldAccessorImpl.get(UnsafeQualifiedObjectFieldAccessorImpl.java:38)
>  ~[?:1.8.0_261]
> at java.lang.reflect.Field.get(Field.java:393) ~[?:1.8.0_261]
> at 
> org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:399)
>  ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:386)
>  ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.getTableWrapperSizeWithoutMaps(SharedCache.java:348)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.<init>(SharedCache.java:321)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache.createTableWrapper(SharedCache.java:1893)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache.populateTableInCache(SharedCache.java:1754)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.CachedStore.prewarm(CachedStore.java:577)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.CachedStore.triggerPreWarm(CachedStore.java:161)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.CachedStore.access$600(CachedStore.java:90)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:767)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [?:1.8.0_261]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> [?:1.8.0_261]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  [?:1.8.0_261]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  [?:1.8.0_261]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_261]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_261]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]{noformat}
> The message appears many times when running the TPC-DS perf tests:
> {noformat}
> mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver{noformat}
> From the stack trace, it seems that we cannot estimate the size of a field 
> because it is null.
> If the value of a field is null, we shouldn't attempt to estimate its size, 
> since doing so will always lead to an NPE. Furthermore, there is no need to 
> estimate; we can simply count it as zero.
> Looking a bit deeper into this use case, the field that causes the NPE is 
> {{TableWrapper#location}}, which comes from the storage descriptor (the SDS 
> table in the metastore). So should this field be null in the first place?
> The content of the metastore shows that this happens for technical tables 
> such as version, schemata, tables, table_privileges, etc.:
> {noformat}
> version   | 
>  db_version| 
> 

[jira] [Updated] (HIVE-24492) SharedCache not able to estimate size for location field of TableWrapper

2020-12-05 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24492:
---------------------------------------
Description: 
The following message appears several times in the logs, indicating an error 
while estimating the size of some field of TableWrapper:
{noformat}
2020-12-04T15:54:18,551 ERROR [CachedStore-CacheUpdateService: Thread-266] 
cache.SharedCache: Not able to estimate size
java.lang.NullPointerException: null
at 
sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57) 
~[?:1.8.0_261]
at 
sun.reflect.UnsafeQualifiedObjectFieldAccessorImpl.get(UnsafeQualifiedObjectFieldAccessorImpl.java:38)
 ~[?:1.8.0_261]
at java.lang.reflect.Field.get(Field.java:393) ~[?:1.8.0_261]
at 
org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:399)
 ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:386)
 ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.getTableWrapperSizeWithoutMaps(SharedCache.java:348)
 [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.<init>(SharedCache.java:321)
 [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.cache.SharedCache.createTableWrapper(SharedCache.java:1893)
 [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.cache.SharedCache.populateTableInCache(SharedCache.java:1754)
 [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.cache.CachedStore.prewarm(CachedStore.java:577)
 [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.cache.CachedStore.triggerPreWarm(CachedStore.java:161)
 [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.cache.CachedStore.access$600(CachedStore.java:90)
 [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:767)
 [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_261]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
[?:1.8.0_261]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_261]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 [?:1.8.0_261]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_261]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_261]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]{noformat}
The message appears many times when running the TPC-DS perf tests:
{noformat}
mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver{noformat}
From the stack trace, it seems that we cannot estimate the size of a field 
because it is null.

If the value of a field is null, we shouldn't attempt to estimate its size, since 
doing so will always lead to an NPE. Furthermore, there is no need to estimate; 
we can simply count it as zero.
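
A minimal sketch of the guard described above (illustrative; the class and 
method names here are hypothetical, not the actual IncrementalObjectSizeEstimator 
code):
{code:java}
import java.lang.reflect.Field;

class NullSafeFieldEstimator {
  // Estimate the size contributed by one reflected field of 'parent'.
  // Recursing into a null value is what triggers the NPE, so bail out early.
  static long estimateFieldSize(Field field, Object parent) throws IllegalAccessException {
    field.setAccessible(true);
    Object value = field.get(parent);
    if (value == null) {
      return 0; // nothing to estimate; count the null field as zero
    }
    return estimateObjectSize(value); // safe to recurse now
  }

  static long estimateObjectSize(Object value) {
    // ... the existing per-type estimation logic would go here ...
    return 16; // placeholder
  }
}
{code}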

Looking a bit deeper into this use case, the field that causes the NPE is 
{{TableWrapper#location}}, which comes from the storage descriptor (the SDS table 
in the metastore). So should this field be null in the first place?

The content of the metastore shows that this happens for technical tables such 
as version, schemata, tables, table_privileges, etc.:
{noformat}
version   | 
 db_version| 
hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/db_version
 funcs | 
hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/funcs
 key_constraints   | 
hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/key_constraints
 table_stats_view  | 
 columns   | 
 web_site  | 
hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/managed/hive/tpcds_bin_partitioned_orc_3.db/web_site
 inventory_i   | 

[jira] [Assigned] (HIVE-24492) SharedCache not able to estimate size for location field of TableWrapper

2020-12-05 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-24492:
------------------------------------------


> SharedCache not able to estimate size for location field of TableWrapper
> -------------------------------------------------------------------------
>
> Key: HIVE-24492
> URL: https://issues.apache.org/jira/browse/HIVE-24492
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> The following message appears several times in the logs, indicating an error 
> while estimating the size of some field of TableWrapper:
> {noformat}
> 2020-12-04T15:54:18,551 ERROR [CachedStore-CacheUpdateService: Thread-266] 
> cache.SharedCache: Not able to estimate size
> java.lang.NullPointerException: null
> at 
> sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57)
>  ~[?:1.8.0_261]
> at 
> sun.reflect.UnsafeQualifiedObjectFieldAccessorImpl.get(UnsafeQualifiedObjectFieldAccessorImpl.java:38)
>  ~[?:1.8.0_261]
> at java.lang.reflect.Field.get(Field.java:393) ~[?:1.8.0_261]
> at 
> org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:399)
>  ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:386)
>  ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.getTableWrapperSizeWithoutMaps(SharedCache.java:348)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.<init>(SharedCache.java:321)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache.createTableWrapper(SharedCache.java:1893)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache.populateTableInCache(SharedCache.java:1754)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.CachedStore.prewarm(CachedStore.java:577)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.CachedStore.triggerPreWarm(CachedStore.java:161)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.CachedStore.access$600(CachedStore.java:90)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:767)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [?:1.8.0_261]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> [?:1.8.0_261]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  [?:1.8.0_261]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  [?:1.8.0_261]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_261]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_261]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]{noformat}
> The message appears many times when running the TPC-DS perf tests:
> {noformat}
> mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver{noformat}
> From the stack trace, it seems that we cannot estimate the size of a field 
> because it is null.
> If the value of a field is null, we shouldn't attempt to estimate its size, 
> since doing so will always lead to an NPE. Furthermore, there is no need to 
> estimate; we can simply count it as zero.
> Looking a bit deeper into this use case, the field that causes the NPE is 
> {{TableWrapper#location}}, which comes from the storage descriptor (the SDS 
> table in the metastore). So should this field be null in the first place?
> The content of the metastore shows that this happens for technical tables:
> {noformat}
> version   | 
>  db_version| 
> hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/db_version
>  funcs | 
> 

[jira] [Commented] (HIVE-24437) Add more removed configs for (Don't fail config validation for removed configs)

2020-12-05 Thread JiangZhu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1729#comment-1729
 ] 

JiangZhu commented on HIVE-24437:
---------------------------------

[~ashutoshc] Can you please review this patch?

> Add more removed configs for (Don't fail config validation for removed configs)
> --------------------------------------------------------------------------------
>
> Key: HIVE-24437
> URL: https://issues.apache.org/jira/browse/HIVE-24437
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: JiangZhu
>Assignee: JiangZhu
>Priority: Major
> Attachments: HIVE-24437.1.patch
>
>
> Add more removed configs for (HIVE-14132: Don't fail config validation for 
> removed configs)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)