[jira] [Commented] (HIVE-22981) DataFileReader is not closed in AvroGenericRecordReader#extractWriterTimezoneFromMetadata
[ https://issues.apache.org/jira/browse/HIVE-22981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244663#comment-17244663 ]

Yuming Wang commented on HIVE-22981:
------------------------------------

[~sunchao] Backport this to branch-2.3?

> DataFileReader is not closed in AvroGenericRecordReader#extractWriterTimezoneFromMetadata
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-22981
>                 URL: https://issues.apache.org/jira/browse/HIVE-22981
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karen Coppage
>            Assignee: Karen Coppage
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-22981.01.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Method looks like:
> {code}
> private ZoneId extractWriterTimezoneFromMetadata(JobConf job, FileSplit split,
>     GenericDatumReader gdr) throws IOException {
>   if (job == null || gdr == null || split == null || split.getPath() == null) {
>     return null;
>   }
>   try {
>     DataFileReader dataFileReader =
>         new DataFileReader(new FsInput(split.getPath(), job), gdr);
>     [...return...]
>   } catch (IOException e) {
>     // Can't access metadata, carry on.
>   }
>   return null;
> }
> {code}
> The DataFileReader is never closed, which can cause a memory leak. We need a try-with-resources here.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
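The try-with-resources fix requested above can be illustrated with a self-contained sketch. `MetadataReader` here is a stand-in for Avro's `DataFileReader` (which also implements `Closeable`) so the snippet runs without the Avro jars; only the pattern is what matters:

```java
import java.io.Closeable;
import java.io.IOException;

public class TryWithResourcesSketch {
    static int openCount = 0;

    // Stand-in for Avro's DataFileReader, which also implements Closeable.
    static class MetadataReader implements Closeable {
        MetadataReader() { openCount++; }
        String readWriterTimezone() { return "UTC"; }
        @Override public void close() throws IOException { openCount--; }
    }

    static String extractWriterTimezone() {
        // try-with-resources guarantees close() runs, even on early return
        // or exception, which is exactly the fix the ticket asks for.
        try (MetadataReader reader = new MetadataReader()) {
            return reader.readWriterTimezone();
        } catch (IOException e) {
            // Can't access metadata, carry on (mirrors the original catch).
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(extractWriterTimezone());
        System.out.println("open readers left: " + openCount);
    }
}
```

The same shape applies to the real method: declaring the `DataFileReader` in the `try (...)` header closes it on every exit path, including the `[...return...]` branch.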
[jira] [Commented] (HIVE-24305) avro decimal schema is not properly populating scale/precision if value is enclosed in quote
[ https://issues.apache.org/jira/browse/HIVE-24305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244661#comment-17244661 ]

Yuming Wang commented on HIVE-24305:
------------------------------------

[~sunchao] Could we backport this to branch-2.3?

{code:sql}
spark-sql> CREATE TABLE test_quoted_scale_precision STORED AS AVRO TBLPROPERTIES ('avro.schema.literal'='{"type":"record","name":"DecimalTest","namespace":"com.example.test","fields":[{"name":"Decimal24_6","type":["null",{"type":"bytes","logicalType":"decimal","precision":24,"scale":"6"}]}]}');
spark-sql> desc test_quoted_scale_precision;
decimal24_6	decimal(24,0)
spark-sql>
{code}

> avro decimal schema is not properly populating scale/precision if value is enclosed in quote
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-24305
>                 URL: https://issues.apache.org/jira/browse/HIVE-24305
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Naresh P R
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java}
> CREATE TABLE test_quoted_scale_precision STORED AS AVRO TBLPROPERTIES ('avro.schema.literal'='{"type":"record","name":"DecimalTest","namespace":"com.example.test","fields":[{"name":"Decimal24_6","type":["null",{"type":"bytes","logicalType":"decimal","precision":24,"scale":"6"}]}]}');
>
> desc test_quoted_scale_precision;
> // current output
> decimal24_6	decimal(24,0)
> // expected output
> decimal24_6	decimal(24,6){code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
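The symptom above is consistent with the scale/precision attributes only being honored when the JSON value is numeric, so a quoted `"6"` silently falls back to the default scale 0. A hypothetical, self-contained sketch of lenient parsing (these names are invented for illustration; this is not Hive's actual Avro schema code):

```java
public class LenientScaleParse {
    // Accept the schema property whether JSON decoding produced a Number
    // ("scale":6) or a String ("scale":"6"); fall back to a default otherwise.
    static int parseScaleOrPrecision(Object jsonValue, int defaultValue) {
        if (jsonValue instanceof Number) {
            return ((Number) jsonValue).intValue();
        }
        if (jsonValue instanceof String) {
            try {
                return Integer.parseInt(((String) jsonValue).trim());
            } catch (NumberFormatException e) {
                return defaultValue;
            }
        }
        return defaultValue; // property absent or of an unexpected type
    }

    public static void main(String[] args) {
        System.out.println(parseScaleOrPrecision(6, 0));    // unquoted value
        System.out.println(parseScaleOrPrecision("6", 0));  // quoted value
        System.out.println(parseScaleOrPrecision(null, 0)); // missing value
    }
}
```

With this kind of tolerance, the quoted schema in the report would yield `decimal(24,6)` instead of `decimal(24,0)`.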
[jira] [Updated] (HIVE-24493) after upgrading Hive from 2.1.1 to 2.3.7, log keeps printing "Hive Schema version 2.3.0 does not match metastore's schema version 2.1.0"
[ https://issues.apache.org/jira/browse/HIVE-24493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongle Zhang updated HIVE-24493:
--------------------------------
    Description: 
We tried to upgrade a Hive single node from 2.1.1 to 2.3.7, using Derby as the database. After the upgrade, the Hive log keeps printing the following error message repeatedly:

{code:java}
2020-12-05T02:47:26,573 ERROR [main] metastore.RetryingHMSHandler: HMSHandler Fatal error: MetaException(message:Hive Schema version 2.3.0 does not match metastore's schema version 2.1.0 Metastore is not upgraded or corrupt)
	at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7825)
	at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7788)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
	at com.sun.proxy.$Proxy34.verifySchema(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:595)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:588)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:655)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:431)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
	at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6902)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1707)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3600)
	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3652)
	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3632)
	at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3894)
	at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:248)
	at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
	at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:388)
	at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332)
	at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312)
	at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
	at org.apache.hadoop.hive.ql.session.SessionState.setAuthorizerV2Config(SessionState.java:917)
	at org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:881)
	at org.apache.hadoop.hive.ql.session.SessionState.applyAuthorizationPolicy(SessionState.java:1687)
	at org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:130)
	at org.apache.hive.service.cli.CLIService.init(CLIService.java:114)
	at org.apache.hive.service.CompositeService.init(CompositeService.java:59)
	at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:142)
	at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:607)
	at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:100)
	at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:855)
	at
[jira] [Commented] (HIVE-21463) Table "partition_keys" has been specified with a primary-key to include column "TBL_ID"
[ https://issues.apache.org/jira/browse/HIVE-21463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244565#comment-17244565 ]

Attila Szucs commented on HIVE-21463:
-------------------------------------

[~AndyRuby], [~yongjian.wu] - any resolution to this issue? I also face the same exception...

> Table "partition_keys" has been specified with a primary-key to include column "TBL_ID"
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-21463
>                 URL: https://issues.apache.org/jira/browse/HIVE-21463
>             Project: Hive
>          Issue Type: Bug
>          Components: Database/Schema
>    Affects Versions: 2.3.4
>            Reporter: yongjian.wu
>            Priority: Major
>
> Hi, when I use Hive 2.3.4 with MariaDB 10.2.14 as the metastore database, I get the error message below:
> hive> create table jian(ii char(1));
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table "partition_keys" has been specified with a primary-key to include column "TBL_ID" but this column is not found in the table. Please check your column specification.)
>
> And about my metastore database you can see:
> 13:41:25 (root@localhost) [jian]> show create table partition_keys;
> | partition_keys | CREATE TABLE `partition_keys` (
>   `TBL_ID` bigint(20) NOT NULL,
>   `PKEY_COMMENT` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
>   `PKEY_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
>   `PKEY_TYPE` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
>   `INTEGER_IDX` int(11) NOT NULL,
>   PRIMARY KEY (`TBL_ID`,`PKEY_NAME`),
>   KEY `PARTITION_KEYS_N49` (`TBL_ID`),
>   CONSTRAINT `PARTITION_KEYS_FK1` FOREIGN KEY (`TBL_ID`) REFERENCES `tbls` (`TBL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
>
> Creating a database works, but the error occurs when I create a table:
> hive> create database jiantest;
> OK
> Time taken: 6.783 seconds
> hive> show databases;
> OK
> default
> jiantest
> Time taken: 0.236 seconds, Fetched: 2 row(s)
>
> This is my config file if needed:
> [root@hadoop hive-2.3.4]# cat conf/hive-site.xml
> <configuration>
>   <property>
>     <name>hive.metastore.local</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>javax.jdo.option.ConnectionURL</name>
>     <value>jdbc:mysql://172.17.0.5:3306/jian?characterEncoding=latin1</value>
>   </property>
>   <property>
>     <name>javax.jdo.option.ConnectionDriverName</name>
>     <value>org.mariadb.jdbc.Driver</value>
>     <description>mariadb-java-client-2.4.0.jar</description>
>   </property>
>   <property>
>     <name>javax.jdo.option.ConnectionUserName</name>
>     <value>jian</value>
>   </property>
>   <property>
>     <name>javax.jdo.option.ConnectionPassword</name>
>     <value>123456</value>
>   </property>
>   <property>
>     <name>hive.metastore.schema.verification</name>
>     <value>false</value>
>   </property>
> </configuration>
>
> Waiting for your reply, thank you.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24475) Generalize fixacidkeyindex utility
[ https://issues.apache.org/jira/browse/HIVE-24475?focusedWorklogId=520513&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-520513 ]

ASF GitHub Bot logged work on HIVE-24475:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Dec/20 16:50
            Start Date: 05/Dec/20 16:50
    Worklog Time Spent: 10m
      Work Description: maheshk114 commented on a change in pull request #1730:
URL: https://github.com/apache/hive/pull/1730#discussion_r536821113

File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/FixAcidKeyIndex.java

@@ -163,11 +144,52 @@ static void checkFile(Configuration conf, Path inputPath) throws IOException {
       return;
     }
-    boolean validIndex = isAcidKeyIndexValid(reader);
+    AcidKeyIndexValidationResult validationResult = validate(conf, inputPath);
+    boolean validIndex = validationResult.isValid;
     System.out.println("Checking " + inputPath + " - acid key index is " + (validIndex ? "valid" : "invalid"));
   }

+  public static AcidKeyIndexValidationResult validate(Configuration conf, Path inputPath) throws IOException {
+    AcidKeyIndexValidationResult result = new AcidKeyIndexValidationResult();
+    FileSystem fs = inputPath.getFileSystem(conf);
+    Reader reader = OrcFile.createReader(fs, inputPath);
+    List stripes = reader.getStripes();
+    RecordIdentifier[] keyIndex = OrcRecordUpdater.parseKeyIndex(reader);
+    StructObjectInspector soi = (StructObjectInspector) reader.getObjectInspector();
+    List structFields = soi.getAllStructFieldRefs();
+
+    StructField transactionField = structFields.get(1);
+    StructField bucketField = structFields.get(2);

Review comment: Can this be moved out of the loop?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 520513)
    Time Spent: 1h 10m (was: 1h)

> Generalize fixacidkeyindex utility
> ----------------------------------
>
>                 Key: HIVE-24475
>                 URL: https://issues.apache.org/jira/browse/HIVE-24475
>             Project: Hive
>          Issue Type: Improvement
>          Components: ORC, Transactions
>    Affects Versions: 3.0.0
>            Reporter: Antal Sinkovits
>            Assignee: Antal Sinkovits
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There is a utility in Hive which can validate/fix a corrupted hive.acid.key.index:
> hive --service fixacidkeyindex
> Unfortunately it is only tailored to a specific problem (https://issues.apache.org/jira/browse/HIVE-18907), instead of generally validating and recovering the hive.acid.key.index from the stripe data itself.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24475) Generalize fixacidkeyindex utility
[ https://issues.apache.org/jira/browse/HIVE-24475?focusedWorklogId=520512&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-520512 ]

ASF GitHub Bot logged work on HIVE-24475:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Dec/20 16:47
            Start Date: 05/Dec/20 16:47
    Worklog Time Spent: 10m
      Work Description: maheshk114 commented on pull request #1730:
URL: https://github.com/apache/hive/pull/1730#issuecomment-739319574

+1 LGTM

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 520512)
    Time Spent: 1h (was: 50m)

> Generalize fixacidkeyindex utility
> ----------------------------------
>
>                 Key: HIVE-24475
>                 URL: https://issues.apache.org/jira/browse/HIVE-24475
>             Project: Hive
>          Issue Type: Improvement
>          Components: ORC, Transactions
>    Affects Versions: 3.0.0
>            Reporter: Antal Sinkovits
>            Assignee: Antal Sinkovits
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> There is a utility in Hive which can validate/fix a corrupted hive.acid.key.index:
> hive --service fixacidkeyindex
> Unfortunately it is only tailored to a specific problem (https://issues.apache.org/jira/browse/HIVE-18907), instead of generally validating and recovering the hive.acid.key.index from the stripe data itself.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HIVE-24492) SharedCache not able to estimate size for null field of TableWrapper
[ https://issues.apache.org/jira/browse/HIVE-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis updated HIVE-24492:
---------------------------------------
    Summary: SharedCache not able to estimate size for null field of TableWrapper  (was: SharedCache not able to estimate size for location field of TableWrapper)

> SharedCache not able to estimate size for null field of TableWrapper
> ---------------------------------------------------------------------
>
>                 Key: HIVE-24492
>                 URL: https://issues.apache.org/jira/browse/HIVE-24492
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>
> The following message appears various times in the logs, indicating an error in estimating the size of some field of TableWrapper:
> {noformat}
> 2020-12-04T15:54:18,551 ERROR [CachedStore-CacheUpdateService: Thread-266] cache.SharedCache: Not able to estimate size
> java.lang.NullPointerException: null
> 	at sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57) ~[?:1.8.0_261]
> 	at sun.reflect.UnsafeQualifiedObjectFieldAccessorImpl.get(UnsafeQualifiedObjectFieldAccessorImpl.java:38) ~[?:1.8.0_261]
> 	at java.lang.reflect.Field.get(Field.java:393) ~[?:1.8.0_261]
> 	at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:399) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:386) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.getTableWrapperSizeWithoutMaps(SharedCache.java:348) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.<init>(SharedCache.java:321) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.SharedCache.createTableWrapper(SharedCache.java:1893) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.SharedCache.populateTableInCache(SharedCache.java:1754) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.CachedStore.prewarm(CachedStore.java:577) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.CachedStore.triggerPreWarm(CachedStore.java:161) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.CachedStore.access$600(CachedStore.java:90) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:767) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_261]
> 	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_261]
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_261]
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_261]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]{noformat}
> The message appears many times when running the TPC-DS perf tests:
> {noformat}
> mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver{noformat}
> From the stack trace it seems that we cannot estimate the size of a field because it is null.
> If the value of a field is null then we shouldn't attempt to estimate the size since it will always lead to an NPE. Furthermore, there is no need to estimate; we can simply count it as zero.
> Looking a bit deeper into this use case, the field which causes the NPE is {{TableWrapper#location}}, which comes from the storage descriptor (SDS table in the metastore). So should this field be null in the first place?
> The content of the metastore shows that this happens for technical tables such as version, schemata, tables, table_privileges, etc.:
> {noformat}
> version |
> db_version|
>
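The proposed behavior (treat a null field as zero-sized rather than reflecting into it) can be sketched with plain reflection. This is an illustrative walker with invented class and method names, not the real IncrementalObjectSizeEstimator:

```java
import java.lang.reflect.Field;

public class NullSafeSizeSketch {
    // Toy stand-in for the cached table wrapper; `location` can legitimately
    // be null (e.g. sys.db tables whose storage descriptor has no location).
    static class TableWrapperLike {
        String tableName = "web_site";
        String location = null;
    }

    // Walks declared fields reflectively; a null reference is counted as
    // zero-sized and skipped instead of being dereferenced, which is what
    // produced the NullPointerException in the report above.
    static int countNonNullRefFields(Object obj) {
        int nonNull = 0;
        for (Field f : obj.getClass().getDeclaredFields()) {
            f.setAccessible(true);
            try {
                Object value = f.get(obj);
                if (value == null) {
                    continue; // null field: nothing to estimate, count as zero
                }
                nonNull++;    // a real estimator would recurse/estimate here
            } catch (IllegalAccessException e) {
                throw new RuntimeException(e);
            }
        }
        return nonNull;
    }

    public static void main(String[] args) {
        System.out.println(countNonNullRefFields(new TableWrapperLike()));
    }
}
```

The null check has to happen before any per-field work; once the guard is in place, a null `location` simply contributes nothing to the estimate.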
[jira] [Updated] (HIVE-24492) SharedCache not able to estimate size for location field of TableWrapper
[ https://issues.apache.org/jira/browse/HIVE-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis updated HIVE-24492:
---------------------------------------
    Description: 
The following message appears various times in the logs, indicating an error in estimating the size of some field of TableWrapper:

{noformat}
2020-12-04T15:54:18,551 ERROR [CachedStore-CacheUpdateService: Thread-266] cache.SharedCache: Not able to estimate size
java.lang.NullPointerException: null
	at sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57) ~[?:1.8.0_261]
	at sun.reflect.UnsafeQualifiedObjectFieldAccessorImpl.get(UnsafeQualifiedObjectFieldAccessorImpl.java:38) ~[?:1.8.0_261]
	at java.lang.reflect.Field.get(Field.java:393) ~[?:1.8.0_261]
	at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:399) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:386) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.getTableWrapperSizeWithoutMaps(SharedCache.java:348) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.<init>(SharedCache.java:321) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.cache.SharedCache.createTableWrapper(SharedCache.java:1893) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.cache.SharedCache.populateTableInCache(SharedCache.java:1754) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.cache.CachedStore.prewarm(CachedStore.java:577) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.cache.CachedStore.triggerPreWarm(CachedStore.java:161) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.cache.CachedStore.access$600(CachedStore.java:90) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:767) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_261]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_261]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_261]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_261]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]{noformat}

The message appears many times when running the TPC-DS perf tests:

{noformat}
mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver{noformat}

From the stack trace it seems that we cannot estimate the size of a field because it is null.

If the value of a field is null then we shouldn't attempt to estimate the size since it will always lead to an NPE. Furthermore, there is no need to estimate; we can simply count it as zero.

Looking a bit deeper into this use case, the field which causes the NPE is {{TableWrapper#location}}, which comes from the storage descriptor (SDS table in the metastore). So should this field be null in the first place?

The content of the metastore shows that this happens for technical tables such as version, schemata, tables, table_privileges, etc.:

{noformat}
version          |
db_version       | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/db_version
funcs            | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/funcs
key_constraints  | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/key_constraints
table_stats_view |
columns          |
web_site         | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/managed/hive/tpcds_bin_partitioned_orc_3.db/web_site
inventory_i      |
[jira] [Assigned] (HIVE-24492) SharedCache not able to estimate size for location field of TableWrapper
[ https://issues.apache.org/jira/browse/HIVE-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis reassigned HIVE-24492:
------------------------------------------

> SharedCache not able to estimate size for location field of TableWrapper
> -------------------------------------------------------------------------
>
>                 Key: HIVE-24492
>                 URL: https://issues.apache.org/jira/browse/HIVE-24492
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>
> The following message appears various times in the logs, indicating an error in estimating the size of some field of TableWrapper:
> {noformat}
> 2020-12-04T15:54:18,551 ERROR [CachedStore-CacheUpdateService: Thread-266] cache.SharedCache: Not able to estimate size
> java.lang.NullPointerException: null
> 	at sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57) ~[?:1.8.0_261]
> 	at sun.reflect.UnsafeQualifiedObjectFieldAccessorImpl.get(UnsafeQualifiedObjectFieldAccessorImpl.java:38) ~[?:1.8.0_261]
> 	at java.lang.reflect.Field.get(Field.java:393) ~[?:1.8.0_261]
> 	at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:399) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:386) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.getTableWrapperSizeWithoutMaps(SharedCache.java:348) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.<init>(SharedCache.java:321) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.SharedCache.createTableWrapper(SharedCache.java:1893) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.SharedCache.populateTableInCache(SharedCache.java:1754) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.CachedStore.prewarm(CachedStore.java:577) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.CachedStore.triggerPreWarm(CachedStore.java:161) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.CachedStore.access$600(CachedStore.java:90) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:767) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_261]
> 	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_261]
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_261]
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_261]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]{noformat}
> The message appears many times when running the TPC-DS perf tests:
> {noformat}
> mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver{noformat}
> From the stack trace it seems that we cannot estimate the size of a field because it is null.
> If the value of a field is null then we shouldn't attempt to estimate the size since it will always lead to an NPE. Furthermore, there is no need to estimate; we can simply count it as zero.
> Looking a bit deeper into this use case, the field which causes the NPE is {{TableWrapper#location}}, which comes from the storage descriptor (SDS table in the metastore). So should this field be null in the first place?
> The content of the metastore shows that this happens for technical tables:
> {noformat}
> version |
> db_version| hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/db_version
> funcs |
[jira] [Commented] (HIVE-24437) Add more removed configs for(Don't fail config validation for removed configs)
[ https://issues.apache.org/jira/browse/HIVE-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1729#comment-1729 ]

JiangZhu commented on HIVE-24437:
---------------------------------

[~ashutoshc] Can you please review this patch?

> Add more removed configs for(Don't fail config validation for removed configs)
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-24437
>                 URL: https://issues.apache.org/jira/browse/HIVE-24437
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 2.3.7
>            Reporter: JiangZhu
>            Assignee: JiangZhu
>            Priority: Major
>         Attachments: HIVE-24437.1.patch
>
> Add more removed configs for (HIVE-14132: Don't fail config validation for removed configs)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
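The HIVE-14132 behavior this patch extends is, in essence, a removed-keys allow-list consulted during configuration validation. A minimal, hypothetical sketch (key names and method invented for illustration; not Hive's actual HiveConf code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RemovedConfigCheck {
    // Keys dropped in newer releases; these names are purely illustrative.
    static final Set<String> REMOVED_KEYS = new HashSet<>(Arrays.asList(
            "hive.example.removed.key.one",
            "hive.example.removed.key.two"));

    // Currently known keys; again illustrative, not Hive's real registry.
    static final Set<String> KNOWN_KEYS = new HashSet<>(Arrays.asList(
            "hive.execution.engine"));

    // Validation passes for known keys and, crucially, also for keys that
    // are merely removed, so old hive-site.xml files keep working after an
    // upgrade. Only genuinely unknown keys fail.
    static boolean validates(String key) {
        return KNOWN_KEYS.contains(key) || REMOVED_KEYS.contains(key);
    }

    public static void main(String[] args) {
        System.out.println(validates("hive.execution.engine"));        // known
        System.out.println(validates("hive.example.removed.key.one")); // removed
        System.out.println(validates("hive.completely.unknown.key"));  // unknown
    }
}
```

"Adding more removed configs" then amounts to growing the removed-keys list so additional obsolete keys stop tripping validation.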