[ https://issues.apache.org/jira/browse/HIVE-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stamatis Zampetakis reassigned HIVE-24492:
------------------------------------------
> SharedCache not able to estimate size for location field of TableWrapper
> ------------------------------------------------------------------------
>
> Key: HIVE-24492
> URL: https://issues.apache.org/jira/browse/HIVE-24492
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
>
> The following message appears repeatedly in the logs, indicating an error
> while estimating the size of a field of TableWrapper:
> {noformat}
> 2020-12-04T15:54:18,551 ERROR [CachedStore-CacheUpdateService: Thread-266] cache.SharedCache: Not able to estimate size
> java.lang.NullPointerException: null
> at sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57) ~[?:1.8.0_261]
> at sun.reflect.UnsafeQualifiedObjectFieldAccessorImpl.get(UnsafeQualifiedObjectFieldAccessorImpl.java:38) ~[?:1.8.0_261]
> at java.lang.reflect.Field.get(Field.java:393) ~[?:1.8.0_261]
> at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:399) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:386) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.getTableWrapperSizeWithoutMaps(SharedCache.java:348) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.<init>(SharedCache.java:321) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.cache.SharedCache.createTableWrapper(SharedCache.java:1893) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.cache.SharedCache.populateTableInCache(SharedCache.java:1754) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.cache.CachedStore.prewarm(CachedStore.java:577) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.cache.CachedStore.triggerPreWarm(CachedStore.java:161) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.cache.CachedStore.access$600(CachedStore.java:90) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:767) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_261]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_261]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_261]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_261]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]{noformat}
> The message appears many times when running the TPC-DS perf tests:
> {noformat}
> mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver{noformat}
> From the stack trace it seems that the size of a field cannot be estimated
> because the field is null.
> If the value of a field is null, we shouldn't attempt to estimate its size,
> since doing so always leads to an NPE. There is also nothing to estimate:
> a null field can simply be counted as zero.
> Looking a bit deeper into this use case, the field that causes the NPE is
> {{TableWrapper#location}}, which comes from the storage descriptor (SDS table
> in the metastore). So should this field be null in the first place?
> The content of the metastore shows that this happens for technical tables,
> which are listed below without a location:
> {noformat}
> version |
> db_version | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/db_version
> funcs | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/funcs
> key_constraints | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/key_constraints
> table_stats_view |
> columns |
> web_site | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/managed/hive/tpcds_bin_partitioned_orc_30000.db/web_site
> inventory_i | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/managed/hive/tpcds_bin_partitioned_orc_30000.db/inventory_i
> partition_stats_view |
> wm_resourceplans | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_resourceplans
> wm_triggers | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_triggers
> wm_pools | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_pools
> wm_pools_to_triggers | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_pools_to_triggers
> wm_mappings | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_mappings
> scheduled_queries | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/scheduled_queries
> scheduled_executions | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/scheduled_executions
> schemata |
> tables |
> table_privileges |
> column_privileges |
> views |
> scheduled_queries |
> scheduled_executions
> {noformat}
>
>
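The null handling proposed in the description (count a null field as zero instead of handing it to the estimator) might look roughly like the sketch below. This is not the Hive code: NullSafeSizeSketch, TableEntry, REFERENCE_BYTES and the 2-bytes-per-char rule are hypothetical names and assumptions chosen only to keep the example self-contained. The only point it illustrates is the null check that comes before any size is computed for a field value.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class NullSafeSizeSketch {
    // Assumed cost of one object reference; real JVMs vary (4 or 8 bytes).
    static final long REFERENCE_BYTES = 8L;

    // Hypothetical stand-in for a cached table entry; not Hive's TableWrapper.
    static final class TableEntry {
        final String name;
        final String location; // may be null, e.g. for views

        TableEntry(String name, String location) {
            this.name = name;
            this.location = location;
        }
    }

    // Rough payload size of a String; a null value contributes nothing.
    static long estimateString(String value) {
        if (value == null) {
            return 0L; // nothing to measure, and nothing to dereference
        }
        return 2L * value.length(); // assumption: 2 bytes per char
    }

    // Sums reference + payload size over the String fields of the given object.
    static long estimate(Object owner) {
        if (owner == null) {
            return 0L; // Field.get(null) on an instance field would throw an NPE
        }
        long total = 0L;
        for (Field field : owner.getClass().getDeclaredFields()) {
            // This sketch only looks at non-static String fields.
            if (Modifier.isStatic(field.getModifiers()) || field.getType() != String.class) {
                continue;
            }
            try {
                field.setAccessible(true);
                total += REFERENCE_BYTES + estimateString((String) field.get(owner));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // A table with a location and a view-like entry without one.
        System.out.println(estimate(new TableEntry("web_site", "hdfs://host/warehouse/web_site")));
        System.out.println(estimate(new TableEntry("table_stats_view", null)));
    }
}
```

With this guard an entry whose location is null is still sized from its remaining fields, and no exception is logged for it.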
--
This message was sent by Atlassian Jira
(v8.3.4#803005)