[ 
https://issues.apache.org/jira/browse/HIVE-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-24492:
------------------------------------------


> SharedCache not able to estimate size for location field of TableWrapper
> ------------------------------------------------------------------------
>
>                 Key: HIVE-24492
>                 URL: https://issues.apache.org/jira/browse/HIVE-24492
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>
> The following message appears repeatedly in the logs, indicating an error 
> while estimating the size of a field of TableWrapper:
> {noformat}
> 2020-12-04T15:54:18,551 ERROR [CachedStore-CacheUpdateService: Thread-266] cache.SharedCache: Not able to estimate size
> java.lang.NullPointerException: null
>         at sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57) ~[?:1.8.0_261]
>         at sun.reflect.UnsafeQualifiedObjectFieldAccessorImpl.get(UnsafeQualifiedObjectFieldAccessorImpl.java:38) ~[?:1.8.0_261]
>         at java.lang.reflect.Field.get(Field.java:393) ~[?:1.8.0_261]
>         at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:399) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator$ObjectEstimator.estimate(IncrementalObjectSizeEstimator.java:386) ~[hive-storage-api-2.7.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.getTableWrapperSizeWithoutMaps(SharedCache.java:348) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.cache.SharedCache$TableWrapper.<init>(SharedCache.java:321) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.cache.SharedCache.createTableWrapper(SharedCache.java:1893) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.cache.SharedCache.populateTableInCache(SharedCache.java:1754) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.cache.CachedStore.prewarm(CachedStore.java:577) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.cache.CachedStore.triggerPreWarm(CachedStore.java:161) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.cache.CachedStore.access$600(CachedStore.java:90) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:767) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_261]
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_261]
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_261]
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_261]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]{noformat}
> The message appears many times when running the TPC-DS perf tests:
> {noformat}
> mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver{noformat}
> From the stack trace, it seems that the size of a field cannot be estimated 
> because its value is null.
> If the value of a field is null, we should not attempt to estimate its size, 
> since doing so always leads to an NPE. Moreover, no estimation is needed: 
> a null field can simply be counted as zero.
> Looking deeper into this use case, the field that causes the NPE is 
> {{TableWrapper#location}}, which comes from the storage descriptor (SDS table 
> in the metastore). So should this field be null in the first place?
> The content of the metastore shows that this happens for technical tables 
> (the rows below whose location column is empty):
> {noformat}
>  version                   | 
>  db_version                | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/db_version
>  funcs                     | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/funcs
>  key_constraints           | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/key_constraints
>  table_stats_view          | 
>  columns                   | 
>  web_site                  | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/managed/hive/tpcds_bin_partitioned_orc_30000.db/web_site
>  inventory_i               | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/managed/hive/tpcds_bin_partitioned_orc_30000.db/inventory_i
>  partition_stats_view      | 
>  wm_resourceplans          | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_resourceplans
>  wm_triggers               | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_triggers
>  wm_pools                  | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_pools
>  wm_pools_to_triggers      | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_pools_to_triggers
>  wm_mappings               | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/wm_mappings
>  scheduled_queries         | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/scheduled_queries
>  scheduled_executions      | hdfs://localhost:40889/clusters/env-6cwwgq/warehouse-1580339123-xdmn/warehouse/tablespace/external/hive/sys.db/scheduled_executions
>  schemata                  | 
>  tables                    | 
>  table_privileges          | 
>  column_privileges         | 
>  views                     | 
>  scheduled_queries         | 
>  scheduled_executions
> {noformat}
>  
>  
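
The null handling proposed in the description (skip a null field and count it as zero) can be sketched as follows. This is only an illustration, not the code of IncrementalObjectSizeEstimator or SharedCache: the TableLike class, the REF_SIZE constant, and the 2-bytes-per-character string cost are all assumptions made up for the example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class NullSafeSizeSketch {
    // Hypothetical stand-in for TableWrapper; only the nullable field matters here.
    static final class TableLike {
        String name;
        String location; // may be null, as reported for some tables in the issue

        TableLike(String name, String location) {
            this.name = name;
            this.location = location;
        }
    }

    // Assumed cost of one object reference; illustrative, not a JVM measurement.
    static final long REF_SIZE = 8;

    // A null string contributes nothing; otherwise assume 2 bytes per char.
    static long estimateString(String s) {
        return s == null ? 0 : 2L * s.length();
    }

    // Sums the reference cost of every instance field plus the payload of
    // String fields. Null values are counted as zero instead of being
    // dereferenced, which is the behavior suggested in the description.
    static long estimate(Object obj) {
        if (obj == null) {
            return 0;
        }
        long size = 0;
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers())) {
                continue;
            }
            f.setAccessible(true);
            size += REF_SIZE;
            try {
                Object value = f.get(obj);
                if (value instanceof String) { // false for null, so null adds 0
                    size += estimateString((String) value);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return size;
    }

    public static void main(String[] args) {
        // A table without a location no longer causes an exception.
        System.out.println(estimate(new TableLike("views", null)));
        System.out.println(estimate(new TableLike("web_site", "hdfs://host/path")));
    }
}
```

With a guard like this, a table whose location is null still gets a size estimate; whether the location should be null in the first place remains a separate question.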



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
