[ 
https://issues.apache.org/jira/browse/HIVE-28808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933641#comment-17933641
 ] 

Quanlong Huang commented on HIVE-28808:
---------------------------------------

The current code fetches a batch of old events and deletes them using 
PersistenceManager.deletePersistentAll():
https://github.com/apache/hive/blob/56a18bbba94f7cc099cb8dd1ab5e243a77fecd3f/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L11603

An alternative is to issue a DELETE FROM query directly. However, a single 
DELETE can't do the job in batches, so when there are lots of old events it 
might impact the insertion of new events.
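
To illustrate the batching concern, here is a minimal sketch of a cleanup loop that deletes a bounded number of rows per round and never loads the event payloads. It is not the ObjectStore implementation: the EventStore interface, InMemoryStore and cleanInBatches are hypothetical names, and the in-memory map only stands in for the NOTIFICATION_LOG table so the example runs without a database.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public class BatchedCleanerSketch {

    /** Hypothetical abstraction over the event table; not a Hive interface. */
    interface EventStore {
        /** Deletes at most maxRows events with eventTime < cutoff and returns the number removed. */
        int deleteOlderThan(int cutoff, int maxRows);
    }

    /** In-memory stand-in for the table (an assumption made so the sketch is runnable). */
    static class InMemoryStore implements EventStore {
        // event id -> event time; message payloads are deliberately not modelled or loaded
        final TreeMap<Long, Integer> events = new TreeMap<>();

        void add(long eventId, int eventTime) {
            events.put(eventId, eventTime);
        }

        @Override
        public int deleteOlderThan(int cutoff, int maxRows) {
            int removed = 0;
            Iterator<Map.Entry<Long, Integer>> it = events.entrySet().iterator();
            while (it.hasNext() && removed < maxRows) {
                if (it.next().getValue() < cutoff) {
                    it.remove();
                    removed++;
                }
            }
            return removed;
        }
    }

    /** Repeats bounded deletes until a round removes fewer rows than the batch size. */
    static int cleanInBatches(EventStore store, int cutoff, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int total = 0;
        int deleted;
        do {
            // Against a real database, each round would be its own short transaction,
            // which leaves room for inserts of new events between rounds.
            deleted = store.deleteOlderThan(cutoff, batchSize);
            total += deleted;
        } while (deleted == batchSize);
        return total;
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        for (long id = 1; id <= 25; id++) {
            store.add(id, (int) id); // event time equals the id, for simplicity
        }
        int deleted = cleanInBatches(store, 21, 10);
        // prints "deleted=20, remaining=5"
        System.out.println("deleted=" + deleted + ", remaining=" + store.events.size());
    }
}
```

The point of the design is that memory use per round depends only on the batch size, not on how many old events have accumulated.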

The issue in this JIRA can be worked around by setting 
hive.metastore.event.db.clean.maxevents to a smaller value. The default is 
10000. I tried 100 and it works in my environment.
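
As a sketch of that workaround, the property could be set in the metastore configuration like this. The file it goes into (hive-site.xml or metastore-site.xml, depending on the deployment) is an assumption, and 100 is only the value that worked in the environment described above, not a recommended setting.

```xml
<property>
  <name>hive.metastore.event.db.clean.maxevents</name>
  <value>100</value>
</property>
```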

> DB-Notification-Cleaner thread dies in startup due to OOM
> ---------------------------------------------------------
>
>                 Key: HIVE-28808
>                 URL: https://issues.apache.org/jira/browse/HIVE-28808
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> Saw this when launching HMS on a huge NOTIFICATION_LOG table.
> {noformat}
> 2025-03-07 19:37:18: Starting Hive Metastore Server
> Listening for transport dt_socket at address: 30010 
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/home/quanlong/workspace/Impala/toolchain/cdp_components-58457853/apache-hive-3.1.3000.7.3.1.0-160-bin/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/home/quanlong/workspace/Impala/toolchain/cdp_components-58457853/hadoop-3.1.1.7.3.1.0-160/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Exception in thread "DB-Notification-Cleaner" java.lang.OutOfMemoryError: Java heap space
>         at java.lang.StringCoding.decode(StringCoding.java:215)
>         at java.lang.String.<init>(String.java:463)
>         at org.postgresql.core.Encoding.decode(Encoding.java:284)
>         at org.postgresql.core.Encoding.decode(Encoding.java:295)
>         at org.postgresql.jdbc.PgResultSet.getString(PgResultSet.java:2256)
>         at com.zaxxer.hikari.pool.HikariProxyResultSet.getString(HikariProxyResultSet.java)
>         at org.datanucleus.store.rdbms.mapping.column.LongVarcharColumnMapping.getString(LongVarcharColumnMapping.java:102)
>         at org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.getString(SingleFieldMapping.java:188)
>         at org.datanucleus.store.rdbms.fieldmanager.ResultSetGetter.fetchStringField(ResultSetGetter.java:133)
>         at org.datanucleus.state.StateManagerImpl.replacingStringField(StateManagerImpl.java:1986)
>         at org.apache.hadoop.hive.metastore.model.MNotificationLog.dnReplaceField(MNotificationLog.java)
>         at org.apache.hadoop.hive.metastore.model.MNotificationLog.dnReplaceFields(MNotificationLog.java)
>         at org.datanucleus.state.StateManagerImpl.replaceFields(StateManagerImpl.java:4352)
>         at org.datanucleus.store.rdbms.query.PersistentClassROF$1.fetchFields(PersistentClassROF.java:528)
>         at org.datanucleus.state.StateManagerImpl.loadFieldValues(StateManagerImpl.java:3743)
>         at org.datanucleus.state.StateManagerImpl.initialiseForHollow(StateManagerImpl.java:383)
>         at org.datanucleus.state.ObjectProviderFactoryImpl.newForHollow(ObjectProviderFactoryImpl.java:99)
>         at org.datanucleus.ExecutionContextImpl.findObject(ExecutionContextImpl.java:3199)
>         at org.datanucleus.store.rdbms.query.PersistentClassROF.findObjectWithIdAndLoadFields(PersistentClassROF.java:523)
>         at org.datanucleus.store.rdbms.query.PersistentClassROF.getObject(PersistentClassROF.java:456)
>         at org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:181)
>         at org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:409)
>         at org.datanucleus.store.rdbms.query.ForwardQueryResult.processNumberOfResults(ForwardQueryResult.java:137)
>         at org.datanucleus.store.rdbms.query.ForwardQueryResult.advanceToEndOfResultSet(ForwardQueryResult.java:165)
>         at org.datanucleus.store.rdbms.query.ForwardQueryResult.getSizeUsingMethod(ForwardQueryResult.java:519)
>         at org.datanucleus.store.query.AbstractQueryResult.size(AbstractQueryResult.java:256)
>         at org.apache.hadoop.hive.metastore.ObjectStore.doCleanNotificationEvents(ObjectStore.java:12213)
>         at org.apache.hadoop.hive.metastore.ObjectStore.cleanOlderEvents(ObjectStore.java:12175)
>         at org.apache.hadoop.hive.metastore.ObjectStore.cleanNotificationEvents(ObjectStore.java:12161)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  {noformat}
> This is a downstream build. The related Hive code is:
> https://github.com/apache/hive/blob/d0372808177a823d63383e311c5909aa46b9a961/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L11577
> It seems we should optimize the code that cleans up old events.



