cnDenis opened a new issue #2019:
URL: https://github.com/apache/hudi/issues/2019


   **Describe the problem you faced**
   
   java.lang.ApplicationShutdownHooks holds a huge number of 
org.apache.hudi.common.util.collection.DiskBasedMap and 
org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator 
objects, and the process eventually runs out of memory.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Run Hudi in Spark local mode with local[4] and 4 GB of memory, setting 
spark.memory.fraction=0.2 and spark.memory.storageFraction=0.2 (so that it 
spills rather than running out of memory, as the tuning guide suggests).
   2. Receive messages from Kafka and write them to HDFS using Hudi for a few 
days.
   3. The process eventually runs out of memory.
   
   According to lsof, the process has 25000+ temp files open in /tmp.
   
   A heap dump taken with jmap -dump shows a large number of DiskBasedMap and 
LazyFileIterable objects held by ApplicationShutdownHooks (see below).
   
   How can these temp files be closed and their shutdown hooks removed?
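
   To illustrate what I would expect, here is a minimal sketch. It assumes 
(I have not confirmed this in the Hudi source) that each DiskBasedMap and 
LazyFileIterator registers its own shutdown hook thread to clean up its temp 
file and never removes it. HookedTempFile is a hypothetical stand-in class, 
not Hudi code. It deregisters its hook and deletes its file in close(), so 
ApplicationShutdownHooks no longer retains closed instances.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a spillable structure backed by a temp file.
public class HookedTempFile implements AutoCloseable {
    private final File file;
    private final Thread cleanupHook;

    public HookedTempFile() {
        try {
            this.file = File.createTempFile("spill-", ".tmp");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Safety net: delete the file if the JVM exits before close() is called.
        this.cleanupHook = new Thread(this::deleteFile);
        Runtime.getRuntime().addShutdownHook(cleanupHook);
    }

    public File file() {
        return file;
    }

    private void deleteFile() {
        file.delete();
    }

    @Override
    public void close() {
        try {
            // Without this call the hook (and everything it references)
            // stays reachable from ApplicationShutdownHooks until JVM exit.
            Runtime.getRuntime().removeShutdownHook(cleanupHook);
        } catch (IllegalStateException e) {
            // JVM shutdown already in progress; the hook will run by itself.
        }
        deleteFile();
    }

    public static void main(String[] args) {
        // Each iteration registers one hook and removes it again on close.
        for (int i = 0; i < 1000; i++) {
            try (HookedTempFile t = new HookedTempFile()) {
                // read or write t.file() here
            }
        }
    }
}
```

   With try-with-resources the hook count stays flat over time, while the 
hook still acts as a fallback for instances that are open when the JVM exits.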
   
   **Environment Description**
   
   * Hudi version : 0.5.3
   
   * Spark version : 2.4.6
   
   * Hive version : 1.2.1
   
   * Hadoop version : 2.7.3
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   ```
   Class Name                                                                                                                         | Shallow Heap | Retained Heap
   ----------------------------------------------------------------------------------------------------------------------------------------------------------------
   class java.lang.ApplicationShutdownHooks @ 0x6c45b3888 System Class                                                                |            8 | 3,233,852,456
   |- <class> class java.lang.Class @ 0x6c3d26108 System Class                                                                        |           40 |         1,152
   |- <classloader> java.lang.ClassLoader @ 0x0  <system class loader>                                                                |           64 |            64
   |- <super> class java.lang.Object @ 0x6c3d4e498 System Class                                                                       |            8 |            40
   |- <resolved_references> java.lang.Object[3] @ 0x6c449b708                                                                         |           32 |           208
   |- hooks java.util.IdentityHashMap @ 0x6c4658950                                                                                   |           40 | 3,233,852,240
   |  |- <class> class java.util.IdentityHashMap @ 0x6c45b36a0 System Class                                                           |           32 |           264
   |  |- table java.lang.Object[262144] @ 0x794000000                                                                                 |    1,048,592 | 3,233,852,200
   |  |  |- <class> class java.lang.Object[] @ 0x6c3cc9610                                                                            |            0 |             0
   |  |  |- [221646], [221647] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0009d60  Thread-112   |          128 |           536
   |  |  |- [88924], [88925] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0019338  Thread-1549    |          128 |           536
   |  |  |- [203058], [203059] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0019650  Thread-1611  |          128 |           536
   |  |  |- [256104], [256105] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0160080  Thread-1460  |          128 |           536
   |  |  |- [239698], [239699] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c0160500  Thread-1432                       |          128 |       131,808
   |  |  |- [72214], [72215] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c01609d0  Thread-1451                         |          128 |       131,808
   |  |  |- [31762], [31763] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c0160df0  Thread-1435                         |          128 |       131,808
   ----------------------------------------------------------------------------------------------------------------------------------------------------------------
   
   ```
   
   

