cnDenis opened a new issue #2019:
URL: https://github.com/apache/hudi/issues/2019
**Describe the problem you faced**
`java.lang.ApplicationShutdownHooks` holds a huge number of
`org.apache.hudi.common.util.collection.DiskBasedMap` and
`org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator`
objects, and the process eventually runs out of memory.
**To Reproduce**
Steps to reproduce the behavior:
1. Run Hudi in local mode with `local[4]` and 4G of memory, and set
`spark.memory.fraction=0.2` and `spark.memory.storageFraction=0.2` (so that it
spills rather than running out of memory, according to the tuning guide).
2. Receive messages from Kafka and write them to HDFS using Hudi for a few
days.
3. The process then runs out of memory.
I checked `lsof`: the process has 25000+ temp files open in /tmp.
A heap dump taken with `jmap -dump` shows a large number of DiskBasedMap and
LazyFileIterable objects held by ApplicationShutdownHooks (see below).
How can these temp files and hooks be closed?
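For reference, a minimal sketch of the general JDK pattern that avoids this kind of accumulation: register a shutdown hook as a safety net, but deregister it as soon as the resource is closed. `SpillFile` is a hypothetical stand-in written for illustration; it is not Hudi's `DiskBasedMap` and says nothing about how Hudi handles this internally. Only `Runtime.addShutdownHook` and `Runtime.removeShutdownHook` are real JDK calls here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical temp-file holder (illustration only, not a Hudi class). */
final class SpillFile implements AutoCloseable {
    private final Path file;
    private final Thread cleanupHook;
    private boolean closed;
    private boolean hookRemoved;

    SpillFile() throws IOException {
        this.file = Files.createTempFile("spill-", ".tmp");
        // Safety net: delete the file at JVM exit if close() was never called.
        this.cleanupHook = new Thread(this::deleteQuietly);
        Runtime.getRuntime().addShutdownHook(cleanupHook);
    }

    Path path() {
        return file;
    }

    synchronized boolean isClosed() {
        return closed;
    }

    /** True once the hook has been deregistered from the JVM. */
    synchronized boolean hookRemoved() {
        return hookRemoved;
    }

    @Override
    public synchronized void close() {
        if (closed) {
            return; // idempotent: a second close() does nothing
        }
        closed = true;
        deleteQuietly();
        try {
            // Without this call the JVM keeps the hook thread, and everything
            // it references, reachable until the process exits.
            hookRemoved = Runtime.getRuntime().removeShutdownHook(cleanupHook);
        } catch (IllegalStateException e) {
            // The JVM is already shutting down; the hook itself will run.
        }
    }

    private void deleteQuietly() {
        try {
            Files.deleteIfExists(file);
        } catch (IOException ignored) {
            // Best-effort cleanup in this sketch.
        }
    }
}
```

Used with try-with-resources (`try (SpillFile f = new SpillFile()) { ... }`), each instance deletes its temp file and drops its hook when the block exits, so a long-running process does not grow the hook table. If a hook is registered per object and never removed, every such object stays reachable from `ApplicationShutdownHooks` for the lifetime of the JVM, which is consistent with the heap dump below.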
**Environment Description**
* Hudi version : 0.5.3
* Spark version : 2.4.6
* Hive version : 1.2.1
* Hadoop version : 2.7.3
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no
**Additional context**
```
Class Name | Shallow Heap | Retained Heap
--------------------------------------------------------------------------------
class java.lang.ApplicationShutdownHooks @ 0x6c45b3888 System Class | 8 | 3,233,852,456
|- <class> class java.lang.Class @ 0x6c3d26108 System Class | 40 | 1,152
|- <classloader> java.lang.ClassLoader @ 0x0 <system class loader> | 64 | 64
|- <super> class java.lang.Object @ 0x6c3d4e498 System Class | 8 | 40
|- <resolved_references> java.lang.Object[3] @ 0x6c449b708 | 32 | 208
|- hooks java.util.IdentityHashMap @ 0x6c4658950 | 40 | 3,233,852,240
| |- <class> class java.util.IdentityHashMap @ 0x6c45b36a0 System Class | 32 | 264
| |- table java.lang.Object[262144] @ 0x794000000 | 1,048,592 | 3,233,852,200
| | |- <class> class java.lang.Object[] @ 0x6c3cc9610 | 0 | 0
| | |- [221646], [221647] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0009d60 Thread-112 | 128 | 536
| | |- [88924], [88925] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0019338 Thread-1549 | 128 | 536
| | |- [203058], [203059] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0019650 Thread-1611 | 128 | 536
| | |- [256104], [256105] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0160080 Thread-1460 | 128 | 536
| | |- [239698], [239699] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c0160500 Thread-1432 | 128 | 131,808
| | |- [72214], [72215] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c01609d0 Thread-1451 | 128 | 131,808
| | |- [31762], [31763] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c0160df0 Thread-1435 | 128 | 131,808
--------------------------------------------------------------------------------
```