While testing against Apache Drill 1.16.0, we are running into this
error: java.lang.OutOfMemoryError: GC overhead limit exceeded
In our use case, Apache Drill uses a custom storage plugin and no other
storage plugins such as PostgreSQL, MySQL, etc. Some of the queries are very
large, involving many subqueries, joins, functions, etc. We are running
the same set of queries that work without issue in Drill version
1.14.0.
We generated a heap dump at the time of the out-of-memory error. The heap
dump file is about 5.8 GB. Opening the dump showed:
Heap:
Size: 3.1 GB
Classes: 21.1k
Objects: 82.4m
Class Loader: 538
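For anyone trying to reproduce this, a heap dump can also be triggered from
inside a running JVM. The following is only a minimal sketch, assuming a
HotSpot JVM; the class name HeapDumper and the output file name are
hypothetical, and this is not necessarily how the dump above was produced.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper: writes an .hprof heap dump of the current JVM.
public class HeapDumper {

    // path must not already exist; liveOnly=true dumps only reachable objects.
    static File dumpHeap(String path, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(path, liveOnly);
        } catch (IOException e) {
            // Thrown, for example, when the target file already exists.
            throw new UncheckedIOException(e);
        }
        return new File(path);
    }

    public static void main(String[] args) {
        // Assumed file name, for illustration only.
        File dump = dumpHeap("drillbit-heap.hprof", true);
        System.out.println("Wrote " + dump.getAbsolutePath()
                + " (" + dump.length() + " bytes)");
    }
}
```

Passing liveOnly=false keeps unreachable objects in the dump as well, which
matters when the amount of unreachable data is itself part of the question.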
The dominator tree for the allocated heap shows two threads, both with a
similar ownership stack, holding the bulk of the allocated memory.
E.g.:
Class Name | Shallow Heap | Retained Heap | Percentage
------------------------------------------------------------------------------------------------------------------------------------------------
java.lang.Thread @ 0x73af9b238  2288c900-b265-3988-1524-8e920a884075:frag:4:0 Thread | 120 | 1,882,336,336 | 56.51%
|- org.apache.drill.exec.compile.bytecode.MethodAnalyzer @ 0x73cc3fe88 | 56 | 1,873,674,888 | 56.25%
|  |- org.objectweb.asm.tree.analysis.Frame[33487] @ 0x73d19b570 | 133,968 | 1,873,239,392 | 56.24%
|  |  |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x786c6c470 | 40 | 206,576 | 0.01%
|  |  |  |- java.util.ArrayDeque @ 0x786c6c550 | 24 | 198,120 | 0.01%
|  |  |  |  '- java.lang.Object[2048] @ 0x786ce99f8 | 8,208 | 198,096 | 0.01%
|  |  |  |     |- java.util.HashSet @ 0x786c6e2d8 | 16 | 288 | 0.00%
|  |  |  |     |- java.util.HashSet @ 0x786c6ec68 | 16 | 288 | 0.00%
|  |  |  |     |- java.util.HashSet @ 0x786cd1ce8 | 16 | 288 | 0.00%
|  |  |  |     |- java.util.HashSet @ 0x786cd2ad8 | 16 | 288 | 0.00%
|  |  |  |     |- ......
|  |  |  |     *Total: 25 of 1,260 entries*
|  |  |  |- java.util.ArrayDeque @ 0x786cf3128 | 24 | 8,232 | 0.00%
|  |  |  |  '- java.lang.Object[2048] @ 0x786cf3140 | 8,208 | 8,208 | 0.00%
|  |  |  |- org.objectweb.asm.tree.analysis.Value[42] @ 0x786c6c498 | 184 | 184 | 0.00%
|  |  |  *Total: 3 entries*
|  |  |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x786cf5150 | 40 | 206,576 | 0.01%
|  |  |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x78697ee00 | 40 | 206,416 | 0.01%
|  |  |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x7869b1440 | 40 | 206,416 | 0.01%
|  |  |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x784d5a328 | 40 | 206,336 | 0.01%
|  |  |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x784d8c918 | 40 | 206,336 | 0.01%
|  |  |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x784f9cc88 | 40 | 206,336 | 0.01%
|  |  |- ......
|  |  *Total: 25 of 19,971 entries*
...........
------------------------------------------------------------------------------------------------------------------------------------------------
Not sure whether it is normal to have this many
org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame
instances.
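As a rough sanity check on the figures in the tree above, the retained heap
of the Frame[] array can be divided by the number of entries listed under it.
This is only back-of-the-envelope arithmetic on the numbers already shown;
the class and method names are made up for illustration.

```java
// Hypothetical helper: arithmetic on the figures from the dominator tree.
public class FrameRetainedEstimate {

    // Retained heap of org.objectweb.asm.tree.analysis.Frame[33487], in bytes.
    static final long FRAME_ARRAY_RETAINED_BYTES = 1_873_239_392L;
    // Number of entries the dominator tree lists under that array.
    static final int FRAME_ENTRIES = 19_971;
    // Retained heap of the largest AssignmentTrackingFrame shown, in bytes.
    static final long LARGEST_FRAME_RETAINED_BYTES = 206_576L;

    // Average retained bytes per entry (integer division, rounds down).
    static long averageRetainedPerEntry(long totalRetainedBytes, int entries) {
        return totalRetainedBytes / entries;
    }

    public static void main(String[] args) {
        long average = averageRetainedPerEntry(FRAME_ARRAY_RETAINED_BYTES, FRAME_ENTRIES);
        // What the array would retain if every entry were as large as the largest one.
        long ifAllWereLargest = (long) FRAME_ENTRIES * LARGEST_FRAME_RETAINED_BYTES;

        System.out.printf("Average retained per entry: %,d bytes%n", average);
        System.out.printf("If all entries matched the largest: %,d bytes%n", ifAllWereLargest);
        System.out.printf("Actual retained by the array: %,d bytes%n", FRAME_ARRAY_RETAINED_BYTES);
    }
}
```

With these figures the average works out to roughly 94 KB per entry, less
than half of the ~206 KB retained by the largest entries shown, so the memory
is spread over many thousands of frames rather than held by a few large ones.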
Unreachable objects:
Size: 3.3 GB
Objects: 80k
Classes: 454
That seems like a lot of unreachable objects. Where do I go from here to
debug this? Is there a JVM setting that would fix this issue? Thanks.
-- Jiang