Jinho Kim created TAJO-2000:
-------------------------------
Summary: BSTIndex can cause OOM
Key: TAJO-2000
URL: https://issues.apache.org/jira/browse/TAJO-2000
Project: Tajo
Issue Type: Bug
Components: Data Shuffle, Physical Operator
Affects Versions: 0.8.0
Reporter: Jinho Kim
Assignee: Jinho Kim
Fix For: 0.11.1
The BSTIndex writer collects keys and their offsets in memory for range shuffle.
If the keys are unique, the collector holds one entry per row and needs a large
amount of memory.
When the dataset is already sorted, the entries do not need to be kept in memory.
They should be written to the file immediately.
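A minimal sketch of the difference, not the actual BSTIndex code: the class and method names below (IndexWriterSketch, bufferedEntryCount, streamSorted) are hypothetical, keys are simplified to long values instead of Tuples, and the streaming variant is assumed to store only the first offset of each distinct key. It contrasts a TreeMap-based collector, whose size grows with the number of distinct keys, with a writer that relies on sorted input and retains only the previous key.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration; not part of Tajo.
class IndexWriterSketch {

    /** Buffered approach: every distinct key stays in memory until the index is flushed. */
    static int bufferedEntryCount(long[] keys, long[] offsets) {
        TreeMap<Long, List<Long>> collector = new TreeMap<>();
        for (int i = 0; i < keys.length; i++) {
            collector.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(offsets[i]);
        }
        return collector.size(); // grows with the number of distinct keys
    }

    /**
     * Streaming approach for sorted input: each new key is written at once,
     * so only the previous key is kept in memory.
     * Assumption: one (key, first offset) pair of 16 bytes per distinct key.
     */
    static byte[] streamSorted(long[] keys, long[] offsets) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            boolean first = true;
            long prev = 0L;
            for (int i = 0; i < keys.length; i++) {
                if (!first && keys[i] < prev) {
                    throw new IllegalArgumentException("input is not sorted");
                }
                if (first || keys[i] != prev) {
                    out.writeLong(keys[i]);
                    out.writeLong(offsets[i]);
                }
                prev = keys[i];
                first = false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        long[] keys = {1, 1, 2, 5};
        long[] offsets = {0, 10, 20, 30};
        System.out.println("entries buffered in memory: " + bufferedEntryCount(keys, offsets));
        System.out.println("bytes streamed to output: " + streamSorted(keys, offsets).length);
    }
}
```

The streaming variant only works when the input order matches the index order, which is why it rejects out-of-order keys instead of silently producing a broken index.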
Here is the stack trace:
{noformat}
Thread 63932: (state = BLOCKED)
- org.apache.tajo.storage.VTuple.isBlankOrNull(int) @bci=15, line=67 (Compiled frame; information may be imprecise)
- org.apache.tajo.storage.BaseTupleComparator.compare(org.apache.tajo.storage.Tuple, org.apache.tajo.storage.Tuple) @bci=32, line=112 (Compiled frame)
- org.apache.tajo.storage.BaseTupleComparator.compare(java.lang.Object, java.lang.Object) @bci=9, line=37 (Compiled frame)
- java.util.TreeMap.getEntryUsingComparator(java.lang.Object) @bci=29, line=376 (Compiled frame)
- java.util.TreeMap.getEntry(java.lang.Object) @bci=9, line=345 (Compiled frame)
- java.util.TreeMap.containsKey(java.lang.Object) @bci=2, line=232 (Compiled frame)
- org.apache.tajo.storage.index.bst.BSTIndex$BSTIndexWriter$KeyOffsetCollector.put(org.apache.tajo.storage.Tuple, long) @bci=5, line=263 (Compiled frame)
- org.apache.tajo.storage.index.bst.BSTIndex$BSTIndexWriter.write(org.apache.tajo.storage.Tuple, long) @bci=88, line=143 (Compiled frame)
- org.apache.tajo.engine.planner.physical.RangeShuffleFileWriteExec.next() @bci=78, line=108 (Compiled frame)
- org.apache.tajo.worker.TaskImpl.run() @bci=99, line=402 (Interpreted frame)
- org.apache.tajo.worker.TaskContainer.run() @bci=149, line=65 (Interpreted frame)
- java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Interpreted frame)
- java.util.concurrent.FutureTask.run() @bci=42, line=266 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
- java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
{noformat}