Jinho Kim created TAJO-2000:
-------------------------------

             Summary: BSTIndex can cause OOM
                 Key: TAJO-2000
                 URL: https://issues.apache.org/jira/browse/TAJO-2000
             Project: Tajo
          Issue Type: Bug
          Components: Data Shuffle, Physical Operator
    Affects Versions: 0.8.0
            Reporter: Jinho Kim
            Assignee: Jinho Kim
             Fix For: 0.11.1


The BSTIndex writer collects keys and offsets for range shuffle. If the keys 
are unique, the collector needs a large amount of memory.
If the dataset is already sorted, the entries do not need to be kept in 
memory. They should be written to the file immediately.
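
A minimal sketch of the proposed behavior, not the actual BSTIndex code: it 
assumes the input arrives in sorted order, uses String keys in place of Tuple, 
and keeps only the first offset for a repeated key. The class name 
StreamingIndexSketch and its methods are hypothetical. Each entry is written 
as soon as a new key appears, so only the previous key stays in memory instead 
of a TreeMap holding every key.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Comparator;

/** Hypothetical sketch; not Tajo's BSTIndexWriter. */
public class StreamingIndexSketch {
    private final DataOutputStream out;
    private final Comparator<String> comparator;
    private String lastKey;   // only the most recent key is held in memory
    private int entryCount;

    public StreamingIndexSketch(DataOutputStream out, Comparator<String> comparator) {
        this.out = out;
        this.comparator = comparator;
    }

    /** Writes (key, offset) immediately when a new key is seen. Input must be sorted. */
    public void write(String key, long offset) {
        if (lastKey != null) {
            int cmp = comparator.compare(lastKey, key);
            if (cmp > 0) {
                throw new IllegalStateException("input is not sorted at key: " + key);
            }
            if (cmp == 0) {
                return; // assumption: a repeated key keeps only its first offset
            }
        }
        try {
            out.writeUTF(key);
            out.writeLong(offset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        lastKey = key;
        entryCount++;
    }

    public int entryCount() {
        return entryCount;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        StreamingIndexSketch writer =
            new StreamingIndexSketch(new DataOutputStream(bytes), Comparator.naturalOrder());
        writer.write("a", 0L);
        writer.write("a", 16L); // same key again, skipped
        writer.write("b", 32L);
        System.out.println(writer.entryCount() + " index entries written");
    }
}
```

With unsorted input the in-memory collector is still needed, so a real fix 
would have to know whether the incoming data is sorted.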

Here is the stack trace
{noformat}
Thread 63932: (state = BLOCKED)
- org.apache.tajo.storage.VTuple.isBlankOrNull(int) @bci=15, line=67 (Compiled frame; information may be imprecise)
- org.apache.tajo.storage.BaseTupleComparator.compare(org.apache.tajo.storage.Tuple, org.apache.tajo.storage.Tuple) @bci=32, line=112 (Compiled frame)
- org.apache.tajo.storage.BaseTupleComparator.compare(java.lang.Object, java.lang.Object) @bci=9, line=37 (Compiled frame)
- java.util.TreeMap.getEntryUsingComparator(java.lang.Object) @bci=29, line=376 (Compiled frame)
- java.util.TreeMap.getEntry(java.lang.Object) @bci=9, line=345 (Compiled frame)
- java.util.TreeMap.containsKey(java.lang.Object) @bci=2, line=232 (Compiled frame)
- org.apache.tajo.storage.index.bst.BSTIndex$BSTIndexWriter$KeyOffsetCollector.put(org.apache.tajo.storage.Tuple, long) @bci=5, line=263 (Compiled frame)
- org.apache.tajo.storage.index.bst.BSTIndex$BSTIndexWriter.write(org.apache.tajo.storage.Tuple, long) @bci=88, line=143 (Compiled frame)
- org.apache.tajo.engine.planner.physical.RangeShuffleFileWriteExec.next() @bci=78, line=108 (Compiled frame)
- org.apache.tajo.worker.TaskImpl.run() @bci=99, line=402 (Interpreted frame)
- org.apache.tajo.worker.TaskContainer.run() @bci=149, line=65 (Interpreted frame)
- java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Interpreted frame)
- java.util.concurrent.FutureTask.run() @bci=42, line=266 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
- java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
