[ https://issues.apache.org/jira/browse/TAJO-587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jihoon Son updated TAJO-587:
----------------------------
Description:
See the title. When I run a simple sort query against a 1 TB table, the query
hangs and never finishes.
Queries should be terminated immediately when an OutOfMemoryError (OOME) occurs.
{noformat}
tajo> select l_orderkey from lineitem order by l_orderkey
2014-02-05 17:20:52,339 FATAL master.TajoAsyncDispatcher (TajoAsyncDispatcher.java:dispatch(143)) - Error in dispatcher thread:SUBQUERY_COMPLETED
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.net.URI.create(URI.java:857)
    at org.apache.tajo.master.querymaster.Repartitioner.scheduleRangeShuffledFetches(Repartitioner.java:342)
    at org.apache.tajo.master.querymaster.Repartitioner.scheduleFragmentsForNonLeafTasks(Repartitioner.java:261)
    at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.schedule(SubQuery.java:680)
    at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.transition(SubQuery.java:523)
    at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.transition(SubQuery.java:504)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.tajo.master.querymaster.SubQuery.handle(SubQuery.java:481)
    at org.apache.tajo.master.querymaster.Query$SubQueryCompletedTransition.executeNextBlock(Query.java:311)
    at org.apache.tajo.master.querymaster.Query$SubQueryCompletedTransition.transition(Query.java:357)
    at org.apache.tajo.master.querymaster.Query$SubQueryCompletedTransition.transition(Query.java:297)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.tajo.master.querymaster.Query.handle(Query.java:584)
    at org.apache.tajo.master.querymaster.Query.handle(Query.java:58)
    at org.apache.tajo.master.TajoAsyncDispatcher.dispatch(TajoAsyncDispatcher.java:137)
    at org.apache.tajo.master.TajoAsyncDispatcher$1.run(TajoAsyncDispatcher.java:79)
    at java.lang.Thread.run(Thread.java:701)
2014-02-05 17:20:52,339 WARN querymaster.QueryMaster (QueryMaster.java:run(459)) - Query q_1391587770871_0001 stopped cause query sesstion timeout: 384113 ms
2014-02-05 17:20:52,339 INFO querymaster.QueryMasterTask (QueryMasterTask.java:stop(168)) - Stopping QueryMasterTask:q_1391587770871_0001
2014-02-05 17:20:52,346 INFO master.TajoAsyncDispatcher (TajoAsyncDispatcher.java:stop(122)) - AsyncDispatcher stopped:q_1391587770871_0001
2014-02-05 17:20:52,351 INFO querymaster.QueryMasterTask (QueryMasterTask.java:stop(198)) - Stopped QueryMasterTask:q_1391587770871_0001
2014-02-05 17:23:28,614 ERROR worker.TajoWorker (SignalLogger.java:handle(60)) - RECEIVED SIGNAL 15: SIGTERM
{noformat}
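To illustrate the requested behavior (fail the query at once instead of letting it hang), here is a minimal sketch in Java. It is not Tajo's code and not a proposed patch: the class FailFastDispatcher, its QueryState enum, and the string events are all hypothetical names invented for this example. It only shows the general idea of catching Throwable (which covers Error subclasses such as OutOfMemoryError) in a dispatch method and moving the query to a terminal FAILED state so that nothing keeps waiting on it.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these names come from the Tajo code base. */
public class FailFastDispatcher {
    /** Hypothetical query states, not Tajo's actual state enum. */
    public enum QueryState { RUNNING, FAILED }

    private final AtomicReference<QueryState> state =
        new AtomicReference<>(QueryState.RUNNING);
    private final Consumer<String> handler;

    public FailFastDispatcher(Consumer<String> handler) {
        this.handler = handler;
    }

    /** Dispatches one event; any Throwable from the handler fails the query. */
    public void dispatch(String event) {
        if (state.get() != QueryState.RUNNING) {
            return; // the query already reached a terminal state; drop the event
        }
        try {
            handler.accept(event);
        } catch (Throwable t) {
            // Throwable also covers Error subclasses such as OutOfMemoryError.
            // Record the terminal state first so that callers polling the
            // query status stop waiting.
            state.set(QueryState.FAILED);
        }
    }

    public QueryState state() {
        return state.get();
    }

    public static void main(String[] args) {
        // Simulate a handler that runs out of memory while scheduling.
        FailFastDispatcher dispatcher = new FailFastDispatcher(event -> {
            throw new OutOfMemoryError("simulated");
        });
        dispatcher.dispatch("SUBQUERY_COMPLETED");
        System.out.println(dispatcher.state()); // prints FAILED
    }
}
```

Note that a JVM that has hit an OutOfMemoryError may not be able to do much more work, so a real fix might instead report the failure and shut the process down; the sketch covers only the state change.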
was:
See the title. When I run a simple sort query against a table of 1TB, the query
is hanging and not finished.
Queris should be terminated immediately when OOME occurs.
{noformat}
tajo> select l_orderkey from lineitem order by l_orderkey
2014-02-05 17:20:52,339 FATAL master.TajoAsyncDispatcher (TajoAsyncDispatcher.java:dispatch(143)) - Error in dispatcher thread:SUBQUERY_COMPLETED
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.net.URI.create(URI.java:857)
    at org.apache.tajo.master.querymaster.Repartitioner.scheduleRangeShuffledFetches(Repartitioner.java:342)
    at org.apache.tajo.master.querymaster.Repartitioner.scheduleFragmentsForNonLeafTasks(Repartitioner.java:261)
    at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.schedule(SubQuery.java:680)
    at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.transition(SubQuery.java:523)
    at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.transition(SubQuery.java:504)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.tajo.master.querymaster.SubQuery.handle(SubQuery.java:481)
    at org.apache.tajo.master.querymaster.Query$SubQueryCompletedTransition.executeNextBlock(Query.java:311)
    at org.apache.tajo.master.querymaster.Query$SubQueryCompletedTransition.transition(Query.java:357)
    at org.apache.tajo.master.querymaster.Query$SubQueryCompletedTransition.transition(Query.java:297)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.tajo.master.querymaster.Query.handle(Query.java:584)
    at org.apache.tajo.master.querymaster.Query.handle(Query.java:58)
    at org.apache.tajo.master.TajoAsyncDispatcher.dispatch(TajoAsyncDispatcher.java:137)
    at org.apache.tajo.master.TajoAsyncDispatcher$1.run(TajoAsyncDispatcher.java:79)
    at java.lang.Thread.run(Thread.java:701)
2014-02-05 17:20:52,339 WARN querymaster.QueryMaster (QueryMaster.java:run(459)) - Query q_1391587770871_0001 stopped cause query sesstion timeout: 384113 ms
2014-02-05 17:20:52,339 INFO querymaster.QueryMasterTask (QueryMasterTask.java:stop(168)) - Stopping QueryMasterTask:q_1391587770871_0001
2014-02-05 17:20:52,346 INFO master.TajoAsyncDispatcher (TajoAsyncDispatcher.java:stop(122)) - AsyncDispatcher stopped:q_1391587770871_0001
2014-02-05 17:20:52,351 INFO querymaster.QueryMasterTask (QueryMasterTask.java:stop(198)) - Stopped QueryMasterTask:q_1391587770871_0001
2014-02-05 17:23:28,614 ERROR worker.TajoWorker (SignalLogger.java:handle(60)) - RECEIVED SIGNAL 15: SIGTERM
{noformat}
> Query is hanging when OutOfMemoryError occurs in the query master
> -----------------------------------------------------------------
>
> Key: TAJO-587
> URL: https://issues.apache.org/jira/browse/TAJO-587
> Project: Tajo
> Issue Type: Bug
> Components: tajo master
> Reporter: Jihoon Son
> Fix For: 0.8-incubating
>
>