[
https://issues.apache.org/jira/browse/TAJO-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734288#comment-13734288
]
Hyunsik Choi commented on TAJO-91:
----------------------------------
This patch works well on Mac OS X. I also tested it on CentOS 6.4 and Ubuntu
12.04. On both machines, the patch causes the following error.
{noformat}
2013-08-09 09:58:08,345 INFO querymaster.QueryMasterManager
(QueryMasterManager.java:monitorApplication(329)) - Got application report from
ASM for, appId=1, appAttemptId=appattempt_1376009881336_0001_000001,
clientToken=null, appDiagnostics=, appMasterHost=N/A, appQueue=default,
appMasterRpcPort=0, appStartTime=1376009888321, yarnAppState=SUBMITTED,
distributedFinalState=UNDEFINED,
appTrackingUrl=local05.gruter.no-ip.org:60237/proxy/application_1376009881336_0001/,
appUser=hyunsik
2013-08-09 09:58:08,449 INFO querymaster.QueryMasterManager
(QueryMasterManager.java:monitorApplication(329)) - Got application report from
ASM for, appId=1, appAttemptId=appattempt_1376009881336_0001_000001,
clientToken=null, appDiagnostics=, appMasterHost=N/A, appQueue=default,
appMasterRpcPort=0, appStartTime=1376009888321, yarnAppState=ACCEPTED,
distributedFinalState=UNDEFINED,
appTrackingUrl=local05.gruter.no-ip.org:60237/proxy/application_1376009881336_0001/,
appUser=hyunsik
2013-08-09 09:58:08,450 INFO querymaster.QueryMasterManager
(QueryMasterManager.java:allocateAndLaunchQueryMaster(313)) - Launching
QueryMaster with id: appattempt_1376009881336_0001_000001
2013-08-09 09:58:08,450 INFO master.TajoMasterClientService
(TajoMasterClientService.java:submitQuery(146)) - Query
q_1376009881336_0001_000000 is submitted
2013-08-09 09:58:18,508 WARN resourcemanager.RMAuditLogger
(RMAuditLogger.java:logFailure(255)) - USER=hyunsik OPERATION=Application
Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App
failed with state: FAILED PERMISSIONS=Application
application_1376009881336_0001 failed 1 times due to Error launching
appattempt_1376009881336_0001_000001. Got exception:
java.lang.reflect.UndeclaredThrowableException
at
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
at
org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:109)
at
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:111)
at
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:255)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException:
Call From local05.gruter.no-ip.org/192.168.0.205 to
local05.gruter.no-ip.org:38977 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
at sun.proxy.$Proxy78.startContainer(Unknown Source)
at
org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:106)
... 5 more
Caused by: java.net.ConnectException: Call From
local05.gruter.no-ip.org/192.168.0.205 to local05.gruter.no-ip.org:38977 failed
on connection exception: java.net.ConnectException: Connection refused; For
more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:780)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:727)
at org.apache.hadoop.ipc.Client.call(Client.java:1239)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
... 7 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:597)
at
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:526)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:490)
at
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:508)
at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:603)
at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:253)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1288)
at org.apache.hadoop.ipc.Client.call(Client.java:1206)
... 8 more
. Failing the application. APPID=application_1376009881336_0001
2013-08-09 09:58:19,486 WARN containermanager.ContainerManagerImpl
(ContainerManagerImpl.java:handle(558)) - Event EventType: KILL_CONTAINER sent
to absent container container_1376009881336_0001_01_000001
2013-08-09 09:58:19,488 WARN containermanager.ContainerManagerImpl
(ContainerManagerImpl.java:handle(574)) - Event EventType: FINISH_APPLICATION
sent to absent application application_1376009881336_0001
{noformat}
> Launch QueryMaster on NodeManager per query
> -------------------------------------------
>
> Key: TAJO-91
> URL: https://issues.apache.org/jira/browse/TAJO-91
> Project: Tajo
> Issue Type: Sub-task
> Components: master
> Reporter: hyoungjunkim
> Assignee: hyoungjunkim
> Fix For: 0.3-incubating
>
> Attachments: TAJO-91.patch
>
>
> In the current implementation, TajoMaster creates a QueryMaster per query in
> the same JVM. If many queries run concurrently, TajoMaster becomes a bottleneck.
> Instead, TajoMaster should launch a QueryMaster on a NodeManager when a query
> is requested.