Feng Yuan created TEZ-3266:
------------------------------
Summary: DAG fails when YARN resources are scarce, with messages like "No
groups available for user A", because DAGAppMaster launches and immediately
exits with a successful status.
Key: TEZ-3266
URL: https://issues.apache.org/jira/browse/TEZ-3266
Project: Apache Tez
Issue Type: Bug
Affects Versions: 0.5.2
Environment: hadoop-2.6.0, hive-0.14.0
Reporter: Feng Yuan
When a resource queue is full of apps and you submit a new Tez app, the
ResourceManager log shows messages like:
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
container_1463493135662_66844_01_000004 Container Transitioned from ALLOCATED
to ACQUIRED
2016-05-24 01:52:02,963 INFO
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with
id 66847 submitted by user bae
2016-05-24 01:52:02,963 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing
application with id application_1463493135662_66847
2016-05-24 01:52:02,963 INFO
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=bae
IP=192.168.44.40 OPERATION=Submit Application Request
TARGET=ClientRMService RESULT=SUCCESS APPID=application_1463493135662_66847
2016-05-24 01:52:02,963 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1463493135662_66847 State change from NEW to NEW_SAVING
2016-05-24 01:52:02,964 INFO
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing
info for app: application_1463493135662_66847
2016-05-24 01:52:02,966 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1463493135662_66847 State change from NEW_SAVING to SUBMITTED
2016-05-24 01:52:02,966 WARN org.apache.hadoop.security.UserGroupInformation:
No groups available for user bae
2016-05-24 01:52:02,966 WARN org.apache.hadoop.security.UserGroupInformation:
No groups available for user bae
2016-05-24 01:52:02,966 WARN org.apache.hadoop.security.UserGroupInformation:
No groups available for user bae
2016-05-24 01:52:02,966 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Accepted application application_1463493135662_66847 from user: bae, in queue:
default, currently num of applications: 16
2016-05-24 01:52:02,967 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1463493135662_66847 State change from SUBMITTED to ACCEPTED
2016-05-24 01:52:02,967 INFO
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
Registering app attempt : appattempt_1463493135662_66847_000001
2016-05-24 01:52:02,967 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1463493135662_66847_000001 State change from NEW to SUBMITTED
2016-05-24 01:52:02,967 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Added Application Attempt appattempt_1463493135662_66847_000001 to scheduler
from user: bae
2016-05-24 01:52:02,967 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1463493135662_66847_000001 State change from SUBMITTED to SCHEDULED
2016-05-24 01:52:02,976 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth
successful for appattempt_1463493135662_66837_000001 (auth:SIMPLE)
2016-05-24 01:52:03,044 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Null container completed...
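
To follow the application and attempt lifecycle in an excerpt like the one above, the "State change" messages can be pulled out of the wrapped log text. The following is only a minimal sketch: the helper name `state_changes` and the regular expression are illustrative assumptions, not part of Tez or YARN, and it only handles the message format shown in this report.

```python
import re

# Hypothetical helper, not part of Tez or YARN. Matches messages such as
# "application_..._66847 State change from NEW to NEW_SAVING".
STATE_CHANGE = re.compile(
    r"(?P<entity>(?:application|appattempt)_\d+_\d+(?:_\d+)?)\s+"
    r"State change from (?P<old>[A-Z_]+) to (?P<new>[A-Z_]+)"
)


def state_changes(log_text):
    """Return (entity, old_state, new_state) tuples in log order."""
    # Long log lines were wrapped in the mail, so collapse every run of
    # whitespace (including newlines) into one space before matching.
    flat = " ".join(log_text.split())
    return [(m["entity"], m["old"], m["new"])
            for m in STATE_CHANGE.finditer(flat)]


# Two entries copied from the log above.
sample = """2016-05-24 01:52:02,963 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1463493135662_66847 State change from NEW to NEW_SAVING
2016-05-24 01:52:02,967 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1463493135662_66847_000001 State change from SUBMITTED to SCHEDULED
"""

for entity, old, new in state_changes(sample):
    print(entity, old, "->", new)
```

Applied to the full excerpt, this would show that the application reached ACCEPTED and its first attempt reached SCHEDULED; the excerpt contains no later transition for either.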
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)