[ https://issues.apache.org/jira/browse/TEZ-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337404#comment-14337404 ]
Chang Li commented on TEZ-2143:
-------------------------------
[~hitesh] I have verified this patch. Here is the AM shutdown log from the test run:
{code}
_000029 transitioned from STOP_REQUESTED to STOPPING via event C_NM_STOP_SENT
2015-02-25 22:54:24,293 INFO [IPC Server handler 3 on 41658]
app.TaskAttemptListenerImpTezDag: Container with id:
container_e05_1424818686446_1302_01_000029 is valid, but no longer registered,
and will be killed
2015-02-25 22:54:24,389 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: Released container
completed:container_e05_1424818686446_1302_01_000029 last allocated to task:
attempt_1424818686446_1302_1_00_000000_0
2015-02-25 22:54:24,390 INFO [Dispatcher thread: Central]
container.AMContainerImpl: Container container_e05_1424818686446_1302_01_000029
exited with diagnostics set to Container failed. Container released by
application
2015-02-25 22:54:24,390 INFO [Dispatcher thread: Central]
container.AMContainerImpl: AMContainer
container_e05_1424818686446_1302_01_000029 transitioned from STOPPING to
COMPLETED via event C_COMPLETED
2015-02-25 22:54:24,397 INFO [AMShutdownThread] app.DAGAppMaster: Calling stop
for all the services
2015-02-25 22:54:24,398 INFO [AMShutdownThread] history.HistoryEventHandler:
Stopping HistoryEventHandler
2015-02-25 22:54:24,398 INFO [AMShutdownThread] recovery.RecoveryService:
Stopping RecoveryService
2015-02-25 22:54:24,398 INFO [AMShutdownThread] recovery.RecoveryService:
Closing Summary Stream
2015-02-25 22:54:24,398 INFO [RecoveryEventHandlingThread]
recovery.RecoveryService: EventQueue take interrupted. Returning
2015-02-25 22:54:24,410 INFO [AMShutdownThread] ats.ATSHistoryLoggingService:
Stopping ATSService, eventQueueBacklog=0
2015-02-25 22:54:24,411 INFO [DelayedContainerManager]
rm.YarnTaskSchedulerService: AllocatedContainerManager Thread interrupted
2015-02-25 22:54:24,413 INFO [AMShutdownThread] rm.YarnTaskSchedulerService:
Unregistering application from RM, exitStatus=SUCCEEDED, exitMessage=Session
stats:submittedDAGs=1, successfulDAGs=1, failedDAGs=0, killedDAGs=0
,
trackingURL=axonitered-jt1.red.ygrid.yahoo.com:4080/tez/#/?appid=application_1424818686446_1302
2015-02-25 22:54:24,428 INFO [AMShutdownThread] impl.AMRMClientImpl: Waiting
for application to be successfully unregistered.
2015-02-25 22:54:24,537 INFO [AMShutdownThread] rm.YarnTaskSchedulerService:
Successfully unregistered application from RM
2015-02-25 22:54:24,538 INFO [AMShutdownThread] ipc.Server: Stopping server on
41658
2015-02-25 22:54:24,538 INFO [AMRM Callback Handler Thread]
impl.AMRMClientAsyncImpl: Interrupted while waiting for queue
java.lang.InterruptedException
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
2015-02-25 22:54:24,539 INFO [IPC Server Responder] ipc.Server: Stopping IPC
Server Responder
2015-02-25 22:54:24,539 INFO [AMShutdownThread] ipc.Server: Stopping server on
50500
2015-02-25 22:54:24,539 INFO [IPC Server listener on 41658] ipc.Server:
Stopping IPC Server listener on 41658
2015-02-25 22:54:24,541 INFO [IPC Server listener on 50500] ipc.Server:
Stopping IPC Server listener on 50500
2015-02-25 22:54:24,541 INFO [IPC Server Responder] ipc.Server: Stopping IPC
Server Responder
2015-02-25 22:54:24,542 INFO [Thread-3] app.DAGAppMaster:
DAGAppMasterShutdownHook invoked
2015-02-25 22:54:24,543 INFO [Thread-3] app.DAGAppMaster: The shutdown handler
is still running, waiting for it to complete
2015-02-25 22:54:24,545 INFO [AMShutdownThread] app.DAGAppMaster: Completed
deletion of tez scratch data dir,
path=hdfs://gateway:8020/tmp/temp-1823306399/.tez/application_1424818686446_1302
2015-02-25 22:54:24,545 INFO [AMShutdownThread] app.DAGAppMaster: Exiting
DAGAppMaster..GoodBye!
2015-02-25 22:54:24,545 INFO [Thread-3] app.DAGAppMaster: The shutdown handler
has completed
{code}
The log shows that the Tez scratch data dir was deleted successfully.
TEZ-2133 is the same issue as this JIRA. Should I post my patch there instead?
Thanks
> scratch file can't be deleted in security mode
> ----------------------------------------------
>
> Key: TEZ-2143
> URL: https://issues.apache.org/jira/browse/TEZ-2143
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: TEZ_2143_V1.patch
>
>
> {code}2015-02-25 21:01:47,685 WARN [AMShutdownThread] app.DAGAppMaster:
> Failed to delete tez scratch data dir
> java.io.IOException: Failed on local exception: java.io.IOException:
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate
> via:[TOKEN, KERBEROS]; Host Details : local host is: "gateway/10.100.100.10";
> destination host is: namenode;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> at org.apache.hadoop.ipc.Client.call(Client.java:1455)
> at org.apache.hadoop.ipc.Client.call(Client.java:1382)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> at com.sun.proxy.$Proxy14.delete(Unknown Source)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:521)
> at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy15.delete(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1929)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:638)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:634)
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:634)
> at
> org.apache.tez.dag.app.DAGAppMaster.serviceStop(DAGAppMaster.java:1691)
> at
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
> at
> org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterShutdownHandler$AMShutdownRunnable.run(DAGAppMaster.java:724)
> at java.lang.Thread.run(Thread.java:722)
> {code}