[jira] [Commented] (YARN-5898) Container can not stop, because the call stopContainer NMClient method appears DIGEST-MD5 exception, onGetContainerStatusError NMClientAsync method is also the same

2016-11-17 Thread gaoyanfu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15675413#comment-15675413
 ] 

gaoyanfu commented on YARN-5898:


The AppMaster did not restart. Usually, after the AppMaster has been running 
for some time, some containers hit this exception.

> Container can not stop, because the call stopContainer NMClient method 
> appears DIGEST-MD5 exception, onGetContainerStatusError NMClientAsync method 
> is also the same
> 
>
> Key: YARN-5898
> URL: https://issues.apache.org/jira/browse/YARN-5898
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api
>Affects Versions: 2.6.0
> Environment: cdh5.5,java 7
>Reporter: gaoyanfu
>  Labels: DIGEST-MD5, getContainerStatuses, 
> onGetContainerStatusError, stopContainer
> Fix For: 2.6.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Calling getContainerStatusAsync on NMClientAsync invokes the 
> onGetContainerStatusError callback with a DIGEST-MD5 SaslException, so the 
> ContainerStatus cannot be retrieved. Calling stopContainer on NMClient throws 
> the same exception, so the container is not stopped.
> ---REST API---
> request:
> http://server3.xdpp.boco:8042/ws/v1/node/containers
> response:
> {"containers":{"container":[
> {"id":"container_e07_1477704520017_0001_01_04","state":"RUNNING","exitCode":-1000,"diagnostics":"","user":"xdpp","totalMemoryNeededMB":8704,"totalVCoresNeeded":1,"containerLogsLink":"http://server3.xdpp.boco:8042/node/containerlogs/container_e07_1477704520017_0001_01_04/xdpp","nodeId":"server3.xdpp.boco:8041"},
> {"id":"container_e09_1477719748865_0003_01_25","state":"RUNNING","exitCode":-1000,"diagnostics":"","user":"xdpp","totalMemoryNeededMB":1536,"totalVCoresNeeded":1,"containerLogsLink":"http://server3.xdpp.boco:8042/node/containerlogs/container_e09_1477719748865_0003_01_25/xdpp","nodeId":"server3.xdpp.boco:8041"},
> {"id":"container_e09_1477719748865_0004_02_000103","state":"RUNNING","exitCode":-1000,"diagnostics":"","user":"xdpp","totalMemoryNeededMB":6656,"totalVCoresNeeded":1,"containerLogsLink":"http://server3.xdpp.boco:8042/node/containerlogs/container_e09_1477719748865_0004_02_000103/xdpp","nodeId":"server3.xdpp.boco:8041"}
> ]}}
> ---exception--
> 2016-11-14 11:17:12.725 ERROR containerStatusLogger [ContainerManager.java:484] *Container onGetContainerStatusError deal begin.containerId:container_e09_1477719748865_0003_01_25
> javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response.
>   at sun.reflect.GeneratedConstructorAccessor59.newInstance(Unknown Source) ~[na:na]
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.7.0_79]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526) ~[na:1.7.0_79]
>   at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) ~[hadoop-yarn-common-2.6.0.jar:na]
>   at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104) ~[hadoop-yarn-common-2.6.0.jar:na]
>   at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.getContainerStatuses(ContainerManagementProtocolPBClientImpl.java:127) ~[hadoop-yarn-common-2.6.0.jar:na]
>   at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source) ~[na:na]
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_79]
>   at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_79]
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[hadoop-common-2.6.0.jar:na]
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[hadoop-common-2.6.0.jar:na]
>   at com.sun.proxy.$Proxy23.getContainerStatuses(Unknown Source) ~[na:na]
>   at org.apache.hadoop.yarn.client.api.impl.NMClientImpl.getContainerStatus(NMClientImpl.java:267) ~[hadoop-yarn-client-2.6.0.jar:na]
>   at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl$ContainerEventProcessor.run(NMClientAsyncImpl.java:534) ~[hadoop-yarn-client-2.6.0.jar:na]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> Caused by: org.apache.hadoop.ipc.RemoteException: DIGEST-MD5: digest response 
> format violation. Mismatched
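
The REST check quoted in the issue description (GET /ws/v1/node/containers on the NodeManager web port) lists each container with its state. As a rough illustration of reading that response, the sketch below pulls id/state pairs out of the JSON text. The class name NodeContainers and the method parseStates are hypothetical names made up for this example. The regex is a simplification that assumes "state" directly follows "id", as it does in the quoted response; a real client would likely use a proper JSON parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: extracts container states
// from the text of a /ws/v1/node/containers response.
public class NodeContainers {
    // Assumption: "state" immediately follows "id", as in the quoted response.
    private static final Pattern ID_STATE = Pattern.compile(
        "\"id\"\\s*:\\s*\"([^\"]+)\"\\s*,\\s*\"state\"\\s*:\\s*\"([^\"]+)\"");

    /** Returns container id -> state, in the order they appear in the JSON text. */
    public static Map<String, String> parseStates(String json) {
        Map<String, String> states = new LinkedHashMap<>();
        Matcher m = ID_STATE.matcher(json);
        while (m.find()) {
            states.put(m.group(1), m.group(2));
        }
        return states;
    }

    public static void main(String[] args) {
        // Trimmed copy of two entries from the response quoted in the issue.
        String sample = "{\"containers\":{\"container\":["
            + "{\"id\":\"container_e07_1477704520017_0001_01_04\","
            + "\"state\":\"RUNNING\",\"exitCode\":-1000},"
            + "{\"id\":\"container_e09_1477719748865_0003_01_25\","
            + "\"state\":\"RUNNING\",\"exitCode\":-1000}"
            + "]}}";
        // Print one "id state" pair per line.
        for (Map.Entry<String, String> e : parseStates(sample).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Run against the NodeManager after a failed stopContainer call, a container that still shows RUNNING here would match the symptom reported in this issue.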

[jira] [Commented] (YARN-5898) Container can not stop, because the call stopContainer NMClient method appears DIGEST-MD5 exception, onGetContainerStatusError NMClientAsync method is also the same

2016-11-17 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673309#comment-15673309
 ] 

sandflee commented on YARN-5898:


Had the AM ever restarted?
