[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234197#comment-16234197
 ] 

Sean Mackrory commented on HDFS-11096:
--------------------------------------

From an HDFS standpoint, definitely - I've run many successful rolling upgrade 
and distcp-over-webhdfs tests this week and updated the patch. The only thing 
remaining is to get automation itself in place after this is committed.

I looked into the YARN issues. I'm still seeing very similar symptoms to the 
YARN-6457 issue mentioned above in both branch-3.0 and trunk. In trunk I'm also 
seeing this:

{quote}
17/10/31 23:05:49 INFO security.AMRMTokenSecretManager: Creating password for appattempt_1509490231144_0628_000002
17/10/31 23:05:49 INFO amlauncher.AMLauncher: Error launching appattempt_1509490231144_0628_000002. Got exception: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid container token used for starting container on : container-5.docker:35151
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789)
        at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70)
        at org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:127)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)

        at sun.reflect.GeneratedConstructorAccessor70.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
        at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
        at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:131)
        at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy89.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:304)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid container token used for starting container on : container-5.docker:35151
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789)
        at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70)
        at org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:127)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)

        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
        at org.apache.hadoop.ipc.Client.call(Client.java:1437)
        at org.apache.hadoop.ipc.Client.call(Client.java:1347)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy88.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
        ... 14 more

17/10/31 23:05:49 INFO attempt.RMAppAttemptImpl: Updating application attempt appattempt_1509490231144_0628_000002 with final state: FAILED, and exit status: -1000
17/10/31 23:05:49 INFO attempt.RMAppAttemptImpl: appattempt_1509490231144_0628_000002 State change from ALLOCATED to FINAL_SAVING on event = LAUNCH_FAILED
{quote}
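
To see how often this shows up across a longer ResourceManager log, a minimal sketch along these lines could list AM launch failures with their exception class. It assumes the "Error launching ... Got exception: ..." line shape shown in the excerpt above; the regex and the function name `launch_failures` are hypothetical helpers for illustration, not part of Hadoop.

```python
import re

# Pattern derived from the AMLauncher line in the excerpt above (an assumption:
# other Hadoop versions or log layouts may word this differently).
# \s+ is used between tokens so the match survives mail-style line wrapping.
LAUNCH_ERROR = re.compile(
    r"Error launching\s+(appattempt_\d+_\d+_\d+)\.\s+Got exception:\s+([\w.$]+)"
)


def launch_failures(log_text):
    """Return (attempt_id, exception_class) pairs for each AM launch error."""
    return LAUNCH_ERROR.findall(log_text)


if __name__ == "__main__":
    # Sample text copied from the log excerpt, including its line wrapping.
    sample = (
        "17/10/31 23:05:49 INFO amlauncher.AMLauncher: Error launching \n"
        "appattempt_1509490231144_0628_000002. Got exception: \n"
        "org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid container \n"
        "token used for starting container on : container-5.docker:35151\n"
    )
    for attempt, exc in launch_failures(sample):
        print(attempt, exc)
```

Matching on the whole text rather than line by line is deliberate here, since the wrapped log splits a single record over several lines.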

> Support rolling upgrade between 2.x and 3.x
> -------------------------------------------
>
>                 Key: HDFS-11096
>                 URL: https://issues.apache.org/jira/browse/HDFS-11096
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: rolling upgrades
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Andrew Wang
>            Assignee: Sean Mackrory
>            Priority: Blocker
>         Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, 
> HDFS-11096.006.patch, HDFS-11096.007.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



