[ 
https://issues.apache.org/jira/browse/HDDS-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-4405:
-------------------------------------
    Description: 
{code:java}
[root@uma-1 ~]# sudo -u hdfs hdfs dfs -ls o3fs://bucket.volume.ozone1/
20/10/28 23:37:50 INFO retry.RetryInvocationHandler: 
com.google.protobuf.ServiceException: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException):
 OM:om2 is not the leader. Suggested leader is OM:om3.
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.createNotLeaderException(OzoneManagerProtocolServerSideTranslatorPB.java:198)
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:186)
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:123)
 at 
org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:73)
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:113)
 at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
, while invoking $Proxy10.submitRequest over 
{om1=nodeId=om1,nodeAddress=uma-1.uma.root.hwx.site:9862, 
om3=nodeId=om3,nodeAddress=uma-3.uma.root.hwx.site:9862, 
om2=nodeId=om2,nodeAddress=uma-2.uma.root.hwx.site:9862} after 1 failover 
attempts. Trying to failover immediately.{code}

This issue in the Apache Ozone main branch will be fixed once Hadoop version is 
updated. Till then if vendors/users who backport fix to their hadoop version, 
this fix will help them.

  was:
{code:java}
[root@uma-1 ~]# sudo -u hdfs hdfs dfs -ls o3fs://bucket.volume.ozone1/
20/10/28 23:37:50 INFO retry.RetryInvocationHandler: 
com.google.protobuf.ServiceException: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException):
 OM:om2 is not the leader. Suggested leader is OM:om3.
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.createNotLeaderException(OzoneManagerProtocolServerSideTranslatorPB.java:198)
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:186)
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:123)
 at 
org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:73)
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:113)
 at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
, while invoking $Proxy10.submitRequest over 
{om1=nodeId=om1,nodeAddress=uma-1.uma.root.hwx.site:9862, 
om3=nodeId=om3,nodeAddress=uma-3.uma.root.hwx.site:9862, 
om2=nodeId=om2,nodeAddress=uma-2.uma.root.hwx.site:9862} after 1 failover 
attempts. Trying to failover immediately.{code}


> Proxy failover is logging with out trying all OMS
> -------------------------------------------------
>
>                 Key: HDDS-4405
>                 URL: https://issues.apache.org/jira/browse/HDDS-4405
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Uma Maheswara Rao G
>            Assignee: Bharat Viswanadham
>            Priority: Major
>              Labels: pull-request-available
>
> {code:java}
> [root@uma-1 ~]# sudo -u hdfs hdfs dfs -ls o3fs://bucket.volume.ozone1/
> 20/10/28 23:37:50 INFO retry.RetryInvocationHandler: 
> com.google.protobuf.ServiceException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException):
>  OM:om2 is not the leader. Suggested leader is OM:om3.
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.createNotLeaderException(OzoneManagerProtocolServerSideTranslatorPB.java:198)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:186)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:123)
>  at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:73)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:113)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
> , while invoking $Proxy10.submitRequest over 
> {om1=nodeId=om1,nodeAddress=uma-1.uma.root.hwx.site:9862, 
> om3=nodeId=om3,nodeAddress=uma-3.uma.root.hwx.site:9862, 
> om2=nodeId=om2,nodeAddress=uma-2.uma.root.hwx.site:9862} after 1 failover 
> attempts. Trying to failover immediately.{code}
> This issue in the Apache Ozone main branch will be fixed once Hadoop version 
> is updated. Till then if vendors/users who backport fix to their hadoop 
> version, this fix will help them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to