adoroszlai commented on PR #7376:
URL: https://github.com/apache/ozone/pull/7376#issuecomment-2464338771

   With Hadoop 3.1.2, `hadoop dfs` commands run from the `rm` (Resource Manager) node get stuck in a retry loop:
   
   ```
   2024-11-08 09:55:48 INFO  RetryInvocationHandler:411 - java.lang.IllegalStateException, while invoking $Proxy11.submitRequest over nodeId=om2,nodeAddress=om2:9862 after 1 failover attempts. Trying to failover immediately.
   ```
   
   The thread dump shows the main thread sleeping between retries at:
   
   ```
   "main" #1 prio=5 os_prio=0 tid=0x00007b1a08054000 nid=0x11d waiting on condition [0x00007b1a11db5000]
      java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.processWaitTimeAndRetryInfo(RetryInvocationHandler.java:130)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:107)
        - locked <0x00000006748192d0> (a org.apache.hadoop.io.retry.RetryInvocationHandler$Call)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy11.submitRequest(Unknown Source)
        at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransport.submitRequest(Hadoop3OmTransport.java:80)
        at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.submitRequest(OzoneManagerProtocolClientSideTranslatorPB.java:338)
        at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.getServiceInfo(OzoneManagerProtocolClientSideTranslatorPB.java:1863)
        at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:273)
        at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:269)
        at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:136)
        at org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.<init>(BasicOzoneClientAdapterImpl.java:186)
        at org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.<init>(OzoneClientAdapterImpl.java:51)
        at org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:109)
        at org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:200)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
        at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
        at org.apache.hadoop.fs.shell.CommandWithDestination.getRemoteDestination(CommandWithDestination.java:195)
        at org.apache.hadoop.fs.shell.CopyCommands$Put.processOptions(CopyCommands.java:259)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:175)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
   ```
   
   
   With Hadoop 3.4.1, it fails with:
   
   ```
   Exception in thread "main" java.lang.NoClassDefFoundError: com/google/protobuf/ServiceException
        at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:114)
        at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:674)
        at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:639)
        at org.apache.hadoop.ozone.om.ha.OMFailoverProxyProviderBase.createOMProxy(OMFailoverProxyProviderBase.java:143)
        at org.apache.hadoop.ozone.om.ha.HadoopRpcOMFailoverProxyProvider.createOMProxy(HadoopRpcOMFailoverProxyProvider.java:145)
        at org.apache.hadoop.ozone.om.ha.HadoopRpcOMFailoverProxyProvider.getProxy(HadoopRpcOMFailoverProxyProvider.java:132)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$ProxyDescriptor.<init>(RetryInvocationHandler.java:202)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:335)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:329)
        at org.apache.hadoop.io.retry.RetryProxy.create(RetryProxy.java:61)
        at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransport.createRetryProxy(Hadoop3OmTransport.java:116)
        at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransport.<init>(Hadoop3OmTransport.java:73)
        at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactory.createOmTransport(Hadoop3OmTransportFactory.java:33)
        at org.apache.hadoop.ozone.om.protocolPB.OmTransportFactory.create(OmTransportFactory.java:45)
        at org.apache.hadoop.ozone.client.rpc.RpcClient.createOmTransport(RpcClient.java:414)
        at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:261)
           ...
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:390)
   ```
   
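   It looks like shading protobuf does not work, because we still rely on Hadoop jars outside of our fat FS jar, and classes in those jars do not know about our shaded protobuf. In the trace above, Hadoop's own `ProtobufRpcEngine` still needs the unshaded `com.google.protobuf.ServiceException`.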
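
To see which protobuf flavour is actually visible to the client, a small probe along these lines could be run on the same classpath as the failing command. This is only a sketch: `ClasspathProbe` and `locate` are made-up names, and the relocated package names in the candidate list are assumptions about how the shading might be configured, not taken from this PR.

```java
import java.security.CodeSource;

public class ClasspathProbe {

    /** Returns the jar/directory a class is loaded from, or a marker if it cannot be loaded. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class only, do not run static initializers
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no CodeSource
            return src == null ? "<bootstrap/JDK>" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not found>";
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            // unshaded class that ProtobufRpcEngine needs in the trace above
            "com.google.protobuf.ServiceException",
            // ASSUMPTION: relocated name used by Hadoop's shaded third-party protobuf
            "org.apache.hadoop.thirdparty.protobuf.ServiceException",
            // ASSUMPTION: hypothetical relocation prefix inside the Ozone fat FS jar
            "org.apache.hadoop.ozone.shaded.com.google.protobuf.ServiceException",
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running it with something like `java -cp "$(hadoop classpath):." ClasspathProbe` on the `rm` node should show whether the unshaded class is missing while only a relocated one is present, which would match the `NoClassDefFoundError` above.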
   
   _acceptance (MR)_ [passed](https://github.com/jojochuang/ozone/actions/runs/11735618372/job/32693998999) in your fork without actually executing any Hadoop / MapReduce tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

