adoroszlai commented on PR #7376:
URL: https://github.com/apache/ozone/pull/7376#issuecomment-2464338771
With Hadoop 3.1.2, `hadoop dfs` commands run from the `rm` (Resource Manager)
node get stuck in retry:
```
2024-11-08 09:55:48 INFO RetryInvocationHandler:411 - java.lang.IllegalStateException, while invoking $Proxy11.submitRequest over nodeId=om2,nodeAddress=om2:9862 after 1 failover attempts. Trying to failover immediately.
```
The thread dump shows `main` sleeping in the retry handler:
```
"main" #1 prio=5 os_prio=0 tid=0x00007b1a08054000 nid=0x11d waiting on
condition [0x00007b1a11db5000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.processWaitTimeAndRetryInfo(RetryInvocationHandler.java:130)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:107)
- locked <0x00000006748192d0> (a org.apache.hadoop.io.retry.RetryInvocationHandler$Call)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy11.submitRequest(Unknown Source)
at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransport.submitRequest(Hadoop3OmTransport.java:80)
at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.submitRequest(OzoneManagerProtocolClientSideTranslatorPB.java:338)
at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.getServiceInfo(OzoneManagerProtocolClientSideTranslatorPB.java:1863)
at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:273)
at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:269)
at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:136)
at org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.<init>(BasicOzoneClientAdapterImpl.java:186)
at org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.<init>(OzoneClientAdapterImpl.java:51)
at org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:109)
at org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:200)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
at org.apache.hadoop.fs.shell.CommandWithDestination.getRemoteDestination(CommandWithDestination.java:195)
at org.apache.hadoop.fs.shell.CopyCommands$Put.processOptions(CopyCommands.java:259)
at org.apache.hadoop.fs.shell.Command.run(Command.java:175)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
```
With Hadoop 3.4.1, it fails with:
```
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/protobuf/ServiceException
at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:114)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:674)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:639)
at org.apache.hadoop.ozone.om.ha.OMFailoverProxyProviderBase.createOMProxy(OMFailoverProxyProviderBase.java:143)
at org.apache.hadoop.ozone.om.ha.HadoopRpcOMFailoverProxyProvider.createOMProxy(HadoopRpcOMFailoverProxyProvider.java:145)
at org.apache.hadoop.ozone.om.ha.HadoopRpcOMFailoverProxyProvider.getProxy(HadoopRpcOMFailoverProxyProvider.java:132)
at org.apache.hadoop.io.retry.RetryInvocationHandler$ProxyDescriptor.<init>(RetryInvocationHandler.java:202)
at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:335)
at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:329)
at org.apache.hadoop.io.retry.RetryProxy.create(RetryProxy.java:61)
at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransport.createRetryProxy(Hadoop3OmTransport.java:116)
at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransport.<init>(Hadoop3OmTransport.java:73)
at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactory.createOmTransport(Hadoop3OmTransportFactory.java:33)
at org.apache.hadoop.ozone.om.protocolPB.OmTransportFactory.create(OmTransportFactory.java:45)
at org.apache.hadoop.ozone.client.rpc.RpcClient.createOmTransport(RpcClient.java:414)
at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:261)
...
at org.apache.hadoop.fs.FsShell.main(FsShell.java:390)
```
It looks like shading protobuf does not work, because we still rely on
Hadoop jars outside of our fat FS jar (and classes in those jars do not know
about our shaded protobuf).
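
One way to confirm this on a given node might be a small classpath probe like the sketch below. It only checks whether a class name resolves; it is not part of Ozone or Hadoop. The class `ProtobufClasspathProbe` and the relocated name `org.example.shaded.com.google.protobuf.ServiceException` are hypothetical placeholders; substitute whatever prefix the shade configuration actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reports which class names the current classpath can resolve. */
class ProtobufClasspathProbe {

    /** Returns true if the class can be located; the class is not initialized. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ProtobufClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is covered here as well.
            return false;
        }
    }

    /** Probes each name in order and records whether it resolved. */
    static Map<String, Boolean> probe(String... classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, isLoadable(name));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Boolean> result = probe(
            // Unshaded class that ProtobufRpcEngine.getProxy fails on in the trace above.
            "com.google.protobuf.ServiceException",
            // ASSUMPTION: placeholder for the relocated class; use the real shade prefix.
            "org.example.shaded.com.google.protobuf.ServiceException");
        result.forEach((name, found) ->
            System.out.println((found ? "found    " : "MISSING  ") + name));
    }
}
```

Running it with the same classpath as the failing command (for example, the output of `hadoop classpath` plus the FS jar) should show whether only the relocated class is present while the unshaded one, which the Hadoop jars still reference, is missing.
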
_acceptance (MR)_
[passed](https://github.com/jojochuang/ozone/actions/runs/11735618372/job/32693998999)
in your fork without actually executing any Hadoop / MapReduce tests.