adoroszlai commented on PR #3903:
URL: https://github.com/apache/ozone/pull/3903#issuecomment-1298504475

   > Retrigger build doesn't work, still failing...
   
   Please check logs before retriggering.
   
   The config test is failing:
   
   ```
   Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.421 s <<< FAILURE! - in org.apache.hadoop.ozone.TestOzoneConfigurationFields
   
   org.apache.hadoop.ozone.TestOzoneConfigurationFields.testCompareXmlAgainstConfigurationClass  Time elapsed: 0.244 s  <<< FAILURE!
   java.lang.AssertionError: ozone-default.xml has 1 properties missing in  class org.apache.hadoop.ozone.OzoneConfigKeys  class org.apache.hadoop.hdds.scm.ScmConfigKeys  class org.apache.hadoop.ozone.om.OMConfigKeys  class org.apache.hadoop.hdds.HddsConfigKeys  class org.apache.hadoop.hdds.recon.ReconConfigKeys  class org.apache.hadoop.ozone.recon.ReconServerConfigKeys  class org.apache.hadoop.ozone.s3.S3GatewayConfigKeys  class org.apache.hadoop.hdds.scm.server.SCMHTTPServerConfig  class org.apache.hadoop.hdds.scm.server.SCMHTTPServerConfig$ConfigStrings  class org.apache.hadoop.hdds.scm.ScmConfig$ConfigStrings Entries:   ozone.recon.address expected:<0> but was:<1>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:647)
        at org.apache.hadoop.conf.TestConfigurationFieldsBase.testCompareXmlAgainstConfigurationClass(TestConfigurationFieldsBase.java:540)
   ```
   
   I think the timeouts are due to datanodes being stuck trying to connect to 
Recon when none is started:
   
   ```
   2022-10-31 19:22:13,880 [EndpointStateMachine task thread for /0.0.0.0:9891 - 0 ] INFO  ipc.Client (Client.java:handleConnectionFailure(1010)) - Retrying connect to server: 0.0.0.0/0.0.0.0:9891. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=15, sleepTime=1000 MILLISECONDS)
   2022-10-31 19:22:13,882 [EndpointStateMachine task thread for /0.0.0.0:9891 - 0 ] WARN  statemachine.EndpointStateMachine (EndpointStateMachine.java:logIfNeeded(242)) - Unable to communicate to Recon server at 0.0.0.0:9891 for past 0 seconds.
   java.net.ConnectException: Your endpoint configuration is wrong; For more details see:  http://wiki.apache.org/hadoop/UnsetHostnameOrPort
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:913)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:824)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1616)
        at org.apache.hadoop.ipc.Client.call(Client.java:1558)
        at org.apache.hadoop.ipc.Client.call(Client.java:1455)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:235)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:122)
        at com.sun.proxy.$Proxy52.submitRequest(Unknown Source)
        at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolClientSideTranslatorPB.submitRequest(StorageContainerDatanodeProtocolClientSideTranslatorPB.java:117)
        at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolClientSideTranslatorPB.getVersion(StorageContainerDatanodeProtocolClientSideTranslatorPB.java:133)
        at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:69)
        at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:40)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
   Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:205)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:711)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:833)
        at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
        at org.apache.hadoop.ipc.Client.call(Client.java:1502)
        ... 12 more
   ```
   
   I _guess_ the config property was left out of the defaults so that users 
knowingly configure it to a valid address if needed.  If you would like to add 
it for documentation purposes, I suggest trying an empty value as the default, 
to retain the existing behavior.
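   
   As a rough sketch of that suggestion, an entry along these lines in 
`ozone-default.xml` would document the property while leaving it empty by 
default.  Only the property name `ozone.recon.address` comes from the test 
output above; the tag and description text are placeholders I made up, and 
whether an empty value really keeps datanodes from trying to reach Recon would 
still need to be confirmed by the build:
   
   ```xml
   <!-- Sketch only: tag and description are illustrative assumptions -->
   <property>
     <name>ozone.recon.address</name>
     <!-- Empty default, intended to retain the behavior of an unset property -->
     <value></value>
     <tag>RECON</tag>
     <description>
       Address of the Recon server that datanodes report to.
       Leave empty unless Recon is deployed.
     </description>
   </property>
   ```
   
   Note that `TestOzoneConfigurationFields` also compares the XML against the 
config key classes listed in the failure, so the property may additionally need 
to be declared in (or excluded from the check for) one of those classes.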


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

