Attila Doroszlai created HDDS-10177:
---------------------------------------

             Summary: OM RPC server restarted by InstallSnapshotThread during shutdown
                 Key: HDDS-10177
                 URL: https://issues.apache.org/jira/browse/HDDS-10177
             Project: Apache Ozone
          Issue Type: Bug
          Components: Ozone Manager
            Reporter: Attila Doroszlai


TestSnapshotBackgroundServices was successful:

{code}
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 171.3 s -- in org.apache.hadoop.ozone.om.TestSnapshotBackgroundServices
{code}

but it timed out during post-test cluster shutdown, because it was waiting 
indefinitely for the RPC server to stop:

{code}
"main" 
   java.lang.Thread.State: WAITING
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at org.apache.hadoop.ipc.Server.join(Server.java:3569)
        at org.apache.hadoop.ozone.om.OzoneManager.join(OzoneManager.java:2286)
        at org.apache.hadoop.ozone.MiniOzoneClusterImpl.stopOM(MiniOzoneClusterImpl.java:558)
        at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl.stop(MiniOzoneHAClusterImpl.java:311)
        at org.apache.hadoop.ozone.MiniOzoneClusterImpl.shutdown(MiniOzoneClusterImpl.java:453)
        at org.apache.hadoop.ozone.om.TestSnapshotBackgroundServices.shutdown(TestSnapshotBackgroundServices.java:202)
{code}

The problem is that {{InstallSnapshotThread}} restarted the RPC server after 
{{OzoneManager.stop}} had already begun stopping it:

{code}
2024-01-20 18:37:17,649 [main] INFO  ozone.MiniOzoneHAClusterImpl (MiniOzoneHAClusterImpl.java:stop(310)) - Stopping the OzoneManager omNode-3
2024-01-20 18:37:17,649 [main] INFO  om.OzoneManager (OzoneManager.java:stop(2204)) - omNode-3[localhost:15012]: Stopping Ozone Manager
2024-01-20 18:37:17,650 [main] INFO  ipc.Server (Server.java:stop(3523)) - Stopping server on 15012
...
2024-01-20 18:37:17,913 [omNode-3-InstallSnapshotThread] INFO  ipc.Server (Server.java:<init>(1287)) - Listener at localhost:15012
2024-01-20 18:37:17,932 [omNode-3-InstallSnapshotThread] INFO  om.OzoneManager (OzoneManager.java:installCheckpoint(3863)) - RPC server is re-started. Spend 377 ms.
{code}
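
One possible way to avoid this kind of race, shown only as a minimal sketch and not as the actual Ozone code or the eventual fix: the shutdown path records that the service is stopped, and the background restart path checks that flag under the same lock before bringing the server back. The class {{GuardedRestart}} and its methods are hypothetical stand-ins; the RPC server is modelled as a plain boolean.

```java
/**
 * Hypothetical stand-in for a service that owns an RPC server.
 * Illustrates guarding a background restart against a concurrent shutdown;
 * it does not reflect the real OzoneManager implementation.
 */
class GuardedRestart {
    private final Object lock = new Object();
    private boolean stopped = false;    // set once stop() has been called
    private boolean rpcRunning = true;  // models the RPC server state

    /** Shutdown path: mark the service as stopped, then stop the server. */
    void stop() {
        synchronized (lock) {
            stopped = true;
            rpcRunning = false;
        }
    }

    /**
     * Background path (for example, after installing a checkpoint).
     * Returns false and leaves the server down if shutdown already started.
     */
    boolean restartRpcServer() {
        synchronized (lock) {
            if (stopped) {
                return false; // shutdown wins; do not bring the server back
            }
            rpcRunning = true;
            return true;
        }
    }

    boolean isRpcRunning() {
        synchronized (lock) {
            return rpcRunning;
        }
    }
}
```

With such a guard, a restart attempted after {{stop()}} is a no-op, so a caller waiting for the server to stop is not left waiting on a freshly restarted one.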



