adoroszlai commented on pull request #2360:
URL: https://github.com/apache/ozone/pull/2360#issuecomment-880096223


   Thanks a lot @elek for this improvement.  I have yet to review the change in 
detail, but I ran the existing replication smoketest and it failed:
   
   ```
   Creating ozonesecure_datanode_3 ... done
   ==============================================================================
   Wait :: Wait for replication to succeed
   ==============================================================================
   Wait Until Container Replicated                                       | FAIL |
   Test timeout 5 minutes exceeded.
   ------------------------------------------------------------------------------
   Wait :: Wait for replication to succeed                               | FAIL |
   1 test, 0 passed, 1 failed
   ```
   
   The same failure occurs in the unsecure environment.
   
   The only error message I see:
   
   ```
   datanode_3  | 2021-07-14 16:35:04,130 [ContainerReplicationThread-8] INFO replication.DownloadAndImportReplicator: Starting replication of container 1 from [b6bfac09-f440-4a0c-99df-72a2ef6d7c1a{ip: 172.24.0.4, host: ozonesecure_datanode_2.ozonesecure_default, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default-rack, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}, 0e9c8c1e-57de-4556-870d-09e757478913{ip: 172.24.0.7, host: ozonesecure_datanode_1.ozonesecure_default, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default-rack, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}]
   ...
   datanode_3  | 2021-07-14 16:35:14,148 [ContainerReplicationThread-8] ERROR replication.DownloadAndImportReplicator: Error on replicating container 1
   datanode_3  | org.apache.hadoop.ozone.container.stream.StreamingException: Streaming is failed. Not all files are streamed. Please check the log of the server. Last (partial?) streamed file: 
   datanode_3  |        at org.apache.hadoop.ozone.container.stream.StreamingClient.stream(StreamingClient.java:99)
   datanode_3  |        at org.apache.hadoop.ozone.container.stream.StreamingClient.stream(StreamingClient.java:86)
   datanode_3  |        at org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:142)
   datanode_3  |        at org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:142)
   datanode_3  | 2021-07-14 16:35:14,148 [ContainerReplicationThread-8] ERROR replication.ReplicationSupervisor: Container 1 can't be downloaded from any of the datanodes.
   ```
   
   No related messages found on datanode 1 or 2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
