errose28 opened a new pull request, #10584:
URL: https://github.com/apache/ozone/pull/10584

   ## What changes were proposed in this pull request?
   
   Datanodes have used push replication to move containers by default since 
early 2023. There is no reason to keep the pull model, which is now disabled 
behind a feature flag. We can safely remove it, and doing so will be easier 
before ZDU is merged into master.
   - On the ZDU replication path, both the client and server need to operate on 
the same apparent version to provide agreement on the API while datanodes are 
in mixed versions. Tracking this version requires twice the work when there are 
two protocols that have different client and server roles.
   - By removing the API before ZDU, we do not need any compatibility 
handling, since the cluster must be upgraded to the first ZDU-enabled version 
before it can use ZDU for upgrades going forward.
     - If we remove it after ZDU, we will need to take precautions to make sure 
replications don't fail while datanodes are being upgraded.
   
   ### Implementation Notes
   
   - Compression was only used for pull replication, where it was disabled by 
default. It was never used for push replication, so it is now removed entirely.
   - To exclude decommissioning and maintenance datanodes as replication 
targets, `AbstractReplicationTask#shouldOnlyRunOnInServiceDatanodes` could 
block pull replication from running on these nodes.
     - With push replication the switch was always `false` since a 
decom/maintenance node will need to push its replicas to other datanodes before 
it can be removed. Therefore the switch was removed.
   - Freon's `ClosedContainerReplicator` was removed, since it only worked for 
pull replication and there is no equivalent for push replication.
     - This tool worked when Freon was both the client and the target, and 
it could download from existing datanodes that were the server and the source.
     - With push replication, Freon remains the client but it must also be 
the source, despite having no container data, since that exists only in running 
datanodes.
     - Since this is not a CLI used on the production code path, it was 
removed entirely, although we could optionally deprecate it and make it a no-op.
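
To illustrate the note on `shouldOnlyRunOnInServiceDatanodes` above, here is a minimal sketch of why the switch carries no information once only push remains. It is not code from this patch: `ReplicationSwitchSketch`, `Model`, `NodeState`, and `mayRun` are hypothetical names, and treating the switch as always `true` for pull is a simplifying assumption.

```java
// Hypothetical model of the removed switch; not the actual Ozone classes.
class ReplicationSwitchSketch {
    enum NodeState { IN_SERVICE, DECOMMISSIONING, IN_MAINTENANCE }
    enum Model { PULL, PUSH }

    // Stand-in for AbstractReplicationTask#shouldOnlyRunOnInServiceDatanodes.
    // Assumption: pull tasks execute on the target, so they opt in to the
    // restriction; push tasks execute on the source, so they never do.
    static boolean shouldOnlyRunOnInServiceDatanodes(Model model) {
        return model == Model.PULL;
    }

    // Decides whether a task may run on the node that would execute it.
    static boolean mayRun(Model model, NodeState executingNode) {
        if (shouldOnlyRunOnInServiceDatanodes(model)) {
            return executingNode == NodeState.IN_SERVICE;
        }
        // A decommissioning or maintenance node must still be able to push
        // its replicas elsewhere before it can be removed.
        return true;
    }

    public static void main(String[] args) {
        for (Model model : Model.values()) {
            for (NodeState state : NodeState.values()) {
                String outcome = mayRun(model, state) ? "runs" : "skipped";
                System.out.println(model + " task on " + state + " node: " + outcome);
            }
        }
    }
}
```

In this model the `PUSH` branch returns `true` for every node state, which is why a switch that only push uses can be deleted without changing behavior.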
    
   ### Reviewer Notes
   
   This change is fairly large and I could not find a good way to break it 
down, although I am open to suggestions. One alternative is to iteratively 
review the prod changes first, and once those look good move on to the test 
changes.
   
   ## What is the link to the Apache JIRA
   
   HDDS-15614
   
   ## How was this patch tested?
   
   - Tests specifically for pull that already had push equivalents were removed.
   - Tests that were only using pull were migrated to push equivalents:
     - `TestOzoneContainerWithTLS` was previously only testing pull. It now 
tests push.
     - `TestReplicationSupervisor#testReplicationImportReserveSpace` was 
removed, since the reserved space check is relevant on the target only, and 
`ReplicationSupervisor` is now always the source.
       - `TestSendContainerRequestHandler#testNoSpaceOnTargetVolume` was added 
as a new test to cover this functionality. `SendContainerRequestHandler` is the 
receiving class on the target node in the push model.
   - `ReplicateContainerCommand#forTest` was no longer relevant. Tests now call 
`toTarget` directly.
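
For reviewers less familiar with why the reserved space check moved, the following is a rough sketch of a target-side check in the push model. It is an illustration under stated assumptions, not the logic in `SendContainerRequestHandler`: the `Volume` class, `tryReserve`, `receiveContainer`, and the choice to reserve exactly the container size are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a target-side space check; not the Ozone implementation.
class TargetSpaceCheckSketch {

    // Minimal stand-in for a datanode volume that tracks committed bytes.
    static final class Volume {
        private final long capacityBytes;
        private final AtomicLong committedBytes = new AtomicLong();

        Volume(long capacityBytes) {
            this.capacityBytes = capacityBytes;
        }

        // Atomically reserves space, or returns false if it does not fit.
        boolean tryReserve(long bytes) {
            while (true) {
                long current = committedBytes.get();
                if (current + bytes > capacityBytes) {
                    return false;
                }
                if (committedBytes.compareAndSet(current, current + bytes)) {
                    return true;
                }
            }
        }

        long available() {
            return capacityBytes - committedBytes.get();
        }
    }

    // Runs on the target: the source pushes, so only the receiver knows
    // whether its own volume can hold the incoming container.
    // Assumption: the reservation equals the container size.
    static boolean receiveContainer(Volume targetVolume, long containerBytes) {
        if (!targetVolume.tryReserve(containerBytes)) {
            return false; // reject the push, nothing is written
        }
        // The container import would happen here.
        return true;
    }

    public static void main(String[] args) {
        Volume volume = new Volume(100L);
        System.out.println("first push accepted: " + receiveContainer(volume, 60L));
        System.out.println("second push accepted: " + receiveContainer(volume, 60L));
        System.out.println("bytes still available: " + volume.available());
    }
}
```

The source never consults a volume in this sketch, which mirrors why a check of this kind is exercised through the receiving handler rather than through `ReplicationSupervisor`.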
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

