errose28 opened a new pull request, #10584:
URL: https://github.com/apache/ozone/pull/10584
## What changes were proposed in this pull request?
Datanodes have used push replication to move containers by default since
early 2023. There is no reason to continue using the pull model, which is now
disabled behind a feature flag. We can safely remove it, and doing so will be
easier before ZDU is merged into master.
- On the ZDU replication path, both the client and server need to operate on
the same apparent version to provide agreement on the API while datanodes are
in mixed versions. Tracking this version requires twice the work when there are
two protocols that have different client and server roles.
- By removing the API before ZDU, we do not need to do any compatibility
handling, since the cluster must be upgraded to the first ZDU-enabled version
before it can use ZDU for upgrades going forward.
- If we remove it after ZDU, we will need to take precautions to make sure
the replications don't fail as datanodes are being upgraded.
### Implementation Notes
- Compression was only used for pull replication where it was disabled by
default. This was never used for push replication so it is now removed entirely.
- To exclude decommissioning and maintenance datanodes as replication
targets, `AbstractReplicationTask#shouldOnlyRunOnInServiceDatanodes` could
block pull replication from running on these nodes.
  - With push replication the switch was always `false`, since a
    decom/maintenance node will need to push its replicas to other datanodes
    before it can be removed. Therefore the switch was removed.
- Freon's `ClosedContainerReplicator` was removed, since it only worked for
pull replication and there is no equivalent for push replication.
  - This tool worked when Freon was both the client and the target, and
    it could download from existing Datanodes that were the server and the
    source.
  - With push replication, Freon remains the client but it must also be
    the source, despite having no container data, since that only exists in
    running datanodes.
  - Since this is not a CLI used on the production code path, it was
    removed entirely, although we could optionally deprecate it and make it a
    no-op.
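
The role mismatch described for Freon can be sketched roughly as below. This is
an illustrative model only, not code from this PR: `ReplicationRoles`, `Model`,
`Roles`, `rolesFor`, and `canReplicate` are hypothetical names, and the node
names are made up.

```java
import java.util.Set;

/** Illustrative only: none of these types exist in Ozone. */
public class ReplicationRoles {

  /** The two replication models discussed in this PR. */
  public enum Model { PULL, PUSH }

  /** Who drives the RPC (client/server) and who holds the data (source/target). */
  public static final class Roles {
    public final String client;
    public final String server;
    public final String source;
    public final String target;

    Roles(String client, String server, String source, String target) {
      this.client = client;
      this.server = server;
      this.source = source;
      this.target = target;
    }
  }

  /** The initiator is always the client; the model decides which way data flows. */
  public static Roles rolesFor(Model model, String initiator, String peer) {
    if (model == Model.PULL) {
      // Pull: the client downloads, so it is the target and the peer is the source.
      return new Roles(initiator, peer, peer, initiator);
    }
    // Push: the client uploads, so it is the source and the peer is the target.
    return new Roles(initiator, peer, initiator, peer);
  }

  /** Replication can only proceed if the source actually holds the container. */
  public static boolean canReplicate(Roles roles, Set<String> nodesHoldingContainer) {
    return nodesHoldingContainer.contains(roles.source);
  }

  public static void main(String[] args) {
    Set<String> holders = Set.of("dn1"); // assumed: only the datanode has the data
    Roles pull = rolesFor(Model.PULL, "freon", "dn1");
    Roles push = rolesFor(Model.PUSH, "freon", "dn1");
    System.out.println("pull possible: " + canReplicate(pull, holders));
    System.out.println("push possible: " + canReplicate(push, holders));
  }
}
```

Under these assumptions, a client without container data can act as a pull
target but not as a push source, which is why the tool has no push equivalent.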
### Reviewer Notes
This change is fairly large and I could not find a good way to break it
down, although I am open to suggestions. One alternative is to iteratively
review the prod changes first, and once those look good move on to the test
changes.
## What is the link to the Apache JIRA
HDDS-15614
## How was this patch tested?
- Tests specifically for pull that already had push equivalents were removed.
- Tests that were only using pull were migrated to push equivalents:
  - `TestOzoneContainerWithTLS` was previously only testing pull. It now
    tests push.
  - `TestReplicationSupervisor#testReplicationImportReserveSpace` was
    removed, since the reserved space check is relevant on the target only, and
    `ReplicationSupervisor` is now always the source.
  - `TestSendContainerRequestHandler#testNoSpaceOnTargetVolume` was added
    as a new test to cover this functionality. `SendContainerRequestHandler` is
    the receiving class on the target node in the push model.
- `ReplicateContainerCommand#forTest` was no longer relevant. Tests now call
`toTarget` directly.
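
To illustrate why the reserved space check belongs on the receiving side, here
is a minimal sketch of a target that reserves space before accepting a pushed
container. It is not the logic in `SendContainerRequestHandler`: `Volume`,
`tryReserve`, and `receiveContainer` are hypothetical names, and the sizes are
arbitrary.

```java
/** Illustrative only: a toy model of a target-side space check. */
public class TargetSpaceCheck {

  /** A volume that tracks how many bytes are already committed to imports. */
  public static final class Volume {
    private final long capacityBytes;
    private long committedBytes;

    public Volume(long capacityBytes) {
      this.capacityBytes = capacityBytes;
    }

    /** Reserves the requested bytes, or returns false if they do not fit. */
    public synchronized boolean tryReserve(long bytes) {
      if (bytes < 0 || bytes > capacityBytes - committedBytes) {
        return false;
      }
      committedBytes += bytes;
      return true;
    }

    public synchronized long availableBytes() {
      return capacityBytes - committedBytes;
    }
  }

  /**
   * Runs on the target: only the node receiving the container knows whether
   * its own volume has room, so the source cannot make this decision.
   */
  public static boolean receiveContainer(Volume targetVolume, long containerSizeBytes) {
    if (!targetVolume.tryReserve(containerSizeBytes)) {
      return false; // reject the push: no space on the target volume
    }
    // A real import would write the container data here.
    return true;
  }

  public static void main(String[] args) {
    Volume volume = new Volume(100);
    System.out.println("first push accepted: " + receiveContainer(volume, 60));
    System.out.println("second push accepted: " + receiveContainer(volume, 60));
  }
}
```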
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]