sadanand48 opened a new pull request, #5364: URL: https://github.com/apache/hadoop/pull/5364
### Description of PR

Currently, `DistCpSync` (the step that applies the diff between the two snapshots passed to the distcp job with the `-diff` option) supports only the `DistributedFileSystem` and `WebHDFS` filesystems. Now that Ozone supports snapshots, this change makes `DistCpSync` usable with the Ozone filesystem (ofs) as well.

If we added a check in `DistCpSync` for whether the filesystem is an instance of `BasicRootedOzoneFileSystem`, we would need to introduce an Ozone dependency in the Hadoop code, while Ozone already depends on Hadoop. To avoid this, the PR introduces a job config named `distcp.sync.class`, whose default value is the `DistCpSync` class itself. A filesystem that wants its own implementation only needs to extend `DistCpSync` and set the config to that class name. The class is instantiated at run time; if no class is provided, the job falls back to the original `DistCpSync` implementation, so existing behavior is unaffected.

Most of the changes in this PR make `DistCpSync` extensible from Ozone by changing some access modifiers from private to public/protected so that the members can be accessed from a subclass elsewhere.

### How was this patch tested?

Tested by building the Hadoop code, replacing the distcp jars in Ozone, and running the unit tests in Ozone. Also tested on a cluster.
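
For illustration, the config-driven lookup described above can be sketched in plain Java. This is a minimal sketch, not the code in this PR: only the key `distcp.sync.class` comes from the description, while `SyncLoaderSketch`, `BaseSync`, `CustomSync`, `describe()`, and the use of `java.util.Properties` in place of the Hadoop job configuration are hypothetical stand-ins.

```java
import java.util.Properties;

// Hypothetical stand-in names throughout; this is not the Hadoop or Ozone API.
class SyncLoaderSketch {
    // Config key named in the PR description.
    static final String SYNC_CLASS_KEY = "distcp.sync.class";

    // Stand-in for the default sync implementation.
    static class BaseSync {
        // Non-private so that a subclass can override it.
        protected String describe() {
            return "default sync";
        }
    }

    // Stand-in for a filesystem-specific subclass (for example, one shipped by Ozone).
    static class CustomSync extends BaseSync {
        @Override
        protected String describe() {
            return "custom sync";
        }
    }

    // Reads the class name from the config, falling back to the base class
    // when the key is unset, and instantiates it reflectively at run time.
    static BaseSync load(Properties conf) throws ReflectiveOperationException {
        String name = conf.getProperty(SYNC_CLASS_KEY, BaseSync.class.getName());
        Class<? extends BaseSync> cls = Class.forName(name).asSubclass(BaseSync.class);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        // Key unset: the base implementation is used.
        System.out.println(load(conf).describe());
        // Key set: the named subclass is used instead.
        conf.setProperty(SYNC_CLASS_KEY, CustomSync.class.getName());
        System.out.println(load(conf).describe());
    }
}
```

The point of this design is that the base module never references the subclass by type; it only needs the class name as a string, so no compile-time dependency on the extending project is required.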
