[
https://issues.apache.org/jira/browse/HDFS-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685364#comment-17685364
]
ASF GitHub Bot commented on HDFS-16911:
---------------------------------------
sadanand48 opened a new pull request, #5364:
URL: https://github.com/apache/hadoop/pull/5364
### Description of PR
Currently, `DistcpSync` (the step that applies the diff between the two
snapshots passed to the distcp job with the `-diff` option) supports only
the `DistributedFilesystem` and `WebHDFS` filesystems.
Now that Ozone supports snapshots, the change here is to make DistcpSync
also support OzoneFilesystem (ofs).
If we added a check in `DistcpSync` for whether the filesystem is an
instance of `BasicRootedOzoneFilesystem`,
we would need to introduce an Ozone dependency in Hadoop code, but Ozone
already depends on Hadoop, so the dependency would become circular. To solve
this, this PR introduces a job config named `distcp.sync.class`, whose default
value is the `DistcpSync` class itself. If a new filesystem wants to provide
its own implementation of `DistcpSync`, it only needs to extend the
`DistcpSync` class and set the config to that class name. The class is
instantiated at run time; if no class is configured, the job falls back to the
original `DistcpSync` implementation, so existing code is unaffected.
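The lookup described above could work roughly as in the following minimal sketch. It is not the code from this PR: `BaseSync`, `OzoneLikeSync`, and `SyncLoader` are hypothetical stand-ins, and `java.util.Properties` is used in place of the Hadoop job configuration so that the example runs on its own. Only the config key `distcp.sync.class` comes from the description.

```java
import java.util.Properties;

// Hypothetical stand-in for the default sync implementation.
class BaseSync {
    public String describe() {
        return "default";
    }
}

// Hypothetical filesystem-specific implementation that extends the default.
class OzoneLikeSync extends BaseSync {
    @Override
    public String describe() {
        return "ozone";
    }
}

class SyncLoader {
    // Config key named in the PR description.
    static final String SYNC_CLASS_KEY = "distcp.sync.class";

    // Resolve the configured class at run time; use BaseSync when the key is unset.
    static BaseSync load(Properties conf) throws ReflectiveOperationException {
        String name = conf.getProperty(SYNC_CLASS_KEY, BaseSync.class.getName());
        // asSubclass rejects classes that do not extend BaseSync.
        Class<? extends BaseSync> cls = Class.forName(name).asSubclass(BaseSync.class);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        System.out.println(load(conf).describe());

        conf.setProperty(SYNC_CLASS_KEY, OzoneLikeSync.class.getName());
        System.out.println(load(conf).describe());
    }
}
```

Because the class is resolved by name, the module that owns the default implementation never needs a compile-time reference to the filesystem-specific subclass.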
The changes in this PR mostly make it possible to extend the `DistcpSync`
class in Ozone by changing some access modifiers from private to
public/protected so that the affected members can be accessed from elsewhere.
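To illustrate why the access modifiers matter, here is a small hedged sketch. The class and method names (`SchemeCheckingSync`, `supportedSchemes`, `OfsSync`) are invented for this example and are not the members touched by the PR; the point is only that a subclass can replace a check once the relevant member is `protected` rather than `private`.

```java
import java.util.Set;

// Hypothetical base class; the names are illustrative only.
class SchemeCheckingSync {
    // If this method were private, a subclass could not change the supported set.
    protected Set<String> supportedSchemes() {
        return Set.of("hdfs", "webhdfs", "swebhdfs");
    }

    // Shared validation logic that consults the overridable hook.
    public final void validate(String scheme) {
        if (!supportedSchemes().contains(scheme)) {
            throw new IllegalArgumentException(
                "Unsupported file system: " + scheme + "://");
        }
    }
}

// Hypothetical subclass living in another project, accepting its own scheme.
class OfsSync extends SchemeCheckingSync {
    @Override
    protected Set<String> supportedSchemes() {
        return Set.of("ofs");
    }
}
```

With this shape, `new OfsSync().validate("ofs")` passes, while the base class still rejects `ofs` with an `IllegalArgumentException`.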
### How was this patch tested?
Tested by building the Hadoop code, replacing the distcp jars in Ozone, and
running the unit tests in Ozone.
Also tested on a cluster.
### For code changes:
> Distcp with snapshot diff to support Ozone filesystem.
> ------------------------------------------------------
>
> Key: HDFS-16911
> URL: https://issues.apache.org/jira/browse/HDFS-16911
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: distcp
> Reporter: Sadanand Shenoy
> Priority: Major
>
> Currently, DistcpSync (the step that applies the diff between the two
> snapshots passed to the distcp job with the -diff option) supports only
> the DistributedFilesystem and WebHDFS filesystems.
>
> {code:java}
> // currently we require both the source and the target file system are
> // DistributedFileSystem or (S)WebHdfsFileSystem.
> if (!(srcFs instanceof DistributedFileSystem
>     || srcFs instanceof WebHdfsFileSystem)) {
>   throw new IllegalArgumentException("Unsupported source file system: "
>       + srcFs.getScheme() + "://. " +
>       "Supported file systems: hdfs://, webhdfs:// and swebhdfs://.");
> }
> if (!(tgtFs instanceof DistributedFileSystem
>     || tgtFs instanceof WebHdfsFileSystem)) {
>   throw new IllegalArgumentException("Unsupported target file system: "
>       + tgtFs.getScheme() + "://. " +
>       "Supported file systems: hdfs://, webhdfs:// and swebhdfs://.");
> }{code}
> Since Ozone supports snapshots as of HDDS-6517, add support for using
> distcp with Ozone snapshots.
>