jackjlli commented on a change in pull request #5218: Fix HDFS copy logic
URL: https://github.com/apache/incubator-pinot/pull/5218#discussion_r409846110
##########
File path:
pinot-plugins/pinot-file-system/pinot-hdfs/src/main/java/org/apache/pinot/plugin/filesystem/HadoopPinotFS.java
##########
@@ -97,13 +100,24 @@ public boolean copy(URI srcUri, URI dstUri)
       throws IOException {
     Path source = new Path(srcUri);
     Path target = new Path(dstUri);
-    RemoteIterator<LocatedFileStatus> sourceFiles = _hadoopFS.listFiles(source, true);
+    RemoteIterator<FileStatus> sourceFiles = _hadoopFS.listStatusIterator(source);
     if (sourceFiles != null) {
       while (sourceFiles.hasNext()) {
-        boolean succeeded = FileUtil.copy(_hadoopFS, sourceFiles.next().getPath(), _hadoopFS, target, true, _hadoopConf);
-        if (!succeeded) {
-          return false;
+        FileStatus sourceFile = sourceFiles.next();
+        Path sourceFilePath = sourceFile.getPath();
+        if (sourceFile.isFile()) {
+          try {
+            FileUtil.copy(_hadoopFS, sourceFilePath, _hadoopFS, new Path(target, sourceFilePath.getName()), false,
+                _hadoopConf);
+          } catch (FileNotFoundException e) {
+            LOGGER.warn("Not found file {}, skipping copying it...", sourceFilePath, e);
+          }
+        } else if (sourceFile.isDirectory()) {
+          try {
+            copy(sourceFilePath.toUri(), new Path(target, sourceFilePath.getName()).toUri());
Review comment:
Here the destination URI is constructed from target + sourceFilePath. Can you
make sure `sourceFilePath` is a relative path rather than an absolute path?
Also, if possible, it would be good to add some tests for this method. Thanks!
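
To illustrate the concern, here is a minimal sketch that uses `java.nio.file.Path` as a stand-in for Hadoop's `org.apache.hadoop.fs.Path`. That substitution is an assumption made so the example runs without Hadoop on the classpath, and the two classes are not the same API. The class name `DestinationPathSketch` and the helper `destinationFor` are hypothetical and do not appear in the PR. The sketch shows why the destination is safer when built from only the final name component of the source entry (the analogue of `sourceFilePath.getName()` in the diff) rather than from the full source path.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class DestinationPathSketch {

  // Hypothetical helper: builds the destination for one source entry by
  // appending only the entry's final name component to the target directory.
  // This mirrors new Path(target, sourceFilePath.getName()) in the diff,
  // but uses java.nio.file.Path instead of the Hadoop Path class.
  static Path destinationFor(Path target, Path sourceEntry) {
    return target.resolve(sourceEntry.getFileName());
  }

  public static void main(String[] args) {
    Path target = Paths.get("/dst/segments");
    Path sourceEntry = Paths.get("/src/table/segment_0");

    // Only the final component is appended, so the result stays under target.
    System.out.println(destinationFor(target, sourceEntry));

    // Resolving the full absolute source path instead does not nest it under
    // target: for an absolute argument, resolve() returns the argument itself.
    System.out.println(target.resolve(sourceEntry));
  }
}
```

A unit test for the real `copy` method could assert the same property: after copying a nested directory, every entry should appear under the target with its name preserved, and nothing should be written back to the source location.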