Srinivasu Majeti created HDFS-14323:
---------------------------------------
Summary: Distcp fails in Hadoop 3.x when 2.x source webhdfs url
has special characters in hdfs file path
Key: HDFS-14323
URL: https://issues.apache.org/jira/browse/HDFS-14323
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 3.2.0
Reporter: Srinivasu Majeti
HDFS-13176 added support for semicolons in source/target URLs for the distcp
use case, and HDFS-13582 followed with a backward compatibility fix. An issue
still remains when distcp is triggered from a 3.x cluster to pull webhdfs data
from a 2.x Hadoop cluster. The existing fix may need to be adjusted as shown in
the patch below, so that the path is marked as already encoded only when
decoding it actually changes it. That change fixes the issue.
diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
index 5936603c34a..dc790286aff 100644
--- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
+++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
@@ -609,7 +609,10 @@ URL toUrl(final HttpOpParam.Op op, final Path fspath,
     boolean pathAlreadyEncoded = false;
     try {
       fspathUriDecoded = URLDecoder.decode(fspathUri.getPath(), "UTF-8");
-      pathAlreadyEncoded = true;
+      if(!fspathUri.getPath().equals(fspathUriDecoded))
+      {
+        pathAlreadyEncoded = true;
+      }
     } catch (IllegalArgumentException ex) {
       LOG.trace("Cannot decode URL encoded file", ex);
     }
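
The check proposed in the patch can be tried in isolation. The following is
a minimal sketch, not the WebHdfsFileSystem code itself: the class name
EncodedPathCheck, the method isPathAlreadyEncoded, and the sample paths are
hypothetical and exist only for illustration. It treats a path as already
encoded only when URL-decoding changes it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical standalone illustration of the check in the patch above.
public class EncodedPathCheck {

  // Returns true only when URL-decoding the path yields a different string,
  // i.e. the path contained something the decoder rewrote.
  static boolean isPathAlreadyEncoded(String path) {
    try {
      String decoded = URLDecoder.decode(path, "UTF-8");
      return !path.equals(decoded);
    } catch (IllegalArgumentException ex) {
      // Malformed escape sequence (for example "%zz"): cannot be decoded,
      // so it is treated here as not encoded.
      return false;
    } catch (UnsupportedEncodingException ex) {
      // UTF-8 is always available on a compliant JVM.
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    // Raw semicolon: decoding is a no-op, so the path is not flagged.
    System.out.println(isPathAlreadyEncoded("/user/test/a;b"));   // false
    // Percent-encoded semicolon: decoding changes the string.
    System.out.println(isPathAlreadyEncoded("/user/test/a%3Bb")); // true
    // URLDecoder turns '+' into a space, so a literal '+' is also flagged.
    System.out.println(isPathAlreadyEncoded("/user/test/a+b"));   // true
  }
}
```

Note that java.net.URLDecoder follows form-encoding rules and rewrites '+' as
a space, so with this approach a path containing a literal '+' is also
reported as already encoded. Whether that matters for the distcp case is not
covered by the patch above.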