[jira] [Comment Edited] (HADOOP-13618) IllegalArgumentException when accessing Swift object with name containing space character

2016-10-13 Thread Yulei Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573866#comment-15573866
 ] 

Yulei Li edited comment on HADOOP-13618 at 10/14/16 2:44 AM:
-

Both cases mean that the object name contains spaces and the string "%20". I 
will package the jar, including HADOOP-13617 and HADOOP-13618, and send it to you.


was (Author: charlse):
Both cases mean that the object name contains spaces and the string "%20". I 
will package the jar, including HADOOP-13617 and HADOOP-13618, and send it to you.

> IllegalArgumentException when accessing Swift object with name containing 
> space character
> -
>
> Key: HADOOP-13618
> URL: https://issues.apache.org/jira/browse/HADOOP-13618
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/swift
> Affects Versions: 2.6.0
> Environment: Linux EL6
> Reporter: Steve Yang
> Assignee: Yulei Li
> Fix For: 3.0.0-alpha2
>
> Attachments: HADOOP-13618.patch, avro_test.zip
>
>
> We are using Spark and hadoop-openstack-2.6.0.jar 
> (compile('org.apache.hadoop:hadoop-openstack:2.6.0')) to access Oracle 
> Storage Service, which is Swift-based:
> DataFrame df = 
> hiveCtx.read().format("com.databricks.spark.csv").option(...).load(objectName);
> When accessing a Swift URL like "swift://Linda.oracleswift/non-matching 
> records.csv" where the object name "non-matching records.csv" contains a 
> space character, the following exception is thrown:
> 2016-08-23 15:56:03 DEBUG SwiftNativeFileSystem:126 - SwiftFileSystem 
> initialized
> java.lang.IllegalArgumentException: Illegal character in path at index 13: 
> /non-matching records.csv
> at java.net.URI.create(URI.java:859)
> at 
> org.apache.hadoop.fs.swift.util.SwiftObjectPath.<init>(SwiftObjectPath.java:59)
> at 
> org.apache.hadoop.fs.swift.util.SwiftObjectPath.fromPath(SwiftObjectPath.java:183)
> at 
> org.apache.hadoop.fs.swift.util.SwiftObjectPath.fromPath(SwiftObjectPath.java:145)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.toObjectPath(SwiftNativeFileSystemStore.java:434)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.getObjectMetadata(SwiftNativeFileSystemStore.java:211)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.getObjectMetadata(SwiftNativeFileSystemStore.java:181)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.getFileStatus(SwiftNativeFileSystem.java:173)
> at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:64)
> at org.apache.hadoop.fs.Globber.doGlob(Globber.java:272)
> at org.apache.hadoop.fs.Globber.glob(Globber.java:151)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1653)
> at 
> org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:259)
> ...
> Apparently it is complaining about the space character. However, the debug 
> messages logged before this error is raised show:
> 2016-08-23 15:56:03 DEBUG SwiftNativeFileSystem:122 - Initializing 
> SwiftNativeFileSystem against URI 
> swift://Linda.oracleswift/non-matching%20records.csv and working dir 
> swift://Linda.oracleswift/user/syang
> 2016-08-23 15:56:03 DEBUG RestClientBindings:141 - Filesystem 
> swift://Linda.oracleswift/non-matching%20records.csv is using configuration 
> keys fs.swift.service.oracleswift
> ...
> The space character has already been encoded as "%20", so it seems the 
> Swift URL passed into SwiftNativeFileSystem is properly encoded.
> Because of this error, any Swift object whose name contains a space 
> character (and maybe a slash '/' character as well?) cannot be accessed.
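The failure in the trace above can be reproduced outside Hadoop with java.net.URI alone. What follows is only a minimal sketch, not the code in HADOOP-13618.patch: the class SpaceInPathDemo and its helpers are hypothetical names, and it assumes that the raw, unencoded path is what reaches URI.create. It shows that URI.create rejects the space, while the multi-argument URI constructor quotes it (and also quotes a literal '%', so a name that really contains "%20" survives the round trip).

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; not part of hadoop-openstack.
final class SpaceInPathDemo {

    // Returns true if URI.create rejects the given raw path.
    static boolean failsWithUriCreate(String rawPath) {
        try {
            URI.create(rawPath);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Builds a path-only URI. The multi-argument constructor quotes
    // characters that are illegal in a path, including ' ' and '%'.
    static URI toPathUri(String rawPath) {
        try {
            return new URI(null, null, rawPath, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String withSpace = "/non-matching records.csv";
        System.out.println(failsWithUriCreate(withSpace));       // true
        System.out.println(toPathUri(withSpace).getRawPath());   // /non-matching%20records.csv
        System.out.println(toPathUri(withSpace).getPath());      // /non-matching records.csv

        // An object whose name literally contains "%20".
        String withPercent = "/non-matching%20records.csv";
        System.out.println(toPathUri(withPercent).getRawPath()); // /non-matching%2520records.csv
        System.out.println(toPathUri(withPercent).getPath());    // /non-matching%20records.csv
    }
}
```

Whether SwiftObjectPath should build its URI this way is a design question for the patch; the sketch only illustrates why the single-argument URI.create call fails on this input.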
> As an additional data point, if we first encode the object name ("non-matching 
> records.csv" => "non-matching%20records.csv") before giving it to the OpenStack 
> Swift API, a different error is raised. This time the path separator 
> '/' after the container name 'Linda' somehow got encoded by 
> SwiftNativeFileSystemStore:
> 2016-08-23 10:56:41 DEBUG SwiftRestClient:1731 - Status code = 400
> 2016-08-23 10:56:41 DEBUG SwiftRestClient:1445 - Method HEAD on 
> https://storage.oraclecorp.com/v1/Storage-dfisher/Linda%2Fnon-matching%20records.csv
>  failed, status code: 400, status line: HTTP/1.1 400 Bad Request
> BadRequest: Bad request against 
> https://storage.oraclecorp.com/v1/Storage-dfisher/Linda%2Fnon-matching%20records.csv
>  HEAD 
> https://storage.oraclecorp.com/v1/Storage-dfisher/Linda%2Fnon-matching%20records.csv
>  => 400
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.buildException(SwiftRestClient.java:1456)
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.perform(SwiftRestClient.java:1403)
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.headRequest(SwiftRestClient.java:1016)
> 
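One way a URL such as ".../Linda%2Fnon-matching%20records.csv" can arise is when the container and object name are percent-encoded together as a single string, so the separator is escaped along with the space. Whether that is what SwiftNativeFileSystemStore does is not established by the logs above. The sketch below is an assumption-laden illustration only: SwiftPathEncodingDemo and its methods are hypothetical names, and it uses java.net.URLEncoder (which implements form encoding, hence the '+' replacement) as a stand-in for whatever encoder the client uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; not part of hadoop-openstack.
final class SwiftPathEncodingDemo {

    // Percent-encodes one path segment. URLEncoder emits '+' for a space,
    // so it is rewritten to "%20"; a literal '+' has already become "%2B".
    static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Encodes "container/object" in one pass: the '/' becomes "%2F".
    static String encodeWhole(String container, String object) {
        return encodeSegment(container + "/" + object);
    }

    // Encodes each segment separately and keeps '/' as the separator.
    static String encodePerSegment(String container, String object) {
        StringBuilder sb = new StringBuilder(encodeSegment(container));
        for (String part : object.split("/")) {
            sb.append('/').append(encodeSegment(part));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String container = "Linda";
        String object = "non-matching records.csv";
        System.out.println(encodeWhole(container, object));      // Linda%2Fnon-matching%20records.csv
        System.out.println(encodePerSegment(container, object)); // Linda/non-matching%20records.csv
    }
}
```

The per-segment form keeps the container and object as separate path components, which is the shape the HEAD request in the log would need.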

[jira] [Comment Edited] (HADOOP-13618) IllegalArgumentException when accessing Swift object with name containing space character

2016-10-13 Thread Yulei Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573866#comment-15573866
 ] 

Yulei Li edited comment on HADOOP-13618 at 10/14/16 2:12 AM:
-

Both cases mean that the object name contains spaces and the string "%20". I 
will package the jar, including HADOOP-13617 and HADOOP-13618, and send it to you.


was (Author: charlse):
Both cases mean that the object name contains spaces and the string "%20". I 
will package the jar and send it to you; please wait.
