[ https://issues.apache.org/jira/browse/JCLOUDS-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacob Nguyen updated JCLOUDS-1638:
----------------------------------
    Description: 
{noformat}
java.lang.RuntimeException: request: GET https://sclas-cloud-storage-master.s3.amazonaws.com/?delimiter=/&prefix=Data/57-2943/10-8-20/&max-keys=1000 HTTP/1.1; response: HTTP/1.1 200 OK; cause: java.lang.RuntimeException: request: GET https://sclas-cloud-storage-master.s3.amazonaws.com/?delimiter=/&prefix=Data/57-2943/10-8-20/&max-keys=1000 HTTP/1.1; error at 323:2 in document ; cause: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 323; Character reference "&#x18" is an invalid XML character.
        at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:174)
        at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:146)
        at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:86)
        at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:52)
        at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91)
        at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74)
        at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45)
        at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
        at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
        at jdk.proxy2/jdk.proxy2.$Proxy235.listBucket(Unknown Source)
        at org.jclouds.s3.blobstore.S3BlobStore.list(S3BlobStore.java:177)
{noformat}

When an S3 folder path contains a control character, the listing response 
can't be parsed: the XML parser throws SAXParseException on the character 
reference.

Could there be an option that at least lets us forward the encoding-type param 
(https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html#API_ListObjects_RequestSyntax) 
and URL-decodes the returned keys for us, so that listing becomes possible? 
This bug currently prevents us from listing any children of a root folder if 
one of the children contains control characters.

Here's an example XML response from S3 when listing objects from cURL:

{noformat}
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test&#x10;/</Prefix></CommonPrefixes></ListBucketResult>
{noformat}

The child folder of 'some' is returned as
{noformat}
<Prefix>some/test&#x10;/</Prefix>
{noformat}
which can't be parsed.

But with the encoding-type=url request parameter set:
{noformat}
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><EncodingType>url</EncodingType><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test%10/</Prefix></CommonPrefixes></ListBucketResult>
{noformat}

The child folder then comes back as
{noformat}
<Prefix>some/test%10/</Prefix>
{noformat}
which can probably be parsed.
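
To illustrate the difference, here is a minimal Java sketch using only the JDK's SAX parser and URLDecoder. It is not jclouds code: the class name EncodingTypeSketch, the methods parsePrefixes and decodeKey, and the trimmed XML strings (namespace and most elements removed) are assumptions made for the example. It also assumes that decoding with java.net.URLDecoder is acceptable for S3's url encoding, which means a literal '+' in an encoded key would be read as a space.

```java
import java.io.StringReader;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EncodingTypeSketch {

    // Trimmed version of the response without encoding-type (hypothetical sample).
    static final String RAW =
        "<ListBucketResult><Prefix>some/</Prefix><CommonPrefixes>"
        + "<Prefix>some/test&#x10;/</Prefix></CommonPrefixes></ListBucketResult>";

    // Trimmed version of the response with encoding-type=url (hypothetical sample).
    static final String URL_ENCODED =
        "<ListBucketResult><Prefix>some/</Prefix><EncodingType>url</EncodingType>"
        + "<CommonPrefixes><Prefix>some/test%10/</Prefix></CommonPrefixes></ListBucketResult>";

    // Collects the text of every <Prefix> element. Parse failures are wrapped in an
    // unchecked exception whose cause is the original SAXParseException.
    static List<String> parsePrefixes(String xml) {
        List<String> prefixes = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder current;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("Prefix".equals(qName)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("Prefix".equals(qName) && current != null) {
                    prefixes.add(current.toString());
                    current = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse listing", e);
        }
        return prefixes;
    }

    // Turns a URL-encoded key such as "some/test%10/" back into the real key.
    static String decodeKey(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        try {
            parsePrefixes(RAW);
        } catch (IllegalArgumentException e) {
            System.out.println("raw listing rejected: " + e.getCause().getClass().getName());
        }
        for (String encoded : parsePrefixes(URL_ENCODED)) {
            String decoded = decodeKey(encoded);
            System.out.println(encoded + " decodes to a key of length " + decoded.length());
        }
    }
}
```

In this sketch the unencoded listing fails as a whole, so none of the sibling prefixes are returned, while the url-encoded listing parses and each key can be decoded afterwards.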



  was:
{noformat}
java.lang.RuntimeException: request: GET https://sclas-cloud-storage-master.s3.amazonaws.com/?delimiter=/&prefix=Data/57-2943/10-8-20/&max-keys=1000 HTTP/1.1; response: HTTP/1.1 200 OK; cause: java.lang.RuntimeException: request: GET https://sclas-cloud-storage-master.s3.amazonaws.com/?delimiter=/&prefix=Data/57-2943/10-8-20/&max-keys=1000 HTTP/1.1; error at 323:2 in document ; cause: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 323; Character reference "&#x18" is an invalid XML character.
        at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:174)
        at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:146)
        at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:86)
        at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:52)
        at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91)
        at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74)
        at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45)
        at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
        at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
        at jdk.proxy2/jdk.proxy2.$Proxy235.listBucket(Unknown Source)
        at org.jclouds.s3.blobstore.S3BlobStore.list(S3BlobStore.java:177)
{noformat}

When there's a control character in the folder path in S3, we can't parse it 
from the response because it throws SAXParseException.

Can there be an option that at least lets us forward the encoding-type param?
https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html#API_ListObjects_RequestSyntax
And url decode it for us so that listing can be possible? This bug currently 
doesn't allow us to list any children of a root folder if one of the children 
contains control characters.

Here's an example XML response from S3 when listing objects from cURL:

{noformat}
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>different-bucket-name</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test&#x10;/</Prefix></CommonPrefixes></ListBucketResult>
{noformat}

Child folder of 'some' contains 
{noformat}
<Prefix>some/test&#x10;/</Prefix>
{noformat}
which can't be parsed.

But with the urlParam:
{noformat}
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><EncodingType>url</EncodingType><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test%10/</Prefix></CommonPrefixes></ListBucketResult>
{noformat}


{noformat}
<Prefix>some/test%10/</Prefix>
{noformat}
Can probably be parsed.




> SAXParseException on S3 Listing
> -------------------------------
>
>                 Key: JCLOUDS-1638
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-1638
>             Project: jclouds
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Jacob Nguyen
>            Assignee: Andrew Gaul
>            Priority: Major
>
> {noformat}
> java.lang.RuntimeException: request: GET https://sclas-cloud-storage-master.s3.amazonaws.com/?delimiter=/&prefix=Data/57-2943/10-8-20/&max-keys=1000 HTTP/1.1; response: HTTP/1.1 200 OK; cause: java.lang.RuntimeException: request: GET https://sclas-cloud-storage-master.s3.amazonaws.com/?delimiter=/&prefix=Data/57-2943/10-8-20/&max-keys=1000 HTTP/1.1; error at 323:2 in document ; cause: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 323; Character reference "&#x18" is an invalid XML character.
>       at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:174)
>       at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:146)
>       at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:86)
>       at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:52)
>       at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91)
>       at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74)
>       at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45)
>       at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
>       at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
>       at jdk.proxy2/jdk.proxy2.$Proxy235.listBucket(Unknown Source)
>       at org.jclouds.s3.blobstore.S3BlobStore.list(S3BlobStore.java:177)
> {noformat}
> When an S3 folder path contains a control character, the listing response 
> can't be parsed: the XML parser throws SAXParseException on the character 
> reference.
> Could there be an option that at least lets us forward the encoding-type param 
> (https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html#API_ListObjects_RequestSyntax) 
> and URL-decodes the returned keys for us, so that listing becomes possible? 
> This bug currently prevents us from listing any children of a root folder if 
> one of the children contains control characters.
> Here's an example XML response from S3 when listing objects from cURL:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test&#x10;/</Prefix></CommonPrefixes></ListBucketResult>
> {noformat}
> The child folder of 'some' is returned as
> {noformat}
> <Prefix>some/test&#x10;/</Prefix>
> {noformat}
> which can't be parsed.
> But with the encoding-type=url request parameter set:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><EncodingType>url</EncodingType><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test%10/</Prefix></CommonPrefixes></ListBucketResult>
> {noformat}
> The child folder then comes back as
> {noformat}
> <Prefix>some/test%10/</Prefix>
> {noformat}
> which can probably be parsed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
