[ 
https://issues.apache.org/jira/browse/HADOOP-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874539#comment-15874539
 ] 

Steve Loughran commented on HADOOP-13811:
-----------------------------------------

Luke, 

* grab the HDP 2.5 sandbox and take the JARs from there .. they have the input 
stream speedup, and it'd be interesting to see if that's enough to make the 
problems go away
* that new stack trace is different: can you file a new bug against it and 
we'll take a look at how to handle it. Looks like we will need to add some 
handling for what is presumably a race condition. For now, try setting 
{{fs.s3a.multiobjectdelete.enable}} to false.
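
For reference, turning that option off in {{core-site.xml}} would look 
something like this (a sketch; only the property name above is from this 
thread, the description text is mine):

{code:xml}
<property>
  <name>fs.s3a.multiobjectdelete.enable</name>
  <value>false</value>
  <description>When false, issue one DELETE request per object rather
    than a single bulk multi-object delete call, which can sidestep
    failures in the bulk-delete path.</description>
</property>
{code}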

Can I state that I do not recommend using S3 as a direct destination of work, 
such as {{DataFrame.write()}}. Without list consistency there's a risk of 
written work not being discovered, and so not copied. S3Guard (HADOOP-13345) 
will fix that, along with an O(1) committer, but it's not out yet.
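
Until then, a safer pattern is the two-step one sketched below: write job 
output to HDFS, where listings are consistent, then copy the completed 
directory to S3 as a separate step (the paths and bucket name here are 
placeholders, not from this issue):

{code}
# 1. Have the job write its output to HDFS first.
#    e.g. in Spark: df.write.parquet("hdfs:///user/luke/output")
# 2. Once the job has committed, copy the finished directory to S3.
hadoop distcp hdfs:///user/luke/output s3a://my-bucket/output
{code}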

> s3a: getFileStatus fails with com.amazonaws.AmazonClientException: Failed to 
> sanitize XML document destined for handler class
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13811
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13811
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0, 2.7.3
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> Sometimes, occasionally, getFileStatus() fails with a stack trace starting 
> with {{com.amazonaws.AmazonClientException: Failed to sanitize XML document 
> destined for handler class}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
