[
https://issues.apache.org/jira/browse/HADOOP-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874539#comment-15874539
]
Steve Loughran commented on HADOOP-13811:
-----------------------------------------
Luke,
* grab the HDP 2.5 sandbox and take the JARs from there; they have the input
stream speedup, and it'd be interesting to see whether that's enough to make
the problems go away
* that new stack trace is different: can you file a new bug against it and
we'll take a look at how to handle it. It looks like we will need to add
handling for what is presumably a race condition. For now, try setting
{{fs.s3a.multiobjectdelete.enable}} to false.
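A minimal sketch of that workaround in {{core-site.xml}} (the property name is the one above; its default is true):

```xml
<property>
  <!-- Disable bulk multi-object DELETE requests; S3A then issues
       one DELETE per object instead, which avoids the failure mode
       seen in the new stack trace. -->
  <name>fs.s3a.multiobjectdelete.enable</name>
  <value>false</value>
</property>
```

Single-object deletes are slower for large directory trees, so treat this as a diagnostic/workaround setting rather than a permanent one.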
I'd like to state that I do not recommend using S3 as a direct destination of
work, such as {{DataFrame.write()}}. Without list consistency there's a risk of
written work not being discovered, and so not copied. S3Guard (HADOOP-13345)
will fix that, along with an O(1) committer, but it's not out yet.
> s3a: getFileStatus fails with com.amazonaws.AmazonClientException: Failed to
> sanitize XML document destined for handler class
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-13811
> URL: https://issues.apache.org/jira/browse/HADOOP-13811
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Steve Loughran
> Assignee: Steve Loughran
>
> Sometimes, occasionally, getFileStatus() fails with a stack trace starting
> with {{com.amazonaws.AmazonClientException: Failed to sanitize XML document
> destined for handler class}}.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)