[ https://issues.apache.org/jira/browse/HADOOP-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545665#comment-17545665 ]

Brandon commented on HADOOP-17372:
----------------------------------

Hello [[email protected]], thank you for your work on S3A.

In our Spark jobs, we use a custom AWS credentials provider class that is 
bundled into the Spark application jar. This worked on Hadoop 3.2.1, but after 
upgrading to Hadoop 3.3.3 the class can no longer be found. This surfaces as a 
ClassNotFoundException during S3AFileSystem initialization:
{noformat}
java.io.IOException: From option fs.s3a.aws.credentials.provider java.lang.ClassNotFoundException: Class [custom AWS credentials provider class] not found
	at org.apache.hadoop.fs.s3a.S3AUtils.loadAWSProviderClasses(S3AUtils.java:657)
	at org.apache.hadoop.fs.s3a.S3AUtils.buildAWSProviderList(S3AUtils.java:680)
	at org.apache.hadoop.fs.s3a.S3AUtils.createAWSCredentialProviderSet(S3AUtils.java:631)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:877)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:534)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
{noformat}
We were able to track this down to the change in this ticket. I believe what's 
happening here is:
 * The S3AFileSystem class is provided by a jar on disk, which is added to the 
Java classpath via the normal java command-line option. So the classloader of 
S3AFileSystem is the Java application classloader.
 * The Spark application jar that contains our AWS credentials provider class 
is downloaded at runtime by Spark and then "patched into" the classpath via 
Spark's mutable classloader.
 * Therefore, classes in the application jar are not visible to the classloader 
that loaded S3AFileSystem (see the sketch after this list).
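
To illustrate the visibility gap, here is a minimal sketch; the provider class 
name is a hypothetical stand-in for ours, and it only behaves as described when 
run inside a Spark task, where Spark has installed its mutable classloader as 
the thread context classloader:
{noformat}
// Minimal sketch of the classloader visibility gap.
// "com.example.MyCredentialsProvider" is a hypothetical stand-in.
import org.apache.hadoop.fs.s3a.S3AFileSystem;

public class ClassLoaderVisibilityDemo {
  public static void main(String[] args) throws Exception {
    String provider = "com.example.MyCredentialsProvider";

    // Classloader that loaded hadoop-aws: the JVM application classloader,
    // because the hadoop-aws jar sits on the java command-line classpath.
    ClassLoader fsLoader = S3AFileSystem.class.getClassLoader();

    // Classloader Spark installs on the task thread; it also sees jars that
    // Spark downloaded and patched in at runtime.
    ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();

    // Succeeds inside a Spark task: the context classloader can see the
    // application jar.
    Class.forName(provider, false, contextLoader);

    // Throws ClassNotFoundException: the application classloader never sees
    // jars that were only added through Spark's mutable classloader.
    Class.forName(provider, false, fsLoader);
  }
}
{noformat}
If I read the change correctly, the provider lookup now resolves through the 
classloader that loaded S3AFileSystem (the second path above) rather than the 
thread context classloader, which is why the class is no longer found.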

In the meantime, I think our most reasonable path forward is to pull the custom 
AWS credentials provider out of the application jar, install it as a jar on 
disk, and add it to the java command-line classpath alongside hadoop-aws itself 
(roughly as sketched below). Not too bad, but certainly more complicated than 
the prior setup on Hadoop 3.2.1.
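
For reference, that setup would look something like the following. The jar path 
and provider class name are hypothetical; spark.driver.extraClassPath, 
spark.executor.extraClassPath, and the spark.hadoop.* prefix for Hadoop options 
are standard Spark configuration.
{noformat}
# Hypothetical jar path and provider class name.
spark-submit \
  --conf spark.driver.extraClassPath=/opt/jars/my-credentials-provider.jar \
  --conf spark.executor.extraClassPath=/opt/jars/my-credentials-provider.jar \
  --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.example.MyCredentialsProvider \
  ...
{noformat}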

> S3A AWS Credential provider loading gets confused with isolated classloaders
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-17372
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17372
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>             Fix For: 3.3.1
>
>
> Problem: exception in loading S3A credentials for an FS, "Class class 
> com.amazonaws.auth.EnvironmentVariableCredentialsProvider does not implement 
> AWSCredentialsProvider"
> Location: S3A + Spark dataframes test
> Hypothesised cause:
> Configuration.getClasses() uses the context classloader, and with the spark 
> isolated CL that's different from the one the s3a FS uses, so it can't load 
> AWS credential providers.



