derjust opened a new pull request #9071: FLINK-13044 [BuildSystem / Shaded] Fix 
for wrong shading of AWS SDK in flink-s3-fs-hadoop
URL: https://github.com/apache/flink/pull/9071
 
 
   ## What is the purpose of the change
   
   Due to bug MSHADE-156 in the Maven Shade Plugin [1], shading the AWS SDK 
also rewrites string literals in Flink classes:
   
   `PACKAGE_PREFIXES_TO_SHADE` in 
`org.apache.flink.fs.s3hadoop.S3FileSystemFactory` is rewritten from 
`com.amazonaws.` to `org.apache.flink.fs.s3base.shaded.com.amazonaws`. As a 
result, settings like 
`fs.s3a.aws.credentials.provider: 
com.amazonaws.auth.DefaultAWSCredentialsProviderChain`
   are not remapped properly, because the `startsWith` check in 
`org.apache.flink.fs.s3.common.HadoopConfigLoader.shadeClassConfig()` no longer 
matches the beginning of the FQCN.
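
The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not Flink's actual code: `ShadePrefixDemo` and `shadeClassName` are hypothetical names, and the method only assumes that `shadeClassConfig()` prepends a shading prefix when the configured class name starts with one of the prefixes to shade.

```java
class ShadePrefixDemo {

    // Hypothetical stand-in for the prefix check: prepend the shading prefix
    // only if the class name starts with one of the prefixes to shade.
    static String shadeClassName(String className, String[] prefixesToShade, String shadingPrefix) {
        for (String prefix : prefixesToShade) {
            if (className.startsWith(prefix)) {
                return shadingPrefix + className;
            }
        }
        return className; // no prefix matched, value is left untouched
    }

    public static void main(String[] args) {
        String configured = "com.amazonaws.auth.DefaultAWSCredentialsProviderChain";
        String shadingPrefix = "org.apache.flink.fs.s3base.shaded.";

        // The literal as written in the source file.
        String[] asWritten = {"com.amazonaws."};
        // The literal as found in the class file after shading (MSHADE-156).
        String[] afterShading = {"org.apache.flink.fs.s3base.shaded.com.amazonaws"};

        // Remapped to the shaded package.
        System.out.println(shadeClassName(configured, asWritten, shadingPrefix));
        // Left unchanged, because startsWith never matches.
        System.out.println(shadeClassName(configured, afterShading, shadingPrefix));
    }
}
```

With the rewritten literal, the configured class name is passed through unchanged, so the unshaded class is requested.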
   
   Using `DefaultAWSCredentialsProviderChain` instead of Flink's hand-crafted 
chain can be required depending on the Flink deployment, e.g. when running on 
Fargate.
   
   This issue is only visible when inspecting the compiled class file after 
shading has happened.
   
   ## Brief change log
   
   * Fixing the `PACKAGE_PREFIXES_TO_SHADE` value alone doesn't help: based 
on `FLINK_SHADING_PREFIX`, the FQCN is assembled as 
`org.apache.flink.fs.s3hadoop.shaded.com.amazonaws.auth.DefaultAWSCredentialsProviderChain`,
   which can't be found either, because the class is located at 
`org.apache.flink.fs.s3base.shaded.com.amazonaws.auth.DefaultAWSCredentialsProviderChain`.
   
       * This commit follows the workaround shown in [1] to work around the 
shading issue.
   * Also fixes the wrong package name used in the Flink code for the shaded 
AWS SDK classes.
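
Why correcting the prefix list alone is not enough can be seen from how the shaded name is assembled. This is a small sketch under assumptions: `ShadedNameDemo` and `assemble` are hypothetical, and the `s3hadoop` prefix value is inferred from the assembled FQCN quoted in the change log, not read from the Flink source.

```java
class ShadedNameDemo {

    // Prefix inferred from the FQCN that Flink assembles (assumption).
    static final String S3HADOOP_PREFIX = "org.apache.flink.fs.s3hadoop.shaded.";
    // Prefix under which the shaded AWS SDK classes are actually located.
    static final String S3BASE_PREFIX = "org.apache.flink.fs.s3base.shaded.";

    // Hypothetical helper: the shaded name is the prefix plus the original FQCN.
    static String assemble(String shadingPrefix, String className) {
        return shadingPrefix + className;
    }

    public static void main(String[] args) {
        String configured = "com.amazonaws.auth.DefaultAWSCredentialsProviderChain";
        String actualLocation =
            "org.apache.flink.fs.s3base.shaded.com.amazonaws.auth.DefaultAWSCredentialsProviderChain";

        // prints "false": the s3hadoop prefix points at a package that has no such class
        System.out.println(assemble(S3HADOOP_PREFIX, configured).equals(actualLocation));
        // prints "true": only the s3base prefix yields the real location
        System.out.println(assemble(S3BASE_PREFIX, configured).equals(actualLocation));
    }
}
```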
   
   ## Verifying this change
   
   This change can be verified as follows:
   - *Set `fs.s3a.aws.credentials.provider: 
com.amazonaws.auth.DefaultAWSCredentialsProviderChain` in `flink-conf.yaml`*
   - *Start the JobManager and verify that it uses the 
`DefaultAWSCredentialsProviderChain` instead of Flink's hand-crafted chain*
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
     - The S3 file system connector: yes
   
   ## Documentation
   
     - Does this pull request introduce a new feature?   no
     - If yes, how is the feature documented? not applicable 
   
   
   [1] https://issues.apache.org/jira/browse/MSHADE-156
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
