s3a with the Hadoop S3 filesystem works fine for us with STS assume-role
credentials and with KMS.
Below is how our Hadoop s3a config looks. Since the endpoint is globally
whitelisted, we don't set it explicitly.

fs.s3a.aws.credentials.provider: org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider
fs.s3a.assumed.role.credentials.provider: com.amazonaws.auth.profile.ProfileCredentialsProvider
fs.s3a.assumed.role.arn: arn:aws:iam::<account>:role/<iam_role>
fs.s3a.server-side-encryption-algorithm: SSE-KMS
fs.s3a.server-side-encryption.key: arn:aws:kms:<region>:<account>:key/<key-alias>
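
For reference, a minimal Java sketch of exercising this config through the
Hadoop FileSystem API (the bucket placeholder is illustrative; assumes
hadoop-aws and the AWS STS SDK on the classpath):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3aSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same properties as above, set programmatically.
        conf.set("fs.s3a.aws.credentials.provider",
                "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider");
        conf.set("fs.s3a.assumed.role.credentials.provider",
                "com.amazonaws.auth.profile.ProfileCredentialsProvider");
        conf.set("fs.s3a.assumed.role.arn", "arn:aws:iam::<account>:role/<iam_role>");
        conf.set("fs.s3a.server-side-encryption-algorithm", "SSE-KMS");
        conf.set("fs.s3a.server-side-encryption.key",
                "arn:aws:kms:<region>:<account>:key/<key-alias>");
        // Listing the bucket root exercises the assume-role credential chain.
        FileSystem fs = FileSystem.get(URI.create("s3a://<bucket>/"), conf);
        for (FileStatus st : fs.listStatus(new Path("/"))) {
            System.out.println(st.getPath());
        }
    }
}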


However, for checkpointing we definitely want to use Presto S3, and we just
could not make it work. FINE logging on presto-hive is not helping either,
since the library uses the airlift logger.
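Airlift's logger is built on java.util.logging, so in principle a JUL
logging.properties along these lines should surface the output. This is a
sketch only; it assumes the shaded classes keep the
com.facebook.presto.hive.s3 package prefix, which we have not confirmed for
flink-s3-fs-presto:

handlers = java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level = FINE
com.facebook.presto.hive.s3.level = FINE

The file would be passed to the JVM via env.java.opts, e.g.
-Djava.util.logging.config.file=/path/to/logging.properties.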
Also, based on the code here
https://github.com/prestodb/presto/blob/2aeedb944fc8b47bfe1cad78732d6dd2308ee9ad/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystem.java#L821,
PrestoS3FileSystem does switch to IAM role credentials if one is provided.
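
For context, the gist of that code path, paraphrased as a sketch against the
AWS SDK v1 (the session name here is illustrative, not the one Presto uses):

import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider;

public class AssumeRoleSketch {
    public static void main(String[] args) {
        // Paraphrased gist of the linked logic: when an IAM role is
        // configured, credentials come from an STS assume-role provider
        // rather than from static keys.
        AWSCredentialsProvider provider =
                new STSAssumeRoleSessionCredentialsProvider.Builder(
                                "arn:aws:iam::<account>:role/<iam_role>",
                                "s3-session") // illustrative session name
                        .build();
        System.out.println(provider.getCredentials().getAWSAccessKeyId());
    }
}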

Has anyone been successful using the Presto S3 filesystem in Flink v1.13.0?


Thanks,
Vamshi


On Mon, Aug 16, 2021 at 3:59 AM David Morávek <d...@apache.org> wrote:

> Hi Vamshi,
>
> From your configuration I'm guessing that you're using Amazon S3 (not an
> alternative implementation such as Minio).
>
> Two comments:
> - *s3.endpoint* should not contain the bucket (the bucket is part of your
> s3 path, e.g. *s3://<bucket>/<file>*)
> - "*s3.path.style.access*: true" is only correct for 3rd-party
> implementations such as Minio / Swift, which have the bucket defined in
> the URL path instead of the subdomain (see the sketch below)
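>
> Concretely, for Amazon S3 in us-west-2 the relevant lines would look
> roughly like this (a sketch; for AWS you can likely also omit
> *s3.endpoint* and let the client pick the default):
>
> s3.endpoint: s3.us-west-2.amazonaws.com
> s3.path.style.access: false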
>
> You can find some information about connecting to S3 in the Flink docs [1].
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/filesystems/s3/
>
> Best,
> D.
>
>
> On Tue, Aug 10, 2021 at 2:37 AM Vamshi G <vgandr...@salesforce.com> wrote:
>
>> We are using Flink version 1.13.0 on Kubernetes.
>> For checkpointing we have configured flink-s3-fs-presto as the fs.s3
>> filesystem. We have enabled SSE on our buckets with a KMS CMK.
>>
>> flink-conf.yaml is configured as below.
>> s3.entropy.key: _entropy_
>> s3.entropy.length: 4
>> s3.path.style.access: true
>> s3.ssl.enabled: true
>> s3.sse.enabled: true
>> s3.sse.type: KMS
>> s3.sse.kms-key-id: <ARN of keyid>
>> s3.iam-role: <IAM role with read/write access to bucket>
>> s3.endpoint: <bucketname>.s3-us-west-2.amazonaws.com
>> s3.credentials-provider: com.amazonaws.auth.profile.ProfileCredentialsProvider
>>
>> However, PUT operations on the bucket result in an access denied error.
>> The role's access policies have been verified and work fine when checked
>> with the CLI (see the sketch below).
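>>
>> Roughly this kind of check, with illustrative names:
>>
>> aws sts assume-role --role-arn arn:aws:iam::<account>:role/<iam_role> \
>>     --role-session-name s3-check
>> aws s3 cp test.txt s3://<bucket>/test.txt \
>>     --sse aws:kms --sse-kms-key-id <ARN of keyid>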
>> Also, we can't see the debug logs from the Presto S3 library; is there a
>> way to enable logging for Presto's airlift logger?
>>
>> Any inputs on the above issue?
>>
>>
