Hello,

I have successfully been able to store checkpoint data in an S3 bucket. I
used to have a similar issue earlier. What you need to confirm:
1. The S3 bucket is created with read/write access (irrespective of whether
it is MinIO or AWS S3).
2. "flink/opt/flink-s3-fs-presto-1.14.0.jar" jar is copied to plugin
directory of "flink/plugins/s3-fs-presto"
3. Add the following configuration, either in flink-conf.yaml or
programmatically (a programmatic sketch follows the snippet below):

    state.checkpoints.dir: s3://<bucket-name>/checkpoints
    state.backend.fs.checkpointdir: s3://<bucket-name>/checkpoints
    s3.path-style: true
    s3.path.style.access: true
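
If you prefer the programmatic route, here is a minimal sketch (assuming the
DataStream API on Flink 1.13+; the bucket name is a placeholder):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Checkpoint every 60 seconds; the interval is an arbitrary example value.
    env.enableCheckpointing(60_000);
    // Same target as state.checkpoints.dir in flink-conf.yaml.
    env.getCheckpointConfig().setCheckpointStorage("s3://<bucket-name>/checkpoints");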

On Wed, Oct 27, 2021 at 2:47 AM Vamshi G <vgandr...@salesforce.com> wrote:

> s3a with the Hadoop S3 filesystem works fine for us with STS assume-role
> credentials and with KMS.
> Below is how our Hadoop s3a configs look. Since the endpoint is
> globally whitelisted, we don't explicitly set the endpoint.
>
> fs.s3a.aws.credentials.provider: org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider
> fs.s3a.assumed.role.credentials.provider: com.amazonaws.auth.profile.ProfileCredentialsProvider
> fs.s3a.assumed.role.arn: arn:aws:iam::<account>:role/<iam_role>
> fs.s3a.server-side-encryption-algorithm: SSE-KMS
> fs.s3a.server-side-encryption.key: arn:aws:kms:<region>:<account>:key/<key-alias>
>
>
> However, for checkpointing we definitely want to use Presto S3, and we just
> could not make it work. FINE logging on presto-hive is not helping either,
> as the library uses the airlift logger.
> Also, based on the code here
> https://github.com/prestodb/presto/blob/2aeedb944fc8b47bfe1cad78732d6dd2308ee9ad/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystem.java#L821,
> PrestoS3FileSystem does switch to IAM role credentials if one is provided.
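>
> For reference, a simplified paraphrase of that selection logic (the method
> name and session string are illustrative; this is not a verbatim copy of
> the Presto source):
>
>     import com.amazonaws.auth.AWSCredentialsProvider;
>     import com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider;
>
>     static AWSCredentialsProvider chooseProvider(String iamRole, AWSCredentialsProvider fallback) {
>         if (iamRole != null) {
>             // A configured IAM role takes precedence over other providers.
>             return new STSAssumeRoleSessionCredentialsProvider.Builder(iamRole, "presto-s3-session").build();
>         }
>         return fallback;
>     }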
>
> Has anyone been successful using the Presto S3 filesystem in Flink v1.13.0?
>
>
> Thanks,
> Vamshi
>
>
> On Mon, Aug 16, 2021 at 3:59 AM David Morávek <d...@apache.org> wrote:
>
>> Hi Vamshi,
>>
>> From your configuration I'm guessing that you're using Amazon S3 (not an
>> alternative implementation such as MinIO).
>>
>> Two comments:
>> - *s3.endpoint* should not contain the bucket (the bucket is already part
>> of your s3 path, e.g. *s3://<bucket>/<file>*)
>> - "*s3.path.style.access*: true" is only correct for 3rd-party
>> implementations such as MinIO / Swift, which have the bucket defined in the
>> URL path instead of the subdomain (see the example below)
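>>
>> For illustration, an AWS-style setup would look roughly like this (the
>> region and paths are hypothetical placeholders):
>>
>>     s3.endpoint: s3.us-west-2.amazonaws.com
>>     state.checkpoints.dir: s3://<bucket>/checkpoints
>>     # s3.path.style.access stays unset on AWS; subdomain-style access is the default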
>>
>> You can find some information about connecting to s3 in Flink docs [1].
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/filesystems/s3/
>>
>> Best,
>> D.
>>
>>
>> On Tue, Aug 10, 2021 at 2:37 AM Vamshi G <vgandr...@salesforce.com>
>> wrote:
>>
>>> We are using Flink version 1.13.0 on Kubernetes.
>>> For checkpointing we have configured the flink-s3-fs-presto filesystem.
>>> We have enabled SSE on our buckets with a KMS CMK.
>>>
>>> flink-conf.yaml is configured as below.
>>> s3.entropy.key: _entropy_
>>> s3.entropy.length: 4
>>> s3.path.style.access: true
>>> s3.ssl.enabled: true
>>> s3.sse.enabled: true
>>> s3.sse.type: KMS
>>> s3.sse.kms-key-id: <ARN of keyid>
>>> s3.iam-role: <IAM role with read/write access to bucket>
>>> s3.endpoint: <bucketname>.s3-us-west-2.amazonaws.com
>>> s3.credentials-provider: com.amazonaws.auth.profile.ProfileCredentialsProvider
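>>>
>>> Note: for the entropy settings above to take effect, the entropy key must
>>> also appear in the checkpoint path; an illustrative placeholder path:
>>>
>>>     state.checkpoints.dir: s3://<bucket>/checkpoints/_entropy_/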
>>>
>>> However, PUT operations on the bucket result in an access-denied error.
>>> The role's access policies have been verified and work fine when tested
>>> with the CLI.
>>> Also, we can't see debug logs from the Presto S3 library; is there a way
>>> to enable the logger for Presto's airlift logging?
>>>
>>> Any inputs on the above issue?
>>>
>>>

-- 
Regards,
Parag Surajmal Somani.
