Re: Flink on Native Kubernetes S3 checkpointing error

2021-11-22 Thread Matthias Pohl
Cool, thanks for the update.

Matthias



Re: Flink on Native Kubernetes S3 checkpointing error

2021-11-22 Thread bat man
Hi Matthias,

It looks like service account token volume projection was not working
properly on the EKS version I was running. After upgrading EKS, S3
checkpointing works with the same configuration.
So, in short: on AWS, use EKS v1.20+ for the IAM Pod Identity Webhook.
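
For anyone debugging the same error, here is a minimal sketch of a check
that could be run inside the JobManager pod to see whether the projected
token is actually visible to the process. It assumes a Python interpreter
is available in the container (the stock Flink image may not ship one) and
that the webhook has injected the AWS_ROLE_ARN and
AWS_WEB_IDENTITY_TOKEN_FILE environment variables. The function name
check_web_identity_env is hypothetical and not part of Flink or the AWS SDK.

```python
import os


def check_web_identity_env(env=None):
    """Return a list of problems found with the web identity setup.

    `env` defaults to the process environment; a plain dict can be
    passed instead for testing.
    """
    env = os.environ if env is None else env
    problems = []

    role_arn = env.get("AWS_ROLE_ARN")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")

    if not role_arn:
        problems.append("AWS_ROLE_ARN is not set")

    if not token_file:
        problems.append("AWS_WEB_IDENTITY_TOKEN_FILE is not set")
    elif not os.path.isfile(token_file):
        # This is the situation reported in the stack trace in this thread.
        problems.append(f"token file not found: {token_file}")
    elif not os.access(token_file, os.R_OK):
        problems.append(f"token file not readable: {token_file}")
    elif os.path.getsize(token_file) == 0:
        problems.append(f"token file is empty: {token_file}")

    return problems


if __name__ == "__main__":
    for line in check_web_identity_env() or ["no problems found"]:
        print(line)
```

An empty result only means the variables and the file are in place. It does
not prove that the IAM role can be assumed or that the bucket policy allows
access.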

Thanks,
Hemant



Re: Flink on Native Kubernetes S3 checkpointing error

2021-11-22 Thread Matthias Pohl
Hi bat man,
this feature seems to be tied to a minimum AWS SDK version [1], which you
already considered. I checked the version that Flink 1.13.1 uses for the
s3 filesystem: it is 1.11.788, which should be recent enough to provide
this feature (it was added in 1.11.704):
```
$ git checkout release-1.13.1
$ cd flink-filesystems/flink-s3-fs-base; mvn dependency:tree | grep com.amazonaws:aws-java-sdk-s3
[INFO] +- com.amazonaws:aws-java-sdk-s3:jar:1.11.788:compile
```
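
The version check above can be sketched as a small comparison. This is only
an illustration: the helper names are hypothetical, and the minimum of
1.11.704 is the figure quoted in this message, not something the code looks
up.

```python
def parse_version(text):
    """Turn a dotted version such as '1.11.788' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_web_identity(sdk_version, minimum="1.11.704"):
    """True if sdk_version is at least the minimum quoted in this thread."""
    return parse_version(sdk_version) >= parse_version(minimum)


# Version reported by `mvn dependency:tree` for release-1.13.1.
print(supports_web_identity("1.11.788"))
```

Comparing tuples of integers rather than raw strings matters here: as
strings, "1.11.1000" would sort before "1.11.704".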

Matthias

[1]
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html



Flink on Native Kubernetes S3 checkpointing error

2021-11-21 Thread bat man
Hi,

I am using Flink 1.13.1 with RocksDB checkpointing to S3 on native
Kubernetes.
I pass this parameter to the job:

*-Dfs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider*
I am getting this error in the JobManager logs:

*Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by WebIdentityTokenCredentialsProvider : com.amazonaws.SdkClientException: Unable to locate specified web identity token file: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
 at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:139) ~[?:?]*

Describing the pod shows that the token volume is mounted in the JobManager
pod. Is there anything specific that needs to be done? On the same EKS
cluster, for testing, I ran a sample pod with the AWS CLI image, and it is
able to do *ls* on the S3 buckets.
Is this related to the AWS SDK used in Flink 1.13.1? Should I try a more
recent Flink version?
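
For reference, the setup described above could be written as a set of -D
options. This is a minimal sketch, not the exact command used here: only
the credentials provider option comes from this message. The bucket path
and service account name are hypothetical placeholders, and the helper
flink_dynamic_args is made up for illustration.

```python
def flink_dynamic_args(options):
    """Render a dict of Flink options as -Dkey=value arguments."""
    return [f"-D{key}={value}" for key, value in options.items()]


options = {
    # From this message.
    "fs.s3a.aws.credentials.provider":
        "com.amazonaws.auth.WebIdentityTokenCredentialsProvider",
    # Assumed: RocksDB state backend with checkpoints on S3.
    "state.backend": "rocksdb",
    # Hypothetical bucket and path.
    "state.checkpoints.dir": "s3a://example-bucket/checkpoints",
    # Hypothetical service account annotated with the IAM role.
    "kubernetes.service-account": "flink-sa",
}

print(" ".join(flink_dynamic_args(options)))
```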

Any help would be appreciated.

Thanks.