And actually, I've found that the correct version of the AWS SDK *is*
included in Flink 1.12, which was reported and fixed in FLINK-18676
(see [1]). Since you said you also saw this occur in 1.12, can you share
more details about what you saw there?

Best,
Austin

[1]: https://issues.apache.org/jira/browse/FLINK-18676
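
For reference, the version comparison behind this thread can be sketched in a
few lines of Python. This is only an illustration and not part of Flink or the
AWS SDK: the helper names `parse_version` and `supports_service_account_auth`
are hypothetical, and the version numbers are the ones quoted further down the
thread (1.11.165 bundled with Flink 1.6, 1.11.697 in Presto master at the time,
and 1.11.704 as the documented minimum for ServiceAccount authentication).

```python
from typing import Tuple

# Minimum aws-java-sdk version for EKS ServiceAccount authentication,
# as quoted in the thread from the AWS documentation.
MIN_SDK_VERSION = "1.11.704"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.11.165' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def supports_service_account_auth(sdk_version: str,
                                  minimum: str = MIN_SDK_VERSION) -> bool:
    """Return True if sdk_version is at least the minimum version."""
    # Tuples compare element by element, so 1.11.697 < 1.11.704.
    return parse_version(sdk_version) >= parse_version(minimum)


if __name__ == "__main__":
    # Versions mentioned in the thread; labels are for display only.
    candidates = [
        ("bundled with Flink 1.6", "1.11.165"),
        ("Presto master at the time", "1.11.697"),
        ("documented minimum", "1.11.704"),
    ]
    for label, version in candidates:
        ok = supports_service_account_auth(version)
        print(f"{version} ({label}): {'ok' if ok else 'too old'}")
```

Comparing integer tuples rather than raw strings avoids mis-ordering versions
whose components have different digit counts.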

On Mon, Apr 5, 2021 at 4:53 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> That looks interesting! I've also found the full list of S3 properties[1]
> for the version of presto-hive bundled with Flink 1.12 (see [2]), which
> includes an option for a KMS key (hive.s3.kms-key-id).
>
> (also, adding back the user list)
>
> [1]:
> https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration
> [2]:
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/filesystems/s3.html#hadooppresto-s3-file-systems-plugins
>
> On Mon, Apr 5, 2021 at 4:21 PM Swagat Mishra <swaga...@gmail.com> wrote:
>
>> Btw, there is also an option to provide a custom credential provider,
>> what are your thoughts on this?
>>
>> presto.s3.credentials-provider
>>
>>
>> On Tue, Apr 6, 2021 at 12:43 AM Austin Cawley-Edwards <
>> austin.caw...@gmail.com> wrote:
>>
>>> I've confirmed that for the bundled + shaded aws dependency, the only
>>> way to upgrade it is to build a flink-s3-fs-presto jar with the updated
>>> dependency. Let me know if this is feasible for you, if the KMS key
>>> solution doesn't work.
>>>
>>> Best,
>>> Austin
>>>
>>> On Mon, Apr 5, 2021 at 2:18 PM Austin Cawley-Edwards <
>>> austin.caw...@gmail.com> wrote:
>>>
>>>> Hi Swagat,
>>>>
>>>> I don't believe there is an explicit configuration option for the KMS
>>>> key – please let me know if you're able to make that work!
>>>>
>>>> Best,
>>>> Austin
>>>>
>>>> On Mon, Apr 5, 2021 at 1:45 PM Swagat Mishra <swaga...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Austin,
>>>>>
>>>>> Let me know what you think on my latest email, if the approach might
>>>>> work, or if it is already supported and I am not using the configurations
>>>>> properly.
>>>>>
>>>>> Thanks for your interest and support.
>>>>>
>>>>> Regards,
>>>>> Swagat
>>>>>
>>>>> On Mon, Apr 5, 2021 at 10:39 PM Austin Cawley-Edwards <
>>>>> austin.caw...@gmail.com> wrote:
>>>>>
>>>>>> Hi Swagat,
>>>>>>
>>>>>> It looks like Flink 1.6 bundles version 1.11.165 of
>>>>>> aws-java-sdk-core with the Presto implementation (transitively from
>>>>>> Presto 0.185[1]).
>>>>>> The minimum supported version for the ServiceAccount authentication
>>>>>> approach is 1.11.704 (see [2]), which was released on Jan 9th, 2020[3],
>>>>>> long after Flink 1.6 was released. Even the most recent Presto is on a
>>>>>> version below that, concretely 1.11.697 in the master branch[4], so I
>>>>>> don't think even upgrading Flink beyond 1.6 will solve this, though it
>>>>>> looks to me like the AWS dependency is managed better in more recent
>>>>>> Flink versions.
>>>>>> I'll have more for you on that front tomorrow, after the Easter break.
>>>>>>
>>>>>> I think what you would have to do to make this authentication
>>>>>> approach work for Flink 1.6 is building a custom version of the
>>>>>> flink-s3-fs-presto jar, replacing the bundled AWS dependency with the
>>>>>> 1.11.704 version, and then shading it the same way.
>>>>>>
>>>>>> In the meantime, would you mind creating a JIRA ticket with this use
>>>>>> case? That'll give you the best insight into the status of fixing this :)
>>>>>>
>>>>>> Let me know if that makes sense,
>>>>>> Austin
>>>>>>
>>>>>> [1]:
>>>>>> https://github.com/prestodb/presto/blob/1d4ee196df4327568c0982811d8459a44f1792b9/pom.xml#L53
>>>>>> [2]:
>>>>>> https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html
>>>>>> [3]: https://github.com/aws/aws-sdk-java/releases/tag/1.11.704
>>>>>> [4]: https://github.com/prestodb/presto/blob/master/pom.xml#L52
>>>>>>
>>>>>> On Sun, Apr 4, 2021 at 3:32 AM Swagat Mishra <swaga...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Austin -
>>>>>>>
>>>>>>> In my case the setup is such that services are deployed on
>>>>>>> Kubernetes with Docker, running on EKS. There is also an Istio service
>>>>>>> mesh. All the services communicate and access AWS resources like S3
>>>>>>> using the service account, which is associated with IAM roles. I
>>>>>>> have verified that the service account has access to S3 by running a
>>>>>>> program that connects to S3 to read a file. The AWS client, when
>>>>>>> packaged into the pod, is also able to access S3. So the roles and
>>>>>>> policies are good.
>>>>>>>
>>>>>>> When I am running flink, I am following the same configuration for
>>>>>>> job manager and task manager as provided here:
>>>>>>>
>>>>>>>
>>>>>>> https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/standalone/kubernetes.html
>>>>>>>
>>>>>>> The exception we are getting is -
>>>>>>> org.apache.flink.fs.s3presto.shaded.com.amazonaws.SDKClientException:
>>>>>>> Unable to load credentials from service end point.
>>>>>>>
>>>>>>> This happens in the EC2CredentialFetcher class method
>>>>>>> fetchCredentials - line number 66, when it tries to read resource,
>>>>>>> effectively executing
>>>>>>> CURL 169.254.170.2/AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
>>>>>>>
>>>>>>> I am not setting the variable AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
>>>>>>> because it's not the right way to do it for us; we are on EKS.
>>>>>>> Similarly, the ~/.aws/credentials file approach will not work for us
>>>>>>> either.
>>>>>>>
>>>>>>>
>>>>>>> At the moment, I haven't tried the Kubernetes service account property
>>>>>>> you mentioned above. I will try it and let you know how it goes.
>>>>>>>
>>>>>>> Question - do I need to provide any parameters while building the
>>>>>>> Docker image, or any configuration in the Flink config, to tell Flink
>>>>>>> that for all purposes it should be using the service account and not
>>>>>>> try to go into the EC2CredentialFetcher class?
>>>>>>>
>>>>>>> One more thing - we were trying this on the 1.6 version of Flink and
>>>>>>> not the 1.12 version.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Swagat
>>>>>>>
>>>>>>> On Sun, Apr 4, 2021 at 8:56 AM Sameer Wadkar <sam...@axiomine.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Kube2Iam needs to modify iptables to proxy calls to the EC2 metadata
>>>>>>>> endpoint to a DaemonSet that runs privileged pods, which map the IP
>>>>>>>> address of each pod and its associated service account in order to
>>>>>>>> make STS calls and return temporary AWS credentials. Your pod “thinks”
>>>>>>>> the EC2 metadata URL works locally, like on an EC2 instance.
>>>>>>>>
>>>>>>>> I have found that mutating webhooks are easier to deploy (when you
>>>>>>>> have no control over the Kubernetes environment - say you cannot change
>>>>>>>> iptables or run privileged pods). These can configure the
>>>>>>>> ~/.aws/credentials file. The webhook can make the STS call for the
>>>>>>>> service account to role mapping. A sidecar container to which the main
>>>>>>>> container has no access can even renew credentials, because STS
>>>>>>>> returns temporary credentials.
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>> On Apr 3, 2021, at 10:29 PM, Austin Cawley-Edwards <
>>>>>>>> austin.caw...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> 
>>>>>>>> If you’re just looking to attach a service account to a pod using
>>>>>>>> the native AWS EKS IAM mapping[1], you should be able to attach the
>>>>>>>> service account to the pod via the `kubernetes.service-account`
>>>>>>>> configuration option[2].
>>>>>>>>
>>>>>>>> Let me know if that works for you!
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Austin
>>>>>>>>
>>>>>>>> [1]:
>>>>>>>> https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
>>>>>>>> [2]:
>>>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#kubernetes-service-account
>>>>>>>>
>>>>>>>> On Sat, Apr 3, 2021 at 10:18 PM Austin Cawley-Edwards <
>>>>>>>> austin.caw...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Can you describe your setup a little bit more? And perhaps how you
>>>>>>>>> use this setup to grant access to other non-Flink pods?
>>>>>>>>>
>>>>>>>>> On Sat, Apr 3, 2021 at 2:29 PM Swagat Mishra <swaga...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Yes I looked at kube2iam, I haven't experimented with it.
>>>>>>>>>>
>>>>>>>>>> Given that the service account has access to S3, shouldn't we
>>>>>>>>>> have a simpler mechanism to connect to underlying resources based on
>>>>>>>>>> the service account authorization?
>>>>>>>>>>
>>>>>>>>>> On Sat, Apr 3, 2021, 10:10 PM Austin Cawley-Edwards <
>>>>>>>>>> austin.caw...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Swagat,
>>>>>>>>>>>
>>>>>>>>>>> I’ve used kube2iam[1] for granting AWS access to Flink pods in
>>>>>>>>>>> the past with good results. It’s all based on mapping pod
>>>>>>>>>>> annotations to AWS IAM roles. Is this something that might work
>>>>>>>>>>> for you?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Austin
>>>>>>>>>>>
>>>>>>>>>>> [1]: https://github.com/jtblin/kube2iam
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Apr 3, 2021 at 10:40 AM Swagat Mishra <
>>>>>>>>>>> swaga...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> No, we are running on AWS. The mechanisms supported by Flink to
>>>>>>>>>>>> connect to resources like S3 need us to make changes that will
>>>>>>>>>>>> impact all services, something that we don't want to do. So
>>>>>>>>>>>> providing the AWS secret key ID and passcode upfront, or IAM rules
>>>>>>>>>>>> where it connects by executing curl/HTTP calls to connect to S3,
>>>>>>>>>>>> doesn't work for me.
>>>>>>>>>>>>
>>>>>>>>>>>> I want to be able to connect to S3 using AWS APIs, and if that
>>>>>>>>>>>> connection can be leveraged by the Presto library, that is what I
>>>>>>>>>>>> am looking for.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Swagat
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Apr 3, 2021, 7:37 PM Israel Ekpo <israele...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Are you running on Azure Kubernetes Service?
>>>>>>>>>>>>>
>>>>>>>>>>>>> You should be able to do it because the identity can be mapped
>>>>>>>>>>>>> to the labels of the pods, not necessarily Flink.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Apr 3, 2021 at 6:31 AM Swagat Mishra <
>>>>>>>>>>>>> swaga...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think Flink doesn't support pod identity. Are there any plans
>>>>>>>>>>>>>> to achieve it in a subsequent release?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Swagat
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
