And actually, I've found that the correct version of the AWS SDK *is* included in Flink 1.12, which was reported and fixed in FLINK-18676 (see [1]). Since you said you saw this also occur in 1.12, can you share more details about what you saw there?
Best,
Austin

[1]: https://issues.apache.org/jira/browse/FLINK-18676

On Mon, Apr 5, 2021 at 4:53 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:

> That looks interesting! I've also found the full list of S3 properties[1]
> for the version of presto-hive bundled with Flink 1.12 (see [2]), which
> includes an option for a KMS key (hive.s3.kms-key-id).
>
> (also, adding back the user list)
>
> [1]: https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration
> [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/filesystems/s3.html#hadooppresto-s3-file-systems-plugins
>
> On Mon, Apr 5, 2021 at 4:21 PM Swagat Mishra <swaga...@gmail.com> wrote:
>
>> Btw, there is also an option to provide a custom credential provider,
>> what are your thoughts on this?
>>
>> presto.s3.credentials-provider
>>
>> On Tue, Apr 6, 2021 at 12:43 AM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>
>>> I've confirmed that for the bundled + shaded aws dependency, the only
>>> way to upgrade it is to build a flink-s3-fs-presto jar with the updated
>>> dependency. Let me know if this is feasible for you, if the KMS key
>>> solution doesn't work.
>>>
>>> Best,
>>> Austin
>>>
>>> On Mon, Apr 5, 2021 at 2:18 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>
>>>> Hi Swagat,
>>>>
>>>> I don't believe there is an explicit configuration option for the KMS
>>>> key – please let me know if you're able to make that work!
>>>>
>>>> Best,
>>>> Austin
>>>>
>>>> On Mon, Apr 5, 2021 at 1:45 PM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>
>>>>> Hi Austin,
>>>>>
>>>>> Let me know what you think on my latest email, if the approach might
>>>>> work, or if it is already supported and I am not using the configurations
>>>>> properly.
>>>>>
>>>>> Thanks for your interest and support.
>>>>>
>>>>> Regards,
>>>>> Swagat
>>>>>
>>>>> On Mon, Apr 5, 2021 at 10:39 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>
>>>>>> Hi Swagat,
>>>>>>
>>>>>> It looks like Flink 1.6 bundles the 1.11.165 version of the
>>>>>> aws-java-sdk-core with the Presto implementation (transitively from
>>>>>> Presto 0.185[1]).
>>>>>> The minimum supported version for the ServiceAccount authentication
>>>>>> approach is 1.11.704 (see [2]), which was released on Jan 9th, 2020[3],
>>>>>> long after Flink 1.6 was released. It looks like even the most recent
>>>>>> Presto is on a version below that, concretely 1.11.697 in the master
>>>>>> branch[4], so I don't think even upgrading Flink to 1.6+ will solve
>>>>>> this, though it looks to me like the AWS dependency is managed better
>>>>>> in more recent Flink versions.
>>>>>> I'll have more for you on that front tomorrow, after the Easter break.
>>>>>>
>>>>>> I think what you would have to do to make this authentication
>>>>>> approach work for Flink 1.6 is build a custom version of the
>>>>>> flink-s3-fs-presto jar, replacing the bundled AWS dependency with the
>>>>>> 1.11.704 version, and then shading it the same way.
>>>>>>
>>>>>> In the meantime, would you mind creating a JIRA ticket with this use
>>>>>> case?
>>>>>> That'll give you the best insight into the status of fixing this :)
>>>>>>
>>>>>> Let me know if that makes sense,
>>>>>> Austin
>>>>>>
>>>>>> [1]: https://github.com/prestodb/presto/blob/1d4ee196df4327568c0982811d8459a44f1792b9/pom.xml#L53
>>>>>> [2]: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html
>>>>>> [3]: https://github.com/aws/aws-sdk-java/releases/tag/1.11.704
>>>>>> [4]: https://github.com/prestodb/presto/blob/master/pom.xml#L52
>>>>>>
>>>>>> On Sun, Apr 4, 2021 at 3:32 AM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>>>
>>>>>>> Austin -
>>>>>>>
>>>>>>> In my case the setup is such that services are deployed on
>>>>>>> Kubernetes with Docker, running on EKS. There is also an Istio service
>>>>>>> mesh. So all the services communicate and access AWS resources like S3
>>>>>>> using the service account. The service account is associated with IAM
>>>>>>> roles. I have verified that the service account has access to S3 by
>>>>>>> running a program that connects to S3 to read a file; the AWS client,
>>>>>>> when packaged into the pod, is also able to access S3. So that means
>>>>>>> the roles and policies are good.
>>>>>>>
>>>>>>> When I am running Flink, I am following the same configuration for
>>>>>>> job manager and task manager as provided here:
>>>>>>>
>>>>>>> https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/standalone/kubernetes.html
>>>>>>>
>>>>>>> The exception we are getting is -
>>>>>>> org.apache.flink.fs.s3presto.shaded.com.amazonaws.SDKClientException:
>>>>>>> Unable to load credentials from service end point.
>>>>>>>
>>>>>>> This happens in the EC2CredentialFetcher class method
>>>>>>> fetchCredentials - line number 66, when it tries to read the resource,
>>>>>>> effectively executing
>>>>>>> CURL 169.254.170.2/AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
>>>>>>>
>>>>>>> I am not setting the variable AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
>>>>>>> because it's not the right way to do it for us; we are on EKS. Similarly,
>>>>>>> none of the ~/.aws/credentials file approaches will work for us.
>>>>>>>
>>>>>>> Atm, I haven't tried the Kubernetes service account property you
>>>>>>> mentioned above. I will try and let you know how it goes.
>>>>>>>
>>>>>>> Question - do I need to provide any parameters while building the
>>>>>>> Docker image, or any configuration in the Flink config, to tell Flink
>>>>>>> that for all purposes it should be using the service account and not
>>>>>>> try to get into the EC2CredentialFetcher class?
>>>>>>>
>>>>>>> One more thing - we were trying this on the 1.6 version of Flink and
>>>>>>> not the 1.12 version.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Swagat
>>>>>>>
>>>>>>> On Sun, Apr 4, 2021 at 8:56 AM Sameer Wadkar <sam...@axiomine.com> wrote:
>>>>>>>
>>>>>>>> Kube2Iam needs to modify iptables to proxy calls to the EC2 metadata
>>>>>>>> endpoint to a daemonset that runs privileged pods. These map the IP
>>>>>>>> address of a pod and its associated service account in order to make
>>>>>>>> STS calls and return temporary AWS credentials. Your pod “thinks” the
>>>>>>>> EC2 metadata URL works locally, as on an EC2 instance.
>>>>>>>>
>>>>>>>> I have found that mutating webhooks are easier to deploy (when you
>>>>>>>> have no control over the Kubernetes environment - say you cannot change
>>>>>>>> iptables or run privileged pods). These can configure the
>>>>>>>> ~/.aws/credentials file. The webhook can make the STS call for the
>>>>>>>> service account to role mapping.
>>>>>>>> A sidecar container to which the main container has no access can
>>>>>>>> even renew credentials, because STS returns temporary credentials.
>>>>>>>>
>>>>>>>> On Apr 3, 2021, at 10:29 PM, Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> If you’re just looking to attach a service account to a pod using
>>>>>>>> the native AWS EKS IAM mapping[1], you should be able to attach the
>>>>>>>> service account to the pod via the `kubernetes.service-account`
>>>>>>>> configuration option[2].
>>>>>>>>
>>>>>>>> Let me know if that works for you!
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Austin
>>>>>>>>
>>>>>>>> [1]: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
>>>>>>>> [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#kubernetes-service-account
>>>>>>>>
>>>>>>>> On Sat, Apr 3, 2021 at 10:18 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Can you describe your setup a little bit more? And perhaps how you
>>>>>>>>> use this setup to grant access to other non-Flink pods?
>>>>>>>>>
>>>>>>>>> On Sat, Apr 3, 2021 at 2:29 PM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, I looked at kube2iam, but I haven't experimented with it.
>>>>>>>>>>
>>>>>>>>>> Given that the service account has access to S3, shouldn't we
>>>>>>>>>> have a simpler mechanism to connect to underlying resources based on
>>>>>>>>>> the service account authorization?
>>>>>>>>>>
>>>>>>>>>> On Sat, Apr 3, 2021, 10:10 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Swagat,
>>>>>>>>>>>
>>>>>>>>>>> I’ve used kube2iam[1] for granting AWS access to Flink pods in
>>>>>>>>>>> the past with good results. It’s all based on mapping pod
>>>>>>>>>>> annotations to AWS IAM roles.
>>>>>>>>>>> Is this something that might work for you?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Austin
>>>>>>>>>>>
>>>>>>>>>>> [1]: https://github.com/jtblin/kube2iam
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Apr 3, 2021 at 10:40 AM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> No, we are running on AWS. The mechanisms supported by Flink to
>>>>>>>>>>>> connect to resources like S3 need us to make changes that will
>>>>>>>>>>>> impact all services, something that we don't want to do. So
>>>>>>>>>>>> providing the AWS secret key ID and passcode upfront, or IAM rules
>>>>>>>>>>>> where it connects by executing curl/HTTP calls to connect to S3,
>>>>>>>>>>>> doesn't work for me.
>>>>>>>>>>>>
>>>>>>>>>>>> I want to be able to connect to S3 using the AWS APIs, and if that
>>>>>>>>>>>> connection can be leveraged by the Presto library, that is what I
>>>>>>>>>>>> am looking for.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Swagat
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Apr 3, 2021, 7:37 PM Israel Ekpo <israele...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Are you running on Azure Kubernetes Service?
>>>>>>>>>>>>>
>>>>>>>>>>>>> You should be able to do it because the identity can be mapped
>>>>>>>>>>>>> to the labels of the pods, not necessarily Flink.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Apr 3, 2021 at 6:31 AM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think Flink doesn't support pod identity. Are there any plans
>>>>>>>>>>>>>> to achieve it in a subsequent release?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Swagat
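
A side note on the version argument made earlier in this thread: the SDK versions quoted there (1.11.165 bundled via Presto 0.185, 1.11.697 on Presto master at the time, and 1.11.704 as the minimum for service-account authentication) can be compared mechanically. The following is only a small illustrative sketch in Python. The helper names `parse_version` and `supports_service_account_auth` are made up for this note and are not part of Flink, Presto, or the AWS SDK, and it assumes plain dotted numeric versions.

```python
# Minimum aws-java-sdk version for service-account authentication,
# as quoted in the thread (taken from the AWS EKS documentation link).
MIN_SERVICE_ACCOUNT_SDK_VERSION = "1.11.704"


def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '1.11.165' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def supports_service_account_auth(sdk_version: str) -> bool:
    """Return True if sdk_version is at least the minimum quoted in the thread."""
    minimum = parse_version(MIN_SERVICE_ACCOUNT_SDK_VERSION)
    return parse_version(sdk_version) >= minimum


if __name__ == "__main__":
    # Versions and labels are the ones mentioned in the thread.
    candidates = [
        ("bundled with Flink 1.6 via Presto 0.185", "1.11.165"),
        ("Presto master at the time of the thread", "1.11.697"),
        ("minimum for service-account authentication", "1.11.704"),
    ]
    for label, version in candidates:
        ok = supports_service_account_auth(version)
        print(f"{version} ({label}): supported={ok}")
```

Comparing tuples of integers rather than raw strings avoids lexicographic surprises with multi-digit components; it says nothing about whether a given shaded jar actually contains that SDK version, which still has to be checked in the build.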