Re: Flink - Pod Identity

2021-04-03 Thread Sameer Wadkar
kube2iam needs to modify iptables so that calls to the EC2 metadata endpoint 
are proxied to a DaemonSet running privileged pods. That DaemonSet maps each 
pod's IP address to its associated service account, makes the STS calls, and 
returns temporary AWS credentials. Your pod “thinks” the EC2 metadata URL works 
locally, as it would on an EC2 instance. 

I have found that mutating webhooks are easier to deploy when you have no 
control over the Kubernetes environment (say, you cannot change iptables or run 
privileged pods). A webhook can configure the ~/.aws/credentials file and make 
the STS call for the service-account-to-role mapping. A sidecar container that 
the main container cannot access can even renew the credentials, because STS 
returns only temporary credentials. 
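
As a rough illustration of the credentials-file approach, the sketch below renders temporary credentials in the shared AWS credentials file format, which a sidecar could rewrite on each renewal. The function name and the placeholder values are hypothetical, and fetching the credentials from STS is deliberately left out.

```python
import configparser
import io


def render_credentials(access_key_id: str, secret_access_key: str,
                       session_token: str, profile: str = "default") -> str:
    """Render temporary credentials as the text of an AWS credentials file."""
    # Interpolation is disabled so that a '%' in a secret is written verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser[profile] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        # Temporary credentials from STS also carry a session token.
        "aws_session_token": session_token,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()
```

A sidecar would write the returned text to the credentials file on a volume shared with the main container, then repeat the step before the credentials expire.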


> On Apr 3, 2021, at 10:29 PM, Austin Cawley-Edwards wrote:
> 
> If you’re just looking to attach a service account to a pod using the native 
> AWS EKS IAM mapping[1], you should be able to attach the service account to 
> the pod via the `kubernetes.service-account` configuration option[2]. 
> 
> Let me know if that works for you!
> 
> Best,
> Austin 
> 
> [1]: 
> https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
> [2]: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#kubernetes-service-account
> 


Re: Flink - Pod Identity

2021-04-03 Thread Austin Cawley-Edwards
If you’re just looking to attach a service account to a pod using the
native AWS EKS IAM mapping[1], you should be able to attach the service
account to the pod via the `kubernetes.service-account` configuration
option[2].

Let me know if that works for you!

Best,
Austin

[1]:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
[2]:
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#kubernetes-service-account
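
A minimal sketch of how that option might be passed on the command line, assuming a native Kubernetes application-mode deployment. The helper name, cluster ID, image, service account name, and jar path are all hypothetical placeholders.

```python
from typing import List


def flink_run_application_args(cluster_id: str, image: str,
                               service_account: str, jar_uri: str) -> List[str]:
    """Build the argument list for a native Kubernetes application-mode submit."""
    return [
        "./bin/flink", "run-application",
        "--target", "kubernetes-application",
        f"-Dkubernetes.cluster-id={cluster_id}",
        f"-Dkubernetes.container.image={image}",
        # The option discussed above: the service account the Flink pods run as.
        f"-Dkubernetes.service-account={service_account}",
        jar_uri,
    ]
```

For example, `flink_run_application_args("my-cluster", "my-image", "flink-s3-access", "local:///opt/flink/usrlib/my-job.jar")` returns a list that can be handed to `subprocess.run`. The service account itself still has to be associated with an IAM role as described in [1].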

On Sat, Apr 3, 2021 at 10:18 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> Can you describe your setup a little bit more? And perhaps how you use
> this setup to grant access to other non-Flink pods?
>


Re: Flink - Pod Identity

2021-04-03 Thread Austin Cawley-Edwards
Can you describe your setup a little bit more? And perhaps how you use this
setup to grant access to other non-Flink pods?

On Sat, Apr 3, 2021 at 2:29 PM Swagat Mishra wrote:

> Yes, I looked at kube2iam, but I haven't experimented with it.
>
> Given that the service account has access to S3, shouldn't we have a
> simpler mechanism to connect to underlying resources based on the service
> account authorization?
>


Re: Flink - Pod Identity

2021-04-03 Thread Swagat Mishra
Yes, I looked at kube2iam, but I haven't experimented with it.

Given that the service account has access to S3, shouldn't we have a
simpler mechanism to connect to underlying resources based on the service
account authorization?

On Sat, Apr 3, 2021, 10:10 PM Austin Cawley-Edwards wrote:

> Hi Swagat,
>
> I’ve used kube2iam[1] for granting AWS access to Flink pods in the past
> with good results. It’s all based on mapping pod annotations to AWS IAM
> roles. Is this something that might work for you?
>
> Best,
> Austin
>
> [1]: https://github.com/jtblin/kube2iam
>


Re: Flink - Pod Identity

2021-04-03 Thread Austin Cawley-Edwards
Hi Swagat,

I’ve used kube2iam[1] for granting AWS access to Flink pods in the past
with good results. It’s all based on mapping pod annotations to AWS IAM
roles. Is this something that might work for you?

Best,
Austin

[1]: https://github.com/jtblin/kube2iam
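
To make the annotation mapping concrete, here is a small sketch of pod metadata carrying the kube2iam role annotation. The annotation key is the one kube2iam reads by default; the helper name, pod name, and role name are hypothetical.

```python
from typing import Any, Dict

# Default annotation key that kube2iam looks up on each pod.
KUBE2IAM_ROLE_ANNOTATION = "iam.amazonaws.com/role"


def pod_metadata_with_role(pod_name: str, iam_role: str) -> Dict[str, Any]:
    """Return pod metadata that asks kube2iam to serve credentials for iam_role."""
    return {
        "name": pod_name,
        "annotations": {KUBE2IAM_ROLE_ANNOTATION: iam_role},
    }
```

The resulting dictionary corresponds to the `metadata` section of a pod manifest. How the annotation reaches the Flink pods depends on how they are deployed.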

On Sat, Apr 3, 2021 at 10:40 AM Swagat Mishra wrote:

> No, we are running on AWS. The mechanisms Flink supports for connecting to
> resources like S3 require changes that would affect all our services, which
> we don't want. So providing the AWS access key ID and secret key upfront, or
> using IAM roles where it connects by executing curl/HTTP calls to reach S3,
> doesn't work for me.
>
> I want to be able to connect to S3 using the AWS APIs, and if that connection
> can be leveraged by the Presto library, that is what I am looking for.
>
> Regards,
> Swagat
>
>


Re: Flink - Pod Identity

2021-04-03 Thread Swagat Mishra
No, we are running on AWS. The mechanisms Flink supports for connecting to
resources like S3 require changes that would affect all our services, which
we don't want. So providing the AWS access key ID and secret key upfront, or
using IAM roles where it connects by executing curl/HTTP calls to reach S3,
doesn't work for me.

I want to be able to connect to S3 using the AWS APIs, and if that connection
can be leveraged by the Presto library, that is what I am looking for.

Regards,
Swagat

On Sat, Apr 3, 2021, 7:37 PM Israel Ekpo wrote:

> Are you running on Azure Kubernetes Service?
>
> You should be able to do it, because the identity can be mapped to the
> labels of the pods, not necessarily to Flink.
>


Re: Flink - Pod Identity

2021-04-03 Thread Israel Ekpo
Are you running on Azure Kubernetes Service?

You should be able to do it, because the identity can be mapped to the
labels of the pods, not necessarily to Flink.

On Sat, Apr 3, 2021 at 6:31 AM Swagat Mishra wrote:

> Hi,
>
> I think Flink doesn't support pod identity. Are there any plans to achieve
> it in a subsequent release?
>
> Regards,
> Swagat
>
>
>


UniqueKey constraint is lost with multiple sources join in SQL

2021-04-03 Thread Kai Fu
Hi team,

We have a use case that joins multiple data sources to generate a
continuously updated view. We defined a primary key constraint on every input
source, and each key is a subset of the corresponding join condition. All joins
are left joins.

In our case, the first two inputs produce the *JoinKeyContainsUniqueKey* input
spec, which is good and performant. The third input source, however, is joined
with the intermediate output table of the first two input tables, and that
intermediate table does not carry key constraint information (although the
third source input table does), so the result is a *NoUniqueKey* input spec.
Given that NoUniqueKey inputs have dramatic performance implications, per the
"Force Join Unique Key" email thread, we want to know whether there is any
mitigation plan for this.
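
The distinction between the input specs can be sketched with a simplified model. This is an illustration of the idea only, not the planner's actual code: the function name is hypothetical, and the middle category name `HasUniqueKey` is an assumption about how the plan labels an input whose unique key is not covered by the join key.

```python
from typing import FrozenSet, Iterable


def classify_input_spec(unique_keys: Iterable[FrozenSet[str]],
                        join_key: FrozenSet[str]) -> str:
    """Classify one join input by how its unique keys relate to the join key."""
    keys = [key for key in unique_keys if key]
    if not keys:
        # No key information reached this input, as with the intermediate table.
        return "NoUniqueKey"
    if any(key <= join_key for key in keys):
        # Some unique key is fully covered by the join key columns.
        return "JoinKeyContainsUniqueKey"
    return "HasUniqueKey"
```

Under this model a source table with primary key `id` joined on `id` is classified as `JoinKeyContainsUniqueKey`, while an intermediate result for which no unique key was derived falls into `NoUniqueKey` regardless of the join condition.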

One solution I can come up with is to write the intermediate result to
somewhere like Kafka with a unique constraint and then join it with the
third source, but that requires extra resources. Any other suggestions?
Thanks.

-- 
*Best regards,*
*- Kai*


Flink - Pod Identity

2021-04-03 Thread Swagat Mishra
Hi,

I think Flink doesn't support pod identity. Are there any plans to achieve it
in a subsequent release?

Regards,
Swagat