Re: [DISCUSS] Dashboard/HistoryServer authentication
Thank you, team. Based on this agreement, we are moving the proposal to the wiki and will open the PR soon.

On Thu, Jul 1, 2021 at 12:28 AM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
> [...]
Re: [DISCUSS] Dashboard/HistoryServer authentication
> * Even if there is a major breaking version, Flink releases major versions too where it could be added
> Netty framework locking is true but AFAIK there was a discussion to rewrite the Netty stuff to a more sexy thing but there was no agreement to do that.

Flink major releases seem to happen even less frequently than Netty releases :( It would be unfortunate if a breaking Netty API change ended up in the FLINK-3957[1] catch-all 2.0 changes.

> All in all I would agree on making it experimental.

Thus I am happy with this compromise, thank you :)

> This would simply restrict use-cases where order is not important. Limiting devs in such an odd way is a no-go.

I think the only case to be made for imposing limitations would be to encourage devs to only use this API in very specific situations, otherwise to solve this in another way, and revisit the API if these limitations are met and alternatives do not work. That said, I am still trying to understand this specific Cloudera case – anything you can say about the limitations of its Flink setup (i.e., difficult to spawn sidecar processes (because of Yarn?)) would be greatly helpful to me and others without this bit of context.

But I think the proposed priority function that you've added is a nice compromise as well, so +1 from my side with the proposal. I would only further suggest that we include the other options to this problem in the docs as the preferred approach, where possible.

Thanks,
Austin

[1]: https://issues.apache.org/jira/browse/FLINK-3957

On Wed, Jun 30, 2021 at 10:25 AM Gabor Somogyi wrote:
> [...]
Re: [DISCUSS] Dashboard/HistoryServer authentication
Answered here because the text started to be crowded.

> It also locks Flink into the current major version of Netty (and the Netty framework itself) for the foreseeable future.

It's not doing any Netty version locking because:
* Netty will not necessarily add breaking changes in major versions, the API is quite stable
* Even if there is a major breaking version, Flink releases major versions too where it could be added

Netty framework locking is true but AFAIK there was a discussion to rewrite the Netty stuff to a more sexy thing but there was no agreement to do that.
All in all I would agree on making it experimental.

> why not restrict the service loader to only allow one?

This would simply restrict use-cases where order is not important. Limiting devs in such an odd way is a no-go.
I think the ordering came up in multiple places, which I think is a good reason to fill this gap with a priority function.
I've updated the doc and added it...

BR,
G

On Wed, Jun 30, 2021 at 3:53 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
> [...]
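The priority function mentioned above could work roughly as follows. This is a minimal sketch, not the API from the proposal doc: `HandlerFactory`, `SimpleFactory`, `priority()` and the "larger value is installed first" rule are all assumptions made for illustration. The factories are passed in as a plain `Iterable` so the example runs without any `META-INF/services` files; `java.util.ServiceLoader` is itself an `Iterable`, so the result of `ServiceLoader.load(...)` could be passed the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PriorityOrderingSketch {

    /** Hypothetical stand-in for the proposed handler factory; not Flink's actual interface. */
    interface HandlerFactory {
        String name();

        /** Assumption for this sketch: a larger value means "install earlier". */
        int priority();
    }

    /** Minimal implementation used only for the demo. */
    static final class SimpleFactory implements HandlerFactory {
        private final String name;
        private final int priority;

        SimpleFactory(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }

        @Override public String name() { return name; }
        @Override public int priority() { return priority; }
    }

    /** Copies the discovered factories and sorts them by descending priority. */
    static List<HandlerFactory> ordered(Iterable<? extends HandlerFactory> discovered) {
        List<HandlerFactory> result = new ArrayList<>();
        for (HandlerFactory factory : discovered) {
            result.add(factory);
        }
        Comparator<HandlerFactory> byPriority = Comparator.comparingInt(HandlerFactory::priority);
        // List.sort is stable, so factories with equal priority keep their discovery order.
        result.sort(byPriority.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Discovery order is deliberately "wrong": the authorization module is found first.
        List<HandlerFactory> discovered = Arrays.asList(
                new SimpleFactory("internal-authorization", 10),
                new SimpleFactory("google-oauth", 100));
        for (HandlerFactory factory : ordered(discovered)) {
            System.out.println(factory.name());
        }
    }
}
```

With these made-up priorities, the OAuth factory is listed before the internal authorization factory regardless of the order in which the two were discovered, which is the guarantee the Google OAuth use case asks for.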
Re: [DISCUSS] Dashboard/HistoryServer authentication
Small correction: I am *not* saying we should not do this, perhaps this is the best solution to finding a good compromise here, but I am trying to discover + acknowledge the full implications of this proposal so they can be discussed.

Sorry :)

On Wed, Jun 30, 2021 at 9:53 AM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
> [...]
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Gabor,

Thanks for your answers. I appreciate the explanations. Please see my responses + further questions below.

>> * What stability semantics do you envision for this API?
>
> As I foresee, the API will be as stable as the Netty API. Since there is a guarantee of no breaking changes between minor versions, we can give the same guarantee.
> If for whatever reason we need to break it, we can do it in a major version like every other open source project does.

>> * Does Flink expose dependencies’ APIs in other places? Since this exposes the Netty API, will this make it difficult to upgrade Netty?
>
> I don't expect breaking changes between minor versions, so in such cases there will be no issues. If there is a breaking change in a major version, we need to wait for a Flink major version too.

To clarify, you are proposing this new API to have the same stability guarantees as @Public currently does? Where we will not introduce breaking changes unless absolutely necessary (and requiring a FLIP, etc.)?

If this is the case, I think this puts the community in a tough position where we are forced to maintain compatibility with something that we do not have control over. It also locks Flink into the current major version of Netty (and the Netty framework itself) for the foreseeable future.

I am saying we should not do this, perhaps this is the best solution to finding a good compromise here, but I am trying to discover + acknowledge the full implications of this proposal so they can be discussed.

What do you think about marking this API as @Experimental and not guaranteeing stability between versions? Then, if we do decide we need to upgrade Netty (or move away from it), we can do so.

>> * I share Till's concern about multiple factories – other HTTP middleware frameworks commonly support chaining middlewares. Since the proposed API does not include these features/guarantee ordering, do you see any reason to allow more than one factory?
>
> I personally can't come up with a use-case where ordering is a must. I'm not saying that this is not a valid use-case, but adding a feature w/o business rationale would include the maintenance cost (though I'm open to adding it).
> As I've seen, Till also can't give an example for that (please see the doc comments). If you have anything in mind please share it and we can add priority to the API.
> There is another option too, namely we can be defensive and add the priority right now. I would do this only if everybody states in mail that it would be the best option, otherwise I would stick to the original plan.

Let me try to come up with a use case:
* Someone creates an authentication module for integrating with Google's OAuth and publishes it to flink-packages
* Another person in another org wants to use Google OAuth and then add internal authorization based on the user
* In this scenario, *Google OAuth must come before the internal authorization*
* They place their module and the Google OAuth module to be picked up by the service loader
* What happens?

I do not think that the current proposal has a way to handle this, besides having the implementor of the internal authorization module bundle everything into one, as you have suggested. Since this is the only way to achieve order, why not restrict the service loader to only allow one? This way the API is explicit in what it supports.

Let me know what you think,
Austin

On Wed, Jun 30, 2021 at 5:24 AM Gabor Somogyi wrote:
> [...]
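The alternative raised above, restricting the service loader to a single factory, can be sketched as a fail-fast check at discovery time. This is only an illustration under assumptions: `atMostOne` is a hypothetical helper, not part of Flink or of the proposal, and plain strings stand in for factory instances so the example runs without service registration files. Since `java.util.ServiceLoader` is an `Iterable`, the result of `ServiceLoader.load(...)` could be passed to the same method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class SingleFactorySketch {

    /**
     * Returns the only discovered element, or empty if none was found.
     * Fails fast when more than one is present, so ordering never has to be defined.
     */
    static <T> Optional<T> atMostOne(Iterable<T> discovered) {
        List<T> found = new ArrayList<>();
        for (T item : discovered) {
            found.add(item);
        }
        if (found.size() > 1) {
            throw new IllegalStateException(
                    "Expected at most one handler factory but found " + found.size() + ": " + found);
        }
        return found.isEmpty() ? Optional.empty() : Optional.of(found.get(0));
    }

    public static void main(String[] args) {
        // No factory on the classpath: nothing to install.
        System.out.println(atMostOne(Collections.<String>emptyList()).isPresent());

        // Exactly one factory: it is used as-is.
        System.out.println(atMostOne(Arrays.asList("google-oauth")).get());

        // Two factories: startup is rejected instead of picking an arbitrary order.
        try {
            atMostOne(Arrays.asList("google-oauth", "internal-authorization"));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under this design, anyone who needs both OAuth and internal authorization has to bundle them into one module and order them inside it, which makes the limitation explicit in the API rather than leaving the outcome to discovery order.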
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Austin,

Please see my answers embedded down below.

BR,
G

On Tue, Jun 29, 2021 at 9:59 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:

> Hi all,
>
> Thanks for the updated proposal. I have a few questions about the API, please see below.
>
> * What stability semantics do you envision for this API?

As I foresee, the API will be as stable as the Netty API. Since there is a guarantee of no breaking changes between minor versions, we can give the same guarantee.
If for whatever reason we need to break it, we can do it in a major version like every other open source project does.

> * Does Flink expose dependencies’ APIs in other places? Since this exposes the Netty API, will this make it difficult to upgrade Netty?

I don't expect breaking changes between minor versions, so in such cases there will be no issues. If there is a breaking change in a major version, we need to wait for a Flink major version too.

> * I share Till's concern about multiple factories – other HTTP middleware frameworks commonly support chaining middlewares. Since the proposed API does not include these features/guarantee ordering, do you see any reason to allow more than one factory?

I personally can't come up with a use-case where ordering is a must. I'm not saying that this is not a valid use-case, but adding a feature w/o business rationale would include the maintenance cost (though I'm open to adding it).
As I've seen, Till also can't give an example for that (please see the doc comments). If you have anything in mind please share it and we can add priority to the API.
There is another option too, namely we can be defensive and add the priority right now. I would do this only if everybody states in mail that it would be the best option, otherwise I would stick to the original plan.

> Best,
> Austin
>
> On Tue, Jun 29, 2021 at 8:55 AM Márton Balassi wrote:
>> [...]
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi all,

Thanks for the updated proposal. I have a few questions about the API, please see below.

* What stability semantics do you envision for this API?
* Does Flink expose dependencies’ APIs in other places? Since this exposes the Netty API, will this make it difficult to upgrade Netty?
* I share Till's concern about multiple factories – other HTTP middleware frameworks commonly support chaining middlewares. Since the proposed API does not include these features/guarantee ordering, do you see any reason to allow more than one factory?

Best,
Austin

On Tue, Jun 29, 2021 at 8:55 AM Márton Balassi wrote:

> Hi all,
>
> I commend Konstantin and Till when it comes to standing up for the community values.
>
> Based on your feedback we are withdrawing the original proposal and attaching a more general custom netty handler API proposal [1] written by G. The change necessary to the Flink repository is approximately 500 lines of code. [2]
>
> Please let us focus on discussing the details of this API and whether it covers the necessary use cases.
>
> [1] https://docs.google.com/document/d/1Idnw8YauMK1x_14iv0rVF0Hqm58J6Dg-hi-hEuL6hwM/edit#heading=h.ijcbce3c5gip
> [2] https://github.com/gaborgsomogyi/flink/commit/942f23679ac21428bb87fc85557b9b443fcaf310
>
> Thanks,
> Marton
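The ordering question raised above (what happens when more than one handler factory is registered) can be made concrete with a small sketch. This is only an illustration under stated assumptions: `InboundHandlerFactory`, `priority()`, `factory(...)` and `orderedNames(...)` are hypothetical names invented here and are not taken from Flink, Netty, or the proposal document. The sketch shows one possible way to turn an unordered set of discovered factories into a deterministic chain by sorting on an explicit priority.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: none of these names come from Flink or from the proposal document.
public class HandlerFactoryOrdering {

    /** Stand-in for a pluggable factory that contributes one inbound handler. */
    public interface InboundHandlerFactory {
        String name();

        /** Assumption for this sketch: a higher value means the handler runs earlier. */
        int priority();
    }

    /** Small helper that builds a factory in code, so no service file is needed. */
    public static InboundHandlerFactory factory(String name, int priority) {
        return new InboundHandlerFactory() {
            @Override public String name() { return name; }
            @Override public int priority() { return priority; }
        };
    }

    /**
     * Returns factory names in chain order: highest priority first, ties broken by
     * name so the result does not depend on discovery order. In a real setup the
     * argument could be a java.util.ServiceLoader, which is itself an Iterable.
     */
    public static List<String> orderedNames(Iterable<InboundHandlerFactory> discovered) {
        List<InboundHandlerFactory> sorted = new ArrayList<>();
        discovered.forEach(sorted::add);
        sorted.sort(Comparator.comparingInt(InboundHandlerFactory::priority)
                .reversed()
                .thenComparing(InboundHandlerFactory::name));
        List<String> names = new ArrayList<>();
        for (InboundHandlerFactory f : sorted) {
            names.add(f.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // Discovery order is deliberately "wrong" to show that sorting fixes it.
        List<InboundHandlerFactory> discovered = Arrays.asList(
                factory("audit-logging", 0),
                factory("authentication", 100));
        System.out.println(orderedNames(discovered)); // prints [authentication, audit-logging]
    }
}
```

With a single allowed factory none of this would be needed; with several, some explicit rule of this kind is one way to keep the chain order from depending on classpath or discovery order.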
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi all,

I commend Konstantin and Till when it comes to standing up for the community values.

Based on your feedback we are withdrawing the original proposal and attaching a more general custom netty handler API proposal [1] written by G. The change necessary to the Flink repository is approximately 500 lines of code. [2]

Please let us focus on discussing the details of this API and whether it covers the necessary use cases.

[1] https://docs.google.com/document/d/1Idnw8YauMK1x_14iv0rVF0Hqm58J6Dg-hi-hEuL6hwM/edit#heading=h.ijcbce3c5gip
[2] https://github.com/gaborgsomogyi/flink/commit/942f23679ac21428bb87fc85557b9b443fcaf310

Thanks,
Marton

On Wed, Jun 23, 2021 at 9:36 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote:

> Hi all,
>
> Thanks, Konstantin and Till, for guiding the discussion.
>
> I was not aware of the results of the call with Konstantin and was attempting to resolve the unanswered questions before more, potentially fruitless, work was done.
>
> I am also looking forward to the coming proposal, as well as increasing my understanding of this specific use case + its limitations!
>
> Best,
> Austin
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi all,

Thanks, Konstantin and Till, for guiding the discussion.

I was not aware of the results of the call with Konstantin and was attempting to resolve the unanswered questions before more, potentially fruitless, work was done.

I am also looking forward to the coming proposal, as well as increasing my understanding of this specific use case + its limitations!

Best,
Austin

On Tue, Jun 22, 2021 at 6:32 AM Till Rohrmann wrote:

> Hi everyone,
>
> I do like the idea of keeping the actual change outside of Flink but to enable Flink to support such a use case (different authentication mechanisms). I think this is a good compromise for the community that combines long-term maintainability with support for new use-cases. I am looking forward to your proposal.
>
> I also want to second Konstantin here that the tone of your last email, Marton, does not reflect the values and manners of the Flink community and is not representative of how we conduct discussions. Especially, the more senior community members should know this and act accordingly in order to be good role models for others in the community. Technical discussions should not be decided by who wields presumably the greatest authority but by the soundness of arguments and by what is the best solution for a problem.
>
> Let us now try to find the best solution for the problem at hand!
>
> Cheers,
> Till
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi everyone,

I do like the idea of keeping the actual change outside of Flink but to enable Flink to support such a use case (different authentication mechanisms). I think this is a good compromise for the community that combines long-term maintainability with support for new use-cases. I am looking forward to your proposal.

I also want to second Konstantin here that the tone of your last email, Marton, does not reflect the values and manners of the Flink community and is not representative of how we conduct discussions. Especially, the more senior community members should know this and act accordingly in order to be good role models for others in the community. Technical discussions should not be decided by who wields presumably the greatest authority but by the soundness of arguments and by what is the best solution for a problem.

Let us now try to find the best solution for the problem at hand!

Cheers,
Till

On Tue, Jun 22, 2021 at 11:24 AM Konstantin Knauf wrote:

> Hi everyone,
>
> First, Marton and I had a brief conversation yesterday offline and discussed exploring the approach of exposing the authentication functionality via an API. So, I am looking forward to your proposal in that direction. The benefit of such a solution would be that it is extensible for others and it does add a smaller maintenance (in particular testing) footprint to Apache Flink itself. If we end up going down this route, flink-packages.org would be a great way to promote these third party "authentication modules".
>
> Second, Marton, I understand your frustration about the long discussion on this "simple matter", but the condescending tone of your last mail feels uncalled for to me. Austin expressed a valid opinion on the topic, which is based on his experience from other Open Source frameworks (CNCF mostly). I am sure you agree that it is important for Apache Flink to stay open and to consider different approaches and ideas and I don't think it helps the culture of discussion to shoot it down like this ("This is where this discussion stops.").
>
> Let's continue to move this discussion forward and I am sure we'll find a consensus based on product and technological considerations.
>
> Thanks,
>
> Konstantin
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi everyone,

First, Marton and I had a brief conversation yesterday offline and discussed exploring the approach of exposing the authentication functionality via an API. So, I am looking forward to your proposal in that direction. The benefit of such a solution would be that it is extensible for others and it does add a smaller maintenance (in particular testing) footprint to Apache Flink itself. If we end up going down this route, flink-packages.org would be a great way to promote these third party "authentication modules".

Second, Marton, I understand your frustration about the long discussion on this "simple matter", but the condescending tone of your last mail feels uncalled for to me. Austin expressed a valid opinion on the topic, which is based on his experience from other Open Source frameworks (CNCF mostly). I am sure you agree that it is important for Apache Flink to stay open and to consider different approaches and ideas and I don't think it helps the culture of discussion to shoot it down like this ("This is where this discussion stops.").

Let's continue to move this discussion forward and I am sure we'll find a consensus based on product and technological considerations.

Thanks,

Konstantin

On Tue, Jun 22, 2021 at 9:31 AM Márton Balassi wrote:

> Hi Austin,
>
> Thank you for your thoughts. This is where this discussion stops. This email thread already contains more characters than the implementation and what is needed for the next 20 years of maintenance.
>
> It is great that you have a view on modern solutions and thank you for offering your help with brainstorming solutions. I am responsible for Flink at Cloudera and we do need an implementation like this and it is in fact already in production at dozens of customers. We are open to adapting that to expose a more generic API (and keeping Kerberos to our fork), to contribute this to the community as others have asked for it and to protect ourselves from occasionally having to update this critical implementation path based on changes in the Apache codebase. I have worked with close to a hundred Big Data customers as a consultant and an engineering manager and committed hundreds of changes to Apache Flink over the past decade, please trust my judgement on a simple matter like this.
>
> Please forgive me for referencing authority, this discussion was getting out of hand. Please keep vigilant.
>
> Best,
> Marton
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Austin,

Thank you for your thoughts. This is where this discussion stops. This email thread already contains more characters than the implementation and what is needed for the next 20 years of maintenance.

It is great that you have a view on modern solutions and thank you for offering your help with brainstorming solutions. I am responsible for Flink at Cloudera and we do need an implementation like this and it is in fact already in production at dozens of customers. We are open to adapting that to expose a more generic API (and keeping Kerberos to our fork), to contribute this to the community as others have asked for it and to protect ourselves from occasionally having to update this critical implementation path based on changes in the Apache codebase. I have worked with close to a hundred Big Data customers as a consultant and an engineering manager and committed hundreds of changes to Apache Flink over the past decade, please trust my judgement on a simple matter like this.

Please forgive me for referencing authority, this discussion was getting out of hand. Please keep vigilant.

Best,
Marton

On Mon, Jun 21, 2021 at 10:50 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote:

> Hi Gabor + Marton,
>
> I don't believe that the issue with this proposal is the specific mechanism proposed (Kerberos), but rather that it is not the level to implement it at (Flink). I'm just one voice, so please take this with a grain of salt.
>
> In the other solutions previously noted there is no need to instrument Flink which, in addition to reducing the maintenance burden, provides a better, decoupled end result.
>
> IMO we should not add any new API in Flink for this use case. I think it is unfortunate and sympathize with the work that has already been done on this feature – perhaps we could brainstorm ways to run this alongside Flink in your setup. Again, I don't think the proposed solution of an agnostic API would not work, nor is it a bad idea, but is not one that will make Flink more compatible with the modern solutions to this problem.
>
> Best,
> Austin
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Gabor + Marton, I don't believe that the issue with this proposal is the specific mechanism proposed (Kerberos), but rather that it is not the level to implement it at (Flink). I'm just one voice, so please take this with a grain of salt. In the other solutions previously noted there is no need to instrument Flink which, in addition to reducing the maintenance burden, provides a better, decoupled end result. IMO we should not add any new API in Flink for this use case. I think it is unfortunate and sympathize with the work that has already been done on this feature – perhaps we could brainstorm ways to run this alongside Flink in your setup. Again, I don't think the proposed solution of an agnostic API would not work, nor is it a bad idea, but is not one that will make Flink more compatible with the modern solutions to this problem. Best, Austin On Mon, Jun 21, 2021 at 2:18 PM Márton Balassi wrote: > Hi team, > > Thank you for your input. Based on this discussion I agree with G that > selecting and standardizing on a specific strong authentication mechanism > is more challenging than the whole rest of the scope of this authentication > story. :-) I suggest that G and I go back to the drawing board and come up > with an API that can support multiple authentication mechanisms, and we > would only merge said API to Flink. Specific implementations of it can be > maintained outside of the project. This way we tackle the main challenge in > a truly minimal way. > > Best, > Marton > > On Mon, Jun 21, 2021 at 4:18 PM Gabor Somogyi > wrote: > > > Hi All, > > > > We see that adding any kind of specific authentication raises more > > questions than answers. > > What would be if a generic API would be added without any real > > authentication logic? > > That way every provider can add its own protocol implementation as > > additional jar. 
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi team,

Thank you for your input. Based on this discussion I agree with G that
selecting and standardizing on a specific strong authentication mechanism is
more challenging than the whole rest of the scope of this authentication
story. :-) I suggest that G and I go back to the drawing board and come up
with an API that can support multiple authentication mechanisms, and we would
only merge said API into Flink. Specific implementations of it can be
maintained outside of the project. This way we tackle the main challenge in a
truly minimal way.

Best,
Marton

On Mon, Jun 21, 2021 at 4:18 PM Gabor Somogyi wrote:

> Hi All,
>
> We see that adding any kind of specific authentication raises more
> questions than answers.
> What if a generic API were added without any real authentication logic?
> That way every provider could add its own protocol implementation as an
> additional jar.
>
> BR,
> G
>
> On Thu, Jun 17, 2021 at 7:53 PM Austin Cawley-Edwards
> <austin.caw...@gmail.com> wrote:
>
>> Hi all,
>>
>> Sorry to be joining the conversation late. I'm also on the side of
>> Konstantin, generally, in that this seems to not be a core goal of Flink
>> as a project and adds a maintenance burden.
>>
>> Would another con of Kerberos be that it is likely a fading project in
>> terms of network security? (serious question, please correct me if there
>> is reason to believe it is gaining adoption)
>>
>> The point about Kerberos being independent of infrastructure is a good one
>> but is something that is also solved by modern sidecar proxies + service
>> meshes that can run across Kubernetes and bare-metal. These solutions also
>> handle certificate provisioning, rotation, etc. in addition to
>> higher-level authorization policies. Some examples of projects with this
>> "universal infrastructure support" are Kuma[1] (CNCF Sandbox, I'm a
>> maintainer) and Istio[2] (Google).
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi All,

We see that adding any kind of specific authentication raises more questions
than answers.
What if a generic API were added without any real authentication logic?
That way every provider could add its own protocol implementation as an
additional jar.

BR,
G

On Thu, Jun 17, 2021 at 7:53 PM Austin Cawley-Edwards
<austin.caw...@gmail.com> wrote:

> Hi all,
>
> Sorry to be joining the conversation late. I'm also on the side of
> Konstantin, generally, in that this seems to not be a core goal of Flink as
> a project and adds a maintenance burden.
>
> Would another con of Kerberos be that it is likely a fading project in
> terms of network security? (serious question, please correct me if there is
> reason to believe it is gaining adoption)
>
> The point about Kerberos being independent of infrastructure is a good one
> but is something that is also solved by modern sidecar proxies + service
> meshes that can run across Kubernetes and bare-metal. These solutions also
> handle certificate provisioning, rotation, etc. in addition to higher-level
> authorization policies. Some examples of projects with this "universal
> infrastructure support" are Kuma[1] (CNCF Sandbox, I'm a maintainer) and
> Istio[2] (Google).
>
> Wondering out loud: has anyone tried to run Flink on top of Cilium[3],
> which also provides zero-trust networking at the kernel level without
> needing to instrument applications? This currently only runs on Kubernetes
> on Linux, so that's a major limitation, but it solves many of the request
> forging concerns at all levels.
>
> Thanks,
> Austin
>
> [1]: https://kuma.io/docs/1.1.6/quickstart/universal/
> [2]: https://istio.io/latest/docs/setup/install/virtual-machine/
> [3]: https://cilium.io/
>
> On Thu, Jun 17, 2021 at 1:50 PM Till Rohrmann wrote:
>
> > I left some comments in the Google document. It would be great if
> > someone from the community with security experience could also take a
> > look at it.
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi all,

Sorry to be joining the conversation late. I'm also on the side of
Konstantin, generally, in that this seems to not be a core goal of Flink as a
project and adds a maintenance burden.

Would another con of Kerberos be that it is likely a fading project in terms
of network security? (serious question, please correct me if there is reason
to believe it is gaining adoption)

The point about Kerberos being independent of infrastructure is a good one
but is something that is also solved by modern sidecar proxies + service
meshes that can run across Kubernetes and bare-metal. These solutions also
handle certificate provisioning, rotation, etc. in addition to higher-level
authorization policies. Some examples of projects with this "universal
infrastructure support" are Kuma[1] (CNCF Sandbox, I'm a maintainer) and
Istio[2] (Google).

Wondering out loud: has anyone tried to run Flink on top of Cilium[3], which
also provides zero-trust networking at the kernel level without needing to
instrument applications? This currently only runs on Kubernetes on Linux, so
that's a major limitation, but it solves many of the request forging concerns
at all levels.

Thanks,
Austin

[1]: https://kuma.io/docs/1.1.6/quickstart/universal/
[2]: https://istio.io/latest/docs/setup/install/virtual-machine/
[3]: https://cilium.io/

On Thu, Jun 17, 2021 at 1:50 PM Till Rohrmann wrote:

> I left some comments in the Google document. It would be great if
> someone from the community with security experience could also take a look
> at it. Maybe Eron you have an opinion on the topic.
>
> Cheers,
> Till
>
> On Thu, Jun 17, 2021 at 6:57 PM Till Rohrmann wrote:
>
> > Hi Gabor,
> >
> > I haven't found time to look into the updated FLIP yet. I'll try to do it
> > asap.
Re: [DISCUSS] Dashboard/HistoryServer authentication
I left some comments in the Google document. It would be great if someone
from the community with security experience could also take a look at it.
Maybe Eron you have an opinion on the topic.

Cheers,
Till

On Thu, Jun 17, 2021 at 6:57 PM Till Rohrmann wrote:

> Hi Gabor,
>
> I haven't found time to look into the updated FLIP yet. I'll try to do it
> asap.
>
> Cheers,
> Till
>
> On Wed, Jun 16, 2021 at 9:35 PM Konstantin Knauf wrote:
>
>> Hi Gabor,
>>
>> > However, representing Kerberos as a completely new feature is not
>> > accurate, because it is already in: Flink authenticates at least with
>> > HDFS and HBase through Kerberos.
>>
>> True, that is one way to look at it, but there are differences, too:
>> Control Plane vs Data Plane, Core vs Connectors.
>>
>> > Adding OIDC or OAuth2 comes with the exact same concerns you have just
>> > raised. Why exactly these? If you think this would be beneficial we can
>> > discuss it in detail
>>
>> That's exactly my point. Once we start adding authx support, we will
>> sooner or later discuss other options besides Kerberos, too. A user who
>> would like to use OAuth cannot easily use Kerberos, right?
>> That is one of the reasons I am skeptical about adding initial authx
>> support.
>>
>> > Regarding authorization, you've mentioned it can become complicated over
>> > time. Can you show us an example? We have experience with a couple of
>> > open source components, but authorization was never a horribly complex
>> > story. I personally have the most experience with Spark, which I think
>> > is quite simple and stable. Users can be viewers/admins and jobs started
>> > by others can't be modified. If you can share an example of
>> > over-complication, we can discuss based on facts.
>>
>> Authorization is a new aspect that needs to be considered for every
>> addition to the REST API. In the future users might ask for additional
>> roles (e.g. an editor), user-defined roles, and you've already mentioned
>> job-level permissions yourself.
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Gabor,

I haven't found time to look into the updated FLIP yet. I'll try to do it
asap.

Cheers,
Till

On Wed, Jun 16, 2021 at 9:35 PM Konstantin Knauf wrote:

> Hi Gabor,
>
> > However, representing Kerberos as a completely new feature is not
> > accurate, because it is already in: Flink authenticates at least with
> > HDFS and HBase through Kerberos.
>
> True, that is one way to look at it, but there are differences, too:
> Control Plane vs Data Plane, Core vs Connectors.
>
> > Adding OIDC or OAuth2 comes with the exact same concerns you have just
> > raised. Why exactly these? If you think this would be beneficial we can
> > discuss it in detail
>
> That's exactly my point. Once we start adding authx support, we will
> sooner or later discuss other options besides Kerberos, too. A user who
> would like to use OAuth cannot easily use Kerberos, right?
> That is one of the reasons I am skeptical about adding initial authx
> support.
>
> > Regarding authorization, you've mentioned it can become complicated over
> > time. Can you show us an example? We have experience with a couple of
> > open source components, but authorization was never a horribly complex
> > story. I personally have the most experience with Spark, which I think
> > is quite simple and stable. Users can be viewers/admins and jobs started
> > by others can't be modified. If you can share an example of
> > over-complication, we can discuss based on facts.
>
> Authorization is a new aspect that needs to be considered for every
> addition to the REST API. In the future users might ask for additional
> roles (e.g. an editor), user-defined roles, and you've already mentioned
> job-level permissions yourself. And keep in mind that there might also be
> larger additions in the future like the flink-sql-gateway. Contributions
> like this become more expensive the more aspects we need to consider.
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Gabor,

> However, representing Kerberos as a completely new feature is not accurate,
> because it is already in: Flink authenticates at least with HDFS and HBase
> through Kerberos.

True, that is one way to look at it, but there are differences, too: Control
Plane vs Data Plane, Core vs Connectors.

> Adding OIDC or OAuth2 comes with the exact same concerns you have just
> raised. Why exactly these? If you think this would be beneficial we can
> discuss it in detail

That's exactly my point. Once we start adding authx support, we will sooner
or later discuss other options besides Kerberos, too. A user who would like
to use OAuth cannot easily use Kerberos, right? That is one of the reasons I
am skeptical about adding initial authx support.

> Regarding authorization, you've mentioned it can become complicated over
> time. Can you show us an example? We have experience with a couple of open
> source components, but authorization was never a horribly complex story. I
> personally have the most experience with Spark, which I think is quite
> simple and stable. Users can be viewers/admins and jobs started by others
> can't be modified. If you can share an example of over-complication, we can
> discuss based on facts.

Authorization is a new aspect that needs to be considered for every addition
to the REST API. In the future users might ask for additional roles (e.g. an
editor), user-defined roles, and you've already mentioned job-level
permissions yourself. And keep in mind that there might also be larger
additions in the future like the flink-sql-gateway. Contributions like this
become more expensive the more aspects we need to consider.

In general, I believe it is important that the community focuses its efforts
where we can generate the most value to the user and, personally, I don't
think there is much to gain by extending Flink's scope in that direction. Of
course, this is not black and white and there are other valid opinions.
Thanks,

Konstantin
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Konstantin,

Thanks for the response. Regarding the introduction of a new feature: in the
case of Basic auth I tend to agree; anything else can be chosen.

However, representing Kerberos as a completely new feature is not accurate,
because it is already in: Flink authenticates at least with HDFS and HBase
through Kerberos.
The main problem with the current Kerberos implementation is that it contains
several bugs and is only partially implemented. Following your suggestion,
can we agree to skip the Basic auth implementation and finish the already
started Kerberos story by adding History Server and Job Dashboard
authentication?

Adding OIDC or OAuth2 comes with the exact same concerns you have just
raised. Why exactly these? If you think this would be beneficial we can
discuss it in detail, but as a side story it would be good to finish a
halfway done Kerberos story.

Regarding authorization, you've mentioned it can become complicated over
time. Can you show us an example? We have experience with a couple of open
source components, but authorization was never a horribly complex story. I
personally have the most experience with Spark, which I think is quite simple
and stable. Users can be viewers/admins and jobs started by others can't be
modified. If you can share an example of over-complication, we can discuss
based on facts.

Thank you in advance!

BR,
G

On Wed, Jun 16, 2021 at 5:42 PM Konstantin Knauf wrote:

> Hi everyone,
>
> sorry for joining late and thanks for the insightful discussion.
>
> In general, I'd personally prefer not to increase the surface area of
> Apache Flink unless there is a good reason. It seems we all agree that
> authx is not part of the core value proposition of Apache Flink, so if we
> can delegate this problem to a more specialized tool, I am in favor of
> that. Apache Flink is already huge and a lot of work goes into maintenance,
> so I personally have become more sensitive to this aspect over time.
> > If we add support for Basic Auth and Kerberos now, users will sooner or > later ask for OIDC, LDAP, SAML,... I acknowledge that Kerberos is widely > used in the corporate, on-premises context, but isn't the focus moving more > towards more web-friendly standards like OIDC/OAuth 2.0? If we only want to > support a single protocol, there is an argument to be made that it should > be OIDC and Dex [1,2] as a bridge to everything else. Have OIDC or OAuth2 > been considered instead of Kerberos? How do you see the market moving? But > as I said before, in my opinion we can generate more value by investing > into other areas of Apache Flink. > > Authorization also has the potential to become more fine-grained and > complex over time: you already mentioned restricting the actions that a > specific user can do in a cluster. > > Cheers, > > Konstantin > > [1] https://github.com/dexidp/dex > [2] https://github.com/dexidp/dex/issues/1903 > > > On Wed, Jun 16, 2021 at 11:44 AM Gabor Somogyi > wrote: > >> Hi Till, >> >> Did you have the chance to take a look at the doc? Not yet seen any >> update. >> >> BR, >> G >> >> >> On Wed, Jun 9, 2021 at 1:43 PM Till Rohrmann >> wrote: >> >> > Thanks for the update Gabor. I'll take a look and respond in the >> document. >> > >> > Cheers, >> > Till >> > >> > On Wed, Jun 9, 2021 at 12:59 PM Gabor Somogyi < >> gabor.g.somo...@gmail.com> >> > wrote: >> > >> >> Hi Till, >> >> >> >> Your proxy suggestion has been considered in-depth and updated the FLIP >> >> accordingly. >> >> We've considered 2 proxy implementation (Nginx and Squid) but according >> >> to our analysis and testing it's not suitable for the mentioned >> use-cases. >> >> Please take a look at the rejected alternatives for detailed >> explanation. >> >> >> >> Thanks for your time in advance! 
>> >> >> >> BR, >> >> G >> >> >> >> >> >> On Fri, Jun 4, 2021 at 3:31 PM Till Rohrmann >> >> wrote: >> >> >> >>> As I've said I am not a security expert and that's why I have to ask >> for >> >>> clarification, Gabor. You are saying that if we configure a >> truststore for >> >>> the REST endpoint with a single trusted certificate which has been >> >>> generated by the operator of the Flink cluster, then the attacker can >> >>> generate a new certificate, sign it and then talk to the Flink >> cluster if >> >>> he has access to the node on which the REST endpoint runs? My >> understanding >> >>> was that you need the corresponding private key which in my proposed >> setup >> >>> would be under the control of the operator as well (e.g. stored in a >> >>> keystore on the same machine but guarded by some secret). That way >> (if I am >> >>> not mistaken), only the entity which has access to the keystore is >> able to >> >>> talk to the Flink cluster. >> >>> >> >>> Maybe we are also getting our wires crossed here and are talking about >> >>> different things. >> >>> >> >>> Thanks for listing the pros and cons of Kerberos. Concerning what >> other >> >>> authentication mechanisms are used in the industry, I am not 100% >> sure. >> >>> >> >>>
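As an illustration of the viewer/admin model Gabor mentions for Spark in this message, here is a minimal Java sketch. It is not Flink or Spark code: the class `DashboardAuthzSketch`, the `Role` enum and both methods are hypothetical names, and the rule that an admin may modify any job is an assumption (the message only says that jobs started by others can't be modified).

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of a viewer/admin model; not taken from Flink or Spark. */
final class DashboardAuthzSketch {
    enum Role { VIEWER, ADMIN }

    /** Any user who holds at least one role may look at jobs. */
    static boolean canView(Set<Role> roles) {
        return !roles.isEmpty();
    }

    /**
     * Assumption: admins may modify (e.g. cancel) any job, while everyone
     * else may only modify the jobs they started themselves.
     */
    static boolean canModify(String user, Set<Role> roles, String jobOwner) {
        if (!canView(roles)) {
            return false;
        }
        return roles.contains(Role.ADMIN) || user.equals(jobOwner);
    }

    public static void main(String[] args) {
        Set<Role> viewer = EnumSet.of(Role.VIEWER);
        Set<Role> admin = EnumSet.of(Role.ADMIN);

        // The job in this example was started by "alice".
        System.out.println("alice: " + canModify("alice", viewer, "alice"));
        System.out.println("bob:   " + canModify("bob", viewer, "alice"));
        System.out.println("root:  " + canModify("root", admin, "alice"));
    }
}
```

Whether a model this small stays sufficient is exactly the open question between Gabor and Konstantin; the sketch only shows how little code the simple variant needs.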
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi everyone,

sorry for joining late and thanks for the insightful discussion.

In general, I'd personally prefer not to increase the surface area of Apache Flink unless there is a good reason. It seems we all agree that authx is not part of the core value proposition of Apache Flink, so if we can delegate this problem to a more specialized tool, I am in favor of that. Apache Flink is already huge and a lot of work goes into maintenance, so I personally have become more sensitive to this aspect over time.

If we add support for Basic Auth and Kerberos now, users will sooner or later ask for OIDC, LDAP, SAML,... I acknowledge that Kerberos is widely used in the corporate, on-premises context, but isn't the focus moving more towards more web-friendly standards like OIDC/OAuth 2.0? If we only want to support a single protocol, there is an argument to be made that it should be OIDC and Dex [1,2] as a bridge to everything else. Have OIDC or OAuth2 been considered instead of Kerberos? How do you see the market moving? But as I said before, in my opinion we can generate more value by investing into other areas of Apache Flink.

Authorization also has the potential to become more fine-grained and complex over time: you already mentioned restricting the actions that a specific user can do in a cluster.

Cheers,

Konstantin

[1] https://github.com/dexidp/dex
[2] https://github.com/dexidp/dex/issues/1903

On Wed, Jun 16, 2021 at 11:44 AM Gabor Somogyi wrote:

> Hi Till,
>
> Did you have the chance to take a look at the doc? Not yet seen any update.
>
> BR,
> G
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Till,

Did you have the chance to take a look at the doc? Not yet seen any update.

BR,
G

On Wed, Jun 9, 2021 at 1:43 PM Till Rohrmann wrote:

> Thanks for the update Gabor. I'll take a look and respond in the document.
>
> Cheers,
> Till
Re: [DISCUSS] Dashboard/HistoryServer authentication
Thanks for the update Gabor. I'll take a look and respond in the document.

Cheers,
Till

On Wed, Jun 9, 2021 at 12:59 PM Gabor Somogyi wrote:

> Hi Till,
>
> Your proxy suggestion has been considered in depth, and the FLIP has been updated accordingly. We've considered 2 proxy implementations (Nginx and Squid), but according to our analysis and testing they are not suitable for the mentioned use-cases. Please take a look at the rejected alternatives for a detailed explanation.
>
> Thanks for your time in advance!
>
> BR,
> G
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Till,

Your proxy suggestion has been considered in depth, and the FLIP has been updated accordingly. We've considered 2 proxy implementations (Nginx and Squid), but according to our analysis and testing they are not suitable for the mentioned use-cases. Please take a look at the rejected alternatives for a detailed explanation.

Thanks for your time in advance!

BR,
G

On Fri, Jun 4, 2021 at 3:31 PM Till Rohrmann wrote:

> As I've said I am not a security expert and that's why I have to ask for clarification, Gabor. You are saying that if we configure a truststore for the REST endpoint with a single trusted certificate which has been generated by the operator of the Flink cluster, then the attacker can generate a new certificate, sign it and then talk to the Flink cluster if he has access to the node on which the REST endpoint runs? My understanding was that you need the corresponding private key which in my proposed setup would be under the control of the operator as well (e.g. stored in a keystore on the same machine but guarded by some secret). That way (if I am not mistaken), only the entity which has access to the keystore is able to talk to the Flink cluster.
>
> Maybe we are also getting our wires crossed here and are talking about different things.
>
> Thanks for listing the pros and cons of Kerberos. Concerning what other authentication mechanisms are used in the industry, I am not 100% sure.
>
> Cheers,
> Till
Re: [DISCUSS] Dashboard/HistoryServer authentication
As I've said I am not a security expert and that's why I have to ask for clarification, Gabor. You are saying that if we configure a truststore for the REST endpoint with a single trusted certificate which has been generated by the operator of the Flink cluster, then the attacker can generate a new certificate, sign it and then talk to the Flink cluster if he has access to the node on which the REST endpoint runs? My understanding was that you need the corresponding private key which in my proposed setup would be under the control of the operator as well (e.g. stored in a keystore on the same machine but guarded by some secret). That way (if I am not mistaken), only the entity which has access to the keystore is able to talk to the Flink cluster.

Maybe we are also getting our wires crossed here and are talking about different things.

Thanks for listing the pros and cons of Kerberos. Concerning what other authentication mechanisms are used in the industry, I am not 100% sure.

Cheers,
Till

On Fri, Jun 4, 2021 at 11:09 AM Gabor Somogyi wrote:

> > I did not mean for the user to sign its own certificates but for the operator of the cluster. Once the user request hits the proxy, it should no longer be under his control. I think I do not fully understand yet why this would not work.
>
> I said it's not solving the authentication problem over any proxy. Even if the operator is signing the certificate, one can have access to an internal node. In such a case anybody can craft certificates which are accepted by the server. Once they are accepted, a bad guy can cancel jobs, causing huge impact.
>
> > Also, I am missing a bit the comparison of Kerberos to other authentication mechanisms and why they were rejected in favour of Kerberos.
>
> PROS:
> * It does not depend on the cloud provider and/or on a k8s, bare-metal etc. deployment; this is the biggest plus
> * Centralized, comes with tools, and there is no need to write tons of tools around it
> * There are clients/tools for almost all OSes and in several languages
> * Very large users have been using it for years in production w/o huge issues
> * Provides cross-realm trust, amongst other features
> * Several open source components are using it, which could increase compatibility
>
> CONS:
> * Not everybody is using Kerberos
> * It would increase the code footprint, but this is true for many features (as a side note I'm here to maintain it)
>
> Feel free to add your points because this only represents a single viewpoint. Also, if you have any better option for strong authentication, please share it and we can consider the pros/cons here.
>
> BR,
> G
Re: [DISCUSS] Dashboard/HistoryServer authentication
> I did not mean for the user to sign its own certificates but for the operator of the cluster. Once the user request hits the proxy, it should no longer be under his control. I think I do not fully understand yet why this would not work.

I said it's not solving the authentication problem over any proxy. Even if the operator is signing the certificate, one can have access to an internal node. In such a case anybody can craft certificates which are accepted by the server. Once they are accepted, a bad guy can cancel jobs, causing huge impact.

> Also, I am missing a bit the comparison of Kerberos to other authentication mechanisms and why they were rejected in favour of Kerberos.

PROS:
* It does not depend on the cloud provider and/or on a k8s, bare-metal etc. deployment; this is the biggest plus
* Centralized, comes with tools, and there is no need to write tons of tools around it
* There are clients/tools for almost all OSes and in several languages
* Very large users have been using it for years in production w/o huge issues
* Provides cross-realm trust, amongst other features
* Several open source components are using it, which could increase compatibility

CONS:
* Not everybody is using Kerberos
* It would increase the code footprint, but this is true for many features (as a side note I'm here to maintain it)

Feel free to add your points because this only represents a single viewpoint. Also, if you have any better option for strong authentication, please share it and we can consider the pros/cons here.

BR,
G

On Fri, Jun 4, 2021 at 10:32 AM Till Rohrmann wrote:

> I did not mean for the user to sign its own certificates but for the operator of the cluster. Once the user request hits the proxy, it should no longer be under his control. I think I do not fully understand yet why this would not work.
>
> What I would like to avoid is to add more complexity into Flink if there is an easy solution which fulfills the requirements.
> That's why I would like to exercise thoroughly through the different alternatives. Also, I am missing a bit the comparison of Kerberos to other authentication mechanisms and why they were rejected in favour of Kerberos.
>
> Cheers,
> Till
>
> On Fri, Jun 4, 2021 at 10:26 AM Gyula Fóra wrote:
>
>> Hi!
>>
>> I think there might be possible alternatives but it seems Kerberos on the rest endpoint ticks all the right boxes and provides a super clean and simple solution for strong authentication.
>>
>> I wouldn’t even consider sidecar proxies etc if we can solve it in such a simple way as proposed by G.
>>
>> Cheers
>> Gyula
>>
>> On Fri, 4 Jun 2021 at 10:03, Till Rohrmann wrote:
>>
>>> I am not saying that we shouldn't add a strong authentication mechanism if there are good reasons for it. I primarily would like to understand the context a bit better in order to give qualified feedback and come to a good decision. In order to do this, I have the feeling that we haven't fully considered all available options which are on the table, tbh.
>>>
>>> Does the problem of certificate expiry also apply for self-signed certificates? If yes, then this should then also be a problem for the internal encryption of Flink's communication. If not, then one could use self-signed certificates with a longer validity to solve the mentioned issue.
>>>
>>> I think you can set up Flink in such a way that you don't have to handle all the different certificates. For example, you could deploy Flink with a "sidecar proxy" which is responsible for the authentication using an arbitrary method (e.g. Kerberos) and then bind the REST endpoint to a local network interface. That way, the REST endpoint would only be available through the sidecar proxy. Additionally, one could enable SSL for this communication. Would this be a solution for the problem?
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, Jun 3, 2021 at 10:46 PM Márton Balassi wrote:
>>>
>>>> That is an interesting idea, Till. The main issue with it is that TLS certificates have an expiration time, usually they get approved for a couple years. Forcing our users to restart jobs to reprovision TLS certificates would be weird when we could just implement a single proper strong authentication mechanism instead in a couple hundred lines of code. :-)
>>>>
>>>> In many cases it is also impractical to go the TLS mutual route, because the Flink Dashboard can end up on any node in the k8s/Yarn cluster which means that we need a certificate per node (due to the mutual auth), but if we also want to protect the private key of these from users accidentally or intentionally leaking them then we need this per user. As in we end up managing user*machine number certificates and having to renew them periodically, which albeit automatable is unfortunately not yet automated in all large organizations.
>>>>
>>>> I fully agree that TLS certificate mutual authentication has its nice
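Till's sidecar suggestion quoted in this message depends on binding the REST endpoint to a local network interface so that only a process on the same host (the proxy) can reach it. A minimal plain-JDK sketch of that binding step follows; it is not Flink code, and the class `LoopbackBindSketch` is a hypothetical name. In Flink itself this would presumably be a matter of configuration (the `rest.bind-address` option) rather than code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

/** Hypothetical illustration of an endpoint that listens on the loopback interface only. */
final class LoopbackBindSketch {

    /** Opens a listener on an ephemeral port that is bound to the loopback address. */
    static ServerSocket bindLocalOnly() {
        try {
            // Port 0 lets the OS pick a free port; 50 is the accept backlog.
            return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket endpoint = bindLocalOnly()) {
            // A sidecar proxy on the same host would forward authenticated
            // requests to this address; other hosts cannot connect to it.
            System.out.println("listening on " + endpoint.getLocalSocketAddress());
        }
    }
}
```

Note that this only restricts where connections can come from. Any local process can still connect, which is close to the internal-node concern Gabor raises elsewhere in the thread.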
Re: [DISCUSS] Dashboard/HistoryServer authentication
I did not mean for the user to sign their own certificates, but for the operator of the cluster to do so. Once the user request hits the proxy, it should no longer be under the user's control. I think I do not fully understand yet why this would not work.

What I would like to avoid is adding more complexity to Flink if there is an easy solution that fulfills the requirements. That's why I would like to go thoroughly through the different alternatives. Also, I am missing a comparison of Kerberos with other authentication mechanisms and an explanation of why they were rejected in favour of Kerberos.

Cheers,
Till
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi!

I think there might be possible alternatives, but it seems Kerberos on the REST endpoint ticks all the right boxes and provides a super clean and simple solution for strong authentication.

I wouldn’t even consider sidecar proxies etc. if we can solve it in such a simple way as proposed by G.

Cheers
Gyula
Re: [DISCUSS] Dashboard/HistoryServer authentication
Till, thanks for investing time in giving further options.
Marci, thanks for summarizing the use-case point of view.

We've arrived back at one of the original problems: if an attacker gets access to a node, it's possible to cancel other users' jobs (and more can be done). A self-signed certificate is almost no-op authentication in production environments because any user can sign their own certificate and no third party is involved. This problem just can't be solved with SSL, no matter from which point of view we consider it.

BR,
G
Re: [DISCUSS] Dashboard/HistoryServer authentication
I am not saying that we shouldn't add a strong authentication mechanism if there are good reasons for it. I primarily would like to understand the context a bit better in order to give qualified feedback and come to a good decision. To be honest, I have the feeling that we haven't fully considered all the options that are on the table.

Does the problem of certificate expiry also apply to self-signed certificates? If yes, then this should also be a problem for the internal encryption of Flink's communication. If not, then one could use self-signed certificates with a longer validity to solve the mentioned issue.

I think you can set up Flink in such a way that you don't have to handle all the different certificates. For example, you could deploy Flink with a "sidecar proxy" which is responsible for the authentication using an arbitrary method (e.g. Kerberos) and then bind the REST endpoint to a local network interface. That way, the REST endpoint would only be available through the sidecar proxy. Additionally, one could enable SSL for this communication. Would this be a solution for the problem?

Cheers,
Till
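A minimal sketch of the Flink side of the sidecar setup discussed in this message: bind the REST endpoint to the loopback interface and require mutual TLS on the hop from the proxy. The keys `rest.bind-address`, `security.ssl.rest.enabled` and `security.ssl.rest.authentication-enabled` are existing Flink options; the helper names, the `127.0.0.1` value and the assumption that the proxy runs on the same host are illustrative only and were not stated in the thread. The proxy itself, and whichever authentication method it uses, is not shown.

```python
def sidecar_rest_config(bind_address: str = "127.0.0.1") -> dict[str, str]:
    """Hypothetical helper: Flink settings for a REST endpoint that is
    reachable only from a proxy running on the same host."""
    return {
        # Listen on the local interface only, so external clients have
        # to go through the sidecar proxy.
        "rest.bind-address": bind_address,
        # Encrypt the proxy -> REST endpoint hop ...
        "security.ssl.rest.enabled": "true",
        # ... and require the proxy to present a client certificate.
        "security.ssl.rest.authentication-enabled": "true",
    }


def render_flink_conf(entries: dict[str, str]) -> str:
    """Render entries as 'key: value' lines, as used in flink-conf.yaml."""
    return "\n".join(f"{key}: {value}" for key, value in entries.items())


print(render_flink_conf(sidecar_rest_config()))
```

Keystore and truststore settings are left out on purpose; they depend on how the operator provisions certificates, which is exactly the point under discussion.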
Re: [DISCUSS] Dashboard/HistoryServer authentication
That is an interesting idea, Till.

The main issue with it is that TLS certificates have an expiration time; usually they get approved for a couple of years. Forcing our users to restart jobs to reprovision TLS certificates would be weird when we could just implement a single proper strong authentication mechanism instead in a couple hundred lines of code. :-)

In many cases it is also impractical to go the mutual TLS route, because the Flink Dashboard can end up on any node in the k8s/Yarn cluster, which means that we need a certificate per node (due to the mutual auth). If we also want to protect the private keys of these certificates from users accidentally or intentionally leaking them, then we need one per user as well. We end up managing (number of users) × (number of machines) certificates and having to renew them periodically, which, albeit automatable, is unfortunately not yet automated in all large organizations.

I fully agree that TLS certificate mutual authentication has its nice properties, especially at very large (multiple thousand node) clusters, but it has its own challenges too. Thanks for bringing it up.

Happy to have this added to the rejected alternatives list so that we have the full picture documented.
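To put a number on the "user*machine" certificate count mentioned in this message, here is a small worked example. The figures (200 users, 1000 nodes, two-year validity) are made up for illustration, and the renewal estimate assumes expiry dates are spread evenly over the validity period; neither comes from the thread.

```python
import math


def certificates_to_manage(users: int, machines: int) -> int:
    """One certificate per (user, machine) pair, as argued in the message."""
    return users * machines


def renewals_per_year(certificates: int, validity_years: float) -> int:
    """Average yearly renewals, assuming evenly spread expiry dates."""
    return math.ceil(certificates / validity_years)


users, machines = 200, 1000  # hypothetical cluster
total = certificates_to_manage(users, machines)
print(total)                                       # 200000
print(renewals_per_year(total, validity_years=2))  # 100000
```

Even with these modest assumptions the renewal load is large enough that it would have to be automated, which matches the concern raised about organizations where that automation does not exist yet.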
Re: [DISCUSS] Dashboard/HistoryServer authentication
I guess the idea would then be to let the proxy do the authentication job and only forward the request via a mutually authenticated SSL connection to the Flink cluster. Would this be possible? The beauty of this setup is, in my opinion, that it should work with all kinds of authentication mechanisms.

Cheers,
Till
Re: [DISCUSS] Dashboard/HistoryServer authentication
Thanks for suggesting options to fulfil the need.

Users are looking for a solution where users can be identified across the whole cluster and access to resources/actions can be restricted. A good example of such an action is cancelling other users' running jobs.

* SSL does provide mutual authentication, but once authentication has passed, no user-based restrictions can be made.
* The less problematic part is that generating and maintaining short-lived certificates would be hard (that is the reason KDC-like servers exist). Long-lived certificates would widen the attack surface, but since the first concern already applies this is just a cosmetic issue.

All in all, using TLS certificates is unfortunately not sufficient in these environments.

BR,
G

On Thu, Jun 3, 2021 at 12:49 PM Till Rohrmann wrote:

> Thanks for the information, Gabor. If it is about securing the communication between the REST client and the REST server, then Flink already supports enabling mutual SSL authentication [1]. Would this be enough to secure the communication and to pass an audit?
>
> [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/security/security-ssl/#external--rest-connectivity
>
> Cheers,
> Till
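To make the per-user restriction in the reply above concrete (for example, preventing one user from cancelling another user's job), here is a minimal Java sketch. It assumes the caller has already been authenticated and that job ownership is tracked somewhere; `JobAccessPolicy`, `mayCancel` and the job and user names are hypothetical and are not part of Flink or of the proposal.

```java
import java.util.Map;

// Hypothetical illustration only: these names do not exist in Flink.
final class JobAccessPolicy {
    // Maps a job id to the name of the user who submitted the job.
    private final Map<String, String> jobOwners;

    JobAccessPolicy(Map<String, String> jobOwners) {
        this.jobOwners = Map.copyOf(jobOwners);
    }

    // Returns true only if the authenticated user owns the job.
    boolean mayCancel(String authenticatedUser, String jobId) {
        if (authenticatedUser == null || jobId == null) {
            return false; // no identity or no job id: deny
        }
        // Unknown jobs map to null, so equals() yields false and access is denied.
        return authenticatedUser.equals(jobOwners.get(jobId));
    }

    public static void main(String[] args) {
        JobAccessPolicy policy =
                new JobAccessPolicy(Map.of("job-1", "alice", "job-2", "bob"));
        System.out.println(policy.mayCancel("alice", "job-1")); // alice owns job-1
        System.out.println(policy.mayCancel("alice", "job-2")); // job-2 belongs to bob
    }
}
```

A check like this needs a user identity to work with, which is the piece that certificate-only authentication does not supply according to the reply above.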
Re: [DISCUSS] Dashboard/HistoryServer authentication
Thanks for the information, Gabor. If it is about securing the communication between the REST client and the REST server, then Flink already supports enabling mutual SSL authentication [1]. Would this be enough to secure the communication and to pass an audit?

[1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/security/security-ssl/#external--rest-connectivity

Cheers,
Till

On Thu, Jun 3, 2021 at 10:33 AM Gabor Somogyi wrote:

> Hi Till,
>
> Since I have been working in the security area for 10+ years, let me share my thoughts. I would like to emphasise that there are experts better than me, but I know the basics. The discussion is open and I am not trying to dictate things alone...
>
> > I mean if an attacker can get access to one of the machines, then it should also be possible to obtain the right Kerberos token.
>
> Not necessarily. For example, if one gets access to a specific user's credentials, it is still not possible to compromise other users' jobs, data, etc. Security is like an onion: the more layers have been added, the more time an attacker needs to proceed. At the end of the day, if someone is in, they can most probably find a way, but this time is normally enough for sysadmins or security experts to shut down the system and minimize the damage.
>
> The other thing is that all tokens have a timeout, and if the token is no longer valid then the attacker can't proceed further.
>
> > Is Kerberos also the standard authentication protocol for Kubernetes deployments?
>
> Kerberos is an industry standard which is cloud/deployment agnostic, and it can be used in any deployment, including k8s. The main intention is to use Kerberos in k8s deployments too, since we're going in this direction as well. Please see how Spark does this:
>
> https://spark.apache.org/docs/latest/security.html#secure-interaction-with-kubernetes
>
> Last but not least, the most important reason to add at least one strong authentication mechanism is that we have users who have hard requirements on this. They're doing security audits, and if they fail then it's a deal breaker. That is why we added Kerberos in the first place. Unfortunately we can't name them on this public list; however, the customers who specifically asked for this were mainly in the banking and telco sectors.
>
> BR,
> G
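As background to the question about mutual SSL authentication: at the JDK level, mutual authentication means the server side of the TLS handshake requires a client certificate. The sketch below shows only that one switch, using the standard `javax.net.ssl` API. It is plain JDK code, not Flink's implementation, and `MutualTlsSketch` is a hypothetical name. In Flink itself this behaviour is driven by configuration (the linked page [1] documents options such as `security.ssl.rest.enabled` and `security.ssl.rest.authentication-enabled`).

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical illustration only: plain JDK code, not Flink code.
final class MutualTlsSketch {
    // Creates a server-side TLS engine that requires a client certificate.
    static SSLEngine serverEngineRequiringClientCert(SSLContext context) {
        SSLEngine engine = context.createSSLEngine();
        engine.setUseClientMode(false); // act as the server side of the handshake
        engine.setNeedClientAuth(true); // handshake stops if the client sends no certificate
        return engine;
    }

    // Convenience wrapper using the JVM's default SSLContext (an assumption for this demo).
    static SSLEngine defaultServerEngine() {
        try {
            return serverEngineRequiringClientCert(SSLContext.getDefault());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
                "client certificate required: " + defaultServerEngine().getNeedClientAuth());
    }
}
```

Note that this only establishes that the peer holds a trusted certificate. It does not by itself say which user is acting.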
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Till,

Since I have been working in the security area for 10+ years, let me share my thoughts. I would like to emphasise that there are experts better than me, but I know the basics. The discussion is open and I am not trying to dictate things alone...

> I mean if an attacker can get access to one of the machines, then it should also be possible to obtain the right Kerberos token.

Not necessarily. For example, if one gets access to a specific user's credentials, it is still not possible to compromise other users' jobs, data, etc. Security is like an onion: the more layers have been added, the more time an attacker needs to proceed. At the end of the day, if someone is in, they can most probably find a way, but this time is normally enough for sysadmins or security experts to shut down the system and minimize the damage.

The other thing is that all tokens have a timeout, and if the token is no longer valid then the attacker can't proceed further.

> Is Kerberos also the standard authentication protocol for Kubernetes deployments?

Kerberos is an industry standard which is cloud/deployment agnostic, and it can be used in any deployment, including k8s. The main intention is to use Kerberos in k8s deployments too, since we're going in this direction as well. Please see how Spark does this:

https://spark.apache.org/docs/latest/security.html#secure-interaction-with-kubernetes

Last but not least, the most important reason to add at least one strong authentication mechanism is that we have users who have hard requirements on this. They're doing security audits, and if they fail then it's a deal breaker. That is why we added Kerberos in the first place. Unfortunately we can't name them on this public list; however, the customers who specifically asked for this were mainly in the banking and telco sectors.

BR,
G

On Thu, Jun 3, 2021 at 9:20 AM Till Rohrmann wrote:

> Thanks for updating the document, Márton. Why is it that banks will consider it more secure if Flink comes with Kerberos authentication (assuming a properly secured setup)? I mean if an attacker can get access to one of the machines, then it should also be possible to obtain the right Kerberos token.
>
> I am not an authentication expert, and that's why I wanted to ask: what authentication protocols are there other than Kerberos? Why did we select Kerberos and not any other authentication protocol? Maybe you can list the pros and cons of the different protocols. Is Kerberos also the standard authentication protocol for Kubernetes deployments? If not, what would be the answer when deploying on K8s?
>
> Cheers,
> Till
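The point above that all tokens have a timeout can be illustrated with a minimal Java sketch. It models a generic expiring credential, not a real Kerberos ticket; `ExpiringToken` and its members are hypothetical names, and the ten-hour lifetime is an arbitrary example value.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only: a generic expiring credential, not a Kerberos ticket.
final class ExpiringToken {
    private final String user;
    private final Instant expiresAt;

    ExpiringToken(String user, Instant issuedAt, Duration lifetime) {
        this.user = user;
        this.expiresAt = issuedAt.plus(lifetime);
    }

    String user() {
        return user;
    }

    // A token is usable only strictly before its expiry instant.
    boolean isValidAt(Instant now) {
        return now.isBefore(expiresAt);
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2021-06-03T08:00:00Z");
        ExpiringToken token = new ExpiringToken("alice", issued, Duration.ofHours(10));
        System.out.println(token.isValidAt(issued.plus(Duration.ofHours(1))));  // within lifetime
        System.out.println(token.isValidAt(issued.plus(Duration.ofHours(11)))); // after expiry
    }
}
```

A stolen token of this kind stops being useful once `isValidAt` returns false, which bounds the time window available to an attacker.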
Re: [DISCUSS] Dashboard/HistoryServer authentication
Thanks for updating the document, Márton. Why is it that banks will consider it more secure if Flink comes with Kerberos authentication (assuming a properly secured setup)? I mean if an attacker can get access to one of the machines, then it should also be possible to obtain the right Kerberos token.

I am not an authentication expert, and that's why I wanted to ask: what authentication protocols are there other than Kerberos? Why did we select Kerberos and not any other authentication protocol? Maybe you can list the pros and cons of the different protocols. Is Kerberos also the standard authentication protocol for Kubernetes deployments? If not, what would be the answer when deploying on K8s?

Cheers,
Till

On Wed, Jun 2, 2021 at 12:07 PM Gabor Somogyi wrote:

> Hi team,
>
> Happy to be here, and I hope I can provide quality additions in the future.
>
> Thank you all for the helpful suggestions! Taking them into account, the FLIP has been modified and the work continues on the already existing Jira.
>
> BR,
> G
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi team,

Happy to be here, and I hope I can provide quality additions in the future.

Thank you all for the helpful suggestions! Taking them into account, the FLIP has been modified and the work continues on the already existing Jira.

BR,
G

On Wed, Jun 2, 2021 at 11:23 AM Márton Balassi wrote:

> Thanks, Chesnay - I totally missed that. Answered on the ticket too, let us continue there then.
>
> Till, I agree that we should keep this codepath as slim as possible. It is an important design decision that we aim to keep the list of authentication protocols to a minimum. We believe that this should not be a primary concern of Flink and that a trusted proxy service (for example Apache Knox) should be used to enable a multitude of end-user authentication mechanisms. The bare minimum of authentication mechanisms to support consequently consists of a single strong authentication protocol, for which Kerberos is the enterprise solution, and HTTP Basic, primarily for development and light-weight scenarios.
>
> Added the above wording to G's doc.
>
> https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit
Re: [DISCUSS] Dashboard/HistoryServer authentication
Thanks, Chesnay - I totally missed that. Answered on the ticket too, let us continue there then.

Till, I agree that we should keep this codepath as slim as possible. It is an important design decision that we aim to keep the list of authentication protocols to a minimum. We believe that this should not be a primary concern of Flink and that a trusted proxy service (for example Apache Knox) should be used to enable a multitude of end-user authentication mechanisms. The bare minimum of authentication mechanisms to support consequently consists of a single strong authentication protocol, for which Kerberos is the enterprise solution, and HTTP Basic, primarily for development and light-weight scenarios.

Added the above wording to G's doc.

https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit

On Tue, Jun 1, 2021 at 11:47 AM Chesnay Schepler wrote:

> There's a related effort:
> https://issues.apache.org/jira/browse/FLINK-21108
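For the HTTP Basic option mentioned above, here is a minimal Java sketch of decoding an `Authorization: Basic ...` header value in the form defined by RFC 7617. It only extracts the user name and password; verifying them and wiring this into Flink's REST handlers is out of scope. `BasicAuthHeader` and `Credentials` are hypothetical names, not part of Flink or of the proposal.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;

// Hypothetical illustration only: these names do not exist in Flink.
final class BasicAuthHeader {
    static final class Credentials {
        final String user;
        final String password;

        Credentials(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    // Parses the value of an Authorization header; empty if it is not well-formed Basic auth.
    static Optional<Credentials> parse(String headerValue) {
        final String scheme = "Basic ";
        // The scheme name is matched case-insensitively.
        if (headerValue == null
                || !headerValue.regionMatches(true, 0, scheme, 0, scheme.length())) {
            return Optional.empty();
        }
        final byte[] decoded;
        try {
            decoded = Base64.getDecoder().decode(headerValue.substring(scheme.length()).trim());
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // payload is not valid Base64
        }
        String userPass = new String(decoded, StandardCharsets.UTF_8);
        // The user id cannot contain ':', so split at the first colon only.
        int colon = userPass.indexOf(':');
        if (colon < 0) {
            return Optional.empty();
        }
        return Optional.of(
                new Credentials(userPass.substring(0, colon), userPass.substring(colon + 1)));
    }

    public static void main(String[] args) {
        String header = "Basic " + Base64.getEncoder()
                .encodeToString("user:pass".getBytes(StandardCharsets.UTF_8));
        parse(header).ifPresent(c -> System.out.println(c.user));
    }
}
```

Because the credentials are only Base64-encoded and not encrypted, Basic authentication is normally combined with TLS, which fits its positioning here as an option primarily for development and light-weight scenarios.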
Re: [DISCUSS] Dashboard/HistoryServer authentication
There's a related effort:
https://issues.apache.org/jira/browse/FLINK-21108

On 6/1/2021 10:14 AM, Till Rohrmann wrote:

> Hi Gabor, welcome to the Flink community!
>
> Thanks for sharing this proposal with the community, Márton. In general, I agree that authentication is missing and that this is required for using Flink within an enterprise. The thing I am wondering is whether this feature strictly needs to be implemented inside of Flink or whether a proxy setup could do the job. Have you considered this option? If yes, then it would be good to list it under the point of rejected alternatives.
>
> I do see the benefit of implementing this feature inside of Flink if many users need it. If not, then it might be easier for the project to not increase the surface area, since it makes the overall maintenance harder.
>
> Cheers,
> Till
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi Gabor, welcome to the Flink community!

Thanks for sharing this proposal with the community, Márton. In general, I agree that authentication is missing and that this is required for using Flink within an enterprise. The thing I am wondering is whether this feature strictly needs to be implemented inside of Flink or whether a proxy setup could do the job. Have you considered this option? If yes, then it would be good to list it under the point of rejected alternatives.

I do see the benefit of implementing this feature inside of Flink if many users need it. If not, then it might be easier for the project to not increase the surface area, since it makes the overall maintenance harder.

Cheers,
Till

On Mon, May 31, 2021 at 4:57 PM Márton Balassi wrote:

> Hi team,
>
> Firstly I would like to introduce Gabor, or G [1] for short, to the community. He is a Spark committer who has recently transitioned to the Flink Engineering team at Cloudera and is looking forward to contributing to Apache Flink. Previously G primarily focused on Spark Streaming and security.
>
> Based on requests from our customers, G has implemented Kerberos and HTTP Basic Authentication for the Flink Dashboard and HistoryServer. Previously these lacked an authentication story.
>
> We are looking to contribute this functionality back to the community; we believe that given Flink's maturity there should be a common code solution for this general pattern.
>
> We are looking forward to your feedback on G's design. [2]
>
> [1] http://gaborsomogyi.com/
> [2] https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit