There has been a really strong tendency lately to push things out of Flink
and keep only the main building blocks. However, I really think that this
belongs among the minimal building blocks that Flink should provide out
of the box. It's also very likely that security topics will start getting
more attention in the near future.

One more thought, and I'm not sure this is actually a concern: if hard-coding
the delegation framework to Kerberos worries anyone, I could imagine
implementing this in a more generic fashion so that other systems that need
to distribute & renew credentials could reuse the same code path (e.g. OAuth
tokens for talking to some external APIs).
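
To sketch what I mean (all names here are made up for illustration, none of
this is existing Flink API): the framework itself would only depend on an
abstract provider, and Kerberos would be just one implementation of it:

    import org.apache.flink.configuration.Configuration;

    // Hypothetical SPI: one implementation per credential type
    // (Kerberos delegation tokens, OAuth tokens, ...).
    public interface CredentialProvider {
        // e.g. "hadoopfs", "hbase", "oauth-external-api"
        String serviceName();

        // Obtains fresh credentials as an opaque blob that Flink merely
        // distributes, plus the time at which they must be renewed.
        ObtainedCredentials obtainCredentials(Configuration conf) throws Exception;
    }

    public final class ObtainedCredentials {
        public final byte[] serializedCredentials;
        public final long renewalDeadlineMillis;

        public ObtainedCredentials(byte[] serializedCredentials, long renewalDeadlineMillis) {
            this.serializedCredentials = serializedCredentials;
            this.renewalDeadlineMillis = renewalDeadlineMillis;
        }
    }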

D.

On Fri, Feb 4, 2022 at 11:32 AM Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi Chesnay,
>
> Thanks for the proposal for the alternative mechanism. I see the conceptual
> value of separating this process from Flink but in practice I feel there
> are a few very serious limitations with that.
>
> Just a few points that come to mind:
> 1. Implementing this as independent distributed processes that communicate
> with each other requires:
>     - Secure communication channels
>     - Process discovery
>     - High availability
>     This is a huge effort to say the least, more like a separate project
> than a new feature.
> 2. Independent processes with all of the above would come with their own
> set of dependencies and configuration values for everything from
> communication to SSL settings.
> 3. Flink does not have an existing mechanism for spinning up these
> processes and managing their lifecycle. This would require a completely
> separate design.
>
> If Spark had used external processes, we would still have to design a
> process hook mechanism now, every user would have to add an extra
> (probably large) set of config options just to manage basic secure process
> communication, and it would most likely pull in its own dependency mess.
>
> I personally prefer to reuse Flink's solid secure communication channels
> and existing HA and discovery mechanisms.
> So from my side, +1 for embedding this in the existing Flink components.
>
> Kerberos is here to stay for a long time in many large production
> use-cases, and we should aim to solve this long-standing limitation in an
> elegant way.
>
> Thank you!
> Gyula
>
> On Friday, February 4, 2022, Chesnay Schepler <ches...@apache.org> wrote:
>
> > The concrete proposal would be to add a generic process startup lifecycle
> > hook (essentially a Consumer<Configuration>) that is run at the start of
> > each process (JobManager, TaskManager, HistoryServer, and maybe the CLI?).
> >
> > Everything else would be left to the implementation which would live
> > outside of Flink.
> >
> > For this specific case an implementation of this hook would (_somehow_)
> > establish a connection to the external process (that it discovered
> > _somehow_) to retrieve the delegation token, in a blocking fashion to
> > pause the startup procedure, and (presumably) schedule something into an
> > executor to renew the token at a later date.
> > This is of course very simplified, but you get the general idea.
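> >
> > A rough sketch of what I mean (all names here are made up for
> > illustration; none of this is existing Flink API):
> >
> >     import java.util.concurrent.Executors;
> >     import java.util.concurrent.ScheduledExecutorService;
> >     import java.util.concurrent.TimeUnit;
> >     import java.util.function.Consumer;
> >     import org.apache.flink.configuration.Configuration;
> >
> >     // Generic hook, discovered e.g. via ServiceLoader and invoked at the
> >     // start of each process (JobManager, TaskManager, ...).
> >     public interface ProcessStartupHook extends Consumer<Configuration> {}
> >
> >     // An implementation living outside of Flink could then do the token
> >     // handling on its own:
> >     public class TokenFetchingHook implements ProcessStartupHook {
> >         private final ScheduledExecutorService renewer =
> >                 Executors.newSingleThreadScheduledExecutor();
> >
> >         @Override
> >         public void accept(Configuration config) {
> >             // Blocks process startup until the external service has handed
> >             // out a token; discovering that service is up to the hook.
> >             applyToken(fetchTokenFromExternalService(config));
> >             // Re-fetch periodically; a real implementation would derive the
> >             // interval from the token lifetime instead of hard-coding it.
> >             renewer.scheduleAtFixedRate(
> >                     () -> applyToken(fetchTokenFromExternalService(config)),
> >                     1, 1, TimeUnit.HOURS);
> >         }
> >
> >         private byte[] fetchTokenFromExternalService(Configuration config) {
> >             // Hypothetical: blocking call to the external token service.
> >             throw new UnsupportedOperationException("left to the implementation");
> >         }
> >
> >         private void applyToken(byte[] token) {
> >             // Hypothetical: e.g. set the token in UserGroupInformation.
> >             throw new UnsupportedOperationException("left to the implementation");
> >         }
> >     }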
> >
> > @Gyula It's certainly a reasonable design, and re-using Flink's existing
> > mechanisms does make sense.
> > However, I do have to point out that if Spark had used an external
> > process, then we could've just re-used the part that integrates Spark
> > with that, and this whole discussion could've been resolved in a day.
> > This is actually what irks me most about this topic. It could be a
> > generic solution to address Kerberos scaling issues that other projects
> > could re-use, instead of everyone having to implement their own custom
> > solution.
> >
> > On 04/02/2022 09:46, Gabor Somogyi wrote:
> >
> >> Hi All,
> >>
> >> First of all, sorry that I've taken a couple of mails heavily!
> >> I had the impression that, after we'd invested roughly 2 months into the
> >> FLIP, it was moving towards a rejection without an alternative we could
> >> work on.
> >>
> >> What I said earlier still stands: if there is a better idea for how this
> >> could be solved, I'm open to it, even at the price of rejecting this
> >> FLIP. What I would ask is that, even in the case of suggestions or a
> >> rejection, please come up with a concrete proposal that we can agree on.
> >>
> >> During these 2 months I've considered many options, and this is the
> >> design/code that requires the fewest lines of code; it has been rock
> >> stable in production in another product, and I personally have roughly
> >> 3 years of experience with it. The design is not a 1-to-1 copy-paste
> >> because I've taken my limited knowledge of Flink into account.
> >>
> >> Since I'm not the one with 7+ years in Flink, I can accept that
> >> something may not be the way it should be done.
> >> Please suggest a better way and I'm sure we're going to come up with
> >> something which makes everybody happy.
> >>
> >> So I'm waiting on the suggestions, and we'll steer the ship from there...
> >>
> >> G
> >>
> >>
> >> On Fri, Feb 4, 2022 at 12:08 AM Till Rohrmann <trohrm...@apache.org>
> >> wrote:
> >>
> >>> Sorry, I didn't want to offend anybody if it was perceived like this. I
> >>> can see that me joining the discussion very late w/o constructive ideas
> >>> was not nice. My motivation for asking for the reasoning behind the
> >>> current design proposal is primarily my lack of Kerberos knowledge.
> >>> Moreover, it has happened before that we moved responsibilities into
> >>> Flink that we regretted later.
> >>>
> >>> As I've said, I don't have a better idea right now. If we believe that
> >>> it is the right thing to make Flink responsible for distributing the
> >>> tokens and we don't find a better solution, then we'll go for it. I just
> >>> wanted to make sure that we don't overlook an alternative solution that
> >>> might be easier to maintain in the long run.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Thu, Feb 3, 2022 at 7:52 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>>
> >>> Hi Team!
> >>>>
> >>>> Let's all calm down a little and not let our emotions affect the
> >>>> discussion too much.
> >>>> There has been a lot of effort spent by all involved parties, so this
> >>>> is quite understandable :)
> >>>>
> >>>> Even though not everyone said this explicitly, it seems that everyone
> >>>> more or less agrees that a feature implementing token renewal is
> >>>> necessary and valuable.
> >>>>
> >>>> The main point of contention is: where should the token renewal logic
> >>>> run, and how do we get the tokens to wherever they are needed?
> >>>>
> >>>> From my perspective the current design is very reasonable at first
> >>>> sight because:
> >>>>   1. It runs the token renewal in a single place, avoiding extra KDC
> >>>> workload
> >>>>   2. It does not introduce new processes, extra communication channels,
> >>>> etc., but piggybacks on existing robust mechanisms.
> >>>>
> >>>> I understand the concerns about adding new things to the resource
> >>>> manager, but I think that really depends on how we look at it.
> >>>> We cannot reasonably expect a custom token renewal process to have its
> >>>> own secure distribution logic like Flink has now; that is complete
> >>>> overkill.
> >>>> This practically means that we would not have a slim, efficient
> >>>> implementation for this but something unnecessarily complex. And the
> >>>> only thing we get in return is a bit less code in the resource manager.
> >>>>
> >>>> From a logical standpoint the delegation framework needs to run in a
> >>>> centralized place and needs to be able to access new task manager
> >>>> processes to achieve all its design goals.
> >>>> We can drop the single renewer as a design goal, but that might be a
> >>>> decision that affects large-scale production runs.
> >>>>
> >>>> Cheers,
> >>>> Gyula
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Feb 3, 2022 at 7:32 PM Chesnay Schepler <ches...@apache.org>
> >>>> wrote:
> >>>>
> >>>>> First off, at no point have we questioned the use-case and importance
> >>>>> of this feature; the fact that David, Till and I spent time looking at
> >>>>> the FLIP, asking questions, and discussing different aspects of it
> >>>>> should make this obvious.
> >>>>>
> >>>>> I'd appreciate it if you didn't dismiss our replies that quickly.
> >>>>>
> >>>>>   > Ok, so we declare that using delegation tokens in Flink is
> >>>>> dead-end code and not supported, right?
> >>>>>
> >>>>> No one has said that. Are you claiming that your design is the /only
> >>>>> possible implementation/ that is capable of achieving the stated goals,
> >>>>> that there are 0 alternatives? One of the *main points* of these
> >>>>> discussion threads is to discover alternative implementations that
> >>>>> maybe weren't thought of. Yes, that may imply that we amend your
> >>>>> design, or reject it completely and come up with a new one.
> >>>>>
> >>>>>
> >>>>> Let's clarify what (I think) Till proposed, to get the imagination
> >>>>> juice flowing.
> >>>>>
> >>>>> At the end of the day, all we need is a way to provide Flink processes
> >>>>> with a token that can be periodically updated. _Who_ issues that token
> >>>>> is irrelevant for the functionality to work. You are proposing a new
> >>>>> component in the Flink RM to do that; Till is proposing to have some
> >>>>> external process do it. *That's it*.
> >>>>>
> >>>>> What this could look like in practice is fairly straightforward: add a
> >>>>> pluggable interface (aka, your TokenProvider thing) that is loaded in
> >>>>> each process, which can _somehow_ provide tokens that are then set in
> >>>>> the UserGroupInformation.
> >>>>> _How_ the provider receives tokens is up to the provider. It _may_ just
> >>>>> talk directly to Kerberos, or it could use some communication channel
> >>>>> to accept tokens from the outside.
> >>>>> This would for example make it a lot easier to properly integrate this
> >>>>> into the lifecycle of the process, as we'd sidestep the whole "TM is
> >>>>> running but still needs a Token" issue; it could become a proper setup
> >>>>> step of the process that is independent from other Flink processes.
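> >>>>>
> >>>>> Roughly like this (the provider and its loading are made-up names; only
> >>>>> the UserGroupInformation/Credentials calls are existing Hadoop API):
> >>>>>
> >>>>>     import org.apache.hadoop.security.Credentials;
> >>>>>     import org.apache.hadoop.security.UserGroupInformation;
> >>>>>
> >>>>>     // Hypothetical pluggable interface; how it obtains the tokens
> >>>>>     // (direct Kerberos login, an external service, ...) is entirely
> >>>>>     // up to the implementation.
> >>>>>     public interface TokenProvider {
> >>>>>         Credentials obtainTokens() throws Exception;
> >>>>>     }
> >>>>>
> >>>>>     // During process setup, before anything else runs:
> >>>>>     TokenProvider provider = loadTokenProvider(); // hypothetical lookup
> >>>>>     UserGroupInformation.getCurrentUser()
> >>>>>             .addCredentials(provider.obtainTokens());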
> >>>>>
> >>>>> /Discuss/.
> >>>>>
> >>>>> On 03/02/2022 18:57, Gabor Somogyi wrote:
> >>>>>
> >>>>>>> And even if we do it like this, there is no guarantee that it works
> >>>>>>> because there can be other applications bombing the KDC with requests.
> >>>>>>
> >>>>>> 1. The main issue to solve here is that workloads using delegation
> >>>>>> tokens are stopping after 7 days with the default configuration.
> >>>>>> 2. This is not a new design; it has been rock stable and performing
> >>>>>> well in Spark for years.
> >>>>>>
> >>>>>>> From a maintainability and separation of concerns perspective I'd
> >>>>>>> rather have this as some kind of external tool/service that makes KDC
> >>>>>>> scale better and that Flink processes can talk to to obtain the
> >>>>>>> tokens.
> >>>>>>
> >>>>>> Ok, so we declare that using delegation tokens in Flink is dead-end
> >>>>>> code and not supported, right? Then it must be explicitly written in
> >>>>>> the security documentation that users who rely on that feature are
> >>>>>> left behind.
> >>>>>>
> >>>>>> As I see it, the discussion has turned away from facts and started to
> >>>>>> be about feelings. If you have strategic problems with the feature,
> >>>>>> please put your -1 on the vote and we can save quite some time.
> >>>>>>
> >>>>>> G
> >>>>>>
> >>>>>>
> >>>>>> On Thu, 3 Feb 2022, 18:34 Till Rohrmann <trohrm...@apache.org> wrote:
> >>>
> >>>>>>> I don't have a good alternative solution but it sounds to me a bit as
> >>>>>>> if we are trying to solve Kerberos' scalability problems within
> >>>>>>> Flink. And even if we do it like this, there is no guarantee that it
> >>>>>>> works because there can be other applications bombing the KDC with
> >>>>>>> requests. From a maintainability and separation of concerns
> >>>>>>> perspective I'd rather have this as some kind of external
> >>>>>>> tool/service that makes KDC scale better and that Flink processes can
> >>>>>>> talk to to obtain the tokens.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Till
> >>>>>>>
> >>>>>>> On Thu, Feb 3, 2022 at 6:01 PM Gabor Somogyi
> >>>>>>> <gabor.g.somo...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Oh, and I've forgotten the most important reason.
> >>>>>>>> Without the feature in the FLIP, all secure workloads with
> >>>>>>>> delegation tokens are going to stop when the tokens reach their max
> >>>>>>>> lifetime 🙂
> >>>>>>>> This is around 7 days with the default config...
> >>>>>>>>
> >>>>>>>> On Thu, Feb 3, 2022 at 5:30 PM Gabor Somogyi
> >>>>>>>> <gabor.g.somo...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> That's not the single purpose of the feature, but in some
> >>>>>>>>> environments it caused problems.
> >>>>>>>>> The main intention is to avoid deploying the keytab to all the
> >>>>>>>>> nodes, because that makes the attack surface bigger, and to reduce
> >>>>>>>>> the KDC load.
> >>>>>>>>> I've already described the situation previously in this thread so
> >>>>>>>>> copying it here.
> >>>>>>>>>
> >>>>>>>>> --------COPY--------
> >>>>>>>>> "KDC *may* collapse under some circumstances" is the proper
> >>>>>>>>> wording.
> >>>>>>>>> We have several customers who are executing workloads on
> >>>>>>>>> Spark/Flink. Most of the time I'm facing their daily issues, which
> >>>>>>>>> are heavily environment and use-case dependent. I've seen various
> >>>>>>>>> cases:
> >>>>>>>>> * where the mentioned ~1k nodes were working fine
> >>>>>>>>> * where KDC thought the number of requests was coming from a DDOS
> >>>>>>>>> attack and so discontinued authentication
> >>>>>>>>> * where KDC was simply not responding because of the load
> >>>>>>>>> * where KDC intermittently had some outage (this was the nastiest
> >>>>>>>>> thing)
> >>>>>>>>>
> >>>>>>>>> Since you're managing a relatively big cluster, you know that KDC
> >>>>>>>>> is not only used by Spark/Flink workloads; the whole company IT
> >>>>>>>>> infrastructure is bombing it, so whether KDC reaches its limit
> >>>>>>>>> depends on other factors too. Not sure what kind of evidence you
> >>>>>>>>> are looking for, but I'm not authorized to share any information
> >>>>>>>>> about our clients' data.
> >>>>>>>>>
> >>>>>>>>> One thing is for sure: the more external system types that
> >>>>>>>>> authenticate through KDC are used in workloads (e.g. HDFS, HBase,
> >>>>>>>>> Hive, Kafka), the more likely it is to reach this threshold when
> >>>>>>>>> the cluster is big enough.
> >>>>>>>>> --------COPY--------
> >>>>>>>>>
> >>>>>>>>>> The FLIP mentions scaling issues with 200 nodes; it's really
> >>>>>>>>>> surprising to me that such a small number of requests can already
> >>>>>>>>>> cause issues.
> >>>
> >>>>>>>>> One node/task doesn't mean 1 request. I have seen the following
> >>>>>>>>> types of Kerberos auth, which can all run at the same time:
> >>>>>>>>> HDFS, HBase, Hive, Kafka, all DBs (Oracle, MariaDB, etc...).
> >>>>>>>>> Additionally, one task does not necessarily open just 1 connection.
> >>>>>>>>>
> >>>>>>>>> All in all I don't have steps to reproduce, but we've faced this
> >>>>>>>>> already...
> >>>>>>>>
> >>>>>>>>> G
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thu, Feb 3, 2022 at 5:15 PM Chesnay Schepler
> >>>>>>>>> <ches...@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>>> What I don't understand is how this could overload the KDC.
> >>>>>>>>>> Aren't tokens valid for a relatively long time period?
> >>>>>>>>>>
> >>>>>>>>>> For new deployments where many TMs are started at once I could
> >>>>>>>>>> imagine it temporarily, but shouldn't the accesses to the KDC
> >>>>>>>>>> eventually naturally spread out?
> >>>>>>>>>>
> >>>>>>>>>> The FLIP mentions scaling issues with 200 nodes; it's really
> >>>>>>>>>> surprising to me that such a small number of requests can already
> >>>>>>>>>> cause issues.
> >>>>
> >>>>> On 03/02/2022 16:14, Gabor Somogyi wrote:
> >>>>>>>>>>
> >>>>>>>>>>>> I would prefer not choosing the first option
> >>>>>>>>>>>
> >>>>>>>>>>> Then only the second option remains in play.
> >>>>>>>>>>>
> >>>>>>>>>>>> I am not a Kerberos expert but is it really so that every
> >>>>>>>>>>>> application that wants to use Kerberos needs to implement the
> >>>>>>>>>>>> token propagation itself? This somehow feels as if there is
> >>>>>>>>>>>> something missing.
> >>>>>>>>>>>
> >>>>>>>>>>> OK, so first some Kerberos + token intro.
> >>>>>>>>>>>
> >>>>>>>>>>> Some basics:
> >>>>>>>>>>> * A TGT can be created from a keytab
> >>>>>>>>>>> * A TGT is needed to obtain a TGS (called a token)
> >>>>>>>>>>> * Authentication only works with a TGS -> every place where an
> >>>>>>>>>>> external system is accessed needs either a TGT or a TGS
> >>>>>>>>>>>
> >>>>>>>>>>> There are basically 2 ways to authenticate to a Kerberos-secured
> >>>>>>>>>>> external system:
> >>>>>>>>>>> 1. One needs a Kerberos TGT which MUST be propagated to all JVMs.
> >>>>>>>>>>> Here each and every JVM obtains a TGS by itself, which bombs the
> >>>>>>>>>>> KDC, and the KDC may collapse.
> >>>>>>>>>>> 2. One needs a Kerberos TGT which exists in only a single place
> >>>>>>>>>>> (in this case the JM). The JM gets a TGS which MUST be propagated
> >>>>>>>>>>> to all TMs because otherwise authentication fails.
> >>>>>>>>>>>
> >>>>>>>>>>> Now the whole system works in a way that the keytab file (we can
> >>>>>>>>>>> think of it as a plaintext password) is reachable on all nodes.
> >>>>>>>>>>> This is a relatively huge attack surface. So the main intention
> >>>>>>>>>>> is:
> >>>>>>>>>>> * Instead of propagating the keytab file to all nodes, propagate
> >>>>>>>>>>> a TGS, which has a limited lifetime (more secure)
> >>>>>>>>>>> * Do the TGS generation in a single place so the KDC may not
> >>>>>>>>>>> collapse; additionally, a keytab that exists on only a single
> >>>>>>>>>>> node can be better protected
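> >>>>>>>>>>>
> >>>>>>>>>>> To make option 2 concrete, here is a rough sketch of the usual
> >>>>>>>>>>> pattern (the UserGroupInformation/Credentials calls are standard
> >>>>>>>>>>> Hadoop API; principal, keytabPath, hadoopConf, receivedCreds and
> >>>>>>>>>>> the "yarn" renewer are placeholders):
> >>>>>>>>>>>
> >>>>>>>>>>>     import java.security.PrivilegedExceptionAction;
> >>>>>>>>>>>     import org.apache.hadoop.fs.FileSystem;
> >>>>>>>>>>>     import org.apache.hadoop.security.Credentials;
> >>>>>>>>>>>     import org.apache.hadoop.security.UserGroupInformation;
> >>>>>>>>>>>
> >>>>>>>>>>>     // JM side: log in from the keytab (present only on this node)
> >>>>>>>>>>>     // and collect delegation tokens into a Credentials container.
> >>>>>>>>>>>     UserGroupInformation ugi =
> >>>>>>>>>>>             UserGroupInformation.loginUserFromKeytabAndReturnUGI(
> >>>>>>>>>>>                     principal, keytabPath);
> >>>>>>>>>>>     Credentials creds = new Credentials();
> >>>>>>>>>>>     ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
> >>>>>>>>>>>         // The renewer name is deployment-specific.
> >>>>>>>>>>>         FileSystem.get(hadoopConf).addDelegationTokens("yarn", creds);
> >>>>>>>>>>>         return null;
> >>>>>>>>>>>     });
> >>>>>>>>>>>     // ...serialize creds and ship them to the TMs...
> >>>>>>>>>>>
> >>>>>>>>>>>     // TM side: attach the received tokens; no keytab or KDC
> >>>>>>>>>>>     // access is needed on these nodes.
> >>>>>>>>>>>     UserGroupInformation.getCurrentUser().addCredentials(receivedCreds);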
> >>>>>>>>>>>
> >>>>>>>>>>> As a final conclusion: if there is a place that is expected to do
> >>>>>>>>>>> Kerberos authentication, then it MUST have either a TGT or a TGS.
> >>>>>>>>>>> Right now this is done in a pretty insecure way. The questions
> >>>>>>>>>>> are the following:
> >>>>>>>>
> >>>>>>>>>>> * Do we want to leave this insecure keytab propagation as it is
> >>>>>>>>>>> and bomb the KDC?
> >>>>>>>>>>> * If not, how do we propagate the more secure token to the TMs?
> >>>>>>>>>>>
> >>>>>>>>>>> If the answer to the first question is yes (we leave it as it
> >>>>>>>>>>> is), then the FLIP can be abandoned and isn't worth further
> >>>>>>>>>>> effort.
> >>>>>>>>>>> If the answer is no, then we can talk about the "how" part.
> >>>>>>>>>>>
> >>>>>>>>>>> G
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Feb 3, 2022 at 3:42 PM Till Rohrmann
> >>>>>>>>>>> <trohrm...@apache.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>>>> I would prefer not choosing the first option
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Make the TM accept tasks only after registration (not sure if
> >>>>>>>>>>>>> it's possible or makes sense at all)
> >>>>>>>>>>>>
> >>>>>>>>>>>> because it effectively means that we change how Flink's
> >>>>>>>>>>>> component lifecycle works for distributing Kerberos tokens. It
> >>>>>>>>>>>> also effectively means that a TM cannot make progress until it
> >>>>>>>>>>>> is connected to a RM.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am not a Kerberos expert but is it really so that every
> >>>>>>>>>>>> application that wants to use Kerberos needs to implement the
> >>>>>>>>>>>> token propagation itself? This somehow feels as if there is
> >>>>>>>>>>>> something missing.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> Till
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Feb 3, 2022 at 3:29 PM Gabor Somogyi
> >>>>>>>>>>>> <gabor.g.somo...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>>     Isn't this something the underlying resource management
> >>>>>>>>>>>>>     system could do or which every process could do on its own?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I was looking for such a feature but haven't found one.
> >>>>>>>>>>>>> Maybe we can solve the propagation more easily, but then I'm
> >>>>>>>>>>>>> waiting for a better suggestion.
> >>>>>>>>>>>>> If anybody has a better/simpler idea, please point to a
> >>>>>>>>>>>>> specific feature that works on all resource management systems.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Here's an example for the TM to run workloads without being
> >>>>>>>>>>>>>> connected to the RM, without ever having a valid token
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> All in all I see the main problem. Not sure what the reason is
> >>>>>>>>>>>>> behind a TM accepting tasks w/o registration, but it's clearly
> >>>>>>>>>>>>> not helping here.
> >>>>
> >>>>>>>>>>>>> I basically see 2 possible solutions:
> >>>>>>>>>>>>> * Make the TM accept tasks only after registration (not sure if
> >>>>>>>>>>>>> it's possible or makes sense at all)
> >>>>>>>>>>>>> * We send tokens right after container creation with
> >>>>>>>>>>>>> "updateDelegationTokens"
> >>>>>>>>>>>>> Not sure which one is more realistic to do since I'm not
> >>>>>>>>>>>>> involved in the new feature.
> >>>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Feb 3, 2022 at 3:09 PM Till Rohrmann
> >>>>>>>>>>>>> <trohrm...@apache.org> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Sorry for joining this discussion late. I also did not read
> >>>>>>>>>>>>>> all responses in this thread so my question might already be
> >>>>>>>>>>>>>> answered: Why does Flink need to be involved in the
> >>>>>>>>>>>>>> propagation of the tokens? Why do we need explicit RPC calls
> >>>>>>>>>>>>>> in the Flink domain? Isn't this something the underlying
> >>>>>>>>>>>>>> resource management system could do or which every process
> >>>>>>>>>>>>>> could do on its own? I am a bit worried that we are making
> >>>>>>>>>>>>>> Flink responsible for something that it is not really designed
> >>>>>>>>>>>>>> to do.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>> Till
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, Feb 3, 2022 at 2:54 PM Chesnay Schepler
> >>>>>>>>>>>>>> <ches...@apache.org> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Here's an example for the TM to run workloads without being
> >>>>>>>>>>>>>>> connected to the RM, while potentially having a valid token:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>     1. TM registers at RM
> >>>>>>>>>>>>>>>     2. JobMaster requests slot from RM -> TM gets notified
> >>>>>>>>>>>>>>>     3. JM fails over
> >>>>>>>>>>>>>>>     4. TM re-offers the slot to the failed over JobMaster
> >>>>>>>>>>>>>>>     5. TM reconnects to RM at some point
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Here's an example for the TM to run workloads without being
> >>>>>>>>>>>>>>> connected to the RM, without ever having a valid token:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>     1. TM1 has a valid token and is running some tasks.
> >>>>>>>>>>>>>>>     2. TM1 crashes
> >>>>>>>>>>>>>>>     3. TM2 is started to take over, and re-uses the working
> >>>>>>>>>>>>>>>        directory of TM1 (new feature in 1.15!)
> >>>>>>>>>>>>>>>     4. TM2 recovers the previous slot allocations
> >>>>>>>>>>>>>>>     5. TM2 is informed about leading JM
> >>>>>>>>>>>>>>>     6. TM2 starts registration with RM
> >>>>>>>>>>>>>>>     7. TM2 offers slots to JobMaster
> >>>>>>>>>>>>>>>     8. TM2 accepts task submission from JobMaster
> >>>>>>>>>>>>>>>     9. ...some time later the registration completes...
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 03/02/2022 14:24, Gabor Somogyi wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> but it can happen that the JobMaster+TM collaborate to run
> >>>>>>>>>>>>>>>>> stuff without the TM being registered at the RM
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Honestly, I'm not educated enough within Flink to give an
> >>>>>>>>>>>>>>>> example of such a scenario.
> >>>>>>>>>>>>>>>> Until now I thought the JM defines the tasks to be done and
> >>>>>>>>>>>>>>>> the TM just blindly connects to external systems and does
> >>>>>>>>>>>>>>>> the processing.
> >>>>>>>>>>>>>>>> All in all, if external systems can be touched when JM + TM
> >>>>>>>>>>>>>>>> collaboration happens, then we need to consider that in the
> >>>>>>>>>>>>>>>> design.
> >>>>>>>>>>>>>>>> Since I don't have an example scenario, I don't know what
> >>>>>>>>>>>>>>>> exactly needs to be solved.
> >>>>>>>>>>>>>>>> I think we need an example case to decide whether we face a
> >>>>>>>>>>>>>>>> real issue or the design is not leaking.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Thu, Feb 3, 2022 at 2:12 PM Chesnay Schepler
> >>>>>>>>>>>>>>>> <ches...@apache.org> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>        > Just to learn something new. I think local recovery
> >>>>>>>>>>>>>>>>        is clear to me, which is not touching external systems
> >>>>>>>>>>>>>>>>        like Kafka or so (correct me if I'm wrong). Is it
> >>>>>>>>>>>>>>>>        possible that in such a case the user code just starts
> >>>>>>>>>>>>>>>>        to run blindly w/o JM coordination and connects to
> >>>>>>>>>>>>>>>>        external systems to do data processing?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>        Local recovery itself shouldn't touch external systems;
> >>>>>>>>>>>>>>>>        the TM cannot just run user-code without the JobMaster
> >>>>>>>>>>>>>>>>        being involved, but it can happen that the JobMaster+TM
> >>>>>>>>>>>>>>>>        collaborate to run stuff without the TM being
> >>>>>>>>>>>>>>>>        registered at the RM.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>        On 03/02/2022 13:48, Gabor Somogyi wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>        > Any error in loading the provider (be it by accident
> >>>>>>>>>>>>>>>>>        or explicit checks) then is a setup error and we can
> >>>>>>>>>>>>>>>>>        fail the cluster.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>        Fail fast is a good direction in my view. In Spark I
> >>>>>>>>>>>>>>>>>        wanted to go in this direction, but there were other
> >>>>>>>>>>>>>>>>>        opinions, so there, if a provider is not loaded, the
> >>>>>>>>>>>>>>>>>        workload goes on.
> >>>>>>>>>>>>>>>>>        Of course the processing will fail if the token is
> >>>>>>>>>>>>>>>>>        missing...
> >>>>>>>>>>>>>>>>>        > Requiring HBase (and Hadoop for that matter) to be
> >>>>>>>>>>>>>>>>>        on the JM
