Thanks guys. Updated the error codes in both the code and the explanation under "Public Changes". To sum up, here are the error codes listed in the KIP:
enum AssignmentError { NONE, ACTIVE_TASK_ASSIGNED_MULTIPLE_TIMES, ACTIVE_AND_STANDBY_TASK_ASSIGNED_TO_SAME_KAFKASTREAMS, INVALID_STANDBY_TASK, UNKNOWN_PROCESS_ID, UNKNOWN_TASK_ID } Anything missing? (also updated all the code block headings, thanks for noticing that Bruno) On Fri, May 3, 2024 at 9:33 AM Matthias J. Sax <mj...@apache.org> wrote: > 117f: Good point by Bruno. We should check for this, and could have an > additional `INVALID_STANDBY_TASK` error code? > > > -Matthias > > On 5/3/24 5:52 AM, Guozhang Wang wrote: > > Hi Sophie, > > > > Re: As for the return type of the TaskAssignmentUtils, I think that > > makes sense. LGTM. > > > > On Fri, May 3, 2024 at 2:26 AM Bruno Cadonna <cado...@apache.org> wrote: > >> > >> Hi Sophie, > >> > >> 117f: > >> I think, removing the STATEFUL and STATELESS types is not enough to > >> avoid the error Guozhang mentioned. The StreamsPartitionAssignor passes > >> the information whether a task is stateless or stateful into the task > >> assignor. However, the task assignor can return a standby task for a > >> stateless task which is inconsistent. > >> > >> Echoing Matthias' statement about the missing UNKNOWN_TASK_ID error. > >> > >> nit: > >> The titles of some code blocks in the KIP are not consistent with their > >> content, e.g., KafkaStreamsState <-> NodeState > >> > >> > >> Best, > >> Bruno > >> > >> On 5/3/24 2:43 AM, Matthias J. Sax wrote: > >>> Thanks Sophie. My bad. You are of course right about `TaskAssignment` > >>> and the StreamsPartitionAssignor's responsibitliy to map tasks of a > >>> instance to consumers. When I wrote my reply, I forgot about this > detail. > >>> > >>> Seems you did not add `UNKNOWN_TASK_ID` error yet as proposed by > Guozhang? > >>> > >>> Otherwise LGTM. > >>> > >>> > >>> -Matthias > >>> > >>> On 5/2/24 4:20 PM, Sophie Blee-Goldman wrote: > >>>> Guozhang: > >>>> > >>>> 117. All three additions make sense to me. However, while thinking > about > >>>> how users would actually produce an assignment, I realized that it > seems > >>>> silly to make it their responsibility to distinguish between a > stateless > >>>> and stateful task when they return the assignment. The > >>>> StreamsPartitionAssignor already knows which tasks are stateful vs > >>>> stateless, so there's no need to add this extra step for users to > >>>> figure it > >>>> out themselves, and potentially make a mistake. > >>>> > >>>> 117f: So, rather than add a new error type for "inconsistent task > types", > >>>> I'm proposing to just flatten the AssignedTask.Type enum to only > "ACTIVE" > >>>> and "STANDBY", and remove the "STATEFUL" and "STATELESS" types > >>>> altogether. > >>>> Any objections? > >>>> > >>>> ----- > >>>> > >>>> -Thanks, fixed the indentation of headers under "User APIs" and > >>>> "Read-Only > >>>> APIs" > >>>> > >>>> -As for the return type of the TaskAssignmentUtils methods, I don't > >>>> personally feel too strongly about this, but the reason for the return > >>>> type > >>>> being a Map<ProcessId, KafkaStreamsAssignment> rather than a > >>>> TaskAssignment > >>>> is because they are meant to be used iteratively/to create a part of > the > >>>> full assignment, and not necessarily a full assignment for each. > Notice > >>>> that they all have an input parameter of the same type: Map<ProcessId, > >>>> KafkaStreamsAssignment>. The idea is you can take the output of any of > >>>> these and pass it in to another to generate or optimize another piece > of > >>>> the overall assignment. For example, if you want to perform the > >>>> rack-aware > >>>> optimization on both active and standby tasks, you would need to call > >>>> #optimizeRackAwareActiveTasks and then forward the output to > >>>> #optimizeRackAwareStandbyTasks to get the final assignment. If we > >>>> return a > >>>> TaskAssignment, it will usually need to be unwrapped right away. > Perhaps > >>>> more importantly, I worry that returning a TaskAssignment will make it > >>>> seem > >>>> like each of these utility methods return a "full" and final > assignment > >>>> that can just be returned as-is from the TaskAssignor's #assign > method. > >>>> Whereas they are each just a single step in the full assignment > process, > >>>> and not the final product. Does that make sense? > >>>> > >>>> On Thu, May 2, 2024 at 3:50 PM Sophie Blee-Goldman > >>>> <sop...@responsive.dev> > >>>> wrote: > >>>> > >>>>> Matthias: > >>>>> > >>>>> Thanks for the naming suggestions for the error codes. I was > >>>>> definitely not happy with my original naming but couldn't think of > >>>>> anything > >>>>> better. I like your proposals though, will update the KIP names. > >>>>> I'll also > >>>>> add a "NONE" option as well -- much better than just passing in null > >>>>> for no > >>>>> error. > >>>>> > >>>>>> OVERLAPPING_CLIENT : multiple KafkaStreams clients assigned with the > >>>>>> same active task > >>>>> > >>>>> Would also be an error if assigned to two consumers of the same > >>>>> client... > >>>>>> Needs to be rephrased. > >>>>> > >>>>> > >>>>> Well the TaskAssignor only assigns tasks to KafkaStreams clients, > >>>>> it's not > >>>>> responsible for the assignment of tasks to consumers within a > >>>>> KafkaStreams. > >>>>> It would be a bug in the StreamsPartitionAssignor if it received a > valid > >>>>> assignment from the TaskAssignor with only one copy of a task > >>>>> assigned to a > >>>>> single KAfkaStreams client, and then somehow ended up assigning that > >>>>> task > >>>>> to multiple consumers on the KafkaStreams client. It wouldn't be the > >>>>> TaskAssignor's fault so imo it would not make sense to include this > >>>>> case in > >>>>> the OVERLAPPING_CLIENT error (or as it's now called, ACTIVE_TASK_ > >>>>> ASSIGNED_MULTIPLE_TIMES). Not to mention, if there was a bug that > >>>>> caused > >>>>> the StreamsPartitionAssignor to assign a task to multiple consumers, > it > >>>>> presumably wouldn't even notice since it's a bug -- if it did > notice, it > >>>>> can just fix the issue. The error codes are about communicating > >>>>> unfixable > >>>>> issues due to the TaskAssignor itself returning an invalid > >>>>> assignment. The > >>>>> phrasing is intentional, and (imo) correct as it is. > >>>>> > >>>>> I do see your point about how the StreamsPartitionAssignor should > >>>>> handle/react to invalid assignments though. I'm fine with just > >>>>> throwing a > >>>>> StreamsException and crashing after we invoke the > #onAssignmentComputed > >>>>> callback to notify the user of the error. > >>>>> > >>>>> On Wed, May 1, 2024 at 9:46 AM Guozhang Wang > >>>>> <guozhang.wang...@gmail.com> > >>>>> wrote: > >>>>> > >>>>>> Jumping back to the party here :) > >>>>>> > >>>>>> 107: I agree with the rationale behind this, and > >>>>>> `numProcessingThreads` looks good to me as it covers both the > current > >>>>>> and future scenarios. > >>>>>> > >>>>>> 117: I agree with Lucas and Bruno, and would add: > >>>>>> * 117e: unknown taskID: fail > >>>>>> * 117f: inconsistent task types (e.g. a known taskID was > indicated > >>>>>> stateless from ApplicationState, but the returned AssignedTask > states > >>>>>> stateful): fail > >>>>>> * 117g: some ProcessID was not included in the returned Set: > pass, > >>>>>> and interprets it as no tasks assigned to it. > >>>>>> > >>>>>> And I'm open for any creative error codes folks would come up with > :) > >>>>>> > >>>>>>> If any of these errors are detected, the StreamsPartitionAssignor > will > >>>>>> immediately "fail" the rebalance and retry it by scheduling an > >>>>>> immediate > >>>>>> followup rebalance. > >>>>>> > >>>>>> I'm also a bit concerned here, as such endless retry loops have > >>>>>> happened in the past in my memory. Given that we would likely see > most > >>>>>> of the user implementations be deterministic, I'm also leaning > towards > >>>>>> failing the app immediately and let the crowd educates us if there > are > >>>>>> some very interesting scenarios out there that are not on our radar > to > >>>>>> re-consider this, rather than getting hard to debug cases in the > dark. > >>>>>> > >>>>>> ----- > >>>>>> > >>>>>> And here are just some nits about the KIP writings itself: > >>>>>> > >>>>>> * I think some bullet points under `User APIs` and `Read-only APIs` > >>>>>> should have a lower level indention? It caught me for a sec until I > >>>>>> realized there are just two categories. > >>>>>> > >>>>>> * In TaskAssignmentUtils , why not let those util functions return > >>>>>> `TaskAssignment` (to me it feels more consistent with the user > APIs), > >>>>>> but instead return a Map<ProcessID, KafkaStreamsAssignment>? > >>>>>> > >>>>>> > >>>>>> Guozhang > >>>>>> > >>>>>> On Tue, Apr 30, 2024 at 5:28 PM Matthias J. Sax <mj...@apache.org> > >>>>>> wrote: > >>>>>>> > >>>>>>> I like the idea of error codes. Not sure if the name are ideal? > >>>>>>> UNKNOWN_PROCESS_ID makes sense, but the other two seems a little > bit > >>>>>>> difficult to understand? > >>>>>>> > >>>>>>> Should we be very descriptive (and also try to avoid coupling it to > >>>>>>> the > >>>>>>> threading model -- important for the first error code): > >>>>>>> - ACTIVE_TASK_ ASSIGNED_MULTIPLE_TIMES > >>>>>>> - ACTIVE_AND_STANDBY_ASSIGNED_TO_SAME_CLIENT (or _INSTANCE > >>>>>>> > >>>>>>> I think we also need to add NONE as option or make the error > parameter > >>>>>>> an `Optional`? > >>>>>>> > >>>>>>> > >>>>>>>> OVERLAPPING_CLIENT : multiple KafkaStreams clients assigned with > the > >>>>>> same active task > >>>>>>> > >>>>>>> Would also be an error if assigned to two consumers of the same > >>>>>>> client... Needs to be rephrased. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>> If any of these errors are detected, the StreamsPartitionAssignor > >>>>>> will immediately "fail" the rebalance and retry it by scheduling an > >>>>>> immediate followup rebalance. > >>>>>>> > >>>>>>> Does this make sense? If we assume that the task-assignment is > >>>>>>> deterministic, we would end up with an infinite retry loop? Also, > >>>>>>> assuming that an client leave the group, we cannot assign some task > >>>>>>> any > >>>>>>> longer... I would rather throw a StreamsException and let the > client > >>>>>> crash. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> On 4/30/24 12:22 PM, Sophie Blee-Goldman wrote: > >>>>>>>> One last thing: I added an error code enum to be returned from the > >>>>>>>> #onAssignmentComputed method in case of an invalid assignment. I > >>>>>> created > >>>>>>>> one code for each of the invalid cases we described above. The > >>>>>> downside is > >>>>>>>> that this means we'll have to go through a deprecation cycle if we > >>>>>> want to > >>>>>>>> loosen up the restrictions on any of the enforced cases. The > upside > >>>>>> is that > >>>>>>>> we can very clearly mark what is an invalid assignment and this > will > >>>>>>>> (hopefully) assist users who are new to customizing assignments by > >>>>>> clearly > >>>>>>>> denoting the requirements, and returning a clear error if they are > >>>>>>>> not > >>>>>>>> followed. > >>>>>>>> > >>>>>>>> Of course the StreamsPartitionAssignor will also do a "fallback & > >>>>>> retry" in > >>>>>>>> this case by returning the same assignment to the consumers and > >>>>>> scheduling > >>>>>>>> a followup rebalance. I've added all of this to the TaskAssignor > and > >>>>>>>> #onAssignmentComputed javadocs, and added a section under "Public > >>>>>> Changes" > >>>>>>>> as well. > >>>>>>>> > >>>>>>>> Please let me know if there are any concerns, or if you have > >>>>>> suggestions > >>>>>>>> for how else we can handle an invalid assignment > >>>>>>>> > >>>>>>>> On Tue, Apr 30, 2024 at 11:39 AM Sophie Blee-Goldman < > >>>>>> sop...@responsive.dev> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Thanks guys! I agree with what Lucas said about 117c, we can > always > >>>>>> loosen > >>>>>>>>> a restriction later and I don't want to do anything now that > might > >>>>>> get in > >>>>>>>>> the way of the new threading models. > >>>>>>>>> > >>>>>>>>> With that I think we're all in agreement on 117. I'll update the > KIP > >>>>>> to > >>>>>>>>> include what we've discussed > >>>>>>>>> > >>>>>>>>> (and will fix the remaining #finalAssignment mention as well, > thanks > >>>>>>>>> Bruno. Glad to have such good proof readers! :P) > >>>>>>>>> > >>>>>>>>> On Tue, Apr 30, 2024 at 8:35 AM Bruno Cadonna < > cado...@apache.org> > >>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Hi again, > >>>>>>>>>> > >>>>>>>>>> I forgot to ask whether you could add the agreement about > handling > >>>>>>>>>> invalid assignment to the KIP. > >>>>>>>>>> > >>>>>>>>>> Best, > >>>>>>>>>> Bruno > >>>>>>>>>> > >>>>>>>>>> On 4/30/24 2:00 PM, Bruno Cadonna wrote: > >>>>>>>>>>> Hi all, > >>>>>>>>>>> > >>>>>>>>>>> I think we are converging! > >>>>>>>>>>> > >>>>>>>>>>> 117 > >>>>>>>>>>> a) fail: Since it is an invalid consumer assignment > >>>>>>>>>>> b) pass: I agree that not assigning a task might be reasonable > in > >>>>>> some > >>>>>>>>>>> situations > >>>>>>>>>>> c) fail: For the reasons Lucas pointed out. I am missing a good > >>>>>>>>>>> use > >>>>>>>>>> case > >>>>>>>>>>> here. > >>>>>>>>>>> d) fail: It is invalid > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Somewhere in the KIP you still use finalAssignment() instead of > >>>>>>>>>>> the > >>>>>>>>>>> wonderful method name onAssignmentComputed() ;-) > >>>>>>>>>>> "... interface also includes a method named finalAssignment > which > >>>>>> is > >>>>>>>>>>> called with the final computed GroupAssignment ..." > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Best, > >>>>>>>>>>> Bruno > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 4/30/24 1:04 PM, Lucas Brutschy wrote: > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>> Looks like a great KIP to me! > >>>>>>>>>>>> > >>>>>>>>>>>> I'm late, so I'm only going to comment on the last open point > >>>>>> 117. I'm > >>>>>>>>>>>> against any fallbacks like "use the default assignor if the > >>>>>>>>>>>> custom > >>>>>>>>>>>> assignment is invalid", as it's just going to hide bugs. For > >>>>>>>>>>>> the 4 > >>>>>>>>>>>> cases mentioned by Sophie: > >>>>>>>>>>>> > >>>>>>>>>>>> 117a) I'd fail immediately here, as it's an implementation > bug, > >>>>>> and > >>>>>>>>>>>> should not lead to a valid consumer group assignment. > >>>>>>>>>>>> 117b) Agreed. This is a useful assignment and should be > allowed. > >>>>>>>>>>>> 117c) This is the tricky case. However, I'm leaning towards > not > >>>>>>>>>>>> allowing this, unless we have a concrete use case. This will > >>>>>> block us > >>>>>>>>>>>> from potentially using a single consumer for active and > standby > >>>>>> tasks > >>>>>>>>>>>> in the future. It's easier to drop the restriction later if we > >>>>>> have a > >>>>>>>>>>>> concrete use case. > >>>>>>>>>>>> 117d) Definitely fail immediately, as you said. > >>>>>>>>>>>> > >>>>>>>>>>>> Cheers, > >>>>>>>>>>>> Lucas > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On Mon, Apr 29, 2024 at 11:13 PM Sophie Blee-Goldman > >>>>>>>>>>>> <sop...@responsive.dev> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>> Yeah I think that sums it up well. Either you computed a > >>>>>> *possible* > >>>>>>>>>>>>> assignment, > >>>>>>>>>>>>> or you returned something that makes it literally impossible > for > >>>>>> the > >>>>>>>>>>>>> StreamsPartitionAssignor to decipher/translate into an actual > >>>>>> group > >>>>>>>>>>>>> assignment, in which case it should just fail > >>>>>>>>>>>>> > >>>>>>>>>>>>> That's more or less it for the open questions that have been > >>>>>> raised > >>>>>>>>>>>>> so far, > >>>>>>>>>>>>> so I just want to remind folks that there's already a voting > >>>>>> thread > >>>>>>>>>> for > >>>>>>>>>>>>> this. I cast my vote a few minutes ago so it should > resurface in > >>>>>>>>>>>>> everyone's > >>>>>>>>>>>>> inbox :) > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Thu, Apr 25, 2024 at 11:42 PM Rohan Desai < > >>>>>> desai.p.ro...@gmail.com > >>>>>>>>>>> > >>>>>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> 117: as Sophie laid out, there are two cases here right: > >>>>>>>>>>>>>> 1. cases that are considered invalid by the existing > assignors > >>>>>> but > >>>>>>>>>> are > >>>>>>>>>>>>>> still valid assignments in the sense that they can be used > to > >>>>>>>>>>>>>> generate a > >>>>>>>>>>>>>> valid consumer group assignment (from the perspective of the > >>>>>>>>>>>>>> consumer group > >>>>>>>>>>>>>> protocol). An assignment that excludes a task is one such > >>>>>> example, > >>>>>>>>>> and > >>>>>>>>>>>>>> Sophie pointed out a good use case for it. I also think it > >>>>>>>>>>>>>> makes > >>>>>>>>>>>>>> sense to > >>>>>>>>>>>>>> allow these. It's hard to predict how a user might want to > use > >>>>>> the > >>>>>>>>>>>>>> custom > >>>>>>>>>>>>>> assignor, and its reasonable to expect them to use it with > care > >>>>>> and > >>>>>>>>>> not > >>>>>>>>>>>>>> hand-hold them. > >>>>>>>>>>>>>> 2. cases that are not valid because it is impossible to > compute > >>>>>> a > >>>>>>>>>> valid > >>>>>>>>>>>>>> consumer group assignment from them. In this case it seems > >>>>>> totally > >>>>>>>>>>>>>> reasonable to just throw a fatal exception that gets passed > to > >>>>>> the > >>>>>>>>>>>>>> uncaught > >>>>>>>>>>>>>> exception handler. If this case happens then there is some > bug > >>>>>> in the > >>>>>>>>>>>>>> user's assignor and its totally reasonable to fail the > >>>>>> application > >>>>>>>>>>>>>> in that > >>>>>>>>>>>>>> case. We _could_ try to be more graceful and default to one > of > >>>>>> the > >>>>>>>>>>>>>> existing > >>>>>>>>>>>>>> assignors. But it's usually better to fail hard and fast > when > >>>>>> there > >>>>>>>>>>>>>> is some > >>>>>>>>>>>>>> illegal state detected imo. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Fri, Apr 19, 2024 at 4:18 PM Rohan Desai < > >>>>>> desai.p.ro...@gmail.com > >>>>>>>>>>> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Bruno, I've incorporated your feedback into the KIP > document. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Fri, Apr 19, 2024 at 3:55 PM Rohan Desai < > >>>>>>>>>> desai.p.ro...@gmail.com> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Thanks for the feedback Bruno! For the most part I think > it > >>>>>> makes > >>>>>>>>>>>>>>>> sense, > >>>>>>>>>>>>>>>> but leaving a couple follow-up thoughts/questions: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> re 4: I think Sophie's point was slightly different - > that we > >>>>>>>>>>>>>>>> might want > >>>>>>>>>>>>>>>> to wrap the return type for `assign` in a class so that > its > >>>>>> easily > >>>>>>>>>>>>>>>> extensible. This makes sense to me. Whether we do that or > >>>>>> not, we > >>>>>>>>>> can > >>>>>>>>>>>>>> have > >>>>>>>>>>>>>>>> the return type be a Set instead of a Map as well. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> re 6: Yes, it's a callback that's called with the final > >>>>>>>>>> assignment. I > >>>>>>>>>>>>>>>> like your suggested name. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On Fri, Apr 5, 2024 at 12:17 PM Rohan Desai < > >>>>>>>>>> desai.p.ro...@gmail.com> > >>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Thanks for the feedback Sophie! > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> re1: Totally agree. The fact that it's related to the > >>>>>> partition > >>>>>>>>>>>>>> assignor > >>>>>>>>>>>>>>>>> is clear from just `task.assignor`. I'll update. > >>>>>>>>>>>>>>>>> re3: This is a good point, and something I would find > useful > >>>>>>>>>>>>>> personally. > >>>>>>>>>>>>>>>>> I think its worth adding an interface that lets the > plugin > >>>>>>>>>>>>>>>>> observe the > >>>>>>>>>>>>>>>>> final assignment. I'll add that. > >>>>>>>>>>>>>>>>> re4: I like the new `NodeAssignment` type. I'll update > the > >>>>>> KIP > >>>>>>>>>> with > >>>>>>>>>>>>>> that. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Thu, Nov 9, 2023 at 11:18 PM Rohan Desai > >>>>>>>>>>>>>>>>> <desai.p.ro...@gmail.com> > >>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Thanks for the feedback so far! I think pretty much all > of > >>>>>> it is > >>>>>>>>>>>>>>>>>> reasonable. I'll reply to it inline: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 1. All the API logic is granular at the Task level, > except > >>>>>> the > >>>>>>>>>>>>>>>>>> previousOwnerForPartition func. I’m not clear what’s the > >>>>>>>>>> motivation > >>>>>>>>>>>>>>>>>> behind it, does our controller also want to change how > the > >>>>>>>>>>>>>>>>>> partitions->tasks mapping is formed? > >>>>>>>>>>>>>>>>>> You're right that this is out of place. I've removed > this > >>>>>> method > >>>>>>>>>> as > >>>>>>>>>>>>>>>>>> it's not needed by the task assignor. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 2. Just on the API layering itself: it feels a bit > >>>>>>>>>>>>>>>>>>> weird to > >>>>>>>>>>>>>>>>>>> have the > >>>>>>>>>>>>>>>>>> three built-in functions (defaultStandbyTaskAssignment > etc) > >>>>>>>>>>>>>>>>>> sitting in > >>>>>>>>>>>>>>>>>> the ApplicationMetadata class. If we consider them as > some > >>>>>>>>>> default > >>>>>>>>>>>>>> util > >>>>>>>>>>>>>>>>>> functions, how about introducing moving those into their > >>>>>>>>>>>>>>>>>> own > >>>>>>>>>> static > >>>>>>>>>>>>>> util > >>>>>>>>>>>>>>>>>> methods to separate from the ApplicationMetadata “fact > >>>>>> objects” ? > >>>>>>>>>>>>>>>>>> Agreed. Updated in the latest revision of the kip. These > >>>>>> have > >>>>>>>>>> been > >>>>>>>>>>>>>>>>>> moved to TaskAssignorUtils > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 3. I personally prefer `NodeAssignment` to be a > read-only > >>>>>> object > >>>>>>>>>>>>>>>>>> containing the decisions made by the assignor, including > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> requestFollowupRebalance flag. For manipulating the > >>>>>> half-baked > >>>>>>>>>>>>>>>>>> results > >>>>>>>>>>>>>>>>>> inside the assignor itself, maybe we can just be > flexible > >>>>>> to let > >>>>>>>>>>>>>> users use > >>>>>>>>>>>>>>>>>> whatever struts / their own classes even, if they like. > >>>>>> WDYT? > >>>>>>>>>>>>>>>>>> Agreed. Updated in the latest version of the kip. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 1. For the API, thoughts on changing the method > signature > >>>>>> to > >>>>>>>>>>>>>>>>>>> return > >>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>> (non-Optional) TaskAssignor? Then we can either have the > >>>>>> default > >>>>>>>>>>>>>>>>>> implementation return new HighAvailabilityTaskAssignor > or > >>>>>> just > >>>>>>>>>>>>>>>>>> have a > >>>>>>>>>>>>>>>>>> default implementation class that people can extend if > they > >>>>>> don't > >>>>>>>>>>>>>> want to > >>>>>>>>>>>>>>>>>> implement every method. > >>>>>>>>>>>>>>>>>> Based on some other discussion, I actually decided to > get > >>>>>> rid of > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> plugin interface, and instead use config to specify > >>>>>> individual > >>>>>>>>>>>>>>>>>> plugin > >>>>>>>>>>>>>>>>>> behaviour. So the method you're referring to is no > longer > >>>>>> part > >>>>>>>>>>>>>>>>>> of the > >>>>>>>>>>>>>>>>>> proposal. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 3. Speaking of ApplicationMetadata, the javadoc says > it's > >>>>>> read > >>>>>>>>>>>>>>>>>>> only > >>>>>>>>>>>>>>>>>> but > >>>>>>>>>>>>>>>>>> theres methods that return void on it? It's not totally > >>>>>> clear to > >>>>>>>>>> me > >>>>>>>>>>>>>> how > >>>>>>>>>>>>>>>>>> that interface is supposed to be used by the assignor. > It'd > >>>>>> be > >>>>>>>>>> nice > >>>>>>>>>>>>>> if we > >>>>>>>>>>>>>>>>>> could flip that interface such that it becomes part of > the > >>>>>> output > >>>>>>>>>>>>>> instead > >>>>>>>>>>>>>>>>>> of an input to the plugin. > >>>>>>>>>>>>>>>>>> I've moved those methods to a util class. They're really > >>>>>> utility > >>>>>>>>>>>>>>>>>> methods the assignor might want to call to do some > default > >>>>>> or > >>>>>>>>>>>>>> optimized > >>>>>>>>>>>>>>>>>> assignment for some cases like rack-awareness. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 4. We should consider wrapping UUID in a ProcessID > >>>>>>>>>>>>>>>>>>> class so > >>>>>>>>>>>>>>>>>>> that we > >>>>>>>>>>>>>>>>>> control > >>>>>>>>>>>>>>>>>> the interface (there are a few places where UUID is > >>>>>>>>>>>>>>>>>> directly > >>>>>>>>>> used). > >>>>>>>>>>>>>>>>>> I like it. Updated the proposal. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 5. What does NodeState#newAssignmentForNode() do? I > >>>>>> thought the > >>>>>>>>>>>>>>>>>> point was > >>>>>>>>>>>>>>>>>> for the plugin to make the assignment? Is that the > result > >>>>>> of the > >>>>>>>>>>>>>>>>>> default logic? > >>>>>>>>>>>>>>>>>> It doesn't need to be part of the interface. I've > removed > >>>>>> it. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> re 2/6: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I generally agree with these points, but I'd rather hash > >>>>>> that > >>>>>>>>>>>>>>>>>> out in a > >>>>>>>>>>>>>>>>>> PR than in the KIP review, as it'll be clearer what gets > >>>>>> used > >>>>>>>>>>>>>>>>>> how. It > >>>>>>>>>>>>>> seems > >>>>>>>>>>>>>>>>>> to me (committers please correct me if I'm wrong) that > as > >>>>>> long as > >>>>>>>>>>>>>> we're on > >>>>>>>>>>>>>>>>>> the same page about what information the interfaces are > >>>>>>>>>> returning, > >>>>>>>>>>>>>> that's > >>>>>>>>>>>>>>>>>> ok at this level of discussion. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Tue, Nov 7, 2023 at 12:03 PM Rohan Desai > >>>>>>>>>>>>>>>>>> <desai.p.ro...@gmail.com> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Hello All, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I'd like to start a discussion on KIP-924 ( > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>> > >>>>>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-924%3A+customizable+task+assignment+for+Streams > >>>>>>>>>>>>>> ) > >>>>>>>>>>>>>>>>>>> which proposes an interface to allow users to plug into > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> streams > >>>>>>>>>>>>>>>>>>> partition assignor. The motivation section in the KIP > goes > >>>>>> into > >>>>>>>>>>>>>>>>>>> some > >>>>>>>>>>>>>> more > >>>>>>>>>>>>>>>>>>> detail on why we think this is a useful addition. > >>>>>>>>>>>>>>>>>>> Thanks in > >>>>>>>>>>>>>>>>>>> advance > >>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>> your feedback! > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Best Regards, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Rohan > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>> > >>>>> > >>>> >