Hey Bruno,
Thanks for the feedback. Sorry for the late reply, I was hoping others
might weigh in as well and it got away from me.
a) I like this but I think we should separate this out. This kip has
already dragged on more than it should and I think that is a big enough
change to get done by itse
Hi Walker,
Thanks for the updates!
(1) While I like naming the methods differently, I have also to say that
I do not like addIsomorphicGlobalStore() because it does not really tell
what the method does. I could also not come up with a better name than
addGlobalStoreWithReprocessingOnRestore(
Hey all,
(1) no I hadn't considered just naming the methods differently. I actually
really like this idea and am for it. Except we need 3 different methods
now. One for no processor, one for a processor that should restore and one
that reprocesses. How about `addCustomGlobalStore` and
`addIsomorph
One more thing:
I was just looking into the WIP PR, and it seems we will also need to
change `StreamsBuilder.scala`. The KIP needs to cover this changes as well.
-Matthias
On 4/1/24 10:33 PM, Bruno Cadonna wrote:
Hi Walker and Matthias,
(2)
That is exactly my point about having a compile t
Hi Walker and Matthias,
(2)
That is exactly my point about having a compile time error versus a
runtime error. The added flexibility as proposed by Matthias sounds good
to me.
Regarding the Named parameter, I was not aware that the processor that
writes records to the global state store is n
Two more follow up thoughts:
(1) I am still not a big fan of the boolean parameter we introduce. Did
you consider to use different method names, like
`addReadOnlyGlobalStore()` (for the optimized method, that would not
reprocess data on restore), and maybe add `addModifiableGlobalStore()`
(no
Hey all,
Thanks for the feedback Bruno, Almog and Matthias!
Almog: I like the idea, but I agree with Matthais. I actually looked at
that ticket a bit when doing this and found that while similar they are
actually pretty unrelated codewise. I would love to see it get taken care
of.
Bruno and Matt
Hey,
looking into the API, I am wondering why we would need to add an
overload talking a `Named` parameter?
StreamsBuilder.addGlobalStore() (and .addGlobalTable()) already takes a
`Consumed` parameter that allows to set a name.
2.
I do not understand what you mean with "maximum flexibilit
Thanks for the thoughts Bruno!
> Do you mean a API to configure restoration instead of boolean flag
reprocessOnRestore?
Yes, this is exactly the type of thing I was musing (but I don't have any
concrete suggestions). It feels like that would give the flexibility to do
things like the motivation s
Hi Almog,
Do you mean a API to configure restoration instead of boolean flag
reprocessOnRestore?
Do you already have an idea?
The proposal in the KIP is focused on the processor that updates the
global state whereas in the case of GlobalKTable and source KTable the
issues lies in the deseri
Hi Walker,
I have follow-up comments.
1.
I think, we should add an overload to the StreamsBuilder class that
allows to name the processor with Named. That makes that processor
consistent with all other processors in the DSL regarding naming.
2.
I do not understand what you mean with "maximum
Hello Folk!
Glad to see improvements to the GlobalKTables in discussion! I think they
deserve more love :)
Scope creep alert (which I'm generally against and certainly still support
this KIP without but I want to see if there's an elegant way to address
both problems). The KIP mentions that "Now
Hey Bruno,
1) I'm actually not sure why that is in there. It certainly doesn't match
the convention. Best to remove it and match the other methods.
2) Yeah, I thought about it but I'm not convinced it is a necessary
restriction. It might be useful for the already defined processors but then
they
Hi Walker,
A couple of follow-up questions.
1.
Why do you propose to explicitly pass a parameter "storeName" in
StreamsBuilder#addGlobalStore?
The StoreBuilder should already provide a name for the store, if I
understand the code correctly.
I would avoid using the same name for the source node
Thanks for the feedback Bruno, Matthias, and Lucas!
There is a decent amount but I'm going to try and just hit the major points
as I would like to keep this change simple.
I've made corrections for the mistakes pointed out. Thanks for the
suggestions everyone.
The main sticking point seems to be
If the custom store is a key-value store, yes, we could do this. But the
interface does not enforce a key-value store, it's just a most generic
`StateStore` that we pass in, and thus it could be something totally
unknown to us, and we cannot apply a cast...
The underlying idea is really about
@Matthias:
Thanks, I didn't realize that we need processors for any custom store.
Are we sure we cannot build a generic processor to load data into a
custom key-value store? I'm not sure, but you know the code better
than me.
One other alternative is to allow the user to provide a state
transform
@Bruno:
(1), I think you are spot for the ts-extractor: on the restore code
path, we only support record-ts, but there is no need for a custom-ts
because for regular changelog topics KS sets the ts, and thus, the
optimization this KIP proposes required that the global topic follow the
changel
Hey Walker
Thanks for the KIP, and congrats on the KiBiKIP ;)
My main point is that I'd vote against introducing
`reprocessOnRestore`. The behavior for `reprocessOnRestore = false` is
just incorrect and should be removed or deprecated. If we think we
need to keep the old behavior around, renaming
Hi Walker,
Thanks for the KIP!
Great that you are going to fix this long-standing issue!
1.
I was wondering if we need the timestamp extractor as well as the key
and value deserializer in Topology#addGlobalStore() that do not take a
ProcessorSupplier? What about Consumed in StreamsBuilder#add
Thanks for the KIP Walker.
Fixing this issue, and providing users some flexibility to opt-in/out on
"restore reprocessing" is certainly a good improvement.
From an API design POV, I like the idea to not require passing in a
ProcessorSupplier to begin with. Given the current implementation of
Hello everybody,
I wanted to propose a change to our addGlobalStore methods so that the
restore behavior can be controlled on a preprocessor level. This should
help Kafka Stream users to better tune Global stores added with the
processor API to better fit their needs.
Details are in the kip here:
22 matches
Mail list logo