Hi all,

Like Almog's secretary as well! Just following up on that index:

A2: I'm also happy without introducing versioned KV in this KIP as I
would envision it to be introduced as new params into the
KeyValuePluginParams in the future. And just to clarify on Sophie's
previous comment, I think ListStore should not be exposed in this API
until we see it as a common usage and hence would want to (again, we
need to think very carefully since it would potentially ask all
implementers to adopt) move it from the internal category to the
public interface category. As for now, I think only having kv / window
/ session as public store types is fine.

A3: Seems I was not making myself very clear at the beginning :P The
major thing that I'd actually like to avoid having two configs
co-exist for the same function since it will be a confusing learning
curve for users, and hence what I was proposing is to just have the
newly introduced interface but not introducing a new config, and I
realized now that it is actually more aligned with the CUSTOM idea
where the ordering would be looking at config first, and then the
interface. I blushed as I read Almog likes it.. After thinking about
it twice, I'm now a bit leaning towards just deprecating the old
config with the new API+config as well.

A5: Among the names we have been discussed so far:

DslStorePlugin
StoreTypeSpec
StoreImplSpec
DslStoreSuppliers

I am in favor of DslStoreSuppliers as well as a restrictiveness on its
scope, just to echo Bruno's comments above.



Guozhang

On Thu, Jul 27, 2023 at 4:15 AM Bruno Cadonna <cado...@apache.org> wrote:
>
> Hi,
>
> A5. I have to admit that
> "If we envision extending this beyond just StoreSupplier types, it could
> be a good option."
> is scaring me a bit.
> I am wondering what would be an example for such an extension?
> In general, I would propose to limit the scope of a config. In this case
> the config should provide suppliers for state stores for the DSL.
>
> BTW, maybe it is a good idea to let DslStorePlugin extend Configurable.
>
> Best,
> Bruno
>
> On 7/27/23 2:15 AM, Sophie Blee-Goldman wrote:
> > Thanks for the feedback Bruno -- sounds like we're getting close to a final
> > consensus here.
> > It sounds like the two main (only?) semi-unresolved questions that still
> > have differing
> > opinions floating around are whether to deprecate the old config, and what
> > to name the new config
> > + interface.
> >
> > Although I won't personally push back on any of the options listed above,
> > here's my final two cents:
> >
> > A3. I'm still a firm believer in deprecating the old config, and agree
> > wholeheartedly with what Bruno said.
> >
> > A5. I also wasn't crazy about "Plugin" at first, but I will admit it's
> > grown on me. I think it rubbed me the wrong
> > way at  first because it's just not part of the standard vocabulary in
> > Streams so far. If we envision extending
> > this beyond just StoreSupplier types, it could be a good option.
> > DSLStoreSuppliers does make a lot of sense,
> > though.
> >
> > To throw out a few more ideas in case any of them stick, what about
> > something like DSLStoreFormat or
> > DSLStorageType, or even DSLStorageEngine? Or even DSLStoreFactory -- the
> > Stores class is described as
> > a "factory" (though not named so) and, to me, is actually quite comparable
> > -- both are providers not of the
> > stores themselves, but of the basic building blocks of Stores (eg
> > StoreSuppliers)
> >
> > Ultimately fine with anything though. We should try not to drag out the KIP
> > discussion too long once it's down
> > to just nits :P
> >
> > Cheers,
> > Sophie
> >
> >
> >
> > On Wed, Jul 26, 2023 at 8:04 AM Almog Gavra <almog.ga...@gmail.com> wrote:
> >
> >> Thanks for the comments Bruno!
> >>
> >> A3. Oops... I think I didn't do a great job updating the KIP to reflect
> >> Guozhang's suggestion. This seems like the last point of contention, where
> >> we have two options:
> >>
> >> 1. Deprecate the config entirely and replace IN_MEMORY/ROCKSDB with
> >> implementations of the DslStorePlugin
> >> 2. (What's currently in the KIP) Introduce a new config which defaults to
> >> DefaultDslStorePlugin and only the DefaultDslStorePlugin will respect the
> >> old default.store.type config
> >>
> >> I'm happy with either, I'll keep the KIP with (2) for now as that seemed
> >> like the result of the previous discussion but I have no problem changing
> >> it back to (1) which was the original proposal.
> >>
> >> A5. I like "DslStorePlugin" because it leaves room for configuring
> >> implementations beyond just supplying stores (e.g. we could introduce a
> >> `configure()` method etc...). I'll keep it as is for now (and change
> >> Materialized/Stores API sections - thanks for catching that)! I don't feel
> >> too strongly and wouldn't dig my heels in if most people preferred
> >> "DslStoreSuppliers" (I don't love DslStores as it resembles the Stores
> >> class to closely in name and they're a little different).
> >>
> >> A6. Yup, that's the suggestion.
> >>
> >> - Almog
> >>
> >> On Wed, Jul 26, 2023 at 6:38 AM Bruno Cadonna <cado...@apache.org> wrote:
> >>
> >>> Hi,
> >>>
> >>> Sorry for being late to the party!
> >>>
> >>> A1: I agree with Sophie, Guozhang, and Almog not to block the KIP on
> >>> gaps in the implementation.
> >>>
> >>> A2: I am happy with not considering anything special w.r.t. versioned
> >>> state stores in this KIP.
> >>>
> >>> A3: Here I agree with Sophie to deprecate the old config. I would also
> >>> not use config value CUSTOM. Having two configs that sometimes depend on
> >>> each other to configure one single concept seems confusing to me. I see
> >>> future me looking at default.dsl.store = IN_MEMORY and wondering why
> >>> something is written to disk because I did not check config
> >>> dsl.store.plugin.class?
> >>> BTW, the KIP in its current version is not clear about whether
> >>> default.dsl.store will be deprecated or not. In "Compatibility,
> >>> Deprecation, and Migration Plan" it says default.dsl.store will be
> >>> deprecated but in "Configuration" default.dsl.store seems to be an
> >>> essential part of the configuration.
> >>>
> >>> A4: I agree
> >>>
> >>> A5: I do not completely like the name "DslStorePlugin". What about
> >>> naming it simply "DslStores" or "DslStoreSuppliers"? If we decide to
> >>> rename we should also rename dsl.store.plugin.class to
> >>> dsl.store.suppliers.class or similar.
> >>> BTW, I think you missed to rename some occurrences in section
> >>> "Materialized API" especially in the code section "Stores.java".
> >>>
> >>> A6: Actually I am not sure if I completely follow here. Is this about
> >>> the static methods in class Stores? If yes, I agree with Almog to keep
> >>> this out of the KIP.
> >>>
> >>> Best,
> >>> Bruno
> >>>
> >>> On 7/26/23 5:20 AM, Almog Gavra wrote:
> >>>> I have updated the KIP with the points as discussed above. @Guozhang,
> >> the
> >>>> suggested configuration makes it a little more awkward around the
> >>>> Materialized.as and Materialized.withStoreType APIs than it was when we
> >>>> were totally deprecating the old configuration. Let me know what you
> >>> think.
> >>>>
> >>>> I will open the voting tomorrow! Thanks again everyone for the
> >>> discussion.
> >>>>
> >>>> Cheers,
> >>>> Almog
> >>>>
> >>>> On Tue, Jul 25, 2023 at 9:20 AM Almog Gavra <almog.ga...@gmail.com>
> >>> wrote:
> >>>>
> >>>>> Glad you like my KIP-secretary skills ;)
> >>>>>
> >>>>> A2. I'm definitely happy to take your suggestion here and not do
> >>> anything
> >>>>> special w.r.t. Versioned stores, I think it makes sense especially if
> >> we
> >>>>> consider them implementation details of a specific store type.
> >>>>>
> >>>>> At EOD I'll update the KIP with all of these changes and if the
> >>>>> discussion is silent I'll open a vote tomorrow morning.
> >>>>>
> >>>>> Cheers,
> >>>>> Almog
> >>>>>
> >>>>> On Mon, Jul 24, 2023 at 2:02 PM Sophie Blee-Goldman <
> >>>>> ableegold...@gmail.com> wrote:
> >>>>>
> >>>>>> Awesome summary (seriously) -- would you kindly offer your
> >>> organizational
> >>>>>> skills to every ongoing KIP from henceforth? We need you :P
> >>>>>>
> >>>>>> A few answers/comments:
> >>>>>>
> >>>>>> A2: I think there is a 3rd sub-option here, which is to leave
> >>>>>> versioned-ness out of this KIP entirely, return only the
> >> non-versioned
> >>>>>> stores for now, and then switch over to the versioned stores (only)
> >>> when
> >>>>>> the time comes to flip the switch on making them the default across
> >> the
> >>>>>> DSL. This has the advantage of retaining the current
> >> behavior/semantics
> >>>>>> and
> >>>>>> provides a clear way to transition smoothly in the future, since it
> >>> seems
> >>>>>> we will want to cut to all versioned state stores rather than offer
> >>> users
> >>>>>> a
> >>>>>> choice between versioned or non-versioned stores going forward
> >>> (similar to
> >>>>>> how we only offer timestamped stores presently, and have completely
> >>>>>> replaced non-timestamped stores in the DSL.) . In both the
> >> timestamped
> >>> and
> >>>>>> versioned cases, the old stores are/will still be available or
> >>> accessible
> >>>>>> to users via the bare StoreSuppliers, should they somehow desire or
> >>>>>> require
> >>>>>> the old store type. Ultimately, I think either this or option (1)
> >>> would be
> >>>>>> preferable, though I think it should be up to Matthias or anyone else
> >>>>>> involved in the versioned stores feature to decide which approach
> >> makes
> >>>>>> sense in the context of that feature's future plans.
> >>>>>>
> >>>>>> A3: sounds reasonable to me
> >>>>>>
> >>>>>> A5: Also sounds fine to me, though I'll let others chime in with/if
> >>> they
> >>>>>> have an alternative suggestion/preference. I guess the other
> >> contender
> >>>>>> would be something like DSLStoreImpl or something like that?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Jul 24, 2023 at 9:36 AM Almog Gavra <almog.ga...@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Lots of thoughts! Happy to see the thriving discussion on this post
> >> -
> >>>>>> lots
> >>>>>>> going on so I'm trying to enumerate them to keep things organized
> >>>>>> (prefix
> >>>>>>> "A" for "Almog" so we can use numbers in responses for other things
> >>> ;P).
> >>>>>>>
> >>>>>>> A1. Question around closing implementation gaps (e.g. no rocks based
> >>>>>>> suppression store)
> >>>>>>> A2. Specifically how to handle Versioned stores
> >>>>>>> A3. Configuration (new config/reuse old one + new one and ordering
> >> of
> >>>>>>> config resolution)
> >>>>>>> A4. Drawing a line between what is implementation detail (not
> >> exposed
> >>> in
> >>>>>>> API) and what is customizable (exposed in API)
> >>>>>>> A5. Naming of StoreTypeSpec
> >>>>>>> A6. Param classes in StoreBuilders
> >>>>>>>
> >>>>>>> ------------------------------
> >>>>>>>
> >>>>>>> Here are summaries for where it seems each of these stands (trying
> >> not
> >>>>>> to
> >>>>>>> add any additional opinion yet):
> >>>>>>>
> >>>>>>> A1. Sophie/Guozhang/Me (if I count hah!) seem to agree that it is
> >>> worth
> >>>>>>> pushing this KIP through independently of the implementation gaps as
> >>> it
> >>>>>>> doesn't seem to move the intermediate state further from the end
> >>> state.
> >>>>>>> Matthias originally had some concerns.
> >>>>>>>
> >>>>>>> A2. There's questions around whether versioned stores should be
> >> their
> >>>>>> own
> >>>>>>> configurable option or whether they are an implementation detail
> >> that
> >>>>>> the
> >>>>>>> StoreSpec should decide. It seems like the discussion is converging
> >>>>>> here,
> >>>>>>> this should be an implementation detail.
> >>>>>>>
> >>>>>>> A3. Matthias/Guozhang prefer adding CUSTOM and then having an
> >>> additional
> >>>>>>> config to determine the implementation. Sophie prefers deprecating
> >> the
> >>>>>> old
> >>>>>>> config. Guozhang additional suggested flipping the resolution order
> >>> such
> >>>>>>> that the old config is only respected in a DefaultStoreSpec
> >>>>>> implementation.
> >>>>>>>
> >>>>>>> A4. This KIP (or rather, the discussion on the KIP) blurs the lines
> >>>>>> between
> >>>>>>> top level store types (KV, windowed, session) and the underlying
> >>>>>>> implementation of them (timestamped, versioned, kv-list). It seems
> >>>>>> everyone
> >>>>>>> is in alignment to ensure that we keep these two things separate and
> >>>>>> that
> >>>>>>> the line is clearly delineated in the text of the KIP.
> >>>>>>>
> >>>>>>> A5. Guozhang and Sophie agree that the current name StoreTypeSpec is
> >>>>>>> misleading, as it's really an implementation spec, not a type
> >>>>>>> specification.
> >>>>>>>
> >>>>>>> A6. Agreement that this is an improvement, Sophie believes this can
> >> be
> >>>>>> done
> >>>>>>> in a follow up but we should ensure our naming is good here so
> >> there's
> >>>>>> no
> >>>>>>> conflicts down the line.
> >>>>>>>
> >>>>>>> ---------------------------------
> >>>>>>>
> >>>>>>> Ok, phew! Hopefully that covers it all! Now for my thoughts,
> >> hopefully
> >>>>>>> wrapping up some of these discussions:
> >>>>>>>
> >>>>>>> A1.  @Matthias - are you still hesitant here? What would you need to
> >>> be
> >>>>>>> convinced here?
> >>>>>>>
> >>>>>>> A2. Since we are all in agreement that versioned stores should be an
> >>>>>>> implementation detail, we have two options:
> >>>>>>>
> >>>>>>> (1) we can extend the KVParams to include a parameter that indicates
> >>>>>>> whether or not the store should be versioned
> >>>>>>> (2) we can introduce a configuration for whether or not to use a
> >>>>>> versioned
> >>>>>>> store, and each implementation can choose to read/ignore that config
> >>>>>>>
> >>>>>>> Any preferences? (1) would align more closely with what we are doing
> >>>>>> today
> >>>>>>> (they are a top-level concept in the Stores API).
> >>>>>>>
> >>>>>>> A3. I like Guozhang's suggestion of making the "primary"
> >> configuration
> >>>>>> to
> >>>>>>> be the new one, and then having a "DefaultStoreTypeSpec" (using the
> >>> old
> >>>>>>> naming) which respects the old configuration. That seems to solve
> >>> nearly
> >>>>>>> all the concerns (e.g. it'd be easy to see where the enum is used
> >>>>>> because
> >>>>>>> it would only be used within that one class instead of littered
> >>>>>> throughout
> >>>>>>> the code base).
> >>>>>>>
> >>>>>>> @Sophie, unless you have objections here I will update the KIP to do
> >>>>>> that.
> >>>>>>>
> >>>>>>> A4. I will make these changes to the KIP to make it clear.
> >>>>>>>
> >>>>>>> A5. I will rename it `DslStorePlugin` - any objections to this name?
> >>>>>>>
> >>>>>>> A6. Let's punt this ;) I agree with everyone that this would be a
> >>>>>> welcome
> >>>>>>> improvement and that this KIP is aligned with moving in that
> >>> direction.
> >>>>>>> Given how much discussion there was on this KIP, which is minor
> >>>>>> relative to
> >>>>>>> making the changes to StoreBuilder API, I'd rather not tie the two
> >>>>>>> together.
> >>>>>>>
> >>>>>>> Cheers & Thanks everyone for the thoughts!
> >>>>>>> - Almog
> >>>>>>>
> >>>>>>> On Sun, Jul 23, 2023 at 5:15 PM Sophie Blee-Goldman <
> >>>>>>> ableegold...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Guozhang:
> >>>>>>>>
> >>>>>>>> On your 2nd point:
> >>>>>>>>
> >>>>>>>>> "impl types" (in hindsight it may not be a good name) for rocksdb
> >> /
> >>>>>>>> memory / custom, and we used "store types" for kv / windowed /
> >>>>>> sessioned
> >>>>>>>> etc,
> >>>>>>>> First off, thanks so much for this clarification -- using "store
> >>> type"
> >>>>>>> here
> >>>>>>>> was definitely making me uncomfortable as this usually refers to KV
> >>> vs
> >>>>>>>> window, etc -- but I just couldn't for the life of me think of the
> >>>>>> right
> >>>>>>>> term for rocks vs IM. We should 100% change to something like
> >>>>>>> StoreImplSpec
> >>>>>>>> for this kind of interface.
> >>>>>>>>
> >>>>>>>>> As for list-value store (for stream-stream Join)
> >>>>>>>> Again, glad you mentioned this -- I forgot how the extra
> >>> stream-stream
> >>>>>>> join
> >>>>>>>> store is not a "regular" KV Store but rather this special
> >> list-value
> >>>>>>> store.
> >>>>>>>> If we proceed with something like the current approach, perhaps
> >> that
> >>>>>>> should
> >>>>>>>> be a boolean (or enum) parameter in the KVConfig, similar to the
> >>>>>>>> EmitStrategy? After all, the high-level goal of this KIP is to be
> >>>>>> able to
> >>>>>>>> fully customize all DSL state stores, and this is currently not
> >>>>>> possible
> >>>>>>>> due to KAFKA-14976 <
> >>> https://issues.apache.org/jira/browse/KAFKA-14976
> >>>>>>> .
> >>>>>>>>
> >>>>>>>> If we expect there to be further customizations like this going
> >>>>>> forward,
> >>>>>>>> perhaps we could instead have each of the three StoreConfig classes
> >>>>>>> accept
> >>>>>>>> a single enum parameter for the "sub-type" (or whatever you want to
> >>>>>> call
> >>>>>>>> it), which would encompass (and replace) things like the
> >> EmitStrategy
> >>>>>> as
> >>>>>>>> well as the list-value type (we could define one enum for each
> >> Config
> >>>>>>> class
> >>>>>>>> so there is no accidentally requesting a LIST_VALUE subtype on a
> >>>>>>>> WindowStore). Thoughts?
> >>>>>>>>
> >>>>>>>> Lastly, regarding 3.b:
> >>>>>>>>
> >>>>>>>> I love that you brought this up because that is actually what I
> >> first
> >>>>>>>> proposed to Almog, ie introducing a param class to clean up the
> >>>>>>>> StoreBuilder API, during our chat that led to this KIP. He pushed
> >>>>>> back,
> >>>>>>>> claiming (rightly so) that this change would be huge in scope for a
> >>>>>>> purely
> >>>>>>>> aesthetic/API change that doesn't add any functionality, and that
> >> it
> >>>>>>> makes
> >>>>>>>> more sense to start with the DSL config since there is a clear gap
> >> in
> >>>>>>>> functionality there, particularly when it comes to custom state
> >>> stores
> >>>>>>>> (reasons 1 & 3 in the Motivation section). He did agree that the
> >>> param
> >>>>>>>> classes were a better API, which is why you see them in this KIP.
> >> In
> >>>>>>> other
> >>>>>>>> words: I fully agree that we should apply this improvement to the
> >>>>>>>> PAPI/StoreBuilder interface as well, but I think that's a distinct
> >>>>>>> concept
> >>>>>>>> for the time-being and should not block the DSL improvement.
> >> Rather,
> >>> I
> >>>>>>>> consider this KIP as setting the stage for a followup KIP down the
> >>>>>> line
> >>>>>>> to
> >>>>>>>> clean up the StoreBuilders and bring them in line with the new
> >>>>>>> param/config
> >>>>>>>> class approach.
> >>>>>>>>
> >>>>>>>> That said,  A) we should definitely make sure whatever we introduce
> >>>>>> here
> >>>>>>>> can be extended to the PAPI StoreBuilders in a natural way, and B)
> >> I
> >>>>>>> should
> >>>>>>>> clarify that I personally would be happy to see this included in
> >> the
> >>>>>>>> current KIP, but as Almog's KIP it's up to him to decide whether
> >> he's
> >>>>>>>> comfortable expanding the scope like this. If you can convince him
> >>>>>> where
> >>>>>>> I
> >>>>>>>> could not, more power to you! :P
> >>>>>>>>
> >>>>>>>> On Sun, Jul 23, 2023 at 4:48 PM Sophie Blee-Goldman <
> >>>>>>>> ableegold...@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Matthias:
> >>>>>>>>>
> >>>>>>>>> I'm not sure I agree with (or maybe don't follow) this take:
> >>>>>>>>>>
> >>>>>>>>>> we need all kind of `StoreTypeSpec` implementations,
> >>>>>>>>>> and it might also imply that we need follow up KIPs for new
> >> feature
> >>>>>>>>>> (like in-memory versioned store) that might not need a KIP
> >>>>>> otherwise.
> >>>>>>>>>>
> >>>>>>>>> I see this feature as being a nice add-on/convenience API for any
> >>>>>> store
> >>>>>>>>> types which have a full DSL implementation. I don't think it's
> >>>>>>>> unreasonable
> >>>>>>>>> to just say that this feature is only going to be available for
> >>>>>> store
> >>>>>>>> types
> >>>>>>>>> that have KV, Window, and Session implementations. I can't think
> >> of
> >>>>>> any
> >>>>>>>>> case besides versioned stores where this would force a KIP for a
> >> new
> >>>>>>>>> feature that would not otherwise have to go through a KIP, and
> >> even
> >>>>>> for
> >>>>>>>>> versioned state stores, the only issue is that the KIP for that
> >> was
> >>>>>>>> already
> >>>>>>>>> accepted.
> >>>>>>>>>
> >>>>>>>>> However, I think I agree on your main point -- that things like
> >>>>>>> "regular"
> >>>>>>>>> vs timestamped vs versioned are/should be an implementation detail
> >>>>>>> that's
> >>>>>>>>> hidden from the user. As I noted previously, the current KIP
> >>>>>> actually
> >>>>>>>>> greatly improves the situation for timestamped stores, as this
> >>>>>> would be
> >>>>>>>>> handled completely transparently by the OOTB RocksDBStoreSpec. To
> >>>>>> me,
> >>>>>>>> this
> >>>>>>>>> provides a very natural way to let the DSL operators using the
> >>>>>> default
> >>>>>>>>> store type/spec to specify which kind of store (eg
> >>>>>>>>> versioned/timestamped/etc) it wants, and choose the correct
> >>>>>> default. If
> >>>>>>>> the
> >>>>>>>>> eventual intention is to have versioned state stores replace
> >>>>>>> timestamped
> >>>>>>>>> stores as the default in the DSL, then we can simply swap out the
> >>>>>>>> versioned
> >>>>>>>>> stores for the timestamped stores in the RocksDBStoreTypeSpec,
> >> when
> >>>>>>> that
> >>>>>>>>> time comes. Until then, users who want to use the versioned store
> >>>>>> will
> >>>>>>>> have
> >>>>>>>>> to do what they do today, which is individually override operators
> >>>>>> via
> >>>>>>>>> Materialized/StoreSuppliers.
> >>>>>>>>>
> >>>>>>>>> All in all, it sounds like we should not offer a versioned store
> >>>>>> type
> >>>>>>>>> spec, as "versioned" is more akin to "timestamped" than to a true
> >>>>>>>>> difference in underlying store implementation type (eg rocks vs
> >>>>>>>> in-memory).
> >>>>>>>>> W.r.t whether to deprecate the old config or introduce a new
> >> CUSTOM
> >>>>>>> enum
> >>>>>>>>> type, either seems fine to me, and we can go with that alternative
> >>>>>>>> instead.
> >>>>>>>>> The only other con to this approach that I can think of, and I'm
> >>>>>>> honestly
> >>>>>>>>> not sure if this is something users would care about or only devs,
> >>>>>> is
> >>>>>>>> that
> >>>>>>>>> the advantage to moving rocks and IM to the store type spec
> >>>>>> interface
> >>>>>>> is
> >>>>>>>>> that it helps to keep the relevant logic encapsulated in one easy
> >>>>>> place
> >>>>>>>> you
> >>>>>>>>> can quickly check to tell what kind of state store is used where.
> >> In
> >>>>>>> the
> >>>>>>>>> current code, I found it extremely annoying and difficult to track
> >>>>>> down
> >>>>>>>> all
> >>>>>>>>> usages of the StoreType enum to see which actual rocksdb store was
> >>>>>>> being
> >>>>>>>>> used where (for example some stores using the TimeOrderedBuffer
> >>>>>>> variants
> >>>>>>>> in
> >>>>>>>>> some special cases, or to understand whether the DSL was
> >> defaulting
> >>>>>> to
> >>>>>>>>> plain, timestamped, or versioned stores for RocksDB vs InMemory --
> >>>>>> both
> >>>>>>>> of
> >>>>>>>>> which seem like they could be of interest to a user). This would
> >> be
> >>>>>>> much
> >>>>>>>>> easier if everything was handled in one place, and you can just go
> >>>>>> to
> >>>>>>> the
> >>>>>>>>> (eg) RocksDBStoreTypeSpec and see what it's doing, or find usages
> >> of
> >>>>>>> the
> >>>>>>>>> methods to understand what stores are being handed to which DSL
> >>>>>>>> operators.
> >>>>>>>>>
> >>>>>>>>> I suppose we could still clean up the API and solve this problem
> >> by
> >>>>>>>> having
> >>>>>>>>> the old (and new) config delegate to a StoreTypeSpec no matter
> >> what,
> >>>>>>> but
> >>>>>>>>> make RocksDBStoreTypeSpec and InMemoryStoreTypeSpec internal
> >> classes
> >>>>>>> that
> >>>>>>>>> are simply implementation details of the ROCKSDB vs IN_MEMORY
> >> enums.
> >>>>>>>> WDYT?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Sun, Jul 23, 2023 at 11:14 AM Guozhang Wang <
> >>>>>>>> guozhang.wang...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Thanks everyone for the great discussions so far! I first saw the
> >>>>>> JIRA
> >>>>>>>>>> and left some quick thoughts without being aware of the
> >>>>>>>>>> already-written KIP (kudos to Almog, very great one) and the
> >>>>>> DISCUSS
> >>>>>>>>>> thread here. And I happily find some of my initial thoughts align
> >>>>>> with
> >>>>>>>>>> the KIP already :)
> >>>>>>>>>>
> >>>>>>>>>> Would like to add a bit more of my 2c after reading through the
> >> KIP
> >>>>>>>>>> and the thread here:
> >>>>>>>>>>
> >>>>>>>>>> 1. On the high level, I'm in favor of pushing this KIP through
> >>>>>> without
> >>>>>>>>>> waiting on the other gaps to be closed. In my back pocket's
> >>>>>>>>>> "dependency graph" of Kafka Streams roadmap of large changes or
> >>>>>>>>>> feature gaps, the edges of dependencies are defined based on my
> >>>>>>>>>> understanding of whether doing one first would largely
> >> complicate /
> >>>>>>>>>> negate the effort of the other but not vice versa, in which case
> >> we
> >>>>>>>>>> should consider getting the other done first. In this case, I
> >> feel
> >>>>>>>>>> such a dependency is not strong enough, so encouraging the KIP
> >>>>>>>>>> contributor to finish what he/she would love to do to close some
> >>>>>> gaps
> >>>>>>>>>> early would be higher priorities. I did not see by just doing
> >> this
> >>>>>> we
> >>>>>>>>>> could end up in a worse intermediate stage yet, but I could be
> >>>>>>>>>> corrected.
> >>>>>>>>>>
> >>>>>>>>>> 2. Regarding the store types --- gain here I'd like to just
> >> clarify
> >>>>>>>>>> the terms a bit since in the past it has some confusions: we used
> >>>>>>>>>> "impl types" (in hindsight it may not be a good name) for
> >> rocksdb /
> >>>>>>>>>> memory / custom, and we used "store types" for kv / windowed /
> >>>>>>>>>> sessioned etc, as I said in the JIRA I think the current proposal
> >>>>>> also
> >>>>>>>>>> have a good side effect as quality bar to really enforce us think
> >>>>>>>>>> twice when trying to add more store types in the future as it
> >> will
> >>>>>>>>>> impact API instantiations. In the ideal world, I would consider:
> >>>>>>>>>>
> >>>>>>>>>> * We have (timestamped) kv store, versioned kv store, window
> >> store,
> >>>>>>>>>> session store as first-class DSL store types. Some DSL operators
> >>>>>> could
> >>>>>>>>>> accept multiple store types (e.g. versioned and non versioned
> >>>>>>>>>> kv-store) for semantics / efficiency trade-offs. But I think we
> >>>>>> would
> >>>>>>>>>> remove un-timestamped kv stores eventually since that efficiency
> >>>>>>>>>> trade-off is so minimal compared to its usage limitations.
> >>>>>>>>>> * As for list-value store (for stream-stream Join),
> >>>>>> memory-lru-cache
> >>>>>>>>>> (for PAPI use only), memory-time-ordered-buffer (for
> >> suppression),
> >>>>>>>>>> they would not be exposed as DSL first-class store types in the
> >>>>>>>>>> future. Instead, they would be treated as internal used stores
> >>>>>> (e.g.
> >>>>>>>>>> list-value store is built on key-value store with specialized
> >> serde
> >>>>>>>>>> and putInternal), or continue to be just convenient OOTB PAPI
> >> used
> >>>>>>>>>> stores only.
> >>>>>>>>>> * As we move on, we will continue to be very, very strict on what
> >>>>>>>>>> would be added as DSL store types (and hence requires changes to
> >>>>>> the
> >>>>>>>>>> proposed APIs), what to be added as convenient OOTB PAPI store
> >>>>>> impls
> >>>>>>>>>> only, what to be added as internal used store types that should
> >>>>>> not be
> >>>>>>>>>> exposed to users nor customizable at all.
> >>>>>>>>>>
> >>>>>>>>>> 3. Some more detailed thoughts below:
> >>>>>>>>>>
> >>>>>>>>>> 3.a) I originally also think that we can extend the existing
> >>>>>> config,
> >>>>>>>>>> rather than replacing it. The difference was that I was thinking
> >>>>>> that
> >>>>>>>>>> order-wise, the runtime would look at the API first, and then the
> >>>>>>>>>> config, whereas in your rejected alternative it was looking at
> >> the
> >>>>>>>>>> config first, and then the API --- that I think is a minor thing
> >>>>>> and
> >>>>>>>>>> either is fine. I'm in agreement that having two configs would be
> >>>>>> more
> >>>>>>>>>> confusing to users to learn about their precedence rather than
> >>>>>>>>>> helpful, but if we keep both a config and a public API, then the
> >>>>>>>>>> precedence ordering would not be so confusing as long as we state
> >>>>>> them
> >>>>>>>>>> clearly. For example:
> >>>>>>>>>>
> >>>>>>>>>> * We have DefaultStoreTypeSpec OOTB, in that impl we look at the
> >>>>>>>>>> config only, and would only expect either ROCKS or MEMORY, and
> >>>>>> return
> >>>>>>>>>> corresponding OOTB store impls; if any other values configured,
> >> we
> >>>>>>>>>> error out.
> >>>>>>>>>> * Users extend that by having MyStoreTypeSpec, in which they
> >> could
> >>>>>> do
> >>>>>>>>>> arbituray things without respecting the config at all, but our
> >>>>>>>>>> recommended pattern in docs would still say "look into the
> >> config,
> >>>>>> if
> >>>>>>>>>> it is ROCKS or MEMORY just return fall back to
> >>>>>> DefaultStoreTypeSepc;
> >>>>>>>>>> otherwise if it's some String you recognize, then return your
> >>>>>>>>>> customized store based on the string value, otherwise error out".
> >>>>>>>>>>
> >>>>>>>>>> 3.b) About the struct-like Params classes, I like the idea a lot
> >>>>>> and
> >>>>>>>>>> wished we would pursue this in the first place, but if we only do
> >>>>>> this
> >>>>>>>>>> in Spec it would leave some inconsistencies with the
> >> StoreBuilders
> >>>>>>>>>> though arguably the latter is only for PAPI. I'm wondering if we
> >>>>>>>>>> should consider including the changes in StoreBuilders (e.g.
> >>>>>>>>>> WindowStoreBuilder(WindowSupplierParams)), and if yes, maybe we
> >>>>>> should
> >>>>>>>>>> also consider renaming that e.g. `WindowSupplierParams` to
> >>>>>>>>>> `WindowStoreSpecParams` too? For this one I only have a "weak
> >>>>>> feeling"
> >>>>>>>>>> so I can be convinced otherwise :P
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Guozhang
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Sun, Jul 23, 2023 at 9:52 AM Matthias J. Sax <
> >> mj...@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for all the input. My intention was not to block the KIP,
> >>>>>> but
> >>>>>>>>>>> just to take a step back and try get a holistic picture and
> >>>>>>>> discussion,
> >>>>>>>>>>> to explore if there are good/viable alternative designs. As said
> >>>>>>>>>>> originally, I really like to close this gap, and was always
> >> aware
> >>>>>>> that
> >>>>>>>>>>> the current config is not flexible enough.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I guess, my "concern" is that the KIP does increase the API
> >>>>>> surface
> >>>>>>>> area
> >>>>>>>>>>> significantly, as we need all kind of `StoreTypeSpec`
> >>>>>>> implementations,
> >>>>>>>>>>> and it might also imply that we need follow up KIPs for new
> >>>>>> feature
> >>>>>>>>>>> (like in-memory versioned store) that might not need a KIP
> >>>>>>> otherwise.
> >>>>>>>>>>>
> >>>>>>>>>>> The second question is if it might make the already "patchy"
> >>>>>>> situation
> >>>>>>>>>>> with regard to customization worse.
> >>>>>>>>>>>
> >>>>>>>>>>> We did de-scope the original KIP-591 for this reason, and given
> >>>>>> the
> >>>>>>>> new
> >>>>>>>>>>> situation of the DSL, it seems that it actually got worse
> >>>>>> compared
> >>>>>>> to
> >>>>>>>>>>> back in the days.
> >>>>>>>>>>>
> >>>>>>>>>>> Lastly, I hope to make the new versioned stores the default in
> >>>>>> the
> >>>>>>> DSL
> >>>>>>>>>>> and we did not do it in the previous KIP due to backward
> >>>>>>> compatibility
> >>>>>>>>>>> issues. Thus, from a DSL point of view, I believe there should
> >> be
> >>>>>>> only
> >>>>>>>>>>> "RocksDB", "InMemory", and "Custom" in an ideal world.
> >>>>>> Introducing
> >>>>>>> (I
> >>>>>>>> am
> >>>>>>>>>>> exaggerating to highlight my point) "KvRocksDbSpec",
> >>>>>>>>>>> "TimestampeKvRocksDbSpec", "VersionedRocksDbSpec", plus the
> >>>>>>>>>>> corresponding in-memory specs seems to head into the opposite
> >>>>>>>> direction.
> >>>>>>>>>>> -- My goal is to give users a handle of the _physical_ store
> >>>>>>> (RocksDB
> >>>>>>>> vs
> >>>>>>>>>>> InMemory vs Custom) but not the _logical_ stores (plain kv,
> >>>>>> ts-kv,
> >>>>>>>>>>> versioned) which is "dictated" by the DSL itself and should not
> >>>>>> be
> >>>>>>>>>>> customizable (we are just in a weird intermediate situation that
> >>>>>> we
> >>>>>>>> need
> >>>>>>>>>>> to clean up, but not "lean into" IMHO).
> >>>>>>>>>>>
> >>>>>>>>>>> Thus, I am also not sure if adding "VersionedRocksDbSpec" would
> >>>>>> be
> >>>>>>>> ideal
> >>>>>>>>>>> (also, given that it only changes a single store, but not the
> >> two
> >>>>>>>>>>> windowed stores)?
> >>>>>>>>>>>
> >>>>>>>>>>> Furthermore, I actually hope that we could use the new versioned
> >>>>>>> store
> >>>>>>>>>>> to replace the window- and sessions- stores, and thus to
> >> decrease
> >>>>>>> the
> >>>>>>>>>>> number of required store types.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Admittedly, I am talking a lot about a potential future, but the
> >>>>>>> goal
> >>>>>>>> is
> >>>>>>>>>>> only to explore opportunities to not get into "worse"
> >>>>>> intermediate
> >>>>>>>>>>> state, that will require a huge deprecation surface area later
> >>>>>> on.
> >>>>>>> Of
> >>>>>>>>>>> course, if there is no better way, and my concerns are not
> >>>>>> shared, I
> >>>>>>>> am
> >>>>>>>>>>> ok to move forward with the KIP.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Bottom line: I would personally prefer to keep the current
> >> config
> >>>>>>> and
> >>>>>>>>>>> add a `Custom` option to it, plus adding one new config that
> >>>>>> allows
> >>>>>>>>>>> people to set their custom `StoreTypeSpec` class. -- I would not
> >>>>>>> add a
> >>>>>>>>>>> built-in spec for versioned stores at this point (or any other
> >>>>>>>> built-in
> >>>>>>>>>>> `StorytypeSpec` implementations). I guess, users could write a
> >>>>>>> custom
> >>>>>>>>>>> spec if they want to enable versioned store across the board for
> >>>>>> now
> >>>>>>>>>>> (until we make them the default anyway)?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Hope my train of thoughts is halfway reasonable and not totally
> >>>>>> off
> >>>>>>>>>> track?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> -Matthias
> >>>>>>>>>>>
> >>>>>>>>>>> On 7/21/23 15:27, Sophie Blee-Goldman wrote:
> >>>>>>>>>>>> I agree with everything Almog said above, and will just add on
> >>>>>> to
> >>>>>>>> two
> >>>>>>>>>>>> points:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1. Regarding whether to block this KIP on the completion of
> >>>>>> any or
> >>>>>>>> all
> >>>>>>>>>>>> future implementations of in-memory version stores (or persist
> >>>>>>>>>> suppression
> >>>>>>>>>>>> buffers), I feel that would be unfair to this feature which is
> >>>>>>>>>> completely
> >>>>>>>>>>>> unrelated to the semantic improvements offered by versioned
> >>>>>> state
> >>>>>>>>>> stores.
> >>>>>>>>>>>> It seems like the responsibility of those driving the versioned
> >>>>>>>> state
> >>>>>>>>>>>> stores feature, not Almog/this KIP, to make sure that those
> >>>>>> bases
> >>>>>>>> are
> >>>>>>>>>>>> covered. Further, if anything, this KIP will help with the
> >>>>>> massive
> >>>>>>>>>>>> proliferation of StoreSuppliers on the Stores factory class,
> >>>>>> and
> >>>>>>>>>> provide
> >>>>>>>>>>>> users with a much easier way to leverage the versioned stores
> >>>>>>>> without
> >>>>>>>>>>>> having to muck around directly with the StoreSuppliers.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I also thought about it a bit, and really like Almog's
> >>>>>> suggestion
> >>>>>>> to
> >>>>>>>>>>>> introduce an additional StoreSpec for the Versioned state
> >>>>>> stores.
> >>>>>>>>>> Obviously
> >>>>>>>>>>>> we can add the RocksDB one to this KIP now, and then as he
> >>>>>>>> mentioned,
> >>>>>>>>>>>> there's an easy way to get users onto the IMVersionedStateStore
> >>>>>>>> types
> >>>>>>>>>> once
> >>>>>>>>>>>> they are completed.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Lastly, on this note, I want to point out how smoothly this
> >>>>>> solved
> >>>>>>>> the
> >>>>>>>>>>>> issue of timestamped stores, which are intended to be the DSL
> >>>>>>>> default
> >>>>>>>>>> but
> >>>>>>>>>>>> are only a special case for RocksDB. Right now it can be
> >>>>>> confusing
> >>>>>>>>>> for a
> >>>>>>>>>>>> user scrolling through the endless Stores class and seeing a
> >>>>>>>>>> timestamped
> >>>>>>>>>>>> version of the persistent but not in-memory stores. One could
> >>>>>>> easily
> >>>>>>>>>> assume
> >>>>>>>>>>>> there was no timestamped option for IM stores and that this
> >>>>>>> feature
> >>>>>>>>>> was
> >>>>>>>>>>>> incomplete, if they weren't acutely aware of the internal
> >>>>>>>>>> implementation
> >>>>>>>>>>>> details (namely that it's only required for RocksDB for
> >>>>>>>> compatibility
> >>>>>>>>>>>> reasons). However, with this KIP, all that is handled
> >>>>>> completely
> >>>>>>>>>>>> transparently to the user, and we the devs, who *are* aware of
> >>>>>> the
> >>>>>>>>>> internal
> >>>>>>>>>>>> implementation details, are now appropriately the ones
> >>>>>> responsible
> >>>>>>>> for
> >>>>>>>>>>>> handing the correct store type to a particular operator. While
> >>>>>>>>>> versioned
> >>>>>>>>>>>> state stores may not be completely comparable, depending on
> >>>>>>> whether
> >>>>>>>>>> we want
> >>>>>>>>>>>> users to remain able to easily choose between using them or not
> >>>>>>> (vs
> >>>>>>>>>>>> timestamped which should be used by all), I still feel this KIP
> >>>>>>> is a
> >>>>>>>>>> great
> >>>>>>>>>>>> step in the right direction that not only should not be
> >>>>>> blocked on
> >>>>>>>> the
> >>>>>>>>>>>> completion of the IM implementations, but in fact should
> >>>>>>>> specifically
> >>>>>>>>>> be
> >>>>>>>>>>>> done first as it enables an easier way to utilize those IM
> >>>>>>> versioned
> >>>>>>>>>>>> stores. Just my 2 cents :)
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2. The idea to expand the existing the config with a CUSTOM
> >>>>>> enum
> >>>>>>>>>> without
> >>>>>>>>>>>> introducing another config to specify the CUSTOM store spec
> >>>>>> does
> >>>>>>> not
> >>>>>>>>>> seem
> >>>>>>>>>>>> appropriate, or  even possible (for the reasons Almog mentioned
> >>>>>>>> above
> >>>>>>>>>> about
> >>>>>>>>>>>> config types, though perhaps there is a way I'm not seeing). I
> >>>>>> do
> >>>>>>>> not
> >>>>>>>>>> buy
> >>>>>>>>>>>> the argument that we should optimize the API to make it easy
> >>>>>> for
> >>>>>>>>>> users who
> >>>>>>>>>>>> just want to switch to all in-memory stores, as I truly believe
> >>>>>>> this
> >>>>>>>>>> is a
> >>>>>>>>>>>> very small fraction of the potential userbase of this feature
> >>>>>>>> (anyone
> >>>>>>>>>> who's
> >>>>>>>>>>>> actually using this should please chime in!). It seems very
> >>>>>> likely
> >>>>>>>>>> that the
> >>>>>>>>>>>> majority of users of this feature are actually those with
> >>>>>> custom
> >>>>>>>> state
> >>>>>>>>>>>> stores, as to my knowledge, this has been the case any/every
> >>>>>> time
> >>>>>>>> this
> >>>>>>>>>>>> feature was requested by users.
> >>>>>>>>>>>>
> >>>>>>>>>>>> That said, while I don't see any way to get around introducing
> >>>>>> a
> >>>>>>> new
> >>>>>>>>>>>> config, I don't personally have a preference w.r.t whether to
> >>>>>> keep
> >>>>>>>>>> around
> >>>>>>>>>>>> the old config or deprecate it. I simply don't get the
> >>>>>> impression
> >>>>>>> it
> >>>>>>>>>> is
> >>>>>>>>>>>> very heavily used as it stands today, while it only works for
> >>>>>>> those
> >>>>>>>>>> who
> >>>>>>>>>>>> want all in-memory stores. Again, input from actual users
> >>>>>> would be
> >>>>>>>>>> very
> >>>>>>>>>>>> valuable here. In the absence of that data, I will just point
> >>>>>> to
> >>>>>>> the
> >>>>>>>>>> fact
> >>>>>>>>>>>> that this KIP was proposed by a Streams dev (you :P),
> >>>>>> abandoned,
> >>>>>>>>>> picked up
> >>>>>>>>>>>> by another Streams dev, and finally implemented without ever
> >>>>>>> hearing
> >>>>>>>>>> from a
> >>>>>>>>>>>> user that they would find this useful. This is not to disparage
> >>>>>>> the
> >>>>>>>>>>>> original KIP, but just to say again, as I stated back then, it
> >>>>>>>> seemed
> >>>>>>>>>> like
> >>>>>>>>>>>> a major opportunity loss to leave out custom state stores
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> Sophie
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Jul 21, 2023 at 1:52 PM Almog Gavra <
> >>>>>>> almog.ga...@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for all the feedback folk! Responses inline.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Basically, I'm suggesting two things: first, call out in some
> >>>>>>> way
> >>>>>>>>>>>>> (perhaps the StoreTypeSpec javadocs) that each StoreTypeSpec
> >>>>>> is
> >>>>>>>>>> considered
> >>>>>>>>>>>>> a public contract in itself and should outline any semantic
> >>>>>>>>>> guarantees it
> >>>>>>>>>>>>> does, or does not, make. Second, we should add a note on
> >>>>>> ordering
> >>>>>>>>>>>>> guarantees in the two OOTB specs: for RocksDB we assert that
> >>>>>>> range
> >>>>>>>>>> queries
> >>>>>>>>>>>>> will honor serialized byte ordering, whereas the InMemory
> >>>>>> flavor
> >>>>>>>>>> gives no
> >>>>>>>>>>>>> ordering guarantee whatsoever at this time.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> That makes sense to me Sophie! I'll make the changes to the
> >>>>>> KIP.
> >>>>>>>> And
> >>>>>>>>>> @Colt,
> >>>>>>>>>>>>> yes I believe that would be the new javadoc for the generic
> >>>>>>>>>>>>> ReadOnlyKeyValueStore.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> However, I am wondering if we should close others gaps first?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> @Matthias, thanks for the review and thoughts! I think we
> >>>>>> should
> >>>>>>>>>> separate
> >>>>>>>>>>>>> closing other gaps in the product from providing this as
> >>>>>> useful
> >>>>>>>>>>>>> functionality to avoid feature creep so long as the API
> >>>>>> proposed
> >>>>>>>>>> here will
> >>>>>>>>>>>>> be suitable for when we want to close those implementation
> >>>>>> gaps!
> >>>>>>> My
> >>>>>>>>>> general
> >>>>>>>>>>>>> proposal is that for things that are not customizable today by
> >>>>>>>>>>>>> default.dsl.store they remain not customizable after this KIP.
> >>>>>>> The
> >>>>>>>>>> good
> >>>>>>>>>>>>> news is, however, that there's no reason why this cannot be
> >>>>>>>> extended
> >>>>>>>>>> to
> >>>>>>>>>>>>> cover those in the future if we want to - see specifics below.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Comments on the specifics below
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> In particular, this holds for the new versioned-store ...
> >>>>>> Should
> >>>>>>>>>>>>> versioned stores also be covered by the KIP
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Is there a reason why we can't introduce a
> >>>>>>>>>> VersionedRocksDBStoreTypeSpec
> >>>>>>>>>>>>> and if we ever support an in-memory an equivalent
> >>>>>>>>>>>>> VersionedInMemoryRocksDBStoreTypeSpec? If so, then there would
> >>>>>>> not
> >>>>>>>>>> need to
> >>>>>>>>>>>>> be any additional changes to the API proposed in this KIP.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> For `suppress()` it's actually other way around we only have
> >>>>>> an
> >>>>>>>>>> in-memory
> >>>>>>>>>>>>> implementation. Do you aim to allow custom stores for
> >>>>>>> `suppress()`,
> >>>>>>>>>> too?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We have three options here:
> >>>>>>>>>>>>> 1) we can decide to maintain existing behavior and use the
> >>>>>>>> in-memory
> >>>>>>>>>>>>> implementation for all stores (not even going through the API
> >>>>>> at
> >>>>>>>> all)
> >>>>>>>>>>>>> 2a) we can introduce a new parameter to the KeyValueParams
> >>>>>> class
> >>>>>>>>>> (boolean
> >>>>>>>>>>>>> isTimeOrderedBuffer or something like that) and return an
> >>>>>>> in-memory
> >>>>>>>>>> store
> >>>>>>>>>>>>> in the implementation of RocksDBStoreTypeSpec (this maintains
> >>>>>> the
> >>>>>>>>>> existing
> >>>>>>>>>>>>> behavior, and would allow us in the future to make the change
> >>>>>> to
> >>>>>>>>>> return a
> >>>>>>>>>>>>> RocksDB store if we ever provide one)
> >>>>>>>>>>>>> 2b) same as 2a but we throw an exception if the requested
> >>>>>> store
> >>>>>>>> type
> >>>>>>>>>> does
> >>>>>>>>>>>>> not support that (this is backwards incompatible, and since
> >>>>>>>> ROCKS_DB
> >>>>>>>>>> is the
> >>>>>>>>>>>>> default we probably shouldn't do this)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> My proposal for now is 1) because as of KIP-825
> >>>>>>>>>>>>> EmitStrategy#ON_WINDOW_CLOSE is the preferred way of
> >>>>>> suppressing
> >>>>>>>> and
> >>>>>>>>>> that
> >>>>>>>>>>>>> is accounted for in this API already.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Last, I am not sure if the new parameter replacing the
> >>>>>> existing
> >>>>>>>> one
> >>>>>>>>>> is
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>> best way to go?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm happy either way, just let me know which you prefer - the
> >>>>>>>>>> discussion
> >>>>>>>>>>>>> around CUSTOM is in the rejected alternatives but I'm happy to
> >>>>>>>>>> differ to
> >>>>>>>>>>>>> whatever the project conventions are :)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> If it's matches existing `ROCKS_DB` or `IN_MEMORY` we just
> >>>>>>> process
> >>>>>>>>>> it as
> >>>>>>>>>>>>> we
> >>>>>>>>>>>>> do know, and if know we assume it's a fully qualified class
> >>>>>> name
> >>>>>>>> and
> >>>>>>>>>> try to
> >>>>>>>>>>>>> instantiate it?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Note that there is no functionality for this kind of thing in
> >>>>>>>>>>>>> AbstractConfig (it's either a String validated enum or a
> >>>>>> class)
> >>>>>>> so
> >>>>>>>>>> this
> >>>>>>>>>>>>> would be a departure from convention. Again, I'm happy to
> >>>>>>> implement
> >>>>>>>>>> that if
> >>>>>>>>>>>>> it's preferred.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Also wondering how it would related to the existing `Stores`
> >>>>>>>>>> factory?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> StoreTypeSpec will depend on Stores factory - they're one
> >>>>>> layer
> >>>>>>>>>> removed.
> >>>>>>>>>>>>> You can imagine that StoreTypeSpec is just a grouping of
> >>>>>> methods
> >>>>>>>>>> from the
> >>>>>>>>>>>>> Stores factory into a convenient package for default
> >>>>>>> configuration
> >>>>>>>>>>>>> purposes.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks again for all the detailed thoughts Matthias!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Fri, Jul 21, 2023 at 11:50 AM Matthias J. Sax <
> >>>>>>> mj...@apache.org
> >>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for the KIP. Overall I like the idea to close this
> >>>>>> gap.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> However, I am wondering if we should close others gaps
> >>>>>> first? In
> >>>>>>>>>>>>>> particular, IIRC, we have a few cases for which we only have
> >>>>>> a
> >>>>>>>>>> RocksDB
> >>>>>>>>>>>>>> implementation for a store, and thus, adding an in-memory
> >>>>>>> version
> >>>>>>>>>> for
> >>>>>>>>>>>>>> these stores first, to make the current `IN_MEMORY` parameter
> >>>>>>>> work,
> >>>>>>>>>>>>>> might be the first step?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In particular, this holds for the new versioned-store (but I
> >>>>>>>>>> actually
> >>>>>>>>>>>>>> believe the is some other internal store with no in-memory
> >>>>>>>>>>>>>> implementation). -- For `suppress()` it's actually other way
> >>>>>>>> around
> >>>>>>>>>> we
> >>>>>>>>>>>>>> we only have an in-memory implementation. Do you aim to allow
> >>>>>>>> custom
> >>>>>>>>>>>>>> stores for `suppress()`, too?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Btw: Should versioned stores also be covered by the KIP (ie,
> >>>>>>>>>>>>>> `StoreTypeSpec`)? We did consider to add a new option
> >>>>>>> `VERSIONED`
> >>>>>>>>>> to the
> >>>>>>>>>>>>>> existing `default.dsl.store` config, but opted out for
> >>>>>> various
> >>>>>>>>>> reasons.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Last, I am not sure if the new parameter replacing the
> >>>>>> existing
> >>>>>>>> one
> >>>>>>>>>> is
> >>>>>>>>>>>>>> the best way to go? Did you put the idea to add `CUSTOM` to
> >>>>>> the
> >>>>>>>>>> existing
> >>>>>>>>>>>>>> config into rejected alternative. Personally, I would prefer
> >>>>>> to
> >>>>>>>> add
> >>>>>>>>>>>>>> `CUSTOM` as I would like to optimize to easy of use for the
> >>>>>>>>>> majority of
> >>>>>>>>>>>>>> users (which don't implement a custom store), but only
> >>>>>> switch to
> >>>>>>>>>>>>>> in-memory sometimes. -- As an alternative, you would also
> >>>>>> just
> >>>>>>>>>> extend
> >>>>>>>>>>>>>> `dsl.default.store` (it's just a String) and allow to pass in
> >>>>>>>>>> anything.
> >>>>>>>>>>>>>> If it's matches existing `ROCKS_DB` or `IN_MEMORY` we just
> >>>>>>> process
> >>>>>>>>>> it as
> >>>>>>>>>>>>>> we do know, and if know we assume it's a fully qualified
> >>>>>> class
> >>>>>>>> name
> >>>>>>>>>> and
> >>>>>>>>>>>>>> try to instantiate it? -- Given that we plan to keep the
> >>>>>>>>>> store-enum, is
> >>>>>>>>>>>>>> seems cleaner to keep the existing config and keep both the
> >>>>>>> config
> >>>>>>>>>> and
> >>>>>>>>>>>>>> enum aligned to each other?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> It's just preliminary thought. I will need to go back an
> >>>>>> take a
> >>>>>>>> more
> >>>>>>>>>>>>>> detailed look into the code to grok how the propose
> >>>>>>>> `StoreTypeSpec`
> >>>>>>>>>>>>>> falls into place. Also wondering how it would related to the
> >>>>>>>>>> existing
> >>>>>>>>>>>>>> `Stores` factory?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 7/21/23 6:45 AM, Colt McNealy wrote:
> >>>>>>>>>>>>>>> Sophie—
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks for chiming in here. +1 to the idea of specifying the
> >>>>>>>>>> ordering
> >>>>>>>>>>>>>>> guarantees that we make in the StorageTypeSpec javadocs.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Quick question then. Is the javadoc that says:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Order is not guaranteed as bytes lexicographical ordering
> >>>>>>> might
> >>>>>>>>>> not
> >>>>>>>>>>>>>>> represent key order.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> no longer correct, and should say:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Order guarantees depend on the underlying implementation of
> >>>>>>> the
> >>>>>>>>>>>>>>> ReadOnlyKeyValueStore. For more information, please consult
> >>>>>> the
> >>>>>>>>>>>>>>> [StorageTypeSpec javadocs](....)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> Colt McNealy
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> *Founder, LittleHorse.dev*
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Thu, Jul 20, 2023 at 9:28 PM Sophie Blee-Goldman <
> >>>>>>>>>>>>>> ableegold...@gmail.com>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hey Almog, first off, thanks for the KIP! I (and others)
> >>>>>>> raised
> >>>>>>>>>>>>> concerns
> >>>>>>>>>>>>>>>> over how restrictive the default.dsl.store config would be
> >>>>>> if
> >>>>>>>> not
> >>>>>>>>>>>>>>>> extendable to custom store types, especially given that
> >>>>>> this
> >>>>>>>>>> seems to
> >>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>> the primary userbase of such a feature. At the time we
> >>>>>> didn't
> >>>>>>>>>> really
> >>>>>>>>>>>>>> have
> >>>>>>>>>>>>>>>> any better ideas for a clean way to achieve that, but what
> >>>>>> you
> >>>>>>>>>>>>> proposed
> >>>>>>>>>>>>>>>> makes a lot of sense to me. Happy to see a good solution to
> >>>>>>>> this,
> >>>>>>>>>> and
> >>>>>>>>>>>>>>>> hopefully others will share my satisfaction :P
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I did have one quick piece of feedback which arose from an
> >>>>>>>>>> unrelated
> >>>>>>>>>>>>>>>> question posed to the dev mailing list w/ subject line
> >>>>>>>>>>>>>>>> "ReadOnlyKeyValueStore#range()
> >>>>>>>>>>>>>>>> Semantics"
> >>>>>>>>>>>>>>>> <
> >>>>>>>> https://lists.apache.org/thread/jbckmth8d3mcgg0rd670cpvsgwzqlwrm>.
> >>>>>>>>>> I
> >>>>>>>>>>>>>>>> recommend checking out the full thread for context, but it
> >>>>>>> made
> >>>>>>>> me
> >>>>>>>>>>>>> think
> >>>>>>>>>>>>>>>> about how we can leverage the new StoreTypeSpec concept as
> >>>>>> an
> >>>>>>>>>> answer
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> long-standing question in Streams: where can we put
> >>>>>> guarantees
> >>>>>>>> of
> >>>>>>>>>> the
> >>>>>>>>>>>>>>>> public contract for RocksDB (or other store
> >>>>>> implementations)
> >>>>>>>> when
> >>>>>>>>>> all
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> RocksDB stuff is technically internal.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Basically, I'm suggesting two things: first, call out in
> >>>>>> some
> >>>>>>>> way
> >>>>>>>>>>>>>> (perhaps
> >>>>>>>>>>>>>>>> the StoreTypeSpec javadocs) that each StoreTypeSpec is
> >>>>>>>> considered
> >>>>>>>>>> a
> >>>>>>>>>>>>>> public
> >>>>>>>>>>>>>>>> contract in itself and should outline any semantic
> >>>>>> guarantees
> >>>>>>> it
> >>>>>>>>>> does,
> >>>>>>>>>>>>>> or
> >>>>>>>>>>>>>>>> does not, make. Second, we should add a note on ordering
> >>>>>>>>>> guarantees in
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> two OOTB specs: for RocksDB we assert that range queries
> >>>>>> will
> >>>>>>>>>> honor
> >>>>>>>>>>>>>>>> serialized byte ordering, whereas the InMemory flavor
> >>>>>> gives no
> >>>>>>>>>>>>> ordering
> >>>>>>>>>>>>>>>> guarantee whatsoever at this time.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thoughts?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> -Sophie
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Thu, Jul 20, 2023 at 4:28 PM Almog Gavra <
> >>>>>>>>>> almog.ga...@gmail.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi All,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I would like to propose a KIP to expand support for
> >>>>>> default
> >>>>>>>> store
> >>>>>>>>>>>>> types
> >>>>>>>>>>>>>>>>> (KIP-591) to encompass custom store implementations:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-954%3A+expand+default+DSL+store+configuration+to+custom+types
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Looking forward to your feedback!
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>> Almog
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >

Reply via email to