Hi Patrick, John, Thanks for your explanation and update. It looks better now.
+1 (non-binding) from me. just one minor comment: 10. In the example section, we are still using the variable names `lower` and `upper` as before. We should update them, too. (follow the decision for point 7 above) Thank you. Luke On Thu, Dec 16, 2021 at 3:34 AM Guozhang Wang <wangg...@gmail.com> wrote: > Thanks Patrick, John. > > I read through the updated KIP and it's much cleared on the scope now, > appreciate that! I'm +1 overall, modulo a few minor comments: > > 7. The parameter names of `WindowKeyQuery#withKeyAndWindowStartRange`, i.e. > `startTime` and `endTime` seem not updated; also for the parameter names of > `WindowRangeQuery#withWindowStartRange`, the `earliest / latest` seems > inconsistent with the existing API parameter names, how about naming both > of them `fromWindowStartTime` and `toWindowStartTime`? > > 8. `getStartTime / getEndTime` are not updated as well: should they be > `getEarliestWindowStartTime` / `getLatestWindowStartTime` or `getFrom... > getTo...` if you agree with 7) above? > > 9. "One reason we cannot use WindowKeyQuery to support > SessionStore.fetch(key)": not sure I understand this part, for the new APIs > we do not need to be following the old API's return types since they are > separated, right? So if we think it's actually better for the new API > mimicing `sessionStore.fetch(key)` to return a `WindowStoreIterator<V>`, > why can't we just do that? > > > Guozhang > > On Wed, Dec 15, 2021 at 8:49 AM John Roesler <vvcep...@apache.org> wrote: > > > FYI, I filed this follow-on task to explore a more general > > pattern for these queries: > > https://issues.apache.org/jira/browse/KAFKA-13548 > > > > We can unblock the current scope for these queries but still > > plan to revisit the API before the first release of IQv2. > > > > Thanks! > > -John > > > > On Wed, 2021-12-15 at 10:34 -0600, John Roesler wrote: > > > Thanks for the update, Patrick! > > > > > > Tl;dr: I'm +1 (binding) > > > > > > I just reviewed the KIP again (I hope you don't mind, I > > > fixed a couple of missed renames in the text and examples). > > > > > > One of the design of IQv2 is to make proposing and evolving > > > these queries much less onerous. Unlike adding new methods > > > to all StateStore interfaces and implementations, if we want > > > to make (eg) the WindowRangeQuery more flexible in the > > > future, we can easily do so by just adding some builder > > > methods to set the bounds independently, or by adding new > > > static methods to provide different parameterizations. > > > > > > Therefore, even though this is not the ultimate expression > > > of what we think the range queries should be, it's a > > > perfectly fine starting point. The most pressing concerns to > > > me were the cases where Luke and Guozhang pointed out that > > > some parts of the proposal or interfaces were ambiguous, > > > which looks like it's fixed now. > > > > > > My preference would be to go ahead and accept this proposed > > > scope so that we have at least some basic key-value, window, > > > and session store queries in the code base. Until we have > > > those MVP queries in place, we can't really start to address > > > higher-level follow-up items like: > > > https://issues.apache.org/jira/browse/KAFKA-13541 (to refine > > > how the serdes interplay with the queries) or > > > https://issues.apache.org/jira/browse/KAFKA-13541 (to > > > consider alternative ways to pin down the generic type > > > parameters better). > > > > > > Anyway, for the current version of the KIP, my vote is +1 > > > (binding). > > > > > > Thanks again! > > > -John > > > > > > On Wed, 2021-12-15 at 12:53 +0100, Patrick Stuedi wrote: > > > > Thanks everyone for the sharing comments and suggestions, this is > > > > very helpful! > > > > > > > > I have updated the KIP based on the suggestions, please let me know > > what > > > > you think. Here are the main changes: > > > > - Removed constructor in WindowRangeQuery (Guozhang) > > > > - Clearly specified how each of the instantiation of the different > > query > > > > types maps to specific operations in WindowStore and SessionStore > > (Guozhang) > > > > - Renamed the window start end time parameters and getter methods > > (Luke) > > > > - Added some comments on the use of WindowRangeQuery.fromKey > (Guozhang, > > > > John) > > > > > > > > The KIP currently takes a minimalist approach to move things forward. > > There > > > > are two follow ups I can think of from this KIP: > > > > * Add additional parameter combinations to WindowRangeQuery, e.g., > > support > > > > for key range with and without window ranges. > > > > * Remove WindowRangeQuery.fromKey and either use > > > > WindowKeyQuery.withKeyAndWindowBounds or add a new call like > > > > WindowKeyQuery.withKeyAndNoWindowBounds(): either way that would then > > raise > > > > the question which operation in WindowStore this query should map to, > > given > > > > that WindowKeyQuery is templated against WindowStoreIterator and the > > > > current use of WindowRangeQuery.fromKey is to call SessionStore.fetch > > which > > > > returns a KeyValueIterator. > > > > * Combine WindowKeyQuery and WindowRangeQuery, e.g., by exposing the > > > > template type > > > > * Create a "Builder" interface for the query types to avoid blowing > up > > the > > > > static methods. > > > > > > > > Since I think any of those tasks will require some more discussion, > > and in > > > > the interest of being pragmatic here, I suggest we defer those > > > > optimizations to future KIPs. > > > > > > > > Best, > > > > Patrick > > > > > > > > On Tue, Dec 14, 2021 at 1:02 AM Guozhang Wang <wangg...@gmail.com> > > wrote: > > > > > > > > > Hi John, > > > > > > > > > > Please see my follow-up comments inlined below. > > > > > > > > > > On Mon, Dec 13, 2021 at 2:26 PM John Roesler <vvcep...@apache.org> > > wrote: > > > > > > > > > > > Hi Patrick, thanks for the KIP! > > > > > > > > > > > > I hope you, Guozhang, and Luke don't mind if I share some > > > > > > thoughts: > > > > > > > > > > > > 1. I think you just meant to remove that private constructor > > > > > > from the KIP, right? > > > > > > > > > > > > > > > > Yes. > > > > > > > > > > > > > > > > > > > > > > 2. I think WindowRangeQuery#withWindowRage(windowLower, > > > > > > windowUpper) is the version that yields an iterator over > > > > > > both keys and windows, so it's not totally redundant. > > > > > > > > > > > > However, it does seem like WindowRangeQuery#withKey is > > > > > > equivalent to WindowKeyQuery#withKey . I overlooked it > > > > > > before, but we probably don't need both. > > > > > > > > > > > > Stepping back, it seems like we could either just de- > > > > > > duplicate that specific withKey query or we could even just > > > > > > merge both use cases into the WindowRangeQuery. I've > > > > > > personally never been convinced that the extra complexity of > > > > > > having the WindowStoreIterator is worth the savings of not > > > > > > duplicating the key on each record. Maybe it would be if we > > > > > > separately deserialized the same key over and over again, > > > > > > but that seems optimizable. > > > > > > > > > > > > On the other hand, that's more work, and the focus for IQv2 > > > > > > right now is to just get some MVP of queries implemented to > > > > > > cover basic use cases, so it might be worthwhile to keep it > > > > > > simple (by just dropping `WindowRangeQuery#withKey`). I'm > > > > > > happy with whatever call Patrick wants to make here, since > > > > > > it's his work to do. > > > > > > > > > > > > > > > > My original comments was actually referring to that > > `WindowRangeQuery` does > > > > > not allow users to specify a range of [keyFrom: windowFrom, keyTo: > > > > > windowTo], but only [key: windowFrom, key: windowTo], or [-INF: > > windowFrom, > > > > > INF: windowTo] if we do not specify the key but only do > > `withWindowRage`. I > > > > > was not sure if this is intentional by design, i.e. that the only > > > > > difference between the two classes is that the latter allows [-INF: > > > > > windowFrom, INF: windowTo], and and I was assuming it's now, but > > otherwise > > > > > these two are then very much alike. > > > > > > > > > > So I guess my question is that, is it by-design to not allow users > do > > > > > something like `query.withWindowRange(..).withKeyRange(..)`? > > > > > > > > > > > > > > > > > > > > > > 3. I got the impression that WindowRangeQuery was intended > > > > > > for use with both Window and Session stores, but that might > > > > > > be my assumption. > > > > > > > > > > > > > > > > > Today the session store's query API is at least different on the > > names, > > > > > e.g. "findSessions", so I'm not sure if the proposal intend to > > replace > > > > > these functions with the WindowRangeQuery such that: > > > > > > > > > > sessionStore.findSessions(K, Ins, Ins) -> > > query.withKey().withWindowRange() > > > > > sessionStore.findSessions(K, K, Ins, Ins) -> > > > > > query.withKeyRange().withWindowRange() // again, assuming we > would > > want > > > > > `withKeyRange` as in 2) above. > > > > > sessionStore.fetch(K) -> query.withKey() > > > > > sessionStore.fetch(K, K) -> query.withKeyRange() > > > > > > > > > > Also, sessionStore.fetchSession(K, Ins, Ins) seems would not be > > supported > > > > > with the new API. I'm not enforcing that we have the complete > > coverage in > > > > > this KIP, and I'm with you that we would focus on getting some MVP > > out > > > > > sooner, but I'd like our KIP to clarify on what's really covered > > comparing > > > > > against the old API and what's not. > > > > > > > > > > > > > > > > > > > > > 4. That's good feedback. I think those names were inspired > > > > > > by the WindowStore methods that just say stuff like > > > > > > `timeFrom` and `timeTo`. The SessionStore does a better job > > > > > > of being unambiguous, since it says stuff like > > > > > > `earliestSessionEndTime` and `latestSessionStartTime`. > > > > > > > > > > > > And, actually, that brings up a point that I think we > > > > > > overlooked before. While the method signatures of the > > > > > > WindowStore and SessionStore methods look the same, the > > > > > > meanings of their arguments are not the same. In particular, > > > > > > the WindowStore ranges are always in reference to the window > > > > > > start time, while the SessionStore ranges may be in > > > > > > reference to the window start time or end time (or different > > > > > > combinations of both). > > > > > > > > > > > > I suspect that your instinct to cover both stores with the > > > > > > same Query is correct, and we just need to be specific about > > > > > > the kinds of ranges we're talking about. For example, > > > > > > instead of withWindowRange, maybe we would have: > > > > > > > > > > > > withWindowStartRange( > > > > > > Instant earliestWindowStartTime, > > > > > > Instant latestWindowStartTime > > > > > > ); > > > > > > > > > > > > Then in the future, we could add stuff like: > > > > > > > > > > > > withWindowStartAndEndRange( > > > > > > Instant earliestWindowStartTime, > > > > > > Instant latestWindowEndTime > > > > > > ); > > > > > > > > > > > > Etc. > > > > > > > > > > > > 5. I think Luke's question reveals how weirdly similar but > > > > > > not the same these two queries are. It's not your fault, > > > > > > this is inherited from the existing set of weirdly-similar- > > > > > > but-not-the-same store methods. The thing that distinguishes > > > > > > the WindowKeyQuery from the WindowRangeQuery is: > > > > > > * WindowKeyQuery: all results share the same K key (cf point > > > > > > 2: I don't think we need `WindowRangeQuery#withKey(k)`). > > > > > > Therefore, the iterator is just (windowStartTime, value) > > > > > > pairs. > > > > > > * WindowRangeQuery: the results may have different keys, so > > > > > > the iterator is (Windowed<K>, value) pairs, where > > > > > > Windowed<K> is basically a (windowStartTime, K key) pair. > > > > > > > > > > > > Therefore, we could support a WindowRangeQuery with a fixed > > > > > > key, but it would just be another way to express the same > > > > > > thing as the WindowKeyQuery with the same parameters. > > > > > > > > > > > > > > > > > As I suggested, we do not necessarily need to cover all existing > > cases > > > > > compared with the old API, but it's better to be very clear on > what's > > > > > actually covered v.s. what's not. For example in WindowRangeQuery, > > I'd like > > > > > to clarify if all the following are intended in this KIP's scope: > > > > > > > > > > * query.withKey().withWindowRange() > > > > > * query.withKey() // range all window > > > > > * query.withWindowRange() // range all keys. > > > > > > > > > > > > > > > > As in point 2, I do think it might be worth converging all > > > > > > use cases into the WindowRangeQuery, but we can also just > > > > > > shoot for parity now and defer elegance for the future. > > > > > > > > > > > > 6. I'll just add one question of my own here: If we do keep > > > > > > both query classes, then I think we would drop `withKey` and > > > > > > `getKey` from the WindowRangeQuery. But if we do wind up > > > > > > keeping the WindowRangeQuery class, I think we should > > > > > > reconsider the name of the getter, since there are other > > > > > > range queries that support a range of keys. Is there a > > > > > > getter name we can use that would still work if we come back > > > > > > later and add lower and upper key bounds? > > > > > > > > > > > > Thanks again for the KIP! > > > > > > John > > > > > > > > > > > > On Mon, 2021-12-13 at 16:35 +0800, Luke Chen wrote: > > > > > > > Hi Patrick, > > > > > > > > > > > > > > Thanks for the KIP! > > > > > > > > > > > > > > I have some comments, in addition to Guozhang's comments: > > > > > > > 4. The parameter names `windowLower` and `windowUpper` are kind > > of > > > > > > > ambiguous to me. Could we come up a better name for it, like > > > > > > > `windowStartTime`, `windowEndTime`, or even we don't need the > > "window" > > > > > > > name, just `startTime` and `endTime`? > > > > > > > 5. Why can't we support window range query with a key within a > > time > > > > > > range? > > > > > > > You might need to explain in the KIP. > > > > > > > > > > > > > > Thank you. > > > > > > > Luke > > > > > > > > > > > > > > > > > > > > > On Sat, Dec 11, 2021 at 7:54 AM Guozhang Wang < > > wangg...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > > Hi Patrick, > > > > > > > > > > > > > > > > I made a pass on the KIP and have a few comments below: > > > > > > > > > > > > > > > > 1. The `WindowRangeQuery` has a private constructor while the > > > > > > > > `WindowKeyQuery` has not, is that intentional? > > > > > > > > > > > > > > > > 2. The `WindowRangeQuery` seems not allowing to range over > both > > > > > window > > > > > > and > > > > > > > > key, but only window with a fixed key, in that case it seems > > pretty > > > > > > much > > > > > > > > the same as the other (ignoring the constructor), since we > > know we > > > > > > would > > > > > > > > only have a single `key` value in the returned iterator, and > > hence it > > > > > > seems > > > > > > > > returning in the form of `WindowStoreIterator<V>` is also > fine > > as the > > > > > > key > > > > > > > > is fixed and hence no need to maintain it in the returned > > iterator. > > > > > I'm > > > > > > > > wondering should we actually support ranging over keys as > well > > in > > > > > > > > `WindowRangeQuery`? > > > > > > > > > > > > > > > > 3. The KIP title mentioned both session and window, but the > > APIs only > > > > > > > > involves window stores; However the return type > > `WindowStoreIterator` > > > > > > is > > > > > > > > only for window stores not session stores, so I feel we would > > still > > > > > > have > > > > > > > > some differences for session window query interface? > > > > > > > > > > > > > > > > > > > > > > > > Guozhang > > > > > > > > > > > > > > > > On Fri, Dec 10, 2021 at 1:32 PM Patrick Stuedi > > > > > > > > <pstu...@confluent.io.invalid> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > I would like to start the vote for KIP-806 that adds window > > and > > > > > > session > > > > > > > > > query support to query KV-stores using IQv2. > > > > > > > > > > > > > > > > > > The KIP can be found here: > > > > > > > > > https://cwiki.apache.org/confluence/x/LJaqCw > > > > > > > > > > > > > > > > > > Skipping the discussion phase as this KIP is following the > > same > > > > > > pattern > > > > > > > > as > > > > > > > > > the previously submitted KIP-805 (KIP: > > > > > > > > > https://cwiki.apache.org/confluence/x/85OqCw, Discussion: > > > > > > > > > https://tinyurl.com/msp5mcb2). Of course concerns/comments > > can > > > > > > still be > > > > > > > > > brought up in this thread. > > > > > > > > > > > > > > > > > > -Patrick > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > -- Guozhang > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > -- Guozhang > > > > > > > > > > > > > > -- > -- Guozhang >