FYI, I filed this follow-on task to explore a more general
pattern for these queries:
https://issues.apache.org/jira/browse/KAFKA-13548

We can unblock the current scope for these queries but still
plan to revisit the API before the first release of IQv2.

Thanks!
-John

On Wed, 2021-12-15 at 10:34 -0600, John Roesler wrote:
> Thanks for the update, Patrick!
> 
> Tl;dr: I'm +1 (binding)
> 
> I just reviewed the KIP again (I hope you don't mind, I
> fixed a couple of missed renames in the text and examples).
> 
> One of the design of IQv2 is to make proposing and evolving
> these queries much less onerous. Unlike adding new methods
> to all StateStore interfaces and implementations, if we want
> to make (eg) the WindowRangeQuery more flexible in the
> future, we can easily do so by just adding some builder
> methods to set the bounds independently, or by adding new
> static methods to provide different parameterizations.
> 
> Therefore, even though this is not the ultimate expression
> of what we think the range queries should be, it's a
> perfectly fine starting point. The most pressing concerns to
> me were the cases where Luke and Guozhang pointed out that
> some parts of the proposal or interfaces were ambiguous,
> which looks like it's fixed now.
> 
> My preference would be to go ahead and accept this proposed
> scope so that we have at least some basic key-value, window,
> and session store queries in the code base. Until we have
> those MVP queries in place, we can't really start to address
> higher-level follow-up items like:
> https://issues.apache.org/jira/browse/KAFKA-13541 (to refine
> how the serdes interplay with the queries) or
> https://issues.apache.org/jira/browse/KAFKA-13541 (to
> consider alternative ways to pin down the generic type
> parameters better).
> 
> Anyway, for the current version of the KIP, my vote is +1
> (binding).
> 
> Thanks again!
> -John
> 
> On Wed, 2021-12-15 at 12:53 +0100, Patrick Stuedi wrote:
> > Thanks everyone for the sharing comments and suggestions, this is
> > very helpful!
> > 
> > I have updated the KIP based on the suggestions, please let me know what
> > you think. Here are the main changes:
> > - Removed constructor in WindowRangeQuery (Guozhang)
> > - Clearly specified how each of the instantiation of the different query
> > types maps to specific operations in WindowStore and SessionStore (Guozhang)
> > - Renamed the window start end time parameters and getter methods (Luke)
> > - Added some comments on the use of WindowRangeQuery.fromKey (Guozhang,
> > John)
> > 
> > The KIP currently takes a minimalist approach to move things forward. There
> > are two follow ups I can think of from this KIP:
> > * Add additional parameter combinations to WindowRangeQuery, e.g., support
> > for key range with and without window ranges.
> > * Remove WindowRangeQuery.fromKey and either use
> > WindowKeyQuery.withKeyAndWindowBounds or add a new call like
> > WindowKeyQuery.withKeyAndNoWindowBounds(): either way that would then raise
> > the question which operation in WindowStore this query should map to, given
> > that WindowKeyQuery is templated against WindowStoreIterator and the
> > current use of WindowRangeQuery.fromKey is to call SessionStore.fetch which
> > returns a KeyValueIterator.
> > * Combine WindowKeyQuery and WindowRangeQuery, e.g., by exposing the
> > template type
> > * Create a "Builder" interface for the query types to avoid blowing up the
> > static methods.
> > 
> > Since I think any of those tasks will require some more discussion, and in
> > the interest of being pragmatic here, I suggest we defer those
> > optimizations to future KIPs.
> > 
> > Best,
> >   Patrick
> > 
> > On Tue, Dec 14, 2021 at 1:02 AM Guozhang Wang <wangg...@gmail.com> wrote:
> > 
> > > Hi John,
> > > 
> > > Please see my follow-up comments inlined below.
> > > 
> > > On Mon, Dec 13, 2021 at 2:26 PM John Roesler <vvcep...@apache.org> wrote:
> > > 
> > > > Hi Patrick, thanks for the KIP!
> > > > 
> > > > I hope you, Guozhang, and Luke don't mind if I share some
> > > > thoughts:
> > > > 
> > > > 1. I think you just meant to remove that private constructor
> > > > from the KIP, right?
> > > > 
> > > 
> > > Yes.
> > > 
> > > 
> > > > 
> > > > 2. I think WindowRangeQuery#withWindowRage(windowLower,
> > > > windowUpper) is the version that yields an iterator over
> > > > both keys and windows, so it's not totally redundant.
> > > > 
> > > > However, it does seem like WindowRangeQuery#withKey is
> > > > equivalent to WindowKeyQuery#withKey . I overlooked it
> > > > before, but we probably don't need both.
> > > > 
> > > > Stepping back, it seems like we could either just de-
> > > > duplicate that specific withKey query or we could even just
> > > > merge both use cases into the WindowRangeQuery. I've
> > > > personally never been convinced that the extra complexity of
> > > > having the WindowStoreIterator is worth the savings of not
> > > > duplicating the key on each record. Maybe it would be if we
> > > > separately deserialized the same key over and over again,
> > > > but that seems optimizable.
> > > > 
> > > > On the other hand, that's more work, and the focus for IQv2
> > > > right now is to just get some MVP of queries implemented to
> > > > cover basic use cases, so it might be worthwhile to keep it
> > > > simple (by just dropping `WindowRangeQuery#withKey`). I'm
> > > > happy with whatever call Patrick wants to make here, since
> > > > it's his work to do.
> > > > 
> > > 
> > > My original comments was actually referring to that `WindowRangeQuery` 
> > > does
> > > not allow users to specify a range of [keyFrom: windowFrom, keyTo:
> > > windowTo], but only [key: windowFrom, key: windowTo], or [-INF: 
> > > windowFrom,
> > > INF: windowTo] if we do not specify the key but only do `withWindowRage`. 
> > > I
> > > was not sure if this is intentional by design, i.e. that the only
> > > difference between the two classes is that the latter allows [-INF:
> > > windowFrom, INF: windowTo], and and I was assuming it's now, but otherwise
> > > these two are then very much alike.
> > > 
> > > So I guess my question is that, is it by-design to not allow users do
> > > something like `query.withWindowRange(..).withKeyRange(..)`?
> > > 
> > > 
> > > > 
> > > > 3. I got the impression that WindowRangeQuery was intended
> > > > for use with both Window and Session stores, but that might
> > > > be my assumption.
> > > > 
> > > > 
> > > Today the session store's query API is at least different on the names,
> > > e.g. "findSessions", so I'm not sure if the proposal intend to replace
> > > these functions with the WindowRangeQuery such that:
> > > 
> > > sessionStore.findSessions(K, Ins, Ins) -> 
> > > query.withKey().withWindowRange()
> > > sessionStore.findSessions(K, K, Ins, Ins) ->
> > > query.withKeyRange().withWindowRange()   // again, assuming we would want
> > > `withKeyRange` as in 2) above.
> > > sessionStore.fetch(K) -> query.withKey()
> > > sessionStore.fetch(K, K) -> query.withKeyRange()
> > > 
> > > Also, sessionStore.fetchSession(K, Ins, Ins) seems would not be supported
> > > with the new API. I'm not enforcing that we have the complete coverage in
> > > this KIP, and I'm with you that we would focus on getting some MVP out
> > > sooner, but I'd like our KIP to clarify on what's really covered comparing
> > > against the old API and what's not.
> > > 
> > > 
> > > 
> > > > 4. That's good feedback. I think those names were inspired
> > > > by the WindowStore methods that just say stuff like
> > > > `timeFrom` and `timeTo`. The SessionStore does a better job
> > > > of being unambiguous, since it says stuff like
> > > > `earliestSessionEndTime` and `latestSessionStartTime`.
> > > > 
> > > > And, actually, that brings up a point that I think we
> > > > overlooked before. While the method signatures of the
> > > > WindowStore and SessionStore methods look the same, the
> > > > meanings of their arguments are not the same. In particular,
> > > > the WindowStore ranges are always in reference to the window
> > > > start time, while the SessionStore ranges may be in
> > > > reference to the window start time or end time (or different
> > > > combinations of both).
> > > > 
> > > > I suspect that your instinct to cover both stores with the
> > > > same Query is correct, and we just need to be specific about
> > > > the kinds of ranges we're talking about. For example,
> > > > instead of withWindowRange, maybe we would have:
> > > > 
> > > > withWindowStartRange(
> > > >   Instant earliestWindowStartTime,
> > > >   Instant latestWindowStartTime
> > > > );
> > > > 
> > > > Then in the future, we could add stuff like:
> > > > 
> > > > withWindowStartAndEndRange(
> > > >   Instant earliestWindowStartTime,
> > > >   Instant latestWindowEndTime
> > > > );
> > > > 
> > > > Etc.
> > > > 
> > > > 5. I think Luke's question reveals how weirdly similar but
> > > > not the same these two queries are. It's not your fault,
> > > > this is inherited from the existing set of weirdly-similar-
> > > > but-not-the-same store methods. The thing that distinguishes
> > > > the WindowKeyQuery from the WindowRangeQuery is:
> > > > * WindowKeyQuery: all results share the same K key (cf point
> > > > 2: I don't think we need `WindowRangeQuery#withKey(k)`).
> > > > Therefore, the iterator is just (windowStartTime, value)
> > > > pairs.
> > > > * WindowRangeQuery: the results may have different keys, so
> > > > the iterator is (Windowed<K>, value) pairs, where
> > > > Windowed<K> is basically a (windowStartTime, K key) pair.
> > > > 
> > > > Therefore, we could support a WindowRangeQuery with a fixed
> > > > key, but it would just be another way to express the same
> > > > thing as the WindowKeyQuery with the same parameters.
> > > > 
> > > > 
> > > As I suggested, we do not necessarily need to cover all existing cases
> > > compared with the old API, but it's better to be very clear on what's
> > > actually covered v.s. what's not. For example in WindowRangeQuery, I'd 
> > > like
> > > to clarify if all the following are intended in this KIP's scope:
> > > 
> > > * query.withKey().withWindowRange()
> > > * query.withKey() // range all window
> > > * query.withWindowRange() // range all keys.
> > > 
> > > 
> > > > As in point 2, I do think it might be worth converging all
> > > > use cases into the WindowRangeQuery, but we can also just
> > > > shoot for parity now and defer elegance for the future.
> > > > 
> > > > 6. I'll just add one question of my own here: If we do keep
> > > > both query classes, then I think we would drop `withKey` and
> > > > `getKey` from the WindowRangeQuery. But if we do wind up
> > > > keeping the WindowRangeQuery class, I think we should
> > > > reconsider the name of the getter, since there are other
> > > > range queries that support a range of keys. Is there a
> > > > getter name we can use that would still work if we come back
> > > > later and add lower and upper key bounds?
> > > > 
> > > > Thanks again for the KIP!
> > > > John
> > > > 
> > > > On Mon, 2021-12-13 at 16:35 +0800, Luke Chen wrote:
> > > > > Hi Patrick,
> > > > > 
> > > > > Thanks for the KIP!
> > > > > 
> > > > > I have some comments, in addition to Guozhang's comments:
> > > > > 4. The parameter names `windowLower` and `windowUpper` are kind of
> > > > > ambiguous to me. Could we come up a better name for it, like
> > > > > `windowStartTime`, `windowEndTime`, or even we don't need the "window"
> > > > > name, just `startTime` and `endTime`?
> > > > > 5. Why can't we support window range query with a key within a time
> > > > range?
> > > > > You might need to explain in the KIP.
> > > > > 
> > > > > Thank you.
> > > > > Luke
> > > > > 
> > > > > 
> > > > > On Sat, Dec 11, 2021 at 7:54 AM Guozhang Wang <wangg...@gmail.com>
> > > > wrote:
> > > > > 
> > > > > > Hi Patrick,
> > > > > > 
> > > > > > I made a pass on the KIP and have a few comments below:
> > > > > > 
> > > > > > 1. The `WindowRangeQuery` has a private constructor while the
> > > > > > `WindowKeyQuery` has not, is that intentional?
> > > > > > 
> > > > > > 2. The `WindowRangeQuery` seems not allowing to range over both
> > > window
> > > > and
> > > > > > key, but only window with a fixed key, in that case it seems pretty
> > > > much
> > > > > > the same as the other (ignoring the constructor), since we know we
> > > > would
> > > > > > only have a single `key` value in the returned iterator, and hence 
> > > > > > it
> > > > seems
> > > > > > returning in the form of `WindowStoreIterator<V>` is also fine as 
> > > > > > the
> > > > key
> > > > > > is fixed and hence no need to maintain it in the returned iterator.
> > > I'm
> > > > > > wondering should we actually support ranging over keys as well in
> > > > > > `WindowRangeQuery`?
> > > > > > 
> > > > > > 3. The KIP title mentioned both session and window, but the APIs 
> > > > > > only
> > > > > > involves window stores; However the return type 
> > > > > > `WindowStoreIterator`
> > > > is
> > > > > > only for window stores not session stores, so I feel we would still
> > > > have
> > > > > > some differences for session window query interface?
> > > > > > 
> > > > > > 
> > > > > > Guozhang
> > > > > > 
> > > > > > On Fri, Dec 10, 2021 at 1:32 PM Patrick Stuedi
> > > > > > <pstu...@confluent.io.invalid>
> > > > > > wrote:
> > > > > > 
> > > > > > > Hi everyone,
> > > > > > > 
> > > > > > > I would like to start the vote for KIP-806 that adds window and
> > > > session
> > > > > > > query support to query KV-stores using IQv2.
> > > > > > > 
> > > > > > > The KIP can be found here:
> > > > > > > https://cwiki.apache.org/confluence/x/LJaqCw
> > > > > > > 
> > > > > > > Skipping the discussion phase as this KIP is following the same
> > > > pattern
> > > > > > as
> > > > > > > the previously submitted KIP-805 (KIP:
> > > > > > > https://cwiki.apache.org/confluence/x/85OqCw, Discussion:
> > > > > > > https://tinyurl.com/msp5mcb2). Of course concerns/comments can
> > > > still be
> > > > > > > brought up in this thread.
> > > > > > > 
> > > > > > > -Patrick
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > > --
> > > > > > -- Guozhang
> > > > > > 
> > > > 
> > > > 
> > > 
> > > --
> > > -- Guozhang
> > > 
> 

Reply via email to