Thanks John.

> it looks like there’s a revision error in which two methods are proposed
for SessionStore, but seem like they should be in ReadOnlySessionStore. Do
I read that right?

Yes, I've opted to keep the new methods alongside the existing ones. In the
case of SessionStore, `findSessions` are in `SessionStore`, and `fetch` are
in `ReadOnlySessionStore`. If it makes more sense, I can move all of them
to ReadOnlySessionStore.
Let me know what you think.

Thanks,
Jorge.

On Thu, Jul 2, 2020 at 2:36 PM John Roesler <vvcep...@apache.org> wrote:

> Hi Jorge,
>
> Thanks for the update. I think this is a good plan.
>
> I just took a look at the KIP again, and it looks like there’s a revision
> error in which two methods are proposed for SessionStore, but seem like
> they should be in ReadOnlySessionStore. Do I read that right?
>
> Otherwise, I’m happy with your proposal.
>
> Thanks,
> John
>
> On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote:
> > Quick update: KIP is updated with latest changes now.
> > Will leave it open this week while working on the PR.
> >
> > Hope to open a new vote thread over the next few days if no additional
> > feedback is provided.
> >
> > Cheers,
> > Jorge.
> >
> > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Thanks, John!
> > >
> > > Make sense to reconsider the current approach. I was heading in a
> similar
> > > direction while drafting the implementation. Metered, Caching, and
> other
> > > layers will also have to get duplicated to build up new methods in
> `Stores`
> > > factory, and class casting issues would appear on stores created by
> DSL.
> > >
> > > I will draft a proposal with new methods (move methods from proposed
> > > interfaces to existing ones) with default implementation in a KIP
> update
> > > and wait for Matthias to chime in to validate this approach.
> > >
> > > Jorge.
> > >
> > >
> > > On Sat, Jun 27, 2020 at 4:01 PM John Roesler <vvcep...@apache.org>
> wrote:
> > >
> > >> Hi Jorge,
> > >>
> > >> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1
> releases.
> > >>
> > >> The idea to separate the new methods into "mixin" interfaces seems
> > >> like a good one, but as we've discovered in KIP-614, it doesn't work
> > >> out that way in practice. The problem is that the store
> implementations
> > >> are just the base layer that get composed with other layers in Streams
> > >> before they can be accessed in the DSL. This is extremely subtle, so
> > >> I'm going to put everyone to sleep with a detailed explanation:
> > >>
> > >> For example, this is the mechanism by which all KeyValueStore
> > >> implementations get added to Streams:
> > >> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build
> > >> return new MeteredKeyValueStore<>(
> > >>   maybeWrapCaching(maybeWrapLogging(storeSupplier.get())),
> > >>   storeSupplier.metricsScope(),
> > >>   time,
> > >>   keySerde,
> > >>   valueSerde
> > >> );
> > >>
> > >> In the DSL, the store that a processor gets from the context would be
> > >> the result of this composition. So even if the storeSupplier.get()
> returns
> > >> a store that implements the "reverse" interface, when you try to use
> it
> > >> from a processor like:
> > >> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init
> > >> ReadOnlyBackwardWindowStore<K, V> store =
> > >>   (ReadOnlyBackwardWindowStore<K, V>) context.getStateStore(..)
> > >>
> > >> You'd just get a ClassCastException because it's actually a
> > >> MeteredKeyValueStore, which doesn't implement
> > >> ReadOnlyBackwardWindowStore.
> > >>
> > >> The only way to make this work would be to make the Metered,
> > >> Caching, and Logging layers also implement the new interfaces,
> > >> but this effectively forces implementations to also implement
> > >> the interface. Otherwise, the intermediate layers would have to
> > >> cast the store in each method, like this:
> > >> MeteredWindowStore#backwardFetch {
> > >>   ((ReadOnlyBackwardWindowStore<K, V>) innerStore).backwardFetch(..)
> > >> }
> > >>
> > >> And then if the implementation doesn't "opt in" by implementing
> > >> the interface, you'd get a ClassCastException, not when you get the
> > >> store, but when you try to use it.
> > >>
> > >> The fact that we get ClassCastExceptions no matter which way we
> > >> turn here indicates that we're really not getting any benefit from the
> > >> type system, which makes the extra interfaces seem not worth all the
> > >> code involved.
> > >>
> > >> Where we landed in KIP-614 is that, unless we want to completely
> > >> revamp the way that StateStores work in the DSL, you might as
> > >> well just add the new methods to the existing interfaces. To prevent
> > >> compilation errors, we can add default implementations that throw
> > >> UnsupportedOperationException. If a store doesn't opt in by
> > >> implementing the methods, you'd get an UnsupportedOperationException,
> > >> which seems no worse, and maybe better, than the ClassCastException
> > >> you'd get if we go with the "mixin interface" approach.
> > >>
> > >> A quick note: This entire discussion focuses on the DSL. If you're
> just
> > >> using the Processor API by directly adding the a custom store to the
> > >> Topology:
> > >> org.apache.kafka.streams.Topology#addStateStore
> > >> and then retrieving it in the processor via:
> > >> org.apache.kafka.streams.processor.ProcessorContext#getStateStore
> > >> in
> > >> org.apache.kafka.streams.processor.Processor#init
> > >>
> > >> Then, you can both register and retrieve _any_ StateStore
> implementation.
> > >> There's no need to use KeyValueStore or any other built-in interface.
> > >> In other words, KeyValueStore and company are only part of the DSL,
> > >> not the PAPI. So, discussions about the build-in store interfaces are
> only
> > >> really relevant in the context of the DSL, Transformers, and
> Materialized.
> > >>
> > >> So, in conclusion, I'd really recommend just adding any new methods to
> > >> the existing store interfaces. We might be able to revamp the API in
> the
> > >> future to support mixins, but it's a much larger scope of work than
> this
> > >> KIP.
> > >> A more minor comment is that we don't need to add Deprecated variants
> > >> of new methods.
> > >>
> > >> Thanks again, and once again, I'm sorry I tuned out and didn't offer
> this
> > >> feedback before you revised the KIP.
> > >> -John
> > >>
> > >>
> > >>
> > >>
> > >> On Mon, Jun 22, 2020, at 06:11, Jorge Esteban Quilcate Otoya wrote:
> > >> > Hi everyone,
> > >> >
> > >> > I've updated the KIP, applying Matthias' feedback regarding
> interface
> > >> > hierarchy.
> > >> >
> > >> > Also, following the last email, I think we can consider reverse
> > >> operations
> > >> > on KeyValue range as well, as implementation supports lexicographic
> > >> order.
> > >> >
> > >> > I considered different naming between Key-based ranges and
> Time-based
> > >> > ranges, and mitigate confusion when fetching keys and time ranges as
> > >> > WindowStore does:
> > >> >
> > >> > Key-based ranges: reverseRange(), reverseAll()
> > >> > Time-based ranges: backwardFetch()
> > >> >
> > >> > Then, key-based changes apply to KeyValueStore, and time-based
> changes
> > >> to
> > >> > Window and Session stores.
> > >> >
> > >> > Let me know if you have any questions.
> > >> >
> > >> > Thanks,
> > >> > Jorge.
> > >> >
> > >> >
> > >> > On Tue, Jun 16, 2020 at 12:47 AM Jorge Esteban Quilcate Otoya <
> > >> > quilcate.jo...@gmail.com> wrote:
> > >> >
> > >> > > Hi everyone, sorry for the late reply.
> > >> > >
> > >> > > Thanks Matthias for your feedback. I think it makes sense to
> > >> reconsider
> > >> > > the current design based on your input.
> > >> > >
> > >> > > After digging deeper into the current implementation, I'd like to
> > >> bring my
> > >> > > current understanding to be double-checked as it might be
> redefining
> > >> the
> > >> > > KIP's scope:
> > >> > >
> > >> > > 1. There are 2 ranges been exposed by different stores:
> > >> > >
> > >> > > a. Key Range
> > >> > > b. Timestamp Range
> > >> > >
> > >> > > So far, we have discussed covering both.
> > >> > >
> > >> > > 2. Key Range functions do not provide ordering guarantees by
> design:
> > >> > >
> > >> > > ```ReadOnlyKeyValueStore.java
> > >> > >     /**
> > >> > >      * Get an iterator over a given range of keys. This iterator
> must
> > >> be
> > >> > > closed after use.
> > >> > >      * The returned iterator must be safe from {@link
> > >> > > java.util.ConcurrentModificationException}s
> > >> > >      * and must not return null values. No ordering guarantees are
> > >> > > provided.
> > >> > >      * ...
> > >> > >      */
> > >> > >      KeyValueIterator<K, V> range(K from, K to);
> > >> > > ```
> > >> > >
> > >> > > Therefore, I'd propose removing Key range operations from the
> scope.
> > >> > >
> > >> > > 3. Timestamp Range operations happen at the SegmentsStore level
> > >> (internal)
> > >> > > API
> > >> > >
> > >> > > AFAICT, Segments wrappers handle all Timestamp ranges queries.
> > >> > >
> > >> > > I'd propose extending `Segments#segments(timeFrom, timeTo,
> backwards)`
> > >> > > with a flag for backwards operations.
> > >> > >
> > >> > > As segments returned will be processed backwards, I'm not
> extending
> > >> > > KeyValueStores to query each segment backwards as previous point
> 2.
> > >> > >
> > >> > > 4. Extend WindowStores implementations with a new
> > >> > > WindowBackwardStore/ReadOnlyBackwardStore:
> > >> > >
> > >> > > ```java
> > >> > > public interface ReadOnlyWindowBackwardStore<K, V> {
> > >> > >     WindowStoreIterator<V> backwardFetch(K key, Instant from,
> Instant
> > >> to)
> > >> > > throws IllegalArgumentException;
> > >> > >
> > >> > >     KeyValueIterator<Windowed<K>, V> backwardFetch(K from, K to,
> > >> Instant
> > >> > > fromTime, Instant toTime)
> > >> > >         throws IllegalArgumentException;
> > >> > >
> > >> > >     KeyValueIterator<Windowed<K>, V> backwardFetchAll(Instant
> from,
> > >> > > Instant to) throws IllegalArgumentException;
> > >> > > ```
> > >> > >
> > >> > > 5. SessionStore is a bit different as it has fetch/find sessions
> > >> spread
> > >> > > between SessionStore and ReadOnlySessionStore.
> > >> > >
> > >> > > I'd propose a new interface `SessionBackwardStore` to expose
> backward
> > >> find
> > >> > > operations:
> > >> > >
> > >> > > ```java
> > >> > > public interface SessionBackwardStore<K, AGG> {
> > >> > >     KeyValueIterator<Windowed<K>, AGG> backwardFindSessions(final
> K
> > >> key,
> > >> > > final long earliestSessionEndTime, final long
> latestSessionStartTime);
> > >> > >
> > >> > >     KeyValueIterator<Windowed<K>, AGG> backwardFindSessions(final
> K
> > >> > > keyFrom, final K keyTo, final long earliestSessionEndTime, final
> long
> > >> > > latestSessionStartTime);
> > >> > > }
> > >> > > ```
> > >> > >
> > >> > > If this understanding is correct I'll proceed to update the KIP
> based
> > >> on
> > >> > > this.
> > >> > >
> > >> > > Looking forward to your feedback.
> > >> > >
> > >> > > Thanks,
> > >> > > Jorge.
> > >> > >
> > >> > > On Fri, May 29, 2020 at 3:32 AM Matthias J. Sax <mj...@apache.org
> >
> > >> wrote:
> > >> > >
> > >> > >> Hey,
> > >> > >>
> > >> > >> Sorry that I am late to the game. I am not 100% convinced about
> the
> > >> > >> current proposal. Using a new config as feature flag seems to be
> > >> rather
> > >> > >> "nasty" to me, and flipping from/to is a little bit too fancy
> for my
> > >> > >> personal taste.
> > >> > >>
> > >> > >> I agree, that the original proposal using a "ReadDirection" enum
> is
> > >> not
> > >> > >> ideal either.
> > >> > >>
> > >> > >> Thus, I would like to put out a new idea: We could add a new
> > >> interface
> > >> > >> that offers new methods that return revers iterators.
> > >> > >>
> > >> > >> The KIP already proposes to add `reverseAll()` and it seems
> backward
> > >> > >> incompatible to just add this method to `ReadOnlyKeyValueStore`
> and
> > >> > >> `ReadOnlyWindowStore`. I don't think we could provide a useful
> > >> default
> > >> > >> implementation for custom stores and thus either break
> compatibility
> > >> or
> > >> > >> need add a default that just throws an exception. Neither seems
> to
> > >> be a
> > >> > >> good option.
> > >> > >>
> > >> > >> Using a new interface avoid this issue and allows users
> implementing
> > >> > >> custom stores to opt-in by adding the interface to their stores.
> > >> > >> Furthermore, we don't need any config. In the end, we encapsulte
> the
> > >> > >> change into the store, and our runtime is agnostic to it (as it
> > >> should
> > >> > >> be).
> > >> > >>
> > >> > >> The hierarchy becomes a little complex (but uses would not
> really see
> > >> > >> the complexity):
> > >> > >>
> > >> > >> // exsiting
> > >> > >> ReadOnlyKeyValueStore
> > >> > >> KeyValueStore extend StateStore, ReadOnlyKeyValueStore
> > >> > >>
> > >> > >>
> > >> > >> // helper interface; users don't care
> > >> > >> // need similar ones for other stores
> > >> > >> ReverseReadOnlyKeyValueStore {
> > >> > >>     KeyValueIterator<K, V> reverseRange(K from, K to);
> > >> > >>     KeyValueIterator<K, V> reverseAll();
> > >> > >> }
> > >> > >>
> > >> > >>
> > >> > >> // two new user facing interfaces for kv-store
> > >> > >> // need similar ones for other stores
> > >> > >> ReadOnlyKeyValueStoreWithReverseIterators extends
> > >> ReadOnlyKeyValueStore,
> > >> > >> ReverseReadOnlyKeyValueStore
> > >> > >>
> > >> > >> KeyValueStoreWithReverseIterators extends KeyValueStore,
> > >> > >> ReverseReadOnlyKeyValueStore
> > >> > >>
> > >> > >>
> > >> > >> // updated (also internal)
> > >> > >> // also need to update other built-in stores
> > >> > >> RocksDB implements KeyValueStoreWithReverseIterators,
> > >> BulkLoadingStore
> > >> > >>
> > >> > >>
> > >> > >> In the end, user would only care about the two (for kv-case) new
> > >> > >> interface that offer revers iterator (read only and regular) and
> can
> > >> > >> cast stores accordingly in their Processors/Transformers or via
> IQ.
> > >> > >>
> > >> > >>
> > >> > >> Btw: if we add revers iterator for KeyValue and Window store,
> should
> > >> we
> > >> > >> do the same for Session store?
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >> This might be more code to write, but I believe it provides the
> > >> better
> > >> > >> user experience. Thoughts?
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >> -Matthias
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >> On 5/26/20 8:47 PM, John Roesler wrote:
> > >> > >> > Sorry for my silence, Jorge,
> > >> > >> >
> > >> > >> > I've just taken another look, and I'm personally happy with the
> > >> KIP.
> > >> > >> >
> > >> > >> > Thanks,
> > >> > >> > -John
> > >> > >> >
> > >> > >> > On Tue, May 26, 2020, at 16:17, Jorge Esteban Quilcate Otoya
> wrote:
> > >> > >> >> If no additional comments, I will proceed to start the a vote
> > >> thread.
> > >> > >> >>
> > >> > >> >> Thanks a lot for your feedback!
> > >> > >> >>
> > >> > >> >> On Fri, May 22, 2020 at 9:25 AM Jorge Esteban Quilcate Otoya <
> > >> > >> >> quilcate.jo...@gmail.com> wrote:
> > >> > >> >>
> > >> > >> >>> Thanks Sophie. I like the `reverseAll()` idea.
> > >> > >> >>>
> > >> > >> >>> I updated the KIP with your feedback.
> > >> > >> >>>
> > >> > >> >>>
> > >> > >> >>>
> > >> > >> >>> On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman <
> > >> > >> sop...@confluent.io>
> > >> > >> >>> wrote:
> > >> > >> >>>
> > >> > >> >>>> Hm, the case of `all()` does seem to present a dilemma in
> the
> > >> case of
> > >> > >> >>>> variable-length keys.
> > >> > >> >>>>
> > >> > >> >>>> In the case of fixed-length keys, you can just compute the
> keys
> > >> that
> > >> > >> >>>> correspond
> > >> > >> >>>> to the maximum and minimum serialized bytes, then perform a
> > >> `range()`
> > >> > >> >>>> query
> > >> > >> >>>> instead of an `all()`. If your keys don't have a
> well-defined
> > >> > >> ordering
> > >> > >> >>>> such
> > >> > >> >>>> that
> > >> > >> >>>> you can't determine the MAX_KEY, then you probably don't
> care
> > >> about
> > >> > >> the
> > >> > >> >>>> iterator order anyway.
> > >> > >> >>>>
> > >> > >> >>>>  But with variable-length keys, there is no MAX_KEY. If all
> your
> > >> > >> keys were
> > >> > >> >>>> just
> > >> > >> >>>> of the form 'a', 'aa', 'aaaaa', 'aaaaaaa' then in fact the
> only
> > >> way
> > >> > >> to
> > >> > >> >>>> figure out the
> > >> > >> >>>> maximum key in the store is by using `all()` -- and without
> a
> > >> reverse
> > >> > >> >>>> iterator, you're
> > >> > >> >>>> doomed to iterate through every single key just to answer
> that
> > >> simple
> > >> > >> >>>> question.
> > >> > >> >>>>
> > >> > >> >>>> That said, I still think determining the iterator order
> based
> > >> on the
> > >> > >> >>>> to/from bytes
> > >> > >> >>>> makes a lot of intuitive sense and gives the API a nice
> > >> symmetry.
> > >> > >> What if
> > >> > >> >>>> we
> > >> > >> >>>> solved the `all()` problem by just giving `all()` a reverse
> > >> form to
> > >> > >> >>>> complement it?
> > >> > >> >>>> Ie we would have `all()` and `reverseAll()`, or something to
> > >> that
> > >> > >> effect.
> > >> > >> >>>>
> > >> > >> >>>> On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate
> Otoya <
> > >> > >> >>>> quilcate.jo...@gmail.com> wrote:
> > >> > >> >>>>
> > >> > >> >>>>> Thanks John.
> > >> > >> >>>>>
> > >> > >> >>>>> Agree. I like the first approach as well, with
> StreamsConfig
> > >> flag
> > >> > >> >>>> passing
> > >> > >> >>>>> by via ProcessorContext.
> > >> > >> >>>>>
> > >> > >> >>>>> Another positive effect with "reverse parameters" is that
> in
> > >> the
> > >> > >> case of
> > >> > >> >>>>> `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide
> > >> _which_
> > >> > >> pair
> > >> > >> >>>> to
> > >> > >> >>>>> flip, whether with `ReadDirection` enum it apply to both.
> > >> > >> >>>>>
> > >> > >> >>>>> The only issue I've found while reviewing the KIP is that
> > >> `all()`
> > >> > >> won't
> > >> > >> >>>> fit
> > >> > >> >>>>> within this approach.
> > >> > >> >>>>>
> > >> > >> >>>>> We could remove it from the KIP and argue that for
> WindowStore,
> > >> > >> >>>>> `fetchAll(0, Long.MAX_VALUE)` can be used to get all in
> reverse
> > >> > >> order,
> > >> > >> >>>> and
> > >> > >> >>>>> for KeyValueStore, no ordering guarantees are provided.
> > >> > >> >>>>>
> > >> > >> >>>>> If there is consensus with this changes, I will go and
> update
> > >> the
> > >> > >> KIP.
> > >> > >> >>>>>
> > >> > >> >>>>> On Thu, May 21, 2020 at 3:33 PM John Roesler <
> > >> vvcep...@apache.org>
> > >> > >> >>>> wrote:
> > >> > >> >>>>>
> > >> > >> >>>>>> Hi Jorge,
> > >> > >> >>>>>>
> > >> > >> >>>>>> Thanks for that idea. I agree, a feature flag would
> protect
> > >> anyone
> > >> > >> >>>>>> who may be depending on the current behavior.
> > >> > >> >>>>>>
> > >> > >> >>>>>> It seems better to locate the feature flag in the
> > >> initialization
> > >> > >> >>>> logic of
> > >> > >> >>>>>> the store, rather than have a method on the "live" store
> that
> > >> > >> changes
> > >> > >> >>>>>> its behavior on the fly.
> > >> > >> >>>>>>
> > >> > >> >>>>>> It seems like there are two options here, one is to add a
> new
> > >> > >> config:
> > >> > >> >>>>>>
> > >> > >> >>>>>> StreamsConfig.ENABLE_BACKWARDS_ITERATION =
> > >> > >> >>>>>>   "enable.backwards.iteration
> > >> > >> >>>>>>
> > >> > >> >>>>>> Or we can add a feature flag in Materialized, like
> > >> > >> >>>>>>
> > >> > >> >>>>>> Materialized.enableBackwardsIteration()
> > >> > >> >>>>>>
> > >> > >> >>>>>> I think I'd personally lean toward the config, for the
> > >> following
> > >> > >> >>>> reason.
> > >> > >> >>>>>> The concern that Sophie raised is that someone's program
> may
> > >> depend
> > >> > >> >>>>>> on the existing contract of getting an empty iterator. We
> > >> don't
> > >> > >> want
> > >> > >> >>>> to
> > >> > >> >>>>>> switch behavior when they aren't expecting it, so we
> provide
> > >> them a
> > >> > >> >>>>>> config to assert that they _are_ expecting the new
> behavior,
> > >> which
> > >> > >> >>>>>> means they take responsibility for updating their code to
> > >> expect
> > >> > >> the
> > >> > >> >>>> new
> > >> > >> >>>>>> behavior.
> > >> > >> >>>>>>
> > >> > >> >>>>>> There doesn't seem to be a reason to offer a choice of
> > >> behaviors
> > >> > >> on a
> > >> > >> >>>>>> per-query, or per-store basis. We just want people to be
> not
> > >> > >> surprised
> > >> > >> >>>>>> by this change in general.
> > >> > >> >>>>>>
> > >> > >> >>>>>> What do you think?
> > >> > >> >>>>>> Thanks,
> > >> > >> >>>>>> -John
> > >> > >> >>>>>>
> > >> > >> >>>>>> On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote:
> > >> > >> >>>>>>> Thank you both for the great feedback.
> > >> > >> >>>>>>>
> > >> > >> >>>>>>> I like the "fancy" proposal :), and how it removes the
> need
> > >> for
> > >> > >> >>>>>>> additional API methods. And with a feature flag on
> > >> `StateStore`,
> > >> > >> >>>>>>> disabled by default, should no break current users.
> > >> > >> >>>>>>>
> > >> > >> >>>>>>> The only side-effect I can think of is that: by moving
> the
> > >> flag
> > >> > >> >>>>> upwards,
> > >> > >> >>>>>>> all later operations become affected; which might be ok
> for
> > >> most
> > >> > >> >>>> (all?)
> > >> > >> >>>>>>> cases. I can't think of an scenario where this would be
> an
> > >> issue,
> > >> > >> >>>> just
> > >> > >> >>>>>>> want to point this out.
> > >> > >> >>>>>>>
> > >> > >> >>>>>>> If moving to this approach, I'd like to check if I got
> this
> > >> right
> > >> > >> >>>>> before
> > >> > >> >>>>>>> updating the KIP:
> > >> > >> >>>>>>>
> > >> > >> >>>>>>> - only `StateStore` will change by having a new method:
> > >> > >> >>>>>>> `backwardIteration()`, `false` by default to keep things
> > >> > >> compatible.
> > >> > >> >>>>>>> - then all `*Stores` will have to update their
> implementation
> > >> > >> based
> > >> > >> >>>> on
> > >> > >> >>>>>>> this flag.
> > >> > >> >>>>>>>
> > >> > >> >>>>>>>
> > >> > >> >>>>>>> On 20/05/2020 21:02, Sophie Blee-Goldman wrote:
> > >> > >> >>>>>>>>> There's no possibility that someone could be relying
> > >> > >> >>>>>>>>> on iterating over that range in increasing order,
> because
> > >> that's
> > >> > >> >>>> not
> > >> > >> >>>>>> what
> > >> > >> >>>>>>>>> happens. However, they could indeed be relying on
> getting
> > >> an
> > >> > >> >>>> empty
> > >> > >> >>>>>>>> iterator
> > >> > >> >>>>>>>>
> > >> > >> >>>>>>>> I just meant that they might be relying on the
> assumption
> > >> that
> > >> > >> the
> > >> > >> >>>>>> range
> > >> > >> >>>>>>>> query
> > >> > >> >>>>>>>> will never return results with decreasing keys. The
> empty
> > >> > >> iterator
> > >> > >> >>>>>> wouldn't
> > >> > >> >>>>>>>> break that contract, but of course a surprise reverse
> > >> iterator
> > >> > >> >>>> would.
> > >> > >> >>>>>>>>
> > >> > >> >>>>>>>> FWIW I actually am in favor of automatically converting
> to a
> > >> > >> >>>> reverse
> > >> > >> >>>>>>>> iterator,
> > >> > >> >>>>>>>> I just thought we should consider whether this should be
> > >> off by
> > >> > >> >>>>>> default or
> > >> > >> >>>>>>>> even possible to disable at all.
> > >> > >> >>>>>>>>
> > >> > >> >>>>>>>> On Tue, May 19, 2020 at 7:42 PM John Roesler <
> > >> > >> vvcep...@apache.org
> > >> > >> >>>>>
> > >> > >> >>>>>> wrote:
> > >> > >> >>>>>>>>
> > >> > >> >>>>>>>>> Thanks for the response, Sophie,
> > >> > >> >>>>>>>>>
> > >> > >> >>>>>>>>> I wholeheartedly agree we should take as much into
> account
> > >> as
> > >> > >> >>>>> possible
> > >> > >> >>>>>>>>> up front, rather than regretting our decisions later. I
> > >> actually
> > >> > >> >>>> do
> > >> > >> >>>>>> share
> > >> > >> >>>>>>>>> your vague sense of worry, which was what led me to say
> > >> > >> initially
> > >> > >> >>>>>> that I
> > >> > >> >>>>>>>>> thought my counterproposal might be "too fancy".
> > >> Sometimes, it's
> > >> > >> >>>>>> better
> > >> > >> >>>>>>>>> to be explicit instead of "elegant", if we think more
> > >> people
> > >> > >> >>>> will be
> > >> > >> >>>>>>>>> confused
> > >> > >> >>>>>>>>> than not.
> > >> > >> >>>>>>>>>
> > >> > >> >>>>>>>>> I really don't think that there's any danger of
> "relying
> > >> on a
> > >> > >> >>>> bug"
> > >> > >> >>>>>> here,
> > >> > >> >>>>>>>>> although
> > >> > >> >>>>>>>>> people certainly could be relying on current behavior.
> One
> > >> thing
> > >> > >> >>>> to
> > >> > >> >>>>> be
> > >> > >> >>>>>>>>> clear
> > >> > >> >>>>>>>>> about (which I just left a more detailed comment in
> > >> KAFKA-8159
> > >> > >> >>>>> about)
> > >> > >> >>>>>> is
> > >> > >> >>>>>>>>> that
> > >> > >> >>>>>>>>> when we say something like key1 > key2, this ordering
> is
> > >> defined
> > >> > >> >>>> by
> > >> > >> >>>>>> the
> > >> > >> >>>>>>>>> serde's output and nothing else.
> > >> > >> >>>>>>>>>
> > >> > >> >>>>>>>>> Currently, thanks to your fix in
> > >> > >> >>>>>> https://github.com/apache/kafka/pull/6521
> > >> > >> >>>>>>>>> ,
> > >> > >> >>>>>>>>> the store contract is that for range scans, if from >
> to,
> > >> then
> > >> > >> >>>> the
> > >> > >> >>>>>> store
> > >> > >> >>>>>>>>> must
> > >> > >> >>>>>>>>> return an empty iterator. There's no possibility that
> > >> someone
> > >> > >> >>>> could
> > >> > >> >>>>> be
> > >> > >> >>>>>>>>> relying
> > >> > >> >>>>>>>>> on iterating over that range in increasing order,
> because
> > >> that's
> > >> > >> >>>> not
> > >> > >> >>>>>> what
> > >> > >> >>>>>>>>> happens. However, they could indeed be relying on
> getting
> > >> an
> > >> > >> >>>> empty
> > >> > >> >>>>>>>>> iterator.
> > >> > >> >>>>>>>>>
> > >> > >> >>>>>>>>> My counterproposal was to actually change this
> contract to
> > >> say
> > >> > >> >>>> that
> > >> > >> >>>>>> the
> > >> > >> >>>>>>>>> store
> > >> > >> >>>>>>>>> must return an iterator over the keys in that range,
> but
> > >> in the
> > >> > >> >>>>>> reverse
> > >> > >> >>>>>>>>> order.
> > >> > >> >>>>>>>>> So, in addition to considering whether this idea is
> "too
> > >> fancy"
> > >> > >> >>>> (aka
> > >> > >> >>>>>>>>> confusing),
> > >> > >> >>>>>>>>> we should also consider the likelihood of breaking an
> > >> existing
> > >> > >> >>>>>> program with
> > >> > >> >>>>>>>>> this behavior/contract change.
> > >> > >> >>>>>>>>>
> > >> > >> >>>>>>>>> To echo your clarification, I'm also not advocating
> > >> strongly in
> > >> > >> >>>>> favor
> > >> > >> >>>>>> of my
> > >> > >> >>>>>>>>> proposal. I just wanted to present it for consideration
> > >> > >> alongside
> > >> > >> >>>>>> Jorge's
> > >> > >> >>>>>>>>> original one.
> > >> > >> >>>>>>>>>
> > >> > >> >>>>>>>>> Thanks for raising these very good points,
> > >> > >> >>>>>>>>> -John
> > >> > >> >>>>>>>>>
> > >> > >> >>>>>>>>> On Tue, May 19, 2020, at 20:49, Sophie Blee-Goldman
> wrote:
> > >> > >> >>>>>>>>>>> Rather than working around it, I think we should just
> > >> fix it
> > >> > >> >>>>>>>>>> Now *that's* a "fancy" idea :P
> > >> > >> >>>>>>>>>>
> > >> > >> >>>>>>>>>> That was my primary concern, although I do have a
> vague
> > >> sense
> > >> > >> of
> > >> > >> >>>>>> worry
> > >> > >> >>>>>>>>>> that we might be allowing users to get into trouble
> > >> without
> > >> > >> >>>>>> realizing it.
> > >> > >> >>>>>>>>>> For example if their custom serdes suffer a similar
> bug
> > >> as the
> > >> > >> >>>>> above,
> > >> > >> >>>>>>>>>> and/or
> > >> > >> >>>>>>>>>> they rely on getting results in increasing order (of
> the
> > >> keys)
> > >> > >> >>>> even
> > >> > >> >>>>>> when
> > >> > >> >>>>>>>>>> to < from. Maybe they're relying on the fact that the
> > >> range
> > >> > >> >>>> query
> > >> > >> >>>>>> returns
> > >> > >> >>>>>>>>>> nothing in that case.
> > >> > >> >>>>>>>>>>
> > >> > >> >>>>>>>>>> Not sure if that qualifies as relying on a bug or not,
> > >> but in
> > >> > >> >>>> that
> > >> > >> >>>>>> past
> > >> > >> >>>>>>>>>> we've
> > >> > >> >>>>>>>>>> taken the stance that we should not break
> compatibility
> > >> even if
> > >> > >> >>>> the
> > >> > >> >>>>>> user
> > >> > >> >>>>>>>>>> was relying on bugs or unintentional behavior.
> > >> > >> >>>>>>>>>>
> > >> > >> >>>>>>>>>> Just to clarify I'm not advocating strongly against
> this
> > >> > >> >>>> proposal,
> > >> > >> >>>>>> just
> > >> > >> >>>>>>>>>> laying
> > >> > >> >>>>>>>>>> out some considerations we should take into account.
> At
> > >> the end
> > >> > >> >>>> of
> > >> > >> >>>>>> the
> > >> > >> >>>>>>>>> day
> > >> > >> >>>>>>>>>> we should do what's right rather than maintain
> > >> compatibility
> > >> > >> >>>> with
> > >> > >> >>>>>>>>> existing
> > >> > >> >>>>>>>>>> bugs, but sometimes there's a reasonable middle
> ground.
> > >> > >> >>>>>>>>>>
> > >> > >> >>>>>>>>>> On Tue, May 19, 2020 at 6:15 PM John Roesler <
> > >> > >> >>>> vvcep...@apache.org>
> > >> > >> >>>>>>>>> wrote:
> > >> > >> >>>>>>>>>>> Thanks Sophie,
> > >> > >> >>>>>>>>>>>
> > >> > >> >>>>>>>>>>> Woah, that’s a nasty bug. Rather than working around
> it,
> > >> I
> > >> > >> >>>> think
> > >> > >> >>>>> we
> > >> > >> >>>>>>>>> should
> > >> > >> >>>>>>>>>>> just fix it. I’ll leave some comments on the Jira.
> > >> > >> >>>>>>>>>>>
> > >> > >> >>>>>>>>>>> It doesn’t seem like it should be this KIP’s concern
> > >> that some
> > >> > >> >>>>>> serdes
> > >> > >> >>>>>>>>>>> might be incorrectly written.
> > >> > >> >>>>>>>>>>>
> > >> > >> >>>>>>>>>>> Were there other practical concerns that you had in
> mind?
> > >> > >> >>>>>>>>>>>
> > >> > >> >>>>>>>>>>> Thanks,
> > >> > >> >>>>>>>>>>> John
> > >> > >> >>>>>>>>>>>
> > >> > >> >>>>>>>>>>> On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman
> > >> wrote:
> > >> > >> >>>>>>>>>>>> I like this "fancy idea" to just flip the to/from
> bytes
> > >> but I
> > >> > >> >>>>> think
> > >> > >> >>>>>>>>> there
> > >> > >> >>>>>>>>>>>> are some practical limitations to implementing
> this. In
> > >> > >> >>>>> particular
> > >> > >> >>>>>>>>>>>> I'm thinking about this issue
> > >> > >> >>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-8159>
> > >> with the
> > >> > >> >>>>>> built-in
> > >> > >> >>>>>>>>>>> signed
> > >> > >> >>>>>>>>>>>> number serdes.
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>> This trick would actually fix the problem for
> > >> > >> >>>> negative-negative
> > >> > >> >>>>>>>>> queries
> > >> > >> >>>>>>>>>>>> (ie where to & from are negative) but would cause
> > >> > >> undetectable
> > >> > >> >>>>>>>>>>>> incorrect results for negative-positive queries. For
> > >> example,
> > >> > >> >>>> say
> > >> > >> >>>>>> you
> > >> > >> >>>>>>>>>>>> call #range with from = -1 and to = 1, using the
> Short
> > >> > >> serdes.
> > >> > >> >>>>> The
> > >> > >> >>>>>>>>>>>> serialized bytes for that are
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>> from = 1111111111111111
> > >> > >> >>>>>>>>>>>> to = 0000000000000001
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>> so we would end up flipping those and iterating over
> > >> all keys
> > >> > >> >>>>> from
> > >> > >> >>>>>>>>>>>> 0000000000000001 to 1111111111111111. Iterating in
> > >> > >> >>>>> lexicographical
> > >> > >> >>>>>>>>>>>> order means we would iterate over every key in the
> space
> > >> > >> >>>> *except*
> > >> > >> >>>>>> for
> > >> > >> >>>>>>>>>>>> 0, but 0 is actually the *only* other key we meant
> to be
> > >> > >> >>>> included
> > >> > >> >>>>>> in
> > >> > >> >>>>>>>>> the
> > >> > >> >>>>>>>>>>>> range query.
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>> Currently we just log a warning and return an empty
> > >> iterator
> > >> > >> >>>> when
> > >> > >> >>>>>>>>>>>> to < from, which is obviously also incorrect but
> feels
> > >> > >> >>>> slightly
> > >> > >> >>>>>> more
> > >> > >> >>>>>>>>>>>> palatable. If we start automatically converting to
> > >> reverse
> > >> > >> >>>>> queries
> > >> > >> >>>>>> we
> > >> > >> >>>>>>>>>>>> can't even log a warning in this case unless we
> wanted
> > >> to log
> > >> > >> >>>> a
> > >> > >> >>>>>>>>> warning
> > >> > >> >>>>>>>>>>>> every time, which would be weird to do for a valid
> > >> usage of a
> > >> > >> >>>> new
> > >> > >> >>>>>>>>>>>> feature.
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>> All that said, I still like the idea overall. Off
> the
> > >> top of
> > >> > >> >>>> my
> > >> > >> >>>>>> head
> > >> > >> >>>>>>>>> I
> > >> > >> >>>>>>>>>>> guess
> > >> > >> >>>>>>>>>>>> we could add a store config to enable/disable
> automatic
> > >> > >> >>>> reverse
> > >> > >> >>>>>>>>>>> iteration,
> > >> > >> >>>>>>>>>>>> which is off by default?
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>> Thanks for the KIP! This will be a nice addition
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>> Sophie
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>> On Tue, May 19, 2020 at 3:21 PM John Roesler <
> > >> > >> >>>>> vvcep...@apache.org>
> > >> > >> >>>>>>>>>>> wrote:
> > >> > >> >>>>>>>>>>>>> Hi there Jorge,
> > >> > >> >>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>> Thanks for the KIP!
> > >> > >> >>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>> I think this feature sounds very reasonable.
> > >> > >> >>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>> I'm not 100% sure if this is "too fancy", but what
> do
> > >> you
> > >> > >> >>>> think
> > >> > >> >>>>>>>>>>>>> about avoiding the enum by instead allowing people
> to
> > >> flip
> > >> > >> >>>>>>>>>>>>> the "from" and "to" endpoints? I.e., reading from
> "A"
> > >> to "Z"
> > >> > >> >>>>> would
> > >> > >> >>>>>>>>>>>>> be a forward scan, and from "Z" to "A" would be a
> > >> backward
> > >> > >> >>>> one?
> > >> > >> >>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>> Thanks,
> > >> > >> >>>>>>>>>>>>> -John
> > >> > >> >>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>> On Tue, May 19, 2020, at 16:20, Jorge Quilcate
> wrote:
> > >> > >> >>>>>>>>>>>>>> Hi everyone,
> > >> > >> >>>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>>> I would like to start the discussion for KIP-617:
> > >> > >> >>>>>>>>>>>>>>
> > >> > >> >>>>>>>>>
> > >> > >> >>>>>>
> > >> > >> >>>>>
> > >> > >> >>>>
> > >> > >>
> > >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> > >> > >> >>>>>>>>>>>>>> Looking forward to your feedback.
> > >> > >> >>>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>>> Thanks!
> > >> > >> >>>>>>>>>>>>>> Jorge.
> > >> > >> >>>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>>>
> > >> > >> >>>>>>>>>>>>>> Attachments:
> > >> > >> >>>>>>>>>>>>>> * 0x5F2C6E22064982DF.asc
> > >> > >> >>>>>>>
> > >> > >> >>>>>>>
> > >> > >> >>>>>>> Attachments:
> > >> > >> >>>>>>> * 0x5F2C6E22064982DF.asc
> > >> > >> >>>>>>
> > >> > >> >>>>>
> > >> > >> >>>>
> > >> > >> >>>
> > >> > >> >>
> > >> > >>
> > >> > >>
> > >> >
> > >>
> > >
> >
>

Reply via email to