Thanks John. Agree. I like the first approach as well, with StreamsConfig flag passing by via ProcessorContext.
Another positive effect with "reverse parameters" is that in the case of `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide _which_ pair to flip, whether with `ReadDirection` enum it apply to both. The only issue I've found while reviewing the KIP is that `all()` won't fit within this approach. We could remove it from the KIP and argue that for WindowStore, `fetchAll(0, Long.MAX_VALUE)` can be used to get all in reverse order, and for KeyValueStore, no ordering guarantees are provided. If there is consensus with this changes, I will go and update the KIP. On Thu, May 21, 2020 at 3:33 PM John Roesler <vvcep...@apache.org> wrote: > Hi Jorge, > > Thanks for that idea. I agree, a feature flag would protect anyone > who may be depending on the current behavior. > > It seems better to locate the feature flag in the initialization logic of > the store, rather than have a method on the "live" store that changes > its behavior on the fly. > > It seems like there are two options here, one is to add a new config: > > StreamsConfig.ENABLE_BACKWARDS_ITERATION = > "enable.backwards.iteration > > Or we can add a feature flag in Materialized, like > > Materialized.enableBackwardsIteration() > > I think I'd personally lean toward the config, for the following reason. > The concern that Sophie raised is that someone's program may depend > on the existing contract of getting an empty iterator. We don't want to > switch behavior when they aren't expecting it, so we provide them a > config to assert that they _are_ expecting the new behavior, which > means they take responsibility for updating their code to expect the new > behavior. > > There doesn't seem to be a reason to offer a choice of behaviors on a > per-query, or per-store basis. We just want people to be not surprised > by this change in general. > > What do you think? > Thanks, > -John > > On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote: > > Thank you both for the great feedback. > > > > I like the "fancy" proposal :), and how it removes the need for > > additional API methods. And with a feature flag on `StateStore`, > > disabled by default, should no break current users. > > > > The only side-effect I can think of is that: by moving the flag upwards, > > all later operations become affected; which might be ok for most (all?) > > cases. I can't think of an scenario where this would be an issue, just > > want to point this out. > > > > If moving to this approach, I'd like to check if I got this right before > > updating the KIP: > > > > - only `StateStore` will change by having a new method: > > `backwardIteration()`, `false` by default to keep things compatible. > > - then all `*Stores` will have to update their implementation based on > > this flag. > > > > > > On 20/05/2020 21:02, Sophie Blee-Goldman wrote: > > >> There's no possibility that someone could be relying > > >> on iterating over that range in increasing order, because that's not > what > > >> happens. However, they could indeed be relying on getting an empty > > > iterator > > > > > > I just meant that they might be relying on the assumption that the > range > > > query > > > will never return results with decreasing keys. The empty iterator > wouldn't > > > break that contract, but of course a surprise reverse iterator would. > > > > > > FWIW I actually am in favor of automatically converting to a reverse > > > iterator, > > > I just thought we should consider whether this should be off by > default or > > > even possible to disable at all. > > > > > > On Tue, May 19, 2020 at 7:42 PM John Roesler <vvcep...@apache.org> > wrote: > > > > > >> Thanks for the response, Sophie, > > >> > > >> I wholeheartedly agree we should take as much into account as possible > > >> up front, rather than regretting our decisions later. I actually do > share > > >> your vague sense of worry, which was what led me to say initially > that I > > >> thought my counterproposal might be "too fancy". Sometimes, it's > better > > >> to be explicit instead of "elegant", if we think more people will be > > >> confused > > >> than not. > > >> > > >> I really don't think that there's any danger of "relying on a bug" > here, > > >> although > > >> people certainly could be relying on current behavior. One thing to be > > >> clear > > >> about (which I just left a more detailed comment in KAFKA-8159 about) > is > > >> that > > >> when we say something like key1 > key2, this ordering is defined by > the > > >> serde's output and nothing else. > > >> > > >> Currently, thanks to your fix in > https://github.com/apache/kafka/pull/6521 > > >> , > > >> the store contract is that for range scans, if from > to, then the > store > > >> must > > >> return an empty iterator. There's no possibility that someone could be > > >> relying > > >> on iterating over that range in increasing order, because that's not > what > > >> happens. However, they could indeed be relying on getting an empty > > >> iterator. > > >> > > >> My counterproposal was to actually change this contract to say that > the > > >> store > > >> must return an iterator over the keys in that range, but in the > reverse > > >> order. > > >> So, in addition to considering whether this idea is "too fancy" (aka > > >> confusing), > > >> we should also consider the likelihood of breaking an existing > program with > > >> this behavior/contract change. > > >> > > >> To echo your clarification, I'm also not advocating strongly in favor > of my > > >> proposal. I just wanted to present it for consideration alongside > Jorge's > > >> original one. > > >> > > >> Thanks for raising these very good points, > > >> -John > > >> > > >> On Tue, May 19, 2020, at 20:49, Sophie Blee-Goldman wrote: > > >>>> Rather than working around it, I think we should just fix it > > >>> Now *that's* a "fancy" idea :P > > >>> > > >>> That was my primary concern, although I do have a vague sense of > worry > > >>> that we might be allowing users to get into trouble without > realizing it. > > >>> For example if their custom serdes suffer a similar bug as the above, > > >>> and/or > > >>> they rely on getting results in increasing order (of the keys) even > when > > >>> to < from. Maybe they're relying on the fact that the range query > returns > > >>> nothing in that case. > > >>> > > >>> Not sure if that qualifies as relying on a bug or not, but in that > past > > >>> we've > > >>> taken the stance that we should not break compatibility even if the > user > > >>> was relying on bugs or unintentional behavior. > > >>> > > >>> Just to clarify I'm not advocating strongly against this proposal, > just > > >>> laying > > >>> out some considerations we should take into account. At the end of > the > > >> day > > >>> we should do what's right rather than maintain compatibility with > > >> existing > > >>> bugs, but sometimes there's a reasonable middle ground. > > >>> > > >>> On Tue, May 19, 2020 at 6:15 PM John Roesler <vvcep...@apache.org> > > >> wrote: > > >>>> Thanks Sophie, > > >>>> > > >>>> Woah, that’s a nasty bug. Rather than working around it, I think we > > >> should > > >>>> just fix it. I’ll leave some comments on the Jira. > > >>>> > > >>>> It doesn’t seem like it should be this KIP’s concern that some > serdes > > >>>> might be incorrectly written. > > >>>> > > >>>> Were there other practical concerns that you had in mind? > > >>>> > > >>>> Thanks, > > >>>> John > > >>>> > > >>>> On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman wrote: > > >>>>> I like this "fancy idea" to just flip the to/from bytes but I think > > >> there > > >>>>> are some practical limitations to implementing this. In particular > > >>>>> I'm thinking about this issue > > >>>>> <https://issues.apache.org/jira/browse/KAFKA-8159> with the > built-in > > >>>> signed > > >>>>> number serdes. > > >>>>> > > >>>>> This trick would actually fix the problem for negative-negative > > >> queries > > >>>>> (ie where to & from are negative) but would cause undetectable > > >>>>> incorrect results for negative-positive queries. For example, say > you > > >>>>> call #range with from = -1 and to = 1, using the Short serdes. The > > >>>>> serialized bytes for that are > > >>>>> > > >>>>> from = 1111111111111111 > > >>>>> to = 0000000000000001 > > >>>>> > > >>>>> so we would end up flipping those and iterating over all keys from > > >>>>> 0000000000000001 to 1111111111111111. Iterating in lexicographical > > >>>>> order means we would iterate over every key in the space *except* > for > > >>>>> 0, but 0 is actually the *only* other key we meant to be included > in > > >> the > > >>>>> range query. > > >>>>> > > >>>>> Currently we just log a warning and return an empty iterator when > > >>>>> to < from, which is obviously also incorrect but feels slightly > more > > >>>>> palatable. If we start automatically converting to reverse queries > we > > >>>>> can't even log a warning in this case unless we wanted to log a > > >> warning > > >>>>> every time, which would be weird to do for a valid usage of a new > > >>>>> feature. > > >>>>> > > >>>>> All that said, I still like the idea overall. Off the top of my > head > > >> I > > >>>> guess > > >>>>> we could add a store config to enable/disable automatic reverse > > >>>> iteration, > > >>>>> which is off by default? > > >>>>> > > >>>>> Thanks for the KIP! This will be a nice addition > > >>>>> > > >>>>> Sophie > > >>>>> > > >>>>> > > >>>>> On Tue, May 19, 2020 at 3:21 PM John Roesler <vvcep...@apache.org> > > >>>> wrote: > > >>>>>> Hi there Jorge, > > >>>>>> > > >>>>>> Thanks for the KIP! > > >>>>>> > > >>>>>> I think this feature sounds very reasonable. > > >>>>>> > > >>>>>> I'm not 100% sure if this is "too fancy", but what do you think > > >>>>>> about avoiding the enum by instead allowing people to flip > > >>>>>> the "from" and "to" endpoints? I.e., reading from "A" to "Z" would > > >>>>>> be a forward scan, and from "Z" to "A" would be a backward one? > > >>>>>> > > >>>>>> Thanks, > > >>>>>> -John > > >>>>>> > > >>>>>> On Tue, May 19, 2020, at 16:20, Jorge Quilcate wrote: > > >>>>>>> Hi everyone, > > >>>>>>> > > >>>>>>> I would like to start the discussion for KIP-617: > > >>>>>>> > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards > > >>>>>>> Looking forward to your feedback. > > >>>>>>> > > >>>>>>> Thanks! > > >>>>>>> Jorge. > > >>>>>>> > > >>>>>>> > > >>>>>>> Attachments: > > >>>>>>> * 0x5F2C6E22064982DF.asc > > > > > > Attachments: > > * 0x5F2C6E22064982DF.asc >