Thanks for the updates, Alieh!

The example in the KIP uses the allVersions() method which we agreed to remove.

Regarding your questions:
1. asOf vs. until: I am fine with both but slightly prefer until.
2. What about KeyMultiVersionsQuery, KeyVersionsQuery (KIP-960 would then be KeyVersionQuery). However, I am also fine with MultiVersionedKeyQuery since none of the names sounds better or worse to me. 3. I agree with you not to introduce the method with the two bounds to keep things simple.
4. Forget about fromTime() an asOfTime(), from() and asOf() is fine.
5. The main purpose is to show how to use the API. Maybe make an example with just the key to distinguish this query from the single value query of KIP-960 and then one with a key and a time range. When you iterate over the results you could also call validTo(). Maybe add some actual records in the comments to show what the result might look like.

Regarding the test plan, I hope you also plan to add unit tests in all of your KIPs. Maybe you could also explain why system tests are not needed here.

Best,
Bruno

On 10/10/23 5:36 PM, Alieh Saeedi wrote:
Thank you all for the very exact and constructive comments. I really
enjoyed reading your ideas and all the important points you made me aware
of. I updated KIP-968 as follows:

    1. If the key or time bounds are null, the method returns NPE.
    2. The "valid" word: I removed the sentence "all the records that are
    valid..." and replaced it with an exact explanation. More over, I explained
    it with an example in the KIP but not in the javadocs. Do I need to add the
    example to the javadocs as well?
    3. Since I followed Bruno's suggestion and removed the allVersions()
    method, the problem of meaningless combinations is solved, and I do not
    need any IllegalArgumentException or something like that. Therefore, the
    change is that if no time bound is specified, the query returns the records
    with the specified key for all timestamps (all versions).
    4. As Victoria suggested, adding a method to the *VersionedKeyValueStore
    *interface is essential. So I did that. I had this method only in the
    RocksDBVersionedStore class, which was not enough.
    5. I added the *validTo* field to the VersionedRecord class to be able
    to represent the tombstones. As you suggested, we postpone solving the
    problem of retrieving consecutive tombstones for later.
    6. I added the "Test Plan" section to all KIPs. I hope what I wrote is
    convincing.
    7. I added the *withAscendingTimestamp()* method to provide more
code readability
    for the user.
    8. I removed the evil word "get" from all getter methods.

There have also been some more suggestions which I am still not convinced
or clear about them:

    1. Regarding asOf vs until: reading all comments, my conclusion was that
    I keep it as "asOf" (following Walker's idea as the native speaker as well
    as Bruno's suggestion to be consistent with single-key_single_ts queries).
    But I do not have a personal preference. If you insist on "until", I change
    it.
    2. Bruno suggested renaming the class "MultiVersionedKeyQuery" to sth
    else. We already had a long discussion about the name with Matthias. I am
    open to renaming it to something else, but do you have any ideas?
    3. Matthias suggested having a method with two input parameters that
    enables the user to specify both time bounds in the same method. Isn't it
    introducing redundancy? It is somehow disrespectful to the idea of having
    composable methods.
    4. Bruno suggested renaming the methods "asOf" and "from" to "asOfTime"
    and "fromTime". If I do that, then it is not consistent with KIP-960.
    Moreover, the input parameter is clearly a timestamp, which explains
    enough. What do you think about that?
    5. I was asked to add more examples to the example section. My question
    is, what is the main purpose of that? If I know it clearly, then I can add
    what you mean.



Cheers,
Alieh

On Tue, Oct 10, 2023 at 1:13 AM Matthias J. Sax <mj...@apache.org> wrote:

Bruno and I had some background conversation about the `get` prefix
question including a few other committers.

The official policy was never changed, and we should not add the
`get`-prefix. It's a slip on our side in previous KIPs to add the
`get`-prefix and we should actually clean it up doing a follow up KIP.


-Matthias


On 10/5/23 5:26 AM, Bruno Cadonna wrote:
Hi Matthias,

Given all the IQv2 KIPs that use getX and given recent PRs (internal
interfaces mainly) that got merged, I was under the impression that we
moved away from the strict no-getX policy.

I do not think it was an accident using getX in the IQv2 KIPs since
somebody would have brought it up, otherwise.

I am fine with both types of getters.

If we think, we need to discuss this in a broader context, let's start a
separate thread.


Best,
Bruno





On 10/5/23 7:44 AM, Matthias J. Sax wrote:
I agree to (almost) everything what Bruno said.


In general, we tend to move away from using getters without "get",
recently. So I would keep the "get".

This is new to me? Can you elaborate on this point? Why do you think
that's the case?

I actually did realize (after Walker mentioned it) that existing query
types use `get` prefix, but to me it seems that it was by accident and
we should consider correcting it? Thus, I would actually prefer to not
add the `get` prefix for new methods query types.

IMHO, we should do a follow up KIP to deprecate all methods with `get`
prefix and replace them with new ones without `get` -- it's of course
always kinda "unnecessary" noise, but if we don't do it, we might get
into more and more inconsistent naming what would result in a "bad" API.

If we indeed want to change the convention and use the `get` prefix, I
would strongly advocate to bit the bullet and do KIP to pro-actively
add the `get` "everywhere" it's missing... But overall, it seems to be
a much broader decision, and we should get buy in from many committers
about it -- as long as there is no broad consensus to add `get`
everywhere, I would strongly prefer not to diverge from the current
agreement to omit `get`.



-Matthias




On 10/4/23 2:36 AM, Bruno Cadonna wrote:
Hi,

Regarding tombstones:
As far as I understand, we need to add either a validTo field to
VersionedRecord or we need to return tombstones, otherwise the result
is not complete, because users could never know a record was deleted
at some point before the second non-null value was put.
I like more adding the validTo field since it makes the result more
concise and easier interpretable.

Extending on Victoria's example, with the following puts

put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, null, time=15)
put(k, v2, time=20)

the result with tombstones would be

value, timestamp
(v1, 0)
(null, 5)
(null, 10)
(null, 15)
(v2, 20)

instead of

value, timestamp, validTo
(v1, 0, 5)
(v2, 20, null)

The benefit of conciseness would already apply to one single tombstone.

On the other hand, why would somebody write consecutive tombstones
into a versioned state store? I guess if somebody does that on
purpose, then there should be a way to retrieve each of those
tombstones, right?
So maybe we need both -- validTo field and the option to return
tombstones. The latter might be moved to a future KIP in case we see
the need.


Regarding .within(fromTs, toTs):
I would keep it simple with .from() and .asOfTimestamp() (or
.until()). If we go with .within(), I would opt for
.withinTimeRange(fromTs, toTs), because the query becomes more
readable:

MultiVersionedKeyQuery
    .withKey(1)
    .withinTimeRange(Instant.parse(2023-08-03T10:37:30.00Z),
Instant.parse(2023-08-04T10:37:30.00Z))

If we stay with .from() and .until(), we should consider .fromTime()
and .untilTime() (or .toTime()):

MultiVersionedKeyQuery
   .withKey(1)
   .fromTime(Instant.parse(2023-08-03T10:37:30.00Z))
   .untilTime(Instant.parse(2023-08-04T10:37:30.00Z))



Regarding asOf vs. until:
I think asOf() is more used in point in time queries as Walker
mentioned where this KIP specifies a time range. IMO asOf() fits very
well with KIP-960 where one version is queried, but here I think
.until() fits better. That might just be a matter of taste and in the
end I am fine with both as long as it is well documented.


Regarding getters without "get":
In the other IQv2 classes we used getters with "get". In general, we
tend to move away from using getters without "get", recently. So I
would keep the "get".


Best,
Bruno

On 10/3/23 7:49 PM, Walker Carlson wrote:
Hey Alieh thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me
where
until gives me the idea that the query is making a change. It's
totally a
connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also for these methods it makes sense to drop the "get" We don't
normally use that in getters

     * The key that was specified for this query.
     */
    public K getKey();

    /**
     * The starting time point of the query, if specified
     */
    public Optional<Instant> getFromTimestamp();

    /**
     * The ending time point of the query, if specified
     */
    public Optional<Instant> getAsOfTimestamp();

Other than that I didn't have too much to add. Overall I like the
direction
of the KIP and think the funcatinlyt is all there!
best,
Walker



On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax <mj...@apache.org>
wrote:

Thanks for the updated KIP. Overall I like it.

Victoria raises a very good point, and I personally tend to prefer (I
believe so does Victoria, but it's not totally clear from her
email) if
a range query would not return any tombstones, ie, only two records
in
Victoria's example. Thus, it seems best to include a `validTo`
ts-field
to `VersionedRecord` -- otherwise, the retrieved result cannot be
interpreted correctly.

Not sure what others think about it.

I would also be open to actually add a `includeDeletes()` (or
`includeTombstones()`) method/flag (disabled by default) to allow
users
to get all tombstone: this would only be helpful if there are two
consecutive tombstone though (if I got it right), so not sure if we
want
to add it or not -- it seems also possible to add it later if there
is
user demand for it, so it might be a premature addition as this
point?


Nit:

the public interface ValueIterator is used

"is used" -> "is added" (otherwise it sounds like as if
`ValueIterator`
exist already)



Should we also add a `.within(fromTs, toTs)` (or maybe some better
name?) to allow specifying both bounds at once? The existing
`RangeQuery` does the same for specifying the key-range, so might be
good to add for time-range too?



-Matthias


On 9/6/23 5:01 AM, Bruno Cadonna wrote:
In my last e-mail I missed to finish a sentence.

"I think from a KIP"

should be

"I think the KIP looks good!"


On 9/6/23 1:59 PM, Bruno Cadonna wrote:
Hi Alieh,

Thanks for the KIP!

I think from a KIP

1.
I propose to throw an IllegalArgumentException or an
IllegalStateException for meaningless combinations. In any case,
the
KIP should specify what exception is thrown.

2.
Why does not specifying a range return the latest version? I would
expect that it returns all versions since an empty lower or upper
limit is interpreted as no limit.

3.
I second Matthias comment about replacing "asOf" with "until" or
"to".

4.
Do we need "allVersions()"? As I said above I would return all
versions if no limits are specified. I think if we get rid of
allVersions() there might not be any meaningless combinations
anymore.
If a user applies twice the same limit like for example
MultiVersionedKeyQuery.with(key).from(t1).from(t2) the last one
wins.

5.
Could you add some more examples with time ranges to the example
section?

6.
The KIP misses the test plan section.

7.
I propose to rename the class to "MultiVersionKeyQuery" since we
are
querying multiple versions of the same key.

8.
Could you also add withAscendingTimestamps()? IMO it gives users
the
possibility to make their code more readable instead of only
relying
on the default.

Best,
Bruno


On 8/17/23 4:13 AM, Matthias J. Sax wrote:
Thanks for splitting this part into a separate KIP!

For `withKey()` we should be explicit that `null` is not allowed.

(Looking into existing `KeyQuery` it seems the JavaDocs don't
cover
this either -- would you like to do a tiny cleanup PR for this, or
fix on-the-side in one of your PRs?)



The key query returns all the records that are valid in the time
range starting from the timestamp {@code fromTimestamp}.

In the JavaDocs you use the phrase `are valid` -- I think we
need to
explain what "valid" means? It might even be worth to add some
examples. It's annoying, but being precise if kinda important.

With regard to KIP-962, should we allow `null` for time bounds ?
The
JavaDocs should also be explicit if `null` is allowed or not and
what
the semantics are if allowed.



You are using `asOf()` however, because we are doing time-range
queries, to me using `until()` to describe the upper bound would
sound better (I am not a native speaker though, so maybe I am
off?)


The key query returns all the records that have timestamp <=
{@code
asOfTimestamp}.

This is only correct if not lower-bound is set, right?


In your reply to KIP-960 you mentioned:

the meaningless combinations are prevented by throwing
exceptions.

We should add corresponding JavaDocs like:

      @throws IllegalArgumentException if {@code fromTimestamp} is
equal or
                                       larger than {@code
untilTimestamp}

Or something similar.


With regard to KIP-960: if we need to introduce a
`VersionedKeyQuery`
class for single-key-single-ts lookup, would we need to find a new
name for the query class of this KIP, given that the return type
is
different?


-Matthias



On 8/16/23 10:57 AM, Alieh Saeedi wrote:
Hi all,

I splitted KIP-960
<

https://cwiki.apache.org/confluence/display/KAFKA/KIP-960%3A+Support+single-key_single-timestamp+interactive+queries+%28IQv2%29+for+versioned+state+stores

into three separate KIPs. Therefore, please continue discussions
about single-key, multi-timestamp interactive queries here. You
can
see all
the addressed reviews on the following page. Thanks in advance.

KIP-968: Support single-key_multi-timestamp interactive queries
(IQv2) for
versioned state stores
<

https://cwiki.apache.org/confluence/display/KAFKA/KIP-968%3A+Support+single-key_multi-timestamp+interactive+queries+%28IQv2%29+for+versioned+state+stores


I look forward to your feedback!

Cheers,
Alieh





Reply via email to