Just catching up here.

I agree that we have two issue, and the first (align subscription store
to main store) can be done as a bug-fix.

For the KIP (that addressed the second), I tend to agree that reusing
`Materialized` might be better as it would keep the API surface area
smaller.


-Matthias

On 4/6/21 8:48 AM, John Roesler wrote:
> Hi Marco,
> 
> Just a quick clarification: I just reviewed the Materialized class. It looks 
> like the only undesirable members are:
> 1. Retention
> 2. Key/Value serdes 
> 
> The underlying store type would be “KeyValueStore<Bytes,byte[]>” , for which 
> case the withRetention javadoc already says it’s ignored.
> 
> Perhaps we could just stick with Materialized by adding a note to the 
> Key/Value serdes setters that they are ignored for FKJoin subscription stores?
> 
> Not as elegant as a new config class, but these config classes actually bring 
> a fair amount of complexity, so it might be nice to avoid a new one.
> 
> Thanks,
> John
> 
> On Tue, Apr 6, 2021, at 10:28, Marco Aurélio Lotz wrote:
>> Hi John / Guozhang,
>>
>> If I correctly understood John's message, he agrees on having the two
>> scenarios (piggy-back and api extension). In my view, these two scenarios
>> are separate tasks - the first one is a bug-fix and the second is an
>> improvement on the current API.
>>
>> - bug-fix: On the current API, we change its implementation to piggy back
>> on the materialization method provided to the materialized parameter. This
>> way it will not be opinionated anymore and will not force RocksDb
>> persistence for subscription store. Thus an in-memory materialized
>> parameter would imply an in-memory subscription store, for example. From my
>> understanding, the original implementation tried to be as unopionated
>> towards storage methods as possible - and the current implementation is not
>> allowing that. Was that the case? We would still need to add this
>> modification to the update notes, since it may affect some deployments.
>>
>> - improvement: We extend the API to allow a user to fine tune different
>> materialization methods for subscription and join store. This is done by
>> adding a new parameter to the associated methods.
>>
>> Does it sound reasonable to you Guozhang?
>> On your question, does it make sense for an user to decide retention
>> policies (withRetention method) or caching for subscription stores? I can
>> see why to finetune Logging for example, but in a first moment not these
>> other behaviours. That's why I am unsure about using Materialized class.
>>
>> @John, I will update the KIP with your points as soon as we clarify this.
>>
>> Cheers,
>> Marco
>>
>> On Tue, Apr 6, 2021 at 1:17 AM Guozhang Wang <wangg...@gmail.com> wrote:
>>
>>> Thanks Marco / John,
>>>
>>> I think the arguments for not piggy-backing on the existing Materialized
>>> makes sense; on the other hand, if we go this route should we just use a
>>> separate Materialized than using an extended /
>>> narrowed-scoped MaterializedSubscription since it seems we want to allow
>>> users to fully customize this store?
>>>
>>> Guozhang
>>>
>>> On Thu, Apr 1, 2021 at 5:28 PM John Roesler <vvcep...@apache.org> wrote:
>>>
>>>> Thanks Marco,
>>>>
>>>> Sorry if I caused any trouble!
>>>>
>>>> I don’t remember what I was thinking before, but reasoning about it now,
>>>> you might need the fine-grained choice if:
>>>>
>>>> 1. The number or size of records in each partition of both tables is
>>>> small(ish), but the cardinality of the join is very high. Then you might
>>>> want an in-memory table store, but an on-disk subscription store.
>>>>
>>>> 2. The number or size of records is very large, but the join cardinality
>>>> is low. Then you might need an on-disk table store, but an in-memory
>>>> subscription store.
>>>>
>>>> 3. You might want a different kind (or differently configured) store for
>>>> the subscription store, since it’s access pattern is so different.
>>>>
>>>> If you buy these, it might be good to put the justification into the KIP.
>>>>
>>>> I’m in favor of the default you’ve proposed.
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> On Mon, Mar 29, 2021, at 04:24, Marco Aurélio Lotz wrote:
>>>>> Hi Guozhang,
>>>>>
>>>>> Apologies for the late answer. Originally that was my proposal - to
>>>>> piggyback on the provided materialisation method (
>>>>> https://issues.apache.org/jira/browse/KAFKA-10383).
>>>>> John Roesler suggested to us to provide even further fine tuning on API
>>>>> level parameters. Maybe we could see this as two sides of the same
>>> coin:
>>>>>
>>>>> - On the current API, we change it to piggy back on the materialization
>>>>> method provided to the join store.
>>>>> - We extend the API to allow a user to fine tune different
>>>> materialization
>>>>> methods for subscription and join store.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Cheers,
>>>>> Marco
>>>>>
>>>>> On Thu, Mar 4, 2021 at 8:04 PM Guozhang Wang <wangg...@gmail.com>
>>> wrote:
>>>>>
>>>>>> Thanks Marco,
>>>>>>
>>>>>> Just a quick thought: what if we reuse the existing Materialized
>>>> object for
>>>>>> both subscription and join stores, instead of introducing a new
>>> param /
>>>>>> class?
>>>>>>
>>>>>> Guozhang
>>>>>>
>>>>>> On Tue, Mar 2, 2021 at 1:07 AM Marco Aurélio Lotz <
>>>> cont...@marcolotz.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> I would like to invite everyone to discuss further KIP-718:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-718%3A+Make+KTable+Join+on+Foreign+key+unopinionated
>>>>>>>
>>>>>>> I welcome all feedback on it.
>>>>>>>
>>>>>>> Kind Regards,
>>>>>>> Marco Lotz
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -- Guozhang
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>

Reply via email to