Thanks Jeyhun.

About ValueTransformer: I don't think we can remove it. Note,
ValueTransformer allows to attach a state and also allows to register
punctuations. Both those features will not be available via withKey()
interfaces.

-Matthias


On 5/27/17 1:25 PM, Jeyhun Karimov wrote:
> Hi Matthias,
> 
> Thanks for your comments.
> 
> I tested the deep copy approach. It has significant overhead. Especially
> for "light" and stateless operators it slows down the topology
> significantly (> 20% ). I think "warning"  users about not-changing the key
> is better warning them about possible performance loss.
> 
> About the interfaces, additionally I considered adding InitializerWithKey,
> AggregatorWithKey and ValueTransformerWithKey. I think they are included in
> PR but not in KIP. I will also include them in KIP, sorry my bad.
> Including ReducerWithKey definitely makes sense. Thanks.
> 
> One thing I want to mention is that, maybe we should deprecate methods with
> argument type ValueTransformerSupplier (KStream.transformValues(...)) and
> and as a whole the ValueTransformerSupplier interface.
> We can use ValueTransformer/ValueTransformerWithKey type instead without
> additional supplier layer.
> 
> 
> Cheers,
> Jeyhun
> 
> 
> On Thu, May 25, 2017 at 1:07 AM Matthias J. Sax <matth...@confluent.io>
> wrote:
> 
>> One more question:
>>
>> Should we add any of
>>  - InitizialierWithKey
>>  - ReducerWithKey
>>  - ValueTransformerWithKey
>>
>> To get consistent/complete API, it might be a good idea. Any thoughts?
>>
>>
>> -Matthias
>>
>>
>> On 5/24/17 3:47 PM, Matthias J. Sax wrote:
>>> Jeyhun,
>>>
>>> I was just wondering if you did look into the key-deep-copy idea we
>>> discussed. I am curious to see what the impact might be.
>>>
>>>
>>> -Matthias
>>>
>>> On 5/20/17 2:03 AM, Jeyhun Karimov wrote:
>>>> Hi,
>>>>
>>>> Thanks for your comments. I rethink about including rich functions into
>>>> this KIP.
>>>> I think once we include rich functions in this KIP and then fix
>>>> ProcessorContext in another KIP and incorporate with existing rich
>>>> functions, the code will not be backwards compatible.
>>>>
>>>> I see Damian's and your point more clearly now.
>>>>
>>>> Conclusion: we include only withKey interfaces in this KIP (I updated
>> the
>>>> KIP), I will try also initiate another KIP for rich functions.
>>>>
>>>> Cheers,
>>>> Jeyhun
>>>>
>>>> On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax <matth...@confluent.io
>>>
>>>> wrote:
>>>>
>>>>> With the current KIP, using ValueMapper and ValueMapperWithKey
>>>>> interfaces, RichFunction seems to be an independent add-on. To fix the
>>>>> original issue to allow key access, RichFunctions are not required
>> IMHO.
>>>>>
>>>>> I initially put the RichFunction idea on the table, because I was
>> hoping
>>>>> to get a uniform API. And I think, is was good to consider them --
>>>>> however, the discussion showed that they are not necessary for key
>>>>> access. Thus, it seems to be better to handle RichFunctions in an own
>>>>> KIP. The ProcessorContext/RecordContext issues seems to be a main
>>>>> challenge for this. And introducing RichFunctions with parameter-less
>>>>> init() method, seem not to help too much. We would get an
>> "intermediate"
>>>>> API that we want to change anyway later on...
>>>>>
>>>>> As you put already much effort into RichFunction, feel free do push
>> this
>>>>> further and start a new KIP (we can do this even in parallel) -- we
>>>>> don't want to slow you down :) But it make the discussion and code
>>>>> review easier, if we separate both IMHO.
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>>
>>>>> On 5/19/17 2:25 AM, Jeyhun Karimov wrote:
>>>>>> Hi Damian,
>>>>>>
>>>>>> Thanks for your comments. I think providing to users *interface*
>> rather
>>>>>> than *abstract class* should be preferred (Matthias also raised this
>>>>> issue
>>>>>> ), anyway I changed the corresponding parts of KIP.
>>>>>>
>>>>>> Regarding with passing additional contextual information, I think it
>> is a
>>>>>> tradeoff,
>>>>>> 1) first, we fix the context parameter for *init() *method in another
>> PR
>>>>>> and solve Rich functions afterwards
>>>>>> 2) first, we fix the requested issues on jira ([1-3]) with providing
>>>>>> (not-complete) Rich functions and integrate the context parameters to
>>>>> this
>>>>>> afterwards (like improvement)
>>>>>>
>>>>>> To me, the second approach seems more incremental. However you are
>> right,
>>>>>> the names might confuse the users.
>>>>>>
>>>>>>
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/KAFKA-4218
>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-4726
>>>>>> [3] https://issues.apache.org/jira/browse/KAFKA-3745
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Jeyhun
>>>>>>
>>>>>>
>>>>>> On Fri, May 19, 2017 at 10:42 AM Damian Guy <damian....@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I see you've removed the `ProcessorContext` from the RichFunction
>> which
>>>>> is
>>>>>>> good, but why is it a `RichFunction`? I'd have expected it to pass
>> some
>>>>>>> additional contextual information, i.e., the `RecordContext` that
>>>>> contains
>>>>>>> just the topic, partition, timestamp, offset.  I'm ok with it not
>>>>> passing
>>>>>>> this contextual information, but is the naming incorrect? I'm not
>> sure,
>>>>>>> tbh. I'm wondering if we should drop `RichFunctions` until we can do
>> it
>>>>>>> properly with the correct context?
>>>>>>>
>>>>>>> Also, i don't like the abstract classes: RichValueMapper,
>>>>> RichValueJoiner,
>>>>>>> RichInitializer etc. Why can't they not just be interfaces?
>> Generally we
>>>>>>> should provide users with Intefaces to code against, not classes.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Damian
>>>>>>>
>>>>>>> On Fri, 19 May 2017 at 00:50 Jeyhun Karimov <je.kari...@gmail.com>
>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Thanks. I initiated the PR as well, to get a better overview.
>>>>>>>>
>>>>>>>> The only reason that I used abstract class instead of interface for
>>>>> Rich
>>>>>>>> functions is that in future if we will have some
>> AbstractRichFunction
>>>>>>>> abstract classes,
>>>>>>>> we can easily extend:
>>>>>>>>
>>>>>>>> public abstract class RichValueMapper<K, V, VR>  implements
>>>>>>>> ValueMapperWithKey<K, V, VR>, RichFunction *extends
>>>>>>> AbstractRichFunction*{
>>>>>>>> }
>>>>>>>>  With interfaces we are only limited to interfaces for inheritance.
>>>>>>>>
>>>>>>>> However, I think we can easily change it (from abstract class ->
>>>>>>> interface)
>>>>>>>> if you think interface is a better fit.
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Jeyhun
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax <
>> matth...@confluent.io
>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the update and explanations. The KIP is quite improved
>> now
>>>>>>> --
>>>>>>>>> great job!
>>>>>>>>>
>>>>>>>>> One more question: Why are RichValueMapper etc abstract classes and
>>>>> not
>>>>>>>>> interfaces?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Overall, I like the KIP a lot!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -Matthias
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5/16/17 7:03 AM, Jeyhun Karimov wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Thanks for your comments.
>>>>>>>>>>
>>>>>>>>>> I think supporting Lambdas for `withKey` and
>> `AbstractRichFunction`
>>>>>>>>>>> don't go together, as Lambdas are only supported for interfaces
>>>>>>> AFAIK.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Maybe I misunderstood your comment.
>>>>>>>>>> *withKey* and and *withOnlyValue* are interfaces. So they don't
>> have
>>>>>>>>> direct
>>>>>>>>>> relation with *AbstractRichFunction*.
>>>>>>>>>> *withKey* and and *withOnlyValue* interfaces have only one
>> method ,
>>>>>>> so
>>>>>>>>> we
>>>>>>>>>> can use lambdas.
>>>>>>>>>> Where does the *AbstractRichFunction* comes to play? Inside Rich
>>>>>>>>> functions.
>>>>>>>>>> And we use Rich functions in 2 places:
>>>>>>>>>>
>>>>>>>>>> 1. User doesn't use rich functions. Just regular *withKey* and and
>>>>>>>>>> *withOnlyValue* interfaces(both support lambdas) . We get those
>>>>>>>>> interfaces
>>>>>>>>>> and wrap into Rich function while building the topology, and send
>> to
>>>>>>>>>> Processor.
>>>>>>>>>> 2. User does use rich functions (Rich functions implement
>> *withKey*
>>>>>>>>>> interface). As a result no lamdas here by definition. In this
>> case,
>>>>>>>> while
>>>>>>>>>> building the topology we do a type check if the object type is
>>>>>>>>>> *withKey* or *RichFunction*.
>>>>>>>>>>
>>>>>>>>>> So *AbstractRichFunction* is just syntactic sugar for Rich
>> functions
>>>>>>>> and
>>>>>>>>>> does not affect using lambdas.
>>>>>>>>>>
>>>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to
>> have a
>>>>>>>>>>> interface approach like this
>>>>>>>>>>>   - RichFunction -> only adding init() and close()
>>>>>>>>>>>   - ValueMapper
>>>>>>>>>>>   - ValueMapperWithKey
>>>>>>>>>>>   - RichValueMapper extends ValueMapperWithKey, RichFunction
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> As I said above, currently we support lambdas for *withKey*
>>>>>>> interfaces
>>>>>>>> as
>>>>>>>>>> well.  However, I agree with your idea and I will remove the
>>>>>>>>>> AbstractRichFunction from the design.
>>>>>>>>>>
>>>>>>>>>> As an alternative, we could argue, that it is sufficient to
>> support
>>>>>>>>>>> Lambdas for the "plain" API only, but not for any "extended API".
>>>>>>> For
>>>>>>>>>>> this, RichFunction could add key+init+close and
>> AbstractRichFunction
>>>>>>>>>>> would allow to only care about getting the key.
>>>>>>>>>>> Not sure, which one is better. I don't like the idea of more
>>>>>>>> overloaded
>>>>>>>>>>> methods to get Lambdas for `withKey` interfaces too much because
>> we
>>>>>>>> have
>>>>>>>>>>> already so many overlaods. On the other hand, I do see value in
>>>>>>>>>>> supporting Lambdas for `withKey`.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just to clarify, with current design we have only one extra
>>>>>>> overloaded
>>>>>>>>>> method per *withOnlyValue* interface:  which is *withKey* version
>> of
>>>>>>>>>> particular interface.
>>>>>>>>>> We don't need extra overload for Rich function as Rich function
>>>>>>>>> implements
>>>>>>>>>> *withKey* interface as a result they have same type. We
>> differentiate
>>>>>>>>> them
>>>>>>>>>> while building the topology.
>>>>>>>>>> We supported lambdas for *withKey* APIs because of the comment:
>>>>>>>>>>
>>>>>>>>>> @Jeyhun: I did not put any thought into this, but can we have a
>>>>>>> design
>>>>>>>>>>> that allows for both? Also, with regard to lambdas, it might make
>>>>>>>> sense
>>>>>>>>>>> to allow for both `V -> newV` and `(K, V) -> newV` ?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> However, I don't think that this complicates the overall design
>>>>>>>>>> significantly.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Depending on what we want to support, it might make sense to
>>>>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this
>> also
>>>>>>>>>>> determines if we should have a "ProcessorContext KIP" before
>> driving
>>>>>>>>>>> this KIP further.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Based on our discussion I think we should keep Rich functions as I
>>>>>>>> don't
>>>>>>>>>> think that they bring extra layer of overhead to library.
>>>>>>>>>>
>>>>>>>>>> Any comments are appreciated.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Jeyhun
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax <
>>>>>>>> matth...@confluent.io>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Jeyhun,
>>>>>>>>>>>
>>>>>>>>>>> thanks for the update.
>>>>>>>>>>>
>>>>>>>>>>> I think supporting Lambdas for `withKey` and
>> `AbstractRichFunction`
>>>>>>>>>>> don't go together, as Lambdas are only supported for interfaces
>>>>>>> AFAIK.
>>>>>>>>>>>
>>>>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to
>> have a
>>>>>>>>>>> interface approach like this
>>>>>>>>>>>
>>>>>>>>>>>   - RichFunction -> only adding init() and close()
>>>>>>>>>>>
>>>>>>>>>>>   - ValueMapper
>>>>>>>>>>>   - ValueMapperWithKey
>>>>>>>>>>>
>>>>>>>>>>>   - RichValueMapper extends ValueMapperWithKey, RichFunction
>>>>>>>>>>>
>>>>>>>>>>> For this approach, AbstractRichFunction does not make sense
>> anymore,
>>>>>>>> as
>>>>>>>>>>> the only purpose of `RichFunction` is to allow the
>> implementation of
>>>>>>>>>>> init() and close() -- if you don't want those, you would
>> implement a
>>>>>>>>>>> different interface (ie, ValueMapperWithKey)
>>>>>>>>>>>
>>>>>>>>>>> As an alternative, we could argue, that it is sufficient to
>> support
>>>>>>>>>>> Lambdas for the "plain" API only, but not for any "extended API".
>>>>>>> For
>>>>>>>>>>> this, RichFunction could add key+init+close and
>> AbstractRichFunction
>>>>>>>>>>> would allow to only care about getting the key.
>>>>>>>>>>>
>>>>>>>>>>> Not sure, which one is better. I don't like the idea of more
>>>>>>>> overloaded
>>>>>>>>>>> methods to get Lambdas for `withKey` interfaces too much because
>> we
>>>>>>>> have
>>>>>>>>>>> already so many overlaods. On the other hand, I do see value in
>>>>>>>>>>> supporting Lambdas for `withKey`.
>>>>>>>>>>>
>>>>>>>>>>> Depending on what we want to support, it might make sense to
>>>>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this
>> also
>>>>>>>>>>> determines if we should have a "ProcessorContext KIP" before
>> driving
>>>>>>>>>>> this KIP further.
>>>>>>>>>>>
>>>>>>>>>>> Thoughts?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -Matthias
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/15/17 11:01 AM, Jeyhun Karimov wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for super late response. Thanks for your comments.
>>>>>>>>>>>>
>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I
>>>>>>>> cannot
>>>>>>>>>>>>> follow the explanation in the KIP to see what the problem is.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> - From [1] says "A functional interface is an interface that has
>>>>>>> just
>>>>>>>>> one
>>>>>>>>>>>> abstract method, and thus represents a single function
>> contract".
>>>>>>>>>>>> So basically once we extend some interface from another (in our
>>>>>>> case,
>>>>>>>>>>>> ValueMapperWithKey from ValueMapper) we cannot use lambdas in
>> the
>>>>>>>>>>> extended
>>>>>>>>>>>> interface.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Further comments:
>>>>>>>>>>>>>  - The KIP get a little hard to read -- can you maybe reformat
>> the
>>>>>>>>> wiki
>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> - I will work on the KIP.
>>>>>>>>>>>>
>>>>>>>>>>>>  - What about KStream-KTable joins? You don't have overlaods
>> added
>>>>>>>> for
>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add any
>> new
>>>>>>>>>>>>> overloads)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> - Actually there are more than one Processor and public APIs to
>> be
>>>>>>>>>>>> changed (KStream-KTable
>>>>>>>>>>>> joins is one case). However all of them has similar structure:
>> we
>>>>>>>>>>> overload
>>>>>>>>>>>> the *method* with  *methodWithKey*,
>>>>>>>>>>>> wrap it into the Rich function, send to processor and inside the
>>>>>>>>>>> processor
>>>>>>>>>>>> call *init* and *close* methods of the Rich function.
>>>>>>>>>>>> As I wrote in KIP, I wanted to demonstrate the overall idea with
>>>>>>> only
>>>>>>>>>>>> *ValueMapper* as the same can be applied to all changes.
>>>>>>>>>>>> Anyway I will update the KIP.
>>>>>>>>>>>>
>>>>>>>>>>>>  - Why do we need `AbstractRichFunction`?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Instead of overriding the *init(ProcessorContext p)* and*
>> close()*
>>>>>>>>>>> methods
>>>>>>>>>>>> in every Rich function with empty body like:
>>>>>>>>>>>>
>>>>>>>>>>>> @Override
>>>>>>>>>>>> void init(ProcessorContext context) {}
>>>>>>>>>>>>
>>>>>>>>>>>> @Override
>>>>>>>>>>>> void close () {}
>>>>>>>>>>>>
>>>>>>>>>>>> I thought that we can override them once in
>> *AbstractRichFunction*
>>>>>>>> and
>>>>>>>>>>>> extent new Rich functions from *AbstractRichFunction*.
>>>>>>>>>>>> Basically this can eliminate code copy-paste and ease the
>>>>>>>> maintenance.
>>>>>>>>>>>>
>>>>>>>>>>>>  - What about interfaces Initializer, ForeachAction, Merger,
>>>>>>>> Predicate,
>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, but
>> we
>>>>>>>>> should
>>>>>>>>>>>>> discuss all of them and add where it does make sense (e.g.,
>>>>>>>>>>>>> RichForachAction does make sense IMHO)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Definitely agree. As I said, the same technique applies to all
>> this
>>>>>>>>>>>> interfaces and I didn't want to explode the KIP, just wanted to
>>>>>>> give
>>>>>>>>> the
>>>>>>>>>>>> overall intuition.
>>>>>>>>>>>> However, I will update the KIP as I said.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` --
>>>>>>>>> `RichValueXX`
>>>>>>>>>>>>> in general -- but why can't we do all this with interfaces
>> only?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Sure we can. However the main intuition is we should not force
>>>>>>> users
>>>>>>>> to
>>>>>>>>>>>> implement *init(ProcessorContext)* and *close()* functions every
>>>>>>> time
>>>>>>>>>>> they
>>>>>>>>>>>> use Rich functions.
>>>>>>>>>>>> If one needs, she can override the respective methods. However,
>> I
>>>>>>> am
>>>>>>>>> open
>>>>>>>>>>>> for discussion.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I'd rather not see the use of  `ProcessorContext` spread any
>>>>>>> further
>>>>>>>>> than
>>>>>>>>>>>>> it currently is. So maybe we need another KIP that is done
>> before
>>>>>>>>> this?
>>>>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> That is good point. I wanted to make *init(ProcessorContext)*
>>>>>>> method
>>>>>>>>>>>> persistent among the library (which use ProcessorContext as an
>>>>>>>> input),
>>>>>>>>>>>> therefore I put *ProcessorContext* as an input.
>>>>>>>>>>>> So the important question is that (as @dguy and @mjsax
>> mentioned)
>>>>>>>>> whether
>>>>>>>>>>>> continue this KIP without providing users an access to
>>>>>>>>> *ProcessorContext*
>>>>>>>>>>>> (change *init (ProcessorContext)* to * init()* ) or
>>>>>>>>>>>> initiate another KIP before this.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>>> http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java-
>>>>>>>>>>> se-8-jls-pfd-diffs.pdf
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy <
>> damian....@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'd rather not see the use of  `ProcessorContext` spread any
>>>>>>> further
>>>>>>>>>>> than
>>>>>>>>>>>>> it currently is. So maybe we need another KIP that is done
>> before
>>>>>>>>> this?
>>>>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax <
>>>>>>> matth...@confluent.io
>>>>>>>>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree that that `ProcessorContext` interface is too broad in
>>>>>>>>> general
>>>>>>>>>>>>>> -- this is even true for transform/process, and it's also
>>>>>>> reflected
>>>>>>>>> in
>>>>>>>>>>>>>> the API improvement list we want to do.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/
>>>>>>>>>>>>> Kafka+Streams+Discussions
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So I am wondering, if you question the `RichFunction`
>> approach in
>>>>>>>>>>>>>> general? Or if you suggest to either extend the scope of this
>> KIP
>>>>>>>> to
>>>>>>>>>>>>>> include this---or maybe better, do another KIP for it and
>> delay
>>>>>>>> this
>>>>>>>>>>> KIP
>>>>>>>>>>>>>> until the other one is done?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 5/15/17 2:35 AM, Damian Guy wrote:
>>>>>>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm not convinced on the `RichFunction` approach. Do we
>> really
>>>>>>>> want
>>>>>>>>> to
>>>>>>>>>>>>>> give
>>>>>>>>>>>>>>> every DSL method access to the `ProcessorContext` ? It has a
>>>>>>> bunch
>>>>>>>>> of
>>>>>>>>>>>>>>> methods on it that seem in-appropriate for some of the DSL
>>>>>>>> methods,
>>>>>>>>>>>>> i.e,
>>>>>>>>>>>>>>> `register`, `getStateStore`, `forward`, `schedule` etc. It is
>>>>>>> far
>>>>>>>>> too
>>>>>>>>>>>>>>> broad. I think it would be better to have a narrower
>> interface
>>>>>>>> like
>>>>>>>>>>> the
>>>>>>>>>>>>>>> `RecordContext`  - remembering it is easier to add
>>>>>>>>> methods/interfaces
>>>>>>>>>>>>>> later
>>>>>>>>>>>>>>> than to remove them
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax <
>>>>>>>> matth...@confluent.io
>>>>>>>>>>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jeyhun,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little
>> bit?
>>>>>>> I
>>>>>>>>>>>>> cannot
>>>>>>>>>>>>>>>> follow the explanation in the KIP to see what the problem
>> is.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For updating the KIP title I don't know -- guess it's up to
>>>>>>> you.
>>>>>>>>>>>>> Maybe a
>>>>>>>>>>>>>>>> committer can comment on this?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Further comments:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  - The KIP get a little hard to read -- can you maybe
>> reformat
>>>>>>>> the
>>>>>>>>>>>>> wiki
>>>>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  - What about KStream-KTable joins? You don't have overlaods
>>>>>>>> added
>>>>>>>>>>> for
>>>>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add
>> any
>>>>>>>> new
>>>>>>>>>>>>>>>> overloads)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  - Why do we need `AbstractRichFunction`?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  - What about interfaces Initializer, ForeachAction, Merger,
>>>>>>>>>>>>> Predicate,
>>>>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to all,
>> but
>>>>>>> we
>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>> discuss all of them and add where it does make sense (e.g.,
>>>>>>>>>>>>>>>> RichForachAction does make sense IMHO)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` --
>>>>>>>>>>>>> `RichValueXX`
>>>>>>>>>>>>>>>> in general -- but why can't we do all this with interfaces
>>>>>>> only?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for your comments. I think we cannot extend the two
>>>>>>>>>>> interfaces
>>>>>>>>>>>>>> if
>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I should
>>>>>>>> change
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> title, because now we are not limiting the KIP to only
>>>>>>>>> ValueMapper,
>>>>>>>>>>>>>>>>> ValueTransformer and ValueJoiner.
>>>>>>>>>>>>>>>>> Please feel free to comment.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+
>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax <
>>>>>>>>>>>>> matth...@confluent.io>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't
>> need
>>>>>>> the
>>>>>>>>> new
>>>>>>>>>>>>>>>>>> overlaod.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And yes, we need to do one check -- but this happens when
>>>>>>>>> building
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> topology. At runtime (I mean after KafkaStream#start() we
>>>>>>> don't
>>>>>>>>>>> need
>>>>>>>>>>>>>> any
>>>>>>>>>>>>>>>>>> check as we will always use `ValueMapperWithKey`)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote:
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for feedback.
>>>>>>>>>>>>>>>>>>> Then we need to overload method
>>>>>>>>>>>>>>>>>>>   <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ?
>>>>>>>> extends
>>>>>>>>>>>>> VR>
>>>>>>>>>>>>>>>>>>> mapper);
>>>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>   <VR> KStream<K, VR> mapValues(ValueMapperWithKey<?
>> super
>>>>>>> V,
>>>>>>>> ?
>>>>>>>>>>>>>> extends
>>>>>>>>>>>>>>>>>> VR>
>>>>>>>>>>>>>>>>>>> mapper);
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> and in runtime (inside processor) we still have to check
>> it
>>>>>>> is
>>>>>>>>>>>>>>>>>> ValueMapper
>>>>>>>>>>>>>>>>>>> or ValueMapperWithKey before wrapping it into the rich
>>>>>>>> function.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki <
>>>>>>>>>>>>>>>>>>> michal.borowie...@openbet.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> +1 :)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote:
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I was reading the updated KIP and I am wondering, if we
>>>>>>>> should
>>>>>>>>>>> do
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> design a little different.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Instead of distinguishing between a RichFunction and
>>>>>>>>>>>>>> non-RichFunction
>>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>> runtime level, we would use RichFunctions all the time.
>>>>>>>> Thus,
>>>>>>>>> on
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> DSL
>>>>>>>>>>>>>>>>>>>>> entry level, if a user provides a non-RichFunction, we
>>>>>>> wrap
>>>>>>>> it
>>>>>>>>>>>>> by a
>>>>>>>>>>>>>>>>>>>>> RichFunction that is fully implemented by Streams. This
>>>>>>>>>>>>>> RichFunction
>>>>>>>>>>>>>>>>>>>>> would just forward the call omitting the key:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Just to sketch the idea (incomplete code snippet):
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> public StreamsRichValueMapper implements
>>>>>>> RichValueMapper()
>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>>    private ValueMapper userProvidedMapper; // set by
>>>>>>>>>>> constructor
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>    public VR apply(final K key, final V1 value1,
>> final V2
>>>>>>>>>>>>> value2)
>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>>        return userProvidedMapper(value1, value2);
>>>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  From a performance point of view, I am not sure if the
>>>>>>>>>>>>>>>>>>>>> "if(isRichFunction)" including casts etc would have
>> more
>>>>>>>>>>> overhead
>>>>>>>>>>>>>>>> than
>>>>>>>>>>>>>>>>>>>>> this approach (we would do more nested method call for
>>>>>>>>>>>>>>>> non-RichFunction
>>>>>>>>>>>>>>>>>>>>> which should be more common than RichFunctions).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This approach should not effect lambdas (or do I miss
>>>>>>>>>>> something?)
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>> might be cleaner, as we could have one more top level
>>>>>>>>> interface
>>>>>>>>>>>>>>>>>>>>> `RichFunction` with methods `init()` and `close()` and
>>>>>>> also
>>>>>>>>>>>>>>>> interfaces
>>>>>>>>>>>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract classes
>>>>>>>>> required).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Any thoughts?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. I extended PR and KIP to include
>>>>>>> rich
>>>>>>>>>>>>>>>> functions.
>>>>>>>>>>>>>>>>>> I
>>>>>>>>>>>>>>>>>>>>>> will still have to evaluate the cost of deep copying
>> of
>>>>>>>> keys.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak <
>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com>
>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hey Matthias,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> My opinion would be that documenting the
>> immutability of
>>>>>>>> the
>>>>>>>>>>>>> key
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> best approach available.  Better than requiring the
>> key
>>>>>>> to
>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>> serializable
>>>>>>>>>>>>>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no
>> performance
>>>>>>>> risk.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> It'd be different if Java had immutable type
>> constraints
>>>>>>>> of
>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>> kind.
>>>>>>>>>>>>>>>>>>>> :-)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Mathieu
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax <
>>>>>>>>>>>>>>>>>>>> matth...@confluent.io>
>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this path,
>> it
>>>>>>>>> should
>>>>>>>>>>>>>> cover
>>>>>>>>>>>>>>>>>>>>>>>> all(?) DSL interfaces.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> About guarding the key -- I am still not sure what
>> to
>>>>>>> do
>>>>>>>>>>> about
>>>>>>>>>>>>>>>> it...
>>>>>>>>>>>>>>>>>>>>>>>> Maybe it might be enough to document it (and name
>> the
>>>>>>> key
>>>>>>>>>>>>>>>> parameter
>>>>>>>>>>>>>>>>>>>> like
>>>>>>>>>>>>>>>>>>>>>>>> `readOnlyKey` to make is very clear). Ultimately, I
>>>>>>> would
>>>>>>>>>>>>> prefer
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>> guard against any modification, but I have no good
>> idea
>>>>>>>> how
>>>>>>>>>>> to
>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>>>> this.
>>>>>>>>>>>>>>>>>>>>>>>> Not sure what others think about the risk of
>> corrupted
>>>>>>>>>>>>>>>> partitioning
>>>>>>>>>>>>>>>>>>>>>>>> (what would be a user error and we could say, well,
>> bad
>>>>>>>>> luck,
>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>> got
>>>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>>> bug in your code, that's not our fault), vs deep
>> copy
>>>>>>>> with
>>>>>>>>> a
>>>>>>>>>>>>>>>>>> potential
>>>>>>>>>>>>>>>>>>>>>>>> performance hit (that we can't quantity atm without
>> any
>>>>>>>>>>>>>>>> performance
>>>>>>>>>>>>>>>>>>>>>>> test).
>>>>>>>>>>>>>>>>>>>>>>>> We do have a performance system test. Maybe it's
>> worth
>>>>>>>> for
>>>>>>>>>>> you
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> apply
>>>>>>>>>>>>>>>>>>>>>>>> the deep copy strategy and run the test. It's very
>>>>>>> basic
>>>>>>>>>>>>>>>> performance
>>>>>>>>>>>>>>>>>>>>>>>> test only, but might give some insight. If you want
>> to
>>>>>>> do
>>>>>>>>>>>>> this,
>>>>>>>>>>>>>>>> look
>>>>>>>>>>>>>>>>>>>>>>>> into folder "tests" for general test setup, and into
>>>>>>>>>>>>>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find find
>> the
>>>>>>>> perf
>>>>>>>>>>>>>> test.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Matthias,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I think extending KIP to include RichFunctions
>> totally
>>>>>>>>>>> makes
>>>>>>>>>>>>>>>>>> sense.
>>>>>>>>>>>>>>>>>>>>>>> So,
>>>>>>>>>>>>>>>>>>>>>>>>>   we don't want to guard the keys because it is
>>>>>>> costly.
>>>>>>>>>>>>>>>>>>>>>>>>> If we introduce RichFunctions I think it should
>> not be
>>>>>>>>>>>>> limited
>>>>>>>>>>>>>>>> only
>>>>>>>>>>>>>>>>>>>>>>> the 3
>>>>>>>>>>>>>>>>>>>>>>>>> interfaces proposed in KIP but to wide range of
>>>>>>>>> interfaces.
>>>>>>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax <
>>>>>>>>>>>>>>>>>>>> matth...@confluent.io
>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> One follow up. There was this email on the user
>> list:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj=
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+
>>>>>>>>>>>>> key+
>>>>>>>>>>>>>>>>>>>>>>>>>> It might make sense so include Initializer, Adder,
>>>>>>> and
>>>>>>>>>>>>>>>> Substractor
>>>>>>>>>>>>>>>>>>>>>>>>>> inferface, too.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> And we should double check if there are other
>>>>>>> interface
>>>>>>>>> we
>>>>>>>>>>>>>> might
>>>>>>>>>>>>>>>>>>>> miss
>>>>>>>>>>>>>>>>>>>>>>>> atm.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for updating the KIP.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, but I am
>>>>>>>>> actually
>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>> little
>>>>>>>>>>>>>>>>>>>>>>> bit
>>>>>>>>>>>>>>>>>>>>>>>>>>> worried about performance impact... We might
>> want to
>>>>>>>> do
>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>> test
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>> quantify this impact.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Btw: this remind me about the idea of
>> `RichFunction`
>>>>>>>>>>>>>> interface
>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>>>>>> would allow users to access record metadata (like
>>>>>>>>>>>>> timestamp,
>>>>>>>>>>>>>>>>>>>> offset,
>>>>>>>>>>>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a
>> similar
>>>>>>>>>>> concept.
>>>>>>>>>>>>>>>>>> Thus, I
>>>>>>>>>>>>>>>>>>>>>>> am
>>>>>>>>>>>>>>>>>>>>>>>>>>> wondering, if it would make sense to enlarge the
>>>>>>> scope
>>>>>>>>> of
>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>> KIP
>>>>>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>>>> that? WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mathieu,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar approach
>>>>>>> and
>>>>>>>>>>>>> updated
>>>>>>>>>>>>>>>> PR
>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>> KIP
>>>>>>>>>>>>>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in
>> Processors
>>>>>>>>>>>>> sending
>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>> copy
>>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>>>>>>>>>>> actual key.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Because I am doing deep copy of an object, I
>> think
>>>>>>>>> memory
>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>>>>> bottleneck
>>>>>>>>>>>>>>>>>>>>>>>>>>>> in some use-cases.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak <
>>>>>>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper
>> (...etc) to
>>>>>>>> be
>>>>>>>>>>>>>>>> classes,
>>>>>>>>>>>>>>>>>>>>>>> rather
>>>>>>>>>>>>>>>>>>>>>>>>>> than
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards
>> incompatible
>>>>>>>>>>>>> change.
>>>>>>>>>>>>>>>> An
>>>>>>>>>>>>>>>>>>>>>>>>>> alternative
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> approach that would be backwards compatible
>> would
>>>>>>> be
>>>>>>>>> to
>>>>>>>>>>>>>>>> define
>>>>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where those
>>>>>>>>> interfaces
>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>> used.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter as
>> "final"
>>>>>>>>>>>>> doesn't
>>>>>>>>>>>>>>>>>> change
>>>>>>>>>>>>>>>>>>>>>>>> much
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about guarding against key change.  It only
>>>>>>> prevents
>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> parameter
>>>>>>>>>>>>>>>>>>>>>>>>>> variable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from being reassigned.  If the key type is a
>>>>>>> mutable
>>>>>>>>>>>>> object
>>>>>>>>>>>>>>>>>> (eg.
>>>>>>>>>>>>>>>>>>>>>>>>>> byte[]),
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = 0).  But
>>>>>>> I'm
>>>>>>>>> not
>>>>>>>>>>>>>>>> really
>>>>>>>>>>>>>>>>>>>> sure
>>>>>>>>>>>>>>>>>>>>>>>>>> there's
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> much that can be done about that.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov
>> <
>>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we can
>> guard
>>>>>>> for
>>>>>>>>>>>>>>>> immutable
>>>>>>>>>>>>>>>>>>>> keys
>>>>>>>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> current implementation (with few changes), I
>>>>>>> didn't
>>>>>>>>>>>>>> consider
>>>>>>>>>>>>>>>>>>>>>>>> backward
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> compatibility.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my mind. In
>> both
>>>>>>>>>>> cases,
>>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>>>>>>>>>> accesses
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> key in Object type, as passing extra type
>>>>>>> parameter
>>>>>>>>>>> will
>>>>>>>>>>>>>>>> break
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backwards-compatibility.  So user has to cast
>> to
>>>>>>>>> actual
>>>>>>>>>>>>>> key
>>>>>>>>>>>>>>>>>>>> type.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Firstly, We can overload apply method with
>> 2
>>>>>>>>>>> argument
>>>>>>>>>>>>>>>> (key
>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>> value)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and force key to be *final*. By doing this,  I
>>>>>>>> think
>>>>>>>>> we
>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>>>> address
>>>>>>>>>>>>>>>>>>>>>>>>>> both
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backward-compatibility and guarding against
>> key
>>>>>>>>> change.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Secondly, we can create class KeyAccess
>> like:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> public class KeyAccess {
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>      Object key;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>      public void beforeApply(final Object
>> key) {
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>          this.key = key;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>      }
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>      public Object getKey() {
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>          return key;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>      }
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and
>>>>>>>>>>>>>>>>>> *ValueTransformer*
>>>>>>>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> *KeyAccess*. Inside processor (for example
>>>>>>>>>>>>>>>>>>>>>>>> *KTableMapValuesProcessor*)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> before calling *mapper.apply(value)* we can
>> set
>>>>>>> the
>>>>>>>>>>>>> *key*
>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> *mapper.beforeApply(key)*. As a result, user
>> can
>>>>>>>> use
>>>>>>>>>>>>>>>>>> *getKey()*
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>> access
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the key inside *apply(value)* method.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J.
>> Sax <
>>>>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks a lot for the KIP!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think there are two issues we need to
>> address:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (1) The KIP does not consider backward
>>>>>>>>> compatibility.
>>>>>>>>>>>>>> Users
>>>>>>>>>>>>>>>>>> did
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> complain
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this in past releases already, and as
>> the
>>>>>>>> user
>>>>>>>>>>>>> base
>>>>>>>>>>>>>>>>>>>> grows,
>>>>>>>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should not break backward compatibility in
>>>>>>> future
>>>>>>>>>>>>>> releases
>>>>>>>>>>>>>>>>>>>>>>> anymore.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thus, we should think of a better way to
>> allow
>>>>>>> key
>>>>>>>>>>>>>> access.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu's comment goes into the same
>> direction
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile
>>>>>>>> failures
>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>>>> need
>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Another concern is, that there is no
>> guard
>>>>>>> to
>>>>>>>>>>>>> prevent
>>>>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify the key. This might corrupt
>> partitioning
>>>>>>> if
>>>>>>>>>>>>> users
>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>>>> alter
>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> key (accidentally -- or users are just not
>> aware
>>>>>>>>> that
>>>>>>>>>>>>>> they
>>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> allowed to modify the provided key object)
>> and
>>>>>>>> thus
>>>>>>>>>>>>> break
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> application. (This was the original
>> motivation
>>>>>>> to
>>>>>>>>> not
>>>>>>>>>>>>>>>> provide
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>> key
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the first place -- it's guards against
>>>>>>>>> modification.)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I just want to add my voice that, I too,
>> have
>>>>>>>>> wished
>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> access
>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> record key during a mapValues or similar
>>>>>>>> operation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile
>>>>>>> failures
>>>>>>>>>>> that
>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>>>> need
>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-)
>> But
>>>>>>>> at
>>>>>>>>>>>>> least
>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pretty clear and easy change.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun
>> Karimov
>>>>>>> <
>>>>>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Dear community,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues
>>>>>>>>>>>>> KAFKA-4218
>>>>>>>>>>>>>>>> [2],
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-4726
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3],
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-3745 [4]. The related PR can be
>> found at
>>>>>>>>> [5].
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to get your comments.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/
>>>>>>>>>>>>> confluence/display/KAFKA/KIP-
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira
>>>>>>>>>>> /browse/KAFKA-4218
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira
>>>>>>>>>>> /browse/KAFKA-4726
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [4] https://issues.apache.org/jira
>>>>>>>>>>> /browse/KAFKA-3745
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [5]
>> https://github.com/apache/kafka/pull/2946
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to