Thanks Jeyhun. About ValueTransformer: I don't think we can remove it. Note, ValueTransformer allows to attach a state and also allows to register punctuations. Both those features will not be available via withKey() interfaces.
-Matthias On 5/27/17 1:25 PM, Jeyhun Karimov wrote: > Hi Matthias, > > Thanks for your comments. > > I tested the deep copy approach. It has significant overhead. Especially > for "light" and stateless operators it slows down the topology > significantly (> 20% ). I think "warning" users about not-changing the key > is better warning them about possible performance loss. > > About the interfaces, additionally I considered adding InitializerWithKey, > AggregatorWithKey and ValueTransformerWithKey. I think they are included in > PR but not in KIP. I will also include them in KIP, sorry my bad. > Including ReducerWithKey definitely makes sense. Thanks. > > One thing I want to mention is that, maybe we should deprecate methods with > argument type ValueTransformerSupplier (KStream.transformValues(...)) and > and as a whole the ValueTransformerSupplier interface. > We can use ValueTransformer/ValueTransformerWithKey type instead without > additional supplier layer. > > > Cheers, > Jeyhun > > > On Thu, May 25, 2017 at 1:07 AM Matthias J. Sax <matth...@confluent.io> > wrote: > >> One more question: >> >> Should we add any of >> - InitizialierWithKey >> - ReducerWithKey >> - ValueTransformerWithKey >> >> To get consistent/complete API, it might be a good idea. Any thoughts? >> >> >> -Matthias >> >> >> On 5/24/17 3:47 PM, Matthias J. Sax wrote: >>> Jeyhun, >>> >>> I was just wondering if you did look into the key-deep-copy idea we >>> discussed. I am curious to see what the impact might be. >>> >>> >>> -Matthias >>> >>> On 5/20/17 2:03 AM, Jeyhun Karimov wrote: >>>> Hi, >>>> >>>> Thanks for your comments. I rethink about including rich functions into >>>> this KIP. >>>> I think once we include rich functions in this KIP and then fix >>>> ProcessorContext in another KIP and incorporate with existing rich >>>> functions, the code will not be backwards compatible. >>>> >>>> I see Damian's and your point more clearly now. >>>> >>>> Conclusion: we include only withKey interfaces in this KIP (I updated >> the >>>> KIP), I will try also initiate another KIP for rich functions. >>>> >>>> Cheers, >>>> Jeyhun >>>> >>>> On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax <matth...@confluent.io >>> >>>> wrote: >>>> >>>>> With the current KIP, using ValueMapper and ValueMapperWithKey >>>>> interfaces, RichFunction seems to be an independent add-on. To fix the >>>>> original issue to allow key access, RichFunctions are not required >> IMHO. >>>>> >>>>> I initially put the RichFunction idea on the table, because I was >> hoping >>>>> to get a uniform API. And I think, is was good to consider them -- >>>>> however, the discussion showed that they are not necessary for key >>>>> access. Thus, it seems to be better to handle RichFunctions in an own >>>>> KIP. The ProcessorContext/RecordContext issues seems to be a main >>>>> challenge for this. And introducing RichFunctions with parameter-less >>>>> init() method, seem not to help too much. We would get an >> "intermediate" >>>>> API that we want to change anyway later on... >>>>> >>>>> As you put already much effort into RichFunction, feel free do push >> this >>>>> further and start a new KIP (we can do this even in parallel) -- we >>>>> don't want to slow you down :) But it make the discussion and code >>>>> review easier, if we separate both IMHO. >>>>> >>>>> >>>>> -Matthias >>>>> >>>>> >>>>> On 5/19/17 2:25 AM, Jeyhun Karimov wrote: >>>>>> Hi Damian, >>>>>> >>>>>> Thanks for your comments. I think providing to users *interface* >> rather >>>>>> than *abstract class* should be preferred (Matthias also raised this >>>>> issue >>>>>> ), anyway I changed the corresponding parts of KIP. >>>>>> >>>>>> Regarding with passing additional contextual information, I think it >> is a >>>>>> tradeoff, >>>>>> 1) first, we fix the context parameter for *init() *method in another >> PR >>>>>> and solve Rich functions afterwards >>>>>> 2) first, we fix the requested issues on jira ([1-3]) with providing >>>>>> (not-complete) Rich functions and integrate the context parameters to >>>>> this >>>>>> afterwards (like improvement) >>>>>> >>>>>> To me, the second approach seems more incremental. However you are >> right, >>>>>> the names might confuse the users. >>>>>> >>>>>> >>>>>> >>>>>> [1] https://issues.apache.org/jira/browse/KAFKA-4218 >>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-4726 >>>>>> [3] https://issues.apache.org/jira/browse/KAFKA-3745 >>>>>> >>>>>> >>>>>> Cheers, >>>>>> Jeyhun >>>>>> >>>>>> >>>>>> On Fri, May 19, 2017 at 10:42 AM Damian Guy <damian....@gmail.com> >>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I see you've removed the `ProcessorContext` from the RichFunction >> which >>>>> is >>>>>>> good, but why is it a `RichFunction`? I'd have expected it to pass >> some >>>>>>> additional contextual information, i.e., the `RecordContext` that >>>>> contains >>>>>>> just the topic, partition, timestamp, offset. I'm ok with it not >>>>> passing >>>>>>> this contextual information, but is the naming incorrect? I'm not >> sure, >>>>>>> tbh. I'm wondering if we should drop `RichFunctions` until we can do >> it >>>>>>> properly with the correct context? >>>>>>> >>>>>>> Also, i don't like the abstract classes: RichValueMapper, >>>>> RichValueJoiner, >>>>>>> RichInitializer etc. Why can't they not just be interfaces? >> Generally we >>>>>>> should provide users with Intefaces to code against, not classes. >>>>>>> >>>>>>> Thanks, >>>>>>> Damian >>>>>>> >>>>>>> On Fri, 19 May 2017 at 00:50 Jeyhun Karimov <je.kari...@gmail.com> >>>>> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> Thanks. I initiated the PR as well, to get a better overview. >>>>>>>> >>>>>>>> The only reason that I used abstract class instead of interface for >>>>> Rich >>>>>>>> functions is that in future if we will have some >> AbstractRichFunction >>>>>>>> abstract classes, >>>>>>>> we can easily extend: >>>>>>>> >>>>>>>> public abstract class RichValueMapper<K, V, VR> implements >>>>>>>> ValueMapperWithKey<K, V, VR>, RichFunction *extends >>>>>>> AbstractRichFunction*{ >>>>>>>> } >>>>>>>> With interfaces we are only limited to interfaces for inheritance. >>>>>>>> >>>>>>>> However, I think we can easily change it (from abstract class -> >>>>>>> interface) >>>>>>>> if you think interface is a better fit. >>>>>>>> >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Jeyhun >>>>>>>> >>>>>>>> >>>>>>>> On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax < >> matth...@confluent.io >>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Thanks for the update and explanations. The KIP is quite improved >> now >>>>>>> -- >>>>>>>>> great job! >>>>>>>>> >>>>>>>>> One more question: Why are RichValueMapper etc abstract classes and >>>>> not >>>>>>>>> interfaces? >>>>>>>>> >>>>>>>>> >>>>>>>>> Overall, I like the KIP a lot! >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> On 5/16/17 7:03 AM, Jeyhun Karimov wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Thanks for your comments. >>>>>>>>>> >>>>>>>>>> I think supporting Lambdas for `withKey` and >> `AbstractRichFunction` >>>>>>>>>>> don't go together, as Lambdas are only supported for interfaces >>>>>>> AFAIK. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Maybe I misunderstood your comment. >>>>>>>>>> *withKey* and and *withOnlyValue* are interfaces. So they don't >> have >>>>>>>>> direct >>>>>>>>>> relation with *AbstractRichFunction*. >>>>>>>>>> *withKey* and and *withOnlyValue* interfaces have only one >> method , >>>>>>> so >>>>>>>>> we >>>>>>>>>> can use lambdas. >>>>>>>>>> Where does the *AbstractRichFunction* comes to play? Inside Rich >>>>>>>>> functions. >>>>>>>>>> And we use Rich functions in 2 places: >>>>>>>>>> >>>>>>>>>> 1. User doesn't use rich functions. Just regular *withKey* and and >>>>>>>>>> *withOnlyValue* interfaces(both support lambdas) . We get those >>>>>>>>> interfaces >>>>>>>>>> and wrap into Rich function while building the topology, and send >> to >>>>>>>>>> Processor. >>>>>>>>>> 2. User does use rich functions (Rich functions implement >> *withKey* >>>>>>>>>> interface). As a result no lamdas here by definition. In this >> case, >>>>>>>> while >>>>>>>>>> building the topology we do a type check if the object type is >>>>>>>>>> *withKey* or *RichFunction*. >>>>>>>>>> >>>>>>>>>> So *AbstractRichFunction* is just syntactic sugar for Rich >> functions >>>>>>>> and >>>>>>>>>> does not affect using lambdas. >>>>>>>>>> >>>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to >> have a >>>>>>>>>>> interface approach like this >>>>>>>>>>> - RichFunction -> only adding init() and close() >>>>>>>>>>> - ValueMapper >>>>>>>>>>> - ValueMapperWithKey >>>>>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> As I said above, currently we support lambdas for *withKey* >>>>>>> interfaces >>>>>>>> as >>>>>>>>>> well. However, I agree with your idea and I will remove the >>>>>>>>>> AbstractRichFunction from the design. >>>>>>>>>> >>>>>>>>>> As an alternative, we could argue, that it is sufficient to >> support >>>>>>>>>>> Lambdas for the "plain" API only, but not for any "extended API". >>>>>>> For >>>>>>>>>>> this, RichFunction could add key+init+close and >> AbstractRichFunction >>>>>>>>>>> would allow to only care about getting the key. >>>>>>>>>>> Not sure, which one is better. I don't like the idea of more >>>>>>>> overloaded >>>>>>>>>>> methods to get Lambdas for `withKey` interfaces too much because >> we >>>>>>>> have >>>>>>>>>>> already so many overlaods. On the other hand, I do see value in >>>>>>>>>>> supporting Lambdas for `withKey`. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Just to clarify, with current design we have only one extra >>>>>>> overloaded >>>>>>>>>> method per *withOnlyValue* interface: which is *withKey* version >> of >>>>>>>>>> particular interface. >>>>>>>>>> We don't need extra overload for Rich function as Rich function >>>>>>>>> implements >>>>>>>>>> *withKey* interface as a result they have same type. We >> differentiate >>>>>>>>> them >>>>>>>>>> while building the topology. >>>>>>>>>> We supported lambdas for *withKey* APIs because of the comment: >>>>>>>>>> >>>>>>>>>> @Jeyhun: I did not put any thought into this, but can we have a >>>>>>> design >>>>>>>>>>> that allows for both? Also, with regard to lambdas, it might make >>>>>>>> sense >>>>>>>>>>> to allow for both `V -> newV` and `(K, V) -> newV` ? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> However, I don't think that this complicates the overall design >>>>>>>>>> significantly. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Depending on what we want to support, it might make sense to >>>>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this >> also >>>>>>>>>>> determines if we should have a "ProcessorContext KIP" before >> driving >>>>>>>>>>> this KIP further. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Based on our discussion I think we should keep Rich functions as I >>>>>>>> don't >>>>>>>>>> think that they bring extra layer of overhead to library. >>>>>>>>>> >>>>>>>>>> Any comments are appreciated. >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Jeyhun >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax < >>>>>>>> matth...@confluent.io> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Jeyhun, >>>>>>>>>>> >>>>>>>>>>> thanks for the update. >>>>>>>>>>> >>>>>>>>>>> I think supporting Lambdas for `withKey` and >> `AbstractRichFunction` >>>>>>>>>>> don't go together, as Lambdas are only supported for interfaces >>>>>>> AFAIK. >>>>>>>>>>> >>>>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to >> have a >>>>>>>>>>> interface approach like this >>>>>>>>>>> >>>>>>>>>>> - RichFunction -> only adding init() and close() >>>>>>>>>>> >>>>>>>>>>> - ValueMapper >>>>>>>>>>> - ValueMapperWithKey >>>>>>>>>>> >>>>>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction >>>>>>>>>>> >>>>>>>>>>> For this approach, AbstractRichFunction does not make sense >> anymore, >>>>>>>> as >>>>>>>>>>> the only purpose of `RichFunction` is to allow the >> implementation of >>>>>>>>>>> init() and close() -- if you don't want those, you would >> implement a >>>>>>>>>>> different interface (ie, ValueMapperWithKey) >>>>>>>>>>> >>>>>>>>>>> As an alternative, we could argue, that it is sufficient to >> support >>>>>>>>>>> Lambdas for the "plain" API only, but not for any "extended API". >>>>>>> For >>>>>>>>>>> this, RichFunction could add key+init+close and >> AbstractRichFunction >>>>>>>>>>> would allow to only care about getting the key. >>>>>>>>>>> >>>>>>>>>>> Not sure, which one is better. I don't like the idea of more >>>>>>>> overloaded >>>>>>>>>>> methods to get Lambdas for `withKey` interfaces too much because >> we >>>>>>>> have >>>>>>>>>>> already so many overlaods. On the other hand, I do see value in >>>>>>>>>>> supporting Lambdas for `withKey`. >>>>>>>>>>> >>>>>>>>>>> Depending on what we want to support, it might make sense to >>>>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this >> also >>>>>>>>>>> determines if we should have a "ProcessorContext KIP" before >> driving >>>>>>>>>>> this KIP further. >>>>>>>>>>> >>>>>>>>>>> Thoughts? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -Matthias >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 5/15/17 11:01 AM, Jeyhun Karimov wrote: >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> Sorry for super late response. Thanks for your comments. >>>>>>>>>>>> >>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I >>>>>>>> cannot >>>>>>>>>>>>> follow the explanation in the KIP to see what the problem is. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> - From [1] says "A functional interface is an interface that has >>>>>>> just >>>>>>>>> one >>>>>>>>>>>> abstract method, and thus represents a single function >> contract". >>>>>>>>>>>> So basically once we extend some interface from another (in our >>>>>>> case, >>>>>>>>>>>> ValueMapperWithKey from ValueMapper) we cannot use lambdas in >> the >>>>>>>>>>> extended >>>>>>>>>>>> interface. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Further comments: >>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe reformat >> the >>>>>>>>> wiki >>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> - I will work on the KIP. >>>>>>>>>>>> >>>>>>>>>>>> - What about KStream-KTable joins? You don't have overlaods >> added >>>>>>>> for >>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add any >> new >>>>>>>>>>>>> overloads) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> - Actually there are more than one Processor and public APIs to >> be >>>>>>>>>>>> changed (KStream-KTable >>>>>>>>>>>> joins is one case). However all of them has similar structure: >> we >>>>>>>>>>> overload >>>>>>>>>>>> the *method* with *methodWithKey*, >>>>>>>>>>>> wrap it into the Rich function, send to processor and inside the >>>>>>>>>>> processor >>>>>>>>>>>> call *init* and *close* methods of the Rich function. >>>>>>>>>>>> As I wrote in KIP, I wanted to demonstrate the overall idea with >>>>>>> only >>>>>>>>>>>> *ValueMapper* as the same can be applied to all changes. >>>>>>>>>>>> Anyway I will update the KIP. >>>>>>>>>>>> >>>>>>>>>>>> - Why do we need `AbstractRichFunction`? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Instead of overriding the *init(ProcessorContext p)* and* >> close()* >>>>>>>>>>> methods >>>>>>>>>>>> in every Rich function with empty body like: >>>>>>>>>>>> >>>>>>>>>>>> @Override >>>>>>>>>>>> void init(ProcessorContext context) {} >>>>>>>>>>>> >>>>>>>>>>>> @Override >>>>>>>>>>>> void close () {} >>>>>>>>>>>> >>>>>>>>>>>> I thought that we can override them once in >> *AbstractRichFunction* >>>>>>>> and >>>>>>>>>>>> extent new Rich functions from *AbstractRichFunction*. >>>>>>>>>>>> Basically this can eliminate code copy-paste and ease the >>>>>>>> maintenance. >>>>>>>>>>>> >>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, >>>>>>>> Predicate, >>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, but >> we >>>>>>>>> should >>>>>>>>>>>>> discuss all of them and add where it does make sense (e.g., >>>>>>>>>>>>> RichForachAction does make sense IMHO) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Definitely agree. As I said, the same technique applies to all >> this >>>>>>>>>>>> interfaces and I didn't want to explode the KIP, just wanted to >>>>>>> give >>>>>>>>> the >>>>>>>>>>>> overall intuition. >>>>>>>>>>>> However, I will update the KIP as I said. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- >>>>>>>>> `RichValueXX` >>>>>>>>>>>>> in general -- but why can't we do all this with interfaces >> only? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Sure we can. However the main intuition is we should not force >>>>>>> users >>>>>>>> to >>>>>>>>>>>> implement *init(ProcessorContext)* and *close()* functions every >>>>>>> time >>>>>>>>>>> they >>>>>>>>>>>> use Rich functions. >>>>>>>>>>>> If one needs, she can override the respective methods. However, >> I >>>>>>> am >>>>>>>>> open >>>>>>>>>>>> for discussion. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any >>>>>>> further >>>>>>>>> than >>>>>>>>>>>>> it currently is. So maybe we need another KIP that is done >> before >>>>>>>>> this? >>>>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> That is good point. I wanted to make *init(ProcessorContext)* >>>>>>> method >>>>>>>>>>>> persistent among the library (which use ProcessorContext as an >>>>>>>> input), >>>>>>>>>>>> therefore I put *ProcessorContext* as an input. >>>>>>>>>>>> So the important question is that (as @dguy and @mjsax >> mentioned) >>>>>>>>> whether >>>>>>>>>>>> continue this KIP without providing users an access to >>>>>>>>> *ProcessorContext* >>>>>>>>>>>> (change *init (ProcessorContext)* to * init()* ) or >>>>>>>>>>>> initiate another KIP before this. >>>>>>>>>>>> >>>>>>>>>>>> [1] >>>>>>>>>>>> http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java- >>>>>>>>>>> se-8-jls-pfd-diffs.pdf >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> Jeyhun >>>>>>>>>>>> >>>>>>>>>>>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy < >> damian....@gmail.com> >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any >>>>>>> further >>>>>>>>>>> than >>>>>>>>>>>>> it currently is. So maybe we need another KIP that is done >> before >>>>>>>>> this? >>>>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax < >>>>>>> matth...@confluent.io >>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> I agree that that `ProcessorContext` interface is too broad in >>>>>>>>> general >>>>>>>>>>>>>> -- this is even true for transform/process, and it's also >>>>>>> reflected >>>>>>>>> in >>>>>>>>>>>>>> the API improvement list we want to do. >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/ >>>>>>>>>>>>> Kafka+Streams+Discussions >>>>>>>>>>>>>> >>>>>>>>>>>>>> So I am wondering, if you question the `RichFunction` >> approach in >>>>>>>>>>>>>> general? Or if you suggest to either extend the scope of this >> KIP >>>>>>>> to >>>>>>>>>>>>>> include this---or maybe better, do another KIP for it and >> delay >>>>>>>> this >>>>>>>>>>> KIP >>>>>>>>>>>>>> until the other one is done? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 5/15/17 2:35 AM, Damian Guy wrote: >>>>>>>>>>>>>>> Thanks for the KIP. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'm not convinced on the `RichFunction` approach. Do we >> really >>>>>>>> want >>>>>>>>> to >>>>>>>>>>>>>> give >>>>>>>>>>>>>>> every DSL method access to the `ProcessorContext` ? It has a >>>>>>> bunch >>>>>>>>> of >>>>>>>>>>>>>>> methods on it that seem in-appropriate for some of the DSL >>>>>>>> methods, >>>>>>>>>>>>> i.e, >>>>>>>>>>>>>>> `register`, `getStateStore`, `forward`, `schedule` etc. It is >>>>>>> far >>>>>>>>> too >>>>>>>>>>>>>>> broad. I think it would be better to have a narrower >> interface >>>>>>>> like >>>>>>>>>>> the >>>>>>>>>>>>>>> `RecordContext` - remembering it is easier to add >>>>>>>>> methods/interfaces >>>>>>>>>>>>>> later >>>>>>>>>>>>>>> than to remove them >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax < >>>>>>>> matth...@confluent.io >>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Jeyhun, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little >> bit? >>>>>>> I >>>>>>>>>>>>> cannot >>>>>>>>>>>>>>>> follow the explanation in the KIP to see what the problem >> is. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> For updating the KIP title I don't know -- guess it's up to >>>>>>> you. >>>>>>>>>>>>> Maybe a >>>>>>>>>>>>>>>> committer can comment on this? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Further comments: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe >> reformat >>>>>>>> the >>>>>>>>>>>>> wiki >>>>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - What about KStream-KTable joins? You don't have overlaods >>>>>>>> added >>>>>>>>>>> for >>>>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add >> any >>>>>>>> new >>>>>>>>>>>>>>>> overloads) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Why do we need `AbstractRichFunction`? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, >>>>>>>>>>>>> Predicate, >>>>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, >> but >>>>>>> we >>>>>>>>>>>>> should >>>>>>>>>>>>>>>> discuss all of them and add where it does make sense (e.g., >>>>>>>>>>>>>>>> RichForachAction does make sense IMHO) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- >>>>>>>>>>>>> `RichValueXX` >>>>>>>>>>>>>>>> in general -- but why can't we do all this with interfaces >>>>>>> only? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for your comments. I think we cannot extend the two >>>>>>>>>>> interfaces >>>>>>>>>>>>>> if >>>>>>>>>>>>>>>> we >>>>>>>>>>>>>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I should >>>>>>>> change >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>> title, because now we are not limiting the KIP to only >>>>>>>>> ValueMapper, >>>>>>>>>>>>>>>>> ValueTransformer and ValueJoiner. >>>>>>>>>>>>>>>>> Please feel free to comment. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax < >>>>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't >> need >>>>>>> the >>>>>>>>> new >>>>>>>>>>>>>>>>>> overlaod. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> And yes, we need to do one check -- but this happens when >>>>>>>>> building >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> topology. At runtime (I mean after KafkaStream#start() we >>>>>>> don't >>>>>>>>>>> need >>>>>>>>>>>>>> any >>>>>>>>>>>>>>>>>> check as we will always use `ValueMapperWithKey`) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks for feedback. >>>>>>>>>>>>>>>>>>> Then we need to overload method >>>>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? >>>>>>>> extends >>>>>>>>>>>>> VR> >>>>>>>>>>>>>>>>>>> mapper); >>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? >> super >>>>>>> V, >>>>>>>> ? >>>>>>>>>>>>>> extends >>>>>>>>>>>>>>>>>> VR> >>>>>>>>>>>>>>>>>>> mapper); >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> and in runtime (inside processor) we still have to check >> it >>>>>>> is >>>>>>>>>>>>>>>>>> ValueMapper >>>>>>>>>>>>>>>>>>> or ValueMapperWithKey before wrapping it into the rich >>>>>>>> function. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki < >>>>>>>>>>>>>>>>>>> michal.borowie...@openbet.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> +1 :) >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I was reading the updated KIP and I am wondering, if we >>>>>>>> should >>>>>>>>>>> do >>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> design a little different. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Instead of distinguishing between a RichFunction and >>>>>>>>>>>>>> non-RichFunction >>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>> runtime level, we would use RichFunctions all the time. >>>>>>>> Thus, >>>>>>>>> on >>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> DSL >>>>>>>>>>>>>>>>>>>>> entry level, if a user provides a non-RichFunction, we >>>>>>> wrap >>>>>>>> it >>>>>>>>>>>>> by a >>>>>>>>>>>>>>>>>>>>> RichFunction that is fully implemented by Streams. This >>>>>>>>>>>>>> RichFunction >>>>>>>>>>>>>>>>>>>>> would just forward the call omitting the key: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Just to sketch the idea (incomplete code snippet): >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> public StreamsRichValueMapper implements >>>>>>> RichValueMapper() >>>>>>>> { >>>>>>>>>>>>>>>>>>>>>> private ValueMapper userProvidedMapper; // set by >>>>>>>>>>> constructor >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> public VR apply(final K key, final V1 value1, >> final V2 >>>>>>>>>>>>> value2) >>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>>> return userProvidedMapper(value1, value2); >>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> From a performance point of view, I am not sure if the >>>>>>>>>>>>>>>>>>>>> "if(isRichFunction)" including casts etc would have >> more >>>>>>>>>>> overhead >>>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>>>>> this approach (we would do more nested method call for >>>>>>>>>>>>>>>> non-RichFunction >>>>>>>>>>>>>>>>>>>>> which should be more common than RichFunctions). >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> This approach should not effect lambdas (or do I miss >>>>>>>>>>> something?) >>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>> might be cleaner, as we could have one more top level >>>>>>>>> interface >>>>>>>>>>>>>>>>>>>>> `RichFunction` with methods `init()` and `close()` and >>>>>>> also >>>>>>>>>>>>>>>> interfaces >>>>>>>>>>>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract classes >>>>>>>>> required). >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Any thoughts? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks for comments. I extended PR and KIP to include >>>>>>> rich >>>>>>>>>>>>>>>> functions. >>>>>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>>>>>>>> will still have to evaluate the cost of deep copying >> of >>>>>>>> keys. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak < >>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> >>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hey Matthias, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> My opinion would be that documenting the >> immutability of >>>>>>>> the >>>>>>>>>>>>> key >>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> best approach available. Better than requiring the >> key >>>>>>> to >>>>>>>>> be >>>>>>>>>>>>>>>>>>>> serializable >>>>>>>>>>>>>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no >> performance >>>>>>>> risk. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> It'd be different if Java had immutable type >> constraints >>>>>>>> of >>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>> kind. >>>>>>>>>>>>>>>>>>>> :-) >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax < >>>>>>>>>>>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this path, >> it >>>>>>>>> should >>>>>>>>>>>>>> cover >>>>>>>>>>>>>>>>>>>>>>>> all(?) DSL interfaces. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> About guarding the key -- I am still not sure what >> to >>>>>>> do >>>>>>>>>>> about >>>>>>>>>>>>>>>> it... >>>>>>>>>>>>>>>>>>>>>>>> Maybe it might be enough to document it (and name >> the >>>>>>> key >>>>>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>>>>> like >>>>>>>>>>>>>>>>>>>>>>>> `readOnlyKey` to make is very clear). Ultimately, I >>>>>>> would >>>>>>>>>>>>> prefer >>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>> guard against any modification, but I have no good >> idea >>>>>>>> how >>>>>>>>>>> to >>>>>>>>>>>>>> do >>>>>>>>>>>>>>>>>>>> this. >>>>>>>>>>>>>>>>>>>>>>>> Not sure what others think about the risk of >> corrupted >>>>>>>>>>>>>>>> partitioning >>>>>>>>>>>>>>>>>>>>>>>> (what would be a user error and we could say, well, >> bad >>>>>>>>> luck, >>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>> got >>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>> bug in your code, that's not our fault), vs deep >> copy >>>>>>>> with >>>>>>>>> a >>>>>>>>>>>>>>>>>> potential >>>>>>>>>>>>>>>>>>>>>>>> performance hit (that we can't quantity atm without >> any >>>>>>>>>>>>>>>> performance >>>>>>>>>>>>>>>>>>>>>>> test). >>>>>>>>>>>>>>>>>>>>>>>> We do have a performance system test. Maybe it's >> worth >>>>>>>> for >>>>>>>>>>> you >>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>> apply >>>>>>>>>>>>>>>>>>>>>>>> the deep copy strategy and run the test. It's very >>>>>>> basic >>>>>>>>>>>>>>>> performance >>>>>>>>>>>>>>>>>>>>>>>> test only, but might give some insight. If you want >> to >>>>>>> do >>>>>>>>>>>>> this, >>>>>>>>>>>>>>>> look >>>>>>>>>>>>>>>>>>>>>>>> into folder "tests" for general test setup, and into >>>>>>>>>>>>>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find find >> the >>>>>>>> perf >>>>>>>>>>>>>> test. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>>>> Hi Matthias, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I think extending KIP to include RichFunctions >> totally >>>>>>>>>>> makes >>>>>>>>>>>>>>>>>> sense. >>>>>>>>>>>>>>>>>>>>>>> So, >>>>>>>>>>>>>>>>>>>>>>>>> we don't want to guard the keys because it is >>>>>>> costly. >>>>>>>>>>>>>>>>>>>>>>>>> If we introduce RichFunctions I think it should >> not be >>>>>>>>>>>>> limited >>>>>>>>>>>>>>>> only >>>>>>>>>>>>>>>>>>>>>>> the 3 >>>>>>>>>>>>>>>>>>>>>>>>> interfaces proposed in KIP but to wide range of >>>>>>>>> interfaces. >>>>>>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax < >>>>>>>>>>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> One follow up. There was this email on the user >> list: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+ >>>>>>>>>>>>> key+ >>>>>>>>>>>>>>>>>>>>>>>>>> It might make sense so include Initializer, Adder, >>>>>>> and >>>>>>>>>>>>>>>> Substractor >>>>>>>>>>>>>>>>>>>>>>>>>> inferface, too. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> And we should double check if there are other >>>>>>> interface >>>>>>>>> we >>>>>>>>>>>>>> might >>>>>>>>>>>>>>>>>>>> miss >>>>>>>>>>>>>>>>>>>>>>>> atm. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for updating the KIP. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, but I am >>>>>>>>> actually >>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>> little >>>>>>>>>>>>>>>>>>>>>>> bit >>>>>>>>>>>>>>>>>>>>>>>>>>> worried about performance impact... We might >> want to >>>>>>>> do >>>>>>>>>>>>> some >>>>>>>>>>>>>>>> test >>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>> quantify this impact. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Btw: this remind me about the idea of >> `RichFunction` >>>>>>>>>>>>>> interface >>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>>> would allow users to access record metadata (like >>>>>>>>>>>>> timestamp, >>>>>>>>>>>>>>>>>>>> offset, >>>>>>>>>>>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a >> similar >>>>>>>>>>> concept. >>>>>>>>>>>>>>>>>> Thus, I >>>>>>>>>>>>>>>>>>>>>>> am >>>>>>>>>>>>>>>>>>>>>>>>>>> wondering, if it would make sense to enlarge the >>>>>>> scope >>>>>>>>> of >>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>>>> by >>>>>>>>>>>>>>>>>>>>>>>>>>> that? WDYT? >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mathieu, >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar approach >>>>>>> and >>>>>>>>>>>>> updated >>>>>>>>>>>>>>>> PR >>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in >> Processors >>>>>>>>>>>>> sending >>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>> copy >>>>>>>>>>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>>>>>>> actual key. >>>>>>>>>>>>>>>>>>>>>>>>>>>> Because I am doing deep copy of an object, I >> think >>>>>>>>> memory >>>>>>>>>>>>>> can >>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>> bottleneck >>>>>>>>>>>>>>>>>>>>>>>>>>>> in some use-cases. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak < >>>>>>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> >>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper >> (...etc) to >>>>>>>> be >>>>>>>>>>>>>>>> classes, >>>>>>>>>>>>>>>>>>>>>>> rather >>>>>>>>>>>>>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards >> incompatible >>>>>>>>>>>>> change. >>>>>>>>>>>>>>>> An >>>>>>>>>>>>>>>>>>>>>>>>>> alternative >>>>>>>>>>>>>>>>>>>>>>>>>>>>> approach that would be backwards compatible >> would >>>>>>> be >>>>>>>>> to >>>>>>>>>>>>>>>> define >>>>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where those >>>>>>>>> interfaces >>>>>>>>>>>>>> are >>>>>>>>>>>>>>>>>>>> used. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter as >> "final" >>>>>>>>>>>>> doesn't >>>>>>>>>>>>>>>>>> change >>>>>>>>>>>>>>>>>>>>>>>> much >>>>>>>>>>>>>>>>>>>>>>>>>>>>> about guarding against key change. It only >>>>>>> prevents >>>>>>>>> the >>>>>>>>>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>>>>>>>>>>> variable >>>>>>>>>>>>>>>>>>>>>>>>>>>>> from being reassigned. If the key type is a >>>>>>> mutable >>>>>>>>>>>>> object >>>>>>>>>>>>>>>>>> (eg. >>>>>>>>>>>>>>>>>>>>>>>>>> byte[]), >>>>>>>>>>>>>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = 0). But >>>>>>> I'm >>>>>>>>> not >>>>>>>>>>>>>>>> really >>>>>>>>>>>>>>>>>>>> sure >>>>>>>>>>>>>>>>>>>>>>>>>> there's >>>>>>>>>>>>>>>>>>>>>>>>>>>>> much that can be done about that. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov >> < >>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we can >> guard >>>>>>> for >>>>>>>>>>>>>>>> immutable >>>>>>>>>>>>>>>>>>>> keys >>>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> current implementation (with few changes), I >>>>>>> didn't >>>>>>>>>>>>>> consider >>>>>>>>>>>>>>>>>>>>>>>> backward >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> compatibility. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my mind. In >> both >>>>>>>>>>> cases, >>>>>>>>>>>>>>>> user >>>>>>>>>>>>>>>>>>>>>>>> accesses >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> key in Object type, as passing extra type >>>>>>> parameter >>>>>>>>>>> will >>>>>>>>>>>>>>>> break >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backwards-compatibility. So user has to cast >> to >>>>>>>>> actual >>>>>>>>>>>>>> key >>>>>>>>>>>>>>>>>>>> type. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Firstly, We can overload apply method with >> 2 >>>>>>>>>>> argument >>>>>>>>>>>>>>>> (key >>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>>>> value) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and force key to be *final*. By doing this, I >>>>>>>> think >>>>>>>>> we >>>>>>>>>>>>>> can >>>>>>>>>>>>>>>>>>>>>>> address >>>>>>>>>>>>>>>>>>>>>>>>>> both >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backward-compatibility and guarding against >> key >>>>>>>>> change. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Secondly, we can create class KeyAccess >> like: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> public class KeyAccess { >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Object key; >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> public void beforeApply(final Object >> key) { >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this.key = key; >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> public Object getKey() { >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> return key; >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and >>>>>>>>>>>>>>>>>> *ValueTransformer* >>>>>>>>>>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> *KeyAccess*. Inside processor (for example >>>>>>>>>>>>>>>>>>>>>>>> *KTableMapValuesProcessor*) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> before calling *mapper.apply(value)* we can >> set >>>>>>> the >>>>>>>>>>>>> *key* >>>>>>>>>>>>>> by >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> *mapper.beforeApply(key)*. As a result, user >> can >>>>>>>> use >>>>>>>>>>>>>>>>>> *getKey()* >>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>> access >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the key inside *apply(value)* method. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. >> Sax < >>>>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks a lot for the KIP! >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think there are two issues we need to >> address: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (1) The KIP does not consider backward >>>>>>>>> compatibility. >>>>>>>>>>>>>> Users >>>>>>>>>>>>>>>>>> did >>>>>>>>>>>>>>>>>>>>>>>>>>>>> complain >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this in past releases already, and as >> the >>>>>>>> user >>>>>>>>>>>>> base >>>>>>>>>>>>>>>>>>>> grows, >>>>>>>>>>>>>>>>>>>>>>> we >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should not break backward compatibility in >>>>>>> future >>>>>>>>>>>>>> releases >>>>>>>>>>>>>>>>>>>>>>> anymore. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thus, we should think of a better way to >> allow >>>>>>> key >>>>>>>>>>>>>> access. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu's comment goes into the same >> direction >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile >>>>>>>> failures >>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Another concern is, that there is no >> guard >>>>>>> to >>>>>>>>>>>>> prevent >>>>>>>>>>>>>>>>>> user >>>>>>>>>>>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify the key. This might corrupt >> partitioning >>>>>>> if >>>>>>>>>>>>> users >>>>>>>>>>>>>> do >>>>>>>>>>>>>>>>>>>> alter >>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> key (accidentally -- or users are just not >> aware >>>>>>>>> that >>>>>>>>>>>>>> they >>>>>>>>>>>>>>>>>> are >>>>>>>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> allowed to modify the provided key object) >> and >>>>>>>> thus >>>>>>>>>>>>> break >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> application. (This was the original >> motivation >>>>>>> to >>>>>>>>> not >>>>>>>>>>>>>>>> provide >>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> key >>>>>>>>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the first place -- it's guards against >>>>>>>>> modification.) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I just want to add my voice that, I too, >> have >>>>>>>>> wished >>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>> access >>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> record key during a mapValues or similar >>>>>>>> operation. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile >>>>>>> failures >>>>>>>>>>> that >>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) >> But >>>>>>>> at >>>>>>>>>>>>> least >>>>>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pretty clear and easy change. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun >> Karimov >>>>>>> < >>>>>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Dear community, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues >>>>>>>>>>>>> KAFKA-4218 >>>>>>>>>>>>>>>> [2], >>>>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-4726 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3], >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-3745 [4]. The related PR can be >> found at >>>>>>>>> [5]. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to get your comments. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/ >>>>>>>>>>>>> confluence/display/KAFKA/KIP- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira >>>>>>>>>>> /browse/KAFKA-4218 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira >>>>>>>>>>> /browse/KAFKA-4726 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [4] https://issues.apache.org/jira >>>>>>>>>>> /browse/KAFKA-3745 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [5] >> https://github.com/apache/kafka/pull/2946 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >
signature.asc
Description: OpenPGP digital signature