Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-17 Thread Guozhang Wang
Thanks Damian.

I think I'm convinced to introduce a new class for the sake of semantics
clarity plus being able to expose record context along with it. I'm going
to update the KIP and start the voting thread now.

On Thu, May 17, 2018 at 10:49 AM, Damian Guy  wrote:

> Hi Guozhang, yes i think it would make sense to add this now as having the
> additional record context would be valuable. Though i'm happy either way.
>
> On Wed, 16 May 2018 at 23:07 Guozhang Wang  wrote:
>
>> Hi Damian,
>>
>> My current plan is to add the "RichKeyValueMapper" when getting in
>> KIP-159 to include the record context in this dynamic routing feature. So
>> just to clarify: are you more concerning that if we are going to do that
>> anyways in the future, we should not add overloaded functions with
>> "KeyValueMapper" as of now since they will be subsumed soon?
>>
>>
>> Guozhang
>>
>>
>> On Wed, May 16, 2018 at 2:59 PM, Damian Guy  wrote:
>>
>>> Overall i'm a +1 on this, but i'm not a big fan of using the
>>> KeyValueMapper
>>> to choose the topic. It is a bit counter-intuitve to me. I'd prefer to
>>> add
>>> a class specifically for it and possibly pass in the RecordContext
>>>
>>> On Wed, 16 May 2018 at 13:22 Guozhang Wang  wrote:
>>>
>>> > Hello folks,
>>> >
>>> > Please let me know if you have further feedbacks; if there is no more
>>> > feedbacks I'm going to start the voting thread soon.
>>> >
>>> >
>>> > Guozhang
>>> >
>>> >
>>> > On Wed, May 16, 2018 at 8:31 AM, Guozhang Wang 
>>> wrote:
>>> >
>>> > > I have thought about exposing record context as well, and in the end
>>> I
>>> > > decided to piggy-back it with KIP-159. And if we want to indeed
>>> reuse the
>>> > > class it will be:
>>> > >
>>> > > ```
>>> > > public interface RichKeyValueMapper {
>>> > > VR apply(final K key, final V value, final RecordContext
>>> > > recordContext);
>>> > > }
>>> > > ```
>>> > >
>>> > >
>>> > >
>>> > > Guozhang
>>> > >
>>> > >
>>> > > On Tue, May 15, 2018 at 10:04 PM, Matthias J. Sax <
>>> matth...@confluent.io
>>> > >
>>> > > wrote:
>>> > >
>>> > >> Just my 2 cents:
>>> > >>
>>> > >> I am fine with `KeyValueMapper` (+1 for code reusage) -- the
>>> JavaDocs
>>> > >> will explain what the `KeyValueMapper` is supposed to do, ie,
>>> extract
>>> > >> and return the sink topic name from the key-value pair.
>>> > >>
>>> > >> A side remark though: do we think that accessing key/value is
>>> > >> sufficient? Or should we provide access to the full metadata? We
>>> could
>>> > >> also do this with KIP-159 of course -- but this would come earliest
>>> in
>>> > >> 2.1. As an alternative we could add a `TopicNameExtractor` to
>>> expose the
>>> > >> whole record context. The advantage would be, that we don't need to
>>> > >> change it via KIP-159 later again. WDYT?
>>> > >>
>>> > >> -Matthias
>>> > >>
>>> > >> On 5/15/18 5:57 PM, Bill Bejeck wrote:
>>> > >> > Thanks for the KIP Guozhang, it's a +1 for me.
>>> > >> >
>>> > >> > As for re-using the KeyValueMapper for choosing the topic, I am
>>> on the
>>> > >> > fence, a more explicitly named class would be more clear, but I'm
>>> not
>>> > >> sure
>>> > >> > it's worth a new class that will primarily perform the same
>>> actions as
>>> > >> the
>>> > >> > KeyValueMapper.
>>> > >> >
>>> > >> > Thanks,
>>> > >> > Bill
>>> > >> >
>>> > >> > On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang <
>>> wangg...@gmail.com>
>>> > >> wrote:
>>> > >> >
>>> > >> >> Hello John:
>>> > >> >>
>>> > >> >> * As for the type superclass, it is mainly for allowing super
>>> class
>>> > >> serdes.
>>> > >> >> More details can be found here:
>>> > >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>> > >> >> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
>>> > >> >>
>>> > >> >> * I may have slight preference on reusing existing classes but I
>>> > think
>>> > >> most
>>> > >> >> of my rationales are quite subjective. Personally I do not find
>>> `self
>>> > >> >> documenting` worth a new class but I can be convinced if people
>>> have
>>> > >> other
>>> > >> >> motivations doing it :)
>>> > >> >>
>>> > >> >>
>>> > >> >> Guozhang
>>> > >> >>
>>> > >> >>
>>> > >> >> On Tue, May 15, 2018 at 11:19 AM, John Roesler <
>>> j...@confluent.io>
>>> > >> wrote:
>>> > >> >>
>>> > >> >>> Thanks for the KIP, Guozhang.
>>> > >> >>>
>>> > >> >>> It looks good overall to me; I just have one question:
>>> > >> >>> * Why do we bound the generics of KVMapper in KStream to be
>>> > >> >> superclass-of-K
>>> > >> >>> and superclass-of-V instead of exactly K and V, as in Topology?
>>> I
>>> > >> might
>>> > >> >> be
>>> > >> >>> thinking about it wrong, but that seems backwards to me. If
>>> > anything,
>>> > >> >>> bounding to be a subclass-of-K or subclass-of-V would seem
>>> right to
>>> > >> me.
>>> > >> >>>
>>> > >> >>> One extra thought: I agree that KVMapper>> name*/>
>>> > >> is an

Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-17 Thread Damian Guy
Hi Guozhang, yes i think it would make sense to add this now as having the
additional record context would be valuable. Though i'm happy either way.

On Wed, 16 May 2018 at 23:07 Guozhang Wang  wrote:

> Hi Damian,
>
> My current plan is to add the "RichKeyValueMapper" when getting in KIP-159
> to include the record context in this dynamic routing feature. So just to
> clarify: are you more concerning that if we are going to do that anyways in
> the future, we should not add overloaded functions with "KeyValueMapper" as
> of now since they will be subsumed soon?
>
>
> Guozhang
>
>
> On Wed, May 16, 2018 at 2:59 PM, Damian Guy  wrote:
>
>> Overall i'm a +1 on this, but i'm not a big fan of using the
>> KeyValueMapper
>> to choose the topic. It is a bit counter-intuitve to me. I'd prefer to add
>> a class specifically for it and possibly pass in the RecordContext
>>
>> On Wed, 16 May 2018 at 13:22 Guozhang Wang  wrote:
>>
>> > Hello folks,
>> >
>> > Please let me know if you have further feedbacks; if there is no more
>> > feedbacks I'm going to start the voting thread soon.
>> >
>> >
>> > Guozhang
>> >
>> >
>> > On Wed, May 16, 2018 at 8:31 AM, Guozhang Wang 
>> wrote:
>> >
>> > > I have thought about exposing record context as well, and in the end I
>> > > decided to piggy-back it with KIP-159. And if we want to indeed reuse
>> the
>> > > class it will be:
>> > >
>> > > ```
>> > > public interface RichKeyValueMapper {
>> > > VR apply(final K key, final V value, final RecordContext
>> > > recordContext);
>> > > }
>> > > ```
>> > >
>> > >
>> > >
>> > > Guozhang
>> > >
>> > >
>> > > On Tue, May 15, 2018 at 10:04 PM, Matthias J. Sax <
>> matth...@confluent.io
>> > >
>> > > wrote:
>> > >
>> > >> Just my 2 cents:
>> > >>
>> > >> I am fine with `KeyValueMapper` (+1 for code reusage) -- the JavaDocs
>> > >> will explain what the `KeyValueMapper` is supposed to do, ie, extract
>> > >> and return the sink topic name from the key-value pair.
>> > >>
>> > >> A side remark though: do we think that accessing key/value is
>> > >> sufficient? Or should we provide access to the full metadata? We
>> could
>> > >> also do this with KIP-159 of course -- but this would come earliest
>> in
>> > >> 2.1. As an alternative we could add a `TopicNameExtractor` to expose
>> the
>> > >> whole record context. The advantage would be, that we don't need to
>> > >> change it via KIP-159 later again. WDYT?
>> > >>
>> > >> -Matthias
>> > >>
>> > >> On 5/15/18 5:57 PM, Bill Bejeck wrote:
>> > >> > Thanks for the KIP Guozhang, it's a +1 for me.
>> > >> >
>> > >> > As for re-using the KeyValueMapper for choosing the topic, I am on
>> the
>> > >> > fence, a more explicitly named class would be more clear, but I'm
>> not
>> > >> sure
>> > >> > it's worth a new class that will primarily perform the same
>> actions as
>> > >> the
>> > >> > KeyValueMapper.
>> > >> >
>> > >> > Thanks,
>> > >> > Bill
>> > >> >
>> > >> > On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang > >
>> > >> wrote:
>> > >> >
>> > >> >> Hello John:
>> > >> >>
>> > >> >> * As for the type superclass, it is mainly for allowing super
>> class
>> > >> serdes.
>> > >> >> More details can be found here:
>> > >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > >> >> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
>> > >> >>
>> > >> >> * I may have slight preference on reusing existing classes but I
>> > think
>> > >> most
>> > >> >> of my rationales are quite subjective. Personally I do not find
>> `self
>> > >> >> documenting` worth a new class but I can be convinced if people
>> have
>> > >> other
>> > >> >> motivations doing it :)
>> > >> >>
>> > >> >>
>> > >> >> Guozhang
>> > >> >>
>> > >> >>
>> > >> >> On Tue, May 15, 2018 at 11:19 AM, John Roesler > >
>> > >> wrote:
>> > >> >>
>> > >> >>> Thanks for the KIP, Guozhang.
>> > >> >>>
>> > >> >>> It looks good overall to me; I just have one question:
>> > >> >>> * Why do we bound the generics of KVMapper in KStream to be
>> > >> >> superclass-of-K
>> > >> >>> and superclass-of-V instead of exactly K and V, as in Topology? I
>> > >> might
>> > >> >> be
>> > >> >>> thinking about it wrong, but that seems backwards to me. If
>> > anything,
>> > >> >>> bounding to be a subclass-of-K or subclass-of-V would seem right
>> to
>> > >> me.
>> > >> >>>
>> > >> >>> One extra thought: I agree that KVMapper> name*/>
>> > >> is an
>> > >> >>> applicable type for extracting the topic name, but I wonder what
>> the
>> > >> >> value
>> > >> >>> of reusing the KVMapper for this purpose is. Would defining a new
>> > >> class,
>> > >> >>> say TopicNameExtractor, provide the same functionality while
>> > >> being a
>> > >> >>> bit more self-documenting?
>> > >> >>>
>> > >> >>> Thanks,
>> > >> >>> -John
>> > >> >>>
>> > >> >>> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang <
>> 

Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-17 Thread Guozhang Wang
Hi Damian,

My current plan is to add the "RichKeyValueMapper" when getting in KIP-159
to include the record context in this dynamic routing feature. So just to
clarify: are you more concerning that if we are going to do that anyways in
the future, we should not add overloaded functions with "KeyValueMapper" as
of now since they will be subsumed soon?


Guozhang


On Wed, May 16, 2018 at 2:59 PM, Damian Guy  wrote:

> Overall i'm a +1 on this, but i'm not a big fan of using the KeyValueMapper
> to choose the topic. It is a bit counter-intuitve to me. I'd prefer to add
> a class specifically for it and possibly pass in the RecordContext
>
> On Wed, 16 May 2018 at 13:22 Guozhang Wang  wrote:
>
> > Hello folks,
> >
> > Please let me know if you have further feedbacks; if there is no more
> > feedbacks I'm going to start the voting thread soon.
> >
> >
> > Guozhang
> >
> >
> > On Wed, May 16, 2018 at 8:31 AM, Guozhang Wang 
> wrote:
> >
> > > I have thought about exposing record context as well, and in the end I
> > > decided to piggy-back it with KIP-159. And if we want to indeed reuse
> the
> > > class it will be:
> > >
> > > ```
> > > public interface RichKeyValueMapper {
> > > VR apply(final K key, final V value, final RecordContext
> > > recordContext);
> > > }
> > > ```
> > >
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Tue, May 15, 2018 at 10:04 PM, Matthias J. Sax <
> matth...@confluent.io
> > >
> > > wrote:
> > >
> > >> Just my 2 cents:
> > >>
> > >> I am fine with `KeyValueMapper` (+1 for code reusage) -- the JavaDocs
> > >> will explain what the `KeyValueMapper` is supposed to do, ie, extract
> > >> and return the sink topic name from the key-value pair.
> > >>
> > >> A side remark though: do we think that accessing key/value is
> > >> sufficient? Or should we provide access to the full metadata? We could
> > >> also do this with KIP-159 of course -- but this would come earliest in
> > >> 2.1. As an alternative we could add a `TopicNameExtractor` to expose
> the
> > >> whole record context. The advantage would be, that we don't need to
> > >> change it via KIP-159 later again. WDYT?
> > >>
> > >> -Matthias
> > >>
> > >> On 5/15/18 5:57 PM, Bill Bejeck wrote:
> > >> > Thanks for the KIP Guozhang, it's a +1 for me.
> > >> >
> > >> > As for re-using the KeyValueMapper for choosing the topic, I am on
> the
> > >> > fence, a more explicitly named class would be more clear, but I'm
> not
> > >> sure
> > >> > it's worth a new class that will primarily perform the same actions
> as
> > >> the
> > >> > KeyValueMapper.
> > >> >
> > >> > Thanks,
> > >> > Bill
> > >> >
> > >> > On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang 
> > >> wrote:
> > >> >
> > >> >> Hello John:
> > >> >>
> > >> >> * As for the type superclass, it is mainly for allowing super class
> > >> serdes.
> > >> >> More details can be found here:
> > >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> >> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
> > >> >>
> > >> >> * I may have slight preference on reusing existing classes but I
> > think
> > >> most
> > >> >> of my rationales are quite subjective. Personally I do not find
> `self
> > >> >> documenting` worth a new class but I can be convinced if people
> have
> > >> other
> > >> >> motivations doing it :)
> > >> >>
> > >> >>
> > >> >> Guozhang
> > >> >>
> > >> >>
> > >> >> On Tue, May 15, 2018 at 11:19 AM, John Roesler 
> > >> wrote:
> > >> >>
> > >> >>> Thanks for the KIP, Guozhang.
> > >> >>>
> > >> >>> It looks good overall to me; I just have one question:
> > >> >>> * Why do we bound the generics of KVMapper in KStream to be
> > >> >> superclass-of-K
> > >> >>> and superclass-of-V instead of exactly K and V, as in Topology? I
> > >> might
> > >> >> be
> > >> >>> thinking about it wrong, but that seems backwards to me. If
> > anything,
> > >> >>> bounding to be a subclass-of-K or subclass-of-V would seem right
> to
> > >> me.
> > >> >>>
> > >> >>> One extra thought: I agree that KVMapper name*/>
> > >> is an
> > >> >>> applicable type for extracting the topic name, but I wonder what
> the
> > >> >> value
> > >> >>> of reusing the KVMapper for this purpose is. Would defining a new
> > >> class,
> > >> >>> say TopicNameExtractor, provide the same functionality while
> > >> being a
> > >> >>> bit more self-documenting?
> > >> >>>
> > >> >>> Thanks,
> > >> >>> -John
> > >> >>>
> > >> >>> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang <
> wangg...@gmail.com
> > >
> > >> >>> wrote:
> > >> >>>
> > >>  Hello folks,
> > >> 
> > >>  I'd like to start a discussion on adding dynamic routing
> > >> functionality
> > >> >> in
> > >>  Streams sink node. I.e. users do not need to specify the topic
> name
> > >> at
> > >>  compilation time but can dynamically determine which topic to
> send
> > to
> > >> >>> based
> > >> 

Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-16 Thread Damian Guy
Overall i'm a +1 on this, but i'm not a big fan of using the KeyValueMapper
to choose the topic. It is a bit counter-intuitve to me. I'd prefer to add
a class specifically for it and possibly pass in the RecordContext

On Wed, 16 May 2018 at 13:22 Guozhang Wang  wrote:

> Hello folks,
>
> Please let me know if you have further feedbacks; if there is no more
> feedbacks I'm going to start the voting thread soon.
>
>
> Guozhang
>
>
> On Wed, May 16, 2018 at 8:31 AM, Guozhang Wang  wrote:
>
> > I have thought about exposing record context as well, and in the end I
> > decided to piggy-back it with KIP-159. And if we want to indeed reuse the
> > class it will be:
> >
> > ```
> > public interface RichKeyValueMapper {
> > VR apply(final K key, final V value, final RecordContext
> > recordContext);
> > }
> > ```
> >
> >
> >
> > Guozhang
> >
> >
> > On Tue, May 15, 2018 at 10:04 PM, Matthias J. Sax  >
> > wrote:
> >
> >> Just my 2 cents:
> >>
> >> I am fine with `KeyValueMapper` (+1 for code reusage) -- the JavaDocs
> >> will explain what the `KeyValueMapper` is supposed to do, ie, extract
> >> and return the sink topic name from the key-value pair.
> >>
> >> A side remark though: do we think that accessing key/value is
> >> sufficient? Or should we provide access to the full metadata? We could
> >> also do this with KIP-159 of course -- but this would come earliest in
> >> 2.1. As an alternative we could add a `TopicNameExtractor` to expose the
> >> whole record context. The advantage would be, that we don't need to
> >> change it via KIP-159 later again. WDYT?
> >>
> >> -Matthias
> >>
> >> On 5/15/18 5:57 PM, Bill Bejeck wrote:
> >> > Thanks for the KIP Guozhang, it's a +1 for me.
> >> >
> >> > As for re-using the KeyValueMapper for choosing the topic, I am on the
> >> > fence, a more explicitly named class would be more clear, but I'm not
> >> sure
> >> > it's worth a new class that will primarily perform the same actions as
> >> the
> >> > KeyValueMapper.
> >> >
> >> > Thanks,
> >> > Bill
> >> >
> >> > On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang 
> >> wrote:
> >> >
> >> >> Hello John:
> >> >>
> >> >> * As for the type superclass, it is mainly for allowing super class
> >> serdes.
> >> >> More details can be found here:
> >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> >> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
> >> >>
> >> >> * I may have slight preference on reusing existing classes but I
> think
> >> most
> >> >> of my rationales are quite subjective. Personally I do not find `self
> >> >> documenting` worth a new class but I can be convinced if people have
> >> other
> >> >> motivations doing it :)
> >> >>
> >> >>
> >> >> Guozhang
> >> >>
> >> >>
> >> >> On Tue, May 15, 2018 at 11:19 AM, John Roesler 
> >> wrote:
> >> >>
> >> >>> Thanks for the KIP, Guozhang.
> >> >>>
> >> >>> It looks good overall to me; I just have one question:
> >> >>> * Why do we bound the generics of KVMapper in KStream to be
> >> >> superclass-of-K
> >> >>> and superclass-of-V instead of exactly K and V, as in Topology? I
> >> might
> >> >> be
> >> >>> thinking about it wrong, but that seems backwards to me. If
> anything,
> >> >>> bounding to be a subclass-of-K or subclass-of-V would seem right to
> >> me.
> >> >>>
> >> >>> One extra thought: I agree that KVMapper
> >> is an
> >> >>> applicable type for extracting the topic name, but I wonder what the
> >> >> value
> >> >>> of reusing the KVMapper for this purpose is. Would defining a new
> >> class,
> >> >>> say TopicNameExtractor, provide the same functionality while
> >> being a
> >> >>> bit more self-documenting?
> >> >>>
> >> >>> Thanks,
> >> >>> -John
> >> >>>
> >> >>> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang  >
> >> >>> wrote:
> >> >>>
> >>  Hello folks,
> >> 
> >>  I'd like to start a discussion on adding dynamic routing
> >> functionality
> >> >> in
> >>  Streams sink node. I.e. users do not need to specify the topic name
> >> at
> >>  compilation time but can dynamically determine which topic to send
> to
> >> >>> based
> >>  on each record's key value pairs. Please find a KIP here:
> >> 
> >>  https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>  303%3A+Add+Dynamic+Routing+in+Streams+Sink
> >> 
> >>  Any feedbacks are highly appreciated.
> >> 
> >>  Thanks!
> >> 
> >>  -- Guozhang
> >> 
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> -- Guozhang
> >> >>
> >> >
> >>
> >>
> >
> >
> > --
> > -- Guozhang
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-16 Thread Guozhang Wang
Hello folks,

Please let me know if you have further feedbacks; if there is no more
feedbacks I'm going to start the voting thread soon.


Guozhang


On Wed, May 16, 2018 at 8:31 AM, Guozhang Wang  wrote:

> I have thought about exposing record context as well, and in the end I
> decided to piggy-back it with KIP-159. And if we want to indeed reuse the
> class it will be:
>
> ```
> public interface RichKeyValueMapper {
> VR apply(final K key, final V value, final RecordContext
> recordContext);
> }
> ```
>
>
>
> Guozhang
>
>
> On Tue, May 15, 2018 at 10:04 PM, Matthias J. Sax 
> wrote:
>
>> Just my 2 cents:
>>
>> I am fine with `KeyValueMapper` (+1 for code reusage) -- the JavaDocs
>> will explain what the `KeyValueMapper` is supposed to do, ie, extract
>> and return the sink topic name from the key-value pair.
>>
>> A side remark though: do we think that accessing key/value is
>> sufficient? Or should we provide access to the full metadata? We could
>> also do this with KIP-159 of course -- but this would come earliest in
>> 2.1. As an alternative we could add a `TopicNameExtractor` to expose the
>> whole record context. The advantage would be, that we don't need to
>> change it via KIP-159 later again. WDYT?
>>
>> -Matthias
>>
>> On 5/15/18 5:57 PM, Bill Bejeck wrote:
>> > Thanks for the KIP Guozhang, it's a +1 for me.
>> >
>> > As for re-using the KeyValueMapper for choosing the topic, I am on the
>> > fence, a more explicitly named class would be more clear, but I'm not
>> sure
>> > it's worth a new class that will primarily perform the same actions as
>> the
>> > KeyValueMapper.
>> >
>> > Thanks,
>> > Bill
>> >
>> > On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang 
>> wrote:
>> >
>> >> Hello John:
>> >>
>> >> * As for the type superclass, it is mainly for allowing super class
>> serdes.
>> >> More details can be found here:
>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> >> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
>> >>
>> >> * I may have slight preference on reusing existing classes but I think
>> most
>> >> of my rationales are quite subjective. Personally I do not find `self
>> >> documenting` worth a new class but I can be convinced if people have
>> other
>> >> motivations doing it :)
>> >>
>> >>
>> >> Guozhang
>> >>
>> >>
>> >> On Tue, May 15, 2018 at 11:19 AM, John Roesler 
>> wrote:
>> >>
>> >>> Thanks for the KIP, Guozhang.
>> >>>
>> >>> It looks good overall to me; I just have one question:
>> >>> * Why do we bound the generics of KVMapper in KStream to be
>> >> superclass-of-K
>> >>> and superclass-of-V instead of exactly K and V, as in Topology? I
>> might
>> >> be
>> >>> thinking about it wrong, but that seems backwards to me. If anything,
>> >>> bounding to be a subclass-of-K or subclass-of-V would seem right to
>> me.
>> >>>
>> >>> One extra thought: I agree that KVMapper
>> is an
>> >>> applicable type for extracting the topic name, but I wonder what the
>> >> value
>> >>> of reusing the KVMapper for this purpose is. Would defining a new
>> class,
>> >>> say TopicNameExtractor, provide the same functionality while
>> being a
>> >>> bit more self-documenting?
>> >>>
>> >>> Thanks,
>> >>> -John
>> >>>
>> >>> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang 
>> >>> wrote:
>> >>>
>>  Hello folks,
>> 
>>  I'd like to start a discussion on adding dynamic routing
>> functionality
>> >> in
>>  Streams sink node. I.e. users do not need to specify the topic name
>> at
>>  compilation time but can dynamically determine which topic to send to
>> >>> based
>>  on each record's key value pairs. Please find a KIP here:
>> 
>>  https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>  303%3A+Add+Dynamic+Routing+in+Streams+Sink
>> 
>>  Any feedbacks are highly appreciated.
>> 
>>  Thanks!
>> 
>>  -- Guozhang
>> 
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> -- Guozhang
>> >>
>> >
>>
>>
>
>
> --
> -- Guozhang
>



-- 
-- Guozhang


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-16 Thread Guozhang Wang
I have thought about exposing record context as well, and in the end I
decided to piggy-back it with KIP-159. And if we want to indeed reuse the
class it will be:

```
public interface RichKeyValueMapper {
VR apply(final K key, final V value, final RecordContext recordContext);
}
```



Guozhang


On Tue, May 15, 2018 at 10:04 PM, Matthias J. Sax 
wrote:

> Just my 2 cents:
>
> I am fine with `KeyValueMapper` (+1 for code reusage) -- the JavaDocs
> will explain what the `KeyValueMapper` is supposed to do, ie, extract
> and return the sink topic name from the key-value pair.
>
> A side remark though: do we think that accessing key/value is
> sufficient? Or should we provide access to the full metadata? We could
> also do this with KIP-159 of course -- but this would come earliest in
> 2.1. As an alternative we could add a `TopicNameExtractor` to expose the
> whole record context. The advantage would be, that we don't need to
> change it via KIP-159 later again. WDYT?
>
> -Matthias
>
> On 5/15/18 5:57 PM, Bill Bejeck wrote:
> > Thanks for the KIP Guozhang, it's a +1 for me.
> >
> > As for re-using the KeyValueMapper for choosing the topic, I am on the
> > fence, a more explicitly named class would be more clear, but I'm not
> sure
> > it's worth a new class that will primarily perform the same actions as
> the
> > KeyValueMapper.
> >
> > Thanks,
> > Bill
> >
> > On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang 
> wrote:
> >
> >> Hello John:
> >>
> >> * As for the type superclass, it is mainly for allowing super class
> serdes.
> >> More details can be found here:
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
> >>
> >> * I may have slight preference on reusing existing classes but I think
> most
> >> of my rationales are quite subjective. Personally I do not find `self
> >> documenting` worth a new class but I can be convinced if people have
> other
> >> motivations doing it :)
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Tue, May 15, 2018 at 11:19 AM, John Roesler 
> wrote:
> >>
> >>> Thanks for the KIP, Guozhang.
> >>>
> >>> It looks good overall to me; I just have one question:
> >>> * Why do we bound the generics of KVMapper in KStream to be
> >> superclass-of-K
> >>> and superclass-of-V instead of exactly K and V, as in Topology? I might
> >> be
> >>> thinking about it wrong, but that seems backwards to me. If anything,
> >>> bounding to be a subclass-of-K or subclass-of-V would seem right to me.
> >>>
> >>> One extra thought: I agree that KVMapper is
> an
> >>> applicable type for extracting the topic name, but I wonder what the
> >> value
> >>> of reusing the KVMapper for this purpose is. Would defining a new
> class,
> >>> say TopicNameExtractor, provide the same functionality while
> being a
> >>> bit more self-documenting?
> >>>
> >>> Thanks,
> >>> -John
> >>>
> >>> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang 
> >>> wrote:
> >>>
>  Hello folks,
> 
>  I'd like to start a discussion on adding dynamic routing functionality
> >> in
>  Streams sink node. I.e. users do not need to specify the topic name at
>  compilation time but can dynamically determine which topic to send to
> >>> based
>  on each record's key value pairs. Please find a KIP here:
> 
>  https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>  303%3A+Add+Dynamic+Routing+in+Streams+Sink
> 
>  Any feedbacks are highly appreciated.
> 
>  Thanks!
> 
>  -- Guozhang
> 
> >>>
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>
>


-- 
-- Guozhang


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread Matthias J. Sax
Just my 2 cents:

I am fine with `KeyValueMapper` (+1 for code reusage) -- the JavaDocs
will explain what the `KeyValueMapper` is supposed to do, ie, extract
and return the sink topic name from the key-value pair.

A side remark though: do we think that accessing key/value is
sufficient? Or should we provide access to the full metadata? We could
also do this with KIP-159 of course -- but this would come earliest in
2.1. As an alternative we could add a `TopicNameExtractor` to expose the
whole record context. The advantage would be, that we don't need to
change it via KIP-159 later again. WDYT?

-Matthias

On 5/15/18 5:57 PM, Bill Bejeck wrote:
> Thanks for the KIP Guozhang, it's a +1 for me.
> 
> As for re-using the KeyValueMapper for choosing the topic, I am on the
> fence, a more explicitly named class would be more clear, but I'm not sure
> it's worth a new class that will primarily perform the same actions as the
> KeyValueMapper.
> 
> Thanks,
> Bill
> 
> On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang  wrote:
> 
>> Hello John:
>>
>> * As for the type superclass, it is mainly for allowing super class serdes.
>> More details can be found here:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
>>
>> * I may have slight preference on reusing existing classes but I think most
>> of my rationales are quite subjective. Personally I do not find `self
>> documenting` worth a new class but I can be convinced if people have other
>> motivations doing it :)
>>
>>
>> Guozhang
>>
>>
>> On Tue, May 15, 2018 at 11:19 AM, John Roesler  wrote:
>>
>>> Thanks for the KIP, Guozhang.
>>>
>>> It looks good overall to me; I just have one question:
>>> * Why do we bound the generics of KVMapper in KStream to be
>> superclass-of-K
>>> and superclass-of-V instead of exactly K and V, as in Topology? I might
>> be
>>> thinking about it wrong, but that seems backwards to me. If anything,
>>> bounding to be a subclass-of-K or subclass-of-V would seem right to me.
>>>
>>> One extra thought: I agree that KVMapper is an
>>> applicable type for extracting the topic name, but I wonder what the
>> value
>>> of reusing the KVMapper for this purpose is. Would defining a new class,
>>> say TopicNameExtractor, provide the same functionality while being a
>>> bit more self-documenting?
>>>
>>> Thanks,
>>> -John
>>>
>>> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang 
>>> wrote:
>>>
 Hello folks,

 I'd like to start a discussion on adding dynamic routing functionality
>> in
 Streams sink node. I.e. users do not need to specify the topic name at
 compilation time but can dynamically determine which topic to send to
>>> based
 on each record's key value pairs. Please find a KIP here:

 https://cwiki.apache.org/confluence/display/KAFKA/KIP-
 303%3A+Add+Dynamic+Routing+in+Streams+Sink

 Any feedbacks are highly appreciated.

 Thanks!

 -- Guozhang

>>>
>>
>>
>>
>> --
>> -- Guozhang
>>
> 



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread Bill Bejeck
Thanks for the KIP Guozhang, it's a +1 for me.

As for re-using the KeyValueMapper for choosing the topic, I am on the
fence, a more explicitly named class would be more clear, but I'm not sure
it's worth a new class that will primarily perform the same actions as the
KeyValueMapper.

Thanks,
Bill

On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang  wrote:

> Hello John:
>
> * As for the type superclass, it is mainly for allowing super class serdes.
> More details can be found here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
>
> * I may have slight preference on reusing existing classes but I think most
> of my rationales are quite subjective. Personally I do not find `self
> documenting` worth a new class but I can be convinced if people have other
> motivations doing it :)
>
>
> Guozhang
>
>
> On Tue, May 15, 2018 at 11:19 AM, John Roesler  wrote:
>
> > Thanks for the KIP, Guozhang.
> >
> > It looks good overall to me; I just have one question:
> > * Why do we bound the generics of KVMapper in KStream to be
> superclass-of-K
> > and superclass-of-V instead of exactly K and V, as in Topology? I might
> be
> > thinking about it wrong, but that seems backwards to me. If anything,
> > bounding to be a subclass-of-K or subclass-of-V would seem right to me.
> >
> > One extra thought: I agree that KVMapper is an
> > applicable type for extracting the topic name, but I wonder what the
> value
> > of reusing the KVMapper for this purpose is. Would defining a new class,
> > say TopicNameExtractor, provide the same functionality while being a
> > bit more self-documenting?
> >
> > Thanks,
> > -John
> >
> > On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang 
> > wrote:
> >
> > > Hello folks,
> > >
> > > I'd like to start a discussion on adding dynamic routing functionality
> in
> > > Streams sink node. I.e. users do not need to specify the topic name at
> > > compilation time but can dynamically determine which topic to send to
> > based
> > > on each record's key value pairs. Please find a KIP here:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 303%3A+Add+Dynamic+Routing+in+Streams+Sink
> > >
> > > Any feedbacks are highly appreciated.
> > >
> > > Thanks!
> > >
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread Guozhang Wang
Hello John:

* As for the type superclass, it is mainly for allowing super class serdes.
More details can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-100+-+Relax+Type+constraints+in+Kafka+Streams+API

* I may have slight preference on reusing existing classes but I think most
of my rationales are quite subjective. Personally I do not find `self
documenting` worth a new class but I can be convinced if people have other
motivations doing it :)


Guozhang


On Tue, May 15, 2018 at 11:19 AM, John Roesler  wrote:

> Thanks for the KIP, Guozhang.
>
> It looks good overall to me; I just have one question:
> * Why do we bound the generics of KVMapper in KStream to be superclass-of-K
> and superclass-of-V instead of exactly K and V, as in Topology? I might be
> thinking about it wrong, but that seems backwards to me. If anything,
> bounding to be a subclass-of-K or subclass-of-V would seem right to me.
>
> One extra thought: I agree that KVMapper is an
> applicable type for extracting the topic name, but I wonder what the value
> of reusing the KVMapper for this purpose is. Would defining a new class,
> say TopicNameExtractor, provide the same functionality while being a
> bit more self-documenting?
>
> Thanks,
> -John
>
> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang 
> wrote:
>
> > Hello folks,
> >
> > I'd like to start a discussion on adding dynamic routing functionality in
> > Streams sink node. I.e. users do not need to specify the topic name at
> > compilation time but can dynamically determine which topic to send to
> based
> > on each record's key value pairs. Please find a KIP here:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 303%3A+Add+Dynamic+Routing+in+Streams+Sink
> >
> > Any feedbacks are highly appreciated.
> >
> > Thanks!
> >
> > -- Guozhang
> >
>



-- 
-- Guozhang


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread Matthias J. Sax
Thanks for the KIP.

+1

@John: compare
https://cwiki.apache.org/confluence/display/KAFKA/KIP-100+-+Relax+Type+constraints+in+Kafka+Streams+API
about the generics


-Matthias

On 5/15/18 11:19 AM, John Roesler wrote:
> Thanks for the KIP, Guozhang.
> 
> It looks good overall to me; I just have one question:
> * Why do we bound the generics of KVMapper in KStream to be superclass-of-K
> and superclass-of-V instead of exactly K and V, as in Topology? I might be
> thinking about it wrong, but that seems backwards to me. If anything,
> bounding to be a subclass-of-K or subclass-of-V would seem right to me.
> 
> One extra thought: I agree that KVMapper is an
> applicable type for extracting the topic name, but I wonder what the value
> of reusing the KVMapper for this purpose is. Would defining a new class,
> say TopicNameExtractor, provide the same functionality while being a
> bit more self-documenting?
> 
> Thanks,
> -John
> 
> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang  wrote:
> 
>> Hello folks,
>>
>> I'd like to start a discussion on adding dynamic routing functionality in
>> Streams sink node. I.e. users do not need to specify the topic name at
>> compilation time but can dynamically determine which topic to send to based
>> on each record's key value pairs. Please find a KIP here:
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 303%3A+Add+Dynamic+Routing+in+Streams+Sink
>>
>> Any feedbacks are highly appreciated.
>>
>> Thanks!
>>
>> -- Guozhang
>>
> 



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread John Roesler
Thanks for the KIP, Guozhang.

It looks good overall to me; I just have one question:
* Why do we bound the generics of KVMapper in KStream to be superclass-of-K
and superclass-of-V instead of exactly K and V, as in Topology? I might be
thinking about it wrong, but that seems backwards to me. If anything,
bounding to be a subclass-of-K or subclass-of-V would seem right to me.

One extra thought: I agree that KVMapper is an
applicable type for extracting the topic name, but I wonder what the value
of reusing the KVMapper for this purpose is. Would defining a new class,
say TopicNameExtractor, provide the same functionality while being a
bit more self-documenting?

Thanks,
-John

On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang  wrote:

> Hello folks,
>
> I'd like to start a discussion on adding dynamic routing functionality in
> Streams sink node. I.e. users do not need to specify the topic name at
> compilation time but can dynamically determine which topic to send to based
> on each record's key value pairs. Please find a KIP here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 303%3A+Add+Dynamic+Routing+in+Streams+Sink
>
> Any feedbacks are highly appreciated.
>
> Thanks!
>
> -- Guozhang
>