Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2019-11-24 Thread Matthias J. Sax
>>>>> -Matthias
>>>>>
>>>>> On 11/1/17 9:16 PM, Guozhang Wang wrote:
>>>>>> Jeyhun,
>>>>>>
>>>>>> I think I'm convinced to not do KAFKA-3907 in this KIP. We should
>>>>>> think carefully if we should add this functionality to the DSL layer
>>>>>> moving forward since from what we discovered working on it the
>>>>>> conclusion is that it would require revamping the public APIs quite a
>>>>>> lot, and it's not clear if it is a good trade-off than asking users to
>>>>>> call process() instead.

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2018-03-14 Thread Matthias J. Sax
Warren,

thanks for following up on this KIP. And sorry for the "messy" discussion
thread. Adding this feature is a little tricky. We still hope to get it
into the 1.2 release, but at the moment there is not much progress.

However, for your use case, you can replace .map() with .transform(),
which allows you to access the record's timestamp (via the provided
`context` object) as extracted from the TimestampExtractor. See the docs
for more details:
https://kafka.apache.org/documentation/streams/developer-guide/dsl-api.html#applying-processors-and-transformers-processor-api-integration
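
For illustration, a minimal sketch of that workaround (assuming the
1.0-era Transformer API; the class name, topic name, and value parsing
are made up for this example):

import java.time.Instant;
import java.time.ZoneOffset;

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

// Pass-through transformer that reads the record timestamp (as set by the
// configured TimestampExtractor) from the ProcessorContext and uses its
// year to enrich the value.
public class YearEnrichingTransformer
        implements Transformer<String, String, KeyValue<String, String>> {

    private ProcessorContext context;

    @Override
    public void init(final ProcessorContext context) {
        this.context = context;
    }

    @Override
    public KeyValue<String, String> transform(final String key, final String value) {
        // year taken from the record timestamp of the source topic record
        final int year = Instant.ofEpochMilli(context.timestamp())
                .atOffset(ZoneOffset.UTC)
                .getYear();
        // month/day would be parsed from the value; parsing is omitted here
        return KeyValue.pair(key, year + "-" + value);
    }

    @Override
    @SuppressWarnings("deprecation")
    public KeyValue<String, String> punctuate(final long timestamp) {
        return null;   // deprecated in 1.0 and unused here
    }

    @Override
    public void close() { }
}

// usage (topic name is a placeholder):
// builder.stream("sourceTopic")
//        .transform(YearEnrichingTransformer::new)
//        .groupByKey() ...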


-Matthias

On 3/13/18 12:51 PM, Warren, Brad wrote:
> Hi devs,
> 
> It's a bit difficult to put all of the pieces together regarding the
> status and API changes around the KIPs dealing with exposing the record
> metadata in the Processor and DSL APIs.  This is a feature that my team
> here at American Airlines is keenly interested in, and I'd like to
> provide a real-world use case to help move the discussion along:
> 
> I have a source topic that contains a text value that includes datetimes
> without a year.  The desire is to order the records in a stream by a
> timestamp extracted from the record value, and we plan to use the
> timestamp from the source topic to provide the year.  We're hoping to
> use the DSL.  Something like:
> 
> val streamOrderedByMyValueTime = Builder.stream("sourceTopic").map( K,V
> -> KeyValue(KR, VR, timestamp) )
> 
> so then I can do
> 
> groupBy(), aggregate(), etc.
> 
> Inside the mapper, my timestamp would be something like
> LocalDateTime.of(yearFromIncomingConsumerRecordTimestamp,
> monthFromValue, dayFromValue, ...)
> 
> Looking at the wiki here
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73637757,
> what is the proposed implementation of RichValueMapper?  Is it going to
> support what I want to do here?
> 
> Thanks,
> 
> Brad
> 
> Brad Warren
> Principal Application Architect
> Airport Technology
> 
> brad.war...@aa.com
> 





Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2018-03-13 Thread Warren, Brad
Hi devs,

It's a bit difficult to put all of the pieces together regarding the status and 
API changes around the KIPs dealing with exposing the record metadata in the 
Processor and DSL APIs.  This is a feature that my team here at American 
Airlines is keenly interested in, and I'd like to provide a real-world use case
to help move the discussion along:

I have a source topic that contains a text value that includes datetimes
without a year.  The desire is to order the records in a stream by a timestamp
extracted from the record value, and we plan to use the timestamp from the
source topic to provide the year.  We're hoping to use the DSL.  Something like:

val streamOrderedByMyValueTime = Builder.stream("sourceTopic").map( K,V -> 
KeyValue(KR, VR, timestamp) )

so then I can do

groupBy(), aggregate(), etc.

Inside the mapper, my timestamp would be something like 
LocalDateTime.of(yearFromIncomingConsumerRecordTimestamp, monthFromValue, 
dayFromValue, ...)

Looking at the wiki here 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73637757, what 
is the proposed implementation of RichValueMapper?  Is it going to support what 
I want to do here?


Thanks,
Brad


Brad Warren
Principal Application Architect
Airport Technology

brad.war...@aa.com








Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-12-18 Thread Matthias J. Sax
>>> Thanks a lot for correcting. It is a leftover from the past designs
>>> when punctuate() was not deprecated.
>>>
>>> I corrected.
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>> On Mon, Nov 6, 2017 at 5:30 PM Matthias J. Sax <matth...@confluent.io>
>>> wrote:
>>>
>>>> I just re-read the KIP.
>>>>
>>>> One minor comment: we don't need to introduce any deprecated methods.
>>>> Thus, RichValueTransformer#punctuate can be removed completely instead
>>>> of introducing it as deprecated.
>>>>
>>>> Otherwise looks good to me.
>>>>
>>>> Thanks for being so patient!
>>>>
>>>> -Matthias
>>>>
>>>> On 11/1/17 9:16 PM, Guozhang Wang wrote:
>>>>> Jeyhun,
>>>>>
>>>>> I think I'm convinced to not do KAFKA-3907 in this KIP. We should
>>>>> think carefully if we should add this functionality to the DSL layer
>>>>> moving forward since from what we discovered working on it the
>>>>> conclusion is that it would require revamping the public APIs quite a
>>>>> lot, and it's not clear if it is a good trade-off than asking users
>>>>> to call process() instead.
>>>>>
>>>>> Guozhang
>>>>>
>>>>> On Wed, Nov 1, 2017 at 4:50 AM, Damian Guy <damian@gmail.com>
>>>>> wrote:

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-12-18 Thread Bill Bejeck
> I think I'm convinced to not do KAFKA-3907 in this KIP. We should think
> carefully if we should add this functionality to the DSL layer moving
> forward since from what we discovered working on it the conclusion is
> that it would require revamping the public APIs quite a lot, and it's
> not clear if it is a good trade-off than asking users to call process()
> instead.
>
> Guozhang
>
> On Wed, Nov 1, 2017 at 4:50 AM, Damian Guy <damian@gmail.com> wrote:
>
>> Hi Jeyhun, thanks, looks good.
>>
>> Do we need to remove the line that says:
>>
>> - on-demand commit() feature
>>
>> Cheers,
>> Damian
>>
>> On Tue, 31 Oct 2017 at 23:07 Jeyhun Karimov <je.kari...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I removed the 'commit()' feature, as we discussed. It simplified the
>>> overall design of KIP a lot.
>>>
>>> If it is ok, I would like to start a VOTE thread.
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>> On Fri, Oct 27, 2017 at 5:28 PM Matthias J. Sax < matth
Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-12-15 Thread Guozhang Wang
Matthias J. Sax <matth...@confluent.io> wrote:

> Thanks. I understand what you are saying, but I don't agree that
>
>> but also we need a commit() method
>
> I would just not provide `commit()` at DSL level and close the
> corresponding Jira as "not a problem" or similar.
>
> -Matthias

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-12-07 Thread Jan Filipiak
                return context().recordContext().partition();
            }
        };


So, we cannot deprecate `ProcessorContext.commit()` in this case IMO.

2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
that it can implement the commit call itself.

- Actually, I don't think that we need `commit()` in
`ProcessorRecordContext`. The main intuition is to "transfer"
`ProcessorContext.commit()` call to Rich interfaces, to support
user-specific committing.
To do so, we introduce `commit()` method in `RecordContext()` just only to
call ProcessorContext.commit() inside. (see the above code snippet)
So, in Rich interfaces, we are not dealing with `ProcessorRecordContext` at
all, and we leave all its methods as it is.

In this KIP, we made `RecordContext` to be the parent class of
`ProcessorRecordContext`, just because of they share quite amount of
methods and it is logical to enable inheritance between those two.

3. In the wiki page, the statement that "However, call to a commit()
method, is valid only within RecordContext interface (at least for now),
we throw an exception in ProcessorRecordContext.commit()." and the code
snippet below would need to be updated as well.

- I think above explanation covers this as well.

I want to gain some speed to this KIP, as it has gone though many changes
based on user/developer needs, both in documentation-/implementation-wise.

Cheers,
Jeyhun


On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com> wrote:

Thanks for the information Jeyhun. I had also forgot about KAFKA-3907 with
this KIP..

Thinking a bit more, I'm now inclined to go with what we agreed before, to
add the commit() call to `RecordContext`. A few minor tweaks on its
implementation:

1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
user to consolidate this call as
"processorContext.recordContext().commit()". And internal implementation
of `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed
to this call.

2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
that it can implement the commit call itself.

3. In the wiki page, the statement that "However, call to a commit()
method, is valid only within RecordContext interface (at least for now),
we throw an exception in ProcessorRecordContext.commit()." and the code
snippet below would need to be updated as well.

Guozhang


On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

Fair point. This is a long discussion and I totally forgot that we
discussed this.

Seems I changed my opinion about including KAFKA-3907...

Happy to hear what others think.

-Matthias

On 10/23/17 1:20 PM, Jeyhun Karimov wrote:

Hi Matthias,

It is probably my bad, the discussion was a bit long in this thread. I
proposed the related issue in the related KIP discuss thread [1] and got
an approval [2,3].
Maybe I misunderstood.

[1] http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[2] http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[3] http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams

On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <matth...@confluent.io>
wrote:

Interesting.

I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is the
main motivation for this KIP :)

I also think, that we should not expose the full ProcessorContext at DSL
level.

Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
Manual commits are something DSL users should not worry about -- and if
one really needs this, an advanced user can still insert a dummy
`transform` to request a commit from there.

-Matthias

On 10/18/17 5:39 AM, Jeyhun Karimov wrote:

Hi,

The main intuition is to solve [1], which is part of this KIP.
I agree with you that this might not seem semantically correct as we are
not committing record state.

Alternatively, we can remove commit() from RecordContext and add
ProcessorContext (which has commit() method) as an extra argument to Rich
methods:

instead of

public interface RichValueMapper<V, VR, K> {
    VR apply(final V value,
             final K key,
             final RecordContext recordContext);
}

we can adopt

public interface RichValueMapper<V, VR, K> {
    VR apply(final V value,
             final K key,
             final RecordContext recordContext,
             final ProcessorContext processorContext);
}

However, in this case, a user can get confused as ProcessorContext and
RecordContext share some methods with the same name.

Cheers,
Jeyhun

[1] https://issues.apache.org/jira/browse/KAFKA-3907
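
As a purely illustrative sketch of what the rich interface discussed above
would give a DSL user: with the RichValueMapper variant that receives only a
RecordContext, the record metadata -- and, in the debated design, commit()
-- would be reachable directly from the lambda. The mapValues() overload
accepting a RichValueMapper is part of the KIP draft, not a released API:

// annotate each value with the metadata of the record it came from
RichValueMapper<String, String, String> enrich =
        (value, key, recordContext) -> {
            final String enriched = value
                    + " @" + recordContext.timestamp()
                    + " (" + recordContext.topic()
                    + "/" + recordContext.partition()
                    + "#" + recordContext.offset() + ")";
            recordContext.commit();   // the controversial part of the design
            return enriched;
        };

// hypothetical DSL hook-up, per the KIP draft:
// KStream<String, String> out = stream.mapValues(enrich);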

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-12-05 Thread Bill Bejeck
> >>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Fri, Oct 27, 2017 at 5:28 PM Matthias J. Sax <
> >>>>>>>>>>>>>> matth...@confluent.io
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks. I understand what you are saying, but I don't
> >>>>>>>>>>>>>> agree that
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> but also we need a commit() method
> >>>>>>>>>>>>>>> I would just not provide `commit()` at DSL level and
> >>>>>>>>>>>>>>> close the
> >>>>>>>>>>>>>>> corresponding Jira as "not a problem" or similar.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 10/27/17 3:42 PM, Jeyhun Karimov wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Matthias,
> >>>>>>>>>>>>>>>> Thanks for your comments. I agree that this is not the
> best
> >>>>>>>>>>>>>>>> way
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> to
> >>>>>>>> do.
> >>>>>>>>>>>>>> A
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> bit of history behind this design.
> >>>>>>>>>>>>>>>> Prior doing this, I tried to provide ProcessorContext
> >>>>>>>>>>>>>>>> itself
> >>>>>>>>>>>>>>>> as
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> an
> >>>>>>>> argument
> >>>>>>>>>>>>>>> in Rich interfaces. However, we dont want to give users
> that
> >>>>>>>>>>>>>>>> flexibility
> >>>>>>>>>>>>>>> and “power”. Moreover, ProcessorContext contains processor
> >>>>>>>>>>>>>>> level
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> information and not Record level info. The only thing we
> >>>>>>>>>>>>>>>> need in
> >>>>>>>>>>>>>>>> ProcessorContext is commit() method.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> So, as far as I understood, we need record context (offset,
> >>>>>>>>>>>>>>>> timestamp
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>> etc) but also we need a commit() method ( we dont want to
> >>>>>>>>>>>>>>> provide
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ProcessorContext as a parameter so users can use
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ProcessorContext.commit()
> >>>>>>>>>>>>>>> ).
> >>>>>>>>>>>>>>>> As a result, I thought to “propagate” commit() call from
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> RecordContext
> >>>>>>>>>>>>>> to
> >>>&

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-12-04 Thread Matthias J. Sax
>>>>>>>>>>>>>>>> we need record context (offset,
>>>>>>>>>>>>>>>> timestamp
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> etc) but also we need a commit() method ( we dont want to
>>>>>>>>>>>>>>> provide
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ProcessorContext as a parameter so users can use
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ProcessorContext.commit()
>>>>>>>>>>>>>>> ).
>>>>>>>>>>>>>>>> As a result, I thought to “propagate” commit() call from
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> RecordContext
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ProcessorContext() .
>>>>>>>>>>>>>>>> If there is a misunderstanding in motivation/discussion of
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> KIP/included
>>>>>>>>>>>>>> jiras please let me know.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri 27. Oct 2017 at 12:39, Matthias J. Sax <
>>>>>>>>>>>>>>>> matth...@confluent.io
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> I am personally still not convinced, that we should add
>>>>>>>>>>>>>>> `commit()`
>>>>>>>> at
>>>>>>>>>>>>>>> all.
>>>>>>>>>>>>>>> @Guozhang: you created the original Jira. Can you
>>>>>>>>>>>>>>> elaborate a
>>>>>>>>>>>>>>>>> little
>>>>>>>>>>>>>>>>> bit? Isn't requesting commits a low level API that should
>>>>>>>>>>>>>>>>> not be
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> exposed
>>>>>>>>>>>>>>> in the DSL? Just want to understand the motivation
>>>>>>>>>>>>>>> better. Why
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> would
>>>>>>>> anybody that uses the DSL ever want to request a commit? To
>>>>>>>>>>>>>>>>> me,
>>>>>>>>>>>>>>>>> requesting commits is useful if you manipulated state
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> explicitly,
>>>>>>>> ie,
>>>>>>>>>>>>>>> via Processor API.
>>>>>>>>>>>>>>> Also, for the solution: it seem rather unnatural to me,
>>>>>>>>>>>>>>>>> that we
>>>>>>>>>>>>>>>>> add
>>>>>>>>>>>>>>>>> `commit()` to `RecordContext` -- from my understanding,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> `RecordContext`
>>>>>>>>>>>>>>> is an helper object that provide access to record meta data.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Requesting
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> a commit is something 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-29 Thread Jan Filipiak
cord.

To me, this does not seem to be a sound API design if we follow this path.

-Matthias

On 10/26/17 10:41 PM, Jeyhun Karimov wrote:

Hi,

Thanks for your suggestions.

I have some comments, to make sure that there is no misunderstanding.

1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
user to consolidate this call as
"processorContext.recordContext().commit()". And internal implementation
of `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed
to this call.

- I think we should not deprecate `ProcessorContext.commit()`. The main
intuition that we introduce `commit()` in `RecordContext` is that,
`RecordContext` is the one which is provided in Rich interfaces. So if
user wants to commit, then there should be some method inside
`RecordContext` to do so. Internally, `RecordContext.commit()` calls
`ProcessorContext.commit()`  (see the last code snippet in KIP-159):

@Override
public void process(final K1 key, final V1 value) {

    recordContext = new RecordContext() {   // recordContext initialization is added in this KIP
        @Override
        public void commit() {
            context().commit();
        }

        @Override
        public long offset() {
            return context().recordContext().offset();
        }

        @Override
        public long timestamp() {
            return context().recordContext().timestamp();
        }

        @Override
        public String topic() {
            return context().recordContext().topic();
        }

        @Override
        public int partition() {
            return context().recordContext().partition();
        }
    };


So, we cannot deprecate `ProcessorContext.commit()` in this case IMO.

2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
that it can implement the commit call itself.

- Actually, I don't think that we need `commit()` in
`ProcessorRecordContext`. The main intuition is to "transfer"
`ProcessorContext.commit()` call to Rich interfaces, to support
user-specific committing.
To do so, we introduce `commit()` method in `RecordContext()` just only to
call ProcessorContext.commit() inside. (see the above code snippet)
So, in Rich interfaces, we are not dealing with `ProcessorRecordContext` at
all, and we leave all its methods as it is.

In this KIP, we made `RecordContext` to be the parent class of
`ProcessorRecordContext`, just because of they share quite amount of
methods and it is logical to enable inheritance between those two.

3. In the wiki page, the statement that "However, call to a commit()
method, is valid only within RecordContext interface (at least for now),
we throw an exception in ProcessorRecordContext.commit()." and the code
snippet below would need to be updated as well.

- I think above explanation covers this as well.

I want to gain some speed to this KIP, as it has gone though many changes
based on user/developer needs, both in documentation-/implementation-wise.

Cheers,
Jeyhun


On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com> wrote:

Thanks for the information Jeyhun. I had also forgot about KAFKA-3907 with
this KIP..

Thinking a bit more, I'm now inclined to go with what we agreed before, to
add the commit() call to `RecordContext`. A few minor tweaks on its
implementation:

1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
user to consolidate this call as
"processorContext.recordContext().commit()". And internal implementation
of `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed
to this call.

2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
that it can implement the commit call itself.

3. In the wiki page, the statement that "However, call to a commit()
method, is valid only within RecordContext interface (at least for now),
we throw an exception in ProcessorRecordContext.commit()." and the code
snippet below would need to be updated as well.

Guozhang


On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

Fair point. This is a long discussion and I totally forgot that we
discussed this.

Seems I changed my opinion about including KAFKA-3907...

Happy to hear what others think.

-Matthias

On 10/23/17 1:20 PM, Jeyhun Karimov wrote:

Hi Matthias,

It is probably my bad, the discussion was a bit long in this thread. I
proposed the related issue in the related KIP discuss thread [1] and got
an approval [2,3].
Maybe I misunderstood.

[1] http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-28 Thread Guozhang Wang
>>>>>>>>>>>>>
>>>>>>>>>>>> to
>>>>>>>>>>>>
>>>>>>>>>>>> ProcessorContext() .
>>>>>>>>>>>>>
>>>>>>>>>>>>>> If there is a misunderstanding in motvation/discussion of
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> KIP/included
>>>>>>>>>>>>>
>>>>>>>>>>>> jiras please let me know.
>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Jeyhun
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri 27. Oct 2017 at 12:39, Matthias J. Sax <
>>>>>>>>>>>>>> matth...@confluent.io
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am personally still not convinced, that we should add
>>>>>>>>>>>>>>
>>>>>>>>>>>>> `commit()`
>>>>>
>>>>>> at
>>>>>>>>>>>>>>
>>>>>>>>>>>>> all.
>>>>>>>>>>>>
>>>>>>>>>>>>> @Guozhang: you created the original Jira. Can you elaborate a
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> little
>>>>>>>>>>>>>>> bit? Isn't requesting commits a low level API that should
>>>>>>>>>>>>>>> not be
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> exposed
>>>>>>>>>>>>>>
>>>>>>>>>>>>> in the DSL? Just want to understand the motivation better. Why
>>>>>>>>>>>>>
>>>>>>>>>>>> would
>>>>>
>>>>>> anybody that uses the DSL ever want to request a commit? To
>>>>>>>>>>>>>>> me,
>>>>>>>>>>>>>>> requesting commits is useful if you manipulated state
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> explicitly,
>>>>>
>>>>>> ie,
>>>>>>>>>>>>>>
>>>>>>>>>>>>> via Processor API.
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, for the solution: it seem rather unnatural to me,
>>>>>>>>>>>>>>> that we
>>>>>>>>>>>>>>> add
>>>>>>>>>>>>>>> `commit()` to `RecordContext` -- from my understanding,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> `RecordContext`
>>>>>>>>>>>>>>
>>>>>>>>>>>>> is an helper object that provide access to record meta data.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Requesting
>>>>>>>>>>>>>>
>>>>>>>>>>>>> a commit is something quite different. Additionally, a commit
>>>>>>>>>>>>> does
>>>>>>>>>>>>>
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>
>>>>>>>>>>>>> commit a specific record but a `RecordContext` is for a
>>>>>>>>>>>> specific
>>>>>>>>>>>>
>>>>>>>>>>>>> record.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> To me, this does not seem to be a sound API design if we follow
>>>>>>>>>>>>>
>>>>>>>>>>>> this
>>>>>
>>>>>> path.
>&

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-24 Thread Jan Filipiak
 and the code snippet below would need to be updated as well.

Guozhang


On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

Fair point. This is a long discussion and I totally forgot that we
discussed this.

Seems I changed my opinion about including KAFKA-3907...

Happy to hear what others think.

-Matthias

On 10/23/17 1:20 PM, Jeyhun Karimov wrote:

Hi Matthias,

It is probably my bad, the discussion was a bit long in this thread. I
proposed the related issue in the related KIP discuss thread [1] and got
an approval [2,3].
Maybe I misunderstood.

[1] http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[2] http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[3] http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams

On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <matth...@confluent.io>
wrote:

Interesting.

I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is the
main motivation for this KIP :)

I also think, that we should not expose the full ProcessorContext at DSL
level.

Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
Manual commits are something DSL users should not worry about -- and if
one really needs this, an advanced user can still insert a dummy
`transform` to request a commit from there.

-Matthias

On 10/18/17 5:39 AM, Jeyhun Karimov wrote:

Hi,

The main intuition is to solve [1], which is part of this KIP.
I agree with you that this might not seem semantically correct as we are
not committing record state.

Alternatively, we can remove commit() from RecordContext and add
ProcessorContext (which has commit() method) as an extra argument to Rich
methods:

instead of

public interface RichValueMapper<V, VR, K> {
    VR apply(final V value,
             final K key,
             final RecordContext recordContext);
}

we can adopt

public interface RichValueMapper<V, VR, K> {
    VR apply(final V value,
             final K key,
             final RecordContext recordContext,
             final ProcessorContext processorContext);
}

However, in this case, a user can get confused as ProcessorContext and
RecordContext share some methods with the same name.

Cheers,
Jeyhun

[1] https://issues.apache.org/jira/browse/KAFKA-3907


On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang <wangg...@gmail.com> wrote:

Regarding #6 above, I'm still not clear why we would need `commit()` in
both ProcessorContext and RecordContext, could you elaborate a bit more?

To me `commit()` is really a processor context not a record context
logically: when you call that function, it means we would commit the
state of the whole task up to this processed record, not only that
single record itself.

Guozhang

On Mon, Oct 16, 2017 at 9:19 AM, Jeyhun Karimov <je.kari...@gmail.com>
wrote:

Hi,

Thanks for the feedback.

0. RichInitializer definition seems missing.

- Fixed.

I'd suggest moving the key parameter in the RichValueXX and RichReducer
after the value parameters, as well as in the templates; e.g.

public interface RichValueJoiner<V1, V2, VR, K> {
    VR apply(final V1 value1, final V2 value2, final K key,
             final RecordContext recordContext);
}

- Fixed.

2. Some of the listed functions are not necessary since their pairing
APIs are being deprecated in 1.0 already:

KGroupedStream<KR, V> groupBy(final RichKeyValueMapper<? super K, ? super V, KR> selector,
                              final Serde keySerde,
                              final Serde valSerde);

<VT, VR> KStream<K, VR> leftJoin(final KTable<K, VT> table,
                                 final RichValueJoiner<? super K, ? super V, ? super VT, ? extends VR> joiner,
                                 final Serde keySerde,
                                 final Serde valSerde);

-Fixed

3. For a few functions where we are adding three APIs for a combo of both
mapper / joiner, or both initializer / aggregator, or adder / subtractor,
I'm wondering if we can just keep one that use "rich" functions for both;
so that we can have less overloads and let users who only want to access
one of them to just use dummy parameter declarations. For example:

<GK, GV, RV> KStream<K, RV> join(final GlobalKTable<GK, GV> globalKTable,
                                 final RichKeyValueMapper<? super K, ? super V, ? extends GK> keyValueMapper,
                                 final RichValueJoiner<? super K, ? super V, ? super GV, ? extends RV> joiner);

-Agreed. Fixed.

4. For TimeWindowedKStream, I'm wondering 
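
To make point 3 above concrete, here is a sketch (against the KIP draft
signatures quoted above, not a released API; the stream, table, and
parameter ordering of RichKeyValueMapper are assumed) of how a single
"rich" overload could serve callers that do not need the record metadata
-- they simply ignore the extra parameter:

// assumed: KStream<String, String> stream; GlobalKTable<String, String> globalTable;
KStream<String, String> joined = stream.join(
        globalTable,
        // rich mapper: record context ignored (parameter order assumed)
        (key, value, recordContext) -> key,
        // rich joiner: record context used, per the draft RichValueJoiner shape
        (value1, value2, key, recordContext) ->
                value1 + value2 + "@" + recordContext.offset());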

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-23 Thread Matthias J. Sax
>>
>>>>>>>>>>>>> On 10/26/17 10:41 PM, Jeyhun Karimov wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for your suggestions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have some comments, to make sure that there is no
>>>>>>>>>>>>>>
>>>>>>>>>>>>> misunderstanding.
>>>>>>>>>>>>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext,
>>> to
>>>>>>>>>>>>> enforce
>>>>>>>>>>>> user to consolidate this call as
>>>>>>>>>>>>>>> "processorContext.recordContext().commit()". And internal
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> implementation
>>>>>>>>>>>> of
>>>>>>>>>>>>>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is
>>>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> changed
>>>>>>>>>>> to
>>>>>>>>>>>
>>>>>>>>>>>> this call.
>>>>>>>>>>>>>> - I think we should not deprecate
>>>>>>>>>>>>>> `ProcessorContext.commit()`.
>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>
>>>>>>>>>>>>> main
>>>>>>>>>>> intuition that we introduce `commit()` in `RecordContext` is
>>>>>>>>>>> that,
>>>>>>>>>>>>>> `RecordContext` is the one which is provided in Rich
>>> interfaces.
>>>>>>>>>>>>>> So
>>>>>>>>>>>>>>
>>>>>>>>>>>>> if
>>>>>>>>>>> user
>>>>>>>>>>>>>> wants to commit, then there should be some method inside
>>>>>>>>>>>>>>
>>>>>>>>>>>>> `RecordContext`
>>>>>>>>>>>> to
>>>>>>>>>>>>>> do so. Internally, `RecordContext.commit()` calls
>>>>>>>>>>>>>> `ProcessorContext.commit()`  (see the last code snippet in
>>>>>>>>>>>>>>
>>>>>>>>>>>>> KIP-159):
>>>>>>>>>> @Override
>>>>>>>>>>>>>>   public void process(final K1 key, final V1 value) {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   recordContext = new RecordContext()
>>>>>>>>>>>>>> {   //
>>>>>>>>>>>>>> recordContext initialization is added in this KIP
>>>>>>>>>>>>>>   @Override
>>>>>>>>>>>>>>   public void commit() {
>>>>>>>>>>>>>>   context().commit();
>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   @Override
>>>>>>>>>>>>>>   public long offset() {
>>>>>>>>>>>>>>   return context().recordContext().offset();
>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   @Override
>>>>>>>>>>>>>>   public long timestamp() {
>>>>>>>>>>>>>>   return
>>>>>>>>>>>>>> context().recordContext().timestamp();
>>>>>>>>>>>&

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-23 Thread Jan Filipiak
   recordContext = new RecordContext() {   //
recordContext initialization is added in this KIP
  @Override
  public void commit() {
  context().commit();
  }

  @Override
  public long offset() {
  return context().recordContext().offset();
  }

  @Override
  public long timestamp() {
  return context().recordContext().timestamp();
  }

  @Override
  public String topic() {
  return context().recordContext().topic();
  }

  @Override
  public int partition() {
  return context().recordContext().partition();
  }
};


So, we cannot deprecate `ProcessorContext.commit()` in this case IMO.

2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
that it can implement the commit call itself.

- Actually, I don't think that we need `commit()` in
`ProcessorRecordContext`. The main intuition is to "transfer"
`ProcessorContext.commit()` call to Rich interfaces, to support
user-specific committing.
To do so, we introduce `commit()` method in `RecordContext()` just only to
call ProcessorContext.commit() inside. (see the above code snippet)
So, in Rich interfaces, we are not dealing with `ProcessorRecordContext` at
all, and we leave all its methods as it is.

In this KIP, we made `RecordContext` to be the parent class of
`ProcessorRecordContext`, just because of they share quite amount of
methods and it is logical to enable inheritance between those two.

3. In the wiki page, the statement that "However, call to a commit()
method, is valid only within RecordContext interface (at least for now),
we throw an exception in ProcessorRecordContext.commit()." and the code
snippet below would need to be updated as well.

- I think above explanation covers this as well.

I want to gain some speed to this KIP, as it has gone though many changes
based on user/developer needs, both in documentation-/implementation-wise.

Cheers,
Jeyhun


On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com> wrote:

Thanks for the information Jeyhun. I had also forgot about KAFKA-3907 with
this KIP..

Thinking a bit more, I'm now inclined to go with what we agreed before, to
add the commit() call to `RecordContext`. A few minor tweaks on its
implementation:

1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
user to consolidate this call as
"processorContext.recordContext().commit()". And internal implementation
of `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed
to this call.

2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
that it can implement the commit call itself.

3. In the wiki page, the statement that "However, call to a commit()
method, is valid only within RecordContext interface (at least for now),
we throw an exception in ProcessorRecordContext.commit()." and the code
snippet below would need to be updated as well.

Guozhang


On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

Fair point. This is a long discussion and I totally forgot that we
discussed this.

Seems I changed my opinion about including KAFKA-3907...

Happy to hear what others think.

-Matthias

On 10/23/17 1:20 PM, Jeyhun Karimov wrote:

Hi Matthias,

It is probably my bad, the discussion was a bit long in this thread. I
proposed the related issue in the related KIP discuss thread [1] and got
an approval [2,3].
Maybe I misunderstood.

[1] http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[2] http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[3] http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams

On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <matth...@confluent.io>
wrote:

Interesting.

I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is the
main motivation for this KIP :)

I also think, that we should not expose the full ProcessorContext at DSL
level.

Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
Manual commits are something DSL users should not worry about -- and if
one really needs this, an advanced user can still insert a dummy
`transform` to request a commit from there.

-Matthias

On 10/18/17 5:39 AM, Jeyhun Karimov wrote:

Hi,

The main intuition is to solve [1], which is part of this KIP.
I agree with you that this might not seem semantically correct as we are
not committing record state.

Alternatively, we can remove commit() from RecordContex

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-22 Thread Damian Guy
>>>>>> via Processor API.
> >>>>>>>>>>
> >>>>>>>>>> Also, for the solution: it seem rather unnatural to me, that we
> >>>>>>>>>> add
> >>>>>>>>>> `commit()` to `RecordContext` -- from my understanding,
> >>>>>>>>>>
> >>>>>>>>> `RecordContext`
> >>>>>>>
> >>>>>>>> is an helper object that provide access to record meta data.
> >>>>>>>>>>
> >>>>>>>>> Requesting
> >>>>>>>
> >>>>>>>> a commit is something quite different. Additionally, a commit does
> >>>>>>>>>>
> >>>>>>>>> not
> >>>>>>
> >>>>>>>>>> commit a specific record but a `RecordContext` is for a specific
> >>>>>>>>>>
> >>>>>>>>> record.
> >>>>>>>
> >>>>>>>> To me, this does not seem to be a sound API design if we follow
> this
> >>>>>>>>>>
> >>>>>>>>> path.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> -Matthias
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 10/26/17 10:41 PM, Jeyhun Karimov wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for your suggestions.
> >>>>>>>>>>>
> >>>>>>>>>>> I have some comments, to make sure that there is no
> >>>>>>>>>>>
> >>>>>>>>>> misunderstanding.
> >>>>>>
> >>>>>>>
> >>>>>>>>>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext,
> to
> >>>>>>>>>>>
> >>>>>>>>>> enforce
> >>>>>>>>
> >>>>>>>>> user to consolidate this call as
> >>>>>>>>>>>> "processorContext.recordContext().commit()". And internal
> >>>>>>>>>>>>
> >>>>>>>>>>> implementation
> >>>>>>>>
> >>>>>>>>> of
> >>>>>>>>>>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also
> >>>>>>>>>>>>
> >>>>>>>>>>> changed
> >>>>>>>
> >>>>>>>> to
> >>>>>>>>
> >>>>>>>>> this call.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> - I think we should not deprecate `ProcessorContext.commit()`.
> >>>>>>>>>>> The
> >>>>>>>>>>>
> >>>>>>>>>> main
> >>>>>>>
> >>>>>>>> intuition that we introduce `commit()` in `RecordContext` is that,
> >>>>>>>>>>> `RecordContext` is the one which is provided in Rich
> interfaces.
> >>>>>>>>>>> So
> >>>>>>>>>>>
> >>>>>>>>>> if
> >>>>>>>
> >>>>>>>> user
> >>>>>>>>>>
> >>>>>>>>>>> wants to commit, then there should be some method inside
> >>>>>>>>>>>
> >>>>>>>>>> `RecordContext`
> >>>>>>>>
> >>>>>>>>> to
> >>>>>>>>>>
> >>>>>>>>>>> do so. Internally, `RecordContext.commit()` calls
> >>>>>>>>>>> `ProcessorContext.commit()`  (see the last code snippet in
> >>>>>>>>>>>
> >>>>>>>>>> KIP-159):
> >>>>>>
> >>

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-21 Thread Guozhang Wang
>>>>>>>>>>> Thanks for your suggestions.
>>>>>>>>>>>
>>>>>>>>>>> I have some comments, to make sure that there is no
>>>>>>>>>>>
>>>>>>>>>> misunderstanding.
>>>>>>
>>>>>>>
>>>>>>>>>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to
>>>>>>>>>>>
>>>>>>>>>> enforce
>>>>>>>>
>>>>>>>>> user to consolidate this call as
>>>>>>>>>>>> "processorContext.recordContext().commit()". And internal
>>>>>>>>>>>>
>>>>>>>>>>> implementation
>>>>>>>>
>>>>>>>>> of
>>>>>>>>>>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also
>>>>>>>>>>>>
>>>>>>>>>>> changed
>>>>>>>
>>>>>>>> to
>>>>>>>>
>>>>>>>>> this call.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> - I think we should not deprecate `ProcessorContext.commit()`.
>>>>>>>>>>> The
>>>>>>>>>>>
>>>>>>>>>> main
>>>>>>>
>>>>>>>> intuition that we introduce `commit()` in `RecordContext` is that,
>>>>>>>>>>> `RecordContext` is the one which is provided in Rich interfaces.
>>>>>>>>>>> So
>>>>>>>>>>>
>>>>>>>>>> if
>>>>>>>
>>>>>>>> user
>>>>>>>>>>
>>>>>>>>>>> wants to commit, then there should be some method inside
>>>>>>>>>>>
>>>>>>>>>> `RecordContext`
>>>>>>>>
>>>>>>>>> to
>>>>>>>>>>
>>>>>>>>>>> do so. Internally, `RecordContext.commit()` calls
>>>>>>>>>>> `ProcessorContext.commit()`  (see the last code snippet in
>>>>>>>>>>>
>>>>>>>>>> KIP-159):
>>>>>>
>>>>>>> @Override
>>>>>>>>>>>  public void process(final K1 key, final V1 value) {
>>>>>>>>>>>
>>>>>>>>>>>  recordContext = new RecordContext() {   //
>>>>>>>>>>> recordContext initialization is added in this KIP
>>>>>>>>>>>  @Override
>>>>>>>>>>>  public void commit() {
>>>>>>>>>>>  context().commit();
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  @Override
>>>>>>>>>>>  public long offset() {
>>>>>>>>>>>  return context().recordContext().offset();
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  @Override
>>>>>>>>>>>  public long timestamp() {
>>>>>>>>>>>  return context().recordContext().timestamp();
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  @Override
>>>>>>>>>>>  public String topic() {
>>>>>>>>>>>  return context().recordContext().topic();
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  @Override
>>>>>>>>>>>  public int partition() {
>>>>>>>>>>>  return context().recordContext().partition();
>>>>>>>>>>>  }
>>>>>>>>>>>}

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-20 Thread Jan Filipiak
introduce `commit()` method in `RecordContext()` just only to
call ProcessorContext.commit() inside. (see the above code snippet)
So, in Rich interfaces, we are not dealing with `ProcessorRecordContext` at
all, and we leave all its methods as it is.

In this KIP, we made `RecordContext` to be the parent class of
`ProcessorRecordContext`, just because of they share quite amount of
methods and it is logical to enable inheritance between those two.

3. In the wiki page, the statement that "However, call to a commit()
method, is valid only within RecordContext interface (at least for now),
we throw an exception in ProcessorRecordContext.commit()." and the code
snippet below would need to be updated as well.

- I think above explanation covers this as well.

I want to gain some speed to this KIP, as it has gone though many changes
based on user/developer needs, both in documentation-/implementation-wise.

Cheers,
Jeyhun


On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com> wrote:

Thanks for the information Jeyhun. I had also forgot about KAFKA-3907 with
this KIP..

Thinking a bit more, I'm now inclined to go with what we agreed before, to
add the commit() call to `RecordContext`. A few minor tweaks on its
implementation:

1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
user to consolidate this call as
"processorContext.recordContext().commit()". And internal implementation
of `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed
to this call.

2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
that it can implement the commit call itself.

3. In the wiki page, the statement that "However, call to a commit()
method, is valid only within RecordContext interface (at least for now),
we throw an exception in ProcessorRecordContext.commit()." and the code
snippet below would need to be updated as well.

Guozhang


On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

Fair point. This is a long discussion and I totally forgot that we
discussed this.

Seems I changed my opinion about including KAFKA-3907...

Happy to hear what others think.

-Matthias

On 10/23/17 1:20 PM, Jeyhun Karimov wrote:

Hi Matthias,

It is probably my bad, the discussion was a bit long in this thread. I
proposed the related issue in the related KIP discuss thread [1] and got
an approval [2,3].
Maybe I misunderstood.

[1] http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[2] http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[3] http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams

On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <matth...@confluent.io>
wrote:

Interesting.

I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is the
main motivation for this KIP :)

I also think, that we should not expose the full ProcessorContext at DSL
level.

Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
Manual commits are something DSL users should not worry about -- and if
one really needs this, an advanced user can still insert a dummy
`transform` to request a commit from there.

-Matthias

On 10/18/17 5:39 AM, Jeyhun Karimov wrote:

Hi,

The main intuition is to solve [1], which is part of this KIP.
I agree with you that this might not seem semantically correct as we are
not committing record state.

Alternatively, we can remove commit() from RecordContext and add
ProcessorContext (which has commit() method) as an extra argument to Rich
methods:

instead of

public interface RichValueMapper<V, VR, K> {
    VR apply(final V value,
             final K key,
             final RecordContext recordContext);
}

we can adopt

public interface RichValueMapper<V, VR, K> {
    VR apply(final V value,
             final K key,
             final RecordContext recordContext,
             final ProcessorContext processorContext);
}

However, in this case, a user can get confused as ProcessorContext and
RecordContext share some methods with the same name.

Cheers,
Jeyhun

[1] https://issues.apache.org/jira/browse/KAFKA-3907

On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang <wangg...@gmail.com> wrote:

Regarding #6 above, I'm still not clear why we would need `commit()` in
both ProcessorContext and RecordContext, could you elaborate a bit more?

To me `commit()` is really a processor context not a record context
logically: when you call that function, it means we would commit the
state of the whole task up to this processed record, not only that
single record itself.

Guozhang

On Mon, Oct 16, 2017 at 9:19 AM, Jeyhun Karimov <je.kari...@gmail.com>
wrote:
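
For reference, a minimal sketch of the "dummy `transform` to request a
commit" workaround Matthias mentions above (assuming the 1.0-era
Transformer API; the class name and the every-1000-records commit policy
are made up for the example):

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

// Pass-through transformer whose only job is to give an advanced DSL user
// access to ProcessorContext.commit().
public class CommitRequestingTransformer<K, V>
        implements Transformer<K, V, KeyValue<K, V>> {

    private ProcessorContext context;
    private long seen = 0;

    @Override
    public void init(final ProcessorContext context) {
        this.context = context;
    }

    @Override
    public KeyValue<K, V> transform(final K key, final V value) {
        if (++seen % 1000 == 0) {
            context.commit();   // request a commit every 1000 records (arbitrary policy)
        }
        return KeyValue.pair(key, value);   // forward the record unchanged
    }

    @Override
    @SuppressWarnings("deprecation")
    public KeyValue<K, V> punctuate(final long timestamp) {
        return null;   // deprecated in 1.0 and unused here
    }

    @Override
    public void close() { }
}

// usage: stream.transform(CommitRequestingTransformer::new).groupByKey() ...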

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-20 Thread Matthias J. Sax
record.
>>>>>>>>
>>>>>>>> To me, this does not seem to be a sound API design if we follow this
>>>>>> path.
>>>>>>>>
>>>>>>>>
>>>>>>>> -Matthias
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/26/17 10:41 PM, Jeyhun Karimov wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Thanks for your suggestions.
>>>>>>>>>
>>>>>>>>> I have some comments, to make sure that there is no
>>>> misunderstanding.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to
>>>>>> enforce
>>>>>>>>>> user to consolidate this call as
>>>>>>>>>> "processorContext.recordContext().commit()". And internal
>>>>>> implementation
>>>>>>>>>> of
>>>>>>>>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also
>>>>> changed
>>>>>> to
>>>>>>>>>> this call.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> - I think we should not deprecate `ProcessorContext.commit()`. The
>>>>> main
>>>>>>>>> intuition that we introduce `commit()` in `RecordContext` is that,
>>>>>>>>> `RecordContext` is the one which is provided in Rich interfaces. So
>>>>> if
>>>>>>>> user
>>>>>>>>> wants to commit, then there should be some method inside
>>>>>> `RecordContext`
>>>>>>>> to
>>>>>>>>> do so. Internally, `RecordContext.commit()` calls
>>>>>>>>> `ProcessorContext.commit()`  (see the last code snippet in
>>>> KIP-159):
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> public void process(final K1 key, final V1 value) {
>>>>>>>>>
>>>>>>>>> recordContext = new RecordContext() {   //
>>>>>>>>> recordContext initialization is added in this KIP
>>>>>>>>> @Override
>>>>>>>>> public void commit() {
>>>>>>>>> context().commit();
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> public long offset() {
>>>>>>>>> return context().recordContext().offset();
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> public long timestamp() {
>>>>>>>>> return context().recordContext().timestamp();
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> public String topic() {
>>>>>>>>> return context().recordContext().topic();
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> public int partition() {
>>>>>>>>> return context().recordContext().partition();
>>>>>>>>> }
>>>>>>>>>   };
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So, we cannot deprecate `ProcessorContext.commit()` in this case
>>>> IMO.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2. Add the `task` reference to the impl class,
>>>>>> `ProcessorRecordContext`,
>>>>>>>> so
>>>>>>>>>> that it can implement the commit call itself.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> - Actually, I don't think that we need `commit()` in
>&g

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-06 Thread Jeyhun Karimov
t;>> So, we cannot deprecate `ProcessorContext.commit()` in this case
> >> IMO.
> >>>>>>>
> >>>>>>>
> >>>>>>> 2. Add the `task` reference to the impl class,
> >>>> `ProcessorRecordContext`,
> >>>>>> so
> >>>>>>>> that it can implement the commit call itself.
> >>>>>>>
> >>>>>>>
> >>>>>>> - Actually, I don't think that we need `commit()` in
> >>>>>>> `ProcessorRecordContext`. The main intuition is to "transfer"
> >>>>>>> `ProcessorContext.commit()` call to Rich interfaces, to support
> >>>>>>> user-specific committing.
> >>>>>>>  To do so, we introduce `commit()` method in `RecordContext()` just
> >>>> only
> >>>>>> to
> >>>>>>> call ProcessorContext.commit() inside. (see the above code snippet)
> >>>>>>> So, in Rich interfaces, we are not dealing with
> >>>> `ProcessorRecordContext`
> >>>>>>> at all, and we leave all its methods as it is.
> >>>>>>> In this KIP, we made `RecordContext` to be the parent class of
> >>>>>>> `ProcessorRecordContext`, just because of they share quite amount
> >> of
> >>>>>>> methods and it is logical to enable inheritance between those two.
> >>>>>>>
> >>>>>>> 3. In the wiki page, the statement that "However, call to a
> >> commit()
> >>>>>> method,
> >>>>>>>> is valid only within RecordContext interface (at least for now),
> >> we
> >>>>>> throw
> >>>>>>>> an exception in ProcessorRecordContext.commit()." and the code
> >>> snippet
> >>>>>>>> below would need to be updated as well.
> >>>>>>>
> >>>>>>>
> >>>>>>> - I think above explanation covers this as well.
> >>>>>>>
> >>>>>>>
> >>>>>>> I want to gain some speed to this KIP, as it has gone though many
> >>>> changes
> >>>>>>> based on user/developer needs, both in
> >>>>>> documentation-/implementation-wise.
> >>>>>>>
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Jeyhun
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks for the information Jeyhun. I had also forgot about
> >>> KAFKA-3907
> >>>>>> with
> >>>>>>>> this KIP..
> >>>>>>>>
> >>>>>>>> Thinking a bit more, I'm now inclined to go with what we agreed
> >>>> before,
> >>>>>> to
> >>>>>>>> add the commit() call to `RecordContext`. A few minor tweaks on
> >> its
> >>>>>>>> implementation:
> >>>>>>>>
> >>>>>>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to
> >>>> enforce
> >>>>>>>> user to consolidate this call as
> >>>>>>>> "processorContext.recordContext().commit()". And internal
> >>>> implementation
> >>>>>>>> of
> >>>>>>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also
> >>> changed
> >>>> to
> >>>>>>>> this call.
> >>>>>>>>
> >>>>>>>> 2. Add the `task` reference to the impl class,
> >>>>>> `ProcessorRecordContext`, so
> >>>>>>>> that it can implement the commit call itself.
> >>>>>>>>
> >>>>>>>> 3. In the wiki page, the statement that "However, call to a
> >> commit()
> >>>>>>>> method,
> >>>>>>>> is valid only within RecordContext interface (at least for now),
> >> we
> >>>>>> throw
> >>>>>>>> an exception in ProcessorRecordContext.commi

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-06 Thread Matthias J. Sax
>>>>>> To me, this does not seem to be a sound API design if we follow this
>>>> path.
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/26/17 10:41 PM, Jeyhun Karimov wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for your suggestions.
>>>>>>>
>>>>>>> I have some comments, to make sure that there is no
>> misunderstanding.
>>>>>>>
>>>>>>>
>>>>>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to
>>>> enforce
>>>>>>>> user to consolidate this call as
>>>>>>>> "processorContext.recordContext().commit()". And internal
>>>> implementation
>>>>>>>> of
>>>>>>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also
>>> changed
>>>> to
>>>>>>>> this call.
>>>>>>>
>>>>>>>
>>>>>>> - I think we should not deprecate `ProcessorContext.commit()`. The
>>> main
>>>>>>> intuition that we introduce `commit()` in `RecordContext` is that,
>>>>>>> `RecordContext` is the one which is provided in Rich interfaces. So
>>> if
>>>>>> user
>>>>>>> wants to commit, then there should be some method inside
>>>> `RecordContext`
>>>>>> to
>>>>>>> do so. Internally, `RecordContext.commit()` calls
>>>>>>> `ProcessorContext.commit()`  (see the last code snippet in
>> KIP-159):
>>>>>>>
>>>>>>> @Override
>>>>>>> public void process(final K1 key, final V1 value) {
>>>>>>>
>>>>>>> recordContext = new RecordContext() {   //
>>>>>>> recordContext initialization is added in this KIP
>>>>>>> @Override
>>>>>>> public void commit() {
>>>>>>> context().commit();
>>>>>>> }
>>>>>>>
>>>>>>> @Override
>>>>>>> public long offset() {
>>>>>>> return context().recordContext().offset();
>>>>>>> }
>>>>>>>
>>>>>>> @Override
>>>>>>> public long timestamp() {
>>>>>>> return context().recordContext().timestamp();
>>>>>>> }
>>>>>>>
>>>>>>> @Override
>>>>>>> public String topic() {
>>>>>>> return context().recordContext().topic();
>>>>>>> }
>>>>>>>
>>>>>>> @Override
>>>>>>> public int partition() {
>>>>>>> return context().recordContext().partition();
>>>>>>> }
>>>>>>>   };
>>>>>>>
>>>>>>>
>>>>>>> So, we cannot deprecate `ProcessorContext.commit()` in this case
>> IMO.
>>>>>>>
>>>>>>>
>>>>>>> 2. Add the `task` reference to the impl class,
>>>> `ProcessorRecordContext`,
>>>>>> so
>>>>>>>> that it can implement the commit call itself.
>>>>>>>
>>>>>>>
>>>>>>> - Actually, I don't think that we need `commit()` in
>>>>>>> `ProcessorRecordContext`. The main intuition is to "transfer"
>>>>>>> `ProcessorContext.commit()` call to Rich interfaces, to support
>>>>>>> user-specific committing.
>>>>>>>  To do so, we introduce `commit()` method in `RecordContext()` just
>>>> only
>>>>>> to
>>>>>>> call ProcessorContext.commit() inside. (see the above code snippet)
>>>>>>> So, in Rich interfaces, we are not dealing with
>>>> `ProcessorRecordContext`
>>>>>>> at all, and we leave all its methods as it is.
>>>>>>> In this KIP, we made `RecordConte

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-01 Thread Guozhang Wang
be updated as well.
> > > >>>
> > > >>>
> > > >>> - I think above explanation covers this as well.
> > > >>>
> > > >>>
> > > >>> I want to gain some speed to this KIP, as it has gone though many
> > > changes
> > > >>> based on user/developer needs, both in
> > > >> documentation-/implementation-wise.
> > > >>>
> > > >>>
> > > >>> Cheers,
> > > >>> Jeyhun
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com>
> > > >> wrote:
> > > >>>
> > > >>>> Thanks for the information Jeyhun. I had also forgot about
> > KAFKA-3907
> > > >> with
> > > >>>> this KIP..
> > > >>>>
> > > >>>> Thinking a bit more, I'm now inclined to go with what we agreed
> > > before,
> > > >> to
> > > >>>> add the commit() call to `RecordContext`. A few minor tweaks on
> its
> > > >>>> implementation:
> > > >>>>
> > > >>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to
> > > enforce
> > > >>>> user to consolidate this call as
> > > >>>> "processorContext.recordContext().commit()". And internal
> > > implementation
> > > >>>> of
> > > >>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also
> > changed
> > > to
> > > >>>> this call.
> > > >>>>
> > > >>>> 2. Add the `task` reference to the impl class,
> > > >> `ProcessorRecordContext`, so
> > > >>>> that it can implement the commit call itself.
> > > >>>>
> > > >>>> 3. In the wiki page, the statement that "However, call to a
> commit()
> > > >>>> method,
> > > >>>> is valid only within RecordContext interface (at least for now),
> we
> > > >> throw
> > > >>>> an exception in ProcessorRecordContext.commit()." and the code
> > snippet
> > > >>>> below would need to be updated as well.
> > > >>>>
> > > >>>>
> > > >>>> Guozhang
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <
> > > matth...@confluent.io
> > > >>>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Fair point. This is a long discussion and I totally forgot that
> we
> > > >>>>> discussed this.
> > > >>>>>
> > > >>>>> Seems I changed my opinion about including KAFKA-3907...
> > > >>>>>
> > > >>>>> Happy to hear what others think.
> > > >>>>>
> > > >>>>>
> > > >>>>> -Matthias
> > > >>>>>
> > > >>>>> On 10/23/17 1:20 PM, Jeyhun Karimov wrote:
> > > >>>>>> Hi Matthias,
> > > >>>>>>
> > > >>>>>> It is probably my bad, the discussion was a bit long in this
> > > thread. I
> > > >>>>>> proposed the related issue in the related KIP discuss thread [1]
> > and
> > > >>>> got
> > > >>>>> an
> > > >>>>>> approval [2,3].
> > > >>>>>> Maybe I misunderstood.
> > > >>>>>>
> > > >>>>>> [1]
> > > >>>>>> http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=
> > > >>>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> > > >>>>>> [2]
> > > >>>>>> http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=
> > > >>>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> > > >>>>>> [3]
> > > >>>>>> http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=
> > > >>>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> > > >>>>

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-11-01 Thread Damian Guy
> > >>> wants to commit, then there should be some method inside `RecordContext` to
> > >>> do so. Internally, `RecordContext.commit()` calls
> > >>> `ProcessorContext.commit()`  (see the last code snippet in KIP-159):
> > >>>
> > >>> @Override
> > >>> public void process(final K1 key, final V1 value) {
> > >>>
> > >>> recordContext = new RecordContext() {   //
> > >>> recordContext initialization is added in this KIP
> > >>> @Override
> > >>> public void commit() {
> > >>> context().commit();
> > >>> }
> > >>>
> > >>> @Override
> > >>> public long offset() {
> > >>> return context().recordContext().offset();
> > >>> }
> > >>>
> > >>> @Override
> > >>> public long timestamp() {
> > >>> return context().recordContext().timestamp();
> > >>> }
> > >>>
> > >>> @Override
> > >>> public String topic() {
> > >>> return context().recordContext().topic();
> > >>> }
> > >>>
> > >>> @Override
> > >>> public int partition() {
> > >>> return context().recordContext().partition();
> > >>> }
> > >>>   };
> > >>>
> > >>>
> > >>> So, we cannot deprecate `ProcessorContext.commit()` in this case IMO.
> > >>>
> > >>>
> > >>> 2. Add the `task` reference to the impl class,
> > `ProcessorRecordContext`,
> > >> so
> > >>>> that it can implement the commit call itself.
> > >>>
> > >>>
> > >>> - Actually, I don't think that we need `commit()` in
> > >>> `ProcessorRecordContext`. The main intuition is to "transfer"
> > >>> `ProcessorContext.commit()` call to Rich interfaces, to support
> > >>> user-specific committing.
> > >>>  To do so, we introduce `commit()` method in `RecordContext()` just
> > only
> > >> to
> > >>> call ProcessorContext.commit() inside. (see the above code snippet)
> > >>> So, in Rich interfaces, we are not dealing with
> > `ProcessorRecordContext`
> > >>> at all, and we leave all its methods as it is.
> > >>> In this KIP, we made `RecordContext` to be the parent class of
> > >>> `ProcessorRecordContext`, just because of they share quite amount of
> > >>> methods and it is logical to enable inheritance between those two.
> > >>>
> > >>> 3. In the wiki page, the statement that "However, call to a commit()
> > >> method,
> > >>>> is valid only within RecordContext interface (at least for now), we
> > >> throw
> > >>>> an exception in ProcessorRecordContext.commit()." and the code
> snippet
> > >>>> below would need to be updated as well.
> > >>>
> > >>>
> > >>> - I think above explanation covers this as well.
> > >>>
> > >>>
> > >>> I want to gain some speed to this KIP, as it has gone though many
> > changes
> > >>> based on user/developer needs, both in
> > >> documentation-/implementation-wise.
> > >>>
> > >>>
> > >>> Cheers,
> > >>> Jeyhun
> > >>>
> > >>>
> > >>>
> > >>> On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com>
> > >> wrote:
> > >>>
> > >>>> Thanks for the information Jeyhun. I had also forgot about
> KAFKA-3907
> > >> with
> > >>>> this KIP..
> > >>>>
> > >>>> Thinking a bit more, I'm now inclined to go with what we agreed
> > before,
> > >> to
> > >>>> add the commit() call to `RecordContext`. A few minor tweaks on its
> > >>>> implementation:
> > >>>>
> > >>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to
> > enforce
> > >>>> user to consolidate this call as
> > >>>> "processorContext.recordContext().commit()". And internal
> > imple

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-31 Thread Jeyhun Karimov
> >>> return context().recordContext().offset();
> >>> }
> >>>
> >>> @Override
> >>> public long timestamp() {
> >>> return context().recordContext().timestamp();
> >>> }
> >>>
> >>> @Override
> >>> public String topic() {
> >>> return context().recordContext().topic();
> >>> }
> >>>
> >>> @Override
> >>> public int partition() {
> >>> return context().recordContext().partition();
> >>> }
> >>>   };
> >>>
> >>>
> >>> So, we cannot deprecate `ProcessorContext.commit()` in this case IMO.
> >>>
> >>>
> >>> 2. Add the `task` reference to the impl class,
> `ProcessorRecordContext`,
> >> so
> >>>> that it can implement the commit call itself.
> >>>
> >>>
> >>> - Actually, I don't think that we need `commit()` in
> >>> `ProcessorRecordContext`. The main intuition is to "transfer"
> >>> `ProcessorContext.commit()` call to Rich interfaces, to support
> >>> user-specific committing.
> >>>  To do so, we introduce `commit()` method in `RecordContext()` just
> only
> >> to
> >>> call ProcessorContext.commit() inside. (see the above code snippet)
> >>> So, in Rich interfaces, we are not dealing with
> `ProcessorRecordContext`
> >>> at all, and we leave all its methods as it is.
> >>> In this KIP, we made `RecordContext` to be the parent class of
> >>> `ProcessorRecordContext`, just because of they share quite amount of
> >>> methods and it is logical to enable inheritance between those two.
> >>>
> >>> 3. In the wiki page, the statement that "However, call to a commit()
> >> method,
> >>>> is valid only within RecordContext interface (at least for now), we
> >> throw
> >>>> an exception in ProcessorRecordContext.commit()." and the code snippet
> >>>> below would need to be updated as well.
> >>>
> >>>
> >>> - I think above explanation covers this as well.
> >>>
> >>>
> >>> I want to gain some speed to this KIP, as it has gone though many
> changes
> >>> based on user/developer needs, both in
> >> documentation-/implementation-wise.
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>>
> >>>
> >>> On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com>
> >> wrote:
> >>>
> >>>> Thanks for the information Jeyhun. I had also forgot about KAFKA-3907
> >> with
> >>>> this KIP..
> >>>>
> >>>> Thinking a bit more, I'm now inclined to go with what we agreed
> before,
> >> to
> >>>> add the commit() call to `RecordContext`. A few minor tweaks on its
> >>>> implementation:
> >>>>
> >>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to
> enforce
> >>>> user to consolidate this call as
> >>>> "processorContext.recordContext().commit()". And internal
> implementation
> >>>> of
> >>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed
> to
> >>>> this call.
> >>>>
> >>>> 2. Add the `task` reference to the impl class,
> >> `ProcessorRecordContext`, so
> >>>> that it can implement the commit call itself.
> >>>>
> >>>> 3. In the wiki page, the statement that "However, call to a commit()
> >>>> method,
> >>>> is valid only within RecordContext interface (at least for now), we
> >> throw
> >>>> an exception in ProcessorRecordContext.commit()." and the code snippet
> >>>> below would need to be updated as well.
> >>>>
> >>>>
> >>>> Guozhang
> >>>>
> >>>>
> >>>>
> >>>> On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <
> matth...@confluent.io
> >>>
> >>>> wrote:
> >>>>
> >>>>> Fair point. This is a long discussion and I totally forgot that we
> >>>>> discussed this.
> >>>>>
&

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-27 Thread Matthias J. Sax
>>>> 2. Add the `task` reference to the impl class, `ProcessorRecordContext`,
>> so
>>>> that it can implement the commit call itself.
>>>
>>>
>>> - Actually, I don't think that we need `commit()` in
>>> `ProcessorRecordContext`. The main intuition is to "transfer"
>>> `ProcessorContext.commit()` call to Rich interfaces, to support
>>> user-specific committing.
>>>  To do so, we introduce `commit()` method in `RecordContext()` just only
>> to
>>> call ProcessorContext.commit() inside. (see the above code snippet)
>>> So, in Rich interfaces, we are not dealing with  `ProcessorRecordContext`
>>> at all, and we leave all its methods as it is.
>>> In this KIP, we made `RecordContext` to be the parent class of
>>> `ProcessorRecordContext`, just because of they share quite amount of
>>> methods and it is logical to enable inheritance between those two.
>>>
>>> 3. In the wiki page, the statement that "However, call to a commit()
>> method,
>>>> is valid only within RecordContext interface (at least for now), we
>> throw
>>>> an exception in ProcessorRecordContext.commit()." and the code snippet
>>>> below would need to be updated as well.
>>>
>>>
>>> - I think above explanation covers this as well.
>>>
>>>
>>> I want to gain some speed to this KIP, as it has gone though many changes
>>> based on user/developer needs, both in
>> documentation-/implementation-wise.
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>>
>>>
>>> On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com>
>> wrote:
>>>
>>>> Thanks for the information Jeyhun. I had also forgot about KAFKA-3907
>> with
>>>> this KIP..
>>>>
>>>> Thinking a bit more, I'm now inclined to go with what we agreed before,
>> to
>>>> add the commit() call to `RecordContext`. A few minor tweaks on its
>>>> implementation:
>>>>
>>>> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
>>>> user to consolidate this call as
>>>> "processorContext.recordContext().commit()". And internal implementation
>>>> of
>>>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed to
>>>> this call.
>>>>
>>>> 2. Add the `task` reference to the impl class,
>> `ProcessorRecordContext`, so
>>>> that it can implement the commit call itself.
>>>>
>>>> 3. In the wiki page, the statement that "However, call to a commit()
>>>> method,
>>>> is valid only within RecordContext interface (at least for now), we
>> throw
>>>> an exception in ProcessorRecordContext.commit()." and the code snippet
>>>> below would need to be updated as well.
>>>>
>>>>
>>>> Guozhang
>>>>
>>>>
>>>>
>>>> On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io
>>>
>>>> wrote:
>>>>
>>>>> Fair point. This is a long discussion and I totally forgot that we
>>>>> discussed this.
>>>>>
>>>>> Seems I changed my opinion about including KAFKA-3907...
>>>>>
>>>>> Happy to hear what others think.
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>> On 10/23/17 1:20 PM, Jeyhun Karimov wrote:
>>>>>> Hi Matthias,
>>>>>>
>>>>>> It is probably my bad, the discussion was a bit long in this thread. I
>>>>>> proposed the related issue in the related KIP discuss thread [1] and
>>>> got
>>>>> an
>>>>>> approval [2,3].
>>>>>> Maybe I misunderstood.
>>>>>>
>>>>>> [1]
>>>>>> http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=
>>>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
>>>>>> [2]
>>>>>> http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=
>>>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
>>>>>> [3]
>>>>>> http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=
>>>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
>>>>>>
>>>>>>
>>>>>> On Mon, 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-27 Thread Jeyhun Karimov
> > at all, and we leave all its methods as it is.
> > In this KIP, we made `RecordContext` to be the parent class of
> > `ProcessorRecordContext`, just because of they share quite amount of
> > methods and it is logical to enable inheritance between those two.
> >
> > 3. In the wiki page, the statement that "However, call to a commit()
> method,
> >> is valid only within RecordContext interface (at least for now), we
> throw
> >> an exception in ProcessorRecordContext.commit()." and the code snippet
> >> below would need to be updated as well.
> >
> >
> > - I think above explanation covers this as well.
> >
> >
> > I want to gain some speed to this KIP, as it has gone though many changes
> > based on user/developer needs, both in
> documentation-/implementation-wise.
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> >
> > On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com>
> wrote:
> >
> >> Thanks for the information Jeyhun. I had also forgot about KAFKA-3907
> with
> >> this KIP..
> >>
> >> Thinking a bit more, I'm now inclined to go with what we agreed before,
> to
> >> add the commit() call to `RecordContext`. A few minor tweaks on its
> >> implementation:
> >>
> >> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
> >> user to consolidate this call as
> >> "processorContext.recordContext().commit()". And internal implementation
> >> of
> >> `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed to
> >> this call.
> >>
> >> 2. Add the `task` reference to the impl class,
> `ProcessorRecordContext`, so
> >> that it can implement the commit call itself.
> >>
> >> 3. In the wiki page, the statement that "However, call to a commit()
> >> method,
> >> is valid only within RecordContext interface (at least for now), we
> throw
> >> an exception in ProcessorRecordContext.commit()." and the code snippet
> >> below would need to be updated as well.
> >>
> >>
> >> Guozhang
> >>
> >>
> >>
> >> On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io
> >
> >> wrote:
> >>
> >>> Fair point. This is a long discussion and I totally forgot that we
> >>> discussed this.
> >>>
> >>> Seems I changed my opinion about including KAFKA-3907...
> >>>
> >>> Happy to hear what others think.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> On 10/23/17 1:20 PM, Jeyhun Karimov wrote:
> >>>> Hi Matthias,
> >>>>
> >>>> It is probably my bad, the discussion was a bit long in this thread. I
> >>>> proposed the related issue in the related KIP discuss thread [1] and
> >> got
> >>> an
> >>>> approval [2,3].
> >>>> Maybe I misunderstood.
> >>>>
> >>>> [1]
> >>>> http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=
> >>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> >>>> [2]
> >>>> http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=
> >>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> >>>> [3]
> >>>> http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=
> >>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> >>>>
> >>>>
> >>>> On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <
> matth...@confluent.io
> >>>
> >>>> wrote:
> >>>>
> >>>>> Interesting.
> >>>>>
> >>>>> I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is
> >> the
> >>>>> main motivation for this KIP :)
> >>>>>
> >>>>> I also think, that we should not expose the full ProcessorContext at
> >> DSL
> >>>>> level.
> >>>>>
> >>>>> Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
> >>>>> Manual commits are something DSL users should not worry about -- and
> >> if
> >>>>> one really needs this, an advanced user can still insert a dummy
> >>>>> `transform` to request a commit from there.
> >>>>>
> >>>>> -Matthias
> >>>>>
> >>>>

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-27 Thread Matthias J. Sax
>> "processorContext.recordContext().commit()". And internal implementation
>> of
>> `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed to
>> this call.
>>
>> 2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
>> that it can implement the commit call itself.
>>
>> 3. In the wiki page, the statement that "However, call to a commit()
>> method,
>> is valid only within RecordContext interface (at least for now), we throw
>> an exception in ProcessorRecordContext.commit()." and the code snippet
>> below would need to be updated as well.
>>
>>
>> Guozhang
>>
>>
>>
>> On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io>
>> wrote:
>>
>>> Fair point. This is a long discussion and I totally forgot that we
>>> discussed this.
>>>
>>> Seems I changed my opinion about including KAFKA-3907...
>>>
>>> Happy to hear what others think.
>>>
>>>
>>> -Matthias
>>>
>>> On 10/23/17 1:20 PM, Jeyhun Karimov wrote:
>>>> Hi Matthias,
>>>>
>>>> It is probably my bad, the discussion was a bit long in this thread. I
>>>> proposed the related issue in the related KIP discuss thread [1] and
>> got
>>> an
>>>> approval [2,3].
>>>> Maybe I misunderstood.
>>>>
>>>> [1]
>>>> http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=
>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
>>>> [2]
>>>> http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=
>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
>>>> [3]
>>>> http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=
>>> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
>>>>
>>>>
>>>> On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <matth...@confluent.io
>>>
>>>> wrote:
>>>>
>>>>> Interesting.
>>>>>
>>>>> I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is
>> the
>>>>> main motivation for this KIP :)
>>>>>
>>>>> I also think, that we should not expose the full ProcessorContext at
>> DSL
>>>>> level.
>>>>>
>>>>> Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
>>>>> Manual commits are something DSL users should not worry about -- and
>> if
>>>>> one really needs this, an advanced user can still insert a dummy
>>>>> `transform` to request a commit from there.
>>>>>
>>>>> -Matthias
>>>>>
>>>>>
>>>>> On 10/18/17 5:39 AM, Jeyhun Karimov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> The main intuition is to solve [1], which is part of this KIP.
>>>>>> I agree with you that this might not seem semantically correct as we
>>> are
>>>>>> not committing record state.
>>>>>> Alternatively, we can remove commit() from RecordContext and add
>>>>>> ProcessorContext (which has commit() method) as an extra argument to
>>> Rich
>>>>>> methods:
>>>>>>
>>>>>> instead of
>>>>>> public interface RichValueMapper<V, VR, K> {
>>>>>> VR apply(final V value,
>>>>>>  final K key,
>>>>>>  final RecordContext recordContext);
>>>>>> }
>>>>>>
>>>>>> we can adopt
>>>>>>
>>>>>> public interface RichValueMapper<V, VR, K> {
>>>>>> VR apply(final V value,
>>>>>>  final K key,
>>>>>>  final RecordContext recordContext,
>>>>>>  final ProcessorContext processorContext);
>>>>>> }
>>>>>>
>>>>>>
>>>>>> However, in this case, a user can get confused as ProcessorContext
>> and
>>>>>> RecordContext share some methods with the same name.
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Jeyhun
>>>>>>
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/KAFKA-3907
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-26 Thread Jeyhun Karimov
Hi,

Thanks for your suggestions.

I have some comments, to make sure that there is no misunderstanding.


1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
> user to consolidate this call as
> "processorContext.recordContext().commit()". And internal implementation
> of
> `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed to
> this call.


- I think we should not deprecate `ProcessorContext.commit()`. The main
reason we introduce `commit()` in `RecordContext` is that `RecordContext`
is the one provided to the Rich interfaces. So if a user wants to commit,
there should be some method inside `RecordContext` to do so. Internally,
`RecordContext.commit()` calls `ProcessorContext.commit()` (see the last
code snippet in KIP-159):

@Override
public void process(final K1 key, final V1 value) {

recordContext = new RecordContext() {   //
recordContext initialization is added in this KIP
@Override
public void commit() {
context().commit();
}

@Override
public long offset() {
return context().recordContext().offset();
}

@Override
public long timestamp() {
return context().recordContext().timestamp();
}

@Override
public String topic() {
return context().recordContext().topic();
}

@Override
public int partition() {
return context().recordContext().partition();
}
  };


So, we cannot deprecate `ProcessorContext.commit()` in this case IMO.


2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
> that it can implement the commit call itself.


- Actually, I don't think that we need `commit()` in
`ProcessorRecordContext`. The main intuition is to "transfer" the
`ProcessorContext.commit()` call to the Rich interfaces, to support
user-specific committing.
To do so, we introduce a `commit()` method in `RecordContext` only to
call `ProcessorContext.commit()` inside (see the code snippet above).
So, in the Rich interfaces, we are not dealing with `ProcessorRecordContext`
at all, and we leave all of its methods as they are.
In this KIP, we made `RecordContext` the parent class of
`ProcessorRecordContext` simply because they share quite a few methods, so
it is logical to enable inheritance between the two (a rough sketch of this
hierarchy follows right below).
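
To make this concrete, here is a rough, non-authoritative sketch of the
hierarchy (simplified, not the literal KIP code):

public interface RecordContext {
    void commit();      // added by this KIP; in the DSL it delegates to ProcessorContext.commit()
    long offset();
    long timestamp();
    String topic();
    int partition();
}

// The internal ProcessorRecordContext simply extends/implements RecordContext and
// keeps its existing offset()/timestamp()/topic()/partition() behaviour untouched.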

3. In the wiki page, the statement that "However, call to a commit() method,
> is valid only within RecordContext interface (at least for now), we throw
> an exception in ProcessorRecordContext.commit()." and the code snippet
> below would need to be updated as well.


- I think above explanation covers this as well.


I want to gain some speed on this KIP, as it has gone through many changes
based on user/developer needs, both documentation- and implementation-wise.


Cheers,
Jeyhun



On Tue, Oct 24, 2017 at 1:41 AM Guozhang Wang <wangg...@gmail.com> wrote:

> Thanks for the information Jeyhun. I had also forgot about KAFKA-3907 with
> this KIP..
>
> Thinking a bit more, I'm now inclined to go with what we agreed before, to
> add the commit() call to `RecordContext`. A few minor tweaks on its
> implementation:
>
> 1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
> user to consolidate this call as
> "processorContext.recordContext().commit()". And internal implementation
> of
> `ProcessorContext.commit()` in `ProcessorContextImpl` is also changed to
> this call.
>
> 2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
> that it can implement the commit call itself.
>
> 3. In the wiki page, the statement that "However, call to a commit()
> method,
> is valid only within RecordContext interface (at least for now), we throw
> an exception in ProcessorRecordContext.commit()." and the code snippet
> below would need to be updated as well.
>
>
> Guozhang
>
>
>
> On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io>
> wrote:
>
> > Fair point. This is a long discussion and I totally forgot that we
> > discussed this.
> >
> > Seems I changed my opinion about including KAFKA-3907...
> >
> > Happy to hear what others think.
> >
> >
> > -Matthias
> >
> > On 10/23/17 1:20 PM, Jeyhun Karimov wrote:
> > > Hi Matthias,
> > >
> > > It is probably my bad, the discussion was a bit long in this thread. I
> > > proposed the related issue in the related KIP discuss thread [1] and
> got
> > an
> > > approval [2,3].
> > > Maybe I misunderstood.
> > >
> > > [1]
> > > http://sear

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-23 Thread Guozhang Wang
Thanks for the information Jeyhun. I had also forgotten about KAFKA-3907 in
the context of this KIP.

Thinking a bit more, I'm now inclined to go with what we agreed before, to
add the commit() call to `RecordContext`. A few minor tweaks on its
implementation:

1. Maybe we can deprecate the `commit()` in ProcessorContext, to enforce
user to consolidate this call as
"processorContext.recordContext().commit()". And internal implementation of
`ProcessorContext.commit()` in `ProcessorContextImpl` is also changed to
this call.

2. Add the `task` reference to the impl class, `ProcessorRecordContext`, so
that it can implement the commit call itself.

3. In the wiki page, the statement that "However, call to a commit() method,
is valid only within RecordContext interface (at least for now), we throw
an exception in ProcessorRecordContext.commit()." and the code snippet
below would need to be updated as well.
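
To make the above concrete, here is a rough sketch of tweaks 1 and 2 (note:
the internal names I use below, e.g. the `StreamTask` field wiring and a
`requestCommit()` hook, are my assumptions, not actual code):

public interface ProcessorContext {
    // ... existing methods stay as they are ...
    @Deprecated
    void commit();                  // tweak 1: deprecate in favor of recordContext().commit()
    RecordContext recordContext();
}

public abstract class ProcessorRecordContext implements RecordContext {
    private final StreamTask task;  // tweak 2: keep a reference to the owning task

    public ProcessorRecordContext(final StreamTask task) {
        this.task = task;
    }

    @Override
    public void commit() {
        task.requestCommit();       // assumed internal hook that asks the task to commit
    }

    // offset(), timestamp(), topic(), partition() stay as today
}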


Guozhang



On Mon, Oct 23, 2017 at 1:40 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

> Fair point. This is a long discussion and I totally forgot that we
> discussed this.
>
> Seems I changed my opinion about including KAFKA-3907...
>
> Happy to hear what others think.
>
>
> -Matthias
>
> On 10/23/17 1:20 PM, Jeyhun Karimov wrote:
> > Hi Matthias,
> >
> > It is probably my bad, the discussion was a bit long in this thread. I
> > proposed the related issue in the related KIP discuss thread [1] and got
> an
> > approval [2,3].
> > Maybe I misunderstood.
> >
> > [1]
> > http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=
> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> > [2]
> > http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=
> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> > [3]
> > http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=
> Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> >
> >
> > On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <matth...@confluent.io>
> > wrote:
> >
> >> Interesting.
> >>
> >> I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is the
> >> main motivation for this KIP :)
> >>
> >> I also think, that we should not expose the full ProcessorContext at DSL
> >> level.
> >>
> >> Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
> >> Manual commits are something DSL users should not worry about -- and if
> >> one really needs this, an advanced user can still insert a dummy
> >> `transform` to request a commit from there.
> >>
> >> -Matthias
> >>
> >>
> >> On 10/18/17 5:39 AM, Jeyhun Karimov wrote:
> >>> Hi,
> >>>
> >>> The main intuition is to solve [1], which is part of this KIP.
> >>> I agree with you that this might not seem semantically correct as we
> are
> >>> not committing record state.
> >>> Alternatively, we can remove commit() from RecordContext and add
> >>> ProcessorContext (which has commit() method) as an extra argument to
> Rich
> >>> methods:
> >>>
> >>> instead of
> >>> public interface RichValueMapper<V, VR, K> {
> >>> VR apply(final V value,
> >>>  final K key,
> >>>  final RecordContext recordContext);
> >>> }
> >>>
> >>> we can adopt
> >>>
> >>> public interface RichValueMapper<V, VR, K> {
> >>> VR apply(final V value,
> >>>  final K key,
> >>>  final RecordContext recordContext,
> >>>  final ProcessorContext processorContext);
> >>> }
> >>>
> >>>
> >>> However, in this case, a user can get confused as ProcessorContext and
> >>> RecordContext share some methods with the same name.
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>>
> >>> [1] https://issues.apache.org/jira/browse/KAFKA-3907
> >>>
> >>>
> >>> On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang <wangg...@gmail.com>
> >> wrote:
> >>>
> >>>> Regarding #6 above, I'm still not clear why we would need `commit()`
> in
> >>>> both ProcessorContext and RecordContext, could you elaborate a bit
> more?
> >>>>
> >>>> To me `commit()` is really a processor context not a record context
> >>>> logically: when you call that function, it means we would commit the
>

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-23 Thread Matthias J. Sax
Fair point. This is a long discussion and I totally forgot that we
discussed this.

Seems I changed my opinion about including KAFKA-3907...

Happy to hear what others think.


-Matthias

On 10/23/17 1:20 PM, Jeyhun Karimov wrote:
> Hi Matthias,
> 
> It is probably my bad, the discussion was a bit long in this thread. I
> proposed the related issue in the related KIP discuss thread [1] and got an
> approval [2,3].
> Maybe I misunderstood.
> 
> [1]
> http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> [2]
> http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> [3]
> http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
> 
> 
> On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <matth...@confluent.io>
> wrote:
> 
>> Interesting.
>>
>> I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is the
>> main motivation for this KIP :)
>>
>> I also think, that we should not expose the full ProcessorContext at DSL
>> level.
>>
>> Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
>> Manual commits are something DSL users should not worry about -- and if
>> one really needs this, an advanced user can still insert a dummy
>> `transform` to request a commit from there.
>>
>> -Matthias
>>
>>
>> On 10/18/17 5:39 AM, Jeyhun Karimov wrote:
>>> Hi,
>>>
>>> The main intuition is to solve [1], which is part of this KIP.
>>> I agree with you that this might not seem semantically correct as we are
>>> not committing record state.
>>> Alternatively, we can remove commit() from RecordContext and add
>>> ProcessorContext (which has commit() method) as an extra argument to Rich
>>> methods:
>>>
>>> instead of
>>> public interface RichValueMapper<V, VR, K> {
>>> VR apply(final V value,
>>>  final K key,
>>>  final RecordContext recordContext);
>>> }
>>>
>>> we can adopt
>>>
>>> public interface RichValueMapper<V, VR, K> {
>>> VR apply(final V value,
>>>  final K key,
>>>  final RecordContext recordContext,
>>>  final ProcessorContext processorContext);
>>> }
>>>
>>>
>>> However, in this case, a user can get confused as ProcessorContext and
>>> RecordContext share some methods with the same name.
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>>
>>> [1] https://issues.apache.org/jira/browse/KAFKA-3907
>>>
>>>
>>> On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang <wangg...@gmail.com>
>> wrote:
>>>
>>>> Regarding #6 above, I'm still not clear why we would need `commit()` in
>>>> both ProcessorContext and RecordContext, could you elaborate a bit more?
>>>>
>>>> To me `commit()` is really a processor context not a record context
>>>> logically: when you call that function, it means we would commit the
>> state
>>>> of the whole task up to this processed record, not only that single
>> record
>>>> itself.
>>>>
>>>>
>>>> Guozhang
>>>>
>>>> On Mon, Oct 16, 2017 at 9:19 AM, Jeyhun Karimov <je.kari...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Thanks for the feedback.
>>>>>
>>>>>
>>>>> 0. RichInitializer definition seems missing.
>>>>>
>>>>>
>>>>>
>>>>> - Fixed.
>>>>>
>>>>>
>>>>>  I'd suggest moving the key parameter in the RichValueXX and
>> RichReducer
>>>>>> after the value parameters, as well as in the templates; e.g.
>>>>>> public interface RichValueJoiner<V1, V2, VR, K> {
>>>>>> VR apply(final V1 value1, final V2 value2, final K key, final
>>>>>> RecordContext
>>>>>> recordContext);
>>>>>> }
>>>>>
>>>>>
>>>>>
>>>>> - Fixed.
>>>>>
>>>>>
>>>>> 2. Some of the listed functions are not necessary since their pairing
>>>> APIs
>>>>>> are being deprecated in 1.0 already:
>>>>>

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-23 Thread Jeyhun Karimov
Hi Matthias,

It is probably my bad; the discussion in this thread was a bit long. I
proposed the related issue in the related KIP discussion thread [1] and got
approval [2,3].
Maybe I misunderstood.

[1]
http://search-hadoop.com/m/Kafka/uyzND19Asmg1GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[2]
http://search-hadoop.com/m/Kafka/uyzND1kpct22GKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams
[3]
http://search-hadoop.com/m/Kafka/uyzND1G6TGIGKKXT1?subj=Re+DISCUSS+KIP+159+Introducing+Rich+functions+to+Streams


On Mon, Oct 23, 2017 at 8:44 PM Matthias J. Sax <matth...@confluent.io>
wrote:

> Interesting.
>
> I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is the
> main motivation for this KIP :)
>
> I also think, that we should not expose the full ProcessorContext at DSL
> level.
>
> Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
> Manual commits are something DSL users should not worry about -- and if
> one really needs this, an advanced user can still insert a dummy
> `transform` to request a commit from there.
>
> -Matthias
>
>
> On 10/18/17 5:39 AM, Jeyhun Karimov wrote:
> > Hi,
> >
> > The main intuition is to solve [1], which is part of this KIP.
> > I agree with you that this might not seem semantically correct as we are
> > not committing record state.
> > Alternatively, we can remove commit() from RecordContext and add
> > ProcessorContext (which has commit() method) as an extra argument to Rich
> > methods:
> >
> > instead of
> > public interface RichValueMapper<V, VR, K> {
> > VR apply(final V value,
> >  final K key,
> >  final RecordContext recordContext);
> > }
> >
> > we can adopt
> >
> > public interface RichValueMapper<V, VR, K> {
> > VR apply(final V value,
> >  final K key,
> >  final RecordContext recordContext,
> >  final ProcessorContext processorContext);
> > }
> >
> >
> > However, in this case, a user can get confused as ProcessorContext and
> > RecordContext share some methods with the same name.
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> > [1] https://issues.apache.org/jira/browse/KAFKA-3907
> >
> >
> > On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang <wangg...@gmail.com>
> wrote:
> >
> >> Regarding #6 above, I'm still not clear why we would need `commit()` in
> >> both ProcessorContext and RecordContext, could you elaborate a bit more?
> >>
> >> To me `commit()` is really a processor context not a record context
> >> logically: when you call that function, it means we would commit the
> state
> >> of the whole task up to this processed record, not only that single
> record
> >> itself.
> >>
> >>
> >> Guozhang
> >>
> >> On Mon, Oct 16, 2017 at 9:19 AM, Jeyhun Karimov <je.kari...@gmail.com>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> Thanks for the feedback.
> >>>
> >>>
> >>> 0. RichInitializer definition seems missing.
> >>>
> >>>
> >>>
> >>> - Fixed.
> >>>
> >>>
> >>>  I'd suggest moving the key parameter in the RichValueXX and
> RichReducer
> >>>> after the value parameters, as well as in the templates; e.g.
> >>>> public interface RichValueJoiner<V1, V2, VR, K> {
> >>>> VR apply(final V1 value1, final V2 value2, final K key, final
> >>>> RecordContext
> >>>> recordContext);
> >>>> }
> >>>
> >>>
> >>>
> >>> - Fixed.
> >>>
> >>>
> >>> 2. Some of the listed functions are not necessary since their pairing
> >> APIs
> >>>> are being deprecated in 1.0 already:
> >>>>  KGroupedStream<KR, V> groupBy(final RichKeyValueMapper >> ?
> >>>> super V, KR> selector,
> >>>>final Serde keySerde,
> >>>>final Serde valSerde);
> >>>> <VT, VR> KStream<K, VR> leftJoin(final KTable<K, VT> table,
> >>>>  final RichValueJoiner >> super
> >>>> V,
> >>>> ? super VT, ? extends VR> joiner,
> >>>>  final Serde keySerde,
> >>>

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-23 Thread Matthias J. Sax
Interesting.

I thought that https://issues.apache.org/jira/browse/KAFKA-4125 is the
main motivation for this KIP :)

I also think that we should not expose the full ProcessorContext at the DSL
level.

Thus, overall I am not even sure if we should fix KAFKA-3907 at all.
Manual commits are something DSL users should not worry about -- and if
one really needs this, an advanced user can still insert a dummy
`transform` to request a commit from there.
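
For the record, a minimal sketch of such a dummy transform (assuming the
1.0-era Transformer API; the class name is made up):

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

// Pass-through transformer whose only purpose is to request commits from the DSL.
public class CommitRequestTransformer<K, V> implements Transformer<K, V, KeyValue<K, V>> {

    private ProcessorContext context;

    @Override
    public void init(final ProcessorContext context) {
        this.context = context;
    }

    @Override
    public KeyValue<K, V> transform(final K key, final V value) {
        context.commit();                 // request a commit; the record is forwarded unchanged
        return KeyValue.pair(key, value);
    }

    @SuppressWarnings("deprecation")
    @Override
    public KeyValue<K, V> punctuate(final long timestamp) {
        return null;                      // deprecated no-op
    }

    @Override
    public void close() {}
}

// usage: stream.transform(() -> new CommitRequestTransformer<>())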

-Matthias


On 10/18/17 5:39 AM, Jeyhun Karimov wrote:
> Hi,
> 
> The main intuition is to solve [1], which is part of this KIP.
> I agree with you that this might not seem semantically correct as we are
> not committing record state.
> Alternatively, we can remove commit() from RecordContext and add
> ProcessorContext (which has commit() method) as an extra argument to Rich
> methods:
> 
> instead of
> public interface RichValueMapper {
> VR apply(final V value,
>  final K key,
>  final RecordContext recordContext);
> }
> 
> we can adopt
> 
> public interface RichValueMapper {
> VR apply(final V value,
>  final K key,
>  final RecordContext recordContext,
>  final ProcessorContext processorContext);
> }
> 
> 
> However, in this case, a user can get confused as ProcessorContext and
> RecordContext share some methods with the same name.
> 
> 
> Cheers,
> Jeyhun
> 
> 
> [1] https://issues.apache.org/jira/browse/KAFKA-3907
> 
> 
> On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang  wrote:
> 
>> Regarding #6 above, I'm still not clear why we would need `commit()` in
>> both ProcessorContext and RecordContext, could you elaborate a bit more?
>>
>> To me `commit()` is really a processor context not a record context
>> logically: when you call that function, it means we would commit the state
>> of the whole task up to this processed record, not only that single record
>> itself.
>>
>>
>> Guozhang
>>
>> On Mon, Oct 16, 2017 at 9:19 AM, Jeyhun Karimov 
>> wrote:
>>
>>> Hi,
>>>
>>> Thanks for the feedback.
>>>
>>>
>>> 0. RichInitializer definition seems missing.
>>>
>>>
>>>
>>> - Fixed.
>>>
>>>
>>>  I'd suggest moving the key parameter in the RichValueXX and RichReducer
 after the value parameters, as well as in the templates; e.g.
 public interface RichValueJoiner {
 VR apply(final V1 value1, final V2 value2, final K key, final
 RecordContext
 recordContext);
 }
>>>
>>>
>>>
>>> - Fixed.
>>>
>>>
>>> 2. Some of the listed functions are not necessary since their pairing
>> APIs
 are being deprecated in 1.0 already:
  KGroupedStream groupBy(final RichKeyValueMapper> ?
 super V, KR> selector,
final Serde keySerde,
final Serde valSerde);
  KStream leftJoin(final KTable table,
  final RichValueJoiner> super
 V,
 ? super VT, ? extends VR> joiner,
  final Serde keySerde,
  final Serde valSerde);
>>>
>>>
>>> -Fixed
>>>
>>> 3. For a few functions where we are adding three APIs for a combo of both
 mapper / joiner, or both initializer / aggregator, or adder /
>> subtractor,
 I'm wondering if we can just keep one that use "rich" functions for
>> both;
 so that we can have less overloads and let users who only want to
>> access
 one of them to just use dummy parameter declarations. For example:

  KStream join(final GlobalKTable
>> globalKTable,
  final RichKeyValueMapper>>> super
  V, ? extends GK> keyValueMapper,
  final RichValueJoiner> super
 V,
 ? super GV, ? extends RV> joiner);
>>>
>>>
>>>
>>> -Agreed. Fixed.
>>>
>>>
>>> 4. For TimeWindowedKStream, I'm wondering why we do not make its
 Initializer also "rich" functions? I.e.
>>>
>>>
>>> - It was a typo. Fixed.
>>>
>>>
>>> 5. We need to move "RecordContext" from o.a.k.processor.internals to
 o.a.k.processor.

 6. I'm not clear why we want to move `commit()` from ProcessorContext
>> to
 RecordContext?

>>>
>>> -
>>> Because it makes sense logically and  to reduce code maintenance (both
>>> interfaces have offset() timestamp() topic() partition() methods),  I
>>> inherit ProcessorContext from RecordContext.
>>> Since we need commit() method both in ProcessorContext and in
>> RecordContext
>>> I move commit() method to parent class (RecordContext).
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>>
>>>
>>> On Wed, Oct 11, 2017 at 12:59 AM, Guozhang Wang 
>>> wrote:
>>>
 Jeyhun,

 Thanks for the updated KIP, here are my comments.

 0. RichInitializer definition seems missing.

 1. I'd suggest moving the key parameter in the RichValueXX and

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-18 Thread Jeyhun Karimov
Hi,

The main intuition is to solve [1], which is part of this KIP.
I agree with you that this might not seem semantically correct as we are
not committing record state.
Alternatively, we can remove commit() from RecordContext and add
ProcessorContext (which has commit() method) as an extra argument to Rich
methods:

instead of
public interface RichValueMapper<V, VR, K> {
VR apply(final V value,
 final K key,
 final RecordContext recordContext);
}

we can adopt

public interface RichValueMapper<V, VR, K> {
VR apply(final V value,
 final K key,
 final RecordContext recordContext,
 final ProcessorContext processorContext);
}


However, in this case, a user can get confused as ProcessorContext and
RecordContext share some methods with the same name.
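
Just to illustrate the confusion with a hypothetical usage (not KIP code):

RichValueMapper<String, Long, String> mapper =
    (value, key, recordContext, processorContext) ->
        recordContext.offset();   // processorContext.offset() compiles as well - which one is meant?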


Cheers,
Jeyhun


[1] https://issues.apache.org/jira/browse/KAFKA-3907


On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang  wrote:

> Regarding #6 above, I'm still not clear why we would need `commit()` in
> both ProcessorContext and RecordContext, could you elaborate a bit more?
>
> To me `commit()` is really a processor context not a record context
> logically: when you call that function, it means we would commit the state
> of the whole task up to this processed record, not only that single record
> itself.
>
>
> Guozhang
>
> On Mon, Oct 16, 2017 at 9:19 AM, Jeyhun Karimov 
> wrote:
>
> > Hi,
> >
> > Thanks for the feedback.
> >
> >
> > 0. RichInitializer definition seems missing.
> >
> >
> >
> > - Fixed.
> >
> >
> >  I'd suggest moving the key parameter in the RichValueXX and RichReducer
> > > after the value parameters, as well as in the templates; e.g.
> > > public interface RichValueJoiner {
> > > VR apply(final V1 value1, final V2 value2, final K key, final
> > > RecordContext
> > > recordContext);
> > > }
> >
> >
> >
> > - Fixed.
> >
> >
> > 2. Some of the listed functions are not necessary since their pairing
> APIs
> > > are being deprecated in 1.0 already:
> > >  KGroupedStream groupBy(final RichKeyValueMapper ?
> > > super V, KR> selector,
> > >final Serde keySerde,
> > >final Serde valSerde);
> > >  KStream leftJoin(final KTable table,
> > >  final RichValueJoiner super
> > > V,
> > > ? super VT, ? extends VR> joiner,
> > >  final Serde keySerde,
> > >  final Serde valSerde);
> >
> >
> > -Fixed
> >
> > 3. For a few functions where we are adding three APIs for a combo of both
> > > mapper / joiner, or both initializer / aggregator, or adder /
> subtractor,
> > > I'm wondering if we can just keep one that use "rich" functions for
> both;
> > > so that we can have less overloads and let users who only want to
> access
> > > one of them to just use dummy parameter declarations. For example:
> > >
> > >  KStream join(final GlobalKTable
> globalKTable,
> > >  final RichKeyValueMapper > > super
> > >  V, ? extends GK> keyValueMapper,
> > >  final RichValueJoiner super
> > > V,
> > > ? super GV, ? extends RV> joiner);
> >
> >
> >
> > -Agreed. Fixed.
> >
> >
> > 4. For TimeWindowedKStream, I'm wondering why we do not make its
> > > Initializer also "rich" functions? I.e.
> >
> >
> > - It was a typo. Fixed.
> >
> >
> > 5. We need to move "RecordContext" from o.a.k.processor.internals to
> > > o.a.k.processor.
> > >
> > > 6. I'm not clear why we want to move `commit()` from ProcessorContext
> to
> > > RecordContext?
> > >
> >
> > -
> > Because it makes sense logically and  to reduce code maintenance (both
> > interfaces have offset() timestamp() topic() partition() methods),  I
> > inherit ProcessorContext from RecordContext.
> > Since we need commit() method both in ProcessorContext and in
> RecordContext
> > I move commit() method to parent class (RecordContext).
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> >
> > On Wed, Oct 11, 2017 at 12:59 AM, Guozhang Wang 
> > wrote:
> >
> > > Jeyhun,
> > >
> > > Thanks for the updated KIP, here are my comments.
> > >
> > > 0. RichInitializer definition seems missing.
> > >
> > > 1. I'd suggest moving the key parameter in the RichValueXX and
> > RichReducer
> > > after the value parameters, as well as in the templates; e.g.
> > >
> > > public interface RichValueJoiner {
> > > VR apply(final V1 value1, final V2 value2, final K key, final
> > > RecordContext
> > > recordContext);
> > > }
> > >
> > > My motivation is that for lambda expression in J8, users that would not
> > > care about the key but only the context, or vice versa, is likely to
> > write
> > > it as (value1, value2, dummy, context) -> ... than putting the dummy at
> > the
> > > beginning of the parameter list. 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-16 Thread Guozhang Wang
Regarding #6 above, I'm still not clear why we would need `commit()` in
both ProcessorContext and RecordContext; could you elaborate a bit more?

To me, `commit()` is logically a processor-context concern, not a
record-context one: when you call that function, it means we would commit
the state of the whole task up to the record currently being processed, not
only that single record itself.


Guozhang

On Mon, Oct 16, 2017 at 9:19 AM, Jeyhun Karimov 
wrote:

> Hi,
>
> Thanks for the feedback.
>
>
> 0. RichInitializer definition seems missing.
>
>
>
> - Fixed.
>
>
>  I'd suggest moving the key parameter in the RichValueXX and RichReducer
> > after the value parameters, as well as in the templates; e.g.
> > public interface RichValueJoiner {
> > VR apply(final V1 value1, final V2 value2, final K key, final
> > RecordContext
> > recordContext);
> > }
>
>
>
> - Fixed.
>
>
> 2. Some of the listed functions are not necessary since their pairing APIs
> > are being deprecated in 1.0 already:
> >  KGroupedStream groupBy(final RichKeyValueMapper > super V, KR> selector,
> >final Serde keySerde,
> >final Serde valSerde);
> >  KStream leftJoin(final KTable table,
> >  final RichValueJoiner > V,
> > ? super VT, ? extends VR> joiner,
> >  final Serde keySerde,
> >  final Serde valSerde);
>
>
> -Fixed
>
> 3. For a few functions where we are adding three APIs for a combo of both
> > mapper / joiner, or both initializer / aggregator, or adder / subtractor,
> > I'm wondering if we can just keep one that use "rich" functions for both;
> > so that we can have less overloads and let users who only want to access
> > one of them to just use dummy parameter declarations. For example:
> >
> >  KStream join(final GlobalKTable globalKTable,
> >  final RichKeyValueMapper > super
> >  V, ? extends GK> keyValueMapper,
> >  final RichValueJoiner > V,
> > ? super GV, ? extends RV> joiner);
>
>
>
> -Agreed. Fixed.
>
>
> 4. For TimeWindowedKStream, I'm wondering why we do not make its
> > Initializer also "rich" functions? I.e.
>
>
> - It was a typo. Fixed.
>
>
> 5. We need to move "RecordContext" from o.a.k.processor.internals to
> > o.a.k.processor.
> >
> > 6. I'm not clear why we want to move `commit()` from ProcessorContext to
> > RecordContext?
> >
>
> -
> Because it makes sense logically and  to reduce code maintenance (both
> interfaces have offset() timestamp() topic() partition() methods),  I
> inherit ProcessorContext from RecordContext.
> Since we need commit() method both in ProcessorContext and in RecordContext
> I move commit() method to parent class (RecordContext).
>
>
> Cheers,
> Jeyhun
>
>
>
> On Wed, Oct 11, 2017 at 12:59 AM, Guozhang Wang 
> wrote:
>
> > Jeyhun,
> >
> > Thanks for the updated KIP, here are my comments.
> >
> > 0. RichInitializer definition seems missing.
> >
> > 1. I'd suggest moving the key parameter in the RichValueXX and
> RichReducer
> > after the value parameters, as well as in the templates; e.g.
> >
> > public interface RichValueJoiner {
> > VR apply(final V1 value1, final V2 value2, final K key, final
> > RecordContext
> > recordContext);
> > }
> >
> > My motivation is that for lambda expression in J8, users that would not
> > care about the key but only the context, or vice versa, is likely to
> write
> > it as (value1, value2, dummy, context) -> ... than putting the dummy at
> the
> > beginning of the parameter list. Generally speaking we'd like to make all
> > the "necessary" parameters prior to optional ones.
> >
> >
> > 2. Some of the listed functions are not necessary since their pairing
> APIs
> > are being deprecated in 1.0 already:
> >
> >  KGroupedStream groupBy(final RichKeyValueMapper > super V, KR> selector,
> >final Serde keySerde,
> >final Serde valSerde);
> >
> >  KStream leftJoin(final KTable table,
> >  final RichValueJoiner > V,
> > ? super VT, ? extends VR> joiner,
> >  final Serde keySerde,
> >  final Serde valSerde);
> >
> >
> >
> > 3. For a few functions where we are adding three APIs for a combo of both
> > mapper / joiner, or both initializer / aggregator, or adder / subtractor,
> > I'm wondering if we can just keep one that use "rich" functions for both;
> > so that we can have less overloads and let users who only want to access
> > one of them to just use dummy parameter declarations. For example:
> >
> >
> >  KStream join(final GlobalKTable globalKTable,
> >  final 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-16 Thread Jeyhun Karimov
Hi,

Thanks for the feedback.


0. RichInitializer definition seems missing.



- Fixed.


 I'd suggest moving the key parameter in the RichValueXX and RichReducer
> after the value parameters, as well as in the templates; e.g.
> public interface RichValueJoiner {
> VR apply(final V1 value1, final V2 value2, final K key, final
> RecordContext
> recordContext);
> }



- Fixed.


2. Some of the listed functions are not necessary since their pairing APIs
> are being deprecated in 1.0 already:
>  KGroupedStream groupBy(final RichKeyValueMapper super V, KR> selector,
>final Serde keySerde,
>final Serde valSerde);
>  KStream leftJoin(final KTable table,
>  final RichValueJoiner V,
> ? super VT, ? extends VR> joiner,
>  final Serde keySerde,
>  final Serde valSerde);


-Fixed

3. For a few functions where we are adding three APIs for a combo of both
> mapper / joiner, or both initializer / aggregator, or adder / subtractor,
> I'm wondering if we can just keep one that use "rich" functions for both;
> so that we can have less overloads and let users who only want to access
> one of them to just use dummy parameter declarations. For example:
>
>  KStream join(final GlobalKTable globalKTable,
>  final RichKeyValueMapper super
>  V, ? extends GK> keyValueMapper,
>  final RichValueJoiner V,
> ? super GV, ? extends RV> joiner);



-Agreed. Fixed.


4. For TimeWindowedKStream, I'm wondering why we do not make its
> Initializer also "rich" functions? I.e.


- It was a typo. Fixed.


5. We need to move "RecordContext" from o.a.k.processor.internals to
> o.a.k.processor.
>
> 6. I'm not clear why we want to move `commit()` from ProcessorContext to
> RecordContext?
>

-
Because it makes sense logically and reduces code maintenance (both
interfaces have offset(), timestamp(), topic(), and partition() methods), I
make ProcessorContext inherit from RecordContext.
Since we need a commit() method both in ProcessorContext and in
RecordContext, I move commit() to the parent class (RecordContext).
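
Roughly, the resulting hierarchy would look like this (a simplified sketch,
not the literal KIP code; ProcessorContext of course keeps all of its other
methods):

public interface RecordContext {
    void commit();        // moved here, shared by both interfaces
    long offset();
    long timestamp();
    String topic();
    int partition();
}

public interface ProcessorContext extends RecordContext {
    // processor-specific methods (state stores, forward(), schedule(), ...) stay here
}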


Cheers,
Jeyhun



On Wed, Oct 11, 2017 at 12:59 AM, Guozhang Wang  wrote:

> Jeyhun,
>
> Thanks for the updated KIP, here are my comments.
>
> 0. RichInitializer definition seems missing.
>
> 1. I'd suggest moving the key parameter in the RichValueXX and RichReducer
> after the value parameters, as well as in the templates; e.g.
>
> public interface RichValueJoiner {
> VR apply(final V1 value1, final V2 value2, final K key, final
> RecordContext
> recordContext);
> }
>
> My motivation is that for lambda expression in J8, users that would not
> care about the key but only the context, or vice versa, is likely to write
> it as (value1, value2, dummy, context) -> ... than putting the dummy at the
> beginning of the parameter list. Generally speaking we'd like to make all
> the "necessary" parameters prior to optional ones.
>
>
> 2. Some of the listed functions are not necessary since their pairing APIs
> are being deprecated in 1.0 already:
>
>  KGroupedStream groupBy(final RichKeyValueMapper super V, KR> selector,
>final Serde keySerde,
>final Serde valSerde);
>
>  KStream leftJoin(final KTable table,
>  final RichValueJoiner V,
> ? super VT, ? extends VR> joiner,
>  final Serde keySerde,
>  final Serde valSerde);
>
>
>
> 3. For a few functions where we are adding three APIs for a combo of both
> mapper / joiner, or both initializer / aggregator, or adder / subtractor,
> I'm wondering if we can just keep one that use "rich" functions for both;
> so that we can have less overloads and let users who only want to access
> one of them to just use dummy parameter declarations. For example:
>
>
>  KStream join(final GlobalKTable globalKTable,
>  final RichKeyValueMapper super
>  V, ? extends GK> keyValueMapper,
>  final RichValueJoiner V,
> ? super GV, ? extends RV> joiner);
>
>  KTable aggregate(final RichInitializer initializer,
>  final RichAggregator
> aggregator,
>  final Materialized byte[]>> materialized);
>
> Similarly for KGroupedTable, a bunch of aggregate() are deprecated so we do
> not need to add its rich functions any more.
>
>
> 4. For TimeWindowedKStream, I'm wondering why we do not make its
> Initializer also "rich" functions? I.e.
>
>  KTable aggregate(final RichInitializer
> initializer,
> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-10 Thread Guozhang Wang
Jeyhun,

Thanks for the updated KIP, here are my comments.

0. RichInitializer definition seems missing.

1. I'd suggest moving the key parameter in the RichValueXX and RichReducer
after the value parameters, as well as in the templates; e.g.

public interface RichValueJoiner<V1, V2, VR, K> {
VR apply(final V1 value1, final V2 value2, final K key, final RecordContext
recordContext);
}

My motivation is that for lambda expressions in Java 8, users who do not
care about the key but only the context, or vice versa, are more likely to
write it as (value1, value2, dummy, context) -> ... than to put the dummy at
the beginning of the parameter list. Generally speaking, we'd like to put all
the "necessary" parameters before the optional ones.
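
For example (purely illustrative, using the parameter ordering proposed
above for a RichValueJoiner<V1, V2, VR, K>):

RichValueJoiner<Long, Long, String, String> joiner =
    (value1, value2, dummyKey, context) ->
        value1 + value2 + "@" + context.topic() + "/" + context.offset();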


2. Some of the listed functions are not necessary since their pairing APIs
are being deprecated in 1.0 already:

 KGroupedStream groupBy(final RichKeyValueMapper selector,
   final Serde keySerde,
   final Serde valSerde);

 KStream leftJoin(final KTable table,
 final RichValueJoiner joiner,
 final Serde keySerde,
 final Serde valSerde);



3. For a few functions where we are adding three APIs for a combo of both
mapper / joiner, or both initializer / aggregator, or adder / subtractor,
I'm wondering if we can just keep the one that uses "rich" functions for both,
so that we have fewer overloads and let users who only want access to one of
them just use dummy parameter declarations. For example:


 KStream join(final GlobalKTable globalKTable,
 final RichKeyValueMapper keyValueMapper,
 final RichValueJoiner joiner);

 KTable aggregate(final RichInitializer initializer,
 final RichAggregator
aggregator,
 final Materialized> materialized);

Similarly for KGroupedTable, a bunch of aggregate() are deprecated so we do
not need to add its rich functions any more.


4. For TimeWindowedKStream, I'm wondering why we do not make its
Initializer also "rich" functions? I.e.

 KTable aggregate(final RichInitializer
initializer,
   final RichAggregator aggregator);
 KTable aggregate(final RichInitializer
initializer,
   final RichAggregator aggregator,
   final Materialized> materialized);


5. We need to move "RecordContext" from o.a.k.processor.internals to
o.a.k.processor.

6. I'm not clear on why we want to move `commit()` from ProcessorContext to
RecordContext. Conceptually I think it would be better for it to stay in
ProcessorContext. Do you find this not doable in the internal
implementations?
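
For reference, this is roughly how commit() is used today through the ProcessorContext
handed to init(), which is the conceptual reason to keep it there. The sketch targets
the 0.11/1.0-era Processor API; the class name and the commit-every-1000-records policy
are made up for illustration.

import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

public class CommitEveryNRecords implements Processor<String, String> {
    private ProcessorContext context;
    private long seen = 0;

    @Override
    public void init(final ProcessorContext context) {
        this.context = context;
    }

    @Override
    public void process(final String key, final String value) {
        seen++;
        if (seen % 1000 == 0) {
            context.commit();   // request a commit of the task's state and offsets
        }
    }

    @Override
    public void punctuate(final long timestamp) { }

    @Override
    public void close() { }
}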


Guozhang



On Fri, Sep 22, 2017 at 1:09 PM, Ted Yu  wrote:

>recordContext = new RecordContext() {   // recordContext
> initialization is added in this KIP
>
> This code snippet seems to be standard - would it make sense to pull it
> into a (sample) RecordContext implementation ?
>
> Cheers
>
> On Fri, Sep 22, 2017 at 12:14 PM, Jeyhun Karimov 
> wrote:
>
> > Hi Ted,
> >
> > Thanks for your comments. I added a couple of comments in KIP to clarify
> > some points.
> >
> >
> > bq. provides a hybrd solution
> > > Typo in hybrid.
> >
> >
> > - My bad. Thanks for the correction.
> >
> > It would be nice if you can name some Value operator as examples.
> >
> >
> > >
> > - I added the corresponding interface names to KIP.
> >
> >
> >  KTable aggregate(final Initializer initializer,
> > >  final Aggregator
> > > adder,
> > > The adder doesn't need to be RichAggregator ?
> >
> >
> >
> > - Exactly. However, there are 2 Aggregator-type arguments in the related
> > method. So, I had to overload all possible their Rich counterparts:
> >
> > // adder with non-rich, subtrctor is rich
> >  KTable aggregate(final Initializer initializer,
> >  final Aggregator
> > adder,
> >  final RichAggregator VR>
> > subtractor,
> >  final Materialized KeyValueStore > byte[]>> materialized);
> >
> > // adder withrich, subtrctor is non-rich
> >  KTable aggregate(final Initializer initializer,
> >  final RichAggregator VR>
> > adder,
> >  final Aggregator
> > subtractor,
> >  final Materialized KeyValueStore > byte[]>> materialized);
> >
> > // both adder and subtractor are rich
> >  KTable aggregate(final Initializer initializer,
> > 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-22 Thread Ted Yu
   recordContext = new RecordContext() {   // recordContext
initialization is added in this KIP

This code snippet seems to be standard - would it make sense to pull it
into a (sample) RecordContext implementation ?
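
For example, such a sample implementation might look roughly like the following. This is
only a sketch: the RecordContext interface is stubbed here (the KIP proposes making it
public), and the class name is made up.

import org.apache.kafka.streams.processor.ProcessorContext;

interface RecordContext {
    String topic();
    int partition();
    long offset();
    long timestamp();
}

class SampleRecordContext implements RecordContext {
    private final String topic;
    private final int partition;
    private final long offset;
    private final long timestamp;

    SampleRecordContext(final ProcessorContext context) {
        // Snapshot the metadata, since the ProcessorContext values change
        // as soon as the task moves on to the next record.
        this.topic = context.topic();
        this.partition = context.partition();
        this.offset = context.offset();
        this.timestamp = context.timestamp();
    }

    @Override public String topic() { return topic; }
    @Override public int partition() { return partition; }
    @Override public long offset() { return offset; }
    @Override public long timestamp() { return timestamp; }
}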

Cheers

On Fri, Sep 22, 2017 at 12:14 PM, Jeyhun Karimov 
wrote:

> Hi Ted,
>
> Thanks for your comments. I added a couple of comments in KIP to clarify
> some points.
>
>
> bq. provides a hybrd solution
> > Typo in hybrid.
>
>
> - My bad. Thanks for the correction.
>
> It would be nice if you can name some Value operator as examples.
>
>
> >
> - I added the corresponding interface names to KIP.
>
>
>  KTable aggregate(final Initializer initializer,
> >  final Aggregator
> > adder,
> > The adder doesn't need to be RichAggregator ?
>
>
>
> - Exactly. However, there are 2 Aggregator-type arguments in the related
> method. So, I had to overload all possible their Rich counterparts:
>
> // adder with non-rich, subtrctor is rich
>  KTable aggregate(final Initializer initializer,
>  final Aggregator
> adder,
>  final RichAggregator
> subtractor,
>  final Materialized byte[]>> materialized);
>
> // adder withrich, subtrctor is non-rich
>  KTable aggregate(final Initializer initializer,
>  final RichAggregator
> adder,
>  final Aggregator
> subtractor,
>  final Materialized byte[]>> materialized);
>
> // both adder and subtractor are rich
>  KTable aggregate(final Initializer initializer,
>  final RichAggregator
> adder,
>  final RichAggregator
> subtractor,
>  final Materialized byte[]>> materialized);
>
>
> Can you explain a bit about the above implementation ?
> >void commit () {
> >  throw new UnsupportedOperationException("commit() is not supported
> in
> > this context");
> > Is the exception going to be replaced with real code in the PR ?
>
>
>
> - I added some comments both inside and outside the code snippets in KIP.
> Specifically, for the code snippet above, we add *commit()* method to
> *RecordContext* interface.
> However, we want  *commit()* method to be used only for *RecordContext*
> instances (at least for now), so we add UnsupportedOperationException in
> all classes/interfaces that extend/implement *RecordContext.*
> In general, 1) we make RecordContext publicly available within
> ProcessorContext,  2) initialize its instance within all required
> Processors and 3) pass it as an argument to the related Rich interfaces
> inside Processors.
>
>
>
>
> Cheers,
> Jeyhun
>
> On Fri, Sep 22, 2017 at 6:44 PM Ted Yu  wrote:
>
> > bq. provides a hybrd solution
> >
> > Typo in hybrid.
> >
> > bq. accessing read-only keys within XXXValues operators
> >
> > It would be nice if you can name some Value operator as examples.
> >
> >  KTable aggregate(final Initializer initializer,
> >  final Aggregator
> > adder,
> >
> > The adder doesn't need to be RichAggregator ?
> >
> >   public RecordContext recordContext() {
> > return this.recordContext();
> >
> > Can you explain a bit about the above implementation ?
> >
> >void commit () {
> >  throw new UnsupportedOperationException("commit() is not supported
> in
> > this context");
> >
> > Is the exception going to be replaced with real code in the PR ?
> >
> > Cheers
> >
> >
> > On Fri, Sep 22, 2017 at 9:28 AM, Jeyhun Karimov 
> > wrote:
> >
> > > Dear community,
> > >
> > > I updated the related KIP [1]. Please feel free to comment.
> > >
> > > Cheers,
> > > Jeyhun
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 159%3A+Introducing+Rich+functions+to+Streams
> > >
> > >
> > >
> > >
> > > On Fri, Sep 22, 2017 at 12:20 AM Jeyhun Karimov 
> > > wrote:
> > >
> > > > Hi Damian,
> > > >
> > > > Thanks for the update. I working on it and will provide an update
> soon.
> > > >
> > > > Cheers,
> > > > Jeyhun
> > > >
> > > > On Thu, Sep 21, 2017 at 4:50 PM Damian Guy 
> > wrote:
> > > >
> > > >> Hi Jeyhun,
> > > >>
> > > >> All KIP-182 API PRs have now been merged. So you can consider it as
> > > >> stable.
> > > >> Thanks,
> > > >> Damian
> > > >>
> > > >> On Thu, 21 Sep 2017 at 15:23 Jeyhun Karimov 
> > > wrote:
> > > >>
> > > >> > Hi all,
> > > >> >
> > > >> > Thanks a lot for your comments. For the single interface (RichXXX
> > and
> > > >> > XXXWithKey) solution, I have already submitted a PR but probably
> it
> > is
> > > >> > outdated (when the KIP first proposed), I need to revisit that
> one.
> > > >> >
> > > >> > @Guozhang, from our 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-22 Thread Jeyhun Karimov
Hi Ted,

Thanks for your comments. I added a couple of comments in KIP to clarify
some points.


bq. provides a hybrd solution
> Typo in hybrid.


- My bad. Thanks for the correction.

It would be nice if you can name some Value operator as examples.


>
- I added the corresponding interface names to KIP.


 KTable aggregate(final Initializer initializer,
>  final Aggregator
> adder,
> The adder doesn't need to be RichAggregator ?



- Exactly. However, there are 2 Aggregator-type arguments in the related
method, so I had to add overloads for all possible combinations of their Rich counterparts:

// adder is non-rich, subtractor is rich
 <VR> KTable<K, VR> aggregate(final Initializer<VR> initializer,
                              final Aggregator<? super K, ? super V, VR> adder,
                              final RichAggregator<? super K, ? super V, VR> subtractor,
                              final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);

// adder is rich, subtractor is non-rich
 <VR> KTable<K, VR> aggregate(final Initializer<VR> initializer,
                              final RichAggregator<? super K, ? super V, VR> adder,
                              final Aggregator<? super K, ? super V, VR> subtractor,
                              final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);

// both adder and subtractor are rich
 <VR> KTable<K, VR> aggregate(final Initializer<VR> initializer,
                              final RichAggregator<? super K, ? super V, VR> adder,
                              final RichAggregator<? super K, ? super V, VR> subtractor,
                              final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
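
For example, a caller mixing the two could look roughly like this. Sketch only:
RichAggregator and RecordContext are stubbed from this proposal (not released API),
while Aggregator is the existing org.apache.kafka.streams.kstream.Aggregator.

import org.apache.kafka.streams.kstream.Aggregator;

interface RecordContext {
    String topic();
    long timestamp();
}

interface RichAggregator<K, V, VA> {
    VA apply(K key, V value, VA aggregate, RecordContext recordContext);
}

class MixedOverloadExample {
    // Rich adder: ctx exposes topic/timestamp if the aggregation needs them;
    // this trivial example only sums.
    RichAggregator<String, Long, Long> adder =
        (key, value, aggregate, ctx) -> aggregate + value;

    // Plain subtractor: no metadata needed, so the existing interface is enough.
    Aggregator<String, Long, Long> subtractor =
        (key, value, aggregate) -> aggregate - value;
}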


Can you explain a bit about the above implementation ?
>void commit () {
>  throw new UnsupportedOperationException("commit() is not supported in
> this context");
> Is the exception going to be replaced with real code in the PR ?



- I added some comments both inside and outside the code snippets in the KIP.
Specifically, for the code snippet above, we add the *commit()* method to the
*RecordContext* interface.
However, we want the *commit()* method to be usable only on *RecordContext*
instances (at least for now), so we throw an UnsupportedOperationException from
commit() in all classes/interfaces that extend/implement *RecordContext*.
In general, we 1) make RecordContext publicly available within
ProcessorContext, 2) initialize its instance within all required
Processors, and 3) pass it as an argument to the related Rich interfaces
inside Processors.
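
For illustration, a minimal sketch of that pattern (not the actual KIP code: the
metadata methods are assumed, and only commit() and the exception mirror the snippet
discussed above):

interface RecordContext {
    String topic();
    int partition();
    long offset();
    long timestamp();
    void commit();
}

// A context in which committing is not allowed, e.g. one handed to a read-only rich function.
class ReadOnlyRecordContext implements RecordContext {
    private final String topic;
    private final int partition;
    private final long offset;
    private final long timestamp;

    ReadOnlyRecordContext(final String topic, final int partition,
                          final long offset, final long timestamp) {
        this.topic = topic;
        this.partition = partition;
        this.offset = offset;
        this.timestamp = timestamp;
    }

    @Override public String topic() { return topic; }
    @Override public int partition() { return partition; }
    @Override public long offset() { return offset; }
    @Override public long timestamp() { return timestamp; }

    @Override
    public void commit() {
        throw new UnsupportedOperationException("commit() is not supported in this context");
    }
}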




Cheers,
Jeyhun

On Fri, Sep 22, 2017 at 6:44 PM Ted Yu  wrote:

> bq. provides a hybrd solution
>
> Typo in hybrid.
>
> bq. accessing read-only keys within XXXValues operators
>
> It would be nice if you can name some Value operator as examples.
>
>  KTable aggregate(final Initializer initializer,
>  final Aggregator
> adder,
>
> The adder doesn't need to be RichAggregator ?
>
>   public RecordContext recordContext() {
> return this.recordContext();
>
> Can you explain a bit about the above implementation ?
>
>void commit () {
>  throw new UnsupportedOperationException("commit() is not supported in
> this context");
>
> Is the exception going to be replaced with real code in the PR ?
>
> Cheers
>
>
> On Fri, Sep 22, 2017 at 9:28 AM, Jeyhun Karimov 
> wrote:
>
> > Dear community,
> >
> > I updated the related KIP [1]. Please feel free to comment.
> >
> > Cheers,
> > Jeyhun
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 159%3A+Introducing+Rich+functions+to+Streams
> >
> >
> >
> >
> > On Fri, Sep 22, 2017 at 12:20 AM Jeyhun Karimov 
> > wrote:
> >
> > > Hi Damian,
> > >
> > > Thanks for the update. I working on it and will provide an update soon.
> > >
> > > Cheers,
> > > Jeyhun
> > >
> > > On Thu, Sep 21, 2017 at 4:50 PM Damian Guy 
> wrote:
> > >
> > >> Hi Jeyhun,
> > >>
> > >> All KIP-182 API PRs have now been merged. So you can consider it as
> > >> stable.
> > >> Thanks,
> > >> Damian
> > >>
> > >> On Thu, 21 Sep 2017 at 15:23 Jeyhun Karimov 
> > wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > Thanks a lot for your comments. For the single interface (RichXXX
> and
> > >> > XXXWithKey) solution, I have already submitted a PR but probably it
> is
> > >> > outdated (when the KIP first proposed), I need to revisit that one.
> > >> >
> > >> > @Guozhang, from our (offline) discussion, I understood that we may
> not
> > >> make
> > >> > it merge this KIP into the upcoming release, as KIP-159 is not voted
> > yet
> > >> > (because we want both KIP-149 and KIP-159 to be as an "atomic"
> merge).
> > >> So
> > >> > I decided to wait until KIP-182 gets stable (there are some minor
> > >> updates
> > >> > AFAIK) and update the KIP accordingly. Please correct me if I am
> wrong
> > >> or I
> > >> > misunderstood.
> > >> >
> > >> > Cheers,
> > >> > Jeyhun
> > >> >
> > >> >
> > >> > On Thu, Sep 21, 2017 at 4:11 PM Damian Guy 
> > >> wrote:
> > >> >
> > >> > > +1

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-22 Thread Ted Yu
bq. provides a hybrd solution

Typo in hybrid.

bq. accessing read-only keys within XXXValues operators

It would be nice if you can name some Value operator as examples.

 KTable aggregate(final Initializer initializer,
 final Aggregator
adder,

The adder doesn't need to be RichAggregator ?

  public RecordContext recordContext() {
return this.recordContext();

Can you explain a bit about the above implementation ?

   void commit () {
 throw new UnsupportedOperationException("commit() is not supported in
this context");

Is the exception going to be replaced with real code in the PR ?

Cheers


On Fri, Sep 22, 2017 at 9:28 AM, Jeyhun Karimov 
wrote:

> Dear community,
>
> I updated the related KIP [1]. Please feel free to comment.
>
> Cheers,
> Jeyhun
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 159%3A+Introducing+Rich+functions+to+Streams
>
>
>
>
> On Fri, Sep 22, 2017 at 12:20 AM Jeyhun Karimov 
> wrote:
>
> > Hi Damian,
> >
> > Thanks for the update. I working on it and will provide an update soon.
> >
> > Cheers,
> > Jeyhun
> >
> > On Thu, Sep 21, 2017 at 4:50 PM Damian Guy  wrote:
> >
> >> Hi Jeyhun,
> >>
> >> All KIP-182 API PRs have now been merged. So you can consider it as
> >> stable.
> >> Thanks,
> >> Damian
> >>
> >> On Thu, 21 Sep 2017 at 15:23 Jeyhun Karimov 
> wrote:
> >>
> >> > Hi all,
> >> >
> >> > Thanks a lot for your comments. For the single interface (RichXXX and
> >> > XXXWithKey) solution, I have already submitted a PR but probably it is
> >> > outdated (when the KIP first proposed), I need to revisit that one.
> >> >
> >> > @Guozhang, from our (offline) discussion, I understood that we may not
> >> make
> >> > it merge this KIP into the upcoming release, as KIP-159 is not voted
> yet
> >> > (because we want both KIP-149 and KIP-159 to be as an "atomic" merge).
> >> So
> >> > I decided to wait until KIP-182 gets stable (there are some minor
> >> updates
> >> > AFAIK) and update the KIP accordingly. Please correct me if I am wrong
> >> or I
> >> > misunderstood.
> >> >
> >> > Cheers,
> >> > Jeyhun
> >> >
> >> >
> >> > On Thu, Sep 21, 2017 at 4:11 PM Damian Guy 
> >> wrote:
> >> >
> >> > > +1
> >> > >
> >> > > On Thu, 21 Sep 2017 at 13:46 Guozhang Wang 
> >> wrote:
> >> > >
> >> > > > +1 for me as well for collapsing.
> >> > > >
> >> > > > Jeyhun, could you update the wiki accordingly to show what's the
> >> final
> >> > > > updates post KIP-182 that needs to be done in KIP-159 including
> >> > KIP-149?
> >> > > > The child page I made is just a suggestion, but you would still
> >> need to
> >> > > > update your proposal for people to comment and vote on.
> >> > > >
> >> > > >
> >> > > > Guozhang
> >> > > >
> >> > > >
> >> > > > On Thu, Sep 14, 2017 at 10:37 PM, Ted Yu 
> >> wrote:
> >> > > >
> >> > > > > +1
> >> > > > >
> >> > > > > One interface is cleaner.
> >> > > > >
> >> > > > > On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck  >
> >> > > wrote:
> >> > > > >
> >> > > > > > +1 for me on collapsing the Rich and ValueWithKey
> >> > interfaces
> >> > > > > into 1
> >> > > > > > interface.
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > > Bill
> >> > > > > >
> >> > > > > > On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov <
> >> > > je.kari...@gmail.com
> >> > > > >
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Hi Damian,
> >> > > > > > >
> >> > > > > > > Thanks for your feedback. Actually, this (what you propose)
> >> was
> >> > the
> >> > > > > first
> >> > > > > > > idea of KIP-149. Then we decided to divide it into two
> KIPs. I
> >> > also
> >> > > > > > > expressed my opinion that keeping the two interfaces (Rich
> and
> >> > > > withKey)
> >> > > > > > > separate would add more overloads. So, email discussion
> >> resulted
> >> > > that
> >> > > > > > this
> >> > > > > > > would not be a problem.
> >> > > > > > >
> >> > > > > > > Our initial idea was similar to :
> >> > > > > > >
> >> > > > > > > public abstract class RichValueMapper  implements
> >> > > > > > > ValueMapperWithKey, RichFunction {
> >> > > > > > > ..
> >> > > > > > > }
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > So, we check the type of object, whether it is RichXXX or
> >> > > XXXWithKey
> >> > > > > > inside
> >> > > > > > > the called method and continue accordingly.
> >> > > > > > >
> >> > > > > > > If this is ok with the community, I would like to revert the
> >> > > current
> >> > > > > > design
> >> > > > > > > to this again.
> >> > > > > > >
> >> > > > > > > Cheers,
> >> > > > > > > Jeyhun
> >> > > > > > >
> >> > > > > > > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy <
> >> damian@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hi Jeyhun,
> >> > > > > > > >
> >> > > > > > > > Thanks for sending 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-22 Thread Jeyhun Karimov
Dear community,

I updated the related KIP [1]. Please feel free to comment.

Cheers,
Jeyhun

[1]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-159%3A+Introducing+Rich+functions+to+Streams




On Fri, Sep 22, 2017 at 12:20 AM Jeyhun Karimov 
wrote:

> Hi Damian,
>
> Thanks for the update. I working on it and will provide an update soon.
>
> Cheers,
> Jeyhun
>
> On Thu, Sep 21, 2017 at 4:50 PM Damian Guy  wrote:
>
>> Hi Jeyhun,
>>
>> All KIP-182 API PRs have now been merged. So you can consider it as
>> stable.
>> Thanks,
>> Damian
>>
>> On Thu, 21 Sep 2017 at 15:23 Jeyhun Karimov  wrote:
>>
>> > Hi all,
>> >
>> > Thanks a lot for your comments. For the single interface (RichXXX and
>> > XXXWithKey) solution, I have already submitted a PR but probably it is
>> > outdated (when the KIP first proposed), I need to revisit that one.
>> >
>> > @Guozhang, from our (offline) discussion, I understood that we may not
>> make
>> > it merge this KIP into the upcoming release, as KIP-159 is not voted yet
>> > (because we want both KIP-149 and KIP-159 to be as an "atomic" merge).
>> So
>> > I decided to wait until KIP-182 gets stable (there are some minor
>> updates
>> > AFAIK) and update the KIP accordingly. Please correct me if I am wrong
>> or I
>> > misunderstood.
>> >
>> > Cheers,
>> > Jeyhun
>> >
>> >
>> > On Thu, Sep 21, 2017 at 4:11 PM Damian Guy 
>> wrote:
>> >
>> > > +1
>> > >
>> > > On Thu, 21 Sep 2017 at 13:46 Guozhang Wang 
>> wrote:
>> > >
>> > > > +1 for me as well for collapsing.
>> > > >
>> > > > Jeyhun, could you update the wiki accordingly to show what's the
>> final
>> > > > updates post KIP-182 that needs to be done in KIP-159 including
>> > KIP-149?
>> > > > The child page I made is just a suggestion, but you would still
>> need to
>> > > > update your proposal for people to comment and vote on.
>> > > >
>> > > >
>> > > > Guozhang
>> > > >
>> > > >
>> > > > On Thu, Sep 14, 2017 at 10:37 PM, Ted Yu 
>> wrote:
>> > > >
>> > > > > +1
>> > > > >
>> > > > > One interface is cleaner.
>> > > > >
>> > > > > On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck 
>> > > wrote:
>> > > > >
>> > > > > > +1 for me on collapsing the Rich and ValueWithKey
>> > interfaces
>> > > > > into 1
>> > > > > > interface.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Bill
>> > > > > >
>> > > > > > On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov <
>> > > je.kari...@gmail.com
>> > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi Damian,
>> > > > > > >
>> > > > > > > Thanks for your feedback. Actually, this (what you propose)
>> was
>> > the
>> > > > > first
>> > > > > > > idea of KIP-149. Then we decided to divide it into two KIPs. I
>> > also
>> > > > > > > expressed my opinion that keeping the two interfaces (Rich and
>> > > > withKey)
>> > > > > > > separate would add more overloads. So, email discussion
>> resulted
>> > > that
>> > > > > > this
>> > > > > > > would not be a problem.
>> > > > > > >
>> > > > > > > Our initial idea was similar to :
>> > > > > > >
>> > > > > > > public abstract class RichValueMapper  implements
>> > > > > > > ValueMapperWithKey, RichFunction {
>> > > > > > > ..
>> > > > > > > }
>> > > > > > >
>> > > > > > >
>> > > > > > > So, we check the type of object, whether it is RichXXX or
>> > > XXXWithKey
>> > > > > > inside
>> > > > > > > the called method and continue accordingly.
>> > > > > > >
>> > > > > > > If this is ok with the community, I would like to revert the
>> > > current
>> > > > > > design
>> > > > > > > to this again.
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > Jeyhun
>> > > > > > >
>> > > > > > > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy <
>> damian@gmail.com
>> > >
>> > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi Jeyhun,
>> > > > > > > >
>> > > > > > > > Thanks for sending out the update. I guess i was thinking
>> more
>> > > > along
>> > > > > > the
>> > > > > > > > lines of option 2 where we collapse the Rich and
>> > > > ValueWithKey
>> > > > > > etc
>> > > > > > > > interfaces into 1 interface that has all of the arguments. I
>> > > think
>> > > > we
>> > > > > > > then
>> > > > > > > > only need to add one additional overload for each operator?
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > > Damian
>> > > > > > > >
>> > > > > > > > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov <
>> > > je.kari...@gmail.com>
>> > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Dear all,
>> > > > > > > > >
>> > > > > > > > > I would like to resume the discussion on KIP-159. I (and
>> > > > Guozhang)
>> > > > > > > think
>> > > > > > > > > that releasing KIP-149 and KIP-159 in the same release
>> would
>> > > make
>> > > > > > sense
>> > > > > > > > to
>> > > > > > > > > avoid a release with "partial" public APIs. There is a KIP
>> > [1]
>> > > > > > proposed
>> > > > > 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-21 Thread Jeyhun Karimov
Hi Damian,

Thanks for the update. I am working on it and will provide an update soon.

Cheers,
Jeyhun

On Thu, Sep 21, 2017 at 4:50 PM Damian Guy  wrote:

> Hi Jeyhun,
>
> All KIP-182 API PRs have now been merged. So you can consider it as stable.
> Thanks,
> Damian
>
> On Thu, 21 Sep 2017 at 15:23 Jeyhun Karimov  wrote:
>
> > Hi all,
> >
> > Thanks a lot for your comments. For the single interface (RichXXX and
> > XXXWithKey) solution, I have already submitted a PR but probably it is
> > outdated (when the KIP first proposed), I need to revisit that one.
> >
> > @Guozhang, from our (offline) discussion, I understood that we may not
> make
> > it merge this KIP into the upcoming release, as KIP-159 is not voted yet
> > (because we want both KIP-149 and KIP-159 to be as an "atomic" merge).
> So
> > I decided to wait until KIP-182 gets stable (there are some minor updates
> > AFAIK) and update the KIP accordingly. Please correct me if I am wrong
> or I
> > misunderstood.
> >
> > Cheers,
> > Jeyhun
> >
> >
> > On Thu, Sep 21, 2017 at 4:11 PM Damian Guy  wrote:
> >
> > > +1
> > >
> > > On Thu, 21 Sep 2017 at 13:46 Guozhang Wang  wrote:
> > >
> > > > +1 for me as well for collapsing.
> > > >
> > > > Jeyhun, could you update the wiki accordingly to show what's the
> final
> > > > updates post KIP-182 that needs to be done in KIP-159 including
> > KIP-149?
> > > > The child page I made is just a suggestion, but you would still need
> to
> > > > update your proposal for people to comment and vote on.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Thu, Sep 14, 2017 at 10:37 PM, Ted Yu 
> wrote:
> > > >
> > > > > +1
> > > > >
> > > > > One interface is cleaner.
> > > > >
> > > > > On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck 
> > > wrote:
> > > > >
> > > > > > +1 for me on collapsing the Rich and ValueWithKey
> > interfaces
> > > > > into 1
> > > > > > interface.
> > > > > >
> > > > > > Thanks,
> > > > > > Bill
> > > > > >
> > > > > > On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov <
> > > je.kari...@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Damian,
> > > > > > >
> > > > > > > Thanks for your feedback. Actually, this (what you propose) was
> > the
> > > > > first
> > > > > > > idea of KIP-149. Then we decided to divide it into two KIPs. I
> > also
> > > > > > > expressed my opinion that keeping the two interfaces (Rich and
> > > > withKey)
> > > > > > > separate would add more overloads. So, email discussion
> resulted
> > > that
> > > > > > this
> > > > > > > would not be a problem.
> > > > > > >
> > > > > > > Our initial idea was similar to :
> > > > > > >
> > > > > > > public abstract class RichValueMapper  implements
> > > > > > > ValueMapperWithKey, RichFunction {
> > > > > > > ..
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > So, we check the type of object, whether it is RichXXX or
> > > XXXWithKey
> > > > > > inside
> > > > > > > the called method and continue accordingly.
> > > > > > >
> > > > > > > If this is ok with the community, I would like to revert the
> > > current
> > > > > > design
> > > > > > > to this again.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Jeyhun
> > > > > > >
> > > > > > > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy <
> damian@gmail.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi Jeyhun,
> > > > > > > >
> > > > > > > > Thanks for sending out the update. I guess i was thinking
> more
> > > > along
> > > > > > the
> > > > > > > > lines of option 2 where we collapse the Rich and
> > > > ValueWithKey
> > > > > > etc
> > > > > > > > interfaces into 1 interface that has all of the arguments. I
> > > think
> > > > we
> > > > > > > then
> > > > > > > > only need to add one additional overload for each operator?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Damian
> > > > > > > >
> > > > > > > > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov <
> > > je.kari...@gmail.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Dear all,
> > > > > > > > >
> > > > > > > > > I would like to resume the discussion on KIP-159. I (and
> > > > Guozhang)
> > > > > > > think
> > > > > > > > > that releasing KIP-149 and KIP-159 in the same release
> would
> > > make
> > > > > > sense
> > > > > > > > to
> > > > > > > > > avoid a release with "partial" public APIs. There is a KIP
> > [1]
> > > > > > proposed
> > > > > > > > by
> > > > > > > > > Guozhang (and approved by me) to unify both KIPs.
> > > > > > > > > Please feel free to comment on this.
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > > > > > action?pageId=73637757
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Jeyhun
> > > > > > > > >
> > > > > > > > > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov <
> > > > 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-21 Thread Damian Guy
Hi Jeyhun,

All KIP-182 API PRs have now been merged. So you can consider it as stable.
Thanks,
Damian

On Thu, 21 Sep 2017 at 15:23 Jeyhun Karimov  wrote:

> Hi all,
>
> Thanks a lot for your comments. For the single interface (RichXXX and
> XXXWithKey) solution, I have already submitted a PR but probably it is
> outdated (when the KIP first proposed), I need to revisit that one.
>
> @Guozhang, from our (offline) discussion, I understood that we may not make
> it merge this KIP into the upcoming release, as KIP-159 is not voted yet
> (because we want both KIP-149 and KIP-159 to be as an "atomic" merge).  So
> I decided to wait until KIP-182 gets stable (there are some minor updates
> AFAIK) and update the KIP accordingly. Please correct me if I am wrong or I
> misunderstood.
>
> Cheers,
> Jeyhun
>
>
> On Thu, Sep 21, 2017 at 4:11 PM Damian Guy  wrote:
>
> > +1
> >
> > On Thu, 21 Sep 2017 at 13:46 Guozhang Wang  wrote:
> >
> > > +1 for me as well for collapsing.
> > >
> > > Jeyhun, could you update the wiki accordingly to show what's the final
> > > updates post KIP-182 that needs to be done in KIP-159 including
> KIP-149?
> > > The child page I made is just a suggestion, but you would still need to
> > > update your proposal for people to comment and vote on.
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Thu, Sep 14, 2017 at 10:37 PM, Ted Yu  wrote:
> > >
> > > > +1
> > > >
> > > > One interface is cleaner.
> > > >
> > > > On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck 
> > wrote:
> > > >
> > > > > +1 for me on collapsing the Rich and ValueWithKey
> interfaces
> > > > into 1
> > > > > interface.
> > > > >
> > > > > Thanks,
> > > > > Bill
> > > > >
> > > > > On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov <
> > je.kari...@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Damian,
> > > > > >
> > > > > > Thanks for your feedback. Actually, this (what you propose) was
> the
> > > > first
> > > > > > idea of KIP-149. Then we decided to divide it into two KIPs. I
> also
> > > > > > expressed my opinion that keeping the two interfaces (Rich and
> > > withKey)
> > > > > > separate would add more overloads. So, email discussion resulted
> > that
> > > > > this
> > > > > > would not be a problem.
> > > > > >
> > > > > > Our initial idea was similar to :
> > > > > >
> > > > > > public abstract class RichValueMapper  implements
> > > > > > ValueMapperWithKey, RichFunction {
> > > > > > ..
> > > > > > }
> > > > > >
> > > > > >
> > > > > > So, we check the type of object, whether it is RichXXX or
> > XXXWithKey
> > > > > inside
> > > > > > the called method and continue accordingly.
> > > > > >
> > > > > > If this is ok with the community, I would like to revert the
> > current
> > > > > design
> > > > > > to this again.
> > > > > >
> > > > > > Cheers,
> > > > > > Jeyhun
> > > > > >
> > > > > > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy  >
> > > > wrote:
> > > > > >
> > > > > > > Hi Jeyhun,
> > > > > > >
> > > > > > > Thanks for sending out the update. I guess i was thinking more
> > > along
> > > > > the
> > > > > > > lines of option 2 where we collapse the Rich and
> > > ValueWithKey
> > > > > etc
> > > > > > > interfaces into 1 interface that has all of the arguments. I
> > think
> > > we
> > > > > > then
> > > > > > > only need to add one additional overload for each operator?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Damian
> > > > > > >
> > > > > > > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov <
> > je.kari...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Dear all,
> > > > > > > >
> > > > > > > > I would like to resume the discussion on KIP-159. I (and
> > > Guozhang)
> > > > > > think
> > > > > > > > that releasing KIP-149 and KIP-159 in the same release would
> > make
> > > > > sense
> > > > > > > to
> > > > > > > > avoid a release with "partial" public APIs. There is a KIP
> [1]
> > > > > proposed
> > > > > > > by
> > > > > > > > Guozhang (and approved by me) to unify both KIPs.
> > > > > > > > Please feel free to comment on this.
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > > > > action?pageId=73637757
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Jeyhun
> > > > > > > >
> > > > > > > > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov <
> > > > je.kari...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Matthias, Damian, all,
> > > > > > > > >
> > > > > > > > > Thanks for your comments and sorry for super-late update.
> > > > > > > > >
> > > > > > > > > Sure, the DSL refactoring is not blocking for this KIP.
> > > > > > > > > I made some changes to KIP document based on my prototype.
> > > > > > > > >
> > > > > > > > > Please feel free to comment.
> > > > > > > > >
> > > > > > > > > Cheers,
> > 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-21 Thread Jeyhun Karimov
Hi all,

Thanks a lot for your comments. For the single-interface (RichXXX and
XXXWithKey) solution, I have already submitted a PR, but it is probably
outdated (from when the KIP was first proposed), so I need to revisit it.

@Guozhang, from our (offline) discussion, I understood that we may not be able to
merge this KIP into the upcoming release, as KIP-159 has not been voted on yet
(because we want both KIP-149 and KIP-159 to go in as an "atomic" merge). So
I decided to wait until KIP-182 gets stable (there are some minor updates
AFAIK) and then update the KIP accordingly. Please correct me if I am wrong or
have misunderstood.

Cheers,
Jeyhun


On Thu, Sep 21, 2017 at 4:11 PM Damian Guy  wrote:

> +1
>
> On Thu, 21 Sep 2017 at 13:46 Guozhang Wang  wrote:
>
> > +1 for me as well for collapsing.
> >
> > Jeyhun, could you update the wiki accordingly to show what's the final
> > updates post KIP-182 that needs to be done in KIP-159 including KIP-149?
> > The child page I made is just a suggestion, but you would still need to
> > update your proposal for people to comment and vote on.
> >
> >
> > Guozhang
> >
> >
> > On Thu, Sep 14, 2017 at 10:37 PM, Ted Yu  wrote:
> >
> > > +1
> > >
> > > One interface is cleaner.
> > >
> > > On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck 
> wrote:
> > >
> > > > +1 for me on collapsing the Rich and ValueWithKey interfaces
> > > into 1
> > > > interface.
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > > > On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov <
> je.kari...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi Damian,
> > > > >
> > > > > Thanks for your feedback. Actually, this (what you propose) was the
> > > first
> > > > > idea of KIP-149. Then we decided to divide it into two KIPs. I also
> > > > > expressed my opinion that keeping the two interfaces (Rich and
> > withKey)
> > > > > separate would add more overloads. So, email discussion resulted
> that
> > > > this
> > > > > would not be a problem.
> > > > >
> > > > > Our initial idea was similar to :
> > > > >
> > > > > public abstract class RichValueMapper  implements
> > > > > ValueMapperWithKey, RichFunction {
> > > > > ..
> > > > > }
> > > > >
> > > > >
> > > > > So, we check the type of object, whether it is RichXXX or
> XXXWithKey
> > > > inside
> > > > > the called method and continue accordingly.
> > > > >
> > > > > If this is ok with the community, I would like to revert the
> current
> > > > design
> > > > > to this again.
> > > > >
> > > > > Cheers,
> > > > > Jeyhun
> > > > >
> > > > > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy 
> > > wrote:
> > > > >
> > > > > > Hi Jeyhun,
> > > > > >
> > > > > > Thanks for sending out the update. I guess i was thinking more
> > along
> > > > the
> > > > > > lines of option 2 where we collapse the Rich and
> > ValueWithKey
> > > > etc
> > > > > > interfaces into 1 interface that has all of the arguments. I
> think
> > we
> > > > > then
> > > > > > only need to add one additional overload for each operator?
> > > > > >
> > > > > > Thanks,
> > > > > > Damian
> > > > > >
> > > > > > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov <
> je.kari...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Dear all,
> > > > > > >
> > > > > > > I would like to resume the discussion on KIP-159. I (and
> > Guozhang)
> > > > > think
> > > > > > > that releasing KIP-149 and KIP-159 in the same release would
> make
> > > > sense
> > > > > > to
> > > > > > > avoid a release with "partial" public APIs. There is a KIP [1]
> > > > proposed
> > > > > > by
> > > > > > > Guozhang (and approved by me) to unify both KIPs.
> > > > > > > Please feel free to comment on this.
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > > > action?pageId=73637757
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Jeyhun
> > > > > > >
> > > > > > > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov <
> > > je.kari...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Matthias, Damian, all,
> > > > > > > >
> > > > > > > > Thanks for your comments and sorry for super-late update.
> > > > > > > >
> > > > > > > > Sure, the DSL refactoring is not blocking for this KIP.
> > > > > > > > I made some changes to KIP document based on my prototype.
> > > > > > > >
> > > > > > > > Please feel free to comment.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Jeyhun
> > > > > > > >
> > > > > > > > On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax <
> > > > > matth...@confluent.io>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> I would not block this KIP with regard to DSL refactoring.
> > IMHO,
> > > > we
> > > > > > can
> > > > > > > >> just finish this one and the DSL refactoring will help later
> > on
> > > to
> > > > > > > >> reduce the number of overloads.
> > > > > > > >>
> > > > > > > >> -Matthias
> > > > > > > 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-21 Thread Damian Guy
+1

On Thu, 21 Sep 2017 at 13:46 Guozhang Wang  wrote:

> +1 for me as well for collapsing.
>
> Jeyhun, could you update the wiki accordingly to show what's the final
> updates post KIP-182 that needs to be done in KIP-159 including KIP-149?
> The child page I made is just a suggestion, but you would still need to
> update your proposal for people to comment and vote on.
>
>
> Guozhang
>
>
> On Thu, Sep 14, 2017 at 10:37 PM, Ted Yu  wrote:
>
> > +1
> >
> > One interface is cleaner.
> >
> > On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck  wrote:
> >
> > > +1 for me on collapsing the Rich and ValueWithKey interfaces
> > into 1
> > > interface.
> > >
> > > Thanks,
> > > Bill
> > >
> > > On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov  >
> > > wrote:
> > >
> > > > Hi Damian,
> > > >
> > > > Thanks for your feedback. Actually, this (what you propose) was the
> > first
> > > > idea of KIP-149. Then we decided to divide it into two KIPs. I also
> > > > expressed my opinion that keeping the two interfaces (Rich and
> withKey)
> > > > separate would add more overloads. So, email discussion resulted that
> > > this
> > > > would not be a problem.
> > > >
> > > > Our initial idea was similar to :
> > > >
> > > > public abstract class RichValueMapper  implements
> > > > ValueMapperWithKey, RichFunction {
> > > > ..
> > > > }
> > > >
> > > >
> > > > So, we check the type of object, whether it is RichXXX or XXXWithKey
> > > inside
> > > > the called method and continue accordingly.
> > > >
> > > > If this is ok with the community, I would like to revert the current
> > > design
> > > > to this again.
> > > >
> > > > Cheers,
> > > > Jeyhun
> > > >
> > > > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy 
> > wrote:
> > > >
> > > > > Hi Jeyhun,
> > > > >
> > > > > Thanks for sending out the update. I guess i was thinking more
> along
> > > the
> > > > > lines of option 2 where we collapse the Rich and
> ValueWithKey
> > > etc
> > > > > interfaces into 1 interface that has all of the arguments. I think
> we
> > > > then
> > > > > only need to add one additional overload for each operator?
> > > > >
> > > > > Thanks,
> > > > > Damian
> > > > >
> > > > > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov 
> > > > wrote:
> > > > >
> > > > > > Dear all,
> > > > > >
> > > > > > I would like to resume the discussion on KIP-159. I (and
> Guozhang)
> > > > think
> > > > > > that releasing KIP-149 and KIP-159 in the same release would make
> > > sense
> > > > > to
> > > > > > avoid a release with "partial" public APIs. There is a KIP [1]
> > > proposed
> > > > > by
> > > > > > Guozhang (and approved by me) to unify both KIPs.
> > > > > > Please feel free to comment on this.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > > action?pageId=73637757
> > > > > >
> > > > > > Cheers,
> > > > > > Jeyhun
> > > > > >
> > > > > > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov <
> > je.kari...@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Matthias, Damian, all,
> > > > > > >
> > > > > > > Thanks for your comments and sorry for super-late update.
> > > > > > >
> > > > > > > Sure, the DSL refactoring is not blocking for this KIP.
> > > > > > > I made some changes to KIP document based on my prototype.
> > > > > > >
> > > > > > > Please feel free to comment.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Jeyhun
> > > > > > >
> > > > > > > On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax <
> > > > matth...@confluent.io>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> I would not block this KIP with regard to DSL refactoring.
> IMHO,
> > > we
> > > > > can
> > > > > > >> just finish this one and the DSL refactoring will help later
> on
> > to
> > > > > > >> reduce the number of overloads.
> > > > > > >>
> > > > > > >> -Matthias
> > > > > > >>
> > > > > > >> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> > > > > > >> > I am following the related thread in the mailing list and
> > > looking
> > > > > > >> forward
> > > > > > >> > for one-shot solution for overloads issue.
> > > > > > >> >
> > > > > > >> > Cheers,
> > > > > > >> > Jeyhun
> > > > > > >> >
> > > > > > >> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy <
> > > damian@gmail.com>
> > > > > > >> wrote:
> > > > > > >> >
> > > > > > >> >> Hi Jeyhun,
> > > > > > >> >>
> > > > > > >> >> About overrides, what other alternatives do we have? For
> > > > > > >> >>> backwards-compatibility we have to add extra methods to
> the
> > > > > existing
> > > > > > >> >> ones.
> > > > > > >> >>>
> > > > > > >> >>>
> > > > > > >> >> It wasn't clear to me in the KIP if these are new methods
> or
> > > > > > replacing
> > > > > > >> >> existing ones.
> > > > > > >> >> Also, we are currently discussing options for replacing the
> > > > > > overrides.
> > > > > > >> >>
> > > > > > >> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-21 Thread Guozhang Wang
+1 for me as well for collapsing.

Jeyhun, could you update the wiki accordingly to show what's the final
updates post KIP-182 that needs to be done in KIP-159 including KIP-149?
The child page I made is just a suggestion, but you would still need to
update your proposal for people to comment and vote on.


Guozhang


On Thu, Sep 14, 2017 at 10:37 PM, Ted Yu  wrote:

> +1
>
> One interface is cleaner.
>
> On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck  wrote:
>
> > +1 for me on collapsing the Rich and ValueWithKey interfaces
> into 1
> > interface.
> >
> > Thanks,
> > Bill
> >
> > On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov 
> > wrote:
> >
> > > Hi Damian,
> > >
> > > Thanks for your feedback. Actually, this (what you propose) was the
> first
> > > idea of KIP-149. Then we decided to divide it into two KIPs. I also
> > > expressed my opinion that keeping the two interfaces (Rich and withKey)
> > > separate would add more overloads. So, email discussion resulted that
> > this
> > > would not be a problem.
> > >
> > > Our initial idea was similar to :
> > >
> > > public abstract class RichValueMapper  implements
> > > ValueMapperWithKey, RichFunction {
> > > ..
> > > }
> > >
> > >
> > > So, we check the type of object, whether it is RichXXX or XXXWithKey
> > inside
> > > the called method and continue accordingly.
> > >
> > > If this is ok with the community, I would like to revert the current
> > design
> > > to this again.
> > >
> > > Cheers,
> > > Jeyhun
> > >
> > > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy 
> wrote:
> > >
> > > > Hi Jeyhun,
> > > >
> > > > Thanks for sending out the update. I guess i was thinking more along
> > the
> > > > lines of option 2 where we collapse the Rich and ValueWithKey
> > etc
> > > > interfaces into 1 interface that has all of the arguments. I think we
> > > then
> > > > only need to add one additional overload for each operator?
> > > >
> > > > Thanks,
> > > > Damian
> > > >
> > > > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov 
> > > wrote:
> > > >
> > > > > Dear all,
> > > > >
> > > > > I would like to resume the discussion on KIP-159. I (and Guozhang)
> > > think
> > > > > that releasing KIP-149 and KIP-159 in the same release would make
> > sense
> > > > to
> > > > > avoid a release with "partial" public APIs. There is a KIP [1]
> > proposed
> > > > by
> > > > > Guozhang (and approved by me) to unify both KIPs.
> > > > > Please feel free to comment on this.
> > > > >
> > > > > [1]
> > > > >
> > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > action?pageId=73637757
> > > > >
> > > > > Cheers,
> > > > > Jeyhun
> > > > >
> > > > > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov <
> je.kari...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Matthias, Damian, all,
> > > > > >
> > > > > > Thanks for your comments and sorry for super-late update.
> > > > > >
> > > > > > Sure, the DSL refactoring is not blocking for this KIP.
> > > > > > I made some changes to KIP document based on my prototype.
> > > > > >
> > > > > > Please feel free to comment.
> > > > > >
> > > > > > Cheers,
> > > > > > Jeyhun
> > > > > >
> > > > > > On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax <
> > > matth...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > >> I would not block this KIP with regard to DSL refactoring. IMHO,
> > we
> > > > can
> > > > > >> just finish this one and the DSL refactoring will help later on
> to
> > > > > >> reduce the number of overloads.
> > > > > >>
> > > > > >> -Matthias
> > > > > >>
> > > > > >> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> > > > > >> > I am following the related thread in the mailing list and
> > looking
> > > > > >> forward
> > > > > >> > for one-shot solution for overloads issue.
> > > > > >> >
> > > > > >> > Cheers,
> > > > > >> > Jeyhun
> > > > > >> >
> > > > > >> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy <
> > damian@gmail.com>
> > > > > >> wrote:
> > > > > >> >
> > > > > >> >> Hi Jeyhun,
> > > > > >> >>
> > > > > >> >> About overrides, what other alternatives do we have? For
> > > > > >> >>> backwards-compatibility we have to add extra methods to the
> > > > existing
> > > > > >> >> ones.
> > > > > >> >>>
> > > > > >> >>>
> > > > > >> >> It wasn't clear to me in the KIP if these are new methods or
> > > > > replacing
> > > > > >> >> existing ones.
> > > > > >> >> Also, we are currently discussing options for replacing the
> > > > > overrides.
> > > > > >> >>
> > > > > >> >> Thanks,
> > > > > >> >> Damian
> > > > > >> >>
> > > > > >> >>
> > > > > >> >>> About ProcessorContext vs RecordContext, you are right. I
> > think
> > > I
> > > > > >> need to
> > > > > >> >>> implement a prototype to understand the full picture as some
> > > parts
> > > > > of
> > > > > >> the
> > > > > >> >>> KIP might not be as straightforward as I thought.
> > > > > >> >>>
> > > > > >> >>>
> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-21 Thread Guozhang Wang
+1

On Thu, Sep 14, 2017 at 10:37 PM, Ted Yu  wrote:

> +1
>
> One interface is cleaner.
>
> On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck  wrote:
>
> > +1 for me on collapsing the Rich and ValueWithKey interfaces
> into 1
> > interface.
> >
> > Thanks,
> > Bill
> >
> > On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov 
> > wrote:
> >
> > > Hi Damian,
> > >
> > > Thanks for your feedback. Actually, this (what you propose) was the
> first
> > > idea of KIP-149. Then we decided to divide it into two KIPs. I also
> > > expressed my opinion that keeping the two interfaces (Rich and withKey)
> > > separate would add more overloads. So, email discussion resulted that
> > this
> > > would not be a problem.
> > >
> > > Our initial idea was similar to :
> > >
> > > public abstract class RichValueMapper  implements
> > > ValueMapperWithKey, RichFunction {
> > > ..
> > > }
> > >
> > >
> > > So, we check the type of object, whether it is RichXXX or XXXWithKey
> > inside
> > > the called method and continue accordingly.
> > >
> > > If this is ok with the community, I would like to revert the current
> > design
> > > to this again.
> > >
> > > Cheers,
> > > Jeyhun
> > >
> > > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy 
> wrote:
> > >
> > > > Hi Jeyhun,
> > > >
> > > > Thanks for sending out the update. I guess i was thinking more along
> > the
> > > > lines of option 2 where we collapse the Rich and ValueWithKey
> > etc
> > > > interfaces into 1 interface that has all of the arguments. I think we
> > > then
> > > > only need to add one additional overload for each operator?
> > > >
> > > > Thanks,
> > > > Damian
> > > >
> > > > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov 
> > > wrote:
> > > >
> > > > > Dear all,
> > > > >
> > > > > I would like to resume the discussion on KIP-159. I (and Guozhang)
> > > think
> > > > > that releasing KIP-149 and KIP-159 in the same release would make
> > sense
> > > > to
> > > > > avoid a release with "partial" public APIs. There is a KIP [1]
> > proposed
> > > > by
> > > > > Guozhang (and approved by me) to unify both KIPs.
> > > > > Please feel free to comment on this.
> > > > >
> > > > > [1]
> > > > >
> > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > action?pageId=73637757
> > > > >
> > > > > Cheers,
> > > > > Jeyhun
> > > > >
> > > > > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov <
> je.kari...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Matthias, Damian, all,
> > > > > >
> > > > > > Thanks for your comments and sorry for super-late update.
> > > > > >
> > > > > > Sure, the DSL refactoring is not blocking for this KIP.
> > > > > > I made some changes to KIP document based on my prototype.
> > > > > >
> > > > > > Please feel free to comment.
> > > > > >
> > > > > > Cheers,
> > > > > > Jeyhun
> > > > > >
> > > > > > On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax <
> > > matth...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > >> I would not block this KIP with regard to DSL refactoring. IMHO,
> > we
> > > > can
> > > > > >> just finish this one and the DSL refactoring will help later on
> to
> > > > > >> reduce the number of overloads.
> > > > > >>
> > > > > >> -Matthias
> > > > > >>
> > > > > >> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> > > > > >> > I am following the related thread in the mailing list and
> > looking
> > > > > >> forward
> > > > > >> > for one-shot solution for overloads issue.
> > > > > >> >
> > > > > >> > Cheers,
> > > > > >> > Jeyhun
> > > > > >> >
> > > > > >> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy <
> > damian@gmail.com>
> > > > > >> wrote:
> > > > > >> >
> > > > > >> >> Hi Jeyhun,
> > > > > >> >>
> > > > > >> >> About overrides, what other alternatives do we have? For
> > > > > >> >>> backwards-compatibility we have to add extra methods to the
> > > > existing
> > > > > >> >> ones.
> > > > > >> >>>
> > > > > >> >>>
> > > > > >> >> It wasn't clear to me in the KIP if these are new methods or
> > > > > replacing
> > > > > >> >> existing ones.
> > > > > >> >> Also, we are currently discussing options for replacing the
> > > > > overrides.
> > > > > >> >>
> > > > > >> >> Thanks,
> > > > > >> >> Damian
> > > > > >> >>
> > > > > >> >>
> > > > > >> >>> About ProcessorContext vs RecordContext, you are right. I
> > think
> > > I
> > > > > >> need to
> > > > > >> >>> implement a prototype to understand the full picture as some
> > > parts
> > > > > of
> > > > > >> the
> > > > > >> >>> KIP might not be as straightforward as I thought.
> > > > > >> >>>
> > > > > >> >>>
> > > > > >> >>> Cheers,
> > > > > >> >>> Jeyhun
> > > > > >> >>>
> > > > > >> >>> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy <
> > > damian@gmail.com>
> > > > > >> wrote:
> > > > > >> >>>
> > > > > >>  HI Jeyhun,
> > > > > >> 
> > > > > >>  Is the intention that these methods are new overloads on
> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-14 Thread Ted Yu
+1

One interface is cleaner.

On Thu, Sep 14, 2017 at 7:26 AM, Bill Bejeck  wrote:

> +1 for me on collapsing the Rich and ValueWithKey interfaces into 1
> interface.
>
> Thanks,
> Bill
>
> On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov 
> wrote:
>
> > Hi Damian,
> >
> > Thanks for your feedback. Actually, this (what you propose) was the first
> > idea of KIP-149. Then we decided to divide it into two KIPs. I also
> > expressed my opinion that keeping the two interfaces (Rich and withKey)
> > separate would add more overloads. So, email discussion resulted that
> this
> > would not be a problem.
> >
> > Our initial idea was similar to :
> >
> > public abstract class RichValueMapper  implements
> > ValueMapperWithKey, RichFunction {
> > ..
> > }
> >
> >
> > So, we check the type of object, whether it is RichXXX or XXXWithKey
> inside
> > the called method and continue accordingly.
> >
> > If this is ok with the community, I would like to revert the current
> design
> > to this again.
> >
> > Cheers,
> > Jeyhun
> >
> > On Wed, Sep 13, 2017 at 3:02 PM Damian Guy  wrote:
> >
> > > Hi Jeyhun,
> > >
> > > Thanks for sending out the update. I guess i was thinking more along
> the
> > > lines of option 2 where we collapse the Rich and ValueWithKey
> etc
> > > interfaces into 1 interface that has all of the arguments. I think we
> > then
> > > only need to add one additional overload for each operator?
> > >
> > > Thanks,
> > > Damian
> > >
> > > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov 
> > wrote:
> > >
> > > > Dear all,
> > > >
> > > > I would like to resume the discussion on KIP-159. I (and Guozhang)
> > think
> > > > that releasing KIP-149 and KIP-159 in the same release would make
> sense
> > > to
> > > > avoid a release with "partial" public APIs. There is a KIP [1]
> proposed
> > > by
> > > > Guozhang (and approved by me) to unify both KIPs.
> > > > Please feel free to comment on this.
> > > >
> > > > [1]
> > > >
> > > https://cwiki.apache.org/confluence/pages/viewpage.
> > action?pageId=73637757
> > > >
> > > > Cheers,
> > > > Jeyhun
> > > >
> > > > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov  >
> > > > wrote:
> > > >
> > > > > Hi Matthias, Damian, all,
> > > > >
> > > > > Thanks for your comments and sorry for super-late update.
> > > > >
> > > > > Sure, the DSL refactoring is not blocking for this KIP.
> > > > > I made some changes to KIP document based on my prototype.
> > > > >
> > > > > Please feel free to comment.
> > > > >
> > > > > Cheers,
> > > > > Jeyhun
> > > > >
> > > > > On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax <
> > matth...@confluent.io>
> > > > > wrote:
> > > > >
> > > > >> I would not block this KIP with regard to DSL refactoring. IMHO,
> we
> > > can
> > > > >> just finish this one and the DSL refactoring will help later on to
> > > > >> reduce the number of overloads.
> > > > >>
> > > > >> -Matthias
> > > > >>
> > > > >> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> > > > >> > I am following the related thread in the mailing list and
> looking
> > > > >> forward
> > > > >> > for one-shot solution for overloads issue.
> > > > >> >
> > > > >> > Cheers,
> > > > >> > Jeyhun
> > > > >> >
> > > > >> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy <
> damian@gmail.com>
> > > > >> wrote:
> > > > >> >
> > > > >> >> Hi Jeyhun,
> > > > >> >>
> > > > >> >> About overrides, what other alternatives do we have? For
> > > > >> >>> backwards-compatibility we have to add extra methods to the
> > > existing
> > > > >> >> ones.
> > > > >> >>>
> > > > >> >>>
> > > > >> >> It wasn't clear to me in the KIP if these are new methods or
> > > > replacing
> > > > >> >> existing ones.
> > > > >> >> Also, we are currently discussing options for replacing the
> > > > overrides.
> > > > >> >>
> > > > >> >> Thanks,
> > > > >> >> Damian
> > > > >> >>
> > > > >> >>
> > > > >> >>> About ProcessorContext vs RecordContext, you are right. I
> think
> > I
> > > > >> need to
> > > > >> >>> implement a prototype to understand the full picture as some
> > parts
> > > > of
> > > > >> the
> > > > >> >>> KIP might not be as straightforward as I thought.
> > > > >> >>>
> > > > >> >>>
> > > > >> >>> Cheers,
> > > > >> >>> Jeyhun
> > > > >> >>>
> > > > >> >>> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy <
> > damian@gmail.com>
> > > > >> wrote:
> > > > >> >>>
> > > > >>  HI Jeyhun,
> > > > >> 
> > > > >>  Is the intention that these methods are new overloads on the
> > > > KStream,
> > > > >>  KTable, etc?
> > > > >> 
> > > > >>  It is worth noting that a ProcessorContext is not a
> > > RecordContext.
> > > > A
> > > > >>  RecordContext, as it stands, only exists during the
> processing
> > > of a
> > > > >> >>> single
> > > > >>  record. Whereas the ProcessorContext exists for the lifetime
> of
> > > the
> > > > >>  Processor. Sot it 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-14 Thread Bill Bejeck
+1 for me on collapsing the Rich and ValueWithKey interfaces into 1
interface.

Thanks,
Bill

On Wed, Sep 13, 2017 at 11:31 AM, Jeyhun Karimov 
wrote:

> Hi Damian,
>
> Thanks for your feedback. Actually, this (what you propose) was the first
> idea of KIP-149. Then we decided to divide it into two KIPs. I also
> expressed my opinion that keeping the two interfaces (Rich and withKey)
> separate would add more overloads. So, email discussion resulted that this
> would not be a problem.
>
> Our initial idea was similar to :
>
> public abstract class RichValueMapper  implements
> ValueMapperWithKey, RichFunction {
> ..
> }
>
>
> So, we check the type of object, whether it is RichXXX or XXXWithKey inside
> the called method and continue accordingly.
>
> If this is ok with the community, I would like to revert the current design
> to this again.
>
> Cheers,
> Jeyhun
>
> On Wed, Sep 13, 2017 at 3:02 PM Damian Guy  wrote:
>
> > Hi Jeyhun,
> >
> > Thanks for sending out the update. I guess i was thinking more along the
> > lines of option 2 where we collapse the Rich and ValueWithKey etc
> > interfaces into 1 interface that has all of the arguments. I think we
> then
> > only need to add one additional overload for each operator?
> >
> > Thanks,
> > Damian
> >
> > On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov 
> wrote:
> >
> > > Dear all,
> > >
> > > I would like to resume the discussion on KIP-159. I (and Guozhang)
> think
> > > that releasing KIP-149 and KIP-159 in the same release would make sense
> > to
> > > avoid a release with "partial" public APIs. There is a KIP [1] proposed
> > by
> > > Guozhang (and approved by me) to unify both KIPs.
> > > Please feel free to comment on this.
> > >
> > > [1]
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=73637757
> > >
> > > Cheers,
> > > Jeyhun
> > >
> > > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov 
> > > wrote:
> > >
> > > > Hi Matthias, Damian, all,
> > > >
> > > > Thanks for your comments and sorry for super-late update.
> > > >
> > > > Sure, the DSL refactoring is not blocking for this KIP.
> > > > I made some changes to KIP document based on my prototype.
> > > >
> > > > Please feel free to comment.
> > > >
> > > > Cheers,
> > > > Jeyhun
> > > >
> > > > On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax <
> matth...@confluent.io>
> > > > wrote:
> > > >
> > > >> I would not block this KIP with regard to DSL refactoring. IMHO, we
> > can
> > > >> just finish this one and the DSL refactoring will help later on to
> > > >> reduce the number of overloads.
> > > >>
> > > >> -Matthias
> > > >>
> > > >> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> > > >> > I am following the related thread in the mailing list and looking
> > > >> forward
> > > >> > for one-shot solution for overloads issue.
> > > >> >
> > > >> > Cheers,
> > > >> > Jeyhun
> > > >> >
> > > >> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy 
> > > >> wrote:
> > > >> >
> > > >> >> Hi Jeyhun,
> > > >> >>
> > > >> >> About overrides, what other alternatives do we have? For
> > > >> >>> backwards-compatibility we have to add extra methods to the
> > existing
> > > >> >> ones.
> > > >> >>>
> > > >> >>>
> > > >> >> It wasn't clear to me in the KIP if these are new methods or
> > > replacing
> > > >> >> existing ones.
> > > >> >> Also, we are currently discussing options for replacing the
> > > overrides.
> > > >> >>
> > > >> >> Thanks,
> > > >> >> Damian
> > > >> >>
> > > >> >>
> > > >> >>> About ProcessorContext vs RecordContext, you are right. I think
> I
> > > >> need to
> > > >> >>> implement a prototype to understand the full picture as some
> parts
> > > of
> > > >> the
> > > >> >>> KIP might not be as straightforward as I thought.
> > > >> >>>
> > > >> >>>
> > > >> >>> Cheers,
> > > >> >>> Jeyhun
> > > >> >>>
> > > >> >>> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy <
> damian@gmail.com>
> > > >> wrote:
> > > >> >>>
> > > >>  HI Jeyhun,
> > > >> 
> > > >>  Is the intention that these methods are new overloads on the
> > > KStream,
> > > >>  KTable, etc?
> > > >> 
> > > >>  It is worth noting that a ProcessorContext is not a
> > RecordContext.
> > > A
> > > >>  RecordContext, as it stands, only exists during the processing
> > of a
> > > >> >>> single
> > > >>  record. Whereas the ProcessorContext exists for the lifetime of
> > the
> > > >>  Processor. Sot it doesn't make sense to cast a ProcessorContext
> > to
> > > a
> > > >>  RecordContext.
> > > >>  You mentioned above passing the InternalProcessorContext to the
> > > >> init()
> > > >>  calls. It is internal for a reason and i think it should remain
> > > that
> > > >> >> way.
> > > >>  It might be better to move the recordContext() method from
> > > >>  InternalProcessorContext to ProcessorContext.
> > > >> 
> > > >> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-13 Thread Jeyhun Karimov
Hi Damian,

Thanks for your feedback. Actually, this (what you propose) was the first
idea of KIP-149. Then we decided to divide it into two KIPs. I also
expressed my opinion that keeping the two interfaces (Rich and withKey)
separate would add more overloads. So, the email discussion concluded that
this would not be a problem.

Our initial idea was similar to:

public abstract class RichValueMapper<K, V, VR> implements
ValueMapperWithKey<K, V, VR>, RichFunction {
..
}


So, we check the type of the object (whether it is RichXXX or XXXWithKey) inside
the called method and continue accordingly.
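
To make that concrete, here is a minimal, self-contained sketch of the collapsed
design; the interface shapes, generics and the dispatch logic below are
assumptions for illustration only, not the final KIP-149/KIP-159 API:

// Simplified stand-ins for the proposed interfaces (assumed shapes).
interface RecordContext {
    String topic();
    int partition();
    long offset();
    long timestamp();
}

interface RichFunction {
    void init(RecordContext recordContext);
    void close();
}

interface ValueMapperWithKey<K, V, VR> {
    VR apply(final K key, final V value);
}

// The collapsed variant: one abstract class that is both key-aware and "rich",
// with no-op lifecycle methods so plain key-aware mappers stay unaffected.
abstract class RichValueMapper<K, V, VR>
        implements ValueMapperWithKey<K, V, VR>, RichFunction {

    @Override
    public void init(final RecordContext recordContext) { /* no-op by default */ }

    @Override
    public void close() { /* no-op by default */ }
}

final class MapperDispatch {
    // The processor-side type check described above: if the supplied mapper is
    // rich, hand it the per-record context before calling apply().
    static <K, V, VR> VR map(final ValueMapperWithKey<K, V, VR> mapper,
                             final RecordContext context,
                             final K key,
                             final V value) {
        if (mapper instanceof RichFunction) {
            ((RichFunction) mapper).init(context);
        }
        return mapper.apply(key, value);
    }
}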

If this is ok with the community, I would like to revert the current design
to this again.

Cheers,
Jeyhun

On Wed, Sep 13, 2017 at 3:02 PM Damian Guy  wrote:

> Hi Jeyhun,
>
> Thanks for sending out the update. I guess i was thinking more along the
> lines of option 2 where we collapse the Rich and ValueWithKey etc
> interfaces into 1 interface that has all of the arguments. I think we then
> only need to add one additional overload for each operator?
>
> Thanks,
> Damian
>
> On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov  wrote:
>
> > Dear all,
> >
> > I would like to resume the discussion on KIP-159. I (and Guozhang) think
> > that releasing KIP-149 and KIP-159 in the same release would make sense
> to
> > avoid a release with "partial" public APIs. There is a KIP [1] proposed
> by
> > Guozhang (and approved by me) to unify both KIPs.
> > Please feel free to comment on this.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73637757
> >
> > Cheers,
> > Jeyhun
> >
> > On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov 
> > wrote:
> >
> > > Hi Matthias, Damian, all,
> > >
> > > Thanks for your comments and sorry for super-late update.
> > >
> > > Sure, the DSL refactoring is not blocking for this KIP.
> > > I made some changes to KIP document based on my prototype.
> > >
> > > Please feel free to comment.
> > >
> > > Cheers,
> > > Jeyhun
> > >
> > > On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax 
> > > wrote:
> > >
> > >> I would not block this KIP with regard to DSL refactoring. IMHO, we
> can
> > >> just finish this one and the DSL refactoring will help later on to
> > >> reduce the number of overloads.
> > >>
> > >> -Matthias
> > >>
> > >> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> > >> > I am following the related thread in the mailing list and looking
> > >> forward
> > >> > for one-shot solution for overloads issue.
> > >> >
> > >> > Cheers,
> > >> > Jeyhun
> > >> >
> > >> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy 
> > >> wrote:
> > >> >
> > >> >> Hi Jeyhun,
> > >> >>
> > >> >> About overrides, what other alternatives do we have? For
> > >> >>> backwards-compatibility we have to add extra methods to the
> existing
> > >> >> ones.
> > >> >>>
> > >> >>>
> > >> >> It wasn't clear to me in the KIP if these are new methods or
> > replacing
> > >> >> existing ones.
> > >> >> Also, we are currently discussing options for replacing the
> > overrides.
> > >> >>
> > >> >> Thanks,
> > >> >> Damian
> > >> >>
> > >> >>
> > >> >>> About ProcessorContext vs RecordContext, you are right. I think I
> > >> need to
> > >> >>> implement a prototype to understand the full picture as some parts
> > of
> > >> the
> > >> >>> KIP might not be as straightforward as I thought.
> > >> >>>
> > >> >>>
> > >> >>> Cheers,
> > >> >>> Jeyhun
> > >> >>>
> > >> >>> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy 
> > >> wrote:
> > >> >>>
> > >>  HI Jeyhun,
> > >> 
> > >>  Is the intention that these methods are new overloads on the
> > KStream,
> > >>  KTable, etc?
> > >> 
> > >>  It is worth noting that a ProcessorContext is not a
> RecordContext.
> > A
> > >>  RecordContext, as it stands, only exists during the processing
> of a
> > >> >>> single
> > >>  record. Whereas the ProcessorContext exists for the lifetime of
> the
> > >>  Processor. Sot it doesn't make sense to cast a ProcessorContext
> to
> > a
> > >>  RecordContext.
> > >>  You mentioned above passing the InternalProcessorContext to the
> > >> init()
> > >>  calls. It is internal for a reason and i think it should remain
> > that
> > >> >> way.
> > >>  It might be better to move the recordContext() method from
> > >>  InternalProcessorContext to ProcessorContext.
> > >> 
> > >>  In the KIP you have an example showing:
> > >>  richMapper.init((RecordContext) processorContext);
> > >>  But the interface is:
> > >>  public interface RichValueMapper {
> > >>  VR apply(final V value, final RecordContext recordContext);
> > >>  }
> > >>  i.e., there is no init(...), besides as above this wouldn't make
> > >> sense.
> > >> 
> > >>  Thanks,
> > >>  Damian
> > >> 
> > >>  On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-13 Thread Damian Guy
Hi Jeyhun,

Thanks for sending out the update. I guess I was thinking more along the
lines of option 2, where we collapse the Rich and ValueWithKey etc.
interfaces into one interface that has all of the arguments. I think we then
only need to add one additional overload for each operator?
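
For illustration only, a sketch of what such a single collapsed interface could
look like (the shape and generics are assumed here, not an agreed API):

interface RecordContext {
    String topic();
    int partition();
    long offset();
    long timestamp();
}

// One interface that carries every argument, so each operator would only need
// a single extra overload accepting it, e.g. hypothetically on KStream:
// <VR> KStream<K, VR> mapValues(final RichValueMapper<? super K, ? super V, ? extends VR> mapper);
interface RichValueMapper<K, V, VR> {
    VR apply(final K key, final V value, final RecordContext recordContext);
}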

Thanks,
Damian

On Wed, 13 Sep 2017 at 10:59 Jeyhun Karimov  wrote:

> Dear all,
>
> I would like to resume the discussion on KIP-159. I (and Guozhang) think
> that releasing KIP-149 and KIP-159 in the same release would make sense to
> avoid a release with "partial" public APIs. There is a KIP [1] proposed by
> Guozhang (and approved by me) to unify both KIPs.
> Please feel free to comment on this.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73637757
>
> Cheers,
> Jeyhun
>
> On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov 
> wrote:
>
> > Hi Matthias, Damian, all,
> >
> > Thanks for your comments and sorry for super-late update.
> >
> > Sure, the DSL refactoring is not blocking for this KIP.
> > I made some changes to KIP document based on my prototype.
> >
> > Please feel free to comment.
> >
> > Cheers,
> > Jeyhun
> >
> > On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax 
> > wrote:
> >
> >> I would not block this KIP with regard to DSL refactoring. IMHO, we can
> >> just finish this one and the DSL refactoring will help later on to
> >> reduce the number of overloads.
> >>
> >> -Matthias
> >>
> >> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> >> > I am following the related thread in the mailing list and looking
> >> forward
> >> > for one-shot solution for overloads issue.
> >> >
> >> > Cheers,
> >> > Jeyhun
> >> >
> >> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy 
> >> wrote:
> >> >
> >> >> Hi Jeyhun,
> >> >>
> >> >> About overrides, what other alternatives do we have? For
> >> >>> backwards-compatibility we have to add extra methods to the existing
> >> >> ones.
> >> >>>
> >> >>>
> >> >> It wasn't clear to me in the KIP if these are new methods or
> replacing
> >> >> existing ones.
> >> >> Also, we are currently discussing options for replacing the
> overrides.
> >> >>
> >> >> Thanks,
> >> >> Damian
> >> >>
> >> >>
> >> >>> About ProcessorContext vs RecordContext, you are right. I think I
> >> need to
> >> >>> implement a prototype to understand the full picture as some parts
> of
> >> the
> >> >>> KIP might not be as straightforward as I thought.
> >> >>>
> >> >>>
> >> >>> Cheers,
> >> >>> Jeyhun
> >> >>>
> >> >>> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy 
> >> wrote:
> >> >>>
> >>  HI Jeyhun,
> >> 
> >>  Is the intention that these methods are new overloads on the
> KStream,
> >>  KTable, etc?
> >> 
> >>  It is worth noting that a ProcessorContext is not a RecordContext.
> A
> >>  RecordContext, as it stands, only exists during the processing of a
> >> >>> single
> >>  record. Whereas the ProcessorContext exists for the lifetime of the
> >>  Processor. Sot it doesn't make sense to cast a ProcessorContext to
> a
> >>  RecordContext.
> >>  You mentioned above passing the InternalProcessorContext to the
> >> init()
> >>  calls. It is internal for a reason and i think it should remain
> that
> >> >> way.
> >>  It might be better to move the recordContext() method from
> >>  InternalProcessorContext to ProcessorContext.
> >> 
> >>  In the KIP you have an example showing:
> >>  richMapper.init((RecordContext) processorContext);
> >>  But the interface is:
> >>  public interface RichValueMapper {
> >>  VR apply(final V value, final RecordContext recordContext);
> >>  }
> >>  i.e., there is no init(...), besides as above this wouldn't make
> >> sense.
> >> 
> >>  Thanks,
> >>  Damian
> >> 
> >>  On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov 
> >> >> wrote:
> >> 
> >> > Hi Matthias,
> >> >
> >> > Actually my intend was to provide to RichInitializer and later on
> we
> >>  could
> >> > provide the context of the record as you also mentioned.
> >> > I remove that not to confuse the users.
> >> > Regarding the RecordContext and ProcessorContext interfaces, I
> just
> >> > realized the InternalProcessorContext class. Can't we pass this
> as a
> >> > parameter to init() method of processors? Then we would be able to
> >> >> get
> >> > RecordContext easily with just a method call.
> >> >
> >> >
> >> > Cheers,
> >> > Jeyhun
> >> >
> >> > On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax <
> >> >>> matth...@confluent.io>
> >> > wrote:
> >> >
> >> >> One more thing:
> >> >>
> >> >> I don't think `RichInitializer` does make sense. As we don't have
> >> >> any
> >> >> input record, there is also no context. We could of course
> provide
> >> >>> the
> >> >> context 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-09-13 Thread Jeyhun Karimov
Dear all,

I would like to resume the discussion on KIP-159. I (and Guozhang) think
that releasing KIP-149 and KIP-159 in the same release would make sense to
avoid a release with "partial" public APIs. There is a KIP [1] proposed by
Guozhang (and approved by me) to unify both KIPs.
Please feel free to comment on this.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73637757

Cheers,
Jeyhun

On Fri, Jul 21, 2017 at 2:00 AM Jeyhun Karimov  wrote:

> Hi Matthias, Damian, all,
>
> Thanks for your comments and sorry for super-late update.
>
> Sure, the DSL refactoring is not blocking for this KIP.
> I made some changes to KIP document based on my prototype.
>
> Please feel free to comment.
>
> Cheers,
> Jeyhun
>
> On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax 
> wrote:
>
>> I would not block this KIP with regard to DSL refactoring. IMHO, we can
>> just finish this one and the DSL refactoring will help later on to
>> reduce the number of overloads.
>>
>> -Matthias
>>
>> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
>> > I am following the related thread in the mailing list and looking
>> forward
>> > for one-shot solution for overloads issue.
>> >
>> > Cheers,
>> > Jeyhun
>> >
>> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy 
>> wrote:
>> >
>> >> Hi Jeyhun,
>> >>
>> >> About overrides, what other alternatives do we have? For
>> >>> backwards-compatibility we have to add extra methods to the existing
>> >> ones.
>> >>>
>> >>>
>> >> It wasn't clear to me in the KIP if these are new methods or replacing
>> >> existing ones.
>> >> Also, we are currently discussing options for replacing the overrides.
>> >>
>> >> Thanks,
>> >> Damian
>> >>
>> >>
>> >>> About ProcessorContext vs RecordContext, you are right. I think I
>> need to
>> >>> implement a prototype to understand the full picture as some parts of
>> the
>> >>> KIP might not be as straightforward as I thought.
>> >>>
>> >>>
>> >>> Cheers,
>> >>> Jeyhun
>> >>>
>> >>> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy 
>> wrote:
>> >>>
>>  HI Jeyhun,
>> 
>>  Is the intention that these methods are new overloads on the KStream,
>>  KTable, etc?
>> 
>>  It is worth noting that a ProcessorContext is not a RecordContext. A
>>  RecordContext, as it stands, only exists during the processing of a
>> >>> single
>>  record. Whereas the ProcessorContext exists for the lifetime of the
>>  Processor. Sot it doesn't make sense to cast a ProcessorContext to a
>>  RecordContext.
>>  You mentioned above passing the InternalProcessorContext to the
>> init()
>>  calls. It is internal for a reason and i think it should remain that
>> >> way.
>>  It might be better to move the recordContext() method from
>>  InternalProcessorContext to ProcessorContext.
>> 
>>  In the KIP you have an example showing:
>>  richMapper.init((RecordContext) processorContext);
>>  But the interface is:
>>  public interface RichValueMapper {
>>  VR apply(final V value, final RecordContext recordContext);
>>  }
>>  i.e., there is no init(...), besides as above this wouldn't make
>> sense.
>> 
>>  Thanks,
>>  Damian
>> 
>>  On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov 
>> >> wrote:
>> 
>> > Hi Matthias,
>> >
>> > Actually my intend was to provide to RichInitializer and later on we
>>  could
>> > provide the context of the record as you also mentioned.
>> > I remove that not to confuse the users.
>> > Regarding the RecordContext and ProcessorContext interfaces, I just
>> > realized the InternalProcessorContext class. Can't we pass this as a
>> > parameter to init() method of processors? Then we would be able to
>> >> get
>> > RecordContext easily with just a method call.
>> >
>> >
>> > Cheers,
>> > Jeyhun
>> >
>> > On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax <
>> >>> matth...@confluent.io>
>> > wrote:
>> >
>> >> One more thing:
>> >>
>> >> I don't think `RichInitializer` does make sense. As we don't have
>> >> any
>> >> input record, there is also no context. We could of course provide
>> >>> the
>> >> context of the record that triggers the init call, but this seems
>> >> to
>> >>> be
>> >> semantically questionable. Also, the context for this first record
>> >>> will
>> >> be provided by the consecutive call to aggregate anyways.
>> >>
>> >>
>> >> -Matthias
>> >>
>> >> On 6/29/17 1:11 PM, Matthias J. Sax wrote:
>> >>> Thanks for updating the KIP.
>> >>>
>> >>> I have one concern with regard to backward compatibility. You
>> >>> suggest
>> > to
>> >>> use RecrodContext as base interface for ProcessorContext. This
>> >> will
>> >>> break compatibility.
>> >>>
>> >>> I think, we should just have two independent 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-07-20 Thread Jeyhun Karimov
Hi Matthias, Damian, all,

Thanks for your comments and sorry for the super-late update.

Sure, the DSL refactoring is not blocking for this KIP.
I made some changes to the KIP document based on my prototype.

Please feel free to comment.

Cheers,
Jeyhun

On Fri, Jul 7, 2017 at 9:35 PM Matthias J. Sax 
wrote:

> I would not block this KIP with regard to DSL refactoring. IMHO, we can
> just finish this one and the DSL refactoring will help later on to
> reduce the number of overloads.
>
> -Matthias
>
> On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> > I am following the related thread in the mailing list and looking forward
> > for one-shot solution for overloads issue.
> >
> > Cheers,
> > Jeyhun
> >
> > On Fri, Jul 7, 2017 at 10:32 AM Damian Guy  wrote:
> >
> >> Hi Jeyhun,
> >>
> >> About overrides, what other alternatives do we have? For
> >>> backwards-compatibility we have to add extra methods to the existing
> >> ones.
> >>>
> >>>
> >> It wasn't clear to me in the KIP if these are new methods or replacing
> >> existing ones.
> >> Also, we are currently discussing options for replacing the overrides.
> >>
> >> Thanks,
> >> Damian
> >>
> >>
> >>> About ProcessorContext vs RecordContext, you are right. I think I need
> to
> >>> implement a prototype to understand the full picture as some parts of
> the
> >>> KIP might not be as straightforward as I thought.
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy 
> wrote:
> >>>
>  HI Jeyhun,
> 
>  Is the intention that these methods are new overloads on the KStream,
>  KTable, etc?
> 
>  It is worth noting that a ProcessorContext is not a RecordContext. A
>  RecordContext, as it stands, only exists during the processing of a
> >>> single
>  record. Whereas the ProcessorContext exists for the lifetime of the
>  Processor. Sot it doesn't make sense to cast a ProcessorContext to a
>  RecordContext.
>  You mentioned above passing the InternalProcessorContext to the init()
>  calls. It is internal for a reason and i think it should remain that
> >> way.
>  It might be better to move the recordContext() method from
>  InternalProcessorContext to ProcessorContext.
> 
>  In the KIP you have an example showing:
>  richMapper.init((RecordContext) processorContext);
>  But the interface is:
>  public interface RichValueMapper {
>  VR apply(final V value, final RecordContext recordContext);
>  }
>  i.e., there is no init(...), besides as above this wouldn't make
> sense.
> 
>  Thanks,
>  Damian
> 
>  On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov 
> >> wrote:
> 
> > Hi Matthias,
> >
> > Actually my intend was to provide to RichInitializer and later on we
>  could
> > provide the context of the record as you also mentioned.
> > I remove that not to confuse the users.
> > Regarding the RecordContext and ProcessorContext interfaces, I just
> > realized the InternalProcessorContext class. Can't we pass this as a
> > parameter to init() method of processors? Then we would be able to
> >> get
> > RecordContext easily with just a method call.
> >
> >
> > Cheers,
> > Jeyhun
> >
> > On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax <
> >>> matth...@confluent.io>
> > wrote:
> >
> >> One more thing:
> >>
> >> I don't think `RichInitializer` does make sense. As we don't have
> >> any
> >> input record, there is also no context. We could of course provide
> >>> the
> >> context of the record that triggers the init call, but this seems
> >> to
> >>> be
> >> semantically questionable. Also, the context for this first record
> >>> will
> >> be provided by the consecutive call to aggregate anyways.
> >>
> >>
> >> -Matthias
> >>
> >> On 6/29/17 1:11 PM, Matthias J. Sax wrote:
> >>> Thanks for updating the KIP.
> >>>
> >>> I have one concern with regard to backward compatibility. You
> >>> suggest
> > to
> >>> use RecrodContext as base interface for ProcessorContext. This
> >> will
> >>> break compatibility.
> >>>
> >>> I think, we should just have two independent interfaces. Our own
> >>> ProcessorContextImpl class would implement both. This allows us
> >> to
>  cast
> >>> it to `RecordContext` and thus limit the visible scope.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>>
> >>>
> >>> On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
>  Hi all,
> 
>  I updated the KIP w.r.t. discussion and comments.
>  Basically I eliminated overloads for particular method if they
> >> are
> > more
>  than 3.
>  As we can see there are a lot of overloads (and more will come
> >>> with
> >> KIP-149
>  :) )
>  So, is it wise to
> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-07-07 Thread Matthias J. Sax
I would not block this KIP with regard to DSL refactoring. IMHO, we can
just finish this one and the DSL refactoring will help later on to
reduce the number of overloads.

-Matthias

On 7/7/17 5:28 AM, Jeyhun Karimov wrote:
> I am following the related thread in the mailing list and looking forward
> for one-shot solution for overloads issue.
> 
> Cheers,
> Jeyhun
> 
> On Fri, Jul 7, 2017 at 10:32 AM Damian Guy  wrote:
> 
>> Hi Jeyhun,
>>
>> About overrides, what other alternatives do we have? For
>>> backwards-compatibility we have to add extra methods to the existing
>> ones.
>>>
>>>
>> It wasn't clear to me in the KIP if these are new methods or replacing
>> existing ones.
>> Also, we are currently discussing options for replacing the overrides.
>>
>> Thanks,
>> Damian
>>
>>
>>> About ProcessorContext vs RecordContext, you are right. I think I need to
>>> implement a prototype to understand the full picture as some parts of the
>>> KIP might not be as straightforward as I thought.
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy  wrote:
>>>
 HI Jeyhun,

 Is the intention that these methods are new overloads on the KStream,
 KTable, etc?

 It is worth noting that a ProcessorContext is not a RecordContext. A
 RecordContext, as it stands, only exists during the processing of a
>>> single
 record. Whereas the ProcessorContext exists for the lifetime of the
 Processor. Sot it doesn't make sense to cast a ProcessorContext to a
 RecordContext.
 You mentioned above passing the InternalProcessorContext to the init()
 calls. It is internal for a reason and i think it should remain that
>> way.
 It might be better to move the recordContext() method from
 InternalProcessorContext to ProcessorContext.

 In the KIP you have an example showing:
 richMapper.init((RecordContext) processorContext);
 But the interface is:
 public interface RichValueMapper {
 VR apply(final V value, final RecordContext recordContext);
 }
 i.e., there is no init(...), besides as above this wouldn't make sense.

 Thanks,
 Damian

 On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov 
>> wrote:

> Hi Matthias,
>
> Actually my intend was to provide to RichInitializer and later on we
 could
> provide the context of the record as you also mentioned.
> I remove that not to confuse the users.
> Regarding the RecordContext and ProcessorContext interfaces, I just
> realized the InternalProcessorContext class. Can't we pass this as a
> parameter to init() method of processors? Then we would be able to
>> get
> RecordContext easily with just a method call.
>
>
> Cheers,
> Jeyhun
>
> On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax <
>>> matth...@confluent.io>
> wrote:
>
>> One more thing:
>>
>> I don't think `RichInitializer` does make sense. As we don't have
>> any
>> input record, there is also no context. We could of course provide
>>> the
>> context of the record that triggers the init call, but this seems
>> to
>>> be
>> semantically questionable. Also, the context for this first record
>>> will
>> be provided by the consecutive call to aggregate anyways.
>>
>>
>> -Matthias
>>
>> On 6/29/17 1:11 PM, Matthias J. Sax wrote:
>>> Thanks for updating the KIP.
>>>
>>> I have one concern with regard to backward compatibility. You
>>> suggest
> to
>>> use RecrodContext as base interface for ProcessorContext. This
>> will
>>> break compatibility.
>>>
>>> I think, we should just have two independent interfaces. Our own
>>> ProcessorContextImpl class would implement both. This allows us
>> to
 cast
>>> it to `RecordContext` and thus limit the visible scope.
>>>
>>>
>>> -Matthias
>>>
>>>
>>>
>>> On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
 Hi all,

 I updated the KIP w.r.t. discussion and comments.
 Basically I eliminated overloads for particular method if they
>> are
> more
 than 3.
 As we can see there are a lot of overloads (and more will come
>>> with
>> KIP-149
 :) )
 So, is it wise to
 wait the result of constructive DSL thread or
 extend KIP to address this issue as well or
 continue as it is?

 Cheers,
 Jeyhun

 On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang <
>>> wangg...@gmail.com>
>> wrote:

> LGTM. Thanks!
>
>
> Guozhang
>
> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov <
> je.kari...@gmail.com>
> wrote:
>
>> Thanks for the comment Matthias. After all the discussion
>>> (thanks
 to
>> all
>> participants), I think 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-07-07 Thread Jeyhun Karimov
I am following the related thread in the mailing list and looking forward
to a one-shot solution for the overloads issue.

Cheers,
Jeyhun

On Fri, Jul 7, 2017 at 10:32 AM Damian Guy  wrote:

> Hi Jeyhun,
>
> About overrides, what other alternatives do we have? For
> > backwards-compatibility we have to add extra methods to the existing
> ones.
> >
> >
> It wasn't clear to me in the KIP if these are new methods or replacing
> existing ones.
> Also, we are currently discussing options for replacing the overrides.
>
> Thanks,
> Damian
>
>
> > About ProcessorContext vs RecordContext, you are right. I think I need to
> > implement a prototype to understand the full picture as some parts of the
> > KIP might not be as straightforward as I thought.
> >
> >
> > Cheers,
> > Jeyhun
> >
> > On Wed, Jul 5, 2017 at 10:40 AM Damian Guy  wrote:
> >
> > > HI Jeyhun,
> > >
> > > Is the intention that these methods are new overloads on the KStream,
> > > KTable, etc?
> > >
> > > It is worth noting that a ProcessorContext is not a RecordContext. A
> > > RecordContext, as it stands, only exists during the processing of a
> > single
> > > record. Whereas the ProcessorContext exists for the lifetime of the
> > > Processor. Sot it doesn't make sense to cast a ProcessorContext to a
> > > RecordContext.
> > > You mentioned above passing the InternalProcessorContext to the init()
> > > calls. It is internal for a reason and i think it should remain that
> way.
> > > It might be better to move the recordContext() method from
> > > InternalProcessorContext to ProcessorContext.
> > >
> > > In the KIP you have an example showing:
> > > richMapper.init((RecordContext) processorContext);
> > > But the interface is:
> > > public interface RichValueMapper {
> > > VR apply(final V value, final RecordContext recordContext);
> > > }
> > > i.e., there is no init(...), besides as above this wouldn't make sense.
> > >
> > > Thanks,
> > > Damian
> > >
> > > On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov 
> wrote:
> > >
> > > > Hi Matthias,
> > > >
> > > > Actually my intend was to provide to RichInitializer and later on we
> > > could
> > > > provide the context of the record as you also mentioned.
> > > > I remove that not to confuse the users.
> > > > Regarding the RecordContext and ProcessorContext interfaces, I just
> > > > realized the InternalProcessorContext class. Can't we pass this as a
> > > > parameter to init() method of processors? Then we would be able to
> get
> > > > RecordContext easily with just a method call.
> > > >
> > > >
> > > > Cheers,
> > > > Jeyhun
> > > >
> > > > On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax <
> > matth...@confluent.io>
> > > > wrote:
> > > >
> > > > > One more thing:
> > > > >
> > > > > I don't think `RichInitializer` does make sense. As we don't have
> any
> > > > > input record, there is also no context. We could of course provide
> > the
> > > > > context of the record that triggers the init call, but this seems
> to
> > be
> > > > > semantically questionable. Also, the context for this first record
> > will
> > > > > be provided by the consecutive call to aggregate anyways.
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 6/29/17 1:11 PM, Matthias J. Sax wrote:
> > > > > > Thanks for updating the KIP.
> > > > > >
> > > > > > I have one concern with regard to backward compatibility. You
> > suggest
> > > > to
> > > > > > use RecrodContext as base interface for ProcessorContext. This
> will
> > > > > > break compatibility.
> > > > > >
> > > > > > I think, we should just have two independent interfaces. Our own
> > > > > > ProcessorContextImpl class would implement both. This allows us
> to
> > > cast
> > > > > > it to `RecordContext` and thus limit the visible scope.
> > > > > >
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
> > > > > >> Hi all,
> > > > > >>
> > > > > >> I updated the KIP w.r.t. discussion and comments.
> > > > > >> Basically I eliminated overloads for particular method if they
> are
> > > > more
> > > > > >> than 3.
> > > > > >> As we can see there are a lot of overloads (and more will come
> > with
> > > > > KIP-149
> > > > > >> :) )
> > > > > >> So, is it wise to
> > > > > >> wait the result of constructive DSL thread or
> > > > > >> extend KIP to address this issue as well or
> > > > > >> continue as it is?
> > > > > >>
> > > > > >> Cheers,
> > > > > >> Jeyhun
> > > > > >>
> > > > > >> On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang <
> > wangg...@gmail.com>
> > > > > wrote:
> > > > > >>
> > > > > >>> LGTM. Thanks!
> > > > > >>>
> > > > > >>>
> > > > > >>> Guozhang
> > > > > >>>
> > > > > >>> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov <
> > > > je.kari...@gmail.com>
> > > > > >>> wrote:
> > > > > >>>
> > > > >  Thanks for the comment Matthias. After all the discussion
> > (thanks
> > > to
> > > > > all
> > > > > 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-07-07 Thread Damian Guy
Hi Jeyhun,

About overrides, what other alternatives do we have? For
> backwards-compatibility we have to add extra methods to the existing ones.
>
>
It wasn't clear to me in the KIP if these are new methods or replacing
existing ones.
Also, we are currently discussing options for replacing the overrides.

Thanks,
Damian


> About ProcessorContext vs RecordContext, you are right. I think I need to
> implement a prototype to understand the full picture as some parts of the
> KIP might not be as straightforward as I thought.
>
>
> Cheers,
> Jeyhun
>
> On Wed, Jul 5, 2017 at 10:40 AM Damian Guy  wrote:
>
> > HI Jeyhun,
> >
> > Is the intention that these methods are new overloads on the KStream,
> > KTable, etc?
> >
> > It is worth noting that a ProcessorContext is not a RecordContext. A
> > RecordContext, as it stands, only exists during the processing of a
> single
> > record. Whereas the ProcessorContext exists for the lifetime of the
> > Processor. Sot it doesn't make sense to cast a ProcessorContext to a
> > RecordContext.
> > You mentioned above passing the InternalProcessorContext to the init()
> > calls. It is internal for a reason and i think it should remain that way.
> > It might be better to move the recordContext() method from
> > InternalProcessorContext to ProcessorContext.
> >
> > In the KIP you have an example showing:
> > richMapper.init((RecordContext) processorContext);
> > But the interface is:
> > public interface RichValueMapper {
> > VR apply(final V value, final RecordContext recordContext);
> > }
> > i.e., there is no init(...), besides as above this wouldn't make sense.
> >
> > Thanks,
> > Damian
> >
> > On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov  wrote:
> >
> > > Hi Matthias,
> > >
> > > Actually my intend was to provide to RichInitializer and later on we
> > could
> > > provide the context of the record as you also mentioned.
> > > I remove that not to confuse the users.
> > > Regarding the RecordContext and ProcessorContext interfaces, I just
> > > realized the InternalProcessorContext class. Can't we pass this as a
> > > parameter to init() method of processors? Then we would be able to get
> > > RecordContext easily with just a method call.
> > >
> > >
> > > Cheers,
> > > Jeyhun
> > >
> > > On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax <
> matth...@confluent.io>
> > > wrote:
> > >
> > > > One more thing:
> > > >
> > > > I don't think `RichInitializer` does make sense. As we don't have any
> > > > input record, there is also no context. We could of course provide
> the
> > > > context of the record that triggers the init call, but this seems to
> be
> > > > semantically questionable. Also, the context for this first record
> will
> > > > be provided by the consecutive call to aggregate anyways.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 6/29/17 1:11 PM, Matthias J. Sax wrote:
> > > > > Thanks for updating the KIP.
> > > > >
> > > > > I have one concern with regard to backward compatibility. You
> suggest
> > > to
> > > > > use RecrodContext as base interface for ProcessorContext. This will
> > > > > break compatibility.
> > > > >
> > > > > I think, we should just have two independent interfaces. Our own
> > > > > ProcessorContextImpl class would implement both. This allows us to
> > cast
> > > > > it to `RecordContext` and thus limit the visible scope.
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > >
> > > > >
> > > > > On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
> > > > >> Hi all,
> > > > >>
> > > > >> I updated the KIP w.r.t. discussion and comments.
> > > > >> Basically I eliminated overloads for particular method if they are
> > > more
> > > > >> than 3.
> > > > >> As we can see there are a lot of overloads (and more will come
> with
> > > > KIP-149
> > > > >> :) )
> > > > >> So, is it wise to
> > > > >> wait the result of constructive DSL thread or
> > > > >> extend KIP to address this issue as well or
> > > > >> continue as it is?
> > > > >>
> > > > >> Cheers,
> > > > >> Jeyhun
> > > > >>
> > > > >> On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang <
> wangg...@gmail.com>
> > > > wrote:
> > > > >>
> > > > >>> LGTM. Thanks!
> > > > >>>
> > > > >>>
> > > > >>> Guozhang
> > > > >>>
> > > > >>> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov <
> > > je.kari...@gmail.com>
> > > > >>> wrote:
> > > > >>>
> > > >  Thanks for the comment Matthias. After all the discussion
> (thanks
> > to
> > > > all
> > > >  participants), I think this (single method that passes in a
> > > > RecordContext
> > > >  object) is the best alternative.
> > > >  Just a side note: I think KAFKA-3907 [1] can also be integrated
> > into
> > > > the
> > > >  KIP by adding related method inside RecordContext interface.
> > > > 
> > > > 
> > > >  [1] https://issues.apache.org/jira/browse/KAFKA-3907
> > > > 
> > > > 
> > > >  Cheers,
> > > >  Jeyhun
> > > > 
> > > >  On Tue, Jun 13, 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-07-06 Thread Jeyhun Karimov
Hi Damian,

Thanks for the comments.
About overrides, what other alternatives do we have? For
backwards-compatibility we have to add extra methods to the existing ones.
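
As a rough sketch of what "extra methods" would mean in practice (the
signatures below are illustrative assumptions, not the actual KStream change;
RichValueMapper mirrors the shape quoted later in this thread):

interface RecordContext {
    String topic();
    int partition();
    long offset();
    long timestamp();
}

// Simplified stand-in for the existing Streams interface.
interface ValueMapper<V, VR> {
    VR apply(final V value);
}

// Hypothetical rich counterpart.
interface RichValueMapper<V, VR> {
    VR apply(final V value, final RecordContext recordContext);
}

interface KStreamSketch<K, V> {
    // existing overload, left untouched for backwards compatibility:
    <VR> KStreamSketch<K, VR> mapValues(final ValueMapper<? super V, ? extends VR> mapper);

    // additional overload added alongside it by the KIP:
    <VR> KStreamSketch<K, VR> mapValues(final RichValueMapper<? super V, ? extends VR> mapper);
}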

About ProcessorContext vs RecordContext, you are right. I think I need to
implement a prototype to understand the full picture as some parts of the
KIP might not be as straightforward as I thought.


Cheers,
Jeyhun

On Wed, Jul 5, 2017 at 10:40 AM Damian Guy  wrote:

> HI Jeyhun,
>
> Is the intention that these methods are new overloads on the KStream,
> KTable, etc?
>
> It is worth noting that a ProcessorContext is not a RecordContext. A
> RecordContext, as it stands, only exists during the processing of a single
> record. Whereas the ProcessorContext exists for the lifetime of the
> Processor. Sot it doesn't make sense to cast a ProcessorContext to a
> RecordContext.
> You mentioned above passing the InternalProcessorContext to the init()
> calls. It is internal for a reason and i think it should remain that way.
> It might be better to move the recordContext() method from
> InternalProcessorContext to ProcessorContext.
>
> In the KIP you have an example showing:
> richMapper.init((RecordContext) processorContext);
> But the interface is:
> public interface RichValueMapper {
> VR apply(final V value, final RecordContext recordContext);
> }
> i.e., there is no init(...), besides as above this wouldn't make sense.
>
> Thanks,
> Damian
>
> On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov  wrote:
>
> > Hi Matthias,
> >
> > Actually my intend was to provide to RichInitializer and later on we
> could
> > provide the context of the record as you also mentioned.
> > I remove that not to confuse the users.
> > Regarding the RecordContext and ProcessorContext interfaces, I just
> > realized the InternalProcessorContext class. Can't we pass this as a
> > parameter to init() method of processors? Then we would be able to get
> > RecordContext easily with just a method call.
> >
> >
> > Cheers,
> > Jeyhun
> >
> > On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax 
> > wrote:
> >
> > > One more thing:
> > >
> > > I don't think `RichInitializer` does make sense. As we don't have any
> > > input record, there is also no context. We could of course provide the
> > > context of the record that triggers the init call, but this seems to be
> > > semantically questionable. Also, the context for this first record will
> > > be provided by the consecutive call to aggregate anyways.
> > >
> > >
> > > -Matthias
> > >
> > > On 6/29/17 1:11 PM, Matthias J. Sax wrote:
> > > > Thanks for updating the KIP.
> > > >
> > > > I have one concern with regard to backward compatibility. You suggest
> > to
> > > > use RecrodContext as base interface for ProcessorContext. This will
> > > > break compatibility.
> > > >
> > > > I think, we should just have two independent interfaces. Our own
> > > > ProcessorContextImpl class would implement both. This allows us to
> cast
> > > > it to `RecordContext` and thus limit the visible scope.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > >
> > > >
> > > > On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
> > > >> Hi all,
> > > >>
> > > >> I updated the KIP w.r.t. discussion and comments.
> > > >> Basically I eliminated overloads for particular method if they are
> > more
> > > >> than 3.
> > > >> As we can see there are a lot of overloads (and more will come with
> > > KIP-149
> > > >> :) )
> > > >> So, is it wise to
> > > >> wait the result of constructive DSL thread or
> > > >> extend KIP to address this issue as well or
> > > >> continue as it is?
> > > >>
> > > >> Cheers,
> > > >> Jeyhun
> > > >>
> > > >> On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang 
> > > wrote:
> > > >>
> > > >>> LGTM. Thanks!
> > > >>>
> > > >>>
> > > >>> Guozhang
> > > >>>
> > > >>> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov <
> > je.kari...@gmail.com>
> > > >>> wrote:
> > > >>>
> > >  Thanks for the comment Matthias. After all the discussion (thanks
> to
> > > all
> > >  participants), I think this (single method that passes in a
> > > RecordContext
> > >  object) is the best alternative.
> > >  Just a side note: I think KAFKA-3907 [1] can also be integrated
> into
> > > the
> > >  KIP by adding related method inside RecordContext interface.
> > > 
> > > 
> > >  [1] https://issues.apache.org/jira/browse/KAFKA-3907
> > > 
> > > 
> > >  Cheers,
> > >  Jeyhun
> > > 
> > >  On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax <
> > > matth...@confluent.io>
> > >  wrote:
> > > 
> > > > Hi,
> > > >
> > > > I would like to push this discussion further. It seems we got
> nice
> > > > alternatives (thanks for the summary Jeyhun!).
> > > >
> > > > With respect to RichFunctions and allowing them to be stateful, I
> > > have
> > > > my doubt as expressed already. From my understanding, the idea
> was

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-07-05 Thread Damian Guy
Hi Jeyhun,

Is the intention that these methods are new overloads on the KStream,
KTable, etc?

It is worth noting that a ProcessorContext is not a RecordContext. A
RecordContext, as it stands, only exists during the processing of a single
record, whereas the ProcessorContext exists for the lifetime of the
Processor. So it doesn't make sense to cast a ProcessorContext to a
RecordContext.
You mentioned above passing the InternalProcessorContext to the init()
calls. It is internal for a reason and I think it should remain that way.
It might be better to move the recordContext() method from
InternalProcessorContext to ProcessorContext.
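
To illustrate the lifetime distinction being made here (the shapes below are
simplified assumptions, not the actual interfaces):

// Per-record view: only meaningful while one record is being processed.
interface RecordContext {
    String topic();
    int partition();
    long offset();
    long timestamp();
}

// Processor-lifetime view: lives as long as the Processor itself. It does not
// extend RecordContext, so a plain cast between the two is not valid.
interface ProcessorContext {
    String applicationId();

    // Hypothetical accessor if recordContext() were promoted from the internal
    // interface, as suggested above: returns the context of the record that is
    // currently being processed.
    RecordContext recordContext();
}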

In the KIP you have an example showing:
richMapper.init((RecordContext) processorContext);
But the interface is:
public interface RichValueMapper<V, VR> {
    VR apply(final V value, final RecordContext recordContext);
}
i.e., there is no init(...); besides, as above, this wouldn't make sense.

Thanks,
Damian

On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov  wrote:

> Hi Matthias,
>
> Actually my intend was to provide to RichInitializer and later on we could
> provide the context of the record as you also mentioned.
> I remove that not to confuse the users.
> Regarding the RecordContext and ProcessorContext interfaces, I just
> realized the InternalProcessorContext class. Can't we pass this as a
> parameter to init() method of processors? Then we would be able to get
> RecordContext easily with just a method call.
>
>
> Cheers,
> Jeyhun
>
> On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax 
> wrote:
>
> > One more thing:
> >
> > I don't think `RichInitializer` does make sense. As we don't have any
> > input record, there is also no context. We could of course provide the
> > context of the record that triggers the init call, but this seems to be
> > semantically questionable. Also, the context for this first record will
> > be provided by the consecutive call to aggregate anyways.
> >
> >
> > -Matthias
> >
> > On 6/29/17 1:11 PM, Matthias J. Sax wrote:
> > > Thanks for updating the KIP.
> > >
> > > I have one concern with regard to backward compatibility. You suggest
> to
> > > use RecrodContext as base interface for ProcessorContext. This will
> > > break compatibility.
> > >
> > > I think, we should just have two independent interfaces. Our own
> > > ProcessorContextImpl class would implement both. This allows us to cast
> > > it to `RecordContext` and thus limit the visible scope.
> > >
> > >
> > > -Matthias
> > >
> > >
> > >
> > > On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
> > >> Hi all,
> > >>
> > >> I updated the KIP w.r.t. discussion and comments.
> > >> Basically I eliminated overloads for particular method if they are
> more
> > >> than 3.
> > >> As we can see there are a lot of overloads (and more will come with
> > KIP-149
> > >> :) )
> > >> So, is it wise to
> > >> wait the result of constructive DSL thread or
> > >> extend KIP to address this issue as well or
> > >> continue as it is?
> > >>
> > >> Cheers,
> > >> Jeyhun
> > >>
> > >> On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang 
> > wrote:
> > >>
> > >>> LGTM. Thanks!
> > >>>
> > >>>
> > >>> Guozhang
> > >>>
> > >>> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov <
> je.kari...@gmail.com>
> > >>> wrote:
> > >>>
> >  Thanks for the comment Matthias. After all the discussion (thanks to
> > all
> >  participants), I think this (single method that passes in a
> > RecordContext
> >  object) is the best alternative.
> >  Just a side note: I think KAFKA-3907 [1] can also be integrated into
> > the
> >  KIP by adding related method inside RecordContext interface.
> > 
> > 
> >  [1] https://issues.apache.org/jira/browse/KAFKA-3907
> > 
> > 
> >  Cheers,
> >  Jeyhun
> > 
> >  On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax <
> > matth...@confluent.io>
> >  wrote:
> > 
> > > Hi,
> > >
> > > I would like to push this discussion further. It seems we got nice
> > > alternatives (thanks for the summary Jeyhun!).
> > >
> > > With respect to RichFunctions and allowing them to be stateful, I
> > have
> > > my doubt as expressed already. From my understanding, the idea was
> to
> > > give access to record metadata information only. If you want to do
> a
> > > stateful computation you should rather use #transform().
> > >
> > > Furthermore, as pointed out, we would need to switch to a
> > > supplier-pattern introducing many more overloads.
> > >
> > > For those reason, I advocate for a simple interface with a single
> > >>> method
> > > that passes in a RecordContext object.
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 6/6/17 5:15 PM, Guozhang Wang wrote:
> > >> Thanks for the comprehensive summary!
> > >>
> > >> Personally I'd prefer the option of passing RecordContext as an
> > > additional
> > >> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-07-04 Thread Jeyhun Karimov
Hi Matthias,

Actually my intent was to provide the RichInitializer, and later on we could
provide the context of the record, as you also mentioned.
I removed that so as not to confuse the users.
Regarding the RecordContext and ProcessorContext interfaces, I just
noticed the InternalProcessorContext class. Can't we pass this as a
parameter to the init() method of processors? Then we would be able to get
the RecordContext easily with just a method call.


Cheers,
Jeyhun

On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax 
wrote:

> One more thing:
>
> I don't think `RichInitializer` does make sense. As we don't have any
> input record, there is also no context. We could of course provide the
> context of the record that triggers the init call, but this seems to be
> semantically questionable. Also, the context for this first record will
> be provided by the consecutive call to aggregate anyways.
>
>
> -Matthias
>
> On 6/29/17 1:11 PM, Matthias J. Sax wrote:
> > Thanks for updating the KIP.
> >
> > I have one concern with regard to backward compatibility. You suggest to
> > use RecrodContext as base interface for ProcessorContext. This will
> > break compatibility.
> >
> > I think, we should just have two independent interfaces. Our own
> > ProcessorContextImpl class would implement both. This allows us to cast
> > it to `RecordContext` and thus limit the visible scope.
> >
> >
> > -Matthias
> >
> >
> >
> > On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
> >> Hi all,
> >>
> >> I updated the KIP w.r.t. discussion and comments.
> >> Basically I eliminated overloads for particular method if they are more
> >> than 3.
> >> As we can see there are a lot of overloads (and more will come with
> KIP-149
> >> :) )
> >> So, is it wise to
> >> wait the result of constructive DSL thread or
> >> extend KIP to address this issue as well or
> >> continue as it is?
> >>
> >> Cheers,
> >> Jeyhun
> >>
> >> On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang 
> wrote:
> >>
> >>> LGTM. Thanks!
> >>>
> >>>
> >>> Guozhang
> >>>
> >>> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov 
> >>> wrote:
> >>>
>  Thanks for the comment Matthias. After all the discussion (thanks to
> all
>  participants), I think this (single method that passes in a
> RecordContext
>  object) is the best alternative.
>  Just a side note: I think KAFKA-3907 [1] can also be integrated into
> the
>  KIP by adding related method inside RecordContext interface.
> 
> 
>  [1] https://issues.apache.org/jira/browse/KAFKA-3907
> 
> 
>  Cheers,
>  Jeyhun
> 
>  On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax <
> matth...@confluent.io>
>  wrote:
> 
> > Hi,
> >
> > I would like to push this discussion further. It seems we got nice
> > alternatives (thanks for the summary Jeyhun!).
> >
> > With respect to RichFunctions and allowing them to be stateful, I
> have
> > my doubt as expressed already. From my understanding, the idea was to
> > give access to record metadata information only. If you want to do a
> > stateful computation you should rather use #transform().
> >
> > Furthermore, as pointed out, we would need to switch to a
> > supplier-pattern introducing many more overloads.
> >
> > For those reason, I advocate for a simple interface with a single
> >>> method
> > that passes in a RecordContext object.
> >
> >
> > -Matthias
> >
> >
> > On 6/6/17 5:15 PM, Guozhang Wang wrote:
> >> Thanks for the comprehensive summary!
> >>
> >> Personally I'd prefer the option of passing RecordContext as an
> > additional
> >> parameter into he overloaded function. But I'm also open to other
> > arguments
> >> if there are sth. that I have overlooked.
> >>
> >> Guozhang
> >>
> >>
> >> On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov <
> je.kari...@gmail.com
> 
> > wrote:
> >>
> >>> Hi,
> >>>
> >>> Thanks for your comments Matthias and Guozhang.
> >>>
> >>> Below I mention the quick summary of the main alternatives we
> looked
>  at
> > to
> >>> introduce the Rich functions (I will refer to it as Rich functions
> > until we
> >>> find better/another name). Initially the proposed alternatives was
> >>> not
> >>> backwards-compatible, so I will not mention them.
> >>> The related discussions are spread in KIP-149 and in this KIP
>  (KIP-159)
> >>> discussion threads.
> >>>
> >>>
> >>>
> >>> 1. The idea of rich functions came into the stage with KIP-149, in
> >>> discussion thread. As a result we extended KIP-149 to support Rich
> >>> functions as well.
> >>>
> >>> 2.  To as part of the Rich functions, we provided init
> > (ProcessorContext)
> >>> method. Afterwards, Dammian suggested that we should not provide
> >>> ProcessorContext to users. As a 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-29 Thread Matthias J. Sax
One more thing:

I don't think `RichInitializer` makes sense. As we don't have any
input record, there is also no context. We could of course provide the
context of the record that triggers the init call, but this seems to be
semantically questionable. Also, the context for this first record will
be provided by the subsequent call to aggregate anyway.
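
For reference, the existing Initializer is called to create the initial
aggregate value before any input record is at hand (shown here reduced to its
single method), which is why there is no natural record context to pass:

interface Initializer<VA> {
    VA apply();   // no key, no value, no record metadata available at this point
}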


-Matthias

On 6/29/17 1:11 PM, Matthias J. Sax wrote:
> Thanks for updating the KIP.
> 
> I have one concern with regard to backward compatibility. You suggest to
> use RecrodContext as base interface for ProcessorContext. This will
> break compatibility.
> 
> I think, we should just have two independent interfaces. Our own
> ProcessorContextImpl class would implement both. This allows us to cast
> it to `RecordContext` and thus limit the visible scope.
> 
> 
> -Matthias
> 
> 
> 
> On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
>> Hi all,
>>
>> I updated the KIP w.r.t. discussion and comments.
>> Basically I eliminated overloads for particular method if they are more
>> than 3.
>> As we can see there are a lot of overloads (and more will come with KIP-149
>> :) )
>> So, is it wise to
>> wait the result of constructive DSL thread or
>> extend KIP to address this issue as well or
>> continue as it is?
>>
>> Cheers,
>> Jeyhun
>>
>> On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang  wrote:
>>
>>> LGTM. Thanks!
>>>
>>>
>>> Guozhang
>>>
>>> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov 
>>> wrote:
>>>
 Thanks for the comment Matthias. After all the discussion (thanks to all
 participants), I think this (single method that passes in a RecordContext
 object) is the best alternative.
 Just a side note: I think KAFKA-3907 [1] can also be integrated into the
 KIP by adding related method inside RecordContext interface.


 [1] https://issues.apache.org/jira/browse/KAFKA-3907


 Cheers,
 Jeyhun

 On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax 
 wrote:

> Hi,
>
> I would like to push this discussion further. It seems we got nice
> alternatives (thanks for the summary Jeyhun!).
>
> With respect to RichFunctions and allowing them to be stateful, I have
> my doubt as expressed already. From my understanding, the idea was to
> give access to record metadata information only. If you want to do a
> stateful computation you should rather use #transform().
>
> Furthermore, as pointed out, we would need to switch to a
> supplier-pattern introducing many more overloads.
>
> For those reason, I advocate for a simple interface with a single
>>> method
> that passes in a RecordContext object.
>
>
> -Matthias
>
>
> On 6/6/17 5:15 PM, Guozhang Wang wrote:
>> Thanks for the comprehensive summary!
>>
>> Personally I'd prefer the option of passing RecordContext as an
> additional
>> parameter into he overloaded function. But I'm also open to other
> arguments
>> if there are sth. that I have overlooked.
>>
>> Guozhang
>>
>>
>> On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov  wrote:
>>
>>> Hi,
>>>
>>> Thanks for your comments Matthias and Guozhang.
>>>
>>> Below I mention the quick summary of the main alternatives we looked
 at
> to
>>> introduce the Rich functions (I will refer to it as Rich functions
> until we
>>> find better/another name). Initially the proposed alternatives was
>>> not
>>> backwards-compatible, so I will not mention them.
>>> The related discussions are spread in KIP-149 and in this KIP
 (KIP-159)
>>> discussion threads.
>>>
>>>
>>>
>>> 1. The idea of rich functions came into the stage with KIP-149, in
>>> discussion thread. As a result we extended KIP-149 to support Rich
>>> functions as well.
>>>
>>> 2.  To as part of the Rich functions, we provided init
> (ProcessorContext)
>>> method. Afterwards, Dammian suggested that we should not provide
>>> ProcessorContext to users. As a result, we separated the two
>>> problems
> into
>>> two separate KIPs, as it seems they can be solved in parallel.
>>>
>>> - One approach we considered was :
>>>
>>> public interface ValueMapperWithKey {
>>> VR apply(final K key, final V value);
>>> }
>>>
>>> public interface RichValueMapper extends RichFunction{
>>> }
>>>
>>> public interface RichFunction {
>>> void init(RecordContext recordContext);
>>> void close();
>>> }
>>>
>>> public interface RecordContext {
>>> String applicationId();
>>> TaskId taskId();
>>> StreamsMetrics metrics();
>>> String topic();
>>> int partition();
>>> long offset();
>>> long timestamp();
>>> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-29 Thread Matthias J. Sax
Thanks for updating the KIP.

I have one concern with regard to backward compatibility. You suggest
using RecordContext as the base interface for ProcessorContext. This will
break compatibility.

I think we should just have two independent interfaces. Our own
ProcessorContextImpl class would implement both. This allows us to cast
it to `RecordContext` and thus limit the visible scope.
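
A minimal sketch of that layout (class and method bodies below are illustrative
placeholders; only the relationship between the types matters):

interface RecordContext {
    String topic();
    long offset();
    long timestamp();
}

interface ProcessorContext {          // independent of RecordContext
    String applicationId();
}

// The single implementation class implements both, so internally it can be
// handed out either as the full ProcessorContext or, narrowed by a cast, as
// just the RecordContext that user-facing rich functions are allowed to see.
class ProcessorContextImpl implements ProcessorContext, RecordContext {
    @Override public String applicationId() { return "my-app"; }
    @Override public String topic()         { return "my-topic"; }
    @Override public long offset()          { return 0L; }
    @Override public long timestamp()       { return 0L; }
}

// At the call site, e.g.:
// final RecordContext narrowed = (RecordContext) processorContextImpl;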


-Matthias



On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
> Hi all,
> 
> I updated the KIP w.r.t. discussion and comments.
> Basically I eliminated overloads for particular method if they are more
> than 3.
> As we can see there are a lot of overloads (and more will come with KIP-149
> :) )
> So, is it wise to
> wait the result of constructive DSL thread or
> extend KIP to address this issue as well or
> continue as it is?
> 
> Cheers,
> Jeyhun
> 
> On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang  wrote:
> 
>> LGTM. Thanks!
>>
>>
>> Guozhang
>>
>> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov 
>> wrote:
>>
>>> Thanks for the comment Matthias. After all the discussion (thanks to all
>>> participants), I think this (single method that passes in a RecordContext
>>> object) is the best alternative.
>>> Just a side note: I think KAFKA-3907 [1] can also be integrated into the
>>> KIP by adding related method inside RecordContext interface.
>>>
>>>
>>> [1] https://issues.apache.org/jira/browse/KAFKA-3907
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>> On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax 
>>> wrote:
>>>
 Hi,

 I would like to push this discussion further. It seems we got nice
 alternatives (thanks for the summary Jeyhun!).

 With respect to RichFunctions and allowing them to be stateful, I have
 my doubt as expressed already. From my understanding, the idea was to
 give access to record metadata information only. If you want to do a
 stateful computation you should rather use #transform().

 Furthermore, as pointed out, we would need to switch to a
 supplier-pattern introducing many more overloads.

 For those reason, I advocate for a simple interface with a single
>> method
 that passes in a RecordContext object.


 -Matthias


 On 6/6/17 5:15 PM, Guozhang Wang wrote:
> Thanks for the comprehensive summary!
>
> Personally I'd prefer the option of passing RecordContext as an
 additional
> parameter into he overloaded function. But I'm also open to other
 arguments
> if there are sth. that I have overlooked.
>
> Guozhang
>
>
> On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov >>
 wrote:
>
>> Hi,
>>
>> Thanks for your comments Matthias and Guozhang.
>>
>> Below I mention the quick summary of the main alternatives we looked
>>> at
 to
>> introduce the Rich functions (I will refer to it as Rich functions
 until we
>> find better/another name). Initially the proposed alternatives was
>> not
>> backwards-compatible, so I will not mention them.
>> The related discussions are spread in KIP-149 and in this KIP
>>> (KIP-159)
>> discussion threads.
>>
>>
>>
>> 1. The idea of rich functions came into the stage with KIP-149, in
>> discussion thread. As a result we extended KIP-149 to support Rich
>> functions as well.
>>
>> 2.  To as part of the Rich functions, we provided init
 (ProcessorContext)
>> method. Afterwards, Dammian suggested that we should not provide
>> ProcessorContext to users. As a result, we separated the two
>> problems
 into
>> two separate KIPs, as it seems they can be solved in parallel.
>>
>> - One approach we considered was :
>>
>> public interface ValueMapperWithKey {
>> VR apply(final K key, final V value);
>> }
>>
>> public interface RichValueMapper extends RichFunction{
>> }
>>
>> public interface RichFunction {
>> void init(RecordContext recordContext);
>> void close();
>> }
>>
>> public interface RecordContext {
>> String applicationId();
>> TaskId taskId();
>> StreamsMetrics metrics();
>> String topic();
>> int partition();
>> long offset();
>> long timestamp();
>> Map appConfigs();
>> Map appConfigsWithPrefix(String prefix);
>> }
>>
>>
>> public interface ProcessorContext extends RecordContext {
>>// all methods but the ones in RecordContext
>> }
>>
>> As a result:
>> * . All "withKey" and "withoutKey" interfaces can be converted to
>>> their
>> Rich counterparts (with empty init() and close() methods)
>> *. All related Processors will accept Rich interfaces in their
>> constructors.
>> *. So, we convert the related "withKey" or "withoutKey" 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-27 Thread Jeyhun Karimov
Hi all,

I updated the KIP w.r.t. the discussion and comments.
Basically, I eliminated the overloads for a particular method if there are
more than 3.
As we can see there are a lot of overloads (and more will come with KIP-149
:) )
So, is it wise to
wait for the result of the constructive DSL thread, or
extend the KIP to address this issue as well, or
continue as it is?

Cheers,
Jeyhun

On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang  wrote:

> LGTM. Thanks!
>
>
> Guozhang
>
> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov 
> wrote:
>
> > Thanks for the comment Matthias. After all the discussion (thanks to all
> > participants), I think this (single method that passes in a RecordContext
> > object) is the best alternative.
> > Just a side note: I think KAFKA-3907 [1] can also be integrated into the
> > KIP by adding related method inside RecordContext interface.
> >
> >
> > [1] https://issues.apache.org/jira/browse/KAFKA-3907
> >
> >
> > Cheers,
> > Jeyhun
> >
> > On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax 
> > wrote:
> >
> > > Hi,
> > >
> > > I would like to push this discussion further. It seems we got nice
> > > alternatives (thanks for the summary Jeyhun!).
> > >
> > > With respect to RichFunctions and allowing them to be stateful, I have
> > > my doubt as expressed already. From my understanding, the idea was to
> > > give access to record metadata information only. If you want to do a
> > > stateful computation you should rather use #transform().
> > >
> > > Furthermore, as pointed out, we would need to switch to a
> > > supplier-pattern introducing many more overloads.
> > >
> > > For those reason, I advocate for a simple interface with a single
> method
> > > that passes in a RecordContext object.
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 6/6/17 5:15 PM, Guozhang Wang wrote:
> > > > Thanks for the comprehensive summary!
> > > >
> > > > Personally I'd prefer the option of passing RecordContext as an
> > > additional
> > > > parameter into he overloaded function. But I'm also open to other
> > > arguments
> > > > if there are sth. that I have overlooked.
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov  >
> > > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> Thanks for your comments Matthias and Guozhang.
> > > >>
> > > >> Below I mention the quick summary of the main alternatives we looked
> > at
> > > to
> > > >> introduce the Rich functions (I will refer to it as Rich functions
> > > until we
> > > >> find better/another name). Initially the proposed alternatives was
> not
> > > >> backwards-compatible, so I will not mention them.
> > > >> The related discussions are spread in KIP-149 and in this KIP
> > (KIP-159)
> > > >> discussion threads.
> > > >>
> > > >>
> > > >>
> > > >> 1. The idea of rich functions came into the stage with KIP-149, in
> > > >> discussion thread. As a result we extended KIP-149 to support Rich
> > > >> functions as well.
> > > >>
> > > >> 2.  To as part of the Rich functions, we provided init
> > > (ProcessorContext)
> > > >> method. Afterwards, Dammian suggested that we should not provide
> > > >> ProcessorContext to users. As a result, we separated the two
> problems
> > > into
> > > >> two separate KIPs, as it seems they can be solved in parallel.
> > > >>
> > > >> - One approach we considered was :
> > > >>
> > > >> public interface ValueMapperWithKey {
> > > >> VR apply(final K key, final V value);
> > > >> }
> > > >>
> > > >> public interface RichValueMapper extends RichFunction{
> > > >> }
> > > >>
> > > >> public interface RichFunction {
> > > >> void init(RecordContext recordContext);
> > > >> void close();
> > > >> }
> > > >>
> > > >> public interface RecordContext {
> > > >> String applicationId();
> > > >> TaskId taskId();
> > > >> StreamsMetrics metrics();
> > > >> String topic();
> > > >> int partition();
> > > >> long offset();
> > > >> long timestamp();
> > > >> Map appConfigs();
> > > >> Map appConfigsWithPrefix(String prefix);
> > > >> }
> > > >>
> > > >>
> > > >> public interface ProcessorContext extends RecordContext {
> > > >>// all methods but the ones in RecordContext
> > > >> }
> > > >>
> > > >> As a result:
> > > >> * . All "withKey" and "withoutKey" interfaces can be converted to
> > their
> > > >> Rich counterparts (with empty init() and close() methods)
> > > >> *. All related Processors will accept Rich interfaces in their
> > > >> constructors.
> > > >> *. So, we convert the related "withKey" or "withoutKey" interfaces
> to
> > > Rich
> > > >> interface while building the topology and initialize the related
> > > processors
> > > >> with Rich interfaces only.
> > > >> *. We will not need to overloaded methods for rich functions as Rich
> > > >> interfaces extend withKey interfaces. We will just check the object
> > type
> > 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-14 Thread Matthias J. Sax
Including KAFKA-3907 sounds reasonable to me.

-Matthias

On 6/14/17 2:29 PM, Guozhang Wang wrote:
> LGTM. Thanks!
> 
> 
> Guozhang
> 
> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov 
> wrote:
> 
>> Thanks for the comment Matthias. After all the discussion (thanks to all
>> participants), I think this (single method that passes in a RecordContext
>> object) is the best alternative.
>> Just a side note: I think KAFKA-3907 [1] can also be integrated into the
>> KIP by adding related method inside RecordContext interface.
>>
>>
>> [1] https://issues.apache.org/jira/browse/KAFKA-3907
>>
>>
>> Cheers,
>> Jeyhun
>>
>> On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax 
>> wrote:
>>
>>> Hi,
>>>
>>> I would like to push this discussion further. It seems we got nice
>>> alternatives (thanks for the summary Jeyhun!).
>>>
>>> With respect to RichFunctions and allowing them to be stateful, I have
>>> my doubt as expressed already. From my understanding, the idea was to
>>> give access to record metadata information only. If you want to do a
>>> stateful computation you should rather use #transform().
>>>
>>> Furthermore, as pointed out, we would need to switch to a
>>> supplier-pattern introducing many more overloads.
>>>
>>> For those reason, I advocate for a simple interface with a single method
>>> that passes in a RecordContext object.
>>>
>>>
>>> -Matthias
>>>
>>>
>>> On 6/6/17 5:15 PM, Guozhang Wang wrote:
 Thanks for the comprehensive summary!

 Personally I'd prefer the option of passing RecordContext as an
>>> additional
 parameter into he overloaded function. But I'm also open to other
>>> arguments
 if there are sth. that I have overlooked.

 Guozhang


 On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov 
>>> wrote:

> Hi,
>
> Thanks for your comments Matthias and Guozhang.
>
> Below I mention the quick summary of the main alternatives we looked
>> at
>>> to
> introduce the Rich functions (I will refer to it as Rich functions
>>> until we
> find better/another name). Initially the proposed alternatives was not
> backwards-compatible, so I will not mention them.
> The related discussions are spread in KIP-149 and in this KIP
>> (KIP-159)
> discussion threads.
>
>
>
> 1. The idea of rich functions came into the stage with KIP-149, in
> discussion thread. As a result we extended KIP-149 to support Rich
> functions as well.
>
> 2.  To as part of the Rich functions, we provided init
>>> (ProcessorContext)
> method. Afterwards, Dammian suggested that we should not provide
> ProcessorContext to users. As a result, we separated the two problems
>>> into
> two separate KIPs, as it seems they can be solved in parallel.
>
> - One approach we considered was :
>
> public interface ValueMapperWithKey {
> VR apply(final K key, final V value);
> }
>
> public interface RichValueMapper extends RichFunction{
> }
>
> public interface RichFunction {
> void init(RecordContext recordContext);
> void close();
> }
>
> public interface RecordContext {
> String applicationId();
> TaskId taskId();
> StreamsMetrics metrics();
> String topic();
> int partition();
> long offset();
> long timestamp();
> Map appConfigs();
> Map appConfigsWithPrefix(String prefix);
> }
>
>
> public interface ProcessorContext extends RecordContext {
>// all methods but the ones in RecordContext
> }
>
> As a result:
> * . All "withKey" and "withoutKey" interfaces can be converted to
>> their
> Rich counterparts (with empty init() and close() methods)
> *. All related Processors will accept Rich interfaces in their
> constructors.
> *. So, we convert the related "withKey" or "withoutKey" interfaces to
>>> Rich
> interface while building the topology and initialize the related
>>> processors
> with Rich interfaces only.
> *. We will not need to overloaded methods for rich functions as Rich
> interfaces extend withKey interfaces. We will just check the object
>> type
> and act accordingly.
>
>
>
>
> 3. There was some thoughts that the above approach does not support
>>> lambdas
> so we should support only one method, only init(RecordContext), as
>> part
>>> of
> Rich interfaces.
> This is still in discussion. Personally I think Rich interfaces are by
> definition lambda-free and we should not care much about it.
>
>
> 4. Thanks to Matthias's discussion, an alternative we considered was
>> to
> pass in the RecordContext as method parameter.  This might even allow
>> to
> use Lambdas and we could keep the name RichFunction as we preserve the
> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-14 Thread Guozhang Wang
LGTM. Thanks!


Guozhang

On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov 
wrote:

> Thanks for the comment Matthias. After all the discussion (thanks to all
> participants), I think this (single method that passes in a RecordContext
> object) is the best alternative.
> Just a side note: I think KAFKA-3907 [1] can also be integrated into the
> KIP by adding related method inside RecordContext interface.
>
>
> [1] https://issues.apache.org/jira/browse/KAFKA-3907
>
>
> Cheers,
> Jeyhun
>
> On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax 
> wrote:
>
> > Hi,
> >
> > I would like to push this discussion further. It seems we got nice
> > alternatives (thanks for the summary Jeyhun!).
> >
> > With respect to RichFunctions and allowing them to be stateful, I have
> > my doubt as expressed already. From my understanding, the idea was to
> > give access to record metadata information only. If you want to do a
> > stateful computation you should rather use #transform().
> >
> > Furthermore, as pointed out, we would need to switch to a
> > supplier-pattern introducing many more overloads.
> >
> > For those reason, I advocate for a simple interface with a single method
> > that passes in a RecordContext object.
> >
> >
> > -Matthias
> >
> >
> > On 6/6/17 5:15 PM, Guozhang Wang wrote:
> > > Thanks for the comprehensive summary!
> > >
> > > Personally I'd prefer the option of passing RecordContext as an
> > additional
> > > parameter into he overloaded function. But I'm also open to other
> > arguments
> > > if there are sth. that I have overlooked.
> > >
> > > Guozhang
> > >
> > >
> > > On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov 
> > wrote:
> > >
> > >> Hi,
> > >>
> > >> Thanks for your comments Matthias and Guozhang.
> > >>
> > >> Below I mention the quick summary of the main alternatives we looked
> at
> > to
> > >> introduce the Rich functions (I will refer to it as Rich functions
> > until we
> > >> find better/another name). Initially the proposed alternatives was not
> > >> backwards-compatible, so I will not mention them.
> > >> The related discussions are spread in KIP-149 and in this KIP
> (KIP-159)
> > >> discussion threads.
> > >>
> > >>
> > >>
> > >> 1. The idea of rich functions came into the stage with KIP-149, in
> > >> discussion thread. As a result we extended KIP-149 to support Rich
> > >> functions as well.
> > >>
> > >> 2.  To as part of the Rich functions, we provided init
> > (ProcessorContext)
> > >> method. Afterwards, Dammian suggested that we should not provide
> > >> ProcessorContext to users. As a result, we separated the two problems
> > into
> > >> two separate KIPs, as it seems they can be solved in parallel.
> > >>
> > >> - One approach we considered was :
> > >>
> > >> public interface ValueMapperWithKey {
> > >> VR apply(final K key, final V value);
> > >> }
> > >>
> > >> public interface RichValueMapper extends RichFunction{
> > >> }
> > >>
> > >> public interface RichFunction {
> > >> void init(RecordContext recordContext);
> > >> void close();
> > >> }
> > >>
> > >> public interface RecordContext {
> > >> String applicationId();
> > >> TaskId taskId();
> > >> StreamsMetrics metrics();
> > >> String topic();
> > >> int partition();
> > >> long offset();
> > >> long timestamp();
> > >> Map appConfigs();
> > >> Map appConfigsWithPrefix(String prefix);
> > >> }
> > >>
> > >>
> > >> public interface ProcessorContext extends RecordContext {
> > >>// all methods but the ones in RecordContext
> > >> }
> > >>
> > >> As a result:
> > >> * . All "withKey" and "withoutKey" interfaces can be converted to
> their
> > >> Rich counterparts (with empty init() and close() methods)
> > >> *. All related Processors will accept Rich interfaces in their
> > >> constructors.
> > >> *. So, we convert the related "withKey" or "withoutKey" interfaces to
> > Rich
> > >> interface while building the topology and initialize the related
> > processors
> > >> with Rich interfaces only.
> > >> *. We will not need to overloaded methods for rich functions as Rich
> > >> interfaces extend withKey interfaces. We will just check the object
> type
> > >> and act accordingly.
> > >>
> > >>
> > >>
> > >>
> > >> 3. There was some thoughts that the above approach does not support
> > lambdas
> > >> so we should support only one method, only init(RecordContext), as
> part
> > of
> > >> Rich interfaces.
> > >> This is still in discussion. Personally I think Rich interfaces are by
> > >> definition lambda-free and we should not care much about it.
> > >>
> > >>
> > >> 4. Thanks to Matthias's discussion, an alternative we considered was
> to
> > >> pass in the RecordContext as method parameter.  This might even allow
> to
> > >> use Lambdas and we could keep the name RichFunction as we preserve the
> > >> nature of being a function.
> > >> "If you go with 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-13 Thread Jeyhun Karimov
Thanks for the comment Matthias. After all the discussion (thanks to all
participants), I think this (single method that passes in a RecordContext
object) is the best alternative.
Just a side note: I think KAFKA-3907 [1] can also be integrated into the
KIP by adding a related method inside the RecordContext interface.


[1] https://issues.apache.org/jira/browse/KAFKA-3907


Cheers,
Jeyhun

On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax 
wrote:

> Hi,
>
> I would like to push this discussion further. It seems we got nice
> alternatives (thanks for the summary Jeyhun!).
>
> With respect to RichFunctions and allowing them to be stateful, I have
> my doubt as expressed already. From my understanding, the idea was to
> give access to record metadata information only. If you want to do a
> stateful computation you should rather use #transform().
>
> Furthermore, as pointed out, we would need to switch to a
> supplier-pattern introducing many more overloads.
>
> For those reason, I advocate for a simple interface with a single method
> that passes in a RecordContext object.
>
>
> -Matthias
>
>
> On 6/6/17 5:15 PM, Guozhang Wang wrote:
> > Thanks for the comprehensive summary!
> >
> > Personally I'd prefer the option of passing RecordContext as an
> additional
> > parameter into he overloaded function. But I'm also open to other
> arguments
> > if there are sth. that I have overlooked.
> >
> > Guozhang
> >
> >
> > On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov 
> wrote:
> >
> >> Hi,
> >>
> >> Thanks for your comments Matthias and Guozhang.
> >>
> >> Below I mention the quick summary of the main alternatives we looked at
> to
> >> introduce the Rich functions (I will refer to it as Rich functions
> until we
> >> find better/another name). Initially the proposed alternatives was not
> >> backwards-compatible, so I will not mention them.
> >> The related discussions are spread in KIP-149 and in this KIP (KIP-159)
> >> discussion threads.
> >>
> >>
> >>
> >> 1. The idea of rich functions came into the stage with KIP-149, in
> >> discussion thread. As a result we extended KIP-149 to support Rich
> >> functions as well.
> >>
> >> 2.  To as part of the Rich functions, we provided init
> (ProcessorContext)
> >> method. Afterwards, Dammian suggested that we should not provide
> >> ProcessorContext to users. As a result, we separated the two problems
> into
> >> two separate KIPs, as it seems they can be solved in parallel.
> >>
> >> - One approach we considered was :
> >>
> >> public interface ValueMapperWithKey {
> >> VR apply(final K key, final V value);
> >> }
> >>
> >> public interface RichValueMapper extends RichFunction{
> >> }
> >>
> >> public interface RichFunction {
> >> void init(RecordContext recordContext);
> >> void close();
> >> }
> >>
> >> public interface RecordContext {
> >> String applicationId();
> >> TaskId taskId();
> >> StreamsMetrics metrics();
> >> String topic();
> >> int partition();
> >> long offset();
> >> long timestamp();
> >> Map appConfigs();
> >> Map appConfigsWithPrefix(String prefix);
> >> }
> >>
> >>
> >> public interface ProcessorContext extends RecordContext {
> >>// all methods but the ones in RecordContext
> >> }
> >>
> >> As a result:
> >> * . All "withKey" and "withoutKey" interfaces can be converted to their
> >> Rich counterparts (with empty init() and close() methods)
> >> *. All related Processors will accept Rich interfaces in their
> >> constructors.
> >> *. So, we convert the related "withKey" or "withoutKey" interfaces to
> Rich
> >> interface while building the topology and initialize the related
> processors
> >> with Rich interfaces only.
> >> *. We will not need to overloaded methods for rich functions as Rich
> >> interfaces extend withKey interfaces. We will just check the object type
> >> and act accordingly.
> >>
> >>
> >>
> >>
> >> 3. There was some thoughts that the above approach does not support
> lambdas
> >> so we should support only one method, only init(RecordContext), as part
> of
> >> Rich interfaces.
> >> This is still in discussion. Personally I think Rich interfaces are by
> >> definition lambda-free and we should not care much about it.
> >>
> >>
> >> 4. Thanks to Matthias's discussion, an alternative we considered was to
> >> pass in the RecordContext as method parameter.  This might even allow to
> >> use Lambdas and we could keep the name RichFunction as we preserve the
> >> nature of being a function.
> >> "If you go with `init()` and `close()` we basically
> >> allow users to have an in-memory state for a function. Thus, we cannot
> >> share a single instance of RichValueMapper (etc) over multiple tasks and
> >> we would need a supplier pattern similar to #transform(). And this would
> >> "break the flow" of the API, as (Rich)ValueMapperSupplier would not
> >> inherit from ValueMapper and thus we would 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-13 Thread Matthias J. Sax
Hi,

I would like to push this discussion further. It seems we got nice
alternatives (thanks for the summary Jeyhun!).

With respect to RichFunctions and allowing them to be stateful, I have
my doubts as expressed already. From my understanding, the idea was to
give access to record metadata information only. If you want to do a
stateful computation you should rather use #transform().

Furthermore, as pointed out, we would need to switch to a
supplier pattern, introducing many more overloads.

For those reasons, I advocate for a simple interface with a single method
that passes in a RecordContext object.


-Matthias
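
To illustrate the supplier-pattern concern: if the rich interface carried init()/close() (i.e. per-instance state), one instance could not be shared across tasks, so the DSL would need supplier overloads roughly like the following. All names are hypothetical; RecordContext stands in for the interface drafted earlier in this thread.

interface RecordContext { long offset(); long timestamp(); }   // stub for the drafted interface

interface RichValueMapper<V, VR> {
    void init(RecordContext context);   // lifecycle => the mapper may hold per-task state
    VR apply(V value);
    void close();
}

interface RichValueMapperSupplier<V, VR> {
    RichValueMapper<V, VR> get();       // one fresh mapper instance per task
}

// KStream would then need extra overloads along the lines of
//     <VR> KStream<K, VR> mapValues(RichValueMapperSupplier<V, VR> supplier);
// whereas the single-method variant that takes the RecordContext as a parameter
// keeps the existing ValueMapper-style overloads and stays lambda-friendly.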


On 6/6/17 5:15 PM, Guozhang Wang wrote:
> Thanks for the comprehensive summary!
> 
> Personally I'd prefer the option of passing RecordContext as an additional
> parameter into he overloaded function. But I'm also open to other arguments
> if there are sth. that I have overlooked.
> 
> Guozhang
> 
> 
> On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov  wrote:
> 
>> Hi,
>>
>> Thanks for your comments Matthias and Guozhang.
>>
>> Below I mention the quick summary of the main alternatives we looked at to
>> introduce the Rich functions (I will refer to it as Rich functions until we
>> find better/another name). Initially the proposed alternatives was not
>> backwards-compatible, so I will not mention them.
>> The related discussions are spread in KIP-149 and in this KIP (KIP-159)
>> discussion threads.
>>
>>
>>
>> 1. The idea of rich functions came into the stage with KIP-149, in
>> discussion thread. As a result we extended KIP-149 to support Rich
>> functions as well.
>>
>> 2.  To as part of the Rich functions, we provided init (ProcessorContext)
>> method. Afterwards, Dammian suggested that we should not provide
>> ProcessorContext to users. As a result, we separated the two problems into
>> two separate KIPs, as it seems they can be solved in parallel.
>>
>> - One approach we considered was :
>>
>> public interface ValueMapperWithKey {
>> VR apply(final K key, final V value);
>> }
>>
>> public interface RichValueMapper extends RichFunction{
>> }
>>
>> public interface RichFunction {
>> void init(RecordContext recordContext);
>> void close();
>> }
>>
>> public interface RecordContext {
>> String applicationId();
>> TaskId taskId();
>> StreamsMetrics metrics();
>> String topic();
>> int partition();
>> long offset();
>> long timestamp();
>> Map appConfigs();
>> Map appConfigsWithPrefix(String prefix);
>> }
>>
>>
>> public interface ProcessorContext extends RecordContext {
>>// all methods but the ones in RecordContext
>> }
>>
>> As a result:
>> * . All "withKey" and "withoutKey" interfaces can be converted to their
>> Rich counterparts (with empty init() and close() methods)
>> *. All related Processors will accept Rich interfaces in their
>> constructors.
>> *. So, we convert the related "withKey" or "withoutKey" interfaces to Rich
>> interface while building the topology and initialize the related processors
>> with Rich interfaces only.
>> *. We will not need to overloaded methods for rich functions as Rich
>> interfaces extend withKey interfaces. We will just check the object type
>> and act accordingly.
>>
>>
>>
>>
>> 3. There was some thoughts that the above approach does not support lambdas
>> so we should support only one method, only init(RecordContext), as part of
>> Rich interfaces.
>> This is still in discussion. Personally I think Rich interfaces are by
>> definition lambda-free and we should not care much about it.
>>
>>
>> 4. Thanks to Matthias's discussion, an alternative we considered was to
>> pass in the RecordContext as method parameter.  This might even allow to
>> use Lambdas and we could keep the name RichFunction as we preserve the
>> nature of being a function.
>> "If you go with `init()` and `close()` we basically
>> allow users to have an in-memory state for a function. Thus, we cannot
>> share a single instance of RichValueMapper (etc) over multiple tasks and
>> we would need a supplier pattern similar to #transform(). And this would
>> "break the flow" of the API, as (Rich)ValueMapperSupplier would not
>> inherit from ValueMapper and thus we would need many new overload for
>> KStream/KTable classes". (Copy paste from Matthias's email)
>>
>>
>> Cheers,
>> Jeyhun
>>
>>
>> On Mon, Jun 5, 2017 at 5:18 AM Matthias J. Sax 
>> wrote:
>>
>>> Yes, we did consider this, and there is no consensus yet what the best
>>> alternative is.
>>>
>>> @Jeyhun: the email thread got pretty long. Maybe you can give a quick
>>> summary of the current state of the discussion?
>>>
>>>
>>> -Matthias
>>>
>>> On 6/4/17 6:04 PM, Guozhang Wang wrote:
 Thanks for the explanation Jeyhun and Matthias.

 I have just read through both KIP-149 and KIP-159 and am wondering if
>> you
 guys have considered a slight different approach for rich 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-06 Thread Guozhang Wang
Thanks for the comprehensive summary!

Personally I'd prefer the option of passing RecordContext as an additional
parameter into the overloaded function. But I'm also open to other arguments
if there is something that I have overlooked.

Guozhang


On Mon, Jun 5, 2017 at 3:19 PM, Jeyhun Karimov  wrote:

> Hi,
>
> Thanks for your comments Matthias and Guozhang.
>
> Below I mention the quick summary of the main alternatives we looked at to
> introduce the Rich functions (I will refer to it as Rich functions until we
> find better/another name). Initially the proposed alternatives was not
> backwards-compatible, so I will not mention them.
> The related discussions are spread in KIP-149 and in this KIP (KIP-159)
> discussion threads.
>
>
>
> 1. The idea of rich functions came into the stage with KIP-149, in
> discussion thread. As a result we extended KIP-149 to support Rich
> functions as well.
>
> 2.  To as part of the Rich functions, we provided init (ProcessorContext)
> method. Afterwards, Dammian suggested that we should not provide
> ProcessorContext to users. As a result, we separated the two problems into
> two separate KIPs, as it seems they can be solved in parallel.
>
> - One approach we considered was :
>
> public interface ValueMapperWithKey {
> VR apply(final K key, final V value);
> }
>
> public interface RichValueMapper extends RichFunction{
> }
>
> public interface RichFunction {
> void init(RecordContext recordContext);
> void close();
> }
>
> public interface RecordContext {
> String applicationId();
> TaskId taskId();
> StreamsMetrics metrics();
> String topic();
> int partition();
> long offset();
> long timestamp();
> Map appConfigs();
> Map appConfigsWithPrefix(String prefix);
> }
>
>
> public interface ProcessorContext extends RecordContext {
>// all methods but the ones in RecordContext
> }
>
> As a result:
> * . All "withKey" and "withoutKey" interfaces can be converted to their
> Rich counterparts (with empty init() and close() methods)
> *. All related Processors will accept Rich interfaces in their
> constructors.
> *. So, we convert the related "withKey" or "withoutKey" interfaces to Rich
> interface while building the topology and initialize the related processors
> with Rich interfaces only.
> *. We will not need to overloaded methods for rich functions as Rich
> interfaces extend withKey interfaces. We will just check the object type
> and act accordingly.
>
>
>
>
> 3. There was some thoughts that the above approach does not support lambdas
> so we should support only one method, only init(RecordContext), as part of
> Rich interfaces.
> This is still in discussion. Personally I think Rich interfaces are by
> definition lambda-free and we should not care much about it.
>
>
> 4. Thanks to Matthias's discussion, an alternative we considered was to
> pass in the RecordContext as method parameter.  This might even allow to
> use Lambdas and we could keep the name RichFunction as we preserve the
> nature of being a function.
> "If you go with `init()` and `close()` we basically
> allow users to have an in-memory state for a function. Thus, we cannot
> share a single instance of RichValueMapper (etc) over multiple tasks and
> we would need a supplier pattern similar to #transform(). And this would
> "break the flow" of the API, as (Rich)ValueMapperSupplier would not
> inherit from ValueMapper and thus we would need many new overload for
> KStream/KTable classes". (Copy paste from Matthias's email)
>
>
> Cheers,
> Jeyhun
>
>
> On Mon, Jun 5, 2017 at 5:18 AM Matthias J. Sax 
> wrote:
>
> > Yes, we did consider this, and there is no consensus yet what the best
> > alternative is.
> >
> > @Jeyhun: the email thread got pretty long. Maybe you can give a quick
> > summary of the current state of the discussion?
> >
> >
> > -Matthias
> >
> > On 6/4/17 6:04 PM, Guozhang Wang wrote:
> > > Thanks for the explanation Jeyhun and Matthias.
> > >
> > > I have just read through both KIP-149 and KIP-159 and am wondering if
> you
> > > guys have considered a slight different approach for rich function,
> that
> > is
> > > to add the `RecordContext` into the apply functions as an additional
> > > parameter. For example:
> > >
> > > ---
> > >
> > > interface RichValueMapper {
> > >
> > > VR apply(final V value, final RecordContext context);
> > >
> > > }
> > >
> > > ...
> > >
> > > // then in KStreams
> > >
> > >  KStream mapValues(ValueMapper
> > mapper);
> > >  KStream mapValueswithContext(RichValueMapper  > > extends VR> mapper);
> > >
> > > ---
> > >
> > > The caveat is that it will introduces more overloads; but I think the
> > > #.overloads are mainly introduced by 1) serde overrides and 2)
> > > state-store-supplier overides, both of which can be reduced in the near
> > > future, 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-05 Thread Jeyhun Karimov
Hi,

Thanks for your comments Matthias and Guozhang.

Below I give a quick summary of the main alternatives we looked at to
introduce the Rich functions (I will refer to them as Rich functions until we
find a better/another name). Initially the proposed alternatives were not
backwards-compatible, so I will not mention them.
The related discussions are spread across the KIP-149 and this KIP's (KIP-159)
discussion threads.



1. The idea of rich functions came onto the stage with KIP-149, in its
discussion thread. As a result we extended KIP-149 to support Rich
functions as well.

2. As part of the Rich functions, we provided an init(ProcessorContext)
method. Afterwards, Damian suggested that we should not provide
ProcessorContext to users. As a result, we separated the two problems into
two separate KIPs, as it seems they can be solved in parallel.

- One approach we considered was :

public interface ValueMapperWithKey<K, V, VR> {
    VR apply(final K key, final V value);
}

public interface RichValueMapper<K, V, VR> extends RichFunction {
}

public interface RichFunction {
    void init(RecordContext recordContext);
    void close();
}

public interface RecordContext {
    String applicationId();
    TaskId taskId();
    StreamsMetrics metrics();
    String topic();
    int partition();
    long offset();
    long timestamp();
    Map<String, Object> appConfigs();
    Map<String, Object> appConfigsWithPrefix(String prefix);
}


public interface ProcessorContext extends RecordContext {
   // all methods but the ones in RecordContext
}

As a result:
*. All "withKey" and "withoutKey" interfaces can be converted to their
Rich counterparts (with empty init() and close() methods).
*. All related Processors will accept Rich interfaces in their constructors.
*. So, we convert the related "withKey" or "withoutKey" interfaces to the Rich
interface while building the topology and initialize the related processors
with Rich interfaces only.
*. We will not need overloaded methods for rich functions, as the Rich
interfaces extend the withKey interfaces. We will just check the object type
and act accordingly (see the sketch below).
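
A rough illustration of the "check the object type and act accordingly" step, built on the interfaces listed above; the processor class name and its surrounding plumbing are made up for the example:

class KStreamMapValuesProcessor<K, V, VR> {
    private final ValueMapperWithKey<K, V, VR> mapper;

    KStreamMapValuesProcessor(final ValueMapperWithKey<K, V, VR> mapper) {
        this.mapper = mapper;
    }

    void init(final RecordContext recordContext) {
        if (mapper instanceof RichFunction) {            // rich variant: run its lifecycle hook
            ((RichFunction) mapper).init(recordContext);
        }                                                // plain variant: nothing to do
    }

    VR process(final K key, final V value) {
        return mapper.apply(key, value);
    }

    void close() {
        if (mapper instanceof RichFunction) {
            ((RichFunction) mapper).close();
        }
    }
}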




3. There were some thoughts that the above approach does not support lambdas,
so we should support only one method, init(RecordContext), as part of
the Rich interfaces.
This is still under discussion. Personally I think Rich interfaces are by
definition lambda-free and we should not care much about it.


4. Thanks to Matthias's discussion, an alternative we considered was to
pass in the RecordContext as a method parameter. This might even allow the
use of lambdas, and we could keep the name RichFunction as we preserve the
nature of being a function.
"If you go with `init()` and `close()` we basically
allow users to have an in-memory state for a function. Thus, we cannot
share a single instance of RichValueMapper (etc) over multiple tasks and
we would need a supplier pattern similar to #transform(). And this would
"break the flow" of the API, as (Rich)ValueMapperSupplier would not
inherit from ValueMapper and thus we would need many new overload for
KStream/KTable classes". (Copy paste from Matthias's email)


Cheers,
Jeyhun


On Mon, Jun 5, 2017 at 5:18 AM Matthias J. Sax 
wrote:

> Yes, we did consider this, and there is no consensus yet what the best
> alternative is.
>
> @Jeyhun: the email thread got pretty long. Maybe you can give a quick
> summary of the current state of the discussion?
>
>
> -Matthias
>
> On 6/4/17 6:04 PM, Guozhang Wang wrote:
> > Thanks for the explanation Jeyhun and Matthias.
> >
> > I have just read through both KIP-149 and KIP-159 and am wondering if you
> > guys have considered a slight different approach for rich function, that
> is
> > to add the `RecordContext` into the apply functions as an additional
> > parameter. For example:
> >
> > ---
> >
> > interface RichValueMapper {
> >
> > VR apply(final V value, final RecordContext context);
> >
> > }
> >
> > ...
> >
> > // then in KStreams
> >
> >  KStream mapValues(ValueMapper
> mapper);
> >  KStream mapValueswithContext(RichValueMapper  > extends VR> mapper);
> >
> > ---
> >
> > The caveat is that it will introduces more overloads; but I think the
> > #.overloads are mainly introduced by 1) serde overrides and 2)
> > state-store-supplier overides, both of which can be reduced in the near
> > future, and I felt this overloading is still worthwhile, as it has the
> > following benefits:
> >
> > 1) still allow lambda expressions.
> > 2) clearer code path (do not need to "convert" from non-rich functions to
> > rich functions)
> >
> >
> > Maybe this approach has already been discussed and I may have overlooked
> in
> > the email thread; anyways, lmk.
> >
> >
> > Guozhang
> >
> >
> >
> > On Thu, Jun 1, 2017 at 10:18 PM, Matthias J. Sax 
> > wrote:
> >
> >> I agree with Jeyhun. As already mention, the overall API improvement
> >> ideas are 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-04 Thread Matthias J. Sax
Yes, we did consider this, and there is no consensus yet on what the best
alternative is.

@Jeyhun: the email thread got pretty long. Maybe you can give a quick
summary of the current state of the discussion?


-Matthias

On 6/4/17 6:04 PM, Guozhang Wang wrote:
> Thanks for the explanation Jeyhun and Matthias.
> 
> I have just read through both KIP-149 and KIP-159 and am wondering if you
> guys have considered a slight different approach for rich function, that is
> to add the `RecordContext` into the apply functions as an additional
> parameter. For example:
> 
> ---
> 
> interface RichValueMapper {
> 
> VR apply(final V value, final RecordContext context);
> 
> }
> 
> ...
> 
> // then in KStreams
> 
>  KStream mapValues(ValueMapper mapper);
>  KStream mapValueswithContext(RichValueMapper  extends VR> mapper);
> 
> ---
> 
> The caveat is that it will introduces more overloads; but I think the
> #.overloads are mainly introduced by 1) serde overrides and 2)
> state-store-supplier overides, both of which can be reduced in the near
> future, and I felt this overloading is still worthwhile, as it has the
> following benefits:
> 
> 1) still allow lambda expressions.
> 2) clearer code path (do not need to "convert" from non-rich functions to
> rich functions)
> 
> 
> Maybe this approach has already been discussed and I may have overlooked in
> the email thread; anyways, lmk.
> 
> 
> Guozhang
> 
> 
> 
> On Thu, Jun 1, 2017 at 10:18 PM, Matthias J. Sax 
> wrote:
> 
>> I agree with Jeyhun. As already mention, the overall API improvement
>> ideas are overlapping and/or contradicting each other. For this reason,
>> not all ideas can be accomplished and some Jira might just be closed as
>> "won't fix".
>>
>> For this reason, we try to do those KIP discussion with are large scope
>> to get an overall picture to converge to an overall consisted API.
>>
>>
>> @Jeyhun: about the overloads. Yes, we might get more overload. It might
>> be sufficient though, to do a single xxxWithContext() overload that will
>> provide key+value+context. Otherwise, if might get too messy having
>> ValueMapper, ValueMapperWithKey, ValueMapperWithContext,
>> ValueMapperWithKeyWithContext.
>>
>> On the other hand, we also have the "builder pattern" idea as an API
>> change and this might mitigate the overload problem. Not for simple
>> function like map/flatMap etc but for joins and aggregations.
>>
>>
>> On the other hand, as I mentioned in an older email, I am personally
>> fine to break the pure functional interface, and add
>>
>>   - interface WithRecordContext with method `open(RecordContext)` (or
>> `init(...)`, or any better name) -- but not `close()`)
>>
>>   - interface ValueMapperWithRecordContext extends ValueMapper,
>> WithRecordContext
>>
>> This would allow us to avoid any overload. Of course, we don't get a
>> "pure function" interface and also sacrifices Lambdas.
>>
>>
>>
>> I am personally a little bit undecided what the better option might be.
>> Curious to hear what other think about this trade off.
>>
>>
>> -Matthias
>>
>>
>> On 6/1/17 6:13 PM, Jeyhun Karimov wrote:
>>> Hi Guozhang,
>>>
>>> It subsumes partially. Initially the idea was to support RichFunctions
>> as a
>>> separate interface. Throughout the discussion, however, we considered
>> maybe
>>> overloading the related methods (with RecodContext param) is better
>>> approach than providing a separate RichFunction interface.
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>> On Fri, Jun 2, 2017 at 2:27 AM Guozhang Wang  wrote:
>>>
 Does this KIP subsume this ticket as well?
 https://issues.apache.org/jira/browse/KAFKA-4125

 On Sat, May 20, 2017 at 9:05 AM, Jeyhun Karimov 
 wrote:

> Dear community,
>
> As we discussed in KIP-149 [DISCUSS] thread [1], I would like to
>> initiate
> KIP for rich functions (interfaces) [2].
> I would like to get your comments.
>
>
> [1]
> http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=
> Re+DISCUSS+KIP+149+Enabling+key+access+in+
>> ValueTransformer+ValueMapper+
> and+ValueJoiner
> [2]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 159%3A+Introducing+Rich+functions+to+Streams
>
>
> Cheers,
> Jeyhun
> --
> -Cheers
>
> Jeyhun
>



 --
 -- Guozhang

>>
>>
> 
> 





Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-04 Thread Guozhang Wang
Thanks for the explanation Jeyhun and Matthias.

I have just read through both KIP-149 and KIP-159 and am wondering if you
guys have considered a slightly different approach for rich functions, that is
to add the `RecordContext` to the apply functions as an additional
parameter. For example:

---

interface RichValueMapper<V, VR> {

    VR apply(final V value, final RecordContext context);

}

...

// then in KStreams

<VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? extends VR> mapper);
<VR> KStream<K, VR> mapValueswithContext(RichValueMapper<? super V, ? extends VR> mapper);

---

The caveat is that it will introduce more overloads; but I think the
#.overloads are mainly introduced by 1) serde overrides and 2)
state-store-supplier overrides, both of which can be reduced in the near
future, and I felt this overloading is still worthwhile, as it has the
following benefits:

1) still allow lambda expressions.
2) clearer code path (do not need to "convert" from non-rich functions to
rich functions)


Maybe this approach has already been discussed and I may have overlooked it in
the email thread; anyways, lmk.


Guozhang
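
A small usage sketch of the proposal above, assuming an input KStream<String, String> named `textLines`; mapValueswithContext does not exist in the actual API, it is the hypothetical overload from this message:

// plain mapValues keeps working with a one-argument lambda:
KStream<String, Integer> lengths = textLines.mapValues(value -> value.length());

// the proposed overload still allows a lambda, now with the record metadata available:
KStream<String, String> tagged = textLines.mapValueswithContext(
    (value, context) -> value + " @ " + context.topic() + "/" + context.partition()
                              + ", offset " + context.offset());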



On Thu, Jun 1, 2017 at 10:18 PM, Matthias J. Sax 
wrote:

> I agree with Jeyhun. As already mention, the overall API improvement
> ideas are overlapping and/or contradicting each other. For this reason,
> not all ideas can be accomplished and some Jira might just be closed as
> "won't fix".
>
> For this reason, we try to do those KIP discussion with are large scope
> to get an overall picture to converge to an overall consisted API.
>
>
> @Jeyhun: about the overloads. Yes, we might get more overload. It might
> be sufficient though, to do a single xxxWithContext() overload that will
> provide key+value+context. Otherwise, if might get too messy having
> ValueMapper, ValueMapperWithKey, ValueMapperWithContext,
> ValueMapperWithKeyWithContext.
>
> On the other hand, we also have the "builder pattern" idea as an API
> change and this might mitigate the overload problem. Not for simple
> function like map/flatMap etc but for joins and aggregations.
>
>
> On the other hand, as I mentioned in an older email, I am personally
> fine to break the pure functional interface, and add
>
>   - interface WithRecordContext with method `open(RecordContext)` (or
> `init(...)`, or any better name) -- but not `close()`)
>
>   - interface ValueMapperWithRecordContext extends ValueMapper,
> WithRecordContext
>
> This would allow us to avoid any overload. Of course, we don't get a
> "pure function" interface and also sacrifices Lambdas.
>
>
>
> I am personally a little bit undecided what the better option might be.
> Curious to hear what other think about this trade off.
>
>
> -Matthias
>
>
> On 6/1/17 6:13 PM, Jeyhun Karimov wrote:
> > Hi Guozhang,
> >
> > It subsumes partially. Initially the idea was to support RichFunctions
> as a
> > separate interface. Throughout the discussion, however, we considered
> maybe
> > overloading the related methods (with RecodContext param) is better
> > approach than providing a separate RichFunction interface.
> >
> > Cheers,
> > Jeyhun
> >
> > On Fri, Jun 2, 2017 at 2:27 AM Guozhang Wang  wrote:
> >
> >> Does this KIP subsume this ticket as well?
> >> https://issues.apache.org/jira/browse/KAFKA-4125
> >>
> >> On Sat, May 20, 2017 at 9:05 AM, Jeyhun Karimov 
> >> wrote:
> >>
> >>> Dear community,
> >>>
> >>> As we discussed in KIP-149 [DISCUSS] thread [1], I would like to
> initiate
> >>> KIP for rich functions (interfaces) [2].
> >>> I would like to get your comments.
> >>>
> >>>
> >>> [1]
> >>> http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=
> >>> Re+DISCUSS+KIP+149+Enabling+key+access+in+
> ValueTransformer+ValueMapper+
> >>> and+ValueJoiner
> >>> [2]
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>> 159%3A+Introducing+Rich+functions+to+Streams
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>> --
> >>> -Cheers
> >>>
> >>> Jeyhun
> >>>
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
>
>


-- 
-- Guozhang


Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-01 Thread Matthias J. Sax
I agree with Jeyhun. As already mentioned, the overall API improvement
ideas are overlapping and/or contradicting each other. For this reason,
not all ideas can be accomplished and some Jiras might just be closed as
"won't fix".

For this reason, we try to do those KIP discussions with a large scope,
to get an overall picture and to converge to an overall consistent API.


@Jeyhun: about the overloads. Yes, we might get more overloads. It might
be sufficient though, to do a single xxxWithContext() overload that will
provide key+value+context. Otherwise, it might get too messy having
ValueMapper, ValueMapperWithKey, ValueMapperWithContext, and
ValueMapperWithKeyWithContext.

On the other hand, we also have the "builder pattern" idea as an API
change and this might mitigate the overload problem. Not for simple
functions like map/flatMap etc., but for joins and aggregations.


On the other hand, as I mentioned in an older email, I am personally
fine with breaking the pure functional interface, and adding

  - interface WithRecordContext with method `open(RecordContext)` (or
`init(...)`, or any better name) -- but not `close()`

  - interface ValueMapperWithRecordContext extends ValueMapper,
WithRecordContext

This would allow us to avoid any overloads. Of course, we don't get a
"pure function" interface and we also sacrifice lambdas (see the sketch below).



I am personally a little bit undecided what the better option might be.
Curious to hear what other think about this trade off.


-Matthias


On 6/1/17 6:13 PM, Jeyhun Karimov wrote:
> Hi Guozhang,
> 
> It subsumes partially. Initially the idea was to support RichFunctions as a
> separate interface. Throughout the discussion, however, we considered maybe
> overloading the related methods (with RecodContext param) is better
> approach than providing a separate RichFunction interface.
> 
> Cheers,
> Jeyhun
> 
> On Fri, Jun 2, 2017 at 2:27 AM Guozhang Wang  wrote:
> 
>> Does this KIP subsume this ticket as well?
>> https://issues.apache.org/jira/browse/KAFKA-4125
>>
>> On Sat, May 20, 2017 at 9:05 AM, Jeyhun Karimov 
>> wrote:
>>
>>> Dear community,
>>>
>>> As we discussed in KIP-149 [DISCUSS] thread [1], I would like to initiate
>>> KIP for rich functions (interfaces) [2].
>>> I would like to get your comments.
>>>
>>>
>>> [1]
>>> http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=
>>> Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+
>>> and+ValueJoiner
>>> [2]
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>> 159%3A+Introducing+Rich+functions+to+Streams
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>> --
>>> -Cheers
>>>
>>> Jeyhun
>>>
>>
>>
>>
>> --
>> -- Guozhang
>>





Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-01 Thread Jeyhun Karimov
Hi Guozhang,

It subsumes it partially. Initially the idea was to support RichFunctions as a
separate interface. Throughout the discussion, however, we considered that maybe
overloading the related methods (with a RecordContext param) is a better
approach than providing a separate RichFunction interface.

Cheers,
Jeyhun

On Fri, Jun 2, 2017 at 2:27 AM Guozhang Wang  wrote:

> Does this KIP subsume this ticket as well?
> https://issues.apache.org/jira/browse/KAFKA-4125
>
> On Sat, May 20, 2017 at 9:05 AM, Jeyhun Karimov 
> wrote:
>
> > Dear community,
> >
> > As we discussed in KIP-149 [DISCUSS] thread [1], I would like to initiate
> > KIP for rich functions (interfaces) [2].
> > I would like to get your comments.
> >
> >
> > [1]
> > http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=
> > Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+
> > and+ValueJoiner
> > [2]
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 159%3A+Introducing+Rich+functions+to+Streams
> >
> >
> > Cheers,
> > Jeyhun
> > --
> > -Cheers
> >
> > Jeyhun
> >
>
>
>
> --
> -- Guozhang
>
-- 
-Cheers

Jeyhun


Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-06-01 Thread Guozhang Wang
Does this KIP subsume this ticket as well?
https://issues.apache.org/jira/browse/KAFKA-4125

On Sat, May 20, 2017 at 9:05 AM, Jeyhun Karimov 
wrote:

> Dear community,
>
> As we discussed in KIP-149 [DISCUSS] thread [1], I would like to initiate
> KIP for rich functions (interfaces) [2].
> I would like to get your comments.
>
>
> [1]
> http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=
> Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+
> and+ValueJoiner
> [2]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 159%3A+Introducing+Rich+functions+to+Streams
>
>
> Cheers,
> Jeyhun
> --
> -Cheers
>
> Jeyhun
>



-- 
-- Guozhang


Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-28 Thread Jeyhun Karimov
After your response on KIP-149 related to ValueTransformerSupplier,
everything
you mentioned now makes complete sense. Thanks for the clarification.

Just a note: we will have additional (to KIP-149) overloaded methods: for
each withKey and withoutKey method (ValueMapper and ValueMapperWithKey) we
will have an overloaded method with a RecordContext argument.
Other than this issue, I don't see any limitation.

Cheers,
Jeyhun
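
Schematically, the overload growth Jeyhun mentions would look like this on KStream<K, V> (illustrative signatures only; only the first line matches the then-existing API, and the names of the rich variants are placeholders):

<VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? extends VR> mapper);                        // existing
<VR> KStream<K, VR> mapValues(ValueMapperWithKey<? super K, ? super V, ? extends VR> mapper);      // KIP-149
<VR> KStream<K, VR> mapValues(RichValueMapper<? super V, ? extends VR> mapper);                    // value + RecordContext
<VR> KStream<K, VR> mapValues(RichValueMapperWithKey<? super K, ? super V, ? extends VR> mapper);  // key + value + RecordContext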


On Sun, May 28, 2017 at 6:34 PM Matthias J. Sax 
wrote:

> Thanks for you comments Jeyhun,
>
> I agree about the disadvantages. Only the punctuation part is something
> I don't buy. IMHO, RichFunctions should not allow to register and use
> punctuation. If you need punctuation, you should use #transform() or
> similar. Note, that we plan to provide `RecordContext` and not
> `ProcessorContext` and thus, it's not even possible to register
> punctuations.
>
> One more thought: if you go with `init()` and `close()` we basically
> allow users to have an in-memory state for a function. Thus, we cannot
> share a single instance of RichValueMapper (etc) over multiple tasks and
> we would need a supplier pattern similar to #transform(). And this would
> "break the flow" of the API, as (Rich)ValueMapperSupplier would not
> inherit from ValueMapper and thus we would need many new overload for
> KStream/KTable classes.
>
> The overall goal of RichFunction (from my understanding) was to provide
> record metadata information (like offset, timestamp, etc) to the user.
> And we still have #transform() that provided the init and close
> functionality. So if we introduce those with RichFunction we are quite
> close to what #transform provides, and thus it feels as if we duplicate
> functionality.
>
> For this reason, it seems to be better to got with the
> `#valueMapper(ValueMapper mapper, RecordContext context)` approach.
>
> WDYT?
>
>
>
> -Matthias
>
> On 5/27/17 11:00 AM, Jeyhun Karimov wrote:
> > Hi,
> >
> > Thanks for your comments. I will refer the overall approach as rich
> > functions until we find a better name.
> >
> > I think there are some pros and cons of the approach you described.
> >
> > Pros is that it is simple, has clear boundaries, avoids misunderstanding
> of
> > term "function".
> > So you propose sth like:
> > KStream.valueMapper (ValueMapper vm, RecordContext rc)
> > or
> > having rich functions with only a single init(RecordContext rc) method.
> >
> > Cons is that:
> >  - This will bring another set of overloads (if we use RecordContext as a
> > separate parameter). We should consider that the rich functions will be
> for
> > all main interfaces.
> >  - I don't think that we need lambdas in rich functions. It is by
> > definition "rich" so, no single method in interface -> as a result no
> > lambdas.
> >  - I disagree that rich functions should only contain init() method. This
> > depends on each interface. For example, for specific interfaces  we can
> add
> > methods (like punctuate()) to their rich functions.
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> >
> > On Thu, May 25, 2017 at 1:02 AM Matthias J. Sax 
> > wrote:
> >
> >> I confess, the term is borrowed from Flink :)
> >>
> >> Personally, I never thought about it, but I tend to agree with Michal. I
> >> also want to clarify, that the main purpose is the ability to access
> >> record metadata. Thus, it might even be sufficient to only have "init".
> >>
> >> An alternative would of course be, to pass in the RecordContext as
> >> method parameter. This would allow us to drop "init()". This might even
> >> allow to use Lambdas and we could keep the name RichFunction as we
> >> preserve the nature of being a function.
> >>
> >>
> >> -Matthias
> >>
> >> On 5/24/17 12:13 PM, Jeyhun Karimov wrote:
> >>> Hi Michal,
> >>>
> >>> Thanks for your comments. I see your point and I agree with it.
> However,
> >>> I don't have a better idea for naming. I checked MR source code. There
> >>> it is used JobConfigurable and Closable, two different interfaces.
> Maybe
> >>> we can rename RichFunction as Configurable?
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>> On Tue, May 23, 2017 at 2:58 PM Michal Borowiecki
> >>> >
> >>> wrote:
> >>>
> >>> Hi Jeyhun,
> >>>
> >>> I understand your argument about "Rich" in RichFunctions. Perhaps
> >>> I'm just being too puritan here, but let me ask this anyway:
> >>>
> >>> What is it that makes something a function? To me a function is
> >>> something that takes zero or more arguments and possibly returns a
> >>> value and while it may have side-effects (as opposed to "pure
> >>> functions" which can't), it doesn't have any life-cycle of its own.
> >>> This is what, in my mind, distinguishes the concept of a "function"
> >>> from that of more vaguely defined concepts.
> >>>
> >>> So if we add a life-cycle to a function, in that understanding, it
> >>> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-28 Thread Matthias J. Sax
Thanks for your comments Jeyhun,

I agree about the disadvantages. Only the punctuation part is something
I don't buy. IMHO, RichFunctions should not allow registering and using
punctuations. If you need punctuation, you should use #transform() or
similar. Note that we plan to provide `RecordContext` and not
`ProcessorContext`, and thus it's not even possible to register
punctuations.

One more thought: if you go with `init()` and `close()` we basically
allow users to have an in-memory state for a function. Thus, we cannot
share a single instance of RichValueMapper (etc) over multiple tasks and
we would need a supplier pattern similar to #transform(). And this would
"break the flow" of the API, as (Rich)ValueMapperSupplier would not
inherit from ValueMapper and thus we would need many new overload for
KStream/KTable classes.

The overall goal of RichFunction (from my understanding) was to provide
record metadata information (like offset, timestamp, etc.) to the user.
And we still have #transform(), which provides the init and close
functionality. So if we introduce those with RichFunction we are quite
close to what #transform provides, and thus it feels as if we duplicate
functionality.

For this reason, it seems to be better to go with the
`#valueMapper(ValueMapper mapper, RecordContext context)` approach.

WDYT?



-Matthias
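
For reference, the #transform() route mentioned above already exposes record metadata through ProcessorContext in the then-current (0.11-era) Processor API; a minimal sketch (the class name and the tagging logic are made up for illustration):

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

public class OffsetTaggingTransformer implements Transformer<String, String, KeyValue<String, String>> {

    private ProcessorContext context;

    @Override
    public void init(final ProcessorContext context) {
        this.context = context;            // gives access to topic(), partition(), offset(), timestamp()
    }

    @Override
    public KeyValue<String, String> transform(final String key, final String value) {
        return KeyValue.pair(key, value + " @ offset " + context.offset());
    }

    @Override
    public KeyValue<String, String> punctuate(final long timestamp) {
        return null;                       // no periodic output (method existed in the 0.11 API)
    }

    @Override
    public void close() { }
}

// usage: stream.transform(OffsetTaggingTransformer::new)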

On 5/27/17 11:00 AM, Jeyhun Karimov wrote:
> Hi,
> 
> Thanks for your comments. I will refer the overall approach as rich
> functions until we find a better name.
> 
> I think there are some pros and cons of the approach you described.
> 
> Pros is that it is simple, has clear boundaries, avoids misunderstanding of
> term "function".
> So you propose sth like:
> KStream.valueMapper (ValueMapper vm, RecordContext rc)
> or
> having rich functions with only a single init(RecordContext rc) method.
> 
> Cons is that:
>  - This will bring another set of overloads (if we use RecordContext as a
> separate parameter). We should consider that the rich functions will be for
> all main interfaces.
>  - I don't think that we need lambdas in rich functions. It is by
> definition "rich" so, no single method in interface -> as a result no
> lambdas.
>  - I disagree that rich functions should only contain init() method. This
> depends on each interface. For example, for specific interfaces  we can add
> methods (like punctuate()) to their rich functions.
> 
> 
> Cheers,
> Jeyhun
> 
> 
> 
> On Thu, May 25, 2017 at 1:02 AM Matthias J. Sax 
> wrote:
> 
>> I confess, the term is borrowed from Flink :)
>>
>> Personally, I never thought about it, but I tend to agree with Michal. I
>> also want to clarify, that the main purpose is the ability to access
>> record metadata. Thus, it might even be sufficient to only have "init".
>>
>> An alternative would of course be, to pass in the RecordContext as
>> method parameter. This would allow us to drop "init()". This might even
>> allow to use Lambdas and we could keep the name RichFunction as we
>> preserve the nature of being a function.
>>
>>
>> -Matthias
>>
>> On 5/24/17 12:13 PM, Jeyhun Karimov wrote:
>>> Hi Michal,
>>>
>>> Thanks for your comments. I see your point and I agree with it. However,
>>> I don't have a better idea for naming. I checked MR source code. There
>>> it is used JobConfigurable and Closable, two different interfaces. Maybe
>>> we can rename RichFunction as Configurable?
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>>
> >>> On Tue, May 23, 2017 at 2:58 PM Michal Borowiecki wrote:
>>>
>>> Hi Jeyhun,
>>>
>>> I understand your argument about "Rich" in RichFunctions. Perhaps
>>> I'm just being too puritan here, but let me ask this anyway:
>>>
>>> What is it that makes something a function? To me a function is
>>> something that takes zero or more arguments and possibly returns a
>>> value and while it may have side-effects (as opposed to "pure
>>> functions" which can't), it doesn't have any life-cycle of its own.
>>> This is what, in my mind, distinguishes the concept of a "function"
>>> from that of more vaguely defined concepts.
>>>
>>> So if we add a life-cycle to a function, in that understanding, it
>>> doesn't become a rich function but instead stops being a function
>>> altogether.
>>>
>>> You could say it's "just semantics" but to me precise use of
>>> language in the given context is an important foundation for good
>>> engineering. And in the context of programming "function" has a
>>> precise meaning. Of course we can say that in the context of Kafka
>>> Streams "function" has a different, looser meaning but I'd argue
>>> that won't do anyone any good.
>>>
>>> On the other hand other frameworks such as Flink use this
>>> terminology, so it could be that consistency is the reason. I'm
>>> guessing that's why the name was proposed in the first place. My
>>> point 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-27 Thread Jeyhun Karimov
Hi,

Thanks for your comments. I will refer to the overall approach as rich
functions until we find a better name.

I think there are some pros and cons of the approach you described.

The pros are that it is simple, has clear boundaries, and avoids
misunderstanding of the term "function".
So you propose something like:
KStream.valueMapper(ValueMapper vm, RecordContext rc)
or
having rich functions with only a single init(RecordContext rc) method.

The cons are that:
 - This will bring another set of overloads (if we use RecordContext as a
separate parameter). We should consider that the rich functions will exist
for all the main interfaces.
 - I don't think that we need lambdas in rich functions. It is by
definition "rich", so there is no single method in the interface and, as a
result, no lambdas.
 - I disagree that rich functions should only contain an init() method.
This depends on the interface. For example, for specific interfaces we can
add methods (like punctuate()) to their rich functions (see the sketch
below).
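
A sketch of the kind of per-interface rich variant described in the last
point (hypothetical shape; whether punctuate() belongs here is exactly
what is debated elsewhere in this thread):

// Hypothetical per-interface rich variant: besides init()/close(), a
// specific interface could carry additional callbacks such as punctuate().
interface RichValueJoiner<V1, V2, VR> {
    void init(Object params);          // placeholder for whatever params are passed
    VR apply(V1 value1, V2 value2);
    void punctuate(long timestamp);    // extra, interface-specific method
    void close();
}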


Cheers,
Jeyhun



On Thu, May 25, 2017 at 1:02 AM Matthias J. Sax 
wrote:

> I confess, the term is borrowed from Flink :)
>
> Personally, I never thought about it, but I tend to agree with Michal. I
> also want to clarify, that the main purpose is the ability to access
> record metadata. Thus, it might even be sufficient to only have "init".
>
> An alternative would of course be, to pass in the RecordContext as
> method parameter. This would allow us to drop "init()". This might even
> allow to use Lambdas and we could keep the name RichFunction as we
> preserve the nature of being a function.
>
>
> -Matthias
>
> On 5/24/17 12:13 PM, Jeyhun Karimov wrote:
> > Hi Michal,
> >
> > Thanks for your comments. I see your point and I agree with it. However,
> > I don't have a better idea for naming. I checked MR source code. There
> > it is used JobConfigurable and Closable, two different interfaces. Maybe
> > we can rename RichFunction as Configurable?
> >
> >
> > Cheers,
> > Jeyhun
> >
> > On Tue, May 23, 2017 at 2:58 PM Michal Borowiecki wrote:
> >
> > Hi Jeyhun,
> >
> > I understand your argument about "Rich" in RichFunctions. Perhaps
> > I'm just being too puritan here, but let me ask this anyway:
> >
> > What is it that makes something a function? To me a function is
> > something that takes zero or more arguments and possibly returns a
> > value and while it may have side-effects (as opposed to "pure
> > functions" which can't), it doesn't have any life-cycle of its own.
> > This is what, in my mind, distinguishes the concept of a "function"
> > from that of more vaguely defined concepts.
> >
> > So if we add a life-cycle to a function, in that understanding, it
> > doesn't become a rich function but instead stops being a function
> > altogether.
> >
> > You could say it's "just semantics" but to me precise use of
> > language in the given context is an important foundation for good
> > engineering. And in the context of programming "function" has a
> > precise meaning. Of course we can say that in the context of Kafka
> > Streams "function" has a different, looser meaning but I'd argue
> > that won't do anyone any good.
> >
> > On the other hand other frameworks such as Flink use this
> > terminology, so it could be that consistency is the reason. I'm
> > guessing that's why the name was proposed in the first place. My
> > point is simply that it's a poor choice of wording and Kafka Streams
> > don't have to follow that to the letter.
> >
> > Cheers,
> >
> > Michal
> >
> >
> > On 23/05/17 13:26, Jeyhun Karimov wrote:
> >> Hi Michal,
> >>
> >> Thanks for your comments.
> >>
> >>
> >> To me at least it feels strange that something is called a
> >> function yet doesn't follow the functional interface
> >> definition of having just one abstract method. I suppose init
> >> and close could be made default methods with empty bodies once
> >> Java 7 support is dropped to mitigate that concern. Still, I
> >> feel some resistance to consider something that requires
> >> initialisation and closing (which implies holding state) as
> >> being a function. Sounds more like the Processor/Transformer
> >> kind of thing semantically, rather than a function.
> >>
> >>
> >>  -  If we called the interface name only Function your assumptions
> >> will hold. However, the keyword Rich by definition implies that we
> >> have a function (as you described, with one abstract method and
> >> etc) but it is rich. So, there are multiple methods in it.
> >> Ideally it should be:
> >>
> >> public interface RichFunction extends Function {  // this
> >> is the Function that you described
> >>   void close();
> >>   void init(Some params);
> >>...
> >> }
> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-24 Thread Matthias J. Sax
I confess, the term is borrowed from Flink :)

Personally, I never thought about it, but I tend to agree with Michal. I
also want to clarify that the main purpose is the ability to access
record metadata. Thus, it might even be sufficient to only have "init".

An alternative would, of course, be to pass in the RecordContext as a
method parameter. This would allow us to drop "init()". It might even
allow the use of lambdas, and we could keep the name RichFunction, as we
preserve the nature of being a function.


-Matthias

On 5/24/17 12:13 PM, Jeyhun Karimov wrote:
> Hi Michal,
> 
> Thanks for your comments. I see your point and I agree with it. However,
> I don't have a better idea for naming. I checked MR source code. There
> it is used JobConfigurable and Closable, two different interfaces. Maybe
> we can rename RichFunction as Configurable? 
> 
> 
> Cheers,
> Jeyhun
> 
> On Tue, May 23, 2017 at 2:58 PM Michal Borowiecki wrote:
> 
> Hi Jeyhun,
> 
> I understand your argument about "Rich" in RichFunctions. Perhaps
> I'm just being too puritan here, but let me ask this anyway:
> 
> What is it that makes something a function? To me a function is
> something that takes zero or more arguments and possibly returns a
> value and while it may have side-effects (as opposed to "pure
> functions" which can't), it doesn't have any life-cycle of its own.
> This is what, in my mind, distinguishes the concept of a "function"
> from that of more vaguely defined concepts.
> 
> So if we add a life-cycle to a function, in that understanding, it
> doesn't become a rich function but instead stops being a function
> altogether.
> 
> You could say it's "just semantics" but to me precise use of
> language in the given context is an important foundation for good
> engineering. And in the context of programming "function" has a
> precise meaning. Of course we can say that in the context of Kafka
> Streams "function" has a different, looser meaning but I'd argue
> that won't do anyone any good.
> 
> On the other hand other frameworks such as Flink use this
> terminology, so it could be that consistency is the reason. I'm
> guessing that's why the name was proposed in the first place. My
> point is simply that it's a poor choice of wording and Kafka Streams
> don't have to follow that to the letter.
> 
> Cheers,
> 
> Michal
> 
> 
> On 23/05/17 13:26, Jeyhun Karimov wrote:
>> Hi Michal,
>>
>> Thanks for your comments.
>>
>>
>> To me at least it feels strange that something is called a
>> function yet doesn't follow the functional interface
>> definition of having just one abstract method. I suppose init
>> and close could be made default methods with empty bodies once
>> Java 7 support is dropped to mitigate that concern. Still, I
>> feel some resistance to consider something that requires
>> initialisation and closing (which implies holding state) as
>> being a function. Sounds more like the Processor/Transformer
>> kind of thing semantically, rather than a function. 
>>
>>
>>  -  If we called the interface name only Function your assumptions
>> will hold. However, the keyword Rich by definition implies that we
>> have a function (as you described, with one abstract method and
>> etc) but it is rich. So, there are multiple methods in it. 
>> Ideally it should be:
>>
>> public interface RichFunction extends Function {  // this
>> is the Function that you described
>>   void close();
>>   void init(Some params);
>>...
>> }
>>
>>
>> The KIP says there are multiple use-cases for this but doesn't
>> enumerate any - I think some examples would be useful,
>> otherwise that section sounds a little bit vague. 
>>
>>
>> I thought it is obvious by definition but I will update it. Thanks. 
>>
>>
>> IMHO, it's the access to the RecordContext is where the added
>> value lies but maybe I'm just lacking in imagination, so I'm
>> asking all this to better understand the rationale for init()
>> and close().
>>
>>
>> Maybe I should add some examples. Thanks. 
>>
>>
>> Cheers,
>> Jeyhun
>>
>> On Mon, May 22, 2017 at 11:02 AM, Michal Borowiecki wrote:
>>
>> Hi Jeyhun,
>>
>> I'd like to understand better the premise of RichFunctions and
>> why |init(Some params)|,| close() |are said to be needed.
>>
>> To me at least it feels strange that something is called a
>> function yet doesn't follow the functional interface
>> definition of having just one abstract method. I suppose init
>> and close could be made default methods with empty 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-24 Thread Jeyhun Karimov
Hi Michal,

Thanks for your comments. I see your point and I agree with it. However, I
don't have a better idea for the naming. I checked the MapReduce source
code; it uses JobConfigurable and Closeable, two different interfaces.
Maybe we can rename RichFunction to Configurable?
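
As a rough sketch of that MapReduce-style split (hypothetical names; the
Hadoop interfaces are only the inspiration), the lifecycle concerns could
live in two tiny interfaces that a rich variant simply extends:

// Hypothetical sketch of splitting the lifecycle into two small interfaces.
interface Configurable {
    void init(Object params);   // placeholder for whatever configuration is passed
}

interface Closeable {
    void close();
}

// A "rich" mapper would then just compose the pieces:
interface RichValueMapper<V, VR> extends Configurable, Closeable {
    VR apply(V value);
}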


Cheers,
Jeyhun

On Tue, May 23, 2017 at 2:58 PM Michal Borowiecki <
michal.borowie...@openbet.com> wrote:

> Hi Jeyhun,
>
> I understand your argument about "Rich" in RichFunctions. Perhaps I'm just
> being too puritan here, but let me ask this anyway:
>
> What is it that makes something a function? To me a function is something
> that takes zero or more arguments and possibly returns a value and while it
> may have side-effects (as opposed to "pure functions" which can't), it
> doesn't have any life-cycle of its own. This is what, in my mind,
> distinguishes the concept of a "function" from that of more vaguely defined
> concepts.
>
> So if we add a life-cycle to a function, in that understanding, it doesn't
> become a rich function but instead stops being a function altogether.
>
> You could say it's "just semantics" but to me precise use of language in
> the given context is an important foundation for good engineering. And in
> the context of programming "function" has a precise meaning. Of course we
> can say that in the context of Kafka Streams "function" has a different,
> looser meaning but I'd argue that won't do anyone any good.
>
> On the other hand other frameworks such as Flink use this terminology, so
> it could be that consistency is the reason. I'm guessing that's why the
> name was proposed in the first place. My point is simply that it's a poor
> choice of wording and Kafka Streams don't have to follow that to the letter.
>
> Cheers,
>
> Michal
>
> On 23/05/17 13:26, Jeyhun Karimov wrote:
>
> Hi Michal,
>
> Thanks for your comments.
>
>
> To me at least it feels strange that something is called a function yet
>> doesn't follow the functional interface definition of having just one
>> abstract method. I suppose init and close could be made default methods
>> with empty bodies once Java 7 support is dropped to mitigate that concern.
>> Still, I feel some resistance to consider something that requires
>> initialisation and closing (which implies holding state) as being a
>> function. Sounds more like the Processor/Transformer kind of thing
>> semantically, rather than a function.
>
>
>  -  If we called the interface name only Function your assumptions will
> hold. However, the keyword Rich by definition implies that we have a
> function (as you described, with one abstract method and etc) but it is
> rich. So, there are multiple methods in it.
> Ideally it should be:
>
> public interface RichFunction extends Function {  // this is the
> Function that you described
>   void close();
>   void init(Some params);
>...
> }
>
>
> The KIP says there are multiple use-cases for this but doesn't enumerate
>> any - I think some examples would be useful, otherwise that section sounds
>> a little bit vague.
>
>
> I thought it is obvious by definition but I will update it. Thanks.
>
>
> IMHO, it's the access to the RecordContext is where the added value lies
>> but maybe I'm just lacking in imagination, so I'm asking all this to better
>> understand the rationale for init() and close().
>
>
> Maybe I should add some examples. Thanks.
>
>
> Cheers,
> Jeyhun
>
> On Mon, May 22, 2017 at 11:02 AM, Michal Borowiecki <
> michal.borowie...@openbet.com> wrote:
>
>> Hi Jeyhun,
>>
>> I'd like to understand better the premise of RichFunctions and why init(Some
>> params), close() are said to be needed.
>> To me at least it feels strange that something is called a function yet
>> doesn't follow the functional interface definition of having just one
>> abstract method. I suppose init and close could be made default methods
>> with empty bodies once Java 7 support is dropped to mitigate that concern.
>> Still, I feel some resistance to consider something that requires
>> initialisation and closing (which implies holding state) as being a
>> function. Sounds more like the Processor/Transformer kind of thing
>> semantically, rather than a function.
>>
>> The KIP says there are multiple use-cases for this but doesn't enumerate
>> any - I think some examples would be useful, otherwise that section sounds
>> a little bit vague.
>>
>> IMHO, it's the access to the RecordContext is where the added value lies
>> but maybe I'm just lacking in imagination, so I'm asking all this to better
>> understand the rationale for init() and close().
>>
>> Thanks,
>> Michał
>>
>> On 20/05/17 17:05, Jeyhun Karimov wrote:
>>
>> Dear community,
>>
>> As we discussed in KIP-149 [DISCUSS] thread [1], I would like to initiate
>> KIP for rich functions (interfaces) [2].
>> I would like to get your comments.
>>
>>
>> [1]http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+and+ValueJoiner
>> 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-23 Thread Michal Borowiecki

Hi Jeyhun,

I understand your argument about "Rich" in RichFunctions. Perhaps I'm 
just being too puritan here, but let me ask this anyway:


What is it that makes something a function? To me a function is 
something that takes zero or more arguments and possibly returns a value 
and while it may have side-effects (as opposed to "pure functions" which 
can't), it doesn't have any life-cycle of its own. This is what, in my 
mind, distinguishes the concept of a "function" from that of more 
vaguely defined concepts.


So if we add a life-cycle to a function, in that understanding, it 
doesn't become a rich function but instead stops being a function 
altogether.


You could say it's "just semantics" but to me precise use of language in 
the given context is an important foundation for good engineering. And 
in the context of programming "function" has a precise meaning. Of 
course we can say that in the context of Kafka Streams "function" has a 
different, looser meaning but I'd argue that won't do anyone any good.


On the other hand other frameworks such as Flink use this terminology, 
so it could be that consistency is the reason. I'm guessing that's why 
the name was proposed in the first place. My point is simply that it's a 
poor choice of wording and Kafka Streams don't have to follow that to 
the letter.


Cheers,

Michal


On 23/05/17 13:26, Jeyhun Karimov wrote:

Hi Michal,

Thanks for your comments.


To me at least it feels strange that something is called a
function yet doesn't follow the functional interface definition of
having just one abstract method. I suppose init and close could be
made default methods with empty bodies once Java 7 support is
dropped to mitigate that concern. Still, I feel some resistance to
consider something that requires initialisation and closing (which
implies holding state) as being a function. Sounds more like the
Processor/Transformer kind of thing semantically, rather than a
function. 



 -  If we called the interface name only Function your assumptions 
will hold. However, the keyword Rich by definition implies that we 
have a function (as you described, with one abstract method and etc) 
but it is rich. So, there are multiple methods in it.

Ideally it should be:

public interface RichFunction extends Function {  // this is the 
Function that you described

  void close();
  void init(Some params);
   ...
}


The KIP says there are multiple use-cases for this but doesn't
enumerate any - I think some examples would be useful, otherwise
that section sounds a little bit vague. 



I thought it is obvious by definition but I will update it. Thanks.


IMHO, it's the access to the RecordContext is where the added
value lies but maybe I'm just lacking in imagination, so I'm
asking all this to better understand the rationale for init() and
close().


Maybe I should add some examples. Thanks.


Cheers,
Jeyhun

On Mon, May 22, 2017 at 11:02 AM, Michal Borowiecki wrote:


Hi Jeyhun,

I'd like to understand better the premise of RichFunctions and why
|init(Some params)|,|close() |are said to be needed.

To me at least it feels strange that something is called a
function yet doesn't follow the functional interface definition of
having just one abstract method. I suppose init and close could be
made default methods with empty bodies once Java 7 support is
dropped to mitigate that concern. Still, I feel some resistance to
consider something that requires initialisation and closing (which
implies holding state) as being a function. Sounds more like the
Processor/Transformer kind of thing semantically, rather than a
function.

The KIP says there are multiple use-cases for this but doesn't
enumerate any - I think some examples would be useful, otherwise
that section sounds a little bit vague.

IMHO, it's the access to the RecordContext is where the added
value lies but maybe I'm just lacking in imagination, so I'm
asking all this to better understand the rationale for init() and
close().

Thanks,
Michał

On 20/05/17 17:05, Jeyhun Karimov wrote:

Dear community,

As we discussed in KIP-149 [DISCUSS] thread [1], I would like to initiate
KIP for rich functions (interfaces) [2].
I would like to get your comments.


[1]

http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+and+ValueJoiner


[2]

https://cwiki.apache.org/confluence/display/KAFKA/KIP-159%3A+Introducing+Rich+functions+to+Streams





Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-23 Thread Jeyhun Karimov
Hi Michal,

Thanks for your comments.


To me at least it feels strange that something is called a function yet
> doesn't follow the functional interface definition of having just one
> abstract method. I suppose init and close could be made default methods
> with empty bodies once Java 7 support is dropped to mitigate that concern.
> Still, I feel some resistance to consider something that requires
> initialisation and closing (which implies holding state) as being a
> function. Sounds more like the Processor/Transformer kind of thing
> semantically, rather than a function.


 -  If we had called the interface just Function, your assumptions would
hold. However, the keyword Rich by definition implies that we have a
function (as you described, with one abstract method, etc.) but a rich
one, so there are multiple methods in it.
Ideally it would look like:

public interface RichFunction extends Function {  // the Function that you described
  void close();
  void init(Some params);
  ...
}


The KIP says there are multiple use-cases for this but doesn't enumerate
> any - I think some examples would be useful, otherwise that section sounds
> a little bit vague.


I thought it was obvious by definition, but I will update it. Thanks.


IMHO, it's the access to the RecordContext is where the added value lies
> but maybe I'm just lacking in imagination, so I'm asking all this to better
> understand the rationale for init() and close().


Maybe I should add some examples. Thanks.
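
For instance, one way to illustrate the kind of use case in question with
the API that exists today is metadata access via transformValues(). A
minimal sketch, assuming a recent Kafka Streams version where
ValueTransformer declares only init(), transform(), and close():

import org.apache.kafka.streams.kstream.ValueTransformer;
import org.apache.kafka.streams.processor.ProcessorContext;

// Tags each value with the topic/partition/offset/timestamp of its source
// record, i.e. the kind of metadata access rich functions aim to expose on
// plain mappers as well.
public class MetadataTaggingTransformer implements ValueTransformer<String, String> {

    private ProcessorContext context;

    @Override
    public void init(final ProcessorContext context) {
        this.context = context;
    }

    @Override
    public String transform(final String value) {
        return value + " (" + context.topic() + "-" + context.partition()
                + "@" + context.offset() + ", ts=" + context.timestamp() + ")";
    }

    @Override
    public void close() { }
}

// usage, given a KStream<K, String> stream:
// stream.transformValues(MetadataTaggingTransformer::new);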


Cheers,
Jeyhun

On Mon, May 22, 2017 at 11:02 AM, Michal Borowiecki <
michal.borowie...@openbet.com> wrote:

> Hi Jeyhun,
>
> I'd like to understand better the premise of RichFunctions and why init(Some
> params), close() are said to be needed.
> To me at least it feels strange that something is called a function yet
> doesn't follow the functional interface definition of having just one
> abstract method. I suppose init and close could be made default methods
> with empty bodies once Java 7 support is dropped to mitigate that concern.
> Still, I feel some resistance to consider something that requires
> initialisation and closing (which implies holding state) as being a
> function. Sounds more like the Processor/Transformer kind of thing
> semantically, rather than a function.
>
> The KIP says there are multiple use-cases for this but doesn't enumerate
> any - I think some examples would be useful, otherwise that section sounds
> a little bit vague.
>
> IMHO, it's the access to the RecordContext is where the added value lies
> but maybe I'm just lacking in imagination, so I'm asking all this to better
> understand the rationale for init() and close().
>
> Thanks,
> Michał
>
> On 20/05/17 17:05, Jeyhun Karimov wrote:
>
> Dear community,
>
> As we discussed in KIP-149 [DISCUSS] thread [1], I would like to initiate
> KIP for rich functions (interfaces) [2].
> I would like to get your comments.
>
>
> [1]http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+and+ValueJoiner
> [2]https://cwiki.apache.org/confluence/display/KAFKA/KIP-159%3A+Introducing+Rich+functions+to+Streams
>
>
> Cheers,
> Jeyhun
>
>
> --
> Michal Borowiecki
> Senior Software Engineer L4
> T: +44 208 742 1600 / +44 203 249 8448
> E: michal.borowie...@openbet.com
> W: www.openbet.com
> OpenBet Ltd, Chiswick Park Building 9, 566 Chiswick High Rd, London, W4 5XT, UK


Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-22 Thread Michal Borowiecki

Hi Jeyhun,

I'd like to understand better the premise of RichFunctions and why
init(Some params) and close() are said to be needed.


To me at least it feels strange that something is called a function yet 
doesn't follow the functional interface definition of having just one 
abstract method. I suppose init and close could be made default methods 
with empty bodies once Java 7 support is dropped to mitigate that 
concern. Still, I feel some resistance to consider something that 
requires initialisation and closing (which implies holding state) as 
being a function. Sounds more like the Processor/Transformer kind of 
thing semantically, rather than a function.
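
A quick sketch of that default-method idea (illustrative only, with a
hypothetical RichValueMapper and a minimal stand-in RecordContext): with
empty default bodies, the interface keeps a single abstract method, so it
would still be usable as a lambda:

// Illustrative sketch: with Java 8 default methods, init() and close() get
// empty bodies, leaving apply() as the only abstract method, so the
// interface stays a functional interface and can still be a lambda.
interface RecordContext {
    long offset();
}

@FunctionalInterface
interface RichValueMapper<V, VR> {
    VR apply(V value);

    default void init(RecordContext context) { }

    default void close() { }
}

// e.g.  RichValueMapper<String, Integer> mapper = String::length;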


The KIP says there are multiple use-cases for this but doesn't enumerate 
any - I think some examples would be useful, otherwise that section 
sounds a little bit vague.


IMHO, it's the access to the RecordContext where the added value lies,
but maybe I'm just lacking in imagination, so I'm asking all this to
better understand the rationale for init() and close().


Thanks,
Michał

On 20/05/17 17:05, Jeyhun Karimov wrote:

Dear community,

As we discussed in KIP-149 [DISCUSS] thread [1], I would like to initiate
KIP for rich functions (interfaces) [2].
I would like to get your comments.


[1]
http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+and+ValueJoiner
[2]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-159%3A+Introducing+Rich+functions+to+Streams


Cheers,
Jeyhun


--
Michal Borowiecki
Senior Software Engineer L4
T: +44 208 742 1600 / +44 203 249 8448
E: michal.borowie...@openbet.com
W: www.openbet.com
OpenBet Ltd, Chiswick Park Building 9, 566 Chiswick High Rd, London, W4 5XT, UK



