[jira] [Updated] (KAFKA-3543) Allow a variant of transform() which can emit multiple values

2017-06-07 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3543:
---
Fix Version/s: (was: 0.11.0.0)

> Allow a variant of transform() which can emit multiple values
> -
>
> Key: KAFKA-3543
> URL: https://issues.apache.org/jira/browse/KAFKA-3543
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>  Labels: api
>
> Right now it seems that if you want to apply an arbitrary stateful 
> transformation to a stream, you either have to use a TransformerSupplier or 
> ProcessorSupplier sent to transform() or process(). The custom processor will 
> allow you to emit multiple new values, but the process() method currently 
> terminates that branch of the topology so you can't apply additional data 
> flow. transform() lets you continue the data flow, but forces you to emit a 
> single value for every input value.
> (It actually doesn't quite force you to do this, since you can hold onto the 
> ProcessorContext and emit multiple, but that's probably not the ideal way to 
> do it :))
> It seems desirable to somehow allow a transformation that emits multiple 
> values per input value. I'm not sure of the best way to factor this inside of 
> the current TransformerSupplier/Transformer architecture in a way that is 
> clean and efficient -- currently I'm doing the workaround above of just 
> calling forward() myself on the context and actually emitting dummy values 
> which are filtered out downstream.
> -
> It is worth considering adding a new flatTransofrm function as 
> {code}
>  KStream transform(TransformerSupplier Iterable>> transformerSupplier, String... stateStoreNames)
> {code}
> which is essentially the same as
> {code} transform().flatMap() {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3543) Allow a variant of transform() which can emit multiple values

2016-09-12 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-3543:
---
Fix Version/s: (was: 0.10.1.0)
   0.10.2.0

> Allow a variant of transform() which can emit multiple values
> -
>
> Key: KAFKA-3543
> URL: https://issues.apache.org/jira/browse/KAFKA-3543
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>  Labels: api
> Fix For: 0.10.2.0
>
>
> Right now it seems that if you want to apply an arbitrary stateful 
> transformation to a stream, you either have to use a TransformerSupplier or 
> ProcessorSupplier sent to transform() or process(). The custom processor will 
> allow you to emit multiple new values, but the process() method currently 
> terminates that branch of the topology so you can't apply additional data 
> flow. transform() lets you continue the data flow, but forces you to emit a 
> single value for every input value.
> (It actually doesn't quite force you to do this, since you can hold onto the 
> ProcessorContext and emit multiple, but that's probably not the ideal way to 
> do it :))
> It seems desirable to somehow allow a transformation that emits multiple 
> values per input value. I'm not sure of the best way to factor this inside of 
> the current TransformerSupplier/Transformer architecture in a way that is 
> clean and efficient -- currently I'm doing the workaround above of just 
> calling forward() myself on the context and actually emitting dummy values 
> which are filtered out downstream.
> -
> It is worth considering adding a new flatTransofrm function as 
> {code}
>  KStream transform(TransformerSupplier Iterable>> transformerSupplier, String... stateStoreNames)
> {code}
> which is essentially the same as
> {code} transform().flatMap() {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3543) Allow a variant of transform() which can emit multiple values

2016-06-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3543:
-
Assignee: (was: Guozhang Wang)

> Allow a variant of transform() which can emit multiple values
> -
>
> Key: KAFKA-3543
> URL: https://issues.apache.org/jira/browse/KAFKA-3543
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>  Labels: api
> Fix For: 0.10.1.0
>
>
> Right now it seems that if you want to apply an arbitrary stateful 
> transformation to a stream, you either have to use a TransformerSupplier or 
> ProcessorSupplier sent to transform() or process(). The custom processor will 
> allow you to emit multiple new values, but the process() method currently 
> terminates that branch of the topology so you can't apply additional data 
> flow. transform() lets you continue the data flow, but forces you to emit a 
> single value for every input value.
> (It actually doesn't quite force you to do this, since you can hold onto the 
> ProcessorContext and emit multiple, but that's probably not the ideal way to 
> do it :))
> It seems desirable to somehow allow a transformation that emits multiple 
> values per input value. I'm not sure of the best way to factor this inside of 
> the current TransformerSupplier/Transformer architecture in a way that is 
> clean and efficient -- currently I'm doing the workaround above of just 
> calling forward() myself on the context and actually emitting dummy values 
> which are filtered out downstream.
> -
> It is worth considering adding a new flatTransofrm function as 
> {code}
>  KStream transform(TransformerSupplier Iterable>> transformerSupplier, String... stateStoreNames)
> {code}
> which is essentially the same as
> {code} transform().flatMap() {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3543) Allow a variant of transform() which can emit multiple values

2016-04-11 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3543:
-
Description: 
Right now it seems that if you want to apply an arbitrary stateful 
transformation to a stream, you either have to use a TransformerSupplier or 
ProcessorSupplier sent to transform() or process(). The custom processor will 
allow you to emit multiple new values, but the process() method currently 
terminates that branch of the topology so you can't apply additional data flow. 
transform() lets you continue the data flow, but forces you to emit a single 
value for every input value.

(It actually doesn't quite force you to do this, since you can hold onto the 
ProcessorContext and emit multiple, but that's probably not the ideal way to do 
it :))

It seems desirable to somehow allow a transformation that emits multiple values 
per input value. I'm not sure of the best way to factor this inside of the 
current TransformerSupplier/Transformer architecture in a way that is clean and 
efficient -- currently I'm doing the workaround above of just calling forward() 
myself on the context and actually emitting dummy values which are filtered out 
downstream.

-

It is worth considering adding a new flatTransofrm function as 

{code}
 KStream transform(TransformerSupplier>> transformerSupplier, String... stateStoreNames)
{code}

which is essentially the same as

{code} transform().flatMap() {code}

  was:
Right now it seems that if you want to apply an arbitrary stateful 
transformation to a stream, you either have to use a TransformerSupplier or 
ProcessorSupplier sent to transform() or process(). The custom processor will 
allow you to emit multiple new values, but the process() method currently 
terminates that branch of the topology so you can't apply additional data flow. 
transform() lets you continue the data flow, but forces you to emit a single 
value for every input value.

(It actually doesn't quite force you to do this, since you can hold onto the 
ProcessorContext and emit multiple, but that's probably not the ideal way to do 
it :))

It seems desirable to somehow allow a transformation that emits multiple values 
per input value. I'm not sure of the best way to factor this inside of the 
current TransformerSupplier/Transformer architecture in a way that is clean and 
efficient -- currently I'm doing the workaround above of just calling forward() 
myself on the context and actually emitting dummy values which are filtered out 
downstream.

-

It is worth considering adding a new flatTransofrm


> Allow a variant of transform() which can emit multiple values
> -
>
> Key: KAFKA-3543
> URL: https://issues.apache.org/jira/browse/KAFKA-3543
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>  Labels: api
> Fix For: 0.10.1.0
>
>
> Right now it seems that if you want to apply an arbitrary stateful 
> transformation to a stream, you either have to use a TransformerSupplier or 
> ProcessorSupplier sent to transform() or process(). The custom processor will 
> allow you to emit multiple new values, but the process() method currently 
> terminates that branch of the topology so you can't apply additional data 
> flow. transform() lets you continue the data flow, but forces you to emit a 
> single value for every input value.
> (It actually doesn't quite force you to do this, since you can hold onto the 
> ProcessorContext and emit multiple, but that's probably not the ideal way to 
> do it :))
> It seems desirable to somehow allow a transformation that emits multiple 
> values per input value. I'm not sure of the best way to factor this inside of 
> the current TransformerSupplier/Transformer architecture in a way that is 
> clean and efficient -- currently I'm doing the workaround above of just 
> calling forward() myself on the context and actually emitting dummy values 
> which are filtered out downstream.
> -
> It is worth considering adding a new flatTransofrm function as 
> {code}
>  KStream transform(TransformerSupplier Iterable>> transformerSupplier, String... stateStoreNames)
> {code}
> which is essentially the same as
> {code} transform().flatMap() {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3543) Allow a variant of transform() which can emit multiple values

2016-04-11 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3543:
-
Labels: api  (was: )

> Allow a variant of transform() which can emit multiple values
> -
>
> Key: KAFKA-3543
> URL: https://issues.apache.org/jira/browse/KAFKA-3543
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>  Labels: api
> Fix For: 0.10.1.0
>
>
> Right now it seems that if you want to apply an arbitrary stateful 
> transformation to a stream, you either have to use a TransformerSupplier or 
> ProcessorSupplier sent to transform() or process(). The custom processor will 
> allow you to emit multiple new values, but the process() method currently 
> terminates that branch of the topology so you can't apply additional data 
> flow. transform() lets you continue the data flow, but forces you to emit a 
> single value for every input value.
> (It actually doesn't quite force you to do this, since you can hold onto the 
> ProcessorContext and emit multiple, but that's probably not the ideal way to 
> do it :))
> It seems desirable to somehow allow a transformation that emits multiple 
> values per input value. I'm not sure of the best way to factor this inside of 
> the current TransformerSupplier/Transformer architecture in a way that is 
> clean and efficient -- currently I'm doing the workaround above of just 
> calling forward() myself on the context and actually emitting dummy values 
> which are filtered out downstream.
> -
> It is worth considering adding a new flatTransofrm function as 
> {code}
>  KStream transform(TransformerSupplier Iterable>> transformerSupplier, String... stateStoreNames)
> {code}
> which is essentially the same as
> {code} transform().flatMap() {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3543) Allow a variant of transform() which can emit multiple values

2016-04-11 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3543:
-
Fix Version/s: 0.10.1.0

> Allow a variant of transform() which can emit multiple values
> -
>
> Key: KAFKA-3543
> URL: https://issues.apache.org/jira/browse/KAFKA-3543
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>  Labels: api
> Fix For: 0.10.1.0
>
>
> Right now it seems that if you want to apply an arbitrary stateful 
> transformation to a stream, you either have to use a TransformerSupplier or 
> ProcessorSupplier sent to transform() or process(). The custom processor will 
> allow you to emit multiple new values, but the process() method currently 
> terminates that branch of the topology so you can't apply additional data 
> flow. transform() lets you continue the data flow, but forces you to emit a 
> single value for every input value.
> (It actually doesn't quite force you to do this, since you can hold onto the 
> ProcessorContext and emit multiple, but that's probably not the ideal way to 
> do it :))
> It seems desirable to somehow allow a transformation that emits multiple 
> values per input value. I'm not sure of the best way to factor this inside of 
> the current TransformerSupplier/Transformer architecture in a way that is 
> clean and efficient -- currently I'm doing the workaround above of just 
> calling forward() myself on the context and actually emitting dummy values 
> which are filtered out downstream.
> -
> It is worth considering adding a new flatTransofrm function as 
> {code}
>  KStream transform(TransformerSupplier Iterable>> transformerSupplier, String... stateStoreNames)
> {code}
> which is essentially the same as
> {code} transform().flatMap() {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3543) Allow a variant of transform() which can emit multiple values

2016-04-11 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3543:
-
Description: 
Right now it seems that if you want to apply an arbitrary stateful 
transformation to a stream, you either have to use a TransformerSupplier or 
ProcessorSupplier sent to transform() or process(). The custom processor will 
allow you to emit multiple new values, but the process() method currently 
terminates that branch of the topology so you can't apply additional data flow. 
transform() lets you continue the data flow, but forces you to emit a single 
value for every input value.

(It actually doesn't quite force you to do this, since you can hold onto the 
ProcessorContext and emit multiple, but that's probably not the ideal way to do 
it :))

It seems desirable to somehow allow a transformation that emits multiple values 
per input value. I'm not sure of the best way to factor this inside of the 
current TransformerSupplier/Transformer architecture in a way that is clean and 
efficient -- currently I'm doing the workaround above of just calling forward() 
myself on the context and actually emitting dummy values which are filtered out 
downstream.

-

It is worth considering adding a new flatTransofrm

  was:
Right now it seems that if you want to apply an arbitrary stateful 
transformation to a stream, you either have to use a TransformerSupplier or 
ProcessorSupplier sent to transform() or process(). The custom processor will 
allow you to emit multiple new values, but the process() method currently 
terminates that branch of the topology so you can't apply additional data flow. 
transform() lets you continue the data flow, but forces you to emit a single 
value for every input value.

(It actually doesn't quite force you to do this, since you can hold onto the 
ProcessorContext and emit multiple, but that's probably not the ideal way to do 
it :))

It seems desirable to somehow allow a transformation that emits multiple values 
per input value. I'm not sure of the best way to factor this inside of the 
current TransformerSupplier/Transformer architecture in a way that is clean and 
efficient -- currently I'm doing the workaround above of just calling forward() 
myself on the context and actually emitting dummy values which are filtered out 
downstream.


> Allow a variant of transform() which can emit multiple values
> -
>
> Key: KAFKA-3543
> URL: https://issues.apache.org/jira/browse/KAFKA-3543
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>
> Right now it seems that if you want to apply an arbitrary stateful 
> transformation to a stream, you either have to use a TransformerSupplier or 
> ProcessorSupplier sent to transform() or process(). The custom processor will 
> allow you to emit multiple new values, but the process() method currently 
> terminates that branch of the topology so you can't apply additional data 
> flow. transform() lets you continue the data flow, but forces you to emit a 
> single value for every input value.
> (It actually doesn't quite force you to do this, since you can hold onto the 
> ProcessorContext and emit multiple, but that's probably not the ideal way to 
> do it :))
> It seems desirable to somehow allow a transformation that emits multiple 
> values per input value. I'm not sure of the best way to factor this inside of 
> the current TransformerSupplier/Transformer architecture in a way that is 
> clean and efficient -- currently I'm doing the workaround above of just 
> calling forward() myself on the context and actually emitting dummy values 
> which are filtered out downstream.
> -
> It is worth considering adding a new flatTransofrm



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3543) Allow a variant of transform() which can emit multiple values

2016-04-11 Thread Greg Fodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Fodor updated KAFKA-3543:
--
Summary: Allow a variant of transform() which can emit multiple values  
(was: Allow a variant of transform() which allows emitting multiple values)

> Allow a variant of transform() which can emit multiple values
> -
>
> Key: KAFKA-3543
> URL: https://issues.apache.org/jira/browse/KAFKA-3543
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>
> Right now it seems that if you want to apply an arbitrary transformation to a 
> stream, you either have to use a TransformerSupplier or ProcessorSupplier 
> sent to transform() or process(). The custom processor will allow you to emit 
> multiple new values, but the process() method currently terminates that 
> branch of the topology so you can't apply additional data flow. transform() 
> lets you continue the data flow, but forces you to emit a single value for 
> every input value.
> (It actually doesn't quite force you to do this, since you can hold onto the 
> ProcessorContext and emit multiple, but that's probably not the ideal way to 
> do it :))
> It seems desirable to somehow allow a transformation that emits multiple 
> values per input value. I'm not sure of the best way to factor this inside of 
> the current TransformerSupplier/Transformer architecture in a way that is 
> clean and efficient -- currently I'm doing the workaround above of just 
> calling forward() myself on the context and actually emitting dummy values 
> which are filtered out downstream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)