[jira] [Assigned] (KAFKA-2939) Make AbstractConfig.logUnused() tunable for clients

2019-01-23 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-2939:
--

Assignee: (was: Mariam John)

> Make AbstractConfig.logUnused() tunable for clients
> ---
>
> Key: KAFKA-2939
> URL: https://issues.apache.org/jira/browse/KAFKA-2939
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Reporter: Guozhang Wang
>Priority: Major
>  Labels: newbie
>
> Today we always log unused configs in KafkaProducer / KafkaConsumer in their 
> constructors. However, in some cases, such as Kafka Streams, which makes use of 
> these clients, other configs may be passed in to configure Partitioner / 
> Serializer classes, etc. It would therefore be better to make this function 
> call optional, to avoid printing unnecessary and confusing WARN entries.
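One possible direction is to guard the warning behind a boolean flag. The sketch below is illustrative only: the `MiniConfig` class and its `logUnusedEnabled` flag are hypothetical stand-ins, not the actual `AbstractConfig` internals or any real Kafka configuration name.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal stand-in for AbstractConfig: it tracks which keys were read
// and, when enabled, warns about keys that were never used.
class MiniConfig {
    private final Map<String, Object> originals;
    private final Set<String> used = new HashSet<>();
    private final boolean logUnusedEnabled; // hypothetical tunable

    MiniConfig(Map<String, Object> originals, boolean logUnusedEnabled) {
        this.originals = new HashMap<>(originals);
        this.logUnusedEnabled = logUnusedEnabled;
    }

    Object get(String key) {
        used.add(key);                       // record the key as used
        return originals.get(key);
    }

    Set<String> unused() {
        Set<String> keys = new HashSet<>(originals.keySet());
        keys.removeAll(used);
        return keys;
    }

    // Returns the keys that were warned about; empty when the tunable is
    // off, so embedding frameworks like Kafka Streams could opt out.
    Set<String> logUnused() {
        if (!logUnusedEnabled)
            return new HashSet<>();
        Set<String> unused = unused();
        for (String key : unused)
            System.out.println("WARN The configuration '" + key
                + "' was supplied but isn't a known config.");
        return unused;
    }
}
```

With the flag off, extra configs destined for a Partitioner or Serializer would no longer trigger spurious warnings.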



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-2111) Command Line Standardization - Add Help Arguments & List Required Fields

2019-01-23 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-2111:
--

Assignee: (was: Mariam John)

> Command Line Standardization - Add Help Arguments & List Required Fields
> 
>
> Key: KAFKA-2111
> URL: https://issues.apache.org/jira/browse/KAFKA-2111
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin
>Reporter: Matt Warhaftig
>Priority: Minor
>  Labels: newbie
>
> KIP-14 is the standardization of tool command-line arguments. As an offshoot 
> of that proposal, there are standardization changes that don't need to be part 
> of the KIP, since they are less invasive. They are:
> - Properly format argument descriptions (into sentences) and add any missing 
> "REQUIRED" notes.
> - Add 'help' argument to any top-level tool scripts that were missing it.
> This JIRA is for tracking them.
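The shape of the change can be sketched as below. Everything here is hypothetical: the argument names, the usage text, and the `ToolArgs` helper are illustrative, not taken from an actual Kafka tool.

```java
// Sketch of the standardization goal: every tool accepts --help and
// marks required arguments as "REQUIRED" in its usage text.
class ToolArgs {
    static String usage() {
        return "Option                 Description\n"
             + "------                 -----------\n"
             + "--bootstrap-server     REQUIRED: The server to connect to.\n"
             + "--topic                The topic to operate on.\n"
             + "--help                 Print usage information.";
    }

    // Returns the text the tool would print for the given argv, or null
    // when the arguments are acceptable.
    static String check(String[] argv) {
        boolean sawRequired = false;
        for (String a : argv) {
            if (a.equals("--help"))
                return usage();
            if (a.startsWith("--bootstrap-server"))
                sawRequired = true;
        }
        if (!sawRequired)
            return "Missing required argument --bootstrap-server\n" + usage();
        return null;
    }
}
```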





[jira] [Assigned] (KAFKA-6708) Review Exception messages with regards to Serde Useage

2019-01-23 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-6708:
--

Assignee: (was: Mariam John)

> Review Exception messages with regards to Serde Useage
> --
>
> Key: KAFKA-6708
> URL: https://issues.apache.org/jira/browse/KAFKA-6708
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Bill Bejeck
>Priority: Major
>  Labels: newbie
>
> When a required Serde other than the provided defaults is missing, the error 
> message should state more specifically what needs to be done and what the 
> possible causes are, rather than surfacing just a {{ClassCastException}}
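One way to do this is to wrap the raw `ClassCastException` in an exception whose message names the operator and the types involved and hints at the fix. The `SerdeErrors` helper and the message wording below are illustrative assumptions, not Kafka's actual code or text.

```java
// Sketch: convert a bare ClassCastException into an error that tells the
// user which node failed, what the types were, and how to fix the serdes.
class SerdeErrors {
    static RuntimeException describe(ClassCastException cause,
                                     String nodeName,
                                     Class<?> keyClass,
                                     Class<?> valueClass) {
        String msg = "ClassCastException while processing node '" + nodeName
            + "' with key type " + keyClass.getName() + " and value type "
            + valueClass.getName() + ". Make sure the default serdes in "
            + "StreamsConfig match these types, or pass explicit serdes "
            + "to the operator.";
        return new RuntimeException(msg, cause); // keep the original cause
    }
}
```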





[jira] [Assigned] (KAFKA-4650) Improve test coverage org.apache.kafka.streams.kstream.internals

2019-01-23 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-4650:
--

Assignee: (was: Mariam John)

> Improve test coverage org.apache.kafka.streams.kstream.internals
> 
>
> Key: KAFKA-4650
> URL: https://issues.apache.org/jira/browse/KAFKA-4650
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Damian Guy
>Priority: Minor
>  Labels: newbie
>
> Lots of classes have little or no coverage at all, e.g.:
> {{KTableAggregate.KTableAggregateValueGetter}}
> {{KTableKTableRightJoin.KTableKTableRightJoinValueGetterSupplier}}
> {{KStreamAggregate.KStreamAggregateValueGetter}}
> {{KStreamReduce.KStreamReduceValueGetter}}
> {{KStreamWindowReduce.new KTableValueGetterSupplier}}
> {{KTableAggregate.new KTableValueGetterSupplier}}
> {{KTableRepartitionMap.new KTableValueGetterSupplier}}
> {{KTableKTableRightJoin.KTableKTableRightJoinValueGetter}}
> {{KTableKTableLeftJoinValueGetter}}
> {{KStreamWindowReduce.KStreamWindowReduceValueGetter}}
> {{TimeWindow}}
> {{ChangedSerializer}}
> {{UnlimitedWindow}}
> {{WindowedDeserializer}}
> {{KStreamSessionWindowAggregate.KTableSessionWindowValueGetter}}
> {{KTableRepartitionMap}}
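For classes like {{TimeWindow}}, the kind of small unit test being asked for can be sketched as below. `MiniTimeWindow` is a hypothetical stand-in with the same shape as the internal class (start inclusive, end exclusive), not the actual Kafka implementation.

```java
// Illustrative stand-in for the internal TimeWindow class, plus the
// style of assertions a coverage-improving unit test would make.
class MiniTimeWindow {
    final long start, end; // [start, end): start inclusive, end exclusive

    MiniTimeWindow(long start, long end) {
        if (start < 0 || end <= start)
            throw new IllegalArgumentException("invalid window: [" + start + ", " + end + ")");
        this.start = start;
        this.end = end;
    }

    // Two windows overlap iff each starts before the other ends.
    boolean overlap(MiniTimeWindow other) {
        return start < other.end && other.start < end;
    }
}
```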





[jira] [Assigned] (KAFKA-7492) Explain `null` handling for reduce and aggregate

2019-01-23 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-7492:
--

Assignee: (was: Mariam John)

> Explain `null` handling for reduce and aggregate
> 
>
> Key: KAFKA-7492
> URL: https://issues.apache.org/jira/browse/KAFKA-7492
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation, streams
>Reporter: Matthias J. Sax
>Priority: Minor
>  Labels: beginner, docs, newbie
>
> We currently don't explain how records with a `null` value are handled in 
> reduce and aggregate. In particular, we don't explain what happens when the 
> user's aggregation/reduce `apply()` implementation returns `null`.
> We should update the JavaDocs accordingly and maybe also update the docs on 
> the web page.
> Cf. 
> https://stackoverflow.com/questions/52692202/what-happens-if-the-aggregator-of-a-kgroupedstream-returns-null
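The semantics that need documenting can be sketched as a toy simulation. This is my reading of the intended behavior and a hypothetical stand-in (a plain map in place of the state store), not Kafka Streams code: input records with a `null` value are skipped, and an aggregator returning `null` acts as a delete (tombstone) for the key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical simulation of the null-handling rules the ticket wants
// documented: (1) null-valued input records are dropped, and (2) a null
// aggregation result removes the key from the result table.
class NullHandlingSim {
    final Map<String, Integer> table = new HashMap<>();

    void process(String key, Integer value,
                 BiFunction<Integer, Integer, Integer> aggregator) {
        if (value == null)
            return;                              // (1) skip null input
        Integer agg = aggregator.apply(table.get(key), value);
        if (agg == null)
            table.remove(key);                   // (2) null result deletes
        else
            table.put(key, agg);
    }
}
```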





[jira] [Assigned] (KAFKA-7243) Add unit integration tests to validate metrics in Kafka Streams

2018-12-17 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-7243:
--

Assignee: (was: Mariam John)

> Add unit integration tests to validate metrics in Kafka Streams
> ---
>
> Key: KAFKA-7243
> URL: https://issues.apache.org/jira/browse/KAFKA-7243
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Priority: Major
>  Labels: newbie++
>
> We should add an integration test for Kafka Streams that validates:
> 1. After the streams application is started, all metrics from the different 
> levels (thread, task, processor, store, cache) are correctly created and 
> display recorded values.
> 2. When the streams application is shut down, all metrics are correctly 
> de-registered and removed.
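The shape of such a test can be sketched as below, with a plain map standing in for the Streams metrics registry. The class, the metric names, and the lifecycle methods are all illustrative assumptions, not Kafka Streams APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the metrics registry lifecycle the test would validate:
// metrics for every level appear on start and disappear on shutdown.
class MetricsLifecycleSim {
    final Map<String, Double> metrics = new HashMap<>();

    void start() {
        // On startup, one metric per level should be registered.
        for (String level : new String[]{"thread", "task", "processor", "store", "cache"})
            metrics.put(level + "-level-metric", 0.0);
    }

    void shutdown() {
        // On shutdown, all metrics should be de-registered.
        metrics.clear();
    }
}
```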





[jira] [Commented] (KAFKA-7243) Add unit integration tests to validate metrics in Kafka Streams

2018-12-17 Thread Mariam John (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723121#comment-16723121
 ] 

Mariam John commented on KAFKA-7243:


Done. 

> Add unit integration tests to validate metrics in Kafka Streams
> ---
>
> Key: KAFKA-7243
> URL: https://issues.apache.org/jira/browse/KAFKA-7243
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Priority: Major
>  Labels: newbie++
>
> We should add an integration test for Kafka Streams that validates:
> 1. After the streams application is started, all metrics from the different 
> levels (thread, task, processor, store, cache) are correctly created and 
> display recorded values.
> 2. When the streams application is shut down, all metrics are correctly 
> de-registered and removed.





[jira] [Assigned] (KAFKA-7243) Add unit integration tests to validate metrics in Kafka Streams

2018-10-10 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-7243:
--

Assignee: Mariam John

> Add unit integration tests to validate metrics in Kafka Streams
> ---
>
> Key: KAFKA-7243
> URL: https://issues.apache.org/jira/browse/KAFKA-7243
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Mariam John
>Priority: Major
>  Labels: newbie++
>
> We should add an integration test for Kafka Streams that validates:
> 1. After the streams application is started, all metrics from the different 
> levels (thread, task, processor, store, cache) are correctly created and 
> display recorded values.
> 2. When the streams application is shut down, all metrics are correctly 
> de-registered and removed.





[jira] [Assigned] (KAFKA-7492) Explain `null` handling for reduce and aggregate

2018-10-10 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-7492:
--

Assignee: Mariam John

> Explain `null` handling for reduce and aggregate
> 
>
> Key: KAFKA-7492
> URL: https://issues.apache.org/jira/browse/KAFKA-7492
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation, streams
>Reporter: Matthias J. Sax
>Assignee: Mariam John
>Priority: Minor
>  Labels: beginner, docs, newbie
>
> We currently don't explain how records with a `null` value are handled in 
> reduce and aggregate. In particular, we don't explain what happens when the 
> user's aggregation/reduce `apply()` implementation returns `null`.
> We should update the JavaDocs accordingly and maybe also update the docs on 
> the web page.
> Cf. 
> https://stackoverflow.com/questions/52692202/what-happens-if-the-aggregator-of-a-kgroupedstream-returns-null





[jira] [Assigned] (KAFKA-6708) Review Exception messages with regards to Serde Useage

2018-09-05 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-6708:
--

Assignee: Mariam John

> Review Exception messages with regards to Serde Useage
> --
>
> Key: KAFKA-6708
> URL: https://issues.apache.org/jira/browse/KAFKA-6708
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Bill Bejeck
>Assignee: Mariam John
>Priority: Major
>  Labels: newbie
>
> When a required Serde other than the provided defaults is missing, the error 
> message should state more specifically what needs to be done and what the 
> possible causes are, rather than surfacing just a {{ClassCastException}}





[jira] [Assigned] (KAFKA-2939) Make AbstractConfig.logUnused() tunable for clients

2018-09-05 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-2939:
--

Assignee: Mariam John

> Make AbstractConfig.logUnused() tunable for clients
> ---
>
> Key: KAFKA-2939
> URL: https://issues.apache.org/jira/browse/KAFKA-2939
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Reporter: Guozhang Wang
>Assignee: Mariam John
>Priority: Major
>  Labels: newbie
>
> Today we always log unused configs in KafkaProducer / KafkaConsumer in their 
> constructors. However, in some cases, such as Kafka Streams, which makes use of 
> these clients, other configs may be passed in to configure Partitioner / 
> Serializer classes, etc. It would therefore be better to make this function 
> call optional, to avoid printing unnecessary and confusing WARN entries.





[jira] [Assigned] (KAFKA-4650) Improve test coverage org.apache.kafka.streams.kstream.internals

2018-09-05 Thread Mariam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-4650:
--

Assignee: Mariam John

> Improve test coverage org.apache.kafka.streams.kstream.internals
> 
>
> Key: KAFKA-4650
> URL: https://issues.apache.org/jira/browse/KAFKA-4650
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Damian Guy
>Assignee: Mariam John
>Priority: Minor
>  Labels: newbie
>
> Lots of classes have little or no coverage at all, e.g.:
> {{KTableAggregate.KTableAggregateValueGetter}}
> {{KTableKTableRightJoin.KTableKTableRightJoinValueGetterSupplier}}
> {{KStreamAggregate.KStreamAggregateValueGetter}}
> {{KStreamReduce.KStreamReduceValueGetter}}
> {{KStreamWindowReduce.new KTableValueGetterSupplier}}
> {{KTableAggregate.new KTableValueGetterSupplier}}
> {{KTableRepartitionMap.new KTableValueGetterSupplier}}
> {{KTableKTableRightJoin.KTableKTableRightJoinValueGetter}}
> {{KTableKTableLeftJoinValueGetter}}
> {{KStreamWindowReduce.KStreamWindowReduceValueGetter}}
> {{TimeWindow}}
> {{ChangedSerializer}}
> {{UnlimitedWindow}}
> {{WindowedDeserializer}}
> {{KStreamSessionWindowAggregate.KTableSessionWindowValueGetter}}
> {{KTableRepartitionMap}}





[jira] [Commented] (KAFKA-6437) Streams does not warn about missing input topics, but hangs

2018-03-28 Thread Mariam John (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418185#comment-16418185
 ] 

Mariam John commented on KAFKA-6437:


[~mjsax] what do you mean by partially available input topics? KAFKA-6520 deals 
with the case where the Kafka broker is down but the Kafka Streams apps 
connected to it still report their state as RUNNING. I think, as you suggested, 
we could introduce an IDLE state in both cases and log a different warning for 
each: in this case because of the missing input topics, and in the case of 
KAFKA-6520 because the app is unable to connect to the broker.

> Streams does not warn about missing input topics, but hangs
> ---
>
> Key: KAFKA-6437
> URL: https://issues.apache.org/jira/browse/KAFKA-6437
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
> Environment: Single client on single node broker
>Reporter: Chris Schwarzfischer
>Assignee: Mariam John
>Priority: Minor
>  Labels: newbie
>
> *Case*
> Streams application with two input topics being used for a left join.
> When the left-side topic is missing when the streams application is started, it 
> hangs "in the middle" of the topology (at …9, see below). Only some of 
> the intermediate topics are created (up to …9).
> When the missing input topic is created, the streams application resumes 
> processing.
> {noformat}
> Topology:
> StreamsTask taskId: 2_0
>   ProcessorTopology:
>   KSTREAM-SOURCE-11:
>   topics: 
> [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition]
>   children:   [KTABLE-AGGREGATE-12]
>   KTABLE-AGGREGATE-12:
>   states: 
> [KTABLE-AGGREGATE-STATE-STORE-09]
>   children:   [KTABLE-TOSTREAM-20]
>   KTABLE-TOSTREAM-20:
>   children:   [KSTREAM-SINK-21]
>   KSTREAM-SINK-21:
>   topic:  data_udr_month_customer_aggregration
>   KSTREAM-SOURCE-17:
>   topics: 
> [mystreams_app-KSTREAM-MAP-14-repartition]
>   children:   [KSTREAM-LEFTJOIN-18]
>   KSTREAM-LEFTJOIN-18:
>   states: 
> [KTABLE-AGGREGATE-STATE-STORE-09]
>   children:   [KSTREAM-SINK-19]
>   KSTREAM-SINK-19:
>   topic:  data_UDR_joined
> Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, 
> mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0]
> {noformat}
> *Why this matters*
> The application does quite a lot of preprocessing before joining with the 
> missing input topic. This preprocessing won't happen without the topic, 
> creating a huge backlog of data.
> *Fix*
> Issue a `warn`- or `error`-level message at startup to inform about the missing 
> topic and its consequences.
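The proposed startup check can be sketched as below. In the real fix the list of existing topics would come from cluster metadata or the admin client; here it is passed in directly, and the class, method, and message wording are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: compare the topology's source topics against the topics that
// exist and warn about any that are missing, instead of silently hanging.
class MissingTopicCheck {
    static List<String> missing(Set<String> sourceTopics, Set<String> existingTopics) {
        List<String> missing = new ArrayList<>();
        for (String t : sourceTopics)
            if (!existingTopics.contains(t))
                missing.add(t);
        if (!missing.isEmpty())
            System.out.println("WARN Input topics " + missing
                + " are missing; the application will not make progress until they are created.");
        return missing;
    }
}
```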





[jira] [Commented] (KAFKA-6720) Inconsistent Kafka Streams behaviour when topic does not exist

2018-03-28 Thread Mariam John (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418089#comment-16418089
 ] 

Mariam John commented on KAFKA-6720:


[~wojda] if you look at the Resolved tag, it is marked as a duplicate. The only 
way to mark it as a duplicate is to resolve it as one. If you look at 
KAFKA-6437, you will see some comments that [~guozhang] added yesterday. We 
will use that defect to resolve both this defect and KAFKA-6437. Hope that 
makes sense. I will add a comment as soon as I upload a fix. Thank you.

> Inconsistent Kafka Streams behaviour when topic does not exist
> --
>
> Key: KAFKA-6720
> URL: https://issues.apache.org/jira/browse/KAFKA-6720
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.1
>Reporter: Daniel Wojda
>Priority: Minor
>
> When Kafka Streams starts, it reads metadata about the topics used in the 
> topology and their partitions. If the topology contains a stateful operation 
> like #join and a topic does not exist, a 
> [TopologyBuilderException|https://github.com/apache/kafka/blob/5bdfbd13524da667289cacb774bb92df36a253f2/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamPartitionAssignor.java#L719]
>  will be thrown.
> For streams with a simple topology containing only stateless operations, like 
> #mapValues, a missing topic causes no exception; Kafka Streams just logs a 
> warning:
>  ["log.warn("No partitions found for topic {}", 
> topic);"|https://github.com/apache/kafka/blob/5bdfbd13524da667289cacb774bb92df36a253f2/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamPartitionAssignor.java#L435]
>  
> I believe the behaviour of Kafka Streams in both cases should be the same, 
> and it should throw TopologyBuilderException.
> I am more than happy to prepare a Pull Request if it is a valid issue.
>  
>  





[jira] [Resolved] (KAFKA-6720) Inconsistent Kafka Streams behaviour when topic does not exist

2018-03-27 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John resolved KAFKA-6720.

Resolution: Duplicate

This is similar to KAFKA-6437.

> Inconsistent Kafka Streams behaviour when topic does not exist
> --
>
> Key: KAFKA-6720
> URL: https://issues.apache.org/jira/browse/KAFKA-6720
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.1
>Reporter: Daniel Wojda
>Priority: Minor
>
> When Kafka Streams starts, it reads metadata about the topics used in the 
> topology and their partitions. If the topology contains a stateful operation 
> like #join and a topic does not exist, a 
> [TopologyBuilderException|https://github.com/apache/kafka/blob/5bdfbd13524da667289cacb774bb92df36a253f2/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamPartitionAssignor.java#L719]
>  will be thrown.
> For streams with a simple topology containing only stateless operations, like 
> #mapValues, a missing topic causes no exception; Kafka Streams just logs a 
> warning:
>  ["log.warn("No partitions found for topic {}", 
> topic);"|https://github.com/apache/kafka/blob/5bdfbd13524da667289cacb774bb92df36a253f2/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamPartitionAssignor.java#L435]
>  
> I believe the behaviour of Kafka Streams in both cases should be the same, 
> and it should throw TopologyBuilderException.
> I am more than happy to prepare a Pull Request if it is a valid issue.
>  
>  





[jira] [Assigned] (KAFKA-6437) Streams does not warn about missing input topics, but hangs

2018-03-23 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-6437:
--

Assignee: Mariam John

> Streams does not warn about missing input topics, but hangs
> ---
>
> Key: KAFKA-6437
> URL: https://issues.apache.org/jira/browse/KAFKA-6437
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
> Environment: Single client on single node broker
>Reporter: Chris Schwarzfischer
>Assignee: Mariam John
>Priority: Minor
>  Labels: newbie
>
> *Case*
> Streams application with two input topics being used for a left join.
> When the left-side topic is missing when the streams application is started, it 
> hangs "in the middle" of the topology (at …9, see below). Only some of 
> the intermediate topics are created (up to …9).
> When the missing input topic is created, the streams application resumes 
> processing.
> {noformat}
> Topology:
> StreamsTask taskId: 2_0
>   ProcessorTopology:
>   KSTREAM-SOURCE-11:
>   topics: 
> [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition]
>   children:   [KTABLE-AGGREGATE-12]
>   KTABLE-AGGREGATE-12:
>   states: 
> [KTABLE-AGGREGATE-STATE-STORE-09]
>   children:   [KTABLE-TOSTREAM-20]
>   KTABLE-TOSTREAM-20:
>   children:   [KSTREAM-SINK-21]
>   KSTREAM-SINK-21:
>   topic:  data_udr_month_customer_aggregration
>   KSTREAM-SOURCE-17:
>   topics: 
> [mystreams_app-KSTREAM-MAP-14-repartition]
>   children:   [KSTREAM-LEFTJOIN-18]
>   KSTREAM-LEFTJOIN-18:
>   states: 
> [KTABLE-AGGREGATE-STATE-STORE-09]
>   children:   [KSTREAM-SINK-19]
>   KSTREAM-SINK-19:
>   topic:  data_UDR_joined
> Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, 
> mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0]
> {noformat}
> *Why this matters*
> The application does quite a lot of preprocessing before joining with the 
> missing input topic. This preprocessing won't happen without the topic, 
> creating a huge backlog of data.
> *Fix*
> Issue a `warn`- or `error`-level message at startup to inform about the missing 
> topic and its consequences.





[jira] [Commented] (KAFKA-6437) Streams does not warn about missing input topics, but hangs

2018-03-23 Thread Mariam John (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411910#comment-16411910
 ] 

Mariam John commented on KAFKA-6437:


I will do that, [~mjsax]. I agree that a WARN message seems right in this case. 
I gave some thought to the option of making this configurable, but that might 
not be easy given that we would want to provide a way to configure per-topic 
partition counts and replication factors.

> Streams does not warn about missing input topics, but hangs
> ---
>
> Key: KAFKA-6437
> URL: https://issues.apache.org/jira/browse/KAFKA-6437
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
> Environment: Single client on single node broker
>Reporter: Chris Schwarzfischer
>Priority: Minor
>  Labels: newbie
>
> *Case*
> Streams application with two input topics being used for a left join.
> When the left-side topic is missing when the streams application is started, it 
> hangs "in the middle" of the topology (at …9, see below). Only some of 
> the intermediate topics are created (up to …9).
> When the missing input topic is created, the streams application resumes 
> processing.
> {noformat}
> Topology:
> StreamsTask taskId: 2_0
>   ProcessorTopology:
>   KSTREAM-SOURCE-11:
>   topics: 
> [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition]
>   children:   [KTABLE-AGGREGATE-12]
>   KTABLE-AGGREGATE-12:
>   states: 
> [KTABLE-AGGREGATE-STATE-STORE-09]
>   children:   [KTABLE-TOSTREAM-20]
>   KTABLE-TOSTREAM-20:
>   children:   [KSTREAM-SINK-21]
>   KSTREAM-SINK-21:
>   topic:  data_udr_month_customer_aggregration
>   KSTREAM-SOURCE-17:
>   topics: 
> [mystreams_app-KSTREAM-MAP-14-repartition]
>   children:   [KSTREAM-LEFTJOIN-18]
>   KSTREAM-LEFTJOIN-18:
>   states: 
> [KTABLE-AGGREGATE-STATE-STORE-09]
>   children:   [KSTREAM-SINK-19]
>   KSTREAM-SINK-19:
>   topic:  data_UDR_joined
> Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, 
> mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0]
> {noformat}
> *Why this matters*
> The application does quite a lot of preprocessing before joining with the 
> missing input topic. This preprocessing won't happen without the topic, 
> creating a huge backlog of data.
> *Fix*
> Issue a `warn`- or `error`-level message at startup to inform about the missing 
> topic and its consequences.





[jira] [Commented] (KAFKA-6437) Streams does not warn about missing input topics, but hangs

2018-03-22 Thread Mariam John (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409759#comment-16409759
 ] 

Mariam John commented on KAFKA-6437:


In KafkaStreams we can use the admin client to list all topics, go through the 
source and sink topics to see which ones are missing, and print an error message 
or throw something like a MissingTopicException. Is there more we want to do for 
a fix, or is there more to this bug that I am missing?

> Streams does not warn about missing input topics, but hangs
> ---
>
> Key: KAFKA-6437
> URL: https://issues.apache.org/jira/browse/KAFKA-6437
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
> Environment: Single client on single node broker
>Reporter: Chris Schwarzfischer
>Priority: Minor
>  Labels: newbie
>
> *Case*
> Streams application with two input topics being used for a left join.
> When the left side topic is missing upon starting the streams application, it 
> hangs "in the middle" of the topology (at …9, see below). Only parts of 
> the intermediate topics are created (up to …9)
> When the missing input topic is created, the streams application resumes 
> processing.
> {noformat}
> Topology:
> StreamsTask taskId: 2_0
>   ProcessorTopology:
>   KSTREAM-SOURCE-11:
>   topics: 
> [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition]
>   children:   [KTABLE-AGGREGATE-12]
>   KTABLE-AGGREGATE-12:
>   states: 
> [KTABLE-AGGREGATE-STATE-STORE-09]
>   children:   [KTABLE-TOSTREAM-20]
>   KTABLE-TOSTREAM-20:
>   children:   [KSTREAM-SINK-21]
>   KSTREAM-SINK-21:
>   topic:  data_udr_month_customer_aggregration
>   KSTREAM-SOURCE-17:
>   topics: 
> [mystreams_app-KSTREAM-MAP-14-repartition]
>   children:   [KSTREAM-LEFTJOIN-18]
>   KSTREAM-LEFTJOIN-18:
>   states: 
> [KTABLE-AGGREGATE-STATE-STORE-09]
>   children:   [KSTREAM-SINK-19]
>   KSTREAM-SINK-19:
>   topic:  data_UDR_joined
> Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, 
> mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0]
> {noformat}
> *Why this matters*
> The applications does quite a lot of preprocessing before joining with the 
> missing input topic. This preprocessing won't happen without the topic, 
> creating a huge backlog of data.
> *Fix*
> Issue an `warn` or `error` level message at start to inform about the missing 
> topic and it's consequences.





[jira] [Assigned] (KAFKA-5085) Add test for rebalance exceptions

2018-01-20 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-5085:
--

Assignee: (was: Mariam John)

> Add test for rebalance exceptions
> -
>
> Key: KAFKA-5085
> URL: https://issues.apache.org/jira/browse/KAFKA-5085
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Priority: Major
>
> We currently lack a proper test for the case where an exception is thrown 
> during rebalance within the Streams rebalance listener.
> We recently had a bug where the app hung on an exception because the 
> exception was not handled properly (KAFKA-5073). Writing a test might require 
> some code refactoring to make testing simpler in the first place.





[jira] [Assigned] (KAFKA-4994) Fix findbugs warning about OffsetStorageWriter#currentFlushId

2018-01-20 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-4994:
--

Assignee: (was: Mariam John)

> Fix findbugs warning about OffsetStorageWriter#currentFlushId
> -
>
> Key: KAFKA-4994
> URL: https://issues.apache.org/jira/browse/KAFKA-4994
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Priority: Major
>  Labels: newbie
>
> We should fix the findbugs warning about 
> {{OffsetStorageWriter#currentFlushId}}
> {code}
> Multithreaded correctness Warnings
> IS2_INCONSISTENT_SYNC:   Inconsistent synchronization of 
> org.apache.kafka.connect.storage.OffsetStorageWriter.currentFlushId; locked 
> 83% of time
> IS2_INCONSISTENT_SYNC:   Inconsistent synchronization of 
> org.apache.kafka.connect.storage.OffsetStorageWriter.toFlush; locked 75% of 
> time
> {code}
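The usual fix for an IS2_INCONSISTENT_SYNC warning is to access the shared field under the same lock everywhere, for example via synchronized methods. The sketch below mirrors the field named in the warning but is an illustrative stand-in, not the actual `OffsetStorageWriter` code.

```java
// Sketch: every access to currentFlushId goes through a synchronized
// method, so the field is locked 100% of the time and the findbugs
// warning no longer applies.
class OffsetWriterSketch {
    private long currentFlushId = 0; // always accessed while holding 'this'

    synchronized long beginFlush() {
        return ++currentFlushId;     // locked write
    }

    synchronized boolean isCurrentFlush(long flushId) {
        return flushId == currentFlushId; // locked read
    }
}
```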





[jira] [Resolved] (KAFKA-4370) CorruptRecordException when ProducerRecord constructed without key nor partition and send

2018-01-20 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John resolved KAFKA-4370.

   Resolution: Duplicate
Fix Version/s: 1.0.0

This is already fixed in [https://github.com/apache/kafka/pull/3541]. I 
couldn't find a bug for the fix that already went in.

> CorruptRecordException when ProducerRecord constructed without key nor 
> partition and send
> -
>
> Key: KAFKA-4370
> URL: https://issues.apache.org/jira/browse/KAFKA-4370
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.1.0
>Reporter: Lars Pfannenschmidt
>Assignee: Mariam John
>Priority: Trivial
> Fix For: 1.0.0
>
>
> According to the JavaDoc of ProducerRecord it should be possible to send 
> messages without a key:
> {quote}
> If neither key nor partition is present a partition will be assigned in a 
> round-robin fashion.
> {quote}
> {code:title=SomeProducer.java|borderStyle=solid}
> ProducerRecord record = new ProducerRecord<>(topic, 
> "somemessage");
> return this.producer.send(record).get();
> {code}
> Unfortunately an Exception is thrown:
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.CorruptRecordException: This message has 
> failed its CRC checksum, exceeds the valid size, or is otherwise corrupt.
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:65)
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:52)
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:25)
> {code}





[jira] [Commented] (KAFKA-4370) CorruptRecordException when ProducerRecord constructed without key nor partition and send

2018-01-20 Thread Mariam John (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333414#comment-16333414
 ] 

Mariam John commented on KAFKA-4370:


I just verified this, and it looks like the error message for 
CorruptRecordException has been modified to "This message has failed its CRC 
checksum, exceeds the valid size, has a null key for a compacted topic, or is 
otherwise corrupt." This was fixed in 
[https://github.com/apache/kafka/commit/fc0ea25025df8af5079e8142ba085939b3c9c073], 
so I am closing this as fixed for now.

> CorruptRecordException when ProducerRecord constructed without key nor 
> partition and send
> -
>
> Key: KAFKA-4370
> URL: https://issues.apache.org/jira/browse/KAFKA-4370
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.1.0
>Reporter: Lars Pfannenschmidt
>Assignee: Mariam John
>Priority: Trivial
>
> According to the JavaDoc of ProducerRecord it should be possible to send 
> messages without a key:
> {quote}
> If neither key nor partition is present a partition will be assigned in a 
> round-robin fashion.
> {quote}
> {code:title=SomeProducer.java|borderStyle=solid}
> ProducerRecord<String, String> record = new ProducerRecord<>(topic, 
> "somemessage");
> return this.producer.send(record).get();
> {code}
> Unfortunately an Exception is thrown:
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.CorruptRecordException: This message has 
> failed its CRC checksum, exceeds the valid size, or is otherwise corrupt.
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:65)
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:52)
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:25)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5697) StreamThread.close() need to interrupt the stream threads to break the loop

2018-01-20 Thread Mariam John (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333410#comment-16333410
 ] 

Mariam John commented on KAFKA-5697:


@Guozhang Wang, sorry for the delay in responding; I just saw the comment on 
this bug. I am not working on this bug, so I unassigned myself. Thank you.

> StreamThread.close() need to interrupt the stream threads to break the loop
> ----------------------------------------------------------------------------
>
> Key: KAFKA-5697
> URL: https://issues.apache.org/jira/browse/KAFKA-5697
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Priority: Major
>
> In {{StreamThread.close()}} we currently do nothing but set the state, hoping 
> the stream thread may eventually check it and shut itself down. However, under 
> certain scenarios the thread may get blocked within a single loop and hence 
> will never check on this state enum. For example, its {{consumer.poll}} call 
> triggers {{ensureCoordinatorReady()}}, which will block until the coordinator 
> can be found. If the coordinator broker is never up and running then the 
> Stream instance will be blocked forever.
> A simple way to reproduce this issue is to start the word count demo without 
> starting the ZK / Kafka broker; it will get stuck in a single loop and even 
> `ctrl-C` will not stop it, since its set state will never be read by the 
> thread:
> {code:java}
> [2017-08-03 15:17:39,981] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,046] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,101] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,206] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,261] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,366] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,472] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> ^C[2017-08-03 15:17:40,580] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> {code}
>  
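The ticket's core point, that setting a state flag alone cannot unblock a thread stuck inside a blocking call, can be sketched with plain Java threads. The class below is illustrative only (it is not StreamThread); `Thread.sleep` stands in for a blocking call such as {{ensureCoordinatorReady()}}.

```java
// Illustrative sketch: a loop thread that only exits when its blocking
// call is interrupted, mirroring why close() must interrupt the thread
// rather than merely flip a state flag.
public class InterruptibleLoop implements Runnable {
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            try {
                // Stand-in for a blocking call like ensureCoordinatorReady():
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // The interrupt breaks the blocking call; restore the flag
                // for any outer handlers and leave the loop.
                Thread.currentThread().interrupt();
                break;
            }
        }
    }

    public void close(Thread t) {
        running = false;   // setting state alone is not enough...
        t.interrupt();     // ...the interrupt is what unblocks the thread
    }

    public static void main(String[] args) throws Exception {
        InterruptibleLoop loop = new InterruptibleLoop();
        Thread t = new Thread(loop);
        t.start();
        loop.close(t);
        t.join(5_000);
        System.out.println("alive=" + t.isAlive()); // alive=false
    }
}
```

Without the `t.interrupt()` line, `join` would wait out the full sleep before the flag is ever re-checked, which is exactly the stuck-loop symptom in the log above.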



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-5697) StreamThread.close() need to interrupt the stream threads to break the loop

2018-01-20 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-5697:
--

Assignee: (was: Mariam John)

> StreamThread.close() need to interrupt the stream threads to break the loop
> ----------------------------------------------------------------------------
>
> Key: KAFKA-5697
> URL: https://issues.apache.org/jira/browse/KAFKA-5697
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Priority: Major
>
> In {{StreamThread.close()}} we currently do nothing but set the state, hoping 
> the stream thread may eventually check it and shut itself down. However, under 
> certain scenarios the thread may get blocked within a single loop and hence 
> will never check on this state enum. For example, its {{consumer.poll}} call 
> triggers {{ensureCoordinatorReady()}}, which will block until the coordinator 
> can be found. If the coordinator broker is never up and running then the 
> Stream instance will be blocked forever.
> A simple way to reproduce this issue is to start the word count demo without 
> starting the ZK / Kafka broker; it will get stuck in a single loop and even 
> `ctrl-C` will not stop it, since its set state will never be read by the 
> thread:
> {code:java}
> [2017-08-03 15:17:39,981] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,046] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,101] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,206] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,261] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,366] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,472] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> ^C[2017-08-03 15:17:40,580] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-5697) StreamThread.close() need to interrupt the stream threads to break the loop

2018-01-18 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John updated KAFKA-5697:
---
Description: 
In {{StreamThread.close()}} we currently do nothing but set the state, hoping 
the stream thread may eventually check it and shut itself down. However, under 
certain scenarios the thread may get blocked within a single loop and hence 
will never check on this state enum. For example, its {{consumer.poll}} call 
triggers {{ensureCoordinatorReady()}}, which will block until the coordinator 
can be found. If the coordinator broker is never up and running then the Stream 
instance will be blocked forever.

A simple way to reproduce this issue is to start the word count demo without 
starting the ZK / Kafka broker; it will get stuck in a single loop and even 
`ctrl-C` will not stop it, since its set state will never be read by the 
thread:
{code:java}
[2017-08-03 15:17:39,981] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,046] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,101] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,206] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,261] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,366] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,472] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
^C[2017-08-03 15:17:40,580] WARN Connection to node -1 could not be 
established. Broker may not be available. 
(org.apache.kafka.clients.NetworkClient)
{code}
 

  was:
In {{StreamThread.close()}} we currently do nothing but set the state, hoping 
the stream thread may eventually check it and shut itself down. However, under 
certain scenarios the thread may get blocked within a single loop and hence 
will never check on this state enum. For example, its {{consumer.poll}} call 
triggers {{ensureCoordinatorReady()}}, which will block until the coordinator 
can be found. If the coordinator broker is never up and running then the Stream 
instance will be blocked forever.

A simple way to reproduce this issue is to start the word count demo without 
starting the ZK / Kafka broker; it will get stuck in a single loop and even 
`ctrl-C` will not stop it, since its set state will never be read by the 
thread:

{code}
[2017-08-03 15:17:39,981] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,046] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,101] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,206] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,261] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,366] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2017-08-03 15:17:40,472] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient)
^C[2017-08-03 15:17:40,580] WARN Connection to node -1 could not be 
established. Broker may not be available. 
(org.apache.kafka.clients.NetworkClient)
{code}


> StreamThread.close() need to interrupt the stream threads to break the loop
> ----------------------------------------------------------------------------
>
> Key: KAFKA-5697
> URL: https://issues.apache.org/jira/browse/KAFKA-5697
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Mariam John
>Priority: Major
>
> In {{StreamThread.close()}} we currently do nothing but set the state, hoping 
> the stream thread may eventually check it and shut itself down. However, under 
> certain scenarios the thread may get blocked within a single loop and hence 
> will never check on this state enum. For example, its {{consumer.poll}} call 
> triggers {{ensureCoordinatorReady()}}, which will block until the coordinator 
> can be found.

[jira] [Assigned] (KAFKA-5581) Streams can be smarter in deciding when to create changelog topics for state stores

2017-08-07 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-5581:
--

Assignee: Mariam John

> Streams can be smarter in deciding when to create changelog topics for state 
> stores
> ----------------------------------------------------------------------------
>
> Key: KAFKA-5581
> URL: https://issues.apache.org/jira/browse/KAFKA-5581
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Mariam John
>  Labels: architecture, performance
>
> Today Streams makes all state stores backed by a changelog topic by default, 
> unless users override it with {{disableLogging}} when creating the state 
> store / materializing the KTable. However, there are a few cases where a 
> separate changelog topic is not required, because an existing topic can be 
> re-used. A few examples:
> 1) If a KTable is read directly from a source topic, and is materialized, i.e. 
> {code}
> table1 = builder.table("topic1", "store1");
> {code}
> In this case {{table1}}'s changelog topic can just be {{topic1}}, and we do 
> not need to create a separate {{table1-changelog}} topic.
> 2) if a KTable is materialized and then sent directly into a sink topic with 
> the same key, e.g.
> {code}
> table1 = stream.groupBy(...).aggregate("state1").to("topic2");
> {code}
> In this case {{state1}}'s changelog topic can just be {{topic2}}, and we do 
> not need to create a separate {{state1-changelog}} topic anymore;
> 3) if a KStream is materialized for joins where the streams are directly from 
> a topic, e.g.:
> {code}
> stream1 = builder.stream("topic1");
> stream2 = builder.stream("topic2");
> stream3 = stream1.join(stream2, windows);  // stream1 and stream2 are 
> materialized with a changelog topic
> {code}
> Since stream materialization is append-only, we do not need a changelog for 
> the state store either, but can just use the source {{topic1}} and {{topic2}}.
> 4) When you have simple transformation operations, or even join operations, 
> that generate new KTables which need to be materialized with a state store, 
> you can use the changelog topic of the previous KTable and apply the 
> transformation logic upon restoration, instead of creating a new changelog 
> topic. For example:
> {code}
> table1 = builder.table("topic1");
> table2 = table1.filter(..).join(table3); // table2 needs to be materialized 
> for joining
> {code}
> We can set the {{getter}} function of table2's materialized store, say 
> {{state2}}, to read from {{topic1}} and then apply the filter operator, 
> instead of creating a new {{state2-changelog}} topic in this case.
> 5) more use cases ...
> We can come up with general internal implementation optimizations to 
> determine when / how to set the changelog topic for those materialized stores 
> at startup, when generating the topology.
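Case (1) above amounts to a naming decision at topology-build time. The helper below is a hypothetical sketch, not Kafka Streams API: it assumes the default "appId-store-changelog" naming convention and reuses the source topic whenever the table is materialized directly from it.

```java
// Illustrative sketch of the changelog-reuse decision: a store materialized
// straight from a source topic can use that topic as its changelog instead
// of a dedicated "<appId>-<store>-changelog" topic.
public class ChangelogTopicChooser {
    public static String changelogFor(String storeName, String sourceTopic,
                                      boolean directFromSource, String appId) {
        if (directFromSource) {
            return sourceTopic;                        // reuse the source topic
        }
        return appId + "-" + storeName + "-changelog"; // dedicated changelog
    }

    public static void main(String[] args) {
        // table1 = builder.table("topic1", "store1") -> reuse topic1
        System.out.println(changelogFor("store1", "topic1", true, "app"));
        // a derived store still gets its own changelog
        System.out.println(changelogFor("state2", "topic1", false, "app"));
    }
}
```

The other cases in the list would extend the same decision with sink-topic reuse and upstream-changelog reuse, which is what the proposed topology-generation optimization would encode.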



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-5697) StreamThread.close() need to interrupt the stream threads to break the loop

2017-08-07 Thread Mariam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariam John reassigned KAFKA-5697:
--

Assignee: Mariam John

> StreamThread.close() need to interrupt the stream threads to break the loop
> ----------------------------------------------------------------------------
>
> Key: KAFKA-5697
> URL: https://issues.apache.org/jira/browse/KAFKA-5697
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Mariam John
>
> In {{StreamThread.close()}} we currently do nothing but set the state, hoping 
> the stream thread may eventually check it and shut itself down. However, under 
> certain scenarios the thread may get blocked within a single loop and hence 
> will never check on this state enum. For example, its {{consumer.poll}} call 
> triggers {{ensureCoordinatorReady()}}, which will block until the coordinator 
> can be found. If the coordinator broker is never up and running then the 
> Stream instance will be blocked forever.
> A simple way to reproduce this issue is to start the word count demo without 
> starting the ZK / Kafka broker; it will get stuck in a single loop and even 
> `ctrl-C` will not stop it, since its set state will never be read by the 
> thread:
> {code}
> [2017-08-03 15:17:39,981] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,046] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,101] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,206] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,261] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,366] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2017-08-03 15:17:40,472] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> ^C[2017-08-03 15:17:40,580] WARN Connection to node -1 could not be 
> established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-2111) Command Line Standardization - Add Help Arguments & List Required Fields

2017-06-16 Thread Mariam John (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16052033#comment-16052033
 ] 

Mariam John commented on KAFKA-2111:


Hi Tom,

  I am sorry, I only just saw this message. I am currently working on this and 
am almost done with the changes. It does not contain all the recommendations 
you mentioned in your mail. I will try to put out a patch today.

> Command Line Standardization - Add Help Arguments & List Required Fields
> 
>
> Key: KAFKA-2111
> URL: https://issues.apache.org/jira/browse/KAFKA-2111
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin
>Reporter: Matt Warhaftig
>Assignee: Mariam John
>Priority: Minor
>  Labels: newbie
>
> KIP-14 is the standardization of tool command line arguments.  As an offshoot 
> of that proposal there are standardization changes that don't need to be part 
> of the KIP since they are less invasive.  They are:
> - Properly format argument descriptions (into sentences) and add any missing 
> "REQUIRED" notes.
> - Add 'help' argument to any top-level tool scripts that were missing it.
> This JIRA is for tracking them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)