[jira] [Commented] (KAFKA-3262) Make KafkaStreams debugging friendly

2016-11-11 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15657242#comment-15657242
 ] 

Eno Thereska commented on KAFKA-3262:
-------------------------------------

This is now fixed with KIP-62.
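
For context: KIP-62 moved consumer heartbeats to a background thread and added `max.poll.interval.ms` as a separate bound on the time between `poll()` calls. A thread held at a breakpoint therefore keeps heartbeating, but a long pause can still exceed the poll interval. A minimal sketch of raising that bound for a debugging session, assuming the settings are passed as plain `Properties`; the class and method names are hypothetical:

```java
import java.util.Properties;

public class DebugPollInterval {
    /**
     * Hypothetical helper: returns a copy of the given settings with
     * max.poll.interval.ms raised to the longest breakpoint pause we expect.
     */
    public static Properties withDebugPollInterval(Properties base, int maxPauseMs) {
        Properties props = new Properties();
        props.putAll(base);
        // Real consumer config name; the value chosen here is an assumption.
        props.setProperty("max.poll.interval.ms", Integer.toString(maxPauseMs));
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("session.timeout.ms", "10000");
        // Allow up to one hour between poll() calls while stepping through code.
        Properties debug = withDebugPollInterval(base, 60 * 60 * 1000);
        System.out.println(debug.getProperty("max.poll.interval.ms"));
    }
}
```

The session timeout is left untouched here, since after KIP-62 it only governs the background heartbeat.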

> Make KafkaStreams debugging friendly
> ------------------------------------
>
>                 Key: KAFKA-3262
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3262
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.0.0
>            Reporter: Yasuhiro Matsuda
>            Assignee: Eno Thereska
>              Labels: user-experience
>             Fix For: 0.10.2.0
>
>
> Currently, KafkaStreams polls records in the same thread that processes the
> data. This makes debugging user code, as well as KafkaStreams itself,
> difficult. When the thread is suspended by the debugger, the next heartbeat
> of the consumer tied to the thread won't be sent until the thread is resumed.
> This often results in missed heartbeats and causes a group rebalance, so the
> thread may well be in a completely different context the next time it hits
> the breakpoint.
> We should consider using separate threads for polling and processing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3262) Make KafkaStreams debugging friendly

2016-04-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229508#comment-15229508
 ] 

Guozhang Wang commented on KAFKA-3262:
--------------------------------------

One more thing that we have observed: Kafka Streams currently decides whether
to trigger poll() purely based on the size of its buffered data, without
considering the heartbeat interval. As a result, a complex topology with a
small session.timeout.ms is likely to see false-positive failure detection.
cc [~norwood]
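
The observation above can be sketched as a poll decision that looks at elapsed time as well as buffer size. This is only an illustration, not the Kafka Streams implementation; the class, method, and parameter names are all hypothetical:

```java
public class PollDecision {
    /**
     * Hypothetical helper: poll when the buffer has room for more records,
     * or when enough time has passed that a heartbeat is due, even if the
     * buffer is still full.
     */
    public static boolean shouldPoll(int bufferedRecords, int maxBufferedRecords,
                                     long millisSinceLastPoll, long heartbeatIntervalMs) {
        boolean bufferHasRoom = bufferedRecords < maxBufferedRecords;
        boolean heartbeatDue = millisSinceLastPoll >= heartbeatIntervalMs;
        return bufferHasRoom || heartbeatDue;
    }

    public static void main(String[] args) {
        // Full buffer, heartbeat not yet due: keep processing.
        System.out.println(shouldPoll(1000, 1000, 500, 3000));
        // Full buffer, heartbeat due: poll anyway.
        System.out.println(shouldPoll(1000, 1000, 3000, 3000));
    }
}
```

A size-only check corresponds to returning `bufferHasRoom` alone, which is what lets a slow topology go too long without polling. A real implementation would presumably pause the assigned partitions before a heartbeat-only poll so that the full buffer does not grow further.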

> Make KafkaStreams debugging friendly
> ------------------------------------
>
>                 Key: KAFKA-3262
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3262
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: kafka streams
>    Affects Versions: 0.10.0.0
>            Reporter: Yasuhiro Matsuda
>              Labels: developer-experience
>             Fix For: 0.10.1.0


[jira] [Commented] (KAFKA-3262) Make KafkaStreams debugging friendly

2016-03-29 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216934#comment-15216934
 ] 

Guozhang Wang commented on KAFKA-3262:
--------------------------------------

I agree that for the development cycle we should enforce "single thread".

[~jkreps] What do you mean by "Another issue is that the clients default to 
debug logging"?

> Make KafkaStreams debugging friendly
> ------------------------------------
>
>                 Key: KAFKA-3262
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3262
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: kafka streams
>    Affects Versions: 0.10.0.0
>            Reporter: Yasuhiro Matsuda
>             Fix For: 0.10.0.1


[jira] [Commented] (KAFKA-3262) Make KafkaStreams debugging friendly

2016-02-23 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159545#comment-15159545
 ] 

Jay Kreps commented on KAFKA-3262:
----------------------------------

This is a good catch; I ran into this issue too. Another issue is that the
clients default to debug logging, which is not ideal for development (it can be
kind of confusing whether something is happening or not since all the action is
in the event loop).

I'm a little hesitant about fixing this issue by background threading, though.
A few things to be careful of:
1. Orchestrating back and forth with the background thread is complicated.
2. If we use a blocking queue to pass data, it will be really important to
batch actions to not kill performance (or at least that was our finding before).
3. Having a single thread and having the debugger step into the consumer
itself is actually more transparent (I think) and will make various failure
scenarios work the way we want (e.g. the contract in the consumer is that a
fatal error throws an exception, which should propagate and not just kill the
bg thread).
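
The batching concern in point 2 can be sketched as follows: the polling side hands over whole batches as single queue elements, so the queue is touched once per batch rather than once per record. This is a single-threaded illustration with hypothetical names; a real polling thread would more likely use the blocking `put()` and `take()` calls:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchedHandoff {
    /** Splits records into batches of at most batchSize and enqueues each batch as one element. */
    public static int enqueueInBatches(List<String> records, int batchSize,
                                       BlockingQueue<List<String>> queue) {
        int batches = 0;
        for (int from = 0; from < records.size(); from += batchSize) {
            int to = Math.min(from + batchSize, records.size());
            // One queue operation per batch instead of one per record.
            queue.add(new ArrayList<>(records.subList(from, to)));
            batches++;
        }
        return batches;
    }

    /** Processing side: takes batches until the queue is empty and counts the records seen. */
    public static int processAll(BlockingQueue<List<String>> queue) {
        int processed = 0;
        List<String> batch;
        while ((batch = queue.poll()) != null) {
            processed += batch.size();
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            records.add("record-" + i);
        }
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>();
        System.out.println("batches enqueued: " + enqueueInBatches(records, 4, queue));
        System.out.println("records processed: " + processAll(queue));
    }
}
```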

I suppose an alternative to changing the threading model would be just to set 
the timeout really high in development. It occurs to me there are several 
things you might want in development:
1. Large or infinite session timeout
2. More logging
3. Single threaded?
4. Re-start from the beginning of the inputs?
5. Recreate intermediate topics?

Dunno, maybe there should be some kind of overall "dev-mode" for all of this?
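
Part of such a "dev-mode" could be sketched as a set of config overrides. No such mode exists in the discussion above; the class and method names and the chosen values are assumptions, while the config keys themselves (`session.timeout.ms`, `auto.offset.reset`, `application.id`) are real:

```java
import java.util.Properties;
import java.util.UUID;

public class DevModeConfig {
    /** Hypothetical helper: overrides aimed at a local debugging session. */
    public static Properties devOverrides(String baseApplicationId) {
        Properties props = new Properties();
        // 1. Large session timeout (one hour, an arbitrary choice). The broker
        //    caps this via group.max.session.timeout.ms, which may also need raising.
        props.setProperty("session.timeout.ms", "3600000");
        // 4. Start from the beginning of the inputs. This only applies when the
        //    group has no committed offsets, hence the fresh application id below.
        props.setProperty("auto.offset.reset", "earliest");
        // 4./5. A unique application id per run means no committed offsets are reused.
        props.setProperty("application.id", baseApplicationId + "-dev-" + UUID.randomUUID());
        return props;
    }

    public static void main(String[] args) {
        Properties props = devOverrides("wordcount");
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

Because internal topic names are prefixed with the application id, a fresh id should also lead to fresh intermediate topics. Logging verbosity (item 2) is configured in the logging framework and is not covered here.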

> Make KafkaStreams debugging friendly
> ------------------------------------
>
>                 Key: KAFKA-3262
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3262
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: kafka streams
>    Affects Versions: 0.9.1.0
>            Reporter: Yasuhiro Matsuda