[ 
https://issues.apache.org/jira/browse/KAFKA-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205740#comment-17205740
 ] 

Sophie Blee-Goldman commented on KAFKA-10555:
---------------------------------------------

I was also thinking we should not transition to ERROR if, for example, the 
user requests an application shutdown in the new exception handler. I would 
consider this a graceful shutdown and transition to NOT_RUNNING, unless of 
course an error occurs during the graceful shutdown, in which case we should 
transition to ERROR. If we transition to ERROR no matter what, the state 
machine cannot differentiate between a successful graceful shutdown and an 
actual error.

Similarly, I think we should transition to NOT_RUNNING if the user chooses the 
SHUTDOWN_KAFKA_STREAMS_CLIENT option in the exception handler: this is also a 
graceful shutdown, equivalent to the user calling KafkaStreams#close. 

But I'd be happy to include both options in the new Streams exception handler 
and allow users to choose which terminal state to end up in. I can see how a 
user may want to decide between ERROR and NOT_RUNNING based on the specific 
exception thrown.

 

That said, I'm much less concerned about the "weird position" where a dying 
thread in a multithreaded app can be replaced whereas a dying thread in a 
single-threaded app cannot. For one thing, we plan to have a 
"REPLACE_STREAM_THREAD" enum value in the new Streams uncaught exception 
handler – presumably, this would be implemented such that if the only thread 
dies, a new thread is started to replace it before we would transition to 
ERROR. But if you allow the thread to die without choosing to start a new 
one, and the dead thread was your last one, then transitioning to ERROR seems 
totally appropriate imo. You had the chance to start up a new thread and 
didn't take it. 

It also seems useful to me to retain the ERROR state as a way to notify users 
of the death of the final thread. Whether this was the fifth thread out of 5, 
or the first thread out of 1, seems beside the point.
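
A rough sketch of the transition rules argued for above, purely for 
illustration. Every name here (the class, both enums, nextState and its 
parameters) is hypothetical and is not the actual KafkaStreams API; collapsing 
"the client stays alive" into RUNNING is also a simplifying assumption.

```java
final class TerminalStateSketch {
    // Hypothetical handler responses, named after the options discussed in this thread.
    enum HandlerResponse {
        REPLACE_STREAM_THREAD,
        SHUTDOWN_KAFKA_STREAMS_CLIENT,
        SHUTDOWN_APPLICATION,
        NONE // the user lets the thread die without taking any action
    }

    // Simplified client states; the real client has more (e.g. REBALANCING).
    enum ClientState { RUNNING, NOT_RUNNING, ERROR }

    /**
     * Decides the client state after a stream thread has died.
     *
     * @param response       what the exception handler asked for
     * @param liveThreads    threads still alive after the death, before any replacement
     * @param shutdownFailed whether an error occurred during a requested shutdown
     */
    static ClientState nextState(HandlerResponse response, int liveThreads, boolean shutdownFailed) {
        switch (response) {
            case REPLACE_STREAM_THREAD:
                // A replacement is started, so even the last thread dying is not terminal.
                return ClientState.RUNNING;
            case SHUTDOWN_KAFKA_STREAMS_CLIENT:
            case SHUTDOWN_APPLICATION:
                // A requested shutdown is graceful unless the shutdown itself fails.
                return shutdownFailed ? ClientState.ERROR : ClientState.NOT_RUNNING;
            default:
                // No action taken: ERROR only once the final thread is gone,
                // whether it was the fifth of 5 or the first of 1.
                return liveThreads == 0 ? ClientState.ERROR : ClientState.RUNNING;
        }
    }

    public static void main(String[] args) {
        System.out.println(nextState(HandlerResponse.SHUTDOWN_APPLICATION, 0, false));
        System.out.println(nextState(HandlerResponse.NONE, 0, false));
    }
}
```

Under these assumptions, ERROR is reached only by a failed shutdown or by 
letting the last thread die unhandled, so it remains distinguishable from a 
graceful NOT_RUNNING.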

cc [~mjsax] [~wcarlson5] [~cadonna]

> Improve client state machine
> ----------------------------
>
>                 Key: KAFKA-10555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Priority: Major
>              Labels: needs-kip
>
> The KafkaStreams client exposes its state to the user for monitoring 
> purposes (i.e., RUNNING, REBALANCING, etc.). The state of the client depends 
> on the state(s) of the internal StreamThreads, which have their own states.
> Furthermore, the client state affects what the user can do with the 
> client. For example, active tasks can only be queried in RUNNING state, and 
> so on.
> With KIP-671 and KIP-663 we improved error handling capabilities and allowed 
> stream threads to be added/removed dynamically. We allow adding/removing 
> threads only in RUNNING and REBALANCING state. This puts us in a "weird" 
> position, because if we enter ERROR state (i.e., if the last thread dies), we 
> cannot add new threads any longer. However, if we have multiple threads and 
> one dies, we don't enter ERROR state and do allow the thread to be recovered.
> Before the KIPs the definition of ERROR state was clear; however, with both 
> KIPs it seems that we should revisit its semantics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
