[ 
https://issues.apache.org/jira/browse/KAFKA-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16295377#comment-16295377
 ] 

Chen He commented on KAFKA-6353:
--------------------------------

Thank you, Mr. [~rhauch]. Just sent the email. 

Also, some updates on this issue. When a user first submits a request to 
create a connector, they specify the number of tasks. That means there are two 
levels of status: connector level and task level. We first need to clarify the 
relationship between these two levels. 

1. What is the definition of the connector-level "RUNNING": that all tasks are 
running, that some tasks are running, or something unrelated to task status 
that just reflects the ability to create new tasks through the connector?
2. How should we handle the case where all tasks are "FAILED"? Should we still 
mark the connector as "RUNNING"?

IMHO
a) Task status only says whether a given task is still running. Even if all 
tasks have failed, we can still mark the connector as "RUNNING" if it can 
create new tasks. 
b) Connector status should also reflect whether tasks currently exist. If at 
least one task is running but the connector has lost the ability to create new 
tasks, we should mark it with a status other than "FAILED" (like "SUSPENDED"). 
That way, a UI or other monitoring system knows there is a problem in the 
connector but that some tasks are still functioning well, which helps it make 
the right decision.
c) The reason this issue happens is that the tasks are still running correctly, 
but the connector failed to restart (a timeout was reached or some other 
problem occurred at the connector level). We may need to find out why the 
timeout is set.
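
To make (a) and (b) concrete, here is a rough sketch in Java of how a 
connector-level status could be derived. It is not the Kafka Connect 
implementation: the State enum, the connectorState method, and the 
canCreateTasks flag are hypothetical names used only for illustration, and 
"SUSPENDED" is just the status proposed above, not an existing state.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ConnectorStatusSketch {

    // Hypothetical states. SUSPENDED is the status proposed in this comment,
    // not a state that Kafka Connect defines.
    enum State { RUNNING, SUSPENDED, FAILED }

    // canCreateTasks: assumed flag saying whether the connector itself is
    // healthy enough to create new tasks.
    // taskStates: the current states of the connector's tasks.
    static State connectorState(boolean canCreateTasks, List<State> taskStates) {
        if (canCreateTasks) {
            // (a) The connector stays RUNNING even if every task has failed.
            return State.RUNNING;
        }
        // (b) The connector is unhealthy; distinguish "some tasks still work"
        // from "nothing works".
        boolean anyTaskRunning = taskStates.contains(State.RUNNING);
        return anyTaskRunning ? State.SUSPENDED : State.FAILED;
    }

    public static void main(String[] args) {
        // The case from this issue: connector unhealthy, task 0 still running.
        System.out.println(connectorState(false, Arrays.asList(State.RUNNING)));
        // All tasks failed, but the connector can still create new tasks.
        System.out.println(connectorState(true, Arrays.asList(State.FAILED)));
        // Connector unhealthy and no task running.
        System.out.println(connectorState(false, Collections.<State>emptyList()));
    }
}
```

With this rule, the status in the issue description (connector "FAILED", 
task 0 "RUNNING") would be reported as "SUSPENDED" instead.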


> Connector status shows FAILED but actually task is in RUNNING status
> --------------------------------------------------------------------
>
>                 Key: KAFKA-6353
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6353
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 0.10.2.1
>            Reporter: Chen He
>
> {"name":"test","connector":{"state":"FAILED","trace":"ERROR 
> MESSAGE","worker_id":"localhost:8083"},"tasks":[{"state":"RUNNING","id":0,"worker_id":"localhost:8083"}]}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)