[GitHub] [kafka] C0urante commented on a diff in pull request #14336: KAFKA-14876: Add stopped state to Kafka Connect Administration docs section

via GitHub Tue, 05 Sep 2023 09:53:20 -0700


C0urante commented on code in PR #14336:
URL: https://github.com/apache/kafka/pull/14336#discussion_r1316150923



##########
docs/connect.html:
##########
@@ -1078,7 +1079,7 @@ <h4><a id="connect_administration" 
href="#connect_administration">Kafka Connect
     </p>
 
     <p>
-    It's sometimes useful to temporarily stop the message processing of a 
connector. For example, if the remote system is undergoing maintenance, it 
would be preferable for source connectors to stop polling it for new data 
instead of filling logs with exception spam. For this use case, Connect offers 
a pause/resume API. While a source connector is paused, Connect will stop 
polling it for additional records. While a sink connector is paused, Connect 
will stop pushing new messages to it. The pause state is persistent, so even if 
you restart the cluster, the connector will not begin message processing again 
until the task has been resumed. Note that there may be a delay before all of a 
connector's tasks have transitioned to the PAUSED state since it may take time 
for them to finish whatever processing they were in the middle of when being 
paused. Additionally, failed tasks will not transition to the PAUSED state 
until they have been restarted.
+    It's sometimes useful to temporarily stop the message processing of a 
connector. For example, if the remote system is undergoing maintenance, it 
would be preferable for source connectors to stop polling it for new data 
instead of filling logs with exception spam. For this use case, Connect offers 
pause / stop / resume APIs. While a source connector is paused or stopped, 
Connect will stop polling it for additional records. While a sink connector is 
paused or stopped, Connect will stop pushing new messages to it. The paused / 
stopped state is persistent, so even if you restart the cluster, the connector 
will not begin message processing again until it has been resumed. For a paused 
connector, any resources claimed by its tasks are left allocated, which allows 
the connector to begin processing data quickly once it is resumed. When a 
connector is stopped, however, its tasks are shut down and any resources 
claimed by its tasks are deallocated. This is more efficient from a resource 
usage standpoint than pausing the connector, but can cause it to take longer to 
begin processing data once resumed. Note that there may be a delay before all 
of a connector's tasks have transitioned to the PAUSED state since it may take 
time for them to finish whatever processing they were in the middle of when 
being paused. Additionally, failed tasks will not transition to the PAUSED 
state until they have been restarted.

Review Comment:
   It feels like this muddies the waters a bit for the existing pause/resume 
API... what do you think about leaving this paragraph as it was, and adding a 
separate paragraph afterward that covers the `STOPPED` state, explaining that 
it goes beyond the `PAUSED` state by completely shutting down tasks instead of 
leaving them idling?
   
   We can also include information on offset management even if we intend to 
backport this PR; I can just make sure to either remove it or clarify that it 
refers to a feature available in later versions.
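
   For reference, here is a minimal sketch of how the three state transitions discussed here map onto the Connect REST API, assuming a worker reachable at `http://localhost:8083` and a Kafka version that already ships the stop endpoint. The helper name `state_request`, the `CONNECT_URL` constant, and the connector name `my-connector` are hypothetical and only serve the illustration. The sketch only builds the requests and does not send them.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a Connect worker listening on the default REST port.
CONNECT_URL = "http://localhost:8083"

# REST actions that change a connector's target state.
ACTIONS = ("pause", "stop", "resume")


def state_request(connector: str, action: str, base_url: str = CONNECT_URL) -> Request:
    """Build (but do not send) the PUT request for a state transition."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # Connector names may contain characters that need URL escaping.
    name = quote(connector, safe="")
    return Request(f"{base_url}/connectors/{name}/{action}", method="PUT")


if __name__ == "__main__":
    for action in ACTIONS:
        req = state_request("my-connector", action)
        print(req.get_method(), req.full_url)
```

   Passing the returned object to `urllib.request.urlopen` would actually issue the call. As the paragraph under review notes, tasks may take some time to reach their new state after the request is made, so the connector's status should be checked rather than assumed.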



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
