Re: [DISCUSS] KIP-939: Support Participation in 2PC

2024-01-26 Thread Artem Livshits
Hi Jun,

> 20.  I am a bit confused by how we set keepPreparedTxn.  ...

keepPreparedTxn=true informs the transaction coordinator that it should
keep the ongoing transaction, if any.  If keepPreparedTxn=false, then
any ongoing transaction is aborted (this is exactly the current behavior).
enable2Pc is a separate argument that is controlled by the
transaction.two.phase.commit.enable setting on the client.

To start 2PC, the client just needs to set
transaction.two.phase.commit.enable=true in the config.  Then, if the
client knows the status of the transaction upfront (in the case of Flink,
Flink tracks whether the transaction is prepared in its own store, so it
always knows upfront), it can set keepPreparedTxn accordingly: if the
transaction was prepared, it will be kept and ready for the client to
complete with the appropriate action; if the client doesn't know whether
the transaction is prepared, keepPreparedTxn should be false, in which
case we'll get to a clean state (the same way we do today).

For the dual-write recipe, the client doesn't know upfront if the
transaction is prepared; this information is implicitly encoded in the
PreparedTxnState value, which can be used to resolve the transaction state.
In that case, keepPreparedTxn should always be true, because we don't know
upfront and we don't want to accidentally abort a committed transaction.
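
To make that concrete, here is a rough, non-authoritative sketch of the
dual-write recovery path.  It follows the API shape proposed in the KIP
(initTransactions(boolean keepPreparedTxn), a PreparedTxnState value, and a
completeTransaction(PreparedTxnState) call that commits or aborts by
comparing the stored state against the prepared transaction), but the exact
names may differ in the final KIP, and dbStore is a purely hypothetical
helper that reads the PreparedTxnState persisted in the database:

  import java.util.Properties;
  import org.apache.kafka.clients.producer.KafkaProducer;
  // PreparedTxnState is the new class proposed by the KIP (package TBD).

  Properties props = new Properties();
  props.put("bootstrap.servers", "localhost:9092");
  props.put("transactional.id", "dual-write-app-1");
  props.put("transaction.two.phase.commit.enable", "true");  // Enable2Pc
  props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer");
  props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer");

  try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      // keepPreparedTxn=true: never abort an ongoing transaction that may
      // already be committed on the database side.
      producer.initTransactions(true);

      // PreparedTxnState that was persisted in the DB as part of its own
      // transaction (hypothetical dbStore helper).
      PreparedTxnState stored = dbStore.loadPreparedTxnState();

      // Commit or abort the kept transaction by comparing it against the
      // state stored in the DB.
      producer.completeTransaction(stored);
  }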

The forceTerminateTransaction call can just use keepPreparedTxn=false; it
actually doesn't matter whether it sets the Enable2Pc flag.

> 21. TransactionLogValue: Do we need some field to identify whether this
is written for 2PC so that ongoing txn is never auto aborted?

The TransactionTimeoutMs would be set to Integer.MAX_VALUE if 2PC was
enabled.  I've added a note to the KIP about this.

> 22

You're right it's a typo.  I fixed it as well as step 9 (REQUEST:
ProducerId=73, ProducerEpoch=MAX).

> 23. It's a bit weird that Enable2Pc is driven by a config while
KeepPreparedTxn is from an API param ...

The intent to use 2PC doesn't change from transaction to transaction, but
the intent to keep a prepared txn may change from transaction to
transaction.  In dual-write recipes the distinction is not clear, but for
use cases where the keepPreparedTxn value is known upfront (e.g. Flink) it's
more prominent.  E.g. Flink's Kafka sink operator could be deployed with
transaction.two.phase.commit.enable=true hardcoded in the image, but
keepPreparedTxn cannot be hardcoded in the image, because it depends on the
job manager's state.
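
Purely as an illustration (the props / producer / jobManagerState names here
are placeholders, not real Flink or Kafka identifiers): the 2PC intent is
fixed at build time, while the keepPreparedTxn argument is computed when the
operator starts:

  // baked into the image / static config
  props.put("transaction.two.phase.commit.enable", "true");

  // decided at startup from the job manager's checkpointed state
  boolean txnWasPrepared = jobManagerState.isTransactionPrepared();
  producer.initTransactions(txnWasPrepared);  // keepPreparedTxn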

> 24

The flow is actually going to work the same way as it does now -- the "main"
producer id + epoch needs to be used in all operations to prevent fencing
(it's sort of a common "header" in all RPC calls that follow the same
rules).  The ongoing txn info is just additional information for making a
commit / abort decision based on the PreparedTxnState from the DB.

--Artem

On Thu, Jan 25, 2024 at 11:05 AM Jun Rao  wrote:

> Hi, Artem,
>
> Thanks for the reply. A few more comments.
>
> 20. I am a bit confused by how we set keepPreparedTxn. From the KIP, I got
> the following (1) to start 2pc, we call
> InitProducerId(keepPreparedTxn=false); (2) when the producer fails and
> needs to do recovery, it calls InitProducerId(keepPreparedTxn=true); (3)
> Admin.forceTerminateTransaction() calls
> InitProducerId(keepPreparedTxn=false).
> 20.1 In (1), when a producer calls InitProducerId(false) with 2pc enabled,
> and there is an ongoing txn, should the server return an error to the
> InitProducerId request? If so, what would be the error code?
> 20.2 How do we distinguish between (1) and (3)? It's the same API call but
> (1) doesn't abort ongoing txn and (2) does.
> 20.3 The usage in (1) seems unintuitive. 2pc implies keeping the ongoing
> txn. So, setting keepPreparedTxn to false to start 2pc seems counter
> intuitive.
>
> 21. TransactionLogValue: Do we need some field to identify whether this is
> written for 2PC so that ongoing txn is never auto aborted?
>
> 22. "8. InitProducerId(true); TC STATE: Ongoing, ProducerId=42,
> ProducerEpoch=MAX-1, PrevProducerId=-1, NextProducerId=73,
> NextProducerEpoch=MAX; RESPONSE ProducerId=73, Epoch=MAX-1,
> OngoingTxnProducerId=42, OngoingTxnEpoch=MAX-1"
> It seems in the above example, Epoch in RESPONSE should be MAX to match
> NextProducerEpoch?
>
> 23. It's a bit weird that Enable2Pc is driven by a config
> while KeepPreparedTxn is from an API param. Should we make them more
> consistent since they seem related?
>
> 24. "9. Commit; REQUEST: ProducerId=73, ProducerEpoch=MAX-1; TC STATE:
> PrepareCommit, ProducerId=42, ProducerEpoch=MAX, PrevProducerId=73,
> NextProducerId=85, NextProducerEpoch=0; RESPONSE ProducerId=85, Epoch=0,
> When a commit request is sent, it uses the latest ProducerId and
> ProducerEpoch."
> The step where we use the next produceId to commit an old txn works, but
> can be confusing. It's going to be hard for people implementing this new
> client protocol to figure out when to use the current or 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #75

2024-01-26 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16200) Ensure RequestManager handling of expired timeouts are consistent

2024-01-26 Thread Kirk True (Jira)
Kirk True created KAFKA-16200:
-

 Summary: Ensure RequestManager handling of expired timeouts are 
consistent
 Key: KAFKA-16200
 URL: https://issues.apache.org/jira/browse/KAFKA-16200
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Kirk True
Assignee: Kirk True
 Fix For: 3.8.0








[jira] [Created] (KAFKA-16199) Prune the event queue if events have expired before starting

2024-01-26 Thread Kirk True (Jira)
Kirk True created KAFKA-16199:
-

 Summary: Prune the event queue if events have expired before 
starting
 Key: KAFKA-16199
 URL: https://issues.apache.org/jira/browse/KAFKA-16199
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Kirk True
Assignee: Kirk True
 Fix For: 3.8.0








[DISCUSS] KIP-1017: A health check endpoint for Kafka Connect

2024-01-26 Thread Chris Egerton
Hi all,

Happy Friday! I'd like to kick off discussion for KIP-1017, which (as the
title suggests) proposes adding a health check endpoint for Kafka Connect:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1017%3A+Health+check+endpoint+for+Kafka+Connect

This is one of the longest-standing issues with Kafka Connect and I'm
hoping we can finally put it in the ground soon. Looking forward to hearing
people's thoughts!

Cheers,

Chris


[jira] [Resolved] (KAFKA-16122) TransactionsBounceTest -- server disconnected before response was received

2024-01-26 Thread Justine Olshan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justine Olshan resolved KAFKA-16122.

Resolution: Fixed

> TransactionsBounceTest -- server disconnected before response was received
> --
>
> Key: KAFKA-16122
> URL: https://issues.apache.org/jira/browse/KAFKA-16122
> Project: Kafka
>  Issue Type: Test
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Major
>
> I noticed a ton of tests failing with 
> {code:java}
> Error  org.apache.kafka.common.KafkaException: Unexpected error in 
> TxnOffsetCommitResponse: The server disconnected before a response was 
> received.  {code}
> {code:java}
> Stacktrace  org.apache.kafka.common.KafkaException: Unexpected error in 
> TxnOffsetCommitResponse: The server disconnected before a response was 
> received.  at 
> app//org.apache.kafka.clients.producer.internals.TransactionManager$TxnOffsetCommitHandler.handleResponse(TransactionManager.java:1702)
>   at 
> app//org.apache.kafka.clients.producer.internals.TransactionManager$TxnRequestHandler.onComplete(TransactionManager.java:1236)
>   at 
> app//org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:154)
>   at 
> app//org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:608)
>   at app//org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:600) 
>  at 
> app//org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:457)
>   at 
> app//org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:334)
>   at 
> app//org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:249)  
> at java.base@21.0.1/java.lang.Thread.run(Thread.java:1583) {code}
> The error indicates a network error which is retriable but the 
> TxnOffsetCommit handler doesn't expect this. 
> https://issues.apache.org/jira/browse/KAFKA-14417 addressed many of the other 
> requests but not this one. 





[jira] [Resolved] (KAFKA-15987) Refactor ReplicaManager code for transaction verification

2024-01-26 Thread Justine Olshan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justine Olshan resolved KAFKA-15987.

Resolution: Fixed

> Refactor ReplicaManager code for transaction verification
> -
>
> Key: KAFKA-15987
> URL: https://issues.apache.org/jira/browse/KAFKA-15987
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Major
>
> I started to do this in KAFKA-15784, but the diff was deemed too large and 
> confusing. I just wanted to file a followup ticket to reference this in code 
> for the areas that will be refactored.
>  
> I hope to tackle it immediately after.





Re: [VOTE] 3.7.0 RC2

2024-01-26 Thread kafka
Apologies, I mentioned KAFKA-16157 twice in my previous message. I intended the
second JIRA to be KAFKA-16195, with the PR at https://github.com/apache/kafka/pull/15262.

Thanks,
Gaurav

> On 26 Jan 2024, at 15:34, ka...@gnarula.com wrote:
> 
> Hi Stan,
> 
> I wanted to share some updates about the bugs you shared earlier.
> 
> - KAFKA-14616: I've reviewed and tested the PR from Colin and have observed
> the fix works as intended.
> - KAFKA-16162: I reviewed Proven's PR and found some gaps in the proposed 
> fix. I've
> therefore raised https://github.com/apache/kafka/pull/15270 following a 
> discussion with Luke in JIRA.
> - KAFKA-16082: I don't think this is marked as a blocker anymore. I'm awaiting
> feedback/reviews at https://github.com/apache/kafka/pull/15136
> 
> In addition to the above, there are 2 JIRAs I'd like to bring everyone's 
> attention to:
> 
> - KAFKA-16157: This is similar to KAFKA-14616 and is marked as a blocker. 
> I've raised
> https://github.com/apache/kafka/pull/15263 and am awaiting reviews on it.
> - KAFKA-16157: I raised this yesterday and have addressed feedback from Luke. 
> This should
> hopefully get merged soon.
> 
> Regards,
> Gaurav
> 
> 
>> On 24 Jan 2024, at 11:51, ka...@gnarula.com wrote:
>> 
>> Hi Stanislav,
>> 
>> Thanks for bringing these JIRAs/PRs up.
>> 
>> I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week and I 
>> hope to have some feedback
>> by Friday. I gather the latter JIRA is marked as a WIP by Proven and he's 
>> away. I'll try to build on his work in the meantime.
>> 
>> As for KAFKA-16082, we haven't been able to deduce a data loss scenario. 
>> There's a PR open
>> by me for promoting an abandoned future replica with approvals from Omnia 
>> and Proven,
>> so I'd appreciate a committer reviewing it.
>> 
>> Regards,
>> Gaurav
>> 
>> On 23 Jan 2024, at 20:17, Stanislav Kozlovski 
>>  wrote:
>>> 
>>> Hey all, I figured I'd give an update about what known blockers we have
>>> right now:
>>> 
>>> - KAFKA-16101: KRaft migration rollback documentation is incorrect -
>>> https://github.com/apache/kafka/pull/15193; This need not block RC
>>> creation, but we need the docs updated so that people can test properly
>>> - KAFKA-14616: Topic recreation with offline broker causes permanent URPs -
>>> https://github.com/apache/kafka/pull/15230 ; I am of the understanding that
>>> this is blocking JBOD for 3.7
>>> - KAFKA-16162: New created topics are unavailable after upgrading to 3.7 -
>>> a strict blocker with an open PR https://github.com/apache/kafka/pull/15232
>>> - although I understand Proven is out of office
>>> - KAFKA-16082: JBOD: Possible dataloss when moving leader partition - I am
>>> hearing mixed opinions on whether this is a blocker (
>>> https://github.com/apache/kafka/pull/15136)
>>> 
>>> Given that there are 3 JBOD blocker bugs, and I am not confident they will
>>> all be merged this week - I am on the edge of voting to revert JBOD from
>>> this release, or mark it early access.
>>> 
>>> By all accounts, it seems that if we keep with JBOD the release will have
>>> to spill into February, which is a month extra from the time-based release
>>> plan we had of start of January.
>>> 
>>> Can I ask others for an opinion?
>>> 
>>> Best,
>>> Stan
>>> 
>>> On Thu, Jan 18, 2024 at 1:21 PM Luke Chen  wrote:
>>> 
 Hi all,
 
 I think I've found another blocker issue: KAFKA-16162
  .
 The impact is after upgrading to 3.7.0, any new created topics/partitions
 will be unavailable.
 I've put my findings in the JIRA.
 
 Thanks.
 Luke
 
 On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax  wrote:
 
> Stan, thanks for driving this all forward! Excellent job.
> 
> About
> 
>> StreamsStandbyTask - https://issues.apache.org/jira/browse/KAFKA-16141
>> StreamsUpgradeTest - https://issues.apache.org/jira/browse/KAFKA-16139
> 
> For `StreamsUpgradeTest` it was a test setup issue and should be fixed
> now in trunk and 3.7 (and actually also in 3.6...)
> 
> For `StreamsStandbyTask` the failing test exposes a regression bug, so
> it's a blocker. I updated the ticket accordingly. We already have an
> open PR that reverts the code introducing the regression.
> 
> 
> -Matthias
> 
> On 1/17/24 9:44 AM, Proven Provenzano wrote:
>> We have another blocking issue for the RC :
>> https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar
> to
>> https://issues.apache.org/jira/browse/KAFKA-14616. The new issue
 however
>> can lead to the new topic having partitions that a producer cannot
 write
> to.
>> 
>> --Proven
>> 
>> On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano <
> pprovenz...@confluent.io>
>> wrote:
>> 
>>> 
>>> I have a PR https://github.com/apache/kafka/pull/15197 for
>>> 

Re: [VOTE] 3.7.0 RC2

2024-01-26 Thread kafka
Hi Stan,

I wanted to share some updates about the bugs you shared earlier.

- KAFKA-14616: I've reviewed and tested the PR from Colin and have observed
the fix works as intended.
- KAFKA-16162: I reviewed Proven's PR and found some gaps in the proposed fix. 
I've
therefore raised https://github.com/apache/kafka/pull/15270 following a 
discussion with Luke in JIRA.
- KAFKA-16082: I don't think this is marked as a blocker anymore. I'm awaiting
feedback/reviews at https://github.com/apache/kafka/pull/15136

In addition to the above, there are 2 JIRAs I'd like to bring everyone's 
attention to:

- KAFKA-16157: This is similar to KAFKA-14616 and is marked as a blocker. I've 
raised
https://github.com/apache/kafka/pull/15263 and am awaiting reviews on it.
- KAFKA-16157: I raised this yesterday and have addressed feedback from Luke. 
This should
hopefully get merged soon.

Regards,
Gaurav


> On 24 Jan 2024, at 11:51, ka...@gnarula.com wrote:
> 
> Hi Stanislav,
> 
> Thanks for bringing these JIRAs/PRs up.
> 
> I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week and I 
> hope to have some feedback
> by Friday. I gather the latter JIRA is marked as a WIP by Proven and he's 
> away. I'll try to build on his work in the meantime.
> 
> As for KAFKA-16082, we haven't been able to deduce a data loss scenario. 
> There's a PR open
> by me for promoting an abandoned future replica with approvals from Omnia and 
> Proven,
> so I'd appreciate a committer reviewing it.
> 
> Regards,
> Gaurav
> 
> On 23 Jan 2024, at 20:17, Stanislav Kozlovski 
>  wrote:
>> 
>> Hey all, I figured I'd give an update about what known blockers we have
>> right now:
>> 
>> - KAFKA-16101: KRaft migration rollback documentation is incorrect -
>> https://github.com/apache/kafka/pull/15193; This need not block RC
>> creation, but we need the docs updated so that people can test properly
>> - KAFKA-14616: Topic recreation with offline broker causes permanent URPs -
>> https://github.com/apache/kafka/pull/15230 ; I am of the understanding that
>> this is blocking JBOD for 3.7
>> - KAFKA-16162: New created topics are unavailable after upgrading to 3.7 -
>> a strict blocker with an open PR https://github.com/apache/kafka/pull/15232
>> - although I understand Proven is out of office
>> - KAFKA-16082: JBOD: Possible dataloss when moving leader partition - I am
>> hearing mixed opinions on whether this is a blocker (
>> https://github.com/apache/kafka/pull/15136)
>> 
>> Given that there are 3 JBOD blocker bugs, and I am not confident they will
>> all be merged this week - I am on the edge of voting to revert JBOD from
>> this release, or mark it early access.
>> 
>> By all accounts, it seems that if we keep with JBOD the release will have
>> to spill into February, which is a month extra from the time-based release
>> plan we had of start of January.
>> 
>> Can I ask others for an opinion?
>> 
>> Best,
>> Stan
>> 
>> On Thu, Jan 18, 2024 at 1:21 PM Luke Chen  wrote:
>> 
>>> Hi all,
>>> 
>>> I think I've found another blocker issue: KAFKA-16162
>>>  .
>>> The impact is after upgrading to 3.7.0, any new created topics/partitions
>>> will be unavailable.
>>> I've put my findings in the JIRA.
>>> 
>>> Thanks.
>>> Luke
>>> 
>>> On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax  wrote:
>>> 
 Stan, thanks for driving this all forward! Excellent job.
 
 About
 
> StreamsStandbyTask - https://issues.apache.org/jira/browse/KAFKA-16141
> StreamsUpgradeTest - https://issues.apache.org/jira/browse/KAFKA-16139
 
 For `StreamsUpgradeTest` it was a test setup issue and should be fixed
 now in trunk and 3.7 (and actually also in 3.6...)
 
 For `StreamsStandbyTask` the failing test exposes a regression bug, so
 it's a blocker. I updated the ticket accordingly. We already have an
 open PR that reverts the code introducing the regression.
 
 
 -Matthias
 
 On 1/17/24 9:44 AM, Proven Provenzano wrote:
> We have another blocking issue for the RC :
> https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar
 to
> https://issues.apache.org/jira/browse/KAFKA-14616. The new issue
>>> however
> can lead to the new topic having partitions that a producer cannot
>>> write
 to.
> 
> --Proven
> 
> On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano <
 pprovenz...@confluent.io>
> wrote:
> 
>> 
>> I have a PR https://github.com/apache/kafka/pull/15197 for
>> https://issues.apache.org/jira/browse/KAFKA-16131 that is building
>>> now.
>> --Proven
>> 
>> On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz  wrote:
>> 
>>> > Hi Jakub,
>>> >
>>> > Thanks for trying the RC. I think what you found is a blocker bug because it
>>> > will generate huge amount of logspam. I guess we didn't find it in junit
>>> > tests since logspam doesn't fail 

[jira] [Resolved] (KAFKA-14505) Implement TnxOffsetCommit API

2024-01-26 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-14505.
-
Fix Version/s: 3.8.0
   Resolution: Fixed

> Implement TnxOffsetCommit API
> -
>
> Key: KAFKA-14505
> URL: https://issues.apache.org/jira/browse/KAFKA-14505
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Major
>  Labels: kip-848-preview
> Fix For: 3.8.0
>
>
> Implement TnxOffsetCommit API in the new Group Coordinator.





[jira] [Created] (KAFKA-16198) Reconciliation may lose partitions when topic metadata is delayed

2024-01-26 Thread Lucas Brutschy (Jira)
Lucas Brutschy created KAFKA-16198:
--

 Summary: Reconciliation may lose partitions when topic metadata is 
delayed
 Key: KAFKA-16198
 URL: https://issues.apache.org/jira/browse/KAFKA-16198
 Project: Kafka
  Issue Type: Bug
  Components: clients, consumer
Reporter: Lucas Brutschy
Assignee: Lucas Brutschy
 Fix For: 3.8.0


The current reconciliation code in `AsyncKafkaConsumer`s `MembershipManager` 
may lose part of the server-provided assignment when metadata is delayed. The 
reason is incorrect handling of partially resolved topic names, as in this 
example:

 * We get assigned {{T1-1}} and {{T2-1}}
 * We reconcile {{T1-1}}; {{T2-1}} remains in {{assignmentUnresolved}} since 
the topic id {{T2}} is not known yet
 * We get new cluster metadata, which includes {{T2}}, so {{T2-1}} is moved 
to {{assignmentReadyToReconcile}}
 * We call {{reconcile}} -- {{T2-1}} is now treated as the full assignment, so 
{{T1-1}} is being revoked
 * We end up with assignment {{T2-1}}, which is inconsistent with the broker-side 
target assignment.

 

Generally, this seems to be a problem around semantics of the internal 
collections `assignmentUnresolved` and `assignmentReadyToReconcile`. Absence of 
a topic in `assignmentReadyToReconcile` may mean either revocation of the topic 
partition(s), or unavailability of a topic name for the topic.

A simpler internal state with a correct invariant could be achieved by using a 
single collection `currentTargetAssignment` which is keyed by topic ID and 
always corresponds to the latest assignment received from the broker. During 
every attempted reconciliation, all topic IDs would be resolved from the local 
metadata cache, which should not introduce a lot of overhead. `assignmentUnresolved` and 
`assignmentReadyToReconcile` would be removed. 
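
A rough sketch of that invariant (illustrative names only, not the actual 
`MembershipManager` code):

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.Uuid;

// Illustrative only: the broker-provided target assignment is kept in a single
// map keyed by topic ID and re-resolved from the local metadata cache on every
// reconciliation attempt, so partitions are never dropped just because a topic
// name is temporarily unknown.
class TargetAssignmentSketch {
    private Map<Uuid, Set<Integer>> currentTargetAssignment = new HashMap<>();

    void onTargetAssignmentReceived(Map<Uuid, Set<Integer>> fromBroker) {
        currentTargetAssignment = fromBroker; // always the latest broker target
    }

    Set<TopicPartition> resolveForReconciliation(Map<Uuid, String> topicNamesById) {
        Set<TopicPartition> resolved = new HashSet<>();
        for (Map.Entry<Uuid, Set<Integer>> entry : currentTargetAssignment.entrySet()) {
            String name = topicNamesById.get(entry.getKey());
            if (name == null) {
                continue; // name not in metadata yet; stays in the target, retried later
            }
            for (int partition : entry.getValue()) {
                resolved.add(new TopicPartition(name, partition));
            }
        }
        return resolved; // revoke only partitions absent from the fully resolved target
    }
}
{code}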





[jira] [Created] (KAFKA-16197) Connect Worker poll timeout prints Consumer poll timeout specific warnings.

2024-01-26 Thread Sagar Rao (Jira)
Sagar Rao created KAFKA-16197:
-

 Summary: Connect Worker poll timeout prints Consumer poll timeout 
specific warnings.
 Key: KAFKA-16197
 URL: https://issues.apache.org/jira/browse/KAFKA-16197
 Project: Kafka
  Issue Type: Bug
Reporter: Sagar Rao
Assignee: Sagar Rao


When a worker's poll timeout expires in Connect, the log lines that we see are:

{noformat}
consumer poll timeout has expired. This means the time between subsequent calls 
to poll() was longer than the configured max.poll.interval.ms, which typically 
implies that the poll loop is spending too much time processing messages. You 
can address this either by increasing max.poll.interval.ms or by reducing the 
maximum size of batches returned in poll() with max.poll.records.

{noformat}

and the reason for leaving the group is 


{noformat}
Member XX sending LeaveGroup request to coordinator XX due to consumer poll 
timeout has expired.
{noformat}

which is specific to Consumers and not to Connect workers. The log line above 
is especially misleading because the config `max.poll.interval.ms` is not 
configurable for a Connect worker and could make someone believe that the logs 
are being written for Sink Connectors rather than for the Connect worker. Ideally, we 
should print something specific to Connect.



