[jira] [Resolved] (KAFKA-5046) Support file rotation in FileStreamSource Connector

2023-11-29 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-5046.
---
Resolution: Won't Fix

> Support file rotation in FileStreamSource Connector
> ---
>
> Key: KAFKA-5046
> URL: https://issues.apache.org/jira/browse/KAFKA-5046
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Minor
>
> Currently, when a source file is moved (for file rotation purposes, or between 
> restarts of Kafka Connect), the FileStreamSource Connector cannot detect the 
> change, because it uses only the filename as the key for its offset tracking. 
> Nevertheless, file rotation can be detected easily by checking basic file 
> attributes such as the {{fileKey}}, on platforms where this attribute is 
> supported (for instance, the file key includes the device id and the inode on 
> unix based filesystems), and the file's creation time.
> Such checks need to take place when the task starts and when no more records 
> are read during a call to {{poll}}.
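A minimal sketch of the {{fileKey}}-based check described above, using only java.nio (the class and method names here are illustrative, not the connector's actual code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Objects;

public class FileRotationCheck {

    /**
     * Returns true when the file at {@code path} no longer matches the
     * previously recorded file key, i.e. the file was rotated or replaced.
     * fileKey() is platform dependent: on unix filesystems it encodes the
     * device id and inode, and it may be null where unsupported.
     */
    static boolean wasRotated(Path path, Object previousFileKey) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(path, BasicFileAttributes.class);
        Object currentKey = attrs.fileKey();
        return currentKey != null && !Objects.equals(currentKey, previousFileKey);
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("source", ".log");
        Object key = Files.readAttributes(f, BasicFileAttributes.class).fileKey();
        // The file has not been replaced, so the recorded key still matches.
        System.out.println(wasRotated(f, key)); // prints false
        Files.delete(f);
    }
}
```

As proposed in the ticket, a task would record the key at startup and re-check it whenever a {{poll}} returns no records.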



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-12495) Unbalanced connectors/tasks distribution will happen in Connect's incremental cooperative assignor

2022-04-14 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17522464#comment-17522464
 ] 

Konstantine Karantasis commented on KAFKA-12495:


With respect to release logistics, I think it's worth being clear. This issue, 
although it's an annoying bug, is not a regression and therefore should not be 
treated as a blocker for 3.2.0 if we want to be consistent with our release 
process at this point. Given that we are well into code freeze, the fix should 
target {{trunk}} and be backported to the respective release branches when they 
come out of the code freeze period. 

[~showuon] regarding the fix itself, it's worth noting that the challenge is 
with testing rather than the code changes themselves. But irrespective of that, 
I don't think that being tactical and rushing the existing fix is necessarily 
the right thing to do in this case (and, to be honest, in my opinion timing is 
rarely the most important factor for complicated improvements). 

I believe that it's really worth trying to avoid rebalance storms (which can 
happen when multiple events that each trigger a rebalance occur concurrently). 
If users want to minimize the time to recover from this type of situation, they 
will still have the ability to do so by setting 
[scheduled.rebalance.max.delay.ms|https://kafka.apache.org/documentation/#connectconfigs_scheduled.rebalance.max.delay.ms]
 to a lower value. And if we follow this approach and it works sufficiently 
well, we can consider lowering the default delay in the future as an 
improvement. 
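For reference, lowering the delay is a one-line worker configuration change; the value below is only illustrative (the default is 300000 ms, i.e. 5 minutes):

```properties
# Connect worker configuration (illustrative value, not a recommendation).
# Default is 300000 ms (5 minutes); lowering it shortens the window during
# which tasks of a departed worker remain unassigned.
scheduled.rebalance.max.delay.ms=60000
```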

> Unbalanced connectors/tasks distribution will happen in Connect's incremental 
> cooperative assignor
> --
>
> Key: KAFKA-12495
> URL: https://issues.apache.org/jira/browse/KAFKA-12495
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Blocker
> Fix For: 3.2.0
>
> Attachments: image-2021-03-18-15-04-57-854.png, 
> image-2021-03-18-15-05-52-557.png, image-2021-03-18-15-07-27-103.png
>
>
> In Kafka Connect, we implement the incremental cooperative rebalance 
> algorithm based on KIP-415 
> ([https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect]).
>  However, we have a bad assumption in the algorithm implementation, which is: 
> after the revoking rebalance completes, the member (worker) count will be the 
> same as in the previous round of rebalance.
>  
> Let's take a look at the example in KIP-415:
> !image-2021-03-18-15-07-27-103.png|width=441,height=556!
> It works well for most cases. But what if W4 is added after the 1st rebalance 
> completes and before the 2nd rebalance starts? Let's see what happens with 
> this example (we'll use 10 tasks here):
>  
> {code:java}
> Initial group and assignment: W1([AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> Config topic contains: AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, BT4, 
> BT5
> W1 is current leader
> W2 joins with assignment: []
> Rebalance is triggered
> W3 joins while rebalance is still active with assignment: []
> W1 joins with assignment: [AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, 
> BT4, BT5]
> W1 becomes leader
> W1 computes and sends assignments:
> W1(delay: 0, assigned: [AC0, AT1, AT2, AT3], revoked: [AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> W2(delay: 0, assigned: [], revoked: [])
> W3(delay: 0, assigned: [], revoked: [])
> W1 stops revoked resources
> W1 rejoins with assignment: [AC0, AT1, AT2, AT3]
> Rebalance is triggered
> W2 joins with assignment: []
> W3 joins with assignment: []
> // one more member joined
> W4 joins with assignment: []
> W1 becomes leader
> W1 computes and sends assignments:
> // We assigned all the previous revoked Connectors/Tasks to the new member, 
> but we didn't revoke any more C/T in this round, which cause unbalanced 
> distribution
> W1(delay: 0, assigned: [AC0, AT1, AT2, AT3], revoked: [])
> W2(delay: 0, assigned: [AT4, AT5, BC0], revoked: [])
> W3(delay: 0, assigned: [BT1, BT2, BT4], revoked: [])
> W4(delay: 0, assigned: [BT4, BT5], revoked: [])
> {code}
> Because we don't allow consecutive revocations in two consecutive 
> rebalances (under the same leader), we end up with this uneven distribution 
> in this situation. We should allow a consecutive rebalance to have another 
> round of revocation that moves C/T to the other members in this case.
> expected:
> {code:java}
> Initial group and assignment: W1([AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> Config topic contains: AC0, AT1, AT2, AT3, AT4, AT5, 

[jira] [Commented] (KAFKA-12495) Unbalanced connectors/tasks distribution will happen in Connect's incremental cooperative assignor

2022-04-13 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521853#comment-17521853
 ] 

Konstantine Karantasis commented on KAFKA-12495:


Thanks for documenting the issue in detail [~showuon]. I'm adding here the 
comment I left on the PR. 

My main concern is indeed related to the proposed change to apply consecutive 
rebalances that will perform revocations.

The current incremental cooperative rebalancing algorithm uses two consecutive 
rebalances to move tasks between workers: one during which revocations happen 
and one during which the revoked tasks are reassigned. Although this is clearly 
not an atomic process (as this issue also demonstrates), I find that it's a 
good property to maintain and reason about.

Allowing consecutive revocations that happen immediately when an imbalance is 
detected might mean that the workers overreact to external circumstances that 
caused an imbalance between the initial calculation of task assignments in the 
revocation rebalance and the subsequent rebalance that assigns the revoked 
tasks. Such circumstances might have to do with rolling upgrades, scaling a 
cluster up or down, or simply temporary instability. We were first able to 
reproduce this issue in the integration test that is currently disabled.

My main thought was that, instead of risking shuffling tasks too aggressively 
within a short period of time and opening the door to bugs that make workers 
oscillate between imbalanced task assignments continuously and in a tight loop, 
we could use the existing mechanism of scheduling delayed rebalances to program 
workers to perform a pair of rebalances (revocation + reassignment) soon after 
an imbalance is detected.

Regarding when an imbalance is detected, the good news is that the leader 
worker sending the assignment during the second rebalance of a pair knows that 
it will send an imbalanced assignment (there's no code to detect that right 
now, but it can easily be added just before the assignment is sent). The idea 
here would be to send this assignment anyway, but also schedule a follow-up 
rebalance that will have the opportunity to balance tasks soon with our 
standard pair of rebalances, which works dependably as long as no workers are 
added or removed between the two rebalances. We can discuss what a good 
setting for the delay is. One obvious possibility is to reuse the existing 
property; adding another config just for this seems unwarranted. To shield 
ourselves from an infinite series of such rebalances, the leader should also 
keep track of how many such attempts have been made and stop attempting to 
balance out tasks after a certain number of tries. Of course, every other 
normal rebalance should reset both this counter and possibly the delay.

I'd be interested to hear what you think of this approach, which is quite 
similar to what you have already demonstrated but potentially less risky in 
terms of changes to the assignor logic and how aggressively the leader attempts 
to fix an imbalance.
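The counter-and-delay idea above can be sketched in plain Java with simple collections (the class, imbalance threshold, and attempt cap below are hypothetical illustrations, not Connect's assignor code):

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;

public class ImbalanceFollowUp {
    // Hypothetical cap on follow-up rebalances, to avoid an infinite series.
    static final int MAX_FOLLOW_UP_ATTEMPTS = 3;
    private int attempts = 0;

    /** An assignment counts as imbalanced when the largest and smallest
     *  per-worker task counts differ by more than one. */
    static boolean isImbalanced(Map<String, List<String>> assignment) {
        IntSummaryStatistics s = assignment.values().stream()
                .mapToInt(List::size).summaryStatistics();
        return s.getMax() - s.getMin() > 1;
    }

    /** Returns the delay in ms for a follow-up rebalance, or -1 when none
     *  should be scheduled. A balanced round resets the attempt counter. */
    long followUpDelayMs(Map<String, List<String>> assignment, long configuredDelayMs) {
        if (!isImbalanced(assignment)) {
            attempts = 0;   // normal, balanced rebalance: reset
            return -1;
        }
        if (attempts >= MAX_FOLLOW_UP_ATTEMPTS) {
            return -1;      // give up after a bounded number of tries
        }
        attempts++;
        return configuredDelayMs;
    }
}
```

The leader would run such a check just before sending the second assignment of a pair, reusing the configured delay rather than introducing a new property.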

> Unbalanced connectors/tasks distribution will happen in Connect's incremental 
> cooperative assignor
> --
>
> Key: KAFKA-12495
> URL: https://issues.apache.org/jira/browse/KAFKA-12495
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Blocker
> Fix For: 3.2.0
>
> Attachments: image-2021-03-18-15-04-57-854.png, 
> image-2021-03-18-15-05-52-557.png, image-2021-03-18-15-07-27-103.png
>
>
> In Kafka Connect, we implement the incremental cooperative rebalance 
> algorithm based on KIP-415 
> ([https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect]).
>  However, we have a bad assumption in the algorithm implementation, which is: 
> after the revoking rebalance completes, the member (worker) count will be the 
> same as in the previous round of rebalance.
>  
> Let's take a look at the example in KIP-415:
> !image-2021-03-18-15-07-27-103.png|width=441,height=556!
> It works well for most cases. But what if W4 is added after the 1st rebalance 
> completes and before the 2nd rebalance starts? Let's see what happens with 
> this example (we'll use 10 tasks here):
>  
> {code:java}
> Initial group and assignment: W1([AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> Config topic contains: AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, BT4, 
> BT5
> W1 is current leader
> W2 joins with assignment: []
> Rebalance 

[jira] [Resolved] (KAFKA-13748) Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default

2022-03-30 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-13748.

Resolution: Fixed

cc [~cadonna] [~tombentley] re: inclusion to the upcoming releases. 

> Do not include file stream connectors in Connect's CLASSPATH and plugin.path 
> by default
> ---
>
> Key: KAFKA-13748
> URL: https://issues.apache.org/jira/browse/KAFKA-13748
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.2
>
>
> File stream connectors have been included with Kafka Connect distributions 
> from the very beginning. These simple connectors were included to showcase 
> connector implementations; they were never meant to be used in production and 
> have only been used for straightforward demonstrations of Connect's 
> capabilities in our quick start guides. 
>  
>  Given that these connectors are not production ready and yet offer access 
> to the local filesystem, with this ticket I propose to remove them from our 
> deployments by default by excluding them from the {{CLASSPATH}} and the 
> default {{plugin.path}}. 
>  
>  The impact will be minimal. Quick start guides will require a single 
> additional step of editing the {{plugin.path}} to include the one package 
> that contains these connectors. Production deployments will remain unaffected 
> because these are not production grade connectors. 
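The extra quick start step would look roughly like the following worker properties edit; the jar name and location below are hypothetical placeholders, not the exact artifact name:

```properties
# Hypothetical quick start step: make the file stream connectors
# discoverable again by adding their package to the plugin path.
# The jar name/location is a placeholder for the actual artifact.
plugin.path=libs/connect-file.jar
```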



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (KAFKA-13759) Disable producer idempotence by default in producers instantiated by Connect

2022-03-30 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511516#comment-17511516
 ] 

Konstantine Karantasis edited comment on KAFKA-13759 at 3/30/22, 8:38 PM:
--

This fix has now been merged on the 3.2 and 3.1 branches to avoid a breaking 
change when Connect contacts older brokers and idempotence is enabled in the 
producer by default. 
[~cadonna] [~tombentley] fyi. 
Hopefully this fix makes it to the upcoming releases, but please let me know if 
the targeted versions need to be adjusted. 


was (Author: kkonstantine):
This issue has been now been merged on the 3.2 and 3.1 branches to avoid a 
breaking change when Connect. 
[~cadonna] [~tombentley] fyi. 
Hopefully this fix makes to the upcoming releases but please let me know if the 
targeted versions need to be adjusted. 

> Disable producer idempotence by default in producers instantiated by Connect
> 
>
> Key: KAFKA-13759
> URL: https://issues.apache.org/jira/browse/KAFKA-13759
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.2
>
>
> https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently 
> referring to KIP-318. Before that in AK 3.0 idempotence was enabled by 
> default across Kafka producers. 
> However, some compatibility implications were missed in both cases. 
> If idempotence is enabled by default Connect won't be able to communicate via 
> its producers with Kafka brokers older than version 0.11. Perhaps more 
> importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL 
> is required to be granted to the principal of the Connect worker. 
> Given the above caveats, this ticket proposes to explicitly disable producer 
> idempotence in Connect by default. This feature, as it happens today, can be 
> enabled by setting worker and/or connector properties. However, enabling it 
> by default should be considered in a major version upgrade and after KIP-318 
> is updated to mention the compatibility requirements and gets officially 
> approved. 





[jira] [Resolved] (KAFKA-13759) Disable producer idempotence by default in producers instantiated by Connect

2022-03-23 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-13759.

Resolution: Fixed

> Disable producer idempotence by default in producers instantiated by Connect
> 
>
> Key: KAFKA-13759
> URL: https://issues.apache.org/jira/browse/KAFKA-13759
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.2
>
>
> https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently 
> referring to KIP-318. Before that in AK 3.0 idempotence was enabled by 
> default across Kafka producers. 
> However, some compatibility implications were missed in both cases. 
> If idempotence is enabled by default Connect won't be able to communicate via 
> its producers with Kafka brokers older than version 0.11. Perhaps more 
> importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL 
> is required to be granted to the principal of the Connect worker. 
> Given the above caveats, this ticket proposes to explicitly disable producer 
> idempotence in Connect by default. This feature, as it happens today, can be 
> enabled by setting worker and/or connector properties. However, enabling it 
> by default should be considered in a major version upgrade and after KIP-318 
> is updated to mention the compatibility requirements and gets officially 
> approved. 





[jira] [Commented] (KAFKA-13759) Disable producer idempotence by default in producers instantiated by Connect

2022-03-23 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511516#comment-17511516
 ] 

Konstantine Karantasis commented on KAFKA-13759:


This fix has now been merged on the 3.2 and 3.1 branches to avoid a breaking 
change when Connect contacts older brokers and idempotence is enabled in the 
producer by default. 
[~cadonna] [~tombentley] fyi. 
Hopefully this fix makes it to the upcoming releases, but please let me know if 
the targeted versions need to be adjusted. 

> Disable producer idempotence by default in producers instantiated by Connect
> 
>
> Key: KAFKA-13759
> URL: https://issues.apache.org/jira/browse/KAFKA-13759
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.2
>
>
> https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently 
> referring to KIP-318. Before that in AK 3.0 idempotence was enabled by 
> default across Kafka producers. 
> However, some compatibility implications were missed in both cases. 
> If idempotence is enabled by default Connect won't be able to communicate via 
> its producers with Kafka brokers older than version 0.11. Perhaps more 
> importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL 
> is required to be granted to the principal of the Connect worker. 
> Given the above caveats, this ticket proposes to explicitly disable producer 
> idempotence in Connect by default. This feature, as it happens today, can be 
> enabled by setting worker and/or connector properties. However, enabling it 
> by default should be considered in a major version upgrade and after KIP-318 
> is updated to mention the compatibility requirements and gets officially 
> approved. 





[jira] [Updated] (KAFKA-13759) Disable producer idempotence by default in producers instantiated by Connect

2022-03-23 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13759:
---
Description: 
https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently referring 
to KIP-318. Before that in AK 3.0 idempotence was enabled by default across 
Kafka producers. 

However, some compatibility implications were missed in both cases. 

If idempotence is enabled by default Connect won't be able to communicate via 
its producers with Kafka brokers older than version 0.11. Perhaps more 
importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL is 
required to be granted to the principal of the Connect worker. 

Given the above caveats, this ticket proposes to explicitly disable producer 
idempotence in Connect by default. This feature, as it happens today, can be 
enabled by setting worker and/or connector properties. However, enabling it by 
default should be considered in a major version upgrade and after KIP-318 is 
updated to mention the compatibility requirements and gets officially approved. 

  was:
https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently referring 
to KIP-318. Before that in AK 3.0 idempotency was enabled by default across 
Kafka producers. 

However, some compatibility implications were missed in both cases. 

If idempotency is enabled by default Connect won't be able to communicate via 
its producers with Kafka brokers older than version 0.11. Perhaps more 
importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL is 
required to be granted to the principal of the Connect worker. 

Given the above caveats, this ticket proposes to explicitly disable producer 
idempotency in Connect by default. This feature, as it happens today, can be 
enabled by setting worker and/or connector properties. However, enabling it by 
default should be considered in a major version upgrade and after KIP-318 is 
updated to mention the compatibility requirements and gets officially approved. 


> Disable producer idempotence by default in producers instantiated by Connect
> 
>
> Key: KAFKA-13759
> URL: https://issues.apache.org/jira/browse/KAFKA-13759
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
>
> https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently 
> referring to KIP-318. Before that in AK 3.0 idempotence was enabled by 
> default across Kafka producers. 
> However, some compatibility implications were missed in both cases. 
> If idempotence is enabled by default Connect won't be able to communicate via 
> its producers with Kafka brokers older than version 0.11. Perhaps more 
> importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL 
> is required to be granted to the principal of the Connect worker. 
> Given the above caveats, this ticket proposes to explicitly disable producer 
> idempotence in Connect by default. This feature, as it happens today, can be 
> enabled by setting worker and/or connector properties. However, enabling it 
> by default should be considered in a major version upgrade and after KIP-318 
> is updated to mention the compatibility requirements and gets officially 
> approved. 





[jira] [Updated] (KAFKA-13759) Disable producer idempotence by default in producers instantiated by Connect

2022-03-23 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13759:
---
Fix Version/s: 3.2.0
   3.1.1
   3.0.2

> Disable producer idempotence by default in producers instantiated by Connect
> 
>
> Key: KAFKA-13759
> URL: https://issues.apache.org/jira/browse/KAFKA-13759
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.2
>
>
> https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently 
> referring to KIP-318. Before that in AK 3.0 idempotence was enabled by 
> default across Kafka producers. 
> However, some compatibility implications were missed in both cases. 
> If idempotence is enabled by default Connect won't be able to communicate via 
> its producers with Kafka brokers older than version 0.11. Perhaps more 
> importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL 
> is required to be granted to the principal of the Connect worker. 
> Given the above caveats, this ticket proposes to explicitly disable producer 
> idempotence in Connect by default. This feature, as it happens today, can be 
> enabled by setting worker and/or connector properties. However, enabling it 
> by default should be considered in a major version upgrade and after KIP-318 
> is updated to mention the compatibility requirements and gets officially 
> approved. 





[jira] [Updated] (KAFKA-13759) Disable producer idempotence by default in producers instantiated by Connect

2022-03-23 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13759:
---
Summary: Disable producer idempotence by default in producers instantiated 
by Connect  (was: Disable producer idempotency by default in producers 
instantiated by Connect)

> Disable producer idempotence by default in producers instantiated by Connect
> 
>
> Key: KAFKA-13759
> URL: https://issues.apache.org/jira/browse/KAFKA-13759
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
>
> https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently 
> referring to KIP-318. Before that in AK 3.0 idempotency was enabled by 
> default across Kafka producers. 
> However, some compatibility implications were missed in both cases. 
> If idempotency is enabled by default Connect won't be able to communicate via 
> its producers with Kafka brokers older than version 0.11. Perhaps more 
> importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL 
> is required to be granted to the principal of the Connect worker. 
> Given the above caveats, this ticket proposes to explicitly disable producer 
> idempotency in Connect by default. This feature, as it happens today, can be 
> enabled by setting worker and/or connector properties. However, enabling it 
> by default should be considered in a major version upgrade and after KIP-318 
> is updated to mention the compatibility requirements and gets officially 
> approved. 





[jira] [Created] (KAFKA-13759) Disable producer idempotency by default in producers instantiated by Connect

2022-03-22 Thread Konstantine Karantasis (Jira)
Konstantine Karantasis created KAFKA-13759:
--

 Summary: Disable producer idempotency by default in producers 
instantiated by Connect
 Key: KAFKA-13759
 URL: https://issues.apache.org/jira/browse/KAFKA-13759
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis


https://issues.apache.org/jira/browse/KAFKA-7077 was merged recently referring 
to KIP-318. Before that in AK 3.0 idempotency was enabled by default across 
Kafka producers. 

However, some compatibility implications were missed in both cases. 

If idempotency is enabled by default Connect won't be able to communicate via 
its producers with Kafka brokers older than version 0.11. Perhaps more 
importantly, for brokers older than version 2.8 the {{IDEMPOTENT_WRITE}} ACL is 
required to be granted to the principal of the Connect worker. 

Given the above caveats, this ticket proposes to explicitly disable producer 
idempotency in Connect by default. This feature, as it happens today, can be 
enabled by setting worker and/or connector properties. However, enabling it by 
default should be considered in a major version upgrade and after KIP-318 is 
updated to mention the compatibility requirements and gets officially approved. 
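As context for the worker/connector properties mentioned above, the settings would look roughly like the following; the `producer.` worker prefix and `producer.override.` connector prefix are Connect's standard client override mechanisms, though the exact lines here are illustrative:

```properties
# Illustrative worker-level setting: keep idempotence off for the producers
# Connect instantiates (matching the default this ticket proposes).
producer.enable.idempotence=false
# A connector can still opt in individually in its own config, e.g.:
# producer.override.enable.idempotence=true
```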





[jira] [Created] (KAFKA-13748) Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default

2022-03-16 Thread Konstantine Karantasis (Jira)
Konstantine Karantasis created KAFKA-13748:
--

 Summary: Do not include file stream connectors in Connect's 
CLASSPATH and plugin.path by default
 Key: KAFKA-13748
 URL: https://issues.apache.org/jira/browse/KAFKA-13748
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis
 Fix For: 3.2.0, 3.1.1, 3.0.2


File stream connectors have been included with Kafka Connect distributions from 
the very beginning. These simple connectors were included to showcase connector 
implementations; they were never meant to be used in production and have only 
been used for straightforward demonstrations of Connect's capabilities in our 
quick start guides. 
 
 Given that these connectors are not production ready and yet offer access to 
the local filesystem, with this ticket I propose to remove them from our 
deployments by default by excluding them from the {{CLASSPATH}} and the 
default {{plugin.path}}. 
 
 The impact will be minimal. Quick start guides will require a single 
additional step of editing the {{plugin.path}} to include the one package that 
contains these connectors. Production deployments will remain unaffected 
because these are not production grade connectors. 





[jira] [Updated] (KAFKA-12487) Sink connectors do not work with the cooperative consumer rebalance protocol

2021-11-10 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12487:
---
Fix Version/s: 3.0.1

> Sink connectors do not work with the cooperative consumer rebalance protocol
> 
>
> Key: KAFKA-12487
> URL: https://issues.apache.org/jira/browse/KAFKA-12487
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.4.2, 2.5.2, 2.8.0, 2.7.1, 2.6.2, 3.0.0
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Blocker
> Fix For: 3.1.0, 3.0.1
>
>
> The {{ConsumerRebalanceListener}} used by the framework to respond to 
> rebalance events in consumer groups for sink tasks is hard-coded with the 
> assumption that the consumer performs rebalances eagerly. In other words, it 
> assumes that whenever {{onPartitionsRevoked}} is called, all partitions have 
> been revoked from that consumer, and whenever {{onPartitionsAssigned}} is 
> called, the partitions passed in to that method comprise the complete set of 
> topic partitions assigned to that consumer.
> See the [WorkerSinkTask.HandleRebalance 
> class|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L669-L730]
>  for the specifics.
>  
> One issue this can cause is silently ignoring to-be-committed offsets 
> provided by sink tasks, since the framework ignores offsets provided by tasks 
> in their {{preCommit}} method if it does not believe that the consumer for 
> that task is currently assigned the topic partition for that offset. See 
> these lines in the [WorkerSinkTask::commitOffsets 
> method|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L429-L430]
>  for reference.
>  
> This may not be the only issue caused by configuring a sink connector's 
> consumer to use cooperative rebalancing. Rigorous unit and integration 
> testing should be added before claiming that the Connect framework supports 
> the use of cooperative consumers with sink connectors.





[jira] [Resolved] (KAFKA-12487) Sink connectors do not work with the cooperative consumer rebalance protocol

2021-11-10 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-12487.

Resolution: Fixed

> Sink connectors do not work with the cooperative consumer rebalance protocol
> 
>
> Key: KAFKA-12487
> URL: https://issues.apache.org/jira/browse/KAFKA-12487
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.4.2, 2.5.2, 2.8.0, 2.7.1, 2.6.2, 3.0.0
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Blocker
> Fix For: 3.1.0, 3.0.1
>
>
> The {{ConsumerRebalanceListener}} used by the framework to respond to 
> rebalance events in consumer groups for sink tasks is hard-coded with the 
> assumption that the consumer performs rebalances eagerly. In other words, it 
> assumes that whenever {{onPartitionsRevoked}} is called, all partitions have 
> been revoked from that consumer, and whenever {{onPartitionsAssigned}} is 
> called, the partitions passed in to that method comprise the complete set of 
> topic partitions assigned to that consumer.
> See the [WorkerSinkTask.HandleRebalance 
> class|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L669-L730]
>  for the specifics.
>  
> One issue this can cause is silently ignoring to-be-committed offsets 
> provided by sink tasks, since the framework ignores offsets provided by tasks 
> in their {{preCommit}} method if it does not believe that the consumer for 
> that task is currently assigned the topic partition for that offset. See 
> these lines in the [WorkerSinkTask::commitOffsets 
> method|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L429-L430]
>  for reference.
>  
> This may not be the only issue caused by configuring a sink connector's 
> consumer to use cooperative rebalancing. Rigorous unit and integration 
> testing should be added before claiming that the Connect framework supports 
> the use of cooperative consumers with sink connectors.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KAFKA-12487) Sink connectors do not work with the cooperative consumer rebalance protocol

2021-11-03 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438272#comment-17438272
 ] 

Konstantine Karantasis commented on KAFKA-12487:


Hi [~dajac]. I'd like to complete another review pass this week and merge it as 
a patch that will go into both the 3.0 and 3.1 branches. My initial comments on 
the PR seem to have been addressed. Would that be ok with you?

> Sink connectors do not work with the cooperative consumer rebalance protocol
> 
>
> Key: KAFKA-12487
> URL: https://issues.apache.org/jira/browse/KAFKA-12487
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.4.2, 2.5.2, 2.8.0, 2.7.1, 2.6.2, 3.0.0
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Blocker
> Fix For: 3.1.0
>
>
> The {{ConsumerRebalanceListener}} used by the framework to respond to 
> rebalance events in consumer groups for sink tasks is hard-coded with the 
> assumption that the consumer performs rebalances eagerly. In other words, it 
> assumes that whenever {{onPartitionsRevoked}} is called, all partitions have 
> been revoked from that consumer, and whenever {{onPartitionsAssigned}} is 
> called, the partitions passed in to that method comprise the complete set of 
> topic partitions assigned to that consumer.
> See the [WorkerSinkTask.HandleRebalance 
> class|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L669-L730]
>  for the specifics.
>  
> One issue this can cause is silently ignoring to-be-committed offsets 
> provided by sink tasks, since the framework ignores offsets provided by tasks 
> in their {{preCommit}} method if it does not believe that the consumer for 
> that task is currently assigned the topic partition for that offset. See 
> these lines in the [WorkerSinkTask::commitOffsets 
> method|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L429-L430]
>  for reference.
>  
> This may not be the only issue caused by configuring a sink connector's 
> consumer to use cooperative rebalancing. Rigorous unit and integration 
> testing should be added before claiming that the Connect framework supports 
> the use of cooperative consumers with sink connectors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-13284) Use sftp protocol in release.py to upload release candidate artifacts

2021-09-08 Thread Konstantine Karantasis (Jira)
Konstantine Karantasis created KAFKA-13284:
--

 Summary: Use sftp protocol in release.py to upload release 
candidate artifacts 
 Key: KAFKA-13284
 URL: https://issues.apache.org/jira/browse/KAFKA-13284
 Project: Kafka
  Issue Type: Improvement
Reporter: Konstantine Karantasis
 Fix For: 3.1.0


{{home.apache.org}} has recently restricted access to {{sftp}} only.

This prevents {{release.py}} from uploading a single archive with the artifacts 
of a release candidate using {{rsync}} and then unpacking the archive with 
{{ssh}}.



The script could be changed to mirror the contents and upload / delete files 
individually using the {{sftp}} protocol. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13160) Fix the code that calls the broker's config handler to pass the expected default resource name when using KRaft.

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13160:
---
Summary: Fix the code that calls the broker's config handler to pass the 
expected default resource name when using KRaft.  (was: Fix the code that calls 
the broker’s config handler to pass the expected default resource name when 
using KRaft.)

> Fix the code that calls the broker's config handler to pass the expected 
> default resource name when using KRaft.
> 
>
> Key: KAFKA-13160
> URL: https://issues.apache.org/jira/browse/KAFKA-13160
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ryan Dielhenn
>Assignee: Ryan Dielhenn
>Priority: Blocker
> Fix For: 3.0.0
>
>
> In a ZK cluster, dynamic default broker configs are stored in the zNode 
> /brokers/<default>. Without this fix, when dynamic configs from snapshots are 
> processed by the KRaft brokers, the BrokerConfigHandler checks if the 
> resource name is "<default>" to do a default update, and otherwise converts 
> the resource name to an integer to do a per-broker config update.
> In KRaft, dynamic default broker configs are serialized in metadata with an 
> empty string instead of "<default>". This was causing the BrokerConfigHandler 
> to throw a NumberFormatException for dynamic default broker configs, since the 
> resource name for them is neither "<default>" nor a single integer. The code 
> that calls the handler method for config changes should be fixed to pass 
> "<default>" instead of the empty string to the handler method when using KRaft.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7271) Fix ignored test in streams_upgrade_test.py: test_upgrade_downgrade_brokers

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-7271:
--
Fix Version/s: (was: 3.0.0)

> Fix ignored test in streams_upgrade_test.py: test_upgrade_downgrade_brokers
> ---
>
> Key: KAFKA-7271
> URL: https://issues.apache.org/jira/browse/KAFKA-7271
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, system tests
>Reporter: John Roesler
>Priority: Blocker
>
> Fix in the oldest branch that ignores the test and cherry-pick forward.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-5905) Remove PrincipalBuilder and DefaultPrincipalBuilder

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-5905:
--
Fix Version/s: (was: 3.0.0)

> Remove PrincipalBuilder and DefaultPrincipalBuilder
> ---
>
> Key: KAFKA-5905
> URL: https://issues.apache.org/jira/browse/KAFKA-5905
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Manikumar
>Priority: Blocker
>
> These classes were deprecated after KIP-189: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-189%3A+Improve+principal+builder+interface+and+add+support+for+SASL,
>  which is part of 1.0.0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12582) Remove deprecated `ConfigEntry` constructor

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12582:
---
Fix Version/s: (was: 3.0.0)

> Remove deprecated `ConfigEntry` constructor
> ---
>
> Key: KAFKA-12582
> URL: https://issues.apache.org/jira/browse/KAFKA-12582
> Project: Kafka
>  Issue Type: Improvement
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Major
>
> ConfigEntry's constructor was deprecated in 1.1.0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10329) Enable connector context in logs by default

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-10329:
---
Fix Version/s: (was: 3.0.0)

> Enable connector context in logs by default
> ---
>
> Key: KAFKA-10329
> URL: https://issues.apache.org/jira/browse/KAFKA-10329
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 3.0.0
>Reporter: Randall Hauch
>Priority: Blocker
>  Labels: needs-kip
>
> When 
> [KIP-449|https://cwiki.apache.org/confluence/display/KAFKA/KIP-449%3A+Add+connector+contexts+to+Connect+worker+logs]
>  was implemented and released as part of AK 2.3, we chose to not enable these 
> extra logging context information by default because it was not backward 
> compatible, and anyone relying upon the `connect-log4j.properties` file 
> provided by the AK distribution would after an upgrade to AK 2.3 (or later) 
> see different formats for their logs, which could break any log processing 
> functionality they were relying upon.
> However, we should enable this in AK 3.0, whenever that comes. Doing so will 
> require a fairly minor KIP to change the `connect-log4j.properties` file 
> slightly.
> Marked this as BLOCKER since it's a backward incompatible change that we 
> definitely want to do in the 3.0.0 release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-12929) KIP-750: Deprecate support for Java 8 in Kafka 3.0

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-12929.

Resolution: Fixed

Resolving to unblock the RC for 3.0.0 and will keep a note to add to the 
downloads page



> KIP-750: Deprecate support for Java 8 in Kafka 3.0
> --
>
> Key: KAFKA-12929
> URL: https://issues.apache.org/jira/browse/KAFKA-12929
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12599) Remove deprecated --zookeeper in preferredReplicaLeaderElectionCommand

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12599:
---
Fix Version/s: (was: 3.0.0)

> Remove deprecated --zookeeper in preferredReplicaLeaderElectionCommand
> --
>
> Key: KAFKA-12599
> URL: https://issues.apache.org/jira/browse/KAFKA-12599
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-12930) KIP-751: Deprecate support for Scala 2.12 in Kafka 3.0

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-12930.

Resolution: Fixed

Resolving to unblock the RC for 3.0.0 and will keep a note to add to the 
downloads page

> KIP-751: Deprecate support for Scala 2.12 in Kafka 3.0
> --
>
> Key: KAFKA-12930
> URL: https://issues.apache.org/jira/browse/KAFKA-12930
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13095) TransactionsTest is failing in kraft mode

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13095:
---
Fix Version/s: (was: 3.0.0)

> TransactionsTest is failing in kraft mode
> -
>
> Key: KAFKA-13095
> URL: https://issues.apache.org/jira/browse/KAFKA-13095
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Colin McCabe
>Assignee: Jason Gustafson
>Priority: Blocker
>
> TransactionsTest#testSendOffsetsToTransactionTimeout keeps flaking on Jenkins.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-13159) Enable system tests for transactions in KRaft mode

2021-08-25 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404886#comment-17404886
 ] 

Konstantine Karantasis commented on KAFKA-13159:


I'm moving the remainder of this ticket to 3.0.1, because this open ticket is 
blocking the 3.0.0 RC.

> Enable system tests for transactions in KRaft mode
> --
>
> Key: KAFKA-13159
> URL: https://issues.apache.org/jira/browse/KAFKA-13159
> Project: Kafka
>  Issue Type: Test
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Critical
> Fix For: 3.1.0, 3.0.1
>
>
> Previously, we disabled several system tests involving transactions in KRaft 
> mode. Now that KIP-730 is complete and transactions work in KRaft, we need to 
> re-enable these tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13159) Enable system tests for transactions in KRaft mode

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13159:
---
Fix Version/s: (was: 3.0.0)
   3.0.1

> Enable system tests for transactions in KRaft mode
> --
>
> Key: KAFKA-13159
> URL: https://issues.apache.org/jira/browse/KAFKA-13159
> Project: Kafka
>  Issue Type: Test
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Critical
> Fix For: 3.1.0, 3.0.1
>
>
> Previously, we disabled several system tests involving transactions in KRaft 
> mode. Now that KIP-730 is complete and transactions work in KRaft, we need to 
> re-enable these tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13223) Idempotent producer error with Kraft

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13223:
---
Fix Version/s: (was: 3.0.0)

> Idempotent producer error with Kraft 
> -
>
> Key: KAFKA-13223
> URL: https://issues.apache.org/jira/browse/KAFKA-13223
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, producer 
>Reporter: Laurynas Butkus
>Priority: Major
>
> I get an error *"The broker does not support INIT_PRODUCER_ID"* if I try to 
> produce a message with idempotence enabled.
> Result:
> {code:java}
> ➜  ./bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic 
> test --request-required-acks -1 --producer-property enable.idempotence=true
> >test
> >[2021-08-23 19:40:33,356] ERROR [Producer clientId=console-producer] 
> >Aborting producer batches due to fatal error 
> >(org.apache.kafka.clients.producer.internals.Sender)
> org.apache.kafka.common.errors.UnsupportedVersionException: The broker does 
> not support INIT_PRODUCER_ID
> [2021-08-23 19:40:33,358] ERROR Error when sending message to topic test with 
> key: null, value: 4 bytes with error: 
> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> org.apache.kafka.common.errors.UnsupportedVersionException: The broker does 
> not support INIT_PRODUCER_ID
> {code}
>  
> It works fine with idempotence disabled. It also works fine when using 
> ZooKeeper.
> Tested with altered docker image: 
> {code:java}
> FROM confluentinc/cp-kafka:6.2.0
> RUN sed -i '/KAFKA_ZOOKEEPER_CONNECT/d' /etc/confluent/docker/configure && \
> # Docker workaround: Ignore cub zk-ready
> sed -i 's/cub zk-ready/echo ignore zk-ready/' /etc/confluent/docker/ensure && 
> \
> # KRaft required step: Format the storage directory with a new cluster ID
> echo "kafka-storage format --ignore-formatted -t $(kafka-storage random-uuid) 
> -c /etc/kafka/kafka.properties" >> /etc/confluent/docker/ensure
> {code}
> docker-compose.yml
> {code:java}
> version: '3.4'
> services:
>   kafka:
> build: kafka
> restart: unless-stopped
> environment:
>   ALLOW_PLAINTEXT_LISTENER: "yes"
>   KAFKA_HEAP_OPTS: -Xms256m -Xmx256m
>   LOG4J_LOGGER_KAFKA: "WARN"
>   KAFKA_BROKER_ID: 1
>   KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 
> 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT'
>   KAFKA_ADVERTISED_LISTENERS: 
> 'PLAINTEXT://kafka:29092,PLAINTEXT_HOST://127.0.0.1:9092'
>   KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
>   KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
>   KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
>   KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
>   KAFKA_PROCESS_ROLES: 'broker,controller'
>   KAFKA_NODE_ID: 1
>   KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka:29093'
>   KAFKA_LISTENERS: 
> 'PLAINTEXT://kafka:29092,CONTROLLER://kafka:29093,PLAINTEXT_HOST://:9092'
>   KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
>   KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
>   KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs'
> ports:
>   - "127.0.0.1:9092:9092/tcp"
> command: "/etc/confluent/docker/run"
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (KAFKA-13223) Idempotent producer error with Kraft

2021-08-25 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis closed KAFKA-13223.
--

> Idempotent producer error with Kraft 
> -
>
> Key: KAFKA-13223
> URL: https://issues.apache.org/jira/browse/KAFKA-13223
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, producer 
>Reporter: Laurynas Butkus
>Priority: Major
> Fix For: 3.0.0
>
>
> I get an error *"The broker does not support INIT_PRODUCER_ID"* if I try to 
> produce a message with idempotence enabled.
> Result:
> {code:java}
> ➜  ./bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic 
> test --request-required-acks -1 --producer-property enable.idempotence=true
> >test
> >[2021-08-23 19:40:33,356] ERROR [Producer clientId=console-producer] 
> >Aborting producer batches due to fatal error 
> >(org.apache.kafka.clients.producer.internals.Sender)
> org.apache.kafka.common.errors.UnsupportedVersionException: The broker does 
> not support INIT_PRODUCER_ID
> [2021-08-23 19:40:33,358] ERROR Error when sending message to topic test with 
> key: null, value: 4 bytes with error: 
> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> org.apache.kafka.common.errors.UnsupportedVersionException: The broker does 
> not support INIT_PRODUCER_ID
> {code}
>  
> It works fine with idempotence disabled. It also works fine when using 
> ZooKeeper.
> Tested with altered docker image: 
> {code:java}
> FROM confluentinc/cp-kafka:6.2.0
> RUN sed -i '/KAFKA_ZOOKEEPER_CONNECT/d' /etc/confluent/docker/configure && \
> # Docker workaround: Ignore cub zk-ready
> sed -i 's/cub zk-ready/echo ignore zk-ready/' /etc/confluent/docker/ensure && 
> \
> # KRaft required step: Format the storage directory with a new cluster ID
> echo "kafka-storage format --ignore-formatted -t $(kafka-storage random-uuid) 
> -c /etc/kafka/kafka.properties" >> /etc/confluent/docker/ensure
> {code}
> docker-compose.yml
> {code:java}
> version: '3.4'
> services:
>   kafka:
> build: kafka
> restart: unless-stopped
> environment:
>   ALLOW_PLAINTEXT_LISTENER: "yes"
>   KAFKA_HEAP_OPTS: -Xms256m -Xmx256m
>   LOG4J_LOGGER_KAFKA: "WARN"
>   KAFKA_BROKER_ID: 1
>   KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 
> 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT'
>   KAFKA_ADVERTISED_LISTENERS: 
> 'PLAINTEXT://kafka:29092,PLAINTEXT_HOST://127.0.0.1:9092'
>   KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
>   KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
>   KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
>   KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
>   KAFKA_PROCESS_ROLES: 'broker,controller'
>   KAFKA_NODE_ID: 1
>   KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka:29093'
>   KAFKA_LISTENERS: 
> 'PLAINTEXT://kafka:29092,CONTROLLER://kafka:29093,PLAINTEXT_HOST://:9092'
>   KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
>   KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
>   KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs'
> ports:
>   - "127.0.0.1:9092:9092/tcp"
> command: "/etc/confluent/docker/run"
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13215) Flaky test org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation

2021-08-18 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13215:
---
Description: 
Integration test {{test 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation()}}
 sometimes fails with

{code:java}
java.lang.AssertionError: only one task
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
at 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.getTaskMetadata(TaskMetadataIntegrationTest.java:163)
at 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation(TaskMetadataIntegrationTest.java:144)
{code}

  was:
Integration test {{test 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation()}}
 sometimes fails with

{code:java}
java.lang.AssertionError: only one task
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
at 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.getTaskMetadata(TaskMetadataIntegrationTest.java:162)
at 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation(TaskMetadataIntegrationTest.java:117)
{code}


> Flaky test 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation
> ---
>
> Key: KAFKA-13215
> URL: https://issues.apache.org/jira/browse/KAFKA-13215
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Konstantine Karantasis
>Priority: Major
>  Labels: flaky-test
> Fix For: 3.1.0
>
>
> Integration test {{test 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation()}}
>  sometimes fails with
> {code:java}
> java.lang.AssertionError: only one task
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
>   at 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.getTaskMetadata(TaskMetadataIntegrationTest.java:163)
>   at 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation(TaskMetadataIntegrationTest.java:144)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-13215) Flaky test org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation

2021-08-18 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis reassigned KAFKA-13215:
--

Assignee: (was: Walker Carlson)

> Flaky test 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation
> ---
>
> Key: KAFKA-13215
> URL: https://issues.apache.org/jira/browse/KAFKA-13215
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Konstantine Karantasis
>Priority: Major
>  Labels: flaky-test
> Fix For: 3.1.0
>
>
> Integration test {{test 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation()}}
>  sometimes fails with
> {code:java}
> java.lang.AssertionError: only one task
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
>   at 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.getTaskMetadata(TaskMetadataIntegrationTest.java:162)
>   at 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation(TaskMetadataIntegrationTest.java:117)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-13215) Flaky test org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation

2021-08-18 Thread Konstantine Karantasis (Jira)
Konstantine Karantasis created KAFKA-13215:
--

 Summary: Flaky test 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation
 Key: KAFKA-13215
 URL: https://issues.apache.org/jira/browse/KAFKA-13215
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Bruno Cadonna
Assignee: Walker Carlson
 Fix For: 3.1.0


Integration test {{test 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation()}}
 sometimes fails with

{code:java}
java.lang.AssertionError: only one task
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
at 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.getTaskMetadata(TaskMetadataIntegrationTest.java:162)
at 
org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation(TaskMetadataIntegrationTest.java:117)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13215) Flaky test org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation

2021-08-18 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13215:
---
Reporter: Konstantine Karantasis  (was: Bruno Cadonna)

> Flaky test 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectEndOffsetInformation
> ---
>
> Key: KAFKA-13215
> URL: https://issues.apache.org/jira/browse/KAFKA-13215
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Konstantine Karantasis
>Assignee: Walker Carlson
>Priority: Major
>  Labels: flaky-test
> Fix For: 3.1.0
>
>
> Integration test {{test 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation()}}
>  sometimes fails with
> {code:java}
> java.lang.AssertionError: only one task
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
>   at 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.getTaskMetadata(TaskMetadataIntegrationTest.java:162)
>   at 
> org.apache.kafka.streams.integration.TaskMetadataIntegrationTest.shouldReportCorrectCommittedOffsetInformation(TaskMetadataIntegrationTest.java:117)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-13165) Validate node id, process role and quorum voters

2021-08-11 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-13165.

Resolution: Fixed

> Validate node id, process role and quorum voters
> 
>
> Key: KAFKA-13165
> URL: https://issues.apache.org/jira/browse/KAFKA-13165
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kraft
>Reporter: Jose Armando Garcia Sancio
>Assignee: Ryan Dielhenn
>Priority: Blocker
>  Labels: kip-500
> Fix For: 3.0.0
>
>
> Under certain configurations it is possible for the Kafka server to boot up 
> as a broker only but still be the cluster metadata quorum leader. We should 
> validate the configuration to avoid this case.
>  # If the {{process.roles}} contains {{controller}} then the {{node.id}} 
> needs to be in the {{controller.quorum.voters}}
>  # If the {{process.roles}} doesn't contain {{controller}} then the 
> {{node.id}} cannot be in the {{controller.quorum.voters}}
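
The two rules above amount to a simple consistency check between {{node.id}}, {{process.roles}}, and {{controller.quorum.voters}}. A hedged sketch (assumed names, not Kafka's actual KafkaConfig validation code):

```java
import java.util.Set;

// Illustrative sketch of the two validation rules described above; the class
// and method names are assumptions, not Kafka's actual KafkaConfig code.
class QuorumConfigValidator {
    static void validate(int nodeId, Set<String> processRoles, Set<Integer> voterIds) {
        boolean isController = processRoles.contains("controller");
        boolean isVoter = voterIds.contains(nodeId);
        // Rule 1: a controller must be one of the quorum voters.
        if (isController && !isVoter) {
            throw new IllegalArgumentException(
                "node.id " + nodeId + " has the controller role but is missing"
                + " from controller.quorum.voters");
        }
        // Rule 2: a broker-only node must not be a quorum voter, otherwise it
        // could end up as the metadata quorum leader despite being a broker.
        if (!isController && isVoter) {
            throw new IllegalArgumentException(
                "node.id " + nodeId + " is broker-only but appears in"
                + " controller.quorum.voters");
        }
    }

    public static void main(String[] args) {
        // A broker-only node that is also a quorum voter is rejected:
        try {
            validate(1, Set.of("broker"), Set.of(1));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        // A combined broker+controller node that is a voter is accepted:
        validate(2, Set.of("broker", "controller"), Set.of(2));
        System.out.println("ok");
    }
}
```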



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12644) Add Missing Class-Level Javadoc to Descendants of org.apache.kafka.common.errors.ApiException

2021-08-10 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17396873#comment-17396873
 ] 

Konstantine Karantasis commented on KAFKA-12644:


Given that at this point in time for 3.0 we are focusing exclusively on blocker 
issues and stabilization fixes (e.g. tests), I'll go ahead and push the target 
fix version for this issue to 3.0.1 and 3.1.0.

> Add Missing Class-Level Javadoc to Descendants of 
> org.apache.kafka.common.errors.ApiException
> -
>
> Key: KAFKA-12644
> URL: https://issues.apache.org/jira/browse/KAFKA-12644
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, documentation
>Affects Versions: 3.0.0, 2.8.1
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Major
>  Labels: documentation
> Fix For: 3.0.0, 2.8.1
>
>
> I noticed that class-level Javadocs are missing from some classes in the 
> org.apache.kafka.common.errors package. This issue is for tracking the work 
> of adding the missing class-level javadocs for those Exception classes.
> https://kafka.apache.org/27/javadoc/org/apache/kafka/common/errors/package-summary.html
> https://github.com/apache/kafka/tree/trunk/clients/src/main/java/org/apache/kafka/common/errors
> Basic class-level documentation could be derived by mapping the error 
> conditions documented in the protocol
> https://kafka.apache.org/protocol#protocol_constants



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12644) Add Missing Class-Level Javadoc to Descendants of org.apache.kafka.common.errors.ApiException

2021-08-10 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12644:
---
Fix Version/s: (was: 3.0.0)
   3.0.1
   3.1.0

> Add Missing Class-Level Javadoc to Descendants of 
> org.apache.kafka.common.errors.ApiException
> -
>
> Key: KAFKA-12644
> URL: https://issues.apache.org/jira/browse/KAFKA-12644
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, documentation
>Affects Versions: 3.0.0, 2.8.1
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Major
>  Labels: documentation
> Fix For: 3.1.0, 2.8.1, 3.0.1
>
>
> I noticed that class-level Javadocs are missing from some classes in the 
> org.apache.kafka.common.errors package. This issue is for tracking the work 
> of adding the missing class-level javadocs for those Exception classes.
> https://kafka.apache.org/27/javadoc/org/apache/kafka/common/errors/package-summary.html
> https://github.com/apache/kafka/tree/trunk/clients/src/main/java/org/apache/kafka/common/errors
> Basic class-level documentation could be derived by mapping the error 
> conditions documented in the protocol
> https://kafka.apache.org/protocol#protocol_constants





[jira] [Commented] (KAFKA-12724) Add 2.8.0 to system tests and streams upgrade tests

2021-08-04 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17393508#comment-17393508
 ] 

Konstantine Karantasis commented on KAFKA-12724:


The PR got merged, so I'm resolving this issue. Thanks [~vvcephei] and 
[~ckamal] 

If any issues persist in system tests we can always reopen this ticket. 

> Add 2.8.0 to system tests and streams upgrade tests
> ---
>
> Key: KAFKA-12724
> URL: https://issues.apache.org/jira/browse/KAFKA-12724
> Project: Kafka
>  Issue Type: Task
>  Components: streams, system tests
>Reporter: Kamal Chandraprakash
>Assignee: Kamal Chandraprakash
>Priority: Blocker
> Fix For: 3.0.0
>
>
> Kafka v2.8.0 is released. We should add this version to the system tests.





[jira] [Commented] (KAFKA-13112) Controller's committed offset get out of sync with raft client listener context

2021-08-01 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391328#comment-17391328
 ] 

Konstantine Karantasis commented on KAFKA-13112:


[~jagsancio] the two subtasks of this issue have been resolved. Does this make 
this blocker issue resolved as well?

> Controller's committed offset get out of sync with raft client listener 
> context
> ---
>
> Key: KAFKA-13112
> URL: https://issues.apache.org/jira/browse/KAFKA-13112
> Project: Kafka
>  Issue Type: Bug
>  Components: controller, kraft
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Blocker
>  Labels: kip-500
> Fix For: 3.0.0
>
>
> The active controller creates an in-memory snapshot for every offset returned 
> by RaftClient::scheduleAppend and RaftClient::scheduleAtomicAppend. For 
> RaftClient::scheduleAppend, the RaftClient is free to split those records 
> into multiple batches. Because of this, when scheduleAppend is used there is no 
> guarantee that the active leader will always have an in-memory snapshot for 
> every "last committed offset".
> To get around this problem, when the active controller renounces from leader 
> if there is no snapshot at the last committed offset it will instead:
>  # Reset the snapshot registry
>  # Unregister the listener from the RaftClient
>  # Register a new listener with the RaftClient
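The three-step workaround above can be modeled roughly as follows. This is a simplified sketch with hypothetical names (RaftClientStub, snapshot_registry); the real logic lives in Kafka's controller and RaftClient, which are Java:

```python
class RaftClientStub:
    """Stand-in for the RaftClient's listener registration (hypothetical API)."""
    def __init__(self):
        self.listeners = []
    def register(self, listener):
        self.listeners.append(listener)
    def unregister(self, listener):
        self.listeners.remove(listener)

class Controller:
    def __init__(self, raft_client):
        self.raft_client = raft_client
        self.snapshot_registry = {}      # offset -> in-memory snapshot
        self.last_committed_offset = -1
        raft_client.register(self)

    def renounce(self):
        # Workaround from the description: scheduleAppend may split records
        # into several batches, so the leader may lack a snapshot at the
        # last committed offset. In that case it starts over:
        if self.last_committed_offset not in self.snapshot_registry:
            self.snapshot_registry.clear()       # 1. reset the snapshot registry
            self.raft_client.unregister(self)    # 2. unregister the listener
            self.raft_client.register(self)      # 3. register a new listener
```

Re-registering forces the listener context to replay from a consistent point instead of trusting stale in-memory snapshots.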





[jira] [Updated] (KAFKA-13142) KRaft brokers do not validate dynamic configs before forwarding them to controller

2021-07-27 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13142:
---
Fix Version/s: 3.0.0

> KRaft brokers do not validate dynamic configs before forwarding them to 
> controller
> --
>
> Key: KAFKA-13142
> URL: https://issues.apache.org/jira/browse/KAFKA-13142
> Project: Kafka
>  Issue Type: Task
>  Components: kraft
>Affects Versions: 3.0.0
>Reporter: Ryan Dielhenn
>Assignee: Ryan Dielhenn
>Priority: Blocker
> Fix For: 3.0.0
>
>
> The KRaft brokers are not currently validating dynamic configs before 
> forwarding them to the controller. To ensure that KRaft clusters are easily 
> upgradable it would be a good idea to validate dynamic configs in the first 
> release of KRaft so that invalid dynamic configs are never stored.





[jira] [Resolved] (KAFKA-13139) Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-27 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-13139.

Resolution: Fixed

> Empty response after requesting to restart a connector without the tasks 
> results in NPE
> ---
>
> Key: KAFKA-13139
> URL: https://issues.apache.org/jira/browse/KAFKA-13139
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 3.0.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Blocker
> Fix For: 3.0.0
>
>
> After https://issues.apache.org/jira/browse/KAFKA-4793 a response to restart 
> only the connector (without any tasks) returns OK with an empty body. 
> As system test runs revealed, this causes an NPE in 
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/RestClient.java#L135]
> We should return 204 (NO_CONTENT) instead. 
> This is a regression from previous behavior, therefore the ticket is marked 
> as a blocker candidate for 3.0





[jira] [Created] (KAFKA-13139) Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread Konstantine Karantasis (Jira)
Konstantine Karantasis created KAFKA-13139:
--

 Summary: Empty response after requesting to restart a connector 
without the tasks results in NPE
 Key: KAFKA-13139
 URL: https://issues.apache.org/jira/browse/KAFKA-13139
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 3.0.0
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis
 Fix For: 3.0.0


After https://issues.apache.org/jira/browse/KAFKA-4793 a response to restart 
only the connector (without any tasks) returns OK with an empty body. 

As system test runs revealed, this causes an NPE in 
[https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/RestClient.java#L135]

We should return 204 (NO_CONTENT) instead. 

This is a regression from previous behavior, therefore the ticket is marked as 
a blocker candidate for 3.0
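The failure mode is easy to reproduce in miniature: a client that unconditionally deserializes the response body fails on a 200 with no content, while a 204 lets it skip deserialization explicitly. A minimal sketch (plain Python, not Connect's actual RestClient):

```python
import json

def parse_restart_response(status, body):
    """Mimic a REST client that deserializes the response body.
    A 200 with an empty body blows up at the parse step (the analogue
    of the NPE in Connect's RestClient); 204 says 'no body' up front."""
    if status == 204:          # NO_CONTENT: nothing to deserialize
        return None
    return json.loads(body)   # a 200 is expected to carry a JSON body
```

This is why switching the connector-only restart response from an empty 200 to a 204 resolves the client-side failure.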





[jira] [Commented] (KAFKA-13132) Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387456#comment-17387456
 ] 

Konstantine Karantasis commented on KAFKA-13132:


Marked as Blocker for now targeting 3.0, as it seems to be a regression. 

> Upgrading to topic IDs in LISR requests has gaps introduced in 3.0
> --
>
> Key: KAFKA-13132
> URL: https://issues.apache.org/jira/browse/KAFKA-13132
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Blocker
> Fix For: 3.0.0
>
>
> With the change in 3.0 to how topic IDs are assigned to logs, a bug was 
> inadvertently introduced. Now, topic IDs will only be assigned on the load of 
> the log to a partition in LISR requests. This means we will only assign topic 
> IDs for newly created topics/partitions, on broker startup, or potentially 
> when a partition is reassigned.
>  
> In the case of upgrading from an IBP before 2.8, we may have a scenario where 
> we upgrade the controller to IBP 3.0 (or even 2.8) last. (I.e., the controller 
> is IBP < 2.8 and all other brokers are on the newest IBP) Upon the last 
> broker upgrading, we will elect a new controller but its LISR request will 
> not result in topic IDs being assigned to logs of existing topics. They will 
> only be assigned in the cases mentioned above.
> *Keep in mind, in this scenario, topic IDs will still be assigned in the 
> controller/ZK to all new and pre-existing topics and will show up in 
> metadata.*  This means we are not ensured the same guarantees we had in 2.8. 
> *It is just the LISR/partition.metadata part of the code that is affected.* 
>  
> The problem is two-fold
>  1. We ignore LISR requests when the partition leader epoch has not increased 
> (previously we assigned the ID before this check)
>  2. We only assign the topic ID when we are associating the log with the 
> partition in replicamanager for the first time. Though in the scenario 
> described above, we have logs associated with partitions that need to be 
> upgraded.
>  
> We should check whether the LISR request results in a topic ID addition 
> and add logic for logs already associated with partitions in replica manager.
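The two-fold problem can be modeled as follows. This is an illustrative sketch (hypothetical structures, not Kafka's ReplicaManager): assigning the topic ID before the epoch check covers pre-existing logs whose LISR request is otherwise ignored:

```python
class Log:
    """Toy stand-in for a partition's log state."""
    def __init__(self):
        self.topic_id = None
        self.leader_epoch = 0

def handle_lisr(log, request_epoch, request_topic_id):
    """Process a LeaderAndIsr request for an already-associated log.
    Sketch of the fix: assign a missing topic ID even when the leader
    epoch has not increased, instead of ignoring the request outright."""
    if log.topic_id is None and request_topic_id is not None:
        log.topic_id = request_topic_id   # upgrade path for pre-existing logs
    if request_epoch <= log.leader_epoch:
        return "ignored"                  # the epoch check still applies
    log.leader_epoch = request_epoch
    return "applied"
```

With the ID assignment placed first, an upgraded cluster picks up topic IDs for old logs from the new controller's first LISR request even though the leader epoch is unchanged.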





[jira] [Updated] (KAFKA-13132) Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13132:
---
Priority: Blocker  (was: Major)

> Upgrading to topic IDs in LISR requests has gaps introduced in 3.0
> --
>
> Key: KAFKA-13132
> URL: https://issues.apache.org/jira/browse/KAFKA-13132
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Blocker
> Fix For: 3.0.0
>
>
> With the change in 3.0 to how topic IDs are assigned to logs, a bug was 
> inadvertently introduced. Now, topic IDs will only be assigned on the load of 
> the log to a partition in LISR requests. This means we will only assign topic 
> IDs for newly created topics/partitions, on broker startup, or potentially 
> when a partition is reassigned.
>  
> In the case of upgrading from an IBP before 2.8, we may have a scenario where 
> we upgrade the controller to IBP 3.0 (or even 2.8) last. (I.e., the controller 
> is IBP < 2.8 and all other brokers are on the newest IBP) Upon the last 
> broker upgrading, we will elect a new controller but its LISR request will 
> not result in topic IDs being assigned to logs of existing topics. They will 
> only be assigned in the cases mentioned above.
> *Keep in mind, in this scenario, topic IDs will still be assigned in the 
> controller/ZK to all new and pre-existing topics and will show up in 
> metadata.*  This means we are not ensured the same guarantees we had in 2.8. 
> *It is just the LISR/partition.metadata part of the code that is affected.* 
>  
> The problem is two-fold
>  1. We ignore LISR requests when the partition leader epoch has not increased 
> (previously we assigned the ID before this check)
>  2. We only assign the topic ID when we are associating the log with the 
> partition in replicamanager for the first time. Though in the scenario 
> described above, we have logs associated with partitions that need to be 
> upgraded.
>  
> We should check whether the LISR request results in a topic ID addition 
> and add logic for logs already associated with partitions in replica manager.





[jira] [Updated] (KAFKA-13132) Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13132:
---
Fix Version/s: 3.0.0

> Upgrading to topic IDs in LISR requests has gaps introduced in 3.0
> --
>
> Key: KAFKA-13132
> URL: https://issues.apache.org/jira/browse/KAFKA-13132
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Major
> Fix For: 3.0.0
>
>
> With the change in 3.0 to how topic IDs are assigned to logs, a bug was 
> inadvertently introduced. Now, topic IDs will only be assigned on the load of 
> the log to a partition in LISR requests. This means we will only assign topic 
> IDs for newly created topics/partitions, on broker startup, or potentially 
> when a partition is reassigned.
>  
> In the case of upgrading from an IBP before 2.8, we may have a scenario where 
> we upgrade the controller to IBP 3.0 (or even 2.8) last. (I.e., the controller 
> is IBP < 2.8 and all other brokers are on the newest IBP) Upon the last 
> broker upgrading, we will elect a new controller but its LISR request will 
> not result in topic IDs being assigned to logs of existing topics. They will 
> only be assigned in the cases mentioned above.
> *Keep in mind, in this scenario, topic IDs will still be assigned in the 
> controller/ZK to all new and pre-existing topics and will show up in 
> metadata.*  This means we are not ensured the same guarantees we had in 2.8. 
> *It is just the LISR/partition.metadata part of the code that is affected.* 
>  
> The problem is two-fold
>  1. We ignore LISR requests when the partition leader epoch has not increased 
> (previously we assigned the ID before this check)
>  2. We only assign the topic ID when we are associating the log with the 
> partition in replicamanager for the first time. Though in the scenario 
> described above, we have logs associated with partitions that need to be 
> upgraded.
>  
> We should check whether the LISR request results in a topic ID addition 
> and add logic for logs already associated with partitions in replica manager.





[jira] [Commented] (KAFKA-12994) Migrate all Tests to New API and Remove Suppression for Deprecation Warnings related to KIP-633

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383652#comment-17383652
 ] 

Konstantine Karantasis commented on KAFKA-12994:


Downgraded the priority to Major since this is not a blocker. If we get a PR 
soon we could consider inclusion to 3.0 if the changes don't have any risk. 

> Migrate all Tests to New API and Remove Suppression for Deprecation Warnings 
> related to KIP-633
> ---
>
> Key: KAFKA-12994
> URL: https://issues.apache.org/jira/browse/KAFKA-12994
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Affects Versions: 3.0.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Major
>  Labels: kip, kip-633
> Fix For: 3.0.0
>
>
> Due to the API changes for KIP-633 a lot of deprecation warnings have been 
> generated in tests that are using the old deprecated APIs. There are a lot of 
> tests using the deprecated methods. We should absolutely migrate them all to 
> the new APIs and then get rid of all the applicable annotations for 
> suppressing the deprecation warnings.
> This applies to all Java and Scala examples and tests using the deprecated 
> APIs in the JoinWindows, SessionWindows, TimeWindows and SlidingWindows 
> classes.
>  
> This is based on the feedback from reviewers in this PR
>  
> https://github.com/apache/kafka/pull/10926





[jira] [Updated] (KAFKA-13021) Improve Javadocs for API Changes from KIP-633

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13021:
---
Priority: Major  (was: Blocker)

> Improve Javadocs for API Changes from KIP-633
> -
>
> Key: KAFKA-13021
> URL: https://issues.apache.org/jira/browse/KAFKA-13021
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation, streams
>Affects Versions: 3.0.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Major
>
> There are Javadoc changes from the following PR that need to be completed 
> prior to the 3.0 release. This Jira item is to track that work.
> [https://github.com/apache/kafka/pull/10926]
>  





[jira] [Updated] (KAFKA-12994) Migrate all Tests to New API and Remove Suppression for Deprecation Warnings related to KIP-633

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12994:
---
Priority: Major  (was: Blocker)

> Migrate all Tests to New API and Remove Suppression for Deprecation Warnings 
> related to KIP-633
> ---
>
> Key: KAFKA-12994
> URL: https://issues.apache.org/jira/browse/KAFKA-12994
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Affects Versions: 3.0.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Major
>  Labels: kip, kip-633
> Fix For: 3.0.0
>
>
> Due to the API changes for KIP-633 a lot of deprecation warnings have been 
> generated in tests that are using the old deprecated APIs. There are a lot of 
> tests using the deprecated methods. We should absolutely migrate them all to 
> the new APIs and then get rid of all the applicable annotations for 
> suppressing the deprecation warnings.
> This applies to all Java and Scala examples and tests using the deprecated 
> APIs in the JoinWindows, SessionWindows, TimeWindows and SlidingWindows 
> classes.
>  
> This is based on the feedback from reviewers in this PR
>  
> https://github.com/apache/kafka/pull/10926





[jira] [Updated] (KAFKA-12622) Automate LICENSE file validation

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12622:
---
Fix Version/s: (was: 3.0.0)
   3.0.1

> Automate LICENSE file validation
> 
>
> Key: KAFKA-12622
> URL: https://issues.apache.org/jira/browse/KAFKA-12622
> Project: Kafka
>  Issue Type: Task
>Reporter: John Roesler
>Priority: Major
> Fix For: 2.8.1, 3.0.1
>
>
> In https://issues.apache.org/jira/browse/KAFKA-12602, we manually constructed 
> a correct license file for 2.8.0. This file will certainly become wrong again 
> in later releases, so we need to write some kind of script to automate a 
> check.
> It crossed my mind to automate the generation of the file, but it seems to be 
> an intractable problem, considering that each dependency may change licenses, 
> may package license files, link to them from their poms, link to them from 
> their repos, etc. I've also found multiple URLs listed with various 
> delimiters, broken links that I have to chase down, etc.
> Therefore, it seems like the solution to aim for is simply: list all the jars 
> that we package, and print out a report of each jar that's extra or missing 
> vs. the ones in our `LICENSE-binary` file.
> The check should be part of the release script at least, if not part of the 
> regular build (so we keep it up to date as dependencies change).
>  
> Here's how I do this manually right now:
> {code:java}
> // build the binary artifacts
> $ ./gradlewAll releaseTarGz
> // unpack the binary artifact 
> $ tar xf core/build/distributions/kafka_2.13-X.Y.Z.tgz
> $ cd kafka_2.13-X.Y.Z
> // list the packaged jars 
> // (you can ignore the jars for our own modules, like kafka, kafka-clients, 
> etc.)
> $ ls libs/
> // cross check the jars with the packaged LICENSE
> // make sure all dependencies are listed with the right versions
> $ cat LICENSE
> // also double check all the mentioned license files are present
> $ ls licenses {code}
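The manual procedure above can be automated along the lines sketched below. This is an assumed approach (file layout and the dependency-token heuristic are guesses), not the script the project eventually adopted:

```python
import os
import re

def license_report(libs_dir, license_file, own_prefixes=("kafka",)):
    """Compare the jars shipped in libs/ against the dependencies named
    in the LICENSE-binary file, returning (unlisted_jars, stale_entries)."""
    jars = {
        f[:-len(".jar")] for f in os.listdir(libs_dir)
        if f.endswith(".jar") and not f.startswith(own_prefixes)
    }
    with open(license_file) as fh:
        text = fh.read()
    # Heuristic: any 'name-1.2.3'-looking token in the LICENSE is a listed dep.
    listed = set(re.findall(r"[\w.\-]+-\d[\w.]*", text))
    return sorted(jars - listed), sorted(listed - jars)
```

Wired into the release script (or the regular build), a non-empty report in either direction would fail the check, which is the goal stated above.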





[jira] [Comment Edited] (KAFKA-12291) Fix Ignored Upgrade Tests in streams_upgrade_test.py: test_upgrade_downgrade_brokers

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383642#comment-17383642
 ] 

Konstantine Karantasis edited comment on KAFKA-12291 at 7/19/21, 10:35 PM:
---

[~cadonna] we are now past code freeze for 3.0. Is this an issue that should 
block the 3.0 release, or could we postpone the fix to the next one?


was (Author: kkonstantine):
[~cadonna] we are now past code freeze for 3.0. Is this an issue that should 
block the 3.0, or could we postpone the fix to the next one?

> Fix Ignored Upgrade Tests in streams_upgrade_test.py: 
> test_upgrade_downgrade_brokers
> 
>
> Key: KAFKA-12291
> URL: https://issues.apache.org/jira/browse/KAFKA-12291
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, system tests
>Reporter: Bruno Cadonna
>Priority: Blocker
> Fix For: 3.0.0
>
>
> Fix in the oldest branch that ignores the test and cherry-pick forward.





[jira] [Commented] (KAFKA-12291) Fix Ignored Upgrade Tests in streams_upgrade_test.py: test_upgrade_downgrade_brokers

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383642#comment-17383642
 ] 

Konstantine Karantasis commented on KAFKA-12291:


[~cadonna] we are now past code freeze for 3.0. Is this an issue that should 
block the 3.0, or could we postpone the fix to the next one?

> Fix Ignored Upgrade Tests in streams_upgrade_test.py: 
> test_upgrade_downgrade_brokers
> 
>
> Key: KAFKA-12291
> URL: https://issues.apache.org/jira/browse/KAFKA-12291
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, system tests
>Reporter: Bruno Cadonna
>Priority: Blocker
> Fix For: 3.0.0
>
>
> Fix in the oldest branch that ignores the test and cherry-pick forward.





[jira] [Commented] (KAFKA-13021) Improve Javadocs for API Changes from KIP-633

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383636#comment-17383636
 ] 

Konstantine Karantasis commented on KAFKA-13021:


I don't see a PR linked for this issue. Unless we have a straightforward low 
risk PR which we could merge very soon, I'd recommend postponing the fix for 
the next release and unblocking 3.0

> Improve Javadocs for API Changes from KIP-633
> -
>
> Key: KAFKA-13021
> URL: https://issues.apache.org/jira/browse/KAFKA-13021
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation, streams
>Affects Versions: 3.0.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Blocker
>
> There are Javadoc changes from the following PR that need to be completed 
> prior to the 3.0 release. This Jira item is to track that work.
> [https://github.com/apache/kafka/pull/10926]
>  





[jira] [Commented] (KAFKA-12994) Migrate all Tests to New API and Remove Suppression for Deprecation Warnings related to KIP-633

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383632#comment-17383632
 ] 

Konstantine Karantasis commented on KAFKA-12994:


Hi [~iekpo]. You mentioned on the dev mailing list that a PR was in the works 
for this issue.

https://lists.apache.org/thread.html/r25f41514ae9751f260b5773abc039dfc828b00154297f20b4a14a151%40%3Cdev.kafka.apache.org%3E

However, I don't see a link on this issue here yet. 

We are now past code freeze for 3.0. We'll need to either get a PR for this 
issue, review it soon, and possibly accept it as a blocker, or postpone it to 
the next release. Do you have an update you could share?

> Migrate all Tests to New API and Remove Suppression for Deprecation Warnings 
> related to KIP-633
> ---
>
> Key: KAFKA-12994
> URL: https://issues.apache.org/jira/browse/KAFKA-12994
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Affects Versions: 3.0.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Blocker
>  Labels: kip, kip-633
> Fix For: 3.0.0
>
>
> Due to the API changes for KIP-633 a lot of deprecation warnings have been 
> generated in tests that are using the old deprecated APIs. There are a lot of 
> tests using the deprecated methods. We should absolutely migrate them all to 
> the new APIs and then get rid of all the applicable annotations for 
> suppressing the deprecation warnings.
> This applies to all Java and Scala examples and tests using the deprecated 
> APIs in the JoinWindows, SessionWindows, TimeWindows and SlidingWindows 
> classes.
>  
> This is based on the feedback from reviewers in this PR
>  
> https://github.com/apache/kafka/pull/10926





[jira] [Updated] (KAFKA-12487) Sink connectors do not work with the cooperative consumer rebalance protocol

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12487:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> Sink connectors do not work with the cooperative consumer rebalance protocol
> 
>
> Key: KAFKA-12487
> URL: https://issues.apache.org/jira/browse/KAFKA-12487
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 3.0.0, 2.4.2, 2.5.2, 2.8.0, 2.7.1, 2.6.2
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Blocker
> Fix For: 3.1.0
>
>
> The {{ConsumerRebalanceListener}} used by the framework to respond to 
> rebalance events in consumer groups for sink tasks is hard-coded with the 
> assumption that the consumer performs rebalances eagerly. In other words, it 
> assumes that whenever {{onPartitionsRevoked}} is called, all partitions have 
> been revoked from that consumer, and whenever {{onPartitionsAssigned}} is 
> called, the partitions passed in to that method comprise the complete set of 
> topic partitions assigned to that consumer.
> See the [WorkerSinkTask.HandleRebalance 
> class|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L669-L730]
>  for the specifics.
>  
> One issue this can cause is silently ignoring to-be-committed offsets 
> provided by sink tasks, since the framework ignores offsets provided by tasks 
> in their {{preCommit}} method if it does not believe that the consumer for 
> that task is currently assigned the topic partition for that offset. See 
> these lines in the [WorkerSinkTask::commitOffsets 
> method|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L429-L430]
>  for reference.
>  
> This may not be the only issue caused by configuring a sink connector's 
> consumer to use cooperative rebalancing. Rigorous unit and integration 
> testing should be added before claiming that the Connect framework supports 
> the use of cooperative consumers with sink connectors.
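The hard-coded assumption can be demonstrated with a toy listener. Below, an eager-style listener replaces its view of the assignment wholesale on every callback, which silently drops previously-owned partitions under the cooperative protocol, where callbacks carry only incremental changes. This is an illustrative Python model, not WorkerSinkTask itself:

```python
class EagerStyleListener:
    """Assumes every callback carries the FULL revocation/assignment."""
    def __init__(self):
        self.owned = set()
    def on_partitions_revoked(self, revoked):
        self.owned = set()              # assumes everything was revoked
    def on_partitions_assigned(self, assigned):
        self.owned = set(assigned)      # assumes this is the whole assignment

class CooperativeAwareListener:
    """Applies callbacks incrementally, as the cooperative protocol requires."""
    def __init__(self):
        self.owned = set()
    def on_partitions_revoked(self, revoked):
        self.owned -= set(revoked)
    def on_partitions_assigned(self, assigned):
        self.owned |= set(assigned)

def cooperative_rebalance(listener):
    # Cooperative protocol: first own t0 and t1; a later rebalance revokes
    # nothing and assigns only the newly added t2.
    listener.on_partitions_assigned({"t0", "t1"})
    listener.on_partitions_revoked(set())       # nothing revoked this round
    listener.on_partitions_assigned({"t2"})     # incremental: only the new one
    return listener.owned
```

The eager-style listener ends up believing it owns only {"t2"}, so to-be-committed offsets for t0 and t1 would be filtered out in commitOffsets, which is exactly the silent-drop behavior described above.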





[jira] [Commented] (KAFKA-12487) Sink connectors do not work with the cooperative consumer rebalance protocol

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383609#comment-17383609
 ] 

Konstantine Karantasis commented on KAFKA-12487:


Changing the default consumer protocol to the cooperative protocol has been 
postponed to 3.1.0. Given that we are past the code freeze for 3.0, I'm 
postponing this issue to 3.1.0 while keeping its blocker status for that 
release. The change is not trivial, and it would be good to have enough time 
to test before we release. 

> Sink connectors do not work with the cooperative consumer rebalance protocol
> 
>
> Key: KAFKA-12487
> URL: https://issues.apache.org/jira/browse/KAFKA-12487
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 3.0.0, 2.4.2, 2.5.2, 2.8.0, 2.7.1, 2.6.2
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Blocker
> Fix For: 3.0.0
>
>
> The {{ConsumerRebalanceListener}} used by the framework to respond to 
> rebalance events in consumer groups for sink tasks is hard-coded with the 
> assumption that the consumer performs rebalances eagerly. In other words, it 
> assumes that whenever {{onPartitionsRevoked}} is called, all partitions have 
> been revoked from that consumer, and whenever {{onPartitionsAssigned}} is 
> called, the partitions passed in to that method comprise the complete set of 
> topic partitions assigned to that consumer.
> See the [WorkerSinkTask.HandleRebalance 
> class|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L669-L730]
>  for the specifics.
>  
> One issue this can cause is silently ignoring to-be-committed offsets 
> provided by sink tasks, since the framework ignores offsets provided by tasks 
> in their {{preCommit}} method if it does not believe that the consumer for 
> that task is currently assigned the topic partition for that offset. See 
> these lines in the [WorkerSinkTask::commitOffsets 
> method|https://github.com/apache/kafka/blob/b96fc7892f1e885239d3290cf509e1d1bb41e7db/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L429-L430]
>  for reference.
>  
> This may not be the only issue caused by configuring a sink connector's 
> consumer to use cooperative rebalancing. Rigorous unit and integration 
> testing should be added before claiming that the Connect framework supports 
> the use of cooperative consumers with sink connectors.





[jira] [Commented] (KAFKA-13069) Add magic number to DefaultKafkaPrincipalBuilder.KafkaPrincipalSerde

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383604#comment-17383604
 ] 

Konstantine Karantasis commented on KAFKA-13069:


Postponing to the subsequent release, given that this issue is not a blocker 
and did not make it in time for the 3.0 code freeze. 

> Add magic number to DefaultKafkaPrincipalBuilder.KafkaPrincipalSerde
> 
>
> Key: KAFKA-13069
> URL: https://issues.apache.org/jira/browse/KAFKA-13069
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Ron Dagostino
>Assignee: Ron Dagostino
>Priority: Critical
> Fix For: 3.0.0
>
>






[jira] [Updated] (KAFKA-13069) Add magic number to DefaultKafkaPrincipalBuilder.KafkaPrincipalSerde

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13069:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> Add magic number to DefaultKafkaPrincipalBuilder.KafkaPrincipalSerde
> 
>
> Key: KAFKA-13069
> URL: https://issues.apache.org/jira/browse/KAFKA-13069
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Ron Dagostino
>Assignee: Ron Dagostino
>Priority: Critical
> Fix For: 3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12712) KRaft: Missing controller.quorom.voters config not properly handled

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12712:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> KRaft: Missing controller.quorom.voters config not properly handled
> ---
>
> Key: KAFKA-12712
> URL: https://issues.apache.org/jira/browse/KAFKA-12712
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.8.0
>Reporter: Magnus Edenhill
>Priority: Major
>  Labels: kip-500
> Fix For: 3.1.0
>
>
> When trying out KRaft in 2.8 I misspelled controller.quorum.voters as 
> controller.quorom.voters, but the broker did not fail to start, nor did it 
> print any warning.
>  
> Instead it raised this error:
>  
> {code:java}
> [2021-04-23 18:25:13,484] INFO Starting controller 
> (kafka.server.ControllerServer)[2021-04-23 18:25:13,484] INFO Starting 
> controller (kafka.server.ControllerServer)[2021-04-23 18:25:13,485] ERROR 
> [kafka-raft-io-thread]: Error due to 
> (kafka.raft.KafkaRaftManager$RaftIoThread)java.lang.IllegalArgumentException: 
> bound must be positive at java.util.Random.nextInt(Random.java:388) at 
> org.apache.kafka.raft.RequestManager.findReadyVoter(RequestManager.java:57) 
> at 
> org.apache.kafka.raft.KafkaRaftClient.maybeSendAnyVoterFetch(KafkaRaftClient.java:1778)
>  at 
> org.apache.kafka.raft.KafkaRaftClient.pollUnattachedAsObserver(KafkaRaftClient.java:2080)
>  at 
> org.apache.kafka.raft.KafkaRaftClient.pollUnattached(KafkaRaftClient.java:2061)
>  at 
> org.apache.kafka.raft.KafkaRaftClient.pollCurrentState(KafkaRaftClient.java:2096)
>  at org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:2181) at 
> kafka.raft.KafkaRaftManager$RaftIoThread.doWork(RaftManager.scala:53) at 
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
> {code}
> which I guess eventually (1 minute later) led to this error which terminated 
> the broker:
> {code:java}
> [2021-04-23 18:26:14,435] ERROR [BrokerLifecycleManager id=2] Shutting down 
> because we were unable to register with the controller quorum. 
> (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,435] ERROR 
> [BrokerLifecycleManager id=2] Shutting down because we were unable to 
> register with the controller quorum. 
> (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,436] INFO 
> [BrokerLifecycleManager id=2] registrationTimeout: shutting down event queue. 
> (org.apache.kafka.queue.KafkaEventQueue)[2021-04-23 18:26:14,437] INFO 
> [BrokerLifecycleManager id=2] Transitioning from STARTING to SHUTTING_DOWN. 
> (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,437] INFO 
> [broker-2-to-controller-send-thread]: Shutting down 
> (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,438] INFO 
> [broker-2-to-controller-send-thread]: Stopped 
> (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,438] INFO 
> [broker-2-to-controller-send-thread]: Shutdown completed 
> (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,441] ERROR 
> [BrokerServer id=2] Fatal error during broker startup. Prepare to shutdown 
> (kafka.server.BrokerServer)java.util.concurrent.CancellationException at 
> java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2276) at 
> kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:474)
>  at 
> org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:174)
>  at java.lang.Thread.run(Thread.java:748)
> {code}
> But since the client listeners were made available prior to shutting down, 
> the broker was deemed up and operational by the (naive) monitoring tool.
> So..:
>  - Broker should fail on startup on invalid/unknown config properties. I 
> understand this is technically tricky, so at least a warning log should be 
> printed.
>  - Perhaps not create client listeners before control plane is somewhat happy.
>  
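The fail-fast check suggested above could look roughly like the following. This is a hypothetical sketch with an illustrative key set, not the broker's real configuration validation:

```java
import java.util.Properties;
import java.util.Set;

// Reject startup when a supplied property name is not in the set of known
// configuration keys, so a misspelling like controller.quorom.voters fails
// loudly instead of being silently ignored.
public class ConfigValidator {
    // Illustrative subset; the real broker config has many more keys.
    private static final Set<String> KNOWN_KEYS = Set.of(
            "controller.quorum.voters", "node.id", "process.roles");

    static void validate(Properties props) {
        for (String name : props.stringPropertyNames()) {
            if (!KNOWN_KEYS.contains(name)) {
                throw new IllegalArgumentException("Unknown config: " + name);
            }
        }
    }

    public static void main(String[] args) {
        Properties good = new Properties();
        good.setProperty("controller.quorum.voters", "1@localhost:9093");
        validate(good); // passes

        Properties bad = new Properties();
        bad.setProperty("controller.quorom.voters", "1@localhost:9093"); // misspelled
        try {
            validate(bad);
            throw new AssertionError("expected rejection");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

A softer variant, as the reporter suggests, would log a warning for unknown keys instead of throwing.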



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12712) KRaft: Missing controller.quorom.voters config not properly handled

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383602#comment-17383602
 ] 

Konstantine Karantasis commented on KAFKA-12712:


Postponing to the subsequent release given that this issue is not a blocker and 
did not make it on time for 3.0 code freeze. 

> KRaft: Missing controller.quorom.voters config not properly handled
> ---
>
> Key: KAFKA-12712
> URL: https://issues.apache.org/jira/browse/KAFKA-12712
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.8.0
>Reporter: Magnus Edenhill
>Priority: Major
>  Labels: kip-500
> Fix For: 3.0.0
>
>
> When trying out KRaft in 2.8 I misspelled controller.quorum.voters as 
> controller.quorom.voters, but the broker did not fail to start, nor did it 
> print any warning.
>  
> Instead it raised this error:
>  
> {code:java}
> [2021-04-23 18:25:13,484] INFO Starting controller 
> (kafka.server.ControllerServer)[2021-04-23 18:25:13,484] INFO Starting 
> controller (kafka.server.ControllerServer)[2021-04-23 18:25:13,485] ERROR 
> [kafka-raft-io-thread]: Error due to 
> (kafka.raft.KafkaRaftManager$RaftIoThread)java.lang.IllegalArgumentException: 
> bound must be positive at java.util.Random.nextInt(Random.java:388) at 
> org.apache.kafka.raft.RequestManager.findReadyVoter(RequestManager.java:57) 
> at 
> org.apache.kafka.raft.KafkaRaftClient.maybeSendAnyVoterFetch(KafkaRaftClient.java:1778)
>  at 
> org.apache.kafka.raft.KafkaRaftClient.pollUnattachedAsObserver(KafkaRaftClient.java:2080)
>  at 
> org.apache.kafka.raft.KafkaRaftClient.pollUnattached(KafkaRaftClient.java:2061)
>  at 
> org.apache.kafka.raft.KafkaRaftClient.pollCurrentState(KafkaRaftClient.java:2096)
>  at org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:2181) at 
> kafka.raft.KafkaRaftManager$RaftIoThread.doWork(RaftManager.scala:53) at 
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
> {code}
> which I guess eventually (1 minute later) led to this error which terminated 
> the broker:
> {code:java}
> [2021-04-23 18:26:14,435] ERROR [BrokerLifecycleManager id=2] Shutting down 
> because we were unable to register with the controller quorum. 
> (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,435] ERROR 
> [BrokerLifecycleManager id=2] Shutting down because we were unable to 
> register with the controller quorum. 
> (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,436] INFO 
> [BrokerLifecycleManager id=2] registrationTimeout: shutting down event queue. 
> (org.apache.kafka.queue.KafkaEventQueue)[2021-04-23 18:26:14,437] INFO 
> [BrokerLifecycleManager id=2] Transitioning from STARTING to SHUTTING_DOWN. 
> (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,437] INFO 
> [broker-2-to-controller-send-thread]: Shutting down 
> (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,438] INFO 
> [broker-2-to-controller-send-thread]: Stopped 
> (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,438] INFO 
> [broker-2-to-controller-send-thread]: Shutdown completed 
> (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,441] ERROR 
> [BrokerServer id=2] Fatal error during broker startup. Prepare to shutdown 
> (kafka.server.BrokerServer)java.util.concurrent.CancellationException at 
> java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2276) at 
> kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:474)
>  at 
> org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:174)
>  at java.lang.Thread.run(Thread.java:748)
> {code}
> But since the client listeners were made available prior to shutting down, 
> the broker was deemed up and operational by the (naive) monitoring tool.
> So..:
>  - Broker should fail on startup on invalid/unknown config properties. I 
> understand this is technically tricky, so at least a warning log should be 
> printed.
>  - Perhaps not create client listeners before control plane is somewhat happy.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12882) Add RegisteredBrokerCount and UnfencedBrokerCount metrics to the QuorumController

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383600#comment-17383600
 ] 

Konstantine Karantasis commented on KAFKA-12882:


The KIP has been published but it's still under discussion. 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-748%3A+Add+Broker+Count+Metrics#KIP748:AddBrokerCountMetrics]

I'm resetting the Fix version since we are past the relevant deadlines. Please 
make sure to set the appropriate version once the KIP gets approved. 

> Add RegisteredBrokerCount and UnfencedBrokerCount metrics to the 
> QuorumController
> -
>
> Key: KAFKA-12882
> URL: https://issues.apache.org/jira/browse/KAFKA-12882
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ryan Dielhenn
>Assignee: Ryan Dielhenn
>Priority: Minor
>  Labels: needs-kip
> Fix For: 3.0.0
>
>
> Adding RegisteredBrokerCount and UnfencedBrokerCount metrics to the 
> QuorumController.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12882) Add RegisteredBrokerCount and UnfencedBrokerCount metrics to the QuorumController

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12882:
---
Fix Version/s: (was: 3.0.0)

> Add RegisteredBrokerCount and UnfencedBrokerCount metrics to the 
> QuorumController
> -
>
> Key: KAFKA-12882
> URL: https://issues.apache.org/jira/browse/KAFKA-12882
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ryan Dielhenn
>Assignee: Ryan Dielhenn
>Priority: Minor
>  Labels: needs-kip
>
> Adding RegisteredBrokerCount and UnfencedBrokerCount metrics to the 
> QuorumController.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12699) Streams no longer overrides the java default uncaught exception handler

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12699:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> Streams no longer overrides the java default uncaught exception handler  
> -
>
> Key: KAFKA-12699
> URL: https://issues.apache.org/jira/browse/KAFKA-12699
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.8.0
>Reporter: Walker Carlson
>Priority: Minor
> Fix For: 3.1.0
>
>
> If a user used `Thread.setUncaughtExceptionHandler()` to set the handler for 
> all threads in the runtime, Streams would override that with its own handler. 
> However, since Streams no longer uses the `Thread` handler, it will no 
> longer do so. This can cause problems if the user does something like 
> `System.exit(1)` in the handler. 
>  
> If using the old handler in Streams, it will still work as it used to.
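The difference can be illustrated with plain threads and no Kafka at all. In this sketch the "framework" is a runnable that catches exceptions itself, standing in for the new Streams behavior; once exceptions are caught internally, a `Thread`-level uncaught-exception handler never fires, so side effects placed in it (such as `System.exit(1)`) stop happening:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Demonstrates when a Thread-level uncaught-exception handler fires.
public class HandlerDemo {
    static boolean uncaughtHandlerFires(Runnable body) throws InterruptedException {
        AtomicBoolean fired = new AtomicBoolean(false);
        Thread t = new Thread(body);
        t.setUncaughtExceptionHandler((thread, err) -> fired.set(true));
        t.start();
        t.join(); // the handler, if any, runs before the thread terminates
        return fired.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Old behavior: the exception escapes the thread, so the handler fires.
        boolean oldStyle = uncaughtHandlerFires(() -> {
            throw new RuntimeException("boom");
        });

        // New behavior: the "framework" catches the exception internally and
        // routes it to its own callback, so the Thread handler never fires.
        boolean newStyle = uncaughtHandlerFires(() -> {
            try {
                throw new RuntimeException("boom");
            } catch (Throwable caught) {
                // a framework-level handler would run here instead
            }
        });

        System.out.println("old=" + oldStyle + " new=" + newStyle); // old=true new=false
    }
}
```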



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-12699) Streams no longer overrides the java default uncaught exception handler

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383586#comment-17383586
 ] 

Konstantine Karantasis edited comment on KAFKA-12699 at 7/19/21, 9:08 PM:
--

Postponing to the subsequent release given that this issue is not a blocker and 
did not make it on time for the 3.0 code freeze. 


was (Author: kkonstantine):
Postponing to the subsequent release given that this issue is not a blocker and 
did not make it on time for 3.0 code freeze. 

> Streams no longer overrides the java default uncaught exception handler  
> -
>
> Key: KAFKA-12699
> URL: https://issues.apache.org/jira/browse/KAFKA-12699
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.8.0
>Reporter: Walker Carlson
>Priority: Minor
> Fix For: 3.0.0
>
>
> If a user used `Thread.setUncaughtExceptionHandler()` to set the handler for 
> all threads in the runtime, Streams would override that with its own handler. 
> However, since Streams no longer uses the `Thread` handler, it will no 
> longer do so. This can cause problems if the user does something like 
> `System.exit(1)` in the handler. 
>  
> If using the old handler in Streams, it will still work as it used to.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12699) Streams no longer overrides the java default uncaught exception handler

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383586#comment-17383586
 ] 

Konstantine Karantasis commented on KAFKA-12699:


Postponing to the subsequent release given that this issue is not a blocker and 
did not make it on time for 3.0 code freeze. 

> Streams no longer overrides the java default uncaught exception handler  
> -
>
> Key: KAFKA-12699
> URL: https://issues.apache.org/jira/browse/KAFKA-12699
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.8.0
>Reporter: Walker Carlson
>Priority: Minor
> Fix For: 3.0.0
>
>
> If a user used `Thread.setUncaughtExceptionHandler()` to set the handler for 
> all threads in the runtime, Streams would override that with its own handler. 
> However, since Streams no longer uses the `Thread` handler, it will no 
> longer do so. This can cause problems if the user does something like 
> `System.exit(1)` in the handler. 
>  
> If using the old handler in Streams, it will still work as it used to.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12774) kafka-streams 2.8: logging in uncaught-exceptionhandler doesn't go through log4j

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12774:
---
Fix Version/s: 3.0.1

> kafka-streams 2.8: logging in uncaught-exceptionhandler doesn't go through 
> log4j
> 
>
> Key: KAFKA-12774
> URL: https://issues.apache.org/jira/browse/KAFKA-12774
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.8.0
>Reporter: Jørgen
>Priority: Minor
> Fix For: 3.1.0, 2.8.1, 3.0.1
>
>
> When exceptions are handled in the uncaught-exception handler introduced in 
> KS2.8, the logging of the stacktrace doesn't seem to go through the logging 
> framework configured by the application (log4j2 in our case), but gets 
> printed to console "line-by-line".
> All other exceptions logged by kafka-streams go through log4j2 and get 
> formatted properly according to the log4j2 appender (json in our case). 
> Haven't tested this on other frameworks like logback.
> Application setup:
>  * Spring-boot 2.4.5
>  * Log4j 2.13.3
>  * Slf4j 1.7.30
> Log4j2 appender config:
> {code:java}
> 
> 
>  stacktraceAsString="true" properties="true">
>  value="$${date:-MM-dd'T'HH:mm:ss.SSSZ}"/>
> 
> 
>  {code}
> Uncaught exception handler config:
> {code:java}
> kafkaStreams.setUncaughtExceptionHandler { exception ->
> logger.warn("Uncaught exception handled - replacing thread", exception) 
> // logged properly
> 
> StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.REPLACE_THREAD
> } {code}
> Stacktrace that gets printed line-by-line:
> {code:java}
> Exception in thread "xxx-f5860dff-9a41-490e-8ab0-540b1a7f9ce4-StreamThread-2" 
> org.apache.kafka.streams.errors.StreamsException: Error encountered sending 
> record to topic xxx-repartition for task 3_2 due 
> to:org.apache.kafka.common.errors.InvalidPidMappingException: The producer 
> attempted to use a producer id which is not currently assigned to its 
> transactional id.Exception handler choose to FAIL the processing, no more 
> records would be sent.  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:226)
>at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.lambda$send$0(RecordCollectorImpl.java:196)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1365)
> at 
> org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:231)
>  at 
> org.apache.kafka.clients.producer.internals.ProducerBatch.abort(ProducerBatch.java:159)
>   at 
> org.apache.kafka.clients.producer.internals.RecordAccumulator.abortUndrainedBatches(RecordAccumulator.java:783)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:430)
>  at 
> org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:315)  
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:242)
>   at java.base/java.lang.Thread.run(Unknown Source)Caused by: 
> org.apache.kafka.common.errors.InvalidPidMappingException: The producer 
> attempted to use a producer id which is not currently assigned to its 
> transactional id. {code}
>  
> It's a little bit hard to reproduce as I haven't found any way to trigger 
> the uncaught-exception handler through JUnit tests.
> Link to discussion on slack: 
> https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1620389197436700



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12774) kafka-streams 2.8: logging in uncaught-exceptionhandler doesn't go through log4j

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12774:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> kafka-streams 2.8: logging in uncaught-exceptionhandler doesn't go through 
> log4j
> 
>
> Key: KAFKA-12774
> URL: https://issues.apache.org/jira/browse/KAFKA-12774
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.8.0
>Reporter: Jørgen
>Priority: Minor
> Fix For: 3.1.0, 2.8.1
>
>
> When exceptions are handled in the uncaught-exception handler introduced in 
> KS2.8, the logging of the stacktrace doesn't seem to go through the logging 
> framework configured by the application (log4j2 in our case), but gets 
> printed to console "line-by-line".
> All other exceptions logged by kafka-streams go through log4j2 and get 
> formatted properly according to the log4j2 appender (json in our case). 
> Haven't tested this on other frameworks like logback.
> Application setup:
>  * Spring-boot 2.4.5
>  * Log4j 2.13.3
>  * Slf4j 1.7.30
> Log4j2 appender config:
> {code:java}
> 
> 
>  stacktraceAsString="true" properties="true">
>  value="$${date:-MM-dd'T'HH:mm:ss.SSSZ}"/>
> 
> 
>  {code}
> Uncaught exception handler config:
> {code:java}
> kafkaStreams.setUncaughtExceptionHandler { exception ->
> logger.warn("Uncaught exception handled - replacing thread", exception) 
> // logged properly
> 
> StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.REPLACE_THREAD
> } {code}
> Stacktrace that gets printed line-by-line:
> {code:java}
> Exception in thread "xxx-f5860dff-9a41-490e-8ab0-540b1a7f9ce4-StreamThread-2" 
> org.apache.kafka.streams.errors.StreamsException: Error encountered sending 
> record to topic xxx-repartition for task 3_2 due 
> to:org.apache.kafka.common.errors.InvalidPidMappingException: The producer 
> attempted to use a producer id which is not currently assigned to its 
> transactional id.Exception handler choose to FAIL the processing, no more 
> records would be sent.  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:226)
>at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.lambda$send$0(RecordCollectorImpl.java:196)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1365)
> at 
> org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:231)
>  at 
> org.apache.kafka.clients.producer.internals.ProducerBatch.abort(ProducerBatch.java:159)
>   at 
> org.apache.kafka.clients.producer.internals.RecordAccumulator.abortUndrainedBatches(RecordAccumulator.java:783)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:430)
>  at 
> org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:315)  
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:242)
>   at java.base/java.lang.Thread.run(Unknown Source)Caused by: 
> org.apache.kafka.common.errors.InvalidPidMappingException: The producer 
> attempted to use a producer id which is not currently assigned to its 
> transactional id. {code}
>  
> It's a little bit hard to reproduce as I haven't found any way to trigger 
> the uncaught-exception handler through JUnit tests.
> Link to discussion on slack: 
> https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1620389197436700



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10641) ACL Command hangs with SSL as not existing with proper error code

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-10641:
---
Fix Version/s: (was: 3.0.0)

> ACL Command hangs with SSL as not existing with proper error code
> -
>
> Key: KAFKA-10641
> URL: https://issues.apache.org/jira/browse/KAFKA-10641
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0, 2.0.1, 2.1.0, 2.2.0, 2.1.1, 2.3.0, 2.2.1, 2.2.2, 
> 2.4.0, 2.3.1, 2.5.0, 2.4.1, 2.6.0, 2.5.1
>Reporter: Senthilnathan Muthusamy
>Assignee: Senthilnathan Muthusamy
>Priority: Minor
>
> When using the ACL Command in SSL mode, the process does not terminate after 
> a successful ACL operation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10641) ACL Command hangs with SSL as not existing with proper error code

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383584#comment-17383584
 ] 

Konstantine Karantasis commented on KAFKA-10641:


Resetting the fix version of this ticket since there hasn't been any follow up 
on the comments on the PR. Regarding 3.0, this issue is not a blocker and did 
not make it on time for the 3.0 code freeze. 

> ACL Command hangs with SSL as not existing with proper error code
> -
>
> Key: KAFKA-10641
> URL: https://issues.apache.org/jira/browse/KAFKA-10641
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0, 2.0.1, 2.1.0, 2.2.0, 2.1.1, 2.3.0, 2.2.1, 2.2.2, 
> 2.4.0, 2.3.1, 2.5.0, 2.4.1, 2.6.0, 2.5.1
>Reporter: Senthilnathan Muthusamy
>Assignee: Senthilnathan Muthusamy
>Priority: Minor
> Fix For: 3.0.0
>
>
> When using the ACL Command in SSL mode, the process does not terminate after 
> a successful ACL operation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10642) Expose the real stack trace if any exception occurred during SSL Client Trust Verification in extension

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383582#comment-17383582
 ] 

Konstantine Karantasis commented on KAFKA-10642:


Postponing to the subsequent release given that this issue is not a blocker and 
did not make it on time for 3.0 code freeze. Having said that, it seems it's 
been a while since there was any update on the associated PR. cc [~senthilm-ms] 
[~rajinisiva...@gmail.com] on whether this is still an issue. 

> Expose the real stack trace if any exception occurred during SSL Client Trust 
> Verification in extension
> ---
>
> Key: KAFKA-10642
> URL: https://issues.apache.org/jira/browse/KAFKA-10642
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.3.0, 2.4.0, 2.3.1, 2.5.0, 2.4.1, 2.6.0, 2.5.1
>Reporter: Senthilnathan Muthusamy
>Assignee: Senthilnathan Muthusamy
>Priority: Minor
> Fix For: 3.0.0
>
>
> If any exception occurs in the custom implementation of client 
> trust verification (i.e. using security.provider), the inner exception is 
> suppressed or hidden and not logged to the log file...
>  
> Below is an example stack trace not showing actual exception from the 
> extension/custom implementation.
>  
> [2020-05-13 14:30:26,892] ERROR [KafkaServer id=423810470] Fatal error during 
> KafkaServer startup. Prepare to shutdown 
> (kafka.server.KafkaServer)[2020-05-13 14:30:26,892] ERROR [KafkaServer 
> id=423810470] Fatal error during KafkaServer startup. Prepare to shutdown 
> (kafka.server.KafkaServer) org.apache.kafka.common.KafkaException: 
> org.apache.kafka.common.config.ConfigException: Invalid value 
> java.lang.RuntimeException: Delegated task threw Exception/Error for 
> configuration A client SSLEngine created with the provided settings can't 
> connect to a server SSLEngine created with those settings. at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:71)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:146)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:85)
>  at kafka.network.Processor.(SocketServer.scala:753) at 
> kafka.network.SocketServer.newProcessor(SocketServer.scala:394) at 
> kafka.network.SocketServer.$anonfun$addDataPlaneProcessors$1(SocketServer.scala:279)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.network.SocketServer.addDataPlaneProcessors(SocketServer.scala:278) at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1(SocketServer.scala:241)
>  at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1$adapted(SocketServer.scala:238)
>  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at 
> kafka.network.SocketServer.createDataPlaneAcceptorsAndProcessors(SocketServer.scala:238)
>  at kafka.network.SocketServer.startup(SocketServer.scala:121) at 
> kafka.server.KafkaServer.startup(KafkaServer.scala:265) at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44) at 
> kafka.Kafka$.main(Kafka.scala:84) at kafka.Kafka.main(Kafka.scala)Caused by: 
> org.apache.kafka.common.config.ConfigException: Invalid value 
> java.lang.RuntimeException: Delegated task threw Exception/Error for 
> configuration A client SSLEngine created with the provided settings can't 
> connect to a server SSLEngine created with those settings. at 
> org.apache.kafka.common.security.ssl.SslFactory.configure(SslFactory.java:100)
>  at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:69)
>  ... 18 more
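The fix the reporter asks for amounts to preserving the cause chain when wrapping a failure from a custom security provider. The sketch below uses generic exception types as stand-ins, not Kafka's actual classes:

```java
// Demonstrates why the inner exception disappears: wrapping with only a
// message discards the original stack trace, while passing it as the cause
// keeps it available for logging.
public class CauseChainDemo {
    static RuntimeException wrapLosingCause(Exception inner) {
        // Only the message survives; the original stack trace is lost.
        return new RuntimeException("Invalid value " + inner);
    }

    static RuntimeException wrapKeepingCause(Exception inner) {
        // The original exception is preserved as the cause.
        return new RuntimeException("Invalid value", inner);
    }

    public static void main(String[] args) {
        Exception inner = new IllegalStateException("real failure in trust verification");
        System.out.println(wrapLosingCause(inner).getCause());  // null
        System.out.println(wrapKeepingCause(inner).getCause()); // the inner exception
    }
}
```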



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10642) Expose the real stack trace if any exception occurred during SSL Client Trust Verification in extension

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-10642:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> Expose the real stack trace if any exception occurred during SSL Client Trust 
> Verification in extension
> ---
>
> Key: KAFKA-10642
> URL: https://issues.apache.org/jira/browse/KAFKA-10642
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.3.0, 2.4.0, 2.3.1, 2.5.0, 2.4.1, 2.6.0, 2.5.1
>Reporter: Senthilnathan Muthusamy
>Assignee: Senthilnathan Muthusamy
>Priority: Minor
> Fix For: 3.1.0
>
>
> If any exception occurs in the custom implementation of client 
> trust verification (i.e. using security.provider), the inner exception is 
> suppressed or hidden and not logged to the log file...
>  
> Below is an example stack trace not showing actual exception from the 
> extension/custom implementation.
>  
> [2020-05-13 14:30:26,892] ERROR [KafkaServer id=423810470] Fatal error during 
> KafkaServer startup. Prepare to shutdown 
> (kafka.server.KafkaServer)[2020-05-13 14:30:26,892] ERROR [KafkaServer 
> id=423810470] Fatal error during KafkaServer startup. Prepare to shutdown 
> (kafka.server.KafkaServer) org.apache.kafka.common.KafkaException: 
> org.apache.kafka.common.config.ConfigException: Invalid value 
> java.lang.RuntimeException: Delegated task threw Exception/Error for 
> configuration A client SSLEngine created with the provided settings can't 
> connect to a server SSLEngine created with those settings. at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:71)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:146)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:85)
>  at kafka.network.Processor.(SocketServer.scala:753) at 
> kafka.network.SocketServer.newProcessor(SocketServer.scala:394) at 
> kafka.network.SocketServer.$anonfun$addDataPlaneProcessors$1(SocketServer.scala:279)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.network.SocketServer.addDataPlaneProcessors(SocketServer.scala:278) at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1(SocketServer.scala:241)
>  at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1$adapted(SocketServer.scala:238)
>  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at 
> kafka.network.SocketServer.createDataPlaneAcceptorsAndProcessors(SocketServer.scala:238)
>  at kafka.network.SocketServer.startup(SocketServer.scala:121) at 
> kafka.server.KafkaServer.startup(KafkaServer.scala:265) at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44) at 
> kafka.Kafka$.main(Kafka.scala:84) at kafka.Kafka.main(Kafka.scala)Caused by: 
> org.apache.kafka.common.config.ConfigException: Invalid value 
> java.lang.RuntimeException: Delegated task threw Exception/Error for 
> configuration A client SSLEngine created with the provided settings can't 
> connect to a server SSLEngine created with those settings. at 
> org.apache.kafka.common.security.ssl.SslFactory.configure(SslFactory.java:100)
>  at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:69)
>  ... 18 more





[jira] [Updated] (KAFKA-12899) Support --bootstrap-server in ReplicaVerificationTool

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12899:
---
Fix Version/s: (was: 3.0.0)

> Support --bootstrap-server in ReplicaVerificationTool
> -
>
> Key: KAFKA-12899
> URL: https://issues.apache.org/jira/browse/KAFKA-12899
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Dongjin Lee
>Assignee: Dongjin Lee
>Priority: Minor
>  Labels: needs-kip
>
> kafka.tools.ReplicaVerificationTool still uses --broker-list, breaking 
> consistency with other (already migrated) tools.





[jira] [Commented] (KAFKA-12899) Support --bootstrap-server in ReplicaVerificationTool

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383577#comment-17383577
 ] 

Konstantine Karantasis commented on KAFKA-12899:


The KIP freeze passed a while ago and the associated KIP has not received the 
required votes yet: 
[https://lists.apache.org/thread.html/rebd427d5fd34acf5b378d7a904af2c804e7460b32d34ddbb3368776c%40%3Cdev.kafka.apache.org%3E]

Will reset the target version for this feature. I believe it makes sense to set 
it once the voting process concludes. 

> Support --bootstrap-server in ReplicaVerificationTool
> -
>
> Key: KAFKA-12899
> URL: https://issues.apache.org/jira/browse/KAFKA-12899
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Dongjin Lee
>Assignee: Dongjin Lee
>Priority: Minor
>  Labels: needs-kip
> Fix For: 3.0.0
>
>
> kafka.tools.ReplicaVerificationTool still uses --broker-list, breaking 
> consistency with other (already migrated) tools.





[jira] [Updated] (KAFKA-9803) Allow producers to recover gracefully from transaction timeouts

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-9803:
--
Fix Version/s: (was: 3.0.0)
   3.1.0

> Allow producers to recover gracefully from transaction timeouts
> ---
>
> Key: KAFKA-9803
> URL: https://issues.apache.org/jira/browse/KAFKA-9803
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer, streams
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>  Labels: needs-kip
> Fix For: 3.1.0
>
>
> Transaction timeouts are detected by the transaction coordinator. When the 
> coordinator detects a timeout, it bumps the producer epoch and aborts the 
> transaction. The epoch bump is necessary in order to prevent the current 
> producer from being able to begin writing to a new transaction which was not 
> started through the coordinator.  
> Transactions may also be aborted if a new producer with the same 
> `transactional.id` starts up. Similarly this results in an epoch bump. 
> Currently the coordinator does not distinguish these two cases. Both will end 
> up as a `ProducerFencedException`, which means the producer needs to shut 
> itself down. 
> We can improve this with the new APIs from KIP-360. When the coordinator 
> times out a transaction, it can remember that fact and allow the existing 
> producer to claim the bumped epoch and continue. Roughly the logic would work 
> like this:
> 1. When a transaction times out, set lastProducerEpoch to the current epoch 
> and do the normal bump.
> 2. Any transactional requests from the old epoch result in a new 
> TRANSACTION_TIMED_OUT error code, which is propagated to the application.
> 3. The producer recovers by sending InitProducerId with the current epoch. 
> The coordinator returns the bumped epoch.
> One issue that needs to be addressed is how to handle INVALID_PRODUCER_EPOCH 
> from Produce requests. Partition leaders will not generally know if a bumped 
> epoch was the result of a timed out transaction or a fenced producer. 
> Possibly the producer can treat these errors as abortable when they come from 
> Produce responses. In that case, the user would try to abort the transaction 
> and then we can see if it was due to a timeout or otherwise.
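The three-step recovery flow described above can be sketched as a small state model. This is a hypothetical illustration of the KIP-360-style idea, not Kafka's actual coordinator classes; names such as `CoordinatorStateSketch` and `timedOutEpoch` are invented for this sketch.

```java
import java.util.OptionalInt;

// Hypothetical model of the recovery flow described above; not Kafka code.
public class CoordinatorStateSketch {
    int currentEpoch = 5;
    OptionalInt timedOutEpoch = OptionalInt.empty();

    // Step 1: on a transaction timeout, remember the old epoch, then bump.
    void onTransactionTimeout() {
        timedOutEpoch = OptionalInt.of(currentEpoch);
        currentEpoch++;
    }

    // Step 2: a request from the remembered old epoch maps to the new
    // TRANSACTION_TIMED_OUT error; any other stale epoch is a real fence.
    String errorFor(int requestEpoch) {
        if (requestEpoch == currentEpoch) return "NONE";
        if (timedOutEpoch.isPresent() && requestEpoch == timedOutEpoch.getAsInt())
            return "TRANSACTION_TIMED_OUT";
        return "PRODUCER_FENCED";
    }

    // Step 3: InitProducerId sent with the timed-out epoch reclaims the
    // bumped epoch, so the producer can continue instead of shutting down.
    int initProducerId(int requestEpoch) {
        if (timedOutEpoch.isPresent() && requestEpoch == timedOutEpoch.getAsInt()) {
            timedOutEpoch = OptionalInt.empty();
            return currentEpoch;
        }
        throw new IllegalStateException("PRODUCER_FENCED");
    }

    public static void main(String[] args) {
        CoordinatorStateSketch coordinator = new CoordinatorStateSketch();
        coordinator.onTransactionTimeout();                 // epoch 5 -> 6
        System.out.println(coordinator.errorFor(5));        // old epoch: timed out, not fenced
        System.out.println(coordinator.initProducerId(5));  // producer reclaims epoch 6
        System.out.println(coordinator.errorFor(6));        // continues normally
    }
}
```

Distinguishing the two cases in the coordinator state is what lets the producer treat a timeout as recoverable while a genuine fence still shuts it down.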





[jira] [Commented] (KAFKA-9803) Allow producers to recover gracefully from transaction timeouts

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383576#comment-17383576
 ] 

Konstantine Karantasis commented on KAFKA-9803:
---

Postponing to the subsequent release given that this issue is not a blocker and 
did not make it on time for 3.0 code freeze. 

> Allow producers to recover gracefully from transaction timeouts
> ---
>
> Key: KAFKA-9803
> URL: https://issues.apache.org/jira/browse/KAFKA-9803
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer, streams
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>  Labels: needs-kip
> Fix For: 3.0.0
>
>
> Transaction timeouts are detected by the transaction coordinator. When the 
> coordinator detects a timeout, it bumps the producer epoch and aborts the 
> transaction. The epoch bump is necessary in order to prevent the current 
> producer from being able to begin writing to a new transaction which was not 
> started through the coordinator.  
> Transactions may also be aborted if a new producer with the same 
> `transactional.id` starts up. Similarly this results in an epoch bump. 
> Currently the coordinator does not distinguish these two cases. Both will end 
> up as a `ProducerFencedException`, which means the producer needs to shut 
> itself down. 
> We can improve this with the new APIs from KIP-360. When the coordinator 
> times out a transaction, it can remember that fact and allow the existing 
> producer to claim the bumped epoch and continue. Roughly the logic would work 
> like this:
> 1. When a transaction times out, set lastProducerEpoch to the current epoch 
> and do the normal bump.
> 2. Any transactional requests from the old epoch result in a new 
> TRANSACTION_TIMED_OUT error code, which is propagated to the application.
> 3. The producer recovers by sending InitProducerId with the current epoch. 
> The coordinator returns the bumped epoch.
> One issue that needs to be addressed is how to handle INVALID_PRODUCER_EPOCH 
> from Produce requests. Partition leaders will not generally know if a bumped 
> epoch was the result of a timed out transaction or a fenced producer. 
> Possibly the producer can treat these errors as abortable when they come from 
> Produce responses. In that case, the user would try to abort the transaction 
> and then we can see if it was due to a timeout or otherwise.





[jira] [Commented] (KAFKA-12842) Failing test: org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383575#comment-17383575
 ] 

Konstantine Karantasis commented on KAFKA-12842:


This is an infrequent failure on a third-party dependency. Not a blocker, so 
I'm postponing it to the next release. 

> Failing test: 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic
> --
>
> Key: KAFKA-12842
> URL: https://issues.apache.org/jira/browse/KAFKA-12842
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: John Roesler
>Priority: Major
> Fix For: 3.1.0
>
>
> This test failed during a PR build, which means that it failed twice in a 
> row, due to the test-retry logic in PR builds.
>  
> [https://github.com/apache/kafka/pull/10744/checks?check_run_id=2643417209]
>  
> {noformat}
> java.lang.NullPointerException
>   at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>   at org.reflections.Store.getAllIncluding(Store.java:82)
>   at org.reflections.Store.getAll(Store.java:93)
>   at org.reflections.Reflections.getSubTypesOf(Reflections.java:404)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.getPluginDesc(DelegatingClassLoader.java:352)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:337)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:216)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:209)
>   at 
> org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:61)
>   at 
> org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:93)
>   at 
> org.apache.kafka.connect.util.clusters.WorkerHandle.start(WorkerHandle.java:50)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.addWorker(EmbeddedConnectCluster.java:174)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.startConnect(EmbeddedConnectCluster.java:260)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.start(EmbeddedConnectCluster.java:141)
>   at 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic(ConnectWorkerIntegrationTest.java:303)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   

[jira] [Assigned] (KAFKA-12842) Failing test: org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis reassigned KAFKA-12842:
--

Assignee: (was: Konstantine Karantasis)

> Failing test: 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic
> --
>
> Key: KAFKA-12842
> URL: https://issues.apache.org/jira/browse/KAFKA-12842
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: John Roesler
>Priority: Major
> Fix For: 3.1.0
>
>
> This test failed during a PR build, which means that it failed twice in a 
> row, due to the test-retry logic in PR builds.
>  
> [https://github.com/apache/kafka/pull/10744/checks?check_run_id=2643417209]
>  
> {noformat}
> java.lang.NullPointerException
>   at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>   at org.reflections.Store.getAllIncluding(Store.java:82)
>   at org.reflections.Store.getAll(Store.java:93)
>   at org.reflections.Reflections.getSubTypesOf(Reflections.java:404)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.getPluginDesc(DelegatingClassLoader.java:352)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:337)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:216)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:209)
>   at 
> org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:61)
>   at 
> org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:93)
>   at 
> org.apache.kafka.connect.util.clusters.WorkerHandle.start(WorkerHandle.java:50)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.addWorker(EmbeddedConnectCluster.java:174)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.startConnect(EmbeddedConnectCluster.java:260)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.start(EmbeddedConnectCluster.java:141)
>   at 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic(ConnectWorkerIntegrationTest.java:303)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> 

[jira] [Updated] (KAFKA-12842) Failing test: org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12842:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> Failing test: 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic
> --
>
> Key: KAFKA-12842
> URL: https://issues.apache.org/jira/browse/KAFKA-12842
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: John Roesler
>Priority: Major
> Fix For: 3.1.0
>
>
> This test failed during a PR build, which means that it failed twice in a 
> row, due to the test-retry logic in PR builds.
>  
> [https://github.com/apache/kafka/pull/10744/checks?check_run_id=2643417209]
>  
> {noformat}
> java.lang.NullPointerException
>   at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>   at org.reflections.Store.getAllIncluding(Store.java:82)
>   at org.reflections.Store.getAll(Store.java:93)
>   at org.reflections.Reflections.getSubTypesOf(Reflections.java:404)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.getPluginDesc(DelegatingClassLoader.java:352)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:337)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:216)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:209)
>   at 
> org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:61)
>   at 
> org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:93)
>   at 
> org.apache.kafka.connect.util.clusters.WorkerHandle.start(WorkerHandle.java:50)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.addWorker(EmbeddedConnectCluster.java:174)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.startConnect(EmbeddedConnectCluster.java:260)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.start(EmbeddedConnectCluster.java:141)
>   at 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic(ConnectWorkerIntegrationTest.java:303)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> 

[jira] [Assigned] (KAFKA-12842) Failing test: org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis reassigned KAFKA-12842:
--

Assignee: Konstantine Karantasis

> Failing test: 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic
> --
>
> Key: KAFKA-12842
> URL: https://issues.apache.org/jira/browse/KAFKA-12842
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: John Roesler
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.1.0
>
>
> This test failed during a PR build, which means that it failed twice in a 
> row, due to the test-retry logic in PR builds.
>  
> [https://github.com/apache/kafka/pull/10744/checks?check_run_id=2643417209]
>  
> {noformat}
> java.lang.NullPointerException
>   at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>   at org.reflections.Store.getAllIncluding(Store.java:82)
>   at org.reflections.Store.getAll(Store.java:93)
>   at org.reflections.Reflections.getSubTypesOf(Reflections.java:404)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.getPluginDesc(DelegatingClassLoader.java:352)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:337)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:216)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:209)
>   at 
> org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:61)
>   at 
> org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:93)
>   at 
> org.apache.kafka.connect.util.clusters.WorkerHandle.start(WorkerHandle.java:50)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.addWorker(EmbeddedConnectCluster.java:174)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.startConnect(EmbeddedConnectCluster.java:260)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.start(EmbeddedConnectCluster.java:141)
>   at 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic(ConnectWorkerIntegrationTest.java:303)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> 

[jira] [Commented] (KAFKA-9910) Implement new transaction timed out error

2021-07-19 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383573#comment-17383573
 ] 

Konstantine Karantasis commented on KAFKA-9910:
---

Postponing to the subsequent release given that this issue is not a blocker and 
did not make it on time for 3.0 code freeze. 

> Implement new transaction timed out error
> -
>
> Key: KAFKA-9910
> URL: https://issues.apache.org/jira/browse/KAFKA-9910
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core
>Reporter: Boyang Chen
>Assignee: HaiyuanZhao
>Priority: Major
> Fix For: 3.0.0
>
>






[jira] [Updated] (KAFKA-9910) Implement new transaction timed out error

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-9910:
--
Fix Version/s: (was: 3.0.0)
   3.1.0

> Implement new transaction timed out error
> -
>
> Key: KAFKA-9910
> URL: https://issues.apache.org/jira/browse/KAFKA-9910
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core
>Reporter: Boyang Chen
>Assignee: HaiyuanZhao
>Priority: Major
> Fix For: 3.1.0
>
>






[jira] [Resolved] (KAFKA-12803) Support reassigning partitions when in KRaft mode

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-12803.

Resolution: Fixed

Resolving as Fixed given that the PR got merged and cherry-picked to 3.0:
https://github.com/apache/kafka/pull/10753

> Support reassigning partitions when in KRaft mode
> -
>
> Key: KAFKA-12803
> URL: https://issues.apache.org/jira/browse/KAFKA-12803
> Project: Kafka
>  Issue Type: Improvement
>  Components: controller
>Affects Versions: 2.8.0
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Major
>  Labels: kip-500
> Fix For: 3.0.0
>
>
> Support reassigning partitions when in KRaft mode





[jira] [Resolved] (KAFKA-13090) Improve cluster snapshot integration test

2021-07-19 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-13090.

Resolution: Fixed

Resolving given that the PR got merged and cherry-picked to 3.0: 
https://github.com/apache/kafka/pull/11054

> Improve cluster snapshot integration test
> -
>
> Key: KAFKA-13090
> URL: https://issues.apache.org/jira/browse/KAFKA-13090
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
>  Labels: kip-500
> Fix For: 3.0.0
>
>
> Extends the test in RaftClusterSnapshotTest to verify that both the 
> controllers and brokers are generating snapshots.





[jira] [Resolved] (KAFKA-13073) Simulation test fails due to inconsistency in MockLog's implementation

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-13073.

Resolution: Fixed

> Simulation test fails due to inconsistency in MockLog's implementation
> --
>
> Key: KAFKA-13073
> URL: https://issues.apache.org/jira/browse/KAFKA-13073
> Project: Kafka
>  Issue Type: Bug
>  Components: controller, replication
>Affects Versions: 3.0.0
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
>  Labels: kip-500
> Fix For: 3.0.0
>
>
> We are getting the following error on trunk
> {code:java}
> RaftEventSimulationTest > canRecoverAfterAllNodesKilled STANDARD_OUT
> timestamp = 2021-07-12T16:26:55.663, 
> RaftEventSimulationTest:canRecoverAfterAllNodesKilled =
>   java.lang.RuntimeException:
> Uncaught exception during poll of node 1  
> |---jqwik---
> tries = 25                    | # of calls to property
> checks = 25                   | # of not rejected calls
> generation = RANDOMIZED       | parameters are randomly generated
> after-failure = PREVIOUS_SEED | use the previous seed
> when-fixed-seed = ALLOW       | fixing the random seed is allowed
> edge-cases#mode = MIXIN       | edge cases are mixed in
> edge-cases#total = 108        | # of all combined edge cases
> edge-cases#tried = 4          | # of edge cases tried in current run
> seed = 8079861963960994566    | random seed to reproduce generated values 
> Sample
> --
>   arg0: 4002
>   arg1: 2
>   arg2: 4{code}
> I think there are a couple of issues here:
>  # The {{ListenerContext}} for {{KafkaRaftClient}} uses the value returned by 
> {{ReplicatedLog::startOffset()}} to determined the log start and when to load 
> a snapshot while the {{MockLog}} implementation uses {{logStartOffset}} which 
> could be a different value.
>  # {{MockLog}} doesn't implement {{ReplicatedLog::maybeClean}} so the log 
> start offset is always 0.
>  # The snapshot id validation for {{MockLog}} and {{KafkaMetadataLog}}'s 
> {{createNewSnapshot}} throws an exception when the snapshot id is less than 
> the log start offset.
> Solutions:
> To fix the error quoted above we only need to address bullet point 3, but I 
> think we should fix all of the issues enumerated in this Jira.
> For 1. we should change the {{MockLog}} implementation so that it uses 
> {{startOffset}} both externally and internally.
> For 2. I will file another issue to track this implementation.
> For 3. I think this validation is too strict. It is safe to simply ignore any 
> attempt by the state machine to create a snapshot with an id less than the log 
> start offset. We should return {{Optional.empty()}} when the snapshot id is 
> less than the log start offset. This tells the user that it doesn't need to 
> generate a snapshot for that offset.
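The relaxed validation proposed in point 3 could be sketched as follows. This is a hypothetical, simplified method for illustration only; the names do not match Kafka's actual {{ReplicatedLog}} or snapshot-writer APIs:

```java
import java.util.Optional;

// Hypothetical sketch of the relaxed snapshot-id validation proposed above.
// Instead of throwing when the requested snapshot id is older than the log
// start offset, return Optional.empty() so the state machine learns it does
// not need a snapshot at that offset.
public class SnapshotValidationSketch {
    // Returns a stand-in "writer handle" (here just the offset) when the
    // snapshot id is valid, or empty when it precedes the log start offset.
    static Optional<Long> createNewSnapshot(long snapshotOffset, long logStartOffset) {
        if (snapshotOffset < logStartOffset) {
            return Optional.empty(); // stale id: ignored rather than treated as an error
        }
        return Optional.of(snapshotOffset);
    }

    public static void main(String[] args) {
        System.out.println(createNewSnapshot(5, 10));  // prints Optional.empty
        System.out.println(createNewSnapshot(10, 10)); // prints Optional[10]
    }
}
```

With this shape the caller can treat an empty result as "no snapshot needed" instead of handling an exception.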



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-7632) Support Compression Level

2021-07-14 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380866#comment-17380866
 ] 

Konstantine Karantasis edited comment on KAFKA-7632 at 7/14/21, 9:50 PM:
-

This feature was not approved on time for 3.0. Pushing the target version to 3.1


was (Author: kkonstantine):
This feature was not approved in time for 3.0. Pushing the target version to 3.1

> Support Compression Level
> -
>
> Key: KAFKA-7632
> URL: https://issues.apache.org/jira/browse/KAFKA-7632
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 2.1.0
> Environment: all
>Reporter: Dave Waters
>Assignee: Dongjin Lee
>Priority: Blocker
>  Labels: needs-kip
> Fix For: 3.1.0
>
>
> The compression level for ZSTD is currently set to use the default level (3), 
> which is a conservative setting that in some use cases eliminates the value 
> that ZSTD provides with improved compression. Each use case will vary, so 
> exposing the level as a broker configuration setting will allow the user to 
> adjust the level.
> Since it applies to the other compression codecs, we should add the same 
> functionalities to them.
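A sketch of what a producer configuration might look like if such a setting were exposed. {{compression.type}} is an existing Kafka producer config; the {{compression.level}} key below is hypothetical, standing in for whatever name a KIP would eventually choose:

```java
import java.util.Properties;

// Illustrative only: builds producer properties with a compression codec and
// a level. "compression.type" is a real Kafka producer config, while the
// "compression.level" key is hypothetical, standing in for the setting this
// ticket proposes.
public class CompressionConfigSketch {
    static Properties producerProps(String codec, int level) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("compression.type", codec);                  // e.g. zstd, gzip, lz4
        props.setProperty("compression.level", String.valueOf(level)); // proposed, not yet real
        return props;
    }

    public static void main(String[] args) {
        // Request a higher ratio than zstd's conservative default of 3.
        Properties p = producerProps("zstd", 6);
        System.out.println(p.getProperty("compression.type")); // prints zstd
    }
}
```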



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9366) Upgrade log4j to log4j2

2021-07-14 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380874#comment-17380874
 ] 

Konstantine Karantasis commented on KAFKA-9366:
---

This feature was not approved on time for 3.0. Pushing the target version to 3.1

> Upgrade log4j to log4j2
> ---
>
> Key: KAFKA-9366
> URL: https://issues.apache.org/jira/browse/KAFKA-9366
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.2.0, 2.1.1, 2.3.0, 2.4.0
>Reporter: leibo
>Assignee: Dongjin Lee
>Priority: Critical
>  Labels: needs-kip
> Fix For: 3.0.0
>
>
> h2. CVE-2019-17571 Detail
> Included in Log4j 1.2 is a SocketServer class that is vulnerable to 
> deserialization of untrusted data which can be exploited to remotely execute 
> arbitrary code when combined with a deserialization gadget when listening to 
> untrusted network traffic for log data. This affects Log4j versions 1.2 
> up to 1.2.17.
>  
> [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17571]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9366) Upgrade log4j to log4j2

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-9366:
--
Fix Version/s: (was: 3.0.0)
   3.1.0

> Upgrade log4j to log4j2
> ---
>
> Key: KAFKA-9366
> URL: https://issues.apache.org/jira/browse/KAFKA-9366
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.2.0, 2.1.1, 2.3.0, 2.4.0
>Reporter: leibo
>Assignee: Dongjin Lee
>Priority: Critical
>  Labels: needs-kip
> Fix For: 3.1.0
>
>
> h2. CVE-2019-17571 Detail
> Included in Log4j 1.2 is a SocketServer class that is vulnerable to 
> deserialization of untrusted data which can be exploited to remotely execute 
> arbitrary code when combined with a deserialization gadget when listening to 
> untrusted network traffic for log data. This affects Log4j versions 1.2 
> up to 1.2.17.
>  
> [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17571]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12495) Unbalanced connectors/tasks distribution will happen in Connect's incremental cooperative assignor

2021-07-14 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380872#comment-17380872
 ] 

Konstantine Karantasis commented on KAFKA-12495:


This issue corresponds to a corner case that does not seem to appear in 
practice often. The current suggestion to allow for consecutive revocations 
carries some risk. I have another fix in mind that I'd like to explore. In the 
meantime I'm punting this issue to the next release. 

> Unbalanced connectors/tasks distribution will happen in Connect's incremental 
> cooperative assignor
> --
>
> Key: KAFKA-12495
> URL: https://issues.apache.org/jira/browse/KAFKA-12495
> Project: Kafka
>  Issue Type: Bug
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Attachments: image-2021-03-18-15-04-57-854.png, 
> image-2021-03-18-15-05-52-557.png, image-2021-03-18-15-07-27-103.png
>
>
> In Kafka Connect, we implement incremental cooperative rebalance algorithm 
> based on KIP-415 
> ([https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect)|https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect].
>  However, we have a bad assumption in the algorithm implementation, which is: 
> after the revoking rebalance completes, the member (worker) count will be the 
> same as in the previous round of rebalance.
>  
> Let's take a look at the example in the KIP-415:
> !image-2021-03-18-15-07-27-103.png|width=441,height=556!
> It works well for most cases. But what if W4 is added after the 1st rebalance 
> completes and before the 2nd rebalance starts? Let's see what happens with 
> this example (we'll use 10 tasks here):
>  
> {code:java}
> Initial group and assignment: W1([AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> Config topic contains: AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, BT4, 
> BT5
> W1 is current leader
> W2 joins with assignment: []
> Rebalance is triggered
> W3 joins while rebalance is still active with assignment: []
> W1 joins with assignment: [AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, 
> BT4, BT5]
> W1 becomes leader
> W1 computes and sends assignments:
> W1(delay: 0, assigned: [AC0, AT1, AT2, AT3], revoked: [AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> W2(delay: 0, assigned: [], revoked: [])
> W3(delay: 0, assigned: [], revoked: [])
> W1 stops revoked resources
> W1 rejoins with assignment: [AC0, AT1, AT2, AT3]
> Rebalance is triggered
> W2 joins with assignment: []
> W3 joins with assignment: []
> // one more member joined
> W4 joins with assignment: []
> W1 becomes leader
> W1 computes and sends assignments:
> // We assigned all the previously revoked Connectors/Tasks to the new member, 
> but we didn't revoke any more C/T in this round, which causes an unbalanced 
> distribution
> W1(delay: 0, assigned: [AC0, AT1, AT2, AT3], revoked: [])
> W2(delay: 0, assigned: [AT4, AT5, BC0], revoked: [])
> W3(delay: 0, assigned: [BT1, BT2, BT4], revoked: [])
> W4(delay: 0, assigned: [BT4, BT5], revoked: [])
> {code}
> Because we don't allow consecutive revocations in two consecutive rebalances 
> (under the same leader), we end up with this uneven distribution in this 
> situation. We should allow a consecutive rebalance to perform another round of 
> revocation, moving the revoked C/T to the other members in this case.
> expected:
> {code:java}
> Initial group and assignment: W1([AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> Config topic contains: AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, BT4, 
> BT5
> W1 is current leader
> W2 joins with assignment: []
> Rebalance is triggered
> W3 joins while rebalance is still active with assignment: []
> W1 joins with assignment: [AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, 
> BT4, BT5]
> W1 becomes leader
> W1 computes and sends assignments:
> W1(delay: 0, assigned: [AC0, AT1, AT2, AT3], revoked: [AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> W2(delay: 0, assigned: [], revoked: [])
> W3(delay: 0, assigned: [], revoked: [])
> W1 stops revoked resources
> W1 rejoins with assignment: [AC0, AT1, AT2, AT3]
> Rebalance is triggered
> W2 joins with assignment: []
> W3 joins with assignment: []
> // one more member joined
> W4 joins with assignment: []
> W1 becomes leader
> W1 computes and sends assignments:
> // We assigned all the previous revoked Connectors/Tasks to the new member, 
> **and also revoke some C/T** 
> W1(delay: 0, assigned: [AC0, AT1, AT2], revoked: [AT3])
> W2(delay: 0, assigned: [AT4, AT5, BC0], revoked: [])
> W3(delay: 0, assigned: [BT1, BT2, BT4], revoked: [])
> W4(delay: 0, assigned: [BT4, BT5], revoked: [])
> // another round of rebalance to 

[jira] [Updated] (KAFKA-12495) Unbalanced connectors/tasks distribution will happen in Connect's incremental cooperative assignor

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12495:
---
Fix Version/s: 3.1.0

> Unbalanced connectors/tasks distribution will happen in Connect's incremental 
> cooperative assignor
> --
>
> Key: KAFKA-12495
> URL: https://issues.apache.org/jira/browse/KAFKA-12495
> Project: Kafka
>  Issue Type: Bug
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: image-2021-03-18-15-04-57-854.png, 
> image-2021-03-18-15-05-52-557.png, image-2021-03-18-15-07-27-103.png
>
>
> In Kafka Connect, we implement incremental cooperative rebalance algorithm 
> based on KIP-415 
> ([https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect)|https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect].
>  However, we have a bad assumption in the algorithm implementation, which is: 
> after the revoking rebalance completes, the member (worker) count will be the 
> same as in the previous round of rebalance.
>  
> Let's take a look at the example in the KIP-415:
> !image-2021-03-18-15-07-27-103.png|width=441,height=556!
> It works well for most cases. But what if W4 is added after the 1st rebalance 
> completes and before the 2nd rebalance starts? Let's see what happens with 
> this example (we'll use 10 tasks here):
>  
> {code:java}
> Initial group and assignment: W1([AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> Config topic contains: AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, BT4, 
> BT5
> W1 is current leader
> W2 joins with assignment: []
> Rebalance is triggered
> W3 joins while rebalance is still active with assignment: []
> W1 joins with assignment: [AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, 
> BT4, BT5]
> W1 becomes leader
> W1 computes and sends assignments:
> W1(delay: 0, assigned: [AC0, AT1, AT2, AT3], revoked: [AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> W2(delay: 0, assigned: [], revoked: [])
> W3(delay: 0, assigned: [], revoked: [])
> W1 stops revoked resources
> W1 rejoins with assignment: [AC0, AT1, AT2, AT3]
> Rebalance is triggered
> W2 joins with assignment: []
> W3 joins with assignment: []
> // one more member joined
> W4 joins with assignment: []
> W1 becomes leader
> W1 computes and sends assignments:
> // We assigned all the previously revoked Connectors/Tasks to the new member, 
> but we didn't revoke any more C/T in this round, which causes an unbalanced 
> distribution
> W1(delay: 0, assigned: [AC0, AT1, AT2, AT3], revoked: [])
> W2(delay: 0, assigned: [AT4, AT5, BC0], revoked: [])
> W3(delay: 0, assigned: [BT1, BT2, BT4], revoked: [])
> W4(delay: 0, assigned: [BT4, BT5], revoked: [])
> {code}
> Because we don't allow consecutive revocations in two consecutive rebalances 
> (under the same leader), we end up with this uneven distribution in this 
> situation. We should allow a consecutive rebalance to perform another round of 
> revocation, moving the revoked C/T to the other members in this case.
> expected:
> {code:java}
> Initial group and assignment: W1([AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> Config topic contains: AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, BT4, 
> BT5
> W1 is current leader
> W2 joins with assignment: []
> Rebalance is triggered
> W3 joins while rebalance is still active with assignment: []
> W1 joins with assignment: [AC0, AT1, AT2, AT3, AT4, AT5, BC0, BT1, BT2, BT4, 
> BT4, BT5]
> W1 becomes leader
> W1 computes and sends assignments:
> W1(delay: 0, assigned: [AC0, AT1, AT2, AT3], revoked: [AT4, AT5, BC0, BT1, 
> BT2, BT4, BT4, BT5])
> W2(delay: 0, assigned: [], revoked: [])
> W3(delay: 0, assigned: [], revoked: [])
> W1 stops revoked resources
> W1 rejoins with assignment: [AC0, AT1, AT2, AT3]
> Rebalance is triggered
> W2 joins with assignment: []
> W3 joins with assignment: []
> // one more member joined
> W4 joins with assignment: []
> W1 becomes leader
> W1 computes and sends assignments:
> // We assigned all the previous revoked Connectors/Tasks to the new member, 
> **and also revoke some C/T** 
> W1(delay: 0, assigned: [AC0, AT1, AT2], revoked: [AT3])
> W2(delay: 0, assigned: [AT4, AT5, BC0], revoked: [])
> W3(delay: 0, assigned: [BT1, BT2, BT4], revoked: [])
> W4(delay: 0, assigned: [BT4, BT5], revoked: [])
> // another round of rebalance to assign the new revoked C/T to the other 
> members
> W1 rejoins with assignment: [AC0, AT1, AT2] 
> Rebalance is triggered 
> W2 joins with assignment: [AT4, AT5, BC0] 
> W3 joins with assignment: [BT1, BT2, BT4]
> W4 joins with assignment: [BT4, BT5]
> W1 becomes leader 

[jira] [Commented] (KAFKA-12283) Flaky Test RebalanceSourceConnectorsIntegrationTest#testMultipleWorkersRejoining

2021-07-14 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380871#comment-17380871
 ] 

Konstantine Karantasis commented on KAFKA-12283:


As mentioned above, this failure corresponds to a corner case that does not 
seem to appear in practice often. The current suggestion to allow for 
consecutive revocations carries some risk. I have another fix in mind that I'd 
like to explore. In the meantime I'm punting this issue to the next release. 

> Flaky Test 
> RebalanceSourceConnectorsIntegrationTest#testMultipleWorkersRejoining
> 
>
> Key: KAFKA-12283
> URL: https://issues.apache.org/jira/browse/KAFKA-12283
> Project: Kafka
>  Issue Type: Test
>  Components: KafkaConnect, unit tests
>Reporter: Matthias J. Sax
>Assignee: Luke Chen
>Priority: Blocker
>  Labels: flaky-test
> Fix For: 3.0.0
>
>
> https://github.com/apache/kafka/pull/1/checks?check_run_id=1820092809
> {quote} {{java.lang.AssertionError: Tasks are imbalanced: 
> localhost:36037=[seq-source13-0, seq-source13-1, seq-source13-2, 
> seq-source13-3, seq-source12-0, seq-source12-1, seq-source12-2, 
> seq-source12-3]
> localhost:43563=[seq-source11-0, seq-source11-2, seq-source10-0, 
> seq-source10-2]
> localhost:46539=[seq-source11-1, seq-source11-3, seq-source10-1, 
> seq-source10-3]
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at 
> org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest.assertConnectorAndTasksAreUniqueAndBalanced(RebalanceSourceConnectorsIntegrationTest.java:362)
>   at 
> org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:303)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:351)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:319)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:300)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:290)
>   at 
> org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest.testMultipleWorkersRejoining(RebalanceSourceConnectorsIntegrationTest.java:313)}}
> {quote}
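The balance condition that the failing assertion above enforces can be stated simply: no worker should hold more than one task above any other. A minimal, self-contained sketch of that check (not Connect's actual test code):

```java
import java.util.Arrays;

// Minimal sketch of the "balanced" condition behind the assertion above:
// an assignment is balanced when the most- and least-loaded workers differ
// by at most one task. This is an illustration, not Connect's implementation.
public class BalanceCheckSketch {
    static boolean isBalanced(int[] tasksPerWorker) {
        int min = Arrays.stream(tasksPerWorker).min().orElse(0);
        int max = Arrays.stream(tasksPerWorker).max().orElse(0);
        return max - min <= 1;
    }

    public static void main(String[] args) {
        // The failing run above had 8, 4 and 4 tasks across three workers.
        System.out.println(isBalanced(new int[]{8, 4, 4})); // prints false
        System.out.println(isBalanced(new int[]{6, 5, 5})); // prints true
    }
}
```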



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12283) Flaky Test RebalanceSourceConnectorsIntegrationTest#testMultipleWorkersRejoining

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12283:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> Flaky Test 
> RebalanceSourceConnectorsIntegrationTest#testMultipleWorkersRejoining
> 
>
> Key: KAFKA-12283
> URL: https://issues.apache.org/jira/browse/KAFKA-12283
> Project: Kafka
>  Issue Type: Test
>  Components: KafkaConnect, unit tests
>Reporter: Matthias J. Sax
>Assignee: Luke Chen
>Priority: Blocker
>  Labels: flaky-test
> Fix For: 3.1.0
>
>
> https://github.com/apache/kafka/pull/1/checks?check_run_id=1820092809
> {quote} {{java.lang.AssertionError: Tasks are imbalanced: 
> localhost:36037=[seq-source13-0, seq-source13-1, seq-source13-2, 
> seq-source13-3, seq-source12-0, seq-source12-1, seq-source12-2, 
> seq-source12-3]
> localhost:43563=[seq-source11-0, seq-source11-2, seq-source10-0, 
> seq-source10-2]
> localhost:46539=[seq-source11-1, seq-source11-3, seq-source10-1, 
> seq-source10-3]
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at 
> org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest.assertConnectorAndTasksAreUniqueAndBalanced(RebalanceSourceConnectorsIntegrationTest.java:362)
>   at 
> org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:303)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:351)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:319)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:300)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:290)
>   at 
> org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest.testMultipleWorkersRejoining(RebalanceSourceConnectorsIntegrationTest.java:313)}}
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-7632) Support Compression Level

2021-07-14 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380866#comment-17380866
 ] 

Konstantine Karantasis commented on KAFKA-7632:
---

This feature was not approved in time for 3.0. Pushing the target version to 3.1

> Support Compression Level
> -
>
> Key: KAFKA-7632
> URL: https://issues.apache.org/jira/browse/KAFKA-7632
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 2.1.0
> Environment: all
>Reporter: Dave Waters
>Assignee: Dongjin Lee
>Priority: Blocker
>  Labels: needs-kip
> Fix For: 3.0.0
>
>
> The compression level for ZSTD is currently set to use the default level (3), 
> which is a conservative setting that in some use cases eliminates the value 
> that ZSTD provides with improved compression. Each use case will vary, so 
> exposing the level as a broker configuration setting will allow the user to 
> adjust the level.
> Since it applies to the other compression codecs, we should add the same 
> functionalities to them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7632) Support Compression Level

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-7632:
--
Fix Version/s: (was: 3.0.0)
   3.1.0

> Support Compression Level
> -
>
> Key: KAFKA-7632
> URL: https://issues.apache.org/jira/browse/KAFKA-7632
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 2.1.0
> Environment: all
>Reporter: Dave Waters
>Assignee: Dongjin Lee
>Priority: Blocker
>  Labels: needs-kip
> Fix For: 3.1.0
>
>
> The compression level for ZSTD is currently set to use the default level (3), 
> which is a conservative setting that in some use cases eliminates the value 
> that ZSTD provides with improved compression. Each use case will vary, so 
> exposing the level as a broker configuration setting will allow the user to 
> adjust the level.
> Since it applies to the other compression codecs, we should add the same 
> functionalities to them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12308) ConfigDef.parseType deadlock

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-12308:
---
Fix Version/s: 3.0.0

> ConfigDef.parseType deadlock
> 
>
> Key: KAFKA-12308
> URL: https://issues.apache.org/jira/browse/KAFKA-12308
> Project: Kafka
>  Issue Type: Bug
>  Components: config, KafkaConnect
>Affects Versions: 2.5.0
> Environment: kafka 2.5.0
> centos7
> java version "1.8.0_231"
>Reporter: cosmozhu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: deadlock.log
>
>
> hi,
>  the problem was found when I restarted *ConnectDistributed*.
> I restarted ConnectDistributed on a single node for the test, without deleting 
> the connectors.
>  Sometimes the process stopped while creating connectors.
> I added some logging and found a deadlock in `ConfigDef.parseType`. My 
> connectors always have the same transforms. I guess something goes wrong when 
> connectors start up (in startAndStopExecutor, which defaults to 8 threads) and 
> load the same class file concurrently.
> I attached the jstack log file.
> Thanks for any help.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-12308) ConfigDef.parseType deadlock

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-12308.

Resolution: Fixed

> ConfigDef.parseType deadlock
> 
>
> Key: KAFKA-12308
> URL: https://issues.apache.org/jira/browse/KAFKA-12308
> Project: Kafka
>  Issue Type: Bug
>  Components: config, KafkaConnect
>Affects Versions: 2.5.0
> Environment: kafka 2.5.0
> centos7
> java version "1.8.0_231"
>Reporter: cosmozhu
>Priority: Major
> Attachments: deadlock.log
>
>
> hi,
>  the problem was found when I restarted *ConnectDistributed*.
> I restarted ConnectDistributed on a single node for the test, without deleting 
> the connectors.
>  Sometimes the process stopped while creating connectors.
> I added some logging and found a deadlock in `ConfigDef.parseType`. My 
> connectors always have the same transforms. I guess something goes wrong when 
> connectors start up (in startAndStopExecutor, which defaults to 8 threads) and 
> load the same class file concurrently.
> I attached the jstack log file.
> Thanks for any help.
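The hang described in this report and in KAFKA-7421 is a classic lock-ordering deadlock: two threads each hold one class-loader monitor and wait for the other. One standard remedy, shown here purely as an illustration (not Kafka Connect's actual fix), is to make every thread acquire the two monitors in the same global order:

```java
// Illustration of the lock-ordering hazard behind this deadlock report: two
// threads taking the same two monitors in opposite order can block forever.
// The classic remedy sketched here is a single global acquisition order
// (delegating loader first, then plugin loader). Illustration only; the lock
// names are hypothetical and this is not Kafka Connect's actual fix.
public class LockOrderSketch {
    static final Object delegatingLoaderLock = new Object(); // always taken first
    static final Object pluginLoaderLock = new Object();     // always taken second

    static void loadClassSafely(Runnable work) {
        synchronized (delegatingLoaderLock) {
            synchronized (pluginLoaderLock) {
                work.run();
            }
        }
    }

    // Runs two "connector startup" threads; with a consistent lock order both
    // complete and the counter reaches 2 instead of deadlocking.
    static int runBoth() {
        final int[] count = {0};
        Thread t1 = new Thread(() -> loadClassSafely(() -> count[0]++));
        Thread t2 = new Thread(() -> loadClassSafely(() -> count[0]++));
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(runBoth()); // prints 2
    }
}
```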



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-7421) Deadlock in Kafka Connect during class loading

2021-07-14 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380773#comment-17380773
 ] 

Konstantine Karantasis commented on KAFKA-7421:
---

The fix has now been merged. Let's keep track and report any new issues if they 
appear. Some context exists on 
https://issues.apache.org/jira/browse/KAFKA-12308 as well, which reported a 
similar issue. 

> Deadlock in Kafka Connect during class loading
> --
>
> Key: KAFKA-7421
> URL: https://issues.apache.org/jira/browse/KAFKA-7421
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0
>Reporter: Maciej Bryński
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.0.0
>
>
> I'm getting this deadlock on half of Kafka Connect runs when having two 
> different types connectors (in this configuration it's debezium and hdfs).
> Thread 1:
> {code}
> "pool-22-thread-2@4748" prio=5 tid=0x4d nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>waiting for pool-22-thread-1@4747 to release lock on <0x1423> (a 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader)
> at 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:91)
> at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.loadClass(DelegatingClassLoader.java:367)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Class.java:-1)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:715)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.enrich(ConnectorConfig.java:295)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:200)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:194)
> at 
> org.apache.kafka.connect.runtime.Worker.startConnector(Worker.java:233)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startConnector(DistributedHerder.java:916)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1300(DistributedHerder.java:111)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:932)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:928)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Thread 2:
> {code}
> "pool-22-thread-1@4747" prio=5 tid=0x4c nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>blocks pool-22-thread-2@4748
>waiting for pool-22-thread-2@4748 to release lock on <0x1421> (a 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:406)
> at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.loadClass(DelegatingClassLoader.java:358)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
> - locked <0x1424> (a java.lang.Object)
> at 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:104)
> - locked <0x1423> (a 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at 
> io.debezium.transforms.ByLogicalTableRouter.<clinit>(ByLogicalTableRouter.java:57)
> at java.lang.Class.forName0(Class.java:-1)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:715)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.enrich(ConnectorConfig.java:295)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:200)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:194)
> at 
> org.apache.kafka.connect.runtime.Worker.startConnector(Worker.java:233)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startConnector(DistributedHerder.java:916)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1300(DistributedHerder.java:111)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:932)
> at 
> 

[jira] [Resolved] (KAFKA-7421) Deadlock in Kafka Connect during class loading

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-7421.
---
Resolution: Fixed

> Deadlock in Kafka Connect during class loading
> --
>
> Key: KAFKA-7421
> URL: https://issues.apache.org/jira/browse/KAFKA-7421
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0
>Reporter: Maciej Bryński
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.0.0
>
>
> I'm getting this deadlock on half of Kafka Connect runs when having two 
> different types connectors (in this configuration it's debezium and hdfs).
> Thread 1:
> {code}
> "pool-22-thread-2@4748" prio=5 tid=0x4d nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>waiting for pool-22-thread-1@4747 to release lock on <0x1423> (a 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader)
> at 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:91)
> at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.loadClass(DelegatingClassLoader.java:367)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Class.java:-1)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:715)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.enrich(ConnectorConfig.java:295)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:200)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:194)
> at 
> org.apache.kafka.connect.runtime.Worker.startConnector(Worker.java:233)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startConnector(DistributedHerder.java:916)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1300(DistributedHerder.java:111)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:932)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:928)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Thread 2:
> {code}
> "pool-22-thread-1@4747" prio=5 tid=0x4c nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>blocks pool-22-thread-2@4748
>waiting for pool-22-thread-2@4748 to release lock on <0x1421> (a 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:406)
> at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.loadClass(DelegatingClassLoader.java:358)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
> - locked <0x1424> (a java.lang.Object)
> at 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:104)
> - locked <0x1423> (a 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at 
> io.debezium.transforms.ByLogicalTableRouter.<clinit>(ByLogicalTableRouter.java:57)
> at java.lang.Class.forName0(Class.java:-1)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:715)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.enrich(ConnectorConfig.java:295)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:200)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:194)
> at 
> org.apache.kafka.connect.runtime.Worker.startConnector(Worker.java:233)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startConnector(DistributedHerder.java:916)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1300(DistributedHerder.java:111)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:932)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:928)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}

[jira] [Updated] (KAFKA-7421) Deadlock in Kafka Connect during class loading

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-7421:
--
Fix Version/s: 3.0.0

> Deadlock in Kafka Connect during class loading
> --
>
> Key: KAFKA-7421
> URL: https://issues.apache.org/jira/browse/KAFKA-7421
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0
>Reporter: Maciej Bryński
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 3.0.0
>
>
> I'm getting this deadlock on about half of Kafka Connect runs when running two 
> different types of connectors (in this configuration it's debezium and hdfs).
> Thread 1:
> {code}
> "pool-22-thread-2@4748" prio=5 tid=0x4d nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>waiting for pool-22-thread-1@4747 to release lock on <0x1423> (a 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader)
> at 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:91)
> at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.loadClass(DelegatingClassLoader.java:367)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Class.java:-1)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:715)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.enrich(ConnectorConfig.java:295)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:200)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:194)
> at 
> org.apache.kafka.connect.runtime.Worker.startConnector(Worker.java:233)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startConnector(DistributedHerder.java:916)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1300(DistributedHerder.java:111)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:932)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:928)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Thread 2:
> {code}
> "pool-22-thread-1@4747" prio=5 tid=0x4c nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>blocks pool-22-thread-2@4748
>waiting for pool-22-thread-2@4748 to release lock on <0x1421> (a 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:406)
> at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.loadClass(DelegatingClassLoader.java:358)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
> - locked <0x1424> (a java.lang.Object)
> at 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:104)
> - locked <0x1423> (a 
> org.apache.kafka.connect.runtime.isolation.PluginClassLoader)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at 
> io.debezium.transforms.ByLogicalTableRouter.<clinit>(ByLogicalTableRouter.java:57)
> at java.lang.Class.forName0(Class.java:-1)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:715)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.enrich(ConnectorConfig.java:295)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:200)
> at 
> org.apache.kafka.connect.runtime.ConnectorConfig.<init>(ConnectorConfig.java:194)
> at 
> org.apache.kafka.connect.runtime.Worker.startConnector(Worker.java:233)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startConnector(DistributedHerder.java:916)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1300(DistributedHerder.java:111)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:932)
> at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:928)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}

[jira] [Commented] (KAFKA-12308) ConfigDef.parseType deadlock

2021-07-14 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380772#comment-17380772
 ] 

Konstantine Karantasis commented on KAFKA-12308:


Adding the comment that I added in the PR here as well: 

The idea that the {{DelegatingClassLoader}} did not have to be parallel capable 
originated from the fact that it doesn't load classes directly. It delegates 
loading either to the appropriate PluginClassLoader directly via composition, 
or to the parent by calling {{super.loadClass}}.

The latter is the key reason why we need to make the {{DelegatingClassLoader}} 
parallel capable as well, even though it doesn't load classes itself. Because 
inheritance is used (via a call to {{super.loadClass}}) and not composition 
(via a hypothetical call to {{parent.loadClass}}, which is not possible because 
{{parent}} is a private member of the base abstract class {{ClassLoader}}), 
when {{getClassLoadingLock}} is called in {{super.loadClass}} it checks whether 
the derived class (here an instance of {{DelegatingClassLoader}}) is parallel 
capable. Since it is not, it ends up not applying fine-grained locking during 
class loading, even though the parent classloader is the one actually loading 
the classes.

Based on the above, the {{DelegatingClassLoader}} needs to be parallel capable 
too in order for the parent loader to load classes in parallel.

I've tested both classloader types being parallel capable in a variety of 
scenarios with multiple connectors, SMTs and converters, and the deadlock did 
not reproduce. Of course, reproducing the issue is difficult without the 
specifics of the original jar layout. The possibility of a deadlock is still 
not zero, but it is probably no worse than with the current code. The only 
plugin that depends on other plugins being loaded while it is loading its own 
classes is the connector type plugin, and there are no inter-connector 
dependencies (a connector requiring another connector's classes to be loaded 
while loading its own). With that in mind, a deadlock should be even less 
likely now. In the future we could consider introducing deadlock recovery 
methods to get out of this type of situation if necessary.
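
The locking behavior described above can be made concrete with a small 
self-contained Java sketch (illustrative only, not Kafka's actual code): 
{{ClassLoader.getClassLoadingLock}} returns a per-class-name lock only when the 
concrete loader class has registered itself as parallel capable; otherwise it 
returns the loader instance itself, serializing all loads through one monitor.

```java
public class Main {

    // Parallel capable: fine-grained, per-class-name lock objects.
    static class ParallelLoader extends ClassLoader {
        static { registerAsParallelCapable(); }
        ParallelLoader(ClassLoader parent) { super(parent); }
        Object lockFor(String name) { return getClassLoadingLock(name); }
    }

    // Not parallel capable: one coarse lock -- the loader itself.
    static class SerialLoader extends ClassLoader {
        SerialLoader(ClassLoader parent) { super(parent); }
        Object lockFor(String name) { return getClassLoadingLock(name); }
    }

    public static void main(String[] args) {
        ClassLoader parent = Main.class.getClassLoader();
        ParallelLoader p = new ParallelLoader(parent);
        SerialLoader s = new SerialLoader(parent);

        // Distinct class names get distinct locks in the parallel-capable loader.
        System.out.println(p.lockFor("a.A") != p.lockFor("b.B"));
        // The non-parallel-capable loader always locks on itself.
        System.out.println(s.lockFor("a.A") == s);
    }
}
```

With both the delegating and the plugin loaders registered this way, two 
threads loading different classes no longer contend for the same monitor, 
which is the direction of the fix discussed above.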

> ConfigDef.parseType deadlock
> 
>
> Key: KAFKA-12308
> URL: https://issues.apache.org/jira/browse/KAFKA-12308
> Project: Kafka
>  Issue Type: Bug
>  Components: config, KafkaConnect
>Affects Versions: 2.5.0
> Environment: kafka 2.5.0
> centos7
> java version "1.8.0_231"
>Reporter: cosmozhu
>Priority: Major
> Attachments: deadlock.log
>
>
hi,
>  the problem was found when I restarted *ConnectDistributed*
> I restarted ConnectDistributed on a single node for the test, without 
> deleting connectors.
>  sometimes the process stopped while creating connectors.
> I added some logging and found a deadlock in `ConfigDef.parseType`. My 
> connectors always have the same transforms. I guess that when connectors 
> start up (in startAndStopExecutor, which defaults to 8 threads) and load the 
> same class file, something goes wrong.
> I attached the jstack log file.
> thanks for any help.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7421) Deadlock in Kafka Connect during class loading

2021-07-14 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-7421:
--
Summary: Deadlock in Kafka Connect during class loading  (was: Deadlock in 
Kafka Connect)

> Deadlock in Kafka Connect during class loading
> --
>
> Key: KAFKA-7421
> URL: https://issues.apache.org/jira/browse/KAFKA-7421
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0
>Reporter: Maciej Bryński
>Assignee: Konstantine Karantasis
>Priority: Major
>
> I'm getting this deadlock on about half of Kafka Connect runs when running two 
> different types of connectors (in this configuration it's debezium and hdfs).

[jira] [Updated] (KAFKA-10675) Error message from ConnectSchema.validateValue() should include the name of the schema.

2021-07-09 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-10675:
---
Fix Version/s: 2.7.2
   2.6.3

> Error message from ConnectSchema.validateValue() should include the name of 
> the schema.
> ---
>
> Key: KAFKA-10675
> URL: https://issues.apache.org/jira/browse/KAFKA-10675
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Alexander Iskuskov
>Priority: Minor
> Fix For: 3.0.0, 2.6.3, 2.7.2, 2.8.1
>
>
> The following error message
> {code:java}
> org.apache.kafka.connect.errors.DataException: Invalid Java object for schema 
> type INT64: class java.lang.Long for field: "moderate_time"
> {code}
> can be confusing because {{java.lang.Long}} is an acceptable type for schema 
> {{INT64}}. In fact, in this case {{org.apache.kafka.connect.data.Timestamp}} 
> is used, but this info is not logged.
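
The confusion is easy to reproduce in miniature. The following self-contained 
sketch (hypothetical code, not Kafka's actual {{ConnectSchema}}) models a 
logical type as a named INT64 schema with a narrower expected Java class, and 
contrasts the old error text with one that includes the schema name:

```java
import java.util.Date;

public class Main {
    // A "logical type" pairs a physical schema type with a name and a
    // narrower expected Java class (here: Timestamp = named INT64 + Date).
    record Schema(String type, String name, Class<?> expectedJavaType) {}

    // Old-style message: only the physical type is reported, so
    // "INT64 rejects Long" reads like a contradiction.
    static String oldError(Schema s, Object v) {
        return "Invalid Java object for schema type " + s.type() + ": " + v.getClass();
    }

    // Message including the schema name: now it is clear that a logical type
    // (expecting java.util.Date) rejected the raw Long.
    static String newError(Schema s, Object v) {
        return "Invalid Java object for schema with type " + s.type()
                + " and name " + s.name() + ": " + v.getClass();
    }

    public static void main(String[] args) {
        Schema timestamp = new Schema("INT64",
                "org.apache.kafka.connect.data.Timestamp", Date.class);
        Object raw = 1699999999999L; // a plain Long is NOT valid for the logical type
        if (!timestamp.expectedJavaType().isInstance(raw)) {
            System.out.println(oldError(timestamp, raw));
            System.out.println(newError(timestamp, raw));
        }
    }
}
```

The second message makes it obvious that the value was rejected by the 
logical type's expected Java class, not by INT64 itself.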





[jira] [Updated] (KAFKA-9705) Zookeeper mutation protocols should be redirected to Controller only

2021-07-09 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-9705:
--
Fix Version/s: (was: 3.0.0)

> Zookeeper mutation protocols should be redirected to Controller only
> 
>
> Key: KAFKA-9705
> URL: https://issues.apache.org/jira/browse/KAFKA-9705
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> In the bridge release, we need to restrict the direct access of ZK to 
> controller only. This means the existing AlterConfig path should be migrated.




