[GitHub] [kafka] omkreddy commented on pull request #12148: MINOR: Remove unnecessary log4j-appender dependency and tweak explicit log4j dependency

2022-09-19 Thread GitBox


omkreddy commented on PR #12148:
URL: https://github.com/apache/kafka/pull/12148#issuecomment-1251901651

   @ijuma Sorry, I missed this. Can you rebase the PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] omkreddy commented on pull request #12039: KAFKA-13725: KIP-768 OAuth code mixes public and internal classes in same package

2022-09-19 Thread GitBox


omkreddy commented on PR #12039:
URL: https://github.com/apache/kafka/pull/12039#issuecomment-1251902166

   @kirktrue Can you fix the checkstyle errors?





[GitHub] [kafka] akhileshchg commented on pull request #12628: KAFKA-14214: Introduce read-write lock to StandardAuthorizer for consistent ACL reads.

2022-09-19 Thread GitBox


akhileshchg commented on PR #12628:
URL: https://github.com/apache/kafka/pull/12628#issuecomment-1251845966

   > Reopened the PR. Moved the ReadWrite lock from StandardAuthorizerData 
scope to StandardAuthorizer. I'm okay with most methods after moving the lock, 
but loadSnapshot can be pretty heavy with this implementation. Let me know your 
thoughts @mumrah @cmccabe @hachikuji
   
   Reduced the critical section for `loadSnapshot`. The change is ready for 
review. @mumrah @cmccabe @hachikuji 





[GitHub] [kafka] showuon commented on pull request #12628: KAFKA-14214: Introduce read-write lock to StandardAuthorizer for consistent ACL reads.

2022-09-19 Thread GitBox


showuon commented on PR #12628:
URL: https://github.com/apache/kafka/pull/12628#issuecomment-1251787224

   > Reopened the PR. Moved the ReadWrite lock from StandardAuthorizerData 
scope to StandardAuthorizer. I'm okay with most methods after moving the lock, 
but loadSnapshot can be pretty heavy with this implementation. Let me know your 
thoughts @mumrah @cmccabe @hachikuji
   
   The `loadSnapshot` cost is indeed a concern. I'm wondering if we can remove 
the lock for the `loadSnapshot` method. After all, the authorizer won't be up 
and ready to use before `loadSnapshot` completes, right?
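The locking trade-off being discussed can be illustrated with a minimal, hypothetical Java sketch; the class and method names below are illustrative only, not Kafka's actual `StandardAuthorizer` code. Many concurrent `authorize` calls take the read lock, while the rare writer (an ACL change or a snapshot load) takes the write lock; building the new state outside the lock keeps the `loadSnapshot` critical section short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the read-write-lock pattern under discussion:
// many parallel readers, an occasional writer.
class LockedAclStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Boolean> acls = new HashMap<>();

    boolean authorize(String resource) {
        lock.readLock().lock();          // readers proceed in parallel
        try {
            return acls.getOrDefault(resource, false);
        } finally {
            lock.readLock().unlock();
        }
    }

    void loadSnapshot(Map<String, Boolean> snapshot) {
        // Copy the snapshot outside the critical section, then swap
        // the contents while holding the write lock.
        Map<String, Boolean> next = new HashMap<>(snapshot);
        lock.writeLock().lock();
        try {
            acls.clear();
            acls.putAll(next);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The question raised above is whether the write lock is needed in `loadSnapshot` at all if the authorizer is not yet serving requests when the snapshot loads.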





[GitHub] [kafka] akhileshchg commented on pull request #12628: KAFKA-14214: Introduce read-write lock to StandardAuthorizer for consistent ACL reads.

2022-09-19 Thread GitBox


akhileshchg commented on PR #12628:
URL: https://github.com/apache/kafka/pull/12628#issuecomment-1251763252

   I reverted the benchmark changes. 





[jira] [Comment Edited] (KAFKA-5564) Fail to create topics with error 'While recording the replica LEO, the partition [topic2,0] hasn't been created'

2022-09-19 Thread Ahmed Toumi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606803#comment-17606803
 ] 

Ahmed Toumi edited comment on KAFKA-5564 at 9/20/22 12:28 AM:
--

Same issue with Kafka 2.3.4, when executing the reassign script for more than 
700 partitions.

Infinite logs like this:

[2022-09-20 02:23:57,265] WARN [ReplicaManager broker=5] While recording the 
replica LEO, the partition topic-name-3 hasn't been created. 
(kafka.server.ReplicaManager)


was (Author: 2me):
Same issue with kafka 2.3.4

infinit log like this

[2022-09-20 02:23:57,265] WARN [ReplicaManager broker=5] While recording the 
replica LEO, the partition topic-name-3 hasn't been created. 
(kafka.server.ReplicaManager)

> Fail to create topics with error 'While recording the replica LEO, the 
> partition [topic2,0] hasn't been created'
> 
>
> Key: KAFKA-5564
> URL: https://issues.apache.org/jira/browse/KAFKA-5564
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Klearchos Chaloulos
>Priority: Major
>
> Hello,
> *Short version*
> we have seen sporadic occurrences of the following issue: Topics whose leader 
> is a specific broker fail to be created properly, and it is impossible to 
> produce to them or consume from them.
>  The following log appears in the broker that is the leader of the faulty 
> topics:
> {noformat}
> [2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording 
> the replica LEO, the partition [topic2,0] hasn't been created. 
> (kafka.server.ReplicaManager)
> {noformat}
> *Detailed version*:
> Our setup consists of three brokers with ids 1, 2, 3. Broker 2 is the 
> controller. We create 7 topics called topic1, topic2, topic3, topic4, topic5, 
> topic6, topic7.
> Sometimes (sporadically) some of the topics are faulty. In the particular 
> example I describe here, the faulty topics are topic6, topic4, 
> topic2, topic3. The faulty topics all have the same leader, broker 3.
> If we do a kafka-topics.sh --describe on the topics we see that for topics 
> that do not have broker 3 as leader, the in sync replicas report that broker 
> 3 is not synced:
> {noformat}
>  bin/kafka-topics.sh --describe --zookeeper zookeeper:2181/kafka
> Topic:topic6  PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: topic6   Partition: 0Leader: 3   Replicas: 3,1,2 Isr: 
> 3,1,2
> Topic:topic5  PartitionCount:1ReplicationFactor:3 
> Configs:retention.ms=30
>   Topic: topic5   Partition: 0Leader: 2   Replicas: 2,3,1 Isr: 2,1
> Topic:topic7  PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: topic7   Partition: 0Leader: 1   Replicas: 1,3,2 Isr: 1,2
> Topic:topic4  PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: topic4   Partition: 0Leader: 3   Replicas: 3,1,2 Isr: 
> 3,1,2
> Topic:topic1  PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: topic1   Partition: 0Leader: 2   Replicas: 2,1,3 Isr: 2,1
> Topic:topic2  PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: topic2   Partition: 0Leader: 3   Replicas: 3,1,2 Isr: 
> 3,1,2
> Topic:topic3  PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: topic3   Partition: 0Leader: 3   Replicas: 3,1,2 Isr: 
> 3,1,2
> {noformat}
> While for the faulty topics it is reported that all replicas are in sync.
> Also, the topic directories under the log.dir folder were not created in the 
> faulty broker 3.
> We see the following logs in broker 3, which is the leader of the faulty 
> topics:
> {noformat}
> [2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording 
> the replica LEO, the partition [topic2,0] hasn't been created. 
> (kafka.server.ReplicaManager)
> {noformat}
> The above log is logged continuously.
> and the following error logs in the other 2 brokers, the replicas:
> {noformat}
> ERROR [ReplicaFetcherThread-0-3], Error for partition [topic3,0] to broker 
> 3:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This 
> server does not host this topic-partition
> {noformat}
> Again the above log is logged continuously.
> The issue described above occurs immediately after the deployment of the 
> kafka cluster.
> A restart of the faulty broker (3 in this case) fixes the problem and the 
> faulty topics work normally.
> I have also attached the broker configuration we use.
> Do you have any idea what might cause this issue?
> Best regards,
> Klearchos



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-5564) Fail to create topics with error 'While recording the replica LEO, the partition [topic2,0] hasn't been created'

2022-09-19 Thread Ahmed Toumi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606803#comment-17606803
 ] 

Ahmed Toumi commented on KAFKA-5564:


Same issue with Kafka 2.3.4.

Infinite logs like this:

[2022-09-20 02:23:57,265] WARN [ReplicaManager broker=5] While recording the 
replica LEO, the partition topic-name-3 hasn't been created. 
(kafka.server.ReplicaManager)



[GitHub] [kafka] cmccabe commented on pull request #12628: KAFKA-14214: Introduce read-write lock to StandardAuthorizer for consistent ACL reads.

2022-09-19 Thread GitBox


cmccabe commented on PR #12628:
URL: https://github.com/apache/kafka/pull/12628#issuecomment-1251690504

   Thanks, @akhileshchg . Can you remove the AclAuthorizerBenchmark change from 
here? I can post a separate PR for this (which we don't need in 3.3...)





[GitHub] [kafka] cmccabe opened a new pull request, #12664: KAFKA-14243: Temporarily disable unsafe downgrade

2022-09-19 Thread GitBox


cmccabe opened a new pull request, #12664:
URL: https://github.com/apache/kafka/pull/12664

   Temporarily disable unsafe downgrade until we can implement reloading 
snapshots on
   unsafe downgrade.





[jira] [Updated] (KAFKA-14243) Temporarily disable unsafe downgrade

2022-09-19 Thread Colin McCabe (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin McCabe updated KAFKA-14243:
-
Description: Temporarily disable unsafe downgrade since we haven't 
implemented reloading snapshots on unsafe downgrade  (was: Disable unsafe 
downgrade in 3.3)

> Temporarily disable unsafe downgrade
> 
>
> Key: KAFKA-14243
> URL: https://issues.apache.org/jira/browse/KAFKA-14243
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Blocker
>
> Temporarily disable unsafe downgrade since we haven't implemented reloading 
> snapshots on unsafe downgrade





[jira] [Created] (KAFKA-14243) Disable unsafe downgrade in 3.3

2022-09-19 Thread Colin McCabe (Jira)
Colin McCabe created KAFKA-14243:


 Summary: Disable unsafe downgrade in 3.3
 Key: KAFKA-14243
 URL: https://issues.apache.org/jira/browse/KAFKA-14243
 Project: Kafka
  Issue Type: Bug
Reporter: Colin McCabe
Assignee: Colin McCabe


Disable unsafe downgrade in 3.3





[jira] [Updated] (KAFKA-14243) Temporarily disable unsafe downgrade

2022-09-19 Thread Colin McCabe (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin McCabe updated KAFKA-14243:
-
Summary: Temporarily disable unsafe downgrade  (was: Disable unsafe 
downgrade in 3.3)

> Temporarily disable unsafe downgrade
> 
>
> Key: KAFKA-14243
> URL: https://issues.apache.org/jira/browse/KAFKA-14243
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Blocker
>
> Disable unsafe downgrade in 3.3





[GitHub] [kafka] philipnee commented on pull request #12663: [Consumer Refactor] Define event handler interface and events

2022-09-19 Thread GitBox


philipnee commented on PR #12663:
URL: https://github.com/apache/kafka/pull/12663#issuecomment-1251684468

   cc @guozhangwang @hachikuji @dajac for review.





[GitHub] [kafka] akhileshchg commented on pull request #12628: KAFKA-14214: Introduce read-write lock to StandardAuthorizer for consistent ACL reads.

2022-09-19 Thread GitBox


akhileshchg commented on PR #12628:
URL: https://github.com/apache/kafka/pull/12628#issuecomment-1251672840

   Reopened the PR. Moved the ReadWrite lock from StandardAuthorizerData scope 
to StandardAuthorizer. I'm fine with most of the methods after moving the lock, 
but it seems `loadSnapshot` can be quite heavy with this implementation. Let me 
know your thoughts @mumrah @cmccabe @hachikuji 





[GitHub] [kafka] ryan-burningham commented on pull request #11748: KAFKA-12635: Don't emit checkpoints for partitions without any offset…

2022-09-19 Thread GitBox


ryan-burningham commented on PR #11748:
URL: https://github.com/apache/kafka/pull/11748#issuecomment-1251628710

   Help me understand where we are at with this issue. @mimaison it appears 
you deleted the branch, but is this fixed in another branch or an unreleased 
version of MirrorMaker? I'm sorry if this is a misunderstanding on my part; I'm 
new to community-supported software.





[GitHub] [kafka] philipnee closed pull request #12633: [Consumer Refactor] Background thread skeleton

2022-09-19 Thread GitBox


philipnee closed pull request #12633: [Consumer Refactor] Background thread 
skeleton
URL: https://github.com/apache/kafka/pull/12633





[GitHub] [kafka] philipnee opened a new pull request, #12663: [Consumer Refactor] Define event handler interface and events

2022-09-19 Thread GitBox


philipnee opened a new pull request, #12663:
URL: https://github.com/apache/kafka/pull/12663

   
https://cwiki.apache.org/confluence/display/KAFKA/Proposal%3A+Consumer+Threading+Model+Refactor
   
   * We want to abstract away the queue from the new KafkaConsumer using an 
EventHandler:
     - poll: retrieve a response from the event handler
     - add: add a request to the handler
   
   The event handling should live inside the concrete implementation.
   
   * Unit tests
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
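The poll/add contract described above can be sketched as a minimal interface. This is a hypothetical illustration of the design intent, not the PR's actual code; all names and the synchronous echo behavior are assumptions.

```java
import java.util.Optional;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: the consumer submits requests and polls for
// responses without ever seeing the underlying queues.
interface EventHandler<Req, Resp> {
    Optional<Resp> poll();   // retrieve a response, if one is ready
    void add(Req request);   // hand a request to the handler
}

class QueueBackedHandler implements EventHandler<String, String> {
    private final Queue<String> responses = new ConcurrentLinkedQueue<>();

    @Override
    public Optional<String> poll() {
        return Optional.ofNullable(responses.poll());
    }

    @Override
    public void add(String request) {
        // In a real implementation a background thread would consume
        // requests asynchronously; here we respond synchronously purely
        // for illustration.
        responses.add("handled:" + request);
    }
}
```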





[GitHub] [kafka] guozhangwang commented on a diff in pull request #12659: [DO NOT MERGE] KAFKA-10199: Integrate Topology Pause/Resume with StateUpdater

2022-09-19 Thread GitBox


guozhangwang commented on code in PR #12659:
URL: https://github.com/apache/kafka/pull/12659#discussion_r974686659


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -785,6 +787,22 @@ private void addTasksToStateUpdater() {
 }
 }
 
+private void pauseTasksInStateUpdater() {
+for (final Task task : stateUpdater.getUpdatingTasks()) {

Review Comment:
   An alternative idea is to add a `pauseTasks(topologyMetadata)` / 
`resumeTasks(topologyMetadata)` to replace the `pauseTask/resumeTask` by task 
ids, but that would be a bit intrusive, as it introduces topologyMetadata to 
the state updater.






[GitHub] [kafka] guozhangwang commented on a diff in pull request #12659: [DO NOT MERGE] KAFKA-10199: Integrate Topology Pause/Resume with StateUpdater

2022-09-19 Thread GitBox


guozhangwang commented on code in PR #12659:
URL: https://github.com/apache/kafka/pull/12659#discussion_r974503648


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -785,6 +787,22 @@ private void addTasksToStateUpdater() {
 }
 }
 
+private void pauseTasksInStateUpdater() {
+for (final Task task : stateUpdater.getUpdatingTasks()) {

Review Comment:
   I'm a bit concerned about perf here, since this is called on each iteration, 
and to be safe we'd need to wrap each task as a read-only task as we did for 
`getTasks`, generating a lot of young-gen garbage for GC; but other than the 
current approach I haven't found a better solution.






[GitHub] [kafka] cmccabe commented on pull request #12636: KAFKA-14214: Convert StandardAuthorizer to copy-on-write

2022-09-19 Thread GitBox


cmccabe commented on PR #12636:
URL: https://github.com/apache/kafka/pull/12636#issuecomment-1251532201

   I uploaded a new version at https://github.com/apache/kafka/pull/12662
   





[GitHub] [kafka] cmccabe opened a new pull request, #12662: KAFKA-14214: Convert StandardAuthorizer to copy-on-write

2022-09-19 Thread GitBox


cmccabe opened a new pull request, #12662:
URL: https://github.com/apache/kafka/pull/12662

   Convert StandardAuthorizer to use copy-on-write data structures.  The issue 
with the concurrent skiplist was that because it was modified while in use by 
StandardAuthorizer#authorize, we could sometimes expose an inconsistent state. 
For example, if we added a "deny principal foo", followed by "allow all", a 
request for principal foo might happen to see the second one, without seeing 
the first one, even though the first one was added first.
   
   In order to efficiently implement prefix ACLs, store them in a prefix tree. 
This ensures that we can check all prefix ACLs for a path in logarithmic time. 
Also implement Authorizer#authorizeByResourceType. The default implementation 
of this function is quite slow, so it is good to have an implementation in 
StandardAuthorizer.
   
   Finally, this PR renames AclAuthorizerBenchmark to AuthorizerBenchmark and 
extends it to report information about StandardAuthorizer as well as 
AclAuthorizer.
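The consistency problem the PR describes, where "deny principal foo" could be missed while "allow all" is seen, is avoided by having each authorization read a single immutable snapshot. Below is a minimal, hypothetical Java sketch of the copy-on-write idea; it is not Kafka's StandardAuthorizer code, and the deny-set model is a deliberate simplification of real ACL semantics.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical copy-on-write sketch: authorize() captures one immutable
// snapshot, so within a single check it can never observe a partially
// applied sequence of ACL changes.
class CopyOnWriteAcls {
    private volatile Set<String> denied = Set.of();  // immutable snapshot

    boolean authorize(String principal) {
        Set<String> current = denied;   // one consistent view per call
        return !current.contains(principal);
    }

    synchronized void deny(String principal) {
        Set<String> next = new HashSet<>(denied);  // copy the old state...
        next.add(principal);
        denied = Set.copyOf(next);                 // ...then publish atomically
    }
}
```

Writes pay the copy cost, but reads are lock-free and always see a state that existed at some single point in time.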





[jira] [Updated] (KAFKA-14174) Operation documentation for KRaft

2022-09-19 Thread Jose Armando Garcia Sancio (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Armando Garcia Sancio updated KAFKA-14174:
---
Fix Version/s: (was: 3.3.0)

> Operation documentation for KRaft
> -
>
> Key: KAFKA-14174
> URL: https://issues.apache.org/jira/browse/KAFKA-14174
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Blocker
>  Labels: documentation, kraft
>
> KRaft documentation for 3.3
>  # Disk recovery
>  # External controller is the recommended configuration. The majority of 
> integration tests don't run against co-located mode.
>  # Talk about KRaft operation





[jira] [Updated] (KAFKA-14207) Add a 6.10 section for KRaft

2022-09-19 Thread Jose Armando Garcia Sancio (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Armando Garcia Sancio updated KAFKA-14207:
---
Fix Version/s: (was: 3.3.0)

> Add a 6.10 section for KRaft
> 
>
> Key: KAFKA-14207
> URL: https://issues.apache.org/jira/browse/KAFKA-14207
> Project: Kafka
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
>  Labels: documentation, kraft
>
> The section should talk about:
>  # Limitation
>  # Recommended deployment: external controller
>  # How to start a KRaft cluster.





[jira] [Updated] (KAFKA-14207) Add a 6.10 section for KRaft

2022-09-19 Thread Jose Armando Garcia Sancio (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Armando Garcia Sancio updated KAFKA-14207:
---
Affects Version/s: 3.3.0

> Add a 6.10 section for KRaft
> 
>
> Key: KAFKA-14207
> URL: https://issues.apache.org/jira/browse/KAFKA-14207
> Project: Kafka
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
>  Labels: documentation, kraft
> Fix For: 3.3.0
>
>
> The section should talk about:
>  # Limitation
>  # Recommended deployment: external controller
>  # How to start a KRaft cluster.





[jira] [Updated] (KAFKA-14207) Add a 6.10 section for KRaft

2022-09-19 Thread Jose Armando Garcia Sancio (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Armando Garcia Sancio updated KAFKA-14207:
---
Labels: documentation kraft  (was: )

> Add a 6.10 section for KRaft
> 
>
> Key: KAFKA-14207
> URL: https://issues.apache.org/jira/browse/KAFKA-14207
> Project: Kafka
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
>  Labels: documentation, kraft
> Fix For: 3.3.0
>
>
> The section should talk about:
>  # Limitation
>  # Recommended deployment: external controller
>  # How to start a KRaft cluster.





[jira] [Resolved] (KAFKA-14240) Ensure kraft metadata log dir is initialized with valid snapshot state

2022-09-19 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-14240.
-
Resolution: Fixed

> Ensure kraft metadata log dir is initialized with valid snapshot state
> --
>
> Key: KAFKA-14240
> URL: https://issues.apache.org/jira/browse/KAFKA-14240
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 3.3.0
>
>
> If the first segment under __cluster_metadata has a base offset greater than 
> 0, then there must exist at least one snapshot which has a larger offset than 
> whatever the first segment starts at. We should check for this at startup to 
> prevent the controller from initialization with invalid state.
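The invariant the issue describes can be sketched as a small startup check. This is a hypothetical illustration of the rule, not the actual KRaft code: if the first segment's base offset is greater than 0, at least one snapshot must reach at least that offset, otherwise the log has a gap.

```java
import java.util.List;

// Hypothetical sketch of the startup invariant described in KAFKA-14240:
// a truncated log prefix must be covered by some snapshot.
final class SnapshotStateCheck {
    static void validate(long firstSegmentBaseOffset, List<Long> snapshotEndOffsets) {
        if (firstSegmentBaseOffset > 0
                && snapshotEndOffsets.stream()
                       .noneMatch(end -> end >= firstSegmentBaseOffset)) {
            throw new IllegalStateException(
                "No snapshot covers offsets below " + firstSegmentBaseOffset);
        }
    }
}
```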





[GitHub] [kafka] hachikuji merged pull request #12653: KAFKA-14240; Validate kraft snapshot state on startup

2022-09-19 Thread GitBox


hachikuji merged PR #12653:
URL: https://github.com/apache/kafka/pull/12653





[jira] [Commented] (KAFKA-10635) Streams application fails with OutOfOrderSequenceException after rolling restarts of brokers

2022-09-19 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606711#comment-17606711
 ] 

Guozhang Wang commented on KAFKA-10635:
---

From the logs, I think the OOOSException was thrown here: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/ProducerStateManager.scala#L236, 
because `currentLastSeq` is -1 (i.e. UNKNOWN). It usually indicates that, due 
to a log truncation (which did happen before the exception was thrown), the 
producer's state has all been deleted; in that case currentEntry.producerEpoch 
== RecordBatch.NO_PRODUCER_EPOCH should be satisfied, however it is not. Here 
is my suspected route that led to this:

 

T1: The partition starts with replicas 3,4,5, with 5 as the leader; producers 
are still writing to 5.

T2: Assume there's a producer with id 61009 that starts writing to leader 5; 
the first append is at an offset larger than offset 853773. Note that at this 
time that append has not been fully replicated across, and hence the high 
watermark has not been advanced.

T3: Replica 10 is added to the replica list and old leader 5 is removed. 
Replica 10 truncates itself to 853773, and then rebuilds its producer state up 
to offset 853773 as well (you can see that from the log). Note that since 
producer 61009's append record is beyond 853773, it's not yet contained in the 
persistent producer snapshot and hence not loaded into the new leader 10's 
in-memory producer state.

T4: A truncation happened: it seems to be deleting an empty log segment, 
though, since the log segment is (baseOffset=0, size=0), but that should not 
have any impact on the producer state, since deleting files does not 
immediately affect the in-memory producer entries.

T5: Producer 61009 finally learns about the new leader, and starts sending to
it. Its append start offset is 853836 (larger than 853773), the producer
entry's metadata queue is empty, HOWEVER its epoch is somehow not -1 (UNKNOWN);
i.e. in the old snapshot it does not have any metadata but it has an existing
epoch, and hence this exception is triggered. Unfortunately, we do not have
enough log info (I can file a quick PR to enhance it in future releases), so I
cannot be certain why that snapshot contains no metadata but a non -1 epoch...
would like to hear some experts' opinion, [~hachikuji]
[~junrao] .
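For readers following along, the branch described above can be sketched in Java. This is a simplified, hypothetical rendition of the sequence check (the real logic lives in the Scala `ProducerStateManager` and is more involved); all names here are illustrative:

```java
public class SequenceCheckSketch {
    static final int NO_PRODUCER_EPOCH = -1; // producer state fully lost
    static final int UNKNOWN_SEQ = -1;       // empty metadata queue

    // Simplified acceptance rule for an incoming append, mirroring the
    // suspected failure: an entry whose metadata queue is empty
    // (lastSeq == UNKNOWN_SEQ) but whose epoch survived in a snapshot
    // rejects any append that does not start at sequence 0.
    static boolean acceptAppend(int currentEpoch, int currentLastSeq,
                                int appendEpoch, int appendFirstSeq) {
        if (currentEpoch == NO_PRODUCER_EPOCH)
            return true;                // no prior state at all: accept anything
        if (appendEpoch > currentEpoch)
            return appendFirstSeq == 0; // epoch bump must restart sequences
        if (currentLastSeq == UNKNOWN_SEQ)
            return appendFirstSeq == 0; // sequence state lost but epoch kept
        return appendFirstSeq == currentLastSeq + 1;
    }

    public static void main(String[] args) {
        // The T5 scenario: epoch kept (0), metadata queue empty, append at seq 5
        if (acceptAppend(0, UNKNOWN_SEQ, 0, 5))
            throw new AssertionError("T5 append should be rejected (OOOSE)");
        // Had the epoch also been lost, the append would have been accepted
        if (!acceptAppend(NO_PRODUCER_EPOCH, UNKNOWN_SEQ, 0, 5))
            throw new AssertionError("append after full state loss should pass");
    }
}
```

In this simplified model the exception fires exactly because the snapshot kept an epoch while losing the sequence metadata, matching the T5 description.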

> Streams application fails with OutOfOrderSequenceException after rolling 
> restarts of brokers
> 
>
> Key: KAFKA-10635
> URL: https://issues.apache.org/jira/browse/KAFKA-10635
> Project: Kafka
>  Issue Type: Bug
>  Components: core, producer 
>Affects Versions: 2.5.1
>Reporter: Peeraya Maetasatidsuk
>Priority: Blocker
> Attachments: logs.csv
>
>
> We are upgrading our brokers to version 2.5.1 (from 2.3.1) by performing a 
> rolling restart of the brokers after installing the new version. After the 
> restarts we notice one of our streams app (client version 2.4.1) fails with 
> OutOfOrderSequenceException:
>  
> {code:java}
> ERROR [2020-10-13 22:52:21,400] [com.aaa.bbb.ExceptionHandler] Unexpected 
> error. Record: a_record, destination topic: 
> topic-name-Aggregation-repartition 
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
> ERROR [2020-10-13 22:52:21,413] 
> [org.apache.kafka.streams.processor.internals.AssignedTasks] stream-thread 
> [topic-name-StreamThread-1] Failed to commit stream task 1_39 due to the 
> following error: org.apache.kafka.streams.errors.StreamsException: task 
> [1_39] Abort sending since an error caught with a previous record (timestamp 
> 1602654659000) to topic topic-name-Aggregation-repartition due to 
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:144)
> at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
> at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:204)
> at 
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1348)
> at 
> org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:230)
> at 
> org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:196)
> at 
> org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:730) 
>    at 
> org.apache.kafka.clients.producer.internals.Sender.failBat

[GitHub] [kafka] vamossagar12 commented on a diff in pull request #12561: KAFKA-12495: Exponential backoff retry to prevent rebalance storms when worker joins after revoking rebalance

2022-09-19 Thread GitBox


vamossagar12 commented on code in PR #12561:
URL: https://github.com/apache/kafka/pull/12561#discussion_r974548386


##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/distributed/WorkerCoordinatorIncrementalTest.java:
##
@@ -517,13 +517,13 @@ public void testTaskAssignmentWhenWorkerBounces() {
 leaderAssignment = deserializeAssignment(result, leaderId);
 assertAssignment(leaderId, offset,
 Collections.emptyList(), 0,
-Collections.emptyList(), 0,
+Collections.emptyList(), 1,

Review Comment:
   Yeah, that's true. Revoking tasks at this point leads to imbalance. This was
happening since, as per the new changes, the moment the delay expires we were
allowing revocation. So, at this point, the flag is true, the delay goes to 0,
and since revocation is technically possible at this point, the code was doing
it. I have added the `canRevoke` flag back to handle this case; with it, this
test case and another test,
`IncrementalCooperativeAssignorTest#testTaskAssignmentWhenWorkerBounces`, which
seemed to have a similar issue, appear to be fixed.






[GitHub] [kafka] jordanbull commented on a diff in pull request #12566: KAFKA-13927: Only clear origOffsets when clearing messageBatch

2022-09-19 Thread GitBox


jordanbull commented on code in PR #12566:
URL: https://github.com/apache/kafka/pull/12566#discussion_r974545797


##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/WorkerSinkTaskTest.java:
##
@@ -435,6 +453,13 @@ public void testPollRedelivery() throws Exception {
 assertTaskMetricValue("batch-size-max", 1.0);
 assertTaskMetricValue("batch-size-avg", 0.5);
 
+assertEquals(workerCurrentOffsets, Whitebox.<Map<TopicPartition, OffsetAndMetadata>>getInternalState(workerTask, "currentOffsets"));

Review Comment:
   I've also added an assertion on the offsets given to precommit above. Happy 
to remove this line






[GitHub] [kafka] jordanbull commented on a diff in pull request #12566: KAFKA-13927: Only clear origOffsets when clearing messageBatch

2022-09-19 Thread GitBox


jordanbull commented on code in PR #12566:
URL: https://github.com/apache/kafka/pull/12566#discussion_r974544940


##
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java:
##
@@ -763,6 +763,9 @@ private void onPartitionsRemoved(Collection<TopicPartition>
partitions, boolean
 return;
 
 try {
+for (TopicPartition partition: partitions) {
+origOffsets.remove(partition);
+}

Review Comment:
   Makes sense






[GitHub] [kafka] guozhangwang commented on a diff in pull request #12659: [DO NOT MERGE] KAFKA-10199: Integrate Topology Pause/Resume with StateUpdater

2022-09-19 Thread GitBox


guozhangwang commented on code in PR #12659:
URL: https://github.com/apache/kafka/pull/12659#discussion_r974503648


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -785,6 +787,22 @@ private void addTasksToStateUpdater() {
 }
 }
 
+private void pauseTasksInStateUpdater() {
+for (final Task task : stateUpdater.getUpdatingTasks()) {

Review Comment:
   I'm a bit concerned about perf here, since this is called on each iteration
and it will construct a list of read-only tasks, generating a lot of young-gen
garbage for GCs; but other than the current approach I have not come up with a
better solution.






[GitHub] [kafka] cadonna merged pull request #12650: KAFKA-10199: Adapt restoration integration tests to state updater

2022-09-19 Thread GitBox


cadonna merged PR #12650:
URL: https://github.com/apache/kafka/pull/12650





[GitHub] [kafka] cadonna commented on pull request #12650: KAFKA-10199: Adapt restoration integration tests to state updater

2022-09-19 Thread GitBox


cadonna commented on PR #12650:
URL: https://github.com/apache/kafka/pull/12650#issuecomment-1251322148

   Build failures are unrelated:
   ```
   Build / JDK 11 and Scala 2.13 / 
org.apache.kafka.connect.runtime.rest.RestServerTest.testDisableAdminEndpoint
   ```





[GitHub] [kafka] guozhangwang commented on a diff in pull request #12650: KAFKA-10199: Adapt restoration integration tests to state updater

2022-09-19 Thread GitBox


guozhangwang commented on code in PR #12650:
URL: https://github.com/apache/kafka/pull/12650#discussion_r974499005


##
streams/src/test/java/org/apache/kafka/streams/integration/RestoreIntegrationTest.java:
##
@@ -291,8 +297,9 @@ public void shouldSuccessfullyStartWhenLoggingDisabled() 
throws InterruptedExcep
 assertTrue(startupLatch.await(30, TimeUnit.SECONDS));
 }
 
-@Test
-public void shouldProcessDataFromStoresWithLoggingDisabled() throws 
InterruptedException {
+@ParameterizedTest

Review Comment:
   Ack, thanks.






[GitHub] [kafka] nizhikov commented on pull request #12632: KAFKA-12878 Support --bootstrap-server in kafka-streams-application-reset tool

2022-09-19 Thread GitBox


nizhikov commented on PR #12632:
URL: https://github.com/apache/kafka/pull/12632#issuecomment-1251318220

   @C0urante Thank you very much)





[jira] [Resolved] (KAFKA-12878) Support --bootstrap-server kafka-streams-application-reset

2022-09-19 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-12878.
---
Fix Version/s: 3.4.0
   Resolution: Done

> Support --bootstrap-server kafka-streams-application-reset
> --
>
> Key: KAFKA-12878
> URL: https://issues.apache.org/jira/browse/KAFKA-12878
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Neil Buesing
>Assignee: Nikolay Izhikov
>Priority: Major
>  Labels: needs-kip, newbie
> Fix For: 3.4.0
>
>
> kafka-streams-application-reset still uses --bootstrap-servers, align with 
> other tools that use --bootstrap-server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [kafka] C0urante merged pull request #12656: MINOR: Update offset.storage.topic description

2022-09-19 Thread GitBox


C0urante merged PR #12656:
URL: https://github.com/apache/kafka/pull/12656





[GitHub] [kafka] C0urante commented on pull request #12632: KAFKA-12878 Support --bootstrap-server in kafka-streams-application-reset tool

2022-09-19 Thread GitBox


C0urante commented on PR #12632:
URL: https://github.com/apache/kafka/pull/12632#issuecomment-1251316903

   Yep, looks good. Merged.





[GitHub] [kafka] C0urante merged pull request #12632: KAFKA-12878 Support --bootstrap-server in kafka-streams-application-reset tool

2022-09-19 Thread GitBox


C0urante merged PR #12632:
URL: https://github.com/apache/kafka/pull/12632





[GitHub] [kafka] nizhikov commented on pull request #12632: KAFKA-12878 Support --bootstrap-server in kafka-streams-application-reset tool

2022-09-19 Thread GitBox


nizhikov commented on PR #12632:
URL: https://github.com/apache/kafka/pull/12632#issuecomment-1251293104

   @C0urante tests finished. It seems to me that the failures are unrelated,
but can you double-check?





[jira] [Commented] (KAFKA-14241) Implement the snapshot cleanup policy

2022-09-19 Thread Jose Armando Garcia Sancio (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606669#comment-17606669
 ] 

Jose Armando Garcia Sancio commented on KAFKA-14241:


Yes, and those topics are internal at the moment. So, at a high level, we need
more complicated validation logic that is able to distinguish whether the
affected topic is a KRaft topic or an ISR topic.

> Implement the snapshot cleanup policy
> -
>
> Key: KAFKA-14241
> URL: https://issues.apache.org/jira/browse/KAFKA-14241
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kraft
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
> Fix For: 3.4.0
>
>
> It looks like delete policy needs to be set to either delete or compact:
> {code:java}
>         .define(CleanupPolicyProp, LIST, Defaults.CleanupPolicy, 
> ValidList.in(LogConfig.Compact, LogConfig.Delete), MEDIUM, CompactDoc,
>           KafkaConfig.LogCleanupPolicyProp)
> {code}
> Neither is correct for KRaft topics. KIP-630 talks about adding a third 
> policy called snapshot:
> {code:java}
> The __cluster_metadata topic will have snapshot as the cleanup.policy. {code}
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-630%3A+Kafka+Raft+Snapshot#KIP630:KafkaRaftSnapshot-ProposedChanges]
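For illustration only, the shape of the change the KIP implies is simply a third allowed value in the cleanup-policy allow-list. The validator below is a hypothetical stand-in for `ConfigDef.ValidList`, not the actual Kafka config code:

```java
import java.util.List;

public class CleanupPolicyValidatorSketch {
    // Validate that every requested cleanup.policy value is in the allow-list.
    static void validate(List<String> values, List<String> allowed) {
        for (String v : values)
            if (!allowed.contains(v))
                throw new IllegalArgumentException("invalid cleanup.policy: " + v);
    }

    public static void main(String[] args) {
        // Today's allow-list rejects the policy KIP-630 describes...
        boolean rejected = false;
        try {
            validate(List.of("snapshot"), List.of("compact", "delete"));
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        if (!rejected) throw new AssertionError("expected rejection");

        // ...while an extended allow-list would accept it.
        validate(List.of("snapshot"), List.of("compact", "delete", "snapshot"));
    }
}
```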





[jira] [Comment Edited] (KAFKA-14241) Implement the snapshot cleanup policy

2022-09-19 Thread Jose Armando Garcia Sancio (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606669#comment-17606669
 ] 

Jose Armando Garcia Sancio edited comment on KAFKA-14241 at 9/19/22 4:32 PM:
-

Yes, and that topic is internal at the moment. So, at a high level, we need
more complicated validation logic that is able to distinguish whether the
affected topic is a KRaft topic or an ISR topic.


was (Author: jagsancio):
Yes and those topics are internal at the moment. So at high-level we need to 
have a more complicated validation logic that is able to distinguish if the 
affected topic is a KRaft topic vs an ISR topic.

> Implement the snapshot cleanup policy
> -
>
> Key: KAFKA-14241
> URL: https://issues.apache.org/jira/browse/KAFKA-14241
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kraft
>Reporter: Jose Armando Garcia Sancio
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
> Fix For: 3.4.0
>
>
> It looks like delete policy needs to be set to either delete or compact:
> {code:java}
>         .define(CleanupPolicyProp, LIST, Defaults.CleanupPolicy, 
> ValidList.in(LogConfig.Compact, LogConfig.Delete), MEDIUM, CompactDoc,
>           KafkaConfig.LogCleanupPolicyProp)
> {code}
> Neither is correct for KRaft topics. KIP-630 talks about adding a third 
> policy called snapshot:
> {code:java}
> The __cluster_metadata topic will have snapshot as the cleanup.policy. {code}
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-630%3A+Kafka+Raft+Snapshot#KIP630:KafkaRaftSnapshot-ProposedChanges]





[jira] [Resolved] (KAFKA-14063) Kafka message parsing can cause ooms with small antagonistic payloads

2022-09-19 Thread Manikumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-14063.
---
Resolution: Fixed

> Kafka message parsing can cause ooms with small antagonistic payloads
> -
>
> Key: KAFKA-14063
> URL: https://issues.apache.org/jira/browse/KAFKA-14063
> Project: Kafka
>  Issue Type: Bug
>  Components: generator
>Affects Versions: 3.2.0
>Reporter: Daniel Collins
>Priority: Major
> Fix For: 2.8.2, 3.2.3, 3.1.2, 3.0.2
>
>
> When parsing code receives a payload for a variable length field where the 
> length is specified in the code as some arbitrarily large number (assume 
> INT32_MAX for example) this will immediately try to allocate an ArrayList to 
> hold this many elements, before checking whether this is a reasonable array 
> size given the available data. 
> The fix for this is to instead throw a runtime exception if the length of a 
> variably sized container exceeds the amount of remaining data. Then, the 
> worst a user can do is force the server to allocate 8x the size of the actual 
> delivered data (if they claim there are N elements for a container of Objects 
> (i.e. not a byte string) and each Object bottoms out in an 8 byte pointer in 
> the ArrayList's backing array).
> This was identified by fuzzing the kafka request parsing code.
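The fix described above can be sketched as follows. This is a simplified, hypothetical Java rendition; the real generated parsers differ, and the one-byte-per-element lower bound is an assumption taken from the description:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class BoundedReadSketch {
    // Before allocating a container for `count` elements, check that the
    // claimed count could possibly fit in the bytes that remain (each element
    // needs at least one byte). This caps an attacker at forcing an
    // allocation proportional to the data actually delivered.
    static List<Integer> readIntArray(ByteBuffer buf) {
        int count = buf.getInt();
        if (count < 0 || count > buf.remaining())
            throw new RuntimeException("claimed element count " + count
                    + " exceeds remaining bytes " + buf.remaining());
        List<Integer> out = new ArrayList<>(count);
        for (int i = 0; i < count; i++)
            out.add(buf.getInt());
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer good = ByteBuffer.allocate(12);
        good.putInt(2).putInt(7).putInt(9).flip();
        if (!readIntArray(good).equals(List.of(7, 9)))
            throw new AssertionError("expected [7, 9]");

        // A tiny antagonistic payload claiming INT32_MAX elements is rejected
        // before any large allocation happens.
        ByteBuffer evil = ByteBuffer.allocate(8);
        evil.putInt(Integer.MAX_VALUE).putInt(0).flip();
        boolean thrown = false;
        try {
            readIntArray(evil);
        } catch (RuntimeException expected) {
            thrown = true;
        }
        if (!thrown) throw new AssertionError("oversized count should be rejected");
    }
}
```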





[jira] [Updated] (KAFKA-14063) Kafka message parsing can cause ooms with small antagonistic payloads

2022-09-19 Thread Manikumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar updated KAFKA-14063:
--
Fix Version/s: 2.8.2
   3.2.3
   3.1.2
   3.0.2

> Kafka message parsing can cause ooms with small antagonistic payloads
> -
>
> Key: KAFKA-14063
> URL: https://issues.apache.org/jira/browse/KAFKA-14063
> Project: Kafka
>  Issue Type: Bug
>  Components: generator
>Affects Versions: 3.2.0
>Reporter: Daniel Collins
>Priority: Major
> Fix For: 2.8.2, 3.0.2, 3.1.2, 3.2.3
>
>
> When parsing code receives a payload for a variable length field where the 
> length is specified in the code as some arbitrarily large number (assume 
> INT32_MAX for example) this will immediately try to allocate an ArrayList to 
> hold this many elements, before checking whether this is a reasonable array 
> size given the available data. 
> The fix for this is to instead throw a runtime exception if the length of a 
> variably sized container exceeds the amount of remaining data. Then, the 
> worst a user can do is force the server to allocate 8x the size of the actual 
> delivered data (if they claim there are N elements for a container of Objects 
> (i.e. not a byte string) and each Object bottoms out in an 8 byte pointer in 
> the ArrayList's backing array).
> This was identified by fuzzing the kafka request parsing code.





[GitHub] [kafka] fvaleri commented on pull request #12656: MINOR: Update offset.storage.topic description

2022-09-19 Thread GitBox


fvaleri commented on PR #12656:
URL: https://github.com/apache/kafka/pull/12656#issuecomment-1251149885

   Thanks @C0urante for the quick feedback. It should be good now.





[GitHub] [kafka] fvaleri commented on a diff in pull request #12656: MINOR: Update offset.storage.topic description

2022-09-19 Thread GitBox


fvaleri commented on code in PR #12656:
URL: https://github.com/apache/kafka/pull/12656#discussion_r974366092


##
docs/connect.html:
##
@@ -53,7 +53,7 @@ Running Kafka Connect
The important configuration options specific to standalone mode are:
 
-offset.storage.file.filename - File to store offset data in
+offset.storage.file.filename - File to store source task offsets

Review Comment:
   Thanks.



##
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/standalone/StandaloneConfig.java:
##
@@ -28,7 +28,7 @@ public class StandaloneConfig extends WorkerConfig {
  * offset.storage.file.filename
  */
 public static final String OFFSET_STORAGE_FILE_FILENAME_CONFIG = 
"offset.storage.file.filename";
-private static final String OFFSET_STORAGE_FILE_FILENAME_DOC = "File to 
store offset data in";
+private static final String OFFSET_STORAGE_FILE_FILENAME_DOC = "File to 
store source task offsets";

Review Comment:
   Yes, I like consistency. Thanks.



##
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConfig.java:
##
@@ -108,7 +108,7 @@ public class WorkerConfig extends AbstractConfig {
 
 public static final String OFFSET_COMMIT_INTERVAL_MS_CONFIG = 
"offset.flush.interval.ms";
 private static final String OFFSET_COMMIT_INTERVAL_MS_DOC
-= "Interval at which to try committing offsets for tasks.";
+= "Interval at which to try committing offsets for source tasks.";

Review Comment:
   Right, thanks.






[GitHub] [kafka] ijuma commented on pull request #12658: KAFKA-14233: disable testReloadUpdatedFilesWithoutConfigChange

2022-09-19 Thread GitBox


ijuma commented on PR #12658:
URL: https://github.com/apache/kafka/pull/12658#issuecomment-1251084739

   Thanks @showuon!





[jira] [Assigned] (KAFKA-14239) Merge StateRestorationIntegrationTest into RestoreIntegrationTest

2022-09-19 Thread Ahmed Sobeh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Sobeh reassigned KAFKA-14239:
---

Assignee: Ahmed Sobeh

> Merge StateRestorationIntegrationTest into RestoreIntegrationTest
> -
>
> Key: KAFKA-14239
> URL: https://issues.apache.org/jira/browse/KAFKA-14239
> Project: Kafka
>  Issue Type: Improvement
>  Components: unit tests
>Reporter: Guozhang Wang
>Assignee: Ahmed Sobeh
>Priority: Major
>  Labels: newbie++
>
> We have two integration test classes for store restoration, and 
> StateRestorationIntegrationTest only has one single test method. We can merge 
> it with the other to save integration testing time.





[jira] [Comment Edited] (KAFKA-14239) Merge StateRestorationIntegrationTest into RestoreIntegrationTest

2022-09-19 Thread Ahmed Sobeh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606594#comment-17606594
 ] 

Ahmed Sobeh edited comment on KAFKA-14239 at 9/19/22 2:11 PM:
--

-Hi! I'm a newbie and I'd like to work on this. Sorry if it's already been 
answered multiple times, but any specific steps I need to take to get this 
assigned to me?-
all good, now assigned to me


was (Author: JIRAUSER295920):
Hi! I'm a newbie and I'd like to work on this. Sorry if it's already been 
answered multiple times, but any specific steps I need to take to get this 
assigned to me?

> Merge StateRestorationIntegrationTest into RestoreIntegrationTest
> -
>
> Key: KAFKA-14239
> URL: https://issues.apache.org/jira/browse/KAFKA-14239
> Project: Kafka
>  Issue Type: Improvement
>  Components: unit tests
>Reporter: Guozhang Wang
>Priority: Major
>  Labels: newbie++
>
> We have two integration test classes for store restoration, and 
> StateRestorationIntegrationTest only has one single test method. We can merge 
> it with the other to save integration testing time.





[GitHub] [kafka] C0urante commented on a diff in pull request #12566: KAFKA-13927: Only clear origOffsets when clearing messageBatch

2022-09-19 Thread GitBox


C0urante commented on code in PR #12566:
URL: https://github.com/apache/kafka/pull/12566#discussion_r974231345


##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/WorkerSinkTaskTest.java:
##
@@ -435,6 +453,13 @@ public void testPollRedelivery() throws Exception {
 assertTaskMetricValue("batch-size-max", 1.0);
 assertTaskMetricValue("batch-size-avg", 0.5);
 
+assertEquals(workerCurrentOffsets, Whitebox.<Map<TopicPartition, OffsetAndMetadata>>getInternalState(workerTask, "currentOffsets"));

Review Comment:
   Do we have to probe internal fields to verify this change? Couldn't we 
examine the offsets given to `SinkTask::preCommit` or `Consumer::commitAsync` 
instead?



##
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java:
##
@@ -763,6 +763,9 @@ private void onPartitionsRemoved(Collection<TopicPartition>
partitions, boolean
 return;
 
 try {
+for (TopicPartition partition: partitions) {
+origOffsets.remove(partition);
+}

Review Comment:
   Any reason not to put this in `closePartitions`, where we also clear out 
entries from `currentOffsets`?
   Also, this can be simplified:
   ```java
   origOffsets.keySet().removeAll(partitions);
   ```
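To illustrate why that one-liner suffices (hypothetical map contents; in the real code `origOffsets` maps `TopicPartition` to offset state): `Map.keySet()` returns a live view backed by the map, so removing keys through it removes the corresponding entries.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeySetRemoveDemo {
    // Remove every entry whose key is in `revoked`, via the keySet() view.
    static Map<String, Long> afterRevocation(Map<String, Long> offsets,
                                             Collection<String> revoked) {
        offsets.keySet().removeAll(revoked); // mutates the underlying map
        return offsets;
    }

    public static void main(String[] args) {
        Map<String, Long> origOffsets = new HashMap<>();
        origOffsets.put("topic-0", 10L);
        origOffsets.put("topic-1", 20L);
        origOffsets.put("topic-2", 30L);
        afterRevocation(origOffsets, List.of("topic-0", "topic-2"));
        if (!origOffsets.equals(Map.of("topic-1", 20L)))
            throw new AssertionError("unexpected: " + origOffsets);
    }
}
```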






[jira] [Commented] (KAFKA-14239) Merge StateRestorationIntegrationTest into RestoreIntegrationTest

2022-09-19 Thread Ahmed Sobeh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606594#comment-17606594
 ] 

Ahmed Sobeh commented on KAFKA-14239:
-

Hi! I'm a newbie and I'd like to work on this. Sorry if it's already been 
answered multiple times, but any specific steps I need to take to get this 
assigned to me?

> Merge StateRestorationIntegrationTest into RestoreIntegrationTest
> -
>
> Key: KAFKA-14239
> URL: https://issues.apache.org/jira/browse/KAFKA-14239
> Project: Kafka
>  Issue Type: Improvement
>  Components: unit tests
>Reporter: Guozhang Wang
>Priority: Major
>  Labels: newbie++
>
> We have two integration test classes for store restoration, and 
> StateRestorationIntegrationTest only has one single test method. We can merge 
> it with the other to save integration testing time.





[jira] [Assigned] (KAFKA-13927) Kafka Connect Sink Connector Success after RetriableException, no commit offset to remote.

2022-09-19 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton reassigned KAFKA-13927:
-

Assignee: Jordan Bull

> Kafka Connect Sink Connector Success after RetriableException, no commit 
> offset to remote.
> --
>
> Key: KAFKA-13927
> URL: https://issues.apache.org/jira/browse/KAFKA-13927
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.8.0
>Reporter: gj choi
>Assignee: Jordan Bull
>Priority: Minor
>
> I made a custom SinkConnector
> and set retries with RetriableException.
>  
> In the normal case, offsets are committed successfully with no LAG.
> But when one or more RetriableExceptions occur and the retry then succeeds,
> there is LAG for the successful records, despite the retry succeeding.
>  





[jira] (KAFKA-13927) Kafka Connect Sink Connector Success after RetriableException, no commit offset to remote.

2022-09-19 Thread Chris Egerton (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-13927 ]


Chris Egerton deleted comment on KAFKA-13927:
---

was (Author: chrisegerton):
https://github.com/apache/kafka/pull/12566

> Kafka Connect Sink Connector Success after RetriableException, no commit 
> offset to remote.
> --
>
> Key: KAFKA-13927
> URL: https://issues.apache.org/jira/browse/KAFKA-13927
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.8.0
>Reporter: gj choi
>Assignee: Jordan Bull
>Priority: Minor
>
> I made a custom SinkConnector
> and set retries with RetriableException.
>  
> In the normal case, offsets are committed successfully with no LAG.
> But when one or more RetriableExceptions occur and the retry then succeeds,
> there is LAG for the successful records, despite the retry succeeding.
>  





[GitHub] [kafka] C0urante commented on a diff in pull request #12561: KAFKA-12495: Exponential backoff retry to prevent rebalance storms when worker joins after revoking rebalance

2022-09-19 Thread GitBox


C0urante commented on code in PR #12561:
URL: https://github.com/apache/kafka/pull/12561#discussion_r974209237


##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/distributed/WorkerCoordinatorIncrementalTest.java:
##
@@ -517,13 +517,13 @@ public void testTaskAssignmentWhenWorkerBounces() {
 leaderAssignment = deserializeAssignment(result, leaderId);
 assertAssignment(leaderId, offset,
 Collections.emptyList(), 0,
-Collections.emptyList(), 0,
+Collections.emptyList(), 1,

Review Comment:
   Isn't this a regression? We shouldn't be revoking tasks in this round since, 
without those revocations, we'd have a balanced assignment.






[GitHub] [kafka] vvcephei commented on pull request #12594: MINOR: Include all hosts in metadata for topology

2022-09-19 Thread GitBox


vvcephei commented on PR #12594:
URL: https://github.com/apache/kafka/pull/12594#issuecomment-1250971460

   ```
   [2022-09-07T15:38:17.417Z] [ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka-pr_PR-12594/streams/src/test/java/org/apache/kafka/streams/integration/NamedTopologyIntegrationTest.java:64:8:
 Unused import - java.util.Collections. [UnusedImports]
   ```





[GitHub] [kafka] nizhikov commented on a diff in pull request #12632: KAFKA-12878 Support --bootstrap-server in kafka-streams-application-reset tool

2022-09-19 Thread GitBox


nizhikov commented on code in PR #12632:
URL: https://github.com/apache/kafka/pull/12632#discussion_r974176089


##
core/src/main/scala/kafka/tools/StreamsResetter.java:
##
@@ -213,11 +217,16 @@ private void parseArguments(final String[] args) {
 .ofType(String.class)
 .describedAs("id")
 .required();
-bootstrapServerOption = optionParser.accepts("bootstrap-servers", 
"Comma-separated list of broker urls with format: HOST1:PORT1,HOST2:PORT2")
-.withRequiredArg()
+bootstrapServersOption = optionParser.accepts("bootstrap-servers", 
"DEPRECATED: Comma-separated list of broker urls with format: 
HOST1:PORT1,HOST2:PORT2")
+.withOptionalArg()

Review Comment:
   Thanks. I changed the code following your suggestion.






[GitHub] [kafka] C0urante commented on a diff in pull request #12632: KAFKA-12878 Support --bootstrap-server in kafka-streams-application-reset tool

2022-09-19 Thread GitBox


C0urante commented on code in PR #12632:
URL: https://github.com/apache/kafka/pull/12632#discussion_r974169402


##
core/src/main/scala/kafka/tools/StreamsResetter.java:
##
@@ -213,11 +217,16 @@ private void parseArguments(final String[] args) {
 .ofType(String.class)
 .describedAs("id")
 .required();
-bootstrapServerOption = optionParser.accepts("bootstrap-servers", 
"Comma-separated list of broker urls with format: HOST1:PORT1,HOST2:PORT2")
-.withRequiredArg()
+bootstrapServersOption = optionParser.accepts("bootstrap-servers", 
"DEPRECATED: Comma-separated list of broker urls with format: 
HOST1:PORT1,HOST2:PORT2")
+.withOptionalArg()

Review Comment:
   I think we still need to revert the change from `withRequiredArg` to 
`withOptionalArg`. If something has a required argument, the flag (e.g., 
`--bootstrap-server`) has to be followed by an argument; if it has an optional 
argument, you can provide just the flag without any argument.
   
   It's fine if we make providing the flags themselves optional (which I 
believe your if/else logic does), but we shouldn't make providing arguments for 
those flags optional; if someone provides `--bootstrap-server` or 
`--bootstrap-servers`, it should always be followed by the actual bootstrap 
server list.
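   The distinction can be illustrated without jopt-simple. The helper below is a hypothetical stand-in (not jopt-simple's API) that mimics `withRequiredArg()` semantics: the `--bootstrap-server` flag itself stays optional, but once present it must be followed by a value.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   
   class FlagSemantics {
       /**
        * Parses a flag that takes a *required* argument, mirroring
        * jopt-simple's withRequiredArg(): if the flag appears, it must be
        * followed by a value. Omitting the flag entirely is still allowed.
        */
       public static String parseRequiredArg(String[] args, String flag) {
           List<String> argv = new ArrayList<>(List.of(args));
           int i = argv.indexOf(flag);
           if (i < 0) {
               return null; // flag not supplied at all: allowed
           }
           if (i + 1 >= argv.size() || argv.get(i + 1).startsWith("--")) {
               // flag supplied without a value: rejected
               throw new IllegalArgumentException(flag + " requires an argument");
           }
           return argv.get(i + 1);
       }
   
       public static void main(String[] args) {
           System.out.println(parseRequiredArg(
               new String[]{"--bootstrap-server", "host:9092"}, "--bootstrap-server"));
       }
   }
   ```
   
   With `withOptionalArg()` the bare flag would be accepted and the value would come back empty, which is exactly what the review argues against for a bootstrap server list.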






[GitHub] [kafka] C0urante commented on a diff in pull request #12656: MINOR: Update offset.storage.topic description

2022-09-19 Thread GitBox


C0urante commented on code in PR #12656:
URL: https://github.com/apache/kafka/pull/12656#discussion_r974156910


##
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConfig.java:
##
@@ -108,7 +108,7 @@ public class WorkerConfig extends AbstractConfig {
 
 public static final String OFFSET_COMMIT_INTERVAL_MS_CONFIG = 
"offset.flush.interval.ms";
 private static final String OFFSET_COMMIT_INTERVAL_MS_DOC
-= "Interval at which to try committing offsets for tasks.";
+= "Interval at which to try committing offsets for source tasks.";

Review Comment:
   The offset commit interval also affects sink tasks; see here:
   
https://github.com/apache/kafka/blob/352c71ffb5d825c4a88454c12b9fa66c1750add3/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L212
   We can probably revert this change.






[GitHub] [kafka] C0urante commented on pull request #12656: MINOR: Update offset.storage.topic description

2022-09-19 Thread GitBox


C0urante commented on PR #12656:
URL: https://github.com/apache/kafka/pull/12656#issuecomment-1250915379

   Thanks @fvaleri, this is a useful change. I've left a few small comments but 
it should be good to go after they're addressed.





[GitHub] [kafka] C0urante commented on a diff in pull request #12656: MINOR: Update offset.storage.topic description

2022-09-19 Thread GitBox


C0urante commented on code in PR #12656:
URL: https://github.com/apache/kafka/pull/12656#discussion_r974158739


##
docs/connect.html:
##
@@ -53,7 +53,7 @@ Running Kafka Connect
The important configuration options specific to standalone mode are:
 
-offset.storage.file.filename - File to store offset 
data in
+offset.storage.file.filename - File to store source 
task offsets

Review Comment:
   Same suggestion RE "connector" vs. "task":
   ```suggestion
   offset.storage.file.filename - File to store source 
connector offsets
   ```






[GitHub] [kafka] C0urante commented on a diff in pull request #12656: MINOR: Update offset.storage.topic description

2022-09-19 Thread GitBox


C0urante commented on code in PR #12656:
URL: https://github.com/apache/kafka/pull/12656#discussion_r974158439


##
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/standalone/StandaloneConfig.java:
##
@@ -28,7 +28,7 @@ public class StandaloneConfig extends WorkerConfig {
  * offset.storage.file.filename
  */
 public static final String OFFSET_STORAGE_FILE_FILENAME_CONFIG = 
"offset.storage.file.filename";
-private static final String OFFSET_STORAGE_FILE_FILENAME_DOC = "File to 
store offset data in";
+private static final String OFFSET_STORAGE_FILE_FILENAME_DOC = "File to 
store source task offsets";

Review Comment:
   To remain consistent with the description for the `offset.storage.topic`, we 
should probably use the term "connector" instead of "task":
   ```suggestion
   private static final String OFFSET_STORAGE_FILE_FILENAME_DOC = "File to 
store source connector offsets";
   ```









[GitHub] [kafka] omkreddy merged pull request #12661: MINOR: Update release versions for upgrade tests with 3.0.2, 3.1.2, 3.2.3 release

2022-09-19 Thread GitBox


omkreddy merged PR #12661:
URL: https://github.com/apache/kafka/pull/12661





[GitHub] [kafka] cadonna commented on a diff in pull request #12650: KAFKA-10199: Adapt restoration integration tests to state updater

2022-09-19 Thread GitBox


cadonna commented on code in PR #12650:
URL: https://github.com/apache/kafka/pull/12650#discussion_r974112162


##
streams/src/test/java/org/apache/kafka/streams/integration/RestoreIntegrationTest.java:
##
@@ -291,8 +297,9 @@ public void shouldSuccessfullyStartWhenLoggingDisabled() 
throws InterruptedExcep
 assertTrue(startupLatch.await(30, TimeUnit.SECONDS));
 }
 
-@Test
-public void shouldProcessDataFromStoresWithLoggingDisabled() throws 
InterruptedException {
+@ParameterizedTest

Review Comment:
   In JUnit 5 it is apparently not possible to parameterize at the test class level. 
However, you can use a method as the source of the parameter values `true` and 
`false` for all test methods, instead of enumerating `true` and `false` for each 
test method. I used a method source.






[jira] [Updated] (KAFKA-14132) Remaining PowerMock to Mockito tests

2022-09-19 Thread Matthew de Detrich (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew de Detrich updated KAFKA-14132:
---
Description: 
{color:#de350b}Some of the tests below use EasyMock as well. For those migrate 
both PowerMock and EasyMock to Mockito.{color}

Unless stated in brackets the tests are in the connect module.

A list of tests which still require to be moved from PowerMock to Mockito as of 
2nd of August 2022 which do not have a Jira issue and do not have pull requests 
I am aware of which are opened:

{color:#ff8b00}InReview{color}
{color:#00875a}Merged{color}
 # ErrorHandlingTaskTest (owner: Divij)
 # SourceTaskOffsetCommiterTest (owner: Divij)
 # WorkerMetricsGroupTest (owner: Divij)
 # WorkerSinkTaskTest (owner: Divij)
 # WorkerSinkTaskThreadedTest (owner: Divij)
 # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya])
 # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya])
 # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya])
 # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya])
 # ConnectorsResourceTest (owner: [~mdedetrich-aiven] )
 # StandaloneHerderTest (owner: [~mdedetrich-aiven] )
 # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] )
 # KafkaOffsetBackingStoreTest (owner: Christo) 
([https://github.com/apache/kafka/pull/12418])
 # KafkaBasedLogTest (owner: [~mdedetrich-aiven] )
 # RetryUtilTest (owner: [~mdedetrich-aiven] )
 # RepartitionTopicTest (streams) (owner: Christo)
 # StateManagerUtilTest (streams) (owner: Christo)

*The coverage report for the above tests after the change should be >= to what 
the coverage is now.*

  was:
{color:#de350b}Some of the tests below use EasyMock as well. For those migrate 
both PowerMock and EasyMock to Mockito.{color}

Unless stated in brackets the tests are in the connect module.

A list of tests which still require to be moved from PowerMock to Mockito as of 
2nd of August 2022 which do not have a Jira issue and do not have pull requests 
I am aware of which are opened:

{color:#FF8B00}InReview{color}
{color:#00875A}Merged{color}

 # ErrorHandlingTaskTest (owner: Divij)
 # SourceTaskOffsetCommiterTest (owner: Divij)
 # WorkerMetricsGroupTest (owner: Divij)
 # WorkerSinkTaskTest (owner: Divij)
 # WorkerSinkTaskThreadedTest (owner: Divij)
 # {color:#00875A}WorkerTaskTest{color} (owner: [~yash.mayya])
 # {color:#00875A}ErrorReporterTest{color} (owner: [~yash.mayya])
 # {color:#00875A}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya])
 # {color:#00875A}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya])
 # ConnectorsResourceTest
 # StandaloneHerderTest
 # KafkaConfigBackingStoreTest
 # KafkaOffsetBackingStoreTest (owner: Christo) 
([https://github.com/apache/kafka/pull/12418])
 # KafkaBasedLogTest
 # RetryUtilTest
 # RepartitionTopicTest (streams) (owner: Christo)
 # StateManagerUtilTest (streams) (owner: Christo)

*The coverage report for the above tests after the change should be >= to what 
the coverage is now.*


> Remaining PowerMock to Mockito tests
> 
>
> Key: KAFKA-14132
> URL: https://issues.apache.org/jira/browse/KAFKA-14132
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Christo Lolov
>Assignee: Christo Lolov
>Priority: Major
>
> {color:#de350b}Some of the tests below use EasyMock as well. For those 
> migrate both PowerMock and EasyMock to Mockito.{color}
> Unless stated in brackets the tests are in the connect module.
> A list of tests which still require to be moved from PowerMock to Mockito as 
> of 2nd of August 2022 which do not have a Jira issue and do not have pull 
> requests I am aware of which are opened:
> {color:#ff8b00}InReview{color}
> {color:#00875a}Merged{color}
>  # ErrorHandlingTaskTest (owner: Divij)
>  # SourceTaskOffsetCommiterTest (owner: Divij)
>  # WorkerMetricsGroupTest (owner: Divij)
>  # WorkerSinkTaskTest (owner: Divij)
>  # WorkerSinkTaskThreadedTest (owner: Divij)
>  # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya])
>  # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya])
>  # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya])
>  # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya])
>  # ConnectorsResourceTest (owner: [~mdedetrich-aiven] )
>  # StandaloneHerderTest (owner: [~mdedetrich-aiven] )
>  # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] )
>  # KafkaOffsetBackingStoreTest (owner: Christo) 
> ([https://github.com/apache/kafka/pull/12418])
>  # KafkaBasedLogTest (owner: [~mdedetrich-aiven] )
>  # RetryUtilTest (owner: [~mdedetrich-aiven] )
>  # RepartitionTopicTest (streams) (owner: Christo)
>  # StateManagerUtilTest (streams) (owner: Christo)
> *The coverage report for the above tests after the change should be >= to what 
> the coverage is now.*

[GitHub] [kafka] fvaleri commented on pull request #12656: MINOR: Update offset.storage.topic description

2022-09-19 Thread GitBox


fvaleri commented on PR #12656:
URL: https://github.com/apache/kafka/pull/12656#issuecomment-1250780273

   @mimaison 





[jira] [Updated] (KAFKA-14208) KafkaConsumer#commitAsync throws unexpected WakeupException

2022-09-19 Thread Tom Bentley (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Bentley updated KAFKA-14208:

Fix Version/s: 3.2.3
   (was: 3.2.2)

> KafkaConsumer#commitAsync throws unexpected WakeupException
> ---
>
> Key: KAFKA-14208
> URL: https://issues.apache.org/jira/browse/KAFKA-14208
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 3.2.1
>Reporter: Qingsheng Ren
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 3.3.0, 3.2.3
>
>
> We recently encountered a bug after upgrading Kafka client to 3.2.1 in Flink 
> Kafka connector (FLINK-29153). Here's the exception:
> {code:java}
> org.apache.kafka.common.errors.WakeupException
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:514)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:278)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:252)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.coordinatorUnknownAndUnready(ConsumerCoordinator.java:493)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsAsync(ConsumerCoordinator.java:1055)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.commitAsync(KafkaConsumer.java:1573)
>   at 
> org.apache.flink.streaming.connectors.kafka.internals.KafkaConsumerThread.run(KafkaConsumerThread.java:226)
>  {code}
> As {{WakeupException}} is not listed in the JavaDoc of 
> {{{}KafkaConsumer#commitAsync{}}}, Flink Kafka connector doesn't catch the 
> exception thrown directly from KafkaConsumer#commitAsync but handles all 
> exceptions in the callback.
> I checked the source code and suspect this is caused by KAFKA-13563. Also we 
> never had this exception in commitAsync when we used Kafka client 2.4.1 & 
> 2.8.1. 
> I'm wondering if this is kind of breaking the public API as the 
> WakeupException is not listed in JavaDoc, and maybe it's better to invoke the 
> callback to handle the {{WakeupException}} instead of throwing it directly 
> from the method itself. 





[jira] [Updated] (KAFKA-14196) Duplicated consumption during rebalance, causing OffsetValidationTest to act flaky

2022-09-19 Thread Tom Bentley (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Bentley updated KAFKA-14196:

Fix Version/s: 3.2.3
   (was: 3.2.2)

> Duplicated consumption during rebalance, causing OffsetValidationTest to act 
> flaky
> --
>
> Key: KAFKA-14196
> URL: https://issues.apache.org/jira/browse/KAFKA-14196
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 3.2.1
>Reporter: Philip Nee
>Assignee: Philip Nee
>Priority: Blocker
>  Labels: new-consumer-threading-should-fix
> Fix For: 3.3.0, 3.2.3
>
>
> Several flaky tests under OffsetValidationTest are indicating a potential 
> consumer duplication issue when autocommit is enabled.  I believe this 
> affects *3.2* and onward.  The failure message is shown below:
>  
> {code:java}
> Total consumed records 3366 did not match consumed position 3331 {code}
>  
> After investigating the log, I discovered that the data consumed between the 
> start of a rebalance event and the async commit was lost for those failing 
> tests.  In the example below, the rebalance event kicks in at around 
> 1662054846995 (first record), and the async commit of the offset 3739 is 
> completed at around 1662054847015 (right before partitions_revoked).
>  
> {code:java}
> {"timestamp":1662054846995,"name":"records_consumed","count":3,"partitions":[{"topic":"test_topic","partition":0,"count":3,"minOffset":3739,"maxOffset":3741}]}
> {"timestamp":1662054846998,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3742,"maxOffset":3743}]}
> {"timestamp":1662054847008,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3744,"maxOffset":3745}]}
> {"timestamp":1662054847016,"name":"partitions_revoked","partitions":[{"topic":"test_topic","partition":0}]}
> {"timestamp":1662054847031,"name":"partitions_assigned","partitions":[{"topic":"test_topic","partition":0}]}
> {"timestamp":1662054847038,"name":"records_consumed","count":23,"partitions":[{"topic":"test_topic","partition":0,"count":23,"minOffset":3739,"maxOffset":3761}]}
>  {code}
> A few things to note here:
>  # Manually calling commitSync in the onPartitionsRevoke cb seems to 
> alleviate the issue
>  # Setting includeMetadataInTimeout to false also seems to alleviate the 
> issue.
> The above tries seems to suggest that contract between poll() and 
> asyncCommit() is broken.  AFAIK, we implicitly uses poll() to ack the 
> previously fetched data, and the consumer would (try to) commit these offsets 
> in the current poll() loop.  However, it seems like as the poll continues to 
> loop, the "acked" data isn't being committed.
>  
> I believe this could be introduced in  KAFKA-14024, which originated from 
> KAFKA-13310.
> More specifically (see the comments below), the ConsumerCoordinator will 
> always return before the async commit, due to the previous incomplete commit.  
> However, this is a bit contradictory here because:
>  # I think we want to commit asynchronously while the poll continues, and if 
> we do that, we are back to KAFKA-14024, that the consumer will get rebalance 
> timeout and get kicked out of the group.
>  # But we also need to commit all the "acked" offsets before revoking the 
> partition, and this has to be blocked.
> *Steps to Reproduce the Issue:*
>  # Check out AK 3.2
>  # Run this several times: (Recommend to only run runs with autocommit 
> enabled in consumer_test.py to save time)
> {code:java}
> _DUCKTAPE_OPTIONS="--debug" 
> TC_PATHS="tests/kafkatest/tests/client/consumer_test.py::OffsetValidationTest.test_consumer_failure"
>  bash tests/docker/run_tests.sh {code}
>  
> *Steps to Diagnose the Issue:*
>  # Open the test results in *results/*
>  # Go to the consumer log.  It might look like this
>  
> {code:java}
> results/2022-09-03--005/OffsetValidationTest/test_consumer_failure/clean_shutdown=True.enable_autocommit=True.metadata_quorum=ZK/2/VerifiableConsumer-0-xx/dockerYY
>  {code}
> 3. Find the docker instance that has partition getting revoked and rejoined.  
> Observed the offset before and after.
> *Propose Fixes:*
>  TBD
>  
> https://github.com/apache/kafka/pull/12603





[GitHub] [kafka] Gerrrr commented on pull request #12393: WIP: KAFKA-12549 Prototype for transactional state stores

2022-09-19 Thread GitBox


Gerrrr commented on PR #12393:
URL: https://github.com/apache/kafka/pull/12393#issuecomment-1250710841

   Hey @guozhangwang !
   
   Yeah, I am still working on the PR - adding integration tests, running 
benchmarks, etc. I will ping you here and update the description once the PR is 
ready for review. We can decide on the best way to split it then.





[GitHub] [kafka] tombentley opened a new pull request, #12661: MINOR: Update release versions for upgrade tests with 3.0.2, 3.1.2, 3.2.3 release

2022-09-19 Thread GitBox


tombentley opened a new pull request, #12661:
URL: https://github.com/apache/kafka/pull/12661

   Updates release versions in the files that are used for upgrade tests with the 
3.0.2, 3.1.2, and 3.2.3 release versions.
   
   





[jira] [Comment Edited] (KAFKA-14043) The latest status of the task is overwritten by the old status

2022-09-19 Thread doupengwei (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606402#comment-17606402
 ] 

doupengwei edited comment on KAFKA-14043 at 9/19/22 7:58 AM:
-

Example: the pg connector will try to release slot and publication resources, but 
due to a network issue this fails, and the worker then tries to update the task 
status to FAILED.

So I think we can add generation information to the task status, so that after 
the network issue is resolved, the stale status info can be filtered out by 
checking the generation.

 

Hello Ismael Juma, sorry to bother you; could you check my point? @[~ijuma] 
 


was (Author: doudou):
Exp: for pg connector will try to release slot and publication resources, but 
due to net work issue. it will failed and try to update task status to failed.

So i think we can add a generation information for task status, so that after 
net work issue resolved. the old status info can be filtered by check 
generation.

 

 

> The latest status of the task is overwritten by the old status
> --
>
> Key: KAFKA-14043
> URL: https://issues.apache.org/jira/browse/KAFKA-14043
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.4.0
> Environment: Centos 7.6
> kafka  2.4.0
>Reporter: doupengwei
>Priority: Major
>
> Kafka version : 2.4.0
> connect.protocol : compatible
> In a Kafka Connect cluster, if one node faces a network issue, the worker 
> will try to stop the connectors and tasks running on that node. Due to the 
> network issue, the stop will fail and throw an exception, and the worker 
> process will then try to update the task or connector status; the producer 
> will retry indefinitely until the update is successfully sent. Meanwhile, 
> because of the network issue, the new assignment has been performed on the 
> other Connect nodes. After the network recovers, the old task status 
> overwrites the latest status.
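
The generation check proposed in the comment above could look roughly like this. `TaskStatusStore`, its `put` signature, and the generation numbering are hypothetical illustrations, not Connect's actual `StatusBackingStore`:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TaskStatusStore {
    // task name -> last accepted rebalance generation
    private final Map<String, Integer> generations = new ConcurrentHashMap<>();
    private final Map<String, String> statuses = new ConcurrentHashMap<>();

    /**
     * Accepts a status update only if it comes from a generation at least as
     * new as the last one seen, so a delayed write from a worker that lost
     * connectivity cannot overwrite the status set after the next rebalance.
     */
    public synchronized boolean put(String task, int generation, String status) {
        int current = generations.getOrDefault(task, -1);
        if (generation < current) {
            return false; // stale update from an old generation: dropped
        }
        generations.put(task, generation);
        statuses.put(task, status);
        return true;
    }

    public String get(String task) {
        return statuses.get(task);
    }
}
```

A delayed write from the disconnected worker would carry the pre-rebalance generation and be rejected, leaving the status written after the new assignment intact.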





[GitHub] [kafka] showuon commented on pull request #12658: KAFKA-14233: disable testReloadUpdatedFilesWithoutConfigChange

2022-09-19 Thread GitBox


showuon commented on PR #12658:
URL: https://github.com/apache/kafka/pull/12658#issuecomment-1250648906

   Finally got a build without red lights
   
![image](https://user-images.githubusercontent.com/43372967/190966322-be85699a-f247-489d-b0ef-61a4870fcae8.png)
   

