[jira] [Resolved] (KAFKA-14883) Broker state should be "observer" in KRaft quorum

2023-04-12 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-14883.
---
Fix Version/s: 3.5.0
   Resolution: Fixed

> Broker state should be "observer" in KRaft quorum
> -
>
> Key: KAFKA-14883
> URL: https://issues.apache.org/jira/browse/KAFKA-14883
> Project: Kafka
>  Issue Type: Improvement
>  Components: kraft, metrics
>Affects Versions: 3.4.0
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
>Priority: Major
> Fix For: 3.5.0
>
>
> Currently, the `current-state` KRaft-related metric reports the `follower` state 
> for a broker, while technically it should be reported as `observer`, as the 
> `kafka-metadata-quorum` tool does.
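The change can be cross-checked over JMX. A minimal sketch in Java, assuming the metric is exposed under a kafka.server:type=raft-metrics MBean with a current-state attribute (the object name is an assumption; confirm it against the broker's registered MBeans) and that JMX is enabled on localhost:9999:
{code:java}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CurrentStateCheck {
    public static void main(String[] args) throws Exception {
        // Assumes the broker was started with JMX enabled on port 9999.
        JMXServiceURL url =
            new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Assumed object name for the KRaft client metrics; verify on your broker.
            ObjectName raftMetrics = new ObjectName("kafka.server:type=raft-metrics");
            Object state = conn.getAttribute(raftMetrics, "current-state");
            System.out.println("current-state = " + state);
        }
    }
}
{code}
Before the fix a broker in a KRaft quorum reports "follower" here; with the fix it reports "observer", matching the kafka-metadata-quorum describe output.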



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-9550) RemoteLogManager - copying eligible log segments to remote storage implementation

2023-04-12 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-9550.
--
Resolution: Fixed

> RemoteLogManager - copying eligible log segments to remote storage 
> implementation 
> --
>
> Key: KAFKA-9550
> URL: https://issues.apache.org/jira/browse/KAFKA-9550
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Satish Duggana
>Priority: Major
> Fix For: 3.5.0
>
>
> Implementation of the RLM as described in the HLD section of KIP-405; this JIRA 
> covers copying eligible segments to remote storage. 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP-405:KafkaTieredStorage-High-leveldesign]
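A minimal sketch of what the copy path looks like against the KIP-405 RemoteStorageManager contract; the EligibleSegment type and the eligibleSegments() helper are hypothetical stand-ins for the RemoteLogManager's own bookkeeping, not Kafka APIs:
{code:java}
import java.util.List;
import org.apache.kafka.server.log.remote.storage.LogSegmentData;
import org.apache.kafka.server.log.remote.storage.RemoteLogSegmentMetadata;
import org.apache.kafka.server.log.remote.storage.RemoteStorageException;
import org.apache.kafka.server.log.remote.storage.RemoteStorageManager;

public class SegmentCopySketch {

    private final RemoteStorageManager remoteStorageManager;

    public SegmentCopySketch(RemoteStorageManager remoteStorageManager) {
        this.remoteStorageManager = remoteStorageManager;
    }

    /**
     * Copies every rolled, not-yet-copied segment. The helpers below are
     * hypothetical; the real RemoteLogManager derives eligibility from the
     * log's segment list and the remote log metadata manager.
     */
    public void copyEligibleSegments() throws RemoteStorageException {
        for (EligibleSegment segment : eligibleSegments()) {
            RemoteLogSegmentMetadata metadata = segment.metadata();
            LogSegmentData data = segment.data();
            // KIP-405 contract: upload the segment plus its offset/time indexes,
            // leader-epoch checkpoint and producer snapshot to remote storage.
            remoteStorageManager.copyLogSegmentData(metadata, data);
        }
    }

    // --- hypothetical placeholders, not part of the Kafka API ---
    interface EligibleSegment {
        RemoteLogSegmentMetadata metadata();
        LogSegmentData data();
    }

    private List<EligibleSegment> eligibleSegments() {
        return List.of(); // derived from the active log in the real implementation
    }
}
{code}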



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1756

2023-04-12 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.4 #107

2023-04-12 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1755

2023-04-12 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 367175 lines...]
[2023-04-13T00:35:19.675Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testControllerManagementMethods() STARTED
[2023-04-13T00:35:20.496Z] 
[2023-04-13T00:35:20.496Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testControllerManagementMethods() PASSED
[2023-04-13T00:35:20.496Z] 
[2023-04-13T00:35:20.496Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testTopicAssignmentMethods() STARTED
[2023-04-13T00:35:20.496Z] 
[2023-04-13T00:35:20.496Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testTopicAssignmentMethods() PASSED
[2023-04-13T00:35:20.496Z] 
[2023-04-13T00:35:20.496Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testConnectionViaNettyClient() STARTED
[2023-04-13T00:35:21.317Z] 
[2023-04-13T00:35:21.318Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testConnectionViaNettyClient() PASSED
[2023-04-13T00:35:21.318Z] 
[2023-04-13T00:35:21.318Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testPropagateIsrChanges() STARTED
[2023-04-13T00:35:21.318Z] 
[2023-04-13T00:35:21.318Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testPropagateIsrChanges() PASSED
[2023-04-13T00:35:21.318Z] 
[2023-04-13T00:35:21.318Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testControllerEpochMethods() STARTED
[2023-04-13T00:35:21.318Z] 
[2023-04-13T00:35:21.318Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testControllerEpochMethods() PASSED
[2023-04-13T00:35:21.318Z] 
[2023-04-13T00:35:21.318Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testDeleteRecursive() STARTED
[2023-04-13T00:35:22.138Z] 
[2023-04-13T00:35:22.138Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testDeleteRecursive() PASSED
[2023-04-13T00:35:22.138Z] 
[2023-04-13T00:35:22.138Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testGetTopicPartitionStates() STARTED
[2023-04-13T00:35:22.138Z] 
[2023-04-13T00:35:22.138Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testGetTopicPartitionStates() PASSED
[2023-04-13T00:35:22.138Z] 
[2023-04-13T00:35:22.138Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testCreateConfigChangeNotification() STARTED
[2023-04-13T00:35:22.960Z] 
[2023-04-13T00:35:22.960Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testCreateConfigChangeNotification() PASSED
[2023-04-13T00:35:22.960Z] 
[2023-04-13T00:35:22.960Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testDelegationTokenMethods() STARTED
[2023-04-13T00:35:22.960Z] 
[2023-04-13T00:35:22.960Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > KafkaZkClientTest > testDelegationTokenMethods() PASSED
[2023-04-13T00:35:22.960Z] 
[2023-04-13T00:35:22.960Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testUpdateExistingPartitions() STARTED
[2023-04-13T00:35:22.960Z] 
[2023-04-13T00:35:22.960Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testUpdateExistingPartitions() PASSED
[2023-04-13T00:35:22.960Z] 
[2023-04-13T00:35:22.960Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testEmptyWrite() STARTED
[2023-04-13T00:35:22.960Z] 
[2023-04-13T00:35:22.960Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testEmptyWrite() PASSED
[2023-04-13T00:35:22.960Z] 
[2023-04-13T00:35:22.960Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testExistingKRaftControllerClaim() STARTED
[2023-04-13T00:35:23.781Z] 
[2023-04-13T00:35:23.781Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testExistingKRaftControllerClaim() PASSED
[2023-04-13T00:35:23.781Z] 
[2023-04-13T00:35:23.781Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testReadAndWriteProducerId() STARTED
[2023-04-13T00:35:23.781Z] 
[2023-04-13T00:35:23.781Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testReadAndWriteProducerId() PASSED
[2023-04-13T00:35:23.781Z] 
[2023-04-13T00:35:23.781Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 94 > ZkMigrationClientTest > testMigrationBrokerConfigs() 

[jira] [Resolved] (KAFKA-14561) Improve transactions experience for older clients by ensuring ongoing transaction

2023-04-12 Thread Jun Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-14561.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

merged the PR to trunk.

> Improve transactions experience for older clients by ensuring ongoing 
> transaction
> -
>
> Key: KAFKA-14561
> URL: https://issues.apache.org/jira/browse/KAFKA-14561
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Major
> Fix For: 3.5.0
>
>
> This is part 3 of KIP-890:
> 3. *To cover older clients, we will ensure a transaction is ongoing before we 
> write to a transaction. We can do this by querying the transaction 
> coordinator and caching the result.*
> See KIP-890 for more details: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-890%3A+Transactions+Server-Side+Defense
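A rough illustration of the idea described above, not the actual broker code: before appending a transactional produce from an older client, the leader asks the transaction coordinator whether the transaction is ongoing and caches the verified producer id and epoch so subsequent writes skip the round trip. All class and method names below are hypothetical:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of KIP-890 part 3; not the real broker implementation. */
public class TransactionVerificationSketch {

    /** Verified (producerId, epoch) entries, keyed by transactional id. */
    private final Map<String, VerifiedEntry> verified = new ConcurrentHashMap<>();

    private final TransactionCoordinatorClient coordinator;

    public TransactionVerificationSketch(TransactionCoordinatorClient coordinator) {
        this.coordinator = coordinator;
    }

    /** Returns true if the write may proceed, i.e. the transaction is known to be ongoing. */
    public boolean verifyBeforeAppend(String transactionalId, long producerId, short epoch) {
        VerifiedEntry cached = verified.get(transactionalId);
        if (cached != null && cached.matches(producerId, epoch)) {
            return true; // already verified for this producer id and epoch
        }
        // Query the transaction coordinator and cache a positive result.
        boolean ongoing = coordinator.isTransactionOngoing(transactionalId, producerId, epoch);
        if (ongoing) {
            verified.put(transactionalId, new VerifiedEntry(producerId, epoch));
        }
        return ongoing;
    }

    /** Invalidate the cache when the transaction is committed or aborted. */
    public void onTransactionCompleted(String transactionalId) {
        verified.remove(transactionalId);
    }

    // --- hypothetical collaborators ---
    public interface TransactionCoordinatorClient {
        boolean isTransactionOngoing(String transactionalId, long producerId, short epoch);
    }

    private record VerifiedEntry(long producerId, short epoch) {
        boolean matches(long pid, short e) {
            return producerId == pid && epoch == e;
        }
    }
}
{code}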



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1754

2023-04-12 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 556315 lines...]
[2023-04-12T21:31:11.321Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardNoSslDualListener PASSED
[2023-04-12T21:31:11.321Z] 
[2023-04-12T21:31:11.321Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardSslDualListener STARTED
[2023-04-12T21:31:20.705Z] 
[2023-04-12T21:31:20.705Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardSslDualListener PASSED
[2023-04-12T21:31:20.705Z] 
[2023-04-12T21:31:20.705Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardFollowerSsl STARTED
[2023-04-12T21:31:23.756Z] 
[2023-04-12T21:31:23.756Z] > Task 
:streams:upgrade-system-tests-0102:integrationTest
[2023-04-12T21:31:23.756Z] > Task 
:streams:upgrade-system-tests-0110:integrationTest
[2023-04-12T21:31:34.370Z] 
[2023-04-12T21:31:34.370Z] > Task :connect:runtime:integrationTest
[2023-04-12T21:31:34.370Z] 
[2023-04-12T21:31:34.370Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardFollowerSsl PASSED
[2023-04-12T21:31:34.370Z] 
[2023-04-12T21:31:34.370Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardSsl STARTED
[2023-04-12T21:31:35.189Z] 
[2023-04-12T21:31:35.189Z] > Task 
:streams:upgrade-system-tests-10:integrationTest
[2023-04-12T21:31:46.548Z] 
[2023-04-12T21:31:46.548Z] > Task :connect:runtime:integrationTest
[2023-04-12T21:31:46.548Z] 
[2023-04-12T21:31:46.548Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 133 > 
org.apache.kafka.connect.integration.ExampleConnectIntegrationTest > 
testSourceConnector PASSED
[2023-04-12T21:31:46.548Z] 
[2023-04-12T21:31:46.548Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 133 > 
org.apache.kafka.connect.integration.ExampleConnectIntegrationTest > 
testSinkConnector STARTED
[2023-04-12T21:31:46.548Z] 
[2023-04-12T21:31:46.548Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardSsl PASSED
[2023-04-12T21:31:46.548Z] 
[2023-04-12T21:31:46.548Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardNoSsl STARTED
[2023-04-12T21:31:46.548Z] 
[2023-04-12T21:31:46.548Z] > Task 
:streams:upgrade-system-tests-11:integrationTest
[2023-04-12T21:31:52.223Z] 
[2023-04-12T21:31:52.223Z] > Task :connect:runtime:integrationTest
[2023-04-12T21:31:52.223Z] 
[2023-04-12T21:31:52.223Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardNoSsl PASSED
[2023-04-12T21:31:52.223Z] 
[2023-04-12T21:31:52.223Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardLeaderSsl STARTED
[2023-04-12T21:31:54.706Z] 
[2023-04-12T21:31:54.706Z] > Task 
:streams:upgrade-system-tests-20:integrationTest
[2023-04-12T21:32:11.409Z] 
[2023-04-12T21:32:11.409Z] > Task :connect:runtime:integrationTest
[2023-04-12T21:32:11.409Z] 
[2023-04-12T21:32:11.409Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.RestForwardingIntegrationTest > 
testRestForwardLeaderSsl PASSED
[2023-04-12T21:32:11.409Z] 
[2023-04-12T21:32:11.409Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 132 > 
org.apache.kafka.connect.integration.SinkConnectorsIntegrationTest > 
testCooperativeConsumerPartitionAssignment STARTED
[2023-04-12T21:32:11.409Z] 
[2023-04-12T21:32:11.409Z] > Task 
:streams:upgrade-system-tests-21:integrationTest
[2023-04-12T21:32:15.709Z] > Task 
:streams:upgrade-system-tests-22:integrationTest
[2023-04-12T21:32:33.467Z] > Task 
:streams:upgrade-system-tests-23:integrationTest
[2023-04-12T21:32:33.467Z] 
[2023-04-12T21:32:33.467Z] > Task :connect:runtime:integrationTest
[2023-04-12T21:32:33.467Z] 
[2023-04-12T21:32:33.467Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 133 > 
org.apache.kafka.connect.integration.ExampleConnectIntegrationTest > 
testSinkConnector 

[jira] [Created] (KAFKA-14901) Flaky test ExactlyOnceSourceIntegrationTest.testConnectorReconfiguration

2023-04-12 Thread Greg Harris (Jira)
Greg Harris created KAFKA-14901:
---

 Summary: Flaky test 
ExactlyOnceSourceIntegrationTest.testConnectorReconfiguration
 Key: KAFKA-14901
 URL: https://issues.apache.org/jira/browse/KAFKA-14901
 Project: Kafka
  Issue Type: Test
  Components: KafkaConnect
Reporter: Greg Harris


The EOS Source test appears to be occasionally failing with the following error:
{noformat}
org.apache.kafka.common.KafkaException: Unexpected error in InitProducerIdResponse; The server experienced an unexpected error when processing the request.
    at app//org.apache.kafka.clients.producer.internals.TransactionManager$InitProducerIdHandler.handleResponse(TransactionManager.java:1303)
    at app//org.apache.kafka.clients.producer.internals.TransactionManager$TxnRequestHandler.onComplete(TransactionManager.java:1207)
    at app//org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:154)
    at app//org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:594)
    at app//org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:586)
    at app//org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:426)
    at app//org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:316)
    at app//org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:243)
    at java.base@11.0.16.1/java.lang.Thread.run(Thread.java:829){noformat}
which appears to be triggered by the following failure inside the broker:
{noformat}
[2023-04-12 14:01:38,931] ERROR [KafkaApi-0] Unexpected error handling request RequestHeader(apiKey=INIT_PRODUCER_ID, apiVersion=4, clientId=simulated-task-producer-exactlyOnceQuestionMark-1, correlationId=5, headerVersion=2) -- InitProducerIdRequestData(transactionalId='exactly-once-source-integration-test-exactlyOnceQuestionMark-1', transactionTimeoutMs=6, producerId=-1, producerEpoch=-1) with context RequestContext(header=RequestHeader(apiKey=INIT_PRODUCER_ID, apiVersion=4, clientId=simulated-task-producer-exactlyOnceQuestionMark-1, correlationId=5, headerVersion=2), connectionId='127.0.0.1:54213-127.0.0.1:54367-46', clientAddress=/127.0.0.1, principal=User:ANONYMOUS, listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, clientInformation=ClientInformation(softwareName=apache-kafka-java, softwareVersion=3.5.0-SNAPSHOT), fromPrivilegedListener=true, principalSerde=Optional[org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder@615924cd]) (kafka.server.KafkaApis:76)
java.lang.IllegalStateException: Preparing transaction state transition to Empty while it already a pending state Ongoing
    at kafka.coordinator.transaction.TransactionMetadata.prepareTransitionTo(TransactionMetadata.scala:380)
    at kafka.coordinator.transaction.TransactionMetadata.prepareIncrementProducerEpoch(TransactionMetadata.scala:311)
    at kafka.coordinator.transaction.TransactionCoordinator.prepareInitProducerIdTransit(TransactionCoordinator.scala:240)
    at kafka.coordinator.transaction.TransactionCoordinator.$anonfun$handleInitProducerId$3(TransactionCoordinator.scala:151)
    at kafka.coordinator.transaction.TransactionMetadata.inLock(TransactionMetadata.scala:242)
    at kafka.coordinator.transaction.TransactionCoordinator.$anonfun$handleInitProducerId$2(TransactionCoordinator.scala:150)
    at scala.util.Either.flatMap(Either.scala:352)
    at kafka.coordinator.transaction.TransactionCoordinator.handleInitProducerId(TransactionCoordinator.scala:145)
    at kafka.server.KafkaApis.handleInitProducerIdRequest(KafkaApis.scala:2236)
    at kafka.server.KafkaApis.handle(KafkaApis.scala:202)
    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:76)
    at java.base/java.lang.Thread.run(Thread.java:829){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-14898) [ MirrorMaker ] sync.topic.configs.enabled not working as expected

2023-04-12 Thread Greg Harris (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris resolved KAFKA-14898.
-
Resolution: Fixed

> [ MirrorMaker ] sync.topic.configs.enabled not working as expected
> --
>
> Key: KAFKA-14898
> URL: https://issues.apache.org/jira/browse/KAFKA-14898
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Affects Versions: 3.4.0
>Reporter: Srinivas Boga
>Priority: Major
>  Labels: mirrormaker
>
> Hello,
> In my replication setup, I do not want to sync the topic configs; the use 
> case is to have a different retention time for the topic on the target cluster. 
> I am passing the config
> {code:java}
>  sync.topic.configs.enabled = false{code}
> but this is not working as expected: the topic retention time is being set to 
> whatever is set in the source cluster. Looking at the MirrorMaker logs, 
> I can see that MirrorSourceConnector is still setting the above config to true
> {code:java}
> [2023-04-12 17:04:55,184] INFO [MirrorSourceConnector|task-8] ConsumerConfig 
> values:
>         allow.auto.create.topics = true
>         auto.commit.interval.ms = 5000
>         auto.include.jmx.reporter = true
>         auto.offset.reset = earliest
>         bootstrap.servers = [sourcecluster.com:9092]
>         check.crcs = true
>         client.dns.lookup = use_all_dns_ips
>         client.id = consumer-null-2
>         client.rack =
>         connections.max.idle.ms = 540000
>         default.api.timeout.ms = 60000
>         enable.auto.commit = false
>         exclude.internal.topics = true
>         fetch.max.bytes = 52428800
>         fetch.max.wait.ms = 500
>         fetch.min.bytes = 1
>         group.id = null
>         group.instance.id = null
>         heartbeat.interval.ms = 3000
>         interceptor.classes = []
>         internal.leave.group.on.close = true
>         internal.throw.on.fetch.stable.offset.unsupported = false
>         isolation.level = read_uncommitted
>         key.deserializer = class 
> org.apache.kafka.common.serialization.ByteArrayDeserializer
>         max.partition.fetch.bytes = 1048576
>         max.poll.interval.ms = 300000
>         max.poll.records = 500
>         metadata.max.age.ms = 300000
>         metric.reporters = []
>         metrics.num.samples = 2
>         metrics.recording.level = INFO
>         metrics.sample.window.ms = 30000
>         partition.assignment.strategy = [class 
> org.apache.kafka.clients.consumer.RangeAssignor, class 
> org.apache.kafka.clients.consumer.CooperativeStickyAssignor]
>         receive.buffer.bytes = 65536
>         reconnect.backoff.max.ms = 1000
>         reconnect.backoff.ms = 50
>         request.timeout.ms = 30000
>         retry.backoff.ms = 100
>         sasl.client.callback.handler.class = null
>         sasl.jaas.config = null
>         sasl.kerberos.kinit.cmd = /usr/bin/kinit
>         sasl.kerberos.min.time.before.relogin = 60000
>         sasl.kerberos.service.name = null
>         sasl.kerberos.ticket.renew.jitter = 0.05
>         sasl.kerberos.ticket.renew.window.factor = 0.8
>         sasl.login.callback.handler.class = null
>         sasl.login.class = null
>         sasl.login.connect.timeout.ms = null
>         sasl.login.read.timeout.ms = null
>         sasl.login.refresh.buffer.seconds = 300
>         sasl.login.refresh.min.period.seconds = 60
>         sasl.login.refresh.window.factor = 0.8
>         sasl.login.refresh.window.jitter = 0.05
>         sasl.login.retry.backoff.max.ms = 10000
>         sasl.login.retry.backoff.ms = 100
>         sasl.mechanism = GSSAPI
>         sasl.oauthbearer.clock.skew.seconds = 30
>         sasl.oauthbearer.expected.audience = null
>         sasl.oauthbearer.expected.issuer = null
>         sasl.oauthbearer.jwks.endpoint.refresh.ms = 3600000
>         sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 10000
>         sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100
>         sasl.oauthbearer.jwks.endpoint.url = null
>         sasl.oauthbearer.scope.claim.name = scope
>         sasl.oauthbearer.sub.claim.name = sub
>         sasl.oauthbearer.token.endpoint.url = null
>         security.protocol = PLAINTEXT
>         security.providers = null
>         send.buffer.bytes = 131072
>         session.timeout.ms = 45000
>         socket.connection.setup.timeout.max.ms = 30000
>         socket.connection.setup.timeout.ms = 10000
>         ssl.cipher.suites = null
>         ssl.enabled.protocols = [TLSv1.2]
>         ssl.endpoint.identification.algorithm = https
>         ssl.engine.factory.class = null
>         ssl.key.password = null
>         ssl.keymanager.algorithm = SunX509
>         ssl.keystore.certificate.chain = null
>         ssl.keystore.key = null
>         

[jira] [Created] (KAFKA-14900) Flaky test AuthorizerTest failing with NPE

2023-04-12 Thread Greg Harris (Jira)
Greg Harris created KAFKA-14900:
---

 Summary: Flaky test AuthorizerTest failing with NPE
 Key: KAFKA-14900
 URL: https://issues.apache.org/jira/browse/KAFKA-14900
 Project: Kafka
  Issue Type: Test
  Components: kraft
Reporter: Greg Harris
Assignee: Greg Harris


The AuthorizerTest has multiple tests that appear to have the same flaky 
failure:
{noformat}
org.apache.kafka.server.fault.FaultHandlerException: quorumTestHarnessFaultHandler: Unhandled error initializing new publishers: Cannot invoke "kafka.raft.KafkaRaftManager.client()" because the return value of "kafka.server.SharedServer.raftManager()" is null
    at app//kafka.server.SharedServer.$anonfun$start$3(SharedServer.scala:256)
    at app//org.apache.kafka.image.loader.MetadataLoader.stillNeedToCatchUp(MetadataLoader.java:229)
    at app//org.apache.kafka.image.loader.MetadataLoader.initializeNewPublishers(MetadataLoader.java:270)
    at app//org.apache.kafka.image.loader.MetadataLoader.lambda$scheduleInitializeNewPublishers$0(MetadataLoader.java:258)
    at app//org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127)
    at app//org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
    at app//org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
    at java.base@17.0.4.1/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.NullPointerException: Cannot invoke "kafka.raft.KafkaRaftManager.client()" because the return value of "kafka.server.SharedServer.raftManager()" is null
    ... 8 more{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14899) Revisit Action Queue

2023-04-12 Thread Justine Olshan (Jira)
Justine Olshan created KAFKA-14899:
--

 Summary: Revisit Action Queue
 Key: KAFKA-14899
 URL: https://issues.apache.org/jira/browse/KAFKA-14899
 Project: Kafka
  Issue Type: Sub-task
Reporter: Justine Olshan
Assignee: Justine Olshan


With KAFKA-14561 we introduced a notion of callback requests. It would be nice 
to standardize and combine action queue usage here. However, the current 
implementation of the callback request assumes that local time is computed when 
the response is sent.

The same paradigm may not hold for the action queue. We should follow 
up and see what changes need to be made to combine the two.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-14894) MetadataLoader must call finishSnapshot after loading a snapshot

2023-04-12 Thread Colin McCabe (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin McCabe resolved KAFKA-14894.
--
Fix Version/s: 3.5.0
 Reviewer: David Arthur
   Resolution: Fixed

> MetadataLoader must call finishSnapshot after loading a snapshot
> 
>
> Key: KAFKA-14894
> URL: https://issues.apache.org/jira/browse/KAFKA-14894
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Major
> Fix For: 3.5.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14898) [ MirrorMaker ] sync.topic.configs.enabled not working as expected

2023-04-12 Thread Srinivas Boga (Jira)
Srinivas Boga created KAFKA-14898:
-

 Summary: [ MirrorMaker ] sync.topic.configs.enabled not working as 
expected
 Key: KAFKA-14898
 URL: https://issues.apache.org/jira/browse/KAFKA-14898
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Affects Versions: 3.4.0
Reporter: Srinivas Boga


Hello,

In my replication setup, I do not want to sync the topic configs; the use 
case is to have a different retention time for the topic on the target cluster. I 
am passing the config
{code:java}
 sync.topic.configs.enabled = false{code}
but this is not working as expected: the topic retention time is being set to 
whatever is set in the source cluster. Looking at the MirrorMaker logs, I 
can see that MirrorSourceConnector is still setting the above config to true
{code:java}
[2023-04-12 17:04:55,184] INFO [MirrorSourceConnector|task-8] ConsumerConfig 
values:
        allow.auto.create.topics = true
        auto.commit.interval.ms = 5000
        auto.include.jmx.reporter = true
        auto.offset.reset = earliest
        bootstrap.servers = [sourcecluster.com:9092]
        check.crcs = true
        client.dns.lookup = use_all_dns_ips
        client.id = consumer-null-2
        client.rack =
        connections.max.idle.ms = 540000
        default.api.timeout.ms = 60000
        enable.auto.commit = false
        exclude.internal.topics = true
        fetch.max.bytes = 52428800
        fetch.max.wait.ms = 500
        fetch.min.bytes = 1
        group.id = null
        group.instance.id = null
        heartbeat.interval.ms = 3000
        interceptor.classes = []
        internal.leave.group.on.close = true
        internal.throw.on.fetch.stable.offset.unsupported = false
        isolation.level = read_uncommitted
        key.deserializer = class 
org.apache.kafka.common.serialization.ByteArrayDeserializer
        max.partition.fetch.bytes = 1048576
        max.poll.interval.ms = 300000
        max.poll.records = 500
        metadata.max.age.ms = 300000
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        partition.assignment.strategy = [class 
org.apache.kafka.clients.consumer.RangeAssignor, class 
org.apache.kafka.clients.consumer.CooperativeStickyAssignor]
        receive.buffer.bytes = 65536
        reconnect.backoff.max.ms = 1000
        reconnect.backoff.ms = 50
        request.timeout.ms = 30000
        retry.backoff.ms = 100
        sasl.client.callback.handler.class = null
        sasl.jaas.config = null
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.login.callback.handler.class = null
        sasl.login.class = null
        sasl.login.connect.timeout.ms = null
        sasl.login.read.timeout.ms = null
        sasl.login.refresh.buffer.seconds = 300
        sasl.login.refresh.min.period.seconds = 60
        sasl.login.refresh.window.factor = 0.8
        sasl.login.refresh.window.jitter = 0.05
        sasl.login.retry.backoff.max.ms = 10000
        sasl.login.retry.backoff.ms = 100
        sasl.mechanism = GSSAPI
        sasl.oauthbearer.clock.skew.seconds = 30
        sasl.oauthbearer.expected.audience = null
        sasl.oauthbearer.expected.issuer = null
        sasl.oauthbearer.jwks.endpoint.refresh.ms = 3600000
        sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 10000
        sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100
        sasl.oauthbearer.jwks.endpoint.url = null
        sasl.oauthbearer.scope.claim.name = scope
        sasl.oauthbearer.sub.claim.name = sub
        sasl.oauthbearer.token.endpoint.url = null
        security.protocol = PLAINTEXT
        security.providers = null
        send.buffer.bytes = 131072
        session.timeout.ms = 45000
        socket.connection.setup.timeout.max.ms = 30000
        socket.connection.setup.timeout.ms = 10000
        ssl.cipher.suites = null
        ssl.enabled.protocols = [TLSv1.2]
        ssl.endpoint.identification.algorithm = https
        ssl.engine.factory.class = null
        ssl.key.password = null
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.certificate.chain = null
        ssl.keystore.key = null
        ssl.keystore.location = null
        ssl.keystore.password = null
        ssl.keystore.type = JKS
        ssl.protocol = TLSv1.2
        ssl.provider = null
        ssl.secure.random.implementation = null
        sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 10000
        sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100
        sasl.oauthbearer.jwks.endpoint.url = null
        sasl.oauthbearer.scope.claim.name = scope
        sasl.oauthbearer.sub.claim.name = 

[jira] [Created] (KAFKA-14897) 3.4.0 release notes does not mention fixing KAFKA-14696

2023-04-12 Thread Gray (Jira)
Gray created KAFKA-14897:


 Summary: 3.4.0 release notes does not mention fixing KAFKA-14696
 Key: KAFKA-14897
 URL: https://issues.apache.org/jira/browse/KAFKA-14897
 Project: Kafka
  Issue Type: Task
  Components: documentation
Affects Versions: 3.4.0
Reporter: Gray


According to KAFKA-14696, CVE-2023-25194 was fixed in version 3.4.0 but I see 
nothing about the task or CVE in the release notes.  Is this just a miss or is 
the "fix version" on the ticket wrong?  Thanks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-875: First-class offsets support in Kafka Connect

2023-04-12 Thread Yash Mayya
Hi Chris,

Thanks, that makes sense - I hadn't considered the case where the worker
itself becomes a zombie.

Thanks,
Yash

On Wed, Apr 12, 2023 at 10:40 PM Chris Egerton 
wrote:

> Hi Yash,
>
> Great, we can use the transactional ID for task zero for now 
>
> As far as why: we'd need to fence out that producer in the event that tasks
> for the connector are brought up while the alter offsets request is still
> taking place, since we'd want to make sure that the offsets aren't altered
> after tasks are brought up. I think it may be possible right now in
> extremely niche circumstances where the worker servicing the alter offsets
> request loses leadership of the cluster. More realistically, I think it
> leaves room for tweaking the logic for handling these requests to not use a
> five-second timeout in the future (we could potentially just
> fire-and-forget the request and, if new tasks are started for the connector
> in the meantime, trust that the request will get automatically fenced out
> without doing any more work).
>
> Cheers,
>
> Chris
>
> On Wed, Apr 12, 2023 at 1:01 PM Yash Mayya  wrote:
>
> > Hi Chris and Greg,
> >
> > The current implementation does already use the transactional ID for
> task 0
> > so no complaints from me. Although I'm not sure I follow the concerns
> w.r.t
> > zombie fencing? In which cases would we need to fence out the
> transactional
> > producer instantiated for altering offsets?
> >
> > Thanks,
> > Yash
> >
> > On Wed, Apr 12, 2023 at 9:02 PM Chris Egerton  wrote:
> >
> > > Hi Greg,
> > >
> > > I hadn't considered the implications W/R/T zombie fencing. I agree that
> > > using the transactional ID for task 0 is better in that case.
> > >
> > > Yash (who is implementing this part, cc'd), does this seem reasonable
> to
> > > you?
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Tue, Apr 11, 2023 at 3:23 PM Greg Harris
>  > >
> > > wrote:
> > >
> > >> Chris & Yash,
> > >>
> > >> 1. Since the global offsets topic does not have transactions on it
> > >> already,
> > >> I don't think adding transactions just for these reset operations
> would
> > be
> > >> an improvement. The transactional produce would not exclude other
> > >> non-transactional producers, but hanging transactions on the global
> > >> offsets
> > >> topic would negatively impact the general cluster health. Your
> proposed
> > >> strategy seems reasonable to me.
> > >>
> > >> 2. While it may be the connector performing the offset reset and not
> the
> > >> task, I think it would be preferable for the connector to use task 0's
> > >> task-id and 'impersonate' the task for the purpose of changing the
> > >> offsets.
> > >> I think the complication elsewhere (getting users to provide a new
> ACL,
> > >> expanding fencing to also fence the connector transaction id, etc) is
> > not
> > >> practically worth it to change 1 string value in the logs.
> > >> I would find a separate transaction ID beneficial if the connector
> could
> > >> be
> > >> given a different principal from the task, and be given distinct ACLs.
> > >> However, I don't think this is possible or desirable, and so I don't
> > think
> > >> it's relevant right now. Let me know if there are any other ways that
> > the
> > >> connector transaction ID would be useful.
> > >>
> > >> Thanks for all the effort on this feature!
> > >> Greg
> > >>
> > >> On Tue, Apr 11, 2023 at 7:52 AM Chris Egerton  >
> > >> wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > A couple slight tweaks to the design have been proposed during
> > >> > implementation and I'd like to report them here to make sure that
> > >> they're
> > >> > acceptable to all who previously voted for this KIP. I've updated
> the
> > >> KIP
> > >> > to include these changes but will be happy to revert and/or amend if
> > >> there
> > >> > are any concerns.
> > >> >
> > >> > 1. We would like to refrain from using a transaction when resetting
> > >> source
> > >> > connector offsets in the worker's global offsets topic when
> > exactly-once
> > >> > support is enabled. We would continue to use a transaction when
> > >> resetting
> > >> > offsets in the connector's offsets topic. Discussed in [1].
> > >> >
> > >> > 2. We would like to use a transactional ID of
> ${groupId}-${connector}
> > to
> > >> > alter/reset source connector offsets when exactly-once support is
> > >> enabled,
> > >> > where ${groupId} is the group ID of the Connect cluster and
> > >> ${connector} is
> > >> > the name of the connector. This is raised here because it would
> > >> introduce
> > >> > an additional ACL requirement for this API. A less-elegant
> alternative
> > >> that
> > >> > would obviate the additional ACL requirement is to use the
> > >> transactional ID
> > >> > that would be used by task 0 of the connector, but this may be
> > >> confusing to
> > >> > users as it could indicate that the task is actually running.
> > Discussed
> > >> in
> > >> > [2].
> > >> >
> > >> > [1] -
> > >> 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1753

2023-04-12 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-875: First-class offsets support in Kafka Connect

2023-04-12 Thread Chris Egerton
Hi Yash,

Great, we can use the transactional ID for task zero for now 

As far as why: we'd need to fence out that producer in the event that tasks
for the connector are brought up while the alter offsets request is still
taking place, since we'd want to make sure that the offsets aren't altered
after tasks are brought up. I think it may be possible right now in
extremely niche circumstances where the worker servicing the alter offsets
request loses leadership of the cluster. More realistically, I think it
leaves room for tweaking the logic for handling these requests to not use a
five-second timeout in the future (we could potentially just
fire-and-forget the request and, if new tasks are started for the connector
in the meantime, trust that the request will get automatically fenced out
without doing any more work).

Cheers,

Chris

On Wed, Apr 12, 2023 at 1:01 PM Yash Mayya  wrote:

> Hi Chris and Greg,
>
> The current implementation does already use the transactional ID for task 0
> so no complaints from me. Although I'm not sure I follow the concerns w.r.t
> zombie fencing? In which cases would we need to fence out the transactional
> producer instantiated for altering offsets?
>
> Thanks,
> Yash
>
> On Wed, Apr 12, 2023 at 9:02 PM Chris Egerton  wrote:
>
> > Hi Greg,
> >
> > I hadn't considered the implications W/R/T zombie fencing. I agree that
> > using the transactional ID for task 0 is better in that case.
> >
> > Yash (who is implementing this part, cc'd), does this seem reasonable to
> > you?
> >
> > Cheers,
> >
> > Chris
> >
> > On Tue, Apr 11, 2023 at 3:23 PM Greg Harris  >
> > wrote:
> >
> >> Chris & Yash,
> >>
> >> 1. Since the global offsets topic does not have transactions on it
> >> already,
> >> I don't think adding transactions just for these reset operations would
> be
> >> an improvement. The transactional produce would not exclude other
> >> non-transactional producers, but hanging transactions on the global
> >> offsets
> >> topic would negatively impact the general cluster health. Your proposed
> >> strategy seems reasonable to me.
> >>
> >> 2. While it may be the connector performing the offset reset and not the
> >> task, I think it would be preferable for the connector to use task 0's
> >> task-id and 'impersonate' the task for the purpose of changing the
> >> offsets.
> >> I think the complication elsewhere (getting users to provide a new ACL,
> >> expanding fencing to also fence the connector transaction id, etc) is
> not
> >> practically worth it to change 1 string value in the logs.
> >> I would find a separate transaction ID beneficial if the connector could
> >> be
> >> given a different principal from the task, and be given distinct ACLs.
> >> However, I don't think this is possible or desirable, and so I don't
> think
> >> it's relevant right now. Let me know if there are any other ways that
> the
> >> connector transaction ID would be useful.
> >>
> >> Thanks for all the effort on this feature!
> >> Greg
> >>
> >> On Tue, Apr 11, 2023 at 7:52 AM Chris Egerton 
> >> wrote:
> >>
> >> > Hi all,
> >> >
> >> > A couple slight tweaks to the design have been proposed during
> >> > implementation and I'd like to report them here to make sure that
> >> they're
> >> > acceptable to all who previously voted for this KIP. I've updated the
> >> KIP
> >> > to include these changes but will be happy to revert and/or amend if
> >> there
> >> > are any concerns.
> >> >
> >> > 1. We would like to refrain from using a transaction when resetting
> >> source
> >> > connector offsets in the worker's global offsets topic when
> exactly-once
> >> > support is enabled. We would continue to use a transaction when
> >> resetting
> >> > offsets in the connector's offsets topic. Discussed in [1].
> >> >
> >> > 2. We would like to use a transactional ID of ${groupId}-${connector}
> to
> >> > alter/reset source connector offsets when exactly-once support is
> >> enabled,
> >> > where ${groupId} is the group ID of the Connect cluster and
> >> ${connector} is
> >> > the name of the connector. This is raised here because it would
> >> introduce
> >> > an additional ACL requirement for this API. A less-elegant alternative
> >> that
> >> > would obviate the additional ACL requirement is to use the
> >> transactional ID
> >> > that would be used by task 0 of the connector, but this may be
> >> confusing to
> >> > users as it could indicate that the task is actually running.
> Discussed
> >> in
> >> > [2].
> >> >
> >> > [1] -
> >> https://github.com/apache/kafka/pull/13465/#issuecomment-1486718538
> >> > [2] -
> >> https://github.com/apache/kafka/pull/13465/#discussion_r1159694956
> >> >
> >> > Cheers,
> >> >
> >> > Chris
> >> >
> >> > On Fri, Mar 3, 2023 at 10:22 AM Chris Egerton 
> wrote:
> >> >
> >> > > Hi all,
> >> > >
> >> > > Thanks for the votes! I'll cast a final +1 myself and close the vote
> >> out.
> >> > >
> >> > > This KIP passes with the following +1 votes (and no +0 or -1 votes):
> >> > >
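To make the fencing argument concrete, here is a rough sketch (not the actual Connect worker code) of how altered source offsets could be written with a transactional producer that reuses task 0's transactional ID; the ${groupId}-${connector}-0 format shown is an assumption for illustration only:
{code:java}
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class AlterOffsetsSketch {

    /**
     * Writes altered offsets for a source connector transactionally.
     * Reusing task 0's transactional ID means a later task startup fences this
     * producer out, so a stale alter-offsets request cannot clobber new offsets.
     */
    public static void writeAlteredOffsets(String bootstrapServers,
                                           String groupId,
                                           String connector,
                                           String offsetsTopic,
                                           Map<byte[], byte[]> serializedOffsets) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        // Assumed format for task 0's transactional ID, purely for illustration.
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, groupId + "-" + connector + "-0");

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            producer.initTransactions(); // fences any zombie with the same transactional ID
            producer.beginTransaction();
            for (Map.Entry<byte[], byte[]> entry : serializedOffsets.entrySet()) {
                producer.send(new ProducerRecord<>(offsetsTopic, entry.getKey(), entry.getValue()));
            }
            producer.commitTransaction();
        }
    }
}
{code}
initTransactions() is what provides the fencing discussed above: if tasks are started for the connector while an alter request is still in flight, the new task 0 producer bumps the epoch and the stale request fails instead of committing old offsets.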

Re: [VOTE] KIP-875: First-class offsets support in Kafka Connect

2023-04-12 Thread Yash Mayya
Hi Chris and Greg,

The current implementation does already use the transactional ID for task 0
so no complaints from me. Although I'm not sure I follow the concerns w.r.t
zombie fencing? In which cases would we need to fence out the transactional
producer instantiated for altering offsets?

Thanks,
Yash

On Wed, Apr 12, 2023 at 9:02 PM Chris Egerton  wrote:

> Hi Greg,
>
> I hadn't considered the implications W/R/T zombie fencing. I agree that
> using the transactional ID for task 0 is better in that case.
>
> Yash (who is implementing this part, cc'd), does this seem reasonable to
> you?
>
> Cheers,
>
> Chris
>
> On Tue, Apr 11, 2023 at 3:23 PM Greg Harris 
> wrote:
>
>> Chris & Yash,
>>
>> 1. Since the global offsets topic does not have transactions on it
>> already,
>> I don't think adding transactions just for these reset operations would be
>> an improvement. The transactional produce would not exclude other
>> non-transactional producers, but hanging transactions on the global
>> offsets
>> topic would negatively impact the general cluster health. Your proposed
>> strategy seems reasonable to me.
>>
>> 2. While it may be the connector performing the offset reset and not the
>> task, I think it would be preferable for the connector to use task 0's
>> task-id and 'impersonate' the task for the purpose of changing the
>> offsets.
>> I think the complication elsewhere (getting users to provide a new ACL,
>> expanding fencing to also fence the connector transaction id, etc) is not
>> practically worth it to change 1 string value in the logs.
>> I would find a separate transaction ID beneficial if the connector could
>> be
>> given a different principal from the task, and be given distinct ACLs.
>> However, I don't think this is possible or desirable, and so I don't think
>> it's relevant right now. Let me know if there are any other ways that the
>> connector transaction ID would be useful.
>>
>> Thanks for all the effort on this feature!
>> Greg
>>
>> On Tue, Apr 11, 2023 at 7:52 AM Chris Egerton 
>> wrote:
>>
>> > Hi all,
>> >
>> > A couple slight tweaks to the design have been proposed during
>> > implementation and I'd like to report them here to make sure that
>> they're
>> > acceptable to all who previously voted for this KIP. I've updated the
>> KIP
>> > to include these changes but will be happy to revert and/or amend if
>> there
>> > are any concerns.
>> >
>> > 1. We would like to refrain from using a transaction when resetting
>> source
>> > connector offsets in the worker's global offsets topic when exactly-once
>> > support is enabled. We would continue to use a transaction when
>> resetting
>> > offsets in the connector's offsets topic. Discussed in [1].
>> >
>> > 2. We would like to use a transactional ID of ${groupId}-${connector} to
>> > alter/reset source connector offsets when exactly-once support is
>> enabled,
>> > where ${groupId} is the group ID of the Connect cluster and
>> ${connector} is
>> > the name of the connector. This is raised here because it would
>> introduce
>> > an additional ACL requirement for this API. A less-elegant alternative
>> that
>> > would obviate the additional ACL requirement is to use the
>> transactional ID
>> > that would be used by task 0 of the connector, but this may be
>> confusing to
>> > users as it could indicate that the task is actually running. Discussed
>> in
>> > [2].
>> >
>> > [1] -
>> https://github.com/apache/kafka/pull/13465/#issuecomment-1486718538
>> > [2] -
>> https://github.com/apache/kafka/pull/13465/#discussion_r1159694956
>> >
>> > Cheers,
>> >
>> > Chris
>> >
>> > On Fri, Mar 3, 2023 at 10:22 AM Chris Egerton  wrote:
>> >
>> > > Hi all,
>> > >
>> > > Thanks for the votes! I'll cast a final +1 myself and close the vote
>> out.
>> > >
>> > > This KIP passes with the following +1 votes (and no +0 or -1 votes):
>> > >
>> > > • Greg Harris
>> > > • Yash Mayya
>> > > • Knowles Atchison Jr
>> > > • Mickael Maison (binding)
>> > > • Tom Bentley (binding)
>> > > • Josep Prat (binding)
>> > > • Chris Egerton (binding, author)
>> > >
>> > > I'll write up Jira tickets and begin implementing things next week.
>> > >
>> > > Cheers,
>> > >
>> > > Chris
>> > >
>> > > On Fri, Mar 3, 2023 at 10:07 AM Josep Prat
>> 
>> > > wrote:
>> > >
>> > >> Hi Chris,
>> > >>
>> > >> Thanks for the KIP. I have a non-blocking comment on the DISCUSS
>> thread.
>> > >>
>> > >> +1 (binding).
>> > >>
>> > >> Best,
>> > >>
>> > >> On Wed, Mar 1, 2023 at 12:16 PM Tom Bentley 
>> > wrote:
>> > >>
>> > >> > Hi Chris,
>> > >> >
>> > >> > Thanks for the KIP.
>> > >> >
>> > >> > +1 (binding).
>> > >> >
>> > >> > Cheers,
>> > >> >
>> > >> > Tom
>> > >> >
>> > >> > On Wed, 15 Feb 2023 at 16:11, Chris Egerton
>> 
>> > >> > wrote:
>> > >> >
>> > >> > > Hi all,
>> > >> > >
>> > >> > > Thanks to everyone who's voted so far! Just wanted to bump this
>> > thread
>> > >> > and
>> > >> > > see if we could get a few more votes; currently we're at +3
>> > >> non-binding
>> > >> > > 

Re: [DISCUSS] KIP-910: Update Source offsets for Source Connectors without producing records

2023-04-12 Thread Chris Egerton
Hi Sagar,

I'm sorry, I'm still not convinced that this design solves the problem(s)
it sets out to solve in the best way possible. I tried to highlight this in
my last email:

> In general, it seems like we're trying to solve two completely different
problems with this single KIP: adding framework-level support for emitting
heartbeat records for source connectors, and allowing source connectors to
emit offsets without also emitting source records. I don't mind addressing
the two at the same time if the result is elegant and doesn't compromise on
the solution for either problem, but that doesn't seem to be the case here.
Of the two problems, could we describe one as the primary and one as the
secondary? If so, we might consider dropping the secondary problem from
this KIP and addressing it separately.

If we wanted to add support for heartbeat records, we could (and IMO
should) do that without requiring connectors to implement any new methods
and only require adjustments to worker or connector configurations by users
in order to enable that feature.

If we want to add support for connectors to emit offsets without
accompanying source records, we could (and IMO should) do that without
requiring users to manually enable that feature by adjusting worker or
connector configurations.


I'm also not sure that it's worth preserving the current behavior that
offsets for records that have been filtered out via SMT are not committed.
I can't think of a case where this would be useful and there are obviously
plenty where it isn't. There's also a slight discrepancy in how these kinds
of records are treated by the Connect runtime now; if a record is dropped
because of an SMT, then its offset isn't committed, but if it's dropped
because exactly-once support is enabled and the connector chose to abort
the batch containing the record, then its offset is still committed. After
thinking carefully about the aborted transaction behavior, we realized that
it was fine to commit the offsets for those records, and I believe that the
same logic can be applied to any record that we're done trying to send to
Kafka (regardless of whether it was sent correctly, dropped due to producer
error, filtered via SMT, etc.).

I also find the file-based source connector example a little confusing.
What about that kind of connector causes the offset for the last record of
a file to be treated differently? Is there anything different about
filtering that record via SMT vs. dropping it altogether because of an
asynchronous producer error with "errors.tolerance" set to "all"? And
finally, how would such a connector use the design proposed here?

Finally, I don't disagree that if there are other legitimate use cases that
would be helped by addressing KAFKA-3821, we should try to solve that issue
in the Kafka Connect framework instead of requiring individual connectors
to implement their own solutions. But the cognitive load added by the
design proposed here, for connector developers and Connect cluster
administrators alike, costs too much to justify by pointing to an
already-solved problem encountered by a single group of connectors (i.e.,
Debezium). This is why I think it's crucial that we identify realistic
cases where this feature would actually be useful, and right now, I don't
think any have been provided (at least, not ones that have already been
addressed or could be addressed with much simpler changes).

Cheers,

Chris

On Tue, Apr 11, 2023 at 7:30 AM Sagar  wrote:

> Hi Chris,
>
> Thanks for your detailed feedback!
>
> nits: I have taken care of them now. Thanks for pointing those out.
>
> non-nits:
>
> 6) It seems (based on both the KIP and discussion on KAFKA-3821) that the
> > only use case for being able to emit offsets without also emitting source
> > records that's been identified so far is for CDC source connectors like
> > Debezium.
>
>
> I am aware of at least one more case where the non-production of offsets
> (due to the non-production of records) leads to the failure of connectors when
> the source purges the records of interest. This happens in file-based
> source connectors (like S3/blob storage) in which, if the last record from
> a file is filtered due to an SMT, then that particular file is never
> committed to the source partition, and eventually, when the file is deleted
> from the source and the connector is restarted for some reason, it
> fails.
> Moreover, I feel the reason this support should be in the Kafka
> Connect framework is that this is a restriction of the framework, and
> today the framework provides no support for getting around this limitation.
> Every connector has its own way of handling offsets, and having each
> connector handle this restriction in its own way can make it complex.
> Whether we choose to do it the way this KIP prescribes or any other way is
> up for debate, but IMHO the framework should provide a way of
> getting around this limitation.
>
> 7. If a task produces heartbeat records 
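To make the filtered-record scenario concrete, a minimal sketch of an SMT that drops records; the class is hypothetical, and Connect's built-in Filter transformation with a predicate achieves the same effect. A record dropped this way never reaches the producer, so under the current framework behaviour its source offset is not committed:
{code:java}
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.transforms.Transformation;

/**
 * Sketch of an SMT that drops records with a null or empty value. If the
 * dropped record happens to be the last one read from a file, that file's
 * source offset is never committed under the current framework behaviour,
 * which is the failure mode described for file-based connectors.
 */
public class DropEmptyValues implements Transformation<SourceRecord> {

    @Override
    public SourceRecord apply(SourceRecord record) {
        Object value = record.value();
        if (value == null || value.toString().isEmpty()) {
            return null; // returning null drops the record; its offset is not committed today
        }
        return record;
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // no configuration needed for this sketch
    }

    @Override
    public void close() {
        // nothing to clean up
    }
}
{code}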

Re: [VOTE] KIP-902: Upgrade Zookeeper to 3.8.1

2023-04-12 Thread Christo Lolov
Hello Colin,

Thank you for the response!

1. I have attached the compatibility matrix in the KIP under the section
Compatibility, Deprecation, and Migration Plan.
2. I believe the answer to how many bridge releases (for Kafka) will be
needed to upgrade from 2.0 to 4.0 based on the compatibility matrix is 2 -
one from 2.0 to any of 2.4.x to 3.5.x (now that we are no longer
considering this KIP for 3.5.x) and one from that version to the bridge
release mentioned in KIP-500 and KIP-866 (assuming that bridge release has
a dependency on Zookeeper 3.8.1).
3. What determines whether you need your Zookeeper cluster to first be
upgraded to 3.4.6 is "Upgrading a running ZooKeeper ensemble to 3.5.0
should be done only after upgrading your ensemble to the 3.4.6 release."
(source:
https://zookeeper.apache.org/doc/r3.8.1/zookeeperReconfig.html#ch_reconfig_upgrade).
Continuing from the example in point 2, since Kafka 2.0 had Zookeeper
3.4.13 no second bridge upgrade for Zookeeper is needed. To clarify, you
would go from Zookeeper 3.4.13 to any between 3.5.x and 3.6.x and then to
3.8.1.
4. Ideally, if users make an error, then because they will be carrying out a
rolling restart of their Zookeeper cluster, the errors should start
appearing with the first Zookeeper instance that is rebooted; so if they
have sufficient monitoring in place, they should be able to catch it before
it takes down their whole Kafka cluster. To be honest, I have never had to
downgrade a Zookeeper cluster, but I suspect the procedure is the same as
upgrading but in reverse, i.e. stop the new binary, remove the new binary,
put the old binary back, and start the old binary.
5. Fair point, I meant to say that Zookeeper will no longer be a thing when
Kafka 4.0 arrives.
6. I believe Ismael's response answers your last concern better than I
could.

Best,
Christo

On Mon, 10 Apr 2023 at 00:53, Colin McCabe  wrote:

> On Wed, Mar 15, 2023, at 04:58, Christo Lolov wrote:
> > Hello Colin,
> >
> > Thank you for taking the time to review the proposal!
> >
> > I have attached a compatibility matrix to aid the explanation below - if
> the mailing system rejects it I will find another way to share it.
>
> Hi Christo,
>
> The mailing list doesn't take attachments. So perhaps you could share this
> in textual form?
>
> > For the avoidance of doubt, I am not proposing to drop support for
> rolling upgrade from old Kafka versions to new ones. What I am saying is
> that additional care will need to be taken when upgrading to the latest
> Kafka versions depending on the version of the accompanying Zookeeper
> cluster. This additional care means one might have to upgrade to a Kafka
> version which falls in the intersection of the two sets in the accompanying
> diagram before upgrading the accompanying Zookeeper cluster.
>
> I think we are talking about the same thing, just using different
> terminology. If you have to go through multiple upgrades to get from
> version X to version Y, I would not say there is "support for rolling
> upgrade from X to Y." In particular if you have to go through some other
> version B, I would say that B is the "bridge release."
>
> This is different than having an "upgrade path" -- I think everyone agrees
> that there should be an upgrade path between any two kafka versions (well,
> ones that are 0.8 or newer, at least).
>
> So I'd like to understand what the bridge release would be for this kind
> of change, and how many "hops" would be required to get from, say, 2.0 to
> 4.0. Keeping in mind that 4.0 won't have ZK at all.
>
> > As a concrete example let's say you want to upgrade to Kafka 3.5 from
> Kafka 2.3 and Zookeeper 3.4. You will have to:
> > 1. Carry out a rolling upgrade of your Kafka cluster to a version
> between 2.4 and 3.4.
> > 2. Carry out a rolling upgrade of your Zookeeper cluster to 3.8.1 (with
> a possible stop at 3.4.6 due to
> https://zookeeper.apache.org/doc/r3.8.1/zookeeperReconfig.html#ch_reconfig_upgrade
> ).
>
> Hmm, what determines whether I have to make the stop or not?
>
> One thing we haven't discussed in this thread is that a lot of users don't
> upgrade ZK when they do a Kafka upgrade. So I'd also like to understand in
> what situations ZK upgrades would be required as part of Kafka upgrades, if
> we bump this version. Also, what will happen if they forget? I assume the
> cluster would be down for a while. Does ZK have a downgrade procedure?
>
> > 3. Carry out a rolling upgrade of your Kafka cluster from 3.4 to 3.5.
> >
> > It is true that Zookeeper is to be deprecated in Kafka 4.0, but as far
> as I looked there is no concrete release date for that version yet.
>
> ZK is not going to be deprecated in AK 4.0, but removed in 4.0.
>
> >
> > Until this is the case and unless we carry out a Zookeeper version
> upgrade we leave users to run on an end-of-life version with unpatched CVEs
> addressed in later versions.
> >
> > Some users have compliance requirements to only run on stable versions
> of a software and its dependencies 

Re: [VOTE] KIP-875: First-class offsets support in Kafka Connect

2023-04-12 Thread Chris Egerton
Hi Greg,

I hadn't considered the implications W/R/T zombie fencing. I agree that
using the transactional ID for task 0 is better in that case.

Yash (who is implementing this part, cc'd), does this seem reasonable to
you?

Cheers,

Chris

On Tue, Apr 11, 2023 at 3:23 PM Greg Harris 
wrote:

> Chris & Yash,
>
> 1. Since the global offsets topic does not have transactions on it already,
> I don't think adding transactions just for these reset operations would be
> an improvement. The transactional produce would not exclude other
> non-transactional producers, but hanging transactions on the global offsets
> topic would negatively impact the general cluster health. Your proposed
> strategy seems reasonable to me.
>
> 2. While it may be the connector performing the offset reset and not the
> task, I think it would be preferable for the connector to use task 0's
> task-id and 'impersonate' the task for the purpose of changing the offsets.
> I think the complication elsewhere (getting users to provide a new ACL,
> expanding fencing to also fence the connector transaction id, etc) is not
> practically worth it to change 1 string value in the logs.
> I would find a separate transaction ID beneficial if the connector could be
> given a different principal from the task, and be given distinct ACLs.
> However, I don't think this is possible or desirable, and so I don't think
> it's relevant right now. Let me know if there are any other ways that the
> connector transaction ID would be useful.
>
> Thanks for all the effort on this feature!
> Greg
>
> On Tue, Apr 11, 2023 at 7:52 AM Chris Egerton 
> wrote:
>
> > Hi all,
> >
> > A couple slight tweaks to the design have been proposed during
> > implementation and I'd like to report them here to make sure that they're
> > acceptable to all who previously voted for this KIP. I've updated the KIP
> > to include these changes but will be happy to revert and/or amend if
> there
> > are any concerns.
> >
> > 1. We would like to refrain from using a transaction when resetting
> source
> > connector offsets in the worker's global offsets topic when exactly-once
> > support is enabled. We would continue to use a transaction when resetting
> > offsets in the connector's offsets topic. Discussed in [1].
> >
> > 2. We would like to use a transactional ID of ${groupId}-${connector} to
> > alter/reset source connector offsets when exactly-once support is
> enabled,
> > where ${groupId} is the group ID of the Connect cluster and ${connector}
> is
> > the name of the connector. This is raised here because it would introduce
> > an additional ACL requirement for this API. A less-elegant alternative
> that
> > would obviate the additional ACL requirement is to use the transactional
> ID
> > that would be used by task 0 of the connector, but this may be confusing
> to
> > users as it could indicate that the task is actually running. Discussed
> in
> > [2].
> >
> > [1] -
> https://github.com/apache/kafka/pull/13465/#issuecomment-1486718538
> > [2] - https://github.com/apache/kafka/pull/13465/#discussion_r1159694956
> >
> > Cheers,
> >
> > Chris
> >
> > On Fri, Mar 3, 2023 at 10:22 AM Chris Egerton  wrote:
> >
> > > Hi all,
> > >
> > > Thanks for the votes! I'll cast a final +1 myself and close the vote
> out.
> > >
> > > This KIP passes with the following +1 votes (and no +0 or -1 votes):
> > >
> > > • Greg Harris
> > > • Yash Mayya
> > > • Knowles Atchison Jr
> > > • Mickael Maison (binding)
> > > • Tom Bentley (binding)
> > > • Josep Prat (binding)
> > > • Chris Egerton (binding, author)
> > >
> > > I'll write up Jira tickets and begin implementing things next week.
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Fri, Mar 3, 2023 at 10:07 AM Josep Prat  >
> > > wrote:
> > >
> > >> Hi Chris,
> > >>
> > >> Thanks for the KIP. I have a non-blocking comment on the DISCUSS
> thread.
> > >>
> > >> +1 (binding).
> > >>
> > >> Best,
> > >>
> > >> On Wed, Mar 1, 2023 at 12:16 PM Tom Bentley 
> > wrote:
> > >>
> > >> > Hi Chris,
> > >> >
> > >> > Thanks for the KIP.
> > >> >
> > >> > +1 (binding).
> > >> >
> > >> > Cheers,
> > >> >
> > >> > Tom
> > >> >
> > >> > On Wed, 15 Feb 2023 at 16:11, Chris Egerton  >
> > >> > wrote:
> > >> >
> > >> > > Hi all,
> > >> > >
> > >> > > Thanks to everyone who's voted so far! Just wanted to bump this
> > thread
> > >> > and
> > >> > > see if we could get a few more votes; currently we're at +3
> > >> non-binding
> > >> > > and +1 binding. Hoping we can get this approved, reviewed, and
> > merged
> > >> in
> > >> > > time for 3.5.0.
> > >> > >
> > >> > > Cheers,
> > >> > >
> > >> > > Chris
> > >> > >
> > >> > > On Tue, Jan 31, 2023 at 2:52 AM Mickael Maison <
> > >> mickael.mai...@gmail.com
> > >> > >
> > >> > > wrote:
> > >> > >
> > >> > > > Thanks Chris for the KIP, this is a much needed feature!
> > >> > > >
> > >> > > > +1 (binding)
> > >> > > >
> > >> > > >
> > >> > > > On Tue, Jan 24, 2023 at 3:45 PM Knowles Atchison Jr
> > >> > > >  wrote:
> > >> > 

[jira] [Resolved] (KAFKA-14890) Kafka initiates shutdown due to connectivity problem with Zookeeper and FatalExitError from ChangeNotificationProcessorThread

2023-04-12 Thread Ron Dagostino (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Dagostino resolved KAFKA-14890.
---
Resolution: Duplicate

Duplicate of https://issues.apache.org/jira/browse/KAFKA-14887

> Kafka initiates shutdown due to connectivity problem with Zookeeper and 
> FatalExitError from ChangeNotificationProcessorThread
> -
>
> Key: KAFKA-14890
> URL: https://issues.apache.org/jira/browse/KAFKA-14890
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Denis Razuvaev
>Priority: Major
>
> Hello, 
> We have faced the deadlock in Kafka several times; a similar issue is 
> https://issues.apache.org/jira/browse/KAFKA-13544 
> The question: is it expected behavior that Kafka decides to shut down due to 
> connectivity problems with Zookeeper? It seems related to the inability to 
> read data from the */feature* Zk node and the 
> _ZooKeeperClientExpiredException_ thrown from the _ZooKeeperClient_ class. 
> This exception is caught only in the catch block of the _doWork()_ method in 
> _ChangeNotificationProcessorThread_, and it leads to a _FatalExitError_. 
> This shutdown problem is reproducible in the new versions of Kafka (which 
> already have the fix for the deadlock from KAFKA-13544). 
> It is hard to write a synthetic test to reproduce the problem, but it can be 
> reproduced locally in debug mode with the following steps: 
> 1) Start Zookeeper and start Kafka in debug mode. 
> 2) Emulate a connectivity problem between Kafka and Zookeeper; for example, 
> the connection can be closed via the Netcrusher library. 
> 3) Put a breakpoint in _updateLatestOrThrow()_ method in 
> _FeatureCacheUpdater_ class, before 
> _zkClient.getDataAndVersion(featureZkNodePath)_ line execution. 
> 4) Restore connection between Kafka and Zookeeper after session expiration. 
> Kafka execution should be stopped on the breakpoint.
> 5) Resume execution until Kafka starts to execute line 
> _zooKeeperClient.handleRequests(remainingRequests)_ in 
> _retryRequestsUntilConnected_ method in _KafkaZkClient_ class. 
> 6) Again emulate a connectivity problem between Kafka and Zookeeper and wait 
> until the session expires. 
> 7) Restore the connection between Kafka and Zookeeper. 
> 8) Kafka begins the shutdown process, due to: 
> _ERROR [feature-zk-node-event-process-thread]: Failed to process feature ZK 
> node change event. The broker will eventually exit. 
> (kafka.server.FinalizedFeatureChangeListener$ChangeNotificationProcessorThread)_
>  
> In a real environment this can be caused by network problems with periodic 
> disconnection and reconnection to Zookeeper within a short time period. 
> I started a mail thread at 
> [https://lists.apache.org/thread/gbk4scwd8g7mg2tfsokzj5tjgrjrb9dw] regarding 
> this problem, but have had no answers.
> To me this seems like a defect, because Kafka initiates shutdown after the 
> connection between Kafka and Zookeeper has been restored, and it should be 
> fixed. 
> Thank you.
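
To illustrate the failure mode described above, here is a schematic (not
Kafka's actual source) of how an exception escaping the listener thread's
doWork() gets escalated to a fatal exit even though connectivity has been
restored; the class and method names below are simplified stand-ins:

    // Schematic of the reported behaviour: the feature-ZK-node listener reads the
    // /feature znode, the read fails because the ZooKeeper session expired, and the
    // catch-all treats the failure as fatal (a stand-in for FatalExitError below),
    // so the broker shuts down instead of retrying once the connection comes back.
    public class ChangeNotificationThreadSketch {

        interface FeatureZNodeReader {
            byte[] read() throws Exception; // may throw a session-expired exception
        }

        static void doWork(FeatureZNodeReader reader) {
            try {
                reader.read();
            } catch (Exception e) {
                // In the scenario above this path is taken even though the session has
                // since been re-established, and it triggers broker shutdown.
                throw new IllegalStateException(
                        "Failed to process feature ZK node change event. The broker will eventually exit.", e);
            }
        }
    }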



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1752

2023-04-12 Thread Apache Jenkins Server
See 




Zookeeper upgrade 3.4.14 to 3.5.7

2023-04-12 Thread Kafka Life
Hi Kafka and ZooKeeper experts,

Is it possible to upgrade a 3.4.14 ZooKeeper cluster in a rolling fashion
(one node at a time) to ZooKeeper 3.5.7? Would the cluster work with a mix of
3.4.14 and 3.5.7 nodes during the upgrade? Please advise.


Re: [DISCUSS] Apache Kafka 3.5.0 release

2023-04-12 Thread Luke Chen
Hi Mickael,

I'd like to ask for a few more days for the KIP-405 tiered storage PRs to be
included in v3.5.
Currently, we have 1 PR under review (
https://github.com/apache/kafka/pull/13535), and 1 more PR will soon be opened
for review.
After these 2 PRs are merged, we can have an "Early Access" version of the
tiered storage feature that allows users to try it in non-production
environments.
Does that work for you?

Thank you.
Luke

On Thu, Apr 6, 2023 at 2:49 AM Jeff Kim 
wrote:

> Hi Mickael,
>
> Thank you.
>
> Best,
> Jeff
>
> On Wed, Apr 5, 2023 at 1:28 PM Mickael Maison 
> wrote:
>
> > Hi Jeff,
> >
> > Ok, I've added KIP-915 to the release plan.
> >
> > Thanks,
> > Mickael
> >
> > On Wed, Apr 5, 2023 at 6:48 PM Jeff Kim 
> > wrote:
> > >
> > > Hi Mickael,
> > >
> > > I would like to bring up that KIP-915 proposes to patch 3.5
> > > although it missed the KIP freeze date. If the patch is done before the
> > > feature freeze date, 4/13, would this be acceptable? If so, should this
> > > be added to the 3.5.0 Release Plan wiki?
> > >
> > > Best,
> > > Jeff
> > >
> > > On Mon, Mar 27, 2023 at 1:02 PM Greg Harris
>  > >
> > > wrote:
> > >
> > > > Mickael,
> > > >
> > > > Just wanted to let you know that I will not be including KIP-898 in
> the
> > > > 3.5.0 release.
> > > > I think the change needed is not reviewable before the feature freeze
> > > > deadline, and would take resources away from other more necessary
> > changes.
> > > >
> > > > Thanks!
> > > > Greg
> > > >
> > > > On Thu, Mar 23, 2023 at 9:01 AM Chia-Ping Tsai 
> > wrote:
> > > >
> > > > > > If you have a KIP that is accepted, make sure it is listed in
> > > > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.5.0
> > > > > > and that its status is accurate.
> > > > >
> > > > > Thanks for the reminder. Have added KIP-641 to the list.
> > > > >
> > > > > Thanks,
> > > > > Chia-Ping
> > > > >
> > > > > > Mickael Maison  wrote on 2023-03-23 at 11:51 PM:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > KIP Freeze was yesterday. The next milestone is feature freeze on
> > April
> > > > > 12.
> > > > > > If you have a KIP that is accepted, make sure it is listed in
> > > > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.5.0
> > > > > > and that its status is accurate.
> > > > > >
> > > > > > Thanks,
> > > > > > Mickael
> > > > > >
> > > > > > On Fri, Mar 17, 2023 at 6:22 PM Christo Lolov <
> > christolo...@gmail.com>
> > > > > wrote:
> > > > > >>
> > > > > >> Hello!
> > > > > >>
> > > > > >> What would you suggest as the best way to get more eyes on
> > > > > >> KIP-902, as I would like it to be included in 3.5.0?
> > > > > >>
> > > > > >> Best,
> > > > > >> Christo
> > > > > >>
> > > > > >>> On 16 Mar 2023, at 10:33, Mickael Maison <
> > mickael.mai...@gmail.com>
> > > > > wrote:
> > > > > >>>
> > > > > >>> Hi,
> > > > > >>>
> > > > > >>> This is a reminder that KIP freeze is less than a week away (22
> > Mar).
> > > > > >>> For a KIP to be considered for this release, it must be voted
> and
> > > > > >>> accepted by that date.
> > > > > >>>
> > > > > >>> Feature freeze will be 3 weeks after this, so if you want KIPs
> or
> > > > > >>> other significant changes in the release, please get them ready
> > soon.
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>> Mickael
> > > > > >>>
> > > > >  On Tue, Feb 14, 2023 at 10:44 PM Ismael Juma <
> ism...@juma.me.uk
> > >
> > > > > wrote:
> > > > > 
> > > > >  Thanks!
> > > > > 
> > > > >  Ismael
> > > > > 
> > > > >  On Tue, Feb 14, 2023 at 1:07 PM Mickael Maison <
> > > > > mickael.mai...@gmail.com>
> > > > >  wrote:
> > > > > 
> > > > > > Hi Ismael,
> > > > > >
> > > > > > Good call. I shifted all dates by 2 weeks and moved them to
> > > > > Wednesdays.
> > > > > >
> > > > > > Thanks,
> > > > > > Mickael
> > > > > >
> > > > > > On Tue, Feb 14, 2023 at 6:01 PM Ismael Juma <
> ism...@juma.me.uk
> > >
> > > > > wrote:
> > > > > >>
> > > > > >> Thanks Mickael. A couple of notes:
> > > > > >>
> > > > > >> 1. We typically choose a Wednesday for the various freeze
> > dates -
> > > > > there
> > > > > > are
> > > > > >> often 1-2 day slips and it's better if that doesn't require
> > people
> > > > > >> working through the weekend.
> > > > > >> 2. Looks like we're over a month later compared to the
> > equivalent
> > > > > release
> > > > > >> last year (
> > > > > >>
> > > > >
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.2.0).
> > I
> > > > > >> understand that some of it is due to 3.4.0 slipping, but I
> > wonder
> > > > > if we
> > > > > >> could perhaps aim for the KIP freeze to be one or two weeks
> > > > earlier.
> > > > > >>
> > > > > >> Ismael
> > > > > >>
> > > > > >> On Tue, Feb 14, 2023 at 8:00 AM Mickael Maison <
> > > > > mickael.mai...@gmail.com
> > > > > >>
> > > > > >> wrote:
> > > > > >>
> > > > > 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1751

2023-04-12 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-895: Dynamically refresh partition count of __consumer_offsets

2023-04-12 Thread Alexandre Dupriez
Hi Divij,

Thanks for the follow-up. A few comments/questions.

100. The stated motivation to increase the number of partitions above
50 is scale. Are we sure that 50 partitions is not enough to cover all
valid use cases? An upper bound in the range 1 to 10 MB/s of ingress
per partition gives 50 to 500 MB/s. Assuming 100 bytes per offset-and-metadata
record, this gives between 500,000 and 5,000,000 offsets committed per second.
Assuming 10,000 consumers active on the cluster, this would allow a rate of 50
to 500 offsets committed per second per consumer. Are there really use cases
where there is a genuine need for more? Arguably, this does not include group
metadata records, which are generated at a low frequency.
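
Spelling out the arithmetic above as a quick check (assuming, as stated, 50
partitions, 1 to 10 MB/s of ingress per partition, roughly 100 bytes per
record, and 10,000 active consumers):

    // Back-of-envelope check of the figures quoted above.
    public class OffsetsTopicThroughputCheck {
        public static void main(String[] args) {
            int partitions = 50;
            int bytesPerRecord = 100;   // assumed size of an offset commit record
            int activeConsumers = 10_000;

            for (double perPartitionMBps : new double[] {1.0, 10.0}) {
                double topicMBps = partitions * perPartitionMBps;               // 50 .. 500 MB/s
                double offsetsPerSec = topicMBps * 1_000_000 / bytesPerRecord;  // 500k .. 5M offsets/s
                double perConsumerPerSec = offsetsPerSec / activeConsumers;     // 50 .. 500 commits/s
                System.out.printf("%.0f MB/s/partition -> %.0f MB/s total, %.0f offsets/s, %.0f per consumer/s%n",
                        perPartitionMBps, topicMBps, offsetsPerSec, perConsumerPerSec);
            }
        }
    }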

101. The partitioning scheme applied for consumer offsets is also used
in other parts such as the already mentioned transaction metadata or
remote log metadata for the topic-based remote log metadata manager
[1]. Have we considered a holistic approach for all these internal
topics?
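
For context, the routing shared by these internal topics is essentially a hash
of the key (group id, transactional id, or topic-partition) modulo the topic's
partition count; a simplified sketch, not the broker's exact code:

    // Simplified sketch of the hash-modulo routing shared by these internal topics:
    // the leader of the partition a key maps to acts as the coordinator for that key,
    // which is why changing the partition count remaps coordinators.
    public class InternalTopicRoutingSketch {
        static int partitionFor(String key, int internalTopicPartitionCount) {
            return (key.hashCode() & 0x7fffffff) % internalTopicPartitionCount;
        }

        public static void main(String[] args) {
            // With the default of 50 __consumer_offsets partitions:
            System.out.println(partitionFor("my-consumer-group", 50));
        }
    }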

Overall, I am not sure if changing the number of partitions for the
consumer offsets topic should even be allowed unless there is evidence
of it being required to accommodate throughput. Reassignment can be
required after cluster expansion, but that is correctly supported
IIRC.

Thanks,
Alexandre

[1] 
https://github.com/Hangleton/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/server/log/remote/metadata/storage/RemoteLogMetadataTopicPartitioner.java#L37

On Thu, Apr 6, 2023 at 16:01, hzh0425  wrote:
>
> I think it's a good idea as we may want to store segments in different buckets
>
>
>
> Email: hzhka...@163.com
>
>
>
>
>  Original message 
> | From | Divij Vaidya |
> | Date | 2023-04-04 23:56 |
> | To | dev@kafka.apache.org |
> | Cc | |
> | Subject | Re: [DISCUSS] KIP-895: Dynamically refresh partition count of 
> __consumer_offsets |
> FYI, a user faced this problem and reached out to us in the mailing list
> [1]. Implementation of this KIP could have reduced the downtime for these
> customers.
>
> Christo, would you like to create a JIRA and associate with the KIP so that
> we can continue to collect cases in the JIRA where users have faced this
> problem?
>
> [1] https://lists.apache.org/thread/zoowjshvdpkh5p0p7vqjd9fq8xvkr1nd
>
> --
> Divij Vaidya
>
>
>
> On Wed, Jan 18, 2023 at 9:52 AM Christo Lolov 
> wrote:
>
> > Greetings,
> >
> > I am bumping the below DISCUSS thread for KIP-895. The KIP presents a
> > situation where consumer groups are in an undefined state until a rolling
> > restart of the cluster is performed. While I have demonstrated the behaviour
> > using a Zookeeper-based cluster, I believe the same problem can be shown in
> > a KRaft cluster.
> > presented solution.
> >
> > Best,
> > Christo
> >
> > On Thursday, 29 December 2022 at 14:19:27 GMT, Christo
> > >  wrote:
> > >
> > >
> > > Hello!
> > > I would like to start this discussion thread on KIP-895: Dynamically
> > > refresh partition count of __consumer_offsets.
> > > The KIP proposes to alter brokers so that they refresh the partition
> > count
> > > of __consumer_offsets used to determine group coordinators without
> > > requiring a rolling restart of the cluster.
> > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-895%3A+Dynamically+refresh+partition+count+of+__consumer_offsets
> > >
> > > Let me know your thoughts on the matter!
> > > Best, Christo
> > >
> >


[jira] [Resolved] (KAFKA-14889) RemoteLogManager - allow consumer fetch records from remote storage implementation

2023-04-12 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-14889.
---
Resolution: Duplicate

> RemoteLogManager - allow consumer fetch records from remote storage 
> implementation 
> ---
>
> Key: KAFKA-14889
> URL: https://issues.apache.org/jira/browse/KAFKA-14889
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: Satish Duggana
>Priority: Major
>
> Implementation of RLM as mentioned in the HLD section of KIP-405; this JIRA 
> covers enabling consumers to fetch records from remote storage.
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP-405:KafkaTieredStorage-High-leveldesign]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-14891) Fix rack-aware range assignor to improve rack-awareness with co-partitioning

2023-04-12 Thread Rajini Sivaram (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-14891.

Fix Version/s: 3.5.0
 Reviewer: David Jacot
   Resolution: Fixed

> Fix rack-aware range assignor to improve rack-awareness with co-partitioning
> 
>
> Key: KAFKA-14891
> URL: https://issues.apache.org/jira/browse/KAFKA-14891
> Project: Kafka
>  Issue Type: Bug
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 3.5.0
>
>
> We currently check all states for rack-aware assignment with co-partitioning 
> (https://github.com/apache/kafka/blob/396536bb5aa1ba78c71ea824d736640b615bda8a/clients/src/main/java/org/apache/kafka/clients/consumer/RangeAssignor.java#L176).
> We should check each group of co-partitioned states separately so that we 
> can use rack-aware assignment with co-partitioning for subsets of topics.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1750

2023-04-12 Thread Apache Jenkins Server
See