[
https://issues.apache.org/jira/browse/KAFKA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17861213#comment-17861213
]
Vinicius Vieira dos Santos commented on KAFKA-16986:
----------------------------------------------------
[~jolshan] We managed to make this change, but our development broker has around
2,500 partitions and a large number of topics and consumers, so the log became very
noisy and I was only able to leave it enabled for a few hours to avoid overwhelming
our log collection tools. Unfortunately, during that window I couldn't catch any
application exhibiting the behavior I reported, so I'll have to spin up an isolated
local Kafka to evaluate it, and that could take a while :/
I was able to verify that the logs are also present in our staging (approval) and
production environments, i.e. Kafka installations completely isolated from each
other all show the same log I reported, in the same scenario: for the same topic
and partition, the message "associated topicId changed from null to xxxx" appears
multiple times.
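For the isolated local test, a minimal sketch along these lines should be enough to keep a producer alive across several metadata refreshes and surface the Metadata log line quickly; the broker address, the "PAYMENTS" topic name and the short metadata.max.age.ms are assumptions for the local setup, not our real values:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class TopicIdLogRepro {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed local broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        // Refresh metadata every 5 s so the "Resetting the last seen epoch ..." line,
        // if it is going to appear at all, shows up within seconds instead of minutes.
        props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, 5000);

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 60; i++) {
                producer.send(new ProducerRecord<>("PAYMENTS", ("msg-" + i).getBytes()));
                producer.flush();
                Thread.sleep(1_000); // keep the client alive across several refresh cycles
            }
        }
    }
}
{code}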
> After upgrading to Kafka 3.4.1, the producer constantly produces logs related
> to topicId changes
> ------------------------------------------------------------------------------------------------
>
> Key: KAFKA-16986
> URL: https://issues.apache.org/jira/browse/KAFKA-16986
> Project: Kafka
> Issue Type: Bug
> Components: clients, producer
> Affects Versions: 3.0.1, 3.6.1
> Reporter: Vinicius Vieira dos Santos
> Priority: Minor
> Attachments: image-2024-07-01-09-05-17-147.png, image.png
>
>
> After upgrading the Kafka broker from version 2.7.0 to 3.4.1, we noticed that
> the applications began logging the message "{*}Resetting the last seen epoch
> of partition PAYMENTS-0 to 0 since the associated topicId changed from null
> to szRLmiAiTs8Y0nI8b3Wz1Q{*}" very frequently. From what I understand this
> behavior is not expected, because the topic was not deleted and recreated, so
> the client should simply use the cached data and never hit this log line.
> We have applications with around 15 topics of 40 partitions each, which comes
> to roughly 600 of these log lines every time a metadata update occurs.
> The main thing for me is to know whether this could indicate a problem, or
> whether I can simply raise the log level of the org.apache.kafka.clients.Metadata
> class to WARN without worrying.
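> In case raising the level turns out to be safe, a minimal sketch assuming the
> application logs through log4j2 (other backends would need the equivalent logger
> configuration); the class name is exactly the one from the log line above:
>
> {code:java}
> import org.apache.logging.log4j.Level;
> import org.apache.logging.log4j.core.config.Configurator;
>
> // Silence the per-partition epoch-reset messages, which org.apache.kafka.clients.Metadata
> // emits at INFO, while leaving the rest of the client loggers untouched.
> Configurator.setLevel("org.apache.kafka.clients.Metadata", Level.WARN);
> {code}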
>
> There are other reports of the same behavior, for example:
> [https://stackoverflow.com/questions/74652231/apache-kafka-resetting-the-last-seen-epoch-of-partition-why]
>
> *Some log occurrences over an interval of about 7 hours; each block refers to
> one instance of the application running in Kubernetes*
>
> !image.png!
> *My scenario:*
> *Application:*
> - Java: 21
> - Client: 3.6.1 (also tested on 3.0.1, with the same behavior)
> *Broker:*
> - Cluster running on Kubernetes with the bitnami/kafka:3.4.1-debian-11-r52 image
>
> *Producer Config*
>
> acks = -1
> auto.include.jmx.reporter = true
> batch.size = 16384
> bootstrap.servers = [server:9092]
> buffer.memory = 33554432
> client.dns.lookup = use_all_dns_ips
> client.id = producer-1
> compression.type = gzip
> connections.max.idle.ms = 540000
> delivery.timeout.ms = 30000
> enable.idempotence = true
> interceptor.classes = []
> key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
> linger.ms = 0
> max.block.ms = 60000
> max.in.flight.requests.per.connection = 1
> max.request.size = 1048576
> metadata.max.age.ms = 300000
> metadata.max.idle.ms = 300000
> metric.reporters = []
> metrics.num.samples = 2
> metrics.recording.level = INFO
> metrics.sample.window.ms = 30000
> partitioner.adaptive.partitioning.enable = true
> partitioner.availability.timeout.ms = 0
> partitioner.class = null
> partitioner.ignore.keys = false
> receive.buffer.bytes = 32768
> reconnect.backoff.max.ms = 1000
> reconnect.backoff.ms = 50
> request.timeout.ms = 30000
> retries = 3
> retry.backoff.ms = 100
> sasl.client.callback.handler.class = null
> sasl.jaas.config = [hidden]
> sasl.kerberos.kinit.cmd = /usr/bin/kinit
> sasl.kerberos.min.time.before.relogin = 60000
> sasl.kerberos.service.name = null
> sasl.kerberos.ticket.renew.jitter = 0.05
> sasl.kerberos.ticket.renew.window.factor = 0.8
> sasl.login.callback.handler.class = null
> sasl.login.class = null
> sasl.login.connect.timeout.ms = null
> sasl.login.read.timeout.ms = null
> sasl.login.refresh.buffer.seconds = 300
> sasl.login.refresh.min.period.seconds = 60
> sasl.login.refresh.window.factor = 0.8
> sasl.login.refresh.window.jitter = 0.05
> sasl.login.retry.backoff.max.ms = 10000
> sasl.login.retry.backoff.ms = 100
> sasl.mechanism = PLAIN
> sasl.oauthbearer.clock.skew.seconds = 30
> sasl.oauthbearer.expected.audience = null
> sasl.oauthbearer.expected.issuer = null
> sasl.oauthbearer.jwks.endpoint.refresh.ms = 3600000
> sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 10000
> sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100
> sasl.oauthbearer.jwks.endpoint.url = null
> sasl.oauthbearer.scope.claim.name = scope
> sasl.oauthbearer.sub.claim.name = sub
> sasl.oauthbearer.token.endpoint.url = null
> security.protocol = SASL_PLAINTEXT
> security.providers = null
> send.buffer.bytes = 131072
> socket.connection.setup.timeout.max.ms = 30000
> socket.connection.setup.timeout.ms = 10000
> ssl.cipher.suites = null
> ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
> ssl.endpoint.identification.algorithm = https
> ssl.engine.factory.class = null
> ssl.key.password = null
> ssl.keymanager.algorithm = SunX509
> ssl.keystore.certificate.chain = null
> ssl.keystore.key = null
> ssl.keystore.location = null
> ssl.keystore.password = null
> ssl.keystore.type = JKS
> ssl.protocol = TLSv1.3
> ssl.provider = null
> ssl.secure.random.implementation = null
> ssl.trustmanager.algorithm = PKIX
> ssl.truststore.certificates = null
> ssl.truststore.location = null
> ssl.truststore.password = null
> ssl.truststore.type = JKS
> transaction.timeout.ms = 60000
> transactional.id = null
> value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
>
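> For reference, a minimal sketch of how the non-default values from the dump above
> would map onto producer construction in code; every value shown is copied verbatim
> from the logged ProducerConfig, the bootstrap address is the placeholder from the
> dump, and the SASL entries are left out because the JAAS line is [hidden]:
>
> {code:java}
> import java.util.Properties;
> import org.apache.kafka.clients.producer.KafkaProducer;
>
> public class ReportedProducerFactory {
>     // Non-default settings taken from the ProducerConfig dump above.
>     static KafkaProducer<byte[], byte[]> build() {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "server:9092");
>         props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
>         props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
>         props.put("acks", "-1");
>         props.put("enable.idempotence", "true");
>         props.put("max.in.flight.requests.per.connection", "1");
>         props.put("compression.type", "gzip");
>         props.put("delivery.timeout.ms", "30000");
>         props.put("retries", "3");
>         props.put("metadata.max.age.ms", "300000");
>         return new KafkaProducer<>(props);
>     }
> }
> {code}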
> If you need any more details, please let me know.