[jira] [Created] (KAFKA-10381) Add broker to a cluster not rebalancing partitions
Yogesh BG created KAFKA-10381:

Summary: Add broker to a cluster not rebalancing partitions
Key: KAFKA-10381
URL: https://issues.apache.org/jira/browse/KAFKA-10381
Project: Kafka
Issue Type: Bug
Reporter: Yogesh BG

Hi. I have a 3-node cluster and a topic with one partition. When a node is deleted and another node is added, the topic goes into an unknown state and nothing can be written to or read from it; the exception below is seen:

{code:java}
[2020-08-10 00:00:00,108] WARN [ReplicaManager broker=1004] Leader 1004 failed to record follower 1005's position 0 since the replica is not recognized to be one of the assigned replicas 1003,1004 for partition A-0. Empty records will be returned for this partition. (kafka.server.ReplicaManager)
[2020-08-10 00:00:00,108] WARN [ReplicaManager broker=1004] Leader 1004 failed to record follower 1005's position 0 since the replica is not recognized to be one of the assigned replicas 1003,1004 for partition C-0. Empty records will be returned for this partition. (kafka.server.ReplicaManager)
[2020-08-10 00:00:00,108] WARN [ReplicaManager broker=1004] Leader 1004 failed to record follower 1005's position 0 since the replica is not recognized to be one of the assigned replicas 1002,1004 for partition A-0. Empty records will be returned for this partition. (kafka.server.ReplicaManager)
[2020-08-10 00:00:00,108] WARN [ReplicaManager broker=1004] Leader 1004 failed to record follower 1005's position 0 since the replica is not recognized to be one of the assigned replicas 1002,1004 for partition B-0. Empty records will be returned for this partition. (kafka.server.ReplicaManager)
[2020-08-10 00:00:00,108] WARN [ReplicaManager broker=1004] Leader 1004 failed to record follower 1005's position 0 since the replica is not recognized to be one of the assigned replicas 1003,1004 for partition A-0. Empty records will be returned for this partition. (kafka.server.ReplicaManager)
[2020-08-10 00:00:00,108] WARN [ReplicaManager broker=1004] Leader 1004 failed to record follower 1005's position 0 since the replica is not recognized to be one of the assigned replicas 1003,1004 for partition A-0. Empty records will be returned for this partition. (kafka.server.ReplicaManager)
{code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
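[Editor's note, not part of the report.] Kafka does not move existing replica assignments when a broker is replaced; the assignment keeps referencing the removed broker id, which is consistent with the log above (leader 1004 still sees replicas such as 1003,1004 while the new broker 1005 tries to fetch). A hedged sketch of a manual reassignment plan for one affected topic; topic `A` and the broker ids are taken from the log, and the chosen target replica list is an assumption:

```json
{
  "version": 1,
  "partitions": [
    { "topic": "A", "partition": 0, "replicas": [1004, 1005] }
  ]
}
```

A plan like this would be applied with the stock `kafka-reassign-partitions.sh --reassignment-json-file plan.json --execute` tool, plus the cluster connection flag appropriate to the broker version in use.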
[jira] [Created] (KAFKA-10303) kafka producer says connect failed in cluster mode
Yogesh BG created KAFKA-10303:

Summary: kafka producer says connect failed in cluster mode
Key: KAFKA-10303
URL: https://issues.apache.org/jira/browse/KAFKA-10303
Project: Kafka
Issue Type: Bug
Reporter: Yogesh BG

Hi. I am using Kafka broker version 2.3.0. We have two setups: standalone (one node) and a 3-node cluster. We pump a large volume of data, ~25 MB/s and ~80K messages per second. It all works well in the one-node setup, but in the cluster the producer (librdkafka) starts throwing "connect failed" after some time, then is able to connect again and resumes sending traffic. What could be the issue?

Some of the configurations are:

replica.fetch.max.bytes=10485760
num.network.threads=12
num.replica.fetchers=6
queued.max.requests=5
# The number of threads doing disk I/O
num.io.threads=12
replica.socket.receive.buffer.bytes=1

-- This message was sent by Atlassian Jira (v8.3.4#803005)
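[Editor's note, not part of the report.] For comparison, two of the values above are far below the broker's shipped defaults, which could plausibly contribute to intermittent connection stalls at ~25 MB/s. A hedged sketch of the corresponding defaults (values as documented for the 2.x broker; verify against the exact version):

```properties
# Kafka 2.x defaults for the two unusually low settings in the report
queued.max.requests=500                    # the report sets 5
replica.socket.receive.buffer.bytes=65536  # the report sets 1 (value may be truncated in the message)
```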
[jira] [Created] (KAFKA-7209) Kafka stream does not rebalance when one node gets down
Yogesh BG created KAFKA-7209:

Summary: Kafka stream does not rebalance when one node gets down
Key: KAFKA-7209
URL: https://issues.apache.org/jira/browse/KAFKA-7209
Project: Kafka
Issue Type: Bug
Components: streams
Affects Versions: 0.11.0.1
Reporter: Yogesh BG

I use Kafka Streams 0.11.0.1, session timeout 6, retries set to Integer.MAX_VALUE, and the default backoff time. I have 3 nodes running a Kafka cluster of 3 brokers, and I am running 3 Kafka Streams instances with the same application.id; each node has one broker and one Kafka Streams application. Everything works fine during setup.

Then I bring down one node, so one Kafka broker and one streaming app are down. Now I see exceptions in the other two streaming apps, and the group never rebalances; I waited for hours and it never comes back to normal. Is there anything I am missing? When one broker is down, I also tried calling stream.close(), cleaning up, and restarting, but this doesn't help either. Can anyone help me?

One thing I observed lately is that topics with one partition do get reassigned, but I have topics with 16 partitions and replication factor 3, and those never settle.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
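[Editor's note, not part of the report.] As a point of reference, the setup described above (three instances sharing one application.id, retries at Integer.MAX_VALUE) would look roughly like this in properties form. Property names follow StreamsConfig for the 0.11 line; the application name and host names are placeholders, and `retries` is passed through to the embedded clients:

```properties
application.id=my-streams-app            # same value on all 3 instances (placeholder name)
bootstrap.servers=node1:9092,node2:9092,node3:9092
retries=2147483647                       # Integer.MAX_VALUE, as described in the report
# session timeout and retry backoff left at the report's values/defaults
```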
[jira] [Created] (KAFKA-5998) /.checkpoint.tmp Not found exception
Yogesh BG created KAFKA-5998:

Summary: /.checkpoint.tmp Not found exception
Key: KAFKA-5998
URL: https://issues.apache.org/jira/browse/KAFKA-5998
Project: Kafka
Issue Type: Bug
Components: streams
Affects Versions: 0.11.0.0
Reporter: Yogesh BG
Priority: Trivial

I have one Kafka broker and one Kafka Streams application running. It has been running for two days under a load of around 2500 msgs per second. On the third day I am getting the exception below for some of the partitions; I have 16 partitions, and only 0_0 and 0_1 give this error:

09:43:25.955 [ks_0_inst-StreamThread-6] WARN o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
java.io.FileNotFoundException: /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or directory)
    at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221) ~[na:1.7.0_111]
    at java.io.FileOutputStream.<init>(FileOutputStream.java:171) ~[na:1.7.0_111]
    at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]

09:43:25.974 [ks_0_inst-StreamThread-15] WARN o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
java.io.FileNotFoundException: /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or directory)
    at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221) ~[na:1.7.0_111]
    at java.io.FileOutputStream.<init>(FileOutputStream.java:171) ~[na:1.7.0_111]
    at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
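[Editor's note, not part of the report.] A minimal Python sketch, not Kafka's actual code, of the write-to-a-.tmp-file-then-rename pattern that `OffsetCheckpoint.write` uses. It shows how the checkpoint write fails with exactly this kind of "No such file or directory" error once the task's state directory disappears out from under the writer; all names and paths here are illustrative:

```python
import os
import tempfile

def write_checkpoint(path, offsets):
    """Write offsets atomically: write a .tmp sibling, then rename over the target.

    Mirrors the tmp-then-rename idea in Kafka Streams' OffsetCheckpoint.
    open() on the .tmp file raises FileNotFoundError when the parent state
    directory has been removed, as in the report's stack trace.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        for partition, offset in offsets.items():
            f.write(f"{partition} {offset}\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename on POSIX

# Simulate a task state directory like /data/kstreams/<app-id>/0_1
state_dir = os.path.join(tempfile.mkdtemp(), "0_1")
os.makedirs(state_dir)
checkpoint = os.path.join(state_dir, ".checkpoint")

write_checkpoint(checkpoint, {"topic-0": 42})
print(open(checkpoint).read().strip())  # topic-0 42
```

If the `0_1` directory is deleted between commits (for example by a cleanup racing with the commit thread), the next `write_checkpoint` call fails at the `open()` line with `FileNotFoundError`, matching the `.checkpoint.tmp (No such file or directory)` message above.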
[jira] [Created] (KAFKA-5618) Kafka stream not receive any topic/partitions/records info
Yogesh BG created KAFKA-5618:

Summary: Kafka stream not receive any topic/partitions/records info
Key: KAFKA-5618
URL: https://issues.apache.org/jira/browse/KAFKA-5618
Project: Kafka
Issue Type: Bug
Components: streams
Reporter: Yogesh BG
Priority: Critical
Attachments: rtp-kafkastreams2.log, rtp-kafkastreams3.log, rtp-kafkastreams.log

I have 3 brokers and 3 stream consumers. There are 360 partitions, and I am not able to bring up the streams successfully even after several retries. I have attached the logs. There are other topics with around 16 partitions, and those are consumed successfully by the Kafka client. When I tried getting a thread dump using jstack, the process did not respond:

Attaching to process ID 10663, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.141-b02
Deadlock Detection:
java.lang.RuntimeException: Unable to deduce type of thread from address 0x7fdac4009000 (expected type JavaThread, CompilerThread, ServiceThread, JvmtiAgentThread, or SurrogateLockerThread)
    at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:162)
    at sun.jvm.hotspot.runtime.Threads.first(Threads.java:150)
    at sun.jvm.hotspot.runtime.DeadlockDetector.createThreadTable(DeadlockDetector.java:149)
    at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:56)
    at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:39)

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KAFKA-5593) Kafka streams not re-balancing when 3 consumer streams are there
[ https://issues.apache.org/jira/browse/KAFKA-5593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yogesh BG resolved KAFKA-5593.
Resolution: Invalid

> Kafka streams not re-balancing when 3 consumer streams are there
>
> Key: KAFKA-5593
> URL: https://issues.apache.org/jira/browse/KAFKA-5593
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Yogesh BG
> Priority: Critical
> Attachments: log1.txt, log2.txt, log3.txt
>
> I have 3 broker nodes, 3 kafka streams
> I observe that all 3 consumer streams are part of the group named
> rtp-kafkastreams. but when i see the data is processed only by one node.
> DEBUG n.a.a.k.a.AccessLogMetricEnrichmentProcessor - AccessLogMetricEnrichmentProcessor.process
> when i do check the partition information shared by each of them i see first
> node has all partitions like all 8. but in other streams the folder is empty.
> [root@ip-172-31-11-139 ~]# ls /data/kstreams/rtp-kafkastreams
> 0_0 0_1 0_2 0_3 0_4 0_5 0_6 0_7
> and this folder is empty
> I tried restarting the other two consumer streams still they won't become the
> part of the group and re-balance.
> I have attached the logs.
> Configurations are inside the log file.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5593) Kafka streams not re-balancing when 3 consumer streams are there
Yogesh BG created KAFKA-5593:

Summary: Kafka streams not re-balancing when 3 consumer streams are there
Key: KAFKA-5593
URL: https://issues.apache.org/jira/browse/KAFKA-5593
Project: Kafka
Issue Type: Bug
Components: streams
Reporter: Yogesh BG
Priority: Critical
Attachments: log1.txt, log2.txt, log3.txt

I have 3 broker nodes and 3 Kafka Streams instances. I observe that all 3 consumer streams are part of the group named rtp-kafkastreams, but the data is processed by only one node:

DEBUG n.a.a.k.a.AccessLogMetricEnrichmentProcessor - AccessLogMetricEnrichmentProcessor.process

When I check the partition information held by each of them, I see that the first node has all 8 partitions:

[root@ip-172-31-11-139 ~]# ls /data/kstreams/rtp-kafkastreams
0_0 0_1 0_2 0_3 0_4 0_5 0_6 0_7

but on the other streams this folder is empty. I tried restarting the other two consumer streams, but they still do not become part of the group and rebalance. I have attached the logs; the configurations are inside the log files.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5545) Kafka Stream not able to successfully restart over new broker ip
Yogesh BG created KAFKA-5545:

Summary: Kafka Stream not able to successfully restart over new broker ip
Key: KAFKA-5545
URL: https://issues.apache.org/jira/browse/KAFKA-5545
Project: Kafka
Issue Type: Bug
Reporter: Yogesh BG
Priority: Critical

Hi. I have one Kafka broker and one Kafka Streams application. Initially the stream connects and starts processing data. Then I restart the broker; when the broker restarts, a new IP is assigned. In the Kafka Streams application I have a thread on a 5-minute interval that checks whether the broker IP has changed; if it has, we clean up the stream, rebuild the topology (also tried reusing the topology), and start the stream again. I end up with the following exceptions:

11:04:08.032 [StreamThread-38] INFO o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-38] Creating active task 0_5 with assigned partitions [PR-5]
11:04:08.033 [StreamThread-41] INFO o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-41] Creating active task 0_1 with assigned partitions [PR-1]
11:04:08.036 [StreamThread-34] INFO o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-34] Creating active task 0_7 with assigned partitions [PR-7]
11:04:08.036 [StreamThread-37] INFO o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-37] Creating active task 0_3 with assigned partitions [PR-3]
11:04:08.036 [StreamThread-45] INFO o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-45] Creating active task 0_0 with assigned partitions [PR-0]
11:04:08.037 [StreamThread-36] INFO o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-36] Creating active task 0_4 with assigned partitions [PR-4]
11:04:08.037 [StreamThread-43] INFO o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-43] Creating active task 0_6 with assigned partitions [PR-6]
11:04:08.038 [StreamThread-48] INFO o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-48] Creating active task 0_2 with assigned partitions [PR-2]
11:04:09.034 [StreamThread-38] WARN o.a.k.s.p.internals.StreamThread - Could not create task 0_5. Will retry.
org.apache.kafka.streams.errors.LockException: task [0_5] Failed to lock the state directory for task 0_5
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.<init>(ProcessorStateManager.java:100) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:73) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:108) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:352) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1029) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at
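[Editor's note, not part of the report.] A LockException like the one above typically means another StreamThread in the same process, often from a previous KafkaStreams instance that was not fully closed, still holds the lock on the task's state directory. The relevant settings, sketched here with names per StreamsConfig and illustrative values:

```properties
state.dir=/data/kstreams          # per-task subdirectories are locked while a thread owns the task
state.cleanup.delay.ms=600000     # delay before state of unassigned tasks may be cleaned up
```

When rebuilding the topology after the broker IP changes, it may help to wait for KafkaStreams.close() to complete (using the timeout overload if available in the version in use) before constructing the new instance, so the old threads release their state-directory locks first.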