[jira] [Commented] (KAFKA-7358) Alternative Partitioner to Support "Always Round-Robin" Selection
[ https://issues.apache.org/jira/browse/KAFKA-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606712#comment-16606712 ] Matthias J. Sax commented on KAFKA-7358: Is this a duplicate of https://issues.apache.org/jira/browse/KAFKA- ? > Alternative Partitioner to Support "Always Round-Robin" Selection > - > > Key: KAFKA-7358 > URL: https://issues.apache.org/jira/browse/KAFKA-7358 > Project: Kafka > Issue Type: Wish > Components: clients >Reporter: M. Manna >Assignee: M. Manna >Priority: Minor > > In my organisation, we have been using Kafka as the basic publish-subscribe > messaging system provider. Our goal is to send event-based (secure, > encrypted) SQL messages reliably, and process them accordingly. For us, the > message keys represent some metadata which we use to either ignore messages > (if a loopback to the sender), or log some information. We have the following > use case for messaging: > 1) A database transaction event takes place > 2) The event is captured and messaged across 10 data centres all around the > world. > 3) A group of consumers (for each data centre with a unique consumer-group > ID) will process messages from their respective partitions. 1 consumer > per partition. > Under these circumstances, we only need a guarantee that the same message won't be > sent to multiple partitions. In other words, 1 partition will +never+ be > sought by multiple consumers. > Using DefaultPartitioner, we can achieve this only with NULL keys. But since > we need keys for metadata, we cannot maintain "round-robin" selection of > partitions because a key hash will determine which partition to choose. We > need to have round-robin style selection regardless of key type (NULL or > not-NULL). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
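The behaviour requested above, round-robin partition selection that ignores the key entirely, comes down to a shared counter cycled over the partition count. The sketch below models only that selection logic; it is not Kafka's DefaultPartitioner, and the class and method names are illustrative. A real implementation would put this counter inside a class implementing org.apache.kafka.clients.producer.Partitioner and simply never consult the key bytes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal model of "always round-robin" selection, independent of the record
// key. Names are illustrative, not Kafka API.
public class RoundRobinSelector {
    private final AtomicInteger counter = new AtomicInteger(0);

    // Returns the next partition in strict rotation, regardless of any key.
    public int nextPartition(int numPartitions) {
        // getAndIncrement eventually wraps negative, so clear the sign bit
        // before taking the modulus to keep the result in range.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector();
        for (int i = 0; i < 6; i++) {
            // Cycles 0, 1, 2, 0, 1, 2 for a 3-partition topic.
            System.out.println(selector.nextPartition(3));
        }
    }
}
```

Because the only state is a single atomic counter, concurrent senders each draw a distinct value, so the rotation stays even under load, which is what keeps each partition fed independently in the reporter's one-consumer-per-partition setup.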
[jira] [Updated] (KAFKA-6260) AbstractCoordinator not clearly handles NULL Exception
[ https://issues.apache.org/jira/browse/KAFKA-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-6260: --- Component/s: consumer > AbstractCoordinator not clearly handles NULL Exception > -- > > Key: KAFKA-6260 > URL: https://issues.apache.org/jira/browse/KAFKA-6260 > Project: Kafka > Issue Type: Bug > Components: consumer >Affects Versions: 1.0.0 > Environment: RedHat Linux >Reporter: Seweryn Habdank-Wojewodzki >Assignee: Jason Gustafson >Priority: Major > Fix For: 1.0.1, 1.1.0 > > > The error reporting is not clear, but it seems that the Kafka heartbeat shuts > down the application due to a NULL exception caused by "fake" disconnections. > One more comment: we are processing messages in the stream, but sometimes we > have to block processing for minutes, as consumers are not handling too much > load. Is it possible that when the stream is waiting, the heartbeat is > blocked as well? > Can you check that? > {code} > 2017-11-23 23:54:47 DEBUG AbstractCoordinator:177 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Received successful Heartbeat response > 2017-11-23 23:54:50 DEBUG AbstractCoordinator:183 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Sending Heartbeat request to coordinator > cljp01.eb.lan.at:9093 (id: 2147483646 rack: null) > 2017-11-23 23:54:50 TRACE NetworkClient:135 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Sending HEARTBEAT > {group_id=kafka-endpoint,generation_id=3834,member_id=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer-94f18be5-e49a-4817-9e5a-fe82a64e0b08} > with correlation id 24 to node 2147483646 > 2017-11-23 23:54:50 TRACE NetworkClient:135 - [Consumer > 
clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Completed receive from node 2147483646 for HEARTBEAT > with correlation id 24, received {throttle_time_ms=0,error_code=0} > 2017-11-23 23:54:50 DEBUG AbstractCoordinator:177 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Received successful Heartbeat response > 2017-11-23 23:54:52 DEBUG NetworkClient:183 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Disconnecting from node 1 due to request timeout. > 2017-11-23 23:54:52 TRACE NetworkClient:135 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Cancelled request > {replica_id=-1,max_wait_time=6000,min_bytes=1,max_bytes=52428800,isolation_level=0,topics=[{topic=clj_internal_topic,partitions=[{partition=6,fetch_offset=211558395,log_start_offset=-1,max_bytes=1048576},{partition=8,fetch_offset=210178209,log_start_offset=-1,max_bytes=1048576},{partition=0,fetch_offset=209353523,log_start_offset=-1,max_bytes=1048576},{partition=2,fetch_offset=209291462,log_start_offset=-1,max_bytes=1048576},{partition=4,fetch_offset=210728595,log_start_offset=-1,max_bytes=1048576}]}]} > with correlation id 21 due to node 1 being disconnected > 2017-11-23 23:54:52 DEBUG ConsumerNetworkClient:195 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Cancelled FETCH request RequestHeader(apiKey=FETCH, > apiVersion=6, > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > correlationId=21) with correlation id 21 due to node 1 being disconnected > 2017-11-23 23:54:52 DEBUG Fetcher:195 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > 
groupId=kafka-endpoint] Fetch request > {clj_internal_topic-6=(offset=211558395, logStartOffset=-1, > maxBytes=1048576), clj_internal_topic-8=(offset=210178209, logStartOffset=-1, > maxBytes=1048576), clj_internal_topic-0=(offset=209353523, logStartOffset=-1, > maxBytes=1048576), clj_internal_topic-2=(offset=209291462, logStartOffset=-1, > maxBytes=1048576), clj_internal_topic-4=(offset=210728595, logStartOffset=-1, > maxBytes=1048576)} to cljp01.eb.lan.at:9093 (id: 1 rack: DC-1) failed > org.apache.kafka.common.errors.DisconnectException: null > 2017-11-23 23:54:52 TRACE NetworkClient:123 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Found least loaded node cljp01.eb.lan.at:9093 (id: 1 > rack: DC-1) > 2017-11-23 23:54:52 DEBUG
[jira] [Updated] (KAFKA-6457) Error: NOT_LEADER_FOR_PARTITION leads to NPE
[ https://issues.apache.org/jira/browse/KAFKA-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-6457: --- Component/s: (was: streams) consumer > Error: NOT_LEADER_FOR_PARTITION leads to NPE > > > Key: KAFKA-6457 > URL: https://issues.apache.org/jira/browse/KAFKA-6457 > Project: Kafka > Issue Type: Bug > Components: consumer >Affects Versions: 1.0.0 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > Fix For: 1.0.1 > > > One of our nodes was dead. Then the second one took over all responsibility. > But the streaming application crashed in the meantime due to an NPE caused by > {{Error: NOT_LEADER_FOR_PARTITION}}. > The stack trace is below. > > Is this expected? > > {code:java} > 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer ...2018-01-17 > 11:47:21 [my] [WARN ] Sender:251 - [Producer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer] > Got error produce response with correlation id 768962 on topic-partition > my_internal_topic-5, retrying (9 attempts left). Error: > NOT_LEADER_FOR_PARTITION > 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer] > Got error produce response with correlation id 768962 on topic-partition > my_internal_topic-7, retrying (9 attempts left). Error: > NOT_LEADER_FOR_PARTITION > 2018-01-17 11:47:21 [my] [ERROR] AbstractCoordinator:296 - [Consumer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-consumer, > groupId=restreamer-my] Heartbeat thread for group restreamer-my failed due > to unexpected error > java.lang.NullPointerException: null > at > org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:436) > ~[my-restreamer.jar:?] > at org.apache.kafka.common.network.Selector.poll(Selector.java:395) > ~[my-restreamer.jar:?] 
> at > org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:238) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.pollNoWakeup(ConsumerNetworkClient.java:275) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:934) > [my-restreamer.jar:?] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (KAFKA-6457) Error: NOT_LEADER_FOR_PARTITION leads to NPE
[ https://issues.apache.org/jira/browse/KAFKA-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax reopened KAFKA-6457: > Error: NOT_LEADER_FOR_PARTITION leads to NPE > > > Key: KAFKA-6457 > URL: https://issues.apache.org/jira/browse/KAFKA-6457 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 1.0.0 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > Fix For: 1.0.1 > > > One of our nodes was dead. Then the second one took over all responsibility. > But the streaming application crashed in the meantime due to an NPE caused by > {{Error: NOT_LEADER_FOR_PARTITION}}. > The stack trace is below. > > Is this expected? > > {code:java} > 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer ...2018-01-17 > 11:47:21 [my] [WARN ] Sender:251 - [Producer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer] > Got error produce response with correlation id 768962 on topic-partition > my_internal_topic-5, retrying (9 attempts left). Error: > NOT_LEADER_FOR_PARTITION > 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer] > Got error produce response with correlation id 768962 on topic-partition > my_internal_topic-7, retrying (9 attempts left). Error: > NOT_LEADER_FOR_PARTITION > 2018-01-17 11:47:21 [my] [ERROR] AbstractCoordinator:296 - [Consumer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-consumer, > groupId=restreamer-my] Heartbeat thread for group restreamer-my failed due > to unexpected error > java.lang.NullPointerException: null > at > org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:436) > ~[my-restreamer.jar:?] > at org.apache.kafka.common.network.Selector.poll(Selector.java:395) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460) > ~[my-restreamer.jar:?] 
> at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:238) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.pollNoWakeup(ConsumerNetworkClient.java:275) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:934) > [my-restreamer.jar:?] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-6457) Error: NOT_LEADER_FOR_PARTITION leads to NPE
[ https://issues.apache.org/jira/browse/KAFKA-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-6457. Resolution: Duplicate > Error: NOT_LEADER_FOR_PARTITION leads to NPE > > > Key: KAFKA-6457 > URL: https://issues.apache.org/jira/browse/KAFKA-6457 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 1.0.0 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > Fix For: 1.0.1 > > > One of our nodes was dead. Then the second one took over all responsibility. > But the streaming application crashed in the meantime due to an NPE caused by > {{Error: NOT_LEADER_FOR_PARTITION}}. > The stack trace is below. > > Is this expected? > > {code:java} > 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer ...2018-01-17 > 11:47:21 [my] [WARN ] Sender:251 - [Producer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer] > Got error produce response with correlation id 768962 on topic-partition > my_internal_topic-5, retrying (9 attempts left). Error: > NOT_LEADER_FOR_PARTITION > 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer] > Got error produce response with correlation id 768962 on topic-partition > my_internal_topic-7, retrying (9 attempts left). Error: > NOT_LEADER_FOR_PARTITION > 2018-01-17 11:47:21 [my] [ERROR] AbstractCoordinator:296 - [Consumer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-consumer, > groupId=restreamer-my] Heartbeat thread for group restreamer-my failed due > to unexpected error > java.lang.NullPointerException: null > at > org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:436) > ~[my-restreamer.jar:?] > at org.apache.kafka.common.network.Selector.poll(Selector.java:395) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460) > ~[my-restreamer.jar:?] 
> at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:238) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.pollNoWakeup(ConsumerNetworkClient.java:275) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:934) > [my-restreamer.jar:?] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7373) GetOffsetShell doesn't work when SSL authentication is enabled
[ https://issues.apache.org/jira/browse/KAFKA-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606492#comment-16606492 ] Andy Bryant commented on KAFKA-7373: Thanks Stanislav - good spot on the KIP. Hopefully that one makes it into 2.1.0. Regarding the OutOfMemory, try setting the following before the call to increase the default heap allocation (the value needs quoting, since it contains a space). {code:java} KAFKA_HEAP_OPTS="-Xms512m -Xmx1g"{code} > GetOffsetShell doesn't work when SSL authentication is enabled > -- > > Key: KAFKA-7373 > URL: https://issues.apache.org/jira/browse/KAFKA-7373 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Andy Bryant >Priority: Major > > GetOffsetShell doesn't provide a mechanism to supply additional > configuration for the underlying KafkaConsumer as the `ConsoleConsumer` does. > Passing SSL config as system properties doesn't propagate to the consumer > either. > {code:java} > 10:47 $ ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > ${BROKER_LIST} --topic cld-dev-sor-crods-crodsdba_contact > Exception in thread "main" org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata{code} > Editing {{GetOffsetShell.scala}} to include the SSL properties in the > KafkaConsumer configuration resolved the issue. > Providing {{consumer-property}} and {{consumer-config}} configuration options > for {{kafka-run-class.sh}}, or creating a separate run script for offsets and > using these properties in {{GetOffsetShell.scala}}, seems like a good solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-5235) GetOffsetShell: retrieve offsets for all given topics and partitions with single request to the broker
[ https://issues.apache.org/jira/browse/KAFKA-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606488#comment-16606488 ] Andy Bryant commented on KAFKA-5235: Looking forward to this KIP being implemented. We currently can't use GetOffsetShell as it stands with our cluster, as it uses SSL authentication (see KAFKA-7373). Having consistent parameter values will be a welcome change too. > GetOffsetShell: retrieve offsets for all given topics and partitions with > single request to the broker > -- > > Key: KAFKA-5235 > URL: https://issues.apache.org/jira/browse/KAFKA-5235 > Project: Kafka > Issue Type: Improvement > Components: tools >Reporter: Arseniy Tashoyan >Priority: Major > Labels: kip, tool > Fix For: 2.1.0 > > Original Estimate: 168h > Remaining Estimate: 168h > > GetOffsetShell is implemented on the old SimpleConsumer. It needs Zookeeper to > retrieve metadata about topics and partitions. At present, GetOffsetShell > does the following: > - get metadata from Zookeeper > - iterate over partitions > - for each partition, connect to its leader broker and request offsets > Instead, GetOffsetShell can use the new KafkaConsumer and retrieve offsets by > means of the endOffsets(), beginningOffsets() and offsetsForTimes() methods. One > request is sufficient for all topics and partitions. > Once GetOffsetShell is re-implemented with the new KafkaConsumer API, it > will no longer depend on obsolete APIs: SimpleConsumer and the old producer API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7044) kafka-consumer-groups.sh NullPointerException describing round robin or sticky assignors
[ https://issues.apache.org/jira/browse/KAFKA-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606375#comment-16606375 ] Anna Povzner commented on KAFKA-7044: - Hi [~vahid], We are seeing this bug a lot recently, and really want to get it fixed asap. I will try Jason's suggestion and open a PR if it works. I hope that's ok with you. > kafka-consumer-groups.sh NullPointerException describing round robin or > sticky assignors > > > Key: KAFKA-7044 > URL: https://issues.apache.org/jira/browse/KAFKA-7044 > Project: Kafka > Issue Type: Bug > Components: tools >Affects Versions: 1.1.0 > Environment: CentOS 7.4, Oracle JDK 1.8 >Reporter: Jeff Field >Assignee: Vahid Hashemian >Priority: Major > Fix For: 2.1.0 > > > We've recently moved to using the round robin assignor for one of our > consumer groups, and started testing the sticky assignor. In both cases, > using Kafka 1.1.0 we get a null pointer exception *unless* the group being > described is rebalancing: > {code:java} > kafka-consumer-groups --bootstrap-server fqdn:9092 --describe --group > groupname-for-consumer > Error: Executing consumer group command failed due to null > [2018-06-12 01:32:34,179] DEBUG Exception in consumer group command > (kafka.admin.ConsumerGroupCommand$) > java.lang.NullPointerException > at scala.Predef$.Long2long(Predef.scala:363) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$getLogEndOffsets$2.apply(ConsumerGroupCommand.scala:612) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$getLogEndOffsets$2.apply(ConsumerGroupCommand.scala:610) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.immutable.List.foreach(List.scala:392) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.immutable.List.map(List.scala:296) > at > 
kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.getLogEndOffsets(ConsumerGroupCommand.scala:610) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.describePartitions(ConsumerGroupCommand.scala:328) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.collectConsumerAssignment(ConsumerGroupCommand.scala:308) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.collectConsumerAssignment(ConsumerGroupCommand.scala:544) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10$$anonfun$13.apply(ConsumerGroupCommand.scala:571) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10$$anonfun$13.apply(ConsumerGroupCommand.scala:565) > at scala.collection.immutable.List.flatMap(List.scala:338) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10.apply(ConsumerGroupCommand.scala:565) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10.apply(ConsumerGroupCommand.scala:558) > at scala.Option.map(Option.scala:146) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.collectGroupOffsets(ConsumerGroupCommand.scala:558) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.describeGroup(ConsumerGroupCommand.scala:271) > at > kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.describeGroup(ConsumerGroupCommand.scala:544) > at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:77) > at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala) > [2018-06-12 01:32:34,255] DEBUG Removed sensor with name connections-closed: > (org.apache.kafka.common.metrics.Metrics){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
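The top frames of the trace (scala.Predef$.Long2long inside getLogEndOffsets) point at a boxed offset being unboxed while null, for example a partition for which no log-end offset was returned. The Java snippet below reproduces just that failure mode; the map and key are hypothetical stand-ins, not the actual ConsumerGroupCommand data structures.

```java
import java.util.HashMap;
import java.util.Map;

// Demonstrates the NPE-on-unboxing failure seen in the stack trace above:
// scala.Predef$.Long2long is Scala's counterpart of Java's Long -> long
// auto-unboxing, which throws NullPointerException when the box is null.
public class UnboxNpe {
    // Unboxing helper: throws NullPointerException when boxed == null.
    static long unbox(Long boxed) {
        return boxed;
    }

    public static void main(String[] args) {
        Map<String, Long> endOffsets = new HashMap<>(); // hypothetical offsets map
        Long missing = endOffsets.get("some-partition"); // null: no entry
        try {
            unbox(missing);
            System.out.println("no NPE");
        } catch (NullPointerException e) {
            System.out.println("NPE on unboxing, as in the trace above");
        }
    }
}
```

A fix of this shape guards the lookup (or substitutes a default) before the value is ever converted to a primitive.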
[jira] [Created] (KAFKA-7383) Verify leader epoch in produce requests (KIP-359)
Jason Gustafson created KAFKA-7383: -- Summary: Verify leader epoch in produce requests (KIP-359) Key: KAFKA-7383 URL: https://issues.apache.org/jira/browse/KAFKA-7383 Project: Kafka Issue Type: Improvement Components: producer Reporter: Jason Gustafson Assignee: Jason Gustafson Implementation of https://cwiki.apache.org/confluence/display/KAFKA/KIP-359%3A+Verify+leader+epoch+in+produce+requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-6817) UnknownProducerIdException when writing messages with old timestamps
[ https://issues.apache.org/jira/browse/KAFKA-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606362#comment-16606362 ] Matthias J. Sax commented on KAFKA-6817: Yes, it is marked as `1.1` and is thus an issue for `2.0`, too (otherwise, it would be marked as fixed in `2.0`). It's a broker-side issue, and thus a broker upgrade will be required after it is fixed. > UnknownProducerIdException when writing messages with old timestamps > > > Key: KAFKA-6817 > URL: https://issues.apache.org/jira/browse/KAFKA-6817 > Project: Kafka > Issue Type: Bug > Components: producer >Affects Versions: 1.1.0 >Reporter: Odin Standal >Priority: Major > > We are seeing the following exception in our Kafka application: > {code:java} > ERROR o.a.k.s.p.internals.StreamTask - task [0_0] Failed to close producer > due to the following error: org.apache.kafka.streams.errors.StreamsException: > task [0_0] Abort sending since an error caught with a previous record (key > 22 value some-value timestamp 1519200902670) to topic > exactly-once-test-topic- v2 due to This exception is raised by the broker if > it could not locate the producer metadata associated with the producerId in > question. This could happen if, for instance, the producer's records were > deleted because their retention time had elapsed. Once the last records of > the producerId are removed, the producer's metadata is removed from the > broker, and future appends by the producer will return this exception. 
at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:125) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:48) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:180) > at > org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1199) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:204) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:187) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:627) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:596) > at > org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:557) > at > org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:481) > at > org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74) > at > org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:692) > at > org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101) > at > org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:482) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:474) at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239) at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) at > java.lang.Thread.run(Thread.java:748) Caused by: > org.apache.kafka.common.errors.UnknownProducerIdException > {code} > We discovered this error when we had the need to reprocess old messages. 
See > more details on > [Stackoverflow|https://stackoverflow.com/questions/49872827/unknownproduceridexception-in-kafka-streams-when-enabling-exactly-once?noredirect=1#comment86901917_49872827] > We have reproduced the error with a smaller example application. The error > occurs after 10 minutes of producing messages that have old timestamps (type > 1 year old). The topic we are writing to has a retention.ms set to 1 year so > we are expecting the messages to stay there. > After digging through the ProducerStateManager-code in the Kafka source code > we have a theory of what might be wrong. > The ProducerStateManager.removeExpiredProducers() seems to remove producers > from memory erroneously when processing records which are older than the > maxProducerIdExpirationMs (coming from the `transactional.id.expiration.ms` > configuration), which is set by default to 7 days. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
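The reporters' theory in the last paragraph can be sketched numerically. The check below is a hypothetical simplification, not the actual ProducerStateManager code: if producer liveness is judged against the record's embedded timestamp rather than the wall-clock append time, a record stamped a year in the past exceeds the 7-day `transactional.id.expiration.ms` default the moment it is written.

```java
// Hypothetical simplification of the expiry check the reporters suspect in
// ProducerStateManager.removeExpiredProducers(): comparing "now" against the
// record's embedded timestamp rather than the append time.
public class ProducerExpiryTheory {
    static boolean isExpired(long lastTimestampMs, long nowMs, long expirationMs) {
        return nowMs - lastTimestampMs > expirationMs;
    }

    public static void main(String[] args) {
        long sevenDaysMs = 7L * 24 * 60 * 60 * 1000;   // default transactional.id.expiration.ms
        long nowMs = System.currentTimeMillis();
        long yearOldStampMs = nowMs - 365L * 24 * 60 * 60 * 1000;
        // Appended just now, but judged by its embedded (year-old) timestamp:
        System.out.println(isExpired(yearOldStampMs, nowMs, sevenDaysMs)); // prints true
        // A freshly stamped record is not expired:
        System.out.println(isExpired(nowMs, nowMs, sevenDaysMs));          // prints false
    }
}
```

If the theory holds, the broker drops the producer state for such a writer almost immediately, which would match the UnknownProducerIdException appearing after only minutes of producing old-timestamped messages.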
[jira] [Commented] (KAFKA-5117) Kafka Connect REST endpoints reveal Password typed values
[ https://issues.apache.org/jira/browse/KAFKA-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606326#comment-16606326 ] satyanarayan komandur commented on KAFKA-5117: -- I would like to add a couple more points related to this KIP. Currently I noticed that even accessing the endpoint connectors/\{connector-name}/status hits the configuration. I think this endpoint should not need to gather config information. > Kafka Connect REST endpoints reveal Password typed values > - > > Key: KAFKA-5117 > URL: https://issues.apache.org/jira/browse/KAFKA-5117 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 0.10.2.0 >Reporter: Thomas Holmes >Priority: Major > Labels: needs-kip > > A Kafka Connect connector can specify ConfigDef keys as type of Password. > This type was added to prevent logging the values (instead "[hidden]" is > logged). > This change does not apply to the values returned by executing a GET on > {{connectors/\{connector-name\}}} and > {{connectors/\{connector-name\}/config}}. This creates an easily accessible > way for an attacker who has infiltrated your network to gain access to > potential secrets that should not be available. > I have started on a code change that addresses this issue by parsing the > config values through the ConfigDef for the connector and returning their > output instead (which leads to the masking of Password typed configs as > [hidden]). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
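The masking described in the report comes down to wrapping secret values in a type whose stringification never reveals them, which is how Kafka's Password config type keeps secrets out of logs (it stringifies as "[hidden]"). The class below is a standalone illustration of that pattern, not the Connect fix itself.

```java
// Standalone model of the Password-masking pattern: any accidental exposure
// path that relies on toString() (logging, naive serialization) sees the
// mask, while the secret stays reachable only through an explicit accessor.
public class MaskedPassword {
    private final String value;

    public MaskedPassword(String value) {
        this.value = value;
    }

    // Deliberate, explicit access to the secret.
    public String value() {
        return value;
    }

    // Accidental exposure paths see the mask instead of the secret.
    @Override
    public String toString() {
        return "[hidden]";
    }

    public static void main(String[] args) {
        MaskedPassword password = new MaskedPassword("s3cr3t");
        System.out.println(password);         // prints [hidden]
        System.out.println(password.value()); // prints s3cr3t
    }
}
```

The fix the reporter describes goes one step further: the REST layer re-parses raw config values through the connector's ConfigDef, so Password-typed entries are rendered in this masked form in the HTTP responses as well.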
[jira] [Commented] (KAFKA-7214) Mystic FATAL error
[ https://issues.apache.org/jira/browse/KAFKA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606303#comment-16606303 ] John Roesler commented on KAFKA-7214: - Hi [~habdank], If I understood the scenario, you get stability problems when you increase the load by 50x but only increase memory 6x. This doesn't seem too surprising. Following a simple linear projection, if your app is healthy at 100 msg/s with a 300 MB heap, when you increase the load by 50x (if it were heap-constrained to begin with) you also need 50x the heap, which puts you at 15 GB. The fact that you are healthy at 1/3 of this, or 5 GB, indicates that it actually wasn't heap-constrained to begin with. It seems like your initial hypothesis was that you needed 3 MB per msg/sec, and your new hypothesis is that you need 1 MB per msg/sec. So if you scale up again, you can use this as a starting point and adjust down or up, depending on how the system performs. Note that on the lower end of the spectrum, the overhead will tend to dominate, so I'm not sure if you can run 100 msg/s in only 100 MB of heap, and you almost certainly cannot run 1 msg/s in 1 MB of heap. Scaling up, the data itself will dominate the heap, but you'll find that there is also a limit to this thinking, as Java performs poorly with very large heaps (like the terabyte range). About your analysis: > 5000 Msg/s ~ 150 000 Msgs/30 sec ~ 150 MB This is a good lower bound, since you know that at a minimum all the live messages must reside in memory, but it is not likely to be a tight lower bound. This assumes that there is no overhead at all. That is, that the only thing in the heap is the messages themselves. 
Which cannot be true, since the JVM has overhead of its own, and Streams and the Clients have their own data structures to maintain, and each message is also resident a couple of different times due to serialization and deserialization, and finally because Java is memory managed, so every object continues to occupy heap after it is live until it gets collected. Does this help? -John > Mystic FATAL error > -- > > Key: KAFKA-7214 > URL: https://issues.apache.org/jira/browse/KAFKA-7214 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.3, 1.1.1 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Critical > > Dears, > Very often at startup of the streaming application I got exception: > {code} > Exception caught in process. taskId=0_1, processor=KSTREAM-SOURCE-00, > topic=my_instance_medium_topic, partition=1, offset=198900203; > [org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:212), > > org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:347), > > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:420), > > org.apache.kafka.streams.processor.internals.AssignedTasks.process(AssignedTasks.java:339), > > org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:648), > > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:513), > > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:482), > > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:459)] > in thread > my_application-my_instance-my_instance_medium-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62 > {code} > and then (without shutdown request from my side): > {code} > 2018-07-30 07:45:02 [ar313] [INFO ] StreamThread:912 - stream-thread > [my_application-my_instance-my_instance-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62] > State 
transition from PENDING_SHUTDOWN to DEAD. > {code} > What is this? > How to correctly handle it? > Thanks in advance for help. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
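The lower-bound arithmetic quoted in the comment above (5000 msg/s held live for a roughly 30-second window comes to about 150 MB) implies an average message size of about 1 KB; that per-message size is an assumption here, since the comment states only the totals.

```java
// Reproduces the lower-bound heap estimate discussed above: messages per
// second, times the window they stay live, times an assumed bytes-per-message.
// This bounds only the raw payload; JVM, Streams, and client overhead (plus
// serialized/deserialized copies and not-yet-collected garbage) come on top.
public class HeapLowerBound {
    static long lowerBoundBytes(long msgsPerSec, long windowSec, long bytesPerMsg) {
        return msgsPerSec * windowSec * bytesPerMsg;
    }

    public static void main(String[] args) {
        long bytes = lowerBoundBytes(5000, 30, 1024); // assumed ~1 KB per message
        // 153,600,000 bytes, i.e. roughly the "~150 MB" figure in the comment.
        System.out.println(bytes / (1024 * 1024) + " MiB");
    }
}
```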
[jira] [Commented] (KAFKA-4544) Add system tests for delegation token based authentication
[ https://issues.apache.org/jira/browse/KAFKA-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606206#comment-16606206 ] Manikumar commented on KAFKA-4544: -- [~asasvari] _>> Initially, I wanted to use console_consumer.py and verifiable clients to validate things (messages produced / consumed), but I ran into some issues:_ We can use "sasl.jaas.config" client config property to pass token credentials. With this we can avoid jaas.conf for token authentication. This can simplify the code. We should be able to pass this property to console_consumer.py and verifiable clients. {code:java} sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \ username="BQoHvMSzRpCC0Z8EHnCwkA" \ password="WKzmngyxSyoEdlkzUQoP7TsTrsNzeMC6+aKg7S0oeLkV+dnzBMjYo3tTtlAYYSFmLs4bTjf+lTZ1LCHR/ZZFNA==" \ tokenauth="true"; {code} > Add system tests for delegation token based authentication > -- > > Key: KAFKA-4544 > URL: https://issues.apache.org/jira/browse/KAFKA-4544 > Project: Kafka > Issue Type: Sub-task > Components: security >Reporter: Ashish Singh >Assignee: Attila Sasvari >Priority: Major > > Add system tests for delegation token based authentication. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
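Expanding on the suggestion above, the token credentials would be passed in an ordinary client properties file instead of a jaas.conf. A sketch of such a file follows (the property names are standard Kafka client configs; the SCRAM mechanism choice and the placeholder token values are assumptions for illustration):

```properties
# Client config for delegation token authentication, no jaas.conf needed.
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-256

# username is the token ID, password is the token HMAC (placeholders here).
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="<token-id>" \
  password="<token-hmac>" \
  tokenauth="true";
```

A file like this could then be handed to console_consumer.py and the verifiable clients as their consumer/producer configuration.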
[jira] [Comment Edited] (KAFKA-6817) UnknownProducerIdException when writing messages with old timestamps
[ https://issues.apache.org/jira/browse/KAFKA-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606121#comment-16606121 ] Collin Scangarella edited comment on KAFKA-6817 at 9/6/18 5:28 PM: --- Does this bug impact 2.0? If not, would a streams upgrade be enough? Or would we have to upgrade the brokers as well? was (Author: col...@scangarella.com): Does this bug impact 2.0? > UnknownProducerIdException when writing messages with old timestamps > > > Key: KAFKA-6817 > URL: https://issues.apache.org/jira/browse/KAFKA-6817 > Project: Kafka > Issue Type: Bug > Components: producer >Affects Versions: 1.1.0 >Reporter: Odin Standal >Priority: Major > > We are seeing the following exception in our Kafka application: > {code:java} > ERROR o.a.k.s.p.internals.StreamTask - task [0_0] Failed to close producer > due to the following error: org.apache.kafka.streams.errors.StreamsException: > task [0_0] Abort sending since an error caught with a previous record (key > 22 value some-value timestamp 1519200902670) to topic > exactly-once-test-topic- v2 due to This exception is raised by the broker if > it could not locate the producer metadata associated with the producerId in > question. This could happen if, for instance, the producer's records were > deleted because their retention time had elapsed. Once the last records of > the producerId are removed, the producer's metadata is removed from the > broker, and future appends by the producer will return this exception. 
at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:125) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:48) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:180) > at > org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1199) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:204) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:187) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:627) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:596) > at > org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:557) > at > org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:481) > at > org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74) > at > org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:692) > at > org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101) > at > org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:482) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:474) at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239) at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) at > java.lang.Thread.run(Thread.java:748) Caused by: > org.apache.kafka.common.errors.UnknownProducerIdException > {code} > We discovered this error when we had the need to reprocess old messages. 
See > more details on > [Stackoverflow|https://stackoverflow.com/questions/49872827/unknownproduceridexception-in-kafka-streams-when-enabling-exactly-once?noredirect=1#comment86901917_49872827] > We have reproduced the error with a smaller example application. The error > occurs after 10 minutes of producing messages that have old timestamps (type > 1 year old). The topic we are writing to has a retention.ms set to 1 year so > we are expecting the messages to stay there. > After digging through the ProducerStateManager-code in the Kafka source code > we have a theory of what might be wrong. > The ProducerStateManager.removeExpiredProducers() seems to remove producers > from memory erroneously when processing records which are older than the > maxProducerIdExpirationMs (coming from the `transactional.id.expiration.ms` > configuration), which is set by default to 7 days. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7093) Kafka warn messages after upgrade from 0.11.0.1 to 1.1.0
[ https://issues.apache.org/jira/browse/KAFKA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606126#comment-16606126 ] William Hammond commented on KAFKA-7093: Seeing the same issue as well, only on the __consumer_offsets partitions. > Kafka warn messages after upgrade from 0.11.0.1 to 1.1.0 > > > Key: KAFKA-7093 > URL: https://issues.apache.org/jira/browse/KAFKA-7093 > Project: Kafka > Issue Type: Bug > Components: offset manager >Affects Versions: 1.1.0 >Reporter: Suleyman >Priority: Major > > I upgraded Kafka from 0.11.0.1 to 1.1.0. After the upgrade, I am getting the > warning message below very frequently. > WARN Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. > This implies messages have arrived out of order. New: \{epoch:0, > offset:793868383}, Current: \{epoch:4, offset:792201264} for Partition: > __consumer_offsets-42 (kafka.server.epoch.LeaderEpochFileCache) > How can I resolve these warning messages? And why am I getting them? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-6817) UnknownProducerIdException when writing messages with old timestamps
[ https://issues.apache.org/jira/browse/KAFKA-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606121#comment-16606121 ] Collin Scangarella commented on KAFKA-6817: --- Does this bug impact 2.0? > UnknownProducerIdException when writing messages with old timestamps > > > Key: KAFKA-6817 > URL: https://issues.apache.org/jira/browse/KAFKA-6817 > Project: Kafka > Issue Type: Bug > Components: producer >Affects Versions: 1.1.0 >Reporter: Odin Standal >Priority: Major > > We are seeing the following exception in our Kafka application: > {code:java} > ERROR o.a.k.s.p.internals.StreamTask - task [0_0] Failed to close producer > due to the following error: org.apache.kafka.streams.errors.StreamsException: > task [0_0] Abort sending since an error caught with a previous record (key > 22 value some-value timestamp 1519200902670) to topic > exactly-once-test-topic- v2 due to This exception is raised by the broker if > it could not locate the producer metadata associated with the producerId in > question. This could happen if, for instance, the producer's records were > deleted because their retention time had elapsed. Once the last records of > the producerId are removed, the producer's metadata is removed from the > broker, and future appends by the producer will return this exception. 
at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:125) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:48) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:180) > at > org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1199) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:204) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:187) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:627) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:596) > at > org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:557) > at > org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:481) > at > org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74) > at > org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:692) > at > org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101) > at > org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:482) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:474) at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239) at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) at > java.lang.Thread.run(Thread.java:748) Caused by: > org.apache.kafka.common.errors.UnknownProducerIdException > {code} > We discovered this error when we had the need to reprocess old messages. 
See > more details on > [Stackoverflow|https://stackoverflow.com/questions/49872827/unknownproduceridexception-in-kafka-streams-when-enabling-exactly-once?noredirect=1#comment86901917_49872827] > We have reproduced the error with a smaller example application. The error > occurs after 10 minutes of producing messages that have old timestamps (type > 1 year old). The topic we are writing to has a retention.ms set to 1 year so > we are expecting the messages to stay there. > After digging through the ProducerStateManager-code in the Kafka source code > we have a theory of what might be wrong. > The ProducerStateManager.removeExpiredProducers() seems to remove producers > from memory erroneously when processing records which are older than the > maxProducerIdExpirationMs (coming from the `transactional.id.expiration.ms` > configuration), which is set by default to 7 days. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-6699) When one of two Kafka nodes are dead, streaming API cannot handle messaging
[ https://issues.apache.org/jira/browse/KAFKA-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606059#comment-16606059 ] Matthias J. Sax commented on KAFKA-6699: It's recommended to run with at least 3 brokers, replication factor 3, min.insync.replicas=2, and acks=all to guarantee consistency if one broker goes down. With your setting, i.e., acks=1, a write is acknowledged as soon as the leader has received it, but before it is replicated. Thus, if the leader broker fails before replication happens, you might lose the write (even though it was acked to the producer client). Also, with min.insync.replicas=2, if one of your two brokers goes down, you are no longer able to write at all. This is the reason why your Streams application starts to fail – the cluster cannot provide 2 in-sync replicas and thus rejects writes. You can either add one more broker, or reduce min.insync.replicas to 1, to allow writes while one broker is down. Overall, this ticket seems to be a config issue rather than a bug. In general, please use the mailing list for questions, and open tickets only if you know it's a bug. If you can verify that changing the config/setup resolves your issue, please report back so we can close this ticket. > When one of two Kafka nodes are dead, streaming API cannot handle messaging > --- > > Key: KAFKA-6699 > URL: https://issues.apache.org/jira/browse/KAFKA-6699 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.2 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > > Dears, > I am observing quite often that, when the Kafka broker is partly dead(*), > applications which use the Streams API do nothing. > (*) Partly dead in my case means that one of the two Kafka nodes is out of > order. > Especially when the disk is full on one machine, the broker goes into some > strange state where the Streams API goes on vacation. It seems the regular > producer/consumer API has no problem in such a case. > Could you have a look at this matter? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
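The recommended setup from the comment above can be summarized as a configuration sketch (the property names are standard Kafka configs; treat this as a starting point for a 3-broker cluster, not a definitive production recipe):

```properties
# Broker side (or per-topic via --config on kafka-topics.sh):
# every partition replicated to all 3 brokers, and a write needs
# at least 2 in-sync replicas to be accepted.
default.replication.factor=3
min.insync.replicas=2

# Producer side: wait for all in-sync replicas to acknowledge
# before a write is considered successful.
acks=all
```

With these settings the loss of any single broker neither loses acknowledged writes nor blocks new ones, which is exactly the failure mode described in this ticket.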
[jira] [Closed] (KAFKA-5882) NullPointerException in StreamTask
[ https://issues.apache.org/jira/browse/KAFKA-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seweryn Habdank-Wojewodzki closed KAFKA-5882. - No longer visible in Kafka 1.1.1 > NullPointerException in StreamTask > -- > > Key: KAFKA-5882 > URL: https://issues.apache.org/jira/browse/KAFKA-5882 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 0.11.0.2 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > Fix For: 1.1.1 > > > It seems bugfix [KAFKA-5073|https://issues.apache.org/jira/browse/KAFKA-5073] > was made, but it introduced some other issue. > In some cases (I am not sure which ones) I get an NPE (below). > I would expect that even in case of a FATAL error, anything except an NPE is thrown. > {code} > 2017-09-12 23:34:54 ERROR ConsumerCoordinator:269 - User provided listener > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener > for group streamer failed on partition assignment > java.lang.NullPointerException: null > at > org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:123) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:1234) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:294) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:254) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:1313) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.access$1100(StreamThread.java:73) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:183) > ~[myapp-streamer.jar:?] 
> at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:265) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:363) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:582) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:553) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:527) > [myapp-streamer.jar:?] > 2017-09-12 23:34:54 INFO StreamThread:1040 - stream-thread > [streamer-3a44578b-faa8-4b5b-bbeb-7a7f04639563-StreamThread-1] Shutting down > 2017-09-12 23:34:54 INFO KafkaProducer:972 - Closing the Kafka producer with > timeoutMillis = 9223372036854775807 ms. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-5882) NullPointerException in StreamTask
[ https://issues.apache.org/jira/browse/KAFKA-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606053#comment-16606053 ] Guozhang Wang commented on KAFKA-5882: -- Thanks [~habdank] for confirming, since [~vikasamar] mentioned they see the issue but in 0.11.0.x only, I think it is okay to resolve it as for 1.1.1 for now. If people see a similar issue beyond 1.1.1, we can always create a new JIRA. > NullPointerException in StreamTask > -- > > Key: KAFKA-5882 > URL: https://issues.apache.org/jira/browse/KAFKA-5882 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 0.11.0.2 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > Fix For: 1.1.1 > > > It seems bugfix [KAFKA-5073|https://issues.apache.org/jira/browse/KAFKA-5073] > is made, but introduce some other issue. > In some cases (I am not sure which ones) I got NPE (below). > I would expect that even in case of FATAL error anythink except NPE is thrown. > {code} > 2017-09-12 23:34:54 ERROR ConsumerCoordinator:269 - User provided listener > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener > for group streamer failed on partition assignment > java.lang.NullPointerException: null > at > org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:123) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:1234) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:294) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:254) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:1313) > ~[myapp-streamer.jar:?] 
> at > org.apache.kafka.streams.processor.internals.StreamThread.access$1100(StreamThread.java:73) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:183) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:265) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:363) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:582) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:553) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:527) > [myapp-streamer.jar:?] > 2017-09-12 23:34:54 INFO StreamThread:1040 - stream-thread > [streamer-3a44578b-faa8-4b5b-bbeb-7a7f04639563-StreamThread-1] Shutting down > 2017-09-12 23:34:54 INFO KafkaProducer:972 - Closing the Kafka producer with > timeoutMillis = 9223372036854775807 ms. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-5882) NullPointerException in StreamTask
[ https://issues.apache.org/jira/browse/KAFKA-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang resolved KAFKA-5882. -- Resolution: Fixed Fix Version/s: 1.1.1 > NullPointerException in StreamTask > -- > > Key: KAFKA-5882 > URL: https://issues.apache.org/jira/browse/KAFKA-5882 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 0.11.0.2 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > Fix For: 1.1.1 > > > It seems bugfix [KAFKA-5073|https://issues.apache.org/jira/browse/KAFKA-5073] > is made, but introduce some other issue. > In some cases (I am not sure which ones) I got NPE (below). > I would expect that even in case of FATAL error anythink except NPE is thrown. > {code} > 2017-09-12 23:34:54 ERROR ConsumerCoordinator:269 - User provided listener > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener > for group streamer failed on partition assignment > java.lang.NullPointerException: null > at > org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:123) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:1234) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:294) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:254) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:1313) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.access$1100(StreamThread.java:73) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:183) > ~[myapp-streamer.jar:?] 
> at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:265) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:363) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:582) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:553) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:527) > [myapp-streamer.jar:?] > 2017-09-12 23:34:54 INFO StreamThread:1040 - stream-thread > [streamer-3a44578b-faa8-4b5b-bbeb-7a7f04639563-StreamThread-1] Shutting down > 2017-09-12 23:34:54 INFO KafkaProducer:972 - Closing the Kafka producer with > timeoutMillis = 9223372036854775807 ms. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-4544) Add system tests for delegation token based authentication
[ https://issues.apache.org/jira/browse/KAFKA-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605944#comment-16605944 ] Attila Sasvari commented on KAFKA-4544: --- [~omkreddy] thanks for the info, I extended the test case to better cover the lifecycle of a delegation token based on your idea: - Create delegation token - Create a console-producer using SCRAM and delegation token and produce a message - Verify message is created (with kafka.search_data_files() ) - Create a console-consumer using SCRAM and delegation token and consume 1 message - Expire the token, immediately - Try producing one more message with the expired token - Verify the last message is not persisted by the broker Initially, I wanted to use console_consumer.py and verifiable clients to validate things (messages produced / consumed), but I ran into some issues: - jaas.conf / KafkaClient config cannot include more login modules {code} Multiple LoginModule-s in JAAS Caused by: java.lang.IllegalArgumentException: JAAS config property contains 2 login modules, should be 1 module at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:95) at org.apache.kafka.common.security.JaasContext.loadClientContext(JaasContext.java:84) at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:119) at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:65) at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:88) at org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:419) {code} - To request a delegation token, we need GSSAPI (and use keytab), subsequently, consumers and producers use the delegation token. So I ended up constructing manually the jaas.config and client configs in my POC. 
- With and even without my changes, JMX failed to start up when I tried to run {{./ducker-ak test ../kafkatest/sanity_checks/test_console_consumer.py}}: {code} Exception in thread ConsoleConsumer-0-140287252789520-worker-1: Traceback (most recent call last): File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner self.run() File "/usr/lib/python2.7/threading.py", line 754, in run self.__target(*self.__args, **self.__kwargs) File "/usr/local/lib/python2.7/dist-packages/ducktape/services/background_thread.py", line 35, in _protected_worker self._worker(idx, node) File "/opt/kafka-dev/tests/kafkatest/services/console_consumer.py", line 229, in _worker self.start_jmx_tool(idx, node) File "/opt/kafka-dev/tests/kafkatest/services/monitor/jmx.py", line 86, in start_jmx_tool wait_until(lambda: self._jmx_has_output(node), timeout_sec=10, backoff_sec=.5, err_msg="%s: Jmx tool took too long to start" % node.account) File "/usr/local/lib/python2.7/dist-packages/ducktape/utils/util.py", line 36, in wait_until raise TimeoutError(err_msg) TimeoutError: ducker@ducker04: Jmx tool took too long to start {code} Right now a lot of things are [hardcoded|https://github.com/asasvari/kafka/commit/edfc37012079764d2a589dbf5f24ad04505975d4#diff-3e7b2bdbd55d075bcebbbe5ba8c4e269] (using shell scripts) in my POC. It would be nice to extract common functionalities and make them easily reusable (e.g. creating wrappers in python, for example, to do delegation token handling). > Add system tests for delegation token based authentication > -- > > Key: KAFKA-4544 > URL: https://issues.apache.org/jira/browse/KAFKA-4544 > Project: Kafka > Issue Type: Sub-task > Components: security >Reporter: Ashish Singh >Assignee: Attila Sasvari >Priority: Major > > Add system tests for delegation token based authentication. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7373) GetOffsetShell doesn't work when SSL authentication is enabled
[ https://issues.apache.org/jira/browse/KAFKA-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605937#comment-16605937 ] Stanislav Kozlovski commented on KAFKA-7373: Hi [~kiwiandy], There is KIP-308 for GetOffsetShell which suggests the addition of a `--consumer-property` field: https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+GetOffsetShell%3A+new+KafkaConsumer+API%2C+support+for+multiple+topics%2C+minimize+the+number+of+requests+to+server > GetOffsetShell doesn't work when SSL authentication is enabled > -- > > Key: KAFKA-7373 > URL: https://issues.apache.org/jira/browse/KAFKA-7373 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Andy Bryant >Priority: Major > > GetOffsetShell doesn't provide a mechanism to provide additional > configuration for the underlying KafkaConsumer as does the `ConsoleConsumer`. > Passing SSL config as system properties doesn't propagate to the consumer > either. > {code:java} > 10:47 $ ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > ${BROKER_LIST} --topic cld-dev-sor-crods-crodsdba_contact > Exception in thread "main" org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata{code} > Editing {{GetOffsetShell.scala}} to include the SSL properties in the > KafkaConsumer configuration resolved the issue. > Providing {{consumer-property}} and {{consumer-config}} configuration options > for {{kafka-run-class-sh}} or creating a separate run script for offsets and > using these properties in {{GetOffsetShell.scala}} seems like a good solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-7373) GetOffsetShell doesn't work when SSL authentication is enabled
[ https://issues.apache.org/jira/browse/KAFKA-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Kozlovski reassigned KAFKA-7373: -- Assignee: (was: Andy Bryant) > GetOffsetShell doesn't work when SSL authentication is enabled > -- > > Key: KAFKA-7373 > URL: https://issues.apache.org/jira/browse/KAFKA-7373 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Andy Bryant >Priority: Major > > GetOffsetShell doesn't provide a mechanism to provide additional > configuration for the underlying KafkaConsumer as does the `ConsoleConsumer`. > Passing SSL config as system properties doesn't propagate to the consumer > either. > {code:java} > 10:47 $ ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > ${BROKER_LIST} --topic cld-dev-sor-crods-crodsdba_contact > Exception in thread "main" org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata{code} > Editing {{GetOffsetShell.scala}} to include the SSL properties in the > KafkaConsumer configuration resolved the issue. > Providing {{consumer-property}} and {{consumer-config}} configuration options > for {{kafka-run-class-sh}} or creating a separate run script for offsets and > using these properties in {{GetOffsetShell.scala}} seems like a good solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-7373) GetOffsetShell doesn't work when SSL authentication is enabled
[ https://issues.apache.org/jira/browse/KAFKA-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Kozlovski reassigned KAFKA-7373: -- Assignee: Andy Bryant (was: Stanislav Kozlovski) > GetOffsetShell doesn't work when SSL authentication is enabled > -- > > Key: KAFKA-7373 > URL: https://issues.apache.org/jira/browse/KAFKA-7373 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Andy Bryant >Assignee: Andy Bryant >Priority: Major > > GetOffsetShell doesn't provide a mechanism to provide additional > configuration for the underlying KafkaConsumer as does the `ConsoleConsumer`. > Passing SSL config as system properties doesn't propagate to the consumer > either. > {code:java} > 10:47 $ ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > ${BROKER_LIST} --topic cld-dev-sor-crods-crodsdba_contact > Exception in thread "main" org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata{code} > Editing {{GetOffsetShell.scala}} to include the SSL properties in the > KafkaConsumer configuration resolved the issue. > Providing {{consumer-property}} and {{consumer-config}} configuration options > for {{kafka-run-class-sh}} or creating a separate run script for offsets and > using these properties in {{GetOffsetShell.scala}} seems like a good solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7373) GetOffsetShell doesn't work when SSL authentication is enabled
[ https://issues.apache.org/jira/browse/KAFKA-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605931#comment-16605931 ] ASF GitHub Bot commented on KAFKA-7373: --- stanislavkozlovski closed pull request #5617: KAFKA-7373: Allow GetOffsetShell command to accept a configurations file URL: https://github.com/apache/kafka/pull/5617 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: diff --git a/core/src/main/scala/kafka/tools/GetOffsetShell.scala b/core/src/main/scala/kafka/tools/GetOffsetShell.scala index eafddc66de4..6174479217f 100644 --- a/core/src/main/scala/kafka/tools/GetOffsetShell.scala +++ b/core/src/main/scala/kafka/tools/GetOffsetShell.scala @@ -26,6 +26,7 @@ import org.apache.kafka.clients.consumer.{ConsumerConfig, KafkaConsumer} import org.apache.kafka.common.{PartitionInfo, TopicPartition} import org.apache.kafka.common.requests.ListOffsetRequest import org.apache.kafka.common.serialization.ByteArrayDeserializer +import org.apache.kafka.common.utils.Utils import scala.collection.JavaConverters._ @@ -61,6 +62,10 @@ object GetOffsetShell { .describedAs("ms") .ofType(classOf[java.lang.Integer]) .defaultsTo(1000) +val configOpt = parser.accepts("config", s"Configuration properties file. Useful for configuring authentication. Note that all argument options take precedence over this config.") + .withOptionalArg() + .describedAs("config file") + .ofType(classOf[String]) if (args.length == 0) CommandLineUtils.printUsageAndDie(parser, "An interactive shell for getting topic offsets.") @@ -89,7 +94,10 @@ object GetOffsetShell { } val listOffsetsTimestamp = options.valueOf(timeOpt).longValue -val config = new Properties +val config = if (options.has(configOpt)) + Utils.loadProps(options.valueOf(configOpt)) +else + new Properties config.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokerList) config.setProperty(ConsumerConfig.CLIENT_ID_CONFIG, clientId) val consumer = new KafkaConsumer(config, new ByteArrayDeserializer, new ByteArrayDeserializer) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > GetOffsetShell doesn't work when SSL authentication is enabled > -- > > Key: KAFKA-7373 > URL: https://issues.apache.org/jira/browse/KAFKA-7373 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Andy Bryant >Assignee: Stanislav Kozlovski >Priority: Major > > GetOffsetShell doesn't provide a mechanism to pass additional > configuration to the underlying KafkaConsumer, as `ConsoleConsumer` does. > Passing SSL config as system properties doesn't propagate to the consumer > either. > {code:java} > 10:47 $ ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > ${BROKER_LIST} --topic cld-dev-sor-crods-crodsdba_contact > Exception in thread "main" org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata{code} > Editing {{GetOffsetShell.scala}} to include the SSL properties in the > KafkaConsumer configuration resolved the issue. 
> Providing {{consumer-property}} and {{consumer-config}} configuration options > for {{kafka-run-class.sh}} or creating a separate run script for offsets and > using these properties in {{GetOffsetShell.scala}} seems like a good solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
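With the {{--config}} option from the merged pull request above, connecting over SSL would look roughly like the sketch below. The file path, truststore location, password, and topic name are placeholders; `security.protocol` and the `ssl.*` keys are the standard Kafka client settings.

```shell
# Write a client config file with the standard Kafka SSL client settings
# (locations and passwords here are placeholders).
cat > /tmp/client-ssl.properties <<'EOF'
security.protocol=SSL
ssl.truststore.location=/path/to/client.truststore.jks
ssl.truststore.password=changeit
EOF

# Then pass it to the tool via the --config option added by the patch:
# ./kafka-run-class.sh kafka.tools.GetOffsetShell \
#   --broker-list "${BROKER_LIST}" --topic my-topic \
#   --config /tmp/client-ssl.properties
```

This mirrors how `ConsoleConsumer` already accepts a `--consumer.config` file, which is the precedent the reporter points to.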
[jira] [Commented] (KAFKA-6777) Wrong reaction on Out Of Memory situation
[ https://issues.apache.org/jira/browse/KAFKA-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605926#comment-16605926 ] John Roesler commented on KAFKA-6777: - Yes, it certainly seems safer to generally avoid catching Errors. > But anyhow, I would expect that a lack of critical resources like memory will >quickly lead to a crash with a FATAL error, e.g. Out Of Memory. Have you seen some evidence that this is not happening? Note that frequent or long GC pauses are not the same as running out of memory. > Wrong reaction on Out Of Memory situation > - > > Key: KAFKA-6777 > URL: https://issues.apache.org/jira/browse/KAFKA-6777 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.0.0 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Critical > Attachments: screenshot-1.png > > > Dears, > We have already encountered problems many times related to Out Of Memory > situations in the Kafka Broker and streaming clients. > The scenario is the following. > When the Kafka Broker (or Streaming Client) is under load and has too little > memory, there are no errors in the server logs. One can see some cryptic entries > in the GC logs, but they are definitely not self-explanatory. > The Kafka Broker (and Streaming Clients) keeps working. Later we see in JMX > monitoring that the JVM spends more and more time in GC. In our case it grows from > e.g. 1% to 80%-90% of CPU time used by GC. > Next, the software collapses into zombie mode – the process does not end. In such a > case I would expect the process to crash (e.g. get SIGSEGV). Even worse, > Kafka treats such a zombie process as normal and still sends messages, which > are in fact getting lost, and the cluster does not exclude broken nodes. The > question is how to configure Kafka to really terminate the JVM and not remain > in zombie mode, to give other nodes a chance to know that something is > dead. > I would expect that in an Out Of Memory situation the JVM is ended; if not > gracefully, then at least the process crashes. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
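For the reporter's question about making the JVM terminate instead of lingering as a zombie, a common approach (a sketch, not something this thread confirms for Kafka specifically) is to pass the HotSpot out-of-memory flags via `KAFKA_OPTS`, which `kafka-run-class.sh` appends to the JVM command line:

```shell
# HotSpot flags (available since JDK 8u92) that make the JVM die on OOM
# instead of limping on in GC-thrash/zombie mode:
#   -XX:+ExitOnOutOfMemoryError   exit immediately on the first OOM
#   -XX:+CrashOnOutOfMemoryError  exit and produce a crash dump instead
# kafka-run-class.sh picks up extra JVM options from KAFKA_OPTS.
export KAFKA_OPTS="-XX:+ExitOnOutOfMemoryError"
```

A hard exit lets the cluster detect the dead node and elect new partition leaders, rather than routing traffic to a process that is alive but spending 80-90% of its CPU in GC.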
[jira] [Commented] (KAFKA-7373) GetOffsetShell doesn't work when SSL authentication is enabled
[ https://issues.apache.org/jira/browse/KAFKA-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605911#comment-16605911 ] ASF GitHub Bot commented on KAFKA-7373: --- stanislavkozlovski opened a new pull request #5617: KAFKA-7373: Allow GetOffsetShell command to accept a configurations file URL: https://github.com/apache/kafka/pull/5617 GetOffsetShell doesn't provide a mechanism to provide additional configuration for the underlying KafkaConsumer as does the `ConsoleConsumer`. This leaves it unable to connect to a broker using SSL. This PR allows it to accept a client configuration file, subsequently allowing it to provide SSL configurations and connect to a broker. I tested this manually. Trying to connect to a broker's SSL listener raised an out of memory error for me. After passing in the appropriate configurations via a config file, it connected successfully This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > GetOffsetShell doesn't work when SSL authentication is enabled > -- > > Key: KAFKA-7373 > URL: https://issues.apache.org/jira/browse/KAFKA-7373 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Andy Bryant >Assignee: Stanislav Kozlovski >Priority: Major > > GetOffsetShell doesn't provide a mechanism to provide additional > configuration for the underlying KafkaConsumer as does the `ConsoleConsumer`. > Passing SSL config as system properties doesn't propagate to the consumer > either. 
> {code:java} > 10:47 $ ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > ${BROKER_LIST} --topic cld-dev-sor-crods-crodsdba_contact > Exception in thread "main" org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata{code} > Editing {{GetOffsetShell.scala}} to include the SSL properties in the > KafkaConsumer configuration resolved the issue. > Providing {{consumer-property}} and {{consumer-config}} configuration options > for {{kafka-run-class-sh}} or creating a separate run script for offsets and > using these properties in {{GetOffsetShell.scala}} seems like a good solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7373) GetOffsetShell doesn't work when SSL authentication is enabled
[ https://issues.apache.org/jira/browse/KAFKA-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605905#comment-16605905 ] Stanislav Kozlovski commented on KAFKA-7373: I get another error with Kafka trunk {code:java} Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:68) at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:341) at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30) at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:112) at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:360) at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:321) at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:609) at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:541) at org.apache.kafka.common.network.Selector.poll(Selector.java:467) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:265) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215) at org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:272) at org.apache.kafka.clients.consumer.internals.Fetcher.getAllTopicMetadata(Fetcher.java:255) at org.apache.kafka.clients.consumer.KafkaConsumer.listTopics(KafkaConsumer.java:1799) at org.apache.kafka.clients.consumer.KafkaConsumer.listTopics(KafkaConsumer.java:1777) at kafka.tools.GetOffsetShell$.listPartitionInfos(GetOffsetShell.scala:156) at kafka.tools.GetOffsetShell$.main(GetOffsetShell.scala:110) at kafka.tools.GetOffsetShell.main(GetOffsetShell.scala) {code} I'm going to implement the approach of passing a config file > 
GetOffsetShell doesn't work when SSL authentication is enabled > -- > > Key: KAFKA-7373 > URL: https://issues.apache.org/jira/browse/KAFKA-7373 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Andy Bryant >Assignee: Stanislav Kozlovski >Priority: Major > > GetOffsetShell doesn't provide a mechanism to provide additional > configuration for the underlying KafkaConsumer as does the `ConsoleConsumer`. > Passing SSL config as system properties doesn't propagate to the consumer > either. > {code:java} > 10:47 $ ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > ${BROKER_LIST} --topic cld-dev-sor-crods-crodsdba_contact > Exception in thread "main" org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata{code} > Editing {{GetOffsetShell.scala}} to include the SSL properties in the > KafkaConsumer configuration resolved the issue. > Providing {{consumer-property}} and {{consumer-config}} configuration options > for {{kafka-run-class-sh}} or creating a separate run script for offsets and > using these properties in {{GetOffsetShell.scala}} seems like a good solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
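The OutOfMemoryError in the comment above is the classic symptom of a plaintext client talking to an SSL listener: the client reads the first four bytes of the TLS record that comes back and interprets them as a big-endian request-size prefix, then tries to allocate a buffer that large (the `MemoryPool.tryAllocate` frame in the trace). A sketch of the misread, with a typical TLS handshake record header:

```shell
# A TLS record starts with content type 0x16 (handshake), version 0x03 0x01,
# then the record length. A plaintext Kafka client treats the first 4 bytes
# 0x16 0x03 0x01 0x02 as a size prefix:
size=$(( 0x16030102 ))
echo "$size"   # 369295618 -> an allocation attempt of roughly 350 MB
```

Which is why the failure surfaces as an OOM rather than an SSL handshake error.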
[jira] [Assigned] (KAFKA-7373) GetOffsetShell doesn't work when SSL authentication is enabled
[ https://issues.apache.org/jira/browse/KAFKA-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Kozlovski reassigned KAFKA-7373: -- Assignee: Stanislav Kozlovski > GetOffsetShell doesn't work when SSL authentication is enabled > -- > > Key: KAFKA-7373 > URL: https://issues.apache.org/jira/browse/KAFKA-7373 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Andy Bryant >Assignee: Stanislav Kozlovski >Priority: Major > > GetOffsetShell doesn't provide a mechanism to provide additional > configuration for the underlying KafkaConsumer as does the `ConsoleConsumer`. > Passing SSL config as system properties doesn't propagate to the consumer > either. > {code:java} > 10:47 $ ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > ${BROKER_LIST} --topic cld-dev-sor-crods-crodsdba_contact > Exception in thread "main" org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata{code} > Editing {{GetOffsetShell.scala}} to include the SSL properties in the > KafkaConsumer configuration resolved the issue. > Providing {{consumer-property}} and {{consumer-config}} configuration options > for {{kafka-run-class-sh}} or creating a separate run script for offsets and > using these properties in {{GetOffsetShell.scala}} seems like a good solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-5882) NullPointerException in StreamTask
[ https://issues.apache.org/jira/browse/KAFKA-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605883#comment-16605883 ] Seweryn Habdank-Wojewodzki commented on KAFKA-5882: --- In Kafka 1.1.1 I do not see this anymore. Shall I close the issue with a comment about that? > NullPointerException in StreamTask > -- > > Key: KAFKA-5882 > URL: https://issues.apache.org/jira/browse/KAFKA-5882 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 0.11.0.2 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > > It seems bugfix [KAFKA-5073|https://issues.apache.org/jira/browse/KAFKA-5073] > was made, but it introduced another issue. > In some cases (I am not sure which ones) I got an NPE (below). > I would expect that even in the case of a FATAL error, anything other than an NPE is thrown. > {code} > 2017-09-12 23:34:54 ERROR ConsumerCoordinator:269 - User provided listener > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener > for group streamer failed on partition assignment > java.lang.NullPointerException: null > at > org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:123) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:1234) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:294) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:254) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:1313) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.access$1100(StreamThread.java:73) > ~[myapp-streamer.jar:?] 
> at > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:183) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:265) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:363) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:582) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:553) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:527) > [myapp-streamer.jar:?] > 2017-09-12 23:34:54 INFO StreamThread:1040 - stream-thread > [streamer-3a44578b-faa8-4b5b-bbeb-7a7f04639563-StreamThread-1] Shutting down > 2017-09-12 23:34:54 INFO KafkaProducer:972 - Closing the Kafka producer with > timeoutMillis = 9223372036854775807 ms. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (KAFKA-6457) Error: NOT_LEADER_FOR_PARTITION leads to NPE
[ https://issues.apache.org/jira/browse/KAFKA-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seweryn Habdank-Wojewodzki closed KAFKA-6457. - In Kafka 1.1.1 I no longer see it. > Error: NOT_LEADER_FOR_PARTITION leads to NPE > > > Key: KAFKA-6457 > URL: https://issues.apache.org/jira/browse/KAFKA-6457 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 1.0.0 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > Fix For: 1.0.1 > > > One of our nodes was dead. Then the second one took over all responsibility. > But the streaming application crashed in the meantime due to an NPE caused by > {{Error: NOT_LEADER_FOR_PARTITION}}. > The stack trace is below. > > Is this expected? > > {code:java} > 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer] > Got error produce response with correlation id 768962 on topic-partition > my_internal_topic-5, retrying (9 attempts left). Error: > NOT_LEADER_FOR_PARTITION > 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer] > Got error produce response with correlation id 768962 on topic-partition > my_internal_topic-7, retrying (9 attempts left). Error: > NOT_LEADER_FOR_PARTITION > 2018-01-17 11:47:21 [my] [ERROR] AbstractCoordinator:296 - [Consumer > clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-consumer, > groupId=restreamer-my] Heartbeat thread for group restreamer-my failed due > to unexpected error > java.lang.NullPointerException: null > at > org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:436) > ~[my-restreamer.jar:?] > at org.apache.kafka.common.network.Selector.poll(Selector.java:395) > ~[my-restreamer.jar:?] 
> at > org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:238) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.pollNoWakeup(ConsumerNetworkClient.java:275) > ~[my-restreamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:934) > [my-restreamer.jar:?] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-6699) When one of two Kafka nodes is dead, streaming API cannot handle messaging
[ https://issues.apache.org/jira/browse/KAFKA-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605867#comment-16605867 ] Seweryn Habdank-Wojewodzki commented on KAFKA-6699: --- ISR: 2 ack: 1 Do you suggest having more brokers? > When one of two Kafka nodes is dead, streaming API cannot handle messaging > --- > > Key: KAFKA-6699 > URL: https://issues.apache.org/jira/browse/KAFKA-6699 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.2 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > > Dears, > I observe quite often that when the Kafka Broker is partly dead(*), > applications which use the streaming API do nothing. > (*) Partly dead in my case means that one of the two Kafka nodes is out of > order. > Especially when the disk is full on one machine, the Broker goes into some > strange state where the streaming API goes on vacation. It seems like the regular > producer/consumer API has no problem in such a case. > Can you have a look at this matter? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-7214) Mystic FATAL error
[ https://issues.apache.org/jira/browse/KAFKA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605851#comment-16605851 ] Seweryn Habdank-Wojewodzki edited comment on KAFKA-7214 at 9/6/18 2:40 PM: --- The problem is that KSTREAM-SOURCE-X is mostly KSTREAM-SOURCE-0, independently of which process and how many processes are running (or trying to run). How I reproduce the error on my side: Let's assume I have a low message flow, < 100 Msg/sec, Msg size ~ 1 kB. I start an app using the streaming API. This app reads from 30 topics and sends messages to 1 topic. Let's give this app a 300 MB JVM heap. It is starting. Cool. At a second server I start a second instance. The same. It is starting. The other case: let's assume I have a bit higher message flow, > 5000 Msg/sec, Msg size ~ 1 kB. I start an app using the streaming API. This app reads from 30 topics and sends messages to 1 topic. Let's give this app a 300 MB JVM heap. It is not starting, even if the memory estimate says it is enough to hold 30 sec of messages: 5000 Msg/s ~ 150,000 Msg/30 sec ~ 150 MB. I give the app a 2 GB heap. It is starting. Everything between 300 MB and 2 GB leads at some point to yet more mystic crashes. At a second server I start a second instance. If I start it with 300 MB, I get this error immediately. The application tries to start, but then I get this error and all affected topics are going to be dead. If I give it 1 GB, it is better: the application works for some hours, but any minimal peak from around 5000 Msg/s to e.g. 7000 Msg/s causes the same. Finally - now - I start the processes with 5 GB; they can work continuously for like 2-4 days. I am sorry, I have no better description. Once I tried to enable TRACE level logs in Kafka, but this is impossible with a message flow of 5000 Msg/s. was (Author: habdank): The problem is that KSTREAM-SOURCE-X is mostly KSTREAM-SOURCE-0, independently of which process and how many processes are running (or trying to run). How I reproduce the error on my side: Let's assume I have a low message flow, < 100 Msg/sec, Msg size ~ 1 kB. I start an app using the streaming API. This app reads from 30 topics and sends messages to 1 topic. Let's give this app a 300 MB JVM heap. It is starting. Cool. At a second server I start a second instance. The same. It is starting. The other case: let's assume I have a bit higher message flow, > 5000 Msg/sec, Msg size ~ 1 kB. I start an app using the streaming API. This app reads from 30 topics and sends messages to 1 topic. Let's give this app a 300 MB JVM heap. It is not starting, even if the memory estimate says it is enough to hold 30 sec of messages: 5000 Msg/s ~ 150,000 Msg/30 sec ~ 150 MB. I give the app a 2 GB heap. It is starting. Everything between 300 MB and 2 GB leads at some point to yet more mystic crashes. At a second server I start a second instance. If I start it with 300 MB, I get this error immediately. The application tries to start, but then I get this error and all affected topics are going to be dead. If I give it 1 GB, it is better: the application works for some hours, but any minimal peak from around 5000 Msg/s to e.g. 7000 Msg/s causes the same. Finally - now - I start the processes with 5 GB; they can work continuously for like 2-4 days. I am sorry, I have no better description. Once I tried to enable TRACE level logs in Kafka, but this is impossible with a message flow of 5000 Msg/s. > Mystic FATAL error > -- > > Key: KAFKA-7214 > URL: https://issues.apache.org/jira/browse/KAFKA-7214 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.3, 1.1.1 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Critical > > Dears, > Very often at startup of the streaming application I got exception: > {code} > Exception caught in process. 
taskId=0_1, processor=KSTREAM-SOURCE-00, > topic=my_instance_medium_topic, partition=1, offset=198900203; > [org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:212), > > org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:347), > > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:420), > > org.apache.kafka.streams.processor.internals.AssignedTasks.process(AssignedTasks.java:339), > > org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:648), > > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:513), > > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:482), > > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:459)] > in thread >
[jira] [Commented] (KAFKA-7214) Mystic FATAL error
[ https://issues.apache.org/jira/browse/KAFKA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605851#comment-16605851 ] Seweryn Habdank-Wojewodzki commented on KAFKA-7214: --- The problem is that KSTREAM-SOURCE-X is mostly KSTREAM-SOURCE-0, independently of which process and how many processes are running (or trying to run). How I reproduce the error on my side: Let's assume I have a low message flow, < 100 Msg/sec, Msg size ~ 1 kB. I start an app using the streaming API. This app reads from 30 topics and sends messages to 1 topic. Let's give this app a 300 MB JVM heap. It is starting. Cool. At a second server I start a second instance. The same. It is starting. The other case: let's assume I have a bit higher message flow, > 5000 Msg/sec, Msg size ~ 1 kB. I start an app using the streaming API. This app reads from 30 topics and sends messages to 1 topic. Let's give this app a 300 MB JVM heap. It is not starting, even if the memory estimate says it is enough to hold 30 sec of messages: 5000 Msg/s ~ 150,000 Msg/30 sec ~ 150 MB. I give the app a 2 GB heap. It is starting. Everything between 300 MB and 2 GB leads at some point to yet more mystic crashes. At a second server I start a second instance. If I start it with 300 MB, I get this error immediately. The application tries to start, but then I get this error and all affected topics are going to be dead. If I give it 1 GB, it is better: the application works for some hours, but any minimal peak from around 5000 Msg/s to e.g. 7000 Msg/s causes the same. Finally - now - I start the processes with 5 GB; they can work continuously for like 2-4 days. I am sorry, I have no better description. Once I tried to enable TRACE level logs in Kafka, but this is impossible with a message flow of 5000 Msg/s. 
> Mystic FATAL error > -- > > Key: KAFKA-7214 > URL: https://issues.apache.org/jira/browse/KAFKA-7214 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.3, 1.1.1 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Critical > > Dears, > Very often at startup of the streaming application I got exception: > {code} > Exception caught in process. taskId=0_1, processor=KSTREAM-SOURCE-00, > topic=my_instance_medium_topic, partition=1, offset=198900203; > [org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:212), > > org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:347), > > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:420), > > org.apache.kafka.streams.processor.internals.AssignedTasks.process(AssignedTasks.java:339), > > org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:648), > > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:513), > > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:482), > > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:459)] > in thread > my_application-my_instance-my_instance_medium-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62 > {code} > and then (without shutdown request from my side): > {code} > 2018-07-30 07:45:02 [ar313] [INFO ] StreamThread:912 - stream-thread > [my_application-my_instance-my_instance-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62] > State transition from PENDING_SHUTDOWN to DEAD. > {code} > What is this? > How to correctly handle it? > Thanks in advance for help. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
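The back-of-the-envelope sizing from the reproduction steps above can be checked directly; the figures are the reporter's estimates, not measurements:

```shell
# Reporter's estimate: 5000 Msg/s at ~1 kB each, buffered over 30 seconds.
msgs_per_sec=5000
msg_size_kb=1
window_sec=30
buffered_kb=$(( msgs_per_sec * msg_size_kb * window_sec ))
echo "${buffered_kb} kB"   # 150000 kB, i.e. roughly 146 MiB
```

So the raw message volume alone should fit in a 300 MB heap with room to spare, which is why the repeated OOM-style failures at that size point to per-task overhead (buffers, RocksDB/state stores, rebalance churn) rather than message payload.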
[jira] [Created] (KAFKA-7382) We should guarantee that at least one replica of a partition is alive when creating or updating a topic
zhaoshijie created KAFKA-7382: - Summary: We should guarantee that at least one replica of a partition is alive when creating or updating a topic Key: KAFKA-7382 URL: https://issues.apache.org/jira/browse/KAFKA-7382 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.10.2.0 Reporter: zhaoshijie For example: I have brokers 1, 2, 3, 4, 5. I create a new topic by command: {code:java} sh kafka-topics.sh --create --topic replicaserror --zookeeper localhost:2181 --replica-assignment 11:12:13,12:13:14,14:15:11,14:12:11,13:14:11 {code} Then the KafkaController will process this; after the partitionStateMachine and replicaStateMachine handle the state change, the topic metadata and state will be strange: the partitions are in NewPartition and the replicas are in OnlineReplica. Next, we cannot delete this topic (because the state change is illegal). This will cause a number of problems, so I think we should validate the replica assignment when creating or updating a topic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7093) Kafka warn messages after upgrade from 0.11.0.1 to 1.1.0
[ https://issues.apache.org/jira/browse/KAFKA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605452#comment-16605452 ] Kathleen commented on KAFKA-7093: - We see this error as well on a partition specific to our code but also on a partition of __consumer_offsets. > Kafka warn messages after upgrade from 0.11.0.1 to 1.1.0 > > > Key: KAFKA-7093 > URL: https://issues.apache.org/jira/browse/KAFKA-7093 > Project: Kafka > Issue Type: Bug > Components: offset manager >Affects Versions: 1.1.0 >Reporter: Suleyman >Priority: Major > > I upgraded Kafka from version 0.11.0.1 to 1.1.0. After the upgrade, I'm > getting the warn message below far too often. > WARN Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. > This implies messages have arrived out of order. New: \{epoch:0, > offset:793868383}, Current: \{epoch:4, offset:792201264} for Partition: > __consumer_offsets-42 (kafka.server.epoch.LeaderEpochFileCache) > How can I resolve these warn messages? And why am I getting them? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7093) Kafka warn messages after upgrade from 0.11.0.1 to 1.1.0
[ https://issues.apache.org/jira/browse/KAFKA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605361#comment-16605361 ] Jameel Al-Aziz commented on KAFKA-7093: --- We've been observing the error as well. It occurred after trying to recover from a broker outage. The errors appear to prevent proper recovery or partition reassignment. > Kafka warn messages after upgrade from 0.11.0.1 to 1.1.0 > > > Key: KAFKA-7093 > URL: https://issues.apache.org/jira/browse/KAFKA-7093 > Project: Kafka > Issue Type: Bug > Components: offset manager >Affects Versions: 1.1.0 >Reporter: Suleyman >Priority: Major > > I upgraded Kafka from version 0.11.0.1 to 1.1.0. After the upgrade, I'm > getting the warn message below far too often. > WARN Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. > This implies messages have arrived out of order. New: \{epoch:0, > offset:793868383}, Current: \{epoch:4, offset:792201264} for Partition: > __consumer_offsets-42 (kafka.server.epoch.LeaderEpochFileCache) > How can I resolve these warn messages? And why am I getting them? -- This message was sent by Atlassian JIRA (v7.6.3#76005)