[jira] [Commented] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
[ https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064176#comment-16064176 ]

Mahesh Veerabathiran commented on KAFKA-5153:
---------------------------------------------

Having the same issue. Does anyone have a resolution?

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---------------------------------------------------------------------------
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.10.2.0
> Environment: RHEL 6, Java 1.8.0_91-b14
> Reporter: Arpan
> Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
> Hi Team,
> I was earlier referring to issue KAFKA-4477 because the problem I am facing is similar. I tried to find the same reference in the release docs as well, but did not see anything in 0.10.1.1 or 0.10.2.0. I am currently using 2.11_0.10.2.0.
> I have a 3-node Kafka cluster, with a ZooKeeper cluster on the same set of servers. Around 240 GB of data passes through Kafka every day. What we observe is a server disconnecting from the cluster and the ISR shrinking, which starts impacting service.
> I have also observed the file descriptor count increasing somewhat: in normal circumstances we have not observed an FD count above 500, but when the issue started we observed it in the range of 650-700 on all 3 servers.
> Attaching thread dumps of all 3 servers from when we most recently hit the issue. The issue vanishes once you bounce the nodes, and the setup does not run more than 5 days without it recurring. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching server.properties as well for one of the servers (it's similar on all 3 servers).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Created] (KAFKA-5522) ListOffset should take LSO into account when searching by timestamp
Jason Gustafson created KAFKA-5522:
-----------------------------------

Summary: ListOffset should take LSO into account when searching by timestamp
Key: KAFKA-5522
URL: https://issues.apache.org/jira/browse/KAFKA-5522
Project: Kafka
Issue Type: Sub-task
Affects Versions: 0.11.0.0
Reporter: Jason Gustafson
Assignee: Jason Gustafson
Fix For: 0.11.0.1

For a normal read_uncommitted consumer, we bound the offset returned from ListOffsets by the high watermark. For read_committed consumers, we should similarly bound offsets by the LSO. Currently we only handle the case of fetching the end offset.
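[Editor's sketch] The bounding rule described above can be illustrated in a few lines. This is a simplified model, not Kafka's actual broker code; `IsolationLevel`, `highWatermark`, and `lastStableOffset` are stand-ins for the broker's internals:

```java
// Sketch: bound a ListOffsets timestamp-lookup result by the consumer's
// visibility limit. Hypothetical names; not Kafka's real broker code.
public class ListOffsetsBound {
    enum IsolationLevel { READ_UNCOMMITTED, READ_COMMITTED }

    // foundOffset is the result of the timestamp search, before bounding.
    static long bound(long foundOffset, IsolationLevel level,
                      long highWatermark, long lastStableOffset) {
        // read_uncommitted consumers may see up to the high watermark;
        // read_committed consumers only up to the last stable offset (LSO).
        long limit = (level == IsolationLevel.READ_COMMITTED)
                ? lastStableOffset : highWatermark;
        return Math.min(foundOffset, limit);
    }

    public static void main(String[] args) {
        System.out.println(bound(120, IsolationLevel.READ_COMMITTED, 110, 100));   // 100
        System.out.println(bound(120, IsolationLevel.READ_UNCOMMITTED, 110, 100)); // 110
    }
}
```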
[jira] [Created] (KAFKA-5521) Support replicas movement between log directories (KIP-113)
Dong Lin created KAFKA-5521:
----------------------------

Summary: Support replicas movement between log directories (KIP-113)
Key: KAFKA-5521
URL: https://issues.apache.org/jira/browse/KAFKA-5521
Project: Kafka
Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin
[jira] [Resolved] (KAFKA-5367) Producer should not expire topic from metadata cache if accumulator still has data for this topic
[ https://issues.apache.org/jira/browse/KAFKA-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dong Lin resolved KAFKA-5367.
-----------------------------
Resolution: Invalid

> Producer should not expire topic from metadata cache if accumulator still has data for this topic
> -------------------------------------------------------------------------------------------------
> Key: KAFKA-5367
> URL: https://issues.apache.org/jira/browse/KAFKA-5367
> Project: Kafka
> Issue Type: Bug
> Reporter: Dong Lin
> Assignee: Dong Lin
>
> To be added.
[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063956#comment-16063956 ]

Matthias J. Sax commented on KAFKA-4750:
----------------------------------------

I think [~damianguy] or [~enothereska] can comment best on this. AFAIK, we use {{put(key, null)}} with delete semantics all over the place, also for {{KTable}} caches. As it aligns with changelog delete semantics, I also think it makes sense to keep it this way. I would rather educate users that a plugged-in Serde must not return {{null}} if the input is not {{null}}. We can also add checks to all {{Serde}} calls: (1) never call the Serde for {{null}}, as we know the result must be {{null}} anyway; (2) if we call the Serde with a non-null value, make sure it does not return {{null}} -- otherwise throw an exception.

> KeyValueIterator returns null values
> ------------------------------------
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
> Reporter: Michal Borowiecki
> Assignee: Evgeny Veretennikov
> Labels: newbie
> Attachments: DeleteTest.java
>
> The API for the ReadOnlyKeyValueStore.range method promises that the returned iterator will not return null values. However, after upgrading from 0.10.0.0 to 0.10.1.1 we found null values are returned, causing NPEs on our side.
> I found this happens after removing entries from the store, and I noticed a resemblance to the SAMZA-94 defect. The problem seems to be the same as it was there: when deleting entries with a serializer that does not return null when null is passed in, the state store doesn't actually delete that key/value pair, but the iterator returns a null value for that key.
> When I modified our serializer to return null when null is passed in, the problem went away. However, I believe this should be fixed in Kafka Streams, perhaps with an approach similar to SAMZA-94.
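[Editor's sketch] The two checks proposed in the comment above can be expressed as a validating wrapper. A simplified `Serializer` interface is assumed here (not the real `org.apache.kafka.common.serialization` one) so the sketch stays self-contained:

```java
import java.nio.charset.StandardCharsets;

public class ValidatingSerde {
    // Simplified stand-in for Kafka's Serializer interface.
    interface Serializer<T> { byte[] serialize(T data); }

    // Wraps a user serializer and enforces the proposed contract:
    // (1) null input maps to null directly; the inner serde is never called.
    // (2) non-null input must not serialize to null.
    static <T> Serializer<T> validating(Serializer<T> inner) {
        return data -> {
            if (data == null) return null;               // check (1)
            byte[] bytes = inner.serialize(data);
            if (bytes == null)                           // check (2)
                throw new IllegalStateException(
                        "Serializer returned null for non-null input");
            return bytes;
        };
    }

    public static void main(String[] args) {
        Serializer<String> s =
                validating(str -> str.getBytes(StandardCharsets.UTF_8));
        System.out.println(s.serialize(null));           // null -> delete semantics kept
        System.out.println(s.serialize("x").length);     // 1
    }
}
```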
[jira] [Updated] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
[ https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax updated KAFKA-5464:
-----------------------------------
Fix Version/s: 0.11.1.0

> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --------------------------------------------------------------
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.11.0.0, 0.10.2.1
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Fix For: 0.11.1.0, 0.10.2.2, 0.11.0.1
>
> In {{StreamsKafkaClient}} we use a {{NetworkClient}} internally and call {{poll}} using {{StreamsConfig.POLL_MS_CONFIG}} as the timeout.
> However, {{StreamsConfig.POLL_MS_CONFIG}} is meant to apply solely to {{KafkaConsumer.poll()}}, and it is incorrect to use it for the {{NetworkClient}}. If the config is increased, this can lead to an infinite rebalance: the effective rebalance timeout on the client side grows, and the client is no longer able to meet broker-enforced timeouts.
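[Editor's sketch] The direction of the fix implied by the description is to decouple the internal client's poll timeout from the consumer-facing config. The names below are hypothetical illustrations, not the actual patched code (see the linked PR for that):

```java
import java.util.Map;

public class InternalPollTimeout {
    // Consumer-facing config; users may tune this arbitrarily high.
    static final String POLL_MS_CONFIG = "poll.ms";
    // Hypothetical fixed timeout for the internal NetworkClient poll loop,
    // deliberately decoupled from POLL_MS_CONFIG so a large user value
    // cannot delay internal request handling past broker-enforced timeouts.
    static final long INTERNAL_POLL_TIMEOUT_MS = 100L;

    static long internalPollTimeout(Map<String, Object> userConfig) {
        // Intentionally ignore userConfig.get(POLL_MS_CONFIG) here:
        // that value is only meant for KafkaConsumer.poll().
        return INTERNAL_POLL_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        // Even with poll.ms set to 60s, the internal client polls at 100ms.
        System.out.println(internalPollTimeout(Map.of(POLL_MS_CONFIG, 60_000L)));
    }
}
```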
[jira] [Updated] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
[ https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax updated KAFKA-5464:
-----------------------------------
Fix Version/s: 0.11.0.1
               0.10.2.2

> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --------------------------------------------------------------
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
[jira] [Commented] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
[ https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063937#comment-16063937 ]

ASF GitHub Bot commented on KAFKA-5464:
---------------------------------------

GitHub user mjsax opened a pull request:

    https://github.com/apache/kafka/pull/3439

    KAFKA-5464: StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mjsax/kafka kafka-5464-streamskafkaclient-poll

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3439.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3439

commit 609ceeb0d546f02b9da2638545784f050dc9558a
Author: Matthias J. Sax
Date: 2017-06-26T22:52:56Z

    KAFKA-5464: StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG

> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --------------------------------------------------------------
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
[jira] [Commented] (KAFKA-3806) Adjust default values of log.retention.hours and offsets.retention.minutes
[ https://issues.apache.org/jira/browse/KAFKA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063798#comment-16063798 ]

James Cheng commented on KAFKA-3806:
------------------------------------

[~junrao], your example of "track when an offset is obsolete based on the activity of the group" is being tracked in https://issues.apache.org/jira/browse/KAFKA-4682

> Adjust default values of log.retention.hours and offsets.retention.minutes
> --------------------------------------------------------------------------
> Key: KAFKA-3806
> URL: https://issues.apache.org/jira/browse/KAFKA-3806
> Project: Kafka
> Issue Type: Improvement
> Components: config
> Affects Versions: 0.9.0.1, 0.10.0.0
> Reporter: Michal Turek
> Priority: Minor
>
> The combination of the default values of log.retention.hours (168 hours = 7 days) and offsets.retention.minutes (1440 minutes = 1 day) may be dangerous in special cases. Offset retention should always be greater than log retention.
> We have observed the following scenario and issue:
> - Producing of data to a topic was disabled two days ago by a producer update; the topic wasn't deleted.
> - The consumer consumed all data and properly committed offsets to Kafka.
> - The consumer made no more offset commits for that topic because there was no more incoming data and nothing to confirm. (We have auto-commit disabled; I'm not sure how it behaves with auto-commit enabled.)
> - After one day: Kafka cleared the too-old offsets according to offsets.retention.minutes.
> - After two days: the long-running consumer was restarted after an update. It didn't find any committed offsets for that topic, since they had been deleted per offsets.retention.minutes, so it started consuming from the beginning.
> - The messages were still in Kafka due to the larger log.retention.hours; about 5 days of messages were read again.
> Known workaround to solve this issue:
> - Explicitly configure log.retention.hours and offsets.retention.minutes; don't use the defaults.
> Proposals:
> - Raise the default value of offsets.retention.minutes so that it is at least twice log.retention.hours.
> - Check these values during Kafka startup and log a warning if offsets.retention.minutes is smaller than log.retention.hours.
> - Add a note to the migration guide about the differences between storing offsets in ZooKeeper and in Kafka (http://kafka.apache.org/documentation.html#upgrade).
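[Editor's sketch] The startup check proposed above is a simple unit comparison. The config names are the real broker properties; the check itself is a hypothetical illustration, not existing broker code:

```java
public class RetentionCheck {
    // Returns true when committed offsets can expire while the
    // corresponding log data is still retained -- the dangerous case
    // described in KAFKA-3806.
    static boolean offsetsExpireBeforeLog(long logRetentionHours,
                                          long offsetsRetentionMinutes) {
        long logRetentionMinutes = logRetentionHours * 60;
        return offsetsRetentionMinutes < logRetentionMinutes;
    }

    public static void main(String[] args) {
        // Defaults discussed in the issue: 168h log retention vs 1440min (1 day).
        if (offsetsExpireBeforeLog(168, 1440)) {
            System.out.println("WARN: offsets.retention.minutes (1440) < "
                    + "log.retention.hours (168h = 10080min); consumers restarted "
                    + "after offset expiry may reprocess retained data");
        }
    }
}
```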
[jira] [Commented] (KAFKA-4849) Bug in KafkaStreams documentation
[ https://issues.apache.org/jira/browse/KAFKA-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063739#comment-16063739 ]

Guozhang Wang commented on KAFKA-4849:
--------------------------------------

We have resolved all the reported issues in this JIRA except the one in the web docs, which is covered in KAFKA-4705 already. Resolving this ticket for now.

> Bug in KafkaStreams documentation
> ---------------------------------
> Key: KAFKA-4849
> URL: https://issues.apache.org/jira/browse/KAFKA-4849
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.0
> Reporter: Seweryn Habdank-Wojewodzki
> Assignee: Matthias J. Sax
> Priority: Minor
>
> At the page: https://kafka.apache.org/documentation/streams
> In the chapter titled "Application Configuration and Execution", the example contains the line:
> settings.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "zookeeper1:2181");
> but ZOOKEEPER_CONNECT_CONFIG is deprecated in Kafka 0.10.2.0.
> The table on the page https://kafka.apache.org/0102/documentation/#streamsconfigs is also a bit misleading:
> 1. Again, zookeeper.connect is deprecated.
> 2. client.id and zookeeper.connect are marked as high importance, but according to http://docs.confluent.io/3.2.0/streams/developer-guide.html neither of them is required to initialize the stream.
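[Editor's sketch] For 0.10.2.0+, the documented example reduces to the two genuinely required settings. Literal property names are used here (instead of the `StreamsConfig` constants) so the fragment stays dependency-free; the broker address is a placeholder:

```java
import java.util.Properties;

public class StreamsConfigExample {
    // Minimal Streams configuration for 0.10.2.0+: only application.id and
    // bootstrap.servers are required; zookeeper.connect is deprecated and
    // no longer needed.
    static Properties streamsSettings(String applicationId, String bootstrapServers) {
        Properties settings = new Properties();
        settings.put("application.id", applicationId);
        settings.put("bootstrap.servers", bootstrapServers);
        return settings;
    }

    public static void main(String[] args) {
        Properties p = streamsSettings("my-streams-app", "kafka-broker1:9092");
        // No zookeeper.connect entry is present or required.
        System.out.println(p.containsKey("zookeeper.connect"));
    }
}
```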
[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
[ https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063725#comment-16063725 ]

ASF GitHub Bot commented on KAFKA-3465:
---------------------------------------

GitHub user vahidhashemian opened a pull request:

    https://github.com/apache/kafka/pull/3438

    KAFKA-3465: Clarify warning message of ConsumerOffsetChecker

    Add that the tool works with the old consumer only.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vahidhashemian/kafka KAFKA-3465

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3438.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3438

commit fa68749267b40d156b3811368e2f49c5c8813a43
Author: Vahid Hashemian
Date: 2017-06-26T20:24:46Z

    KAFKA-3465: Clarify warning message of ConsumerOffsetChecker
    Add that the tool works with the old consumer only.

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --------------------------------------------------------------------------
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Affects Versions: 0.9.0.0
> Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to another in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/kafka/kafka-app-logs -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log --consumer.config ../config/consumer.properties --new.consumer --num.streams 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag doesn't change and the owner is none:
> bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --broker-info --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 --topic
> Group                 Topic   Pid   Offset      logSize     Lag     Owner
> lvs.slca.mirrormaker          0     418578332   418678347   100015  none
> lvs.slca.mirrormaker          1     418598026   418698338   100312  none
> [Root Cause]
> I think it's due to the 0.9.0 new feature that moved the storage of offsets and consumer owner information from ZooKeeper to Kafka internals.
> Does it mean we cannot use the command above to check the new consumer's lag, since the current lag formula is lag = logSize - offset?
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
> => the offset is fetched from ZooKeeper instead of from Kafka
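[Editor's sketch] The lag column in the report above is just `logSize - offset`, which only gives meaningful values when offsets are stored where the tool looks (ZooKeeper, for the old consumer). The arithmetic, using the numbers from the ConsumerOffsetChecker output quoted above:

```java
public class LagFormula {
    // Lag as computed by ConsumerOffsetChecker: end-of-log position
    // minus the last committed offset for the partition.
    static long lag(long logSize, long offset) {
        return logSize - offset;
    }

    public static void main(String[] args) {
        // Partitions 0 and 1 from the report above.
        System.out.println(lag(418678347L, 418578332L)); // 100015
        System.out.println(lag(418698338L, 418598026L)); // 100312
    }
}
```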
[jira] [Updated] (KAFKA-5520) Extend Consumer Group Reset Offset tool for Stream Applications
[ https://issues.apache.org/jira/browse/KAFKA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Quilcate updated KAFKA-5520:
----------------------------------
Description: KIP documentation: https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application (was: KIP documentation: TODO)

> Extend Consumer Group Reset Offset tool for Stream Applications
> ---------------------------------------------------------------
> Key: KAFKA-5520
> URL: https://issues.apache.org/jira/browse/KAFKA-5520
> Project: Kafka
> Issue Type: Improvement
> Components: core, tools
> Reporter: Jorge Quilcate
> Labels: kip
> Fix For: 0.11.1.0
>
> KIP documentation: https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application
[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
[ https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063618#comment-16063618 ]

Vahid Hashemian commented on KAFKA-3465:
----------------------------------------

[~cmccabe] I have opened a PR to update the relevant documentation [here|https://github.com/apache/kafka/pull/3405]. Would that address your concern? Thanks.

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --------------------------------------------------------------------------
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
[jira] [Created] (KAFKA-5519) Support for multiple certificates in a single keystore
Alla Tumarkin created KAFKA-5519:
---------------------------------

Summary: Support for multiple certificates in a single keystore
Key: KAFKA-5519
URL: https://issues.apache.org/jira/browse/KAFKA-5519
Project: Kafka
Issue Type: New Feature
Components: security
Affects Versions: 0.10.2.1
Reporter: Alla Tumarkin

Background: currently, we need to have a keystore exclusive to the component, with exactly one key in it. Looking at the JSSE Reference Guide, it seems we would need to introduce our own KeyManager into the SSLContext, which selects a configurable key alias name. https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/X509KeyManager.html has methods for dealing with aliases.

The goal here is to use a specific certificate (with proper ACLs set for this client), and not just the first one that matches. It looks like this requires a code change to the SSLChannelBuilder.
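[Editor's sketch] The custom-KeyManager approach described above can be illustrated as a delegating wrapper that pins a configured alias. This is an illustration of the standard JSSE `X509KeyManager` API, not Kafka's actual SslChannelBuilder change:

```java
import java.net.Socket;
import java.security.Principal;
import java.security.PrivateKey;
import java.security.cert.X509Certificate;
import javax.net.ssl.X509KeyManager;

// Wraps a keystore-backed X509KeyManager and always selects the configured
// alias instead of the first key that matches the requested key type.
public class AliasPinningKeyManager implements X509KeyManager {
    private final X509KeyManager delegate;
    private final String alias;   // the configured key alias to always select

    public AliasPinningKeyManager(X509KeyManager delegate, String alias) {
        this.delegate = delegate;
        this.alias = alias;
    }

    // Always choose the configured alias, ignoring key type and issuers.
    @Override public String chooseClientAlias(String[] keyType,
            Principal[] issuers, Socket socket) { return alias; }
    @Override public String chooseServerAlias(String keyType,
            Principal[] issuers, Socket socket) { return alias; }

    // Everything else passes through to the real keystore-backed manager.
    @Override public String[] getClientAliases(String keyType, Principal[] issuers) {
        return delegate.getClientAliases(keyType, issuers);
    }
    @Override public String[] getServerAliases(String keyType, Principal[] issuers) {
        return delegate.getServerAliases(keyType, issuers);
    }
    @Override public X509Certificate[] getCertificateChain(String alias) {
        return delegate.getCertificateChain(alias);
    }
    @Override public PrivateKey getPrivateKey(String alias) {
        return delegate.getPrivateKey(alias);
    }
}
```

The wrapper would then be installed into an `SSLContext` via `SSLContext.init(...)` in place of the default key manager.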
[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
[ https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063569#comment-16063569 ]

Colin P. McCabe commented on KAFKA-3465:
----------------------------------------

Should we rename the tool, or add documentation, so that people who find this class can clearly see it is only for the old consumer?

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --------------------------------------------------------------------------
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
[jira] [Commented] (KAFKA-5413) Log cleaner fails due to large offset in segment file
[ https://issues.apache.org/jira/browse/KAFKA-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063514#comment-16063514 ]

Nicholas Ngorok commented on KAFKA-5413:
----------------------------------------

Thanks [~Kelvinrutt] for getting this solved very quickly! [~junrao], is there a timeline/plan for the 0.10.2.2 release?

> Log cleaner fails due to large offset in segment file
> -----------------------------------------------------
> Key: KAFKA-5413
> URL: https://issues.apache.org/jira/browse/KAFKA-5413
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.10.2.0, 0.10.2.1
> Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0
> Reporter: Nicholas Ngorok
> Assignee: Kelvin Rutt
> Priority: Critical
> Labels: reliability
> Fix For: 0.11.0.0, 0.10.2.2
> Attachments: .index.cleaned, .log, .log.cleaned, .timeindex.cleaned, 002147422683.log, kafka-5413.patch
>
> The log cleaner thread in our brokers is failing with the trace below:
> {noformat}
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
> java.lang.IllegalArgumentException: requirement failed: largest offset in message set can not be safely converted to relative offset.
>         at scala.Predef$.require(Predef.scala:224)
>         at kafka.log.LogSegment.append(LogSegment.scala:109)
>         at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478)
>         at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405)
>         at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401)
>         at scala.collection.immutable.List.foreach(List.scala:381)
>         at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
>         at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363)
>         at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362)
>         at scala.collection.immutable.List.foreach(List.scala:381)
>         at kafka.log.Cleaner.clean(LogCleaner.scala:362)
>         at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
>         at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner)
> {noformat}
> This seems to point at the specific line [here|https://github.com/apache/kafka/blob/0.11.0/core/src/main/scala/kafka/log/LogSegment.scala#L92] in the Kafka source, where the difference is actually larger than MAXINT, as both baseOffset and offset are of type long. It was introduced in this [pr|https://github.com/apache/kafka/pull/2210/files/56d1f8196b77a47b176b7bbd1e4220a3be827631].
> These were the outputs of dumping the first two log segments:
> {noformat}
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration --files /kafka-logs/__consumer_offsets-12/0000.log
> Dumping /kafka-logs/__consumer_offsets-12/.log
> Starting offset: 0
> offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration --files /kafka-logs/__consumer_offsets-12/0002147343575.log
> Dumping /kafka-logs/__consumer_offsets-12/002147343575.log
> Starting offset: 2147343575
> offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true payloadsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34
> {noformat}
> My guess is that since 2147539884 is larger than MAXINT, we are hitting this exception. Was there a specific reason this check was added in 0.10.2?
> E.g. if the first offset has key "key 0" and MAXINT + 1 records of "key 1" follow, wouldn't we run into this situation whenever the log cleaner runs?
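[Editor's sketch] The failing requirement above is the relative-offset range check: a record's offset is stored in a segment as a delta from the segment's base offset, and that delta must fit in an int. The check below mirrors the cited LogSegment condition with the offsets from the dump; the method name is illustrative, not Kafka's:

```java
public class RelativeOffsetCheck {
    // An offset can be appended to a segment only if its delta from the
    // segment base offset fits in a 4-byte relative offset.
    static boolean fitsAsRelativeOffset(long baseOffset, long offset) {
        long delta = offset - baseOffset;
        return delta >= 0 && delta <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Cleaning merges into a segment with base offset 0; the record at
        // 2147539884 exceeds Integer.MAX_VALUE (2147483647) -- the reported failure.
        System.out.println(fitsAsRelativeOffset(0L, 2147539884L));          // false
        // Within the record's own original segment the delta is small.
        System.out.println(fitsAsRelativeOffset(2147343575L, 2147539884L)); // true
    }
}
```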
[jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak
[ https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063456#comment-16063456 ]

Joseph Aliase commented on KAFKA-5007:
--------------------------------------

I believe that's the error we are seeing in the log. Let me reproduce the issue today and confirm. Thanks, [~huxi_2b]

> Kafka Replica Fetcher Thread- Resource Leak
> -------------------------------------------
> Key: KAFKA-5007
> URL: https://issues.apache.org/jira/browse/KAFKA-5007
> Project: Kafka
> Issue Type: Bug
> Components: core, network
> Affects Versions: 0.10.0.0, 0.10.1.1, 0.10.2.0
> Environment: CentOS 7, Java 8
> Reporter: Joseph Aliase
> Priority: Critical
> Labels: reliability
> Attachments: jstack-kafka.out, jstack-zoo.out, lsofkafka.txt, lsofzookeeper.txt
>
> Kafka runs out of open file descriptors when the system network interface is down.
> Issue description:
> We have a Kafka cluster of 5 nodes running version 0.10.1.1. The open file descriptor limit for the account running Kafka is set to 10.
> During an upgrade, a network interface went down. The outage continued for 12 hours; eventually all the brokers crashed with a java.io.IOException: Too many open files error.
> We repeated the test in a lower environment and observed that the open socket count keeps increasing while the NIC is down.
> We have around 13 topics with a maximum partition count of 120, and the number of replica fetcher threads is set to 8.
> Using an internal monitoring tool, we observed that the open socket descriptor count for the broker pid continued to increase although the NIC was down, eventually leading to the open file descriptor error.
[jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.
[ https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063452#comment-16063452 ] Chen He commented on KAFKA-3554: Thank you for the quick reply [~becket_qin]. This work is really valuable. It provides us a tool that can exploit kafka system's capacity. For example, we can get lowest latency by only use 1 thread, at the same time, by increasing thread, we can find what is the maximum throughput for a kafka cluster. Only one question, I did applied this patch to latest kafka and comparing results with old ProducerPerformance.java file. I found out, if we set ack=all with snappy compression, with 100M record(100B each), it does not work as well as old PproducerPerformance.java file. > Generate actual data with specific compression ratio and add multi-thread > support in the ProducerPerformance tool. > -- > > Key: KAFKA-3554 > URL: https://issues.apache.org/jira/browse/KAFKA-3554 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.9.0.1 >Reporter: Jiangjie Qin >Assignee: Jiangjie Qin > Fix For: 0.11.1.0 > > > Currently the ProducerPerformance always generate the payload with same > bytes. This does not quite well to test the compressed data because the > payload is extremely compressible no matter how big the payload is. > We can make some changes to make it more useful for compressed messages. > Currently I am generating the payload containing integer from a given range. > By adjusting the range of the integers, we can get different compression > ratios. > API wise, we can either let user to specify the integer range or the expected > compression ratio (we will do some probing to get the corresponding range for > the users) > Besides that, in many cases, it is useful to have multiple producer threads > when the producer threads themselves are bottleneck. 
Admittedly people can > run multiple ProducerPerformance instances to achieve a similar result, but it is still > different from the real case where people actually use the producer. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
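The integer-range idea in the quoted description can be sketched as follows (a minimal sketch; `payload` and its parameters are illustrative names, not the ones in the actual patch):

```java
import java.nio.ByteBuffer;
import java.util.Random;

public class RangePayload {
    // Fill the payload with random integers drawn from [0, intRange).
    // A small range repeats values often, so the payload compresses well;
    // a large range approaches incompressible data.
    public static byte[] payload(int sizeBytes, int intRange, Random rnd) {
        ByteBuffer buf = ByteBuffer.allocate(sizeBytes);
        while (buf.remaining() >= Integer.BYTES)
            buf.putInt(rnd.nextInt(intRange));
        return buf.array();
    }
}
```

Probing for the range that yields a requested compression ratio (the second API option in the description) could then compress trial payloads and bisect on `intRange`.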
[jira] [Commented] (KAFKA-5473) handle ZK session expiration properly when a new session can't be established
[ https://issues.apache.org/jira/browse/KAFKA-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063437#comment-16063437 ] Jun Rao commented on KAFKA-5473: [~prasincs], there are a couple of cases to consider. (1) This is the case where the broker knows for sure that it's not registered with ZK. In this case, it seems failing the broker is better since, from the ZK server's perspective, the broker is down, and failing the broker will make the behavior of the broker consistent with what's in the ZK server. This is the issue that this particular jira is trying to solve. I think we can just wait up to zk.connection.time.ms and do a clean shutdown. (2) There is another case where the broker is partitioned off from the ZK server. In this case, the ZK server may have expired the session of the broker. However, until the network connection is back, the broker doesn't know that its session has expired. In that mode, currently the broker doesn't shut down and just waits until the network connection is back. > handle ZK session expiration properly when a new session can't be established > - > > Key: KAFKA-5473 > URL: https://issues.apache.org/jira/browse/KAFKA-5473 > Project: Kafka > Issue Type: Sub-task >Affects Versions: 0.9.0.0 >Reporter: Jun Rao >Assignee: Prasanna Gautam > > In https://issues.apache.org/jira/browse/KAFKA-2405, we changed the logic in > handling ZK session expiration a bit. If a new ZK session can't be > established after session expiration, we just log an error and continue. > However, this can leave the broker in a bad state since it's up, but not > registered from the controller's perspective. Replicas on this broker may > never be in sync. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
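Case (1) above could be handled roughly like this (an illustrative sketch, not broker code; the `Zk` interface and method names are hypothetical):

```java
public class ZkReconnectPolicy {

    @FunctionalInterface
    interface Zk {
        boolean tryReconnect(); // attempt to establish a new ZK session
    }

    // Retry re-registration a bounded number of times; if no new session can
    // be established, report failure so the caller can do a clean shutdown
    // instead of running unregistered (the bad state this jira describes).
    public static boolean reconnectOrGiveUp(Zk zk, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (zk.tryReconnect())
                return true; // session re-established; keep running
        }
        return false; // caller should shut down cleanly
    }
}
```

In case (2), by contrast, the broker cannot even observe the expiration until connectivity returns, so this policy does not apply there.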
[jira] [Created] (KAFKA-5518) General Kafka connector performance workload
Chen He created KAFKA-5518: -- Summary: General Kafka connector performance workload Key: KAFKA-5518 URL: https://issues.apache.org/jira/browse/KAFKA-5518 Project: Kafka Issue Type: Bug Components: KafkaConnect Affects Versions: 0.10.2.1 Reporter: Chen He Sorry, first time creating a Kafka JIRA. Just curious whether there is a general-purpose performance workload for Kafka connectors (hdfs, s3, etc.). Then we can set up a standard and evaluate the performance of further connectors such as swift, etc. Please feel free to comment or mark as dup if there is already a jira tracking this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5517) Support linking to particular configuration parameters
[ https://issues.apache.org/jira/browse/KAFKA-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Bentley updated KAFKA-5517: --- Labels: patch-available (was: ) > Support linking to particular configuration parameters > -- > > Key: KAFKA-5517 > URL: https://issues.apache.org/jira/browse/KAFKA-5517 > Project: Kafka > Issue Type: Improvement > Components: documentation >Reporter: Tom Bentley >Assignee: Tom Bentley >Priority: Minor > Labels: patch-available > > Currently the configuration parameters are documented in long tables, and it's > only possible to link to the heading before a particular table. When > discussing configuration parameters on forums it would be helpful to be able > to link to the particular parameter under discussion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5517) Support linking to particular configuration parameters
[ https://issues.apache.org/jira/browse/KAFKA-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063294#comment-16063294 ] ASF GitHub Bot commented on KAFKA-5517: --- GitHub user tombentley opened a pull request: https://github.com/apache/kafka/pull/3436 KAFKA-5517: Add id to config HTML tables to allow linking You can merge this pull request into a Git repository by running: $ git pull https://github.com/tombentley/kafka KAFKA-5517 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3436.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3436 > Support linking to particular configuration parameters > -- > > Key: KAFKA-5517 > URL: https://issues.apache.org/jira/browse/KAFKA-5517 > Project: Kafka > Issue Type: Improvement > Components: documentation >Reporter: Tom Bentley >Assignee: Tom Bentley >Priority: Minor > > Currently the configuration parameters are documented in long tables, and it's > only possible to link to the heading before a particular table. When > discussing configuration parameters on forums it would be helpful to be able > to link to the particular parameter under discussion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5517) Support linking to particular configuration parameters
Tom Bentley created KAFKA-5517: -- Summary: Support linking to particular configuration parameters Key: KAFKA-5517 URL: https://issues.apache.org/jira/browse/KAFKA-5517 Project: Kafka Issue Type: Improvement Components: documentation Reporter: Tom Bentley Assignee: Tom Bentley Priority: Minor Currently the configuration parameters are documented in long tables, and it's only possible to link to the heading before a particular table. When discussing configuration parameters on forums it would be helpful to be able to link to the particular parameter under discussion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
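One way to implement this is to give each row of the generated config table an HTML id matching the parameter name, so a URL fragment such as #batch.size jumps straight to it (a sketch with a hypothetical helper, not Kafka's actual ConfigDef code):

```java
public class ConfigAnchor {
    // Emit a table row whose id is the parameter name, plus a self-link so
    // readers can copy a stable URL to the parameter.
    public static String row(String name, String doc) {
        return "<tr id=\"" + name + "\">"
                + "<td><a href=\"#" + name + "\">" + name + "</a></td>"
                + "<td>" + doc + "</td></tr>";
    }
}
```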
[jira] [Commented] (KAFKA-5413) Log cleaner fails due to large offset in segment file
[ https://issues.apache.org/jira/browse/KAFKA-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063228#comment-16063228 ] Jun Rao commented on KAFKA-5413: Merged https://github.com/apache/kafka/pull/3397 to 0.10.2. > Log cleaner fails due to large offset in segment file > - > > Key: KAFKA-5413 > URL: https://issues.apache.org/jira/browse/KAFKA-5413 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.10.2.0, 0.10.2.1 > Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0 >Reporter: Nicholas Ngorok >Assignee: Kelvin Rutt >Priority: Critical > Labels: reliability > Fix For: 0.11.0.0, 0.10.2.2 > > Attachments: .index.cleaned, > .log, .log.cleaned, > .timeindex.cleaned, 002147422683.log, > kafka-5413.patch > > > The log cleaner thread in our brokers is failing with the trace below > {noformat} > [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: > Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 > 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner) > [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: > Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp > Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. > (kafka.log.LogCleaner) > [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} > [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner) > java.lang.IllegalArgumentException: requirement failed: largest offset in > message set can not be safely converted to relative offset. 
> at scala.Predef$.require(Predef.scala:224) > at kafka.log.LogSegment.append(LogSegment.scala:109) > at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478) > at > kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405) > at > kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401) > at scala.collection.immutable.List.foreach(List.scala:381) > at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401) > at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363) > at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362) > at scala.collection.immutable.List.foreach(List.scala:381) > at kafka.log.Cleaner.clean(LogCleaner.scala:362) > at > kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241) > at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63) > [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} > [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner) > {noformat} > This seems to point at the specific line [here| > https://github.com/apache/kafka/blob/0.11.0/core/src/main/scala/kafka/log/LogSegment.scala#L92] > in the kafka src where the difference is actually larger than MAXINT as both > baseOffset and offset are of type long. 
It was introduced in this [pr| > https://github.com/apache/kafka/pull/2210/files/56d1f8196b77a47b176b7bbd1e4220a3be827631] > These were the outputs of dumping the first two log segments > {noformat} > :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration > --files /kafka-logs/__consumer_offsets-12/000 > 0.log > Dumping /kafka-logs/__consumer_offsets-12/.log > Starting offset: 0 > offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: > -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34 > :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration > --files /kafka-logs/__consumer_offsets-12/000 > 0002147343575.log > Dumping /kafka-logs/__consumer_offsets-12/002147343575.log > Starting offset: 2147343575 > offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true > payloadsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34 > {noformat} > My guess is that since 2147539884 is larger than MAXINT, we are hitting this > exception. Was there a specific reason this check was added in 0.10.2? > E.g. if the first offset is a key = "key 0" and then we have MAXINT + 1 of > "key 1" following, wouldn't we run into this situation whenever the log > cleaner runs? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
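The failing require can be reproduced in isolation: relative offsets are stored as 4-byte ints, so a segment can only hold messages whose offset delta from the base fits in Integer.MAX_VALUE (a simplified check mirroring the Scala require, not the actual LogSegment code):

```java
public class RelativeOffsetCheck {
    // A segment stores each message offset as (offset - baseOffset) in a
    // 4-byte int, so the delta must be non-negative and fit in an int.
    public static boolean canConvertToRelative(long baseOffset, long largestOffset) {
        long delta = largestOffset - baseOffset;
        return delta >= 0 && delta <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Offsets from the dumps in this report: the message at 2147539884 is
        // fine relative to its own segment's base 2147343575 (delta ~196K)...
        System.out.println(canConvertToRelative(2147343575L, 2147539884L)); // true
        // ...but cleaning it into a segment with base 0 fails, because
        // 2147539884 > Integer.MAX_VALUE (2147483647).
        System.out.println(canConvertToRelative(0L, 2147539884L)); // false
    }
}
```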
[jira] [Commented] (KAFKA-4388) Connect key and value converters are listed without default values
[ https://issues.apache.org/jira/browse/KAFKA-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063107#comment-16063107 ] ASF GitHub Bot commented on KAFKA-4388: --- GitHub user evis opened a pull request: https://github.com/apache/kafka/pull/3435 KAFKA-4388 Recommended values for converters from plugins Questions to reviewers: 1. Should we cache `converterRecommenders.validValues()`, `SinkConnectorConfig.configDef()` and `SourceConnectorConfig.configDef()` results? 2. What is appropriate place for testing new `ConnectorConfig.configDef(plugins)` functionality? cc @ewencp You can merge this pull request into a Git repository by running: $ git pull https://github.com/evis/kafka converters_values Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3435.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3435 commit 069a1ba832f844c20224598c622fa19576b0ba61 Author: Evgeny VeretennikovDate: 2017-06-26T13:44:40Z KAFKA-4388 Recommended values for converters from plugins ConnectorConfig.configDef() takes Plugins parameter now. List of recommended values for converters is taken from plugins.converters() > Connect key and value converters are listed without default values > -- > > Key: KAFKA-4388 > URL: https://issues.apache.org/jira/browse/KAFKA-4388 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 0.10.1.0 >Reporter: Ewen Cheslack-Postava >Assignee: Evgeny Veretennikov >Priority: Minor > Original Estimate: 24h > Remaining Estimate: 24h > > KIP-75 added per connector converters. This exposes the settings on a > per-connector basis via the validation API. However, the way this is > specified for each connector is via a config value with no default value. > This means the validation API implies there is no setting unless you provide > one. 
> It would be much better to include the default value extracted from the > WorkerConfig instead so it's clear you shouldn't need to override the default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5516) Formatting verifiable producer/consumer output in a similar fashion
[ https://issues.apache.org/jira/browse/KAFKA-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063049#comment-16063049 ] ASF GitHub Bot commented on KAFKA-5516: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3434 KAFKA-5516: Formatting verifiable producer/consumer output in a similar fashion You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka verifiable-consumer-producer Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3434.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3434 commit 6e6d728dce83ea689061a196e1b6811b447d4db7 Author: ppatiernoDate: 2017-06-26T12:07:08Z Modified JSON order attributes in a more readable fashion commit 9235aadfccf446415ffbfd5d90c8d4faeddecc08 Author: ppatierno Date: 2017-06-26T12:56:07Z Fixed documentation about old request.required.acks producer parameter Modified JSON order attributes in a more readable fashion Refactoring on verifiable producer to be like the verifiable consumer > Formatting verifiable producer/consumer output in a similar fashion > --- > > Key: KAFKA-5516 > URL: https://issues.apache.org/jira/browse/KAFKA-5516 > Project: Kafka > Issue Type: Improvement > Components: tools >Reporter: Paolo Patierno >Assignee: Paolo Patierno >Priority: Trivial > > Hi, > following the proposal to have verifiable producer/consumer providing a very > similar output where the "timestamp" is always the first column followed by > "name" event and then all the specific data for such event. > It includes a verifiable producer refactoring for having that in the same way > as verifiable consumer. > Thanks, > Paolo -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5516) Formatting verifiable producer/consumer output in a similar fashion
Paolo Patierno created KAFKA-5516: - Summary: Formatting verifiable producer/consumer output in a similar fashion Key: KAFKA-5516 URL: https://issues.apache.org/jira/browse/KAFKA-5516 Project: Kafka Issue Type: Improvement Components: tools Reporter: Paolo Patierno Assignee: Paolo Patierno Priority: Trivial Hi, this follows the proposal to have the verifiable producer/consumer provide very similar output, where the "timestamp" is always the first column, followed by the event "name" and then all the data specific to that event. It includes a refactoring of the verifiable producer so that it works in the same way as the verifiable consumer. Thanks, Paolo -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5515) Consider removing date formatting from Segments class
[ https://issues.apache.org/jira/browse/KAFKA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bejeck updated KAFKA-5515: --- Description: Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} and formatting each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}. We should also consider keeping a lookup of existing segments to avoid as many string operations as possible. (was: Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} for each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}. We should also consider keeping a lookup of existing segments to avoid as many string operations as possible.) > Consider removing date formatting from Segments class > - > > Key: KAFKA-5515 > URL: https://issues.apache.org/jira/browse/KAFKA-5515 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Bill Bejeck > Labels: performance > > Currently the {{Segments}} class uses a date when calculating the segment id > and uses {{SimpleDateFormat}} for formatting the segment id. However this is > a high volume code path and creating a new {{SimpleDateFormat}} and > formatting each segment id is expensive. We should look into removing the > date from the segment id or at a minimum use a faster alternative to > {{SimpleDateFormat}}. We should also consider keeping a lookup of existing > segments to avoid as many string operations as possible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5515) Consider removing date formatting from Segments class
[ https://issues.apache.org/jira/browse/KAFKA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bejeck updated KAFKA-5515: --- Description: Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} for each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}. We should also consider keeping a lookup of existing segments to avoid as many string operations as possible. (was: Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} for each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}} ) > Consider removing date formatting from Segments class > - > > Key: KAFKA-5515 > URL: https://issues.apache.org/jira/browse/KAFKA-5515 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Bill Bejeck > Labels: performance > > Currently the {{Segments}} class uses a date when calculating the segment id > and uses {{SimpleDateFormat}} for formatting the segment id. However this is > a high volume code path and creating a new {{SimpleDateFormat}} for each > segment id is expensive. We should look into removing the date from the > segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}. > We should also consider keeping a lookup of existing segments to avoid as > many string operations as possible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5514) KafkaConsumer ignores default values in Properties object because of incorrect use of Properties object.
Geert Schuring created KAFKA-5514: - Summary: KafkaConsumer ignores default values in Properties object because of incorrect use of Properties object. Key: KAFKA-5514 URL: https://issues.apache.org/jira/browse/KAFKA-5514 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 0.10.2.1 Reporter: Geert Schuring When setting default values in a Properties object the KafkaConsumer ignores these values because the Properties object is being treated as a Map. The ConsumerConfig object uses the putAll method to copy properties from the incoming object to its local copy. (See https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java#L471) This is incorrect because it only copies the explicit properties and ignores the default values also present in the properties object. (Also see: https://stackoverflow.com/questions/2004833/how-to-merge-two-java-util-properties-objects) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
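The pitfall is easy to demonstrate, along with one possible workaround (a sketch; `flatten` is a hypothetical helper, not part of the Kafka client):

```java
import java.util.Properties;

public class PropsDefaults {
    // putAll copies only the entries stored directly in the Hashtable, so
    // values supplied via the Properties(defaults) constructor are lost.
    // stringPropertyNames() walks the defaults chain, so copying by name
    // preserves them.
    public static Properties flatten(Properties props) {
        Properties flat = new Properties();
        for (String name : props.stringPropertyNames())
            flat.setProperty(name, props.getProperty(name));
        return flat;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("group.id", "my-group");

        Properties props = new Properties(defaults); // defaults live in a separate table
        props.setProperty("bootstrap.servers", "localhost:9092");

        Properties copy = new Properties();
        copy.putAll(props); // what ConsumerConfig does: the default is dropped
        System.out.println(copy.getProperty("group.id"));           // null
        System.out.println(flatten(props).getProperty("group.id")); // my-group
    }
}
```

Until the client copies by name like this, callers can flatten their own Properties before passing them to KafkaConsumer.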
[jira] [Commented] (KAFKA-5372) Unexpected state transition Dead to PendingShutdown
[ https://issues.apache.org/jira/browse/KAFKA-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062996#comment-16062996 ] ASF GitHub Bot commented on KAFKA-5372: --- GitHub user enothereska opened a pull request: https://github.com/apache/kafka/pull/3432 KAFKA-5372: fixes to state transitions You can merge this pull request into a Git repository by running: $ git pull https://github.com/enothereska/kafka KAFKA-5372-state-transitions Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3432.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3432 commit 5f57558ef293351d9f5db11edb62089367e39b76 Author: Eno ThereskaDate: 2017-06-26T11:04:20Z Checkpoint commit ac372b196998052a024aac47af64dbd803a65733 Author: Eno Thereska Date: 2017-06-26T12:11:26Z Some fixes > Unexpected state transition Dead to PendingShutdown > --- > > Key: KAFKA-5372 > URL: https://issues.apache.org/jira/browse/KAFKA-5372 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Jason Gustafson >Assignee: Eno Thereska > Fix For: 0.11.1.0 > > > I often see this running integration tests: > {code} > [2017-06-02 15:36:03,411] WARN stream-thread > [appId-1-c382ef0a-adbd-422b-9717-9b2bc52b55eb-StreamThread-13] Unexpected > state transition from DEAD to PENDING_SHUTDOWN. > (org.apache.kafka.streams.processor.internals.StreamThread:976) > {code} > Maybe a race condition on shutdown or something? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle
[ https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062859#comment-16062859 ] Ismael Juma commented on KAFKA-5512: Nice catch. Are you intending to submit a PR with your fixes? > KafkaConsumer: High memory allocation rate when idle > > > Key: KAFKA-5512 > URL: https://issues.apache.org/jira/browse/KAFKA-5512 > Project: Kafka > Issue Type: Bug > Components: consumer >Affects Versions: 0.10.1.1 >Reporter: Stephane Roset > Labels: performance > Fix For: 0.11.0.1 > > > Hi, > We noticed in our application that the memory allocation rate increased > significantly when we have no Kafka messages to consume. We isolated the > issue by using a JVM that simply runs 128 Kafka consumers. These consumers > consume 128 partitions (so each consumer consumes one partition). The > partitions are empty and no message has been sent during the test. The > consumers were configured with default values (session.timeout.ms=3, > fetch.max.wait.ms=500, receive.buffer.bytes=65536, > heartbeat.interval.ms=3000, max.poll.interval.ms=30, > max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this > context, the allocation rate was about 55 MiB/s. This high allocation rate > generates a lot of GC activity (to collect the young heap) and was an issue > for our project. > We profiled the JVM with JProfiler. We noticed that there was a huge > quantity of ArrayList$Itr instances in memory. These collections were mainly > instantiated by the methods handleCompletedReceives, handleCompletedSends, > handleConnections and handleDisconnections of the class NetworkClient. We also > noticed that we had a lot of calls to the method pollOnce of the class > KafkaConsumer. > So we decided to run only one consumer and to profile the calls to the method > pollOnce. We noticed that regularly a huge number of calls is made to this > method, up to 268000 calls within 100ms. The pollOnce method calls the > NetworkClient.handle* methods.
These methods iterate on collections (even if > they are empty), so that explains the huge number of iterators in memory. > The large number of calls is related to the heartbeat mechanism. The pollOnce > method calculates the poll timeout; if a heartbeat needs to be done, the > timeout will be set to 0. The problem is that the heartbeat thread checks > every 100 ms (default value of retry.backoff.ms) if a heartbeat should be > sent, so the KafkaConsumer will call the poll method in a loop without > timeout until the heartbeat thread awakes. For example: the heartbeat thread > just started to wait and will awake in 99ms. So during 99ms, the > KafkaConsumer will call in a loop the pollOnce method and will use a timeout > of 0. That explains how we can have 268000 calls within 100ms. > The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so > I think the Kafka consumer should awake the heartbeat thread with a notify > when needed. > We made two quick fixes to solve this issue: > - In NetworkClient.handle*(), we don't iterate on collections if they are > empty (to avoid unnecessary iterators instantiations). > - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify > the heartbeat thread to awake it (dirty fix because we don't handle the > autocommit case). > With these 2 quick fixes and 128 consumers, the allocation rate drops down > from 55 MiB/s to 4 MiB/s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
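The first quick fix described above, guarding the handle* methods so no iterator is allocated for empty collections, looks roughly like this (illustrative names and a stand-in element type, not the actual NetworkClient code):

```java
import java.util.List;

public class EmptyGuard {
    // Returns how many responses were processed; the loop body stands in for
    // the real work done per completed receive.
    public static int handleCompletedReceives(List<String> completedReceives) {
        // Quick fix: bail out before the for-each loop so the implicit
        // Iterator (the ArrayList$Itr seen in the profile) is never
        // allocated on idle polls.
        if (completedReceives.isEmpty())
            return 0;
        int handled = 0;
        for (String receive : completedReceives)
            handled++;
        return handled;
    }
}
```

The second fix, notifying the heartbeat thread instead of spinning with a zero poll timeout, attacks the call count itself rather than the per-call allocation.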
[jira] [Updated] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle
[ https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-5512: --- Labels: performance (was: ) > KafkaConsumer: High memory allocation rate when idle > > > Key: KAFKA-5512 > URL: https://issues.apache.org/jira/browse/KAFKA-5512 > Project: Kafka > Issue Type: Bug > Components: consumer >Affects Versions: 0.10.1.1 >Reporter: Stephane Roset > Labels: performance > Fix For: 0.11.0.1 > > > Hi, > We noticed in our application that the memory allocation rate increased > significantly when we have no Kafka messages to consume. We isolated the > issue by using a JVM that simply runs 128 Kafka consumers. These consumers > consume 128 partitions (so each consumer consumes one partition). The > partitions are empty and no message has been sent during the test. The > consumers were configured with default values (session.timeout.ms=3, > fetch.max.wait.ms=500, receive.buffer.bytes=65536, > heartbeat.interval.ms=3000, max.poll.interval.ms=30, > max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this > context, the allocation rate was about 55 MiB/s. This high allocation rate > generates a lot of GC activity (to collect the young heap) and was an issue > for our project. > We profiled the JVM with JProfiler. We noticed that there was a huge > quantity of ArrayList$Itr instances in memory. These collections were mainly > instantiated by the methods handleCompletedReceives, handleCompletedSends, > handleConnections and handleDisconnections of the class NetworkClient. We also > noticed that we had a lot of calls to the method pollOnce of the class > KafkaConsumer. > So we decided to run only one consumer and to profile the calls to the method > pollOnce. We noticed that regularly a huge number of calls is made to this > method, up to 268000 calls within 100ms. The pollOnce method calls the > NetworkClient.handle* methods.
These methods iterate on collections (even if > they are empty), so that explains the huge number of iterators in memory. > The large number of calls is related to the heartbeat mechanism. The pollOnce > method calculates the poll timeout; if a heartbeat needs to be done, the > timeout will be set to 0. The problem is that the heartbeat thread checks > every 100 ms (default value of retry.backoff.ms) if a heartbeat should be > sent, so the KafkaConsumer will call the poll method in a loop without > timeout until the heartbeat thread awakes. For example: the heartbeat thread > just started to wait and will awake in 99ms. So during 99ms, the > KafkaConsumer will call in a loop the pollOnce method and will use a timeout > of 0. That explains how we can have 268000 calls within 100ms. > The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so > I think the Kafka consumer should awake the heartbeat thread with a notify > when needed. > We made two quick fixes to solve this issue: > - In NetworkClient.handle*(), we don't iterate on collections if they are > empty (to avoid unnecessary iterators instantiations). > - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify > the heartbeat thread to awake it (dirty fix because we don't handle the > autocommit case). > With these 2 quick fixes and 128 consumers, the allocation rate drops down > from 55 MiB/s to 4 MiB/s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle
[ https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-5512: --- Fix Version/s: 0.11.0.1 > KafkaConsumer: High memory allocation rate when idle > > > Key: KAFKA-5512 > URL: https://issues.apache.org/jira/browse/KAFKA-5512 > Project: Kafka > Issue Type: Bug > Components: consumer >Affects Versions: 0.10.1.1 >Reporter: Stephane Roset > Labels: performance > Fix For: 0.11.0.1 > > > Hi, > We noticed in our application that the memory allocation rate increased > significantly when we have no Kafka messages to consume. We isolated the > issue by using a JVM that simply runs 128 Kafka consumers. These consumers > consume 128 partitions (so each consumer consumes one partition). The > partitions are empty and no message has been sent during the test. The > consumers were configured with default values (session.timeout.ms=3, > fetch.max.wait.ms=500, receive.buffer.bytes=65536, > heartbeat.interval.ms=3000, max.poll.interval.ms=30, > max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this > context, the allocation rate was about 55 MiB/s. This high allocation rate > generates a lot of GC activity (to collect the young heap) and was an issue > for our project. > We profiled the JVM with JProfiler. We noticed that there was a huge > quantity of ArrayList$Itr instances in memory. These collections were mainly > instantiated by the methods handleCompletedReceives, handleCompletedSends, > handleConnections and handleDisconnections of the class NetworkClient. We also > noticed that we had a lot of calls to the method pollOnce of the class > KafkaConsumer. > So we decided to run only one consumer and to profile the calls to the method > pollOnce. We noticed that regularly a huge number of calls is made to this > method, up to 268000 calls within 100ms. The pollOnce method calls the > NetworkClient.handle* methods.
These methods iterate on collections (even if > they are empty), so that explains the huge number of iterators in memory. > The large number of calls is related to the heartbeat mechanism. The pollOnce > method calculates the poll timeout; if a heartbeat needs to be done, the > timeout will be set to 0. The problem is that the heartbeat thread checks > every 100 ms (default value of retry.backoff.ms) if a heartbeat should be > sent, so the KafkaConsumer will call the poll method in a loop without > timeout until the heartbeat thread awakes. For example: the heartbeat thread > just started to wait and will awake in 99ms. So during 99ms, the > KafkaConsumer will call in a loop the pollOnce method and will use a timeout > of 0. That explains how we can have 268000 calls within 100ms. > The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so > I think the Kafka consumer should awake the heartbeat thread with a notify > when needed. > We made two quick fixes to solve this issue: > - In NetworkClient.handle*(), we don't iterate on collections if they are > empty (to avoid unnecessary iterators instantiations). > - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify > the heartbeat thread to awake it (dirty fix because we don't handle the > autocommit case). > With these 2 quick fixes and 128 consumers, the allocation rate drops down > from 55 MiB/s to 4 MiB/s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle
Stephane Roset created KAFKA-5512: - Summary: KafkaConsumer: High memory allocation rate when idle Key: KAFKA-5512 URL: https://issues.apache.org/jira/browse/KAFKA-5512 Project: Kafka Issue Type: Bug Components: consumer Affects Versions: 0.10.1.1 Reporter: Stephane Roset -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KAFKA-4388) Connect key and value converters are listed without default values
[ https://issues.apache.org/jira/browse/KAFKA-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Evgeny Veretennikov reassigned KAFKA-4388: -- Assignee: Evgeny Veretennikov > Connect key and value converters are listed without default values > -- > > Key: KAFKA-4388 > URL: https://issues.apache.org/jira/browse/KAFKA-4388 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 0.10.1.0 >Reporter: Ewen Cheslack-Postava >Assignee: Evgeny Veretennikov >Priority: Minor > Original Estimate: 24h > Remaining Estimate: 24h > > KIP-75 added per connector converters. This exposes the settings on a > per-connector basis via the validation API. However, the way this is > specified for each connector is via a config value with no default value. > This means the validation API implies there is no setting unless you provide > one. > It would be much better to include the default value extracted from the > WorkerConfig instead so it's clear you shouldn't need to override the default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062575#comment-16062575 ] Evgeny Veretennikov commented on KAFKA-4750: If we invoke {{store.put(key, value)}}, and the serde serializes the value to null, shouldn't we throw an NPE instead of deleting the key from the store? It seems counter-intuitive that the value here can be null, since null indicates that no value was found: {code:java} Object value = ???; // some value serialized to null by the serde store.put(key, value); Object result = store.get(key); // returns null, which looks like no value was found {code} Throwing an exception inside {{put()}} prevents potential data loss. By the way, why does the {{KeyValueStore.put()}} method allow the value to be null? > KeyValueIterator returns null values > > > Key: KAFKA-4750 > URL: https://issues.apache.org/jira/browse/KAFKA-4750 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1 >Reporter: Michal Borowiecki >Assignee: Evgeny Veretennikov > Labels: newbie > Attachments: DeleteTest.java > > > The API for ReadOnlyKeyValueStore.range method promises the returned iterator > will not return null values. However, after upgrading from 0.10.0.0 to > 0.10.1.1 we found null values are returned, causing NPEs on our side. > I found this happens after removing entries from the store, and I found > resemblance to the SAMZA-94 defect. The problem seems to be the same as there: when > deleting entries with a serializer that does not return null when null > is passed in, the state store doesn't actually delete that key/value pair, but > the iterator will return a null value for that key. > When I modified our serializer to return null when null is passed in, the > problem went away. However, I believe this should be fixed in kafka streams, > perhaps with an approach similar to SAMZA-94's. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
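A minimal sketch of the fail-fast {{put()}} proposed in the comment above. The store class and serializer here are hypothetical illustrations, not the Kafka Streams KeyValueStore API.

```java
// Hypothetical store illustrating the proposal: throw instead of silently
// turning put(key, value) into a delete when the serializer returns null.
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class StrictKeyValueStore<K, V> {
    private final Map<K, byte[]> inner = new HashMap<>();
    private final Function<V, byte[]> valueSerializer;

    public StrictKeyValueStore(Function<V, byte[]> valueSerializer) {
        this.valueSerializer = valueSerializer;
    }

    public void put(K key, V value) {
        byte[] bytes = valueSerializer.apply(value);
        if (bytes == null) {
            // Fail fast: a null serialized value would otherwise look like a
            // delete, and later reads would return null as if no value existed.
            throw new NullPointerException("serializer returned null for value: " + value);
        }
        inner.put(key, bytes);
    }

    public byte[] get(K key) {
        return inner.get(key);
    }
}
```

With this check, the ambiguity the comment describes (a stored value that reads back as "not found") surfaces at write time instead of causing silent data loss.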