[jira] [Created] (KAFKA-9728) kafka-dump-log.sh to support dumping only head or tail part
Weichu Liu created KAFKA-9728:
-
Summary: kafka-dump-log.sh to support dumping only head or tail part
Key: KAFKA-9728
URL: https://issues.apache.org/jira/browse/KAFKA-9728
Project: Kafka
Issue Type: New Feature
Components: tools
Reporter: Weichu Liu

When I use {{kafka-dump-log.sh}} to dump a .log file, it traverses the whole file, which is quite heavy. Even with {{| head}}, the tool still reads through the whole file. In many cases I just want to inspect the first several or last several messages of a log file. It would be great for {{kafka-dump-log.sh}} to add options that support partial reads.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
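For reference, a head-only dump could be cheap because v2 record batches are length-prefixed: each batch starts with an 8-byte big-endian base offset and a 4-byte batch length, so a tool can seek from batch to batch without decoding payloads. A minimal sketch of the idea (the function name is mine, and this only reads batch headers, nothing more):

```python
import struct

def dump_head(path, max_batches):
    """Return (base_offset, batch_length) for the first few record batches.

    Assumes the v2 on-disk layout: each batch begins with an 8-byte
    big-endian base offset followed by a 4-byte batch length; the rest
    of the batch is skipped with a relative seek rather than decoded.
    """
    out = []
    with open(path, "rb") as f:
        for _ in range(max_batches):
            header = f.read(12)
            if len(header) < 12:
                break  # end of segment
            base_offset, batch_len = struct.unpack(">qi", header)
            out.append((base_offset, batch_len))
            f.seek(batch_len, 1)  # skip the batch body without parsing it
    return out
```

A tail dump is trickier since batches are variable-length: the tool would either scan forward remembering only the last N headers, or start from an offset-index entry near the end of the segment.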
[jira] [Comment Edited] (KAFKA-9669) Kafka 2.4.0 Chokes on Filebeat 5.6 Produced Data
[ https://issues.apache.org/jira/browse/KAFKA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053077#comment-17053077 ] Weichu Liu edited comment on KAFKA-9669 at 3/6/20, 8:11 AM:
-With 2.4.1 RC0, Kafka does provide prettier error logs, but still cannot produce:-
Update: Sorry for the misleading comment; I double-checked the result. Despite the ERROR logs, the messages were actually stored in the topic. On the other hand, throwing one ERROR-level message per produced record is not acceptable for production use, because the Kafka logs would flood the disk.
{noformat}
[2020-03-06 06:36:37,159] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=1478844555, CreateTime=1583476596257, key=0 bytes, value=199 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0.
[2020-03-06 06:36:37,177] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=4137384030, CreateTime=1583476596257, key=0 bytes, value=199 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0.
[2020-03-06 06:36:37,181] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=4137384030, CreateTime=1583476596257, key=0 bytes, value=199 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0.
[2020-03-06 06:36:37,194] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=3885590418, CreateTime=1583476596257, key=0 bytes, value=199 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0.
{noformat}
was (Author: weichu): With 2.4.1 RC0, Kafka does provide prettier error logs, but still cannot produce:
{noformat}
[2020-03-06 06:36:37,159] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=1478844555, CreateTime=1583476596257, key=0 bytes, value=199 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0.
[2020-03-06 06:36:37,177] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=4137384030, CreateTime=1583476596257, key=0 bytes, value=199 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0. [2020-03-06 06:36:37,181] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=4137384030, CreateTime=1583476596257, key=0 bytes, value=199 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0. [2020-03-06 06:36:37,194] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=3885590418, CreateTime=1583476596257, key=0 bytes, value=199 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0. {noformat} > Kafka 2.4.0 Chokes on Filebeat 5.6 Produced Data > > > Key: KAFKA-9669 > URL: https://issues.apache.org/jira/browse/KAFKA-9669 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Weichu
[jira] [Created] (KAFKA-9669) Kafka 2.4.0 Chokes on Filebeat 5.6 Produced Data
Weichu Liu created KAFKA-9669:
-
Summary: Kafka 2.4.0 Chokes on Filebeat 5.6 Produced Data
Key: KAFKA-9669
URL: https://issues.apache.org/jira/browse/KAFKA-9669
Project: Kafka
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Weichu Liu

Hi
In our environment, after upgrading to Kafka 2.4.0, we discovered the broker was not compatible with Filebeat 5. Here is how to reproduce:
1. Start Kafka 2.4.0; all configurations are vanilla:
{code}
$ kafka_2.13-2.4.0/bin/zookeeper-server-start.sh kafka_2.13-2.4.0/config/zookeeper.properties
$ kafka_2.13-2.4.0/bin/kafka-server-start.sh kafka_2.13-2.4.0/config/server.properties
{code}
2. Start filebeat 5.6.16 with the following configuration (downloaded from https://www.elastic.co/jp/downloads/past-releases/filebeat-5-6-16):
{code}
$ cat /tmp/filebeat.yml
name: test
output.kafka:
  enabled: true
  hosts:
    - localhost:9092
  topic: test-3
  version: 0.10.0
  compression: gzip
filebeat:
  prospectors:
    - input_type: log
      paths:
        - /tmp/filebeat-in
      encoding: plain
{code}
{code}
$ filebeat-5.6.16-linux-x86_64/filebeat -e -c /tmp/filebeat.yml
{code}
3. Write some lines to {{/tmp/filebeat-in}}. A single line doesn't seem to trigger the issue, but 30 lines are enough:
{code}
seq 30 >> /tmp/filebeat-in
{code}
4. Kafka throws a chunk of errors like the following per produced record:
{noformat}
[2020-03-06 05:17:40,129] ERROR [ReplicaManager broker=0] Error processing append operation on partition test-3-0 (kafka.server.ReplicaManager) org.apache.kafka.common.InvalidRecordException: Inner record LegacyRecordBatch(offset=0, Record(magic=1, attributes=0, compression=NONE, crc=1453875406, CreateTime=1583471854475, key=0 bytes, value=202 bytes)) inside the compressed record batch does not have incremental offsets, expected offset is 1 in topic partition test-3-0.
[2020-03-06 05:17:40,129] ERROR [KafkaApi-0] Error when handling request: clientId=beats, correlationId=102, api=PRODUCE, version=2, body={acks=1,timeout=1,partitionSizes=[test-3-0=272]} (kafka.server.KafkaApis)
java.lang.NullPointerException: `field` must be non-null
	at java.base/java.util.Objects.requireNonNull(Objects.java:246)
	at org.apache.kafka.common.protocol.types.Struct.validateField(Struct.java:474)
	at org.apache.kafka.common.protocol.types.Struct.instance(Struct.java:418)
	at org.apache.kafka.common.protocol.types.Struct.instance(Struct.java:436)
	at org.apache.kafka.common.requests.ProduceResponse.toStruct(ProduceResponse.java:281)
	at org.apache.kafka.common.requests.AbstractResponse.toSend(AbstractResponse.java:35)
	at org.apache.kafka.common.requests.RequestContext.buildResponse(RequestContext.java:80)
	at kafka.server.KafkaApis.sendResponse(KafkaApis.scala:2892)
	at kafka.server.KafkaApis.sendResponseCallback$2(KafkaApis.scala:554)
	at kafka.server.KafkaApis.$anonfun$handleProduceRequest$11(KafkaApis.scala:576)
	at kafka.server.KafkaApis.$anonfun$handleProduceRequest$11$adapted(KafkaApis.scala:576)
	at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:546)
	at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:577)
	at kafka.server.KafkaApis.handle(KafkaApis.scala:126)
	at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:70)
	at java.base/java.lang.Thread.run(Thread.java:835)
{noformat}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
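The InvalidRecordException above comes from a broker-side check on compressed legacy (magic v1) batches: inner record offsets must increase by exactly 1, while Filebeat 5's Kafka client stamps every inner record with offset 0. A simplified analog of that check (not the broker's actual code; the function name is mine):

```python
def first_bad_inner_offset(inner_offsets):
    """Return the first expected offset that was violated, or None if valid.

    Mirrors the broker rule for compressed legacy batches: inner records
    must carry incremental offsets 0, 1, 2, ... A client that writes
    offset 0 for every inner record fails as soon as the second record is
    checked, which is why the log says "expected offset is 1".
    """
    for expected, actual in enumerate(inner_offsets):
        if actual != expected:
            return expected
    return None
```

With all-zero inner offsets, the first record passes (expected 0, got 0) and the second fails (expected 1, got 0), matching the error message above.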
[jira] [Commented] (KAFKA-7349) Long Disk Writes cause Zookeeper Disconnects
[ https://issues.apache.org/jira/browse/KAFKA-7349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954340#comment-16954340 ] Weichu Liu commented on KAFKA-7349: --- We encountered a similar issue: high disk IO latency caused brokers to frequently re-elect leaders and controllers, and eventually it triggered a split brain -- one broker was alive but could not rejoin the cluster until it was manually rebooted. > Long Disk Writes cause Zookeeper Disconnects > > > Key: KAFKA-7349 > URL: https://issues.apache.org/jira/browse/KAFKA-7349 > Project: Kafka > Issue Type: Bug > Components: zkclient >Affects Versions: 0.11.0.1 >Reporter: Adam Kafka >Priority: Minor > Attachments: SpikeInWriteTime.png > > > We run our Kafka cluster on a cloud service provider. As a consequence, we > notice a large tail latency write time that is out of our control. Some > writes take on the order of seconds. We have noticed that often these long > write times are correlated with subsequent Zookeeper disconnects from the > brokers. It appears that during the long write time, the Zookeeper heartbeat > thread does not get scheduled CPU time, resulting in a long gap of heartbeats > sent. After the write, the ZK thread does get scheduled CPU time, but it > detects that it has not received a heartbeat from Zookeeper in a while, so it > drops its connection then rejoins the cluster. > Note that the timeout reported is inconsistent with the timeout as set by the > configuration ({{zookeeper.session.timeout.ms}} = default of 6 seconds). We > have seen a range of values reported here, including 5950ms (less than > threshold), 12032ms (double the threshold), 25999ms (much larger than the > threshold). > We noticed that during a service degradation of the storage service of our > cloud provider, these Zookeeper disconnects increased drastically in > frequency. > We are hoping there is a way to decouple these components. 
Do you agree with > our diagnosis that the ZK disconnects are occurring due to thread contention > caused by long disk writes? Perhaps the ZK thread could be scheduled at a > higher priority? Do you have any suggestions for how to avoid the ZK > disconnects? > Here is an example of one of these events: > Logs on the Broker: > {code} > [2018-08-25 04:10:19,695] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:21,697] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:23,700] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:25,701] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:27,702] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:29,704] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:31,707] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:33,709] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:35,712] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:37,714] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:39,716] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:41,719] DEBUG Got ping response for sessionid: > 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) > ... 
> [2018-08-25 04:10:53,752] WARN Client session timed out, have not heard from > server in 12032ms for sessionid 0x36202ab4337002c > (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:53,754] INFO Client session timed out, have not heard from > server in 12032ms for sessionid 0x36202ab4337002c, closing socket connection > and attempting reconnect (org.apache.zookeeper.ClientCnxn) > [2018-08-25 04:10:53,920] INFO zookeeper state changed (Disconnected) > (org.I0Itec.zkclient.ZkClient) > [2018-08-25 04:10:53,920] INFO Waiting for keeper state SyncConnected > (org.I0Itec.zkclient.ZkClient) > ... > {code} > GC logs during the same time (demonstrating this is not just a long GC): > {code} > 2018-08-25T04:10:36.434+: 35150.779: [GC (Allocation Failure) > 3074119K->2529089K(6223360K), 0.0137342 secs] > 2018-08-25T04:10:37.367+: 35151.713: [GC (Allocation Failure) >
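The inconsistent timeout values in the logs (5950ms, 12032ms, 25999ms against a 6-second limit) follow from how the check works: the heartbeat thread reports the whole gap it observes when it finally gets scheduled again, not the configured threshold. A toy illustration of that arithmetic (names and numbers are mine, chosen to echo the 12032ms example above):

```python
def observed_session_gap(last_ping_ms, wake_ms, session_timeout_ms=6000):
    """Compute what a starved heartbeat thread would report on wake-up.

    The thread last heard from ZooKeeper at last_ping_ms, then was not
    scheduled until wake_ms (e.g. blocked behind a long disk write). It
    reports the full elapsed gap, so the logged value can be any length
    at or above the stall, not the 6000 ms threshold itself.
    """
    gap = wake_ms - last_ping_ms
    return gap > session_timeout_ms, gap

# A ~12-second scheduling stall reproduces the 12032 ms style of message:
timed_out, gap = observed_session_gap(last_ping_ms=4000, wake_ms=16032)
```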
[jira] [Commented] (KAFKA-8633) Extra </td> in generated documents
[ https://issues.apache.org/jira/browse/KAFKA-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887731#comment-16887731 ] Weichu Liu commented on KAFKA-8633:
---
The PR (https://github.com/apache/kafka/pull/7056) is ready for review. It'd be appreciated if somebody could review and merge it.
> Extra </td> in generated documents
> --
>
> Key: KAFKA-8633
> URL: https://issues.apache.org/jira/browse/KAFKA-8633
> Project: Kafka
> Issue Type: Task
> Components: documentation
> Reporter: Weichu Liu
> Priority: Trivial
>
> The auto-generated tables for all configurations (e.g. [https://kafka.apache.org/documentation/#brokerconfigs]) have 2 </td> for each cell, e.g. the first row of the broker configuration:
> {noformat}
> zookeeper.connectSpecifies the ZooKeeper connection string in the form hostname:port where host and port are the host and port of a ZooKeeper server. To allow connecting through other ZooKeeper nodes when that ZooKeeper machine is down you can also specify multiple hosts in the form hostname1:port1,hostname2:port2,hostname3:port3. The server can also have a ZooKeeper chroot path as part of its ZooKeeper connection string which puts its data under some path in the global ZooKeeper namespace. For example to give a chroot path of /chroot/path you would give the connection string as hostname1:port1,hostname2:port2,hostname3:port3/chroot/path.stringhighread-only
> {noformat}
> This is due to the {{toHtmlTable}} function in {{./clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java}} appending an extra "</td>" in the code.
> {code:java}
> for (String headerName : headers()) {
>     addColumnValue(b, getConfigValue(key, headerName));
>     b.append("</td>");
> }
> {code}
> (addColumnValue already wraps the value with <td> and </td>.)
> This is a very minor issue, but it prevents an HTML parser from properly fetching the table data (which is what I was trying to do).
> --
> Update: I also found another glitch in the doc: some configurations use '<>' in their strings, but these are recognized as HTML tags so the description is not displayed properly. For example, the {{client.id}} of [Kafka Streams Configs|https://kafka.apache.org/documentation/#streamsconfigs] displays
> {noformat}
> An ID prefix string used for the client IDs of internal consumer, producer and restore-consumer, with pattern '-StreamThread--'.
> {noformat}
> However it should be
> {noformat}
> with pattern '-StreamThread--'.
> {noformat}
> I feel the fastest way is to avoid angle brackets altogether.
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
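The parsing breakage the report describes is easy to see with Python's stdlib html.parser; the sample row below is made up, with one extra </td> per cell, in the shape the toHtmlTable loop would emit:

```python
from html.parser import HTMLParser

class CellCounter(HTMLParser):
    """Count <td> open/close tags to expose the imbalance a stray </td> causes."""
    def __init__(self):
        super().__init__()
        self.opens = 0
        self.closes = 0

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.opens += 1

    def handle_endtag(self, tag):
        if tag == "td":
            self.closes += 1

def td_balance(html):
    """Return (open_count, close_count) of <td> tags in an HTML snippet."""
    p = CellCounter()
    p.feed(html)
    return p.opens, p.closes

# One extra </td> per cell, as the loop above would produce:
broken = "<tr><td>zookeeper.connect</td></td><td>string</td></td></tr>"
# td_balance(broken) -> (2, 4)
```

A DOM-building scraper hits the same imbalance and misattributes the following cells, which is why extracting the config table programmatically fails.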
[jira] [Commented] (KAFKA-8679) kafka-topics.sh --describe with --zookeeper throws error when there is no topic
[ https://issues.apache.org/jira/browse/KAFKA-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887724#comment-16887724 ] Weichu Liu commented on KAFKA-8679:
---
Err, right. This one duplicates KAFKA-8670; I didn't notice someone had already reported the issue. Anyway, happy to know it is getting fixed.
> kafka-topics.sh --describe with --zookeeper throws error when there is no topic
> ---
>
> Key: KAFKA-8679
> URL: https://issues.apache.org/jira/browse/KAFKA-8679
> Project: Kafka
> Issue Type: Bug
> Reporter: Weichu Liu
> Priority: Minor
>
> h3. Steps to Reproduce
> First, start a Kafka server (2.2.0+) with no topics on it.
> Then run `kafka-topics.sh --describe --zookeeper ...:2181` to get the topic details.
> h3. Expected Behavior
> The command prints nothing and returns 0.
> h3. Actual Behavior
> The command throws an exception and exits with 1.
> {code}
> $ kafka_2.12-2.2.1/bin/kafka-topics.sh --describe --zookeeper localhost:2181
> Error while executing topic command : Topics in [] does not exist
> [2019-07-18 06:29:21,336] ERROR java.lang.IllegalArgumentException: Topics in [] does not exist
> 	at kafka.admin.TopicCommand$.kafka$admin$TopicCommand$$ensureTopicExists(TopicCommand.scala:416)
> 	at kafka.admin.TopicCommand$ZookeeperTopicService.describeTopic(TopicCommand.scala:332)
> 	at kafka.admin.TopicCommand$.main(TopicCommand.scala:66)
> 	at kafka.admin.TopicCommand.main(TopicCommand.scala)
> (kafka.admin.TopicCommand$)
> {code}
> h3. Others
> IIRC, versions before 2.2.0 did not throw the exception.
> Also, {{--describe}} with {{--bootstrap-server}} will exit 0 and print nothing, and {{--list}} with either {{--bootstrap-server}} or {{--zookeeper}} will also exit 0 and print nothing.
> I did a quick search and it seems this issue was introduced by https://issues.apache.org/jira/browse/KAFKA-7054. I didn't check which exact line caused the exception. Hope that helps.
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (KAFKA-8679) kafka-topics.sh --describe with --zookeeper throws error when there is no topic
[ https://issues.apache.org/jira/browse/KAFKA-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichu Liu resolved KAFKA-8679.
---
Resolution: Duplicate
> kafka-topics.sh --describe with --zookeeper throws error when there is no topic
> ---
>
> Key: KAFKA-8679
> URL: https://issues.apache.org/jira/browse/KAFKA-8679
> Project: Kafka
> Issue Type: Bug
> Reporter: Weichu Liu
> Priority: Minor
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KAFKA-8679) kafka-topics.sh --describe with --zookeeper throws error when there is no topic
Weichu Liu created KAFKA-8679:
-
Summary: kafka-topics.sh --describe with --zookeeper throws error when there is no topic
Key: KAFKA-8679
URL: https://issues.apache.org/jira/browse/KAFKA-8679
Project: Kafka
Issue Type: Bug
Reporter: Weichu Liu

h3. Steps to Reproduce
First, start a Kafka server (2.2.0+) with no topics on it.
Then run `kafka-topics.sh --describe --zookeeper ...:2181` to get the topic details.
h3. Expected Behavior
The command prints nothing and returns 0.
h3. Actual Behavior
The command throws an exception and exits with 1.
{code}
$ kafka_2.12-2.2.1/bin/kafka-topics.sh --describe --zookeeper localhost:2181
Error while executing topic command : Topics in [] does not exist
[2019-07-18 06:29:21,336] ERROR java.lang.IllegalArgumentException: Topics in [] does not exist
	at kafka.admin.TopicCommand$.kafka$admin$TopicCommand$$ensureTopicExists(TopicCommand.scala:416)
	at kafka.admin.TopicCommand$ZookeeperTopicService.describeTopic(TopicCommand.scala:332)
	at kafka.admin.TopicCommand$.main(TopicCommand.scala:66)
	at kafka.admin.TopicCommand.main(TopicCommand.scala)
(kafka.admin.TopicCommand$)
{code}
h3. Others
IIRC, versions before 2.2.0 did not throw the exception.
Also, {{--describe}} with {{--bootstrap-server}} will exit 0 and print nothing, and {{--list}} with either {{--bootstrap-server}} or {{--zookeeper}} will also exit 0 and print nothing.
I did a quick search and it seems this issue was introduced by https://issues.apache.org/jira/browse/KAFKA-7054. I didn't check which exact line caused the exception. Hope that helps.
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
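The fix the report implies is a guard in the topic-existence check: an empty topic list from a wildcard describe is fine, while an explicitly named missing topic should still fail. A sketch of that expected behavior (the real check is the Scala ensureTopicExists in TopicCommand.scala; the Python names here are illustrative):

```python
def topics_to_describe(existing_topics, requested=None):
    """Select topics for --describe the way the CLI arguably should.

    With no explicit --topic, an empty cluster yields an empty list
    (print nothing, exit 0) instead of raising; only a topic that was
    explicitly requested and does not exist is an error.
    """
    if requested is None:
        return sorted(existing_topics)  # may be empty: that's not an error
    if requested not in existing_topics:
        raise ValueError("Topic '%s' does not exist" % requested)
    return [requested]
```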
[jira] [Updated] (KAFKA-8633) Extra </td> in generated documents
[ https://issues.apache.org/jira/browse/KAFKA-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichu Liu updated KAFKA-8633:
--
Description: The auto-generated tables for all configurations (e.g. [https://kafka.apache.org/documentation/#brokerconfigs]) have 2 </td> for each cell, e.g. the first row of the broker configuration:
{noformat}
zookeeper.connectSpecifies the ZooKeeper connection string in the form hostname:port where host and port are the host and port of a ZooKeeper server. To allow connecting through other ZooKeeper nodes when that ZooKeeper machine is down you can also specify multiple hosts in the form hostname1:port1,hostname2:port2,hostname3:port3. The server can also have a ZooKeeper chroot path as part of its ZooKeeper connection string which puts its data under some path in the global ZooKeeper namespace. For example to give a chroot path of /chroot/path you would give the connection string as hostname1:port1,hostname2:port2,hostname3:port3/chroot/path.stringhighread-only
{noformat}
This is due to the {{toHtmlTable}} function in {{./clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java}} appending an extra "</td>" in the code:
{code:java}
for (String headerName : headers()) {
    addColumnValue(b, getConfigValue(key, headerName));
    b.append("</td>");
}
{code}
(addColumnValue already wraps the value with <td> and </td>.)
This is a very minor issue, but it prevents an HTML parser from properly fetching the table data (which is what I was trying to do).
--
Update: I also found another glitch in the doc: some configurations use '<>' in their strings, but these are recognized as HTML tags so the description is not displayed properly. For example, the {{client.id}} of [Kafka Streams Configs|https://kafka.apache.org/documentation/#streamsconfigs] displays
{noformat}
An ID prefix string used for the client IDs of internal consumer, producer and restore-consumer, with pattern '-StreamThread--'.
{noformat}
However it should be
{noformat}
with pattern '-StreamThread--'.
{noformat}
I feel the fastest way is to avoid angle brackets altogether.
[jira] [Updated] (KAFKA-8633) Extra in generated documents
[ https://issues.apache.org/jira/browse/KAFKA-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichu Liu updated KAFKA-8633: -- Description: The auto generated tables for all configurations (e.g. https://kafka.apache.org/documentation/#brokerconfigs) are with 2 for each cell. e.g. the first row for broker configuration. {noformat} zookeeper.connectSpecifies the ZooKeeper connection string in the form hostname:port where host and port are the host and port of a ZooKeeper server. To allow connecting through other ZooKeeper nodes when that ZooKeeper machine is down you can also specify multiple hosts in the form hostname1:port1,hostname2:port2,hostname3:port3. The server can also have a ZooKeeper chroot path as part of its ZooKeeper connection string which puts its data under some path in the global ZooKeeper namespace. For example to give a chroot path of /chroot/path you would give the connection string as hostname1:port1,hostname2:port2,hostname3:port3/chroot/path.stringhighread-only {noformat} This is due to {{toHtmlTable}} function in {{./clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java}} is appending an extra "" in the code. {code:java} for (String headerName : headers()) { addColumnValue(b, getConfigValue(key, headerName)); b.append(""); } {code} (The addColumnValue already wrap the value with and ) This is very minor issue, but it will prevent an html parser to properly fetch table data (like what I was trying to do) -- Update: I also found another glitch in the doc: Some configuration are using '<>' in the string, but they are recognized as html tags so the description is not properly displayed. For example, the {{client.id}} of [Kafka Streams Configs|https://kafka.apache.org/documentation/#streamsconfigs] displays > An ID prefix string used for the client IDs of internal consumer, producer > and restore-consumer, with pattern '-StreamThread--'. However it should be > with pattern > '-StreamThread--'. 
I feel the fastest way is to avoid angle brackets altogether.
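Besides avoiding angle brackets, another way to address the glitch is to HTML-escape description text before it is placed in the generated table. A hedged sketch (the helper name is hypothetical, not the actual ConfigDef API):

```java
public class HtmlEscapeSketch {
    // Escapes the characters that made '<client.id>'-style placeholders
    // disappear when the browser treated them as HTML tags.
    // '&' must be escaped first so existing entities are not double-escaped.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<client.id>-StreamThread-<threadId>"));
        // Prints: &lt;client.id&gt;-StreamThread-&lt;threadId&gt;
    }
}
```

Escaping preserves the placeholders as written in the config description, at the cost of touching the doc-generation code rather than every description string.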
[jira] [Commented] (KAFKA-8633) Extra </td> in generated documents
[ https://issues.apache.org/jira/browse/KAFKA-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881110#comment-16881110 ] Weichu Liu commented on KAFKA-8633: --- Seems no one is responding, so I will make a PR to fix this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8633) Extra </td> in generated documents
[ https://issues.apache.org/jira/browse/KAFKA-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichu Liu updated KAFKA-8633: -- Priority: Trivial (was: Major) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8633) Extra </td> in generated documents
Weichu Liu created KAFKA-8633: - Summary: Extra </td> in generated documents Key: KAFKA-8633 URL: https://issues.apache.org/jira/browse/KAFKA-8633 Project: Kafka Issue Type: Task Components: documentation Reporter: Weichu Liu -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8335) Log cleaner skips Transactional mark and batch record, causing unlimited growth of __consumer_offsets
[ https://issues.apache.org/jira/browse/KAFKA-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870907#comment-16870907 ] Weichu Liu commented on KAFKA-8335: --- [~francisco.juan] On our cluster, after upgrading, the logs shrank, but the biggest partition still has several gigs. The contents of the remaining logs are much the same as what you've posted. From reading the https://github.com/apache/kafka/pull/6715 description, there seem to be 2 different causes for empty batches being retained. The first was addressed by the PR but the second was not. Could that be the reason why __consumer_offsets is still big? [~hachikuji] > Log cleaner skips Transactional mark and batch record, causing unlimited > growth of __consumer_offsets > - > > Key: KAFKA-8335 > URL: https://issues.apache.org/jira/browse/KAFKA-8335 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Boquan Tang >Assignee: Jason Gustafson >Priority: Major > Fix For: 2.0.2, 2.1.2, 2.2.1 > > Attachments: seg_april_25.zip, segment.zip > > > My Colleague Weichu already sent out a mail to kafka user mailing list > regarding this issue, but we think it's worth having a ticket tracking it. > We are using Kafka Streams with exactly-once enabled on a Kafka cluster for > a while. > Recently we found that the size of __consumer_offsets partitions grew huge. > Some partition went over 30G. This caused Kafka to take quite long to load > "__consumer_offsets" topic on startup (it loads the topic in order to > become group coordinator). > We dumped the __consumer_offsets segments and found that while normal > offset commits are nicely compacted, transaction records (COMMIT, etc) are > all preserved. 
Looks like that since these messages don't have a key, the > LogCleaner is keeping them all: > -- > $ bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files > /003484332061.log --key-decoder-class > kafka.serializer.StringDecoder 2>/dev/null | cat -v | head > Dumping 003484332061.log > Starting offset: 3484332061 > offset: 3484332089 position: 549 CreateTime: 1556003706952 isvalid: true > keysize: 4 valuesize: 6 magic: 2 compresscodec: NONE producerId: 1006 > producerEpoch: 2530 sequence: -1 isTransactional: true headerKeys: [] > endTxnMarker: COMMIT coordinatorEpoch: 81 > offset: 3484332090 position: 627 CreateTime: 1556003706952 isvalid: true > keysize: 4 valuesize: 6 magic: 2 compresscodec: NONE producerId: 4005 > producerEpoch: 2520 sequence: -1 isTransactional: true headerKeys: [] > endTxnMarker: COMMIT coordinatorEpoch: 84 > ... > -- > Streams is doing transaction commits per 100ms (commit.interval.ms=100 when > exactly-once) so the __consumer_offsets is growing really fast. > Is this (to keep all transactions) by design, or is that a bug for > LogCleaner? What would be the way to clean up the topic? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
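To gauge how much of a partition consists of retained transaction markers, the text output of DumpLogSegments can be filtered for the endTxnMarker field. A rough sketch, under the assumption that the dump format matches the lines quoted above (the class name here is hypothetical):

```java
import java.util.List;

public class TxnMarkerCount {
    // Counts dump lines that describe transaction-control batches,
    // i.e. those containing an "endTxnMarker:" field.
    static long countTxnMarkers(List<String> dumpLines) {
        return dumpLines.stream()
                .filter(line -> line.contains("endTxnMarker:"))
                .count();
    }

    public static void main(String[] args) {
        // Sample lines shaped like the quoted DumpLogSegments output.
        List<String> sample = List.of(
            "Starting offset: 3484332061",
            "offset: 3484332089 position: 549 isTransactional: true endTxnMarker: COMMIT coordinatorEpoch: 81",
            "offset: 3484332090 position: 627 isTransactional: true endTxnMarker: COMMIT coordinatorEpoch: 84");
        System.out.println(countTxnMarkers(sample)); // prints 2
    }
}
```

Comparing this count to the total number of batches in the dump shows how dominant the uncompacted COMMIT/ABORT markers are in the segment.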
[jira] [Commented] (KAFKA-8335) Log cleaner skips Transactional mark and batch record, causing unlimited growth of __consumer_offsets
[ https://issues.apache.org/jira/browse/KAFKA-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836061#comment-16836061 ] Weichu Liu commented on KAFKA-8335: --- Hi, I uploaded a sample segment here: [^segment.zip] And here are the broker settings on our Kafka:
{noformat}
[2019-05-07 02:00:50,472] INFO KafkaConfig values:
	advertised.host.name = null
	advertised.listeners = null
	advertised.port = null
	alter.config.policy.class.name = null
	alter.log.dirs.replication.quota.window.num = 11
	alter.log.dirs.replication.quota.window.size.seconds = 1
	authorizer.class.name = 
	auto.create.topics.enable = true
	auto.leader.rebalance.enable = true
	background.threads = 10
	broker.id = 1
	broker.id.generation.enable = true
	broker.rack = null
	client.quota.callback.class = null
	compression.type = producer
	connection.failed.authentication.delay.ms = 100
	connections.max.idle.ms = 60
	connections.max.reauth.ms = 0
	control.plane.listener.name = null
	controlled.shutdown.enable = true
	controlled.shutdown.max.retries = 3
	controlled.shutdown.retry.backoff.ms = 5000
	controller.socket.timeout.ms = 3
	create.topic.policy.class.name = null
	default.replication.factor = 3
	delegation.token.expiry.check.interval.ms = 360
	delegation.token.expiry.time.ms = 8640
	delegation.token.master.key = null
	delegation.token.max.lifetime.ms = 60480
	delete.records.purgatory.purge.interval.requests = 1
	delete.topic.enable = true
	fetch.purgatory.purge.interval.requests = 1000
	group.initial.rebalance.delay.ms = 3000
	group.max.session.timeout.ms = 30
	group.max.size = 2147483647
	group.min.session.timeout.ms = 6000
	host.name = 
	inter.broker.listener.name = null
	inter.broker.protocol.version = 2.2-IV1
	kafka.metrics.polling.interval.secs = 10
	kafka.metrics.reporters = []
	leader.imbalance.check.interval.seconds = 300
	leader.imbalance.per.broker.percentage = 10
	listener.security.protocol.map = PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
	listeners = PLAINTEXT://:9092,SSL://:9093
	log.cleaner.backoff.ms = 15000
	log.cleaner.dedupe.buffer.size = 134217728
	log.cleaner.delete.retention.ms = 8640
	log.cleaner.enable = true
	log.cleaner.io.buffer.load.factor = 0.9
	log.cleaner.io.buffer.size = 524288
	log.cleaner.io.max.bytes.per.second = 1.7976931348623157E308
	log.cleaner.min.cleanable.ratio = 0.5
	log.cleaner.min.compaction.lag.ms = 0
	log.cleaner.threads = 1
	log.cleanup.policy = [delete]
	log.dir = /tmp/kafka-logs
	log.dirs = /var/lib/kafka/data
	log.flush.interval.messages = 9223372036854775807
	log.flush.interval.ms = null
	log.flush.offset.checkpoint.interval.ms = 6
	log.flush.scheduler.interval.ms = 9223372036854775807
	log.flush.start.offset.checkpoint.interval.ms = 6
	log.index.interval.bytes = 4096
	log.index.size.max.bytes = 10485760
	log.message.downconversion.enable = true
	log.message.format.version = 2.2-IV1
	log.message.timestamp.difference.max.ms = 9223372036854775807
	log.message.timestamp.type = CreateTime
	log.preallocate = false
	log.retention.bytes = -1
	log.retention.check.interval.ms = 30
	log.retention.hours = 168
	log.retention.minutes = null
	log.retention.ms = null
	log.roll.hours = 168
	log.roll.jitter.hours = 0
	log.roll.jitter.ms = null
	log.roll.ms = null
	log.segment.bytes = 1073741824
	log.segment.delete.delay.ms = 6
	max.connections.per.ip = 2147483647
	max.connections.per.ip.overrides = 
	max.incremental.fetch.session.cache.slots = 1000
	message.max.bytes = 2097152
	metric.reporters = []
	metrics.num.samples = 2
	metrics.recording.level = INFO
	metrics.sample.window.ms = 3
	min.insync.replicas = 1
	num.io.threads = 8
	num.network.threads = 3
	num.partitions = 5
	num.recovery.threads.per.data.dir = 1
	num.replica.alter.log.dirs.threads = null
	num.replica.fetchers = 1
	offset.metadata.max.bytes = 4096
	offsets.commit.required.acks = -1
	offsets.commit.timeout.ms = 5000
	offsets.load.buffer.size = 5242880
	offsets.retention.check.interval.ms = 60
	offsets.retention.minutes = 10080
	offsets.topic.compression.codec = 0
	offsets.topic.num.partitions = 50
	offsets.topic.replication.factor = 3
	offsets.topic.segment.bytes =
[jira] [Updated] (KAFKA-8335) Log cleaner skips Transactional mark and batch record, causing unlimited growth of __consumer_offsets
[ https://issues.apache.org/jira/browse/KAFKA-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichu Liu updated KAFKA-8335: -- Attachment: segment.zip -- This message was sent by Atlassian JIRA (v7.6.3#76005)