[ https://issues.apache.org/jira/browse/KAFKA-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Li Haijun updated KAFKA-5010:
-----------------------------
    Comment: was deleted

(was: Hi Shuai Lin, we encountered a similar problem recently. How can I find the specific topic that triggered the crash of the log cleaner? For reference, here is the log.cleaner.log output:
```
[2019-02-19 09:59:37,700] INFO Starting the log cleaner (kafka.log.LogCleaner)
[2019-02-19 09:59:37,739] INFO [kafka-log-cleaner-thread-0], Starting (kafka.log.LogCleaner)
[2019-02-19 09:59:38,044] INFO Cleaner 0: Beginning cleaning of log get_coupon_streams_2-get_coupon_agg-changelog-5. (kafka.log.LogCleaner)
[2019-02-19 09:59:38,044] INFO Cleaner 0: Building offset map for get_coupon_streams_2-get_coupon_agg-changelog-5... (kafka.log.LogCleaner)
[2019-02-19 09:59:38,072] INFO Cleaner 0: Building offset map for log get_coupon_streams_2-get_coupon_agg-changelog-5 for 15 segments in offset range [0, 3481151). (kafka.log.LogCleaner)
[2019-02-19 09:59:44,899] INFO Cleaner 0: Offset map for log get_coupon_streams_2-get_coupon_agg-changelog-5 complete. (kafka.log.LogCleaner)
[2019-02-19 09:59:44,925] INFO Cleaner 0: Cleaning log get_coupon_streams_2-get_coupon_agg-changelog-5 (cleaning prior to Wed Feb 13 16:56:15 CST 2019, discarding tombstones prior to Thu Jan 01 08:00:00 CST 1970)... (kafka.log.LogCleaner)
[2019-02-19 09:59:44,946] INFO Cleaner 0: Cleaning segment 0 in log get_coupon_streams_2-get_coupon_agg-changelog-5 (largest timestamp Wed Nov 07 16:51:37 CST 2018) into 0, retaining deletes. (kafka.log.LogCleaner)
[2019-02-19 09:59:45,231] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
java.lang.IllegalArgumentException
	at java.nio.Buffer.position(Buffer.java:236)
	at org.apache.kafka.common.record.MemoryRecordsBuilder.<init>(MemoryRecordsBuilder.java:141)
	at org.apache.kafka.common.record.MemoryRecords.builder(MemoryRecords.java:300)
	at org.apache.kafka.common.record.MemoryRecords.builderWithEntries(MemoryRecords.java:408)
	at org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:168)
	at org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:111)
	at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
	at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405)
	at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
	at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363)
	at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at kafka.log.Cleaner.clean(LogCleaner.scala:362)
	at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
	at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
	at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
[2019-02-19 09:59:45,274] INFO [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner)
```
)

> Log cleaner crashed with BufferOverflowException when writing to the
> writeBuffer
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-5010
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5010
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
> Affects Versions: 0.10.2.0
>            Reporter: Shuai Lin
>            Priority: Critical
>              Labels: reliability
>
> After upgrading from 0.10.0.1 to 0.10.2.0 the log
cleaner thread crashed with
> BufferOverflowException when writing the filtered records into the
> writeBuffer:
> {code}
> [2017-03-24 10:41:03,926] INFO [kafka-log-cleaner-thread-0], Starting (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Beginning cleaning of log app-topic-20170317-20. (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Building offset map for app-topic-20170317-20... (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,387] INFO Cleaner 0: Building offset map for log app-topic-20170317-20 for 1 segments in offset range [9737795, 9887707). (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,101] INFO Cleaner 0: Offset map for log app-topic-20170317-20 complete. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,106] INFO Cleaner 0: Cleaning log app-topic-20170317-20 (cleaning prior to Fri Mar 24 10:36:06 GMT 2017, discarding tombstones prior to Thu Mar 23 10:18:02 GMT 2017)... (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,110] INFO Cleaner 0: Cleaning segment 0 in log app-topic-20170317-20 (largest timestamp Fri Mar 24 09:58:25 GMT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,372] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
> java.nio.BufferOverflowException
>     at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
>     at org.apache.kafka.common.record.LogEntry.writeTo(LogEntry.java:98)
>     at org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:158)
>     at org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:111)
>     at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
>     at kafka.log.Cleaner.$anonfun$cleanSegments$1(LogCleaner.scala:405)
>     at kafka.log.Cleaner.$anonfun$cleanSegments$1$adapted(LogCleaner.scala:401)
>     at scala.collection.immutable.List.foreach(List.scala:378)
>     at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
>     at kafka.log.Cleaner.$anonfun$clean$6(LogCleaner.scala:363)
>     at kafka.log.Cleaner.$anonfun$clean$6$adapted(LogCleaner.scala:362)
>     at scala.collection.immutable.List.foreach(List.scala:378)
>     at kafka.log.Cleaner.clean(LogCleaner.scala:362)
>     at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
>     at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-03-24 10:41:07,375] INFO [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner)
> {code}
> I tried different values of log.cleaner.io.buffer.size, from 512K to 2M to 10M
> to 128M, all with no luck: the log cleaner thread crashed immediately after
> the broker was restarted. But setting it to 256MB fixed the problem!
> Here are the settings for the cluster:
> {code}
> - log.message.format.version = 0.9.0.0 (we use the 0.9 format because we have
>   old consumers)
> - log.cleaner.enable = 'true'
> - log.cleaner.min.cleanable.ratio = '0.1'
> - log.cleaner.threads = '1'
> - log.cleaner.io.buffer.load.factor = '0.98'
> - log.roll.hours = '24'
> - log.cleaner.dedupe.buffer.size = 2GB
> - log.segment.bytes = 256MB (global is 512MB, but we have been using 256MB
>   for this topic)
> - message.max.bytes = 10MB
> {code}
> Given that the readBuffer and writeBuffer are exactly the same size (each
> half of log.cleaner.io.buffer.size), why would the cleaner throw a
> BufferOverflowException when writing the filtered records into the
> writeBuffer? IIUC that should never happen, because the filtered records
> should be no larger than the contents of the readBuffer, and thus no larger
> than the writeBuffer.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
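Editorial note: the failing frame in the reporter's stack trace is a plain `java.nio.ByteBuffer.put`, which throws `BufferOverflowException` whenever more bytes are written than the buffer has remaining. The following is a minimal illustrative sketch, not Kafka code: the 1024-byte buffer size and the 600-byte batch are made-up numbers chosen only to mirror the report's "writeBuffer is half of log.cleaner.io.buffer.size" layout and show how a filtered batch that grows past the writeBuffer's capacity produces exactly this exception.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class WriteBufferOverflowDemo {
    public static void main(String[] args) {
        // Per the description above, read and write buffers are the same
        // size, each half of the I/O buffer (hypothetical 1024 bytes here).
        int ioBufferSize = 1024;
        ByteBuffer writeBuffer = ByteBuffer.allocate(ioBufferSize / 2); // 512 bytes

        // A filtered batch that is larger than the writeBuffer's capacity
        // (for illustration only; why the cleaner's output could grow past
        // its input is exactly the open question in this issue).
        byte[] filteredBatch = new byte[600];

        try {
            writeBuffer.put(filteredBatch); // same java.nio call as in the trace
            System.out.println("batch fit into writeBuffer");
        } catch (BufferOverflowException e) {
            System.out.println("BufferOverflowException: batch larger than writeBuffer");
        }
    }
}
```

Running this prints `BufferOverflowException: batch larger than writeBuffer`, since 600 bytes cannot fit into the 512-byte buffer; the exception itself carries no message, which is why the JIRA trace shows only the class name.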