[ https://issues.apache.org/jira/browse/KAFKA-9156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312207#comment-17312207 ]
Wenbing Shen commented on KAFKA-9156:
-------------------------------------

Hi, [~amironov] [~ijuma], I applied KAFKA-7283 and KAFKA-9156 to our build of kafka-2.0.0 to get the benefits of LazyIndex when starting the broker. On a small Kafka cluster there was no problem. When I first applied it to a cluster with high traffic, the brokers with light traffic seemed to show no exceptions, but after the brokers with heavy traffic started, a large number of replica fetcher threads threw java.nio.BufferOverflowException. This is the same problem that [~iBlackeyes] encountered; the current patch still does not fix it. The stack trace is as follows:

{panel:title=My title}
Text title
{panel}

[2021-03-31 15:23:54,935] ERROR (ReplicaFetcherThread-1-1001 kafka.server.ReplicaFetcherThread 76) [ReplicaFetcher replicaId=1006, leaderId=1001, fetcherId=1] Error due to
org.apache.kafka.common.KafkaException: Error processing data for partition sinan_assets_tagged_default-9 offset 38543576
    at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:214)
    at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:175)
    at scala.Option.foreach(Option.scala:257)
    at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:175)
    at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:172)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:172)
    at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:172)
    at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:172)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:255)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:170)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:114)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
Caused by: java.nio.BufferOverflowException
    at java.nio.Buffer.nextPutIndex(Buffer.java:527)
    at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:793)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply$mcV$sp(TimeIndex.scala:131)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:111)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:111)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:255)
    at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:111)
    at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:578)
    at kafka.log.Log$$anonfun$roll$2$$anonfun$apply$32.apply(Log.scala:1582)
    at kafka.log.Log$$anonfun$roll$2$$anonfun$apply$32.apply(Log.scala:1582)
    at scala.Option.foreach(Option.scala:257)
    at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1582)
    at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1568)
    at kafka.log.Log.maybeHandleIOException(Log.scala:1943)
    at kafka.log.Log.roll(Log.scala:1568)
    at kafka.log.Log.kafka$log$Log$$maybeRoll(Log.scala:1553)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:956)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:850)
    at kafka.log.Log.maybeHandleIOException(Log.scala:1943)
    at kafka.log.Log.append(Log.scala:850)
    at kafka.log.Log.appendAsFollower(Log.scala:831)
    at kafka.cluster.Partition$$anonfun$doAppendRecordsToFollowerOrFutureReplica$1.apply(Partition.scala:589)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:255)
    at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:261)
    at kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:576)
    at kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:596)
    at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:129)
    at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:43)
    at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:189)
    ... 13 more
[2021-03-31 15:23:54,936] INFO (ReplicaFetcherThread-1-1001 kafka.server.ReplicaFetcherThread 66) [ReplicaFetcher replicaId=1006, leaderId=1001, fetcherId=1] Stopped

> LazyTimeIndex & LazyOffsetIndex may cause niobufferoverflow in concurrent state
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-9156
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9156
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.3.0, 2.3.1
>            Reporter: shilin Lu
>            Assignee: Alex Mironov
>            Priority: Blocker
>              Labels: regression
>             Fix For: 2.4.0, 2.3.2
>
>         Attachments: image-2019-11-07-17-42-13-852.png, image-2019-11-07-17-44-05-357.png, image-2019-11-07-17-46-53-650.png
>
>
> !image-2019-11-07-17-42-13-852.png!
> This timeIndex get function is not thread safe, so concurrent callers may create more than one TimeIndex.
> !image-2019-11-07-17-44-05-357.png!
> When more than one TimeIndex is created, the MappedByteBuffer position may be moved to the end. Writing an index entry to this mmap file then causes java.nio.BufferOverflowException.
>
> !image-2019-11-07-17-46-53-650.png!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
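
The failure mode described in the quoted issue can be illustrated with a small, self-contained Java sketch. It is not Kafka's LazyIndex code: `LazyIndexSketch`, `LazyHolder`, and `appendOverflows` are hypothetical names, a heap `ByteBuffer` stands in for the mmap-backed index file, and double-checked locking is just one possible way to make the lazy getter load exactly once. The 12-byte entry size assumes a time index entry of an 8-byte timestamp plus a 4-byte relative offset.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyIndexSketch {
    // Assumed entry layout: 8-byte timestamp + 4-byte relative offset.
    static final int ENTRY_SIZE = 12;

    // Hypothetical lazy holder: the loader runs at most once, even when
    // several threads call get() at the same time.
    static final class LazyHolder<T> {
        private final Supplier<T> loader;
        private volatile T value;

        LazyHolder(Supplier<T> loader) {
            this.loader = loader;
        }

        T get() {
            T v = value;
            if (v == null) {
                synchronized (this) {
                    v = value;
                    if (v == null) {       // re-check under the lock
                        v = loader.get();
                        value = v;
                    }
                }
            }
            return v;
        }
    }

    // Tries to write one entry; returns true if the buffer has no room left.
    static boolean appendOverflows(ByteBuffer buf) {
        try {
            buf.putLong(0L).putInt(0);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger loads = new AtomicInteger();
        LazyHolder<ByteBuffer> index = new LazyHolder<>(() -> {
            loads.incrementAndGet();
            return ByteBuffer.allocate(2 * ENTRY_SIZE);
        });

        // Several threads race on the first get(); only one load happens.
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(index::get);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("loads = " + loads.get());

        ByteBuffer buf = index.get();
        System.out.println("overflow with room left: " + appendOverflows(buf));

        // Simulate a position that was moved to the end of the buffer.
        buf.position(buf.limit());
        System.out.println("overflow at end: " + appendOverflows(buf));
    }
}
```

Without the lock and the second null check, two threads could each run the loader and end up holding different index objects over the same file, which is the situation the issue describes. Once the position sits at the limit, any further `putLong` fails with `java.nio.BufferOverflowException`, matching the `Caused by` frame in the stack trace.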