[ https://issues.apache.org/jira/browse/KAFKA-12494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486011#comment-17486011 ]
Sergey Ivanov commented on KAFKA-12494:
---------------------------------------

Hi all,

We faced a similar issue in our environment while testing failover scenarios: we force-killed the Kafka brokers under load, and after restarting them and applying some more load, they started raising many errors of the form _java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code_. For example:

{code:java}
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
	at kafka.server.FullFetchContext.$anonfun$updateAndGenerateResponseData$3(FetchSession.scala:373)
	at java.base/java.util.LinkedHashMap.forEach(Unknown Source)
	at kafka.server.FullFetchContext.createNewSession$1(FetchSession.scala:372)
{code}
{code:java}
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
	at java.base/java.util.LinkedHashMap.newNode(Unknown Source)
	at java.base/java.util.HashMap.putVal(Unknown Source)
	at java.base/java.util.HashMap.put(Unknown Source)
{code}
{code:java}
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
	at kafka.server.LogReadResult.error(ReplicaManager.scala:104)
	at kafka.server.ReplicaManager.$anonfun$updateFollowerFetchState$1(ReplicaManager.scala:1621)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:273)
{code}

and many other stack traces. java.lang.InternalError does not look like the root cause of the issue, but we couldn't find any other errors in the logs. Once a broker started raising this error it was no longer fully operational: it couldn't handle requests from clients or from other brokers. The issue went away after a restart (so a restart looks like a workaround), but we can't tell what the +root cause+ is. Please correct me if this is not the right ticket for this issue.
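For what it's worth, this particular InternalError is usually how HotSpot surfaces a SIGBUS taken while touching a memory-mapped file whose backing pages can no longer be read (Kafka mmaps its index files), so a failing disk would fit. Below is a minimal sketch that reproduces the same error on Linux; the path /tmp/mmap-fault-demo is arbitrary, and truncating the file under a live mapping stands in for the disk losing the sector:

{code:java}
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapFaultDemo {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile("/tmp/mmap-fault-demo", "rw")) {
            // Map one page of the file, the way Kafka maps its offset/time index files.
            raf.setLength(4096);
            MappedByteBuffer buf = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            // Drop the backing pages behind the live mapping -- a stand-in for a
            // medium error on the underlying disk sector.
            raf.setLength(0);
            // Touching the unbacked page raises SIGBUS, which the JVM rethrows as
            // java.lang.InternalError rather than as any IOException.
            buf.get(0);
        }
    }
}
{code}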
> Broker raises InternalError after disk sector medium error without marking dir offline
> ---------------------------------------------------------------------------------------
>
>                 Key: KAFKA-12494
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12494
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.4.0, 2.6.0, 2.5.1, 2.7.0
>         Environment: Kafka Version: 1.1.0
>                      Jdk Version: jdk1.8
>            Reporter: iBlackeyes
>            Priority: Major
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In my production env, we encountered a case where the Kafka broker only raises errors like
> _*2021-02-16 23:24:24,965 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on partition xxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)*_
> _*java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code*_
> when it appends to a disk sector with a medium error, and does not mark the log dir on that disk offline.
> As a result, the many partitions with replicas assigned to that disk stay in an under-replicated state.
> Here are the logs:
> *os messages log:*
> {code:java}
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current]
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current]
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current]
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current]
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> {code}
> *broker server.log:*
> {code:java}
> 2021-02-16 23:24:24,965 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on xxxxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)
> java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
> 	at java.util.zip.Inflater.<init>(Inflater.java:102)
> 	at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:77)
> 	at org.apache.kafka.common.record.CompressionType$2.wrapForInput(CompressionType.java:69)
> 	at org.apache.kafka.common.record.DefaultRecordBatch.compressedIterator(DefaultRecordBatch.java:265)
> 	at org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:332)
> 	at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:54)
> 	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> 	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> 	at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:267)
> 	at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:259)
> 	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> 	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> 	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> 	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> 	at kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:259)
> 	at kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:70)
> 	at kafka.log.Log$$anonfun$append$2.liftedTree1$1(Log.scala:672)
> 	at kafka.log.Log$$anonfun$append$2.apply(Log.scala:671)
> 	at kafka.log.Log$$anonfun$append$2.apply(Log.scala:653)
> 	at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
> 	at kafka.log.Log.append(Log.scala:653)
> 	at kafka.log.Log.appendAsLeader(Log.scala:623)
> 	at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:609)
> 	at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:597)
> 	at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
> 	at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:256)
> 	at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:596)
> 	at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:739)
> 	at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:723)
> 	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> 	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> 	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
> 	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
> 	at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
> 	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> 	at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
> 	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> 	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> 	at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:723)
> 	at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:464)
> 	at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:471)
> 	at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
> 	at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
> 	at java.lang.Thread.run(Thread.java:748)
> 2021-02-16 23:24:24,999 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on partition xxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)
> java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
> 	at java.util.zip.Inflater.<init>(Inflater.java:102)
> 	at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:77)
> 	at org.apache.kafka.common.record.CompressionType$2.wrapForInput(CompressionType.java:69)
> 	at org.apache.kafka.common.record.DefaultRecordBatch.compressedIterator(DefaultRecordBatch.java:265)
> 	at org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:332)
> 	at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:54)
> 	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> 	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> 	at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:267)
> 	at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:259)
> 	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> 	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> 	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> 	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> 	at kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:259)
> 	at kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:70)
> 	at kafka.log.Log$$anonfun$append$2.liftedTree1$1(Log.scala:672)
> 	at kafka.log.Log$$anonfun$append$2.apply(Log.scala:671)
> 	at kafka.log.Log$$anonfun$append$2.apply(Log.scala:653)
> 	at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
> 	at kafka.log.Log.append(Log.scala:653)
> 	at kafka.log.Log.appendAsLeader(Log.scala:623)
> 	at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:609)
> 	at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:597)
> 	at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
> 	at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:256)
> 	at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:596)
> 	at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:739)
> 	at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:723)
> 	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> 	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> 	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
> 	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
> 	at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
> 	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> 	at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
> 	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> 	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> 	at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:723)
> 	at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:464)
> 	at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:471)
> 	at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
> 	at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
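>
> The append path in these traces goes through kafka.log.Log.maybeHandleIOException (Log.scala:1711 above), which only reacts to IOException. The following is a simplified Java rendering of that pattern, not the actual Scala source: only maybeHandleIOException and LogDirFailureChannel.maybeAddOfflineLogDir are real Kafka names; the surrounding class and helper bodies are illustrative.
>
> {code:java}
> import java.io.IOException;
>
> public class OfflineDirSketch {
>
>     @FunctionalInterface
>     interface IoAction<T> {
>         T run() throws IOException;
>     }
>
>     // Stand-in for Kafka's LogDirFailureChannel.maybeAddOfflineLogDir (a real
>     // Kafka method; this body is just illustrative).
>     static void maybeAddOfflineLogDir(String dir, String msg, Throwable cause) {
>         System.err.println("Marking log dir " + dir + " offline: " + msg + " (" + cause + ")");
>     }
>
>     // Simplified Java rendering of kafka.log.Log.maybeHandleIOException
>     // (Log.scala:1711 in the traces above); not the actual Scala source.
>     static <T> T maybeHandleIOException(String dir, String msg, IoAction<T> action) throws IOException {
>         try {
>             return action.run();
>         } catch (IOException e) {
>             // An IOException from the failed disk marks the whole dir offline.
>             maybeAddOfflineLogDir(dir, msg, e);
>             throw e; // the real code rethrows a KafkaStorageException
>         }
>         // java.lang.InternalError (thrown on the SIGBUS from the bad sector) is
>         // an Error, not an IOException: it bypasses this catch entirely, so the
>         // dir is never marked offline and every later append fails the same way.
>     }
>
>     public static void main(String[] args) throws IOException {
>         maybeHandleIOException("/data/kafka-logs", "append failed", () -> {
>             // Simulate the fault from the traces: not caught by the handler above.
>             throw new InternalError("a fault occurred in a recent unsafe memory access operation");
>         });
>     }
> }
> {code}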
--
This message was sent by Atlassian Jira
(v8.20.1#820001)