[ https://issues.apache.org/jira/browse/KAFKA-12494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486011#comment-17486011 ]

Sergey Ivanov commented on KAFKA-12494:
---------------------------------------

Hi all,

We faced a similar issue in our environment when testing failover scenarios: we
force-shut-down Kafka brokers while they were working, and after a restart and
some load they started raising a lot of errors with {_}"java.lang.InternalError:
a fault occurred in a recent unsafe memory access operation in compiled Java
code"{_}.

For example:
{code:java}
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
        at kafka.server.FullFetchContext.$anonfun$updateAndGenerateResponseData$3(FetchSession.scala:373)
        at java.base/java.util.LinkedHashMap.forEach(Unknown Source)
        at kafka.server.FullFetchContext.createNewSession$1(FetchSession.scala:372)
{code}
{code:java}
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
        at java.base/java.util.LinkedHashMap.newNode(Unknown Source)
        at java.base/java.util.HashMap.putVal(Unknown Source)
        at java.base/java.util.HashMap.put(Unknown Source)
{code}
{code:java}
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
        at kafka.server.LogReadResult.error(ReplicaManager.scala:104)
        at kafka.server.ReplicaManager.$anonfun$updateFollowerFetchState$1(ReplicaManager.scala:1621)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:273)
{code}
And many other stack traces.

It looks like java.lang.InternalError is not the root cause of the issue, but we
couldn't find any other errors in the logs.
After a Kafka broker started raising this error it was no longer fully
operational: it couldn't handle requests from clients or from other brokers.
The issue went away after a restart (which looks like a workaround), but we
can't imagine what the +root cause+ of this issue is.
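
For reference on the mechanism: this exact message is what HotSpot raises when a
read from a memory-mapped file hits a hardware I/O fault (SIGBUS), e.g. a bad
sector under a mapped page, so the fault surfaces as an Error rather than an
IOException. A minimal sketch of that behavior on Linux with HotSpot
(hypothetical MmapFaultDemo class; it simulates the fault by truncating a mapped
file, since a real medium error can't be scripted):
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapFaultDemo {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("mmap-fault", ".bin");
        try (FileChannel ch = FileChannel.open(p,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.allocate(4096));              // one 4 KiB page
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, 4096);
            ch.truncate(0);   // backing pages vanish while the mapping lives on
            try {
                buf.get(100); // touching a gone page raises SIGBUS in the JVM
            } catch (InternalError e) {
                // HotSpot turns the SIGBUS into InternalError, not IOException,
                // so IOException-based storage error handling never fires
                System.out.println("caught: " + e);
            }
        } finally {
            Files.deleteIfExists(p);
        }
    }
}
{code}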

Please correct me if this is not the right ticket for this issue.

> Broker raises InternalError after disk sector medium error without marking dir offline
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-12494
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12494
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.4.0, 2.6.0, 2.5.1, 2.7.0
>         Environment: Kafka Version: 1.1.0
> Jdk Version:  jdk1.8
>            Reporter: iBlackeyes
>            Priority: Major
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In my production env, we encountered a case where the Kafka broker only raises errors like
>  `_*2021-02-16 23:24:24,965 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on partition xxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)*_
> _*java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code*_`
> when the broker appends to a bad disk sector, and doesn't mark the dir on this disk offline.
> This results in many partitions whose replicas are assigned on this disk staying in an under-replicated state.
> Here are the logs:
> *os messages log:*
> {code:java}
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current]
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current]
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current]
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current]
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
> Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
> Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408{code}
> *broker server.log:*
> {code:java}
> 2021-02-16 23:24:24,965 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on xxxxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)
> java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
>         at java.util.zip.Inflater.<init>(Inflater.java:102)
>         at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:77)
>         at org.apache.kafka.common.record.CompressionType$2.wrapForInput(CompressionType.java:69)
>         at org.apache.kafka.common.record.DefaultRecordBatch.compressedIterator(DefaultRecordBatch.java:265)
>         at org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:332)
>         at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:54)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:267)
>         at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:259)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:259)
>         at kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:70)
>         at kafka.log.Log$$anonfun$append$2.liftedTree1$1(Log.scala:672)
>         at kafka.log.Log$$anonfun$append$2.apply(Log.scala:671)
>         at kafka.log.Log$$anonfun$append$2.apply(Log.scala:653)
>         at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
>         at kafka.log.Log.append(Log.scala:653)
>         at kafka.log.Log.appendAsLeader(Log.scala:623)
>         at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:609)
>         at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:597)
>         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
>         at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:256)
>         at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:596)
>         at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:739)
>         at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:723)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>         at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>         at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>         at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:723)
>         at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:464)
>         at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:471)
>         at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
>         at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
>         at java.lang.Thread.run(Thread.java:748)
> 2021-02-16 23:24:24,999 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on partition xxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)
> java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
>         at java.util.zip.Inflater.<init>(Inflater.java:102)
>         at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:77)
>         at org.apache.kafka.common.record.CompressionType$2.wrapForInput(CompressionType.java:69)
>         at org.apache.kafka.common.record.DefaultRecordBatch.compressedIterator(DefaultRecordBatch.java:265)
>         at org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:332)
>         at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:54)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:267)
>         at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:259)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:259)
>         at kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:70)
>         at kafka.log.Log$$anonfun$append$2.liftedTree1$1(Log.scala:672)
>         at kafka.log.Log$$anonfun$append$2.apply(Log.scala:671)
>         at kafka.log.Log$$anonfun$append$2.apply(Log.scala:653)
>         at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
>         at kafka.log.Log.append(Log.scala:653)
>         at kafka.log.Log.appendAsLeader(Log.scala:623)
>         at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:609)
>         at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:597)
>         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
>         at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:256)
>         at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:596)
>         at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:739)
>         at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:723)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>         at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>         at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>         at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:723)
>         at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:464)
>         at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:471)
>         at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
>         at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
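
If the reporter's reading is right, the gap is that the fault surfaces as
InternalError rather than IOException, so Log.maybeHandleIOException (visible in
the trace above) never marks the log dir offline. A guard along these lines
(hypothetical MappedReadGuard helper; a sketch, not Kafka's actual code) would
let the existing IOException-based recovery path run instead:
{code:java}
import java.io.IOException;
import java.nio.MappedByteBuffer;

// Hypothetical helper: re-wrap a fault-induced InternalError from a mapped
// read as IOException, so IOException-based recovery (e.g. marking the log
// dir offline) can take over. A sketch, not Kafka's actual implementation.
final class MappedReadGuard {
    static byte readAt(MappedByteBuffer buf, int pos) throws IOException {
        try {
            return buf.get(pos); // may fault if the backing sector is bad
        } catch (InternalError e) {
            throw new IOException("I/O fault reading mapped page at " + pos, e);
        }
    }
}
{code}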



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
