[ https://issues.apache.org/jira/browse/KAFKA-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16577706#comment-16577706 ]
Dong Lin commented on KAFKA-7278:
---------------------------------

[~ijuma] The exception is probably thrown from `segment.changeFileSuffixes("", Log.DeletedFileSuffix)`. Below is the stack trace from the discussion of https://issues.apache.org/jira/browse/KAFKA-6188.

{code}
[2018-05-07 16:53:06,721] ERROR Failed to clean up log for __consumer_offsets-24 in dir /tmp/kafka-logs due to IOException (kafka.server.LogDirFailureChannel)
java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-24/00000000000000000000.log
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
        at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
        at java.nio.file.Files.move(Files.java:1395)
        at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:697)
        at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:212)
        at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:415)
        at kafka.log.Log.asyncDeleteSegment(Log.scala:1601)
        at kafka.log.Log.$anonfun$replaceSegments$1(Log.scala:1653)
        at kafka.log.Log.$anonfun$replaceSegments$1$adapted(Log.scala:1648)
        at scala.collection.immutable.List.foreach(List.scala:389)
        at kafka.log.Log.replaceSegments(Log.scala:1648)
        at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:535)
        at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:462)
        at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:461)
        at scala.collection.immutable.List.foreach(List.scala:389)
        at kafka.log.Cleaner.doClean(LogCleaner.scala:461)
        at kafka.log.Cleaner.clean(LogCleaner.scala:438)
        at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:305)
        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:291)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
        Suppressed: java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-24/00000000000000000000.log -> /tmp/kafka-logs/__consumer_offsets-24/00000000000000000000.log.deleted
                at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
                at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
                at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
                at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
                at java.nio.file.Files.move(Files.java:1395)
                at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694)
                ... 16 more
[2018-05-07 16:53:06,725] INFO [ReplicaManager broker=0] Stopping serving replicas in dir /tmp/kafka-logs (kafka.server.ReplicaManager)
[2018-05-07 16:53:06,762] INFO Stopping serving logs in dir /tmp/kafka-logs (kafka.log.LogManager)
[2018-05-07 16:53:07,032] ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)
{code}

> replaceSegments() should not call asyncDeleteSegment() for segments which
> have been removed from segments list
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7278
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7278
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>            Priority: Major
>
> Currently Log.replaceSegments() calls `asyncDeleteSegment(...)` for every
> segment listed in `oldSegments`. oldSegments should be constructed from
> Log.segments and contain only segments listed in Log.segments.
> However, Log.segments may be modified between the time oldSegments is
> determined and the time Log.replaceSegments() is called. If there is a
> concurrent async deletion of the same log segment file, Log.replaceSegments()
> will call asyncDeleteSegment() for a segment that no longer exists, and the
> Kafka server may shut down the log directory due to NoSuchFileException.
> This is likely the root cause of
> https://issues.apache.org/jira/browse/KAFKA-6188.
> Given this understanding of the problem, we should be able to fix the issue by
> deleting a segment only if it can still be found in Log.segments.
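
To make the proposed fix concrete, here is a minimal sketch in Java of the "only delete a segment if it is still in Log.segments" check. It is not the actual Kafka code or the actual patch: `Segment`, `SegmentReplaceSketch`, `deleteConcurrently`, and `baseOffsets` are hypothetical stand-ins, the real `Log` is written in Scala, and the async file rename is simulated by returning the base offsets that would have been passed to `asyncDeleteSegment()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for a log segment; only the base offset matters here.
final class Segment {
    final long baseOffset;
    Segment(long baseOffset) { this.baseOffset = baseOffset; }
}

// Hypothetical stand-in for the parts of Log discussed in this issue.
final class SegmentReplaceSketch {
    // Plays the role of Log.segments: base offset -> segment.
    private final ConcurrentNavigableMap<Long, Segment> segments = new ConcurrentSkipListMap<>();
    private final Object lock = new Object();

    void add(Segment s) {
        synchronized (lock) { segments.put(s.baseOffset, s); }
    }

    // Simulates another code path removing a segment after oldSegments was computed.
    void deleteConcurrently(long baseOffset) {
        synchronized (lock) { segments.remove(baseOffset); }
    }

    List<Long> baseOffsets() {
        synchronized (lock) { return new ArrayList<>(segments.keySet()); }
    }

    // Returns the base offsets that would be handed to asyncDeleteSegment().
    List<Long> replaceSegments(Segment newSegment, List<Segment> oldSegments) {
        synchronized (lock) {
            // The guard: keep only old segments that are still registered.
            List<Segment> stillPresent = new ArrayList<>();
            for (Segment old : oldSegments) {
                if (segments.get(old.baseOffset) == old) {
                    stillPresent.add(old);
                }
            }
            segments.put(newSegment.baseOffset, newSegment);
            List<Long> scheduled = new ArrayList<>();
            for (Segment old : stillPresent) {
                // Do not remove the slot that now holds the new segment.
                if (old.baseOffset != newSegment.baseOffset) {
                    segments.remove(old.baseOffset);
                }
                scheduled.add(old.baseOffset); // stand-in for asyncDeleteSegment(old)
            }
            return scheduled;
        }
    }

    public static void main(String[] args) {
        SegmentReplaceSketch log = new SegmentReplaceSketch();
        Segment s0 = new Segment(0L);
        Segment s100 = new Segment(100L);
        log.add(s0);
        log.add(s100);
        log.deleteConcurrently(100L); // segment 100 disappears before the replace
        System.out.println(log.replaceSegments(new Segment(0L), Arrays.asList(s0, s100)));
    }
}
```

In this sketch the segment at offset 100 is skipped because it is no longer in the map, so no rename would be attempted on a file that is already gone. Whether the real fix should compare by base offset or by object identity is a design choice for the patch; identity is used here only to keep the example unambiguous.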