[ https://issues.apache.org/jira/browse/KAFKA-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219019#comment-16219019 ]
ASF GitHub Bot commented on KAFKA-6075: --------------------------------------- GitHub user tedyu opened a pull request: https://github.com/apache/kafka/pull/4134 KAFKA-6075 Kafka cannot recover after an unclean shutdown on Windows As Vahid commented, Files.deleteIfExists(file.toPath) seems to destabilize Windows environment. This PR reverts to calling delete() directly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tedyu/kafka trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/4134.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4134 ---- commit c734471f496b2bace8359f9c899cca73e636aa8d Author: tedyu <yuzhih...@gmail.com> Date: 2017-10-25T16:28:18Z KAFKA-6075 Kafka cannot recover after an unclean shutdown on Windows commit 7355e8c282b1cb7d70f2c290702b5b216f28d3cd Author: tedyu <yuzhih...@gmail.com> Date: 2017-10-25T16:36:50Z Use delete() ---- > Kafka cannot recover after an unclean shutdown on Windows > --------------------------------------------------------- > > Key: KAFKA-6075 > URL: https://issues.apache.org/jira/browse/KAFKA-6075 > Project: Kafka > Issue Type: Bug > Affects Versions: 0.11.0.1 > Reporter: Vahid Hashemian > > An unclean shutdown of broker on Windows cannot be recovered by Kafka. Steps > to reproduce from a fresh build: > # Start zookeeper > # Start a broker > # Create a topic {{test}} > # Do an unclean shutdown of broker (find the process id by {{wmic process > where "caption = 'java.exe' and commandline like '%server.properties%'" get > processid}}), then kill the process by {{taskkill /pid #### /f}} > # Start the broker again > This leads to the following errors: > {code} > [2017-10-17 17:13:24,819] ERROR Error while loading log dir C:\tmp\kafka-logs > (kafka.log.LogManager) > java.nio.file.FileSystemException: > C:\tmp\kafka-logs\test-0\00000000000000000000.timeindex: The process cannot > access the file because it is being used by another process. > at > sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) > at > sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) > at > sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) > at > sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) > at > sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108) > at java.nio.file.Files.deleteIfExists(Files.java:1165) > at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:333) > at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:295) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) > at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at kafka.log.Log.loadSegmentFiles(Log.scala:295) > at kafka.log.Log.loadSegments(Log.scala:404) > at kafka.log.Log.<init>(Log.scala:201) > at kafka.log.Log$.apply(Log.scala:1729) > at > kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:221) > at > kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$8$$anonfun$apply$16$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:292) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > [2017-10-17 17:13:24,819] ERROR Error while deleting the clean shutdown file > in dir C:\tmp\kafka-logs (kafka.server.LogDirFailureChannel) > java.nio.file.FileSystemException: > C:\tmp\kafka-logs\test-0\00000000000000000000.timeindex: The process cannot > access the file because it is being used by another process. > at > sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) > at > sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) > at > sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) > at > sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) > at > sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108) > at java.nio.file.Files.deleteIfExists(Files.java:1165) > at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:333) > at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:295) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) > at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at kafka.log.Log.loadSegmentFiles(Log.scala:295) > at kafka.log.Log.loadSegments(Log.scala:404) > at kafka.log.Log.<init>(Log.scala:201) > at kafka.log.Log$.apply(Log.scala:1729) > at > kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:221) > at > kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$8$$anonfun$apply$16$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:292) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > [2017-10-17 17:13:24,819] INFO Logs loading complete in 47 ms. > (kafka.log.LogManager) > [2017-10-17 17:13:24,865] WARN Error processing > kafka.log:type=LogManager,name=LogDirectoryOffline,logDirectory=C:\tmp\kafka-logs > (com.yammer.metrics.reporting.JmxReporter) > javax.management.MalformedObjectNameException: Invalid character ':' in value > part of property > at javax.management.ObjectName.construct(ObjectName.java:618) > at javax.management.ObjectName.<init>(ObjectName.java:1382) > at > com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395) > at > com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516) > at > com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491) > at > com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79) > at > kafka.metrics.KafkaMetricsGroup$class.newGauge(KafkaMetricsGroup.scala:80) > at kafka.log.LogManager.newGauge(LogManager.scala:50) > at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:117) > at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:116) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at kafka.log.LogManager.<init>(LogManager.scala:116) > at kafka.log.LogManager$.apply(LogManager.scala:799) > at kafka.server.KafkaServer.startup(KafkaServer.scala:222) > at > kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38) > at kafka.Kafka$.main(Kafka.scala:92) > at kafka.Kafka.main(Kafka.scala) > [2017-10-17 17:13:24,865] INFO Starting log cleanup with a period of 300000 > ms. (kafka.log.LogManager) > [2017-10-17 17:13:24,881] INFO Starting log flusher with a default period of > 9223372036854775807 ms. (kafka.log.LogManager) > [2017-10-17 17:13:25,131] INFO Awaiting socket connections on 0.0.0.0:9092. > (kafka.network.Acceptor) > [2017-10-17 17:13:25,147] INFO [SocketServer brokerId=0] Started 1 acceptor > threads (kafka.network.SocketServer) > [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-Produce]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-DeleteRecords]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-Fetch]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2017-10-17 17:13:25,162] INFO [LogDirFailureHandler]: Starting > (kafka.server.ReplicaManager$LogDirFailureHandler) > [2017-10-17 17:13:25,162] INFO [ReplicaManager broker=0] Stopping serving > replicas in dir C:\tmp\kafka-logs (kafka.server.ReplicaManager) > [2017-10-17 17:13:25,162] INFO [ReplicaManager broker=0] Partitions are > offline due to failure on log directory C:\tmp\kafka-logs > (kafka.server.ReplicaManager) > [2017-10-17 17:13:25,162] INFO [ReplicaFetcherManager on broker 0] Removed > fetcher for partitions (kafka.server.ReplicaFetcherManager) > [2017-10-17 17:13:25,178] INFO [ReplicaManager broker=0] Broker 0 stopped > fetcher for partitions because they are in the failed log dir > C:\tmp\kafka-logs (kafka.server.ReplicaManager) > [2017-10-17 17:13:25,178] INFO Stopping serving logs in dir C:\tmp\kafka-logs > (kafka.log.LogManager) > [2017-10-17 17:13:25,178] FATAL Shutdown broker because all log dirs in > C:\tmp\kafka-logs have failed (kafka.log.LogManager) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)