[ 
https://issues.apache.org/jira/browse/KAFKA-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057699#comment-17057699
 ] 

ASF GitHub Bot commented on KAFKA-9458:
---------------------------------------

hirik commented on pull request #8276: KAFKA-9458: Kafka broker crashes in 
windows platform
URL: https://github.com/apache/kafka/pull/8276
 
 
   Reason: In LogSegment, files are renamed according to the status flag 
(.deleted, .cleaned, .swap). On Windows, renaming or deleting a file that is 
still open is not allowed.
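   To illustrate the Windows behavior, here is a standalone sketch (not Kafka 
code; class and file names are made up for the demo). Kafka keeps its index 
files memory-mapped, and on Windows a mapped file generally cannot be renamed 
while the mapping is still open, which matches the stack trace below.
   
       import java.io.IOException;
       import java.nio.MappedByteBuffer;
       import java.nio.channels.FileChannel;
       import java.nio.file.*;
       
       public class WindowsRenameDemo {
           public static void main(String[] args) throws IOException {
               Path index = Files.createTempFile("00000000000000000000", ".timeindex");
               try (FileChannel channel = FileChannel.open(index,
                       StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                   // Map the file, similar to how Kafka keeps its index files mmapped.
                   MappedByteBuffer mmap = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                   mmap.putLong(0L);
       
                   // While the mapping is alive, this rename is expected to fail on
                   // Windows (the failure mode reported in this issue); on Linux it
                   // succeeds.
                   Path renamed = index.resolveSibling(index.getFileName() + ".deleted");
                   Files.move(index, renamed, StandardCopyOption.ATOMIC_MOVE);
               }
           }
       }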
   
   Existing behavior (LogSegment management):
   When deleting a segment or creating a cleaned segment, Kafka changes the 
segment's status, and that status is recorded in the file suffix.
   
   Possible solutions:
   1. Close the open files before renaming or deleting them.
   2. Maintain a separate status file that stores the segment's status.
   
   I've implemented the second solution, and it is running successfully in my 
Windows environment. Please review the changes and let me know your comments.
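   A rough, hypothetical sketch of what option 2 could look like (illustrative 
only; the names and layout here are not taken from the actual patch): record 
the segment status in a small sidecar file, so that files which may still be 
open or memory-mapped never need to be renamed.
   
       import java.io.IOException;
       import java.nio.charset.StandardCharsets;
       import java.nio.file.*;
       
       // Hypothetical helper: stores a segment's status next to its data files,
       // e.g. 00000000000000000000.status, so no open file ever needs renaming.
       final class SegmentStatusFile {
           enum Status { ACTIVE, CLEANED, SWAP, DELETED }
       
           private final Path statusPath;
       
           SegmentStatusFile(Path segmentBaseFile) {
               this.statusPath = segmentBaseFile.resolveSibling(
                       segmentBaseFile.getFileName() + ".status");
           }
       
           // Write to a temp file first, then move it into place, so a crash
           // never leaves a half-written status file behind.
           void write(Status status) throws IOException {
               Path tmp = statusPath.resolveSibling(statusPath.getFileName() + ".tmp");
               Files.write(tmp, status.name().getBytes(StandardCharsets.UTF_8));
               Files.move(tmp, statusPath, StandardCopyOption.REPLACE_EXISTING);
           }
       
           // Segments without a status file are treated as active.
           Status read() throws IOException {
               if (!Files.exists(statusPath)) {
                   return Status.ACTIVE;
               }
               String text = new String(Files.readAllBytes(statusPath), StandardCharsets.UTF_8);
               return Status.valueOf(text.trim());
           }
       }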
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kafka crashed in windows environment
> ------------------------------------
>
>                 Key: KAFKA-9458
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9458
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 2.4.0
>         Environment: Windows Server 2019
>            Reporter: hirik
>            Priority: Critical
>              Labels: windows
>             Fix For: 2.6.0
>
>         Attachments: Windows_crash_fix.patch, logs.zip
>
>
> Hi,
> While I was validating the Kafka retention policy, the Kafka server crashed 
> with the exception trace below. 
> [2020-01-21 17:10:40,475] INFO [Log partition=test1-3, 
> dir=C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka] 
> Rolled new log segment at offset 1 in 52 ms. (kafka.log.Log)
> [2020-01-21 17:10:40,484] ERROR Error while deleting segments for test1-3 in 
> dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka 
> (kafka.server.LogDirFailureChannel)
> java.nio.file.FileSystemException: 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex
>  -> 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted:
>  The process cannot access the file because it is being used by another 
> process.
> at 
> java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
>  at 
> java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
>  at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
>  at 
> java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
>  at java.base/java.nio.file.Files.move(Files.java:1425)
>  at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795)
>  at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209)
>  at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
>  at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206)
>  at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206)
>  at scala.collection.immutable.List.foreach(List.scala:305)
>  at kafka.log.Log.deleteSegmentFiles(Log.scala:2206)
>  at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191)
>  at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700)
>  at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17)
>  at kafka.log.Log.maybeHandleIOException(Log.scala:2316)
>  at kafka.log.Log.deleteSegments(Log.scala:1691)
>  at kafka.log.Log.deleteOldSegments(Log.scala:1686)
>  at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763)
>  at kafka.log.Log.deleteOldSegments(Log.scala:1753)
>  at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982)
>  at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979)
>  at scala.collection.immutable.List.foreach(List.scala:305)
>  at kafka.log.LogManager.cleanupLogs(LogManager.scala:979)
>  at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403)
>  at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
>  at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
>  at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>  at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:830)
>  Suppressed: java.nio.file.FileSystemException: 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex
>  -> 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted:
>  The process cannot access the file because it is being used by another 
> process.
> at 
> java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
>  at 
> java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
>  at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
>  at 
> java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
>  at java.base/java.nio.file.Files.move(Files.java:1425)
>  at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792)
>  ... 27 more
> [2020-01-21 17:10:40,495] INFO [ReplicaManager broker=0] Stopping serving 
> replicas in dir 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka 
> (kafka.server.ReplicaManager)
> [2020-01-21 17:10:40,495] ERROR Uncaught exception in scheduled task 
> 'kafka-log-retention' (kafka.utils.KafkaScheduler)
> org.apache.kafka.common.errors.KafkaStorageException: Error while deleting 
> segments for test1-3 in dir 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka
> Caused by: java.nio.file.FileSystemException: 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex
>  -> 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted:
>  The process cannot access the file because it is being used by another 
> process.
> at 
> java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
>  at 
> java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
>  at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
>  at 
> java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
>  at java.base/java.nio.file.Files.move(Files.java:1425)
>  at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795)
>  at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209)
>  at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
>  at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206)
>  at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206)
>  at scala.collection.immutable.List.foreach(List.scala:305)
>  at kafka.log.Log.deleteSegmentFiles(Log.scala:2206)
>  at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191)
>  at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700)
>  at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17)
>  at kafka.log.Log.maybeHandleIOException(Log.scala:2316)
>  at kafka.log.Log.deleteSegments(Log.scala:1691)
>  at kafka.log.Log.deleteOldSegments(Log.scala:1686)
>  at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763)
>  at kafka.log.Log.deleteOldSegments(Log.scala:1753)
>  at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982)
>  at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979)
>  at scala.collection.immutable.List.foreach(List.scala:305)
>  at kafka.log.LogManager.cleanupLogs(LogManager.scala:979)
>  at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403)
>  at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
>  at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
>  at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>  at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:830)
>  Suppressed: java.nio.file.FileSystemException: 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex
>  -> 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted:
>  The process cannot access the file because it is being used by another 
> process.
> at 
> java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
>  at 
> java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
>  at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
>  at 
> java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
>  at java.base/java.nio.file.Files.move(Files.java:1425)
>  at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792)
>  ... 27 more
> [2020-01-21 17:10:40,505] INFO [ReplicaFetcherManager on broker 0] Removed 
> fetcher for partitions HashSet(test1-3, test1-7, test-0, test1-0, test1-1, 
> test1-5, test1-2, test1-8, test1-4, test1-9, test1-6) 
> (kafka.server.ReplicaFetcherManager)
> [2020-01-21 17:10:40,507] INFO [ReplicaAlterLogDirsManager on broker 0] 
> Removed fetcher for partitions HashSet(test1-3, test1-7, test-0, test1-0, 
> test1-1, test1-5, test1-2, test1-8, test1-4, test1-9, test1-6) 
> (kafka.server.ReplicaAlterLogDirsManager)
> [2020-01-21 17:10:40,522] INFO [ReplicaManager broker=0] Broker 0 stopped 
> fetcher for partitions 
> test1-3,test1-7,test-0,test1-0,test1-1,test1-5,test1-2,test1-8,test1-4,test1-9,test1-6
>  and stopped moving logs for partitions because they are in the failed log 
> directory 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka. 
> (kafka.server.ReplicaManager)
> [2020-01-21 17:10:40,523] INFO Stopping serving logs in dir 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka 
> (kafka.log.LogManager)
> [2020-01-21 17:10:40,526] ERROR Shutdown broker because all log dirs in 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka have 
> failed (kafka.log.LogManager)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
