DomGarguilo opened a new issue #1916: URL: https://github.com/apache/accumulo/issues/1916
**Describe the bug**

While investigating a flaky test ([SuspendedTabletsIT](https://github.com/apache/accumulo/pull/1888)), it was found that the garbage collector deleted a metadata file that the test still needed; the test then failed on its next attempt to read the metadata. The error was reported in [this comment](https://github.com/apache/accumulo/pull/1888#issuecomment-768674007).

I cannot find anything suggesting that SuspendedTabletsIT directly caused this error; the fault appears to lie in the garbage collector's behavior. The error has occurred only once while running this test, and I was unable to reproduce it.

**Logs**

_Originally posted by @ctubbsii in https://github.com/apache/accumulo/issues/1888#issuecomment-768674007_

<details>

```java
[ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 605.616 s <<< FAILURE! - in org.apache.accumulo.test.master.SuspendedTabletsIT
[ERROR] crashAndResumeTserver(org.apache.accumulo.test.master.SuspendedTabletsIT) Time elapsed: 347.964 s <<< FAILURE!
```

```java
java.lang.AssertionError: Scanning of metadata failed, aborting
    at org.junit.Assert.fail(Assert.java:89)
    at org.apache.accumulo.test.master.SuspendedTabletsIT$TabletLocations.retrieve(SuspendedTabletsIT.java:306)
    at org.apache.accumulo.test.master.SuspendedTabletsIT.suspensionTestBody(SuspendedTabletsIT.java:208)
    at org.apache.accumulo.test.master.SuspendedTabletsIT.crashAndResumeTserver(SuspendedTabletsIT.java:101)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.lang.Thread.run(Thread.java:834)
```

</details>

<details>

```java
2021-01-27T17:59:14,161 [gc.SimpleGarbageCollector] DEBUG: Deleting file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/
```

```java
2021-01-27T18:00:10,691 [tserver.FileManager] ERROR: Failed to open file file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf
java.io.FileNotFoundException: File file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf does not exist
2021-01-27T18:00:10,693 [tserver.FileManager] ERROR: Failed to open file file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf
java.io.FileNotFoundException: File file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf does not exist
2021-01-27T18:00:10,693 [tserver.FileManager] ERROR: Failed to open file file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf
java.io.FileNotFoundException: File file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf does not exist
2021-01-27T18:00:10,694 [problems.ProblemReports] DEBUG: Filing problem report !0 FILE_READ file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf
2021-01-27T18:00:10,694 [scan.LookupTask] WARN : lookup failed for tablet !0;~<
java.io.IOException: Failed to open file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf
    at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:331) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:492) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:501) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.tablet.ScanDataSource.createIterator(ScanDataSource.java:164) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.tablet.ScanDataSource.iterator(ScanDataSource.java:120) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.iteratorsImpl.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:228) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.tablet.Tablet.lookup(Tablet.java:493) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.tablet.Tablet.lookup(Tablet.java:646) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.scan.LookupTask.run(LookupTask.java:117) [accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.session.ScanSession$ScanMeasurer.run(ScanSession.java:54) [accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57) [htrace-core-3.2.0-incubating.jar:3.2.0-incubating]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.io.UncheckedIOException: java.io.FileNotFoundException: File file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf does not exist
    at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$BCFileLoader.load(CachableBlockFile.java:227) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.cache.lru.SynchronousLoadingBlockCache.getBlock(SynchronousLoadingBlockCache.java:127) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.cache.lru.SynchronousLoadingBlockCache.resolveDependencies(SynchronousLoadingBlockCache.java:64) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.cache.lru.SynchronousLoadingBlockCache.getBlock(SynchronousLoadingBlockCache.java:109) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:381) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:1164) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:1256) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.rfile.RFileOperations.getReader(RFileOperations.java:55) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:70) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(DispatchingFileFactory.java:85) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.FileOperations$ReaderBuilder.build(FileOperations.java:449) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:309) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    ... 13 more
Caused by: java.io.FileNotFoundException: File file:/home/christopher/git/apache/accumulo/accumulo/test/target/mini-tests/org.apache.accumulo.test.master.SuspendedTabletsIT_crashAndResumeTserver/accumulo/tables/!0/table_info/F0000038.rf does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:668) ~[hadoop-client-api-3.3.0.jar:?]
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:989) ~[hadoop-client-api-3.3.0.jar:?]
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:658) ~[hadoop-client-api-3.3.0.jar:?]
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:460) ~[hadoop-client-api-3.3.0.jar:?]
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:155) ~[hadoop-client-api-3.3.0.jar:?]
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:356) ~[hadoop-client-api-3.3.0.jar:?]
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:945) ~[hadoop-client-api-3.3.0.jar:?]
    at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$CachableBuilder.lambda$fsPath$0(CachableBlockFile.java:92) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:167) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$BCFileLoader.load(CachableBlockFile.java:225) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.cache.lru.SynchronousLoadingBlockCache.getBlock(SynchronousLoadingBlockCache.java:127) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.cache.lru.SynchronousLoadingBlockCache.resolveDependencies(SynchronousLoadingBlockCache.java:64) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.cache.lru.SynchronousLoadingBlockCache.getBlock(SynchronousLoadingBlockCache.java:109) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:381) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:1164) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:1256) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.rfile.RFileOperations.getReader(RFileOperations.java:55) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:70) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(DispatchingFileFactory.java:85) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.core.file.FileOperations$ReaderBuilder.build(FileOperations.java:449) ~[accumulo-core-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:309) ~[accumulo-tserver-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    ... 13 more
```

</details>

**Additional context**

SuspendedTabletsIT sometimes hangs while running and times out. This error occurred during a run in which the timeout had been extended. The metadata file was deleted at minute 9 of the test, which failed at 10 minutes.
