Ah, this is a race condition. What happened is:

 - Region server 0 gets killed and starts shutting down
 - Master starts splitting logs
 - Master starts recovering the lease of the first HLog
 - RS 0, on its way out, archives the log at almost the same time
 - Master fails recovering because of:

org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /user/hudson/.logs/vesta.apache.org,58598,1294117333857/vesta.apache.org%3A58598.1294117406909 File does not exist. [Lease. Holder: DFSClient_-986975908, pendingcreates: 1]
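To make the sequence above concrete, here is a minimal Java sketch of the race and of one possible way to tolerate it. Everything in it is hypothetical: LeaseRaceSketch, LeaseRecoverer, looksLikeMissingFile and recoverAll are illustration-only names, not HBase or HDFS APIs, and matching on the "File does not exist" text is an assumption taken from the message above, not a documented contract. It is not the actual fix, only a way to show the shape of the problem.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names are HBase or HDFS APIs. */
public class LeaseRaceSketch {

    /** Stand-in for the master's lease-recovery call on one log file. */
    interface LeaseRecoverer {
        void recoverLease(String path) throws IOException;
    }

    /**
     * Assumption: an error whose message says the file does not exist means
     * the log was moved away (archived) by the departing region server.
     */
    static boolean looksLikeMissingFile(IOException e) {
        String msg = e.getMessage();
        return e instanceof FileNotFoundException
            || (msg != null && msg.contains("File does not exist"));
    }

    /** Returns the logs that still need splitting; skips logs that vanished. */
    static List<String> recoverAll(List<String> logs, LeaseRecoverer recoverer)
            throws IOException {
        List<String> toSplit = new ArrayList<>();
        for (String log : logs) {
            try {
                recoverer.recoverLease(log);
                toSplit.add(log);
            } catch (IOException e) {
                if (!looksLikeMissingFile(e)) {
                    throw e; // a real failure: the caller should retry the split
                }
                // The log disappeared between listing and recovery; skip it.
            }
        }
        return toSplit;
    }

    public static void main(String[] args) throws IOException {
        List<String> logs = Arrays.asList("hlog.1", "hlog.2");
        // Simulate RS 0 archiving hlog.1 just before the master recovers its lease.
        LeaseRecoverer racy = path -> {
            if (path.equals("hlog.1")) {
                throw new IOException("No lease on " + path + " File does not exist.");
            }
        };
        System.out.println(recoverAll(logs, racy)); // prints [hlog.2]
    }
}
```

Whether skipping a vanished log is actually safe depends on why it is gone; the sketch only separates "the file was archived under us" from failures that should trigger a retry of the split.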
That LeaseExpiredException confused me a lot: you would expect to see a FileNotFoundException (FNFE) instead, but it comes out as a lease problem.

J-D

On Tue, Jan 4, 2011 at 11:27 AM, Jean-Daniel Cryans <[email protected]> wrote:
> What happened is that the splitting failed after a region server was
> killed but was never retried, thus the logs weren't archived nor
> replicated.
>
> Looks like a bug in MasterFileSystem or something around that part of
> the code? Investigating.
>
> J-D
>
> On Mon, Jan 3, 2011 at 10:08 PM, Apache Hudson Server
> <[email protected]> wrote:
>> See <https://hudson.apache.org/hudson/job/hbase-0.90/48/changes>
>>
>> Changes:
>>
>> [todd] HBASE-3392. Update backport of InputSampler to reflect MAPREDUCE-1820
>>
>> [stack] HBASE-3344 Master aborts after RPC to server that was shutting down
>>
>> ------------------------------------------
>> [...truncated 2738 lines...]
>> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.015 sec
>> Running org.apache.hadoop.hbase.rest.TestStatusResource
>> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.691 sec
>> Running org.apache.hadoop.hbase.executor.TestExecutorService
>> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.048 sec
>> Running org.apache.hadoop.hbase.client.TestFromClientSide
>> Tests run: 41, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 352.09 sec
>> Running org.apache.hadoop.hbase.replication.TestReplication
>> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 195.094 sec
>> <<< FAILURE!
>> Running org.apache.hadoop.hbase.regionserver.TestCompaction >> Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 152.126 sec >> Running org.apache.hadoop.hbase.filter.TestSingleColumnValueFilter >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.036 sec >> Running org.apache.hadoop.hbase.mapreduce.TestTimeRangeMapRed >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.449 sec >> Running org.apache.hadoop.hbase.zookeeper.TestZooKeeperMainServerArg >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.031 sec >> Running org.apache.hadoop.hbase.mapreduce.TestSimpleTotalOrderPartitioner >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.1 sec >> Running org.apache.hadoop.hbase.master.TestClockSkewDetection >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.04 sec >> Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat >> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 190.382 sec >> Running org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.441 sec >> Running org.apache.hadoop.hbase.client.replication.TestReplicationAdmin >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.417 sec >> Running org.apache.hadoop.hbase.regionserver.TestScanDeleteTracker >> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.165 sec >> Running org.apache.hadoop.hbase.client.TestMetaScanner >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 57.037 sec >> Running org.apache.hadoop.hbase.metrics.TestMetricsMBeanBase >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.034 sec >> Running org.apache.hadoop.hbase.TestRegionRebalancing >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.534 sec >> Running org.apache.hadoop.hbase.filter.TestSingleColumnValueExcludeFilter >> Tests run: 1, Failures: 0, Errors: 0, 
Skipped: 0, Time elapsed: 0.016 sec >> Running org.apache.hadoop.hbase.filter.TestInclusiveStopFilter >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec >> Running org.apache.hadoop.hbase.rest.TestTableResource >> Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 57.457 sec >> Running org.apache.hadoop.hbase.TestHBaseTestingUtility >> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 77.011 sec >> Running org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 62.538 sec >> Running org.apache.hadoop.hbase.rest.TestSchemaResource >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 88.296 sec >> Running org.apache.hadoop.hbase.regionserver.wal.TestWALReplay >> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.086 sec >> Running org.apache.hadoop.hbase.regionserver.TestKeyValueScanFixture >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.016 sec >> Running org.apache.hadoop.hbase.rest.model.TestTableListModel >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.144 sec >> Running org.apache.hadoop.hbase.rest.model.TestColumnSchemaModel >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.032 sec >> Running org.apache.hadoop.hbase.regionserver.TestStoreFile >> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.342 sec >> Running org.apache.hadoop.hbase.zookeeper.TestZooKeeperNodeTracker >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.967 sec >> Running org.apache.hadoop.hbase.rest.model.TestScannerModel >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.153 sec >> Running org.apache.hadoop.hbase.regionserver.wal.TestHLogMethods >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.409 sec >> Running org.apache.hadoop.hbase.rest.TestVersionResource >> Tests run: 7, Failures: 0, Errors: 0, Skipped: 
0, Time elapsed: 22.841 sec >> Running org.apache.hadoop.hbase.filter.TestFilter >> Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.132 sec >> Running org.apache.hadoop.hbase.regionserver.TestWideScanner >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.662 sec >> Running org.apache.hadoop.hbase.regionserver.TestSplitTransaction >> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.141 sec >> Running org.apache.hadoop.hbase.replication.regionserver.TestReplicationSink >> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 37.475 sec >> Running org.apache.hadoop.hbase.regionserver.TestGetClosestAtOrBefore >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.276 sec >> Running org.apache.hadoop.hbase.regionserver.TestReadWriteConsistencyControl >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.02 sec >> Running org.apache.hadoop.hbase.util.TestMergeTable >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 112.903 sec >> Running org.apache.hadoop.hbase.rest.model.TestVersionModel >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.14 sec >> Running org.apache.hadoop.hbase.rest.model.TestCellSetModel >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.156 sec >> Running org.apache.hadoop.hbase.rest.client.TestRemoteTable >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.265 sec >> Running org.apache.hadoop.hbase.rest.model.TestStorageClusterStatusModel >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.146 sec >> Running org.apache.hadoop.hbase.avro.TestAvroServer >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.775 sec >> Running org.apache.hadoop.hbase.TestFullLogReconstruction >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 82.992 sec >> Running org.apache.hadoop.hbase.TestSerialization >> Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, 
Time elapsed: 0.061 sec >> Running org.apache.hadoop.hbase.regionserver.TestKeyValueHeap >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.135 sec >> Running org.apache.hadoop.hbase.regionserver.TestStoreScanner >> Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.035 sec >> Running org.apache.hadoop.hbase.TestHMsg >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 sec >> Running org.apache.hadoop.hbase.io.hfile.TestReseekTo >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.237 sec >> Running org.apache.hadoop.hbase.regionserver.TestHRegionInfo >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec >> Running org.apache.hadoop.hbase.util.TestIncrementingEnvironmentEdge >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec >> Running org.apache.hadoop.hbase.util.TestKeying >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec >> Running org.apache.hadoop.hbase.regionserver.TestScanner >> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 64.97 sec >> Running org.apache.hadoop.hbase.util.TestHBaseFsck >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.869 sec >> Running org.apache.hadoop.hbase.master.TestRestartCluster >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 74.783 sec >> Running org.apache.hadoop.hbase.rest.model.TestTableInfoModel >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.148 sec >> Running org.apache.hadoop.hbase.TestInfoServers >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.095 sec >> Running org.apache.hadoop.hbase.master.TestLogsCleaner >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.15 sec >> Running org.apache.hadoop.hbase.regionserver.TestMasterAddressManager >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.308 sec >> Running 
org.apache.hadoop.hbase.regionserver.TestQueryMatcher >> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.152 sec >> Running org.apache.hadoop.hbase.zookeeper.TestZKTable >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.384 sec >> Running org.apache.hadoop.hbase.client.TestAdmin >> Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 503.517 sec >> Running org.apache.hadoop.hbase.regionserver.TestPriorityCompactionQueue >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.077 sec >> Running org.apache.hadoop.hbase.util.TestMergeTool >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.965 sec >> Running org.apache.hadoop.hbase.TestAcidGuarantees >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 0.015 sec >> Running org.apache.hadoop.hbase.util.TestBase64 >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.019 sec >> Running org.apache.hadoop.hbase.regionserver.TestMemStore >> Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.43 sec >> Running org.apache.hadoop.hbase.rest.TestScannerResource >> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 58.096 sec >> Running org.apache.hadoop.hbase.io.hfile.TestLruBlockCache >> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.055 sec >> Running org.apache.hadoop.hbase.master.TestDeadServer >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec >> Running org.apache.hadoop.hbase.rest.TestGzipFilter >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 38.141 sec >> Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 111.507 sec >> Running org.apache.hadoop.hbase.TestKeyValue >> Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.027 sec >> Running org.apache.hadoop.hbase.client.TestMultiParallel >> Tests run: 9, Failures: 0, Errors: 0, Skipped: 
0, Time elapsed: 70.261 sec >> Running org.apache.hadoop.hbase.master.TestMaster >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.188 sec >> Running org.apache.hadoop.hbase.master.TestRollingRestart >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 75.922 sec >> Running org.apache.hadoop.hbase.client.TestResult >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.038 sec >> Running org.apache.hadoop.hbase.client.TestHTablePool >> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 50.408 sec >> Running org.apache.hadoop.hbase.rest.model.TestTableSchemaModel >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.163 sec >> Running org.apache.hadoop.hbase.util.TestByteBloomFilter >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.739 sec >> Running org.apache.hadoop.hbase.filter.TestFilterList >> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.039 sec >> Running org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan >> Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 337.102 sec >> Running org.apache.hadoop.hbase.regionserver.TestHRegion >> Tests run: 52, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.796 sec >> Running org.apache.hadoop.hbase.master.TestMasterFailover >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 88.081 sec >> Running org.apache.hadoop.hbase.rest.model.TestTableRegionModel >> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.037 sec >> Running org.apache.hadoop.hbase.client.TestTimestampsFilter >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.945 sec >> Running org.apache.hadoop.hbase.master.TestLoadBalancer >> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.271 sec >> Running org.apache.hadoop.hbase.util.TestRootPath >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.021 sec >> Running 
org.apache.hadoop.hbase.client.TestScannerTimeout >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 58.024 sec >> Running org.apache.hadoop.hbase.mapred.TestTableMapReduce >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 50.54 sec >> Running org.apache.hadoop.hbase.regionserver.wal.TestWALObserver >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.43 sec >> Running org.apache.hadoop.hbase.master.TestActiveMasterManager >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.433 sec >> Running >> org.apache.hadoop.hbase.replication.regionserver.TestReplicationSourceManager >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.065 sec >> Running org.apache.hadoop.hbase.util.TestBytes >> Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.022 sec >> Running org.apache.hadoop.hbase.util.TestCompressionTest >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.035 sec >> Running org.apache.hadoop.hbase.client.TestGetRowVersions >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 37.983 sec >> Running org.apache.hadoop.hbase.filter.TestDependentColumnFilter >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.359 sec >> Running org.apache.hadoop.hbase.rest.model.TestCellModel >> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.152 sec >> Running org.apache.hadoop.hbase.master.TestCatalogJanitor >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.531 sec >> Running org.apache.hadoop.hbase.catalog.TestCatalogTracker >> Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.324 sec >> Running org.apache.hadoop.hbase.replication.TestReplicationSource >> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.614 sec >> Running org.apache.hadoop.hbase.rest.TestRowResource >> Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.97 sec >> Running 
org.apache.hadoop.hbase.io.hfile.TestHFile
>> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.365 sec
>> Running org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
>> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 42.062 sec
>> Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
>> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.043 sec
>> Running org.apache.hadoop.hbase.rest.client.TestRemoteAdmin
>> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.571 sec
>> Running org.apache.hadoop.hbase.regionserver.wal.TestHLog
>> Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 88.143 sec
>> Running org.apache.hadoop.hbase.client.TestTimestamp
>> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.669 sec
>> Running org.apache.hadoop.hbase.TestScanMultipleVersions
>> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.235 sec
>> Running org.apache.hadoop.hbase.rest.model.TestStorageClusterVersionModel
>> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.031 sec
>> Running org.apache.hadoop.hbase.client.TestHCM
>> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 34.254 sec
>>
>> Results :
>>
>> Failed tests:
>>
>> Tests run: 612, Failures: 1, Errors: 0, Skipped: 9
>>
>> [INFO] ------------------------------------------------------------------------
>> [ERROR] BUILD FAILURE
>> [INFO] ------------------------------------------------------------------------
>> [INFO] There are test failures.
>>
>> Please refer to <https://hudson.apache.org/hudson/job/hbase-0.90/ws/trunk/target/surefire-reports> for the individual test results.
>> [INFO] ------------------------------------------------------------------------
>> [INFO] For more information, run Maven with the -e switch
>> [INFO] ------------------------------------------------------------------------
>> [INFO] Total time: 86 minutes
>> [INFO] Finished at: Tue Jan 04 06:08:02 UTC 2011
>> [INFO] Final Memory: 90M/982M
>> [INFO] ------------------------------------------------------------------------
>> [locks-and-latches] Releasing all the locks
>> [locks-and-latches] All the locks released
>> Archiving artifacts
>> Recording test results
>>
>>
>
