P.S. The JIRA ticket is BOOKKEEPER-838 <https://issues.apache.org/jira/browse/BOOKKEEPER-838>.
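For reference, here is a minimal sketch of one way to avoid the kind of leak described in the quoted message below: the worker closes its file in a finally block, so the close happens whether the loop ends normally or through an interrupt. This is not the BookKeeper code and not the patch attached to the ticket. TrackedFile, Worker, and closedAfterShutdown are hypothetical names used only for illustration.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

public class ForceWriteSketch {

    /** Hypothetical stand-in for a journal file handle; it only records whether close() ran. */
    static final class TrackedFile implements Closeable {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        @Override
        public void close() throws IOException {
            closed.set(true);
        }

        boolean isClosed() {
            return closed.get();
        }
    }

    /** Hypothetical worker thread that owns one file and drains a request queue. */
    static final class Worker extends Thread {
        private final TrackedFile logFile;
        private final BlockingQueue<Runnable> requests = new LinkedBlockingQueue<>();
        private volatile boolean running = true;

        Worker(TrackedFile logFile) {
            this.logFile = logFile;
        }

        @Override
        public void run() {
            try {
                while (running) {
                    // Blocks until a request arrives; throws InterruptedException on interrupt.
                    requests.take().run();
                }
            } catch (InterruptedException e) {
                // Expected during shutdown; fall through to the cleanup below.
            } finally {
                // Close unconditionally, so no flag has to be set correctly
                // before the interrupt arrives.
                try {
                    logFile.close();
                } catch (IOException ioe) {
                    System.err.println("I/O exception while closing file: " + ioe);
                }
            }
        }

        void shutdown() throws InterruptedException {
            running = false;
            interrupt();
            join();
        }
    }

    /** Starts a worker, shuts it down immediately, and reports whether the file was closed. */
    public static boolean closedAfterShutdown() {
        TrackedFile file = new TrackedFile();
        Worker worker = new Worker(file);
        worker.start();
        try {
            worker.shutdown();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return file.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("closed after shutdown: " + closedAfterShutdown());
    }
}
```

The point of the sketch is only that cleanup placed in finally does not depend on per-request state such as a shouldClose flag. Whether that fits the real ForceWriteThread, where the file belongs to the request rather than to the thread, would need checking against the actual code.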
On Sat, Feb 21, 2015 at 1:24 PM, Jia Zhai <[email protected]> wrote:

> Following Ivan’s reply, I checked the build history. The recent failures
> seem to come with this stack trace:
>
> java.io.IOException: Unable to delete directory /tmp/bkTest3561939033223584760.dir/current/0.
>     at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1337)
>     at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1910)
>     at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1399)
>     at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1331)
>     at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1910)
>     at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1399)
>     at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1331)
>     at org.apache.bookkeeper.test.BookKeeperClusterTestCase.cleanupTempDirs(BookKeeperClusterTestCase.java:186)
>     at org.apache.bookkeeper.test.BookKeeperClusterTestCase.tearDown(BookKeeperClusterTestCase.java:114)
>
> This may be caused by a bug in ForceWriteThread::run(), which can skip
> “logFile.close()” when an interrupt arrives, leaving the file open. I have
> opened a ticket in JIRA.
>
> private class ForceWriteThread {
>     public void run() {
>         LOG.info("ForceWrite Thread started");
>         boolean shouldForceWrite = true;
>         int numReqInLastForceWrite = 0;
>         while(running) {
>             ForceWriteRequest req = null;
>             try {
>                 …
>             } catch (IOException ioe) {
>                 LOG.error("I/O exception in ForceWrite thread", ioe);
>                 running = false;
>             } catch (InterruptedException e) {
>                 LOG.error("ForceWrite thread interrupted", e);
>                 if (null != req) {
>                     req.closeFileIfNecessary();    <==== 2, when interrupt, “shouldClose” not set properly, so file may not close
>                 }
>                 running = false;
>             }
>         }
>         // Regardless of what caused us to exit, we should notify the
>         // the parent thread as it should either exit or be in the process
>         // of exiting else we will have write requests hang
>         threadToNotifyOnEx.interrupt();
>     }
>     // shutdown sync thread
>     void shutdown() throws InterruptedException {
>         running = false;
>         this.interrupt();    <==== 1, call interrupt
>         this.join();
>     }
> }
>
> public void closeFileIfNecessary() {
>     // Close if shouldClose is set
>     if (shouldClose) {    <==== 3, “shouldClose” is false here.
>         // We should guard against exceptions so its
>         // safe to call in catch blocks
>         try {
>             logFile.close();
>             // Call close only once
>             shouldClose = false;
>         }
>         catch (IOException ioe) {
>             LOG.error("I/O exception while closing file", ioe);
>         }
>     }
> }
>
> Thanks.
> -Jia
>
> On Sat, Feb 21, 2015 at 3:07 AM, Ivan Kelly <[email protected]> wrote:
>
>> there does seem to be some flakiness in master in general. Jenkins is
>> failing every couple of builds.
>>
>> https://builds.apache.org/job/bookkeeper-master/
>>
>> On Fri, Feb 20, 2015 at 7:23 PM, Sijie Guo <[email protected]> wrote:
>> > I didn't encounter this. Does it work if you run master? just to isolate if
>> > it is the branch-only issue.
>> >
>> > - Sijie
>> >
>> > On Thu, Feb 19, 2015 at 1:56 PM, Flavio Junqueira <[email protected]> wrote:
>> >
>> >> Right now I can't get this test to pass in any of my settings, is it known
>> >> to be flaky?
>> >>
>> >> Tests in error:
>> >>   testPeriodicCheckWhenLedgerDeleted(org.apache.bookkeeper.replication.AuditorPeriodicCheckTest): test timed out after 60000 milliseconds
>> >>
>> >> -Flavio
>> >>
>> >> > On 18 Feb 2015, at 22:55, Sijie Guo <[email protected]> wrote:
>> >> >
>> >> > How about the master? Are u able to get a clean build on it?
>> >> >
>> >> > On Wed, Feb 18, 2015 at 7:12 AM, Flavio Junqueira <[email protected]> wrote:
>> >> >
>> >> >> The disk isn't getting full while running the tests, I checked multiple
>> >> >> times. I'm having a hard time to get a clean build with this computer, it
>> >> >> sounds like there are some flaky tests.
>> >> >>
>> >> >> -Flavio
>> >> >>
>> >> >>> On 17 Feb 2015, at 19:22, Sijie Guo <[email protected]> wrote:
>> >> >>>
>> >> >>> Hi Flavio:
>> >> >>>
>> >> >>> What is your disk space usage when you run the tests?
>> >> >>>
>> >> >>> - Sijie
>> >> >>>
>> >> >>> On Sat, Feb 14, 2015 at 8:02 AM, Flavio Junqueira <[email protected]> wrote:
>> >> >>>
>> >> >>>> I'm getting a lot of test errors, am I the only one to observe this?
>> >> >>>>
>> >> >>>> Results :
>> >> >>>>
>> >> >>>> Failed tests:
>> >> >>>>   testCloseDuringOp[0](org.apache.bookkeeper.client.BookKeeperTest): Close never completed
>> >> >>>>   testShutdown(org.apache.bookkeeper.replication.AuditorBookieTest): Auditor re-election is not happened for auditor failure! expected not same
>> >> >>>>   testIndexCorruption(org.apache.bookkeeper.replication.AuditorPeriodicCheckTest): Ledger should be under replicated expected:<4> but was:<-1>
>> >> >>>>   testPeriodicCheckWhenDisabled(org.apache.bookkeeper.replication.AuditorPeriodicCheckTest): All should be underreplicated
>> >> >>>>   testShutdown(org.apache.bookkeeper.replication.AutoRecoveryMainTest): AuditorElector should not be running
>> >> >>>>
>> >> >>>> Tests in error:
>> >> >>>>   testBookieRestartContinuously(org.apache.bookkeeper.bookie.BookieShutdownTest): test timed out after 150000 milliseconds
>> >> >>>>   testCloseDuringOp[1](org.apache.bookkeeper.client.BookKeeperTest): test timed out after 60000 milliseconds
>> >> >>>>   testShouldNotGetTheFragmentIfThereIsNoMissedEntry(org.apache.bookkeeper.client.TestLedgerChecker): test timed out after 3000 milliseconds
>> >> >>>>   testShouldGetTwoFrgamentsIfTwoBookiesFailedInSameEnsemble(org.apache.bookkeeper.client.TestLedgerChecker): test timed out after 3000 milliseconds
>> >> >>>>   testShouldNotGetAnyFragmentIfNoLedgerPresent(org.apache.bookkeeper.client.TestLedgerChecker): test timed out after 3000 milliseconds
>> >> >>>>   testShouldGetFailedEnsembleNumberOfFgmntsIfEnsembleBookiesFailedOnNextWrite(org.apache.bookkeeper.client.TestLedgerChecker): test timed out after 3000 milliseconds
>> >> >>>>   testShouldGetOneFragmentWithSingleEntryOpenedLedger(org.apache.bookkeeper.client.TestLedgerChecker): test timed out after 3000 milliseconds
>> >> >>>>   testSingleEntryAfterEnsembleChange(org.apache.bookkeeper.client.TestLedgerChecker): test timed out after 3000 milliseconds
>> >> >>>>   testClosedSingleEntryLedger(org.apache.bookkeeper.client.TestLedgerChecker): test timed out after 3000 milliseconds
>> >> >>>>   testPeriodicCheckWhenLedgerDeleted(org.apache.bookkeeper.replication.AuditorPeriodicCheckTest): test timed out after 60000 milliseconds
>> >> >>>>   testRWShouldCleanTheLedgerFromUnderReplicationIfLedgerAlreadyDeleted[0](org.apache.bookkeeper.replication.TestReplicationWorker): test timed out after 3000 milliseconds
>> >> >>>>   testRWShouldCleanTheLedgerFromUnderReplicationIfLedgerAlreadyDeleted[1](org.apache.bookkeeper.replication.TestReplicationWorker): test timed out after 3000 milliseconds
>> >> >>>>   testRWShouldCleanTheLedgerFromUnderReplicationIfLedgerAlreadyDeleted[2](org.apache.bookkeeper.replication.TestReplicationWorker): test timed out after 3000 milliseconds
>> >> >>>>   testCompat400(org.apache.bookkeeper.test.TestBackwardCompat): test timed out after 60000 milliseconds
>> >> >>>>   testCompat410(org.apache.bookkeeper.test.TestBackwardCompat): test timed out after 60000 milliseconds
>> >> >>>>
>> >> >>>>> On 13 Feb 2015, at 09:23, Sijie Guo <[email protected]> wrote:
>> >> >>>>>
>> >> >>>>> This is the first release candidate for Apache BookKeeper, version 4.3.1.
>> >> >>>>> It fixes the following issues:
>> >> >>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12328755&styleName=Html&projectId=12311293
>> >> >>>>>
>> >> >>>>> *** Please download, test and vote by Feb 17th 2015, 10:00 GMT.
>> >> >>>>>
>> >> >>>>> Note that we are voting upon the source (tag), binaries are provided for
>> >> >>>>> convenience.
>> >> >>>>>
>> >> >>>>> Source and binary files:
>> >> >>>>> https://dist.apache.org/repos/dist/dev/bookkeeper/bookkeeper-4.3.1-candidate-0/
>> >> >>>>>
>> >> >>>>> Maven staging repo:
>> >> >>>>> https://repository.apache.org/content/repositories/orgapachebookkeeper-1005/
>> >> >>>>>
>> >> >>>>> The tag to be voted upon:
>> >> >>>>> release-4.3.1 (b830f4e88c991d67a84ed883c6136989a54c2556)
>> >> >>>>>
>> >> >>>>> BookKeeper's KEYS file containing PGP keys we use to sign the release:
>> >> >>>>> https://dist.apache.org/repos/dist/release/bookkeeper/KEYS
>> >> >>>>>
>> >> >>>>> Please download the source package, and follow the README to build
>> >> >>>>> and run a bookkeeper and hedwig service.
