[jira] [Commented] (DERBY-5975) intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
[ https://issues.apache.org/jira/browse/DERBY-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504167#comment-13504167 ]

Myrna van Lunteren commented on DERBY-5975:
---

Merged revisions 1412392 and 1413812 to 10.9 with revision 1413885 and to 10.8 with revision 1413886 (http://svn.apache.org/viewvc?view=revision&revision=1413885 and http://svn.apache.org/viewvc?view=revision&revision=1413886).

> intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
> ---
>
> Key: DERBY-5975
> URL: https://issues.apache.org/jira/browse/DERBY-5975
> Project: Derby
> Issue Type: Bug
> Components: Test
> Affects Versions: 10.10.0.0, 10.8.3.0, 10.9.2.0
> Environment: windows weme6.2
> Reporter: Mike Matrigali
> Attachments: DERBY-5975.diff, fail.zip
>
> Across multiple versions nightly tests have failed in Derby5937SlaveShutdownTest.testSlaveFailoverLeak. Subsequent to this no other test runs and thus we get no info printed to the log, and the ibm test reporter does not post anything other than a red box if the tests do not finish. Not sure if the tests are hanging as part of trying to clean up the failure or if the next test is hanging. Will post test runs that have failed in additional comments.
> So far I have only seen this on weme6.2 windows runs. Likely there is a timing issue that causes the test to fail and then bad cleanup of this test leads to a hang. In the one stack I see a thread stuck in shutdown and a thread stuck waiting on the log.
> If there are no easy fixes for this it may make sense to disable this test in this one environment until someone wants to work on this one. Then we can at least get the rest of the testing to proceed.
> (emb)jdbcapi.DatabaseMetaDataTest.testGetColumns_DERBY5274 used 343 ms .
> (emb)jdbcapi.DatabaseMetaDataTest.testDMDconnClosed used 79 ms Test upgrade done.
> Test upgrade from: 10.9.1.0, phase: POST UPGRADE
> .
> (emb)upgradeTests.BasicSetup.noConnectionAfterHardUpgrade used 156 ms Test upgrade done.
> .
> (emb)replicationTests.Derby5937SlaveShutdownTest.testSlaveFailoverLeak used 24221 ms F

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (DERBY-5975) intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
[ https://issues.apache.org/jira/browse/DERBY-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504056#comment-13504056 ] Myrna van Lunteren commented on DERBY-5975: --- I accidentally added an unneeded import in my previous commit, removed it again with revision 1413812 (http://svn.apache.org/viewvc?view=revision&revision=1413812)
[jira] [Commented] (DERBY-5975) intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
[ https://issues.apache.org/jira/browse/DERBY-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502491#comment-13502491 ] Myrna van Lunteren commented on DERBY-5975: --- I disabled the test as per the patch with revision 1412392: http://svn.apache.org/viewvc?view=revision&revision=1412392 I intend to backport this to 10.9 and 10.8.
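For readers who have not looked at the attached patch: the general shape of "disable this test in this one environment" can be sketched roughly as below. This is not the content of DERBY-5975.diff. The class name, the `buildSuite` helper, and the idea of detecting a JSR-169 style platform by probing for `java.sql.DriverManager` are all assumptions made for illustration, and a plain list of test names stands in for a real JUnit suite so that the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

public class PlatformGatedSuiteSketch {

    /** Returns true if the named class can be loaded on this VM. */
    static boolean classAvailable(String name) {
        try {
            // initialize=false: only check that the class exists
            Class.forName(name, false,
                    PlatformGatedSuiteSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Hypothetical platform probe. Assumption: a JSR-169 style VM does not
     * ship java.sql.DriverManager, so its absence marks the restricted platform.
     */
    static boolean looksLikeRestrictedPlatform() {
        return !classAvailable("java.sql.DriverManager");
    }

    /** Builds the list of test names to run; empty on the restricted platform. */
    static List<String> buildSuite(boolean restrictedPlatform) {
        List<String> tests = new ArrayList<>();
        if (restrictedPlatform) {
            // Skip the test entirely so the rest of the run can proceed.
            return tests;
        }
        tests.add("testSlaveFailoverLeak");
        return tests;
    }

    public static void main(String[] args) {
        System.out.println(buildSuite(looksLikeRestrictedPlatform()));
    }
}
```

Returning an empty suite, rather than letting the test fail quickly, keeps the nightly report free of a known failure while the issue stays open.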
[jira] [Commented] (DERBY-5975) intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
[ https://issues.apache.org/jira/browse/DERBY-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488567#comment-13488567 ]

Knut Anders Hatlen commented on DERBY-5975:
---

Thanks, Mike. It looks like replication failover fails with a NullPointerException:

Caused by: java.lang.NullPointerException
    at java.io.ObjectOutputStream.drain(ObjectOutputStream.java:258)
    at java.io.ObjectOutputStream.flush(ObjectOutputStream.java:331)
    at java.io.ObjectOutputStream.close(ObjectOutputStream.java:220)
    at org.apache.derby.impl.store.replication.net.SocketConnection.tearDown(Unknown Source)
    at org.apache.derby.impl.store.replication.net.ReplicationMessageTransmit.tearDown(Unknown Source)
    at org.apache.derby.impl.store.replication.master.MasterController.teardownNetwork(Unknown Source)
    at org.apache.derby.impl.store.replication.master.MasterController.startFailover(Unknown Source)
    at org.apache.derby.impl.store.raw.RawStore.failover(Unknown Source)
    at org.apache.derby.impl.store.access.RAMAccessManager.failover(Unknown Source)
    at org.apache.derby.impl.db.BasicDatabase.failover(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.handleFailoverMaster(Unknown Source)
    ... 41 more

Since the NPE happens inside java.io.ObjectOutputStream.drain(), and not in Derby code called from drain(), I'd expect this to be a JVM bug. At least I have convinced myself that this NPE is impossible in OpenJDK by looking at the source for the ObjectOutputStream class.

The argument I used to convince myself, goes like this: In OpenJDK, ObjectOutputStream's drain() method simply forwards the call to bout.drain(), so any NPE in drain must be caused by the bout field being null. bout is a final field which is always initialized to a non-null value in the ObjectOutputStream constructor we use in replication.net.SocketConnection, so it's guaranteed to be non-null in drain(), and the NPE can't happen.

I think it's OK to disable this test on weme for now.
Note that this is the only replication test that runs on weme (or any of the JSR-169/CDC FP platforms), so this code has never been exercised on weme before. All the other replication tests use separate network servers and the client driver to communicate with them, and are therefore disabled on those platforms.
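The final-field argument in the comment above can be illustrated with a small stand-in. The sketch below is not the JDK source: `OuterStream` and `BlockStream` are hypothetical classes that only mimic the structure described (an outer stream whose drain() forwards to a final field that is assigned once in the constructor). Under those assumptions, the only dereference inside drain() is of a field that cannot be null once construction has completed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class FinalFieldSketch {

    /** Hypothetical stand-in for the inner buffered stream (the "bout" role). */
    static final class BlockStream {
        private final OutputStream out;
        private int drains = 0;

        BlockStream(OutputStream out) {
            this.out = out;
        }

        void drain() throws IOException {
            drains++;
            out.flush();
        }

        int drainCount() {
            return drains;
        }
    }

    /** Hypothetical stand-in for the outer stream that forwards drain(). */
    static final class OuterStream {
        // final and assigned exactly once, before the constructor returns
        private final BlockStream bout;

        OuterStream(OutputStream out) {
            this.bout = new BlockStream(out);
        }

        void drain() throws IOException {
            // the only dereference in this method is of the final field
            bout.drain();
        }

        void close() throws IOException {
            // closing drains any buffered data first
            drain();
        }

        int drainCount() {
            return bout.drainCount();
        }
    }

    /** Drains once explicitly and once via close(); returns the drain count. */
    static int closeAndCountDrains() {
        try {
            OuterStream s = new OuterStream(new ByteArrayOutputStream());
            s.drain();
            s.close();
            return s.drainCount();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(closeAndCountDrains());
    }
}
```

If a VM nevertheless reports a NullPointerException from the forwarding method itself, the fault would have to lie outside this logic, which is the reasoning behind suspecting the JVM rather than Derby.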
[jira] [Commented] (DERBY-5975) intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
[ https://issues.apache.org/jira/browse/DERBY-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488150#comment-13488150 ] Knut Anders Hatlen commented on DERBY-5975: --- Is there a fail directory with an error-stacktrace.out file that could tell why the test failed?
[jira] [Commented] (DERBY-5975) intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
[ https://issues.apache.org/jira/browse/DERBY-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488001#comment-13488001 ] Mike Matrigali commented on DERBY-5975: --- Failed against 10.8, windows, weme 6.2, non-vmware, on 10/16, 10/22, 10/24, 10/28, 10/30. In every run Derby5937SlaveShutdownTest.testSlaveFailoverLeak was reported as failed and was the last test in the result, with nothing after it, indicating that the test hung and the harness eventually killed it.
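As an aside on the "test hung and the harness killed it" pattern: one general way to turn a hanging cleanup step into a reported failure is to run it under a time limit. The sketch below is only an illustration of that idea, not something the Derby test harness is known to do; the class name, `finishesWithin`, and `waitForever` are hypothetical names, and the stuck step simply waits on a monitor that nobody notifies.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TeardownWatchdogSketch {

    /**
     * Runs the step on a daemon thread and reports whether it finished
     * (normally or by throwing) within the given time.
     */
    static boolean finishesWithin(Runnable step, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "teardown-step");
            t.setDaemon(true); // a stuck step must not keep the VM alive
            return t;
        });
        Future<?> result = executor.submit(step);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the stuck step
            return false;
        } catch (ExecutionException e) {
            return true; // the step ended, although with an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    /** Simulates a step stuck in Object.wait() with no one to notify it. */
    static void waitForever() {
        Object monitor = new Object();
        synchronized (monitor) {
            try {
                while (true) {
                    monitor.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(finishesWithin(() -> { }, 2000));
        System.out.println(finishesWithin(TeardownWatchdogSketch::waitForever, 200));
    }
}
```

A time limit of this kind only reports the hang; it does not release whatever the stuck thread is waiting for, so it would complement a real fix rather than replace one.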
[jira] [Commented] (DERBY-5975) intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
[ https://issues.apache.org/jira/browse/DERBY-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487956#comment-13487956 ]

Mike Matrigali commented on DERBY-5975:
---

failed on 10/27 and 10/29 against 10.9, windows/vmware, weme6.2, worked on 10/28
http://people.apache.org/~myrnavl/derby_test_results/v10_9/windows/suites.All_history.html

On 10/29 got some dump to the result log:

(emb)replicationTests.Derby5937SlaveShutdownTest.testSlaveFailoverLeak used 20252 ms F
Stack Traces of Threads:
ThreadName=Triggered DumpAgent Thread(239B3234) Status=Running
ThreadName=main(003353CC) Status=Waiting Monitor=23956EC0 (Object monitor for org/apache/derby/impl/store/raw/log/LogToFile @ 02704A60) Count=0 Owner=(00223700)
    In java/lang/Object.wait(JI)V
    In java/lang/Object.wait()V
    In org/apache/derby/impl/store/raw/log/LogToFile.flush(JJ)V
    In org/apache/derby/impl/store/raw/log/LogToFile.flush(Lorg/apache/derby/iapi/store/raw/log/LogInstant;)V
    In org/apache/derby/impl/store/raw/log/LogToFile.checkpointWithTran(Lorg/apache/derby/iapi/store/raw/xact/RawTransaction;Lorg/apache/derby/iapi/store/raw/RawStoreFactory;Lorg/apache/derby/iapi/store/raw/data/DataFactory;Lorg/apache/derby/iapi/store/raw/xact/TransactionFactory;Z)Z
    In org/apache/derby/impl/store/raw/log/LogToFile.checkpoint(Lorg/apache/derby/iapi/store/raw/RawStoreFactory;Lorg/apache/derby/iapi/store/raw/data/DataFactory;Lorg/apache/derby/iapi/store/raw/xact/TransactionFactory;Z)Z
    In org/apache/derby/impl/store/raw/RawStore.stop()V
    In org/apache/derby/impl/services/monitor/TopService.stop(Ljava/lang/Object;)V
    In org/apache/derby/impl/services/monitor/TopService.shutdown()Z
    In org/apache/derby/impl/services/monitor/BaseMonitor.shutdown(Ljava/lang/Object;)V
    In org/apache/derby/impl/db/DatabaseContextImpl.cleanupOnError(Ljava/lang/Throwable;)V
    In org/apache/derby/iapi/services/context/ContextManager.cleanupOnError(Ljava/lang/Throwable;Z)Z
    In org/apache/derby/impl/jdbc/TransactionResourceImpl.cleanupOnError(Ljava/lang/Throwable;Z)Z
    In org/apache/derby/impl/jdbc/EmbedConnection.<init>(Lorg/apache/derby/jdbc/InternalDriver;Ljava/lang/String;Ljava/util/Properties;)V
    In org/apache/derby/jdbc/Driver169.getNewEmbedConnection(Ljava/lang/String;Ljava/util/Properties;)Lorg/apache/derby/impl/jdbc/EmbedConnection;
    In org/apache/derby/jdbc/InternalDriver.connect(Ljava/lang/String;Ljava/util/Properties;)Ljava/sql/Connection;
    In org/apache/derby/jdbc/EmbeddedSimpleDataSource.getConnection(Ljava/lang/String;Ljava/lang/String;)Ljava/sql/Connection;
    In org/apache/derby/jdbc/EmbeddedSimpleDataSource.getConnection()Ljava/sql/Connection;
    In org/apache/derbyTesting/junit/JDBCDataSource.shutdownDatabase(Ljavax/sql/DataSource;)V
    In org/apache/derbyTesting/junit/DropDatabaseSetup.tearDown()V
    In junit/extensions/TestSetup$1.protect()V
    In junit/framework/TestResult.runProtected(Ljunit/framework/Test;Ljunit/framework/Protectable;)V
    In junit/extensions/TestSetup.run(Ljunit/framework/TestResult;)V
    In org/apache/derbyTesting/junit/BaseTestSetup.run(Ljunit/framework/TestResult;)V
    In junit/extensions/TestDecorator.basicRun(Ljunit/framework/TestResult;)V
    In junit/extensions/TestSetup$1.protect()V
    In junit/framework/TestResult.runProtected(Ljunit/framework/Test;Ljunit/framework/Protectable;)V
    In junit/extensions/TestSetup.run(Ljunit/framework/TestResult;)V
    In junit/extensions/TestDecorator.basicRun(Ljunit/framework/TestResult;)V
    In junit/extensions/TestSetup$1.protect()V
    In junit/framework/TestResult.runProtected(Ljunit/framework/Test;Ljunit/framework/Protectable;)V
    In junit/extensions/TestSetup.run(Ljunit/framework/TestResult;)V
    In junit/framework/TestSuite.runTest(Ljunit/framework/Test;Ljunit/framework/TestResult;)V
    In junit/framework/TestSuite.run(Ljunit/framework/TestResult;)V
    In junit/framework/TestSuite.runTest(Ljunit/framework/Test;Ljunit/framework/TestResult;)V
    In junit/framework/TestSuite.run(Ljunit/framework/TestResult;)V
    In junit/framework/TestSuite.runTest(Ljunit/framework/Test;Ljunit/framework/TestResult;)V
    In junit/framework/TestSuite.run(Ljunit/framework/TestResult;)V
    In junit/textui/TestRunner.doRun(Ljunit/framework/Test;Z)Ljunit/framework/TestResult;
    In junit/textui/TestRunner.start([Ljava/lang/String;)Ljunit/framework/TestResult;
    In junit/textui/TestRunner.main([Ljava/lang/String;)V
ThreadName=JIT Compilation Thread(0033562C) Status=Waiting Monitor=003350D8 (System monitor) Count=0 Owner=(00223C00)
ThreadName=Finalizer thread(00335FAC) Status=Waiting Monitor=0033D090 (System monitor) Count=0 Owner=(2163EE00)
ThreadName=Thread-73(0033F02C) Status=Wait
[jira] [Commented] (DERBY-5975) intermittent nightly test failure across releases in Derby5937SlaveShutdownTest.testSlaveFailoverLeak
[ https://issues.apache.org/jira/browse/DERBY-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487933#comment-13487933 ] Mike Matrigali commented on DERBY-5975: --- failed twice against trunk, windows/vmware, weme6.2 on 10/27/12 and 10/28/12 and then worked on 10/29 - builds 1403143 and 1402922 http://people.apache.org/~myrnavl/derby_test_results/main/windows/suites.All_history.html
