Quick question about the 10 runs: were they run in parallel with all the other unit tests, or was only this test running?
The other question is: how do we construct these tests so that a failure is extremely unlikely even if processing is slow or threads are suspended?

On Wed, Apr 29, 2015 at 7:53 AM, Sudheesh Katkam <[email protected]> wrote:

> I am responsible for those tests. I ran the tests at least 10 times on my
> Linux VM with 1 second pauses, all of which passed.
>
> On your second run, what different errors did you see?
>
> On your third run, are you able to reproduce the test case that hangs?
>
> Sorry that the message is not informative. I already have a patch which is
> a slight improvement to Jacques' change that improves the message in those
> tests.
>
> What tool did you use to get the thread count?
>
> - Sudheesh
>
> > On Apr 29, 2015, at 6:28 AM, Abdel Hakim Deneche <[email protected]>
> > wrote:
> >
> > The message displayed in the first run actually contains two different
> > issues:
> >
> > 1. The error message "Error shutting down Drillbit 'beta'" is most likely
> > caused by this issue DRILL-2878
> > <https://issues.apache.org/jira/browse/DRILL-2878>
> >
> > 2. The test that failed with a "java.lang.AssertionError: null" is most
> > likely a bug because that unit test should not fail. I've seen this error
> > before, but it only happens intermittently.
> >
> > The system error reported in the 3rd run is actually an "expected"
> > injected exception, but 278 threads looks suspicious!!!
> >
> > On Wed, Apr 29, 2015 at 12:13 AM, Daniel Barclay <[email protected]>
> > wrote:
> >
> >> Does anyone know what's going on with TestDrillbitResilience (rebased
> >> from master today)? (Is it working right?)
> >>
> >> One run, via "mvn install", yielded assertion errors:
> >>
> >> ...
> >> Error shutting down Drillbit "beta".
> >> Tests run: 11, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 33.811
> >> sec <<< FAILURE! - in org.apache.drill.exec.server.TestDrillbitResilience
> >>
> >> cancelAfterEverythingIsCompleted(org.apache.drill.exec.server.TestDrillbitResilience)
> >> Time elapsed: 1.468 sec <<< FAILURE!
> >> java.lang.AssertionError: null
> >>   at org.apache.drill.exec.server.TestDrillbitResilience.assertCancelled(TestDrillbitResilience.java:459)
> >>   at org.apache.drill.exec.server.TestDrillbitResilience.cancelAfterEverythingIsCompleted(TestDrillbitResilience.java:565)
> >>
> >> cancelInMiddleOfFetchingResults(org.apache.drill.exec.server.TestDrillbitResilience)
> >> Time elapsed: 1.496 sec <<< FAILURE!
> >> java.lang.AssertionError: null
> >>   at org.apache.drill.exec.server.TestDrillbitResilience.assertCancelled(TestDrillbitResilience.java:459)
> >>   at org.apache.drill.exec.server.TestDrillbitResilience.cancelInMiddleOfFetchingResults(TestDrillbitResilience.java:510)
> >>
> >> Running <next test>
> >> ...
> >>
> >> A second run, run individually (but still via Maven) died with different
> >> errors.
> >>
> >> A third run, via "mvn install" again, seems hung after reporting this
> >> (maybe expected) exception:
> >>
> >> Exception (no rows returned):
> >> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> >> run-try-end
> >>
> >> [fb9cfe61-af6e-4c9c-b6ab-8a1b8725c6e9 on dev-linux2:31010]
> >>
> >> The process is using only about 5% CPU--but has 278 threads!
> >> (That includes about 35 threads all with the same name of "BitClient-1".)
> >>
> >> Daniel
> >>
> >> --
> >> Daniel Barclay
> >> MapR Technologies
> >
> > --
> > Abdelhakim Deneche
> > Software Engineer
> > <http://www.mapr.com/>
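
To make the second question above a bit more concrete, here is a rough sketch of the kind of construction I have in mind. It is not Drill's injection API or the actual TestDrillbitResilience code; every name in it (LatchCancelSketch, cancelAtKnownPoint, workerDelayMillis) is made up for illustration. The idea is that the test never sleeps for a fixed interval and hopes the other thread got far enough. Instead the worker announces that it reached the interesting point, then waits until the test has acted. The only timeout left is a generous guard against a real hang.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch only: none of these names exist in Drill.
class LatchCancelSketch {

  /**
   * Simulates "cancel the work once it has reached a known point".
   * workerDelayMillis models slow processing or a suspended thread.
   * Returns true if the worker observed the cancellation request.
   */
  static boolean cancelAtKnownPoint(long workerDelayMillis) throws InterruptedException {
    final CountDownLatch reachedPoint = new CountDownLatch(1); // worker -> test
    final CountDownLatch resume = new CountDownLatch(1);       // test -> worker
    final AtomicBoolean cancelRequested = new AtomicBoolean(false);
    final AtomicBoolean sawCancel = new AtomicBoolean(false);

    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(workerDelayMillis);       // arbitrary slowness before the pause point
        reachedPoint.countDown();              // tell the test which state we are in
        resume.await();                        // stay parked until the test has acted
        sawCancel.set(cancelRequested.get());  // what the worker sees after resuming
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    worker.start();

    // Wait for the state itself, not for a fixed amount of time.
    // The 30 second limit is only a guard against a genuine hang.
    if (!reachedPoint.await(30, TimeUnit.SECONDS)) {
      worker.interrupt();
      throw new IllegalStateException("worker never reached the pause point");
    }

    cancelRequested.set(true); // act while the worker is known to be parked
    resume.countDown();        // let the worker continue
    worker.join(TimeUnit.SECONDS.toMillis(30));
    return sawCancel.get();
  }

  public static void main(String[] args) throws InterruptedException {
    for (long delay : new long[] {0, 50, 500}) {
      System.out.println("delay=" + delay + "ms cancelled=" + cancelAtKnownPoint(delay));
    }
  }
}
```

With this shape the outcome should not depend on how long the worker takes to reach the pause point, which is the property I would like the real tests to have. Whether the existing pause injections can be driven this way is something I have not checked.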
