Peter, As I said, I am happy to take on a few classes and see if I can find some problems or discuss possible concurrency failures. I am pretty good at recognizing concurrency problems, at least within the range of coding styles that I am familiar with. But I do not know the river internals. This will of necessity limit my ability to recognize problems with respect to the overall architecture.
Bryan On 4/1/13 8:10 AM, "Peter" <[email protected]> wrote: >The big problem is there are less concurrency bugs in the present branch >and there are definitely more in the previous release, they just haven't >presented in test in previous releases, although they have in deployment. > >The issue appears to be with javaspaces / lease / transactions. > >It's also a possibility that there's just a timing issue with the tests. > >I'd like to fix the tests before releasing, it's unfortunate the last fix >to jeri seemed to expose these new test failures. It's a little like >fixing rust in an old car, the more you remove, the more you find. > >What I really need is more help, I don't work on this full time it takes >longer than a team woking on commercial software. > >Peter. >----- Original message ----- >> I am uncomfortable with a release that has known concurrency problems. >>I >> am also uncomfortable that the custom Levels serialization change by >> Oracle [1] has broken the current release for new JVMs. I would like to >> see a minor release which fixes that serialization problem and a >>candidate >> release which gives people a chance to discovery concurrent issues >>without >> risking a release that is known to be unstable. >> >> I'm happy to review a few of the classes with known concurrency problems >> to see if I can help nail some of these bugs. Since I do not know the >> river internals, I would only be able to spot concurrency problems that >> exist within a class. I am not in a good position to comment on >> concurrency problems that might arise through the interactions among >> classes. >> >> Bryan >> >> [1] https://issues.apache.org/jira/browse/RIVER-416 >> >> On 4/1/13 6:14 AM, "Peter Firmstone" <[email protected]> wrote: >> >> > The attachments will be removed from the list, so I've cc'd you, >>anyone >> > who's interested, let me know I can forward the attachments. They can >> > be opened with jvisualvm. >> > >> > The profiling isn't perfect, the test runs for about 8.5 minutes, so >> > hotspot should have kicked in relatively early in both test runs. >> > >> > I guess a significant problem is; the more I remove performance >> > impedances, like unnecessary DNS calls, the faster multithreading and >> > context switching gets. >> > >> > Not only did the old policy providers create contention, but it was >> > slower for single threaded performance (I'll have to run the previous >> > release branch for comparison when I get some time). >> > >> > The URIGrant.implies call is now down to .228 ms per invocation, down >> > from 1.68 ms per invocation this week, which was already quite good >>(on >> > old UltraSparcII hardware), during stress tests this method is called >> > almost 40,000 times. >> > >> > In comparison the old policy provider which required a DNS call (every >> > time CodeSource.implies is called, functionality now replaced by >> > URIGrant.implies), the old policy provider also cached all >>Permission's >> > in highly contended PermissionCollection's, which during network calls >> > invoked SocketPermission.implies, possibly for every SocketPermission >>in >> > the PermissionCollection, DNS is also consulted by >> > SocketPermission.implies, while synchronized, ouch! >> > >> > URIGrant.implies is non blocking, that's right zero contention. >> > SocketPermission.implies DNS calls can be avoided in most cases if >> > PermissionComparator finds an exact match or wild card. >> > >> > This is without the CombinerSecurityManager, which improves security >> > performance by a factor of 10 (SocketPermission is only checked once >>for >> > each AccessControlContext). >> > >> > PreferredClassProvider no longer creates unecessary DNS calls, and >> > neither does SecureClassLoader, URLClassLoader or >>PreferredClassLoader. >> > >> > So the good news is the next release will feel much faster, the bad >>news >> > is that existing concurrency bugs that previously didn't appear during >> > test runs, but likely to manifest during production are now occurring >> > during testing. There's more good news, we fixed a number of >> > concurrency bugs since the last release too, I guess I have to draw >>the >> > line somewhere and cut a release. >> > >> > DNS calls haven't been completely eliminated as some are still >> > necessary, but a multitude of unnecessary DNS calls have been >>eliminated. >> > >> > Regards, >> > >> > Peter. >> > >> > Peter Firmstone wrote: >> > > They've passed more consistently in the past, they're either >> > > concurrency bugs or network timing related, it would be nice to at >> > > least determine if it's the former or latter. >> > > >> > > Cheers, >> > > >> > > Peter. >> > > >> > > Tom Hobbs wrote: >> > > > Are these all new failures, i.e. were they working before? Or >>are they >> > > > "new" failures in that the test categories have only recently been >> > > > reactivated and the failures discovered? >> > > > >> > > > Are they a big enough blocker to stop a release? >> > > > >> > > > >> > > > On Sun, Mar 31, 2013 at 11:31 PM, Peter Firmstone >><[email protected]> >> > > > wrote: >> > > > >> > > > >> > > > > The following test fails 30 times in a run of 130 tests: >> > > > > >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests passed = 1 >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > 100 >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > Buildfile: build.xml >> > > > > >> > > > > qa.run-tests: >> > > > > >> > > > > james-brown: >> > > > > [delete] Deleting directory /opt/src/River_Fixed/** >> > > > > peterConcurrentPolicy/qa/soul >> > > > > [mkdir] Created dir: /opt/src/River_Fixed/** >> > > > > peterConcurrentPolicy/qa/soul >> > > > > [touch] Creating >> > > > > /opt/src/River_Fixed/**peterConcurrentPolicy/qa/soul/* >> > > > > *soul.201303312239034808 >> > > > > >> > > > > run-tests: >> > > > > [java] >> > > > > [java] ------------------------------**----------- >> > > > > [java] CONFIGURATION FILE: >> > > > > [java] >> > > > > [java] >>/opt/src/River_Fixed/**peterConcurrentPolicy/qa/src/** >> > > > > com/sun/jini/test/resources/**qaHarness.prop >> > > > > [java] >> > > > > [java] ------------------------------**----------- >> > > > > [java] SETTING UP THE TEST LIST: >> > > > > [java] >> > > > > [java] Adding test: >> > > > > com/sun/jini/test/spec/**javaspace/conformance/ >> > > > > **snapshot/**SnapshotExpirationNotifyTest.**td >> > > > > [java] >> > > > > [java] ------------------------------**----------- >> > > > > [java] GENERAL HARNESS CONFIGURATION INFORMATION: >> > > > > [java] >> > > > > [java] Date started: >> > > > > [java] Sun Mar 31 22:39:37 EST 2013 >> > > > > [java] Installation directory of the JSK: >> > > > > [java] >>com.sun.jini.jsk.home=/opt/**src/River_Fixed/** >> > > > > peterConcurrentPolicy >> > > > > [java] Installation directory of the harness: >> > > > > [java] >>com.sun.jini.qa.home=/opt/src/**River_Fixed/** >> > > > > peterConcurrentPolicy/qa >> > > > > [java] Categories being tested: >> > > > > [java] categories=No Categories >> > > > > [java] ------------------------------**----------- >> > > > > [java] ENVIRONMENT PROPERTIES: >> > > > > [java] >> > > > > [java] JVM information: >> > > > > [java] Java HotSpot(TM) Server VM, 20.5-b03, >>32 bit VM mode >> > > > > [java] Sun Microsystems Inc. >> > > > > [java] OS information: >> > > > > [java] SunOS, 5.10, sparc >> > > > > [java] >> > > > > [java] ------------------------------**----------- >> > > > > [java] STARTING TO RUN THE TESTS >> > > > > [java] >> > > > > [java] >> > > > > [java] Running >>com/sun/jini/test/spec/**javaspace/conformance/** >> > > > > snapshot/**SnapshotExpirationNotifyTest.**td >> > > > > [java] Time is Sun Mar 31 22:39:38 EST 2013 >> > > > > [java] Starting test in separate process with command: >> > > > > [java] /usr/jdk/jdk1.6.0_30/jre/bin/**java >> > > > > >> > > > > >>-Djava.security.manager=org.**apache.river.api.security.**CombinerSecur >> > > > > ityManager >> > > > > >> > > > > -Djava.security.policy=file:/**opt/src/River_Fixed/** >> > > > > peterConcurrentPolicy/qa/**harness/policy/defaulttest.**policy >> > > > > >> > > > > >>-Djava.rmi.server.codebase=htt**p://bluto:9082/qa1-javaspace-**dl.jar<h >> > > > > ttp://bluto:9082/qa1-javaspace-dl.jar>-cp >> > > > > /opt/src/River_Fixed/ >> > > > > >> > > > > >>**peterConcurrentPolicy/qa/lib/**jiniharness.jar:/opt/src/**River_Fixed >> > > > > /** >> > > > > >> > > > > peterConcurrentPolicy/qa/lib/**jinitests.jar:/opt/src/River_** >> > > > > >> > > > > >>Fixed/peterConcurrentPolicy/**lib/jsk-platform.jar:/opt/src/**River_Fix >> > > > > ed/ >> > > > > >> > > > > >>**peterConcurrentPolicy/lib/jsk-**lib.jar:/opt/src/River_Fixed/** >> > > > > >> > > > > >>peterConcurrentPolicy/lib/**high-scale-lib.jar:/opt/src/**River_Fixed/* >> > > > > * >> > > > > >> > > > > peterConcurrentPolicy/lib/**custard-apple-1.0.2.jar -ea -esa >>-client >> > > > > -Djava.ext.dirs=/usr/jdk/jdk1.**6.0_30/jre/lib/ext:/usr/jdk/** >> > > > > >> > > > > >>packages/lib/ext:/opt/src/**River_Fixed/**peterConcurrentPolicy/qa/lib- >> > > > > ** >> > > > > >> > > > > ext:/opt/src/River_Fixed/**peterConcurrentPolicy/lib-ext >> > > > > -Dcom.sun.jini.jsk.port=9080 -Dcom.sun.jini.qa.port=9081 >> > > > > >>-Dcom.sun.jini.jsk.home=/opt/**src/River_Fixed/**peterConcurrentPolicy >> > > > > >> > > > > >>-Dcom.sun.jini.qa.home=/opt/**src/River_Fixed/**peterConcurrentPolicy/q >> > > > > a >> > > > > >> > > > > -Dcom.sun.jini.qa.harness.**harnessJar=/opt/src/River_** >> > > > > Fixed/peterConcurrentPolicy/**qa/lib/jiniharness.jar >> > > > > -Dcom.sun.jini.qa.harness.**testJar=/opt/src/River_Fixed/** >> > > > > peterConcurrentPolicy/qa/lib/**jinitests.jar >> > > > > -Dcom.sun.jini.qa.harness.**runjiniserver=true >> > > > > -Dcom.sun.jini.qa.harness.**runkitserver=true >> > > > > -Djava.security.properties=* >> > > > > *file:/opt/src/River_Fixed/**peterConcurrentPolicy/qa/** >> > > > > harness/trust/dynamic-policy.**properties >> > > > > -Dcom.sun.jini.qa.harness.**testhosts= >> > > > > >>-Djava.util.logging.config.**file=/home/peter/logging.**properties >> > > > > >> > > > > >>-Dcom.sun.jini.test.home=/opt/**src/River_Fixed/**peterConcurrentPolicy >> > > > > /qa >> > > > > >> > > > > -Dcom.sun.jini.test.port=9082 -Dcom.sun.jini.qa.harness.** >> > > > > policies=file:/opt/src/River_**Fixed/peterConcurrentPolicy/** >> > > > > qa/src/com/sun/jini/test/**resources/jinitest.policy >> > > > > -Djava.ext.dirs=/usr/jdk/jdk1.**6.0_30/jre/lib/ext:/usr/jdk/** >> > > > > >> > > > > >>packages/lib/ext:/opt/src/**River_Fixed/**peterConcurrentPolicy/qa/lib- >> > > > > ** >> > > > > >> > > > > ext:/opt/src/River_Fixed/**peterConcurrentPolicy/lib-ext >> > > > > com.sun.jini.qa.harness.**MasterTest com/sun/jini/test/spec/** >> > > > > >>javaspace/conformance/**snapshot/**SnapshotExpirationNotifyTest.**td >> > > > > [java] com.sun.jini.qa.harness.**TestException: Not all >> > > > > listeners've >> > > > > got expected number of events. >> > > > > [java] at >>com.sun.jini.test.spec.**javaspace.conformance.** >> > > > > snapshot.**SnapshotExpirationNotifyTest.**run(** >> > > > > SnapshotExpirationNotifyTest.**java:370) >> > > > > [java] at >> > > > > com.sun.jini.qa.harness.**MasterTest.doTest(MasterTest.* >> > > > > *java:256) >> > > > > [java] at >> > > > > com.sun.jini.qa.harness.**MasterTest.main(MasterTest.** >> > > > > java:144) >> > > > > [java] >> > > > > [java] TIME: 10:42:54 PM >> > > > > [java] >> > > > > [java] Test process was destroyed and returned code 1 >> > > > > [java] >> > > > > com/sun/jini/test/spec/**javaspace/conformance/**snapshot/** >> > > > > SnapshotExpirationNotifyTest.**td >> > > > > [java] Test Failed: Test Failed: >> > > > > com.sun.jini.qa.harness.**TestException: >> > > > > Not all listeners've got expected number of events. >> > > > > [java] >> > > > > [java] >> > > > > [java] ------------------------------**----------- >> > > > > [java] >> > > > > [java] SUMMARY ==============================**=== >> > > > > [java] >> > > > > [java] >> > > > > com/sun/jini/test/spec/**javaspace/conformance/**snapshot/** >> > > > > SnapshotExpirationNotifyTest.**td >> > > > > [java] Test Failed: Test Failed: >> > > > > com.sun.jini.qa.harness.**TestException: >> > > > > Not all listeners've got expected number of events. >> > > > > [java] >> > > > > [java] ------------------------------**----------- >> > > > > [java] >> > > > > [java] # of tests started = 1 >> > > > > [java] # of tests completed = 1 >> > > > > [java] # of tests passed = 0 >> > > > > [java] # of tests failed = 1 >> > > > > [java] >> > > > > [java] ------------------------------**----------- >> > > > > [java] >> > > > > [java] Date finished: >> > > > > [java] Sun Mar 31 22:42:59 EST 2013 >> > > > > [java] Time elapsed: >> > > > > [java] 201 seconds >> > > > > [java] >> > > > > [java] Java Result: 1 >> > > > > >> > > > > collect-result: >> > > > > >> > > > > BUILD FAILED >> > > > > /opt/src/River_Fixed/**peterConcurrentPolicy/build.**xml:2105: >>The >> > > > > following error occurred while executing this line: >> > > > > /opt/src/River_Fixed/**peterConcurrentPolicy/qa/**build.xml:357: >> > > > > condition satisfied >> > > > > >> > > > > Total time: 3 minutes 30 seconds >> > > > > >> > > > > >> > > > > >> > > > >> > > > >> > > >> > > >> > >> >
