On Wed, 10 Sep 2025 09:01:06 GMT, Johny Jose <d...@openjdk.org> wrote:
>> Increasing the count value of available objects to 6 (which is half the
>> number of objects created). The failures were reported in macos which seems
>> to take more time to clear the objects. Though majority runs has less values
>> for objects (less than 4), in order to eliminate intermittent failures,
>> raising the threshold values to 6
>
> Johny Jose has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Review changes

The current approach should add a bit more consistency to the test.

If you could oblige, please run the test through the CI pipeline with --test-repeat 200.
Also with --test-repeat 200, run the test as part of its jdk_rmi test group.
And, as a final check, run it in a tier3 job.

However, I would like to offer a possible enhancement to this approach.

If we look at the test's objectives and structure:

The test spawns 10 test client processes.
Each invokes a ping remote method call on the remote server.
This activates a lease.
The lease is set to expire after 20 ms.
Thus, it should be removed from the lease table.
A check is made on the lease table to verify that expiring leases have been purged.

The test structure is subject to the process and thread scheduling variations
of the OS. This allows indeterminate behaviour, so we could look to reduce
this execution variability.

There is no guarantee that a spawned process will start executing immediately.
Generally it will on a lightly loaded machine, but on one under significant
load (as might be the case in a CI pipeline) it may not, and scheduling delays
may occur. If we add a check that all clients have executed prior to testing
the lease table, that will make the leaseTable test more reliable.

Thus, if the testing of the lease table is synchronised with the remote
invocations, it is possible to create consistent conditions for the test.
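
To illustrate the synchronisation idea in isolation, here is a minimal, self-contained sketch. It is not the test itself: the class name LatchSketch and the method awaitClientsThenLeaseExpiry are hypothetical, threads stand in for the spawned client processes, no RMI is involved, and the timed await is an assumption added so that the sketch cannot hang.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-alone sketch; threads stand in for the client processes.
final class LatchSketch {
    static final int NO_OF_CLIENTS = 10;
    static final long LEASE_EXPIRY_MS = 20;   // assumed lease expiry, in ms
    static final int GOOD_LUCK_FACTOR = 2;

    // Stand-in for the remote ping(): records one completed invocation.
    static void ping(CountDownLatch latch) {
        latch.countDown();
    }

    // Starts the clients, waits until every one has pinged, then sleeps long
    // enough for the leases to expire. Returns true only if all clients
    // pinged within the timeout and the wait was not interrupted.
    static boolean awaitClientsThenLeaseExpiry(long timeoutSeconds) {
        CountDownLatch remoteCallsComplete = new CountDownLatch(NO_OF_CLIENTS);
        for (int i = 0; i < NO_OF_CLIENTS; i++) {
            new Thread(() -> ping(remoteCallsComplete)).start();
        }
        try {
            // Timed await (an assumption of this sketch) avoids hanging
            // forever if a client never runs.
            if (!remoteCallsComplete.await(timeoutSeconds, TimeUnit.SECONDS)) {
                return false;
            }
            // Only now does the expiry clock matter: wait out the leases.
            Thread.sleep(NO_OF_CLIENTS * LEASE_EXPIRY_MS * GOOD_LUCK_FACTOR);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("all clients pinged: " + awaitClientsThenLeaseExpiry(30));
    }
}
```

The point of the ordering is that the sleep starts only after the last invocation has been observed, so a slow-starting client cannot leave a live lease behind when the table is checked.
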
That is, the lease expiry and lease table removal should be tested only after
the last remote invocation has been made.

This requires a CountDownLatch(NO_OF_CLIENTS), which is counted down on each
remote invocation, while the main thread awaits the latch. Thereafter the main
thread sleeps for a period equal to
NO_OF_CLIENTS * LEASE_VALUE * GOOD_LUCK_FACTOR, where LEASE_VALUE is the lease
expiry time.

Thus the check of the lease table is subject to the condition that all clients
have made a remote invocation and that their leases have expired.
Theoretically, the number of active leases should then be zero, but retaining
the current lease check condition is fine.

Thus the following could be added:

    public class CheckLeaseLeak extends UnicastRemoteObject implements LeaseLeak {
        public CheckLeaseLeak() throws RemoteException { }
        public void ping () throws RemoteException {
            // one count per remote invocation
            remoteCallsComplete.countDown();
        }

        . . .

        private final static int ITERATIONS = 10;
        private final static int numberPingCalls = 0;
        private final static int CHECK_INTERVAL = 400;
        private final static int LEASE_VALUE = 20;
        private static final int NO_OF_CLIENTS = ITERATIONS;
        private static final int GOOD_LUCK_FACTOR = 2;
        // requires java.util.concurrent.CountDownLatch
        private static final CountDownLatch remoteCallsComplete =
                new CountDownLatch(NO_OF_CLIENTS);

        . . .

                try {
                    if (jvm.execute() != 0) {
                        TestLibrary.bomb("Client process failed");
                    }
                } finally {
                    jvm.destroy();
                }
            }

            // wait until every client has made its remote call
            try {
                remoteCallsComplete.await();
                System.out.println("remoteCallsComplete . . . ");
            } catch (InterruptedException intEx) {
                System.out.println("remoteCallsComplete.await interrupted . . . ");
            }

            // then give the leases time to expire before checking the table
            Thread.sleep(NO_OF_CLIENTS * LEASE_VALUE * GOOD_LUCK_FACTOR);
            numLeft = getDGCLeaseTableSize();

        . . .

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26815#issuecomment-3299476175