On 4/01/2014 3:40 PM, Greg Trasuk wrote:
I accidentally deleted something from the top of that message...
I checked out the qa_refactor branch, and compared the 2.2 implementation of
com.sun.jini.RegistrarImpl with the qa_refactor implementation.
In the 2.2 branch, Reggie uses an instance of TaskManager that is looked up
through a Configuration instance. In the qa_refactor branch, you’ve replaced
that with two separate hard-coded instances of ThreadPoolExecutor.
In the 2.2 implementation, it was possible to have Reggie share a TaskManager
with other services, or with one or more ServiceDiscoveryManagers, by creating
the appropriate configuration file. That’s no longer possible.
Have a look at Mahalo; it shows an ExecutorService set by configuration,
with a wrapper class to bolt on the required functionality.
That's how Reggie will look when it's finished.
Further, in a container scenario, we’d like to move away from having individual
services creating their own threads, towards a shared work manager. Ideally,
we would have Reggie and all other services use a shared work manager rather
than creating their own threads, as they currently do. That way, the container
can manage and prioritize executions appropriately (e.g., app A’s tasks take
priority over app B’s), and they can all share a thread pool. Removing the
TaskManager usage moves us further from that goal. I suspect that TaskManager
might have to be extended to do this properly, and I certainly would prefer
that it were an interface rather than a concrete class, but it’s a decent
starting point.
We'll be able to do something similar with ExecutorService. Most of the
threads have been converted to Runnables that are passed as an argument
to Thread's constructor (instead of extending Thread), so they could be
passed to an executor instead.
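As a minimal sketch of that point (plain java.util.concurrent, not River code; the class and method names are made up for illustration), the same Runnable can be handed to Thread's constructor or to an ExecutorService without changing the task itself:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class RunnableHandoff {
    /** Runs one task on a dedicated thread, then again on the given executor. */
    static int runBothWays(ExecutorService executor) {
        AtomicInteger runs = new AtomicInteger();
        Runnable task = runs::incrementAndGet; // hypothetical stand-in for a service's work
        try {
            // Current style: the Runnable is an argument to Thread's constructor.
            Thread dedicated = new Thread(task, "dedicated");
            dedicated.start();
            dedicated.join();

            // Executor style: the same Runnable, but the pool owns the thread.
            executor.execute(task);
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println(runBothWays(Executors.newSingleThreadExecutor())); // prints 2
    }
}
```

Which executor gets passed in (private, or shared and supplied by configuration) then becomes a deployment decision rather than something baked into the task.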
However, it's also worth noting that there are cases where it isn't
recommended for certain tasks to share an Executor.
The other issue is shutdown: if you shut down one service and it shares
a TaskManager or ExecutorService with another service, then the other
service will be left with a terminated TaskManager or ExecutorService.
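One possible way around the shutdown problem, sketched here under assumptions (this is not what qa_refactor or Mahalo does; ServiceExecutorView and SharedShutdownDemo are hypothetical names), is to give each service its own view of the shared pool, so that shutdown() only closes the view:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

/** Per-service view of a shared pool; shutting it down never stops the pool. */
final class ServiceExecutorView extends AbstractExecutorService {
    private final ExecutorService shared;
    private final Object lock = new Object();
    private boolean shutdown; // guarded by lock
    private int active;       // unfinished tasks submitted via this view; guarded by lock

    ServiceExecutorView(ExecutorService shared) { this.shared = shared; }

    @Override public void execute(Runnable task) {
        synchronized (lock) {
            if (shutdown) throw new RejectedExecutionException("view is shut down");
            active++;
        }
        try {
            shared.execute(() -> {
                try { task.run(); } finally { finished(); }
            });
        } catch (RejectedExecutionException e) {
            finished(); // the shared pool refused the task
            throw e;
        }
    }

    private void finished() {
        synchronized (lock) { active--; lock.notifyAll(); }
    }

    @Override public void shutdown() {
        synchronized (lock) { shutdown = true; lock.notifyAll(); }
    }

    /** Simplification: already-queued tasks are not cancelled, so nothing is returned. */
    @Override public List<Runnable> shutdownNow() { shutdown(); return Collections.emptyList(); }

    @Override public boolean isShutdown() { synchronized (lock) { return shutdown; } }

    @Override public boolean isTerminated() {
        synchronized (lock) { return shutdown && active == 0; }
    }

    @Override public boolean awaitTermination(long timeout, TimeUnit unit)
            throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        synchronized (lock) {
            while (!(shutdown && active == 0)) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) return false;
                TimeUnit.NANOSECONDS.timedWait(lock, remaining);
            }
            return true;
        }
    }
}

final class SharedShutdownDemo {
    /** Shuts down service A's view and checks that service B can still run work. */
    static boolean otherServiceSurvives() {
        ExecutorService shared = Executors.newFixedThreadPool(2);
        try {
            ExecutorService a = new ServiceExecutorView(shared);
            ExecutorService b = new ServiceExecutorView(shared);
            a.shutdown();
            Callable<String> probe = () -> "still running";
            return "still running".equals(b.submit(probe).get(5, TimeUnit.SECONDS));
        } catch (Exception e) {
            return false;
        } finally {
            shared.shutdown(); // only the owner (e.g. a container) stops the real pool
        }
    }

    public static void main(String[] args) {
        System.out.println(otherServiceSurvives()); // prints true
    }
}
```

The trade-off is that somebody else, presumably the container, has to own the lifecycle of the real pool.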
We really do need to dump TaskManager; it can't hold a candle to Doug
Lea's Executor framework.
I read somewhere the Jini development team had planned to replace
TaskManager due to issues with Task.runAfter?
When I profile the stress tests now, the hotspots are Sockets and
there's very little monitor contention.
Regards,
Peter.
On Jan 4, 2014, at 12:18 AM, Greg Trasuk <[email protected]> wrote:
I’ll also point out Patricia’s recent statement that TaskManager should be
reasonably efficient for small task queues, but less efficient for larger task
queues. We don’t have solid evidence that the task queues ever get large.
Hence, the assertion that “TaskManager doesn’t scale” is meaningless. If real
usage never requires a large task queue, then scalability isn’t an issue, and
we don’t know whether it ever needs a large task queue.
In any case, removing TaskManager and replacing it with hard-coded
ThreadPoolExecutors moves us farther away from having the capability of a
shared work queue. So I’m not in favour of this change. I haven’t looked at
the other services or utility classes, but if the changes are similar, I’m also
not in favour. You’re making changes that introduce test failures (which
is why you’re asking for help) without a good reason. You’re never going to
ship this code unless you stop modifying it.
Also, when you say below,
I'm developing an ExecutorService wrapper that retries failed tasks in
org.apache.river.impl.thread.SerialExecutorService. By not removing a task from
its queue until it completes successfully, it prevents any dependent tasks
from running. I would like to use this as a replacement for TaskManager and
RetryTask.
…be careful! You’re getting into the same difficult area as transactional
semantics around messaging. Will you need to provide a “dead task” queue? Do
you need to set a limit on how many times a task gets retried? What happens
when that limit is exceeded? Do all tasks have the same limit? Should a task
get notified when it’s exceeded the retry limit? How long should you wait
between retries? Is that number the same for all tasks? Is there some kind of
alarm or notification when tasks end up being retried, or when the dead task
queue becomes full?
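To show how several of these questions turn into explicit policy, here is a rough sketch that assumes a fixed attempt limit, a fixed delay between retries, and a "dead task" queue for tasks that exhaust the limit. BoundedRetry and its parameters are hypothetical, not part of River or of SerialExecutorService:

```java
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedRetry {
    private final ScheduledExecutorService scheduler;
    private final int maxAttempts;  // assumed policy: same limit for every task
    private final long delayMillis; // assumed policy: fixed wait between retries
    final Queue<Callable<Boolean>> deadTasks = new ConcurrentLinkedQueue<>();

    BoundedRetry(ScheduledExecutorService scheduler, int maxAttempts, long delayMillis) {
        this.scheduler = scheduler;
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    /** A task reports success by returning true; false or an exception is a failure. */
    void submit(Callable<Boolean> task, Runnable onDone) {
        attempt(task, 1, onDone);
    }

    private void attempt(Callable<Boolean> task, int attemptNo, Runnable onDone) {
        long delay = attemptNo == 1 ? 0 : delayMillis;
        scheduler.schedule(() -> {
            boolean ok;
            try { ok = task.call(); } catch (Exception e) { ok = false; }
            if (ok) {
                onDone.run();
            } else if (attemptNo < maxAttempts) {
                attempt(task, attemptNo + 1, onDone); // retry after the delay
            } else {
                deadTasks.add(task); // limit exceeded: park it for inspection
                onDone.run();
            }
        }, delay, TimeUnit.MILLISECONDS);
    }

    /** Returns {calls to the flaky task, calls to the hopeless task, dead tasks}. */
    static int[] demo() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            BoundedRetry retry = new BoundedRetry(scheduler, 3, 10);
            AtomicInteger flaky = new AtomicInteger();
            AtomicInteger hopeless = new AtomicInteger();
            CountDownLatch done = new CountDownLatch(2);
            retry.submit(() -> flaky.incrementAndGet() >= 2, done::countDown);
            retry.submit(() -> { hopeless.incrementAndGet(); return false; }, done::countDown);
            try {
                done.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return new int[] { flaky.get(), hopeless.get(), retry.deadTasks.size() };
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 2 3 1
    }
}
```

Even this small version leaves the rest open: nobody is alerted when deadTasks grows, the queue is unbounded, and a retry scheduled after the scheduler has shut down is silently dropped.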
Sometimes it’s best not to try to abstract away all complexity.
Greg.
On Jan 3, 2014, at 10:43 PM, Peter Firmstone <[email protected]> wrote:
ServiceDiscoveryManager is now the only class that utilises TaskManager and
RetryTask. JoinManager still uses TaskManager but not RetryTask. See
River-344 for an explanation of the problem.
Most instances of TaskManager in qa-refactor have been replaced with
ExecutorService; RetryTask now implements RunnableFuture and can be cancelled
by Future.cancel from the ExecutorService.
I'm developing an ExecutorService wrapper that retries failed tasks in
org.apache.river.impl.thread.SerialExecutorService. By not removing a task from
its queue until it completes successfully, it prevents any dependent tasks
from running. I would like to use this as a replacement for TaskManager and
RetryTask.
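For reviewers, the head-of-queue idea can be illustrated with a deliberately simplified, single-threaded sketch. This is not the SerialExecutorService code; HeadOfLineRetry and the maxAttempts cap are assumptions made so that the example terminates:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;

final class HeadOfLineRetry {
    private final Queue<Callable<Boolean>> queue = new ArrayDeque<>();

    void add(Callable<Boolean> task) { queue.add(task); }

    /**
     * Runs tasks in order on the calling thread. The head stays in the queue
     * until it returns true, so later (dependent) tasks cannot overtake it.
     * Returns false if the head fails maxAttempts times in a row.
     */
    boolean drain(int maxAttempts) {
        int attempts = 0;
        while (!queue.isEmpty()) {
            boolean ok;
            try { ok = queue.peek().call(); } catch (Exception e) { ok = false; }
            if (ok) {
                queue.remove(); // only now is the task removed
                attempts = 0;
            } else if (++attempts >= maxAttempts) {
                return false;   // give up; the head is still queued
            }
        }
        return true;
    }

    /** Task A fails twice before succeeding; task B must not run until A is done. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        int[] failuresLeft = {2};
        HeadOfLineRetry q = new HeadOfLineRetry();
        q.add(() -> { log.add("A"); return failuresLeft[0]-- <= 0; });
        q.add(() -> { log.add("B"); return true; });
        q.drain(5);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A, A, A, B]
    }
}
```

The ordering guarantee comes purely from peeking instead of polling; what to do when the head never succeeds is the part that still needs a decision.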
Can anyone spare time to review, suggest alternatives, or improvements?
Thanks in advance,
Peter.
Failed com_sun_jini_test_impl_servicediscovery_event_DiscardDownReDiscover.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 1 discovery event(s)
received
Failed com_sun_jini_test_impl_servicediscovery_event_DiscardServiceDown.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_impl_servicediscovery_event_DiscardServiceUp.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_impl_servicediscovery_event_LookupTaskRace.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_impl_servicediscovery_event_ReRegisterBadEquals.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_impl_servicediscovery_event_ReRegisterGoodEquals.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s)
received
Failed
com_sun_jini_test_impl_servicediscovery_event_ServiceDiscardCacheTerminate.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_cache_CacheDiscard.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_cache_CacheLookup.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_lookup_Lookup.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupMax.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 3 discovery event(s) expected, 2 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupMaxFilter.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupMinEqualsMax.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s)
received
Failed
com_sun_jini_test_spec_servicediscovery_lookup_LookupMinMaxNoBlockFilter.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupWait.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupWaitFilter.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 1 discovery event(s)
received
Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupWaitNoBlock.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s)
received
On 4/01/2014 10:27 AM, Apache Jenkins Server wrote:
See <https://builds.apache.org/job/river-qa-refactor-win/45/>
------------------------------------------
[...truncated 15733 lines...]
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeIfExistsTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeIfExistsWaitTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeNO_WAITTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeReadTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeWaitTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteLeaseANYTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteLeaseFOREVERTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteNegativeLeaseTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeIfExistsNotifyTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeIfExistsTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeNotifyTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteLeaseANYTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteLeaseFOREVERTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteNegativeLeaseTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/AdminIFShutdownTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/AdminIFTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/LeaseExpireCancelTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/LeaseExpireRenewTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/LeaseMapTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/LeaseTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/MahaloCreateShutdownTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/MahaloIFTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/MahaloImplReadyStateTest.td
[java] Test Skipped: verifiers are:
com.sun.jini.test.impl.mercury.ActivatableMercuryVerifier
com.sun.jini.qa.harness.SkipConfigTestVerifier
[java] -----------------------------------------
[java]
com/sun/jini/test/impl/mahalo/NestableServerTransactionCreatedToStringTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/impl/mahalo/NestableTransactionCreatedToStringTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest2.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest3.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest4.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest5.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/RandomStressTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/ServerTransactionEqualityTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/ServerTransactionToStringTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/TransactionCreatedToStringTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/impl/mahalo/TransactionManagerCreatedToStringTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
com/sun/jini/test/impl/mahalo/TxnMgrImplNullActivationConfigEntries.td
[java] Test Skipped: verifiers are:
com.sun.jini.test.impl.mahalo.ActivatableMahaloVerifier
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/TxnMgrImplNullConfigEntries.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/TxnMgrImplNullRecoveredLocators.td
[java] Test Skipped: verifiers are:
com.sun.jini.test.impl.mahalo.ActivatableMahaloVerifier
[java] -----------------------------------------
[java] com/sun/jini/test/impl/mahalo/TxnMgrProxyEqualityTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/AsynchAbortOnCommitTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/AsynchAbortOnPrepareTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/CommitExpiredTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/CommitTimeoutTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/GetStateTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/JoinIdempotentTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/JoinWhileActiveTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/ManyParticipantsTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/PrepareTimeoutTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/RollBackErrorTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/RollForwardErrorTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java] com/sun/jini/test/spec/txnmanager/TwoPhaseTest.td
[java] Test Passed: OK
[java]
[java] -----------------------------------------
[java]
[java] # of tests started = 1406
[java] # of tests completed = 1406
[java] # of tests skipped = 52
[java] # of tests passed = 1388
[java] # of tests failed = 18
[java]
[java] -----------------------------------------
[java]
[java] Date finished:
[java] Fri Jan 03 16:27:03 PST 2014
[java] Time elapsed:
[java] 59325 seconds
[java]
[java] Java Result: 1
collect-result:
[copy] Copying 1 file
to<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\result>
[copy] Copying 1 file
to<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\result>
[zip] Building
zip:<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\result\qaresults-amd64-Windows>
Server 2008 R2-1.7.0.zip
BUILD FAILED
<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\build.xml>:2109:
The following error occurred while executing this line:
<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\build.xml>:406:
The following error occurred while executing this line:
<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\build.xml>:380:
condition satisfied
Total time: 996 minutes 9 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts