Peter Firmstone wrote:
I started thinking about a replacement, it's in org.apache.river.imp.util.
Excellent. I'll take a look, and pick up code and ideas as appropriate.
The name spaces are as follows:
net.jini.* API - must be backward compatible, or breakage minimal for if
there's a good reason.
com.sun.jini.* - implementation, subject to change.
com.artima.* - implementation, subject to change.
org.apache.river.api.* - new API under review, please feel free to look
for ways to improve before its frozen in November / December.
org.apache.river.imp.* - new implementation, subject to change,
eventually the com.* namespace will move there too.
Best bet would be to search for the classes that depend on the Task
interface, see how they use it and come up with something better.
I've looked at several of them. Most runAfter implementations
unconditionally return false. I'm tracking down the ones that don't.
Agreed, the Task interface is less than optimum, best to replace it.
TaskManager has been identified as a performance bottleneck.
Are there any benchmarks demonstrating the bottleneck? I'll need some to
see if my changes make matters significantly better. My guess from
reading TaskManager code is that it is fine for small numbers of tasks,
but does not scale well. However, measurements always beat guesses.
My current idea is to use a long sequence number assigned when a Task is
added to the TaskManager. That would work for a TaskManager that
operates continuously for over 290 years given one new Task every
nanosecond. If more is needed, I could go to BigInteger depending on
what the benchmarks tell me.
This requires wrapping the caller's Task object in another data
structure, which has other uses.
Given a sequence number, I could separate the tasks into separate data
structures depending status. Each data structure would be designed for
the operations TaskManager needs to do on tasks in that state. In
particular, tasks that are waiting to be assigned a thread would live in
a BlockingPriorityQueue whose Comparator goes by sequence number.
I am considering changing the runAfter method to take a single Task,
rather than a List. It moves a method call from outside to inside a
loop, but I think it is worth it for the additional information it would
give TaskManager. It simplifies the interface, which is usually a good
thing to do for maintainability. It decouples the TaskManager callers
from how TaskManager organizes its data structures, important not just
for this round, but for any future tuning of TaskManager.
TaskManager, given a new Task, would begin by identifying the highest
sequence number existing task on which it depends, using a backwards
scan. If the Task depends on nothing, it goes in the priority queue for
dispatch to a thread. If Task Y is the youngest incomplete task on which
Task X depends, add X to a Set associated with Y, and put X in a pending
data structure.
(To reduce object creation and destruction in simple cases, Y's Set
would be created the first time an object is added to it.)
When Task Y leaves the TaskManager, through removal or completion,
reexamine X, looking to see if it depends on an incomplete Task with a
lower sequence number than Y. If it does, put it in that Task's waiting
Set and leave it pending. If not, put it in the priority queue.
In this design, each question of the form "Does X depend on Y" would be
asked at most once. A Task that is waiting on runAfter is kept out of
the dispatch data structure.
This is just a first thought, and will almost certainly change after I
have read more and done some experiments.
Patricia