Re: [pool] why the composite pool implementation isn't plugable [was: picking descriptive class names]

Sandy McArthur Sat, 25 Mar 2006 22:28:35 -0800

On 3/25/06, Rahul Akolkar <[EMAIL PROTECTED]> wrote:
> On 3/25/06, Sandy McArthur <[EMAIL PROTECTED]> wrote:
> <snip/>
> >
> > The main behavior of the composite pools is configured via four
> > type-safe enum types. I'll describe what each type controls and then
> > suggest name variants. Let me know which one you think is the most
> > self-evident and user friendly. Feel free to suggest new names.
> >
> > 1. "Specifies how objects are borrowed and returned to the pool."
> > a) BorrowType  b) BorrowStrategy  c) BorrowPolicy  d) BorrowBehavior
> >
> > 2. "Specifies the behavior of the pool when the pool is out of idle 
> > objects."
> > a) ExhaustionPolicy  b) ExhaustionBehavior  c) ExhaustionType d)
> > ExhaustionStrategy
> >
> > 3. "Specifies the behavior of when there is a limit on the number of
> > concurrently borrowed objects."
> > a) LimitStrategy  b) LimitPolicy  c) LimitBehavior  d) LimitType
> >
> > 4. "Specifies how active objects are tracked while they are borrowed
> > from the pool."
> > a) TrackingBehavior  b) TrackingType  c) TrackingStrategy  d) TrackingPolicy
> >
> > The enums above don't actually specify any implementation, they
> > describe desired features of a pool. The actual implementation isn't
> > broken down into four parts like that so try not to confuse how you
> > would implement that feature with how you would request that feature.
> >
> <snap/>
>
> Why isn't it broken down like that?


Because there are fundamentally three parts to a pool's behavior:
1. How objects are treated while they are in the idle object pool.
2. How objects are added to and removed from the idle object pool.
3. How objects are treated while they are out of the pool, aka: active.

I chose to map these three aspects to four types of behavior because
that made the most sense in balancing the usability of the public
interface against allowing the functionality to expand in new ways.
Expressing all possible combinations with three enums created too many
permutations of enum choices to remain usable. Splitting the choices
across four enum types, as I did, groups them into logical chunks and
means the programmer only has to consider a handful of choices at a
time instead of dozens.
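To make the "handful instead of dozens" point concrete, here is a
minimal sketch in modern Java. The enum type names are borrowed from
the candidate names listed above; the constants and the PoolConfig
class are hypothetical illustrations, not the actual composite pool
API.

```java
// Hypothetical sketch: type names come from the candidate list in this
// thread, but every constant and the PoolConfig class are assumptions.
enum BorrowPolicy { FIFO, LIFO }
enum ExhaustionPolicy { GROW, FAIL }
enum LimitPolicy { FAIL, WAIT }
enum TrackingPolicy { NONE, REFERENCE }

final class PoolConfig {
    // Each field is one small, independent decision for the programmer.
    BorrowPolicy borrow = BorrowPolicy.FIFO;
    ExhaustionPolicy exhaustion = ExhaustionPolicy.GROW;
    LimitPolicy limit = LimitPolicy.FAIL;
    TrackingPolicy tracking = TrackingPolicy.NONE;

    // Total number of distinct configurations: the product of the
    // number of constants in each enum type.
    static int combinations() {
        return BorrowPolicy.values().length
                * ExhaustionPolicy.values().length
                * LimitPolicy.values().length
                * TrackingPolicy.values().length;
    }
}
```

With two constants per type, the programmer makes four two-way
choices, yet PoolConfig.combinations() covers 16 distinct
configurations. With more constants per type the gap between "choices
per step" and "total permutations" widens quickly.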

> IMO, such enum types have limited use, unless we can guarantee
> reasonable (ideally, full) closure. Often, it is not possible to
> enumerate all the types / strategies / policies that may make sense
> for the varying use cases that we only attempt to foresee. In many
> cases, such as this one, my personal preference is to leave things
> pluggable, rather than enumerable.

The composite pool is already pluggable: you must give it a
PoolableObjectFactory. :-)

> We should instead, if you and
> others agree, define the contracts between a "pool" and each of the
> four "behaviors" that you list above. We can supply (n) out-of-the-box
> implementations, but leave it open for a user to *easily* define a
> (n+1)th should such a need arise (and I believe it will, sooner or
> later).

We already provide a number of out-of-the-box implementations:
GenericObjectPool, StackObjectPool, SoftReferenceObjectPool, and soon
a "Composite ObjectPool". (There are also similar KeyedObjectPools.)

Not everything is made better by being made pluggable (or
subclassable). Anything that you expose as public or protected you
cannot change without risking compatibility. By keeping all of the
implementation details private you can completely change the
implementation without worrying about breaking compatibility.

Also, because the composite pool implementation is so separated from
the way it is configured, it allows for internal optimizations. The
composite pool factory currently optimizes the created pool in a
number of ways, including:
* detecting when the idle pool will never grow over a conservatively
tweaked internal threshold and choosing an ArrayList over a LinkedList,
because the worst case performance of an ArrayList with a size of
~15 is still better than the best case performance of a LinkedList
with a size greater than zero.
* detecting when a configured expression of a pool can be more
efficiently expressed as a different configuration that still has the
same behavior.
* detecting a pool with a self-contradictory configuration and
preventing the creation of a broken pool.
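The first and last of those optimizations could look roughly like the
sketch below. This is not the real factory code: the class name, the
method signature, the exact threshold value, the "negative means
unbounded" convention, and the particular contradiction being checked
are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of a factory choosing internals from the
// requested configuration. Nothing here is the actual Pool API.
final class IdleListChooser {
    // Assumed cutoff; the text above only says "~15".
    static final int SMALL_POOL_THRESHOLD = 15;

    // maxIdle < 0 is taken to mean "unbounded" (an assumed convention).
    // createWhenExhausted stands in for an exhaustion setting.
    static <T> List<T> newIdleList(int maxIdle, boolean createWhenExhausted) {
        // Assumed example of a self-contradictory configuration: a pool
        // that may hold no idle objects and may never create one could
        // never lend anything, so refuse to build it.
        if (maxIdle == 0 && !createWhenExhausted) {
            throw new IllegalArgumentException(
                    "pool can neither hold nor create objects");
        }
        // Small, bounded idle pools get an array-backed list.
        if (maxIdle >= 0 && maxIdle <= SMALL_POOL_THRESHOLD) {
            return new ArrayList<>(maxIdle);
        }
        // Large or unbounded idle pools fall back to a linked list.
        return new LinkedList<>();
    }
}
```

Because callers only ever see a List (or, in the real code, only a
pool), the threshold and the choice of list type can change between
releases without affecting client code.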

I also have some more intrusive optimizations planned that may not be
available with a more exposed implementation. The largest performance
killer of the composite pool code right now is serialization of
threads due to synchronization (not the java.io.Serializable kind).
Different configurations need different amounts of synchronization to
remain thread-safe and correct. Currently the composite pool code
synchronizes more than is needed for the default and most common
configuration. When I have time I'll add another optimization that
works out the narrowest amount of synchronization needed to remain
thread-safe and maintain correct behavior. I'm pretty sure other
optimizations will become available when the composite pool can
depend on Java 1.5 and take advantage of j.u.concurrent features.

With a fully pluggable API the synchronization optimization above
wouldn't really be available. You could use marker interfaces or add
methods to query the synchronization needs of a plugin, but that would
be awkward to use. The same logic explains why you should always use a
j.u.Iterator to loop across a List instead of checking for the
j.u.RandomAccess marker interface.
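As an illustration of that comparison, the sketch below shows both
looping styles side by side. The class and method names are made up
for the example; only java.util.List, java.util.Iterator and
java.util.RandomAccess are real JDK types.

```java
import java.util.Iterator;
import java.util.List;
import java.util.RandomAccess;

// Illustrative only: two ways to sum the elements of a List.
final class LoopStyles {
    // The caller inspects a marker interface and branches on it, so the
    // calling code is coupled to how the list happens to be implemented.
    static int sumBranching(List<Integer> list) {
        int sum = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                sum += list.get(i);
            }
        } else {
            for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
                sum += it.next();
            }
        }
        return sum;
    }

    // The caller just iterates and leaves the traversal strategy to
    // the list's own Iterator.
    static int sumIterating(List<Integer> list) {
        int sum = 0;
        for (Integer value : list) {
            sum += value;
        }
        return sum;
    }
}
```

Both methods return the same result for any list, but the first one
needs twice the code and has to be revisited whenever a new kind of
list shows up. A plugin API that asked each plugin about its
synchronization needs would push the same kind of branching onto the
pool.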

I'm not against pluggable APIs. They often make sense, but not always,
and this is one time they don't. I also want the composite pool code
to be "future proof". Peter Steijn, who emailed a week ago, is
exploring some new ways to improve the performance of Pool (and by
extension Dbcp) by using some more complex threading behaviors. We've
discussed some of his ideas off-list and, provided his ideas pan out,
maybe the composite pool code in Pool 2.1 will be faster, and client
code using Pool won't have to know or care how the improved
performance came about.

> As a concrete example, for [scxml], we define a SCXMLExecutor (the
> state machine "engine") accompanied by a SCXMLSemantics interface [1].
> The basic modus operandi for an engine is simple - when an event is
> triggered, figure out which (if any) transition(s) to follow, and
> transit to the new set of states executing any specified actions along
> the way. However, there are numerous points of contention along the
> way. Let's take dispute resolution for example -- when more than one
> outbound transition from a single state holds true. Which path do we
> take? The default implementation available in the distro is puristic,
> it will throw a ModelException. However, a user may want:
>
>  * The transition defined closest to the document root to be followed
>  * The transition defined farthest from the document root to be followed
>  * The transition whose origin and target have the lowest common
> ancestor to be followed
>  * The transition whose origin and target have the highest common
> ancestor to be followed
>
> Even after one of above dispute resolution algorithms is applied, if
> we end up with more than one candidate transitions, the user may want:
>
>  * A ModelException to be thrown
>  * The transition that appears first in document order to be followed
>  * The transition that appears last in document order to be followed
>
> To implement any of the above choices, the user may simply extend the
> default SCXMLSemantics implementation, override the
> filterTransitionsSet() method, and use the new semantics while
> instantiating the SCXMLExecutor.
>
> This approach means:
>
>  * We don't have to foresee all dispute resolution algorithms, and
> provide implementations
>  * Users don't have to convince anyone that the algorithm they need is
> useful, they can just implement it if they need it
>  * We don't even have to contend that the default puristic behavior
> that doesn't tolerate any non-determinism is the most common or the
> most useful one, it is just one that is chosen as default (because I
> personally believe it leads to better proof of correctness arguments).
>
> Since we're talking about Pool 2.0 and beyond, perhaps a focus on
> similar extensibility is justified, and maybe we should revisit the
> enumeration approach, even before we get to names.

If you want to implement a more pluggable pool implementation and put
the pluggable pool and composite pool code in a steel cage match to
the death based on usability, flexibility, and performance, I'm all
for it.

> -Rahul
>
> (long, possibly fragmented URL below)
>
> [1] 
> http://svn.apache.org/viewcvs.cgi/jakarta/commons/sandbox/scxml/trunk/src/main/java/org/apache/commons/scxml/SCXMLSemantics.java?view=markup

--
Sandy McArthur

"He who dares not offend cannot be honest."
- Thomas Paine

