On 12/14/2010 8:37 AM, Gregg Wonderly wrote:
On 12/14/2010 1:36 AM, MICHAEL MCGRADY wrote:
> I would say that, in addition to just being fast, the data structure
> must accommodate synchronous and asynchronous backups, partitions,
> and transactions.
This is an important issue because outrigger used to support two
scenarios: a persistent and a non-persistent version. The persistent
version used PSE for serialization to disk. That was a simple yet
powerful mechanism. Due to licensing (Sun paid for a distribution
license), it was, in a sense, deprecated by the time River was started.
For those that don't know about PSE, it used a post-compilation bytecode
manipulator that looked for calls to a "start transaction" method, then
found assignments that modified the associated data structures, and
rewrote the bytecode to set a "modified bit" on the associated data.
When "end transaction" was encountered, it stopped.
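The "modified bit" idea described above can be sketched by hand in plain Java. This is only an illustration of dirty-bit tracking, not PSE's actual API or generated code: the class `Account`, the method `endTransaction`, and the assumption that modified objects are collected for writing at the end of a transaction are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DirtyBitSketch {
    /** Hypothetical persistence-capable object; all names are illustrative. */
    static class Account {
        private long balance;
        private boolean modified; // the "modified bit"

        void setBalance(long newBalance) {
            this.balance = newBalance;
            // A bytecode rewriter would inject the equivalent of this line
            // after each field assignment inside a transaction.
            this.modified = true;
        }

        long getBalance() { return balance; }
        boolean isModified() { return modified; }
        void clearModified() { modified = false; }
    }

    /**
     * Assumed commit step: collect the objects whose modified bit is set
     * (these are the ones that would be written to disk) and clear the bit.
     */
    static List<Account> endTransaction(List<Account> tracked) {
        List<Account> toWrite = new ArrayList<>();
        for (Account a : tracked) {
            if (a.isModified()) {
                toWrite.add(a);
                a.clearModified();
            }
        }
        return toWrite;
    }

    public static void main(String[] args) {
        Account a = new Account();
        Account b = new Account();
        a.setBalance(100); // only 'a' is touched in this "transaction"
        List<Account> dirty = endTransaction(Arrays.asList(a, b));
        System.out.println("objects to write: " + dirty.size());
    }
}
```

The attraction of doing this by bytecode rewriting is that application code never has to set the bit itself; the cost is a post-compilation step in the build.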
I think it would be a good idea to focus on the performance of the
in-memory version (messaging-only applications). The persistent
version is a completely different animal and requires some fairly
advanced features for managing all of the appropriate control points.
Making one code path do both can be somewhat challenging from an
all-out performance perspective.
Thanks for the useful background information.
There is one slim hope I can see for a common code path, but it is a
very long way off.
My prejudice, subject to being convinced that another approach would be
better, would be to try to map a persistent version to a relational
database through SQL. Relational databases deal with transactions, ACID,
distribution, and performance issues. There are a lot of options for
users, more than for OO databases, at all price points starting at free.
The way outrigger uses its FastList looks rather like a sort of
simplified relational database, with each FastList instance representing
a table and selects being done by linear scan of the table.
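To make the analogy concrete, here is a minimal sketch of a "table" whose select is a linear scan. It is not outrigger's FastList (which is a concurrent structure with its own API); the class `LinearScanTable` and its methods are hypothetical and single-threaded, meant only to show the access pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class LinearScanTable<T> {
    // Rows are kept in insertion order; there is no index of any kind.
    private final List<T> rows = new ArrayList<>();

    void insert(T row) {
        rows.add(row);
    }

    /** Roughly "SELECT * WHERE predicate": visit every row, keep the matches. */
    List<T> select(Predicate<T> where) {
        List<T> matches = new ArrayList<>();
        for (T row : rows) {
            if (where.test(row)) {
                matches.add(row);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        LinearScanTable<String> table = new LinearScanTable<>();
        table.insert("alpha");
        table.insert("beta");
        table.insert("apple");
        System.out.println(table.select(s -> s.startsWith("a")));
    }
}
```

Every select costs time proportional to the number of rows, which is the main thing an indexed relational table could change.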
If we made a persistent version use a relational database to represent
the space, we could then experiment with performance run-offs between
our best shot at an ad-hoc in-memory implementation, and what we get
from the persistent version if we drop in an in-memory database
implementation. If they come close, we could drop the ad-hoc
implementation and focus all effort on the relational database version.
It is a slim hope. Often, a custom-tuned data structure will out-perform
a specialization of a general data structure. In any case, I agree with
working first on the in-memory version.
Patricia