Magnus Lie Hetland wrote:
[EMAIL PROTECTED] <[EMAIL PROTECTED]>:
Check out http://e4graph.sourceforge.net/
Thanks -- I've seen it (a while ago), but I guess I'll have another
look.
My main questions were really about how Metakit would perform (i.e.,
the number of disk accesses etc.) under an implementation roughly like
the one I described :)
The main thing to keep in mind is the column-wise nature of data,
including subviews:
http://www.equi4.com/mkviews.html
Lots of tiny subviews are not going to scale, because of the
fragmentation. This is one of the things which e4graph adresses.
Fragmentation due to lots of tiny columns is something I plan to
improve in the future (by inling more small amounts of data into
their parent data-structures).
MK's "blocked" views are very much like B-trees in a column wise
world. Their depth is fixed at two, but it turns out that this gets
many of the benefits of B-trees:
http://www.equi4.com/mkblocked.html
It takes a bit of mental juggling to see how blocked subviews and
columns fit together.
Also, again due to columns: keep in mind that as you iterate over
data structures in MK, only properties/columns which actually
participate in the iteration will cause disk access. And that
columns with ints in them are considerably more efficient to access
in random order (and often a lot smaller, due to 1..32-bit adaptive
int-width vector sizing).
As always, if performance is your top priority then there is really
no way around trying out various approaches. You can expect order-of-
magnitude differences, and I've dabbled in this columnar world long
enough to know that there are lots and lots of surprises either way.
YMWVAL: your mileage *will* vary a lot.
-jcw
_____________________________________________
Metakit mailing list - Metakit@equi4.com
http://www.equi4.com/mailman/listinfo/metakit