Magnus Lie Hetland wrote:

[EMAIL PROTECTED] <[EMAIL PROTECTED]>:

Check out http://e4graph.sourceforge.net/

Thanks -- I've seen it (a while ago), but I guess I'll have another
look.

My main questions were really about how Metakit would perform (i.e.,
the number of disk accesses etc.) under an implementation roughly like
the one I described :)

The main thing to keep in mind is the column-wise nature of data, including subviews:
        http://www.equi4.com/mkviews.html
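
To make "column-wise" concrete, here is a minimal pure-Python sketch. It is not Metakit code: the `ColumnStore` class, its `touched` set and the sample property names are all hypothetical. Each property lives in its own vector, so reading one property never has to look at the others.

```python
class ColumnStore:
    """Toy column-wise table: one Python list per property (hypothetical)."""

    def __init__(self, **columns):
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of rows")
        self._columns = {name: list(values) for name, values in columns.items()}
        # Stand-in for disk access: records which columns were ever read.
        self.touched = set()

    def column(self, name):
        """Return one whole column and note that it was accessed."""
        self.touched.add(name)
        return self._columns[name]


store = ColumnStore(
    name=["ann", "bob", "cy"],
    age=[31, 45, 27],
    notes=["long text", "more long text", "even more text"],
)

# Iterating over a single property only touches that property's vector.
total_age = sum(store.column("age"))
```

After the `sum`, `store.touched` contains only `"age"`; the `name` and `notes` vectors were never read, which is the effect a row-wise layout cannot give you.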

Lots of tiny subviews are not going to scale, because of the fragmentation. This is one of the things which e4graph addresses. Fragmentation due to lots of tiny columns is something I plan to improve in the future (by inlining more small amounts of data into their parent data structures).

MK's "blocked" views are very much like B-trees in a column-wise world. Their depth is fixed at two, but it turns out that this gets many of the benefits of B-trees:
        http://www.equi4.com/mkblocked.html
It takes a bit of mental juggling to see how blocked subviews and columns fit together.
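
A rough way to picture the fixed-depth-two idea is a top-level list of bounded-size blocks. The sketch below is plain Python and only an illustration under that assumption, not how Metakit implements blocked views; `BlockedList` and `max_block` are made-up names.

```python
class BlockedList:
    """Toy two-level sequence: a top-level list of bounded-size blocks."""

    def __init__(self, max_block=4):
        self.max_block = max_block  # hypothetical upper bound on block size
        self.blocks = [[]]

    def __len__(self):
        return sum(len(block) for block in self.blocks)

    def _locate(self, index):
        """Map a flat index to (block number, offset inside that block)."""
        if index < 0:
            raise IndexError("negative indices are not supported")
        for number, block in enumerate(self.blocks):
            if index < len(block):
                return number, index
            index -= len(block)
        raise IndexError("index out of range")

    def __getitem__(self, index):
        number, offset = self._locate(index)
        return self.blocks[number][offset]

    def insert(self, index, value):
        """Insert before `index`; only one block is modified or split."""
        if index == len(self):
            number, offset = len(self.blocks) - 1, len(self.blocks[-1])
        else:
            number, offset = self._locate(index)
        block = self.blocks[number]
        block.insert(offset, value)
        if len(block) > self.max_block:
            # Split the oversized block in two; the depth stays at two levels.
            middle = len(block) // 2
            self.blocks[number:number + 1] = [block[:middle], block[middle:]]


items = BlockedList(max_block=4)
for n in range(10):
    items.insert(len(items), n)
items.insert(0, -1)
```

An insert shifts elements within a single small block instead of the whole sequence, which is where the B-tree-like behaviour comes from even though there are never more than two levels.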

Also, again due to columns: keep in mind that as you iterate over data structures in MK, only properties/columns which actually participate in the iteration will cause disk access. Columns with ints in them are also considerably more efficient to access in random order (and often a lot smaller, due to 1..32-bit adaptive int-width vector sizing).
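
To get a feel for how much adaptive int widths can save, here is a small Python estimate. It assumes non-negative values and a power-of-two set of candidate widths between 1 and 32 bits; both are assumptions made for illustration, and the names `WIDTHS`, `bits_per_entry` and `column_bytes` are invented here rather than taken from Metakit.

```python
WIDTHS = (1, 2, 4, 8, 16, 32)  # assumed candidate widths, in bits


def bits_per_entry(values):
    """Smallest assumed width that can hold every value in the column."""
    if min(values, default=0) < 0:
        raise ValueError("this sketch only handles non-negative ints")
    needed = max(1, max(values, default=0).bit_length())
    for width in WIDTHS:
        if needed <= width:
            return width
    raise ValueError("value does not fit in 32 bits")


def column_bytes(values):
    """Approximate packed size of the column, rounded up to whole bytes."""
    return (bits_per_entry(values) * len(values) + 7) // 8


flags = [0, 1] * 500        # 1000 entries, each fits in 1 bit
big_ids = [70000] * 1000    # 1000 entries, each needs 17 bits

flag_size = column_bytes(flags)
id_size = column_bytes(big_ids)
```

Under these assumptions the flag column packs into 125 bytes while the id column takes 4000, so the same row count can differ in size by a large factor depending only on the range of values stored.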

As always, if performance is your top priority then there is really no way around trying out various approaches. You can expect order-of-magnitude differences, and I've dabbled in this columnar world long enough to know that there are lots and lots of surprises either way. YMWVAL: your mileage *will* vary a lot.

-jcw

_____________________________________________
Metakit mailing list  -  Metakit@equi4.com
http://www.equi4.com/mailman/listinfo/metakit
