Jacob Levy wrote:
OK, now try to iterate through the data and see Metakit really win.
Metakit really hits a groove when you iterate over the data sequentially
as that plays very well with the paging and caching on modern OSes and
hardware.
Let's do the test:
2.9143433 seconds to iterate
Brian:
Thanks much for these observations. They are compatible with my own
hunches at this point. And at the moment I'm using both BSDDB and
Metakit in the way you describe: storing ontology structure information in
BSDDB and document index information in Metakit (with references to
ontology
Brian Kelley wrote:
Let's do the test:
2.9143433 seconds to iterate bsddb3
1.8621608 seconds to iterate metakit
So metakit is approximately 30% faster for linear access. Both are
pretty good though.
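
As a quick check on the two timings quoted above, the relative difference can be worked out directly. This is only a sketch of the arithmetic: the helper name compare_timings is made up for illustration, and the inputs are just the two figures reported in the message, not a re-run of the test.

```python
def compare_timings(slow_seconds, fast_seconds):
    """Return (fraction of time saved, speed ratio) for two wall-clock timings."""
    time_saved = (slow_seconds - fast_seconds) / slow_seconds
    speed_ratio = slow_seconds / fast_seconds
    return time_saved, speed_ratio


# Figures as reported in the message above.
bsddb3_seconds = 2.9143433
metakit_seconds = 1.8621608

time_saved, speed_ratio = compare_timings(bsddb3_seconds, metakit_seconds)
print(f"time saved by metakit: {time_saved:.1%}")
print(f"speed ratio (bsddb3 time / metakit time): {speed_ratio:.3f}")
```

By this arithmetic metakit took roughly 36% less time for the linear pass, so "approximately 30%" is, if anything, on the conservative side. A single run says little about variance, though.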
As you know, statistics can be made to come out any way you like, i.e.
your above
Two options not implemented in MK 2.4.9.2 are compression and
encryption.
Due to the column-wise design of MK, this may actually have substantial
consequences. The idea is that in a datafile with, say, the layout
names[first:S,last:S,phones[type:S,number:S]] it would be possible to
designate
gary,
I'm no metakit expert, but I believe the size of a metakit file is currently
constrained by the file system. So if your OS or file system is limited to 2 GB
you could have a problem with a matrix that is 50,000x50,000, since that is 2.5G
elements.
Tom K.
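
The size estimate above is easy to check with a few lines of arithmetic. This is a rough sketch only: the helper matrix_bytes is hypothetical, the element widths (1, 4 and 8 bytes) are assumptions, and nothing here models how Metakit actually lays out or pads its columns.

```python
def matrix_bytes(rows, cols, bytes_per_element):
    """Raw payload size of a dense rows x cols matrix, ignoring any file overhead."""
    return rows * cols * bytes_per_element


# Assumed limit: a file system that caps files at 2 GiB.
TWO_GIB = 2 * 1024 ** 3

rows = cols = 50_000
elements = rows * cols
print(f"elements: {elements:,}")

# Assumed element widths in bytes; the original message does not give one.
for width in (1, 4, 8):
    size = matrix_bytes(rows, cols, width)
    verdict = "exceeds" if size > TWO_GIB else "fits within"
    print(f"{width} byte(s) per element: {size:,} bytes, {verdict} 2 GiB")
```

Even at one byte per element, the 2.5 billion elements come to more than 2 GiB, so a dense matrix of this shape would not fit in a single file under such a limit.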
It strikes me that Metakit might provide a handy way of
I would use encryption for storages that contain Tcl programs that
represent my IP, definitely.
I would use compression for compressible data, e.g. if I stored GIF or
JPEG images as blobs.
Not too sure about the combination, I can't think of a scenario where I'd
like to use compressed +
Jacob Levy wrote:
The intersection of a row and column would represent what exactly? That
there are entities that have both of these categories? Or some shared
property of the two categories?
My compound fingerprinting scheme is pretty similar, except that I just
store integers instead of the