On Friday, 24 March 2017 at 17:48:35 UTC, H. S. Teoh wrote:
(In my case, though, B-trees may not represent much of an
improvement, because I'm dealing with high-dimensional data
that cannot be easily linearized to take maximum advantage of
B-tree locality. So at some level I still need some kind of
hash-like structure to work with my data. But it will probably
have some tree-like structure to it, because of the
(high-dimensional) locality it exhibits.)
T
Hi T,
Your problem is intriguing and definitely stretching my mind!
I'll be factoring your ideas into my app design as I go along.
Some techniques that might be relevant to your app, if only for
relative performance comparisons:
Using metadata in lieu of actual data to maximize the number of
rows "represented" in the caches.
Using one or more columnstores, both intra- and extra-cache,
to allow transformations of one or more fields of one or more
rows with extremely small read, computation and write costs.
Scaling the app horizontally, if possible.
Using stored procedures on a SQL, NoSQL or NewSQL DBMS to
harness the DBMS's bulk-processing and high-throughput
capabilities.
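To make the columnstore point a bit more concrete, here is a rough Python sketch. The table, its field names and the scale_column helper are all made up for illustration and are not based on anything in your app. The idea is only that each field lives in its own contiguous array, so transforming one field touches that array and nothing else.

```python
from array import array

# Hypothetical three-field table, stored column by column rather than row by row.
columns = {
    "id": array("q", [1, 2, 3, 4]),
    "x": array("d", [0.5, 1.5, 2.5, 3.5]),
    "y": array("d", [10.0, 20.0, 30.0, 40.0]),
}


def scale_column(table, name, factor):
    """Rewrite one field for every row; the other columns are never read."""
    col = table[name]
    for i in range(len(col)):
        col[i] *= factor


def get_row(table, i):
    """Reassemble a single row on demand by indexing each column."""
    return {name: col[i] for name, col in table.items()}


# Transform field "x" for all rows in one pass over a single array.
scale_column(columns, "x", 2.0)
```

Whether this wins anything in practice would depend on how often whole rows need to be reassembled (get_row above) versus how often a single field is swept across many rows.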
I'd love to hear whatever details you can share about your app.
Alternatively, I've made a list of a dozen or so questions that would
help me think about how to approach your problem. If you're
interested in pursuing either avenue, let me know! Thanks again