On 05/11/13 05:35, Robert Haas wrote:
> On Mon, Nov 4, 2013 at 11:32 AM, Andres Freund <and...@2ndquadrant.com> wrote:
>> I think doing this outside of s_b will make stuff rather hard for
>> physical replication and crash recovery since we either will need to
>> flush the whole buffer at checkpoints - which is hard since the
>> checkpointer doesn't work inside individual databases - or we need to
>> persist the in-memory buffer across restart which also sucks.
> You might be right, but I think part of the value of LSM-trees is that
> the in-memory portion of the data structure is supposed to be able to
> be optimized for in-memory storage rather than on disk storage.  It
> may be that block-structuring that data bleeds away much of the
> performance benefit.  Of course, I'm talking out of my rear end here:
> I don't really have a clue how these algorithms are supposed to work.

How about having a 'TRANSIENT INDEX' that exists only in memory, so that there is no requirement to write it to disk or to replicate it directly? This type of index would likely be very fast, and easier to implement than a durable one. Recovery would involve rebuilding the index, and sharing it with a slave would involve recreating it there. It is probably not appropriate for a primary index, but may be okay for secondary indexes used to speed up specific queries.

This could be useful in some situations now, and would allow time to gain experience in how best to implement the basic concept. A more robust solution using WAL, etc. can then be developed later.

I suspect that such a TRANSIENT INDEX would still be useful even once a more robust in-memory index method was available, as I expect it would be faster to set up than a robust in-memory index. That might be good when you need one or more indexes for a short period of time, or when the index is so small that recreating it takes very little time (total elapsed time might even be less than for a disk-backed one?).
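
To make the idea a bit more concrete, here is a minimal Python sketch of the concept. It is not PostgreSQL code, and every name in it (Table, TransientIndex, rebuild, demo) is hypothetical. It assumes the table itself is durable, and that the index is a plain in-memory hash map which is simply thrown away on a crash and rebuilt by rescanning the table.

```python
from collections import defaultdict


class Table:
    """Stand-in for durable table storage; rows survive a 'restart'."""

    def __init__(self):
        self.rows = {}        # row_id -> dict of column values
        self._next_id = 0

    def insert(self, row):
        row_id = self._next_id
        self._next_id += 1
        self.rows[row_id] = dict(row)
        return row_id


class TransientIndex:
    """In-memory secondary index: never written to disk or logged."""

    def __init__(self, table, column):
        self.table = table
        self.column = column
        self.entries = defaultdict(set)   # key -> set of row ids
        self.rebuild()

    def rebuild(self):
        """Recovery path: discard everything and rescan the table."""
        self.entries.clear()
        for row_id, row in self.table.rows.items():
            self.entries[row[self.column]].add(row_id)

    def add(self, row_id, row):
        """Keep the index in step with a newly inserted row."""
        self.entries[row[self.column]].add(row_id)

    def lookup(self, key):
        # .get() avoids creating empty entries in the defaultdict
        return sorted(self.entries.get(key, ()))


def demo():
    table = Table()
    index = TransientIndex(table, "city")
    for row in ({"city": "Oslo"}, {"city": "Lima"}, {"city": "Oslo"}):
        index.add(table.insert(row), row)
    before = index.lookup("Oslo")
    # Simulate a crash: the index is lost, the table is not.
    index = TransientIndex(table, "city")
    after = index.lookup("Oslo")
    return before, after
```

In this sketch, "recovery" is just constructing a new TransientIndex over the surviving table, and a slave would do the same against its own copy of the data. A hash map is only one possible in-memory structure; it gives no ordering, so a real implementation wanting range scans would need something else.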


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)