Re: [HACKERS] [RFC] Minmax indexes

Simon Riggs Sat, 15 Jun 2013 08:16:02 -0700

On 15 June 2013 00:01, Josh Berkus <j...@agliodbs.com> wrote:
> Alvaro,
>
> This sounds really interesting, and I can see the possibilities.
> However ...
>
>> Value changes in columns that are part of a minmax index, and tuple insertion
>> in summarized pages, would invalidate the stored min/max values.  To support
>> this, each minmax index has a validity map; a range can only be considered in
>> a scan if it hasn't been invalidated by such changes (A range "not considered"
>> in the scan needs to be returned in whole regardless of the stored min/max
>> values, that is, it cannot be pruned per query quals).  The validity map is very
>> similar to the visibility map in terms of performance characteristics: quick
>> enough that it's not contentious, allowing updates and insertions to proceed
>> even when data values violate the minmax index conditions.  An invalidated
>> range can be made valid by re-summarization (see below).
>
> This begins to sound like these indexes are only useful on append-only
> tables.  Not that there aren't plenty of those, but ...


The index is basically using the same mechanism as index-only scans.
The "only useful on append-only tables" comment would apply equally to
index-only scans. I can't see a reason to raise that specifically for
this index type.
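
To make the mechanism concrete, here is a rough sketch in Python (not
PostgreSQL code) of how a scan might use the validity map. RangeSummary,
ranges_to_scan and note_insert are made-up names for illustration, and I'm
assuming the simplest rule: any insert into a summarized range clears its
validity bit.

```python
from dataclasses import dataclass


@dataclass
class RangeSummary:
    """Stored min/max for one page range, plus its validity-map bit."""
    min_val: int
    max_val: int
    valid: bool = True


def ranges_to_scan(summaries, lo, hi):
    """Return the page ranges a scan for lo <= x <= hi has to visit."""
    result = []
    for i, s in enumerate(summaries):
        if not s.valid:
            # Invalidated range: the stored min/max can't be trusted,
            # so the range is returned in whole and never pruned.
            result.append(i)
        elif s.max_val >= lo and s.min_val <= hi:
            # Valid range whose [min, max] overlaps the query quals.
            result.append(i)
    return result


def note_insert(summaries, i):
    """Assumed rule: an insert into a summarized range invalidates it."""
    summaries[i].valid = False
```

An invalidated range costs extra scanning but never gives a wrong answer,
which is the same trade-off as a page that is not all-visible for an
index-only scan.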


>> Re-summarization is relatively expensive, because the complete page range has
>> to be scanned.
>
> Why?  Why can't we just update the affected pages in the index?

Again, same thing as index-only scans. For IOS, we reset the
visibility info at vacuum. The route proposed here follows exactly the
same timing, same mechanism. I can't see a reason for any difference
between the two.
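
As a rough illustration of that timing, assuming re-summarization happens in
a vacuum-like pass over invalidated ranges (sketch only; RangeSummary,
resummarize, vacuum_pass and pages_per_range are invented names, and the
heap is modelled as a list of pages, each a list of column values):

```python
from dataclasses import dataclass


@dataclass
class RangeSummary:
    min_val: int
    max_val: int
    valid: bool


def resummarize(summary, pages):
    """Rescan every page in the range and rebuild its min/max."""
    values = [v for page in pages for v in page]
    if values:
        summary.min_val = min(values)
        summary.max_val = max(values)
    # With no values the old bounds are kept; they are merely conservative.
    summary.valid = True


def vacuum_pass(summaries, heap, pages_per_range):
    """Re-summarize only the ranges whose validity bit was cleared."""
    for i, s in enumerate(summaries):
        if not s.valid:
            start = i * pages_per_range
            resummarize(s, heap[start:start + pages_per_range])
```

The cost is proportional to the number of invalidated ranges, since each one
needs its complete page range scanned, which is why keeping invalidations
rare matters.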


>>  To avoid this, a table having a minmax index would be
>> configured so that inserts only go to the page(s) at the end of the table;
>> this avoids frequent invalidation of ranges in the middle of the table.  We
>> provide a table reloption that tweaks the FSM behavior, so that summarized
>> pages are not candidates for insertion.
>
> We haven't had an index type which modifies table insertion behavior
> before, and I'm not keen to start now; imagine having two indexes on the
> same table each with their own, conflicting, requirements.  This is
> sounding a lot more like a candidate for our prospective pluggable
> storage manager.  Also, the above doesn't help us at all with UPDATEs.
>
> If we're going to start adding reloptions for specific table behavior,
> I'd rather think of all of the optimizations we might have for a
> prospective "append-only table" and bundle those, rather than tying it
> to whether a certain index exists or not.

I agree that the FSM behaviour shouldn't be linked to index existence.

IMHO that should be a separate table parameter, WITH (fsm_mode = append)

Index-only scans would also benefit from that.
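
Roughly what I have in mind, sketched in Python rather than as a patch.
fsm_mode is only a proposed parameter, and choose_insert_page and tail_pages
are hypothetical names; the point is that the page-selection policy depends
on the table setting alone, not on which indexes exist.

```python
def choose_insert_page(free_space, needed, fsm_mode="normal", tail_pages=1):
    """Pick a page for a new tuple.

    free_space: free bytes per page, indexed by page number.
    Returns a page number; len(free_space) means "extend the table".
    """
    n = len(free_space)
    if fsm_mode == "append":
        # Only the last tail_pages pages are candidates for insertion.
        start = max(0, n - tail_pages)
    else:
        # Normal behaviour: any page with enough room will do.
        start = 0
    for page in range(start, n):
        if free_space[page] >= needed:
            return page
    return n  # nothing suitable: add a new page at the end


free = [500, 0, 100, 300]
print(choose_insert_page(free, 200))            # normal mode reuses page 0
print(choose_insert_page(free, 200, "append"))  # append mode uses page 3
```

In append mode the free space on page 0 is deliberately left unused, so
earlier pages stay untouched by inserts. As Josh notes, this does nothing
for UPDATEs.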


> Also, I hate the name ... if this feature goes ahead, I'm going to be
> lobbying to change it.  But that's pretty minor compared to the update
> issues.

This feature has already had 3 different names. I don't think the name
is crucial, but it makes sense to give it a name up front. So if you
want to lobby for that then you'd need to come up with a name soon, so
poor Alvaro can cope with name #4.

(There's no consistency in naming from any other implementation either).

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

