Re: [HACKERS] [RFC] Minmax indexes

2013-09-01 Thread Noah Misch
On Fri, Jun 14, 2013 at 06:28:06PM -0400, Alvaro Herrera wrote: Partial indexes are not supported; since an index is concerned with minimum and maximum values of the involved columns across all the pages in the table, it doesn't make sense to exclude values. It can make sense if the

Re: [HACKERS] [RFC] Minmax indexes

2013-07-19 Thread Josh Berkus
On 07/18/2013 10:39 PM, Alvaro Herrera wrote: To scan the index, we first obtain the TID of index tuple for page 0. If this returns a valid TID, we read that tuple to determine the min/max bounds for this page range. If it returns invalid, then the range is unsummarized, and the scan must

Re: [HACKERS] [RFC] Minmax indexes

2013-07-18 Thread Alvaro Herrera
Alvaro Herrera wrote: This is a preliminary proposal for Minmax indexes. I'm experimenting with the code, but it's too crude to post yet, so here's a document explaining what they are and how they work, so that reviewers can poke holes to have the design improved. After going further with

Re: [HACKERS] [RFC] Minmax indexes

2013-06-28 Thread Jim Nasby
On 6/17/13 3:38 PM, Josh Berkus wrote: Why? Why can't we just update the affected pages in the index? The page range has to be scanned in order to find out the min/max values for the indexed columns on the range; and then, with these data, update the index. Seems like you could incrementally

Re: [HACKERS] [RFC] Minmax indexes

2013-06-28 Thread Claudio Freire
On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby j...@nasby.net wrote: On 6/17/13 3:38 PM, Josh Berkus wrote: Why? Why can't we just update the affected pages in the index? The page range has to be scanned in order to find out the min/max values for the indexed columns on the range; and then,

Re: [HACKERS] [RFC] Minmax indexes

2013-06-28 Thread Jim Nasby
On 6/28/13 12:26 PM, Claudio Freire wrote: On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby j...@nasby.net wrote: On 6/17/13 3:38 PM, Josh Berkus wrote: Why? Why can't we just update the affected pages in the index? The page range has to be scanned in order to find out the min/max values for

Re: [HACKERS] [RFC] Minmax indexes

2013-06-28 Thread Josh Berkus
On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby j...@nasby.net wrote: If we add a dirty flag it would probably be wise to allow for more than one value so we can do a clock-sweep. That would allow for detecting a range that is getting dirtied repeatedly and not bother to try and re-summarize it

Re: [HACKERS] [RFC] Minmax indexes

2013-06-25 Thread Simon Riggs
On 25 June 2013 00:51, Bruce Momjian br...@momjian.us wrote: On Sat, Jun 15, 2013 at 11:39:23AM -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote: If we're going to start adding reloptions for specific table

Re: [HACKERS] [RFC] Minmax indexes

2013-06-25 Thread Alvaro Herrera
I just noticed that this patch was closed with returned with feedback in the commitfest app. This is good, IMV -- it's saying that the opinion of the various people commenting on the thread is positive, and therefore no more discussion is currently needed. I will post an actual patch to CF2, at

Re: [HACKERS] [RFC] Minmax indexes

2013-06-24 Thread Bruce Momjian
On Sat, Jun 15, 2013 at 11:39:23AM -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote: If we're going to start adding reloptions for specific table behavior, I'd rather think of all of the optimizations we might have

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Simon Riggs
On 17 June 2013 02:05, Josh Berkus j...@agliodbs.com wrote: I agree that the FSM behaviour shouldn't be linked to index existence. IMHO that should be a separate table parameter, WITH (fsm_mode = append) Index only scans would also benefit from that. -1 ... I cannot believe that such a

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Josh Berkus
So there isn't a fall down thing here. We expect the recently loaded/updated data to be scanned and that's OK. Having the minmax index updated greedily is just adding extra work for fast diminishing returns. We can always add that later if really needed, but I doubt it will be needed - in

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Alvaro Herrera
Greg Stark wrote: On Fri, Jun 14, 2013 at 11:28 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Re-summarization is relatively expensive, because the complete page range has to be scanned. That doesn't sound too bad to me. It just means there's a downside to having larger page

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Alvaro Herrera
Josh Berkus wrote: Value changes in columns that are part of a minmax index, and tuple insertion in summarized pages, would invalidate the stored min/max values. To support this, each minmax index has a validity map; a range can only be considered in a scan if it hasn't been

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Andres Freund
On 2013-06-17 16:23:40 -0400, Alvaro Herrera wrote: Re-summarization is relatively expensive, because the complete page range has to be scanned. Why? Why can't we just update the affected pages in the index? The page range has to be scanned in order to find out the min/max

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Alvaro Herrera
Tom Lane wrote: We've talked a lot about index-organized tables in the past. How much of the use case for this would be subsumed by a feature like that? IOTs are not flexible enough. You can only have one index that you index-organize the table on; and you can search only based on a prefix

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Josh Berkus
This begins to sound like these indexes are only useful on append-only tables. Not that there aren't plenty of those, but ... But what? ... but the other comments further down in my email. Also, my successive comments in other emails. Why? Why can't we just update the affected pages

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Tomas Vondra
Hi! This sounds really interesting, so a few quick comments. On 15.6.2013 00:28, Alvaro Herrera wrote: In each index tuple (corresponding to one page range), we store: - first block this tuple applies to - last block this tuple applies to - for each indexed column: * min() value across all

Re: [HACKERS] [RFC] Minmax indexes

2013-06-16 Thread Craig Ringer
On 06/15/2013 11:39 PM, Tom Lane wrote: There's a nearby thread complaining bitterly about our willingness to create hard-to-use, hard-to-tune features. In its current form, this will be another one of those. For what it's worth, I was going for usability concerns, not complaint... and I'd

Re: [HACKERS] [RFC] Minmax indexes

2013-06-16 Thread Josh Berkus
I agree that the FSM behaviour shouldn't be linked to index existence. IMHO that should be a separate table parameter, WITH (fsm_mode = append) Index only scans would also benefit from that. -1 ... I cannot believe that such a parameter would ever get turned on in production by anyone. If

Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote: Alvaro, This sounds really interesting, and I can see the possibilities. However ... Value changes in columns that are part of a minmax index, and tuple insertion in summarized pages, would invalidate the stored min/max values. To

Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 00:07, Tom Lane t...@sss.pgh.pa.us wrote: We've talked a lot about index-organized tables in the past. How much of the use case for this would be subsumed by a feature like that? IOTs would only work for one specific organisation, acting like a multi-column btree. So you could

Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote: If we're going to start adding reloptions for specific table behavior, I'd rather think of all of the optimizations we might have for a prospective append-only table and bundle those, rather

Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 16:39, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote: If we're going to start adding reloptions for specific table behavior, I'd rather think of all of the optimizations we might have for

[HACKERS] [RFC] Minmax indexes

2013-06-14 Thread Alvaro Herrera
Hi, This is a preliminary proposal for Minmax indexes. I'm experimenting with the code, but it's too crude to post yet, so here's a document explaining what they are and how they work, so that reviewers can poke holes to have the design improved. My intention is to have a patch to show for CF2,

Re: [HACKERS] [RFC] Minmax indexes

2013-06-14 Thread Josh Berkus
Alvaro, This sounds really interesting, and I can see the possibilities. However ... Value changes in columns that are part of a minmax index, and tuple insertion in summarized pages, would invalidate the stored min/max values. To support this, each minmax index has a validity map; a range

Re: [HACKERS] [RFC] Minmax indexes

2013-06-14 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes: To avoid this, a table having a minmax index would be configured so that inserts only go to the page(s) at the end of the table; this avoids frequent invalidation of ranges in the middle of the table. We provide a table reloption that tweaks the FSM

Re: [HACKERS] [RFC] Minmax indexes

2013-06-14 Thread Greg Stark
On Fri, Jun 14, 2013 at 11:28 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Re-summarization is relatively expensive, because the complete page range has to be scanned. That doesn't sound too bad to me. It just means there's a downside to having larger page ranges. I would expect the page