Re: [HACKERS] [RFC] Minmax indexes

2013-09-01 Thread Noah Misch
On Fri, Jun 14, 2013 at 06:28:06PM -0400, Alvaro Herrera wrote: > Partial indexes are not supported; since an index is concerned with minimum > and > maximum values of the involved columns across all the pages in the table, it > doesn't make sense to exclude values. It can make sense if the predi

Re: [HACKERS] [RFC] Minmax indexes

2013-07-19 Thread Josh Berkus
On 07/18/2013 10:39 PM, Alvaro Herrera wrote: > To scan the index, we first obtain the TID of index tuple for page 0. If > this returns a valid TID, we read that tuple to determine the min/max bounds > for this page range. If it returns invalid, then the range is unsummarized, > and the scan must

Re: [HACKERS] [RFC] Minmax indexes

2013-07-18 Thread Alvaro Herrera
Alvaro Herrera wrote: > This is a preliminary proposal for Minmax indexes. I'm experimenting > with the code, but it's too crude to post yet, so here's a document > explaining what they are and how they work, so that reviewers can poke > holes to have the design improved. After going further wit

Re: [HACKERS] [RFC] Minmax indexes

2013-06-28 Thread Josh Berkus
>> On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby wrote: >>> If we add a dirty flag it would probably be wise to allow for more >>> than one >>> value so we can do a clock-sweep. That would allow for detecting a range >>> that is getting dirtied repeatedly and not bother to try and >>> re-summarize i

Re: [HACKERS] [RFC] Minmax indexes

2013-06-28 Thread Jim Nasby
On 6/28/13 12:26 PM, Claudio Freire wrote: On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby wrote: On 6/17/13 3:38 PM, Josh Berkus wrote: Why? Why can't we just update the affected pages in the index? The page range has to be scanned in order to find out the min/max values for the indexed colu

Re: [HACKERS] [RFC] Minmax indexes

2013-06-28 Thread Claudio Freire
On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby wrote: > On 6/17/13 3:38 PM, Josh Berkus wrote: Why? Why can't we just update the affected pages in the index? >>> >>> > >>> >The page range has to be scanned in order to find out the min/max values >>> >for the indexed columns on the range; an

Re: [HACKERS] [RFC] Minmax indexes

2013-06-28 Thread Jim Nasby
On 6/17/13 3:38 PM, Josh Berkus wrote: Why? Why can't we just update the affected pages in the index? > >The page range has to be scanned in order to find out the min/max values >for the indexed columns on the range; and then, with these data, update >the index. Seems like you could incrementa

Re: [HACKERS] [RFC] Minmax indexes

2013-06-25 Thread Alvaro Herrera
I just noticed that this patch was closed with "returned with feedback" in the commitfest app. This is good, IMV -- it's saying that the opinion of the various people commenting on the thread is positive, and therefore no more discussion is currently needed. I will post an actual patch to CF2, at

Re: [HACKERS] [RFC] Minmax indexes

2013-06-24 Thread Simon Riggs
On 25 June 2013 00:51, Bruce Momjian wrote: > On Sat, Jun 15, 2013 at 11:39:23AM -0400, Tom Lane wrote: > > Simon Riggs writes: > > > On 15 June 2013 00:01, Josh Berkus wrote: > > >> If we're going to start adding reloptions for specific table behavior, > > >> I'd rather think of all of the opt

Re: [HACKERS] [RFC] Minmax indexes

2013-06-24 Thread Bruce Momjian
On Sat, Jun 15, 2013 at 11:39:23AM -0400, Tom Lane wrote: > Simon Riggs writes: > > On 15 June 2013 00:01, Josh Berkus wrote: > >> If we're going to start adding reloptions for specific table behavior, > >> I'd rather think of all of the optimizations we might have for a > >> prospective "append-

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Tomas Vondra
Hi! This sounds really interesting, so a few quick comments. On 15.6.2013 00:28, Alvaro Herrera wrote: > In each index tuple (corresponding to one page range), we store: - > first block this tuple applies to - last block this tuple applies to > - for each indexed column: * min() value across al

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Josh Berkus
>> This begins to sound like these indexes are only useful on append-only >> tables. Not that there aren't plenty of those, but ... > > But what? ... "but" the other comments further down in my email. Also, my successive comments in other emails. >> Why? Why can't we just update the affect

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Alvaro Herrera
Tom Lane wrote: > We've talked a lot about index-organized tables in the past. How much > of the use case for this would be subsumed by a feature like that? IOTs are not flexible enough. You can only have one index that you index-organize the table on; and you can search only based on a prefix

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Andres Freund
On 2013-06-17 16:23:40 -0400, Alvaro Herrera wrote: > > > Re-summarization is relatively expensive, because the complete page range > > > has > > > to be scanned. > > > > Why? Why can't we just update the affected pages in the index? > > The page range has to be scanned in order to find out the

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Alvaro Herrera
Josh Berkus wrote: > > Value changes in columns that are part of a minmax index, and tuple > > insertion > > in summarized pages, would invalidate the stored min/max values. To support > > this, each minmax index has a validity map; a range can only be considered > > in a > > scan if it hasn't

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Alvaro Herrera
Greg Stark wrote: > On Fri, Jun 14, 2013 at 11:28 PM, Alvaro Herrera > wrote: > > Re-summarization is relatively expensive, because the complete page range > > has > > to be scanned. > > That doesn't sound too bad to me. It just means there's a downside to > having larger page ranges. I would ex

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Josh Berkus
> So there isn't a "fall down" thing here. We expect the recently > loaded/updated data to be scanned and that's OK. > > Having the minmax index updated greedily is just adding extra work for > fast diminishing returns. We can always add that later if really > needed, but I doubt it will be neede

Re: [HACKERS] [RFC] Minmax indexes

2013-06-17 Thread Simon Riggs
On 17 June 2013 02:05, Josh Berkus wrote: > >>> I agree that the FSM behaviour shouldn't be linked to index existence. >>> IMHO that should be a separate table parameter, WITH (fsm_mode = append) >>> Index only scans would also benefit from that. >> >> -1 ... I cannot believe that such a parameter

Re: [HACKERS] [RFC] Minmax indexes

2013-06-16 Thread Josh Berkus
>> I agree that the FSM behaviour shouldn't be linked to index existence. >> IMHO that should be a separate table parameter, WITH (fsm_mode = append) >> Index only scans would also benefit from that. > > -1 ... I cannot believe that such a parameter would ever get turned on > in production by any

Re: [HACKERS] [RFC] Minmax indexes

2013-06-16 Thread Craig Ringer
On 06/15/2013 11:39 PM, Tom Lane wrote: > There's a nearby thread complaining bitterly about our willingness to > create hard-to-use, hard-to-tune "features". In its current form, > this will be another one of those. For what it's worth, I was going for "usability concerns", not "complaint"... and

Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 16:39, Tom Lane wrote: > Simon Riggs writes: >> On 15 June 2013 00:01, Josh Berkus wrote: >>> If we're going to start adding reloptions for specific table behavior, >>> I'd rather think of all of the optimizations we might have for a >>> prospective "append-only table" and bundle

Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Tom Lane
Simon Riggs writes: > On 15 June 2013 00:01, Josh Berkus wrote: >> If we're going to start adding reloptions for specific table behavior, >> I'd rather think of all of the optimizations we might have for a >> prospective "append-only table" and bundle those, rather than tying it >> to whether a c

Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 00:07, Tom Lane wrote: > We've talked a lot about index-organized tables in the past. How much > of the use case for this would be subsumed by a feature like that? IOTs would only work for one specific organisation, acting like a multi-column btree. So you could use IOTs for thi

Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 00:01, Josh Berkus wrote: > Alvaro, > > This sounds really interesting, and I can see the possibilities. > However ... > >> Value changes in columns that are part of a minmax index, and tuple insertion >> in summarized pages, would invalidate the stored min/max values. To support

Re: [HACKERS] [RFC] Minmax indexes

2013-06-14 Thread Greg Stark
On Fri, Jun 14, 2013 at 11:28 PM, Alvaro Herrera wrote: > Re-summarization is relatively expensive, because the complete page range has > to be scanned. That doesn't sound too bad to me. It just means there's a downside to having larger page ranges. I would expect the page ranges to be something

Re: [HACKERS] [RFC] Minmax indexes

2013-06-14 Thread Tom Lane
Josh Berkus writes: >> To avoid this, a table having a minmax index would be >> configured so that inserts only go to the page(s) at the end of the table; >> this >> avoids frequent invalidation of ranges in the middle of the table. We >> provide >> a table reloption that tweaks the FSM behavio

Re: [HACKERS] [RFC] Minmax indexes

2013-06-14 Thread Josh Berkus
Alvaro, This sounds really interesting, and I can see the possibilities. However ... > Value changes in columns that are part of a minmax index, and tuple insertion > in summarized pages, would invalidate the stored min/max values. To support > this, each minmax index has a validity map; a range

[HACKERS] [RFC] Minmax indexes

2013-06-14 Thread Alvaro Herrera
Hi, This is a preliminary proposal for Minmax indexes. I'm experimenting with the code, but it's too crude to post yet, so here's a document explaining what they are and how they work, so that reviewers can poke holes to have the design improved. My intention is to have a patch to show for CF2,