Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Leonardo Francalanci
Simon Riggs wrote >> Can you CLUSTER >> against a minmax index? > > Not in this release, at least in my understanding. It's not yet > possible to do an ordered fetch, so the cluster scan probably won't > work. As per the patch I helped writing, CLUSTER should use the sequential heap scan+sort whe

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Simon Riggs
On 13 November 2013 11:54, Merlin Moncure wrote: >> Load time >> MinMax no overhead, same as raw COPY >> BTree - considerably slower And just as a general comment, the min max index does not slow down COPY as the table gets larger, whereas the btree gets slower as the table gets larger. Which is

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Simon Riggs
On 13 November 2013 11:54, Merlin Moncure wrote: > On Wed, Nov 13, 2013 at 7:33 AM, Simon Riggs wrote: >> On 13 November 2013 09:31, Leonardo Francalanci wrote: >>> I would like to see some numbers. >> >> Alvaro has given me some results for his patch. The figures I have are >> for a 2GB table.

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Merlin Moncure
On Wed, Nov 13, 2013 at 7:33 AM, Simon Riggs wrote: > On 13 November 2013 09:31, Leonardo Francalanci wrote: >> I would like to see some numbers. > > Alvaro has given me some results for his patch. The figures I have are > for a 2GB table. > > Index Build Time > MinMax 11 s > Btree 96s > > Inde

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Leonardo Francalanci
Simon Riggs wrote > From our discussions here, IMHO there is a strong case for avoiding > btrees completely for larger historical data tables. That isn't > something I had even considered as desirable before this conversation > but ISTM now that taking that approach will be more fruitful than > att

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Simon Riggs
On 13 November 2013 09:31, Leonardo Francalanci wrote: > Or, in other words: what are you going to write in the minmax index > documentation, "try and see if they work better for you"? That seems like good advice to me. Bacon > Aristotle. There is very little written about suitability of any in

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Leonardo Francalanci
Jeremy Harris wrote > Surely there's good correlation between IMSI & IMEI, so have a separate > table to translate one to (a group of) the others, and > halve the indexes on your main table? Yes; unfortunately not always both are available; but it's something we are thinking about (it requires log

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Leonardo Francalanci
Simon Riggs wrote > So in the use case you describe, the min max index would require a > scan of only 25% of the table, not the 80% described earlier for > random inserts. In my experience, people wish to keep data for much > longer periods and so the percentage of scan required would drop lower >

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Jeremy Harris
On 13/11/13 09:07, Leonardo Francalanci wrote: Problem? having 4 btree indexes on random values (imsi+imei * 2, since we have calling and caller) kills the performance in insertion after a while. Surely there's good correlation between IMSI & IMEI, so have a separate table to translate one to (

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Simon Riggs
On 13 November 2013 06:07, Leonardo Francalanci wrote: > The use case is pretty simple. > Think it as the NSA, as it would be much easier. > Collect all calls made/received by any user on a mobile network. > (in fact, it's something more than calls, so in fact is 2-10 rows per call). > Keep the d

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Leonardo Francalanci
Simon Riggs wrote > On 5 November 2013 14:28, Leonardo Francalanci < > m_lists@ > > wrote: > >> Either my sql is not correct (likely), or my understanding of the minmax >> index is >> not correct (even more likely), or the minmax index is not usable in a >> random inputs >> scenario. > > Please

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Nicolas Barbier
2013/11/12 Simon Riggs : > On 12 November 2013 21:41, Nicolas Barbier wrote: > >> Look-up speed is as follows: Each look-up must look through all >> B-trees. > > That can be optimised by using a min max approach, so we need only > look at sub-trees that may contain data. Under the assumption tha

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Simon Riggs
On 12 November 2013 19:54, Claudio Freire wrote: > On Tue, Nov 12, 2013 at 7:14 PM, Simon Riggs wrote: >> I still think we need to look at this from a query perspective though. >> We need to check whether there is a class of real world queries that >> are not well optimised by minmax indexes, or

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-13 Thread Nicolas Barbier
2013/11/12 Claudio Freire : > On Tue, Nov 12, 2013 at 6:41 PM, Nicolas Barbier > wrote: > >> (Note that K B-trees can be merged by simply scanning all of them >> concurrently, and merging them just like a merge sort merges runs. >> Also, all B-trees except for the first level (of size S) can be >

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-12 Thread Claudio Freire
On Tue, Nov 12, 2013 at 6:41 PM, Nicolas Barbier wrote: > (Note that K B-trees can be merged by simply scanning all of them > concurrently, and merging them just like a merge sort merges runs. > Also, all B-trees except for the first level (of size S) can be > compacted 100% as there is no need to

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-12 Thread Simon Riggs
On 12 November 2013 21:41, Nicolas Barbier wrote: > Look-up speed is as follows: Each look-up must look through all > B-trees. That can be optimised by using a min max approach, so we need only look at sub-trees that may contain data. > Index size: I think (didn’t calculate) that the combined s

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-12 Thread Nicolas Barbier
2013/11/12 Nicolas Barbier : > In conclusion, use a “B-forest” when: > > * The index entries are small (large fan-out). > * The insertion throughput is high. > * It’s OK for look-ups to be slow. > * Extra points when the storage medium has high seek times. Oops, forgot the most important ones: *

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-12 Thread Nicolas Barbier
2013/11/2 Simon Riggs : > On 29 October 2013 16:10, Peter Geoghegan wrote: > >> Presumably someone will get around to implementing a btree index >> insertion buffer one day. I think that would be a particularly >> compelling optimization for us, because we could avoid ever inserting >> index tupl

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-12 Thread Simon Riggs
On 5 November 2013 14:28, Leonardo Francalanci wrote: > Either my sql is not correct (likely), or my understanding of the minmax > index is > not correct (even more likely), or the minmax index is not usable in a > random inputs > scenario. Please show the real world SQL you intend to run, so we

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-11 Thread Jeff Janes
On Thu, Oct 31, 2013 at 12:43 AM, Leonardo Francalanci wrote: > Jeff Janes wrote > > True, but that is also true of indexes created in bulk. It all has to > > reach disk eventually-- > > [...] > > If the checkpoint interval is as long as the partitioning period, then > > hopefully the active inde

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-11 Thread Jeff Janes
On Tue, Nov 5, 2013 at 9:52 AM, Leonardo Francalanci wrote: > Jeff Janes wrote > > Some experiments I did a few years ago showed that applying sorts to the > > data to be inserted could be helpful even when the sort batch size was as > > small as one tuple per 5 pages of existing index. Maybe eve

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Alvaro Herrera
Greg Stark escribió: > I think minmax indexes are more akin to bitmap indexes. They will be very > effective for columns with low-cardinality, especially for columns that are > very clustered. In the extreme if all the values in some regions of the > table are the same then minmax indexes would be

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Greg Stark
On Tue, Nov 5, 2013 at 5:30 PM, Claudio Freire wrote: > > Maybe there's value in minmax indexes for sequential data, but not for > > random data, which is the topic of this thread. > > > Well, of course, they're not magic pixie dust. > > But is your data really random? (or normal?) I think minma

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Claudio Freire
On Tue, Nov 5, 2013 at 2:52 PM, Leonardo Francalanci wrote: > Jeff Janes wrote >> Some experiments I did a few years ago showed that applying sorts to the >> data to be inserted could be helpful even when the sort batch size was as >> small as one tuple per 5 pages of existing index. Maybe even l

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Jeff Janes wrote > Some experiments I did a few years ago showed that applying sorts to the > data to be inserted could be helpful even when the sort batch size was as > small as one tuple per 5 pages of existing index. Maybe even less. Cool!!! Do you have any idea/hint on how I could try and rep

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Claudio Freire wrote > Well, of course, they're not magic pixie dust. Of course they aren't. I think they can make a difference in a sequential input scenario. But I'm not the one who said that they are fit to solve the problems me and other people are talking about in this thread. Claudio Frei

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Claudio Freire
On Tue, Nov 5, 2013 at 12:22 PM, Leonardo Francalanci wrote: > Claudio Freire wrote >> you haven't really >> analyzed update cost, which is what we were talking about in that last >> post. > > I don't care for a better update cost if the cost to query is a table scan. > Otherwise, I'll just claim

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Jeff Janes
On Tue, Nov 5, 2013 at 12:25 AM, Leonardo Francalanci wrote: > Andres Freund-3 wrote > > On 2013-11-04 11:27:33 -0500, Robert Haas wrote: > >> On Mon, Nov 4, 2013 at 11:24 AM, Claudio Freire < > > > klaussfreire@ > > > > wrote: > >> > Such a thing would help COPY, so maybe it's worth a look > >> >

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Claudio Freire wrote > real data isn't truly random Well, let's try normal_rand??? create table t1 as select trunc(normal_rand(100, 50, 3)) as n, generate_series(1, 100) as i; with cte as (select min(n) as minn, max(n) as maxn, i/100 from t1 group by i/100),

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Claudio Freire
On Tue, Nov 5, 2013 at 11:28 AM, Leonardo Francalanci wrote: > I get > 9000 pages for 49 values out of 50... which means scanning 90% of > the table. > > Either my sql is not correct (likely), or my understanding of the minmax > index is > not correct (even more likely), or the minmax index is not

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Claudio Freire wrote > Min-max indexes always require a sequential scan of the min-max index > itself when querying. I'm worried about the number of heap pages that will be scanned. My understanding is that given the random input, the index will not be selective enough, and will end up requiring

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Claudio Freire
On Tue, Nov 5, 2013 at 6:57 AM, Leonardo Francalanci wrote: > Simon Riggs wrote >> Minmax indexes seem to surprise many people, so broad generalisations >> aren't likely to be useful. >> >> I think the best thing to do is to publish some SQL requests that >> demonstrate in detail what you are tryi

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Simon Riggs wrote > On 5 November 2013 09:57, Leonardo Francalanci < > m_lists@ > > wrote: >> While I do believe in testing (since "In theory there is no difference >> between theory and practice. In practice there is"), I would like to know >> the "properties" of the minmax index before trying i

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Simon Riggs
On 5 November 2013 09:57, Leonardo Francalanci wrote: > Simon Riggs wrote >> Minmax indexes seem to surprise many people, so broad generalisations >> aren't likely to be useful. >> >> I think the best thing to do is to publish some SQL requests that >> demonstrate in detail what you are trying to

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Simon Riggs wrote > Minmax indexes seem to surprise many people, so broad generalisations > aren't likely to be useful. > > I think the best thing to do is to publish some SQL requests that > demonstrate in detail what you are trying to achieve and test them > against minmax indexes. That way we c

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Simon Riggs
On 5 November 2013 08:32, Leonardo Francalanci wrote: > Simon Riggs wrote >> Everybody on this thread is advised to look closely at Min Max indexes >> before starting any further work. >> >> MinMax will give us access to many new kinds of plan, plus they are >> about as close to perfectly efficien

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Simon Riggs wrote > Everybody on this thread is advised to look closely at Min Max indexes > before starting any further work. > > MinMax will give us access to many new kinds of plan, plus they are > about as close to perfectly efficient, by which I mean almost zero > overhead, with regard to ins

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Andres Freund-3 wrote > On 2013-11-04 11:27:33 -0500, Robert Haas wrote: >> On Mon, Nov 4, 2013 at 11:24 AM, Claudio Freire < > klaussfreire@ > > wrote: >> > Such a thing would help COPY, so maybe it's worth a look >> >> I have little doubt that a deferred insertion buffer of some kind >> could

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Simon Riggs
On 30 October 2013 14:34, Yann Fontana wrote: > > >> On 30 October 2013 11:23, Leonardo Francalanci wrote: >> >> >> In terms of generality, do you think its worth a man year of developer >> >> effort to replicate what you have already achieved? Who would pay? > > > I work on an application that d

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Simon Riggs
On 4 November 2013 19:55, Gavin Flower wrote: > How about having a 'TRANSIENT INDEX' that only exists in memory, so there is > no requirement to write it to disk or to replicate directly? This type of > index would be very fast and easier to implement. Recovery would involve > rebuilding the ind

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Claudio Freire
On Mon, Nov 4, 2013 at 5:01 PM, Simon Riggs wrote: >> Of course, it's possible that even we do get a shared memory >> allocator, a hypothetical person working on this project might prefer >> to make the data block-structured anyway and steal storage from >> shared_buffers. So my aspirations in th

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Jeff Janes
On Mon, Nov 4, 2013 at 8:09 AM, Robert Haas wrote: > On Sat, Nov 2, 2013 at 6:07 AM, Simon Riggs wrote: > > On 29 October 2013 16:10, Peter Geoghegan wrote: > >> On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci > wrote: > >>> I don't see much interest in insert-efficient indexes. > >> > >

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Simon Riggs
On 4 November 2013 16:09, Robert Haas wrote: > On Sat, Nov 2, 2013 at 6:07 AM, Simon Riggs wrote: >> On 29 October 2013 16:10, Peter Geoghegan wrote: >>> On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci >>> wrote: I don't see much interest in insert-efficient indexes. >>> >>> Presuma

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Gavin Flower
On 05/11/13 05:35, Robert Haas wrote: On Mon, Nov 4, 2013 at 11:32 AM, Andres Freund wrote: I think doing this outside of s_b will make stuff rather hard for physical replication and crash recovery since we either will need to flush the whole buffer at checkpoints - which is hard since the chec

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Robert Haas
On Mon, Nov 4, 2013 at 11:32 AM, Andres Freund wrote: > I think doing this outside of s_b will make stuff rather hard for > physical replication and crash recovery since we either will need to > flush the whole buffer at checkpoints - which is hard since the > checkpointer doesn't work inside indi

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Andres Freund
On 2013-11-04 11:27:33 -0500, Robert Haas wrote: > On Mon, Nov 4, 2013 at 11:24 AM, Claudio Freire > wrote: > > Such a thing would help COPY, so maybe it's worth a look > > I have little doubt that a deferred insertion buffer of some kind > could help performance on some workloads, though I susp

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Robert Haas
On Mon, Nov 4, 2013 at 11:31 AM, Claudio Freire wrote: > On Mon, Nov 4, 2013 at 1:27 PM, Robert Haas wrote: >> On Mon, Nov 4, 2013 at 11:24 AM, Claudio Freire >> wrote: >>> Such a thing would help COPY, so maybe it's worth a look >> >> I have little doubt that a deferred insertion buffer of som

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Claudio Freire
On Mon, Nov 4, 2013 at 1:27 PM, Robert Haas wrote: > On Mon, Nov 4, 2013 at 11:24 AM, Claudio Freire > wrote: >> Such a thing would help COPY, so maybe it's worth a look > > I have little doubt that a deferred insertion buffer of some kind > could help performance on some workloads, though I sus

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Robert Haas
On Mon, Nov 4, 2013 at 11:24 AM, Claudio Freire wrote: > Such a thing would help COPY, so maybe it's worth a look I have little doubt that a deferred insertion buffer of some kind could help performance on some workloads, though I suspect the buffer would have to be pretty big to make it worthwhi

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Claudio Freire
On Mon, Nov 4, 2013 at 1:09 PM, Robert Haas wrote: > On Sat, Nov 2, 2013 at 6:07 AM, Simon Riggs wrote: >> On 29 October 2013 16:10, Peter Geoghegan wrote: >>> On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci >>> wrote: I don't see much interest in insert-efficient indexes. >>> >>> P

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-04 Thread Robert Haas
On Sat, Nov 2, 2013 at 6:07 AM, Simon Riggs wrote: > On 29 October 2013 16:10, Peter Geoghegan wrote: >> On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci >> wrote: >>> I don't see much interest in insert-efficient indexes. >> >> Presumably someone will get around to implementing a btree in

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-02 Thread Simon Riggs
On 29 October 2013 16:10, Peter Geoghegan wrote: > On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci > wrote: >> I don't see much interest in insert-efficient indexes. > > Presumably someone will get around to implementing a btree index > insertion buffer one day. I think that would be a par

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-31 Thread Leonardo Francalanci
Gavin Flower-2 wrote > How about being able to mark indexes: > 'MEMORY ONLY' to make them not go to disk > and > 'PERSISTENT | TRANSIENT' to mark if they should be recreated on > machine bootup? I would love that. But: 1) I'd like to make some tests with a "memory drive", and confirm t

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-31 Thread Leonardo Francalanci
Jeff Janes wrote > True, but that is also true of indexes created in bulk. It all has to > reach disk eventually-- > [...] > If the checkpoint interval is as long as the partitioning period, then > hopefully the active index buffers get re-dirtied while protected in > shared_buffers, and only get

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Claudio Freire
On Wed, Oct 30, 2013 at 10:53 AM, Simon Riggs wrote: > LSM-tree also covers the goal of maintaining just 2 sub-trees and a > concurrent process of merging sub-trees. That sounds like it would > take a lot of additional time to get right and would need some > off-line process to perform the merge.

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Gavin Flower
On 31/10/13 06:46, Jeff Janes wrote: On Wed, Oct 30, 2013 at 9:54 AM, Leonardo Francalanci mailto:m_li...@yahoo.it>> wrote: Jeff Janes wrote > The index insertions should be fast until the size of the active part of > the indexes being inserted into exceeds shared_buffers by som

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Jeff Janes
On Wed, Oct 30, 2013 at 9:54 AM, Leonardo Francalanci wrote: > Jeff Janes wrote > > The index insertions should be fast until the size of the active part of > > the indexes being inserted into exceeds shared_buffers by some amount > > (what > > that amount is would depend on how much dirty data th

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
Jeff Janes wrote > You could periodically merge older partitions into larger tables, index > those aggregated tables, then transactionally disinherit the old > partitions > and inherit the new aggregated one. This would keep the value of K down, > at the expense of re-writing data multiple times (

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
Jeff Janes wrote > Are partitions read-only once time has moved on, or can stragglers show up > that need to be inserted into older partitions? > > You could periodically merge older partitions into larger tables, index > those aggregated tables, then transactionally disinherit the old > partition

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
> Point being: hardware is marching along pretty fast (after 20+ years > of stagnation) and it's dangerous (IMO) to make big software > investments based on the situation on the ground *today*. Yes, that's a good point. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To m

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
Jeff Janes wrote > The index insertions should be fast until the size of the active part of > the indexes being inserted into exceeds shared_buffers by some amount > (what > that amount is would depend on how much dirty data the kernel is willing > to > allow in the page cache before it starts suff

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Jeff Janes
On Wed, Oct 30, 2013 at 4:23 AM, Leonardo Francalanci wrote: > > > 1) I haven't achieved what I need: realtime indexing. I can't query the > "current 15 minutes" table efficiently. Plus, K*log(N) is not that great > when you have a lot of K. > Are partitions read-only once time has moved on, or c

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Jeff Janes
On Wed, Oct 30, 2013 at 3:35 AM, Leonardo Francalanci wrote: > > Presumably the data you are inserting isn't actually random. Please > > describe the use case you are considering in more detail and some view > > on how frequent that is, with some examples. Once we understand the > > use case and a

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Merlin Moncure
On Wed, Oct 30, 2013 at 9:23 AM, Leonardo Francalanci wrote: >> LSM-trees seem patent free > > I'm no expert, and I gave it just a look some time ago: it looked to me very > complicated to get right... and as far as I remember you don't get that much > gain, unless you go multi-level which would

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Yann Fontana
On 30 October 2013 11:23, Leonardo Francalanci wrote: > > >> In terms of generality, do you think its worth a man year of developer > >> effort to replicate what you have already achieved? Who would pay? > I work on an application that does exactly what Leonardo described. We hit the exact same p

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
> LSM-trees seem patent free I'm no expert, and I gave it just a look some time ago: it looked to me very complicated to get right... and as far as I remember you don't get that much gain, unless you go multi-level which would complicate things further > Please somebody advise patent status of

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Simon Riggs
On 30 October 2013 11:23, Leonardo Francalanci wrote: >> What is the reason for needing such fast access to individual groups >> of records? Sure sounds like the NSA or similar ;-) > > > Users need to search all calls originated from/to a user or from/to a > specific mobile phone to answer/analyz

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Kaare Rasmussen
On 2013-10-30 12:08, Simon Riggs wrote: effort to replicate what you have already achieved? Who would pay? The NSA, obviously ;-) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
> What is the reason for needing such fast access to individual groups > of records? Sure sounds like the NSA or similar ;-) Users need to search all calls originated from/to a user or from/to a specific mobile phone to answer/analyze customers' probl... ok, I give up: I work for the NSA ;) >

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Simon Riggs
On 30 October 2013 10:35, Leonardo Francalanci wrote: >> Presumably the data you are inserting isn't actually random. Please >> describe the use case you are considering in more detail and some view >> on how frequent that is, with some examples. Once we understand the >> use case and agree it is

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
> Presumably the data you are inserting isn't actually random. Please > describe the use case you are considering in more detail and some view > on how frequent that is, with some examples. Once we understand the > use case and agree it is important, we might solve problems. Collecting calls data

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Simon Riggs
On 30 October 2013 07:55, Leonardo Francalanci wrote: >> Hmm, you realise Alvaro is working on MinMax indexes in this release? >> They are very efficient with regard to index inserts and specially >> designed for use on large tables. >> >> Prior work by Heikki on Grouped Item Tuples was a way of r

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
> Hmm, you realise Alvaro is working on MinMax indexes in this release? > They are very efficient with regard to index inserts and specially > designed for use on large tables. > > Prior work by Heikki on Grouped Item Tuples was a way of reducing the > size of indexes, yet still allowing uniquenes

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Leonardo Francalanci
> Hmm, you realise Alvaro is working on MinMax indexes in this release? > They are very efficient with regard to index inserts and specially > designed for use on large tables. > > Prior work by Heikki on Grouped Item Tuples was a way of reducing the > size of indexes, yet still allowing uniquene

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Simon Riggs
On 29 October 2013 07:53, Leonardo Francalanci wrote: > I don't see much interest in insert-efficient indexes. Hmm, you realise Alvaro is working on MinMax indexes in this release? They are very efficient with regard to index inserts and specially designed for use on large tables. Prior work by

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Tom Lane
Jeff Janes writes: > Robert removed the lmgr lock on the meta page by using a retry loop with > lightweight locks. I've outlined how to remove the heavyweight lock on the > bucket page as well, but it would require an on-disk change to the index so > that each page knows how far the bucket it is

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Jeff Janes
On Tue, Oct 29, 2013 at 8:16 AM, Tom Lane wrote: > Leonardo Francalanci writes: > >> Before getting too excited about some new academic index type, it's > worth > >> noting the sad state in which hash indexes have languished for years. > > > Aren't hash indexes in a poor state because they are n

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Leonardo Francalanci
> I bet you've mis-diagnosed the problem.  Btrees don't have a problem > keeping up with 50m records; you're problem is that after a certain > point your page cache can't keep up with the pseudo-random i/o > patterns and you start seeing faults to storage. > [...]  This has nothing to do the btree

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Peter Geoghegan
On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci wrote: > I don't see much interest in insert-efficient indexes. Presumably someone will get around to implementing a btree index insertion buffer one day. I think that would be a particularly compelling optimization for us, because we could av

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Merlin Moncure
On Tue, Oct 29, 2013 at 10:49 AM, Leonardo Francalanci wrote: >> Another point to add: I don't really see btree as a barrier to >> performance for most of the problems I face. The real barriers to >> database performance are storage, contention, and query planning. > > Ehm that's true for regular

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Claudio Freire
On Tue, Oct 29, 2013 at 1:10 PM, Peter Geoghegan wrote: > On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci > wrote: > > I don't see much interest in insert-efficient indexes. > > Presumably someone will get around to implementing a btree index > insertion buffer one day. I think that would

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Leonardo Francalanci
> They should, in theory, be faster than btrees -- O(1) not O(log N) page > fetches per lookup.  In practice they don't seem to be faster, and > nobody's bothered to find out exactly why.  Again, this isn't a terribly > encouraging precedent for implementing some other index type that's > supposed

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Leonardo Francalanci
> Another point to add: I don't really see btree as a barrier to > performance for most of the problems I face.  The real barriers to > database performance are storage, contention, and query planning. Ehm that's true for regular OLTP stuff, which I understand is what most (95%?) of people use/ne

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Tom Lane
Leonardo Francalanci writes: >> Before getting too excited about some new academic index type, it's worth >> noting the sad state in which hash indexes have languished for years. > Aren't hash indexes in a poor state because they are not faster than btree in > every condition? They should, in t

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread k...@rice.edu
On Tue, Oct 29, 2013 at 02:53:37PM +, Leonardo Francalanci wrote: > > Before getting too excited about some new academic index type, it's worth > > noting the sad state in which hash indexes have languished for years. > > Nobody's bothered to add WAL support, let alone do any other real work >

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Alvaro Herrera
Leonardo Francalanci wrote: > > Before getting too excited about some new academic index type, it's worth > > noting the sad state in which hash indexes have languished for years. > > Nobody's bothered to add WAL support, let alone do any other real work > > on them.  The non-btree index types that

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Leonardo Francalanci
> Before getting too excited about some new academic index type, it's worth > noting the sad state in which hash indexes have languished for years. > Nobody's bothered to add WAL support, let alone do any other real work > on them.  The non-btree index types that have been getting love are the > on

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Merlin Moncure
On Tue, Oct 29, 2013 at 2:53 AM, Leonardo Francalanci wrote: > Hi, > > > I don't see much interest in insert-efficient indexes. These are the ones > I've found: > > - LSM-tree (used by Cassandra and SQLite4?) > - Y-Tree > (http://www.bossconsulting.com/oracle_dba/white_papers/DW%20in%20oracle/P2

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Tom Lane
Craig Ringer writes: > On 10/29/2013 03:53 PM, Leonardo Francalanci wrote: >> 5) something else??? > Quite likely nobody has had the enthusiasm and interest to implement a > viable, quality implementation and stick with it long enough to get it > committed. > There are a great many good ideas fo

Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Craig Ringer
On 10/29/2013 03:53 PM, Leonardo Francalanci wrote: > 5) something else??? Quite likely nobody has had the enthusiasm and interest to implement a viable, quality implementation and stick with it long enough to get it committed. There are a great many good ideas for improvements to Pg that just do

[HACKERS] Fast insertion indexes: why no developments

2013-10-29 Thread Leonardo Francalanci
Hi, I don't see much interest in insert-efficient indexes. These are the ones I've found: - LSM-tree (used by Cassandra and SQLite4?) - Y-Tree (http://www.bossconsulting.com/oracle_dba/white_papers/DW%20in%20oracle/P23%20(ytree%20index%20structure%20for%20DWs).pdf ) - Fractal indexes (TokuDB,