Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-20 Thread deavid
On Fri, Jun 19, 2015 at 15:06, Simon Riggs (si...@2ndquadrant.com)
wrote:

 It doesn't say anything about there being only one index buffer per table,
 nor do I think it would make sense to do it that way. So ISTM that the
 foreground process still has to insert serially into N index buffers, with
 each insert being WAL logged.

 So the only saving for the foreground process is the random I/O from
 inserting into the indexes, which means the value of the technique is in
 the case where you have many very large secondary indexes - which is now
 covered by BRIN.


I'm still learning how PostgreSQL works, but you're assuming that inserting
in bulk into an index would require the same amount of CPU cycles and the
same amount of kB written as doing it row by row.

Most memory-based indexes have a bulk-load technique that relies on having
the data pre-sorted. Sorting pure random data and then bulk-inserting it
into the index is faster than classic insertion (less CPU time, no
idea about IO).
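
As a rough illustration of that technique from plain SQL (hypothetical
table and column names), pre-sorting a bulk load by the indexed key makes
the btree insertions mostly sequential instead of random:

    -- assumes a btree index on orders(order_date)
    INSERT INTO orders (order_date, amount)
    SELECT order_date, amount
    FROM staging_orders
    ORDER BY order_date;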

Database indexes are disk-based, and there are some points (regarding IO
performance) that are hard for me to fully understand. But it seems logical
that it would be faster to scan the index only once from beginning to end
and do something like a merge sort between the pre-sorted input and the
index.

So I guess I missed something. Maybe WAL logging is the problem? If so,
could this work for TEMP/UNLOGGED tables?

Lots of heavily written tables are materialized views (or they behave more
or less the same), so they could be refreshed in case of server failure. I
hope bulk inserts could double the performance; otherwise, this idea may
not be worth it.

About BRIN indexes: I'm really impressed. They are several times faster
than I could have imagined, and on selects they also perform very well. I
have to test them more, with more complex queries (would they work when
used in JOIN clauses?). If select times are good enough even in those
cases, then there's no need for doing bulk inserts with btree.


Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-19 Thread Simon Riggs
On 19 June 2015 at 14:30, Robert Haas (robertmh...@gmail.com) wrote:

  So I really doubt that anyone would have any enthusiasm for saddling
 btree
  with a similar mechanism.  It's complicated (and has been the cause of
  multiple bugs); it's hard to figure out when is the optimal time to flush
  the pending insertions; and it slows down searches in favor of making
  inserts cheap, which is generally not the way to bet --- if that's the
  tradeoff you want, why not drop the index altogether?

 I'm not sure you're right about that.  MySQL has a feature called
 secondary index buffering:

 https://dev.mysql.com/doc/refman/5.0/en/innodb-insert-buffering.html

 Now that might not be exactly what we want to do for one reason or
 another, but I think it would be silly to think that they implemented
 that for any reason other than performance, so there may be some
 performance to be gained there.

 Consider that on a table with multiple indexes, we've got to insert
 into all of them.  If it turns out that the first leaf page we need
 isn't in shared buffers, we'll wait for it to be read in.  We won't
 start the second index insertion until we've completed the first one,
 and so on.  So the whole thing is serial.  In the system MySQL has
 implemented, the foreground process would proceed unimpeded and any
 indexes whose pages were not in the buffer pool would get updated in
 the background.

 Ignoring for the moment the complexities of whether they've got the
 right design and how to implement it, that's sort of cool.


Interesting.

Reading that URL, it shows that they would need to write WAL to insert into
the buffer and then again to insert into the index. You might get away with
skipping WAL logging on the index buffer if you had a special WAL record to
record the event that all indexes were updated for a given xid, but since
that record would be written lazily it would significantly complicate the
lazy update mechanism to track that.

It doesn't say anything about there being only one index buffer per table,
nor do I think it would make sense to do it that way. So ISTM that the
foreground process still has to insert serially into N index buffers, with
each insert being WAL logged.

So the only saving for the foreground process is the random I/O from
inserting into the indexes, which means the value of the technique is in
the case where you have many very large secondary indexes - which is now
covered by BRIN.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-19 Thread Merlin Moncure
On Thu, Jun 11, 2015 at 9:32 PM, Qingqing Zhou
(zhouqq.postg...@gmail.com) wrote:
 On Fri, Jun 5, 2015 at 10:59 AM, Tom Lane (t...@sss.pgh.pa.us) wrote:
 So I really doubt that anyone would have any enthusiasm for saddling btree
 with a similar mechanism.  It's complicated (and has been the cause of
 multiple bugs); it's hard to figure out when is the optimal time to flush
 the pending insertions; and it slows down searches in favor of making
 inserts cheap, which is generally not the way to bet --- if that's the
 tradeoff you want, why not drop the index altogether?

 Hint bit updates are also painful in the above case, but that is off topic
 here.

Are your records spread out across many transactions, or do you tend to
have large batches all with the same xid?

merlin




Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-19 Thread Robert Haas
 So I really doubt that anyone would have any enthusiasm for saddling btree
 with a similar mechanism.  It's complicated (and has been the cause of
 multiple bugs); it's hard to figure out when is the optimal time to flush
 the pending insertions; and it slows down searches in favor of making
 inserts cheap, which is generally not the way to bet --- if that's the
 tradeoff you want, why not drop the index altogether?

I'm not sure you're right about that.  MySQL has a feature called
secondary index buffering:

https://dev.mysql.com/doc/refman/5.0/en/innodb-insert-buffering.html

Now that might not be exactly what we want to do for one reason or
another, but I think it would be silly to think that they implemented
that for any reason other than performance, so there may be some
performance to be gained there.

Consider that on a table with multiple indexes, we've got to insert
into all of them.  If it turns out that the first leaf page we need
isn't in shared buffers, we'll wait for it to be read in.  We won't
start the second index insertion until we've completed the first one,
and so on.  So the whole thing is serial.  In the system MySQL has
implemented, the foreground process would proceed unimpeded and any
indexes whose pages were not in the buffer pool would get updated in
the background.

Ignoring for the moment the complexities of whether they've got the
right design and how to implement it, that's sort of cool.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-18 Thread Jim Nasby

On 6/12/15 9:17 PM, deavid wrote:

   (upper(codagencia::text) );


FWIW, you should consider using citext for these cases; it would let you 
cut down on the number of indexes (by 50%?).
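
For example, a sketch against the quoted expression (the table name is an
assumption here):

    CREATE EXTENSION IF NOT EXISTS citext;
    ALTER TABLE deliverynotes
        ALTER COLUMN codagencia TYPE citext USING codagencia::citext;
    -- one index on the citext column now serves case-insensitive
    -- lookups, replacing the extra index on upper(codagencia::text)
    CREATE INDEX ON deliverynotes (codagencia);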

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-18 Thread deavid
I did another try with BRIN and GIN indexes, and I compared them to regular
btree indexes. Now I have 16M rows to do the test.

The numbers seem to be good. Both GIN and BRIN seem good options for
certain tables with more writes than reads (especially BRIN, which is very
good).

I want to share my test with you; I used real-world data, but I didn't
have time to do anything rigorous or cover real-world uses. I know the
methodology is not enough, and maybe some calculations in the spreadsheet
are wrong. I tried to do my best.

I'm using an SSD and I'm trying to compare CPU cost, not I/O.

In short, the results were: (compared to btree)
- INSERT: GIN is 50% faster; BRIN is 6x faster. This is the best scenario.
- UPDATE: each case has a winner; for big updates BRIN is 10x faster and
GIN is 25x faster. For small updates (most real world cases) BTREE is
always the winner; but BRIN gives some good results too.
- DELETE: Almost no difference between the three.
- SELECT: BTREE here is the winner. BRIN is 10% slower, and GIN performance
seems a bit random.

VACUUM, ANALYZE and other tasks are 6x faster with BRIN and 50% faster with
GIN.
Index sizes are 50% smaller with GIN, but with BRIN they are very, very
small.

Hope you find these numbers useful.


On Sat, Jun 13, 2015 at 11:41, deavid (deavidsed...@gmail.com)
wrote:

 Sorry; because of some misconfiguration, vacuum and analyze weren't
 working. Now I'm getting better numbers for BRIN indexes where there are
 zero rows to match.

 On Sat, Jun 13, 2015 at 3:17, deavid (deavidsed...@gmail.com)
 wrote:

 So I just ran a test case for hash, btree, gin_btree and brin indexes;
 also without indexes, and without primary keys.
 * Testing the deliverynotes table.
   - Definition and use case:
 It is a table containing real delivery note headers from several years.
 It consists of 300k rows, 128 columns, 63 indexes, and 243MB of data
 excluding indexes. Since it is a table visible to users, almost every
 column can be searched, so we need lots of indexes. We do not need
 searches to be the fastest possible; we only need to accelerate our
 user searches a bit, without harming writes too much.
   - Things to test:
 - measure index creation times.
 - measure index space.
 - with indexes but without primary key
 - with everything
 - create fully, delete everything, and insert the data again in blocks
 - test updates for recent data

 I attached the logs for every test, if anyone wants to see what I'm
 exactly testing.
 This was tested on my i5 laptop with 4GB of RAM and a 120GB SSD (OCZ
 Agility 3). I'm trying to measure CPU time, not I/O time, so some
 configurations and tests are specifically designed to avoid as much IO as
 I can. I'm using a dev build of PostgreSQL 9.5 from git sources.

 Conclusions:
 - Gin_btree seems slower in almost every case. Its writes are marginally
 better than regular btrees even when using work_mem=160MB (maybe 20%
 faster than btree). They are smaller than I thought.
 - BRIN indexes seem very fast for writes. For selects they are maybe a
 blend between having indexes and not having them. They don't recognize
 that some values are simply out of the range of indexed values, and that's
 a pity. If the values we want are packed together, I guess I would get
 even better results.
 - Primary keys and uniqueness checks don't seem to make any difference
 here.
 - Having no indexes at all is faster than I imagined (sometimes it beats
 BRIN or btree), maybe because the IO here is faster than usual.
 - Hash indexes: I tried to do something, but they take too much time to
 build and I don't know why. If creates are slow, updates should be slow
 too. I'm not going to test them again.

 And finally, I don't know why, but I couldn't vacuum or analyze tables.
 They always got stalled without doing anything, so I had to comment out
 every vacuum. Maybe there is a bug in this dev version or I misconfigured
 something.

 On Fri, Jun 12, 2015 at 7:27, Simon Riggs (si...@2ndquadrant.com)
 wrote:

 On 5 June 2015 at 18:07, deavid (deavidsed...@gmail.com) wrote:

 There are several use cases where I see an index as useful, but adding it
 will slow down inserts and updates too much.
 For example, when we have 10 million rows in a table, and it's a table
 which has frequent updates, we need several indexes to speed up selects,
 but then we'll slow down updates a lot, especially when we have 10 or more
 indexes.
 Other cases involve indexes for text search, which are used only for
 user search and aren't that important, so we want to have them, but we
 don't want the overhead they add whenever we write to the table.
 I know different approaches that already solve some of those problems
 in some ways (table partitioning, partial indexes, etc), but I don't feel
 they are the solution to every problem of this kind.

 Some people already asked for delayed-write indexes, but the idea
 gets discarded because the index could get out of sync, so it can omit
 results, and this is unacceptable. But I think maybe that could be fixed 

Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-18 Thread Alvaro Herrera
deavid wrote:
 I did another try with BRIN and GIN indexes, and I compared them to
 regular btree indexes. Now I have 16M rows to do the test.
 
 The numbers seem to be good. Both GIN and BRIN seem good options for
 certain tables with more writes than reads (especially BRIN, which is
 very good).

Thanks, that's very nice to hear.

So you may or may not have realized two important details regarding
BRIN.  One is that when you insert into a table, the values are not
immediately put into the index but only as part of a later vacuum, or a
brin_summarize_new_values() function call.  Either of those scans
the not-already-summarized part of the table and inserts the summarized
values into the index.  If the new values are not in the index, then a
query would have to read that part of the table completely instead of
excluding it from the scan.
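
A minimal sketch of that call, assuming a BRIN index named
deliverynotes_fecha_brin:

    -- summarize the page ranges added since the last vacuum; returns
    -- the number of ranges that were summarized
    SELECT brin_summarize_new_values('deliverynotes_fecha_brin'::regclass);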

The other thing is that in BRIN you can tell the system to make the
summary information very detailed or coarsely detailed -- say one
summary tuple for every page, or one summary tuple for every 128 pages.
The latter is the default.  Obviously, the more detailed it is, the more
you can skip when scanning the table.  If the values are perfectly
correlated, then there's no point to the extra detail, but if there's
some variability then it could be worthwhile.  You change this by
specifying WITH (pages_per_range=16) in the CREATE INDEX command, or
by doing ALTER INDEX SET (pages_per_range=16) and then REINDEXing it.
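
For example (hypothetical index, table, and column names):

    CREATE INDEX deliverynotes_fecha_brin ON deliverynotes
        USING brin (fecha) WITH (pages_per_range = 16);

    -- or change an existing index and rebuild it:
    ALTER INDEX deliverynotes_fecha_brin SET (pages_per_range = 16);
    REINDEX INDEX deliverynotes_fecha_brin;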

-- 
Álvaro Herrera             http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-18 Thread Nicolas Barbier
2015-06-05 deavid (deavidsed...@gmail.com):

 Mode 3: on aminsert, put the new entry in a second btree, leaving the
 first one untouched. Because the second btree is new, it will be small,
 and writes should be faster. When doing an index scan, read tuples from
 both at the same time (like a merge sort). On vacuum, merge the second
 btree into the first. In this mode, the index is sorted and there's no
 need for a recheck.

You might be interested in reading the thread “Fast insertion indexes:
why no developments” (2013), starting at
1383033222.73186.yahoomail...@web172602.mail.ir2.yahoo.com .

That thread talks mostly about reducing the (delayed) I/O caused by
inserting in a super-big index at a continuously high rate, while you
seem more interested in the delay problem caused by the CPU time used
when inserting in multiple indexes (which should be quite fast
already, as most of the writing is delayed).

Your problem (if it is really about delay and not about continuous
throughput) might be better served by a delay-reducing solution such
as writing a logical “row X is inserted, please make sure that all
indexes are up-to-date” to the WAL, instead of the physical “row X is
inserted into table A, part Y of index Z must be updated like this,
part Q of index S must be updated like so, etc” as is done now.
Updating the indexes (not only the writing itself) would then be
performed in a delayed fashion. Reading of an index must always
additionally scan the in-memory queue of logical WAL records that is
kept.

Of course, switching the WAL wholesale from a physical description of
the changes that must be performed to a logical description is
probably not feasible. Therefore, one could think about some new kind
of “logical WAL” that gets logged separately (or even mixed in with
the normal, physical WAL), where first the logical WAL is written
(“insert row X in table A”), after which other operations can continue
and the logical WAL is converted to physical WAL asynchronously (by
“performing” the changes as is currently done, but by a background
process). Recovery then would first need to replay the physical WAL,
and then replay the logical WAL.

Nicolas

-- 
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?




Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-13 Thread deavid
Sorry; because of some misconfiguration, vacuum and analyze weren't
working. Now I'm getting better numbers for BRIN indexes where there are
zero rows to match.

On Sat, Jun 13, 2015 at 3:17, deavid (deavidsed...@gmail.com)
wrote:

 So I just ran a test case for hash, btree, gin_btree and brin indexes;
 also without indexes, and without primary keys.
 * Testing the deliverynotes table.
   - Definition and use case:
 It is a table containing real delivery note headers from several years.
 It consists of 300k rows, 128 columns, 63 indexes, and 243MB of data
 excluding indexes. Since it is a table visible to users, almost every
 column can be searched, so we need lots of indexes. We do not need
 searches to be the fastest possible; we only need to accelerate our
 user searches a bit, without harming writes too much.
   - Things to test:
 - measure index creation times.
 - measure index space.
 - with indexes but without primary key
 - with everything
 - create fully, delete everything, and insert the data again in blocks
 - test updates for recent data
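
 A sketch of the four index flavors under test (hypothetical column name;
 gin_btree refers to the btree_gin extension, which lets GIN index scalar
 types):

     CREATE INDEX ON deliverynotes USING btree (codagencia);
     CREATE INDEX ON deliverynotes USING hash (codagencia);
     CREATE EXTENSION IF NOT EXISTS btree_gin;
     CREATE INDEX ON deliverynotes USING gin (codagencia);
     CREATE INDEX ON deliverynotes USING brin (codagencia);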

 I attached the logs for every test, if anyone wants to see what I'm
 exactly testing.
 This was tested on my i5 laptop with 4GB of RAM and a 120GB SSD (OCZ
 Agility 3). I'm trying to measure CPU time, not I/O time, so some
 configurations and tests are specifically designed to avoid as much IO as
 I can. I'm using a dev build of PostgreSQL 9.5 from git sources.

 Conclusions:
 - Gin_btree seems slower in almost every case. Its writes are marginally
 better than regular btrees even when using work_mem=160MB (maybe 20%
 faster than btree). They are smaller than I thought.
 - BRIN indexes seem very fast for writes. For selects they are maybe a
 blend between having indexes and not having them. They don't recognize
 that some values are simply out of the range of indexed values, and that's
 a pity. If the values we want are packed together, I guess I would get
 even better results.
 - Primary keys and uniqueness checks don't seem to make any difference
 here.
 - Having no indexes at all is faster than I imagined (sometimes it beats
 BRIN or btree), maybe because the IO here is faster than usual.
 - Hash indexes: I tried to do something, but they take too much time to
 build and I don't know why. If creates are slow, updates should be slow
 too. I'm not going to test them again.

 And finally, I don't know why, but I couldn't vacuum or analyze tables.
 They always got stalled without doing anything, so I had to comment out
 every vacuum. Maybe there is a bug in this dev version or I misconfigured
 something.

 On Fri, Jun 12, 2015 at 7:27, Simon Riggs (si...@2ndquadrant.com)
 wrote:

 On 5 June 2015 at 18:07, deavid (deavidsed...@gmail.com) wrote:

 There are several use cases where I see an index as useful, but adding it
 will slow down inserts and updates too much.
 For example, when we have 10 million rows in a table, and it's a table
 which has frequent updates, we need several indexes to speed up selects,
 but then we'll slow down updates a lot, especially when we have 10 or more
 indexes.
 Other cases involve indexes for text search, which are used only for user
 search and aren't that important, so we want to have them, but we don't
 want the overhead they add whenever we write to the table.
 I know different approaches that already solve some of those problems in
 some ways (table partitioning, partial indexes, etc), but I don't feel
 they are the solution to every problem of this kind.

 Some people already asked for delayed-write indexes, but the idea gets
 discarded because the index could get out of sync, so it can omit results,
 and this is unacceptable. But I think maybe that could be fixed in several
 ways and we can have a fast and reliable index (but maybe not so fast on
 selects).


 This is exactly the use case and mechanism for BRIN indexes.

 --
 Simon Riggs                http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-11 Thread Jim Nasby

On 6/5/15 6:54 PM, deavid wrote:


So the problem is: I see low iowait, and CPU time for one core is at
80-90% most of the time. I can buy more RAM, better disks, or CPUs with
more cores. But one CPU core would have more or less the same speed no
matter how much money you invest.

Wanting a delayed-write index is similar to setting
synchronous_commit = off. We want to return an OK to the client as
soon as possible and do this work in the background. But we also want
some reliability against crashes.

Also, if the task is done in the background it may be done by another
backend, so several indexes could probably be packed at once using
different backend processes. We could use the entire CPU if our index
writes weren't tied to the session that wrote the row.


Something that might help here would be doing the index maintenance in 
parallel via background workers. There's now enough parallelism 
infrastructure that it shouldn't be too hard to hack up a quick test of 
that.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-11 Thread Simon Riggs
On 5 June 2015 at 18:07, deavid (deavidsed...@gmail.com) wrote:

 There are several use cases where I see an index as useful, but adding it
 will slow down inserts and updates too much.
 For example, when we have 10 million rows in a table, and it's a table
 which has frequent updates, we need several indexes to speed up selects,
 but then we'll slow down updates a lot, especially when we have 10 or more
 indexes.
 Other cases involve indexes for text search, which are used only for user
 search and aren't that important, so we want to have them, but we don't
 want the overhead they add whenever we write to the table.
 I know different approaches that already solve some of those problems in
 some ways (table partitioning, partial indexes, etc), but I don't feel
 they are the solution to every problem of this kind.

 Some people already asked for delayed-write indexes, but the idea gets
 discarded because the index could get out of sync, so it can omit results,
 and this is unacceptable. But I think maybe that could be fixed in several
 ways and we can have a fast and reliable index (but maybe not so fast on
 selects).


This is exactly the use case and mechanism for BRIN indexes.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-11 Thread Qingqing Zhou
On Fri, Jun 5, 2015 at 10:59 AM, Tom Lane (t...@sss.pgh.pa.us) wrote:
 So I really doubt that anyone would have any enthusiasm for saddling btree
 with a similar mechanism.  It's complicated (and has been the cause of
 multiple bugs); it's hard to figure out when is the optimal time to flush
 the pending insertions; and it slows down searches in favor of making
 inserts cheap, which is generally not the way to bet --- if that's the
 tradeoff you want, why not drop the index altogether?

I have seen a case where a major fact table with up to 7 indexes gets
large amounts of data loaded every 15-60 minutes, with concurrent
seeks against the indexes at the same time. We can play with
partitioning, or sacrifice some application semantics, to alleviate
the pressure, but it would be good to see if we can improve things:
sorting and batching inserts into the btree helps with IO and locking
behavior. So can we guard the cases that are hard to handle, e.g.,
indexes enforcing constraints (like uniqueness), and improve the
loading scenario?

Hint bit updates are also painful in the above case, but that is off topic here.

Thanks,
Qingqing




Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-06 Thread k...@rice.edu
On Fri, Jun 05, 2015 at 11:54:01PM +, deavid wrote:
 Thanks to everybody for answering. I wasn't expecting this attention; this
 is a great community :-)
 
 Jim asked me about something real. Well, the problem is this showed up more
 than five years ago, and has kept popping up from time to time since, in
 different circumstances. I solved it in different ways each time, depending
 on the exact use case. I wanted to generalize, because it seems a good
 feature for several situations; and I don't expect a solution for myself,
 as each time I hit this I found some way to sort it out.
 As Jim said, what we need here are figures for real examples, and I don't
 have them yet. I'll do my homework and email back with exact problems and
 exact timings. Give me a week or two.
 
 Also, some of you are talking about IO. Well, it's hard to say without the
 figures here, but I'm pretty sure I'm hitting CPU time only. We use SSDs on
 those big databases, and also in my tests I tried setting fsync=off.
 
 So the problem is: I see low iowait, and CPU time for one core is at
 80-90% most of the time. I can buy more RAM, better disks, or CPUs with
 more cores. But one CPU core would have more or less the same speed no
 matter how much money you invest.
 
 Wanting a delayed-write index is similar to setting
 synchronous_commit = off. We want to return an OK to the client as soon
 as possible and do this work in the background. But we also want some
 reliability against crashes.
 
 Also, if the task is done in the background it may be done by another
 backend, so several indexes could probably be packed at once using
 different backend processes. We could use the entire CPU if our index
 writes weren't tied to the session that wrote the row.
 
 PS: I'm very interested in existing approaches like GIN or BRIN (this one
 is new to me). Thanks a lot; I'll try them in my tests.

Hi David,

Here is an interesting read comparing LSM and Fractal Tree indexing:

http://highscalability.com/blog/2014/8/6/tokutek-white-paper-a-comparison-of-log-structured-merge-lsm.html

Regards,
Ken




[HACKERS] Is it possible to have a fast-write Index?

2015-06-05 Thread deavid
There are several use cases where I see an index as useful, but adding it
will slow down inserts and updates too much.
For example, when we have 10 million rows in a table, and it's a table
which has frequent updates, we need several indexes to speed up selects, but
then we'll slow down updates a lot, especially when we have 10 or more
indexes.
Other cases involve indexes for text search, which are used only for user
search and aren't that important, so we want to have them, but we don't
want the overhead they add whenever we write to the table.
I know different approaches that already solve some of those problems in
some ways (table partitioning, partial indexes, etc), but I don't feel they
are the solution to every problem of this kind.

Some people already asked for delayed-write indexes, but the idea gets
discarded because the index could get out of sync, so it can omit results,
and this is unacceptable. But I think maybe that could be fixed in several
ways and we can have a fast and reliable index (but maybe not so fast on
selects).

Since I do not know every internal of Postgres, I feel it is simpler to
share here and ask which things can or cannot be done.

Let's imagine there is a new type of index called weird_btree, where we
trade off simplicity for speed. In almost every mode, we will rely on
VACUUM to put our index in an optimal state.

Mode 1: on aminsert, mark the index as INVALID. So, if you modified the
table, you need to run REINDEX/CREATE INDEX CONCURRENTLY before doing a
SELECT. This is almost the same as CREATE INDEX CONCURRENTLY; the main
difference is you don't have to remember to drop the index before writing.
(I don't see much benefit here.)

Mode 2: on aminsert, put the new entry in a plain, unordered list instead
of the btree. Inserting at the end of a list should be faster than into a
big btree, and you'll know later which entries you missed indexing.

Mode 2.a: on index scan (amrescan, amgettuple), pretend that after the
btree there is the list, and output every row, out of order. You will have
to tell Postgres that our index isn't sorted, and it will have to recheck
every row.

Mode 2.b: mark the index invalid instead. When doing the next vacuum, sort
the list and insert it into the btree in a bulk operation. If that's ok,
mark the index valid.

Mode 3: on aminsert, put the new entry in a second btree, leaving the
first one untouched. Because the second btree is new, it will be small, and
writes should be faster. When doing an index scan, read tuples from both at
the same time (like a merge sort). On vacuum, merge the second btree into
the first. In this mode, the index is sorted and there's no need for a
recheck.
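
(As a rough user-level analogue of this mode, with hypothetical names: keep
hot rows in a small side table with its own small btree, read from both,
and merge periodically.)

    CREATE TABLE orders_delta (LIKE orders INCLUDING INDEXES);

    -- reads consult both trees, like the merged index scan above
    SELECT * FROM orders WHERE customer_id = 42
    UNION ALL
    SELECT * FROM orders_delta WHERE customer_id = 42;

    -- the "vacuum-time" merge
    WITH moved AS (DELETE FROM orders_delta RETURNING *)
    INSERT INTO orders SELECT * FROM moved;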

Does anyone think this would be an interesting feature for PostgreSQL?
Did I miss something?

PS: Maybe it's also possible to take advantage of clustering, and have
indexes whose entries are ranges of TIDs; but I'm not sure if this is too
exotic, or if it will make a difference.

Sincerely,
David.


Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-05 Thread Gavin Flower

On 06/06/15 04:07, deavid wrote:
There are several use cases where I see an index as useful, but adding it
will slow down inserts and updates too much.
For example, when we have 10 million rows in a table, and it's a table
which has frequent updates, we need several indexes to speed up selects,
but then we'll slow down updates a lot, especially when we have 10 or
more indexes.
Other cases involve indexes for text search, which are used only for
user search and aren't that important, so we want to have them, but we
don't want the overhead they add whenever we write to the table.
I know different approaches that already solve some of those problems
in some ways (table partitioning, partial indexes, etc), but I don't
feel they are the solution to every problem of this kind.


Some people already asked for delayed-write indexes, but the idea
gets discarded because the index could get out of sync, so it can omit
results, and this is unacceptable. But I think maybe that could be
fixed in several ways and we can have a fast and reliable index (but
maybe not so fast on selects).


Since I do not know every internal of Postgres, I feel it is simpler to
share here and ask which things can or cannot be done.


Let's imagine there is a new type of index called weird_btree, where
we trade off simplicity for speed. In almost every mode, we will rely
on VACUUM to put our index in an optimal state.


Mode 1: on aminsert, mark the index as INVALID. So, if you modified
the table, you need to run REINDEX/CREATE INDEX CONCURRENTLY before
doing a SELECT. This is almost the same as CREATE INDEX CONCURRENTLY;
the main difference is you don't have to remember to drop the index
before writing. (I don't see much benefit here.)


Mode 2: on aminsert, put the new entry in a plain, unordered list
instead of the btree. Inserting at the end of a list should be faster
than into a big btree, and you'll know later which entries you missed
indexing.


Mode 2.a: on index scan (amrescan, amgettuple), pretend that after the
btree there is the list, and output every row, out of order. You will
have to tell Postgres that our index isn't sorted, and it will have to
recheck every row.


Mode 2.b: mark the index invalid instead. When doing the next vacuum,
sort the list and insert it into the btree in a bulk operation. If
that's ok, mark the index valid.


Mode 3: on aminsert, put the new entry in a second btree, leaving
the first one untouched. Because the second btree is new, it will be
small, and writes should be faster. When doing an index scan, read
tuples from both at the same time (like a merge sort). On vacuum, merge
the second btree into the first. In this mode, the index is sorted and
there's no need for a recheck.


Does anyone think this would be an interesting feature for PostgreSQL?
Did I miss something?

PS: Maybe it's also possible to take advantage of clustering, and have
indexes whose entries are ranges of TIDs; but I'm not sure if this is
too exotic, or if it will make a difference.


Sincerely,
David.

How about a hybrid indexing system, with 2 parts:

(1) the existing index system, which is checked first and has been mostly 
optimised for speed of reading.  If there are only a few inserts/updates 
and the system is not heavily loaded, then it gets modified 
immediately.  The thresholds for being too busy, and for few enough 
changes, could be configurable.


(2) an overflow index optimised for writing, possibly in memory and not 
backed by permanent storage.  A crash would require a complete index 
rebuild - but only when there were entries in it (or at least more than 
some configurable threshold, to allow for cases where some missing index 
entries are acceptable).


So where the index is needed for a query, part 1 is checked first, and 
then part 2 if necessary.


Have a background process that will move entries from part 2 to part 1 
when the system is less busy.



Cheers,
Gavin







Re: [HACKERS] Is it possible to have a fast-write Index?

2015-06-05 Thread deavid
Thanks to everybody for answering. I wasn't expecting this attention; this
is a great community :-)

Jim asked me about something real. Well, the problem is this showed up more
than five years ago, and has kept popping up from time to time since, in
different circumstances. I solved it in different ways each time, depending
on the exact use case. I wanted to generalize, because it seems a good
feature for several situations; and I don't expect a solution for myself,
as each time I hit this I found some way to sort it out.
As Jim said, what we need here are figures for real examples, and I don't
have them yet. I'll do my homework and email back with exact problems and
exact timings. Give me a week or two.

Also, some of you are talking about IO. Well, it's hard to say without the
figures here, but I'm pretty sure I'm hitting CPU time only. We use SSDs on
those big databases, and also in my tests I tried setting fsync=off.

So the problem is: I see low iowait, and CPU time for one core is at
80-90% most of the time. I can buy more RAM, better disks, or CPUs with
more cores. But one CPU core would have more or less the same speed no
matter how much money you invest.

Wanting a delayed-write index is similar to setting
synchronous_commit = off. We want to return an OK to the client as soon
as possible and do this work in the background. But we also want some
reliability against crashes.
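
For reference, the setting in question:

    SET synchronous_commit = off;  -- commit returns before the WAL flush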

Also, if the task is done in the background it may be done by another
backend, so several indexes could probably be packed at once using
different backend processes. We could use the entire CPU if our index
writes weren't tied to the session that wrote the row.

PS: I'm very interested in existing approaches like GIN or BRIN (this one
is new to me). Thanks a lot; I'll try them in my tests.