Re: [HACKERS] BRIN index and aborted transaction

2015-07-24 Thread Simon Riggs
On 23 July 2015 at 19:59, Josh Berkus j...@agliodbs.com wrote:

 On 07/23/2015 11:18 AM, Robert Haas wrote:
  Cool.  I'm not sure exactly what the right solution is either, but it
  seems like the current situation could very well lead to degrading
  index performance over time, with no way to put that right except to
  rebuild the index completely.  So it seems worth trying to improve
  things.

 As a reality check, if that was the situation, it wouldn't be the only
 type of index to have that problem.  Even our BTrees, with certain
 update patterns, need to be periodically rebuilt from scratch.


Agreed


 In other words, I don't think that fixing performance issues with BRIN
 indexes and frequenly-updated tables should be a blocker for 9.5.  Not
 clear on whether we're considering this an open item or not.


Definitely not, though I think we should document a few things better here.
I'll do that.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] BRIN index and aborted transaction

2015-07-23 Thread Tatsuo Ishii
 Hm, well, I am not sure that we want to pay the overhead of
 re-summarization every time we prune a single tuple from a block range.
 That's going to make vacuum much slower, I assume (without measuring);
 many page ranges are going to be re-summarized without this actually
 changing the range.

What I'm interested in is, if there's a case that using BRIN index is
slower than plain sequential scan (or planner is stupid enough to
choose BRIN index scan over sequential scan even if the former is
slower).

If such that case exists, we may want to fix it before releasing 9.5.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-23 Thread Alvaro Herrera
Tatsuo Ishii wrote:
  Hm, well, I am not sure that we want to pay the overhead of
  re-summarization every time we prune a single tuple from a block range.
  That's going to make vacuum much slower, I assume (without measuring);
  many page ranges are going to be re-summarized without this actually
  changing the range.
 
 What I'm interested in is, if there's a case that using BRIN index is
 slower than plain sequential scan (or planner is stupid enough to
 choose BRIN index scan over sequential scan even if the former is
 slower).

No, that shouldn't be an issue.  If brin degrades completely, the worst
that can happen is a seqscan.  There is very small overhead, but the
index itself is small and should be very quick to scan.

 If such that case exists, we may want to fix it before releasing 9.5.

I'm going to try to get the issue addressed in 9.5 along the lines we
discussed upthread.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-23 Thread Alvaro Herrera
Robert Haas wrote:

 Maybe I'm confused here, but it seems like the only time
 re-summarization can be needed is when tuples are pruned.  The mere
 act of deleting a tuple, even if the delete goes on to commit, doesn't
 create a scenario where re-summarization can work out to a win,
 because there may still be snapshots that can see it.  At the point
 where we prune the tuple, though, there might well be a benefit in
 re-summarizing, because now a newly-computed summary value won't need
 to cover a value that previously had to be there.

Hm, well, I am not sure that we want to pay the overhead of
re-summarization every time we prune a single tuple from a block range.
That's going to make vacuum much slower, I assume (without measuring);
many page ranges are going to be re-summarized without this actually
changing the range.

For minmax, it would work well to be able to tell whether the deleted
tuple had a value that was either the min or the max; if so it is
possible that the range can be decreased, otherwise not.  I'm not sure
that this would work for inclusion, though.  For geometric types it
means you check whether the value in the deleted tuple overlaps one of
the borders of the bounding box.  I don't know whether this actually
makes sense.  (The obvious thing, which is whether the value overlaps
the bounding box, is also obviously useless because all values overlap
the bounding box by definition.)

I think this would require a new support procedure for opclasses.

 But it seems obviously impractical to re-summarize when we HOT-prune,
 so it seems like the obvious thing to do is make vacuum do it.

Agreed.

 We know during phase one of vacuum whether we saw any dead tuples in
 page range X-Y; if yes, re-summarize.  The only reason not to do this
 is if it causes us to do a lot of resummarization that frequently
 fails to produce a smaller range. Do you have any experimental data
 suggesting that this is or is not a problem?

Well, the other issue is that vacuum is at arms length from a BRIN
index.  Vacuum doesn't provide the deleted-tuples array in a format
convenient for brin to access it; currently the only way we provide
access is a callback function that the index AM can call for every
single indexed TID to indicate whether it is to be removed or not.  BRIN
doesn't have TIDs, so it cannot call it usefully.  (We could make it
call once for every possible TID in a page, but that would be very
wasteful).

I guess we could provide a different callback that provides per-block
information rather than per-tuple; or perhaps something completely
different like simply the pointer to the deleted-TIDs array.
I vaguely recall somebody mentioned the current setup isn't great for
GIN either, so maybe we can find something that solves both cases?

I think this requires that BRIN calls heap_fetch() for each deleted
tuple as it is pruned.  This seems terrible from a performance point of
view.

There has to be a better way.  I'll give it a spin.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-23 Thread Josh Berkus
On 07/23/2015 11:18 AM, Robert Haas wrote:
 Cool.  I'm not sure exactly what the right solution is either, but it
 seems like the current situation could very well lead to degrading
 index performance over time, with no way to put that right except to
 rebuild the index completely.  So it seems worth trying to improve
 things.

As a reality check, if that was the situation, it wouldn't be the only
type of index to have that problem.  Even our BTrees, with certain
update patterns, need to be periodically rebuilt from scratch.

In other words, I don't think that fixing performance issues with BRIN
indexes and frequenly-updated tables should be a blocker for 9.5.  Not
clear on whether we're considering this an open item or not.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-23 Thread Robert Haas
On Wed, Jul 22, 2015 at 3:20 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Hm, well, I am not sure that we want to pay the overhead of
 re-summarization every time we prune a single tuple from a block range.
 That's going to make vacuum much slower, I assume (without measuring);
 many page ranges are going to be re-summarized without this actually
 changing the range.

 For minmax, it would work well to be able to tell whether the deleted
 tuple had a value that was either the min or the max; if so it is
 possible that the range can be decreased, otherwise not.  I'm not sure
 that this would work for inclusion, though.  For geometric types it
 means you check whether the value in the deleted tuple overlaps one of
 the borders of the bounding box.  I don't know whether this actually
 makes sense.  (The obvious thing, which is whether the value overlaps
 the bounding box, is also obviously useless because all values overlap
 the bounding box by definition.)

 I think this would require a new support procedure for opclasses.

Yeah, you could have something that basically says If SUMMARY didn't
need to cover VALUE, could that change the result?.  A stupid opclass
could always return true.  A minmax opclass could return true if the
value is the min or max, and false otherwise.  etc.

 We know during phase one of vacuum whether we saw any dead tuples in
 page range X-Y; if yes, re-summarize.  The only reason not to do this
 is if it causes us to do a lot of resummarization that frequently
 fails to produce a smaller range. Do you have any experimental data
 suggesting that this is or is not a problem?

 Well, the other issue is that vacuum is at arms length from a BRIN
 index.  Vacuum doesn't provide the deleted-tuples array in a format
 convenient for brin to access it; currently the only way we provide
 access is a callback function that the index AM can call for every
 single indexed TID to indicate whether it is to be removed or not.  BRIN
 doesn't have TIDs, so it cannot call it usefully.  (We could make it
 call once for every possible TID in a page, but that would be very
 wasteful).

 I guess we could provide a different callback that provides per-block
 information rather than per-tuple; or perhaps something completely
 different like simply the pointer to the deleted-TIDs array.
 I vaguely recall somebody mentioned the current setup isn't great for
 GIN either, so maybe we can find something that solves both cases?

 I think this requires that BRIN calls heap_fetch() for each deleted
 tuple as it is pruned.  This seems terrible from a performance point of
 view.

 There has to be a better way.  I'll give it a spin.

Cool.  I'm not sure exactly what the right solution is either, but it
seems like the current situation could very well lead to degrading
index performance over time, with no way to put that right except to
rebuild the index completely.  So it seems worth trying to improve
things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-21 Thread Robert Haas
On Sat, Jul 18, 2015 at 5:11 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Yeah, that's a bit of an open problem: we don't have any mechanism to
 mark a block range as needing resummarization, yet.  I don't have any
 great ideas there, TBH.  Some options that were discussed but never led
 anywhere:

 1. whenever a heap tuple is deleted that's minimum or maximum for a
 column, mark the index tuple as needing resummarization.  One a future
 vacuuming pass the index would be updated.  (I think this works for
 minmax, but I don't see how to apply it to inclusion).

 2. have block ranges be resummarized randomly during vacuum.

 3. Have index tuples last for only X number of transactions, marking the
 as needing summarization when that expires.

 4. Have a user-invoked function that re-runs summarization.  That way
 the user can implement any of the above policies, or others.

Maybe I'm confused here, but it seems like the only time
re-summarization can be needed is when tuples are pruned.  The mere
act of deleting a tuple, even if the delete goes on to commit, doesn't
create a scenario where re-summarization can work out to a win,
because there may still be snapshots that can see it.  At the point
where we prune the tuple, though, there might well be a benefit in
re-summarizing, because now a newly-computed summary value won't need
to cover a value that previously had to be there.

But it seems obviously impractical to re-summarize when we HOT-prune,
so it seems like the obvious thing to do is make vacuum do it.  We
know during phase one of vacuum whether we saw any dead tuples in page
range X-Y; if yes, re-summarize.  The only reason not to do this is if
it causes us to do a lot of resummarization that frequently fails to
produce a smaller range. Do you have any experimental data suggesting
that this is or is not a problem?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-19 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Tatsuo Ishii wrote:
 
  When a transaction aborts, it seems a BRIN index leaves summary data
  which is not valid any more. Is this an expected behavior?  I guess
  the answer is yes, because it does not affect correctness of a query
  result, but I would like to make sure.
 
 You're right, that is not rolled back (just like any other index type,
 actually).

Let me clarify this a bit.  Summarization normally takes place during
vacuum (or upon the brin_summarize_new_ranges() function being called on
the table).  If the INSERT adds tuples to a page in a range that has
already been summarized, then the summary tuple for that page range will
be updated to cover the to-be-aborted tuples.  On the other hand, if the
INSERT adds tuples to a page that is not summarized, there is no summary
tuple to update; and the subsequent vacuum will remove those tuples
before summarizing the range, so they will not appear in the summary
tuple.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-18 Thread Alvaro Herrera
Tatsuo Ishii wrote:

 When a transaction aborts, it seems a BRIN index leaves summary data
 which is not valid any more. Is this an expected behavior?  I guess
 the answer is yes, because it does not affect correctness of a query
 result, but I would like to make sure.

You're right, that is not rolled back (just like any other index type,
actually).

 Second question is when the wrong summary data is gone? It seems
 vacuum does not help. Do I have to recreate the index (or reindex)?

Yeah, that's a bit of an open problem: we don't have any mechanism to
mark a block range as needing resummarization, yet.  I don't have any
great ideas there, TBH.  Some options that were discussed but never led
anywhere:

1. whenever a heap tuple is deleted that's minimum or maximum for a
column, mark the index tuple as needing resummarization.  One a future
vacuuming pass the index would be updated.  (I think this works for
minmax, but I don't see how to apply it to inclusion).

2. have block ranges be resummarized randomly during vacuum.

3. Have index tuples last for only X number of transactions, marking the
as needing summarization when that expires.

4. Have a user-invoked function that re-runs summarization.  That way
the user can implement any of the above policies, or others.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-18 Thread Tatsuo Ishii
Alvaro,

Thank you for the explanation. It's really helpfull.

 Second question is when the wrong summary data is gone? It seems
 vacuum does not help. Do I have to recreate the index (or reindex)?
 
 Yeah, that's a bit of an open problem: we don't have any mechanism to
 mark a block range as needing resummarization, yet.  I don't have any
 great ideas there, TBH.  Some options that were discussed but never led
 anywhere:
 
 1. whenever a heap tuple is deleted that's minimum or maximum for a
 column, mark the index tuple as needing resummarization.  One a future
 vacuuming pass the index would be updated.  (I think this works for
 minmax, but I don't see how to apply it to inclusion).
 
 2. have block ranges be resummarized randomly during vacuum.
 
 3. Have index tuples last for only X number of transactions, marking the
 as needing summarization when that expires.
 
 4. Have a user-invoked function that re-runs summarization.  That way
 the user can implement any of the above policies, or others.

What about doing resummarization while rechecking the heap data?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] BRIN index and aborted transaction

2015-07-17 Thread Tatsuo Ishii
Forgive me if this has been already discussed somewhere.

When a transaction aborts, it seems a BRIN index leaves summary data
which is not valid any more. Is this an expected behavior?  I guess
the answer is yes, because it does not affect correctness of a query
result, but I would like to make sure.

Second question is when the wrong summary data is gone? It seems
vacuum does not help. Do I have to recreate the index (or reindex)?

test=# begin;
BEGIN
test=# insert into t1 values(101);
INSERT 0 1
test=# SELECT * FROM brin_page_items(get_raw_page('brinidx', 2),'brinidx');
 itemoffset | blknum | attnum | allnulls | hasnulls | placeholder |
value
+++--+--+-+-
  1 |  0 |  1 | f| f| f   | {1 .. 28928}
  2 |128 |  1 | f| f| f   | {28929 .. 
57856}
  3 |256 |  1 | f| f| f   | {57857 .. 
86784}
[snip]
 34 |   4224 |  1 | f| f| f   | {954625 .. 
983552}
 35 |   4352 |  1 | f| f| f   | {983553 .. 
101}
(35 rows)

test=# abort;
ROLLBACK
test=# SELECT * FROM brin_page_items(get_raw_page('brinidx', 2),'brinidx');
 itemoffset | blknum | attnum | allnulls | hasnulls | placeholder |
value
+++--+--+-+-
  1 |  0 |  1 | f| f| f   | {1 .. 28928}
  2 |128 |  1 | f| f| f   | {28929 .. 
57856}
  3 |256 |  1 | f| f| f   | {57857 .. 
86784}
[snip]
 34 |   4224 |  1 | f| f| f   | {954625 .. 
983552}
 35 |   4352 |  1 | f| f| f   | {983553 .. 
101}
(35 rows)

test=# vacuum t1;
VACUUM
test=# SELECT * FROM brin_page_items(get_raw_page('brinidx', 2),'brinidx');
 itemoffset | blknum | attnum | allnulls | hasnulls | placeholder |
value
+++--+--+-+-
  1 |  0 |  1 | f| f| f   | {1 .. 28928}
  2 |128 |  1 | f| f| f   | {28929 .. 
57856}
  3 |256 |  1 | f| f| f   | {57857 .. 
86784}
[snip]
 33 |   4096 |  1 | f| f| f   | {925697 .. 
954624}
 34 |   4224 |  1 | f| f| f   | {954625 .. 
983552}
 35 |   4352 |  1 | f| f| f   | {983553 .. 
101}
(35 rows)

test=# select max(i) from t1;
   max   
-
 100
(1 row)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers