Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-04 Thread Simon Riggs
On 4 August 2016 at 18:27, Bruce Momjian  wrote:

>> > Also, why not use this bitmap for all indexes, not just update chains?
>>
>> I don't understand where you get this update chains thing from.
>>
>> The bitmap can apply to multiple tuples on one page, which is described.
>
> I am asking if we could use this idea for all rows on a page, not just
> HOT updated, e.g. we have five rows that have col1=4 in an index --- why
> not just use on index pointer for all of them, with a bitmap to point to
> the possible matches?

Yes. I described that...

"The same situation can occur if we have many INSERTs with same values
on the same block. This could lead to a reduction in size of the btree
for indexes with many duplicate values."

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-04 Thread Claudio Freire
On Thu, Aug 4, 2016 at 1:58 PM, Simon Riggs  wrote:
> On 3 August 2016 at 20:37, Claudio Freire  wrote:
>> On Wed, Aug 3, 2016 at 4:20 AM, Simon Riggs  wrote:
>>> == IndexScan ==
>>>
>>> Note that the executor code for IndexScan appears identical between
>>> the two optimizations. The difference between duplicate and range LITE
>>> tuples is needed only at INSERT time (or UPDATE indexed column to a
>>> new value).
>>>
>>> When we do an IndexScan if we see a LITE tuple we do a scan of the
>>> linepointer ranges covered by this index tuple (which might eventually
>>> go to a full block scan). For example with bit1 set we would scan 16
>>> linepointers (on an 8192 byte block) and with 2 bits set we would scan
>>> 32 linepointers etc.., not necessarily consecutive ranges. So the
>>> IndexScan can return potentially many heap rows per index entry, which
>>> need to be re-checked and may also need to be sorted prior to
>>> returning them. If no rows are returned we can kill the index pointer,
>>> just as we do now for btrees, so they will be removed eventually from
>>> the index without the need for VACUUM. An BitmapIndexScan that sees a
>>> lossy pointer can also use the lossy TID concept, so we can have
>>> partially lossy bitmaps.
>>
>> Wouldn't this risk scanning rows more than once unless it's part of a
>> bitmap scan?
>
> It's always the job of the index insert logic to ensure a valid index exists.
>
> I'm not sure what you see that changes that here. Say more?

I just explained it further in the WARM thread, which is actually the
idea I was arriving to.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-04 Thread Bruce Momjian
On Thu, Aug  4, 2016 at 06:03:34PM +0100, Simon Riggs wrote:
> On 4 August 2016 at 02:13, Bruce Momjian  wrote:
> 
> > How do plan to clear the bitmask so it, over time, doesn't end up being
> > all-set?
> 
> I don't have a plan, though thoughts welcome.
> 
> Similar situation that our current indexes don't recover from bloat, a
> situation made worse by non-HOT updates.
> 
> So, we have a choice of which disadvantage to accept.
> 
> 
> > Also, why not use this bitmap for all indexes, not just update chains?
> 
> I don't understand where you get this update chains thing from.
> 
> The bitmap can apply to multiple tuples on one page, which is described.

I am asking if we could use this idea for all rows on a page, not just
HOT updated, e.g. we have five rows that have col1=4 in an index --- why
not just use on index pointer for all of them, with a bitmap to point to
the possible matches?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-04 Thread Simon Riggs
On 4 August 2016 at 02:13, Bruce Momjian  wrote:

> How do plan to clear the bitmask so it, over time, doesn't end up being
> all-set?

I don't have a plan, though thoughts welcome.

Similar situation that our current indexes don't recover from bloat, a
situation made worse by non-HOT updates.

So, we have a choice of which disadvantage to accept.


> Also, why not use this bitmap for all indexes, not just update chains?

I don't understand where you get this update chains thing from.

The bitmap can apply to multiple tuples on one page, which is described.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-04 Thread Simon Riggs
On 3 August 2016 at 20:37, Claudio Freire  wrote:
> On Wed, Aug 3, 2016 at 4:20 AM, Simon Riggs  wrote:
>> == IndexScan ==
>>
>> Note that the executor code for IndexScan appears identical between
>> the two optimizations. The difference between duplicate and range LITE
>> tuples is needed only at INSERT time (or UPDATE indexed column to a
>> new value).
>>
>> When we do an IndexScan if we see a LITE tuple we do a scan of the
>> linepointer ranges covered by this index tuple (which might eventually
>> go to a full block scan). For example with bit1 set we would scan 16
>> linepointers (on an 8192 byte block) and with 2 bits set we would scan
>> 32 linepointers etc.., not necessarily consecutive ranges. So the
>> IndexScan can return potentially many heap rows per index entry, which
>> need to be re-checked and may also need to be sorted prior to
>> returning them. If no rows are returned we can kill the index pointer,
>> just as we do now for btrees, so they will be removed eventually from
>> the index without the need for VACUUM. An BitmapIndexScan that sees a
>> lossy pointer can also use the lossy TID concept, so we can have
>> partially lossy bitmaps.
>
> Wouldn't this risk scanning rows more than once unless it's part of a
> bitmap scan?

It's always the job of the index insert logic to ensure a valid index exists.

I'm not sure what you see that changes that here. Say more?

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-03 Thread Claudio Freire
On Wed, Aug 3, 2016 at 4:37 PM, Claudio Freire  wrote:
> On Wed, Aug 3, 2016 at 4:20 AM, Simon Riggs  wrote:
>> == IndexScan ==
>>
>> Note that the executor code for IndexScan appears identical between
>> the two optimizations. The difference between duplicate and range LITE
>> tuples is needed only at INSERT time (or UPDATE indexed column to a
>> new value).
>>
>> When we do an IndexScan if we see a LITE tuple we do a scan of the
>> linepointer ranges covered by this index tuple (which might eventually
>> go to a full block scan). For example with bit1 set we would scan 16
>> linepointers (on an 8192 byte block) and with 2 bits set we would scan
>> 32 linepointers etc.., not necessarily consecutive ranges. So the
>> IndexScan can return potentially many heap rows per index entry, which
>> need to be re-checked and may also need to be sorted prior to
>> returning them. If no rows are returned we can kill the index pointer,
>> just as we do now for btrees, so they will be removed eventually from
>> the index without the need for VACUUM. An BitmapIndexScan that sees a
>> lossy pointer can also use the lossy TID concept, so we can have
>> partially lossy bitmaps.
>
> Wouldn't this risk scanning rows more than once unless it's part of a
> bitmap scan?

I think an alternative that would not have such a problem and would
have a smaller impact both in performance and code, would be to just
not add new index tuples when the update happens on the same page,
easily checked by comparing the old tuple's t_ctid against the new's.
Index scans would have to follow same-page update chains, and perhaps
vacuum would need some fixing too, but that would probably be less of
a performance hit than the proposed lossy item pointers.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-03 Thread Bruce Momjian
On Wed, Aug  3, 2016 at 08:34:02PM -0400, Bruce Momjian wrote:
> On Thu, Aug  4, 2016 at 01:16:20AM +0100, Simon Riggs wrote:
> > > Would you only add a LITE index entry when there isn't an
> > > existing index entry for the same values and heap page?  That seems
> > > quite complicated.
> > 
> > The insertion algorithm is described. Doesn't seem complicated to me.
> 
> Ah, I see it now:
> 
>   As UPDATEs occur we request inserts into the index. If a lossy index
>   pointer already exists that covers the new linepointer then we skip
>   the index insert, optimizing writes to the index.
> 
> I thought you meant "already exists" just means it matches the existing
> range.  I was unclear that was the entire page.

How do plan to clear the bitmask so it, over time, doesn't end up being
all-set?  I can see how a non-LITE index tuple would become a LITE tuple
when an update chain happens on the same page, but then new matching
index values would also set the bitmap, update-chain or not.  Then, when
the update chain is gone and only non-update-chain tuples exist, you
will need to remove the bitmap and create new index entries by scanning
the heap page.

Also, what happens when you have two non-LITE index tuples in a heap
page, and then you create an update chain in the heap page --- do you
remove one of the two index entries and create a bitmap out of the
remaining one?  Do you put them back when the heap chain is gone?

Also, why not use this bitmap for all indexes, not just update chains? 
Because it would break index-only scans?

I think there are two use-cases for LITE --- first, it would shrink
indexes for tables with many duplicate indexed values on the same page. 
In the case Simon mentioned, it would help with update chain churn, but
the problem is it seems to cause a lot of overhead to create and remove
the bitmap in the index.  This is why I was thinking of per-update-chain
LITE entries.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-03 Thread Bruce Momjian
On Thu, Aug  4, 2016 at 01:16:20AM +0100, Simon Riggs wrote:
> > Would you only add a LITE index entry when there isn't an
> > existing index entry for the same values and heap page?  That seems
> > quite complicated.
> 
> The insertion algorithm is described. Doesn't seem complicated to me.

Ah, I see it now:

As UPDATEs occur we request inserts into the index. If a lossy index
pointer already exists that covers the new linepointer then we skip
the index insert, optimizing writes to the index.

I thought you meant "already exists" just means it matches the existing
range.  I was unclear that was the entire page.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-03 Thread Bruce Momjian
On Thu, Aug  4, 2016 at 01:16:20AM +0100, Simon Riggs wrote:
> On 4 August 2016 at 00:56, Bruce Momjian  wrote:
> > On Wed, Aug  3, 2016 at 07:28:52PM -0400, Bruce Momjian wrote:
> >> With LITE, you can avoid the creation of duplicate-value index entries
> >> for indexes without changed column values by using a bitmap in place of
> >> the tid item number (16 bits).  It can't remove dead tids.
> >
> > How would you handle the case where there are two LITE index entries
> > pointing to two different update chains on the same page?
> > When you
> > search the page for the first heap chain, could the second index entry
> > find the same chain.  How would you know which index entry is which
> > chain?
> 
> It's easiest to understand this is by imagining each LITE pointer
> pointing to a whole page. The chains aren't followed during the scan,
> individual heap tuple versions are treated as they would be by a seq
> scan.
> 
> That is more expensive than we might like, so the bitmap/linepointer
> thing is just an extra tweak to avoid scanning the whole block. The
> bitmap refers to ranges of linepointers, not chains. i.e. 0-15, 16-31,
> 32-47 etc

Well, there is no way to know how many linepointers there are on a page,
so doing "mod 16" automatically hashes the line pointers into the 16-bit
field.

> > Would you only add a LITE index entry when there isn't an
> > existing index entry for the same values and heap page?  That seems
> > quite complicated.
> 
> The insertion algorithm is described. Doesn't seem complicated to me.

OK.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-03 Thread Simon Riggs
On 4 August 2016 at 00:56, Bruce Momjian  wrote:
> On Wed, Aug  3, 2016 at 07:28:52PM -0400, Bruce Momjian wrote:
>> With LITE, you can avoid the creation of duplicate-value index entries
>> for indexes without changed column values by using a bitmap in place of
>> the tid item number (16 bits).  It can't remove dead tids.
>
> How would you handle the case where there are two LITE index entries
> pointing to two different update chains on the same page?
> When you
> search the page for the first heap chain, could the second index entry
> find the same chain.  How would you know which index entry is which
> chain?

It's easiest to understand this is by imagining each LITE pointer
pointing to a whole page. The chains aren't followed during the scan,
individual heap tuple versions are treated as they would be by a seq
scan.

That is more expensive than we might like, so the bitmap/linepointer
thing is just an extra tweak to avoid scanning the whole block. The
bitmap refers to ranges of linepointers, not chains. i.e. 0-15, 16-31,
32-47 etc

> Would you only add a LITE index entry when there isn't an
> existing index entry for the same values and heap page?  That seems
> quite complicated.

The insertion algorithm is described. Doesn't seem complicated to me.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-03 Thread Bruce Momjian
On Wed, Aug  3, 2016 at 07:28:52PM -0400, Bruce Momjian wrote:
> With LITE, you can avoid the creation of duplicate-value index entries
> for indexes without changed column values by using a bitmap in place of
> the tid item number (16 bits).  It can't remove dead tids.

How would you handle the case where there are two LITE index entries
pointing to two different update chains on the same page?  When you
search the page for the first heap chain, could the second index entry
find the same chain.  How would you know which index entry is which
chain?  Would you only add a LITE index entry when there isn't an
existing index entry for the same values and heap page?  That seems
quite complicated.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-03 Thread Bruce Momjian
On Wed, Aug  3, 2016 at 08:20:49AM +0100, Simon Riggs wrote:
> == Update Duplicate Removal ==
> 
> We want an optimization that reduces the effects of multiple UPDATEs
> on the same block that have duplicate values caused because another
> index column has been updated and a non-HOT index insert has been
> generated. This optimizes the case where someone has 12 indexes yet
> updates just one of them, so we optimize the 11 unnecessary updates,
> if possible.
> 
> As UPDATEs occur we request inserts into the index. If a lossy index
> pointer already exists that covers the new linepointer then we skip
> the index insert, optimizing writes to the index. If a lossy index
> pointer already exists for that value but doesn't yet cover our
> linepointer then we update the bitmask. If a non-lossy index pointer
> already exists we set the lossy bit and update the bitmask to reflect
> which linepointers the index entry covers. If no index pointer exists
> we insert one. The first update on a row will not be optimized, but
> the 2nd - 16th could be. These actions optimize away most of the
> writes and WAL activity to the index data block since only 1 in every
> 16 updates would cause changes to the block (actually about 1 in 10 on
> average). Most importantly, updates will cause far fewer index block
> splits and reindexing will be less needed.
> 
> The same situation can occur if we have many INSERTs with same values
> on the same block. This could lead to a reduction in size of the btree
> for indexes with many duplicate values.

Let me see if I understand this.  Right now, if no indexed values are
changed and the old and new update rows are in the same page, we don't
create a new index entry for the new row, but rather use redirect tid
pointers.  We can recycle tuple data and tid pointers for frequent
updates because there is only one index entry for the entire update
chain.

This doesn't work if we changed an indexed value.  Your solution is not
to use redirect tid/line pointers at all.  Instead, you create normal
index pointers for the indexes with changed column values.  For the
indexes without changed column values, you create this new LITE index
entry which can point to all the update-chain tuples in the same page.

In Postgres currently, we can remove tuple data when it is expired, and
tids if they are part of redirect chains (no index changes), and need
vacuum to remove non-redirect tids and old index entries.

With LITE, you can avoid the creation of duplicate-value index entries
for indexes without changed column values by using a bitmap in place of
the tid item number (16 bits).  It can't remove dead tids.

I had some ideas of how to handle the bitmap, like doing "mod 16" on the
tid item value, meaning we would have 16 equal-sized buckets.  Assuming
tuples are 100 bytes, that makes each bucket hold five rows on average.

However, I am now wondering why bother with the bitmap at all.  Can't
the LITE item pointer just point to the head of the update chain, and we
can walk the chain to find the tuple that matches our snapshot?  (The
tuple has a pointer to the next entry in the chain.)

Basically, right now with an update that changes values in some indexes
and not others, and the tuples are all on the same page, we are creating
multiple index pointers to point to different tuples on the same page. 
I am suggesting we don't do that, and instead just leave the index tuple
pointing to the head of the chain.

I think the big problem with this is that removing the tuple data will
mark also remove the rows update chain pointer.  Maybe we can modify tid
redirect pointers to handle this.

> == Clustered Index ==

I read about clustered indexes and I am not sure of how it would help in
most cases.  Are there enough index entries with identical or
sequentially-grouped values to be useful?  It kind of feels like a
single-page BRIN index.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lossy Index Tuple Enhancement (LITE)

2016-08-03 Thread Claudio Freire
On Wed, Aug 3, 2016 at 4:20 AM, Simon Riggs  wrote:
> == IndexScan ==
>
> Note that the executor code for IndexScan appears identical between
> the two optimizations. The difference between duplicate and range LITE
> tuples is needed only at INSERT time (or UPDATE indexed column to a
> new value).
>
> When we do an IndexScan if we see a LITE tuple we do a scan of the
> linepointer ranges covered by this index tuple (which might eventually
> go to a full block scan). For example with bit1 set we would scan 16
> linepointers (on an 8192 byte block) and with 2 bits set we would scan
> 32 linepointers etc.., not necessarily consecutive ranges. So the
> IndexScan can return potentially many heap rows per index entry, which
> need to be re-checked and may also need to be sorted prior to
> returning them. If no rows are returned we can kill the index pointer,
> just as we do now for btrees, so they will be removed eventually from
> the index without the need for VACUUM. An BitmapIndexScan that sees a
> lossy pointer can also use the lossy TID concept, so we can have
> partially lossy bitmaps.

Wouldn't this risk scanning rows more than once unless it's part of a
bitmap scan?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers