Re: GIN pending list pages not recycled promptly (was Re: [HACKERS] GIN improvements part 1: additional information)

2014-06-18 Thread Amit Langote
On Wed, Jan 22, 2014 at 9:12 PM, Heikki Linnakangas
 wrote:
> On 01/22/2014 03:39 AM, Tomas Vondra wrote:
>>
>> What annoys me a bit is the huge size difference between the index
>> updated incrementally (by a sequence of INSERT commands), and the index
>> rebuilt from scratch using VACUUM FULL. It's a bit better with the patch
>> (2288 vs. 2035 MB), but is there a chance to improve this?
>
>
> Hmm. What seems to be happening is that pending item list pages that the
> fast update mechanism uses are not getting recycled. When enough list pages
> are filled up, they are flushed into the main index and the list pages are
> marked as deleted. But they are not recorded in the FSM, so they won't be
> recycled until the index is vacuumed. Almost all of the difference can be
> attributed to deleted pages left behind like that.
>
> So this isn't actually related to the packed postinglists patch at all. It
> just makes the bloat more obvious, because it makes the actual size of the
> index size, excluding deleted pages, smaller. But it can be observed on git
> master as well:
>
> I created a simple test table and index like this:
>
> create table foo (intarr int[]);
> create index i_foo on foo using gin(intarr) with (fastupdate=on);
>
> I filled the table like this:
>
> insert into foo select array[-1] from generate_series(1, 1000) g;
>
> postgres=# \d+i
>List of relations
>  Schema | Name | Type  | Owner  |  Size  | Description
> +--+---+++-
>  public | foo  | table | heikki | 575 MB |
> (1 row)
>
> postgres=# \di+
>List of relations
>  Schema | Name  | Type  | Owner  | Table |  Size  | Description
> +---+---++---++-
>  public | i_foo | index | heikki | foo   | 251 MB |
> (1 row)
>
> I wrote a little utility that scans all pages in a gin index, and prints out
> the flags indicating what kind of a page it is. The distribution looks like
> this:
>
>  19 DATA
>7420 DATA LEAF
>   24701 DELETED
>   1 LEAF
>   1 META
>
> I think we need to add the deleted pages to the FSM more aggressively.
>
> I tried simply adding calls to RecordFreeIndexPage, after the list pages
> have been marked as deleted, but unfortunately that didn't help. The problem
> is that the FSM is organized into a three-level tree, and
> RecordFreeIndexPage only updates the bottom level. The upper levels are not
> updated until the FSM is vacuumed, so the pages are still not visible to
> GetFreeIndexPage calls until next vacuum. The simplest fix would be to add a
> call to IndexFreeSpaceMapVacuum after flushing the pending list, per
> attached patch. I'm slightly worried about the performance impact of the
> IndexFreeSpaceMapVacuum() call. It scans the whole FSM of the index, which
> isn't exactly free. So perhaps we should teach RecordFreeIndexPage to update
> the upper levels of the FSM in a retail-fashion instead.
>

I wonder if you pursued this further?

You recently added a number of TODO items related to GIN index; is it
worth adding this to the list?

--
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: GIN pending list pages not recycled promptly (was Re: [HACKERS] GIN improvements part 1: additional information)

2014-01-22 Thread Alvaro Herrera
Heikki Linnakangas wrote:

> I wrote a little utility that scans all pages in a gin index, and
> prints out the flags indicating what kind of a page it is. The
> distribution looks like this:
> 
>  19 DATA
>7420 DATA LEAF
>   24701 DELETED
>   1 LEAF
>   1 META

Hah.

> I think we need to add the deleted pages to the FSM more aggressively.
>
> I tried simply adding calls to RecordFreeIndexPage, after the list
> pages have been marked as deleted, but unfortunately that didn't
> help. The problem is that the FSM is organized into a three-level
> tree, and RecordFreeIndexPage only updates the bottom level.

Interesting.  I think the idea of having an option for RecordFreeIndexPage
to update upper levels makes sense (no need to force it for other
users.)

Some time ago I proposed an index-only cleanup for vacuum.  That would
help GIN get this kind of treatment (vacuuming its FSM and processing
the pending list) separately from vacuuming the index.  It's probably
too late for 9.4 though.

One other thing worth considering in this area is that making the
pending list size depend on work_mem appears to have been a really bad
idea.  I know one case where the server is really large and seems to run
mostly OLAP type stuff with occasional updates, so they globally set
work_mem=2GB; they have GIN indexes for text search, and the result is
horrible performance 90% of the time, then a vacuum cleans the pending
list and it is blazing fast until the pending list starts getting big
again.  Now you can argue that setting work_mem to that value is a bad
idea, but as it turns out, in this case other than the GIN pending list
it seems to work fine.

Not related to the patch at hand, but I thought I would out it for
consideration, 'cause I'm not gonna start a new thread about it.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


GIN pending list pages not recycled promptly (was Re: [HACKERS] GIN improvements part 1: additional information)

2014-01-22 Thread Heikki Linnakangas

On 01/22/2014 03:39 AM, Tomas Vondra wrote:

What annoys me a bit is the huge size difference between the index
updated incrementally (by a sequence of INSERT commands), and the index
rebuilt from scratch using VACUUM FULL. It's a bit better with the patch
(2288 vs. 2035 MB), but is there a chance to improve this?


Hmm. What seems to be happening is that pending item list pages that the 
fast update mechanism uses are not getting recycled. When enough list 
pages are filled up, they are flushed into the main index and the list 
pages are marked as deleted. But they are not recorded in the FSM, so 
they won't be recycled until the index is vacuumed. Almost all of the 
difference can be attributed to deleted pages left behind like that.


So this isn't actually related to the packed postinglists patch at all. 
It just makes the bloat more obvious, because it makes the actual size 
of the index size, excluding deleted pages, smaller. But it can be 
observed on git master as well:


I created a simple test table and index like this:

create table foo (intarr int[]);
create index i_foo on foo using gin(intarr) with (fastupdate=on);

I filled the table like this:

insert into foo select array[-1] from generate_series(1, 1000) g;

postgres=# \d+i
   List of relations
 Schema | Name | Type  | Owner  |  Size  | Description
+--+---+++-
 public | foo  | table | heikki | 575 MB |
(1 row)

postgres=# \di+
   List of relations
 Schema | Name  | Type  | Owner  | Table |  Size  | Description
+---+---++---++-
 public | i_foo | index | heikki | foo   | 251 MB |
(1 row)

I wrote a little utility that scans all pages in a gin index, and prints 
out the flags indicating what kind of a page it is. The distribution 
looks like this:


 19 DATA
   7420 DATA LEAF
  24701 DELETED
  1 LEAF
  1 META

I think we need to add the deleted pages to the FSM more aggressively.

I tried simply adding calls to RecordFreeIndexPage, after the list pages 
have been marked as deleted, but unfortunately that didn't help. The 
problem is that the FSM is organized into a three-level tree, and 
RecordFreeIndexPage only updates the bottom level. The upper levels are 
not updated until the FSM is vacuumed, so the pages are still not 
visible to GetFreeIndexPage calls until next vacuum. The simplest fix 
would be to add a call to IndexFreeSpaceMapVacuum after flushing the 
pending list, per attached patch. I'm slightly worried about the 
performance impact of the IndexFreeSpaceMapVacuum() call. It scans the 
whole FSM of the index, which isn't exactly free. So perhaps we should 
teach RecordFreeIndexPage to update the upper levels of the FSM in a 
retail-fashion instead.


- Heikki
*** a/src/backend/access/gin/ginfast.c
--- b/src/backend/access/gin/ginfast.c
***
*** 21,26 
--- 21,27 
  #include "access/gin_private.h"
  #include "commands/vacuum.h"
  #include "miscadmin.h"
+ #include "storage/indexfsm.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
  
***
*** 434,440  ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
--- 435,453 
  	END_CRIT_SECTION();
  
  	if (needCleanup)
+ 	{
  		ginInsertCleanup(ginstate, false, NULL);
+ 
+ 		/*
+ 		 * Vacuum the FSM to make the deleted list pages available for re-use.
+ 		 *
+ 		 * gininsertCleanup marked them as free in the FSM by calling
+ 		 * RecordFreeIndexPage, but that only updates the bottom FSM level.
+ 		 * Since GetFreeIndexPage scans from the top, they won't be visible
+ 		 * for reuse until the upper levels have been updated.
+ 		 */
+ 		IndexFreeSpaceMapVacuum(index);
+ 	}
  }
  
  /*
***
*** 599,608  shiftList(Relation index, Buffer metabuffer, BlockNumber newHead,
  			}
  		}
  
  		for (i = 0; i < data.ndeleted; i++)
  			UnlockReleaseBuffer(buffers[i]);
! 
! 		END_CRIT_SECTION();
  	} while (blknoToDelete != newHead);
  
  	return false;
--- 612,625 
  			}
  		}
  
+ 		END_CRIT_SECTION();
+ 
  		for (i = 0; i < data.ndeleted; i++)
+ 		{
+ 			BlockNumber blkno = BufferGetBlockNumber(buffers[i]);
  			UnlockReleaseBuffer(buffers[i]);
! 			RecordFreeIndexPage(index, blkno);
! 		}
  	} while (blknoToDelete != newHead);
  
  	return false;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers