Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-08-02 Thread Simon Riggs

On Sat, 2008-08-02 at 00:24 -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  I think it makes sense to commit this patch now, per previous
  discussions on which we have agreed to make incremental changes.
 
 Yeah, but at the same time there is merit in the argument that the
 proposed patch hasn't actually been proven to be usable for anything.
 I would be a lot happier if there were even a trivial proof-of-concept
 plugin example submitted with it, just to prove that there were no
 showstopper problems in the plugin design, like failure to pass
 essential information or not getting the locking straight.

Plugins were my other patch. I did originally submit a version with
changes, but this patch was specifically a version with *no* external
behaviour changes, to form a base from which various people's ideas
might be explored.

  I'm just wondering if the change of usage_count from 16 to 8 bits was
  discussed and agreed?
 
 Umm ... it was not, but given that we have logic in there to limit the
 usage_count to 5 or so, it's hard to argue that there's a big problem.

It was discussed and it was Tom's suggestion to do this. I agreed!

 I confess to not having read the patch in detail --- where did the other
 8 bits go to?

Keeping track of the number of hints set on a block since last write.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-08-01 Thread Alvaro Herrera
Simon Riggs wrote:

 The first half is actually quite large, but that makes it even more
 sensible to commit this part now.
 
 The enclosed patch introduces the machinery by which we might later
 optimise hint bit setting. It differentiates between hint bit setting
 and block dirtying, when the distinction can safely be made. It acts
 safely during VACUUM and correctly during checkpoint. In all other
 respects it emulates current behaviour.

I think it makes sense to commit this patch now, per previous
discussions on which we have agreed to make incremental changes.  I
think we should just get rid of the bogus changes Pavan identified.

I'm just wondering if the change of usage_count from 16 to 8 bits was
discussed and agreed?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-08-01 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 I think it makes sense to commit this patch now, per previous
 discussions on which we have agreed to make incremental changes.

Yeah, but at the same time there is merit in the argument that the
proposed patch hasn't actually been proven to be usable for anything.
I would be a lot happier if there were even a trivial proof-of-concept
plugin example submitted with it, just to prove that there were no
showstopper problems in the plugin design, like failure to pass
essential information or not getting the locking straight.

 I'm just wondering if the change of usage_count from 16 to 8 bits was
 discussed and agreed?

Umm ... it was not, but given that we have logic in there to limit the
usage_count to 5 or so, it's hard to argue that there's a big problem.

I confess to not having read the patch in detail --- where did the other
8 bits go to?

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-07-22 Thread Simon Riggs

On Mon, 2008-07-21 at 16:33 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Mon, 2008-07-21 at 11:36 +0530, Pavan Deolasee wrote:
  I think we should try at least one or two possible optimizations and
  get some numbers before we jump and make substantial changes to the
  code.
 
  You know you're suggesting months of tests and further discussion. :-(
 
 I agree with Pavan that we should have something that'd at least serve
 as test scaffolding to verify that the framework patch is sane.  The
 test code needn't be anything we'd want to commit.  

The test code is/was there, in the sense that the patch was (supposed
to) do exactly what it does now, just with extra code to keep track of
hint counts.

Probably the most important point is not yet covered: we don't keep any
track of which blocks are dirtied solely for hint bits. We need to do
this so we can measure the efficacy of *any* patch that seeks to improve
the current situation.

The best time to do this is in integration phase of release, so will do
it then.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-07-21 Thread Simon Riggs

On Mon, 2008-07-21 at 11:36 +0530, Pavan Deolasee wrote:

 I think we should try at least one or two possible optimizations and
 get some numbers before we jump and make substantial changes to the
 code.

You know you're suggesting months of tests and further discussion. :-(

I'll fix the bug, but I'm not doing any more on this now till feature
freeze. It's the wrong time to chase mirages.

Thanks for checking my logic.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-06-30 Thread Simon Riggs

On Fri, 2008-06-27 at 16:02 +0100, Simon Riggs wrote:
 The patch splits into two parts:
 * the machinery to *not* dirty a page when we set hints
 * behaviour modifications now that we can tell the difference between
 dirty and hinted pages
 
 Nobody has yet come up with any comments about the first half, which is
 good. The second part is clearly where much debate will occur. I'm going
 to literally split the patch into two, so we can get the machinery into
 CVS and then fiddle and argue over the second part over next few months.

The first half is actually quite large, but that makes it even more
sensible to commit this part now.

The enclosed patch introduces the machinery by which we might later
optimise hint bit setting. It differentiates between hint bit setting
and block dirtying, when the distinction can safely be made. It acts
safely during VACUUM and correctly during checkpoint. In all other
respects it emulates current behaviour.

The actual tuning patch can be discussed later, probably at length.
Later patches will be fairly small in comparison and so various people
can fairly easily come up with their own favoured modifications for
testing.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Index: src/backend/access/gist/gistget.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/gist/gistget.c,v
retrieving revision 1.73
diff -c -r1.73 gistget.c
*** src/backend/access/gist/gistget.c	12 May 2008 00:00:44 -	1.73
--- src/backend/access/gist/gistget.c	30 Jun 2008 22:05:43 -
***
*** 49,55 
  			/* page unchanged, so all is simple */
  			offset = ItemPointerGetOffsetNumber(iptr);
  			ItemIdMarkDead(PageGetItemId(p, offset));
! 			SetBufferCommitInfoNeedsSave(buffer);
  			LockBuffer(buffer, GIST_UNLOCK);
  			break;
  		}
--- 49,55 
  			/* page unchanged, so all is simple */
  			offset = ItemPointerGetOffsetNumber(iptr);
  			ItemIdMarkDead(PageGetItemId(p, offset));
! 			SetBufferCommitInfoNeedsSave(buffer,  true);
  			LockBuffer(buffer, GIST_UNLOCK);
  			break;
  		}
***
*** 64,70 
  			{
  /* found */
  ItemIdMarkDead(PageGetItemId(p, offset));
! SetBufferCommitInfoNeedsSave(buffer);
  LockBuffer(buffer, GIST_UNLOCK);
  if (buffer != so-curbuf)
  	ReleaseBuffer(buffer);
--- 64,70 
  			{
  /* found */
  ItemIdMarkDead(PageGetItemId(p, offset));
! SetBufferCommitInfoNeedsSave(buffer, true);
  LockBuffer(buffer, GIST_UNLOCK);
  if (buffer != so-curbuf)
  	ReleaseBuffer(buffer);
Index: src/backend/access/hash/hash.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/hash/hash.c,v
retrieving revision 1.103
diff -c -r1.103 hash.c
*** src/backend/access/hash/hash.c	12 May 2008 00:00:44 -	1.103
--- src/backend/access/hash/hash.c	30 Jun 2008 22:05:43 -
***
*** 242,251 
  
  			/*
  			 * Since this can be redone later if needed, it's treated the same
! 			 * as a commit-hint-bit status update for heap tuples: we mark the
! 			 * buffer dirty but don't make a WAL log entry.
  			 */
! 			SetBufferCommitInfoNeedsSave(so-hashso_curbuf);
  		}
  
  		/*
--- 242,251 
  
  			/*
  			 * Since this can be redone later if needed, it's treated the same
! 			 * as a commit-hint-bit status update for heap tuples: we don't make
! 			 * a WAL log entry, and mark the page for a flexible dirty write.
  			 */
! 			SetBufferCommitInfoNeedsSave(so-hashso_curbuf, true);
  		}
  
  		/*
Index: src/backend/access/heap/heapam.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.259
diff -c -r1.259 heapam.c
*** src/backend/access/heap/heapam.c	12 Jun 2008 09:12:30 -	1.259
--- src/backend/access/heap/heapam.c	30 Jun 2008 22:05:43 -
***
*** 1533,1539 
  		 */
  		if (all_dead  *all_dead 
  			HeapTupleSatisfiesVacuum(heapTuple.t_data, RecentGlobalXmin,
! 	 buffer) != HEAPTUPLE_DEAD)
  			*all_dead = false;
  
  		/*
--- 1533,1539 
  		 */
  		if (all_dead  *all_dead 
  			HeapTupleSatisfiesVacuum(heapTuple.t_data, RecentGlobalXmin,
! 	 buffer, true) != HEAPTUPLE_DEAD)
  			*all_dead = false;
  
  		/*
***
*** 1717,1726 
  	{
  		if (TransactionIdDidCommit(xid))
  			HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
!  xid);
  		else
  			HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
!  InvalidTransactionId);
  	}
  }
  
--- 1717,1726 
  	{
  		if (TransactionIdDidCommit(xid))
  			HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
!  xid, true);
  		else
  			HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
!  InvalidTransactionId, true);
  	}
  }
  
Index: 

Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-06-27 Thread Gregory Stark
Alvaro Herrera [EMAIL PROTECTED] writes:

 If only VACUUM is going to set flexible to off, maybe it's better to
 leave the APIs as they are and have a global that's set by VACUUM only
 (and reset in a PG_CATCH block).

Ugh. Perhaps it would be simpler to have a wrapper function HTSV() macro which
passes flexible=true to HTSV_internal(). Then vacuum can call HTSV_internal().

I'm not sure what the performance tradeoff is between having an extra argument
to HTSV and having HTSV check a global which messes with optimizations.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-06-27 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 The default and minimum value for this parameter is 1, so very similar to
 existing behaviour. Expected settings would be 2-5, possibly as high as 20,
 though those are just educated guesses. So the maximum is set arbitrarily as
 100.

Not a fan of arbitrary constants. ISTM this should just have a maximum of
MaxHeapTuplesPerPage.

I'm not really happy with having this parameter at all. It's not something a
DBA can understand or have any hope of setting intelligently. I assume this is
a temporary measure until we have a better understanding of what real-world
factors affect the right values for this knob?

 Temp buffers are never dirtied by hint bit setting. Most temp tables are
 written in a single command, so that re-accessing clog for temp tuples
 is seldom costly. This also changes current behaviour.

I'm not sure I agree with this logic and it doesn't seem like temporary tables
are an important enough case to start coming up with special cases which may
help or may hurt. Most people use temporary tables the way you describe but
I'm sure there's someone out there using temporary tables in a radically
different fashion.

I'm also a bit concerned that *how many hint bits* isn't enough information to
determine how important it is to write out the page. What about how old the
oldest transaction is which was hinted? Or how many *unhinted* xmin/xmax
values were found? If HTSV can hint xmin for a tuple but finds xmax still in
progress perhaps that's a good sign it's not worth dirtying the page?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-06-27 Thread Simon Riggs

On Fri, 2008-06-27 at 15:25 +0100, Gregory Stark wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
 
  If only VACUUM is going to set flexible to off, maybe it's better to
  leave the APIs as they are and have a global that's set by VACUUM only
  (and reset in a PG_CATCH block).
 
 Ugh. Perhaps it would be simpler to have a wrapper function HTSV() macro which
 passes flexible=true to HTSV_internal(). Then vacuum can call HTSV_internal().
 
 I'm not sure what the performance tradeoff is between having an extra argument
 to HTSV and having HTSV check a global which messes with optimizations.

Doing this doesn't actually reduce the size of the patch much, as it
turns out, so I suggest we don't do this.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-06-27 Thread Simon Riggs

On Fri, 2008-06-27 at 15:36 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  The default and minimum value for this parameter is 1, so very similar to
  existing behaviour. Expected settings would be 2-5, possibly as high as 20,
  though those are just educated guesses. So the maximum is set arbitrarily as
  100.
 
 Not a fan of arbitrary constants. ISTM this should just have a maximum of
 MaxHeapTuplesPerPage.
 
 I'm not really happy with having this parameter at all. It's not something a
 DBA can understand or have any hope of setting intelligently. I assume this is
 a temporary measure until we have a better understanding of what real-world
 factors affect the right values for this knob?

Yes, its a guess at what sort of control we'll need.

  Temp buffers are never dirtied by hint bit setting. Most temp tables are
  written in a single command, so that re-accessing clog for temp tuples
  is seldom costly. This also changes current behaviour.
 
 I'm not sure I agree with this logic and it doesn't seem like temporary tables
 are an important enough case to start coming up with special cases which may
 help or may hurt. Most people use temporary tables the way you describe but
 I'm sure there's someone out there using temporary tables in a radically
 different fashion.

Thanks for your comments. The patch splits into two parts:
* the machinery to *not* dirty a page when we set hints
* behaviour modifications now that we can tell the difference between
dirty and hinted pages

Nobody has yet come up with any comments about the first half, which is
good. The second part is clearly where much debate will occur. I'm going
to literally split the patch into two, so we can get the machinery into
CVS and then fiddle and argue over the second part over next few months.

 I'm also a bit concerned that *how many hint bits* isn't enough information to
 determine how important it is to write out the page. What about how old the
 oldest transaction is which was hinted? Or how many *unhinted* xmin/xmax
 values were found? If HTSV can hint xmin for a tuple but finds xmax still in
 progress perhaps that's a good sign it's not worth dirtying the page?

Sounds interesting. We can track anything and everything really, but we
do need to come to a firm dirty/not decision at some point.

If you can develop those ideas a bit more by Monday, I'll try to put
them in the patch. (I'm away until then now).

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-06-27 Thread Heikki Linnakangas

Gregory Stark wrote:

I'm also a bit concerned that *how many hint bits* isn't enough information to
determine how important it is to write out the page. 


Agreed, that doesn't seem like a very good metric to me either.


Or how many *unhinted* xmin/xmax
values were found? If HTSV can hint xmin for a tuple but finds xmax still in
progress perhaps that's a good sign it's not worth dirtying the page?


I like that thought.

Overall, I feel that we should never dirty when setting a hint bit, just 
set the separate buffer flag to indicate that hint bits have been set. 
The decision to dirty and write out, or not, should be delayed until 
we're about to write/replace the buffer. That is, in bgwriter.


How about this strategy:

1. First of all, before writing a dirty buffer, scan all tuples on the 
page and set all hint bits that can be set. This will hopefully save us 
from having to dirty the page again in the future, when another tuple on 
the page is accessed. This has been proposed before, and IIRC Tom has 
argued that it's a modularity violation for bgwriter to access the 
contents of pages like that, but I'm sure we can find a way to do it safely.


2. When bgwriter encounters a page that's marked as hint bits dirty, 
write it only if *all* hint bits on the page has been, or can be, set. 
Dirtying a page before that point doesn't seem worthwhile, as the next 
access to the tuple that doesn't have all the hint bits set will have to 
dirty the page again.



Actually, I'd like to see some benchmarks on an even simpler strategy:
just never dirty a page just because a hint bit has been set. It might 
work surprisingly well in practice: If a database is I/O bound, we don't 
care about the extra CPU work or lock congestion of checking the clog. 
If it's CPU bound, the active pages that matter are in the buffer cache, 
and so are the hint bits for those pages.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-06-20 Thread Simon Riggs

On Wed, 2008-06-18 at 23:41 +0100, Simon Riggs wrote:

 New version on its way.

New version complete, but I'm doing some more performance profiling
before submitting next version. If anybody is waiting, just shout and
I'll post the current version.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Hint Bits and Write I/O

2008-06-18 Thread Alvaro Herrera
Simon Riggs wrote:

 When running a VACUUM command we always dirty the block when setting
 hint bits, for a number of reasons:
 * VACUUM FULL expects all hint bits to be set prior to moving tuples
 * Setting all hint bits allows us to truncate the clog
 * it forces the VACUUM to write out its own dirty buffers, which is OK,
 since it is a background process.
 
 Other commands call HeapTupleSatisfiesVacuum(), yet these tasks can be
 more flexible with hint bit setting. These include ANALYZE, CREATE
 INDEX, CLUSTER, HOT pruning and index scan marking deleted tuples (with
 changes in all index AMs). This means we have to differentiate between
 VACUUM and other callers of HeapTupleSatisfiesVacuum().
 
 So the patch changes the APIs of HeapTupleSatisfiesVacuum(),
 SetBufferCommitInfoNeedsSave() and SetHintBits() with changes to 13 AM
 and command files. There are many changes in tqual.c, which seems the
 right way because SetHintBits() is inlined. These make the patch fairly
 large, though most of it is simple changes.

If only VACUUM is going to set flexible to off, maybe it's better to
leave the APIs as they are and have a global that's set by VACUUM only
(and reset in a PG_CATCH block).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches