Re: [PATCHES] hash index improving v3

2008-09-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 maintenance_work_mem is already used for 3 separate operations that bear
 little resemblance to each other. If it's appropriate for all of those
 then its appropriate for this usage also.

No, it isn't.

The fundamental point here is that this isn't a memory allocation
parameter; it's a switchover threshold between two different behaviors.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] hash index improving v3

2008-09-23 Thread Simon Riggs

On Tue, 2008-09-23 at 08:16 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  maintenance_work_mem is already used for 3 separate operations that bear
  little resemblance to each other. If it's appropriate for all of those
  then its appropriate for this usage also.
 
 No, it isn't.
 
 The fundamental point here is that this isn't a memory allocation
 parameter; it's a switchover threshold between two different behaviors.

That's a little confusing since sorts switch their behaviour also, but
use (some form of) work_mem, which is *also* their max allocation. I see
the difficulty in understanding the algorithm's behaviour now.

So shared_buffers is the wrong parameter, but even if we had a parameter
it would be very difficult to set it. 

Thinks: Why not just sort all of the time and skip the debate entirely?
I thought the main use case was for larger indexes, since that's when
the number of levels in the index is significantly less than btrees? Do
we need to optimise creation time of smaller hash indexes at all?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] hash index improving v3

2008-09-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Thinks: Why not just sort all of the time and skip the debate entirely?

The sort is demonstrably a loser for smaller indexes.  Admittedly,
if the index is small then the sort can't cost all that much, but if
the (correct) threshold is some large fraction of shared_buffers then
it could still take awhile on installations with lots-o-buffers.

The other side of that coin is that it's not clear this is really worth
arguing about, much less exposing a separate parameter for.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] hash index improving v3

2008-09-23 Thread Simon Riggs

On Tue, 2008-09-23 at 09:13 -0400, Tom Lane wrote:

 The other side of that coin is that it's not clear this is really worth
 arguing about, much less exposing a separate parameter for.

Agreed.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Infrastructure changes for recovery

2008-09-23 Thread Simon Riggs

On Mon, 2008-09-22 at 23:06 +0100, Simon Riggs wrote:
 On Thu, 2008-09-18 at 10:09 -0400, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   On Thu, 2008-09-18 at 09:06 -0400, Tom Lane wrote:
   Do we really need a checkpoint there at all?
  
   Timelines only change at shutdown checkpoints.
  
  Hmm.  I *think* that that is just a debugging crosscheck rather than a
  critical property.  But yeah, it would take some close investigation,
  which maybe isn't warranted if you have a less-invasive solution.
 
 OK, new patch, version 6. Some major differences to previous patch.

 Ready for serious review prior to commit. I will be performing further
 testing also.

Version 7

I've removed the concept of interrupting a restartpoint half way
through, I found a fault there. It was more ugly than the alternative
and less robust. The code now waits at the end of recovery if we are in
the middle of a restartpoint, but forces a do-it-more-quickly also. That
means we won't always get a fast start even though we skip the shutdown
checkpoint, but at least we're sure there's no chance of breakage
because of concurrent activiy, state changes etc..

I'm happy with this now.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Index: src/backend/access/transam/multixact.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/multixact.c,v
retrieving revision 1.28
diff -c -r1.28 multixact.c
*** src/backend/access/transam/multixact.c	1 Aug 2008 13:16:08 -	1.28
--- src/backend/access/transam/multixact.c	22 Sep 2008 19:28:56 -
***
*** 1543,1549 
  	 * SimpleLruTruncate would get confused.  It seems best not to risk
  	 * removing any data during recovery anyway, so don't truncate.
  	 */
! 	if (!InRecovery)
  		TruncateMultiXact();
  
  	TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE(true);
--- 1543,1549 
  	 * SimpleLruTruncate would get confused.  It seems best not to risk
  	 * removing any data during recovery anyway, so don't truncate.
  	 */
! 	if (!IsRecoveryProcessingMode())
  		TruncateMultiXact();
  
  	TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE(true);
Index: src/backend/access/transam/xlog.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.317
diff -c -r1.317 xlog.c
*** src/backend/access/transam/xlog.c	11 Aug 2008 11:05:10 -	1.317
--- src/backend/access/transam/xlog.c	23 Sep 2008 14:56:37 -
***
*** 66,76 
  bool		fullPageWrites = true;
  bool		log_checkpoints = false;
  int 		sync_method = DEFAULT_SYNC_METHOD;
- 
  #ifdef WAL_DEBUG
  bool		XLOG_DEBUG = false;
  #endif
- 
  /*
   * XLOGfileslop is the maximum number of preallocated future XLOG segments.
   * When we are done with an old XLOG segment file, we will recycle it as a
--- 66,74 
***
*** 119,124 
--- 117,123 
  
  /* Are we doing recovery from XLOG? */
  bool		InRecovery = false;
+ bool		reachedSafeStopPoint = false;
  
  /* Are we recovering using offline XLOG archives? */
  static bool InArchiveRecovery = false;
***
*** 131,137 
  static bool recoveryTarget = false;
  static bool recoveryTargetExact = false;
  static bool recoveryTargetInclusive = true;
- static bool recoveryLogRestartpoints = false;
  static TransactionId recoveryTargetXid;
  static TimestampTz recoveryTargetTime;
  static TimestampTz recoveryLastXTime = 0;
--- 130,135 
***
*** 286,295 
--- 284,295 
  /*
   * Total shared-memory state for XLOG.
   */
+ #define	XLOGCTL_BUFFER_SPACING	128
  typedef struct XLogCtlData
  {
  	/* Protected by WALInsertLock: */
  	XLogCtlInsert Insert;
+ 	char	InsertPadding[XLOGCTL_BUFFER_SPACING - sizeof(XLogCtlInsert)];
  
  	/* Protected by info_lck: */
  	XLogwrtRqst LogwrtRqst;
***
*** 297,305 
--- 297,312 
  	uint32		ckptXidEpoch;	/* nextXID  epoch of latest checkpoint */
  	TransactionId ckptXid;
  	XLogRecPtr	asyncCommitLSN; /* LSN of newest async commit */
+ 	/* add data structure padding for above info_lck declarations */
+ 	char	InfoPadding[XLOGCTL_BUFFER_SPACING - sizeof(XLogwrtRqst) 
+ - sizeof(XLogwrtResult)
+ - sizeof(uint32)
+ - sizeof(TransactionId)
+ - sizeof(XLogRecPtr)];
  
  	/* Protected by WALWriteLock: */
  	XLogCtlWrite Write;
+ 	char	WritePadding[XLOGCTL_BUFFER_SPACING - sizeof(XLogCtlWrite)];
  
  	/*
  	 * These values do not change after startup, although the pointed-to pages
***
*** 311,316 
--- 318,342 
  	int			XLogCacheBlck;	/* highest allocated xlog buffer index */
  	TimeLineID	ThisTimeLineID;
  
+ 	/*
+ 	 * IsRecoveryProcessingMode shows whether the postmaster is in a
+ 	 * postmaster state earlier than PM_RUN, or not. This is a globally
+ 	 * accessible state to allow EXEC_BACKEND