[PATCHES] Split _bt_insertonpg to two functions

2007-02-26 Thread Heikki Linnakangas

Here's a patch that:

Moves the logic to find a page with enough room from _bt_insertonpg to a 
new function, _bt_findinsertloc. It makes the code more readable, and 
simplifies the forthcoming Grouped Index Tuples patch.


Also, the insert location within the page used to be calculated twice for 
unique indexes, once in _bt_checkunique and a second time in 
_bt_insertonpg. That's a waste of cycles, and this patch fixes that.



I couldn't measure a difference with pgbench, but this micro-benchmark 
shows it:


 psql postgres -c "CREATE TABLE inserttest (i int PRIMARY KEY);"
 psql postgres -c "TRUNCATE inserttest; checkpoint;"; sync
 time ~/pgsql.cvshead/bin/psql postgres -c "TRUNCATE inserttest; 
INSERT INTO inserttest SELECT a FROM generate_series(1,100) a;"


Without patch: real 0m7.260s
With patch:    real 0m6.963s

On my laptop, fsync=off, full_page_writes=off, checkpoint_segments = 10, 
to remove any other variables.


It's not a huge difference, but it's worth having, and performance 
wasn't the main motivation of the patch anyway.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/nbtree/nbtinsert.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/nbtree/nbtinsert.c,v
retrieving revision 1.152
diff -c -r1.152 nbtinsert.c
*** src/backend/access/nbtree/nbtinsert.c	21 Feb 2007 20:02:17 -	1.152
--- src/backend/access/nbtree/nbtinsert.c	26 Feb 2007 09:37:16 -
***
*** 46,58 
  static Buffer _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf);
  
  static TransactionId _bt_check_unique(Relation rel, IndexTuple itup,
!  Relation heapRel, Buffer buf,
   ScanKey itup_scankey);
  static void _bt_insertonpg(Relation rel, Buffer buf,
  			   BTStack stack,
- 			   int keysz, ScanKey scankey,
  			   IndexTuple itup,
! 			   OffsetNumber afteritem,
  			   bool split_only_page);
  static Buffer _bt_split(Relation rel, Buffer buf, OffsetNumber firstright,
  		  OffsetNumber newitemoff, Size newitemsz,
--- 46,63 
  static Buffer _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf);
  
  static TransactionId _bt_check_unique(Relation rel, IndexTuple itup,
!  Relation heapRel, Buffer buf, OffsetNumber ioffset,
   ScanKey itup_scankey);
+ static void _bt_findinsertloc(Relation rel,
+   Buffer *bufptr, 
+   OffsetNumber *offsetptr,
+   int keysz,
+   ScanKey scankey,
+   IndexTuple newtup);
  static void _bt_insertonpg(Relation rel, Buffer buf,
  			   BTStack stack,
  			   IndexTuple itup,
! 			   OffsetNumber newitemoff,
  			   bool split_only_page);
  static Buffer _bt_split(Relation rel, Buffer buf, OffsetNumber firstright,
  		  OffsetNumber newitemoff, Size newitemsz,
***
*** 86,91 
--- 91,97 
  	ScanKey		itup_scankey;
  	BTStack		stack;
  	Buffer		buf;
+ 	OffsetNumber offset;
  
  	/* we need an insertion scan key to do our search, so build one */
  	itup_scankey = _bt_mkscankey(rel, itup);
***
*** 94,99 
--- 100,107 
  	/* find the first page containing this key */
  	stack = _bt_search(rel, natts, itup_scankey, false, buf, BT_WRITE);
  
+ 	offset = InvalidOffsetNumber;
+ 
  	/* trade in our read lock for a write lock */
  	LockBuffer(buf, BUFFER_LOCK_UNLOCK);
  	LockBuffer(buf, BT_WRITE);
***
*** 128,134 
  	{
  		TransactionId xwait;
  
! 		xwait = _bt_check_unique(rel, itup, heapRel, buf, itup_scankey);
  
  		if (TransactionIdIsValid(xwait))
  		{
--- 136,143 
  	{
  		TransactionId xwait;
  
! 		offset = _bt_binsrch(rel, buf, natts, itup_scankey, false);
! 		xwait = _bt_check_unique(rel, itup, heapRel, buf, offset, itup_scankey);
  
  		if (TransactionIdIsValid(xwait))
  		{
***
*** 142,148 
  	}
  
  	/* do the insertion */
! 	_bt_insertonpg(rel, buf, stack, natts, itup_scankey, itup, 0, false);
  
  	/* be tidy */
  	_bt_freestack(stack);
--- 151,158 
  	}
  
  	/* do the insertion */
! 	_bt_findinsertloc(rel, &buf, &offset, natts, itup_scankey, itup);
! 	_bt_insertonpg(rel, buf, stack, itup, offset, false);
  
  	/* be tidy */
  	_bt_freestack(stack);
***
*** 152,169 
  /*
   *	_bt_check_unique() -- Check for violation of unique index constraint
   *
   * Returns InvalidTransactionId if there is no conflict, else an xact ID
   * we must wait for to see if it commits a conflicting tuple.	If an actual
   * conflict is detected, no return --- just ereport().
   */
  static TransactionId
  _bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
!  Buffer buf, ScanKey itup_scankey)
  {
  	TupleDesc	itupdesc = RelationGetDescr(rel);
  	int			natts = rel->rd_rel->relnatts;
! 	OffsetNumber offset,
! maxoff;
  	Page		page;
  	BTPageOpaque opaque;
  	Buffer		nbuf = InvalidBuffer;
--- 162,182 
  /*
   *	_bt_check_unique() -- Check for violation of unique index constraint
   *
+  * offset points to the first 

Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-02-26 Thread NikhilS

Hi,

I was wondering if we can apply the same logic of recalculating OldestXmin
within IndexBuildHeapScan which occurs as part of create index operation?

Having to index lesser number of tuples should be a good save in the CREATE
INDEX CONCURRENTLY case, if the above is possible?

Regards,
Nikhils
--
EnterpriseDB   http://www.enterprisedb.com


Re: [PATCHES] [BUGS] BUG #2969: Inaccuracies in Solaris FAQ

2007-02-26 Thread Bruce Momjian
Peter Eisentraut wrote:
 Zdenek Kotala wrote:
  There is Solaris FAQ update. Please, look on it and let me know any
  comments.
 
 The actual answer to the question "Can I compile PostgreSQL with 
 Kerberos v5 support?" is "Yes, why not?".  I don't think "Can I use 
 this weird internal private library that seems to provide a similar 
 interface?" is a Frequently Asked Question.

I don't use Solaris so I don't know if it is frequent issue or not.
I assume the author knows.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PATCHES] 1-byte packed varlena headers

2007-02-26 Thread Gregory Stark


Updated patch at:

 http://community.enterprisedb.com/varlena/patch-varvarlena-12.patch.gz

This fixes a rather critical oversight which caused it all to appear to work
but not actually save any space. I've added an assertion check that the
predicted tuple size matches the tuple size generated by heap_form*tuple.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



[PATCHES] Numeric patch to add special-case representations for 8 bytes

2007-02-26 Thread Gregory Stark

I've uploaded a quick hack to store numerics in < 8 bytes when possible.

 http://community.enterprisedb.com/numeric-hack-1.patch

This is a bit of a kludge since it doesn't actually provide any interface for
external clients of the numeric module to parse the resulting data. Ie, the
macros in numeric.h will return garbage.

But I'm not entirely convinced that matters. It's not like those macros were
really useful to any other modules anyways since there was no way to extract
the actual digits.

The patch is also slightly unsatisfactory because as I discovered numbers like
1.1 are stored as two digits currently. But it does work and it does save a
substantial amount of space for integers.

postgres=# select n,pg_column_size(n) from n;
n | pg_column_size 
--+
1 |  2
   11 |  2
  101 |  2
 1001 |  3
10001 |  9
   11 |  9
  1.1 |  9
 10.1 |  9
100.1 |  9
   1000.1 |  9
  1.1 | 11
 10.1 | 11

I had hoped to get the second batch to be 3-4 bytes. But even now note how
much space is saved for small integers.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [PATCHES] Allow \pset to parse on or off for boolean values

2007-02-26 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Chad Wagner wrote:
 This is a TODO item:
 
 o Allow psql \pset boolean variables to set to fixed values, rather
   than toggle
 
 
 Basically allows for:
 test=# \pset expanded on
 Expanded display is on.
 test=# \pset footer off
 Default footer is off.
 test=# \pset footer
 Default footer is on.
 test=# \pset footer
 Default footer is off.
 test=# \pset footer
 Default footer is on.
 
 
 Basically the change is to check whether the value in do_pset is != NULL;
 if so, call ParseVariableBool to get a true/false and set the value
 directly. If the second parameter is not provided, then the old logic of
 toggling is preserved.
 
 The pset variables that are adjusted are:
 expanded
 numericlocale
 tuples_only
 pager (supports on/off/always now)
 footer

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Bruce Momjian

I am a little concerned about a log_* setting that is INFO. I understand
why you used INFO (for log_min_error_messages), but INFO is inconsistent
with the log* prefix, and by default INFO doesn't appear in the log
file.

So, by default, the INFO is going to go to the user terminal, and not to
the logfile.

Ideas?

---

Simon Riggs wrote:
 On Mon, 2007-02-19 at 19:38 +, Simon Riggs wrote:
  On Tue, 2007-02-13 at 22:19 -0500, Bruce Momjian wrote:
   Simon Riggs wrote:
On Thu, 2006-10-26 at 18:45 -0400, Tom Lane wrote:
 Chris Campbell [EMAIL PROTECTED] writes:
  Is there additional logging information I can turn on to get more  
  details? I guess I need to see exactly what locks both processes  
  hold, and what queries they were running when the deadlock 
  occurred?  
  Is that easily done, without turning on logging for *all* 
  statements?
 
 log_min_error_statement = error would at least get you the statements
 reporting the deadlocks, though not what they're conflicting against.

Yeh, we need a much better locking logger for performance analysis.

We really need to dump the whole wait-for graph for deadlocks, since
this might be more complex than just two statements involved. Deadlocks
ought to be so infrequent that we can afford the log space to do this -
plus if we did this it would likely lead to fewer deadlocks.

For 8.3 I'd like to have a log_min_duration_lockwait (secs) parameter
that would allow you to dump the wait-for graph for any data-level locks
that wait too long, rather than just those that deadlock. Many
applications experience heavy locking because of lack of holistic
design. That will also show up the need for other utilities to act
CONCURRENTLY, if possible.
   
   Old email, but I don't see how our current output is not good enough?
   
 test= lock a;
 ERROR:  deadlock detected
 DETAIL:  Process 6855 waits for AccessExclusiveLock on relation 16394 of
 database 16384; blocked by process 6795.
 Process 6795 waits for AccessExclusiveLock on relation 16396 of database
 16384; blocked by process 6855.
  
  This detects deadlocks, but it doesn't detect lock waits. 
  
  When I wrote that it was previous experience driving me. Recent client
  experience has highlighted the clear need for this. We had a lock wait
  of 50 hours because of an RI check; that's the kind of thing I'd like to
  show up in the logs somewhere.
  
  Lock wait detection can be used to show up synchronisation points that
  have been inadvertently designed into an application, so it's a useful
  tool in investigating performance issues.
  
  I have a patch implementing the logging as agreed with Tom, will post to
  patches later tonight.
 
 Patch for discussion, includes doc entries at top of patch, so it's
 fairly clear how it works.
 
 Output is an INFO message, to allow this to trigger
 log_min_error_statement when it generates a message, to allow us to see
 the SQL statement that is waiting. This allows it to generate a message
 prior to the statement completing, which is important because it may not
 ever complete, in some cases, so simply logging a list of pids won't
 always tell you what the SQL was that was waiting.
 
 Other approaches are possible...
 
 Comments?
 
 -- 
   Simon Riggs 
   EnterpriseDB   http://www.enterprisedb.com
 

[ Attachment, skipping... ]

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] correct format for date, time, timestamp for XML functionality

2007-02-26 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Pavel Stehule wrote:
 Hello,
 
 this patch makes datetime fields independent of the current datestyle 
 setting. It adds a new internal datestyle, USE_XSD_DATESTYLE, which is 
 almost the same as USE_ISO_DATESTYLE. The difference is in the timestamp 
 format:
 
 ISO: yyyy-mm-dd hh24:mi:ss
 XSD: yyyy-mm-ddThh24:mi:ss
 
 I found one link about this topic: 
 http://forums.oracle.com/forums/thread.jspa?threadID=467278tstart=0
 
 Regards
 Pavel Stehule
 
 _
 Emotikony a pozadi programu MSN Messenger ozivi vasi konverzaci. 
 http://messenger.msn.cz/

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] pg_standby Error cleanup

2007-02-26 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Darcy Buskermolen wrote:
 Please find attached a patch which provides for logging in the event that -k 
 is unable to clean up an old WAL file.  Also makes the "failed to remove 
 file" error message consistent for the trigger file.
 
 
 
 -- 
 Darcy Buskermolen
 Command Prompt, Inc.
 Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
 PostgreSQL solutions since 1997
 http://www.commandprompt.com/

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Simon Riggs
On Mon, 2007-02-26 at 13:34 -0500, Bruce Momjian wrote:

 I am a little concerned about a log_* setting that is INFO. I understand
 why you used INFO (for log_min_error_messages), but INFO is inconsistent
 with the log* prefix, and by default INFO doesn't appear in the log
 file.

Yeh, LOG would be most appropriate, but that's not possible.

log_min_messages allows only DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1,
INFO, NOTICE and WARNING for non-error states.

Possibly DEBUG1?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Yeh, LOG would be most appropriate, but that's not possible.

You have not given any good reason for that.

 log_min_messages allows only DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1,
 INFO, NOTICE and WARNING for non-error states.

I don't think you understand quite how the log message priority works...

regards, tom lane



Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Simon Riggs
On Mon, 2007-02-26 at 14:11 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Yeh, LOG would be most appropriate, but that's not possible.
 
 You have not given any good reason for that.

The idea of the patch is that it generates a log message which then
invokes log_min_error_statement so that the SQL statement is displayed.
LOG is not on the list of options there, otherwise I would use it.

The reason for behaving like this is so that a message is generated
while the statement is still waiting, rather than at the end. As I
mentioned in the submission, you may not like that behaviour; I'm in two
minds myself, but I'm trying to get to the stage of having useful
information come out of the server when we have long lock waits.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 The idea of the patch is that it generates a log message which then
 invokes log_min_error_statement so that the SQL statement is displayed.
 LOG is not on the list of options there, otherwise I would use it.

As I said, you don't understand how the logging priority control works.
LOG *is* the appropriate level for stuff intended to go to the server log.

regards, tom lane



Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Simon Riggs
On Mon, 2007-02-26 at 14:28 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  The idea of the patch is that it generates a log message which then
  invokes log_min_error_statement so that the SQL statement is displayed.
  LOG is not on the list of options there, otherwise I would use it.
 
 As I said, you don't understand how the logging priority control works.
 LOG *is* the appropriate level for stuff intended to go to the server log.

Please look at the definition of log_min_error_statement, so you
understand where I'm coming from.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Mon, 2007-02-26 at 14:28 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 The idea of the patch is that it generates a log message which then
 invokes log_min_error_statement so that the SQL statement is displayed.
 LOG is not on the list of options there, otherwise I would use it.
 
 As I said, you don't understand how the logging priority control works.
 LOG *is* the appropriate level for stuff intended to go to the server log.

 Please look at the definition of log_min_error_statement, so you
 understand where I'm coming from.

I *have* read the definition of log_min_error_statement.  (The SGML docs
are wrong btw, as a quick look at the code shows that LOG is an accepted
value.)

The real issue here is that send_message_to_server_log just does

if (edata->elevel >= log_min_error_statement && debug_query_string != NULL)

to determine whether to log the statement, whereas arguably it should be
using a test like is_log_level_output --- that is, the priority ordering
for log_min_error_statement should be like log_min_messages not like
client_min_messages.  We've discussed that before in another thread, but
it looks like nothing's been done yet.  In any case, if you're unhappy
with the code's choice of whether to emit the STATEMENT part of a log
message, some changes here are what's indicated, not bizarre choices of
elevel for individual messages.

regards, tom lane



Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Simon Riggs
On Mon, 2007-02-26 at 14:52 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Mon, 2007-02-26 at 14:28 -0500, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
  The idea of the patch is that it generates a log message which then
  invokes log_min_error_statement so that the SQL statement is displayed.
  LOG is not on the list of options there, otherwise I would use it.
  
  As I said, you don't understand how the logging priority control works.
  LOG *is* the appropriate level for stuff intended to go to the server log.
 
  Please look at the definition of log_min_error_statement, so you
  understand where I'm coming from.
 
 I *have* read the definition of log_min_error_statement.  (The SGML docs
 are wrong btw, as a quick look at the code shows that LOG is an accepted
 value.)

OK, I should have looked past the manual.

 The real issue here is that send_message_to_server_log just does
 
   if (edata->elevel >= log_min_error_statement && debug_query_string != NULL)
 
 to determine whether to log the statement, whereas arguably it should be
 using a test like is_log_level_output --- that is, the priority ordering
 for log_min_error_statement should be like log_min_messages not like
 client_min_messages.  We've discussed that before in another thread, but
 it looks like nothing's been done yet.  

Hopefully not with me? Don't remember that.

 In any case, if you're unhappy
 with the code's choice of whether to emit the STATEMENT part of a log
 message, some changes here are what's indicated, not bizarre choices of
 elevel for individual messages.

Well, I would have chosen LOG if I thought it was available.

I'll do some more to the patch.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Bruce Momjian
Simon Riggs wrote:
 On Mon, 2007-02-26 at 13:34 -0500, Bruce Momjian wrote:
 
  I am a little concerned about a log_* setting that is INFO. I understand
  why you used INFO (for log_min_error_messages), but INFO is inconsistent
  with the log* prefix, and by default INFO doesn't appear in the log
  file.
 
 Yeh, LOG would be most appropriate, but that's not possible.
 
 log_min_messages allows only DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1,
 INFO, NOTICE and WARNING for non-error states.
 
 Possibly DEBUG1?

This highlights a problem we have often had with LOG output where we
also want the query.

I think there are two possible approaches.  First, we could add a new
bitmap value like LOG_STATEMENT to ereport when we want the statement
with the log line:

ereport (LOG | LOG_STATEMENT, ...)

(or a new LOG_WITH_STATEMENT log level) and a new GUC like
log_include_statement that would control the output of statements for
certain GUC parameters, and we document which GUC values it controls.

A simpler idea would be to unconditionally include the query in the
errdetail() of the actual LOG ereport.

This is not the first GUC that has needed this.  We had this issue with
log_temp_files, which we just added, and the only suggested solution
was to use log_statement = 'all'.  Either of these ideas above would be
useful for this as well.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] [HACKERS] Deadlock with pg_dump?

2007-02-26 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 This is not the first GUC that has needed this.

Exactly.  I think that we simply made a mistake in the initial
implementation of log_min_error_statement: we failed to think about
whether it should use client or server priority ordering, and the
easy-to-code behavior was the wrong one.

regards, tom lane



Re: [PATCHES] COMMIT NOWAIT Performance Option (patch)

2007-02-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 A prototype patch is posted to -patches, which is WORK IN PROGRESS.
 [This patch matches discussion thread on -hackers.]

What does this accomplish other than adding syntactic sugar over a
feature that really doesn't work well anyway?  I don't see any point
in encouraging people to use commit_delay in its present form.  If we
had a portable solution for millisecond-or-so waits then maybe it would
work ...

regards, tom lane



Re: [PATCHES] COMMIT NOWAIT Performance Option (patch)

2007-02-26 Thread Simon Riggs
On Mon, 2007-02-26 at 18:14 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  A prototype patch is posted to -patches, which is WORK IN PROGRESS.
  [This patch matches discussion thread on -hackers.]
 
 What does this accomplish other than adding syntactic sugar over a
 feature that really doesn't work well anyway?  I don't see any point
 in encouraging people to use commit_delay in its present form.  If we
 had a portable solution for millisecond-or-so waits then maybe it would
 work ...

This patch doesn't intend to implement group commit. I've changed the
meaning of commit_delay, sorry if that confuses.

You and I discussed this in Toronto actually, IIRC. The best way to
describe this proposal is deferred fsync, so perhaps a different
parameter commit_fsync_delay would be more appropriate. 

Bruce has requested this feature many times from me, so I thought it
about time to publish.

The key point is that COMMIT NOWAIT doesn't wait for group commit to
return, it just doesn't wait at all - leaving someone else to flush WAL.
It's much better than fsync=off.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [PATCHES] [HACKERS] Load distributed checkpoint

2007-02-26 Thread ITAGAKI Takahiro
Josh Berkus josh@agliodbs.com wrote:

 Can I have a copy of the patch to add to the Sun testing queue?

This is the revised version of the patch. Delay factors in checkpoints
can be specified by checkpoint_write_percent, checkpoint_nap_percent
and checkpoint_sync_percent. They are relative to checkpoint_timeout.

Also, checking of archive_timeout during checkpoints and some error
handling routines were added.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



checkpoint_02-27.patch.gz
Description: Binary data



[PATCHES]

2007-02-26 Thread John Bartlett
Hi,

 

This is the first posting to the community of the WIP patch for the
Updatable Cursor implementation.

 

I want to confirm that the community is satisfied that the effort to date is
in a suitable direction and to get comments on the development to date.

 

The patch is in the following state:

 

The grammar definition is complete and 'yacc'ed to produce gram.y.c.

 

The functions transformUpdateStmt and transformDeleteStmt have been updated
to process the cursor name and obtain the related portal.

 

The change to save the current tuple id (ctid) into the portal, related to
the Fetch command has been done.

 

The ctids relating to the Update/Delete statements' TidScan are being
extracted to be saved in the executor.

 

The parts in progress are to complete the saving of the ctids from the
TidScan into a list stored in a file, plus related searching the list for an
individual ctid obtained from the Update/Delete statements.

 

Unstarted as yet:

 

1)Correctly process, in the database, the Delete / Update of the
tuple from the cursor.

2)To enable the cursor name to be defined as a parameter in a
PREPARE statement and provided as part of an EXECUTE statement.

 

The community may wish to comment on the following issue:

 

1)At present the file that will contain the list of ctids is going into
a new directory called pg_ctids, analogous to pg_twophase, and also stored
in the pg_data directory.

 

Regards,
John Bartlett



270207_updatable_cursor.diff
Description: Binary data



[PATCHES] POSIX shared memory support

2007-02-26 Thread Chris Marcellino
Recapitulating and addressing some of the issues with my previous  
attempt at this feature:


PostgreSQL currently uses the System V shared memory APIs to access
shared memory. On Mac OS X and other BSDs, the default System V shared
memory limits are often very low and require adjustment for acceptable
performance. Particularly, when Postgres is included as part of larger
end-user friendly software products, these kernel settings are often
difficult to change programmatically.

The System V shared memory facilities provide a method to determine
who is attached to a shared memory segment. This is used to prevent
backends that were orphaned by crashed or killed database processes
from corrupting the database as it is restarted. The same effect can
be achieved using the POSIX APIs, but since the POSIX library does not
have a way to check who is attached to a segment, atomic segment
creation must be used to ensure exclusive access to the database.

In order for this to work, the key name used to open and create the
shared memory segment must be unique for each data directory. This is
done by using a strong hash of the canonical form of the data
directory’s pathname. This also removes any risk of other
applications’, or other databases’, memory segments colliding with the
current shared memory segment, which conveniently simplifies the logic.

The algorithm changes are described by the before and after flowchart  
PDF included in the tarball below.


There is also a Windows version of this patch included, which can
replace the current SysV-to-Win32 shared memory layer, as that layer
currently does not check for orphaned backends in the database. If
this is used, src/backend/port/win32/shmem.c and its makefile
reference can be removed.


Included is a patch for the configure.in file to opt in all of the
supported platforms to use this POSIX shared memory layer except those
that are explicitly excluded (of course, this could be done in reverse
if desired). Notably, NetBSD and OpenBSD do not support the POSIX
shared memory calls, and their template patches are included. Other
platforms should be tested and excluded as needed. In addition, the
documentation section 16.4.1 ‘Shared Memory and Semaphores’ could be
largely pruned and updated to reflect use of the POSIX calls.


The patches are available here (the postings were being silently
dropped when I attached a file this large):


http://homepage.mac.com/cmarcellino/postgres-posix-shmem.tar

Please let me know if there is a better way to post this.


Thanks for your feedback,
Chris Marcellino




Re: [PATCHES] POSIX shared memory support

2007-02-26 Thread Tom Lane
Chris Marcellino [EMAIL PROTECTED] writes:
> The System V shared memory facilities provide a method to determine
> who is attached to a shared memory segment. This is used to prevent
> backends that were orphaned by crashed or killed database processes
> from corrupting the database as it is restarted. The same effect can
> be achieved using the POSIX APIs,

... except that it can't ...

> but since the POSIX library does not have a way to check who is
> attached to a segment, atomic segment creation must be used to
> ensure exclusive access to the database.

How does that fix the problem?  If you can't actually tell whether
someone is attached to an existing segment, then you're still up against
the basic rock-and-a-hard-place issue: either you assume there is no one
there (and corrupt your database if you're wrong) or you assume there is
someone there (and force manual intervention by the DBA to recover after
postmaster crashes).  Neither of these alternatives is really acceptable.

> In order for this to work, the key name used to open and create the
> shared memory segment must be unique for each data directory. This
> is done by using a strong hash of the canonical form of the data
> directory’s pathname.

Strong hash is not a guarantee, even if you could promise that you
could get a unique canonical path, which I doubt you can.  In any case
this fails if the DBA decides to rename the directory on the fly (don't
laugh; not only are there instances of that in our archives, there are
people opining that we need to allow it --- even with the postmaster
still running).

> This also removes any risk of other applications' or other
> databases’ memory segments colliding with the current shared memory
> segment, which conveniently simplifies the logic.

How exactly does it remove that risk?  I think you're wishfully-thinking
that if you are creating an unreadable hash value then there will never
be any collisions against someone else with the same touching faith that
*his* unreadable hash values will never collide with anyone else's.
Doesn't give me a lot of comfort.  Not that it matters, since the
approach is broken even if this specific assumption were sustainable.

regards, tom lane



Re: [HACKERS] [PATCHES] COMMIT NOWAIT Performance Option (patch)

2007-02-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
> On Mon, 2007-02-26 at 18:14 -0500, Tom Lane wrote:
>> What does this accomplish other than adding syntactic sugar over a
>> feature that really doesn't work well anyway?

> This patch doesn't intend to implement group commit. I've changed the
> meaning of commit_delay, sorry if that confuses.

Ah.  The patch was pretty much unintelligible without the discussion
(which got here considerably later :-().  I've still got misgivings
about how safe it really is, but at least this is better than what
commit_delay wishes it could do.

regards, tom lane



Re: [PATCHES] [HACKERS] Load distributed checkpoint

2007-02-26 Thread Inaam Rana

On 2/26/07, ITAGAKI Takahiro [EMAIL PROTECTED] wrote:
> Josh Berkus josh@agliodbs.com wrote:
>> Can I have a copy of the patch to add to the Sun testing queue?
>
> This is the revised version of the patch. Delay factors in checkpoints
> can be specified by checkpoint_write_percent, checkpoint_nap_percent
> and checkpoint_sync_percent. They are relative to checkpoint_timeout.
>
> Also, checking of archive_timeout during checkpoints and some error
> handling routines were added.




One of the issues we had during testing with the original patch was
that db stop did not work properly. I think you coded something to do
a shutdown checkpoint immediately, but if a checkpoint is already in
progress at that time, it would take its own time to complete.
Does this patch resolve that issue? Also, is it based on pg82stable or HEAD?

regards,
inaam

> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center









--
Inaam Rana
EnterpriseDB   http://www.enterprisedb.com


Re: [PATCHES] POSIX shared memory support

2007-02-26 Thread Chris Marcellino


On Feb 26, 2007, at 10:43 PM, Tom Lane wrote:


> Chris Marcellino [EMAIL PROTECTED] writes:
>> The System V shared memory facilities provide a method to determine
>> who is attached to a shared memory segment. This is used to prevent
>> backends that were orphaned by crashed or killed database processes
>> from corrupting the database as it is restarted. The same effect can
>> be achieved using the POSIX APIs,
>
> ... except that it can't ...
>
>> but since the POSIX library does not have a way to check who is
>> attached to a segment, atomic segment creation must be used to
>> ensure exclusive access to the database.
>
> How does that fix the problem?  If you can't actually tell whether
> someone is attached to an existing segment, then you're still up against
> the basic rock-and-a-hard-place issue: either you assume there is no one
> there (and corrupt your database if you're wrong) or you assume there is
> someone there (and force manual intervention by the DBA to recover after
> postmaster crashes).  Neither of these alternatives is really acceptable.


Ignoring the case where backends are still alive in the database,
since they would require intervention or patience either way, there
are two options:
1) There is a postmaster/backend still running and you try to start
another postmaster: the unique segment cannot be closed and atomically
recreated, and startup will fail as it does in the current
implementation.
2) There are no errant processes still in the database: the segment
can be closed and atomically recreated.

Try making a build with the patch, then start a postmaster for a given
folder, delete the lock file and start another postmaster (on a
different port) in that folder. Please let me know if I am overlooking
something.





>> In order for this to work, the key name used to open and create the
>> shared memory segment must be unique for each data directory. This
>> is done by using a strong hash of the canonical form of the data
>> directory’s pathname.
>
> Strong hash is not a guarantee, even if you could promise that you
> could get a unique canonical path, which I doubt you can.  In any case
> this fails if the DBA decides to rename the directory on the fly (don't
> laugh; not only are there instances of that in our archives, there are
> people opining that we need to allow it --- even with the postmaster
> still running).


A strong hash is an effective guarantee that many computing paradigms
are based upon. The collision rate is astronomically small, and can be
made astronomically smaller with longer hashes. (For MD5 there would
need to be 10^15 postmasters on a server before a collision is likely,
and they all would need to have crashed and left backends in the
database, etc.)


True, renaming is a problem that I had not anticipated at all. Now
that you mention it, hard links might be an issue on some machines
that don't canonicalize them to a unique path, since that isn't
required by the POSIX docs. Oh, the horrible degenerate cases. Good
point though.


Perhaps there is some other unique identifying feature of a given
database. A per-database persistent UUID would fit nicely here. It
could just be the shmem key.





>> This also removes any risk of other applications' or other
>> databases’ memory segments colliding with the current shared memory
>> segment, which conveniently simplifies the logic.
>
> How exactly does it remove that risk?


This is fruitless due to the renaming issue, but the hash isn't an
issue. I'm not sure that a hex string beginning with \pg_x is any less
readable than the shmem id integers that are generated ad hoc by the
current implementation.



> I think you're wishfully-thinking that if you are creating an
> unreadable hash value then there will never be any collisions
> against someone else with the same touching faith that *his*
> unreadable hash values will never collide with anyone else's.


I'm flattered that you hold my coding abilities in such devout
conviction, but I assure you that cryptography, even in this limited
use, is based in rational thought :).
In addition, the astronomically unlikely collision isn't a risk, as
the database can't be damaged. The admin would then need to clear the
lockfile, after he won the lottery twice and was struck by lightning
in his overturned car.



> Doesn't give me a lot of comfort.  Not that it matters, since the
> approach is broken even if this specific assumption were sustainable.


Postmasters failing to load don't give me much comfort either, and
that isn't a pipe dream.

I suppose that the renaming issue relegates this patch to situations
where the database cannot be renamed or hard-linked to and started
more than once, yet which require starting up databases without
restarting and without needing to control how many other databases are
consuming shmem on the same box.


Thanks for the reply,
Chris Marcellino



> regards, tom lane


