Re: [HACKERS] [REVIEW] Generate column names for subquery expressions

2011-09-29 Thread Kyotaro HORIGUCHI
Hi, 

 PS: When you send a review, you should add the author's email to the
 To: line to make sure they see it. I noticed your email only today
 because it was in a new thread and not addressed to me directly.

 Thanks for the advice. I will do so from now on.

 Good catch, a new patch is attached. Apparently the results of this
 query have also changed in recent versions, but I didn't touch that.

 I've confirmed that it is fixed.
 I'll send this on for committer review.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp

2011-09-29 Thread Kyotaro HORIGUCHI
Sorry for the late re-review.

 One question remains:

  Q1: The shmem entry for timestamp is not initialized on
  allocating. Is this OK? (I don't know that for OSs other than
  Linux) And zeroing double field is OK for all OSs?
 
 CreateSharedBackendStatus() initializes that shmem entries by doing
 MemSet(BackendStatusArray, 0, size). You think this is not enough?

 Sorry for missing it. That's enough.

  Even if this is OK for all OSs, I don't know whether initializing a
  TimestampTz field (a double; int64 is fine) with 8 bytes of zeros is
  OK on all platforms. (It is OK for IEEE 754 binary64.)
 
 Which code are you concerned about?

xlog.c: 5889
   beentry = pgstat_fetch_all_beentry();
 
   for (i = 0; i < MaxBackends; i++, beentry++)
   {
   xtime = beentry->st_xact_end_timestamp;

 I think the last line in the quoted code above reads a possibly
zero-initialized double (or int64) value, and then that doubtful value
is compared with and copied to another double.

   if (result < xtime)
   result = xtime;
 

  
 No, st_changecount is used to read st_xact_end_timestamp.
 st_xact_end_timestamp is read from the shmem to the local memory
 in pgstat_read_current_status(), and this function always checks
 st_changecount when reading the shmem value.

 Yes, I confirmed that pg_last_xact_insert_timestamp looks at the local
copy of the BackendStatus entry, so my concern was unnecessary.
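
For reference, that read protocol works roughly like this; a sketch only,
not the exact pgstat.c code, assuming "beentry" points at the shared-memory
entry and "localentry" at the backend-local copy being filled in:

#include "postgres.h"
#include "miscadmin.h"
#include "pgstat.h"
#include <string.h>

static void
copy_backend_entry(volatile PgBackendStatus *beentry, PgBackendStatus *localentry)
{
    for (;;)
    {
        int     before_changecount = beentry->st_changecount;

        memcpy(localentry, (char *) beentry, sizeof(PgBackendStatus));

        /* An even, unchanged counter means no writer was active meanwhile. */
        if (before_changecount == beentry->st_changecount &&
            (before_changecount & 1) == 0)
            break;

        /* A writer was (or may have been) active; retry the copy. */
        CHECK_FOR_INTERRUPTS();
    }
}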

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Feature proposal: www_fdw

2011-09-29 Thread Alexander Soudakov
On Thu, Sep 29, 2011 at 1:20 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Florian Pflug f...@phlo.org wrote:
 On Sep28, 2011, at 15:32 , Alexander Soudakov wrote:
 Here you can find www_fdw feature documentation:
 http://wiki.postgresql.org/wiki/WWW_FDW

 Certainly looks useful (as a third-party extension, as others have
 already pointed out)

 Our programmers agree that it is likely to be useful here.  I agree
 that it should be an extension.

 What I didn't quite understand is how one would pass (dynamic)
 parameters for a GET request. For example, not too long ago I
 needed to access the Google Maps API from postgres. I ended up
 using pl/python, and now wonder if your FDW would support that
 use-case.

 I would assume that the usual ? to start parameters and & between
 parameters would be used.  For example, with Google Maps:
 
 http://maps.google.com/maps?hl=en&ie=UTF8&hq=&hnear=Madison,+Dane,+Wisconsin&ll=43.074684,-89.38188&spn=0.003006,0.00383&t=h&z=18&layer=c&cbll=43.07468,-89.381742&panoid=LhJ-PFHVzxRguJ6h616mmQ&cbp=12,355.53,,0,-1.32

 -Kevin


There would be an option to specify a callback for forming the request
(request_serialize_callback):
it would be passed the configuration parameters, the details of the query,
and the action (currently only SELECT).
So it can:
* add specific parameters (like a developer key)
* map a column name to a query parameter name (e.g. column name
"column", query parameter "q")
* create dynamic parameters

It would return the query string via the output parameter uri.
I also plan to add an output parameter request of composite type, to be
used in the future for passing POST parameters (or other HTTP headers
specific to the API).
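
A purely hypothetical sketch of what such a callback interface could look
like (all names and fields here are illustrative only, not the actual
www_fdw API):

#include "postgres.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
#include "nodes/nodes.h"

/* Hypothetical sketch only -- not the actual www_fdw API. */
typedef struct WwwFdwRequest
{
    List       *server_options; /* options from CREATE SERVER / user mapping */
    List       *quals;          /* simplified representation of the WHERE clause */
    CmdType     action;         /* currently only CMD_SELECT */
    StringInfo  uri;            /* output: query string appended to the base URL */
    List       *headers;        /* output: extra HTTP headers or POST parameters */
} WwwFdwRequest;

/* The callback would fill in the output fields based on the inputs. */
typedef void (*request_serialize_callback) (WwwFdwRequest *req);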

-- 
Alexander Soudakov
Developer Programmer
email: cyga...@gmail.com
skype: asudakov



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-29 Thread Kyotaro HORIGUCHI
This is a new version of the make_greater_string patch.

1. wchar.c:1532 pg_wchar_table: Restore the pg_wchar_table.

2. wchar.c:1371 pg_utf8_increment: Remove the dangerous memcpy; one
   memcpy is left because it's safe. Remove the code check after
   incrementing.

3. wchar.c:1429 pg_eucjp_increment: Remove the dangerous memcpy.
   Remove the code check after incrementing. Minor bug fix.

4. wchar.c:1654 pg_database_encoding_character_incrementer: Select the
   increment function with a switch statement.



horiguchi.kyotaro> This is a rebased patch of `Allow encoding specific character
horiguchi.kyotaro> incrementer' (https://commitfest.postgresql.org/action/patch_view?id=602).

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 3e84679..593dba6 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5653,6 +5653,18 @@ pattern_selectivity(Const *patt, Pattern_Type ptype)
 
 
 /*
+ * This function is the character increment function for bytea used in
+ * make_greater_string(); it has the same interface as pg_wchar_tbl.charinc.
+ */
+static bool byte_increment(unsigned char *ptr, int len)
+{
+	if (*ptr >= 255) return false;
+
+	(*ptr)++;
+	return true;
+}
+
+/*
  * Try to generate a string greater than the given string or any
  * string it is a prefix of.  If successful, return a palloc'd string
  * in the form of a Const node; else return NULL.
@@ -5691,6 +5703,7 @@ make_greater_string(const Const *str_const, FmgrInfo *ltproc, Oid collation)
 	int			len;
 	Datum		cmpstr;
 	text	   *cmptxt = NULL;
+	character_incrementer charincfunc;
 
 	/*
 	 * Get a modifiable copy of the prefix string in C-string format, and set
@@ -5752,27 +5765,38 @@ make_greater_string(const Const *str_const, FmgrInfo *ltproc, Oid collation)
 		}
 	}
 
+	if (datatype != BYTEAOID)
+		charincfunc = pg_database_encoding_character_incrementer();
+	else
+		charincfunc = byte_increment;
+
 	while (len > 0)
 	{
-		unsigned char *lastchar = (unsigned char *) (workstr + len - 1);
-		unsigned char savelastchar = *lastchar;
+		int charlen;
+		unsigned char *lastchar;
+		unsigned char savelastbyte;
+		Const	   *workstr_const;
+		
+		if (datatype == BYTEAOID)
+			charlen = 1;
+		else
+			charlen = len - pg_mbcliplen(workstr, len, len - 1);
+
+		lastchar = (unsigned char *) (workstr + len - charlen);
 
 		/*
-		 * Try to generate a larger string by incrementing the last byte.
+		 * savelastbyte has meaning only for datatype == BYTEAOID
 		 */
-		while (*lastchar < (unsigned char) 255)
-		{
-			Const	   *workstr_const;
+		savelastbyte = *lastchar;
 
-			(*lastchar)++;
+		/*
+		 * Try to generate a larger string by incrementing the last byte or
+		 * character.
+		 */
 
+		if (charincfunc(lastchar, charlen)) {
 			if (datatype != BYTEAOID)
-			{
-/* do not generate invalid encoding sequences */
-if (!pg_verifymbstr(workstr, len, true))
-	continue;
 workstr_const = string_to_const(workstr, datatype);
-			}
 			else
 workstr_const = string_to_bytea_const(workstr, len);
 
@@ -5787,26 +5811,17 @@ make_greater_string(const Const *str_const, FmgrInfo *ltproc, Oid collation)
 pfree(workstr);
 return workstr_const;
 			}
-
+			
 			/* No good, release unusable value and try again */
 			pfree(DatumGetPointer(workstr_const->constvalue));
 			pfree(workstr_const);
 		}
 
-		/* restore last byte so we don't confuse pg_mbcliplen */
-		*lastchar = savelastchar;
-
 		/*
-		 * Truncate off the last character, which might be more than 1 byte,
-		 * depending on the character encoding.
+		 * Truncate off the last character or restore last byte for BYTEA.
 		 */
-		if (datatype != BYTEAOID && pg_database_encoding_max_length() > 1)
-			len = pg_mbcliplen(workstr, len, len - 1);
-		else
-			len -= 1;
-
-		if (datatype != BYTEAOID)
-			workstr[len] = '\0';
+		len -= charlen;
+		workstr[len] = (datatype != BYTEAOID ? '\0' : savelastbyte);
 	}
 
 	/* Failed... */
diff --git a/src/backend/utils/mb/wchar.c b/src/backend/utils/mb/wchar.c
index f23732f..d6dc717 100644
--- a/src/backend/utils/mb/wchar.c
+++ b/src/backend/utils/mb/wchar.c
@@ -1336,6 +1336,195 @@ pg_utf8_islegal(const unsigned char *source, int length)
 
 /*
  *---
+ * character incrementer
+ *
+ * These functions accept charptr, a pointer to the first byte of a
+ * maybe-multibyte character.  They try to `increment' the character and
+ * return true on success.  If they return false, the character must be
+ * left untouched.  These functions must be implemented in correspondence
+ * with the verifiers; in other words, a character rewritten by one of
+ * these functions must pass the corresponding pg_*_verifier() check
+ * whenever true is returned.  Returning the result of the corresponding
+ * pg_*_verifier() call is a final safeguard against such an inconsistency
+ * when something goes wrong.
+ * ---
+ */
+
+#ifndef FRONTEND
+static 

Re: [HACKERS] bug of recovery?

2011-09-29 Thread Fujii Masao
On Tue, Sep 27, 2011 at 8:06 PM, Florian Pflug f...@phlo.org wrote:
 On Sep27, 2011, at 07:59 , Heikki Linnakangas wrote:
 On 27.09.2011 00:28, Florian Pflug wrote:
 On Sep26, 2011, at 22:39 , Tom Lane wrote:
 It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
 we (think we) have reached consistency, rather than leaving it to be
 done only when we exit recovery mode.

 I believe we also need to prevent the creation of restart points before
 we've reached consistency.

 Seems reasonable. We could still allow restartpoints when the hash table is 
 empty, though. And once we've reached consistency, we can throw an error 
 immediately in log_invalid_page(), instead of adding the entry in the hash 
 table.

 That mimics the way the rm_safe_restartpoint callbacks work, which is good.

 Actually, why don't we use that machinery to implement this? There's 
 currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need 
 to create one that checks whether invalid_page_tab is empty.

Okay, the attached patch prevents the creation of restartpoints, by using
the rm_safe_restartpoint callback, if we haven't reached a consistent state
yet and the invalid-page table is not empty. But the invalid-page table is
not tied to a specific resource manager, so using rm_safe_restartpoint for
that seems slightly odd. Is this OK?

Also, following the other suggestions, the patch changes XLogCheckInvalidPages()
so that it's called as soon as we've reached a consistent state, and changes
log_invalid_page() so that it emits a PANIC immediately if consistency has
already been reached. These are very good changes, I think, because they
enable us to notice immediately a serious problem that causes a PANIC error.
Without these changes, you might unfortunately only notice that the standby
database is corrupted when a failover happens. Though such a problem should
rarely happen, I think those changes are worth making.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
*** a/src/backend/access/transam/rmgr.c
--- b/src/backend/access/transam/rmgr.c
***
*** 16,21 
--- 16,22 
  #include "access/nbtree.h"
  #include "access/xact.h"
  #include "access/xlog_internal.h"
+ #include "access/xlogutils.h"
  #include "catalog/storage.h"
  #include "commands/dbcommands.h"
  #include "commands/sequence.h"
***
*** 25,31 
  
  
  const RmgrData RmgrTable[RM_MAX_ID + 1] = {
! 	{"XLOG", xlog_redo, xlog_desc, NULL, NULL, NULL},
  	{"Transaction", xact_redo, xact_desc, NULL, NULL, NULL},
  	{"Storage", smgr_redo, smgr_desc, NULL, NULL, NULL},
  	{"CLOG", clog_redo, clog_desc, NULL, NULL, NULL},
--- 26,32 
  
  
  const RmgrData RmgrTable[RM_MAX_ID + 1] = {
! 	{"XLOG", xlog_redo, xlog_desc, NULL, NULL, xlog_safe_restartpoint},
  	{"Transaction", xact_redo, xact_desc, NULL, NULL, NULL},
  	{"Storage", smgr_redo, smgr_desc, NULL, NULL, NULL},
  	{"CLOG", clog_redo, clog_desc, NULL, NULL, NULL},
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 557,563  static TimeLineID lastPageTLI = 0;
  static XLogRecPtr minRecoveryPoint;		/* local copy of
  		 * ControlFile->minRecoveryPoint */
  static bool updateMinRecoveryPoint = true;
! static bool reachedMinRecoveryPoint = false;
  
  static bool InRedo = false;
  
--- 557,563 
  static XLogRecPtr minRecoveryPoint;		/* local copy of
  		 * ControlFile->minRecoveryPoint */
  static bool updateMinRecoveryPoint = true;
! bool reachedMinRecoveryPoint = false;
  
  static bool InRedo = false;
  
***
*** 6840,6851  StartupXLOG(void)
  		LocalXLogInsertAllowed = -1;
  
  		/*
- 		 * Check to see if the XLOG sequence contained any unresolved
- 		 * references to uninitialized pages.
- 		 */
- 		XLogCheckInvalidPages();
- 
- 		/*
  		 * Perform a checkpoint to update all our recovery activity to disk.
  		 *
  		 * Note that we write a shutdown checkpoint rather than an on-line
--- 6840,6845 
***
*** 6982,6987  CheckRecoveryConsistency(void)
--- 6976,6987 
  		XLByteLE(minRecoveryPoint, EndRecPtr) &&
  		XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
  	{
+ 		/*
+ 		 * Check to see if the XLOG sequence contained any unresolved
+ 		 * references to uninitialized pages.
+ 		 */
+ 		XLogCheckInvalidPages();
+ 
  		reachedMinRecoveryPoint = true;
  		ereport(LOG,
  (errmsg("consistent recovery state reached at %X/%X",
*** a/src/backend/access/transam/xlogutils.c
--- b/src/backend/access/transam/xlogutils.c
***
*** 52,57  typedef struct xl_invalid_page
--- 52,73 
  static HTAB *invalid_page_tab = NULL;
  
  
+ /* Report a reference to an invalid page */
+ static void
+ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
+ 	BlockNumber blkno, bool present)
+ {
+ 	char	   *path = relpathperm(node, forkno);
+ 
+ 	if (present)
+ 		elog(elevel, "page %u of relation %s is uninitialized",
+ 			 blkno, path);
+ 	else
+ 		

Re: [HACKERS] bug of recovery?

2011-09-29 Thread Simon Riggs
On Thu, Sep 29, 2011 at 12:31 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Sep 27, 2011 at 8:06 PM, Florian Pflug f...@phlo.org wrote:
 On Sep27, 2011, at 07:59 , Heikki Linnakangas wrote:
 On 27.09.2011 00:28, Florian Pflug wrote:
 On Sep26, 2011, at 22:39 , Tom Lane wrote:
 It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
 we (think we) have reached consistency, rather than leaving it to be
 done only when we exit recovery mode.

 I believe we also need to prevent the creation of restart points before
 we've reached consistency.

 Seems reasonable. We could still allow restartpoints when the hash table is 
 empty, though. And once we've reached consistency, we can throw an error 
 immediately in log_invalid_page(), instead of adding the entry in the hash 
 table.

 That mimics the way the rm_safe_restartpoint callbacks work, which is good.

 Actually, why don't we use that machinery to implement this? There's 
 currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need 
 to create one that checks whether invalid_page_tab is empty.

 Okay, the attached patch prevents the creation of restartpoints by using
 rm_safe_restartpoint callback if we've not reached a consistent state yet
 and the invalid-page table is not empty. But the invalid-page table is not
 tied to the specific resource manager, so using rm_safe_restartpoint for
 that seems to slightly odd. Is this OK?

 Also, according to other suggestions, the patch changes 
 XLogCheckInvalidPages()
 so that it's called as soon as we've reached a consistent state, and changes
 log_invalid_page() so that it emits PANIC immediately if consistency is 
 already
 reached. These are very good changes, I think. Because they enable us to
 notice serious problem which causes PANIC error immediately. Without these
 changes, you unfortunately might notice that the standby database is corrupted
 when failover happens. Though such a problem might rarely happen, I think it's
 worth doing those changes.

Patch does everything we agreed it should.

Good suggestion from Florian.

This worries me slightly now, though, because the patch makes us PANIC
in a place we didn't use to, and once we do that we cannot restart the
server at all. Are we sure we want that? It's certainly a great way to
shake down errors in other code...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Bruce Momjian
Peter Eisentraut wrote:
 On ons, 2011-09-28 at 11:53 -0300, Alvaro Herrera wrote:
  Excerpts from Peter Eisentraut's message of mié sep 28 04:49:43 -0300 2011:
   On tis, 2011-09-27 at 16:13 -0700, Steve Crawford wrote:
It would perhaps be useful to add optional --old-confdir and 
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config 
files are in the datadir. 
   
   It should work the same way the postmaster itself works: If the given
   directory is not a data directory, look for the postgresql.conf file and
   look there for the location of the data directory.
  
  So we need a postmaster switch:
  
  postmaster --parse-config-and-report=data_directory
 
 Perhaps.  That might have some use for pg_ctl as well.

FYI, unless this is backpatched, which I doubt, it is only going to be
available for the _new_ cluster.

You are right that while pg_upgrade doesn't care about the location of
postgresql.conf and pg_hba.conf, we have to point to those to start the
server, and pg_upgrade does need to access some data files, so it also
needs to know about the data location.

I am unclear how to proceed with this, particularly with the backpatch
requirement.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Removing savepointLevel from TransactionState

2011-09-29 Thread Alvaro Herrera

Excerpts from Tom Lane's message of jue sep 29 02:11:52 -0300 2011:
 Gurjeet Singh singh.gurj...@gmail.com writes:
  I noticed that the savepointLevel member of TransactionStateData struct is
  initialized to 0 from TopTransactionStateData, and never incremented or
  decremented afterwards.
 
  Since this is a file-local struct I think we can simply get rid of all
  usages of this without any risk.
 
 ISTM you have detected a bug, not just dead code that should be removed.
 Surely those tests that throw error on savepointLevel change were
 meant to do something important?

This is the patch I submitted that introduced that bit:
http://archives.postgresql.org/pgsql-patches/2004-07/msg00292.php
which got committed as cc813fc2b8d9293bbd4d0e0d6a6f3b9cf02fe32f.

Amusingly, the savepointLevel thing was introduced there; I don't
remember the details but I think what it was intended to implement is
some sort of restriction laid out by the SQL standard's spelling of
savepoint commands.

... in fact, SQL 2008 talks about savepoint levels in 4.35.2,
"Savepoints".  And as far as Part 2: Foundation is concerned, I think
only routine invocation can cause the savepoint level to be changed.
That is, if you have a function that declares itself to have NEW SAVEPOINT
LEVEL, then that function is not allowed to roll back savepoints that
were created before it started.

Now, we already disallow functions from doing this at all; so it seems
that the missing feature for us is OLD SAVEPOINT LEVEL (which, according
to the standard, is the default behavior).  Since this is not
implementable on the current SPI abstraction, we cannot do much about
this.  But if we ever have transaction-controlling SPs, then it seems to
me that we ought to keep this and enable those to use it as appropriate.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp

2011-09-29 Thread Fujii Masao
On Thu, Sep 29, 2011 at 5:20 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@oss.ntt.co.jp wrote:
 Sorry for late to re-review.

Thanks!

  Nevertheless this is ok for all OSs, I don't know whether
  initializing TimestampTz(double, int64 is ok) field with 8 bytes
  zeros is OK or not, for all platforms. (It is ok for
  IEEE754-binary64).

 Which code are you concerned about?

 xlog.c: 5889
       beentry = pgstat_fetch_all_beentry();

       for (i = 0; i < MaxBackends; i++, beentry++)
       {
               xtime = beentry->st_xact_end_timestamp;

  I think the last line in quoted code above reads possibly
 zero-initialized double (or int64) value, then the doubted will
 be compared and copied to another double.

               if (result < xtime)
                       result = xtime;

I believe it's safe. Such code appears elsewhere in the source, too.
If it were unsafe, we would have received lots of bug reports related
to it by now, but we haven't.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread MauMau

From: Merlin Moncure mmonc...@gmail.com
can we see all of your memory settings plus physical memory?  the
solution is probably going to be reducing shared buffers an/or adding
physical memory.

Thank you for your response.

The amount of physical memory is 8GB, which is enough for the workload. I 
asked the customer for the output of SHOW ALL, but I haven't received it 
yet. However, shared_buffers should be less than 1.6GB because, as I wrote 
in the previous mail, the top command showed 1.6GB in the VIRT column before 
executing the somefunc() PL/pgSQL function.


The direct cause of the out-of-memory error is that the virtual memory became 
full. 32-bit Linux allows 3GB of user space in the virtual address space of 
each process. somefunc() used 1.0GB, which brought the total to 2.6GB of 
virtual memory. After somefunc(), VACUUM tried to allocate 256MB for 
maintenance_work_mem. That allocation failed because the virtual address 
space was almost full.


As you mentioned, decreasing shared_buffers will be one of the solutions. 
However, we want to know why somefunc() uses so much memory. Therefore, the 
following is the core question. Q2 and Q3 are supplementary ones. It is just 
my guess that RelCache/SysCache may be the cause.


2011/9/28 MauMau maumau...@gmail.com:
Q1: When are the RelCache/SysCache entries removed from 
CacheMemoryContext?

Are they removed only when the corresponding relations are deleted? If so,
many tables and indexes is not friendly for the current PostgreSQL?


Regards
MauMau




Re: [HACKERS] Feature proposal: www_fdw

2011-09-29 Thread Dickson S. Guedes
2011/9/28 Florian Pflug f...@phlo.org:
 On Sep28, 2011, at 15:32 , Alexander Soudakov wrote:
 Here you can find www_fdw feature documentation:
 http://wiki.postgresql.org/wiki/WWW_FDW

 Certainly looks useful (as a third-party extension, as others have already 
 pointed out)

+1.

 What I didn't quite understand is how one would pass (dynamic) parameters for 
 a GET request. For example, not too long ago I needed to access the Google 
 Maps API from postgres. I ended up using pl/python, and now wonder if your 
 FDW would support that use-case.


I'm working on a google_contacts_fdw for the Google Contacts API [1] but
got stuck on the authentication design. As you can see in [2], for the
Google API you should get an authorization token and store the Auth
value to use later in the same session. I'm wondering what the best
way is to cache this value for as long as possible, because currently,
when you need authentication in an FDW, you do it in the
fdw_routine->BeginForeignScan callback; in that situation, each SELECT
on the foreign table will redo the handshake, and some APIs could
block this. Many client libraries work fine by caching the Auth value.
How could WWW_FDW handle behavior like that, given that other Web APIs
have an authorization system like this [2]?


[1] http://code.google.com/apis/contacts/docs/3.0/developers_guide.html
[2] http://code.google.com/apis/gdata/articles/using_cURL.html


Regards.
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br



Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread Alvaro Herrera

Excerpts from MauMau's message of jue sep 29 09:23:48 -0300 2011:

 The amount of physical memory is 8GB, which is enough for the workload. I 
 asked the customer for the output of SHOW ALL, but I haven't received it 
 yet. However, shared_buffers should be less than 1.6GB because, as I wrote 
 in the previous mail, top command showed 1.6GB in VIRT column before 
 executing somefunc() PL/pgSQL function.

You don't really know this; some operating systems (Linux in particular)
do not show shared memory as in use by a process until it is accessed.
It may very well have well over 1.6 GB of shared_buffers, yet not show
that in VIRT.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Bruce Momjian
Bruce Momjian wrote:
 Peter Eisentraut wrote:
  On ons, 2011-09-28 at 11:53 -0300, Alvaro Herrera wrote:
   Excerpts from Peter Eisentraut's message of mié sep 28 04:49:43 -0300 
   2011:
On tis, 2011-09-27 at 16:13 -0700, Steve Crawford wrote:
 It would perhaps be useful to add optional --old-confdir and 
 --new-confdir parameters to pg_upgrade. If these parameters are absent
 then pg_upgrade would work as it does now and assume that the config 
 files are in the datadir. 

It should work the same way the postmaster itself works: If the given
directory is not a data directory, look for the postgresql.conf file and
look there for the location of the data directory.
   
   So we need a postmaster switch:
   
   postmaster --parse-config-and-report=data_directory
  
  Perhaps.  That might have some use for pg_ctl as well.
 
 FYI, unless this is backpatched, which I doubt, it is only going to be
 available for the _new_ cluster.
 
 You are right that while pg_upgrade doesn't care about the location of
 postgresql.conf and pg_hba.conf, we have to point to those to start the
 server, and pg_upgrade does need to access some data files, so it also
 needs to know about the data location.
 
 I am unclear how to proceed with this, particularly with the backpatch
 requirement.

Thinking some more, I don't need to know the data directory while the
server is down --- I am already starting it. pg_upgrade starts both the old
and new servers during its check phase, and it could look up the
data_directory GUC variable and use that value when accessing files.
That would work for both the old and new servers.  However, I assume that
is something we would not backpatch to 9.1.
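
A minimal sketch of that lookup, assuming a libpq connection to the
already-started server (error handling omitted; not pg_upgrade's actual
code):

#include <stdlib.h>
#include <string.h>
#include "libpq-fe.h"

/* Sketch: ask a running server where its real data directory lives. */
static char *
get_data_directory(PGconn *conn)
{
    PGresult   *res = PQexec(conn, "SHOW data_directory");
    char       *dir = NULL;

    if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
        dir = strdup(PQgetvalue(res, 0, 0));
    PQclear(res);
    return dir;
}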

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Removing savepointLevel from TransactionState

2011-09-29 Thread Gurjeet Singh
On Thu, Sep 29, 2011 at 8:10 AM, Alvaro Herrera
alvhe...@commandprompt.comwrote:


 Excerpts from Tom Lane's message of jue sep 29 02:11:52 -0300 2011:
  Gurjeet Singh singh.gurj...@gmail.com writes:
   I noticed that the savepointLevel member of TransactionStateData struct
 is
   initialized to 0 from TopTransactionStateData, and never incremented or
   decremented afterwards.
 
   Since this is a file-local struct I think we can simply get rid of all
   usages of this without any risk.
 
  ISTM you have detected a bug, not just dead code that should be removed.
  Surely those tests that throw error on savepointLevel change were
  meant to do something important?

 This is the patch I submitted that introduced that bit:
 http://archives.postgresql.org/pgsql-patches/2004-07/msg00292.php
 which got committed as cc813fc2b8d9293bbd4d0e0d6a6f3b9cf02fe32f.

 Amusingly, the savepointLevel thing was introduced there; I don't
 remember the details but I think what it was intended to implement is
 some sort of restriction laid out by the SQL standard's spelling of
 savepoint commands.

 ... in fact, SQL 2008 talks about savepoint levels in 4.35.2
 Savepoints.  And as far as Part 2: Foundation is concerned, I think
 only routine invocation can cause the savepoint level to be changed.
 That is, if you have a function that declares itself to have NEW SAVEPOINT
 LEVEL, then that function is not allowed to roll back savepoints that
 were created before it started.

 Now, we already disallow functions from doing this at all; so it seems
 that the missing feature for us is OLD SAVEPOINT LEVEL (which, according
 to the standard, is the default behavior).  Since this is not
 implementable on the current SPI abstraction, we cannot do much about
 this.  But if we ever have transaction-controlling SPs, then it seems to
 me that we ought to keep this and enable those to use it as appropriate.


I have seen some recent discussions around implementing procedures that
would allow transaction control, but I don't know at what stage those
conversations ended. If we are still at the hand-waving stage w.r.t. SPs,
then I would vote for removing this code and rethinking it as part of the
SP implementation.

Having seen some commits after the initial one that use this variable, ISTM
that we're maintaining a feature we never documented, or implemented for
that matter.

Dead code trips the unwary like me, and definitely does not help a
maintainer.

Regards,
-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread MauMau

From: Alvaro Herrera alvhe...@commandprompt.com

You don't really know this; some operating systems (Linux in particular)
does not show shared memory as in use by a process until it is accessed.
It may very well have well over 1.6 GB of shared_buffers, yet not show
that in VIRT.


Oh, really? When I started psql just after I set shared_buffers to 2500MB 
and ran pg_ctl start, ps -o vsz -p postgres_PID showed about 2500MB+some. 
ps's vsz is also the amount of virtual memory. But I want to know the 
shared_buffers setting.


Anyway, I'd appreciate it if anyone could tell me about RelCache/SysCache. As 
far as I can tell from the code, PostgreSQL seems to use memory for 
RelCache/SysCache without limit until the relations are dropped.


Regards
MauMau





Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread Merlin Moncure
On Thu, Sep 29, 2011 at 7:23 AM, MauMau maumau...@gmail.com wrote:
 From: Merlin Moncure mmonc...@gmail.com
 can we see all of your memory settings plus physical memory?  the
 solution is probably going to be reducing shared buffers an/or adding
 physical memory.

 Thank you for your response.

 The amount of physical memory is 8GB, which is enough for the workload. I
 asked the customer for the output of SHOW ALL, but I haven't received it
 yet. However, shared_buffers should be less than 1.6GB because, as I wrote
 in the previous mail, top command showed 1.6GB in VIRT column before
 executing somefunc() PL/pgSQL function.

 The direct cause of out of memory is that the virtual memory became full.
 32-bit Linux can allocate 3GB of user space in the virtual address space of
 each process. somefunc() used 1.0GB, which led to 2.6GB of virtual memory.
 After somefunc(), VACUUM tried to allocate 256MB of maintenance_work_mem.
 That allocation failed because the virtual address space was almost full.

 As you mentioned, decreasing shared_buffers will be one of the solutions.
 However, we want to know why somefunc() uses so much memory. Therefore, the
 following is the core question. Q2 and Q3 are supplementary ones. It is just
 my guess that RelCache/SysCache may be the cause.

Oh -- I missed earlier that this was a 32-bit O/S.  Well, I'd consider
drastically reducing shared buffers, down to, say, the 256-512MB range.
Postgres function plans and various other structures, tables, and
attributes are indeed cached and can use up a considerable amount of
memory in pathological cases -- this largely depends on the
number of tables/views, number of functions, and number of connections.
 I briefly looked at the relcache etc. a little while back on a related
complaint, and the takeaway is that the caching is heavy-handed and
fairly brute force, but legit and a huge win for most cases. This stuff
lives in the cache memory context, and a couple of users (not that
many) have bumped into high memory usage.  Solutions tend to include:

*) not rely on an implementation that requires 100,000 tables
*) use connection pooler
*) reset connections
*) go to 64 bit o/s
*) reduce shared_buffers for leaner memory profile (especially in 32 bit os)


Like I said, this doesn't really come up that often, but the 'real'
solution in terms of postgres is probably some kind of upper bound on
the amount of cache memory used, plus some intelligence in the cache
implementation.  This is tricky stuff, though, and so far no credible
proposals have been made and the demand for the feature is not very
high.

merlin



Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Alvaro Herrera

Excerpts from Bruce Momjian's message of jue sep 29 09:56:09 -0300 2011:
 
 Thinking some more, I don't need to know the data directory while the
 server is down --- I already am starting it. pg_upgrade starts both old
 and new servers during its check phase, and it could look up the
 data_directory GUC variable and use that value when accessing files. 
 That would work for old and new servers.  However, I assume that is
 something we would not backpatch to 9.1.  

Why not?  We've supported separate data/config dirs for a long time now,
so it seems to me that pg_upgrade not coping with them is a bug.  If
pg_upgrade starts the postmaster, it seems simple to grab the
data_directory setting, does it not?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Feature proposal: www_fdw

2011-09-29 Thread Florian Pflug
On Sep29, 2011, at 14:45 , Dickson S. Guedes wrote:
 I'm working on a google_contacts_fdw to google contacts api [1] but
 stopped in the authentication design. As you can see in [2], for
 google api, you should get an authorization token and store the Auth
 value to use latter on the same session. I'm wondering how the best
 way to cache this value as long as possible, because actually, when
 you need authentication for a FDW, you use the
 fdw_routine->BeginForeignScan call function but, in this situation,
 each SELECT to foreign table will do the handshake and some APIs could
 block this. Many client libraries work fine, caching the Auth value.
 How WWW_FDW could play with behaviors like that, since other Web APIs
 has the a authorization system like this [2]?

You could use a hash table, allocated in the top-level memory context,
to store one authentication token per combination of server and local user.

I suggest you look at the MySQL FDW (https://github.com/dpage/mysql_fdw)
- they presumably re-use the same connection over multiple foreign scans,
which seems to be a problem similar to yours.

best regards,
Florian Pflug




Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread Robert Haas
On Thu, Sep 29, 2011 at 9:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Like I said, this doesn't really come up this often but the 'real'
 solution in terms of postgrs is probably some kind of upper bound in
 the amount of cache memory used plus some intelligence in the cache
 implementation.  This is tricky stuff though and so far no credible
 proposals have been made and the demand for the feature is not very
 high.

We (i.e. $EMPLOYER) have a customer who ran into this problem (i.e.
relcache/syscache memory usage shooting through the roof) in testing,
so I'm somewhat motivated to see if we can't come up with a fix.  I am
fairly sure that was on a 64-bit build, so the issue wasn't just that
they didn't have enough address space.  It seems that we used to have
some kind of LRU algorithm to prevent excessive memory usage, but we
ripped it out because it was too expensive (see commit
8b9bc234ad43dfa788bde40ebf12e94f16556b7f).  I don't have a brilliant
idea at the moment, but I wonder if we could come up with something
that's cheap enough to manage that it doesn't materially affect
performance in normal cases, but just kicks in when things get really
out of control.

A trivial algorithm would be - if you're about to run out of memory,
flush all the caches; or evict 10% of the entries at random.  Of
course, the problem with anything like this is that it's hard to know
when you're about to run out of memory before you actually do, and any
hard-coded limit you care to set will sometimes be wrong.  So maybe
that's not the right approach.  At the same time, I don't think that
simply hoping the user has enough memory is an adequate answer.
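
For illustration only, random eviction over a generic dynahash table might
look something like the sketch below; the real relcache/syscache have their
own structures, so this is just the shape of the idea, not a proposal:

#include "postgres.h"
#include "utils/hsearch.h"

/*
 * Sketch: evict roughly the given percentage of entries from a dynahash
 * table at random.  Assumes the hash key is the first field of each entry,
 * which is the usual dynahash convention, so the entry pointer can double
 * as the key pointer for HASH_REMOVE.
 */
static void
evict_random_fraction(HTAB *cache, int percent)
{
    HASH_SEQ_STATUS status;
    void       *entry;

    hash_seq_init(&status, cache);
    while ((entry = hash_seq_search(&status)) != NULL)
    {
        /* Deleting the just-returned entry during a seq scan is allowed. */
        if (random() % 100 < percent)
            hash_search(cache, entry, HASH_REMOVE, NULL);
    }
}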

One thing to consider is that in some cases a user may plan to do
something like touch every table in the database exactly once and then
exit.  In that case, if we knew in advance what the user's intentions
were, we'd want to use an MRU eviction algorithm rather than LRU.
Again, we don't know that in advance.  But in such a use case it's
reasonable for the user to expect that the amount of backend-private
memory used for caching will not grow without bound.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] bug of recovery?

2011-09-29 Thread Florian Pflug
On Sep29, 2011, at 13:49 , Simon Riggs wrote:
 This worries me slightly now though because the patch makes us PANIC
 in a place we didn't used to and once we do that we cannot restart the
 server at all. Are we sure we want that? It's certainly a great way to
 shake down errors in other code...

The patch only introduces a new PANIC condition during archive recovery,
though. Crash recovery is unaffected, except that we no longer create
restart points before we reach consistency.

Also, if we hit an invalid page reference after reaching consistency,
the cause is probably either a bug in our recovery code, or (quite unlikely)
a corrupted WAL that passed the CRC check. In both cases, the likelihood
of data corruption seems high, so PANICing seems like the right thing to do.

best regards,
Florian Pflug




Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread MauMau

From: Merlin Moncure mmonc...@gmail.com
--
Oh -- I missed earlier that this was 32 bit o/s.  Well, I'd consider
drastically reducing shared buffers, down to say 256-512mb range.
Postgres function plans and various other structures, tables,
attributes are indeed cached and can use up a considerable amount of
memory in pathological cases -- this is largely depending on the
number of tables/views, number of functions and number of connections.
I briefly looked at the relcache etc a little while back on a related
complaint and the takeaway is that the caching is heavy handed and
fairly brute force but legit and a huge win for most cases. This stuff
lives in the cache memory context and a couple of users (not that
many) have bumped into high memory usage.  Solutions tend to include:

*) not rely on an implementation that requires 100,000 tables
*) use connection pooler
*) reset connections
*) go to 64 bit o/s
*) reduce shared_buffers for leaner memory profile (especially in 32 bit os)


Like I said, this doesn't really come up this often but the 'real'
solution in terms of postgrs is probably some kind of upper bound in
the amount of cache memory used plus some intelligence in the cache
implementation.  This is tricky stuff though and so far no credible
proposals have been made and the demand for the feature is not very
high.
--


Thank you very much. I'm relieved that I now understand the reason. I will 
report it to the customer and ask him to consider taking the following 
measures:


* reduce shared_buffers
* run somefunc() and VACUUM in different psql sessions
* process 100,000 tables in multiple psql sessions

Regards
MauMau





Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread Merlin Moncure
On Thu, Sep 29, 2011 at 8:59 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Sep 29, 2011 at 9:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Like I said, this doesn't really come up this often but the 'real'
 solution in terms of postgrs is probably some kind of upper bound in
 the amount of cache memory used plus some intelligence in the cache
 implementation.  This is tricky stuff though and so far no credible
 proposals have been made and the demand for the feature is not very
 high.

 We (i.e. $EMPLOYER) have a customer who ran into this problem (i.e.
 relcache/syscache memory usage shooting through the roof) in testing,
 so I'm somewhat motivated to see if we can't come up with a fix.  I am
 fairly sure that was on a 64-bit build, so the issue wasn't just that
 they didn't have enough address space.  It seems that we used to have
 some kind of LRU algorithm to prevent excessive memory usage, but we
 rippped it out because it was too expensive (see commit
 8b9bc234ad43dfa788bde40ebf12e94f16556b7f).  I don't have a brilliant
 idea at the moment, but I wonder if we could come up with something
 that's cheap enough to manage that it doesn't materially affect
 performance in normal cases, but just kicks in when things get really
 out of control.

 A trivial algorithm would be - if you're about to run out of memory,
 flush all the caches; or evict 10% of the entries at random.  Of
 course, the problem with anything like this is that it's hard to know
 when you're about to run out of memory before you actually do, and any
 hard-coded limit you care to set will sometimes be wrong.  So maybe
 that's not the right approach.  At the same time, I don't think that
 simply hoping the user has enough memory is an adequate answer.

 One thing to consider is that in some cases a user may plan to do
 something like touch every table in the database exactly once and then
 exit.  In that case, if we knew in advance what the user's intentions
 were, we'd want to use an MRU eviction algorithm rather than LRU.
 Again, we don't know that in advance.  But in such a use case it's
 reasonable for the user to expect that the amount of backend-private
 memory used for caching will not grow without bound.

I think this (cache memory usage) is a reasonable setting for a GUC.
Maybe if you keep it very simple, say, only activate cache cleanup when
the limit is exceeded, you have more freedom to dump cache entries using
fancier methods like a calculated benefit.  You'd probably have to
expose another knob to guarantee a maximum cache sweep runtime, though.
Perhaps even user-visible cache management features (an extension of
DISCARD?) could be exposed...

Hm, what might make this complicated is that you'd probably want all
the various caches to live under the same umbrella with a central
authority making decisions about what stays and what goes.

On Thu, Sep 29, 2011 at 9:22 AM, MauMau maumau...@gmail.com wrote:
 * reduce shared_buffers
 * run somefunc() and VACUUM in different psql sessions
 * process 100,000 tables in multiple psql sessions

that's a start.  don't be afraid to reset the connection after
somefunc() and at appropriate times from the 'processors'.

merlin



Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Mr. Aaron W. Swenson
On Thu, Sep 29, 2011 at 10:44:29AM -0300, Alvaro Herrera wrote:
 
 Excerpts from Bruce Momjian's message of jue sep 29 09:56:09 -0300 2011:
  
  Thinking some more, I don't need to know the data directory while the
  server is down --- I already am starting it. pg_upgrade starts both old
  and new servers during its check phase, and it could look up the
  data_directory GUC variable and use that value when accessing files. 
  That would work for old and new servers.  However, I assume that is
  something we would not backpatch to 9.1.  
 
 Why not?  We've supported separate data/config dirs for a long time now,
 so it seems to me that pg_upgrade not coping with them is a bug.  If
 pg_upgrade starts postmaster, it seems simple to grab the data_directory
 setting, is it not?

I was going to say the same. I'd view this as bringing the behavior of
pg_upgrade into a consistent state with postgres. I vote for it being
backpatched even to 9.0, for whatever my vote is worth.

-- 
Mr. Aaron W. Swenson
Pseudonym: TitanOfOld
Gentoo Developer




Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 Anyway, I'd appreciate if anyone could tell me about RelCache/SysCache. As 
 far as I read the code, PostgreSQL seems to use memory for RelCache/SysCache 
 without limit until the relations are dropped.

That's correct.  We used to have a limit on the size of catcache
(if memory serves, it was something like 5000 entries).  We got rid of
it after observing that performance fell off a cliff as soon as you had
a working set larger than the cache limit.  Trust me, if we had a limit,
you'd still be here complaining, the complaint would just take a
different form ;-)

I concur with Merlin's advice to rethink your schema.  100,000 tables is
far beyond what any sane design could require, and is costing you on
many levels (I'm sure the OS and filesystem aren't that happy with it
either).

regards, tom lane



Re: [HACKERS] Feature proposal: www_fdw

2011-09-29 Thread Dickson S. Guedes
2011/9/29 Florian Pflug f...@phlo.org:
 You could use a hash table, allocated in the top-level memory context,
 to store one authentication token per combination of server and local user.

In fact I started something along these lines with ldap_fdw, stashing the
connection away using a memory context and something based on es_query_cxt
from EState, just experimenting so far. I haven't figured out yet how to
do this from PlanForeignScan.

 I suggest you look at the MySQL FDW (https://github.com/dpage/mysql_fdw)
 - they presumably re-use the same connection over multiple foreign scans,
 which seems to be a problem similar to yours.

From what I understand, they re-use the connection between BeginForeignScan
and the subsequent IterateForeignScan calls, freeing it at the end. In my
tests, there is a (re)connection for each SELECT * FROM ...

I'm wondering whether it would be nice to have some built-in facilities
(like this kind of cache between calls) provided by www_fdw for such
WWW-API-based FDWs.

Regards.
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br



Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 ... It seems that we used to have
 some kind of LRU algorithm to prevent excessive memory usage, but we
 rippped it out because it was too expensive (see commit
 8b9bc234ad43dfa788bde40ebf12e94f16556b7f).

Not only was it too expensive, but performance fell off a cliff as soon
as you had a catalog working set large enough to cause the code to
actually do something.  I'm not in favor of putting anything like that
back in; people who have huge catalogs will just start complaining
about something different, ie, why did their apps get so much slower.

The short answer here is: if you want a database with 100,000 tables,
you'd better be running it on more than desktop-sized hardware.

regards, tom lane



Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread MauMau

From: Tom Lane t...@sss.pgh.pa.us

That's correct.  We used to have a limit on the size of catcache
(if memory serves, it was something like 5000 entries).  We got rid of
it after observing that performance fell off a cliff as soon as you had
a working set larger than the cache limit.  Trust me, if we had a limit,
you'd still be here complaining, the complaint would just take a
different form ;-)


Yes, I can imagine. Now I understand that caching catalog entries in local 
memory without bound is one of PostgreSQL's deliberate choices for 
performance. 64-bit computing makes that approach legitimate. Oracle avoids 
duplicate catalog entries by storing them in shared memory, but that should 
necessitate some kind of locking when accessing the shared catalog entries. 
PostgreSQL's approach, which does not require locking, is better for 
many-core environments.



I concur with Merlin's advice to rethink your schema.  100,000 tables is
far beyond what any sane design could require, and is costing you on
many levels (I'm sure the OS and filesystem aren't that happy with it
either).


I agree. I'll suggest that to the customer, too. Thank you very much.

Regards
MauMau





Re: [HACKERS] Feature proposal: www_fdw

2011-09-29 Thread Florian Pflug
On Sep29, 2011, at 16:43 , Dickson S. Guedes wrote:
 2011/9/29 Florian Pflug f...@phlo.org:
 You could use a hash table, allocated in the top-level memory context,
 to store one authentication token per combination of server and local user.
 
 In fact I started something in this way, with ldap_fdw, stashing the
 connection away using memory context and something using es_query_cxt
 from EState, just testing until now. How do this from PlanForeignScan
 I couldn't figure out yet.

Maybe I'm missing something, but I'd say just allocate the hash table in
TopMemoryContext (or however that's called) and store a reference to it in
a global variable. At least in the RESTful API case, you don't really
need to worry about purging entries from the table, I think. You might
want to use (server, remote user) instead of (server, local user) as the
key, though. That should avoid unnecessary authentication steps and hashtable
entries if multiple local users are mapped to the same remote user, which
is probably quite common for webservices.
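
A minimal sketch of that approach, assuming PostgreSQL's dynahash API; the
key layout and sizes below are illustrative only:

#include "postgres.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"

/* Illustrative cache entry: key is "servername/remoteuser", value is the token. */
typedef struct AuthCacheEntry
{
    char        key[128];       /* hash key: server name + remote user */
    char        token[256];     /* cached authentication token */
} AuthCacheEntry;

static HTAB *auth_cache = NULL;

static void
auth_cache_init(void)
{
    HASHCTL     ctl;

    MemSet(&ctl, 0, sizeof(ctl));
    ctl.keysize = 128;
    ctl.entrysize = sizeof(AuthCacheEntry);
    ctl.hcxt = TopMemoryContext;    /* lives for the whole backend, across queries */

    auth_cache = hash_create("www_fdw auth tokens", 16, &ctl,
                             HASH_ELEM | HASH_CONTEXT);
}

Lookups would then use hash_search(auth_cache, key, HASH_ENTER, &found) and
fill in the token field whenever found comes back false.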

 I suggest you look at the MySQL FDW (https://github.com/dpage/mysql_fdw)
 - they presumably re-use the same connection over multiple foreign scans,
 which seems to be a problem similar to yours.
 
 From what I understand they re-use between BeginForeignScan and the
 subsequent IterateForeignScans and freeing at end. In my tests, there
 is a (re)connection for each SELECT * FROM ...

Oh, OK, I didn't know that. They're probably not the best model, then...

best regards,
Florian Pflug





Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Bruce Momjian
Mr. Aaron W. Swenson wrote:
-- Start of PGP signed section.
 On Thu, Sep 29, 2011 at 10:44:29AM -0300, Alvaro Herrera wrote:
  
  Excerpts from Bruce Momjian's message of jue sep 29 09:56:09 -0300 2011:
   
   Thinking some more, I don't need to know the data directory while the
   server is down --- I already am starting it. pg_upgrade starts both old
   and new servers during its check phase, and it could look up the
   data_directory GUC variable and use that value when accessing files. 
   That would work for old and new servers.  However, I assume that is
   something we would not backpatch to 9.1.  
  
  Why not?  We've supported separate data/config dirs for a long time now,
  so it seems to me that pg_upgrade not coping with them is a bug.  If
  pg_upgrade starts postmaster, it seems simple to grab the data_directory
  setting, is it not?
 
 I was going to say. I'd view this as bringing the behavior of pg_upgrade
 to a consistent state with postgres. I vote for it being backpatched to
 9.0 even. For whatever my vote is worth.

Well, I would call it more of a limitation of pg_upgrade than a
bug --- perhaps documenting the limitation in the back branches is
sufficient.  I think the only big argument for backpatching the feature
is that 9.1 is the first release where packagers are going to use
pg_upgrade, and fixing it now makes sense because it saves packagers
from implementing, on every platform, a workaround that will go away with
9.2.

So, there are four options:

1) document the limitation and require users to use symlinks
2) add a --old/new-configdir parameter to pg_upgrade
3) have pg_upgrade find the real data dir by starting the server
4) add a flag to some tool to return the real data dir, and backpatch
   that

#4 has the problem of backpatching.  I looked at #3 and the first thing
pg_upgrade does is to read PG_VERSION, and the second thing is to call
pg_controldata, both of which need the real data dir, so it is going to
require some major code ordering adjustments to do #3.

One interesting trick would be to start the old and new servers to pull
the data dir at the start only if we don't see PG_VERSION in the
specified data directory --- that would limit the overhead of starting
the servers, and limit the risk for backpatching.
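
The quick check itself is trivial --- something like the sketch below;
the helper named in the comment is made up just to illustrate the flow,
it is not existing pg_upgrade code:

/* Is the user-supplied directory a real data directory, or merely a
 * config directory?  Sketch only, not pg_upgrade source. */
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

static bool
looks_like_data_directory(const char *dir)
{
    char        path[4096];
    struct stat st;

    snprintf(path, sizeof(path), "%s/PG_VERSION", dir);
    return stat(path, &st) == 0;
}

/*
 * Then pay the cost of starting the old/new server to ask it for
 * data_directory only when the quick check fails, e.g.:
 *
 *     if (!looks_like_data_directory(user_supplied_dir))
 *         real_datadir = ask_server_for_data_directory(user_supplied_dir);
 *
 * (ask_server_for_data_directory is a made-up name for the step of
 * starting the server and reading back the data_directory GUC.)
 */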

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread Robert Haas
On Thu, Sep 29, 2011 at 10:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 ... It seems that we used to have
 some kind of LRU algorithm to prevent excessive memory usage, but we
 ripped it out because it was too expensive (see commit
 8b9bc234ad43dfa788bde40ebf12e94f16556b7f).

 Not only was it too expensive, but performance fell off a cliff as soon
 as you had a catalog working set large enough to cause the code to
 actually do something,  ...

Sure, a big working set is going to cause a performance problem if you
start flushing cache entries that are being regularly used.  But the
point is that just because you have, at some time, accessed 100,000 tables
during a session does not mean that your working set is that large.
The working set is the set of things that you are actually using
regularly, not the things you've *ever* accessed.

In addition to the problem of blowing out memory, there are a number
of other things about the current code that don't seem well-suited to
dealing with large numbers of tables.  For example, catcache hash
tables can't be resized, so for very large numbers of entries you can
potentially have to walk a very long chain.  And, you can exhaust the
shared memory space for the primary lock table, leading to, for
example, inability to back up the database using pg_dump (ouch!).

I can't really explain why people seem to keep wanting to create
hundreds of thousands or even millions of tables, but it's not like
MauMau's customer is the first one to try to do this, and I'm sure
they won't be the last.  I don't want to de-optimize the more common
(and sensible) cases too much, but "slow" still trumps "fails outright".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_regress input/output directory option

2011-09-29 Thread Robert Haas
On Wed, Sep 28, 2011 at 12:21 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Why are there no options in pg_regress to specify the directory where input and
 output data are located?
 Such options would bring more flexibility when running regressions without
 make check/installcheck for an external application.
 Is there a particular reason for that? Or do you think that pg_regress
 should be only used with make check?

It has --inputdir and --outputdir - is that not what you need?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Does RelCache/SysCache shrink except when relations are deleted?

2011-09-29 Thread Merlin Moncure
On Thu, Sep 29, 2011 at 10:22 AM, Robert Haas robertmh...@gmail.com wrote:
 I can't really explain why people seem to keep wanting to create
 hundreds of thousands or even millions of tables, but it's not like
 MauMau's customer is the first one to try to do this, and I'm sure
 they won't be the last.  I don't want to de-optimize the more common
 (and sensible) cases too much, but "slow" still trumps "fails outright".

Yeah -- maybe baby steps in the right direction would be to track cache
memory usage and add instrumentation so the user could get a readout
on usage -- this would also help us diagnose memory issues in the
field.  Also, thinking about it more, a DISCARD based cache flush
(DISCARD CACHES TO xyz) wrapping a monolithic LRU sweep could help
users deal with these cases without having to figure out how to make
an implementation that pleases everyone.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-29 Thread Linas Virbalas
 Linas, could you capture the output of pg_controldata *and* increase the
 log level to DEBUG1 on the standby? We should then see nextXid value of
 the checkpoint the recovery is starting from.

I'll try to do that whenever I'm in that territory again... Incidentally,
recently there was a lot of unrelated-to-this-post work to polish things up
for a talk being given at PGWest 2011 today :)

 I also checked what rsync does when a file vanishes after rsync computed the
 file list, but before it is sent. rsync 3.0.7 on OSX, at least, complains
 loudly, and doesn't sync the file. It BTW also exits non-zero, with a special
 exit code for precisely that failure case.

To be precise, my script has logic to accept the exit code 24, just as
stated in the PG manual:

Docs For example, some versions of rsync return a separate exit code for
Docs vanished source files, and you can write a driver script to accept
Docs this exit code as a non-error case.

--
Sincerely,
Linas Virbalas
http://flyingclusters.blogspot.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] have SLRU truncation use callbacks

2011-09-29 Thread Alvaro Herrera
Hi,

Currently, the mechanism that SLRU uses to truncate directory entries is
hardcoded in SlruScanDirectory.  It receives a cutoff page number and a
boolean flag; segments found in the SLRU directory that are below the
cutoff are deleted -- iff the flag is true.

This is fine for most current uses, with one exception, but it is a bit
too ad-hoc.  The exception is the new NOTIFY code (async.c); on
postmaster (re)start, that module wants to simply zap all files found in
the directory.  So it forcibly sets a different pagePrecedes routine,
then calls the truncation procedure with a made-up cutoff page.

Also, clog.c has a check that scans the directory to figure out if
any segments would be removed, were we to do the thing for real.  (This
is so that it can skip the WAL flush in case none would be.)

(Now, this code is also getting in the way of some changes I want to do
on multixact truncation for the keylocks patch; hence the motivation.
But I think these changes stand on their own, merely on code clarity
grounds).

So what I propose is that we turn SlruScanDirectory into a mere
directory walker that invokes a callback on each file.  We provide
three callbacks: one that simply removes everything (for NOTIFY); one
that removes everything before a given cutoff page; and a simple one
that reports whether there is any removable file given this cutoff, for
clog.c's use.
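
To make that concrete, here is a rough sketch of the kind of interface I
have in mind; names and exact signatures are illustrative, see the
attached patch for the real thing:

#include "postgres.h"
#include "access/slru.h"

/* Callback invoked once per segment file found in the SLRU directory;
 * returning true ends the scan early. */
typedef bool (*SlruScanCallback) (SlruCtl ctl, char *filename,
                                  int segpage, void *data);

/* SlruScanDirectory becomes a mere directory walker. */
extern bool SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback,
                              void *data);

/* Example callback for the clog.c case: report whether any segment
 * before the cutoff page exists, without deleting anything. */
static bool
SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int segpage,
                            void *data)
{
    int         cutoffPage = *(int *) data;

    if (ctl->PagePrecedes(segpage, cutoffPage))
        return true;            /* found a removable segment; stop scanning */

    return false;               /* keep walking */
}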

Patch is attached.

Opinions?

-- 
Álvaro Herrera alvhe...@alvh.no-ip.org


slru-truncate-callbacks.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Temporary tables and in-memory use

2011-09-29 Thread Marios Vodas
Hello,

If I'm not wrong, temporary tables stay in memory if they do not go over the
temp_buffers limit (e.g. if temp_buffers is 2GB and the size of the table is
300MB, the table will remain in memory).
What if a column is variable-length (e.g. text)?  How does such a column stay
in memory, since it should be stored in TOAST?
When I build a GiST index on a temporary table, does the index stay in memory
as well?

Thank you,
Marios


Re: [HACKERS] Temporary tables and in-memory use

2011-09-29 Thread Tom Lane
Marios Vodas mvo...@gmail.com writes:
 If I'm not wrong, temporary tables stay in memory if they do not go over
 temp_buffers limit (e.g. if temp_buffers is 2GB and the size of the table is
 300MB the table will remain in memory).
 What if a column is variable length (e.g. text), how does this column stay
 in-memory since it should be stored in TOAST?

Well, the toast table is also temp, so it'll get cached in temp_buffers
as well, as long as it fits.

 When I build a GiST index on a temporary table does the index stay in memory
 as well?

Same answer.

Keep in mind that temp_buffers is per process, not global.  Just as with
work_mem, you need to be careful about setting it sky-high.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Steve Crawford

On 09/29/2011 08:20 AM, Bruce Momjian wrote:

...
1  document the limitation and require users to use symlinks
2  add a --old/new-configdir parameter to pg_upgrade
3  have pg_upgrade find the real data dir by starting the server
4  add a flag to some tool to return the real data dir, and backpatch
that
5. (really 3a). Have pg_upgrade itself check the specified --XXX-datadir 
for postgresql.conf and use the data_directory setting therein using the 
same rules as followed by the server.


This would mean that there are no new options to pg_upgrade and that 
pg_upgrade operation would not change when postgresql.conf is in the 
data-directory. This would also make it consistent with PostgreSQL's 
notion of file-locations:


If you wish to keep the configuration files elsewhere than the data 
directory, the postgres -D command-line option or PGDATA environment 
variable must point to the directory containing the configuration files, 
and the data_directory parameter must be set in postgresql.conf...


So for backporting, it could just be considered a bug fix that aligns 
pg_upgrade's interpretation of datadir to that of the server.


Cheers,
Steve

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Temporary tables and in-memory use

2011-09-29 Thread Marios Vodas
Thank you. The setup is intended for a single-user environment running complex
queries and operations; that's why I wrote 2GB for temp_buffers!

Thank you again, I really appreciate it.
Marios

On 29/9/2011 7:55 μμ, Tom Lane wrote:

Marios Vodasmvo...@gmail.com  writes:

If I'm not wrong, temporary tables stay in memory if they do not go over
temp_buffers limit (e.g. if temp_buffers is 2GB and the size of the table is
300MB the table will remain in memory).
What if a column is variable length (e.g. text), how does this column stay
in-memory since it should be stored in TOAST?

Well, the toast table is also temp, so it'll get cached in temp_buffers
as well, as long as it fits.


When I build a GiST index on a temporary table does the index stay in memory
as well?

Same answer.

Keep in mind that temp_buffers is per process, not global.  Just as with
work_mem, you need to be careful about setting it sky-high.

regards, tom lane



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Displaying accumulated autovacuum cost

2011-09-29 Thread Alvaro Herrera


I reviewed this patch.  My question for you is: does it make sense to
enable the reporting of write rate even when vacuum cost accounting is
enabled?  In my opinion it would be useful to do so.  If you agree,
please submit an updated patch.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] have SLRU truncation use callbacks

2011-09-29 Thread Kevin Grittner
Alvaro Herrera alvhe...@alvh.no-ip.org wrote:
 
 But I think these changes stand on their own, merely on code
 clarity grounds).
 
After a quick scan, I think it helps with that.  This was a messy
area to deal with in SSI given the old API; with this change I think
we could make that part of the SSI code clearer and maybe clean up
unneeded files in a couple places where cleanup under the old API
was judged not to be worth the code complexity needed to make it
happen.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-29 Thread Florian Pflug
On Sep29, 2011, at 17:44 , Linas Virbalas wrote:
 I also checked what rsync does when a file vanishes after rsync computed the
 file list, but before it is sent. rsync 3.0.7 on OSX, at least, complains
 loudly, and doesn't sync the file. It BTW also exits non-zero, with a special
 exit code for precisely that failure case.
 
 To be precise, my script has logic to accept the exit code 24, just as
 stated in PG manual:
 
 Docs For example, some versions of rsync return a separate exit code for
 Docs vanished source files, and you can write a driver script to accept
 Docs this exit code as a non-error case.

Oh, cool. I was about to suggest that we add something along these lines
to the docs - guess I should RTFM from time to time ;-)

best regards,
Florian Pflug

 

 
 --
 Sincerely,
 Linas Virbalas
 http://flyingclusters.blogspot.com/
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

2011-09-29 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  Tom Lane  wrote:
 
 Hmm, why is that patch the one posted for review, when several
 better versions were already discussed? See thread starting here:
 http://archives.postgresql.org/pgsql-hackers/2011-07/msg00028.php
  
 The patch I reviewed was added to the CF app before the post you
 cite was written.  I didn't see it in following the links
 (probably because it crossed a month boundary).  Thanks for
 pointing that out; I'll update the CF app and review the later
 versions.
 
The first patch Tom posted was a clear improvement on Heikki's
original patch, and performed slightly better.
 
The second patch Tom posted makes the patch more robust in the face
of changes to the structure, but reduces the performance benefit on
the dictionary, and performs very close to the unpatched version for
samples of English text.  If anything, it seems to be slower with
real text, but the difference is so small it might be noise.  (The
dictionary tests are skewed toward longer average word length than
typically found in actual readable text.)  I wonder whether the code
for the leading, unaligned portion of the data could be written such
that it was effectively unrolled and the length resolved at compile
time rather than figured out at run time for every call to the
function.  That would solve the robustness issue without hurting
performance.  If we don't do something like that, this patch doesn't
seem worth applying.
 
Heikki's second version, a more radical revision optimized for 64
bit systems, blows up on a 32 bit compile, writing off the end of
the structure.  Personally, I'd be OK with sacrificing some
performance for 32 bit systems to get better performance on 64 bit
systems, since people who care about performance generally seem to
be on 64 bit builds these days -- but it has to run.  Given Tom's
reservations about this approach, I don't know whether Heikki is
interested in fixing the crash so it can be benchmarked.  Heikki?
 
I will do a set of more carefully controlled performance tests on
whatever versions are still in the running after we sort out the
issues above.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Updated version of pg_receivexlog

2011-09-29 Thread Magnus Hagander
On Thu, Sep 29, 2011 at 01:55, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Wed, Sep 28, 2011 at 12:50 PM, Magnus Hagander mag...@hagander.net wrote:

 pg_receivexlog worked good in my tests.

 pg_basebackup with --xlog=stream gives me an already recycled wal
 segment message (note that the file was in pg_xlog in the standby):
 FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
 segment 0001005C has already been removed

 Do you get this reproducibly? Or did you get it just once?

 And when you say in the standby what are you referring to? There is
 no standby server in the case of pg_basebackup --xlog=stream, it's
 just backup... But are you saying pg_basebackup had received the file,
 yet tried to get it again?


 ok, i was trying to setup a standby server cloning with
 pg_basebackup... i can't use it that way?

 the docs says:
 
 If this option is specified, it is possible to start a postmaster
 directly in the extracted directory without the need to consult the
 log archive, thus making this a completely standalone backup.
 

 it doesn't say that is not possible to use this for a standby
 server... probably that's why i get the error i put a recovery.conf
 after pg_basebackup finished... maybe we can say that  more loudly?

The idea is, if you use it with -x (or --xlog), it's for taking a
backup/clone, *not* for replication.

If you use it without -x, then you can use it as the start of a
replica, by adding a recovery.conf.

But you can't do both at once, that will confuse it.


 in other things:
 do we need to include src/bin/pg_basebackup/.gitignore in the patch?

 Not sure what you mean? We need to add pg_receivexlog to this file,
 yes - in head it just contains pg_basebackup.


 your patch includes a modification in the file
 src/bin/pg_basebackup/.gitignore, maybe i'm just being annoying
 besides is a simple change... just forget that...

Well, it needs to be included in the commit, and if I exclude it in the
posted patch, I'll just forget it in the end :-)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Object access hooks with arguments support (v1)

2011-09-29 Thread Kohei KaiGai
I noticed that the previous revision does not provide any way to inform
the modules of the name of the foreign server when a foreign table is
created, on the OAT_POST_CREATE hook.
So, I modified the invocation at heap_create_with_catalog to deliver
this information to the modules.

The rest of the patch is unchanged, except for rebasing onto the latest git
master.

2011/8/28 Kohei KaiGai kai...@kaigai.gr.jp:
 The attached patch is a draft to support arguments in addition to
 OAT_* enum and object identifiers.

 The existing object_access_hook enables loadable modules to acquire
 control when objects are referenced. The first guest of this hook is
 contrib/sepgsql for assignment of default security label on newly
 created objects. Right now, OAT_POST_CREATE is the all supported
 object access type. However, we plan to enhance this mechanism onto
 other widespread purpose; such as comprehensive DDL permissions
 supported by loadable modules.

 This patch is a groundwork to utilize this hook for object creation
 permission checks, not only default labeling.
 At the v9.1 development cycle, I proposed an idea that defines both
 OAT_CREATE hook prior to system catalog modification and
 OAT_POST_CREATE hook as currently we have. This design enables to
 check permission next to the existing pg_xxx_aclcheck() or
 pg_xxx_ownercheck(), and raise an error before system catalog updates.
 However, it was painful to deliver private datum set on OAT_CREATE to
 the OAT_POST_CREATE due to the code complexity.

 The other idea tried to do all the necessary things in OAT_POST_CREATE
 hook, and it had been merged, because loadable modules can pull
 properties of the new object from system catalogs by the supplied
 object identifiers. Thus, contrib/sepgsql assigns a default security
 label on new object using OAT_POST_CREATE hook.
 However, I have two concern on the existing hook to implement
 permission check for object creation. The first one is the entry of
 system catalog is not visible using SnaphotNow, so we had to open and
 scan system catalog again, instead of syscache mechanism. The second
 one is more significant. A part of information to make access control
 decision is not available within contents of the system catalog
 entries. For example, we hope to skip permission check when
 heap_create_with_catalog() was launched by make_new_heap() because the
 new relation is just used to swap later.

 Thus, I'd like to propose to support argument of object_access_hook to
 inform the loadable modules additional contextual information on its
 invocations; to solve these concerns.

 Regarding to the first concern, fortunately, most of existing
 OAT_POST_CREATE hook is deployed just after insert or update of system
 catalogs, but before CCI. So, it is helpful for the loadable modules
 to deliver Form_pg_ data to reference properties of the new
 object, instead of open and scan the catalog again.
 In the draft patch, I enhanced OAT_POST_CREATE hook commonly to take
 an argument that points to the Form_pg_ data of the new object.

 Regarding to the second concern, I added a few contextual information
 as second or later arguments in a part of object classes. Right now, I
 hope the following contextual information shall be provided to
 OAT_POST_CREATE hook to implement permission checks of object
 creation.

 * pg_class - TupleDesc structure of the new relation
 I want to reference of pg_attribute, not only pg_class.

 * pg_class - A flag to show whether the relation is defined for
 rebuilding, or not.
 I want not to apply permission check in the case when it is invoked
 from make_new_heap(), because it just create a dummy table as a part
 of internal process. All the necessary permission checks should be
 done at ALTER TABLE or CLUSTER.

And the name of the foreign server being used by the foreign table
created just a moment before.

 * pg_class - A flag to show whether the relation is created with
 SELECT INTO, or not.
 I want to check insert permission of the new table, created by
 SELECT INTO, because DML hook is not available to check this case.

 * pg_type - A flag to show whether the type is defined as implicit
 array, or not.
 I want not to apply permission check on creation of implicit array type.

 * pg_database - Oid of the source (template) database.
 I want to fetch security label of the source database to compute a
 default label of the new database.

 * pg_trigger - A flag to show whether the trigger is used to FK
 constraint, or not.
 I want not to apply permission check on creation of FK constraint. It
 should be done in ALTER TABLE level.

 Sorry for this long explanation. Right now, I tend to consider it is
 the best way to implement permission checks on object creation with
 least invasive way.

 Thanks, Any comments welcome.
 --
 KaiGai Kohei kai...@kaigai.gr.jp

-- 
KaiGai Kohei kai...@kaigai.gr.jp


pgsql-v9.2-object-access-hooks-argument.v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list 

Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Bruce Momjian
Steve Crawford wrote:
 On 09/29/2011 08:20 AM, Bruce Momjian wrote:
  ...
  1  document the limitation and require users to use symlinks
  2  add a --old/new-configdir parameter to pg_upgrade
  3  have pg_upgrade find the real data dir by starting the server
  4  add a flag to some tool to return the real data dir, and backpatch
  that
 5. (really 3a). Have pg_upgrade itself check the specified --XXX-datadir 
 for postgresql.conf and use the data_directory setting therein using the 
 same rules as followed by the server.
 
 This would mean that there are no new options to pg_upgrade and that 
 pg_upgrade operation would not change when postgresql.conf is in the 
 data-directory. This would also make it consistent with PostgreSQL's 
 notion of file-locations:
 
 If you wish to keep the configuration files elsewhere than the data 
 directory, the postgres -D command-line option or PGDATA environment 
 variable must point to the directory containing the configuration files, 
 and the data_directory parameter must be set in postgresql.conf...
 
 So for backporting, it could just be considered a bug fix that aligns 
 pg_upgrade's interpretation of datadir to that of the server.

pg_upgrade is not about to start reading through postgresql.conf looking
for a definition for data_directory --- there are too many cases where
this could go wrong.  It would need a full postgresql.conf parser.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

2011-09-29 Thread Heikki Linnakangas

On 29.09.2011 20:27, Kevin Grittner wrote:

Heikki's second version, a more radical revision optimized for 64
bit systems, blows up on a 32 bit compile, writing off the end of
the structure.  Personally, I'd be OK with sacrificing some
performance for 32 bit systems to get better performance on 64 bit
systems, since people who care about performance generally seem to
be on 64 bit builds these days -- but it has to run.  Given Tom's
reservations about this approach, I don't know whether Heikki is
interested in fixing the crash so it can be benchmarked.  Heikki?


No, I'm not going to work on that 64-bit patch.

Looking at the big picture, however, the real problem with all those 
makesign() calls is that they happen in the first place. They happen 
when gist needs to choose which child page to place a new tuple on. It 
calls the penalty for every item on the internal page, always passing 
the new key as the 2nd argument, along the lines of:


for (all items on internal page)
  penalty(item[i], newitem);

At every call, gtrgm_penalty() has to calculate the signature for 
newitem, using makesign(). That's an enormous waste of effort, but 
there's currently no way for gtrgm_penalty() to avoid that. If we could call 
makesign() only on the first call in the loop, and remember it for the 
subsequent calls, that would eliminate the need for any 
micro-optimization in makesign() and make inserting into a trigram index 
much faster (including building the index from scratch).


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

2011-09-29 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Looking at the big picture, however, the real problem with all those 
 makesign() calls is that they happen in the first place. They happen 
 when gist needs to choose which child page to place a new tuple on. It 
 calls the penalty for every item on the internal page, always passing 
 the new key as the 2nd argument, along the lines of:

 for (all items on internal page)
penalty(item[i], newitem);

 At every call, gtrgm_penalty() has to calculate the signature for 
 newitem, using makesign(). That's an enormous waste of effort, but 
 there's currently no way gtrgm_penalty() to avoid that.

Hmm.  Are there any other datatypes for which the penalty function has
to duplicate effort?  I'm disinclined to fool with this if pg_trgm is
the only example ... but if it's not, maybe we should do something
about that instead of micro-optimizing makesign.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

2011-09-29 Thread Alexander Korotkov
On Fri, Sep 30, 2011 at 1:08 AM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 At every call, gtrgm_penalty() has to calculate the signature for newitem,
 using makesign(). That's an enormous waste of effort, but there's currently
 no way gtrgm_penalty() to avoid that. If we could call makesign() only on
 the first call in the loop, and remember it for the subsequent calls, that
 would eliminate the need for any micro-optimization in makesign() and make
 inserting into a trigram index much faster (including building the index
 from scratch)

Isn't it possible to cache the signature of newitem in gtrgm_penalty,
like gtrgm_consistent does for the query?
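
Something along these lines, perhaps?  This is just a rough sketch of the
idea (it assumes the usual contents of contrib/pg_trgm/trgm_gist.c for TRGM,
BITVEC and makesign()), not the actual code, and the cached data still lives
until the FmgrInfo goes away:

/* Remember the last new item and its signature in fn_extra, and
 * recompute the signature only when the new item changes. */
typedef struct
{
    TRGM       *lastnewval;     /* copy of the key we last saw */
    BITVEC      sign;           /* its cached signature */
} penalty_cache;

Datum
gtrgm_penalty(PG_FUNCTION_ARGS)
{
    GISTENTRY  *origentry = (GISTENTRY *) PG_GETARG_POINTER(0);
    GISTENTRY  *newentry = (GISTENTRY *) PG_GETARG_POINTER(1);
    float      *penalty = (float *) PG_GETARG_POINTER(2);
    TRGM       *newval = (TRGM *) DatumGetPointer(newentry->key);
    penalty_cache *cache = (penalty_cache *) fcinfo->flinfo->fn_extra;

    if (cache == NULL)
    {
        cache = MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
                                       sizeof(penalty_cache));
        fcinfo->flinfo->fn_extra = cache;
    }

    if (cache->lastnewval == NULL ||
        VARSIZE(cache->lastnewval) != VARSIZE(newval) ||
        memcmp(cache->lastnewval, newval, VARSIZE(newval)) != 0)
    {
        /* first call, or a different new item: rebuild the cache */
        if (cache->lastnewval)
            pfree(cache->lastnewval);
        cache->lastnewval = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
                                               VARSIZE(newval));
        memcpy(cache->lastnewval, newval, VARSIZE(newval));
        makesign(cache->sign, newval);
    }

    /* ... compute *penalty from origentry and cache->sign as today ... */

    PG_RETURN_POINTER(penalty);
}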

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 pg_upgrade is not about to start reading through postgresql.conf looking
 for a definition for data_directory --- there are too many cases where
 this could go wrong.  It would need a full postgresql.conf parser.

Yeah.  I think the only sensible way to do this would be to provide an
operating mode for the postgres executable that would just parse the
config file and spit out requested values.  We've had requests for that
type of functionality before, IIRC.  The --describe-config option does
something related, but not what's needed here.
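
For illustration, pg_upgrade could then consume such a mode along these
lines --- note that the "-C data_directory" switch shown here is purely
hypothetical, no such option exists today:

/* Run the cluster's postgres binary in the hypothetical
 * "report a config value" mode and capture its output.  Sketch only. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool
get_data_directory(const char *bindir, const char *configdir,
                   char *result, size_t resultlen)
{
    char        cmd[8192];
    FILE       *fp;

    snprintf(cmd, sizeof(cmd),
             "\"%s/postgres\" -D \"%s\" -C data_directory",
             bindir, configdir);

    if ((fp = popen(cmd, "r")) == NULL)
        return false;
    if (fgets(result, (int) resultlen, fp) == NULL)
    {
        pclose(fp);
        return false;
    }
    pclose(fp);

    result[strcspn(result, "\r\n")] = '\0';     /* strip trailing newline */
    return true;
}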

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fix for pg_upgrade

2011-09-29 Thread Bruce Momjian
 Alvaro Herrera wrote:
 
  Excerpts from Bruce Momjian's message of mié sep 28 13:48:28 -0300 
  2011:
   Bruce Momjian wrote:
OK, so it fails for all tables and you are using the newest version.
Thanks for all your work.  I am now guessing that pg_upgrade 9.1.X is
just broken on Windows.
   
Perhaps the variables set by pg_upgrade_support.so are not being passed
into the server variables?  I know pg_upgrade 9.0.X worked on Windows
because EnterpriseDB did extensive testing recently on this.   Has
anyone used pg_upgrade 9.1.X on Windows?
  
   OK, I have a new theory.  postmaster.c processes the -b
   (binary-upgrade) flag by setting a C variable:
  
   case 'b':
   /* Undocumented flag used for binary upgrades */
   IsBinaryUpgrade = true;
   break;
  
   I am now wondering if this variable is not being passed down to the
   sessions during Win32's EXEC_BACKEND.  Looking at the other postmaster
   settings, these set GUC variables, which I assume are passed down.  Can
   someone confirm this?
 
  Well, you could compile it with -DEXEC_BACKEND to test it for yourself.
 
How should this be fixed?
 
  Maybe it should be part of struct BackendParameters.
 
 Thanks.  That's what I did, and tested the failure with -DEXEC_BACKEND,
 and the fix with the patch, which is attached.  I am confident this will
 fix Windows as well.

Applied, and backpatched to 9.1.X.  Thanks for the report.  The fix will
be in 9.1.2.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

2011-09-29 Thread Alexander Korotkov
On Fri, Sep 30, 2011 at 1:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Hmm.  Are there any other datatypes for which the penalty function has
 to duplicate effort?  I'm disinclined to fool with this if pg_trgm is
 the only example ... but if it's not, maybe we should do something
 about that instead of micro-optimizing makesign.

GiST for tsearch works similarly.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  pg_upgrade is not about to start reading through postgresql.conf looking
  for a definition for data_directory --- there are too many cases where
  this could go wrong.  It would need a full postgresql.conf parser.
 
 Yeah.  I think the only sensible way to do this would be to provide an
 operating mode for the postgres executable that would just parse the
 config file and spit out requested values.  We've had requests for that
 type of functionality before, IIRC.  The --describe-config option does
 something related, but not what's needed here.

That would certainly solve the problem, though it would have to be
backpatched all the way back to 8.4, and it would require pg_upgrade
users to be on newer minor versions of Postgres.  We could minimize that
by using this feature only if postgresql.conf exists in the specified
data directory but PG_VERSION does not.

Adding this feature is similar to this TODO item:

Allow configuration files to be independently validated 

This still seems like a lot to backpatch.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 Yeah.  I think the only sensible way to do this would be to provide an
 operating mode for the postgres executable that would just parse the
 config file and spit out requested values.

 That would certainly solve the problem, though it would have to be
 backpatched all the way back to 8.4, and it would require pg_upgrade
 users to be on newer minor versions of Postgres.

I would just say no to people who expect this to work against older
versions of Postgres.  I think it's sufficient if we get this into HEAD
so that it will work in the future.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

2011-09-29 Thread Tom Lane
Alexander Korotkov aekorot...@gmail.com writes:
 On Fri, Sep 30, 2011 at 1:08 AM, Heikki Linnakangas 
 heikki.linnakan...@enterprisedb.com wrote:
 At every call, gtrgm_penalty() has to calculate the signature for newitem,
 using makesign(). That's an enormous waste of effort, but there's currently
 no way gtrgm_penalty() to avoid that. If we could call makesign() only on
 the first call in the loop, and remember it for the subsequent calls, that
 would eliminate the need for any micro-optimization in makesign() and make
 inserting into a trigram index much faster (including building the index
 from scratch)

 Isn't it possible to cache signature of newitem in gtrgm_penalty
 like gtrgm_consistent do this for query?

[ studies that code for awhile ... ]  Ick, what a kluge.

The main problem with that code is that the cache data gets leaked at
the conclusion of a scan.  Having just seen the consequences of leaking
the giststate, I think this is something we need to fix not emulate.

I wonder whether it's worth having the GIST code create a special
scan-lifespan (or insert-lifespan) memory context that could be used
for cached data such as this?  It's already creating a couple of
contexts for its own purposes, so one more might not be a big problem.
We'd have to figure out a way to make that context available to GIST
support functions, though, as well as something cleaner than fn_extra
for them to keep pointers in.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_regress input/output directory option

2011-09-29 Thread Michael Paquier
On Fri, Sep 30, 2011 at 12:37 AM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Sep 28, 2011 at 12:21 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  Why are there no options in pg_regress to specify the directory where input
 and
  output data are located?
  Such options would bring more flexibility when running regressions
 without
  make check/installcheck for an external application.
  Is there a particular reason for that? Or do you think that pg_regress
  should be only used with make check?

 It has --inputdir and --outputdir - is that not what you need?

Yes thanks.

Regards,

Michael


Re: [HACKERS] pg_upgrade - add config directory setting

2011-09-29 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  Yeah.  I think the only sensible way to do this would be to provide an
  operating mode for the postgres executable that would just parse the
  config file and spit out requested values.
 
  That would certainly solve the problem, though it would have to be
  backpatched all the way back to 8.4, and it would require pg_upgrade
  users to be on newer minor versions of Postgres.
 
 I would just say no to people who expect this to work against older
 versions of Postgres.  I think it's sufficient if we get this into HEAD
 so that it will work in the future.

Well, it is going to work in the future only when the _old_ version is
9.2+.  Specifically, pg_upgrade's use of the flag could be added in just
9.2, but the flag has to be supported on both old and new backends for that
to work.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW proposal: a validator for configuration files

2011-09-29 Thread Alexander

On 09/10/2011 11:39 AM, Alexey Klyukin wrote:
Hi Andy,

On Sep 7, 2011, at 6:40 AM, Andy Colson wrote:

Hi Alexey, I was taking a quick look at this patch, and have a question for ya.

...

Where did the other warnings go?  Its right though, line 570 is bad.  It also 
seems to have killed the server.  I have not gotten through the history of 
messages regarding this patch, but is it supposed to kill the server if there 
is a syntax error in the config file?


Thank you for looking at this patch. v4 was more a what if concept that took 
a lot of time for somebody to look at. There were a lot of problems with it, but I think 
I've nailed down most of them.

Attached is v5. It should fix both problems you've experienced with v4. As with 
the current code, the startup process gets interrupted on any error detected in 
the configuration file. Unlike the current code, the patch tries to report all 
of them before bailing out. The behavior during configuration reload has 
changed significantly. Instead of ignoring all changes after the first error, 
the code  reports the problematic value and continues. It only skips applying 
new values completely on syntax errors and invalid configuration option names. 
In no cases  should it bring the server down during reload.

One problem I'm not sure how to address is the fact that we require 2 calls of 
set_config_option for each option, one to verify the new value and another to 
actually apply it. Normally, this function returns true for a valid value and 
false if it is unable to set the new value for whatever reason (either the 
value is invalid, or the value cannot be changed in the context of the caller). 
However, calling it once per value in the 'apply' mode during reload produces 
false for every unchanged value that can only be changed during startup (i.e. 
shared_buffers, or max_connections). If we ignore its return value, we'll lose 
the ability to detect whether invalid values were encountered during the reload 
and report this fact at the end of the function. I think the function should be 
changed, as described in my previous email 
(http://archives.postgresql.org/message-id/97A66029-9D3E-43AF-95AA-46FE1B447447(at)commandprompt(dot)com)
 and I'd like to hear other opinions on that. Meanwhile, due to 2 calls to set_config_option, it currently reports all invalid values twice. 
If others will be opposed to changing the set_config_option, I'll fix this by 
removing the first, verification call and final 'errors were detected' warning 
to avoid 'false positives' on that (i.e. the WARNING you saw with the previous 
version for the valid .conf).

I'd appreciate your further comments and suggestions.

Thank you.

--
Alexey Klyukin    http://www.commandprompt.com
The PostgreSQL Company – Command Prompt, Inc.


After a quick two minute test, this patch seems to work well.  It does just 
what I think it should.  I'll add it to the commitfest page for ya.

-Andy




Alexey,

I've included my review procedure outline and output below. Your patch 
behaves as you described in the email containing 'v5' and judging by 
Andy's earlier success, I think this is ready to go. I will mark this 
'Ready for Committer'.


Alexander

---

TESTING METHODOLOGY
===

* Start postgres with stock, valid postgresql.conf
* Record output
* Stop postgres
* Add invalid configuration option to postgresql.conf
* Start postgres
* Record output
* Stop postgres
* Add configuration option with bad syntax to postgresql.conf
* Start postgres
* Record output

SETUP
-

./configure --prefix=~/builds/pgsql/orig; make install
./configure --prefix=~/builds/pgsql/patched/guc-param-validator; make install


cd ~/builds/pgsql/orig; bin/initdb -D data; bin/postgres -D data
cd ~/builds/pgsql/patched/guc-param-validator; bin/initdb -D data; bin/postgres -D data


OUTPUT
--

orig:

LOG:  database system was shut down at 2011-09-28 20:26:17 PDT
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

# alter postgresql.conf, add bogus param
# unknown_config_param = on
$ kill -9 `pgrep postgres | head -n1`; bin/postgres -D data

FATAL:  unrecognized configuration parameter unknown_config_param

# alter postgresql.conf, add param with bad syntax
# unknown_config_param @@= on
$ kill -9 `pgrep postgres | head -n1`; bin/postgres -D data

FATAL:  syntax error in file
/home/av/builds/pgsql/orig/data/postgresql.conf line 156, near token
@

patched:

LOG:  database system was shut down at 2011-09-28 20:12:39 PDT
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

# alter postgresql.conf, add bogus param
# unknown_config_param = on
$ kill -9 `pgrep postgres | head -n1`; bin/postgres -D data

LOG:  unrecognized configuration parameter unknown_config_param
LOG:  unrecognized configuration parameter another_unknown_config_param
FATAL:  errors detected while parsing configuration files
DETAIL:  changes were not applied.


Re: [HACKERS] bug of recovery?

2011-09-29 Thread Fujii Masao
On Thu, Sep 29, 2011 at 11:12 PM, Florian Pflug f...@phlo.org wrote:
 On Sep29, 2011, at 13:49 , Simon Riggs wrote:
 This worries me slightly now though because the patch makes us PANIC
 in a place we didn't used to and once we do that we cannot restart the
 server at all. Are we sure we want that? It's certainly a great way to
 shake down errors in other code...

 The patch only introduces a new PANIC condition during archive recovery,
 though. Crash recovery is unaffected, except that we no longer create
 restart points before we reach consistency.

 Also, if we hit an invalid page reference after reaching consistency,
 the cause is probably either a bug in our recovery code, or (quite unlikely)
 a corrupted WAL that passed the CRC check. In both cases, the likelyhood
 of data-corruption seems high, so PANICing seems like the right thing to do.

Fair enough.

We might be able to use FATAL or ERROR instead of PANIC because they
also cause all processes to exit when the startup process emits them.
For example, we now use FATAL to stop the server in recovery mode
when recovery is about to end before we've reached a consistent state.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp

2011-09-29 Thread Kyotaro HORIGUCHI
Hello,

I understand that, at least in practice, it has not been a problem.

OK, I will send this patch on to the committers.

Thanks for dealing with my nuisance questions.


At Thu, 29 Sep 2011 21:21:32 +0900, Fujii Masao masao.fu...@gmail.com wrote 
in cahgqgwg0c21f0czy5exx-49dxdx7hjuneibbj0tzvh+7vmx...@mail.gmail.com
   Nevertheless this is ok for all OSs, I don't know whether
   initializing TimestampTz(double, int64 is ok) field with 8 bytes
...
 I believe it's safe. Such a code is placed elsewhere in the source, too.
 If it's unsafe, we should have received lots of bug reports related
 to that. But we've not.

Regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers