Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 9/22/11 6:59 PM, Euler Taveira de Oliveira eu...@timbira.com wrote:
>> If needed, I could do that, if I had the exact procedure... Currently, during the start of the backup I take the following information:
> Just show us the output of pg_start_backup and part of the standby log with the following message 'redo starts at' and the subsequent messages up to the failure.

Unfortunately, it's impossible, because the error message "Could not read from file "pg_clog/0001" at offset 32768: Success" is shown (and startup aborted) before the turn for the "redo starts at" message arrives. For comparison purposes, here is the log from the original message:

2011-09-21 13:41:05 CEST LOG: database system was interrupted; last known up at 2011-09-21 13:40:50 CEST
Restoring 0014.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/0014.history': No such file or directory
Restoring 0013.history
2011-09-21 13:41:05 CEST LOG: restored log file 0013.history from archive
! 2011-09-21 13:41:05 CEST LOG: entering standby mode
Restoring 0013000600DC
2011-09-21 13:41:05 CEST LOG: restored log file 0013000600DC from archive
Restoring 0013000600DB
2011-09-21 13:41:05 CEST LOG: restored log file 0013000600DB from archive
2011-09-21 13:41:05 CEST FATAL: could not access status of transaction 1188673
2011-09-21 13:41:05 CEST DETAIL: Could not read from file "pg_clog/0001" at offset 32768: Success.
2011-09-21 13:41:05 CEST LOG: startup process (PID 13819) exited with exit code 1
2011-09-21 13:41:05 CEST LOG: aborting startup due to startup process failure

As you can see, there is no "redo starts at" message. If we compare this to a healthy standby startup, we can see that the pg_clog error appears after "entering standby mode" but before "redo starts at", hence the latter is never reached. Healthy log example:

2011-09-23 09:52:31 CEST LOG: database system was interrupted; last known up at 2011-09-23 09:52:15 CEST
Restoring 0007.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/0007.history': No such file or directory
Restoring 0006.history
2011-09-23 09:52:31 CEST LOG: restored log file 0006.history from archive
2011-09-23 09:52:31 CEST LOG: entering standby mode
Restoring 00060065
2011-09-23 09:52:31 CEST LOG: restored log file 00060065 from archive
2011-09-23 09:52:31 CEST LOG: redo starts at 0/6520
2011-09-23 09:52:31 CEST LOG: consistent recovery state reached at 0/6600
2011-09-23 09:52:31 CEST LOG: database system is ready to accept read only connections
Restoring 00060066
mv: cannot stat `/opt/PostgreSQL/9.1/archive/00060066': No such file or directory
Restoring 0007.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/0007.history': No such file or directory
2011-09-23 09:52:31 CEST LOG: streaming replication successfully connected to primary

--
Best regards,
Linas Virbalas
http://flyingclusters.blogspot.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 23.09.2011 11:02, Linas Virbalas wrote:
> On 9/22/11 6:59 PM, Euler Taveira de Oliveira eu...@timbira.com wrote:
>>> If needed, I could do that, if I had the exact procedure... Currently, during the start of the backup I take the following information:
>> Just show us the output of pg_start_backup and part of the standby log with the following message 'redo starts at' and the subsequent messages up to the failure.
> Unfortunately, it's impossible, because the error message "Could not read from file "pg_clog/0001" at offset 32768: Success" is shown (and startup aborted) before the turn for the "redo starts at" message arrives.

It looks to me like pg_clog/0001 exists, but it is shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to copy them. Can you do "ls -l pg_clog" in the master and the backup, and pg_controldata on the backup dir, and post the results, please?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep23, 2011, at 10:41 , Heikki Linnakangas wrote:
> On 23.09.2011 11:02, Linas Virbalas wrote:
>> On 9/22/11 6:59 PM, Euler Taveira de Oliveira eu...@timbira.com wrote:
>>>> If needed, I could do that, if I had the exact procedure... Currently, during the start of the backup I take the following information:
>>> Just show us the output of pg_start_backup and part of the standby log with the following message 'redo starts at' and the subsequent messages up to the failure.
>> Unfortunately, it's impossible, because the error message "Could not read from file "pg_clog/0001" at offset 32768: Success" is shown (and startup aborted) before the turn for the "redo starts at" message arrives.
> It looks to me like pg_clog/0001 exists, but it is shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to copy them.

Yeah. What confuses me though is that we fail while trying to read from the clog. When do we do that during normal (non-standby) recovery?

One other possibility would be that the problem is somehow triggered by vacuum running while the start-backup checkpoint is commencing. That'd explain why the problem goes away with immediate checkpoints, because those make the window much smaller. But I haven't got a concrete theory of what's happening..

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 23.09.2011 11:48, Florian Pflug wrote:
> On Sep23, 2011, at 10:41 , Heikki Linnakangas wrote:
>> On 23.09.2011 11:02, Linas Virbalas wrote:
>>> On 9/22/11 6:59 PM, Euler Taveira de Oliveira eu...@timbira.com wrote:
>>>>> If needed, I could do that, if I had the exact procedure... Currently, during the start of the backup I take the following information:
>>>> Just show us the output of pg_start_backup and part of the standby log with the following message 'redo starts at' and the subsequent messages up to the failure.
>>> Unfortunately, it's impossible, because the error message "Could not read from file "pg_clog/0001" at offset 32768: Success" is shown (and startup aborted) before the turn for the "redo starts at" message arrives.
>> It looks to me like pg_clog/0001 exists, but it is shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to copy them.
> Yeah. What confuses me though is that we fail while trying to read from the clog. When do we do that during normal (non-standby) recovery?

I believe the OP is setting up a standby. He didn't say if it's a hot standby, though. That changes the startup sequence a bit. At the end of recovery, we read the last clog page, containing the next XID that will be assigned to a transaction, and zero out the future part of that clog file (StartupCLOG). In hot standby, we do that earlier, after reading the checkpoint record but before we start replaying records.

> One other possibility would be that the problem is somehow triggered by vacuum running while the start-backup checkpoint is commencing. That'd explain why the problem goes away with immediate checkpoints, because those make the window much smaller. But I haven't got a concrete theory of what's happening..

Hmm, that's a good theory. The logic in SimpleLruPhysicalReadPage() tolerates non-existent files, because the old clog files might've been truncated away by a vacuum after the backup started. That's OK, because they will be recreated, and later deleted again, during the WAL replay. But what if something like this happens:

0. pg_start_backup().
1. rsync starts. It gets a list of all files. It notes that pg_clog/0001 exists.
2. vacuum runs, and deletes pg_clog/0001.
3. rsync starts to copy pg_clog/0001. Oops, it's gone. Instead of leaving it out altogether, it includes it as an empty file.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
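[Editor's note] To make the suspected race above concrete, here is a small, compilable C sketch (invented names; this is not rsync's actual code) of how a copier that builds its file list up front can end up writing a zero-length pg_clog segment when vacuum removes the file between the listing and the copy:

#include <stdio.h>
#include <errno.h>

/*
 * Hypothetical illustration of the suspected failure mode, not rsync source.
 * Copy "src" (which was present when the file list was built) to "dst".
 */
static int
copy_listed_file(const char *src, const char *dst)
{
    FILE  *in = fopen(src, "rb");
    int    src_errno = errno;
    FILE  *out = fopen(dst, "wb");  /* destination is created unconditionally */

    if (out == NULL)
    {
        if (in)
            fclose(in);
        return -1;
    }

    if (in == NULL)
    {
        /*
         * The source vanished (e.g. vacuum truncated away the clog segment)
         * after the file list was built: we are left with an empty file at
         * dst, which recovery will later fail to read far enough into.
         */
        fclose(out);
        return (src_errno == ENOENT) ? 0 : -1;
    }

    char   buf[8192];
    size_t n;

    while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
        fwrite(buf, 1, n, out);

    fclose(in);
    fclose(out);
    return 0;
}

int
main(void)
{
    /* If "pg_clog/0001" was listed earlier but removed meanwhile, this
     * produces an empty copy instead of skipping the file entirely. */
    return copy_listed_file("pg_clog/0001", "backup/pg_clog/0001");
}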
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
Hi,

I think I now roughly understand the constructs and the underlying concept.

At Thu, 22 Sep 2011 12:35:56 -0400, Tom Lane t...@sss.pgh.pa.us wrote in 23159.1316709...@sss.pgh.pa.us
tgl> Sure, if the "increment the top byte" strategy proves to not accomplish
tgl> that effectively. But I'd prefer not to design a complex strategy until
tgl> it's been proven that a simpler one doesn't work.

OK, I understand that thought in general terms, but I have not grasped your measure of complexity. The current make_greater_string ticks up the tail byte of the bare byte sequence, checks the whole sequence for validity against the database encoding, and tries the next byte if that fails. The patch, on the other hand (although its style is corrupted..), searches for the last CHARACTER, tries to tick that CHARACTER up, and declines if that fails. Looking only within the scope of make_greater_string, I feel more complexity in the former because of its check-and-retry loop. Yes, "ticking the character up" has complexity of its own, but that complexity is encapsulated within a single function.

From a performance perspective, the current implementation always slips 64 times (0xff - 0xbf, for UTF-8), checks the WHOLE pattern string on every slip, and eventually declines for a small but not negligible set of around 100 code points (counting Japanese characters alone). The check-and-retry loop is no help in those cases, and it checks the whole pattern string at least once even when it eventually succeeds. The patch's algorithm, by contrast, scans the whole pattern string once to find the last character, makes no slips anywhere in the code space, and declines only at the points where the character length changes. (Of course it is possible to rescue those points as well, but that seems 'too expensive' for the gain, to me :) Not only the whole-string check, but also the validity check on the character after the increment, is essentially needless with this method.

To summarize my view: the two methods are not so different in performance for the strings the current method can already increment, and the new one seems rather more efficient and also applicable to most of the strings the current method cannot increment.

The patch currently checks the validity of the result as a last resort, guarding against a possible inconsistency caused by a careless change of the validity-check function (in other words, a change of the character set - very unlikely). Even that is unnecessary if the consistency between the incrementer and the validator has been checked after the last such modification. The four-byte memcpy could also be removed by changing the rewinding method. These modifications would make the operation cheap (I think..) for UTF-8, while leaving the behavior for other encodings untouched. As I wrote in a previous message, the modification to pg_wchar_table can be reverted until another incrementer comes along.

Can I have another chance to show another version of the patch along these lines?

Regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote:
> The following bug has been logged online:
>
> Bug reference: 6189
> Logged by: Srinivas Aji
> Email address: srinivas@emc.com
> PostgreSQL version: 9.0.4
> Operating system: Linux
> Description: libpq: sslmode=require verifies server certificate if root.crt is present
> Details:
>
> From the documentation of sslmode values in http://www.postgresql.org/docs/9.0/static/libpq-ssl.html , it looks like libpq will not verify the server certificate when the option sslmode=require is used, and will perform different levels of certificate verification in the cases sslmode=verify-ca and sslmode=verify-full.
>
> The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through the sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca.

I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want installations that previously did verify the certificate to stop doing so in the new version.

So basically, the behaviour that is by design is:

* require: if the root certificate exists, verify; if it doesn't exist, don't verify.
* verify-ca: if the root certificate exists, verify; if it doesn't exist, disconnect.

The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
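[Editor's note] For illustration, here is a minimal libpq client (host and database names are placeholders) showing the behaviour described above: with sslmode=require the connection is encrypted either way, but whether the server certificate is verified silently depends on whether the root.crt file happens to exist.

/* Minimal libpq connection sketch; connection parameters are placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
    /*
     * With sslmode=require, libpq encrypts the connection, but the server
     * certificate is verified only if ~/.postgresql/root.crt (or the file
     * named by the sslrootcert parameter) is found.
     */
    const char *conninfo =
        "host=db.example.com dbname=postgres sslmode=require";

    PGconn *conn = PQconnectdb(conninfo);

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return EXIT_FAILURE;
    }

    /*
     * To always verify the certificate chain, request it explicitly:
     *   "host=db.example.com dbname=postgres sslmode=verify-ca"
     * which fails instead of silently downgrading when no root.crt exists.
     */
    printf("connected, SSL in use: %s\n", PQgetssl(conn) ? "yes" : "no");

    PQfinish(conn);
    return EXIT_SUCCESS;
}

Switching the conninfo string to sslmode=verify-ca (or verify-full) makes the verification explicit and turns a missing root certificate into a connection failure rather than a silent downgrade.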
Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated
On Fri, Sep 23, 2011 at 12:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I fixed crash that described Tom. Do you know about other? No, I just don't see a new version of the patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated
2011/9/23 Robert Haas robertmh...@gmail.com: On Fri, Sep 23, 2011 at 12:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I fixed crash that described Tom. Do you know about other? No, I just don't see a new version of the patch. sorry - my mistake - I sent it only to Tom Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company *** ./src/pl/plpgsql/src/gram.y.orig 2011-09-19 10:25:28.0 +0200 --- ./src/pl/plpgsql/src/gram.y 2011-09-22 22:02:11.0 +0200 *** *** 998,1003 --- 998,1004 new-dtype = PLPGSQL_DTYPE_ARRAYELEM; new-subscript = $3; new-arrayparentno = $1; + new-arraytypoid = InvalidOid; plpgsql_adddatum((PLpgSQL_datum *) new); *** ./src/pl/plpgsql/src/pl_exec.c.orig 2011-09-19 10:25:28.0 +0200 --- ./src/pl/plpgsql/src/pl_exec.c 2011-09-22 22:14:46.0 +0200 *** *** 3991,4008 PLpgSQL_expr *subscripts[MAXDIM]; int subscriptvals[MAXDIM]; bool oldarrayisnull; ! Oid arraytypeid, ! arrayelemtypeid; int32 arraytypmod; - int16 arraytyplen, - elemtyplen; - bool elemtypbyval; - char elemtypalign; Datum oldarraydatum, coerced_value; ArrayType *oldarrayval; ArrayType *newarrayval; SPITupleTable *save_eval_tuptable; /* * We need to do subscript evaluation, which might require --- 3991,4004 PLpgSQL_expr *subscripts[MAXDIM]; int subscriptvals[MAXDIM]; bool oldarrayisnull; ! Oid arraytypoid; int32 arraytypmod; Datum oldarraydatum, coerced_value; ArrayType *oldarrayval; ArrayType *newarrayval; SPITupleTable *save_eval_tuptable; + PLpgSQL_arrayelem *base_arrayelem; /* * We need to do subscript evaluation, which might require *** *** 4027,4032 --- 4023,4030 { PLpgSQL_arrayelem *arrayelem = (PLpgSQL_arrayelem *) target; + base_arrayelem = arrayelem; + if (nsubscripts = MAXDIM) ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), *** *** 4038,4061 /* Fetch current value of array datum */ exec_eval_datum(estate, target, ! arraytypeid, arraytypmod, oldarraydatum, oldarrayisnull); ! /* If target is domain over array, reduce to base type */ ! arraytypeid = getBaseTypeAndTypmod(arraytypeid, arraytypmod); ! /* ... and identify the element type */ ! arrayelemtypeid = get_element_type(arraytypeid); ! if (!OidIsValid(arrayelemtypeid)) ! ereport(ERROR, ! (errcode(ERRCODE_DATATYPE_MISMATCH), ! errmsg(subscripted object is not an array))); ! get_typlenbyvalalign(arrayelemtypeid, ! elemtyplen, ! elemtypbyval, ! elemtypalign); ! arraytyplen = get_typlen(arraytypeid); /* * Evaluate the subscripts, switch into left-to-right order. --- 4036,4067 /* Fetch current value of array datum */ exec_eval_datum(estate, target, ! arraytypoid, arraytypmod, oldarraydatum, oldarrayisnull); ! /* If base_arrayelem has initialized metadata, then use it */ ! if (base_arrayelem-arraytypoid != arraytypoid || ! base_arrayelem-arraytypmod != arraytypmod) ! { ! /* If target is domain over array, reduce to base type */ ! arraytypoid = getBaseTypeAndTypmod(arraytypoid, arraytypmod); ! base_arrayelem-arraytypoid = arraytypoid; ! base_arrayelem-arraytypmod = arraytypmod; ! /* ... and identify the element type */ ! base_arrayelem-arrayelemtypoid = get_element_type(arraytypoid); ! if (!OidIsValid(base_arrayelem-arrayelemtypoid)) ! ereport(ERROR, ! (errcode(ERRCODE_DATATYPE_MISMATCH), ! errmsg(subscripted object is not an array))); ! ! get_typlenbyvalalign(base_arrayelem-arrayelemtypoid, ! (base_arrayelem-elemtyplen), ! (base_arrayelem-elemtypbyval), ! (base_arrayelem-elemtypalign)); ! base_arrayelem-arraytyplen = get_typlen(arraytypoid); ! 
} /* * Evaluate the subscripts, switch into left-to-right order. *** *** 4093,4099 /* Coerce source value to match array element type. */ coerced_value = exec_simple_cast_value(value, valtype, ! arrayelemtypeid, arraytypmod, *isNull); --- 4099,4105 /* Coerce source value to match array element type. */ coerced_value = exec_simple_cast_value(value, valtype, ! base_arrayelem-arrayelemtypoid, arraytypmod, *isNull); *** *** 4107,4118 * array, either, so that's a no-op too. This is all ugly but * corresponds to the current
[HACKERS] Satisfy extension dependency by one of multiple extensions
Hello list, I have a use case where an extension dependency can be satisfied by one of five other extensions. Currently I'm unable to express that in the extension control file, since the elements from 'requires' are currently searched on exact name match. The attached patch changes this behaviour for list elements that end with a *, into prefix matching, so that e.g. table* matches tablefunc. This allows me to specify in a controlfile requires 'vocab*' which is satisfied by having either one of the following extensions loaded: vocab2005 vocab2006 vocab2008 vocab2009 vocab2010 thoughts? regards, Yeb Havinga -- Yeb Havinga http://www.mgrid.net/ Mastering Medical Data diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c new file mode 100644 index 278bbcf..5bf7f8e *** a/src/backend/commands/extension.c --- b/src/backend/commands/extension.c *** static void ApplyExtensionUpdates(Oid ex *** 114,145 Oid get_extension_oid(const char *extname, bool missing_ok) { ! Oid result; ! Relation rel; SysScanDesc scandesc; HeapTuple tuple; ScanKeyData entry[1]; rel = heap_open(ExtensionRelationId, AccessShareLock); ScanKeyInit(entry[0], Anum_pg_extension_extname, ! BTEqualStrategyNumber, F_NAMEEQ, CStringGetDatum(extname)); ! scandesc = systable_beginscan(rel, ExtensionNameIndexId, true, ! SnapshotNow, 1, entry); ! tuple = systable_getnext(scandesc); ! /* We assume that there can be at most one matching tuple */ ! if (HeapTupleIsValid(tuple)) ! result = HeapTupleGetOid(tuple); ! else ! result = InvalidOid; ! systable_endscan(scandesc); heap_close(rel, AccessShareLock); if (!OidIsValid(result) !missing_ok) --- 114,152 Oid get_extension_oid(const char *extname, bool missing_ok) { ! Oid result = InvalidOid; ! Relation rel, idx; SysScanDesc scandesc; HeapTuple tuple; ScanKeyData entry[1]; + int len = strlen(extname) - 1; + boolwildcard = (extname[len] == '*'); rel = heap_open(ExtensionRelationId, AccessShareLock); + idx = index_open(ExtensionNameIndexId, AccessShareLock); ScanKeyInit(entry[0], Anum_pg_extension_extname, ! BTGreaterEqualStrategyNumber, ! F_NAMEGE, CStringGetDatum(extname)); ! scandesc = systable_beginscan_ordered(rel, idx, ! SnapshotNow, 1, entry); ! if (HeapTupleIsValid(tuple = systable_getnext_ordered(scandesc, ForwardScanDirection))) ! { ! Form_pg_extension form = (Form_pg_extension) GETSTRUCT(tuple); ! if (wildcard ! ? (strncmp(extname, NameStr(form-extname), len) == 0) ! : (strcmp(extname, NameStr(form-extname)) == 0)) ! result = HeapTupleGetOid(tuple); ! } ! systable_endscan_ordered(scandesc); + index_close(idx, AccessShareLock); heap_close(rel, AccessShareLock); if (!OidIsValid(result) !missing_ok) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Satisfy extension dependency by one of multiple extensions
On 23.09.2011 14:56, Yeb Havinga wrote:
> I have a use case where an extension dependency can be satisfied by one of five other extensions. Currently I'm unable to express that in the extension control file, since the elements from 'requires' are currently searched on exact name match. The attached patch changes this behaviour for list elements that end with a *, into prefix matching, so that e.g. table* matches tablefunc.

That's going to quickly extend into even more complex rules, like "foo OR bar", "(foo OR bar) AND (foobar)", etc. IIRC the extension control file format was modeled after some other package management system (.deb?). You might want to look at the past discussions from when the extension control file format was decided.

We might want to have a system where an extension can declare that it provides capabilities, and then have another extension require those capabilities. That would be a neater solution for the case where there are multiple extensions that all provide the same capability.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
From: Magnus Hagander mag...@hagander.net To: Srinivas Aji srinivas@emc.com Cc: PostgreSQL-development pgsql-hackers@postgresql.org Sent: Friday, September 23, 2011 7:28:09 AM Subject: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. Lou Picciano
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. There are really two options to this as well - we can backpatch such a change, or we can change it only in 9.2. I'm leaning towards a no on the backport, because that will change things for existing users. So probably a doc change in backbranches and a behaviour change in 9.2 would be the reasonable choice in this case. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Thu, Sep 22, 2011 at 10:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Anyway, I won't stand in the way of the patch as long as it's modified to limit the number of values considered for any one character position to something reasonably small. I think that limit in both the old and new code is 1, except that the new code does it more efficiently. Am I confused? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Fri, Sep 23, 2011 at 5:16 AM, Kyotaro HORIGUCHI horiguchi.kyot...@oss.ntt.co.jp wrote: Can I have another chance to show the another version of the patch according to the above? You can always post a new version of any patch. I think what you need to focus on is cleaning up the coding style to match PG project standards. In particular, you need to fix your brace positioning and the way function declarations are laid out. Someone will have to do that before this is committed, and ideally that person should be you rather than, say, me. :-) It would be good to fix the memcpy() issue Tom found too, and make a careful examination of the code for anything else that can be tidied up or made more bulletproof. It seems like we are getting close to agreement on the basic behavior, so that is good. We can always fine-tune that later; that shouldn't be much work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 9/23/11 12:05 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
>>> It looks to me like pg_clog/0001 exists, but it is shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to copy them.
>> Yeah. What confuses me though is that we fail while trying to read from the clog. When do we do that during normal (non-standby) recovery?
> I believe the OP is setting up a standby. He didn't say if it's a hot standby, though. That changes the startup sequence a bit.

I confirm that it is a Hot Standby with Streaming Replication.

>> One other possibility would be that the problem is somehow triggered by vacuum running while the start-backup checkpoint is commencing. That'd explain why the problem goes away with immediate checkpoints, because those make the window much smaller. But I haven't got a concrete theory of what's happening..
> Hmm, that's a good theory. The logic in SimpleLruPhysicalReadPage() tolerates non-existent files, because the old clog files might've been truncated away by a vacuum after the backup started. That's OK, because they will be recreated, and later deleted again, during the WAL replay. But what if something like this happens:
> 0. pg_start_backup().
> 1. rsync starts. It gets a list of all files. It notes that pg_clog/0001 exists.
> 2. vacuum runs, and deletes pg_clog/0001.
> 3. rsync starts to copy pg_clog/0001. Oops, it's gone. Instead of leaving it out altogether, it includes it as an empty file.

I cannot confirm that. I have tried this scenario one more time and was observing how pg_clog/ looked on the master and standby. Here's an overview:

1. pg_start_backup('base_backup') - waiting...

2. Checking the size of pg_clog periodically on the master:

-bash-3.2$ ls -l pg_clog
total 8
-rw--- 1 postgres postgres 8192 Sep 23 14:31
-bash-3.2$ ls -l pg_clog
total 8
-rw--- 1 postgres postgres 8192 Sep 23 14:31
-bash-3.2$ ls -l pg_clog
total 16

3. Somewhere just before step (1) releases and we get into the rsync phase, the pg_clog size changes:

-rw--- 1 postgres postgres 16384 Sep 23 14:33
-bash-3.2$ ls -l pg_clog
total 16
-rw--- 1 postgres postgres 16384 Sep 23 14:33
-bash-3.2$ ls -l pg_clog
total 16
-rw--- 1 postgres postgres 16384 Sep 23 14:33

4. The rsync does transfer the file:

...
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,881 INFO management.script.ScriptPlugin base/16394/16405_fsm
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,881 INFO management.script.ScriptPlugin base/16394/pg_internal.init
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/12691
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/12696
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/12697
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/pg_internal.init
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO management.script.ScriptPlugin pg_clog/
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO management.script.ScriptPlugin pg_multixact/offsets/
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO management.script.ScriptPlugin pg_notify/
...

But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before pg_stop_backup() was called):

ls -l pg_clog/
total 8
-rw--- 1 postgres postgres 8192 Sep 23 14:31

5. Hence, the server doesn't start and pg_log complains:

2011-09-23 14:33:45 CEST LOG: streaming replication successfully connected to primary
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: could not access status of transaction 37206
2011-09-23 14:33:46 CEST DETAIL: Could not read from file "pg_clog/" at offset 8192: Success.

6. Now, if I do something that, of course, should never be done, and copy this file from master to the standby soon enough (without even starting/stopping backup), the standby starts up successfully.

--
Hope this helps,
Linas Virbalas
http://flyingclusters.blogspot.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Fri, Sep 23, 2011 at 8:38 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. There are really two options to this as well - we can backpatch such a change, or we can change it only in 9.2. I'm leaning towards a no on the backport, because that will change things for existing users. So probably a doc change in backbranches and a behaviour change in 9.2 would be the reasonable choice in this case. I definitely don't think we should back-patch a behavior change that silently weakens security. That's not good. But what about not doing it in master, either? It seems to me that we could avoid ever breaking backward compatibility by adding a new option require-no-verify. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
Robert Haas robertmh...@gmail.com writes: On Thu, Sep 22, 2011 at 10:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Anyway, I won't stand in the way of the patch as long as it's modified to limit the number of values considered for any one character position to something reasonably small. I think that limit in both the old and new code is 1, except that the new code does it more efficiently. Am I confused? Yes, or else I am. Consider a 4-byte UTF8 character at the end of the string. The existing code increments the last byte up to 255 (rejecting everything past 0xBF), then gives up and truncates that character away. So the maximum number of tries for that character position is between 0 and 127 depending on what the original character was (with at most 63 of the incremented values getting past the verifymbstr test). The proposed patch is going to iterate through all Unicode code points up to U+7F before giving up. Since it's possible that we need to increment something further left to succeed at all, this doesn't seem like a good plan. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
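[Editor's note] To make the comparison concrete, here is a minimal, self-contained C sketch (illustrative only - not the actual make_greater_string() code, and the non-continuation ceiling is an invented simplification) of the byte-increment step Tom describes, with the 0xBF ceiling that applies to UTF-8 continuation bytes:

#include <stdbool.h>
#include <stdio.h>

/*
 * Illustrative sketch: try the "increment the last byte" strategy on the
 * final byte of the pattern. A UTF-8 continuation byte is only valid in
 * the range 0x80..0xBF, so at most 63 increments can ever yield a
 * well-formed character; past 0xBF the caller must truncate the last
 * character and retry one character to the left.
 */
static bool
try_increment_last_byte(unsigned char *lastbyte, bool is_utf8_continuation)
{
    unsigned char ceiling = is_utf8_continuation ? 0xBF : 0xFF;

    if (*lastbyte >= ceiling)
        return false;       /* exhausted this position; truncate and retry */

    (*lastbyte)++;
    return true;            /* candidate produced; caller re-validates */
}

int
main(void)
{
    unsigned char b = 0xBE;     /* a continuation byte right below the limit */

    printf("%d\n", try_increment_last_byte(&b, true));   /* 1: bumped to 0xBF */
    printf("%d\n", try_increment_last_byte(&b, true));   /* 0: position exhausted */
    return 0;
}

A caller loops over this, re-validating and re-comparing the whole pattern after each bump, and truncates the last character to move one position further left once the ceiling is hit - which is exactly the retry cost the proposed patch tries to avoid by incrementing whole code points instead of raw bytes.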
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Fri, Sep 23, 2011 at 8:47 AM, Linas Virbalas linas.virba...@continuent.com wrote: But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): Now that seems pretty weird - I don't think that file should ever shrink. Are you sure that the rsync isn't starting until after pg_start_backup() completes? Because if you were starting it just a tiny bit early, that would explain the observed symptoms perfectly, I think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Magnus Hagander mag...@hagander.net writes: I looked at this again, and I'm pretty sure we did this intentionally. Yeah, we did. Or should we just update the documentation to mention how this works? +1 for doc change only. I think the behavior was thought through carefully, and the wording of the docs maybe not so much. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
From: Magnus Hagander mag...@hagander.net To: Lou Picciano loupicci...@comcast.net Cc: PostgreSQL-development pgsql-hackers@postgresql.org, Srinivas Aji srinivas@emc.com Sent: Friday, September 23, 2011 8:38:00 AM Subject: Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. There are really two options to this as well - we can backpatch such a change, or we can change it only in 9.2. I'm leaning towards a no on the backport, because that will change things for existing users. So probably a doc change in backbranches and a behaviour change in 9.2 would be the reasonable choice in this case. Again, if you were soliciting votes, I'd take the aggressive stance: +1 for the backport to 9.1. Of the population using SSL, you'd be pulling out the subset getting all the way down into PKI implementation, then, those actually doing apps teasing out these differences in verification behavior... Among _that_ group, you're only concerned with recent adopters of 9.1, and only those who wouldn't be in a position to adapt pretty quickly. Probably a pretty small cohort for something this esoteric. In our case, we do run into it - for our new clients. We find ourselves in something of a support role regarding pqlib's SSL capabilities! Lou Picciano
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Fri, Sep 23, 2011 at 8:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Sep 22, 2011 at 10:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Anyway, I won't stand in the way of the patch as long as it's modified to limit the number of values considered for any one character position to something reasonably small.
>> I think that limit in both the old and new code is 1, except that the new code does it more efficiently. Am I confused?
> Yes, or else I am. Consider a 4-byte UTF8 character at the end of the string. The existing code increments the last byte up to 255 (rejecting everything past 0xBF), then gives up and truncates that character away. So the maximum number of tries for that character position is between 0 and 127 depending on what the original character was (with at most 63 of the incremented values getting past the verifymbstr test). The proposed patch is going to iterate through all Unicode code points up to U+7F before giving up. Since it's possible that we need to increment something further left to succeed at all, this doesn't seem like a good plan.

I think you're misreading the code. It does this:

while (len > 0)
{
    boring stuff;
    if (charincfunc(lastchar, charlen))
    {
        more boring stuff;
        if (we made a greater string)
            return it;
        cleanup;
    }
    truncate away last character;
}

I don't see how that's ever going to try more than one character in the same position. What may be confusing you is that the old code has two loops: an outer loop that tests whether we've made a greater string, and an inner loop that tests whether we've made a validly encoded string at all. In the new code, at least in the UTF-8 case, the inner loop is GONE altogether. Instead of iterating until we construct a valid character, we just use our mad UTF-8 skillz to assemble one, and return it. Or else I need to go drink a few cups of tea and look at this again.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new createuser option for replication role
On Thu, Sep 22, 2011 at 12:45 PM, Fujii Masao masao.fu...@gmail.com wrote: Agreed. Attached is the updated version of the patch. It adds two options --replication and --no-replication. If neither specified, neither REPLICATION nor NOREPLICATION is specified in CREATE ROLE, i.e., in this case, replication privilege is granted to only superuser. Committed. Do we need to make any changes in interactive mode, when we prompt? In theory this could be either wanted or not wanted for either superusers or non-superusers, but I'm not really sure it's worth it, and I certainly don't want the command to go into interactive mode just because neither --replication nor --no-replication was specified. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Satisfy extension dependency by one of multiple extensions
On 2011-09-23 14:19, Heikki Linnakangas wrote: On 23.09.2011 14:56, Yeb Havinga wrote: I have a use case where an extension dependency can be satisfied by one of five other extensions. Currently I'm unable to express that in the extension control file, since the elements from 'requires' are currently searched on exact name match. The attached patch changes this behaviour for list elements that end with a *, into prefix matching, so that e.g. table* matches tablefunc. That's going to quickly extend into even more complex rules, like foo OR bar, (foo OR bar) AND (foobar) etc. IIRC the extension control file format was modeled after some other package management system (.deb ?). You might want to look at the past discussions when the extension control file format was decided. Ech.. 2364 hits on 'extension' in my mailbox. However I found a thread 'extension dependency checking' that also talks about version numbers and the 'provides' capability you mention below. We might want to have a system where an extension can declare that it provides capabilites, and then have another extension require those capabilities. That would be a neater solution to the case that there are multiple extensions that all provide the same capability. Yes that would be neater. I can't think of anything else however than to add 'extprovides' to pg_extension, fill it with an explicit 'provides' from the control file when present, or extname otherwise, and use that column to check the 'requires' list on extension creation time. -- Yeb Havinga http://www.mgrid.net/ Mastering Medical Data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp
On Thu, Sep 15, 2011 at 4:52 AM, Fujii Masao masao.fu...@gmail.com wrote:
> Thanks for the review!

Kyotaro Horiguchi - Are you going to re-review the latest version of this patch?

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Fri, Sep 23, 2011 at 14:49, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 8:38 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. There are really two options to this as well - we can backpatch such a change, or we can change it only in 9.2. I'm leaning towards a no on the backport, because that will change things for existing users. So probably a doc change in backbranches and a behaviour change in 9.2 would be the reasonable choice in this case. I definitely don't think we should back-patch a behavior change that silently weakens security. That's not good. But what about not doing it in master, either? It seems to me that we could avoid ever breaking backward compatibility by adding a new option require-no-verify. Hmm. Intersting. and we could then deprecate the require option and kill it off 4 releases later or so, I guess... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DECLARE CURSOR must not contain data-modifying statements in WITH
On Wed, Sep 21, 2011 at 12:19 PM, Andres Freund and...@anarazel.de wrote:
>     /*
>      * We also disallow data-modifying WITH in a cursor. (This could be
>      * allowed, but the semantics of when the updates occur might be
>      * surprising.)
>      */
>     if (result->hasModifyingCTE)
>         ereport(ERROR,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("DECLARE CURSOR must not contain data-modifying statements in WITH")));
>
> Given that cursors are about the only sensible way to return larger amounts of data, that behaviour reduces the usefulness of wCTEs a bit. What's the exact cause of concern here? I personally don't think there is a problem documenting that you should fetch the cursor fully before relying on the updated tables to be in a sensible state. But that may be just me.

Well, it looks like right now you can't even use a simple INSERT .. RETURNING there:

rhaas=# create table wuzzle (a int);
CREATE TABLE
rhaas=# declare w cursor for insert into wuzzle select g from generate_series(1, 10) g returning g;
ERROR:  syntax error at or near "insert"
LINE 1: declare w cursor for insert into wuzzle select g from genera...
                             ^

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new createuser option for replication role
2011/9/23 Robert Haas robertmh...@gmail.com:
> On Thu, Sep 22, 2011 at 12:45 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> Agreed. Attached is the updated version of the patch. It adds two options --replication and --no-replication. If neither specified, neither REPLICATION nor NOREPLICATION is specified in CREATE ROLE, i.e., in this case, replication privilege is granted to only superuser.
> Committed. Do we need to make any changes in interactive mode, when we prompt? In theory this could be either wanted or not wanted for either superusers or non-superusers, but I'm not really sure it's worth it, and I certainly don't want the command to go into interactive mode just because neither --replication nor --no-replication was specified. Thoughts?

I believe the interactive mode is useless.

There is still an issue with the patch for the documentation: a superuser, even with NOREPLICATION, is allowed to perform pg_start_backup() and pg_stop_backup().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new createuser option for replication role
2011/9/23 Cédric Villemain cedric.villemain.deb...@gmail.com: 2011/9/23 Robert Haas robertmh...@gmail.com: On Thu, Sep 22, 2011 at 12:45 PM, Fujii Masao masao.fu...@gmail.com wrote: Agreed. Attached is the updated version of the patch. It adds two options --replication and --no-replication. If neither specified, neither REPLICATION nor NOREPLICATION is specified in CREATE ROLE, i.e., in this case, replication privilege is granted to only superuser. Committed. Do we need to make any changes in interactive mode, when we prompt? In theory this could be either wanted or not wanted for either superusers or non-superusers, but I'm not really sure it's worth it, and I certainly don't want the command to go into interactive mode just because neither --replication nor --no-replication was specified. Thoughts? I believe the intereactive mode is useless. There is still an issue with the patch for the documentation: a superuser, even with NOREPLICATION is allowed to perform pg_start_backup() and pg_stop_backup(). noise, sorry I've just read the commited patch which fixed that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Excerpts from Magnus Hagander's message of vie sep 23 10:39:46 -0300 2011: On Fri, Sep 23, 2011 at 14:49, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 8:38 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. I definitely don't think we should back-patch a behavior change that silently weakens security. That's not good. But what about not doing it in master, either? It seems to me that we could avoid ever breaking backward compatibility by adding a new option require-no-verify. Hmm. Intersting. and we could then deprecate the require option and kill it off 4 releases later or so, I guess... So we would have sslmode=verify-ca / require-no-verify / verify-full / disable / allow / prefer ? This seems strange to me. Why not have a second option to let the user indicate the desired SSL verification? sslmode=disable/allow/prefer/require sslverify=none/ca-if-present/ca/full (ca-if-present being the current require sslmode behavior). We could then deprecate sslmode=verify and verify-full and have them be synonyms of sslmode=require and corresponding sslverify. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
[ moving to -hacker s] On Thu, Sep 22, 2011 at 9:26 PM, Thom Brown t...@linux.com wrote: On 22 September 2011 17:38, Josh Berkus j...@agliodbs.com wrote: So are there any plans to allow swappable drive/volatile storage unlogged tables? Be our guest. ;-) Oh it can't be that difficult. On first glance it looks like it's a case of piggy-backing mdopen and getting it to treat RELPERSISTENCE_TEMP relations in the same way as it would for relations during the bootstrap script (i.e. create it if it doesn't exist)... then telling it not to try reading anything from the relation... or something like this. But I don't know C so... *puppy-dog eyes* I don't think that's it. It seems to me that what we really need to do is put the _init forks in a different directory than all the other forks (and then fix pg_upgrade to move them if upgrading from 9.1) - so the core code change is actually in relpathbackend(). The question is how you want to expose that to the user. I mean, you could just put them in another directory under the rug, and call it good. Users who know about the magic under the hood can fiddle with their mount points, and everyone else will be none the wiser. I'm not sure whether that's acceptable from a usability standpoint, however. If you want to expose an SQL-level interface, this gets to be a harder problem... starting with defining what exactly that interface should look like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Fri, Sep 23, 2011 at 15:55, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Magnus Hagander's message of vie sep 23 10:39:46 -0300 2011: On Fri, Sep 23, 2011 at 14:49, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 8:38 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. I definitely don't think we should back-patch a behavior change that silently weakens security. That's not good. But what about not doing it in master, either? It seems to me that we could avoid ever breaking backward compatibility by adding a new option require-no-verify. Hmm. Intersting. and we could then deprecate the require option and kill it off 4 releases later or so, I guess... So we would have sslmode=verify-ca / require-no-verify / verify-full / disable / allow / prefer ? This seems strange to me. Why not have a second option to let the user indicate the desired SSL verification? sslmode=disable/allow/prefer/require sslverify=none/ca-if-present/ca/full (ca-if-present being the current require sslmode behavior). We could then deprecate sslmode=verify and verify-full and have them be synonyms of sslmode=require and corresponding sslverify. Hmm. I agree that the other suggestion was a bit weird, but I'm not sure I like the multiple-options approach either. That's going to require redesign of all software that deals with it at all today :S Maybe we should just update the docs and be done with it :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
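To make the comparison concrete, a sketch of the connection strings involved (the host is a placeholder; the sslverify keyword is purely hypothetical, taken from the proposal above):

    # today: whether the server certificate is verified under sslmode=require
    # depends on whether ~/.postgresql/root.crt happens to exist
    host=db.example.com sslmode=require

    # proposed: encryption and verification requested independently
    host=db.example.com sslmode=require sslverify=full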
Re: [HACKERS] index-only scans
2011/8/16 Anssi Kääriäinen anssi.kaariai...@thl.fi: On 08/14/2011 12:31 AM, Heikki Linnakangas wrote: The same idea could of course be used to calculate the effective cache hit ratio for each table. Cache hit ratio would have the problem of feedback loops, though. Yeah, I'm not excited about making the planner and statistics more dynamic. Feedback loops and plan instability are not fun. I might be a little out of my league here... But I was thinking about the cache hit ratio and feedback loops. I understand automatic tuning would be hard. But making automatic tuning easier (by using pg_tune for example) would be a big plus for most use cases. To make it easier to tune the page read costs automatically, it would be nice if there would be four variables instead of the current two: - random_page_cost is the cost of reading a random page from storage. Currently it is not, it is the cost of accessing a random page, taking in account it might be in memory. - seq_page_cost is the cost of reading pages sequentially from storage - memory_page_cost is the cost of reading a page in memory - cache_hit_ratio is the expected cache hit ratio memory_page_cost would be server global, random and seq page costs tablespace specific, and cache_hit_ratio relation specific. You would get the current behavior by tuning *_page_costs realistically, and setting cache_hit_ratio globally so that the expected random_page_cost / seq_page_cost stays the same as now. The biggest advantage of this would be that the correct values are much easier to detect automatically compared to current situation. This can be done using pg_statio_* views and IO speed testing. They should not be tuned automatically by PostgreSQL, at least not the cache_hit_ratio, as that leads to the possibility of feedback loops and plan instability. The variables would also be much easier to understand. There is the question if one should be allowed to tune the *_page_costs at all. If I am not missing something, it is possible to detect the correct values programmatically and they do not change if you do not change the hardware. Cache hit ratio is the real reason why they are currently so important for tuning. An example why the current random_page_cost and seq_page_cost tuning is not adequate is that you can only set random_page_cost per tablespace. That makes perfect sense if random_page_cost would be the cost of accessing a page in storage. But it is not, it is a combination of that and caching effects, so that it actually varies per relation (and over time). How do you set it correctly for a query where one relation is fully cached and another one not? Another problem is that if you use random_page_cost == seq_page_cost, you are effectively saying that everything is in cache. But if everything is in cache, the cost of page access relative to cpu_*_costs is way off. The more random_page_cost and seq_page_cost are different, the more they mean the storage access costs. When they are the same, they mean the memory page cost. There can be an order of magnitude in difference of a storage page cost and a memory page cost. So it is hard to tune the cpu_*_costs realistically for cases where sometimes data is in cache and sometimes not. Ok, enough hand waving for one post :) Sorry if this all is obvious / discussed before. 
My googling didn't turn out anything directly related, although these have some similarity: - Per-table random_page_cost for tables that we know are always cached [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01503.php] - Script to compute random page cost [http://archives.postgresql.org/pgsql-hackers/2002-09/msg00503.php] - The science of optimization in practical terms? [http://archives.postgresql.org/pgsql-hackers/2009-02/msg00718.php], getting really interesting starting from here: [http://archives.postgresql.org/pgsql-hackers/2009-02/msg00787.php] late reply. You can add this link to your list: http://archives.postgresql.org/pgsql-hackers/2011-06/msg01140.php -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
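As a concrete point of reference for the per-tablespace remark above, this much already exists today, with the proposed decomposition sketched in the comments (the tablespace name is made up):

    -- 9.0 and later: storage costs can be overridden per tablespace, but they
    -- still fold the cache-hit assumption into a single number
    ALTER TABLESPACE fast_ssd SET (seq_page_cost = 1.0, random_page_cost = 2.0);

    -- under the proposal, the planner would instead combine the pieces, roughly:
    --   effective_random_cost = cache_hit_ratio * memory_page_cost
    --                         + (1 - cache_hit_ratio) * random_page_cost
    -- with memory_page_cost global, the storage costs per tablespace, and
    -- cache_hit_ratio per relation.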
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On 23 September 2011 15:12, Robert Haas robertmh...@gmail.com wrote: [ moving to -hacker s] On Thu, Sep 22, 2011 at 9:26 PM, Thom Brown t...@linux.com wrote: On 22 September 2011 17:38, Josh Berkus j...@agliodbs.com wrote: So are there any plans to allow swappable drive/volatile storage unlogged tables? Be our guest. ;-) Oh it can't be that difficult. On first glance it looks like it's a case of piggy-backing mdopen and getting it to treat RELPERSISTENCE_TEMP relations in the same way as it would for relations during the bootstrap script (i.e. create it if it doesn't exist)... then telling it not to try reading anything from the relation... or something like this. But I don't know C so... *puppy-dog eyes* I don't think that's it. It seems to me that what we really need to do is put the _init forks in a different directory than all the other forks (and then fix pg_upgrade to move them if upgrading from 9.1) - so the core code change is actually in relpathbackend(). The question is how you want to expose that to the user. I mean, you could just put them in another directory under the rug, and call it good. Users who know about the magic under the hood can fiddle with their mount points, and everyone else will be none the wiser. I'm not sure whether that's acceptable from a usability standpoint, however. If you want to expose an SQL-level interface, this gets to be a harder problem... starting with defining what exactly that interface should look like. Couldn't this come under tablespace changes then? After all the use-case stated would require a separate tablespace, and you could do something like: CREATE VOLATILE TABLESPACE drive_made_of_wax_left_in_the_sun LOCATION '/mnt/ramdisk'; All objects then created or reassigned therein would insert magic stuff here. In theory it would be independent of UNLOGGEDness, but I can see this would be problematic because such tables wouldn't be allowed foreign key references to tables within a stable tablespace and vice-versa, since the wonky tablespace could collapse any minute and integrity with it. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Excerpts from Magnus Hagander's message of vie sep 23 11:31:37 -0300 2011: On Fri, Sep 23, 2011 at 15:55, Alvaro Herrera alvhe...@commandprompt.com wrote: This seems strange to me. Why not have a second option to let the user indicate the desired SSL verification? sslmode=disable/allow/prefer/require sslverify=none/ca-if-present/ca/full (ca-if-present being the current require sslmode behavior). We could then deprecate sslmode=verify and verify-full and have them be synonyms of sslmode=require and corresponding sslverify. Hmm. I agree that the other suggestion was a bit weird, but I'm not sure I like the multiple-options approach either. That's going to require redesign of all software that deals with it at all today :S Why? They could continue to use the existing options; or switch to the new options if they wanted different behavior, as is the case of the OP. Maybe we should just update the docs and be done with it :-) That's another option, sure ... :-) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large C files
On Fri, Sep 9, 2011 at 11:28 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: It's very difficult or impossible to anticipate how effective the tool will be in practice, but when you consider that it works and does not produce false positives for the first 3 real-world cases tested, it seems reasonable to assume that it's at least worth having around. Tom recently said of a previous pgrminclude campaign in July 2006 that It took us two weeks to mostly recover, but we were still dealing with some fallout in December. I think that makes the case for adding this tool or some refinement as a complement to pgrminclude in src/tools fairly compelling. I'm not opposed to adding something like this, but I think it needs to either be tied into the actual running of the script, or have a lot more documentation than it does now, or both. I am possibly stupid, but I can't understand from reading the script (or, honestly, the thread) exactly what kind of pgrminclude-induced errors this is protecting against; but even if we clarify that, it seems like it would be a lot of work to run it manually on all the files that might be affected by a pgrminclude run, unless we can somehow automate that. I'm also curious to see how much more fallout we're going to see from that run. We had a few glitches when it was first done, but it didn't seem like they were really all that bad. It might be that we'd be better off running pgrminclude a lot *more* often (like once a cycle, or even after every CommitFest), because the scope of the changes would then be far smaller and we wouldn't be dealing with 5 years of accumulated cruft all at once; we'd also get a lot more experience with what works or does not work with the script, which might lead to improvements in that script on a less-than-geologic time scale. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DECLARE CURSOR must not contain data-modifying statements in WITH
On Friday 23 Sep 2011 15:42:48 Robert Haas wrote: On Wed, Sep 21, 2011 at 12:19 PM, Andres Freund and...@anarazel.de wrote: /* * We also disallow data-modifying WITH in a cursor. (This could be * allowed, but the semantics of when the updates occur might be * surprising.) */ if (result->hasModifyingCTE) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("DECLARE CURSOR must not contain data-modifying statements in WITH"))); Given that cursors are about the only sensible way to return larger amounts of data, that behaviour reduces the usefulness of wCTEs a bit. What's the exact cause of concern here? I personally don't think there is a problem documenting that you should fetch the cursor fully before relying on the updated tables to be in a sensible state. But that may be just me. Well, it looks like right now you can't even use a simple INSERT .. RETURNING there: rhaas=# create table wuzzle (a int); CREATE TABLE rhaas=# declare w cursor for insert into wuzzle select g from generate_series(1, 10) g returning g; ERROR: syntax error at or near "insert" LINE 1: declare w cursor for insert into wuzzle select g from genera... One could argue that it's easier to implement it using a wCTE because the wCTE will simply materialize the query's result upfront. That makes handling the case where somebody fetches 3 tuples from a query updating 10 easier. That's a bit harder for the normal cursor case because there is no tuplestore around to do that (except the WITH HOLD case where that is only used on commit...). I find it an acceptable way to enforce using a CTE to do cursors on DML because it makes it more clear that they will be fully executed on start... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On Fri, Sep 23, 2011 at 10:37 AM, Thom Brown t...@linux.com wrote: Couldn't this come under tablespace changes then? After all the use-case stated would require a separate tablespace, and you could do something like: CREATE VOLATILE TABLESPACE drive_made_of_wax_left_in_the_sun LOCATION '/mnt/ramdisk'; All objects then created or reassigned therein would insert magic stuff here. In theory it would be independent of UNLOGGEDness, but I can see this would be problematic because such tables wouldn't be allowed foreign key references to tables within a stable tablespace and vice-versa, since the wonky tablespace could collapse any minute and integrity with it. I don't get it. It would certainly be possible to create a VOLATILE TABLESPACE in which only TEMPORARY tables could be created. We would just disallow the creation of anything other than a temporary table within that tablespace, and if the contents of the tablespace get wiped out, WDC. (Mind you, I think we'd likely want to insist that the pg-version directory manufactured by CREATE TABLESPACE would stick around... or else we'd need some provision for recreating it on every startup.) However, if you want a VOLATILE TABLESPACE to allow not only TEMPORARY but also UNLOGGED objects, it's not so simple, because the _init forks of an unlogged relation are not disposable. Those are not allowed to disappear, or you're going to be in trouble. So the issue still comes down to this: where are we gonna put those _init forks? I guess we could do something like this: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas robertmh...@gmail.com wrote: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? Gah. I mean, all forks of temporary relations, and all non-_init forks of *unlogged* relations, could be stored in the VOLATILE LOCATION. Permanent tables would have all forks in the regular LOCATION, along with _init forks of unlogged tables. Of course, that would have the problem that relpathbackend() would need to know the relpersistence value in order to compute the pathname, which I think is going to be ugly, come to think of it. Hmm... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
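Spelled out as a usage sketch of the syntax floated above (hypothetical, not implemented; the tablespace name, table, and paths are placeholders):

    CREATE TABLESPACE now_you_see_me_now_you_dont
        LOCATION '/mnt/highly_reliable_san'   -- permanent data plus _init forks
        VOLATILE LOCATION '/mnt/ramdisk';     -- temp relations and the other unlogged forks

    CREATE UNLOGGED TABLE session_cache (id int, payload text)
        TABLESPACE now_you_see_me_now_you_dont;
    -- after a crash, the _init fork kept on the durable location lets the startup
    -- process recreate an empty session_cache even if the ramdisk came up empty.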
Re: [HACKERS] DECLARE CURSOR must not contain data-modifying statements in WITH
On Fri, Sep 23, 2011 at 10:53 AM, Andres Freund and...@anarazel.de wrote: One could argue that its a easier to implement it using a wCTE because the query will be simply materialize the query upfront. That makes handling the case where somebody fetches 3 tuples from a query updating 10 easier. Thats a bit harder for the normal cursor case because there is no tuplestore around to do that (except the WITH HOLD case where that is only used on commit...). I find it an acceptable way to enforce using a CTE to do cursors on DML because it makes it more clear that they will be fully executed on start... Hmm, maybe. But if that's true, why does the comment read the way it does? If the updates all occur at the beginning, that wouldn't be surprising, would it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Thu, Sep 22, 2011 at 10:45 PM, Jeff Davis pg...@j-davis.com wrote: + for (i = 0; i < num_items; ++i) + /* do something with q->items[i] */ + +This code turns out to be unsafe, because the writer might increment +q->num_items before it finishes storing the new item into the appropriate slot. +More subtly, the reader might prefetch the contents of the q->items array +before reading q->num_items. How would the reader prefetch the contents of the items array, without knowing how big it is? I don't think this is as mysterious as it sounds. Imagine the compiler unrolled the loop so that it does something like fetch num_items into register if register > 0 fetch q[0], process it if register > 1 fetch q[1], process it ... Then the cpu could easily do branch prediction on the ifs and fetch q[0] while the fetch of num_items was still in progress. If it turns out that num_items is 0 then it would invalidate the operations initiated after the branch prediction but if it sees 1 (ie after the increment in the other process) then it's not so it doesn't. So you have two memory fetches which I guess I still imagine have to be initiated in the right order but they're both in flight at the same time. I have no idea how the memory controller works and I could easily imagine either one grabbing the value before the other. I'm not even clear how other processors can reasonably avoid this. It must be fantastically expensive to keep track of which branch predictions depended on which registers and which memory fetches the value of those registers depended on. And then it would have to somehow inform the memory controller of those old memory fetches that this new memory fetch is dependent on and have it ensure that the fetches are read in the right order? -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): Now that seems pretty weird - I don't think that file should ever shrink. It seems I was not clear in my last example. The pg_clog file doesn't shrink. On the contrary, when rsync kicks in it is still small, but after it completes and somewhere around the pg_stop_backup() call, the pg_clog file grows on the master. Hence, it is left smaller on the standby. Are you sure that the rsync isn't starting until after pg_start_backup() completes? 100% - the procedure is scripted in a single-threaded application, so rsync is started only after pg_start_backup(...) returns. Because if you were starting it just a tiny bit early, that would explain the observed symptoms perfectly, I think. I agree, but that is not the case. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
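For reference, the scripted sequence being described is essentially this (a sketch; the label is a placeholder and the rsync step is shown only as a comment):

    -- on the master, from the backup script:
    SELECT pg_start_backup('standby_seed', true);  -- returns only once the checkpoint is done
    -- ... rsync of $PGDATA to the standby runs here, strictly after the call above ...
    SELECT pg_stop_backup();
    -- the question in this thread is what states pg_clog may legitimately be in on
    -- the standby, given that it keeps growing on the master while rsync is running.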
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On 23 September 2011 15:56, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas robertmh...@gmail.com wrote: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? Gah. I mean, all forks of temporary relations, and all non-_init forks of *unlogged* relations, could be stored in the VOLATILE LOCATION. Permanent tables would have all forks in the regular LOCATION, along with _init forks of unlogged tables. Of course, that would have the problem that relpathbackend() would need to know the relpersistence value in order to compute the pathname, which I think is going to be ugly, come to think of it. I doubt I understand the whole _init forks thing correctly, but can't the main tablespace provide sanctuary to such volatile supporting meta data (pg_version, _init and whatever else you're worried about) except the actual relation (and its vm/fsm)? Anything you can't afford to lose you get the main tablespace to look after. And instead of having a dir linked to the location in pg_tblspc, an actual dir could exist, containing items directly linked to items in the volatile location. Hmm... it doesn't sound quite right to me either. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On tis, 2011-09-20 at 16:38 -0400, Robert Haas wrote: For now, I think we're best off not changing the terminology, and confining the remit of this patch to (a) turning all of the existing recovery.conf parameters into GUCs and (b) replacing recovery.conf with a sentinel file (name TBD) to indicate that the server is to start in recovery mode. The naming isn't great but the more we change at once the less chance of reaching agreement. It seems like we have pretty broad agreement on the basics here, so let's start with that. The only thing that's slightly bogus about that is that if you were doing an archive recovery, you'd have to edit the main postgresql.conf with one-shot parameters for that particular recovery (and then delete them again, or leave them in place, confusing the next guy). But perhaps that's worth the overall simplification. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
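Concretely, the one-shot case described above would look something like this in postgresql.conf under the proposal (hypothetical fragment; none of these are GUCs today, and the path and timestamp are placeholders):

    # one-shot archive recovery, assuming the recovery.conf parameters become GUCs
    restore_command = 'cp /opt/PostgreSQL/9.1/archive/%f %p'
    recovery_target_time = '2011-09-23 12:00:00'
    # ...and remember to remove these again once the recovery has finished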
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Fri, Sep 23, 2011 at 4:41 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Unfortunately, it's impossible, because the error message Could not read from file pg_clog/0001 at offset 32768: Success is shown (and startup aborted) before the turn for redo starts at message arrives. It looks to me that pg_clog/0001 exists, but it shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to them. I think the point here is that recovery *never starts*. Something in the standby startup is looking for a value in a clog block that recovery hadn't had a chance to replay (produce) yet. So the standby is looking into the data directory *before* recovery has had a chance to run, and based on that, goes to look for something in a clog page that wasn't guaranteed to exist at the start of the backup period, and bombs out before recovery has a chance to start replaying WAL and write the new clog page. a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
Excerpts from Linas Virbalas's message of vie sep 23 09:47:20 -0300 2011: On 9/23/11 12:05 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): ls -l pg_clog/ total 8 -rw--- 1 postgres postgres 8192 Sep 23 14:31 Sounds like rsync is caching the file size at the start of the run, and then copying that many bytes, ignoring the growth that occurred after it started. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep 23, 2011 5:59 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Linas Virbalas's message of vie sep 23 09:47:20 -0300 2011: On 9/23/11 12:05 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): ls -l pg_clog/ total 8 -rw--- 1 postgres postgres 8192 Sep 23 14:31 Sounds like rsync is caching the file size at the start of the run, and then copying that many bytes, ignoring the growth that occurred after it started. That pretty much matches what Daniel does when he got the same failure case. Is this not allowed? Shouldn't wal replay fix this? /Magnus
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
Hi, On Wednesday 21 Sep 2011 16:44:30 Linas Virbalas wrote: 2011-09-21 13:41:05 CEST DETAIL: Could not read from file pg_clog/0001 at offset 32768: Success. Any chance you can attach gdb to the startup process and provide a backtrace from the place where this message is printed? Greetings, Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Tue, Sep 20, 2011 at 5:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: I sympathise with this view, to an extent. If people want to put all parameters in one file, they can do so. So +1 to that. Should they be forced to adopt that new capability by us deliberately breaking their existing setups? No. So -1 to that. If we do an automatic include of recovery.conf first, then follow by reading postgresql,conf then we will preserve the old as well as allowing the new. I don't buy this argument at all. I don't believe that recovery.conf is part of anyone's automated processes at all, let alone to an extent that they won't be able to cope with a change to rationalize the file layout. There are many. Tools I can name include pgpool, 2warm, PITRtools, but there are also various tools from Sun, an IBM reseller I have forgotten the name of, OmniTI and various other backup software providers. Those are just the ones I can recall quickly. We've encouraged people to write software on top and they have done so. Breaking backwards compatibility isn't something we do elsewhere, when its easy to keep it. I don't object to new functionality (and agreed to it upthread), just don't break the old way when there is no need. And most especially I don't buy that someone who does want to keep using it couldn't cope with adding an include to postgresql.conf manually. This just creates a barrier to upgrade. Most people using PostgreSQL use multiple releases, so its a pain to have to maintain multiple versions of scripts to have things work properly. Even people that don't mind changing won't be able to do it fully. That creates confusion, which leads to mistakes. These things relate to backup and recovery. Any changes to them give risk of data loss. Cosmetic considerations are secondary. There is no command to safely confirm that include recovery.conf is added to postgresql.conf, so leaving it optional makes it unclear as to whether the old ways will work or not. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On 09/20/2011 09:23 AM, Tom Lane wrote: Simon Riggssi...@2ndquadrant.com writes: I sympathise with this view, to an extent. If we do an automatic include of recovery.conf first, then follow by reading postgresql,conf then we will preserve the old as well as allowing the new. I don't buy this argument at all. I don't believe that recovery.conf is part of anyone's automated processes at all, let alone to an extent that they won't be able to cope with a change to rationalize the file layout. And most especially I don't buy that someone who does want to keep using it couldn't cope with adding an include to postgresql.conf manually. As Simon has already appropriately posted You would be incorrect. Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Single pass vacuum - take 2
On Tue, Aug 30, 2011 at 6:38 AM, Pavan Deolasee pavan.deola...@gmail.com wrote: Yeah. If we don't know the status of the vacuum that collected the line pointer and marked it vacuum-dead, the next vacuum will pick it up again and stamp it with its own generation number. I'm still not really comfortable with the handling of vacuum generation numbers. If we're going to say that 2^30 is large enough that we don't need to worry about the counter wrapping around, then we need some justification for that position. Why can't we have 2^30 consecutive failed vacuums on a single table? Sure, it would take a long time, but we guard against many failure conditions that would take a long time, and the result is that we have fewer corner-case failures. I want an explanation of why it's *safe*, and what the smallest number of vacuum generations that we must support to make it safe is. If we blow the handling of this, we are going to eat the user's data, so we had better have a really convincing argument as to why what we're doing is OK. Here's a possible alternative implementation: we allow up to 32 vacuum generations to exist at once. We keep a 64 bit integer indicating the state of each vacuum generation: 00 = no line pointers with this vacuum generation exist in the heap, 01 = some line pointers with this vacuum generation may exist in the heap, but they are not removable, 11 = some line pointers with this vacuum generation exist in the heap, and they are removable. Then, when we start a VACUUM, we look for a vacuum generation with flags 01. If we find one, we adopt that as the generation number for this vacuum. If not, we look for one with flags 00, and if we find one, we set its flags to 01 and adopt it as the generation number for this vacuum. (If this too fails, then all vacuums are in state 11. There are several ways that could be handled - either we make a pass over the heap just to free dead line pointers, or we randomly select a vacuum generation number and push it back to state 01, or we make all line pointers encountered during the vacuum merely dead rather than dead-vacuumed; I think I like that option best.) When we complete the heap scan, we set the flags of any vacuum generation numbers that were previously 11 back to 00 (assuming we've visited all the not-all-visible pages). When we complete the index pass, we set the flags of our chosen vacuum generation number to 11. There is clearly room for argument about the details here; for example, as the algorithm is presented, it's hard to see how you would end up with more than one vacuum generation number in each state, so maybe you only need three values, not 32. I suppose it could be useful to have more values if you want to sometimes vacuum only part of the heap, because then you'd only get to mark vacuum generation numbers as unused on those occasions when you actually did scan the whole heap. But regardless of that detail, the thing I like about what I'm proposing here is that it provides a closed loop around the management of vacuum generation numbers - we always know the exact state of each vacuum generation number, as opposed to just hoping that by the billionth vacuum there won't be any leftovers. Of course, it may be also that we can convince ourselves that your algorithm as implemented is safe ... but I'm not convinced, yet. Another thing I'm not sure whether to worry about is the question of where we store the vacuum generation information. 
I mean, if we store it in pg_class, then what happens if the user does a manual update of pg_class just as we're updating the vacuum generation information? We had better make sure that there are no cases where we can accidentally think that it's OK to reclaim dead line pointers that really still have references, or we're going to end up with some awfully difficult-to-find bugs... never mind the possibility of the user manually updating the value and hosing themselves. Of course, we already have some of those issues - relfrozenxid probably has the same problems - and I'm not 100% sure whether this one is any worse. It would be really nice to have those non-transactional tables that Alvaro keeps mumbling about, though, or some other way to store this information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] citext operator precedence fix
I'm OK with the proposed behavior change and I agree that it's probably what people want, but I am awfully suspicious that those extra casts are going to break something you haven't thought about. It might be worth posting a rough version first just to see if I (or someone else) can break it before you spend a lot of time on it. Additional breakage confirmed (hash functions, etc.) Looks like I need to add a lot more support functions and test. This is still worth doing, but don't expect it for the next commitfest. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 23.09.2011 19:03, Magnus Hagander wrote: On Sep 23, 2011 5:59 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Linas Virbalas's message of vie sep 23 09:47:20 -0300 2011: On 9/23/11 12:05 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): ls -l pg_clog/ total 8 -rw--- 1 postgres postgres 8192 Sep 23 14:31 Sounds like rsync is caching the file size at the start of the run, and then copying that many bytes, ignoring the growth that occurred after it started. That pretty much matches what Daniel does when he got the same failure case. Is this not allowed? Shouldn't wal replay fix this? That's OK. The effect is the same as if rsync had copied the file at the start. What's not OK is to store the file as an empty or truncated file in the backup, when the file is deleted from pg_clog while rsync is running. The file must have length >= the length the file had when backup started, or the file must be omitted from the backup altogether. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Fri, Sep 23, 2011 at 11:43 AM, Aidan Van Dyk ai...@highrise.ca wrote: On Fri, Sep 23, 2011 at 4:41 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Unfortunately, it's impossible, because the error message Could not read from file pg_clog/0001 at offset 32768: Success is shown (and startup aborted) before the turn for redo starts at message arrives. It looks to me that pg_clog/0001 exists, but it shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to them. I think the point here is that recover *never starts*. Something in the standby startup is looking for a value in a clog block that recovery hadn't had a chance to replay (produce) yet. Ah. I think you are right - Heikki made the same point. Maybe some of the stuff that happens just after this comment: /* * Initialize for Hot Standby, if enabled. We won't let backends in * yet, not until we've reached the min recovery point specified in * control file and we've established a recovery snapshot from a * running-xacts WAL record. */ ...actually needs to be postponed until after we've reached consistency? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep 23, 2011, at 18:03, Magnus Hagander wrote: On Sep 23, 2011 5:59 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Sounds like rsync is caching the file size at the start of the run, and then copying that many bytes, ignoring the growth that occurred after it started. That pretty much matches what Daniel does when he got the same failure case. Is this not allowed? Shouldn't wal replay fix this? I don't see how it could be forbidden. ISTM that we absolutely *have* to be able to deal with each byte of a file's data, including its metadata, being in any state it was in between the time pg_start_backup() returned and pg_stop_backup() was issued. With the individual states being in no way synchronized. (OK, in reality restricting this to individual 512-byte blocks is probably enough, but still...). This, I think, is also the reason we need to force full_page_writes to on during a hot backup. If a page was modified at all between pg_start_backup() and pg_stop_backup(), we essentially have to assume it's totally garbled. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep 23, 2011, at 18:45, Robert Haas wrote: Ah. I think you are right - Heikki made the same point. Maybe some of the stuff that happens just after this comment: /* * Initialize for Hot Standby, if enabled. We won't let backends in * yet, not until we've reached the min recovery point specified in * control file and we've established a recovery snapshot from a * running-xacts WAL record. */ ...actually needs to be postponed until after we've reached consistency? I came to the same conclusion. It seems that before we've reached consistency, we shouldn't attempt to read anything because the data can be pretty arbitrarily garbled. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
Simon, There are many. Tools I can name include pgpool, 2warm, PITRtools, but there are also various tools from Sun, an IBM reseller I have forgotten the name of, OmniTI and various other backup software providers. Those are just the ones I can recall quickly. We've encouraged people to write software on top and they have done so. Actually, just to correct this list: * there are no tools from Sun * pgPool2 does not deal with recovery.conf * there are additional ones: WAL-E, etc., which may or may not need to be edited. Breaking backwards compatibility isn't something we do elsewhere, when its easy to keep it. FWIW, I've already found that I have to modify all my backup scripts for 9.0 from 8.4, and again for 9.1 from 9.0. So we do break backwards compatibility, frequently. I don't object to new functionality (and agreed to it upthread), just don't break the old way when there is no need. I'm happy to make upgrades easier, but I want a path which eventually ends in recovery.conf going away. It's a bad API, confuses our users, and is difficult to support and maintain. If you think it's easier on our users to do that in stages over several versions rather than in one fell swoop, then let's plan it that way. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 23.09.2011 19:49, Florian Pflug wrote: On Sep23, 2011, at 18:45 , Robert Haas wrote: Ah. I think you are right - Heikki made the same point. Maybe some of the stuff that happens just after this comment: /* * Initialize for Hot Standby, if enabled. We won't let backends in * yet, not until we've reached the min recovery point specified in * control file and we've established a recovery snapshot from a * running-xacts WAL record. */ ...actually needs to be postponed until after we've reached consistency? I came the the same conclusion. It seems the before we've reached consistency, we shouldn't attempt to read anything because the data can be pretty arbitrarily garbled. There are pretty clear rules on what state clog can be in. When you launch postmaster in a standby: * Any clog preceding the nextXid from the checkpoint record we start recovery from, must either be valid, or the clog file must be missing altogether (which can happen when it was vacuumed away while the backup in progress - if the clog is still needed at the end of backup it must not be missing, of course). * Any clog following nextXid can be garbled or missing. Recovery will overwrite any clog after nextXid from the WAL, but not the clog before it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] citext operator precedence fix
On Fri, Sep 23, 2011 at 12:37 PM, Josh Berkus j...@agliodbs.com wrote: I'm OK with the proposed behavior change and I agree that it's probably what people want, but I am awfully suspicious that those extra casts are going to break something you haven't thought about. It might be worth posting a rough version first just to see if I (or someone else) can break it before you spend a lot of time on it. Additional breakage confirmed (hash functions, etc.) Looks like I need to add a lot more support functions and test. This is still worth doing, but don't expect it for the next commitfest. I would also be looking carefully at whether you can construct a scenario where the operator resolution code can't decide between =(text,citext) or =(text,text) - you probably need a third type like varchar or bpchar in the mix to trigger a failure, if there's one to be found. Or you might have a problem with citext = bpchar being ambiguous between =(text,citext) and =(varchar,text), or some such thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
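The sort of cross-type cases worth adding to the tests while probing for that ambiguity (whether each one resolves cleanly, fails with operator is not unique, or silently picks the case-sensitive text operator is exactly what needs checking):

    SELECT 'Foo'::citext = 'foo'::text;
    SELECT 'Foo'::citext = 'foo'::varchar;
    SELECT 'Foo'::citext = 'foo'::bpchar;
    SELECT 'Foo'::citext IN ('foo'::text, 'bar'::varchar);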
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Fri, Sep 23, 2011 at 12:51 PM, Josh Berkus j...@agliodbs.com wrote: I'm happy to make upgrades easier, but I want a path which eventually ends in recovery.conf going away. It's a bad API, confuses our users, and is difficult to support and maintain. I agree. GUC = Grand Unified Configuration, but recovery.conf is an example of where it's not so unified after all. We've already done a non-trivial amount of work to allow recovery.conf values to be specified without quotes, a random incompatibility with GUCs that resulted from having different parsing code for each file. If that were the last issue, then maybe it wouldn't be worth worrying about, but it's not. For example, it would be nice to have reload behavior on SIGHUP. And we keep talking about having an ALTER SYSTEM SET guc = value or SET PERMANENT guc = value command, and I think ALTER SYSTEM SET recovery_target_time = '...' would be pretty sweet. I don't want us to have to implement such things separately for postgresql.conf and recovery.conf. Now, it's true that Simon's proposal (of having recovery.conf automatically included) if it exists doesn't necessarily preclude those things. But it seems to me that it is adding a lot of complexity to core for a pretty minimal benefit to end-users, and that the semantics are not going to end up being very clean. For example, now you potentially have the situation where recovery.conf has work_mem=128MB and postgresql.conf has work_mem=4MB, and now when you end recovery you've got to make sure that everyone picks up the new setting. Now, in some sense you could say that's a feature addition, and I'm not going to deny that it might be useful to some people, but I think it's also going to require a fairly substantial convolution of the GUC machinery, and it's going to discourage people from moving away from recovery.conf. And like Josh, I think that ought to be the long-term goal, for the reasons he states. I don't want to go willy-nilly breaking third-party tools that work with PostgreSQL, but in this case I think that the reason there are so many tools in the first place is because what we're providing in core is not very good. If we are unwilling to improve it for fear of breaking compatibility with the tools, then we are stuck. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
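In other words, the end state being argued for is one where recovery settings behave like any other GUC, along the lines of (hypothetical; ALTER SYSTEM does not exist yet, only pg_reload_conf() does):

    ALTER SYSTEM SET recovery_target_time = '2011-09-23 12:00:00';
    SELECT pg_reload_conf();  -- picked up on SIGHUP like any other setting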
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
One idea: col like 'foo%' could be translated to col >= 'foo' and col <= 'foo' || 'zzz', where 'z' is the largest possible character. This should be good enough for calculating stats. How to find such a character, I do not know. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
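In other words, for estimation purposes only (a sketch; t and col are placeholders, and 'zzz' stands in for a suffix built from that largest character):

    -- the predicate being estimated:
    --   col LIKE 'foo%'
    -- would be costed as if it were the range:
    SELECT * FROM t WHERE col >= 'foo' AND col <= 'foo' || 'zzz';
    -- the open question above is how to pick the largest character for the
    -- column's collation.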
Re: [HACKERS] stored procedures
On Thu, Sep 1, 2011 at 12:18 PM, Josh Berkus j...@agliodbs.com wrote: On 8/31/11 12:15 PM, Merlin Moncure wrote: An out of process, autonomous transaction type implementation should probably not sit under stored procedures for a number of reasons -- mainly that it's going to expose too many implementation details to the user. For example, does a SP heavy app have 2*N running processes? Or do we slot them into a defined number of backends for that purpose? Yuck yuck. I like the AT feature, and kludge it frequently via dblink, but it's a solution for a different set of problems. I think that transaction control without parallelism would be the 80% solution. That is, an SP has transaction control, but those transactions are strictly serial, and cannot be run in parallel. For example, if you were writing an SP in PL/pgSQL, each BEGIN ... END block would be an explicit transaction, and standalone-only statements be allowed between BEGIN ... END blocks, or possibly in their own special block type (I prefer the latter). One issue we'd need to deal with is exception control around single-statement transactions and non-transactional statements (VACUUM, CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user is going to want to catch exceptions and abort the SP, and in other cases ignore them, so both need to be possible. Totally agree -- was thinking about this very issue. One of the things I'd really like to see SP be able to do is to abstract some of the nasty details of MVCC away from the client -- setting isolation mode, replaying errors on serialization, etc. This requires error handling. Unfortunately, this (exception handling in non transaction context) is probably going to add some complexity to the implementation. Are we on the right track here (that is, maybe we really *should* be looking at out of process execution)? How do procedures fit in terms of execution from the tcop down? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
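To pin down what transaction control without parallelism might look like from the user's side, a sketch in invented syntax (nothing like this exists; table names are placeholders, and each BEGIN ... END block is an explicit transaction as in Josh's description):

    CREATE PROCEDURE nightly_rollup() AS $$
    BEGIN  -- first transaction
        SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
        INSERT INTO daily_totals SELECT day, sum(amount) FROM sales GROUP BY day;
    END;   -- commits here
    VACUUM daily_totals;  -- standalone-only statement between transaction blocks
    BEGIN  -- second transaction
        DELETE FROM sales WHERE day < now() - interval '90 days';
    END;
    $$ LANGUAGE plpgsql;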
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On Fri, Sep 23, 2011 at 11:36 AM, Thom Brown t...@linux.com wrote: On 23 September 2011 15:56, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas robertmh...@gmail.com wrote: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? Gah. I mean, all forks of temporary relations, and all non-_init forks of *unlogged* relations, could be stored in the VOLATILE LOCATION. Permanent tables would have all forks in the regular LOCATION, along with _init forks of unlogged tables. Of course, that would have the problem that relpathbackend() would need to know the relpersistence value in order to compute the pathname, which I think is going to be ugly, come to think of it. I doubt I understand the whole _init forks thing correctly, Basically, for every unlogged table, you get an empty _init fork, and for every index of an unlogged table, you get an _init fork initialized to an empty index. The _init forks are copied over the main forks by the startup process before entering normal running. but can't the main tablespace provide sanctuary to such volatile supporting meta data (pg_version, _init and whatever else you're worried about) except the actual relation (and its vm/fsm)? Anything you can't afford to lose you get the main tablespace to look after. And instead of having a dir linked to the location in pg_tblspc, an actual dir could exist, containing items directly linked to items in the volatile location. Hmm... it doesn't sound quite right to me either. Well, we could certainly Decree From On High that the _init forks are all going to be stored under $PGDATA rather than in the tablespace directories. That would make things simple. Of course, it also means that if you want the _init forks stored somewhere, you are out of luck. Now maybe that is an unlikely scenario. Off the top of my head, the only case I can think of would be if the storage space or inode consumption requirements were problematic - and even then you could stick a symlink in there someplace to make it work, if you're the sort of person who knows how to do that. So maybe it's OK. But it makes me a little uneasy. When people ask to put stuff in a tablespace, I tend to think they want it to show up in that tablespace. Hmm, hmm... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Fri, Sep 23, 2011 at 12:58 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: There are pretty clear rules on what state clog can be in. When you launch the postmaster on a standby: * Any clog preceding the nextXid from the checkpoint record we start recovery from must either be valid, or the clog file must be missing altogether (which can happen when it was vacuumed away while the backup was in progress - if the clog is still needed at the end of the backup it must not be missing, of course). * Any clog following nextXid can be garbled or missing. Recovery will overwrite any clog after nextXid from the WAL, but not the clog before it. So the actual error message in the last test was: 2011-09-21 13:41:05 CEST FATAL: could not access status of transaction 1188673 ...but we can't tell if that was before or after nextXid, which seems like it would be useful to know. If Linas can rerun his experiment, but also capture the output of pg_controldata before firing up the standby for the first time, then we'd be able to see that information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
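As a sanity check on the file name and offset in that error message, a quick back-of-the-envelope query, assuming the default 8 kB block size and clog's layout of 2 status bits per transaction (32768 transactions per page, 32 pages per segment file):

    -- Map an xid to its clog segment file and the byte offset of its page.
    SELECT xid,
           xid / 32768                              AS clog_page,
           lpad(to_hex((xid / 32768) / 32), 4, '0') AS clog_segment_file,
           ((xid / 32768) % 32) * 8192              AS byte_offset_in_file
    FROM (VALUES (1188673)) AS t(xid);
    -- => page 36, file "0001", offset 32768, matching the failed read.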
Re: [HACKERS] [PATCH] Use new oom_score_adj without a new compile-time constant
On Mon, Sep 19, 2011 at 4:36 PM, Dan McGee d...@archlinux.org wrote: [ patch ] I suppose it's Tom who really needs to comment on this, but I'm not too enthusiastic about this approach. Duplicating the Linux kernel calculation into our code means that we could drift if the formula changes again. I like Tom's previous suggestion (I think) of allowing both constants to be defined - if they are, then we try oom_score_adj first and fall back to oom_adj if that fails. For bonus points, we could have postmaster stat() its own oom_score_adj file before forking and set a global variable to indicate the results. That way we'd only ever need to test once per postmaster startup (at least until someone figures out a way to swap out the running kernel without stopping the server...!). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large C files
On 23 September 2011 15:46, Robert Haas robertmh...@gmail.com wrote: I'm not opposed to adding something like this, but I think it needs to either be tied into the actual running of the script, or have a lot more documentation than it does now, or both. I am possibly stupid, but I can't understand from reading the script (or, honestly, the thread) exactly what kind of pgrminclude-induced errors this is protecting against; The basic idea is simple. There is a high likelihood that if removing a header silently alters the behaviour of an object file, it will also alter the symbol table for the same object file - in particular, the value of function symbols (their local offset in the object file), which relates to the number of instructions in each function. Yes, that is imperfect, but it's better than nothing, and intuitively I think that the only things it won't catch in practice are completely inorganic bugs (i.e. things done for the express purpose of proving that it can be broken). Thinking about it now, though, I have to wonder if I could have done a better job with objdump -td. but even if we clarify that, it seems like it would be a lot of work to run it manually on all the files that might be affected by a pgrminclude run, unless we can somehow automate that. I would have automated it if anyone had expressed any interest in the basic idea - it might be an over-reaction to the problem. I'm not sure. I'd have had it detect which object files might have been affected (through directly including the header, or indirectly including it by proxy). It could rename them such that, for example, xlog.o was renamed to xlog_old.o. Then, you make your changes, rebuild, and run the program again in a different mode. It notices the *_old.o files, and runs nm-diff on them. I'm also curious to see how much more fallout we're going to see from that run. We had a few glitches when it was first done, but it didn't seem like they were really all that bad. It might be that we'd be better off running pgrminclude a lot *more* often (like once a cycle, or even after every CommitFest), because the scope of the changes would then be far smaller and we wouldn't be dealing with 5 years of accumulated cruft all at once; we'd also get a lot more experience with what works or does not work with the script, which might lead to improvements in that script on a less-than-geologic time scale. Fair point. I'm a little busy with other things right now, but I'll revisit it soon. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] database object class of contrib/sepgsql
On Mon, Sep 12, 2011 at 5:45 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: The attached patch is a portion that we split off when we added the pg_shseclabel system catalog. It enables contrib/sepgsql to assign security labels on pg_database objects, which are used as the basis for computing the default security label of schema objects. Committed, although the fact that it didn't compile until I made schema.c include pg_database.h makes me wonder how thoroughly you tested this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
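As a usage sketch of what this enables (assuming the sepgsql module is loaded; the label string is just an example of the usual SELinux format):

    -- Assign a security label to a database; sepgsql then uses it as the basis
    -- for computing the default labels of schemas created within that database.
    SECURITY LABEL FOR selinux
        ON DATABASE template1
        IS 'system_u:object_r:sepgsql_db_t:s0';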
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Sep 23, 2011, at 17:46, Peter Eisentraut wrote: On tis, 2011-09-20 at 16:38 -0400, Robert Haas wrote: For now, I think we're best off not changing the terminology, and confining the remit of this patch to (a) turning all of the existing recovery.conf parameters into GUCs and (b) replacing recovery.conf with a sentinel file (name TBD) to indicate that the server is to start in recovery mode. The naming isn't great, but the more we change at once, the less chance of reaching agreement. It seems like we have pretty broad agreement on the basics here, so let's start with that. The only thing that's slightly bogus about that is that if you were doing an archive recovery, you'd have to edit the main postgresql.conf with one-shot parameters for that particular recovery (and then delete them again, or leave them in place, confusing the next guy). But perhaps that's worth the overall simplification. OTOH, if they're GUCs, you can specify them on the postmaster's command line. We could even get crazy and patch pg_ctl to allow: untar the base backup; pg_ctl recover -D dir --target_xid=<the xid that broke stuff> --target_inclusive=false; pg_ctl start -D dir. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Thu, Sep 22, 2011 at 11:31 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Sep 22, 2011 at 11:25 AM, Thom Brown t...@linux.com wrote: s/visca-versa/vice-versa/ s/laods/loads/ Fixed. v4 attached. Since it seems like people are fairly happy with this now, I've gone ahead and committed this version. I suspect there are bugs, but I don't think we're going to get much further until we actually try to use this for something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] Fix Leaky View Problem
On Mon, Sep 12, 2011 at 3:31 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I updated the patches of the fix-leaky-view problem according to the previous discussion. The NOLEAKY option was replaced by the LEAKPROOF option, and several regression test cases were added. The rest of the stuff is unchanged. You have a leftover reference to NOLEAKY. For the convenience of the reviewer, below is a summary of these patches: The Part-1 implements the corresponding SQL syntax, namely the security_barrier reloption of views, and the LEAKPROOF option on creation of functions, stored in the new pg_proc.proleakproof field. The way you have this implemented, we just blow away all view options whenever we do CREATE OR REPLACE VIEW. Is that the behavior we want? If a security_barrier view gets accidentally turned into a non-security_barrier view, doesn't that create a security hole? I'm also wondering if the way you're using ResetViewOptions() is the right way to handle this anyhow. Isn't that going to update pg_class twice? I guess that's probably harmless from a performance standpoint, but wouldn't it be better not to? I guess we could define something like AT_ReplaceRelOptions to handle this case. The documentation in general is not nearly adequate, at least IMHO. I'm a bit nervous about storing security_barrier in the RTE. What happens to stored rules if the security_barrier option gets changed later? More when I've had more time to look at this... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
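For readers without the patch at hand, a sketch of the two pieces of syntax under discussion (the view, table, and function names are invented for illustration):

    -- A security_barrier view: user-supplied quals are not pushed down past the
    -- view's own WHERE clause, so they cannot be used to leak hidden rows.
    CREATE VIEW my_accounts WITH (security_barrier = true) AS
        SELECT * FROM accounts WHERE owner = current_user;

    -- A LEAKPROOF function (recorded in pg_proc.proleakproof per the patch):
    -- it promises to reveal nothing about its arguments, e.g. through error
    -- messages, so the planner may push it down into a security_barrier view.
    CREATE FUNCTION is_positive(integer) RETURNS boolean
        AS 'SELECT $1 > 0'
        LANGUAGE sql IMMUTABLE LEAKPROOF;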
Re: [HACKERS] unite recovery.conf and postgresql.conf
Josh, There are many. Tools I can name include pgpool, 2warm, PITRtools, but there are also various tools from Sun, an IBM reseller I have forgotten the name of, OmniTI and various other backup software providers. Those are just the ones I can recall quickly. We've encouraged people to write software on top and they have done so. Actually, just to correct this list: * there are no tools from Sun * pgPool2 does not deal with recovery.conf I'm not sure what you mean by "not deal with", but part of pgpool-II's functionality assumes that we can easily generate recovery.conf. If recovery.conf is integrated into postgresql.conf, we need to edit postgresql.conf, which is a little bit harder than generating recovery.conf, I think. * there are additional ones: WAL-E, etc., which may or may not need to be edited. Breaking backwards compatibility isn't something we do elsewhere, when it's easy to keep it. FWIW, I've already found that I have to modify all my backup scripts for 9.0 from 8.4, and again for 9.1 from 9.0. So we do break backwards compatibility, frequently. I don't object to new functionality (and agreed to it upthread), just don't break the old way when there is no need. I'm happy to make upgrades easier, but I want a path which eventually ends in recovery.conf going away. It's a bad API, confuses our users, and is difficult to support and maintain. If you think it's easier on our users to do that in stages over several versions rather than in one fell swoop, then let's plan it that way. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
I'm not sure what you mean by "not deal with", but part of pgpool-II's functionality assumes that we can easily generate recovery.conf. If recovery.conf is integrated into postgresql.conf, we need to edit postgresql.conf, which is a little bit harder than generating recovery.conf, I think. Oh? Clearly I've been abusing pgPool2 then. Where's the code that generates that? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers