Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Tue, 2010-05-25 at 23:59 -0400, Robert Haas wrote:
 Quorum commit is definitely an extra knob, IMHO.

No, it's about three fewer, as I have explained.

Explain your position, don't just demand others listen.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 13:03 +0900, Fujii Masao wrote:
 On Wed, May 26, 2010 at 1:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
  On Tue, 2010-05-25 at 12:40 +0900, Fujii Masao wrote:
  On Tue, May 25, 2010 at 10:29 AM, Josh Berkus j...@agliodbs.com wrote:
   I agree that #4 should be done last, but it will be needed, not in the
   least by your employer ;-) .  I don't see any obvious way to make #4
   compatible with any significant query load on the slave, but in general
   I'd think that users of #4 are far more concerned with 0% data loss than
   they are with getting the slave to run read queries.
 
  Since #2 and #3 are enough for 0% data loss, I think that such users
  would be more concerned about what results are visible in the standby.
  No?
 
  Please add #4 also. You can do that easily at the same time as #2 and
  #3, and it will leave me free to fix the perceived conflict problems.
 
 I think that we should implement the feature in small steps rather than
 submit one big patch at a time. So I'd like to focus on #2 and #3 at first,
 and #4 later (maybe third or fourth CF).

We both know that if you do #2 and #3, then doing #4 as well is trivial.

If you leave it out then we'll end up missing something that is required
and have to rework everything.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-26 Thread KaiGai Kohei
The attached patch is a revised one for DML permission checks.

List of updates:
- Source code comments in the patched functions were revised.
- ExecCheckRTPerms() and ExecCheckRTEPerms() were moved to aclchk.c,
  and renamed to chkpriv_relation_perms() and chkpriv_rte_perms().
- It now takes a second argument (bool abort) as a hint for the behavior
  on access violations.
- It also returns AclResult, instead of bool.
- I assumed RI_Initial_Check() is not broken right now, so this patch
  just reworks the DML permission checks without any bugfixes.
- The ESP hook was moved to ExecCheckRTPerms() from ExecCheckRTEPerms().
- DoCopy() and RI_Initial_Check() now call the checker function with
  list_make1(rte), instead of a bare rte.
- In DoCopy(), required_access is used to store either ACL_SELECT or
  ACL_INSERT; it is initialized at the head of the function.
- In DoCopy(), it initializes selectedCols or modifiedCols of the RTE
  depending on is_from, instead of using columnsSet.

ToDo:
- makeRangeTblEntry() stuff to allocate an RTE node from given parameters
  is not done yet.

Thanks,

(2010/05/26 12:04), KaiGai Kohei wrote:
 (2010/05/26 11:12), Stephen Frost wrote:
 * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 #2: REALLY BIG ISSUE- You've added ExecutorCheckPerms_hook as part of
 this patch- don't, we're in feature-freeze right now and should not be
 adding hooks at this time.

 The patch is intended to be submitted for v9.1 development, not v9.0,
 isn't it?

 That really depends on if this is actually fixing a bug in the existing
 code or not.  I'm on the fence about that at the moment, to be honest.
 I was trying to find if we explicitly say that SELECT rights are needed
 to reference a column but wasn't able to.  If every code path is
 expecting that, then perhaps we should just document it that way and
 move on.  In that case, all these changes would be for 9.1.  If we
 decide the current behavior is a bug, it might be something which could
 be fixed in 9.0 and maybe back-patched.
 
 Ahh, because I found an independent problem during the discussion,
 it caused some confusion. Let me make clear that this patch does not
 intend to fix that bug.
 
 If we decide it is an actual bug to be fixed/documented, I also agree
 it should be worked on in a separate patch.
 
 Well, the rest of the discussion should happen in a different thread.
 
 In *either* case, given that one is a 'clean-up' patch and the other is
 'new functionality', they should be independent *anyway*.  Small
 incremental changes that don't break things when applied is what we're
 shooting for here.
 
 Agreed.
 
 #3: You didn't move ExecCheckRTPerms() and ExecCheckRTEPerms() to
 utils/acl and instead added executor/executor.h to ri_triggers.c.
 I don't particularly like that.  I admit that DoCopy() already knew
 about the executor, and if that were the only case outside of the
 executor where ExecCheckRTPerms() was getting called it'd probably be
 alright, but we already have another place that wants to use it, so
 let's move it to a more appropriate place.

 Sorry, I'm a bit confused.
 It seemed to me you suggested utilizing ExecCheckRTPerms() rather than
 moving its logic anywhere, so I kept it here. (Did I misunderstand?)

 I'm talking about moving the whole function (all 3 lines of it) to
 somewhere else and then reworking the function to be more appropriate
 based on its new location (including renaming and changing arguments
 and return values, as appropriate).
 
 OK, I agreed.
 
 If so, though, I doubt utils/acl is the best place for the moved
 ExecCheckRTPerms(), because the checker function calls both the
 default acl functions and an optional external security function.

 Can you explain why you think that having a function in utils/acl (eg:
 include/utils/acl.h and backend/utils/aclchk.c) which calls default acl
 functions and allows for an external hook would be a bad idea?

 It means ExecCheckRTPerms() is a caller of acl functions, not an acl
 function itself, doesn't it?

 It's providing a higher-level service, sure, but there's nothing
 particularly interesting or special about what it's doing in this case,
 and, we need it in multiple places.  Why duplicate it?
 
 If the number of checker functions is the only reason we move
 ExecCheckRTPerms() into backend/utils/aclchk.c right now, I
 don't have any opposition.
 When it reaches a dozen, we can consider a new location. Right?
 
 Sorry, the name of pg_rangetbl_aclcheck() was misleading for me.
 
 I agree the checker function is not part of the executor, but it is
 also not part of the acl functions, in my opinion.

 If we are disinclined to create a new directory for the checker
 function, my preference is src/backend/utils/adt/security.c and
 src/include/utils/security.h .

 We don't need a new directory or file for one function, as Robert
 already pointed out.
 
 OK, let's consider it when aclchk.c holds a dozen checker functions.
 
 #6: I haven't checked yet, but if there are other things in an RTE which
 would make 

Re: [HACKERS] fillfactor gets set to zero for toast tables

2010-05-26 Thread Takahiro Itagaki

Alvaro Herrera alvhe...@commandprompt.com wrote:

 Excerpts from Tom Lane's message of Fri May 14 15:03:57 -0400 2010:
 
  Maybe a better solution is to have some kind of notion of a default-only
  entry, which is sufficient to insert the default into the struct but
  isn't accepted as a user-settable item.
 
 This patch (for 8.4, but applies fuzzily to 9.0) implements this idea.
 Note that there's no explicit check that every heap option has a
 corresponding toast option; that's left to the developer's judgement to
 add.  I added the new member to relopt_gen struct so that existing
 entries did not require changes in initializers.

The new default_only field can be initialized only from internal code
and is not exposed to user-defined reloptions. We could add an additional
argument to the add_xxx_reloption() functions, but that would break the ABI.

How about the attached patch? It just fills in default values for
fillfactor (and analyze_threshold) for TOAST relations. I think the
responsibility for filling reloptions with proper values lies not in the
generic option routines but in the AM-specific reloption handlers.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



toast-ff-fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Open items related to SR

2010-05-26 Thread Fujii Masao
Hi,

Many open items related to SR are listed on the wiki again.
http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items

Let me clarify the status of those items.

 Smart shutdown gets stuck - patch to fix from Fujii Masao

Robert is reviewing and testing the patch I submitted.
I believe that the patch will be applied soon :)

 Where should wal_level description and location be in postgresql.conf

I don't think this should be addressed.

 Patch for distinguishing normal shutdown from unexpected exit

Robert is reviewing the patch I submitted.

 keep_mumble or min_wal_segments naming

Nobody has proposed a parameter name which everyone accepts.
This seems not to be worth further discussion, so how about
removing this from open items?

 xlog timeline 0 pg_xlogfile_name_offset

This subject doesn't match the linked discussion. The subject
represents the same problem as that of another item, "Streaming
replication and pg_xlogfile_name()".

In the linked discussion, I submitted a patch which adds a new
function useful for truncating unnecessary archived files. But after
that, restartpoint_command was committed for that purpose, so we don't
need to apply the patch right now. Remove the item?

 Streaming replication and pg_xlogfile_name()

As a result of the discussion, Heikki and I decided not to
address the root cause, but to just throw an error in
pg_xlogfile_name() if called during recovery.
http://archives.postgresql.org/pgsql-committers/2010-04/msg00075.php

Nothing more needs to be done for 9.0. How about moving the
item to 9.1 or later?

 Assertion failure in walreceiver

Already fixed.
http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php

 HS/SR and smart shutdown

Already fixed.
http://archives.postgresql.org/pgsql-committers/2010-04/msg00081.php

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 12:36 +0900, Fujii Masao wrote:
 On Wed, May 26, 2010 at 2:10 AM, Simon Riggs si...@2ndquadrant.com wrote:
  My suggestion is simply to have a single parameter (name unimportant)
 
  number_of_synch_servers_we_wait_for = N
 
  which is much easier to understand because it is phrased in terms of the
  guarantee given to the transaction, not in terms of what the admin
  thinks is the situation.
 
 How can we choose #2, #3 or #4 by using your proposed option?

 If "async", the standby never sends any ACK. If "recv", "fsync",
 or "redo", the standby sends the ACK when it has received, fsynced
 or replayed the WAL from the master, respectively.

Everything I've said about per-standby settings applies here, which
was based upon having just 2 settings: sync and async. If you have four
settings instead, things get even more complex. If we were going to
reduce complexity, it would be to reduce the number of options here to
just offering option #2 in the first phase.

AFAICS people would only ever select #2 or #4 anyway. IMHO #3 isn't
likely to be selected on its own because it performs badly for no real
benefit. Having two standbys, I might want to specify #2 to both, or if
one is down then #3 to the remaining standby instead.

Nobody else has yet tried to explain how we would specify what happens
when one of the standbys is down, with per-standby settings. Failure
modes are where the complexity is here. However we proceed, we must have
a discussion about how we specify the failure modes. This is not
something we should add on at the last minute; we should think about
that now and address it openly.

Oracle Data Guard is a great resource for what semantics we might need
to cover, but it's also a lesson in complexity from its per-standby
settings. Please look at net_timeout and alternate options in
particular. See how difficult it is to specify failure modes, even
though Data Guard offers probably dozens of parameters and options - its
orientation is per-standby not towards the transaction and the user.

 On the other hand, we add new GUC max_synchronous_standbys
 (I prefer it to number_of_synch_servers_we_wait_for, but does
 anyone have a better name?) as PGC_USERSET into postgresql.conf.
 It specifies the maximum number of standbys which transaction
 commit must wait for the ACK from.
 
 If max_synchronous_standbys is 0, no transaction commit waits for
 an ACK even if some connected standbys set their replication_mode to
 recv, fsync or redo. If it's positive, transaction commit waits
 for N ACKs, where N is the smaller of max_synchronous_standbys
 and the actual number of connected synchronous standbys.

To summarise, I think we can get away with just 3 parameters:
synchronous_replication = N # similar in name to synchronous_commit
synch_rep_timeout = T
synch_rep_timeout_action = commit | abort

Conceptually, this is "I want at least N replica copies made of my
database changes; I will wait for up to T milliseconds to get that,
otherwise I will do X". Very easy and clear for an application to
understand what guarantees it is requesting. Also very easy for the
administrator to understand the guarantees requested and how to
provision for them: to deliver robustness they typically need N+1
servers, or for even higher levels of robustness and performance N+2,
etc.

Making synchronous_replication into a USERSET would be an industry
first: transaction-controlled robustness at every level.
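
For illustration only (none of these GUCs exist yet; the names and the
accounts table are hypothetical), a transaction under this proposal might
look like:

BEGIN;
SET LOCAL synchronous_replication = 2;  -- this commit wants 2 replica copies
UPDATE accounts SET balance = balance - 100.00 WHERE aid = 1;
COMMIT;  -- returns once 2 standbys have ACKed, or after synch_rep_timeout
         -- milliseconds, when synch_rep_timeout_action decides the outcome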

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread Peter Eisentraut
On sön, 2010-05-23 at 00:50 -0400, Robert Haas wrote:
 Oid get_<object-type>_oid(List *qualname, bool missingok);
 -or-
 Oid get_<object-type>_oid(char *name, bool missingok);
 
 Thus get_database_oid and get_tablespace_oid would remain unchanged
 except for taking a second argument, get_roleid and get_roleid_checked
 would merge, and all of the others would change to match that style.

If you are doing some refactoring work in that area, maybe you can also
take care of the issue I talked about there:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00725.php


Our code contains about 200 copies of the following code:

tuple = SearchSysCache[Copy](FOOOID, ObjectIdGetDatum(fooid), 0, 0, 0);
if (!HeapTupleIsValid(tuple))
    elog(ERROR, "cache lookup failed for foo %u", fooid);


It looks like your proposal would reduce that number significantly.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Fujii Masao
On Wed, May 26, 2010 at 5:02 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Everything I've said about per-standby settings applies here, which
 was based upon having just 2 settings: sync and async. If you have four
 settings instead, things get even more complex. If we were going to
 reduce complexity, it would be to reduce the number of options here to
 just offering option #2 in the first phase.

 AFAICS people would only ever select #2 or #4 anyway. IMHO #3 isn't
 likely to be selected on its own because it performs badly for no real
 benefit. Having two standbys, I might want to specify #2 to both, or if
 one is down then #3 to the remaining standby instead.

I guess that dropping support for #3 doesn't reduce complexity, since
the code for #3 is almost the same as that for #2. Just as walreceiver
sends the ACK after receiving the WAL in the #2 case, it only has to do
the same thing after the WAL flush.

 Nobody else has yet tried to explain how we would specify what happens
 when one of the standbys is down, with per-standby settings. Failure
 modes are where the complexity is here. However we proceed, we must have
 a discussion about how we specify the failure modes. This is not
 something we should add on at the last minute; we should think about
 that now and address it openly.

 Imagine having 2 standbys, 1 synch, 1 async. If the synch server goes
 down, performance will improve and robustness will have been lost. What
 good would that be?

Are you concerned about the above case, which you described in another
post? In that case, if you want to ensure robustness, you can specify
#2, #3 or #4 for both standbys. If one of the standbys is in a remote
site, we can additionally set max_synchronous_standbys to 1. If you
don't want to fail over to the standby in the remote site when the
master goes down, you can specify #1 for the remote standby, so the
standby in the near location is always guaranteed to be in sync with
the master.

 Oracle Data Guard is a great resource for what semantics we might need
 to cover, but it's also a lesson in complexity from its per-standby
 settings. Please look at net_timeout and alternate options in
 particular. See how difficult it is to specify failure modes, even
 though Data Guard offers probably dozens of parameters and options - its
 orientation is per-standby not towards the transaction and the user.

Yeah, I'll research Oracle Data Guard.

 To summarise, I think we can get away with just 3 parameters:
 synchronous_replication = N     # similar in name to synchronous_commit
 synch_rep_timeout = T
 synch_rep_timeout_action = commit | abort

I agree to add the latter two parameters, which are also listed on
my outline of SynchRep.
http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability

 Conceptually, this is "I want at least N replica copies made of my
 database changes; I will wait for up to T milliseconds to get that,
 otherwise I will do X". Very easy and clear for an application to
 understand what guarantees it is requesting. Also very easy for the
 administrator to understand the guarantees requested and how to
 provision for them: to deliver robustness they typically need N+1
 servers, or for even higher levels of robustness and performance N+2,
 etc.

I don't feel that the synchronous_replication approach is intuitive for
the administrator. Even on this thread, some people seem to prefer
per-standby settings.

Without per-standby settings, when there are two standbys, one in
the near rack and another in a remote site, synchronous_replication=1
cannot guarantee that the near standby is always in sync with the master.
So when the master goes down, we might unfortunately have to fail over to
the remote standby. OTOH, synchronous_replication=2 degrades the
performance on the master very much. The synchronous_replication approach
doesn't seem to cover the typical use case.

Also, when synchronous_replication=1 and one of the synchronous standbys
goes down, how should the surviving standby catch up with the master?
Such a standby might be far behind the master. Should transaction commits
immediately start waiting for the ACK from the lagging standby even if
there is a large gap? If so, synch_rep_timeout would easily screw up
the replication.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Snapshot Materialized Views - GSoC

2010-05-26 Thread Peter Eisentraut
On fre, 2010-05-21 at 16:31 +0200, Florian Pflug wrote:
 I guess the justification is that with the same argument you could
 argue that a view should have relkind 'r', since it's just an empty
 table with a rewrite rule attached.

It used to be that way, but nowadays a view doesn't have an empty table
attached to it; it has no table at all.  Hence the different relkind.
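
A quick way to see this (example names made up):

CREATE TABLE t (a int);
CREATE VIEW v AS SELECT 1 AS one;
SELECT relname, relkind FROM pg_class WHERE relname IN ('t', 'v');
-- 't' shows relkind 'r' while 'v' shows relkind 'v'; since the view has
-- no underlying table anymore, reusing relkind 'r' would be misleading.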


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Regression testing for psql

2010-05-26 Thread Peter Eisentraut
On tis, 2010-05-25 at 06:23 -0400, Stephen Frost wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
   Of course, if people want to suggest tests that just shouldn't be
   included, I can go through and strip things out.
  
  Well...  I'm a little reluctant to believe that we should have 3.3M of
  tests for the entire backend and 5M of tests just for psql.  Then,
  too, there's the fact that many of these tests fail on my machine
  because my username is not sfrost, and/or because of row-ordering
  differences on backslash commands without enough ORDER BY to fully
  determine the output order.
 
 Yeah, you know, I had fully intended to go grepping through the output
 last night to check for things like that, but my wife decided I needed
 sleep instead. :)  Sorry about that.  Still, it's more of a general
 proposal than something I think should be committed as-is.  Should we
 try to deal with those kinds of differences, or just eliminate the tests
 which are dependent on username, etc?  It definitely strikes me that
 there's a fair bit of code in psql we're not exercising in some fashion
 in the regression suite... :/

Maybe pg_regress is not the right framework to test that sort of thing.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard

2010-05-26 Thread Mike Fowler

Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Tue, May 25, 2010 at 1:09 PM, Mike Fowler m...@mlfowler.com wrote:
    We're unlikely to accept this patch if it changes the minimum version
    of libxml2 required to compile PostgreSQL
   Why? 2.6.27 is almost 4 years old.
  At a minimum, I think it's fair to say that the burden is on you to
  justify why it's worth bumping the version number.

 Yes.  Increasing the minimum required version of some library is a Big
 Deal, we don't do it on a whim.  And we definitely don't do it just
 because it's old.

 regards, tom lane


OK, I consider myself suitably educated/chastised. I now understand why
a version bump is such a big deal. Your objections are all reasonable;
I suppose I'm just used to living on the bleeding edge of everything.
Consequently I have changed the code to produce the same result in a
different way without using the new function. I've downgraded the
required version to 2.6.26 and it all compiles cleanly. Please find
attached my revised patch, and thanks all for your advice.


Regards,

--
Mike Fowler
Registered Linux user: 379787

Index: src/backend/utils/adt/xml.c
===
RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.97
diff -c -r1.97 xml.c
*** src/backend/utils/adt/xml.c	3 Mar 2010 17:29:45 -0000	1.97
--- src/backend/utils/adt/xml.c	26 May 2010 09:36:50 -0000
***************
*** 3495,3497 ****
--- 3495,3678 ----
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xmlexists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype    *data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	xmlXPathObjectPtr xpathobj = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar    *string;
+ 	xmlChar    *xpath_expr;
+ 	int			i;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg("invalid array for XML namespace mapping"),
+ 	 errdetail("The array must be two-dimensional with length of the second axis equal to 2.")));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  &ns_names_uris, &ns_names_uris_nulls,
+ 		  &ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg("empty XPath expression")));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two parsings for the same value during one
+ 		 * command execution are possible)
+ 		 */
+ 		ctxt = xmlNewParserCtxt();
+ 		if (ctxt == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		"could not allocate parser context");
+ 		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+ 		if (doc == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ 		"could not parse XML document");
+ 		xpathctx = xmlXPathNewContext(doc);
+ 		if (xpathctx == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		"could not allocate XPath context");
+ 		xpathctx->node = xmlDocGetRootElement(doc);
+ 		if 

Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard (was: Add xpath_exists Function)

2010-05-26 Thread Peter Eisentraut
On tis, 2010-05-25 at 15:31 +0100, Mike Fowler wrote:
 I've been reading the SQL/XML standard and discovered that it defines a 
 function named XMLEXISTS that does exactly what the todo item 
 xpath_exists defines. My original patch named the function as per the 
 todo but I think using the function name from the standard is a better 
 idea. So this patch is the same as before, but the function is now named 
 XMLEXISTS instead of xpath_exists.

The XMLEXISTS function works with XQuery expressions and doesn't have
the call signature that your patch implements.
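
For comparison, a call in the standard's shape looks roughly like this
(a sketch only, using an XPath expression where the standard allows full
XQuery):

SELECT XMLEXISTS('//town[text() = ''Toronto'']'
                 PASSING BY REF '<towns><town>Toronto</town></towns>');

That is, the value to query is bound via a PASSING clause rather than
being passed as a second ordinary function argument.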



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard

2010-05-26 Thread Mike Fowler

Peter Eisentraut wrote:
 On tis, 2010-05-25 at 15:31 +0100, Mike Fowler wrote:
  I've been reading the SQL/XML standard and discovered that it defines a
  function named XMLEXISTS that does exactly what the todo item
  xpath_exists defines. My original patch named the function as per the
  todo but I think using the function name from the standard is a better
  idea. So this patch is the same as before, but the function is now named
  XMLEXISTS instead of xpath_exists.

 The XMLEXISTS function works with XQuery expressions and doesn't have
 the call signature that your patch implements.


Looking at the manuals of Oracle, Derby and DB2 I see how the call
signature differs. I also note that Oracle's implementation is XPath
only, Derby's is partial XQuery and DB2 appears to be full XQuery.

What do people prefer me to do? I see the options as:

1) Change the call signature to match the standard
2) Change the function name back to xpath_exists

Should option one prove the more popular, there are further choices:

1) Integrate XQuery support to completely match the standard; however,
this will require the addition of a new library, libxquery
2) Leave the XPath as is, in line with Oracle's implementation
3) Hybrid approach. Since XML is a compile-time option, add XQuery as
another. Conditional compilation gives the full XQuery support when
available, or just the XPath when not.

Thoughts?

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-26 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 I haven't dug in the SQL spec to see if that addresses
 the point, but it wouldn't bother me in the least to insist that
 both REFERENCES and SELECT privilege are required to create an FK.

Ok.  If we require REFERENCES and SELECT privs to create an FK then I
think the question is: when is the path in RI_Initial_Check not able
to be used (hence, why do we need a fall-back)?  My guess would be:

role X has:

Primary table: SELECT, REFERENCES
Foreign table: REFERENCES

This doesn't make much sense either though, because X has to own the
foreign table.

postgres=> alter table fk_tbl add foreign key (x) references pk_tbl;
ERROR:  must be owner of relation fk_tbl

So, the only situation, it seems, where the fall-back method has to be
used is when X owns the table but doesn't have SELECT rights on it.
Maybe it's just me, but that seems pretty silly.

If we require:

Primary table: SELECT, REFERENCES
Foreign table: OWNER, SELECT, REFERENCES

Then it seems like we should be able to eliminate the fall-back method
and just use the RI_Initial_Check approach.  What am I missing here? :/
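
(For context, the fast path in RI_Initial_Check() boils down to a single
whole-table validation query, roughly the sketch below, written here with
the fk_tbl/pk_tbl names from above and an illustrative PK column "a"; the
real query quotes identifiers and uses OPERATOR(pg_catalog.=).  Any row
returned is a violation; if this query can't be used, the slow path fires
the per-row RI trigger instead.

SELECT fk.x FROM ONLY fk_tbl fk
  LEFT OUTER JOIN ONLY pk_tbl pk ON pk.a = fk.x
 WHERE pk.a IS NULL AND fk.x IS NOT NULL;

Running that SELECT as the current user is what requires the SELECT
privilege under discussion.)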

 In any case, RI_Initial_Check isn't broken, because if it can't do
 the SELECTs it just falls back to a slower method.  It's arguable
 that the FK triggers themselves are assuming more than they should
 about permissions, but I don't think that RI_Initial_Check can be
 claimed to be buggy.

RI_Initial_Check is at least missing an optimization to support
column-level privileges.  If we say that REFERENCES alone is allowed to
create an FK, then the fall-back method is broken because it depends on
SELECT rights on the primary.

To be honest, I'm guessing that the reason there's so much confusion
around this is that the spec probably says you don't need SELECT rights
(I haven't checked though), and at some point in the distant past we
handled that correctly with the fall-back method, and that has since been
broken by other changes (possibly to plug the hole the fall-back method
was using).

I'll try to go decipher the spec so we can at least have something more
interesting to discuss (whether we agree with doing it how the spec says
or not :).

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/25/2010 3:18 PM, Kevin Grittner wrote:

Jan Wieck janwi...@yahoo.com wrote:
 

Have you ever looked at one of those queries that Londiste or
Slony issue against the provider DB in order to get all the log
data that has been committed between two snapshots? Is that really
the best you can think of?
 
No, I admit I haven't.  In fact, I was thinking primarily in terms
of log-driven situations, like HS.  What would be the best place for
me to look to come up to speed on your use case?  (I'm relatively
sure that the issue isn't that there's no information to find, but
that a sequential pass over all available information would take a
*long* time.)  I've been working through the issues on WAL-based
replicas, and have some additional ideas and alternatives, but I'd
like to see the big picture, including trigger-based replication,
before posting.


In short, what both systems are doing is as follows. An AFTER ROW
trigger records the OLD PK and all changed columns, as well as the txid
and a global, non-cached serial number. Some background process
periodically starts a serializable transaction and records the resulting
snapshot.
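
A minimal sketch of that bookkeeping (table and column names here are
made up for illustration, not Slony's or Londiste's actual schema):

CREATE TABLE repl_log (
    log_serial  bigserial,                      -- global, non-cached ordering
    log_xid     bigint DEFAULT txid_current(),  -- the writing transaction
    log_old_pk  text,                           -- captured by the AFTER ROW trigger
    log_changes text
);

CREATE TABLE repl_snapshot (
    snap_id serial PRIMARY KEY,
    snap    txid_snapshot DEFAULT txid_current_snapshot()
);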


To replicate from one consistent state to the next, the replication
system now selects all log rows between two snapshots. "Between" here
means it simulates MVCC visibility, in the sense that the writing
transaction was in progress when the first snapshot was taken and had
committed by the second. The resulting WHERE clause looks something like


WHERE (xid > s1.xmax OR (xid >= s1.xmin AND xid IN (s1.xip)))
  AND (xid < s2.xmin OR (xid <= s2.xmax AND xid NOT IN (s2.xip)))

Note that xip here is a comma-separated list of txids. I think it is
easy to see that this is not a cheap query.


Anyhow, that set of log rows is now ordered by the serial number and 
applied to the replica.
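
Using the sketch tables above, the same selection can be written with the
txid helper functions available since 8.4 (still not cheap, but more
readable than the hand-rolled visibility clause):

SELECT l.*
  FROM repl_log l, repl_snapshot s1, repl_snapshot s2
 WHERE s1.snap_id = 41 AND s2.snap_id = 42  -- the two bracketing snapshots
   AND NOT txid_visible_in_snapshot(l.log_xid, s1.snap)  -- not committed at s1
   AND txid_visible_in_snapshot(l.log_xid, s2.snap)      -- committed by s2
 ORDER BY l.log_serial;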


Without this logic, the replication system could not combine multiple
origin sessions into one replication session without risking never
finding a state in which it can commit.


It may be possible to work with two sessions on the replica and not 
require any reordering of the original actions at all. I need to think 
about that for a little longer since this idea just occurred to me a 
second ago.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/26/2010 7:03 AM, Jan Wieck wrote:
To replicate from one consistent state to the next, the replication
system now selects all log rows between two snapshots. "Between" here
means it simulates MVCC visibility, in the sense that the writing
transaction was in progress when the first snapshot was taken and had
committed by the second. The resulting WHERE clause looks something like


Or it entirely happened between the snapshots, obviously.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 2:31 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, 2010-05-25 at 23:59 -0400, Robert Haas wrote:
 Quorum commit is definitely an extra knob, IMHO.

 No, it's about three fewer, as I have explained.

 Explain your position, don't just demand others listen.

OK.  In words of one syllable, your way still has all the same knobs,
plus some more.

You sketched out a design which still had a per-standby setting for
each standby, but IN ADDITION had a setting to control
quorum commit[1].  You also argued that we needed four options for
each transaction rather than three[2], and that we need a userset GUC
to control the behavior on a per-transaction basis[3].  Not one other
person has agreed that we need all of these options in the first
version of the patch.  We don't.  We can start with a sync rep patch
that does ONE thing and does it well, and we can add these other
things later. I don't think I'm going too far out on a limb when I say
that it is easier to get a smaller patch committed than it is to get a
bigger one committed, and it is less likely to have bugs.

[1] http://archives.postgresql.org/pgsql-hackers/2010-05/msg01347.php
[2] http://archives.postgresql.org/pgsql-hackers/2010-05/msg01333.php
[3] http://archives.postgresql.org/pgsql-hackers/2010-05/msg01334.php

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 5:27 AM, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2010-05-23 at 00:50 -0400, Robert Haas wrote:
 Oid get_<object-type>_oid(List *qualname, bool missingok);
 -or-
 Oid get_<object-type>_oid(char *name, bool missingok);

 Thus get_database_oid and get_tablespace_oid would remain unchanged
 except for taking a second argument, get_roleid and get_roleid_checked
 would merge, and all of the others would change to match that style.

 If you are doing some refactoring work in that area, maybe you can also
 take care of the issue I talked about there:
 http://archives.postgresql.org/pgsql-hackers/2008-12/msg00725.php

 
 Our code contains about 200 copies of the following code:

 tuple = SearchSysCache[Copy](FOOOID, ObjectIdGetDatum(fooid), 0, 0, 0);
 if (!HeapTupleIsValid(tuple))
    elog(ERROR, "cache lookup failed for foo %u", fooid);
 

 It looks like your proposal would reduce that number significantly.

Well, not directly, because I was proposing to do something about
wanting to turn an object identified by name into an OID, rather than
wanting to look up an OID and find a syscache tuple.  But the same
approach could be applied to the problem you mention.

I still feel that we'd be better off putting all the functions that
use the same design pattern in a single file, rather than spreading
them out all over the backend.  It's true that that one file will then
depend on all the catalog stuff, but it actually can limit
dependencies a little bit on the other end, because if someone wants
to call a bunch of these functions from the same file, they only need
to include the one header where they are all declared, rather than all
the individual files that contain the individual functions.  And more
to the point, it's way easier from a maintenance standpoint: there is
much less chance that someone will change one function without
changing all the others if they are all in the same place.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-26 Thread Robert Haas
On Mon, May 24, 2010 at 10:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, May 24, 2010 at 10:26 PM, Robert Haas robertmh...@gmail.com wrote:
 This looks pretty reasonable to me, but I guess I feel like it would
 be better to drive the CancelBackup() decision off of whether we've
 ever reached PM_RUN rather than consulting XLogCtl.  It just feels
 cleaner to me to drive all of the postmaster decisions off of the same
 signalling mechanism rather than having a separate one (that only
 works because it's used very late in shutdown when we theoretically
 don't need a lock) just for this one case.

 Okay, how about the attached patch? It uses the postmaster-local flag
 ReachedEndOfRecovery (better name?) instead of XLogCtl one.

I've committed part of this patch, with the naming change that Tom
suggested.  The parts I haven't committed are:

1. I don't see why we need to reset ReachedEndOfRecovery when starting
over from PM_NO_CHILDREN.  It seems to me that once we reach PM_RUN, we
CAN'T go back to needing the backup label file, even if we have a
subsequent backend crash.  If I'm wrong, please let me know why and
I'll go put this back (with an appropriate comment).

2. The changes to avoid launching WALReceiver except during certain
PM_* states.  It seems fairly sensible, but what is the case where
adding this logic prevents a problem?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Alastair Turner
A suggestion, based on what I believe would be ideal default settings
for a fully developed SR capability. The thought is that as long as
the default behaviour was stable, additional knobs could be added
across version boundaries without causing trouble.

Per slave the master needs to know:
 - The identity of the slave, even if only to limit who can replicate
(this will have to be specified)
 - Whether to expect an acknowledgement from the slave (as will this)
 - How long to wait for the acknowledgement (this may be a default)
 - What the slave is expected to do before acknowledging (I think this
should default to remote flush to disk - #3 in the mail which started
this thread - since it prevents data loss without exposing the master
to the possibility of locking delays)

Additionally the process on the master requires:
 - How many acknowledgments to require before declaring success
(defaulted to the number of servers expected to acknowledge since it
will cause the fewest surprises when failing over to a replica)
 - What to do if the number of acknowledgments is not received
(defaulting to abort/rollback since this is really what differentiates
synchronous from asynchronous replication - the certainty that once
data has been committed it can be recovered)

So in order to set up synchronous replication, all a DBA would have to
specify is the slave server, that it is expected to send
acknowledgments, and possibly a timeout.

If this is in fact a desirable state for the default behaviour or
minimum settings requirement then I would say it is also a desirable
target for the first patch.

Alastair Bell Turner
^F5

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-26 Thread Robert Haas
On Tue, May 25, 2010 at 6:19 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, May 18, 2010 at 3:09 PM, Fujii Masao masao.fu...@gmail.com wrote:
 (2)
 pg_ctl -ms stop emits the following warning whenever there is the
 backup_label file in $PGDATA.

       WARNING: online backup mode is active
       Shutdown will not complete until pg_stop_backup() is called.

 This warning doesn't fit in with the shutdown during recovery case.
 Since smart shutdown might be requested by other than pg_ctl, the
 warning should be emitted in server side rather than client, I think.
 How about moving the warning to the server side?

 Though I'm not sure if this should be fixed for 9.0, I attached the
 patch (move_bkp_cancel_warning_v1.patch).

 Is this patch worth applying for 9.0? If not, I'll add it to
 the next CF for 9.1.

I'm not convinced that this is a good idea, because ISTM it will make
the error message less likely to be seen by the person running
pg_ctl.  In any case, it's a behavior change, so I think that means
it's a no-go for 9.0.

In terms of 9.1, it might make sense to log something to both places.
But maybe we shouldn't just do it once - maybe it should happen every
30 s or so until we actually manage to shut down, with a list of
what's still blocking shutdown a la errdetail_busy_db.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] out-of-date comment in CreateRestartPoint()

2010-05-26 Thread Robert Haas
This comment obviously requires adjustment now that HS is committed.
The obvious way to change it is to replace "when we get hot standby
capability" with "when running in Hot Standby mode", but I'm not clear
whether that's all that's required.

/*
 * If the last checkpoint record we've replayed is already our last
 * restartpoint, we can't perform a new restart point. We still update
 * minRecoveryPoint in that case, so that if this is a shutdown restart
 * point, we won't start up earlier than before. That's not strictly
 * necessary, but when we get hot standby capability, it would be rather
 * weird if the database opened up for read-only connections at a
 * point-in-time before the last shutdown. Such time travel is still
 * possible in case of immediate shutdown, though.
 *
 * We don't explicitly advance minRecoveryPoint when we do create a
 * restartpoint. It's assumed that flushing the buffers will do that as a
 * side-effect.
 */

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 18:52 +0900, Fujii Masao wrote:

 I guess that dropping support for #3 doesn't reduce complexity,
 since the code for #3 is almost the same as that for #2. Just as
 walreceiver sends the ACK after receiving the WAL in the #2 case, it
 only has to do the same thing after the WAL flush.

Hmm, well the code for #3 is similar also to the code for #4. So if you
do #2, it's easy to do #2, #3 and #4 together.

The comment is about whether having #3 makes sense from a user interface
perspective. It's easy to add options, but they must have useful
meaning.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 18:52 +0900, Fujii Masao wrote:

  To summarise, I think we can get away with just 3 parameters:
  synchronous_replication = N # similar in name to synchronous_commit
  synch_rep_timeout = T
  synch_rep_timeout_action = commit | abort
 
 I agree to add the latter two parameters, which are also listed on
 my outline of SynchRep.
 http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability
 
  Conceptually, this is "I want at least N replica copies made of my
  database changes; I will wait for up to T milliseconds to get that,
  otherwise I will do X". Very easy and clear for an application to
  understand what guarantees it is requesting. Also very easy for the
  administrator to understand the guarantees requested and how to
  provision for them: to deliver robustness they typically need N+1
  servers, or for even higher levels of robustness and performance N+2,
  etc.
 
 I don't feel that the synchronous_replication approach is intuitive for
 the administrator. Even on this thread, some people seem to prefer
 per-standby settings.

Maybe they do, but that is because nobody has yet explained how you
would handle failure modes with per-standby settings. When you do, they
will likely change their minds. Put the whole story on the table before
trying to force a decision.

 Without per-standby settings, when there are two standbys, one in
 the near rack and another in a remote site, synchronous_replication=1
 cannot guarantee that the near standby is always in sync with the master.
 So when the master goes down, unfortunately we might have to fail over to
 the remote standby.

If the remote server responded first, then that proves it is a better
candidate for failover than the one you think of as near. If the two
standbys vary over time then you have network problems that will
directly affect the performance on the master; synch_rep = N would
respond better to any such problems.

 OTOH, synchronous_replication=2 degrades the
 performance on the master very much. 

Yes, but only because you have only one near standby. It would clearly
be foolish to make this setting without 2+ near standbys. We would
then have 4 or more servers; how do we specify everything for that
config??

 The synchronous_replication approach
 doesn't seem to cover the typical use case.

You described the failure modes for the quorum proposal, but avoided
describing the failure modes for the per-standby proposal.

Please explain what will happen when the near server is unavailable,
with per-standby settings. Please also explain what will happen if we
choose to have 4 or 5 servers to maintain performance in case of the
near server going down. How will we specify the failure modes?

 Also, when synchronous_replication=1 and one of the synchronous standbys
 goes down, how should the surviving standby catch up with the master?
 Such a standby might be far behind the master. Should transaction commits
 immediately start waiting for the ACK from the lagging standby even if
 there is a large gap? If so, synch_rep_timeout would easily screw up
 the replication.

That depends upon whether we send the ACK at point #2, #3 or #4. It
would only cause a problem if you waited until #4.

I've explained the reasons for the proposals I've made so far: reduced
complexity in failure modes and better user control. To understand that
better, you or somebody needs to explain how we would handle the failure
modes with per-standby settings so we can compare.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-26 Thread Stephen Frost
KaiGai,

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 Yes, it is an entirely separate issue. I don't intend to argue whether
 we can assume the default PG permission allows the owner to SELECT on
 the table, or not.

This actually isn't a separate issue.  It's the whole crux of it, as a
matter of fact.  So, wrt the standard, you do NOT need SELECT rights on
a table to create an FK against it.  You only need REFERENCES.  PG
handles this correctly.

This error:

 postgres=> ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);
 ERROR:  permission denied for relation pk_tbl
 CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."pk_tbl" x WHERE
  "a" OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"

Is due to the *owner* of pk_tbl not having SELECT rights, a situation
we're not generally expecting to see.  What's happening is that prior to
the above being run, we've switched user over to the owner of the table
to perform this check.

This script illustrates what I'm talking about:

CREATE USER pk_owner;
CREATE USER fk_owner;
GRANT pk_owner TO sfrost;
GRANT fk_owner TO sfrost;

SET ROLE pk_owner;
CREATE TABLE pk_tbl (a int primary key, b text);
INSERT INTO pk_tbl VALUES (1,'aaa'), (2,'bbb'), (3,'ccc');
GRANT REFERENCES ON pk_tbl TO fk_owner;

SET ROLE fk_owner;
CREATE TABLE fk_tbl (x int, y text);
INSERT INTO fk_tbl VALUES (1,'xxx'), (2,'yyy'), (3,'zzz');

ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);

ALTER TABLE fk_tbl DROP CONSTRAINT fk_tbl_x_fkey;

SET ROLE pk_owner;
REVOKE ALL ON pk_tbl FROM pk_owner;

SET ROLE fk_owner;
ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);

ERROR:  permission denied for relation pk_tbl
CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."pk_tbl" x WHERE
"a" OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"

This does make me think (as I've thought in the past...) that we really
should say *who* doesn't have that permission.  We run into the same
problem with views: they're run as the owner of the view, so you can get
a permission denied error trying to select from the view when you
clearly have select rights on the view itself.

As it turns out, the check in RI_Initial_Check() that gates the speed-up
is whether the current role can just SELECT against the PK table, in which
case you can run the check as the FK user and not have to change user.
We can't just switch to the PK user and run the same query though,
because that user might not have any rights on the FK table.  So, we end
up taking the slow path, which fires off the FK trigger that's been set
up on the fk_tbl but which runs as the owner of the pk_tbl.

So, long-and-short, I don't see that we have a bug in any of this.  I do
think we should allow RI_Initial_Check() to run if it has the necessary
column-level permissions, and we should remove the duplicate
permissions-checking code in favor of using the same code the executor
will, which happens to also be where the new hook we're talking about
is.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] libpq, PQexecPrepared, data size sent to FE vs. FETCH_COUNT

2010-05-26 Thread Giles Lean

Abhijit Menon-Sen a...@toroid.org wrote:

 Unless you explicitly declare and fetch from an SQL-level cursor, your
 many GBs of data are going to be transmitted to libpq, which will eat
 lots of memory. (The wire protocol does have something like cursors,
 but libpq does not use them, it retrieves the entire result set.)

Sounds like a project.  Anyone got any suggestions about
semantics and function names?  (Assuming that this can be done
without causing more problems on the backend; I'd rather one
frontend client get messed up than mess up the server if
someone makes a query like that.)
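
For reference, the SQL-level route mentioned above looks roughly like
this (the table name is hypothetical):

BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM huge_table;
FETCH FORWARD 1000 FROM big_cur;  -- repeat until no rows are returned
CLOSE big_cur;
COMMIT;

psql's FETCH_COUNT setting automates essentially this pattern.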

I'm not exactly volunteering to work on something like this
(my TODO list is a trifle long) but I'm working on a native Go
language interface for PostgreSQL presently (influenced by but
not an exact clone of libpq) so it's perhaps something I could
do if I get free time in future.

Giles

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I still feel that we'd be better off putting all the functions that
 use the same design pattern in a single file, rather than spreading
 them out all over the backend.  It's true that that one file will then
 depend on all the catalog stuff, but it actually can limit
 dependencies a little bit on the other end, because if someone wants
 to call a bunch of these functions from the same file, they only need
 to include the one header where they are all declared,

This is nonsense, because the call sites are going to be places that are
actually *doing* something with that catalog, and so will need not only
the catalog .h file but any other support functions associated with
doing work on that catalog.  Centralizing the lookups will do nothing
whatsoever to reduce dependencies; it'll just create a central file
dependent on everything in addition to every dependency we have now.

The closest equivalent we have now is lsyscache.c, which is not exactly
a sterling example of how to design a module: it's got no conceptual
consistency whatsoever.

I'm for standardizing the API of lookup functions, but not for
relocating them.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-26 Thread Tom Lane
KaiGai Kohei kai...@ak.jp.nec.com writes:
 Hmm. If both REFERENCES and SELECT privilege are required to create
 a new FK constraint, why does RI_Initial_Check() need to check SELECT
 permission prior to SPI_execute()?

 It eventually checks SELECT privilege during execution of the secondary
 query. It is unclear to me why we need to provide a slower fallback.

Because the queries inside the triggers are done with a different
current userid.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 07:10 -0400, Robert Haas wrote:

 OK.  In words of one syllable, your way still has all the same knobs,
 plus some more.

I explained how the per-standby settings would take many parameters,
whereas per-transaction settings take far fewer.

 You sketched out a design which still had a per-standby setting for
 each standby, but IN ADDITION had a setting to control
 quorum commit[1].

No, you misread it. Again. The parameters were not IN ADDITION -
obviously so, otherwise I wouldn't claim there were fewer, would I?

Your reply has again avoided the subject of how we would handle failure
modes with per-standby settings. That is important.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] out-of-date comment in CreateRestartPoint()

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 16:16, Robert Haas wrote:

This comment obviously requires adjustment now that HS is committed.
The obvious way to change it is to replace "when we get hot standby
capability" with "when running in Hot Standby mode", but I'm not clear
whether that's all that's required.


I think that's all that's required.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Greg Stark
On Sun, May 23, 2010 at 9:21 PM, Jan Wieck janwi...@yahoo.com wrote:
 Each record of the Transaction Commit Info consists of

     txid          xci_transaction_id
     timestamptz   xci_begin_timestamp
     timestamptz   xci_commit_timestamp
     int64         xci_total_rowcount


So I think you're going about this backwards.

Instead of discussing implementation I think you should start with the
API the replication system needs. In particular I'm not sure you
really want a server-side query at all. I'm wondering if you wouldn't
be better off with a public machine-parsable text format version of
the WAL. Ie, at the same time as writing out all the nitty gritty to
the binary wal we would write out a summary of public data to an xml
version containing just parts of the data stream that we can promise
won't change, such as transaction id, lsn, timestamp.

-- 
greg



Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 9:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 This is nonsense

You can assert that, but I don't agree.  We certainly have places
(comment.c being the obvious example) where we need to look up a name
and map it to an OID without doing anything else, and actually I
believe there are useful ways to refactor the code that might lead to
more of this.  Anyway, I think the code maintenance argument ought to
carry a lot more weight than whether one or two small files get
rebuilt for dependencies slightly more often.  lsyscache.c might have
no conceptual consistency but it's extremely useful, and there are
plenty of other examples of where we've put code for different object
types into a single file to simplify maintenance and reduce code
complexity (e.g. copyfuncs, equalfuncs, outfuncs, etc.).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] libpq should not be using SSL_CTX_set_client_cert_cb

2010-05-26 Thread Garick Hamlin
On Tue, May 25, 2010 at 10:29:07PM -0400, Tom Lane wrote:
 I've been experimenting with SSL setups involving chains of CA
 certificates, ie, where the server or client cert itself is signed by
 an intermediate CA rather than a trusted root CA.  This appears to work
 well enough on the server side if you configure the server correctly
 (see discussion of bug #5468).  However, libpq is not able to work with
 a client certificate unless that cert is directly signed by a CA that
 the server trusts (ie, one listed directly in the server's root.crt file).
 This is because there is no good way to feed back any intermediate CA
 certs to the server.  The man page for SSL_CTX_set_client_cert_cb says
 in so many words that the client_cert_cb API is maldesigned:

I don't understand the problem (I don't know the code).  

However, requiring that intermediate certs be prepopulated in a trust store 
will effectively make use cases like Bridged PKI impractical (and I think 
use cases like that will be very interesting as things like the various PIV 
Card projects are implemented).  

InCommon is also rolling out personal and machine certificates soon, and 
there are some potentially similar use cases there for educational 
communities.  Their offering is not a bridge (although some users could 
join a bridge), but there will likely be many different intermediate CAs, 
created on a regular basis, so the same issue exists.

Ideally, I think the client code should be able to compute a path to an
offered Trust Anchor and present it.  The server should not have to store,
and keep up to date, the entire bridge's repository just to authenticate a
user.  That would be the right way to do it, IMO.

It should be easy for the client to build a path: they have the Trust Anchor 
in their store, along with whatever intermediate certs they need (otherwise 
their cert would be less than useful).  They just need to send that chain.

I am guessing the problem is that validating the presented chain is hard, 
or am I misunderstanding the reason for wanting to nail down all the 
intermediates ahead of time?

Again, I am not very familiar with what that code does currently.

Garick

 
   BUGS
 
   The client_cert_cb() cannot return a complete certificate chain,
   it can only return one client certificate. If the chain only has
   a length of 2, the root CA certificate may be omitted according
   to the TLS standard and thus a standard conforming answer can be
   sent to the server. For a longer chain, the client must send the
   complete chain (with the option to leave out the root CA
   certificate). This can only be accomplished by either adding the
   intermediate CA certificates into the trusted certificate store
   for the SSL_CTX object (resulting in having to add CA
   certificates that otherwise maybe would not be trusted), or by
   adding the chain certificates using the
   SSL_CTX_add_extra_chain_cert(3) function, which is only
   available for the SSL_CTX object as a whole and that therefore
   probably can only apply for one client certificate, making the
   concept of the callback function (to allow the choice from
   several certificates) questionable.
 
 It strikes me that we could not only fix this case, but make the libpq
 code simpler and more like the backend case, if we got rid of
 client_cert_cb and instead preloaded the ~/.postgresql/postgresql.crt
 file using SSL_CTX_use_certificate_chain_file().  Then, using an
 indirectly signed client cert would only require including the full cert
 chain in that file.
 
 So I'm wondering if there was any specific reason behind using the
 callback API to start with.  Anybody remember?
 
   regards, tom lane
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq should not be using SSL_CTX_set_client_cert_cb

2010-05-26 Thread Tom Lane
Garick Hamlin gham...@isc.upenn.edu writes:
 I am guessing the problem is that validating the presented chain is hard?  

No, the problem is that the current libpq code fails to present the
chain at all.  It will only load and send the first cert in the
postgresql.crt file.  This works only when the client's cert is signed
directly by one of the CAs trusted by the server.
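
A sketch of the suggested direction, using stock OpenSSL calls (the file
names are illustrative, error handling trimmed); note that
SSL_CTX_use_certificate_chain_file() loads the leaf cert plus any
intermediate certs that follow it in the same PEM file:

    #include <stdio.h>
    #include <openssl/ssl.h>
    #include <openssl/err.h>

    SSL_CTX *ctx = SSL_CTX_new(SSLv23_client_method());

    /* Load the client cert *and* any intermediate CA certs appended to
     * it in one PEM file; the whole chain is then sent to the server. */
    if (SSL_CTX_use_certificate_chain_file(ctx, "postgresql.crt") != 1)
        fprintf(stderr, "%s\n", ERR_reason_error_string(ERR_get_error()));

    if (SSL_CTX_use_PrivateKey_file(ctx, "postgresql.key",
                                    SSL_FILETYPE_PEM) != 1)
        fprintf(stderr, "%s\n", ERR_reason_error_string(ERR_get_error()));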

regards, tom lane



Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread alvherre
Excerpts from Robert Haas's message of Wed May 26 07:20:30 -0400 2010:

 I still feel that we'd be better off putting all the functions that
 use the same design pattern in a single file, rather than spreading
 them out all over the backend.  It's true that that one file will then
 depend on all the catalog stuff, but it actually can limit
 dependencies a little bit on the other end, because if someone wants
 to call a bunch of these functions from the same file, they only need
 to include the one header where they are all declared, rather than all
 the individual files that contain the individual functions.

This doesn't buy you anything, because that one header will likely have
to #include all the other headers anyway.  And if this is so, then all
those headers will now be included in all files that require even a
single one of these functions.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 11:01 AM, alvherre alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of Wed May 26 07:20:30 -0400 2010:

 I still feel that we'd be better off putting all the functions that
 use the same design pattern in a single file, rather than spreading
 them out all over the backend.  It's true that that one file will then
 depend on all the catalog stuff, but it actually can limit
 dependencies a little bit on the other end, because if someone wants
 to call a bunch of these functions from the same file, they only need
 to include the one header where they are all declared, rather than all
 the individual files that contain the individual functions.

 This doesn't buy you anything, because that one header will likely have
 to #include all the other headers anyway.  And if this is so, then all
 those headers will now be included in all files that require even a
 single one of these functions.

Well, at any rate, I'm giving up on the argument.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Show schema name on REINDEX DATABASE

2010-05-26 Thread Selena Deckelmann
On Mon, Apr 5, 2010 at 9:29 AM, Greg Sabino Mullane g...@turnstep.com wrote:
 Patch attached to show the schema *and* table name when doing
 a REINDEX DATABASE.

Is this something that can be added to 9.1 commitfest?

-selena

-- 
http://chesnok.com/daily - me



Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-26 Thread Stephen Frost
KaiGai,

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 The attached patch is a revised one for DML permission checks.

This is certainly a lot better.

 ToDo:
 - makeRangeTblEntry() stuff to allocate a RTE node with given parameter
   is not yet.

I'd certainly like to see the above done, or to understand why it can't
be if that turns out to be the case.

A couple of other comments, all pretty minor things:

- I'd still rather see the hook itself in another patch, but given that
  we've determined that none of this is going to go into 9.0, it's not
  as big a deal.

- The hook definition in aclchk.c should really be at the top of that
  file.  We've been pretty consistent about putting hooks at the top of
  files instead of deep down in the file; this should also follow that
  scheme.

- Some of the comments at the top of chkpriv_rte_perms probably make
  sense to move up to where it's called from execMain.c.  Specifically,
  the comments about the other RTE types (function, join, subquery).
  I'd probably change the comment in chkpriv_rte_perms to be simpler -
  "This is only used for checking plain relation permissions; nothing
  else is checked here" - and also have that same comment around
  chkpriv_relation_perms, both in aclchk.c and in acl.h.

- I'd move chkpriv_relation_perms above chkpriv_rte_perms, it's what we
  expect people to use, after all.

- Don't particularly like the function names.  How about
  relation_privilege_check?  Or rangetbl_privilege_check?  We don't use
  'perms' much (uh, at all?) in function names, and even if we did, it'd
  be redundant and not really help someone understand what the function
  is doing.

- I don't really like having 'abort' as the variable name for the 2nd
  argument.  I'm not finding an obvious convention right now, but maybe
  something like error_on_failure instead?

- In DoCopy, some comments about what you're doing there to set up for
  calling chkpriv_relation_perms would be good (like the comment you
  removed - /* We don't have table permissions, check per-column
  permissions */ - updated to something like "build an RTE with the
  referenced columns marked, to check for necessary privileges").
  Additionally, it might be worth considering if having an RTE built
  farther up in DoCopy would make sense and would then be usable for
  other bits in DoCopy.

- In RI_Initial_Check, why not build up an actual list of RTEs and just
  call chkpriv_relation_perms once (see the sketch after this list)?
  Also, you should add comments there, again, about what you're doing
  and why.  If you can use another function to build the actual RTE,
  this will probably fall out more sensibly too.

- Have you checked if there are any bad side-effects from calling
  ri_FetchConstraintInfo before doing the permissions checking?

- The hook in acl.h should be separated out and brought to the top and
  documented independently as to exactly where the hook is and what it
  can be used for, along with what the arguments mean, etc.  Similairly,
  chkpriv_relation_perms should really have a short comment for it about
  what it's for.  Something more than 'security checker function'.
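
As a sketch of what I mean for the RI_Initial_Check bit - the names here
follow your patch (chkpriv_relation_perms taking a list and a bool, and
returning AclResult), and the field handling is simplified, so treat it
as an illustration only:

    RangeTblEntry *fkrte = makeNode(RangeTblEntry);
    RangeTblEntry *pkrte = makeNode(RangeTblEntry);

    fkrte->rtekind = RTE_RELATION;
    fkrte->relid = RelationGetRelid(fk_rel);
    fkrte->requiredPerms = ACL_SELECT;

    pkrte->rtekind = RTE_RELATION;
    pkrte->relid = RelationGetRelid(pk_rel);
    pkrte->requiredPerms = ACL_SELECT;

    /* check both relations in one call; don't error out on failure */
    if (chkpriv_relation_perms(list_make2(fkrte, pkrte), false) != ACLCHECK_OK)
        return false;   /* caller falls back to the per-row checks */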

All pretty minor things that I'd probably just fix myself if I was going
to be committing it (not that I have that option ;).

Thanks,

Stephen




Re: [HACKERS] Regression testing for psql

2010-05-26 Thread Stephen Frost
* Peter Eisentraut (pete...@gmx.net) wrote:
 Maybe pg_regress is not the right framework to test that sort of thing.

Perhaps, but if not, then what?  And how can we avoid writing a bunch of
new code that would then need to be checked itself?

Thanks,

Stephen




Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-26 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Because the queries inside the triggers are done with a different
 current userid.

Indeed, I figured that out eventually too.  Sorry it took so long. :/

Thanks,

Stephen




Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/26/2010 10:04 AM, Greg Stark wrote:

On Sun, May 23, 2010 at 9:21 PM, Jan Wieck janwi...@yahoo.com wrote:

Each record of the Transaction Commit Info consists of

txid          xci_transaction_id
timestamptz   xci_begin_timestamp
timestamptz   xci_commit_timestamp
int64         xci_total_rowcount



So I think you're going about this backwards.

Instead of discussing implementation I think you should start with the
API the replication system needs. In particular I'm not sure you
really want a server-side query at all. I'm wondering if you wouldn't
be better off with a public machine-parsable text format version of
the WAL. Ie, at the same time as writing out all the nitty gritty to
the binary wal we would write out a summary of public data to an xml
version containing just parts of the data stream that we can promise
won't change, such as transaction id, lsn, timestamp.


Since the actual row level change information and other event data is 
found inside of regular tables, identified by TXID and sequence number, 
I am pretty sure I want that data in a server-side query. What you are 
proposing is to read the xid's and timestamps with an external process, 
that now forcibly needs to reside on the DB server itself (neither 
Londiste nor Slony have that requirement as of today), then bring it 
back into the DB at least inside the WHERE clause of a query.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin




Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 9:54 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Wed, 2010-05-26 at 07:10 -0400, Robert Haas wrote:

 OK.  In words of one syllable, your way still has all the same knobs,
 plus some more.

 I explained how the per-standby settings would take many parameters,
 whereas per-transaction settings take far fewer.

 You sketched out a design which still had a per-standby setting for
 each standby, but IN ADDITION had a setting to control
 quorum commit[1].

 No, you misread it. Again. The parameters were not IN ADDITION -
 obviously so, otherwise I wouldn't claim there were fewer, would I?

Well, that does seem logical, but I can't figure out how to reconcile
that with what you wrote before, because as far as I can see you're
just saying over and over again that your way will need fewer
parameters without explaining which parameters your way won't need.

And frankly, I don't think it's possible for quorum commit to reduce
the number of parameters.  Even if we have that feature available, not
everyone will want to use it.  And the people who don't will
presumably need whatever parameters they would have needed if quorum
commit hadn't been available in the first place.

 Your reply has again avoided the subject of how we would handle failure
 modes with per-standby settings. That is important.

I don't think anyone is avoiding that, we just haven't discussed it.
The thing is, I don't think quorum commit actually does anything to
address that problem.  If I have a master and a standby configured for
sync rep and the standby goes down, we have to decide what impact that
has on the master.  If I have a master and two standbys configured for
sync rep with quorum commit such that I only need an ack from one of
them, and they both go down, we still have to decide what impact that
has on the master.  I agree we need to talk about it, but I don't agree
that putting in quorum commit will remove the need to design that
case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Steve Singer

Jan Wieck wrote:

Since the actual row level change information and other event data is 
found inside of regular tables, identified by TXID and sequence number, 
I am pretty sure I want that data in a server-side query. What you are 
proposing is to read the xid's and timestamps with an external process, 
that now forcibly needs to reside on the DB server itself (neither 
Londiste nor Slony have that requirement as of today), then bring it 
back into the DB at least inside the WHERE clause of a query.



It depends on how you approach the problem.

If you had a process that could scan WAL files (or a platform/version 
independent representation of these WAL files) you could run that 
process on any server (the origin server, a replica, or some third 
server with the software installed).  Where you run it involves making 
trade-offs on the costs of storing, transferring and processing the files 
and would ideally be configurable.


You could then have a process that transfers all of the data logged by 
the triggers to the replicas as soon as it is committed.  Basically 
saying 'copy any rows in sl_log from the origin to the replica that we 
haven't already sent to that replica'


You could then move the work of figuring out the commit order onto the 
replica where you would combine the output of the WAL scanning process 
with the transaction data that has been copied to the replica.






Jan




--
Steve Singer
Afilias Canada
Data Services Developer
416-673-1142



Re: [HACKERS] libpq should not be using SSL_CTX_set_client_cert_cb

2010-05-26 Thread Garick Hamlin
On Wed, May 26, 2010 at 10:54:42AM -0400, Tom Lane wrote:
 Garick Hamlin gham...@isc.upenn.edu writes:
  I am guessing the problem is that validating the presented chain is hard?  
 
 No, the problem is that the current libpq code fails to present the
 chain at all.  It will only load and send the first cert in the
 postgresql.crt file.  This works only when the client's cert is signed
 directly by one of the CAs trusted by the server.

Sorry, I just re-read your original message.  You were clear, but I read
it wrong.

This is much less limiting than what I thought was being suggested.  Having 
a user's credentials work with only one trust anchor isn't that bad.  I am 
not familiar enough with OpenSSL to know if there is a specific pitfall in 
the change you suggested (which I think was what you were asking).

One could make it work with multiple TAs in a similar fashion if it also 
checked for the existence of a directory (like: ~/.postgresql/client_ta ) to 
store chains to each supported TA by fingerprint.  

That might not be worth the effort at this point...

Garick

 
   regards, tom lane



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 11:43 AM, Steve Singer ssin...@ca.afilias.info wrote:
 Jan Wieck wrote:

 Since the actual row level change information and other event data is
 found inside of regular tables, identified by TXID and sequence number, I am
 pretty sure I want that data in a server-side query. What you are proposing
 is to read the xid's and timestamps with an external process, that now
 forcibly needs to reside on the DB server itself (neither Londiste nor Slony
 have that requirement as of today), then bring it back into the DB at least
 inside the WHERE clause of a query.


 It depends on how you approach the problem.

 If you had a process that could scan WAL files (or a platform/version
 independent representation of these WAL files) you could run that process on
 any server (the origin server, a replica, or some third server with the
 software installed).  Where you run it involves making trade-offs on the
 costs of storing transferring and processing the files and would ideally be
 configurable.

 You could then have a process that transfers all of the data logged by the
 triggers to the replicas as soon as it is committed.  Basically saying 'copy
 any rows in sl_log from the origin to the replica that we haven't already
 sent to that replica'

 You could then move the work of figuring out the commit order onto the
 replica where you would combine the output of the WAL scanning process with
 the transaction data that has been copied to the replica.

I'm sure it's possible to make this work however you want to do it,
but I don't really see what advantage Greg Stark's proposal has over
Jan's original proposal.  Recording the commits in one extra place at
commit time is practically free, especially compared to the overall
cost of replication.  Rescanning the WAL seems likely to be much more
expensive and potentially introduces more failure paths.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/26/2010 10:04 AM, Greg Stark wrote:

Instead of discussing implementation I think you should start with the
API the replication system needs.


... but to answer that request, actually I don't even think we should be 
discussing API specifics.


During PGCon, Marco Kreen, Jim Nasby and I were discussing what the 
requirements of a unified message queue, shared by Londiste and Slony, 
might look like. For some use cases of pgq there isn't even any interest 
in user table changes; those cases simply need a reliable, database-backed 
message passing system.


Today both systems use an "agreeable order" of changes, selected by 
rather expensive queries based on serializable snapshot information and 
a global, non-cacheable serial number.


This could be replaced with logic based on the actual commit order of 
the transactions. This order does not need to be 100% accurate. As long 
as the order is recorded after all user actions have been performed 
(trigger queue shut down) and while the transaction is still holding 
onto its locks, that order is good enough. This will not allow a 
conflicting transaction, waiting on locks to be released, to appear to 
have committed before the lock conflict winner.


It is obvious that in cases where only small portions, or even none, of 
the user table changes are needed, holding on to or even parsing the 
ENTIRE WAL sounds suboptimal.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin




Re: [HACKERS] libpq should not be using SSL_CTX_set_client_cert_cb

2010-05-26 Thread Tom Lane
Garick Hamlin gham...@isc.upenn.edu writes:
 One could make it work with multiple TAs in a similar fashion if it also 
 checked for the existence of a directory (like: ~/.postgresql/client_ta ) to 
 store chains to each supported TA by fingerprint.  

 That might not be worth the effort at this point...

I'm inclined to think not.  You can instruct libpq to send a non-default
certificate file by setting its sslcert/sslkey parameters, and I think
what people would typically do is just treat those as known properties
of each server connection they have to deal with.  Implementing cert
selection logic inside libpq would simplify such cases, but I can't see
that anybody is likely to get around to that anytime soon.

Chained certs, on the other hand, definitely are in use in the real
world, so we'd better fix libpq to handle that case.

regards, tom lane



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Greg Stark
On Wed, May 26, 2010 at 5:10 PM, Jan Wieck janwi...@yahoo.com wrote:
 ... but to answer that request, actually I don't even think we should be
 discussing API specifics.


How about just API generalities? Like, where do you need this data, on
the master or on the slave? Would PGXC like it on the transaction
coordinator?

What question do you need to answer: do you need to pull out sets of
commits in certain ranges, or look up specific transaction ids and find
out when they committed?  Or do you only need to answer which of two
transaction ids committed first?



-- 
greg



Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard

2010-05-26 Thread Peter Eisentraut
On Wed, 2010-05-26 at 11:47 +0100, Mike Fowler wrote:
  The XMLEXISTS function works with XQuery expressions and doesn't have
  the call signature that your patch implements
 
 Looking at the manuals of Oracle, Derby and DB2 I see how the call 
 signature differs. I also note that Oracle's implementation is XPath 
 only, Derby's is partial XQuery and DB2 appears to be full XQuery.
 
 What do people prefer me to do? I see the options as:
 
 1) Change the call signature to match the standard
 2) Change the function name back to xpath_exists

It would be nice to make XMLEXISTS work as in the standard, seeing how
many others are providing the same interface.

 Should option one be the more popular, there are further choices:
 
 1) Integrate XQuery support to completely match the standard; however, 
 this will require the addition of a new library, libxquery
 2) Leave the XPath as is, in line with Oracle's implementation
 3) Hybrid approach. Since XML is a compile-time option, add XQuery as 
 another. Conditional compilation gives full XQuery support when 
 available, or just XPath when not

I think providing XPath is enough, at least for now.





Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 18:31, Robert Haas wrote:

And frankly, I don't think it's possible for quorum commit to reduce
the number of parameters.  Even if we have that feature available, not
everyone will want to use it.  And the people who don't will
presumably need whatever parameters they would have needed if quorum
commit hadn't been available in the first place.


Agreed, quorum commit is not a panacea.

For example, suppose that you have two servers, a master and a standby, 
and you want transactions to be synchronously committed to both, so that 
in the event of a meteor striking the master, you don't lose any 
transactions that have been acknowledged to the client as committed.


Now you want to set up a temporary replica of the master at a 
development server, for testing purposes. If you set quorum to 2, your 
development server becomes critical infrastructure, which is not what 
you want. If you set quorum to 1, it also becomes critical 
infrastructure, because it's possible that a transaction has been 
replicated to the test server but not the real production standby, and a 
meteor strikes.


Per-standby settings would let you express that, but not, OTOH, the quorum 
behavior where you require N out of M standbys to acknowledge the commit 
before returning to the client.


There's really no limit to how complex a setup can be. For example, 
imagine that you have two data centers, with two servers in each. You 
want to replicate the master to all four servers, but for commit to 
return to the client, it's enough that the transaction has been 
replicated to one server in each data center. How do you express that in 
the config file? And it would be nice to have per-transaction control 
too, like with synchronous_commit...


So this is a tradeoff between
* flexibility, how complex a setup you can express?
* code complexity, how complicated is it to implement?
* user-friendliness, how easy is it to configure?

One way out of this is to implement something very simple in PostgreSQL, 
and build external WAL proxying tools in pgfoundry that allow you to 
cascade and disseminate the WAL in as complex scenarios as you want.



Your reply has again avoided the subject of how we would handle failure
modes with per-standby settings. That is important.


I don't think anyone is avoiding that, we just haven't discussed it.
The thing is, I don't think quorum commit actually does anything to
address that problem.  If I have a master and a standby configured for
sync rep and the standby goes down, we have to decide what impact that
has on the master.  If I have a master and two standbys configured for
sync rep with quorum commit such that I only need an ack from one of
them, and they both go down, we still have to decide what impact that
has on the master.  I agree we need to talk about it, but I don't agree
that putting in quorum commit will remove the need to design that
case.


Right, failure modes need to be discussed, but how quorum commit or 
whatnot is configured is irrelevant to that.


No one has come up with a scheme for how to abort a transaction if you 
don't get a reply from a synchronous standby (or all standbys, or a 
quorum of standbys). Until someone does, a commit on the master will 
always have to succeed. The synchronous aspect will provide a 
guarantee that if a standby is connected, any transaction on the master 
will become visible on the standby too (or fsync'd, or just streamed to 
it, depending on the level) before it's acknowledged as committed to the 
client - nothing more, nothing less.


One way to do that would be to refrain from flushing the commit record 
to disk on the master until the standby has acknowledged it. The 
downside is that the master is in a very severe state at that point: 
until you flush the WAL, you can buffer only a small amount of WAL 
traffic before you run out of wal_buffers, stalling all write activity 
on the master, with backends waiting. You can't even shut down the 
server cleanly. But if you value your transaction integrity much higher 
than availability, maybe that's what you want.


PS. I whole-heartedly agree with Simon's concern upthread that if we 
allow a standby to specify in its config file that it wants to be a 
synchronous standby, that's a bit dangerous, because connecting such a 
standby to the master will suddenly make all commits on the master a lot 
slower. Adding a synchronous standby should require some action in the 
master, since it affects the behavior on the master.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 
 One way to do that would be to refrain from flushing the commit
 record to disk on the master until the standby has acknowledged
 it.
 
I'm not clear on the benefit of doing that, versus flushing the
commit record and then waiting for responses.  Either way some
databases will commit before others -- what is the benefit of having
the master lag?
 
 Adding a synchronous standby should require some action in the 
 master, since it affects the behavior on the master.
 
+1
 
-Kevin



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Heikki Linnakangas
Could you generate the commit-order log by simply registering a commit 
hook (RegisterXactCallback(XACT_EVENT_COMMIT)) that writes such a log 
somewhere in the data directory? That would work with older versions 
too, no server changes required.


It would not get called during recovery, but I believe that would be 
sufficient for Slony. You could always treat commits whose commit time 
you don't know as if they had committed simultaneously, as one batch.
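
A minimal sketch of such a module - the file name and log format here
are purely illustrative, and in practice you'd want to handle write
errors and file placement properly:

    #include "postgres.h"
    #include "access/xact.h"
    #include "access/transam.h"
    #include "fmgr.h"
    #include <stdio.h>

    PG_MODULE_MAGIC;

    static void
    commit_order_callback(XactEvent event, void *arg)
    {
        if (event == XACT_EVENT_COMMIT)
        {
            TransactionId xid = GetTopTransactionIdIfAny();
            FILE       *f;

            if (!TransactionIdIsValid(xid))
                return;     /* read-only transaction, nothing to log */

            f = fopen("commit_order.log", "a");
            if (f != NULL)
            {
                fprintf(f, "%u\n", xid);
                fclose(f);
            }
        }
    }

    void
    _PG_init(void)
    {
        RegisterXactCallback(commit_order_callback, NULL);
    }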


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 20:10, Kevin Grittner wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  wrote:


One way to do that would be to refrain from flushing the commit
record to disk on the master until the standby has acknowledged
it.


I'm not clear on the benefit of doing that, versus flushing the
commit record and then waiting for responses.  Either way some
databases will commit before others -- what is the benefit of having
the master lag?


Hmm, I was going to answer that, that way, no other transactions can see 
the transaction as committed before it has been safely replicated. But I 
now realize that you could also flush, but refrain from releasing the 
entry from the procarray until the standby acknowledges the commit, so 
that the transaction would look in-progress to other transactions on the 
master until then.


Although, if the master crashes at that point, and quickly recovers, you 
could see the last transactions committed on the master before they're 
replicated to the standby.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 11:31 -0400, Robert Haas wrote:
  Your reply has again avoided the subject of how we would handle failure
  modes with per-standby settings. That is important.
 
 I don't think anyone is avoiding that, we just haven't discussed it.

You haven't discussed it, but even before you do, you know it's better.
Not a very compelling perspective... 

 The thing is, I don't think quorum commit actually does anything to
 address that problem.  If I have a master and a standby configured for
 sync rep and the standby goes down, we have to decide what impact that
 has on the master.  If I have a master and two standbys configured for
 sync rep with quorum commit such that I only need an ack from one of
 them, and they both go down, we still have to decide what impact that
 has on the master.  

That's already been discussed, and AFAIK Masao and I already agreed on
how that would be handled in the quorum commit case.

What we haven't had explained is how you would handle all the sub cases
or failure modes for the per-standby situation.

The most common case for synch rep IMHO is this:

* 2 near standbys, 1 remote. Want to be able to ACK to the first near
standby that responds, or, if both are down, ACK to the remote.

I've proposed a way of specifying that with 3 simple parameters, e.g.
synch_rep_acks = 1
synch_rep_timeout = 30
synch_rep_timeout_action = commit

In Oracle this would be all of the following

* all nodes given unique names
DB_UNIQUE_NAME=master
DB_UNIQUE_NAME=near1
DB_UNIQUE_NAME=near2
DB_UNIQUE_NAME=remote

* parameter settings
LOG_ARCHIVE_CONFIG='DG_CONFIG=(master,near1, near2, remote)'

LOG_ARCHIVE_DEST_2='SERVICE=near1 SYNC AFFIRM NET_TIMEOUT=30
DB_UNIQUE_NAME=near1'
LOG_ARCHIVE_DEST_STATE_2='ENABLE'

LOG_ARCHIVE_DEST_3='SERVICE=near2 SYNC AFFIRM NET_TIMEOUT=30
DB_UNIQUE_NAME=near2'
LOG_ARCHIVE_DEST_STATE_3='ENABLE'

LOG_ARCHIVE_DEST_4='SERVICE=remote ASYNC NOAFFIRM DB_UNIQUE_NAME=remote'
LOG_ARCHIVE_DEST_STATE_4='ENABLE'

* modes
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;


The Oracle way doesn't allow you to specify that if near1 and near2 are
down then we should continue to SYNC via remote, nor does it allow you
to specify things from the user's perspective or at the transaction level.

You don't need to do it that way, for sure. But we do need to say what
way you would pick, rather than just arguing against me before you've
even discussed it here or off-list.

 I agree we need to talk about, but I don't agree
 that putting in quorum commit will remove the need to design that
 case.

Yes, you need to design for that case. It's not a magic wand.

All I've said is that covering the common cases is easier and more
flexible by choosing transaction-centric style of parameters, and it
also allows user settable behaviour.

I want to do better than Oracle, if possible, using lessons learned. I
don't want to do the same thing because we're copying them or because
we're going down the same conceptual dead end they went down. We should
try to avoid doing something obvious and aim a little higher.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread Tom Lane
alvherre alvhe...@commandprompt.com writes:
 Excerpts from Robert Haas's message of Wed May 26 07:20:30 -0400 2010:
 I still feel that we'd be better off putting all the functions that
 use the same design pattern in a single file, rather than spreading
 them out all over the backend.

 This doesn't buy you anything, because that one header will likely have
 to #include all the other headers anyway.  And if this is so, then all
 those headers will now be included in all files that require even a
 single one of these functions.

For the particular case Robert is proposing, the *header* isn't a
problem, because the only types it would deal in are Oid, bool,
const char *, and List *.  But you're right that in general this design
pattern carries a risk of having to include the world in a commonly-used
header file, which is certainly not a good idea.

regards, tom lane



Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 12:10 -0500, Kevin Grittner wrote:
  Adding a synchronous standby should require some action in the 
  master, since it affects the behavior on the master.
  
 +1

+1
 
-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 
 Although, if the master crashes at that point, and quickly
 recovers, you could see the last transactions committed on the
 master before they're replicated to the standby.
 
Versus having the transaction committed on one or more slaves but
not on the master?  Unless we have a transaction manager and do
proper distributed transactions, how do you avoid edge conditions
like that?
 
-Kevin



Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 19:55 +0300, Heikki Linnakangas wrote:

 Now you want to set up a temporary replica of the master at a 
 development server, for testing purposes. If you set quorum to 2, your
 development server becomes critical infrastructure, which is not what 
 you want.

That's a good argument for standby relays. Nobody hooks a disposable
test machine into a critical production config without expecting it to
have some effect.

 If you set quorum to 1, it also becomes critical 
 infrastructure, because it's possible that a transaction has been 
 replicated to the test server but not the real production standby, and
 a meteor strikes.

Why would you not want to use the test server? If it's the only thing
left protecting you, and you wish to be protected, then it sounds very
cool to me. In my proposal this test server only gets data ahead of
other things if the real production standby responds too slowly.

It scares the hell out of people that a DBA can take down a server and
suddenly the sync protection you thought you had is turned off. That way
of doing things means an application never knows the protection level
any piece of data has had. App designers want to be able to mark things
"handle with care" or "just do it quick, don't care much". It's a real
pain to have to handle all your data the same way, and for that to be
selectable only by administrators, who may or may not have everything
configured correctly/available.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 1:24 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 26/05/10 20:10, Kevin Grittner wrote:

 Heikki Linnakangasheikki.linnakan...@enterprisedb.com  wrote:

 One way to do that would be to refrain from flushing the commit
 record to disk on the master until the standby has acknowledged
 it.

 I'm not clear on the benefit of doing that, versus flushing the
 commit record and then waiting for responses.  Either way some
 databases will commit before others -- what is the benefit of having
 the master lag?

 Hmm, I was going to answer that that way no other transactions can see the
 transaction as committed before it has been safely replicated, but I now
 realize that you could also flush, but refrain from releasing the entry from
 procarray until the standby acknowledges the commit, so the transaction
 would look like in-progress to other transactions in the master until that.

 Although, if the master crashes at that point, and quickly recovers, you
 could see the last transactions committed on the master before they're
 replicated to the standby.

No matter what you do, there are going to be corner cases where one node
thinks the transaction committed and the other node doesn't know.  At
any given time, we're either in a state where a crash and restart on
the master will replay the commit record, or we're not.  And also, but
somewhat independently, we're in a state where a crash on the standby
will replay the commit record, or we're not.  Each of these depends on
a disk write, and there's no way to guarantee that both of those disk
writes succeed or both of them fail.

Now, in theory, maybe you could have a system where we don't have a
fixed definition of who the master is.  If either server crashes or if
they lose communication, both crash.  If they both come back up, they
agree on who has the higher LSN on disk and both roll forward to that
point, then designate one server to be the master.  If one comes back
up and can't reach the other, it appeals to the clusterware for help.
The clusterware is then responsible for shooting one node in the head
and telling the other node to carry on as the sole survivor.  When,
eventually, the dead node is resurrected, it *discards* any WAL
written after the point from which the new master restarted.

Short of that, I don't think "abort the transaction" is a recovery
mechanism for when we can't get hold of a standby.  We're going to
have to commit locally first, and then we can decide how long to wait
for an ACK that a standby has also committed the same transaction
remotely.  We can wait not at all, forever, or for a while and then
declare the other guy dead.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] WIP patch for serializable transactions with predicate locking

2010-05-26 Thread Selena Deckelmann
On Wed, May 19, 2010 at 5:37 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 I heard that others were considering work on predicate locks for
 9.1. Since Dan Ports of MIT and I have been working on that for the
 serializable implementation for the last few weeks, I felt it would
 be good to post a WIP patch to avoid duplicate effort.

I added this to the next commitfest with the 'WIP' prominent.

I figured it was worth including for initial reviews, although of
course, detailed work will likely wait until July.

-selena


-- 
http://chesnok.com/daily - me



Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 20:33, Kevin Grittner wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  wrote:


Although, if the master crashes at that point, and quickly
recovers, you could see the last transactions committed on the
master before they're replicated to the standby.


Versus having the transaction committed on one or more slaves but
not on the master?  Unless we have a transaction manager and do
proper distributed transactions, how do you avoid edge conditions
like that?


Yeah, I guess you can't. You can guarantee that a commit is always 
safely flushed first in the master, or in the standby, but without 
two-phase commit you can't guarantee atomicity. It's useful to know 
which behavior you get, though, so that you can take it into account in 
your failover procedure.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] psql \? \daS

2010-05-26 Thread Stephen Frost
Greetings,

  Noticed this while playing around with psql regression tests.

Thanks,

Stephen

commit 31bf61bab77ad991f2a67a41699384e57c021508
Author: Stephen Frost sfr...@snowman.net
Date:   Wed May 26 13:51:27 2010 -0400

Add 'S' to optional parameters for \da

\da supports \daS, \da+ and \daS+.  This updates psql's \? help
screen to include the 'S'.  Missed during the addition of S as
an option for most \d commands to list System objects.

diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index f129041..600ef82 100644
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
*** slashUsage(unsigned short int pager)
*** 196,202 ****
  	fprintf(output, _("  (options: S = show system objects, + = additional detail)\n"));
  	fprintf(output, _("  \\d[S+]                 list tables, views, and sequences\n"));
  	fprintf(output, _("  \\d[S+]  NAME           describe table, view, sequence, or index\n"));
! 	fprintf(output, _("  \\da[+]  [PATTERN]      list aggregates\n"));
  	fprintf(output, _("  \\db[+]  [PATTERN]      list tablespaces\n"));
  	fprintf(output, _("  \\dc[S]  [PATTERN]      list conversions\n"));
  	fprintf(output, _("  \\dC     [PATTERN]      list casts\n"));
--- 196,202 ----
  	fprintf(output, _("  (options: S = show system objects, + = additional detail)\n"));
  	fprintf(output, _("  \\d[S+]                 list tables, views, and sequences\n"));
  	fprintf(output, _("  \\d[S+]  NAME           describe table, view, sequence, or index\n"));
! 	fprintf(output, _("  \\da[S+] [PATTERN]      list aggregates\n"));
  	fprintf(output, _("  \\db[+]  [PATTERN]      list tablespaces\n"));
  	fprintf(output, _("  \\dc[S]  [PATTERN]      list conversions\n"));
  	fprintf(output, _("  \\dC     [PATTERN]      list casts\n"));




Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 
 Unless we have a transaction manager and do proper distributed
 transactions, how do you avoid edge conditions like that?
 
 Yeah, I guess you can't. You can guarantee that a commit is
 always safely flushed first in the master, or in the standby, but
 without two-phase commit you can't guarantee atomicity. It's
 useful to know which behavior you get, though, so that you can
 take it into account in your failover procedure.
 
It strikes me that if you always write the commit for the master
first, there's at least a possibility of developing a heuristic for
getting a slave back in sync should the connection break.  If you
randomly update zero to N slaves and then have a failure, I don't
see much hope.
 
-Kevin



Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 20:40, Simon Riggs wrote:

On Wed, 2010-05-26 at 19:55 +0300, Heikki Linnakangas wrote:

If you set quorum to 1, it also becomes critical
infrastructure, because it's possible that a transaction has been
replicated to the test server but not the real production standby, and
a meteor strikes.


Why would you not want to use the test server?


Because your failover procedures know nothing about the test server. 
Even if the data is there in theory, it'd be completely impractical to 
fetch it from there.



If it's the only thing
left protecting you, and you wish to be protected, then it sounds very
cool to me.  In my proposal this test server only gets data ahead of
other things if the real production standby responds too slowly.


There are many reasons why a test server could respond faster than the 
production standby. Maybe the standby is on a different continent. Maybe 
you have fsync=off on the test server because it's just a test server. 
Either way, you want the master to ignore it for the purpose of 
determining if a commit is safe.



It scares the hell out of people that a DBA can take down a server and
suddenly the sync protection you thought you had is turned off.


Yeah, it depends on what you're trying to accomplish. If durability is 
absolutely critical to you (vs. availability), you don't want the 
commit to ever be acknowledged to the client until it's safely flushed 
to disk on the standby, even if it means refusing any further commits on 
the master until the standby reconnects and catches up.


OTOH, if you're not that worried about durability, but you're load 
balancing queries to the standby, you want to ensure that when you run a 
query against the standby, a transaction that committed on the master is 
also visible in the standby. In that scenario, if a standby can't be 
reached, it is simply pronounced dead, and the master can just ignore it 
until it reconnects.



That way
of doing things means an application never knows the protection level
any piece of data has had. App designers want to be able to mark things
"handle with care" or "just do it quick, don't care much".


Yeah, that's useful too.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] Fwd: Re: [BUGS] dividing money by money

2010-05-26 Thread Kevin Grittner
Hi Andy,
 
Do you want to package this up as a patch for 9.1?  If not, is it OK
if I do?
 
-Kevin
 
 
Andy Balholm a...@balholm.com wrote:
 On Apr 1, 2010, at 7:57 AM, Kevin Grittner wrote:
 I'm inclined to think it's better to have an explicit cast from
 money to numeric, as long as it is exact, and leave the division
 of money by money as float8.  It does sort of beg the question of
 whether we should support a cast back in the other direction,
 though.  I think that would wrap this all up in a tidy package.
 
 OK. Here is the whole thing in C:
 
 [implementation as user-side functions]




Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 1:26 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Wed, 2010-05-26 at 11:31 -0400, Robert Haas wrote:
  Your reply has again avoided the subject of how we would handle failure
  modes with per-standby settings. That is important.

 I don't think anyone is avoiding that, we just haven't discussed it.

 You haven't discussed it, but even before you do, you know its better.
 Not very compelling perspective...

I don't really understand this comment.  I have said, and I believe,
that a system without quorum commit is simpler than one with quorum
commit.  I'd debate the point with you but I find the point so
self-evident that I don't even know where to begin arguing it.  I am
not saying, and I have not said, that we shouldn't have quorum commit.
 I am saying that it is not something that we need to add as part of
an initial sync rep patch, because we can instead add it in a
follow-on patch.  As far as I can tell, we are not trying to decide
between two competing approaches and therefore do not need to decide
which one is better.

Everything you are proposing sounds useful and valuable.  I am not
sure whether it handles all of the use cases that folks might have.
For example, Heikki mentioned the case upthread of wanting to wait for
a commit ACK from one of two servers in the data center A and one of
two servers in data center B, rather than just two out of four servers
total.  So, we might need to think a little bit about whether we want
to handle those kinds of cases and what sort of infrastructure we
would need to support it.  But certainly I think quorum commit sounds
like a good feature and I hope it will be included in 9.1, or if not
9.1, then some future version.

What I don't agree with is that it needs to be part of the initial
implementation of sync rep.  If there's a case for doing that, I don't
believe it's been made on this thread.  At any rate, the fact that I
don't see it as a sine qua non for sync rep is neither obstructionism
nor an ad hominem attack.  It's simply an opinion, which I believe to
be based on solid technical reasoning, but which I might change my
mind about if someone convinces me that I'm looking at the problem the
wrong way.  That would involve someone making an argument of the
following form: "If we don't implement quorum commit in the very first
implementation of sync rep, then it will be hard to add later because
X."  So far no one has done that.  You have made the similar argument
"If we do implement quorum commit in the very first version of sync
rep, it will save implementation work elsewhere" - but I don't think
that's true and I have explained why.

 The thing is, I don't think quorum commit actually does anything to
 address that problem.  If I have a master and a standby configured for
 sync rep and the standby goes down, we have to decide what impact that
 has on the master.  If I have a master and two standbys configured for
 sync rep with quorum commit such that I only need an ack from one of
 them, and they both go down, we still have to decide what impact that
 has on the master.

 That's already been discussed, and AFAIK Masao and I already agreed on
 how that would be handled in the quorum commit case.

I can't find that in the thread.  Anyway, again, you're probably going
to want the same options there that you will in the
master-with-one-standby case, which I personally think is going to be a
LOT more common than any other configuration.

 [configuration example]
 The Oracle way doesn't allow you to specify that if near1 and near2 are
 down then we should continue to SYNC via remote, nor does it allow you
 to specify things from user perspective or at transaction level.

 You don't need to do it that way, for sure. But we do need to say what
 way you would pick, rather than just arguing against me before you've
 even discussed it here or off-list.

Well, again, I am not arguing and have not argued that we shouldn't do
it, just that we shouldn't do it UNTIL we get the basic stuff working.
 On the substance of the design, the Oracle way doesn't look that bad
in terms of syntax (I suspect we'll end up with some of the same
knobs), but certainly I agree that it would be nice to do some of the
things they can't which you have detailed here.  I just don't want us
to bite off more than we can chew.  Then we might end up with nothing,
which would suck.

 I agree we need to talk about it, but I don't agree
 that putting in quorum commit will remove the need to design that
 case.

 Yes, you need to design for that case. It's not a magic wand.

 All I've said is that covering the common cases is easier and more
 flexible by choosing transaction-centric style of parameters, and it
 also allows user settable behaviour.

One of the ideas you proposed upthread, in terms of
transaction-centric behavior, is having an individual transaction be
able to ask for a weaker integrity guarantee than whatever the default
is.  I think that is both a very good idea 

Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Josh Berkus
On 5/25/10 10:04 PM, Heikki Linnakangas wrote:
 On 25/05/10 23:56, Josh Berkus wrote:
 Do we get a bit in the visibility map for a page which has aborted
 transaction rows on it?
 
 If there's a tuple with an aborted xmin on a page, the bit in the
 visibility map is not set. A tuple with aborted xmax doesn't matter.

Then it seems like pages in the visibility map, at least, would not need
to be vacuumed or frozen.  Do pages persist in the visibility map
indefinitely?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/26/2010 1:17 PM, Heikki Linnakangas wrote:
Could you generate the commit-order log by simply registering a commit 
hook (RegisterXactCallback(XACT_EVENT_COMMIT)) that writes such a log 
somewhere in the data directory? That would work with older versions 
too, no server changes required.




That would work, as it seems that the backend keeps holding on to its 
locks until after calling the callbacks.
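
For illustration only, a minimal sketch of such a callback module might 
look like this; the log file name is invented, and this deliberately 
ignores locking and the flush-ordering problem raised below:

#include "postgres.h"

#include "access/transam.h"
#include "access/xact.h"
#include "fmgr.h"
#include "storage/fd.h"

PG_MODULE_MAGIC;

void _PG_init(void);

static void
commit_order_callback(XactEvent event, void *arg)
{
    TransactionId xid;
    FILE       *f;

    if (event != XACT_EVENT_COMMIT)
        return;

    /* Read-only transactions never acquired an XID; nothing to record. */
    xid = GetCurrentTransactionIdIfAny();
    if (!TransactionIdIsValid(xid))
        return;

    /* AllocateFile() is the backend-safe wrapper around fopen(). */
    f = AllocateFile("commit_order.log", "a");
    if (f != NULL)
    {
        fprintf(f, "%u\n", xid);
        FreeFile(f);
    }
}

void
_PG_init(void)
{
    RegisterXactCallback(commit_order_callback, NULL);
}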


It would not get called during recovery, but I believe that would be 
sufficient for Slony. You could always batch commits that you don't know 
when they committed as if they committed simultaneously.


Here you are mistaken. If the origin crashes but recovers transactions 
whose entries were never flushed to the commit-order log, then the 
consumer has no idea about the order of those commits, which throws us 
back to the point where we require a non-cacheable global sequence to 
replay the individual actions of those now-batched transactions in an 
agreeable order.


The commit order data needs to be covered by crash recovery.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 21:35, Josh Berkus wrote:

On 5/25/10 10:04 PM, Heikki Linnakangas wrote:

On 25/05/10 23:56, Josh Berkus wrote:

Do we get a bit in the visibility map for a page which has aborted
transaction rows on it?


If there's a tuple with an aborted xmin on a page, the bit in the
visibility map is not set. A tuple with aborted xmax doesn't matter.


Then it seems like pages in the visibility map, at least, would not need
to be vacuumed or frozen.  Do pages persist in the visibility map
indefinitely?


In theory, until any tuple on the page is inserted/updated/deleted 
again. However, we've been operating on the assumption that it's always 
safe to clear any bit in the visibility map, without affecting 
correctness. I would not like to give up that assumption, it makes life 
easier.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Josh Berkus

 In theory, until any tuple on the page is inserted/updated/deleted
 again. However, we've been operating on the assumption that it's always
 safe to clear any bit in the visibility map, without affecting
 correctness. I would not like to give up that assumption, it makes life
 easier.

It wouldn't affect correctness, it would just force that page to be
vacuumed-and-frozen.  I think I can make this work, let me just hammer
it out.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 1:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 alvherre alvhe...@commandprompt.com writes:
 Excerpts from Robert Haas's message of Wed May 26 07:20:30 -0400 2010:
 I still feel that we'd be better off putting all the functions that
 use the same design pattern in a single file, rather than spreading
 them out all over the backend.

 This doesn't buy you anything, because that one header will likely have
 to #include all the other headers anyway.  And if this is so, then all
 those headers will now be included in all files that require even a
 single one of these functions.

 For the particular case Robert is proposing, the *header* isn't a
 problem, because the only types it would deal in are Oid, bool,
 const char *, and List *.  But you're right that in general this design
 pattern carries a risk of having to include the world in a commonly-used
 header file, which is certainly not a good idea.

Right.  I am very cognizant of the problem, but it isn't really an
issue in this case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mapping object names to role IDs

2010-05-26 Thread alvherre
Excerpts from Robert Haas's message of Wed May 26 10:34:00 -0400 2010:

 lsyscache.c might have no conceptual consistency but it's extremely
 useful,

I know I've been annoyed by lsyscache: looking for accessors to catalog
stuff, not finding them and so creating my own by using syscache
directly, only to find out later that they already existed there.
I think we should be moving in the direction of *removing* lsyscache,
not replicating it.

BTW I quite agree with both the suggestion you give in this thread
(modulo this issue), and Peter's idea of getting rid of the repetitive
syscache coding pattern.

 and there are
 plenty of other examples of where we've put code for different object
 types into a single file to simplify maintenance and reduce code
 complexity (e.g. copyfuncs, equalfuncs, outfuncs, etc.).

Well, that's all related to node manipulation, so I'm not so sure it's
exactly the same.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 2:44 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 26/05/10 21:35, Josh Berkus wrote:
 On 5/25/10 10:04 PM, Heikki Linnakangas wrote:
 On 25/05/10 23:56, Josh Berkus wrote:
 Do we get a bit in the visibility map for a page which has aborted
 transaction rows on it?

 If there's a tuple with an aborted xmin on a page, the bit in the
 visibility map is not set. A tuple with aborted xmax doesn't matter.

 Then it seems like pages in the visibility map, at least, would not need
 to be vacuumed or frozen.  Do pages persist in the visibility map
 indefinitely?

 In theory, until any tuple on the page is inserted/updated/deleted again.
 However, we've been operating on the assumption that it's always safe to
 clear any bit in the visibility map, without affecting correctness. I would
 not like to give up that assumption, it makes life easier.

What if we drove it off of the PD_ALL_VISIBLE bit on the page itself,
rather than the visibility map bit?  It would be safe to clear the
visibility map bit without touching the page, but if you clear the
PD_ALL_VISIBLE bit on the page itself then you set all the hint bits
and freeze all the tuples.  In the case where the visibility map bit
gets cleared but the page-level bit is still set, a future vacuum can
notice and reset the visibility map bit.  But whenever the visibility
map bit is set, you know that the page-level bit MUST be set, so you
needn't vacuum those pages, even for anti-wraparound: you know they'll
be frozen when and if they ever get written again.
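
As a rough sketch of that invariant: freeze_all_tuples() here is a 
made-up stand-in for the real per-tuple work (hint-bit setting plus 
heap_freeze_tuple()), so this shows only the rule, not the mechanism:

#include "postgres.h"

#include "storage/bufpage.h"

/* Hypothetical helper: set all hint bits and freeze every tuple. */
extern void freeze_all_tuples(Page page);

/*
 * Clearing the visibility map bit alone stays always-safe; only
 * clearing the page-level flag carries the new obligation to freeze
 * first, which is what lets anti-wraparound vacuum skip any page
 * whose map bit is set.
 */
static void
clear_page_all_visible(Page page)
{
    if (PageIsAllVisible(page))
    {
        freeze_all_tuples(page);
        PageClearAllVisible(page);
    }
}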

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 14:30 -0400, Robert Haas wrote:
 On Wed, May 26, 2010 at 1:26 PM, Simon Riggs si...@2ndquadrant.com wrote:
  On Wed, 2010-05-26 at 11:31 -0400, Robert Haas wrote:
   Your reply has again avoided the subject of how we would handle failure
   modes with per-standby settings. That is important.
 
  I don't think anyone is avoiding that, we just haven't discussed it.
 
  You haven't discussed it, but even before you do, you know it's better.
  Not very compelling perspective...
 
 I don't really understand this comment.  I have said, and I believe,
 that a system without quorum commit is simpler than one with quorum
 commit.  I'd debate the point with you but I find the point so
 self-evident that I don't even know where to begin arguing it.  

 It's simply an opinion, which I believe to
 be based on solid technical reasoning, but which I might change my
 mind about if someone convinces me that I'm looking at the problem the
 wrong way.

You're saying you have solid technical reasons, but they are so
self-evident that you can't even begin to argue them. Why are you so
sure your reasons are solid?? Regrettably, I say this doesn't make any
sense, however much you write.

The decision may already have been made in your eyes, but the community
still has options as to how to proceed, whether or not Masao has already
written this.

Zoltan has already presented a patch that follows my proposal, so there
are alternate valid paths which we can decide between. It's not a matter
of opinion as to which is easier to code because it's already done; you
can run the patch and see. (No comment on other parts of that patch).

The alternative is an approach that hasn't even been presented fully
on-list, with many unanswered questions. I've thought about this
myself and discussed my reasons on list for the past two years. If you
can read all I've presented to the community and come up with a better
way, great, we'll all be happy. 

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 21:43, Jan Wieck wrote:

On 5/26/2010 1:17 PM, Heikki Linnakangas wrote:

It would not get called during recovery, but I believe that would be
sufficient for Slony. You could always batch commits that you don't
know when they committed as if they committed simultaneously.


Here you are mistaken. If the origin crashes but recovers transactions
whose entries were never flushed to the commit-order log, then the
consumer has no idea about the order of those commits, which throws us
back to the point where we require a non-cacheable global sequence to
replay the individual actions of those now-batched transactions in an
agreeable order.

The commit order data needs to be covered by crash recovery.


Perhaps I'm missing something, but I thought that Slony currently uses a 
heartbeat, and all transactions committed between two beats are banged 
together and committed as one in the slave so that their relative commit 
order doesn't matter. Can we not do the same for commits missing from 
the commit-order log?


I'm thinking that the commit-order log would contain two kinds of records:

a) Transaction with XID X committed
b) All transactions with XID < X committed

During normal operation we write the 1st kind of record at every commit. 
After crash recovery (perhaps at the first commit after recovery or when 
the slon daemon first polls the server, as there's no hook for 
end-of-recovery), we write the 2nd kind of record.
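
As a sketch, with invented names, the records could be as small as:

typedef enum CommitOrderRecordType
{
    COREC_XACT_COMMITTED,       /* (a) transaction with XID x committed */
    COREC_ALL_BELOW_FINISHED    /* (b) all transactions with XID < x over */
} CommitOrderRecordType;

typedef struct CommitOrderRecord
{
    CommitOrderRecordType type;
    TransactionId xid;          /* the backend's 32-bit xid typedef */
} CommitOrderRecord;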


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Regression testing for psql

2010-05-26 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Then, too, there's the fact that many of these tests fail on my
 machine because my username is not sfrost, 

I've updated the patch to address this, it's again at:
http://snowman.net/~sfrost/psql-regress-help.patch

If the size is still an issue, I could just go through and remove the
commands which return lots of records (that would also remove most of
the info about the catalog, minimizing the effect on this set of tests
when people change the catalog..).  Downside there, of course, is that
we're not testing as many cases.  Still, better something than nothing.
:)

 and/or because of row-ordering differences on backslash commands
 without enough ORDER BY to fully determine the output order.

I don't believe this was ever actually an issue.  At least, I've run it
a number of times locally without issue.  If anyone is still getting
differences when run against 9.0 HEAD, please let me know.

commit e301c873740816c5f3fd5437dcc559c880b8f404
Author: Stephen Frost sfr...@snowman.net
Date:   Wed May 26 15:02:28 2010 -0400

Add regression tests for psql backslash commands

This patch adds rather extensive regression testing
of the psql backslash commands.  Hopefully this will
minimize issues such as the one which cropped up
recently with \h segfaulting.  Note that we don't
currently explicitly check all the \h options and that
many catalog changes will mean that this needs to also
be updated.  Still, it's a start, we can reduce the
set of tests if that makes sense or they become a
problem.  Also, any tests which would include the
owner of the database have been excluded.

Patch here:
http://snowman.net/~sfrost/psql-regress-help.patch

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] psql \? \daS

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 1:59 PM, Stephen Frost sfr...@snowman.net wrote:
  Noticed this while playing around with psql regression tests.

Good catch.  It looks like the + option doesn't actually do anything
for \da, though, so I'm inclined to just write \da[S] rather than
\da[S+] (see: \dl, \dC).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 3:13 PM, Simon Riggs si...@2ndquadrant.com wrote:
 I don't really understand this comment.  I have said, and I believe,
 that a system without quorum commit is simpler than one with quorum
 commit.  I'd debate the point with you but I find the point so
 self-evident that I don't even know where to begin arguing it.

 It's simply an opinion, which I believe to
 be based on solid technical reasoning, but which I might change my
 mind about if someone convinces me that I'm looking at the problem the
 wrong way.

 You're saying you have solid technical reasons, but they are so
 self-evident that you can't even begin to argue them. Why are you so
 sure your reasons are solid?? Regrettably, I say this doesn't make any
 sense, however much you write.

Yeah, especially when you juxtapose two different parts of my email
that were talking about different things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Joshua D. Drake
On Wed, 2010-05-26 at 15:37 -0400, Robert Haas wrote:
 On Wed, May 26, 2010 at 3:13 PM, Simon Riggs si...@2ndquadrant.com wrote:
  I don't really understand this comment.  I have said, and I believe,
  that a system without quorum commit is simpler than one with quorum
  commit.  I'd debate the point with you but I find the point so
  self-evident that I don't even know where to begin arguing it.
 
  It's simply an opinion, which I believe to
  be based on solid technical reasoning, but which I might change my
  mind about if someone convinces me that I'm looking at the problem the
  wrong way.
 
  You're saying you have solid technical reasons, but they are so
  self-evident that you can't even begin to argue them. Why are you so
  sure your reasons are solid?? Regrettably, I say this doesn't make any
  sense, however much you write.
 
 Yeah, especially when you juxtapose two different parts of my email
 that were talking about different things.

/me rings bell and orders the fighters to their respective corners.

 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company
 

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add _PG_init to PL language handler documentation

2010-05-26 Thread Robert Haas
On Tue, May 25, 2010 at 4:34 AM, Jonathan Leto jal...@gmail.com wrote:
 This tiny doc patch adds _PG_init to the skeleton example code for a
 PL. The information is quite valuable to PL authors, who might miss it
 when it is described in the shared library documentation.

I'm not sure it does much good to add it to the template without any
explanation of what it does.  What might make more sense is to add a
cross-reference to the section on dynamic loading, where this is
documented.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Regression testing for psql

2010-05-26 Thread alvherre
Excerpts from Stephen Frost's message of Wed May 26 15:19:59 -0400 2010:
 * Robert Haas (robertmh...@gmail.com) wrote:
  Then, too, there's the fact that many of these tests fail on my
  machine because my username is not sfrost, 
 
 I've updated the patch to address this, it's again at:
 http://snowman.net/~sfrost/psql-regress-help.patch

Isn't this kind of test a pain to maintain?  If somebody adds a new SQL
command, it will affect the entire \h output and she'll have to either
apply the changes without checking them, or manually check the complete
list.  I have only to add a new function to make the test fail ...

Also, having to exclude tests that mention the database owner means that
you're only testing a fraction of the commands, so any possible problem
has a large chance of going undetected.  I mean, if we're going to test
this kind of thing, shouldn't we be using something that allows us to
ignore the db owner name?  A simple wildcard in place of the owner name
would suffice ... or do we need a regex for some other reason?

The \h output normally depends on terminal width.  Have you handled that
somehow?

(And if we want something like this, I think we should not have a single
huge file for the complete test, but a set of smaller files.  I'd even
put the bunch in src/bin/psql/regress rather than the main regress dir.)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Dimitri Fontaine
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Perhaps I'm missing something, but I thought that Slony currently uses a
 heartbeat, and all transactions committed between two beats are banged
 together and committed as one in the slave so that their relative commit
 order doesn't matter. 

I guess Slony does the same as pgq here: all events of all those
transactions between two given ticks are batched together in the order
of the event commits. (In fact the batches are made at the consumer
request, so possibly spreading more than 2 ticks at a time).

If you skip that event ordering (within transactions), you can't
maintain foreign keys on the slaves, among other things.

The idea of this proposal is to be able to get this commit order
directly from where the information is maintained, rather than use some
sort of user sequence for that.

So even ordering the txid and txid_snapshots with respect to WAL commit
time (LSN) won't be the whole story: for any given transaction
containing more than one event, we also need to have them in order. I
know Jan didn't forget about it so it must either be in the proposal or
easily derived, too tired to recheck.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] cursor_to_xml: How to signal end?

2010-05-26 Thread Peter Eisentraut
Currently, cursor_to_xml returns an empty string when the end of the
cursor is reached (meaning the fetch returned zero rows).  As discussed
on -general, that's kind of weird, because you'd have to do something
like

IF val::text = ''

to test for the end, since there is no = operator for xml, and also
because '' isn't a valid XML value at all.

What would be a good way to address this?  Return null perhaps?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 4:11 PM, Dimitri Fontaine
dfonta...@hi-media.com wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Perhaps I'm missing something, but I thought that Slony currently uses a
 heartbeat, and all transactions committed between two beats are banged
 together and committed as one in the slave so that their relative commit
 order doesn't matter.

 I guess Slony does the same as pgq here: all events of all those
 transactions between two given ticks are batched together in the order
 of the event commits. (In fact the batches are made at the consumer
 request, so possibly spreading more than 2 ticks at a time).

 If you skip that event ordering (within transactions), you can't
 maintain foreign keys on the slaves, among other things.

 The idea of this proposal is to be able to get this commit order
 directly from where the information is maintained, rather than use some
 sort of user sequence for that.

Exactly.

 So even ordering the txid and txid_snapshots with respect to WAL commit
 time (LSN) won't be the whole story, for any given transaction
 containing more than one event we also need to have them in order. I
 know Jan didn't forget about it so it must either be in the proposal or
 easily derived, too tired to recheck.

Right, so the point is - with this proposal, he can switch to using a
LOCAL sequence number to order events within the session and then
order the sessions using the commit ordering.  Right now, he has to
use a GLOBAL sequence number because there's no way to know the commit
order.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Dimitri Fontaine
Simon Riggs si...@2ndquadrant.com writes:
 On Wed, 2010-05-26 at 19:55 +0300, Heikki Linnakangas wrote:
 Now you want to set up a temporary replica of the master at a 
 development server, for testing purposes. If you set quorum to 2, your
 development server becomes critical infrastructure, which is not what 
 you want.

 That's a good argument for standby relays. 

Well it seems to me we can have the best of both worlds as soon as we
have cascading support. Even in the test server example, this one would
be a slave of the main slave, not counted into the quorum on the master.

Now it's the quorum on the slave that decides on the availability of
the test server. Set it to 0 and your test server has no impact on the
production environment.

In the example of one master and 4 slaves in 2 different locations,
you'll have a quorum of 2 on the master, which will know about 2 slaves
only. And each of them will have 1 slave, with a quorum to set to 0 or 1
depending on what you want to achieve.

So if you want simplicity to admin, effective data availability and
precise control over the global setup, I say go for:
 a. transaction level control of the replication level
 b. cascading support
 c. quorum with timeout
 d. choice of commit or rollback at timeout

Then give me a setup example that you can't express fully.

As far as the options to control the whole thing are concerned, I think
that the cascading support does not add any. So that's 3 GUCs.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/26/2010 3:16 PM, Heikki Linnakangas wrote:

On 26/05/10 21:43, Jan Wieck wrote:

On 5/26/2010 1:17 PM, Heikki Linnakangas wrote:

It would not get called during recovery, but I believe that would be
sufficient for Slony. You could always batch commits that you don't
know when they committed as if they committed simultaneously.


Here you are mistaken. If the origin crashes but recovers transactions
whose entries were never flushed to the commit-order log, then the
consumer has no idea about the order of those commits, which throws us
back to the point where we require a non-cacheable global sequence to
replay the individual actions of those now-batched transactions in an
agreeable order.

The commit order data needs to be covered by crash recovery.


Perhaps I'm missing something, 


Apparently, more about that at the end.


I'm thinking that the commit-order log would contain two kinds of records:

a) Transaction with XID X committed
b) All transactions with XID < X committed


If that were true, then long-running transactions would delay all 
commits for transactions that started after them. Do they?




During normal operation we write the 1st kind of record at every commit. 
After crash recovery (perhaps at the first commit after recovery or when 
the slon daemon first polls the server, as there's no hook for 
end-of-recovery), we write the 2nd kind of record.


I think the callback is also called during backend startup, which means 
that it could record the first XID to come, which is known from the 
control file; in that case, all lower XIDs are committed or aborted.


Which leads us to your missing piece above, the need for the global
non-cacheable sequence.


Consider two transactions A and B that due to transaction batching 
between snapshots get applied together. Let the order of actions be


1. A starts
2. B starts
3. B selects a row for update, then updates the row
4. A tries to do the same and blocks
5. B commits
6. A gets the lock, the row, does the update
7. A commits

If Slony (or Londiste) did not record the exact order of those 
individual row actions, then it would have no idea whether, within that 
batch, the action of B (higher XID) actually came first. Without that 
knowledge there is a 50/50 chance of getting your replica out of sync 
with that simple conflict.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Kevin Grittner
Jan Wieck janwi...@yahoo.com wrote:
 
 Without this logic, the replication system could not combine
 multiple origin sessions into one replication session without
 risking to never find a state, in which it can commit.
 
My latest idea for handling this in WAL-based replication involves
WAL-logging information about the transaction through which the
committing transaction makes it safe to view.  There are a few
options here at the detail level that I'm still thinking through.
The idea would be that the xmin from read-only queries on the slaves
might be somewhere behind where you would expect based on
transactions committed.  (The details involve such things as where
non-serializable transactions fall into the plan on both sides, and
whether it's worth the effort to special-case read-only transactions
on the master.)
 
I can't say that I'm 100% sure that some lurking detail won't shoot
this technique down for HS, but it seems good to me at a conceptual
level.
 
I think, however, that this fails to work for systems like Slony and
Londiste because there could be transactions writing to tables which
are not replication targets, so the snapshot adjustments wouldn't be
safe.  True?  (If not true, I think that adding some sort of xmin
value, depending on the answers to the above questions, to Jan's
proposed structure might support better transactional integrity,
even to the level of full serializable support, at the cost of
delaying visibility of committed data.)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/26/2010 4:11 PM, Dimitri Fontaine wrote:

So even ordering the txid and txid_snapshots with respect to WAL commit
time (LSN) won't be the whole story, for any given transaction
containing more than one event we also need to have them in order. I
know Jan didn't forget about it so it must either be in the proposal or
easily derived, too tired to recheck.


No, that detail is actually not explained in the proposal. When applying 
all changes in transaction commit order, there is no need for a global 
sequence. A local counter per backend is sufficient because the total 
order of (xact-commit-order, local-xact-seq) yields a similarly 
agreeable order of actions.
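
A consumer-side sketch of that total order, with invented field names:

#include <stdint.h>

typedef struct ReplicatedAction
{
    uint64_t commit_seq;    /* position of the owning transaction in
                             * the commit-order log */
    uint32_t local_seq;     /* plain per-backend counter, incremented
                             * for every row action of the transaction */
} ReplicatedAction;

static int
replicated_action_cmp(const ReplicatedAction *a, const ReplicatedAction *b)
{
    if (a->commit_seq != b->commit_seq)
        return (a->commit_seq < b->commit_seq) ? -1 : 1;
    if (a->local_seq != b->local_seq)
        return (a->local_seq < b->local_seq) ? -1 : 1;
    return 0;
}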



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] functional call named notation clashes with SQL feature

2010-05-26 Thread Peter Eisentraut
It turns out that the SQL standard uses the function call notation

foo(this AS that)

for something else:

<routine invocation> ::= <routine name> <SQL argument list>

<routine name> ::= [ <schema name> <period> ] <qualified identifier>

<SQL argument list> ::= <left paren> [ <SQL argument> [ { <comma> <SQL
argument> }... ] ] <right paren>

<SQL argument> ::= <value expression>
| <generalized expression>
| <target specification>

<generalized expression> ::= <value expression> AS <path-resolved
user-defined type name>

In systems that have inheritance of composite types, this is used to
specify which type the value is supposed to be interpreted as (for
example, to treat the value as a supertype).

Seems kind of bad to overload this with something completely different.
What should we do?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 23:32, Jan Wieck wrote:

Consider two transactions A and B that due to transaction batching
between snapshots get applied together. Let the order of actions be

1. A starts
2. B starts
3. B selects a row for update, then updates the row
4. A tries to do the same and blocks
5. B commits
6. A gets the lock, the row, does the update
7. A commits

If Slony (or Londiste) did not record the exact order of those
individual row actions, then it would have no idea whether, within that
batch, the action of B (higher XID) actually came first. Without that
knowledge there is a 50/50 chance of getting your replica out of sync
with that simple conflict.


Hmm, I don't see how even a fully reliable WAL-logged commit-order log 
would save you then. It seems that you need to not only know the 
relative order of commits, but the order of commits relative to actions 
within the transactions. I.e. in the above example it's not enough to 
know that B committed before A, you also have to know that A updated the 
row only after B committed.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/26/2010 12:38 PM, Greg Stark wrote:

On Wed, May 26, 2010 at 5:10 PM, Jan Wieck janwi...@yahoo.com wrote:

... but to answer that request, actually I don't even think we should be
discussing API specifics.



How about just API generalities? Like, where do you need this data, on
the master or on the slave? Would PGXC like it on the transaction
coordinator?

What question do you need to answer, do you need to pull out sets of
commits in certain ranges or look up specific transaction ids and find
out when they committed? Or do you only need to answer which of two
transaction ids committed first?


The question I want answered is

  what were the order and xids of the next 0..n transactions that
  committed after transaction X?

Preferably I would avoid scanning the entire available WAL just to get 
the next n xid's to process.
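
Pictured as an interface, with entirely invented names and types, that 
question might look like:

#include "postgres.h"

typedef struct CommitOrderEntry
{
    TransactionId xid;      /* transaction that committed */
    uint64 seq;             /* its position in the commit order */
} CommitOrderEntry;

/*
 * Fill 'entries' with up to 'max' transactions that committed after
 * commit-order position 'after_seq'; returns the number found.
 */
extern int CommitOrderGetNext(uint64 after_seq,
                              CommitOrderEntry *entries, int max);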


The proposal assigned a unique serial number (file segment and position 
driven) to each xid and used that for the ordering as well as 
identification of the last known transaction. That is certainly a 
premature implementation detail.


In this implementation it wouldn't even matter if a transaction that was 
recorded actually never made it because it crashed before the WAL flush. 
It would be reported by this commit order feature, but there would be 
no traces of whatever it did to be found inside the DB, so that anomaly 
is harmless.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 23:45, Heikki Linnakangas wrote:

On 26/05/10 23:32, Jan Wieck wrote:

Consider two transactions A and B that due to transaction batching
between snapshots get applied together. Let the order of actions be

1. A starts
2. B starts
3. B selects a row for update, then updates the row
4. A tries to do the same and blocks
5. B commits
6. A gets the lock, the row, does the update
7. A commits

If Slony (or Londiste) did not record the exact order of those
individual row actions, then it would have no idea whether, within that
batch, the action of B (higher XID) actually came first. Without that
knowledge there is a 50/50 chance of getting your replica out of sync
with that simple conflict.


Hmm, I don't see how even a fully reliable WAL-logged commit-order log
would save you then. It seems that you need to not only know the
relative order of commits, but the order of commits relative to actions
within the transactions. I.e. in the above example it's not enough to
know that B committed before A, you also have to know that A updated the
row only after B committed.


Ok, I think I understand it now. The commit order is enough, because 
replaying the actions in the order "all actions of B, then all actions 
of A" yields the same result.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Josh Berkus

 What if we drove it off of the PD_ALL_VISIBLE bit on the page itself,
 rather than the visibility map bit?  It would be safe to clear the
 visibility map bit without touching the page, but if you clear the
 PD_ALL_VISIBLE bit on the page itself then you set all the hint bits
 and freeze all the tuples.  In the case where the visibility map bit
 gets cleared but the page-level bit is still set, a future vacuum can
 notice and reset the visibility map bit.  But whenever the visibility
 map bit is set, you know that the page-level bit MUST be set, so you
 needn't vacuum those pages, even for anti-wraparound: you know they'll
 be frozen when and if they ever get written again.

How does that get us out of reading and writing old pages, though?  If
we're going to set a bit on them, we might as well freeze them.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Show schema name on REINDEX DATABASE

2010-05-26 Thread alvherre
Excerpts from Selena Deckelmann's message of Wed May 26 11:07:40 -0400 2010:
 On Mon, Apr 5, 2010 at 9:29 AM, Greg Sabino Mullane g...@turnstep.com wrote:
  Patch attached to show the schema *and* table name when doing
  a REINDEX DATABASE.
 
 Is this something that can be added to 9.1 commitfest?

Not in this form, apparently.  Can we convince Greg or someone else to
work on adding some more error message fields?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 23:49, Jan Wieck wrote:

In this implementation it wouldn't even matter if a transaction that was
recorded actually never made it because it crashed before the WAL flush.
It would be reported by this commit order feature, but there would be
no traces of whatever it did to be found inside the DB, so that anomaly
is harmless.


Hmm, I think it would also not matter if the reported commit order 
doesn't match exactly the order of the commit records, as long as 
there's no dependency between the two transactions.


What I'm after is that I think it would be enough to establish the 
commit order using deferred triggers that are fired during pre-commit 
processing. The trigger could get a number from a global sequence to 
establish the commit order, and write it to a table. So you still need a 
global sequence, but it's only needed once per commit.


(you have to handle deferred triggers that fire after the commit-order 
trigger, perhaps by getting another number from the global sequence and 
replacing the previous number with it)


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-26 Thread Jan Wieck

On 5/26/2010 4:52 PM, Heikki Linnakangas wrote:

Ok, I think I understand it now. The commit order is enough, because 
replaying the actions in the order "all actions of B, then all actions 
of A" yields the same result.


Precisely.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] command tag logging

2010-05-26 Thread alvherre
Excerpts from Ray Stell's message of Wed May 26 17:08:33 -0400 2010:
 I just installed a compiled-from-source 8.3.11.  I usually include %i
 (command tag) in the log_line_prefix setting.  This causes some spewage
 I'd not seen before on "connection received" lines, as if it is dumping
 the environment:
 
 [unknown],17462,[unknown],2010-05-26 16:04:33.293 
 EDT,4bfd7ed1.4436,1,2010-05-26 16:04:33 
 EDT,0,/usr/local/pgsql8311/bin/postgres-D/var/database/pgsql/alerts_subscribeMANPATH=/usr/local/pgsql/man:HOSTNAME=test.cns.vt.eduTERM=xtermSHELL=/bin/bashHISTSIZE=1000ANT_HOME=/var/local/apache-antUSER=postgresqlLS_COLORS=no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;32:*.cmd=00;32:*.exe=00;32:*.com=00;32:*.btm=00;32:*.bat=00;32:*.sh=00;32:*.csh=00;32:*.tar=00;31:*.tgz=00;31:*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.zip=00;31:*.z=00;31:*.Z=00;31:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31:*.jpg=00;35:*.gif=00;35:*.bmp=00;35:*.xbm=00;35:*.xpm=00;35:*.png=00;35:*.tif=00;35:LD_LIBRARY_PATH=/usr/lib/openssl/:/usr/local/pgsql/lib:PATH=/usr/java/jdk1.6.0_20/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/db03/app/oracle/product/11.1.0/bin:/var/local/apache-ant/bin:/usr/local/maven/bin:/usr/local/pgsql/bi
 
nMAIL=/var/spool/mail/postgresql_=/usr/local/pgsql8311/bin/postgresPWD=/home/postgresqlINPUTRC=/etc/inputrcJAVA_HOME=/usr/java/jdk1.6.0_20LANG=en_US.UTF-8PGSYSCONFDIR=/usr/local/pgsql8311/etcSSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpassPGDIR=/usr/local/pgsqlHOME=/home/postgresqlSHLVL=2M2_HOME=/usr/local/mavenLOGNAME=postgresqlCVS_RSH=/usr/bin/sshPGDATA=/var/database/pgsql/alerts_subscribeLESSOPEN=|/usr/bin/lesspipe.sh
 %sORACLE_HOME=/db03/app/oracle/product/11.1.0G_BROKEN_FILENAMES=1 LOG:  
connection received: host=198.82.3.23 port=49723

Hmm, I bet it's the recent %.*s patch.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 23:31, Dimitri Fontaine wrote:

  d. choice of commit or rollback at timeout


Rollback is not an option. There is no going back after the commit 
record has been flushed to disk or sent to a standby.


The choice is to either commit anyway after the timeout, or wait forever.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Show schema name on REINDEX DATABASE

2010-05-26 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 Patch attached to show the schema *and* table name when doing
 a REINDEX DATABASE.
 
 Is this something that can be added to 9.1 commitfest?

 Not in this form, apparently.  Can we convince Greg or someone else to
 work on adding some more error message fields?

I think you might be confusing this with my somewhat more controversial 
patch to show schemas when a COPY error occurs. This just changes
the normal output, not the error output. Nobody has complained 
about this patch yet.


- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005261728
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkv9kvcACgkQvJuQZxSWSsgdvwCgygmLyKdAHKflL8WucdNvfuPn
84sAoOBuvAIgupEUcThNWW3KsPaIu9MQ
=Ptoe
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Jan Wieck

On 5/26/2010 12:55 PM, Heikki Linnakangas wrote:

On 26/05/10 18:31, Robert Haas wrote:

And frankly, I don't think it's possible for quorum commit to reduce
the number of parameters.  Even if we have that feature available, not
everyone will want to use it.  And the people who don't will
presumably need whatever parameters they would have needed if quorum
commit hadn't been available in the first place.


Agreed, quorum commit is not a panacea.

For example, suppose that you have two servers, master and a standby, 
and you want transactions to be synchronously committed to both, so that 
in the event of a meteor striking the master, you don't lose any 
transactions that have been reported to the client as committed.


Now you want to set up a temporary replica of the master at a 
development server, for testing purposes. If you set quorum to 2, your 
development server becomes critical infrastructure, which is not what 
you want. If you set quorum to 1, it also becomes critical 
infrastructure, because it's possible that a transaction has been 
replicated to the test server but not the real production standby, and a 
meteor strikes.


Per-standby settings would let you express that, but not OTOH the quorum 
behavior where you require N out of M to acknowledge the commit before 
returning to the client.



You can do this only with per-standby options, by giving each standby a 
weight, or a number of votes. Your DEV server would have a weight of 
zero, while your production standbys have higher weights, depending on 
their importance for your overall infrastructure. As long as majority 
means more than 50% of all votes in the house, you don't have a 
split-brain risk.
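
A sketch of that majority test, with invented names; the weights and 
acknowledgements would come from per-standby configuration and standby 
feedback:

#include <stdbool.h>

/*
 * Commit is acknowledged once the acked weight strictly exceeds half
 * of the total configured weight; the strict majority is what avoids
 * split brain when the network partitions.
 */
static bool
quorum_reached(const int *weights, const bool *acked, int nstandbys)
{
    int     total = 0;
    int     got = 0;
    int     i;

    for (i = 0; i < nstandbys; i++)
    {
        total += weights[i];
        if (acked[i])
            got += weights[i];
    }
    return got * 2 > total;
}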



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

