Re: [HACKERS] warning message in standby

2010-06-29 Thread Fujii Masao
On Tue, Jun 15, 2010 at 11:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On the other hand, I like immediate-panicking. And I don't want the standby
 to retry reconnecting the master infinitely.

On second thought, the peremptory PANIC is not good for HA system. If the
master unfortunately has written an invalid record because of its crash,
the standby would exit with PANIC before performing a failover.

So when an invalid record is found in streamed WAL file, we should keep
the standby running and leave the decision whether the standby retries to
connect to the master forever or shuts down right now, up to the user
(actually, it may be a clusterware)?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Look-behind regular expressions

2010-06-29 Thread Thom Brown
Are there any plans to support look-behind and negative look-behind
for regular expressions in future?

Thanks

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-06-29 Thread Mike Fowler

Mike Fowler wrote:

Bruce Momjian wrote:

I have added this to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view?id=6   
Thanks Bruce. Attached is a revised patch which changes the code 
slightly such that it uses an older version of the libxml library. 
I've added comments to the code so that we remember why we didn't use 
the latest function.


After seeing some other posts in the last couple of days, I realised I 
hadn't documented the function in the SGML. I have now done so, and 
added a couple of tests with XML literals. Please find the patch 
attached. Now time to go correct the xmlexists patch too...


--
Mike Fowler
Registered Linux user: 379787

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8626,8631  SELECT xpath('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a',
--- 8626,8664 
  (1 row)
  ]]/screen
 /para
+
+sect3
+ titlexpath_exists/title
+ 
+ indexterm
+  primaryxpath_exists/primary
+ /indexterm
+ 
+ synopsis
+  functionxpath_exists/function(replaceablexpath/replaceable, replaceablexml/replaceableoptional, replaceablensarray/replaceable/optional)
+ /synopsis
+ 
+ para
+  The function functionxpath_exists/function is a specialised form
+  of the functionxpath/function function. Though the functions are
+  syntactically the same the xpath expressions are evaluated in differing
+  contexts. Instead of returning the XML values that satisfy the xpath, this
+  function returns a boolean indicating whether the query was satisfied or not.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xpath_exists('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a', 
+ ARRAY[ARRAY['my', 'http://example.com']]);
+ 
+  xpath_exists  
+ 
+  t
+ (1 row)
+ ]]/screen
+ /para
+/sect3
/sect2
  
sect2 id=functions-xml-mapping
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3495,3497  xpath(PG_FUNCTION_ARGS)
--- 3495,3681 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	ArrayBuildState *astate = NULL;
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	xmlXPathObjectPtr xpathobj = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			res_nitems;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two 

Re: [HACKERS] warning message in standby

2010-06-29 Thread Robert Haas
On Tue, Jun 29, 2010 at 3:55 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jun 15, 2010 at 11:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On the other hand, I like immediate-panicking. And I don't want the standby
 to retry reconnecting the master infinitely.

 On second thought, the peremptory PANIC is not good for HA system. If the
 master unfortunately has written an invalid record because of its crash,
 the standby would exit with PANIC before performing a failover.

I don't think that should ever happen.  The master only streams WAL
that it has fsync'd.  Presumably there's no reason for the master to
ever fsync a partial WAL record (which is usually how a corrupt record
gets into the stream).

 So when an invalid record is found in streamed WAL file, we should keep
 the standby running and leave the decision whether the standby retries to
 connect to the master forever or shuts down right now, up to the user
 (actually, it may be a clusterware)?

Well, if we want to leave it up to the user/clusterware, the current
code is possibly adequate, although there are many different log
messages that could signal this situation, so coding it up might not
be too trivial.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-06-29 Thread Mike Fowler
Mike Fowler wrote:  
Thanks again for your help Robert, turns out the fault was in the 
pg_proc entry (the 3 up there should've been a two!). Once I took the 
grammar out it was quickly obvious where I'd gone wrong.


Attached is a patch with the revised XMLEXISTS function, complete with 
grammar support and regression tests. The implemented grammar is:


XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] )

Though the full grammar makes everything after the xpath_expression 
optional, I've left it has mandatory simply to avoid lots of rework of 
the function (would need new null checks, memory handling would need 
reworking).




As with the xpath_exists patch I've now added the SGML documentation 
detailing this function and extended the regression test a little to 
test XML literals.


Regards,

--
Mike Fowler
Registered Linux user: 379787

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8554,8562  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]/screen
  /para
 /sect3
  
!sect3
  titleXML Predicates/title
  
  indexterm
   primaryIS DOCUMENT/primary
--- 8554,8570 
  ]]/screen
  /para
 /sect3
+/sect2
  
!sect2
  titleXML Predicates/title
+ 
+ indexterm
+  primaryXML Predicates/primary
+ /indexterm
+  
+sect3
+ titleIS DOCUMENT/title
  
  indexterm
   primaryIS DOCUMENT/primary
***
*** 8574,8579  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
--- 8582,8616 
   between documents and content fragments.
  /para
 /sect3
+
+sect3
+ titleXMLEXISTS/title
+ 
+ indexterm
+  primaryXMLEXISTS/primary
+ /indexterm
+ 
+ synopsis
+ functionXMLEXISTS/function(replaceablexpath/replaceable PASSING BY REF replaceablexml/replaceable optionalBY REF/optional)
+ /synopsis
+ 
+ para
+  The function functionxmlexists/function returns true if the replaceablexml/replaceable 
+  satisfies the replaceablexpath/replaceable and false otherwise.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'townstownToronto/towntownOttawa/town/towns');
+ 
+  xmlexists
+ 
+  t
+ (1 row)
+ ]]/screen
+ /para
+/sect3
/sect2
  
sect2 id=functions-xml-processing
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***
*** 423,431  static TypeName *TableFuncTypeName(List *columns);
  %type list	opt_check_option
  
  %type target	xml_attribute_el
! %type list	xml_attribute_list xml_attributes
  %type node	xml_root_version opt_xml_root_standalone
! %type ival	document_or_content
  %type boolean xml_whitespace_option
  
  %type node 	common_table_expr
--- 423,432 
  %type list	opt_check_option
  
  %type target	xml_attribute_el
! %type list	xml_attribute_list xml_attributes xmlexists_list
  %type node	xml_root_version opt_xml_root_standalone
! %type node	xmlexists_query_argument_list xml_default_passing_mechanism xml_passing_mechanism
! %type ival	document_or_content 
  %type boolean xml_whitespace_option
  
  %type node 	common_table_expr
***
*** 511,523  static TypeName *TableFuncTypeName(List *columns);
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
--- 512,524 
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
***
*** 539,545  static TypeName *TableFuncTypeName(List *columns);
  
  	WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
  
! 	XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLFOREST XMLPARSE
  	XMLPI XMLROOT XMLSERIALIZE
  
  	YEAR_P YES_P
--- 540,546 
  
  	WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
  
! 	XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLPARSE
  	XMLPI XMLROOT XMLSERIALIZE
  
  	YEAR_P YES_P
***
*** 9806,9811  func_expr:	func_name '(' ')' over_clause
--- 9807,9828 
  {
  	$$ = makeXmlExpr(IS_XMLELEMENT, $4, $6, $8, 

Re: [HACKERS] warning message in standby

2010-06-29 Thread Robert Haas
On Tue, Jun 29, 2010 at 6:59 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 29, 2010 at 3:55 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jun 15, 2010 at 11:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On the other hand, I like immediate-panicking. And I don't want the standby
 to retry reconnecting the master infinitely.

 On second thought, the peremptory PANIC is not good for HA system. If the
 master unfortunately has written an invalid record because of its crash,
 the standby would exit with PANIC before performing a failover.

 I don't think that should ever happen.  The master only streams WAL
 that it has fsync'd.  Presumably there's no reason for the master to
 ever fsync a partial WAL record (which is usually how a corrupt record
 gets into the stream).

 So when an invalid record is found in streamed WAL file, we should keep
 the standby running and leave the decision whether the standby retries to
 connect to the master forever or shuts down right now, up to the user
 (actually, it may be a clusterware)?

 Well, if we want to leave it up to the user/clusterware, the current
 code is possibly adequate, although there are many different log
 messages that could signal this situation, so coding it up might not
 be too trivial.

So here's a patch that seems to implement the behavior I'm thinking of
- if we repeatedly retrieve the same WAL record from the master, and
we never succeed in replaying it, then give up.

It seems we don't have 100% consensus on this, but I thought posting
the patch might inspire some further thoughts.  I'm really
uncomfortable with the idea that if the slave gets out of sync with
the master we'll just do this forever:

FATAL:  terminating walreceiver process due to administrator command
LOG:  streaming replication successfully connected to primary
LOG:  invalid record length at 0/313FB638
FATAL:  terminating walreceiver process due to administrator command
LOG:  streaming replication successfully connected to primary
LOG:  invalid record length at 0/313FB638
FATAL:  terminating walreceiver process due to administrator command
LOG:  streaming replication successfully connected to primary
LOG:  invalid record length at 0/313FB638
FATAL:  terminating walreceiver process due to administrator command
LOG:  streaming replication successfully connected to primary
LOG:  invalid record length at 0/313FB638

...with this patch, following the above, you get:

FATAL:  invalid record in WAL stream
HINT:  Take a new base backup, or remove recovery.conf and restart in
read-write mode.
LOG:  startup process (PID 6126) exited with exit code 1
LOG:  terminating any other active server processes

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


bound-corrupt-record-retries.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalives win32

2010-06-29 Thread Robert Haas
On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Mon, Jun 28, 2010 at 22:39, Tom Lane t...@sss.pgh.pa.us wrote:
 I had in mind just legislating that the defaults are the RFC values,
 none of this try to use the registry values in one case business.

 Um, if you look at that patch, it doesn't try to use the registry. It
 falls back directly to the system default, ignoring the registry. The
 only special case is where the user doesn't specify any of the
 parameters.

 What I was trying to say is I think we could dispense with the
 setsockopt() code path, and just always use the WSAIoctl() path anytime
 keepalives are turned on.  I don't know what system default values
 you're speaking of, if they're not the registry entries; and I
 definitely don't see the point of consulting such values if they aren't
 user-settable.  We might as well just consult the RFCs and be done.

FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
defend that preference...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] warning message in standby

2010-06-29 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 ...with this patch, following the above, you get:
 
 FATAL:  invalid record in WAL stream
 HINT:  Take a new base backup, or remove recovery.conf and restart
 in read-write mode.
 LOG:  startup process (PID 6126) exited with exit code 1
 LOG:  terminating any other active server processes
 
If someone is sloppy about how they copy the WAL files around, they
could temporarily have a truncated file.  If we want to be tolerant
of straight file copies, without a temporary name or location with a
move on completion, we would need some kind of retry or timeout.  It
appears that you have this hard-coded to five retries.  I'm not
saying this is a bad setting, but I always wonder about hard-coded
magic numbers like this.  What's the delay between retries?  How did
you arrive at five as the magic number?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] keepalives on MacOS X

2010-06-29 Thread Robert Haas
On Tue, Jun 29, 2010 at 12:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, a bit of rooting around in the Darwin sources shows that this
 value is used as a per-connection override for tcp_keepidle.  So it
 is a synonym.  Not sure if it's worth supporting when the other value
 can't be set too.  Maybe it'd be more useful to document that people can
 set the system-wide values if they're desperate.

Well, the default value for tcp_keepidle is 2 hours, and the default
value for tcp_keepintvl is 75 seconds.  Assuming that tcp_keepcount
defaults to something reasonable (I think the default on Linux is 9),
the lion's share of the time will be waiting for tcp_keepidle - so
just the ability to reduce that value to something reasonable should
help a lot.  It's also not much code - proposed patch attached.

Some documentation of how to change this stuff via sysctl on various
OSes wouldn't be a bad thing either - anyone feel like writing
something up?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


tcp_keepalive_for_keepidle.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Look-behind regular expressions

2010-06-29 Thread Tom Lane
Thom Brown thombr...@gmail.com writes:
 Are there any plans to support look-behind and negative look-behind
 for regular expressions in future?

No.  Or are you volunteering?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] warning message in standby

2010-06-29 Thread Robert Haas
On Tue, Jun 29, 2010 at 10:21 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:

 ...with this patch, following the above, you get:

 FATAL:  invalid record in WAL stream
 HINT:  Take a new base backup, or remove recovery.conf and restart
 in read-write mode.
 LOG:  startup process (PID 6126) exited with exit code 1
 LOG:  terminating any other active server processes

 If someone is sloppy about how they copy the WAL files around, they
 could temporarily have a truncated file.

Can you explain the scenario you're concerned about in more detail?

 If we want to be tolerant
 of straight file copies, without a temporary name or location with a
 move on completion, we would need some kind of retry or timeout.  It
 appears that you have this hard-coded to five retries.  I'm not
 saying this is a bad setting, but I always wonder about hard-coded
 magic numbers like this.  What's the delay between retries?  How did
 you arrive at five as the magic number?

It's approximately the number Josh Berkus suggested in an email
upthread.  In other words, SWAG.

There's not a fixed delay between retries - it represents the number
of times that we have either (a) streamed the relevant chunk from the
master via WALSender, or (b) retrieved the segment from the archive
with restore_command.  The first retry CAN help, if WAL streaming from
master to standby was interrupted unexpectedly.  AFAIK, the additional
retries after that are just paranoia, but I can't rule out the
possibility that I'm missing something, in which case we might have to
rethink the whole approach.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Look-behind regular expressions

2010-06-29 Thread Thom Brown
On 29 June 2010 15:40, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown thombr...@gmail.com writes:
 Are there any plans to support look-behind and negative look-behind
 for regular expressions in future?

 No.  Or are you volunteering?


A n00b like me volunteer for that?  It's more of a suggestion.

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] warning message in standby

2010-06-29 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 If someone is sloppy about how they copy the WAL files around,
 they could temporarily have a truncated file.
 
 Can you explain the scenario you're concerned about in more
 detail?
 
If someone uses cp or scp to copy a WAL file from the pg_xlog
directory to an archive directory, there will be a window of time
where the file exists and is not complete.  If you wait a while
(that being a duration which could be highly variable, depending on
the specifics of the environment and copy techniques used), the
missing part of the file will materialize.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalives win32

2010-06-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 What I was trying to say is I think we could dispense with the
 setsockopt() code path, and just always use the WSAIoctl() path anytime
 keepalives are turned on.  I don't know what system default values
 you're speaking of, if they're not the registry entries; and I
 definitely don't see the point of consulting such values if they aren't
 user-settable.  We might as well just consult the RFCs and be done.

 FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
 defend that preference...

Well, basically what I don't like about Magnus' proposal is that setting
one of the two values changes the default that will be used for the
other one.  (Or, if it does not change the default, the extra code is
useless anyway.)  If we just always go through the WSAIoctl() path then
we can clearly document the default for this on Windows is so-and-so.
How is the documentation going to explain the behavior of the proposed
code?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - code of implementation of materialized views

2010-06-29 Thread Robert Haas
2010/6/25 Pavel Baros baro...@seznam.cz:
 On http://github.com/pbaros/postgres can be seen changes and my attempt to
 implement materialized views. The first commit to the repository implements
 following:

 Materialized view can be created, dropped and used in SELECT statement.

 CREATE MATERIALIZED VIEW mvname AS SELECT ...;
 DROP MATERIALIZED VIEW mvname [CASCADE];
 SELECT * FROM mvname;

 also works:
 COMMENT ON MATERIALIZED VIEW mvname IS 'etc.';
 SELECT pg_get_viewdef(mvname);


 ... also you can look at enclosed patch.

So, this patch doesn't actually seem to do very much.  It doesn't
appear that creating the materialized view actually populates it with
any data; and the refresh command doesn't work either.  So it appears
that you can create a materialized view, but it won't actually
contain any data - which doesn't seem at all useful.

Some other problems:

- The command tag for CREATE MATERIALIZED VIEW should return CREATE
MATERIALIZED VIEW rather than CREATE VIEW, since we're treating it as
a separate object type.  I note that dropping a materialized view
already uses DROP MATERIALIZED VIEW, so right now it isn't
symmetrical.
- Using \d with no argument doesn't list materialized views.
- Using \d with a materialized view as an argument doesn't work
properly - the first line says something like ?m? public.m instead
of materialized view public.m.
- Using \d+ with a materialized view as an argument should probably
should the view definition.
- Using \dd doesn't list comments on materialized views.
- Commenting on a column of a materialized view should probably be allowed.
- pg_dump fails with a message like this: failed sanity check, parent
table OID 24604 of pg_rewrite entry OID 24607 not found
- ALTER MATERIALIZED VIEW name OWNER TO role, RENAME TO role, and SET
SCHEMA schema either fall to work or fail to parse (plan ALTER VIEW
also doesn't work on a materialized view)
- ALTER MATERIALIZED VIEW name SET/DROP DEFAULT also doesn't work,
which is OK: it shouldn't work.  But the error message needs work.
- The error message CREATE OR REPLACE on materialized view is not
support! shouldn't end with an exclamation point.
- The parser token OptMater should probably be called OptMaterialized
or opt_materialized, rather than abbreviating.
- There are no docs.
- There are no tests.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Look-behind regular expressions

2010-06-29 Thread David E. Wheeler
On Jun 29, 2010, at 7:44 AM, Thom Brown wrote:

 No.  Or are you volunteering?
 
 A n00b like me volunteer for that?  It's more of a suggestion.

N00bs gotta start somewhere…

Best,

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - code of implementation of materialized views

2010-06-29 Thread Pavel Baroš

Robert Haas napsal(a):

2010/6/25 Pavel Baros baro...@seznam.cz:
  

On http://github.com/pbaros/postgres can be seen changes and my attempt to
implement materialized views. The first commit to the repository implements
following:

Materialized view can be created, dropped and used in SELECT statement.

CREATE MATERIALIZED VIEW mvname AS SELECT ...;
DROP MATERIALIZED VIEW mvname [CASCADE];
SELECT * FROM mvname;

also works:
COMMENT ON MATERIALIZED VIEW mvname IS 'etc.';
SELECT pg_get_viewdef(mvname);
  

... also you can look at enclosed patch.



So, this patch doesn't actually seem to do very much.  It doesn't
appear that creating the materialized view actually populates it with
any data; and the refresh command doesn't work either.  So it appears
that you can create a materialized view, but it won't actually
contain any data - which doesn't seem at all useful.

  


Yeah, it is my fault, I did not mentioned that this patch is not final. 
It is only small part of whole implementation. I wanted to show just 
this, because I think that is the part that should not change much. And 
to show I did something, I am not ignoring GSoC. Now I can fully focus 
on the program.


Most of the problems you mentioned (except pg_dump) I have implemented 
and I will post it to HACKERS soon. Until now I've not had much time, 
because I just finished my BSc. studies yesterday.


And again, sorry for misunderstanding.

Pavel Baros


Some other problems:

- The command tag for CREATE MATERIALIZED VIEW should return CREATE
MATERIALIZED VIEW rather than CREATE VIEW, since we're treating it as
a separate object type.  I note that dropping a materialized view
already uses DROP MATERIALIZED VIEW, so right now it isn't
symmetrical.
- Using \d with no argument doesn't list materialized views.
- Using \d with a materialized view as an argument doesn't work
properly - the first line says something like ?m? public.m instead
of materialized view public.m.
- Using \d+ with a materialized view as an argument should probably
should the view definition.
- Using \dd doesn't list comments on materialized views.
- Commenting on a column of a materialized view should probably be allowed.
- pg_dump fails with a message like this: failed sanity check, parent
table OID 24604 of pg_rewrite entry OID 24607 not found
- ALTER MATERIALIZED VIEW name OWNER TO role, RENAME TO role, and SET
SCHEMA schema either fall to work or fail to parse (plan ALTER VIEW
also doesn't work on a materialized view)
- ALTER MATERIALIZED VIEW name SET/DROP DEFAULT also doesn't work,
which is OK: it shouldn't work.  But the error message needs work.
- The error message CREATE OR REPLACE on materialized view is not
support! shouldn't end with an exclamation point.
- The parser token OptMater should probably be called OptMaterialized
or opt_materialized, rather than abbreviating.
- There are no docs.
- There are no tests.

  



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - code of implementation of materialized views

2010-06-29 Thread David Christensen

On Jun 29, 2010, at 3:31 PM, Pavel Baroš wrote:

 Robert Haas napsal(a):
 2010/6/25 Pavel Baros baro...@seznam.cz:
  
 On http://github.com/pbaros/postgres can be seen changes and my attempt to
 implement materialized views. The first commit to the repository implements
 following:
 
 Materialized view can be created, dropped and used in SELECT statement.
 
 CREATE MATERIALIZED VIEW mvname AS SELECT ...;
 DROP MATERIALIZED VIEW mvname [CASCADE];
 SELECT * FROM mvname;
 
 also works:
 COMMENT ON MATERIALIZED VIEW mvname IS 'etc.';
 SELECT pg_get_viewdef(mvname);
  
 ... also you can look at enclosed patch.

 
 So, this patch doesn't actually seem to do very much.  It doesn't
 appear that creating the materialized view actually populates it with
 any data; and the refresh command doesn't work either.  So it appears
 that you can create a materialized view, but it won't actually
 contain any data - which doesn't seem at all useful.
 
  
 
 Yeah, it is my fault, I did not mentioned that this patch is not final. It is 
 only small part of whole implementation. I wanted to show just this, because 
 I think that is the part that should not change much. And to show I did 
 something, I am not ignoring GSoC. Now I can fully focus on the program.
 
 Most of the problems you mentioned (except pg_dump) I have implemented and I 
 will post it to HACKERS soon. Until now I've not had much time, because I 
 just finished my BSc. studies yesterday.
 
 And again, sorry for misunderstanding.
 
 Pavel Baros
 
 Some other problems:
 
 - The command tag for CREATE MATERIALIZED VIEW should return CREATE
 MATERIALIZED VIEW rather than CREATE VIEW, since we're treating it as
 a separate object type.  I note that dropping a materialized view
 already uses DROP MATERIALIZED VIEW, so right now it isn't
 symmetrical.
 - Using \d with no argument doesn't list materialized views.
 - Using \d with a materialized view as an argument doesn't work
 properly - the first line says something like ?m? public.m instead
 of materialized view public.m.
 - Using \d+ with a materialized view as an argument should probably
 should the view definition.
 - Using \dd doesn't list comments on materialized views.
 - Commenting on a column of a materialized view should probably be allowed.
 - pg_dump fails with a message like this: failed sanity check, parent
 table OID 24604 of pg_rewrite entry OID 24607 not found
 - ALTER MATERIALIZED VIEW name OWNER TO role, RENAME TO role, and SET
 SCHEMA schema either fall to work or fail to parse (plan ALTER VIEW
 also doesn't work on a materialized view)
 - ALTER MATERIALIZED VIEW name SET/DROP DEFAULT also doesn't work,
 which is OK: it shouldn't work.  But the error message needs work.
 - The error message CREATE OR REPLACE on materialized view is not
 support! shouldn't end with an exclamation point.

Do we see supporting the creation of a materialized view from a regular view, 
as in ALTER VIEW regular_view SET MATERIALIZED or some such?

Since we're treating this as a distinct object type, instead of repeatedly 
typing MATERIALIZED VIEW, is there a possibility of  introducing a keyword 
alias MATVIEW without complicating the grammar/code all that much, or is that 
frowned upon?  Paintbrushes, anyone?

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cannot cancel the change of a tablespace

2010-06-29 Thread Guillaume Lelarge
Le 23/06/2010 23:29, Guillaume Lelarge a écrit :
 Le 23/06/2010 22:54, Tom Lane a écrit :
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 21, 2010 at 12:46 PM, Guillaume Lelarge
 guilla...@lelarge.info wrote:
 I added a CHECK_FOR_INTERRUPTS call in the copy_relation_data(),
 copy_dir(), and copy_file() functions. Works for me on ALTER TABLE ...
 SET TABLESPACE and ALTER DATABASE ... SET TABLESPACE, in 9.0 and 8.4.

 Adding a CHECK_FOR_INTERRUPTS() to copy_relation_data seems like it
 ought to be OK (though I haven't tested), but copydir() is in
 src/port, and I fear that putting CHECK_FOR_INTERRUPTS() in there
 might cause problems.

 copydir.c is already backend-specific thanks to all the ereport calls.
 If we ever tried to make it usable in frontend code, we could easily
 deal with CHECK_FOR_INTERRUPTS() via #ifndef FRONTEND --- changing the
 error management would be far more painful.

 
 I'm not sure I get it right. Do I need to do something on the patch so
 that it can get commited?
 

Still not sure what to do right now for this patch :)

Could it be applied? or should I work on it? (and if yes on the latter,
to do what?)

Thanks.



-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cannot cancel the change of a tablespace

2010-06-29 Thread Tom Lane
Guillaume Lelarge guilla...@lelarge.info writes:
 Still not sure what to do right now for this patch :)

Put it on the commitfest list, if you didn't already.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Planner deficiencies with views that use windowing functions

2010-06-29 Thread Daniel Grace
I found some interesting discrepancies with the query planner when
working with views (and presumably table subqueries) that contain
windowing functions.  This is all tested on 9.0beta2, though I suspect
8.4 has the same limitations.

The short version is:
- If querying a windowing view and asking only for columns that are
not derived from window functions, the window functions are still
applied even though they would have no impact in the output.
- If querying a windowing view using a WHERE clause, the entire view
must still be scanned once for each window function involved.  In some
cases this is to be expected (otherwise row values could vary based on
whether the where clause was specified or not) -- however:
  * If the WHERE clause includes a column that is named by PARTITION
BY for every involved windowing function, it is guarunteed (I think?)
to not have any impact on the actual results and thus that portion of
it ought to be folded in to the original query.
  * If the WHERE clause includes a column named in some, but not all
PARTITION BYs, the planner might be able to limit what rows are
scanned for those WindowAggs
  * If my logic is correct (it might not be), and the WHERE clause
looks like WHERE a=1 AND b=2, and each partition names at least one
of those columns, the inner query for the view itself could be
rewritten to WHERE a=1 OR b=2 (note change of AND to OR).  The
windowing functions would produce erroneous results for rows that
match only one condition, but it should be correct for rows that match
both -- and the rows that don't will be filtered out anyways.
- Note that by 'all windowing functions', I mean only those that are
named as columns in the outer select.

The long version, with test cases and EXPLAIN output, is the rest of
this message:

DROP SCHEMA IF EXISTS windowtest CASCADE;
CREATE SCHEMA windowtest;
SET search_path=windowtest;

CREATE TABLE numbers AS
SELECT
f.v::INTEGER AS v,
(f.v % 321)::INTEGER AS c -- Give us something to partition by.
FROM GENERATE_SERIES(1, 1) AS f(v);

ALTER TABLE numbers ADD PRIMARY KEY(c,v);

-- This is the basic query we'll use.
EXPLAIN ANALYZE SELECT
c, v,
FIRST_VALUE(v) OVER next_v_in_c AS next_v,
LAST_VALUE(v) OVER same_thousands AS last_in_thousands
FROM
numbers
WINDOW
next_v_in_c AS ( PARTITION BY c ORDER BY v ASC ROWS BETWEEN 1
FOLLOWING AND 1 FOLLOWING ),
same_thousands AS ( PARTITION BY c, FLOOR(v/1000) RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
;

-- We'll also use a view based on the same query.
CREATE VIEW numbers_view AS SELECT
c, v,
FIRST_VALUE(v) OVER next_v_in_c AS next_v,
LAST_VALUE(v) OVER same_thousands AS last_in_thousands
FROM
numbers
WINDOW
next_v_in_c AS ( PARTITION BY c ORDER BY v ASC ROWS BETWEEN 1
FOLLOWING AND 1 FOLLOWING ),
same_thousands AS ( PARTITION BY c, FLOOR(v/1000) RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
;

-- Case 1: Against the original query, let's ask for some data that doesn't
-- use any of the window functions but still defines the windows
EXPLAIN ANALYZE SELECT
c, v
FROM
numbers
WINDOW
next_v_in_c AS ( PARTITION BY c ORDER BY v ASC ROWS BETWEEN 1
FOLLOWING AND 1 FOLLOWING ),
same_thousands AS ( PARTITION BY c, FLOOR(v/1000) RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
;
-- Seq Scan on numbers  (cost=0.00..220.00 rows=1 width=8) (actual
time=0.011..4.572 rows=1 loops=1)
-- Total runtime: 6.657 ms
-- Summary: The planner notices the windows aren't actually being used, so
-- it eliminates them.

-- Case 2: What if we still use the windowing functions, but only want a
-- couple categories?
EXPLAIN ANALYZE SELECT
c, v,
FIRST_VALUE(v) OVER next_v_in_c AS next_v,
LAST_VALUE(v) OVER same_thousands AS last_in_thousands
FROM
numbers
WHERE
c  5
WINDOW
next_v_in_c AS ( PARTITION BY c ORDER BY v ASC ROWS BETWEEN 1
FOLLOWING AND 1 FOLLOWING ),
same_thousands AS ( PARTITION BY c, FLOOR(v/1000) RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
;
-- WindowAgg  (cost=531.93..623.59 rows= width=8) (actual
time=0.496..0.719 rows=159 loops=1)
--   -  Sort  (cost=531.93..540.26 rows= width=8) (actual
time=0.492..0.524 rows=159 loops=1)
-- Sort Key: c, (floor(((v / 1000))::double precision))
-- Sort Method:  quicksort  Memory: 28kB
-- -  WindowAgg  (cost=0.00..336.91 rows= width=8)
(actual time=0.026..0.399 rows=159 loops=1)
--   -  Index Scan using numbers_pkey on numbers
(cost=0.00..278.58 rows= width=8) (actual time=0.010..0.153
rows=159 loops=1)
-- Index Cond: (c  5)
-- Total runtime: 0.785 ms
-- Summary: Works about as well as we can hope for.

-- Case 3: What if we ask for rows from the view, but we don't ask for any
-- window functions?
EXPLAIN ANALYZE SELECT c, v FROM numbers_view;

Re: [HACKERS] Planner deficiencies with views that use windowing functions

2010-06-29 Thread Tom Lane
Daniel Grace dgr...@wingsnw.com writes:
 I found some interesting discrepancies with the query planner when
 working with views (and presumably table subqueries) that contain
 windowing functions.

Yeah, there's next to no optimization for window functions ATM.
This will probably get better over time, but don't hold your breath ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalives win32

2010-06-29 Thread Bruce Momjian
Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  What I was trying to say is I think we could dispense with the
  setsockopt() code path, and just always use the WSAIoctl() path anytime
  keepalives are turned on. ?I don't know what system default values
  you're speaking of, if they're not the registry entries; and I
  definitely don't see the point of consulting such values if they aren't
  user-settable. ?We might as well just consult the RFCs and be done.
 
  FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
  defend that preference...
 
 Well, basically what I don't like about Magnus' proposal is that setting
 one of the two values changes the default that will be used for the
 other one.  (Or, if it does not change the default, the extra code is
 useless anyway.)  If we just always go through the WSAIoctl() path then
 we can clearly document the default for this on Windows is so-and-so.
 How is the documentation going to explain the behavior of the proposed
 code?

Let's look at the usage probabilities.  99% of Win32 users will not use
any of these settings.  I would hate to come up with a solution that
changes the default behavior for that 99%.

Therefore, I think using hard-coded defaults only for cases where
someone sets one but not all settings is appropriate.  The documentation
text would be:

On Windows, if a keepalive settings is set, then defaults will be
used for any unset values, specifically, keepalives_idle (200) and
keepalives_interval(4).  Windows does not allow control of
keepalives_count.

Seems simple enough.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Propose Beta3 for July

2010-06-29 Thread Bruce Momjian
Josh Berkus wrote:
 Folks,
 
 There have been quite a number of fixes since Beta2.  If we're going to
 make a summer release date at all, we need to get moving.
 
 Therefore, I propose that we set a beta3 release date for July 8th.
 That should give it enough space from the American Holiday.
 
 This would imply, of course, that we're going to go ahead an accept
 Simon's patches for Keepalive, since we don't see to have any others.

FYI, if we are lucky, we have time for a beta3 and beta4 before we have
to start planning for RC1.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - code of implementation of materialized views

2010-06-29 Thread Robert Haas
2010/6/29 David Christensen da...@endpoint.com:
 Do we see supporting the creation of a materialized view from a regular view, 
 as in ALTER VIEW regular_view SET MATERIALIZED or some such?

I'm not sure.  I think we should focus our efforts on (1) getting it
to work at all and then (2) improving the performance of the refresh
operation, which will doubtless be pessimal in the initial
implementation.  Those are big enough problems that I'm not inclined
to spend much thought on bells and whistles at this point.

 Since we're treating this as a distinct object type, instead of repeatedly 
 typing MATERIALIZED VIEW, is there a possibility of  introducing a keyword 
 alias MATVIEW without complicating the grammar/code all that much, or is 
 that frowned upon?  Paintbrushes, anyone?

-1 from me, but IJWH.

By the way, does the SQL standard say anything about materialized views?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cannot cancel the change of a tablespace

2010-06-29 Thread Bruce Momjian
Tom Lane wrote:
 Guillaume Lelarge guilla...@lelarge.info writes:
  Still not sure what to do right now for this patch :)
 
 Put it on the commitfest list, if you didn't already.

So this is not something we want fixed for 9.0, as indicated by Simon?
I don't see the patch on the commit-fest page yet.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] warning message in standby

2010-06-29 Thread Fujii Masao
On Tue, Jun 29, 2010 at 7:59 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 29, 2010 at 3:55 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jun 15, 2010 at 11:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On the other hand, I like immediate-panicking. And I don't want the standby
 to retry reconnecting the master infinitely.

 On second thought, the peremptory PANIC is not good for HA system. If the
 master unfortunately has written an invalid record because of its crash,
 the standby would exit with PANIC before performing a failover.

 I don't think that should ever happen.  The master only streams WAL
 that it has fsync'd.  Presumably there's no reason for the master to
 ever fsync a partial WAL record (which is usually how a corrupt record
 gets into the stream).

This is true. But what I'm concerned about is:

1. Backend writes and fsyncs the WAL to the disk
2. The WAL on the disk gets corrupted
3. Walsender reads and sends that corrupted WAL image
4. The master crashes because of the corruption of the disk
5. The standby attempts to replay the corrupted WAL... PANIC

 So when an invalid record is found in streamed WAL file, we should keep
 the standby running and leave the decision whether the standby retries to
 connect to the master forever or shuts down right now, up to the user
 (actually, it may be a clusterware)?

 Well, if we want to leave it up to the user/clusterware, the current
 code is possibly adequate, although there are many different log
 messages that could signal this situation, so coding it up might not
 be too trivial.

So the current code + user-settable-retry-count seems good to me.
If the retry-count is set to 0, we will not see the repeated log
messages. And we might need to provide the parameter specifying
how the standby should behave after exceeding the retry-count:
PANIC or stay-alive-without-retries.

Choosing PANIC and using the retry-count = 5 would cover your proposed
patch.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for 9.1: WAL streaming from WAL buffers

2010-06-29 Thread Bruce Momjian
Simon Riggs wrote:
 On Mon, 2010-06-21 at 18:08 +0900, Fujii Masao wrote:
 
  The problem is not that the master streams non-fsync'd WAL, but that the
  standby can replay that. So I'm thinking that we can send non-fsync'd WAL
  safely if the standby makes the recovery wait until the master has fsync'd
  WAL. That is, walsender sends not only non-fsync'd WAL but also WAL flush
  location to walreceiver, and the standby applies only the WAL which the
  master has already fsync'd. Thought?
 
 Yes, good thought. The patch just applied seems too much.
 
 I had the same thought, though it would mean you'd need to send two xlog
 end locations, one for write, one for fsync. Though not really clear why
 we send the current end of WAL on the server anyway, so maybe we can
 just alter that.

Is this a TODO?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cannot cancel the change of a tablespace

2010-06-29 Thread Robert Haas
On Tue, Jun 29, 2010 at 9:42 PM, Bruce Momjian br...@momjian.us wrote:
 Tom Lane wrote:
 Guillaume Lelarge guilla...@lelarge.info writes:
  Still not sure what to do right now for this patch :)

 Put it on the commitfest list, if you didn't already.

 So this is not something we want fixed for 9.0, as indicated by Simon?
 I don't see the patch on the commit-fest page yet.

I tend to think we should fix it for 9.0, but could be talked out of
it if someone has a compelling argument to make.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] warning message in standby

2010-06-29 Thread Robert Haas
On Tue, Jun 29, 2010 at 10:03 PM, Fujii Masao masao.fu...@gmail.com wrote:
 This is true. But what I'm concerned about is:

 1. Backend writes and fsyncs the WAL to the disk
 2. The WAL on the disk gets corrupted
 3. Walsender reads and sends that corrupted WAL image
 4. The master crashes because of the corruption of the disk
 5. The standby attempts to replay the corrupted WAL... PANIC

That sounds like design behavior to me.

 Well, if we want to leave it up to the user/clusterware, the current
 code is possibly adequate, although there are many different log
 messages that could signal this situation, so coding it up might not
 be too trivial.

 So the current code + user-settable-retry-count seems good to me.
 If the retry-count is set to 0, we will not see the repeated log
 messages. And we might need to provide the parameter specifying
 how the standby should behave after exceeding the retry-count:
 PANIC or stay-alive-without-retries.

 Choosing PANIC and using the retry-count = 5 would cover your proposed
 patch.

I'm still having a hard time understanding why anyone would want to
configure this value as infinity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for 9.1: WAL streaming from WAL buffers

2010-06-29 Thread Robert Haas
On Tue, Jun 29, 2010 at 10:06 PM, Bruce Momjian br...@momjian.us wrote:
 Simon Riggs wrote:
 On Mon, 2010-06-21 at 18:08 +0900, Fujii Masao wrote:

  The problem is not that the master streams non-fsync'd WAL, but that the
  standby can replay that. So I'm thinking that we can send non-fsync'd WAL
  safely if the standby makes the recovery wait until the master has fsync'd
  WAL. That is, walsender sends not only non-fsync'd WAL but also WAL flush
  location to walreceiver, and the standby applies only the WAL which the
  master has already fsync'd. Thought?

 Yes, good thought. The patch just applied seems too much.

 I had the same thought, though it would mean you'd need to send two xlog
 end locations, one for write, one for fsync. Though not really clear why
 we send the current end of WAL on the server anyway, so maybe we can
 just alter that.

 Is this a TODO?

Maybe.  As Heikki pointed out upthread, the standby can't even write
the WAL to back to the OS until it's been fsync'd on the master
without risking the problem under discussion.  So we can stream the
WAL from master to standby as long as the standby just buffers it in
memory (or somewhere other than the usual location in pg_xlog).

Before we get too busy frobnicating this gonkulator, I'd like to see a
little more discussion of what kind of performance people are
expecting from sync rep.  Sounds to me like the best we can expect
here is, on every commit: (a) wait for master fsync to complete, (b)
send message to standby, (c) wait for reply for reply from standby
indicating that fsync is complete on standby.  Even assuming that the
network overhead is minimal, that halves the commit rate.  Are the
people who want sync rep OK with that?  Is there any way to do better?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_archive_bypass

2010-06-29 Thread Bruce Momjian
Dimitri Fontaine wrote:
 Hi,
 
 I tend to consider it a bug that there's no known way under windows to
 use the same trick as under Unix by using '/usr/bin/true' as your
 archive command. And this Unix trick itself does feel like a hack.
 
 Also I'd very much like to be able to recommend (even if not change the
 official defaults) to setup wal_level to archive, archive_mode=on and
 archive_command=pg_archive_bypass, so that the day you have a HA budget
 ain't the day you're going to restart the server to enable the fault
 tolerance settings?
 
 So please find attached a very simple let's see about it patch to
 implement an internal archive_command that just returns true and is
 called pg_archive_bypass. It's missing documentation, which I'll provide
 if needed (meaning there's some will to consider applying such a patch).

Turn out 'REM' acts like /bin/true on Windows.  I have documented that
fact in the attached, applied patch.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: doc/src/sgml/config.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.287
diff -c -c -r1.287 config.sgml
*** doc/src/sgml/config.sgml	29 Jun 2010 22:29:13 -	1.287
--- doc/src/sgml/config.sgml	30 Jun 2010 02:40:40 -
***
*** 1793,1799 
  disabled, but the server continues to accumulate WAL segment files in
  the expectation that a command will soon be provided.  Setting
  varnamearchive_command/ to a command that does nothing but
! return true, e.g. literal/bin/true/, effectively disables
  archiving, but also breaks the chain of WAL files needed for
  archive recovery, so it should only be used in unusual circumstances.
 /para
--- 1793,1800 
  disabled, but the server continues to accumulate WAL segment files in
  the expectation that a command will soon be provided.  Setting
  varnamearchive_command/ to a command that does nothing but
! return true, e.g. literal/bin/true/ (literalREM/ on
! Windows), effectively disables
  archiving, but also breaks the chain of WAL files needed for
  archive recovery, so it should only be used in unusual circumstances.
 /para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] keepalives on MacOS X

2010-06-29 Thread Fujii Masao
On Tue, Jun 29, 2010 at 11:28 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 29, 2010 at 12:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, a bit of rooting around in the Darwin sources shows that this
 value is used as a per-connection override for tcp_keepidle.  So it
 is a synonym.  Not sure if it's worth supporting when the other value
 can't be set too.  Maybe it'd be more useful to document that people can
 set the system-wide values if they're desperate.

 Well, the default value for tcp_keepidle is 2 hours, and the default
 value for tcp_keepintvl is 75 seconds.  Assuming that tcp_keepcount
 defaults to something reasonable (I think the default on Linux is 9),
 the lion's share of the time will be waiting for tcp_keepidle - so
 just the ability to reduce that value to something reasonable should
 help a lot.  It's also not much code - proposed patch attached.

src/interfaces/libpq/fe-connect.c
+   appendPQExpBuffer(conn-errorMessage,
+ 
libpq_gettext(setsockopt(TCP_KEEPIDLE) failed: %s\n),
+ SOCK_STRERROR(SOCK_ERRNO, 
sebuf, sizeof(sebuf)));

s/TCP_KEEPIDLE/TCP_KEEPALIVE

Don't we need to change pq_getkeepalivesidle?

In pq_setkeepalivesidle, if neither TCP_KEEPIDLE nor TCP_KEEPALIVE are
supported, the following message is output.

setsockopt(TCP_KEEPIDLE) not supported

We should change it to something like?

neither setsockopt(TCP_KEEPIDLE) nor setsockopt(TCP_KEEPALIVE) are supported

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cannot cancel the change of a tablespace

2010-06-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 29, 2010 at 9:42 PM, Bruce Momjian br...@momjian.us wrote:
 So this is not something we want fixed for 9.0, as indicated by Simon?
 I don't see the patch on the commit-fest page yet.

 I tend to think we should fix it for 9.0, but could be talked out of
 it if someone has a compelling argument to make.

Er, maybe I lost count, but I thought you were the one objecting to
the patch.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cannot cancel the change of a tablespace

2010-06-29 Thread Guillaume Lelarge
Le 30/06/2010 05:25, Tom Lane a écrit :
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 29, 2010 at 9:42 PM, Bruce Momjian br...@momjian.us wrote:
 So this is not something we want fixed for 9.0, as indicated by Simon?
 I don't see the patch on the commit-fest page yet.
 
 I tend to think we should fix it for 9.0, but could be talked out of
 it if someone has a compelling argument to make.
 
 Er, maybe I lost count, but I thought you were the one objecting to
 the patch.
 

You're right. Robert questioned the use of CHECK_FOR_INTERRUPTS() in
code available in the src/port directory. I don't see what issue could
result with this. He also said that whatever would be commited should be
back-patched.

I can still add it for the next commit fest, I just don't want this
patch to get lost. Though I won't be able to do this before getting back
from work.

Thanks.


-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers