Re: [HACKERS] GROUPING SETS revisited

2010-08-02 Thread Pavel Stehule
Hello

2010/8/3 Joshua Tolley :
> In case anyone's interested, I've taken the CTE-based grouping sets patch from
> [1] and made it apply to 9.1, attached. I haven't yet checked it for
> whitespace consistency, style conformity, or anything else, but (tuits
> permitting) I hope to figure out how it works and get it closer to
> committability in some upcoming commitfest.
>
> I mention it here so that if someone else is working on it, we can avoid
> duplicated effort, and to see if a CTE-based grouping sets implementation is
> really the way we think we want to go.
>

I am playing with it now :). I have a plan to replace the CTE with a similar
but explicit executor node. The main issue with the existing patch is that it
relies on a plain transformation to a CTE (a sketch of that rewrite is below),
and I agree that this isn't very extensible for the future. Right now I am
cleaning up the identifiers a little. Any collaboration is welcome.

My plan:
1) clean up the CTE patch
2) replace the CTE with an explicit executor node, but still based on a tuplestore
3) once parallel processing based on hash aggregation is possible, drop the
tuplestore
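
For illustration, the transformation the current patch performs is conceptually
this (a rough sketch over a hypothetical table; the real rewrite builds parse
nodes and uses an internal CTE name rather than generating SQL text):

-- a grouping-sets query ...
SELECT brand, size, sum(sales)
  FROM items_sold
 GROUP BY GROUPING SETS ((brand), (size));

-- ... becomes a CTE over the base query plus one grouped SELECT
-- per grouping set, combined with UNION ALL:
WITH g AS (SELECT brand, size, sales FROM items_sold)
SELECT brand, NULL AS size, sum(sales) FROM g GROUP BY brand
UNION ALL
SELECT NULL AS brand, size, sum(sales) FROM g GROUP BY size;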

comments??

Regards

Pavel

> [1] http://archives.postgresql.org/pgsql-hackers/2009-05/msg00700.php
>
> --
> Joshua Tolley / eggyknap
> End Point Corporation
> http://www.endpoint.com
>



[HACKERS] GROUPING SETS revisited

2010-08-02 Thread Joshua Tolley
In case anyone's interested, I've taken the CTE-based grouping sets patch from
[1] and made it apply to 9.1, attached. I haven't yet checked it for
whitespace consistency, style conformity, or anything else, but (tuits
permitting) I hope to figure out how it works and get it closer to
committability in some upcoming commitfest.

I mention it here so that if someone else is working on it, we can avoid
duplicated effort, and to see if a CTE-based grouping sets implementation is
really the way we think we want to go.

[1] http://archives.postgresql.org/pgsql-hackers/2009-05/msg00700.php
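
For anyone who has not run into the feature before, the SQL-standard syntax the
patch aims to support looks like this (hypothetical table; the empty set ()
adds a grand-total row):

SELECT brand, size, sum(sales)
  FROM items_sold
 GROUP BY GROUPING SETS ((brand), (size), ());

i.e. several GROUP BYs computed over the same input in a single query.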

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
diff --git a/src/backend/parser/Makefile b/src/backend/parser/Makefile
index a8f4c07..fb248a6 100644
*** a/src/backend/parser/Makefile
--- b/src/backend/parser/Makefile
*** override CPPFLAGS := -I. -I$(srcdir) $(C
*** 15,21 
  OBJS= analyze.o gram.o keywords.o kwlookup.o parser.o \
parse_agg.o parse_clause.o parse_coerce.o parse_cte.o parse_expr.o \
parse_func.o parse_node.o parse_oper.o parse_param.o parse_relation.o \
!   parse_target.o parse_type.o parse_utilcmd.o scansup.o
  
  FLEXFLAGS = -CF
  
--- 15,21 
  OBJS= analyze.o gram.o keywords.o kwlookup.o parser.o \
parse_agg.o parse_clause.o parse_coerce.o parse_cte.o parse_expr.o \
parse_func.o parse_node.o parse_oper.o parse_param.o parse_relation.o \
!   parse_target.o parse_type.o parse_utilcmd.o scansup.o parse_gsets.o
  
  FLEXFLAGS = -CF
  
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 6b99a10..1b579a8 100644
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
***
*** 34,39 
--- 34,40 
  #include "parser/parse_clause.h"
  #include "parser/parse_coerce.h"
  #include "parser/parse_cte.h"
+ #include "parser/parse_gsets.h"
  #include "parser/parse_oper.h"
  #include "parser/parse_param.h"
  #include "parser/parse_relation.h"
*** parse_sub_analyze(Node *parseTree, Parse
*** 150,155 
--- 151,313 
  }
  
  /*
+  * process GROUPING SETS
+  */
+ static SelectStmt *
+ makeSelectStmt(List *targetList, List *fromClause)
+ {
+ 	SelectStmt *n = makeNode(SelectStmt);
+ 	n->distinctClause = NULL;
+ 	n->intoClause = NULL;
+ 	n->targetList = targetList;
+ 	n->fromClause = fromClause;
+ 	n->whereClause = NULL;
+ 	n->groupClause = NULL;
+ 	n->havingClause = NULL;
+ 	n->windowClause = NIL;
+ 	n->withClause = NULL;
+ 	n->valuesLists = NIL;
+ 	n->sortClause = NIL;
+ 	n->limitOffset = NULL;
+ 	n->limitCount = NULL;
+ 	n->lockingClause = NIL;
+ 	n->op = SETOP_NONE;
+ 	n->all = false;
+ 	n->larg = NULL;
+ 	n->rarg = NULL;
+ 	return n;
+ }
+ 
+ static List *
+ makeStarTargetList(void)
+ {
+ 	ResTarget *rt = makeNode(ResTarget);
+ 	
+ 	rt->name = NULL;
+ 	rt->indirection = NIL;
+ 	rt->val = (Node *) makeNode(ColumnRef);
+ 	((ColumnRef *) rt->val)->fields = list_make1(makeNode(A_Star));
+ 	rt->location = -1;
+ 
+ 	return list_make1(rt);
+ }
+  
+ static SelectStmt *
+ transformGroupingSets(ParseState *pstate, SelectStmt *stmt)
+ {
+ 	if (stmt->groupClause && IsA(stmt->groupClause, GroupByClause))
+ 	{
+ 		GroupingSetsSpec *gss = (GroupingSetsSpec *) expandGroupingSets(pstate, 
+ 		(List *)((GroupByClause *)stmt->groupClause)->fields);
+ 	
+ 		if (pstate->p_hasGroupingSets)
+ 		{
+ 			CommonTableExpr *cte = makeNode(CommonTableExpr);
+ 			SelectStmt  *cteedstmt;
+ 			int	ngroupingsets = list_length(gss->set_list) + (gss->has_empty_set ? 1 : 0);
+ 			bool	all = ((GroupByClause *) stmt->groupClause)->all;
+ 		
+ 			cteedstmt = makeSelectStmt(NIL, NIL);
+ 			cteedstmt->intoClause = stmt->intoClause;
+ 			cteedstmt->sortClause = stmt->sortClause;
+ 			cteedstmt->limitOffset = stmt->limitOffset;
+ 			cteedstmt->limitCount = stmt->limitCount;
+ 			cteedstmt->lockingClause = stmt->lockingClause;
+ 		
+ 			cte->ctename = "**g**";
+ 			cte->ctequery = (Node *) stmt;
+ 			cte->location = -1;
+ 		
+ 			cteedstmt->withClause = makeNode(WithClause);
+ 			cteedstmt->withClause->ctes = list_make1(cte);
+ 			cteedstmt->withClause->recursive = false;
+ 			cteedstmt->withClause->location = -1;
+ 		
+ 			/* when is more than one grouping set, then we should generate setop node */
+ 			if (ngroupingsets > 1)
+ 			{
+ /* add quuery under union all for every grouping set */
+ SelectStmt	*larg = NULL;
+ SelectStmt	*rarg;
+ ListCell*lc;
+ 			
+ foreach(lc, gss->set_list)
+ {
+ 	List	*groupClause;
+ 
+ 	Assert(IsA(lfirst(lc), List));
+ 	groupClause = (List *) lfirst(lc);
+ 
+ 	if (larg == NULL)
+ 	{
+ 		larg = makeSelectStmt(copyObject(stmt->targetList),
+ 	list_make1(makeRangeVar(NULL, "**g**", -1)));
+ 		larg->groupClause = (Node *) groupClause;
+ 		larg->havingClause = copyObject(stmt->havingClause);
+ 	}
+ 	else
+ 	{
+ 		SelectStmt	*setop = makeSelectStmt(NIL, NIL);
+ 	
+ 		rarg 

Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Pavel Stehule
2010/8/3 Robert Haas :
> On Mon, Aug 2, 2010 at 11:17 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> On Mon, Aug 2, 2010 at 10:49 PM, Tom Lane  wrote:
 I'm tempted
 to suggest forgetting about any user-configurable parameter and just
 provide code that strcmp's the $EDITOR value to see if it recognizes the
 editor name, otherwise do nothing.
>>
>>> With all due respect, that sounds like an amazingly bad idea.  Surely,
>>> we'll be forever getting patches to add $MYFAVORITEEDITOR to the list,
>>> or complaints that it's not already included.
>>
>> Well, yeah, that's the idea.  I say that beats a constant stream of
>> complaints that $MYFAVORITEEDITOR no longer works at all --- which
>> is what your concern was, no?
>
> Well, it'd still work fine for \e foo.  It'll just blow up for \e foo
> 3.  My concern isn't really that things which work now will break
> so much as that this new feature will fail to work for a large
> percentage of the people who try to use it, including virtually
> everyone who tries to use it on Win32.
>
>>> While this is
>>> superficially a Nice Thing to Have and I would certainly support it if
>>> +linenumber were relatively universal, it's really a pretty minor
>>> convenience when you come right down to it, and I am not at all
>>> convinced it is worth the hassle of trying to divine what piece of
>>> syntax will equip the user's choice of editor with the necessary
>>> amount of clue.
>>
>> The other approach we could take is that this whole thing is disabled by
>> default, and you have to set a psql variable EDITOR_LINENUMBER_SWITCH
>> to turn it on.  If you haven't read the documentation enough to find
>> out that variable exists, well, no harm no foul.
>
> That might be reasonable.  Right now the default behavior is to do
> +line on Linux and /line on Windows.  But maybe a more sensible
> default would be to fail with an error message that says "you have to
> set thus-and-so variable if you want to use this feature".  That seems
> better than sometimes working and sometimes failing depending on what
> editor the user has configured.
>

I like this idea

> A side question is whether this should be an environment variable or a
> psql variable.
>
It can just be a psql variable.

Pavel

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>



Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Pavel Stehule
2010/8/3 Robert Haas :
> On Sun, Aug 1, 2010 at 11:48 PM, Robert Haas  wrote:
>>> b) more robust algorithm for header rows identification
>>
>> Have not gotten to this one yet.
>
> I notice that on WIN32 the default editor is notepad.exe and the
> default editor navigation option is "/".  Does "notepad.exe /lineno
> filename" actually work on Windows?  A quick Google search suggests
> that the answer is "no", which seems somewhat unfortunate: it means
> we'd be shipping "broken on Win32 out of the box".
>
> http://www.robvanderwoude.com/commandlineswitches.php#Notepad

Notepad supports nothing :( Microsoft doesn't care about the command line. I
found this syntax in PSPad. Another editor is Notepad++, which uses a -n
switch. The widely used KDE editors use a --line switch.

>
> This is actually my biggest concern about this patch - that it may be
> just too much of a hassle to actually make it work for people.  I just
> tried setting $EDITOR to MacOS's TextEdit program, and it turns out
> that TextEdit doesn't understand +.  I'm afraid that we're going to
> end up with a situation where it only works for people using emacs or
> vi, and everyone else ends up with a broken install (and, possibly, no
> clear idea how to fix it).  Perhaps it would be better to accept \sf
> and reject \sf+ and \ef  .
>

\sf+ is the base feature - we really need source output with line numbers. \ef
for a function is just a very user-friendly convenience on top of that. I agree
that this topic has to be better documented.

> Assuming we get past that threshold issue, I'm also wondering whether
> the "navigation option" terminology is best; e.g. set
> PSQL_EDITOR_NAVIGATION_OPTION to configure it.  It doesn't seem
> terrible, but it doesn't seem great, either.  Anyone have a better
> idea?


>
> The docs are a little rough; they could benefit from some editing by a
> fluent English speaker.  If anyone has time to work on this, it would
> be much appreciated.
>
> Instead of "line number is unacceptable", I think you should write
> "invalid line number".
>
> "dollar" should not be spelled "dolar".  "function" should not be
> spelled "finction".
>
> This change looks suspiciously like magic.  If it's actually safe, it
> needs a comment explaining why:
>
> -       sys = pg_malloc(strlen(editorName) + strlen(fname) + 10 + 1);
> +       sys = pg_malloc(strlen(editorName) + strlen(fname) + 20 + 1);
>
> Attempting to compile with this patch applied emits a warning on my
> machine; whether or not the warning is valid, you need to make it go
> away:
>
> command.c: In function ‘HandleSlashCmds’:
> command.c:1055: warning: ‘bsep’ may be used uninitialized in this function
> command.c:1055: note: ‘bsep’ was declared here
>
> Why does the \sf output have a trailing blank line?  This seems weird,
> especially because \ef puts no such trailing blank line in the editor.
>
> Instead of:
>
> +       /* skip useles whitespaces */
> +       while (c >= func)
> +               if (isblank(*c))
> +                       c--;
> +               else
> +                       break;
>
> ...wouldn't it be just as good to write:
>
> while (c >= func && isblank(*c))
>    c--;
>
> (and similarly elsewhere)
>
> In extract_separator, if you invert the sense of the first if-test,
> then you can avoid having to indent the entire function contents.
>
> --

I'll recheck these issues.


Pavel

> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>



Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 11:17 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Aug 2, 2010 at 10:49 PM, Tom Lane  wrote:
>>> I'm tempted
>>> to suggest forgetting about any user-configurable parameter and just
>>> provide code that strcmp's the $EDITOR value to see if it recognizes the
>>> editor name, otherwise do nothing.
>
>> With all due respect, that sounds like an amazingly bad idea.  Surely,
>> we'll be forever getting patches to add $MYFAVORITEEDITOR to the list,
>> or complaints that it's not already included.
>
> Well, yeah, that's the idea.  I say that beats a constant stream of
> complaints that $MYFAVORITEEDITOR no longer works at all --- which
> is what your concern was, no?

Well, it'd still work fine for \e foo.  It'll just blow up for \e foo
3.  My concern isn't really that things which work now will break
so much as that this new feature will fail to work for a large
percentage of the people who try to use it, including virtually
everyone who tries to use it on Win32.

>> While this is
>> superficially a Nice Thing to Have and I would certainly support it if
>> +linenumber were relatively universal, it's really a pretty minor
>> convenience when you come right down to it, and I am not at all
>> convinced it is worth the hassle of trying to divine what piece of
>> syntax will equip the user's choice of editor with the necessary
>> amount of clue.
>
> The other approach we could take is that this whole thing is disabled by
> default, and you have to set a psql variable EDITOR_LINENUMBER_SWITCH
> to turn it on.  If you haven't read the documentation enough to find
> out that variable exists, well, no harm no foul.

That might be reasonable.  Right now the default behavior is to do
+line on Linux and /line on Windows.  But maybe a more sensible
default would be to fail with an error message that says "you have to
set thus-and-so variable if you want to use this feature".  That seems
better than sometimes working and sometimes failing depending on what
editor the user has configured.

A side question is whether this should be an environment variable or a
psql variable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] (9.1) btree_gist support for searching on "not equals"

2010-08-02 Thread Jeff Davis
On Mon, 2010-08-02 at 12:27 -0400, Robert Haas wrote:
> I was also wondering if it would be worth adding some additional
> regression testing to contrib/btree_gist exercising this new
> functionality.  Thoughts?

Sure. I attached two tests.

Regards,
Jeff Davis

*** a/contrib/btree_gist/Makefile
--- b/contrib/btree_gist/Makefile
***
*** 11,17  DATA_built  = btree_gist.sql
  DATA= uninstall_btree_gist.sql
  
  REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz time timetz \
!   date interval macaddr inet cidr text varchar char bytea bit varbit numeric
  
  ifdef USE_PGXS
  PG_CONFIG = pg_config
--- 11,17 
  DATA= uninstall_btree_gist.sql
  
  REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz time timetz \
!   date interval macaddr inet cidr text varchar char bytea bit varbit numeric mixed
  
  ifdef USE_PGXS
  PG_CONFIG = pg_config
*** /dev/null
--- b/contrib/btree_gist/expected/mixed.out
***
*** 0 
--- 1,31 
+ SET enable_seqscan = 'false';
+ -- test search for "not equals"
+ CREATE TABLE test_ne (
+a  TIMESTAMP,
+b  NUMERIC
+ );
+ CREATE INDEX test_ne_idx ON test_ne USING gist (a, b);
+ INSERT INTO test_ne SELECT '2009-01-01', 10.7 FROM generate_series(1,1000);
+ INSERT INTO test_ne VALUES('2007-02-03', -91.3);
+ INSERT INTO test_ne VALUES('2011-09-01', 43.7);
+ INSERT INTO test_ne SELECT '2009-01-01', 10.7 FROM generate_series(1,1000);
+ SELECT * FROM test_ne WHERE a <> '2009-01-01' AND b <> 10.7;
+ a |   b   
+ --+---
+  Sat Feb 03 00:00:00 2007 | -91.3
+  Thu Sep 01 00:00:00 2011 |  43.7
+ (2 rows)
+ 
+ -- test search for "not equals" using an exclusion constraint
+ CREATE TABLE zoo (
+cage   INTEGER,
+animal TEXT,
+EXCLUDE USING gist (cage WITH =, animal WITH <>)
+ );
+ NOTICE:  CREATE TABLE / EXCLUDE will create implicit index "zoo_cage_animal_excl" for table "zoo"
+ INSERT INTO zoo VALUES(123, 'zebra');
+ INSERT INTO zoo VALUES(123, 'zebra');
+ INSERT INTO zoo VALUES(123, 'lion');
+ ERROR:  conflicting key value violates exclusion constraint "zoo_cage_animal_excl"
+ DETAIL:  Key (cage, animal)=(123, lion) conflicts with existing key (cage, animal)=(123, zebra).
+ INSERT INTO zoo VALUES(124, 'lion');
*** /dev/null
--- b/contrib/btree_gist/sql/mixed.sql
***
*** 0 
--- 1,30 
+ 
+ SET enable_seqscan = 'false';
+ 
+ -- test search for "not equals"
+ 
+ CREATE TABLE test_ne (
+a  TIMESTAMP,
+b  NUMERIC
+ );
+ CREATE INDEX test_ne_idx ON test_ne USING gist (a, b);
+ 
+ INSERT INTO test_ne SELECT '2009-01-01', 10.7 FROM generate_series(1,1000);
+ INSERT INTO test_ne VALUES('2007-02-03', -91.3);
+ INSERT INTO test_ne VALUES('2011-09-01', 43.7);
+ INSERT INTO test_ne SELECT '2009-01-01', 10.7 FROM generate_series(1,1000);
+ 
+ SELECT * FROM test_ne WHERE a <> '2009-01-01' AND b <> 10.7;
+ 
+ -- test search for "not equals" using an exclusion constraint
+ 
+ CREATE TABLE zoo (
+cage   INTEGER,
+animal TEXT,
+EXCLUDE USING gist (cage WITH =, animal WITH <>)
+ );
+ 
+ INSERT INTO zoo VALUES(123, 'zebra');
+ INSERT INTO zoo VALUES(123, 'zebra');
+ INSERT INTO zoo VALUES(123, 'lion');
+ INSERT INTO zoo VALUES(124, 'lion');



Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Tom Lane
Robert Haas  writes:
> On Mon, Aug 2, 2010 at 10:49 PM, Tom Lane  wrote:
>> I'm tempted
>> to suggest forgetting about any user-configurable parameter and just
>> provide code that strcmp's the $EDITOR value to see if it recognizes the
>> editor name, otherwise do nothing.

> With all due respect, that sounds like an amazingly bad idea.  Surely,
> we'll be forever getting patches to add $MYFAVORITEEDITOR to the list,
> or complaints that it's not already included.

Well, yeah, that's the idea.  I say that beats a constant stream of
complaints that $MYFAVORITEEDITOR no longer works at all --- which
is what your concern was, no?

> While this is
> superficially a Nice Thing to Have and I would certainly support it if
> +linenumber were relatively universal, it's really a pretty minor
> convenience when you come right down to it, and I am not at all
> convinced it is worth the hassle of trying to divine what piece of
> syntax will equip the user's choice of editor with the necessary
> amount of clue.

The other approach we could take is that this whole thing is disabled by
default, and you have to set a psql variable EDITOR_LINENUMBER_SWITCH
to turn it on.  If you haven't read the documentation enough to find
out that variable exists, well, no harm no foul.
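
For illustration, usage could end up looking something like this (the variable
name is the one proposed above; the switch value and the trailing line number
on \ef are assumptions about the patch under review):

\set EDITOR_LINENUMBER_SWITCH '+'
\ef some_function 42

psql would then invoke roughly "$EDITOR +42 <tempfile>", while someone who
never sets the variable simply gets today's behavior, with no cursor
positioning at all.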

regards, tom lane



Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 10:49 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> This is actually my biggest concern about this patch - that it may be
>> just too much of a hassle to actually make it work for people.  I just
>> tried setting $EDITOR to MacOS's TextEdit program, and it turns out
>> that TextEdit doesn't understand +.  I'm afraid that we're going to
>> end up with a situation where it only works for people using emacs or
>> vi, and everyone else ends up with a broken install (and, possibly, no
>> clear idea how to fix it).
>
> [ disclaimer: I've not looked at the proposed patch yet ]
>
> It seems like this ought to be fairly easily surmountable as long as
> the patch is designed for failure.

It isn't.

> The fallback position is just that
> the line number does nothing, ie \ef foo just opens the editor and
> doesn't try to position the cursor anywhere special; nobody can complain
> about that because it's no worse than before.  What we need is to not
> try to force positioning if we don't recognize the editor.

Supposing for the moment that we could make it work that way, that
would be reasonable.

> I'm tempted
> to suggest forgetting about any user-configurable parameter and just
> provide code that strcmp's the $EDITOR value to see if it recognizes the
> editor name, otherwise do nothing.

With all due respect, that sounds like an amazingly bad idea.  Surely,
we'll be forever getting patches to add $MYFAVORITEEDITOR to the list,
or complaints that it's not already included.  While this is
superficially a Nice Thing to Have and I would certainly support it if
+linenumber were relatively universal, it's really a pretty minor
convenience when you come right down to it, and I am not at all
convinced it is worth the hassle of trying to divine what piece of
syntax will equip the user's choice of editor with the necessary
amount of clue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Tom Lane
Robert Haas  writes:
> This is actually my biggest concern about this patch - that it may be
> just too much of a hassle to actually make it work for people.  I just
> tried setting $EDITOR to MacOS's TextEdit program, and it turns out
> that TextEdit doesn't understand +.  I'm afraid that we're going to
> end up with a situation where it only works for people using emacs or
> vi, and everyone else ends up with a broken install (and, possibly, no
> clear idea how to fix it).

[ disclaimer: I've not looked at the proposed patch yet ]

It seems like this ought to be fairly easily surmountable as long as
the patch is designed for failure.  The fallback position is just that
the line number does nothing, ie \ef foo just opens the editor and
doesn't try to position the cursor anywhere special; nobody can complain
about that because it's no worse than before.  What we need is to not
try to force positioning if we don't recognize the editor.  I'm tempted
to suggest forgetting about any user-configurable parameter and just
provide code that strcmp's the $EDITOR value to see if it recognizes the
editor name, otherwise do nothing.

regards, tom lane



Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Robert Haas
On Sun, Aug 1, 2010 at 11:48 PM, Robert Haas  wrote:
>> b) more robust algorithm for header rows identification
>
> Have not gotten to this one yet.

I notice that on WIN32 the default editor is notepad.exe and the
default editor navigation option is "/".  Does "notepad.exe /lineno
filename" actually work on Windows?  A quick Google search suggests
that the answer is "no", which seems somewhat unfortunate: it means
we'd be shipping "broken on Win32 out of the box".

http://www.robvanderwoude.com/commandlineswitches.php#Notepad

This is actually my biggest concern about this patch - that it may be
just too much of a hassle to actually make it work for people.  I just
tried setting $EDITOR to MacOS's TextEdit program, and it turns out
that TextEdit doesn't understand +.  I'm afraid that we're going to
end up with a situation where it only works for people using emacs or
vi, and everyone else ends up with a broken install (and, possibly, no
clear idea how to fix it).  Perhaps it would be better to accept \sf
and reject \sf+ and \ef  .

Assuming we get past that threshold issue, I'm also wondering whether
the "navigation option" terminology is best; e.g. set
PSQL_EDITOR_NAVIGATION_OPTION to configure it.  It doesn't seem
terrible, but it doesn't seem great, either.  Anyone have a better
idea?

The docs are a little rough; they could benefit from some editing by a
fluent English speaker.  If anyone has time to work on this, it would
be much appreciated.

Instead of "line number is unacceptable", I think you should write
"invalid line number".

"dollar" should not be spelled "dolar".  "function" should not be
spelled "finction".

This change looks suspiciously like magic.  If it's actually safe, it
needs a comment explaining why:

-   sys = pg_malloc(strlen(editorName) + strlen(fname) + 10 + 1);
+   sys = pg_malloc(strlen(editorName) + strlen(fname) + 20 + 1);

Attempting to compile with this patch applied emits a warning on my
machine; whether or not the warning is valid, you need to make it go
away:

command.c: In function ‘HandleSlashCmds’:
command.c:1055: warning: ‘bsep’ may be used uninitialized in this function
command.c:1055: note: ‘bsep’ was declared here

Why does the \sf output have a trailing blank line?  This seems weird,
especially because \ef puts no such trailing blank line in the editor.

Instead of:

+   /* skip useles whitespaces */
+   while (c >= func)
+   if (isblank(*c))
+   c--;
+   else
+   break;

...wouldn't it be just as good to write:

while (c >= func && isblank(*c))
c--;

(and similarly elsewhere)

In extract_separator, if you invert the sense of the first if-test,
then you can avoid having to indent the entire function contents.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 5:07 PM, Alexander Korotkov  wrote:
> Now I think the patch is as good as it can be. :)

OK, committed.

> I'm going to prepare a less-or-equal function in the same manner as this patch.

Sounds good.  Since we're now more than half-way through this
CommitFest and this patch has undergone quite a bit of change from
what we started out with, I'd like to ask that you submit the new
patch for the next CommitFest, to begin September 15th.

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Where in the world is Itagaki Takahiro?

2010-08-02 Thread Josh Berkus
On 8/2/10 3:42 PM, Itagaki Takahiro wrote:
> Sorry for the delayed reply. I moved to a new job
> and have been very busy with it.

Congratulations!  Are you still at NTT Open Source?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] Where in the world is Itagaki Takahiro?

2010-08-02 Thread Itagaki Takahiro
2010/8/3 Kevin Grittner :
> I haven't seen any posts or CF activity from Itagaki in over a week,
> so I'm wondering how to handle some patches in the CF.  Does anyone
> know whether he's on vacation?  Expected return?

Sorry for the delayed reply. I moved to a new job
and have been very busy with it.

My proposed patches should be returned or postponed
to the next commitfest. I will restart checking the
patches I reviewed before in one or two weeks.
I apologize to the patch authors for the delay.

-- 
Itagaki Takahiro



Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-08-02 Thread Florian Pflug
Hi

I've updated mvcc.sgml to explain the new serialization conflict rules for 
row-level locks, and added a paragraph to backend/executor/README that explains 
the implementation of those. I've chosen backend/executor/README because it 
already contains a description of UPDATE handling in READ COMMITTED mode, which 
seemed at least somewhat related.

The patch now removes all code dealing with the crosscheck_snapshot, since the 
RI triggers should now be safe without that.

I've also fixed the formatting of multi-line comments to match the coding 
style. I kept the braces around single-line "if" or "else" statements wherever 
both "if" and "else" blocks were present and one of them wasn't a single-line 
block.

I think, in addition to the documentation changes this patch contains, that a 
section on how to write RI triggers in pl/pgsql would be nice to have. It's not 
strictly part of the documentation of this feature though, more a potential 
use-case, so I didn't tackle it for the moment. I do hope to find the time to 
write such a section, though, and if that happens I'll post it as a separate 
documentation-only patch.
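
For what it's worth, the kind of pl/pgsql RI trigger I have in mind is the
classic child-side check, roughly like this (a sketch only, with hypothetical
table and column names, not proposed documentation text):

CREATE FUNCTION check_parent_exists() RETURNS trigger AS $$
BEGIN
    -- lock the referenced row so that a concurrent DELETE conflicts
    PERFORM 1 FROM parent WHERE id = NEW.parent_id FOR SHARE;
    IF NOT FOUND THEN
        RAISE EXCEPTION 'parent % does not exist', NEW.parent_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER child_check_parent BEFORE INSERT OR UPDATE ON child
    FOR EACH ROW EXECUTE PROCEDURE check_parent_exists();

Under the new conflict rules, the FOR SHARE lock taken here should participate
in serialization conflicts, which is what would make a trigger like this safe
in SERIALIZABLE transactions.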

I've pushed the changes to the branch serializable_row_locks on 
git://github.com/fgp/postgres.git and also attached a patch.

best regards,
Florian Pflug


serializable_row_locks.patch
Description: Binary data



[HACKERS] Where in the world is Itagaki Takahiro?

2010-08-02 Thread Kevin Grittner
I haven't seen any posts or CF activity from Itagaki in over a week,
so I'm wondering how to handle some patches in the CF.  Does anyone
know whether he's on vacation?  Expected return?
 
-Kevin



Re: [HACKERS] review: xml_is_well_formed

2010-08-02 Thread Mike Fowler

On 02/08/10 07:46, Pavel Stehule wrote:
> I don't have any suggestions now - so I'll change the flag to "ready to commit"
>
> sorry - the contrib module should be fixed
>
> patch attached

Thanks Pavel, you saved me some time!

Regards,

--
Mike Fowler
Registered Linux user: 379787



Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-02 Thread Robert Haas
2010/8/2 Alexander Korotkov :
> The dump of the table with russian dictionary is in attachment.
>
> I use following tests:
> SELECT SUM(levenshtein(a, 'foo')) from words;
> SELECT SUM(levenshtein(a, 'Urbański')) FROM words;
> SELECT SUM(levenshtein(a, 'ańs')) FROM words;
> SELECT SUM(levenshtein(a, 'foo')) from words2;
> SELECT SUM(levenshtein(a, 'дом')) FROM words2;
> SELECT SUM(levenshtein(a, 'компьютер')) FROM words2;
>
> With your last version of patch results are:
> 63ms 94ms 61ms 100ms 121ms 217ms.
>
> But I found some other optimization. With this condition:
> if (x[x_char_len-1] == y[y_char_len-1] && x_char_len == y_char_len &&
>     (x_char_len == 1 || char_same(x, y, x_char_len - 1)))
> results become:
> 58ms 96ms 63ms 102ms 107ms 171ms
>
> On single-byte characters the results almost didn't change, but they get
> better with multi-byte characters. Generally, this improvement is because the
> first bytes of different characters are frequently the same (for example,
> almost all Cyrillic characters start with the same byte, and I think the
> situation is similar in other alphabets), but a match of the last bytes is
> just a coincidence.

Hey, that's really cool.  It helps a lot here, too:

My previous patch, with your 5 test cases:
Time: 100.556 ms
Time: 205.254 ms
Time: 127.070 ms
Time: 77.734 ms
Time: 90.345 ms
Time: 166.954 ms

Your original patch, same 5 test cases:
Time: 131.489 ms
Time: 215.048 ms
Time: 125.471 ms
Time: 80.068 ms
Time: 87.110 ms
Time: 166.918 ms

My patch, with your proposed change to compare the last character
first, same 5 test cases:
Time: 96.489 ms
Time: 201.497 ms
Time: 121.876 ms
Time: 75.005 ms
Time: 80.219 ms
Time: 142.844 ms

Updated patch attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


mb_levenshtein_rmh-v3.patch
Description: Binary data



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 2:56 PM, Yeb Havinga  wrote:
> I do not completely understand what you mean by the destruction of
> reusability of ATOneLevelRecursion, currently only called by ATPrepAddColumn
> - in the patch it is documented in the definition of relVisited that it is
> visit info for each subcommand. The loop over subcommands is in
> ATController, where the value is properly reset for all current and
> future subcommands. Hence the ATOneLevelRecursion routine is usable in its
> current form for all callers during the prep stage, and not only by
> ATPrepAddColumn.

Well, only if they happen to want the "visit each table only once"
behavior, which might not be true for every command type.

>> I am of the opinion that the chances of a unifying solution popping up
>> are pretty much zero.  :-)
>
> Me too, now I understand the fixes must be in the execution stages.

OK.  I'll go ahead and commit the patch upthread, so that the
constraint bug is fixed, and then we can keep arguing about what do
about the column bug, perhaps on a new thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Kevin Grittner
Robert Haas  wrote:
 
>> I wonder whether this patch shouldn't be rejected with a request
>> that the timeout framework be submitted to the next CF.
 
> I think "Returned with Feedback" would be more appropriate than
> "Rejected", since we're asking for a rework, rather than saying -
> we just don't want this.  But otherwise, +1.
 
Done.
 
-Kevin



Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 3:09 PM, Kevin Grittner
 wrote:
> Marc Cousin  wrote:
>
>> This time, it's this case that doesn't work :
>
>> I really feel that the timeout framework is the way to go here.
>
> Since Zoltán also seems to feel this way:
>
> http://archives.postgresql.org/message-id/4c516c3a.6090...@cybertec.at
>
> I wonder whether this patch shouldn't be rejected with a request
> that the timeout framework be submitted to the next CF.  Does anyone
> feel this approach (without the framework) should be pursued
> further?

I think "Returned with Feedback" would be more appropriate than
"Rejected", since we're asking for a rework, rather than saying - we
just don't want this.  But otherwise, +1.

Generally, I'm of the opinion that patches needing significant rework
should be set to "Returned with Feedback" and resubmitted for the next
CF; otherwise, it just gets unmanageable.  Our goal for a CF should be
to review everything thoroughly, not to get everything committed.
Otherwise, we end up with a never-ending train of what are effectively
new patches popping up, and it becomes impossible to close out the
CommitFest on time.  Reviewers and committers end up getting slammed,
and it's not very much fun for patch authors (who are trying to
frantically do last-minute rewrites) either; nor is it good for the
overall quality of the code that ends up in our tree.  Better to take a
breather and then start fresh.

(If you don't believe in committer fatigue, look at the review I gave
Itagaki Takahiro on the partitioning patch in January vs. the review I
gave in July.  One of those reviews is a whole lot more specific,
detailed, and accurate than the other one...)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Kevin Grittner
Boszormenyi Zoltan  wrote:
> Kevin Grittner wrote:
 
>> I wonder whether this patch shouldn't be rejected with a request
>> that the timeout framework be submitted to the next CF.  Does
>> anyone feel this approach (without the framework) should be
>> pursued further?
> 
> I certainly think so, the current scheme seems to be very fragile
> and doesn't want to be extended.
 
Sorry, I phrased that question in a rather confusing way; I'm not
sure, but it sounds like you're in favor of dropping this approach
and pursuing the timeout framework in the next CF -- is that right?
 
-Kevin



Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Boszormenyi Zoltan
Hi,

Kevin Grittner wrote:
> Marc Cousin  wrote:
>> This time, it's this case that doesn't work :
>> I really feel that the timeout framework is the way to go here.
>
> Since Zoltán also seems to feel this way:
>
> http://archives.postgresql.org/message-id/4c516c3a.6090...@cybertec.at
>
> I wonder whether this patch shouldn't be rejected with a request
> that the timeout framework be submitted to the next CF.  Does anyone
> feel this approach (without the framework) should be pursued
> further?

I certainly think so, the current scheme seems to be very fragile
and doesn't want to be extended.




Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Kevin Grittner
Marc Cousin  wrote:
 
> This time, it's this case that doesn't work :
 
> I really feel that the timeout framework is the way to go here.
 
Since Zoltán also seems to feel this way:
 
http://archives.postgresql.org/message-id/4c516c3a.6090...@cybertec.at
 
I wonder whether this patch shouldn't be rejected with a request
that the timeout framework be submitted to the next CF.  Does anyone
feel this approach (without the framework) should be pursued
further?
 
-Kevin



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Yeb Havinga

Robert Haas wrote:

> I don't think that this is much cleaner than the global variable
> solution; you haven't really localized the need to know about the new
> flag in any meaningful way, and the hacks in ATOneLevelRecursion()
> basically destroy any pretense of that code possibly being reusable
> for some other caller.  However, there's a more serious problem, which
> is that it doesn't fix the bug in general: try it with the
> top1/top2/bottom/basement example I posted upthread.  If you add the
> same column to both top1 and top2 and then drop it in both top1 and
> top2, basement ends up with a leftover copy.  The problem is that
> "only visit each child once" is not the right algorithm; what you need
> to do is "only visit the descendants of each child if no merge
> happened at the parent".  I believe that the only way to do this
> correctly is to merge the prep stage into the execution stage, as the
> code for adding constraints already does.  At the prep stage, you
> don't have enough information to determine which relations you'll
> ultimately need to update, since you don't know where the merges will
> happen.

Hello Robert,

Again, thanks for looking at the patch. Unfortunately I missed the
top1/top2 example earlier, but now that I've seen it I understand that it is
impossible to fix this problem during the prep stage without looking at
the actual existing columns, i.e. without the merge code. Running the
top1/top2 example, I would also have expected an error while adding the
column to the second top, since the columns added to top1 and top2 have
different origins. Apparently that is considered good behaviour; however,
trouble emerges when e.g. trying to rename a_table_column in the
top1/top2 example, whereas that is no problem in the 'lollipop' structure,
which has a single origin.


ALTER TABLE top RENAME COLUMN a_table_column TO another_table_column;

SELECT t.oid, t.relname, a.attname, a.attinhcount
FROM pg_class t
JOIN pg_attribute a ON (a.attrelid = t.oid)
JOIN pg_namespace n ON (t.relnamespace = n.oid)
WHERE n.nspname = 'test_inheritance' AND a.attname LIKE '%table_column%'
ORDER BY oid;

I do not completely understand what you mean by the destruction of
reusability of ATOneLevelRecursion, currently only called by
ATPrepAddColumn - in the patch it is documented in the definition of
relVisited that it is visit info for each subcommand. The loop over
subcommands is in ATController, where the value is properly reset for
all current and future subcommands. Hence the ATOneLevelRecursion
routine is usable in its current form for all callers during the prep
stage, and not only by ATPrepAddColumn.

> I am of the opinion that the chances of a unifying solution popping up
> are pretty much zero.  :-)

Me too, now I understand the fixes must be in the execution stages.

regards,
Yeb Havinga




Re: [HACKERS] Compiling CVS HEAD with clang under OSX

2010-08-02 Thread Greg Stark
On Mon, Aug 2, 2010 at 4:27 PM, Tom Lane  wrote:
> Here's the problem: if the compiler is allowed to assume that overflow
> cannot happen, it is always going to be able to "prove" that the
> if-test is constant false.  This is inherent.  Anybody claiming to
> exhibit a safe way to code the overflow test is really only telling
> you that today's version of their compiler isn't smart enough to make
> the proof.

So I'll do the next two parts of the dialogue myself:

Questioner: So why not write that as:

if ((arg1 > 0 && arg2 > 0 && arg1 > MAXINT - arg2) ||
    (arg1 < 0 && arg2 < 0 && arg1 < MININT + arg2))
    elog(ERROR, "overflow");
else
    return arg1 + arg2;

Tom: Because that code is much more complex and prone to errors,
especially when you start getting into multiplication and other
operations, and it's also much slower than the code which allows
overflow to happen and then checks that the result makes sense.

I'm not entirely sure I agree. At least I haven't actually gone
through all the arithmetic operations and I'm not sure how much more
complex they get. If they were all at that level of complexity I think
I would say we should go ahead and bite the performance bullet and do
it the ultra-safe way.

-- 
greg



Re: [HACKERS] (9.1) btree_gist support for searching on "not equals"

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 2:39 AM, Jeff Davis  wrote:
> On Sun, 2010-08-01 at 21:57 -0400, Robert Haas wrote:
>> On Fri, Jul 16, 2010 at 1:19 AM, Jeff Davis  wrote:
>> > Thank you for the review.
>> >
>> > On Mon, 2010-07-12 at 17:17 +0900, Itagaki Takahiro wrote:
>> >> (1) Exclusion constraints support for operators where "x  x"
>> >> is false (tiny patch)
>> >> https://commitfest.postgresql.org/action/patch_view?id=307
>> >> (2) btree_gist support for searching on <> ("not equals")
>> >> https://commitfest.postgresql.org/action/patch_view?id=308
>> >>
>> >> Those patches should be committed at once because (2) requires (1) to work
>> >> with EXCLUDE constraints. Also, (1) has no benefits without (2) because we
>> >> have no use cases for <> as an index-able operator. Both patches are very
>> >> simple and small, and worked as expected both "WHERE <>" and EXCLUDE
>> >> constraints cases.
>> >
>> > It appears that Tom already committed (1).
>> >
>> >> I'd like to ask you to write additional documentation about btree_gist [1]
>> >> that the module will be more useful when it is used with exclusion
>> >> constraints together. Without documentation, no users find the usages.
>> >
>> > Good idea, new patch attached.
>>
>> It seems pretty odd to define a constant called
>> BTNotEqualStrategyNumber in contrib/btree_gist.  Shouldn't we either
>> call this something else, or define it in access/skey.h?  Considering
>> that there seem to be some interesting gymnastics being done with
>> BTMaxStrategyNumber, I'd vote for the former.  Maybe just
>> BtreeGistNotEqualStrategyNumber?
>
> Sounds good to me.

OK, committed that way.

> At some point we may be interested to add this to BTree, as well. But we
> can cross that bridge when we come to it.

Yeah.

I was also wondering if it would be worth adding some additional
regression testing to contrib/btree_gist exercising this new
functionality.  Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] knngist - 0.8

2010-08-02 Thread Robert Haas
2010/7/29 Alexander Korotkov :
> But, in pg_trgm it makes it possible to combine different similarity levels
> in one query. For example:
> select * from test_trgm order by t <-> 'asdf' < 0.5 or t <-> 'qwer' < 0.4;
> Is there any chance to handle this syntax also?

Maybe I'm missing something, but I don't think that ORDER BY clause
makes much sense.  OR is going to reduce to a true or false value - and
it's usually not that interesting to order by a column that can only
take one of two values.

Am I confused?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 10:47 AM, Yeb Havinga  wrote:
>>> The attached patch uses the globally defined list.
>>
>> I can't speak for any other committer, but personally I'm prepared to
>> reject out of hand any solution involving a global variable.  This
>> code is none too easy to follow as it is, and adding global variables to
>> the mix is not going to make it better, even if we can convince
>> ourselves that it's safe and correct.  This is something of a corner
>> case, so I won't be crushed if a cleaner fix is too invasive to
>> back-patch.
>
> Thanks for looking at the patch. I've attached a somewhat more verbose version
> that adds a boolean to AlteredTableInfo and a function to wipe that boolean
> between processing of subcommands.

I don't think that this is much cleaner than the global variable
solution; you haven't really localized the need to know about the new
flag in any meaningful way, and the hacks in ATOneLevelRecursion()
basically destroy any pretense of that code possibly being reusable
for some other caller.  However, there's a more serious problem, which
is that it doesn't fix the bug in general: try it with the
top1/top2/bottom/basement example I posted upthread.  If you add the
same column to both top1 and top2 and then drop it in both top1 and
top2, basement ends up with a leftover copy.  The problem is that
"only visit each child once" is not the right algorithm; what you need
to do is "only visit the descendants of each child if no merge
happened at the parent".  I believe that the only way to do this
correctly is to merge the prep stage into the execution stage, as the
code for adding constraints already does.  At the prep stage, you
don't have enough information to determine which relations you'll
ultimately need to update, since you don't know where the merges will
happen.
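
For reference, the scenario can be written out like this (table names follow
the example from the thread; column names are made up):

CREATE TABLE top1 (t1 int);
CREATE TABLE top2 (t2 int);
CREATE TABLE bottom (b int) INHERITS (top1, top2);
CREATE TABLE basement (c int) INHERITS (bottom);

-- add the same column to both parents, then drop it from both
ALTER TABLE top1 ADD COLUMN x int;
ALTER TABLE top2 ADD COLUMN x int;
ALTER TABLE top1 DROP COLUMN x;
ALTER TABLE top2 DROP COLUMN x;

-- the reported problem: basement still ends up with a leftover copy of x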

>>  Incidentally, we need to shift this discussion to a new
>> subject line; we have a working patch for the original subject of this
>> thread, and are now discussing the related problem I found with
>> attributes.
>>
>
> Both solutions have changes in callers of 'find_inheritance_children'. I was
> working in the hope that a unifying solution would pop up naturally, but so far
> it has not. Checking of the new AlteredTableInfo->relVisited boolean could
> perhaps be pushed down into find_inheritance_children, if multiple-'doing
> things' for the children under a multiple-inheriting relation is unwanted for
> every kind of action. It seems to me that this question must be answered
> before the current working patch for coninhcount is 'ready for committer'.

I am of the opinion that the chances of a unifying solution popping up
are pretty much zero.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Compiling CVS HEAD with clang under OSX

2010-08-02 Thread Tom Lane
Neil Conway  writes:
> FWIW, I think we should aim to eventually remove the dependency on
> -fwrapv, and instead make the code correct under the semantics
> guaranteed by the C spec.

[ shrug... ]  We've gone around on that before.  I can't get excited
about it, and the reason is that 99% of the places where we actually
need -fwrapv behavior is in places where we are trying to test for
overflow.  The need for that can't be legislated away.  As an example,
in int4pl we have code like

result = arg1 + arg2;
if (some-test-on-result-and-arg1-and-arg2)
    elog(ERROR, "overflow");

Here's the problem: if the compiler is allowed to assume that overflow
cannot happen, it is always going to be able to "prove" that the
if-test is constant false.  This is inherent.  Anybody claiming to
exhibit a safe way to code the overflow test is really only telling
you that today's version of their compiler isn't smart enough to make
the proof.  Next year, who knows?  I'm uninterested in getting into
that kind of arms race with the compiler authors.  I prefer to keep
the goalposts exactly where they've been for the past forty years,
namely -fwrapv semantics.

If the compiler guys actually were serious about helping people write
code that didn't need -fwrapv, they would provide primitives for
testing for arithmetic overflow.  I would be ecstatic if we could write
int4pl like this:

if (sum_overflows(arg1, arg2))
    elog(ERROR, "overflow");
else
    return arg1 + arg2;

especially since with a halfway decent compiler this sort of
construction wouldn't even require doing the addition twice ---
presumably the compiler could see that it had a common subexpression
there, and generate code that consists of one addition plus a
branch-on-overflow test.  This'd be clean, readable, and far more
efficient than the hacks we have to resort to now.

Short of that, though, -fwrapv is what it's gonna be.
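
For context, the user-visible behavior that all of these checks exist to
preserve is simply the familiar error, e.g.:

SELECT 2147483647::int4 + 1;
ERROR:  integer out of range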

regards, tom lane



Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Marc Cousin
On Monday 02 August 2010 13:59:59, Boszormenyi Zoltan wrote:
> >
> > Also, I made sure that only one or two timeout causes (one of
> > deadlock_timeout
> > and lock_timeout in the first case or statement_timeout plus one of the
> > other two)
> > can be active at a time.
> 
> A little clarification is needed. The above statement is not entirely true.
> There can be a case when all three timeout causes can be active, but only
> when deadlock_timeout is the smallest of the three. If the fin_time value
> for another timeout cause is smaller than for deadlock_timeout, then
> there's no point in making deadlock_timeout active. And if deadlock_timeout
> triggers and CheckDeadLock() finds that there really is a deadlock then the
> possibly active lock_timeout gets disabled to avoid calling
> RemoveFromWaitQueue() twice. I hope the comments in the code explain it
> well.
> 
> >  Previously I was able to trigger a segfault
> > with the default
> > 1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
> > system's
> > clock resolution makes the lock_timeout and deadlock_timeout equal and
> > RemoveFromWaitQueue() was called twice. This way it's a lot more robust.
> >   
> 
Ok, I've tested this new version:

This time, it's this case that doesn't work :

Session 1 : lock a table

Session 2 : set lock_timeout to a large value, just to make it obvious (10 
seconds).
It times out after 1 s (the deadlock timeout), returning 'could not obtain 
lock'.
Of course, it should wait 10 seconds.
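
In SQL terms, the failing sequence is roughly this (the GUC name is the one
from the patch; I am assuming the value is in milliseconds):

-- session 1
BEGIN;
LOCK TABLE some_table;

-- session 2
SET lock_timeout = 10000;   -- ask for a 10 second limit
LOCK TABLE some_table;      -- fails after ~1 s with 'could not obtain lock'
                            -- instead of waiting the full 10 seconds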


I really feel that the timeout framework is the way to go here. I know what
you said about it before, but I think that the current code is getting
too complicated, with too many special cases all over.


As this is only my second review and I'm sure I am missing things here, could
someone with more experience have a look and give advice ?


Cheers

Marc



Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-02 Thread Robert Haas
2010/8/2 Alexander Korotkov :
> On Mon, Aug 2, 2010 at 5:20 AM, Robert Haas  wrote:
>> I reviewed this code in a fair amount of detail today and ended up
>> rewriting it.  In general terms, it's best to avoid changing things
>> that are not relevant to the central purpose of the patch.  This patch
>> randomly adds a whole bunch of whitespace and removes a whole bunch of
>> comments which I find useful and do not want to see removed.  It took
>> me about an hour to reverse out all of those changes, which then let
>> me get down to the business of actually analyzing the code.
> I'm sorry. This is my first patch; I will be more careful next time. But I
> think there is no unified opinion about some changes, like replacing "!m"
> with "m==0".

Sure; if that were the only such change, I wouldn't have mentioned it.

> I think this line is not correct:
> "if (m != s_bytes && n != t_bytes)"

Good catch, fixed in the attached version.

>> The attached patch takes the approach of using a fast-path for just
>> the innermost loop when neither string contains any multi-byte
>> characters (regardless of whether or not the encoding allows
>> multi-byte characters).  It's still slower than CVS HEAD, but for
>> strings containing no multi-byte characters it's only marginally, if
>> at all, slower than previous releases, because 9.0 doesn't contain
>> your fix to avoid the extra string copy; and it's faster than your
>> implementation even when multi-byte characters ARE present.  It is
>> slightly slower on single-byte encoding databases (I tested
>> SQL_ASCII), but still faster than previous releases.  It's also quite
>> a bit less code duplication.
>
> Can I look at the test which shows that your implementation is faster than
> mine when multi-byte characters are present? My tests show the reverse. For
> example, with the Russian dictionary from OpenOffice:

Hmm, with the correction you mention above, I'm getting about the same
results with multi-byte characters (but faster with only single-byte
characters).  I did this:

CREATE TABLE words (a text not null primary key);
COPY words from '/usr/share/dict/words';

SELECT SUM(levenshtein(a, 'foo')) from words;
SELECT SUM(levenshtein(a, 'Urbański')) FROM words;
SELECT SUM(levenshtein(a, 'ańs')) FROM words;

With the updated patch attached below, these ran about 102 ms, 222 ms,
129 ms.  With your patch, I get about 134 ms, 222 ms, 130 ms.  Perhaps
this is because I only have multi-byte characters in one of the
inputs, not both.  I don't have a dictionary handy with multi-byte
words in it.  Can you send me a file?
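
In the absence of a real dictionary, a synthetic word list may be good enough
for rough timings; a sketch, assuming a UTF-8 database (where chr() maps
Unicode code points to characters) and an arbitrary table name:

CREATE TABLE mb_words AS
    SELECT repeat(chr(1072 + (g % 32)), 3 + (g % 8)) || g AS a  -- Cyrillic а..я
    FROM generate_series(1, 100000) AS g;

SELECT SUM(levenshtein(a, 'Urbański')) FROM mb_words;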

> I think that the cause of the slowdown is the memcmp function. This
> function is very fast for long stretches of memory, but for a few bytes an
> inline function is faster, because the compiler has more freedom for
> optimization. After replacing memcmp with an inline function like the
> following in your implementation:

Yeah, that does seem to be slightly better.  I've done a version of
this in the attached patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


mb_levenshtein_rmh-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Yeb Havinga

Robert Haas wrote:

On Mon, Aug 2, 2010 at 9:20 AM, Yeb Havinga  wrote:

The attached patch uses the globally defined list.

I can't speak for any other committer, but personally I'm prepared to
reject out of hand any solution involving a global variable.  This
code is none too easy to follow as it is, and adding global variables to
the mix is not going to make it better, even if we can convince
ourselves that it's safe and correct.  This is something of a corner
case, so I won't be crushed if a cleaner fix is too invasive to
back-patch.

Hello Robert,

Thanks for looking at the patch. I've attached a somewhat more verbose 
version, which adds a boolean to AlteredTableInfo and a function to reset 
that boolean between processing of subcommands.

  Incidentally, we need to shift this discussion to a new
subject line; we have a working patch for the original subject of this
thread, and are now discussing the related problem I found with
attributes.
  
Both solutions change callers of 'find_inheritance_children'. I was 
working in the hope that a unifying solution would pop up naturally, but 
so far it has not. Checking the new AlteredTableInfo->relVisited 
boolean could perhaps be pushed down into find_inheritance_children, if 
doing things more than once for the children under a multiply-inheriting 
relation is unwanted for every kind of action. It seems to me that this 
question must be answered before the current working patch for 
coninhcount is 'ready for committer'.


regards,
Yeb Havinga

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 49a6f73..d43efc3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -99,7 +99,6 @@ typedef struct OnCommitItem
 
 static List *on_commits = NIL;
 
-
 /*
  * State information for ALTER TABLE
  *
@@ -132,6 +131,7 @@ typedef struct AlteredTableInfo
Oid relid;  /* Relation to work on 
*/
charrelkind;/* Its relkind */
TupleDesc   oldDesc;/* Pre-modification tuple 
descriptor */
+   boolrelVisited; /* Was the rel visited before, for the 
current subcommand */
/* Information saved by Phase 1 for Phase 2: */
List   *subcmds[AT_NUM_PASSES]; /* Lists of AlterTableCmd */
/* Information saved by Phases 1/2 for Phase 3: */
@@ -335,6 +335,7 @@ static void ATExecDropInherit(Relation rel, RangeVar 
*parent, LOCKMODE lockmode)
 static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
   ForkNumber forkNum, bool istemp);
 static const char *storage_name(char c);
+static void ATResetRelVisited(List **wqueue);
 
 
 /* 
@@ -2583,6 +2584,7 @@ ATController(Relation rel, List *cmds, bool recurse, 
LOCKMODE lockmode)
{
AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);
 
+   ATResetRelVisited(&wqueue);
ATPrepCmd(&wqueue, rel, cmd, recurse, false, lockmode);
}
 
@@ -3503,6 +3505,23 @@ ATGetQueueEntry(List **wqueue, Relation rel)
 }
 
 /*
+ * ATResetRelVisited: reset the visitation info for each rel in the working
+ * queue, so it can be used for the next subcommand.
+ */
+static void
+ATResetRelVisited(List **wqueue)
+{
+   AlteredTableInfo *tab;
+   ListCell   *ltab;
+
+   foreach(ltab, *wqueue)
+   {
+   tab = (AlteredTableInfo *) lfirst(ltab);
+   tab->relVisited = false;
+   }
+}
+
+/*
  * ATSimplePermissions
  *
  * - Ensure that it is a relation (or possibly a view)
@@ -3618,19 +3637,29 @@ ATSimpleRecursion(List **wqueue, Relation rel,
 /*
  * ATOneLevelRecursion
  *
- * Here, we visit only direct inheritance children.  It is expected that
- * the command's prep routine will recurse again to find indirect children.
- * When using this technique, a multiply-inheriting child will be visited
- * multiple times.
+ * Here, we visit only direct inheritance children.  It is expected that the
+ * command's prep routine will recurse again to find indirect children.  When
+ * using this technique, a multiply-inheriting child will be visited multiple
+ * times.  Children of multiply-inheriting children, however, are only visited
+ * once for each parent.
  */
 static void
 ATOneLevelRecursion(List **wqueue, Relation rel,
AlterTableCmd *cmd, LOCKMODE lockmode)
 {
Oid relid = RelationGetRelid(rel);
+   AlteredTableInfo *tab = ATGetQueueEntry(wqueue, rel);
ListCell   *child;
List   *children;
 
+   /* If we already visited the current multiply-inheriting relation, we
+    * mustn't recurse to its child tables, because they've already been
+    * visited. Visiting them would lead to an incorrect value for
+    * attinhcount. */
+   if (tab->relVisited

Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 10:21 AM, Kevin Grittner
 wrote:
> Sushant Sinha  wrote:
>
>> Yes, that's what I am planning to do. I just wanted to see if anyone
>> can help me in estimating whether this is doable in the current
>> parser or I need to write a new one. If possible, then some idea
>> on how to go about implementing?
>
> The current tsearch parser is a state machine which does clunky mode
> switches to handle special cases like you describe.  If you're
> looking at doing very much in there, you might want to consider a
> rewrite to something based on regular expressions.  See discussion
> in these threads:
>
> http://archives.postgresql.org/message-id/200912102005.16560.and...@anarazel.de
>
> http://archives.postgresql.org/message-id/4b210d9e02250002d...@gw.wicourts.gov
>
> That was actually at the top of my personal PostgreSQL TODO list
> (after my current project is wrapped up), but I wouldn't complain if
> someone else wanted to take it.  :-)

If you end up rewriting it, it may be a good idea, in the initial
rewrite, to mimic the current results as closely as possible - and
then submit a separate patch to change the results.  Changing two
things at the same time exponentially increases the chance of your
patch getting rejected.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Kevin Grittner
Sushant Sinha  wrote:
 
> Yes, that's what I am planning to do. I just wanted to see if anyone
> can help me in estimating whether this is doable in the current
> parser or I need to write a new one. If possible, then some idea
> on how to go about implementing?
 
The current tsearch parser is a state machine which does clunky mode
switches to handle special cases like you describe.  If you're
looking at doing very much in there, you might want to consider a
rewrite to something based on regular expressions.  See discussion
in these threads:
 
http://archives.postgresql.org/message-id/200912102005.16560.and...@anarazel.de
 
http://archives.postgresql.org/message-id/4b210d9e02250002d...@gw.wicourts.gov
 
That was actually at the top of my personal PostgreSQL TODO list
(after my current project is wrapped up), but I wouldn't complain if
someone else wanted to take it.  :-)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Tom Lane
Sushant Sinha  writes:
>> This would needlessly increase the number of tokens. Instead you'd 
>> better make it work like compound word support, having just "wikipedia" 
>> and "org" as tokens.

> The current text parser already returns url and url_path. That already
> increases the number of unique tokens. I am only asking for adding of
> normal english words as well so that if someone types only "wikipedia"
> he gets a match. 

The suggestion to make it work like compound words is still a good one,
ie given wikipedia.org you'd get back

host        wikipedia.org
host-part   wikipedia
host-part   org

not just the "host" token as at present.

Then the user could decide whether he needed to index hostname
components or not, by choosing whether to forward hostname-part
tokens to a dictionary or just discard them.
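
For illustration, with today's parser (a sketch; output abridged, and the
host-part token type is hypothetical at this point):

SELECT alias, token FROM ts_debug('english', 'wikipedia.org');
--  alias |     token
-- -------+---------------
--  host  | wikipedia.org

-- Under the proposal, two extra host-part rows ('wikipedia', 'org') would
-- appear, and the user could pick a mapping for them, e.g. (hypothetical
-- token name):
-- ALTER TEXT SEARCH CONFIGURATION english
--     ADD MAPPING FOR host_part WITH simple;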

If you submit a patch that tries to force the issue by classifying
hostname parts as plain words, it'll probably get rejected out of
hand on backwards-compatibility grounds.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Sushant Sinha
On Mon, 2010-08-02 at 09:32 -0400, Robert Haas wrote:
> On Mon, Aug 2, 2010 at 9:12 AM, Sushant Sinha  wrote:
> > The current text parser already returns url and url_path. That already
> > increases the number of unique tokens. I am only asking for adding of
> > normal english words as well so that if someone types only "wikipedia"
> > he gets a match.
> [...]
> > Postgres english parser already emits urls as tokens. Only thing I am
> > asking is on improving the tokenization and positioning.
> 
> Can you write a patch to implement your idea?
> 

Yes, that's what I am planning to do. I just wanted to see if anyone can
help me in estimating whether this is doable in the current parser or I
need to write a new one. If possible, then some idea on how to go about
implementing?

-Sushant.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 9:20 AM, Yeb Havinga  wrote:
> Robert Haas wrote:
>>
>> I agree that's the crux of the problem, but I can't see solving it
>> with a global variable.  I realize you were just testing...
>
> Yes it was just a test. However, somewhere information must be kept or
> altered so it can be detected that a relation has already been visited, i.e.
> it is a multiply-inheriting child. The other solutions I could think of are
> more intrusive (i.e. defining it in ATController and passing it as a parameter).
>
> The attached patch uses the globally defined list. After ATPrepCmd the list
> pointer is reset to NIL, the list not freed since the allocs are done in a
> memory context soon to be deleted (PortalHeapMemory). It passes regression
> as well as the script below.

I can't speak for any other committer, but personally I'm prepared to
reject out of hand any solution involving a global variable.  This
code is none too easy to follow as it is, and adding global variables to
the mix is not going to make it better, even if we can convince
ourselves that it's safe and correct.  This is something of a corner
case, so I won't be crushed if a cleaner fix is too invasive to
back-patch.  Incidentally, we need to shift this discussion to a new
subject line; we have a working patch for the original subject of this
thread, and are now discussing the related problem I found with
attributes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 9:12 AM, Sushant Sinha  wrote:
> The current text parser already returns url and url_path. That already
> increases the number of unique tokens. I am only asking for adding of
> normal english words as well so that if someone types only "wikipedia"
> he gets a match.
[...]
> Postgres english parser already emits urls as tokens. Only thing I am
> asking is on improving the tokenization and positioning.

Can you write a patch to implement your idea?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Markus Wanner

Hi,

On 08/02/2010 03:12 PM, Sushant Sinha wrote:

The current text parser already returns url and url_path. That already
increases the number of unique tokens.


Well, I think I simply turned that off to be able to search for plain 
words. It still works for complete URLs, those are just treated like 
text, then.



Earlier people have expressed the need to index urls/emails and
currently the text parser already does so. Reverting that would be a
regression of functionality. Further, a ranking function can take
advantage of direct match of a token.


That's a point, yes. However, simply making the same string turn up 
twice in the tokenizer's output doesn't sound like the right solution to 
me. Especially considering that the query parser uses the very same 
tokenizer.


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres as Historian

2010-08-02 Thread Etienne Dube

On 02/08/2010 3:20 AM, Hardik Belani wrote:
We are using postgres as RDBMS for our product. There is a requirement 
coming for a feature which will require me to store data about various 
data points (mainly numbers) on a time scale. Data measurement is 
being taken every few secs/mins based and it is needed to be stored 
for statistical analysis. Now this requires numbers (integers/floats) 
to be stored at every mins.
For this i can create a table with number and time (may be time offset 
instead of timestamp) as columns. But still it will require me to 
store huge number of rows in the order of few millions. Data is read 
only and only inserts can happen. But I need to perform all kinds of 
aggregation to get various statistics. for example: daily avg, monthly 
avg etc..
We already are using postgres for our product so using postgres does 
not add any additional installation requirement and hence it is a bit 
easier.
Would you recommend postgres for this kind of requirement, and will it 
provide the performance? Or do you recommend any other database meant 
for such requirements? I am also searching for a good historian 
database if postgres doesn't support this.

Thanks,
Hardik



Hi Hardik,

Data warehousing techniques could help you with your requirements of 
aggregating large amounts of data. Have a look at "The Data Warehouse 
Toolkit" by R. Kimball on how to design a star schema with aggregate 
tables (these can be done as materialized views using PL/pgSQL and 
triggers under postgres). You could also use an OLAP server (e.g. 
Mondrian, which is pretty nice and open source as well) on top of your 
postgres DB, as it can use aggregate tables transparently when needed.
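
As a rough sketch of the aggregate-table idea (table and function names are 
made up; concurrent inserts for the same day would need extra care):

CREATE TABLE measurement (
    ts    timestamptz NOT NULL,
    value double precision NOT NULL
);

CREATE TABLE measurement_daily (
    day   date PRIMARY KEY,
    n     bigint NOT NULL,
    total double precision NOT NULL    -- daily avg = total / n
);

CREATE FUNCTION measurement_rollup() RETURNS trigger AS $$
BEGIN
    -- fold the new row into the daily aggregate, creating the day row on demand
    UPDATE measurement_daily
       SET n = n + 1, total = total + NEW.value
     WHERE day = NEW.ts::date;
    IF NOT FOUND THEN
        INSERT INTO measurement_daily VALUES (NEW.ts::date, 1, NEW.value);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurement_rollup_trg
    AFTER INSERT ON measurement
    FOR EACH ROW EXECUTE PROCEDURE measurement_rollup();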


Etienne


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Yeb Havinga

Robert Haas wrote:

I agree that's the crux of the problem, but I can't see solving it
with a global variable.  I realize you were just testing...
  
Yes it was just a test. However, somewhere information must be kept or 
altered so it can be detected that a relation has already been visited, 
i.e. it is a multiply-inheriting child. The other solutions I could 
think of are more intrusive (i.e. defining it in ATController and passing 
it as a parameter).


The attached patch uses the globally defined list. After ATPrepCmd the 
list pointer is reset to NIL, the list not freed since the allocs are 
done in a memory context soon to be deleted (PortalHeapMemory). It 
passes regression as well as the script below.


regards,
Yeb Havinga


DROP SCHEMA IF EXISTS test_inheritance CASCADE;
CREATE SCHEMA test_inheritance;
SET search_path TO test_inheritance;

CREATE TABLE top (i int);
CREATE TABLE mid1 () INHERITS (top);
CREATE TABLE mid2 () INHERITS (top);
CREATE TABLE bottom () INHERITS (mid1, mid2);
CREATE TABLE basement () INHERITS (bottom);

ALTER TABLE top
ADD COLUMN a_table_column integer,
ADD COLUMN a_table_column2 integer;

ALTER TABLE top
ADD COLUMN a_table_column3 integer;

ALTER TABLE top
ADD CONSTRAINT  a_check_constraint CHECK (i IN (0,1)),
ADD CONSTRAINT  a_check_constraint2 CHECK (i IN (0,1));

ALTER TABLE top
ADD CONSTRAINT  a_check_constraint3 CHECK (i IN (0,1));

SELECT t.oid, t.relname, a.attinhcount
FROM pg_class t
JOIN pg_attribute a ON (a.attrelid = t.oid)
JOIN pg_namespace n ON (t.relnamespace = n.oid)
WHERE n.nspname = 'test_inheritance' AND a.attname LIKE 'a_table_column%'
ORDER BY oid;

SELECT t.oid, t.relname, c.coninhcount
FROM pg_class t
JOIN pg_constraint c ON (c.conrelid = t.oid)
JOIN pg_namespace n ON (t.relnamespace = n.oid)
WHERE n.nspname = 'test_inheritance'
ORDER BY oid;




diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 49a6f73..08efffc 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -99,6 +99,10 @@ typedef struct OnCommitItem
 
 static List *on_commits = NIL;
 
+/*
+ * Per subcommand history of relids visited in an inheritance hierarchy.
+ */
+static List *visited_relids = NIL;
 
 /*
  * State information for ALTER TABLE
@@ -2584,6 +2588,7 @@ ATController(Relation rel, List *cmds, bool recurse, 
LOCKMODE lockmode)
AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);
 
ATPrepCmd(&wqueue, rel, cmd, recurse, false, lockmode);
+   visited_relids = NIL;
}
 
/* Close the relation, but keep lock until commit */
@@ -3618,10 +3623,11 @@ ATSimpleRecursion(List **wqueue, Relation rel,
 /*
  * ATOneLevelRecursion
  *
- * Here, we visit only direct inheritance children.  It is expected that
- * the command's prep routine will recurse again to find indirect children.
- * When using this technique, a multiply-inheriting child will be visited
- * multiple times.
+ * Here, we visit only direct inheritance children.  It is expected that the
+ * command's prep routine will recurse again to find indirect children.  When
+ * using this technique, a multiply-inheriting child will be visited multiple
+ * times.  Children of multiply-inheriting children, however, are only visited
+ * once for each parent.
  */
 static void
 ATOneLevelRecursion(List **wqueue, Relation rel,
@@ -3631,6 +3637,14 @@ ATOneLevelRecursion(List **wqueue, Relation rel,
ListCell   *child;
List   *children;
 
+   /* If we already visited the current multiply-inheriting relation, we
+    * mustn't recurse to its child tables, because they've already been
+    * visited. Visiting them would lead to an incorrect value for
+    * attinhcount. */
+   if (list_member_oid(visited_relids, relid))
+   return;
+   visited_relids = lappend_oid(visited_relids, relid);
+
children = find_inheritance_children(relid, lockmode);
 
foreach(child, children)
@@ -4891,6 +4905,15 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo 
*tab, Relation rel,
CommandCounterIncrement();
 
/*
+* If the constraint got merged with an existing constraint, we're done.
+* We mustn't recurse to child tables in this case, because they've 
already
+* got the constraint, and visiting them again would lead to an 
incorrect
+* value for coninhcount.
+*/
+   if (newcons == NIL)
+   return;
+
+   /*
 * Propagate to children as appropriate.  Unlike most other ALTER
 * routines, we have to do this one level of recursion at a time; we 
can't
 * use find_all_inheritors to do it in one pass.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Sushant Sinha
> On 08/01/2010 08:04 PM, Sushant Sinha wrote:
> > 1. We do not have separate tokens "wikipedia" and "org"
> > 2. If we have the two tokens we should have them at adjacent position so
> > that a phrase search for "wikipedia org" should work.
> 
> This would needlessly increase the number of tokens. Instead you'd 
> better make it work like compound word support, having just "wikipedia" 
> and "org" as tokens.

The current text parser already returns url and url_path. That already
increases the number of unique tokens. I am only asking for adding of
normal english words as well so that if someone types only "wikipedia"
he gets a match. 

> 
> Searching for "wikipedia.org" or "wikipedia org" should then result in 
> the same search query with the two tokens: "wikipedia" and "org".

Earlier people have expressed the need to index urls/emails and
currently the text parser already does so. Reverting that would be a
regression of functionality. Further, a ranking function can take
advantage of direct match of a token.

> > position 0: WORD(wikipedia), URL(wikipedia.org/search?q=sushant)
> 
> IMO the differentiation between WORDs and URLs is not something the text 
> search engine should have to take care a lot. Let it just do the 
> searching and make it do that well.

Postgres english parser already emits urls as tokens. Only thing I am
asking is on improving the tokenization and positioning.

> What does a token "wikipedia.org/search?q=sushant" buy you in terms of 
> text searching? Or even result highlighting? I wouldn't expect anybody 
> to want to search for a full URL, do you?

Such a need has been expressed in the past. And an exact token match can
result in better ranking functions. For example, a tf-idf ranking will
rank matches of such unique tokens significantly higher.

-Sushant.

> Regards
> 
> Markus Wanner



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 8:57 AM, Yeb Havinga  wrote:
> Fujii Masao wrote:
>>
>> On Mon, Aug 2, 2010 at 7:53 PM, Robert Haas  wrote:
>>
>>>
>>> Let's not get *the manner of specifying the policy* confused with *the
>>> need to update the policy when the master changes*.  It doesn't seem
>>> likely you would want the same value for  synchronous_standbys on all
>>> your machines.  In the most common configuration, you'd probably have:
>>>
>>> on A: synchronous_standbys=B
>>> on B: synchronous_standbys=A
>>>
>>
>> Oh, true. But, what if we have another synchronous standby called C?
>> We specify the policy as follows?:
>>
>> on A: synchronous_standbys=B,C
>> on B: synchronous_standbys=A,C
>> on C: synchronous_standbys=A,B
>>
>> We would need to change the setting on both A and B when we want to
>> change the name of the third standby from C to D, for example. No?
>>
>
> What if the master is named as well in the 'pool of servers that are in
> sync'? In the scenario above this pool would be A,B,C. Working with this
> concept has the benefit that the setting can be copied to all other servers
> as well, and is invariant under any number of failures or switchovers. The
> same could also hold for quorum expressions like A && (B || C), if A,B,C are
> either master or standby.
>
> I initially thought that once the definitions could be the same on all
> servers, having them in a system catalog would be a good thing. However
> that'd probably be hard to set up, and also in the case of failures during
> a change of the parameters it could become very messy.

Yeah, I think this information has to be stored either in GUCs or in a
flat-file somewhere.  Putting it in a system catalog will cause major
problems when trying to get a down system back up, I think.

I suspect that for complex setups, people will need to use some kind
of cluster-ware to update the settings as nodes go up and down.  But I
think it will still be simpler if the nodes are named.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Yeb Havinga

Fujii Masao wrote:

On Mon, Aug 2, 2010 at 7:53 PM, Robert Haas  wrote:
  

Let's not get *the manner of specifying the policy* confused with *the
need to update the policy when the master changes*.  It doesn't seem
likely you would want the same value for  synchronous_standbys on all
your machines.  In the most common configuration, you'd probably have:

on A: synchronous_standbys=B
on B: synchronous_standbys=A



Oh, true. But, what if we have another synchronous standby called C?
We specify the policy as follows?:

on A: synchronous_standbys=B,C
on B: synchronous_standbys=A,C
on C: synchronous_standbys=A,B

We would need to change the setting on both A and B when we want to
change the name of the third standby from C to D, for example. No?
  
What if the master is named as well in the 'pool of servers that are in 
sync'? In the scenario above this pool would be A,B,C. Working with this 
concept has the benefit that the setting can be copied to all other 
servers as well, and is invariant under any number of failures or 
switchovers. The same could also hold for quorum expressions like A && 
(B || C), if A,B,C are either master or standby.


I initially thought that once the definitions could be the same on all 
servers, having them in a system catalog would be a good thing. However 
that'd probably be hard to set up, and also in the case of failures during 
a change of the parameters it could become very messy.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Fujii Masao
On Mon, Aug 2, 2010 at 8:32 PM, Robert Haas  wrote:
> Sure.  If you give the standbys names, then if people change the
> names, they'll have to update their configuration.  But I can't see
> that as an argument against doing it.  You can remove the possibility
> that someone will have a hassle if they rename a server by not
> allowing them to give it a name in the first place, but that doesn't
> seem like a win from a usability perspective.

I'm just comparing your idea (i.e., set synchronous_standbys on
each possible master) with my idea (i.e., set replication_mode on
each standby). Though your idea has the advantage described in the
following post, it seems to make the setup of the standbys more
complicated, as I described. So I'm trying to come up with a better idea.
http://archives.postgresql.org/pgsql-hackers/2010-08/msg7.php

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Boszormenyi Zoltan
Boszormenyi Zoltan wrote:
> Marc Cousin wrote:
>   
>> The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote :
>>   
>> 
>>> I fixed this by adding CheckLockTimeout() function that works like
>>> CheckStatementTimeout() and ensuring that the same start time is
>>> used for both deadlock_timeout and lock_timeout if both are active.
>>> The preference of errors if their timeout values are equal is:
>>> statement_timeout > lock_timeout > deadlock_timeout
>>> 
>>>   
>> As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
>> work, with this new version.
>>
>> Keeping the deadlock_timeout to 1s, when lock_timeout >= 1001, 
>> lock_timeout doesn't trigger anymore.
>>   
>> 
>
> I missed one case when the lock_timeout_active should have been set
> but the timer must not have been re-set; this caused the problem.
> I blame the hot weather and having no air conditioning. The second is
> now fixed. :-)
>
> I also added one line in autovacuum.c to disable lock_timeout,
> in case it's globally set in postgresql.conf, as per Alvaro's comment.
>
> Also, I made sure that only one or two timeout causes (one of
> deadlock_timeout
> and lock_timeout in the first case or statement_timeout plus one of the
> other two)
> can be active at a time.

A little clarification is needed. The above statement is not entirely true.
There can be a case when all three timeout causes are active, but only
when deadlock_timeout is the smallest of the three. If the fin_time value
for another timeout cause is smaller than the one for deadlock_timeout, then
there's no point in making deadlock_timeout active. And in case
deadlock_timeout triggers and CheckDeadLock() finds that there really is a
deadlock, the possibly active lock_timeout gets disabled to avoid calling
RemoveFromWaitQueue() twice. I hope the comments in the code explain it
well.

>  Previously I was able to trigger a segfault
> with the default
> 1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
> system's
> clock resolution makes the lock_timeout and deadlock_timeout equal and
> RemoveFromWaitQueue() was called twice. This way it's a lot more robust.
>   

Best regards,
Zoltán Böszörményi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 7:06 AM, Fujii Masao  wrote:
> On Mon, Aug 2, 2010 at 7:53 PM, Robert Haas  wrote:
>> Let's not get *the manner of specifying the policy* confused with *the
>> need to update the policy when the master changes*.  It doesn't seem
>> likely you would want the same value for  synchronous_standbys on all
>> your machines.  In the most common configuration, you'd probably have:
>>
>> on A: synchronous_standbys=B
>> on B: synchronous_standbys=A
>
> Oh, true. But, what if we have another synchronous standby called C?
> We specify the policy as follows?:
>
> on A: synchronous_standbys=B,C
> on B: synchronous_standbys=A,C
> on C: synchronous_standbys=A,B
>
> We would need to change the setting on both A and B when we want to
> change the name of the third standby from C to D, for example. No?

Sure.  If you give the standbys names, then if people change the
names, they'll have to update their configuration.  But I can't see
that as an argument against doing it.  You can remove the possibility
that someone will have a hassle if they rename a server by not
allowing them to give it a name in the first place, but that doesn't
seem like a win from a usability perspective.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Boszormenyi Zoltan
Marc Cousin wrote:
> The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote :
>   
>> I fixed this by adding CheckLockTimeout() function that works like
>> CheckStatementTimeout() and ensuring that the same start time is
>> used for both deadlock_timeout and lock_timeout if both are active.
>> The preference of errors if their timeout values are equal is:
>> statement_timeout > lock_timeout > deadlock_timeout
>> 
>
> As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
> work, with this new version.
>
> Keeping the deadlock_timeout to 1s, when lock_timeout >= 1001, 
> lock_timeout doesn't trigger anymore.
>   

I missed one case where lock_timeout_active should have been set
but the timer must not have been re-set; this caused the problem.
I blame the hot weather and having no air conditioning. The second is
now fixed. :-)

I also added one line in autovacuum.c to disable lock_timeout,
in case it's globally set in postgresql.conf, as per Alvaro's comment.

Also, I made sure that only one or two timeout causes (one of
deadlock_timeout and lock_timeout in the first case, or statement_timeout
plus one of the other two) can be active at a time. Previously I was able
to trigger a segfault with the default 1 sec deadlock_timeout and
lock_timeout = 999 or 1001: effectively, the system's clock resolution
made lock_timeout and deadlock_timeout equal, and RemoveFromWaitQueue()
was called twice. This way it's a lot more robust.

Best regards,
Zoltán Böszörményi

diff -dcrpN pgsql.orig/doc/src/sgml/config.sgml pgsql/doc/src/sgml/config.sgml
*** pgsql.orig/doc/src/sgml/config.sgml	2010-07-26 10:05:37.0 +0200
--- pgsql/doc/src/sgml/config.sgml	2010-07-29 11:58:56.0 +0200
*** COPY postgres_log FROM '/full/path/to/lo
*** 4479,4484 
--- 4479,4508 

   
  
+  
+   lock_timeout (integer)
+   
+lock_timeout configuration parameter
+   
+   
+
+ Abort any statement that tries to acquire a heavy-weight lock (e.g. rows,
+ pages, tables, indices or other objects) and the lock has to wait more
+ than the specified number of milliseconds, starting from the time the
+ command arrives at the server from the client.
+ If log_min_error_statement is set to ERROR or lower,
+ the statement that timed out will also be logged. A value of zero
+ (the default) turns off the limitation.
+
+ 
+
+ Setting lock_timeout in
+ postgresql.conf is not recommended because it
+ affects all sessions.
+  
+  
+  
+ 
   
vacuum_freeze_table_age (integer)

diff -dcrpN pgsql.orig/doc/src/sgml/ref/lock.sgml pgsql/doc/src/sgml/ref/lock.sgml
*** pgsql.orig/doc/src/sgml/ref/lock.sgml	2010-04-03 09:23:01.0 +0200
--- pgsql/doc/src/sgml/ref/lock.sgml	2010-07-29 11:58:56.0 +0200
*** LOCK [ TABLE ] [ ONLY ] NOWAIT is specified, LOCK
 TABLE does not wait to acquire the desired lock: if it
 cannot be acquired immediately, the command is aborted and an
!error is emitted.  Once obtained, the lock is held for the
!remainder of the current transaction.  (There is no UNLOCK
 TABLE command; locks are always released at transaction
 end.)

--- 39,49 
 NOWAIT is specified, LOCK
 TABLE does not wait to acquire the desired lock: if it
 cannot be acquired immediately, the command is aborted and an
!error is emitted. If lock_timeout is set to a value
!higher than 0, and the lock cannot be acquired under the specified
!timeout value in milliseconds, the command is aborted and an error
!is emitted. Once obtained, the lock is held for the remainder of  
!the current transaction.  (There is no UNLOCK
 TABLE command; locks are always released at transaction
 end.)

diff -dcrpN pgsql.orig/doc/src/sgml/ref/select.sgml pgsql/doc/src/sgml/ref/select.sgml
*** pgsql.orig/doc/src/sgml/ref/select.sgml	2010-06-20 13:59:13.0 +0200
--- pgsql/doc/src/sgml/ref/select.sgml	2010-07-29 11:58:56.0 +0200
*** FOR SHARE [ OF semNum;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = semop(sema->semId, &sops, 1);
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "semop(id=%d) failed: %m", sema->semId);
+ }
diff -dcrpN pgsql.orig/src/backend/port/win32_sema.c pgsql/src/backend/port/win32_sema.c
*** pgsql.orig/src/backend/port/win32_sema.c	2010-01-02 17:57:50.0 +0100
--- pgsql/src/backend/port/win32_sema.c	2010-07-29 11:58:56.0 +0200
***
*** 16,21 
--- 16,22 
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  static HANDLE *mySemSet;		/* IDs of sema s

Re: [HACKERS] Synchronous replication

2010-08-02 Thread Fujii Masao
On Mon, Aug 2, 2010 at 7:53 PM, Robert Haas  wrote:
> Let's not get *the manner of specifying the policy* confused with *the
> need to update the policy when the master changes*.  It doesn't seem
> likely you would want the same value for  synchronous_standbys on all
> your machines.  In the most common configuration, you'd probably have:
>
> on A: synchronous_standbys=B
> on B: synchronous_standbys=A

Oh, true. But, what if we have another synchronous standby called C?
We specify the policy as follows?:

on A: synchronous_standbys=B,C
on B: synchronous_standbys=A,C
on C: synchronous_standbys=A,B

We would need to change the setting on both A and B when we want to
change the name of the third standby from C to D, for example. No?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres as Historian

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 3:20 AM, Hardik Belani  wrote:
> We are using postgres as RDBMS for our product. There is a requirement
> coming for a feature which will require me to store data about various data
> points (mainly numbers) on a time scale. Data measurement is being taken
> every few secs/mins based and it is needed to be stored for statistical
> analysis. Now this requires numbers (integers/floats) to be stored at every
> mins.
>
> For this i can create a table with number and time (may be time offset
> instead of timestamp) as columns. But still it will require me to store huge
> number of rows in the order of few millions. Data is read only and only
> inserts can happen. But I need to perform all kinds of aggregation to get
> various statistics. for example: daily avg, monthly avg etc..
>
> We already are using postgres for our product so using postgres does not add
> any additional installation requirement and hence it is a bit easier.
>
> Would you recommend postgres for this kind of requirement, and will it
> provide the performance? Or do you recommend any other database meant
> for such requirements? I am also searching for a good historian database if
> postgres doesn't support this.

You can certainly use Postgres in this kind of environment, and I
have.  Of course, if the volume of data is higher than your hardware
can keep up with, then you're going to have problems.  Disabling
synchronous_commit may help, if losing the last few transactions is
acceptable in the event of a system crash; appropriate use of table
partitioning may help, too.  There are certainly databases out there
that are better optimized for this case (consider the rrd stuff built
into mrtg, for example), but they're also not as feature-rich, so it's
a trade-off.
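
For instance, a rough sketch of inheritance-based monthly partitioning (names
are made up; routing inserts to the right child, via a trigger or in the
application, is left out):

CREATE TABLE samples (
    ts    timestamptz NOT NULL,
    value double precision NOT NULL
);

CREATE TABLE samples_2010_08 (
    CHECK (ts >= '2010-08-01' AND ts < '2010-09-01')
) INHERITS (samples);

-- With constraint_exclusion enabled, queries constrained on ts only touch
-- the relevant children:
SELECT date_trunc('day', ts) AS day, avg(value)
FROM samples
WHERE ts >= '2010-08-01' AND ts < '2010-09-01'
GROUP BY 1 ORDER BY 1;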

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 5:02 AM, Fujii Masao  wrote:
> On Sun, Aug 1, 2010 at 9:51 PM, Robert Haas  wrote:
>> Perhaps someone will claim that nobody wants to do that anyway (which
>> I don't believe, BTW), but even in simpler cases it would be nicer to
>> have an explicit policy rather than - in effect - inferring a policy
>> from a soup of GUC settings.  For example, if you want one synchronous
>> standby (A) and two asynchronous standbys (B and C).  You can say
>> quorum=1 on the master and then configure vote=1 on A and vote=0 on B
>> and C, but now you have to look at four machines to figure out what
>> the policy is, and a change on any one of those machines can break it.
>>  ISTM that if you can just write synchronous_standbys=A on the master,
>> that's a whole lot more clear and less error-prone.
>
> Some standbys may become master later by failover. So we would
> need to write something like synchronous_standbys=A on not only
> current one master but also those standbys. Changing
> synchronous_standbys would require change on all those servers.
> Or the master should replicate even that change to the standbys?

Let's not get *the manner of specifying the policy* confused with *the
need to update the policy when the master changes*.  It doesn't seem
likely you would want the same value for  synchronous_standbys on all
your machines.  In the most common configuration, you'd probably have:

on A: synchronous_standbys=B
on B: synchronous_standbys=A

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Fujii Masao
On Sun, Aug 1, 2010 at 9:51 PM, Robert Haas  wrote:
> Perhaps someone will claim that nobody wants to do that anyway (which
> I don't believe, BTW), but even in simpler cases it would be nicer to
> have an explicit policy rather than - in effect - inferring a policy
> from a soup of GUC settings.  For example, if you want one synchronous
> standby (A) and two asynchronous standbys (B and C).  You can say
> quorum=1 on the master and then configure vote=1 on A and vote=0 on B
> and C, but now you have to look at four machines to figure out what
> the policy is, and a change on any one of those machines can break it.
>  ISTM that if you can just write synchronous_standbys=A on the master,
> that's a whole lot more clear and less error-prone.

Some standbys may become master later by failover. So we would
need to write something like synchronous_standbys=A on not only
current one master but also those standbys. Changing
synchronous_standbys would require change on all those servers.
Or the master should replicate even that change to the standbys?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Fujii Masao
On Sun, Aug 1, 2010 at 3:11 PM, Heikki Linnakangas
 wrote:
> I don't think any of this quorum stuff makes much sense without explicitly
> registering standbys in the master.

I'm not sure if this is a good idea. This requires users to do more
manual operations than ever when setting up the replication; assign
unique name (or ID) to each standby, register them in the master,
specify the names in each recovery.conf (or elsewhere), and remove
the registration from the master when getting rid of the standby.

But this is similar to the way of MySQL replication setup, so some
people (excluding me) may be familiar with it.

> That would also solve the fuzziness with wal_keep_segments - if the master
> knew what standbys exist, it could keep track of how far each standby has
> received WAL, and keep just enough WAL for each standby to catch up.

What if the registered standby stays down for a long time?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-02 Thread Pavel Stehule
Hello

2010/8/2 Mike Fowler :
> Hi Pavel,
>
> Currently your patch isn't applying to head, from the looks of things a
> function signature has changed. Can you update your patch please?
>

yes - see attachment

> Also, having had a read through the patch itself I note that there are no
> tests and no changes to documentation. Shouldn't the documentation advertise
> that there are no limits on the number of parameters? A couple of tests
> will also help me to test your patch.
>

the limit of 10 parameters was not documented. With this patch, there
is no limit - or rather, the limit comes from libxml2.

Tests are a problem for me - I don't understand xslt too much - so
I am not able to write an xslt template with more than 10 params.

I looked there and the params are not tested now - so it is
necessary to write a new set of regression tests. But I am not able to do
it :(.
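
An untested sketch of such a test, in case it helps (parameter names and
values are made up; any number of xsl:param elements can be declared, so more
than 10 is just more of the same):

SELECT xslt_process('<doc/>'::text,
$$<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="n1"/><xsl:param name="n2"/><xsl:param name="n3"/>
  <xsl:param name="n4"/><xsl:param name="n5"/><xsl:param name="n6"/>
  <xsl:param name="n7"/><xsl:param name="n8"/><xsl:param name="n9"/>
  <xsl:param name="n10"/><xsl:param name="n11"/><xsl:param name="n12"/>
  <xsl:template match="/">
    <out><xsl:value-of
      select="concat($n1,$n2,$n3,$n4,$n5,$n6,$n7,$n8,$n9,$n10,$n11,$n12)"/></out>
  </xsl:template>
</xsl:stylesheet>$$::text,
'n1=1,n2=2,n3=3,n4=4,n5=5,n6=6,n7=7,n8=8,n9=9,n10=10,n11=11,n12=12'::text);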


> Below is the results of running patch:
>
> patch -p0 < ../nolimits.diff
> patching file ./contrib/xml2/xslt_proc.c
> Hunk #1 FAILED at 42.
> Hunk #2 succeeded at 57 (offset -2 lines).
> Hunk #3 succeeded at 69 (offset -2 lines).
> Hunk #4 succeeded at 142 (offset -4 lines).
> Hunk #5 succeeded at 179 (offset -4 lines).
> Hunk #6 succeeded at 192 with fuzz 1 (offset -4 lines).
> 1 out of 6 hunks FAILED -- saving rejects to file
> ./contrib/xml2/xslt_proc.c.rej
>
> The rejects were:
>
>
> *** ./contrib/xml2/xslt_proc.c.orig     2010-03-03 20:10:22.0 +0100
> --- ./contrib/xml2/xslt_proc.c  2010-05-03 15:07:17.010918303 +0200
> ***
> *** 42,50 
>  extern void pgxml_parser_init(void);
>
>  /* local defs */
> ! static void parse_params(const char **params, text *paramstr);
>
> ! #define MAXPARAMS 20                  /* must be even, see parse_params()
> */
>
>  #endif /* USE_LIBXSLT */
>
> --- 42,51 
>  extern void pgxml_parser_init(void);
>
>  /* local defs */
> ! const char **parse_params(text *paramstr);
>
> ! #define INIT_PARAMS 20                        /* must be even, see
> parse_params() */
> ! #define EXTEND_PARAMS 20              /* must be even, see parse_params()
> */
>
>  #endif /* USE_LIBXSLT */
>
>
> Regards,
> --
> Mike Fowler
> Registered Linux user: 379787
>

Regards

Pavel Stehule
*** ./contrib/xml2/xslt_proc.c.orig	2010-07-06 21:18:55.0 +0200
--- ./contrib/xml2/xslt_proc.c	2010-08-02 09:31:32.410911283 +0200
***
*** 41,49 
  extern void pgxml_parser_init(void);
  
  /* local defs */
! static void parse_params(const char **params, text *paramstr);
  
! #define MAXPARAMS 20			/* must be even, see parse_params() */
  #endif   /* USE_LIBXSLT */
  
  
--- 41,50 
  extern void pgxml_parser_init(void);
  
  /* local defs */
! const char **parse_params(text *paramstr);
  
! #define INIT_PARAMS 20			/* must be even, see parse_params() */
! #define EXTEND_PARAMS 20		/* must be even, see parse_params() */
  #endif   /* USE_LIBXSLT */
  
  
***
*** 57,63 
  	text	   *doct = PG_GETARG_TEXT_P(0);
  	text	   *ssheet = PG_GETARG_TEXT_P(1);
  	text	   *paramstr;
! 	const char *params[MAXPARAMS + 1];	/* +1 for the terminator */
  	xsltStylesheetPtr stylesheet = NULL;
  	xmlDocPtr	doctree;
  	xmlDocPtr	restree;
--- 58,64 
  	text	   *doct = PG_GETARG_TEXT_P(0);
  	text	   *ssheet = PG_GETARG_TEXT_P(1);
  	text	   *paramstr;
! 	const char **params;
  	xsltStylesheetPtr stylesheet = NULL;
  	xmlDocPtr	doctree;
  	xmlDocPtr	restree;
***
*** 69,79 
  	if (fcinfo->nargs == 3)
  	{
  		paramstr = PG_GETARG_TEXT_P(2);
! 		parse_params(params, paramstr);
  	}
  	else
  		/* No parameters */
  		params[0] = NULL;
  
  	/* Setup parser */
  	pgxml_parser_init();
--- 70,83 
  	if (fcinfo->nargs == 3)
  	{
  		paramstr = PG_GETARG_TEXT_P(2);
! 		params = parse_params(paramstr);
  	}
  	else
+ 	{
  		/* No parameters */
+ 		params = palloc(sizeof(char *));
  		params[0] = NULL;
+ 	}
  
  	/* Setup parser */
  	pgxml_parser_init();
***
*** 139,160 
  
  #ifdef USE_LIBXSLT
  
! static void
! parse_params(const char **params, text *paramstr)
  {
  	char	   *pos;
  	char	   *pstr;
- 	int			i;
  	char	   *nvsep = "=";
  	char	   *itsep = ",";
! 
  	pstr = text_to_cstring(paramstr);
  
  	pos = pstr;
! 
! 	for (i = 0; i < MAXPARAMS; i++)
  	{
! 		params[i] = pos;
  		pos = strstr(pos, nvsep);
  		if (pos != NULL)
  		{
--- 143,175 
  
  #ifdef USE_LIBXSLT
  
! const char **
! parse_params(text *paramstr)
  {
  	char	   *pos;
  	char	   *pstr;
  	char	   *nvsep = "=";
  	char	   *itsep = ",";
! 	const char **params;
! 	int	nparams;
! 	int	mparams;		/* max params */
! 	
  	pstr = text_to_cstring(paramstr);
+ 	
+ 	mparams = INIT_PARAMS;
+ 	params = (const char **) palloc(INIT_PARAMS * sizeof(char *) + 1);
  
  	pos = pstr;
! 	nparams = 0;
! 	while (*pos != '\0')
  	{
! 		if (nparams >= mparams)
! 		{
! 			/* extend params params */
! 			mparams += EXTEND_PARAMS;
! 			params = (const char **) repalloc(params, mparams * sizeof(char *) + 1);
! 		}
! 		para

Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Markus Wanner

Hi,

On 08/01/2010 08:04 PM, Sushant Sinha wrote:

1. We do not have separate tokens "wikipedia" and "org"
2. If we have the two tokens we should have them at adjacent position so
that a phrase search for "wikipedia org" should work.


This would needlessly increase the number of tokens. Instead you'd 
better make it work like compound word support, having just "wikipedia" 
and "org" as tokens.


Searching for "wikipedia.org" or "wikipedia org" should then result in 
the same search query with the two tokens: "wikipedia" and "org".



position 0: WORD(wikipedia), URL(wikipedia.org/search?q=sushant)


IMO the differentiation between WORDs and URLs is not something the text 
search engine should have to take care a lot. Let it just do the 
searching and make it do that well.


What does a token "wikipedia.org/search?q=sushant" buy you in terms of 
text searching? Or even result highlighting? I wouldn't expect anybody 
to want to search for a full URL, do you?


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Postgres as Historian

2010-08-02 Thread Hardik Belani
We are using postgres as the RDBMS for our product. There is a requirement
coming for a feature which will require me to store data about various data
points (mainly numbers) on a time scale. Measurements are taken every few
seconds/minutes and need to be stored for statistical analysis. So this
requires numbers (integers/floats) to be stored every minute.

For this I can create a table with number and time (maybe a time offset
instead of a timestamp) as columns. But it will still require me to store a
huge number of rows, on the order of a few million. Data is read-only and
only inserts can happen. But I need to perform all kinds of aggregation to
get various statistics, for example daily averages, monthly averages, etc.

We already use postgres for our product, so using postgres does not add
any additional installation requirement and hence it is a bit easier.

Would you recommend postgres for this kind of requirement, and will it
provide the performance? Or do you recommend any other database meant
for such requirements? I am also searching for a good historian database if
postgres doesn't support this.


Thanks,
Hardik