Re: [HACKERS] review: xml_is_well_formed

2010-08-02 Thread Pavel Stehule
2010/7/31 Robert Haas robertmh...@gmail.com:
 On Sat, Jul 31, 2010 at 8:10 AM, Peter Eisentraut pete...@gmx.net wrote:
 On fre, 2010-07-30 at 12:50 +0100, Mike Fowler wrote:
  * xml_is_well_formed returns true for simple text
 
  postgres=# SELECT xml_is_well_formed('');
    xml_is_well_formed
  
    t
  (1 row)
 
  it is probably wrong result - is it ok??
 

 Yes this is OK, pure text is valid XML content.

 Are you speaking of XML content fragments that SQL/XML defines?

 Well-formedness should probably only allow XML documents.

 I think the point of this function is to determine whether a cast to
 xml will throw an error.  The behavior should probably match exactly
 whatever test would be applied there.

I agree with this idea - so I am able to do:

postgres=# select 'xxx'::xml;
 xml
-
 xxx
(1 row)

I have not any suggestions now - so I'll change flag to ready to commit

Regards

Pavel Stehule


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation, proof of concept

2010-08-02 Thread Jaime Casanova
On Tue, Jul 13, 2010 at 1:25 PM, Peter Eisentraut pete...@gmx.net wrote:
 Here is a proof of concept for per-column collation support.


Hi,

i was looking at this.

nowadays, CREATE DATABASE has a lc_collate clause. is the new collate
clause similar as the lc_collate?
i mean, is lc_collate what we will use as a default?

if yes, then probably we need to use pg_collation there too because
lc_collate and the new collate clause use different collation names.

postgres=# create database test with lc_collate 'en_US.UTF-8';
CREATE DATABASE
test=# create table t1 (col1 text collate en_US.UTF-8);
ERROR:  collation en_US.UTF-8 does not exist
test=# create table t1 (col1 text collate en_US.utf8);
CREATE TABLE


also i got errors from regression tests when MULTIBYTE=UTF8
(attached). it seems i was trying to create locales that weren't
defined on locales.txt (from were was fed that file?). i added a line
to that file (for es_EC.utf8) then i create a table with a column
using that collate and execute select * from t2 where col1  'n'; 
and i got this error: ERROR:  could not create locale es_EC.utf8
(of course, that last part was me messing the things up, but it show
we shouldn't be using a file locales.txt, i think)

i can attach a collate to a domain but i can't see where are we
storing that info (actually it says it's not collatable):

-- 
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL


regression.diffs
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (9.1) btree_gist support for searching on not equals

2010-08-02 Thread Jeff Davis
On Sun, 2010-08-01 at 21:57 -0400, Robert Haas wrote:
 On Fri, Jul 16, 2010 at 1:19 AM, Jeff Davis pg...@j-davis.com wrote:
  Thank you for the review.
 
  On Mon, 2010-07-12 at 17:17 +0900, Itagaki Takahiro wrote:
  (1) Exclusion constraints support for operators where x operator x
  is false (tiny patch)
  https://commitfest.postgresql.org/action/patch_view?id=307
  (2) btree_gist support for searching on  (not equals)
  https://commitfest.postgresql.org/action/patch_view?id=308
 
  Those patches should be committed at once because (2) requires (1) to work
  with EXCLUDE constraints. Also, (1) has no benefits without (2) because we
  have no use cases for  as an index-able operator. Both patches are very
  simple and small, and worked as expected both WHERE  and EXCLUDE
  constraints cases.
 
  It appears that Tom already committed (1).
 
  I'd like to ask you to write additional documentation about btree_gist [1]
  that the module will be more useful when it is used with exclusion
  constraints together. Without documentation, no users find the usages.
 
  Good idea, new patch attached.
 
 It seems pretty odd to define a constant called
 BTNotEqualStrategyNumber in contrib/btree_gist.  Shouldn't we either
 call this something else, or define it in access/skey.h?  Considering
 that there seem to be some interesting gymnastics being done with
 BTMaxStrategyNumber, I'd vote for the former.  Maybe just
 BtreeGistNotEqualStrategyNumber?

Sounds good to me.

At some point we may be interested to add this to BTree, as well. But we
can cross that bridge when we come to it.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: xml_is_well_formed

2010-08-02 Thread Pavel Stehule
2010/8/2 Pavel Stehule pavel.steh...@gmail.com:
 2010/7/31 Robert Haas robertmh...@gmail.com:
 On Sat, Jul 31, 2010 at 8:10 AM, Peter Eisentraut pete...@gmx.net wrote:
 On fre, 2010-07-30 at 12:50 +0100, Mike Fowler wrote:
  * xml_is_well_formed returns true for simple text
 
  postgres=# SELECT xml_is_well_formed('');
    xml_is_well_formed
  
    t
  (1 row)
 
  it is probably wrong result - is it ok??
 

 Yes this is OK, pure text is valid XML content.

 Are you speaking of XML content fragments that SQL/XML defines?

 Well-formedness should probably only allow XML documents.

 I think the point of this function is to determine whether a cast to
 xml will throw an error.  The behavior should probably match exactly
 whatever test would be applied there.

 I agree with this idea - so I am able to do:

 postgres=# select 'xxx'::xml;
  xml
 -
  xxx
 (1 row)

 I have not any suggestions now - so I'll change flag to ready to commit

sorry - contrib module should be a fixed

patch attached



 Regards

 Pavel Stehule


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company


*** ./contrib/xml2/pgxml.sql.in.orig	2010-03-01 19:07:59.0 +0100
--- ./contrib/xml2/pgxml.sql.in	2010-08-02 08:40:27.885789791 +0200
***
*** 5,18 
  
  --SQL for XML parser
  
- CREATE OR REPLACE FUNCTION xml_is_well_formed(text) RETURNS bool
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT IMMUTABLE;
- 
  -- deprecated old name for xml_is_well_formed
  CREATE OR REPLACE FUNCTION xml_valid(text) RETURNS bool
! AS 'MODULE_PATHNAME', 'xml_is_well_formed'
! LANGUAGE C STRICT IMMUTABLE;
  
  CREATE OR REPLACE FUNCTION xml_encode_special_chars(text) RETURNS text
  AS 'MODULE_PATHNAME'
--- 5,14 
  
  --SQL for XML parser
  
  -- deprecated old name for xml_is_well_formed
  CREATE OR REPLACE FUNCTION xml_valid(text) RETURNS bool
! AS 'xml_is_well_formed'
! LANGUAGE INTERNAL STRICT IMMUTABLE;
  
  CREATE OR REPLACE FUNCTION xml_encode_special_chars(text) RETURNS text
  AS 'MODULE_PATHNAME'
*** ./contrib/xml2/uninstall_pgxml.sql.orig	2007-11-13 05:24:29.0 +0100
--- ./contrib/xml2/uninstall_pgxml.sql	2010-08-02 08:38:40.134785878 +0200
***
*** 29,33 
  
  -- deprecated old name for xml_is_well_formed
  DROP FUNCTION xml_valid(text);
- 
- DROP FUNCTION xml_is_well_formed(text);
--- 29,31 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Postgres as Historian

2010-08-02 Thread Hardik Belani
We are using postgres as RDBMS for our product. There is a requirement
coming for a feature which will require me to store data about various data
points (mainly numbers) on a time scale. Data measurement is being taken
every few secs/mins based and it is needed to be stored for statistical
analysis. Now this requires numbers (integers/floats) to be stored at every
mins.

For this i can create a table with number and time (may be time offset
instead of timestamp) as columns. But still it will require me to store huge
number of rows in the order of few millions. Data is read only and only
inserts can happen. But I need to perform all kinds of aggregation to get
various statistics. for example: daily avg, monthly avg etc..

We already are using postgres for our product so using postgres does not add
any additional installation requirement and hence it is a bit easier.

Would you recommand postgres for this kind of requirement and will be
provide the performance. OR do you recommand any other database meant
for such requirements. I am also searching for a good historian database if
postgres doesn't suppport.


Thanks,
Hardik


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Markus Wanner

Hi,

On 08/01/2010 08:04 PM, Sushant Sinha wrote:

1. We do not have separate tokens wikipedia and org
2. If we have the two tokens we should have them at adjacent position so
that a phrase search for wikipedia org should work.


This would needlessly increase the number of tokens. Instead you'd 
better make it work like compound word support, having just wikipedia 
and org as tokens.


Searching for wikipedia.org or wikipedia org should then result in 
the same search query with the two tokens: wikipedia and org.



position 0: WORD(wikipedia), URL(wikipedia.org/search?q=sushant)


IMO the differentiation between WORDs and URLs is not something the text 
search engine should have to take care a lot. Let it just do the 
searching and make it do that well.


What does a token wikipedia.org/search?q=sushant buy you in terms of 
text searching? Or even result highlighting? I wouldn't expect anybody 
to want to search for a full URL, do you?


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-02 Thread Pavel Stehule
Hello

2010/8/2 Mike Fowler m...@mlfowler.com:
 Hi Pavel,

 Currently your patch isn't applying to head, from the looks of things a
 function signature has changed. Can you update your patch please?


yes - see attachment

 Also, having had a read through the patch itself I note that there are no
 tests and no changes to documentation. Shouldn't the documentation advertise
 that the there are no limits on the numbers of parameters? A couple of tests
 will also help me to test your patch,


the limit of 10 parameters was not documented. With this patch, there
are not limit - or limit comes from libxml2.

Test are a problem - for me - I don't understand to xslt too much - so
I am not able to write xslt template with more than 10 params.

I looked there and I the params are not tested now - so it is
necessary to write a new set of regress tests. But I am not able to do
it :(.


 Below is the results of running patch:

 patch -p0  ../nolimits.diff
 patching file ./contrib/xml2/xslt_proc.c
 Hunk #1 FAILED at 42.
 Hunk #2 succeeded at 57 (offset -2 lines).
 Hunk #3 succeeded at 69 (offset -2 lines).
 Hunk #4 succeeded at 142 (offset -4 lines).
 Hunk #5 succeeded at 179 (offset -4 lines).
 Hunk #6 succeeded at 192 with fuzz 1 (offset -4 lines).
 1 out of 6 hunks FAILED -- saving rejects to file
 ./contrib/xml2/xslt_proc.c.rej

 The rejects were:


 *** ./contrib/xml2/xslt_proc.c.orig     2010-03-03 20:10:22.0 +0100
 --- ./contrib/xml2/xslt_proc.c  2010-05-03 15:07:17.010918303 +0200
 ***
 *** 42,50 
  extern void pgxml_parser_init(void);

  /* local defs */
 ! static void parse_params(const char **params, text *paramstr);

 ! #define MAXPARAMS 20                  /* must be even, see parse_params()
 */

  #endif /* USE_LIBXSLT */

 --- 42,51 
  extern void pgxml_parser_init(void);

  /* local defs */
 ! const char **parse_params(text *paramstr);

 ! #define INIT_PARAMS 20                        /* must be even, see
 parse_params() */
 ! #define EXTEND_PARAMS 20              /* must be even, see parse_params()
 */

  #endif /* USE_LIBXSLT */


 Regards,
 --
 Mike Fowler
 Registered Linux user: 379787


Regards

Pavel Stehule
*** ./contrib/xml2/xslt_proc.c.orig	2010-07-06 21:18:55.0 +0200
--- ./contrib/xml2/xslt_proc.c	2010-08-02 09:31:32.410911283 +0200
***
*** 41,49 
  extern void pgxml_parser_init(void);
  
  /* local defs */
! static void parse_params(const char **params, text *paramstr);
  
! #define MAXPARAMS 20			/* must be even, see parse_params() */
  #endif   /* USE_LIBXSLT */
  
  
--- 41,50 
  extern void pgxml_parser_init(void);
  
  /* local defs */
! const char **parse_params(text *paramstr);
  
! #define INIT_PARAMS 20			/* must be even, see parse_params() */
! #define EXTEND_PARAMS 20		/* must be even, see parse_params() */
  #endif   /* USE_LIBXSLT */
  
  
***
*** 57,63 
  	text	   *doct = PG_GETARG_TEXT_P(0);
  	text	   *ssheet = PG_GETARG_TEXT_P(1);
  	text	   *paramstr;
! 	const char *params[MAXPARAMS + 1];	/* +1 for the terminator */
  	xsltStylesheetPtr stylesheet = NULL;
  	xmlDocPtr	doctree;
  	xmlDocPtr	restree;
--- 58,64 
  	text	   *doct = PG_GETARG_TEXT_P(0);
  	text	   *ssheet = PG_GETARG_TEXT_P(1);
  	text	   *paramstr;
! 	const char **params;
  	xsltStylesheetPtr stylesheet = NULL;
  	xmlDocPtr	doctree;
  	xmlDocPtr	restree;
***
*** 69,79 
  	if (fcinfo-nargs == 3)
  	{
  		paramstr = PG_GETARG_TEXT_P(2);
! 		parse_params(params, paramstr);
  	}
  	else
  		/* No parameters */
  		params[0] = NULL;
  
  	/* Setup parser */
  	pgxml_parser_init();
--- 70,83 
  	if (fcinfo-nargs == 3)
  	{
  		paramstr = PG_GETARG_TEXT_P(2);
! 		params = parse_params(paramstr);
  	}
  	else
+ 	{
  		/* No parameters */
+ 		params = palloc(sizeof(char *));
  		params[0] = NULL;
+ 	}
  
  	/* Setup parser */
  	pgxml_parser_init();
***
*** 139,160 
  
  #ifdef USE_LIBXSLT
  
! static void
! parse_params(const char **params, text *paramstr)
  {
  	char	   *pos;
  	char	   *pstr;
- 	int			i;
  	char	   *nvsep = =;
  	char	   *itsep = ,;
! 
  	pstr = text_to_cstring(paramstr);
  
  	pos = pstr;
! 
! 	for (i = 0; i  MAXPARAMS; i++)
  	{
! 		params[i] = pos;
  		pos = strstr(pos, nvsep);
  		if (pos != NULL)
  		{
--- 143,175 
  
  #ifdef USE_LIBXSLT
  
! const char **
! parse_params(text *paramstr)
  {
  	char	   *pos;
  	char	   *pstr;
  	char	   *nvsep = =;
  	char	   *itsep = ,;
! 	const char **params;
! 	int	nparams;
! 	int	mparams;		/* max params */
! 	
  	pstr = text_to_cstring(paramstr);
+ 	
+ 	mparams = INIT_PARAMS;
+ 	params = (const char **) palloc(INIT_PARAMS * sizeof(char *) + 1);
  
  	pos = pstr;
! 	nparams = 0;
! 	while (*pos != '\0')
  	{
! 		if (nparams = mparams)
! 		{
! 			/* extend params params */
! 			mparams += EXTEND_PARAMS;
! 			params = (const char **) repalloc(params, mparams * sizeof(char *) + 1);
! 		}
! 		params[nparams++] = pos;
  		pos = strstr(pos, nvsep);
  		

Re: [HACKERS] Synchronous replication

2010-08-02 Thread Fujii Masao
On Sun, Aug 1, 2010 at 3:11 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I don't think any of this quorum stuff makes much sense without explicitly
 registering standbys in the master.

I'm not sure if this is a good idea. This requires users to do more
manual operations than ever when setting up the replication; assign
unique name (or ID) to each standby, register them in the master,
specify the names in each recovery.conf (or elsewhere), and remove
the registration from the master when getting rid of the standby.

But this is similar to the way of MySQL replication setup, so some
people (excluding me) may be familiar with it.

 That would also solve the fuzziness with wal_keep_segments - if the master
 knew what standbys exist, it could keep track of how far each standby has
 received WAL, and keep just enough WAL for each standby to catch up.

What if the registered standby stays down for a long time?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Fujii Masao
On Sun, Aug 1, 2010 at 9:51 PM, Robert Haas robertmh...@gmail.com wrote:
 Perhaps someone will claim that nobody wants to do that anyway (which
 I don't believe, BTW), but even in simpler cases it would be nicer to
 have an explicit policy rather than - in effect - inferring a policy
 from a soup of GUC settings.  For example, if you want one synchronous
 standby (A) and two asynchronous standbys (B and C).  You can say
 quorum=1 on the master and then configure vote=1 on A and vote=0 on B
 and C, but now you have to look at four machines to figure out what
 the policy is, and a change on any one of those machines can break it.
  ISTM that if you can just write synchronous_standbys=A on the master,
 that's a whole lot more clear and less error-prone.

Some standbys may become master later by failover. So we would
need to write something like synchronous_standbys=A on not only
current one master but also those standbys. Changing
synchronous_standbys would require change on all those servers.
Or the master should replicate even that change to the standbys?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 5:02 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sun, Aug 1, 2010 at 9:51 PM, Robert Haas robertmh...@gmail.com wrote:
 Perhaps someone will claim that nobody wants to do that anyway (which
 I don't believe, BTW), but even in simpler cases it would be nicer to
 have an explicit policy rather than - in effect - inferring a policy
 from a soup of GUC settings.  For example, if you want one synchronous
 standby (A) and two asynchronous standbys (B and C).  You can say
 quorum=1 on the master and then configure vote=1 on A and vote=0 on B
 and C, but now you have to look at four machines to figure out what
 the policy is, and a change on any one of those machines can break it.
  ISTM that if you can just write synchronous_standbys=A on the master,
 that's a whole lot more clear and less error-prone.

 Some standbys may become master later by failover. So we would
 need to write something like synchronous_standbys=A on not only
 current one master but also those standbys. Changing
 synchronous_standbys would require change on all those servers.
 Or the master should replicate even that change to the standbys?

Let's not get *the manner of specifying the policy* confused with *the
need to update the policy when the master changes*.  It doesn't seem
likely you would want the same value for  synchronous_standbys on all
your machines.  In the most common configuration, you'd probably have:

on A: synchronous_standbys=B
on B: synchronous_standbys=A

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres as Historian

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 3:20 AM, Hardik Belani hardikbel...@gmail.com wrote:
 We are using postgres as RDBMS for our product. There is a requirement
 coming for a feature which will require me to store data about various data
 points (mainly numbers) on a time scale. Data measurement is being taken
 every few secs/mins based and it is needed to be stored for statistical
 analysis. Now this requires numbers (integers/floats) to be stored at every
 mins.

 For this i can create a table with number and time (may be time offset
 instead of timestamp) as columns. But still it will require me to store huge
 number of rows in the order of few millions. Data is read only and only
 inserts can happen. But I need to perform all kinds of aggregation to get
 various statistics. for example: daily avg, monthly avg etc..

 We already are using postgres for our product so using postgres does not add
 any additional installation requirement and hence it is a bit easier.

 Would you recommand postgres for this kind of requirement and will be
 provide the performance. OR do you recommand any other database meant
 for such requirements. I am also searching for a good historian database if
 postgres doesn't suppport.

You can certainly use Postgres in this kind of environment, and I
have.  Of course, if the volume of data is higher than your hardware
can keep up with, then you're going to have problems.  Disabling
synchronous_commit may help, if losing the last few transactions is
acceptable in the event of a system crash; appropriate use of table
partitioning may help, too.  There are certainly databases out there
that are better optimized for this case (consider the rrd stuff built
into mrtg, for example), but they're also not as feature-rich, so it's
a trade-off.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Fujii Masao
On Mon, Aug 2, 2010 at 7:53 PM, Robert Haas robertmh...@gmail.com wrote:
 Let's not get *the manner of specifying the policy* confused with *the
 need to update the policy when the master changes*.  It doesn't seem
 likely you would want the same value for  synchronous_standbys on all
 your machines.  In the most common configuration, you'd probably have:

 on A: synchronous_standbys=B
 on B: synchronous_standbys=A

Oh, true. But, what if we have another synchronous standby called C?
We specify the policy as follows?:

on A: synchronous_standbys=B,C
on B: synchronous_standbys=A,C
on C: synchronous_standbys=A,B

We would need to change the setting on both A and B when we want to
change the name of the third standby from C to D, for example. No?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Boszormenyi Zoltan
Marc Cousin írta:
 The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote :
   
 I fixed this by adding CheckLockTimeout() function that works like
 CheckStatementTimeout() and ensuring that the same start time is
 used for both deadlock_timeout and lock_timeout if both are active.
 The preference of errors if their timeout values are equal is:
 statement_timeout  lock_timeout  deadlock_timeout
 

 As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
 work, with this new version.

 Keeping the deadlock_timeout to 1s, when lock_timeout = 1001, 
 lock_timeout doesn't trigger anymore.
   

I missed one case when the lock_timeout_active should have been set
but the timer must not have been re-set, this caused the problem.
I blame the hot weather and having no air conditioning. The second is
now fixed. :-)

I also added one line in autovacuum.c to disable lock_timeout,
in case it's globally set in postgresq.conf as per Alvaro's comment.

Also, I made sure that only one or two timeout causes (one of
deadlock_timeout
and lock_timeout in the first case or statement_timeout plus one of the
other two)
can be active at a time. Previously I was able to trigger a segfault
with the default
1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
system's
clock resolution makes the lock_timeout and deadlock_timeout equal and
RemoveFromWaitQueue() was called twice. This way it's a lot more robust.

Best regards,
Zoltán Böszörményi

diff -dcrpN pgsql.orig/doc/src/sgml/config.sgml pgsql/doc/src/sgml/config.sgml
*** pgsql.orig/doc/src/sgml/config.sgml	2010-07-26 10:05:37.0 +0200
--- pgsql/doc/src/sgml/config.sgml	2010-07-29 11:58:56.0 +0200
*** COPY postgres_log FROM '/full/path/to/lo
*** 4479,4484 
--- 4479,4508 
/listitem
   /varlistentry
  
+  varlistentry id=guc-lock-timeout xreflabel=lock_timeout
+   termvarnamelock_timeout/varname (typeinteger/type)/term
+   indexterm
+primaryvarnamelock_timeout/ configuration parameter/primary
+   /indexterm
+   listitem
+para
+ Abort any statement that tries to acquire a heavy-weight lock (e.g. rows,
+ pages, tables, indices or other objects) and the lock has to wait more
+ than the specified number of milliseconds, starting from the time the
+ command arrives at the server from the client.
+ If varnamelog_min_error_statement/ is set to literalERROR/ or lower,
+ the statement that timed out will also be logged. A value of zero
+ (the default) turns off the limitation.
+/para
+ 
+para
+ Setting varnamelock_timeout/ in
+ filenamepostgresql.conf/ is not recommended because it
+ affects all sessions.
+/para  
+   /listitem   
+  /varlistentry
+ 
   varlistentry id=guc-vacuum-freeze-table-age xreflabel=vacuum_freeze_table_age
termvarnamevacuum_freeze_table_age/varname (typeinteger/type)/term
indexterm
diff -dcrpN pgsql.orig/doc/src/sgml/ref/lock.sgml pgsql/doc/src/sgml/ref/lock.sgml
*** pgsql.orig/doc/src/sgml/ref/lock.sgml	2010-04-03 09:23:01.0 +0200
--- pgsql/doc/src/sgml/ref/lock.sgml	2010-07-29 11:58:56.0 +0200
*** LOCK [ TABLE ] [ ONLY ] replaceable cla
*** 39,46 
 literalNOWAIT/literal is specified, commandLOCK
 TABLE/command does not wait to acquire the desired lock: if it
 cannot be acquired immediately, the command is aborted and an
!error is emitted.  Once obtained, the lock is held for the
!remainder of the current transaction.  (There is no commandUNLOCK
 TABLE/command command; locks are always released at transaction
 end.)
/para
--- 39,49 
 literalNOWAIT/literal is specified, commandLOCK
 TABLE/command does not wait to acquire the desired lock: if it
 cannot be acquired immediately, the command is aborted and an
!error is emitted. If varnamelock_timeout/varname is set to a value
!higher than 0, and the lock cannot be acquired under the specified
!timeout value in milliseconds, the command is aborted and an error
!is emitted. Once obtained, the lock is held for the remainder of  
!the current transaction.  (There is no commandUNLOCK
 TABLE/command command; locks are always released at transaction
 end.)
/para
diff -dcrpN pgsql.orig/doc/src/sgml/ref/select.sgml pgsql/doc/src/sgml/ref/select.sgml
*** pgsql.orig/doc/src/sgml/ref/select.sgml	2010-06-20 13:59:13.0 +0200
--- pgsql/doc/src/sgml/ref/select.sgml	2010-07-29 11:58:56.0 +0200
*** FOR SHARE [ OF replaceable class=param
*** 1160,1165 
--- 1160,1173 
 /para
  
 para
+ If literalNOWAIT/ option is not specified and varnamelock_timeout/varname
+ is set to a value higher than 0, and the lock needs to wait more than
+ the specified value in milliseconds, the command reports an error after
+ timing out, 

Re: [HACKERS] Synchronous replication

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 7:06 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Aug 2, 2010 at 7:53 PM, Robert Haas robertmh...@gmail.com wrote:
 Let's not get *the manner of specifying the policy* confused with *the
 need to update the policy when the master changes*.  It doesn't seem
 likely you would want the same value for  synchronous_standbys on all
 your machines.  In the most common configuration, you'd probably have:

 on A: synchronous_standbys=B
 on B: synchronous_standbys=A

 Oh, true. But, what if we have another synchronous standby called C?
 We specify the policy as follows?:

 on A: synchronous_standbys=B,C
 on B: synchronous_standbys=A,C
 on C: synchronous_standbys=A,B

 We would need to change the setting on both A and B when we want to
 change the name of the third standby from C to D, for example. No?

Sure.  If you give the standbys names, then if people change the
names, they'll have to update their configuration.  But I can't see
that as an argument against doing it.  You can remove the possibility
that someone will have a hassle if they rename a server by not
allowing them to give it a name in the first place, but that doesn't
seem like a win from a usability perspective.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta:
 Marc Cousin írta:
   
 The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote :
   
 
 I fixed this by adding CheckLockTimeout() function that works like
 CheckStatementTimeout() and ensuring that the same start time is
 used for both deadlock_timeout and lock_timeout if both are active.
 The preference of errors if their timeout values are equal is:
 statement_timeout  lock_timeout  deadlock_timeout
 
   
 As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
 work, with this new version.

 Keeping the deadlock_timeout to 1s, when lock_timeout = 1001, 
 lock_timeout doesn't trigger anymore.
   
 

 I missed one case when the lock_timeout_active should have been set
 but the timer must not have been re-set, this caused the problem.
 I blame the hot weather and having no air conditioning. The second is
 now fixed. :-)

 I also added one line in autovacuum.c to disable lock_timeout,
 in case it's globally set in postgresq.conf as per Alvaro's comment.

 Also, I made sure that only one or two timeout causes (one of
 deadlock_timeout
 and lock_timeout in the first case or statement_timeout plus one of the
 other two)
 can be active at a time.

A little clarification is needed. The above statement is not entirely true.
There can be a case when all three timeout causes can be active, but only
when deadlock_timeout is the smallest of the three. If the fin_time value
for the another timeout cause is smaller than for deadlock_timeout then
there's no point to make deadlock_timeout active. And in case
deadlock_timeout
triggers and CheckDeadLock() finds that there really is a deadlock then the
possibly active lock_timeout gets disabled to avoid calling
RemoveFromWaitQueue() twice. I hope the comments in the code explain it
well.

  Previously I was able to trigger a segfault
 with the default
 1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
 system's
 clock resolution makes the lock_timeout and deadlock_timeout equal and
 RemoveFromWaitQueue() was called twice. This way it's a lot more robust.
   

Best regards,
Zoltán Böszörményi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Fujii Masao
On Mon, Aug 2, 2010 at 8:32 PM, Robert Haas robertmh...@gmail.com wrote:
 Sure.  If you give the standbys names, then if people change the
 names, they'll have to update their configuration.  But I can't see
 that as an argument against doing it.  You can remove the possibility
 that someone will have a hassle if they rename a server by not
 allowing them to give it a name in the first place, but that doesn't
 seem like a win from a usability perspective.

I'm just comparing your idea (i.e., set synchronous_standbys on
each possible master) with my idea (i.e., set replication_mode on
each standby). Though your idea has the advantage described in the
following post, it seems to make the setup of the standbys more
complicated, as I described. So I'm trying to generate better idea.
http://archives.postgresql.org/pgsql-hackers/2010-08/msg7.php

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Yeb Havinga

Fujii Masao wrote:

On Mon, Aug 2, 2010 at 7:53 PM, Robert Haas robertmh...@gmail.com wrote:
  

Let's not get *the manner of specifying the policy* confused with *the
need to update the policy when the master changes*.  It doesn't seem
likely you would want the same value for  synchronous_standbys on all
your machines.  In the most common configuration, you'd probably have:

on A: synchronous_standbys=B
on B: synchronous_standbys=A



Oh, true. But, what if we have another synchronous standby called C?
We specify the policy as follows?:

on A: synchronous_standbys=B,C
on B: synchronous_standbys=A,C
on C: synchronous_standbys=A,B

We would need to change the setting on both A and B when we want to
change the name of the third standby from C to D, for example. No?
  
What if the master is named as well in the 'pool of servers that are in 
sync'? In the scenario above this pool would be A,B,C. Working with this 
concept has as benefit that the setting can be copied to all other 
servers as well, and is invariant under any number of failures or 
switchovers. The same could also hold for quorum expressions like A  
(B || C), if A,B,C are either master or standby.


I initially though that once the definitions could be the same on all 
servers, having them in a system catalog would be a good thing. However 
that'd propably hard to setup, and also in the case of failures during 
change of the parameters it could become very messy.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 8:57 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 Fujii Masao wrote:

 On Mon, Aug 2, 2010 at 7:53 PM, Robert Haas robertmh...@gmail.com wrote:


 Let's not get *the manner of specifying the policy* confused with *the
 need to update the policy when the master changes*.  It doesn't seem
 likely you would want the same value for  synchronous_standbys on all
 your machines.  In the most common configuration, you'd probably have:

 on A: synchronous_standbys=B
 on B: synchronous_standbys=A


 Oh, true. But, what if we have another synchronous standby called C?
 We specify the policy as follows?:

 on A: synchronous_standbys=B,C
 on B: synchronous_standbys=A,C
 on C: synchronous_standbys=A,B

 We would need to change the setting on both A and B when we want to
 change the name of the third standby from C to D, for example. No?


 What if the master is named as well in the 'pool of servers that are in
 sync'? In the scenario above this pool would be A,B,C. Working with this
 concept has as benefit that the setting can be copied to all other servers
 as well, and is invariant under any number of failures or switchovers. The
 same could also hold for quorum expressions like A  (B || C), if A,B,C are
 either master or standby.

 I initially though that once the definitions could be the same on all
 servers, having them in a system catalog would be a good thing. However
 that'd propably hard to setup, and also in the case of failures during
 change of the parameters it could become very messy.

Yeah, I think this information has to be stored either in GUCs or in a
flat-file somewhere.  Putting it in a system catalog will cause major
problems when trying to get a down system back up, I think.

I suspect that for complex setups, people will need to use some kind
of cluster-ware to update the settings as nodes go up and down.  But I
think it will still be simpler if the nodes are named.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Sushant Sinha
 On 08/01/2010 08:04 PM, Sushant Sinha wrote:
  1. We do not have separate tokens wikipedia and org
  2. If we have the two tokens we should have them at adjacent position so
  that a phrase search for wikipedia org should work.
 
 This would needlessly increase the number of tokens. Instead you'd 
 better make it work like compound word support, having just wikipedia 
 and org as tokens.

The current text parser already returns url and url_path. That already
increases the number of unique tokens. I am only asking for adding of
normal english words as well so that if someone types only wikipedia
he gets a match. 

 
 Searching for wikipedia.org or wikipedia org should then result in 
 the same search query with the two tokens: wikipedia and org.

Earlier people have expressed the need to index urls/emails and
currently the text parser already does so. Reverting that would be a
regression of functionality. Further, a ranking function can take
advantage of direct match of a token.

  position 0: WORD(wikipedia), URL(wikipedia.org/search?q=sushant)
 
 IMO the differentiation between WORDs and URLs is not something the text 
 search engine should have to take care a lot. Let it just do the 
 searching and make it do that well.

Postgres english parser already emits urls as tokens. Only thing I am
asking is on improving the tokenization and positioning.

 What does a token wikipedia.org/search?q=sushant buy you in terms of 
 text searching? Or even result highlighting? I wouldn't expect anybody 
 to want to search for a full URL, do you?

There have been need expressed in past. And an exact token match can
result in better ranking functions. For example, a tf-idf ranking will
rank matching of such unique tokens significantly higher.

-Sushant.

 Regards
 
 Markus Wanner



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Yeb Havinga

Robert Haas wrote:

I agree that's the crux of the problem, but I can't see solving it
with a global variable.  I realize you were just testing...
  
Yes it was just a test. However, somewhere information must be kept or 
altered so it can be detected that a relation has already been visited, 
i.e. it is a multiple inheriting child. The other solutions I could 
think of are more intrusive (i.e. definitionin ATController and passing 
as parameter).


The attached patch uses the globally defined list. After ATPrepCmd the 
list pointer is reset to NIL, the list not freed since the allocs are 
done in a memory context soon to be deleted (PortalHeapMemory). It 
passes regression as well as the script below.


regards,
Yeb Havinga


DROP SCHEMA IF EXISTS test_inheritance CASCADE;
CREATE SCHEMA test_inheritance;
SET search_path TO test_inheritance;

CREATE TABLE top (i int);
CREATE TABLE mid1 () INHERITS (top);
CREATE TABLE mid2 () INHERITS (top);
CREATE TABLE bottom () INHERITS (mid1, mid2);
CREATE TABLE basement () INHERITS (bottom);

ALTER TABLE top
ADD COLUMN a_table_column integer,
ADD COLUMN a_table_column2 integer;

ALTER TABLE top
ADD COLUMN a_table_column3 integer;

ALTER TABLE top
ADD CONSTRAINT  a_check_constraint CHECK (i IN (0,1)),
ADD CONSTRAINT  a_check_constraint2 CHECK (i IN (0,1));

ALTER TABLE top
ADD CONSTRAINT  a_check_constraint3 CHECK (i IN (0,1));

SELECT t.oid, t.relname, a.attinhcount
FROM pg_class t
JOIN pg_attribute a ON (a.attrelid = t.oid)
JOIN pg_namespace n ON (t.relnamespace = n.oid)
WHERE n.nspname = 'test_inheritance' AND a.attname LIKE 'a_table_column%'
ORDER BY oid;

SELECT t.oid, t.relname, c.coninhcount
FROM pg_class t
JOIN pg_constraint c ON (c.conrelid = t.oid)
JOIN pg_namespace n ON (t.relnamespace = n.oid)
WHERE n.nspname = 'test_inheritance'
ORDER BY oid;




diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 49a6f73..08efffc 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -99,6 +99,10 @@ typedef struct OnCommitItem
 
 static List *on_commits = NIL;
 
+/*
+ * Per subcommand history of relids visited in an inheritance hierarchy.
+ */
+static List *visited_relids = NIL;
 
 /*
  * State information for ALTER TABLE
@@ -2584,6 +2588,7 @@ ATController(Relation rel, List *cmds, bool recurse, 
LOCKMODE lockmode)
AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);
 
ATPrepCmd(wqueue, rel, cmd, recurse, false, lockmode);
+   visited_relids = NIL;
}
 
/* Close the relation, but keep lock until commit */
@@ -3618,10 +3623,11 @@ ATSimpleRecursion(List **wqueue, Relation rel,
 /*
  * ATOneLevelRecursion
  *
- * Here, we visit only direct inheritance children.  It is expected that
- * the command's prep routine will recurse again to find indirect children.
- * When using this technique, a multiply-inheriting child will be visited
- * multiple times.
+ * Here, we visit only direct inheritance children.  It is expected that the
+ * command's prep routine will recurse again to find indirect children.  When
+ * using this technique, a multiple-inheriting child will be visited multiple
+ * times. Childs of multiple-inheriting childs however are only visited once
+ * for each parent.
  */
 static void
 ATOneLevelRecursion(List **wqueue, Relation rel,
@@ -3631,6 +3637,14 @@ ATOneLevelRecursion(List **wqueue, Relation rel,
ListCell   *child;
List   *children;
 
+   /* If we already visited the current multiple-inheriting relation, we
+* mustn't recurse to it's child tables, because they've already been
+* visited. Visiting them would lead to an incorrect value for
+* attinhcount. */
+   if (list_member_oid(visited_relids, relid))
+   return;
+   visited_relids = lappend_oid(visited_relids, relid);
+
children = find_inheritance_children(relid, lockmode);
 
foreach(child, children)
@@ -4891,6 +4905,15 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo 
*tab, Relation rel,
CommandCounterIncrement();
 
/*
+* If the constraint got merged with an existing constraint, we're done.
+* We mustn't recurse to child tables in this case, because they've 
already
+* got the constraint, and visiting them again would lead to an 
incorrect
+* value for coninhcount.
+*/
+   if (newcons == NIL)
+   return;
+
+   /*
 * Propagate to children as appropriate.  Unlike most other ALTER
 * routines, we have to do this one level of recursion at a time; we 
can't
 * use find_all_inheritors to do it in one pass.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres as Historian

2010-08-02 Thread Etienne Dube

On 02/08/2010 3:20 AM, Hardik Belani wrote:
We are using postgres as RDBMS for our product. There is a requirement 
coming for a feature which will require me to store data about various 
data points (mainly numbers) on a time scale. Data measurement is 
being taken every few secs/mins based and it is needed to be stored 
for statistical analysis. Now this requires numbers (integers/floats) 
to be stored at every mins.
For this i can create a table with number and time (may be time offset 
instead of timestamp) as columns. But still it will require me to 
store huge number of rows in the order of few millions. Data is read 
only and only inserts can happen. But I need to perform all kinds of 
aggregation to get various statistics. for example: daily avg, monthly 
avg etc..
We already are using postgres for our product so using postgres does 
not add any additional installation requirement and hence it is a bit 
easier.
Would you recommand postgres for this kind of requirement and will be 
provide the performance. OR do you recommand any other database meant 
for such requirements. I am also searching for a good historian 
database if postgres doesn't suppport.

Thanks,
Hardik



Hi Hardik,

Data warehousing techniques could help you with your requirements of 
aggregating large amounts of data. Have a look at The Data Warehouse 
Toolkit by R. Kimball on how to design a star schema with aggregate 
tables (these can be done as materialized views using PL/pgSQL and 
triggers under postgres). You could also use an OLAP server (e.g. 
Mondrian, which pretty nice and open source as well) on top of your 
postgres DB, as it can use aggregate tables transparently when needed.


Etienne


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Markus Wanner

Hi,

On 08/02/2010 03:12 PM, Sushant Sinha wrote:

The current text parser already returns url and url_path. That already
increases the number of unique tokens.


Well, I think I simply turned that off to be able to search for plain 
words. It still works for complete URLs, those are just treated like 
text, then.



Earlier people have expressed the need to index urls/emails and
currently the text parser already does so. Reverting that would be a
regression of functionality. Further, a ranking function can take
advantage of direct match of a token.


That's a point, yes. However, simply making the same string turn up 
twice in the tokenizer's output doesn't sound like the right solution to 
me. Especially considering that the query parser uses the very same 
tokenizer.


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 9:12 AM, Sushant Sinha sushant...@gmail.com wrote:
 The current text parser already returns url and url_path. That already
 increases the number of unique tokens. I am only asking for adding of
 normal english words as well so that if someone types only wikipedia
 he gets a match.
[...]
 Postgres english parser already emits urls as tokens. Only thing I am
 asking is on improving the tokenization and positioning.

Can you write a patch to implement your idea?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 9:20 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 Robert Haas wrote:

 I agree that's the crux of the problem, but I can't see solving it
 with a global variable.  I realize you were just testing...

 Yes it was just a test. However, somewhere information must be kept or
 altered so it can be detected that a relation has already been visited, i.e.
 it is a multiple inheriting child. The other solutions I could think of are
 more intrusive (i.e. definitionin ATController and passing as parameter).

 The attached patch uses the globally defined list. After ATPrepCmd the list
 pointer is reset to NIL, the list not freed since the allocs are done in a
 memory context soon to be deleted (PortalHeapMemory). It passes regression
 as well as the script below.

I can't speak for any other committer, but personally I'm prepared to
reject out of hand any solution involving a global variable.  This
code is none to easy to follow as it is and adding global variables to
the mix is not going to make it better, even if we can convince
ourselves that it's safe and correct.  This is something of a corner
case, so I won't be crushed if a cleaner fix is too invasive to
back-patch.  Incidentally, we need to shift this discussion to a new
subject line; we have a working patch for the original subject of this
thread, and are now discussing the related problem I found with
attributes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Sushant Sinha
On Mon, 2010-08-02 at 09:32 -0400, Robert Haas wrote:
 On Mon, Aug 2, 2010 at 9:12 AM, Sushant Sinha sushant...@gmail.com wrote:
  The current text parser already returns url and url_path. That already
  increases the number of unique tokens. I am only asking for adding of
  normal english words as well so that if someone types only wikipedia
  he gets a match.
 [...]
  Postgres english parser already emits urls as tokens. Only thing I am
  asking is on improving the tokenization and positioning.
 
 Can you write a patch to implement your idea?
 

Yes thats what I am planning to do. I just wanted to see if anyone can
help me in estimating whether this is doable in the current parser or I
need to write a new one. If possible, then some idea on how to go about
implementing?

-Sushant.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Tom Lane
Sushant Sinha sushant...@gmail.com writes:
 This would needlessly increase the number of tokens. Instead you'd 
 better make it work like compound word support, having just wikipedia 
 and org as tokens.

 The current text parser already returns url and url_path. That already
 increases the number of unique tokens. I am only asking for adding of
 normal english words as well so that if someone types only wikipedia
 he gets a match. 

The suggestion to make it work like compound words is still a good one,
ie given wikipedia.org you'd get back

hostwikipedia.org
host-part   wikipedia
host-part   org

not just the host token as at present.

Then the user could decide whether he needed to index hostname
components or not, by choosing whether to forward hostname-part
tokens to a dictionary or just discard them.

If you submit a patch that tries to force the issue by classifying
hostname parts as plain words, it'll probably get rejected out of
hand on backwards-compatibility grounds.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Kevin Grittner
Sushant Sinha sushant...@gmail.com wrote:
 
 Yes thats what I am planning to do. I just wanted to see if anyone
 can help me in estimating whether this is doable in the current
 parser or I need to write a new one. If possible, then some idea
 on how to go about implementing?
 
The current tsearch parser is a state machine which does clunky mode
switches to handle special cases like you describe.  If you're
looking at doing very much in there, you might want to consider a
rewrite to something based on regular expressions.  See discussion
in these threads:
 
http://archives.postgresql.org/message-id/200912102005.16560.and...@anarazel.de
 
http://archives.postgresql.org/message-id/4b210d9e02250002d...@gw.wicourts.gov
 
That was actually at the top of my personal PostgreSQL TODO list
(after my current project is wrapped up), but I wouldn't complain if
someone else wanted to take it.  :-)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 10:21 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Sushant Sinha sushant...@gmail.com wrote:

 Yes thats what I am planning to do. I just wanted to see if anyone
 can help me in estimating whether this is doable in the current
 parser or I need to write a new one. If possible, then some idea
 on how to go about implementing?

 The current tsearch parser is a state machine which does clunky mode
 switches to handle special cases like you describe.  If you're
 looking at doing very much in there, you might want to consider a
 rewrite to something based on regular expressions.  See discussion
 in these threads:

 http://archives.postgresql.org/message-id/200912102005.16560.and...@anarazel.de

 http://archives.postgresql.org/message-id/4b210d9e02250002d...@gw.wicourts.gov

 That was actually at the top of my personal PostgreSQL TODO list
 (after my current project is wrapped up), but I wouldn't complain if
 someone else wanted to take it.  :-)

If you end up rewriting it, it may be a good idea, in the initial
rewrite, to mimic the current results as closely as possible - and
then submit a separate patch to change the results.  Changing two
things at the same time exponentially increases the chance of your
patch getting rejected.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Yeb Havinga

Robert Haas wrote:

On Mon, Aug 2, 2010 at 9:20 AM, Yeb Havinga yebhavi...@gmail.com wrote:

The attached patch uses the globally defined list.

I can't speak for any other committer, but personally I'm prepared to
reject out of hand any solution involving a global variable.  This
code is none to easy to follow as it is and adding global variables to
the mix is not going to make it better, even if we can convince
ourselves that it's safe and correct.  This is something of a corner
case, so I won't be crushed if a cleaner fix is too invasive to
back-patch.

Hello Robert,

Thanks for looking at the patch. I've attached a bit more wordy version, 
that adds a boolean to AlteredTableInfo and a function to wipe that 
boolean between processing of subcommands.

  Incidentally, we need to shift this discussion to a new
subject line; we have a working patch for the original subject of this
thread, and are now discussing the related problem I found with
attributes.
  
Both solutions have changes in callers of 'find_inheritance_children'. I 
was working in the hope a unifiying solution would pop up naturally, but 
so far it has not. Checking of the new AlteredTableInfo-relVisited 
boolean could perhaps be pushed down into find_inheritance_children, if 
multiple-'doing things' for the childs under a multiple-inheriting 
relation is unwanted for every kind of action. It seems to me that the 
question whether that is the case must be answered, before the current 
working patch for coninhcount is 'ready for committer'.


regards,
Yeb Havinga

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 49a6f73..d43efc3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -99,7 +99,6 @@ typedef struct OnCommitItem
 
 static List *on_commits = NIL;
 
-
 /*
  * State information for ALTER TABLE
  *
@@ -132,6 +131,7 @@ typedef struct AlteredTableInfo
Oid relid;  /* Relation to work on 
*/
charrelkind;/* Its relkind */
TupleDesc   oldDesc;/* Pre-modification tuple 
descriptor */
+   boolrelVisited; /* Was the rel visited before, for the 
current subcommand */
/* Information saved by Phase 1 for Phase 2: */
List   *subcmds[AT_NUM_PASSES]; /* Lists of AlterTableCmd */
/* Information saved by Phases 1/2 for Phase 3: */
@@ -335,6 +335,7 @@ static void ATExecDropInherit(Relation rel, RangeVar 
*parent, LOCKMODE lockmode)
 static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
   ForkNumber forkNum, bool istemp);
 static const char *storage_name(char c);
+static void ATResetRelVisited(List **wqueue);
 
 
 /* 
@@ -2583,6 +2584,7 @@ ATController(Relation rel, List *cmds, bool recurse, 
LOCKMODE lockmode)
{
AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);
 
+   ATResetRelVisited(wqueue);
ATPrepCmd(wqueue, rel, cmd, recurse, false, lockmode);
}
 
@@ -3503,6 +3505,23 @@ ATGetQueueEntry(List **wqueue, Relation rel)
 }
 
 /*
+ * ATResetRelVisited: reset the visitation info for each rel in the working
+ * queue, so it can be used for the next subcommand.
+ */
+static void
+ATResetRelVisited(List **wqueue)
+{
+   AlteredTableInfo *tab;
+   ListCell   *ltab;
+
+   foreach(ltab, *wqueue)
+   {
+   tab = (AlteredTableInfo *) lfirst(ltab);
+   tab-relVisited = false;
+   }
+}
+
+/*
  * ATSimplePermissions
  *
  * - Ensure that it is a relation (or possibly a view)
@@ -3618,19 +3637,29 @@ ATSimpleRecursion(List **wqueue, Relation rel,
 /*
  * ATOneLevelRecursion
  *
- * Here, we visit only direct inheritance children.  It is expected that
- * the command's prep routine will recurse again to find indirect children.
- * When using this technique, a multiply-inheriting child will be visited
- * multiple times.
+ * Here, we visit only direct inheritance children.  It is expected that the
+ * command's prep routine will recurse again to find indirect children.  When
+ * using this technique, a multiple-inheriting child will be visited multiple
+ * times. Childs of multiple-inheriting childs however are only visited once
+ * for each parent.
  */
 static void
 ATOneLevelRecursion(List **wqueue, Relation rel,
AlterTableCmd *cmd, LOCKMODE lockmode)
 {
Oid relid = RelationGetRelid(rel);
+   AlteredTableInfo *tab = ATGetQueueEntry(wqueue, rel);
ListCell   *child;
List   *children;
 
+   /* If we already visited the current multiple-inheriting relation, we
+* mustn't recurse to it's child tables, because they've already been
+* visited. Visiting them would lead to an incorrect value for
+* attinhcount. */
+   if 

Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-02 Thread Robert Haas
2010/8/2 Alexander Korotkov aekorot...@gmail.com:
 On Mon, Aug 2, 2010 at 5:20 AM, Robert Haas robertmh...@gmail.com wrote:
 I reviewed this code in a fair amount of detail today and ended up
 rewriting it.  In general terms, it's best to avoid changing things
 that are not relevant to the central purpose of the patch.  This patch
 randomly adds a whole bunch of whitespace and removes a whole bunch of
 comments which I find useful and do not want to see removed.  It took
 me about an hour to reverse out all of those changes, which then let
 me get down to the business of actually analyzing the code.
 I'm sorry. This is my first patch, I will be more accurate next time. But, I
 think there is no unified opinion about some changes like replacement !m
 with m==0.

Sure; if that were the only such change, I wouldn't have mentioned it.

 I think this line is not correct:
 if (m != s_bytes  n != t_bytes)

Good catch, fixed in the attached version.

 The attached patch takes the approach of using a fast-path for just
 the innermost loop when neither string contains any multi-byte
 characters (regardless of whether or not the encoding allows
 multi-byte characters).  It's still slower than CVS HEAD, but for
 strings containing no multi-byte characters it's only marginally, if
 at all, slower than previous releases, because 9.0 doesn't contain
 your fix to avoid the extra string copy; and it's faster than your
 implementation even when multi-byte characters ARE present.  It is
 slightly slower on single-byte encoding databases (I tested
 SQL_ASCII), but still faster than previous releases.  It's also quite
 a bit less code duplication.

 Can I look at the test which shows that your implementation is faster than
 my when multi-byte characters are present? My tests show reverse. For
 example, with russian dictionary of openoffice:

Hmm, with the correction you mention above, I'm getting about the same
results with multi-byte characters (but faster with only single-byte
characters).  I did this:

CREATE TABLE words (a text not null primary key);
COPY words from '/usr/share/dict/words';

SELECT SUM(levenshtein(a, 'foo')) from words;
SELECT SUM(levenshtein(a, 'Urbański')) FROM words;
SELECT SUM(levenshtein(a, 'ańs')) FROM words;

With the updated patch attached below, these ran about 102 ms, 222 ms,
129 ms.  With your patch, I get about 134 ms, 222 ms, 130 ms.  Perhaps
this is because I only have multi-byte characters in one of the
inputs, not both.  I don't have a dictionary handy with multi-byte
words in it.  Can you send me a file?

 I think that the cause of slow down slowdown is memcmp function. This
 function is very fast for long parts of memory, but of few bytes inline
 function is faster, because compiler have more freedom for optimization.
 After replacing memcmp with inline function like following in your
 implementation:

Yeah, that does seem to be slightly better.  I've done a version of
this in the attached patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


mb_levenshtein_rmh-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Marc Cousin
The Monday 02 August 2010 13:59:59, Boszormenyi Zoltan wrote :
 
  Also, I made sure that only one or two timeout causes (one of
  deadlock_timeout
  and lock_timeout in the first case or statement_timeout plus one of the
  other two)
  can be active at a time.
 
 A little clarification is needed. The above statement is not entirely true.
 There can be a case when all three timeout causes can be active, but only
 when deadlock_timeout is the smallest of the three. If the fin_time value
 for the another timeout cause is smaller than for deadlock_timeout then
 there's no point to make deadlock_timeout active. And in case
 deadlock_timeout
 triggers and CheckDeadLock() finds that there really is a deadlock then the
 possibly active lock_timeout gets disabled to avoid calling
 RemoveFromWaitQueue() twice. I hope the comments in the code explain it
 well.
 
   Previously I was able to trigger a segfault
  with the default
  1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
  system's
  clock resolution makes the lock_timeout and deadlock_timeout equal and
  RemoveFromWaitQueue() was called twice. This way it's a lot more robust.

 
Ok, I've tested this new version:

This time, it's this case that doesn't work :

Session 1 : lock a table

Session 2 : set lock_timeout to a large value, just to make it obvious (10 
seconds).
It times out after 1 s (the deadlock timeout), returning 'could not obtain 
lock'.
Of course, it should wait 10 seconds.


I really feel that the timeout framework is the way to go here. I know what
you said about it before, but I think that the current code is getting
too complicated, with too many special cases all over.


As this is only my second review and I'm sure I am missing things here, could
someone with more experience have a look and give advice ?


Cheers

Marc

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compiling CVS HEAD with clang under OSX

2010-08-02 Thread Tom Lane
Neil Conway neil.con...@gmail.com writes:
 FWIW, I think we should aim to eventually remove the dependency on
 -fwrapv, and instead make the code correct under the semantics
 guaranteed by the C spec.

[ shrug... ]  We've gone around on that before.  I can't get excited
about it, and the reason is that 99% of the places where we actually
need -fwrapv behavior is in places where we are trying to test for
overflow.  The need for that can't be legislated away.  As an example,
in int4pl we have code like

result = arg1 + arg2;
if (some-test-on-result-and-arg1-and-arg2)
elog(ERROR, overflow);

Here's the problem: if the compiler is allowed to assume that overflow
cannot happen, it is always going to be able to prove that the
if-test is constant false.  This is inherent.  Anybody claiming to
exhibit a safe way to code the overflow test is really only telling
you that today's version of their compiler isn't smart enough to make
the proof.  Next year, who knows?  I'm uninterested in getting into
that kind of arms race with the compiler authors.  I prefer to keep
the goalposts exactly where they've been for the past forty years,
namely -fwrapv semantics.

If the compiler guys actually were serious about helping people write
code that didn't need -fwrapv, they would provide primitives for
testing for arithmetic overflow.  I would be ecstatic if we could write
int4pl like this:

if (sum_overflows(arg1, arg2))
elog(ERROR, overflow);
else
return arg1 + arg2;

especially since with a halfway decent compiler this sort of
construction wouldn't even require doing the addition twice ---
presumably the compiler could see that it had a common subexpression
there, and generate code that consists of one addition plus a
branch-on-overflow test.  This'd be clean, readable, and far more
efficient than the hacks we have to resort to now.

Short of that, though, -fwrapv is what it's gonna be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 10:47 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 The attached patch uses the globally defined list.

 I can't speak for any other committer, but personally I'm prepared to
 reject out of hand any solution involving a global variable.  This
 code is none to easy to follow as it is and adding global variables to
 the mix is not going to make it better, even if we can convince
 ourselves that it's safe and correct.  This is something of a corner
 case, so I won't be crushed if a cleaner fix is too invasive to
 back-patch.

 Thanks for looking at the patch. I've attached a bit more wordy version,
 that adds a boolean to AlteredTableInfo and a function to wipe that boolean
 between processing of subcommands.

I don't think that this is much cleaner than the global variable
solution; you haven't really localized that need to know about the new
flag in any meaningful way, the hacks in ATOneLevelRecusion()
basically destroy any pretense of that code possibly being reusable
for some other caller.  However, there's a more serious problem, which
is that it doesn't in general fix the bug: try it with the
top1/top2/bottom/basement example I posted upthread.  If you add the
same column to both top1 and top2 and then drop it in both top1 and
top2, basement ends up with a leftover copy.  The problem is that
only visit each child once is not the right algorithm; what you need
to do is only visit the descendents of each child if no merge
happened at the parent.  I believe that the only way to do this
correct is to merge the prep stage into the execution stage, as the
code for adding constraints already does.  At the prep stage, you
don't have enough information to determine which relations you'll
ultimately need to update, since you don't know where the merges will
happen.

  Incidentally, we need to shift this discussion to a new
 subject line; we have a working patch for the original subject of this
 thread, and are now discussing the related problem I found with
 attributes.


 Both solutions have changes in callers of 'find_inheritance_children'. I was
 working in the hope a unifiying solution would pop up naturally, but so far
 it has not. Checking of the new AlteredTableInfo-relVisited boolean could
 perhaps be pushed down into find_inheritance_children, if multiple-'doing
 things' for the childs under a multiple-inheriting relation is unwanted for
 every kind of action. It seems to me that the question whether that is the
 case must be answered, before the current working patch for coninhcount is
 'ready for committer'.

I am of the opinion that the chances of a unifying solution popping up
are pretty much zero.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] knngist - 0.8

2010-08-02 Thread Robert Haas
2010/7/29 Alexander Korotkov aekorot...@gmail.com:
 But, in pg_trgm it makes it possible to combine different similarity levels
 in one query. For example:
 select * from test_trgm order by t - 'asdf'  0.5 or t - 'qwer'  0.4;
 Is there any chance to handle this syntax also?

Maybe I'm missing something, but I don't think that ORDER BY clause
makes much sense.  OR is going to reduce a true or false value - and
it's usually not that interesting to order by a column that can only
take one of two values.

Am I confused?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (9.1) btree_gist support for searching on not equals

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 2:39 AM, Jeff Davis pg...@j-davis.com wrote:
 On Sun, 2010-08-01 at 21:57 -0400, Robert Haas wrote:
 On Fri, Jul 16, 2010 at 1:19 AM, Jeff Davis pg...@j-davis.com wrote:
  Thank you for the review.
 
  On Mon, 2010-07-12 at 17:17 +0900, Itagaki Takahiro wrote:
  (1) Exclusion constraints support for operators where x operator x
  is false (tiny patch)
  https://commitfest.postgresql.org/action/patch_view?id=307
  (2) btree_gist support for searching on  (not equals)
  https://commitfest.postgresql.org/action/patch_view?id=308
 
  Those patches should be committed at once because (2) requires (1) to work
  with EXCLUDE constraints. Also, (1) has no benefits without (2) because we
  have no use cases for  as an index-able operator. Both patches are very
  simple and small, and worked as expected both WHERE  and EXCLUDE
  constraints cases.
 
  It appears that Tom already committed (1).
 
  I'd like to ask you to write additional documentation about btree_gist [1]
  that the module will be more useful when it is used with exclusion
  constraints together. Without documentation, no users find the usages.
 
  Good idea, new patch attached.

 It seems pretty odd to define a constant called
 BTNotEqualStrategyNumber in contrib/btree_gist.  Shouldn't we either
 call this something else, or define it in access/skey.h?  Considering
 that there seem to be some interesting gymnastics being done with
 BTMaxStrategyNumber, I'd vote for the former.  Maybe just
 BtreeGistNotEqualStrategyNumber?

 Sounds good to me.

OK, committed that way.

 At some point we may be interested to add this to BTree, as well. But we
 can cross that bridge when we come to it.

Yeah.

I was also wondering if it would be worth adding some additional
regression testing to contrib/btree_gist exercising this new
functionality.  Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compiling CVS HEAD with clang under OSX

2010-08-02 Thread Greg Stark
On Mon, Aug 2, 2010 at 4:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Here's the problem: if the compiler is allowed to assume that overflow
 cannot happen, it is always going to be able to prove that the
 if-test is constant false.  This is inherent.  Anybody claiming to
 exhibit a safe way to code the overflow test is really only telling
 you that today's version of their compiler isn't smart enough to make
 the proof.

So I'll do the next two parts of the dialogue myself:

Questioner: So why not write that as:

if ((arg1  0  arg2  0  arg1  MAXINT - arg2) ||
(arg1  0  arg2  0  arg1  MININT + arg2))
  elog(overflow)
else
  return arg1+arg2;

Tom: Because that code is much more complex and prone to errors
especially when you start getting into multiplication and other
operations and it's also much slower than the code which allows
overflow to happen and then checks that the result makes sense.

I'm not entirely sure I agree. At least I haven't actually gone
through all the arithmetic operations and I'm not sure how much more
complex they get. If they were all at that level of complexity I think
I would say we should go ahead and bite the performance bullet and do
it the ultra-safe way.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Yeb Havinga

Robert Haas wrote:

I don't think that this is much cleaner than the global variable
solution; you haven't really localized that need to know about the new
flag in any meaningful way, the hacks in ATOneLevelRecusion()
basically destroy any pretense of that code possibly being reusable
for some other caller.  However, there's a more serious problem, which
is that it doesn't in general fix the bug: try it with the
top1/top2/bottom/basement example I posted upthread.  If you add the
same column to both top1 and top2 and then drop it in both top1 and
top2, basement ends up with a leftover copy.  The problem is that
only visit each child once is not the right algorithm; what you need
to do is only visit the descendents of each child if no merge
happened at the parent.  I believe that the only way to do this
correct is to merge the prep stage into the execution stage, as the
code for adding constraints already does.  At the prep stage, you
don't have enough information to determine which relations you'll
ultimately need to update, since you don't know where the merges will
happen.
  

Hello Robert,

Again thanks for looking at the patch. Unfortunately I missed the 
top1/top2 example earlier, but now I've seen it I understand that it is 
impossible to fix this problem during the prep stage, without looking at 
actual existing columns, i.e. without the merge code. Running the 
top1/top2 example I'd also have expected an error while adding the 
column to the second top, since the columns added to top1 and top2 are 
from a different origin. It is apparently considered good behaviour, 
however troubles emerge when e.g. trying to rename a_table_column in the 
top1/top2 example, where that is no problem in the 'lollipop' structure, 
that has a single origin.


ALTER TABLE top RENAME COLUMN a_table_column TO another_table_column;

SELECT t.oid, t.relname, a.attname, a.attinhcount
FROM pg_class t
JOIN pg_attribute a ON (a.attrelid = t.oid)
JOIN pg_namespace n ON (t.relnamespace = n.oid)
WHERE n.nspname = 'test_inheritance' AND a.attname LIKE '%table_column%'
ORDER BY oid;

I do not completely understand what you mean with the destruction of 
reusability of ATOneLevelRecursion, currently only called by 
ATPrepAddColumn - in the patch it is documented in the definition of 
relVisited that is it visit info for each subcommand. The loop over 
subcommands is in ATController, where the value is properly reset for 
each all current and future subcommands. Hence the ATOneLevelRecursion 
routing is usable in its current form for all callers during the prep 
stage, and not only ATPrepAddColumn.

I am of the opinion that the chances of a unifying solution popping up
are pretty much zero.  :-)
  

Me too, now I understand the fixes must be in the execution stages.

regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Kevin Grittner
Marc Cousin cousinm...@gmail.com wrote:
 
 This time, it's this case that doesn't work :
 
 I really feel that the timeout framework is the way to go here.
 
Since Zoltán also seems to feel this way:
 
http://archives.postgresql.org/message-id/4c516c3a.6090...@cybertec.at
 
I wonder whether this patch shouldn't be rejected with a request
that the timeout framework be submitted to the next CF.  Does anyone
feel this approach (without the framework) should be pursued
further?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Boszormenyi Zoltan
Hi,

Kevin Grittner írta:
 Marc Cousin cousinm...@gmail.com wrote:
  
   
 This time, it's this case that doesn't work :
 
  
   
 I really feel that the timeout framework is the way to go here.
 
  
 Since Zoltán also seems to feel this way:
  
 http://archives.postgresql.org/message-id/4c516c3a.6090...@cybertec.at
  
 I wonder whether this patch shouldn't be rejected with a request
 that the timeout framework be submitted to the next CF.  Does anyone
 feel this approach (without the framework) should be pursued
 further?
   

I certainly think so, the current scheme seems to be very fragile
and doesn't want to be extended.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Kevin Grittner
Boszormenyi Zoltan z...@cybertec.at wrote:
 Kevin Grittner írta:
 
 I wonder whether this patch shouldn't be rejected with a request
 that the timeout framework be submitted to the next CF.  Does
 anyone feel this approach (without the framework) should be
 pursued further?
 
 I certainly think so, the current scheme seems to be very fragile
 and doesn't want to be extended.
 
Sorry, I phrased that question in a rather confusing way; I'm not
sure, but it sounds like you're in favor of dropping this approach
and pursuing the timeout framework in the next CF -- is that right?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 3:09 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Marc Cousin cousinm...@gmail.com wrote:

 This time, it's this case that doesn't work :

 I really feel that the timeout framework is the way to go here.

 Since Zoltán also seems to feel this way:

 http://archives.postgresql.org/message-id/4c516c3a.6090...@cybertec.at

 I wonder whether this patch shouldn't be rejected with a request
 that the timeout framework be submitted to the next CF.  Does anyone
 feel this approach (without the framework) should be pursued
 further?

I think Returned with Feedback would be more appropriate than
Rejected, since we're asking for a rework, rather than saying - we
just don't want this.  But otherwise, +1.

Generally, I'm of the opinion that patches needing significant rework
should be set to Returned with Feedback and resubmitted for the next
CF; otherwise, it just gets unmanageable.  Our goal for a CF should be
to review everything thoroughly, not to get everything committed.
Otherwise, we end up with a never-ending train of what are effectively
new patches popping up, and it becomes impossible to close out the
CommitFest on time.  Reviewers and committers end up getting slammed,
and it's not very much fun for patch authors (who are trying to
frantically do last-minute rewrites) either; nor is it good for the
overall quality of code the ends up in our tree.  Better to take a
breather and then start fresh.

(If you don't believe in committer fatigue, look at the review I gave
Itagaki Takahiro on the partitioning patch in January vs. the review I
gave in July.  One of those reviews is a whole lot more specific,
detailed, and accurate than the other one...)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-08-02 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 I wonder whether this patch shouldn't be rejected with a request
 that the timeout framework be submitted to the next CF.
 
 I think Returned with Feedback would be more appropriate than
 Rejected, since we're asking for a rework, rather than saying -
 we just don't want this.  But otherwise, +1.
 
Done.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for check constraints using multiple inheritance

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 2:56 PM, Yeb Havinga yebhavi...@gmail.com wrote:
 I do not completely understand what you mean with the destruction of
 reusability of ATOneLevelRecursion, currently only called by ATPrepAddColumn
 - in the patch it is documented in the definition of relVisited that is it
 visit info for each subcommand. The loop over subcommands is in
 ATController, where the value is properly reset for each all current and
 future subcommands. Hence the ATOneLevelRecursion routing is usable in its
 current form for all callers during the prep stage, and not only
 ATPrepAddColumn.

Well, only if they happen to want the visit each table only once
behavior, which might not be true for every command type.

 I am of the opinion that the chances of a unifying solution popping up
 are pretty much zero.  :-)

 Me too, now I understand the fixes must be in the execution stages.

OK.  I'll go ahead and commit the patch upthread, so that the
constraint bug is fixed, and then we can keep arguing about what do
about the column bug, perhaps on a new thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-02 Thread Robert Haas
2010/8/2 Alexander Korotkov aekorot...@gmail.com:
 The dump of the table with russian dictionary is in attachment.

 I use following tests:
 SELECT SUM(levenshtein(a, 'foo')) from words;
 SELECT SUM(levenshtein(a, 'Urbański')) FROM words;
 SELECT SUM(levenshtein(a, 'ańs')) FROM words;
 SELECT SUM(levenshtein(a, 'foo')) from words2;
 SELECT SUM(levenshtein(a, 'дом')) FROM words2;
 SELECT SUM(levenshtein(a, 'компьютер')) FROM words2;

 With your last version of patch results are:
 63ms 94ms 61ms 100ms 121ms 217ms.

 But I found some other optimization. With this condition:
 if (x[x_char_len-1] == y[y_char_len-1]  x_char_len == y_char_len 
     (x_char_len == 1 || char_same(x, y, x_char_len - 1)))
 results become:
 58ms 96ms 63ms 102ms 107ms 171ms

 On single-byte characters results almost didn't change, but they come better
 with multi-byte characters. Generally, this improvement is because first
 bytes of different characters are frequently same (for example, almost all
 Cyrillic characters start from same byte, and I think that there is similar
 situation in other alphabets), but match of last bytes is just a
 coincidence.

Hey, that's really cool.  It helps a lot here, too:

My previous patch, with your 5 test cases:
Time: 100.556 ms
Time: 205.254 ms
Time: 127.070 ms
Time: 77.734 ms
Time: 90.345 ms
Time: 166.954 ms

Your original patch, same 5 test cases:
Time: 131.489 ms
Time: 215.048 ms
Time: 125.471 ms
Time: 80.068 ms
Time: 87.110 ms
Time: 166.918 ms

My patch, with your proposed change to compare the last character
first, same 5 test cases:
Time: 96.489 ms
Time: 201.497 ms
Time: 121.876 ms
Time: 75.005 ms
Time: 80.219 ms
Time: 142.844 ms

Updated patch attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


mb_levenshtein_rmh-v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: xml_is_well_formed

2010-08-02 Thread Mike Fowler

On 02/08/10 07:46, Pavel Stehule wrote:



I have not any suggestions now - so I'll change flag to ready to commit


sorry - contrib module should be a fixed

patch attached



Thanks Pavel, you saved me some time!

Regards,

--
Mike Fowler
Registered Linux user: 379787

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Where in the world is Itagaki Takahiro?

2010-08-02 Thread Kevin Grittner
I haven't seen any posts or CF activity from Itagaki in over a week,
so I'm wondering how to handle some patches in the CF.  Does anyone
know whether he's on vacation?  Expected return?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-08-02 Thread Florian Pflug
Hi

I've updated mvcc.sgml to explain the new serialization conflict rules for 
row-level locks, and added a paragraph to backend/executor/README that explains 
the implementation of those. I've chosen backend/executor/README because it 
already contains a description of UPDATE handling in READ COMMITTED mode, which 
seemed at least somewhat related.

The patch now removes all code dealing with the crosscheck_snapshot, since the 
RI triggers should now be safe without that.

I've also fixed the formatting of multi-line comments to match the coding 
style. I kept the braces around single-line if or else statements wherever 
both if and else blocks were present and one of them wasn't a single-line 
block.

I think, in addition to the documentation changes this patch contains, that a 
section on how to write RI triggers in pl/pgsql would be nice to have. It's not 
strictly part of the documentation of this feature though, more a potential 
use-case, so I didn't tackle it for the moment. I do hope to find the time to 
write such a section, though, and if that happens I'll post it as a separate 
documentation-only patch.

I've pushed the changes to the branch serializable_row_locks on 
git://github.com/fgp/postgres.git and also attached a patch.

best regards,
Florian Pflug


serializable_row_locks.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Where in the world is Itagaki Takahiro?

2010-08-02 Thread Itagaki Takahiro
2010/8/3 Kevin Grittner kevin.gritt...@wicourts.gov:
 I haven't seen any posts or CF activity from Itagaki in over a week,
 so I'm wondering how to handle some patches in the CF.  Does anyone
 know whether he's on vacation?  Expected return?

Sorry for delayed reply. I moved to a new job,
and was very busy for it.

My proposed patches should be returned or postponed
to the next commitfest. I will restart to check
patches I reviewed before in one or two weeks.
I apologize to patch's authors for the delay.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Where in the world is Itagaki Takahiro?

2010-08-02 Thread Josh Berkus
On 8/2/10 3:42 PM, Itagaki Takahiro wrote:
 Sorry for delayed reply. I moved to a new job,
 and was very busy for it.

Congratulations!  Are you still at NTT Open Source?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 5:07 PM, Alexander Korotkov aekorot...@gmail.com wrote:
 Now I think patch is as good as can be. :)

OK, committed.

 I'm going to prepare less-or-equal function in same manner as this patch.

Sounds good.  Since we're now more than half-way through this
CommitFest and this patch has undergone quite a bit of change from
what we started out with, I'd like to ask that you submit the new
patch for the next CommitFest, to begin September 15th.

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Robert Haas
On Sun, Aug 1, 2010 at 11:48 PM, Robert Haas robertmh...@gmail.com wrote:
 b) more robust algorithm for header rows identification

 Have not gotten to this one yet.

I notIce that on WIN32 the default editor is notepad.exe and the
default editor navigation option is /.  Does notepad.exe /lineno
filename actually work on Windows?  A quick Google search suggests
that the answer is no, which seems somewhat unfortunate: it means
we'd be shipping broken on Win32 out of the box.

http://www.robvanderwoude.com/commandlineswitches.php#Notepad

This is actually my biggest concern about this patch - that it may be
just too much of a hassle to actually make it work for people.  I just
tried setting $EDITOR to MacOS's TextEdit program, and it turns out
that TextEdit doesn't understand +.  I'm afraid that we're going to
end up with a situation where it only works for people using emacs or
vi, and everyone else ends up with a broken install (and, possibly, no
clear idea how to fix it).  Perhaps it would be better to accept \sf
and reject \sf+ and \ef func lineno.

Assuming we get past that threshold issue, I'm also wondering whether
the navigation option terminology is best; e.g. set
PSQL_EDITOR_NAVIGATION_OPTION to configure it.  It doesn't seem
terrible, but it doesn't seem great, either.  Anyone have a better
idea?

The docs are a little rough; they could benefit from some editing by a
fluent English speaker.  If anyone has time to work on this, it would
be much appreciated.

Instead of line number is unacceptable, I think you should write
invalid line number.

dollar should not be spelled dolar.  function should not be
spelled finction.

This change looks suspiciously like magic.  If it's actually safe, it
needs a comment explaining why:

-   sys = pg_malloc(strlen(editorName) + strlen(fname) + 10 + 1);
+   sys = pg_malloc(strlen(editorName) + strlen(fname) + 20 + 1);

Attempting to compile with this patch applied emits a warning on my
machine; whether or not the warning is valid, you need to make it go
away:

command.c: In function ‘HandleSlashCmds’:
command.c:1055: warning: ‘bsep’ may be used uninitialized in this function
command.c:1055: note: ‘bsep’ was declared here

Why does the \sf output have a trailing blank line?  This seems weird,
especially because \ef puts no such trailing blank line in the editor.

Instead of:

+   /* skip useles whitespaces */
+   while (c = func)
+   if (isblank(*c))
+   c--;
+   else
+   break;

...wouldn't it be just as good to write:

while (c = func  isblank(*c))
c--;

(and similarly elsewhere)

In extract_separator, if you invert the sense of the first if-test,
then you can avoid having to indent the entire function contents.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 This is actually my biggest concern about this patch - that it may be
 just too much of a hassle to actually make it work for people.  I just
 tried setting $EDITOR to MacOS's TextEdit program, and it turns out
 that TextEdit doesn't understand +.  I'm afraid that we're going to
 end up with a situation where it only works for people using emacs or
 vi, and everyone else ends up with a broken install (and, possibly, no
 clear idea how to fix it).

[ disclaimer: I've not looked at the proposed patch yet ]

It seems like this ought to be fairly easily surmountable as long as
the patch is designed for failure.  The fallback position is just that
the line number does nothing, ie \ef foo just opens the editor and
doesn't try to position the cursor anywhere special; nobody can complain
about that because it's no worse than before.  What we need is to not
try to force positioning if we don't recognize the editor.  I'm tempted
to suggest forgetting about any user-configurable parameter and just
provide code that strcmp's the $EDITOR value to see if it recognizes the
editor name, otherwise do nothing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 10:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 This is actually my biggest concern about this patch - that it may be
 just too much of a hassle to actually make it work for people.  I just
 tried setting $EDITOR to MacOS's TextEdit program, and it turns out
 that TextEdit doesn't understand +.  I'm afraid that we're going to
 end up with a situation where it only works for people using emacs or
 vi, and everyone else ends up with a broken install (and, possibly, no
 clear idea how to fix it).

 [ disclaimer: I've not looked at the proposed patch yet ]

 It seems like this ought to be fairly easily surmountable as long as
 the patch is designed for failure.

It isn't.

 The fallback position is just that
 the line number does nothing, ie \ef foo just opens the editor and
 doesn't try to position the cursor anywhere special; nobody can complain
 about that because it's no worse than before.  What we need is to not
 try to force positioning if we don't recognize the editor.

Supposing for the moment that we could make it work that way, that
would be reasonable.

 I'm tempted
 to suggest forgetting about any user-configurable parameter and just
 provide code that strcmp's the $EDITOR value to see if it recognizes the
 editor name, otherwise do nothing.

With all due respect, that sounds like an amazingly bad idea.  Surely,
we'll be forever getting patches to add $MYFAVORITEEDITOR to the list,
or complaints that it's not already included.  While this is
superficially a Nice Thing to Have and I would certainly support it if
+linenumber were relatively universal, it's really a pretty minor
convenience when you come right down to it, and I am not at all
convinced it is worth the hassle of trying to divine what piece of
syntax will equip the user's choice of editor with the necessary
amount of clue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Aug 2, 2010 at 10:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm tempted
 to suggest forgetting about any user-configurable parameter and just
 provide code that strcmp's the $EDITOR value to see if it recognizes the
 editor name, otherwise do nothing.

 With all due respect, that sounds like an amazingly bad idea.  Surely,
 we'll be forever getting patches to add $MYFAVORITEEDITOR to the list,
 or complaints that it's not already included.

Well, yeah, that's the idea.  I say that beats a constant stream of
complaints that $MYFAVORITEEDITOR no longer works at all --- which
is what your concern was, no?

 While this is
 superficially a Nice Thing to Have and I would certainly support it if
 +linenumber were relatively universal, it's really a pretty minor
 convenience when you come right down to it, and I am not at all
 convinced it is worth the hassle of trying to divine what piece of
 syntax will equip the user's choice of editor with the necessary
 amount of clue.

The other approach we could take is that this whole thing is disabled by
default, and you have to set a psql variable EDITOR_LINENUMBER_SWITCH
to turn it on.  If you haven't read the documentation enough to find
out that variable exists, well, no harm no foul.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (9.1) btree_gist support for searching on not equals

2010-08-02 Thread Jeff Davis
On Mon, 2010-08-02 at 12:27 -0400, Robert Haas wrote:
 I was also wondering if it would be worth adding some additional
 regression testing to contrib/btree_gist exercising this new
 functionality.  Thoughts?

Sure. I attached two tests.

Regards,
Jeff Davis

*** a/contrib/btree_gist/Makefile
--- b/contrib/btree_gist/Makefile
***
*** 11,17  DATA_built  = btree_gist.sql
  DATA= uninstall_btree_gist.sql
  
  REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz time timetz \
!   date interval macaddr inet cidr text varchar char bytea bit varbit numeric
  
  ifdef USE_PGXS
  PG_CONFIG = pg_config
--- 11,17 
  DATA= uninstall_btree_gist.sql
  
  REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz time timetz \
!   date interval macaddr inet cidr text varchar char bytea bit varbit numeric mixed
  
  ifdef USE_PGXS
  PG_CONFIG = pg_config
*** /dev/null
--- b/contrib/btree_gist/expected/mixed.out
***
*** 0 
--- 1,31 
+ SET enable_seqscan = 'false';
+ -- test search for not equals
+ CREATE TABLE test_ne (
+a  TIMESTAMP,
+b  NUMERIC
+ );
+ CREATE INDEX test_ne_idx ON test_ne USING gist (a, b);
+ INSERT INTO test_ne SELECT '2009-01-01', 10.7 FROM generate_series(1,1000);
+ INSERT INTO test_ne VALUES('2007-02-03', -91.3);
+ INSERT INTO test_ne VALUES('2011-09-01', 43.7);
+ INSERT INTO test_ne SELECT '2009-01-01', 10.7 FROM generate_series(1,1000);
+ SELECT * FROM test_ne WHERE a  '2009-01-01' AND b  10.7;
+ a |   b   
+ --+---
+  Sat Feb 03 00:00:00 2007 | -91.3
+  Thu Sep 01 00:00:00 2011 |  43.7
+ (2 rows)
+ 
+ -- test search for not equals using an exclusion constraint
+ CREATE TABLE zoo (
+cage   INTEGER,
+animal TEXT,
+EXCLUDE USING gist (cage WITH =, animal WITH )
+ );
+ NOTICE:  CREATE TABLE / EXCLUDE will create implicit index zoo_cage_animal_excl for table zoo
+ INSERT INTO zoo VALUES(123, 'zebra');
+ INSERT INTO zoo VALUES(123, 'zebra');
+ INSERT INTO zoo VALUES(123, 'lion');
+ ERROR:  conflicting key value violates exclusion constraint zoo_cage_animal_excl
+ DETAIL:  Key (cage, animal)=(123, lion) conflicts with existing key (cage, animal)=(123, zebra).
+ INSERT INTO zoo VALUES(124, 'lion');
*** /dev/null
--- b/contrib/btree_gist/sql/mixed.sql
***
*** 0 
--- 1,30 
+ 
+ SET enable_seqscan = 'false';
+ 
+ -- test search for not equals
+ 
+ CREATE TABLE test_ne (
+a  TIMESTAMP,
+b  NUMERIC
+ );
+ CREATE INDEX test_ne_idx ON test_ne USING gist (a, b);
+ 
+ INSERT INTO test_ne SELECT '2009-01-01', 10.7 FROM generate_series(1,1000);
+ INSERT INTO test_ne VALUES('2007-02-03', -91.3);
+ INSERT INTO test_ne VALUES('2011-09-01', 43.7);
+ INSERT INTO test_ne SELECT '2009-01-01', 10.7 FROM generate_series(1,1000);
+ 
+ SELECT * FROM test_ne WHERE a  '2009-01-01' AND b  10.7;
+ 
+ -- test search for not equals using an exclusion constraint
+ 
+ CREATE TABLE zoo (
+cage   INTEGER,
+animal TEXT,
+EXCLUDE USING gist (cage WITH =, animal WITH )
+ );
+ 
+ INSERT INTO zoo VALUES(123, 'zebra');
+ INSERT INTO zoo VALUES(123, 'zebra');
+ INSERT INTO zoo VALUES(123, 'lion');
+ INSERT INTO zoo VALUES(124, 'lion');

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Robert Haas
On Mon, Aug 2, 2010 at 11:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Aug 2, 2010 at 10:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm tempted
 to suggest forgetting about any user-configurable parameter and just
 provide code that strcmp's the $EDITOR value to see if it recognizes the
 editor name, otherwise do nothing.

 With all due respect, that sounds like an amazingly bad idea.  Surely,
 we'll be forever getting patches to add $MYFAVORITEEDITOR to the list,
 or complaints that it's not already included.

 Well, yeah, that's the idea.  I say that beats a constant stream of
 complaints that $MYFAVORITEEDITOR no longer works at all --- which
 is what your concern was, no?

Well, it'd still work fine for \e foo.  It'll just blow up for \e foo
3.  My concern isn't really that things that which work now will break
so much as that this new feature will fail to work for a large
percentage of the people who try to use it, including virtually
everyone who tries to use it on Win32.

 While this is
 superficially a Nice Thing to Have and I would certainly support it if
 +linenumber were relatively universal, it's really a pretty minor
 convenience when you come right down to it, and I am not at all
 convinced it is worth the hassle of trying to divine what piece of
 syntax will equip the user's choice of editor with the necessary
 amount of clue.

 The other approach we could take is that this whole thing is disabled by
 default, and you have to set a psql variable EDITOR_LINENUMBER_SWITCH
 to turn it on.  If you haven't read the documentation enough to find
 out that variable exists, well, no harm no foul.

That might be reasonable.  Right now the default behavior is to do
+line on Linux and /line on Windows.  But maybe a more sensible
default would be to fail with an error message that says you have to
set thus-and-so variable if you want to use this feature.  That seems
better than sometimes working and sometimes failing depending on what
editor the user has configured.

A side question is whether this should be an environment variable or a
psql variable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Pavel Stehule
2010/8/3 Robert Haas robertmh...@gmail.com:
 On Sun, Aug 1, 2010 at 11:48 PM, Robert Haas robertmh...@gmail.com wrote:
 b) more robust algorithm for header rows identification

 Have not gotten to this one yet.

 I notIce that on WIN32 the default editor is notepad.exe and the
 default editor navigation option is /.  Does notepad.exe /lineno
 filename actually work on Windows?  A quick Google search suggests
 that the answer is no, which seems somewhat unfortunate: it means
 we'd be shipping broken on Win32 out of the box.

 http://www.robvanderwoude.com/commandlineswitches.php#Notepad

notapad supports nothing :( Microsoft doesn't use command line. I
found this syntax in pspad. Next other editor is notepad++. It use a
syntax -n. Wide used KDE editors use --line syntax


 This is actually my biggest concern about this patch - that it may be
 just too much of a hassle to actually make it work for people.  I just
 tried setting $EDITOR to MacOS's TextEdit program, and it turns out
 that TextEdit doesn't understand +.  I'm afraid that we're going to
 end up with a situation where it only works for people using emacs or
 vi, and everyone else ends up with a broken install (and, possibly, no
 clear idea how to fix it).  Perhaps it would be better to accept \sf
 and reject \sf+ and \ef func lineno.


\sf+ is base - we really need a source output with line numbers. \ef
foo func is just user very friendly function. I agree, so this topic
have to be better documented

 Assuming we get past that threshold issue, I'm also wondering whether
 the navigation option terminology is best; e.g. set
 PSQL_EDITOR_NAVIGATION_OPTION to configure it.  It doesn't seem
 terrible, but it doesn't seem great, either.  Anyone have a better
 idea?



 The docs are a little rough; they could benefit from some editing by a
 fluent English speaker.  If anyone has time to work on this, it would
 be much appreciated.

 Instead of line number is unacceptable, I think you should write
 invalid line number.

 dollar should not be spelled dolar.  function should not be
 spelled finction.

 This change looks suspiciously like magic.  If it's actually safe, it
 needs a comment explaining why:

 -       sys = pg_malloc(strlen(editorName) + strlen(fname) + 10 + 1);
 +       sys = pg_malloc(strlen(editorName) + strlen(fname) + 20 + 1);

 Attempting to compile with this patch applied emits a warning on my
 machine; whether or not the warning is valid, you need to make it go
 away:

 command.c: In function ‘HandleSlashCmds’:
 command.c:1055: warning: ‘bsep’ may be used uninitialized in this function
 command.c:1055: note: ‘bsep’ was declared here

 Why does the \sf output have a trailing blank line?  This seems weird,
 especially because \ef puts no such trailing blank line in the editor.

 Instead of:

 +       /* skip useles whitespaces */
 +       while (c = func)
 +               if (isblank(*c))
 +                       c--;
 +               else
 +                       break;

 ...wouldn't it be just as good to write:

 while (c = func  isblank(*c))
    c--;

 (and similarly elsewhere)

 In extract_separator, if you invert the sense of the first if-test,
 then you can avoid having to indent the entire function contents.

 --

I'' recheck these issue


Pavel

 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-02 Thread Pavel Stehule
2010/8/3 Robert Haas robertmh...@gmail.com:
 On Mon, Aug 2, 2010 at 11:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Aug 2, 2010 at 10:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm tempted
 to suggest forgetting about any user-configurable parameter and just
 provide code that strcmp's the $EDITOR value to see if it recognizes the
 editor name, otherwise do nothing.

 With all due respect, that sounds like an amazingly bad idea.  Surely,
 we'll be forever getting patches to add $MYFAVORITEEDITOR to the list,
 or complaints that it's not already included.

 Well, yeah, that's the idea.  I say that beats a constant stream of
 complaints that $MYFAVORITEEDITOR no longer works at all --- which
 is what your concern was, no?

 Well, it'd still work fine for \e foo.  It'll just blow up for \e foo
 3.  My concern isn't really that things that which work now will break
 so much as that this new feature will fail to work for a large
 percentage of the people who try to use it, including virtually
 everyone who tries to use it on Win32.

 While this is
 superficially a Nice Thing to Have and I would certainly support it if
 +linenumber were relatively universal, it's really a pretty minor
 convenience when you come right down to it, and I am not at all
 convinced it is worth the hassle of trying to divine what piece of
 syntax will equip the user's choice of editor with the necessary
 amount of clue.

 The other approach we could take is that this whole thing is disabled by
 default, and you have to set a psql variable EDITOR_LINENUMBER_SWITCH
 to turn it on.  If you haven't read the documentation enough to find
 out that variable exists, well, no harm no foul.

 That might be reasonable.  Right now the default behavior is to do
 +line on Linux and /line on Windows.  But maybe a more sensible
 default would be to fail with an error message that says you have to
 set thus-and-so variable if you want to use this feature.  That seems
 better than sometimes working and sometimes failing depending on what
 editor the user has configured.


I like this idea

 A side question is whether this should be an environment variable or a
 psql variable.

it can be just psql variable.

Pavel

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GROUPING SETS revisited

2010-08-02 Thread Joshua Tolley
In case anyone's interested, I've taken the CTE-based grouping sets patch from
[1] and made it apply to 9.1, attached. I haven't yet done things like checked
it for whitespace consistency, style conformity, or anything else, but (tuits
permitting) hope to figure out how it works and get it closer to commitability
in some upcoming commitfest.

I mention it here so that if someone else is working on it, we can avoid
duplicated effort, and to see if a CTE-based grouping sets implementation is
really the way we think we want to go.

[1] http://archives.postgresql.org/pgsql-hackers/2009-05/msg00700.php

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
diff --git a/src/backend/parser/Makefile b/src/backend/parser/Makefile
index a8f4c07..fb248a6 100644
*** a/src/backend/parser/Makefile
--- b/src/backend/parser/Makefile
*** override CPPFLAGS := -I. -I$(srcdir) $(C
*** 15,21 
  OBJS= analyze.o gram.o keywords.o kwlookup.o parser.o \
parse_agg.o parse_clause.o parse_coerce.o parse_cte.o parse_expr.o \
parse_func.o parse_node.o parse_oper.o parse_param.o parse_relation.o \
!   parse_target.o parse_type.o parse_utilcmd.o scansup.o
  
  FLEXFLAGS = -CF
  
--- 15,21 
  OBJS= analyze.o gram.o keywords.o kwlookup.o parser.o \
parse_agg.o parse_clause.o parse_coerce.o parse_cte.o parse_expr.o \
parse_func.o parse_node.o parse_oper.o parse_param.o parse_relation.o \
!   parse_target.o parse_type.o parse_utilcmd.o scansup.o parse_gsets.o
  
  FLEXFLAGS = -CF
  
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 6b99a10..1b579a8 100644
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
***
*** 34,39 
--- 34,40 
  #include parser/parse_clause.h
  #include parser/parse_coerce.h
  #include parser/parse_cte.h
+ #include parser/parse_gsets.h
  #include parser/parse_oper.h
  #include parser/parse_param.h
  #include parser/parse_relation.h
*** parse_sub_analyze(Node *parseTree, Parse
*** 150,155 
--- 151,313 
  }
  
  /*
+  * process GROUPING SETS
+  */
+ static SelectStmt *
+ makeSelectStmt(List *targetList, List *fromClause)
+ {
+ 	SelectStmt *n = makeNode(SelectStmt);
+ 	n-distinctClause = NULL;
+ 	n-intoClause = NULL;
+ 	n-targetList = targetList;
+ 	n-fromClause = fromClause;
+ 	n-whereClause = NULL;
+ 	n-groupClause = NULL;
+ 	n-havingClause = NULL;
+ 	n-windowClause = NIL;
+ 	n-withClause = NULL;
+ 	n-valuesLists = NIL;
+ 	n-sortClause = NIL;
+ 	n-limitOffset = NULL;
+ 	n-limitCount = NULL;
+ 	n-lockingClause = NIL;
+ 	n-op = SETOP_NONE;
+ 	n-all = false;
+ 	n-larg = NULL;
+ 	n-rarg = NULL;
+ 	return n;
+ }
+ 
+ static List *
+ makeStarTargetList(void)
+ {
+ 	ResTarget *rt = makeNode(ResTarget);
+ 	
+ 	rt-name = NULL;
+ 	rt-indirection = NIL;
+ 	rt-val = (Node *) makeNode(ColumnRef);
+ 	((ColumnRef *) rt-val)-fields = list_make1(makeNode(A_Star));
+ 	rt-location = -1;
+ 
+ 	return list_make1(rt);
+ }
+  
+ static SelectStmt *
+ transformGroupingSets(ParseState *pstate, SelectStmt *stmt)
+ {
+ 	if (stmt-groupClause  IsA(stmt-groupClause, GroupByClause))
+ 	{
+ 		GroupingSetsSpec *gss = (GroupingSetsSpec *) expandGroupingSets(pstate, 
+ 		(List *)((GroupByClause *)stmt-groupClause)-fields);
+ 	
+ 		if (pstate-p_hasGroupingSets)
+ 		{
+ 			CommonTableExpr *cte = makeNode(CommonTableExpr);
+ 			SelectStmt  *cteedstmt;
+ 			int	ngroupingsets = list_length(gss-set_list) + (gss-has_empty_set ? 1 : 0);
+ 			bool	all = ((GroupByClause *) stmt-groupClause)-all;
+ 		
+ 			cteedstmt = makeSelectStmt(NIL, NIL);
+ 			cteedstmt-intoClause = stmt-intoClause;
+ 			cteedstmt-sortClause = stmt-sortClause;
+ 			cteedstmt-limitOffset = stmt-limitOffset;
+ 			cteedstmt-limitCount = stmt-limitCount;
+ 			cteedstmt-lockingClause = stmt-lockingClause;
+ 		
+ 			cte-ctename = **g**;
+ 			cte-ctequery = (Node *) stmt;
+ 			cte-location = -1;
+ 		
+ 			cteedstmt-withClause = makeNode(WithClause);
+ 			cteedstmt-withClause-ctes = list_make1(cte);
+ 			cteedstmt-withClause-recursive = false;
+ 			cteedstmt-withClause-location = -1;
+ 		
+ 			/* when is more than one grouping set, then we should generate setop node */
+ 			if (ngroupingsets  1)
+ 			{
+ /* add quuery under union all for every grouping set */
+ SelectStmt	*larg = NULL;
+ SelectStmt	*rarg;
+ ListCell*lc;
+ 			
+ foreach(lc, gss-set_list)
+ {
+ 	List	*groupClause;
+ 
+ 	Assert(IsA(lfirst(lc), List));
+ 	groupClause = (List *) lfirst(lc);
+ 
+ 	if (larg == NULL)
+ 	{
+ 		larg = makeSelectStmt(copyObject(stmt-targetList),
+ 	list_make1(makeRangeVar(NULL, **g**, -1)));
+ 		larg-groupClause = (Node *) groupClause;
+ 		larg-havingClause = copyObject(stmt-havingClause);
+ 	}
+ 	else
+ 	{
+ 		SelectStmt	*setop = makeSelectStmt(NIL, NIL);
+ 	
+ 		rarg = makeSelectStmt(copyObject(stmt-targetList),
+