Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-09-01 Thread Sushant Sinha
I have attached a patch that emits parts of a host token, a url token,
an email token and a file token. Further, it makes sure that a
host/url/email/file token and the first part-token are at the same
position in tsvector.

The two major changes are:

1. Tokenization changes: The patch exploits the special handlers in the
text parser to reset the parser position to the start of a
host/url/email/file token when it finds one. Special handlers were
already used for extracting host and urlpath from a full url. So this is
more of an extension of the same idea.

2. Position changes: We do not advance position when we encounter a
host/url/email/file token. As a result the first part of that token
aligns with the token itself.
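For illustration, with the patch a document that contains only a URL now also
matches a query on one of its component words, because the sub-tokens are
indexed too (expected behavior, derived from the attached output below):

testdb=# SELECT to_tsvector('english', 'http://example.com/stuff/index.html')
                @@ to_tsquery('english', 'example');
 ?column?
----------
 t
(1 row)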

Attachments:

tokens_output.txt: sample queries and results with the patch
token_v1.patch: patch wrt CVS HEAD

Currently, the patch outputs parts of the tokens as normal tokens like
WORD, NUMWORD etc. Tom argued earlier that this will break
backward-compatibility and so they should be output as parts of the
respective tokens. If there is agreement with what Tom says, then the
current patch can be modified to output subtokens as parts. However,
before I complicate the patch with that, I wanted to get feedback on any
other major problems with the patch.
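For reference, if the sub-tokens were emitted as distinct part token types as
Tom suggests, the decision to index or discard them would become an ordinary
text search configuration mapping; a rough sketch (the token type name
host_part is hypothetical):

ALTER TEXT SEARCH CONFIGURATION english
    ADD MAPPING FOR host_part WITH simple;   -- index hostname components
-- or leave host_part unmapped, in which case those parts are simply discarded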

-Sushant.

On Mon, 2010-08-02 at 10:20 -0400, Tom Lane wrote:
 Sushant Sinha sushant...@gmail.com writes:
  This would needlessly increase the number of tokens. Instead you'd 
  better make it work like compound word support, having just wikipedia 
  and org as tokens.
 
  The current text parser already returns url and url_path. That already
  increases the number of unique tokens. I am only asking for adding of
  normal english words as well so that if someone types only wikipedia
  he gets a match. 
 
 The suggestion to make it work like compound words is still a good one,
 ie given wikipedia.org you'd get back
 
   host        wikipedia.org
   host-part   wikipedia
   host-part   org
 
 not just the host token as at present.
 
 Then the user could decide whether he needed to index hostname
 components or not, by choosing whether to forward hostname-part
 tokens to a dictionary or just discard them.
 
 If you submit a patch that tries to force the issue by classifying
 hostname parts as plain words, it'll probably get rejected out of
 hand on backwards-compatibility grounds.
 
   regards, tom lane

1. FILEPATH

testdb=# SELECT ts_debug('/stuff/index.html');
                                    ts_debug
---------------------------------------------------------------------------------
 (file,File or path name,/stuff/index.html,{simple},simple,{/stuff/index.html})
 (blank,Space symbols,/,{},,)
 (asciiword,Word, all ASCII,stuff,{english_stem},english_stem,{stuff})
 (blank,Space symbols,/,{},,)
 (asciiword,Word, all ASCII,index,{english_stem},english_stem,{index})
 (blank,Space symbols,.,{},,)
 (asciiword,Word, all ASCII,html,{english_stem},english_stem,{html})


SELECT to_tsvector('english', '/stuff/index.html');
                     to_tsvector
-----------------------------------------------------
 '/stuff/index.html':0 'html':2 'index':1 'stuff':0
(1 row)

2. URL

testdb=# SELECT ts_debug('http://example.com/stuff/index.html');
                                       ts_debug
----------------------------------------------------------------------------------------
 (protocol,Protocol head,http://,{},,)
 (url,URL,example.com/stuff/index.html,{simple},simple,{example.com/stuff/index.html})
 (host,Host,example.com,{simple},simple,{example.com})
 (asciiword,Word, all ASCII,example,{english_stem},english_stem,{exampl})
 (blank,Space symbols,.,{},,)
 (asciiword,Word, all ASCII,com,{english_stem},english_stem,{com})
 (url_path,URL path,/stuff/index.html,{simple},simple,{/stuff/index.html})
 (blank,Space symbols,/,{},,)
 (asciiword,Word, all ASCII,stuff,{english_stem},english_stem,{stuff})
 (blank,Space symbols,/,{},,)
 (asciiword,Word, all ASCII,index,{english_stem},english_stem,{index})
 (blank,Space symbols,.,{},,)
 (asciiword,Word, all ASCII,html,{english_stem},english_stem,{html})
(13 rows)

testdb=# SELECT to_tsvector('english', 'http://example.com/stuff/index.html');
                                                      to_tsvector
------------------------------------------------------------------------------------------------------------------------
 '/stuff/index.html':2 'com':1 'exampl':0 'example.com':0 'example.com/stuff/index.html':0 'html':4 'index':3 'stuff':2

3. EMAIL

testdb=# SELECT ts_debug('sush...@foo.bar');
  ts_debug   
-
 (email,Email 

Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Thom Brown
On 1 September 2010 06:45, David E. Wheeler da...@kineticode.com wrote:
 The aggregate docs say:

 The first form of aggregate expression invokes the aggregate across all 
 input rows for which the given expression(s) yield non-null values. 
 (Actually, it is up to the aggregate function whether to ignore null values 
 or not — but all the standard ones do.)

 -- 
 http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html#SYNTAX-AGGREGATES

 That, however, is not true of array_agg():

 try=# CREATE TABLE foo(id int);
 CREATE TABLE
 try=# INSERT INTO foo values(1), (2), (NULL), (3);
 INSERT 0 4
 try=# select array_agg(id) from foo;
  array_agg
 ──
  {1,2,NULL,3}
 (1 row)

 So are the docs right, or is array_agg() right?

I think it might be both.  array_agg doesn't return NULL, it returns
an array which contains NULL.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935



Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David E. Wheeler
On Aug 31, 2010, at 11:56 PM, Thom Brown wrote:

 The first form of aggregate expression invokes the aggregate across all 
 input rows for which the given expression(s) yield non-null values. 
 (Actually, it is up to the aggregate function whether to ignore null values 
 or not — but all the standard ones do.)
 
 -- 
 http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html#SYNTAX-AGGREGATES
 
 That, however, is not true of array_agg():
 
 try=# CREATE TABLE foo(id int);
 CREATE TABLE
 try=# INSERT INTO foo values(1), (2), (NULL), (3);
 INSERT 0 4
 try=# select array_agg(id) from foo;
  array_agg
 ──
  {1,2,NULL,3}
 (1 row)
 
 So are the docs right, or is array_agg() right?
 
 I think it might be both.  array_agg doesn't return NULL, it returns
 an array which contains NULL.

No, string_agg() doesn't work this way, for example:

select string_agg(id::text, ',') from foo;
 string_agg 

 1,2,3
(1 row)

Note that it's not:

select string_agg(id::text, ',') from foo;
 string_agg 

 1,2,,3
(1 row)

Best,

David




Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

2010-09-01 Thread Heikki Linnakangas

On 31/08/10 15:47, Fujii Masao wrote:

On Fri, Aug 27, 2010 at 4:39 PM, Fujii Masao masao.fu...@gmail.com wrote:

/*
  * XXX: Should we invent an API to wait for data coming from the
  * client connection too? It's not critical, but we could then
  * eliminate the timeout altogether and go to sleep for good.
  */


Yes, it would be very helpful when walsender waits for the ACK from
the standby in upcoming synchronous replication.


I propose to change WaitLatch() so that it accepts the socket
descriptor as an argument, to wait for data coming from the
client connection.


Yeah, we probably should do that now.


 WaitLatch() monitors not only the self-pipe
but also the given socket. If -1 is supplied, it checks only
the self-pipe.


The obvious next question is how to wait for multiple sockets and a 
latch at the same time? Perhaps we should have a select()-like interface 
where you can pass multiple file descriptors. Then again, looking at the 
current callers of select() in the backend, apart from postmaster they 
all wait for only one fd.



The socket should be monitored by using poll() if the platform
has it, since poll() is usually more efficient.


Yeah, I guess.


So I'd like to change Unix implementation of WaitLatch() as
follows. Thought?

---
#define WaitLatch(latch, timeout) WaitLatchAndSocket(latch, -1, timeout)

void WaitLatchAndSocket(Latch *latch, int sock, long timeout)
{
    ...

    FD_SET(selfpipe_readfd, &input_mask);
    if (sock != -1)
        FD_SET(sock, &input_mask);

#ifdef HAVE_POLL
    poll(...)
#else
    select(...)
#endif   /* HAVE_POLL */

    ...
}
---


Yep.
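To make that concrete, a caller such as walsender could keep following the
usual latch protocol and simply pass its client socket; a rough sketch only,
with hypothetical field names:

    long        timeout = 10000000L;    /* whatever unit WaitLatch ends up using */

    for (;;)
    {
        ResetLatch(&MyWalSnd->latch);   /* hypothetical Latch field in the WalSnd slot */

        /* ... check for new WAL to send, an ACK from the standby, shutdown ... */

        /* sleep until the latch is set or data arrives on the client socket */
        WaitLatchAndSocket(&MyWalSnd->latch, MyProcPort->sock, timeout);
    }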


Windows implementation of WaitLatchAndSocket() seems not to be
so simple. We would need to wait for both the latch event and
the packet from the socket by using WaitForMultipleObjectsEx().


Well, we already use WaitForMultipleObjectsEx() to implement select() on 
Windows, so it should be straightforward to copy that. I'll look into that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Pavel Stehule
2010/9/1 David E. Wheeler da...@kineticode.com:
 The aggregate docs say:

 The first form of aggregate expression invokes the aggregate across all 
 input rows for which the given expression(s) yield non-null values. 
 (Actually, it is up to the aggregate function whether to ignore null values 
 or not — but all the standard ones do.)

 -- 
 http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html#SYNTAX-AGGREGATES

 That, however, is not true of array_agg():

 try=# CREATE TABLE foo(id int);
 CREATE TABLE
 try=# INSERT INTO foo values(1), (2), (NULL), (3);
 INSERT 0 4
 try=# select array_agg(id) from foo;
  array_agg
 ──
  {1,2,NULL,3}
 (1 row)

 So are the docs right, or is array_agg() right?

The docs are wrong :) I like the current implementation. You can remove NULLs
from the aggregation very simply, but the other direction isn't possible

Regards
Pavel Stehule

 Best,

 David






Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-01 Thread Fujii Masao
On Wed, Sep 1, 2010 at 2:33 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Once we're done with that, all the big questions are still left.

Yeah, let's discuss those topics :)

 How to configure it?

Before discussing that, we should determine whether registering
standbys in the master is really required. It affects the configuration a lot.
Heikki thinks that it's required, but I'm still unclear about why and
how.

Why do standbys need to be registered in the master? What information
should be registered?

 What does synchronous replication mean, when is a transaction
 acknowledged as committed?

I proposed four synchronization levels:

1. async
  doesn't make transaction commit wait for replication, i.e.,
  asynchronous replication. This mode is already supported in 9.0.

2. recv
  makes transaction commit wait until the standby has received WAL
  records.

3. fsync
  makes transaction commit wait until the standby has received and
  flushed WAL records to disk

4. replay
  makes transaction commit wait until the standby has replayed WAL
  records after receiving and flushing them to disk

OTOH, Simon proposed the quorum commit feature. I think that both
are required for our various use cases. Thoughts?

 What to do if a standby server dies and never
 acknowledges a commit?

The master's reaction to that situation should be configurable. So
I'd propose a new configuration parameter specifying the reaction.
Valid values are:

- standalone
  When the master has waited for the ACK much longer than the timeout
  (or detected the failure of the standby), it closes the connection
  to the standby and restarts transactions.

- down
  When that situation occurs, the master shuts down immediately.
  Though this is unsafe for systems requiring high availability,
  as far as I recall some people wanted this mode in the previous
  discussion.
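Put together, that could end up looking something like this in postgresql.conf
(the parameter names below are purely illustrative, not a concrete proposal):

    synchronous_replication = recv            # async | recv | fsync | replay
    replication_timeout = 30s                 # how long to wait for the standby's ACK
    replication_timeout_action = standalone   # standalone | down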

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Thom Brown
On 1 September 2010 07:56, Thom Brown t...@linux.com wrote:
 On 1 September 2010 06:45, David E. Wheeler da...@kineticode.com wrote:
 The aggregate docs say:

 The first form of aggregate expression invokes the aggregate across all 
 input rows for which the given expression(s) yield non-null values. 
 (Actually, it is up to the aggregate function whether to ignore null values 
 or not — but all the standard ones do.)

 -- 
 http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html#SYNTAX-AGGREGATES

 That, however, is not true of array_agg():

 try=# CREATE TABLE foo(id int);
 CREATE TABLE
 try=# INSERT INTO foo values(1), (2), (NULL), (3);
 INSERT 0 4
 try=# select array_agg(id) from foo;
  array_agg
 ──
  {1,2,NULL,3}
 (1 row)

 So are the docs right, or is array_agg() right?

 I think it might be both.  array_agg doesn't return NULL, it returns
 an array which contains NULL.

The second I wrote that, I realised it was b*ll%$ks, as I was still in
the process of waking up.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935



Re: [HACKERS] leaky views, yet again

2010-09-01 Thread KaiGai Kohei
(2010/07/21 19:35), KaiGai Kohei wrote:
 (2010/07/21 19:26), Robert Haas wrote:
2010/7/21 KaiGai Kohei kai...@ak.jp.nec.com:
 On the other hand, if it's enough from a performance
 point of view to review and mark only a few built-in functions like
 index operators, maybe it's ok.

 I also think it is a worthful idea to try as a proof-of-concept.

 Yeah. So, should we mark this patch as Returned with Feedback, and
 you can submit a proof-of-concept patch for the next CF?

 Yes, it's fair enough.
 
The attached patch is a proof-of-concept one, following Heikki's earlier
suggestion.

Right now, it stands on the strict assumption that operators implemented
with built-in functions are safe, i.e. that they have no possibility of
leaking the supplied arguments anywhere.

This patch modifies the logic by which the planner decides where a
given qualifier clause should be distributed.

If the clause does not contain any leakable functions, nothing changes:
the clause is pushed down inside the join if it depends on only one
side of the join.

Otherwise, if the clause contains any leakable functions, this patch
prevents pushing the clause down into the join, because the clause
needs to be evaluated after the join condition. Without that
restriction, the leakable function could expose contents that are meant
to be invisible to users behind a view used for row-level security.

Example

postgres=# CREATE OR REPLACE FUNCTION f_policy(int)
   RETURNS bool LANGUAGE 'plpgsql'
   AS 'begin return $1 % 2 = 0; end;';
CREATE FUNCTION
postgres=# CREATE OR REPLACE FUNCTION f_malicious(text)
   RETURNS bool LANGUAGE 'plpgsql' COST 0.001
   AS 'begin raise notice ''leak: %'', $1; return true; end';
CREATE FUNCTION
postgres=# CREATE TABLE t1 (a int primary key, b text);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t1_pkey for table t1
CREATE TABLE
postgres=# CREATE TABLE t2 (x int primary key, y text);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t2_pkey for table t2
CREATE TABLE
postgres=# INSERT INTO t1 VALUES (1,'aaa'), (2,'bbb'), (3,'ccc');
INSERT 0 3
postgres=# INSERT INTO t2 VALUES (2,'xxx'), (3,'yyy'), (4,'zzz');
INSERT 0 3
postgres=# CREATE VIEW v AS SELECT * FROM t1 JOIN t2 ON f_policy(a+x);
CREATE VIEW

* SELECT * FROM v WHERE f_malicious(b);

[without this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE f_malicious(b);
                             QUERY PLAN
--------------------------------------------------------------------
 Nested Loop  (cost=0.00..133685.13 rows=168100 width=72)
   Join Filter: f_policy((t1.a + t2.x))
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Materialize  (cost=0.00..24.35 rows=410 width=36)
         ->  Seq Scan on t1  (cost=0.00..22.30 rows=410 width=36)
               Filter: f_malicious(b)
(6 rows)
 ==> f_malicious() may raise a notice about invisible tuples.

[with this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE f_malicious(b);
                             QUERY PLAN
---------------------------------------------------------------------
 Nested Loop  (cost=0.00..400969.96 rows=168100 width=72)
   Join Filter: (f_malicious(t1.b) AND f_policy((t1.a + t2.x)))
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
   ->  Materialize  (cost=0.00..28.45 rows=1230 width=36)
         ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(5 rows)
 ==> f_malicious() is moved outside of the join.
 It is evaluated earlier than f_policy() at the same level because of
 its lower function cost, but that is a separate matter.


* SELECT * FROM v WHERE a = 2;
[without this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE a = 2;
                              QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..353.44 rows=410 width=72)
   Join Filter: f_policy((t1.a + t2.x))
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (a = 2)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(5 rows)

[with this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE a = 2;
                              QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..353.44 rows=410 width=72)
   Join Filter: f_policy((t1.a + t2.x))
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (a = 2)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(5 rows)

  ==> a = 2 uses a built-in operator, so we assume it is safe.
  This clause was pushed down into the join and then used for an
  index scan.


* SELECT * FROM v WHERE a::text = 'a';

[without this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE a::text = '2';
   QUERY PLAN

 Nested Loop  

Re: I: [HACKERS] About Our CLUSTER implementation is pessimal patch

2010-09-01 Thread Itagaki Takahiro
On Tue, Aug 31, 2010 at 8:04 PM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 * How to determine which method was used?
  We can get some information from trace_sort logs,
  but CLUSTER VERBOSE would be better to log
  which CLUSTER method was used.

I wrote a patch to improve CLUSTER VERBOSE (and VACUUM FULL VERBOSE).
The patch should be applied after sorted_cluster-20100721.patch .

* clustering schema.table by index scan on index
* clustering schema.table by sequential scan and sort

It also adds VACUUM VERBOSE-like logs:
INFO:  table: found 1 removable, 11 nonremovable row versions in 1655 pages
DETAIL:  1 dead row versions cannot be removed yet.
CPU 0.03s/0.06u sec elapsed 0.21 sec.
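For reference, these messages would come from runs such as the following
(table and index names are placeholders):

CLUSTER VERBOSE schema.table USING some_index;
VACUUM FULL VERBOSE schema.table;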

Note that the patch says nothing about reindexing. But if
required, I'm willing to add some VERBOSE messages for
indexes (ex. REINDEX VERBOSE)

-- 
Itagaki Takahiro


cluster_verbose-20100901.patch
Description: Binary data



Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-01 Thread Heikki Linnakangas

On 01/09/10 10:53, Fujii Masao wrote:

Before discussing about that, we should determine whether registering
standbys in master is really required. It affects configuration a lot.
Heikki thinks that it's required, but I'm still unclear about why and
how.

Why do standbys need to be registered in master? What information
should be registered?


That requirement falls out from the handling of disconnected standbys. 
If a standby is not connected, what does the master do with commits? If
the answer is anything other than acknowledging them to the client
immediately, as if the standby never existed, the master needs to know
what standby servers exist. Otherwise it can't know if all the standbys 
are connected or not.



What does synchronous replication mean, when is a transaction
acknowledged as committed?


I proposed four synchronization levels:

1. async
   doesn't make transaction commit wait for replication, i.e.,
   asynchronous replication. This mode has been already supported in
   9.0.

2. recv
   makes transaction commit wait until the standby has received WAL
   records.

3. fsync
   makes transaction commit wait until the standby has received and
   flushed WAL records to disk

4. replay
   makes transaction commit wait until the standby has replayed WAL
   records after receiving and flushing them to disk

OTOH, Simon proposed the quorum commit feature. I think that both
is required for various our use cases. Thought?


I'd like to keep this as simple as possible, yet flexible so that with 
enough scripting and extensions, you can get all sorts of behavior. I 
think quorum commit falls into the extension category; if your setup
is complex enough, it's going to be impossible to represent that in our 
config files no matter what. But if you write a little proxy, you can 
implement arbitrary rules there.


I think recv/fsync/replay should be specified in the standby. It has no
direct effect on the master: the standby would just relay the setting to
the master when it connects, or the standby would send multiple
XLogRecPtrs and let the master decide when the WAL is persistent enough.
And what if you write a proxy that has some other meaning of persistent 
enough? Like when it has been written to the OS buffers but not yet 
fsync'd, or when it has been fsync'd to at least one standby and 
received by at least three others. recv/fsync/replay is not going to 
represent that behavior well.


sync vs async on the other hand should be specified in the master, 
because it has a direct impact on the behavior of commits in the master.


I propose a configuration file standbys.conf, in the master:

# STANDBY NAME      SYNCHRONOUS   TIMEOUT
importantreplica    yes           100ms
tempcopy            no            10s

Or perhaps this should be stored in a system catalog.


What to do if a standby server dies and never
acknowledges a commit?


The master's reaction to that situation should be configurable. So
I'd propose new configuration parameter specifying the reaction.
Valid values are:

- standalone
   When the master has waited for the ACK much longer than the timeout
   (or detected the failure of the standby), it closes the connection
   to the standby and restarts transactions.

- down
   When that situation occurs, the master shuts down immediately.
   Though this is unsafe for the system requiring high availability,
   as far as I recall, some people wanted this mode in the previous
   discussion.


Yeah, though of course you might want to set that per-standby too..


Let's step back a bit and ask what would be the simplest thing that you 
could call synchronous replication in good conscience, and also be 
useful at least to some people. Let's leave out the down mode, because 
that requires registration. We'll probably have to do registration at 
some point, but let's take as small steps as possible.


Without the down mode in the master, frankly I don't see the point of 
the recv and fsync levels in the standby. Either way, when the 
master acknowledges a commit to the client, you don't know if it has 
made it to the standby yet because the replication connection might be 
down for some reason.


That leaves us the 'replay' mode, which *is* useful, because it gives 
you the guarantee that when the master acknowledges a commit, it will 
appear committed in all hot standby servers that are currently 
connected. With that guarantee you can build a reliable cluster with 
something like pgpool-II where all writes go to one node, and reads are
distributed to multiple nodes.


I'm not sure what we should aim for in the first phase. But if you want 
as little code as possible yet have something useful, I think 'replay' 
mode with no standby registration is the way to go.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] git: uh-oh

2010-09-01 Thread Magnus Hagander
On Wed, Sep 1, 2010 at 02:33, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Aug 31, 2010 at 5:07 PM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Aug 31, 2010 at 19:46, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Aug 31, 2010 at 19:44, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Ok. I've got a new migration runinng. Here's the revisions removed:
 RCS file: /usr/local/cvsroot/pgsql/src/backend/parser/Attic/gram.c,v
 deleting revision 2.88
 RCS file: 
 /usr/local/cvsroot/pgsql/src/interfaces/ecpg/preproc/Attic/pgc.c,v
 deleting revision 1.2
 RCS file: 
 /usr/local/cvsroot/pgsql/src/interfaces/ecpg/preproc/Attic/preproc.c,v
 deleting revision 1.6

 Hmm, it looks like you deleted the file deletion events (the versions
 cited above).  Not sure this is the right thing.  Check to see if the
 files are still there according to the converted git history ...

 Oh, drat. That's right. It shouldn't have been inclusive :S

 I'll abort the conversion and run it again :)

 Ok, I've pushed a clone of the new repository with these modifications to:

 http://git.postgresql.org/gitweb?p=git-migration-test.git;a=summary

 Haven't had the time to dig into it yet, so please go ahead anybody
 who wants to :-)

 That definitely didn't fix it, although I'm not quite sure why.  Can
 you throw the modified CVS you ran this off of up somewhere I can
 rsync it?

no rsync server on that box, but I put up a tarball for you at
http://www.hagander.net/pgsql/cvsrepo.tgz


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[HACKERS] Path question

2010-09-01 Thread Boszormenyi Zoltan
Hi,

we are experimenting with modifying table partitioning
so the ORDER BY clause can be pushed down to
child nodes on the grounds that:
1. smaller datasets are faster to sort, e.g. two datasets that almost
spill out to disk are faster to sort in memory and later merge them
than the union set that spills out to disk, or two larger sets
that spill out to disk are faster to sort individually than the
union dataset (because of the longer seeks, etc)
2. individual child nodes can have indexes that produce
the sorted output already

Currently I am abusing the AppendPath node but later I will add a new
executor node that will merge the sorted input coming from the children.

I added the pathkey to the AppendPath to reflect that it produces
sorted output and I am adding the Sort plan to the children.

My problem is:

zozo=# explain select * from t1 where d = '2008-01-01' order by d;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Result  (cost=8.28..33.13 rows=4 width=40)
   ->  Append  (cost=8.28..33.13 rows=4 width=40)
         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_d_key on t1  (cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)
         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_2008_d_key on t1_2008 t1  (cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)
         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_2009_d_key on t1_2009 t1  (cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)
         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_2010_d_key on t1_2010 t1  (cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)
(18 rows)

In one leaf, e.g.:

         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_2010_d_key on t1_2010 t1  (cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)

The plan is scanning the t1_2010 child table, but the Sort is trying to
sort by the fully qualified parent table. I think this is a problem
but I don't know how to solve it. I have tried transforming the
parent query with

 adjust_appendrel_attrs((Node *) parse, appinfo)

where parse is

 Query  *parse = root-parse;

in set_append_rel_pathlist() and the transformed query trees are
used for the children with

make_sort_from_sortclauses(root, query-sortClause, subplan)

in create_append_plan(). adjust_appendrel_attrs() seems to be prepared
to be called with a Query * , so I don't know why the above leaf plan
doesn't show Sort Key: public.t1_2010.d and so on.

Can someone help me?

Best regards,
Zoltán Böszörményi




Re: [HACKERS] Path question

2010-09-01 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 we are experimenting with modifying table partitioning
 so the ORDER BY clause can be pushed down to
 child nodes on the grounds that:

This is really premature, and anything you do along those lines now will
probably never get committed.  The problem is that the transformation
you propose is wrong unless the planner can prove that the different
child tables contain nonoverlapping ranges of the sort key.  Now you
might be intending to add logic to try to prove that from inspection of
constraints, but I don't believe that reverse-engineering such knowledge
on the fly is a sane approach: it will be hugely expensive and will add
that cost even in many situations where the optimization fails to apply.

The project direction is that we are going to add some explicit
representation of partitioned tables.  After that, the planner can just
know immediately that a range-partitioned sort key is amenable to this
treatment, and at that point it'll make sense to work on it.

regards, tom lane



Re: [HACKERS] Path question

2010-09-01 Thread Greg Stark
2010/9/1 Boszormenyi Zoltan z...@cybertec.at:
 we are experimenting with modifying table partitioning
 so the ORDER BY clause can be pushed down to
 child nodes on the grounds that:
 1. smaller datasets are faster to sort, e.g. two datasets that almost
    spill out to disk are faster to sort in memory and later merge them
    than the union set that spills out to disk, or two larger sets
    that spill out to disk are faster to sort individually than the
    union dataset (because of the longer seeks, etc)

For what it's worth this logic doesn't necessarily hold. Sorting two
data sets in memory is only faster if you only need one result set at
a time. If you know the first set contains only records which come
before the second then you can do this but if not then you'll need all
the result sets to be available at the same time which will require
spilling them to disk. If you have to do that then you might as well
do a tapesort anyways.

 2. individual child nodes can have indexes that produce
    the sorted output already

This is the big win. Currently Append nodes mean throwing out any
ordering from the child nodes and that's important information to be
losing.

 Currently I am abusing the AppendPath node but later I will add a new
 executor node that will merge the sorted input coming from the children.

You realize I already did this a few years ago. Search for merge
append on the lists. The hard part -- as usual -- ends up being
figuring out in the planner when to do this optimization and how to
avoid exponential growth in the number of plans considered and the
amount of data retained about each plan.


For what it's worth I disagree with Tom. I think this is a situation
where we need *both* types of solution. Ideally we will be able to use
a plain Append node for cases where we know the relative ordering of
the data in different partitions, but there will always be cases where
the structured partition data doesn't actually match up with the
ordering requested and we'll need to fall back to a merge-append node.

-- 
greg



Re: [HACKERS] Path question

2010-09-01 Thread PostgreSQL - Hans-Jürgen Schönig
On Sep 1, 2010, at 4:10 PM, Tom Lane wrote:

 Boszormenyi Zoltan z...@cybertec.at writes:
 we are experimenting with modifying table partitioning
 so the ORDER BY clause can be pushed down to
 child nodes on the grounds that:
 
 This is really premature, and anything you do along those lines now will
 probably never get committed.  The problem is that the transformation
 you propose is wrong unless the planner can prove that the different
 child tables contain nonoverlapping ranges of the sort key.  Now you
 might be intending to add logic to try to prove that from inspection of
 constraints, but I don't believe that reverse-engineering such knowledge
 on the fly is a sane approach: it will be hugely expensive and will add
 that cost even in many situations where the optimization fails to apply.
 


well, why non-overlapping? the idea is to make append smart enough to take the 
sorted lists from below and merge them which will give sorted output as well.
my original idea was what you described but given Martijn van Oosterhout's 
posting we were pretty confident that we can get along without non-overlapping 
partitions.


 The project direction is that we are going to add some explicit
 representation of partitioned tables.  After that, the planner can just
 know immediately that a range-partitioned sort key is amenable to this
 treatment, and at that point it'll make sense to work on it.
 


can you outline some ideas here and maybe point to some useful discussion here?


many thanks,

hans


--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de




[HACKERS] Re: ECPG dynamic cursor fix for UPDATE/DELETE ... WHERE CURRENT OF :curname

2010-09-01 Thread Michael Meskes
 1. The statement
 
 UPDATE table SET fld1 = :input1
 WHERE CURRENT OF :curname
 RETURNING id + :input2;
 
 is transformed into
 
 UPDATE table SET fld1 = $1
 WHERE CURRENT OF $0
 RETURNING id + $2;
 
 and the $0 is past $1. The current code cannot deal with such
 a messed up order, and scanning the original query twice is
 needed, once for $0 substitution, once for mapping $1, etc. to
 the other input variables.

I cannot seem to reproduce this bug. Could you please send me an example that
makes this reproducible? Yes, I know that I have to change preproc.y to allow
for variable cursor names but in my test case everything seems to work well and
$0 gets replaced by the cursor name.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: register/unregister standby Re: [HACKERS] Synchronous replication

2010-09-01 Thread Heikki Linnakangas

On 30/08/10 15:14, Fujii Masao wrote:

I think that the advantage of registering standbys is that we can
specify which WAL files the master has to keep for the upcoming
standby. IMO, it's usually called together with pg_start_backup
as follows:

 SELECT register_standby('foo', pg_start_backup())

This requests that the master keep all the WAL files following the
backup start location that pg_start_backup returns.


Hmm, that's clever. I was thinking that you'd initialize the standby 
from an existing backup, and in that context the standby would not need 
to connect to the master except via the replication connection. To take 
a base backup, you'll need not only that but also access to the 
filesystem in the master, ie. shell access.


There's been some talk of being able to stream a base backup over the 
replication connection too, which would be extremely handy. And that 
makes my point even stronger that registering a standby should be 
possible via the replication connection.


Of course, you could well expose the functionality as both a built-in 
function and a command in replication mode, so this detail isn't really 
that important right now.



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Path question

2010-09-01 Thread Tom Lane
PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at writes:
 On Sep 1, 2010, at 4:10 PM, Tom Lane wrote:
 This is really premature, and anything you do along those lines now will
 probably never get committed.

 well, why non-overlapping? the idea is to make append smart enough to
 take the sorted lists from below and merge them which will give sorted
 output as well.

Well, an extra merge step is going to change the cost comparisons quite
a bit; see Greg Stark's comments.  But in any case, my point wasn't that
this is something we should never do; it was that it makes more sense to
wait till something has happened with explicit partitioning.

 The project direction is that we are going to add some explicit
 representation of partitioned tables.

 can you outline some ideas here and maybe point to some useful discussion 
 here?

There's been boatloads of discussion of partitioning, and at least one
submitted patch, over the past year or so ...

regards, tom lane



[HACKERS] serializable in comments and names

2010-09-01 Thread Kevin Grittner
There are many comments in the code which use serializable
transaction to mean transaction snapshot based transaction.  This
doesn't matter much as long as REPEATABLE READ and SERIALIZABLE
transaction isolation levels behave identically, but will be
confusing and inaccurate when there is any difference between the
two.  In a similar way, the static bool registered_serializable in
snapmgr.c will become misleading, and the IsXactIsoLevelSerializable
macro is bound to lead to confusion.
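To illustrate with the macro: assuming the current definition in
access/xact.h, the change itself is a one-liner, and the comment shows
why the present name becomes misleading once the two levels diverge
(the new name is just the one floated below, nothing final):

    /* today: true for both REPEATABLE READ and SERIALIZABLE */
    #define IsXactIsoLevelSerializable \
        (XactIsoLevel >= XACT_REPEATABLE_READ)

    /* after the renaming patch, e.g.: */
    #define IsXactIsoLevelXactSnapshotBased \
        (XactIsoLevel >= XACT_REPEATABLE_READ)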
 
The patch to implement true serializable transactions using SSI
renames/rewords these things to avoid confusion.  However, there are
seven files which are changed only for this reason, and another
where there is one real change of two lines hidden in the midst of
dozens of lines of such wording changes.  I find it distracting to
have all this mixed in, and I fear that it will be a time-waster for
anyone reviewing the meat of the patch.  I'd like to submit a
no-op patch to cover these issues in advance of the CF, to get
them out of the way.  Joe suggested that I pose the issue to the
-hackers list.
 
I could knock out a couple other files from the main patch if people
considered it acceptable to enable the SHMQueueIsDetached function
now, which the patch uses in several places within asserts.  I would
remove the #ifdef NOT_USED from around the (very short) function,
and add it to the .h file.
 
The changes to the comments and local variables seem pretty safe. 
The change of IsXactIsoLevelSerializable to
IsXactIsoLevelXactSnapshotBased (or whatever name the community
prefers) could break the compile of external code; but the fix would
be pretty obvious.  It's hard to see how the change of a local
variable name would present a lot of risk.  So I see this as an
extremely low-risk no-op patch to lay the groundwork for the main
patch, so that it is easier to read.
 
Thoughts?
 
-Kevin



Re: [HACKERS] [BUGS] BUG #5305: Postgres service stops when closing Windows session

2010-09-01 Thread Dave Page
On Wed, Sep 1, 2010 at 3:49 PM, Cristian Bittel cbit...@gmail.com wrote:
 Maybe the issue, for the moment, could be avoided by modifying the shared heap
 for sessions on Windows. But I don't really have an idea how much to increase
 or decrease the values. Trial and error? But, inside the opened Windows
 sessions nothing alerts of heap exhaustion, so it could be unpredictable how much
 to change the values until the next PostgreSQL service crash...
 32-bits: http://support.microsoft.com/kb/184802

 There are several reports of other services with the same behavior,
 including exit code 128, and a workaround to increase the heap on old Windows
 versions, but the Exit Code 128 seems to apply to Windows 2003 Server x64
 as well. It seems to be improved in Windows 2008, where the heap is not fixed.
 https://fogbugz.bitvise.com/default.asp?WinSSHD.1.12888.2
 http://support.microsoft.com/kb/824422

Given the unpredictability, if this is connected to desktop heap I
don't think it's running out of per-session memory, so much as the
system-wide heap (which, afaict, is fixed at 48MB). That might explain
why a desktop session could affect other sessions.

Is this a terminal server, with lots of interactive users? Can you
check the heap usage using the desktop heap monitor:
http://www.microsoft.com/downloads/details.aspx?familyid=5cfc9b74-97aa-4510-b4b9-b2dc98c8ed8b&displaylang=en


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David E. Wheeler
On Sep 1, 2010, at 12:30 AM, Pavel Stehule wrote:

 Docs is wrong :) I like current implementation. You can remove a NULLs
 from aggregation very simply, but different direction isn't possible

Would appreciate the recipe for removing the NULLs.

Best,

David





Re: [HACKERS] [BUGS] BUG #5305: Postgres service stops when closing Windows session

2010-09-01 Thread Cristian Bittel
Maybe the issue, for the moment, could be avoided by modifying the shared heap
for sessions on Windows. But I don't really have an idea how much to increase
or decrease the values. Trial and error? But, inside the opened Windows
sessions nothing alerts of heap exhaustion, so it could be unpredictable how much
to change the values until the next PostgreSQL service crash...
32-bits: http://support.microsoft.com/kb/184802

There are several reports of other services with the same behavior,
including exit code 128, and a workaround to increase the heap on old Windows
versions, but the Exit Code 128 seems to apply to Windows 2003 Server x64
as well. It seems to be improved in Windows 2008, where the heap is not fixed.
https://fogbugz.bitvise.com/default.asp?WinSSHD.1.12888.2
http://support.microsoft.com/kb/824422





2010/8/31 Bruce Momjian br...@momjian.us

 Dave Page wrote:
  On Tue, Aug 31, 2010 at 4:35 PM, Bruce Momjian br...@momjian.us wrote:
   Dave Page wrote:
   On Tue, Aug 31, 2010 at 4:27 PM, Bruce Momjian br...@momjian.us
 wrote:
We have already found that exceeding desktop heap might cause a
CreateProcess to return success but later fail with a return code of
128, which causes a server restart.
  
   That doesn't mean that this is desktop heap exhaustion though - just
   that it can cause the same effect.
  
   Right, but it is the only possible server crash cause we have come up
   with so far.
 
  Understood - I'm just unconvinced it's the cause - aside from the
  point I made earlier about heap exhaustion being very predictable and
  reproducible (which this issue apparently is not), when the server is
  run under the SCM, it creates a logon session for that service alone
  which has it's own heap allocation which is entirely independent of
  the allocation used by any interactive logon sessions.
 
  So unless there's a major isolation bug in Windows, any desktop heap
  usage in an interactive session for one user should have zero effect
  on a non-interactive session for another user.

 Well, the only description that we have ever heard that makes sense is
 some kind of heap exhaustion, perhaps triggered by a Windows bug that
 doesn't properly track heap allocations sometimes.

 Of course, the cause might be aliens, but we don't have any evidence of
 that either.  :-|

 What we do know is that CreateProcess is returning success, and the
 child is exiting with 128 no_such_child, and that logging out can
 trigger it sometimes.

 --
 Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Path question

2010-09-01 Thread PostgreSQL - Hans-Jürgen Schönig
hello tom,

yeah, we have followed quite a lot of discussion as well ...
and yes, no patches.

as far as this problem is concerned: we are working on a patch and did some 
prototyping inside the planner already (attached).
the code we have is pretty limited atm (such as checking for a sort clause 
explicitly and so on - it has no support for windowing related optimizations 
and so on so far).

the cost model is not our problem - it is a lot easier to read than the code we 
are fighting with (it is a different level of complexity). i think costs can be 
handled.

yes, this merging adds some costs for sure. we might see a hell of a lot of
operator calls - but i think given a reasonable number of partitions it
is still a lot cheaper than actually re-sorting ... and, it is a lot more space
efficient.
in my practical case i cannot re-sort because we would simply run out of space.
an index scan is expensive but needs no additional sort space ...
and, merge is O(n), which sort is clearly not.

advice is highly appreciated.

many thanks,

hans




push-down-sort-into-inh-2.patch
Description: Binary data

On Sep 1, 2010, at 5:00 PM, Tom Lane wrote:

 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at writes:
 On Sep 1, 2010, at 4:10 PM, Tom Lane wrote:
 This is really premature, and anything you do along those lines now will
 probably never get committed.
 
 well, why non-overlapping? the idea is to make append smart enough to
 take the sorted lists from below and merge them which will give sorted
 output as well.
 
 Well, an extra merge step is going to change the cost comparisons quite
 a bit; see Greg Starks' comments.  But in any case, my point wasn't that
 this is something we should never do; it was that it makes more sense to
 wait till something has happened with explicit partitioning.
 
 The project direction is that we are going to add some explicit
 representation of partitioned tables.
 
 can you outline some ideas here and maybe point to some useful discussion 
 here?
 
 There's been boatloads of discussion of partitioning, and at least one
 submitted patch, over the past year or so ...
 
   regards, tom lane
 


--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de




[HACKERS] permissions bug in RI checks?

2010-09-01 Thread David Christensen
Hey -hackers,

In doing a schema upgrade, we noticed the following behavior, which certainly 
seems like a bug.  Steps to reproduce:

CREATE USER a;
CREATE USER b;

CREATE TABLE t1 (id serial primary key);
CREATE TABLE t2 (id int references t1(id));

ALTER TABLE t1 OWNER TO a;
ALTER TABLE t2 OWNER TO a;

\c - a

REVOKE ALL ON t1 FROM a;
REVOKE ALL ON t2 FROM a;

GRANT ALL ON t1 TO b;
GRANT ALL ON t2 TO b;

\c - b

INSERT INTO t2 (id) VALUES (1);

ERROR:  permission denied for relation t1
CONTEXT:  SQL statement SELECT 1 FROM ONLY public.t1 x WHERE id 
OPERATOR(pg_catalog.=) $1 FOR SHARE OF x

The bug in this case is that b has full permissions on all of the underlying 
tables, but runs into issues when trying to access the referenced tables.  I 
traced this down to the RI checks, specifically the portion in ri_PlanCheck() 
where it calls SetUserIdAndSecContext() and then later runs the queries in the 
context of the owner of the relation.  Since the owner a lacks SELECT and 
UPDATE privileges on the table, it is not able to take the ShareLock, and spits 
out the above error.  This behavior does not occur if the object owner is a 
database superuser.  This is presumably because the superuser bypasses the 
regular ACL checks and succeeds regardless.
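In case it helps anyone hitting the same thing, a workaround sketch consistent
with the analysis above is simply to grant the owner back the privileges the RI
query needs (SELECT, plus UPDATE for the row-level lock):

GRANT SELECT, UPDATE ON t1 TO a;
GRANT SELECT, UPDATE ON t2 TO a;
-- after which the INSERT INTO t2 as role b succeeds again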

The behavior was originally noted in 8.1.21, but exists as well in HEAD.

No real resolution proposed, but I wanted to understand the reason behind the 
restrictions if it was intentional behavior.

Thanks,

David

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com







Re: [HACKERS] permissions bug in RI checks?

2010-09-01 Thread Tom Lane
David Christensen da...@endpoint.com writes:
 In doing a schema upgrade, we noticed the following behavior, which certainly 
 seems like a bug.  Steps to reproduce:
 ...
 The bug in this case is that b has full permissions on all of the
 underlying tables, but runs into issues when trying to access the
 referenced tables.

Permissions checks for RI operations involve the owner of the table,
from whom you've revoked all permissions.  If the RI operations were
done as the caller, as you seem to expect, that would *not* be an
improvement; callers would have to have more privileges than one really
wants.

regards, tom lane



Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David E. Wheeler
On Sep 1, 2010, at 1:06 AM, Thom Brown wrote:

 I think it might be both.  array_agg doesn't return NULL, it returns
 an array which contains NULL.
 
 The second I wrote that, I realised it was b*ll%$ks, as I was still in
 the process of waking up.

I know that feeling.

/me sips his coffee

Best,

David



Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-01 Thread Robert Haas
On Wed, Sep 1, 2010 at 6:23 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I'm not sure what we should aim for in the first phase. But if you want as
 little code as possible yet have something useful, I think 'replay' mode
 with no standby registration is the way to go.

IMHO, less is more.  Trying to do too much at once can cause us to
miss the release window (and can also create more bugs).  We just need
to leave the door open to adding later whatever we leave out now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David E. Wheeler
On Sep 1, 2010, at 12:30 AM, Pavel Stehule wrote:

 So are the docs right, or is array_agg() right?
 
 Docs is wrong :) I like current implementation. You can remove a NULLs
 from aggregation very simply, but different direction isn't possible

Patch:

diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 9f91939..e301019 100644
*** a/doc/src/sgml/syntax.sgml
--- b/doc/src/sgml/syntax.sgml
*** sqrt(2)
*** 1543,1549 
  The first form of aggregate expression invokes the aggregate
  across all input rows for which the given expression(s) yield
  non-null values.  (Actually, it is up to the aggregate function
! whether to ignore null values or not &mdash; but all the standard ones do.)
  The second form is the same as the first, since
  <literal>ALL</literal> is the default.  The third form invokes the
  aggregate for all distinct values of the expressions found
--- 1543,1550 
  The first form of aggregate expression invokes the aggregate
  across all input rows for which the given expression(s) yield
  non-null values.  (Actually, it is up to the aggregate function
! whether to ignore null values or not &mdash; but all the standard
! ones except <function>array_agg</> do.)
  The second form is the same as the first, since
  <literal>ALL</literal> is the default.  The third form invokes the
  aggregate for all distinct values of the expressions found

Best,

David




Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 *** 1543,1549 
   The first form of aggregate expression invokes the aggregate
   across all input rows for which the given expression(s) yield
   non-null values.  (Actually, it is up to the aggregate function
 ! whether to ignore null values or not &mdash; but all the standard ones do.)
   The second form is the same as the first, since
   <literal>ALL</literal> is the default.  The third form invokes the
   aggregate for all distinct values of the expressions found
 --- 1543,1550 
   The first form of aggregate expression invokes the aggregate
   across all input rows for which the given expression(s) yield
   non-null values.  (Actually, it is up to the aggregate function
 ! whether to ignore null values or not &mdash; but all the standard
 ! ones except <function>array_agg</> do.)
   The second form is the same as the first, since
   <literal>ALL</literal> is the default.  The third form invokes the
   aggregate for all distinct values of the expressions found

I think when that text was written, it was meant to imply all the
aggregates defined in SQL92.  There seems to be a lot of confusion
in this thread about whether standard means defined by SQL spec
or built-in in Postgres.  Should we try to refine the wording to
clarify that?

Even more to the point, should we deliberately make this vaguer so that
we aren't finding ourselves with obsolete text again and again?  You can
bet that people adding new aggregates in the future aren't going to
think to update this sentence, any more than happened with array_agg.

regards, tom lane



Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David E. Wheeler
On Sep 1, 2010, at 10:12 AM, Tom Lane wrote:

 I think when that text was written, it was meant to imply all the
 aggregates defined in SQL92.  There seems to be a lot of confusion
 in this thread about whether standard means defined by SQL spec
 or built-in in Postgres.  Should we try to refine the wording to
 clarify that?

Yes please.

 Even more to the point, should we deliberately make this vaguer so that
 we aren't finding ourselves with obsolete text again and again?  You can
 bet that people adding new aggregates in the future aren't going to
 think to update this sentence, any more than happened with array_agg.

Perhaps “consult the docs for each aggregate to determine how it handles NULLs.”

Best,

David


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 On Sep 1, 2010, at 10:12 AM, Tom Lane wrote:
 Even more to the point, should we deliberately make this vaguer so that
 we aren't finding ourselves with obsolete text again and again?  You can
 bet that people adding new aggregates in the future aren't going to
 think to update this sentence, any more than happened with array_agg.

 Perhaps “consult the docs for each aggregate to determine how it handles 
 NULLs.”

Hm, actually the whole para needs work.  It was designed at a time when
DISTINCT automatically discarded nulls, which isn't true anymore, and
that fact was patched-in in a very awkward way too.  Perhaps something
like

The first form of aggregate expression invokes the aggregate
once for each input row.
The second form is the same as the first, since
<literal>ALL</literal> is the default.
The third form invokes the aggregate once for each distinct value,
or set of values, of the expression(s) found in the input rows.
The last form invokes the aggregate once for each input row; since no
particular input value is specified, it is generally only useful
for the <function>count(*)</function> aggregate function.

Most aggregate functions ignore null inputs, so that rows in which
one or more of the expression(s) yield null are discarded.  (This
can be assumed to be true, unless otherwise specified, for all
built-in aggregates.)

Then we have to make sure array_agg is properly documented, but we
don't have to insert something into the description of every single
aggregate, which is what your proposal would require.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David Fetter
On Wed, Sep 01, 2010 at 08:16:41AM -0700, David Wheeler wrote:
 On Sep 1, 2010, at 12:30 AM, Pavel Stehule wrote:
 
  Docs is wrong :) I like current implementation.  You can remove a
  NULLs from aggregation very simply, but different direction isn't
  possible
 
 Would appreciate the recipe for removing the NULLs.

WHERE clause :P

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Thom Brown
On 1 September 2010 18:47, David Fetter da...@fetter.org wrote:
 On Wed, Sep 01, 2010 at 08:16:41AM -0700, David Wheeler wrote:
 On Sep 1, 2010, at 12:30 AM, Pavel Stehule wrote:

  Docs is wrong :) I like current implementation.  You can remove a
  NULLs from aggregation very simply, but different direction isn't
  possible

 Would appreciate the recipe for removing the NULLs.

 WHERE clause :P

There may be cases where that's undesirable, such as there being more
than one aggregate in the SELECT list, or the column being grouped on
needing to return rows regardless as to whether there's NULLs in the
column being targeted by array_agg() or not.
-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David E. Wheeler
On Sep 1, 2010, at 10:52 AM, Thom Brown wrote:

 Would appreciate the recipe for removing the NULLs.
 
 WHERE clause :P
 
 There may be cases where that's undesirable, such as there being more
 than one aggregate in the SELECT list, or the column being grouped on
 needing to return rows regardless as to whether there's NULLs in the
 column being targeted by array_agg() or not.

Exactly the issue I ran into:

SELECT name AS distribution,
   array_agg(
   CASE relstatus WHEN 'stable'
   THEN version
   ELSE NULL
   END ORDER BY version) AS stable,
   array_agg(
   CASE relstatus
   WHEN 'testing'
   THEN version
   ELSE NULL
   END ORDER BY version) AS testing
  FROM distributions
 GROUP BY name;

  distribution │  stable   │  testing   
 ──┼───┼
  pair │ {NULL,1.0.0,NULL} │ {0.0.1,NULL,1.2.0}
  pgtap│ {NULL}│ {0.0.1}
 (2 rows)

Annoying.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-01 Thread Simon Riggs
On Wed, 2010-09-01 at 08:33 +0300, Heikki Linnakangas wrote:
 On 01/09/10 04:02, Robert Haas wrote:
   See the thread on interruptible sleeps.  The problem
  right now is that there are some polling loops that act to throttle
  the maximum rate at which a node doing sync rep can make forward
  progress, independent of the capabilities of the hardware.
 
 To be precise, the polling doesn't affect the bandwidth the 
 replication can handle, but it introduces a delay wh

We're sending the WAL data in batches. We can't really escape from the
fact that we're effectively using group commit when we use synch rep.
That will necessarily increase delay and require more sessions to get
the same throughput.

   Those need
  to be replaced with a system that doesn't inject unnecessary delays
  into the process, which is what Heikki is working on.
 
 Right.

 Once we're done with that, all the big questions are still left. How to 
 configure it? What does synchronous replication mean, when is a 
 transaction acknowledged as committed? What to do if a standby server 
 dies and never acknowledges a commit? All these issues have been 
 discussed, but there is no consensus yet.

That sounds an awful lot like performance tuning first and the feature
additions last.

And if you're in the middle of performance tuning, surely some objective
performance tests would help us, no?

IMHO we should be concentrating on how to add the next features because
it's clear to me that if you do things in the wrong order you'll be
wasting time. And we don't have much of that, ever.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David E. Wheeler
On Sep 1, 2010, at 10:30 AM, Tom Lane wrote:

 Hm, actually the whole para needs work.  It was designed at a time when
 DISTINCT automatically discarded nulls, which isn't true anymore, and
 that fact was patched-in in a very awkward way too.  Perhaps something
 like
 
The first form of aggregate expression invokes the aggregate
once for each input row.
The second form is the same as the first, since
<literal>ALL</literal> is the default.
The third form invokes the aggregate once for each distinct value,
or set of values, of the expression(s) found in the input rows.
The last form invokes the aggregate once for each input row; since no
particular input value is specified, it is generally only useful
for the <function>count(*)</function> aggregate function.
 
Most aggregate functions ignore null inputs, so that rows in which
one or more of the expression(s) yield null are discarded.  (This
can be assumed to be true, unless otherwise specified, for all
built-in aggregates.)

I don't think you need the parentheses, though without them, "This" might be 
better written as "The ignoring of NULLs."

Just my $0.02.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Pavel Stehule
2010/9/1 Thom Brown t...@linux.com:
 On 1 September 2010 18:47, David Fetter da...@fetter.org wrote:
 On Wed, Sep 01, 2010 at 08:16:41AM -0700, David Wheeler wrote:
 On Sep 1, 2010, at 12:30 AM, Pavel Stehule wrote:

  Docs is wrong :) I like current implementation.  You can remove a
  NULLs from aggregation very simply, but different direction isn't
  possible

 Would appreciate the recipe for removing the NULLs.

 WHERE clause :P

 There may be cases where that's undesirable, such as there being more
 than one aggregate in the SELECT list, or the column being grouped on
 needing to return rows regardless as to whether there's NULLs in the
 column being targeted by array_agg() or not.

Then you can eliminate NULLs with simple function

CREATE OR REPLACE FUNCTION remove_null(anyarray)
RETURNS anyarray AS $$
SELECT ARRAY(SELECT x FROM unnest($1) g(x) WHERE x IS NOT NULL)
$$ LANGUAGE sql;
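
For illustration, such a wrapper could be applied directly to the problem query
shown earlier in the thread (a hypothetical usage sketch; the table and column
names are just the ones from that example):

SELECT name AS distribution,
       remove_null(array_agg(
           CASE relstatus WHEN 'stable' THEN version ELSE NULL END
           ORDER BY version)) AS stable
  FROM distributions
 GROUP BY name;

The grouping stays intact; only the NULL elements are stripped from the
finished array, at the cost of building and then rescanning the intermediate
array.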
 --
 Thom Brown
 Twitter: @darkixion
 IRC (freenode): dark_ixion
 Registered Linux user: #516935

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread David E. Wheeler
On Sep 1, 2010, at 11:09 AM, Pavel Stehule wrote:

 Then you can eliminate NULLs with simple function
 
 CREATE OR REPLACE FUNCTION remove_null(anyarray)
 RETURNS anyarray AS $$
 SELECT ARRAY(SELECT x FROM unnest($1) g(x) WHERE x IS NOT NULL)
 $$ LANGUAGE sql;

Kind of defeats the purpose of the efficiency of the aggregate.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 On Sep 1, 2010, at 10:30 AM, Tom Lane wrote:
 Most aggregate functions ignore null inputs, so that rows in which
 one or more of the expression(s) yield null are discarded.  (This
 can be assumed to be true, unless otherwise specified, for all
 built-in aggregates.)

 I don't think you need the parentheses, though without them, "This" might be 
 better written as "The ignoring of NULLs."

Done, without the parentheses.  I didn't add "The ignoring of NULLs",
it seemed a bit too verbose.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 On Sep 1, 2010, at 11:09 AM, Pavel Stehule wrote:
 Then you can eliminate NULLs with simple function

 Kind of defeats the purpose of the efficiency of the aggregate.

Well, you can build your own version of array_agg with the same
implementation, except you mark the transition function as strict ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Pavel Stehule
2010/9/1 Tom Lane t...@sss.pgh.pa.us:
 David E. Wheeler da...@kineticode.com writes:
 On Sep 1, 2010, at 11:09 AM, Pavel Stehule wrote:
 Then you can eliminate NULLs with simple function

 Kind of defeats the purpose of the efficiency of the aggregate.

 Well, you can build your own version of array_agg with the same
 implementation, except you mark the transition function as strict ...


I am checking this now, and it is not possible - it needs some
initial value, and it isn't possible to set an internal value.
Probably some C coding is necessary.

Regards

Pavel

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] compiling with RELCACHE_FORCE_RELEASE doesn't pass regression

2010-09-01 Thread Jeff Davis
Compiling with RELCACHE_FORCE_RELEASE doesn't pass make check on my
machine.

Is it supposed to pass?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/1 Tom Lane t...@sss.pgh.pa.us:
 Well, you can build your own version of array_agg with the same
 implementation, except you mark the transition function as strict ...

 I am checking this now, and it is not possible - it needs some
 initial value, and it isn't possible to set an internal value.

Well, you can cheat a bit ...

regression=# create or replace function array_agg_transfn_strict(internal, 
anyelement) returns internal as 'array_agg_transfn' language internal immutable;
CREATE FUNCTION
regression=# create aggregate array_agg_strict(anyelement) (stype = internal,
sfunc = array_agg_transfn_strict, finalfunc = array_agg_finalfn);
CREATE AGGREGATE
regression=# create or replace function array_agg_transfn_strict(internal, 
anyelement) returns internal as 'array_agg_transfn' language internal strict 
immutable;
CREATE FUNCTION

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg() NULL Handling

2010-09-01 Thread Pavel Stehule
2010/9/1 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/1 Tom Lane t...@sss.pgh.pa.us:
 Well, you can build your own version of array_agg with the same
 implementation, except you mark the transition function as strict ...

 I am checking this now, and it is not possible - it needs some
 initial value, and it isn't possible to set an internal value.

 Well, you can cheat a bit ...

 regression=# create or replace function array_agg_transfn_strict(internal, 
 anyelement) returns internal as 'array_agg_transfn' language internal 
 immutable;
 CREATE FUNCTION
 regression=# create aggregate array_agg_strict(anyelement) (stype = internal,
 sfunc = array_agg_transfn_strict, finalfunc = array_agg_finalfn);
 CREATE AGGREGATE
 regression=# create or replace function array_agg_transfn_strict(internal, 
 anyelement) returns internal as 'array_agg_transfn' language internal strict 
 immutable;
 CREATE FUNCTION


nice dark trick :) -  but it doesn't work

ERROR:  aggregate 16395 needs to have compatible input type and transition type
postgres=#

Pavel



                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] compiling with RELCACHE_FORCE_RELEASE doesn't pass regression

2010-09-01 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 Compiling with RELCACHE_FORCE_RELEASE doesn't pass make check on my
 machine.

What happens exactly?

 Is it supposed to pass?

It shouldn't crash, but I'm not certain whether you'd see any
visible diffs in the tests.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fix for pg_upgrade's forcing pg_controldata into English

2010-09-01 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Greg Sabino Mullane wrote:
  Specifically, LANGUAGE changes the headers of pg_controldata 
  (but not the actual output, LC_ALL does that). Thanks for the 
  nudge, I'll get to rewriting some code.
 
  pg_upgrade does this in controldata.c for this exact reason:
 
  /*
   * Because we test the pg_resetxlog output strings, it has to be in
   * English.
   */
  if (getenv("LANG"))
  lang = pg_strdup(ctx, getenv("LANG"));
  #ifndef WIN32
  putenv(pg_strdup(ctx, "LANG=C"));
  #else
  SetEnvironmentVariableA("LANG", "C");
  #endif
 
 You do realize that's far from bulletproof?  To be sure that that does
 anything, you'd need to set (or unset) LC_ALL and LC_MESSAGES as well.
 And I thought Windows spelled it LANGUAGE not LANG ...

I have implemented your suggestion of setting LANG, LANGUAGE, and
LC_MESSAGES based on code in pg_regress.c;  that is the only place I see
where we force English, and it certainly has received more testing.

Should I set LC_ALL too?  The other variables mentioned in pg_regress.c?
Is this something I should patch to 9.0.X?

Patch attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
Index: contrib/pg_upgrade/controldata.c
===
RCS file: /cvsroot/pgsql/contrib/pg_upgrade/controldata.c,v
retrieving revision 1.9
diff -c -c -r1.9 controldata.c
*** contrib/pg_upgrade/controldata.c	6 Jul 2010 19:18:55 -	1.9
--- contrib/pg_upgrade/controldata.c	1 Sep 2010 20:30:45 -
***
*** 11,16 
--- 11,17 
  
  #include ctype.h
  
+ static void putenv2(migratorContext *ctx, const char *var, const char *val);
  
  /*
   * get_control_data()
***
*** 52,69 
  	bool		got_date_is_int = false;
  	bool		got_float8_pass_by_value = false;
  	char	   *lang = NULL;
  
  	/*
! 	 * Because we test the pg_resetxlog output strings, it has to be in
! 	 * English.
  	 */
  	if (getenv(LANG))
  		lang = pg_strdup(ctx, getenv(LANG));
  #ifndef WIN32
! 	putenv(pg_strdup(ctx, LANG=C));
  #else
! 	SetEnvironmentVariableA(LANG, C);
  #endif
  	snprintf(cmd, sizeof(cmd), SYSTEMQUOTE \%s/%s \%s\ SYSTEMQUOTE,
  			 cluster-bindir,
  			 live_check ? pg_controldata\ : pg_resetxlog\ -n,
--- 53,82 
  	bool		got_date_is_int = false;
  	bool		got_float8_pass_by_value = false;
  	char	   *lang = NULL;
+ 	char	   *language = NULL;
+ 	char	   *lc_messages = NULL;
  
  	/*
! 	 * Because we test the pg_resetxlog output as strings, it has to be in
! 	 * English.  Copied from pg_regress.c.
  	 */
  	if (getenv(LANG))
  		lang = pg_strdup(ctx, getenv(LANG));
+ 	if (getenv(LANGUAGE))
+ 		language = pg_strdup(ctx, getenv(LANGUAGE));
+ 	if (getenv(LC_MESSAGES))
+ 		lc_messages = pg_strdup(ctx, getenv(LC_MESSAGES));
+ 
+ 	putenv2(ctx, LANG,
  #ifndef WIN32
! 			NULL);
  #else
! 			/* On Windows the default locale cannot be English, so force it */
! 			en);
  #endif
+ 	putenv2(ctx, LANGUAGE, NULL);
+ 	putenv2(ctx, LC_MESSAGES, C);
+ 
  	snprintf(cmd, sizeof(cmd), SYSTEMQUOTE \%s/%s \%s\ SYSTEMQUOTE,
  			 cluster-bindir,
  			 live_check ? pg_controldata\ : pg_resetxlog\ -n,
***
*** 334,361 
  	if (output)
  		pclose(output);
  
! 	/* restore LANG */
! 	if (lang)
! 	{
! #ifndef WIN32
! 		char	   *envstr = (char *) pg_malloc(ctx, strlen(lang) + 6);
! 
! 		sprintf(envstr, LANG=%s, lang);
! 		putenv(envstr);
! #else
! 		SetEnvironmentVariableA(LANG, lang);
! #endif
! 		pg_free(lang);
! 	}
! 	else
! 	{
! #ifndef WIN32
! 		unsetenv(LANG);
! #else
! 		SetEnvironmentVariableA(LANG, );
! #endif
! 	}
! 
  	/* verify that we got all the mandatory pg_control data */
  	if (!got_xid || !got_oid ||
  		(!live_check  !got_log_id) ||
--- 347,360 
  	if (output)
  		pclose(output);
  
! 	/* restore environment variable settings */
! 	putenv2(ctx, LANG, lang);
! 	putenv2(ctx, LANGUAGE, language);
! 	putenv2(ctx, LC_MESSAGES, lc_messages);
! 	pg_free(lang);
! 	pg_free(language);
! 	pg_free(lc_messages);
!  
  	/* verify that we got all the mandatory pg_control data */
  	if (!got_xid || !got_oid ||
  		(!live_check  !got_log_id) ||
***
*** 492,494 
--- 491,524 
  		pg_log(ctx, PG_FATAL, Unable to rename %s to %s.\n, old_path, new_path);
  	check_ok(ctx);
  }
+ 
+ 
+ /*
+  *	This is like putenv(), but takes two arguments
+  *	It also does unsetenv() if val is NULL.
+  */
+ static void
+ putenv2(migratorContext *ctx, const char *var, const char *val)
+ {
+ 	if (val)
+ 	{
+ #ifndef WIN32
+ 		char	   *envstr = (char *) pg_malloc(ctx, strlen(var) +
+ strlen(val) + 1);
+ 
+ 		sprintf(envstr, %s=%s, var, val);
+ 		putenv(envstr);
+ 		pg_free(envstr);
+ #else
+ 		SetEnvironmentVariableA(var, val);
+ #endif
+ 	}
+ 	else
+ 	{
+ #ifndef WIN32
+ 		

Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-01 Thread Simon Riggs
On Wed, 2010-09-01 at 13:23 +0300, Heikki Linnakangas wrote:
 On 01/09/10 10:53, Fujii Masao wrote:
  Before discussing about that, we should determine whether registering
  standbys in master is really required. It affects configuration a lot.
  Heikki thinks that it's required, but I'm still unclear about why and
  how.
 
  Why do standbys need to be registered in master? What information
  should be registered?
 
 That requirement falls out from the handling of disconnected standbys. 
 If a standby is not connected, what does the master do with commits? If 
 the answer is anything other than "acknowledge them to the client 
 immediately, as if the standby never existed", the master needs to know 
 what standby servers exist. Otherwise it can't know if all the standbys 
 are connected or not.

"All the standbys" presupposes that we know what they are, i.e. we have
registered them, so I see that argument as circular. Quorum commit does
not need registration, so quorum commit is the easy to implement
option and registration is the more complex later feature. I don't have
a problem with adding registration later and believe it can be done
later without issues.

  What does synchronous replication mean, when is a transaction
  acknowledged as committed?
 
  I proposed four synchronization levels:
 
  1. async
 doesn't make transaction commit wait for replication, i.e.,
 asynchronous replication. This mode has been already supported in
 9.0.
 
  2. recv
 makes transaction commit wait until the standby has received WAL
 records.
 
  3. fsync
 makes transaction commit wait until the standby has received and
 flushed WAL records to disk
 
  4. replay
 makes transaction commit wait until the standby has replayed WAL
 records after receiving and flushing them to disk
 
  OTOH, Simon proposed the quorum commit feature. I think that both
  is required for various our use cases. Thought?
 
 I'd like to keep this as simple as possible, yet flexible so that with 
 enough scripting and extensions, you can get all sorts of behavior. I 
 think quorum commit falls into the extension category; if you're setup 
 is complex enough, it's going to be impossible to represent that in our 
 config files no matter what. But if you write a little proxy, you can 
 implement arbitrary rules there.
 
 I think recv/fsync/replay should be specified in the standby. 

I think the wait mode (i.e. recv/fsync/replay or others) should be
specified in the master. This allows the application to specify whatever
level of protection it requires, and also allows the behaviour to be
different for user-specifiable parts of the application. As soon as you
set this on the standby then you have the one-size fits all approach to
synchronisation.

We already know performance of synchronous rep is poor, which is exactly
why I want to be able to control it at the application level. Fine
grained control is important, otherwise we may as well just use DRBD and
skip this project completely, since we already have that. It will also
be a feature that no other database has, taking us truly beyond what has
gone before.

The master/standby decision is not something that is easily changed.
Whichever we decide now will be the thing we stick with.

 It has no 
 direct effect on the master, the master would just relay the setting to 
 the standby when it connects, or the standby would send multiple 
 XLogRecPtrs and let the master decide when the WAL is persistent enough. 
 And what if you write a proxy that has some other meaning of persistent 
 enough? Like when it has been written to the OS buffers but not yet 
 fsync'd, or when it has been fsync'd to at least one standby and 
 received by at least three others. recv/fsync/replay is not going to 
 represent that behavior well.
 
 sync vs async on the other hand should be specified in the master, 
 because it has a direct impact on the behavior of commits in the master.
 



 I propose a configuration file standbys.conf, in the master:
 
 # STANDBY NAMESYNCHRONOUS   TIMEOUT
 importantreplica  yes   100ms
 tempcopy  no10s
 
 Or perhaps this should be stored in a system catalog.

That part sounds like complexity that can wait until later. I would not
object if you really want this, but would prefer it to look like this:

# STANDBY NAMEDEFAULT_WAIT_MODE   TIMEOUT
importantreplica  sync  100ms
tempcopy  async 10s

You don't *have* to use the application level control if you don't want
it. But it's an important capability for real-world apps, since the
alternative is deliberately splitting an application across two database
servers each with different wait modes.

  What to do if a standby server dies and never
  acknowledges a commit?
 
  The master's reaction to that situation should be configurable. So
  I'd propose new configuration parameter specifying the reaction.
  Valid values are:
 
  - standalone
 When 

Re: [HACKERS] Fix for pg_upgrade's forcing pg_controldata into English

2010-09-01 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of mié sep 01 16:35:18 -0400 2010:

 I have implemented your suggestion of setting LANG, LANGUAGE, and
 LC_MESSAGES based on code in pg_regress.c;  that is the only place I see
 where we force English, and it certainly has received more testing.

I think a real long-term solution is to have a machine-readable format
for pg_controldata, perhaps something very simple like

$ pg_controldata --machine
PGCONTROL_VERSION_NUMBER=903
CATALOG_VERSION_NUMBER=201008051
DATABASE_SYSIDENTIFIER=5504177303240039672
etc

This wouldn't be subject to translation and thus much easier to process.
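
For the sake of illustration, output in that shape could be consumed with a few
lines of C (a hypothetical sketch only; no --machine option exists today, and
the data-directory path below is made up):

FILE   *output = popen("pg_controldata --machine \"/usr/local/pgsql/data\"", "r");
char    line[1024];

while (output && fgets(line, sizeof(line), output) != NULL)
{
    char   *sep = strchr(line, '=');

    if (sep == NULL)
        continue;               /* not a KEY=value line */
    *sep = '\0';                /* key is now in line; the value (with its
                                 * trailing newline) starts at sep + 1 */
    /* e.g. if (strcmp(line, "CATALOG_VERSION_NUMBER") == 0) ... */
}
if (output)
    pclose(output);

No pattern matching against translated message text would be needed at all.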

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for pg_upgrade's forcing pg_controldata into English

2010-09-01 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Bruce Momjian's message of mié sep 01 16:35:18 -0400 2010:
 I have implemented your suggestion of setting LANG, LANGUAGE, and
 LC_MESSAGES based on code in pg_regress.c;  that is the only place I see
 where we force English, and it certainly has received more testing.

 I think a real long-term solution is to have a machine-readable format
 for pg_controldata, perhaps something very simple like

 $ pg_controldata --machine
 PGCONTROL_VERSION_NUMBER=903
 CATALOG_VERSION_NUMBER=201008051
 DATABASE_SYSIDENTIFIER=5504177303240039672
 etc

 This wouldn't be subject to translation and thus much easier to process.

+1.  pg_controldata was never written with the idea that its output
would be read by programs.  If we're going to start doing that, we
should provide an output format that's suitable, not try to work around
it in the callers.

However, that's something for 9.1 and beyond.  Bruce's immediate problem
is what to do in pg_upgrade in 9.0, and there I concur that he should
duplicate what pg_regress is doing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for pg_upgrade's forcing pg_controldata into English

2010-09-01 Thread Bruce Momjian
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Excerpts from Bruce Momjian's message of mié sep 01 16:35:18 -0400 2010:
  I have implemented your suggestion of setting LANG, LANGUAGE, and
  LC_MESSAGES based on code in pg_regress.c;  that is the only place I see
  where we force English, and it certainly has received more testing.
 
  I think a real long-term solution is to have a machine-readable format
  for pg_controldata, perhaps something very simple like
 
  $ pg_controldata --machine
  PGCONTROL_VERSION_NUMBER=903
  CATALOG_VERSION_NUMBER=201008051
  DATABASE_SYSIDENTIFIER=5504177303240039672
  etc
 
  This wouldn't be subject to translation and thus much easier to process.
 
 +1.  pg_controldata was never written with the idea that its output
 would be read by programs.  If we're going to start doing that, we
 should provide an output format that's suitable, not try to work around
 it in the callers.
 
 However, that's something for 9.1 and beyond.  Bruce's immediate problem
 is what to do in pg_upgrade in 9.0, and there I concur that he should
 duplicate what pg_regress is doing.

OK, here is a patch that sets all the variables that pg_regress.c sets.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
Index: contrib/pg_upgrade/controldata.c
===
RCS file: /cvsroot/pgsql/contrib/pg_upgrade/controldata.c,v
retrieving revision 1.9
diff -c -c -r1.9 controldata.c
*** contrib/pg_upgrade/controldata.c	6 Jul 2010 19:18:55 -	1.9
--- contrib/pg_upgrade/controldata.c	1 Sep 2010 22:38:59 -
***
*** 11,16 
--- 11,17 
  
  #include ctype.h
  
+ static void putenv2(migratorContext *ctx, const char *var, const char *val);
  
  /*
   * get_control_data()
***
*** 51,69 
  	bool		got_toast = false;
  	bool		got_date_is_int = false;
  	bool		got_float8_pass_by_value = false;
  	char	   *lang = NULL;
  
  	/*
! 	 * Because we test the pg_resetxlog output strings, it has to be in
! 	 * English.
  	 */
  	if (getenv(LANG))
  		lang = pg_strdup(ctx, getenv(LANG));
  #ifndef WIN32
! 	putenv(pg_strdup(ctx, LANG=C));
  #else
! 	SetEnvironmentVariableA(LANG, C);
  #endif
  	snprintf(cmd, sizeof(cmd), SYSTEMQUOTE \%s/%s \%s\ SYSTEMQUOTE,
  			 cluster-bindir,
  			 live_check ? pg_controldata\ : pg_resetxlog\ -n,
--- 52,106 
  	bool		got_toast = false;
  	bool		got_date_is_int = false;
  	bool		got_float8_pass_by_value = false;
+ 	char	   *lc_collate = NULL;
+ 	char	   *lc_ctype = NULL;
+ 	char	   *lc_monetary = NULL;
+ 	char	   *lc_numeric = NULL;
+ 	char	   *lc_time = NULL;
  	char	   *lang = NULL;
+ 	char	   *language = NULL;
+ 	char	   *lc_all = NULL;
+ 	char	   *lc_messages = NULL;
  
  	/*
! 	 * Because we test the pg_resetxlog output as strings, it has to be in
! 	 * English.  Copied from pg_regress.c.
  	 */
+ 	if (getenv(LC_COLLATE))
+ 		lc_collate = pg_strdup(ctx, getenv(LC_COLLATE));
+ 	if (getenv(LC_CTYPE))
+ 		lc_ctype = pg_strdup(ctx, getenv(LC_CTYPE));
+ 	if (getenv(LC_MONETARY))
+ 		lc_monetary = pg_strdup(ctx, getenv(LC_MONETARY));
+ 	if (getenv(LC_NUMERIC))
+ 		lc_numeric = pg_strdup(ctx, getenv(LC_NUMERIC));
+ 	if (getenv(LC_TIME))
+ 		lc_time = pg_strdup(ctx, getenv(LC_TIME));
  	if (getenv(LANG))
  		lang = pg_strdup(ctx, getenv(LANG));
+ 	if (getenv(LANGUAGE))
+ 		language = pg_strdup(ctx, getenv(LANGUAGE));
+ 	if (getenv(LC_ALL))
+ 		lc_all = pg_strdup(ctx, getenv(LC_ALL));
+ 	if (getenv(LC_MESSAGES))
+ 		lc_messages = pg_strdup(ctx, getenv(LC_MESSAGES));
+ 
+ 	putenv2(ctx, LC_COLLATE, NULL);
+ 	putenv2(ctx, LC_CTYPE, NULL);
+ 	putenv2(ctx, LC_MONETARY, NULL);
+ 	putenv2(ctx, LC_NUMERIC, NULL);
+ 	putenv2(ctx, LC_TIME, NULL);
+ 	putenv2(ctx, LANG,
  #ifndef WIN32
! 			NULL);
  #else
! 			/* On Windows the default locale cannot be English, so force it */
! 			en);
  #endif
+ 	putenv2(ctx, LANGUAGE, NULL);
+ 	putenv2(ctx, LC_ALL, NULL);
+ 	putenv2(ctx, LC_MESSAGES, C);
+ 
  	snprintf(cmd, sizeof(cmd), SYSTEMQUOTE \%s/%s \%s\ SYSTEMQUOTE,
  			 cluster-bindir,
  			 live_check ? pg_controldata\ : pg_resetxlog\ -n,
***
*** 334,361 
  	if (output)
  		pclose(output);
  
! 	/* restore LANG */
! 	if (lang)
! 	{
! #ifndef WIN32
! 		char	   *envstr = (char *) pg_malloc(ctx, strlen(lang) + 6);
! 
! 		sprintf(envstr, LANG=%s, lang);
! 		putenv(envstr);
! #else
! 		SetEnvironmentVariableA(LANG, lang);
! #endif
! 		pg_free(lang);
! 	}
! 	else
! 	{
! #ifndef WIN32
! 		unsetenv(LANG);
! #else
! 		SetEnvironmentVariableA(LANG, );
! #endif
! 	}
! 
  	/* verify that we got all the mandatory pg_control data */
  	if (!got_xid || !got_oid ||
  		(!live_check  !got_log_id) ||
--- 371,399 
  	if (output)
  		pclose(output);
  
! 	/*
! 	 *	Restore environment variables
! 	 */
! 	putenv2(ctx, LC_COLLATE, lc_collate);
! 	putenv2(ctx, LC_CTYPE, 

Re: [HACKERS] compiling with RELCACHE_FORCE_RELEASE doesn't pass regression

2010-09-01 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Wed, 2010-09-01 at 15:31 -0400, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
 Compiling with RELCACHE_FORCE_RELEASE doesn't pass make check on my
 machine.

 What happens exactly?

 I have attached regression.diffs after a make check. The diffs don't
 look trivial, and actually look quite strange to me.

Hmm, sorta looks like that breaks something related to checking for
composite types.  That would be a bug.  Will look.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for pg_upgrade's forcing pg_controldata into English

2010-09-01 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 However, that's something for 9.1 and beyond.  Bruce's immediate problem
 is what to do in pg_upgrade in 9.0, and there I concur that he should
 duplicate what pg_regress is doing.

 OK, here is a patch that sets all the variables that pg_regress.c sets.

I certainly hope that pg_regress isn't freeing the strings it passes
to putenv() ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Path question

2010-09-01 Thread Robert Haas
On Sep 1, 2010, at 10:21 AM, Greg Stark gsst...@mit.edu wrote:
 For what it's worth I disagree with Tom. I think this is a situation
 where we need *both* types of solution. Ideally we will be able to use
 a plain Append node for cases where we know the relative ordering of
 the data in different partitions, but there will always be cases where
 the structured partition data doesn't actually match up with the
 ordering requested and we'll need to fall back to a merge-append node.

I agree. Explicit partitioning may open up some additional optimization 
possibilities in certain cases, but Merge Append is more general and extremely 
valuable in its own right.

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable in comments and names

2010-09-01 Thread Robert Haas
On Sep 1, 2010, at 11:02 AM, Kevin Grittner kevin.gritt...@wicourts.gov 
wrote:
 The patch to implement true serializable transactions using SSI
 renames/rewords these things to avoid confusion.  However, there are
 seven files which are changed only for this reason, and another
 where there is one real change of two lines hidden in the midst of
 dozens of lines of such wording changes.  I find it distracting to
 have all this mixed in, and I fear that it will be a time-waster for
 anyone reviewing the meat of the patch.  I'd like to submit a
 no-op patch to cover these issues in advance of the CF, to get
 them out of the way.  Joe suggested that I pose the issue to the
 -hackers list.

+1.

 I could knock out a couple other files from the main patch if people
 considered it acceptable to enable the SHMQueueIsDetached function
 now, which the patch uses in several places within asserts.  I would
 remove the #ifdef NOT_USED from around the (very short) function,
 and add it to the .h file.

-1.

 The changes to the comments and local variables seem pretty safe. 
 The change of IsXactIsoLevelSerializable to
 IsXactIsoLevelXactSnapshotBased (or whatever name the community
 prefers)

How about IsXactIsoLevelSnapshot?  Just to be a bit shorter.

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] compiling with RELCACHE_FORCE_RELEASE doesn't pass regression

2010-09-01 Thread Tom Lane
I wrote:
 Jeff Davis pg...@j-davis.com writes:
 I have attached regression.diffs after a make check. The diffs don't
 look trivial, and actually look quite strange to me.

 Hmm, sorta looks like that breaks something related to checking for
 composite types.  That would be a bug.  Will look.

So here is the culprit:

#0  0x007c08e7 in flush_rowtype_cache (type_id=49022) at typcache.c:531
#1  0x007b4797 in RelationClearRelation (relation=0x7fc490ccebe0, 
rebuild=0 '\000') at relcache.c:1929
#2  0x007b4131 in RelationClose (relation=0x7fc490ccebe0)
at relcache.c:1680
#3  0x00472721 in relation_close (relation=0x7fc490ccebe0, lockmode=1)
at heapam.c:1051
#4  0x007c01fe in lookup_type_cache (type_id=49022, flags=64)
at typcache.c:299
#5  0x007c023f in lookup_rowtype_tupdesc_internal (type_id=49022, 
typmod=-1, noError=0 '\000') at typcache.c:321
#6  0x007c03e7 in lookup_rowtype_tupdesc_copy (type_id=49022, 
typmod=-1) at typcache.c:395
#7  0x007cd087 in get_expr_result_type (expr=0x1395d08, 
resultTypeId=0x0, resultTupleDesc=0x7fffbd6e5830) at funcapi.c:256

lookup_type_cache() opens the relcache entry to copy its tupdesc, does
so, and closes the relcache entry again.  But then (if
RELCACHE_FORCE_RELEASE is on) we blow away the relcache entry ...
and RelationClearRelation carefully reaches around and blows away
the tupdesc in the typcache too!

So that's overly aggressive, but I'm not sure what's a simple fix for
it.  If we just take out that flush_rowtype_cache call in
RelationClearRelation, I think we run the risk of the typcache not being
flushed when it needs to be, namely when an ALTER TABLE changes the
table's rowtype at an instant when some particular backend doesn't
have an active relcache entry for it.

Probably the best fix would be to make typcache flushing fully
independent of the relcache, but that would mean making sure that all
ALTER TABLE variants that affect the rowtype will issue an explicit
typcache flush.  That seems a bit too invasive to be back-patchable.
I'm not entirely sure this sort of failure can occur without
RELCACHE_FORCE_RELEASE, but I'm definitely not sure it can't, so a
backpatchable fix would be nice.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for pg_upgrade's forcing pg_controldata into English

2010-09-01 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  However, that's something for 9.1 and beyond.  Bruce's immediate problem
  is what to do in pg_upgrade in 9.0, and there I concur that he should
  duplicate what pg_regress is doing.
 
  OK, here is a patch that sets all the variables that pg_regress.c sets.
 
 I certainly hope that pg_regress isn't freeing the strings it passes
 to putenv() ...

pg_regress does not restore these settings (it stays with C/English) so
the code is different.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for pg_upgrade's forcing pg_controldata into English

2010-09-01 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 I certainly hope that pg_regress isn't freeing the strings it passes
 to putenv() ...

 pg_regress does not restore these settings (it stays with C/English) so
 the code is different.

That's not what I'm on about.  You're trashing strings that are part of
the live environment.  It might accidentally fail to fail for you, if
your version of free() doesn't immediately clobber the released storage,
but it's still broken.  Read the putenv() man page.

+ #ifndef WIN32
+   char   *envstr = (char *) pg_malloc(ctx, strlen(var) +
+   strlen(val) + 1);
+ 
+   sprintf(envstr, "%s=%s", var, val);
+   putenv(envstr);
+   pg_free(envstr);

+ #else
+   SetEnvironmentVariableA(var, val);
+ #endif

The fact that there is no such free() in pg_regress is not an oversight
or shortcut.
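
For illustration only (a sketch of the idea, not the code that ended up in
pg_upgrade), the non-Windows branch would have to look more like this, leaving
the string alone once it has been handed to putenv():

#ifndef WIN32
		/* strlen(var) + strlen(val) covers "VAR" and "value";
		 * +2 adds room for the '=' and the terminating NUL */
		char	   *envstr = (char *) pg_malloc(ctx, strlen(var) +
												strlen(val) + 2);

		sprintf(envstr, "%s=%s", var, val);
		putenv(envstr);
		/* deliberately NOT freed: the environment keeps pointing at envstr */
#else
		SetEnvironmentVariableA(var, val);
#endif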

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] compiling with RELCACHE_FORCE_RELEASE doesn't pass regression

2010-09-01 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 I have attached regression.diffs after a make check. The diffs don't
 look trivial, and actually look quite strange to me.

BTW, if I dike out the flush_rowtype_cache call at relcache.c:1929,
the regression tests do pass for me, so it seems there is only one
bug exposed by this test (as of HEAD anyway).  I don't find that
to be a credible real fix, though, as stated previously.

Once we do have a fix in place for this, it might be a good idea for
someone to run a buildfarm member with -DRELCACHE_FORCE_RELEASE ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] security hook on table creation

2010-09-01 Thread KaiGai Kohei
This patch allows external security providers to check the privilege
to create a new relation and to supply the security labels to be
assigned to the new one.

This hook is put on DefineRelation() and OpenIntoRel(), but isn't
put on Boot_CreateStmt, create_toast_table() and make_new_heap(),
although they actually create a relation, because I assume these
are the cases when we don't need individual checks and security
labels.

DefineIndex() also creates a new relation (RELKIND_INDEX), but
it seems to me the PG implementation considers indexes a part
of a table's properties, because they always have the same owner-id.
So, it shall be checked by the upcoming ALTER TABLE hook instead.

The ESP plugins can return a list of security labels for the new
relation, its columns and its relation type. If multiple ESPs are
installed, each shall append its security labels to those of the
secondary plugin.
The list shall be a list of SecLabelItem as follows:
  typedef struct
  {
  ObjectAddress   object;
  const char *tag;
  const char *seclabel;
  } SecLabelItem;
Their OIDs are decided in heap_create_with_catalog(), so the ESP cannot
know what OID shall be supplied at the point where it makes the access
control decision.
So, the core routine fixes up SecLabelItem::object->objectId later,
if InvalidOid is given. I think that is a more reasonable idea than
putting one more hook to store security labels after the table creation.
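
For illustration, a hypothetical fragment of what an ESP hook might hand back
under this interface (the provider tag and label string below are invented,
not part of the patch):

	List	   *result = NIL;
	SecLabelItem *item = palloc(sizeof(SecLabelItem));

	item->object.classId = RelationRelationId;	/* a pg_class object */
	item->object.objectId = InvalidOid;		/* unknown yet; core fixes it up later */
	item->object.objectSubId = 0;			/* 0 = whole relation, else column number */
	item->tag = "selinux";				/* invented provider tag */
	item->seclabel = "system_u:object_r:sepgsql_table_t:s0";	/* invented label */
	result = lappend(result, item);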

Please also note that this patch depends on the security label support
patch that I submitted before.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com
*** a/src/backend/bootstrap/bootparse.y
--- b/src/backend/bootstrap/bootparse.y
***
*** 246,252  Boot_CreateStmt:
  	  (Datum) 0,
  	  false,
  	  true,
! 	  false);
  		elog(DEBUG4, relation created with oid %u, id);
  	}
  	do_end();
--- 246,253 
  	  (Datum) 0,
  	  false,
  	  true,
! 	  false,
! 	  NIL);
  		elog(DEBUG4, relation created with oid %u, id);
  	}
  	do_end();
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
***
*** 49,54 
--- 49,55 
  #include catalog/pg_type.h
  #include catalog/pg_type_fn.h
  #include catalog/storage.h
+ #include commands/seclabel.h
  #include commands/tablecmds.h
  #include commands/typecmds.h
  #include miscadmin.h
***
*** 94,99  static Oid AddNewRelationType(const char *typeName,
--- 95,101 
  static void RelationRemoveInheritance(Oid relid);
  static void StoreRelCheck(Relation rel, char *ccname, Node *expr,
  			  bool is_local, int inhcount);
+ static void StoreSecLabels(Relation rel, Oid typeId, List *seclabels);
  static void StoreConstraints(Relation rel, List *cooked_constraints);
  static bool MergeWithExistingConstraint(Relation rel, char *ccname, Node *expr,
  			bool allow_merge, bool is_local);
***
*** 882,887  AddNewRelationType(const char *typeName,
--- 884,890 
   *	use_user_acl: TRUE if should look for user-defined default permissions;
   *		if FALSE, relacl is always set NULL
   *	allow_system_table_mods: TRUE to allow creation in system namespaces
+  *	seclabels : list of security labels to be assigned on, if exist
   *
   * Returns the OID of the new relation
   * 
***
*** 905,911  heap_create_with_catalog(const char *relname,
  		 Datum reloptions,
  		 bool use_user_acl,
  		 bool allow_system_table_mods,
! 		 bool if_not_exists)
  {
  	Relation	pg_class_desc;
  	Relation	new_rel_desc;
--- 908,915 
  		 Datum reloptions,
  		 bool use_user_acl,
  		 bool allow_system_table_mods,
! 		 bool if_not_exists,
! 		 List *seclabels)
  {
  	Relation	pg_class_desc;
  	Relation	new_rel_desc;
***
*** 1189,1194  heap_create_with_catalog(const char *relname,
--- 1193,1203 
  	}
  
  	/*
+ 	 * Store any supplied security labels of the relation
+ 	 */
+ 	StoreSecLabels(new_rel_desc, new_type_oid, seclabels);
+ 
+ 	/*
  	 * Store any supplied constraints and defaults.
  	 *
  	 * NB: this may do a CommandCounterIncrement and rebuild the relcache
***
*** 1808,1813  StoreRelCheck(Relation rel, char *ccname, Node *expr,
--- 1817,1861 
  }
  
  /*
+  * Store any supplied security labels of the new relation
+  */
+ static void
+ StoreSecLabels(Relation rel, Oid typeId, List *seclabels)
+ {
+ 	ListCell   *l;
+ 
+ 	foreach (l, seclabels)
+ 	{
+ 		SecLabelItem   *sl = lfirst(l);
+ 		ObjectAddress	object;
+ 
+ 		Assert(sl-tag != NULL  sl-seclabel != NULL);
+ 
+ 		memcpy(object, sl-object, sizeof(object));
+ 
+ 		/*
+ 		 * It fixes up OID of the new relation and type, because these IDs were
+ 		 * not decided at the point when external security provider made access
+ 		 * control decision and returns security label to be assigned on.
+ 		 *
+ 		 * If 

Re: [HACKERS] compiling with RELCACHE_FORCE_RELEASE doesn't pass regression

2010-09-01 Thread Tom Lane
I wrote:
 Probably the best fix would be to make typcache flushing fully
 independent of the relcache, but that would mean making sure that all
 ALTER TABLE variants that affect the rowtype will issue an explicit
 typcache flush.  That seems a bit too invasive to be back-patchable.
 I'm not entirely sure this sort of failure can occur without
 RELCACHE_FORCE_RELEASE, but I'm definitely not sure it can't, so a
 backpatchable fix would be nice.

After a bit more study it seems that there is a reasonably
back-patchable approach to this.  We can continue to drive flushing of
composite-type typcache entries off of relcache flush, but it has to
occur when we do RelationCacheInvalidateEntry() or
RelationCacheInvalidate() due to a SI invalidate event, not just
anytime a relcache entry is closed.  We can do that by plugging in a
callback function with CacheRegisterRelcacheCallback.

Because the callback will only have the relation OID not the type OID,
it will have to scan the whole TypeCacheHash to see if there's a
matching entry.  However, that's not as bad as it sounds, because there
aren't likely to be very many entries in that hashtable.  I put in some
quick-hack instrumentation to see how big the table gets during the
regression tests, and find that of the hundred-odd backends launched
during the tests, none get above 26 typcache entries, and only 8 get as
many as 10 entries.  Based on those numbers, I'm not sure it'd ever be
worth adding the additional infrastructure to allow a direct hash lookup
instead.
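
A rough sketch of that shape, just to make the idea concrete (illustrative
only; the names follow typcache.c, but this is not the committed patch):

static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
	HASH_SEQ_STATUS status;
	TypeCacheEntry *typentry;

	if (TypeCacheHash == NULL)
		return;					/* no typcache entries exist yet */

	hash_seq_init(&status, TypeCacheHash);
	while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
	{
		if (typentry->tupDesc == NULL)
			continue;			/* not a cached composite type */
		if (relid == typentry->typrelid || relid == InvalidOid)
		{
			/* drop the cached tupdesc; it gets rebuilt on next lookup */
			DecrTupleDescRefCount(typentry->tupDesc);
			typentry->tupDesc = NULL;
		}
	}
}

	/* and, inside lookup_type_cache(), on first call: */
	CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);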

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] leaky views, yet again

2010-09-01 Thread Robert Haas
2010/9/1 KaiGai Kohei kai...@ak.jp.nec.com:
 Right now, it stands on a strict assumption that operators
 implemented with built-in functions are safe; they have no
 possibility of leaking the supplied arguments anywhere.

 Please note that this patch does not handle the case when
 a function inside a view and a function outside a view are
 distributed into the same level and the latter function has a lower
 cost value.

Without making some attempt to address these two points, I don't see
the point of this patch.

Also, I believe we decided previously do this deoptimization only in
case the user requests it with CREATE SECURITY VIEW.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multibyte charater set in levenshtein function

2010-09-01 Thread Robert Haas
2010/8/28 Alexander Korotkov aekorot...@gmail.com:
 Now test for levenshtein_less_equal performance.

Nice results.  I'll try to find time to look at this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG fix for mixed case cursor names

2010-09-01 Thread Robert Haas
2010/8/25 Boszormenyi Zoltan z...@cybertec.at:
 PostgreSQL allows in plain SQL to declare a cursor
 e.g. in all lower case and fetch from it in all upper case.
 We need to allow this from ECPG, too, but strictly when
 the cursor name is not in a variable. Otherwise this code
 below doesn't notice the cursor's double declaration
 and complains about using an undeclared cursor:

It might be a good idea to add this to the open CommitFest so we don't
lose track of it.

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE

2010-09-01 Thread Robert Haas
On Mon, Aug 30, 2010 at 5:30 PM, Marko Tiikkaja
marko.tiikk...@cs.helsinki.fi wrote:
 On 2010-08-31 12:07 AM +0300, I wrote:

 I looked at fixing this..

 The previous patch had a bug in fmgr_sql() our regression tests didn't
 catch.  Fixed version attached.

It would probably be a good idea to add this to the currently open CommitFest.

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-09-01 Thread Robert Haas
On Wed, Sep 1, 2010 at 6:39 AM, Magnus Hagander mag...@hagander.net wrote:
 That definitely didn't fix it, although I'm not quite sure why.  Can
 you throw the modified CVS you ran this off of up somewhere I can
 rsync it?

 no rsync server on that box, but I put up a tarball for you at
 http://www.hagander.net/pgsql/cvsrepo.tgz

OK, color me baffled.  I looked at gram.c and I believe you obsoleted
the right revs. The only difference I see between this and some other
random deleted file is that it has a couple of tags pointing to revs
that don't exist any more, but I can't see how that would cause the
observed weirdness.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] leaky views, yet again

2010-09-01 Thread KaiGai Kohei
(2010/09/02 11:57), Robert Haas wrote:
 2010/9/1 KaiGai Koheikai...@ak.jp.nec.com:
 Right now, it stands on a strict assumption that operators
 implemented with built-in functions are safe; they have no
 possibility of leaking the supplied arguments anywhere.

 Please note that this patch does not handle the case when
 a function inside a view and a function outside a view are
 distributed into the same level and the latter function has a lower
 cost value.
 
 Without making some attempt to address these two points, I don't see
 the point of this patch.
 
 Also, I believe we decided previously do this deoptimization only in
 case the user requests it with CREATE SECURITY VIEW.
 
Perhaps, I remember the previous discussion incorrectly.

If we have a hint about whether the supplied view is intended for security
purposes or not, it seems to me it is a reliable method to prevent pulling
up the subqueries that come from security views.
Is it too much deoptimization?

If we keep security views as subqueries, the latter point shall be solved
implicitly. Even if we try to optimize security views to a certain level,
it is not difficult to solve, as I demonstrated before.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] leaky views, yet again

2010-09-01 Thread Robert Haas
2010/9/1 KaiGai Kohei kai...@ak.jp.nec.com:
 (2010/09/02 11:57), Robert Haas wrote:
 2010/9/1 KaiGai Koheikai...@ak.jp.nec.com:
 Right now, it stands on a strict assumption that operators
 implemented with built-in functions are safe; they have no
 possibility of leaking the supplied arguments anywhere.

 Please note that this patch does not handle the case when
 a function inside a view and a function outside a view are
 distributed into the same level and the latter function has a lower
 cost value.

 Without making some attempt to address these two points, I don't see
 the point of this patch.

 Also, I believe we decided previously do this deoptimization only in
 case the user requests it with CREATE SECURITY VIEW.

 Perhaps, I remember the previous discussion incorrectly.

 If we have a hint about whether the supplied view is intended for security
 purposes or not, it seems to me it is a reliable method to prevent pulling
 up the subqueries that come from security views.
 Is it too much deoptimization?

Well, that'd prevent something like id = 3 from getting pushed down,
which seems a bit harsh.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

2010-09-01 Thread Fujii Masao
On Wed, Sep 1, 2010 at 4:11 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 The obvious next question is how to wait for multiple sockets and a latch at
 the same time? Perhaps we should have a select()-like interface where you
 can pass multiple file descriptors. Then again, looking at the current
 callers of select() in the backend, apart from postmaster they all wait for
 only one fd.

Currently backends have not waited for multiple sockets, so I don't think that
interface is required for now. Similarly, we don't need to wait for the socket
to be ready to *write* because there is no use case for now.

 Windows implementation of WaitLatchAndSocket() seems not to be
 so simple. We would need to wait for both the latch event and
 the packet from the socket by using WaitForMultipleObjectsEx().

 Well, we already use WaitForMultipleObjectsEx() to implement select() on
 Windows, so it should be straightforward to copy that. I'll look into that.

Agreed.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] leaky views, yet again

2010-09-01 Thread KaiGai Kohei
(2010/09/02 12:38), Robert Haas wrote:
 2010/9/1 KaiGai Koheikai...@ak.jp.nec.com:
 (2010/09/02 11:57), Robert Haas wrote:
 2010/9/1 KaiGai Kohei kai...@ak.jp.nec.com:
 Right now, it stands on a strict assumption that operators
 implemented with built-in functions are safe; they have no
 possibility of leaking the supplied arguments anywhere.

 Please note that this patch does not care about the case when
 a function inside a view and a function outside a view are
 distributed into the same level and the latter function has a
 lower cost value.

 Without making some attempt to address these two points, I don't see
 the point of this patch.

 Also, I believe we decided previously to do this deoptimization only in
 case the user requests it with CREATE SECURITY VIEW.

 Perhaps I am remembering the previous discussion incorrectly.

 If we have a hint about whether the supplied view is intended for security
 purposes or not, it seems to me that is a reliable way to prevent pulling
 up the subqueries that come from security views.
 Is that too much deoptimization?
 
 Well, that'd prevent something like id = 3 from getting pushed down,
 which seems a bit harsh.
 
Hmm. If so, we need to remember which FromExprs came from the subqueries
of security views and which did not. Then, we need to prevent leakable
clauses from being distributed into them.
In addition, we also need to care about the order of function calls at the
same level, because that is not solved implicitly.

First, let's consider the top half of the matter.

When views are expanded into subqueries in the query rewriter, the Query
tree loses the OID of the view, because a RangeTblEntry does not carry the
relation's OID when it is RTE_SUBQUERY. So we need to patch here to set a
flag recording whether the supplied view is security focused or not.

Then pull_up_simple_subquery() pulls up the supplied subquery into a normal
join, if possible. In this case, the FromExpr is chained into the upper level.
Of course, FromExpr does not have a flag to show its origin, so we also need
to copy the new flag from the RangeTblEntry to the FromExpr.

Then, when distribute_qual_to_rels() is called, the caller also provides a
Bitmapset of the relation IDs contained under any FromExpr flagged as coming
from a security view.
If the supplied clause references any relation in that Bitmapset, we need to
prevent the clause from being pushed down into the relations that came from
security views, except for clauses we can prove are safe.
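
A rough sketch of that last check follows; everything except bms_overlap()
is a hypothetical name used for illustration, not an existing planner field
or function:

/*
 * Hypothetical sketch: clause_relids are the relids referenced by the
 * clause, security_view_relids the relids that came from flagged
 * security-view FromExprs, and clause_is_leakproof() is an imaginary test
 * that every function/operator in the clause cannot leak its arguments.
 */
static bool
qual_safe_to_push_down(Node *clause, Relids clause_relids,
                       Relids security_view_relids)
{
    /* Clauses that never touch a security-view relation are unaffected. */
    if (!bms_overlap(clause_relids, security_view_relids))
        return true;

    /* Otherwise, allow push-down only for provably non-leaking clauses. */
    return clause_is_leakproof(clause);
}

distribute_qual_to_rels() would consult such a check and, when it fails,
keep the clause attached above the security view's subquery instead of
pushing it further down.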

Perhaps, it is not impossible

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] compiling with RELCACHE_FORCE_RELEASE doesn't pass regression

2010-09-01 Thread Jeff Davis
On Wed, 2010-09-01 at 20:57 -0400, Tom Lane wrote:
 I wrote:
  Probably the best fix would be to make typcache flushing fully
  independent of the relcache, but that would mean making sure that all
  ALTER TABLE variants that affect the rowtype will issue an explicit
  typcache flush.  That seems a bit too invasive to be back-patchable.
  I'm not entirely sure this sort of failure can occur without
  RELCACHE_FORCE_RELEASE, but I'm definitely not sure it can't, so a
  backpatchable fix would be nice.
 
 After a bit more study it seems that there is a reasonably
 back-patchable approach to this.  We can continue to drive flushing of
 composite-type typcache entries off of relcache flush, but it has to
 occur when we do RelationCacheInvalidateEntry() or
 RelationCacheInvalidate() due to a SI invalidate event, not just
 anytime a relcache entry is closed.  We can do that by plugging in a
 callback function with CacheRegisterRelcacheCallback.
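
For reference, a minimal sketch of that callback approach, as it might live
inside typcache.c, is below; the exact entry fields and the refcount handling
are assumptions about how the fix could look, not the committed patch:

/*
 * Hedged sketch: drop cached composite-type tuple descriptors whenever the
 * corresponding relcache entry is invalidated. Assumes typcache.c's private
 * TypeCacheHash table and the tupDesc/typrelid fields of TypeCacheEntry.
 */
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
    HASH_SEQ_STATUS status;
    TypeCacheEntry *entry;

    hash_seq_init(&status, TypeCacheHash);
    while ((entry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
    {
        if (entry->tupDesc == NULL)
            continue;           /* not a composite type, or never requested */

        /* relid == InvalidOid means "flush everything" */
        if (relid == InvalidOid || entry->typrelid == relid)
        {
            /* release our reference; the descriptor is rebuilt on next lookup */
            if (--entry->tupDesc->tdrefcount == 0)
                FreeTupleDesc(entry->tupDesc);
            entry->tupDesc = NULL;
        }
    }
}

/* ... and, somewhere in typcache initialization ... */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);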
 

I think I see how this fixes the problem, but I still don't completely
understand.

Why can't we just make a real copy of the tuple descriptor for the type
cache entry, rather than sharing it between the relcache and the type
cache?
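
(For illustration only, the "real copy" alternative would amount to something
like the hypothetical helper below, built on the existing
CreateTupleDescCopyConstr(); where it would be called from is exactly the
open question.)

/*
 * Hypothetical sketch: give the typcache entry its own independent copy of
 * the relation's row descriptor instead of a pointer shared with relcache.
 */
#include "access/tupdesc.h"     /* CreateTupleDescCopyConstr */
#include "utils/rel.h"          /* Relation, RelationGetDescr */
#include "utils/typcache.h"     /* TypeCacheEntry */

static void
typcache_store_tupdesc_copy(TypeCacheEntry *entry, Relation rel)
{
    entry->tupDesc = CreateTupleDescCopyConstr(RelationGetDescr(rel));
}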

Thank you for the quick response.

Regards,
Jeff Davis

