Re: [HACKERS] Chronic performance issue with Replication Failover and FSM.

2012-03-14 Thread Heikki Linnakangas

On 14.03.2012 01:53, Josh Berkus wrote:

1. The Free Space Map is not replicated between servers.

2. Thus, when we fail over to a replica, it starts with a blank FSM.


The FSM is included in the base backup, and it is updated when VACUUM 
records are replayed.


It is also updated when insert/update/delete records are replayed, 
although there's some fuzziness there: records with full-page images
don't update the FSM, and the FSM is only updated when the page has less 
than 20% of free space left. But that would cause an error in the other 
direction, with the FSM claiming that some pages have more free space 
than they do in reality.
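
For anyone who wants to verify this on a standby, the pg_freespacemap
contrib module can dump the map directly. A minimal sketch (the table
name is a made-up example):

    CREATE EXTENSION pg_freespacemap;
    -- pg_freespace() reports the FSM's notion of free bytes per block
    SELECT blkno, avail FROM pg_freespace('pgbench_accounts')
    WHERE avail > 0 LIMIT 10;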


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Keystone auth in PostgreSQL

2012-03-14 Thread Vivek Singh Raghuwanshi
Hi All,

Can I use Keystone auth with PostgreSQL? It would be very helpful when
using OpenStack as a cloud service and implementing DBaaS.

-- 
ViVek Raghuwanshi
Mobile -+91-09595950504

Skype - vivek_raghuwanshi

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-14 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 13, 2012 at 5:06 PM, Andres Freund and...@anarazel.de wrote:
 Generally, upon rereading, I have to say that I am not very happy with the
 decision that ANY triggers are fired from different places than the specific
 triggers. That seems to be a rather dangerous/confusing route to me.

 I agree. I think that's a complete non-starter.

Ok, well, let me react in 2 ways here:

 A. it's very easy to change and will simplify the code
 B. it's been done this way for good reasons (at the time)

Specifically, I've been asked to implement the feature of blocking all
and any DDL activity on a machine in a single command, and we don't have
support for triggers on all commands (remember shared objects).

Now that I've completed support for all interesting commands, the
discrepancy between what's supported in the ANY case and in the
specific-command case has been reduced. If you're saying it should be
reduced to nothing, that's good news.

Also, calling the user's procedure from the same place for an ANY
command trigger as for a specific one makes it possible to hand over
the exact same set of info (object id, name, schema name).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Too many IO?

2012-03-14 Thread Simon Riggs
On Wed, Mar 14, 2012 at 2:29 AM, Tatsuo Ishii is...@postgresql.org wrote:

 However I saw 1505 more accesses in total. My guess is that this number
 mainly comes from index meta page accesses. So my guess is that we need 3
 page accesses (to traverse the b-tree) before reaching the leaf page on
 average. Am I correct, or is the number excessive?

Meta page access was optimised away some time ago.

Descending the index tree can easily take that long, perhaps longer
when the table is larger and the tree is deeper.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb and fsync

2012-03-14 Thread Andres Freund
On Wednesday, March 14, 2012 05:23:03 AM Jeff Davis wrote:
 On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote:
  for recursively everything in dir:
 posix_fadvise(fd, POSIX_FADV_DONTNEED);
  
  for recursively everything in dir:
 fsync(fd);
 
 Wow, that made a huge difference!
 
   no sync:  ~ 1.0s
   sync: ~10.0s
   fadvise+sync: ~ 1.3s
 
 Patch attached.
 
 Now I feel much better about it. Most people will either have fadvise, a
 write cache (rightly or wrongly), or actually need the sync. Those that
 have none of those can use -N.
Well, while the positive effect of this is rather large, it also has the bad
effect of pushing the whole new database out of the cache, which is not so
nice if you want to run tests on it seconds later.
How are the results with sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE)?

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Weird behaviour

2012-03-14 Thread Vlad Arkhipov
Could anyone please explain the behaviour of Postgres in the cases
below? It evaluates the unused expression t.x || f() in the first case
but doesn't do so in the second one. It's also strange that the last
EXPLAIN throws an error.


postgres=# select version();
                                version
------------------------------------------------------------------------
 PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
 4.1.2 20080704 (Red Hat 4.1.2-51), 64-bit
(1 row)

postgres=# create or replace function f()
postgres-# returns text as $$
postgres$# begin
postgres$#   raise exception 'here';
postgres$# end;
postgres$# $$ language plpgsql immutable;

postgres=# select t.x
postgres-# from (
postgres(#   select t.x, t.x || f()
postgres(#   from (values(1)) as t(x)
postgres(# ) t;
ERROR:  here

postgres=# select t.x
postgres-# from (
postgres(#   select t.x, t.x::text || f()
postgres(#   from (values(1)) as t(x)
postgres(# ) t;
 x
---
 1
(1 row)

postgres=# explain select t.x
postgres-# from (
postgres(#   select t.x, t.x || f()
postgres(#   from (values(1)) as t(x)
postgres(# ) t;
ERROR:  here

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread David Fetter
On Tue, Mar 13, 2012 at 08:24:47AM -0700, David Fetter wrote:
 Folks,
 
 This is for 9.3, of course.
 
 I noticed that CREATE FOREIGN TABLE (LIKE some_table) doesn't work.  I
 believe it should, as it would:
 
 - Remove a POLA violation
 - Make data loading into an extant table even easier, especially if
   filtering or other cleanup steps are needed
 
 Come to think of it, which CREATE TABLE options are inappropriate to
 CREATE FOREIGN TABLE?
 
 Cheers,
 David.

Here's a WIP patch (lots of cut/paste, no docs, no tests), but it does
work.  Still to do in addition: decide whether ALTER FOREIGN TABLE
should also handle LIKE.
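
To illustrate, here is roughly what the patch enables; the server name
and OPTIONS below are made-up placeholders, and the grammar change
itself appears in the diff that follows:

    CREATE TABLE measurements (ts timestamptz NOT NULL, reading numeric);

    CREATE FOREIGN TABLE measurements_ft (LIKE measurements)
        SERVER file_server
        OPTIONS (filename '/tmp/measurements.csv', format 'csv');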

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 5cde225..c634e19 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2727,6 +2727,16 @@ _copyTableLikeClause(const TableLikeClause *from)
return newnode;
 }
 
+static ForeignTableLikeClause *
+_copyForeignTableLikeClause(const ForeignTableLikeClause *from)
+{
+   ForeignTableLikeClause *newnode = makeNode(ForeignTableLikeClause);
+
+   COPY_NODE_FIELD(relation);
+
+   return newnode;
+}
+
 static DefineStmt *
 _copyDefineStmt(const DefineStmt *from)
 {
@@ -4126,6 +4136,9 @@ copyObject(const void *from)
case T_TableLikeClause:
retval = _copyTableLikeClause(from);
break;
+   case T_ForeignTableLikeClause:
+   retval = _copyForeignTableLikeClause(from);
+   break;
case T_DefineStmt:
retval = _copyDefineStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index d2a79eb..55cc2db 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1170,6 +1170,14 @@ _equalTableLikeClause(const TableLikeClause *a, const TableLikeClause *b)
 }
 
 static bool
+_equalForeignTableLikeClause(const ForeignTableLikeClause *a, const ForeignTableLikeClause *b)
+{
+   COMPARE_NODE_FIELD(relation);
+
+   return true;
+}
+
+static bool
 _equalDefineStmt(const DefineStmt *a, const DefineStmt *b)
 {
COMPARE_SCALAR_FIELD(kind);
@@ -2685,6 +2693,9 @@ equal(const void *a, const void *b)
case T_TableLikeClause:
retval = _equalTableLikeClause(a, b);
break;
+   case T_ForeignTableLikeClause:
+   retval = _equalForeignTableLikeClause(a, b);
+   break;
case T_DefineStmt:
retval = _equalDefineStmt(a, b);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 51181a9..88599ba 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2057,6 +2057,14 @@ _outTableLikeClause(StringInfo str, const TableLikeClause *node)
 }
 
 static void
+_outForeignTableLikeClause(StringInfo str, const ForeignTableLikeClause *node)
+{
+   WRITE_NODE_TYPE(FOREIGNTABLELIKECLAUSE);
+
+   WRITE_NODE_FIELD(relation);
+}
+
+static void
 _outLockingClause(StringInfo str, const LockingClause *node)
 {
WRITE_NODE_TYPE(LOCKINGCLAUSE);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index feb28a4..34e5bfc 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -373,7 +373,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 %type <vsetstmt> set_rest set_rest_more SetResetClause FunctionSetResetClause
 
 %type <node>   TableElement TypedTableElement ConstraintElem TableFuncElement
-               ForeignTableElement
+               ForeignTableElement ForeignTableLikeClause
 %type <node>   columnDef columnOptions
 %type <defelt> def_elem reloption_elem old_aggr_elem
 %type <node>   def_arg columnElem where_clause where_or_current_clause
@@ -3950,6 +3950,16 @@ ForeignTableElementList:
 
 ForeignTableElement:
            columnDef                       { $$ = $1; }
+           | ForeignTableLikeClause        { $$ = $1; }
+   ;
+
+ForeignTableLikeClause:
+   LIKE qualified_name
+   {
+                   ForeignTableLikeClause *n = makeNode(ForeignTableLikeClause);
+                   n->relation = $2;
+   $$ = (Node *)n;
+   }
;
 
 

Re: [HACKERS] pg_prewarm

2012-03-14 Thread Robert Haas
On Fri, Mar 9, 2012 at 5:24 AM, Fujii Masao masao.fu...@gmail.com wrote:
 For such a system, so far I've been suggesting using pgstatindex, but it
 would be good if pg_prewarm could do that.

Relevant to this, see commit 2e46bf67114586835f4a9908f1a1f08ee8ba83a8.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 8:28 AM, David Fetter da...@fetter.org wrote:
 On Tue, Mar 13, 2012 at 08:24:47AM -0700, David Fetter wrote:
 Folks,

 This is for 9.3, of course.

 I noticed that CREATE FOREIGN TABLE (LIKE some_table) doesn't work.  I
 believe it should, as it would:

 - Remove a POLA violation
 - Make data loading into an extant table even easier, especially if
   filtering or other cleanup steps are needed

 Come to think of it, which CREATE TABLE options are inappropriate to
 CREATE FOREIGN TABLE?

 Here's a WIP patch (lots of cut/paste, no docs, no tests), but it does
 work.  Still to do in addition: decide whether ALTER FOREIGN TABLE
 should also handle LIKE.

I think that instead of inventing new grammar productions and a new
node type for this, you should just reuse the existing productions for
LIKE clauses and then reject invalid options during parse analysis.
INCLUDING COMMENTS would be OK, but the rest are no good.

I'd actually like to see us allow foreign tables to have constraints.
Obviously, we can't enforce constraints on remote data, but the point
would be to allow the system administrator to supply the query planner
with enough knowledge to make constraint exclusion work.  The fact
that you can't make that work today is a major gap, IMV.
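
For the archives, a minimal sketch of what constraint exclusion buys,
shown with plain tables since foreign tables cannot carry CHECK
constraints today (all names invented):

    CREATE TABLE events_2011 (id int, created date CHECK (created <  '2012-01-01'));
    CREATE TABLE events_2012 (id int, created date CHECK (created >= '2012-01-01'));
    CREATE VIEW events AS
        SELECT * FROM events_2011 UNION ALL SELECT * FROM events_2012;

    SET constraint_exclusion = on;
    -- Only events_2012 is scanned; events_2011 is excluded because its
    -- CHECK constraint contradicts the WHERE clause.
    EXPLAIN SELECT * FROM events WHERE created >= '2012-06-01';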

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 4:27 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Also, when calling the user's procedure from the same place in case of an
 ANY command trigger or a specific one it's then possible to just hand
 them over the exact same set of info (object id, name, schema name).

Yes, I think that's an essential property of the system, here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Syntax error and reserved keywords

2012-03-14 Thread Dimitri Fontaine
Hi,

A colleague came to me to express his surprise about this quite simple
use case:

  =# alter table toto add column user text;
  ERROR:  syntax error at or near "user"
  LINE 1: alter table toto add column user text;

Is there a reason for us not to add a HINT: "user" is a reserved
keyword or something like that, other than nobody having been interested
in doing the work?
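
For reference, the existing workaround is to double-quote the name,
which turns the reserved word into an ordinary identifier:

    ALTER TABLE toto ADD COLUMN "user" text;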

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread David Fetter
On Wed, Mar 14, 2012 at 08:53:17AM -0400, Robert Haas wrote:
 On Wed, Mar 14, 2012 at 8:28 AM, David Fetter da...@fetter.org wrote:
  On Tue, Mar 13, 2012 at 08:24:47AM -0700, David Fetter wrote:
  Folks,
 
  This is for 9.3, of course.
 
  I noticed that CREATE FOREIGN TABLE (LIKE some_table) doesn't work.  I
  believe it should, as it would:
 
  - Remove a POLA violation
  - Make data loading into an extant table even easier, especially if
    filtering or other cleanup steps are needed
 
  Come to think of it, which CREATE TABLE options are inappropriate to
  CREATE FOREIGN TABLE?
 
  Here's a WIP patch (lots of cut/paste, no docs, no tests), but it does
  work.  Still to do in addition: decide whether ALTER FOREIGN TABLE
  should also handle LIKE.
 
 I think that instead of inventing new grammar productions and a new
 node type for this, you should just reuse the existing productions for
 LIKE clauses and then reject invalid options during parse analysis.

OK.  Should I first merge CREATE FOREIGN TABLE with CREATE TABLE and
submit that as a separate patch?

 INCLUDING COMMENTS would be OK, but the rest are no good.

At least for now.  I can see FDWs in the future that would delegate
the decision to the remote side, and if the remote side happens to be
PostgreSQL, a lot of those delegations could be in force.  Of course,
this would either create a dependency that would need to be tracked in
the other node or not be able to guarantee the durability of DDL, the
latter being the current situation.  I suspect there would be use
cases for each.

 I'd actually like to see us allow foreign tables to have constraints.

So would I :)

 Obviously, we can't enforce constraints on remote data, but the point
 would be to allow the system administrator to supply the query planner
 with enough knowledge to make constraint exclusion work.  The fact
 that you can't make that work today is a major gap, IMV.

I didn't do INHERITS because most FDWs won't ever have that concept,
i.e. aren't PostgreSQL.  Are you thinking about this as a general way
to handle remote partitioned tables?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] VALID UNTIL

2012-03-14 Thread Euler Taveira
Hi,

I had a hard time figuring out why my replication stopped with a message like

FATAL:  password authentication failed for user "foo"

in the logs. I thought it was some pg_hba.conf change, a .pgpass modification,
or the NOLOGIN option, but it wasn't. I was out of options when I remembered to
check whether a VALID UNTIL option was set. To my surprise, it isn't exposed by
\du or even \du+.

euler=# \du
List of roles
-[ RECORD 1 ]--
Role name  | euler
Attributes | Superuser, Create role, Create DB, Replication
Member of  | {}
-[ RECORD 2 ]--
Role name  | foo
Attributes |
Member of  | {}

euler=# \du+
List of roles
-[ RECORD 1 ]---
Role name   | euler
Attributes  | Superuser, Create role, Create DB, Replication
Member of   | {}
Description |
-[ RECORD 2 ]---
Role name   | foo
Attributes  |
Member of   | {}
Description |

but after checking in the catalog I got it.

test=# select rolname,rolvaliduntil from pg_authid;
 rolname |     rolvaliduntil
---------+------------------------
 euler   |
 foo     | 2012-03-01 00:00:00-03
(2 rows)

Is there any reason why it is not exposed? What about exposing that
information in the Attributes column or even in a separate column? It
would help troubleshoot cases like this quickly.
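
In the meantime, note that the expiration is visible even without
superuser access through the pg_roles view:

    SELECT rolname, rolvaliduntil
    FROM pg_roles
    WHERE rolvaliduntil IS NOT NULL;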


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Mar 14, 2012 at 8:28 AM, David Fetter da...@fetter.org wrote:
 Here's a WIP patch (lots of cut/paste, no docs, no tests), but it does
 work.  Still to do in addition: decide whether ALTER FOREIGN TABLE
 should also handle LIKE.

 I think that instead of inventing new grammar productions and a new
 node type for this, you should just reuse the existing productions for
 LIKE clauses and then reject invalid options during parse analysis.

+1; in this approach, adding more features will make it worse, not better.

 I'd actually like to see us allow foreign tables to have constraints.
 Obviously, we can't enforce constraints on remote data, but the point
 would be to allow the system administrator to supply the query planner
 with enough knowledge to make constraint exclusion work.  The fact
 that you can't make that work today is a major gap, IMV.

Hm.  That opinion seems to me to connect to the recently-posted patch to
make contrib/file_fdw enforce NOT NULL constraints.  Should we instead
have the position that constraints declared for foreign tables are
statements that we can take on faith, and it's the user's fault if they
are wrong?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread David Fetter
On Wed, Mar 14, 2012 at 10:27:28AM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Wed, Mar 14, 2012 at 8:28 AM, David Fetter da...@fetter.org wrote:
  Here's a WIP patch (lots of cut/paste, no docs, no tests), but it does
  work.  Still to do in addition: decide whether ALTER FOREIGN TABLE
  should also handle LIKE.
 
  I think that instead of inventing new grammar productions and a new
  node type for this, you should just reuse the existing productions for
  LIKE clauses and then reject invalid options during parse analysis.
 
 +1; in this approach, adding more features will make it worse not better.

OK :)

  I'd actually like to see us allow foreign tables to have constraints.
  Obviously, we can't enforce constraints on remote data, but the point
  would be to allow the system administrator to supply the query planner
  with enough knowledge to make constraint exclusion work.  The fact
  that you can't make that work today is a major gap, IMV.
 
 Hm.  That opinion seems to me to connect to the recently-posted
 patch to make contrib/file_fdw enforce NOT NULL constraints.  Should
 we instead have the position that constraints declared for foreign
 tables are statements that we can take on faith, and it's the user's
 fault if they are wrong?

I think that's something FDWs need to be able to communicate to
PostgreSQL.  For example, something talking to another PostgreSQL
would (potentially, anyhow) have access to deep knowledge of the
remote side, but file_fdw would only have best efforts even for clever
things like statistics.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Syntax error and reserved keywords

2012-03-14 Thread Pavel Stehule
2012/3/14 Dimitri Fontaine dimi...@2ndquadrant.fr:
 Hi,

 A colleague came to me to express his surprise about this quite simple
 use case:

  =# alter table toto add column user text;
  ERROR:  syntax error at or near "user"
  LINE 1: alter table toto add column user text;

 Is there a reason for us not to add a HINT: "user" is a reserved
 keyword or something like that, other than nobody having been interested
 in doing the work?

Probably nobody has done this work. Looking at the current code, I think
this request would be relatively simple to implement - and I agree that
this could be a really nice feature.

Regards

Pavel


 Regards,
 --
 Dimitri Fontaine
 http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Syntax error and reserved keywords

2012-03-14 Thread Euler Taveira
On 14-03-2012 10:58, Dimitri Fontaine wrote:
 Is there a reason for us not to add a HINT: "user" is a reserved
 keyword or something like that, other than nobody having been interested
 in doing the work?
 
AFAIK, there is no such message in the code. If you're volunteering to
do it, please cover all SQL commands.


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Too many IO?

2012-03-14 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 I have created a 29GB test database by using the standard pgbench -i -s
 2000.  Then I executed:

That means 200 million accounts rows.  With integer keys you could
expect to get 200 to 300 keys per index page.  Taking the number as 200
for simplicity, we expect 1 million leaf pages, 5000 level-1 inner
pages, 25 level-2 inner pages, and a level-3 root page.  Even if the
tree were packed completely full, it'd still be depth 3.

 As you can see, this query generated 1255+1250 = 2505 block reads,
 either from the buffer or from disk. In my understanding the query
 accesses an index tuple, which will need access to the root page and
 several meta pages (I mean index pages that are neither root nor leaf
 pages) and 1 leaf page, then access 1 heap block. So I expected the
 total number of IOs to be about:

 500 index leaf pages + 500 heap blocks = 1000

The way I count it, each probe will touch the root page, a level-2 inner
page, a level-1 inner page, a leaf page, and a heap page, so five buffer
touches per cycle, which is almost exactly what you've got.  Only the
first two of those are very likely to benefit from caching from previous
searches, so the fact that you got 1255 hits and not only 1000 is
actually a bit better than expected.  Probably this query was not done
from a standing start, and so some of the level-1 pages were already in
buffer cache.
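
Two ways to check these numbers empirically, assuming the pgbench
schema from the test (pgstatindex lives in the contrib pgstattuple
module):

    -- tree_level = 3 matches the root plus two inner levels described above
    CREATE EXTENSION pgstattuple;
    SELECT tree_level, root_block_no FROM pgstatindex('pgbench_accounts_pkey');

    -- EXPLAIN (ANALYZE, BUFFERS) reports the per-query buffer touches
    -- being counted in this thread:
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT abalance FROM pgbench_accounts WHERE aid = 12345;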

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Syntax error and reserved keywords

2012-03-14 Thread Pavel Stehule
2012/3/14 Pavel Stehule pavel.steh...@gmail.com:
 2012/3/14 Dimitri Fontaine dimi...@2ndquadrant.fr:
 Hi,

 A colleague came to me to express his surprise about this quite simple
 use case:

  =# alter table toto add column user text;
  ERROR:  syntax error at or near "user"
  LINE 1: alter table toto add column user text;

  Is there a reason for us not to add a HINT: "user" is a reserved
 keyword or something like that, other than nobody having been interested
 in doing the work?

 Probably nobody did this work. I am thinking so on current code, this
 request is relatively simple implemented - and I agree so this can be
 really nice feature.


but it is not as simple as I thought

this message comes from scanner_yyerror - and forwarding a hint into
this callback routine is not trivial - moreover, this message is used
both when the word is a reserved keyword that must not appear there and
when the word is simply a misspelled keyword.

Regards

Pavel

 Regards

 Pavel


 Regards,
 --
 Dimitri Fontaine
 http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] VALID UNTIL

2012-03-14 Thread David Fetter
On Wed, Mar 14, 2012 at 11:20:05AM -0300, Euler Taveira de Oliveira wrote:
 Hi,
 
  I had a hard time figuring out why my replication stopped with a message like
  
  FATAL:  password authentication failed for user "foo"
  
  in the logs. I thought it was some pg_hba.conf change, a .pgpass modification,
  or the NOLOGIN option, but it wasn't. I was out of options when I remembered
  to check whether a VALID UNTIL option was set. To my surprise, it isn't
  exposed by \du or even \du+.
 
 Is there any reason why it is not exposed?

Oversight.

 What about exposing that information in attributes or even in a
 separate column? It could help troubleshooting quickly on this case.

I don't know how frequently people use VALID UNTIL, but I'm guessing
it's not terribly often because yours is the first comment about how
it's not exposed, so I'd tend toward putting it in attributes rather
than a separate column.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Weird behaviour

2012-03-14 Thread Tom Lane
Vlad Arkhipov arhi...@dc.baikal.ru writes:
 Could anyone please explain the behaviour of Postgres in the cases 
 below?

I think it has something to do with anytextcat() being mistakenly marked
as volatile, thus preventing flattening of the subquery in the cases
where you don't explicitly coerce the integer to text.  When the
subquery does get flattened, that results in discarding the troublesome
expression as being unreferenced, so no error.  HEAD doesn't throw the
error for either case, thanks to commit
3db6524fe63f0598dcb2b307bb422bc126f2b15d.

 It evaluates the unused expression t.x || f() in the first case
 but doesn't do so in the second one. It's also strange that the last
 EXPLAIN throws an error.

I think your expectations need adjustment: what is strange is not that
you get the error, but that you sometimes fail to get it.  In general
the planner assumes that it can freely evaluate immutable functions, and
so this query typically *will* throw an error during
constant-simplification.  In some of these phrasings you manage to avoid
that because the expression is discarded as unreferenced before
const-simplification runs, but that's an implementation artifact that
should not be relied on.
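
To make the point concrete with the f() defined in the original report:
because f() is immutable and has constant arguments, the planner
pre-evaluates it, so even a bare EXPLAIN, which never executes the
query, raises the error:

    postgres=# explain select f();
    ERROR:  here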

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread Tom Lane
David Fetter da...@fetter.org writes:
 I didn't do INHERITS because most FDWs won't ever have that concept,
 i.e. aren't PostgreSQL.

What's that have to do with it?  Inheritance would be a local
association of tables, having nothing to do with what the remote end is.
IOW, if c inherits from p, that means to scan c as well in SELECT FROM
p.  We can do this regardless of whether c or p or both are foreign
tables.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread Tom Lane
David Fetter da...@fetter.org writes:
 On Wed, Mar 14, 2012 at 10:27:28AM -0400, Tom Lane wrote:
 Hm.  That opinion seems to me to connect to the recently-posted
 patch to make contrib/file_fdw enforce NOT NULL constraints.  Should
 we instead have the position that constraints declared for foreign
 tables are statements that we can take on faith, and it's the user's
 fault if they are wrong?

 I think that's something FDWs need to be able to communicate to
 PostgreSQL.  For example, something talking to another PostgreSQL
 would (potentially, anyhow) have access to deep knowledge of the
 remote side, but file_fdw would only have best efforts even for clever
 things like statistics.

I think we're talking at cross-purposes.  What you're saying seems to
assume that it's the system's responsibility to do something about a
constraint declared on a foreign table.  What I'm suggesting is that
maybe it isn't.  A constraint, ordinarily, would be enforced against
table *updates*, and then just assumed valid during reads.  In the case
of a foreign table, we can't enforce constraints during updates because
we don't have control of all updates.  Should we ignore declared
constraints because they're not necessarily true?  Should we assume on
faith that they're true?  The posted patch for file_fdw takes the
approach of silently filtering out rows for which they're not true,
which is not obviously the right thing either --- quite aside from
whether that's a sane semantics, it's not going to scale to foreign key
constraints, and even for simple NOT NULL and CHECK constraints it
results in a runtime penalty on selects, which is not what people would
expect from a constraint.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] VALID UNTIL

2012-03-14 Thread Fabrízio de Royes Mello
2012/3/14 David Fetter da...@fetter.org


 I don't know how frequently people use VALID UNTIL, but I'm guessing
 it's not terribly often because yours is the first comment about how
 it's not exposed, so I'd tend toward putting it in attributes rather
 than a separate column.


If it's desired, I can write a patch to put "Valid until" into the
Attributes column.

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread David Fetter
On Wed, Mar 14, 2012 at 11:29:14AM -0400, Tom Lane wrote:
 David Fetter da...@fetter.org writes:
  On Wed, Mar 14, 2012 at 10:27:28AM -0400, Tom Lane wrote:
  Hm.  That opinion seems to me to connect to the recently-posted
  patch to make contrib/file_fdw enforce NOT NULL constraints.
  Should we instead have the position that constraints declared for
  foreign tables are statements that we can take on faith, and it's
  the user's fault if they are wrong?
 
  I think that's something FDWs need to be able to communicate to
  PostgreSQL.  For example, something talking to another PostgreSQL
  would (potentially, anyhow) have access to deep knowledge of the
  remote side, but file_fdw would only have best efforts even for
  clever things like statistics.
 
 I think we're talking at cross-purposes.  What you're saying seems
 to assume that it's the system's responsibility to do something
 about a constraint declared on a foreign table.  What I'm suggesting
 is that maybe it isn't.

Actually, I'm suggesting that this behavior needs to be controlled,
not system-wide, but per FDW, and eventually per server, table and
column.

 A constraint, ordinarily, would be enforced against table *updates*,
 and then just assumed valid during reads.  In the case of a foreign
 table, we can't enforce constraints during updates because we don't
 have control of all updates.

I think that the situation will become a bit more nuanced than that.
A FDW could delegate constraints to the remote side, and in principle,
the remote side could inform PostgreSQL of all types of changes (DML,
DCL, DDL).

 Should we ignore declared constraints because they're not
 necessarily true?  Should we assume on faith that they're true?

Neither.  We should instead have ways for FDWs to say which
constraints are local-only, and which are presumed correct on the
remote side.  If they lie when asserting the latter, that's pilot error.

 The posted patch for file_fdw takes the approach of silently
 filtering out rows for which they're not true, which is not
 obviously the right thing either --- quite aside from whether that's
 a sane semantics,

It clearly is for the author's use case.  Other use cases will differ.

 it's not going to scale to foreign key constraints, and even for
 simple NOT NULL and CHECK constraints it results in a runtime
 penalty on selects, which is not what people would expect from a
 constraint.

If people expect FK constraints on, say, a Twitter feed, they're
riding for a very hard fall.  If they expect them on a system with
several PostgreSQL nodes in it, that could very well be reasonable.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread Tom Lane
David Fetter da...@fetter.org writes:
 On Wed, Mar 14, 2012 at 11:29:14AM -0400, Tom Lane wrote:
 The posted patch for file_fdw takes the approach of silently
 filtering out rows for which they're not true, which is not
 obviously the right thing either --- quite aside from whether that's
 a sane semantics,

 It clearly is for the author's use case.  Other use cases will differ.

You're assuming facts not in evidence.  Fujita-san posted that patch not
because he had any use case one way or the other, but because he read
something in fdwhandler.sgml that made him think it was required to
avoid planner malfunctions.  (Actually it is not, at the moment, since
we don't do any optimizations based on NOT NULL properties; but we might
in future.)  The question on the table is precisely whether believing a
contrary-to-fact NOT NULL assertion would constitute planner malfunction
or user error.

In general, the approach you're sketching towards foreign constraints
seems to me to be 100% overdesign with no basis in known user
requirements.  We have a list longer than my arm of things that are
more pressing than doing anything like that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 10:22 AM, David Fetter da...@fetter.org wrote:
 I think that instead of inventing new grammar productions and a new
 node type for this, you should just reuse the existing productions for
 LIKE clauses and then reject invalid options during parse analysis.

 OK.  Should I first merge CREATE FOREIGN TABLE with CREATE TABLE and
 submit that as a separate patch?

I don't see any reason to do that.  I merely meant that you could
reuse TableLikeClause or maybe even TableElement in the grammar for
CreateForeignTableStmt.

 INCLUDING COMMENTS would be OK, but the rest are no good.

 At least for now.  I can see FDWs in the future that would delegate
 the decision to the remote side, and if the remote side happens to be
 PostgreSQL, a lot of those delegations could be in force.  Of course,
 this would either create a dependency that would need to be tracked in
 the other node or not be able to guarantee the durability of DDL, the
 latter being the current situation.  I suspect there would be use
 cases for each.

What's relevant for LIKE is whether we want to create constraints,
defaults, comments, etc. on the *local* side, not the remote side -
and that has nothing to do with the particular choice of FDW in use.

I don't think we should conflate the local and remote sides.  Even if
a foreign table refers to a remote table that has comments on its
columns, there's no rule that the comments on the foreign side must
match up with the comments on the local side, and in fact I think that
in general we want to keep those concepts clearly distinct.  There's
no guarantee that the two systems are controlled by the same DBA, and
they might each have their own choice words about those columns.  IOW,
even if we had the ability to keep those things synchronized, we
shouldn't do it, or at least not by default.

 Obviously, we can't enforce constraints on remote data, but the point
 would be to allow the system administrator to supply the query planner
 with enough knowledge to make constraint exclusion work.  The fact
 that you can't make that work today is a major gap, IMV.

 I didn't do INHERITS because most FDWs won't ever have that concept,
 i.e. aren't PostgreSQL.  Are you thinking about this as a general way
 to handle remote partitioned tables?

The original foreign table patch included constraints and the ability
to inherit back and forth between regular tables and foreign tables.
I ripped all that out before committing because it wasn't sufficiently
well thought-out, but I'm not convinced it's something we never want
to do.  Either way, constraint exclusion can also be used in other
scenarios, like a UNION ALL view over several foreign tables.

And yes, I am thinking about remote partitioned tables.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread Ronan Dunklau
On 14/03/2012 16:47, David Fetter wrote:
 On Wed, Mar 14, 2012 at 11:29:14AM -0400, Tom Lane wrote:
 David Fetter da...@fetter.org writes:
 On Wed, Mar 14, 2012 at 10:27:28AM -0400, Tom Lane wrote:
 Hm.  That opinion seems to me to connect to the recently-posted
 patch to make contrib/file_fdw enforce NOT NULL constraints.
 Should we instead have the position that constraints declared for
 foreign tables are statements that we can take on faith, and it's
 the user's fault if they are wrong?

 I think that's something FDWs need to be able to communicate to
 PostgreSQL.  For example, something talking to another PostgreSQL
 would (potentially, anyhow) have access to deep knowledge of the
 remote side, but file_fdw would only have best efforts even for
 clever things like statistics.

 I think we're talking at cross-purposes.  What you're saying seems
 to assume that it's the system's responsibility to do something
 about a constraint declared on a foreign table.  What I'm suggesting
 is that maybe it isn't.
 
 Actually, I'm suggesting that this behavior needs to be controlled,
 not system-wide, but per FDW, and eventually per server, table and
 column.

 A constraint, ordinarily, would be enforced against table *updates*,
 and then just assumed valid during reads.  In the case of a foreign
 table, we can't enforce constraints during updates because we don't
 have control of all updates.
 
 I think that the situation will become a bit more nuanced than that.
 A FDW could delegate constraints to the remote side, and in principle,
 the remote side could inform PostgreSQL of all types of changes (DML,
 DCL, DDL).
 
 Should we ignore declared constraints because they're not
 necessarily true?  Should we assume on faith that they're true?
 
 Neither.  We should instead have ways for FDWs to say which
 constraints are local-only, and which are presumed correct on the remote
 side.  If they lie when asserting the latter, that's pilot error.
 

I don't understand what value that would bring. Do you propose that, if
an FDW declares a constraint as local-only, the planner should ignore
it, but when it is declared as remote, the planner could use this
information?

Let me describe a simple use case we have in one of our web
applications, which would benefit from foreign keys on foreign tables.

The application has users, stored in a users table, which can upload
files. The files are stored on the server's filesystem, using one folder
per user, named after the user_id.

Ex:

/
  1/
    myfile.png
  2/
    myotherfile.png


This filesystem is accessed using the StructuredFS FDW, which maps a
file system tree to a set of columns corresponding to parts of the file
path: every file whose path matches the pattern results in a row. Using
the aforementioned structure, the foreign table would have an user_id
column, and a filename column.

Now, the FDW itself cannot know that the foreign key will be enforced,
but as the application developer, I know that every directory will be
named after a user_id.

Allowing foreign keys on such a foreign table would allow us to describe
the model more precisely in PostgreSQL, and external tools could use
this knowledge too, even if PostgreSQL completely ignores them -
especially ORMs relying on foreign keys to determine join conditions
between tables.
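
A sketch of the declarations being described here, with all names
invented; the ALTER is what PostgreSQL rejects today, which is
precisely the request:

    CREATE FOREIGN TABLE user_files (
        user_id  integer,   -- first path component
        filename text       -- remainder of the path
    ) SERVER structured_fs;

    -- Desired as pure declaration, never enforced by the FDW:
    ALTER TABLE user_files
        ADD FOREIGN KEY (user_id) REFERENCES users (id);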

On the other hand, should foreign keys referencing a foreign table be
allowed too? From one foreign table to another, or from a local table to
a foreign table?

Regards,

-- 
Ronan Dunklau

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE LACUNA

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 12:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 David Fetter da...@fetter.org writes:
 On Wed, Mar 14, 2012 at 11:29:14AM -0400, Tom Lane wrote:
 The posted patch for file_fdw takes the approach of silently
 filtering out rows for which they're not true, which is not
 obviously the right thing either --- quite aside from whether that's
 a sane semantics,

 It clearly is for the author's use case.  Other use cases will differ.

 You're assuming facts not in evidence.  Fujita-san posted that patch not
 because he had any use case one way or the other, but because he read
 something in fdwhandler.sgml that made him think it was required to
 avoid planner malfunctions.  (Actually it is not, at the moment, since
 we don't do any optimizations based on NOT NULL properties; but we might
 in future.)  The question on the table is precisely whether believing a
 contrary-to-fact NOT NULL assertion would constitute planner malfunction
 or user error.

+1 for user error.  I think at some point I had taken the view that
perhaps the FDW should check the data it's emitting against the NOT
NULL constraints, but that would imply that we ought to cross-check
CHECK constraints as well, once we have those, which sounds
unreasonably expensive.  So defining the constraint as a promise by
the user seems best.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Robert Haas
On Tue, Mar 13, 2012 at 11:42 PM, Noah Misch n...@leadboat.com wrote:
 More often than that; each 2-member mxid takes 4 bytes in an offsets file and
 10 bytes in a members file.  So, more like one fsync per ~580 mxids.  Note
 that we already fsync the multixact SLRUs today, so any increase will arise
 from the widening of member entries from 4 bytes to 5.  The realism of this
 test is attractive.  Nearly-static parent tables are plenty common, and this
 test will illustrate the impact on those designs.
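
(One way to arrive at the ~580 figure, assuming the standard 8 kB SLRU
page and charging both files' bytes against a single flush:
\lfloor 8192 / (4 + 10) \rfloor = 585 two-member mxids per fsync.)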

Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU today?

 You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on mxid
 lookups, so I think something more sophisticated is needed to exercise that
 cost.  Not sure what.

I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
all-visible.  HEAP_XMAX_INVALID will obviously help, when it happens.

 This isn't exactly a test case, but from Noah's previous comments I
 gather that there is a theoretical risk of mxid consumption running
 ahead of xid consumption.  We should try to think about whether there
 are any realistic workloads where that might actually happen.  I'm
 willing to believe that there aren't, but not just because somebody
 asserts it.  The reason I'm concerned about this is because, if it
 should happen, the result will be more frequent anti-wraparound
 vacuums on every table in the cluster.  Those are already quite
 painful for some users.

 Yes.  Pre-release, what can we really do here other than have more people
 thinking about ways it might happen in practice?  Post-release, we could
 suggest monitoring methods or perhaps have VACUUM emit a WARNING when a table
 is using more mxid space than xid space.

Well, post-release, the cat is out of the bag: we'll be stuck with
this whether the performance characteristics are acceptable or not.
That's why we'd better be as sure as possible before committing to
this implementation that there's nothing we can't live with.  It's not
like there's any reasonable way to turn this off if you don't like it.

 Also consider a benchmark that does plenty of non-key updates on a parent
 table with no activity on the child table.  We'll pay the overhead of
 determining that the key column(s) have not changed, but it will never pay off
 by preventing a lock wait.

Good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: cross column correlation, 2nd shot

2012-03-14 Thread Robert Haas
On Tue, Mar 13, 2012 at 9:56 AM, Hans-Jürgen Schönig
postg...@cybertec.at wrote:
 Here's the cross-col patch against today's master branch.

Please add your patch here, so it doesn't get forgotten:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for parallel pg_dump

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 12:34 AM, Joachim Wieland j...@mcknight.de wrote:
 If a child terminates without leaving a message, the master will still
 detect it and just say "a worker process died unexpectedly" (this part
 was actually broken, but now it's fixed :-) )

 All that may be true, but I still don't see why it's right for this to
 apply in the cases where the worker thread says die_horribly(), but
 not in the cases where the worker says exit_horribly().

 Hm, I'm not calling the error handler from exit_horribly because it
 doesn't have the AH. It looks like the code assumes that
 die_horribly() is called whenever AH is available and if not,
 exit_horribly() should be called which eventually calls these
 preregistered exit-hooks via exit_nicely() to clean up the connection.

 I think we should somehow unify both functions, the code is not very
 consistent in this respect, it also calls exit_horribly() when it has
 AH available. See for example pg_backup_tar.c

I think we should get rid of die_horribly(), and instead arrange to
always clean up AH via an on_exit_nicely hook.

 Sure, but since all the function does is write to it or access it,
 what good does that do me?

 It encapsulates the variable so that it can only be used for one
 specific use case.

Seems pointless to me.

+   /*
+    * This is a data dumper routine, executed in a child for parallel backup,
+    * so it must not access the global g_conn but AH->connection instead.
+    */

There's no g_conn any more.  This and several other references to it
should be updated or expunged.

+   {
+       write_msg(NULL, "parallel backup only supported by the directory format\n");
+       exit(1);
+   }

I think this should exit_horribly() with that message.  It definitely
can't use exit() rather than exit_nicely(); more generally, every copy
of exit() that you've added here should exit_nicely() instead, or use
some higher-level routine like exit_horribly().

+   write_msg(NULL, "No synchronized snapshots available in this server version.\n"
+                   "You might have to run with --no-synchronized-snapshots.\n");
+   exit(1);

In addition to the previous problem, what do you mean by "might"?  The
real problem is that on pre-9.2 versions multiple jobs are not OK
unless that option is used; I think we should say that more directly.

/*
 * The sequence is the following (for dump, similar for restore):
 *
 * Master   Worker
 *
 *  enters WaitForCommands()
 * DispatchJobForTocEntry(...te...)
 *
 * [ Worker is IDLE ]
 *
 * arg = (MasterStartParallelItemPtr)()
 * send: DUMP arg
 *  receive: DUMP arg
 *  str = (WorkerJobDumpPtr)(arg)
 * [ Worker is WORKING ]... gets te from arg ...
 *  ... dump te ...
 *  send: OK DUMP info
 *
 * In ListenToWorkers():
 *
 * [ Worker is FINISHED ]
 * receive: OK DUMP info
 * status = (MasterEndParallelItemPtr)(info)
 *
 * In ReapWorkerStatus(ptr):
 * *ptr = status;
 * [ Worker is IDLE ]
 */

I don't find this comment very clear, and would suggest rewriting it
using prose rather than an ASCII diagram.  Note also that any sort of
thing that does look like an ASCII diagram must be surrounded by lines
of dashes within the comment block, or pgindent will make hash of it.
There are a couple of other places where this is an issue as well,
like the comment for ListenToWorkers().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Faster compression, again

2012-03-14 Thread Daniel Farina
For 9.3 at a minimum.

The topic of LZO became mired in doubts about:

* Potential Patents
* The author's intention for the implementation to be GPL

Since then, Google released Snappy, also an LZ77-class
implementation, and it has been ported to C (recently, and with some
quirks, like no LICENSE file...yet, although it is linked from the
original Snappy project).  The original Snappy (C++) has a BSD license
and a patent grant (which shields you from Google, at least).  Do we
want to investigate a very-fast compression algorithm inclusion again
in the 9.3 cycle?

I've been using the similar implementation LZO for WAL archiving and
it is a significant savings (not as much as pg_lesslog, but also less
invasive).  It is also fast enough that, even if one were not to uproot
TOAST's compression, it would probably be very close to a complete win
for protocol traffic, whereas SSL's standardized zlib can
definitely be a drag in some cases.

This idea resurfaces often, but the reason why I wrote in about it is
that I have a table which I had categorized as "small" but was, in
fact, 1.5MB, which made transferring it somewhat slow over a remote
link.  zlib compression takes it down to about 550K, and lzo (similar,
but not identical) to 880K.  If we're curious how it affects replication
traffic, I could probably gather statistics on LZO-compressed WAL
traffic, of which we have a pretty huge amount captured.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread k...@rice.edu
On Wed, Mar 14, 2012 at 11:06:16AM -0700, Daniel Farina wrote:
 For 9.3 at a minimum.
 
 The topic of LZO became mired in doubts about:
 
 * Potential Patents
 * The author's intention for the implementation to be GPL
 
 Since then, Google released Snappy, also an LZ77-class
 implementation, and it has been ported to C (recently, and with some
 quirks, like no LICENSE file...yet, although it is linked from the
 original Snappy project).  The original Snappy (C++) has a BSD license
 and a patent grant (which shields you from Google, at least).  Do we
 want to investigate a very-fast compression algorithm inclusion again
 in the 9.3 cycle?
 

+1 for Snappy and a very fast compression algorithm.

Regards,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Too many IO?

2012-03-14 Thread Alvaro Herrera

Excerpts from Tatsuo Ishii's message of mar mar 13 23:29:44 -0300 2012:

 As you can see, this query generated 1255+1250 = 2505 block reads,
 either from the buffer or from disk. In my understanding the query
 accesses an index tuple, which will need access to the root page and
 several meta pages (I mean index pages that are neither root nor leaf
 pages)

To clarify terminology a bit -- we call the metapage of a btree index
page number zero, which among other things contains the page number of
the root page.  That is, since the root page can move around due to
splitting, its number can always be found by reading the metapage.

Pages that aren't root nor leaf pages are typically called internal
pages, though there are other names for them (non-leaf).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wal_buffers, redux

2012-03-14 Thread Jeff Janes
On Mon, Mar 12, 2012 at 7:16 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Mar 11, 2012 at 11:51 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sun, Mar 11, 2012 at 12:55 PM, Robert Haas robertmh...@gmail.com wrote:
 I've finally been able to run some more tests of the effect of
 adjusting wal_buffers to values higher than 16MB.  I ran the test on
 the 16 core (x 4 hw threads/core) IBM POWER7 machine, with my usual
 configuration settings:

 shared_buffers = 8GB
 maintenance_work_mem = 1GB
 synchronous_commit = off
 checkpoint_segments = 300
 checkpoint_timeout = 15min
 checkpoint_completion_target = 0.9
 wal_writer_delay = 20ms

 I ran three 30-minute tests at scale factor 300 with wal_buffers set
 at various values from 16MB up to 160MB, in multiples of 16MB, using
 pgbench with 32 clients and 32 threads in each case.  The short
 version is that 32MB seems to be significantly better than 16MB, by
 about 1000 tps, and after that it gets murky; full results are below.

 Currently the max of wal_buffers is 16MB (i.e., the size of one WAL file)
 when it's set to -1. Thanks to your result, we should increase the max to
 32MB?

 I think that might be a good idea, although I'm not entirely convinced
 that we understand why increasing wal_buffers is helping as much as it
 is.  I stuck an elog() into AdvanceXLInsertBuffer() to complain in the
 case that we were writing buffers while holding the insert lock.
 Then, I reran 30-minute tests with 32 clients, one with wal_buffers=16MB
 and the other wal_buffers=32MB.  On the 16MB test, the elog() fired 15
 times in a single second shortly after the start of the test, and then
 9 more times over the rest of the test.  On the 32MB test, the elog()
 fired a total 6 times over the course of the test.  The first test got
 14320 tps, while the second got 15026 tps.  I find that quite
 surprising, because although WAL buffer wraparound is certainly bad
 (on this test, it probably brings the system completely to a halt
 until fsync() finishes) it really shouldn't lock up the system for
 multiple seconds at a time.  And yet that's what it would need to be
 doing to account for the tps discrepancy on this test, considering how
 rarely it occurs.

Where did you put the elog?  If you put it near
TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_START(), it is probably too
late in the code.

If someone else (like the background wal writer) is blocked on an
fsync, then AdvanceXLInsertBuffer will block on getting the
WALWriteLock.  Once it gets the lock, it will likely find it no longer
needs to do the write (because it was done by the thing that just
blocked and then released it), but at that point the damage has
already been done.  The damage is not that it has to do a write, but
that it had to block (indirectly) on an fsync which it didn't really
care about.

One possibility is that the writer should update
xlogctl->LogwrtResult.Write between finishing the write and starting
the fsync.  That way an AdvanceXLInsertBuffer that arrives during the
fsync could see that the data it needs to overwrite has already been
written, even if not yet fsynced, and that would be enough to allow it
proceed.
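
Concretely, the proposed ordering would look something like this (a toy
model, not xlog.c code; WalShared and wal_write_and_flush are invented
stand-ins for XLogCtl and XLogWrite):

#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

typedef struct WalShared
{
	pthread_mutex_t lck;		/* stands in for xlogctl->info_lck */
	uint64_t	write_pos;		/* bytes known written (not yet fsynced) */
	uint64_t	flush_pos;		/* bytes known fsynced */
} WalShared;

static void
wal_write_and_flush(WalShared *shared, int fd, uint64_t upto)
{
	/* ... imagine the write() calls here, pushing WAL out to 'upto' ... */

	/* Publish the write position *before* the fsync, so a backend that
	 * only needs its buffers written (not flushed) can proceed instead
	 * of waiting out the fsync. */
	pthread_mutex_lock(&shared->lck);
	shared->write_pos = upto;
	pthread_mutex_unlock(&shared->lck);

	fsync(fd);

	pthread_mutex_lock(&shared->lck);
	shared->flush_pos = upto;
	pthread_mutex_unlock(&shared->lck);
}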

That would be ugly, because an AdvanceXLInsertBuffer that arrives during
the write itself would check xlogctl and then block until the fsync
finished, with no way to wake up as soon as the write-part finished.
So making it work cleanly would require a redesign of the whole
locking scheme.  Which the Scaling patch is already doing.




 Studying AdvanceXLInsertBuffer() a bit more, I'm wondering if the
 problem isn't so much - or isn't only - that it's expensive to write
 buffers while also holding WALInsertLock.  Maybe it's too expensive
 even to acquire WalWriteLock in the first place - that is, the real
 problem isn't so much the wraparound condition itself, but the expense
 of testing whether a possible wraparound has actually occurred.  A
 quick test suggests that we acquire WALWriteLock here much more often
 than we actually write anything while holding it, and that we get a
 big burst of WALWriteLock acquisitions here immediately after a
 checkpoint.  I don't have any proof that this is what's causing the
 tps drop with smaller wal_buffers, but I think there has to be
 something other than an actual wraparound condition causing problems
 here, because that just doesn't seem to happen frequently enough to be
 an issue.

I think my analysis is pretty much a re-wording of yours, but I'd
emphasize that getting the WALWriteLock is bad not just because they
fight over the lock, but because someone else (probably background wal
writer) is camping out on the lock while doing an fsync.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread Merlin Moncure
On Wed, Mar 14, 2012 at 1:06 PM, Daniel Farina dan...@heroku.com wrote:
 For 9.3 at a minimum.

 The topic of LZO became mired in doubts about:

 * Potential Patents
 * The author's intention for the implementation to be GPL

 Since then, Google released Snappy, also an LZ77-class
 implementation, and it has been ported to C (recently, and with some
 quirks, like no LICENSE file...yet, although it is linked from the
 original Snappy project).  The original Snappy (C++) has a BSD license
 and a patent grant (which shields you from Google, at least).  Do we
 want to investigate a very-fast compression algorithm inclusion again
 in the 9.3 cycle?

 I've been using the similar implementation LZO for WAL archiving and
 it is a significant savings (not as much as pg_lesslog, but also less
 invasive).  It is also fast enough that even if one were not to uproot
 TOAST's compression that it would probably be very close to a complete
 win for protocol traffic, whereas SSL's standardized zlib can
 definitely be a drag in some cases.

 This idea resurfaces often, but the reason why I wrote in about it is
 because I have a table which I categorized as "small" but was, in
 fact, 1.5MB, which made transferring it somewhat slow over a remote
 link.  zlib compression takes it down to about 550K and lzo (similar,
 but not identical) 880K.  If we're curious how it affects replication
 traffic, I could probably gather statistics on LZO-compressed WAL
 traffic, of which we have a pretty huge amount captured.

there are plenty of on gpl lz based libraries out there (for example:
http://www.fastlz.org/) and always have been.  they are all much
faster than zlib.  the main issue is patents...you have to be careful
even though all the lz77/78 patents seem to have expired or apply to
specifics not relevant to general use.

see here for the last round of talks on this:
http://archives.postgresql.org/pgsql-performance/2009-08/msg00052.php

lzo is nearing its 20th birthday, so even if you are paranoid about
patents (admittedly, there is good reason to be), the window is
closing fast to have patent issues that aren't A) expired or B) covered
by prior art on that or the various copycat implementations, at least
in the US.

snappy looks amazing.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Syntax error and reserved keywords

2012-03-14 Thread Peter Eisentraut
On ons, 2012-03-14 at 14:58 +0100, Dimitri Fontaine wrote:
 A colleague came to me to express his surprise about this quite simple
 use case:
 
   =# alter table toto add column user text;
   ERROR:  syntax error at or near "user"
   LINE 1: alter table toto add column user text;
 
 Is there a reason for us not to add a HINT: "user" is a reserved
 keyword or something like that, other than nobody having been interested
 in doing the work?

If that were easily possible, we could just recognize 'user' as an
identifier in this context and avoid the issue altogether.  But it's
not.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Peter Eisentraut
On ons, 2012-03-14 at 10:27 -0400, Tom Lane wrote:
 That opinion seems to me to connect to the recently-posted patch to
 make contrib/file_fdw enforce NOT NULL constraints.  Should we instead
 have the position that constraints declared for foreign tables are
 statements that we can take on faith, and it's the user's fault if
 they are wrong?

We should look into the NOT ENFORCED stuff for constraints in SQL:2011.
Then we can have both, and both for regular and foreign tables.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Peter Eisentraut
On tis, 2012-03-13 at 20:34 -0400, Bruce Momjian wrote:
 I frankly am worried that if we copy over statistics even in ASCII
 that don't match what the server expects, it might lead to a crash,
 which has me back to wanting to speed up vacuumdb.

Why can't we maintain a conversion routine for statistics from older
versions?  It's not like the statistics layout changes every week.
pg_dump could print out something like

SELECT pg_restore_statistics(catversion, tablename, ... some data ...);
...

and that function would have the knowledge to convert the data and
insert it back into pg_statistic and pg_class.

That can't be that hard considering all the other work we put into
backward compatibility and upgrade capability.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for parallel pg_dump

2012-03-14 Thread Andrew Dunstan

On 03/13/2012 02:10 PM, Andrew Dunstan wrote:



On 03/13/2012 01:53 PM, Robert Haas wrote:


I tried this actually (patch attached) but then I wanted to test it
and couldn't find anything that used pgpipe() on Windows.

pg_basebackup/pg_basebackup.c is using it but it's in an #ifndef WIN32
and the same is true for postmaster/syslogger.c. Am I missing
something or has this Windows implementation become stale by now? I'll
append the patch but haven't adapted the pg_dump patch yet to use it.
Should we still go forward the way you proposed?

Dunno.  Can we get an opinion on that from one of the Windows guys?
Andrew, Magnus?




I haven't had time to review this patch or even follow all the 
discussion as I was hoping. I'll try to review the whole thing shortly.



pgpipe used to be used in pgstat.c, but that's no longer true in any 
live branch, so it's probably long dead. I'd be inclined to rip it out 
if possible rather than expand its use.


I've just started looking at the patch, and I'm curious to know why it 
didn't follow the pattern of parallel pg_restore which created a new 
worker for each table rather than passing messages to looping worker 
threads as this appears to do. That might have avoided a lot of the need 
for this message passing infrastructure, if it could have been done. But 
maybe I just need to review the patch and the discussions some more.
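
For readers comparing the two designs, the difference is roughly the
following (a toy sketch, not code from either patch; worker_loop and the
message format are invented):

#include <unistd.h>

/* Toy illustration of a long-lived worker that loops on messages
 * arriving over a pipe from the master, as the parallel pg_dump patch
 * does.  The parallel pg_restore style would instead fork a fresh
 * worker per table and let it exit when that table is done.  (Real code
 * would also need proper message framing; a single read() is not
 * guaranteed to return exactly one message.) */
static void
worker_loop(int cmd_fd, int result_fd)
{
	char		msg[128];
	ssize_t		n;

	while ((n = read(cmd_fd, msg, sizeof(msg) - 1)) > 0)
	{
		msg[n] = '\0';
		/* ... dump the table named in 'msg' ... */
		write(result_fd, "OK", 2);
	}
}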


cheers

andrew






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread Andrew Dunstan



On 03/14/2012 04:10 PM, Merlin Moncure wrote:

there are plenty of on gpl lz based libraries out there (for example:
http://www.fastlz.org/) and always have been.  they are all much
faster than zlib.  the main issue is patents...you have to be careful
even though all the lz77/78 patents seem to have expired or apply to
specifics not relevant to general use.



We're not going to include GPL code in the backend. We have enough 
trouble with readline and that's only for psql. So the fact that there 
are GPL libraries can't help us, whether there are patent issues or not.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On ons, 2012-03-14 at 10:27 -0400, Tom Lane wrote:
 That opinion seems to me to connect to the recently-posted patch to
 make contrib/file_fdw enforce NOT NULL constraints.  Should we instead
 have the position that constraints declared for foreign tables are
 statements that we can take on faith, and it's the user's fault if
 they are wrong?

 We should look into the NOT ENFORCED stuff for constraints in SQL:2011.
 Then we can have both, and both for regular and foreign tables.

Have both what?  The key point here is that we *can't* enforce
constraints on foreign tables, at least not with anything like the
semantics SQL constraints normally have.  Ignoring that point leads
to nonsensical conclusions.

Declaring a foreign constraint as NOT ENFORCED might be a reasonable
thing to do, but it doesn't help us decide what to do when that clause
isn't attached.

On reflection I don't see anything much wrong with the "if you lied
about the constraint it's your fault that things broke" position.
It seems quite comparable to the fact that we take the user's assertions
on faith as to the number and data types of the columns in a foreign
table.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Andrew Dunstan



On 03/14/2012 04:44 PM, Tom Lane wrote:

Peter Eisentrautpete...@gmx.net  writes:

On ons, 2012-03-14 at 10:27 -0400, Tom Lane wrote:

That opinion seems to me to connect to the recently-posted patch to
make contrib/file_fdw enforce NOT NULL constraints.  Should we instead
have the position that constraints declared for foreign tables are
statements that we can take on faith, and it's the user's fault if
they are wrong?

We should look into the NOT ENFORCED stuff for constraints in SQL:2011.
Then we can have both, and both for regular and foreign tables.

Have both what?  The key point here is that we *can't* enforce
constraints on foreign tables, at least not with anything like the
semantics SQL constraints normally have.  Ignoring that point leads
to nonsensical conclusions.

Declaring a foreign constraint as NOT ENFORCED might be a reasonable
thing to do, but it doesn't help us decide what to do when that clause
isn't attached.

On reflection I don't see anything much wrong with the "if you lied
about the constraint it's your fault that things broke" position.
It seems quite comparable to the fact that we take the user's assertions
on faith as to the number and data types of the columns in a foreign
table.




Maybe we should say that for foreign tables NOT ENFORCED is implied. 
That seems to amount to much the same thing.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wal_buffers, redux

2012-03-14 Thread Robert Haas
On Tue, Mar 13, 2012 at 11:18 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Mar 13, 2012 at 6:44 PM, Josh Berkus j...@agliodbs.com wrote:
 That's a speedup of nearly a factor of two, so clearly fsync-related
 stalls are a big problem here, even with wal_buffers cranked up
 through the ceiling.

 Hmmm.  Do you have any ability to test on XFS?

 It seems I do.

 XFS, with fsync = on:
 tps = 14746.687499 (including connections establishing)
 XFS, with fsync = off:
 tps = 25121.876560 (including connections establishing)

 No real dramatic difference there, maybe a bit slower.

 On further thought, it may be that this is just a simple case of too
 many checkpoints.  With fsync=off, we don't have to actually write all
 that dirty data back to disk.  I'm going to try cranking up
 checkpoint_segments and see what happens.

OK, this is bizarre.  I wiped out my XFS filesystem and put ext4 back,
and look at this:

tps = 19105.740878 (including connections establishing)
tps = 19687.674409 (including connections establishing)

That's a jump of nearly a third from before.  I'm not sure what's
different.  Nothing, AFAIK.  I drop and recreate the database after
every test run, so I don't see why this should be so much better,
unless ext4 degrades over time (even though the FS is nearly empty,
and I'm dropping the whole database after each test run).

Then I tried it with checkpoint_segments=3000 rather than 300.

tps = 26750.190469 (including connections establishing)

Hmm, what happens with checkpoint_segments=3000 and fsync=off?

tps = 30395.583366 (including connections establishing)

Hmm, and what if I set checkpoint_segments=300 and fsync=off?

tps = 26029.160919 (including connections establishing)

Not sure what to make of all this, yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread k...@rice.edu
On Wed, Mar 14, 2012 at 04:43:55PM -0400, Andrew Dunstan wrote:
 
 
 On 03/14/2012 04:10 PM, Merlin Moncure wrote:
 there are plenty of on gpl lz based libraries out there (for example:
 http://www.fastlz.org/) and always have been.  they are all much
 faster than zlib.  the main issue is patents...you have to be careful
 even though all the lz77/78 patents seem to have expired or apply to
 specifics not relevant to general use.
 
 
 We're not going to include GPL code in the backend. We have enough
 trouble with readline and that's only for psql. SO the fact that
 there are GPL libraries can't help us, whether there are patent
 issues or not.
 
 cheers
 
 andrew
 

That is what makes Google's Snappy so appealing, a BSD license.
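
For reference, the C bindings that ship with Snappy (snappy-c.h) are
about as small an API surface as one could ask for; a minimal sketch,
assuming the library is installed:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <snappy-c.h>

int
main(void)
{
	const char *input = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";	/* compressible */
	size_t		input_len = strlen(input);
	/* Worst-case output size for this input length. */
	size_t		out_len = snappy_max_compressed_length(input_len);
	char	   *out = malloc(out_len);

	if (out == NULL)
		return 1;
	/* On success, out_len is updated to the actual compressed size. */
	if (snappy_compress(input, input_len, out, &out_len) != SNAPPY_OK)
		return 1;
	printf("%zu -> %zu bytes\n", input_len, out_len);
	free(out);
	return 0;
}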

Regards,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wal_buffers, redux

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 3:29 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 I think my analysis is pretty much a re-wording of yours, but I'd
 emphasize that getting the WALWriteLock is bad not just because they
 fight over the lock, but because someone else (probably background wal
 writer) is camping out on the lock while doing an fsync.

Yeah, good point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for parallel pg_dump

2012-03-14 Thread Alvaro Herrera

Excerpts from Andrew Dunstan's message of mié mar 14 17:39:59 -0300 2012:

 pgpipe used to be used in pgstat.c, but that's no longer true in any 
 live branch, so it's probably long dead. I'd be inclined to rip it out 
 if possible rather than expand its use.

our pgpipe() function is interesting -- all the callers that use it
first verify that they aren't WIN32.  If they are, they are using a
#define that makes it plain pipe().  And the function is only defined in
WIN32.  It seems a reasonable idea to kill both pgpipe() and piperead().

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread Merlin Moncure
On Wed, Mar 14, 2012 at 3:43 PM, Andrew Dunstan and...@dunslane.net wrote:


 On 03/14/2012 04:10 PM, Merlin Moncure wrote:

 there are plenty of on gpl lz based libraries out there (for example:
 http://www.fastlz.org/) and always have been.  they are all much
 faster than zlib.  the main issue is patents...you have to be careful
 even though all the lz77/78 patents seem to have expired or apply to
 specifics not relevant to general use.


 We're not going to include GPL code in the backend. We have enough trouble
 with readline and that's only for psql. SO the fact that there are GPL
 libraries can't help us, whether there are patent issues or not.

er, typo: I meant to say: *non-gpl* lz based...  :-).

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread Daniel Farina
On Wed, Mar 14, 2012 at 2:03 PM, Merlin Moncure mmonc...@gmail.com wrote:
 er, typo: I meant to say: *non-gpl* lz based...  :-).

Given that, few, I would say, have had the traction that LZO and Snappy
have had, even though in many respects they are interchangeable in the
general trade-off spectrum. The question is: what burden of proof is
required to convince the project that Snappy does not have exorbitant
patent issues, in proportion to the utility of having a compression
scheme of this type integrated?

One would think Google's lawyers did their homework to ensure they
would not be trolled for hideous sums of money by disclosing and
releasing the exact implementation of the compression used virtually
everywhere.  If anything, that may have been a more complicated issue
than writing and releasing yet-another-LZ77 implementation.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for parallel pg_dump

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 4:39 PM, Andrew Dunstan aduns...@postgresql.org wrote:
 I've just started looking at the patch, and I'm curious to know why it
 didn't follow the pattern of parallel pg_restore which created a new worker
 for each table rather than passing messages to looping worker threads as
 this appears to do. That might have avoided a lot of the need for this
 message passing infrastructure, if it could have been done. But maybe I just
 need to review the patch and the discussions some more.

Hmm, I hadn't actually considered that idea.  Not sure whether it's
better or worse than the current implementation...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Peter Eisentraut
On ons, 2012-03-14 at 16:44 -0400, Tom Lane wrote:
 On reflection I don't see anything much wrong with the "if you lied
 about the constraint it's your fault that things broke" position.
 It seems quite comparable to the fact that we take the user's
 assertions on faith as to the number and data types of the columns in
 a foreign table.

We do enforce the data types of a foreign table.  We can't ensure that
the data that a foreign table contains is valid at any moment, but
when we read the data and interact with it in SQL, we reject it if it's
not valid.  Similarly, one could conceivably apply not-null and check
constraints as the data is read, which is exactly what the other patch
you referred to proposes.  And I think we must do it that way, otherwise
check constraints on domains and check constraints on tables would
behave quite differently.

So if we want to have fake constraints on foreign tables, I think we
should require the NOT ENFORCED decoration or something similar, unless
the FDW signals that it can enforce the constraint.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 5:14 PM, Peter Eisentraut pete...@gmx.net wrote:
 On ons, 2012-03-14 at 16:44 -0400, Tom Lane wrote:
 On reflection I don't see anything much wrong with the "if you lied
 about the constraint it's your fault that things broke" position.
 It seems quite comparable to the fact that we take the user's
 assertions on faith as to the number and data types of the columns in
 a foreign table.

 We do enforce the data types of a foreign table.  We can't ensure that
 the data that a foreign table contains is valid at any moment, but
 when we read the data and interact with it in SQL, we reject it if it's
 not valid. Similarly, one could conceivably apply not-null and check
 constraints as the data is read, which is exactly what the other patch
 you referred to proposes.  And I think we must do it that way, otherwise
 check constraints on domains and check constraints on tables would
 behave quite differently.

 So if we want to have fake constraints on foreign tables, I think we
 should require the NOT ENFORCED decoration or something similar, unless
 the FDW signals that it can enforce the constraint.

I think that would be missing the point.  If a constraint is NOT
ENFORCED, then the query planner presumably won't rely on it for
planning purposes, but the whole point of having constraints on
foreign tables is that we want the query planner to do just that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Peter Eisentraut
On ons, 2012-03-14 at 17:16 -0400, Robert Haas wrote:
 If a constraint is NOT ENFORCED, then the query planner presumably
 won't rely on it for planning purposes 

Why do you presume that?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-14 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Also, when calling the user's procedure from the same place in case of an
 ANY command trigger or a specific one it's then possible to just hand
 them over the exact same set of info (object id, name, schema name).

 Yes, I think that's an essential property of the system, here.

Ok, I've implemented that. No patch attached because I need to merge
with master again and I'm out to sleep now, it sometimes ring when being
on-call…

Curious people might have a look at my github repository where the
command_triggers branch is updated:

  https://github.com/dimitri/postgres/compare/daf69e1e...e3714cb9e6

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Wed, Mar 14, 2012 at 10:40:41PM +0200, Peter Eisentraut wrote:
 On tis, 2012-03-13 at 20:34 -0400, Bruce Momjian wrote:
  I frankly am worried that if we copy over statistics even in ASCII
  that don't match what the server expects, it might lead to a crash,
  which has me back to wanting to speed up vacuumdb.
 
 Why can't we maintain a conversion routine for statistics from older
 versions?  It's not like the statistics layout changes every week.
 pg_dump could print out something like
 
 SELECT pg_restore_statistics(catversion, tablename, ... some data ...);
 ...
 
 and that function would have the knowledge to convert the data and
 insert it back into pg_statistic and pg_class.
 
 That can't be that hard considering all the other work we put into
 backward compatibility and upgrade capability.

Well, I have not had to make major adjustments to pg_upgrade since 9.0,
meaning the code is almost completely unchanged and does not require
additional testing for each major release.  If we go down the road of
dumping stats, we will need to adjust for stats changes and test this to
make sure we have made the proper adjustment for every major release.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On ons, 2012-03-14 at 17:16 -0400, Robert Haas wrote:
 If a constraint is NOT ENFORCED, then the query planner presumably
 won't rely on it for planning purposes 

 Why do you presume that?

What does SQL:2011 say exactly about the semantics of NOT ENFORCED?
Is an implementation allowed to fail in undefined ways if a constraint
is marked NOT ENFORCED and is not actually true?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 Given that, few I would say have had the traction that LZO and Snappy
 have had, even though in many respects they are interchangeable in the
 general trade-off spectrum. The question is: what burden of proof is
 required to convince the project that Snappy does not have exorbitant
 patent issues, in proportion to the utility of having a compression
 scheme of this type integrated?

Another not-exactly-trivial requirement is to figure out how to not
break on-disk compatibility when installing an alternative compression
scheme.  In hindsight it might've been a good idea if pglz_compress had
wasted a little bit of space on some sort of version identifier ...
but it didn't.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 Another not-exactly-trivial requirement is to figure out how to
 not break on-disk compatibility when installing an alternative
 compression scheme.  In hindsight it might've been a good idea if
 pglz_compress had wasted a little bit of space on some sort of
 version identifier ... but it didn't.
 
Doesn't it always start with a header of two int32 values where the
first must be smaller than the second?  That seems like enough to
get traction for an identifiably different header for an alternative
compression technique.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Noah Misch
On Wed, Mar 14, 2012 at 01:23:14PM -0400, Robert Haas wrote:
 On Tue, Mar 13, 2012 at 11:42 PM, Noah Misch n...@leadboat.com wrote:
  More often than that; each 2-member mxid takes 4 bytes in an offsets file and
  10 bytes in a members file.  So, more like one fsync per ~580 mxids.  Note
  that we already fsync the multixact SLRUs today, so any increase will arise
  from the widening of member entries from 4 bytes to 5.  The realism of this
  test is attractive.  Nearly-static parent tables are plenty common, and this
  test will illustrate the impact on those designs.
 
 Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU 
 today?

Good question.  So far, I can't think of a reason.  nextMulti is critical,
but we already fsync it with pg_control.  We could delete the other multixact
state data at every startup and set OldestVisibleMXactId accordingly.

  You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on mxid
  lookups, so I think something more sophisticated is needed to exercise that
  cost.  Not sure what.
 
 I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
 all-visible.  HEAP_XMAX_INVALID will obviously help, when it happens.

True.  The patch (see ResetMultiHintBit()) also replaces a multixact xmax with
the updater xid when all transactions of the multixact have ended.  You would
need a test workload with long-running multixacts that delay such replacement.
However, the workloads that come to mind are the very workloads for which this
patch eliminates lock waits; they wouldn't illustrate a worst-case.

  This isn't exactly a test case, but from Noah's previous comments I
  gather that there is a theoretical risk of mxid consumption running
  ahead of xid consumption.  We should try to think about whether there
  are any realistic workloads where that might actually happen.  I'm
  willing to believe that there aren't, but not just because somebody
  asserts it.  The reason I'm concerned about this is because, if it
  should happen, the result will be more frequent anti-wraparound
  vacuums on every table in the cluster.  Those are already quite
  painful for some users.
 
  Yes.  Pre-release, what can we really do here other than have more people
  thinking about ways it might happen in practice?  Post-release, we could
  suggest monitoring methods or perhaps have VACUUM emit a WARNING when a
  table is using more mxid space than xid space.
 
 Well, post-release, the cat is out of the bag: we'll be stuck with
 this whether the performance characteristics are acceptable or not.
 That's why we'd better be as sure as possible before committing to
 this implementation that there's nothing we can't live with.  It's not
 like there's any reasonable way to turn this off if you don't like it.

I disagree; we're only carving in stone the FOR KEY SHARE and FOR KEY UPDATE
syntax additions.  We could even avoid doing that by not documenting them.  A
later major release could implement them using a completely different
mechanism or even reduce them to aliases, KEY SHARE = SHARE and KEY UPDATE =
UPDATE.  To be sure, let's still do a good job the first time.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 5:21 PM, Peter Eisentraut pete...@gmx.net wrote:
 On ons, 2012-03-14 at 17:16 -0400, Robert Haas wrote:
 If a constraint is NOT ENFORCED, then the query planner presumably
 won't rely on it for planning purposes

 Why do you presume that?

Well, as Tom alludes to, I'm guessing that NOT ENFORCED is not a
license to deliver wrong answers.  But also as Tom says, what does the
spec say?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread Daniel Farina
On Wed, Mar 14, 2012 at 2:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 Given that, few I would say have had the traction that LZO and Snappy
 have had, even though in many respects they are interchangeable in the
 general trade-off spectrum. The question is: what burden of proof is
 required to convince the project that Snappy does not have exorbitant
 patent issues, in proportion to the utility of having a compression
 scheme of this type integrated?

 Another not-exactly-trivial requirement is to figure out how to not
 break on-disk compatibility when installing an alternative compression
 scheme.  In hindsight it might've been a good idea if pglz_compress had
 wasted a little bit of space on some sort of version identifier ...
 but it didn't.

I was more thinking that the latency and throughput in LZ77 schemes
may be best first applied to protocol compression.  The downside is
that requires more protocol wrangling, but at least terabytes of
on-disk format doesn't get in the picture, even though LZ77 on the
data itself may be attractive.

I'm interested in allowing layering of transports below FEBE (similar to
how SSL is layered below it, except without the complexity of being tied
into auth & auth) in a couple of respects, and this is one of them.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] VALID UNTIL

2012-03-14 Thread Fabrízio de Royes Mello
2012/3/14 David Fetter da...@fetter.org


 I don't know how frequently people use VALID UNTIL, but I'm guessing
 it's not terribly often because yours is the first comment about how
 it's not exposed, so I'd tend toward putting it in attributes rather
 than a separate column.


The attached patch puts VALID UNTIL into the attributes column in verbose
mode, like the example below.

bdteste=# \du
 List of roles
 Role name |   Attributes   | Member of
---++---
 bucardo   | Superuser, Create role, Create DB, Replication | {}
 postgres  | Superuser, Create role, Create DB, Replication | {}

bdteste=# \du+
                                                 List of roles
 Role name |                                   Attributes                                   | Member of | Description
-----------+--------------------------------------------------------------------------------+-----------+--------------
 bucardo   | Superuser, Create role, Create DB, Valid until 2012-12-31 23:59:59.99-02, Replication | {}        | bucardo role
 postgres  | Superuser, Create role, Create DB, Replication                                 | {}        |


Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
*** a/src/bin/psql/describe.c
--- b/src/bin/psql/describe.c
***
*** 2382,2388  describeRoles(const char *pattern, bool verbose)
  		ARRAY(SELECT b.rolname\n
  		  FROM pg_catalog.pg_auth_members m\n
   JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid)\n
! 		  WHERE m.member = r.oid) as memberof);
  
  		if (verbose  pset.sversion = 80200)
  		{
--- 2382,2389 
  		ARRAY(SELECT b.rolname\n
  		  FROM pg_catalog.pg_auth_members m\n
   JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid)\n
! 		  WHERE m.member = r.oid) as memberof,\n
! 		'Valid until '::text||r.rolvaliduntil::text as rolvaliduntil);
  
  		if (verbose  pset.sversion = 80200)
  		{
***
*** 2407,2413  describeRoles(const char *pattern, bool verbose)
  		true AS rolinherit, false AS rolcreaterole,\n
  	   u.usecreatedb AS rolcreatedb, true AS rolcanlogin,\n
  		-1 AS rolconnlimit,\n
! 		ARRAY(SELECT g.groname FROM pg_catalog.pg_group g WHERE u.usesysid = ANY(g.grolist)) as memberof
  		  \nFROM pg_catalog.pg_user u\n);
  
  		processSQLNamePattern(pset.db, buf, pattern, false, false,
--- 2408,2415 
  		true AS rolinherit, false AS rolcreaterole,\n
  	   u.usecreatedb AS rolcreatedb, true AS rolcanlogin,\n
  		-1 AS rolconnlimit,\n
! 		ARRAY(SELECT g.groname FROM pg_catalog.pg_group g WHERE u.usesysid = ANY(g.grolist)) as memberof,\n
! 		'Valid until '::text||u.valuntil::text as rolvaliduntil
  		  \nFROM pg_catalog.pg_user u\n);
  
  		processSQLNamePattern(pset.db, buf, pattern, false, false,
***
*** 2452,2459  describeRoles(const char *pattern, bool verbose)
  		if (strcmp(PQgetvalue(res, i, 5), t) != 0)
  			add_role_attribute(buf, _(Cannot login));
  
  		if (pset.sversion = 90100)
! 			if (strcmp(PQgetvalue(res, i, (verbose ? 9 : 8)), t) == 0)
  add_role_attribute(buf, _(Replication));
  
  		conns = atoi(PQgetvalue(res, i, 6));
--- 2454,2464 
  		if (strcmp(PQgetvalue(res, i, 5), t) != 0)
  			add_role_attribute(buf, _(Cannot login));
  
+ 		if (strcmp(PQgetvalue(res, i, 8), ) != 0  verbose)
+   add_role_attribute(buf, PQgetvalue(res, i, 8));
+ 
  		if (pset.sversion = 90100)
! 			if (strcmp(PQgetvalue(res, i, (verbose ? 10 : 9)), t) == 0)
  add_role_attribute(buf, _(Replication));
  
  		conns = atoi(PQgetvalue(res, i, 6));
***
*** 2478,2485  describeRoles(const char *pattern, bool verbose)
  		printTableAddCell(cont, PQgetvalue(res, i, 7), false, false);
  
  		if (verbose  pset.sversion = 80200)
! 			printTableAddCell(cont, PQgetvalue(res, i, 8), false, false);
! 	}
  	termPQExpBuffer(buf);
  
  	printTable(cont, pset.queryFout, pset.logfile);
--- 2483,2491 
  		printTableAddCell(cont, PQgetvalue(res, i, 7), false, false);
  
  		if (verbose  pset.sversion = 80200)
! 			printTableAddCell(cont, PQgetvalue(res, i, 9), false, false);
! 	
!   }
  	termPQExpBuffer(buf);
  
  	printTable(cont, pset.queryFout, pset.logfile);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Too many IO?

2012-03-14 Thread Tatsuo Ishii
 As you can see, this query generated 1255+1250 = 2505 block reads,
 either from the buffer or the disk. In my understanding the query
 accesses an index tuple, which will need access to the root page and
 several meta pages (I mean index pages that are neither root nor leaf
 pages) and 1 leaf page, then access 1 heap block. So I expected the
 total number of IOs would be somewhat like:
 
 500 index leaf pages + 500 heap blocks = 1000
 
 The way I count it, each probe will touch the root page, a level-2 inner
 page, a level-1 inner page, a leaf page, and a heap page, so five buffer
 touches per cycle, which is almost exactly what you've got.  Only the
 first two of those are very likely to benefit from caching from previous
 searches, so the fact that you got 1255 hits and not only 1000 is
 actually a bit better than expected.  Probably this query was not done
 from a standing start, and so some of the level-1 pages were already in
 buffer cache.

Thanks for the explanation. Now I understand that PostgreSQL works as
expected.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 09:15:52PM -0400, Bruce Momjian wrote:
 On Tue, Mar 13, 2012 at 08:22:51PM -0400, Bruce Momjian wrote:
  On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote:
   Bruce Momjian br...@momjian.us wrote:

What is the target=10 duration?  I think 10 is as low as we can
acceptably recommend.  Should we recommend they run vacuumdb
twice, once with default_statistics_target = 4, and another with
the default?

   Here are the results at various settings.

   1   :  172198.892 ms
   2   :  295536.814 ms
   4   :  474319.826 ms
   10  :  750458.312 ms
   100 :  3433794.609 ms
  
  Thanks, good numbers to know.
 
 OK, new crazy idea.  Kevin has shown that his database can get a single
 bucket in 2.8 minutes.  What if we have pg_upgrade create a vacuumdb
 script that generates increasingly accurate statistics, e.g. it runs for
 default_statistics_target values of 1, 10, and default (100).  That
 would give basic statistics quickly (2.8 minutes), and full statistics
 in an hour, for Kevin's database.
 
   PGOPTIONS='-c default_statistics_target=1' vacuumdb --all --analyze-only
   PGOPTIONS='-c default_statistics_target=10' vacuumdb --all --analyze-only
   vacuumdb --all --analyze-only
 
 The only problem I see is that users who use non-default statistics per
 table would not be affected by the increasing default_statistics_target
 values.
 
 The upside is this would work for all releases of Postgres.

OK, I have modified pg_upgrade with the attached patch to do exactly
this.  I have also attached the script pg_upgrade creates that should be
run instead of vacuumdb.

Based on Kevin's numbers above, the first vacuumdb will be done in 2.8
minutes (1 target), the 10 target vacuumdb done after 15 minutes, and
the 100 target vacuumdb done after 72 minutes (times accumulate).  Here
is what the output looks like:

Generating minimal optimizer statistics (1 target)
--
vacuumdb: vacuuming database postgres
vacuumdb: vacuuming database template1
vacuumdb: vacuuming database test

The server is now available with minimal optimizer statistics.
Query performance will be optimal once this script completes.

Generating medium optimizer statistics (10 targets)
---
vacuumdb: vacuuming database postgres
vacuumdb: vacuuming database template1
vacuumdb: vacuuming database test

Generating default (full) optimizer statistics (100 targets?)
-
vacuumdb: vacuuming database postgres
vacuumdb: vacuuming database template1
vacuumdb: vacuuming database test

Done

If we don't want to try migrating the statistics from the old system,
this seems like the best approach.

Does anyone know how bad the queries will be with only one target?

I did see if vacuumdb --analyze-only was somehow being throttled by the
vacuum settings, but saw the drive at 100% utilization analyzing a 36GB
table on a 24GB RAM server, so it seems I/O bound.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index cf43384..4481de0
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** static void check_for_reg_data_type_usag
*** 23,28 
--- 23,33 
  static void check_for_support_lib(ClusterInfo *cluster);
  static void get_bin_version(ClusterInfo *cluster);
  
+ #ifndef WIN32
+ #define	ECHO_QUOTE	'
+ #else
+ #define	ECHO_QUOTE	
+ #endif
  
  void
  output_check_banner(bool *live_check)
*** issue_warnings(char *sequence_script_fil
*** 193,213 
  
  
  void
! output_completion_banner(char *deletion_script_file_name)
  {
  	/* Did we copy the free space files? */
  	if (GET_MAJOR_VERSION(old_cluster.major_version) = 804)
  		pg_log(PG_REPORT,
! 			   Optimizer statistics are not transferred by pg_upgrade so\n
! 			   consider running:\n
! 			   vacuumdb --all --analyze-only\n
! 			   on the newly-upgraded cluster.\n\n);
  	else
  		pg_log(PG_REPORT,
  			   Optimizer statistics and free space information are not transferred\n
! 			   by pg_upgrade so consider running:\n
! 			   vacuumdb --all --analyze\n
! 			   on the newly-upgraded cluster.\n\n);
  
  	pg_log(PG_REPORT,
  		   Running this script will delete the old cluster's data files:\n
--- 198,217 
  
  
  void
! output_completion_banner(char *analyze_script_file_name,
! 		 char *deletion_script_file_name)
  {
  	/* Did we copy the free space files? */
  	if (GET_MAJOR_VERSION(old_cluster.major_version) = 804)
  		pg_log(PG_REPORT,
! 			   Optimizer 

Re: [HACKERS] VALID UNTIL

2012-03-14 Thread Tom Lane
Fabrízio de Royes Mello fabriziome...@gmail.com writes:
 The attached patch puts VALID UNTIL into the attributes column in verbose
 mode, like the example below.

Why would you confine it to verbose mode?  For most people it won't
matter, but for people who are using the feature, it seems like
important information.  Per the OP's complaint, it's particularly
important for those who have forgotten they're using the feature
(and hence would not think to specify + ...)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Does anyone know how bad the queries will be with only one target?

Bad.  That cycle seems like largely a waste of time.  About the only
thing it would do for you is ensure that relpages/reltuples are up to
date, which seems like something we could possibly arrange for during
the data import.

 I did see if vacuumdb --analyze-only was somehow being throttled by the
 vacuum settings, but saw the drive at 100% utilization analyzing a 36GB
 table on a 24GB RAM server, so it seems I/O bound.

I think it'd be good to explicitly set vacuum_cost_delay to 0 in the
first pass, in the same way as you are forcing
default_statistics_target, just in case somebody has a nondefault
setting for that.  The second pass could probably be allowed to use some
higher delay setting.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] VALID UNTIL

2012-03-14 Thread Fabrízio de Royes Mello
2012/3/14 Tom Lane t...@sss.pgh.pa.us


 Why would you confine it to verbose mode?


Because I did not want to change the current behavior of this psql
command... but...


For most people it won't
 matter, but for people who are using the feature, it seems like
 important information.  Per the OP's complaint, it's particularly
 important for those who have forgotten they're using the feature
 (and hence would not think to specify + ...)


You're right; I've attached a new patch with your suggestion.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
*** a/src/bin/psql/describe.c
--- b/src/bin/psql/describe.c
***
*** 2382,2388  describeRoles(const char *pattern, bool verbose)
  		ARRAY(SELECT b.rolname\n
  		  FROM pg_catalog.pg_auth_members m\n
   JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid)\n
! 		  WHERE m.member = r.oid) as memberof);
  
  		if (verbose  pset.sversion = 80200)
  		{
--- 2382,2389 
  		ARRAY(SELECT b.rolname\n
  		  FROM pg_catalog.pg_auth_members m\n
   JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid)\n
! 		  WHERE m.member = r.oid) as memberof,\n
! 		'Valid until '::text||r.rolvaliduntil::text as rolvaliduntil);
  
  		if (verbose  pset.sversion = 80200)
  		{
***
*** 2407,2413  describeRoles(const char *pattern, bool verbose)
  		true AS rolinherit, false AS rolcreaterole,\n
  	   u.usecreatedb AS rolcreatedb, true AS rolcanlogin,\n
  		-1 AS rolconnlimit,\n
! 		ARRAY(SELECT g.groname FROM pg_catalog.pg_group g WHERE u.usesysid = ANY(g.grolist)) as memberof
  		  \nFROM pg_catalog.pg_user u\n);
  
  		processSQLNamePattern(pset.db, buf, pattern, false, false,
--- 2408,2415 
  		true AS rolinherit, false AS rolcreaterole,\n
  	   u.usecreatedb AS rolcreatedb, true AS rolcanlogin,\n
  		-1 AS rolconnlimit,\n
! 		ARRAY(SELECT g.groname FROM pg_catalog.pg_group g WHERE u.usesysid = ANY(g.grolist)) as memberof,\n
! 		'Valid until '::text||u.valuntil::text as rolvaliduntil
  		  \nFROM pg_catalog.pg_user u\n);
  
  		processSQLNamePattern(pset.db, buf, pattern, false, false,
***
*** 2452,2459  describeRoles(const char *pattern, bool verbose)
  		if (strcmp(PQgetvalue(res, i, 5), t) != 0)
  			add_role_attribute(buf, _(Cannot login));
  
  		if (pset.sversion = 90100)
! 			if (strcmp(PQgetvalue(res, i, (verbose ? 9 : 8)), t) == 0)
  add_role_attribute(buf, _(Replication));
  
  		conns = atoi(PQgetvalue(res, i, 6));
--- 2454,2464 
  		if (strcmp(PQgetvalue(res, i, 5), t) != 0)
  			add_role_attribute(buf, _(Cannot login));
  
+ 		if (strcmp(PQgetvalue(res, i, 8), ) != 0  verbose)
+   add_role_attribute(buf, PQgetvalue(res, i, 8));
+ 
  		if (pset.sversion = 90100)
! 			if (strcmp(PQgetvalue(res, i, (verbose ? 10 : 9)), t) == 0)
  add_role_attribute(buf, _(Replication));
  
  		conns = atoi(PQgetvalue(res, i, 6));
***
*** 2478,2485  describeRoles(const char *pattern, bool verbose)
  		printTableAddCell(cont, PQgetvalue(res, i, 7), false, false);
  
  		if (verbose  pset.sversion = 80200)
! 			printTableAddCell(cont, PQgetvalue(res, i, 8), false, false);
! 	}
  	termPQExpBuffer(buf);
  
  	printTable(cont, pset.queryFout, pset.logfile);
--- 2483,2491 
  		printTableAddCell(cont, PQgetvalue(res, i, 7), false, false);
  
  		if (verbose  pset.sversion = 80200)
! 			printTableAddCell(cont, PQgetvalue(res, i, 9), false, false);
! 	
!   }
  	termPQExpBuffer(buf);
  
  	printTable(cont, pset.queryFout, pset.logfile);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Alvaro Herrera

Excerpts from Noah Misch's message of mié mar 14 19:10:00 -0300 2012:
 
 On Wed, Mar 14, 2012 at 01:23:14PM -0400, Robert Haas wrote:
  On Tue, Mar 13, 2012 at 11:42 PM, Noah Misch n...@leadboat.com wrote:
   More often than that; each 2-member mxid takes 4 bytes in an offsets file and
   10 bytes in a members file.  So, more like one fsync per ~580 mxids.  Note
   that we already fsync the multixact SLRUs today, so any increase will arise
   from the widening of member entries from 4 bytes to 5.  The realism of this
   test is attractive.  Nearly-static parent tables are plenty common, and this
   test will illustrate the impact on those designs.
  
  Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU 
  today?
 
 Good question.  So far, I can't think of a reason.  nextMulti is critical,
 but we already fsync it with pg_control.  We could delete the other multixact
 state data at every startup and set OldestVisibleMXactId accordingly.

Hmm, yeah.

    You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on mxid
    lookups, so I think something more sophisticated is needed to exercise that
    cost.  Not sure what.
  
  I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
  all-visible.  HEAP_XMAX_INVALID will obviously help, when it happens.
 
 True.  The patch (see ResetMultiHintBit()) also replaces a multixact xmax with
 the updater xid when all transactions of the multixact have ended.

I have noticed that this code is not correct, because we don't know that
we're holding an appropriate lock on the page, so we can't simply change
the Xmax and reset those hint bits.  As things stand today, mxids
persist longer.  (We could do some cleanup at HOT-style page prune, for
example, though the lock we need is even weaker than that.)  Overall
this means that coming up with a test case demonstrating this pressure
probably isn't that hard.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Wed, Mar 14, 2012 at 08:26:06PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Does anyone know how bad the queries will be with only one target?
 
 Bad.  That cycle seems like largely a waste of time.  About the only
 thing it would do for you is ensure that relpages/reltuples are up to
 date, which seems like something we could possibly arrange for during
 the data import.

Well, it is also getting us the most common value, which seems useful.

  I did see if vacuumdb --analyze-only was somehow being throttled by the
  vacuum settings, but saw the drive at 100% utilization analyzing a 36GB
  table on a 24GB RAM server, so it seems I/O bound.
 
 I think it'd be good to explicitly set vacuum_cost_delay to 0 in the
 first pass, in the same way as you are forcing
 default_statistics_target, just in case somebody has a nondefault
 setting for that.  The second pass could probably be allowed to use some
 higher delay setting.

OK, I have now set vacuum_cost_delay=0 for the first vacuumdb
(target=1).

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster compression, again

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 6:08 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Another not-exactly-trivial requirement is to figure out how to
 not break on-disk compatibility when installing an alternative
 compression scheme.  In hindsight it might've been a good idea if
 pglz_compress had wasted a little bit of space on some sort of
 version identifier ... but it didn't.

 Doesn't it always start with a header of two int32 values where the
 first must be smaller than the second?  That seems like enough to
 get traction for an identifiably different header for an alternative
 compression technique.

The first of those words is vl_len_, which we can't fiddle with too
much, but the second is rawsize, which we can definitely fiddle with.
Right now, rawsize < vl_len_ means it's compressed; and rawsize ==
vl_len_ means it's uncompressed.  As you point out, rawsize > vl_len_
is undefined; also, and maybe simpler, rawsize < 0 is undefined.
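
For concreteness, the two words in question look like this (layout as in
src/include/utils/pg_lzcompress.h; the macro below is invented here purely
to restate that rule, it is not a real PostgreSQL macro):

typedef struct PGLZ_Header
{
    int32       vl_len_;    /* varlena header: total stored size */
    int32       rawsize;    /* original, uncompressed data size */
} PGLZ_Header;

/* Hypothetical --- just the convention described above, spelled out. */
#define DATUM_IS_PGLZ_COMPRESSED(hdr)  ((hdr)->rawsize < (hdr)->vl_len_)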

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] EquivalenceClasses and subqueries and PlaceHolderVars, oh my

2012-03-14 Thread Tom Lane
I looked into the problem complained of here:
http://archives.postgresql.org/pgsql-bugs/2012-03/msg00016.php
It's not at all specific to custom types; you can exhibit it with
this query in the regression database:

explain select * from
 (select 1 as t, unique1 from tenk1 a
  union all
  select 2 as t, unique1 from tenk1 b) c
where t = 2;

9.0 successfully optimizes away the excluded subquery:

                            QUERY PLAN
------------------------------------------------------------------
 Result  (cost=0.00..458.00 rows=10000 width=8)
   ->  Append  (cost=0.00..458.00 rows=10000 width=8)
         ->  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=10000 width=8)
(3 rows)

but 9.1 and HEAD not so much:

                          QUERY PLAN
--------------------------------------------------------------
 Result  (cost=0.00..966.00 rows=100 width=8)
   ->  Append  (cost=0.00..966.00 rows=100 width=8)
         ->  Seq Scan on tenk1 a  (cost=0.00..483.00 rows=50 width=8)
               Filter: (1 = 2)
         ->  Seq Scan on tenk1 b  (cost=0.00..483.00 rows=50 width=8)
               Filter: (2 = 2)
(6 rows)

This is a consequence of commit 57664ed25e5dea117158a2e663c29e60b3546e1c,
which was already known to cause some issues as per commit
b28ffd0fcc583c1811e5295279e7d4366c3cae6c.  This gripe is basically the
same problem: when we push the t = 2 condition down into the subqueries,
t is no longer replaced with just constant 1 or constant 2, but with
those constants wrapped in PlaceHolderVar nodes.  In this case that
prevents constant-folding from realizing it can simplify 1 = 2 or
2 = 2 to constant false or true, whereas in the previous complaint
the PHV wrappers were interfering with matching expressions to indexes.

I spent a fair amount of time thinking about whether we could revert
that patch and solve the original problem some other way, but with no
real success.  The original problem was reported here:
http://archives.postgresql.org/pgsql-hackers/2011-11/msg00419.php
with an example equivalent to this variant of the previous query:

explain select * from
 (select thousand as t1, tenthous as t2 from tenk1 a
  union all
  select 42 as t1, 42 as t2 from tenk1 b) c
order by t1, t2;

There is an EquivalenceClass for each of t1 and t2, and if we don't
do something like wrapping the constants with distinct PHVs, then
add_child_rel_equivalences will end up pushing identical constants into
both ECs, thus totally bollixing the fundamental rule that any expression
should match at most one EC.
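
For reference, the EquivalenceClass node itself looks roughly like this
(abridged from src/include/nodes/relation.h; a few fields are omitted).
Note that nothing in it scopes an EC to only part of the query, which is
the crux of the trouble here:

typedef struct EquivalenceClass
{
    NodeTag     type;
    List       *ec_opfamilies;  /* btree operator families the EC works with */
    List       *ec_members;     /* EquivalenceMembers: the equated expressions */
    List       *ec_sources;     /* clauses this EC was built from */
    List       *ec_derives;     /* clauses deduced from this EC */
    Relids      ec_relids;      /* all relids mentioned by any member */
    bool        ec_has_const;   /* any pseudoconstant member? */
    bool        ec_broken;      /* failed to generate needed clauses? */
    Index       ec_sortref;     /* originating sort-clause label, or 0 */
    struct EquivalenceClass *ec_merged; /* set if merged into another EC */
} EquivalenceClass;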

Another variant of this is where there are identical Vars rather than
constants in one of the subqueries:

explain select * from
 (select thousand as t1, tenthous as t2 from tenk1 a
  union all
  select unique2 as t1, unique2 as t2 from tenk1 b) c
order by t1, t2;

I chose this example to match existing indexes in the regression database:
the ideal plan would do an indexscan on the (thousand, tenthous) index
for the first arm, and an indexscan on the (unique2) index for the second
arm, and MergeAppend them together.  In general the planner is aware that
"ORDER BY x, x" is the same as "ORDER BY x", so you'd think it could apply
that principle to the second arm of this union ... but it can't.  To do
that, it would have to realize that the unique2 index matches both of the
EquivalenceClasses in this query, and that's totally outside its model of
reality.

It seems to me that to do a really nice job with this sort of situation,
we would need some more general concept than EquivalenceClasses.  I'm
not sure what, though I have a vague feeling that it might look like
EquivalenceClasses that are only valid within some sub-area of a query.

Now, this is a sufficiently weird corner case that I'm not desirous of
making major planner design changes just to improve this particular
outcome (and in any case that doesn't sound like a backpatchable bug
fix).  But down the road we may think of more reasons why we need a
better idea than EquivalenceClasses.

In the meantime, the best solution I've been able to think of goes like
this: continue to add PHVs on to duplicated or non-Var subquery outputs
when propagating those outputs into the outer query, but then strip them
off again when propagating transformed outer expressions down into the
sub-query.  There are basically only two places where we do the latter ---
set_append_rel_pathlist in allpaths.c propagates the inheritance parent's
baserestrictlist and other attachments to child rels, and
match_eclass_clauses_to_index extracts modified join clauses from
EquivalenceClasses.  So it's a bit ugly but should be a localized fix,
and it would allow us to revert b28ffd0fcc583c1811e5295279e7d4366c3cae6c
because the problem would be taken care of at a higher level.  This
would not fix the problem shown in the last example, that ideally we
should be able to match an index to more than one EquivalenceClass;
the planner 

Re: [HACKERS] Client Messages

2012-03-14 Thread Robert Haas
On Thu, Mar 8, 2012 at 8:40 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Feb 29, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
  Do we have an updated patch?  Fujii?

 No. I believe that the author Jim will submit the updated version.

 Jim, are you going to submit an updated version?

Hearing no response, I'm marking this Returned with Feedback.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Faster compression, again

2012-03-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Mar 14, 2012 at 6:08 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Doesn't it always start with a header of two int32 values where the
 first must be smaller than the second?  That seems like enough to
 get traction for an identifiably different header for an alternative
 compression technique.

 The first of those words is vl_len_, which we can't fiddle with too
 much, but the second is rawsize, which we can definitely fiddle with.
 Right now, rawsize < vl_len_ means it's compressed; and rawsize ==
 vl_len_ means it's uncompressed.  As you point out, rawsize > vl_len_
 is undefined; also, and maybe simpler, rawsize < 0 is undefined.

Well, let's please not make the same mistake again of assuming that
there will never again be any other ideas in this space.  IOW, let's
find a way to shoehorn in an actual compression-method ID value of some
sort.  I don't particularly care for trying to push that into rawsize,
because you don't really have more than about one bit to work with
there, unless you eat the entire word for ID purposes which seems
excessive.

After looking at pg_lzcompress.c for a bit, it appears to me that the
LSB of the first byte of compressed data must always be zero, because
the very first control bit has to say "copy a literal byte"; you can't
have a back-reference until there's some data in the output buffer.
So what I suggest is that we keep rawsize the same as it is, but peek at
the first byte after that to decide what we have: even means existing
compression method, an odd value is an ID byte selecting some new
method.  This gives us room for 128 new methods before we have trouble
again, while consuming only one byte which seems like acceptable
overhead for the purpose.
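
To make that concrete, the decompression side could dispatch on that byte
along these lines (a sketch only; both decompressor names below are
invented stand-ins, not real functions):

#include <stdlib.h>

/* Stand-ins for the existing pglz decompressor and some future method's
 * decompressor; neither prototype matches a real PostgreSQL function. */
extern void pglz_style_decompress(const char *src, int rawsize, char *dst);
extern void new_method_decompress(const char *src, int rawsize, char *dst);

static void
decompress_datum(const char *data, int rawsize, char *dst)
{
    unsigned char first = (unsigned char) data[0];

    if ((first & 1) == 0)
    {
        /* LSB clear: the existing pglz format, unchanged on disk */
        pglz_style_decompress(data, rawsize, dst);
    }
    else
    {
        /* LSB set: an ID byte selecting one of up to 128 new methods */
        switch (first)
        {
            case 0x01:
                new_method_decompress(data + 1, rawsize, dst);
                break;
            default:
                abort();    /* real code would elog(ERROR) instead */
        }
    }
}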

regards, tom lane



Re: [HACKERS] Faster compression, again

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 9:44 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, let's please not make the same mistake again of assuming that
 there will never again be any other ideas in this space.  IOW, let's
 find a way to shoehorn in an actual compression-method ID value of some
 sort.  I don't particularly care for trying to push that into rawsize,
 because you don't really have more than about one bit to work with
 there, unless you eat the entire word for ID purposes which seems
 excessive.

Well, you don't have to go that far.  For example, you could dictate
that, when the value is negative, the most significant byte indicates
the compression algorithm in use (128 possible compression
algorithms).  The remaining 3 bytes indicate the compressed length;
but when they're all zero, the compressed length is instead stored in
the following 4-byte word.  This consumes one additional 4-byte word
for values that take >= 16MB compressed, but that's presumably a
non-problem.
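
Sketched out, that encoding might look as below; every name is invented,
and the spill word for oversized lengths is omitted:

#include <stdint.h>

#define EXT_FORMAT_BIT    UINT32_C(0x80000000)  /* sign bit set => new format */
#define EXT_METHOD_SHIFT  24                    /* top byte carries the method ID */
#define EXT_LENGTH_MASK   UINT32_C(0x00FFFFFF)  /* low 24 bits: compressed length */

/*
 * Build the second header word for one of 128 methods (0..127).  A
 * compressed length of 16MB or more doesn't fit in 24 bits, so store
 * zero there and put the real length in the following 4-byte word
 * (not shown here).
 */
static uint32_t
make_second_word(unsigned method, uint32_t compressed_len)
{
    uint32_t len_field =
        (compressed_len > 0 && compressed_len <= EXT_LENGTH_MASK)
        ? compressed_len
        : 0;

    return EXT_FORMAT_BIT
        | ((uint32_t) (method & 0x7F) << EXT_METHOD_SHIFT)
        | len_field;
}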

 After looking at pg_lzcompress.c for a bit, it appears to me that the
 LSB of the first byte of compressed data must always be zero, because
 the very first control bit has to say "copy a literal byte"; you can't
 have a back-reference until there's some data in the output buffer.
 So what I suggest is that we keep rawsize the same as it is, but peek at
 the first byte after that to decide what we have: even means existing
 compression method, an odd value is an ID byte selecting some new
 method.  This gives us room for 128 new methods before we have trouble
 again, while consuming only one byte which seems like acceptable
 overhead for the purpose.

That would work, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 6:10 PM, Noah Misch n...@leadboat.com wrote:
 Well, post-release, the cat is out of the bag: we'll be stuck with
 this whether the performance characteristics are acceptable or not.
 That's why we'd better be as sure as possible before committing to
 this implementation that there's nothing we can't live with.  It's not
 like there's any reasonable way to turn this off if you don't like it.

 I disagree; we're only carving in stone the FOR KEY SHARE and FOR KEY UPDATE
 syntax additions.  We could even avoid doing that by not documenting them.  A
 later major release could implement them using a completely different
 mechanism or even reduce them to aliases, KEY SHARE = SHARE and KEY UPDATE =
 UPDATE.  To be sure, let's still do a good job the first time.

What I mean is really that, once the release is out, we don't get to
take it back.  Sure, the next release can fix things, but any
regressions will become obstacles to upgrading and pain points for new
users.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 9:17 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
  Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU 
  today?

 Good question.  So far, I can't think of a reason.  nextMulti is critical,
 but we already fsync it with pg_control.  We could delete the other multixact
 state data at every startup and set OldestVisibleMXactId accordingly.

 Hmm, yeah.

In a way, the fact that we don't do that is kind of fortuitous in this
situation.  I had just assumed that we were not fsyncing it because
there seems to be no reason to do so.  But since we are, we already
know that the fsyncs resulting from frequent mxid allocation aren't a
huge pain point.  If they were, somebody would have presumably
complained about it and fixed it before now.  So that means that what
we're really worrying about here is the overhead of fsyncing a little
more often, which is a lot less scary than starting to do it when we
weren't previously.

Now, we could look at this as an opportunity to optimize the existing
implementation by removing the fsyncs, rather than adding the new
infrastructure Alvaro is proposing.  But that would only make sense if
we thought that getting rid of the fsyncs would be more valuable than
avoiding the blocking here, and I don't.

I still think that someone needs to do some benchmarking here, because
this is a big complicated performance patch, and we can't predict the
impact of it on real-world scenarios without testing.  There is
clearly some additional overhead, and it makes sense to measure it and
hopefully discover that it isn't excessive.  Still, I'm a bit
relieved.

 I have noticed that this code is not correct, because we don't know that
 we're holding an appropriate lock on the page, so we can't simply change
 the Xmax and reset those hint bits.  As things stand today, mxids
 persist longer.  (We could do some cleanup at HOT-style page prune, for
 example, though the lock we need is even weaker than that.)  Overall
 this means that coming up with a test case demonstrating this pressure
 probably isn't that hard.

What would such a test case look like?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] libpq should have functions for escaping data for use in COPY FROM

2012-03-14 Thread Robert Haas
On Fri, Mar 9, 2012 at 9:16 PM, Joey Adams joeyadams3.14...@gmail.com wrote:
 libpq has functions for escaping values in SQL commands
 (PQescapeStringConn, PQescapeByteaConn, and the new PQescapeLiteral),
 and it supports parameterizing queries with PQexecParams.  But it does
 not (to my knowledge) have functions for escaping values for COPY
 FROM.

 COPY FROM is useful for inserting rows in bulk (though I wonder if
 constructing massive INSERT statements and using PQexecParams is just
 as efficient).  It is also useful for generating .sql files which can
 be run on a database elsewhere.

 I think libpq should include functions for escaping with COPY FROM.

I'm a little bit confused about what you're getting at here, because
COPY has a huge pile of options - not just CSV or text, but also
things like QUOTE and DELIMITER.  It's not like there is ONE way to
escape things for COPY.  I guess we could include code that escapes
things in the manner that an optionless COPY expects, or we could
include in the API all the same options that COPY supports, but the
former sounds narrow and the latter complex.

 Before spending a bunch of time on this, I'd like some input.  A few 
 questions:

  * Should we have corresponding functions for parsing COPY TO data, or
 is PQexecParams sufficient?

  * Should we support CSV escaping?  Can the CSV format safely encode
 all characters (in particular, newlines)?

The fine manual page for COPY discusses how to encode CSV data in
considerable detail.

  * Should we deal with encodings here, or just escape everything that
 isn't printable ASCII like the code I wrote does?

I think your code will fall over badly if fed, say, UTF-8 characters
with code points greater than 0x7F.

I doubt very much that we would accept anything into libpq that
doesn't handle all the encodings we support, and that covers a lot of
territory.  There are some restrictions on the set of server-side
encodings - we only allow those that have certain safe properties -
but IIUC client encodings are much less restricted and a lot of wacky
stuff is possible.  Even if you can come up with code that handles all
cases correctly, it'll probably perform much less well in simple cases
than the quick hack you linked to here.

Considering all the above, this seems like it might be a solution in
search of a problem.  It's not actually that hard to write code to do
proper escaping for a *given* encoding and a *given* set of COPY
options, but trying to write something general sounds like a job and a
half.
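
For what it's worth, the narrow case is indeed short.  Here is roughly
what it looks like for the default text format, assuming an encoding
whose multibyte characters never contain ASCII-looking bytes (guaranteed
server-side, but, as noted above, not for every client encoding).  The
function name is invented; nothing like it exists in libpq:

#include <stdio.h>

/*
 * Sketch: escape one field value for COPY's default text format (tab
 * delimiter, newline row terminator, backslash escapes).  Bytes >= 0x80
 * pass through untouched, which is only safe under the encoding
 * assumption stated above.
 */
static void
copy_text_escape(FILE *out, const char *val)
{
    for (const char *p = val; *p; p++)
    {
        switch (*p)
        {
            case '\\': fputs("\\\\", out); break;
            case '\b': fputs("\\b", out); break;
            case '\f': fputs("\\f", out); break;
            case '\n': fputs("\\n", out); break;
            case '\r': fputs("\\r", out); break;
            case '\t': fputs("\\t", out); break;
            case '\v': fputs("\\v", out); break;
            default:   fputc(*p, out);    break;
        }
    }
}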

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] patch for parallel pg_dump

2012-03-14 Thread Joachim Wieland
On Wed, Mar 14, 2012 at 2:02 PM, Robert Haas robertmh...@gmail.com wrote:
 I think we should get rid of die_horribly(), and instead arrange
 to always clean up AH via an on_exit_nicely hook.

Good. The only exit handler I've seen so far is
pgdump_cleanup_at_exit. If there's no other one, is it okay to remove
all of this stacking functionality (see on_exit_nicely_index /
MAX_ON_EXIT_NICELY) from dumputils.c and just define two global
variables, one for the function and one for the arg that this function
would operate on (or a struct of both)?

We'd then have the current function and AHX (or only AH->connection
from it) in the non-parallel case and as soon as we enter the parallel
dump, we can exchange it for another function operating on
ParallelState*. This avoids having to deal with thread-local storage
on Windows, because ParallelState* is just large enough to hold all
the required data and a specific thread can easily find its own slot
with its threadId.
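
Roughly like this, say (a sketch only; none of these names are final):

#include <stdlib.h>

typedef void (*exit_handler_fn) (int code, void *arg);

static exit_handler_fn current_exit_handler = NULL;
static void *current_exit_arg = NULL;   /* ArchiveHandle* or ParallelState* */

static void
set_exit_handler(exit_handler_fn fn, void *arg)
{
    current_exit_handler = fn;
    current_exit_arg = arg;
}

static void
exit_nicely(int code)
{
    if (current_exit_handler)
        current_exit_handler(code, current_exit_arg);
    exit(code);
}

Entering the parallel phase would then just be one set_exit_handler()
call that swaps in the handler operating on ParallelState*.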


 Sure, but since all the function does is write to it or access it,
 what good does that do me?

 It encapsulates the variable so that it can only be used for one
 specific use case.

 Seems pointless to me.

Not so much to me if the alternative is to make ParallelState* a
global variable, but anyway, with the concept proposed above,
ParallelState* would be the arg that the parallel exit handler would
operate on, so it would indeed be global but hidden behind a different
name and a void* pointer.

(I will address all the other points you brought up in my next patch)



Re: [HACKERS] patch for parallel pg_dump

2012-03-14 Thread Joachim Wieland
On Wed, Mar 14, 2012 at 4:39 PM, Andrew Dunstan aduns...@postgresql.org wrote:
 I've just started looking at the patch, and I'm curious to know why it
 didn't follow the pattern of parallel pg_restore which created a new worker
 for each table rather than passing messages to looping worker threads as
 this appears to do. That might have avoided a lot of the need for this
 message passing infrastructure, if it could have been done. But maybe I just
 need to review the patch and the discussions some more.

The main reason for this design has now been overcome by the
flexibility of the synchronized snapshot feature, which makes it
possible to adopt the snapshot of a transaction even if that other
transaction has been running for quite some time already. In other
previously proposed implementations of this feature, workers had to
connect at the same time and then could not close their transactions
without losing the snapshot.

The other drawback of the fork-per-TocEntry approach is the somewhat
limited bandwidth of information from the worker back to the master:
it's basically just the return code. That's fine if there is no error,
but if there is, then the master can't report any further details
(e.g. "could not get lock on table foo", or "could not write to file
bar: no space left on device").

This restriction does not apply only to error messages. For example,
what I'd also like to have in pg_dump would be checksums on a
per-TocEntry basis. The individual workers would calculate the
checksums when writing the file and then send them back to the master
for integration into the TOC. I don't see how such a feature could be
implemented in a straightforward way without a message passing
infrastructure.
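
To make that concrete, the per-TocEntry report could be a small struct
along these lines (a purely hypothetical layout, not taken from the
patch):

typedef int DumpId;     /* as in pg_dump's pg_backup.h */

/* Hypothetical worker-to-master message for one finished TOC entry. */
typedef struct WorkerStatusMsg
{
    int     worker_id;      /* which worker is reporting */
    DumpId  dump_id;        /* which TOC entry this concerns */
    int     status;         /* 0 = success, else an error code */
    char    checksum[65];   /* hex checksum of the data written, NUL-terminated */
    char    errdetail[256]; /* e.g. "could not write to file bar: ..." */
} WorkerStatusMsg;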



Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread David Fetter
On Wed, Mar 14, 2012 at 12:06:20PM -0400, Robert Haas wrote:
 On Wed, Mar 14, 2012 at 10:22 AM, David Fetter da...@fetter.org wrote:
  I think that instead of inventing new grammar productions and a new
  node type for this, you should just reuse the existing productions for
  LIKE clauses and then reject invalid options during parse analysis.
 
  OK.  Should I first merge CREATE FOREIGN TABLE with CREATE TABLE and
  submit that as a separate patch?
 
 I don't see any reason to do that.  I merely meant that you could
 reuse TableLikeClause or maybe even TableElement in the grammar for
 CreateForeignTableStmt.

Next WIP patch attached implementing this via reusing TableLikeClause
and refactoring transformTableLikeClause().

What say?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 3950,3955 **** ForeignTableElementList:
--- 3950,3956 ----
  
  ForeignTableElement:
              columnDef                       { $$ = $1; }
+             | TableLikeClause               { $$ = $1; }
          ;
  
  /*
*** a/src/backend/parser/parse_utilcmd.c
--- b/src/backend/parser/parse_utilcmd.c
***************
*** 652,657 **** transformTableLikeClause(CreateStmtContext *cxt, TableLikeClause *table_like_cla
--- 652,678 ----
                           table_like_clause->relation->relname)));
  
      cancel_parser_errposition_callback(pcbstate);
+ 
+     /*
+      * For foreign tables, disallow some options.
+      */
+     if (strcmp(cxt->stmtType, "CREATE FOREIGN TABLE") == 0)
+     {
+         if (table_like_clause->options & CREATE_TABLE_LIKE_CONSTRAINTS)
+         {
+             ereport(ERROR,
+                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                      errmsg("\"%s\" is a foreign table. Only local tables can take LIKE CONSTRAINTS",
+                             table_like_clause->relation->relname)));
+         }
+         else if (table_like_clause->options & CREATE_TABLE_LIKE_INDEXES)
+         {
+             ereport(ERROR,
+                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                      errmsg("\"%s\" is a foreign table. Only local tables can take LIKE INDEXES",
+                             table_like_clause->relation->relname)));
+         }
+     }
  
      /*
       * Check for privileges