Re: [HACKERS] Review: Patch FORCE_NULL option for copy COPY in CSV mode

2014-03-07 Thread Michael Paquier
On Thu, Mar 6, 2014 at 12:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 03/05/2014 09:11 AM, Michael Paquier wrote:
 After testing this feature, I noticed that FORCE_NULL and
 FORCE_NOT_NULL can both be specified with COPY on the same column.

 Strictly they are not actually contradictory, since FORCE NULL relates
 to quoted null strings and FORCE NOT NULL relates to unquoted null
 strings. Arguably the docs are slightly loose on this point. Still,
 applying both FORCE NULL and FORCE NOT NULL to the same column would be
 rather perverse, since it would result in a quoted null string becoming
 null and an unquoted null string becoming not null.
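As a rough illustrative sketch (not the actual COPY implementation), the combined semantics Andrew describes could be modeled like this in Python, where `null_string` stands for the CSV NULL marker (empty string by default) and `None` stands for SQL NULL:

```python
def interpret_field(raw, quoted, null_string="",
                    force_null=False, force_not_null=False):
    """Model how a CSV field might be interpreted (None == SQL NULL)."""
    if quoted:
        # Quoted fields are normally never NULL; FORCE_NULL makes a
        # *quoted* null string become NULL anyway.
        if force_null and raw == null_string:
            return None
        return raw
    # An unquoted field matching the null string is normally NULL;
    # FORCE_NOT_NULL keeps it as a literal (not-null) value.
    if raw == null_string and not force_not_null:
        return None
    return raw

# Both options on the same column: quoted null string -> NULL,
# unquoted null string -> literal empty string (not null).
assert interpret_field("", quoted=True, force_null=True, force_not_null=True) is None
assert interpret_field("", quoted=False, force_null=True, force_not_null=True) == ""
```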

 Given the remarkable lack of standardization of CSV output, who's
 to say that there might not be data sources out there for which this
 is the desired behavior?  It's weird, I agree, but I think throwing
 an error for the combination is not going to be helpful.  It's not
 like somebody might accidentally write both on the same column.

 +1 for clarifying the docs, though, more or less in the words you
 used above.
Following that, I have hacked up the attached patch to update the docs,
with an additional regression test (it actually replaces a test that
duplicated an existing one in copy2).

I am attaching as well a second patch for file_fdw, to allow the use
of force_null and force_not_null on the same column, to be consistent
with COPY.
Regards,
-- 
Michael
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 7fb1dbc..97a35d0 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -267,11 +267,6 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 		(errcode(ERRCODE_SYNTAX_ERROR),
 		 errmsg("conflicting or redundant options"),
 		 errhint("option \"force_not_null\" supplied more than once for a column")));
-			if (force_null)
-				ereport(ERROR,
-						(errcode(ERRCODE_SYNTAX_ERROR),
-						 errmsg("conflicting or redundant options"),
-						 errhint("option \"force_not_null\" cannot be used together with \"force_null\"")));
 			force_not_null = def;
 			/* Don't care what the value is, as long as it's a legal boolean */
 			(void) defGetBoolean(def);
@@ -284,11 +279,6 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 		(errcode(ERRCODE_SYNTAX_ERROR),
 		 errmsg("conflicting or redundant options"),
 		 errhint("option \"force_null\" supplied more than once for a column")));
-			if (force_not_null)
-				ereport(ERROR,
-						(errcode(ERRCODE_SYNTAX_ERROR),
-						 errmsg("conflicting or redundant options"),
-						 errhint("option \"force_null\" cannot be used together with \"force_not_null\"")));
 			force_null = def;
 			(void) defGetBoolean(def);
 		}
diff --git a/contrib/file_fdw/input/file_fdw.source b/contrib/file_fdw/input/file_fdw.source
index 0c278aa..b608372 100644
--- a/contrib/file_fdw/input/file_fdw.source
+++ b/contrib/file_fdw/input/file_fdw.source
@@ -91,24 +91,22 @@ ALTER FOREIGN TABLE text_csv OPTIONS (SET format 'csv');
 \pset null _null_
 SELECT * FROM text_csv;
 
+-- force_not_null and force_null can be used together on the same column
+ALTER FOREIGN TABLE text_csv ALTER COLUMN word1 OPTIONS (force_null 'true');
+ALTER FOREIGN TABLE text_csv ALTER COLUMN word3 OPTIONS (force_not_null 'true');
+
 -- force_not_null is not allowed to be specified at any foreign object level:
 ALTER FOREIGN DATA WRAPPER file_fdw OPTIONS (ADD force_not_null '*'); -- ERROR
 ALTER SERVER file_server OPTIONS (ADD force_not_null '*'); -- ERROR
 CREATE USER MAPPING FOR public SERVER file_server OPTIONS (force_not_null '*'); -- ERROR
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (force_not_null '*'); -- ERROR
 
--- force_not_null cannot be specified together with force_null
-ALTER FOREIGN TABLE text_csv ALTER COLUMN word1 OPTIONS (force_null 'true'); --ERROR
-
 -- force_null is not allowed to be specified at any foreign object level:
 ALTER FOREIGN DATA WRAPPER file_fdw OPTIONS (ADD force_null '*'); -- ERROR
 ALTER SERVER file_server OPTIONS (ADD force_null '*'); -- ERROR
 CREATE USER MAPPING FOR public SERVER file_server OPTIONS (force_null '*'); -- ERROR
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (force_null '*'); -- ERROR
 
--- force_null cannot be specified together with force_not_null
-ALTER FOREIGN TABLE text_csv ALTER COLUMN word3 OPTIONS (force_not_null 'true'); --ERROR
-
 -- basic query tests
 SELECT * FROM agg_text WHERE b < 10.0 ORDER BY a;
 SELECT * FROM agg_csv ORDER BY a;
diff --git a/contrib/file_fdw/output/file_fdw.source b/contrib/file_fdw/output/file_fdw.source
index 2bec160..bc183b8 100644
--- a/contrib/file_fdw/output/file_fdw.source
+++ b/contrib/file_fdw/output/file_fdw.source
@@ -115,6 +115,9 @@ SELECT * FROM text_csv;
  ABC   | abc|| 
 (5 rows)
 
+-- force_not_null and force_null can be used together on the same column
+ALTER FOREIGN TABLE text_csv ALTER COLUMN word1 OPTIONS (force_null 'true');
+ALTER FOREIGN TABLE text_csv ALTER COLUMN word3 OPTIONS 

Re: [HACKERS] syslog_ident mentioned as syslog_identify in the docs

2014-03-07 Thread Heikki Linnakangas

On 03/07/2014 08:24 AM, Michael Paquier wrote:

In the documentation, particularly the doc index, syslog_ident is
incorrectly mentioned as syslog_identify. The attached patch fixes
that. This error is in the docs since 8.0.


Thanks, fixed.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2014-03-07 Thread Simon Riggs
On 6 March 2014 22:43, Noah Misch n...@leadboat.com wrote:
 On Tue, Mar 04, 2014 at 06:50:17PM +0100, Andres Freund wrote:
 On 2014-03-04 11:40:10 -0500, Tom Lane wrote:
  Robert Haas robertmh...@gmail.com writes:  I think this is all too
  late for 9.4, though.
 
  I agree with the feeling that a meaningful fix for pg_dump isn't going
  to get done for 9.4.  So that leaves us with the alternatives of (1)
  put off the lock-strength-reduction patch for another year; (2) push
  it anyway and accept a reduction in pg_dump reliability.
 
  I don't care for (2).  I'd like to have lock strength reduction as
  much as anybody, but it can't come at the price of reduction of
  reliability.

 I am sorry, but I think this is vastly overstating the scope of the
 pg_dump problem. CREATE INDEX *already* doesn't require an AEL, and the
 amount of problems that has caused in the past is surprisingly low. If
 such a frequently used command doesn't cause problems, why are you
 assuming other commands to be that problematic? And I think it's hard to
 argue that the proposed changes are more likely to cause problems.

 Let's try to go at this a bit more methodically. The commands that -
 afaics - change their locklevel due to latest patch (v21) are:
 [snip]

 Good analysis.  The hazards arise when pg_dump uses one of the ruleutils.c
 deparse worker functions.  As a cross-check to your study, I looked briefly at
 the use of those functions in pg_dump and how this patch might affect them:

 -- pg_get_constraintdef()

 pg_dump reads the constraint OID with its transaction snapshot, so we will
 never see a too-new constraint.  Dropping a constraint still requires
 AccessExclusiveLock.

Agreed

 Concerning VALIDATE CONSTRAINT, pg_dump reads convalidated with its
 transaction snapshot and uses that to decide whether to dump the CHECK
 constraint as part of the CREATE TABLE or as a separate ALTER TABLE ADD
 CONSTRAINT following the data load.  However, pg_get_constraintdef() reads the
 latest convalidated to decide whether to emit NOT VALID.  Consequently, one
 can get a dump in which the dumped table data did not yet conform to the
 constraint, and the ALTER TABLE ADD CONSTRAINT (w/o NOT VALID) fails.
 (Suppose you deleted the last invalid rows just before executing the VALIDATE
 CONSTRAINT.  I tested this by committing the DELETE + VALIDATE CONSTRAINT with
 pg_dump stopped at getTableAttrs().)

 One hacky, but maintainable and effective, solution to the VALIDATE CONSTRAINT
 problem is to have pg_dump tack on a NOT VALID if pg_get_constraintdef() did
 not do so.  It's, conveniently, the last part of the definition.  I would tend
 to choose this.  We could also just decide this isn't bad enough to worry
 about.  The consequence is that an ALTER TABLE ADD CONSTRAINT fails.  Assuming
 no --single-transaction for the original restoral, you just add NOT VALID to
 the command and rerun.  Like most of the potential new pg_dump problems, this
 can already happen today if the relevant database changes happen between
 taking the pg_dump transaction snapshot and locking the tables.
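That tack-on could be sketched as follows (a hypothetical helper, not actual pg_dump code): append NOT VALID whenever pg_dump's snapshot says the constraint was not yet validated but the deparsed definition omits it.

```python
def ensure_not_valid(constraint_def, convalidated_at_snapshot):
    # If pg_dump's snapshot says the constraint was not yet validated,
    # but pg_get_constraintdef() (which reads the *latest* convalidated)
    # omitted NOT VALID, tack it on so restoring old data cannot fail.
    if not convalidated_at_snapshot and not constraint_def.rstrip().endswith("NOT VALID"):
        return constraint_def.rstrip() + " NOT VALID"
    return constraint_def

assert ensure_not_valid("CHECK (b > 0)", convalidated_at_snapshot=False) \
    == "CHECK (b > 0) NOT VALID"
assert ensure_not_valid("CHECK (b > 0) NOT VALID", convalidated_at_snapshot=False) \
    == "CHECK (b > 0) NOT VALID"
assert ensure_not_valid("CHECK (b > 0)", convalidated_at_snapshot=True) \
    == "CHECK (b > 0)"
```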

Too hacky for me, but some good thinking. My proposed solution is below.

 -- pg_get_expr() for default expressions

 pg_dump reads pg_attrdef.adbin using its transaction snapshot, so it will
 never see a too-new default.  This does allow us to read a dropped default.
 That's not a problem directly.  However, suppose the default references a
 function dropped at the same time as the default.  pg_dump could fail in
 pg_get_expr().

 -- pg_get_indexdef()

 As you explained elsewhere, new indexes are no problem.  DROP INDEX still
 requires AccessExclusiveLock.  Overall, no problems here.

 -- pg_get_ruledef()

 The patch changes lock requirements for enabling and disabling of rules, but
 that is all separate from the rule expression handling.  No problems.

 -- pg_get_triggerdef()

 The patch reduces CREATE TRIGGER and DROP TRIGGER to ShareUpdateExclusiveLock.
 The implications for pg_dump are similar to those for pg_get_expr().

These are certainly concerning. What surprises me the most is that
pg_dump has been so happy to randomly mix SQL using the transaction
snapshot with sys cache access code using a different snapshot. If
that was the intention, there is no documentation in the code or in the
docs to explain it.

 -- pg_get_viewdef()

 Untamed: pg_dump does not lock views at all.

OMG, it's really a wonder pg_dump works at all.

 One thing not to forget is that you can always get the old mutual exclusion
 back by issuing LOCK TABLE just before a DDL operation.  If some unlucky user
 regularly gets pg_dump failures due to concurrent DROP TRIGGER, he has a
 workaround.  There's no comparable way for someone who would not experience
 that problem to weaken the now-hardcoded AccessExclusiveLock.  Many
 consequences of insufficient locking are too severe for that workaround to
 bring comfort, but the pg_dump failure scenarios around pg_get_expr() and
 pg_get_triggerdef() 

Re: [HACKERS] GSoC on WAL-logging hash indexes

2014-03-07 Thread Heikki Linnakangas

On 03/06/2014 09:34 PM, Robert Haas wrote:

On Thu, Mar 6, 2014 at 8:11 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

I don't think it's necessary to improve concurrency just to get WAL-logging.
Better concurrency is a worthy goal of its own, of course, but it's a
separate concern.


To some extent, I agree, but only to some extent.  To make hash
indexes generally usable, we really need to solve both problems.  When
I got rid of just some of the heavyweight locking in commit
76837c1507cb5a5a0048046233568447729e66dd, the results were pretty
dramatic at higher levels of concurrency:

http://www.postgresql.org/message-id/CA+Tgmoaf=nojxlyzgcbrry+pe-0vll0vfhi6tjdm3fftvws...@mail.gmail.com

But there was still an awful lot of contention inside the heavyweight
lock manager, and I don't think hash indexes are going to be ready for
prime time until we solve that problem.


Hmm. You suggested ensuring that a scan always has at least a pin, and 
split takes a vacuum-lock. That ought to work. There's no need for the 
more complicated maneuvers you described, ISTM that you can just replace 
the heavy-weight share lock with holding a pin on the primary page of 
the bucket, and an exclusive lock with a vacuum-lock. Note that 
_hash_expandtable already takes the exclusive lock conditionally, ie. if 
it doesn't get the lock immediately it just gives up. We could do the 
same with the cleanup lock.


Vacuum could also be enhanced. It currently takes an exclusive lock on 
the bucket, then removes any dead tuples and finally squeezes the 
bucket by moving tuples to earlier pages. But you only really need the 
exclusive lock for the squeeze-phase. You could do the dead-tuple 
removal without the bucket-lock, and only grab it for the squeeze-phase.
And the squeezing is optional, so you could just skip that if you can't 
get the lock. But that's a separate patch as well.


One more thing we could do to make hash indexes more scalable, 
independent of the above: Cache the information in the metapage in 
backend-private memory. Then you (usually) wouldn't need to access the 
metapage at all when scanning. Store a copy of the bitmask for that 
bucket in the primary page, and when scanning, check that it matches the 
cached value. If not, refresh the cached metapage and try again.
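A rough sketch of that cache-check-retry loop (hypothetical names and data layout, not actual hash AM code; the bucket computation mirrors the mask-and-compare scheme used by hash indexes):

```python
def bucket_for_key(hashkey, cached_meta, read_metapage, read_primary_page):
    """Locate a bucket using a backend-local metapage cache.

    cached_meta: dict with the bucket-mapping state (maxbucket, masks).
    read_metapage / read_primary_page: callables standing in for buffer reads.
    """
    while True:
        # Compute the bucket from the cached metapage contents.
        mask = cached_meta["highmask"]
        bucket = hashkey & mask
        if bucket > cached_meta["maxbucket"]:
            mask = cached_meta["lowmask"]
            bucket = hashkey & mask
        page = read_primary_page(bucket)
        # The primary page stores a copy of the bitmask for the bucket;
        # if it matches the mask we used, the cached metapage was current.
        if page["bitmask"] == mask:
            return bucket, page
        # Stale cache: refresh from the real metapage and try again.
        cached_meta = read_metapage()
```

With a stale cache, the first probe lands on the wrong bucket, the bitmask check fails, and the loop refreshes the cache and retries — so the metapage is only touched when the index has actually changed shape.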



So, there seem to be a few fairly simple and independent improvements 
to be made:


1. Replace the heavy-weight lock with pin & vacuum-lock.
2. Make it crash-safe, by adding WAL-logging
3. Only acquire the exclusive-lock (vacuum-lock after step 1) in VACUUM 
for the squeeze phase.

4. Cache the metapage.

We still don't know if it's going to be any better than B-tree after all 
that's done, but the only way to find out is to go ahead and implement it.


This seems like a great GSoC project to me. We have a pretty good idea 
of what we want to accomplish. It's uncontroversial: I don't think 
anyone is going to object to improving hash indexes (one could argue 
that it's a waste of time, but that's different from objecting to the 
idea). And it consists of a few mostly independent parts, so it's 
possible to do incrementally which makes it easier to track progress, 
and we'll probably have something useful at the end of the summer even 
if it doesn't all get finished.


- Heikki




Re: [HACKERS] missing RelationCloseSmgr in FreeFakeRelcacheEntry?

2014-03-07 Thread Heikki Linnakangas

On 03/06/2014 02:18 AM, Bruce Momjian wrote:

On Tue, Nov  5, 2013 at 08:36:32PM +0100, Andres Freund wrote:

On 2013-11-04 13:48:32 +0100, Andres Freund wrote:

What about just unowning the smgr entry with
if (rel->rd_smgr != NULL)
    smgrsetowner(NULL, rel->rd_smgr)
when closing the fake relcache entry?

That shouldn't require any further changes than changing the comment in
smgrsetowner() that this isn't something expected to frequently happen.


Attached is a patch doing it like that; it required slightly more
invasive changes than that. With the patch applied we survive replay of
a primary's make check run under valgrind without warnings.


Where are we on this patch?


Committed now, with small changes. I made the new smgrclearowner 
function to check that the SmgrRelation object is indeed owned by the 
owner we expect, so that it doesn't unown it if it's actually owned by 
someone else. That shouldn't happen, but it seemed prudent to check.


Thanks Andres. I tried to reproduce the valgrind message you reported, 
but couldn't. How did you do it? Did this commit fix it?


And thanks for the nudge, Bruce.

- Heikki




Re: [HACKERS] extension_control_path

2014-03-07 Thread Peter Eisentraut
On 2/27/14, 6:04 AM, Dimitri Fontaine wrote:
 What about allowing a control file like this:
 
# hstore extension
comment = 'data type for storing sets of (key, value) pairs'
default_version = '1.3'
directory = 'local/hstore-new'
module_pathname = '$directory/hstore'
relocatable = true

I think your previously proposed patch to add extension_control_path
plus my suggestion to update existing de facto best practices to not
include $libdir into the module path name (thus allowing the use of
dynamic_library_path) will address all desired use cases just fine.

Moreover, going that way would reuse existing facilities and concepts,
remove indirections and reduce overall complexity.  This new proposal,
on the other hand, would go the other way, introducing new concepts,
adding more indirections, and increasing overall complexity, while
actually achieving less.

I see an analogy here.  What we are currently doing is similar to
hardcoding absolute rpaths into all libraries.  Your proposal is
effectively to (1) add the $ORIGIN mechanism and (2) make people use
chrpath when they want to install somewhere else.  My proposal is to get
rid of all rpaths and just set a search path.  Yes, on technical level,
this is less powerful, but it's simpler and gets the job done and is
harder to misuse.

A problem with features like these is that they get rarely used but
offer infinite flexibility, so they are not used consistently and you
can't rely on anything.  This is already the case for the
module_pathname setting in the control file.  It has, AFAICT, no actual
use, and because of that no one uses it, and because of that, there is
no guarantee that extensions use it sensibly, and because of that no one
can ever make sensible use of it in the future, because there is no
guarantee that extensions have it set sensibly.  In fact, I would
propose deprecating module_pathname.





Re: [HACKERS] Hot standby doesn't come up on some situation.

2014-03-07 Thread Kyotaro HORIGUCHI
Hello,

  After all, I have confirmed that this fixes the problem on crash
  recovery of hot standby both for 9.3 and HEAD, and no problem was
  found except unreadability :(
 
 Ok, committed. Thanks!

Thank you.

 Any concrete suggestions about the readability? Is there some
 particular spot that needs clarifying?

Well, although I came to see no problem there once I understood how it
works :-), I'll write about the two points where I had difficulty
understanding.

|  * (or the checkpoint record itself, if it's a shutdown checkpoint).
|  */
| if (checkPoint.redo < RecPtr)

First, it was a bit of tough work to confirm the equivalence between
(redo == RecPtr) and the checkpoint being a shutdown checkpoint.
Although I was finally convinced that it surely holds, that is
actually not the point. The point here is in the first half of the
phrase. The comment might be less perplexing if it read as follows,
even if only a shutdown checkpoint satisfies the condition. But that
would raise another question in readers' minds.

|  * (or the checkpoint record itself, e.g. if it's a shutdown checkpoint).

Second, the added code depends on the assumption that RecPtr points to
the checkpoint record and EndRecPtr points to the next record. It
would be better for understandability and stability (against future
code modification) to explicitly declare the precondition, like this:

| Here RecPtr points to the checkpoint record and EndRecPtr points to
| the place for the record just after it.


  Surely this is the consequence of an illegal operation, but I think
  it is also not an issue for an assertion - which fires on something
  wrong in the design or in quite rare cases (this case?).
 
 Ah, I see. Yes, that's definitely a bug. If you don't hit the
 assertion, because the oldestActiveXID is set in the checkpoint record
 even though wal_level is 'archive', or if you simply have assertions
 disabled, the system will start up in hot standby mode even though
 it's not safe.
 
  So it might be
  better to show message as below on the case.
 
  | FATAL: Checkpoint doesn't have a valid oldest active transaction id
  | HINT: The WAL being read might have been written under insufficient
  | wal_level.

Agreed.

 Hmm. When I test that with 9.2, oldestActiveXID is not 0, even though
 wal_level is 'archive'. So the above patch doesn't fix the whole
 problem.
 
 The real bug here is that CheckRequiredParameterValues() tests for
 InArchiveRecovery, when it should be testing for
 ArchiveRecoveryRequested. Otherwise, the checks are not performed when
 going through the crash recovery followed by archive recovery. I
 should've changed that as part of the commit that added the crash
 recovery then archive recovery behavior.
 
 Fixed, thanks for pointing it out!

It's my pleasure.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Changeset Extraction v7.9.1

2014-03-07 Thread Andres Freund
Hi,

On 2014-03-05 23:20:57 +0100, Andres Freund wrote:
 On 2014-03-05 17:05:24 -0500, Robert Haas wrote:
   I very much dislike having the three different event loops, but it's
   pretty much forced by the design of the xlogreader. My xlogreader
   version didn't block when it neeeded to wait for WAL but just returned
   need input/output, but with the eventually committed version you're
   pretty much forced to block inside the read_page callback.
  
   I don't really have an idea how we could sensibly unify them atm.
  
  WalSndLoop(void (*gutsfn)())?
 
 The problem is that they are actually different. In the WalSndLoop we're
 also maintaining the walsender's state, in WalSndWriteData() we're just
 waiting for writes to be flushed, in WalSndWaitForWal we're primarily
 waiting for the flush pointer to pass some LSN. And the timing of the
 individual checks isn't trivial (just added some more comments about
 it).
 
 I'll simplify it by pulling out more common code, maybe it'll become
 apparent how it should look.

I've attached a new version of the walsender patch. It's been rebased
ontop of Heikki's latest commit to walsender.c. I've changed a fair bit
of stuff:
* The sleeptime is now computed to sleep until we either need to send a
  keepalive or kill ourselves, as Heikki suggested.
* Sleep time computation, sending pings, checking timeouts is now done
  in separate functions.
* Comment and codestyle improvements.

Although they are shorter and simpler now, I have not managed to unify
the three loops however. They seem to be too different to unify them
inside one. I tried a common function with a 'wait_for' bitmask
argument, but that turned out to be fairly illegible. The checks in
WalSndWaitForWal() and WalSndLoop() just seem to be too different.

I'd be grateful if you (or somebody else!) could have a quick look at
body of the loops in WalSndWriteData(), WalSndWaitForWal() and
WalSndLoop(). Maybe I am just staring at it the wrong way.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From ca84cd3d2966f0e7297d1c7270b094122eec2f18 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 5 Mar 2014 00:15:38 +0100
Subject: [PATCH] Add walsender interface for the logical decoding
 functionality.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This exposes the changeset extraction feature via walsenders which
allows the data to be received in a streaming fashion and supports
synchronous replication.

To do this, walsenders need to be able to connect to a specific
database. For that, we extend the existing 'replication' parameter to
allow not only a boolean value but also 'database'. If the latter is
specified, we connect to the database specified in 'dbname'.

Andres Freund, with contributions from Álvaro Herrera, reviewed by
Robert Haas.
---
 doc/src/sgml/protocol.sgml |  24 +-
 src/backend/postmaster/postmaster.c|  28 +-
 .../libpqwalreceiver/libpqwalreceiver.c|   4 +-
 src/backend/replication/repl_gram.y|  81 +-
 src/backend/replication/repl_scanner.l |   1 +
 src/backend/replication/walsender.c| 914 ++---
 src/backend/utils/init/postinit.c  |  15 +-
 src/bin/pg_basebackup/pg_basebackup.c  |   6 +-
 src/bin/pg_basebackup/pg_receivexlog.c |   6 +-
 src/bin/pg_basebackup/receivelog.c |   6 +-
 src/include/replication/walsender.h|   1 +
 src/tools/pgindent/typedefs.list   |   1 +
 12 files changed, 918 insertions(+), 169 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index d36f2f3..510bf9a 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1302,10 +1302,13 @@
 
 <para>
 To initiate streaming replication, the frontend sends the
-<literal>replication</> parameter in the startup message. This tells the
-backend to go into walsender mode, wherein a small set of replication commands
-can be issued instead of SQL statements. Only the simple query protocol can be
-used in walsender mode.
+<literal>replication</> parameter in the startup message. A boolean value
+of <literal>true</> tells the backend to go into walsender mode, wherein a
+small set of replication commands can be issued instead of SQL statements. Only
+the simple query protocol can be used in walsender mode.
+Passing <literal>database</> as the value instructs walsender to connect to
+the database specified in the <literal>dbname</> parameter, which will in future
+allow some additional commands to the ones specified below to be run.
 
 The commands accepted in walsender mode are:
 
@@ -1315,7 +1318,7 @@ The commands accepted in walsender mode are:
 <listitem>
  <para>
   Requests the server to identify itself. Server replies with a result
-  

Re: [HACKERS] Changeset Extraction v7.9.1

2014-03-07 Thread Alvaro Herrera
Andres Freund escribió:

 	fprintf(stderr,
 -			_("%s: could not identify system: got %d rows and %d fields, expected %d rows and %d fields\n"),
 -			progname, PQntuples(res), PQnfields(res), 1, 3);
 +			_("%s: could not identify system: got %d rows and %d fields, expected 1 row and 3 or more fields\n"),
 +			progname, PQntuples(res), PQnfields(res));

Please don't change this.  The reason these messages use %d and an extra
printf argument is to avoid giving translators extra work when the
number of rows or fields is changed.  In these cases I suggest this:

 -			_("%s: could not identify system: got %d rows and %d fields, expected %d rows and %d fields\n"),
 -			progname, PQntuples(res), PQnfields(res), 1, 3);
 +			_("%s: could not identify system: got %d rows and %d fields, expected %d rows and %d or more fields\n"),
 +			progname, PQntuples(res), PQnfields(res), 1, 3);

(Yes, I know the expected 1 rows output looks a bit silly.  Since this
is an unexpected error message anyway, I don't think that's worth
fixing.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] GSoC on WAL-logging hash indexes

2014-03-07 Thread Robert Haas
On Thu, Mar 6, 2014 at 7:07 PM, Greg Stark st...@mit.edu wrote:
 On Thu, Mar 6, 2014 at 11:14 PM, Robert Haas robertmh...@gmail.com wrote:
 I've been tempted to implement a new type of hash index that allows both WAL
 and high concurrency, simply by disallowing bucket splits.  At the index
 creation time you use a storage parameter to specify the number of buckets,
 and that is that. If you mis-planned, build a new index with more buckets,
 possibly concurrently, and drop the too-small one.

 Yeah, we could certainly do something like that.  It sort of sucks,
 though.  I mean, it's probably pretty easy to know that starting with
 the default 2 buckets is not going to be enough; most people will at
 least be smart enough to start with, say, 1024.  But are you going to
 know whether you need 32768 or 1048576 or 33554432?  A lot of people
 won't, and we have more than enough reasons for performance to degrade
 over time as it is.

 The other thought I had was that you can do things lazily in vacuum.
 So when you probe you need to check multiple pages until vacuum comes
 along and rehashes everything.

I was thinking about this, too.  Cleaning up the old bucket after the
split is pretty similar to vacuuming.  And lo and behold, vacuum also
needs to lock the entire bucket.  AFAICT, there are two reasons for
this.  First, when we resume a suspended scan, we assume that the next
match on the page, if any, will occur on the same page at an offset
greater than the offset where we found the previous match.  The code
copes with the possibility of current insertions by refinding the last
item we returned and scanning forward from there, but assumes the
pointer we're refinding can't have moved to a lower offset.  I think
this could be easily fixed - at essentially no cost - by changing
hashgettuple so that, if the forward scan errors out, it tries
scanning backward rather than just giving up.  Second, vacuum compacts
each bucket that it modifies using _hash_squeezebucket, which scans
from the two ends of the index in toward the middle, filling any free
space on earlier pages by pulling tuples from the end of the bucket
chain.  This is a little thornier.  We could change vacuum so that it
only removes TIDs from the individual pages, without actually trying
to free up pages, but that seems undesirable.

However, I think we might be able to get by with making bucket
compaction less aggressive, without eliminating it altogether.
Suppose that whenever we remove items from a page, we consider
consolidating the page with its immediate predecessor and successor in
the bucket chain.  This means our utilization will be a little over
50% in the worst case where we have a full page, a page with one item,
another full page, another page with one item, and so on.  But that's
not all bad, because compacting a bucket chain to the minimum possible
size when it may well be about to suffer more inserts isn't
necessarily a good thing anyway.  Also, doing this means that we don't
need to lock out scans from the entire bucket chain in order to
compact.  It's sufficient to make sure that nobody's in the middle of
scanning the two pages we want to merge.

That turns out not to be possible right now.  When a scan is
suspended, we still hold a pin on the page we're scanning.  But when
_hash_readnext or _hash_readprev walks the bucket chain, it drops both
lock and pin on the current buffer and then pins and locks the new
buffer.  That, however, seems like it could easily be changed: drop
lock on current buffer, acquire pin on new buffer, drop pin on current
buffer, lock new buffer.  Then, if we want to merge two buffers, it's
enough to lock both of them for cleanup.  To avoid any risk of
deadlock, and also to avoid waiting for a long-running suspended scan
to wake up and complete, we can do this conditionally; if we fail to
get either cleanup lock, then just don't merge the pages; take an
exclusive lock only and remove whatever you need to remove, leaving
the rest.  Merging pages is only a performance optimization, so if it
fails now and then, no real harm done.
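That conditional, best-effort merge could be sketched like so (hypothetical lock API, not the real buffer manager):

```python
def try_merge(page_a, page_b, cond_cleanup_lock, unlock):
    """Attempt to merge two adjacent bucket-chain pages.

    cond_cleanup_lock(p) -> bool: conditionally take a cleanup lock;
    fails if anyone holds a pin on p (e.g. a suspended scan).
    Merging is only an optimization, so failing is always safe.
    """
    if not cond_cleanup_lock(page_a):
        return False
    if not cond_cleanup_lock(page_b):
        unlock(page_a)  # never wait for the second lock: avoids deadlock
        return False
    page_a["items"] += page_b["items"]  # pull tuples onto one page
    page_b["items"] = []
    unlock(page_a)
    unlock(page_b)
    return True

# A suspended scan pinning page_b makes the merge back off harmlessly.
a = {"items": [1, 2], "pinned": False}
b = {"items": [3], "pinned": True}
cond = lambda p: not p["pinned"]
assert try_merge(a, b, cond, lambda p: None) is False
b["pinned"] = False
assert try_merge(a, b, cond, lambda p: None) is True
assert a["items"] == [1, 2, 3] and b["items"] == []
```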

(A side benefit of this approach is that we could opportunistically
attempt to compact pages containing dead items any time we can manage
a ConditionalLockBufferForCleanup() on the page, sort of like what HOT
pruning does for heap blocks.  We could even, after pruning away dead
items, attempt to merge the page with siblings, so that even without
vacuum the bucket chains can gradually shrink if the index tuples are
discovered to be pointing to dead heap tuples.)

With the above changes, vacuuming a bucket can happen without taking
the heavyweight lock in exclusive mode, leaving bucket splitting as
the only operation that requires it.  And we could perhaps use
basically the same algorithm to clean the tuples out of the old bucket
after the split.  The problem is that, when we're vacuuming, we know
that no scan currently in progress can still care about any of the
tuples we're removing.  

Re: [HACKERS] Changeset Extraction v7.9.1

2014-03-07 Thread Andres Freund
On 2014-03-07 10:17:21 -0300, Alvaro Herrera wrote:
 Andres Freund escribió:
 
  fprintf(stderr,
  -   _("%s: could not identify system: got %d rows and %d fields, "
  -       "expected %d rows and %d fields\n"),
  -   progname, PQntuples(res), PQnfields(res), 1, 3);
  +   _("%s: could not identify system: got %d rows and %d fields, "
  +       "expected 1 row and 3 or more fields\n"),
  +   progname, PQntuples(res), PQnfields(res));
 
 Please don't change this.  The reason these messages use %d and an extra
 printf argument is to avoid giving translators extra work when the
 number of rows or fields is changed.  In these cases I suggest this:
 
  -   _("%s: could not identify system: got %d rows and %d fields, "
  -       "expected %d rows and %d fields\n"),
  -   progname, PQntuples(res), PQnfields(res), 1, 3);
  +   _("%s: could not identify system: got %d rows and %d fields, "
  +       "expected %d rows and %d or more fields\n"),
  +   progname, PQntuples(res), PQnfields(res), 1, 3);
 
 (Yes, I know the "expected 1 rows" output looks a bit silly.  Since this
 is an unexpected error message anyway, I don't think that's worth
 fixing.)

I changed it to not use placeholders because I thought "or more" was
specific enough to be unlikely to be used in other places, but I don't
have a problem with continuing to use them.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-security on updatable s.b. views

2014-03-07 Thread Craig Ringer
On 03/05/2014 11:02 AM, Craig Ringer wrote:
 The main known issue remaining is plan invalidation.

I've pushed a version with a plan invalidation implementation. It's tagged:

  rls-9.4-upd-sb-views-v9

in

  g...@github.com:ringerc/postgres.git

The invalidation implementation does not yet handle foreign key checks;
that will require additional changes. I'll push an update to the
rls-9.4-upd-sb-views and post an update later, at which time I'll rebase
the changes back into the history.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_ctl status with nonexistent data directory

2014-03-07 Thread Amit Kapila
On Fri, Mar 7, 2014 at 9:41 AM, Bruce Momjian br...@momjian.us wrote:
 On Fri, Mar  7, 2014 at 09:07:24AM +0530, Amit Kapila wrote:
 One reason could be that, as we are already returning a special exit code
 for the 'status' option of pg_ctl (exit(3)), this seems to be in line with it,
 as this also gets called during the status command.

 Also, in the link you sent in another mail, I could see a statement
 indicating it is more important to return correct codes in the status
 case, which sounds like a good reason to me.

 Link and statement
 http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net

 The status case is different, because there the exit code
 can be passed out by the init script directly.

 If we just want to handle correct exit codes for the status option, then
 maybe we need to refactor the code a bit to ensure the same.

 OK, done with the attached patch.  Three is returned for status, but one
 for other cases.

I think it would have been better if it returned status 4 for the cases where
the directory/file is not accessible (the new cases added by this patch).

I think status 3 should be returned only when the data directory is valid
and the pid file is missing, because after getting this code the next thing
a script will probably do is start the server, which doesn't seem a good
fit for the newly added cases.

Other than that, the patch seems fine to me.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] GSoC on WAL-logging hash indexes

2014-03-07 Thread Robert Haas
[ I just sent a reply to Greg Stark's email which touches on some of
the same points you mentioned here; that was mostly written last
night, and finished this morning before seeing this.  ]

On Fri, Mar 7, 2014 at 4:34 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Hmm. You suggested ensuring that a scan always has at least a pin, and split
 takes a vacuum-lock. That ought to work. There's no need for the more
 complicated maneuvers you described, ISTM that you can just replace the
 heavy-weight share lock with holding a pin on the primary page of the
 bucket, and an exclusive lock with a vacuum-lock. Note that
 _hash_expandtable already takes the exclusive lock conditionally, ie. if it
 doesn't get the lock immediately it just gives up. We could do the same with
 the cleanup lock.

We could try that.  I assume you mean do *just* what you describe
here, without the split-in-progress or moved-by-split flags I
suggested.  The only issue I see with that is that instead of everyone
piling up on the heavyweight lock, a wait which is interruptible,
they'd all pile up on the buffer content lwlock, a wait which isn't.
And splitting a bucket can involve an arbitrary number of I/O
operations, so that's kind of unappealing.  Even checkpoints would be
blocked until the bucket split completed, which seems unfortunate.

But I like the idea of teaching each scan to retain a pin on the
primary buffer page.  If we then do the split the way I proposed
before, we can implement the "outwait concurrent scans" step by taking
a cleanup lock on the primary buffer page.  In this design, we only
need to acquire and release the cleanup lock.  Once we get the cleanup
lock on the primary buffer page, even for an instant, we know that
there are no remaining scans in the bucket that need the pre-split
tuples that have now been moved to the new bucket.  We can then remove
tuples with a lesser lock (or keep the stronger lock if we wish to
re-pack).

 Vacuum could also be enhanced. It currently takes an exclusive lock on the
 bucket, then removes any dead tuples and finally squeezes the bucket by
 moving tuples to earlier pages. But you only really need the exclusive lock
 for the squeeze-phase. You could do the dead-tuple removal without the
 bucket-lock, and only grab for the squeeze-phase. And the squeezing is
 optional, so you could just skip that if you can't get the lock. But that's
 a separate patch as well.

Yeah, I think this would be a useful improvement.

 One more thing we could do to make hash indexes more scalable, independent
 of the above: Cache the information in the metapage in backend-private
 memory. Then you (usually) wouldn't need to access the metapage at all when
 scanning. Store a copy of the bitmask for that bucket in the primary page,
 and when scanning, check that it matches the cached value. If not, refresh
 the cached metapage and try again.

I don't know whether this would be a useful improvement.
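
For concreteness, the caching scheme described above might look roughly like this in pseudocode; every name here is invented for illustration, this is just the control flow:

```
/* Pseudocode sketch of the proposed backend-private metapage cache. */
if (cached_metapage == NULL)
    cached_metapage = read_and_copy_metapage();   /* one-time cost */

for (;;)
{
    bucket = bucket_for_key(hashkey, cached_metapage);
    buf    = read_primary_page(bucket);

    /* the primary page carries a copy of the bucket's bitmask */
    if (bitmask_on_page(buf) == bitmask_in_cache(cached_metapage, bucket))
        break;          /* cache still valid: proceed with the scan */

    /* a split invalidated our copy: refresh from the real metapage */
    release_buffer(buf);
    refresh_from_metapage(cached_metapage);
}
```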

 So, there seems to be a few fairly simple and independent improvements to be
 made:

 1. Replace the heavy-weight lock with pin & vacuum-lock.
 2. Make it crash-safe, by adding WAL-logging
 3. Only acquire the exclusive-lock (vacuum-lock after step 1) in VACUUM for
 the squeeze phase.
 4. Cache the metapage.

 We still don't know if it's going to be any better than B-tree after all
 that's done, but the only way to find out is to go ahead and implement it.

I'm of the opinion that we ought to start by making splits and
vacuuming more concurrent, and then do that stuff.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GSoC on WAL-logging hash indexes

2014-03-07 Thread Heikki Linnakangas

On 03/07/2014 03:48 PM, Robert Haas wrote:

So, there seems to be a few fairly simple and independent improvements to be
made:

1. Replace the heavy-weight lock with pin & vacuum-lock.
2. Make it crash-safe, by adding WAL-logging
3. Only acquire the exclusive-lock (vacuum-lock after step 1) in VACUUM for
the squeeze phase.
4. Cache the metapage.

We still don't know if it's going to be any better than B-tree after all
that's done, but the only way to find out is to go ahead and implement it.

I'm of the opinion that we ought to start by making splits and
vacuuming more concurrent, and then do that stuff.


Priorities are a matter of opinion, but note that for many applications 
the concurrency of splits and vacuuming is pretty much irrelevant (at 
least after we do bullet point 3 above). Like, if the index is mostly 
read-only, or at least doesn't grow much. You'd still want it to be 
crash-safe (2), and you might care a lot about the performance of scans 
(1 and 4), but if splits and vacuum hold an exclusive lock for a few 
seconds, that's OK if you only need to do it once in a blue moon.


- Heikki




Re: [HACKERS] GSoC on WAL-logging hash indexes

2014-03-07 Thread k...@rice.edu
On Thu, Mar 06, 2014 at 06:14:21PM -0500, Robert Haas wrote:
 On Thu, Mar 6, 2014 at 3:44 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 
  I've been tempted to implement a new type of hash index that allows both WAL
  and high concurrency, simply by disallowing bucket splits.  At the index
  creation time you use a storage parameter to specify the number of buckets,
  and that is that. If you mis-planned, build a new index with more buckets,
  possibly concurrently, and drop the too-small one.
 
 Yeah, we could certainly do something like that.  It sort of sucks,
 though.  I mean, it's probably pretty easy to know that starting with
 the default 2 buckets is not going to be enough; most people will at
 least be smart enough to start with, say, 1024.  But are you going to
 know whether you need 32768 or 1048576 or 33554432?  A lot of people
 won't, and we have more than enough reasons for performance to degrade
 over time as it is.
 
It would be useful to have a storage parameter for the target size of
the index, even if it is not exact, to use in the initial index build
to avoid the flurry of I/O caused by bucket splits as the index grows.

Regards,
Ken




Re: [HACKERS] Row-security on updatable s.b. views

2014-03-07 Thread Craig Ringer
On 03/07/2014 09:33 PM, Craig Ringer wrote:
 On 03/05/2014 11:02 AM, Craig Ringer wrote:
 The main known issue remaining is plan invalidation.
 
 I've pushed a version with a plan invalidation implementation. It's tagged:
 
   rls-9.4-upd-sb-views-v9
 
 in
 
   g...@github.com:ringerc/postgres.git
 
 The invalidation implementation does not yet handle foreign key checks;
 that will require additional changes. I'll push an update to the
 rls-9.4-upd-sb-views and post an update later, at which time I'll rebase
 the changes back into the history.

Well, that was easy. Done.

  rls-9.4-upd-sb-views-v11

and rebased the rls-9.4-upd-sb-views branch to incorporate the changes.

The regression tests have further failures, but some are due to changes
in the inheritance semantics. I'm going through them now.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] GSoC on WAL-logging hash indexes

2014-03-07 Thread Heikki Linnakangas

On 03/07/2014 03:48 PM, Robert Haas wrote:

On Fri, Mar 7, 2014 at 4:34 AM, Heikki Linnakangas
hlinnakan...@vmware.com  wrote:

Hmm. You suggested ensuring that a scan always has at least a pin, and split
takes a vacuum-lock. That ought to work. There's no need for the more
complicated maneuvers you described, ISTM that you can just replace the
heavy-weight share lock with holding a pin on the primary page of the
bucket, and an exclusive lock with a vacuum-lock. Note that
_hash_expandtable already takes the exclusive lock conditionally, ie. if it
doesn't get the lock immediately it just gives up. We could do the same with
the cleanup lock.

We could try that.  I assume you mean do *just* what you describe
here, without the split-in-progress or moved-by-split flags I
suggested.


Yep.


The only issue I see with that is that instead of everyone
piling up on the heavyweight lock, a wait which is interruptible,
they'd all pile up on the buffer content lwlock, a wait which isn't.
And splitting a bucket can involve an arbitrary number of I/O
operations, so that's kind of unappealing.  Even checkpoints would be
blocked until the bucket split completed, which seems unfortunate.


Hmm. I doubt that's a big deal in practice, although I agree it's a bit 
ugly.


Once we solve the crash-safety of splits, we actually have the option of 
doing the split in many small steps, even when there's no crash 
involved. You could for example grab the vacuum-lock, move all the 
tuples in the first 5 pages, and then release the lock to give other 
backends that are queued up a chance to do their scans/insertions. Then 
re-acquire the lock, and continue where you left off. Or just bail out and 
let the next vacuum or insertion finish it later.
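
Roughly, in pseudocode (the helper names and the batch size are invented; crash-safe WAL logging of each batch is assumed, per point 2 above):

```
/* Pseudocode for an interruptible, batched bucket split. */
while (tuples_remain_in_old_bucket())
{
    take_vacuum_lock(old_bucket);
    move_one_batch(old_bucket, new_bucket);   /* e.g. five pages' worth */
    release_vacuum_lock(old_bucket);
    /* queued scans and insertions get a chance to run here; bailing out
     * entirely just leaves the split for a later vacuum or insertion */
}
```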


- Heikki




Re: [HACKERS] extension_control_path

2014-03-07 Thread Stephen Frost
* Peter Eisentraut (pete...@gmx.net) wrote:
 On 2/27/14, 6:04 AM, Dimitri Fontaine wrote:
  What about allowing a control file like this:
  
 # hstore extension
 comment = 'data type for storing sets of (key, value) pairs'
 default_version = '1.3'
 directory = 'local/hstore-new'
 module_pathname = '$directory/hstore'
 relocatable = true
 
 I think your previously proposed patch to add extension_control_path
 plus my suggestion to update existing de facto best practices to not
 include $libdir into the module path name (thus allowing the use of
 dynamic_library_path) will address all desired use cases just fine.

You still haven't addressed how to deal with the case of multiple .so's
with the same name ending up in the dynamic_library_path.  I don't see
that as unlikely to end up happening either.

 Moreover, going that way would reuse existing facilities and concepts,
 remove indirections and reduce overall complexity.  This new proposal,
 on the other hand, would go the other way, introducing new concepts,
 adding more indirections, and increasing overall complexity, while
 actually achieving less.

Being able to have a self-contained module which requires a minimum of
modification to postgresql.conf is a reduction in complexity, imv.
Having to maintain two config options which will end up being overly
long and mostly duplicated doesn't make things easier for people.  This
has made me wonder if we could allow a control file to be explicitly
referred to from CREATE EXTENSION itself, dropping the need for any of
this postgresql.conf/GUC maintenance.  There are downsides to that
approach as well, of course, but it's definitely got a certain appeal.

 I see an analogy here.  What we are currently doing is similar to
 hardcoding absolute rpaths into all libraries.  Your proposal is
 effectively to (1) add the $ORIGIN mechanism and (2) make people use
 chrpath when they want to install somewhere else.  My proposal is to get
 rid of all rpaths and just set a search path.  Yes, on technical level,
 this is less powerful, but it's simpler and gets the job done and is
 harder to misuse.

I don't buy off on this analogy.  For starters, you can change the
control file without needing to rebuild the library, but the main
difference is that, in practice, there are no library transitions
happening and instead what we're likely to have are independent and
*incompatible* libraries living with the same name in our path.

 A problem with features like these is that they get rarely used but
 offer infinite flexibility, so they are not used consistently and you
 can't rely on anything.  This is already the case for the
 module_pathname setting in the control file.  It has, AFAICT, no actual
 use, and because of that no one uses it, and because of that, there is
 no guarantee that extensions use it sensibly, and because of that no one
 can ever make sensible use of it in the future, because there is no
 guarantee that extensions have it set sensibly.  In fact, I would
 propose deprecating module_pathname.

This makes sense when you have complete control over where things are
installed to and can drop the control file into the one true directory
of control files and similarly with the .so.  Indeed, that works
already today for certain platforms, but from what I understand, on the
OSX platform you don't really get to just dump files anywhere on the
filesystem that you want and instead end up forced into a specific
directory tree.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Florian Weimer

On 03/07/2014 06:03 AM, Tom Lane wrote:


In the bug thread I proposed making atexit_callback check whether getpid()
still matches MyProcPid.  If it doesn't, then presumably we inherited the
atexit callback list, along with the value of MyProcPid, from some parent
backend process whose elbow we should not joggle.  Can anyone see a flaw
in that?


There's the PID reuse problem.  Forking twice (with a delay) could end 
up with the same PID as MyProcPid.  Comparing the process start time 
would protect against that.  Checking getppid() would have the same 
theoretical problem.


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] pg_ctl status with nonexistent data directory

2014-03-07 Thread Bruce Momjian
On Fri, Mar  7, 2014 at 07:18:04PM +0530, Amit Kapila wrote:
  OK, done with the attached patch.  Three is returned for status, but one
  for other cases.
 
 I think it would have been better if it returned status 4 for the cases where
 the directory/file is not accessible (the new cases added by this patch).
 
 I think status 3 should be returned only when the data directory is valid
 and the pid file is missing, because after getting this code the next thing
 a script will probably do is start the server, which doesn't seem a good
 fit for the newly added cases.

OK, I have updated the attached patch to reflect this, and added a C
comment.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
new file mode 100644
index 0dbaa6e..56d238f
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*** static bool allow_core_files = false;
*** 97,102 
--- 97,103 
  static time_t start_time;
  
  static char postopts_file[MAXPGPATH];
+ static char version_file[MAXPGPATH];
  static char pid_file[MAXPGPATH];
  static char backup_file[MAXPGPATH];
  static char recovery_file[MAXPGPATH];
*** static void pgwin32_doRunAsService(void)
*** 152,158 
  static int	CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo, bool as_service);
  #endif
  
! static pgpid_t get_pgpid(void);
  static char **readfile(const char *path);
  static void free_readfile(char **optlines);
  static int	start_postmaster(void);
--- 153,159 
  static int	CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo, bool as_service);
  #endif
  
! static pgpid_t get_pgpid(bool is_status_request);
  static char **readfile(const char *path);
  static void free_readfile(char **optlines);
  static int	start_postmaster(void);
*** print_msg(const char *msg)
*** 246,255 
  }
  
  static pgpid_t
! get_pgpid(void)
  {
  	FILE	   *pidf;
  	long		pid;
  
  	pidf = fopen(pid_file, "r");
  	if (pidf == NULL)
--- 247,280 
  }
  
  static pgpid_t
! get_pgpid(bool is_status_request)
  {
  	FILE	   *pidf;
  	long		pid;
+ 	struct stat statbuf;
+ 
+ 	if (stat(pg_data, &statbuf) != 0)
+ 	{
+ 		if (errno == ENOENT)
+ 			printf(_("%s: directory \"%s\" does not exist\n"), progname,
+ 				   pg_data);
+ 		else
+ 			printf(_("%s: cannot access directory \"%s\"\n"), progname,
+ 				   pg_data);
+ 		/*
+ 		 * The Linux Standard Base Core Specification 3.1 says this should return
+ 		 * '4, program or service status is unknown'
+ 		 * https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
+ 		 */
+ 		exit(is_status_request ? 4 : 1);
+ 	}
+ 
+ 	if (stat(version_file, &statbuf) != 0 && errno == ENOENT)
+ 	{
+ 		printf(_("%s: directory \"%s\" is not a database cluster directory\n"),
+ 			   progname, pg_data);
+ 		exit(is_status_request ? 4 : 1);
+ 	}
  
  	pidf = fopen(pid_file, "r");
  	if (pidf == NULL)
*** do_start(void)
*** 810,816 
  
  	if (ctl_command != RESTART_COMMAND)
  	{
! 		old_pid = get_pgpid();
  		if (old_pid != 0)
  			write_stderr(_("%s: another server might be running; "
  						   "trying to start server anyway\n"),
--- 835,841 
  
  	if (ctl_command != RESTART_COMMAND)
  	{
! 		old_pid = get_pgpid(false);
  		if (old_pid != 0)
  			write_stderr(_("%s: another server might be running; "
  						   "trying to start server anyway\n"),
*** do_stop(void)
*** 894,900 
  	pgpid_t		pid;
  	struct stat statbuf;
  
! 	pid = get_pgpid();
  
  	if (pid == 0)/* no pid file */
  	{
--- 919,925 
  	pgpid_t		pid;
  	struct stat statbuf;
  
! 	pid = get_pgpid(false);
  
  	if (pid == 0)/* no pid file */
  	{
*** do_stop(void)
*** 943,949 
  
  		for (cnt = 0; cnt < wait_seconds; cnt++)
  		{
! 			if ((pid = get_pgpid()) != 0)
  			{
  				print_msg(".");
  				pg_usleep(1000000);		/* 1 sec */
--- 968,974 
  
  		for (cnt = 0; cnt < wait_seconds; cnt++)
  		{
! 			if ((pid = get_pgpid(false)) != 0)
  			{
  				print_msg(".");
  				pg_usleep(1000000);		/* 1 sec */
*** do_restart(void)
*** 980,986 
  	pgpid_t		pid;
  	struct stat statbuf;
  
! 	pid = get_pgpid();
  
  	if (pid == 0)/* no pid file */
  	{
--- 1005,1011 
  	pgpid_t		pid;
  	struct stat statbuf;
  
! 	pid = get_pgpid(false);
  
  	if (pid == 0)/* no pid file */
  	{
*** do_restart(void)
*** 1033,1039 
  
  		for (cnt = 0; cnt < wait_seconds; cnt++)
  		{
! 			if ((pid = get_pgpid()) != 0)
  			{
  				print_msg(".");
  				pg_usleep(1000000);		/* 1 sec */
--- 1058,1064 
  
  		for (cnt = 0; cnt < wait_seconds; cnt++)
  		{
! 			if ((pid = get_pgpid(false)) != 0)
  			{
  				print_msg(".");
  				pg_usleep(1000000);		/* 1 sec */
*** do_reload(void)
*** 1071,1077 
  {
  	pgpid_t		pid;
  
! 	pid = get_pgpid();
  	if (pid == 0)/* 

Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Heikki Linnakangas

On 03/07/2014 04:23 PM, Florian Weimer wrote:

On 03/07/2014 06:03 AM, Tom Lane wrote:


In the bug thread I proposed making atexit_callback check whether getpid()
still matches MyProcPid.  If it doesn't, then presumably we inherited the
atexit callback list, along with the value of MyProcPid, from some parent
backend process whose elbow we should not joggle.  Can anyone see a flaw
in that?


There's the PID reuse problem.  Forking twice (with a delay) could end
up with the same PID as MyProcPid.


Not if the parent process is still running.

- Heikki




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Andres Freund
On 2014-03-07 00:03:48 -0500, Tom Lane wrote:
 In the bug thread I proposed making atexit_callback check whether getpid()
 still matches MyProcPid.

What are you proposing to do in that case? This is only one of the
failure cases of forking carelessly, right?
I think the only appropriate thing would be to make as much noise as
possible in that case, which is probably something like writing a
message to stderr, and then abort().

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Row-security on updatable s.b. views

2014-03-07 Thread Craig Ringer
On 03/07/2014 10:07 PM, Craig Ringer wrote:
 On 03/07/2014 09:33 PM, Craig Ringer wrote:
 On 03/05/2014 11:02 AM, Craig Ringer wrote:
 The main known issue remaining is plan invalidation.

 I've pushed a version with a plan invalidation implementation. It's tagged:

   rls-9.4-upd-sb-views-v9

 in

   g...@github.com:ringerc/postgres.git

 The invalidation implementation does not yet handle foreign key checks;
 that will require additional changes. I'll push an update to the
 rls-9.4-upd-sb-views and post an update later, at which time I'll rebase
 the changes back into the history.
 
 Well, that was easy. Done.
 
   rls-9.4-upd-sb-views-v11
 
 and rebased the rls-9.4-upd-sb-views branch to incorporate the changes.
 
 The regression tests have further failures, but some are due to changes
 in the inheritance semantics. I'm going through them now.
 


Need a quick opinion.

KaiGai's original code produced a plan like this for an inheritance set:

  EXPLAIN (costs off) SELECT * FROM t1 WHERE f_leak(b) FOR SHARE;
!            QUERY PLAN
! ---------------------------------
   LockRows
!    ->  Append
!          ->  Subquery Scan on t1
!                Filter: f_leak(t1.b)
!                ->  Seq Scan on t1 t1_1
!                      Filter: ((a % 2) = 0)
!          ->  Subquery Scan on t2
!                Filter: f_leak(t2.b)
!                ->  Seq Scan on t2 t2_1
!                      Filter: ((a % 2) = 1)
!          ->  Seq Scan on t3
!                Filter: f_leak(b)
  (12 rows)



The new code, using updatable s.b. views, instead produces:


  EXPLAIN (costs off) SELECT * FROM t1 WHERE f_leak(b) FOR SHARE;
!                  QUERY PLAN
! ---------------------------------------------
   LockRows
!    ->  Subquery Scan on t1
!          Filter: f_leak(t1.b)
!          ->  LockRows
!                ->  Result
!                      ->  Append
!                            ->  Seq Scan on t1 t1_1
!                                  Filter: ((a % 2) = 0)
!                            ->  Seq Scan on t2
!                                  Filter: ((a % 2) = 0)
!                            ->  Seq Scan on t3
!                                  Filter: ((a % 2) = 0)
  (12 rows)



The different quals are expected, because of the change to the
definition of inheritance handling in row security.

What I'm concerned about is the locking. It looks to me like we're
causing the user to lock rows that they may not intend to lock, by
applying a LockRows step *before* the user supplied qual. (I'm going to
test that tomorrow, it's sleep time in Australia).

This seems to be related to RowMark handling in updatable security
barrier views. I need to check whether it happens with updates to
security barrier views, as well as with row security.

Dean, any thoughts?



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-03-07 00:03:48 -0500, Tom Lane wrote:
 In the bug thread I proposed making atexit_callback check whether getpid()
 still matches MyProcPid.

 What are you proposing to do in that case? This is only one of the
 failure cases of forking carelessly, right?

No, I think it should do nothing.  The coding pattern shown in bug #9464
seems perfectly reasonable and I think we should allow it.  No doubt it's
safer if the child process does an on_exit_reset; but right now, if the
child fails to do so, atexit_callback is actively breaking things.  And
I don't think we can rely on third-party libraries to call on_exit_reset
after forking.

regards, tom lane




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 03/07/2014 04:23 PM, Florian Weimer wrote:
 There's the PID reuse problem.  Forking twice (with a delay) could end
 up with the same PID as MyProcPid.

 Not if the parent process is still running.

If the original parent backend is *not* still running, then running
atexit_callback in the grandchild is just as dangerous if not more so;
it could be clobbering shared-memory state belonging to some other
session that has recycled the same PGPROC.

I think Florian's right that there's a risk there, but it seems pretty
remote, and I don't see any reliable way to detect the case anyhow.
(Process start time?  Where would you get that from portably?)
It's not a reason not to do something about the much larger chance of
this happening in a direct child process, which certainly won't have a
matching PID.

regards, tom lane




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Florian Weimer

On 03/07/2014 03:57 PM, Tom Lane wrote:


I think Florian's right that there's a risk there, but it seems pretty
remote, and I don't see any reliable way to detect the case anyhow.
(Process start time?  Where would you get that from portably?)


I don't think there's a portable source for that.  On Linux, you'd have 
to use /proc.



It's not a reason not to do something about the much larger chance of
this happening in a direct child process, which certainly won't have a
matching PID.


Indeed.  Checking getppid() in addition might narrow things down further.

On Linux, linking against pthread_atfork currently requires linking 
against pthread (although this is about to change), and it might incur 
the pthread-induced overhead on some configurations.


--
Florian Weimer / Red Hat Product Security Team




[HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Tom Lane
I just noticed that the DSM patch has introduced a whole new class of
failures related to the bug #9464 issue: to wit, any on_detach
actions registered in a parent process will also be performed when a
child process exits, because nothing has been added to on_exit_reset
to prevent that.  It seems likely that this is undesirable.

regards, tom lane




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Andres Freund
On 2014-03-07 09:49:05 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-03-07 00:03:48 -0500, Tom Lane wrote:
  In the bug thread I proposed making atexit_callback check whether getpid()
  still matches MyProcPid.
 
  What are you proposing to do in that case? This is only one of the
  failure cases of forking carelessly, right?
 
 No, I think it should do nothing.  The coding pattern shown in bug #9464
 seems perfectly reasonable and I think we should allow it.  No doubt it's
 safer if the child process does an on_exit_reset; but right now, if the
 child fails to do so, atexit_callback is actively breaking things.  And
 I don't think we can rely on third-party libraries to call on_exit_reset
 after forking.

I don't think it's a reasonable pattern to run background processes that
aren't managed by postgres with all shared memory still
accessible. You'll have to either also detach from shared memory and
related things, or you have to fork() and exec(). At the very least, not
integrating the child with the postmaster's lifetime will prevent
postgres from restarting because there's still a child attached to the
shared memory.
I don't see any way it'd be safe for a pg unaware library to start
forking around and doing similar random things inside normal
backends.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Tom Lane
Florian Weimer fwei...@redhat.com writes:
 On 03/07/2014 03:57 PM, Tom Lane wrote:
 It's not a reason not to do something about the much larger chance of
 this happening in a direct child process, which certainly won't have a
 matching PID.

 Indeed.  Checking getppid() in addition might narrow things down further.

I don't think getppid adds much to the party.  In particular, what to
do if it returns 1?  You can't tell if you're an orphaned backend (in
which case you should still do normal shutdown) or an orphaned
grandchild.  The standalone-backend case would also complicate matters.

regards, tom lane




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Florian Weimer

On 03/07/2014 04:10 PM, Tom Lane wrote:

Florian Weimer fwei...@redhat.com writes:

On 03/07/2014 03:57 PM, Tom Lane wrote:

It's not a reason not to do something about the much larger chance of
this happening in a direct child process, which certainly won't have a
matching PID.



Indeed.  Checking getppid() in addition might narrow things down further.


I don't think getppid adds much to the party.  In particular, what to
do if it returns 1?  You can't tell if you're an orphaned backend (in
which case you should still do normal shutdown)


Oh.  I didn't know that orphaned backends perform a normal shutdown.

--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-03-07 09:49:05 -0500, Tom Lane wrote:
 No, I think it should do nothing.  The coding pattern shown in bug #9464
 seems perfectly reasonable and I think we should allow it.

 I don't think it's a reasonable pattern to run background processes that
 aren't managed by postgres with all shared memory still
 accessible. You'll have to either also detach from shared memory and
 related things, or you have to fork() and exec().

The code in question is trying to do that.  And what do you think will
happen if the exec() fails?

 At the very least, not
 integrating the child with the postmaster's lifetime will prevent
 postgres from restarting because there's still a child attached to the
 shared memory.

I think you're willfully missing the point.  The reason we added
atexit_callback was to try to defend ourselves against third-party code
that did things in a non-Postgres-aware way.  Arguing that such code
should do things in a Postgres-aware way is not helpful for the concerns
here, and it's not relevant to reality either, because people will load
stuff like libperl into backends.  Good luck getting a post-fork
on_exit_reset() call into libperl.

regards, tom lane




Re: [HACKERS] Unportable coding in reorderbuffer.h

2014-03-07 Thread Andres Freund
Hi,

On 2014-03-06 02:39:37 +0100, Andres Freund wrote:
 Unless somebody protests I'll try to get the remaining walsender and
 docs patches ready before sending in the patch fixing this as it's not
 disturbing the buildfarm. I'll be onsite/travelling tomorrow; so I am
 not sure I'll be done then, but friday it is.

A patch fixing this is attached. I've added some more local variables to
deal with the longer lines...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 875eda2d163e38e6533bf6cf8e6e6417aad0421b Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 6 Mar 2014 02:00:47 +0100
Subject: [PATCH] Remove unportable use of anonymous unions from
 reorderbuffer.h.

In b89e151054a I had assumed it was ok to use anonymous unions as
struct members, but while a longstanding extension in many compilers,
it's only been standardized in C11.

To fix, remove one of the anonymous unions which tried to hide some
implementation specific enum values and give the other a name. The
latter unfortunately requires changes in output plugins, but since the
feature has only been added a few days ago...

Per complaint from Tom Lane.
---
 contrib/test_decoding/test_decoding.c   |  18 +-
 src/backend/replication/logical/decode.c|  28 +--
 src/backend/replication/logical/reorderbuffer.c | 283 
 src/include/replication/reorderbuffer.h |  39 ++--
 4 files changed, 186 insertions(+), 182 deletions(-)

diff --git a/contrib/test_decoding/test_decoding.c b/contrib/test_decoding/test_decoding.c
index ea463fb..e356c7c 100644
--- a/contrib/test_decoding/test_decoding.c
+++ b/contrib/test_decoding/test_decoding.c
@@ -358,43 +358,45 @@ pg_decode_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
 			appendStringInfoString(ctx->out, " INSERT:");
-			if (change->tp.newtuple == NULL)
+			if (change->data.tp.newtuple == NULL)
 				appendStringInfoString(ctx->out, " (no-tuple-data)");
 			else
 				tuple_to_stringinfo(ctx->out, tupdesc,
-									&change->tp.newtuple->tuple,
+									&change->data.tp.newtuple->tuple,
 									false);
 			break;
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			appendStringInfoString(ctx->out, " UPDATE:");
-			if (change->tp.oldtuple != NULL)
+			if (change->data.tp.oldtuple != NULL)
 			{
 				appendStringInfoString(ctx->out, " old-key:");
 				tuple_to_stringinfo(ctx->out, tupdesc,
-									&change->tp.oldtuple->tuple,
+									&change->data.tp.oldtuple->tuple,
 									true);
 				appendStringInfoString(ctx->out, " new-tuple:");
 			}
 
-			if (change->tp.newtuple == NULL)
+			if (change->data.tp.newtuple == NULL)
 				appendStringInfoString(ctx->out, " (no-tuple-data)");
 			else
 				tuple_to_stringinfo(ctx->out, tupdesc,
-									&change->tp.newtuple->tuple,
+									&change->data.tp.newtuple->tuple,
 									false);
 			break;
 		case REORDER_BUFFER_CHANGE_DELETE:
 			appendStringInfoString(ctx->out, " DELETE:");
 
 			/* if there was no PK, we only know that a delete happened */
-			if (change->tp.oldtuple == NULL)
+			if (change->data.tp.oldtuple == NULL)
 				appendStringInfoString(ctx->out, " (no-tuple-data)");
 			/* In DELETE, only the replica identity is present; display that */
 			else
 				tuple_to_stringinfo(ctx->out, tupdesc,
-									&change->tp.oldtuple->tuple,
+									&change->data.tp.oldtuple->tuple,
 									true);
 			break;
+		default:
+			Assert(false);
 	}
 
 	MemoryContextSwitchTo(old);
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index af3948e..414cfa9 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -586,17 +586,17 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 
 	change = ReorderBufferGetChange(ctx->reorder);
 	change->action = REORDER_BUFFER_CHANGE_INSERT;
-	memcpy(&change->tp.relnode, &xlrec->target.node, sizeof(RelFileNode));
+	memcpy(&change->data.tp.relnode, &xlrec->target.node, sizeof(RelFileNode));
 
 	if (xlrec->flags & XLOG_HEAP_CONTAINS_NEW_TUPLE)
 	{
 		Assert(r->xl_len > (SizeOfHeapInsert + SizeOfHeapHeader));
 
-		change->tp.newtuple = ReorderBufferGetTupleBuf(ctx->reorder);
+		change->data.tp.newtuple = ReorderBufferGetTupleBuf(ctx->reorder);
 
 		DecodeXLogTuple((char *) xlrec + SizeOfHeapInsert,
 						r->xl_len - SizeOfHeapInsert,
-						change->tp.newtuple);
+						change->data.tp.newtuple);
 	}
 
 	ReorderBufferQueueChange(ctx->reorder, r->xl_xid, buf->origptr, change);
@@ -626,7 +626,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 
 	change = ReorderBufferGetChange(ctx->reorder);
 	change->action = REORDER_BUFFER_CHANGE_UPDATE;
-	memcpy(&change->tp.relnode, &xlrec->target.node, sizeof(RelFileNode));
+	memcpy(&change->data.tp.relnode, &xlrec->target.node, sizeof(RelFileNode));
 
 	data = (char *) &xlhdr->header;
 
@@ -634,11 +634,11 @@ DecodeUpdate(LogicalDecodingContext *ctx, 

Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Andres Freund
Hi,

On 2014-03-07 10:24:31 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-03-07 09:49:05 -0500, Tom Lane wrote:
  No, I think it should do nothing.  The coding pattern shown in bug #9464
  seems perfectly reasonable and I think we should allow it.
 
  I don't think it's a reasonable pattern to run background processes that
  aren't managed by postgres with all shared memory still
  accessible. You'll have to either also detach from shared memory and
  related things, or you have to fork() and exec().
 
 The code in question is trying to do that.  And what do you think will
 happen if the exec() fails?

In code that's written properly, not much. It will use _exit() after the
exec() failure. That's pretty established best practice after forking,
especially after an exec() failure.

  At the very least, not
  integrating the child with the postmaster's lifetime will prevent
  postgres from restarting because there's still a child attached to the
  shared memory.
 
 I think you're willfully missing the point.  The reason we added
 atexit_callback was to try to defend ourselves against third-party code
 that did things in a non-Postgres-aware way.

Hey, I am not arguing that we should remove protection, I am saying that
we should make it more obvious that dangerous stuff happens. That way
people can fix stuff during development rather than finding out stuff
breaks in some corner cases on busy live systems.

 Arguing that such code
 should do things in a Postgres-aware way is not helpful for the concerns
 here, and it's not relevant to reality either, because people will load
 stuff like libperl into backends.  Good luck getting a post-fork
 on_exit_reset() call into libperl.

libperl won't fork on its own, without the user telling it to do so,
and code that forks can very well be careful to do POSIX::_exit(),
especially in case an exec fails. I don't have much problem telling
people that a fork() without a direct exec() afterwards simply isn't
supported from PLs.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] jsonb and nested hstore

2014-03-07 Thread Merlin Moncure
On Thu, Mar 6, 2014 at 10:33 PM, Bruce Momjian br...@momjian.us wrote:
 On Thu, Mar  6, 2014 at 09:50:56PM +0400, Oleg Bartunov wrote:
 Hi there,

 Looks like consensus is done. I and Teodor are not happy with it, but
 what we can do :)   One thing I  want to do is to reserve our
 contribution to the flagship feature (jsonb), particularly, binary
 storage for nested structures and indexing. Their work was sponsored
 by Engine Yard.

 OK, if we are going with an unchanged hstore in contrib and a new JSONB,
 there is no reason to whack around JSONB to be binary compatible with the
 old hstore format.  What sacrifices did we need to make to have JSONB
 be binary compatible with hstore, can those sacrifices be removed, and
 can that be done in time for 9.4?

Also,
*) what hstore2 features (if any) not already reflected in
the jsonb type are going to be moved to jsonb for 9.4?
*) if the answer above is anything but 'nothing', what hstore-isms are
going to be adjusted in the process of doing so?  Presumably there
would be some function name changes to put them in the jsonb style but
also the hstore syntax ('=>') is going to be embedded in some of the
search operators and possibly other things.  Is that going to change?

merlin




Re: [HACKERS] jsonb and nested hstore

2014-03-07 Thread Andrew Dunstan


On 03/06/2014 11:33 PM, Bruce Momjian wrote:

On Thu, Mar  6, 2014 at 09:50:56PM +0400, Oleg Bartunov wrote:

Hi there,

Looks like consensus is done. I and Teodor are not happy with it, but
what we can do :)   One thing I  want to do is to reserve our
contribution to the flagship feature (jsonb), particularly, binary
storage for nested structures and indexing. Their work was sponsored
by Engine Yard.

OK, if we are going with an unchanged hstore in contrib and a new JSONB,
there is no reason to whack around JSONB to be binary compatible with the
old hstore format.  What sacrifices did we need to make to have JSONB
be binary compatible with hstore, can those sacrifices be removed, and
can that be done in time for 9.4?




IIRC The sacrifice was one bit in the header (i.e. in the first int 
after the varlena header). We could now repurpose that (for example if 
we ever decided to use a new format).


Oleg and Teodor made most of the adjustments on the hstore(2) side (e.g. 
providing for scalar roots, providing for json typing of scalars so 
everything isn't just a string).


Can the architecture be changed? No. If we think it's not good enough we 
would have to kiss jsonb goodbye for 9.4 and go back to the drawing 
board. But I haven't seen any such suggestion from anyone who has been 
reviewing it (e.g. Andres or Peter).


cheers

andrew




Re: [HACKERS] extension_control_path

2014-03-07 Thread Dimitri Fontaine
Hi,

Peter Eisentraut pete...@gmx.net writes:
 On 2/27/14, 6:04 AM, Dimitri Fontaine wrote:
directory = 'local/hstore-new'
module_pathname = '$directory/hstore'

 I think your previously proposed patch to add extension_control_path
 plus my suggestion to update existing de facto best practices to not
 include $libdir into the module path name (thus allowing the use of
 dynamic_library_path) will address all desired use cases just fine.

My opinion is that we have two choices: refactoring the current API or
incrementally improving it. In both cases we should make it possible for
the packager to control where any individual module file is loaded from,
with maybe options for the sysadmin to override the packager's choice.

In your proposal, the control moves away from the developer, and that's
a good thing, so you get a +1 from me.

Just please make sure that it's still possible to use full absolute path
for the module path name so that the packager can have control too.

 Moreover, going that way would reuse existing facilities and concepts,
 remove indirections and reduce overall complexity.  This new proposal,
 on the other hand, would go the other way, introducing new concepts,
 adding more indirections, and increasing overall complexity, while
 actually achieving less.

What the $directory proposal achieves is allowing a fully relocatable
extension layout, where you just have to drop a directory anywhere in
the file system and it just works (*).

It just works and makes it easy to control which module is loaded,
without having to set up LD_LIBRARY_PATH, ld.so.conf, or our own
dynamic_library_path.

  * providing that said directory is part of extension_control_path, or
that you copy or move the .control file to sharedir/extension.
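[A concrete sketch of that relocatable layout, under the proposal being discussed; the `$directory` substitution and `extension_control_path` are proposed features, not shipped ones, and every path below is made up for illustration:]

```shell
# Hypothetical relocatable extension tree that can be dropped anywhere
# the control path can see.
EXTDIR=/tmp/pgext/hstore-new
mkdir -p "$EXTDIR"

# The control file points the module at its own directory:
cat > "$EXTDIR/hstore.control" <<'EOF'
comment = 'data type for storing sets of (key, value) pairs'
default_version = '1.3'
directory = '/tmp/pgext/hstore-new'
module_pathname = '$directory/hstore'
EOF

# postgresql.conf would then only need (proposed GUC, not shipped):
#   extension_control_path = '$system:/tmp/pgext/hstore-new'
grep '^module_pathname' "$EXTDIR/hstore.control"
```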

That said, I don't intend to be using it myself, so I won't try and save
that patch in any ways. My position is that Stephen's concern is real
and his idea simple enough while effective, so worth pursuing.

 I see an analogy here.  What we are currently doing is similar to
 hardcoding absolute rpaths into all libraries.  Your proposal is
 effectively to (1) add the $ORIGIN mechanism and (2) make people use
 chrpath when they want to install somewhere else.  My proposal is to get
 rid of all rpaths and just set a search path.  Yes, on technical level,
 this is less powerful, but it's simpler and gets the job done and is
 harder to misuse.

What happens if you have more than one 'prefix.so' file in your path?

 A problem with features like these is that they get rarely used but
 offer infinite flexibility, so they are not used consistently and you
 can't rely on anything.  This is already the case for the
 module_pathname setting in the control file.  It has, AFAICT, no actual
 use, and because of that no one uses it, and because of that, there is
 no guarantee that extensions use it sensibly, and because of that no one
 can ever make sensible use of it in the future, because there is no
 guarantee that extensions have it set sensibly.  In fact, I would
 propose deprecating module_pathname.

The module_pathname facility allows the packager to decide where the
library module file gets installed and the extension author not to
concern himself with that choice.

I agree that using $libdir as the extension developer isn't the right
thing to do. Having to choose the installation path as a developer,
either in the SQL script or in the control file, is not the right thing.

Now, the practical answer I have to that point is to have the packager
rewrite the control file as part of its build system.

My vote goes against deprecating module_pathname, because I didn't see
in your proposal any ways to offer the control back to the packager,
only to the sysadmin, and I don't want to have the sysadmin involved if
we can avoid it (as you say, too much flexibility is a killer).

In practical terms, though, given the current situation, the build system
I'm working on already has to edit the SQL scripts and control files
anyway…

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] jsonb and nested hstore

2014-03-07 Thread Bruce Momjian
On Fri, Mar  7, 2014 at 11:35:41AM -0500, Andrew Dunstan wrote:
 IIRC The sacrifice was one bit in the header (i.e. in the first int
 after the varlena header). We could now repurpose that (for example
 if we ever decided to use a new format).
 
 Oleg and Teodor made most of the adjustments on the hstore(2) side
 (e.g. providing for scalar roots, providing for json typing of
 scalars so everything isn't just a string).
 
 Can the architecture be changed? No. If we think it's not good
 enough we would have to kiss jsonb goodbye for 9.4 and go back to
 the drawing board. But I haven't seen any such suggestion from
 anyone who has been reviewing it (e.g. Andres or Peter).

We are going to be stuck with the JSONB binary format we ship in 9.4 so
I am asking if there are things we should do to improve it, now that we
know we don't need backward compatibility.

If they can be done for 9.4, great, if not, we have to decide if these
suboptimal cases are enough for us to delay the data type until 9.5.  I
don't know the answer, but I have to ask the question.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] jsonb and nested hstore

2014-03-07 Thread Andrew Dunstan


On 03/07/2014 11:45 AM, Bruce Momjian wrote:

On Fri, Mar  7, 2014 at 11:35:41AM -0500, Andrew Dunstan wrote:

IIRC The sacrifice was one bit in the header (i.e. in the first int
after the varlena header). We could now repurpose that (for example
if we ever decided to use a new format).

Oleg and Teodor made most of the adjustments on the hstore(2) side
(e.g. providing for scalar roots, providing for json typing of
scalars so everything isn't just a string).

Can the architecture be changed? No. If we think it's not good
enough we would have to kiss jsonb goodbye for 9.4 and go back to
the drawing board. But I haven't seen any such suggestion from
anyone who has been reviewing it (e.g. Andres or Peter).

We are going to be stuck with the JSONB binary format we ship in 9.4 so
I am asking if there are things we should do to improve it, now that we
know we don't need backward compatibility.

If they can be done for 9.4, great, if not, we have to decide if these
suboptimal cases are enough for us to delay the data type until 9.5.  I
don't know the answer, but I have to ask the question.



AFAIK, there is no sacrifice of optimality. hstore2 and jsonb were 
essentially two ways of spelling the same data, the domains were 
virtually identical (hstore might have been a bit more liberal about 
numeric input).


cheers

andrew




Re: [HACKERS] jsonb and nested hstore

2014-03-07 Thread Bruce Momjian
On Fri, Mar  7, 2014 at 11:57:48AM -0500, Andrew Dunstan wrote:
 If they can be done for 9.4, great, if not, we have to decide if these
 suboptimal cases are enough for us to delay the data type until 9.5.  I
 don't know the answer, but I have to ask the question.
 
 
 AFAIK, there is no sacrifice of optimality. hstore2 and jsonb were
 essentially two ways of spelling the same data, the domains were
 virtually identical (hstore might have been a bit more liberal about
 numeric input).

OK, it sounds like the adjustments are minimal, like not using the
high-order bit.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-03-07 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 On 03/06/2014 02:58 AM, Tom Lane wrote:
 The PlanInvalItem could perfectly well be generated by the planner,
 no, if it has the user OID?  But I'm not real sure why you need it.
 I don't see the reason for an invalidation triggered by user ID.
 What exactly about the *user*, and not something else, would trigger
 plan invalidation?

 It's only that the plan depends on the user ID. There's no point keeping
 the plan around if the user no longer exists.

[ shrug... ]  Leaving such a plan cached would be harmless, though.
Furthermore, the user ID we'd be talking about is either the owner
of the current session, or the owner of some view or security-definer
function that the plan is already dependent on, so it's fairly hard
to credit that the plan would survive long enough for the issue to
arise.

Even if there is a scenario where invalidating by user ID is actually
useful, I think adding infrastructure to cause invalidation in such a case
is optimizing for the wrong thing.  You're adding cycles to every query to
benefit a case that is going to be quite infrequent in practice.

 What we do need is a notion that a plan cache entry might only be
 valid for a specific calling user ID; but that's a matter for cache
 entry lookup not for subsequent invalidation.

 Yes, that would be good, but is IMO more of a separate optimization. I'm
 currently using KaiGai's code to invalidate and re-plan when a user ID
 change is detected.

I'm unlikely to accept a patch that does that; wouldn't it be catastrophic
for performance in the presence of security-definer functions?  You can't
just trash the whole plan cache when a user ID switch occurs.

regards, tom lane




[HACKERS] What the heck is hobby.2ndquadrant.com?

2014-03-07 Thread Tom Lane
There are a few threads floating around currently in which some of
the cc'd addresses are various-people at hobby.2ndquadrant.com.
Addresses like that seem not to work, or at least not work reliably.
I believe the official addresses are just soandso at 2ndquadrant.com.

If you're replying to any such threads, you might want to clean
up the addresses so you don't get a bunch of bounce messages.

In any case, I thought the 2ndquadrant people would want to know that
there's something broken in their mail system.  I'm not sure how these
addresses got injected into our threads in the first place, but I bet
it wasn't done by anybody outside 2ndquadrant.  Perhaps an internal
mail server name was unintentionally exposed?

regards, tom lane




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Peter LaDow
Sorry for the bit of top-posting, but I wanted to make some things
clear.  Also, I wasn't subscribed to pgsql-hackers at the time this
thread began, so I apologize for the missing headers that might cause
threading issues.

I'm the submitter of bug #9464.  Here's the background on what we are
doing.  We are on a limited resource, embedded device.  We make use of
the database as an event driven system.  In the case of an incoming
event, such as a settings change, we make use of this fork/exec
procedure to spawn an asynchronous process to handle certain events.
Some of these external processes are long running, some need to run
outside the current transaction context, and some we don't care about
the result--we just want it to run.

Also, the on_exit_reset() does clear up the issue, and we have
implemented it as suggested in this thread (calling it immediately
after the fork() in the child).  And Andres' keen eye noted we were
calling exit() after an exec() failure, rather than _exit(). Thank
you, Andres, for pointing out this error.

Andres Freund andres at 2ndquadrant.com writes:
 On 2014-03-07 09:49:05 -0500, Tom Lane wrote:
  Andres Freund andres at 2ndquadrant.com writes:
   What are you proposing to do in that case? This is only one of the
   failure cases of forking carelessly, right?

Just to be clear, we intended to fork careFULLy, not careLESSly.
Hence the reason for the double fork with an eventual exec().  We
intended to be clearly separated from the backend and executing in our
own clean, unrelated context.

 I don't think it's a reasonable pattern to run background processes that
 aren't managed by postgres with all shared memory still
 accessible. You'll have to either also detach from shared memory and

In this case we definitely did NOT want to be managed by postgres.
Hence the double fork.  The intent was that the first level child
would NOT be managed by postgres, hence the exit().

 related things, or you have to fork() and exec(). At the very least, not

Which is _exactly_ what we were trying to do.  The first fork was only
so that we could fork again and spawn a subprocess completely detached
from the postmaster.  And also to have something for the postmaster to
reap via waitpid and prevent a zombie from hanging around.  The
typical daemon stuff.

 integrating the child with the postmaster's lifetime will prevent
 postgres from restarting because there's still a child attached to the
 shared memory.

Which is _exactly_ what we were NOT trying to do.  We did not want to
integrate with postmaster.

 I don't see any way it'd be safe for a pg unaware library to start
 forking around and doing similar random things inside normal
 backends.

We aren't exactly forking around.  We were trying to spawn an
asynchronous process without any ties to the postmaster.  This was
expected to be well-defined, clean behavior.  A fork/exec isn't an
unusual thing to do, and we didn't expect that exiting a child would
invoke behavior that would cause problems.

Thanks,

Pete




Re: [HACKERS] Row-security on updatable s.b. views

2014-03-07 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 What I'm concerned about is the locking. It looks to me like we're
 causing the user to lock rows that they may not intend to lock, by
 applying a LockRows step *before* the user supplied qual. (I'm going to
 test that tomorrow, it's sleep time in Australia).

The fact that there are two LockRows nodes seems outright broken.
The one at the top of the plan is correctly placed, but how did the
other one get in there?

regards, tom lane




Re: [HACKERS] What the heck is hobby.2ndquadrant.com?

2014-03-07 Thread Simon Riggs
On 7 March 2014 17:46, Tom Lane t...@sss.pgh.pa.us wrote:
 There are a few threads floating around currently in which some of
 the cc'd addresses are various-people at hobby.2ndquadrant.com.
 Addresses like that seem not to work, or at least not work reliably.
 I believe the official addresses are just soandso at 2ndquadrant.com.

 If you're replying to any such threads, you might want to clean
 up the addresses so you don't get a bunch of bounce messages.

 In any case, I thought the 2ndquadrant people would want to know that
 there's something broken in their mail system.  I'm not sure how these
 addresses got injected into our threads in the first place, but I bet
 it wasn't done by anybody outside 2ndquadrant.  Perhaps an internal
 mail server name was unintentionally exposed?

Thanks for highlighting that. Will investigate.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] atexit_callback can be a net negative

2014-03-07 Thread Andres Freund
Hi,

On 2014-03-07 09:50:15 -0800, Peter LaDow wrote:
 Also, the on_exit_reset() does clear up the issue, and we have
 implemented it as suggested in this thread (calling it immediately
 after the fork() in the child).  And Andres' keen eye noted we were
 calling exit() after an exec() failure, rather than _exit(). Thank
 you, Andres, for pointing out this error.

I actually didn't see any source code of yours ;), just answered Tom's
question about what to do after exec() failed.

There are several other weird behaviours if you use exit() instead of
_exit() after a fork, most prominently double flushes leading to
repeated/corrupted output when you have stream FILEs open in the
parent. This is because the stream will be flushed in both the parent
(at some later write or exit) and the child (at exit). It's simply
something that won't work well.

  I don't see any way it'd be safe for a pg unaware library to start
  forking around and doing similar random things inside normal
  backends.
 
 We aren't exactly forking around.  We were trying to spawn an
 asynchronous process without any ties to the postmaster.

The important bit in the sentence you quote is "pg unaware library". My
point is just that there are some special considerations you have to be
aware of. fork() and exec() is a way to escape that environment, and
should be fine.

Greetings,

Andres Freund




[HACKERS] proposal (9.5) : psql unicode border line styles

2014-03-07 Thread Pavel Stehule
Hello

I am returning back to this topic. Last time I proposed styles:

http://www.postgresql.org/message-id/cafj8prclgoktryjpbtoncpgyftrcz-zgfowdc1jqulb+ede...@mail.gmail.com

http://postgres.cz/wiki/Pretty_borders_in_psql

That experiment failed, but there are some interesting tips in the discussion.

So here is a slightly different proposal - choose one predefined style
for each table line element. These styles are active only when the
linestyle is unicode.

So possible line elements are:

* border,
* header_separator,
* row_separator,
* column_separator,

Possible styles (for each element)

* none,
* single,
* double,
* thick,

It should have enough variability to define all the styles proposed
earlier. I hope this proposal is safe and simple to use. Styles would be
persistently saved in the .psqlrc file - and some examples can go in the
documentation.

Usage:

\pset linestyle_border double
\pset linestyle_header_separator single
\pset linestyle_row_separator single
\pset linestyle_column_separator single

\pset linestyle unicode

╔═══╤════════════╤═══════╗
║ a │     b      │   c   ║
╟───┼────────────┼───────╢
║ 1 │ 2012-05-24 │ Hello ║
╟───┼────────────┼───────╢
║ 2 │ 2012-05-25 │ Hello ║
║   │            │ World ║
╚═══╧════════════╧═══════╝
(2 rows)


Comments, ideas ?

Regards

Pavel


Re: [HACKERS] Unportable coding in reorderbuffer.h

2014-03-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 A patch fixing this is attached. I've added some more local variables to
 deal with the longer lines...

Since I've got a compiler in captivity that complains about this,
I'll take care of checking and committing this patch.

I still think it might be a good idea to spin up a buildfarm member
running gcc -ansi -pedantic, even if we don't see that as a particularly
useful case in practice.  Thoughts?

regards, tom lane




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Robert Haas
On Fri, Mar 7, 2014 at 10:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I just noticed that the DSM patch has introduced a whole new class of
 failures related to the bug #9464 issue: to wit, any on_detach
 actions registered in a parent process will also be performed when a
 child process exits, because nothing has been added to on_exit_reset
 to prevent that.  It seems likely that this is undesirable.

I don't think this can actually happen.  There are quite a number of
things that would go belly-up if you tried to use dynamic shared
memory from the postmaster, which is why dsm_create() and dsm_attach()
both Assert(IsUnderPostmaster).  Without calling those functions,
there's no way for code running in the postmaster to call
on_dsm_detach(), because it's got nothing to pass for the first
argument.

In case you're wondering, the major reason I disallowed this is that
the control segment tracks the number of backends attached to each
segment, and there's no provision for adjusting that value each time
the postmaster forks.  We could add such provision, but it seems like
there would be race conditions, and the postmaster might have to
participate in locking, so it might be pretty ugly, and a performance
suck for anyone who doesn't need this functionality.  And it doesn't
seem very useful anyway.  The postmaster really shouldn't be touching
any shared memory segment more than the absolutely minimal amount, so
the main possible benefit I can see is that you could have the mapping
already in place for each new backend rather than having to set it up
in every backend.  But I'm prepared to hope that isn't actually
important enough to be worth worrying about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Unportable coding in reorderbuffer.h

2014-03-07 Thread Andres Freund
On 2014-03-07 13:27:28 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  A patch fixing this is attached. I've added some more local variables to
  deal with the longer lines...
 
 Since I've got a compiler in captivity that complains about this,
 I'll take care of checking and committing this patch.

Thanks.

The tests specific to the feature can be run with (cd
contrib/test_decoding && make -s check), as they require non-default
PGC_POSTMASTER settings.

 I still think it might be a good idea to spin up a buildfarm member
 running gcc -ansi -pedantic, even if we don't see that as a particularly
 useful case in practice.  Thoughts?

I tried to compile with both -ansi -pedantic -std=c90 to catch potential
further issues, but at least my gcc version makes the output completely
drown in various warnings. Some can be individually disabled, but lots
of them seem to be only be covered by pretty general warning categories.
-ansi -std=c90 -Wno-variadic-macros works reasonably well tho.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Mar 7, 2014 at 10:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I just noticed that the DSM patch has introduced a whole new class of
 failures related to the bug #9464 issue: to wit, any on_detach
 actions registered in a parent process will also be performed when a
 child process exits, because nothing has been added to on_exit_reset
 to prevent that.  It seems likely that this is undesirable.

 I don't think this can actually happen.  There are quite a number of
 things that would go belly-up if you tried to use dynamic shared
 memory from the postmaster, which is why dsm_create() and dsm_attach()
 both Assert(IsUnderPostmaster).

Nonetheless it seems like a good idea to make on_exit_reset drop any
such queued actions.

The big picture here is that in the scenario being debated in the other
thread, exit() in a child process forked from a backend will execute that
backend's on_detach actions *even if the code had done on_exit_reset after
the fork*.  So whether or not you buy Andres' argument that it's not
necessary for atexit_callback to defend against this scenario, there's
actually no other defense possible given the way things work in HEAD.

regards, tom lane




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Andres Freund
Hi,

On 2014-03-07 13:54:42 -0500, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  I don't think this can actually happen.  There are quite a number of
  things that would go belly-up if you tried to use dynamic shared
  memory from the postmaster, which is why dsm_create() and dsm_attach()
  both Assert(IsUnderPostmaster).

 Nonetheless it seems like a good idea to make on_exit_reset drop any
 such queued actions.

 The big picture here is that in the scenario being debated in the other
 thread, exit() in a child process forked from a backend will execute that
 backend's on_detach actions *even if the code had done on_exit_reset after
 the fork*.

Hm, aren't those actions called via shmem_exit() calling
dsm_backend_shutdown() et al? I think that should be cleared by
on_shmem_exit()?

 So whether or not you buy Andres' argument that it's not
 necessary for atexit_callback to defend against this scenario, there's
 actually no other defense possible given the way things work in HEAD.

I think you're misunderstanding me. I am saying we *should* defend
against it. Our opinions just seem to differ on what to do when the
scenario is detected. I am saying we should scream bloody murder (which
admittedly is hard in a child), you want to essentially declare it
supported.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-03-07 13:54:42 -0500, Tom Lane wrote:
 The big picture here is that in the scenario being debated in the other
 thread, exit() in a child process forked from a backend will execute that
 backend's on_detach actions *even if the code had done on_exit_reset after
 the fork*.

 Hm, aren't those actions called via shmem_exit() calling
 dsm_backend_shutdown() et al? I think that should be cleared by
 on_shmem_exit()?

But dsm_backend_shutdown gets called whether or not any shmem_exit
actions are registered.

 I think you're misunderstanding me. I am saying we *should* defend
 against it. Our opinions just seem to differ on what to do when the
 scenario is detected. I am saying we should scream bloody murder (which
 admittedly is hard in a child), you want to essentially declare it
 supported.

And if we scream bloody murder, what will happen?  Absolutely nothing
except we'll annoy our users.  They won't have control over the
third-party libraries that are doing what you want to complain about.

regards, tom lane




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Andres Freund
On 2014-03-07 14:14:17 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-03-07 13:54:42 -0500, Tom Lane wrote:
  The big picture here is that in the scenario being debated in the other
  thread, exit() in a child process forked from a backend will execute that
  backend's on_detach actions *even if the code had done on_exit_reset after
  the fork*.
 
  Hm, aren't those actions called via shmem_exit() calling
  dsm_backend_shutdown() et al? I think that should be cleared by
  on_shmem_exit()?
 
 But dsm_backend_shutdown gets called whether or not any shmem_exit
 actions are registered.

Yikes. I thought on_exit_reset() disarmed the atexit callback, but
indeed, it does not...

  I think you're misunderstanding me. I am saying we *should* defend
  against it. Our opinions just seem to differ on what to do when the
  scenario is detected. I am saying we should scream bloody murder (which
  admittedly is hard in a child), you want to essentially declare it
  supported.
 
 And if we scream bloody murder, what will happen?  Absolutely nothing
 except we'll annoy our users.  They won't have control over the
 third-party libraries that are doing what you want to complain about.

If the third party library is suitably careful it will only use fork()
and then exec() or _exit(), otherwise there are other things that'll be
broken (e.g. stdio). In that case everything was and is fine. If not,
the library's user can then fix or file a bug against the library.

Both perl and glibc seem to do just that in system() btw...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Robert Haas
On Fri, Mar 7, 2014 at 1:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Mar 7, 2014 at 10:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I just noticed that the DSM patch has introduced a whole new class of
 failures related to the bug #9464 issue: to wit, any on_detach
 actions registered in a parent process will also be performed when a
 child process exits, because nothing has been added to on_exit_reset
 to prevent that.  It seems likely that this is undesirable.

 I don't think this can actually happen.  There are quite a number of
 things that would go belly-up if you tried to use dynamic shared
 memory from the postmaster, which is why dsm_create() and dsm_attach()
 both Assert(IsUnderPostmaster).

 Nonetheless it seems like a good idea to make on_exit_reset drop any
 such queued actions.

 The big picture here is that in the scenario being debated in the other
 thread, exit() in a child process forked from a backend will execute that
 backend's on_detach actions *even if the code had done on_exit_reset after
 the fork*.

Hmm.  So the problematic sequence of events is where a postmaster
child forks, and then exits without exec-ing, perhaps because e.g.
exec fails?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Peter LaDow
On Friday, March 7, 2014, Andres Freund and...@2ndquadrant.com wrote:

 If the third party library is suitably careful it will only use fork()
 and then exec() or _exit(), otherwise there are other things that'll be


But that is not possible* in our case of trying to spawn an asynchronous
background process. The goal here is for the extension to spawn the process
and return immediately. If we exec in the first level child, and
immediately return, who reaps the child when done?

This is why we fork twice and exit in the first level child so that the
extension can reap the first.

Unless Postgres happily reaps all children, perhaps through a SIGCHLD
handler?

(* I suppose we could exec a program that itself forks/execs a background
process. But that is essentially no different than what we are doing now,
other than trying to meet this fork/exec/_exit requirement. And that's fine
if that is to be the case. We just need to know what it is.)


 Both perl and glibc seem to do just that in system() btw...


I don't know about Perl, but system is blocking. What if you don't want to
wait for the child to exit?

Pete


Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Andres Freund
On 2014-03-07 12:09:45 -0800, Peter LaDow wrote:
 On Friday, March 7, 2014, Andres Freund and...@2ndquadrant.com wrote:
 
  If the third party library is suitably careful it will only use fork()
  and then exec() or _exit(), otherwise there are other things that'll be

 But that is not possible* in our case of trying to spawn an asynchronous
 background process.

Forking twice is ok as well, as long as you just use _exit() after the
fork. The thing is that you shouldn't run any nontrivial code in the
fork, as long as you're connected to the original environment (fd's,
memory mappings and so forth).

  Both perl and glibc seem to do just that in system() btw...

 I don't know about Perl, but system is blocking. What if you don't want to
 wait for the child to exit?

Man. The point is that the library code is carefully written to not
use exit() but _exit() after a fork failure, that's it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Peter LaDow
On Fri, Mar 7, 2014 at 12:17 PM, Andres Freund and...@2ndquadrant.com wrote:
 Man. The point is that that the library code is carefully written to not
 use exit() but _exit() after a fork failure, that's it.

I understand your point.  I understand that in the case of the
postmaster we don't want to invoke behavior that can cause problems by
calling exit(). But how does this jibe with Tom's point (on the bug
list) about other 3rd party libraries perhaps registering atexit
handlers?

If the convention is that only fork/exec/_exit is permissible after a
fork (what about on_exit_reset?), then third party libraries also need
to be aware of that convention and not depend on their atexit handlers
being called.

And I'm not advocating a certain solution.  I'm only trying to clarify
what the solution is so that we comply with the convention.  We
don't want to break postgres or any other well behaved third party
libraries.  I don't know the internals of postgres (hence original bug
report and this thread), so I of course defer to you and the other
experts here.  So, in our case we call on_exit_reset() after the fork
in the child, and then from there on only use fork, exec, or _exit, as
you mention above.

Pete




Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Peter LaDow
On Fri, Mar 7, 2014 at 12:17 PM, Andres Freund and...@2ndquadrant.com wrote:
 Forking twice is ok as well, as long as you just use _exit() after the
 fork. The thing is that you shouldn't run any nontrivial code in the
 fork, as long as you're connected to the original environment (fd's,
 memory mappings and so forth).

Just to be clear, what do you mean by nontrivial code?  Do you mean
C library calls, other than fork/exec/_exit?

Pete




Re: [HACKERS] [PATCH] Store Extension Options

2014-03-07 Thread Alvaro Herrera
I am reworking this patch, both to accommodate earlier review comments
and to fix the multiple verify step of namespaces, as noted by Tom in
4530.1390023...@sss.pgh.pa.us

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Static Code Analysis Exploits.

2014-03-07 Thread Patrick Curran

Hi,

We use Postgres in our product and we have a client that requires a 
static code analysis scan to detect vulnerabilities. They are concerned 
because the tool (Veracode) found several flaws in Postgres and they 
believe there might be a security risk. I'm sure there are lots of 
companies that use Postgres that have security policies like theirs in 
place, so I'm hoping someone has the experience to know that these are 
false positives or that they are not a security risk for some reason. 
Below is a description of the vulnerability and the location in the 
source code. Version 9.3.2.1 was scanned. Please let me know if there is 
a better place to ask this kind of question.


Thanks,
Patrick



Stack-based Buffer Overflow (CWE ID 121)(13 flaws):
There is a potential buffer overflow with these functions. If an 
attacker can control the data written into the buffer, the overflow may 
result in execution of arbitrary code.


btree_gist.dll .../btree_gist/btree_utils_num.c 115
btree_gist.dll .../btree_gist/btree_utils_num.c 123
pgcrypto.dll .../contrib/pgcrypto/crypt-md5.c 103
libpq.dll .../interfaces/libpq/fe-connect.c 3185
libpq.dll .../interfaces/libpq/fe-connect.c 3220
clusterdb.exe .../interfaces/libpq/fe-connect.c 3243
libpq.dll .../libpq/fe-protocol3.c 1661
libecpg_compat.dll .../ecpg/compatlib/informix.c 978
pgcrypto.dll .../contrib/pgcrypto/mbuf.c 112
pgcrypto.dll .../contrib/pgcrypto/mbuf.c 290
pgcrypto.dll .../contrib/pgcrypto/mbuf.c 306
pgcrypto.dll .../contrib/pgcrypto/mbuf.c 330
libpq.dll .../interfaces/libpq/pqexpbuffer.c 369

Use of Inherently Dangerous Function (CWE ID 242)(1 flaw):
These functions are inherently unsafe because they do not perform
bounds checking on the size of their input. An attacker can send overly 
long input and overflow the destination buffer, potentially resulting in 
execution of arbitrary code.

pg_isolation_regress.exe .../src/test/regress/pg_regress.c 2307

Integer Overflow or Wraparound (CWE ID 190)(1 flaw):
An integer overflow condition exists when an integer that has not been 
properly sanity checked is used in the determination of an offset or 
size for memory allocation, copying, concatenation, or similarly. If the 
integer in question is incremented past the maximum possible value, it 
may wrap to become a very small, negative number, therefore providing an 
unintended value. This occurs most commonly in arithmetic operations or 
loop iterations. Integer overflows can often result in buffer overflows 
or data corruption, both of which may be potentially exploited to 
execute arbitrary code.


dict_snowball.dll .../libstemmer/utilities.c 371

Process Control (CWE ID 114)(4 flaws)
A function call could result in a process control attack. An argument to 
a process control function is either derived from an untrusted source or 
is hard-coded, both of which may allow an attacker to execute malicious 
code under certain conditions. If an attacker is allowed to specify all 
or part of the filename, it may be possible to load arbitrary libraries. 
If the location is hard-coded and an attacker is able to place a 
malicious copy of the library higher in the search order than the file 
the application intends to load, then the application will load the 
malicious version.


psql.exe .../src/bin/psql/print.c 752
psql.exe .../src/bin/psql/print.c 791
psql.exe .../src/bin/psql/print.c 2209
psql.exe .../src/bin/psql/print.c 2500




Re: [HACKERS] Next CommitFest Deadlines

2014-03-07 Thread Robert Haas
On Thu, Mar 6, 2014 at 7:44 AM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
 Hi all,

 There are some place with the next commitfest deadlines (2014/06 and
 2014/09) ?

 Regards,

Those deadlines won't be finalized until after PGCon, but it seems
likely to me that we'll do about what we did last year.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] jsonb and nested hstore

2014-03-07 Thread David E. Wheeler
On Mar 6, 2014, at 1:51 AM, Peter Geoghegan p...@heroku.com wrote:

 It's true for perl. Syntax of hstore is close to hash/array syntax and it's
 easy to serialize/deserialize hstore to/from perl. Syntax of hstore was
 inspired by perl.
 
 I understand that. There is a module on CPAN called Pg::hstore that
 will do this; it appears to have been around since 2011. I don't use
 Perl, so I don't know a lot about it. Perhaps David Wheeler has an
 opinion on the value of Perl-like syntax, as a long time Perl
 enthusiast?

HSTORE was inspired by the syntax of Perl hash declarations, but it is not 
compatible. Notably, an HSTORE can have the value `NULL`, while in Perl 
hashes it’s `undef`. So you cannot simply `eval` an HSTORE to get a Perl hash 
unless you are certain there are no NULLs.

Besides, string eval in Perl is considered unsafe. Parsing is *much* safer.

 In any case, Perl has excellent support for JSON, just like every
 other language - you are at no particular advantage in Perl by having
 a format that happens to more closely resemble the format of Perl
 hashes and arrays. I really feel that we should concentrate our
 efforts on one standardized format here. It makes the effort to
 integrate your good work, in a way that makes it available to everyone
 so much easier.

I agree. I like HSTORE, but now that JSON is so standard (in fact, as of this 
week, a *real* standard! http://rfc7159.net/rfc7159), and its support is so 
much better than that of HSTORE, including in Perl, I believe it should take 
priority over HSTORE. I’m happy if HSTORE has the same functionality as JSONB, 
but given the choice, all other things being equal, as a Perl hacker I will 
always choose JSONB.

Best,

David





Re: [HACKERS] on_exit_reset fails to clear DSM-related exit actions

2014-03-07 Thread Peter LaDow
On Fri, Mar 7, 2014 at 12:55 PM, Peter LaDow pet...@gocougs.wsu.edu wrote:
 Just to be clear, what do you mean by nontrivial code?  Do you mean
 C library calls, other than fork/exec/_exit?

I think I've answered my own question:

http://man7.org/linux/man-pages/man7/signal.7.html

The 'Async-signal-safe functions' listed there are the C calls that may safely be made after a fork.

Pete




Re: [HACKERS] [PATCH] Store Extension Options

2014-03-07 Thread Fabrízio de Royes Mello
On Fri, Mar 7, 2014 at 5:56 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 I am reworking this patch, both to accomodate earlier review comments
 and to fix the multiple verify step of namespaces, as noted by Tom in
 4530.1390023...@sss.pgh.pa.us


This link is broken...

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-07 Thread Fabrízio de Royes Mello
On Fri, Mar 7, 2014 at 7:21 PM, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-03-07 19:14:48 -0300, Fabrízio de Royes Mello wrote:
  On Fri, Mar 7, 2014 at 5:56 PM, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:
 
   I am reworking this patch, both to accomodate earlier review comments
   and to fix the multiple verify step of namespaces, as noted by Tom in
   4530.1390023...@sss.pgh.pa.us
  
  
  This link is broken...

 It is a message id, and it seems to point to an appropriate thread. You
 can relatively easily look up message ids using URLs like
 http://www.postgresql.org/message-id/4530.1390023...@sss.pgh.pa.us


Sorry... my fault!! Thanks!

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Static Code Analysis Exploits.

2014-03-07 Thread Tom Lane
Patrick Curran pcur...@contentanalyst.com writes:
 We use Postgres in our product and we have a client that requires a 
 static code analysis scan to detect vulnerabilities. They are concerned 
 because the tool (Veracode) found several flaws in Postgres and they 
 believe there might be a security risk. I'm sure there are lots of 
 companies that use Postgres that have security policies like theirs in 
 place, so I'm hoping someone has the experience to know that these are 
 false positives or that they are not a security risk for some reason. 
 Below is a description of the vulnerability and the location in the 
 source code. Version 9.3.2.1 was scanned. Please let me know if there is 
 a better place to ask this kind of question.

TBH, I don't think anyone's going to bother with going through this list
in this form.  Line numbers in something that's not an official community
release are not very helpful, and the descriptions are far too vague for
someone looking at the list to be sure exactly what their tool is on
about.  I took one example at random:

 Stack-based Buffer Overflow (CWE ID 121)(13 flaws):
 There is a potential buffer overflow with these functions. If an 
 attacker can control the data written into the buffer, the overflow may 
 result in execution of arbitrary code.

 libpq.dll .../interfaces/libpq/pqexpbuffer.c 369

This seems to be complaining about the memcpy in appendBinaryPQExpBuffer.
Well, I don't see anything unsafe about it: the preceding call to
enlargePQExpBuffer should have made sure that the target buffer is big
enough.  And the reference to stack-based buffer overflow is completely
nonsensical, because no PQExpBuffer keeps its buffer on the stack.  It's
conceivable that their tool has spotted some unsafe pattern in some call
site, but this report is unhelpful about identifying what that would be.

I did look at a few of the other items, and none of the ones I looked at
were any more clear.

FWIW, we do have access to Coverity scans of the Postgres sources, and
we do make efforts to fix things Coverity complains about.  But we're
unlikely to take reports like this one seriously: there's simply not
enough information to know what the tool is unhappy about, nor any
clear reason to believe that it's spotted something that both human
readers and Coverity have missed.

Sorry if that's not the answer you wanted; but a more positive response
is going to require a substantially greater amount of work on your end.
In particular, given the very substantial amounts of work that have
already gone into hardening the Postgres code, I think the burden of
proof is on you or your client to show that these are issues, not on
us to disprove claims that are too vague to be disproven.

regards, tom lane




[HACKERS] Regression test errors

2014-03-07 Thread Martín Marqués
I was testing some builds I was doing and found that the regression
tests fail when run against a Hot Standby server:

$ make standbycheck
[...]
== running regression test queries==
test hs_standby_check ... ok
test hs_standby_allowed   ... FAILED
test hs_standby_disallowed... FAILED
test hs_standby_functions ... ok

==
 2 of 4 tests failed.
==

The differences that caused some tests to fail can be viewed in the
file /usr/local/postgresql-9.3.3/src/test/regress/regression.diffs.
A copy of the test summary that you see
above is saved in the file
/usr/local/postgresql-9.3.3/src/test/regress/regression.out.

The regression.diffs and the patch are attached.

I haven't checked how far back those go. I don't think it's even
important to back patch this, but it's nice for future testing.

Regards,

-- 
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


regression.diffs
Description: Binary data
From b6db8388e37f6afaa431e31239fd972d10140cc1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mart=C3=ADn=20Marqu=C3=A9s?= mar...@2ndquadrant.com
Date: Fri, 7 Mar 2014 21:29:29 -0300
Subject: [PATCH] Standby regression checks failed.

Two Hot Standby regression tests failed, for two reasons.

- An error in src/test/regress/expected/hs_standby_disallowed.out
  made regression fail (VACUUM should be ANALYZE).
- Serializable transactions won't work on a Hot Standby.
---
 src/test/regress/expected/hs_standby_allowed.out| 2 +-
 src/test/regress/expected/hs_standby_disallowed.out | 2 +-
 src/test/regress/sql/hs_standby_allowed.sql | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/test/regress/expected/hs_standby_allowed.out b/src/test/regress/expected/hs_standby_allowed.out
index 1abe5f6..9d18d77 100644
--- a/src/test/regress/expected/hs_standby_allowed.out
+++ b/src/test/regress/expected/hs_standby_allowed.out
@@ -49,7 +49,7 @@ select count(*)  as should_be_1 from hs1;
 (1 row)
 
 end;
-begin transaction isolation level serializable;
+begin transaction isolation level repeatable read;
 select count(*) as should_be_1 from hs1;
  should_be_1 
 -
diff --git a/src/test/regress/expected/hs_standby_disallowed.out b/src/test/regress/expected/hs_standby_disallowed.out
index e7f4835..bc11741 100644
--- a/src/test/regress/expected/hs_standby_disallowed.out
+++ b/src/test/regress/expected/hs_standby_disallowed.out
@@ -124,7 +124,7 @@ unlisten *;
 ERROR:  cannot execute UNLISTEN during recovery
 -- disallowed commands
 ANALYZE hs1;
-ERROR:  cannot execute VACUUM during recovery
+ERROR:  cannot execute ANALYZE during recovery
 VACUUM hs2;
 ERROR:  cannot execute VACUUM during recovery
 CLUSTER hs2 using hs1_pkey;
diff --git a/src/test/regress/sql/hs_standby_allowed.sql b/src/test/regress/sql/hs_standby_allowed.sql
index 58e2c01..5cd450d 100644
--- a/src/test/regress/sql/hs_standby_allowed.sql
+++ b/src/test/regress/sql/hs_standby_allowed.sql
@@ -28,7 +28,7 @@ begin transaction read only;
 select count(*)  as should_be_1 from hs1;
 end;
 
-begin transaction isolation level serializable;
+begin transaction isolation level repeatable read;
 select count(*) as should_be_1 from hs1;
 select count(*) as should_be_1 from hs1;
 select count(*) as should_be_1 from hs1;
-- 
1.8.3.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-07 Thread Andres Freund
On 2014-03-07 19:14:48 -0300, Fabrízio de Royes Mello wrote:
 On Fri, Mar 7, 2014 at 5:56 PM, Alvaro Herrera 
 alvhe...@2ndquadrant.comwrote:
 
  I am reworking this patch, both to accomodate earlier review comments
  and to fix the multiple verify step of namespaces, as noted by Tom in
  4530.1390023...@sss.pgh.pa.us
 
 
 This link is broken...

It is a message id, and it seems to point to an appropriate thread. You
can relatively easily look up message ids using URLs like
http://www.postgresql.org/message-id/4530.1390023...@sss.pgh.pa.us

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLE not synonymous with SELECT * FROM?

2014-03-07 Thread Bruce Momjian
On Wed, Nov 13, 2013 at 10:28:07AM +0100, Colin 't Hart wrote:
 David et al,
 
 How about something like this?

I have applied a modified version of your patch.  I didn't like the
idea of putting SELECT inside an OR syntax clause --- the syntax is
already too complicated.  I also didn't like moving the TABLE mention up
in the file.  What I did do was to document the supported TABLE clauses,
and add some of your verbiage.  Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
new file mode 100644
index 7395754..f1bc158
*** a/doc/src/sgml/ref/select.sgml
--- b/doc/src/sgml/ref/select.sgml
*** TABLE [ ONLY ] replaceable class=param
*** 214,220 
  subqueries that can be referenced by name in the primary query.
  The subqueries effectively act as temporary tables or views
  for the duration of the primary query.
! Each subquery can be a commandSELECT/command, commandVALUES/command,
  commandINSERT/command, commandUPDATE/command or
  commandDELETE/command statement.
  When writing a data-modifying statement (commandINSERT/command,
--- 214,220 
  subqueries that can be referenced by name in the primary query.
  The subqueries effectively act as temporary tables or views
  for the duration of the primary query.
! Each subquery can be a commandSELECT/command, commandTABLE/, commandVALUES/command,
  commandINSERT/command, commandUPDATE/command or
  commandDELETE/command statement.
  When writing a data-modifying statement (commandINSERT/command,
*** SELECT * FROM (SELECT * FROM mytable FOR
*** 1489,1500 
  programlisting
  TABLE replaceable class=parametername/replaceable
  /programlisting
! is completely equivalent to
  programlisting
  SELECT * FROM replaceable class=parametername/replaceable
  /programlisting
  It can be used as a top-level command or as a space-saving syntax
! variant in parts of complex queries.
 /para
/refsect2
   /refsect1
--- 1489,1505 
  programlisting
  TABLE replaceable class=parametername/replaceable
  /programlisting
! is equivalent to
  programlisting
  SELECT * FROM replaceable class=parametername/replaceable
  /programlisting
  It can be used as a top-level command or as a space-saving syntax
! variant in parts of complex queries. Only the literalWITH/,
! literalUNION/, literalINTERSECT/, literalEXCEPT/,
! literalORDER BY/, literalLIMIT/, literalOFFSET/,
! literalFETCH/ and locking clauses can be used with commandTABLE/;
! the literalWHERE/ clause and any form of aggregation cannot
! be used.
 /para
/refsect2
   /refsect1
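To make the documented restriction concrete, a short sketch of what the new
paragraph permits and forbids (table names hypothetical):

```sql
-- Top-level use; equivalent to SELECT * FROM mytable ORDER BY id LIMIT 10:
TABLE mytable ORDER BY id LIMIT 10;

-- Space-saving variant inside a set operation:
TABLE mytable UNION TABLE othertable;

-- Disallowed: there is no TABLE form of WHERE or aggregation:
-- TABLE mytable WHERE id > 10;      -- syntax error
-- SELECT count(*) FROM mytable;     -- must be written with SELECT
```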

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tcp_keepalives_idle

2014-03-07 Thread Bruce Momjian
On Thu, Nov 14, 2013 at 11:32:23AM +0100, Marko Tiikkaja wrote:
 On 11/14/13 7:08 AM, Tatsuo Ishii wrote:
 It means the connection is idle except for keepalive packets.
 We could perhaps just drop the word otherwise, if people find
 it confusing.
 
 Wah. I seem to have completely misunderstood what the phrase
 says. Thanks for the clarification. I agree to drop otherwise.
 
 I had some problem interpreting these explanations as well:
 http://www.postgresql.org/message-id/527a21f1.2000...@joh.to
 
 Compare that to the description in the libpq documentation:
 Controls the number of seconds of inactivity after which TCP should
 send a keepalive message to the server..

Good point. I have improved the server-side keepalive parameter
descriptions to use the superior libpq text, with adjustment.

Applied patch attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tcp_keepalives_idle

2014-03-07 Thread Bruce Momjian
On Fri, Mar  7, 2014 at 10:03:42PM -0500, Bruce Momjian wrote:
 On Thu, Nov 14, 2013 at 11:32:23AM +0100, Marko Tiikkaja wrote:
  On 11/14/13 7:08 AM, Tatsuo Ishii wrote:
  It means the connection is idle except for keepalive packets.
  We could perhaps just drop the word otherwise, if people find
  it confusing.
  
  Wah. I seem to have completely misunderstood what the phrase
  says. Thanks for the clarification. I agree to drop otherwise.
  
  I had some problem interpreting these explanations as well:
  http://www.postgresql.org/message-id/527a21f1.2000...@joh.to
  
  Compare that to the description in the libpq documentation:
  Controls the number of seconds of inactivity after which TCP should
  send a keepalive message to the server..
 
 Good point. I have improved the server-side keepalive parameter
 descriptions to use the superior libpq text, with adjustment.
 
 Applied patch attached.

Oops, now attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index 86dbd0f..2811f11
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** include 'filename'
*** 684,691 
/indexterm
listitem
 para
! Specifies the number of seconds before sending a keepalive packet on
! an otherwise idle connection.  A value of 0 uses the system default.
  This parameter is supported only on systems that support the
  symbolTCP_KEEPIDLE/ or symbolTCP_KEEPALIVE/ symbols, and on
  Windows; on other systems, it must be zero.
--- 684,692 
/indexterm
listitem
 para
! Specifies the number of seconds of inactivity after which TCP
! should send a keepalive message to the client.  A value of 0 uses
! the system default.
  This parameter is supported only on systems that support the
  symbolTCP_KEEPIDLE/ or symbolTCP_KEEPALIVE/ symbols, and on
  Windows; on other systems, it must be zero.
*** include 'filename'
*** 708,715 
/indexterm
listitem
 para
! Specifies the number of seconds between sending keepalives on an
! otherwise idle connection.  A value of 0 uses the system default.
  This parameter is supported only on systems that support the
  symbolTCP_KEEPINTVL/ symbol, and on Windows; on other systems, it
  must be zero.
--- 709,717 
/indexterm
listitem
 para
! Specifies the number of seconds after which a TCP keepalive message
! that is not acknowledged by the client should be retransmitted.
! A value of 0 uses the system default.
  This parameter is supported only on systems that support the
  symbolTCP_KEEPINTVL/ symbol, and on Windows; on other systems, it
  must be zero.
*** include 'filename'
*** 732,739 
/indexterm
listitem
 para
! Specifies the number of keepalive packets to send on an otherwise idle
! connection.  A value of 0 uses the system default.  This parameter is
  supported only on systems that support the symbolTCP_KEEPCNT/
  symbol; on other systems, it must be zero.
  In sessions connected via a Unix-domain socket, this parameter is
--- 734,742 
/indexterm
listitem
 para
! Specifies the number of TCP keepalives that can be lost before
! the server's connection to the client is considered dead.  A value of 0
! uses the system default.  This parameter is
  supported only on systems that support the symbolTCP_KEEPCNT/
  symbol; on other systems, it must be zero.
  In sessions connected via a Unix-domain socket, this parameter is

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-security on updatable s.b. views

2014-03-07 Thread Yeb Havinga

On 05/03/14 15:44, Craig Ringer wrote:

On 03/05/2014 05:25 PM, Yeb Havinga wrote:

Maybe a naive thought, but shouldn't all plans that include a table with
an RLS clause be invalidated when the session role switches, regardless
of which users it switches from and to?

Only if the plan is actually accessed when under a different user ID.
Consider SECURITY DEFINER functions; you don't want to flush all cached
plans just because you ran a SECURITY DEFINER function that doesn't even
share any statements with the outer transaction.

Hmm good point.


I've also got some concerns about the user visible API; I'm not sure it
makes sense to set row security policy for row reads per-command in
PostgreSQL, since we have the RETURNING clause. Read-side policy should
just be FOR READ.

Hmm but FOR READ would mean new keywords, and SELECT is also a concept
known to users. I didn't find the api problematic to understand, on the
contrary.

Would you expect that FOR SELECT also affects rows you can see to
UPDATE, INSERT, or DELETE?

Yes.

Because that's what it would have to mean, really. Otherwise, you could
just use `UPDATE thetable SET id = id RETURNING *` (or whatever) to read
the rows out if you had UPDATE rights. Or do the same with DELETE.

With RETURNING, it doesn't make much sense for different statements to
have different read access. Can you think of a case where it'd be
reasonable to deny SELECT, but allow someone to see the same rows with
`UPDATE ... RETURNING` ?


It might be an idea to add the SELECT RLS clause for DML
queries that contain a RETURNING clause.

That way lies madness: A DML statement that affects *a different set of
rows* depending on whether or not it has a RETURNING clause.
If you state it like that, it sounds like a POLA violation. But the 
complete story is: a user is allowed to UPDATE a set of rows in a 
table that is not a subset of the set of rows he can SELECT from the 
table; in other words, he can UPDATE rows he is not allowed to SELECT. 
This can lead to unexpected results: when the user issues an UPDATE of 
the table without a RETURNING clause, all rows the user may UPDATE are 
affected. When the user issues an UPDATE of the table with a RETURNING 
clause, all rows the user may UPDATE and SELECT are affected.


So the madness comes from the fact that it is allowed to define RLS 
policies that let you modify rows you cannot select. Either prevent these 
conditions (i.e. prove that every DML RLS qual implies the SELECT qual, 
and otherwise raise an error on DML with a RETURNING clause), or allow it 
without violating the RLS rules but accept that a DML with RETURNING is 
different from a DML without one.
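A sketch of the leak being discussed, with hypothetical quals (the policy
syntax here is only schematic; the patch's actual API may differ):

```sql
-- Suppose the (hypothetical) policies are:
--   FOR SELECT: owner = current_user          -- rows you may read
--   FOR UPDATE: team = 'mygroup'              -- rows you may write
-- If the writable set is not a subset of the readable set, then:
UPDATE thetable SET id = id RETURNING *;
-- either returns rows the user could never SELECT directly,
-- or (if the SELECT qual is also applied) silently updates a
-- different set of rows than the same UPDATE without RETURNING.
```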


regards,
Yeb



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-03-07 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 On Mar5, 2014, at 18:37 , Tom Lane t...@sss.pgh.pa.us wrote:
 My advice is to lose the EXPLAIN output entirely.  If the authors of
 the patch can't agree on what it means, what hope have everyday users
 got of making sense of it?

 The question isn't what the current output means, but whether it's a
 good metric to report or not.

If you can't agree, then it isn't.

 If we don't report anything, then how would a user check whether a query
 is slow because of O(n^2) behaviour of a windowed aggregate, or because
 of some other reasons?

[ shrug... ]  They can see whether the Window plan node is where the time
is going.  It's not apparent to me that the extra numbers you propose to
report will edify anybody.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-03-07 Thread Florian Pflug
On Mar5, 2014, at 23:49 , Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 On Mar5, 2014, at 18:37 , Tom Lane t...@sss.pgh.pa.us wrote:
 My advice is to lose the EXPLAIN output entirely.  If the authors of
 the patch can't agree on what it means, what hope have everyday users
 got of making sense of it?
 
 The question isn't what the current output means, but whether it's a
 good metric to report or not.
 
 If you can't agree, then it isn't.

Probably, yes, so let's find something that *is* a good metric.

(BTW, it's not the authors who disagree here. It was me who put the EXPLAIN
feature in, and Dean reviewed it and found it confusing. The original
author David seems to have run out of time to work on this, and AFAIK hasn't
weighed in on that particular feature at all.)

 If we don't report anything, then how would a user check whether a query
 is slow because of O(n^2) behaviour of a windowed aggregate, or because
 of some other reasons?
 
 [ shrug... ]  They can see whether the Window plan node is where the time
 is going.  It's not apparent to me that the extra numbers you propose to
 report will edify anybody.

By the same line of reasoning, every metric except execution time is
superfluous. Comparing execution times really is a horrible way to measure
this - not only does it include all kinds of things that have nothing to do
with the restart behaviour of aggregates, you'd also have to construct a
base-line query first which is guaranteed not to restart.

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extension_control_path

2014-03-07 Thread Peter Eisentraut
On 3/7/14, 11:39 AM, Dimitri Fontaine wrote:
 Just please make sure that it's still possible to use full absolute path
 for the module path name so that the packager can have control too.

I can't think of any package system where absolute paths are part of a
recommended workflow.  There is always a search path; that search path
contains a series of system, non-system, and per-user
directories.  A packager or installer chooses one of those directories
as installation target, according to convention or user choice.

OK, you can install a C library in some obscure place and then to
#include /some/absolute/path/file.h, but that's not sane practice.
Similarly, you will still be able to hardcode paths into CREATE FUNCTION
statements or other places, but why would you want to?

 What the $directory proposal achieves is allowing a fully relocatable
 extension layout, where you just have to drop a directory anywhere in
 the file system and it just works (*).

If that's what you want (and it sounds attractive), why couldn't we just
locate libraries using extension_control_path as well (which presumably
contains the control file we are just processing).  No need for another
indirection.  Libraries and control files being in separate directory
hierarchies is a historical artifact, but it's not something that can't
be changed if it's not what we want.

The problem I have with this $directory proposal, if I understand it
correctly, is that it requires editing of the control file at
installation time.  I don't like this for three reasons: it's not clear
how this should actually be done, creating more broken extension build
scripts (a big problem already); control files are part of the code,
so to speak, not a configuration file, and so munging it in a
site-specific way creates a hybrid type of file that will be difficult
to properly manage; it doesn't allow running an extension before
installation (unless I overwrite the control file in my own source
tree), which is my main use case for this.

 What happens if you have more than one 'prefix.so' file in your path?

The same thing that happens if you have more than one prefix.control in
your path.  You take the first one you find.

Yes, if those are actually two different path settings, you need to keep
those aligned.  But that would be a one-time setting by the database
administrator, not a per-extension-and-packager setting, so it's
manageable.  If it still bothers you, put them both in the same path, as
I suggested above.

 The module_pathname facility allows the packager to decide where the
 library module file gets installed and the extension author not to
 concern himself with that choice.

Again, that would only work if they patch the control file during
installation, which is dubious.  That's like patching paths in include
files or shared libraries.  People do that, but it's not a preferred
method.  And secondly, why would a packager care?  Has any packager ever
cared to relocate the library file and no other file?

Aside from those details, it seems clear that any reasonably complete
move-extensions-elsewhere feature will need some kind of build system
support.  I have various ideas on that and would gladly contribute some
of them, but it's not going to happen within two weeks.

At this point I suggest that we work toward the minimum viable product:
the extension_control_path feature as originally proposed (plus the
crash fixes), and let the field work out best practices.  As you
describe, you can work around all the other issues by patching various
text files, but you currently cannot move the extension control file in
any way, and that's a real deficiency.  (I once experimented with bind
mounts to work around that -- a real mess ;-) )
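To make the minimum viable product concrete, the setup a site would tune might
look like this (the extension_control_path name and value syntax are as
proposed in the patch, not a committed GUC; dynamic_library_path already
exists):

```
# postgresql.conf -- hypothetical sketch of the proposed setup
extension_control_path = '/opt/pgext/share/extension:$system'
dynamic_library_path   = '/opt/pgext/lib:$libdir'
```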



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extension_control_path

2014-03-07 Thread Peter Eisentraut
[I answered most of these concerns in more detail in the reply to Dimitri.]

On 3/7/14, 9:16 AM, Stephen Frost wrote:
 Being able to have a self-contained module which requires a minimum of
 modification to postgresql.conf is a reduction in complexity, imv.
 Having to maintain two config options which will end up being overly
 long and mostly duplicated doesn't make things easier for people.

Then we can make it one path.

 This
 has made me wonder if we could allow a control file to be explicitly
 referred to from CREATE EXTENSION itself, dropping the need for any of
 this postgresql.conf/GUC maintenance.  There are downsides to that
 approach as well, of course, but it's definitely got a certain appeal.

That might be useful as a separate feature, but it reeks of #include
/absolute/path/file.h, which isn't a sane practice.  No programming
language other than ancient or poorly designed ones allows that sort of
thing.

 I don't buy off on this analogy.  For starters, you can change the
 control file without needing to rebuild the library,

(You can also change the rpath without rebuilding the library.)

 but the main
 difference is that, in practice, there are no library transistions
 happening and instead what we're likely to have are independent and
 *incompatible* libraries living with the same name in our path.

I understand this concern.  The question is, how big is it relative to
the other ones.

As side idea I just had, how about embedding the extension version
number into the library somehow?  Food for thought.

 This makes sense when you have complete control over where things are
 installed to and can drop the control file into the one true directory
 of control files and similairly with the .so.  Indeed, that works
 already today for certain platforms, but from what I understand, on the
 OSX platform you don't really get to just dump files anywhere on the
 filesystem that you want and instead end up forced into a specific
 directory tree.

That is incorrect.

If someone has a general use for module_pathname, I'd be interested to
hear it, but that's not one of them.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl status with nonexistent data directory

2014-03-07 Thread Amit Kapila
On Fri, Mar 7, 2014 at 7:59 PM, Bruce Momjian br...@momjian.us wrote:
 On Fri, Mar  7, 2014 at 07:18:04PM +0530, Amit Kapila wrote:
  OK, done with the attached patch  Three is returned for status, but one
  for other cases.

 I think it would have been better if it return status as 4 for cases where
 directory/file is not accessible (current new cases added by this patch).

 I think status 3 should be only return only when the data directory is valid
 and pid file is missing, because in script after getting this code, the next
 thing they will probably do is to start the server which doesn't seem a
 good fit for newly added cases.

 OK, I have updated the attached patch to reflect this, and added a C
 comment.

This is fine. Do you think we should mention this in the docs?
I have added one line to the docs (updated patch attached); if you think
it makes sense then add it, otherwise ignore it.
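For init scripts, the distinction matters when dispatching on the exit status.
A sketch using the LSB codes discussed in the patch (the pg_ctl call is
commented out and the status simulated, since the point is the dispatch):

```shell
# In a real script: pg_ctl -D "$PGDATA" status; status=$?
# Here we simulate the new case added by the patch:
status=4

case "$status" in
    0) echo "server is running" ;;
    3) echo "server not running; safe to start" ;;
    4) echo "status unknown: data directory inaccessible or not a cluster" ;;
    *) echo "unexpected status $status" ;;
esac
```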


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
index 45b53ce..99cb176 100644
--- a/doc/src/sgml/ref/pg_ctl-ref.sgml
+++ b/doc/src/sgml/ref/pg_ctl-ref.sgml
@@ -206,9 +206,10 @@ PostgreSQL documentation
   para
optionstatus/option mode checks whether a server is running in
the specified data directory. If it is, the acronymPID/acronym
-   and the command line options that were used to invoke it are
-   displayed.  If the server is not running, the process returns an
-   exit status of 3.
+and the command line options that were used to invoke it are
+displayed.  If the server is not running, the process returns an
+exit status of 3.  If the specified data directory is not accessible,
+the process returns an exit status of 4.
   /para
 
   para
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 0dbaa6e..56d238f 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -97,6 +97,7 @@ static bool allow_core_files = false;
 static time_t start_time;
 
 static char postopts_file[MAXPGPATH];
+static char version_file[MAXPGPATH];
 static char pid_file[MAXPGPATH];
 static char backup_file[MAXPGPATH];
 static char recovery_file[MAXPGPATH];
@@ -152,7 +153,7 @@ static void pgwin32_doRunAsService(void);
 static int CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION 
*processInfo, bool as_service);
 #endif
 
-static pgpid_t get_pgpid(void);
+static pgpid_t get_pgpid(bool is_status_request);
 static char **readfile(const char *path);
 static void free_readfile(char **optlines);
 static int start_postmaster(void);
@@ -246,10 +247,34 @@ print_msg(const char *msg)
 }
 
 static pgpid_t
-get_pgpid(void)
+get_pgpid(bool is_status_request)
 {
FILE   *pidf;
longpid;
+   struct stat statbuf;
+
+   if (stat(pg_data, statbuf) != 0)
+   {
+   if (errno == ENOENT)
+   printf(_(%s: directory \%s\ does not exist\n), 
progname,
+pg_data);
+   else
+   printf(_(%s: cannot access directory \%s\\n), 
progname,
+pg_data);
+   /*
+* The Linux Standard Base Core Specification 3.1 says this 
should return
+* '4, program or service status is unknown'
+* 
https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
+*/
+   exit(is_status_request ? 4 : 1);
+   }
+
+   if (stat(version_file, statbuf) != 0  errno == ENOENT)
+   {
+   printf(_(%s: directory \%s\ is not a database cluster 
directory\n),
+progname, pg_data);
+   exit(is_status_request ? 4 : 1);
+   }
 
pidf = fopen(pid_file, r);
if (pidf == NULL)
@@ -810,7 +835,7 @@ do_start(void)
 
if (ctl_command != RESTART_COMMAND)
{
-   old_pid = get_pgpid();
+   old_pid = get_pgpid(false);
if (old_pid != 0)
write_stderr(_(%s: another server might be running; 
   trying to start server 
anyway\n),
@@ -894,7 +919,7 @@ do_stop(void)
pgpid_t pid;
struct stat statbuf;
 
-   pid = get_pgpid();
+   pid = get_pgpid(false);
 
if (pid == 0)   /* no pid file */
{
@@ -943,7 +968,7 @@ do_stop(void)
 
for (cnt = 0; cnt  wait_seconds; cnt++)
{
-   if ((pid = get_pgpid()) != 0)
+   if ((pid = get_pgpid(false)) != 0)
{
print_msg(.);
pg_usleep(100); /* 1 sec */
@@ -980,7 +1005,7 @@ do_restart(void)
pgpid_t pid;
struct stat statbuf;
 
-   pid = get_pgpid();
+   pid = 

Re: [HACKERS] pg_upgrade: delete_old_cluster.sh issues

2014-03-07 Thread Bruce Momjian
On Mon, Nov 18, 2013 at 10:13:19PM -0500, Bruce Momjian wrote:
 On Tue, Nov 12, 2013 at 10:35:58AM +, Marc Mamin wrote:
  Hello,
   
  IMHO, there is a serious issue in the script to clean the old data directory
  when running pg_upgrade in link mode.
   
  in short: When working with symbolic links, the first step in
  delete_old_cluster.sh
  is to delete the old $PGDATA folder that may contain tablespaces used by the
  new instance.
   
  in long, our use case:
 
 Rather than removing files/directories individually, which would be
 difficult to maintain, we decided in pg_upgrade 9.3 to detect
 tablespaces in the old data directory and report that and not create a
 delete script.  Here is the commit:
 

 http://git.postgresql.org/gitweb/?p=postgresql.gita=commitdiffh=4765dd79219b9697d84f5c2c70f3fe00455609a1
 
 The problem with your setup is that while you didn't pass symbolic links
 to pg_upgrade, you did use symbolic links when defining the tablespaces,
 so pg_upgrade couldn't recognize that the symbolic links were inside the
 old data directory.
 
 We could use readlink() to go walk over all symbolic links, but that
 seems quite complex.  We could use stat() and make sure there are no
 matching inodes in the old data directory, or that they are in a
 different file system.  We could look for a directory named after the PG
 catversion in the old data directory.  We could update the docs.
 
 I am not sure what to do.  We never expected people would put
 tablespaces in the data directory.

I went with a doc patch, attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml
new file mode 100644
index 4d03b12..72e3cb6
*** a/doc/src/sgml/pgupgrade.sgml
--- b/doc/src/sgml/pgupgrade.sgml
*** psql --username postgres --file script.s
*** 460,466 
   cluster's data directories by running the script mentioned when
   commandpg_upgrade/command completes. You can also delete the
   old installation directories
!  (e.g. filenamebin/, filenameshare/).
  /para
 /step
  
--- 460,467 
   cluster's data directories by running the script mentioned when
   commandpg_upgrade/command completes. You can also delete the
   old installation directories
!  (e.g. filenamebin/, filenameshare/).  This will not work
!  if you have tablespaces inside the old data directory.
  /para
 /step
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Shave a few instructions from child-process startup sequence

2014-03-07 Thread Bruce Momjian
On Sun, Dec  1, 2013 at 12:07:21PM -0500, Gurjeet Singh wrote:
 On Wed, Nov 27, 2013 at 9:12 AM, Robert Haas robertmh...@gmail.com wrote:
 
 
 This is a performance patch, so it should come with benchmark results
 demonstrating that it accomplishes its intended purpose.  I don't see
 any.
 
 
 Yes, this is a performance patch, but as the subject says, it saves a few
 instructions. I don't know how to write a test case that can measure savings
 of skipping a few instructions in a startup sequence that potentially takes
 thousands, or even millions, of instructions.

Are we saying we don't want this patch?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-07 Thread Fabrízio de Royes Mello
On Fri, Mar 7, 2014 at 5:56 PM, Alvaro Herrera alvhe...@2ndquadrant.comwrote:

 I am reworking this patch, both to accomodate earlier review comments
 and to fix the multiple verify step of namespaces, as noted by Tom in
 4530.1390023...@sss.pgh.pa.us


Alvaro,

Do you need some help?

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello