Re: [HACKERS] NOT Null constraint on foreign table not working
Rushabh Lathia wrote:
> I found constraints on foreign tables very useful for applications where multiple users access the same remote table through an FDW and each user wants to enforce a different constraint, or a different DEFAULT expression, on a particular column.
> I agree with you that if we want to enforce a constraint locally, it should be the FDW's task to do it rather than nodeModifyTable.c's.

I believe that a column of a foreign table should be NOT NULL only if it is guaranteed that it cannot contain NULL values. Doesn't the planner rely on that? But PostgreSQL cannot guarantee that; it has to happen on the remote side (or in the FDW). I think it is best that an error for a constraint violation is thrown by the same entity that guarantees that the constraint is respected. So I agree with your last statement.

Yours,
Laurenz Albe

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: hide application_name from other users
On 01/21/2014 07:22 AM, Harold Giménez wrote:
> First of all, I apologize for submitting a patch and missing the commitfest deadline. Given the size of the patch, I thought I'd submit it for your consideration regardless.
>
> This patch prevents non-superusers from viewing other users' pg_stat_activity.application_name. This topic was discussed some time ago [1] and consequently application_name was made world-readable [2]. I would like to propose that we hide it instead by reverting to the original behavior.
>
> There is a very large number of databases on the same cluster, shared across different users who can easily view each other's application_name values. Along with that, there are some libraries that default application_name to the name of the running process [3], which can leak information about what web servers, applications, queue systems, etc. are running. Furthermore, leaking application names in a multi-tenant environment is more information than an attacker should have access to on services like Heroku and other similar providers.

I don't find these arguments compelling enough to change it now. It's well documented that application_name is visible to everyone; just don't put sensitive information there. For those users who don't mind advertising application_name, the patch would be highly inconvenient. For example, the database owner could no longer see the application_name of other users connected to her database.

- Heikki
Re: [HACKERS] patch: option --if-exists for pg_dump
Hi Pavel,

Consider the following test scenario:

mydb=# \d emp
      Table "public.emp"
 Column | Type    | Modifiers
--------+---------+-----------
 empno  | integer | not null
 deptno | integer |
 ename  | text    |
Indexes:
    "emp_pkey" PRIMARY KEY, btree (empno)
Foreign-key constraints:
    "emp_deptno_fkey" FOREIGN KEY (deptno) REFERENCES dept(deptno)

mydb=# \d dept
      Table "public.dept"
 Column | Type    | Modifiers
--------+---------+-----------
 deptno | integer | not null
 dname  | text    |
Indexes:
    "dept_pkey" PRIMARY KEY, btree (deptno)
Referenced by:
    TABLE "emp" CONSTRAINT "emp_deptno_fkey" FOREIGN KEY (deptno) REFERENCES dept(deptno)

mydb=# \q
jeevan@ubuntu:~/pg_master$ ./install/bin/pg_dump -d mydb --if-exists -c > mydb_ic.dmp

I see the following lines in the dump, which look certainly wrong:
===
DROP FK CONSTRAINT IF EXISTS ublic.emp DROP CONSTRAINT emp_deptno_fkey;
DROP CONSTRAINT IF EXISTS Y public.emp DROP CONSTRAINT emp_pkey;
DROP CONSTRAINT IF EXISTS Y public.dept DROP CONSTRAINT dept_pkey;

When I try to restore, as expected it throws an error:
===
psql:mydb_ic.dmp:14: ERROR:  syntax error at or near "FK"
LINE 1: DROP FK CONSTRAINT IF EXISTS ublic.emp DROP CONSTRAINT emp_d...
             ^
psql:mydb_ic.dmp:15: ERROR:  syntax error at or near "CONSTRAINT"
LINE 1: DROP CONSTRAINT IF EXISTS Y public.emp DROP CONSTRAINT emp_p...
             ^
psql:mydb_ic.dmp:16: ERROR:  syntax error at or near "CONSTRAINT"
LINE 1: DROP CONSTRAINT IF EXISTS Y public.dept DROP CONSTRAINT dept...
             ^

Note:
===
Commands which are in the form ALTER TABLE ... DROP are failing. You need to test each and every object with a DROP .. IF EXISTS command. Better to write a small test case with all objects included.
The following logic has a flaw:
===
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7fc0288..0677712 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -413,8 +413,30 @@ RestoreArchive(Archive *AHX)
 				/* Select owner and schema as necessary */
 				_becomeOwner(AH, te);
 				_selectOutputSchema(AH, te->namespace);
-				/* Drop it */
-				ahprintf(AH, "%s", te->dropStmt);
+
+				if (*te->dropStmt != '\0')
+				{
+					/* Inject IF EXISTS clause when it is required. */
+					if (ropt->if_exists)
+					{
+						char		buffer[40];
+						size_t		l;
+
+						/* But do it only when it is not there yet. */
+						snprintf(buffer, sizeof(buffer), "DROP %s IF EXISTS",
+								 te->desc);
+						l = strlen(buffer);
+
+						if (strncmp(te->dropStmt, buffer, l) != 0)
+						{
+							ahprintf(AH, "DROP %s IF EXISTS %s",
+									 te->desc,
+									 te->dropStmt + l);
+						}
+						else
+							ahprintf(AH, "%s", te->dropStmt);
+					}
+				}

Also:
===
1. This is still out of sync:

@@ -348,6 +350,8 @@ main(int argc, char *argv[])
 		appendPQExpBufferStr(pgdumpopts, " --binary-upgrade");
 	if (column_inserts)
 		appendPQExpBufferStr(pgdumpopts, " --column-inserts");
+	if (if_exists)
+		appendPQExpBufferStr(pgdumpopts, " --if-exists");
 	if (disable_dollar_quoting)
 		appendPQExpBufferStr(pgdumpopts, " --disable-dollar-quoting");
 	if (disable_triggers)

2. Spell check required:

+	/* skip first n chars, and create a modifieble copy */
	modifieble => modifiable

+	/* DROP IF EXISTS pattern is not appliable on dropStmt */
	appliable => applicable

3.
+	/*
+	 * Object description is based on dropStmt statement. But
+	 * a drop statements can be enhanced about IF EXISTS clause.
+	 * We have to increase a offset in this case, IF EXISTS
+	 * should not be included on object description.
+	 */

Looks like you need to rephrase these comment lines. Something like:

/*
 * Object description is based on the dropStmt statement, which may have
 * an IF EXISTS clause. Thus we need to update the offset such that it
 * won't be included in the object description.
 */

Or as per your choice.
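The flaw can be reproduced outside pg_dump with a small model of that branch. This is a sketch in Python for illustration only: `inject_if_exists` is an invented name, and the exact prefix the real patch builds may differ slightly, so the precise garbled fragment varies; the point is that skipping `l` characters of a statement that does not start with the prefix mangles every ALTER TABLE ... DROP command.

```python
def inject_if_exists(desc, drop_stmt):
    # Model of the flawed patch logic: build "DROP <desc> IF EXISTS " and,
    # when drop_stmt does not already start with it, emit the prefix
    # followed by drop_stmt + l.  For ALTER TABLE ... DROP CONSTRAINT
    # statements, the offset chops off the start of the command.
    prefix = "DROP %s IF EXISTS " % desc
    l = len(prefix)
    if not drop_stmt.startswith(prefix):
        return "DROP %s IF EXISTS %s" % (desc, drop_stmt[l:])
    return drop_stmt

stmt = "ALTER TABLE public.emp DROP CONSTRAINT emp_deptno_fkey;"
mangled = inject_if_exists("FK CONSTRAINT", stmt)
print(mangled)
# Begins "DROP FK CONSTRAINT IF EXISTS" but the rest is a clipped fragment
# of the ALTER TABLE command -- invalid SQL, as in the reported dump.
```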
We need careful thought on the bug mentioned above.

Thanks

On Fri, Jan 17, 2014 at 6:23 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
> Hello
>
> 2014/1/16 Jeevan Chalke jeevan.cha...@enterprisedb.com
>> Hi Pavel,
>>
>> I have reviewed the patch and here are my concerns and notes:
>>
>> POSITIVES:
>> ---
>> 1. Patch applies with some white-space errors.
>> 2. make / make install / make check is smooth. No issues as such.
>> 3. Feature looks good as well.
>> 4. NO
Re: [HACKERS] Funny representation in pg_stat_statements.query.
Hello,

I found it can be reasonably fixed by small modifications.

>> - CURRENT_TIME and the like are parsed into nodes that directly
>>   represent the node specs in gram.y
> blah, blah
> Since this becomes more than a simple bug fix, I think it is too late
> for 9.4 now. I'll work on this in the longer term.

The fundamental cause of this issue is a Const node which conveys the location of tokens other than constants. Any other nodes, for instance TypeCasts, are safe. So this is fixed in quite a simple way like the following, getting rid of the previously mentioned difficulties of keeping the code sane and of the total loss of token locations. (White space is collapsed for readability.)

| --- a/src/backend/parser/gram.y
| +++ b/src/backend/parser/gram.y
| @@ -11355,8 +11355,8 @@ func_expr_common_subexpr:
|              * to rely on it.)
|              */
|             Node *n;
| -           n = makeStringConstCast("now", @1, SystemTypeName("text"));
| -           $$ = makeTypeCast(n, SystemTypeName("date"), -1);
| +           n = makeStringConstCast("now", -1, SystemTypeName("text"));
| +           $$ = makeTypeCast(n, SystemTypeName("date"), @1);

Any suggestions? Attached is the complete patch.

As of 9.4dev, transformTypeCast passes the location retrieved from the source TypeCast node itself, or from its TypeName subnode, to the newly created CoerceViaIO node, but leaves that of the inner expression (the "now", I mean) as it is, so I think this fits what transformTypeCast does better than before. The query tree after the transformation looks like the following. I think this is preferable to -1 on the CoerceViaIO and 7 on the Const.

... {
  type = TargetEntry
  expr = {
    type = CoerceViaIO, location = 7,
    arg = { type = Const, location = -1 }
  }
}

In addition, the location stored by makeStringConstCast currently seems to have no route to the user's sight at any step out of the node -- in short, it is 'useless'. And yet the location of the CoerceViaIO is also not used...

| | CURRENT_DATE ..
| - n = makeStringConstCast("now", @1, SystemTypeName("text")
| + n = makeStringConstCast("nowe", @1, SystemTypeName("text")

yields only this, including logs:
| =# select CURRENT_DATE;
| ERROR:  invalid input syntax for type date: "nowe"

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 12a6beb..a14a381 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11399,8 +11399,8 @@ func_expr_common_subexpr:
 				 * to rely on it.)
 				 */
 				Node *n;
-				n = makeStringConstCast("now", @1, SystemTypeName("text"));
-				$$ = makeTypeCast(n, SystemTypeName("date"), -1);
+				n = makeStringConstCast("now", -1, SystemTypeName("text"));
+				$$ = makeTypeCast(n, SystemTypeName("date"), @1);
 			}
 		| CURRENT_TIME
 			{
@@ -11409,8 +11409,8 @@ func_expr_common_subexpr:
 				 * See comments for CURRENT_DATE.
 				 */
 				Node *n;
-				n = makeStringConstCast("now", @1, SystemTypeName("text"));
-				$$ = makeTypeCast(n, SystemTypeName("timetz"), -1);
+				n = makeStringConstCast("now", -1, SystemTypeName("text"));
+				$$ = makeTypeCast(n, SystemTypeName("timetz"), @1);
 			}
 		| CURRENT_TIME '(' Iconst ')'
 			{
@@ -11420,10 +11420,10 @@ func_expr_common_subexpr:
 				 */
 				Node *n;
 				TypeName *d;
-				n = makeStringConstCast("now", @1, SystemTypeName("text"));
+				n = makeStringConstCast("now", -1, SystemTypeName("text"));
 				d = SystemTypeName("timetz");
 				d->typmods = list_make1(makeIntConst($3, @3));
-				$$ = makeTypeCast(n, d, -1);
+				$$ = makeTypeCast(n, d, @1);
 			}
 		| CURRENT_TIMESTAMP
 			{
@@ -11441,10 +11441,10 @@ func_expr_common_subexpr:
 				 */
 				Node *n;
 				TypeName *d;
-				n = makeStringConstCast("now", @1, SystemTypeName("text"));
+				n = makeStringConstCast("now", -1, SystemTypeName("text"));
 				d = SystemTypeName("timestamptz");
 				d->typmods = list_make1(makeIntConst($3, @3));
-				$$ = makeTypeCast(n, d, -1);
+				$$ = makeTypeCast(n, d, @1);
 			}
 		| LOCALTIME
 			{
@@ -11453,8 +11453,8 @@ func_expr_common_subexpr:
 				 * See comments for CURRENT_DATE.
 				 */
 				Node *n;
-				n = makeStringConstCast("now", @1, SystemTypeName("text"));
-				$$ = makeTypeCast((Node *)n, SystemTypeName("time"), -1);
+				n = makeStringConstCast("now", -1, SystemTypeName("text"));
+				$$ = makeTypeCast((Node *)n, SystemTypeName("time"), @1);
 			}
 		| LOCALTIME '(' Iconst ')'
 			{
@@ -11464,10 +11464,10 @@ func_expr_common_subexpr:
 				 */
 				Node *n;
 				TypeName *d;
-				n = makeStringConstCast("now", @1, SystemTypeName("text"));
+				n = makeStringConstCast("now", -1, SystemTypeName("text"));
 				d = SystemTypeName("time");
 				d->typmods = list_make1(makeIntConst($3, @3));
-				$$ = makeTypeCast((Node *)n, d, -1);
+				$$ = makeTypeCast((Node *)n, d, @1);
 			}
 		| LOCALTIMESTAMP
 			{
@@ -11476,8 +11476,8 @@ func_expr_common_subexpr:
 				 * See comments for CURRENT_DATE.
 				 */
 				Node *n;
-				n =
Re: [HACKERS] proposal: hide application_name from other users
On 01/21/2014 04:19 PM, Heikki Linnakangas wrote:
> On 01/21/2014 07:22 AM, Harold Giménez wrote:
>> This patch prevents non-superusers from viewing other users'
>> pg_stat_activity.application_name. [...]
>
> I don't find these arguments compelling enough to change it now. It's
> well documented that application_name is visible to everyone; just don't
> put sensitive information there. For those users who don't mind
> advertising application_name, the patch would be highly inconvenient.
> For example, the database owner could no longer see the application_name
> of other users connected to her database.

It also means that monitoring tools must run as superuser to see information they require, which to me is a total showstopper.

If you want control over the visibility of application_name, it should be done with a column privilege granted to a system role, or something like that -- so that the ability to see it can be given to public by default (thus not breaking backward compatibility) and, if it's revoked from public, given to the roles that need to see it.
--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] WIP patch (v2) for updatable security barrier views
On 21 January 2014 01:09, KaiGai Kohei kai...@ak.jp.nec.com wrote:
> (2014/01/13 22:53), Craig Ringer wrote:
>> On 01/09/2014 11:19 PM, Tom Lane wrote:
>>> Dean Rasheed dean.a.rash...@gmail.com writes:
>>>> My first thought was that it should just preprocess any security
>>>> barrier quals in subquery_planner() in the same way as other quals are
>>>> preprocessed. But thinking about it further, those quals are destined
>>>> to become the quals of subqueries in the range table, so we don't
>>>> actually want to preprocess them at that stage --- that will happen
>>>> later when the new subquery is planned by recursion back into
>>>> subquery_planner(). So I think the right answer is to make
>>>> adjust_appendrel_attrs() handle recursion into sublink subqueries.
>>>
>>> TBH, this sounds like doubling down on a wrong design choice. I see no
>>> good reason that updatable security views should require any
>>> fundamental rearrangements of the order of operations in the planner.
>>
>> In that case, would you mind offering a quick sanity check on the
>> following alternative idea:
>>
>> - Add sourceRelation to Query. This refers to the RTE that supplies
>>   tuple projections to feed into ExecModifyTable, with appropriate
>>   resjunk ctid and (if requ'd) oid cols present.
>>
>> - When expanding a target view into a subquery, set sourceRelation on
>>   the outer view to the index of the RTE of the newly expanded subquery.
>
> I have the same opinion. The existing assumption -- that resultRelation
> is the RTE of the table to be read, not only modified -- makes the
> implementation complicated. If we had a separate sourceRelation, it
> would allow a flexible form including a subquery with security_barrier
> on the reader side.
>
> Could you tell me the direction you're inclined toward right now? I
> wonder whether I should take the latest patch submitted on 09-Jan for
> deep code review and testing.

Yes, please review the patch from 09-Jan
(http://www.postgresql.org/message-id/CAEZATCUiKxOg=voovja2s6g-sixzzxg18totggp8zobq6qn...@mail.gmail.com).
The idea behind that patch is to avoid a lot of the complication that leads to and then arises from having a separate sourceRelation in the query. If you go down the route of expanding the subquery in the rewriter and making it the sourceRelation, then you have to make extensive changes to preprocess_targetlist so that it recursively descends into the subquery to pull out variables needed by an update. Also you would probably need additional changes in the rewriter so that later stages didn't trample on the sourceRelation, and correctly updated it in the case of views on top of other views. Also, you would need to make changes to the inheritance planner code, because you'd now have 2 RTEs referring to the target relation (resultRelation and sourceRelation wrapping it in a subquery). Referring to the comment in inheritance_planner(): * Source inheritance is expanded at the bottom of the * plan tree (see allpaths.c), but target inheritance has to be expanded at * the top. except that in the case of the sourceRelation, it is actually effectively the target, which means it shouldn't be expanded, otherwise you'd get plans like the ones I complained about upthread (see the final explain output in http://www.postgresql.org/message-id/caezatcu3vcdkjpnhgfkrmrkz0axkcuh4ce_kqq3z2hzknhi...@mail.gmail.com). Perhaps all of that could be made to work, but I think that it would end up being a far more invasive patch than my 09-Jan patch. My patch avoids both those issues by not doing the subquery expansion until after inheritance expansion, and after targetlist preprocessing. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN improvements part 1: additional information
On 01/21/2014 04:02 AM, Tomas Vondra wrote:
> On 20.1.2014 19:30, Heikki Linnakangas wrote:
>> Attached is yet another version, with more bugs fixed and more comments
>> added and updated. I would appreciate some heavy testing of this patch
>> now. If you could re-run the tests you've been using, that would be
>> great. I've tested the WAL replay by replicating GIN operations over
>> streaming replication. That doesn't guarantee it's correct, but it's a
>> good smoke test.
>
> I gave it a try - the OOM error seems to be gone, but now I get this:
>
>   PANIC: cannot insert duplicate items to GIN index page
>
> This only happens when building the index incrementally (i.e. using a
> sequence of INSERT statements into a table with a GIN index). When I
> create a new index on a table (already containing the same dataset) it
> works just fine. Also, I tried to reproduce the issue by running a
> simple plpgsql loop (instead of a complex python script):
>
> DO LANGUAGE plpgsql $$
> DECLARE
>     r tsvector;
> BEGIN
>     FOR r IN SELECT body_tsvector FROM data_table LOOP
>         INSERT INTO idx_table (body_tsvector) VALUES (r);
>     END LOOP;
> END$$;
>
> where data_table is the table with imported data (the same data I
> mentioned in the post about OOM errors), and idx_table is an empty table
> with a GIN index. And indeed it fails, but only if I run the block in
> multiple sessions in parallel.

Oh, I see what's going on. I had assumed that there cannot be duplicate insertions into the posting tree, but that's dead wrong. The fast insertion mechanism depends on a duplicate insertion doing nothing. Will fix, thanks for the testing!

- Heikki
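The invariant Heikki describes -- inserting an item that is already present must be a silent no-op rather than an error -- can be illustrated with a toy model (this is Python for illustration, not Postgres code; a posting "page" is modeled as a sorted list of (block, offset) item pointers, and `posting_insert` is an invented name):

```python
import bisect

def posting_insert(page, item):
    """Insert item keeping page sorted; treat a duplicate as a no-op."""
    i = bisect.bisect_left(page, item)
    if i < len(page) and page[i] == item:
        return False          # already present: silently do nothing
    page.insert(i, item)
    return True

page = []
posting_insert(page, (2, 5))
posting_insert(page, (1, 3))
print(posting_insert(page, (2, 5)))  # False: a concurrent re-insert is a no-op
print(page)                          # [(1, 3), (2, 5)]
```

The bug report corresponds to the duplicate case raising a PANIC instead of returning quietly.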
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sun, Dec 15, 2013 at 2:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Greg Stark st...@mit.edu writes:
>> On 14 Dec 2013 15:40, Tom Lane t...@sss.pgh.pa.us wrote:
>>> I think you *can't* cover them for the float types; roundoff error
>>> would mean you don't get the same answers as before.
>>
>> I was going to say the same thing. But then I started to wonder: what's
>> so special about the answers we used to give? They are also subject to
>> round-off, and the results are already quite questionable in those cases.
>
> Well, we can't easily do better than the old answers, and the new ones
> might be arbitrarily worse. Example: sum or average across single-row
> windows ought to be exact in any case, but it might be arbitrarily wrong
> with the negative-transition technique. More generally, this is supposed
> to be a performance enhancement only; it's not supposed to change the
> results.

It came to me that it might be possible to implement inverse transitions for floating point aggregates by just detecting if precision has been lost during forward transitions. I've written the test to do this as:

IF state.value + value = state.value AND value > 0 THEN
    newstate.precision_lost := true;
    newstate.value := state.value;
ELSE
    newstate.precision_lost := false;
    newstate.value := state.value + value;
END IF;

The inverse transition function checks precision_lost and, if it's true, returns NULL. The core code is now implemented (thanks to Florian) to re-aggregate when NULL is returned from the inverse transition function. I've attached an implementation of this with the transition functions written in plpgsql.
I don't really know for sure yet if it can handle all cases and give the exact same results as it would without inverse transitions, but it certainly fixes the error case which was presented. Using the attached on HEAD of https://github.com/david-rowley/postgres/commits/invtrans:

explain (analyze, verbose) select mysum(v) over (order by i rows between current row and unbounded following) from (values(1,1e20),(2,1)) b(i,v);

gives me the expected results of 1e20 and 1, instead of my original attempt which gave 1e20 and 0.

I guess the extra tracking on forward transition might mean this would not be practical to implement in C for sum(float), but I just wanted to run the idea through a few heads to see if anyone can present a case where it can still produce wrong results. If it seems sound enough, then I may implement it in C to see how much overhead it adds to forward aggregation for floating point types. But even if it did add too much overhead to forward aggregation, it might be worth allowing aggregates to have 2 forward transition functions, and if the 2nd one exists then it could be used in window functions where the frame does not have unbounded following.

Any thoughts?
Regards

David Rowley

BEGIN WORK;

CREATE TYPE float_state AS (precision_lost bool, value float);

CREATE OR REPLACE FUNCTION float_sum(state float_state, value float)
RETURNS float_state AS $$
DECLARE newstate float_state;
BEGIN
    IF state IS NULL THEN
        IF value IS NULL THEN
            RETURN NULL;
        ELSE
            newstate.value := value;
            newstate.precision_lost := false;
            RETURN newstate;
        END IF;
    END IF;

    IF state.value + value = state.value AND value > 0 THEN
        newstate.precision_lost := true;
        newstate.value := state.value;
    ELSE
        newstate.precision_lost := false;
        newstate.value := state.value + value;
    END IF;
    RETURN newstate;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE OR REPLACE FUNCTION float_sum_inv(state float_state, value float)
RETURNS float_state AS $$
DECLARE newstate float_state;
BEGIN
    IF state.precision_lost = true THEN
        RETURN NULL;
    ELSE
        newstate.value := state.value - value;
        RETURN newstate;
    END IF;
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;

CREATE OR REPLACE FUNCTION float_sum_final(state float_state)
RETURNS float AS $$
BEGIN
    IF NOT (state IS NULL) THEN
        RETURN state.value;
    ELSE
        RETURN NULL;
    END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE AGGREGATE mysum (float)
(
    stype = float_state,
    sfunc = float_sum,
    invfunc = float_sum_inv,
    finalfunc = float_sum_final
);

select mysum(v) from (values(1,1e20),(2,1)) b(i,v);

-- forces re-aggregate due to precision loss
--explain (analyze, verbose) select mysum(v) over (order by i rows between current row and unbounded following) from (values(1,1e20),(2,1)) b(i,v);

-- does not force re-aggregate.
--explain (analyze, verbose) select mysum(v) over (order by i rows between current row and unbounded following) from (values(1,1),(2,2),(3,3)) b(i,v);

rollback;
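The detection rule above can be exercised outside the database, since Python floats are IEEE doubles like float8. A minimal sketch (`float_sum_step` is an invented name; unlike the posted plpgsql, the flag is kept sticky across steps here, which seems to be the intent):

```python
def float_sum_step(state, value):
    """Forward transition: state is (precision_lost, running_sum) or None."""
    if state is None:
        return (False, value)
    lost, acc = state
    if acc + value == acc and value > 0:
        return (True, acc)        # the addend vanished entirely: flag it
    return (lost, acc + value)    # note: flag kept sticky here

s = None
for v in (1e20, 1.0):
    s = float_sum_step(s, v)
print(s)  # (True, 1e+20): adding 1.0 to 1e20 was completely absorbed
```

With the flag set, the inverse transition would return NULL and force re-aggregation, which is exactly what repairs the 1e20/1 example.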
[HACKERS] alternative back-end block formats
Hi all, I'm playing around with Postgres, and I thought it might be fun to experiment with alternative formats for relation blocks, to see if I can get smaller files and/or faster server performance. Does anyone know if this has been done before with Postgres? I would have assumed yes, but I'm not finding anything in Google about people having done this. Thanks, Christian
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan 21, 2014, at 10:53, David Rowley dgrowle...@gmail.com wrote:
> It came to me that it might be possible to implement inverse transitions
> for floating point aggregates by just detecting if precision has been
> lost during forward transitions. I've written the test to do this as:
>
> IF state.value + value = state.value AND value > 0 THEN
>     newstate.precision_lost := true;
>     newstate.value := state.value;
> ELSE
>     newstate.precision_lost := false;
>     newstate.value := state.value + value;
> END IF;
>
> The inverse transition function checks precision_lost and, if it's true,
> returns NULL. The core code is now implemented (thanks to Florian) to
> re-aggregate when NULL is returned from the inverse transition function.

That's not sufficient, I fear. You can lose all significant digits of the value and still have precision_lost = false afterwards. Try summing over 1e16, 1.01.

SELECT 1e16::float8 + 1.01::float8 = 1e16::float8 returns FALSE, yet SELECT 1e16::float8 + 1.01::float8 - 1e16::float8 returns 2, where 1.01 would have been correct. That's still too much precision loss.

I'm quite certain that the general idea has merit, but the actual invertibility condition is going to be more involved. If you want to play with this, I think the first step has to be to find a set of guarantees that SUM(float) is supposed to meet. Currently, SUM(float) guarantees that if the summands all have the same sign, the error is bounded by C * N, where C is some (machine-specific?) constant (about 1e-15 or so), and N is the number of input rows. Or at least so I think from looking at SUMs over floats in descending order, which I guess is the worst case.

You could, for example, try to see if you can find an invertibility condition which guarantees the same but allows C to be larger. That would put a bound on the number of digits lost by the new SUM(float) compared to the old one.

I don't have high hopes for this getting into 9.4, though.
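Florian's counterexample can be checked directly, since Python floats have the same IEEE-double behavior as float8:

```python
# The ULP (spacing between representable doubles) at 1e16 is 2, so
# 1e16 + 1.01 rounds to 1e16 + 2: the sum visibly changes, yet almost
# all of the addend's precision is gone.
big, small = 1e16, 1.01
s = big + small
assert s != big        # so the "did the addend vanish?" test does NOT fire
print(s - big)         # 2.0 -- what an inverse transition would recover,
                       # instead of the correct 1.01
```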
> If it seems sound enough, then I may implement it in C to see how much
> overhead it adds to forward aggregation for floating point types. But
> even if it did add too much overhead to forward aggregation, it might
> be worth allowing aggregates to have 2 forward transition functions,
> and if the 2nd one exists then it could be used in window functions
> where the frame does not have unbounded following.

I don't think adding yet another type of aggregation function is the solution here.

best regards,
Florian Pflug
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
Rebased patch is attached. pg_stat_statements in PG9.4dev has already changed its view columns, so I hope this patch will be committed in PG9.4dev.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

*** a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
--- b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
*** 19,24 **** CREATE FUNCTION pg_stat_statements(
--- 19,27 ----
  	OUT query text,
  	OUT calls int8,
  	OUT total_time float8,
+ 	OUT min_time float8,
+ 	OUT max_time float8,
+ 	OUT stdev_time float8,
  	OUT rows int8,
  	OUT shared_blks_hit int8,
  	OUT shared_blks_read int8,
*** 41,43 **** CREATE VIEW pg_stat_statements AS
--- 44,52 ----
      SELECT * FROM pg_stat_statements();

  GRANT SELECT ON pg_stat_statements TO PUBLIC;
+
+ /* New Function */
+ CREATE FUNCTION pg_stat_statements_reset_time()
+ RETURNS void
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C;

*** a/contrib/pg_stat_statements/pg_stat_statements--1.2.sql
--- b/contrib/pg_stat_statements/pg_stat_statements--1.2.sql
*** 9,14 **** RETURNS void
--- 9,19 ----
  AS 'MODULE_PATHNAME'
  LANGUAGE C;

+ CREATE FUNCTION pg_stat_statements_reset_time()
+ RETURNS void
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C;
+
  CREATE FUNCTION pg_stat_statements(
  	OUT userid oid,
  	OUT dbid oid,
*** 16,21 **** CREATE FUNCTION pg_stat_statements(
--- 21,29 ----
  	OUT query text,
  	OUT calls int8,
  	OUT total_time float8,
+ 	OUT min_time float8,
+ 	OUT max_time float8,
+ 	OUT stdev_time float8,
  	OUT rows int8,
  	OUT shared_blks_hit int8,
  	OUT shared_blks_read int8,
*** 42,44 **** GRANT SELECT ON pg_stat_statements TO PUBLIC;
--- 50,53 ----
  -- Don't want this to be available to non-superusers.
  REVOKE ALL ON FUNCTION pg_stat_statements_reset() FROM PUBLIC;
+ REVOKE ALL ON FUNCTION pg_stat_statements_reset_time() FROM PUBLIC;

*** a/contrib/pg_stat_statements/pg_stat_statements--unpackaged--1.0.sql
--- b/contrib/pg_stat_statements/pg_stat_statements--unpackaged--1.0.sql
*** 4,8 ****
--- 4,9 ----
  \echo Use "CREATE EXTENSION pg_stat_statements" to load this file. \quit

  ALTER EXTENSION pg_stat_statements ADD function pg_stat_statements_reset();
+ ALTER EXTENSION pg_stat_statements ADD function pg_stat_statements_reset_time();
  ALTER EXTENSION pg_stat_statements ADD function pg_stat_statements();
  ALTER EXTENSION pg_stat_statements ADD view pg_stat_statements;

*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*** 78,83 **** static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
--- 78,84 ----
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define STICKY_DECREASE_FACTOR	(0.50)	/* factor for sticky entries */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
+ #define EXEC_TIME_INIT			(-1)	/* initial execution time */

  #define JUMBLE_SIZE				1024	/* query serialization buffer size */
*** 114,119 **** typedef struct Counters
--- 115,123 ----
  {
  	int64		calls;			/* # of times executed */
  	double		total_time;		/* total execution time, in msec */
+ 	double		total_sqtime;	/* cumulated square execution time, in msec */
+ 	double		min_time;		/* minimum execution time, in msec */
+ 	double		max_time;		/* maximum execution time, in msec */
  	int64		rows;			/* total # of retrieved or affected rows */
  	int64		shared_blks_hit;	/* # of shared buffer hits */
  	int64		shared_blks_read;	/* # of shared disk blocks read */
*** 237,245 **** void _PG_init(void);
--- 241,251 ----
  void		_PG_fini(void);

  Datum		pg_stat_statements_reset(PG_FUNCTION_ARGS);
+ Datum		pg_stat_statements_reset_time(PG_FUNCTION_ARGS);
  Datum		pg_stat_statements(PG_FUNCTION_ARGS);

  PG_FUNCTION_INFO_V1(pg_stat_statements_reset);
+ PG_FUNCTION_INFO_V1(pg_stat_statements_reset_time);
  PG_FUNCTION_INFO_V1(pg_stat_statements);

  static void pgss_shmem_startup(void);
*** 266,271 **** static pgssEntry *entry_alloc(pgssHashKey *key, const char *query,
--- 272,278 ----
  			  int query_len, bool sticky);
  static void entry_dealloc(void);
  static void entry_reset(void);
+ static void entry_reset_time(void);
  static void AppendJumble(pgssJumbleState *jstate,
  			 const unsigned char *item, Size size);
  static void JumbleQuery(pgssJumbleState *jstate, Query *query);
*** 1046,1051 **** pgss_store(const char *query, uint32 queryId,
--- 1053,1059 ----
  		e->counters.calls += 1;
  		e->counters.total_time += total_time;
+ 		e->counters.total_sqtime += total_time * total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
  		e->counters.shared_blks_read += bufusage->shared_blks_read;
*** 1061,1066 **** pgss_store(const char *query, uint32
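The statistics the patch tracks per statement (calls, total_time, total_sqtime, min_time, max_time) let stdev be derived at read time as sqrt(E[t²] − E[t]²). A sketch of that derivation, in Python for illustration (`summarize` is an invented name; note that this textbook formula can suffer catastrophic cancellation when the mean is large relative to the variance):

```python
import math

def summarize(times):
    """Model of the per-statement counters and the stdev derived from them."""
    calls = len(times)
    total = sum(times)
    total_sq = sum(t * t for t in times)   # the patch's total_sqtime counter
    mean = total / calls
    stdev = math.sqrt(total_sq / calls - mean * mean)
    return {"calls": calls, "min": min(times), "max": max(times),
            "stdev": stdev}

print(summarize([1.0, 2.0, 3.0, 4.0])["stdev"])  # ~1.118 (population stdev)
```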
Re: [HACKERS] Fix memset usage in pgcrypto
On Mon, Jan 20, 2014 at 06:49:21PM -0300, Alvaro Herrera wrote:
> Marko Kreen escribió:
>> http://www.viva64.com/en/b/0227/ reported that on-stack memset()s
>> might be optimized away by compilers. Fix it.
>
> Just to clarify, this needs to be applied to all branches, right? If
> so, does the version submitted apply cleanly to all of them?

It does apply cleanly. It is not a critical fix, but it's simple, so I think it should be back-patched.

--
marko
Re: [HACKERS] proposal: hide application_name from other users
* Craig Ringer (cr...@2ndquadrant.com) wrote: It also means that monitoring tools must run as superuser to see information they require, which to me is a total showstopper. We've already got *far* too much of that going on for my taste. I'd love to see a comprehensive solution to this problem which allows monitoring systems to run w/o superuser privileges. If you want control over visibility of application_name, it should be done with a column privilege granted to a system role, or something like that - so the ability to see it can be given to public by default (thus not breaking BC) and, if it's revoked from public, given to roles that need to see it. I agree with this - individuals should be able to control access to this information for their databases/clusters. Thanks, Stephen
Re: [HACKERS] GIN improvements part 1: additional information
On Mon, Jan 20, 2014 at 10:30 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/17/2014 08:49 PM, Alexander Korotkov wrote: On Fri, Jan 17, 2014 at 10:38 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/17/2014 01:05 PM, Alexander Korotkov wrote: Seems to be fixed in the attached version of patch. Another improvement in this version: only the left-most modified segment and those after it are re-encoded. The left part of the page is left untouched. I'm looking into this now. A few quick notes: * Even when you don't re-encode the whole page, you still WAL-log the whole page. While correct, it'd be a pretty obvious optimization to only WAL-log the modified part. Oh, I overlooked it. I wrote code in ginRedoInsertData which finds the appropriate place for data but didn't notice that I still wrote the whole page to the xlog record. Yeah. I think ginRedoInsertData was too cute to be bug-free. I added an explicit unmodifiedsize field to the WAL record, so that ginRedoInsertData doesn't need to calculate it. * When new items are appended to the end of the page so that only the last existing compressed segment is re-encoded, and the page has to be split, the items aren't divided 50/50 on the pages. The loop that moves segments destined for the left page to the right won't move any existing, untouched, segments. I think this loop will move one segment. However, it's too small. Implemented this. I noticed that the gin vacuum redo routine is dead code, except for the data-leaf page handling, because we never remove entries or internal nodes (page deletion is a separate wal record type). And the data-leaf case is functionally equivalent to heap newpage records. I removed the dead code and made it more clear that it resembles heap newpage. Attached is yet another version, with more bugs fixed and more comments added and updated. I would appreciate some heavy testing of this patch now. If you could re-run the tests you've been using, that would be great. 
I've tested the WAL replay by replicating GIN operations over streaming replication. That doesn't guarantee it's correct, but it's a good smoke test. I tried my test suite, but it hangs in an infinite loop on an index scan. I re-tried it on my laptop with -O0. I found it to crash on update and vacuum in some random places like: Assert(GinPageIsData(page)); in xlogVacuumPage Assert(ndecoded == totalpacked); in ginCompressPostingList Trying to debug it. -- With best regards, Alexander Korotkov.
Re: [HACKERS] ALTER SYSTEM SET typos and fix for temporary file name management
On Mon, Jan 20, 2014 at 2:12 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Sat, Jan 18, 2014 at 7:59 PM, Michael Paquier michael.paqu...@gmail.com wrote: Hi all, After going through commit 65d6e4c (introducing ALTER SYSTEM SET), I noticed a couple of typo mistakes as well as (I think) a weird way of using the temporary auto-configuration name postgresql.auto.conf.temp in two different places, resulting in the patch attached. .tmp suffix is used at couple other places in code as well. snapmgr.c XactExportFilePath(pathtmp, topXid, list_length(exportedSnapshots), .tmp); receivelog.c snprintf(tmppath, MAXPGPATH, %s.tmp, path); Although similar suffix is used at other places, but still I think it might be better to define for current case as here prefix (postgresql.auto.conf) is also fixed and chances of getting it changed are less. However even if we don't do, it might be okay as we are using such suffixes at other places as well. In the case of snapmgr.c and receivelog.c, the operations are kept local, so it does not matter much if this way of naming is done as all the operations for a temporary file are kept within the same, isolated function. However, the case of postgresql.auto.conf.temp is different: this temporary file name is used as well for a check in basebackup.c, so I imagine that it would be better to centralize this information in a single place. Now this name is only used in two places, but who knows if some additional checks here are there won't be needed for this temp file... postgresql.auto.conf.temp is also located at the root of PGDATA, so it might be surprising for the user to find it even if there are few chances that it can happen. It might be an overkill to use a dedicated variable for the temporary autoconfiguration file (here PG_AUTOCONF_FILENAME_TEMP), but I found kind of weird the way things are currently done on master branch. IMO, it would reduce bug possibilities to have everything managed with a single variable. 
Typos at least should be fixed :) - appendStringInfoString(buf, # Do not edit this file manually! \n); - appendStringInfoString(buf, # It will be overwritten by ALTER SYSTEM command. \n); + appendStringInfoString(buf, # Do not edit this file manually!\n); + appendStringInfoString(buf, # It will be overwritten by ALTER SYSTEM command.\n); Same change in initdb.c is missing. Thanks, I missed it. -- Michael commit a8cf33deb3c46c4f56f1e617bbd3142cbbd66f1b Author: Michael Paquier mich...@otacoo.com Date: Tue Jan 21 21:44:30 2014 +0900 Fix typos and temporary file management in ALTER SYSTEM Using a temporary file uniquely defined in a single place reduces the occurrence of bugs related to it. Some typos are fixed at the same time. diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c index fc35f5b..a31bcdf 100644 --- a/src/backend/replication/basebackup.c +++ b/src/backend/replication/basebackup.c @@ -834,11 +834,10 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces) /* skip auto conf temporary file */ if (strncmp(de->d_name, - PG_AUTOCONF_FILENAME .temp, - sizeof(PG_AUTOCONF_FILENAME) + 4) == 0) + PG_AUTOCONF_FILENAME_TEMP, + sizeof(PG_AUTOCONF_FILENAME_TEMP)) == 0) continue; - /* * If there's a backup_label file, it belongs to a backup started by * the user with pg_start_backup(). It is *not* correct for this diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 1217098..7da52bf 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -6457,9 +6457,9 @@ flatten_set_variable_args(const char *name, List *args) } /* - * Writes updated configuration parameter values into - * postgresql.auto.conf.temp file. It traverses the list of parameters - * and quote the string values before writing them to temporaray file. + * Write updated configuration parameter values into + * file PG_AUTOCONF_FILENAME_TEMP. 
It traverses the list of parameters + * and quotes the string values before writing them to temporary file. */ static void write_auto_conf_file(int fd, const char *filename, ConfigVariable **head_p) @@ -6468,8 +6468,8 @@ write_auto_conf_file(int fd, const char *filename, ConfigVariable **head_p) StringInfoData buf; initStringInfo(buf); - appendStringInfoString(buf, # Do not edit this file manually! \n); - appendStringInfoString(buf, # It will be overwritten by ALTER SYSTEM command. \n); + appendStringInfoString(buf, # Do not edit this file manually!\n); + appendStringInfoString(buf, # It will be overwritten by ALTER SYSTEM command.\n); /* * write the file header message before contents, so that if there is no @@ -6596,7 +6596,7 @@ replace_auto_config_value(ConfigVariable **head_p, ConfigVariable **tail_p, * * The configuration parameters are written to a temporary * file then renamed to the final name. The template for the - * temporary file is
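The pattern this patch centralizes - write the new contents to a uniquely named temporary file, then rename() it over the final name - is what makes the rewrite crash-safe, since rename() replaces the target atomically on POSIX filesystems. A minimal sketch of the idea (the paths and header text are illustrative; a production version would also fsync the file and its directory):

```c
#include <stdio.h>
#include <string.h>

/* Atomically replace "path" with new contents: write everything to
 * "path.temp" first, then rename() it into place.  Readers see either
 * the complete old file or the complete new file, never a torn write. */
static int write_file_atomic(const char *path, const char *contents)
{
    char tmppath[1024];
    FILE *fp;

    snprintf(tmppath, sizeof(tmppath), "%s.temp", path);
    if ((fp = fopen(tmppath, "w")) == NULL)
        return -1;
    fputs("# Do not edit this file manually!\n", fp);
    fputs(contents, fp);
    if (fclose(fp) != 0)        /* flush; a real version would fsync too */
        return -1;
    return rename(tmppath, path);
}
```

Centralizing the ".temp" suffix in one constant (as the patch does with PG_AUTOCONF_FILENAME_TEMP) keeps the writer and the basebackup skip-check from silently diverging.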
Re: [HACKERS] GIN improvements part 1: additional information
On Tue, Jan 21, 2014 at 4:28 PM, Alexander Korotkov aekorot...@gmail.com wrote: I noticed that the gin vacuum redo routine is dead code, except for the data-leaf page handling, because we never remove entries or internal nodes (page deletion is a separate wal record type). And the data-leaf case is functionally equivalent to heap newpage records. I removed the dead code and made it more clear that it resembles heap newpage. Attached is yet another version, with more bugs fixed and more comments added and updated. I would appreciate some heavy testing of this patch now. If you could re-run the tests you've been using, that would be great. I've tested the WAL replay by replicating GIN operations over streaming replication. That doesn't guarantee it's correct, but it's a good smoke test. I tried my test suite, but it hangs in an infinite loop on an index scan. I re-tried it on my laptop with -O0. I found it to crash on update and vacuum in some random places like: Assert(GinPageIsData(page)); in xlogVacuumPage Assert(ndecoded == totalpacked); in ginCompressPostingList Trying to debug it. Another question is about dataPlaceToPageLeaf: while ((Pointer) seg < segend) { if (ginCompareItemPointers(minNewItem, seg->first) < 0) break; Shouldn't we adjust seg to the previous segment? If minNewItem is less than seg->first, we should insert it into the previous segment. -- With best regards, Alexander Korotkov.
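For readers following the thread: the compressed posting-list segments being debugged here store sorted item pointers as varbyte-encoded deltas. A simplified sketch of that encoding over plain 64-bit keys (the real GIN segment format encodes 6-byte ItemPointers and differs in detail):

```c
#include <stddef.h>
#include <stdint.h>

/* Varbyte-encode the deltas of a sorted uint64 array: each delta is
 * emitted 7 bits at a time, low bits first, with the high bit of a
 * byte meaning "more bytes follow".  Small deltas - the common case
 * in a sorted posting list - take a single byte. */
static size_t encode_deltas(const uint64_t *items, size_t n, uint8_t *out)
{
    size_t len = 0;
    uint64_t prev = 0;

    for (size_t i = 0; i < n; i++)
    {
        uint64_t delta = items[i] - prev;

        prev = items[i];
        while (delta >= 0x80)
        {
            out[len++] = (uint8_t) (delta | 0x80);
            delta >>= 7;
        }
        out[len++] = (uint8_t) delta;
    }
    return len;
}

/* Decode back to absolute values; returns the number of items. */
static size_t decode_deltas(const uint8_t *in, size_t len, uint64_t *items)
{
    size_t n = 0, pos = 0;
    uint64_t prev = 0;

    while (pos < len)
    {
        uint64_t delta = 0;
        int shift = 0;

        while (in[pos] & 0x80)
        {
            delta |= (uint64_t) (in[pos++] & 0x7F) << shift;
            shift += 7;
        }
        delta |= (uint64_t) in[pos++] << shift;
        prev += delta;
        items[n++] = prev;
    }
    return n;
}
```

Because each segment stores deltas from its first item, an item smaller than seg->first cannot be appended to seg - which is exactly why the question above about stepping back to the previous segment matters.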
Re: [HACKERS] [bug fix] pg_ctl always uses the same event source
From: Amit Kapila amit.kapil...@gmail.com How about below message: event source event_source_name is already registered. Thanks. I'll use this, making the initial letter the capital E like other messages in the same source file. I'm going to submit the final patch and update the CommitFest entry tomorrow at the earliest. Today, I reviewed the patch again and found it okay, except a small inconsistency which is about default event source name in postgresql.conf, all other places it's changed except in .conf file. Do you think it makes sense to change there as well? Oh, I missed it. postgresql.conf.sample says: # The commented-out settings shown in this file represent the default values. To follow this, we have the line as: #event_source = 'PostgreSQL 9.4' But this requires us to change this line for each major release. That's a maintenance headache. Another idea is: #event_source = 'PostgreSQL major_release' But this is not the exact default value. So I want to leave the line as now. Anyway, some other lines in postgresql.conf.sample do not represent the default value, either,: #data_directory = 'ConfigDir' # use data in another directory #max_stack_depth = 2MB # min 100kB #include_if_exists = 'exists.conf' # include file only if it exists #include = 'special.conf' # include file Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: hide application_name from other users
Stephen Frost wrote: * Craig Ringer (cr...@2ndquadrant.com) wrote: It also means that monitoring tools must run as superuser to see information they require, which to me is a total showstopper. We've already got *far* too much of that going on for my taste. I'd love to see a comprehensive solution to this problem which allows monitoring systems to run w/o superuser privileges. Yeah, we need a CAP_MONITOR capability thingy. (CAP_BACKUP would be great too -- essentially read only but read everything) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Add %z support to elog/ereport?
On Fri, Jan 17, 2014 at 8:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-01-17 13:50:08 -0500, Tom Lane wrote: I think this approach is fundamentally broken, because it can't reasonably cope with any case more complicated than %zu or %zd. Am I just too tired, or am I not getting how INT64_FORMAT currently allows the arguments to be used positionally? It doesn't, which is one of the reasons for not allowing it in translatable strings (the other being lack of standardization of the strings that would be subject to translation). On second thought, that answer was too glib. There's no need for %n$ in the format strings *in the source code*, so INT64_FORMAT isn't getting in the way from that perspective. However, expand_fmt_string() is necessarily applied to formats *after* they've been through gettext(), so it has to expect that it might see %n$ in the now-translated strings. It's still the case that we need a policy against INT64_FORMAT in translatable strings, though, because what's passed to the gettext mechanism has to be a fixed string, not something with platform dependencies in it. Or so I would assume, anyway. Well, that's kinda problematic, isn't it? Printing the variable into a static buffer so that it can then be formatted with %s is pretty pessimal for a message that we might not even be planning to emit. Perhaps we should jettison entirely the idea of using the operating system's built-in sprintf and use one of our own that has all of the nice widgets we need, like a format code that's guaranteed to be right for uint64 and one that's guaranteed to be right for Size. This could turn out to be a bad idea if the best sprintf we can write is much slower than the native sprintf on any common platforms ... and maybe it wouldn't play nice with GCC's desire to check format strings. But what we're doing now is a real nuisance, too. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] change alter user to be a true alias for alter role
Jov am...@amutu.com wrote: attach patch add the per database set for alter user Please add to the open CommitFest so that the patch doesn't fall through the cracks: https://commitfest.postgresql.org/action/commitfest_view/open -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it
On Fri, Jan 17, 2014 at 12:07 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Fri, Jan 17, 2014 at 12:16 AM, Tom Lane t...@sss.pgh.pa.us wrote: PS: off topic, but isn't ParseConfigDirectory leaking the result of AbsoluteConfigLocation? In both normal and error paths? Yes, I also think it leaks in both cases and similar leak is present in ParseConfigFile(). I have tried to fix both of these leaks with attached patch. Committed and back-patched to 9.3. While reviewing, I noted that the skipping missing configuration file message in ParseConfigFile() uses an elevel of LOG, while the other messages in the same file use elevel. I'm thinking that's a bug. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Potential relcache leak in get_object_address_attribute
On Sat, Jan 18, 2014 at 7:14 PM, Marti Raudsepp ma...@juffo.org wrote: It looks like the get_object_address_attribute() function has a potential relcache leak. When missing_ok=true, the relation is found but attribute is not, then the relation isn't closed, nor is it returned to the caller. I couldn't figure out any ways to trigger this, but it's best to fix anyway. I agree. Committed, thanks for the patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add %z support to elog/ereport?
Robert Haas escribió: On Fri, Jan 17, 2014 at 8:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-01-17 13:50:08 -0500, Tom Lane wrote: Am I just too tired, or am I not getting how INT64_FORMAT currently allows the arguments to be used positionally? It doesn't, which is one of the reasons for not allowing it in translatable strings (the other being lack of standardization of the strings that would be subject to translation). On second thought, that answer was too glib. There's no need for %n$ in the format strings *in the source code*, so INT64_FORMAT isn't getting in the way from that perspective. However, expand_fmt_string() is necessarily applied to formats *after* they've been through gettext(), so it has to expect that it might see %n$ in the now-translated strings. How difficult would it be to have expand_fmt_string deal with positional modifiers? I don't think we need anything from it other than the %n$ notation, so perhaps it's not so problematic. Perhaps we should jettison entirely the idea of using the operating system's built-in sprintf and use one of our own that has all of the nice widgets we need, like a format code that's guaranteed to be right for uint64 and one that's guaranteed to be right for Size. This could turn out to be a bad idea if the best sprintf we can write is much slower than the native sprintf on any common platforms ... and maybe it wouldn't play nice with GCC's desire to check format strings. But what we're doing now is a real nuisance, too. Maybe we can use our own implementation if the system's doesn't support %z. It's present in glibc 2.1 at least, and it's part of the 2004 edition of POSIX:2001. http://pubs.opengroup.org/onlinepubs/009695399/functions/sprintf.html (It is not present in SUSv2 (1997), and I wasn't able to find the original POSIX:2001 version.) 
-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Add %z support to elog/ereport?
On 2014-01-21 12:11:23 -0300, Alvaro Herrera wrote: How difficult would it be to have expand_fmt_string deal with positional modifiers? I don't think we need anything from it other than the %n$ notation, so perhaps it's not so problematic. I don't think there's much reason to go there. I didn't go for the pg-supplied sprintf() because I thought it'd be considered too invasive. Since that's apparently not the case... Perhaps we should jettison entirely the idea of using the operating system's built-in sprintf and use one of our own that has all of the nice widgets we need, like a format code that's guaranteed to be right for uint64 and one that's guaranteed to be right for Size. This could turn out to be a bad idea if the best sprintf we can write is much slower than the native sprintf on any common platforms ... and maybe it wouldn't play nice with GCC's desire to check format strings. But what we're doing now is a real nuisance, too. Maybe we can use our own implementation if the system's doesn't support %z. It's present in glibc 2.1 at least, and it's part of the 2004 edition of POSIX:2001. http://pubs.opengroup.org/onlinepubs/009695399/functions/sprintf.html Yea, I have a patch for that, will send it as soon as some other stuff is finished. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
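For context, the %z length modifier under discussion is C99's portable way to print a size_t (e.g. %zu), which avoids both casts to a fixed-width type and platform-dependent macros like INT64_FORMAT in format strings. A quick illustration, independent of the elog/ereport machinery:

```c
#include <stdio.h>

/* C99's %zu prints a size_t correctly whether size_t is 32 or 64 bits
 * on the platform.  Without it, code must cast to unsigned long (lossy
 * on LLP64 Windows) or splice in a platform-dependent format macro -
 * the problem being discussed for translatable strings. */
static void format_size(char *buf, size_t buflen, size_t nbytes)
{
    snprintf(buf, buflen, "allocated %zu bytes", nbytes);
}
```

The thread's concern is that some platforms' native sprintf predates C99 and ignores %z, which is why a PostgreSQL-supplied sprintf fallback is being proposed.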
Re: [HACKERS] better atomics - v0.2
On Sun, Jan 19, 2014 at 2:43 PM, Marti Raudsepp ma...@juffo.org wrote: On Tue, Nov 19, 2013 at 6:38 PM, Andres Freund and...@2ndquadrant.com wrote: The attached patches compile and make check successfully on linux x86, amd64 and msvc x86 and amd64. This time an updated configure is included. The majority of operations still rely on CAS emulation. Note that the atomics-generic-acc.h file has ® characters in the Latin-1 encoding (Intel ® Itanium ®). If you have to use weird characters, I think you should make sure they're UTF-8 I think we're better off avoiding them altogether. What's wrong with (R)? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] better atomics - v0.2
On 2014-01-21 10:20:35 -0500, Robert Haas wrote: On Sun, Jan 19, 2014 at 2:43 PM, Marti Raudsepp ma...@juffo.org wrote: On Tue, Nov 19, 2013 at 6:38 PM, Andres Freund and...@2ndquadrant.com wrote: The attached patches compile and make check successfully on linux x86, amd64 and msvc x86 and amd64. This time and updated configure is included. The majority of operations still rely on CAS emulation. Note that the atomics-generic-acc.h file has ® characters in the Latin-1 encoding (Intel ® Itanium ®). If you have to use weird characters, I think you should make sure they're UTF-8 I think we're better off avoiding them altogether. What's wrong with (R)? Nothing at all. That was just the copied title from the pdf, it's gone now. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] improve the help message about psql -F
On Mon, Jan 20, 2014 at 2:04 PM, Jov am...@amutu.com wrote: reasonable, I removed the set, patch attached. Hi Jov, A new commitfest was just opened, due on 2014-06. Please add your patch here: https://commitfest.postgresql.org/action/commitfest_view?id=22 (You'll need a community account if you don't already have one) Sometimes simple fixes like yours are merged outside a CommitFest, but adding it there makes sure it won't get lost. Regards, Marti
Re: [HACKERS] proposal: hide application_name from other users
Stephen Frost sfr...@snowman.net writes: * Craig Ringer (cr...@2ndquadrant.com) wrote: If you want control over visibility of application_name, it should be done with a column privilige granted to a system role, or something like that - so the ability to see it can be given to public on default (thus not breaking BC) and if it's revoked from public, given to roles that need to see it. I agree with this- individuals should be able to control access to this information for their databases/clusters. I think that'd be much more complexity than the case justifies. The argument that application_name might contain sensitive information seems ludicrously weak to me: whatever a client is exposing as application_name is its own darn choice. If you don't like it, go fix the client. If there is some client library that sets application_name without allowing the choice to be overridden, then that's a problem with that library, not with the server's behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On 6 June 2013 16:00, Heikki Linnakangas hlinnakan...@vmware.com wrote: In the Redesigning checkpoint_segments thread, many people opined that there should be a hard limit on the amount of disk space used for WAL: http://www.postgresql.org/message-id/CA+TgmoaOkgZb5YsmQeMg8ZVqWMtR=6s4-ppd+6jiy4oq78i...@mail.gmail.com. I'm starting a new thread on that, because that's mostly orthogonal to redesigning checkpoint_segments. The current situation is that if you run out of disk space while writing WAL, you get a PANIC, and the server shuts down. That's awful. We can try to avoid that by checkpointing early enough, so that we can remove old WAL segments to make room for new ones before you run out, but unless we somehow throttle or stop new WAL insertions, it's always going to be possible to use up all disk space. A typical scenario where that happens is when archive_command fails for some reason; even a checkpoint can't remove old, unarchived segments in that case. But it can happen even without WAL archiving. I don't see we need to prevent WAL insertions when the disk fills. We still have the whole of wal_buffers to use up. When that is full, we will prevent further WAL insertions because we will be holding the WALwritelock to clear more space. So the rest of the system will lock up nicely, like we want, apart from read-only transactions. Instead of PANICing, we should simply signal the checkpointer to perform a shutdown checkpoint. That normally requires a WAL insertion to complete, but it seems easy enough to make that happen by simply rewriting the control file, after which ALL WAL files are superfluous for crash recovery and can be deleted. Once that checkpoint is complete, we can begin deleting WAL files that are archived/replicated and continue as normal. The previously failing WAL write can now be made again and may succeed this time - if it does, we continue, if not - now we PANIC. Note that this would not require in-progress transactions to be aborted. 
They can continue normally once wal_buffers re-opens. We don't really want anything too drastic, because if this situation happens once it may happen many times - I'm imagining a flaky network etc.. So we want the situation to recover quickly and easily, without too many consequences. The above appears to be very minimal change from existing code and doesn't introduce lots of new points of breakage. I've seen a case, where it was even worse than a PANIC and shutdown. pg_xlog was on a separate partition that had nothing else on it. The partition filled up, and the system shut down with a PANIC. Because there was no space left, it could not even write the checkpoint after recovery, and thus refused to start up again. There was nothing else on the partition that you could delete to make space. The only recourse would've been to add more disk space to the partition (impossible), or manually delete an old WAL file that was not needed to recover from the latest checkpoint (scary). Fortunately this was a test system, so we just deleted everything. Doing shutdown checkpoints via the control file would exactly solve that issue. We already depend upon the readability of the control file anyway, so this changes nothing. (And if you regard it does, then we can have multiple control files, or at least a backup control file at shutdown). We can make the shutdown checkpoint happen always at EOF of a WAL segment, so at shutdown we don't need any WAL files to remain at all. So we need to somehow stop new WAL insertions from happening, before it's too late. I don't think we do. What might be sensible is to have checkpoints speed up as WAL volume approaches a predefined limit, so that we minimise the delay caused when wal_buffers locks up. Not suggesting anything here for 9.4, since we're midCF. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] proposal: hide application_name from other users
On Tue, Jan 21, 2014 at 2:41 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Stephen Frost wrote: * Craig Ringer (cr...@2ndquadrant.com) wrote: It also means that monitoring tools must run as superuser to see information they require, which to me is a total showstopper. We've already got *far* too much of that going on for my taste. I'd love to see a comprehensive solution to this problem which allows monitoring systems to run w/o superuser privileges. Yeah, we need a CAP_MONITOR capability thingy. (CAP_BACKUP would be great too -- essentially read only but read everything) Isn't CAP_BACKUP pretty much the REPLICATION privilege? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] proposal: hide application_name from other users
* Magnus Hagander (mag...@hagander.net) wrote: Isn't CAP_BACKUP pretty much the REPLICATION privilege? Not unless we change it to allow read-access to all tables to allow for pg_dump to work... Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] currawong is not a happy animal
On Sat, Jan 18, 2014 at 11:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: Amit Kapila amit.kapil...@gmail.com writes: Is there any specific reason why adding PGDLLIMPORT for exporting const variables is not good, when we are already doing for other variables like InterruptHoldoffCount, CritSectionCount? On searching in code, I found that for few const variables we do extern PGDLLIMPORT. For example: extern PGDLLIMPORT const int NumScanKeywords; OK, that one is a direct counterexample to my worry. I'm still unconvinced that having a contrib module checking stuff based on the size of a struct it's not supposed to access represents a sane division of responsibilities; but let's fix the build problem now and then Robert can contemplate that question at his leisure. Exposing the whole struct because of that minor detail seemed really ugly to me, and I feel strongly that we should avoid it. In the intended use of this code, I expect queues to be some number of kilobytes on the low end up to some number of megabytes on the high end, so the fact that it can't be less than 56 bytes or so is sort of a meaningless blip on the radar. So one thing I thought about is just defining the minimum size as something conservative, like 1024. But for testing purposes, it was certainly useful to create the smallest possible queue, because that stresses the synchronization code to the maximum degree possible. So on the whole I'm still happy with how I did this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Custom collations, collation modifiers (eg case-insensitive)
Craig Ringer wrote: (I intensely dislike the idea of ignoring accents, but it's something some people appear to need/want, and is supported by other vendors). FWIW at least in Spanish, users always want to search for entries ignoring accents. Nowadays I think most turn to unaccent(), but before that was written people resorted to calling transform() on both the stored data and the user-entered query, so that all pertinent matches would be found. People would even create indexes on the unaccented data to speed this up. It's an important feature, not a corner case by any means. Not sure about other languages. For instance, perhaps in French they would be interested in ignoring some accents but not others. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] proposal: hide application_name from other users
On Tue, Jan 21, 2014 at 5:18 PM, Stephen Frost sfr...@snowman.net wrote: * Magnus Hagander (mag...@hagander.net) wrote: Isn't CAP_BACKUP pretty much the REPLICATION privilege? Not unless we change it to allow read-access to all tables to allow for pg_dump to work... That sounds more like CAP_DUMP than CAP_BACKUP :) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
On Fri, Jan 17, 2014 at 04:31:48PM +, Mel Gorman wrote: NUMA Optimisations -- The primary one that showed up was zone_reclaim_mode. Enabling that parameter is a disaster for many workloads and apparently Postgres is one. It might be time to revisit leaving that thing disabled by default and explicitly requiring that NUMA-aware workloads that are correctly partitioned enable it. Otherwise NUMA considerations are not that much of a concern right now. Here is a blog post about our zone_reclaim_mode-disable recommendations: http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html Direct IO, buffered IO, double buffering and wishlists -- 6. Only writeback pages if explicitly synced. Postgres has strict write ordering requirements. In the words of Tom Lane -- As things currently stand, we dirty the page in our internal buffers, and we don't write it to the kernel until we've written and fsync'd the WAL data that needs to get to disk first. mmap() would avoid double buffering but it has no control over the write ordering, which is a show-stopper. As Andres Freund described: What was not explicitly stated here is that the Postgres design is taking advantage of the double-buffering feature here and writing to a memory copy of the page while there is still an unmodified copy in the kernel cache, or on disk. In the case of a crash, we rely on the fact that the disk page is unchanged. Certainly any design that requires the kernel to manage two different copies of the same page is going to be confusing. One larger question is how many of these things that Postgres needs are needed by other applications? I doubt Postgres is large enough to warrant changes on its own. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 20, 2014 at 11:23 PM, KaiGai Kohei kai...@ak.jp.nec.com wrote: I briefly checked the patch. Most of lines are mechanical replacement from LWLockId to LWLock *, and compiler didn't claim anything with -Wall -Werror option. My concern is around LWLockTranche mechanism. Isn't it too complicated towards the purpose? For example, it may make sense to add char lockname[NAMEDATALEN]; at the tail of LWLock structure if LOCK_DEBUG is enabled. Probably, it also adds an argument of LWLockAssign() to gives the human readable name. Is the additional 64bytes (= NAMEDATALEN) per lock too large for recent hardware? Well, we'd need it when either LOCK_DEBUG was defined or when LWLOCK_STATS was defined or when --enable-dtrace was used, and while the first two are probably rarely enough used that that would be OK, I think the third case is probably fairly common, and I don't think we want to have such a potentially performance-critical difference between builds with and without dtrace. Also... yeah, it's a lot of memory. If we add an additional 64 bytes to the structure, then we're looking at 96 bytes per lwlock instead of 32, after padding out to a 32-byte boundary to avoid crossing cache lines. We need 2 lwlocks per buffer, so that's an additional 128 bytes per 8kB buffer. For shared_buffers = 8GB, that's an extra 128MB for storing lwlocks. I'm not willing to assume nobody cares about that. And while I agree that this is a bit complex, I don't think it's really as bad as all that. We've gotten by for a long time without people being able to put lwlocks in other parts of memory, and most extension authors have gotten by with LWLockAssign() just fine and can continue doing things that way; only users with particularly sophisticated needs should bother to worry about the tranche stuff. 
One idea I just had is to improve the dsm_toc module so that it can optionally set up a tranche of lwlocks for you, and provide some analogues of RequestAddinLWLocks and LWLockAssign for that case. That would probably make this quite a bit simpler to use, at least for people using it with dynamic shared memory. But I think that's a separate patch. Below is minor comments. It seems to me this newer form increased direct reference to the MainLWLockArray. Is it really graceful? My recommendation is to define a macro that wraps the reference to MainLWLockArray. For example: #define PartitionLock(i) \ (MainLWLockArray[LOCK_MANAGER_LWLOCK_OFFSET + (i)].lock) Yeah, that's probably a good idea. This chunk is enclosed by #ifdef EXEC_BACKEND, the legacy LWLockArray was not removed. Oops, will fix. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: hide application_name from other users
* Magnus Hagander (mag...@hagander.net) wrote: On Tue, Jan 21, 2014 at 5:18 PM, Stephen Frost sfr...@snowman.net wrote: Not unless we change it to allow read-access to all tables to allow for pg_dump to work... That sounds more like CAP_DUMP than CAP_BACKUP :) Well, perhaps CAP_READONLY (or READALL?), there are auditor-type roles which could be reduced to that level instead of superuser. I'm on the fence about whether this needs to be separate from REPLICATION though- how many different such options are we going to have and how ugly is it going to get to litter the code with if(superuser || read-only || ...)? Perhaps a way to say this role has X-privilege on all objects of this type which could then be used to GRANT SELECT and would be a single point where we need to add those checks (in the ACL code for each object type)? One of the key points would be that the privilege apply to newly created objects as well.. Thanks, Stephen
Re: [HACKERS] Closing commitfest 2013-11
On Mon, Jan 20, 2014 at 4:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: With apologies to our beloved commitfest-mace-wielding CFM, commitfest 2013-11 intentionally still contains a few open patches. I think that CF is largely being ignored by most people now that we have CF 2014-01 in progress. If we don't want to do anything about these patches in the immediate future, I propose we move them to CF 2014-01. I think the idea was that patch authors should take responsibility for pushing their patches forward to 2014-01 if they still wanted them considered. Quite a few patches already were moved that way, IIRC. Agreed on that general theory. And, also, yeah, the shared memory message queueing stuff got committed. Sorry, I missed the fact that there was still an open CF entry for that; I assumed that it would have been marked Returned with Feedback. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add %z support to elog/ereport?
Andres Freund and...@2ndquadrant.com writes: On 2014-01-21 12:11:23 -0300, Alvaro Herrera wrote: How difficult would it be to have expand_fmt_string deal with positional modifiers? I don't think we need anything from it other than the %n$ notation, so perhaps it's not so problematic. I don't think there's much reason to go there. I didn't go for the pg-supplied sprintf() because I thought it'd be considered too invasive. Since that's apparently not the case... Yeah, that would make expand_fmt_string considerably more complicated and so presumably slower. We don't really need that when we can make what I expect is a pretty trivial addition to snprintf.c. Also, fixing snprintf.c will make it safe to use the z flag in contexts other than ereport/elog. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] integrate pg_upgrade analyze_new_cluster.sh into vacuumdb
09.01.2014 05:15, Peter Eisentraut wrote: pg_upgrade creates a script analyze_new_cluster.{sh|bat} that runs vacuumdb --analyze-only in three stages with different statistics target settings to get a fresh cluster analyzed faster. I think this behavior is also useful for clusters or databases freshly created by pg_restore or any other loading mechanism, so it's suboptimal to have this constrained to pg_upgrade. I think the three-stage analyze is a wrong solution to the slow-analyze problem. In my experience most of the analyze time goes to reading random blocks from the disk, but we usually use only a small portion of that data (1 row per block.) If we were able to better utilize the data we read, we could get good statistics with a lot less IO than we currently need. This was discussed at length at http://www.postgresql.org/message-id/CAM-w4HOjRbNPMW=shjhw_qfapcuu5ege1tmdr0zqu+kqx8q...@mail.gmail.com but it hasn't turned into patches so far. / Oskari -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
Simon Riggs si...@2ndquadrant.com writes: On 6 June 2013 16:00, Heikki Linnakangas hlinnakan...@vmware.com wrote: The current situation is that if you run out of disk space while writing WAL, you get a PANIC, and the server shuts down. That's awful. I don't see we need to prevent WAL insertions when the disk fills. We still have the whole of wal_buffers to use up. When that is full, we will prevent further WAL insertions because we will be holding the WALwritelock to clear more space. So the rest of the system will lock up nicely, like we want, apart from read-only transactions. I'm not sure that all writing transactions lock up hard is really so much better than the current behavior. My preference would be that we simply start failing writes with ERRORs rather than PANICs. I'm not real sure ATM why this has to be a PANIC condition. Probably the cause is that it's being done inside a critical section, but could we move that? Instead of PANICing, we should simply signal the checkpointer to perform a shutdown checkpoint. And if that fails for lack of disk space? In any case, what you're proposing sounds like a lot of new complication in a code path that is necessarily never going to be terribly well tested. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery
On Fri, Dec 20, 2013 at 9:21 PM, MauMau maumau...@gmail.com wrote: From: Fujii Masao masao.fu...@gmail.com

! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)

Even when standby_mode is not enabled, we can use cascade replication and it needs the accumulated WAL files. So I think that AllowCascadeReplication() should be added into this condition.

! snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
! XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
!
! if (restoredFromArchive)

Don't we need to check !StandbyModeRequested and !AllowCascadeReplication() here? Oh, you are correct. Okay, done. Thanks! The patch looks good to me. Attached is the updated version of the patch. I added the comments. Did you test whether this patch works properly in several recovery cases? Regards, -- Fujii Masao

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*************** XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
*** 3772,3781 ****
  }
  
  	/*
! 	 * If the segment was fetched from archival storage, replace the existing
! 	 * xlog segment (if any) with the archival version.
  	 */
! 	if (source == XLOG_FROM_ARCHIVE)
  	{
  		KeepFileRestoredFromArchive(path, xlogfname);
--- 3772,3792 ----
  }
  
  	/*
! 	 * If the segment was fetched from archival storage and either cascading
! 	 * replication or standby_mode is enabled, replace the existing xlog
! 	 * segment (if any) with the archival version. Cascading replication needs
! 	 * this replacement so that cascading walsender can send the xlog segment
! 	 * which was restored from the archive. When standby_mode is enabled,
! 	 * fast promotion is performed at the end of recovery, and also needs this
! 	 * replacement so that the crash recovery just after fast promotion can
! 	 * replay all the required segments from pg_xlog.
! 	 *
! 	 * If the replacement is not required, the segments are always restored onto
! 	 * the same file named RECOVERYXLOG from the archive. This prevents the
! 	 * large increase of segments in pg_xlog.
  	 */
! 	if (source == XLOG_FROM_ARCHIVE &&
! 		(AllowCascadeReplication() || StandbyModeRequested))
  	{
  		KeepFileRestoredFromArchive(path, xlogfname);
*************** exitArchiveRecovery(TimeLineID endTLI, XLogSegNo endLogSegNo)
*** 5572,5593 ****
  }
  
  	/*
! 	 * If we are establishing a new timeline, we have to copy data from the
! 	 * last WAL segment of the old timeline to create a starting WAL segment
! 	 * for the new timeline.
  	 *
! 	 * Notify the archiver that the last WAL segment of the old timeline is
! 	 * ready to copy to archival storage. Otherwise, it is not archived for a
! 	 * while.
  	 */
! 	if (endTLI != ThisTimeLineID)
  	{
! 		XLogFileCopy(endLogSegNo, endTLI, endLogSegNo);
  
! 		if (XLogArchivingActive())
  		{
! 			XLogFileName(xlogpath, endTLI, endLogSegNo);
! 			XLogArchiveNotify(xlogpath);
  		}
  	}
--- 5583,5646 ----
  }
  
  	/*
! 	 * If the segment was fetched from archival storage and neither cascading
! 	 * replication nor fast promotion is enabled, we replace the existing xlog
! 	 * segment (if any) with the archival version named RECOVERYXLOG. This is
! 	 * because whatever is in XLOGDIR is very possibly older than what we have
! 	 * from the archives, since it could have come from restoring a PGDATA
! 	 * backup. In any case, the archival version certainly is more
! 	 * descriptive of what our current database state is, because that is what
! 	 * we replayed from.
! 	 *
! 	 * Note that if we are establishing a new timeline, ThisTimeLineID is
! 	 * already set to the new value, and so we will create a new file instead
! 	 * of overwriting any existing file. (This is, in fact, always the case
! 	 * at present.)
  	 *
! 	 * If either cascading replication or fast promotion is enabled, we don't
! 	 * need the replacement here because it has already done just after the
! 	 * segment was restored from the archive.
  	 */
! 	snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
! 	XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
!
! 	if (restoredFromArchive &&
! 		!AllowCascadeReplication() && !StandbyModeRequested)
  	{
! 		unlink(xlogpath);	/* might or might not exist */
! 		if (rename(recoveryPath, xlogpath) != 0)
! 			ereport(FATAL,
! 					(errcode_for_file_access(),
! 					 errmsg("could not rename file \"%s\" to \"%s\": %m",
! 							recoveryPath, xlogpath)));
! 		/* XXX might we need to fix permissions on the file? */
! 	}
! 	else
! 	{
! 		/*
! 		 * Since there might be a partial WAL segment named RECOVERYXLOG,
! 		 * get rid of it.
! 		 */
! 		unlink(recoveryPath);	/* ignore any error */
  
  	/*
! 	 * If we are establishing a new timeline, we have to copy data from
! 	 * the last WAL segment of the old timeline to create a starting WAL
! 	 * segment for the new timeline.
! 	 *
! 	 * Notify the archiver that the last WAL segment of the old timeline
! 	 * is ready to copy to archival
Re: [HACKERS] array_length(anyarray)
On Sun, Jan 19, 2014 at 1:41 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014/1/19 Marko Tiikkaja ma...@joh.to On 1/19/14, 12:21 AM, Pavel Stehule wrote: I checked it and I got a small issue bash-4.1$ patch -p1 cardinality.patch (Stripping trailing CRs from patch.) not sure about source of this problem. I can't reproduce the problem. In fact, I don't see a single CR byte in the patch file on my disk, the file my email clients claims to have sent or a copy of the file I downloaded from the archives. Are you sure this isn't a problem on your end? It can be problem on my side - some strange combination of mime type. I seen this issue before. I will recheck it tomorrow from other computer. Doesn't matter anyway. Patch needing to strip trailing CRs doesn't cause any issue. I got the same message, BTW; maybe some kind of gmail weirdness. As it turns out, a function called cardinality() was added in 2009 and ripped back out again. But the one that was committed back then had funny semantics that people weren't sure about, and people seemed to think it should behave like this one does. So I've gone ahead and committed this, crossing my fingers that we won't have to rip it back out again. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Closing commitfest 2013-11
Hello. I disagree with this. There was no request to move ready-for-commit patches to the next commitfest! I expected that only unfinished patches would be moved there by their authors. I sent a question to Peter E. but got no reply; Tom did commits from this list, so I assumed there was some agreement about it and didn't raise any alarm. My patch there is a prerequisite for dump --if-exi On 21.1.2014 17:41 Robert Haas robertmh...@gmail.com wrote: On Mon, Jan 20, 2014 at 4:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: With apologies to our beloved commitfest-mace-wielding CFM, commitfest 2013-11 intentionally still contains a few open patches. I think that CF is largely being ignored by most people now that we have CF 2014-01 in progress. If we don't want to do anything about these patches in the immediate future, I propose we move them to CF 2014-01. I think the idea was that patch authors should take responsibility for pushing their patches forward to 2014-01 if they still wanted them considered. Quite a few patches already were moved that way, IIRC. Agreed on that general theory. And, also, yeah, the shared memory message queueing stuff got committed. Sorry, I missed the fact that there was still an open CF entry for that; I assumed that it would have been marked Returned with Feedback. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] truncating pg_multixact/members
On Mon, Jan 20, 2014 at 1:39 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: * I haven't introduced settings to tweak this per table for autovacuum. I don't think those are needed. It's not hard to do, however; if people opine against this, I will implement that. I can't think of any reason to believe that it will be less important to tune these values on a per-table basis than it is to be able to do the same with the autovacuum parameters. Indeed, all the discussion on this thread suggests precisely that we have no real idea how to set these values yet, so more configurability is good. Even if you reject that argument, I think it's a bad idea to start making xmax vacuuming and xmin vacuuming less than parallel; such decisions confuse users. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Closing commitfest 2013-11
On 21.1.2014 18:52 Pavel Stehule pavel.steh...@gmail.com wrote: Hello. I disagree with this. There was no request to move ready-for-commit patches to the next commitfest! I expected that only unfinished patches would be moved there by their authors. I sent a question to Peter E. but got no reply; Tom did commits from this list, so I assumed there was some agreement about it and didn't raise any alarm. My patch there is a prerequisite for dump --if-exi Sorry, train and mobile :( It is required for the dump --if-exists feature. Regards Pavel On 21.1.2014 17:41 Robert Haas robertmh...@gmail.com wrote: On Mon, Jan 20, 2014 at 4:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: With apologies to our beloved commitfest-mace-wielding CFM, commitfest 2013-11 intentionally still contains a few open patches. I think that CF is largely being ignored by most people now that we have CF 2014-01 in progress. If we don't want to do anything about these patches in the immediate future, I propose we move them to CF 2014-01. I think the idea was that patch authors should take responsibility for pushing their patches forward to 2014-01 if they still wanted them considered. Quite a few patches already were moved that way, IIRC. Agreed on that general theory. And, also, yeah, the shared memory message queueing stuff got committed. Sorry, I missed the fact that there was still an open CF entry for that; I assumed that it would have been marked Returned with Feedback. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Bug? could not remove shared memory segment
Hi, When I shut down the standby server and then the master server in that order, I found that the master server emitted the following message:

LOG: XX000: could not remove shared memory segment /PostgreSQL.1804289383: Invalid argument
LOCATION: dsm_impl_posix, dsm_impl.c:269

Is this message intentional, or is it a bug?

$ uname -a
Darwin test.local 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64

I used the latest version of the source code in the git master. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Custom collations, collation modifiers (eg case-insensitive)
This feature is interesting for the Czech language too. Lots of applications allow accent-free searching, due to possible issues in the original data or client limitations - a missing Czech keyboard and similar. On 21.1.2014 17:17 Alvaro Herrera alvhe...@2ndquadrant.com wrote: Craig Ringer wrote: (I intensely dislike the idea of ignoring accents, but it's something some people appear to need/want, and is supported by other vendors). FWIW at least in Spanish, users always want to search for entries ignoring accents. Nowadays I think most turn to unaccent(), but before that was written people resorted to calling transform() on both the stored data and the user-entered query, so that all pertinent matches would be found. People would even create indexes on the unaccented data to speed this up. It's an important feature, not a corner case by any means. Not sure about other languages. For instance perhaps in French they would be interested in ignoring some accents but not others. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] array_length(anyarray)
On 1/21/14, 6:42 PM, Robert Haas wrote: On Sun, Jan 19, 2014 at 1:41 AM, Pavel Stehule pavel.steh...@gmail.com wrote: It can be problem on my side - some strange combination of mime type. I seen this issue before. I will recheck it tomorrow from other computer. Doesn't matter anyway. Patch needing to strip trailing CRs doesn't cause any issue. I got the same message, BTW; maybe some kind of gmail weirdness. Interesting. As it turns out, a function called cardinality() was added in 2009 and ripped back out again. But the one that was committed back then had funny semantics that people weren't sure about, and people seemed to think it should behave like this one does. So I've gone ahead and committed this, crossing my fingers that we won't have to rip it back out again. Thanks! I'll keep my fingers crossed as well. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Incorrectly reporting config errors
Hi all, I'm getting a report of a config error when changing a config value that requires a restart:

In postgresql.conf: max_connections = 92
(pg_ctl restart)

postgres=# show max_connections ;
 max_connections
-----------------
 92
(1 row)

(Edit postgresql.conf so that max_connections = 93)
(pg_ctl reload)

Now the log file contains:

2014-01-21 18:14:53 GMT [28718]: [4-1] user=,db=,client= LOG: received SIGHUP, reloading configuration files
2014-01-21 18:14:53 GMT [28718]: [5-1] user=,db=,client= LOG: parameter max_connections cannot be changed without restarting the server
2014-01-21 18:14:53 GMT [28718]: [6-1] user=,db=,client= LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied

It doesn't contain errors. I changed the 92 to 93. If I restart, it doesn't complain, and there's nothing in the log about the config anymore. This seems to be the case for any parameter with a postmaster context. I can understand why it logs the message about it not changing without a restart, but the other one seems like a bug. I've tested this on 9.3 and 9.4devel. Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add %z support to elog/ereport?
Robert Haas robertmh...@gmail.com writes: On Fri, Jan 17, 2014 at 8:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: It's still the case that we need a policy against INT64_FORMAT in translatable strings, though, because what's passed to the gettext mechanism has to be a fixed string not something with platform dependencies in it. Or so I would assume, anyway. Well, that's kinda problematic, isn't it? Printing the variable into a static buffer so that it can then be formatted with %s is pretty pessimal for a message that we might not even be planning to emit. Well, it's a tad annoying, but a quick grep says there are maybe 40 cases of this in our source code, so I'm not sure it's rising to the level of a must-fix problem. Perhaps we should jettison entirely the idea of using the operating system's built-in sprintf and use one of our own that has all of the nice widgets we need, like a format code that's guaranteed to be right for uint64 and one that's guaranteed to be right for Size. This could turn out to be a bad idea if the best sprintf we can write is much slower than the native sprintf on any common platforms ... and maybe it wouldn't play nice with GCC's desire to check format strings. That last is a deal-breaker. It's not just whether gcc desires to check this --- we *need* that checking, because people get it wrong without it. I thought about proposing that we insist that snprintf support the ll flag (substituting our own if not). But that doesn't really fix anything unless we're willing to explicitly cast the corresponding values to long long, which is probably not workable from a portability standpoint. I'm not really worried about platforms thinking long long is 128 bits --- I'm worried about old compilers that don't have such a datatype. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug? could not remove shared memory segment
On Tue, Jan 21, 2014 at 1:05 PM, Fujii Masao masao.fu...@gmail.com wrote: When I shut down the standby server and then the master server in that order, I found that the master server emitted the following message. LOG: XX000: could not remove shared memory segment /PostgreSQL.1804289383: Invalid argument LOCATION: dsm_impl_posix, dsm_impl.c:269 Is this intentional message or a bug? $ uname -a Darwin test.local 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64 I used the latest version of the source code in the git master. Urgh. I think what happened is that, when you cloned the master to produce the standby, you also copied the pg_dynshmem directory, which caused the standby to latch onto the master's control segment and blow it away on startup. When the master tried to shut down, it got unhappy. I think this is further evidence that we need to get rid of the state file; while I haven't agreed with Heikki's comments on this topic in their entirety, there is clearly a problem here that needs to get fixed somehow. Latest discussion of this topic is here: http://www.postgresql.org/message-id/CA+TgmoZAkz3yrZkm8D=n27dsc00pdqsshzljbkq29_twqge...@mail.gmail.com -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrectly reporting config errors
Thom Brown t...@linux.com writes: I'm getting a report of a config error when changing a config value that requires a restart: ... 2014-01-21 18:14:53 GMT [28718]: [4-1] user=,db=,client= LOG: received SIGHUP, reloading configuration files 2014-01-21 18:14:53 GMT [28718]: [5-1] user=,db=,client= LOG: parameter max_connections cannot be changed without restarting the server 2014-01-21 18:14:53 GMT [28718]: [6-1] user=,db=,client= LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied It doesn't contain errors. Yeah it does: it's got a value that can't be applied. I think you're making a semantic quibble. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On 21 January 2014 18:35, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 6 June 2013 16:00, Heikki Linnakangas hlinnakan...@vmware.com wrote: The current situation is that if you run out of disk space while writing WAL, you get a PANIC, and the server shuts down. That's awful. I don't see we need to prevent WAL insertions when the disk fills. We still have the whole of wal_buffers to use up. When that is full, we will prevent further WAL insertions because we will be holding the WALwritelock to clear more space. So the rest of the system will lock up nicely, like we want, apart from read-only transactions. I'm not sure that all writing transactions lock up hard is really so much better than the current behavior. Lock up momentarily, until the situation clears. But my proposal would allow the situation to fully clear, i.e. all WAL files could be deleted as soon as replication/archiving has caught up. The current behaviour doesn't automatically correct itself as this proposal would. My proposal is also fully safe in line with synchronous replication, as well as zero performance overhead for mainline processing. My preference would be that we simply start failing writes with ERRORs rather than PANICs. Yes, that is what I am proposing, amongst other points. I'm not real sure ATM why this has to be a PANIC condition. Probably the cause is that it's being done inside a critical section, but could we move that? Yes, I think so. Instead of PANICing, we should simply signal the checkpointer to perform a shutdown checkpoint. And if that fails for lack of disk space? I proposed a way to ensure it wouldn't fail for that, at least on pg_xlog space. In any case, what you're proposing sounds like a lot of new complication in a code path that is necessarily never going to be terribly well tested. It's the smallest amount of change proposed so far... I agree on the danger of untested code. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrectly reporting config errors
On 01/21/2014 10:26 AM, Thom Brown wrote: Hi all, I'm getting a report of a config error when changing a config value that requires a restart:

In postgresql.conf: max_connections = 92
(pg_ctl restart)

postgres=# show max_connections ;
 max_connections
-----------------
 92
(1 row)

(Edit postgresql.conf so that max_connections = 93)
(pg_ctl reload)

Now the log file contains:

2014-01-21 18:14:53 GMT [28718]: [4-1] user=,db=,client= LOG: received SIGHUP, reloading configuration files
2014-01-21 18:14:53 GMT [28718]: [5-1] user=,db=,client= LOG: parameter max_connections cannot be changed without restarting the server
2014-01-21 18:14:53 GMT [28718]: [6-1] user=,db=,client= LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied

It doesn't contain errors. I changed the 92 to 93. If I restart, it doesn't complain, and there's nothing in the log about the config anymore. This seems to be the case for any parameter with a postmaster context. I can understand why it logs the message about it not changing without a restart, but the other one seems like a bug.

You wanted a change in a value, the change did not occur, hence an error.

I've tested this on 9.3 and 9.4devel. Thom

-- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Documentation patch for to_date and to_timestamp
The attached patch is in response to ongoing mailing-list questions regarding perceived weirdness in to_timestamp and to_date. The patch modifies doc/src/sgml/func.sgml to add "(see usage notes)" in the description column for to_date and to_timestamp in the Formatting Functions table and adds the following two list items to the start of the usage notes for date/time conversion:

The to_date and to_timestamp functions exist to parse unusual input formats that cannot be handled by casting. These functions interpret input liberally and with minimal error checking so the conversion has the potential to yield unexpected results. Read the following notes and test carefully before use. Casting is the preferred method of conversion wherever possible.

Input to to_date and to_timestamp is not restricted to normal ranges, thus to_date('20096040','YYYYMMDD') returns 2014-01-17 rather than generating an error.

Cheers, Steve

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c76d357..19197ce 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -5426,7 +5426,7 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
         <literal><function>to_date(<type>text</type>, <type>text</type>)</function></literal>
        </entry>
        <entry><type>date</type></entry>
-       <entry>convert string to date</entry>
+       <entry>convert string to date (see usage notes)</entry>
        <entry><literal>to_date('05&nbsp;Dec&nbsp;2000', 'DD&nbsp;Mon&nbsp;YYYY')</literal></entry>
       </row>
       <row>
@@ -5448,7 +5448,7 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
         <literal><function>to_timestamp(<type>text</type>, <type>text</type>)</function></literal>
        </entry>
        <entry><type>timestamp with time zone</type></entry>
-       <entry>convert string to time stamp</entry>
+       <entry>convert string to time stamp (see usage notes)</entry>
        <entry><literal>to_timestamp('05&nbsp;Dec&nbsp;2000', 'DD&nbsp;Mon&nbsp;YYYY')</literal></entry>
       </row>
       <row>
@@ -5750,10 +5750,32 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
    <para>
     Usage notes for date/time formatting:
+   </para>
+
+   <para>
     <itemizedlist>
      <listitem>
       <para>
+        The <function>to_date</function> and <function>to_timestamp</function>
+        functions exist to parse unusual input formats that cannot be handled
+        by casting. These functions interpret input liberally and with minimal
+        error checking so the conversion has the potential to yield unexpected
+        results. Read the following notes and test carefully before use.
+        Casting is the preferred method of conversion wherever possible.
+       </para>
+      </listitem>
+
+      <listitem>
+       <para>
+        Input to <function>to_date</function> and
+        <function>to_timestamp</function> is not restricted to normal ranges
+        thus <literal>to_date('20096040','YYYYMMDD')</literal> returns
+        <literal>2014-01-17</literal> rather than generating an error.
+       </para>
+      </listitem>
+
+      <listitem>
+       <para>
        <literal>FM</literal> suppresses leading zeroes and trailing blanks
        that would otherwise be added to make the output of a pattern be
        fixed-width. In <productname>PostgreSQL</productname>,

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrectly reporting config errors
On 21 January 2014 18:35, Tom Lane t...@sss.pgh.pa.us wrote: Thom Brown t...@linux.com writes: I'm getting a report of a config error when changing a config value that requires a restart: ... 2014-01-21 18:14:53 GMT [28718]: [4-1] user=,db=,client= LOG: received SIGHUP, reloading configuration files 2014-01-21 18:14:53 GMT [28718]: [5-1] user=,db=,client= LOG: parameter max_connections cannot be changed without restarting the server 2014-01-21 18:14:53 GMT [28718]: [6-1] user=,db=,client= LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied It doesn't contain errors. Yeah it does: it's got a value that can't be applied. I think you're making a semantic quibble. I see it as technically wrong. There's nothing wrong with my config file. A reload of the file may not be able to apply all the settings, but there's no typo or mistake anywhere in my file. I would just need to restart instead of reload. However, given that you find it unsurprising, I'll leave it there. Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOT Null constraint on foreign table not working
Albe Laurenz laurenz.a...@wien.gv.at writes: I believe that a column of a foreign table should be NOT NULL only if it is guaranteed that it cannot contain NULL values. Doesn't the planner rely on that? The planner does expect that constraints tell the truth. I don't remember how significant a false NOT NULL constraint might be, but certainly false CHECK constraints can give rise to incorrect plans. But PostgreSQL cannot guarantee that, that has to happen on the remote side (or in the FDW). I think that it is best that an error for a constraint violation is thrown by the same entity that guarantees that the constraint is respected. A point worth making here is that it's already implicit in the contract that the CREATE FOREIGN TABLE command accurately represents what the far-side table is. If you get the column datatypes wrong, for example, it's possible to have subtle semantic bugs not all that different from what will happen with an incorrectly-deduced plan. And we don't make any attempt to slap your wrist for that. So I don't see that there's anything fundamentally wrong with the position that any NOT NULL or CHECK constraints attached to a foreign table must be accurate reflections of constraints that exist on the far side, rather than something we should enforce locally. (Note that this argument applies to FDWs for remote SQL servers, but not necessarily for FDWs for non-SQL data sources, where conceivably the CREATE FOREIGN TABLE command actually is itself the authoritative truth. Such an FDW would then be responsible for enforcing constraints.) I agree though that we've failed miserably to explain this in the docs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On Tue, Jan 21, 2014 at 2:00 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Jan 20, 2014 at 9:49 PM, Robert Haas robertmh...@gmail.com wrote: I ran Heikki's test suite on latest master and latest master plus pgrb_delta_encoding_v4.patch on a PPC64 machine, but the results didn't look too good. The only tests where the WAL volume changed by more than half a percent were the one short and one long field, no change test, where it dropped by 17%, but at the expense of an increase in duration of 38%; and the hundred tiny fields, half nulled test, where it dropped by 2% without a change in runtime. Unfortunately, some of the tests where WAL didn't change significantly took a runtime hit - in particular, hundred tiny fields, half changed slowed down by 10% and hundred tiny fields, all changed by 8%. I think this part of the result is positive, as with earlier approaches here the dip was 20%. Refer to the result posted at: http://www.postgresql.org/message-id/51366323.8070...@vmware.com I've attached the full results in OpenOffice format. Profiling the one short and one long field, no change test turns up the following:

51.38% postgres pgrb_delta_encode
23.58% postgres XLogInsert
 2.54% postgres heap_update
 1.09% postgres LWLockRelease
 0.90% postgres LWLockAcquire
 0.89% postgres palloc0
 0.88% postgres log_heap_update
 0.84% postgres HeapTupleSatisfiesMVCC
 0.75% postgres ExecModifyTable
 0.73% postgres hash_search_with_hash_value

Yipes. That's a lot more than I remember this costing before. And I don't understand why I'm seeing such a large time hit on this test where you actually saw a significant time *reduction*. One possibility is that you may have been running with a default checkpoint_segments value or one that's low enough to force checkpointing activity during the test. I ran with checkpoint_segments=300.
I ran with checkpoint_segments = 128 and when I ran with v4, I also see a similar WAL reduction as you are seeing, except that in my case the runtime for both is almost the same (I think in your case disk writes are fast, so the CPU overhead is more visible). I think the major difference in the above test is due to the below part of the code:

pgrb_find_match()
{
..
+   /*
+   if (match_chunk)
+   {
+       while (*ip == *hp)
+       {
+           matchlen++;
+           ip++;
+           hp++;
+       }
+   }
+   */
}

Basically if we don't go for the longer match, then for a test where most data (one short and one long field, no change) is similar, it has to do the below extra steps with no advantage:
a. copy extra tags
b. calculation for the rolling hash
c. finding the match
I think the major cost here is due to 'a', but the others might also not be free. To confirm the theory, if we run the test by just un-commenting the above code, there can be a significant change in both WAL reduction and runtime for this test. I have one idea to avoid the overhead of step a) which is to combine the tags, meaning don't write the tag until it finds any un-matching data. When any un-matched data is found, then combine all the previously matched data and write it as one tag. This should eliminate the overhead due to step a.

I think that's a good thing to try. Can you code it up?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inherit support for foreign tables
On Mon, Jan 20, 2014 at 9:44 PM, Shigeru Hanada shigeru.han...@gmail.com wrote: Thanks for the comments. 2014/1/21 KaiGai Kohei kai...@ak.jp.nec.com: In addition, an idea which I can't throw away is to treat all constraints defined on foreign tables as ASSERTIVE. Foreign tables always run the risk of holding wrong data, because the source data can be updated without going through the foreign table. This is not specific to any one FDW, so IMO constraints defined on foreign tables are basically ASSERTIVE. Of course PG can try to keep the data correct, but somebody might always break it. Does it make sense to apply an assertive CHECK constraint on the qual of the ForeignScan to filter out tuples with violating values at the local side, as the row-level security feature does? That would handle a situation where the planner expects only clean tuples to be returned but the FDW driver is unable to filter out anomalies. Probably, this additional check could be turned on/off on the fly, if the FDW driver has a way to inform the core system of its capability, like FDW_CAN_ENFORCE_CHECK_CONSTRAINT, which would tell the planner to skip local checks. Hmm, IIUC you mean that local users can't (or don't need to) know that data which violates the local constraints exists on the remote side. Applying constraints to the data which is modified through the FDW would be necessary as well. In that design, the FDW is a bidirectional filter which provides these features:

1) Don't push wrong data into the remote data source, by applying local constraints to the result of the modifying query executed on local PG. This is not a perfect filter, because remote constraints aren't mapped automatically or perfectly (imagine constraints which are available on the remote side but are not supported in PG).

2) Don't retrieve wrong data from remote to local PG, by applying local constraints.

I have a concern about consistency. It has not been supported yet, but let's think of aggregate push-down invoked by the query below.
SELECT count(*) FROM remote_table;

If this query were fully pushed down, the result would be the # of records that exist on the remote side, but it would be the # of valid records when we don't push down the aggregate. This would confuse users. Besides CHECK constraints, currently NOT NULL constraints are virtually ASSERTIVE (not enforcing). Should that also be noted explicitly? Backward compatibility…. Yep, backward incompatibility (especially changes visible to users) should be minimal, ideally zero. NOT NULL [ASSERTIVE] might be an option. Treating [ASSERTIVE | NOT ASSERTIVE] like DEFERRABLE, and allowing ASSERTIVE only for foreign tables? It makes sense, though we need to consider exclusiveness. But it needs to default to ASSERTIVE on foreign tables, and NOT ASSERTIVE (meaning enforced) on others. Isn't it too complicated?

CREATE FOREIGN TABLE foo (
    id int NOT NULL ASSERTIVE CHECK (id > 1) ASSERTIVE,
    …
    CONSTRAINT chk_foo_name_upper CHECK (upper(name) = name) ASSERTIVE
) SERVER server;

BTW, I noticed that this is like push-down-able expressions in JOIN/WHERE. Do we need to check that a CHECK constraint defined on a foreign table contains only expressions which have the same semantics on the remote side (in practice, built-in and immutable)?

I don't think that ASSERTIVE is going to fly, because assertive means (sayeth the Google) having or showing a confident and forceful personality, which is not what we mean here. It's tempting to do something like try to replace the keyword check with assume or assert or (stretching) assertion, but that would require whichever one we picked to be a fully-reserved keyword, which I can't think is going to get much support here, for entirely understandable reasons. So I think we should look for another option. Currently, constraints can be marked NO INHERIT (though this seems to have not been fully documented, as the ALTER TABLE page doesn't mention it anywhere) or NOT VALID, so I'm thinking maybe we should go with something along those lines.
Some ideas:
- NO CHECK. The idea of writing CHECK (id > 1) NO CHECK is pretty hilarious, though.
- NO VALIDATE. But then people need to understand that NOT VALID means we didn't validate it yet while NO VALIDATE means we don't ever intend to validate it, which could be confusing.
- NO ENFORCE. Requires a new (probably unreserved) keyword.
- NOT VALIDATED or NOT CHECKED. Same problems as NO CHECK and NO VALIDATE, respectively, plus now we have to create a new keyword.

Another idea is to apply an extensible-options syntax to constraints, like we do for EXPLAIN, VACUUM, etc. Like maybe:

CHECK (id > 1) OPTIONS (enforced false, valid true)

Yet another idea is to consider validity a three-state property: either the constraint is valid (because we have checked it and are enforcing it), or it is not valid (because we are enforcing it but have not checked the pre-existing data), or it is assumed true (because we are not checking or enforcing it but are believing it anyway). So then we
Re: [HACKERS] Closing commitfest 2013-11
On 01/20/2014 10:31 PM, Tom Lane wrote: I think the idea was that patch authors should take responsibility for pushing their patches forward to 2014-01 if they still wanted them considered. Quite a few patches already were moved that way, IIRC. Agreed though that we shouldn't let them just rot. Does this mean I can resurrect my pg_sleep_until() patch? I didn't set it back to Needs Review after I completely changed my approach based on feedback. I would hate for it to get lost just because I didn't know how to use the commitfest app. https://commitfest.postgresql.org/action/patch_view?id=1189 -- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Funny representation in pg_stat_statements.query.
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: - CURRENT_TIME and the like are parsed into the nodes directly represents the node specs in gram.y blah, blah Since this becomes more than a simple bug fix, I think it is too late for 9.4 now. I'll work on this in the longer term. OK. I will get around to it someday, but if you do it first, that's fine. The fundamental cause of this issue is Const node which conveys the location of other than constant tokens. Any other nodes, for instance TypeCasts, are safe. So this is fixed by quite simple way like following getting rid of the referred difficulties of keeping the code sane and total loss of token locations. (white spaces are collapsed for readability) Huh, that's a cute idea. It's certainly a kluge with a capital K, and might fail to extend to new cases; but if we're intending to replace all these cases with new special-purpose parse nodes soon, neither of those objections seem very strong. The main concern I'd have would be whether there could be any weird change in error cursor locations, but right offhand that's probably not a problem. We know in each of these cases that there will be some node produced by the coercion step, so the location won't disappear entirely, and so exprLocation() of the node tree as a whole should be the same. Not labeling the A_Const could result in failure to produce an error location for a complaint about invalid input for the coercion --- but surely that should never happen in any of these cases. So right offhand I think this is probably workable, and while it's ugly it's an appropriate amount of effort to put into code whose days are numbered anyhow. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER SYSTEM SET typos and fix for temporary file name management
On Tue, Jan 21, 2014 at 7:47 AM, Michael Paquier michael.paqu...@gmail.com wrote: After going through commit 65d6e4c (introducing ALTER SYSTEM SET), I noticed a couple of typos as well as (I think) a weird way of using the temporary auto-configuration file name postgresql.auto.conf.temp in two different places, resulting in the attached patch.

A .tmp suffix is used in a couple of other places in the code as well:
snapmgr.c: XactExportFilePath(pathtmp, topXid, list_length(exportedSnapshots), ".tmp");
receivelog.c: snprintf(tmppath, MAXPGPATH, "%s.tmp", path);

Although a similar suffix is used at other places, I still think it might be better to define one for the current case, as here the prefix (postgresql.auto.conf) is also fixed and the chances of it getting changed are low. However, even if we don't, it might be okay as we are using such suffixes at other places as well. In the case of snapmgr.c and receivelog.c, the operations are kept local, so it does not matter much if this way of naming is done, as all the operations for a temporary file are kept within the same, isolated function. However, the case of postgresql.auto.conf.temp is different: this temporary file name is used as well for a check in basebackup.c, so I imagine that it would be better to centralize this information in a single place. Now this name is only used in two places, but who knows if some additional checks here and there won't be needed for this temp file... postgresql.auto.conf.temp is also located at the root of PGDATA, so it might be surprising for the user to find it, even if there are few chances that it can happen.

I don't think there's any real reason to define PG_AUTOCONF_FILENAME_TEMP. pg_stat_statements just writes PGSS_DUMP_FILE ".tmp" and that hasn't been a problem that I know of. I do wonder why ALTER SYSTEM SET is spelling the suffix temp instead of tmp.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 21 January 2014 12:54, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: Rebased patch is attached. Does this fix the Windows bug reported by Kumar on 20/11/2013? pg_stat_statements in PG9.4dev has already changed table columns. So I hope this patch will be committed in PG9.4dev. I've read through the preceding discussion and I'm not very impressed. Lots of people have spoken about wanting histogram output and I can't even see a clear statement of whether that will or will not happen. AFAICS, all that has happened is that people have given their opinions and we've got almost the same identical patch, with a rush-rush comment to commit even though we've waited months. If you submit a patch, then you need to listen to feedback and be clear about what you will do next; if you don't, people will learn to ignore you, and nobody wants that. I should point out that technically this patch is late and we could reject it solely on that basis, if we wanted to. I agree with people saying that stddev is better than nothing at all, so I am inclined to commit this, in spite of the above. Any objections to commit? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Funny representation in pg_stat_statements.query.
On Tue, Jan 21, 2014 at 12:30 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: The fundamental cause of this issue is Const node which conveys the location of other than constant tokens. Any other nodes, for instance TypeCasts, are safe. So this is fixed by quite simple way like following getting rid of the referred difficulties of keeping the code sane and total loss of token locations. (white spaces are collapsed for readability) There was a more obvious case of this that I noticed during the development of pg_stat_statements query normalization. As you may have noticed, that was addressed by this commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5d3fcc4c2e137417ef470d604fee5e452b22f6a7 -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic Shared Memory stuff
On Wed, Dec 18, 2013 at 12:21:08PM -0500, Robert Haas wrote: On Tue, Dec 10, 2013 at 6:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: The larger point is that such a shutdown process has never in the history of Postgres been successful at removing shared-memory (or semaphore) resources. I do not feel a need to put a higher recoverability standard onto the DSM code than what we've had for fifteen years with SysV shmem. But, by the same token, it shouldn't be any *less* recoverable. In this context that means that starting a new postmaster should recover the resources, at least 90% of the time (100% if we still have the old postmaster lockfile). I think the idea of keeping enough info in the SysV segment to permit recovery of DSM resources is a good one. Then, any case where the existing logic is able to free the old SysV segment is guaranteed to also work for DSM. So I'm taking a look at this. There doesn't appear to be anything intrinsically intractable here, but it seems important to time the order of operations so as to minimize the chances that things fail in awkward places. The point where we remove the old shared memory segment from the previous postmaster invocation is here:

    /*
     * The segment appears to be from a dead Postgres process, or from a
     * previous cycle of life in this same process.  Zap it, if possible.
     * This probably shouldn't fail, but if it does, assume the segment
     * belongs to someone else after all, and continue quietly.
     */
    shmdt(memAddress);
    if (shmctl(shmid, IPC_RMID, NULL) < 0)
        continue;

My first thought was to remember the control segment ID from the header just before the shmdt() there, and then later when the DSM module is starting, do cleanup. But that doesn't seem like a good idea, because then if startup fails after we remove the old shared memory segment and before we get to the DSM initialization code, we'll have lost the information on what control segment needs to be cleaned up.
A subsequent startup attempt won't see the old shm again, because it's already gone. I'm fairly sure that this would be a net reduction in reliability vs. the status quo.

So now what I'm thinking is that we ought to actually perform the DSM cleanup before detaching the old segment and trying to remove it. That shouldn't fail, but if it does, or if we get killed before completing it, the next run will hopefully find the same old shm and finish the cleanup. But that kind of flies in the face of the comment above: if we perform DSM cleanup and then discover that the segment wasn't ours after all, that means we just stomped all over somebody else's stuff. That's not too good. But trying to remove the segment first and then perform the cleanup creates a window where, if we get killed, the next restart will have lost information about how to finish cleaning up. So it seems that some kind of logic tweak is needed here, but I'm not sure exactly what. As I see it, the options are:

1. Make failure to remove the shared memory segment we thought was ours an error. This will definitely show up any problems, but only after we've burned down some other process's dynamic shared memory segments. The most likely outcome is creating failure-to-start problems that don't exist today.

2. Make it a warning. We'll still burn down somebody else's DSMs, but at least we'll still start ourselves. Sadly, WARNING: You have just done a bad thing. It's too late to fix it. Sorry! is not very appealing.

It has long been the responsibility of PGSharedMemoryCreate() to determine that a segment is unimportant before calling IPC_RMID. The success or failure of IPC_RMID is an unreliable guide to the correctness of that determination. IPC_RMID will succeed on an important segment owned by the same UID, and it will fail if some other process removed the segment after our shmat(). Such a failure does not impugn our having requested DSM cleanup on the basis of the PGShmemHeader we did read, so an apologetic WARNING would be wrong.

3. Remove the old shared memory segment first, then perform the cleanup immediately afterwards. If we get killed before completing the cleanup, we'll leak the un-cleaned-up stuff. Decide that's OK and move on.

The period in which process exit would leak segments is narrow enough that this seems fine. I see no net advantage over performing the cleanup before shmdt(), though.

4. Adopt some sort of belt-and-suspenders approach, keeping the state file we have now and backstopping it with this mechanism, so that we really only need this to work when $PGDATA has been blown away and recreated. This seems pretty inelegant, and I'm not sure who it'll benefit other than those (few?) people who kill -9 the postmaster (or it segfaults or otherwise dies without running the code to remove shared memory segments) and then remove
[HACKERS] Re[2]: [HACKERS] Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
Hi,

But maybe postgres should provide its own subsystem, like Linux's active/inactive memory, over and/or near shared buffers? There could be some Postgres-specific heuristics in such an approach. And does anyone know how the mysql-innodb guys are getting on with similar issues? Thank you!
Re: [HACKERS] inherit support for foreign tables
Robert Haas robertmh...@gmail.com writes: One thing that's bugging me a bit about this whole line of attack is that, in the first instance, the whole goal here is to support inheritance hierarchies that mix ordinary tables with foreign tables. If you have a table with children some of which are inherited and others of which are not inherited, you're very likely going to want your constraints enforced for real on the children that are tables and assumed true on the children that are foreign tables, and none of what we're talking about here gets us to that, because we normally want the constraints to be identical throughout the inheritance hierarchy. There's a nearby thread that's addressing this same question, in which I make the case (again) that the right thing for postgres_fdw constraints is that they're just assumed true. So I'm not sure why this conversation is proposing to implement a lot of mechanism to do something different from that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 01/21/2014 02:48 PM, Simon Riggs wrote: I agree with people saying that stddev is better than nothing at all, so I am inclined to commit this, in spite of the above. Any objections to commit? I have not been following terribly closely, but if it includes stddev then yes, please do, many of us will find it very useful indeed. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inherit support for foreign tables
On Tue, Jan 21, 2014 at 3:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: One thing that's bugging me a bit about this whole line of attack is that, in the first instance, the whole goal here is to support inheritance hierarchies that mix ordinary tables with foreign tables. If you have a table with children some of which are inherited and others of which are not inherited, you're very likely going to want your constraints enforced for real on the children that are tables and assumed true on the children that are foreign tables, and none of what we're talking about here gets us to that, because we normally want the constraints to be identical throughout the inheritance hierarchy. There's a nearby thread that's addressing this same question, in which I make the case (again) that the right thing for postgres_fdw constraints is that they're just assumed true. So I'm not sure why this conversation is proposing to implement a lot of mechanism to do something different from that. /me scratches head. Because the other guy named Tom Lane took the opposite position on the second message on this thread, dated 11/14/13? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Tue, Jan 21, 2014 at 11:48 AM, Simon Riggs si...@2ndquadrant.com wrote: I agree with people saying that stddev is better than nothing at all, so I am inclined to commit this, in spite of the above. I could live with stddev. But we really ought to be investing in making pg_stat_statements work well with third-party tools. I am very wary of enlarging the counters structure, because it is protected by a spinlock. There has been no attempt to quantify that cost, nor has anyone even theorized that it is not likely to be appreciable. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Closing commitfest 2013-11
Vik Fearing wrote: On 01/20/2014 10:31 PM, Tom Lane wrote: I think the idea was that patch authors should take responsibility for pushing their patches forward to 2014-01 if they still wanted them considered. Quite a few patches already were moved that way, IIRC. Agreed though that we shouldn't let them just rot. Does this mean I can resurrect my pg_sleep_until() patch? I didn't set it back to Needs Review after I completely changed my approach based on feedback. I would hate for it to get lost just because I didn't know how to use the commitfest app. https://commitfest.postgresql.org/action/patch_view?id=1189 No objection here. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re[2]: [HACKERS] Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
On Tue, Jan 21, 2014 at 5:01 PM, Миша Тюрин tmih...@bk.ru wrote: And does anyone know how mysql-innodb guys are getting with similar issues? I'm no innodb dev, but from managing mysql databases, I can say that mysql simply eats all the RAM the admin is willing to allocate for the DB, and is content with the page cache almost not working. IOW: mysql manages its own cache and doesn't need or want the page cache. That *does* result in terrible performance when I/O is needed. Some workloads are nigh-impossible to optimize with this scheme. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrectly reporting config errors
On Tue, Jan 21, 2014 at 1:59 PM, Thom Brown t...@linux.com wrote: On 21 January 2014 18:35, Tom Lane t...@sss.pgh.pa.us wrote: Thom Brown t...@linux.com writes: I'm getting a report of a config error when changing a config value that requires a restart: ... 2014-01-21 18:14:53 GMT [28718]: [4-1] user=,db=,client= LOG: received SIGHUP, reloading configuration files 2014-01-21 18:14:53 GMT [28718]: [5-1] user=,db=,client= LOG: parameter max_connections cannot be changed without restarting the server 2014-01-21 18:14:53 GMT [28718]: [6-1] user=,db=,client= LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied It doesn't contain errors. Yeah it does: it's got a value that can't be applied. I think you're making a semantic quibble. I see it as technically wrong. There's nothing wrong with my config file. A reload of the file may not be able to apply all the settings, but there's no typo or mistake anywhere in my file. I would just need to restart instead of reload. However, given that you find it unsurprising, I'll leave it there. I kind of agree with Thom. I understand why it's doing what it's doing, but it still seems sort of lame. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrectly reporting config errors
Robert Haas robertmh...@gmail.com writes: I kind of agree with Thom. I understand why it's doing what it's doing, but it still seems sort of lame. Well, the point of the message is to report that we failed to apply all the settings requested by the file. If you prefer some wording squishier than error, we could bikeshed the message phrasing. But I don't think we should suppress the message entirely; nor does it seem worthwhile to try to track whether all the failures were of precisely this type. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
Fwiw I think "all transactions lock up until space appears" is *much* better than PANICing. Often disks fill up due to other transient storage use, or people may have options to manually increase the amount of space. It's much better if the database just continues to function after that rather than needing to be restarted. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_isready and older versions
All, pg_isready works against older versions of PostgreSQL. Does anyone know if there's a limit to that? v3 protocol change? Something else? Backwards compatibility ought to be in its docs, but to fix that I need to know what version it's compatible *to*. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrectly reporting config errors
On Tue, Jan 21, 2014 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I kind of agree with Thom. I understand why it's doing what it's doing, but it still seems sort of lame. Well, the point of the message is to report that we failed to apply all the settings requested by the file. If you prefer some wording squishier than error, we could bikeshed the message phrasing. But I don't think we should suppress the message entirely; nor does it seem worthwhile to try to track whether all the failures were of precisely this type. Well, to me the whole thing smacks of: LOG: there is a problem LOG: please be aware that we logged a message about a problem The only real argument for the message: LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied ...is that somebody might think that the presence of a single error caused all the changes to be ignored. And there's a good reason they might think that: we used to do it that way. But on the flip side, if we emitted a LOG or WARNING message every time the user did something that works in a way incompatible with previous releases, we'd go insane. So I think the argument for just dumping that message altogether is actually pretty good. Bikeshedding the wording is, of course, another viable option. For that matter, ignoring the problem is a pretty viable option, too. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN improvements part 1: additional information
On 01/21/2014 11:35 AM, Heikki Linnakangas wrote: On 01/21/2014 04:02 AM, Tomas Vondra wrote: On 20.1.2014 19:30, Heikki Linnakangas wrote: Attached is yet another version, with more bugs fixed and more comments added and updated. I would appreciate some heavy testing of this patch now. If you could re-run the tests you've been using, that would be great. I've tested the WAL replay by replicating GIN operations over streaming replication. That doesn't guarantee it's correct, but it's a good smoke test. I gave it a try - the OOM error seems to be gone, but now I get this: PANIC: cannot insert duplicate items to GIN index page This only happens when building the index incrementally (i.e. using a sequence of INSERT statements into a table with a GIN index). When I create a new index on a table (already containing the same dataset) it works just fine. Also, I tried to reproduce the issue by running a simple plpgsql loop (instead of a complex python script):

DO LANGUAGE plpgsql $$
DECLARE
    r tsvector;
BEGIN
    FOR r IN SELECT body_tsvector FROM data_table LOOP
        INSERT INTO idx_table (body_tsvector) VALUES (r);
    END LOOP;
END$$;

where data_table is the table with imported data (the same data I mentioned in the post about OOM errors), and idx_table is an empty table with a GIN index. And indeed it fails, but only if I run the block in multiple sessions in parallel. Oh, I see what's going on. I had assumed that there cannot be duplicate insertions into the posting tree, but that's dead wrong. The fast insertion mechanism depends on a duplicate insertion doing nothing. Ok, this turned out to be a much bigger change than I thought... In principle, it's not difficult to eliminate duplicates. However, to detect a duplicate, you have to check if the item is present in the uncompressed items array, or in the compressed lists. That requires decoding the segment where it should be. But if we decode the segment, what's the purpose of the uncompressed items array?
Its purpose was to speed up insertions, by buffering them so that we don't need to decode and re-encode the page/segment on every inserted item. But if we're paying the price of decoding it anyway, we might as well include the new item and re-encode the segment. The uncompressed area saves some effort in WAL-logging, as the record of inserting an entry into the uncompressed area is much smaller than that of re-encoding part of the page, but if that really is a concern, we could track more carefully which parts of the page are modified, and only WAL-log the required parts. And hopefully, the fast-update lists buffer inserts so that you insert many items at a time into the posting tree, and the price of re-encoding is only paid once. So, now that I think about this once more, I don't think the uncompressed area in every leaf page is a good idea.

I refactored the way the recompression routine works again. It is now more clearly a multi-step process. First, the existing page is disassembled into an in-memory struct, which is a linked list of all the segments. In memory, each segment can be represented as an array of item pointers, or in the compressed format. In the next phase, all the new items are added to the segments where they belong. This naturally can lead to overly large segments; in the third phase, the items are redistributed among the segments, and compressed back to the compressed format. Finally, the finished segments are written back to the page, or pages if it had to be split. The same subroutines to disassemble and recompress are also used in vacuum.

Attached is what I've got now. This is again quite heavily changed from the previous version - sorry for repeatedly rewriting this. I'll continue testing and re-reviewing this tomorrow.
- Heikki gin-packed-postinglists-20140121.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
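[Editorial aside: to make the duplicate-insertion issue concrete, the invariant the fast-insert path relies on is that re-adding an item already present in the posting tree is a no-op. Below is a toy sketch of the "add new items to a decoded segment" phase, with plain ints standing in for item pointers; this is not the patch's actual code, just an illustration of a merge that discards duplicates:]

```c
/*
 * Toy sketch (not the actual GIN code): merge the sorted array "add" into
 * the sorted decoded segment "seg", discarding duplicates, so that
 * re-inserting an existing item is a no-op, as the fast-update path
 * requires.  Returns the new segment length; "out" must have room for
 * n + m items.
 */
static int
merge_dedup(const int *seg, int n, const int *add, int m, int *out)
{
	int		i = 0, j = 0, k = 0;

	while (i < n || j < m)
	{
		int		next;

		/* take the smaller head; prefer the segment on ties */
		if (j >= m || (i < n && seg[i] <= add[j]))
			next = seg[i++];
		else
			next = add[j++];

		if (k == 0 || out[k - 1] != next)	/* drop duplicates */
			out[k++] = next;
	}
	return k;
}
```

For example, merging new items {3, 4, 7, 8} into a decoded segment {1, 3, 5, 7} yields {1, 3, 4, 5, 7, 8}: the duplicate 3 and 7 are silently absorbed rather than raising the "cannot insert duplicate items" PANIC.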
Re: [HACKERS] ALTER SYSTEM SET typos and fix for temporary file name management
Robert Haas escribió: I don't think there's any real reason to define PG_AUTOCONF_FILENAME_TEMP. pg_stat_statements just writes PGSS_DUMP_FILE ".tmp" and that hasn't been a problem that I know of. I do wonder why ALTER SYSTEM SET is spelling the suffix "temp" instead of "tmp". I agree with Michael that having pg_basebackup be aware of the .temp suffix is ugly; for instance if we were to fix it to .tmp in ALTER SYSTEM but forgot to change pg_basebackup, the check would be immediately broken. But on the other hand I'm not sure why it's such a problem for pg_basebackup that it needs to be checking in the first place -- how about we rip that out? Perhaps the temp file needs to be in pgsql_tmp? (Do we need to worry about cross-filesystem links if we do? I mean, do we support mounting pgsql_tmp separately?) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Funny representation in pg_stat_statements.query.
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: The fundamental cause of this issue is Const node which conveys the location of other than constant tokens. Any other nodes, for instance TypeCasts, are safe. So this is fixed by quite simple way like following getting rid of the referred difficulties of keeping the code sane and total loss of token locations. (white spaces are collapsed for readability) Committed with minor adjustments (improved comments, and hit one additional case you missed) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery
On 01/21/2014 07:31 PM, Fujii Masao wrote: On Fri, Dec 20, 2013 at 9:21 PM, MauMau maumau...@gmail.com wrote: From: Fujii Masao masao.fu...@gmail.com ! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested) Even when standby_mode is not enabled, we can use cascade replication and it needs the accumulated WAL files. So I think that AllowCascadeReplication() should be added into this condition. ! snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG"); ! XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo); ! ! if (restoredFromArchive) Don't we need to check !StandbyModeRequested and !AllowCascadeReplication() here? Oh, you are correct. Okay, done. Thanks! The patch looks good to me. Attached is the updated version of the patch. I added the comments. Sorry for reacting so slowly, but I'm not sure I like this patch. It's a quite useful property that all the WAL files that are needed for recovery are copied into pg_xlog, even when restoring from archive, even when not doing cascading replication. It guarantees that you can restart the standby, even if the connection to the archive is lost for some reason. I intentionally changed the behavior for archive recovery too, when it was introduced for cascading replication. Also, I think it's good that the behavior does not depend on whether cascading replication is enabled - it's a quite subtle difference. So, IMHO this is not a bug, it's a feature. To solve the original problem of running out of disk space in archive recovery, I wonder if we should perform restartpoints more aggressively. We intentionally don't trigger restartpoints by checkpoint_segments, only checkpoint_timeout, but I wonder if there should be an option for that. MauMau, did you try simply reducing checkpoint_timeout, while doing recovery? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
Greg Stark st...@mit.edu writes: Fwiw I think all transactions lock up until space appears is *much* better than PANICing. Often disks fill up due to other transient storage or people may have options to manually increase the amount of space. it's much better if the database just continues to function after that rather than need to be restarted. Well, PANIC is certainly bad, but what I'm suggesting is that we just focus on getting that down to ERROR and not worry about trying to get out of the disk-shortage situation automatically. Nor do I believe that it's such a good idea to have the database freeze up until space appears rather than reporting errors. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrectly reporting config errors
Robert Haas robertmh...@gmail.com writes: The only real argument for the message: LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied ...is that somebody might think that the presence of a single error caused all the changes to be ignored. And there's a good reason they might think that: we used to do it that way. But on the flip side, if we emitted a LOG or WARNING message every time the user did something that works in a way incompatible with previous releases, we'd go insane. So I think the argument for just dumping that message altogether is actually pretty good. Hm. I don't recall what the arguments were for adding that message, but you have a point that it might have served its purpose by now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] alternative back-end block formats
On Tue, Jan 21, 2014 at 06:43:54AM -0500, Christian Convey wrote: Hi all, I'm playing around with Postgres, and I thought it might be fun to experiment with alternative formats for relation blocks, to see if I can get smaller files and/or faster server performance. Does anyone know if this has been done before with Postgres? I would have assumed yes, but I'm not finding anything in Google about people having done this. Not that I know of. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 6 June 2013 16:00, Heikki Linnakangas hlinnakan...@vmware.com wrote: The current situation is that if you run out of disk space while writing WAL, you get a PANIC, and the server shuts down. That's awful. I don't see we need to prevent WAL insertions when the disk fills. We still have the whole of wal_buffers to use up. When that is full, we will prevent further WAL insertions because we will be holding the WALwritelock to clear more space. So the rest of the system will lock up nicely, like we want, apart from read-only transactions. I'm not sure that all writing transactions lock up hard is really so much better than the current behavior. My preference would be that we simply start failing writes with ERRORs rather than PANICs. I'm not real sure ATM why this has to be a PANIC condition. Probably the cause is that it's being done inside a critical section, but could we move that? My understanding is that if it runs out of buffer space while in an XLogInsert, it will be holding one or more buffer content locks exclusively, and unless it can complete the xlog (or scrounge up the info to return that buffer to its previous state), it can never release that lock. There might be other paths where it could get by with an ERROR, but if no one can write xlog anymore, all of those paths must quickly converge to the one that cannot simply ERROR. Cheers, Jeff
Re: [HACKERS] Backup throttling
I realize the following should be applied on top of v7:

index a0216c1..16dd939 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1263,7 +1263,7 @@ throttle(size_t increment)
 	throttling_counter %= throttling_sample;

 	/* Once the (possible) sleep has ended, new period starts. */
-	if (wait_result | WL_TIMEOUT)
+	if (wait_result & WL_TIMEOUT)
 		throttled_last += elapsed + sleep;
 	else if (sleep > 0)
 		/* Sleep was necessary but might have been interrupted. */

// Tony

On 01/20/2014 05:10 PM, Antonin Houska wrote: On 01/15/2014 10:52 PM, Alvaro Herrera wrote: I gave this patch a look. There was a bug that the final bounds check for int32 range was not done when there was no suffix, so in effect you could pass numbers larger than UINT_MAX and pg_basebackup would not complain until the number reached the server via BASE_BACKUP. Maybe that's fine, given that numbers above 1G will cause a failure on the server side anyway, but it looked like a bug to me. I tweaked the parse routine slightly; other than fixing the bug, I made it accept fractional numbers, so you can say 0.5M for instance. Thanks. Perhaps we should make sure pg_basebackup rejects numbers larger than 1G as well. Is there a good place to define the constant, so that both backend and client can use it? I'd say 'include/common' but no existing file seems to be appropriate. I'm not sure if it's worth adding a new file there. Another thing I found a bit strange was the use of the latch. What this patch does is create a separate latch which is used for the throttling. This means that if the walsender process receives a signal, it will not wake up if it's sleeping in throttling. Perhaps this is okay: as Andres was quoted upthread as saying, maybe this is not a problem because the sleep times are typically short anyway. But we're pretty much used to the idea that whenever a signal is sent, processes act on it *immediately*.
Maybe some admin will not feel comfortable about waiting some extra 20ms when they cancel their base backups. In any case, having a secondary latch to sleep on in a process seems weird. Maybe this should be using MyWalSnd->latch somehow. o.k., MyWalSnd->latch is used now. You have this interesting THROTTLING_MAX_FREQUENCY constant defined to 128, with the comment "check this many times per second". Let's see: if the user requests 1MB/s, this value results in throttling_sample = 1MB / 128 = 8192. So for every 8kB transferred, we would stop, check the wall clock time, and if less time has elapsed than we were supposed to spend transferring those 8kB then we sleep. Isn't a check every 8kB a bit too frequent? This doesn't seem sensible to me. I think we should be checking perhaps every tenth of the requested maximum rate, or something like that, not every 1/128th. Now, what the code actually does is not necessarily that, because the sampling value is clamped to a minimum of 32 kB. But then if we're going to do that, why use such a large divisor value in the first place? I propose we set that constant to a smaller value such as 8. I tried to use THROTTLING_SAMPLE_MIN and THROTTLING_MAX_FREQUENCY to control both the minimum and maximum chunk size. It was probably too generic, THROTTLING_SAMPLE_MIN is no longer there. New patch version is attached. // Antonin Houska (Tony) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
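[Editorial aside: the arithmetic being debated can be sketched as follows. Constant and function names here are illustrative, not the patch's actual identifiers. With Alvaro's proposed divisor of 8, a 1 MB/s limit yields 128 kB between checks; the original divisor of 128 would give 8 kB, which the 32 kB minimum would clamp up anyway:]

```c
#define THROTTLING_FREQUENCY	8		/* bookkeeping checks per second */
#define THROTTLING_SAMPLE_MIN	32768	/* never check more often than every 32 kB */

/*
 * Illustrative helper (not the patch's code): how many bytes to transfer
 * between throttling checks, given the requested maximum rate.
 */
static unsigned int
throttling_sample_size(unsigned int max_rate)	/* in bytes per second */
{
	unsigned int sample = max_rate / THROTTLING_FREQUENCY;

	return (sample < THROTTLING_SAMPLE_MIN) ? THROTTLING_SAMPLE_MIN : sample;
}
```

At 1 MB/s this gives 1048576 / 8 = 131072 bytes per check; at 100 kB/s the raw sample (12800 bytes) falls below the floor and is clamped to 32 kB, so the clamp, not the divisor, governs low rates.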
[HACKERS] yum psycopg2 doc package not signed
I was updating the packages from one of my servers and I got this message: Package python-psycopg2-doc-2.5.2-1.f19.x86_64.rpm is not signed If I remove the package (I thought it might be that package alone) I get errors from other packages: Package python-psycopg2-2.5.2-1.f19.x86_64.rpm is not signed Something wrong with the packages? -- Martín Marqués http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
I have been mulling over this patch, and I can't seem to come to terms with it. I first started making it look nicer here and there, thinking it was all mostly okay, but eventually arrived at the idea that it seems wrong to do what this does: basically, get_object_address() tries to obtain an object address, and if that fails, return InvalidOid; then, in RemoveObjects, we try rather hard to figure out why that failed, and construct an error message. It seems to me that it would make more sense to have get_object_address somehow return a code indicating what failed; then we don't have to go all over through the parser code once more. Perhaps, for example, when missing_ok is given as true to get_object_address it also needs to get a pointer to ObjectType and a string; if some object does not exist then fill the ObjectType with the failing object and the string with the failing name. Then RemoveObjects can construct a string more easily. Not sure how workable this exact idea is; maybe there is a better way. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
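[Editorial aside: the API shape Alvaro proposes can be sketched with a toy model. Everything below is hypothetical -- the real get_object_address, ObjectType, and Oid machinery differ -- but it shows the pattern: on failure, the lookup routine reports *which* part of the name was missing, so the caller can build the error message without re-parsing:]

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for ObjectType and the "what failed" report. */
typedef enum { OBJ_NONE, OBJ_SCHEMA, OBJ_TABLE } MissingType;

typedef struct
{
	MissingType	missing_type;		/* what kind of object was not found */
	char		missing_name[64];	/* its name, for the error message */
} LookupFailure;

/*
 * Pretend lookup: only schema "public" and table "public.t" exist.
 * Returns -1 (the toy "InvalidOid") on failure and fills *fail.
 */
static int
get_object_address_toy(const char *schema, const char *table,
					   LookupFailure *fail)
{
	if (strcmp(schema, "public") != 0)
	{
		fail->missing_type = OBJ_SCHEMA;
		snprintf(fail->missing_name, sizeof(fail->missing_name), "%s", schema);
		return -1;
	}
	if (strcmp(table, "t") != 0)
	{
		fail->missing_type = OBJ_TABLE;
		snprintf(fail->missing_name, sizeof(fail->missing_name), "%s", table);
		return -1;
	}
	fail->missing_type = OBJ_NONE;
	return 42;						/* some valid OID */
}
```

With this shape, a caller in the RemoveObjects role can distinguish "schema does not exist" from "table does not exist" directly from the failure report, instead of probing the catalogs a second time to reconstruct why the lookup returned nothing.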
Re: [HACKERS] Incorrectly reporting config errors
On 01/21/2014 12:29 PM, Robert Haas wrote: On Tue, Jan 21, 2014 at 1:59 PM, Thom Brown t...@linux.com wrote: On 21 January 2014 18:35, Tom Lane t...@sss.pgh.pa.us wrote: Thom Brown t...@linux.com writes: I'm getting a report of a config error when changing a config value that requires a restart: ... 2014-01-21 18:14:53 GMT [28718]: [4-1] user=,db=,client= LOG: received SIGHUP, reloading configuration files 2014-01-21 18:14:53 GMT [28718]: [5-1] user=,db=,client= LOG: parameter max_connections cannot be changed without restarting the server 2014-01-21 18:14:53 GMT [28718]: [6-1] user=,db=,client= LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied It doesn't contain errors. Yeah it does: it's got a value that can't be applied. I think you're making a semantic quibble. I see it as technically wrong. There's nothing wrong with my config file. A reload of the file may not be able to apply all the settings, but there's no typo or mistake anywhere in my file. I would just need to restart instead of reload. However, given that you find it unsurprising, I'll leave it there. I kind of agree with Thom. I understand why it's doing what it's doing, but it still seems sort of lame. Though I am not sure why it is lame when it seems to be following protocol; announce the problem, then tell where it originates. Seems like useful information to me. 
postgres-2014-01-21 14:39:54.738 PST-0 ERROR: parameter max_connections cannot be changed without restarting the server
postgres-2014-01-21 14:39:54.738 PST-0 STATEMENT: SET max_connections=99;
-2014-01-21 14:42:23.166 PST-0 LOG: received SIGHUP, reloading configuration files
-2014-01-21 14:42:23.168 PST-0 LOG: parameter max_connections cannot be changed without restarting the server
-2014-01-21 14:42:23.169 PST-0 LOG: configuration file /usr/local/pgsql93/data/postgresql.conf contains errors; unaffected changes were applied

-- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrectly reporting config errors
On 2014-01-21 16:13:11 -0500, Robert Haas wrote: On Tue, Jan 21, 2014 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I kind of agree with Thom. I understand why it's doing what it's doing, but it still seems sort of lame. Well, the point of the message is to report that we failed to apply all the settings requested by the file. If you prefer some wording squishier than error, we could bikeshed the message phrasing. But I don't think we should suppress the message entirely; nor does it seem worthwhile to try to track whether all the failures were of precisely this type. Well, to me the whole thing smacks of: LOG: there is a problem LOG: please be aware that we logged a message about a problem The only real argument for the message: LOG: configuration file /home/thom/Development/data/postgresql.conf contains errors; unaffected changes were applied ...is that somebody might think that the presence of a single error caused all the changes to be ignored. And there's a good reason they might think that: we used to do it that way. But on the flip side, if we emitted a LOG or WARNING message every time the user did something that works in a way incompatible with previous releases, we'd go insane. So I think the argument for just dumping that message altogether is actually pretty good. I don't find that argument all that convincing. The expectation that a config file isn't applied if it contains errors is a generally reasonable one, not just because we used to do it that way. I also don't see the disadvantage of the current behaviour of reporting that we didn't apply the change. Additionally it makes it easier to look for such errors, by having a relatively easy to remember message to search for in the log. What I think is that we're reporting such problems way too often. The usual cause I know of is something like: shared_buffers = 4GB ... 
shared_buffers = 6GB When reloading the config we'll report an error in the line setting it to 4GB because shared_buffers is PGC_POSTMASTER leading to a spurious contains errors message. Rather annoying. Now, I am sure somebody will argue along the lines of well, don't do that then, but I don't see much merit in that argument. A rather common and sensible configuration is to have a common configuration file used across servers, which then is overwritten by a per-server or per-cluster config file containing values specific to a server/cluster. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] yum psycopg2 doc package not signed
Hi, On Tue, 2014-01-21 at 20:19 -0200, Martín Marqués wrote: I was updating the packages from one of my servers and I got this message: Package python-psycopg2-doc-2.5.2-1.f19.x86_64.rpm is not signed If I remove the package (I thought it might be that package alone) I get errors from other packages: Package python-psycopg2-2.5.2-1.f19.x86_64.rpm is not signed Something wrong with the packages? I thought I fixed that -- but apparently not. Please run yum clean metadata and try updating the package again. Regards, -- Devrim GÜNDÜZ Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer Twitter: @DevrimGunduz , @DevrimGunduzTR signature.asc Description: This is a digitally signed message part
Re: [HACKERS] new json funcs
Hi Andrew, On 1/18/14, 10:05 PM, I wrote: But I'll continue with my review now that this has been sorted out. Sorry about the delay. I think the API for the new functions looks good. They are all welcome additions to the JSON family. The implementation side looks reasonable to me. I'm not sure there's need to duplicate so much code, though. E.g. json_to_recordset is almost identical to json_populate_recordset, and json_to_record has a bit of the same disease. Finally, (as I'm sure you know already), docs are still missing. Marking the patch Waiting on Author for the time being. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
Jeff Janes jeff.ja...@gmail.com writes: On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane t...@sss.pgh.pa.us wrote: My preference would be that we simply start failing writes with ERRORs rather than PANICs. I'm not real sure ATM why this has to be a PANIC condition. Probably the cause is that it's being done inside a critical section, but could we move that? My understanding is that if it runs out of buffer space while in an XLogInsert, it will be holding one or more buffer content locks exclusively, and unless it can complete the xlog (or scrounge up the info to return that buffer to its previous state), it can never release that lock. There might be other paths were it could get by with an ERROR, but if no one can write xlog anymore, all of those paths must quickly converge to the one that cannot simply ERROR. Well, the point is we'd have to somehow push detection of the problem to a point before the critical section that does the buffer changes and WAL insertion. The first idea that comes to mind is (1) estimate the XLOG space needed (an overestimate is fine here); (2) just before entering the critical section, call some function to reserve that space, such that we always have at least sum(outstanding reservations) available future WAL space; (3) release our reservation as part of the actual XLogInsert call. The problem here is that the reserve function would presumably need an exclusive lock, and would be about as much of a hot spot as XLogInsert itself is. Plus we'd be paying a lot of extra cycles to solve a corner case problem that, with all due respect, comes up pretty darn seldom. So probably we need a better idea than that. Maybe we could get some mileage out of the fact that very approximate techniques would be good enough. For instance, I doubt anyone would bleat if the system insisted on having 10MB or even 100MB of future WAL space always available. But I'm not sure exactly how to make use of that flexibility. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
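Tom's three-step reservation scheme might look something like this in outline. All names here are hypothetical, invented for illustration; none of this is the actual xlog.c API:

```c
/*
 * Sketch of the reservation idea: the caller overestimates the WAL space a
 * record will need, reserves it just before entering its critical section
 * (where a failure can still be a plain ERROR rather than a PANIC), and
 * releases the reservation once XLogInsert() has written the record.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint64_t future_wal_space = 100u * 1024 * 1024;  /* known-free bytes */
static uint64_t outstanding_reservations = 0;

/*
 * Called just before the critical section; a 'false' return lets the
 * caller fail with ERROR instead of PANICking later.
 */
static bool
ReserveWalSpace(uint64_t estimate)
{
    /* The real version would need an exclusive lock here -- exactly the
     * hot spot the message above worries about. */
    if (outstanding_reservations + estimate > future_wal_space)
        return false;
    outstanding_reservations += estimate;
    return true;
}

/*
 * Called as part of XLogInsert(): drop the reservation and account for
 * the space actually consumed.
 */
static void
ReleaseWalReservation(uint64_t estimate, uint64_t actual)
{
    assert(actual <= estimate);   /* overestimates are fine, by design */
    outstanding_reservations -= estimate;
    future_wal_space -= actual;
}
```

The sketch makes the cost visible: every insert pays for two updates to shared counters under a lock, which is why Tom immediately flags the reserve function as a likely contention point.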
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On Tue, Jan 21, 2014 at 3:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Maybe we could get some mileage out of the fact that very approximate techniques would be good enough.  For instance, I doubt anyone would bleat if the system insisted on having 10MB or even 100MB of future WAL space always available.  But I'm not sure exactly how to make use of that flexibility.

In the past I've thought that one approach that would eliminate concerns about portably and efficiently knowing how much space is left on the pg_xlog filesystem is to have a ballast file. Under this scheme, perhaps XLogInsert() could differentiate between a soft and a hard failure. Hopefully the reserve function you mentioned, still called just before each critical section, thereby becomes cheap. Perhaps I'm just restating what you said, though.

-- 
Peter Geoghegan
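The ballast-file scheme could be sketched roughly as follows. The path, size, and function names are invented for illustration; real pg_xlog handling would be considerably more involved. While the ballast file exists, that much slack is known to be available; dropping it converts an out-of-space condition into a "soft" failure that lets in-flight records complete:

```c
/*
 * Minimal ballast-file sketch: a preallocated dummy file occupies real
 * blocks on the pg_xlog filesystem.  When the filesystem fills up, the
 * file is unlinked so the current record can still be written (a soft
 * failure, reported as ERROR); only a failure after that is hard.
 */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Create (or recreate) the ballast file with real blocks behind it. */
static int
create_ballast(const char *path, off_t size)
{
    int fd = open(path, O_CREAT | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    /* posix_fallocate actually reserves blocks, unlike ftruncate,
     * which would just create a sparse file. */
    int rc = posix_fallocate(fd, 0, size);
    close(fd);
    return rc == 0 ? 0 : -1;
}

/*
 * Soft failure: give the space back to the filesystem so the current
 * XLogInsert() can finish, then start rejecting new writes with ERROR.
 */
static int
release_ballast(const char *path)
{
    return unlink(path);
}
```

The appeal is that "is there slack left?" reduces to "does the ballast file still exist?", a check that is cheap and portable, which is presumably why Peter hopes the reserve function becomes inexpensive under this scheme.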
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On 2014-01-21 18:24:39 -0500, Tom Lane wrote:
> Jeff Janes jeff.ja...@gmail.com writes:
>> On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> My preference would be that we simply start failing writes with ERRORs rather than PANICs.  I'm not real sure ATM why this has to be a PANIC condition.  Probably the cause is that it's being done inside a critical section, but could we move that?

>> My understanding is that if it runs out of buffer space while in an XLogInsert, it will be holding one or more buffer content locks exclusively, and unless it can complete the xlog (or scrounge up the info to return that buffer to its previous state), it can never release that lock.  There might be other paths where it could get by with an ERROR, but if no one can write xlog anymore, all of those paths must quickly converge to the one that cannot simply ERROR.

> Well, the point is we'd have to somehow push detection of the problem to a point before the critical section that does the buffer changes and WAL insertion.

Well, I think that's already hard for the heapam.c stuff, but doing that in xact.c seems fracking hairy. We can't simply stop in the middle of a commit and not continue; that'd often grind the system to a halt, preventing cleanup. Additionally, the size of the inserted record for commits is essentially unbounded, which makes it an especially fun case.

> The first idea that comes to mind is (1) estimate the XLOG space needed (an overestimate is fine here); (2) just before entering the critical section, call some function to reserve that space, such that we always have at least sum(outstanding reservations) available future WAL space; (3) release our reservation as part of the actual XLogInsert call.

I think that's not necessarily enough. In a COW filesystem like btrfs or ZFS you really cannot give many guarantees about writes succeeding or failing, even if we were able to create (and zero) a new segment.

Even if you disregard that, we'd need to keep up with lots of concurrent reservations, looking a fair bit into the future. E.g. during a smart shutdown in a workload with lots of subtransactions, trying to reserve space might actually make the situation worse, because we might end up trying to reserve the combined size of records.

> The problem here is that the reserve function would presumably need an exclusive lock, and would be about as much of a hot spot as XLogInsert itself is.  Plus we'd be paying a lot of extra cycles to solve a corner case problem that, with all due respect, comes up pretty darn seldom.  So probably we need a better idea than that.

Yea, I don't think anything really safe is going to work without significant penalties.

> Maybe we could get some mileage out of the fact that very approximate techniques would be good enough.  For instance, I doubt anyone would bleat if the system insisted on having 10MB or even 100MB of future WAL space always available.  But I'm not sure exactly how to make use of that flexibility.

If we'd be more aggressive with preallocating WAL files and doing so in the WAL writer, we could stop accepting writes in some common codepaths (e.g. nodeModifyTable.c) as soon as preallocating failed, but continue to accept writes in other locations (e.g. TRUNCATE, DROP TABLE). That'd still fail if you write a *very* large commit record using up all the reserve, though...

I personally think this isn't worth complicating the code for.

Greetings,

Andres Freund

-- 
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
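Andres's idea of refusing ordinary writes once background preallocation fails, while still accepting space-reclaiming records, might be gated like this. The names are purely illustrative, not PostgreSQL's actual code:

```c
/*
 * Sketch of "fail common paths, keep cleanup paths working": the WAL
 * writer preallocates segments in the background; once that stops
 * succeeding, ordinary inserts (nodeModifyTable.c and friends) start
 * failing with ERROR, while records needed to reclaim space (TRUNCATE,
 * DROP TABLE, aborts) are still accepted out of the reserve.
 */
#include <assert.h>
#include <stdbool.h>

typedef enum WalRecordClass
{
    WAL_RECORD_NORMAL,      /* ordinary DML: may be refused early */
    WAL_RECORD_RECLAIM      /* TRUNCATE, DROP TABLE, abort records */
} WalRecordClass;

/* Updated by the WAL writer each time it tries to preallocate a segment. */
static bool wal_prealloc_ok = true;

static bool
WalInsertAllowed(WalRecordClass rclass)
{
    /* Space-reclaiming records are always allowed: refusing them would
     * prevent the very cleanup that frees pg_xlog space. */
    return wal_prealloc_ok || rclass == WAL_RECORD_RECLAIM;
}
```

As the message notes, this gate does nothing for a single very large commit record that exhausts the remaining reserve on its own.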
Re: [HACKERS] new json funcs
On 01/21/2014 06:21 PM, Marko Tiikkaja wrote:
> Hi Andrew,
>
> On 1/18/14, 10:05 PM, I wrote:
>> But I'll continue with my review now that this has been sorted out.
>
> Sorry about the delay.  I think the API for the new functions looks good.  They are all welcome additions to the JSON family.  The implementation side looks reasonable to me.  I'm not sure there's need to duplicate so much code, though.  E.g. json_to_recordset is almost identical to json_populate_recordset, and json_to_record has a bit of the same disease.

I can probably factor some of that out. Of course, when it was an extension there wasn't the possibility.

> Finally, (as I'm sure you know already), docs are still missing.  Marking the patch Waiting on Author for the time being.

Yes, I have a draft, just waiting for time to go through it.

Thanks for the review.

cheers

andrew
Re: [HACKERS] proposal: hide application_name from other users
On Tue, Jan 21, 2014 at 12:31 AM, Craig Ringer cr...@2ndquadrant.com wrote:
> On 01/21/2014 04:19 PM, Heikki Linnakangas wrote:
>> On 01/21/2014 07:22 AM, Harold Giménez wrote:
>>> First of all, I apologize for submitting a patch and missing the commitfest deadline. Given the size of the patch, I thought I'd submit it for your consideration regardless.
>>>
>>> This patch prevents non-superusers from viewing other user's pg_stat_activity.application_name. This topic was discussed some time ago [1] and consequently application_name was made world readable [2]. I would like to propose that we hide it instead by reverting to the original behavior.
>>>
>>> There is a very large number of databases on the same cluster shared across different users who can easily view each other's application_name values. Along with that, there are some libraries that default application_name to the name of the running process [3], which can leak information about what web servers applications are running, queue systems, etc. Furthermore leaking application names in a multi-tenant environment is more information than an attacker should have access to on services like Heroku and other similar providers.
>>
>> I don't find these arguments compelling to change it now. It's well-documented that application_name is visible to everyone. Just don't put sensitive information there.
>>
>> For those users that don't mind advertising application_name, the patch would be highly inconvenient. For example, the database owner could no longer see the application_name of other users connected to her database.
>
> It also means that monitoring tools must run as superuser to see information they require, which to me is a total showstopper.

Well, the fact is that if you don't run monitoring tools as superuser, there may not be enough connection slots available anyways, in cases where actual usage is consuming all of max_connections, and only the reserved slots are available. So in a way it's already unreliable to run monitoring as non-superuser, unfortunately.

> If you want control over visibility of application_name, it should be done with a column privilege granted to a system role, or something like that - so the ability to see it can be given to public by default (thus not breaking BC) and if it's revoked from public, given to roles that need to see it.

Something along these lines sounds like it would solve the problem.
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
Andres Freund and...@2ndquadrant.com writes:
> On 2014-01-21 18:24:39 -0500, Tom Lane wrote:
>> Maybe we could get some mileage out of the fact that very approximate techniques would be good enough.  For instance, I doubt anyone would bleat if the system insisted on having 10MB or even 100MB of future WAL space always available.  But I'm not sure exactly how to make use of that flexibility.

> If we'd be more aggressive with preallocating WAL files and doing so in the WAL writer, we could stop accepting writes in some common codepaths (e.g. nodeModifyTable.c) as soon as preallocating failed but continue to accept writes in other locations (e.g. TRUNCATE, DROP TABLE). That'd still fail if you write a *very* large commit record using up all the reserve though...

> I personally think this isn't worth complicating the code for.

I too have got doubts about whether a completely bulletproof solution is practical. (And as you say, even if our internal logic were bulletproof, a COW filesystem defeats all guarantees in this area anyway.) But perhaps a 99% solution would be a useful compromise.

Another thing to think about is whether we couldn't put a hard limit on WAL record size somehow. Multi-megabyte WAL records are an abuse of the design anyway, when you get right down to it. So for example maybe we could split up commit records, with most of the bulky information dumped into separate records that appear before the real commit. This would complicate replay --- in particular, if we abort the transaction after writing a few such records, how does the replayer realize that it can forget about those records? But that sounds probably surmountable.

			regards, tom lane
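The replay-side bookkeeping that splitting commit records would require could be sketched along these lines. This is a toy model with invented names; real recovery code would key this off actual XIDs, record types, and the end-of-recovery state:

```c
/*
 * Sketch of split-commit replay: bulky "pre-commit" records are buffered
 * per transaction; the real commit record applies them, while an abort
 * (or reaching end of recovery without a commit) simply discards them.
 */
#include <assert.h>
#include <stddef.h>

#define MAX_XACTS 16

typedef struct PendingCommitData
{
    unsigned int xid;       /* 0 = slot unused */
    size_t       nrecords;  /* bulky records buffered so far */
} PendingCommitData;

static PendingCommitData pending[MAX_XACTS];

static PendingCommitData *
lookup_slot(unsigned int xid, int create)
{
    for (int i = 0; i < MAX_XACTS; i++)
        if (pending[i].xid == xid)
            return &pending[i];
    if (create)
        for (int i = 0; i < MAX_XACTS; i++)
            if (pending[i].xid == 0)
            {
                pending[i].xid = xid;
                pending[i].nrecords = 0;
                return &pending[i];
            }
    return NULL;
}

/* A bulky pre-commit record: just remember it for now. */
static void
replay_precommit(unsigned int xid)
{
    lookup_slot(xid, 1)->nrecords++;
}

/* The real commit record: apply everything buffered, free the slot.
 * Returns how many buffered records were applied. */
static size_t
replay_commit(unsigned int xid)
{
    PendingCommitData *slot = lookup_slot(xid, 0);
    size_t n = slot ? slot->nrecords : 0;
    if (slot)
        slot->xid = 0;
    return n;
}

/* Abort: forget everything buffered for this transaction. */
static void
replay_abort(unsigned int xid)
{
    PendingCommitData *slot = lookup_slot(xid, 0);
    if (slot)
        slot->xid = 0;
}
```

The open question Tom raises maps onto this model directly: an aborted transaction may never emit an explicit abort record, so the replayer also needs some rule (e.g. discard all pending slots at end of recovery) for reclaiming buffered data.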
Re: [HACKERS] proposal: hide application_name from other users
On Tue, Jan 21, 2014 at 03:57:37PM -0800, Harold Giménez wrote:
>> It also means that monitoring tools must run as superuser to see information they require, which to me is a total showstopper.
>
> Well, the fact is that if you don't run monitoring tools as superuser, there may not be enough connection slots available anyways, in cases where actual usage is consuming all of max_connections, and only the reserved slots are available. So in a way it's already unreliable to run monitoring as non-superuser unfortunately.

You might need to run as superuser in these cases, but it is hard to see why you would need to do that in the normal case.

-- 
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ Everyone has their own god. +