Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg

2014-12-21 Thread Tomas Vondra
On 21.12.2014 02:54, Alvaro Herrera wrote:
 Tomas Vondra wrote:
 Attached is v5 of the patch, fixing an error with releasing a shared
 memory context (invalid flag values in a few calls).
 
 The functions that gain a new argument should get their comment updated,
 to explain what the new argument is for.

Right. I've added a short description of the 'subcontext' parameter to
all three variations of the initArray* function, and a more thorough
explanation to initArrayResult().

 Also, what is it with this hunk?
 
 @@ -4768,6 +4770,9 @@ makeMdArrayResult(ArrayBuildState *astate,
  
  MemoryContextSwitchTo(oldcontext);
  
 +/* we can only release the context if it's a private one. */
+// Assert(!(release && !astate->private_cxt));
 +
  /* Clean up all the junk */
  if (release)
  MemoryContextDelete(astate->mcontext);

That's a mistake, of course - the assert should not be commented out.

Attached is v6 of the patch, with the comments and assert fixed.


Thinking about the 'release' flag a bit more - maybe we could do this
instead:

if (release && astate->private_cxt)
MemoryContextDelete(astate->mcontext);
else if (release)
{
pfree(astate->dvalues);
pfree(astate->dnulls);
pfree(astate);
}

i.e. either destroy the whole context if possible, or just free the
memory when using a shared memory context. But I'm afraid this would
penalize the shared memory context, because that's intended for cases
where all the build states coexist in parallel and then at some point
are all converted into a result and thrown away. Adding pfree() calls is
no improvement here, and just wastes cycles.
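For illustration, the two call styles look roughly like this (a sketch only,
using the v6 signatures from the diff below; element_type and aggcontext
stand for whatever the caller has at hand):

    /* (a) private subcontext: makeArrayResult() can delete it wholesale */
    astate = initArrayResult(element_type, CurrentMemoryContext, true);

    /* (b) build state allocated directly in the caller's context: many such
     * states can coexist there and all disappear when that context goes
     * away, so no per-state pfree() is needed */
    astate = initArrayResult(element_type, aggcontext, false);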

regards
Tomas
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index d9faf20..9c97755 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -262,7 +262,7 @@ ExecScanSubPlan(SubPlanState *node,
 	/* Initialize ArrayBuildStateAny in caller's context, if needed */
 	if (subLinkType == ARRAY_SUBLINK)
 		astate = initArrayResultAny(subplan->firstColType,
-	CurrentMemoryContext);
+	CurrentMemoryContext, true);
 
 	/*
 	 * We are probably in a short-lived expression-evaluation context. Switch
@@ -964,7 +964,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
 	/* Initialize ArrayBuildStateAny in caller's context, if needed */
 	if (subLinkType == ARRAY_SUBLINK)
 		astate = initArrayResultAny(subplan->firstColType,
-	CurrentMemoryContext);
+	CurrentMemoryContext, true);
 
 	/*
 	 * Must switch to per-query memory context.
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 50ea4d2..f434fdd 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ b/src/backend/utils/adt/array_userfuncs.c
@@ -498,8 +498,13 @@ array_agg_transfn(PG_FUNCTION_ARGS)
 		elog(ERROR, "array_agg_transfn called in non-aggregate context");
 	}
 
-	state = PG_ARGISNULL(0) ? NULL : (ArrayBuildState *) PG_GETARG_POINTER(0);
+	if (PG_ARGISNULL(0))
+		state = initArrayResult(arg1_typeid, aggcontext, false);
+	else
+		state = (ArrayBuildState *) PG_GETARG_POINTER(0);
+
 	elem = PG_ARGISNULL(1) ? (Datum) 0 : PG_GETARG_DATUM(1);
+
 	state = accumArrayResult(state,
 			 elem,
 			 PG_ARGISNULL(1),
@@ -573,7 +578,22 @@ array_agg_array_transfn(PG_FUNCTION_ARGS)
 		elog(ERROR, "array_agg_array_transfn called in non-aggregate context");
 	}
 
-	state = PG_ARGISNULL(0) ? NULL : (ArrayBuildStateArr *) PG_GETARG_POINTER(0);
+
+	if (PG_ARGISNULL(0))
+	{
+		Oid			element_type = get_element_type(arg1_typeid);
+
+		if (!OidIsValid(element_type))
+			ereport(ERROR,
+	(errcode(ERRCODE_DATATYPE_MISMATCH),
+	 errmsg("data type %s is not an array type",
+			format_type_be(arg1_typeid))));
+
+		state = initArrayResultArr(arg1_typeid, element_type, aggcontext, false);
+	}
+	else
+		state = (ArrayBuildStateArr *) PG_GETARG_POINTER(0);
+
 	state = accumArrayResultArr(state,
 PG_GETARG_DATUM(1),
 PG_ARGISNULL(1),
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 933c6b0..0302caf 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4606,6 +4606,7 @@ array_insert_slice(ArrayType *destArray,
  *
  *	element_type is the array element type (must be a valid array element type)
  *	rcontext is where to keep working state
+ *	subcontext is a flag determining whether to use a separate memory context
  *
  * Note: there are two common schemes for using accumArrayResult().
  * In the older scheme, you start with a NULL ArrayBuildState pointer, and
@@ -4615,24 +4616,36 @@ array_insert_slice(ArrayType *destArray,
  * once per element.  In this scheme you always end with a non-NULL pointer
  * that you can pass to makeArrayResult; you get an empty array if there
  * were no elements.  This is preferred if an empty array is what you want.
+ *
+ * You may choose whether to 

Re: [HACKERS] pgbench -f and vacuum

2014-12-21 Thread Tatsuo Ishii
 On Sun, Dec 14, 2014 at 11:43 AM, Tatsuo Ishii is...@postgresql.org wrote:
 If we care enough about that case to attempt the vacuum anyway then we
 need to do something about the error message; either squelch it or
 check for the existence of the tables before attempting to
 vacuum. Since there's no way to squelch in the server logfile, I think
 checking for the table is the right answer.

 Fair enough. I will come up with checking for table before vacuum
 approach.
 
 +1 for this approach.

Here is the patch I promised.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index d69036a..6b07932 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -88,6 +88,8 @@ static int	pthread_create(pthread_t *thread, pthread_attr_t *attr, void *(*start
 static int	pthread_join(pthread_t th, void **thread_return);
 #endif
 
+static void executeStatement2(PGconn *con, const char *sql, const char *table);
+static bool is_table_exists(PGconn *con, const char *table);
 
 /
  * some configurable parameters */
@@ -605,6 +607,54 @@ executeStatement(PGconn *con, const char *sql)
 	PQclear(res);
 }
 
+/* call executeStatement() if table exists */
+static void
+executeStatement2(PGconn *con, const char *sql, const char *table)
+{
+	char		buf[1024];
+
+	snprintf(buf, sizeof(buf)-1, "%s %s", sql, table);
+
+	if (is_table_exists(con, table))
+			executeStatement(con, buf);
+}
+
+/*
+ * Return true if the table exists
+ */
+static bool
+is_table_exists(PGconn *con, const char *table)
+{
+	PGresult	*res;
+	char		buf[1024];
+	char		*result;
+
+	snprintf(buf, sizeof(buf)-1, "SELECT to_regclass('%s') IS NULL", table);
+
+	res = PQexec(con, buf);
+	if (PQresultStatus(res) != PGRES_TUPLES_OK)
+	{
+		return false;
+	}
+
+	result = PQgetvalue(res, 0, 0);
+
+	if (result == NULL)
+	{
+		PQclear(res);
+		return false;
+	}
+
+	if (*result == 't')
+	{
+		PQclear(res);
+		return false;	/* table does not exist */
+	}
+
+	PQclear(res);
+	return true;
+}
+
 /* set up a connection to the backend */
 static PGconn *
 doConnect(void)
@@ -3197,17 +3247,34 @@ main(int argc, char **argv)
 
 	if (!is_no_vacuum)
 	{
-		fprintf(stderr, "starting vacuum...");
-		executeStatement(con, "vacuum pgbench_branches");
-		executeStatement(con, "vacuum pgbench_tellers");
-		executeStatement(con, "truncate pgbench_history");
-		fprintf(stderr, "end.\n");
+		bool msg1 = false;
+		bool msg2 = false;
+
+		if (is_table_exists(con, "pgbench_branches"))
+			msg1 = true;
+
+		if (is_table_exists(con, "pgbench_accounts"))
+			msg2 = true;
+
+		if (msg1)
+			fprintf(stderr, "starting vacuum...");
+
+		executeStatement2(con, "vacuum", "pgbench_branches");
+		executeStatement2(con, "vacuum", "pgbench_tellers");
+		executeStatement2(con, "truncate", "pgbench_history");
+
+		if (msg1)
+			fprintf(stderr, "end.\n");
 
 		if (do_vacuum_accounts)
 		{
-			fprintf(stderr, "starting vacuum pgbench_accounts...");
-			executeStatement(con, "vacuum analyze pgbench_accounts");
-			fprintf(stderr, "end.\n");
+			if (msg2)
+				fprintf(stderr, "starting vacuum pgbench_accounts...");
+
+			executeStatement2(con, "vacuum analyze", "pgbench_accounts");
+
+			if (msg2)
+				fprintf(stderr, "end.\n");
 		}
 	}
 	PQfinish(con);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-12-21 Thread Andrew Dunstan


On 04/07/2014 04:19 PM, Tom Lane wrote:

Alvaro Herrera alvhe...@2ndquadrant.com writes:

I just noticed that this patch not only adds min,max,stddev, but it also
adds the ability to reset an entry's counters.  This hasn't been
mentioned in this thread at all; there has been no discussion on whether
this is something we want to have, nor on whether this is the right API.
What it does is add a new function pg_stat_statements_reset_time() which
resets the min and max values from all function's entries, without
touching the stddev state variables nor the number of calls.  Note
there's no ability to reset the values for a single function.

That seems pretty bizarre.  At this late date, the best thing would
probably be to strip out the undiscussed functionality.  It can get
resubmitted for 9.5 if there's a real use-case for it.





This seems to have fallen through the slats completely. I'd like to 
revive it for 9.5, without the extra function. If someone wants to be 
able to reset stats on a finer grained basis that should probably be a 
separate patch.


One thing that bothered me slightly is that the stddev calculation 
appears to use a rather naive sum of squares method of calculation, 
which is known to give ill conditioned or impossible results under some 
pathological conditions: see 
http://www.johndcook.com/blog/standard_deviation for some details. I 
don't know if our results are likely to be so skewed as to produce 
pathological results, but ISTM we should probably take the trouble to be 
correct and use Welford's method anyway.
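For reference, Welford's method in a self-contained form (along the lines of
the John D. Cook post above) looks like the sketch below; this is only an
illustration, not the pg_stat_statements code:

#include <math.h>
#include <stdio.h>

typedef struct
{
	long		n;
	double		mean;
	double		m2;		/* sum of squared deviations from the running mean */
} WelfordState;

static void
welford_accum(WelfordState *s, double x)
{
	double		delta = x - s->mean;

	s->n += 1;
	s->mean += delta / s->n;
	s->m2 += delta * (x - s->mean);		/* note: uses the updated mean */
}

static double
welford_stddev(const WelfordState *s)
{
	return (s->n > 1) ? sqrt(s->m2 / (s->n - 1)) : 0.0;
}

int
main(void)
{
	WelfordState s = {0, 0.0, 0.0};
	double		samples[] = {2, 4, 4, 4, 5, 5, 7, 9};
	int			i;

	for (i = 0; i < 8; i++)
		welford_accum(&s, samples[i]);

	printf("mean=%g stddev=%g\n", s.mean, welford_stddev(&s));
	return 0;
}

The running mean and m2 are updated incrementally, so no large sum of squares
ever has to be subtracted from another large number, which is where the
naive method loses precision.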


On my blog Peter Geoghegan mentioned something about atomic 
fetch-and-add being useful here, but I'm not quite sure what that's 
referring to. Perhaps someone can give me a pointer.


Thoughts?

cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench -f and vacuum

2014-12-21 Thread Fabrízio de Royes Mello
On Sun, Dec 21, 2014 at 12:58 PM, Tatsuo Ishii is...@postgresql.org wrote:

  On Sun, Dec 14, 2014 at 11:43 AM, Tatsuo Ishii is...@postgresql.org
wrote:
  If we care enough about that case to attempt the vacuum anyway then we
  need to do something about the error message; either squelch it or
  check for the existence of the tables before attempting to
  vacuum. Since there's no way to squelch in the server logfile, I think
  checking for the table is the right answer.
 
  Fair enough. I will come up with checking for table before vacuum
  approach.
 
  +1 for this approach.

 Here is the patch I promised.


Some comments:

- Error to apply to the current master:

$ git apply /home/fabrizio/Downloads/pgbench-f-noexit-v2.patch
/home/fabrizio/Downloads/pgbench-f-noexit-v2.patch:9: trailing whitespace.
static void executeStatement2(PGconn *con, const char *sql, const char
*table);
/home/fabrizio/Downloads/pgbench-f-noexit-v2.patch:10: trailing whitespace.
static bool is_table_exists(PGconn *con, const char *table);
/home/fabrizio/Downloads/pgbench-f-noexit-v2.patch:18: trailing whitespace.
/* call executeStatement() if table exists */
/home/fabrizio/Downloads/pgbench-f-noexit-v2.patch:19: trailing whitespace.
static void
/home/fabrizio/Downloads/pgbench-f-noexit-v2.patch:20: trailing whitespace.
executeStatement2(PGconn *con, const char *sql, const char *table)
error: patch failed: contrib/pgbench/pgbench.c:88
error: contrib/pgbench/pgbench.c: patch does not apply


+static void executeStatement2(PGconn *con, const char *sql, const char
*table);

I think we can use a better name like executeStatementIfTableExists.


+if (result == NULL)
+{
+PQclear(res);
+return false;
+}
+
+if (*result == 't')
+{
+PQclear(res);
+return false;/* table does not exist */
+}

To simplify isn't better this way?

if (result == NULL || *result == 't')
{
PQclear(res);
return false; /* table does not exist */
}


Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


[HACKERS] Proposal VACUUM SCHEMA

2014-12-21 Thread Fabrízio de Royes Mello
Hi all,

I work with some customers that have databases with a lot of schemas, and
sometimes we need to run a manual VACUUM on one schema, so it would be nice to
have a new option to run vacuum on relations from a specific schema.

The new syntax could be:

VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] {  [ table_name ] | SCHEMA
schema_name }

Also I'll add a new option to vacuumdb client:

-S, --schema=SCHEMA

I can work on this feature to 2015/02 CF.

Thoughts?

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] TABLESAMPLE patch

2014-12-21 Thread Tomas Vondra
Hi,

On 18.12.2014 13:14, Petr Jelinek wrote:
 Hi,
 
 v2 version of this patch is attached.

I did a review of this v2 patch today. I plan to do a bit more testing,
but these are my comments/questions so far:

(0) There's a TABLESAMPLE page at the wiki, not updated since 2012:

https://wiki.postgresql.org/wiki/TABLESAMPLE_Implementation

We should either update it or mark it as obsolete I guess. Also,
I'd like to know what's the status regarding the TODO items
mentioned there. Are those still valid with this patch?

(1) The patch adds a new catalog, but does not bump CATVERSION.

(2) The catalog naming (pg_tablesamplemethod) seems a bit awkward,
as it squishes everything into a single chunk. That's inconsistent
with naming of the other catalogs. I think pg_table_sample_method
would be better.

(3) There are a few more strange naming decisions, but that's mostly
because the SQL standard requires that naming. I mean SYSTEM and
BERNOULLI method names, and also the fact that the probability is
specified as 0-100 value, which is inconsistent with other places
(e.g. percentile_cont uses the usual 0-1 probability notion). But
I don't think this can be fixed, that's what the standard says.

(4) I noticed there's an interesting extension in SQL Server, which
allows specifying PERCENT or ROWS, so you can say

  SELECT * FROM table TABLESAMPLE SYSTEM (25 PERCENT);

or

  SELECT * FROM table TABLESAMPLE SYSTEM (2500 ROWS);

That seems handy, and it'd make migration from SQL Server easier.
What do you think?

(5) I envision a lot of confusion because of the REPEATABLE clause.
With READ COMMITTED, it's not really repeatable because of changes
done by the other users (and maybe things like autovacuum). Shall
we mention this in the documentation?

(6) This seems slightly wrong, because of long/uint32 mismatch:

  long seed = PG_GETARG_UINT32(1);

I think uint32 would be more appropriate, no?

(7) NITPICKING: I think a 'sample_rate' would be a better name here:

  double percent = sampler->percent;

(8) NITPICKING: InitSamplingMethod contains a command with ';;'

  fcinfo.arg[i] = (Datum) 0;;

(9) The current regression tests only use the REPEATABLE cases. I
understand queries without this clause are RANDOM, but maybe we
could do something like this:

   SELECT COUNT(*) BETWEEN 5000 AND 7000 FROM (
 SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (50)
   ) foo;

 Granted, there's still a small probability of false positive, but
 maybe that's sufficiently small? Or is the amount of code this
 tests negligible?

(10) In the initial patch you mentioned it's possible to write custom
 sampling methods. Do you think a CREATE TABLESAMPLE METHOD,
 allowing custom methods implemented as extensions would be useful?

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg

2014-12-21 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 i.e. either destroy the whole context if possible, or just free the
 memory when using a shared memory context. But I'm afraid this would
 penalize the shared memory context, because that's intended for cases
 where all the build states coexist in parallel and then at some point
 are all converted into a result and thrown away. Adding pfree() calls is
 no improvement here, and just wastes cycles.

FWIW, I quite dislike the terminology "shared memory context", because
it sounds too much like it means a context in shared memory.  I see
that the patch itself doesn't use that phrase, which is good, but can
we come up with some other phrase for talking about it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PrivateRefCount patch has got issues

2014-12-21 Thread Andres Freund
Hi,

On 2014-12-16 18:25:13 -0500, Tom Lane wrote:
 I just happened to look into bufmgr.c for the first time in awhile, and
 noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
 look too well thought out to me.  In particular, PinBuffer_Locked calls
 GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
 seems completely unacceptable.

Argh, yes. That certainly isn't ok.

The easiest way to fix that seems to be to declare that PinBuffer_Locked
can only be used when we're guaranteed to not have pinned the
buffer. That happens to be true for all the existing users. In fact all
of them even seem to require the refcount to be zero across all
backends.  That prerequisite then allows to increase the buffer header
refcount before releasing the spinlock *and* before increasing the
private refcount.

 It's also depressing that the very common code path
 ReleaseBuffer->UnpinBuffer results in a double search of the
 array/hashtable; that should be refactored to avoid that.

Sounds like a good idea, will see how that works out.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PrivateRefCount patch has got issues

2014-12-21 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-12-16 18:25:13 -0500, Tom Lane wrote:
 I just happened to look into bufmgr.c for the first time in awhile, and
 noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
 look too well thought out to me.  In particular, PinBuffer_Locked calls
 GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
 seems completely unacceptable.

 Argh, yes. That certainly isn't ok.

 The easiest way to fix that seems to be to declare that PinBuffer_Locked
 can only be used when we're guaranteed to not have pinned the
 buffer. That happens to be true for all the existing users. In fact all
 of them even seem to require the refcount to be zero across all
 backends.  That prerequisite then allows to increase the buffer header
 refcount before releasing the spinlock *and* before increasing the
 private refcount.

Hm, if you do it like that, what happens if we get a palloc failure while
trying to record the private refcount?  I think you must not bump the pin
count in shared memory unless you're certain you can record the fact that
you've done so.

The idea I'd been wondering about hinged on the same observation that we
know the buffer is not pinned (by our process) already, but the mechanics
would be closer to what we do in resource managers: reserve space first,
do the thing that needs to be remembered, bump the count using the
reserved space.  Given the way you've set this up, the idea boils down to
having a precheck call that forces there to be an empty slot in the local
fastpath array (by pushing something out to the hash table if necessary)
before we start trying to pin the buffer.  Then it's guaranteed that the
record step will succeed.  You could possibly even arrange it so that
it's known which array entry needs to be used and then the record part
is just a couple of inline instructions, so that it'd be OK to do that
while still holding the spinlock.  Otherwise it would still be a good idea
to do the record after releasing the spinlock IMO; but either way this
avoids the issue of possible state inconsistency due to a midflight
palloc failure.
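
As a self-contained toy (made-up names, not the actual bufmgr code), the
reserve-then-record pattern might look like this:

#include <stdbool.h>
#include <stdio.h>

#define FASTPATH_SLOTS 8

typedef struct
{
	int			buffer;
	int			refcount;
	bool		used;
} RefCountEntry;

static RefCountEntry fastpath[FASTPATH_SLOTS];
static int	reserved_slot = -1;

/*
 * Step 1: may fail (a real implementation would push an entry out to a
 * hash table here, which could hit an allocation failure), but nothing
 * has been pinned yet, so failing here is harmless.
 */
static bool
reserve_entry(void)
{
	int			i;

	for (i = 0; i < FASTPATH_SLOTS; i++)
	{
		if (!fastpath[i].used)
		{
			reserved_slot = i;
			return true;
		}
	}
	return false;
}

/* Step 2: cannot fail; it only fills in the slot reserved above. */
static void
record_pin(int buffer)
{
	fastpath[reserved_slot].buffer = buffer;
	fastpath[reserved_slot].refcount = 1;
	fastpath[reserved_slot].used = true;
	reserved_slot = -1;
}

int
main(void)
{
	if (reserve_entry())
	{
		/* here the shared refcount would be bumped under the spinlock;
		 * an allocation failure can no longer leave it orphaned */
		record_pin(42);
		printf("pinned buffer 42\n");
	}
	return 0;
}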

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-12-21 Thread Alvaro Herrera
Andrew Dunstan wrote:

 On my blog Peter Geoghegan mentioned something about atomic fetch-and-add
 being useful here, but I'm not quite sure what that's referring to. Perhaps
 someone can give me a pointer.

The point, I think, is that without atomic instructions you have to hold
a lock while incrementing the counters.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-12-21 Thread Andres Freund
On December 21, 2014 7:23:27 PM CET, Alvaro Herrera alvhe...@2ndquadrant.com 
wrote:
Andrew Dunstan wrote:

 On my blog Peter Geoghegan mentioned something about atomic fetch-and-add
 being useful here, but I'm not quite sure what that's referring to. Perhaps
 someone can give me a pointer.

The point, I think, is that without atomic instructions you have to hold
a lock while incrementing the counters.

Port/atomics.h provides an abstraction for doing such atomic manipulations.
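
For illustration only (not the pg_stat_statements code), a counter maintained
that way might look like this sketch, assuming 64-bit atomics are available
on the platform:

#include "postgres.h"
#include "port/atomics.h"

/* in real usage this would live in a shared-memory struct */
static pg_atomic_uint64 calls;

void
counters_init(void)
{
	pg_atomic_init_u64(&calls, 0);
}

void
count_one_call(void)
{
	/* no spinlock needed for a plain counter */
	pg_atomic_fetch_add_u64(&calls, 1);
}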

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-12-21 Thread Andrew Dunstan


On 12/21/2014 01:23 PM, Alvaro Herrera wrote:

Andrew Dunstan wrote:


On my blog Peter Geoghegan mentioned something about atomic fetch-and-add
being useful here, but I'm not quite sure what that's referring to. Perhaps
someone can give me a pointer.

The point, I think, is that without atomic instructions you have to hold
a lock while incrementing the counters.



Hmm, do we do that now? That won't work for the stddev method I was 
referring to, although it could for the sum of squares method. In that 
case I would like someone more versed in numerical analysis than me to 
tell me how safe using sum of squares actually is in our case. And how 
about min and max? They don't look like good candidates for atomic 
operations.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-12-21 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 12/21/2014 01:23 PM, Alvaro Herrera wrote:
 The point, I think, is that without atomic instructions you have to hold
 a lock while incrementing the counters.

 Hmm, do we do that now?

We already have a spinlock mutex around the counter adjustment code, so
I'm not sure why this discussion is being held.

 I would like someone more versed in numerical analysis than me to 
 tell me how safe using sum of squares actually is in our case.

That, on the other hand, might be a real issue.  I'm afraid that
accumulating across a very long series of statements could lead
to severe roundoff error in the reported values, unless we use
a method chosen for numerical stability.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal VACUUM SCHEMA

2014-12-21 Thread Tom Lane
Fabrízio de Royes Mello fabriziome...@gmail.com writes:
 I work with some customer that have databases with a lot of schemas and
 sometimes we need to run manual VACUUM in one schema, and would be nice to
 have a new option to run vacuum in relations from a specific schema.

I'm pretty skeptical of this alleged use-case.  Manual vacuuming ought
to be mostly a thing of the past, and even if it's not, hitting
*everything* in a schema should seldom be an appropriate thing to do.

While the feature itself might be fairly innocuous, I'm just wondering
why we need to encourage manual vacuuming.  And why that, but not
say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

2014-12-21 Thread Tomas Vondra
On 2.12.2014 06:14, Jeff Davis wrote:
 On Sun, 2014-11-30 at 17:49 -0800, Peter Geoghegan wrote:
 On Mon, Nov 17, 2014 at 11:39 PM, Jeff Davis pg...@j-davis.com wrote:
 I can also just move isReset there, and keep mem_allocated as a uint64.
 That way, if I find later that I want to track the aggregated value for
 the child contexts as well, I can split it into two uint32s. I'll hold
 off on any such optimizations until I see some numbers from HashAgg
 though.

 I took a quick look at memory-accounting-v8.patch.

 Is there some reason why mem_allocated is a uint64? All other things
 being equal, I'd follow the example of tuplesort.c's
 MemoryContextAllocHuge() API, which (following bugfix commit
 79e0f87a1) uses int64 variables to track available memory and so on.
 
 No reason. New version attached; that's the only change.

I've started reviewing this today. It does not apply cleanly on current
head, because of 4a14f13a committed a few days ago. That commit changed
the constants in  src/include/utils/hsearch.h, so the patch needs to use
this:

#define HASH_NOCHILDCXT 0x4000 /* ... */

That's the only conflict, and after fixing it it compiles OK. However, I
got a segfault on the very first query I tried :-(

  create table test_hash_agg as select i AS a, i AS b, i AS c, i AS d
from generate_series(1,1000) s(i);

  analyze test_hash_agg;

  select a, count(*) from test_hash_agg where a < 10 group by a;

With a fresh cluster (default config), I get this:

  server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
  The connection to the server was lost. Attempting reset: Failed.

The backtrace looks like this (full attached):

Program received signal SIGSEGV, Segmentation fault.
advance_transition_function (aggstate=aggstate@entry=0x2b5c5f0,
peraggstate=peraggstate@entry=0x2b5efd0,
pergroupstate=pergroupstate@entry=0x8) at nodeAgg.c:468
468 if (pergroupstate->noTransValue)
(gdb) bt
#0  advance_transition_function at nodeAgg.c:468
#1  0x005c3494 in advance_aggregates at nodeAgg.c:624
#2  0x005c3dc2 in agg_fill_hash_table at nodeAgg.c:1640
#3  ExecAgg (node=node@entry=0x2b5c5f0) at nodeAgg.c:1338

(gdb) print pergroupstate->noTransValue
Cannot access memory at address 0x11
(gdb) print pergroupstate
$1 = (AggStatePerGroup) 0x8

My guess is something is scribbling over the pergroup memory, or maybe
the memory context gets released, or something. In any case, it's easily
reproducible, and apparently deterministic (always exactly the same
values, no randomness).

regards
Tomas
Program received signal SIGSEGV, Segmentation fault.
advance_transition_function (aggstate=aggstate@entry=0x2b5c5f0, 
peraggstate=peraggstate@entry=0x2b5efd0, pergroupstate=pergroupstate@entry=0x8) 
at nodeAgg.c:468
468 if (pergroupstate->noTransValue)
(gdb) bt
#0  advance_transition_function (aggstate=aggstate@entry=0x2b5c5f0, 
peraggstate=peraggstate@entry=0x2b5efd0, pergroupstate=pergroupstate@entry=0x8) 
at nodeAgg.c:468
#1  0x005c3494 in advance_aggregates 
(aggstate=aggstate@entry=0x2b5c5f0, pergroup=pergroup@entry=0x8) at 
nodeAgg.c:624
#2  0x005c3dc2 in agg_fill_hash_table (aggstate=0x2b5c5f0) at 
nodeAgg.c:1640
#3  ExecAgg (node=node@entry=0x2b5c5f0) at nodeAgg.c:1338
#4  0x005b6c68 in ExecProcNode (node=node@entry=0x2b5c5f0) at 
execProcnode.c:486
#5  0x005b3e90 in ExecutePlan (dest=0xbffa40 <donothingDR>, 
direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', 
operation=CMD_SELECT, planstate=0x2b5c5f0, estate=0x2b5c4d8) at execMain.c:1480
#6  standard_ExecutorRun (queryDesc=0x2b58d78, direction=<optimized out>, 
count=0) at execMain.c:308
#7  0x0056df73 in ExplainOnePlan 
(plannedstmt=plannedstmt@entry=0x2b58ce0, into=into@entry=0x0, 
es=es@entry=0x7fff6fa066c0, queryString=<optimized out>, params=0x0, 
planduration=planduration@entry=0x7fff6fa06650) at explain.c:487
#8  0x0056e22d in ExplainOneQuery (query=<optimized out>, into=0x0, 
es=0x7fff6fa066c0, queryString=0x2b0bf68 "explain analyze select a, count(*) 
from test_hash_agg where a < 10 group by a;", params=0x0) at explain.c:341
#9  0x0056e670 in ExplainQuery (stmt=stmt@entry=0x2b0d248, 
queryString=0x2b0bf68 "explain analyze select a, count(*) from test_hash_agg 
where a < 10 group by a;", params=params@entry=0x0, dest=0x2b1b490) at 
explain.c:232
#10 0x006b5af6 in standard_ProcessUtility (parsetree=0x2b0d248, 
queryString=<optimized out>, context=<optimized out>, params=0x0, 
dest=<optimized out>, completionTag=0x7fff6fa067d0 "") at utility.c:648
#11 0x006b2c61 in PortalRunUtility (portal=portal@entry=0x2b4c3b8, 
utilityStmt=0x2b0d248, isTopLevel=<optimized out>, dest=dest@entry=0x2b1b490, 
completionTag=completionTag@entry=0x7fff6fa067d0 "") at pquery.c:1187
#12 0x006b3add in FillPortalStore (portal=portal@entry=0x2b4c3b8, 

Re: [HACKERS] controlling psql's use of the pager a bit more

2014-12-21 Thread Andrew Dunstan


On 11/15/2014 05:56 PM, Andrew Dunstan wrote:


On 11/13/2014 11:41 AM, Andrew Dunstan wrote:


On 11/13/2014 11:09 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:
I often get annoyed because psql is a bit too aggressive when it decides
whether to put output through the pager, and the only way to avoid this
is to turn the pager off (in which case your next query might dump many
thousands of lines to the screen). I'd like a way to be able to specify
a minimum number of lines of output before psql would invoke the pager,
rather than just always using the terminal window size.
Are you saying you'd want to set the threshold to *more* than the window
height?  Why?



Because I might be quite happy with 100 or 200 lines I can just 
scroll in my terminal's scroll buffer, but want to use the pager for 
more than that. This is useful especially if I want to scroll back 
and see the results from a query or two ago.








This patch shows more or less what I had in mind.

However, there is more work to do. As Tom noted upthread, psql's 
calculation of the number of lines is pretty bad. For example, if I do:


   \pset pager_min_lines 100
   select * from generate_series(1,50);

the pager still gets invoked, which is unfortunate to say the least.

So I'm going to take a peek at that.




The over-eager invocation of the pager due to double counting of lines 
got fixed recently, so here's a slightly updated patch for a 
pager_min_lines setting, including docco.


cheers

andrew
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e7fcc73..4201847 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2236,6 +2236,18 @@ lo_import 152801
   </varlistentry>
 
   <varlistentry>
+  <term><literal>pager_min_lines</literal></term>
+  <listitem>
+  <para>
+  If <literal>pager_min_lines</> is set to a number greater than the
+  page height, the pager program will not be called unless there are
+  at least this many lines of output to show. The default setting
+  is 0.
+  </para>
+  </listitem>
+  </varlistentry>
+
+  <varlistentry>
   <term><literal>recordsep</literal></term>
   <listitem>
   <para>
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c7a17d7..2703850 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -807,7 +807,7 @@ exec_command(const char *cmd,
 opt[--len] = '\0';
 		}
 
-		helpSQL(opt, pset.popt.topt.pager);
+		helpSQL(opt, pset.popt.topt.pager, pset.popt.topt.pager_min_lines);
 		free(opt);
 	}
 
@@ -1074,7 +1074,8 @@ exec_command(const char *cmd,
 			static const char *const my_list[] = {
 "border", "columns", "expanded", "fieldsep", "fieldsep_zero",
 "footer", "format", "linestyle", "null",
-"numericlocale", "pager", "recordsep", "recordsep_zero",
+"numericlocale", "pager", "pager_min_lines",
+"recordsep", "recordsep_zero",
 "tableattr", "title", "tuples_only",
 "unicode_border_linestyle",
 "unicode_column_linestyle",
@@ -1268,7 +1269,8 @@ exec_command(const char *cmd,
 	lines++;
 }
 
-output = PageOutput(lineno, pset.popt.topt.pager);
+output = PageOutput(lineno, pset.popt.topt.pager,
+	pset.popt.topt.pager_min_lines);
 is_pager = true;
 			}
 			else
@@ -1520,13 +1522,13 @@ exec_command(const char *cmd,
 	OT_NORMAL, NULL, false);
 
 		if (!opt0 || strcmp(opt0, "commands") == 0)
-			slashUsage(pset.popt.topt.pager);
+			slashUsage(pset.popt.topt.pager, pset.popt.topt.pager_min_lines);
 		else if (strcmp(opt0, "options") == 0)
-			usage(pset.popt.topt.pager);
+			usage(pset.popt.topt.pager, pset.popt.topt.pager_min_lines);
 		else if (strcmp(opt0, "variables") == 0)
-			helpVariables(pset.popt.topt.pager);
+			helpVariables(pset.popt.topt.pager, pset.popt.topt.pager_min_lines);
 		else
-			slashUsage(pset.popt.topt.pager);
+			slashUsage(pset.popt.topt.pager, pset.popt.topt.pager_min_lines);
 	}
 
 #if 0
@@ -2522,6 +2524,13 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			popt->topt.pager = 1;
 	}
 
+	/* set minimum lines for pager use */
+	else if (strcmp(param, "pager_min_lines") == 0)
+	{
+		if (value)
+			popt->topt.pager_min_lines = atoi(value);
+	}
+
 	/* disable (x rows) footer */
 	else if (strcmp(param, "footer") == 0)
 	{
@@ -2643,6 +2652,13 @@ printPsetInfo(const char *param, struct printQueryOpt *popt)
 			printf(_("Pager usage is off.\n"));
 	}
 
+	/* show minimum lines for pager use */
+	else if (strcmp(param, "pager_min_lines") == 0)
+	{
+		printf(_("Pager won't be used for less than %d lines\n"),
+			   popt->topt.pager_min_lines);
+	}
+
 	/* show record separator for unaligned text */
 	else if (strcmp(param, "recordsep") == 0)
 	{
@@ -2795,6 +2811,8 @@ pset_value_string(const char *param, struct printQueryOpt *popt)
 		return pstrdup(pset_bool_string(popt->topt.numericLocale));
 	else if (strcmp(param, "pager") == 0)
 		return psprintf("%d", 

Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

2014-12-21 Thread Tomas Vondra
On 21.12.2014 20:19, Tomas Vondra wrote:

 However, I got a segfault on the very first query I tried :-(
 
   create table test_hash_agg as select i AS a, i AS b, i AS c, i AS d
 from generate_series(1,1000) s(i);
 
   analyze test_hash_agg;
 
   select a, count(*) from test_hash_agg where a < 10 group by a;
 
 With a fresh cluster (default config), I get this:
 
   server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
   The connection to the server was lost. Attempting reset: Failed.
 
 The backtrace looks like this (full attached):
 
 Program received signal SIGSEGV, Segmentation fault.
 advance_transition_function (aggstate=aggstate@entry=0x2b5c5f0,
 peraggstate=peraggstate@entry=0x2b5efd0,
 pergroupstate=pergroupstate@entry=0x8) at nodeAgg.c:468
  468 if (pergroupstate->noTransValue)
 (gdb) bt
 #0  advance_transition_function at nodeAgg.c:468
 #1  0x005c3494 in advance_aggregates at nodeAgg.c:624
 #2  0x005c3dc2 in agg_fill_hash_table at nodeAgg.c:1640
 #3  ExecAgg (node=node@entry=0x2b5c5f0) at nodeAgg.c:1338
 
  (gdb) print pergroupstate->noTransValue
 Cannot access memory at address 0x11
 (gdb) print pergroupstate
 $1 = (AggStatePerGroup) 0x8
 
 My guess is something is scribbling over the pergroup memory, or maybe
 the memory context gets released, or something. In any case, it's easily
 reproducible, and apparently deterministic (always exactly the same
 values, no randomness).

BTW the fact that 'make installcheck' and 'make check' both pass yet a
simple query crashes, suggests there are no regression tests testing the
batching properly. Which should be part of the patch, I believe.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-12-21 Thread Andrew Dunstan


On 12/21/2014 02:12 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 12/21/2014 01:23 PM, Alvaro Herrera wrote:

The point, I think, is that without atomic instructions you have to hold
a lock while incrementing the counters.

Hmm, do we do that now?

We already have a spinlock mutex around the counter adjustment code, so
I'm not sure why this discussion is being held.


Because Peter suggested we might be able to use atomics. I'm a bit 
dubious that we can for min and max anyway.





I would like someone more versed in numerical analysis than me to
tell me how safe using sum of squares actually is in our case.

That, on the other hand, might be a real issue.  I'm afraid that
accumulating across a very long series of statements could lead
to severe roundoff error in the reported values, unless we use
a method chosen for numerical stability.





Right.

The next question along those lines is whether we need to keep a running 
mean or whether that can safely be calculated on the fly. The code at 
http://www.johndcook.com/blog/standard_deviation/ does keep a running 
mean, and maybe that's required to prevent ill conditioned results, 
although I'm not quite sure I see how it would. But this isn't my area of 
expertise.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final Patch for GROUPING SETS

2014-12-21 Thread Noah Misch
On Sat, Dec 13, 2014 at 04:37:48AM +, Andrew Gierth wrote:
  Tom == Tom Lane t...@sss.pgh.pa.us writes:
 
   I'd already explained in more detail way back when we posted the
   patch. But to reiterate: the ChainAggregate nodes pass through
   their input data unchanged, but on group boundaries they write
   aggregated result rows to a tuplestore shared by the whole
   chain. The top node returns the data from the tuplestore after its
   own output is completed.
 
  Tom That seems pretty grotty from a performance+memory consumption
  Tom standpoint.  At peak memory usage, each one of the Sort nodes
  Tom will contain every input row,
 
 Has this objection ever been raised for WindowAgg, which has the same
 issue?

I caution against using window function performance as the template for
GROUPING SETS performance goals.  The benefit of GROUPING SETS compared to its
UNION ALL functional equivalent is 15% syntactic pleasantness, 85% performance
opportunities.  Contrast that having window functions is great even with naive
performance, because they enable tasks that are otherwise too hard in SQL.

  Tom In principle, with the CTE+UNION approach I was suggesting, the
  Tom peak memory consumption would be one copy of the input rows in
  Tom the CTE's tuplestore plus one copy in the active branch's Sort
  Tom node.  I think a bit of effort would be needed to get there (ie,
  Tom shut down one branch's Sort node before starting the next,
  Tom something I'm pretty sure doesn't happen today).
 
 Correct, it doesn't.
 
 However, I notice that having ChainAggregate shut down its input would
 also have the effect of bounding the memory usage (to two copies,
 which is as good as the append+sorts+CTE case).

Agreed, and I find that more promising than the CTE approach.  Both strategies
require temporary space covering two copies of the input data.  (That, or you
accept rescanning the original input.)  The chained approach performs less
I/O.  Consider SELECT count(*) FROM t GROUP BY GROUPING SETS (a, b), where
pg_relation_size(t) > RAM.  I/O consumed with the chained approach:

  read table
  write tuplesort 1
  read tuplesort 1
  write tuplesort 2
  read tuplesort 2

I/O consumed with the CTE approach:

  read table
  write CTE
  read CTE
  write tuplesort 1
  read tuplesort 1
  read CTE
  write tuplesort 2
  read tuplesort 2

Tom rightly brought up the space requirements for result rows.  The CTE
approach naturally avoids reserving space for that.  However, I find it a safe
bet to optimize GROUPING SETS for input > result.  Reserving temporary space
for result rows to save input data I/O is a good trade.  We don't actually
need to compromise; one can imagine a GroupAggregateChain plan node with a
sortChain list that exhibits the efficiencies of both.  I'm fine moving
forward with the cross-node tuplestore, though.

The elephant in the performance room is the absence of hash aggregation.  I
agree with your decision to make that a follow-on patch, but the project would
be in an awkward PR situation if 9.5 has GroupAggregate-only GROUPING SETS.  I
may argue to #ifdef-out the feature rather than release that way.  We don't
need to debate that prematurely, but keep it in mind while planning.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal VACUUM SCHEMA

2014-12-21 Thread Fabrízio de Royes Mello
On Sun, Dec 21, 2014 at 5:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Fabrízio de Royes Mello fabriziome...@gmail.com writes:
  I work with some customer that have databases with a lot of schemas and
  sometimes we need to run manual VACUUM in one schema, and would be nice
to
  have a new option to run vacuum in relations from a specific schema.

 I'm pretty skeptical of this alleged use-case.  Manual vacuuming ought
 to be mostly a thing of the past, and even if it's not, hitting
 *everything* in a schema should seldom be an appropriate thing to do.


I agree manual vacuum is a thing of the past, but autovacuum doesn't solve
100% of the cases, and sometimes we need to use it, so my proposal is just
to help DBAs and/or Sysadmins write simple maintenance scripts.


 While the feature itself might be fairly innocuous, I'm just wondering
 why we need to encourage manual vacuuming.

IMHO we will not encourage manual vacuuming, just give more flexibility to
users.


 And why that, but not
 say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...


+1. I can write patches for each of these maintenance statements too.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] pgbench -f and vacuum

2014-12-21 Thread Tatsuo Ishii
 - Error to apply to the current master:

Works for me. 

$ git apply ~/pgbench-f-noexit-v2.patch 
$

Maybe git version difference or the patch file was malformed by mail
client?

 +static void executeStatement2(PGconn *con, const char *sql, const char
 *table);
 
 I think we can use a better name like executeStatementIfTableExists.

Point taken.

 +if (result == NULL)
 +{
 +PQclear(res);
 +return false;
 +}
 +
 +if (*result == 't')
 +{
 +PQclear(res);
 +return false;/* table does not exist */
 +}
 
 To simplify isn't better this way?
 
 if (result == NULL || *result == 't')
 {
 PQclear(res);
 return false; /* table does not exists */
 }

Thanks for pointing it out.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench -f and vacuum

2014-12-21 Thread Tomas Vondra
Hi,

On 21.12.2014 15:58, Tatsuo Ishii wrote:
 On Sun, Dec 14, 2014 at 11:43 AM, Tatsuo Ishii is...@postgresql.org wrote:
 If we care enough about that case to attempt the vacuum anyway
 then we need to do something about the error message; either
 squelch it or check for the existence of the tables before
 attempting to vacuum. Since there's no way to squelch in the
 server logfile, I think checking for the table is the right
 answer.

 Fair enough. I will come up with checking for table before
 vacuum approach.

 +1 for this approach.
 
 Here is the patch I promised.

First of all - I'm not entirely convinced the IF EXISTS approach is
somehow better than -f implies -n suggested before, but I don't have a
strong preference either.

Regarding the patch:

(1) I agree with Fabrizio that the 'executeStatement2' is not the best
naming as it does not show the 'if exists' intent.


(2) The 'executeStatement2' API is a bit awkward as the signature

executeStatement2(PGconn *con, const char *sql, const char *table);

suggests that the 'sql' command is executed when 'table' exists.
But that's not the case, because what actually gets executed is
'sql table'.


(3) The 'is_table_exists' should be probably just 'table_exists'.


(4) The SQL used in is_table_exists to check table existence seems
slightly wrong, because 'to_regclass' works for other relation
kinds, not just regular tables - it will match views for example.
While a conflict like that (having an object with the right name
but not a regular table) is rather unlikely I guess, a more correct
query would be this:

  SELECT oid FROM pg_class WHERE relname = '%s' AND relkind = 'r';


(5) I'm not a libpq expert, but I don't see how the PQgetvalue() could
return anything except true/false, so the

if (result == NULL)
{
PQclear(res);
return false;
}

seems a bit pointless to me. But maybe it's necessary?


(6) The is_table_exists might be further simplified along these lines:

static bool
is_table_exists(PGconn *con, const char *table)
{
    PGresult   *res;
    char        buf[1024];
    char       *result;
    bool        retval;

    snprintf(buf, sizeof(buf)-1,
             "SELECT to_regclass('%s') IS NULL", table);

    res = PQexec(con, buf);
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        return false;
    }

    result = PQgetvalue(res, 0, 0);

    retval = (*result == 't');

    PQclear(res);

    return retval;
}


(7) I also find the coding in main() around line 3250 a bit clumsy. The
first reason is that it only checks existence of 'pgbench_branches'
and then vacuums pgbench_branches, pgbench_tellers and pgbench_history
in the same block. If pgbench_branches does not exist, there will be no
message printed (but the tables will still be vacuumed).

The second reason is that the msg1, msg2 variables seem unnecessary.
IMHO this is a bit better:

if (!is_no_vacuum)
{
    if (is_table_exists(con, "pgbench_branches"))
    {
        fprintf(stderr, "starting vacuum...");

        executeStatement2(con, "vacuum", "pgbench_branches");
        executeStatement2(con, "vacuum", "pgbench_tellers");
        executeStatement2(con, "truncate", "pgbench_history");

        fprintf(stderr, "end.\n");
    }

    if (do_vacuum_accounts)
    {
        if (is_table_exists(con, "pgbench_accounts"))
        {
            fprintf(stderr, "starting vacuum pgbench_accounts...");

            executeStatement(con,
                             "vacuum analyze pgbench_accounts");

            fprintf(stderr, "end.\n");
        }
    }
}

(8) Also, I think it's not necessary to define function prototypes for
executeStatement2 and is_table_exists. It certainly is not
consistent with the other functions defined in pgbench.c (e.g.
there's no prototype for executeStatement). Just delete the two
prototypes and move is_table_exists before executeStatement2.


regards
Tomas







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLESAMPLE patch

2014-12-21 Thread Michael Paquier
On Mon, Dec 22, 2014 at 2:38 AM, Tomas Vondra t...@fuzzy.cz wrote:
 (1) The patch adds a new catalog, but does not bump CATVERSION.
FWIW, this part is managed by the committer when this patch is picked up.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel mode and parallel contexts

2014-12-21 Thread Michael Paquier
On Thu, Dec 18, 2014 at 4:53 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Dec 17, 2014 at 9:16 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 12 December 2014 at 22:52, Robert Haas robertmh...@gmail.com wrote:

 I would be remiss if I failed to mention that this patch includes work
 by my colleagues Amit Kapila, Rushabh Lathia, and Jeevan Chalke, as
 well as my former colleague Noah Misch; and that it would not have
 been possible without the patient support of EnterpriseDB management.

 Noted and thanks to all.

 I will make this my priority for review, but regrettably not until New
 Year, about 2.5-3 weeks away.

 OK!

 In the meantime, I had a good chat with Heikki on IM yesterday which
 gave me some new ideas on how to fix up the transaction handling in
 here, which I am working on implementing.  So hopefully I will have
 that by then.

 I am also hoping Amit will be adapting his parallel seq-scan patch to
 this framework.
Simon, if you are planning to review this patch soon, could you add
your name as a reviewer for this patch in the commit fest app?
Thanks,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg

2014-12-21 Thread Ali Akbar

 Another positive benefit is that this won't break the code unless it
 uses the new API. This is a problem especially with external code (e.g.
 extensions), but the new API (initArray*) is not part of 9.4 so there's
 no such code. So that's nice.

 The one annoying thing is that this makes the API slighly unbalanced.
 With the new API you can use a shared memory context, which with the old
 one (not using the initArray* methods) you can't.

 But I'm OK with that, and it makes the patch smaller (15kB -> 11kB).


Yes, with this API, we can backpatch this patch to 9.4 (or below) if we
need it there.

I think this API is a good compromise of old API and new API. Ideally if we
can migrate all code to new API (all code must call initArrayResult* before
accumArrayResult*), we can remove parameter MemoryContext rcontext from
accumArrayResult. Currently, the code isn't using the rcontext for anything
except for old API calls (in first call to accumArrayResult).

2014-12-21 20:38 GMT+07:00 Tomas Vondra t...@fuzzy.cz:

 On 21.12.2014 02:54, Alvaro Herrera wrote:
  Tomas Vondra wrote:
  Attached is v5 of the patch, fixing an error with releasing a shared
  memory context (invalid flag values in a few calls).
 
  The functions that gain a new argument should get their comment updated,
  to explain what the new argument is for.

 Right. I've added a short description of the 'subcontext' parameter to
 all three variations of the initArray* function, and a more thorough
 explanation to initArrayResult().



With this API, I think we should make it clear that if we call initArrayResult
with subcontext=false, we can't call makeArrayResult, but must use
makeMdArrayResult directly.

Or better, we can modify makeArrayResult to release according to
astate->private_cxt:

 @@ -4742,7 +4742,7 @@ makeArrayResult(ArrayBuildState *astate,
 dims[0] = astate->nelems;
 lbs[0] = 1;

 -   return makeMdArrayResult(astate, ndims, dims, lbs, rcontext, true);
 +   return makeMdArrayResult(astate, ndims, dims, lbs, rcontext,
 astate->private_cxt);




Or else we implement what you suggest below (more comments below):

Thinking about the 'release' flag a bit more - maybe we could do this
 instead:

 if (release && astate->private_cxt)
 MemoryContextDelete(astate->mcontext);
 else if (release)
 {
 pfree(astate->dvalues);
 pfree(astate->dnulls);
 pfree(astate);
 }

  i.e. either destroy the whole context if possible, or just free the
 memory when using a shared memory context. But I'm afraid this would
 penalize the shared memory context, because that's intended for cases
 where all the build states coexist in parallel and then at some point
 are all converted into a result and thrown away. Adding pfree() calls is
 no improvement here, and just wastes cycles.


As per Tom's comment, I'm using "parent memory context" instead of "shared
memory context" below.

In the future, if some code writer decides to use subcontext=false to save
memory in cases where there are many array accumulations, and the parent
memory context is long-lived, the current code can cause a memory leak. So I
think we should implement your suggestion (pfreeing astate), and warn about
the implication in the API comment. The API user must choose between
release=true, wasting cycles but preventing a memory leak, or managing memory
via the parent memory context.

In one possible use case, for efficiency the caller might create a special
parent memory context for all array accumulation, then use makeArrayResult*
with release=false, and at the end release the parent memory context once for
all of them, as sketched below.
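
A rough sketch of that use case (backend-style pseudo-code under the v6 API,
not a complete example):

	/* one long-lived parent context for all the build states */
	MemoryContext accum_cxt = AllocSetContextCreate(CurrentMemoryContext,
													"array accumulation",
													ALLOCSET_DEFAULT_MINSIZE,
													ALLOCSET_DEFAULT_INITSIZE,
													ALLOCSET_DEFAULT_MAXSIZE);

	/* many build states, none with a private subcontext */
	for (i = 0; i < nstates; i++)
		astate[i] = initArrayResult(element_type, accum_cxt, false);

	/* ... accumArrayResult() calls, then makeMdArrayResult(..., release=false) ... */

	/* throw everything away at once instead of pfree()ing each state */
	MemoryContextDelete(accum_cxt);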

Regards,
-- 
Ali Akbar


Re: [HACKERS] btree_gin and ranges

2014-12-21 Thread Michael Paquier
On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 10/22/2014 01:55 PM, Teodor Sigaev wrote:

 Suggested patch adds GIN support for the contains operator for ranges over
 a scalar column.

 It allows more effective GIN scan. Currently, queries like
 SELECT * FROM test_int4 WHERE i <= 1 and i >= -1
 will be executed by GIN with two scans: one is from minus infinity to 1 and
 another is from -1 to plus infinity. That's because GIN is generalized and it
 doesn't know the semantics of the operation.

 With the patch it's possible to rewrite the query with ranges
 SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
 and the GIN index will support this query with a single scan from -1 to 1.

 Patch provides index support only for existing range types: int4, int8,
 numeric,
 date and timestamp with and without time zone.


 I started to look at this, but very quickly got carried away into
 refactoring away the macros. Defining long functions as macros makes
 debugging quite difficult, and it's not nice to read or edit the source code
 either.

 I propose the attached refactoring (it doesn't include your range-patch yet,
 just refactoring the existing code). It turns the bulk of the macros into
 static functions. GIN_SUPPORT macro still defines the datatype-specific
 functions, but they are now very thin wrappers that just call the
 corresponding generic static functions.

 It's annoying that we need the concept of a leftmost value, for the < and <=
 queries. Without that, we could have the support functions look up the
 required datatype information from the type cache, based on the datatype of
 the passed argument. Then you could easily use the btree_gin support
 functions also for user-defined datatypes. But that needs bigger changes,
 and this is a step in the right direction anyway.
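
The shape of that refactoring, reduced to a self-contained toy (not the
actual btree_gin code): the type-specific function becomes a thin
macro-generated wrapper around one generic static worker, which is the part
you can actually step through in a debugger.

#include <stdio.h>

/* generic worker: a single copy of the real logic */
static int
generic_cmp(long a, long b)
{
	return (a < b) ? -1 : ((a > b) ? 1 : 0);
}

/* GIN_SUPPORT-style macro: emits only a thin type-specific wrapper */
#define GIN_SUPPORT(type) \
static int \
type##_cmp(type a, type b) \
{ \
	return generic_cmp((long) a, (long) b); \
}

GIN_SUPPORT(int)
GIN_SUPPORT(short)

int
main(void)
{
	printf("%d %d\n", int_cmp(3, 5), short_cmp(9, 2));
	return 0;
}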
I just had a look at the refactoring patch and ISTM that this is a
good step forward in terms of readability. Teodor, I am noticing that
your patch cannot apply once the refactoring is done. Could you rebase
your patch once the refactoring is pushed?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2014-12-21 Thread Michael Paquier
On Tue, Dec 16, 2014 at 7:04 PM, David Rowley dgrowle...@gmail.com wrote:
 It also looks like your OIDs have been nabbed by some jsonb stuff.
 DETAIL:  Key (oid)=(3267) is duplicated.
Use src/include/catalog/unused_oids to track the OIDs not yet used in
the catalogs when adding new objects for a feature.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2014-12-21 Thread Jim Nasby

On 12/21/14, 12:42 AM, Amit Kapila wrote:

On Fri, Dec 19, 2014 at 6:21 PM, Stephen Frost sfr...@snowman.net 
wrote:
a. Instead of passing value array, just pass tuple id, but retain the
buffer pin till master backend reads the tuple based on tupleid.
This has side effect that we have to retain buffer pin for longer
period of time, but again that might not have any problem in
real world usage of parallel query.

b. Instead of passing value array, pass directly the tuple which could
be directly propagated by master backend to upper layer or otherwise
in master backend change some code such that it could propagate the
tuple array received via shared memory queue directly to frontend.
Basically save the one extra cycle of form/deform tuple.

Both these need some new message type and handling for same in
Executor code.

Having said above, I think we can try to optimize this in multiple
ways, however we need additional mechanism and changes in Executor
code which is error prone and doesn't seem to be important at this
stage where we want the basic feature to work.


Would (b) require some means of ensuring we didn't try to pass raw tuples to
frontends? Other than that potential wrinkle, it seems like less work than (a).

...


I think there are mainly two things which can lead to benefit
by employing parallel workers
a. Better use of available I/O bandwidth
b. Better use of available CPU's by doing expression evaluation
by multiple workers.


...


In the above tests, it seems to me that the maximum benefit due to
 'a' is realized up to 4~8 workers


I'd think a good first estimate here would be to just use 
effective_io_concurrency.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-21 Thread Michael Paquier
On Sun, Dec 21, 2014 at 6:56 AM, Peter Geoghegan p...@heroku.com wrote:
 On Thu, Dec 18, 2014 at 9:31 AM, Peter Geoghegan p...@heroku.com wrote:
 So I think there needs to be some kind of logic to de-recognize the table
 alias foo.

 Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this
 through an adaptation of my usual torture test, and it ran fine until
 wraparound shutdown.  I'll poke at it more later.

 Oops. I agree with your diagnosis, and will circle around to fix that
 bug in the next revision

 Attached patch fixes the bug. I'm not delighted about the idea of
 cutting off parent parse state (the parse state of the insert) within
 transformUpdateStmt() only once we've used the parent state to
 establish that this is a speculative/auxiliary update, but it's
 probably the path of least resistance here.

 When this is rolled into the next version, there will be a testcase.
Looking at this thread, the last version of this patch is available here:
http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrjvu-ec...@mail.gmail.com
And they do not apply correctly, so this patch needs a rebase.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-21 Thread Peter Geoghegan
On Sun, Dec 21, 2014 at 6:10 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Looking at this thread, the last version of this patch is available here:
 http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrjvu-ec...@mail.gmail.com
 And they do not apply correctly, so this patch needs a rebase.

That isn't so. The latest version is much more recent than that. It's
available here:

http://www.postgresql.org/message-id/cam3swzqtqsclz1yj1ouwfpo-gmfhwtgwtog+o_nnzxrpa7c...@mail.gmail.com

Everything is tracked in the commitfest app in detail.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal VACUUM SCHEMA

2014-12-21 Thread Jim Nasby

On 12/21/14, 3:30 PM, Fabrízio de Royes Mello wrote:


On Sun, Dec 21, 2014 at 5:18 PM, Tom Lane t...@sss.pgh.pa.us 
 wrote:
 
  Fabrízio de Royes Mello fabriziome...@gmail.com writes:
   I work with some customers that have databases with a lot of schemas and
   sometimes we need to run a manual VACUUM in one schema, and it would be nice
   to have a new option to run vacuum on relations from a specific schema.
 
  I'm pretty skeptical of this alleged use-case.  Manual vacuuming ought
  to be mostly a thing of the past, and even if it's not, hitting
  *everything* in a schema should seldom be an appropriate thing to do.
 

I agree manual vacuum is a thing of the past, but autovacuum doesn't solve 100%
of the cases, and sometimes we need to use it, so my proposal is just to help
DBAs and/or sysadmins write simple maintenance scripts.


Just one example of that is pre-emptively vacuuming during slower periods. Nothing spells 
fun like a freeze vacuum in the middle of a busy lunch period for a website.

Similarly, it's common to need to proactively vacuum after a data load, and 
since it's not unusual for there to be a schema dedicated to loading data, this 
makes that easier.


  And why that, but not
  say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
 

+1. I can write patches for each of these maintenance statements too.


If we're going to go that route, then perhaps it would make more sense to 
create a command that allows you to apply a second command to every object in a 
schema. We would have to be careful about PreventTransactionChain commands.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-21 Thread Michael Paquier
On Mon, Dec 22, 2014 at 11:20 AM, Peter Geoghegan p...@heroku.com wrote:
 On Sun, Dec 21, 2014 at 6:10 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 Looking at this thread, the last version of this patch is available here:
 http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrjvu-ec...@mail.gmail.com
 And they do not apply correctly, so this patch needs a rebase.

 That isn't so. The latest version is much more recent than that. It's
 available here:

 http://www.postgresql.org/message-id/cam3swzqtqsclz1yj1ouwfpo-gmfhwtgwtog+o_nnzxrpa7c...@mail.gmail.com

 Everything is tracked in the commitfest app in detail.
Oops, sorry. I was mistaken because of the name of the latest attachments.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg

2014-12-21 Thread Jim Nasby

On 12/21/14, 7:08 PM, Ali Akbar wrote:

Another positive benefit is that this won't break the code unless it
uses the new API. This is a problem especially with external code (e.g.
extensions), but the new API (initArray*) is not part of 9.4 so there's
no such code. So that's nice.

The one annoying thing is that this makes the API slightly unbalanced.
With the new API you can use a shared memory context, which with the old
one (not using the initArray* methods) you can't.

But I'm OK with that, and it makes the patch smaller (15kB -> 11kB).


Yes, with this API, we can backpatch this patch to 9.4 (or below) if we need it 
there.

I think this API is a good compromise between the old and new APIs. Ideally, if
we can migrate all code to the new API (all code must call initArrayResult*
before accumArrayResult*), we can remove the MemoryContext rcontext parameter
from accumArrayResult. Currently the code isn't using rcontext for anything
except old-API calls (in the first call to accumArrayResult).


Until we eliminate the old API though, we should leave something in place that
still uses it, to make certain we don't accidentally break it.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal VACUUM SCHEMA

2014-12-21 Thread Fabrízio de Royes Mello
On Monday, December 22, 2014, Jim Nasby
jim.na...@bluetreble.com wrote:

 On 12/21/14, 3:30 PM, Fabrízio de Royes Mello wrote:


 On Sun, Dec 21, 2014 at 5:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  
   Fabrízio de Royes Mello fabriziome...@gmail.com writes:
I work with some customers that have databases with a lot of schemas and
sometimes we need to run a manual VACUUM in one schema, and it would be nice
to have a new option to run vacuum on relations from a specific schema.
  
   I'm pretty skeptical of this alleged use-case.  Manual vacuuming ought
   to be mostly a thing of the past, and even if it's not, hitting
   *everything* in a schema should seldom be an appropriate thing to do.
  

 I agree manual vacuum is a thing of the past, but autovacuum doesn't
 solve 100% of the cases, and sometimes we need to use it, so my proposal is
 just to help DBAs and/or sysadmins write simple maintenance scripts.


 Just one example of that is pre-emptively vacuuming during slower periods.
 Nothing spells fun like a freeze vacuum in the middle of a busy lunch
 period for a website.

 Similarly, it's common to need to proactively vacuum after a data load,
 and since it's not unusual for there to be a schema dedicated to loading
 data, this makes that easier.


Good example. Thanks.




And why that, but not
   say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
  

 +1. I can write patches for each of these maintenance statements too.


 If we're going to go that route, then perhaps it would make more sense to
 create a command that allows you to apply a second command to every object
 in a schema. We would have to be careful about PreventTransactionChain
 commands.


 Sorry but I don't understand what you meant. Can you explain more about
your idea?

Regards,

Fabrízio Mello


-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] Re[3]: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code

2014-12-21 Thread Michael Paquier
On Tue, Nov 4, 2014 at 6:25 AM, Alexey Vasiliev leopard...@inbox.ru wrote:
 Added new patch.
Seems useful to me to be able to tune this interval of time.

I would simply reword the documentation as follows:
If <varname>restore_command</> returns a nonzero exit status code, retry the
command after the interval of time specified by this parameter.
Default value is <literal>5s</>.

Also, I think that it would be a good idea to error out if this
parameter has a value of, let's say, less than 1s, instead of doing a
check for a positive value at the waiting latch. On top of that, the
default value of restore_command_retry_interval should be 5000 and not
0, to simplify the code.
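
A rough sketch of that validation (not the actual patch), assuming the option
is parsed in readRecoveryCommandFile() alongside the other recovery.conf
settings; the variable name and the message wording are illustrative:

else if (strcmp(item->name, "restore_command_retry_interval") == 0)
{
	restore_command_retry_interval = pg_atoi(item->value, sizeof(int32), 0);
	if (restore_command_retry_interval < 1000)	/* milliseconds */
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("restore_command_retry_interval must be at least 1 second")));
}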
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add transforms feature

2014-12-21 Thread Michael Paquier
On Mon, Dec 15, 2014 at 3:10 PM, Peter Eisentraut pete...@gmx.net wrote:
 fixed
This patch needs a rebase, it does not apply correctly in a couple of
places on latest HEAD (699300a):
./src/include/catalog/catversion.h.rej
./src/include/catalog/pg_proc.h.rej
./src/pl/plpython/plpy_procedure.c.rej
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fractions in GUC variables

2014-12-21 Thread Michael Paquier
On Tue, Dec 16, 2014 at 12:19 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 12/15/14 8:56 AM, Heikki Linnakangas wrote:
 Overall, I feel that this isn't really worth the trouble. We use
 fractions consistently now, so there isn't much room for confusion over
 what the current values mean. Using a percentage might be more familiar
 for some people, but OTOH you'll have to get used to the fractions
 anyway, unless we change the default output format too, and I'm not in
 favour of doing that. I suggest that we just drop this, and remove the
 TODO item.

 Agreed.

 The patch is sound as far as it goes (I might be inclined to accept
 whitespace between number and % sign), but given the above points and
 the original reason for it having been eliminated, I'm inclined to drop it.
+1.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] no test programs in contrib

2014-12-21 Thread Michael Paquier
On Sat, Nov 29, 2014 at 5:54 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Peter Eisentraut wrote:
 On 11/27/14 3:10 PM, Tom Lane wrote:
  I'm not too happy with that approach, because packagers are going to
  tend to think they should package any files installed by install-world.
  The entire point of this change, IMO, is that the test modules should
  *not* get installed, certainly not by normal install targets.  Being
  consistent with the existing contrib packaging is exactly not what we
  want.

 I share this objection.

 Okay, the attached version does it that way.

 I also attach some changes for the MSVC build stuff.  I tested it and it
 builds fine AFAICT, but it doesn't install because Install.pm wants to
 install contrib modules from contrib/ (which seems reasonable) but my
 hack adds the src/test/modules/ as contrib modules also, so Install.pm
 goes bonkers.  I'm not even sure *what* we're supposed to build -- there
 is no distinction in these programs as there is in the makefiles about
 what to install.  So if some Windows developer can look into this, I'd
 appreciate it.

 But the existing main regression tests are able to run against an
 existing installation while using the modules autoinc.so and refint.so
 without installing them.  I think the problem is that we are able to
 load a .so file from just about anywhere, but we can't load a full
 extension in the same way.  There have been discussions about that, in
 the context of being able to test an externally developed extension
 before/without installing it.  This is pretty much the same case.

 I'm leaving that problem for someone else to solve.

 Andres Freund wrote:
 On 2014-11-27 15:51:51 -0500, Tom Lane wrote:
 
  If we follow that reasoning we'll end up removing nothing from contrib.
  There is no reason that end users should need to be performing such
  testing; anyone who does have reason to do it will have a source tree
  at hand.

 Actually I don't think that's true for test_decoding - there's quite a
 bit of actual things that you can do with it. At the very least it's
 useful to roughly measure the impact logical replication would have on a
 database, but it's also helpful to look at the changes. And even if the
 format isn't super nice, thanks to Robert's insistence it's actually
 safely parseable if necessary.

 Argh.  Okay, the attached doesn't move test_decoding either.

 I think it's fine anyway -- I'm sure we will come up with a few
 additional test modules, such as the one for the commit_ts patch.
When src/test/modules is compiled directly after running ./configure,
make complains about utils/errcodes.h missing:
In file included from worker_spi.c:23:
In file included from ../../../../src/include/postgres.h:48:
../../../../src/include/utils/elog.h:69:10: fatal error:
'utils/errcodes.h' file not found
#include "utils/errcodes.h"

Shouldn't src/test/modules/Makefile include .PHONY with
submake-errcodes, as in the patch attached, for example?
Regards,
-- 
Michael
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 93d93af..7ca3eb0 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -11,6 +11,7 @@ SUBDIRS = \
 		  test_shm_mq \
 		  test_parser
 
+.PHONY: submake-errcodes
 all: submake-errcodes
 
 submake-errcodes:

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2014-12-21 Thread Amit Kapila
On Mon, Dec 22, 2014 at 7:34 AM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 12/21/14, 12:42 AM, Amit Kapila wrote:

 On Fri, Dec 19, 2014 at 6:21 PM, Stephen Frost sfr...@snowman.net
wrote:
 a. Instead of passing value array, just pass tuple id, but retain the
 buffer pin till master backend reads the tuple based on tupleid.
 This has side effect that we have to retain buffer pin for longer
 period of time, but again that might not have any problem in
 real world usage of parallel query.

 b. Instead of passing value array, pass directly the tuple which could
 be directly propagated by master backend to upper layer or otherwise
 in master backend change some code such that it could propagate the
 tuple array received via shared memory queue directly to frontend.
 Basically save the one extra cycle of form/deform tuple.

 Both these need some new message type and handling for same in
 Executor code.

 Having said above, I think we can try to optimize this in multiple
 ways, however we need additional mechanism and changes in Executor
 code which is error prone and doesn't seem to be important at this
 stage where we want the basic feature to work.


 Would (b) require some means of ensuring we didn't try to pass raw tuples
to frontends?

That seems to be already there; before sending the tuple
to the frontend, we already ensure it is deformed (refer to
printtup()->slot_getallattrs()).

Other than that potential wrinkle, it seems like less work than a.


Here I am assuming that you are referring to the *pass the tuple
directly* approach. We would also need to devise a new protocol message
and a mechanism to pass the tuple directly via the shared memory queues;
also, I think we can currently only send through the shared memory queues
what we can send via the FE/BE protocol, and we don't send tuples
directly to the frontend. Apart from this, I am not sure how much benefit
it would give, because it reduces one part of the tuple communication, but
the amount of data transferred stays almost the same.
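
A hedged sketch of what the worker side of "pass the tuple directly" might
look like, just to make the idea concrete (mqh and tuple are assumed to be
set up elsewhere, and a real implementation would also need a new message
type so that the master knows the bytes are a tuple):

	shm_mq_result res;

	/* ship the raw tuple body through the shared memory queue */
	res = shm_mq_send(mqh, tuple->t_len, tuple->t_data, false);
	if (res != SHM_MQ_SUCCESS)
		ereport(ERROR,
				(errmsg("could not send tuple to shared memory queue")));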

This is an area of improvement that needs more investigation, and even
without it we can get a benefit in many cases, as shown upthread. After
that, I think we can try to parallelize the aggregation (Simon Riggs and
David Rowley have already worked out some infrastructure for that), which
will surely give us good benefits. So I suggest it's better to focus on
the remaining things needed to get this patch into a shape (in terms of
robustness/stability) where it can be accepted, rather than trying to
optimize tuple communication, which we can do later as well.

 ...

 I think there are mainly two things which can lead to benefit
 by employing parallel workers
 a. Better use of available I/O bandwidth
 b. Better use of available CPU's by doing expression evaluation
 by multiple workers.


 ...

 In the above tests, it seems to me that the maximum benefit due to
 'a' is realized up to 4~8 workers


 I'd think a good first estimate here would be to just use
effective_io_concurrency.


One thing we should be cautious about with this parameter is that it
currently maps to the number of pages that need to be prefetched, so using
it to decide the degree of parallelism could be slightly tricky; however, I
will consider it while working on the cost model.

Thanks for your suggestions.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Suppressing elog.c context messages (was Re: [HACKERS] Wait free LW_SHARED acquisition)

2014-12-21 Thread Amit Kapila
On Fri, Dec 19, 2014 at 9:36 PM, Andres Freund and...@2ndquadrant.com
wrote:

 Hi,

 When debugging lwlock issues I found PRINT_LWDEBUG/LOG_LWDEBUG rather
 painful to use because of the amount of elog contexts/statements
 emitted. Given the number of lwlock acquisitions that's just not doable.

 To solve that during development I've basically
 replaced:
 if (Trace_lwlocks)
    elog(LOG, "%s(%s %d): %s", where, name, index, msg);

 with something like

 if (Trace_lwlocks)
 {
 ErrorContextCallback *old_error_context_stack;
 ...
 old_error_context_stack = error_context_stack;
 error_context_stack = NULL;
 ereport(LOG,
(errhidestmt(true),
 errmsg("%s(%s %d): %s", where, T_NAME(lock),
 T_ID(lock), msg)));

 I think it'd generally be useful to have something like errhidecontext()
 akin to errhidestatement() to avoid things like the above.
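
For what it's worth, such an errhidecontext() could be modeled directly on
the existing errhidestmt(); a sketch (the hide_ctx field does not exist in
ErrorData yet, so this assumes it gets added):

int
errhidecontext(bool hide_ctx)
{
	ErrorData  *edata = &errordata[errordata_stack_depth];

	/* we don't bother incrementing recursion_depth */
	CHECK_STACK_DEPTH();

	edata->hide_ctx = hide_ctx;

	return 0;					/* return value does not matter */
}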


Under this proposal, do you want to suppress the context/statement
unconditionally or via some hook/variable?  It might be useful to
print the contexts/statements in certain cases, where there is a complex
application and we don't know which part of the application code causes
the problem.

 The use cases where I see this as being useful are high-volume debug
 logging, not normal messages...


I think the use case is valid; it is really difficult to dig through such a
log, especially when the statement is big.  Also, even with the above, the
number of log entries generated for any statement is high, which could still
make debugging difficult.  Do you think it would be helpful if PRINT_LWDEBUG
and LOG_LWDEBUG were controlled by separate defines (LOCK_DEBUG and
LOCK_BLOCK_DEBUG), since in certain cases we might only want to print info
about locks which are acquired after waiting, or in other words those that
get blocked?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] install libpq.dll in bin directory on Windows / Cygwin

2014-12-21 Thread Michael Paquier
On Wed, Feb 26, 2014 at 6:11 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 2/1/14, 3:22 PM, Andrew Dunstan wrote:
 In the end I went with the way I had suggested, because that's what the
 MSVC system does - it doesn't copy any other DLLs to the bin directory.
 So doing that seemed sane for backpatching, to bring the two build
 systems into sync.

 If you want to propose a better arrangement for the future, to include,
 say, ecpg DLLs, and including changes to the MSVC system, we can discuss
 that separately.

 See attached patch.

 There is also the commit fest item
 https://commitfest.postgresql.org/action/patch_view?id=1330 that
 requests the MSVC builds to install the ecpg libraries in the bin directory.

Looking finally at this patch. In short, it moves libpgtypes.dll,
libecpg.dll and libecpg_compat.dll to bin/ for cygwin and MinGW builds,
using some additional processing in Makefile.shlib, and removes at the
same time the win32 stuff in libpq/Makefile.

IMO, it would be more readable to replace this part with a separate if
block. So changing that:
- '$(DESTDIR)$(libdir)/$(shlib)' \
+ '$(DESTDIR)$(libdir)/$(shlib)' $(if $(findstring
$(PORTNAME),win32 cygwin),'$(DESTDIR)$(bindir)/$(shlib)') \
For that:
ifneq(blah)
blah2
endif

The MSVC portion of this fix got completely lost in the void:
https://commitfest.postgresql.org/action/patch_view?id=1330

Peter, would it be possible to merge this patch with its MSVC portion
for simplicity? I think that it would be more readable to do all the
changes at the same time, once and for all. Also, that's still a bug,
so are we still considering a backpatch? I wouldn't mind putting some
time into a patch to get that fixed.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] documentation update for doc/src/sgml/func.sgml

2014-12-21 Thread Michael Paquier
On Sat, Oct 18, 2014 at 4:12 AM, Andreas 'ads' Scherbaum
adsm...@wars-nicht.de wrote:
 On 09/14/2014 06:32 PM, Peter Eisentraut wrote:

 On 9/12/14 3:13 PM, Andreas 'ads' Scherbaum wrote:

 Of course a general rule how to link to WP would be nice ...


 I think Wikipedia links should be avoided altogether.  We can assume
 that readers are technically proficient to look up general technical
 concepts on their own using a reference system of their choice.

 In cases where a link is warranted, it is better to construct a proper
 bibliographic citation to the primary source material, such as an IEEE
 standard or an academic paper, in a way that will stand the test of time.


 That's a clear statement, and makes sense. Should be written down somewhere,
 so it can be found again.


 Independent of that, it is actually not correct that we use the IEEE's
 rules, because we don't use any rules, that is up to the operating
 system/platform.  While most platforms indeed do use the IEEE
 floating-point standard more or less, some don't.  Section 8.1.3 tries to
 point that out.


 New version attached, WP link removed, wording changed.
Documentation format is still incorrect. The function names should be
wrapped in <function> markup, same for the value 0.5 with <literal> and
the data types NUMERIC and REAL. A corrected patch is attached. The rest
looks fine to me, I am switching it to Ready for Committer.
-- 
Michael
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6f30946..f6b865e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -939,6 +939,26 @@
 </tgroup>
   </table>
 
+   <para>
+    For functions like <function>round()</>, <function>log()</> and
+    <function>sqrt()</> which run against either fixed-precision
+    (<literal>NUMERIC</>) or floating-point numbers (e.g. <literal>REAL</>),
+    note that the results of these operations will differ according
+    to the input type due to rounding. This is most observable with
+    <function>round()</>, which can end up rounding down as well as up for
+    any <literal>0.5</> value. <productname>PostgreSQL</productname>'s
+    handling of floating-point values depends on the operating system, which
+    may or may not follow the IEEE floating-point standard.
+  </para>
+
+  <para>
+    The bitwise operators work only on integral data types, whereas
+    the others are available for all numeric data types. The bitwise
+    operators are also available for the bit string types
+    <type>bit</type> and <type>bit varying</type>, as
+    shown in <xref linkend="functions-bit-string-op-table">.
+   </para>
+
    <para>
     <xref linkend="functions-math-random-table"> shows functions for
     generating random numbers.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench -f and vacuum

2014-12-21 Thread Tatsuo Ishii
 Hi,
 
 On 21.12.2014 15:58, Tatsuo Ishii wrote:
 On Sun, Dec 14, 2014 at 11:43 AM, Tatsuo Ishii is...@postgresql.org wrote:
 If we care enough about that case to attempt the vacuum anyway
 then we need to do something about the error message; either
 squelch it or check for the existence of the tables before
 attempting to vacuum. Since there's no way to squelch in the
 server logfile, I think checking for the table is the right
 answer.

 Fair enough. I will come up with checking for table before
 vacuum approach.

 +1 for this approach.
 
 Here is the patch I promised.
 
 First of all - I'm not entirely convinced the IF EXISTS approach is
 somehow better than -f implies -n suggested before, but I don't have a
 strong preference either.
 
 Regarding the patch:
 
 (1) I agree with Fabrizio that the 'executeStatement2' is not the best
 naming as it does not show the 'if exists' intent.
 
 
 (2) The 'executeStatement2' API is a bit awkward, as the signature
 
 executeStatement2(PGconn *con, const char *sql, const char *table);
 
 suggests that the 'sql' command is executed when 'table' exists.
 But that's not the case, because what actually gets executed is
 'sql table'.

Any replacement idea for sql and table?

 (3) The 'is_table_exists' should probably be just 'table_exists'.
 
 
 (4) The SQL used in is_table_exists to check table existence seems
 slightly wrong, because 'to_regclass' works for other relation
 kinds, not just regular tables - it will match views for example.
 While a conflict like that (having an object with the right name
 but not a regular table) is rather unlikely I guess, a more correct
 query would be this:
 
   SELECT oid FROM pg_class WHERE relname = '%s' AND relkind = 'r';

This doesn't always work if schema search path is set.
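
One possible way to reconcile the two points, sketched here only as an idea
rather than something proposed in the thread: let to_regclass() resolve the
name (so search_path is honored) and then check relkind on the resulting OID:

	/* sketch only; same buffer handling as the surrounding examples */
	snprintf(buf, sizeof(buf)-1,
			 "SELECT EXISTS (SELECT 1 FROM pg_class"
			 " WHERE oid = to_regclass('%s') AND relkind = 'r')",
			 table);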

 (5) I'm not a libpq expert, but I don't see how the PQgetvalue() could
 return anything except true/false, so the
 
 if (result == NULL)
 {
 PQclear(res);
 return false;
 }
 
 seems a bit pointless to me. But maybe it's necessary?

PQgetvalue could return NULL if a parameter is wrong. I don't want
to risk a segfault in any case.

 (6) The is_table_exists might be further simplified along these lines:
 
 static bool
 is_table_exists(PGconn *con, const char *table)
 {
     PGresult   *res;
     char        buf[1024];
     char       *result;
     bool        retval;
 
     snprintf(buf, sizeof(buf)-1,
              "SELECT to_regclass('%s') IS NOT NULL", table);
 
     res = PQexec(con, buf);
     if (PQresultStatus(res) != PGRES_TUPLES_OK)
     {
         PQclear(res);
         return false;
     }
 
     result = PQgetvalue(res, 0, 0);
 
     retval = (*result == 't');
 
     PQclear(res);
 
     return retval;
 }
 
 
 (7) I also find the coding in main() around line 3250 a bit clumsy. The
 first reason is that it only checks the existence of 'pgbench_branches'
 and then vacuums the tables pgbench_branches, pgbench_tellers and pgbench_history in the
 same block. If pgbench_branches does not exist, there will be no
 message printed (but the tables will be vacuumed).

So we should check the existence of all pgbench_branches,
pgbench_tellers, pgbench_history and pgbench_accounts? Not sure if
it's worth the trouble.

 The second reason is that the msg1, msg2 variables seem unnecessary.
 IMHO this is a bit better:

This will behave differently from how pgbench works now. If -f is not
specified and pgbench_* does not exist, then pgbench silently skips the
vacuum. Today pgbench raises an error in this case.

 if (!is_no_vacuum)
 {
     if (is_table_exists(con, "pgbench_branches"))
     {
         fprintf(stderr, "starting vacuum...");
 
         executeStatement2(con, "vacuum", "pgbench_branches");
         executeStatement2(con, "vacuum", "pgbench_tellers");
         executeStatement2(con, "truncate", "pgbench_history");
 
         fprintf(stderr, "end.\n");
     }
 
     if (do_vacuum_accounts)
     {
         if (is_table_exists(con, "pgbench_accounts"))
         {
             fprintf(stderr, "starting vacuum pgbench_accounts...");
 
             executeStatement(con,
                              "vacuum analyze pgbench_accounts");
 
             fprintf(stderr, "end.\n");
         }
     }
 }
 
 (8) Also, I think it's not necessary to define function prototypes for
 executeStatement2 and is_table_exists. It certainly is not
 consistent with the other functions defined in pgbench.c (e.g.
 there's no prototype for executeStatement). Just delete the two
 prototypes and move is_table_exists before executeStatement2.

I think omitting prototypes for static functions is not good
practice. See other source code in PostgreSQL.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php

[HACKERS] ExplainModifyTarget doesn't work as expected

2014-12-21 Thread Etsuro Fujita
Hi,

I think ExplainModifyTarget should show the parent of the inheritance
tree in multi-target-table cases, as described there, but I noticed that
it doesn't always work like that.  Here is an example.

postgres=# create table parent (a int check (a > 0) no inherit);
CREATE TABLE
postgres=# create table child (a int check (a <= 0));
CREATE TABLE
postgres=# alter table child inherit parent;
ALTER TABLE
postgres=# explain update parent set a = a * 2 where a <= 0;
                          QUERY PLAN
--------------------------------------------------------------
 Update on child  (cost=0.00..42.00 rows=800 width=10)
   ->  Seq Scan on child  (cost=0.00..42.00 rows=800 width=10)
         Filter: (a <= 0)
(3 rows)

IIUC, I think this is because ExplainModifyTarget doesn't take into
account that the parent *can* be excluded by constraint exclusion.  So,
I added a field to ModifyTable to record the parent, apart from
resultRelations.  (More precisely, the parent in its role as a simple
member of the inheritance tree is recorded so that appending digits to
refname in select_rtable_names_for_explain works as before.)  Attached
is a proposed patch for that.

Thanks,

Best regards,
Etsuro Fujita
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 728,736  ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
  		((Scan *) plan)->scanrelid);
  			break;
  		case T_ModifyTable:
- 			/* cf ExplainModifyTarget */
  			*rels_used = bms_add_member(*rels_used,
! 	  linitial_int(((ModifyTable *) plan)->resultRelations));
  			break;
  		default:
  			break;
--- 728,735 
  		((Scan *) plan)->scanrelid);
  			break;
  		case T_ModifyTable:
  			*rels_used = bms_add_member(*rels_used,
! 		((ModifyTable *) plan)->resultRelation);
  			break;
  		default:
  			break;
***
*** 2109,2122  ExplainScanTarget(Scan *plan, ExplainState *es)
  static void
  ExplainModifyTarget(ModifyTable *plan, ExplainState *es)
  {
! 	Index		rti;
! 
! 	/*
! 	 * We show the name of the first target relation.  In multi-target-table
! 	 * cases this should always be the parent of the inheritance tree.
! 	 */
  	Assert(plan->resultRelations != NIL);
  	rti = linitial_int(plan->resultRelations);
  
  	ExplainTargetRel((Plan *) plan, rti, es);
  }
--- 2108,2114 
  static void
  ExplainModifyTarget(ModifyTable *plan, ExplainState *es)
  {
! 	Index		rti = plan->resultRelation;
  
  	ExplainTargetRel((Plan *) plan, rti, es);
  }
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***
*** 175,180  _copyModifyTable(const ModifyTable *from)
--- 175,181 
  	 */
  	COPY_SCALAR_FIELD(operation);
  	COPY_SCALAR_FIELD(canSetTag);
+ 	COPY_SCALAR_FIELD(resultRelation);
  	COPY_NODE_FIELD(resultRelations);
  	COPY_SCALAR_FIELD(resultRelIndex);
  	COPY_NODE_FIELD(plans);
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***
*** 327,332  _outModifyTable(StringInfo str, const ModifyTable *node)
--- 327,333 
  
  	WRITE_ENUM_FIELD(operation, CmdType);
  	WRITE_BOOL_FIELD(canSetTag);
+ 	WRITE_UINT_FIELD(resultRelation);
  	WRITE_NODE_FIELD(resultRelations);
  	WRITE_INT_FIELD(resultRelIndex);
  	WRITE_NODE_FIELD(plans);
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***
*** 4809,4814  make_result(PlannerInfo *root,
--- 4809,4815 
  ModifyTable *
  make_modifytable(PlannerInfo *root,
   CmdType operation, bool canSetTag,
+  Index resultRelation,
   List *resultRelations, List *subplans,
   List *withCheckOptionLists, List *returningLists,
   List *rowMarks, int epqParam)
***
*** 4857,4862  make_modifytable(PlannerInfo *root,
--- 4858,4864 
  
  	node->operation = operation;
  	node->canSetTag = canSetTag;
+ 	node->resultRelation = resultRelation;
  	node->resultRelations = resultRelations;
  	node->resultRelIndex = -1;	/* will be set correctly in setrefs.c */
  	node->plans = subplans;
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***
*** 607,612  subquery_planner(PlannerGlobal *glob, Query *parse,
--- 607,613 
  			plan = (Plan *) make_modifytable(root,
  			 parse->commandType,
  			 parse->canSetTag,
+ 			 parse->resultRelation,
  	   list_make1_int(parse->resultRelation),
  			 list_make1(plan),
  			 withCheckOptionLists,
***
*** 790,795  inheritance_planner(PlannerInfo *root)
--- 791,797 
  {
  	Query	   *parse = root-parse;
  	int			parentRTindex = parse->resultRelation;
+ 	int			pseudoParentRTindex = -1;
  	List	   *final_rtable = NIL;
  	int			save_rel_array_size = 0;
  	RelOptInfo **save_rel_array = NULL;
***
*** 925,930  inheritance_planner(PlannerInfo *root)
--- 927,940 
  		appinfo->child_relid = subroot.parse->resultRelation;
  
  		/*
+ 		 * If this child 

[HACKERS] Commit Fest 2014-12, Status after week 1

2014-12-21 Thread Michael Paquier
Hi all,

I have been through the patches of the current CF, looking at their
related threads and updating each patch status if needed. After one
week in this CF, we have done progress on many patches, more than 2/3
of them getting some comments, reviews and/or refreshed versions.

Looking at the CF app for more details, we have the following numbers
showing up:
- Needs Review: 43
- Waiting on Author: 23
- Ready for Committer: 4
- Committed: 7
- Returned with Feedback: 1
The number of patches waiting on author is high meaning that many
reviewers showed up, but there is still much work to do in terms of
reviews.

Note that there are still 33 patches that do not have a reviewer
assigned in the CF app. In some cases nobody has looked at them yet, but
for most of them people have commented at one point or another.

Hence, to continue the progress on this CF, be sure to do the following things:
- As a patch submitter, check your patch status. Resubmit or
counter-argue if necessary as you feel. If your patch reviewer is not
showing up, be sure to ping him/her. Don't forget to review as well
one patch per patch submitted, of equal difficulty if possible.
Looking at new areas of the code is highly recommended.
- As a patch reviewer, if the patch you are reviewing has been waiting
on its author for a long time, be sure to ping the patch submitter.
- Enjoy Christmas and your vacations with your relatives and friends.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers