Re: [HACKERS] contrib modules and relkind check

2017-03-07 Thread Michael Paquier
On Tue, Mar 7, 2017 at 4:15 PM, Amit Langote
 wrote:
> Sorry about the absence on this thread.

No problems! Thanks for showing up with an updated patch.

> On 2017/02/14 15:30, Michael Paquier wrote:
>> On Tue, Feb 14, 2017 at 3:18 PM, Amit Langote wrote:
>>>
>>> Added more tests in pgstattuple and the new ones for pg_visibility,
>>> although I may have overdone the latter.
>>
>> A bonus idea is also to add tests for relkinds that work: for
>> example, create a table, insert some data into it, vacuum it, and
>> look at "SELECT count(*) > 0 FROM pg_visibility('foo'::regclass)".
>
> I assume you meant only for pg_visibility.  Done in the attached (a pretty
> basic test though).

Yep.

> If we decide to go with some different approach, we'd not be doing it
> here.  Maybe in the "partitioned tables and relfilenode" thread or a new one.

Okay.

+++ b/contrib/pg_visibility/expected/pg_visibility.out
@@ -0,0 +1,85 @@
+CREATE EXTENSION pg_visibility;
+--
+-- check that using the module's functions with unsupported relations will fail
+--
[...]
+select count(*) > 0 from pg_visibility('regular_table');
+ ?column?
+----------
+ t
+(1 row)
Only regular tables are tested as valid objects; testing toast tables
is not worth the complication. Could you also add a matview?
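For reference, a materialized-view case along the same lines might look like this (a minimal sketch; the object name is illustrative and not taken from the patch):

CREATE MATERIALIZED VIEW matview_visibility_test AS SELECT 1 AS a;
VACUUM matview_visibility_test;
-- matviews have a visibility map, so pg_visibility() should accept them:
SELECT count(*) > 0 FROM pg_visibility('matview_visibility_test'::regclass);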

Except for this small issue the patch looks good to me.
-- 
Michael




Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-03-07 Thread Michael Paquier
On Wed, Mar 8, 2017 at 11:29 AM, Peter Eisentraut
 wrote:
> On 3/6/17 21:11, Michael Paquier wrote:
>> I think that the documentation of initdb should mention that
>> pg_hba.conf entries are configured for replication connections as
>> well, something like a sentence in the Description paragraph:
>> initdb sets pg_hba.conf entries using the specified authentication
>> method (trust by default) for non-replication as well as replication
>> connections.
>
> OK, I was looking for a way to document this.  Your idea seems the most
> sensible.
>
> See attached patch that puts it all together.

Thanks for the new version.

+This option specifies the default authentication method for local
+users used in pg_hba.conf (host
+and local lines).  initdb will
+prepopulate pg_hba.conf entries using the
+specified authentication method for non-replication as well as
+replication connections.
A paragraph here works fine for me.
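For context, the prepopulated entries then cover both kinds of connections, along these lines (an illustrative excerpt with the default trust method; exact address lines depend on the platform):

local   all             all                                     trust
host    all             all             127.0.0.1/32            trust
host    all             all             ::1/128                 trust
local   replication     all                                     trust
host    replication     all             127.0.0.1/32            trust
host    replication     all             ::1/128                 trust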

This patch looks good to me.
-- 
Michael




Re: [HACKERS] parallel index(-only) scan breaks when run without parallelism

2017-03-07 Thread Amit Kapila
On Wed, Mar 8, 2017 at 12:27 AM, Robert Haas  wrote:
> Amit, Rafia,
>
> nodeIndexscan.c, unlike nodeSeqscan.c, thinks that a parallel-aware
> scan will always be executed in parallel mode.  But that's not true:
> an Execute message with a non-zero row count could cause us to abandon
> planned parallelism and execute the plan serially.
>

Right, and the current code assumed that if there is a parallel plan
then it will always enter parallel mode.  I think the fix is quite
similar to what we do in nodeSeqscan.c, i.e., initialize the scan
descriptor before starting the scan if it is not already initialized.
An additional check is required to ensure that the index runtime keys
are ready before calling index_rescan.  The attached patch fixes the
problem.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_scandesc_initialization_parallel_index_v1.patch
Description: Binary data



Re: [HACKERS] Parallel Append implementation

2017-03-07 Thread Amit Khandekar
On 19 February 2017 at 14:59, Robert Haas  wrote:
> On Fri, Feb 17, 2017 at 2:56 PM, Amit Khandekar  
> wrote:
>> The log2(num_children)+1 formula which you proposed does not take into
>> account the number of workers for each of the subplans; that's why I
>> am a bit more inclined to look for some other logic. Maybe treat the
>> children as if they belong to partitions, and accordingly calculate
>> the final number of workers. So for 2 children with 4 and 5 workers
>> respectively, Append's parallel_workers would be log3(3^4 + 3^5).
>
> In general this will give an answer not different by more than 1 or 2
> from my answer, and often exactly the same.  In the case you mention,
> whether we get the same answer depends on which way you round:
> log3(3^4+3^5) is 5 if you round down, 6 if you round up.
>
> My formula is more aggressive when there are many subplans that are
> not parallel or take only 1 worker, because I'll always use at least 5
> workers for an append that has 9-16 children, whereas you might use
> only 2 if you do log3(3^0+3^0+3^0+3^0+3^0+3^0+3^0+3^0+3^0).  In that
> case I like my formula better. With lots of separate children, the
> chances of being able to use as many as 5 workers seem good.  (Note
> that using 9 workers as Ashutosh seems to be proposing would be a
> waste if the different children have very unequal execution times,
> because the workers that run children with short execution times can
> be reused to run additional subplans while the long ones are still
> running.  Running a separate worker for each child only works out if
> the shortest runtime is more than 50% of the longest runtime, which
> may sometimes be true but doesn't seem like a good bet in general.)
>
> Your formula is more aggressive when you have 3 children that all use
> the same number of workers; it'll always decide on <number of workers
> per child>+1, whereas mine won't add the extra worker in that case.
> Possibly your formula is better than mine in that case, but I'm not
> sure.  If you have as many as 9 children that all want N workers, your
> formula will decide on N+2 workers, but since my formula guarantees a
> minimum of 5 workers in such cases, I'll probably be within 1 of
> whatever answer you were getting.
>

Yeah, that seems right in most of the cases. The only cases where your
formula seems to give too few workers are ones like (2, 8, 8): for such
subplans, we should allocate at least 8 workers. It turns out that in
most cases my formula allocates just one worker more than the maximum
per-subplan worker count; in (2, 1, 1, 8), it comes out a fraction more
than 8. So in the patch, in addition to the log2() formula you
proposed, I have made sure that it allocates at least max(per-subplan
parallel_workers values) workers.
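In other words, as I read the patch, the worker count chosen for an Append over N subplans with per-subplan counts w_i is (my notation, not the patch's):

  W = \max\left( \max_i w_i,\; \lfloor \log_2 N \rfloor + 1 \right)

so for the example counts (2, 1, 1, 8) this gives max(8, floor(log2 4) + 1) = max(8, 3) = 8.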

>
>> BTW, there is going to be some logic change in the choose-next-subplan
>> algorithm if we consider giving extra workers to subplans.
>
> I'm not sure that it's going to be useful to make this logic very
> complicated.  I think the most important thing is to give 1 worker to
> each plan before we give a second worker to any plan.  In general I
> think it's sufficient to assign a worker that becomes available to the
> subplan with the fewest number of workers (or one of them, if there's
> a tie) without worrying too much about the target number of workers
> for that subplan.

In the attached v5 patch, the logic for distributing the workers is now
kept simple: it just distributes the workers equally, without
considering the per-subplan parallel_workers value. I have retained the
earlier logic of choosing the plan with the minimum number of current
workers. But now that pa_max_workers is not needed, I removed it, and
instead added a partial_plans bitmapset to the Append node. Once a
worker picks up a non-partial subplan, it immediately changes its
pa_num_workers to -1; for partial subplans, the worker sets it to -1
only after it finishes executing it.

Effectively, the check in parallel_append_next() for whether a subplan
is already executing with its maximum parallel_workers is removed,
along with all the code that was using pa_max_workers.


Ashutosh Bapat  wrote:
> 10. We should probably move the parallel_safe calculation out of 
> cost_append().
> +path->parallel_safe = path->parallel_safe &&
> +  subpath->parallel_safe;
>
> 11. This check shouldn't be part of cost_append().
> +/* All child paths must have same parameterization */
> +Assert(bms_equal(PATH_REQ_OUTER(subpath), required_outer));
>

I moved these two statements out of cost_append() and did this
separately in create_append_path().


Also, I have removed some elog() statements that were executed while
holding the spinlock in parallel_append_next().


On 17 January 2017 at 11:10, Amit Langote  wrote:
> I was looking at the 

[HACKERS] [bug fix] dblink leaks unnamed connections

2017-03-07 Thread Tsunakawa, Takayuki
Hello,

dblink fails to close the unnamed connection as follows when a new unnamed 
connection is requested.  The attached patch fixes this.



postgres=# select count(*) from pg_stat_activity;
 count
-------
     1
(1 row)

postgres=# select dblink_connect('dbname=postgres');
 dblink_connect
----------------
 OK
(1 row)

postgres=# select count(*) from pg_stat_activity;
 count
-------
     2
(1 row)

postgres=# select dblink_connect('dbname=postgres');
 dblink_connect
----------------
 OK
(1 row)

postgres=# select count(*) from pg_stat_activity;
 count
-------
     3
(1 row)
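For comparison, with the attached fix the second dblink_connect() should first close the stale unnamed connection before opening the new one, so the session count stays put (expected output, illustrative):

postgres=# select dblink_connect('dbname=postgres');
 dblink_connect
----------------
 OK
(1 row)

postgres=# select count(*) from pg_stat_activity;
 count
-------
     2
(1 row)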


Regards
Takayuki Tsunakawa



dblink_leak_unnamed_conn.patch
Description: dblink_leak_unnamed_conn.patch



Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')

2017-03-07 Thread Michael Paquier
On Wed, Mar 8, 2017 at 2:32 PM, Rushabh Lathia  wrote:
> One quick comment:
>
> +| PASSWORD '(' Sconst USING Sconst ')'
> +	{
> +		$$ = makeDefElem("methodPassword",
> +						 (Node *) list_make2(makeString($3),
> +											 makeString($5)),
> +						 @1);
> +	}
>
> methodPassword looks a bit weird; can we change it to passwordMethod
> or pwdEncryptMethod?

Using "Password" in suffix looks still better to me though for
consistency with the other ones.
-- 
Michael




Re: [HACKERS] Parallel bitmap heap scan

2017-03-07 Thread Dilip Kumar
On Tue, Mar 7, 2017 at 10:10 PM, Dilip Kumar  wrote:
>> It's not about speed.  It's about not forgetting to prefetch.  Suppose
>> that worker 1 becomes the prefetch worker but then doesn't return to
>> the Bitmap Heap Scan node for a long time because it's busy in some
>> other part of the plan tree.  Now you just stop prefetching; that's
>> bad.  You want prefetching to continue regardless of which workers are
>> busy doing what; as long as SOME worker is executing the parallel
>> bitmap heap scan, prefetching should continue as needed.
>
> Right, I missed this part. I will fix this.
I have fixed this part. While doing so, I realised that if multiple
processes are prefetching, then in boundary cases (e.g. suppose
prefetch_target is 3 and prefetch_pages is at 2) there may be some
extra prefetching, but those prefetched blocks will eventually be used.
Another solution to this problem is to increase prefetch_pages in
advance and then call tbm_shared_iterate; this avoids the extra
prefetch.  But I am not sure which is best here.


On Tue, Mar 7, 2017 at 9:41 PM, Robert Haas  wrote:
> +	if (--ptbase->refcount == 0)
> +		dsa_free(dsa, istate->pagetable);
> +
> +	if (istate->spages)
> +	{
> +		ptpages = dsa_get_address(dsa, istate->spages);
> +		if (--ptpages->refcount == 0)
> +			dsa_free(dsa, istate->spages);
> +	}
> +	if (istate->schunks)
> +	{
> +		ptchunks = dsa_get_address(dsa, istate->schunks);
> +		if (--ptchunks->refcount == 0)
> +			dsa_free(dsa, istate->schunks);
> +	}
>
> This doesn't involve any locking, which I think will happen to work
> with the current usage pattern but doesn't seem very robust in
> general.  I think you either need the refcounts to be protected by a
> spinlock, or maybe better, use pg_atomic_uint32 for them.  You want
> something like if (pg_atomic_sub_fetch_u32(&refcount, 1) == 0) {
> dsa_free(...) }
>
> Otherwise, there's no guarantee it will get freed exactly once.

Fixed


On Tue, Mar 7, 2017 at 9:34 PM, Robert Haas  wrote:
> You've still got this:
>
> +   if (DsaPointerIsValid(node->pstate->tbmiterator))
> +   tbm_free_shared_area(dsa, node->pstate->tbmiterator);
> +
> +   if (DsaPointerIsValid(node->pstate->prefetch_iterator))
> +   dsa_free(dsa, node->pstate->prefetch_iterator);
>
> I'm trying to get to a point where both calls use
> tbm_free_shared_area() - i.e. no peeking behind the abstraction layer.

Fixed

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


0001-tidbitmap-support-shared-v8.patch
Description: Binary data


0003-parallel-bitmap-heapscan-v8.patch
Description: Binary data



Re: [HACKERS] Reporting planning time with EXPLAIN

2017-03-07 Thread Ashutosh Bapat
On Tue, Mar 7, 2017 at 9:23 PM, Stephen Frost  wrote:
> Ashutosh,
>
> * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
>> Here are patches for the following:
>> 1. pg_explain_plan_time_v3 adds SUMMARY option which behaves as:
>> SUMMARY when ON prints planning time. With ANALYZE ON, it also prints
>> execution time. When user explicitly uses SUMMARY OFF, it does not
>> print planning and execution time (even when ANALYZE is ON). By
>> default SUMMARY is ON when ANALYZE is ON, otherwise SUMMARY defaults
>> to OFF. Attached explain_summary_examples.out.txt shows examples.
>>
>> 2. explain_exec_timing adds support to print planning time in EXPLAIN
>> EXECUTE output with SUMMARY option. In this case, planning time
>> includes time required to fetch the plan from cache and plan the query
>> if necessary (i.e. after invalidation or the first time it's
>> executed.) E.g.
>
> I'm going through these with an eye towards committing them soon.  I've
> already adjusted some of the documentation and comments per our earlier
> discussion

Thanks a lot.

> but I'm now reviewing the changes to ExplainExecuteQuery()
> and trying to understand the reasoning for not including the
> EvaluateParams() call in the planning time.  Not including that feels to
> me like we're ending up leaving something out of the overall timing
> picture, which doesn't seem right.
>
> If we do include that, then planning time+execution time will equal the
> overall query time and that feels like the right approach to use here.
> Otherwise the overall query time is "planning time+execution
> time+something else that we don't tell you about" which doesn't seem
> good to me.

Thanks for pointing that out. I didn't include parameter evaluation
time earlier, since it's not strictly planning time. But I think it's
important to include it, since a different set of parameters may cause
the planner to create a customized plan, so it looks like something we
should include in the planning time. I have updated the patch to do so.
I have also rebased the patches on top of current HEAD, resolving a
conflict. The new patches have slightly different names than the
previous ones, since I am now using git format-patch to create them.
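For reference, usage of the new option looks like this (a sketch; the table and statement names are placeholders, and timings in real output will of course vary):

EXPLAIN (SUMMARY ON) SELECT * FROM t;               -- plan plus "Planning time"
EXPLAIN ANALYZE SELECT * FROM t;                    -- SUMMARY defaults to ON: planning and execution time
EXPLAIN (ANALYZE ON, SUMMARY OFF) SELECT * FROM t;  -- suppresses both times
EXPLAIN (SUMMARY ON) EXECUTE stmt(1);               -- with the second patch: includes plan-cache fetch/replan time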

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


0001-Support-SUMMARY-option-in-EXPLAIN-output.patch
Description: Binary data


0002-Print-planning-time-in-EXPLAIN-EXECUTE.patch
Description: Binary data



Re: [HACKERS] dropping partitioned tables without CASCADE

2017-03-07 Thread Amit Langote
Hi Ashutosh,

On 2017/03/06 18:19, Ashutosh Bapat wrote:
> On Mon, Mar 6, 2017 at 11:12 AM, Ashutosh Bapat wrote:
>> On Mon, Mar 6, 2017 at 11:05 AM, Simon Riggs  wrote:
>>> "has partitions" is not part of the DDL, whereas "FOR VALUES FROM (0)
>>> TO (100)" is. So ISTM sensible to differentiate between DDL and
>>> non-ddl using upper and lower case.
>>>
>>
>> Make sense. Will try to cook up a patch soon.
> 
> here's the patch. I have added a testcase in insert.sql to test \d+
> output for a partitioned table which has partitions that are further
> partitioned, as well as some partitions that are not partitioned
> themselves. I have also refactored a statement a few lines above,
> replacing an if condition with the ?: operator, similar to code a few
> lines below.

Thanks for cooking up the patch.  Looks good and works as expected.
Regression tests pass.

About the other statement you changed, I just realized that we should
perhaps do one more thing: show the number of partitions, even if it's
0.  In the case of inheritance, the parent table stands on its own when
there are no child tables, but a partitioned table doesn't in the same
sense.  I tried to implement that in attached patch 0002.  Example
below:

create table p (a int) partition by list (a);
\d p

Partition key: LIST (a)
Number of partitions: 0

\d+ p

Partition key: LIST (a)
Number of partitions: 0

create table p1 partition of p for values in (1);
\d p

Partition key: LIST (a)
Number of partitions: 1 (Use \d+ to list them.)

\d+ p

Partition key: LIST (a)
Partitions: p1 FOR VALUES IN (1)

0001 is your original patch.

Thanks,
Amit
>From f972a1231e66b84de0b1922e6a6fd96ea661ae20 Mon Sep 17 00:00:00 2001
From: amit 
Date: Wed, 8 Mar 2017 11:21:55 +0900
Subject: [PATCH 1/2] Ashutosh's patch to improve \dt+ of partitioned tables

---
 src/bin/psql/describe.c  | 26 +-
 src/test/regress/expected/insert.out | 15 +++
 src/test/regress/sql/insert.sql  |  4 
 3 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index e2e4cbcc08..f8617cf328 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2684,7 +2684,7 @@ describeOneTableDetails(const char *schemaname,
 		/* print child tables (with additional info if partitions) */
 		if (pset.sversion >= 10)
 			printfPQExpBuffer(&buf,
-	"SELECT c.oid::pg_catalog.regclass, pg_get_expr(c.relpartbound, c.oid)"
+	"SELECT c.oid::pg_catalog.regclass, pg_get_expr(c.relpartbound, c.oid), c.relkind"
 	" FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i"
 	" WHERE c.oid=i.inhrelid AND"
 	" i.inhparent = '%s' AND"
@@ -2709,13 +2709,12 @@ describeOneTableDetails(const char *schemaname,
 
 		if (!verbose)
 		{
+			const char *ct = tableinfo.relkind != 'P' ? _("child tables") : _("partitions");
+
 			/* print the number of child tables, if any */
 			if (tuples > 0)
 			{
-				if (tableinfo.relkind != 'P')
-					printfPQExpBuffer(&buf, _("Number of child tables: %d (Use \\d+ to list them.)"), tuples);
-				else
-					printfPQExpBuffer(&buf, _("Number of partitions: %d (Use \\d+ to list them.)"), tuples);
+				printfPQExpBuffer(&buf, _("Number of %s: %d (Use \\d+ to list them.)"), ct, tuples);
 				printTableAddFooter(&cont, buf.data);
 			}
 		}
@@ -2738,12 +2737,21 @@ describeOneTableDetails(const char *schemaname,
 				}
 				else
 				{
+					char	   *partitioned_note;
+
+					if (*PQgetvalue(result, i, 2) == 'P')
+						partitioned_note = " has partitions";
+					else
+						partitioned_note = "";
+
 					if (i == 0)
-						printfPQExpBuffer(&buf, "%s: %s %s",
-										  ct, PQgetvalue(result, i, 0), PQgetvalue(result, i, 1));
+						printfPQExpBuffer(&buf, "%s: %s %s%s",
+										  ct, PQgetvalue(result, i, 0), PQgetvalue(result, i, 1),
+										  partitioned_note);
 					else
-						printfPQExpBuffer(&buf, "%*s  %s %s",
-										  ctw, "", PQgetvalue(result, i, 0), PQgetvalue(result, i, 1));
+						printfPQExpBuffer(&buf, "%*s  %s %s%s",
+										  ctw, "", PQgetvalue(result, i, 0), PQgetvalue(result, i, 1),
+										  partitioned_note);
 				}
 				if (i < tuples - 1)
 					appendPQExpBufferChar(&buf, ',');
diff --git a/src/test/regress/expected/insert.out b/src/test/regress/expected/insert.out
index 116854e142..c8ff863882 100644
--- a/src/test/regress/expected/insert.out
+++ b/src/test/regress/expected/insert.out
@@ -313,6 +313,21 @@ select tableoid::regclass::text, a, min(b) as min_b, max(b) as max_b from list_p
  part_null || 1 | 1
 (9 rows)
 
+-- test \d+ output on a table which has both partitioned and unpartitioned
+-- partitions
+\d+ list_parted
+Table "public.list_parted"
+ Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description 
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a      | text    |           |          |         | extended |              | 

Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')

2017-03-07 Thread Rushabh Lathia
On Wed, Mar 8, 2017 at 9:59 AM, Tom Lane  wrote:

> Michael Paquier  writes:
> > here is a separate thread dedicated to the following extension for
> > CREATE/ALTER ROLE: PASSWORD ('value' USING 'method').
>
> The parentheses seem weird ... do we really need those?
>

+1

I had a quick glance at the patch and it looks great.

One quick comment:

+| PASSWORD '(' Sconst USING Sconst ')'
+	{
+		$$ = makeDefElem("methodPassword",
+						 (Node *) list_make2(makeString($3),
+											 makeString($5)),
+						 @1);
+	}

methodPassword looks a bit weird; can we change it to passwordMethod
or pwdEncryptMethod?


> regards, tom lane
>
>



-- 
Rushabh Lathia


Re: [HACKERS] some dblink refactoring

2017-03-07 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tsunakawa,
> Takayuki
> How about applying the attached small patch for another refactoring?  This
> merely changes makeStringInfo() to initStringInfo() at two sites, just like
> other places in the same file.  makeStringInfo() on function-local variables
> leaves the memory for the StringInfoData allocated unnecessarily (though it
> may be automatically reclaimed some time later.)

Sorry, I forgot to attach the file.

Regards
Takayuki Tsunakawa




dblink_strinfo_refactor.patch
Description: dblink_strinfo_refactor.patch



Re: [HACKERS] some dblink refactoring

2017-03-07 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Eisentraut
> Here is a patch to refactor some macro hell in dblink.
> 
> This patch was discussed in the background sessions thread as a prerequisite
> for some work there, but I figure I'll make a separate thread for it to
> give everyone interested in dblink a chance to respond separate from the
> other thread.

I changed the status to ready for committer.  The patch applied cleanly, passed 
the regression test (make installcheck in contrib/dblink/), and the code looks 
perfect.

How about applying the attached small patch for another refactoring?  This
merely changes makeStringInfo() to initStringInfo() at two sites, just like
other places in the same file.  makeStringInfo() on function-local variables
leaves the memory for the StringInfoData allocated unnecessarily (though it
may be automatically reclaimed some time later.)

Regards
Takayuki Tsunakawa






Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')

2017-03-07 Thread Michael Paquier
On Wed, Mar 8, 2017 at 1:29 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> here is a separate thread dedicated to the following extension for
>> CREATE/ALTER ROLE: PASSWORD ('value' USING 'method').
>
> The parentheses seem weird ... do we really need those?

A grammar without parentheses works as well, and I am open to suggestions.
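For illustration, the two spellings being discussed would look like this (role name, password, and method are placeholders):

-- with parentheses, as in the patch:
CREATE ROLE alice PASSWORD ('secret' USING 'md5');
-- a parenthesis-free variant would read:
CREATE ROLE alice PASSWORD 'secret' USING 'md5';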
-- 
Michael




Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')

2017-03-07 Thread Tom Lane
Michael Paquier  writes:
> here is a separate thread dedicated to the following extension for
> CREATE/ALTER ROLE: PASSWORD ('value' USING 'method').

The parentheses seem weird ... do we really need those?

regards, tom lane




Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-07 Thread Andres Freund
On 2017-03-07 20:58:35 +0800, Simon Riggs wrote:
> On 7 March 2017 at 20:36, Alvaro Herrera  wrote:
> 
> > FWIW, +1 on improving matters here.
> 
> +1 also.
> 
> I don't see what's wrong with relying on buildfarm though; testing is
> exactly what its there for.
> 
> If we had a two-stage process, committers could issue "trial
> commits" as a way of seeing if the buildfarm likes things. If it
> does, we push to the main repo.

Personally, that's not addressing my main concern, which is that the
latency of getting done with some patch/topic is long. If I have to
wait for the buildfarm to check some preliminary patch, I still have to
work on pushing it to master afterwards.  And very likely my local
check would finish a lot faster than a bunch of buildfarm animals -
I have, after all, a plenty powerful machine: lots of cores, fast SSD,
lots of memory, ...

So I really want a faster end-to-end test, not less CPU time spent on
my own machine.

- Andres




Re: [HACKERS] foreign partition DDL regression tests

2017-03-07 Thread Ashutosh Bapat
On Wed, Mar 8, 2017 at 7:36 AM, Robert Haas  wrote:
> On Tue, Mar 7, 2017 at 7:14 AM, Ashutosh Bapat
>  wrote:
>> Hi Amit,
>> Thanks for adding testcases. Overall the testcases look good.
>>
>> The testcase is using ALTER TABLE to modify foreign table schema.
>> Though this works, I think a better option is to use ALTER FOREIGN
>> TABLE.
>>
>> Something not related to this patch but
>> -- no attach partition validation occurs for foreign tables
>> ALTER TABLE pt2 ATTACH PARTITION pt2_1 FOR VALUES IN (1);
>>
>> I am wondering whether we can improve this. For local tables, if a
>> valid constraint equivalent to the partitioning condition is not
>> present on the table being attached, it scans the data to verify
>> partition conditions. But for a foreign table, we don't want to do
>> that since the constraint is not guaranteed to be valid after the
>> initial check. For a normal foreign table a user can set a constraint
>> if s/he knows that that constraint will be honoured on the foreign
>> server. Thus instead of saying that we do not validate data, we can
>> refuse to attach a partition if corresponding check constraint is
>> absent on the foreign table being attached. A user will then be forced
>> to add that constraint if s/he is sure that the constraint will be
>> obeyed on the foreign server.
>
> I agree that we could do that, but what value would it have?  It just
> forces the user to spend two SQL commands doing what could otherwise
> be done in one.

I don't think it's always going to be two commands. A user who wants
to attach a foreign table as a partition "knows" that the data on the
foreign server honours the partition bounds. If s/he knows that, s/he
has probably already added the constraint on the foreign table so that
the planner can make use of it; remember, this is an existing foreign
table. If s/he is not aware that the data on the foreign server doesn't
honour the partition bounds, attaching it as a partition would be a
problem. I think this step gives the user a chance to make a conscious
decision.
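To make that concrete, the flow under this proposal would look roughly like the following (a sketch; the table names come from the test quoted above, the column name is hypothetical):

-- would now fail with an error (and ideally a hint) if no matching constraint exists:
ALTER TABLE pt2 ATTACH PARTITION pt2_1 FOR VALUES IN (1);

-- the user asserts that the remote data honours the bound, then attaches:
ALTER FOREIGN TABLE pt2_1 ADD CONSTRAINT pt2_1_chk CHECK (a IN (1));
ALTER TABLE pt2 ATTACH PARTITION pt2_1 FOR VALUES IN (1);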

> And the first command might not be that obvious.

A hint alongside the reported error would help. In fact, the hint might
as well say "add a constraint if you are sure that the foreign data
honours the partition bound specification" or something like that. I
noticed that the documentation at
https://www.postgresql.org/docs/devel/static/sql-altertable.html for
ATTACH PARTITION does not say anything about foreign tables. Maybe we
should add whatever the current status is.

>  I
> think we should leave well enough alone.

At least we need to update the documentation.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-07 Thread David Steele

On 3/7/17 11:38 AM, Andres Freund wrote:

<...>

>> We have plenty of time and we dedicate one full-time developer to
>> this project.
>
> How about having that developer, and perhaps others, participate in
> reviewing patches and getting to the bottom of the commitfest?  Should
> we end up being done early, we can look at this patch...  There's not
> been review activity corresponding to the amount of submissions from
> pgpro...


This patch has been moved to CF 2017-07.

--
-David
da...@pgmasters.net




Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-07 Thread Amit Langote
On 2017/03/08 2:55, Tom Lane wrote:
> Is there a good reason why RELKIND_PARTITIONED_TABLE is 'P' not 'p'?
> It looks rather out of place considering that seven of the eight
> pre-existing relkind codes are lower case.  (And no, I don't especially
> approve of RELKIND_SEQUENCE being 'S' either, but it's far too late to
> change that.)  Also, in typical low-res monospaced fonts, there's nearly
> no difference except vertical alignment between P and p, meaning that in
> something like
> 
> regression=# select distinct relkind from pg_class;
>  relkind 
> ---------
>  r
>  t
>  P
>  v
>  m
>  i
>  S
>  c
> (8 rows)
> 
> you have to look rather closely even to notice that what you're seeing
> isn't in the case you might expect.
> 
> I think we should change this while we still can.

I remember that one of the earliest versions of the patch I submitted had
two new relkinds: 'P' for partitioned tables and 'p' for leaf partitions.
The latter was dropped subsequently and I never thought of using 'p'
instead of 'P' for partitioned tables.

Attached is a patch that implements this change.

Thanks,
Amit
>From b356a5e61aedc9fcf4edd99a095b06c5c8e57a86 Mon Sep 17 00:00:00 2001
From: amit 
Date: Wed, 8 Mar 2017 12:07:56 +0900
Subject: [PATCH] Change RELKIND_PARTITIONED_TABLE to 'p'

---
 src/backend/catalog/information_schema.sql | 40 +++---
 src/backend/catalog/system_views.sql   |  4 +--
 src/bin/pg_dump/pg_dump.c  |  2 +-
 src/bin/psql/describe.c| 38 ++--
 src/bin/psql/tab-complete.c|  8 +++---
 src/include/catalog/pg_class.h |  2 +-
 src/test/regress/expected/create_table.out |  2 +-
 src/test/regress/expected/rules.out|  4 +--
 src/test/regress/expected/sanity_check.out |  2 +-
 src/test/regress/sql/sanity_check.sql  |  2 +-
 10 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 51795cd6de..0011e06494 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -453,7 +453,7 @@ CREATE VIEW check_constraints AS
   AND a.attnum > 0
   AND NOT a.attisdropped
   AND a.attnotnull
-  AND r.relkind IN ('r', 'P')
+  AND r.relkind IN ('r', 'p')
   AND pg_has_role(r.relowner, 'USAGE');
 
 GRANT SELECT ON check_constraints TO PUBLIC;
@@ -525,7 +525,7 @@ CREATE VIEW column_domain_usage AS
   AND a.attrelid = c.oid
   AND a.atttypid = t.oid
   AND t.typtype = 'd'
-  AND c.relkind IN ('r', 'v', 'f', 'P')
+  AND c.relkind IN ('r', 'v', 'f', 'p')
   AND a.attnum > 0
   AND NOT a.attisdropped
   AND pg_has_role(t.typowner, 'USAGE');
@@ -564,7 +564,7 @@ CREATE VIEW column_privileges AS
   pr_c.relowner
    FROM (SELECT oid, relname, relnamespace, relowner, (aclexplode(coalesce(relacl, acldefault('r', relowner)))).*
  FROM pg_class
- WHERE relkind IN ('r', 'v', 'f', 'P')
+ WHERE relkind IN ('r', 'v', 'f', 'p')
 ) pr_c (oid, relname, relnamespace, relowner, grantor, grantee, prtype, grantable),
 pg_attribute a
WHERE a.attrelid = pr_c.oid
@@ -586,7 +586,7 @@ CREATE VIEW column_privileges AS
 ) pr_a (attrelid, attname, grantor, grantee, prtype, grantable),
 pg_class c
WHERE pr_a.attrelid = c.oid
- AND relkind IN ('r', 'v', 'f', 'P')
+ AND relkind IN ('r', 'v', 'f', 'p')
  ) x,
  pg_namespace nc,
  pg_authid u_grantor,
@@ -629,7 +629,7 @@ CREATE VIEW column_udt_usage AS
 WHERE a.attrelid = c.oid
   AND a.atttypid = t.oid
   AND nc.oid = c.relnamespace
-  AND a.attnum > 0 AND NOT a.attisdropped AND c.relkind in ('r', 'v', 'f', 'P')
+  AND a.attnum > 0 AND NOT a.attisdropped AND c.relkind in ('r', 'v', 'f', 'p')
   AND pg_has_role(coalesce(bt.typowner, t.typowner), 'USAGE');
 
 GRANT SELECT ON column_udt_usage TO PUBLIC;
@@ -738,7 +738,7 @@ CREATE VIEW columns AS
CAST('NEVER' AS character_data) AS is_generated,
CAST(null AS character_data) AS generation_expression,
 
-   CAST(CASE WHEN c.relkind IN ('r', 'P') OR
+   CAST(CASE WHEN c.relkind IN ('r', 'p') OR
   (c.relkind IN ('v', 'f') AND
pg_column_is_updatable(c.oid, a.attnum, false))
 THEN 'YES' ELSE 'NO' END AS yes_or_no) AS is_updatable
@@ -753,7 +753,7 @@ CREATE VIEW columns AS
 
 WHERE (NOT pg_is_other_temp_schema(nc.oid))
 
-  AND a.attnum > 0 AND NOT a.attisdropped AND c.relkind in ('r', 'v', 'f', 'P')
+  AND a.attnum > 0 AND NOT a.attisdropped AND c.relkind in ('r', 'v', 'f', 'p')
 
   AND 

Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-07 Thread Magnus Hagander
On Tue, Mar 7, 2017 at 7:24 AM, Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

>
>
> On 03/07/2017 07:58 AM, Simon Riggs wrote:
> > On 7 March 2017 at 20:36, Alvaro Herrera 
> wrote:
> >
> >> FWIW, +1 on improving matters here.
> > +1 also.
> >
> > I don't see what's wrong with relying on buildfarm though; testing is
> > exactly what its there for.
> >
> > If we had a two-stage process, committers could issue "trial
> > commits" as a way of seeing if the buildfarm likes things. If it
> > does, we push to the main repo.
> >
>
>
> I'm happy to work on this.  Not quite sure how it would work, but I'm
> open to any suggestions.
>

Assuming the intention is a service for *committers only*, I suggest
setting up a separate (closed) git repository that committers can push to
and a separate set of BF animals could work from. We could just have it do
all branches in it, no need to filter that way as long as we keep it a
separate repo.

There have also on and off been discussions about building arbitrary
patches as they are sent to the mailinglists. Doing that without any
committer (or other trusted party) as a filter is a completely different
challenge of course, given that it basically amounts to downloading and
running random code off the internet.

But doing just the first would make it a lot easier, and probably still be
of good value.

An in-between could be to hook something off the CF app, but one important
question is how important covering many platforms is. Since we already have
good functionality for doing that in the buildfarm, it makes sense to
utilize that if we can.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] ALTER PUBLICATION and segmentation fault

2017-03-07 Thread Peter Eisentraut
On 3/7/17 11:56, Masahiko Sawada wrote:
> This issue happens even with ALTER SUBSCRIPTION. I guess the main cause
> is that the acl_kind entries for pg_publication and pg_subscription in
> the ObjectProperty array are not correct. These values should be set to
> ACL_KIND_PUBLICATION and ACL_KIND_SUBSCRIPTION instead of -1. The
> attached small patch fixes this issue and adds a regression test.

committed, thanks

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

2017-03-07 Thread Haribabu Kommi
On Wed, Feb 1, 2017 at 3:13 PM, Michael Paquier 
wrote:

> On Fri, Jan 27, 2017 at 10:26 AM, Haribabu Kommi
>  wrote:
> > Thanks for the review.
> > Let's wait for the committer's opinion.
>
> I have moved this patch to CF 2017-03 to wait for this to happen.
>

Attached is a rebased patch to resolve the OID conflict.

Regards,
Hari Babu
Fujitsu Australia


pg_stat_sql_row_mode_5.patch
Description: Binary data



Re: [HACKERS] SCRAM authentication, take three

2017-03-07 Thread Magnus Hagander
On Tue, Mar 7, 2017 at 4:36 AM, Heikki Linnakangas  wrote:

> On 03/02/2017 08:50 AM, Michael Paquier wrote:
>
>> Attached is a new patch set. I have combined SASLprep with the rest
>> and fixed some conflicts. At the same time when going through NFKC
>> this morning I have noticed that the implementation was doing the
>> canonical decomposition and reordered the characters using the
>> combining classes, but the string recomposition was still missing.
>> This is addressed in this patch set, and well as on my dev tree:
>> https://github.com/michaelpq/postgres/tree/scram
>>
>
> I've now committed the bulk of these patches. Many thanks to everyone
> involved, and especially you, Michael, for your persistence!
>

+1!


Currently, the AuthenticationSASL protocol message specifies the mechanism
> that the client must use, but as future-proofing, it'd probably be best to
> redefine that to be a list of mechanisms, and let the client choose among
> those. That's not a show-stopper, but would be good to get that right in
> version 10, so that clients can prepare for that, even if we only support
> one mechanism ATM.
>

+1, and let's make sure we get it into 10. We don't want to be stuck with
something without flexibility -- then we're going to have to do the whole
"is it time yet" dance again. It would be especially bad since the
underlying protocol is pluggable.

This seems like an obvious place, but are there any other places where we
should also consider something like that for compatibility?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall

2017-03-07 Thread Haribabu Kommi
On Wed, Mar 1, 2017 at 12:59 PM, Haribabu Kommi 
wrote:

>
> Patch attached. Still some more docs needs to be added.
>

Updated patch attached to resolve the conflicts with following commit.

commit 9a83d56b38c870ce47b7651385ff2add583bf136
Author: Simon Riggs 
Date:   Tue Mar 7 22:00:54 2017 +0800

Allow pg_dumpall to dump roles w/o user passwords

Add new option --no-role-passwords which dumps roles without passwords.
Since we don’t need passwords, we choose to use pg_roles in preference
to pg_authid since access may be restricted for security reasons in
some configurations.

Robins Tharakan and Simon Riggs



Regards,
Hari Babu
Fujitsu Australia


pg_dump_changes_2.patch
Description: Binary data



Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-03-07 Thread David Steele

On 3/3/17 4:54 PM, David Steele wrote:


On 2/1/17 1:25 AM, Kyotaro HORIGUCHI wrote:

Hello, thank you for moving this to the next CF.

At Wed, 1 Feb 2017 13:09:51 +0900, Michael Paquier  wrote 
in 

On Tue, Jan 24, 2017 at 4:58 PM, Kyotaro HORIGUCHI
 wrote:

Six new syscaches in 665d1fa was conflicted and 3-way merge
worked correctly. The new syscaches don't seem to be targets of
this patch.

To be honest, I am not completely sure what to think about this patch.
Moved to next CF as there is a new version, and no new reviews to make
the discussion perhaps move on.

I'm thinking the following is the status of this topic.

- The patch still applies without conflicts.

- This is not a holistic measure for the memory leak, but it surely
   helps in some existing cases.

- A shared catcache is a separate discussion (and won't really be
   proposed in a short time due to the issue of locking.)

- As I mentioned, a patch that caps the number of negative entries is
   available (working in a first-created, first-deleted manner), but it
   has a loose end: how to determine the limit.
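For context, the negative entries in question typically arise from repeated lookups of catalog names that don't exist; for example (illustrative):

SELECT * FROM table_that_does_not_exist;
-- each such failed lookup leaves a negative entry in the relation-name
-- syscache, and those entries are not evicted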

While preventing bloat in the syscache is a worthwhile goal, it appears
there are a number of loose ends here and a new patch has not been provided.

It's a pretty major change so I recommend moving this patch to the
2017-07 CF.


Not hearing any opinions pro or con, I'm moving this patch to the 
2017-07 CF.


--
-David
da...@pgmasters.net





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-07 Thread Andres Freund
On 2017-03-07 21:48:23 -0500, Robert Haas wrote:
> On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson  wrote:
> > And I would argue that his feature is useful for quite many, based on my
> > experience running a semi-large database. Index bloat happens and without
> > REINDEX CONCURRENTLY it can be really annoying to solve, especially for
> > primary keys. Certainly more people have problems with index bloat than the
> > number of people who store index oids in their database.
> 
> Yeah, but that's not the only wart, I think.

I don't really see any other warts that don't correspond to CREATE/DROP
INDEX CONCURRENTLY.


> For example, I believe (haven't looked at this patch series in a
> while) that the patch takes a lock and later escalates the lock level.

It shouldn't* - that was required precisely because we had to switch the
relfilenodes when the oid stayed the same.  Otherwise in-progress index
lookups could end up using the wrong relfilenodes and/or switch in the
middle of a lookup.

* except for the exclusive lock that DROP INDEX CONCURRENTLY-style
  dropping takes after marking the index as dead - but that shouldn't
  be much of a concern?


> Also, if by any chance you think (or use any software that thinks)
> that OIDs for system objects are a stable identifier, this will be the
> first case where that ceases to be true.

Can you come up with a halfway realistic scenario in which an index
oid, rather than a table, constraint, or sequence oid, would be relied
upon?


> If the system is shut down or crashes or the session is killed, you'll
> be left with stray objects with names that you've never typed into the
> system.

Given how relatively few complaints we have about CIC's possibility of
ending up with invalid indexes - not that there are none - and it's
widespread usage, I'm not too concerned about this.

Greetings,

Andres Freund




Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-07 Thread Andres Freund
On 2017-03-07 21:38:40 -0500, Robert Haas wrote:
> > I wonder however, if careful snapshot managment couldn't solve this as
> > well.  I have *not* thought a lot about this, but afaics we can easily
> > prevent all-visible from being set in cases it'd bother us by having an
> > "appropriate" xmin / registered snapshot.
> 
> Yeah, but that's a tax on the whole system.

I'm not sure I can buy that argument. CIC *already* holds a snapshot
during each of the two scans. By reducing the amount of time the
snapshot is held during the second scan, the systemwide impact is
reduced, because it's held for a shorter amount of time.  We need to
hold a slightly "more aggressive" snapshot, that's true, but if you
have high xid throughput those effects should roughly balance each
other out.

Greetings,

Andres Freund




[HACKERS] Implementation of SASLprep for SCRAM-SHA-256

2017-03-07 Thread Michael Paquier
Hi all,

Here is another thread for an issue related to SCRAM
(https://www.postgresql.org/message-id/243d8c11-6149-a4bb-0909-136992f74...@iki.fi),
that can be discussed independently: SASLprep.

RFC5802 (https://tools.ietf.org/html/rfc5802) regarding the
implementation of SCRAM, needs to have passwords normalized as per
RFC4013 (https://tools.ietf.org/html/rfc4013) using SASLprep, which is
actually NFKC. I have previously described what happens in this case
here:
https://www.postgresql.org/message-id/CAB7nPqScgwh6Eu4=c-0l7-cufzru5rultsdpmoowiz1yfvg...@mail.gmail.com
This way, we can be sure that two UTF-8 strings are considered
equivalent in a SASL exchange; in our case we care about the password
string (we should care about the username as well). Without SASLprep,
our implementation of SCRAM may fail with third-party tools if
passwords are not made up only of ASCII characters. And that sucks.

To put it in short words, NFKC consists of two steps: a canonical
decomposition of the UTF-8 string (decomposing the string, then
reordering the characters using the combining class of each character),
followed by its recomposition.

Postgres does not have any frontend-side logic related to encoding;
still, it is possible with some of the APIs of wchar.c to check whether
a provided string is valid UTF-8. If the string is not valid UTF-8, we
should just use the string as-is in the exchange, which leads to
undefined behavior anyway, as SASL needs to do things with UTF-8.

I have also implemented a Postgres extension able to do this
operation, which has been useful for testing.
https://github.com/michaelpq/pg_plugins/tree/master/pg_sasl_prepare

Attached is a patch implementing that, the conversion tables are built
from UnicodeData.txt to be minimal in size. I have added an Open Item
on the wiki for this problem as well.

Thanks,
-- 
Michael


0001-Implement-SASLprep-aka-NFKC-for-SCRAM-authentication.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 9:24 PM, Amit Kapila  wrote:
> If we have lesser index pages and more heap pages, then we limit the
> parallelism based on index pages.

Kinda.  In most cases, we figure out the degree of parallelism based
on heap pages and then we figure out the degree of parallelism based
on index pages and we return the smaller of those numbers.  However,
as an exception, if one of those numbers would be zero and the other
would be non-zero, then we return 1 instead of 0.  That's randomly
inconsistent, and I think it's a bug which my proposed patch would
fix.  I don't understand how you can justify returning the smaller of
the two values in every other case, but in that case return something
else.

> I think it can give us benefit in
> such cases as well (especially when we have to discard rows based on
> the heap rows).  Now, consider it from another viewpoint: what if there are
> enough index pages (> min_parallel_index_scan_size) but not sufficient
> heap pages.  I think in such cases parallel index (only) scans will be
> beneficial especially because in the parallel index only scans
> heap_pages could be very less or possibly could be zero.

That's a separate problem.  I think we ought to consider having an
index-only scan pass -1 for the number of heap pages, so that the
degree of parallelism in that case is limited only by the number of
index pages.  I was going to bring up that issue after I got this
committed.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] wait events for disk I/O

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 9:32 PM, Amit Kapila  wrote:
> On Tue, Mar 7, 2017 at 9:16 PM, Robert Haas  wrote:
>> On Mon, Mar 6, 2017 at 9:09 PM, Amit Kapila  wrote:
>>> Sure, if you think both Writes and Reads at OS level can have some
>>> chance of blocking in obscure cases, then we should add a wait event
>>> for them.
>>
>> I think writes have a chance of blocking even in cases that are not
>> very obscure at all.
>
> Point taken for writes, but I think in general we should have some
> criteria based on which we can decide whether to have a wait event for
> a particular call. It should not happen that we have tons of wait
> events of which only a few are helpful in most real-world
> scenarios.

Well, the problem is that if you pick and choose which wait events to
add based on what you think will be common, you're actually kind of
hosing yourself. Because now when something uncommon happens, suddenly
you don't get any wait event data and you can't tell what's happening.
I think the number of new wait events added by Rushabh's patch is
wholly reasonable.  Yeah, some of those are going to be a lot more
common than others, but so what?  We add wait events so that we can
find out what's going on.  I don't want to sometimes know when a
backend is blocked on an I/O.  I want to ALWAYS know.
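For what it's worth, once such events exist they are observable the usual way (illustrative; the specific event names come from the patch):

SELECT pid, wait_event_type, wait_event
FROM pg_stat_activity
WHERE wait_event_type = 'IO';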

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-07 Thread Pavan Deolasee
On Wed, Mar 8, 2017 at 8:08 AM, Robert Haas  wrote:

> On Tue, Mar 7, 2017 at 9:12 PM, Andres Freund  wrote:
>
> >
> > I wonder however, if careful snapshot managment couldn't solve this as
> > well.  I have *not* thought a lot about this, but afaics we can easily
> > prevent all-visible from being set in cases it'd bother us by having an
> > "appropriate" xmin / registered snapshot.
>
> Yeah, but that's a tax on the whole system.
>
>
Maybe we can do that only when a patched CIC (or any other operation
that dislikes concurrent VM bit setting) is in progress. We could use
what Stephen suggested upthread to discover that state. But right now
it's hard to reason about, because there is nothing concrete on either
side, so we don't know what gets impacted by aggressive VM bit setting,
and how.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-07 Thread Robert Haas
On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson  wrote:
> And I would argue that his feature is useful for quite many, based on my
> experience running a semi-large database. Index bloat happens and without
> REINDEX CONCURRENTLY it can be really annoying to solve, especially for
> primary keys. Certainly more people have problems with index bloat than the
> number of people who store index oids in their database.

Yeah, but that's not the only wart, I think.  For example, I believe
(haven't looked at this patch series in a while) that the patch takes
a lock and later escalates the lock level.  If so, that could lead to
doing a lot of work to build the index and then getting killed by the
deadlock detector.  Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system.  I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-07 Thread Pavan Deolasee
On Wed, Mar 8, 2017 at 7:33 AM, Robert Haas  wrote:

> On Tue, Mar 7, 2017 at 4:26 PM, Stephen Frost  wrote:
> > Right, that's what I thought he was getting at and my general thinking
> > was that we would need a way to discover if a CIC is ongoing on the
> > relation and therefore heap_page_prune(), or anything else, would know
> > that it can't twiddle the bits in the VM due to the ongoing CIC.
> > Perhaps a lock isn't the right answer there, but it would have to be
> > some kind of cross-process communication that operates at a relation
> > level..
>
> Well, I guess that's one option.  I lean toward the position already
> taken by Andres and Peter, namely, that it's probably not a great idea
> to pursue this optimization.


Fair point. I'm not going to "persist" with the idea for too long. It
seemed like a good, low-risk feature to me which can benefit certain
use cases quite reasonably. It's not uncommon to create indexes (or
reindex existing indexes to remove index bloat) on extremely large
tables, and avoiding a second heap scan can hugely benefit such cases.
Holding up the patch for something for which we don't even have a
proposal yet seemed a bit strange at first, but I see the point.

Anyway, for a recently vacuumed table of twice the size of RAM, on a
machine with SSDs, the patched CIC version runs about 25% faster.
That's probably the best-case scenario.

I also realised that I'd attached a slightly older version of the
patch. A new version is attached, but I'll remove the patch from the CF
if I don't see any more votes in favour of it.

Thanks,
Pavan


cic_skip_all_visible_v4.patch
Description: Binary data



Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 9:12 PM, Andres Freund  wrote:
> On 2017-03-07 21:03:53 -0500, Robert Haas wrote:
>> On Tue, Mar 7, 2017 at 4:26 PM, Stephen Frost  wrote:
>> > Right, that's what I thought he was getting at and my general thinking
>> > was that we would need a way to discover if a CIC is ongoing on the
>> > relation and therefore heap_page_prune(), or anything else, would know
>> > that it can't twiddle the bits in the VM due to the ongoing CIC.
>> > Perhaps a lock isn't the right answer there, but it would have to be
>> > some kind of cross-process communication that operates at a relation
>> > level..
>>
>> Well, I guess that's one option.  I lean toward the position already
>> taken by Andres and Peter, namely, that it's probably not a great idea
>> to pursue this optimization.  I'm not totally dead-set on that
>> position, but it doesn't seem likely that we can add material
>> synchronization overhead to heap_page_prune(), and I'd rather maintain
>> the option to set visibility map bits in more places in the future
>> than give up that opportunity by optimizing CIC now.  It's nice for
>> CIC to be fast, but setting all visible bits more aggressively sounds
>> nicer.
>
> Indeed.
>
> I wonder however, if careful snapshot managment couldn't solve this as
> well.  I have *not* thought a lot about this, but afaics we can easily
> prevent all-visible from being set in cases it'd bother us by having an
> "appropriate" xmin / registered snapshot.

Yeah, but that's a tax on the whole system.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] adding an immutable variant of to_date

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 3:56 PM, Sven R. Kunze  wrote:
>> 2) As far as I can tell from reading the code to_date currently ignores
>> the M suffix which indicates that you want localized month/day names, so I
>> guess that to_date is currently immutable. Maybe it is stable due to the
>> idea that we may want to support the M suffix in the future.
>
> I think that's the case.

The commit message for 64353640e87b54250f1b8a1d2a708d270dff4bfd has
some interesting perspective on this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wait events for disk I/O

2017-03-07 Thread Amit Kapila
On Tue, Mar 7, 2017 at 9:16 PM, Robert Haas  wrote:
> On Mon, Mar 6, 2017 at 9:09 PM, Amit Kapila  wrote:
>> Sure, if you think both Writes and Reads at OS level can have some
>> chance of blocking in obscure cases, then we should add a wait event
>> for them.
>
> I think writes have a chance of blocking even in cases that are not
> very obscure at all.
>

Point taken for writes, but I think in general we should have some
criteria for deciding whether to have a wait event for a particular
call.  It should not happen that we have tons of wait events of which
only a few are helpful in most real-world scenarios.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-03-07 Thread Peter Eisentraut
On 3/6/17 21:11, Michael Paquier wrote:
> I think that the documentation of initdb should mention that
> pg_hba.conf entries are configured for replication connections as
> well, something like a sentence in the Description paragraph:
> initdb sets pg_hba.conf entries using the specified authentication
> method (trust by default) for non-replication as well as replication
> connections.

OK, I was looking for a way to document this.  Your idea seems the most
sensible.

See attached patch that puts it all together.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 43fe1c6426f541767749989ccc14cbe056f25b35 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 7 Mar 2017 21:07:07 -0500
Subject: [PATCH v2] Enable replication connections by default in pg_hba.conf

---
 doc/src/sgml/ref/initdb.sgml | 16 +++-
 src/backend/libpq/pg_hba.conf.sample |  6 +++---
 src/bin/initdb/initdb.c  |  5 -
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  7 ++-
 src/test/perl/PostgresNode.pm| 19 ++-
 5 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 1aaa4901af..d9faa96021 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -120,11 +120,17 @@ Options
   --auth=authmethod
   

-This option specifies the authentication method for local users used
-in pg_hba.conf (host
-and local lines).  Do not use trust
-unless you trust all local users on your system.  trust is
-the default for ease of installation.
+This option specifies the default authentication method for local
+users used in pg_hba.conf (host
+and local lines).  initdb will
+prepopulate pg_hba.conf entries using the
+specified authentication method for non-replication as well as
+replication connections.
+   
+
+   
+Do not use trust unless you trust all local users on your
+system.  trust is the default for ease of installation.

   
  
diff --git a/src/backend/libpq/pg_hba.conf.sample b/src/backend/libpq/pg_hba.conf.sample
index e0fbfcb026..507278b2cc 100644
--- a/src/backend/libpq/pg_hba.conf.sample
+++ b/src/backend/libpq/pg_hba.conf.sample
@@ -84,6 +84,6 @@ host    all             all             127.0.0.1/32            @authmethodhost@
 host    all             all             ::1/128                 @authmethodhost@
 # Allow replication connections from localhost, by a user with the
 # replication privilege.
-@remove-line-for-nolocal@#local   replication     @default_username@                @authmethodlocal@
-#host    replication     @default_username@        127.0.0.1/32            @authmethodhost@
-#host    replication     @default_username@        ::1/128                 @authmethodhost@
+@remove-line-for-nolocal@local   replication     all                     @authmethodlocal@
+host    replication     all             127.0.0.1/32            @authmethodhost@
+host    replication     all             ::1/128                 @authmethodhost@
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 1ed0d20504..ea7b47e26d 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1227,11 +1227,6 @@ setup_config(void)
 			  "@authcomment@",
 			  (strcmp(authmethodlocal, "trust") == 0 || strcmp(authmethodhost, "trust") == 0) ? AUTHTRUST_WARNING : "");
 
-	/* Replace username for replication */
-	conflines = replace_token(conflines,
-			  "@default_username@",
-			  username);
-
 	snprintf(path, sizeof(path), "%s/pg_hba.conf", pg_data);
 
 	writefile(path, conflines);
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index aafb138fd5..14bd813896 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@
 use Config;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 73;
+use Test::More tests => 72;
 
 program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
@@ -15,15 +15,12 @@
 my $node = get_new_node('main');
 
 # Initialize node without replication settings
-$node->init(hba_permit_replication => 0);
+$node->init;
 $node->start;
 my $pgdata = $node->data_dir;
 
 $node->command_fails(['pg_basebackup'],
 	'pg_basebackup needs target directory specified');
-$node->command_fails(
-	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
-	'pg_basebackup fails because of hba');
 
 # Some Windows ANSI code pages may reject this filename, in which case we
 # quietly proceed without this bit of test coverage.
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e5cb348f4c..7e530676b2 100644
--- 

Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-07 Thread Amit Kapila
On Tue, Mar 7, 2017 at 10:12 PM, Robert Haas  wrote:
> On Mon, Mar 6, 2017 at 10:28 PM, Amit Kapila  wrote:
>> I also think that commit didn't intend to change the behavior,
>> however, the point is how sensible is it to keep such behavior after
>> Parallel Append.  I am not sure if this is the right time to consider
>> it or shall we wait till Parallel Append is committed.
>
> I think we should wait until that's committed.  I'm not convinced we
> ever want to change the behavior, but if we do, let's think about it
> then, not now.
>
>> - if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
>> - index_pages < (BlockNumber) min_parallel_index_scan_size &&
>> - rel->reloptkind == RELOPT_BASEREL)
>> + if (rel->reloptkind == RELOPT_BASEREL &&
>> + ((heap_pages >= 0 && heap_pages < min_parallel_table_scan_size) ||
>> + (index_pages >= 0 && index_pages < min_parallel_index_scan_size)))
>>   return 0;
>>
>> The purpose of considering both heap and index pages for
>> min_parallel_* is that even if one of them is bigger than threshold
>> then we should try parallelism.  This could be helpful for parallel
>> index scans where we consider parallel workers based on both heap and
>> index pages.  Is there a reason for you to change that condition as
>> that is not even related to the problem being discussed?
>
> Really?  We want to do parallelism if EITHER condition is met?  I
> thought the goal was to do parallelism if BOTH conditions were met.
> Otherwise, you might go parallel if you have a large number of heap
> pages but only 1 or 2 index pages.
>

If we have fewer index pages and more heap pages, then we limit the
parallelism based on index pages.  I think it can give us a benefit in
such cases as well (especially when we have to discard rows based on the
heap rows).  Now, consider it from another viewpoint: what if there are
enough index pages (> min_parallel_index_scan_size) but not sufficient
heap pages?  I think in such cases parallel index (only) scans will be
beneficial, especially because in a parallel index-only scan heap_pages
could be very small or possibly zero.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-07 Thread David Rowley
On 28 February 2017 at 01:02, Andres Freund  wrote:
> Hi,
>
> On 2017-02-27 03:17:32 -0800, Andres Freund wrote:
>> I'll work on getting slab committed first, and then review / edit /
>> commit generation.c later.  One first note there is that I'm wondering
>> if generation.c is a too generic filename.
>
> And pushed slab and its usage.  Will have a look at generation.c
> tomorrow.

Attached is a patch to fix the compiler warning for compilers that
don't understand elog(ERROR) does not return.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


slaballoc_warning_fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-07 Thread Andres Freund
On 2017-03-07 21:03:53 -0500, Robert Haas wrote:
> On Tue, Mar 7, 2017 at 4:26 PM, Stephen Frost  wrote:
> > Right, that's what I thought he was getting at and my general thinking
> > was that we would need a way to discover if a CIC is ongoing on the
> > relation and therefore heap_page_prune(), or anything else, would know
> > that it can't twiddle the bits in the VM due to the ongoing CIC.
> > Perhaps a lock isn't the right answer there, but it would have to be
> > some kind of cross-process communication that operates at a relation
> > level..
> 
> Well, I guess that's one option.  I lean toward the position already
> taken by Andres and Peter, namely, that it's probably not a great idea
> to pursue this optimization.  I'm not totally dead-set on that
> position, but it doesn't seem likely that we can add material
> synchronization overhead to heap_page_prune(), and I'd rather maintain
> the option to set visibility map bits in more places in the future
> than give up that opportunity by optimizing CIC now.  It's nice for
> CIC to be fast, but setting all visible bits more aggressively sounds
> nicer.

Indeed.

I wonder however, if careful snapshot managment couldn't solve this as
well.  I have *not* thought a lot about this, but afaics we can easily
prevent all-visible from being set in cases it'd bother us by having an
"appropriate" xmin / registered snapshot.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2017-03-07 Thread Amit Kapila
On Wed, Mar 8, 2017 at 3:38 AM, Robert Haas  wrote:
> On Wed, Mar 1, 2017 at 4:18 AM, Robert Haas  wrote:
>> On Tue, Feb 28, 2017 at 7:31 PM, Amit Kapila  wrote:
>>> Yeah, actually those were added later in Enable-WAL-for-Hash* patch,
>>> but I think as this patch is standalone, so we should not remove it
>>> from their existing usage, I have added those back and rebased the
>>> remaining patches.
>>
>> Great, thanks.  0001 looks good to me now, so committed.
>
> Committed 0002.
>

Thanks.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')

2017-03-07 Thread Michael Paquier
Hi all,

As discussed on the thread dedicated to SCRAM
(https://www.postgresql.org/message-id/243d8c11-6149-a4bb-0909-136992f74...@iki.fi),
here is a separate thread dedicated to the following extension for
CREATE/ALTER ROLE: PASSWORD ('value' USING 'method').

Now that password_encryption has been extended with a new value
'scram', it is a bit bothersome for the user to create roles using
different methods because password_encryption would need to be set
first:
=# SET password_encryption = 'scram';
SET
=# CREATE ROLE foorole PASSWORD 'foopass';
CREATE ROLE
=# SET password_encryption = 'md5';
SET
=# CREATE ROLE foorole2 PASSWORD 'foopass';
CREATE ROLE

What I am proposing with the patch attached is to add a new clause
(grammar is an idea from Robert), to do the same in a single command:
=# CREATE ROLE foorole3 PASSWORD ('foo' USING 'scram');
CREATE ROLE
=# CREATE ROLE foorole4 PASSWORD ('foo' USING 'md5');
CREATE ROLE
This way there is no need to set password_encryption prior to
defining a new password.  Note that, like the existing clauses, this is
permissive.  In short, if the value is already MD5-encrypted or
SCRAM-encrypted, then the type of the parsed value is enforced over
what is defined as the method in this USING clause, which is
useful for dumping data.
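
For instance, a verifier that is already hashed should be stored verbatim
(an illustration only; the MD5 value below is made up):
=# CREATE ROLE foorole5 PASSWORD ('md5d51c9a7e9353746a6020f9602d452929' USING 'md5');
CREATE ROLE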

As this needs clarification before Postgres 10, I am adding a bullet
in the TODO items. This would prove to be useful if more protocols are
added in the future.

Thoughts?
-- 
Michael


0001-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign partition DDL regression tests

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 7:14 AM, Ashutosh Bapat
 wrote:
> Hi Amit,
> Thanks for adding testcases. Overall the testcases look good.
>
> The testcase is using ALTER TABLE to modify foreign table schema.
> Though this works, I think a better option is to use ALTER FOREIGN
> TABLE.
>
> Something not related to this patch but
> -- no attach partition validation occurs for foreign tables
> ALTER TABLE pt2 ATTACH PARTITION pt2_1 FOR VALUES IN (1);
>
> I am wondering whether we can improve this. For local tables, if a
> valid constraint equivalent to the partitioning condition is not
> present on the table being attached, it scans the data to verify
> partition conditions. But for a foreign table, we don't want to do
> that since the constraint is not guaranteed to be valid after the
> initial check. For a normal foreign table a user can set a constraint
> if s/he knows that that constraint will be honoured on the foreign
> server. Thus instead of saying that we do not validate data, we can
> refuse to attach a partition if corresponding check constraint is
> absent on the foreign table being attached. A user will then be forced
> to add that constraint if s/he is sure that the constraint will be
> obeyed on the foreign server.

I agree that we could do that, but what value would it have?  It just
forces the user to spend two SQL commands doing what could otherwise
be done in one.  And the first command might not be that obvious.  I
think we should leave well enough alone.
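
(Concretely, that would mean something like the following two commands, with
a hypothetical column name, where today a single ATTACH PARTITION suffices:)

ALTER FOREIGN TABLE pt2_1 ADD CONSTRAINT pt2_1_chk CHECK (a IN (1));
ALTER TABLE pt2 ATTACH PARTITION pt2_1 FOR VALUES IN (1);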

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-07 Thread Vaishnavi Prabakaran
On Wed, Mar 8, 2017 at 3:52 AM, Daniel Verite 
wrote:

> Vaishnavi Prabakaran wrote:
>
> > Yes, I have created a new patch entry into the commitfest 2017-03 and
> > attached the latest patch with this e-mail.
>
> Please find attached a companion patch implementing the batch API in
> pgbench, exposed as \beginbatch and \endbatch meta-commands
> (without documentation).
>
> The idea for now is to make it easier to exercise the API and test
> how batching performs. I guess I'll submit the patch separately in
> a future CF, depending on when/if the libpq patch goes in.
>
>

Thanks for the companion patch and here are some comments:

1. I see that the below check is used to verify if the connection is not in
batch mode:
if (PQbatchStatus(st->con) != PQBATCH_MODE_ON)

But it is better to use if (PQbatchStatus(st->con) ==
PQBATCH_MODE_OFF) for this verification.  The reason is that there is one
more state in batch mode - PQBATCH_MODE_ABORTED.  So, if the batch is in
aborted status, this check will still assume that the connection is not in
batch mode (see the sketch after this list).

In the same way, the "if (PQbatchStatus(st->con) == PQBATCH_MODE_ON)" check
is better modified as "PQbatchStatus(st->con) != PQBATCH_MODE_OFF".


2.  @@ -2207,7 +2227,47 @@ doCustom(TState *thread, CState *st, StatsData
*agg)
+ if (PQbatchStatus(st->con) != PQBATCH_MODE_OFF)
+ {
+ commandFailed(st, "already in batch mode");
+ break;
+ }

This is not required, as PQbatchBegin() below internally does this
verification.

+ PQbatchBegin(st->con);


3. It is better to check the return value of PQbatchBegin() rather than
assuming success. E.g: PQbatchBegin() will return false if the connection
is in copy in/out/both states.
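
For the record, here is a minimal sketch of the distinction in point 1,
written against the batch API proposed in this patch (the PQbatchStatus()
and PQBATCH_MODE_* names are taken from the patch itself):

#include <stdbool.h>
#include "libpq-fe.h"		/* assumes the proposed batch-mode additions */

/*
 * "Not in batch mode" must compare against PQBATCH_MODE_OFF: the API has
 * three states, and an aborted batch is still a batch.
 */
static bool
in_batch_mode(PGconn *conn)
{
	switch (PQbatchStatus(conn))
	{
		case PQBATCH_MODE_OFF:
			return false;		/* plain synchronous mode */
		case PQBATCH_MODE_ON:
		case PQBATCH_MODE_ABORTED:
		default:
			return true;		/* batching, possibly aborted */
	}
}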



> While developing this, I noted a few things with 0001-v4:
>
> 1. lack of initialization for count in PQbatchQueueCount.
> Trivial fix:
>
> --- a/src/interfaces/libpq/fe-exec.c
> +++ b/src/interfaces/libpq/fe-exec.c
> @@ -1874,7 +1874,7 @@ PQisBusy(PGconn *conn)
>  int
>  PQbatchQueueCount(PGconn *conn)
>  {
> -   int count;
> +   int count = 0;
> PGcommandQueueEntry *entry;
>
>
Thanks for your review and yes, Corrected.


> 2. misleading error message in PQexecStart. It gets called by a few other
> functions than PQexec, such as PQprepare. As I understand it, the intent
> here is to forbid the synchronous functions in batch mode, so this error
> message should not single out PQexec.
>
> @@ -1932,6 +2425,13 @@ PQexecStart(PGconn *conn)
> if (!conn)
> return false;
>
> +   if (conn->asyncStatus == PGASYNC_QUEUED || conn->batch_status != PQBATCH_MODE_OFF)
> +   {
> +       printfPQExpBuffer(&conn->errorMessage,
> +                         libpq_gettext("cannot PQexec in batch mode\n"));
> +       return false;
> +   }
> +
>
>

Hmm, this error message goes with the flow of other error messages in the
same function, e.g. "PQexec not allowed during COPY BOTH".  But anyway, I
have modified the error message to be more generic.



> 3. In relation to #2, PQsendQuery() is not forbidden in batch mode
> although I don't think it can work with it, as it's based on the old
> protocol.
>
>
>
It is actually forbidden and you can see the error message "cannot
PQsendQuery in batch mode, use PQsendQueryParams" when you use
PQsendQuery().
Attached the updated patch.

Thanks & Regards,
Vaishnavi
Fujitsu Australia.


0001-Pipelining-batch-support-for-libpq-code-v5.patch
Description: Binary data


0002-Pipelining-batch-support-for-libpq-test-v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 4:26 PM, Stephen Frost  wrote:
> Right, that's what I thought he was getting at and my general thinking
> was that we would need a way to discover if a CIC is ongoing on the
> relation and therefore heap_page_prune(), or anything else, would know
> that it can't twiddle the bits in the VM due to the ongoing CIC.
> Perhaps a lock isn't the right answer there, but it would have to be
> some kind of cross-process communication that operates at a relation
> level..

Well, I guess that's one option.  I lean toward the position already
taken by Andres and Peter, namely, that it's probably not a great idea
to pursue this optimization.  I'm not totally dead-set on that
position, but it doesn't seem likely that we can add material
synchronization overhead to heap_page_prune(), and I'd rather maintain
the option to set visibility map bits in more places in the future
than give up that opportunity by optimizing CIC now.  It's nice for
CIC to be fast, but setting all visible bits more aggressively sounds
nicer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

2017-03-07 Thread David Rowley
On 8 March 2017 at 01:13, Simon Riggs  wrote:
> Don't understand this. I'm talking about setting a flag on
> commit/abort WAL records, like the attached.

There's nothing setting a flag in the attached. I only see the
checking of the flag.

> We just need to track info so we can set the flag at EOXact and we're done.

Do you have an idea where to store that information? I'd assume we'd
want to store something during LogAccessExclusiveLocks() and check
that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
anything existing which might help with that today. Do you think I
should introduce some new global variable for that? Or do you propose
calling GetRunningTransactionLocks() again while generating the
Commit/Abort record?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-07 Thread Craig Ringer
On 8 March 2017 at 00:52, Daniel Verite  wrote:
> Vaishnavi Prabakaran wrote:
>
>> Yes, I have created a new patch entry into the commitfest 2017-03 and
>> attached the latest patch with this e-mail.
>
> Please find attached a companion patch implementing the batch API in
> pgbench, exposed as \beginbatch and \endbatch meta-commands
> (without documentation).
>
> The idea for now is to make it easier to exercise the API and test
> how batching performs. I guess I'll submit the patch separately in
> a future CF, depending on when/if the libpq patch goes in.

That's great, thanks, and thanks also for the fixes. Any chance you
can attach your updated patch?

I looked at modifying psql to support batching when run
non-interactively, but it would've required major restructuring of its
control loop and I ran out of time. I didn't think of modifying
pgbench. Great to see.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-07 Thread Amit Langote
On 2017/03/08 1:34, Robert Haas wrote:
> On Tue, Mar 7, 2017 at 1:43 AM, Amit Langote wrote:
>> On 2017/03/07 14:04, Tom Lane wrote:
>>> Therefore, there should definitely be a partitioned table, hopefully with
>>> a less generic name than "p1", in the final regression DB state.  Whether
>>> this particular one from alter_table.sql is a good candidate, I dunno.
>>> But let's not drop it without adding a better-thought-out replacement.
>>
>> OK, let's drop p1 in alter_table.sql.  I think a partitioned table created
>> in insert.sql is a good candidate to keep around after having it renamed,
>> which patch 0003 does.
> 
> Committed 0001.
> 
> Committed 0002 and 0003 together.

Thanks.

Regards,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Example Custom Scan Provider Implementation?

2017-03-07 Thread Amit Langote
On 2017/03/08 7:10, Eric Ridge wrote:
> Hi all!
> 
> Does anyone know of a simple, example Custom Scan Provider implementation
> for 9.6+?
> 
> I found pg_strom by searching GitHub.  Its gpuscan.c looks like maybe it
> implements a pattern similar to what I want to do, but there's a lot of
> extraneous (to me) stuff to parse through.
> 
> I'm kinda surprised there isn't an example in contrib/, actually.

Here you go: https://github.com/kaigai/ctidscan

This was proposed originally [1] to go into contrib/, but that didn't
happen somehow.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F801091310%40BPXM15GP.gisp.nec.co.jp




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding OOM in a hash join with many duplicate inner keys

2017-03-07 Thread Tom Lane
Thomas Munro  writes:
> I have been wondering about a couple of different worst case execution
> strategies that would be better than throwing our hands up and
> potentially exploding memory once we detect that further partitioning
> is not going to help, if we still manage to reach that case despite
> adding stats-based defences like this due to statistics being missing,
> bad or confused by joins below us.

Yeah, it would definitely be nice if we could constrain the runtime
space consumption better.

> 1.  We could probe the fraction of the hash table that we have managed
> to load into work_mem so far and then rewind the outer batch and do it
> again for the next work_mem-sized fraction of the inner batch and so
> on.  For outer joins we'd need to scan for unmatched tuples after each
> hash table refill.

I do not understand how that works for a left join?  You'd need to track
whether a given outer tuple has been matched in any one of the fractions
of the inner batch, so that when you're done with the batch you could know
which outer tuples need to be emitted null-extended.  Right now we only
need to track that state for the current outer tuple, but in a rescan
situation we'd have to remember it for each outer tuple in the batch.

Perhaps it could be done by treating the outer batch file as read/write
and incorporating a state flag in each tuple; or to reduce write volume,
maintaining a separate outer batch file parallel to the main one with just
a bool or even just a bit per outer tuple.  Seems messy though.
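
(A rough sketch of that bit-per-outer-tuple bookkeeping, for illustration
only; the buffered file I/O around it is elided:)

#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* outer tuple n's match flag lives in bit n of the side file's contents */
static inline void
set_match_bit(uint8_t *bits, size_t n)
{
	bits[n / 8] |= (uint8_t) (1u << (n % 8));
}

static inline bool
tuple_was_matched(const uint8_t *bits, size_t n)
{
	return (bits[n / 8] >> (n % 8)) & 1;
}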

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2017-03-07 Thread Kevin Grittner
On Mon, Feb 20, 2017 at 7:44 PM, Thomas Munro
 wrote:

> Was it intentional that this test doesn't include any statements that
> reach the case where the trigger does RAISE EXCEPTION 'RI error'?
> From the output generated there doesn't seem to be any evidence that
> the triggers run at all, though I know from playing around with this
> that they do

Tests expanded to cover more.  Some bugs found and fixed in the process.  :-/

> + * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
>
> These copyright messages are missing 3 years' worth of bugfixes.

Those were only in files created with the initial patch in 2014.  No
bug fixes missing.  Updated the years to 2017, though.

> +SPI_get_caller_relation(const char *name)
>
> Do we need this function?  It's not used by the implementation.

Good point.  It seemed useful way back when, but since no uses for
it emerged, it should be removed until such time (if any) that it
would be useful.

> +typedef struct NamedTuplestoreScan
> +{
> + Scan scan;
> + char   *enrname;
> +} NamedTuplestoreScan;
>
> Nearly plan node structs always have a comment for the members below
> 'scan'; I think this needs one too because 'enrname' is not
> self-explanatory.

Done.

>   /*
> + * Capture the NEW and OLD transition TABLE tuplestores (if specified for
> + * this trigger).
> + */
> + if (trigdata->tg_newtable || trigdata->tg_oldtable)
> + {
> + estate.queryEnv = create_queryEnv();
> + if (trigdata->tg_newtable)
> + {
> + Enr enr = palloc(sizeof(EnrData));
> +
> + enr->md.name = trigdata->tg_trigger->tgnewtable;
> + enr->md.tupdesc = trigdata->tg_relation->rd_att;
> + enr->md.enrtuples = tuplestore_tuple_count(trigdata->tg_newtable);
> + enr->reldata = trigdata->tg_newtable;
> + register_enr(estate.queryEnv, enr);
> + SPI_register_relation(enr);
> + }
>
> Why do we we have to call register_enr and also SPI_register_relation here?

Essentially, because plpgsql does some things through SPI and some
things not.  Both cases are covered.

> On Mon, Dec 19, 2016 at 4:35 AM, Thomas Munro
>  wrote:
>> On Sun, Dec 18, 2016 at 3:15 PM, Kevin Grittner  wrote:
>>> There were a few changes Thomas included in the version he posted,
>>> without really delving into an explanation for those changes.  Some
>>> or all of them are likely to be worthwhile, but I would rather
>>> incorporate them based on explicit discussion, so this version
>>> doesn't do much other than generalize the interface a little,
>>> change some names, and add more regression tests for the new
>>> feature.  (The examples I worked up for the rough proof of concept
>>> of enforcement of RI through set logic rather than row-at-a-time
>>> navigation were the basis for the new tests, so the idea won't get
>>> totally lost.)  Thomas, please discuss each suggested change (e.g.,
>>> the inclusion of the query environment in the parameter list of a
>>> few more functions).
>>
>> I was looking for omissions that would cause some kind of statements
>> to miss out on ENRs arbitrarily.  It seemed to me that
>> parse_analyze_varparams should take a QueryEnvironment, mirroring
>> parse_analyze, so that PrepareQuery could pass it in.  Otherwise,
>> PREPARE wouldn't see ENRs.  Is there any reason why SPI_prepare should
>> see them but PREPARE not?
>
> Any thoughts about that?

Do you see any way to test that code, or would it be dead code there
"just in case" we later decided to do something that needed it?  I'm
not a big fan of the latter.  I've had to spend too much time
maintaining and/or ripping out code that fits that description.

On Mon, Feb 20, 2017 at 9:43 PM, Thomas Munro
 wrote:
> On Tue, Feb 21, 2017 at 7:14 AM, Thomas Munro
>  wrote:
>> On Fri, Jan 20, 2017 at 2:49 AM, Kevin Grittner  wrote:
>>> Attached is v9 which fixes bitrot from v8.  No other changes.
>>>
>>> Still needs review.
>
> Based on a suggestion from Robert off-list I tried inserting into a
> delta relation from a trigger function and discovered that it
> segfaults

Fixed.  Such an attempt now generates something like this:

ERROR:  relation "d" cannot be the target of a modifying statement
CONTEXT:  SQL statement "INSERT INTO d VALUES (100, 100, 'x')"
PL/pgSQL function level2_table_bad_usage_func() line 3 at SQL statement

New patch attached.

Miscellanea:

Do you suppose we should have all PLs that are part of the base
distro covered?

What is necessary to indicate an additional SQL feature covered?

--
Kevin Grittner


transition-v11.diff.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-07 Thread Tom Lane
Andres Freund  writes:
> +++ b/src/include/storage/barrier.h
> +#include "postgres.h"

> Huh, that normally shouldn't be in a header.  I see you introduced that
> in a bunch of other places too - that really doesn't look right to me.

That is absolutely not project style and is not acceptable.

The core reason why not is that postgres.h/postgres_fe.h/c.h have to be
the *first* inclusion in every compilation, for arcane portability reasons
you really don't want to know about.  (Suffice it to say that on some
platforms, stdio.h isn't all that std.)  Our coding rule for that is that
we put the appropriate one of these first in every .c file, while .h files
always assume that it's been included already.  As soon as you break that
convention, it becomes unclear from looking at a .c file whether the
ordering requirement has been satisfied.  Also, since now you've moved
the must-be-first requirement to some other header file(s), you risk
breakage when somebody applies another project convention about
alphabetizing #include references for all headers other than those magic
ones.

In short, don't even think of doing this.
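
To illustrate the convention (the file and header names are placeholders):

/* foo.c */
#include "postgres.h"			/* always first, and only in .c files */

#include <limits.h>				/* system headers after that */

#include "storage/barrier.h"	/* project headers, alphabetized; they all
								 * assume postgres.h is already included */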

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-07 Thread Andres Freund
Hi,

On 2017-03-07 18:46:31 -0500, Peter Eisentraut wrote:
> I looked over
> 0001-Add-expression-dependencies-on-composite-type-whole-.patch.  That
> seems useful independent of the things you have following.
> 
> Even though it is presented more as a preparatory change, it appears to
> have noticeable user-facing impact, which we should analyze.

Indeed.


> For
> example, in the regression test, the command
> 
> alter table tt14t drop column f3;
> 
> is now rejected, rightly, but the command
> 
> alter table tt14t drop column f3 cascade;
> 
> ends up dropping the whole view, which might be a bit surprising.  We
> don't support dropping columns from views, so there might be no
> alternative

It seems strictly better than silently breaking the view.


> , but I wonder what else is changing because of this.  I
> think that might deserve a bit more analysis and perhaps some test cases.

There are already some tests. More is probably good - if you have
specific cases in mind...


> --- a/src/backend/catalog/dependency.c
> +++ b/src/backend/catalog/dependency.c
> @@ -206,6 +206,9 @@ static bool object_address_present_add_flags(const
> ObjectAddress *object,
>  static bool stack_address_present_add_flags(const ObjectAddress *object,
> int flags,
> 
> ObjectAddressStack *stack);
> +static void add_type_addresses(Oid typeid, bool include_components,
> ObjectAddresses *addrs);
> +static void add_type_component_addresses(Oid typeid, ObjectAddresses
> *addrs);
> +static void add_class_component_addresses(Oid relid, ObjectAddresses
> *addrs);
>  static void DeleteInitPrivs(const ObjectAddress *object);
> 
> For some reason, the function add_object_address() is singular, and
> these new functions are plural.  Clearly, they take a plural argument, so
> your version seems more correct, but I wonder if we should keep that
> consistent.

I named them plural, because add_object_address only adds a single new
entry, but add_type_addresses etc potentially add multiple ones.  So I
think plural/singular as used here is actually consistent?


> +* e.g. not to the right thing for column-less
> tables.  Note that
> 
> Small typo there.  (to -> do)

Oops.


> @@ -1639,6 +1641,9 @@ find_expr_references_walker(Node *node,
> 
> add_object_address(OCLASS_PROC, funcexpr->funcid, 0,
>context->addrs);
> +   /* dependency on type itself already exists via function */
> +   add_type_component_addresses(funcexpr->funcresulttype,
> context->addrs);
> +
> /* fall through to examine arguments */
> }
> else if (IsA(node, OpExpr))
> 
> Why shouldn't the function itself also depend on the components of its
> return type?

Because that'd make it impossible to change the return type components -
if the return type is baked into the view that's necessary, but for a
"freestanding function" it's not.  If you e.g. have a function that just
returns a table's rows, it'd certainly be annoying if that'd prevent you
from dropping columns.
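
E.g. something like this (hypothetical names) should keep working:

CREATE TABLE t (a int, b int);
CREATE FUNCTION all_t() RETURNS SETOF t LANGUAGE sql AS 'SELECT * FROM t';
ALTER TABLE t DROP COLUMN b;	-- must not be blocked by all_t's return type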


> How are argument types handled?

We fall through to

return expression_tree_walker(node, find_expr_references_walker,
  (void *) 
context);

which'll add a dependency if necessary.  If it's e.g. a table column,
function return type or such we'll add a dependency there.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level rollback

2017-03-07 Thread legrand legrand
There was a mistake in my driver definition:

this works fine with autosave=always (but not with autoSave ...)


Thanks Again


From: Vladimir Sitnikov [via PostgreSQL]
Sent: Tuesday, March 7, 2017 22:32:27
To: legrand legrand
Subject: Re: Statement-level rollback

Please disregard my previous message.
pgjdbc is already doing uppercase conversion, so I would like to see a test
case that reproduces the error.

Alternatively, could you please capture and share TRACE log? ( 
https://jdbc.postgresql.org/documentation/head/logging.html#configuration )

Vladimir

Wed, 8 Mar 2017 at 1:26, Vladimir Sitnikov <[hidden email]>:
legrand> when using version 42.0.0 with
legrand> jdbc:postgresql://localhost:5432/postgres?autosave=always

The pitfall there is the value should be written with upper case like 
autosave=ALWAYS.

I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at some 
point.


Vladimir








Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-07 Thread Tom Lane
Robert Haas  writes:
> On Tue, Mar 7, 2017 at 6:03 PM, Tom Lane  wrote:
>> Hm, one would hope that the vast majority of code references are neither
>> of those, but rather "RELKIND_PARTITIONED_TABLE".

> For reasons which must've seemed good to whoever instituted the
> policy, pg_dump refers to relkinds using the bare letters rather than
> the constants.

Even in pg_dump, it appears to me that the large majority of relkind
references use the symbolic names.  Quite a few of the violations of
that policy look to be new ... and now that I see them, their days are
numbered.

> (And protocol message types don't even have defined constants.  Uggh.)

Yeah, that's a different issue, which boils down to the fact that in order
to do anything useful we'd need to clutter client-visible namespace with
the symbols.  I wouldn't be averse to doing something about it as long as
it's not done in postgres_ext.h, but if not there where?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-07 Thread Andres Freund
Hi,

0001: Do hash join work_mem accounting in chunks.

Don't think there's much left to say.

0002: Check hash join work_mem usage at the point of chunk allocation.

Modify the existing hash join code to detect work_mem exhaustion at
the point where chunks are allocated, instead of checking after every
tuple insertion.  This matches the logic used for estimating, and more
importantly allows for some parallelism in later patches.

diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 406c180..af1b66d 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -48,7 +48,8 @@ static void ExecHashSkewTableInsert(HashJoinTable hashtable,
int bucketNumber);
 static void ExecHashRemoveNextSkewBucket(HashJoinTable hashtable);

-static void *dense_alloc(HashJoinTable hashtable, Size size);
+static void *dense_alloc(HashJoinTable hashtable, Size size,
+bool respect_work_mem);

I still dislike this, but maybe Robert's point of:

On 2017-02-16 08:57:21 -0500, Robert Haas wrote:
> On Wed, Feb 15, 2017 at 9:36 PM, Andres Freund  wrote:
> > Isn't it kinda weird to do this from within dense_alloc()?  I mean that
> > dumps a lot of data to disk, frees a bunch of memory and so on - not
> > exactly what "dense_alloc" implies.  Isn't the free()ing part also
> > dangerous, because the caller might actually use some of that memory,
> > like e.g. in ExecHashRemoveNextSkewBucket() or such.  I haven't looked
> > deeply enough to check whether that's an active bug, but it seems like
> > inviting one if not.
>
> I haven't looked at this, but one idea might be to just rename
> dense_alloc() to ExecHashBlahBlahSomething().  If there's a real
> abstraction layer problem here then we should definitely fix it, but
> maybe it's just the angle at which you hold your head.

Is enough.


0003: Scan for unmatched tuples in a hash join one chunk at a time.


@@ -1152,8 +1155,65 @@ bool
 ExecScanHashTableForUnmatched(HashJoinState *hjstate, ExprContext *econtext)
 {
HashJoinTable hashtable = hjstate->hj_HashTable;
-   HashJoinTuple hashTuple = hjstate->hj_CurTuple;
+   HashJoinTuple hashTuple;
+   MinimalTuple tuple;
+
+	/*
+	 * First, process the queue of chunks holding tuples that are in regular
+	 * (non-skew) buckets.
+	 */
+	for (;;)
+	{
+		/* Do we need a new chunk to scan? */
+		if (hashtable->current_chunk == NULL)
+		{
+			/* Have we run out of chunks to scan? */
+			if (hashtable->unmatched_chunks == NULL)
+				break;
+
+			/* Pop the next chunk from the front of the queue. */
+			hashtable->current_chunk = hashtable->unmatched_chunks;
+			hashtable->unmatched_chunks = hashtable->current_chunk->next;
+			hashtable->current_chunk_index = 0;
+		}
+
+		/* Have we reached the end of this chunk yet? */
+		if (hashtable->current_chunk_index >= hashtable->current_chunk->used)
+		{
+			/* Go around again to get the next chunk from the queue. */
+			hashtable->current_chunk = NULL;
+			continue;
+		}
+
+		/* Take the next tuple from this chunk. */
+		hashTuple = (HashJoinTuple)
+			(hashtable->current_chunk->data + hashtable->current_chunk_index);
+		tuple = HJTUPLE_MINTUPLE(hashTuple);
+		hashtable->current_chunk_index +=
+			MAXALIGN(HJTUPLE_OVERHEAD + tuple->t_len);
+
+		/* Is it unmatched? */
+		if (!HeapTupleHeaderHasMatch(tuple))
+		{
+			TupleTableSlot *inntuple;
+
+			/* insert hashtable's tuple into exec slot */
+			inntuple = ExecStoreMinimalTuple(tuple,
+											 hjstate->hj_HashTupleSlot,
+											 false);	/* do not pfree */
+			econtext->ecxt_innertuple = inntuple;
+
+			/* reset context each time (see below for explanation) */
+			ResetExprContext(econtext);
+			return true;
+		}
+	}

I suspect this might actually be slower than the current/old logic,
because the current_chunk tests are repeated every loop. I think
retaining the two loops the previous code had makes sense, i.e. one to
find a relevant chunk, and one to iterate through all tuples in a chunk,
checking for an unmatched one.
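
Schematically, something like this shape (a simplified stand-in for the
real nodeHash.c code: entry_size() and entry_is_unmatched() are
hypothetical helpers, and the state needed to resume between calls is
elided):

#include <stddef.h>

typedef struct Chunk
{
	struct Chunk *next;
	size_t		used;			/* bytes of data[] in use */
	char		data[];
} Chunk;

extern size_t entry_size(const char *entry);
extern int	entry_is_unmatched(const char *entry);

static const char *
scan_for_unmatched(const Chunk *chunks)
{
	/* outer loop: find the next chunk to scan */
	for (const Chunk *c = chunks; c != NULL; c = c->next)
	{
		/* inner loop: walk the tuples of this one chunk */
		for (size_t off = 0; off < c->used; off += entry_size(c->data + off))
		{
			if (entry_is_unmatched(c->data + off))
				return c->data + off;
		}
	}
	return NULL;
}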


Have you run a performance comparison pre/post this patch?  I 

Re: [HACKERS] [NOVICE] opr_charset rule in gram.y

2017-03-07 Thread Neha Khatri
I see it is already addressed in master. Thanks.

Regards,
Neha


Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-07 Thread Peter Eisentraut
I looked over
0001-Add-expression-dependencies-on-composite-type-whole-.patch.  That
seems useful independent of the things you have following.

Even though it is presented more as a preparatory change, it appears to
have noticeable user-facing impact, which we should analyze.  For
example, in the regression test, the command

alter table tt14t drop column f3;

is now rejected, rightly, but the command

alter table tt14t drop column f3 cascade;

ends up dropping the whole view, which might be a bit surprising.  We
don't support dropping columns from views, so there might be no
alternative, but I wonder what else is changing because of this.  I
think that might deserve a bit more analysis and perhaps some test cases.


--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -206,6 +206,9 @@ static bool object_address_present_add_flags(const
ObjectAddress *object,
 static bool stack_address_present_add_flags(const ObjectAddress *object,
int flags,

ObjectAddressStack *stack);
+static void add_type_addresses(Oid typeid, bool include_components,
ObjectAddresses *addrs);
+static void add_type_component_addresses(Oid typeid, ObjectAddresses
*addrs);
+static void add_class_component_addresses(Oid relid, ObjectAddresses
*addrs);
 static void DeleteInitPrivs(const ObjectAddress *object);

For some reason, the function add_object_address() is singular, and
these new functions are plural.  Clearly, they take a plural argument, so
your version seems more correct, but I wonder if we should keep that
consistent.


+* e.g. not to the right thing for column-less
tables.  Note that

Small typo there.  (to -> do)

@@ -1639,6 +1641,9 @@ find_expr_references_walker(Node *node,

add_object_address(OCLASS_PROC, funcexpr->funcid, 0,
   context->addrs);
+   /* dependency on type itself already exists via function */
+   add_type_component_addresses(funcexpr->funcresulttype,
context->addrs);
+
/* fall through to examine arguments */
}
else if (IsA(node, OpExpr))

Why shouldn't the function itself also depend on the components of its
return type?

How are argument types handled?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 6:03 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Mar 7, 2017 at 12:55 PM, Tom Lane  wrote:
>>> Is there a good reason why RELKIND_PARTITIONED_TABLE is 'P' not 'p'?
>
>> I can't muster a lot of outrage about this one way or another.  One
>> possible advantage of 'P' is that there are fewer places where 'P' is
>> mentioned in the source code than 'p'.
>
> Hm, one would hope that the vast majority of code references are neither
> of those, but rather "RELKIND_PARTITIONED_TABLE".  information_schema.sql
> and system_views.sql will need to be gone over carefully, certainly, but
> we shouldn't be hard-coding this anywhere that there's a reasonable
> alternative.

For reasons which must've seemed good to whoever instituted the
policy, pg_dump refers to relkinds using the bare letters rather than
the constants.

(And protocol message types don't even have defined constants.  Uggh.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 5:08 PM, Robert Haas  wrote:
> On Wed, Mar 1, 2017 at 4:18 AM, Robert Haas  wrote:
>> On Tue, Feb 28, 2017 at 7:31 PM, Amit Kapila  wrote:
>>> Yeah, actually those were added later in Enable-WAL-for-Hash* patch,
>>> but I think as this patch is standalone, so we should not remove it
>>> from their existing usage, I have added those back and rebased the
>>> remaining patches.
>>
>> Great, thanks.  0001 looks good to me now, so committed.
>
> Committed 0002.

Here are some initial review thoughts on 0003 based on a first read-through.

+ affected by this setting.  Example include system catalogs.  For

Not grammatical.  Perhaps "affected by this setting, such as system catalogs".

-mark meta page dirty and release buffer content lock and pin
+ mark meta page dirty and write WAL for update of metapage
+ release buffer content lock and pin

Was the change in the indentation level intentional?

+accomodated in the initial set of buckets and it is not unusual to have large

Spelling.

+As deletion involves multiple atomic operations, it is quite possible that
+system crashes after (a) removing tuples from some of the bucket pages
+(b) before clearing the garbage flag (c) before updating the metapage.  If the

Need two commas and an "or": (a) removing tuples from some of the
bucket pages, (b) before clearing the garbage flag, or (c)

+quite possible, that system crashes before completing the operation on

No comma.  Also, "that the system crashes".

+the index will remain bloated and can impact performance of read and

and this can impact

+/*
+ * we need to release and reacquire the lock on overflow buffer to ensure
+ * that standby shouldn't see an intermediate state of it.
+ */
+LockBuffer(ovflbuf, BUFFER_LOCK_UNLOCK);
+LockBuffer(ovflbuf, BUFFER_LOCK_EXCLUSIVE);

I still think this is a bad idea.  Releasing and reacquiring the lock
on the master doesn't prevent the standby from seeing intermediate
states; the comment, if I understand correctly, is just plain wrong.
What it does do is make the code on the master more complicated, since
now the page returned by _hash_addovflpage might theoretically fill up
while you are busy releasing and reacquiring the lock.

+ *Add the tuples (itups) to wbuf in this function, we could do that in the
+ *caller as well.  The advantage of doing it here is we can easily write

Change comma to period.  Then: "We could easily do that in the caller,
but the advantage..."

+ * Initialise the freed overflow page, here we can't complete zeroed the

Don't use British spelling, and don't use a comma to join two
sentences.  Also "here we can't complete zeroed" doesn't seem right; I
don't know what that means.

+ * To replay meta page changes, we can log the entire metapage which
+ * doesn't seem advisable considering size of hash metapage or we can
+ * log the individual updated values which seems doable, but we prefer
+ * to perform exact operations on metapage during replay as are done
+ * during actual operation.  That looks straight forward and has an
+ * advantage of using lesser space in WAL.

This sounds like it is describing two alternatives that were not
selected and then the one that was selected, but I don't understand
the difference between the second non-selected alternative and what
was actually picked.  I'd just write "We need only log the one field
from the metapage that has changed." or maybe drop the comment
altogether.

+XLogEnsureRecordSpace(0, XLR_NORMAL_RDATAS + nitups);

The use of XLR_NORMAL_RDATAS here seems wrong.  Presumably you should
pass the number of rdatas you will actually need, which is some
constant plus nitups.  I think you should just write that constant
here directly, whether or not it happens to be equal to
XLR_NORMAL_RDATAS.

+/*
+ * We need to release and if required reacquire the lock on
+ * rbuf to ensure that standby shouldn't see an intermediate
+ * state of it.  If we don't release the lock, after replay of
+ * XLOG_HASH_MOVE_PAGE_CONTENTS on standby users will
be able to
+ * view the results of partial deletion on rblkno.
+ */
+LockBuffer(rbuf, BUFFER_LOCK_UNLOCK);

If you DO release the lock, this will STILL be true, because what
matters on the standby is what the redo code does.  This code doesn't
even execute on the standby, so how could it possibly affect what
intermediate states are visible there?

+/*
+ * Here, we could have copied the entire metapage as well to WAL
+ * record and restore it as it is during replay.  However the size of
+ * metapage is not small (more than 600 bytes), so recording just the
+ * information required to construct metapage.

Re: [HACKERS] cast result of copyNode()

2017-03-07 Thread Tom Lane
Mark Dilger  writes:
> You appear to be using a #define macro to wrap a function of the same name
> with the code:

> #define copyObject(obj) ((typeof(obj)) copyObject(obj))

> I don't necessarily have a problem with that, but it struck me as a bit odd, 
> and
> grep'ing through the sources, I don't see anywhere else in the project where 
> that
> is done for a function of the same number of arguments, and only one other
> place where it is done at all:

I agree that that seems like a bad idea.  Better to rename the underlying
function --- for one thing, that provides an "out" if some caller doesn't
want this behavior.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-07 Thread Peter Eisentraut
On 3/7/17 12:55, Tom Lane wrote:
> Is there a good reason why RELKIND_PARTITIONED_TABLE is 'P' not 'p'?

I was confused about this too.  If there is no argument against it, I
would favor changing it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for changes to recovery.conf API

2017-03-07 Thread Josh Berkus
On 03/02/2017 07:13 AM, David Steele wrote:
> Hi Simon,
> 
> On 2/25/17 2:43 PM, Simon Riggs wrote:
>> On 25 February 2017 at 13:58, Michael Paquier  
>> wrote:
>>
>>> - trigger_file is removed.
>>> FWIW, my only complain is about the removal of trigger_file, this is
>>> useful to detect a trigger file on a different partition that PGDATA!
>>> Keeping it costs also nothing..
>>
>> Sorry, that is just an error of implementation, not intention. I had
>> it on my list to keep, at your request.
>>
>> New version coming up.
> 
> Do you have an idea when the new version will be available?

Please?  Having yet another PostgreSQL release go by without fixing
recovery.conf would make me very sad.


-- 
Josh Berkus
Containers & Databases Oh My!


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cast result of copyNode()

2017-03-07 Thread Mark Dilger
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Hi Peter,

I like the patch so far, and it passes all the regression tests for me on
both Mac and Linux.  I am very much in favor of having more compiler type
checking, so +1 from me.

You appear to be using a #define macro to wrap a function of the same name
with the code:

#define copyObject(obj) ((typeof(obj)) copyObject(obj))

I don't necessarily have a problem with that, but it struck me as a bit odd, and
grep'ing through the sources, I don't see anywhere else in the project where 
that
is done for a function of the same number of arguments, and only one other
place where it is done at all:

$ find src/ -type f -name "*.h" -or -name "*.c" | xargs cat | perl -e 
'while(<>) { print if (/^#define (\w+)\b.*\b\1\s*\(\b/); }'
#define copyObject(obj) ((typeof(obj)) copyObject(obj))
#define mkdir(a,b)  mkdir(a)

I'm just flagging that for discussion in case anybody thinks it is too magical.
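
For what it's worth, a tiny sketch of what the macro buys (the names below
are hypothetical, and this relies on typeof, so GCC-style compilers):

#include "nodes/parsenodes.h"	/* Query */
#include "nodes/plannodes.h"	/* Plan */

static void
example(Query *original)
{
	Query	   *copy = copyObject(original);	/* fine: cast is inferred */
	Plan	   *oops = copyObject(original);	/* now draws a warning, where
												 * the old void * result used
												 * to convert silently */
	(void) copy;
	(void) oops;
}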
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [NOVICE] opr_charset rule in gram.y

2017-03-07 Thread Neha Khatri
On Tue, Mar 7, 2017 at 4:30 PM, Tom Lane  wrote:

> Neha Khatri  writes:
> > I was going through the grammar rule for Character types in gram.y and
> > found an optional sub rule in it, "opt_charset"
>
> This question seems quite off-topic for pgsql-novice, so I've redirected
> it to pgsql-hackers.
>

Thanks for appropriately categorizing. I was not sure.


> > CharacterWithLength:  character '(' Iconst ')' opt_charset
> > {
> > if (($5 != NULL) && (strcmp($5, "sql_text") != 0))
> > $1 = psprintf("%s_%s", $1, $5);
> >
> > $$ = SystemTypeName($1);
>
> > I would like to understand how opt_charset would get used in real
> > applications. I tried to make use of it but results in failure:
>
> > postgres=# create table testchar (c1 char(5) character set utf8);
> > ERROR:  type "pg_catalog.bpchar_utf8" does not exist
> > LINE 1: create table testchar (c1 char(5) character set utf8);
>
> > There does not seem to be any documentation available about this optional
> > parameter in the documents for Character data type( at least I could not
> > find any).
>
> Some digging in the git history shows that opt_charset was introduced in
> commit f10b63923 (in 1997!), and the current interpretation as
> "typename_charsetname" appeared in commit 3e1beda2c.  Neither commit
> adds any user documentation or even mentions fooling with character type
> declarations in its commit message.  I think that this must have been
> work-in-progress that Tom Lockhart left behind when he dropped out of
> the project circa 2002.
>
> Since there are no datatypes matching the "typename_charsetname" naming
> pattern, and today we'd be unlikely to accept an implementation based on
> that approach even if someone wrote one, my inclination is just to rip out
> this code.  Maybe we could leave behind the no-op case that allows
> "CHARACTER SET SQL_TEXT", but I don't really see the point, since we've
> never documented supporting any such syntax.
>
OK, understood; good historical detail. I will go through
"typename_charsetname" to understand the concept, and then maybe come up
with a cleanup patch.

Thanks and Regards,
Neha


Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-07 Thread Tom Lane
Robert Haas  writes:
> On Tue, Mar 7, 2017 at 12:55 PM, Tom Lane  wrote:
>> Is there a good reason why RELKIND_PARTITIONED_TABLE is 'P' not 'p'?

> I can't muster a lot of outrage about this one way or another.  One
> possible advantage of 'P' is that there are fewer places where 'P' is
> mentioned in the source code than 'p'.

Hm, one would hope that the vast majority of code references are neither
of those, but rather "RELKIND_PARTITIONED_TABLE".  information_schema.sql
and system_views.sql will need to be gone over carefully, certainly, but
we shouldn't be hard-coding this anywhere that there's a reasonable
alternative.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level rollback

2017-03-07 Thread Vladimir Sitnikov
Please disregard my previous message.
pgjdbc is already doing uppercase conversion, so I would like to see a test
case that reproduces the error.

Alternatively, could you please capture and share TRACE log? (
https://jdbc.postgresql.org/documentation/head/logging.html#configuration )

Vladimir

ср, 8 мар. 2017 г. в 1:26, Vladimir Sitnikov :

> legrand> when using version 42.0.0 with
> legrand> jdbc:postgresql://localhost:5432/postgres?autosave=always
>
> The pitfall there is the value should be written with upper case like
> autosave=ALWAYS.
>
> I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at
> some point.
>
>
> Vladimir
>


Re: [HACKERS] Logical replication existing data copy

2017-03-07 Thread Erik Rijkers

On 2017-03-06 11:27, Petr Jelinek wrote:


0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch +
0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch  +
0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch  +
0005-Skip-unnecessary-snapshot-builds.patch+
0001-Logical-replication-support-for-initial-data-copy-v6.patch


I use three different machines (2 desktops, 1 server) to test logical 
replication, and all three have now at least once failed to correctly 
synchronise a pgbench session (amidst many successful runs, of course).


I attach an output-file from the test-program, with the 2 logfiles 
(master+replica) of the failed run.  The outputfile 
(out_20170307_1613.txt) contains the output of 5 runs of 
pgbench_derail2.sh.  The first run failed, the next 4 were ok.


But that's probably not very useful; perhaps pg_waldump is more useful?  
From what moment, or leading up to what moment, or over what period, would 
pg_waldump output be useful?  I can run it from the script, repeatedly, and 
only keep the dumped files when things go awry. Would that make sense?


Any other ideas welcome.


thanks,

Erik Rijkers





20170307_1613.tar.bz2
Description: BZip2 compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level rollback

2017-03-07 Thread Vladimir Sitnikov
legrand> when using version 42.0.0 with
legrand> jdbc:postgresql://localhost:5432/postgres?autosave=always

The pitfall there is the value should be written with upper case like
autosave=ALWAYS.

I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at
some point.


Vladimir


Re: [HACKERS] Statement-level rollback

2017-03-07 Thread legrand legrand
Thanks!

That's very good news!


I'm still receiving the famous

"current transaction is aborted" error

when using version 42.0.0 with

 jdbc:postgresql://localhost:5432/postgres?autosave=always


But I will sort that out with the pgjdbc team ;o)

Regards
PAscal




--
View this message in context: 
http://www.postgresql-archive.org/Statement-level-rollback-tp5946725p5948053.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

Re: [HACKERS] Statement-level rollback

2017-03-07 Thread Dave Cramer
You have to turn it on using the autosave parameter. It's not on by
default, and apparently not documented.

Dave Cramer

da...@postgresintl.com
www.postgresintl.com

On 7 March 2017 at 17:15, legrand legrand 
wrote:

> Thanks!
>
> That's very good news!
>
>
> I'm still receiving the famous
>
> "current transaction is aborted" error
> when using version 42.0.0 with
>
>  jdbc:postgresql://localhost:5432/postgres?autosave=always
>
>
> But I will sort that out with the pgjdbc team ;o)
> Regards
> PAscal
>
> --
> View this message in context: RE: Statement-level rollback
> Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
>


[HACKERS] Example Custom Scan Provider Implementation?

2017-03-07 Thread Eric Ridge
Hi all!

Does anyone know of a simple example Custom Scan Provider implementation
for 9.6+?

I found pg_strom by searching GitHub.  Its gpuscan.c looks like maybe it
implements a pattern similar to what I want to do, but there's a lot of
extraneous (to me) stuff to parse through.

I'm kinda surprised there isn't an example in contrib/, actually.

Thanks for your time!

eric


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2017-03-07 Thread Robert Haas
On Wed, Mar 1, 2017 at 4:18 AM, Robert Haas  wrote:
> On Tue, Feb 28, 2017 at 7:31 PM, Amit Kapila  wrote:
>> Yeah, actually those were added later in Enable-WAL-for-Hash* patch,
>> but I think as this patch is standalone, so we should not remove it
>> from their existing usage, I have added those back and rebased the
>> remaining patches.
>
> Great, thanks.  0001 looks good to me now, so committed.

Committed 0002.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash support for grouping sets

2017-03-07 Thread Mark Dilger
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

On linux/gcc the patch generates a warning in nodeAgg.c that is fairly easy
to fix.  Using -Werror to make catching the error easier, I get:

make[3]: Entering directory `/home/mark/postgresql.clean/src/backend/executor'
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -Werror -I../../../src/include -D_GNU_SOURCE -c -o nodeAgg.o nodeAgg.c -MMD -MP -MF .deps/nodeAgg.Po
cc1: warnings being treated as errors
nodeAgg.c: In function ‘ExecAgg’:
nodeAgg.c:2053: error: ‘result’ may be used uninitialized in this function
make[3]: *** [nodeAgg.o] Error 1
make[3]: Leaving directory `/home/mark/postgresql.clean/src/backend/executor'
make[2]: *** [executor-recursive] Error 2
make[2]: Leaving directory `/home/mark/postgresql.clean/src/backend'
make[1]: *** [all-backend-recurse] Error 2
make[1]: Leaving directory `/home/mark/postgresql.clean/src'
make: *** [all-src-recurse] Error 2


There are two obvious choices to silence the warning:

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index f059560..8959f5b 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -2066,6 +2066,7 @@ ExecAgg(AggState *node)
break;
case AGG_PLAIN:
case AGG_SORTED:
+   default:
result = agg_retrieve_direct(node);
break;
}

which I like better because I don't expect it to create an extra
instruction in the compiled code; but there is also:

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index f059560..eab2605 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -2050,7 +2050,7 @@ lookup_hash_entries(AggState *aggstate)
 TupleTableSlot *
 ExecAgg(AggState *node)
 {
-   TupleTableSlot *result;
+   TupleTableSlot *result = NULL;
 
if (!node->agg_done)
{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash support for grouping sets

2017-03-07 Thread Mark Dilger

> On Mar 7, 2017, at 1:43 PM, Mark Dilger  wrote:
> 
> On my MacBook, `make check-world` gives differences in the contrib modules:

I get the same (or similar -- didn't check) regression failure on CentOS, so 
this
doesn't appear to be MacBook / hardware specific.

Mark Dilger

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash support for grouping sets

2017-03-07 Thread Mark Dilger
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

On my MacBook, `make check-world` gives differences in the contrib modules:

cat contrib/postgres_fdw/regression.diffs
*** /Users/mark/hydra/postgresql.review/contrib/postgres_fdw/expected/postgres_fdw.out	2017-03-03 13:33:47.0 -0800
--- /Users/mark/hydra/postgresql.review/contrib/postgres_fdw/results/postgres_fdw.out	2017-03-07 13:27:56.0 -0800
***************
*** 3148,3163 ****
  -- Grouping sets
  explain (verbose, costs off)
  select c2, sum(c1) from ft1 where c2 < 3 group by rollup(c2) order by 1 nulls last;
!                                             QUERY PLAN
! ---------------------------------------------------------------------------------------------------
!  GroupAggregate
!    Output: c2, sum(c1)
!    Group Key: ft1.c2
!    Group Key: ()
!    ->  Foreign Scan on public.ft1
!          Output: c2, c1
!          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE ((c2 < 3)) ORDER BY c2 ASC NULLS LAST
! (7 rows)
  
  select c2, sum(c1) from ft1 where c2 < 3 group by rollup(c2) order by 1 nulls last;
   c2 |  sum   
--- 3148,3166 ----
  -- Grouping sets
  explain (verbose, costs off)
  select c2, sum(c1) from ft1 where c2 < 3 group by rollup(c2) order by 1 nulls last;
!                           QUERY PLAN
! --------------------------------------------------------------
!  Sort
!    Output: c2, (sum(c1))
!    Sort Key: ft1.c2
!    ->  MixedAggregate
!          Output: c2, sum(c1)
!          Hash Key: ft1.c2
!          Group Key: ()
!          ->  Foreign Scan on public.ft1
!                Output: c2, c1
!                Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE ((c2 < 3))
! (10 rows)
  
  select c2, sum(c1) from ft1 where c2 < 3 group by rollup(c2) order by 1 nulls last;
   c2 |  sum   
***************
*** 3170,3185 ****
  
  explain (verbose, costs off)
  select c2, sum(c1) from ft1 where c2 < 3 group by cube(c2) order by 1 nulls last;
!                                             QUERY PLAN
! ---------------------------------------------------------------------------------------------------
!  GroupAggregate
!    Output: c2, sum(c1)
!    Group Key: ft1.c2
!    Group Key: ()
!    ->  Foreign Scan on public.ft1
!          Output: c2, c1
!          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE ((c2 < 3)) ORDER BY c2 ASC NULLS LAST
! (7 rows)
  
  select c2, sum(c1) from ft1 where c2 < 3 group by cube(c2) order by 1 nulls last;
   c2 |  sum   
--- 3173,3191 ----
  
  explain (verbose, costs off)
  select c2, sum(c1) from ft1 where c2 < 3 group by cube(c2) order by 1 nulls last;
!                           QUERY PLAN
! --------------------------------------------------------------
!  Sort
!    Output: c2, (sum(c1))
!    Sort Key: ft1.c2
!    ->  MixedAggregate
!          Output: c2, sum(c1)
!          Hash Key: ft1.c2
!          Group Key: ()
!          ->  Foreign Scan on public.ft1
!                Output: c2, c1
!                Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE ((c2 < 3))
! (10 rows)
  
  select c2, sum(c1) from ft1 where c2 < 3 group by cube(c2) order by 1 nulls last;
   c2 |  sum   
***************
*** 3192,3211 ****
  
  explain (verbose, costs off)
  select c2, c6, sum(c1) from ft1 where c2 < 3 group by grouping sets(c2, c6) order by 1 nulls last, 2 nulls last;
!                                            QUERY PLAN
! ------------------------------------------------------------------------------------------------------
   Sort
     Output: c2, c6, (sum(c1))
     Sort Key: ft1.c2, ft1.c6
!    ->  GroupAggregate
           Output: c2, c6, sum(c1)
!          Group Key: ft1.c2
!          Sort Key: ft1.c6
!            Group Key: ft1.c6
           ->  Foreign Scan on public.ft1
                 Output: c2, c6, c1
!                Remote SQL: SELECT "C 1", c2, c6 FROM "S 1"."T 1" WHERE ((c2 < 3)) ORDER BY c2 ASC NULLS LAST
! (11 rows)
  
  select c2, c6, sum(c1) from ft1 where c2 < 3 group by grouping sets(c2, c6) order by 1 nulls last, 2 nulls last;
   c2 | c6 |  sum  
--- 3198,3216 ----
  
  explain (verbose, costs off)
  select c2, c6, sum(c1) from ft1 where c2 < 3 group by grouping sets(c2, c6) order by 1 nulls last, 2 nulls last;
!                      QUERY PLAN
! ----------------------------------------------------
   Sort
     Output: c2, c6, (sum(c1))
     Sort Key: ft1.c2, ft1.c6
!    ->  HashAggregate
           Output: c2, c6, sum(c1)
!   

Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-07 Thread Sven R. Kunze

Hi,

about the datetime issue: as far as I know, JSON does not define a 
serialization format for dates and timestamps.


On the other hand, YAML (as a superset of JSON) already supports a 
language-independent date(time) serialization format 
(http://yaml.org/type/timestamp.html).


I haven't had a glance into the SQL/JSON standard yet and a quick search 
didn't reveal anything. However, reading your test case here 
https://github.com/postgrespro/sqljson/blob/5a8a241/src/test/regress/sql/sql_json.sql#L411 
it seems as if you intend to parse all strings in the form of 
"YYYY-MM-DD" as dates. This is problematic in case a string happens to 
look like this but is not intended to be a date.


Just for the sake of completeness: YAML solves this issue by omitting 
the quotation marks around the date string (just as JSON integers have 
no quotations marks around them).


Regards,
Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-07 Thread Andres Freund
Hi,

On 2017-03-07 02:57:30 +1300, Thomas Munro wrote:
> I'm not sure why nodeHashjoin.c is doing raw batchfile read/write
> operations anyway; why not use tuplestore.c for that (as
> tuplestore.c's comments incorrectly say is the case)?

Another reason presumably is that using tuplestores would make it harder
to control the amount of memory used - we do *not* want an extra set of
work_mem used here, right?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level rollback

2017-03-07 Thread Dave Cramer
On 7 March 2017 at 16:18, Michael Banck  wrote:

> On Tue, Mar 07, 2017 at 01:49:29PM -0700, legrand legrand wrote:
> > JDBC has nothing, and developers have to play with savepoints as described
> > http://blog.endpoint.com/2015/02/postgres-onerrorrollback-explained.html
>
> JDBC has it since 9.4.1210 (2016-09-07), unless I am mistaken:
>
> https://github.com/pgjdbc/pgjdbc/commit/adc08d57d2a9726309ea80d574b1db
> 835396c1c8


I thought he meant we have to play with savepoints.

Yes, we do it for you now


Dave Cramer

da...@postgresintl.com
www.postgresintl.com


Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-07 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Mar 3, 2017 at 6:06 PM, Stephen Frost  wrote:
> > * Andres Freund (and...@anarazel.de) wrote:
> >> On 2017-02-28 19:12:03 +0530, Pavan Deolasee wrote:
> >> > Since VM bits are only set during VACUUM which conflicts with CIC on the
> >> > relation lock, I don't see any risk of incorrectly skipping pages that 
> >> > the
> >> > second scan should have scanned.
> >>
> >> I think that's true currently, but it'd also prevent us from doing that
> >> in additional places.  Which, in my opinion, we really should (and I
> >> believe that's realistically achievable).  Thus I really don't want to
> >> base the correctness of CIC - a relatively infrequent operation - on the
> >> assumption that no VM bits can be set concurrenty due to the SUE lock.
> >
> > That sounds like we need a lock or similar mechanism to indicate that
> > CIC depends on the VM not being changed, rather than saying it shouldn't
> > depend on that because it might not always be true that the only other
> > operation (VACUUM) sets them and already acquires a conflicting lock.
> 
> I don't really think that would be a useful approach.  I think what
> Andres is thinking about -- or at least what comes to mind for me --
> is the possibility that heap_page_prune() might someday try to mark
> pages all-visible.  Then it would become something that happens from
> time to time during foreground processing, rather than (as now)
> something that only happens from within a maintenance command.

Right, that's what I thought he was getting at and my general thinking
was that we would need a way to discover if a CIC is ongoing on the
relation and therefore heap_page_prune(), or anything else, would know
that it can't twiddle the bits in the VM due to the ongoing CIC.
Perhaps a lock isn't the right answer there, but it would have to be
some kind of cross-process communication that operates at a relation
level..

Just to be clear, I wasn't thinking that heap_page_prune() or a
foreground process would actually block on the lock, just that it would
try to acquire it if it decided that it could twiddle the VM and if it
wasn't able to immediately then it would just leave it alone and move
on.  I'll admit that I'm not super familiar with the CIC or VM
internals, frankly, so I may be all wet with this entire line of
thinking, but figured I'd at least explain what the idea was.
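
To make that concrete, the shape I have in mind is roughly the following
toy model (the real primitive would presumably be something like lmgr's
ConditionalLockRelationOid; the lock level, the call site, and all names
here are hypothetical):

#include <stdbool.h>
#include <stdio.h>

/* toy stand-in for a conditional (non-blocking) relation lock */
static bool
conditional_lock_relation(unsigned relid)
{
	/* pretend a CIC holds a conflicting lock on relation 42 */
	return relid != 42;
}

static void
maybe_set_all_visible(unsigned relid)
{
	if (conditional_lock_relation(relid))
	{
		printf("relation %u: no CIC in progress, setting VM bit\n", relid);
		/* ... set the all-visible bit, then release the lock ... */
	}
	else
	{
		/* don't block foreground work; skip the VM update this time */
		printf("relation %u: lock unavailable, leaving VM bit alone\n", relid);
	}
}

int
main(void)
{
	maybe_set_all_visible(41);
	maybe_set_all_visible(42);
	return 0;
}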

Thanks!

Stephen




Re: [HACKERS] Small fix to postgresql.conf.sample's comment on max_parallel_workers

2017-03-07 Thread David Rowley
On 8 March 2017 at 09:32, Robert Haas  wrote:
> Committed.

Thanks!

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level rollback

2017-03-07 Thread Michael Banck
On Tue, Mar 07, 2017 at 01:49:29PM -0700, legrand legrand wrote:
> JDBC has nothing, and developers have to play with savepoints as described
> http://blog.endpoint.com/2015/02/postgres-onerrorrollback-explained.html

JDBC has it since 9.4.1210 (2016-09-07), unless I am mistaken:

https://github.com/pgjdbc/pgjdbc/commit/adc08d57d2a9726309ea80d574b1db835396c1c8


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding the optional clause 'AS' in CREATE TRIGGER

2017-03-07 Thread Peter Eisentraut
On 2/7/17 03:11, Okano, Naoki wrote:
> I tried to create a patch for CREATE OR REPLACE TRIGGER.

I have a feeling that this was proposed a few times in the ancient past
but did not go through because of locking issues.  I can't find any
emails about it, though.  Does anyone remember?  Have you thought about
locking issues?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] adding an immutable variant of to_date

2017-03-07 Thread Sven R. Kunze

On 07.03.2017 03:21, Andreas Karlsson wrote:
> 1) I do not think we currently allow setting the locale like this 
> anywhere, so this will introduce a new concept to PostgreSQL. And you 
> will probably need to add support for caching per locale.


Good to know. Could you explain what you mean by "caching per locale"?

> 2) As far as I can tell from reading the code, to_date currently 
> ignores the M suffix which indicates that you want localized month/day 
> names, so I guess that to_date is currently immutable. Maybe it is 
> stable due to the idea that we may want to support the M suffix in the 
> future.


I think that's the case.

> 3) I do not like the to_date function. It is much too forgiving with 
> invalid input. For example 2017-02-30 becomes 2017-03-02.


That's indeed a funny parsing result. Why does to_date perform this kind 
of error-correction?



> Also just ignoring the M suffix in the format string seems pretty bad.

> Personally I would rather see a new date parsing function which is 
> easier to work with, or somehow fix to_date without pissing too many 
> users off, but I have no idea if this is a view shared with the rest 
> of the community.


Neither do I.

Many business applications have to deal with date(time)s. I came to this 
topic from the practical issue of indexing JSON objects.


Regards,
Sven



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level rollback

2017-03-07 Thread legrand legrand
Hello,

EDB's Oracle compatibility provides the edb_stmt_level_tx parameter,
psql uses ON_ERROR_ROLLBACK = 'on',
ODBC has a parameter for this,
JDBC has nothing, and developers have to play with savepoints as described at
http://blog.endpoint.com/2015/02/postgres-onerrorrollback-explained.html

This feature (as a GUC at server level) would be very helpful for Oracle
application migrations.

Regards
PAscal




--
View this message in context: 
http://www.postgresql-archive.org/Statement-level-rollback-tp5946725p5948032.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-07 Thread Robert Haas
On Fri, Mar 3, 2017 at 6:06 PM, Stephen Frost  wrote:
> * Andres Freund (and...@anarazel.de) wrote:
>> On 2017-02-28 19:12:03 +0530, Pavan Deolasee wrote:
>> > Since VM bits are only set during VACUUM which conflicts with CIC on the
>> > relation lock, I don't see any risk of incorrectly skipping pages that the
>> > second scan should have scanned.
>>
>> I think that's true currently, but it'd also prevent us from doing that
>> in additional places.  Which, in my opinion, we really should (and I
>> believe that's realistically achievable).  Thus I really don't want to
>> base the correctness of CIC - a relatively infrequent operation - on the
>> assumption that no VM bits can be set concurrenty due to the SUE lock.
>
> That sounds like we need a lock or similar mechanism to indicate that
> CIC depends on the VM not being changed, rather than saying it shouldn't
> depend on that because it might not always be true that the only other
> operation (VACUUM) sets them and already acquires a conflicting lock.

I don't really think that would be a useful approach.  I think what
Andres is thinking about -- or at least what comes to mind for me --
is the possibility that heap_page_prune() might someday try to mark
pages all-visible.  Then it would become something that happens from
time to time during foreground processing, rather than (as now)
something that only happens from within a maintenance command.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow interrupts on waiting standby

2017-03-07 Thread Peter Eisentraut
On 1/30/17 20:34, Michael Paquier wrote:
> On Fri, Jan 27, 2017 at 10:17 PM, Michael Paquier
>  wrote:
>> Two things I forgot in this patch:
>> - documentation for the new wait event
>> - the string for the wait event or this would show up as "???" in
>> pg_stat_activity.
>> There are no default clauses in the pgstat_get_wait_* routines so my
>> compiler is actually complaining...
> 
> Both things are fixed in the new version attached. I have added as
> well this patch to the next commit fest:
> https://commitfest.postgresql.org/13/977/

I'm not clear now what this patch is intending to accomplish, seeing
that the original issue has already been fixed.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small fix to postgresql.conf.sample's comment on max_parallel_workers

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 7:42 AM, Amit Kapila  wrote:
> On Tue, Mar 7, 2017 at 8:02 AM, David Rowley
>  wrote:
>> On 7 March 2017 at 15:21, Amit Kapila  wrote:
>>> +1.  How about changing the description of
>>> max_parallel_workers_per_gather to "taken from max_worker_processes,
>>> limited by max_parallel_workers"?
>>
>> Thanks for looking.
>>
>> Seems more accurate to say that it's "taken from
>> max_parallel_workers", maybe.
>>
>
> I thought of saying something similar to what we have in the docs; however,
> the way you have written it works for me.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE command progress checker

2017-03-07 Thread Robert Haas
On Wed, Mar 1, 2017 at 1:25 PM, Andres Freund  wrote:
> On 2017-03-01 10:20:41 -0800, David Fetter wrote:
>> On Wed, Mar 01, 2017 at 09:45:40AM -0500, Peter Eisentraut wrote:
>> > On 2/28/17 04:24, vinayak wrote:
>> > > The view provides the information of analyze command progress details as
>> > > follows
>> > > postgres=# \d pg_stat_progress_analyze
>> > >View "pg_catalog.pg_stat_progress_analyze"
>> >
>> > Hmm, do we want a separate "progress" system view for every kind of
>> > command?  What if someone comes up with a progress checker for CREATE
>> > INDEX, REINDEX, CLUSTER, etc.?
>
> I don't think that'd be that bad, otherwise the naming of the fields is
> complicated.

+1.

I suppose if it gets really out of control we might have to rethink,
but 2 is not 50, and having appropriate column names is worth a lot.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 9:07 AM, Rafia Sabih
 wrote:
> I have split the patch into two, one is to allow optimiser to select a
> parallel plan for queries in PL functions
> (pl_parallel_opt_support_v1.patch), wherein CURSOR_OPT_PARALLEL_OK is passed
> at required places.
>
> Next, the patch for allowing execution of such queries in parallel mode,
> that involves infrastructural changes along the lines mentioned upthread
> (pl_parallel_exec_support_v1.patch).

Logically, these patches go in the other order: you have to make the
infrastructure changes first, and then after that you can enable
parallelism in the places that are now safe.

I think any patch that bases the determination of whether it's OK to
use parallel query on the DestReceiver type is unacceptable.  There's
just no logical reason to believe that every place that uses a certain
set of DestReceiver types is OK and others are not.   What matters is
whether the query can ever be executed more than once, and that's got
to be tracked explicitly.

Here's a draft patch showing the sort of thing I have in mind.  I
think it needs more work, but it gives you the idea, I hope.  This is
loosely based on your pl_parallel_exec_support_v1.patch, but what I've
done here is added some flags that carry the information about whether
there will be only one or maybe more than one call to ExecutorRun to a
bunch of relevant places.

I think this might have the effect of disabling parallel query in some
cases where PL/pgsql currently allows it, and I think that may be
necessary.  (We may need to back-patch a different fix into 9.6.)
There are two places where we currently set CURSOR_OPT_PARALLEL_OK in
PL/pgsql: exec_stmt_return_query() sets it when calling
exec_dynquery_with_params(), and exec_run_select() calls it when
calling exec_prepare_plan() if parallelOK is set.  The latter is OK,
because exec_run_select() runs the plan via
SPI_execute_plan_with_paramlist(), which calls _SPI_execute_plan(),
which calls _SPI_pquery().  But the former is broken, because
exec_stmt_return_query() executes the query by calling
SPI_cursor_fetch() with a fetch count of 50, and that calls
_SPI_cursor_operation() which calls PortalRunFetch() -- and of course
each call to PortalRunFetch() is going to cause a separate call to
PortalRunSelect(), resulting in a separate call to ExecutorRun().
Oops.
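
In toy form, the decision ExecutePlan() has to make ends up looking
something like this (names illustrative, not the patch's):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parallelism is only safe when this is guaranteed to be the one and only
 * run of the plan, and the caller did not ask for a partial fetch.
 */
static bool
use_parallel_mode(bool plan_is_parallel, uint64_t count, bool execute_once)
{
	if (!plan_is_parallel)
		return false;
	if (count != 0)			/* partial fetch: caller may re-run the plan */
		return false;
	if (!execute_once)		/* e.g. a cursor that will be fetched again */
		return false;
	return true;
}

int
main(void)
{
	/* SPI_execute_plan-style call: run to completion, exactly once */
	printf("%d\n", use_parallel_mode(true, 0, true));	/* 1 */
	/* SPI_cursor_fetch-style call: 50 rows at a time, multiple runs */
	printf("%d\n", use_parallel_mode(true, 50, false));	/* 0 */
	return 0;
}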

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


execute-once.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-07 Thread Andres Freund
Hi,

On 2017-03-07 12:21:59 +0300, Oleg Bartunov wrote:
> On 2017-03-03 15:49:38 -0500, David Steele wrote:
> > I propose we move this patch to the 2017-07 CF so further development
> > and review can be done without haste and as the standard becomes more
> > accessible.

+1


> I wanted to have one more good feature in 10 and let Postgres be on par
> with other competitors.  SQL/JSON adds many interesting features and users
> will be disappointed if we postpone it for the next two years.  Let's wait
> for reviewers; probably they will find the patch is not very intrusive.

I think it's way too late for a patch of this size for 10. And I
don't think it's fair to a lot of other patches of significant size that
have been submitted way earlier, that also need reviewing resources, to
say that we can just see whether it'll get the required resources.


> We have a plenty of time and we dedicate one full-time developer for
> this project.

How about having that developer, and perhaps others, participate in
reviewing patches and getting to the bottom of the commitfest?  Should
we end up being done early, we can look at this patch...  There's not
been review activity corresponding to the amount of submissions from
pgpro...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New CORRESPONDING clause design

2017-03-07 Thread Pavel Stehule
Hi

I am sending a review of this interesting feature.

I found following issues, questions:

1. unclosed tags in the documentation
2. bad name "changeTargetEntry" - should it be makeTargetEntry?
3. Why did you remove a lot of asserts in prepunion.c? These asserts
should still be valid.
4. make_coresponding_target has wrong formatting
5. the error message "%s queries with a CORRESPONDING clause must have at
least one column with the same name" has wrong formatting; you could show
the error position
6. the previous issue is repeated - please look at the formatting of the
ereport calls; you can use the DETAIL and HINT fields
7. the CORRESPONDING clause should contain a column list (I am looking at
ANSI SQL:1999) - you are using expr_list, which makes no sense and
probably has an impact on the whole implementation.
8. typo: orderCorrespondingLsit(List *targetlist)
9. I would like to see more tests for CORRESPONDING BY
10. if I understand this feature correctly, this query should work:

postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4
a, 5 b, 6 c, 8 d;
ERROR:  each UNION query must have the same number of columns
LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...

postgres=# create table t1(a int, b int, c int);
CREATE TABLE
Time: 63,260 ms
postgres=# create table t2(a int, b int);
CREATE TABLE
Time: 57,120 ms
postgres=# select * from t1 union corresponding select * from t2;
ERROR:  each UNION query must have the same number of columns
LINE 1: select * from t1 union corresponding select * from t2;

If this is your first patch for Postgres, then it is perfect work!

Issue 7 is probably the most significant - I dislike an expression list
there; name_list would be better.

Regards

Pavel


Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 12:55 PM, Tom Lane  wrote:
> Is there a good reason why RELKIND_PARTITIONED_TABLE is 'P' not 'p'?
> It looks rather out of place considering that seven of the eight
> pre-existing relkind codes are lower case.  (And no, I don't especially
> approve of RELKIND_SEQUENCE being 'S' either, but it's far too late to
> change that.)  Also, in typical low-res monospaced fonts, there's nearly
> no difference except vertical alignment between P and p, meaning that in
> something like
>
> regression=# select distinct relkind from pg_class;
>  relkind
> -
>  r
>  t
>  P
>  v
>  m
>  i
>  S
>  c
> (8 rows)
>
> you have to look rather closely even to notice that what you're seeing
> isn't in the case you might expect.
>
> I think we should change this while we still can.

I can't muster a lot of outrage about this one way or another.  One
possible advantage of 'P' is that there are fewer places where 'P' is
mentioned in the source code than 'p'.

[rhaas pgsql]$ git grep "'p'" | wc -l
 293
[rhaas pgsql]$ git grep "'P'" | wc -l
 104

...so it's a little easier to pick out the cases that are talking
about partitioned tables than it would be with a lower case letter.
However, as I say, I don't care very much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-07 Thread Tom Lane
Is there a good reason why RELKIND_PARTITIONED_TABLE is 'P' not 'p'?
It looks rather out of place considering that seven of the eight
pre-existing relkind codes are lower case.  (And no, I don't especially
approve of RELKIND_SEQUENCE being 'S' either, but it's far too late to
change that.)  Also, in typical low-res monospaced fonts, there's nearly
no difference except vertical alignment between P and p, meaning that in
something like

regression=# select distinct relkind from pg_class;
 relkind 
-
 r
 t
 P
 v
 m
 i
 S
 c
(8 rows)

you have to look rather closely even to notice that what you're seeing
isn't in the case you might expect.

I think we should change this while we still can.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-07 Thread Thomas Munro
On Wed, Mar 8, 2017 at 1:58 AM, Masahiko Sawada  wrote:
> On Tue, Mar 7, 2017 at 8:48 PM, Ivan Kartyshov
>  wrote:
>> Rebase done.
>
> Thank you for updating the patch.
>
>>
>> Meanwhile I made some more changes.
>>
>> Changes
>> ===
>> 1) WAITLSN is now implemented as an extension called "pg_waitlsn"
>
> I've read the discussion so far but I didn't see the reason why you've
> changed it to a contrib module. Could you tell me about that? I
> guess this feature would be more useful if provided as a core feature
> and we need to discuss the syntax as Thomas mentioned.

The problem with using functions like pg_waitlsn('LSN' [, timeout in
ms]) instead of new syntax for transaction starting commands like
BEGIN TRANSACTION ... WAIT FOR ... is that it doesn't work for the
higher isolation levels.  In READ COMMITTED it's fine, because every
statement runs with its own snapshot, so when SELECT
pg_waitlsn(some_lsn) returns, the next statement will run with a
snapshot that can see the effects of some_lsn being applied.  But in
REPEATABLE READ and SERIALIZABLE, even though pg_waitlsn(some_lsn)
waits for the LSN to be applied, the next statement will run with the
snapshot from before and never see the transaction you were waiting
for.
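
To see the trap from a client's perspective, consider this libpq sketch,
assuming the patch's pg_waitlsn() function is installed and some table t
exists (both assumptions; error handling omitted):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=postgres");

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "%s", PQerrorMessage(conn));
		return 1;
	}
	PQclear(PQexec(conn, "BEGIN ISOLATION LEVEL REPEATABLE READ"));
	/* the transaction snapshot is taken by the first query ... */
	PQclear(PQexec(conn, "SELECT pg_waitlsn('0/3000000')"));
	/*
	 * ... so even though the LSN has now been replayed, this SELECT still
	 * runs with the snapshot from before the wait and cannot see the
	 * awaited transaction's effects.
	 */
	PQclear(PQexec(conn, "SELECT count(*) FROM t"));
	PQclear(PQexec(conn, "COMMIT"));
	PQfinish(conn);
	return 0;
}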

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] parallel index(-only) scan breaks when run without parallelism

2017-03-07 Thread Robert Haas
Amit, Rafia,

nodeIndexscan.c, unlike nodeSeqscan.c, thinks that a parallel-aware
scan will always be executed in parallel mode.  But that's not true:
an Execute message with a non-zero row count could cause us to abandon
planned parallelism and execute the plan serially.   I believe this
would cause a core dump.  We definitely core dump with the following
small patch, which causes parallelism to always be abandoned:

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index f5cd65d..fc4de48 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1596,8 +1596,7 @@ ExecutePlan(EState *estate,
  * when writing into a relation, because no database changes are allowed
  * in parallel mode.
  */
-if (numberTuples || dest->mydest == DestIntoRel)
-use_parallel_mode = false;
+use_parallel_mode = false;

 if (use_parallel_mode)
 EnterParallelMode();

I believe this defect was introduced by
5262f7a4fc44f651241d2ff1fa688dd664a34874 and that nodeIndexonlyscan.c
has the same defect as of 0414b26bac09379a4cbf1fbd847d1cee2293c5e4.

Please fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Poor memory context performance in large hash joins

2017-03-07 Thread Andres Freund
On 2017-03-07 13:06:39 -0500, Tom Lane wrote:
> Andres Freund  writes:
> >>> On 2017-02-24 18:04:18 -0500, Tom Lane wrote:
>  Concretely, something like the attached.  This passes regression tests
>  but I've not pushed on it any harder than that.
> 
> > I think we should go forward with something like this patch in all
> > branches, and only use Tomas' patch in master, because they're
> > considerably larger.
> 
> While I'm willing to back-patch the proposed patch to some extent,
> I'm a bit hesitant to shove it into all branches, because I don't
> think that usage patterns that create a problem occurred till recently.
> You won't get a whole lot of standalone-block allocations unless you
> have code that allocates many chunks bigger than 8K without applying a
> double-the-request-each-time type of strategy.  And even then, it doesn't
> hurt unless you try to resize those chunks, or delete them in a non-LIFO
> order.
> 
> The hashjoin issue is certainly new, and the reorderbuffer issue can't
> go back further than 9.4.  So my inclination is to patch back to 9.4
> and call it good.

That works for me.  If we find further cases later, we can easily enough
backpatch it then.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Poor memory context performance in large hash joins

2017-03-07 Thread Tom Lane
Andres Freund  writes:
>>> On 2017-02-24 18:04:18 -0500, Tom Lane wrote:
 Concretely, something like the attached.  This passes regression tests
 but I've not pushed on it any harder than that.

> I think we should go forward with something like this patch in all
> branches, and only use Tomas' patch in master, because they're
> considerably larger.

While I'm willing to back-patch the proposed patch to some extent,
I'm a bit hesitant to shove it into all branches, because I don't
think that usage patterns that create a problem occurred till recently.
You won't get a whole lot of standalone-block allocations unless you
have code that allocates many chunks bigger than 8K without applying a
double-the-request-each-time type of strategy.  And even then, it doesn't
hurt unless you try to resize those chunks, or delete them in a non-LIFO
order.
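
For reference, the "double-the-request-each-time type of strategy" I mean
is just geometric buffer growth, e.g. (plain realloc standing in for
repalloc, error handling omitted):

#include <stdlib.h>
#include <string.h>

typedef struct
{
	char	   *data;
	size_t		used;
	size_t		allocated;
} Buffer;

static void
buffer_append(Buffer *buf, const char *bytes, size_t len)
{
	if (buf->used + len > buf->allocated)
	{
		size_t		newsize = buf->allocated ? buf->allocated : 1024;

		while (buf->used + len > newsize)
			newsize *= 2;		/* geometric growth */
		buf->data = realloc(buf->data, newsize);
		buf->allocated = newsize;
	}
	memcpy(buf->data + buf->used, bytes, len);
	buf->used += len;
}

int
main(void)
{
	Buffer		buf = {NULL, 0, 0};

	/* a million bytes arrive, but the chunk is resized only ~10 times */
	for (int i = 0; i < 100000; i++)
		buffer_append(&buf, "0123456789", 10);
	free(buf.data);
	return 0;
}

Each chunk gets resized O(log n) times in total instead of once per
addition, so few oversize chunks are ever reallocated.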

The hashjoin issue is certainly new, and the reorderbuffer issue can't
go back further than 9.4.  So my inclination is to patch back to 9.4
and call it good.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioned tables and relfilenode

2017-03-07 Thread Robert Haas
On Tue, Mar 7, 2017 at 12:11 AM, Ashutosh Bapat
 wrote:
> I see that all the changes by Amit and myself to what was earlier 0003
> patch are now part of 0002 patch. The patch looks ready for committer.

Reviewing 0002:

This patch seems to have falsified the header comments for
expand_inherited_rtentry.

I don't quite understand the need for the change to set_rel_size().
The rte->inh case is handled before reaching the new code, and IIUC
the !rte->inh case should never happen given the changes to
expand_inherited_rtentry.  Or am I confused?  If not, I think we
should just add an Assert() to the "plain relation" case that this is
not RELKIND_PARTITIONED_TABLE (with a comment explaining why it can't
happen).

In inheritance_planner, this comment addition does not seem adequate:

+ * If the parent is a partitioned table, we already set the nominal
+ * relation.

That basically contradicts the previous paragraph.  I think you need
to adjust the previous paragraph instead of adding a new one, probably
in multiple places.  For example, the parenthesized bit now only
applies in the non-partitioned case.

-rel = mtstate->resultRelInfo->ri_RelationDesc;
+nominalRTE = rt_fetch(node->nominalRelation, estate->es_range_table);
+nominalRel = heap_open(nominalRTE->relid, NoLock);

No lock?

Another thing that bothers me about this is that, apparently, the use
of nominalRelation is no longer strictly for EXPLAIN, as the comments
in plannodes.h/relation.h still claim that it is.  I'm not sure how to
adjust that exactly; there's not a lot of room in those comments to
give all the details.  Maybe they should simply say something like /*
Parent RT index */ instead of /* Parent RT index for use of EXPLAIN
*/.  But we can't just go around changing the meaning of things
without updating the comments accordingly.  A related question is
whether we should really be using nominalRelation for this or whether
there's some other way we should be getting at the parent -- I don't
have another idea, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

2017-03-07 Thread Andres Freund
Hi,

On 2017-03-08 00:15:05 +1300, David Rowley wrote:
> -static List *RecoveryLockList;
> +/*
> + * RecoveryLockTable is a poor man's hash table that allows us to partition
> + * the stored locks. Which partition a lock is stored in is determined by the
> + * xid which the lock belongs to. The hash function is very simplistic and
> + * merely performs a binary AND against the final 0-based partition number.
> + * Splitting into partitions in this way avoids having to look through all
> + * locks to find one specific to a given xid.
> + */
> +static List **RecoveryLockTable;
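
For context, the scheme that comment describes boils down to an array of
lists indexed by masking the xid, along these lines; this standalone
sketch uses illustrative names and an illustrative partition count, not
the patch's:

#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

#define N_RECOVERY_LOCK_PARTITIONS 16	/* must be a power of two */

/* one list head of lock entries per partition; void * stands in for List * */
static void *RecoveryLockTable[N_RECOVERY_LOCK_PARTITIONS];

/* the "hash function" is just a mask over the xid's low bits */
static inline unsigned
recovery_lock_partition(TransactionId xid)
{
	return xid & (N_RECOVERY_LOCK_PARTITIONS - 1);
}

int
main(void)
{
	/* releasing all locks of one xid now walks a single short list */
	printf("xid 1234 -> partition %u\n", recovery_lock_partition(1234));
	(void) RecoveryLockTable;
	return 0;
}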

Why are we open coding this? That strikes me as a bad plan...

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal : Parallel Merge Join

2017-03-07 Thread Dilip Kumar
On Tue, Mar 7, 2017 at 10:28 PM, Robert Haas  wrote:
>> Apart from this, there was one problem in match_unsorted_outer (in
>> v10): basically, if inner_cheapest_total was not parallel_safe then I
>> was always getting a parallel-safe inner. But we should not do anything
>> if jointype was JOIN_UNIQUE_INNER, so I fixed that also.
>
> This version looks fine to me, so committed.

Thank you.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >