date:20160316

Re: [HACKERS] Combining Aggregates

2016-03-16 Thread David Rowley

On 16 March 2016 at 23:54, Haribabu Kommi  wrote:
> On Wed, Mar 16, 2016 at 8:34 AM, David Rowley
>  wrote:
>> Yes me too, so I spent several hours yesterday writing all of the
>> combine functions and serialisation/deserialisation that are required
>> for all of SUM(), AVG() STDDEV*(). I also noticed that I had missed
>> using some existing functions for bool_and() and bool_or() so I added
>> those to pg_aggregate.h. I'm just chasing down a crash bug on
>> HAVE_INT128 enabled builds, so should be posting a patch quite soon. I
>> didn't touch the FLOAT4 and FLOAT8 aggregates as I believe Haribabu
>> has a patch for that over on the parallel aggregate thread. I've not
>> looked at it in detail yet.
>
> The additional combine function patch that I posted handles all float4 and
> float8 aggregates. There is an OID conflict with the latest source code,
> I will update the patch and post it in that thread.

Thanks! I just send a series of patches which add a whole host of
serial/deserial functions, and a patch which adds some documentation.
Maybe you could base your patch on the 0005 patch, and update the
documents to remove the "All types apart from floating-point types"
text and replace that with "Yes".

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Combining Aggregates

2016-03-16 Thread Haribabu Kommi

On Wed, Mar 16, 2016 at 8:34 AM, David Rowley
 wrote:
> On 16 March 2016 at 06:39, Tomas Vondra  wrote:
>> After looking at the parallel aggregate patch, I also looked at this one, as
>> it's naturally related. Sadly I haven't found any issue that I could nag
>> about ;-) The patch seems well baked, as it was in the oven for quite a long
>> time already.
>
> Thanks for looking at this.
>
>> The one concern I do have is that it only adds (de)serialize functions for
>> SUM(numeric) and AVG(numeric). I think it's reasonable not to include that
>> into the patch, but it will look a bit silly if that's all that gets into
>> 9.6.
>
> Yes me too, so I spent several hours yesterday writing all of the
> combine functions and serialisation/deserialisation that are required
> for all of SUM(), AVG() STDDEV*(). I also noticed that I had missed
> using some existing functions for bool_and() and bool_or() so I added
> those to pg_aggregate.h. I'm just chasing down a crash bug on
> HAVE_INT128 enabled builds, so should be posting a patch quite soon. I
> didn't touch the FLOAT4 and FLOAT8 aggregates as I believe Haribabu
> has a patch for that over on the parallel aggregate thread. I've not
> looked at it in detail yet.

The additional combine function patch that I posted handles all float4 and
float8 aggregates. There is an OID conflict with the latest source code,
I will update the patch and post it in that thread.

Regards,
Hari Babu
Fujitsu Australia


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Aggregate

2016-03-16 Thread David Rowley

On 16 March 2016 at 15:04, Robert Haas  wrote:
> I don't think I'd be objecting if you made PartialAggref a real
> alternative to Aggref.  But that's not what you've got here.  A
> PartialAggref is just a wrapper around an underlying Aggref that
> changes the interpretation of it - and I think that's not a good idea.
> If you want to have Aggref and PartialAggref as truly parallel node
> types, that seems cool, and possibly better than what you've got here
> now.  Alternatively, Aggref can do everything.  But I don't think we
> should go with this wrapper concept.

Ok, I've now gotten rid of the PartialAggref node, and I'm actually
quite happy with how it turned out. I made
search_indexed_tlist_for_partial_aggref() to follow-on the series of
other search_indexed_tlist_for_* functions and have made it behave the
same way, by returning the newly created Var instead of doing that in
fix_combine_agg_expr_mutator(), as the last version did.

Thanks for the suggestion.

New patch attached.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

0001-Allow-aggregation-to-happen-in-parallel_2016-03-16.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2016-03-16 Thread Shulgin, Oleksandr

On Tue, Mar 15, 2016 at 7:23 PM, David Steele  wrote:

> On 3/3/16 12:16 AM, Haribabu Kommi wrote:
> > On Fri, Feb 5, 2016 at 2:29 PM, Haribabu Kommi 
> wrote:
> >>
> >> This patch needs to be applied on top discard_hba_and_ident_cxt patch
> >> that is posted earlier.
> >
> > Here I attached a re-based patch to the latest head with inclusion of
> > discard_hba_ident_cxt patch for easier review as a single patch.
>
> Alex, Scott, do you have an idea of when you'll be able to review this
> new version?
>

The new version applies with some fuzziness to the current master and
compiles cleanly.

Some comments:

+/* Context to use with hba_line_callback function. */
+typedef struct
+{
+   MemoryContext memcxt;
+   TupleDesc   tupdesc;
+   Tuplestorestate *tuple_store;
+}  HbalineContext;

Rather "with *lookup_hba_line_callback*", as hba_line_callback() is a
generic one.

+ line_number |  mode   | type  | database | user_name |  address  |
  netmask | hostname | method | options |
 reason
+-+-+---+--+---+---+-+--++-+--
+  84 | skipped | local | {all}| {all} |   |
  |  | trust  | {}  |
connection type mismatch
+  86 | skipped | host  | {all}| {all} | 127.0.0.1 |
255.255.255.255 |  | trust  | {}  | IP
address mismatch
+  88 | matched | host  | {all}| {all} | ::1   |
::::::: |  | trust  | {}  |

Hm... now I'm not sure if we really need the "mode" column.  It should be
clear that we skipped every line that had a non-NULL "reason".  I guess we
could remove "mode" and rename "reason" to "skip_reason"?

Still remains an issue of representing special keywords in database and
user_name fields, but there was no consensus about that though.

--
Alex

Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-16 Thread Rahila Syed

>Sorta.  Committed after renaming what you called heap blocks vacuumed
>back to heap blocks scanned, adding heap blocks vacuumed, removing the
>overall progress meter which I don't believe will be anything close to
>accurate, fixing some stylistic stuff, arranging to update multiple
>counters automatically where it could otherwise produce confusion,
>moving the new view near similar ones in the file, reformatting it to
>follow the style of the rest of the file, exposing the counter
>#defines via a header file in case extensions want to use them, and
>overhauling and substantially expanding the documentation

We have following lines,

/* report that everything is scanned and vacuumed */
pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED,
blkno);
pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED,
blkno);


which appear before final vacuum cycle happens for any remaining dead
tuples which may span few pages if I am not mistaken.

IMO, reporting final count of heap_blks_scanned is correct here, but
reporting final heap_blks_vacuumed can happen after the final VACUUM cycle
for more accuracy.

Thank you,
Rahila Syed

Re: [HACKERS] [PATCH] Supporting +-Infinity values by to_timestamp(float8)

2016-03-16 Thread Anastasia Lubennikova


15.03.2016 22:28, David Steele:

On 3/4/16 2:56 PM, Vitaly Burovoy wrote:


On 3/4/16, Anastasia Lubennikova  wrote:


I think that you should update documentation. At least description of
epoch on this page:
http://www.postgresql.org/docs/devel/static/functions-datetime.html

Thank you very much for pointing where it is located (I saw only
"to_timestamp(TEXT, TEXT)").
I'll think how to update it.

Vitaly, have you decided how to update this yet?


3. (nitpicking) I don't sure about "4STAMPS" suffix. "4" is nice
abbreviation, but it seems slightly confusing to me.

It doesn't matter for me what it is called, it is short enough and
reflects a type on which it is applied.
What would the best name be for it?

Anastasia, any suggestions for a better name, or just leave it as is?

I'm not in favor of the "4", either.  I think I would prefer
JULIAN_MAXYEAR_STAMP.



This point is related to another patch 
https://commitfest.postgresql.org/9/540/.

And added to this patch just for compatibility.
If Tom wouldn't change the name of the macros there, I don't see any 
reasons why should we do it in this patch.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Failover Slots

2016-03-16 Thread Craig Ringer

On 15 March 2016 at 21:40, Craig Ringer  wrote:

> Here's a new failover slots rev, addressing the issues Oleksii Kliukin
> raised and adding a bunch of TAP tests.
>

Ahem, just found an issue here. I'll need to send another revision.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] syslog configurable line splitting behavior

2016-03-16 Thread Andreas Karlsson


On 03/16/2016 03:50 AM, Peter Eisentraut wrote:

On 3/8/16 9:12 PM, Andreas Karlsson wrote:

I have one nitpick: why is one of the variables "true" while the other
is "on" in the example? I think both should be "on".

#syslog_sequence_numbers = true
#syslog_split_lines = on

Another possible improvement would be to change "Split messages sent to
syslog." to something more verbose like "Split messages sent to syslog,
by lines and to fit in 1024 bytes.".


Updated patches with your suggestions.  I also renamed
syslog_split_lines to syslog_split_messages, which I think is more accurate.


I think the patch looks good now. I will change the status to "Ready for 
committer".


Andreas



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-03-16 Thread Constantin S. Pan

On Wed, 16 Mar 2016 02:43:47 -0700
Peter Geoghegan  wrote:

> On Wed, Mar 16, 2016 at 2:25 AM, Constantin S. Pan 
> wrote:
> > The backend just waits for the results from the workers and merges
> > them (in case wnum > 0). So the 1-worker configuration should never
> > be used, because it is as sequential as the 0-worker, but adds data
> > transfer.
> 
> This is why I wanted an easy way of atomically guaranteeing some
> number of workers (typically 2), or not using parallelism at all. I
> think the parallel worker API should offer a simple way to do that in
> cases like this, where having only 1 worker is never going to win.

Well, we can check the number of workers actually launched and revert
back to single backend way when there is less than 2 workers. Let me
code that in.

Regards,

Constantin S. Pan
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-03-16 Thread Peter Geoghegan

On Wed, Mar 16, 2016 at 2:25 AM, Constantin S. Pan  wrote:
> The backend just waits for the results from the workers and merges them
> (in case wnum > 0). So the 1-worker configuration should never be used,
> because it is as sequential as the 0-worker, but adds data transfer.

This is why I wanted an easy way of atomically guaranteeing some
number of workers (typically 2), or not using parallelism at all. I
think the parallel worker API should offer a simple way to do that in
cases like this, where having only 1 worker is never going to win.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor bug affecting ON CONFLICT lock wait log messages

2016-03-16 Thread Peter Geoghegan

On Tue, Mar 15, 2016 at 8:31 AM, Stephen Frost  wrote:
>> > We wouldn't want to end up returning different error messages for the
>> > same command under the same conditions just based, which is what we'd
>> > potentially end up doing if we used XLTW_InsertIndexUnique here.
>>
>> Perhaps we need a new value in that enum, so that the context message is
>> specific to the situation at hand?
>
> Maybe, but that would require a new message and new translation and just
> generally doesn't seem like something we'd want to back-patch for a
> bugfix.

Thinking about this again, I think we should use
XLTW_InsertIndexUnique after all. The resemblance of the
check_exclusion_or_unique_constraint() code to the nbtinsert.c code
seems only superficial on second thought. So, I propose fixing the fix
by changing XLTW_InsertIndex to XLTW_InsertIndexUnique.

Basically, unlike with the similar nbtinsert.c code, we're checking
someone else's tuple in the speculative insertion
check_exclusion_or_unique_constraint() case that was changed (or it's
an exclusion constraint, where even the check for our own tuple
happens only after insertion; no change there in any case). Whereas,
in the nbtinsert.c case that I incorrectly duplicated, we're
specifically indicating that we're waiting on *our own* already
physically inserted heap tuple, and say as much in the
XLTW_InsertIndex message that makes it into the log. So, the
fd658dbb300456b393536802d1145a9cea7b25d6 fix is wrong in that we now
indicate that the wait was on our own already-inserted tuple, and not
*someone else's* already-inserted tuple, as it should (we haven't
inserting anything in the first phase of speculative insertion, this
precheck). This code is not a do-over of the check in nbtinsert.c --
rather, the code in nbtinsert.c is a second phase do-over of this code
(where we've physically inserted a heap tuple + index tuple --
"speculative" though that is).

It seems fine to characterize a wait here as "checking the uniqueness
of [somebody else's] tuple", even though technically we're checking
the would-be uniqueness were we to (speculatively, or actually)
insert. However, it does not seem fine to claim ctid_wait as a tuple
we ourselves inserted.

Sorry about that. My confusion came from the fact that historically,
when check_exclusion_or_unique_constraint() was called
check_exclusion_constraint(), it (almost) was our own tuple that was
waited on.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-03-16 Thread Constantin S. Pan

On Wed, 16 Mar 2016 12:14:51 +0530
Amit Kapila  wrote:

> On Wed, Mar 16, 2016 at 5:41 AM, Constantin S. Pan 
> wrote:
> 
> > On Mon, 14 Mar 2016 08:42:26 -0400
> > David Steele  wrote:
> >
> > > On 2/18/16 10:10 AM, Constantin S. Pan wrote:
> > > > On Wed, 17 Feb 2016 23:01:47 +0300
> > > > Oleg Bartunov  wrote:
> > > >
> > > >> My feedback is (Mac OS X 10.11.3)
> > > >>
> > > >> set gin_parallel_workers=2;
> > > >> create index message_body_idx on messages using
> > > >> gin(body_tsvector); LOG:  worker process: parallel worker for
> > > >> PID 5689 (PID 6906) was terminated by signal 11: Segmentation
> > > >> fault
> > > >
> > > > Fixed this, try the new patch. The bug was in incorrect handling
> > > > of some GIN categories.
> > >
> > > Oleg, it looks like Constantin has updated to patch to address the
> > > issue you were seeing.  Do you have time to retest and review?
> > >
> > > Thanks,
> >
> > Actually, there was some progress since. The patch is
> > attached.
> >
> > 1. Added another GUC parameter for changing the amount of
> > shared memory for parallel GIN workers.
> >
> > 2. Changed the way results are merged. It uses shared memory
> > message queue now.
> >
> > 3. Tested on some real data (GIN index on email message body
> > tsvectors). Here are the timings for different values of
> > 'gin_shared_mem' and 'gin_parallel_workers' on a 4-CPU
> > machine. Seems 'gin_shared_mem' has no visible effect.
> >
> > wnum mem(MB) time(s)
> >0  16 247
> >1  16 256
> >
> 
> 
> It seems from you data that with 1 worker, you are always seeing
> slowdown, have you investigated the reason of same?
> 
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com

That slowdown is expected. It slows down because with 1 worker it
has to transfer the results from the worker to the backend.

The backend just waits for the results from the workers and merges them
(in case wnum > 0). So the 1-worker configuration should never be used,
because it is as sequential as the 0-worker, but adds data transfer.

Regards,

Constantin S. Pan
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-16 Thread Fabien COELHO



Hello Alvaro,


If somebody specifies thousands of -f switches, they will waste a few
bytes with each, but I'm hardly concerned about a few dozen kilobytes
there ...


Ok, so you prefer a memory leak. I hate it on principle.


I don't "prefer" memory leaks -- I prefer interfaces that make sense.


C is not designed to return two things, and if it is what is needed it 
looks awkward whatever is done. The static variable trick is dirty, but it 
is the minimal fuss solution, IMO. So we are only trading awkward code 
against awkward code.



Speaking of which, I don't think the arrangement in your patch really
does.  I know I suggested it,


Yep:-)

but now that I look again, it turns out I chose badly and you 
implemented a bad idea, so can we go back and fix it, please?


Yep.

I have very little time available, so I'm trying to minimize the effort. 
I've tried "argue my point with committers", but it has proven very 
ineffective. I've switched to "do whatever is asked if it still works", 
but it is not very effective either.



What I now think should really happen is that the current sql_scripts
array, currently under an anonymous struct, should be a typedef, say
ParsedScript,


Why not.


and get a new member for the weight;


Hm... it already contains "weight".

process_file and process_builtin return a ParsedScript.  The weight and 
Command ** should not be part of script_t at all.


Sure.

In fact, with ParsedScript I don't think we need to give a name to the 
anon struct used for builtin scripts.


It is useful that it has a name so that find_builtin can return it.


Rename the current sql_scripts.name to "desc", to mirror what
is actually put in there from the builtin array struct.  Make addScript
receive a ParsedScript and weight, fill in the weight into the struct,
and put it to the array after sanity-checking.  (I'm OK with keeping
"name" instead of renaming to "desc", if that change becomes too
invasive.)


See attached a v24 & v25.

The awkwardness in v24 is that functions allocate a struct which is freed 
afterwards, really just to return two data. Whether it is better or worst 
than a static is really a matter of taste.


Version v25 results a script which is then passed as an argument, so it 
avoid the dynamic allocation & later free. Maybe it is better. I had to 
cut short the error handling if a file does not exists, though, and it 
passes a struct by value.


Feel free to pick whichever you like most.


No need for N_BUILTIN; we can use lengthof(builtin_script) instead.


Indeed. "lengthof" does not seem to be standard... ok, it is a macro in 
some header file. I really wanted to avoid an ugly sizeof divide hack, but 
as it is hidden elsewhere this is fine.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..dd3fb1d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -262,11 +262,13 @@ pgbench  options  dbname
 
 
  
-  -b scriptname
-  --builtin scriptname
+  -b scriptname[@weight]
+  --builtin=scriptname[@weight]
   

 Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the script.  If not specified, it is set to 1.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
 Unambiguous prefixes of builtin names are accepted.
@@ -322,12 +324,14 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

 Add a transaction script read from filename to
 the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.

   
@@ -687,9 +691,13 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   Pgbench executes test scripts chosen randomly from a specified list.
+   pgbench executes test scripts chosen randomly
+   from a specified list.
They include built-in scripts with -b and
user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
  
 
   
@@ -1194,12 +1202,11 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+latency average = 15.844 ms
+latency stddev = 2.715 ms
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
-SQL script 1: 
- - 1 transactions (100.0% of total, tps = 618.764555)
- - latency average = 15.844 ms
- - latency stddev = 2.715 ms
+script statistics:
  - statement latencies in milliseconds:
 0.004386\set nbr

Re: [HACKERS] POC: Cache data in GetSnapshotData()

2016-03-16 Thread Mithun Cy

On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy 
wrote:
> I will continue to benchmark above tests with much wider range of clients.

Latest Benchmarking shows following results for unlogged tables.

clients BASE ONLY CLOG CHANGES % Increase CLOG CHANGES + SAVE SNAPSHOT %
Increase
1 1198.326337 1328.069656 10.8270439357 1234.078342 2.9834948875
32 37455.181727 38295.250519 2.2428640131 41023.126293 9.5259037641
64 48838.016451 50675.845885 3.7631123611 51662.814319 5.7840143259
88 36878.187766 53173.577363 44.1870671639 56025.454917 51.9203038731
128 35901.537773 52026.024098 44.9130798434 53864.486733 50.0339263281
256 28130.354402 46793.134156 66.3439197647 46817.04602 66.4289235427
Performance diff in 1 client seems just a run to run variance.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] multivariate statistics v14

2016-03-16 Thread Kyotaro HORIGUCHI

Hello, I returned to this.

At Sun, 13 Mar 2016 22:59:38 +0100, Tomas Vondra  
wrote in <1457906378.27231.10.ca...@2ndquadrant.com>
> Oh, yeah. There was an extra pfree().
> 
> Attached is v15 of the patch series, fixing this and also doing quite a
> few additional improvements:
> 
> * added some basic examples into the SGML documentation
> 
> * addressing the objectaddress omissions, as pointed out by Alvaro
> 
> * support for ALTER STATISTICS ... OWNER TO / RENAME / SET SCHEMA
> 
> * significant refactoring of MCV and histogram code, particularly 
>   serialization, deserialization and building
> 
> * reworking the functional dependencies to support more complex 
>   dependencies, with multiple columns as 'conditions'
> 
> * the reduction using functional dependencies is also significantly 
>   simplified (I decided to get rid of computing the transitive closure 
>   for now - it got too complex after the multi-condition dependencies, 
>   so I'll leave that for the future

Many trailing white spaces found.

0002

+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group

 2014 should be 2016? 


 This patch defines many "magic"s for many structs, but
 magic(number)s seems to be used to identify file or buffer page
 in PostgreSQL. They wouldn't be needed if you don't intend to
 dig out or identify the orphan memory blocks of mvstats.

+   MVDependencydeps[1];/* XXX why not a pointer? */

MVDependency seems to be a pointer type. 

+   if (numcols >= MVSTATS_MAX_DIMENSIONS)
+   ereport(ERROR,
and
+   Assert((attrs->dim1 >= 2) && (attrs->dim1 <= 
MVSTATS_MAX_DIMENSIONS));

seem to be contradicting.

.. Sorry, time is up..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Small patch: fix double variable initializations in policy.c

2016-03-16 Thread Aleksander Alekseev

Hello

I discovered a pretty weird code.

policy.c:1083

```
char   *qual_value;
ParseState *qual_pstate = make_parsestate(NULL);

/* parsestate is built just to build the range table */
qual_pstate = make_parsestate(NULL);
```

policy.c:1125

```
char   *with_check_value;
ParseState *with_check_pstate = make_parsestate(NULL);

/* parsestate is built just to build the range table */
with_check_pstate = make_parsestate(NULL);
```

I'm quite sure that there is no need to initialize these variables
twice. First patch fixes this. Also I discovered that policy.c is not
properly pgindent'ed. Second patch fixes this too.

Naturally I checked that after applying these patches PostgreSQL still
compiles and pass all regression tests.

-- 
Best regards,
Aleksander Alekseev
http://eax.me/
diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c
index bb735ac..dfc6f00 100644
--- a/src/backend/commands/policy.c
+++ b/src/backend/commands/policy.c
@@ -1081,10 +1081,8 @@ AlterPolicy(AlterPolicyStmt *stmt)
 		if (!attr_isnull)
 		{
 			char	   *qual_value;
-			ParseState *qual_pstate = make_parsestate(NULL);
-
 			/* parsestate is built just to build the range table */
-			qual_pstate = make_parsestate(NULL);
+			ParseState *qual_pstate = make_parsestate(NULL);
 
 			qual_value = TextDatumGetCString(value_datum);
 			qual = stringToNode(qual_value);
@@ -1122,10 +1120,8 @@ AlterPolicy(AlterPolicyStmt *stmt)
 		if (!attr_isnull)
 		{
 			char	   *with_check_value;
-			ParseState *with_check_pstate = make_parsestate(NULL);
-
 			/* parsestate is built just to build the range table */
-			with_check_pstate = make_parsestate(NULL);
+			ParseState *with_check_pstate = make_parsestate(NULL);
 
 			with_check_value = TextDatumGetCString(value_datum);
 			with_check_qual = stringToNode(with_check_value);
diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c
index dfc6f00..22fa344 100644
--- a/src/backend/commands/policy.c
+++ b/src/backend/commands/policy.c
@@ -496,7 +496,7 @@ RemoveRoleFromObjectPolicy(Oid roleid, Oid classid, Oid policy_id)
 
 	/* Must own relation. */
 	if (pg_class_ownercheck(relid, GetUserId()))
-		noperm = false; /* user is allowed to modify this policy */
+		noperm = false;			/* user is allowed to modify this policy */
 	else
 		ereport(WARNING,
 (errcode(ERRCODE_WARNING_PRIVILEGE_NOT_REVOKED),
@@ -511,15 +511,16 @@ RemoveRoleFromObjectPolicy(Oid roleid, Oid classid, Oid policy_id)
 	 */
 	if (!noperm && num_roles > 0)
 	{
-		int			i, j;
+		int			i,
+	j;
 		Oid		   *roles = (Oid *) ARR_DATA_PTR(policy_roles);
 		Datum	   *role_oids;
 		char	   *qual_value;
 		Node	   *qual_expr;
-		List   *qual_parse_rtable = NIL;
+		List	   *qual_parse_rtable = NIL;
 		char	   *with_check_value;
 		Node	   *with_check_qual;
-		List   *with_check_parse_rtable = NIL;
+		List	   *with_check_parse_rtable = NIL;
 		Datum		values[Natts_pg_policy];
 		bool		isnull[Natts_pg_policy];
 		bool		replaces[Natts_pg_policy];
@@ -536,15 +537,14 @@ RemoveRoleFromObjectPolicy(Oid roleid, Oid classid, Oid policy_id)
 
 		/*
 		 * All of the dependencies will be removed from the policy and then
-		 * re-added.  In order to get them correct, we need to extract out
-		 * the expressions in the policy and construct a parsestate just
-		 * enough to build the range table(s) to then pass to
-		 * recordDependencyOnExpr().
+		 * re-added.  In order to get them correct, we need to extract out the
+		 * expressions in the policy and construct a parsestate just enough to
+		 * build the range table(s) to then pass to recordDependencyOnExpr().
 		 */
 
 		/* Get policy qual, to update dependencies */
 		value_datum = heap_getattr(tuple, Anum_pg_policy_polqual,
-   RelationGetDescr(pg_policy_rel), &attr_isnull);
+			  RelationGetDescr(pg_policy_rel), &attr_isnull);
 		if (!attr_isnull)
 		{
 			ParseState *qual_pstate;
@@ -566,7 +566,7 @@ RemoveRoleFromObjectPolicy(Oid roleid, Oid classid, Oid policy_id)
 
 		/* Get WITH CHECK qual, to update dependencies */
 		value_datum = heap_getattr(tuple, Anum_pg_policy_polwithcheck,
-   RelationGetDescr(pg_policy_rel), &attr_isnull);
+			  RelationGetDescr(pg_policy_rel), &attr_isnull);
 		if (!attr_isnull)
 		{
 			ParseState *with_check_pstate;
@@ -665,7 +665,7 @@ RemoveRoleFromObjectPolicy(Oid roleid, Oid classid, Oid policy_id)
 
 	heap_close(pg_policy_rel, RowExclusiveLock);
 
-	return(noperm || num_roles > 0);
+	return (noperm || num_roles > 0);
 }
 
 /*
@@ -996,8 +996,8 @@ AlterPolicy(AlterPolicyStmt *stmt)
 
 	/* Get policy command */
 	polcmd_datum = heap_getattr(policy_tuple, Anum_pg_policy_polcmd,
-			 RelationGetDescr(pg_policy_rel),
-			 &polcmd_isnull);
+RelationGetDescr(pg_policy_rel),
+&polcmd_isnull);
 	Assert(!polcmd_isnull);
 	polcmd = DatumGetChar(polcmd_datum);
 
@@ -1029,15 +1029,15 @@ AlterPolicy(AlterPolicyStmt *stmt)
 	}
 	else
 	{
-		Oid*roles;
+		Oid		   *roles;
 		Datum

Re: [HACKERS] Choosing parallel_degree

2016-03-16 Thread Julien Rouhaud

On 16/03/2016 05:45, James Sewell wrote:
> On Wed, Mar 16, 2016 at 11:26 AM, Julien Rouhaud
> mailto:julien.rouh...@dalibo.com>>wrote:
> 
> 
> I'm not too familiar with parallel planning, but I tried to implement
> both in attached patch. I didn't put much effort into the
> parallel_threshold GUC documentation, because I didn't really see a good
> way to explain it. I'd e happy to improve it if needed. Also, to make
> this parameter easier to tune for users, perhaps we could divide the
> default value by 3 and use it as is in the first iteration in
> create_parallel_path() ?
> 
> Also, global max_parallel_degree still needs to be at least 1 for the
> per table value to be considered.
> 
> 
> All applies and works from my end.
> 

Thanks for testing!

> Is the max_parallel_degree per table of much use here? It allows the max
> number of workers per table to be set - but it's still bound by the same
> formula (now from the GUC). So in reality it's only really useful for
> limiting the number of workers, not raising it.
> 

You can set a global max_parallel_degree low, and raise it per table. If
you set up max_parallel_degree to 1, you can "activate" parallel workers
for only a subset of tables.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Only try to push down foreign joins if the user mapping OIDs mat

2016-03-16 Thread Ashutosh Bapat

On Wed, Mar 16, 2016 at 2:14 AM, Robert Haas  wrote:

> On Tue, Mar 15, 2016 at 6:44 AM, Ashutosh Bapat
>  wrote:
> > Here's patch which fixes the issue using Robert's idea.
>
> Please at least check your patches with 'git diff --check'


Thanks.


> before
> submitting them.  And where it's not too much trouble, pgindent them.
> Or at least make them look something like what pgindent would have
> produced, instead of having the line lengths be all over the place.
>

Sorry. PFA the patch with git diff --check output fixed.


>
>  -- change the session user to view_owner and execute the statement.
> Because of
>  -- change in session user, the plan should get invalidated and created
> again.
> --- While creating the plan, it should throw error since there is no
> user mapping
> --- available for view_owner.
> +-- The join will not be pushed down since the joining relations do not
> have a
> +-- valid user mapping.
>
> Now what's going on here?  It seems to me that either postgres_fdw
> requires a user mapping (in which case this ought to fail) or it
> doesn't (in which case this ought to push the join down).  I don't
> understand how working but not pushing the join down can ever be the
> right behavior.
>
>
In 9.5, postgres_fdw allowed to prepare statements involving foreign tables
without an associated user mapping as long as planning did not require the
user mapping. Remember, planner would require user mapping in case
use_remote_estimate is ON for that foreign table. The user mapping would be
certainly required at the time of execution. So executing such a prepared
statement would throw error. If somebody created a user mapping between
prepare and execute, execute would not throw an error. A join can be pushed
down only when user mappings associated with the joining relations are
known and they match. But given the behavior of 9.5 we should let the
prepare succeed, even if the join can not be pushed down because of unknown
user mapping. Before this fix, the test was not letting the prepare
succeed, which is not compliant with 9.5.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/contrib/file_fdw/input/file_fdw.source b/contrib/file_fdw/input/file_fdw.source
index 416753d..35db4af 100644
--- a/contrib/file_fdw/input/file_fdw.source
+++ b/contrib/file_fdw/input/file_fdw.source
@@ -166,20 +166,23 @@ DROP TABLE agg;
 SET ROLE file_fdw_superuser;
 SELECT * FROM agg_text ORDER BY a;
 SET ROLE file_fdw_user;
 SELECT * FROM agg_text ORDER BY a;
 SET ROLE no_priv_user;
 SELECT * FROM agg_text ORDER BY a;   -- ERROR
 SET ROLE file_fdw_user;
 \t on
 EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_text WHERE a > 0;
 \t off
+-- file FDW allows foreign tables to be accessed without user mapping
+DROP USER MAPPING FOR file_fdw_user SERVER file_server;
+SELECT * FROM agg_text ORDER BY a;
 
 -- privilege tests for object
 SET ROLE file_fdw_superuser;
 ALTER FOREIGN TABLE agg_text OWNER TO file_fdw_user;
 ALTER FOREIGN TABLE agg_text OPTIONS (SET format 'text');
 SET ROLE file_fdw_user;
 ALTER FOREIGN TABLE agg_text OPTIONS (SET format 'text');
 SET ROLE file_fdw_superuser;
 
 -- cleanup
diff --git a/contrib/file_fdw/output/file_fdw.source b/contrib/file_fdw/output/file_fdw.source
index 8719694..26f4999 100644
--- a/contrib/file_fdw/output/file_fdw.source
+++ b/contrib/file_fdw/output/file_fdw.source
@@ -315,31 +315,41 @@ SELECT * FROM agg_text ORDER BY a;   -- ERROR
 ERROR:  permission denied for relation agg_text
 SET ROLE file_fdw_user;
 \t on
 EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_text WHERE a > 0;
  Foreign Scan on public.agg_text
Output: a, b
Filter: (agg_text.a > 0)
Foreign File: @abs_srcdir@/data/agg.data
 
 \t off
+-- file FDW allows foreign tables to be accessed without user mapping
+DROP USER MAPPING FOR file_fdw_user SERVER file_server;
+SELECT * FROM agg_text ORDER BY a;
+  a  |b
+-+-
+   0 | 0.09561
+  42 |  324.78
+  56 | 7.8
+ 100 |  99.097
+(4 rows)
+
 -- privilege tests for object
 SET ROLE file_fdw_superuser;
 ALTER FOREIGN TABLE agg_text OWNER TO file_fdw_user;
 ALTER FOREIGN TABLE agg_text OPTIONS (SET format 'text');
 SET ROLE file_fdw_user;
 ALTER FOREIGN TABLE agg_text OPTIONS (SET format 'text');
 ERROR:  only superuser can change options of a file_fdw foreign table
 SET ROLE file_fdw_superuser;
 -- cleanup
 RESET ROLE;
 DROP EXTENSION file_fdw CASCADE;
-NOTICE:  drop cascades to 8 other objects
+NOTICE:  drop cascades to 7 other objects
 DETAIL:  drop cascades to server file_server
-drop cascades to user mapping for file_fdw_user on server file_server
 drop cascades to user mapping for file_fdw_superuser on server file_server
 drop cascades to user mapping for no_priv_user on server file_server
 drop cascades to foreign table agg_text
 drop cascades to foreign table agg_csv
 drop cascades to foreign table agg_bad
 drop cascades to foreign table text_csv
 DROP ROLE file_fdw_superuser, file_fdw_user

Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-03-16 Thread Kyotaro HORIGUCHI

It seems to me a matter of definition of "available replicas".

At Wed, 16 Mar 2016 14:13:48 +1300, Thomas Munro 
 wrote in 

> 
>  Synchronous replication offers the ability to confirm that all changes
> -made by a transaction have been transferred to one synchronous standby
> -server. This extends the standard level of durability
> +made by a transaction have been transferred to one or more
> synchronous standby
> +server. This extends that standard level of durability
>  offered by a transaction commit. This level of protection is referred
> -to as 2-safe replication in computer science theory.
> +to as group-safe replication in computer science theory.
> 
> 
> A message on the -general list today pointed me to some earlier
> discussion[1] which quoted and referenced definitions of these
> academic terms[2].  I think the above documentation should say:
> 
> "This level of protection is referred to as 2-safe replication in
> computer science literature when synchronous_commit is
> set to on, and group-1-safe (group-safe and 1-safe) when
> synchronous_commit is set to remote_write."

I suppose that the "available replica" on the paper is equivalent
to "one choosen synchronous server" at the top of the queue of
living standbys specified by s_s_names. The original description
is true based on this interpretation.

> By my reading, the situation doesn't actually change with this patch.
> It doesn't matter whether you need 1 or 42 synchronous standbys to
> make a quorum: 2-safe means durable (fsync) on all of them,
> group-1-safe means durable on one server and received (implied by
> remote_write) by all of them.

Likewise, "the first two of the living standbys" (2[r01, ..r42])
and the master is translated to "three replicas". So it keeps
2-safe for the case.

> I think we should be using those definitions because Gray's earlier
> definition of 2-safe from Transaction Processing 12.6.3 doesn't really
> fit:  It can optionally mean remote receipt or remote durable storage,
> but it doesn't wait if the 'backup' is down, so it's not the same type
> of guarantee.  (He also has 'very safe' which might describe our
> syncrep, I'm not sure.)

If the discussion above is true, the description doesn't seem to
need to be amended in the view of the safe-criteria.

> 
>  Synchronous replication offers the ability to confirm that all changes
> -made by a transaction have been transferred to one synchronous standby
> -server. This extends the standard level of durability
> +made by a transaction have been transferred to one or more synchronous 
> standby
> +server. This extends that standard level of durability
>  offered by a transaction commit. This level of protection is referred
>  to as 2-safe replication in computer science theory.
> 

But some additional explanation might be needed.

For the true quorum commit, a client will be notified when the
master and any n of all standbys have committed. This won't fit
exactly to the criterias in the paper.

In regard to Gray's definition, "2-safe" looks to be PG's syncrep
with automatic release mechanism, such like what pgsql-RA
offers. And "high availability" doesn't seem to fit to
PostgreSQL's behavior because the master virtually commits a
transaction before making an agreement to commit among all of
replicas.

# I'm reading it in Japanese so some words may be incorrect.


Thoughts?

> [1] 
> http://www.postgresql.org/message-id/603c8f070812132142n5408e7ddk899e83cddd4cb...@mail.gmail.com
> [2] http://infoscience.epfl.ch/record/33053/files/EPFL_TH2577.pdf page 76
> 
> On Thu, Mar 10, 2016 at 11:21 PM, Fujii Masao  wrote:
> > On Fri, Mar 4, 2016 at 3:40 AM, Masahiko Sawada  
> > wrote:
> >> On Thu, Mar 3, 2016 at 11:30 PM, Masahiko Sawada  
> >> wrote:
> >>> Hi,
> >>>
> >>> Thank you so much for reviewing this patch!
> >>>
> >>> All review comments regarding document and comment are fixed.
> >>> Attached latest v14 patch.
> >>>
>  This accepts 'abc^Id' as a name, which is wrong behavior (but
>  such appliction names are not allowed anyway. If you assume so,
>  I'd like to see a comment for that.).
> >>>
> >>> 'abc^Id' is accepted as application_name, no?
> >>> postgres(1)=# set application_name to 'abc^Id';
> >>> SET
> >>> postgres(1)=# show application_name ;
> >>>  application_name
> >>> --
> >>>  abc^Id
> >>> (1 row)
> >>>
>  addlit_xd_string(char *ytext) and addlitchar_xd_string(unsigned
>  char ychar) requires differnt character types. Is there any reason
>  for that?
> >>>
> >>> Because addlit_xd_string() is for adding string(char *) to xd_string,
> >>> OTOH addlit_xd_char() is for adding just one character to xd_string.
> >>>
>  I personally don't like addlit*string() things for such simple
>  syntax but itself is acceptble enough for me. However it uses
>  StringInfo to hold double-quoted names, which pallocs 1024 bytes
>  of memory chunk

Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-16 Thread Craig Ringer

Attached is a patch to mark a logical replication slot as dirty if its
confirmed lsn is changed.

Aesthetically I'm not sure if it's better to do it per this patch and call
ReplicationSlotMarkDirty after we release the spinlock, add a new
ReplicationSlotMarkDirtyLocked() that skips the spinlock acquisition, or
add a bool is_locked arg to ReplicationSlotMarkDirty and update all call
sites. All will have the same result.

I've confirmed this works as expected as part of the failover slots test
suite but it'd be pretty seriously cumbersome to extract. If anyone feels
strongly about it I can add a test that shows that

- start master
- create slot
- do some stuff
- replay from slot
- fast-stop master
- start master
- replay from slot

doesn't see the same data a second time, but if we repeat it using
immediate stop it will see the same data when replaying again.

Also attached is another patch to amend the docs to reflect the fact that a
slot can actually replay the same change more than once. I avoided the
strong temptation to update other parts of the docs nearby.

Both these are fixes that should IMO be applied to 9.6.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From e2bda9fa26f9d92059490d2b9ea7c37ae45f81b1 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 16 Mar 2016 15:12:34 +0800
Subject: [PATCH] Dirty replication slots when confirm_lsn is changed

---
 src/backend/replication/logical/logical.c | 62 +--
 1 file changed, 42 insertions(+), 20 deletions(-)

diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 2c7b749..d3fb1a5 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -440,6 +440,7 @@ DecodingContextFindStartpoint(LogicalDecodingContext *ctx)
 	}
 
 	ctx->slot->data.confirmed_flush = ctx->reader->EndRecPtr;
+	ReplicationSlotMarkDirty();
 }
 
 /*
@@ -850,10 +851,15 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)
 	{
 		bool		updated_xmin = false;
 		bool		updated_restart = false;
+		bool		updated_confirm = false;
 
 		SpinLockAcquire(&MyReplicationSlot->mutex);
 
-		MyReplicationSlot->data.confirmed_flush = lsn;
+		if (MyReplicationSlot->data.confirmed_flush != lsn)
+		{
+			MyReplicationSlot->data.confirmed_flush = lsn;
+			updated_confirm = true;
+		}
 
 		/* if were past the location required for bumping xmin, do so */
 		if (MyReplicationSlot->candidate_xmin_lsn != InvalidXLogRecPtr &&
@@ -891,34 +897,50 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)
 
 		SpinLockRelease(&MyReplicationSlot->mutex);
 
-		/* first write new xmin to disk, so we know whats up after a crash */
-		if (updated_xmin || updated_restart)
+		if (updated_xmin || updated_restart || updated_confirm)
 		{
 			ReplicationSlotMarkDirty();
-			ReplicationSlotSave();
-			elog(DEBUG1, "updated xmin: %u restart: %u", updated_xmin, updated_restart);
-		}
 
-		/*
-		 * Now the new xmin is safely on disk, we can let the global value
-		 * advance. We do not take ProcArrayLock or similar since we only
-		 * advance xmin here and there's not much harm done by a concurrent
-		 * computation missing that.
-		 */
-		if (updated_xmin)
-		{
-			SpinLockAcquire(&MyReplicationSlot->mutex);
-			MyReplicationSlot->effective_catalog_xmin = MyReplicationSlot->data.catalog_xmin;
-			SpinLockRelease(&MyReplicationSlot->mutex);
+			/*
+			 * first write new xmin to disk, so we know whats up
+			 * after a crash.
+			 */
+			if (updated_xmin || updated_restart)
+			{
+ReplicationSlotSave();
+elog(DEBUG1, "updated xmin: %u restart: %u", updated_xmin, updated_restart);
+			}
 
-			ReplicationSlotsUpdateRequiredXmin(false);
-			ReplicationSlotsUpdateRequiredLSN();
+			/*
+			 * Now the new xmin is safely on disk, we can let the global value
+			 * advance. We do not take ProcArrayLock or similar since we only
+			 * advance xmin here and there's not much harm done by a concurrent
+			 * computation missing that.
+			 */
+			if (updated_xmin)
+			{
+SpinLockAcquire(&MyReplicationSlot->mutex);
+MyReplicationSlot->effective_catalog_xmin = MyReplicationSlot->data.catalog_xmin;
+SpinLockRelease(&MyReplicationSlot->mutex);
+
+ReplicationSlotsUpdateRequiredXmin(false);
+ReplicationSlotsUpdateRequiredLSN();
+			}
 		}
 	}
 	else
 	{
+		bool dirtied = false;
+
 		SpinLockAcquire(&MyReplicationSlot->mutex);
-		MyReplicationSlot->data.confirmed_flush = lsn;
+		if (MyReplicationSlot->data.confirmed_flush != lsn)
+		{
+			MyReplicationSlot->data.confirmed_flush = lsn;
+			dirtied = true;
+		}
 		SpinLockRelease(&MyReplicationSlot->mutex);
+
+		if (dirtied)
+			ReplicationSlotMarkDirty();
 	}
 }
-- 
2.1.0

From 1f8d9319ef9c934c414cebf1f5223c3b3023bf7f Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 16 Mar 2016 15:45:16 +0800
Subject: [PATCH 2/2] Correct incorrect claim that slots output a change
 "exactly once"

[HACKERS] Odd oid-system-column handling in postgres_fdw

2016-03-16 Thread Etsuro Fujita

Hi,

PG9.5 allows us to add an oid system column to foreign tables, using
ALTER FOREIGN TABLE SET WITH OIDS, but currently, that column reads as
zeroes in postgres_fdw.  That seems to me like a bug.  So, I'd like to
propose to fix that, by retrieving that column from the remote server
when requested.  I'm attaching a proposed patch for that.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 287,298  foreign_expr_walker(Node *node,
  	/* Var belongs to foreign table */
  
  	/*
! 	 * System columns other than ctid should not be sent to
! 	 * the remote, since we don't make any effort to ensure
! 	 * that local and remote values match (tableoid, in
  	 * particular, almost certainly doesn't match).
  	 */
  	if (var->varattno < 0 &&
  		var->varattno != SelfItemPointerAttributeNumber)
  		return false;
  
--- 287,299 
  	/* Var belongs to foreign table */
  
  	/*
! 	 * System columns other than oid and ctid should not be
! 	 * sent to the remote, since we don't make any effort to
! 	 * ensure that local and remote values match (tableoid, in
  	 * particular, almost certainly doesn't match).
  	 */
  	if (var->varattno < 0 &&
+ 		var->varattno != ObjectIdAttributeNumber &&
  		var->varattno != SelfItemPointerAttributeNumber)
  		return false;
  
***
*** 911,919  deparseTargetList(StringInfo buf,
  	}
  
  	/*
! 	 * Add ctid if needed.  We currently don't support retrieving any other
! 	 * system columns.
  	 */
  	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
  	  attrs_used))
  	{
--- 912,936 
  	}
  
  	/*
! 	 * Add oid and ctid if needed.  We currently don't support retrieving any
! 	 * other system columns.
  	 */
+ 	if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber,
+ 	  attrs_used))
+ 	{
+ 		if (!first)
+ 			appendStringInfoString(buf, ", ");
+ 		else if (is_returning)
+ 			appendStringInfoString(buf, " RETURNING ");
+ 		first = false;
+ 
+ 		if (qualify_col)
+ 			ADD_REL_QUALIFIER(buf, rtindex);
+ 		appendStringInfoString(buf, "oid");
+ 
+ 		*retrieved_attrs = lappend_int(*retrieved_attrs,
+ 	   ObjectIdAttributeNumber);
+ 	}
  	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
  	  attrs_used))
  	{
***
*** 1467,1474  deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root,
  {
  	RangeTblEntry *rte;
  
! 	/* varattno can be a whole-row reference, ctid or a regular table column */
! 	if (varattno == SelfItemPointerAttributeNumber)
  	{
  		if (qualify_col)
  			ADD_REL_QUALIFIER(buf, varno);
--- 1484,1500 
  {
  	RangeTblEntry *rte;
  
! 	/*
! 	 * varattno can be an oid, a ctid, a whole-row reference, or a regular
! 	 * table column
! 	 */
! 	if (varattno == ObjectIdAttributeNumber)
! 	{
! 		if (qualify_col)
! 			ADD_REL_QUALIFIER(buf, varno);
! 		appendStringInfoString(buf, "oid");
! 	}
! 	else if (varattno == SelfItemPointerAttributeNumber)
  	{
  		if (qualify_col)
  			ADD_REL_QUALIFIER(buf, varno);
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***
*** 3744,3749  make_tuple_from_result_row(PGresult *res,
--- 3744,3750 
  	TupleDesc	tupdesc;
  	Datum	   *values;
  	bool	   *nulls;
+ 	Oid			oid = InvalidOid;
  	ItemPointer ctid = NULL;
  	ConversionLocation errpos;
  	ErrorContextCallback errcallback;
***
*** 3802,3808  make_tuple_from_result_row(PGresult *res,
  		else
  			valstr = PQgetvalue(res, row, j);
  
! 		/* convert value to internal representation */
  		errpos.cur_attno = i;
  		if (i > 0)
  		{
--- 3803,3813 
  		else
  			valstr = PQgetvalue(res, row, j);
  
! 		/*
! 		 * convert value to internal representation
! 		 *
! 		 * Note: we ignore system columns other than oid and ctid in result
! 		 */
  		errpos.cur_attno = i;
  		if (i > 0)
  		{
***
*** 3815,3823  make_tuple_from_result_row(PGresult *res,
  			  attinmeta->attioparams[i - 1],
  			  attinmeta->atttypmods[i - 1]);
  		}
  		else if (i == SelfItemPointerAttributeNumber)
  		{
! 			/* ctid --- note we ignore any other system column in result */
  			if (valstr != NULL)
  			{
  Datum		datum;
--- 3820,3839 
  			  attinmeta->attioparams[i - 1],
  			  attinmeta->atttypmods[i - 1]);
  		}
+ 		else if (i == ObjectIdAttributeNumber)
+ 		{
+ 			/* oid */
+ 			if (valstr != NULL)
+ 			{
+ Datum		datum;
+ 
+ datum = DirectFunctionCall1(oidin, CStringGetDatum(valstr));
+ oid = DatumGetObjectId(datum);
+ 			}
+ 		}
  		else if (i == SelfItemPointerAttributeNumber)
  		{
! 			/* ctid */
  			if (valstr != NULL)
  			{
  Datum		datum;
***
*** 3849,3854  make_tuple_from_result_row(PGresult *res,
--- 3865,3876 
  	tuple = h

Pushdown target list below gather node (WAS Re: [HACKERS] WIP: Upper planner pathification)

2016-03-16 Thread Amit Kapila

On Wed, Mar 9, 2016 at 11:58 PM, Robert Haas  wrote:
>
> On Wed, Mar 9, 2016 at 12:33 PM, Tom Lane  wrote:
> >
> > Gather is a bit weird, because although it can project (and needs to,
> > per the example of needing to compute a non-parallel-safe function),
> > you would rather push down as much work as possible to the child node;
> > and doing so is semantically OK for parallel-safe functions.  (Pushing
> > functions down past a Sort node, for a counterexample, is not so OK
> > if you are concerned about function evaluation order, or even number
> > of executions.)
> >
> > In the current code structure it would perhaps be reasonable to teach
> > apply_projection_to_path about that --- although this would require
> > logic to separate parallel-safe and non-parallel-safe subexpressions,
> > which doesn't quite seem like something apply_projection_to_path
> > should be doing.
>
> I think for v1 it would be fine to make this all-or-nothing; that's
> what I had in mind to do.  That is, if the entire tlist is
> parallel-safe, push it all down.  If not, let the workers just return
> the necessary Vars and have Gather compute the final tlist.
>

I find it quite convenient to teach apply_projection_to_path() to push down
target-list beneath Gather node, when targetlist contains parallel-safe
expression.  Attached patch implements pushing targetlist beneath gather
node.

Below is output of a simple test which shows the effect of implementation.

Without Patch -

postgres=# explain verbose select c1+2 from t1 where c1<10;
 QUERY PLAN
-
 Gather  (cost=0.00..44420.43 rows=30 width=4)
   Output: (c1 + 2)
   Number of Workers: 2
   ->  Parallel Seq Scan on public.t1  (cost=0.00..44420.35 rows=13 width=4)
 Output: c1
 Filter: (t1.c1 < 10)
(6 rows)

With Patch -
---
postgres=# explain verbose select c1+2 from t1 where c1<10;
 QUERY PLAN
-
 Gather  (cost=0.00..45063.75 rows=30 width=4)
   Output: ((c1 + 2))
   Number of Workers: 1
   ->  Parallel Seq Scan on public.t1  (cost=0.00..45063.68 rows=18 width=4)
 Output: (c1 + 2)
 Filter: (t1.c1 < 10)
(6 rows)


In the above plans, you can notice that target list expression (c1 + 2) is
pushed beneath Gather node after patch.

Thoughts?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


parallel-tlist-pushdown-v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Combining Aggregates

Re: [HACKERS] Combining Aggregates

Re: [HACKERS] Parallel Aggregate

Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

Re: [HACKERS] [PATCH] Supporting +-Infinity values by to_timestamp(float8)

Re: [HACKERS] WIP: Failover Slots

Re: [HACKERS] syslog configurable line splitting behavior

Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

Re: [HACKERS] Minor bug affecting ON CONFLICT lock wait log messages

Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

Re: [HACKERS] pgbench stats per script & other stuff

Re: [HACKERS] POC: Cache data in GetSnapshotData()

Re: [HACKERS] multivariate statistics v14

[HACKERS] Small patch: fix double variable initializations in policy.c

Re: [HACKERS] Choosing parallel_degree

Re: [HACKERS] Re: [COMMITTERS] pgsql: Only try to push down foreign joins if the user mapping OIDs mat

Re: [HACKERS] Support for N synchronous standby servers - take 2

Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

[HACKERS] Odd oid-system-column handling in postgres_fdw

Pushdown target list below gather node (WAS Re: [HACKERS] WIP: Upper planner pathification)

22 matches

Site Navigation

Mail list logo

Footer information