Re: [HACKERS] Tsvector editing functions

2015-12-14 Thread Tomas Vondra

Hi,

On 12/07/2015 03:05 PM, Stas Kelvich wrote:

Hello.

Done with the list of suggestions. I also heavily edited the delete function.



I did a quick review of the updated patch. I'm not a tsvector-expert so 
hopefully my comments won't be entirely bogus.


1) It's a bit difficult to judge the usefulness of the API, as I've
   always been a mere user of full-text search, and I never had a need
   (or courage) to mess with the tsvectors. OTOH I don't see a good
   reason not to have such an API when there's a need for it.

   The API seems to be reasonably complete, with one exception - when
   looking at editing function we have for 'hstore', we do have these
   variants for delete()

  delete(hstore,text)
  delete(hstore,text[])
  delete(hstore,hstore)

   while this patch only adds delete(tsvector,text). Would it make
   sense to add variants similar to hstore? It probably does not make
   much sense to add delete(tsvector,tsvector), right? But being able
   to delete a bunch of lexemes in one go seems like a good thing.

   What do you think?
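   For illustration, the array variant could be little more than a loop
   around single-lexeme deletion. A rough sketch (both tsvector_delete_arr
   and the tsvector_delete_lexeme helper are hypothetical names, not
   something from the patch):

   Datum
   tsvector_delete_arr(PG_FUNCTION_ARGS)
   {
       TSVector    tsin = PG_GETARG_TSVECTOR(0);
       ArrayType  *lexemes = PG_GETARG_ARRAYTYPE_P(1);
       Datum      *dlexemes;
       bool       *nulls;
       int         nitems;
       int         i;

       deconstruct_array(lexemes, TEXTOID, -1, false, 'i',
                         &dlexemes, &nulls, &nitems);

       for (i = 0; i < nitems; i++)
       {
           if (nulls[i])
               continue;       /* or error out, to be decided */
           tsin = tsvector_delete_lexeme(tsin, DatumGetTextPP(dlexemes[i]));
       }

       PG_RETURN_TSVECTOR(tsin);
   }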


2) tsvector_op.c needs a bit of love, to eliminate the two warnings it
   currently triggers:

tsvector_op.c:211:2: warning: ISO C90 forbids mixed ...
tsvector_op.c:635:9: warning: variable ‘lexeme2copy’ set but ...

3) the patch also touches tsvector_setweight(), only to change:

  elog(ERROR, "unrecognized weight: %d", cw);

   to

  elog(ERROR, "unrecognized weight: %c", cw);

   That should probably be committed separately, as a bugfix.


4) I find it rather annoying that there are pretty much no comments in
   the code. Granted, there are pretty much no comments in the
   surrounding code, but I doubt that's a good reason for not having
   any comments in new code. It makes reviews unnecessarily difficult.


5) tsvector_concat() is not mentioned in docs at all


6) Docs don't mention names of the new parameters in function
   signatures, just data types. The functions with a single parameter
   probably don't need to do that, but multi-parameter ones should.

7) Some of the functions use an indexterm that does not match the function
   name. I see two such cases - to_tsvector and setweight. Is there a
   reason for that?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Function and view to retrieve WAL receiver status

2015-12-14 Thread Michael Paquier
On Tue, Dec 15, 2015 at 5:27 AM, Gurjeet Singh wrote:
> The function, maybe. But emitting an all-nulls row from a view seems
> counter-intuitive, at least when looking at it in context of relational
> database.

OK, noted. Any other opinions?
-- 
Michael




Re: [HACKERS] Using quicksort for every external sort run

2015-12-14 Thread Greg Stark
I ran sorts with various parameters on my small NAS server. This is a
fairly slow CPU and limited memory machine with lots of disk so I thought
it would actually make a good test case for smaller servers. The following
is the runtime relative to HEAD, i.e. a speedup (values < 100%) or slowdown
(values > 100%), for the first patch only, the "quicksort all runs" patch
without the extra memory optimizations.

At first glance there's a clear pattern: the extra runs do cause a
slowdown whenever they force more polyphase merges, which is bad news. But
on further inspection, look just how low work_mem had to be to have a
significant effect. Only the 4MB and 8MB work_mem cases were significantly
impacted, and only when sorting over a GB of data (which was 2.7 - 7GB with
the tuple overhead). The savings when work_mem was 64MB or 128MB were
substantial.

(columns are work_mem settings)

Table Size  Sort Size  128MB  64MB  32MB  16MB  8MB   4MB
6914MB      2672 MB    64%    70%   93%   110%  133%  137%
3457MB      1336 MB    64%    67%   90%   92%   137%  120%
2765MB      1069 MB    68%    66%   84%   95%   111%  137%
1383MB      535 MB     66%    70%   72%   92%   99%   96%
691MB       267 MB     65%    69%   70%   86%   99%   98%
346MB       134 MB     65%    69%   73%   67%   90%   87%

The raw numbers in seconds. I've only run the test once so far on the NAS
and there are some other things running on it so I really should rerun it a
few more times at least.

HEAD:
Table Size  Sort Size  128MB    64MB    32MB     16MB     8MB      4MB
6914MB      2672 MB    1068.07  963.23  1041.94  1246.54  1654.35  2472.79
3457MB      1336 MB    529.34   482.3   450.77   555.76   657.34   1027.57
2765MB      1069 MB    404.02   394.36  348.31   414.48   507.38   657.17
1383MB      535 MB     196.48   194.26  173.48   182.57   214.42   258.05
691MB       267 MB     95.93    93.79   87.73    80.4     93.67    105.24
346MB       134 MB     45.6     44.24   42.39    44.22    46.17    49.85

With the quicksort patch:
Table Size  Sort Size  128MB   64MB    32MB    16MB    8MB     4MB
6914MB      2672 MB    683.6   679.0   969.4   1366.2  2193.6  3379.3
3457MB      1336 MB    339.1   325.1   404.9   509.8   902.2   1229.1
2765MB      1069 MB    275.3   260.1   292.4   395.4   561.9   898.7
1383MB      535 MB     129.9   136.4   124.6   167.5   213.2   247.1
691MB       267 MB     62.3    64.3    61.4    69.2    92.3    103.2
346MB       134 MB     29.8    30.7    30.9    29.4    41.6    43.4

(For example, 683.6s here against 1068.07s on HEAD gives the 64% in the
first table.)


Re: [HACKERS] Patch: Optimize memory allocation in function 'bringetbitmap'

2015-12-14 Thread Tomas Vondra

Hi,

On 10/16/2015 08:00 PM, Jinyu Zhang wrote:


Updated the patch_brin_optimze_mem patch according to your comment.


I've reviewed the last version of the patch, and I do have a few 
comments. I haven't done any performance testing at this point, as the 
machine I'm using for that is chewing on something else at the moment.


1) bringetbitmap(PG_FUNCTION_ARGS)

+ /*
+  * Estimate the approximate size of btup and allocate memory for btup.
+  */
+ btupInitSize = bdesc->bd_tupdesc->natts * 16;
+ btup = palloc(btupInitSize);

What is the reasoning behind this estimate? I mean, this only accounts 
for the values, not the NULL bitmap or BrinTuple fields. Maybe it does 
not really matter much, but I'd like to see some brief description in 
the docs, please.


This also interacts with our memory allocator in a rather annoying way, 
because palloc() always allocates memory in 2^k chunks, but sadly the 
estimate ignores that. So for natts=3 we get btupInitSize=48, but 
internally allocate 64B (and ignore the top 16B).
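
One way to avoid wasting that slack (a sketch only, not something the
patch currently does) is to round the estimate up to the next power of
two:

    Size        want = bdesc->bd_tupdesc->natts * 16;
    Size        btupInitSize = 16;

    /* round up to a power of two, so the whole palloc chunk is usable */
    while (btupInitSize < want)
        btupInitSize *= 2;

    btup = palloc(btupInitSize);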



2) bringetbitmap(PG_FUNCTION_ARGS)

    if (tup)
    {
        if (size <= btupInitSize)
            memcpy(btup, tup, size);
        else
            btup = brin_copy_tuple(tup, size);
        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    }

Is there any particular reason why not to update btupInitSize to the 
size value? I mean, if the estimate is too low, then we'll call 
brin_copy_tuple() for all BRIN tuples, no?


In other words, I think the code should do this:

    if (tup)
    {
        if (size <= btupInitSize)
            memcpy(btup, tup, size);
        else
        {
            btup = brin_copy_tuple(tup, size);
            btupInitSize = size;
        }
        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    }

Another option might be the usual "doubling" strategy, i.e. do something 
like this instead:


    if (tup)
    {
        while (btupInitSize < size)
            btupInitSize *= 2;

        btup = repalloc(btup, btupInitSize);
        memcpy(btup, tup, size);

        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    }

Possibly with only doing the repalloc when actually needed.
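
That is, something like this (again just a sketch):

    if (tup)
    {
        if (size > btupInitSize)
        {
            /* grow geometrically, and only when actually needed */
            while (btupInitSize < size)
                btupInitSize *= 2;
            btup = repalloc(btup, btupInitSize);
        }
        memcpy(btup, tup, size);
        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    }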


3) brin_new_memtuple(...)

The changes in this function seem wrong to me. Not only have you removed 
the code that initialized all the bv_* attributes, you're also using 
palloc(), so the memory can easily contain random data. This might be OK 
for our code as we call brin_memtuple_initialize() right after this 
function, and that seems to do the init, but AFAIK brin_new_memtuple() is 
part of our public API, as it's in a header file. And these changes surely 
break the contract documented in the comment above the function:


 * Create a new BrinMemTuple from scratch, and initialize it to an empty
 * state.

So I think this is wrong and needs to be reverted.

It's possible that we'll immediately reset the attributes, but that's a 
negligible cost - we expect to call brin_memtuple_initialize() many times 
per tuple, so it should not make any difference.


Another thing is that the three arrays (bt_values, bt_allnulls and 
bt_hasnulls, say with nstored elements each) may be allocated as a single 
chunk and then sliced:


    char       *ptr = palloc0(MAXALIGN(sizeof(Datum) * nstored) +
                              MAXALIGN(sizeof(bool) * nstored) * 2);

    bt_values = (Datum *) ptr;
    ptr += MAXALIGN(sizeof(Datum) * nstored);
    bt_allnulls = (bool *) ptr;
    ptr += MAXALIGN(sizeof(bool) * nstored);
    bt_hasnulls = (bool *) ptr;

which is essentially what the original code did with bv_values. This 
reduces palloc() overhead and fragmentation.



4) brin_memtuple_initialize(...)

It's possible that the new code needs to reset more fields, but I don't 
really see why it should mess with dtuple->bt_columns[i].bv_values for 
example.



5) brin_deform_tuple(..)

The comment before the function probably should explain the purpose of 
the new parameter (that it can either be NULL or an existing tuple).
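
   Something along these lines, perhaps (wording and the dMemtuple
   parameter name are purely my suggestion):

    /*
     * brin_deform_tuple
     *      Convert a BrinTuple into a BrinMemTuple.
     *
     * dMemtuple, if not NULL, is an existing BrinMemTuple that is reset
     * and reused instead of allocating a new one.
     */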



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Fixing warnings in back branches?

2015-12-14 Thread Andres Freund
Hi,

While newer branches are at the moment mostly free of warnings for me,
the picture is entirely different in the older back branches, especially
9.1. That has several hundred lines of warnings.

I personally am not bothered by a handful of spurious warnings in the
back branches, but at this point you're very unlikely to see a new
warning introduced into 9.1 while backpatching.

Should we perhaps fix the worst offenders?

Greetings,

Andres Freund




Re: [HACKERS] psql tab completion bug for ALL IN TABLESPACE

2015-12-14 Thread Andres Freund
On 2015-12-14 12:18:27 +0100, Andres Freund wrote:
> On 2015-12-12 21:04:31 +0900, Michael Paquier wrote:
> > Hi all,
> > 
> > I just bumped into the following thing while looking again at Thomas'
> > patch for psql tab completion:
> > --- a/src/bin/psql/tab-complete.c
> > +++ b/src/bin/psql/tab-complete.c
> > @@ -1040,7 +1040,7 @@ psql_completion(const char *text, int start, int end)
> >  pg_strcasecmp(prev5_wd, "IN") == 0 &&
> >  pg_strcasecmp(prev4_wd, "TABLESPACE") == 0 &&
> >  pg_strcasecmp(prev2_wd, "OWNED") == 0 &&
> > -pg_strcasecmp(prev4_wd, "BY") == 0)
> > +pg_strcasecmp(prev_wd, "BY") == 0)
> > {
> > COMPLETE_WITH_QUERY(Query_for_list_of_roles);
> > This should be backpatched, attached is the needed patch.
> 
> Hm, this seems to need slightly more expansive surgery.
> 
> Trivially the comments for ALL IN TABLESPACE seem broken/badly copy pasted:
>  /* ALTER TABLE,INDEX,MATERIALIZED VIEW xxx ALL IN TABLESPACE xxx */
> the first xxx doesn't make sense.
> 
> Secondly the OWNED BY completion then breaks the SET TABLESPACE
> completion. That's maybe not an outright bug, but seems odd nonetheless.
> 
> Fujii, Stephen?

I'm off to do something else for a while, but attached is where I
stopped at.
>From 0b7fe71b3c27abf97c1a2fdefdd10c1e71d3eba7 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 14 Dec 2015 12:06:59 +0100
Subject: [PATCH] Fix tab completion for ALTER ... TABLESPACE ... OWNED BY.

This was introduced broken, in b2de2a.

Author: Michael Paquier
Discussion: CAB7nPqSHDdSwsJqX0d2XzjqOHr==HdWiubCi4L=zs7yftun...@mail.gmail.com
Backpatch: 9.4, like the commit introducing the bug
---
 src/bin/psql/tab-complete.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 8c48881..85f843e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1025,7 +1025,7 @@ psql_completion(const char *text, int start, int end)
 
 		COMPLETE_WITH_LIST(list_ALTER);
 	}
-	/* ALTER TABLE,INDEX,MATERIALIZED VIEW xxx ALL IN TABLESPACE xxx */
+	/* ALTER TABLE,INDEX,MATERIALIZED VIEW ALL IN TABLESPACE xxx */
 	else if (pg_strcasecmp(prev4_wd, "ALL") == 0 &&
 			 pg_strcasecmp(prev3_wd, "IN") == 0 &&
 			 pg_strcasecmp(prev2_wd, "TABLESPACE") == 0)
@@ -1035,15 +1035,24 @@ psql_completion(const char *text, int start, int end)
 
 		COMPLETE_WITH_LIST(list_ALTERALLINTSPC);
 	}
-	/* ALTER TABLE,INDEX,MATERIALIZED VIEW xxx ALL IN TABLESPACE xxx OWNED BY */
+	/* ALTER TABLE,INDEX,MATERIALIZED VIEW ALL IN TABLESPACE xxx OWNED BY */
 	else if (pg_strcasecmp(prev6_wd, "ALL") == 0 &&
 			 pg_strcasecmp(prev5_wd, "IN") == 0 &&
 			 pg_strcasecmp(prev4_wd, "TABLESPACE") == 0 &&
 			 pg_strcasecmp(prev2_wd, "OWNED") == 0 &&
-			 pg_strcasecmp(prev4_wd, "BY") == 0)
+			 pg_strcasecmp(prev_wd, "BY") == 0)
 	{
 		COMPLETE_WITH_QUERY(Query_for_list_of_roles);
 	}
+	/* ALTER TABLE,INDEX,MATERIALIZED VIEW ALL IN TABLESPACE xxx OWNED BY xxx */
+	else if (pg_strcasecmp(prev7_wd, "ALL") == 0 &&
+			 pg_strcasecmp(prev6_wd, "IN") == 0 &&
+			 pg_strcasecmp(prev5_wd, "TABLESPACE") == 0 &&
+			 pg_strcasecmp(prev3_wd, "OWNED") == 0 &&
+			 pg_strcasecmp(prev2_wd, "BY") == 0)
+	{
+		COMPLETE_WITH_CONST("SET TABLESPACE");
+	}
 	/* ALTER AGGREGATE,FUNCTION  */
 	else if (pg_strcasecmp(prev3_wd, "ALTER") == 0 &&
 			 (pg_strcasecmp(prev2_wd, "AGGREGATE") == 0 ||
-- 
2.5.0.400.gff86faf.dirty




Re: [HACKERS] psql tab completion bug for ALL IN TABLESPACE

2015-12-14 Thread Michael Paquier
On Mon, Dec 14, 2015 at 8:18 PM, Andres Freund  wrote:
> On 2015-12-12 21:04:31 +0900, Michael Paquier wrote:
>> Hi all,
>>
>> I just bumped into the following thing while looking again at Thomas'
>> patch for psql tab completion:
>> --- a/src/bin/psql/tab-complete.c
>> +++ b/src/bin/psql/tab-complete.c
>> @@ -1040,7 +1040,7 @@ psql_completion(const char *text, int start, int end)
>>  pg_strcasecmp(prev5_wd, "IN") == 0 &&
>>  pg_strcasecmp(prev4_wd, "TABLESPACE") == 0 &&
>>  pg_strcasecmp(prev2_wd, "OWNED") == 0 &&
>> -pg_strcasecmp(prev4_wd, "BY") == 0)
>> +pg_strcasecmp(prev_wd, "BY") == 0)
>> {
>> COMPLETE_WITH_QUERY(Query_for_list_of_roles);
>> This should be backpatched, attached is the needed patch.
>
> Hm, this seems to need slightly more expansive surgery.
>
> Trivially the comments for ALL IN TABLESPACE seem broken/badly copy pasted:
>  /* ALTER TABLE,INDEX,MATERIALIZED VIEW xxx ALL IN TABLESPACE xxx */
> the first xxx doesn't make sense.

Indeed.

> Secondly the OWNED BY completion then breaks the SET TABLESPACE
> completion. That's maybe not an outright bug, but seems odd nonetheless.

You mean that this code should do the completion of SET TABLESPACE
after "OWNED BY xxx" has been typed? Are you sure it's worth going
this far?
-- 
Michael




Re: [HACKERS] Proposal: custom compression methods

2015-12-14 Thread Simon Riggs
On 13 December 2015 at 17:28, Alexander Korotkov 
wrote:


> it would be nice to make compression methods pluggable.
>

Agreed.

My thinking is that this should be combined with work to make use of the
compressed data, which is why Alvaro, Tomas and David have been working on
the column store API for about 18 months; work on that continues, with
more submissions due for 9.6.

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] psql tab completion bug for ALL IN TABLESPACE

2015-12-14 Thread Andres Freund
On 2015-12-14 20:58:21 +0900, Michael Paquier wrote:
> On Mon, Dec 14, 2015 at 8:49 PM, Andres Freund  wrote:
> > On 2015-12-14 20:44:20 +0900, Michael Paquier wrote:
> >> + /*
> >> +  * ALTER TABLE,INDEX,MATERIALIZED VIEW ALL IN TABLESPACE xxx OWNED BY xxx
> >> +  * SET TABLESPACE.
> >> +  */
> >> + else if (pg_strcasecmp(prev9_wd, "ALL") == 0 &&
> >> +  pg_strcasecmp(prev8_wd, "IN") == 0 &&
> >> +  pg_strcasecmp(prev7_wd, "TABLESPACE") == 0 &&
> >> +  pg_strcasecmp(prev5_wd, "OWNED") == 0 &&
> >> +  pg_strcasecmp(prev4_wd, "BY") == 0 &&
> >> +  pg_strcasecmp(prev2_wd, "SET") == 0 &&
> >> +  pg_strcasecmp(prev_wd, "TABLESPACE") == 0)
> >> + {
> >> + COMPLETE_WITH_QUERY(Query_for_list_of_tablespaces);
> >> + }
> >
> > Isn't that already handled by the normal SET TABLESPACE case?
> 
No, there is no SET TABLESPACE case; there is a TABLE SET TABLESPACE
one though. Just removing the TABLE seems to be fine.

ALTER TABLE ALL IN TABLESPACE pg_default OWNED BY andres SET TABLESPACE 
works, because of

/*
 * Finally, we look through the list of "things", such as TABLE, INDEX and
 * check if that was the previous word. If so, execute the query to get a
 * list of them.
 */
else
{
    int         i;

    for (i = 0; words_after_create[i].name; i++)
    {
        if (pg_strcasecmp(prev_wd, words_after_create[i].name) == 0)
        {
            if (words_after_create[i].query)
                COMPLETE_WITH_QUERY(words_after_create[i].query);
            else if (words_after_create[i].squery)
                COMPLETE_WITH_SCHEMA_QUERY(*words_after_create[i].squery,
                                           NULL);
            break;
        }
    }
}





Re: [HACKERS] extend pgbench expressions with functions

2015-12-14 Thread Michael Paquier
On Mon, Nov 9, 2015 at 6:46 AM, Robert Haas  wrote:
> On Sat, Nov 7, 2015 at 2:45 AM, Fabien COELHO  wrote:
>> After looking at the generated html version, I find that the "1/param" and
>> "2/param" formula are very simple and pretty easy to read, and they would
>> not be really enhanced with additional spacing.
>>
>> ISTM that adaptive spacing (no spacing for level 1 operations, some for
>> higher level) is a good approach for readability, ie:
>>
>>f(i) - f(i+1)
>>  ^ no spacing here
>> ^ some spacing here
>>
>> So I would suggest to keep the submitted version, unless this is a blocker.
>
> Well, I think with the ".0" version it looks more like floating-point
> math, and I like the extra white-space.  But I'm happy to hear other
> opinions.

-  defined as (max + min) / 2.0, then value i
+  defined as (max+min)/2, with
This thing reminds me a bit of the little TeX I know, where writing
things like "\sqrt{1-e^2}" the spaces would be printed in the converted
html, and as that's floating-point arithmetic, we should have a .0 as
well. So I would agree with Robert on both points.

I have looked for now at the first patch and finished with the
attached while looking at it. Perhaps a committer could look already
at that?
I am still looking at the 2nd patch in more details...
-- 
Michael


0001-Make-pgbench-documentation-more-precise-for-function.patch
Description: binary/octet-stream



Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-12-14 Thread Kyotaro HORIGUCHI
Thank you for the new patch.

At Wed, 9 Dec 2015 20:59:20 +0530, Masahiko Sawada wrote:
> On Wed, Nov 18, 2015 at 2:06 PM, Masahiko Sawada wrote:
> > I agree with #3 way and the s_s_name format you suggested.
> > I think that It's extensible and is tolerable for future changes.
> > I'm going to implement the patch based on this idea if other hackers
> > agree with this design.
> 
> Please find the attached draft patch which supports multi sync replication.
> This patch adds a GUC parameter synchronous_replication_method, which
> represent the method of synchronous replication.
> 
> [Design of replication method]
> synchronous_replication_method has two values; 'priority' and
> '1-priority' for now.
> We can expand the kind of its value (e.g, 'quorum', 'json' etc) in the future.
> 
> * s_r_method = '1-priority'
> This method is for backward compatibility, so the syntax of s_s_names
> is same as today.
> The behavior is same as well.
> 
> * s_r_method = 'priority'
> This method is for multiple synchronous replication using priority method.
> The syntax of s_s_names is,
> <number of sync standbys>, <standby name> [, ...]

Is there anyone opposed to this?

> For example, s_r_method = 'priority' and s_s_names = '2, node1, node2,
> node3' means that the master waits for acknowledgement from at least the
> 2 lowest priority servers.
> If 4 standbys (node1 - node4) are available, the master server waits for
> acknowledgement from 'node1' and 'node2'.
> The status of each wal sender is:
> 
> =# select application_name, sync_state from pg_stat_replication order
> by application_name;
>  application_name | sync_state
> ------------------+------------
>  node1            | sync
>  node2            | sync
>  node3            | potential
>  node4            | async
> (4 rows)
> 
> After 'node2' crashes, the master will wait for acknowledgement from
> 'node1' and 'node3'.
> The status of each wal sender is:
> 
> =# select application_name, sync_state from pg_stat_replication order
> by application_name;
>  application_name | sync_state
> ------------------+------------
>  node1            | sync
>  node3            | sync
>  node4            | async
> (3 rows)
> 
> [Changing replication method]
> When we want to change the replication method, we have to change
> s_r_method first, and then do pg_reload_conf().
> After changing the replication method, we can change s_s_names.

Mmm. They should be changeable at once, because s_r_method and
s_s_names would contradict each other during the intermediate
state.

> [Expanding replication method]
> If we want to expand new replication method additionally, we need to
> implement two functions for each replication method:
> * int SyncRepGetSynchronousStandbysXXX(int *sync_standbys)
>   This function obtains the list of standbys considered as synchronous
> at that time, and return its length.
> * bool SyncRepGetSyncLsnXXX(XLogRecPtr *write_pos, XLogRecPtr *flush_pos)
>   This function obtains LSNs(write, flush) considered as synced.
> 
> Also, this patch's debug code still remains; you can debug this behavior
> by enabling the DEBUG_REPLICATION macro.
> 
> Please give me feedback.

I haven't looked into this fully (sorry) but I'm concerned about
several points.


- I feel that some function names look too long. For example
  SyncRepGetSynchronousStandbysOnePriority occupies more than
  half of a line. (However, the replication code already has many
  long function names..)

- The comment below in SyncRepGetSynchronousStandbyOnePriority,
  >   /* Find lowest priority standby */

  The code the comment refers to is doing the correct
  thing. However, the comment is confusing: a lower priority
  *value* means a higher priority.

- SyncRepGetSynchronousStandbys checks all if()s even when the
  first one matches. Use switch or "else if" there if they are
  exclusive of each other.

- Do you intend the DEBUG_REPLICATION code in
  SyncRepGetSynchronousStandbys*() to be the final shape?  The
  same code blocks which can work for both methods should be in
  their common caller, but SyncRepGetSyncLsns*() are a
  headache. Although it might need more refactoring, I'm sorry
  but I don't see a desirable shape for now.

  By the way, palloc(20)/free() in such a short term looks
  inefficient.

- SyncRepGetSyncLsnsPriority

  For the comment "/* Find lowest XLogRecPtr of both write and
  flush from sync_nodes */", LSNs are compared as earlier or later, so
  the comment would better be something like "Keep the earliest
  write and flush LSNs among prioritized standbys".

  And what is more important, this block handles the write and flush
  LSNs jumbled together, and it results in missing the earliest (= most
  delayed) LSN in certain cases. The following is an example.

   Standby 1:  write LSN = 10, flush LSN = 5
   Standby 2:  write LSN = 8 , flush LSN = 6

  For this case, finally we get tmp_write = 10 and tmp_flush = 5 from
  this block, but the earliest write LSN is actually 8.
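
  A sketch of collecting the two minimums independently (the struct
  fields follow the existing walsender code; the loop bounds are
  illustrative):

    XLogRecPtr  min_write = InvalidXLogRecPtr;
    XLogRecPtr  min_flush = InvalidXLogRecPtr;
    int         i;

    for (i = 0; i < num_sync; i++)
    {
        WalSnd     *walsnd = &WalSndCtl->walsnds[sync_standbys[i]];

        /* track the earliest write and flush positions separately */
        if (XLogRecPtrIsInvalid(min_write) || walsnd->write < min_write)
            min_write = walsnd->write;
        if (XLogRecPtrIsInvalid(min_flush) || walsnd->flush < min_flush)
            min_flush = walsnd->flush;
    }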

Re: [HACKERS] Fixing warnings in back branches?

2015-12-14 Thread Greg Stark
On Mon, Dec 14, 2015 at 10:20 AM, Andres Freund  wrote:
> I personally am not bothered by a handful of spurious warnings in the
> back branches, but at this point you're very unlikely to see a new
> warning introduced into 9.1 while backpatching.

These are new warnings older compilers didn't detect? Perhaps just
adding some -Wno-* flags would make more sense than changing code and
possibly introducing bugs.

-- 
greg




Re: [HACKERS] psql tab completion bug for ALL IN TABLESPACE

2015-12-14 Thread Michael Paquier
On Mon, Dec 14, 2015 at 9:08 PM, Andres Freund  wrote:
> ALTER TABLE ALL IN TABLESPACE pg_default OWNED BY andres SET TABLESPACE 
> works, because of

Missed that.
-   /* If we have TABLE <sth> SET TABLESPACE provide a list of tablespaces */
-   else if (pg_strcasecmp(prev4_wd, "TABLE") == 0 &&
-            pg_strcasecmp(prev2_wd, "SET") == 0 &&
-            pg_strcasecmp(prev_wd, "TABLESPACE") == 0)
-       COMPLETE_WITH_QUERY(Query_for_list_of_tablespaces);
So you could get rid of that as well.
-- 
Michael




Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-14 Thread Andreas Seltenreich
David Fetter writes:

> On Mon, Dec 14, 2015 at 03:06:18PM +0900, Michael Paquier wrote:
>> On Sun, Dec 13, 2015 at 10:14 AM, Andreas Seltenreich wrote:
>> > https://github.com/anse1/sqlsmith
>> 
>> I am in awe regarding this stuff, which has caught many bugs
>> already, it is a bit sad that it is released under the GPL license
>> preventing a potential integration into PostgreSQL core to
>> strengthen the test infrastructure,
>
> I suspect that a polite request to the Andreas that he change to a
> PostgreSQL-compatible license like one of (TPL, BSD2, MIT) might
> handle this part.

It probably would, but I never thought core integration would be a
desirable thing.  Like Csmith, SQLsmith is intended to be
product-agnostic.  That's not yet the case, but it's still on the
roadmap.

Further, the license shouldn't interfere with institutionalizing
sqlsmith somewhere to automatically send mails on failed assertions or
first sight of an error message.  Or providing a web interface around
the logging database of an instance where one can drill down on logged
errors/queries.

>> and it is even sadder to see a direct dependency with libpqxx :(
>
> I suspect this part is a SMOP, but I'm not quite sure what S might
> constitute in this case.

sqlsmith uses three connections in total:

 1. Connection for sending the generated queries to the DUT
 2. Connection for retrieving the schema from the DUT
 3. Logging connection

1 is trivial to change. 1+2 are intended to be pluggable for supporting
other products.  Switching 1 to libpq is even on the todo list, as
libpqxx doesn't allow access to the SQLSTATE (there is a four-year-old
bug report about this by Andres).

2. Is more effort to change to libpq, as actual data is passed through
that connection.

3. Was intended to stay libpqxx-only even when testing other products.
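
For reference, with plain libpq the SQLSTATE is available through
PQresultErrorField(). A minimal sketch of what connection 1 could do:

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Run one generated query and report its SQLSTATE, if any. */
    static void
    run_query(PGconn *conn, const char *sql)
    {
        PGresult   *res = PQexec(conn, sql);

        if (PQresultStatus(res) == PGRES_FATAL_ERROR)
        {
            const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

            printf("query failed with SQLSTATE %s\n",
                   sqlstate ? sqlstate : "(none)");
        }
        PQclear(res);
    }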

Btw, I'm glad C++11 was no longer considered a blocker this time :-)

regards,
Andreas




Re: [HACKERS] Fixing warnings in back branches?

2015-12-14 Thread Andres Freund
On 2015-12-14 10:55:05 +, Greg Stark wrote:
> On Mon, Dec 14, 2015 at 10:20 AM, Andres Freund  wrote:
> > I personally am not bothered by a handful of spurious warnings in the
> > back branches, but at this point you're very unlikely to see a new
> > warning introduced into 9.1 while backpatching.
> 
> These are new warnings older compilers didn't detect?

Yes. We fixed most of them since, but that doesn't help much when
compiling 9.1.

> Perhaps just adding some -Wno-* flags would make more sense than
> changing code and possibly introducing bugs.

I think that's a case-by-case decision. Just verbatim backpatching
something that stewed in master for a year or two seems fine. That's imo
often preferable because often it's just that existing warning
categories grew more "vigilant", or however you want to describe it. So
if you disable those, you also remove coverage...

Andres




Re: [HACKERS] psql tab completion bug for ALL IN TABLESPACE

2015-12-14 Thread Michael Paquier
On Mon, Dec 14, 2015 at 8:49 PM, Andres Freund  wrote:
> On 2015-12-14 20:44:20 +0900, Michael Paquier wrote:
>> + /*
>> +  * ALTER TABLE,INDEX,MATERIALIZED VIEW ALL IN TABLESPACE xxx OWNED BY xxx
>> +  * SET TABLESPACE.
>> +  */
>> + else if (pg_strcasecmp(prev9_wd, "ALL") == 0 &&
>> +  pg_strcasecmp(prev8_wd, "IN") == 0 &&
>> +  pg_strcasecmp(prev7_wd, "TABLESPACE") == 0 &&
>> +  pg_strcasecmp(prev5_wd, "OWNED") == 0 &&
>> +  pg_strcasecmp(prev4_wd, "BY") == 0 &&
>> +  pg_strcasecmp(prev2_wd, "SET") == 0 &&
>> +  pg_strcasecmp(prev_wd, "TABLESPACE") == 0)
>> + {
>> + COMPLETE_WITH_QUERY(Query_for_list_of_tablespaces);
>> + }
>
> Isn't that already handled by the normal SET TABLESPACE case?

No, there is no SET TABLESPACE case; there is a TABLE SET TABLESPACE
one though. Just removing the TABLE seems to be fine.
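
A generalized rule would then cover both spellings (a sketch, mirroring
the TABLE-specific branch being removed):

    /* <anything> SET TABLESPACE */
    else if (pg_strcasecmp(prev2_wd, "SET") == 0 &&
             pg_strcasecmp(prev_wd, "TABLESPACE") == 0)
        COMPLETE_WITH_QUERY(Query_for_list_of_tablespaces);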
-- 
Michael




Re: [HACKERS] psql tab completion bug for ALL IN TABLESPACE

2015-12-14 Thread Michael Paquier
On Mon, Dec 14, 2015 at 8:36 PM, Andres Freund  wrote:
> On 2015-12-14 12:18:27 +0100, Andres Freund wrote:
>> On 2015-12-12 21:04:31 +0900, Michael Paquier wrote:
>> > Hi all,
>> >
>> > I just bumped into the following thing while looking again at Thomas'
>> > patch for psql tab completion:
>> > --- a/src/bin/psql/tab-complete.c
>> > +++ b/src/bin/psql/tab-complete.c
>> > @@ -1040,7 +1040,7 @@ psql_completion(const char *text, int start, int end)
>> >  pg_strcasecmp(prev5_wd, "IN") == 0 &&
>> >  pg_strcasecmp(prev4_wd, "TABLESPACE") == 0 &&
>> >  pg_strcasecmp(prev2_wd, "OWNED") == 0 &&
>> > -pg_strcasecmp(prev4_wd, "BY") == 0)
>> > +pg_strcasecmp(prev_wd, "BY") == 0)
>> > {
>> > COMPLETE_WITH_QUERY(Query_for_list_of_roles);
>> > This should be backpatched, attached is the needed patch.
>>
>> Hm, this seems to need slightly more expansive surgery.
>>
>> Trivially the comments for ALL IN TABLESPACE seem broken/badly copy pasted:
>>  /* ALTER TABLE,INDEX,MATERIALIZED VIEW xxx ALL IN TABLESPACE xxx */
>> the first xxx doesn't make sense.
>>
>> Secondly the OWNED BY completion then breaks the SET TABLESPACE
>> completion. That's maybe not an outright bug, but seems odd nonetheless.
>>
>> Fujii, Stephen?
>
> I'm off to do something else for a while, but attached is where I
> stopped at.

Just a bit more and you can get the whole set...
-- 
Michael


psql-tab-alltbspace.patch
Description: binary/octet-stream



Re: [HACKERS] Proposal: custom compression methods

2015-12-14 Thread Bill Moran
On Mon, 14 Dec 2015 14:50:57 +0800
Craig Ringer  wrote:

> On 14 December 2015 at 01:28, Alexander Korotkov 
> wrote:
> 
> > Hackers,
> >
> > I'd like to propose a new feature: "Custom compression methods".

I missed the initial post on this thread ...

Have you started or do you plan to actually start work on this? I've
already started some preliminary coding with this as one of it's
goals, but it's been stalled for about a month due to life
intervening. I plan to come back to it soon, so if you decide to
start actually writing code, please contact me so we don't double
up on the work.

-- 
Bill Moran




Re: [HACKERS] find_inheritance_children() and ALTER TABLE NO INHERIT

2015-12-14 Thread Amit Langote
On 2015/12/03 15:30, Amit Langote wrote:
> On 2015/12/03 13:09, Tom Lane wrote:
>> Amit Langote  writes:
>>> Currently find_inheritance_children() is smart enough to skip a child
>>> table that it finds has been dropped concurrently after it gets a lock on
>>> the same. It does so by looking up the child relid in syscache. It seems
>>> it should also check if the table is still in the list of children of the
>>> parent.
> 
>> I wonder whether we could improve matters by rechecking validity of the
>> pg_inherits tuple (which we saw already and could presumably retain the
>> TID of).  There is at least one place where we do something like that now,
>> IIRC.
> 
> Not sure whether it's sane, but how about performing an ordered scan on
> pg_inherits (systable_getnext_ordered()) and using systable_recheck_tuple()
> in step with it? Does using an ordered catalog scan ensure the same safety
> against deadlocks that the existing approach of ordered locking of child
> tables does? Perhaps I'm missing something.

Just leaving here a patch that does this. It still returns the list in
the order determined by qsort(), IOW not in the pg_inherits_parent_index
order, to avoid broken tests. I could not figure out how to do it the way
you suggested.

Thanks,
Amit
diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 04687c1..bfef40d 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -50,9 +50,10 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 {
 	List	   *list = NIL;
 	Relation	relation;
+	Relation	index;
 	SysScanDesc scan;
 	ScanKeyData key[1];
-	HeapTuple	inheritsTuple;
+	HeapTuple	tup;
 	Oid			inhrelid;
 	Oid		   *oidarr;
 	int			maxoids,
@@ -74,46 +75,30 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 	numoids = 0;
 
 	relation = heap_open(InheritsRelationId, AccessShareLock);
+	index = index_open(InheritsParentIndexId, AccessShareLock);
 
	ScanKeyInit(&key[0],
 Anum_pg_inherits_inhparent,
 BTEqualStrategyNumber, F_OIDEQ,
 ObjectIdGetDatum(parentrelId));
 
-	scan = systable_beginscan(relation, InheritsParentIndexId, true,
-			  NULL, 1, key);
+	scan = systable_beginscan_ordered(relation, index, NULL, 1, key);
 
-	while ((inheritsTuple = systable_getnext(scan)) != NULL)
+	while ((tup = systable_getnext_ordered(scan, ForwardScanDirection)) != NULL)
 	{
-		inhrelid = ((Form_pg_inherits) GETSTRUCT(inheritsTuple))->inhrelid;
+		inhrelid = ((Form_pg_inherits) GETSTRUCT(tup))->inhrelid;
 		if (numoids >= maxoids)
 		{
 			maxoids *= 2;
 			oidarr = (Oid *) repalloc(oidarr, maxoids * sizeof(Oid));
 		}
-		oidarr[numoids++] = inhrelid;
-	}
-
-	systable_endscan(scan);
-
-	heap_close(relation, AccessShareLock);
-
-	/*
-	 * If we found more than one child, sort them by OID.  This ensures
-	 * reasonably consistent behavior regardless of the vagaries of an
-	 * indexscan.  This is important since we need to be sure all backends
-	 * lock children in the same order to avoid needless deadlocks.
-	 */
-	if (numoids > 1)
-		qsort(oidarr, numoids, sizeof(Oid), oid_cmp);
-
-	/*
-	 * Acquire locks and build the result list.
-	 */
-	for (i = 0; i < numoids; i++)
-	{
-		inhrelid = oidarr[i];
 
+		/*
+		 * Following code assumes that inhrelids are returned in the same
+		 * order by systable_getnext_ordered() in all backends. This is
+		 * important since we need to be sure all backends lock children
+		 * in the same order to avoid needless deadlocks.
+		 */
 		if (lockmode != NoLock)
 		{
 			/* Get the lock to synchronize against concurrent drop */
@@ -124,7 +109,8 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 			 * really exists or not.  If not, assume it was dropped while we
 			 * waited to acquire lock, and ignore it.
 			 */
-			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid)))
+			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid)) ||
+!systable_recheck_tuple(scan, tup))
 			{
 /* Release useless lock */
 UnlockRelationOid(inhrelid, lockmode);
@@ -133,10 +119,22 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 			}
 		}
 
-		list = lappend_oid(list, inhrelid);
+		oidarr[numoids++] = inhrelid;
 	}
 
-	pfree(oidarr);
+	/*
+	 * Do not break regression tests - return the oids in the same order as
+	 * would have been previously.
+	 */
+	if (numoids > 1)
+		qsort(oidarr, numoids, sizeof(Oid), oid_cmp);
+
+	for (i = 0; i < numoids; i++)
+		list = lappend_oid(list, oidarr[i]);
+
+	systable_endscan_ordered(scan);
+	index_close(index, AccessShareLock);
+	heap_close(relation, AccessShareLock);
 
 	return list;
 }



Re: [HACKERS] pgbench stats per script & other stuff

2015-12-14 Thread Michael Paquier
On Tue, Dec 15, 2015 at 5:53 AM, Fabien COELHO  wrote:
> I wrote:
>> Why would the likelihood of an issue be small here?
>
> The time to update one stat (<< 100 cycles?) is tiny compared to the time
> to do a transaction with the database (typically Y ms), so the likelihood
> of two threads updating the very same stat at the same time is probably
> under 1/10,000,000. Even if it occurs, then one stat is slightly off, no
> big deal. So I think the potential slowdown induced by a mutex is not
> worth it, so I added a comment instead.

OK, got it and agreed.
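
Such a comment could be as simple as (a sketch, wording mine):

    /*
     * Stats are updated without any locking: an update takes on the order
     * of 100 cycles while a transaction takes milliseconds, so concurrent
     * updates of the same counter are vanishingly rare, and a lost update
     * merely skews one number slightly.  A mutex is not worth its cost.
     */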

>> +   /* print NaN if no transactions were executed */
>> +   double latency = ss->sum / ss->count;
>> This does not look like a good idea, ss->count can be 0.
>
>
> "sum" is a double so count is converted to 0.0, 0.0/0.0 == NaN, hence the
> comment.

PG code usually avoids that, and I recall static analysis tools like
Coverity complaining that this may lead to undefined behavior. While I
agree that this would lead to NaN...
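
One conventional way to keep such tools quiet is to special-case the
empty run explicitly, e.g. (a sketch; names as in the excerpt above, NAN
comes from <math.h>):

    /* print NaN if no transactions were executed, without computing 0.0/0.0 */
    double      latency = (ss->count > 0) ? ss->sum / ss->count : NAN;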

>> It seems also that it would be a good idea to split the patch into two
>> parts:
>> 1) Refactor the code so as the existing test scripts are put under the
>> same umbrella with addScript, adding at the same time the new option
>> -b.
>> 2) Add the weight facility and its related statistics.
>
>
> Sigh. The patch & documentation are probably not independent, so that
> would make two dependent patches.

I am not really saying so; it just seems that doing the refactoring
(with its related docs) and then adding the extension for the weight
(with its docs) is more natural than doing both things at the same
time.
-- 
Michael




Re: [HACKERS] psql tab completion bug for ALL IN TABLESPACE

2015-12-14 Thread Andres Freund
On 2015-12-12 21:04:31 +0900, Michael Paquier wrote:
> Hi all,
> 
> I just bumped into the following thing while looking again at Thomas'
> patch for psql tab completion:
> --- a/src/bin/psql/tab-complete.c
> +++ b/src/bin/psql/tab-complete.c
> @@ -1040,7 +1040,7 @@ psql_completion(const char *text, int start, int end)
>  pg_strcasecmp(prev5_wd, "IN") == 0 &&
>  pg_strcasecmp(prev4_wd, "TABLESPACE") == 0 &&
>  pg_strcasecmp(prev2_wd, "OWNED") == 0 &&
> -pg_strcasecmp(prev4_wd, "BY") == 0)
> +pg_strcasecmp(prev_wd, "BY") == 0)
> {
> COMPLETE_WITH_QUERY(Query_for_list_of_roles);
> This should be backpatched, attached is the needed patch.

Hm, this seems to need slightly more expansive surgery.

Trivially the comments for ALL IN TABLESPACE seem broken/badly copy pasted:
 /* ALTER TABLE,INDEX,MATERIALIZED VIEW xxx ALL IN TABLESPACE xxx */
the first xxx doesn't make sense.

Secondly the OWNED BY completion then breaks the SET TABLESPACE
completion. That's maybe not an outright bug, but seems odd nonetheless.

Fujii, Stephen?




Re: [HACKERS] Patch: ResourceOwner optimization for tables with many partitions

2015-12-14 Thread Aleksander Alekseev
Hello, Robert

Here is my fix for item 4.

Best regards,
Aleksander

On Thu, 10 Dec 2015 11:37:23 -0500
Robert Haas  wrote:

> On Wed, Dec 9, 2015 at 8:30 AM, Aleksander Alekseev
>  wrote:
> > Hello, Robert
> >
> > Thanks for your review. I believe I fixed items 1, 2 and 3 (see
> > attachment). Also I would like to clarify item 4.
> >
> >> 4. It mixes together multiple ideas in a single patch, not only
> >> introducing a hashing concept but also striping a brand-new layer
> >> of abstraction across the resource-owner mechanism.  I am not sure
> >> that layer of abstraction is a very good idea, but if it needs to
> >> be done, I think it should be a separate patch.
> >
> > Do I right understand that you suggest following?
> >
> > Current patch should be split in two parts. In first patch we create
> > and use ResourceArray with array-based implementation (abstraction
> > layer). Then we apply second patch which change ResourceArray
> > implementation to hashing based (optimization).
> 
> Well, sorta.  To be honest, I think this patch is really ugly.  If we
> were going to do this then, yes, I would want to split the patch into
> two parts along those lines.  But actually I don't really want to do
> it this way at all.  It's not that I don't want the performance
> benefits: I do.  But the current code is really easy to read and
> extremely simple, and this changes it into something that is a heck of
> a lot harder to read and understand.  I'm not sure exactly what to do
> about that, but it seems like a problem.
> 

diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index 0e7acbf..39324fe 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -29,6 +29,209 @@
 #include "utils/snapmgr.h"
 
 /*
+ * ResourceArray is a common structure for storing different types of resources.
+ *
+ * ResourceOwner can own `HeapTuple`s, `Relation`s, `Snapshot`s, etc. For
+ * each resource type there are procedures ResourceOwnerRemember* and
+ * ResourceOwnerForget*. There are also ResourceOwnerEnlarge* procedures
+ * which should be called before corresponding ResourceOwnerRemember* calls
+ * (see below). Internally each type of resource is stored in separate
+ * ResourceArray.
+ */
+typedef struct ResourceArray
+{
+	char	   *itemsarr;		/* buffer for storing values */
+	uint32		itemsizelg:2;	/* sizeof one item log 2 */
+	uint32		capacity:30;	/* capacity of array */
+	uint32		nitems;			/* how many items is stored in items array */
+}	ResourceArray;
+
+/*
+ * This number is used as initial size of resource array. If given number of
+ * items is not enough, we double array size and reallocate memory.
+ */
+#define RESARRAY_INIT_SIZE 16
+
+/*
+ * Such type of callback function is called when resource stored in
+ * ResourceArray is released using ResourceArrayFree. Callback should
+ * _actually_ release a resource so nitems value will be decremented.
+ */
+typedef void (*ResourceArrayRemoveCallback) (const void *ref, bool isCommit);
+
+/*
+ * Initialize ResourceArray
+ */
+static void
+ResourceArrayInit(ResourceArray * resarr, uint32 itemsize)
+{
+	Assert(itemsize == sizeof(int) || itemsize == sizeof(void *));
+	Assert(resarr->itemsarr == NULL);
+	Assert(resarr->capacity == 0);
+	Assert(resarr->nitems == 0);
+
+	resarr->itemsizelg = 0;
+	while (itemsize > 1)
+	{
+		resarr->itemsizelg++;
+		itemsize >>= 1;
+	}
+
+	Assert(resarr->itemsizelg == 2 || resarr->itemsizelg == 3);
+}
+
+/*
+ * Add a resource to ResourceArray
+ *
+ * Caller must have previously done ResourceArrayEnlarge()
+ */
+static void
+ResourceArrayAdd(ResourceArray * resarr, const void *dataref)
+{
+	char	   *itemptr;
+	uint32		itemsize = 1 << resarr->itemsizelg;
+
+	Assert(resarr->capacity > 0);
+	Assert(resarr->itemsarr != NULL);
+	Assert(resarr->nitems < resarr->capacity);
+
+	/*
+	 * Read next line as:
+	 *
+	 * itemptr = &(itemsarr[resarr->nitems])
+	 *
+	 * We use binary shift since compiler doesn't know that itemsize is always
+	 * power of two. It would use multiplication instead of efficient binary
+	 * shift in code `resarr->nitems * itemsize`.
+	 */
+	itemptr = resarr->itemsarr + (resarr->nitems << resarr->itemsizelg);
+	memcpy(itemptr, dataref, itemsize);
+	resarr->nitems++;
+}
+
+/*
+ * Remove a resource from ResourceArray
+ *
+ * Returns true on success, false if resource was not found
+ */
+static bool
+ResourceArrayRemove(ResourceArray * resarr, const void *dataref)
+{
+	Datum		idx;
+	char	   *itemptr;
+	char	   *lastitemptr;
+	uint32		itemsize = 1 << resarr->itemsizelg;
+
+	Assert(resarr->capacity > 0);
+	Assert(resarr->itemsarr != NULL);
+
+	lastitemptr = resarr->itemsarr +
+		((resarr->nitems - 1) << resarr->itemsizelg);
+	for (idx = 0; idx < resarr->nitems; idx++)
+	{
+		/*
+		 * Read next line as:
+		 *
+		 * itemptr = &(itemsarr[idx])
+		 *
+		 * We use binary shift since compiler doesn't know that itemsize is
+		 * 

Re: [HACKERS] psql tab completion bug for ALL IN TABLESPACE

2015-12-14 Thread Andres Freund
On 2015-12-14 20:44:20 +0900, Michael Paquier wrote:
> + /*
> +  * ALTER TABLE,INDEX,MATERIALIZED VIEW ALL IN TABLESPACE xxx OWNED BY 
> xxx
> +  * SET TABLESPACE.
> +  */
> + else if (pg_strcasecmp(prev9_wd, "ALL") == 0 &&
> +  pg_strcasecmp(prev8_wd, "IN") == 0 &&
> +  pg_strcasecmp(prev7_wd, "TABLESPACE") == 0 &&
> +  pg_strcasecmp(prev5_wd, "OWNED") == 0 &&
> +  pg_strcasecmp(prev4_wd, "BY") == 0 &&
> +  pg_strcasecmp(prev2_wd, "SET") == 0 &&
> +  pg_strcasecmp(prev_wd, "TABLESPACE") == 0)
> + {
> + COMPLETE_WITH_QUERY(Query_for_list_of_tablespaces);
> + }

Isn't that already handled by the normal SET TABLESPACE case?




[HACKERS] Another XML build issue

2015-12-14 Thread Kevin Grittner
Today, on my ubuntu 14.04 LTS system, I saw an XML change come through
with updates.  My build on master is now broken with this:

*** /home/kgrittn/pg/master/src/test/regress/expected/xml.out
2015-11-23 08:15:28.206336200 -0600
--- /home/kgrittn/pg/master/src/test/regress/results/xml.out
2015-12-14 09:29:40.489533194 -0600
***
*** 271,279 
  line 1: Document is empty

  ^
- line 1: Start tag expected, '<' not found
-
- ^
  SELECT xmlparse(document '   ');
  ERROR:  invalid XML document
  DETAIL:  line 1: Start tag expected, '<' not found
--- 271,276 

==

Any ideas on what to do about this one?

I could certainly push yet another version of the expected file, but
I'm not sure that's the thing to do.  I don't see it in the buildfarm
yet, but that's probably just a matter of time.

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fixing warnings in back branches?

2015-12-14 Thread Andres Freund
On 2015-12-14 09:43:07 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-12-14 10:55:05 +, Greg Stark wrote:
> >> Perhaps just adding some -Wno-* flags would make more sense than
> >> changing code and possibly introducing bugs.
> 
> > I think that's a case-by-case decision. Just verbatim backpatching
> > something that stewed in master for a year or two seems fine. That's imo
> > often preferable because often it's just that existing warning
> > categories grew more "vigilant", or however you want to describe it. So
> > if you disable those, you also remove coverage...
> 
> Meh.  If we thought that anything like that was an actual bug, we should
> have back-patched the fix when removing the warning in HEAD.  So I would
> expect that all remaining warnings are just compiler nannyism, and thus
> that fixing them is more likely to introduce bugs than do anything very
> useful.

I'm more concerned about removing warnings that help detect problems
when backpatching. Right now I need
  -Wno-incompatible-pointer-types \
  -Wno-type-limits \
  -Wno-unused-but-set-variable \
  -Wno-empty-body \
  -Wno-address

to compile 9.1 without warnings. -Wincompatible-pointer-types is quite
useful to detect problems. The rest indeed is pretty 'Meh'.
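
For local builds of the old branches, the remaining noise can also be
silenced without touching the tree, since the Makefiles append COPT to
CFLAGS, e.g.

  make COPT='-Wno-type-limits -Wno-unused-but-set-variable -Wno-empty-body -Wno-address'

though that of course hides future offenders in those categories too.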




[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Don't error if the two clusters are already on the sa

2015-12-14 Thread Robert Haas
On Fri, Dec 11, 2015 at 8:44 PM, Michael Paquier
 wrote:
> On Sat, Dec 12, 2015 at 9:11 AM, Tom Lane  wrote:
>> Peter Eisentraut  writes:
>>> pg_rewind: Don't error if the two clusters are already on the same timeline
>>> This previously resulted in an error and a nonzero exit status, but
>>> after discussion this should rather be a noop with a zero exit status.
>>
>> Hm, if we're going to do that, shouldn't we back-patch the behavior
>> change into 9.5 as well?
>
> +1. It would be good to get consistent across branches here.

This is one of the (darned few) things still on the open items list
for 9.5.  Peter, are you going to take care of this?  Or alternatively
make an argument that we shouldn't do this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Another XML build issue

2015-12-14 Thread Tom Lane
Kevin Grittner  writes:
> Today, on my ubuntu 14.04 LTS system, I saw an XML change come through
> with updates.  My build on master is now broken with this:

> - line 1: Start tag expected, '<' not found
> -
> - ^

Now that I look more closely, there are really two distinct pathologies
exhibited in xml_2.out.  One is what I said in the commit message, that
error position reports occurring at EOF are lost.  But the other is
that this specific error message disappears altogether.  Looking at the
larger context:

SELECT xmlparse(document '');
ERROR:  invalid XML document
DETAIL:  line 1: switching encoding : no input

^
line 1: Document is empty

^
line 1: Start tag expected, '<' not found

^

the missing message is one that should have come out after the "document
is empty" message.  I am suspicious that what this is is an additional
side effect of the CVE-2015-7499 fix, which as I understood it was a hack
to prevent further parsing once any error has been detected.

So at this point I'm guessing that Ubuntu has shipped an update that
includes the patch Pavel Raiskup suggested in

https://bugzilla.redhat.com/show_bug.cgi?id=1286692#c4

which would fix the lack of error cursors at EOF, but it would not affect
any larger change such as stopping after the first detected error.

If you can confirm that, then we'll need to decide what to do about it.
Personally I'd vote for just dropping this specific test case, which
does not seem to have enough value to justify carrying YA expected-file.

regards, tom lane




Re: [HACKERS] Another XML build issue

2015-12-14 Thread Tom Lane
Kevin Grittner  writes:
> Today, on my ubuntu 14.04 LTS system, I saw an XML change come through
> with updates.  My build on master is now broken with this:

> *** /home/kgrittn/pg/master/src/test/regress/expected/xml.out
> 2015-11-23 08:15:28.206336200 -0600
> --- /home/kgrittn/pg/master/src/test/regress/results/xml.out
> 2015-12-14 09:29:40.489533194 -0600
> ***
> *** 271,279 
>   line 1: Document is empty

>   ^
> - line 1: Start tag expected, '<' not found
> -
> - ^
>   SELECT xmlparse(document '   ');
>   ERROR:  invalid XML document
>   DETAIL:  line 1: Start tag expected, '<' not found
> --- 271,276 

That is odd; I did not think there'd be more than one new behavior
from libxml2.  Can you look to see if Ubuntu is carrying some
distro-specific patch that affects this?

regards, tom lane




Re: [HACKERS] Another XML build issue

2015-12-14 Thread Kevin Grittner
On Mon, Dec 14, 2015 at 9:44 AM, Tom Lane  wrote:

> Can you look to see if Ubuntu is carrying some
> distro-specific patch that affects this?

Here's what is in the log for the change that I think is the one that
came through today:



libxml2 (2.9.1+dfsg1-3ubuntu4.6) trusty-security; urgency=medium

  * SECURITY UPDATE: denial of service via entity expansion issue
- debian/patches/CVE-2015-5312.patch: properly exit when entity
  expansion is detected in parser.c.
- CVE-2015-5312
  * SECURITY UPDATE: heap buffer overflow in xmlDictComputeFastQKey
- debian/patches/CVE-2015-7497.patch: check offset in dict.c.
- CVE-2015-7497
  * SECURITY UPDATE: denial of service via encoding conversion failures
- debian/patches/CVE-2015-7498.patch: avoid processing entities after
  encoding conversion failures in parser.c.
- CVE-2015-7498
  * SECURITY UPDATE: out of bounds read in xmlGROW
- debian/patches/CVE-2015-7499-1.patch: add xmlHaltParser() to stop the
  parser in parser.c.
- debian/patches/CVE-2015-7499-2.patch: check input in parser.c.
- CVE-2015-7499
  * SECURITY UPDATE: out of bounds read in xmlParseMisc
- debian/patches/CVE-2015-7500.patch: check entity boundaries in
  parser.c.
- CVE-2015-7500
  * SECURITY UPDATE: denial of service via extra processing of MarkupDecl
- debian/patches/CVE-2015-8241.patch: add extra EOF check in parser.c.
- CVE-2015-8241
  * SECURITY UPDATE: buffer overead with HTML parser in push mode
- debian/patches/CVE-2015-8242.patch: use pointer in the input in
  HTMLparser.c.
- CVE-2015-8242
  * SECURITY UPDATE: denial of service via encoding failures
- debian/patches/CVE-2015-8317-1.patch: do not process encoding values
  if the declaration is broken in parser.c.
- debian/patches/CVE-2015-8317-2.patch: fail parsing if the encoding
  conversion failed in parser.c.
- CVE-2015-8317

 -- Marc Deslauriers   Wed, 09 Dec 2015
12:00:30 -0500



I don't know how that compares to other distros...

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fixing warnings in back branches?

2015-12-14 Thread Robert Haas
On Mon, Dec 14, 2015 at 10:06 AM, Andres Freund  wrote:
> On 2015-12-14 09:43:07 -0500, Tom Lane wrote:
>> Andres Freund  writes:
>> > On 2015-12-14 10:55:05 +, Greg Stark wrote:
>> >> Perhaps just adding some -Wno-* flags would make more sense than
>> >> changing code and possibly introducing bugs.
>>
>> > I think that's a case-by-case decision. Just verbatim backpatching
>> > something that stewed in master for a year or two seems fine. That's imo
>> > often preferable because often it's just that existing warning
>> > categories grew more "vigilant", or however you want to describe it. So
>> > if you disable those, you also remove coverage...
>>
>> Meh.  If we thought that anything like that was an actual bug, we should
>> have back-patched the fix when removing the warning in HEAD.  So I would
>> expect that all remaining warnings are just compiler nannyism, and thus
>> that fixing them is more likely to introduce bugs than do anything very
>> useful.
>
> I'm more concerned about removing warnings that help detect problems
> when backpatching. Right now I need
>   -Wno-incompatible-pointer-types \
>   -Wno-type-limits \
>   -Wno-unused-but-set-variable \
>   -Wno-empty-body \
>   -Wno-address
>
> to compile 9.1 without warnings. -Wincompatible-pointer-types is quite
> useful to detect problems. The rest indeed is pretty 'Meh'.

IIUC, the main thing that causes incompatible pointer type warnings on
9.1 is the conflation of FILE with gzFile in pg_dump and
pg_basebackup.  Not sure exactly which commits fixed that offhand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Using quicksort for every external sort run

2015-12-14 Thread Peter Geoghegan
On Mon, Dec 14, 2015 at 6:58 PM, Greg Stark  wrote:
> I ran sorts with various parameters on my small NAS server.

...

> without the extra memory optimizations.

Thanks for taking the time to benchmark the patch!

While I think it's perfectly fair that you didn't apply the final
on-the-fly merge "memory pool" patch, I also think that it's quite
possible that the regression you see at the very low end would be
significantly ameliorated or even eliminated by applying that patch,
too. After all, Jeff Janes had a much harder time finding a
regression, probably because he benchmarked all patches together.

-- 
Peter Geoghegan




Re: [HACKERS] Remove array_nulls?

2015-12-14 Thread Michael Paquier
On Tue, Dec 15, 2015 at 2:57 AM, Jim Nasby  wrote:
> On 12/11/15 2:57 PM, Tom Lane wrote:
>> Jim Nasby  writes:
>> Perhaps, but I'd like to have a less ad-hoc process about it.  What's
>> our policy for dropping backwards-compatibility GUCs?  Are there any
>> others that should be removed now as well?
>
>
> Perhaps it should be tied to bumping the major version number, which I'm
> guessing would happen next whenever we get parallel query execution. If we
> do that, a reasonable policy might be that a compatibility GUC lives across
> no more than 1 major version bump (ie, we wouldn't remove something in 9.0
> that was added in 8.4).

Another possibility may be to link that with the community's 5-year
maintenance window: a compatibility GUC is dropped in the following
major release if the oldest stable version still maintained is the one
that introduced it. Just an idea.
-- 
Michael




Re: [HACKERS] Unused(?) field Form_pg_sequence.sequence_name, not updated by seq rename

2015-12-14 Thread Simon Riggs
On 15 December 2015 at 04:40, Craig Ringer  wrote:


> It gets written as part of the Form_pg_sequence each time we write a
> sequence advance to WAL, but just seems to be a waste of space.
>

Agreed


> Am I missing something obvious or should it just be removed? Or perhaps
> replaced with the sequence's Oid in pg_class, since that'd be quite handy
> for logical decoding of sequences.
>

If the name is wrong, then probably other fields are also wrong when we do
ALTER SEQUENCE?

We should add the fields you need, but don't alter anything in
Form_pg_sequence.

-- 
Simon Riggs   http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates

2015-12-14 Thread Peter Geoghegan
On Wed, Dec 9, 2015 at 2:15 PM, Peter Geoghegan  wrote:
> I think that you're missing that patch 0001 formally forbids
> abbreviated keys that are pass-by-value, by revising the contract
> (this is proposed for backpatch to 9.5 -- only comments are changed).
> This is already something that is all but forbidden, although the
> datum case does tacitly acknowledge the possibility by not allowing
> abbreviation to work with the pass-by-value-and-yet-abbreviated case.
>
> I think that this revision is also useful for putting abbreviated keys
> in indexes, something that may happen yet.

I'm also depending on this for the "quicksort for every sort run" patch, BTW:

+   /*
+* Kludge:  Trigger abbreviated tie-breaker if in-memory tuples
+* use abbreviation (writing tuples to tape never preserves
+* abbreviated keys).  Do this by assigning in-memory
+* abbreviated tuple to tape tuple directly.
+*
+* It doesn't seem worth generating a new abbreviated key for
+* the tape tuple, and this approach is simpler than
+* "unabbreviating" the memtuple tuple from a "common" routine
+* like this.
+*/
+   if (state->sortKeys != NULL &&
+       state->sortKeys->abbrev_converter != NULL)
+       stup->datum1 = state->memtuples[state->current].datum1;

I could, as an alternative approach, revise tuplesort so that
self-comparison works (something we currently assert against [1]),
something that would probably *also* require an update to the
sortsupport.h contract, but this seemed simpler and more general.

In general, I think that there are plenty of reasons to forbid
pass-by-reference abbreviated keys (where the abbreviated key itself
is a pointer, making the comparison something much more complicated
than an integer 3-way comparison or similar).

[1] Commit c5a03256c
-- 
Peter Geoghegan




[HACKERS] Unused(?) field Form_pg_sequence.sequence_name, not updated by seq rename

2015-12-14 Thread Craig Ringer
Hi all

Does anyone know why Form_pg_sequence has a field sequence_name that
duplicates the sequence's name from pg_class ?

It's assigned when the sequence is created by copying it from pg_class. It
isn't subsequently referenced anywhere as far as I can see. It isn't
updated by ALTER SEQUENCE ... RENAME TO, so it isn't necessarily actually
the correct sequence name either.

It gets written as part of the Form_pg_sequence each time we write a
sequence advance to WAL, but just seems to be a waste of space.

Am I missing something obvious or should it just be removed? Or perhaps
replaced with the sequence's Oid in pg_class, since that'd be quite handy
for logical decoding of sequences.

If we need to keep it for some reason then it should probably be updated by
ALTER SEQUENCE.

(Arose out of
http://www.postgresql.org/message-id/CAMsr+YHSxwZA-xHsgNLpA_DbTywVQYDX8CUjBZ9Sbr=d7xc...@mail.gmail.com
)
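
For reference, the tuple layout in question looks roughly like this in the
9.x tree (src/include/commands/sequence.h; quoted from memory, so treat the
exact field list as approximate).  Note that sequence_name is a NameData,
i.e. 64 bytes carried along with every WAL-logged advance:

typedef struct FormData_pg_sequence
{
    NameData    sequence_name;  /* the vestigial copy discussed here */
    int64       last_value;
    int64       start_value;
    int64       increment_by;
    int64       max_value;
    int64       min_value;
    int64       cache_value;
    int64       log_cnt;
    bool        is_cycled;
    bool        is_called;
} FormData_pg_sequence;

typedef FormData_pg_sequence *Form_pg_sequence;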

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Unused(?) field Form_pg_sequence.sequence_name, not updated by seq rename

2015-12-14 Thread Tom Lane
Craig Ringer  writes:
> Does anyone know why Form_pg_sequence has a field sequence_name that
> duplicates the sequence's name from pg_class ?

It's historical, for sure.  We won't be removing it in the foreseeable
future because of on-disk-compatibility issues.  But you might want to
read the pghackers archives, five or ten years back, where we speculated
about redoing sequences to combine them all into one system catalog
(ie, store one row per sequence not one relation per).  Aside from
application compatibility issues, the stumbling block seemed to be how to
separate transactional from nontransactional updates.  That particular
problem is also why ALTER SEQUENCE RENAME can't update the sequence's copy
of the relation name: the wrong things happen if you roll back.

regards, tom lane




[HACKERS] small query, about skipping dump in dumpAttrDef

2015-12-14 Thread amul sul
Hi All,

In the dumpAttrDef() function we skip the dump if the table definition is
not dumped (i.e. by checking tbinfo->dobj.dump); it's absolutely alright
to do this.

But in dumpConstraint() we do the same by checking the constraint dump
flag (coninfo->dobj.dump) instead of the table dump flag (tbinfo->dobj.dump).

IMHO, for code consistency we should use the attribute dump flag
(adinfo->dobj.dump) instead of the table dump flag, as shown below:
=
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 36863df..8ac0776 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -14479,8 +14479,8 @@ dumpAttrDef(Archive *fout, DumpOptions *dopt, AttrDefInfo *adinfo)
PQExpBuffer q;
PQExpBuffer delq;

-   /* Skip if table definition not to be dumped */
-   if (!tbinfo->dobj.dump || dopt->dataOnly)
+   /* Skip if not to be dumped */
+   if (!adinfo->dobj.dump || dopt->dataOnly)
return;

/* Skip if not "separate"; it was dumped in the table's definition */

=

Comments? Thoughts?

 
Regards,
Amul Sul




Re: [HACKERS] Using quicksort for every external sort run

2015-12-14 Thread Peter Geoghegan
On Mon, Dec 14, 2015 at 7:22 PM, Peter Geoghegan  wrote:
> Thanks for taking the time to benchmark the patch!

Also, I should point out that you didn't add work_mem past the point
where the master branch will get slower, while the patch continues to
get faster. This seems to happen fairly reliably, certainly if
work_mem is sized at about 1GB, and often at lower settings. With the
POWER7 "Hydra" server, external sorting for a CREATE INDEX operation
could put any possible maintenance_work_mem setting to good use -- my
test case got faster with a 15GB maintenance_work_mem setting (the
server has 64GB of ram). I think I tried 25GB as a
maintenance_work_mem setting next, but started to get OOM errors at
that point.

Again, I point this out because I want to account for why my numbers
were better (for the benefit of other people -- I think you get this,
and are being fair).

-- 
Peter Geoghegan




Re: [HACKERS] [PATCH] Logical decoding support for sequence advances

2015-12-14 Thread Craig Ringer
On 14 December 2015 at 16:19, Craig Ringer  wrote:


> Attached a slightly updated version. It just has less spam in the
> regression tests, by adding a new option to test_decoding to show
> sequences, which it doesn't enable except in sequence specific tests.
>

Whoops, the patch as written is wrong. I used Form_pg_sequence's
sequence_name field to get the sequence name. This seems to be a vestigial
field with no useful purpose except taking up space; in particular, it's
not updated when ALTER SEQUENCE ... RENAME TO is used to rename a sequence.
The sequence's pg_class entry is updated but the sequence_name in the
sequence page is not.

Since the sequence_name is written to WAL with each sequence recovery
position advance, isn't used anywhere, and can be wrong, shouldn't it just
be removed from Form_pg_sequence entirely? Or at least replaced with the
sequence's pg_class entry oid?

I could always fix ALTER SEQUENCE to update sequence_name, but it seems
pretty pointless to have it at all.

In any case I'll need to update this patch. Which will be fun, since the
sequence oid doesn't seem to be written to the xlog record, just the
HeapTuple header and Form_pg_sequence, and the redo is handled by doing a
page replacement. I'd like to update Form_pg_sequence to store oid not
sequence_name at the same time but that makes all this a dramatically more
intrusive change that's rather less likely to be accepted.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Unused(?) field Form_pg_sequence.sequence_name, not updated by seq rename

2015-12-14 Thread Michael Paquier
On Tue, Dec 15, 2015 at 2:05 PM, Tom Lane  wrote:
> Craig Ringer  writes:
>> Does anyone know why Form_pg_sequence has a field sequence_name that
>> duplicates the sequence's name from pg_class ?
>
> It's historical, for sure.  We won't be removing it in the foreseeable
> future because of on-disk-compatibility issues.  But you might want to
> read the pghackers archives, five or ten years back, where we speculated
> about redoing sequences to combine them all into one system catalog
> (ie, store one row per sequence not one relation per).  Aside from
> application compatibility issues, the stumbling block seemed to be how to
> separate transactional from nontransactional updates.  That particular
> problem is also why ALTER SEQUENCE RENAME can't update the sequence's copy
> of the relation name: the wrong things happen if you roll back.

That's apparently a little bit older than 5 or 10 years; see commit 7415105.
But yes, as the sequence data is stored as a single always-visible
tuple in its relfilenode, there is no way to remove it without
breaking on-disk compatibility.  Going back in time, though, it would
surely have been possible to rely on the sequence OID instead.
-- 
Michael




Re: [HACKERS] parallel joins, and better parallel explain

2015-12-14 Thread Amit Kapila
On Thu, Dec 3, 2015 at 3:25 AM, Robert Haas  wrote:
>
> On Tue, Dec 1, 2015 at 7:21 AM, Amit Kapila wrote:
> > It would be better if we can split this patch into multiple patches like
> > Explain related changes, Append pushdown related changes, Join
> > Push down related changes.  You can choose to push the patches as
> > you prefer, but splitting can certainly help in review/verification of
the
> > code.
>
> I don't think it really makes sense to split the append push-down
> changes from the join push-down changes; those share a great deal of
> code.

Not an issue.  I have started looking into the parallel join patch, and
below are a few findings:

1.
There are a few compilation errors in the patch. It seems the patch needs
to adapt to the latest changes done in commit edca44b1.

1>src/backend/optimizer/path/joinpath.c(420): error C2039: 'extra_lateral_rels' : is not a member of 'JoinPathExtraData'
1>  E:\WorkSpace\PostgreSQL\master\postgresql\src\include\nodes/relation.h(1727): see declaration of 'JoinPathExtraData'
..
..

2.
Why doesn't consider_parallel_nestloop() consider materializing the inner
relation, as match_unsorted_outer() does?

I have generated a test as below where a non-parallel nested loop join is
faster than a parallel nested loop join.  I am using 'hydra' for testing
this patch.

CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM
generate_series(1, 10000000) g;

CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM
generate_series(1, 2000000) g;

Analyze t1;
Analyze t2;

Restart Server
Connect with psql

set enable_hashjoin=off;
set enable_mergejoin=off;
postgres=# Explain Analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1
AND t1.c1 BETWEEN 100000 AND 100100;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=3294864.21..3294864.21 rows=1 width=0) (actual time=42614.102..42614.102 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..3294864.16 rows=20 width=0) (actual time=4123.463..42614.084 rows=101 loops=1)
         Join Filter: (t1.c1 = t2.c1)
         Rows Removed by Join Filter: 201999899
         ->  Seq Scan on t2  (cost=0.00..30811.00 rows=2000000 width=4) (actual time=0.027..284.979 rows=2000000 loops=1)
         ->  Materialize  (cost=0.00..204053.41 rows=102 width=4) (actual time=0.000..0.008 rows=101 loops=2000000)
               ->  Seq Scan on t1  (cost=0.00..204052.90 rows=102 width=4) (actual time=13.920..2024.684 rows=101 loops=1)
                     Filter: ((c1 >= 100000) AND (c1 <= 100100))
                     Rows Removed by Filter: 9999899
 Planning time: 0.085 ms
 Execution time: 42614.135 ms

I have repeated the above statement 3 times and the above result is
median of 3 runs.

Restart Server
Connect with psql

set enable_hashjoin=off;
set enable_mergejoin=off;

set max_parallel_degree=4;

Explain Analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t1.c1
BETWEEN 100000 AND 100100;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1311396.47..1311396.48 rows=1 width=0) (actual time=45736.973..45736.973 rows=1 loops=1)
   ->  Gather  (cost=1000.00..1311396.42 rows=20 width=0) (actual time=709.083..45736.925 rows=101 loops=1)
         Number of Workers: 4
         ->  Nested Loop  (cost=0.00..1310394.42 rows=20 width=0) (actual time=436.460..11240.321 rows=20 loops=5)
               Join Filter: (t1.c1 = t2.c1)
               Rows Removed by Join Filter: 40399980
               ->  Parallel Seq Scan on t1  (cost=0.00..45345.09 rows=23 width=4) (actual time=425.178..425.232 rows=20 loops=5)
                     Filter: ((c1 >= 100000) AND (c1 <= 100100))
                     Rows Removed by Filter: 1999980
               ->  Seq Scan on t2  (cost=0.00..30811.00 rows=2000000 width=4) (actual time=0.011..270.986 rows=2000000 loops=101)
 Planning time: 0.115 ms
 Execution time: 45737.863 ms

I have repeated the above statement 3 times and the above result is
median of 3 runs.

The point to observe here is that the non-parallel case uses both less
execution time and less planning time to complete the statement.  There
is a considerable increase in planning time without any benefit in
execution.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] WIP: Rework access method interface

2015-12-14 Thread Petr Jelinek

On 2015-12-12 23:17, Alexander Korotkov wrote:

On Sat, Dec 12, 2015 at 9:21 PM, Petr Jelinek wrote:

On 2015-12-09 15:09, Alexander Korotkov wrote:

Patch was rebased against current master.
Any notes about current version of patch?
It would be nice to commit it and continue work on other parts of am
extendability.


The rebase seems broken, there are things missing in this version of
the patch (for example the validation functions).

Ooops, sorry. Correct version is attached.



Hi,

I went over this.

I get these compiler warnings about unused variables in the validation
functions:

brin.c: In function ‘brinvalidate’:
brin.c:94:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
   keytype;
   ^
ginutil.c: In function ‘ginvalidate’:
ginutil.c:86:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
   keytype;
   ^
gist.c: In function ‘gistvalidate’:
gist.c:101:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
   keytype;
   ^
hash.c: In function ‘hashvalidate’:
hash.c:103:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
   keytype;
   ^
nbtree.c: In function ‘btvalidate’:
nbtree.c:134:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
   keytype;
   ^
nbtree.c:133:6: warning: variable ‘intype’ set but not used [-Wunused-but-set-variable]
   intype,
   ^
spgutils.c: In function ‘spgvalidate’:
spgutils.c:88:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
   keytype;
   ^

These look like copy-pastos of boilerplate.

Another note is that the amvalidate SQL interface is not documented
anywhere. I know it's mainly meant for regression tests, and we don't
document hashing functions either, for example, but it's something to
think about/discuss maybe.


Other than that I am happy with the patch.

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Fixing warnings in back branches?

2015-12-14 Thread Tom Lane
Andres Freund  writes:
> On 2015-12-14 10:55:05 +, Greg Stark wrote:
>> Perhaps just adding some -Wno-* flags would make more sense than
>> changing code and possibly introducing bugs.

> I think that's a case-by-case decision. Just backpatching verbatim
> something that stewed in master for a year or two seems fine. That's imo
> often preferable because often it's just that existing warning
> categories grew more "vigilant", or however you want to describe it. So
> if you disable those, you also remove coverage...

Meh.  If we thought that anything like that was an actual bug, we should
have back-patched the fix when removing the warning in HEAD.  So I would
expect that all remaining warnings are just compiler nannyism, and thus
that fixing them is more likely to introduce bugs than do anything very
useful.

regards, tom lane




Re: [HACKERS] Another XML build issue

2015-12-14 Thread Kevin Grittner
On Mon, Dec 14, 2015 at 10:06 AM, Tom Lane  wrote:

> So at this point I'm guessing that Ubuntu has shipped an update that
> includes the patch Pavel Raiskup suggested in
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1286692#c4
>
> which would fix the lack of error cursors at EOF, but it would not affect
> any larger change such as stopping after the first detected error.
>
> If you can confirm that, then we'll need to decide what to do about it.

I've been looking at this diff, which seems to represent today's Ubuntu
libxml2 changes, and it seems rather different from what I
see in the Red Hat discussion.  In particular, I see no changes to
error.c.  I haven't really figured out the cause of the discrepancy
yet, but figured I would pass along the diff link.

http://launchpadlibrarian.net/229976362/libxml2_2.9.1%2Bdfsg1-3ubuntu4.5_2.9.1%2Bdfsg1-3ubuntu4.6.diff.gz

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fdw cleanup

2015-12-14 Thread Feng Tian
Hi, Albe,

Thank you.

The resource is not memory; I have something like a file descriptor or
network connection, which I believe is the common case for foreign tables.
Using RegisterXactCallback exposes an API at a much deeper level to the
foreign table writer, and I need to find a place to anchor the context
(fdw_state).  I took a look at postgres_fdw and found that an xact
callback is exactly what is done there.  So I will take this approach.

Again, thank you Albe, for pointing me to it.
Feng



On Mon, Dec 14, 2015 at 12:30 AM, Albe Laurenz wrote:

> Feng Tian wrote:
> > I need some help to understand foreign table error handling.
> >
> > For a query on foreign table, ExecInitForeignScan is called, which in
> turn calls the BeginForeignScan
> > hook.   Inside this hook, I allocated some resource,
> >
> >
> > node->fdw_state = allocate_resource(...);
> >
> > If everything goes well, ExecEndForeignScan will call call my
> EndForeignScan hook, inside the hook, I
> > free the resource.
> >
> > free_resource(node->fdw_state);
> >
> > However, if during the execution an error happened, seems to me that
> EndForeignScan will not be called
> > (traced using gdb).   So my question is, is Begin/End the right place
> for allocate/free resources?
> > If it is not, what is the right way to do this?
>
> If the resource is memory, use "palloc" to allocate it, and PostgreSQL
> will take care of it automatically.
>
> Other than that, you could call RegisterXactCallback() and register a
> callback
> that will be called upon transaction end.  In the callback function you can
> free the resource if it was not already freed by EndForeignScan.
>
> Yours,
> Laurenz Albe
>
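
For the archives, a minimal sketch of the approach settled on above.  The
state struct, function names and the close()d file descriptor are invented
for illustration; RegisterXactCallback() and the XactEvent values are the
real API from access/xact.h:

#include "postgres.h"
#include "access/xact.h"

#include <unistd.h>

/* Hypothetical per-scan state holding a non-memory resource. */
typedef struct MyFdwScanState
{
    int         fd;             /* file descriptor, network connection, ... */
    bool        released;
} MyFdwScanState;

/* Anchor for the callback, since it only receives its "arg" pointer. */
static MyFdwScanState *my_open_scan = NULL;

static void
my_fdw_xact_callback(XactEvent event, void *arg)
{
    /*
     * At transaction end, release the resource if EndForeignScan never
     * ran, which is what happens when the query errors out mid-scan.
     */
    if (event == XACT_EVENT_COMMIT || event == XACT_EVENT_ABORT)
    {
        if (my_open_scan != NULL && !my_open_scan->released)
        {
            close(my_open_scan->fd);
            my_open_scan->released = true;
        }
        my_open_scan = NULL;
    }
}

static void
my_fdw_ensure_callback_registered(void)
{
    static bool registered = false;

    /* Register once per backend, as postgres_fdw does for its connections. */
    if (!registered)
    {
        RegisterXactCallback(my_fdw_xact_callback, NULL);
        registered = true;
    }
}

BeginForeignScan would call my_fdw_ensure_callback_registered() and point
my_open_scan at the freshly allocated state; EndForeignScan would close the
descriptor and set released, making the callback a no-op on the normal path.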


Re: [HACKERS] Another XML build issue

2015-12-14 Thread Tom Lane
Kevin Grittner  writes:
> On Mon, Dec 14, 2015 at 10:06 AM, Tom Lane  wrote:
>> So at this point I'm guessing that Ubuntu has shipped an update that
>> includes the patch Pavel Raiskup suggested in
>> 
>> https://bugzilla.redhat.com/show_bug.cgi?id=1286692#c4
>> 
>> which would fix the lack of error cursors at EOF, but it would not affect
>> any larger change such as stopping after the first detected error.
>> 
>> If you can confirm that, then we'll need to decide what to do about it.

> I've been looking at this diff, which seems to represent the ubuntu
> libxml2 changes of today, and it seems rather different to what I
> see in the Red Hat discussion.  In particular, I see no changes to
> error.c.  I haven't really figured the cause of the discrepancy
> yet, but figured I would pass along the diff link.

> http://launchpadlibrarian.net/229976362/libxml2_2.9.1%2Bdfsg1-3ubuntu4.5_2.9.1%2Bdfsg1-3ubuntu4.6.diff.gz

Hmm.  What it looks like is that Ubuntu cherry-picked the libxml2 commit
that added xmlHaltParser(), but *not* the follow-on commit that changed
the behavior of xmlParserPrintFileContextInternal().  So that would
probably explain the behavior you're seeing ... though it also raises
the question of whether they've left themselves vulnerable to a null
pointer dereference in xmlParserPrintFileContextInternal().  It's not
clear to me whether that second commit is properly part of the CVE
response or not.  But it's Veillard's patch, and he included it in
Red Hat's libxml2 security update, so an informed guess would be "yes".

Regardless of that, it now seems likely to me that what you're seeing
is what the output will look like if the no-error-cursor-at-EOF problem
gets fixed, which is what we're hoping will happen.  So we had better
change the regression tests to accept this behavior too.

As I said, my inclination is to remove the SELECT xmlparse(document '')
test case altogether.  The following test SELECT xmlparse(document '   ')
seems like it provides as much useful coverage as we need for that,
and its output doesn't depend on whether libxml2 continues parsing
after the first error.

regards, tom lane




Re: [HACKERS] Another XML build issue

2015-12-14 Thread Kevin Grittner
On Mon, Dec 14, 2015 at 10:43 AM, Tom Lane  wrote:

> As I said, my inclination is to remove the SELECT xmlparse(document '')
> test case altogether.  The following test SELECT xmlparse(document '   ')
> seems like it provides as much useful coverage as we need for that,
> and its output doesn't depend on whether libxml2 continues parsing
> after the first error.

That's fine with me.  I can do that, if there are no objections.
(It is easy enough to find the affected lines in all 3 "expected"
files.)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Disabling an index temporarily

2015-12-14 Thread Corey Huinker
On Sun, Dec 13, 2015 at 11:03 PM, Tom Lane  wrote:

> Jeff Janes  writes:
> > Not to hijack the thread even further in the wrong direction, but I
> > think what Corey really wants here is to stop maintaining the index at
> > retail while preserving the existing definition and existing index
> > data, and then to do a wholesale fix-up, like what is done in the 2nd
> > half of a create index concurrently, upon re-enabling it.
>
> Meh.  Why not just drop the index?  I mean, yeah, you might save a few
> keystrokes when and if you ever re-enable it, but this sure seems like
> a feature in search of a use-case.
>
> regards, tom lane
>

Sorry, I misread Tatsu's initial post. I thought the disabling was for the
purpose of reducing overhead on large DML operations, not plan
experimentation.

Jeff's suggestion is one use-case. The work of discovering what indexes
exist on a table (because it may have changed since you last wrote that
code), saving those names and definitions to an intermediate table,
disabling them, doing the big DML operation, and then re-enabling them is
tedious and error prone, both in the coding of it and the error handling.
Leaving the index definitions in the data dictionary is one way to avoid
all that.


Re: [HACKERS] Logical replication and multimaster

2015-12-14 Thread Konstantin Knizhnik
I have updated the DTM page at the PostgreSQL wiki, adding information
about multimaster.
Also we have created a repository at GitHub with our
version of PostgreSQL and DTM extensions:
multimaster, pg_dtm, pg_tsdtm, bdr (sorry for the plagiarism, it is just a
toy, lightweight version of multimaster with asynchronous replication,
used to compare performance).




On 13.12.2015 15:46, Konstantin Knizhnik wrote:

On 12/13/2015 12:19 PM, Simon Riggs wrote:
On 6 December 2015 at 17:39, Konstantin Knizhnik wrote:


I have integrated pglogical_output in multimaster, using
bdr_apply from BDR as template for implementation of receiver part.


I didn't see the patch for this anywhere. Where is the code?


I am sorry, the code is now in our internal gitlab repository.
We have published pg_dtm and pg_tsdtm as separate repositories at
github.com/postgrespro.

They include the source of the plugin itself and a patch to the PostgreSQL
core. But we find this very inconvenient, because we also have to extend
the DTM API, adding new features such as deadlock detection...
So we are going to publish at github.com/postgrespro our branch of
PostgreSQL where pg_dtm, pg_tsdtm and multimaster will be available as
extensions in the contrib directory.  It will be available on Monday.




--
Simon Riggs http://www.2ndQuadrant.com/ 
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services






Re: [HACKERS] Disabling an index temporarily

2015-12-14 Thread Corey Huinker
On Sun, Dec 13, 2015 at 10:23 PM, Bill Moran wrote:

> On Sun, 13 Dec 2015 22:15:31 -0500
> Corey Huinker  wrote:
>
> > ALTER TABLE foo DISABLE [NONUNIQUE] INDEXES
> > -- same, but joining to pg_class and possibly filtering on indisunique
>
> I would think that NONUNIQUE should be the default, and you should have
> to specify something special to also disable unique indexes. Arguably,
> unique indexes are actually an implementation detail of unique
> constraints. Disabling a performance-based index doesn't cause data
> corruption, whereas disabling an index created as part of unique
> constraint can allow invalid data into the table.
>
> Just my $.02 ...
>
> --
> Bill Moran
>

I'd be fine swapping NONUNIQUE for ALL and defaulting to non-unique, or
flatly enforcing a rule that it won't disable the index required by an
enabled constraint.


Re: [HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-14 Thread Merlin Moncure
On Mon, Dec 14, 2015 at 2:50 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Tom Lane wrote:
>>> Quick followup: rl_resize_terminal() exists in GNU readline at least as
>>> far back as 4.0 (released Feb 1999).  However, it doesn't seem to be there
>>> at all in libedit; I don't see it in OS X Yosemite's headers, anyway.
>>> So we'd need a configure test for this.
>
>> In libedit (NetBSD's at least) there is el_resize() which seems to do
>> the same thing.
>
> Hmm.  I see this in OS X's histedit.h:
>
> voidel_resize(EditLine *);
>
> but it appears that this is part of a completely separate API with
> essentially nothing in common with GNU readline.  Not sure if we have
> the motivation to try to support that API in parallel with readline's.
> I sure don't.

This may be moot; some testing demonstrated that libedit was not
impacted so it really comes down to having the right readline api call
available.  Looking at the code ISTM that libedit resets the terminal
on every prompt.

Also, after some experimentation I had better luck with
rl_reset_screen_size() (vs rl_resize_terminal()), which seemed to give
more regular behavior with the prompt.  So the consolidated patch with
the configure check is attached.

I'll leave it to the powers that be in terms of how and when to apply
this.  My gut is that it could probably be patched in now, but I'd feel
more comfortable with more testing than my own perfunctory probing.

merlin
diff --git a/configure b/configure
index 660aa57..2f0fee6 100755
--- a/configure
+++ b/configure
@@ -12415,7 +12415,7 @@ if test x"$pgac_cv_var_rl_completion_append_character" = x"yes"; then
 $as_echo "#define HAVE_RL_COMPLETION_APPEND_CHARACTER 1" >>confdefs.h
 
 fi
-  for ac_func in rl_completion_matches rl_filename_completion_function
+  for ac_func in rl_completion_matches rl_filename_completion_function rl_reset_screen_size
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.in b/configure.in
index 419e3d3..f3d926d 100644
--- a/configure.in
+++ b/configure.in
@@ -1545,7 +1545,7 @@ LIBS="$LIBS_including_readline"
 
 if test "$with_readline" = yes; then
   PGAC_VAR_RL_COMPLETION_APPEND_CHARACTER
-  AC_CHECK_FUNCS([rl_completion_matches rl_filename_completion_function])
+  AC_CHECK_FUNCS([rl_completion_matches rl_filename_completion_function rl_reset_screen_size])
   AC_CHECK_FUNCS([append_history history_truncate_file])
 fi
 
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 655850b..0f17826 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -27,6 +27,7 @@
 #include "common.h"
 #include "mbprint.h"
 #include "print.h"
+#include "input.h"
 
 /*
  * We define the cancel_pressed flag in this file, rather than common.c where
@@ -2247,6 +2251,13 @@ ClosePager(FILE *pagerpipe)
 #ifndef WIN32
 		pqsignal(SIGPIPE, SIG_DFL);
 #endif
+#ifdef HAVE_RL_RESET_SCREEN_SIZE
+		/* 
+		 * Force libreadline to recheck the terminal size in case the pager
+		 * may have handled any terminal resize events.
+		 */
+		rl_reset_screen_size();
+#endif
 	}
 }
 
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 6ce5907..8aa182c 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -406,6 +406,9 @@
 /* Define to 1 if you have the `rl_filename_completion_function' function. */
 #undef HAVE_RL_FILENAME_COMPLETION_FUNCTION
 
+/* Define to 1 if you have the `rl_reset_screen_size' function. */
+#undef HAVE_RL_RESET_SCREEN_SIZE
+
 /* Define to 1 if you have the  header file. */
 #undef HAVE_SECURITY_PAM_APPL_H
 



Re: [HACKERS] Uninterruptible slow geo_ops.c

2015-12-14 Thread Alvaro Herrera
Alvaro Herrera wrote:

> While I'm not familiar with the code itself, and can't post the exact
> slow query just yet, I have noticed that it is missing a
> CHECK_FOR_INTERRUPTS() call to enable cancelling the slow query.  I'd
> backpatch this all the way back.  (The exact issue they hit is mutual
> recursion between touched_lseg_between_poly and lseg_between_poly.
> Since the latter also recurses on itself, the best way forward seems to
> be adding a check for interrupts in the loop there.)

Pushed this part.
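
For readers following along, the shape of such a fix is the standard idiom
below; the surrounding function is a made-up stand-in, but
CHECK_FOR_INTERRUPTS() from miscadmin.h is the real mechanism:

#include "postgres.h"
#include "miscadmin.h"

/*
 * Sketch: a loop that can run for a long time (here standing in for the
 * mutually recursive polygon/segment walk) checks for interrupts on each
 * iteration, so query cancel and statement_timeout can take effect.
 */
static void
long_running_loop_sketch(int nitems)
{
    int         i;

    for (i = 0; i < nitems; i++)
    {
        CHECK_FOR_INTERRUPTS();     /* ereport()s away on a pending cancel */
        /* ... expensive per-item work, possibly recursing ... */
    }
}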

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-14 Thread Alvaro Herrera
Merlin Moncure wrote:
> On Tue, Dec 8, 2015 at 2:33 PM, Merlin Moncure  wrote:
> > On Tue, Dec 8, 2015 at 2:02 PM, Tom Lane  wrote:
> >> I wrote:
> >>> Merlin Moncure  writes:
>  The following patch deals with a long standing gripe of mine that the
>  terminal frequently gets garbled so that when typing.
> >>
> >>> Hm.  I wonder whether rl_resize_terminal() exists in every iteration
> >>> of libreadline and libedit.
> >>
> >> Quick followup: rl_resize_terminal() exists in GNU readline at least as
> >> far back as 4.0 (released Feb 1999).  However, it doesn't seem to be there
> >> at all in libedit; I don't see it in OS X Yosemite's headers, anyway.
> >> So we'd need a configure test for this.
> >
> > All right, I guess this answers the question I was thinking, 'can this
> > be backpatched?' (basically, now).
> 
> er, meant to say, 'no'.

Why not?  We don't forbid adding configure tests in minor releases, do
we?

I've been troubled by this many times, so thanks for finding the problem
and fixing it.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Function and view to retrieve WAL receiver status

2015-12-14 Thread Gurjeet Singh
On Sun, Dec 13, 2015 at 10:15 PM, Michael Paquier  wrote:

> On Mon, Dec 14, 2015 at 3:09 PM, Gurjeet Singh 
> wrote:
> > On Dec 13, 2015 9:56 PM, "Michael Paquier" 
> > wrote:
> >> If the node has no WAL receiver active, a tuple with NULL values is
> >> returned instead.
> >
> > IMO, in the absence of a WAL receiver the SRF (and the view) should not
> > return any rows.
>
> The whole point is to not use a SRF in this case: there is always at
> most one WAL receiver.
>

The function, maybe. But emitting an all-nulls row from a view seems
counter-intuitive, at least when looking at it in context of relational
database.

-- 
Gurjeet Singh http://gurjeet.singh.im/


[HACKERS] PATCH: postpone building buckets to the end of Hash (in HashJoin)

2015-12-14 Thread Tomas Vondra

Hi,

attached is v1 of one of the hashjoin improvements mentioned in 
September in the lengthy thread [1].


The main objection against simply removing the MaxAllocSize check (and 
switching to MemoryContextAllocHuge) is that if the number of rows is 
overestimated, we may consume significantly more memory than necessary.


We run into this issue because we allocate the buckets at the very 
beginning, based on the estimate. I've noticed we don't really need to 
do that - we don't really use the buckets until after the Hash node 
completes, and we don't even use them when incrementing the number of 
batches (we use the dense allocation for that).


So this patch removes this - it postpones allocating the buckets to the 
end of MultiExecHash(), and at that point we know pretty well what is 
the optimal number of buckets.


This makes tracking nbuckets_optimal/log2_nbuckets_optimal unnecessary, 
as we can simply use nbuckets/log2_nbuckets for that purpose. I've also 
removed nbuckets_original, but maybe that's not a good idea and we want 
to keep that information (OTOH we never use that number of buckets).


This patch does not change the estimation in ExecChooseHashTableSize() 
at all, because we still need to do that to get nbucket/nbatch. Maybe 
this is another opportunity for improvement in case of overestimates, 
because in that case it may happen that we do batching even when we 
could do without it. So we might start with nbuckets=1024 and 
nbatches=1, and only switch to the estimated number of batches if really 
needed.


This patch also does not mess with the allocation, i.e. it still uses 
the MaxAllocSize limit (which amounts to ~256MB due to the doubling, 
IIRC), but it should make it easier to do that change.


[1] http://www.postgresql.org/message-id/flat/19746.1443983...@sss.pgh.pa.us

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 12dae77..8fd9c9f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -2116,21 +2116,17 @@ show_hash_info(HashState *hashstate, ExplainState *es)
 		if (es->format != EXPLAIN_FORMAT_TEXT)
 		{
 			ExplainPropertyLong("Hash Buckets", hashtable->nbuckets, es);
-			ExplainPropertyLong("Original Hash Buckets",
-hashtable->nbuckets_original, es);
 			ExplainPropertyLong("Hash Batches", hashtable->nbatch, es);
 			ExplainPropertyLong("Original Hash Batches",
 hashtable->nbatch_original, es);
 			ExplainPropertyLong("Peak Memory Usage", spacePeakKb, es);
 		}
-		else if (hashtable->nbatch_original != hashtable->nbatch ||
- hashtable->nbuckets_original != hashtable->nbuckets)
+		else if (hashtable->nbatch_original != hashtable->nbatch)
 		{
 			appendStringInfoSpaces(es->str, es->indent * 2);
 			appendStringInfo(es->str,
-			 "Buckets: %d (originally %d)  Batches: %d (originally %d)  Memory Usage: %ldkB\n",
+			 "Buckets: %d Batches: %d (originally %d)  Memory Usage: %ldkB\n",
 			 hashtable->nbuckets,
-			 hashtable->nbuckets_original,
 			 hashtable->nbatch,
 			 hashtable->nbatch_original,
 			 spacePeakKb);
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 5e05ec3..b0cf97d 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -39,7 +39,7 @@
 
 
 static void ExecHashIncreaseNumBatches(HashJoinTable hashtable);
-static void ExecHashIncreaseNumBuckets(HashJoinTable hashtable);
+static void ExecHashBuildBuckets(HashJoinTable hashtable);
 static void ExecHashBuildSkewHash(HashJoinTable hashtable, Hash *node,
 	  int mcvsToUse);
 static void ExecHashSkewTableInsert(HashJoinTable hashtable,
@@ -129,9 +129,8 @@ MultiExecHash(HashState *node)
 		}
 	}
 
-	/* resize the hash table if needed (NTUP_PER_BUCKET exceeded) */
-	if (hashtable->nbuckets != hashtable->nbuckets_optimal)
-		ExecHashIncreaseNumBuckets(hashtable);
+	/* Construct the actual hash table (using the optimal number of buckets). */
+	ExecHashBuildBuckets(hashtable);
 
 	/* Account for the buckets in spaceUsed (reported in EXPLAIN ANALYZE) */
 	hashtable->spaceUsed += hashtable->nbuckets * sizeof(HashJoinTuple);
@@ -283,10 +282,7 @@ ExecHashTableCreate(Hash *node, List *hashOperators, bool keepNulls)
 	 */
 	hashtable = (HashJoinTable) palloc(sizeof(HashJoinTableData));
 	hashtable->nbuckets = nbuckets;
-	hashtable->nbuckets_original = nbuckets;
-	hashtable->nbuckets_optimal = nbuckets;
 	hashtable->log2_nbuckets = log2_nbuckets;
-	hashtable->log2_nbuckets_optimal = log2_nbuckets;
 	hashtable->buckets = NULL;
 	hashtable->keepNulls = keepNulls;
 	hashtable->skewEnabled = false;
@@ -372,18 +368,11 @@ ExecHashTableCreate(Hash *node, List *hashOperators, bool keepNulls)
 	}
 
 	/*
-	 * Prepare context for the first-scan space allocations; allocate the
-	 * hashbucket array therein, and set each bucket 
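
(The diff is truncated by the archive here.)  As a sketch of what the new
ExecHashBuildBuckets() has to do at the end of MultiExecHash() -- modeled
on the existing ExecHashIncreaseNumBuckets() logic in nodeHash.c, so the
details below are an approximation rather than the patch itself:

static void
ExecHashBuildBuckets(HashJoinTable hashtable)
{
    HashMemoryChunk chunk;

    /* Allocate the bucket array, now that the tuple count is known. */
    hashtable->buckets = (HashJoinTuple *)
        MemoryContextAlloc(hashtable->batchCxt,
                           hashtable->nbuckets * sizeof(HashJoinTuple));
    memset(hashtable->buckets, 0,
           hashtable->nbuckets * sizeof(HashJoinTuple));

    /* Scan all tuples in all dense-allocated chunks, bucketing each one. */
    for (chunk = hashtable->chunks; chunk != NULL; chunk = chunk->next)
    {
        size_t      idx = 0;

        while (idx < chunk->used)
        {
            HashJoinTuple hashTuple = (HashJoinTuple) (chunk->data + idx);
            int         bucketno;
            int         batchno;

            ExecHashGetBucketAndBatch(hashtable, hashTuple->hashvalue,
                                      &bucketno, &batchno);

            /* Prepend the tuple to its bucket's chain. */
            hashTuple->next = hashtable->buckets[bucketno];
            hashtable->buckets[bucketno] = hashTuple;

            /* Advance past this tuple to the next one in the chunk. */
            idx += MAXALIGN(HJTUPLE_OVERHEAD +
                            HJTUPLE_MINTUPLE(hashTuple)->t_len);
        }
    }
}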

Re: [HACKERS] [PoC] Asynchronous execution again (which is not parallel)

2015-12-14 Thread Kyotaro HORIGUCHI
Thank you a lot!

At Mon, 14 Dec 2015 17:51:41 +0900, Amit Langote wrote in <566e831d.1050...@lab.ntt.co.jp>
> 
> Hi,
> 
> On 2015/12/14 17:34, Kyotaro HORIGUCHI wrote:
> > At Tue, 8 Dec 2015 10:40:20 -0500, Robert Haas  wrote
> >> But is it important enough to be worthwhile?  Maybe, maybe not.  I
> >> think we should be working toward a world where the Gather is at the
> >> top of the plan tree as often as possible, in which case
> >> asynchronously kicking off a Gather node won't be that exciting any
> >> more - see notes on the "parallelism + sorting" thread where I talk
> >> about primitives that would allow massively parallel merge joins,
> >> rather than 2 or 3 way parallel. 
> > 
> > Could you give me the subject of the thread? Or important message
> > of that.
> 
> I think that would be the following thread:
> 
> * parallelism and sorting *
> http://www.postgresql.org/message-id/ca+tgmoyh4zsqmgqiyra7zo1rbbvg1qhn1fjt5q0fpw+q0xa...@mail.gmail.com

Thank you for the pointer. I'll read it.

# It's hard for me to do eyeball-grepping on English texts..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Remove array_nulls?

2015-12-14 Thread Jim Nasby

On 12/11/15 2:57 PM, Tom Lane wrote:

Jim Nasby  writes:

A quick doc search indicates this config was created in 9.0, though the
docs state it's for a change that happened in 8.2[1].


Don't know what you're looking at, but the GUC is definitely there (and
documented) in 8.2.


Is it time to remove this GUC?


Perhaps, but I'd like to have a less ad-hoc process about it.  What's
our policy for dropping backwards-compatibility GUCs?  Are there any
others that should be removed now as well?


Perhaps it should be tied to bumping the major version number, which I'm 
guessing would happen next whenever we get parallel query execution. If 
we do that, a reasonable policy might be that a compatibility GUC lives 
across no more than 1 major version bump (ie, we wouldn't remove 
something in 9.0 that was added in 8.4).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] parallel joins, and better parallel explain

2015-12-14 Thread Paul Ramsey
On Wed, Dec 2, 2015 at 1:55 PM, Robert Haas  wrote:

> Oops.  The new version I've attached should fix this.

I've been trying to see if parallel join has any effect on PostGIS
spatial join queries, which are commonly CPU bound. (My tests [1] on
simple parallel scan were very positive, though quite limited in that
they only parallelized such a small part of the work).

Like Amit, I found the current patches are missing a change to
src/include/nodes/relation.h, but just adding in "Relids
extra_lateral_rels" to JoinPathExtraData allowed a warning-free build.

The assumption in the parallel code generally is that setting up
parallel workers is very costly compared to the work to be done, so to
get PostGIS code to parallelize I've been reduced to shoving
parallel_setup_cost very low (1) and parallel_tuple_cost as well.
Otherwise I just end up with ordinary plans. I did redefine all the
relevant functions as "parallel safe" and upped their declared costs
significantly.

I set up a 8000 record spatial table, with a spatial index, and did a
self-join on it.

explain analyze
select a.gid, b.gid from vada a join vada b
on st_intersects(a.geom, b.geom)
where a.gid != b.gid;

With no parallelism, I got this:

set max_parallel_degree = 0;

  QUERY PLAN
--
 Nested Loop  (cost=0.15..227332.48 rows=1822243 width=8) (actual
time=0.377..5528.461 rows=52074 loops=1)
   ->  Seq Scan on vada a  (cost=0.00..1209.92 rows=8792 width=1189)
(actual time=0.027..5.004 rows=8792 loops=1)
   ->  Index Scan using vada_gix on vada b  (cost=0.15..25.71 rows=1
width=1189) (actual time=0.351..0.622 rows=6 loops=8792)
 Index Cond: (a.geom && geom)
 Filter: ((a.gid <> gid) AND _st_intersects(a.geom, geom))
 Rows Removed by Filter: 3
 Planning time: 3.976 ms
 Execution time: 5533.573 ms


With parallelism, I got this:

   QUERY PLAN
-
 Nested Loop  (cost=0.15..226930.05 rows=1822243 width=8) (actual
time=0.840..5462.029 rows=52074 loops=1)
   ->  Gather  (cost=0.00..807.49 rows=8792 width=1189) (actual
time=0.335..39.326 rows=8792 loops=1)
 Number of Workers: 1
 ->  Parallel Seq Scan on vada a  (cost=0.00..806.61 rows=5861
width=1189) (actual time=0.015..10.167 rows=4396 loops=2)
   ->  Index Scan using vada_gix on vada b  (cost=0.15..25.71 rows=1
width=1189) (actual time=0.353..0.609 rows=6 loops=8792)
 Index Cond: (a.geom && geom)
 Filter: ((a.gid <> gid) AND _st_intersects(a.geom, geom))
 Rows Removed by Filter: 3
 Planning time: 4.019 ms
 Execution time: 5467.126 ms

Given that it's a CPU-bound process, I was hoping for something closer
to the results of the sequence tests, about 50% time reduction, based
on the two cores in my test machine.

In general either the parallel planner is way too conservative (it
seems), or we need to radically increase the costs of our PostGIS
functions (right now, most "costly" functions are cost 100, but I had
to push costs up into the 10 range to get parallelism to kick in
sometimes). Some guidelines on cost setting would be useful, something
that says, "this function run against this kind of data is cost level
1, compare the time your function takes on 'standard' data to the
baseline function to get a cost ratio to use in the function
definition"

ATB,

P.



[1]  https://gist.github.com/pramsey/84e7a39d83cccae692f8




Re: [HACKERS] Proposal: custom compression methods

2015-12-14 Thread Andres Freund
On 2015-12-14 14:50:57 +0800, Craig Ringer wrote:
> http://www.postgresql.org/message-id/flat/20130615102028.gk19...@alap2.anarazel.de#20130615102028.gk19...@alap2.anarazel.de

> The issue with per-Datum is that TOAST claims two bits of a varlena header,
> which already limits us to 1 GiB varlena values, something people are
> starting to find to be a problem. There's no wiggle room to steal more
> bits. If you want pluggable compression you need a way to store knowledge
> of how a given datum is compressed with the datum or have a fast, efficient
> way to check.
>
> pg_upgrade means you can't just redefine the current toast bits so the
> compressed bit means "data is compressed, check first byte of varlena data
> for algorithm" because existing data won't have that, the first byte will
> be the start of the compressed data stream.

I don't think there's an actual problem here. My old patch that you
referenced solves this.

Andres




Re: [HACKERS] Another XML build issue

2015-12-14 Thread Kevin Grittner
On Mon, Dec 14, 2015 at 10:56 AM, Kevin Grittner  wrote:
> On Mon, Dec 14, 2015 at 10:43 AM, Tom Lane  wrote:
>
>> As I said, my inclination is to remove the SELECT xmlparse(document '')
>> test case altogether.  The following test SELECT xmlparse(document '   ')
>> seems like it provides as much useful coverage as we need for that,
>> and its output doesn't depend on whether libxml2 continues parsing
>> after the first error.
>
> That's fine with me.  I can do that, if there are no objections.
> (It is easy enough to find the affected lines in all 3 "expected"
> files.)

This didn't seem terribly controversial, so pushed.

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: custom compression methods

2015-12-14 Thread Jim Nasby

On 12/14/15 12:50 AM, Craig Ringer wrote:

The issue with per-Datum is that TOAST claims two bits of a varlena
header, which already limits us to 1 GiB varlena values, something
people are starting to find to be a problem. There's no wiggle room to
steal more bits. If you want pluggable compression you need a way to
store knowledge of how a given datum is compressed with the datum or
have a fast, efficient way to check.


... unless we allowed for 8 byte varlena headers. Compression changes 
themselves certainly don't warrant that, but if people are already 
unhappy with 1GB TOAST then maybe that's enough.


The other thing this might buy us is a few bits that could be used to 
support Datum versioning for other purposes, such as when the binary 
format of something changes. I would think that at some point we'll need 
that for pg_upgrade.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] [PATCH] Logical decoding support for sequence advances

2015-12-14 Thread Craig Ringer
On 14 December 2015 at 11:28, Craig Ringer  wrote:

> Hi all
>
> Attached is a patch against 9.6 to add support for informing logical
> decoding plugins of the new sequence last_value when sequence advance WAL
> records are processed during decoding.
>


Attached a slightly updated version. It just has less spam in the
regression tests, by adding a new option to test_decoding to show
sequences, which it doesn't enable except in sequence specific tests.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 8ee88b0850fea17086a293c8afee83af3a6b95e1 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 14 Dec 2015 15:07:30 +0800
Subject: [PATCH] Logical decoding for sequence advances

Add a new logical decoding callback seq_advance_cb that's invoked
whenever we decode a new sequence chunk allocation from WAL.

Also add support for it in test_decoding.

It's guaranteed that the last_value passed to the callback is equal to
or greater than (for +ve sequences) any value returned by a nextval()
call visible to any transaction committed at or before the callback is
invoked.

In practice it'll be quite a lot larger because PostgreSQL
preallocates large chunks of sequences to minimise WAL writes, at
the cost of wasting values if it has to do crash recovery. Logical
decoding sees the crash recovery position when it gets advanced.

Needed to make logical replication based failover possible.
---
 contrib/test_decoding/Makefile  |   2 +-
 contrib/test_decoding/expected/sequence.out | 108 
 contrib/test_decoding/sql/sequence.sql  |  35 +
 contrib/test_decoding/test_decoding.c   |  33 +
 src/backend/replication/logical/decode.c|  60 +++-
 src/backend/replication/logical/logical.c   |  28 +++-
 src/include/replication/logical.h   |   3 +
 src/include/replication/output_plugin.h |  16 +
 8 files changed, 282 insertions(+), 3 deletions(-)
 create mode 100644 contrib/test_decoding/expected/sequence.out
 create mode 100644 contrib/test_decoding/sql/sequence.sql

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index a362e69..91e2765 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -38,7 +38,7 @@ submake-test_decoding:
 	$(MAKE) -C $(top_builddir)/contrib/test_decoding
 
 REGRESSCHECKS=ddl rewrite toast permissions decoding_in_xact decoding_into_rel \
-	binary prepared replorigin
+	binary prepared replorigin sequence
 
 regresscheck: | submake-regress submake-test_decoding temp-install
 	$(MKDIR_P) regression_output
diff --git a/contrib/test_decoding/expected/sequence.out b/contrib/test_decoding/expected/sequence.out
new file mode 100644
index 000..a13d29a
--- /dev/null
+++ b/contrib/test_decoding/expected/sequence.out
@@ -0,0 +1,108 @@
+SET synchronous_commit = on;
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');
+ ?column? 
+--
+ init
+(1 row)
+
+CREATE SEQUENCE test_seq;
+SELECT sum(nextval('test_seq')) FROM generate_series(1, 100);
+ sum  
+--
+ 5050
+(1 row)
+
+ALTER SEQUENCE test_seq RESTART WITH 2;
+SELECT sum(nextval('test_seq')) FROM generate_series(1, 100);
+ sum  
+--
+ 5150
+(1 row)
+
+ALTER SEQUENCE test_seq INCREMENT BY 5;
+SELECT sum(nextval('test_seq')) FROM generate_series(1, 100);
+  sum  
+---
+ 35350
+(1 row)
+
+ALTER SEQUENCE test_seq CACHE 1;
+SELECT sum(nextval('test_seq')) FROM generate_series(1, 15000);
+sum
+---
+ 571552500
+(1 row)
+
+DROP SEQUENCE test_seq;
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'include-sequences', '1');
+ data 
+--
+ sequence test_seq advanced to 1
+ sequence test_seq advanced to 33
+ sequence test_seq advanced to 66
+ sequence test_seq advanced to 99
+ sequence test_seq advanced to 132
+ sequence test_seq advanced to 2
+ sequence test_seq advanced to 34
+ sequence test_seq advanced to 67
+ sequence test_seq advanced to 100
+ sequence test_seq advanced to 133
+ sequence test_seq advanced to 101
+ sequence test_seq advanced to 266
+ sequence test_seq advanced to 431
+ sequence test_seq advanced to 596
+ sequence test_seq advanced to 761
+ sequence test_seq advanced to 601
+ sequence test_seq advanced to 50761
+ sequence test_seq advanced to 100761
+(18 rows)
+
+CREATE SEQUENCE test2_seq MINVALUE 100 MAXVALUE 200 START 200 INCREMENT BY -1 CYCLE;
+SELECT sum(nextval('test2_seq')) FROM generate_series(1, 300);
+  sum  
+---
+ 45147
+(1 row)
+
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'include-sequences', '1');
+ data 

Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-12-14 Thread Daniel Verite
Pavel Stehule wrote:

> postgres=# \crosstabview 4 +month label
> 
> Maybe using optional int order column instead label is better - then you can
> do sort on client side
> 
> so the syntax can be "\crosstabview VCol [+/-]HCol [[+-]HOrderCol]

In the meantime I've followed a different idea: allowing the 
vertical header to be sorted too, still server-side.

That's because to me, the first impulse for a user noticing that
it's not sorted vertically would be to write
 \crosstabview +customer month
rather than figure out the
 \crosstabview customer +month_number month_name
invocation.
But both ways aren't even mutually exclusive. We could support
 \crosstabview [+|-]colV[:labelV] [+|-]colH[:labelH]
That's more complicated to understand, but not harder to implement.

Also, a non-zero FETCH_COUNT is supported by this version of the patch,
if the first internal FETCH retrieves fewer than FETCH_COUNT rows.
Otherwise a specific error is emitted.

Also, there are minor changes in arguments and callers following
recent code changes for \o.

Trying to crosstab with 10k+ distinct values vertically, I've noticed
that the current code is too slow, spending too much time
sorting.  I'm currently replacing its simple arrays of distinct values
with AVL binary trees, which I expect to be much more efficient for
this.
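
For illustration, here is a minimal C sketch of the find-or-add part of
that idea: an ordered binary tree keyed on the distinct value, giving
O(log n) lookups instead of a linear array scan. A real AVL tree would
additionally rebalance on insert; that part, and all of the names below,
are illustrative rather than taken from the patch.

  #include <stdlib.h>
  #include <string.h>

  typedef struct DistinctNode
  {
      char   *value;            /* distinct column value */
      int     index;            /* position assigned on first appearance */
      struct DistinctNode *left, *right;
  } DistinctNode;

  /* Return the index of value, inserting it with *next_index if absent. */
  static int
  find_or_add(DistinctNode **root, const char *value, int *next_index)
  {
      DistinctNode *node = *root;
      int     cmp;

      if (node == NULL)
      {
          node = malloc(sizeof(DistinctNode));
          node->value = strdup(value);
          node->index = (*next_index)++;
          node->left = node->right = NULL;
          *root = node;
          return node->index;
      }

      cmp = strcmp(value, node->value);
      if (cmp == 0)
          return node->index;
      return find_or_add(cmp < 0 ? &node->left : &node->right,
                         value, next_index);
  }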

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e4f72a8..2a998b2 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2449,6 +2449,95 @@ lo_import 152801
 
   
 
+  
+\crosstabview [ [-|+]colV  [-|+]colH ] 
+
+
+Execute the current query buffer (like \g) and show the results
+inside a crosstab grid. The output column colV
+becomes a vertical header and the output column
+colH becomes a horizontal header.
+The results for the other output columns are projected inside the grid.
+
+
+
+colV
+and colH can indicate a
+column position (starting at 1), or a column name. Normal case folding
+and quoting rules apply on column names. By default,
+colV is column 1
+and colH is column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the set of all distinct values found in
+column colV, in the order
+of their first appearance in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the set of all distinct non-null values found in
+column colH.  They come
+by default in their order of appearance in the query results, or in ascending
+order if a plus (+) sign precedes colH,
+or in descending order if it's a minus (-) sign.
+
+
+
+The query results being tuples of N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+a cell located at the intersection (x,y) in the grid
+has contents determined by these rules:
+
+
+
+ if there is no corresponding row in the results such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns are displayed in the cell, separated from each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed in the cell as
+ if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to the
+ single-row case, except that the values coming from different rows are
+ stacked vertically, rows being separated by newline characters inside
+ the same cell.
+
+
+
+
+
+
+
+  
+
 
   
 \s [ filename ]
diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index f1336d5..9cb0c4a 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -23,7 +23,7 @@ override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(top_srcdir)/src/bin/p
 OBJS=	command.o common.o help.o input.o stringutils.o mainloop.o copy.o \
 	startup.o prompt.o variables.o large_obj.o print.o describe.o \
 	tab-complete.o mbprint.o dumputils.o keywords.o kwlookup.o \
-	sql_help.o \
+	sql_help.o crosstabview.o \
 	$(WIN32RES)
 
 
diff --git 

Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-12-14 Thread Daniel Verite
Pavel Stehule wrote:


> here is patch - supported syntax: \crosstabview VCol [+/-]HCol [HOrderCol]
> 
> Order column should to contains any numeric value. Values are sorted on
> client side

If I understand correctly, I see a problem with HOrderCol.

If the vertical header consists of, for example, a series of
event names, and it should be sorted by event date, then
the requirement of HOrderCol being strictly numeric is
problematic, in a way that the previous proposal was not, isn't it?

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-14 Thread Alvaro Herrera
Tom Lane wrote:
> I wrote:
> > Merlin Moncure  writes:
> >> The following patch deals with a long standing gripe of mine that the
> >> terminal frequently gets garbled when typing after the window has been
> >> resized while the pager was open.
> 
> > Hm.  I wonder whether rl_resize_terminal() exists in every iteration
> > of libreadline and libedit.
> 
> Quick followup: rl_resize_terminal() exists in GNU readline at least as
> far back as 4.0 (released Feb 1999).  However, it doesn't seem to be there
> at all in libedit; I don't see it in OS X Yosemite's headers, anyway.
> So we'd need a configure test for this.

In libedit (NetBSD's at least) there is el_resize() which seems to do
the same thing.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2015-12-14 Thread Paul Ramsey
On Thu, Dec 10, 2015 at 10:42 PM, Haribabu Kommi
 wrote:
>
> Here I attached a POC patch of parallel aggregate based on combine
> aggregate patch. This patch contains the combine aggregate changes
> also. This patch generates and executes the parallel aggregate plan
> as discussed in earlier threads.

I tried this out using PostGIS with no great success. I used a very
simple aggregate for geometry union, because the combine function is
just the same as the transfer function for this case (I also marked
ST_Area() as parallel safe, so that the planner will attempt to
parallelize the query).

  CREATE AGGREGATE ST_MemUnion (
   basetype = geometry,
   sfunc = ST_Union,
   cfunc = ST_Union,
   stype = geometry
  );

Unfortunately attempting a test causes memory corruption and death.

  select riding,
  st_area(st_memunion(geom))
  from vada group by riding;

The explain looks OK:

  QUERY PLAN
---
 Finalize HashAggregate  (cost=220629.47..240380.26 rows=79 width=1189)
   Group Key: riding
   ->  Gather  (cost=0.00..807.49 rows=8792 width=1189)
 Number of Workers: 1
 ->  Partial HashAggregate  (cost=220628.59..220629.38 rows=79
width=1189)
   Group Key: riding
   ->  Parallel Seq Scan on vada  (cost=0.00..806.61
rows=8792 width=1189)

But the run dies.

NOTICE:  SRID value -32897 converted to the officially unknown SRID value 0
ERROR:  Unknown geometry type: 2139062143 - Invalid type

From the message it looks like geometry gets corrupted at some point,
causing a read to fail on very screwed up metadata.

P.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-14 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Quick followup: rl_resize_terminal() exists in GNU readline at least as
>> far back as 4.0 (released Feb 1999).  However, it doesn't seem to be there
>> at all in libedit; I don't see it in OS X Yosemite's headers, anyway.
>> So we'd need a configure test for this.

> In libedit (NetBSD's at least) there is el_resize() which seems to do
> the same thing.

Hmm.  I see this in OS X's histedit.h:

void    el_resize(EditLine *);

but it appears that this is part of a completely separate API with
essentially nothing in common with GNU readline.  Not sure if we have
the motivation to try to support that API in parallel with readline's.
I sure don't.
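
For reference, once a configure test supplies a guard symbol, the call
itself is trivial on the GNU readline side. A minimal sketch, assuming a
hypothetical HAVE_RL_RESIZE_TERMINAL symbol defined by such a test
(libedit builds would simply skip it):

  #ifdef HAVE_RL_RESIZE_TERMINAL
      /* after the pager exits, tell readline the window may have resized */
      rl_resize_terminal();
  #endif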

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench stats per script & other stuff

2015-12-14 Thread Fabien COELHO



-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for -b simple-update@1.



I don't think it is a good idea to remove entirely the description of
what the default scenarios can do. The description would be better at
the bottom in some  with a list of each default test and what to
expect from them.


I'm trying to avoid having the same explanation twice; otherwise someone
is bound to complain.



+/* data structure to hold various statistics.
+ * it is used for interval statistics as well as file statistics.
 */
Nitpick: this is not a comment formatted the Postgres-way.


Indeed.


This is surprisingly broken:
$ pgbench -i
some of the specified options cannot be used in initialization (-i) mode


Hmmm.


Any file name or path including "@" will fail strangely:
$ pgbench -f "test@1.sql"
could not open file "test": No such file or directory
empty commands for test
Perhaps instead of failing we should warn the user and enforce the
weight to be set at 1?


Yep, I can have a look at that.
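
For instance, a minimal sketch of a more forgiving parser, along the lines
suggested above (the function name and messages are illustrative, not from
the patch): treat the suffix after the last '@' as a weight only if it is a
valid non-negative integer, otherwise warn and keep the whole argument as
the file name with weight 1.

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  /* Split "name@weight" into a file name and a weight; default to 1. */
  static int
  parse_script_weight(char *arg, char **filename)
  {
      char   *sep = strrchr(arg, '@');
      char   *end;
      long    weight;

      *filename = arg;
      if (sep == NULL)
          return 1;                 /* no weight suffix at all */

      weight = strtol(sep + 1, &end, 10);
      if (end == sep + 1 || *end != '\0' || weight < 0)
      {
          /* not a valid weight: warn, keep '@' as part of the file name */
          fprintf(stderr, "invalid weight in \"%s\", using 1\n", arg);
          return 1;
      }

      *sep = '\0';                  /* cut the name just before '@' */
      return (int) weight;
  }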


$ pgbench -b foo
no builtin found for "foo"
This is not really helpful for the user, I think that the list of
potential options should be listed as an error hint.


Yep.


-  "  -S, --select-onlyperform SELECT-only
transactions\n"
+  "  -S, --select-onlysame as \"-b select-only@1\"\n"
It is good to mention that there is an equivalent, but I think that
the description should be kept.


The reason to replace it is to keep the help message short column-wise.


+   /* although a mutex would make sense, the
likelyhood of an issue
+* is small and these are only stats which may
be slightly false
+*/
+   doSimpleStats(& commands[st->state]->stats,
+ INSTR_TIME_GET_DOUBLE(now) -




Why would the likelyhood of an issue be small here?


The time to update one stat (<< 100 cycles?) is small compared to the time
to do a transaction with the database (typically Y ms), so the likelihood
of two threads updating the very same stat at the same time is probably
under 1/10,000,000. Even if it occurs, one stat is just slightly false, no
big deal. So I think the potential slowdown induced by a mutex is not worth
it, and I added a comment instead.



+   /* print NaN if no transactions where executed */
+   double latency = ss->sum / ss->count;
This does not look like a good idea, ss->count can be 0.


"sum" is a double so count is converted to 0.0, 0.0/0.0 == NaN, hence the 
comment.
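
A tiny standalone demonstration of what is being relied on here, assuming
IEEE-754 double arithmetic (which is what PostgreSQL platforms provide in
practice):

  #include <stdio.h>

  int
  main(void)
  {
      double  sum = 0.0;
      int     count = 0;

      /* count is promoted to 0.0, and 0.0/0.0 yields NaN under IEEE 754 */
      double  latency = sum / count;

      printf("%f\n", latency);  /* prints "nan" (or "-nan" on some libcs) */
      return 0;
  }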



It seems also that it would be a good idea to split the patch into two parts:
1) Refactor the code so as the existing test scripts are put under the
same umbrella with addScript, adding at the same time the new option
-b.
2) Add the weight facility and its related statistics.


Sigh. The patch & documentation are probably not independent, so that
would make two dependent patches.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fdw cleanup

2015-12-14 Thread Albe Laurenz
Feng Tian wrote:
> I need some help to understand foreign table error handling.
> 
> For a query on foreign table, ExecInitForeignScan is called, which in turn 
> calls the BeginForeignScan
> hook.   Inside this hook, I allocated some resource,
> 
> 
> node->fdw_state = allocate_resource(...);
> 
> If everything goes well, ExecEndForeignScan will call my EndForeignScan 
> hook, inside the hook, I
> free the resource.
> 
> free_resource(node->fdw_state);
> 
> However, if during the execution an error happened, seems to me that 
> EndForeignScan will not be called
> (traced using gdb).   So my question is, is Begin/End the right place for 
> allocate/free resources?
> If it is not, what is the right way to do this?

If the resource is memory, use "palloc" to allocate it, and PostgreSQL
will take care of it automatically.

Other than that, you could call RegisterXactCallback() and register a callback
that will be called upon transaction end.  In the callback function you can
free the resource if it was not already freed by EndForeignScan.
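
A minimal sketch of that approach, assuming the free_resource() from the
question and a module-level pointer to the allocated state (everything
except RegisterXactCallback and the XactEvent values is illustrative):

  #include "postgres.h"
  #include "access/xact.h"

  extern void free_resource(void *state);    /* from the question */

  static void *my_fdw_state = NULL;          /* set in BeginForeignScan */

  static void
  my_fdw_xact_callback(XactEvent event, void *arg)
  {
      /* free the resource at commit or abort, unless EndForeignScan
       * already did so and cleared the pointer */
      if ((event == XACT_EVENT_COMMIT || event == XACT_EVENT_ABORT) &&
          my_fdw_state != NULL)
      {
          free_resource(my_fdw_state);
          my_fdw_state = NULL;
      }
  }

  static void
  my_fdw_register_callback(void)
  {
      static bool registered = false;        /* register only once */

      if (!registered)
      {
          RegisterXactCallback(my_fdw_xact_callback, NULL);
          registered = true;
      }
  }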

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PoC] Asynchronous execution again (which is not parallel)

2015-12-14 Thread Kyotaro HORIGUCHI
Hello, thank you for the comment.

At Tue, 8 Dec 2015 10:40:20 -0500, Robert Haas  wrote in 

> On Mon, Nov 30, 2015 at 7:47 AM, Kyotaro HORIGUCHI
>  wrote:
> > "Asynchronous execution" is a feature to start substantial work
> > of nodes before doing Exec*. This can reduce total startup time
> > by folding startup time of multiple execution nodes. Especially
> > effective for the combination of joins or appends and their
> > multiple children that needs long time to startup.
> >
> > This patch does that by inserting another phase "Start*" between
> > ExecInit* and Exec* to launch parallel processing including
> > pgworker and FDWs before requesting the very first tuple of the
> > result.
> 
> I have thought about this, too, but I'm not very convinced that this
> is the right model.  In a typical case involving parallelism, you hope
> to have the Gather node as close to the top of the plan tree as
> possible.  Therefore, the start phase will not happen much before the
> first execution of the node, and you don't get much benefit.

Obeying the Init-Exec semantics, a Gather node cannot execute an
underlying node, say a Sort node, before the upper node requests the
first tuple. Async execution also potentially works for that case.

On the other hand, the patch is currently designed considering
Gather as a driven-all-time node. Since it has the same
characteristic as Append or MergeAppend, in the sense that it
potentially executes multiple (and various kinds of) underlying
nodes, the patch should be redesigned accordingly; but as far as I
can see, for now Gather executes multiple copies of the same (or a
divided) scan node, so I haven't made Gather "async-aware".
(If I didn't take it wrongly.)

And if necessary, we can mark the query as 'async requested' in the
planning phase.

> Moreover, I think that prefetching can be useful not only at the start
> of the query - which is the only thing that your model supports - but
> also in mid-query.  For example, consider an Append of two ForeignScan
> nodes.  Ideally we'd like to return the results in the order that they
> become available, rather than serially.  This model might help with
> that for the first batch of rows you fetch, but not after that.

Yeah, async-exec can have a similar mechanism to Gather's for
fetching tuples from underlying nodes.

> There are a couple of other problems here that are specific to this
> example.  You get a benefit here because you've got two Gather nodes
> that both get kicked off before we try to read tuples from either, but
> that's generally something to avoid - you can only use 3 processes and
> typically at most 2 of those will actually be running (as opposed to

Yes, that is one of the reasons why I called the example artificial.

> sleeping) at the same time: the workers will run to completion, and
> then the leader will wake up and do its thing.   I'm not saying our
> current implementation of parallel query scales well to a large number
> of workers (it doesn't) but I think that's more about improving the
> implementation than any theoretical problem, so this seems a little
> worse.  Also, currently, both merge and hash joins have an
> optimization wherein if the outer side of the join turns out to be
> empty, we avoid paying the startup cost for the inner side of the
> join; kicking off the work on the inner side of the merge join
> asynchronously before we've gotten any tuples from the outer side
> loses the benefit of that optimization.

It is a matter of comparison: async wins if the startup time of
the outer side is longer (to some extent) than the time to build the
inner hash. But that requires planner support. I'll take it into
account if async exec itself is found to be useful.

> I suspect there is no single paradigm that will help with all of the
> cases where asynchronous execution is useful.  We're going to need a
> series of changes that are targeted at specific problems.  For
> example, here it would be useful to have one side of the join confirm
> at the earliest possible stage that it will definitely return at least
> one tuple eventually, but then return control to the caller so that we
> can kick off the other side of the join.  The sort node never
> eliminates anything, so as soon as the sequential scan underneath it
> coughs up a tuple, we're definitely getting a return value eventually.

It's quite impressive. But it might be the planner's business.

> At that point it's safe to kick off the other Gather node.  I don't
> quite know how to design a signalling system for that, but it could be
> done.

I agree. I'll give that further consideration.

> But is it important enough to be worthwhile?  Maybe, maybe not.  I
> think we should be working toward a world where the Gather is at the
> top of the plan tree as often as possible, in which case
> asynchronously kicking off a Gather node won't be that exciting any

Re: [HACKERS] W-TinyLfu for cache eviction

2015-12-14 Thread Amit Kapila
On Mon, Dec 14, 2015 at 12:18 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:
> > a global lock would be good enough for a proof of concept that only
> evaluates cache hit ratios.
>
> I think an emulator can be used to check hit ratios. That way we can see
> how different algorithms affect the hit ratio.
>
> Is there a set of traces of "buffer load events"? (I did some Google
> searches like "postgresql buffer cache trace" with no luck)
> Is there an option that enables tracing of each requested buffer Id?
>

To get a detailed trace for each buffer, I think you can use dynamic
tracing, as explained in the documentation [1].

Another simple way could be to use:
 Explain (Analyze, Buffers) 

This will give block hit and read counts, which could be useful in your
experiments.


[1] - http://www.postgresql.org/docs/devel/static/dynamic-trace.html

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [PoC] Asynchronous execution again (which is not parallel)

2015-12-14 Thread Amit Langote

Hi,

On 2015/12/14 17:34, Kyotaro HORIGUCHI wrote:
> At Tue, 8 Dec 2015 10:40:20 -0500, Robert Haas  wrote
>> But is it important enough to be worthwhile?  Maybe, maybe not.  I
>> think we should be working toward a world where the Gather is at the
>> top of the plan tree as often as possible, in which case
>> asynchronously kicking off a Gather node won't be that exciting any
>> more - see notes on the "parallelism + sorting" thread where I talk
>> about primitives that would allow massively parallel merge joins,
>> rather than 2 or 3 way parallel. 
> 
> Could you give me the subject of the thread? Or important message
> of that.

I think that would be the following thread:

* parallelism and sorting *
http://www.postgresql.org/message-id/ca+tgmoyh4zsqmgqiyra7zo1rbbvg1qhn1fjt5q0fpw+q0xa...@mail.gmail.com

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Uninterruptible slow geo_ops.c

2015-12-14 Thread Marco Nenciarini
On 11/12/15 19:18, Marco Nenciarini wrote:
> On 11/12/15 18:48, Alvaro Herrera wrote:
>> Hi,
>>
>> A customer of ours hit some very slow code while running the
>> @>(polygon, polygon) operator with some big polygons.  I'm not familiar
>> with this stuff but I think the problem is that the algorithm converges
>> too slowly to a solution and also has some pretty expensive calls
>> somewhere.  (Perhaps there is also a problem that the algorithm *never*
>> converges for some inputs ...)
>>
>> While I'm not familiar with the code itself, and can't post the exact
>> slow query just yet, I have noticed that it is missing a
>> CHECK_FOR_INTERRUPTS() call to enable cancelling the slow query.  I'd
>> backpatch this all the way back.  (The exact issue they hit is mutual
>> recursion between touched_lseg_between_poly and lseg_between_poly.
>> Since the latter also recurses on itself, the best way forward seem to
>> add a check for interrupts in the loop there.)
>>
>> I will follow up on the actual slowness later, as warranted.
>>
> 
> I would add that it was not simply a slow computation, but more probably they 
> hit a case where the algorithm doesn't converge at all.
> 
> I've killed it manually by calling ProcessInterrupts() through gdb after 7 
> days and half of CPU time (100% of one CPU).
> The server CPU is an Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz.
> 
> The query doesn't involve any table and is a simple call of @>(polygon, 
> polygon) operator.
> 
> SELECT polygon 'polygon literal with 522 points' @> polygon 'polygon box'
> 
> I'm checking if we can share the full query.
> 

The full query is attached.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
SELECT polygon 

Re: [HACKERS] Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?

2015-12-14 Thread Tomas Vondra

Hi,

I was planning to do some review/testing on this patch, but then I 
noticed it was rejected with feedback in 2015-07 and never resubmitted 
into another CF. So I won't waste time in testing this unless someone 
shouts that I should do that anyway. Instead I'll just post some ideas 
about how we might improve the patch, because I'd forget about them 
otherwise.


On 07/05/2015 09:48 AM, Heikki Linnakangas wrote:


The ideal correction formula f(x), would be such that f(g(X)) = X, where:

  X is time, 0 = beginning of checkpoint, 1.0 = targeted end of
checkpoint (checkpoint_segments), and

  g(X) is the amount of WAL generated. 0 = beginning of checkpoint, 1.0
= targeted end of checkpoint (derived from max_wal_size).

Unfortunately, we don't know the shape of g(X), as that depends on the
workload. It might be linear, if there is no effect at all from
full_page_writes. Or it could be a step-function, where every write
causes a full page write, until all pages have been touched, and after
that none do (something like an UPDATE without a where-clause might
cause that). In pgbench-like workloads, it's something like sqrt(x). I
picked X^1.5 as a reasonable guess. It's close enough to linear that it
shouldn't hurt too much if g(x) is linear. But it cuts the worst spike
at the very beginning, if g(x) is more like sqrt(x).


Exactly. I think the main "problem" here is that we do mix two types of 
WAL records, with quite different characteristics:


 (a) full_page_writes - very high volume right after checkpoint, then
 usually drops to much lower volume

 (b) regular records - about the same volume over time (well, lower
 volume right after the checkpoint, as that's where FPWs happen)

We completely ignore this when computing elapsed_xlogs, because we 
compute it (about) like this:


elapsed_xlogs = wal_since_checkpoint / CheckPointSegments;

which of course gets confused when we write a lot of WAL right after a
checkpoint, because of FPWs. But what if we actually tracked the amount
of WAL produced by FPWs in a checkpoint (which we currently don't, AFAIK)?


Then we could compute the expected *remaining* amount of WAL to be 
produced within the checkpoint interval, and use that to compute a 
better progress like this:


  wal_bytes          - WAL (total)
  wal_fpw_bytes      - WAL (due to FPW)
  prev_wal_bytes     - WAL (total) in previous checkpoint
  prev_wal_fpw_bytes - WAL (due to FPW) in previous checkpoint

So we know that we should expect about

  ((prev_wal_bytes - prev_wal_fpw_bytes) - (wal_bytes - wal_fpw_bytes)) + (prev_wal_fpw_bytes - wal_fpw_bytes)

  (                        regular WAL                                ) + (            FPW WAL              )

to be produced until the end of the current checkpoint. I don't have a 
clear idea how to transform this into the 'progress' yet, but I'm pretty 
sure tracking the two types of WAL is a key to a better solution. The 
x^1.5 is probably a step in the right direction, but I don't feel 
particularly confident about the 1.5 (which is rather arbitrary).
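
To make that concrete, here is one possible (untested, purely illustrative)
way to turn those four counters into a progress estimate; the function and
variable names are made up for this sketch:

  #include "postgres.h"

  /* Estimate checkpoint progress as (WAL written so far) divided by
   * (WAL written so far + WAL still expected), where the expectation
   * comes from the previous checkpoint's regular and FPW volumes. */
  static double
  checkpoint_wal_progress(uint64 wal_bytes, uint64 wal_fpw_bytes,
                          uint64 prev_wal_bytes, uint64 prev_wal_fpw_bytes)
  {
      uint64  prev_regular = prev_wal_bytes - prev_wal_fpw_bytes;
      uint64  regular = wal_bytes - wal_fpw_bytes;
      uint64  remaining = 0;

      if (prev_regular > regular)                /* remaining regular WAL */
          remaining += prev_regular - regular;
      if (prev_wal_fpw_bytes > wal_fpw_bytes)    /* remaining FPW WAL */
          remaining += prev_wal_fpw_bytes - wal_fpw_bytes;

      if (wal_bytes + remaining == 0)
          return 0.0;

      return (double) wal_bytes / (double) (wal_bytes + remaining);
  }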


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers