Re: [HACKERS] WAL consistency check facility

2016-11-04 Thread Michael Paquier
On Fri, Nov 4, 2016 at 6:02 PM, Michael Paquier
 wrote:
> On Fri, Nov 4, 2016 at 5:02 PM, Michael Paquier
>  wrote:
>> On Thu, Nov 3, 2016 at 11:17 PM, Kuntal Ghosh
>>  wrote:
>>> I've updated the patch for review.
>>
>> Thank you for the new patch. This will hopefully be the last round of
>> reviews; we are getting close to something that has an acceptable
>> shape.
>
> One last thing: in XLogRecordAssemble(), could you enforce the value
> of info at the beginning of the routine when wal_consistency[rmid] is
> true? And then use the value of info to decide if include_image is
> true or not? The idea here is to allow callers of XLogInsert() to pass
> XLR_CHECK_CONSISTENCY themselves and still have consistency checks
> enabled for a given record even if wal_consistency is false for that
> record's rmgr. This would be potentially useful for
> extension and feature developers when debugging some stuff, for some
> builds compiled with a DEBUG flag, or whatever.
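As a rough sketch of that idea (XLR_CHECK_CONSISTENCY is the flag from the
patch under discussion; the surrounding variable names are assumptions, not
committed code):

    /* at the top of XLogRecordAssemble(): fold the GUC into the info byte,
     * so a caller-supplied XLR_CHECK_CONSISTENCY has the same effect */
    if (wal_consistency[rmid])
        info |= XLR_CHECK_CONSISTENCY;

    /* later, when deciding whether to include a full-page image */
    include_image = needs_backup ||
        ((info & XLR_CHECK_CONSISTENCY) != 0);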

And you need to rebase the patch, d5f6f13 has fixed the handling of
xl_info with XLR_INFO_MASK.
-- 
Michael




Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-11-04 Thread Peter Eisentraut
On 11/4/16 4:55 AM, Shay Rojansky wrote:
>> 1. Does everyone agree with renaming the existing datatype without
>> changing the OID?
>
> As I said before, Npgsql for one loads data types by name, not by OID.
> So this would definitely cause breakage.

Why would that cause breakage?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] add more NLS to bin

2016-11-04 Thread Peter Eisentraut
On 11/3/16 7:17 PM, Michael Paquier wrote:
> This patch is not complicated, so I would vote for those being
> addressed now so that they are not forgotten, even if there is a FIXME
> flag added. Perhaps you don't think so, and as that's a minor issue
> I'll be fine with your judgement as well.

OK, I just wrapped it in translation markers as is, which should work
well enough.

>>> In util.c, doesn't pg_log_v() need to handle strings used in fprintf?
>>
>> Which specific lines do you have in mind?
> 
> The verbose logs at the top. In pg_rewind for example those logs are
> getting translated via the pg_log() calls used with PG_DEBUG.

Yeah that was wrong anyway.  The previously existing translation markers
were wrong.  We want to translate the fmt, not the formatted message.

New patch attached.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 91a5d23f1d8a188d11de2f2dfc6d455ebf04cfc5 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 14 Oct 2016 12:00:00 -0400
Subject: [PATCH v2 1/3] pg_upgrade: Add NLS

---
 src/bin/pg_upgrade/function.c|   2 +-
 src/bin/pg_upgrade/info.c|   8 ++--
 src/bin/pg_upgrade/nls.mk|  12 +
 src/bin/pg_upgrade/option.c  | 101 +++
 src/bin/pg_upgrade/pg_upgrade.c  |   1 +
 src/bin/pg_upgrade/relfilenode.c |   6 ++-
 src/bin/pg_upgrade/server.c  |   4 +-
 src/bin/pg_upgrade/util.c|  14 +++---
 8 files changed, 80 insertions(+), 68 deletions(-)
 create mode 100644 src/bin/pg_upgrade/nls.mk

diff --git a/src/bin/pg_upgrade/function.c b/src/bin/pg_upgrade/function.c
index 3009340..5b60f9f 100644
--- a/src/bin/pg_upgrade/function.c
+++ b/src/bin/pg_upgrade/function.c
@@ -252,7 +252,7 @@ check_loadable_libraries(void)
 			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
 pg_fatal("could not open file \"%s\": %s\n",
 		 output_path, strerror(errno));
-			fprintf(script, "could not load library \"%s\":\n%s\n",
+			fprintf(script, _("could not load library \"%s\":\n%s\n"),
 	lib,
 	PQerrorMessage(conn));
 		}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 1200c7f..8af9eac 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -238,7 +238,7 @@ report_unmatched_relation(const RelInfo *rel, const DbInfo *db, bool is_new_db)
 			{
 snprintf(reldesc + strlen(reldesc),
 		 sizeof(reldesc) - strlen(reldesc),
-		 " which is an index on \"%s.%s\"",
+		 _(" which is an index on \"%s.%s\""),
 		 hrel->nspname, hrel->relname);
 /* Shift attention to index's table for toast check */
 rel = hrel;
@@ -248,7 +248,7 @@ report_unmatched_relation(const RelInfo *rel, const DbInfo *db, bool is_new_db)
 		if (i >= db->rel_arr.nrels)
 			snprintf(reldesc + strlen(reldesc),
 	 sizeof(reldesc) - strlen(reldesc),
-	 " which is an index on OID %u", rel->indtable);
+	 _(" which is an index on OID %u"), rel->indtable);
 	}
 	if (rel->toastheap)
 	{
@@ -260,7 +260,7 @@ report_unmatched_relation(const RelInfo *rel, const DbInfo *db, bool is_new_db)
 			{
 snprintf(reldesc + strlen(reldesc),
 		 sizeof(reldesc) - strlen(reldesc),
-		 " which is the TOAST table for \"%s.%s\"",
+		 _(" which is the TOAST table for \"%s.%s\""),
 		 brel->nspname, brel->relname);
 break;
 			}
@@ -268,7 +268,7 @@ report_unmatched_relation(const RelInfo *rel, const DbInfo *db, bool is_new_db)
 		if (i >= db->rel_arr.nrels)
 			snprintf(reldesc + strlen(reldesc),
 	 sizeof(reldesc) - strlen(reldesc),
-	 " which is the TOAST table for OID %u", rel->toastheap);
+	 _(" which is the TOAST table for OID %u"), rel->toastheap);
 	}
 
 	if (is_new_db)
diff --git a/src/bin/pg_upgrade/nls.mk b/src/bin/pg_upgrade/nls.mk
new file mode 100644
index 000..a0c846d
--- /dev/null
+++ b/src/bin/pg_upgrade/nls.mk
@@ -0,0 +1,12 @@
+# src/bin/pg_upgrade/nls.mk
+CATALOG_NAME = pg_upgrade
+AVAIL_LANGUAGES  =
+GETTEXT_FILES= check.c controldata.c dump.c exec.c file.c function.c \
+   info.c option.c parallel.c pg_upgrade.c relfilenode.c \
+   server.c tablespace.c util.c version.c
+GETTEXT_TRIGGERS = pg_fatal pg_log:2 prep_status report_status:2
+GETTEXT_FLAGS= \
+pg_fatal:1:c-format \
+pg_log:2:c-format \
+prep_status:1:c-format \
+report_status:2:c-format
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 2e9a40c..12a49ff 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -240,13 +240,13 @@ parseCommandLine(int argc, char *argv[])
 
 	/* Get values from env if not already set */
 	check_required_directory(_cluster.bindir, NULL, "PGBINOLD", "-b",
-			 "old cluster binaries reside");
+			 _("old cluster binaries reside"));
 	check_required_directory(_cluster.bindir, NULL, "PGBINNEW", "-B",
-			 "new 

Re: [HACKERS] Applying XLR_INFO_MASK correctly when looking at WAL record information

2016-11-04 Thread Michael Paquier
On Sat, Nov 5, 2016 at 2:29 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> I have just been trapped by the fact that in some code paths we look
>> at the RMGR information of a WAL record (record->xl_info) but the
>> filter dedicated to it, ~XLR_INFO_MASK, is not applied. This is
>> harmful now, but this could lead to problems if in the future new
>> record-level flags, similar to XLR_SPECIAL_REL_UPDATE, are added.
>> Disclaimer: I am looking at a patch where a record-level flag makes
>> sense, still this looks like a separate problem.
>
>> What about the patch attached to make things more consistent?
>
> Grepping found a couple of additional places that needed the same
> fix.

Ah, where wasShutdown is assigned. I thought I grepped it as well,
thanks for double-checking.

> Pushed with those additions.

Thanks.
-- 
Michael




Re: [HACKERS] btree_gin and btree_gist for enums

2016-11-04 Thread Andrew Dunstan



On 11/04/2016 04:14 PM, Emre Hasegeli wrote:

>> Here is a patch to add enum support to btree_gin and btree_gist. I didn't
>> include distance operations, as I didn't think it terribly important, and
>> there isn't a simple way to compute it sanely and efficiently, so no KNN
>> support.
>
> I don't think we can implement a meaningful distance operator for enums.

Probably.

>> This time including the data file for the gist regression tests.
>
> It doesn't apply to HEAD anymore.  I have tested it on b12fd41.

I'll fix the bitrot.

> The GiST part of it doesn't really work.  The current patch compares
> oids.  We need to change it to compare enumsortorder.  I could make it
> fail easily:

Ouch. OK, I'll work on this.

>> +ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD
>
> Why don't we just create it with those changes?

I'll take a look.

>> + * Use a similar trick to that used for numeric for enums, since we don't
>
> Do you mean "similar trick that is used" or "trick similar to numeric"?

This is perfectly legal English. It means "... a similar trick to the
one used for numeric ... ". I'll change it to that if you think it's
clearer.

>> + * actually know the leftmost value of any enum without knowing the concrete
>> + * type, so we use a dummy leftmost value of InvalidOid.
>
>> +return DatumGetBool(
>> +DirectFunctionCall2(enum_gt, ObjectIdGetDatum(*((const Oid *) a)), ObjectIdGetDatum(*((const Oid *) b)))
>> +);
>
> Wouldn't it be nice to refactor enum_cmp_internal() to accept typcache
> and use it there like the rangetypes?

Not sure. Will look.

Thanks for the review.

cheers

andrew






Re: [HACKERS] Add PGDLLEXPORT to PG_FUNCTION_INFO_V1

2016-11-04 Thread Tom Lane
Albe Laurenz  writes:
>> Anyway, I have prepared a patch along the lines you suggest.

Pushed, we'll see if the buildfarm likes this iteration any better.

> It occurred to me that the documentation still suggests that you should
> add a declaration to a C function; I have fixed that too.

I didn't like that and rewrote it a bit.  It's not specific to V1
functions.

regards, tom lane




Re: [HACKERS] Improving executor performance

2016-11-04 Thread Doug Doole

On 07/13/2016 06:18 PM, Andres Freund wrote:
> Attached (in patch 0003) is a proof-of-concept implementing an
> expression evaluation framework that doesn't use recursion. Instead
> ExecInitExpr2 computes a number of 'steps' necessary to compute an
> expression. These steps are stored in a linear array, and executed one
> after another (save boolean expressions, which can jump to later steps).
> E.g. to compute colname = 1 the steps are 1) fetch var, 2) evaluate
> const, 3) call function.
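To make the quoted idea concrete, a toy illustration of flattening
"colname = 1" into a linear step array (the types and fields below are
invented for illustration; they are not the data structures from the
proof-of-concept patch):

    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/builtins.h"		/* int4eq() */

    typedef enum { STEP_FETCH_VAR, STEP_CONST, STEP_FUNC_CALL } StepTag;

    typedef struct ExprStep
    {
        StepTag		tag;
        int			attnum;			/* STEP_FETCH_VAR: column to fetch */
        Datum		constval;		/* STEP_CONST: the constant's value */
        PGFunction	fn;				/* STEP_FUNC_CALL: function to apply */
    } ExprStep;

    /* "colname = 1" becomes three steps, executed one after another in a loop */
    static ExprStep colname_eq_1[] = {
        {STEP_FETCH_VAR, .attnum = 1},
        {STEP_CONST, .constval = Int32GetDatum(1)},
        {STEP_FUNC_CALL, .fn = int4eq},
    };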


We've been having trouble with the performance of simple expressions in 
PLpgSQL so I started playing with this patch. (No sense re-inventing the 
wheel after all.) It was straightforward to extend to simple expressions 
and showed an immediate improvement (~10% faster on a simple test). 
Running in our full environment highlighted a few areas that I think are 
worth more investigation.


However, before I tackle that, is the posted proof-of-concept still the 
latest and greatest? If not, any chance of getting the latest?


Going forward, I'd like to collaborate on our efforts if you're interested.

- Doug Doole
Salesforce




Re: [HACKERS] Fix bug in handling of dropped columns in pltcl triggers

2016-11-04 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Nov 1, 2016 at 4:17 AM, Jim Nasby  wrote:
>> While reviewing code coverage in pltcl, I uncovered a bug in trigger
>> function return handling. If you returned the munged name of a dropped
>> column, that would silently be ignored. It would be unusual to hit this,
>> since dropped columns end up with names like "...pg.dropped.2...",
>> but since that's still a legitimate name for a column silently ignoring it
>> seems rather bogus.

> It seems to me that this patch breaks $TG_relatts and what existing
> applications would expect from it:

I'm not sure how it would do that.  What's being changed here is the
code for processing the returned tuple; it doesn't change what is put
into $TG_relatts.  Also, per the manual,

>  (Empty list
>  elements also appear in the positions of columns that have been
>  dropped, so that the attribute numbering is correct for columns
>  to their right.)

If the trigger function were blindly using that information without any
exception for dropped columns, then what it would provide here as the
column name is an empty string, not "...pg.dropped.2...", so that
it would get a failure anyway.  This change wouldn't affect that.

So I'm inclined to agree that we should change this, but I think Jim's
patch doesn't go far enough: what we really ought to do is change
SPI_fnumber itself to reject matches to dropped columns.  I just did
a survey of callers, and only a small minority of them are explicitly
testing for a match to a dropped column, but it doesn't look to me
like any of the rest are really prepared to do something reasonable
with one.  I find it impossible to think of a situation where a caller
would want to treat a match to a dropped column as valid.

My proposal therefore is for SPI_fnumber to ignore (not match to)
dropped columns, and to remove any caller-side attisdropped tests that
thereby become redundant.
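A rough sketch of that shape (simplified; the real function also matches
system attribute names, which is omitted here):

    int
    SPI_fnumber(TupleDesc tupdesc, const char *fname)
    {
        int			res;

        for (res = 0; res < tupdesc->natts; res++)
        {
            Form_pg_attribute attr = tupdesc->attrs[res];

            if (attr->attisdropped)
                continue;		/* proposed: never match a dropped column */
            if (namestrcmp(&attr->attname, fname) == 0)
                return res + 1;	/* SPI attribute numbers are 1-based */
        }

        return SPI_ERROR_NOATTRIBUTE;
    }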

It's possible that it'd make sense for pltcl_trigger_handler to ignore
empty-string column names in the returned list, so that the behavior
with stupid trigger functions would be a bit more forgiving; but that
is more or less independent of this patch.

regards, tom lane




[HACKERS] Tweak cost_merge_append to reflect 7a2fe9bd?

2016-11-04 Thread Thomas Munro
Hi

As noted in a nearby review of some similar code[1], commit 7a2fe9bd
made merge append very slightly more efficient, but nobody told
cost_merge_append about the change.  I doubt it makes much difference
to the final cost in practice but I figured it might be worth
correcting the comment.  Does this make sense?

[1] 
https://www.postgresql.org/message-id/CAEepm%3D3o9um4pi0EphOGD7u2f862hX%2BBhwD5zko-TAk_Qj1JeQ%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com


tweak-merge-append-costing.patch
Description: Binary data



Re: [HACKERS] Calculation of param_source_rels in add_paths_to_joinrel

2016-11-04 Thread Tom Lane
Ashutosh Bapat  writes:
> There's code in add_paths_to_joinrel() which computes the set of
> target relations that should overlap parameterization of any proposed
> join path.
> ...
> The calculations that follow are based on joinrel->relids (baserels
> covered by the join) and SpecialJoinInfo list in PlannerInfo. It is
> not based on specific combination of relations being joined or the
> paths being generated. We should probably do this computation once and
> store the result in the joinrel and use it multiple times. That way we
> can avoid computing the same set again and again for every pair of
> joining relations and their order. Any reasons why we don't do this?

I'm not terribly excited about this.  The issue is strictly local to
add_paths_to_joinrel, but putting that set in a global data structure
makes it nonlocal, and makes it that much harder to tweak the algorithm
if we think of a better way.  (In particular, I think it's not all that
obvious that the set must be independent of which two subset relations
we are currently joining.)

If you can show a measurable performance improvement from this change,
then maybe those downsides are acceptable.  But I do not think we should
commit it without a demonstrated performance benefit from the added
complexity and loss of flexibility.

> Also, the way this code has been written, the declaration of variable
> sjinfo masks the earlier declaration with the same name. I am not sure
> if that's intentional, but may be we should use another variable name
> for the inner sjinfo. I have not included that change in the patch.

Hmm, yeah, that's probably not terribly good coding practice.

regards, tom lane




Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-04 Thread Karl O. Pinc
On Fri, 4 Nov 2016 16:58:45 +0100
Gilles Darold  wrote:

> I attached a v12 patch 

Attached is a comment patch which improves the comment
describing CURRENT_LOG_FILENAME.  It's been bugging me.
I should have made this change long ago when I looked
at all the other code comments but neglected to.

The comment now matches other documentation.

This applies on top of your v12 patch.

I'm thinking through the v12 patch and email.
I'm in general agreement but want to make sure
I really understand all the code paths.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


patch_pg_current_logfile-v12.diff.symbolcomment
Description: Binary data



Re: [HACKERS] btree_gin and btree_gist for enums

2016-11-04 Thread Emre Hasegeli
>> Here is a patch to add enum support to btree_gin and btree_gist. I didn't
>> include distance operations, as I didn't think it terribly important, and
>> there isn't a simple way to compute it sanely and efficiently, so no KNN
>> support.

I don't think we can implement a meaningful distance operator for enums.

> This time including the data file for the gist regression tests.

It doesn't apply to HEAD anymore.  I have tested it on b12fd41.

The GiST part of it doesn't really work.  The current patch compares
oids.  We need to change it to compare enumsortorder.  I could make it
fail easily:

> regression=# create type e as enum ('0', '2', '3');
> CREATE TYPE
> regression=# alter type e add value '1' after '0';
> ALTER TYPE
> regression=# create table t as select (i % 4)::text::e from 
> generate_series(0, 10) as i;
> SELECT 11
> regression=# create index on t using gist (e);
> SEGFAULT

> +ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD

Why don't we just create it with those changes?

> + * Use a similar trick to that used for numeric for enums, since we don't

Do you mean "similar trick that is used" or "trick similar to numeric"?

> + * actually know the leftmost value of any enum without knowing the concrete
> + * type, so we use a dummy leftmost value of InvalidOid.

> +return DatumGetBool(
> +DirectFunctionCall2(enum_gt, ObjectIdGetDatum(*((const Oid *) a)), ObjectIdGetDatum(*((const Oid *) b)))
> +);

Wouldn't it be nice to refactor enum_cmp_internal() to accept typcache
and use it there like the rangetypes?




Re: [HACKERS] Gather Merge

2016-11-04 Thread Thomas Munro
On Sat, Nov 5, 2016 at 2:42 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Fri, Nov 4, 2016 at 12:00 PM, Thomas Munro
>>  wrote:
>>> Shouldn't this say just "(c) 2016, PostgreSQL Global Development
>>> Group"?  Are we supposed to be blaming the University of California
>>> for new files?
>
>> If the new file contains a portion of code from this age, yes.
>
> My habit has been to include the whole old copyright if there's anything
> at all in the new file that could be considered to be copy-and-paste from
> an existing file.  Frequently it's a gray area.

Thanks.  I see that it's warranted in this case, as code is recycled
from MergeAppend.

> Legally, I doubt anyone cares much.  Morally, I see it as paying due
> respect to those who came before us in this project.

+1

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Hash Indexes

2016-11-04 Thread Amit Kapila
On Fri, Nov 4, 2016 at 9:37 PM, Robert Haas  wrote:
> On Tue, Nov 1, 2016 at 9:09 PM, Robert Haas  wrote:
>> On Mon, Oct 24, 2016 at 10:30 AM, Amit Kapila  
>> wrote:
>>> [ new patches ]
>>
>> I looked over parts of this today, mostly the hashinsert.c changes.
>
> Some more review...
>
> @@ -656,6 +678,10 @@ _hash_squeezebucket(Relation rel,
>  IndexTuple  itup;
>  Sizeitemsz;
>
> +/* skip dead tuples */
> +if (ItemIdIsDead(PageGetItemId(rpage, roffnum)))
> +continue;
>
> Is this an optimization independent of the rest of the patch, or is
> there something in this patch that necessitates it?
>

This specific case is independent of the rest of the patch, but the same
optimization is used in _hash_splitbucket_guts(), where it is mandatory,
because otherwise it would make a copy of that tuple without copying the
dead flag.

>  i.e. Could this
> small change be committed independently?

Both places, _hash_squeezebucket() and _hash_splitbucket(), can use
this optimization irrespective of the rest of the patch.  I will prepare a
separate patch for these and send it along with the main patch after some
testing.

>  If not, then I think it
> needs a better comment explaining why it is now mandatory.
>
> - *  Caller must hold exclusive lock on the target bucket.  This allows
> + *  Caller must hold cleanup lock on the target bucket.  This allows
>   *  us to safely lock multiple pages in the bucket.
>
> The notion of a lock on a bucket no longer really exists; with this
> patch, we'll now properly speak of a lock on a primary bucket page.
> Also, I think the bit about safely locking multiple pages is bizarre
> -- that's not the issue at all: the problem is that reorganizing a
> bucket might confuse concurrent scans into returning wrong answers.
>
> I've included a broader updating of that comment, and some other
> comment changes, in the attached incremental patch, which also
> refactors your changes to _hash_freeovflpage() a bit to avoid some
> code duplication.  Please consider this for inclusion in your next
> version.
>

Your modifications look good to me, so I will include them in the next version.

> In hashutil.c, I think that _hash_msb() is just a reimplementation of
> fls(), which you can rely on being present because we have our own
> implementation in src/port.  It's quite similar to yours but slightly
> shorter.  :-)   Also, some systems have a builtin fls() function which
> actually optimizes down to a single machine instruction, and which is
> therefore much faster than either version.
>

Agreed, will change as per suggestion.

> I don't like the fact that _hash_get_newblk() and _hash_get_oldblk()
> are working out the bucket number by using the HashOpaque structure
> within the bucket page they're examining.  First, it seems weird to
> pass the whole structure when you only need the bucket number out of
> it.  More importantly, the caller really ought to know what bucket
> they care about without having to discover it.
>
> For example, in _hash_doinsert(), we figure out the bucket into which
> we need to insert, and we store that in a variable called "bucket".
> Then from there we work out the primary bucket page's block number,
> which we store in "blkno".  We read the page into "buf" and put a
> pointer to that buffer's contents into "page" from which we discover
> the HashOpaque, a pointer to which we store into "pageopaque".  Then
> we pass that to _hash_get_newblk() which will go look into that
> structure to find the bucket number ... but couldn't we have just
> passed "bucket" instead?
>

Yes, it can be done.  However, note that pageopaque is not only
retrieved for passing to _hash_get_newblk(); it is also used in the
code below, so we can't remove it.

>  Similarly, _hash_expandtable() has the value
> available in the variable "old_bucket".
>
> The only caller of _hash_get_oldblk() is _hash_first(), which has the
> bucket number available in a variable called "bucket".
>
> So it seems to me that these functions could be simplified to take the
> bucket number as an argument directly instead of the HashOpaque.
>

Okay, I agree that it is better to use the bucket number in both
APIs, so I will change it accordingly.

> Generally, this pattern recurs throughout the patch.  Every time you
> use the data in the page to figure something out which the caller
> already knew, you're introducing a risk of bugs: what if the answers
> don't match?   I think you should try to root out as much of that from
> this code as you can.
>

Okay, I will review the patch again from this angle and see if I can improve it.

> As you may be able to tell, I'm working my way into this patch
> gradually, starting with peripheral parts and working toward the core
> of it.  Generally, I think it's in pretty good shape, but I still have
> quite a bit left to study.
>

Thanks.

-- 
With Regards,
Amit 

Re: [HACKERS] Gather Merge

2016-11-04 Thread Thomas Munro
On Sat, Nov 5, 2016 at 1:55 AM, Robert Haas  wrote:
> On Thu, Nov 3, 2016 at 11:00 PM, Thomas Munro
>  wrote:
>> + /*
>> + * Avoid log(0)...
>> + */
>> + N = (path->num_workers < 2) ? 2.0 : (double) path->num_workers;
>> + logN = LOG2(N);
>> ...
>> + /* Per-tuple heap maintenance cost */
>> + run_cost += path->path.rows * comparison_cost * 2.0 * logN;
>>
>> Why multiply by two?  The comment above this code says "about log2(N)
>> comparisons to delete the top heap entry and another log2(N)
>> comparisons to insert its successor".  In fact gather_merge_getnext
>> calls binaryheap_replace_first, which replaces the top element without
>> any comparisons at all and then performs a sift-down in log2(N)
>> comparisons to find its new position.  There is no per-tuple "delete"
>> involved.  We "replace" the top element with the value it already had,
>> just to trigger the sift-down, because we know that our comparator
>> function might have a new opinion of the sort order of this element.
>> Very clever!  The comment and the 2.0 factor in cost_gather_merge seem
>> to be wrong though -- or am I misreading the code?
>
> See cost_merge_append, and the header comments threreto.

I see.  So commit 7a2fe9bd got rid of the delete/insert code
(heap_siftup_slot and heap_insert_slot) and introduced
binaryheap_replace_first which does it in one step, but the costing
wasn't adjusted and still thinks we pay comparison_cost * logN twice.
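Presumably the per-tuple line would then drop the doubled factor, along the
lines of (a sketch against the snippet quoted above; the actual fix may
differ):

    /* Per-tuple heap maintenance: one sift-down, about log2(N) comparisons */
    run_cost += path->path.rows * comparison_cost * logN;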

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Mention column name in error messages

2016-11-04 Thread David G. Johnston
On Fri, Nov 4, 2016 at 12:15 PM, Tom Lane  wrote:

> Michael Paquier  writes:
> > I am passing that down to a committer for review. The patch looks
> > large, but at 95% it involves diffs in the regression tests,
> > alternative outputs taking a large role in the bloat.
>
> This is kind of cute, but it doesn't seem to cover very much territory,
> because it only catches errors that are found in the parse stage.
> For instance, it fails to cover Franck's original example:
> [...]
>
>
> Maybe it'd be all right to commit this anyway, but I'm afraid the most
> common reaction would be "why's it give me this info some of the time,
> but not when I really need it?"  I'm inclined to think that an acceptable
> patch will need to provide context for the plan-time and run-time cases
> too, and I'm not very sure where would be a sane place to plug in for
> those cases.
>

Agreed.

David J.


Re: [HACKERS] Mention column name in error messages

2016-11-04 Thread Tom Lane
Michael Paquier  writes:
> I am passing that down to a committer for review. The patch looks
> large, but at 95% it involves diffs in the regression tests,
> alternative outputs taking a large role in the bloat.

This is kind of cute, but it doesn't seem to cover very much territory,
because it only catches errors that are found in the parse stage.
For instance, it fails to cover Franck's original example:

# create table foo(bar varchar(4), baz varchar(2));
CREATE TABLE
# insert into foo values ('foo2!', 'ok');
ERROR:  value too long for type character varying(4)

That's still all you get, because the parser sets that up as a varchar
literal with a runtime length coercion, and the failure doesn't occur
till later.  In this particular example it happens during the planner's
constant-folding stage, but with a non-constant expression it would
happen in the executor.

Maybe it'd be all right to commit this anyway, but I'm afraid the most
common reaction would be "why's it give me this info some of the time,
but not when I really need it?"  I'm inclined to think that an acceptable
patch will need to provide context for the plan-time and run-time cases
too, and I'm not very sure where would be a sane place to plug in for
those cases.

Less important comments:

* I don't really see the point of including the column type name in the
error message.  We don't do that in the COPY context message this is
based on.  If we were going to print it, I should think we'd need the
typmod as well --- eg, the difference between varchar(4) and varchar(4000)
could be pretty relevant.

* The coding order here is subtly wrong:
 
+   /* Set up callback to fetch more details regarding the error */
+   errcallback.callback = TransformExprCallback;
+   errcallback.arg = (void *) &te_state;
+   errcallback.previous = error_context_stack;
+   error_context_stack = &errcallback;
+   te_state.relation_name = RelationGetRelationName(rd);
+   te_state.column_name = colname;
+   te_state.expected_type = attrtype;
+   te_state.is_insert = pstate->p_is_insert;

The callback is "live" the instant you assign to error_context_stack;
any data it depends on had better be valid before that.  In this case
you're effectively depending on the assumption that
RelationGetRelationName can't throw an error.  While it probably can't
today, better style would be to set up te_state first, eg

+   /* Set up callback to fetch more details regarding the error */
+   te_state.relation_name = RelationGetRelationName(rd);
+   te_state.column_name = colname;
+   te_state.expected_type = attrtype;
+   te_state.is_insert = pstate->p_is_insert;
+   errcallback.callback = TransformExprCallback;
+   errcallback.arg = (void *) &te_state;
+   errcallback.previous = error_context_stack;
+   error_context_stack = &errcallback;

regards, tom lane




Re: [HACKERS] Bug in to_timestamp().

2016-11-04 Thread Tom Lane
Artur Zakirov  writes:
> I attached new version of the patch, which fix is_char_separator()
> declaration too.

I did some experimenting using
http://rextester.com/l/oracle_online_compiler

It appears that Oracle will consider a single space in the pattern
to match zero or more spaces in the input, as all of these produce
the expected result:

SELECT to_timestamp('2000JUN', 'YYYY MON') FROM dual
SELECT to_timestamp('2000 JUN', 'YYYY MON') FROM dual
SELECT to_timestamp('2000   JUN', 'YYYY MON') FROM dual

Also, a space in the pattern will match a single separator character
in the input, but not multiple separators:

SELECT to_timestamp('2000-JUN', 'YYYY MON') FROM dual
-- ok
SELECT to_timestamp('2000--JUN', 'YYYY MON') FROM dual
ORA-01843: not a valid month

And you can have whitespace along with that single separator:

SELECT to_timestamp('2000 + JUN', 'YYYY MON') FROM dual
-- ok
SELECT to_timestamp('2000 +   JUN', 'YYYY MON') FROM dual
-- ok
SELECT to_timestamp('2000 ++  JUN', 'YYYY MON') FROM dual
ORA-01843: not a valid month

You can have leading whitespace, but not leading separators:

SELECT to_timestamp('   2000 JUN', 'YYYY MON') FROM dual
-- ok
SELECT to_timestamp('/2000 JUN', 'YYYY MON') FROM dual
ORA-01841: (full) year must be between -4713 and +9999, and not be 0

These all work:

SELECT to_timestamp('2000 JUN', 'YYYY/MON') FROM dual
SELECT to_timestamp('2000JUN', 'YYYY/MON') FROM dual
SELECT to_timestamp('2000/JUN', 'YYYY/MON') FROM dual
SELECT to_timestamp('2000-JUN', 'YYYY/MON') FROM dual

but not

SELECT to_timestamp('2000//JUN', 'YYYY/MON') FROM dual
ORA-01843: not a valid month
SELECT to_timestamp('2000--JUN', 'YYYY/MON') FROM dual
ORA-01843: not a valid month

which makes it look a lot like Oracle treats separator characters in the
pattern the same as spaces (but I haven't checked their documentation to
confirm that).

The proposed patch doesn't seem to me to be trying to follow 
these Oracle behaviors, but I think there is very little reason for
changing any of this stuff unless we move it closer to Oracle.

Some other nitpicking:

* I think the is-separator function would be better coded like

static bool
is_separator_char(const char *str)
{
/* ASCII printable character, but not letter or digit */
return (*str > 0x20 && *str < 0x7F &&
!(*str >= 'A' && *str <= 'Z') &&
!(*str >= 'a' && *str <= 'z') &&
!(*str >= '0' && *str <= '9'));
}

The previous way is neither readable nor remarkably efficient, and it
knows much more about the ASCII character set than it needs to.

* Don't forget the cast to unsigned char when using isspace() or other
such functions (a minimal example follows below).

* I do not see the reason for throwing an error here:

+   /* Previous character was a backslash */
+   if (in_backslash)
+   {
+   /* After backslash should go non-space character */
+   if (isspace(*str))
+   ereport(ERROR,
+   (errcode(ERRCODE_SYNTAX_ERROR),
+errmsg("invalid escape 
sequence")));
+   in_backslash = false;

Why shouldn't backslash-space be a valid quoting sequence?
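For the isspace() point above, a minimal example of the cast:

    /* *str is plain char and may be negative on signed-char platforms */
    if (isspace((unsigned char) *str))
        str++;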

I'll set this back to Waiting on Author.

regards, tom lane




Re: [HACKERS] Hash Indexes

2016-11-04 Thread Amit Kapila
On Fri, Nov 4, 2016 at 6:37 PM, Robert Haas  wrote:
> On Thu, Nov 3, 2016 at 6:25 AM, Amit Kapila  wrote:
>>> +nblkno = _hash_get_newblk(rel, pageopaque);
>>>
>>> I think this is not a great name for this function.  It's not clear
>>> what "new blocks" refers to, exactly.  I suggest
>>> FIND_SPLIT_BUCKET(metap, bucket) or OLD_BUCKET_TO_NEW_BUCKET(metap,
>>> bucket) returning a new bucket number.  I think that macro can be
>>> defined as something like this: bucket + (1 <<
>>> (fls(metap->hashm_maxbucket) - 1)).
>>>
>>
>> I think such a macro would not work for the case of incomplete
>> splits.  The reason is that by the time we try to complete the split
>> of the current old bucket, the table half (lowmask, highmask,
>> maxbucket) would have changed, and it could give you a bucket in the
>> new table half.
>
> Can you provide an example of the scenario you are talking about here?
>

Consider a case as below:

First half of table
0 1 2 3
Second half of table
4 5 6 7

Now, while the split of bucket 2 (whose corresponding new bucket will be 6)
is in progress, the system crashes, and after restart it splits bucket
number 3 (whose corresponding new bucket will be 7).  After that, it will
try to form a new table half with buckets ranging from 8, 9, ... 15.
Assume it creates bucket 8 by splitting from bucket 0; next, if it tries to
split bucket 2, it will find an incomplete split and will attempt to
finish it.  At that time, if it tries to calculate the new bucket from the
old bucket (2), it will calculate it as 10 (the value of
metap->hashm_maxbucket will be 8 for the third table half, and if you try
it with the above macro, it will calculate it as 10), whereas we need 6.
That is why you will see a check (if (new_bucket >
metap->hashm_maxbucket)) in _hash_get_newblk(), which ensures that it
returns the bucket number from the previous half.  The basic idea is that
if there is an incomplete split from the current bucket, it can't do a new
split from that bucket, so the check in _hash_get_newblk() will give us the
correct value.

I can try to explain again if the above is not clear enough.
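To spell out the arithmetic with the proposed macro and the example values
above (illustrative only):

    /* proposed macro: bucket + (1 << (fls(metap->hashm_maxbucket) - 1))  */
    /* while the split was in progress: maxbucket = 6: 2 + (1 << 2) = 6   */
    /* when finishing it later:         maxbucket = 8: 2 + (1 << 3) = 10  */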

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Applying XLR_INFO_MASK correctly when looking at WAL record information

2016-11-04 Thread Tom Lane
Michael Paquier  writes:
> I have just been trapped by the fact that in some code paths we look
> at the RMGR information of a WAL record (record->xl_info) but the
> filter dedicated to it, ~XLR_INFO_MASK, is not applied. This is
> harmful now, but this could lead to problems if in the future new
> record-level flags, similar to XLR_SPECIAL_REL_UPDATE, are added.
> Disclaimer: I am looking at a patch where a record-level flag makes
> sense, still this looks like a separate problem.

> What about the patch attached to make things more consistent?

Grepping found a couple of additional places that needed the same
fix.  Pushed with those additions.

regards, tom lane




Re: [HACKERS] delta relations in AFTER triggers

2016-11-04 Thread Kevin Grittner
On Sun, Oct 30, 2016 at 10:35 AM, Kevin Grittner  wrote:
> On Sun, Oct 2, 2016 at 11:20 PM, Michael Paquier  
> wrote:

>> Not as big as I thought, only 2k when both patches are combined... The
>> patch without noapi in its name needs to be applied first, and after
>> that the patch with noapi can be applied.
>>  60 files changed, 2073 insertions(+), 63 deletions(-)
>> Moved to next CF.
>
> In an attempt to make this patch more digestible for reviewers, I
> have split it up as follows:
>
> transition-c-triggers-only-v7.diff

>  17 files changed, 581 insertions(+), 47 deletions(-)
>
> This part is virtually unchanged (just curing bit-rot) since
> August, 2014, when I believe I had addressed all issues raised by
> reviewers.  It does provide a barely usable feature, since the
> syntax for transition tables is added and tuplestores are created
> when needed (and only when needed), with references stored in the
> TriggerData structure.  No new execution nodes are provided, so
> only C triggers can use these relations, and must explicitly and
> directly access the tuplestores from within the C code -- there is
> no support for referencing these tuplestores from within queries.
>
> This is basic infrastructure needed for the more complete feature.
> As far as I know there are no objections to what is implemented
> here.  I have pulled it out to make the review of the more
> controversial portions easier.  Since it had quite a bit of review
> two years ago, I will do some testing to make sure that nothing has
> broken and then push this part in a few days if there are no
> objections.

Hearing none, done.  Hopefully that makes what remains easier to
review.

During final testing I was annoyed by the thin support for CREATE
TRIGGER in the tab completion code, so I improved that a bit and
pushed that, too.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Hash Indexes

2016-11-04 Thread Robert Haas
On Tue, Nov 1, 2016 at 9:09 PM, Robert Haas  wrote:
> On Mon, Oct 24, 2016 at 10:30 AM, Amit Kapila  wrote:
>> [ new patches ]
>
> I looked over parts of this today, mostly the hashinsert.c changes.

Some more review...

@@ -656,6 +678,10 @@ _hash_squeezebucket(Relation rel,
 IndexTuple  itup;
 Sizeitemsz;

+/* skip dead tuples */
+if (ItemIdIsDead(PageGetItemId(rpage, roffnum)))
+continue;

Is this an optimization independent of the rest of the patch, or is
there something in this patch that necessitates it?  i.e. Could this
small change be committed independently?  If not, then I think it
needs a better comment explaining why it is now mandatory.

- *  Caller must hold exclusive lock on the target bucket.  This allows
+ *  Caller must hold cleanup lock on the target bucket.  This allows
  *  us to safely lock multiple pages in the bucket.

The notion of a lock on a bucket no longer really exists; with this
patch, we'll now properly speak of a lock on a primary bucket page.
Also, I think the bit about safely locking multiple pages is bizarre
-- that's not the issue at all: the problem is that reorganizing a
bucket might confuse concurrent scans into returning wrong answers.

I've included a broader updating of that comment, and some other
comment changes, in the attached incremental patch, which also
refactors your changes to _hash_freeovflpage() a bit to avoid some
code duplication.  Please consider this for inclusion in your next
version.

In hashutil.c, I think that _hash_msb() is just a reimplementation of
fls(), which you can rely on being present because we have our own
implementation in src/port.  It's quite similar to yours but slightly
shorter.  :-)   Also, some systems have a builtin fls() function which
actually optimizes down to a single machine instruction, and which is
therefore much faster than either version.
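For reference, a minimal wrapper along those lines (a sketch only; the
patch's _hash_msb() may define "msb" off by one from fls()'s convention):

    /* fls() returns the 1-based position of the highest set bit, e.g.
     * fls(0x10) == 5 and fls(0) == 0; port.h declares it and src/port
     * supplies a fallback where the OS lacks it. */
    static uint32
    highest_set_bit(uint32 num)
    {
        return (uint32) fls((int) num);
    }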

I don't like the fact that _hash_get_newblk() and _hash_get_oldblk()
are working out the bucket number by using the HashOpaque structure
within the bucket page they're examining.  First, it seems weird to
pass the whole structure when you only need the bucket number out of
it.  More importantly, the caller really ought to know what bucket
they care about without having to discover it.

For example, in _hash_doinsert(), we figure out the bucket into which
we need to insert, and we store that in a variable called "bucket".
Then from there we work out the primary bucket page's block number,
which we store in "blkno".  We read the page into "buf" and put a
pointer to that buffer's contents into "page" from which we discover
the HashOpaque, a pointer to which we store into "pageopaque".  Then
we pass that to _hash_get_newblk() which will go look into that
structure to find the bucket number ... but couldn't we have just
passed "bucket" instead?  Similarly, _hash_expandtable() has the value
available in the variable "old_bucket".

The only caller of _hash_get_oldblk() is _hash_first(), which has the
bucket number available in a variable called "bucket".

So it seems to me that these functions could be simplified to take the
bucket number as an argument directly instead of the HashOpaque.

Generally, this pattern recurs throughout the patch.  Every time you
use the data in the page to figure something out which the caller
already knew, you're introducing a risk of bugs: what if the answers
don't match?   I think you should try to root out as much of that from
this code as you can.

As you may be able to tell, I'm working my way into this patch
gradually, starting with peripheral parts and working toward the core
of it.  Generally, I think it's in pretty good shape, but I still have
quite a bit left to study.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


hashovfl-tweaks.patch
Description: application/download



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

2016-11-04 Thread Peter Eisentraut
The organization of these patches makes sense to me.

On 10/20/16 1:14 AM, Michael Paquier wrote:
> - 0001, moving all the SHA2 functions to src/common/ and introducing a
> PG-like interface. No actual changes here.

That's probably alright, although the patch contains a lot more changes
than I would imagine for a simple file move.  I'll still have to review
that in detail.

> - 0002, replacing PostmasterRandom by pg_strong_random(), with a fix
> for the cancel key problem.
> - 0003, adding for pg_strong_random() a fallback for any nix platform
> not having /dev/random. This should be grouped with 0002, but I split
> it for clarity.

Also makes sense, but will need more detailed review.  I did not follow
the previous PostmasterRandom issues closely.

> - 0004, Add encoding routines for base64 without whitespace in
> src/common/. I improved the error handling here by making them return
> -1 in case of error and let the caller handle the error.

I don't think we want to have two different copies of base64 routines.
Surely we can make the existing routines do what we want with a
parameter or two about whitespace and line length.
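Something along these lines, perhaps (a hypothetical signature, just to
illustrate the suggestion; not an existing src/common API):

    /* hypothetical: one encoder, whitespace policy chosen by the caller */
    extern int	b64_encode_opt(const char *src, int srclen, char *dst,
                               bool break_lines);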

> - 0005, Refactor decision-making of password encryption into a single routine.

It makes sense to factor this out.  We probably don't need the pstrdup
if we just keep the string as is.  (You could make an argument for it if
the input values were const char *.)  We probably also don't need the
pfree.  The Assert(0) can probably be done better.  We usually use
elog() in such cases.

> - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE.

"protocol" is a weird choice here.  Maybe something like "method" is
better.  The way the USING clause is placed can be confusing.  It's not
clear that it belongs to PASSWORD.  If someone wants to augment another
clause in CREATE ROLE with a secondary argument, then it could get
really confusing.  I'd suggest something to group things together, like
PASSWORD (val USING method).  The method could be an identifier instead
of a string.

Please add an example to the documentation and explain better how this
interacts with the existing ENCRYPTED PASSWORD clause.

> - 0007, the SCRAM implementation.

No documentation about pg_hba.conf changes, so I don't know how to use
this. ;-)

This implements SASL and SCRAM and SHA256.  We need to be clear about
which term we advertise to users.  An explanation in the missing
documentation would probably be a good start.

I would also like to see a test suite that covers the authentication
specifically.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-04 Thread Gilles Darold
On 04/11/2016 at 14:17, Karl O. Pinc wrote:
> On Mon, 31 Oct 2016 10:19:18 +0100
> Gilles Darold  wrote:
>
>> On 31/10/2016 at 04:35, Karl O. Pinc wrote:
>>> Attached is a patch to apply on top of the v10 patch.
>>>
>>> It updates current_logfiles only once per log rotation.
>>> I see no reason to open and write the file twice
>>> if both csvlog and stderr logging is happening.  
>> I did not take the effort to do that because log rotation is not
>> something that occurs too often, and I don't want to change the
>> conditional "time_based_rotation" code lines in logfile_rotate() for
>> readability. My thought was also that on high load, log rotation is
>> automatically disabled, so logfile_writename() is not called and there
>> will be no additional I/O. But why not: if committers want to save this
>> additional I/O, this patch can be applied.
> Ok.  You didn't put this into your v11 patch so it can be submitted to
> the committers as a separate patch.
>
>>> I don't understand why you're calling 
>>> logfile_writename(last_file_name, last_csv_file_name);
>>> in the SIGHUP code.  Presumably you've already
>>> written the old logfile names to current_logfiles.
>>> On SIGHUP you want to write the new names, not
>>> the old ones.  But the point of the SIGHUP code
>>> is to look for changes in logfile writing and then
>>> call logfile_rotate() to make those changes.  And
>>> logfile_rotate() calls logfile_writename() as appropriate.
>>> Shouldn't the logfile_writename call in the SIGHUP
>>> code be eliminated?  
>> No, when you change the log_destination and reload the configuration you
>> have to refresh the content of current_logfiles; in this case no new
>> rotation has been done and you have to write last_file_name and/or
>> last_csv_file_name.
> I don't understand.  Please explain what's wrong with the
> picture I have of how logging operates on receipt of SIGHUP.
> Here is my understanding:
>
> The system is running normally, current_logfiles exists and contains
> the log file paths of the logs presently being written into.  These
> paths end with the file names in last_file_name and/or
> last_csv_file_name.  (I'm assuming throughout my description here that
> log_destination is writing to the file system at all.)
Yes; also, if logging_collector is on and log_destination is not stderr or
csvlog, current_logfiles exists too, but it is empty.
When logging_collector is off, this file doesn't exist.

 
> A SIGHUP arrives.  The signal handler, sigHupHandler(), sets the
> got_SIGHUP flag.  Nothing else happens.
>
> The logging process wakes up due to the signal.
> Either there's also log data or there's not.  If there's
> not:
>
> The logging process goes through it's processing loop and finds,
> at line 305 of syslogger.c, got_SIGHUP to be true.
> Then it proceeds to do a bunch of assignment statements.
> If it finds that the log directory or logfile name changed
> it requests immediate log file rotation.  It does this by
> setting the request_rotation flag.  If log_destination
> or logging_collector changed request_rotation is not set.
>
> Then, your patch adds a call to logfile_writename().
> But nothing has touched the last_file_name or last_csv_file_name
> variables.  You rewrite into current_logfiles what's
> already in current_logfiles.

Your picture is good, but you miss the case where we just change
log_destination. In this case, the syslogger adds/changes the log file
destination but rotation_requested is false. If writing to current_logfiles
were conditional on this flag, it would never reflect the file change until
the next rotation; see the next answer below for why.


If log_destination remains unchanged, I agree that I am rewriting the same
information, but I don't think that this kind of sporadic I/O is enough to
justify adding code to store the old log_destination value and check
whether it has changed, like what is done with Log_directory. This is my
opinion, but maybe I'm wrong to consider that such isolated, non-critical
I/O doesn't justify this work.


> If there is log data in the pipe on SIGHUP
> and it's csv log data then if there's a csv
> log file open that's the file that's written to.
> last_csv_file_name does not change so current_logfiles
> does not need to be re-written.  If there is no csv
> log file open then open_csvlogfile() is called
> and it calls logfile_writename().  No need to
> call logfile_writename() again when processing the
> SIGHUP.

Yes, but the problem here is that logfile_writename() doesn't write the
change, because the test (Log_destination & LOG_DESTINATION_CSVLOG)
returns false; Log_destination is set after the file is successfully
created. This makes me think that the call to logfile_writename() in
open_csvlogfile() can be removed; thanks for pointing that out.
I attached a v12 patch with some comment fixes in the call to
logfile_writename().


Regards,

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/config.sgml 

Re: [HACKERS] Proposal for changes to recovery.conf API

2016-11-04 Thread Abhijit Menon-Sen
At 2016-11-04 10:35:05 +0000, si...@2ndquadrant.com wrote:
>
> Since the "lots of parameters" approach is almost exactly what we have
> now, I think we should do that first, get this patch committed and
> then sit back and discuss an improved API and see what downsides it
> introduces.

Thanks, I agree strongly with this.

Someone (Michael?) earlier mentioned the potential for introducing bugs
with this patch (the idea of merging recovery.conf into postgresql.conf
at all, not this particular incarnation).

I think the current proposed approach with

recovery_target=xid
recovery_target_xid=xxx

is preferable because (a) it doesn't introduce much new code, e.g., new
parsing functions, nor (b) need many changes in the documentation—all we
need is to say that of the various recovery_target_xxx parameters, the
one that has priority is the one named by recovery_target.

If I were to introduce recovery_target='xid xxx', I would at a minimum
need to factor out (or duplicate) parsing and error handling code, make
a type/value union-in-struct to store in the GUC *extra, then make sure
that we handle the older values in a backwards-compatible way, and move
a bunch of documentation around.

Can it be done? Yes, of course; and I'll do it if that's the consensus.
But it would be a backwards-compatible change to the current approach
anyway, and I think it would be better to put in the simple way now
before discussing the new API further.

Let's get the basic new *functionality* out so that people can play with
it, I'm sure we'll find a few non-API things that need tweaking as a
result of the change.

-- Abhijit




Re: [HACKERS] Hash Indexes

2016-11-04 Thread Robert Haas
On Fri, Oct 28, 2016 at 12:33 AM, Amit Kapila  wrote:
> On Fri, Oct 28, 2016 at 2:52 AM, Robert Haas  wrote:
>> On Mon, Oct 24, 2016 at 10:30 AM, Amit Kapila  
>> wrote:
>>>> Amit, can you please split the buffer manager changes in this patch
>>>> into a separate patch?
>>>
>>> Sure, attached patch extend_bufmgr_api_for_hash_index_v1.patch does that.
>>
>> The additional argument to ConditionalLockBuffer() doesn't seem to be
>> used anywhere in the main patch.  Do we actually need it?
>>
>
> No, with latest patch of concurrent hash index, we don't need it.  I
> have forgot to remove it.  Please find updated patch attached.  The
> usage of second parameter for ConditionalLockBuffer() is removed as we
> don't want to allow I/O across content locks, so the patch is changed
> to fallback to twice locking the metapage.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal for changes to recovery.conf API

2016-11-04 Thread Robert Haas
On Fri, Nov 4, 2016 at 6:04 AM, Michael Paquier
 wrote:
> On Tue, Nov 1, 2016 at 11:31 PM, Robert Haas  wrote:
>> I liked Heikki's suggestion (at some point quite a while ago now) of
>> recovery_target = 'xid 123' or recovery_target='lsn 0/723' or
>> whatever.
>
> My vote goes for having two separate parameters, because as we know
> that there will be always two fields in this parameter, ...

That's not even true today: when the target is immediate, it has no
associated parameter value.  And who knows what the future may hold?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Gather Merge

2016-11-04 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Nov 4, 2016 at 12:00 PM, Thomas Munro
>  wrote:
>> Shouldn't this say just "(c) 2016, PostgreSQL Global Development
>> Group"?  Are we supposed to be blaming the University of California
>> for new files?

> If the new file contains a portion of code from this age, yes.

My habit has been to include the whole old copyright if there's anything
at all in the new file that could be considered to be copy-and-paste from
an existing file.  Frequently it's a gray area.

Legally, I doubt anyone cares much.  Morally, I see it as paying due
respect to those who came before us in this project.

regards, tom lane




Re: [HACKERS] Improve hash-agg performance

2016-11-04 Thread Andres Freund
On 2016-11-04 15:18:49 +0200, Heikki Linnakangas wrote:
> On 11/03/2016 01:07 PM, Andres Freund wrote:
> > Hi,
> > 
> > There's two things I found while working on faster expression
> > evaluation, slot deforming and batched execution. As those two issues
> > often seemed quite dominant cost-wise it seemed worthwhile to evaluate
> > them independently.
> > 
> > 1) We atm do one ExecProject() to compute each aggregate's
> >arguments. Turns out it's noticeably faster to compute the argument
> >for all aggregates in one go. Both because it reduces the amount of
> >function call / moves more things into a relatively tight loop, and
> >because it allows to deform all the required columns at once, rather
> >than one-by-one.  For a single aggregate it'd be faster to avoid
> >ExecProject alltogether (i.e. directly evaluate the expression as we
> >used to), but as soon you have two the new approach is faster.
> 
> Makes sense. If we do your refactoring of ExecEvalExpr into an intermediate
> opcode representation, I assume the performance difference will go away
> anyway.

Precisely.


> > 2) For hash-aggs we right now we store the representative tuple using
> >the input tuple's format, with unneeded columns set to NULL. That
> >turns out to be expensive if the aggregated-on columns are not
> >leading columns, because we have to skip over a potentially large
> >number of NULLs.  The fix here is to simply use a different tuple
> >format for the hashtable.  That doesn't cause overhead, because we
> >already move columns in/out of the hashslot explicitly anyway.

> Heh, I came to the same conclusion a couple of months ago when I was
> profiling the aggregate code. I never got around to finish up and post the
> patch I wrote back then, but here you go, for comparison. It's pretty much
> the same as what you got here. So yeah, seems like a good idea.


> + /*
> +  * Note that we don't deduplicate key columns. That would 
> complicate
> +  * the comparisons. Don't write silly queries! (Not sure if 
> that would get
> +  * through the parser and optimizer, though).

Hehe. You made me spill more coffee.


Thanks for looking!

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve hash-agg performance

2016-11-04 Thread Heikki Linnakangas

On 11/03/2016 01:07 PM, Andres Freund wrote:

Hi,

There's two things I found while working on faster expression
evaluation, slot deforming and batched execution. As those two issues
often seemed quite dominant cost-wise it seemed worthwhile to evaluate
them independently.

1) We atm do one ExecProject() to compute each aggregate's
   arguments. Turns out it's noticeably faster to compute the argument
   for all aggregates in one go. Both because it reduces the amount of
   function call / moves more things into a relatively tight loop, and
   because it allows to deform all the required columns at once, rather
   than one-by-one.  For a single aggregate it'd be faster to avoid
   ExecProject alltogether (i.e. directly evaluate the expression as we
   used to), but as soon you have two the new approach is faster.


Makes sense. If we do your refactoring of ExecEvalExpr into an 
intermediate opcode representation, I assume the performance difference 
will go away anyway.



2) For hash-aggs we right now we store the representative tuple using
   the input tuple's format, with unneeded columns set to NULL. That
   turns out to be expensive if the aggregated-on columns are not
   leading columns, because we have to skip over a potentially large
   number of NULLs.  The fix here is to simply use a different tuple
   format for the hashtable.  That doesn't cause overhead, because we
   already move columns in/out of the hashslot explicitly anyway.


Heh, I came to the same conclusion a couple of months ago when I was 
profiling the aggregate code. I never got around to finish up and post 
the patch I wrote back then, but here you go, for comparison. It's 
pretty much the same as what you got here. So yeah, seems like a good idea.
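
As a rough illustration (my example, not from either patch), the cost shows
up when the grouping key is a trailing attribute of a wide row, because with
the input-tuple format every hash-table lookup first has to step over the
NULLed leading columns:

    -- Illustrative only: group on the last attribute of a wide table.
    CREATE TABLE wide (c1 int, c2 int, c3 int, c4 int, c5 int, grp int);
    INSERT INTO wide
        SELECT i, i, i, i, i, i % 10 FROM generate_series(1, 100000) i;
    SELECT grp, count(*) FROM wide GROUP BY grp;

Storing only the needed columns in the hash table's own tuple format avoids
that per-lookup skipping.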


- Heikki

>From 18a5098a340e7c8a18bea7e83f87b181f65d1976 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Thu, 1 Sep 2016 10:42:32 +0300
Subject: [PATCH 1/1] Don't store unused columns in hash table.

---
 src/backend/executor/nodeAgg.c | 129 -
 src/include/nodes/execnodes.h  |   6 +-
 2 files changed, 108 insertions(+), 27 deletions(-)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index ce2fc28..2521423 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -1654,6 +1654,13 @@ build_hash_table(AggState *aggstate)
 	Agg		   *node = (Agg *) aggstate->ss.ps.plan;
 	MemoryContext tmpmem = aggstate->tmpcontext->ecxt_per_tuple_memory;
 	Size		entrysize;
+	int			i;
+	AttrNumber *dummyColIdx;
+
+	dummyColIdx = MemoryContextAlloc(aggstate->aggcontexts[0]->ecxt_per_tuple_memory,
+	 aggstate->numHashCols * sizeof(AttrNumber));
+	for (i = 0; i < aggstate->numHashCols; i++)
+		dummyColIdx[i] = i + 1;
 
 	Assert(node->aggstrategy == AGG_HASHED);
 	Assert(node->numGroups > 0);
@@ -1661,8 +1668,8 @@ build_hash_table(AggState *aggstate)
 	entrysize = offsetof(AggHashEntryData, pergroup) +
 		aggstate->numaggs * sizeof(AggStatePerGroupData);
 
-	aggstate->hashtable = BuildTupleHashTable(node->numCols,
-			  node->grpColIdx,
+	aggstate->hashtable = BuildTupleHashTable(aggstate->numHashCols,
+			  dummyColIdx,
 			  aggstate->phase->eqfunctions,
 			  aggstate->hashfunctions,
 			  node->numGroups,
@@ -1695,27 +1702,71 @@ build_hash_table(AggState *aggstate)
  * columns.  However, the search will be needed when we add support for
  * SQL99 semantics that allow use of "functionally dependent" columns that
  * haven't been explicitly grouped by.
+ *
+ *
+ *
+ * We build two mapping arrays. One to convert an original input tuple to
+ * the format stored in the hash table. Another to convert back.
  */
-static List *
+static void
 find_hash_columns(AggState *aggstate)
 {
 	Agg		   *node = (Agg *) aggstate->ss.ps.plan;
-	Bitmapset  *colnos;
-	List	   *collist;
+	Bitmapset  *colnos = NULL;
+	Bitmapset  *other_colnos;
 	int			i;
+	int			numHashCols;
+	AttrNumber *mapping;
+	AttrNumber	maxHashColIdx = 0;
 
 	/* Find Vars that will be needed in tlist and qual */
-	colnos = find_unaggregated_cols(aggstate);
-	/* Add in all the grouping columns */
+	other_colnos = find_unaggregated_cols(aggstate);
+
+	mapping = palloc(bms_num_members(other_colnos) + node->numCols);
+
+	numHashCols = 0;
+
+	/*
+	 * Begin by putting all the grouping columns in the front, so that they're
+	 * fast to access.
+	 */
 	for (i = 0; i < node->numCols; i++)
-		colnos = bms_add_member(colnos, node->grpColIdx[i]);
-	/* Convert to list, using lcons so largest element ends up first */
-	collist = NIL;
-	while ((i = bms_first_member(colnos)) >= 0)
-		collist = lcons_int(i, collist);
+	{
+		AttrNumber keyColIdx = node->grpColIdx[i];
+
+		mapping[numHashCols++] = keyColIdx;
+		if (keyColIdx > maxHashColIdx)
+			maxHashColIdx = keyColIdx;
+
+		/*
+		 * Note that we don't deduplicate key columns. That would complicate
+		 * the comparisons. Don't write silly 

Re: [HACKERS] Logical Replication WIP

2016-11-04 Thread Andres Freund
Hi,

(btw, I vote against tarballing patches)

+   
+
+ 
+  Name
+  Type
+  References
+  Description
+ 
+
+
+
+ 
+  oid
+  oid
+  
+  Row identifier (hidden attribute; must be explicitly 
selected)
+ 
+

+ 
+  subpublications
+  name[]
+  
+  Array of subscribed publication names. These reference the
+   publications on the publisher server.
+  

Why is this names and not oids? So you can see it across databases?

I think this again should have an owner.


 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/event_trigger.c 
b/src/backend/commands/event_trigger.c
index 68d7e46..523008d 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -112,6 +112,7 @@ static event_trigger_support_data event_trigger_support[] = 
{
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+   {"SUBSCRIPTION", true},

Hm, is that ok? Subscriptions are shared, so ...?


+   /*
+* If requested, create the replication slot on remote side for 
our
+* newly created subscription.
+*
+* Note, we can't cleanup slot in case of failure as reason for
+* failure might be already existing slot of the same name and 
we
+* don't want to drop somebody else's slot by mistake.
+*/
+   if (create_slot)
+   {
+   XLogRecPtr  lsn;
+
+   /*
+* Create the replication slot on remote side for our 
newly created
+* subscription.
+*
+* Note, we can't cleanup slot in case of failure as 
reason for
+* failure might be already existing slot of the same 
name and we
+* don't want to drop somebody else's slot by mistake.
+*/

We should really be able to recognize that based on the error code...

+/*
+ * Drop subscription by OID
+ */
+void
+DropSubscriptionById(Oid subid)
+{

+   /*
+* We must ignore errors here as that would make it impossible to drop
+* subscription when publisher is down.
+*/

I'm not convinced.  Leaving a slot around without a "record" of it on
the creating side isn't nice either. Maybe a FORCE flag or something?

+subscription_create_opt_item:
+   subscription_opt_item
+   | INITIALLY IDENT
+   {
+   if (strcmp($2, "enabled") == 0)
+   $$ = makeDefElem("enabled",
+   
 (Node *)makeInteger(TRUE), @1);
+   else if (strcmp($2, "disabled") == 0)
+   $$ = makeDefElem("enabled",
+   
 (Node *)makeInteger(FALSE), @1);
+   else
+   ereport(ERROR,
+   
(errcode(ERRCODE_SYNTAX_ERROR),
+
errmsg("unrecognized subscription option \"%s\"", $1),
+
parser_errposition(@2)));
+   }
+   | IDENT
+   {
+   if (strcmp($1, "create_slot") == 0)
+   $$ = makeDefElem("create_slot",
+   
 (Node *)makeInteger(TRUE), @1);
+   else if (strcmp($1, "nocreate_slot") == 
0)
+   $$ = makeDefElem("create_slot",
+   
 (Node *)makeInteger(FALSE), @1);
+   }
+   ;

Hm, the IDENT case ignores $1 if it's not create_slot/nocreate_slot and
thus leaves $$ uninitialized?

I again really would like to have the error checking elsewhere.
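
A minimal fix mirroring the INITIALLY branch above could look like this
(sketch only; as noted, the error checking may be better placed outside the
grammar):

    | IDENT
        {
            if (strcmp($1, "create_slot") == 0)
                $$ = makeDefElem("create_slot",
                                 (Node *) makeInteger(TRUE), @1);
            else if (strcmp($1, "nocreate_slot") == 0)
                $$ = makeDefElem("create_slot",
                                 (Node *) makeInteger(FALSE), @1);
            else
                ereport(ERROR,
                        (errcode(ERRCODE_SYNTAX_ERROR),
                         errmsg("unrecognized subscription option \"%s\"", $1),
                         parser_errposition(@1)));
        }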



- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-04 Thread Karl O. Pinc
On Mon, 31 Oct 2016 10:19:18 +0100
Gilles Darold  wrote:

> Le 31/10/2016 à 04:35, Karl O. Pinc a écrit :

> > Attached is a patch to apply on top of the v10 patch.
> >
> > It updates current_logfiles only once per log rotation.
> > I see no reason to open and write the file twice
> > if both csvlog and stderr logging is happening.  
> 
> I do not take the effort to do that because log rotation is not
> something that occurs too often and I don't want to change the
> conditional "time_based_rotation" code lines in logfile_rotate() for
> readability. My though was also that on high load, log rotation is
> automatically disabled so logfile_writename() is not called and there
> will be no additional I/O. But why not, if commiters want to save this
> additional I/O, this patch can be applied.

Ok.  You didn't put this into your v11 patch so it can be submitted to
the committers as a separate patch.

> > I don't understand why you're calling 
> > logfile_writename(last_file_name, last_csv_file_name);
> > in the SIGHUP code.  Presumably you've already
> > written the old logfile names to current_logfiles.
> > On SIGHUP you want to write the new names, not
> > the old ones.  But the point of the SIGHUP code
> > is to look for changes in logfile writing and then
> > call logfile_rotate() to make those changes.  And
> > logfile_rotate() calls logfile_writename() as appropriate.
> > Shouldn't the logfile_writename call in the SIGHUP
> > code be eliminated?  
> 
> No, when you change the log_destination and reload configuration you
> have to refresh the content of current_logfiles, in this case no new
> rotation have been done and you have to write last_file_name and/or
> last_csv_file_name.

I don't understand.  Please explain what's wrong with the
picture I have of how logging operates on receipt of SIGHUP.
Here is my understanding:

The system is running normally, current_logfiles exists and contains
the log file paths of the logs presently being written into.  These
paths end with the file names in last_file_name and/or
last_csv_file_name.  (I'm assuming throughout my description here that
log_destination is writing to the file system at all.)

A SIGHUP arrives.  The signal handler, sigHupHandler(), sets the
got_SIGHUP flag.  Nothing else happens.

The logging process wakes up due to the signal.
Either there's also log data or there's not.  If there's
not:

The logging process goes through its processing loop and finds,
at line 305 of syslogger.c, got_SIGHUP to be true.
Then it proceeds to do a bunch of assignment statements.
If it finds that the log directory or logfile name changed
it requests immediate log file rotation.  It does this by
setting the request_rotation flag.  If log_destination
or logging_collector changed request_rotation is not set.

Then, your patch adds a call to logfile_writename().
But nothing has touched the last_file_name or last_csv_file_name
variables.  You rewrite into current_logfiles what's
already in current_logfiles.


If there is log data in the pipe on SIGHUP
and it's csv log data then if there's a csv
log file open that's the file that's written to.
last_csv_file_name does not change so current_logfiles
does not need to be re-written.  If there is no csv
log file open then open_csvlogfile() is called
and it calls logfile_writename().  No need to
call logfile_writename() again when processing the
SIGHUP.

If there is log data in the pipe on SIGHUP
and it's stderr log data then the currently open
stderr log file is written to.  This is already
in current_logfiles so no need to call logfile_writename().


Looking at what happens after your call to logfile_writename()
in SysLoggerMain(), there are no changes to the log files
without calling logfile_writename.

If there's stderr
log messages in the pipe these get written to the currently
open stderr log file until log rotation changes the file
name.  This either happens immediately because
request_rotation was set, or it waits.

If there's csv log messages in the pipe then these are either
written to the currently open log file or, when no
csv log file is open, open_csvlogfile() calls logfile_writename().
Either way, no need to re-write current_logfiles until
log rotation.

I'll respond to the rest of this email later, hopefully later
today.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Indexes

2016-11-04 Thread Robert Haas
On Thu, Nov 3, 2016 at 6:25 AM, Amit Kapila  wrote:
>> +nblkno = _hash_get_newblk(rel, pageopaque);
>>
>> I think this is not a great name for this function.  It's not clear
>> what "new blocks" refers to, exactly.  I suggest
>> FIND_SPLIT_BUCKET(metap, bucket) or OLD_BUCKET_TO_NEW_BUCKET(metap,
>> bucket) returning a new bucket number.  I think that macro can be
>> defined as something like this: bucket + (1 <<
>> (fls(metap->hashm_maxbucket) - 1)).
>>
>
> I think such a macro would not work for the usage of incomplete
> splits.  The reason is that by the time we try to complete the split
> of the current old bucket, the table half (lowmask, highmask,
> maxbucket) would have changed and it could give you the bucket in new
> table half.

Can you provide an example of the scenario you are talking about here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Typo in comment in contrib/postgres_fdw/deparse.c

2016-11-04 Thread Robert Haas
On Fri, Nov 4, 2016 at 7:20 AM, Etsuro Fujita
 wrote:
> On 2016/11/02 5:22, Robert Haas wrote:
>> On Tue, Nov 1, 2016 at 8:20 AM, Etsuro Fujita
>>  wrote:
>>>
>>> I ran into a typo in a comment in contrib/postgres_fdw/deparse.c. Please
>>> find attached a patch.
>
>> Committed.
>
> Thanks!
>
> I found another typo in postgres_fdw.c.  Attached is a patch for fixing
> that.

OK, committed that, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-11-04 Thread Andres Freund
Hi,

+ 
+  pg_publication_rel
+
+  
+   pg_publication_rel
+  
+
+  
+   The pg_publication_rel catalog contains
+   mapping between tables and publications in the database. This is many to
+   many mapping.
+  

I wonder if we shouldn't abstract this a bit away from relations to
allow other objects to be exported to. Could structure it a bit more
like pg_depend.


+ALTER PUBLICATION name [ [ WITH ] 
option [ ... ] ]
+
+where option can 
be:
+
+  PuBLISH_INSERT | NOPuBLISH_INSERT
+| PuBLISH_UPDATE | NOPuBLISH_UPDATE
+| PuBLISH_DELETE | NOPuBLISH_DELETE

That's odd casing.


+   
+PuBLISH_INSERT
+NOPuBLISH_INSERT
+PuBLISH_UPDATE
+NOPuBLISH_UPDATE
+PuBLISH_DELETE
+NOPuBLISH_DELETE

More odd casing.

+   
+FOR TABLE
+
+ 
+  Specifies optional list of tables to add to the publication.
+ 
+
+   
+
+   
+FOR TABLE ALL IN SCHEMA
+
+ 
+  Specifies optional schema for which all logged tables will be added to
+  publication.
+ 
+
+   

"FOR TABLE ALL IN SCHEMA" sounds weird.


+  
+   This operation does not reserve any resources on the server. It only
+   defines grouping and filtering logic for future subscribers.
+  

That's strictly speaking not true, maybe rephrase a bit?

+/*
+ * Check if relation can be in given publication and throws appropriate
+ * error if not.
+ */
+static void
+check_publication_add_relation(Relation targetrel)
+{
+   /* Must be table */
+   if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("only tables can be added to 
publication"),
+errdetail("%s is not a table",
+  
RelationGetRelationName(targetrel;
+
+   /* Can't be system table */
+   if (IsCatalogRelation(targetrel))
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("only user tables can be added to 
publication"),
+errdetail("%s is a system table",
+  
RelationGetRelationName(targetrel;
+
+   /* UNLOGGED and TEMP relations cannot be part of publication. */
+   if (!RelationNeedsWAL(targetrel))
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("UNLOGGED and TEMP relations cannot be 
replicated")));
+}

This probably means we need a check in the ALTER TABLE ... SET UNLOGGED
path.


+/*
+ * Returns if relation represented by oid and Form_pg_class entry
+ * is publishable.
+ *
+ * Does same checks as the above, but does not need relation to be opened
+ * and also does not throw errors.
+ */
+static bool
+is_publishable_class(Oid relid, Form_pg_class reltuple)
+{
+   return reltuple->relkind == RELKIND_RELATION &&
+   !IsCatalogClass(relid, reltuple) &&
+   reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
+   /* XXX needed to exclude information_schema tables */
+   relid >= FirstNormalObjectId;
+}

Shouldn't that be IsCatalogRelation() instead?


+CREATE VIEW pg_publication_tables AS
+SELECT
+P.pubname AS pubname,
+N.nspname AS schemaname,
+C.relname AS tablename
+FROM pg_publication P, pg_class C
+ JOIN pg_namespace N ON (N.oid = C.relnamespace)
+WHERE C.relkind = 'r'
+  AND C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname));

That's going to be quite inefficient if you filter by table... Might be
better to do that via the underlying table.
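
Something along these lines, going through the mapping catalog directly,
would let a filter on schemaname/tablename use the catalog indexes (sketch
only; the prpubid/prrelid column names are assumed from the patch's
pg_publication_rel catalog):

    CREATE VIEW pg_publication_tables AS
        SELECT P.pubname AS pubname,
               N.nspname AS schemaname,
               C.relname AS tablename
        FROM pg_publication P
             JOIN pg_publication_rel PR ON PR.prpubid = P.oid
             JOIN pg_class C ON C.oid = PR.prrelid
             JOIN pg_namespace N ON N.oid = C.relnamespace;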


+/*
+ * Create new publication.
+ * TODO ACL check
+ */

Hm?

+ObjectAddress
+CreatePublication(CreatePublicationStmt *stmt)
+{
+   check_replication_permissions();

+
+/*
+ * Drop publication by OID
+ */
+void
+DropPublicationById(Oid pubid)
+
+/*
+ * Remove relation from publication by mapping OID.
+ */
+void
+RemovePublicationRelById(Oid proid)
+{

Permission checks?

+}

Hm. Neither of these does dependency checking, wonder if that can be
argued to be problematic.


+/*
+ * Gather Relations based o provided by RangeVar list.
+ * The gathered tables are locked in ShareUpdateExclusiveLock mode.
+ */

s/o/on/.  Not sure if gather is the best name.

+static List *
+GatherTableList(List *tables)


+/*
+ * Close all relations in the list.
+ */
+static void
+CloseTables(List *rels)

Shouldn't that be CloseTableList() based on the preceding function's naming?

+
+/*
+ * Add listed tables to the publication.
+ */
+static void
+PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
+AlterPublicationStmt *stmt)
+{
+   ListCell   *lc;
+
+   Assert(!stmt || !stmt->for_all_tables);
+
+   

Re: [HACKERS] Gather Merge

2016-11-04 Thread Robert Haas
On Thu, Nov 3, 2016 at 11:00 PM, Thomas Munro
 wrote:
> + /*
> + * Avoid log(0)...
> + */
> + N = (path->num_workers < 2) ? 2.0 : (double) path->num_workers;
> + logN = LOG2(N);
> ...
> + /* Per-tuple heap maintenance cost */
> + run_cost += path->path.rows * comparison_cost * 2.0 * logN;
>
> Why multiply by two?  The comment above this code says "about log2(N)
> comparisons to delete the top heap entry and another log2(N)
> comparisons to insert its successor".  In fact gather_merge_getnext
> calls binaryheap_replace_first, which replaces the top element without
> any comparisons at all and then performs a sift-down in log2(N)
> comparisons to find its new position.  There is no per-tuple "delete"
> involved.  We "replace" the top element with the value it already had,
> just to trigger the sift-down, because we know that our comparator
> function might have a new opinion of the sort order of this element.
> Very clever!  The comment and the 2.0 factor in cost_gather_merge seem
> to be wrong though -- or am I misreading the code?

See cost_merge_append, and the header comments threreto.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-11-04 Thread Andres Freund
Hi,


/* Prototypes for interface functions */
-static void libpqrcv_connect(char *conninfo);
-static char *libpqrcv_get_conninfo(void);
-static void libpqrcv_identify_system(TimeLineID *primary_tli);
-static void libpqrcv_readtimelinehistoryfile(TimeLineID tli, char **filename, 
char **content, int *len);
-static bool libpqrcv_startstreaming(TimeLineID tli, XLogRecPtr startpoint,
-   char *slotname);
-static void libpqrcv_endstreaming(TimeLineID *next_tli);
-static int libpqrcv_receive(char **buffer, pgsocket *wait_fd);
-static void libpqrcv_send(const char *buffer, int nbytes);
-static void libpqrcv_disconnect(void);
+static WalReceiverConn *libpqrcv_connect(char *conninfo,
+   
 bool logical, const char *appname);
+static char *libpqrcv_get_conninfo(WalReceiverConn *conn);
+static char *libpqrcv_identify_system(WalReceiverConn *conn,
+ 
TimeLineID *primary_tli);
+static void libpqrcv_readtimelinehistoryfile(WalReceiverConn *conn,
+TimeLineID 
tli, char **filename,
+char 
**content, int *len);
+static bool libpqrcv_startstreaming(WalReceiverConn *conn,
+TimeLineID tli, 
XLogRecPtr startpoint,
+const char *slotname);
+static void libpqrcv_endstreaming(WalReceiverConn *conn,
+ TimeLineID 
*next_tli);
+static int libpqrcv_receive(WalReceiverConn *conn, char **buffer,
+pgsocket *wait_fd);
+static void libpqrcv_send(WalReceiverConn *conn, const char *buffer,
+ int nbytes);
+static void libpqrcv_disconnect(WalReceiverConn *conn);


That looks good.

 /* Prototypes for private functions */
-static bool libpq_select(int timeout_ms);
+static bool libpq_select(PGconn *streamConn,
+int timeout_ms);

If we're starting to use this more widely, we really should just use a
latch instead of the plain select(). In fact, I think it's more or less a
bug that we don't (select is only interruptible by signals on a subset of
our platforms).  That shouldn't bother this patch, but...
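
For illustration, something like this instead of the raw select() would keep
the receiver interruptible (sketch only; the exact WaitLatchOrSocket()
signature varies across branches):

    /* Sketch: wait on the latch and the libpq socket instead of select() */
    rc = WaitLatchOrSocket(MyLatch,
                           WL_LATCH_SET | WL_SOCKET_READABLE | WL_TIMEOUT,
                           PQsocket(streamConn),
                           timeout_ms);
    if (rc & WL_LATCH_SET)
    {
        ResetLatch(MyLatch);
        CHECK_FOR_INTERRUPTS();
    }
    return (rc & WL_SOCKET_READABLE) != 0;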



This looks pretty close to committable, Peter do you want to do that, or
should I?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-04 Thread Karl O. Pinc
On Thu, 3 Nov 2016 18:34:50 -0500
"Karl O. Pinc"  wrote:

> On Mon, 31 Oct 2016 09:26:27 +0100
> Gilles Darold  wrote:
> 
> > Le 30/10/2016 à 08:04, Karl O. Pinc a écrit :  
> 
> > > Have you given any thought to my proposal to change
> > > CURRENT_LOG_FILENAME to LOG_METAINFO_FILE?
> > Yes, I don't think the information logged in this file are kind of
> > meta information and CURRENT_LOG_FILENAME seems obvious. 

> ... maybe the right name is LOG_METAINFO_DATAFILE.

CURRENT_LOGFILES_FILENAME is good, but a bit long.

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-11-04 Thread Andres Freund
/*
  * Replication slot on-disk data structure.
@@ -225,10 +226,25 @@ ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlot *slot = NULL;
int i;

-   Assert(MyReplicationSlot == NULL);
+   /* Only aka ephemeral slots can survive across commands. */

What does this comment mean?


+   Assert(!MyReplicationSlot ||
+  MyReplicationSlot->data.persistency == RS_EPHEMERAL);

+   if (MyReplicationSlot)
+   {
+   /* Already acquired? Nothis to do. */

typo.

+   if (namestrcmp(&MyReplicationSlot->data.name, name) == 0)
+   return;
+
+   ereport(ERROR,
+   
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+errmsg("cannot create replication slot %s, 
another slot %s is "
+   "already active in this 
session",
+   name, 
NameStr(MyReplicationSlot->data.name;
+   }
+

Why do we now create slots that are already created? That seems like an
odd API change.

/*
 * If some other backend ran this code concurrently with us, we'd likely
 * both allocate the same slot, and that would be bad.  We'd also be at
@@ -331,10 +347,25 @@ ReplicationSlotAcquire(const char *name)
int i;
int active_pid = 0;

-   Assert(MyReplicationSlot == NULL);
+   /* Only aka ephemeral slots can survive across commands. */
+   Assert(!MyReplicationSlot ||
+  MyReplicationSlot->data.persistency == RS_EPHEMERAL);

ReplicationSlotValidateName(name, ERROR);

+   if (MyReplicationSlot)
+   {
+   /* Already acquired? Nothis to do. */
+   if (namestrcmp(&MyReplicationSlot->data.name, name) == 0)
+   return;
+
+   ereport(ERROR,
+   
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+errmsg("cannot acquire replication slot %s, 
another slot %s is "
+   "already active in this 
session",
+   name, 
NameStr(MyReplicationSlot->data.name;
+   }
+
/* Search for the named slot and mark it active if we find it. */
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (i = 0; i < max_replication_slots; i++)
@@ -406,12 +437,26 @@ ReplicationSlotRelease(void)
 }

Uh? We shouldn't ever have to acquire ephemeral

 /*
+ * Same as above but only if currently acquired slot is peristent one.
+ */

s/peristent/persistent/

+void
+ReplicationSlotReleasePersistent(void)
+{
+   Assert(MyReplicationSlot);
+
+   if (MyReplicationSlot->data.persistency == RS_PERSISTENT)
+   ReplicationSlotRelease();
+}

Ick.



Hm. I think I have to agree a bit with Peter here.  Overloading
MyReplicationSlot this way seems ugly, and I think there's a bunch of
bugs around it too.


Sounds like what we really want is a) two different lifetimes for ephemeral
slots, session and "command", and b) a number of slots that are released
either after a failed transaction / command or at session end.  The easiest
way to do that appears to be to have a list of slots to be checked at
end-of-xact and backend shutdown.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Typo in comment in contrib/postgres_fdw/deparse.c

2016-11-04 Thread Etsuro Fujita

On 2016/11/02 5:22, Robert Haas wrote:

On Tue, Nov 1, 2016 at 8:20 AM, Etsuro Fujita
 wrote:

I ran into a typo in a comment in contrib/postgres_fdw/deparse.c. Please
find attached a patch.



Committed.


Thanks!

I found another typo in postgres_fdw.c.  Attached is a patch for fixing 
that.


Best regards,
Etsuro Fujita
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 2cfb82b..fbe6929 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -4156,7 +4156,7 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	 * other remote clauses. For LEFT and RIGHT OUTER join, the clauses from
 	 * the outer side are added to remote_conds since those can be evaluated
 	 * after the join is evaluated. The clauses from inner side are added to
-	 * the joinclauses, since they need to evaluated while constructing the
+	 * the joinclauses, since they need to be evaluated while constructing the
 	 * join.
 	 *
 	 * For a FULL OUTER JOIN, the other clauses from either relation can not

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Push down more full joins in postgres_fdw

2016-11-04 Thread Etsuro Fujita

On 2016/11/04 13:08, Ashutosh Bapat wrote:

On Fri, Nov 4, 2016 at 9:19 AM, Etsuro Fujita
 wrote:

On 2016/11/03 18:52, Ashutosh Bapat wrote:


I wrote:

Here is the updated version,



Other than the above issue and the alias issue we discussed, I addressed
all
your comments except one on testing; I tried to add test cases where the
remote query is deparsed as nested subqueries, but I couldn't because
IIUC,
reduce_outer_joins reduced full joins to inner joins or left joins.



No always. It will do so in some cases as explained in the prologue of
reduce_outer_joins()

 * More generally, an outer join can be reduced in strength if there is a
 * strict qual above it in the qual tree that constrains a Var from the
 * nullable side of the join to be non-null.  (For FULL joins this applies
 * to each side separately.)


Right.


Would it be necessary for this patch to include test cases for nested
subqueries?



A patch should have testcases testing the functionality added. This
patch adds functionality to deparse nested subqueries, so there should
be testcase to test it.


OK, I added such a test case.


This patch again has a lot of unrelated changes, esp. in
deparseSelectStmtForRel(). What was earlier part of deparseSelectSql()
and deparseFromExpr() is now flattened in deparseSelectStmtForRel(),
which seems unnecessary.



IIUC, I think this change was done to address your comment (see the comment
#2 in [1]).  Am I missing something?



There is some misunderstanding here. That comment basically says,
deparseRangeTblRef() duplicates code in deparseSelectStmtForRel(), so
we should either remove deparseRangeTblRef() altogether or should keep
it to minimal avoiding duplication of code. What might have confused
you is the last sentence in that comment "This will also make the
current changes to deparseSelectSql unnecessary." By current changes I
meant changes to deparseSelectSql() in your patch, not the one that's
in the repository. I don't think we should flatten
deparseSelectStmtForRel() in this patch.


I noticed that I misunderstood your words.  Sorry about that.  I agree 
with you, so I removed that change from the patch.


Attached is an updated version of the patch.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 109,114  typedef struct deparse_expr_cxt
--- 109,116 
  /* Handy macro to add relation name qualification */
  #define ADD_REL_QUALIFIER(buf, varno)	\
  		appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno))
+ #define SS_TAB_ALIAS_PREFIX	"s"
+ #define SS_COL_ALIAS_PREFIX	"c"
  
  /*
   * Functions to determine whether an expression can be evaluated safely on
***
*** 167,172  static void appendConditions(List *exprs, deparse_expr_cxt *context);
--- 169,180 
  static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
  	RelOptInfo *joinrel, bool use_alias, List **params_list);
  static void deparseFromExpr(List *quals, deparse_expr_cxt *context);
+ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
+    bool make_subquery, List **params_list);
+ static void appendSubselectAlias(StringInfo buf, int tabno, int ncols);
+ static void getSubselectAliasInfo(Var *node, RelOptInfo *foreignrel,
+ 	  int *tabno, int *colno);
+ static bool isSubqueryExpr(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno);
  static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
  static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
  static void appendAggOrderBy(List *orderList, List *targetList,
***
*** 1155,1165  deparseLockingClause(deparse_expr_cxt *context)
--- 1163,1182 
  	StringInfo	buf = context->buf;
  	PlannerInfo *root = context->root;
  	RelOptInfo *rel = context->scanrel;
+ 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private;
  	int			relid = -1;
  
  	while ((relid = bms_next_member(rel->relids, relid)) >= 0)
  	{
  		/*
+ 		 * Ignore relation if it appears in a lower subquery, because in that
+ 		 * case we would have already considered locking for the relation
+ 		 * while deparsing the lower subquery.
+ 		 */
+ 		if (bms_is_member(relid, fpinfo->subquery_rels))
+ 			continue;
+ 
+ 		/*
  		 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
  		 * initial row fetch, rather than later on as is done for local
  		 * tables. The extra roundtrips involved in trying to duplicate the
***
*** 1347,1364  deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
  
  	if (foreignrel->reloptkind == RELOPT_JOINREL)
  	{
- 		RelOptInfo *rel_o = fpinfo->outerrel;
- 		RelOptInfo *rel_i = fpinfo->innerrel;
  		StringInfoData join_sql_o;
  		StringInfoData join_sql_i;
  
  		/* Deparse outer relation */
  		initStringInfo(&join_sql_o);
! 		deparseFromExprForRel(&join_sql_o, root, rel_o, true, 

Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2016-11-04 Thread Ashutosh Bapat
On Mon, Oct 31, 2016 at 6:37 PM, Robert Haas  wrote:
> On Fri, Oct 28, 2016 at 3:09 AM, Ashutosh Bapat
>  wrote:
>> I think there are going to be two kinds of partitioning use-cases.
>> First, carefully hand-crafted by DBAs so that every partition is
>> different from other and so is every join between two partitions.
>> There will be lesser number of partitions, but creating paths for each
>> join between partitions will be crucial from performance point of
>> view. Consider, for example, systems which use partitions to
>> consolidate results from different sources for analytical purposes or
>> sharding. If we consider various points you have listed in [1] as to
>> why a partition is equivalent to a table, each join between partitions
>> is going to have very different characteristics and thus deserves a
>> set of paths for its own. Add to that possibility of partition pruning
>> or certain conditions affecting particular partitions, the need for
>> detailed planning evident.
>>
>> The other usage of partitioning is to distribute the data and/or
>> quickly eliminate the data by partition pruning. In such case, all
>> partitions of a given table will have very similar properties. There
>> is a large chance that we will end up having same plans for every
>> partition and for joins between partitions. In such cases, I think it
>> suffices to create paths for just one or may be a handful partitions
>> of join and repeat that plan for other partitions of join. But in such
>> cases it also makes sense to have a light-weight representation for
>> partitions as compared to partitions being a full-fledged tables. If
>> we have such a light-weight representation, we may not even create
>> RelOptInfos representing joins between partitions, and different paths
>> for each join between partitions.
>
> I'm not sure I see a real distinction between these two use cases.  I
> think that the problem of differing data distribution between
> partitions is almost always going to be an issue.  Take the simple
> case of an "orders" table which is partitioned by month.  First, the
> month that's currently in progress may be much smaller than a typical
> completed month.  Second, many businesses are seasonal and may have
> many more orders at certain times of year.  For example, in American
> retail, many businesses have large spikes in December.  I think some
> businesses may do four times as much business in December as any other
> month, for example.  So you will have that sort of variation, at
> least.
>
>> A typical join tree will be composite: some portion partitioned and
>> some portion unpartitioned or different portions partitioned by
>> different partition schemes. In such case, inaccurate costs for
>> PartitionJoinPath, can affect the plan heavily, causing a suboptimal
>> path to be picked. Assuming that partitioning will be useful for large
>> sets of data, choosing a suboptimal plan can be more dangerous than
>> consuming memory for creating paths.
>
> Well, sure.  But, I mean, every simplifying assumption which the
> planner makes to limit resource consumption could have that effect.
> join_collapse_limit, for example, can cause horrible plans.  However,
> we have it anyway, because the alternative of having planning take far
> too long is unpalatable.  Planning is always, at some level,
> guesswork.

My point is, this behaviour is configurable. Users who are ready to
spend time and resources to get the best plan are still able to do so
by choosing a higher limit for join_collapse_limit. Those who cannot
afford that willingly accept inferior plans by setting
join_collapse_limit to a lower number.
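
For instance (illustrative only), a session that can afford exhaustive
planning could simply run:

    SET join_collapse_limit = 12;   -- spend more planning effort for a better plan

while cost-sensitive sessions keep the cheaper default.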

>
>> A possible solution would be to keep the track of used paths using a
>> reference count. Once the paths for given join tree are created, free
>> up the unused paths by traversing pathlist in each of the RelOptInfos.
>> Attached patch has a prototype implementation for the same. There are
>> some paths which are not linked to RelOptInfos, which need a bit
>> different treatment, but they can be handled too.
>
> So, if you apply this with your previous patch, how much does it cut
> down memory consumption?

Answered this below:

>
>>> In that way, peak memory usage only grows by
>>> about a factor of 2 rather than a factor equal to the partition count,
>>> because we don't need to keep every possibly-useful path for every
>>> partition all at the same time, but rather every possibly-useful path
>>> for a single partition.
>>>
>>> Maybe there are other ideas but I have a feeling any way you slice it
>>> this is going to be a lot of work.
>>
>> For the case of carefully hand-crafted partitions, I think, users
>> would expect the planner to use really the best plan and thus may be
>> willing to accommodate for increased memory usage. Going by any
>> approach that does not create the paths for joins between partitions
>> is not guaranteed to give the 

Re: [HACKERS] Proposal for changes to recovery.conf API

2016-11-04 Thread Simon Riggs
On 4 November 2016 at 10:04, Michael Paquier  wrote:
> On Tue, Nov 1, 2016 at 11:31 PM, Robert Haas  wrote:
>> I liked Heikki's suggestion (at some point quite a while ago now) of
>> recovery_target = 'xid 123' or recovery_target='lsn 0/723' or
>> whatever.
>
> My vote goes for having two separate parameters, because as we know
> that there will be always two fields in this parameter, there is no
> need to complicate the GUC machinery with a new special case when
> parsing its value. Having two parameters would also make easier the
> life of anybody maintaining a library parsing parameters for values
> and doing in-place updates of those values. For example, I maintain a
> set of routines in Python doing that with some fancy regex routines,
> and that would avoid having to handle a special case when willing to
> update for example the same recovery target with a new value.

Parameters are required to all make sense independently, so two
parameters is off the table, IMHO.

The choice is one parameter, as Robert mentions again, or lots of
parameters (or both of those).

Since the "lots of parameters" approach is almost exactly what we have
now, I think we should do that first, get this patch committed and
then sit back and discuss an improved API and see what downsides it
introduces. Further delay on this isn't helpful for other patches
going in.
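
For concreteness, the two shapes under discussion look roughly like this
(illustrative values only):

    # single-parameter style:
    recovery_target = 'xid 123'         # or 'lsn 0/723', 'immediate', ...

    # separate-parameters style, close to what exists today:
    recovery_target_xid = '123'
    # (or recovery_target_lsn = '0/723', recovery_target_time = '...', etc.)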

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Contents of "backup_label" and "*.backup" in pg_wal location

2016-11-04 Thread Michael Paquier
On Fri, Nov 4, 2016 at 7:04 PM, Venkata B Nagothi  wrote:
> Sure. I will look at the possibility of using XLOG_BACKUP_END in my patch.
> I am looking at the possibility of keeping the backup_label at source until
> pg_stop_backup()
> is executed and then write the STOP information and then move it across to
> the backup location.

Non-exclusive backups already do that, except that the backup is
already stopped at the moment the backup_label information is sent
back to the caller, and it is expected that the caller will write its
contents into the backed-up PGDATA in a file named backup_label.
Anyway, any design in this area for exclusive backups would be really
inconsistent. For example, take the case of
pg_start_backup -> cp -> pg_stop_backup for an exclusive backup: when
are you going to update the backup_label file with the stop info?

> I see that when the START/STOP information is written to the WAL history
> file,
> the content from the backup_label file is being copied and I am thinking to
> do the same other way around.
>
> Am I making sense? Is that in any way not possible?
>
> If this makes sense, then i would start working on an optimal design and
> look at the possibility of achieving this.

Before writing any code, I would be curious about the need behind
that; you give no reason in this thread for where this would help. Do
you actually want to record the timestamp at which the backup ends? This
could be added as a new field of pg_stop_backup for both the exclusive
and non-exclusive cases. Just an idea.
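
(Hypothetical illustration of that idea, not an existing API:)

    -- today, the non-exclusive form returns lsn, labelfile, spcmapfile:
    SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false);
    -- with a stop timestamp added it could become, say:
    -- SELECT lsn, stop_time, labelfile, spcmapfile FROM pg_stop_backup(false);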
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Contents of "backup_label" and "*.backup" in pg_wal location

2016-11-04 Thread Venkata B Nagothi
On Fri, Nov 4, 2016 at 3:44 PM, Michael Paquier 
wrote:

> On Fri, Nov 4, 2016 at 1:18 PM, Venkata B Nagothi 
> wrote:
> > I see the following contents in the file
> > "00010044.0060.backup" which was generated in the
> pg_wal
> > location during the online backup. When pg_stop_backup() is executed, the
> > following content is written which includes the content copied from the
> > backup_label file.
> >
> > [...]
> >
> > Can someone please help me know the importance of the above file?
>
> It is not actually critical, and useful for debugging (you could say
> the same about backup_label.old).
>
> > How about having the same contents in the backup_label file as well?
>
> > As of now, the backup_label file does not have any information related to
> > when and at what position the backup actually completed.
>
> Yes, and it is not actually possible to write the stop information
> because when a backup finishes the backup_label is simply removed and
> it has been included in the backup before it finishes. The role of
> this file is to provide the LSN start location from which the backup
> is able to replay things cleanly. The stop position, aka when
> everything on disk is consistent, is determined at replay by the
> XLOG_BACKUP_END record. This stop position is not something you can
> know when the backup_label file is generated. And I am of course
> talking about exclusive backups here.
>

Sure. I will look at the possibility of using XLOG_BACKUP_END in my patch.
I am looking at the possibility of keeping the backup_label at source until
pg_stop_backup()
is executed and then write the STOP information and then move it across to
the backup location.

I see that when the START/STOP information is written to the WAL history
file,
the content from the backup_label file is being copied and I am thinking to
do the same other way around.

Am I making sense? Is that in any way not possible?

If this makes sense, then i would start working on an optimal design and
look at the possibility of achieving this.

Regards,

Venkata B N
Database Consultant

Fujitsu Australia


Re: [HACKERS] Proposal for changes to recovery.conf API

2016-11-04 Thread Michael Paquier
On Tue, Nov 1, 2016 at 11:31 PM, Robert Haas  wrote:
> I liked Heikki's suggestion (at some point quite a while ago now) of
> recovery_target = 'xid 123' or recovery_target='lsn 0/723' or
> whatever.

My vote goes for having two separate parameters, because as we know
that there will be always two fields in this parameter, there is no
need to complicate the GUC machinery with a new special case when
parsing its value. Having two parameters would also make easier the
life of anybody maintaining a library parsing parameters for values
and doing in-place updates of those values. For example, I maintain a
set of routines in Python doing that with some fancy regex routines,
and that would avoid having to handle a special case when willing to
update for example the same recovery target with a new value.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-04 Thread Ashutosh Bapat
On Tue, Oct 4, 2016 at 2:35 PM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
> FWIW, I'm pretty much -1 on messing with the timing of the socket close
>> actions.  I broke that once within recent memory, so maybe I'm gun-shy,
>> but I think that the odds of unpleasant side effects greatly outweigh any
>> likely benefit there.
>
> I couldn't find any relevant mails in pgsql-hackers.  I found no problem with 
> the attached patch.  Do you think this is OK?

Few comments on this patch,

I am not sure if the following condition is a good idea in ServerLoop():
1650 if (pmState == PM_WAIT_DEAD_END || ClosedSockets)

There are no sockets to listen on, so the select() in the else branch is
going to sleep for a timeout determined by the expected sequence of events.
Just before we close the sockets, pmdie() sets AbortStartTime, which
determines the timeout for the sleep here, so it doesn't make sense
to ignore it. Instead, maybe we should change the default 60s sleep to
a 100ms sleep in DetermineSleepTime().

If not, we need to at least update the comments to indicate the other
reason for not polling using select().
 * If we are in PM_WAIT_DEAD_END state, then we don't want to accept
 * any new connections, so we don't call select(), and just sleep.
 */
memcpy((char *) &rmask, (char *) &readmask, sizeof(fd_set));

-   if (pmState == PM_WAIT_DEAD_END)
+   if (pmState == PM_WAIT_DEAD_END || ClosedSockets)

While the postmaster is terminating children, a new connection request
may arrive. We should probably close listening sockets before
terminating children in pmdie().

Otherwise this patch looks good to me. It applies and compiles
cleanly. make check-world doesn't show any failures.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-11-04 Thread Amit Kapila
On Thu, Nov 3, 2016 at 8:38 PM, Robert Haas  wrote:
> On Tue, Nov 1, 2016 at 11:31 PM, Tomas Vondra
>> The difference is that both the fast-path locks and msgNumLock went into
>> 9.2, so that end users probably never saw that regression. But we don't know
>> if that happens for clog and WAL.
>>
>> Perhaps you have a working patch addressing the WAL contention, so that we
>> could see how that changes the results?
>
> I don't think we do, yet.
>

Right.  At this stage, we are just evaluating ways (the basic idea is
to split the OS writes and flush requests into separate locks) to reduce
it.  It is difficult to speculate about results at this stage.  I think
after spending some more time (probably a few weeks), we will be in a
position to share our findings.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Danger of automatic connection reset in psql

2016-11-04 Thread Oleksandr Shulgin
On Thu, Nov 3, 2016 at 12:26 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
> A couple of doubts/suggestions:
> 1. Should pset.conn_was_reset be set to false, when we make a
> connection in do_connection()? Usually pset.conn_was_reset would be
> initialised with 0 since it's a global variable, so this may not be
> necessary. But as a precaution should we do this?
>

Hi,

Thank you for your comments.

I think this is not necessary since do_connection() is also called from
MainLoop where we clear the flag explicitly.  I also don't see a way we
could enter do_connection() with the conn_was_reset flag set to true in the
first place.

It still makes sense to initialize it to false in startup.c, though.

2. Comment below should be updated to reflect the new change
> /* fall out of loop if lexer reached EOL */
> -   if (scan_result == PSCAN_INCOMPLETE ||
> +   if (pset.conn_was_reset ||
> +   scan_result == PSCAN_INCOMPLETE ||
>

Check.  See attached v2.

3. What happens when the connection is reset while the source is a
> file say specified by \i in an interactive shell?


In this case pset.cur_cmd_interactive is false (see mainloop.c:66) and we
don't attempt to reset a failed connection:

postgres(p)=# \i 1.sql
psql:1.sql:1: FATAL:  terminating connection due to administrator command
psql:1.sql:1: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:1.sql:1: connection to server was lost
$

The psql process even exits with error code 2, which might not be
expected.  We could stop reading the file and reset the connection
afterwards, but this is probably not that easy to achieve (think of
nested \i calls).

I will still try to see if we can observe the blocking status of a read
on stdin and whether that might help protect against the more complex
case of pasting into the terminal.

--
Alex
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
new file mode 100644
index a7789df..34a4507
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** CheckConnection(void)
*** 386,391 
--- 386,393 
  		}
  		else
  			psql_error("Succeeded.\n");
+ 
+ 		pset.conn_was_reset = true;
  	}
  
  	return OK;
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
new file mode 100644
index 37dfa4d..c049a39
*** a/src/bin/psql/mainloop.c
--- b/src/bin/psql/mainloop.c
*** MainLoop(FILE *source)
*** 390,401 
  	break;
  			}
  
! 			/* fall out of loop if lexer reached EOL */
  			if (scan_result == PSCAN_INCOMPLETE ||
! scan_result == PSCAN_EOL)
  break;
  		}
  
  		/* Add line to pending history if we didn't execute anything yet */
  		if (pset.cur_cmd_interactive && !line_saved_in_history)
  			pg_append_history(line, history_buf);
--- 390,404 
  	break;
  			}
  
! 			/* fall out of loop if lexer reached EOL or connection was reset */
  			if (scan_result == PSCAN_INCOMPLETE ||
! scan_result == PSCAN_EOL ||
! pset.conn_was_reset)
  break;
  		}
  
+ 		pset.conn_was_reset = false;
+ 
  		/* Add line to pending history if we didn't execute anything yet */
  		if (pset.cur_cmd_interactive && !line_saved_in_history)
  			pg_append_history(line, history_buf);
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
new file mode 100644
index 8cfe9d2..39a4be0
*** a/src/bin/psql/settings.h
--- b/src/bin/psql/settings.h
*** typedef struct _psqlSettings
*** 102,107 
--- 102,110 
  	FILE	   *cur_cmd_source; /* describe the status of the current main
   * loop */
  	bool		cur_cmd_interactive;
+ 	bool		conn_was_reset;	/* indicates that the connection was reset
+  * during the last attempt to execute an
+  * interactive command */
  	int			sversion;		/* backend server version */
  	const char *progname;		/* in case you renamed psql */
  	char	   *inputfile;		/* file being currently processed, if any */
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
new file mode 100644
index 7ce05fb..e238de9
*** a/src/bin/psql/startup.c
--- b/src/bin/psql/startup.c
*** main(int argc, char *argv[])
*** 139,144 
--- 139,145 
  	pset.last_error_result = NULL;
  	pset.cur_cmd_source = stdin;
  	pset.cur_cmd_interactive = false;
+ 	pset.conn_was_reset = false;
  
  	/* We rely on unmentioned fields of pset.popt to start out 0/false/NULL */
  	pset.popt.topt.format = PRINT_ALIGNED;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL consistency check facility

2016-11-04 Thread Michael Paquier
On Fri, Nov 4, 2016 at 5:02 PM, Michael Paquier
 wrote:
> On Thu, Nov 3, 2016 at 11:17 PM, Kuntal Ghosh
>  wrote:
>> I've updated the patch for review.
>
> Thank you for the new patch. This will be hopefully the last round of
> reviews, we are getting close to something that has an acceptable
> shape.

One last thing: in XLogRecordAssemble(), could you enforce the value
of info at the beginning of the routine when wal_consistency[rmid] is
true? And then use the value of info to decide if include_image is
true or not? The idea here is to allow callers of XLogInsert() to pass
by themselves XLR_CHECK_CONSISTENCY and still have consistency checks
enabled for a given record even if wal_consistency is false for the
rmgr of the record happening. This would be potentially useful for
extension and feature developers when debugging some stuff, for some
builds compiled with a DEBUG flag, or whatever.
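
Roughly, that suggestion amounts to something like this at the top of
XLogRecordAssemble() (sketch only; the include_image/needs_backup names are
assumed from the patch under review):

    /* Sketch: fold the per-rmgr GUC into the record's info flags up front */
    if (wal_consistency[rmid])
        info |= XLR_CHECK_CONSISTENCY;

    /* ... later, decide on full-page images from info alone ... */
    include_image = needs_backup || (info & XLR_CHECK_CONSISTENCY) != 0;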
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-11-04 Thread Shay Rojansky
>
> Yes. Before doing this change, it is better to confirm the approach and
> then do all the changes.
>
> 1. Does everyone agrees that renaming the existing datatype without
> changing the OID?
>

As I said before, Npgsql for one loads data types by name, not by OID. So
this would definitely cause breakage.

For users who actually need the new variable-length type, it seems
perfectly reasonable to ask to switch to a new type - after all they're
making a change in their system. It would really be preferable to leave the
current type alone and create a new one.
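
For context (illustrative only, not Npgsql's actual code), a driver that
resolves types by name rather than by OID essentially does this at startup:

    SELECT oid, typname FROM pg_type WHERE typname = 'macaddr';

so renaming the existing type breaks that lookup even if its OID is preserved.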


Re: [HACKERS] Re: [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200)

2016-11-04 Thread Kyotaro HORIGUCHI
Hello,

(the header of this message is crafted, so this message might end up
isolated from the thread)

The patch still applies on the current master, with some displacement.

> On Tue, Aug 30, 2016 at 1:44 PM, Fujii Masao  wrote:
> > On Tue, Aug 30, 2016 at 1:32 PM, Michael Paquier
> >  wrote:
> >> On Mon, Aug 29, 2016 at 11:16 PM, Fujii Masao  
> >> wrote:
> >>> You seem to add another entry for this patch into CommitFest.
> >>> Either of them needs to be removed.
> >>> https://commitfest.postgresql.org/10/698/
> >>
> >> Indeed. I just removed this one.
> >>
> >>> This patch prevents pg_stop_backup from starting while pg_start_backup
> >>> is running. OTOH, we also should prevent pg_start_backup from starting
> >>> until "previous" pg_stop_backup has completed? What happens if
> >>> pg_start_backup starts while pg_stop_backup is running?
> >>> As far as I read the current code, ISTM that there is no need to do that,
> >>> but it's better to do the double check.
> >>
> >> I don't agree about doing that.
> >
> > Hmm... my previous comment was confusing. I wanted to comment "it's better
> > *also for you* to do the double check whether we need to prevent 
> > pg_start_backup
> > while pg_stop_backup is running or not". Your answer seems to be almost the 
> > same
> > as mine, i.e., not necessary. Right?
> 
> Yes, that's not necessary to do more. In the worst case, say
> pg_start_backup starts when pg_stop_backup is running. And
> pg_stop_backup has decremented the backup counters, but not yet
> removed the backup_label, then pg_start_backup would just choke on the
> presence of the backup_label file

I don't see any problem with the state transitions of
exclusiveBackupState. As for the following part:

@@ -10217,7 +10255,7 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
 {
   /*
* Check for existing backup label --- implies a backup is already
-   * running.  (XXX given that we checked exclusiveBackup above,
+   * running.  (XXX given that we checked exclusiveBackupState above,
* maybe it would be OK to just unlink any such label file?)

The issue in the XXX comment should be settled while we are at
it. PostgreSQL itself no longer places (or should not place) the
file by mistake. The file can only be there because someone
crafted it maliciously, or because the server crashed and was
restarted. In the former case it can be safely removed with some
notice. In the latter case we should remove it as well, since a
backup taken across such a restart is not reliable. In addition,
issuing pg_start_backup again indicates that the operator has
already forgotten that it was issued previously. So it seems to me
that the file can simply be removed with a WARNING.
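
For illustration, the "remove with a WARNING" behaviour in
do_pg_start_backup() could look roughly like the following. This is only
a sketch of the idea; the message wording and the exact placement are my
assumptions, not what the patch currently does.

    if (stat(BACKUP_LABEL_FILE, &stat_buf) == 0)
    {
        /* leftover from a crashed exclusive backup, or hand-crafted */
        ereport(WARNING,
                (errmsg("removing stale file \"%s\"", BACKUP_LABEL_FILE),
                 errdetail("A backup_label left behind by a crashed or aborted exclusive backup cannot be used.")));
        if (unlink(BACKUP_LABEL_FILE) != 0 && errno != ENOENT)
            ereport(ERROR,
                    (errcode_for_file_access(),
                     errmsg("could not remove file \"%s\": %m",
                            BACKUP_LABEL_FILE)));
    }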

The error checking on exclusiveBackupState looks somewhat
redundant, but I cannot come up with a better way to code it.

So, everything other than the XXX comment looks fine to me, and
this is a rather straightforward patch. Once we decide what to do
about that issue, and if no one is opposed to this patch, I'll mark
it 'Ready for Committer'.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-04 Thread Gilles Darold
On 04/11/2016 at 00:34, Karl O. Pinc wrote:
> On Mon, 31 Oct 2016 09:26:27 +0100
> Gilles Darold  wrote:
>
>> On 30/10/2016 at 08:04, Karl O. Pinc wrote:
>>> Have you given any thought to my proposal to change
>>> CURRENT_LOG_FILENAME to LOG_METAINFO_FILE?  
>> Yes, I don't think the information logged in this file are kind of
>> meta information and CURRENT_LOG_FILENAME seems obvious.
> To me, the CURRENT_LOG_FILENAME symbol should contain the name 
> of the current log file.  It does not.  The CURRENT_LOG_FILENAME symbol
> holds the name of the file which itself contains the name of 
> the log file(s) being written, plus the log file
> structure of each log file.
>
> IMO, the name of the log files being written, as well as
> the type of data structure written into each log file,
> are meta-information about the logging data.  So maybe
> the right name is LOG_METAINFO_DATAFILE.
>
> If you're not happy with making this change that's fine.
> If not, I'd like to make mention of the symbol name to
> the committers.

If it needs to be changed I would prefer something like CURRENT_LOG_INFO,
but this is not really important. Please mention it, and the committer
will choose whether to change it or not.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL consistency check facility

2016-11-04 Thread Michael Paquier
On Thu, Nov 3, 2016 at 11:17 PM, Kuntal Ghosh
 wrote:
> I've updated the patch for review.

Thank you for the new patch. This will be hopefully the last round of
reviews, we are getting close to something that has an acceptable
shape.

+   
+  
+ 
+
+  
+ 
Did you try to compile the docs? Because that will break. (Likely my
fault). What needs to be done is removing one  and one
 markup.

+/*-
+ *
+ * bufmask.h
+ *   Buffer masking definitions.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/bufmask.h
+ */
We could likely survive here with just a copyright mention as 2016,
PGDG... Same remark for bufmask.c.
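
In other words, something like this should be enough (just a sketch):

    /*-------------------------------------------------------------------------
     *
     * bufmask.h
     *	  Buffer masking definitions.
     *
     * Copyright (c) 2016, PostgreSQL Global Development Group
     *
     * src/include/storage/bufmask.h
     */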

--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -25,6 +25,7 @@
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/plancat.h"
+#include "storage/bufmask.h"
 #include "utils/index_selfuncs.h"
 #include "utils/rel.h"
This header inclusion is not necessary.

+   /*
+* Mask the Page LSN. Because, we store the page before updating the LSN.
+* Hence, LSNs of both pages will always be different.
+*/
+   mask_page_lsn(page_norm);
I don't fully understand this comment as phrased. Well, I do
understand it, but people reading this code for the first time may
have a hard time with it. So I would suggest removing it and instead
adding a comment on top of mask_page_lsn() mentioning that, in
consistency checks, the LSNs of the two pages compared will likely be
different because of operations happening between the time the WAL
record is generated and the time it is applied to the page.
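
For instance, something along these lines on top of mask_page_lsn()
would do (suggested wording only):

    /*
     * mask_page_lsn
     *
     * In consistency checks, the LSNs of the two pages compared will
     * likely differ because of operations happening between the moment
     * the WAL record is generated and the moment it is applied to the
     * page, so the LSN is masked out before comparing.
     */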

+   maskopaq = (BTPageOpaque)
+   (((char *) page_norm) + ((PageHeader) page_norm)->pd_special);
+   /*
+* Mask everything on a DELETED page.
+*/
Let's make the code breathe and add a space here.

+/* Aligned Buffers dedicated to consistency checks of size BLCKSZ */
+static char *new_page_masked = NULL;
+static char *old_page_masked = NULL;
palloc'd buffers are aligned, so you could just remove the word
"Aligned" from this comment?

+   /* If we've just restored the block from backup image, skip
consistency check. */
+   if (XLogRecHasBlockImage(record, block_id) &&
+   XLogRecBlockImageApply(record, block_id))
+   continue;
Here you could just check for Apply() to decide whether continue
should be called or not, and then Assert on HasBlockImage(). The
assertion would help catch inconsistencies.
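
Roughly like this (just a sketch of the shape I have in mind):

    /* A restored full-page image needs no consistency check. */
    if (XLogRecBlockImageApply(record, block_id))
    {
        Assert(XLogRecHasBlockImage(record, block_id));
        continue;
    }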

@@ -7810,6 +7929,7 @@ ReadCheckpointRecord(XLogReaderState
*xlogreader, XLogRecPtr RecPtr,
}

record = ReadRecord(xlogreader, RecPtr, LOG, true);
+   info = record->xl_info & ~XLR_INFO_MASK;

if (record == NULL)
{
@@ -7852,8 +7972,8 @@ ReadCheckpointRecord(XLogReaderState
*xlogreader, XLogRecPtr RecPtr,
}
return NULL;
}
-   if (record->xl_info != XLOG_CHECKPOINT_SHUTDOWN &&
-   record->xl_info != XLOG_CHECKPOINT_ONLINE)
+   if (info != XLOG_CHECKPOINT_SHUTDOWN &&
+   info != XLOG_CHECKPOINT_ONLINE)
Those changes are not directly related to this patch, but they make
sure that the record checks are done correctly; without them this
patch would just fail. It may be better to apply those changes
independently first, per the patch on this thread:
https://www.postgresql.org/message-id/CAB7nPqSWVyaZJg7_amRKVqRpEP=_=54e+762+2pf9u3q3+z...@mail.gmail.com
My recommendation is to do so.

+   /*
+    * Remember that, if WAL consistency check is enabled for the current rmid,
+    * we always include backup image with the WAL record. But, during redo we
+    * restore the backup block only if needs_backup is set.
+    */
This could be rewritten a bit:
"if WAL consistency checking is enabled for the resource manager of
this WAL record, a full-page image is included in the record for the
block modified. During redo, the full-page image is replayed only if
BKPIMAGE_APPLY is set."

- * In RBM_ZERO_* modes, if the page doesn't exist, the relation is extended
- * with all-zeroes pages up to the referenced block number.  In
- * RBM_ZERO_AND_LOCK and RBM_ZERO_AND_CLEANUP_LOCK modes, the return value
+ * In RBM_ZERO_* modes, if BKPIMAGE_APPLY flag is not set for the backup block,
+ * the relation is extended with all-zeroes pages up to the referenced block number.
+ * In RBM_ZERO_AND_LOCK and RBM_ZERO_AND_CLEANUP_LOCK modes, the return value
  * is always BLK_NEEDS_REDO
You are forgetting to mention "if the page does not exist" in the new
comment block.

+   /* If it has a full-page image and it should be restored, do it. */
+   if (XLogRecHasBlockImage(record, block_id) && XLogRecBlockImageApply(record, block_id))
{
Perhaps on two lines?

The headers of the functions in bufmask.c could be more descriptive,
there should be explanations regarding in 

Re: [HACKERS] Radix tree for character conversion

2016-11-04 Thread Kyotaro HORIGUCHI
Thank you for looking into this.

At Mon, 31 Oct 2016 17:11:17 +0100, Daniel Gustafsson  wrote 
in <3fc648b5-2b7f-4585-9615-207a44b73...@yesql.se>
> > On 27 Oct 2016, at 09:23, Kyotaro HORIGUCHI 
> >  wrote:
> > Perl scripts are to be messy, I believe. Anyway the duplicate
> > check as been built into the sub print_radix_trees. Maybe the
> > same check is needed by some plain map files but it would be just
> > duplication for the maps having radix tree.
> 
> I took a small stab at doing some cleaning of the Perl scripts, mainly around
> using the more modern (well, modern as in +15 years old) form for open(..),
> avoiding global filehandles for passing scalar references and enforcing use
> strict.  Some smaller typos and fixes were also included.  It seems my Perl 
> has
> become a bit rusty so I hope the changes make sense.  The produced files are
> identical with these patches applied, they are merely doing cleaning as 
> opposed
> to bugfixing.
> 
> The attached patches are against the 0001-0006 patches from Heikki and you in
> this series of emails, the separation is intended to make them easier to read.

I'm not sure where the discussion on this is going; these patches
make me think about Perl coding style.

The distinction between the executable scripts and the library is
intentional, though the basis for it is somewhat obscure. The
existing scripts don't get much modification, while the library uses
more restricted scopes to avoid the trouble caused by global scope.
But I don't have a clear preference on that. The TAP test scripts
use OO notation, but I'm not sure whether convutils.pl would also be
better off with the same notation. It will rarely be edited from now
on and won't grow any more.

As far as I can see, the obvious bug fixes in the patchset are the
following:

- 0007: load_maptable forgets to close the input file.
- 0010: the comment for load_maptables is wrong.
- 0011: a hash reference is incorrectly dereferenced.

All fixes other than the above three seem to be styling or
syntax-generation issues, and I don't know whether any
recommendation exists for those...


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers