Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Jeff Janes
On Tue, Sep 26, 2017 at 4:31 PM, Tom Lane  wrote:

> Jeff Janes  writes:
> > I was not using -l.  Instead I set logging_collector=on in
> > postgresql.conf, but I suppose that that is not sufficient?
>
> No, because initial stderr is still connected to whatever.
>
> > But I just retried with -l, and it still gets the fast shutdown.
>
> Hmph.  Doesn't work that way for me on RHEL6, which surely oughta
> behave the same as your CentOS.
>
> regards, tom lane
>

I happened to have a virgin install of CentOS7 here, so I tried it on
that.  I installed 9.2 and 10rc1 from repositories (CentOS and PGDG,
respectively) rather than from source, repeated the test, and still got the
fast shutdown when hitting ctrl-C after 10rc1's pg_ctl had restarted the
9.2 server.

I did the initdb, start, and restart from pg_ctl, not from whatever
management tool comes with the packages.  I did it both with the database
running as OS user jjanes, and separately running as OS user 'postgres', so
it doesn't seem to matter whether uid is high or low.  And with '-l
logfile'.

So, I don't know.

Cheers,

Jeff


Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Amit Langote
Hi Jesper.

Firstly, thanks for looking at the patch.

On 2017/09/26 22:00, Jesper Pedersen wrote:
> Hi Amit,
> 
> On 09/15/2017 04:50 AM, Amit Langote wrote:
>> On 2017/09/15 11:16, Amit Langote wrote:
>>> I will post rebased patches later today, although I think the overall
>>> design of the patch on the planner side of things is not quite there yet.
>>> Of course, your and others' feedback is greatly welcome.
>>
>> Rebased patches attached.  Because Dilip complained earlier today about
>> clauses of the form (const op var) not causing partition-pruning, I've
>> added code to commute the clause where it is required.  Some other
>> previously mentioned limitations remain -- no handling of OR clauses, no
>> elimination of redundant clauses for given partitioning column, etc.
>>
>> A note about 0001: this patch overlaps with
>> 0003-Canonical-partition-scheme.patch from the partitionwise-join patch
>> series that Ashutosh Bapat posted yesterday [1].  Because I implemented
>> the planner-portion of this patch based on what 0001 builds, I'm posting
>> it here.  It might actually turn out that we will review and commit
>> 0003-Canonical-partition-scheme.patch on that thread, but meanwhile apply
>> 0001 if you want to play with the later patches.  I would certainly like
>> to review  0003-Canonical-partition-scheme.patch myself, but won't be able
>> to immediately (see below).
>>
> 
> Could you share your thoughts on the usage of PartitionAppendInfo's
> min_datum_idx / max_datum_idx ? Especially in relation to hash partitions.
> 
> I'm looking at get_partitions_for_keys.

Sure.  You may have noticed that min_datum_idx and max_datum_idx in
PartitionAppendInfo are offsets into the PartitionDescData.boundinfo.datums
array, identifying the datums that lie within the query-specified range
(that is, the range implied by the =, >, >=, <, <= btree operators in the
query).  That array contains the bounds of all partitions, sorted in the
order defined by the btree operator class, at least for list and range
partitioning.  I haven't (yet) looked closely at how hash partition datums
are laid out in PartitionBoundInfo; they may have different ordering
properties (or perhaps none) compared with list and range partitioning
datums.

Now, since they are offsets of datums in PartitionBoundInfo, not indexes
of partitions themselves, their utility outside partition.c might be
questionable.  But the partition-wise join patch, for example, is going to
check whether the PartitionBoundInfos in the RelOptInfos of two partitioned
tables are bound-to-bound equal [1] in order to determine whether the
tables can be joined partition-wise.  Partition pruning may select only a
subset of the partitions of each of the joining partitioned tables, and the
equi-join requirement for partition-wise join means that the selected
subset must be the same on both sides of the join.  My intent in having
min_datum_idx and max_datum_idx, along with contains_null_partition and
contains_default_partition, in PartitionAppendInfo is to provide a sort of
cross-check that those values end up the same on both sides of the join
once the equi-join requirement has been satisfied.  That is,
get_partitions_for_keys will have chosen the same set of partitions for
both partitioned tables and hence will have set the same values for those
fields.
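
To make that a bit more concrete, here is a rough sketch of what I have in
mind (the struct fields are the ones discussed above, but the helper below
is hypothetical and not part of any posted patch):

#include <stdbool.h>

/* Sketch only: the actual struct in the WIP patch may contain more
 * (or differently arranged) members. */
typedef struct PartitionAppendInfo
{
	int		min_datum_idx;		/* lowest boundinfo->datums offset selected */
	int		max_datum_idx;		/* highest boundinfo->datums offset selected */
	bool	contains_null_partition;
	bool	contains_default_partition;
} PartitionAppendInfo;

/* Hypothetical cross-check for partition-wise join; not in any patch yet. */
static bool
partition_selections_match(const PartitionAppendInfo *a,
						   const PartitionAppendInfo *b)
{
	return a->min_datum_idx == b->min_datum_idx &&
		   a->max_datum_idx == b->max_datum_idx &&
		   a->contains_null_partition == b->contains_null_partition &&
		   a->contains_default_partition == b->contains_default_partition;
}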

As mentioned above, that may be enough for list and range partitioning,
but since hash partition datums do not appear to have the same properties
as list and range datums [2], we may need an additional field or fields to
describe the hash partition selected by get_partitions_for_keys.  I guess
a single field will be enough: the offset in the datums array of the hash
partition chosen for the query, or -1 if the query quals couldn't
conclusively determine one (perhaps because not all partition keys were
constrained, or because some or all of them used non-equality operators).

Hope that answers your question at least to some degree.  Now, there are
some points Robert mentioned in his reply that I will need to also
consider in the patch, which I'll go do now. :)

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAFjFpRc4UdCYknBai9pBu2GA1h4nZVNPDmzgs4jOkqFamT1huA%40mail.gmail.com

[2] It appears that in the hash partitioning case, unlike list and range
partitioning, PartitionBoundInfo doesn't contain values that are
directly comparable to query-specified constants, but rather a
(modulus, remainder) pair for each partition.  We'll first hash *all*
the key values (mentioned in the query) using the partitioning hash
machinery and then determine the hash partition index by using
(hash % largest_modulus) as an offset into the
PartitionBoundInfo.indexes array.
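
In code, the lookup described above would look roughly like this (a
sketch only: 'indexes' is the PartitionBoundInfo.indexes array, while
'greatest_modulus' and the already-computed 'hash' stand in for whatever
the hash partitioning patch actually provides):

#include <stdint.h>

/* Sketch: map a hash value to a partition via the indexes array. */
static int
choose_hash_partition(const int *indexes, int greatest_modulus, uint64_t hash)
{
	int		offset = (int) (hash % (uint64_t) greatest_modulus);

	/* each remainder maps to a partition index, or -1 if none accepts it */
	return indexes[offset];
}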





Re: [HACKERS] Multicolumn hash indexes

2017-09-26 Thread Tom Lane
Robert Haas  writes:
> On Tue, Sep 26, 2017 at 7:18 PM, Tom Lane  wrote:
>> It's not simple, particularly not if you wish that the index would support
>> queries specifying conditions for just a subset of the indexed columns
>> (an assumption that's buried pretty deeply in the planner, for one thing).
>> Then you couldn't compute the hash.

> Whoa, that seems like moving the goalposts.

No, that's merely stating where the goalposts stand, and where they have
stood for the past twenty-odd years.  You might imagine they are somewhere
else, but you'd be mistaken.

There is a facility in the planner to require a condition for the first
column of an index before considering an indexscan plan.  We could perhaps
extend that to require a condition for each column of the index, though
I'm not sure how much work is involved directly in that.  The bigger
picture here though is that it puts a premium on *not* throwing away
"unnecessary" qual conditions, which is directly antithetical to a bunch
of other planner goals.

User: Why won't the planner use my multicolumn hash index?
  I have query conditions constraining all the columns!
Us: Well, one of your conditions was discarded because it was
constant-true after constant simplification, or redundant with
a partition qual or CHECK constraint, or implied by an index
predicate, or treated as a join condition instead of a
restriction condition, or absorbed into an equivalence class
and then the planner chose to emit some other equivalence
condition instead, or possibly two or three other things.
User: WH!

There are also some related problems concerning how to make index entries
for tuples that have some but not all of the indexed columns being NULL.
Maybe that goes away if you're willing to demand strict equality
constraints for all columns, but I'm not completely convinced offhand.
(See the amoptionalkey discussion in the index AM API spec for some
context.)

I agree that doing a half-baked job of this is probably within reach.
I'm uncertain about what it would take to bake it fully.

regards, tom lane




Re: [HACKERS] postgres_fdw: evaluate placeholdervars on remote server

2017-09-26 Thread Etsuro Fujita

On 2017/09/16 0:19, Robert Haas wrote:
> On Fri, Sep 15, 2017 at 10:15 AM, Daniel Gustafsson  wrote:
>> Have you had a chance to look at this such that we can expect a rebased
>> version of this patch during the commitfest?
>
> Frankly, I think things where there was a ping multiple weeks before
> the CommitFest started and no rebase before it started should be
> regarded as untimely submissions, and summarily marked Returned with
> Feedback.  The CommitFest is supposed to be a time to get things that
> are ready before it starts committed before it ends.  Some amount of
> back-and-forth during the CF is of course to be expected, but we don't
> even really have enough bandwidth to deal with the patches that are
> being timely updated, never mind the ones that aren't.

Agreed.  I marked this as RWF.  Thank you.

Best regards,
Etsuro Fujita





[HACKERS] Use of RangeVar for partitioned tables in autovacuum

2017-09-26 Thread Michael Paquier
Hi all,

I have been looking more closely at the problem in $subject, that I
have mentioned a couple of times, like here:
https://www.postgresql.org/message-id/cab7npqqa37oune_rjzpmwc4exqalx9f27-ma_-rsfl_3mj+...@mail.gmail.com

As of HEAD, the RangeVar passed to vacuum() is used for error reporting
in only two cases: for autoanalyze and autovacuum, when reporting to
users that a lock cannot be taken on a relation.  The thing is that
autovacuum always calls vacuum() on a single relation at a time, with
both a relation OID and a RangeVar defined, so the code currently on
HEAD is basically fine with its error reporting: get_rel_oids() will
always work on the relation OID with its RangeVar, and the code paths
for manual VACUUMs don't use the RangeVar for any reports.

While v10 is safe, I think that this is wrong in the long term and may
become a source of bugs if people begin to generate reports for manual
VACUUMs, which is plausible in my opinion.  The attached patch shows
what one solution would look like if we want to make things more robust
for as long as manual VACUUM commands can only specify one relation at
a time.  I have found that tracking the parent OID by tweaking
get_rel_oids() a bit was the most elegant solution.  Please note that
this class of problems is, I think, better solved with the
infrastructure included in the patch adding support for VACUUM with
multiple relations (last version here
https://www.postgresql.org/message-id/766556dd-aa3c-42f7-acf4-5dc97f41f...@amazon.com).
So I don't think that the attached patch should be integrated; still, I
am attaching it to satisfy the curiosity of anybody looking at this
message.

In conclusion, I think that the open item of $subject should be
removed from the list, and we should try to get the multi-VACUUM patch
in to cover any future problems. I'll do so if there are no
objections.

One comment on top of vacuum() is wrong by the way: in the case of a
manual VACUUM or ANALYZE, a NULL RangeVar is used if no relation is
specified in the command.

Thanks,
-- 
Michael


vac-rangevar-v1.patch
Description: Binary data



Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Amit Langote
Hi David,

On 2017/09/27 6:04, David Rowley wrote:
> On 25 September 2017 at 23:04, Amit Langote
>  wrote:
>> By the way, I'm now rebasing these patches on top of [1] and will try to
>> merge your refactoring patch in some appropriate way.  Will post more
>> tomorrow.
>>
>> [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9140cf826
> 
> Yeah, I see 0001 conflicts with that. I'm going to set this to waiting
> on author while you're busy rebasing this.

Thanks for the reminder.  Just thought I'd say that while the rebasing
itself is actually done (attaching the rebased patches to try 'em out), I'm
now considering Robert's comments and will be busy for a bit revising
things based on those comments.

Some notes about the attached patches:

- 0001 includes refactoring that Dilip proposed upthread [1] (added him as
  an author).  I slightly tweaked his patch -- renamed the function
  get_matching_clause to match_clauses_to_partkey, similar to
  match_clauses_to_index.

- The code to set AppendPath's partitioned_rels in add_paths_to_append_rel(),
  which was revised by 0a480502b09 (and originally introduced in
  d3cc37f1d80), is still revised by this patch to get partitioned_rels from
  a source other than PlannerInfo.pcinfo_list.  With the new code,
  partitioned_rels won't contain the RT indexes of the partitioned child
  tables that were pruned.


Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAFiTN-tGnQzF_4QtbOHT-3hE%3DOvNaMfbbeRxa4UY0CQyF0G8gQ%40mail.gmail.com
From f20aebcad9b089434ba60cd4439fa1a9d55091b8 Mon Sep 17 00:00:00 2001
From: amit 
Date: Wed, 13 Sep 2017 18:24:55 +0900
Subject: [PATCH 1/4] WIP: planner-side changes for partition-pruning

Firstly, this adds a stub get_partitions_for_keys() in partition.c
with appropriate interface for the caller to specify bounding scan
keys, along with other information about the scan keys extracted
from the query, such as NULL-ness of the keys, inclusive-ness, etc.

More importantly, this implements the planner-side logic to extract
bounding scan keys to be passed to get_partitions_for_keys.  That is,
it will go through rel->baserestrictinfo and match individual clauses
to partition keys and construct lower bound and upper bound tuples,
which may cover only a prefix of a multi-column partition key.

A bunch of smarts are still missing when mapping the clause operands
with keys.  For example, code to match a clause that is specified as
(constant op var) doesn't exist.  Also, redundant keys are not
eliminated; for example, with a combination of the clauses a = 10 and
a > 1, the latter clause a > 1 takes over, resulting in needless
scanning of partitions containing values a > 1 and a < 10.

...constraint exclusion is still used, because
get_partitions_for_keys is just a stub...

Authors: Amit Langote, Dilip Kumar
---
 src/backend/catalog/partition.c   |  43 
 src/backend/optimizer/path/allpaths.c | 390 ++
 src/backend/optimizer/util/plancat.c  |  10 +
 src/backend/optimizer/util/relnode.c  |   7 +
 src/include/catalog/partition.h   |   9 +
 src/include/nodes/nodes.h |   1 +
 src/include/nodes/relation.h  |  66 ++
 7 files changed, 487 insertions(+), 39 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 1ab6dba7ae..ccf8a1fa67 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1335,6 +1335,49 @@ get_partition_dispatch_recurse(Relation rel, Relation parent,
}
 }
 
+/*
+ * get_partitions_for_keys
+ * Returns the list of indexes of rel's partitions that will need to be
+ * scanned given the bounding scan keys.
+ *
+ * Each value in the returned list can be used as an index into the oids array
+ * of the partition descriptor.
+ *
+ * Inputs:
+ * keynullness contains between 0 and (key->partnatts - 1) values, each
+ * telling what kind of NullTest has been applied to the corresponding
+ * partition key column.  minkeys represents the lower bound on the partition
+ * key of the records that the query will return, while maxkeys represents
+ * the upper bound.  min_inclusive and max_inclusive tell whether the bounds
+ * specified by minkeys and maxkeys are inclusive, respectively.
+ *
+ * Other outputs:
+ * *min_datum_index will return the index in boundinfo->datums of the first
+ * datum that the query's bounding keys allow to be returned for the query.
+ * Similarly, *max_datum_index.  *null_partition_chosen returns whether
+ * the null partition will be scanned.
+ *
+ * TODO: Implement.
+ */
+List *
+get_partitions_for_keys(Relation rel,
+   NullTestType *keynullness,
+   Datum *minkeys, int n_minkeys, bool min_inclusive,
+   Datum *maxkeys, int n_maxkeys, bool max_inclusive,
+

Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Amit Langote
On 2017/09/27 1:51, Robert Haas wrote:
> On Tue, Sep 26, 2017 at 10:57 AM, Jesper Pedersen
>  wrote:
>> One could advocate (*cough*) that the hash partition patch [1] should be
>> merged first in order to find other instances where other CommitFest
>> entries don't account for hash partitions at the moment in their method
>> signatures; Beena noted something similar in [2]. I know that you said
>> otherwise [3], but this is CommitFest 1, so there is time for a revert
>> later, and hash partitions are already useful in internal testing.
> 
> Well, that's a fair point.  I was assuming that committing things in
> that order would cause me to win the "least popular committer" award
> at least for that day, but maybe not.  It's certainly not ideal to
> have to juggle that patch along and keep rebasing it over other
> changes when it's basically done, and just waiting on other
> improvements to land.  Anybody else wish to express an opinion?

FWIW, I tend to agree that it would be nice to get the hash partitioning
patch in, even with old constraint exclusion based partition-pruning not
working for hash partitions.  That way, it might be more clear what we
need to do in the partition-pruning patches to account for hash partitions.

Thanks,
Amit





Re: [HACKERS] Multicolumn hash indexes

2017-09-26 Thread Robert Haas
On Tue, Sep 26, 2017 at 7:18 PM, Tom Lane  wrote:
> Tomasz Ostrowski  writes:
>> I've noticed that hash indexes can't currently (in PG10) be multicolumn.
>> Are they technically hard to implement or just nobody took such a feature?
>
> It's not simple, particularly not if you wish that the index would support
> queries specifying conditions for just a subset of the indexed columns
> (an assumption that's buried pretty deeply in the planner, for one thing).
> Then you couldn't compute the hash.

Whoa, that seems like moving the goalposts.  Somebody could design a
hash index that was intended to answer such queries, but it's not the
one we've got.  I think we should just aim to support equality queries
on all columns.  That seems like a fairly straightforward
generalization of what we've already got.

You'd need to adjust a bunch of things, like _hash_datum2hashkey and
_hash_datum2hashkey_type but that doesn't necessarily seem like it
would be super-difficult.  One problem is that the metapage currently
stores the OID of the hashproc used for the index, and there's no
place there to store more than one.  But since that's only for
forensics anyway, we could store InvalidOid for a multi-column index,
or the value for the first column, or whatever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Tom Lane
Jeff Janes  writes:
> I was not using -l.  Instead I set logging_collector=on in postgresql.conf,
> but I suppose that that is not sufficient?

No, because initial stderr is still connected to whatever.

> But I just retried with -l, and it still gets the fast shutdown.

Hmph.  Doesn't work that way for me on RHEL6, which surely oughta
behave the same as your CentOS.

regards, tom lane




Re: [HACKERS] User-perspective knowledge about wait events

2017-09-26 Thread Michael Paquier
On Wed, Sep 27, 2017 at 3:26 AM, Schneider  wrote:
> However I think that it would be immensely helpful to start gathering
> knowledge somewhere on wait events from the user perspective.  Listing
> a few of the wait events which users will probably see most often
> along with practical suggestions of what users could further examine
> or even what they could change to make improvements on their database
> systems.

I am not sure that listing wait events individually, based on how
frequently they could be seen, is that helpful, but profiles for
different workloads matter.  For example, Postgres has pgbench in core,
so we could look at profiles generated with its generic tests: sample
the wait events every second, append the samples to a custom table, and
then, by looking at how many of each event are present, get a rough
idea of what Postgres is bottlenecked by and draw conclusions from
there.

For example, seeing a lot of SyncRep events in sessions would mean
that the synchronous standby may not be following its primary very
smoothly.  Gathering a set of examples on a wiki page with some rough
analysis would, I think, be a good start.
-- 
Michael




Re: [HACKERS] Multicolumn hash indexes

2017-09-26 Thread Robert Haas
On Tue, Sep 26, 2017 at 5:41 PM, Tomasz Ostrowski
 wrote:
> I've noticed that hash indexes can't currently (in PG10) be multicolumn. Are
> they technically hard to implement or just nobody took such a feature?
>
> I think multicolumn hash indexes should help pretty significantly with
> queries like:
> - where username=? and user_post_id=?
> - where client_id=? and period=? and invoice_number=?
> etc.
>
> I imagine that calculating a multicolumn hash should be pretty
> straightforward to implement - after hashing bytes of first column just keep
> going and update the hash state with bytes of a second and subsequent
> columns. And it should allow for faster (O(1), less IO) and much smaller
> (better cached, less IO again) multicolumn indexes. Also in PG10 hash
> indexes are WAL-logged and therefore much easier to work with. What do you
> think?

I was just thinking the same thing today.  I think it is a great idea,
but I don't have time to do the work myself at the moment.  I'd be
happy to help review, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Multicolumn hash indexes

2017-09-26 Thread Tom Lane
Tomasz Ostrowski  writes:
> I've noticed that hash indexes can't currently (in PG10) be multicolumn. 
> Are they technically hard to implement or just nobody took such a feature?

It's not simple, particularly not if you wish that the index would support
queries specifying conditions for just a subset of the indexed columns
(an assumption that's buried pretty deeply in the planner, for one thing).
Then you couldn't compute the hash.

regards, tom lane




Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Jeff Janes
On Tue, Sep 26, 2017 at 3:54 PM, Tom Lane  wrote:

> Jeff Janes  writes:
> > On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane  wrote:
> >> Really?  The server should have detached itself from your terminal
> >> group long before that.  What platform is this?
>
> > CentOS release 6.9 (Final)
>
> Hm, same as here.  Are you perhaps not using pg_ctl's -l option?
> If not, the postmaster's stderr would remain attached to your tty,
> which might be the reason why a terminal ^C affects it.
>

I was not using -l.  Instead I set logging_collector=on in postgresql.conf,
but I suppose that that is not sufficient?

But I just retried with -l, and it still gets the fast shutdown.

Cheers,

Jeff


Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Andres Freund
On 2017-09-26 18:54:17 -0400, Tom Lane wrote:
> Jeff Janes  writes:
> > On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane  wrote:
> >> Really?  The server should have detached itself from your terminal
> >> group long before that.  What platform is this?
> 
> > CentOS release 6.9 (Final)
> 
> Hm, same as here.  Are you perhaps not using pg_ctl's -l option?
> If not, the postmaster's stderr would remain attached to your tty,
> which might be the reason why a terminal ^C affects it.

Doesn't make a difference here (Debian sid), I see postgres get
SIGTERMed either way. GDBing in and doing a setsid() in postmaster
before ctrl-c'ing pg_ctl "fixes" it.  Is there a reason we're not doing
so after the fork in pg_ctl?
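
Roughly what I have in mind for pg_ctl's startup path, as a sketch only
(not the actual pg_ctl code):

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

/* Sketch: fork, detach the child into its own session with setsid(), then
 * exec the postmaster via the shell, so a terminal ^C delivered to pg_ctl's
 * process group no longer reaches the postmaster. */
static pid_t
start_postmaster_sketch(const char *cmd)
{
	pid_t		pid = fork();

	if (pid == 0)				/* child */
	{
		if (setsid() < 0)		/* become leader of a new session */
		{
			perror("setsid");
			_exit(1);
		}
		(void) execl("/bin/sh", "/bin/sh", "-c", cmd, (char *) NULL);
		_exit(1);				/* only reached if exec failed */
	}

	return pid;					/* parent: child pid, or -1 if fork failed */
}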

Greetings,

Andres Freund




Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Tom Lane
Jeff Janes  writes:
> On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane  wrote:
>> Really?  The server should have detached itself from your terminal
>> group long before that.  What platform is this?

> CentOS release 6.9 (Final)

Hm, same as here.  Are you perhaps not using pg_ctl's -l option?
If not, the postmaster's stderr would remain attached to your tty,
which might be the reason why a terminal ^C affects it.

regards, tom lane




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Tom Lane
Andrew Dunstan  writes:
> I'm not happy about the idea of marking an input function as not
> parallel safe, certainly not without a good deal of thought and
> discussion that we don't have time for this cycle.

Yeah, that aspect of it was bothering me too: it's easy to say
"mark the function unsafe", but that only helps to the extent that
the function is used in queries where the planner has control of
whether to parallelize or not.  There's an awful lot of hard-wired
calls to I/O functions in our code, and I would not want to promise
that none of those are reachable in a parallel worker.

As for Stephen's concern, I had already looked at reverting 15bc038f9
earlier, and concluded that none of that code had changed significantly
since then.  There's some conflicts due to pgindent activity but I think
pulling it out will be a straightforward thing to do.

regards, tom lane




Re: [HACKERS] md5 still listed as an option in pg_hba.conf.sample

2017-09-26 Thread Michael Paquier
On Wed, Sep 27, 2017 at 2:41 AM, Mark Dilger  wrote:
> I was under the impression that md5 was removed for this release, per other
> threads that I must not have followed closely enough.

md5 is still present, its configuration in pg_hba.conf and
password_encryption are still here. "plain" is what has been removed.
-- 
Michael




Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Andres Freund
On 2017-09-26 15:15:39 -0700, Jeff Janes wrote:
> On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane  wrote:
> 
> > Jeff Janes  writes:
> > > To add insult to injury, when v10 pg_ctl does restart a pre-10 server and
> > > it sits there for a long time waiting for it to start up even though it
> > has
> > > already started up, if I hit ctrl-C because I assume something is
> > horribly
> > > wrong, it then goes ahead and kills the successfully started-and-running
> > > server.
> >
> > Really?  The server should have detached itself from your terminal
> > group long before that.  What platform is this?
> >
> 
> 
> CentOS release 6.9 (Final)
> 
> The server log file (9.6) says:
> 
> 
>64926  2017-09-26 13:56:38.284 PDT LOG:  database system was shut down
> at 2017-09-26 13:56:37 PDT
>64926  2017-09-26 13:56:38.299 PDT LOG:  MultiXact member wraparound
> protections are now enabled
>64930  2017-09-26 13:56:38.311 PDT LOG:  autovacuum launcher started
>64924  2017-09-26 13:56:38.313 PDT LOG:  database system is ready to
> accept connections
> ...  hit ctrl-C
>64924  2017-09-26 13:56:47.237 PDT LOG:  received fast shutdown request
>64924  2017-09-26 13:56:47.237 PDT LOG:  aborting any active transactions
>64930  2017-09-26 13:56:47.244 PDT LOG:  autovacuum launcher shutting
> down
>64927  2017-09-26 13:56:47.261 PDT LOG:  shutting down
> ...

It's reproducible here.  Postmaster never calls setsid() for itself, nor
does pg_ctl for it, therefore ctrl-c's SIGTERM to pg_ctl also goes to its
children, namely the postmaster.

Greetings,

Andres Freund




Re: [HACKERS] Logical Replication - test_decoding - unchanged-toast-datum

2017-09-26 Thread Euler Taveira
2017-09-26 1:11 GMT-03:00 Abhinav Singh :
> 5. On the target, when I do a select * and see that the column with
> character varying() datatype has changed to 'unchanged-toast-datum'.
>
The column "is_toast" didn't change its value to
"unchanged-toast-datum".  That is just a test_decoding convention
meaning that the value is stored in a TOAST table and was not changed.
test_decoding doesn't show TOAST values, to avoid performance problems
and to be brief.  Try a SELECT on the table and you will see that the
value is already there.

You didn't ask an explicit question, but I believe that was your doubt,
wasn't it?


-- 
   Euler Taveira   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Andrew Dunstan


On 09/26/2017 06:04 PM, Andrew Dunstan wrote:
>
> On 09/26/2017 05:45 PM, Stephen Frost wrote:
>> I've not been following along very closely- are we sure that ripping
>> this out won't be worse than dealing with it in-place?  Will pulling it
>> out also require a post-RC1 catversion bump?
>>
>>
>
> It shouldn't do AFAIK - the function signatures weren't changed.
>


At this stage on reflection I agree it should be pulled :-(

I'm not happy about the idea of marking an input function as not
parallel safe, certainly not without a good deal of thought and
discussion that we don't have time for this cycle.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Jeff Janes
On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane  wrote:

> Jeff Janes  writes:
> > To add insult to injury, when v10 pg_ctl does restart a pre-10 server and
> > it sits there for a long time waiting for it to start up even though it
> has
> > already started up, if I hit ctrl-C because I assume something is
> horribly
> > wrong, it then goes ahead and kills the successfully started-and-running
> > server.
>
> Really?  The server should have detached itself from your terminal
> group long before that.  What platform is this?
>


CentOS release 6.9 (Final)

The server log file (9.6) says:


   64926  2017-09-26 13:56:38.284 PDT LOG:  database system was shut down
at 2017-09-26 13:56:37 PDT
   64926  2017-09-26 13:56:38.299 PDT LOG:  MultiXact member wraparound
protections are now enabled
   64930  2017-09-26 13:56:38.311 PDT LOG:  autovacuum launcher started
   64924  2017-09-26 13:56:38.313 PDT LOG:  database system is ready to
accept connections
...  hit ctrl-C
   64924  2017-09-26 13:56:47.237 PDT LOG:  received fast shutdown request
   64924  2017-09-26 13:56:47.237 PDT LOG:  aborting any active transactions
   64930  2017-09-26 13:56:47.244 PDT LOG:  autovacuum launcher shutting
down
   64927  2017-09-26 13:56:47.261 PDT LOG:  shutting down
...


I can try a different platform (ubuntu 16.04, probably) tonight and see if
it does the same thing.

Cheers,

Jeff


Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Andrew Dunstan


On 09/26/2017 05:45 PM, Stephen Frost wrote:
>
> I've not been following along very closely- are we sure that ripping
> this out won't be worse than dealing with it in-place?  Will pulling it
> out also require a post-RC1 catversion bump?
>
>


It shouldn't do AFAIK - the function signatures weren't changed.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Stephen Frost
Tom, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Bruce Momjian  writes:
> > On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:
> >> Any other votes out there?
> 
> > Well, I was concerned yesterday that we had a broken build farm so close
> > to release. (I got consistent regression failures.)  I think PG 11 would
> > be better for this feature change, so I support reverting this.
> 
> I'll take the blame for (most of) yesterday's failures in the v10
> branch, but they were unrelated to this patch --- they were because
> of that SIGBUS patch I messed up.  So that doesn't seem like a very
> applicable argument.  Still, it's true that this seems like the most
> consequential patch that's gone into v10 post-RC1, certainly so if
> you discount stuff that was back-patched further than v10.

I've not been following along very closely- are we sure that ripping
this out won't be worse than dealing with it in-place?  Will pulling it
out also require a post-RC1 catversion bump?

If we can pull it out without bumping catversion and with confidence
that it won't cause more problems then, as much as I hate it, I'm
inclined to say we pull it out and come back to it in v11.  I really
don't like the idea of a post-rc1 catversion bump and it doesn't seem
like there's a good solution here that doesn't involve more changes and
most likely a catversion bump.  If it was reasonably fixable with only
small/local changes and without a catversion bump then I'd be more
inclined to keep it, but I gather from the discussion that's not the
case.

Thanks!

Stephen




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Bruce Momjian
On Tue, Sep 26, 2017 at 05:32:15PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:
> >> Any other votes out there?
> 
> > Well, I was concerned yesterday that we had a broken build farm so close
> > to release. (I got consistent regression failures.)  I think PG 11 would
> > be better for this feature change, so I support reverting this.
> 
> I'll take the blame for (most of) yesterday's failures in the v10
> branch, but they were unrelated to this patch --- they were because
> of that SIGBUS patch I messed up.  So that doesn't seem like a very
> applicable argument.  Still, it's true that this seems like the most
> consequential patch that's gone into v10 post-RC1, certainly so if
> you discount stuff that was back-patched further than v10.

Oh, I couldn't untangle that the regression failures were unrelated to
enums, so please ignore my opinion.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




[HACKERS] Multicolumn hash indexes

2017-09-26 Thread Tomasz Ostrowski

Hi.

I've noticed that hash indexes can't currently (in PG10) be multicolumn.
Are they technically hard to implement, or has just nobody taken on such a
feature?


I think multicolumn hash indexes should help pretty significantly with 
queries like:

- where username=? and user_post_id=?
- where client_id=? and period=? and invoice_number=?
etc.

I imagine that calculating a multicolumn hash should be pretty
straightforward to implement - after hashing the bytes of the first
column, just keep going and update the hash state with the bytes of the
second and subsequent columns. And it should allow for faster (O(1),
less IO) and much smaller (better cached, less IO again) multicolumn
indexes. Also, in PG10 hash indexes are WAL-logged and therefore much
easier to work with. What do you think?
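
Very roughly, something like the following is what I have in mind
(illustration only -- combine_hashes() is a made-up mixer, and the
column_hashes values stand in for whatever each column's hash function
returns):

#include <stdint.h>

/* Fold the next column's hash into the running value (made-up mixer). */
static uint32_t
combine_hashes(uint32_t acc, uint32_t h)
{
	acc ^= h + 0x9e3779b9 + (acc << 6) + (acc >> 2);
	return acc;
}

/* Build one hash key from the per-column hash values of an index tuple. */
static uint32_t
multicolumn_hashkey(const uint32_t *column_hashes, int ncolumns)
{
	uint32_t	hashkey = 0;

	for (int i = 0; i < ncolumns; i++)
		hashkey = combine_hashes(hashkey, column_hashes[i]);

	return hashkey;
}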


--
Tomasz "Tometzky" Ostrowski




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Tom Lane
Bruce Momjian  writes:
> On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:
>> Any other votes out there?

> Well, I was concerned yesterday that we had a broken build farm so close
> to release. (I got consistent regression failures.)  I think PG 11 would
> be better for this feature change, so I support reverting this.

I'll take the blame for (most of) yesterday's failures in the v10
branch, but they were unrelated to this patch --- they were because
of that SIGBUS patch I messed up.  So that doesn't seem like a very
applicable argument.  Still, it's true that this seems like the most
consequential patch that's gone into v10 post-RC1, certainly so if
you discount stuff that was back-patched further than v10.

regards, tom lane




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Bruce Momjian
On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:
> Andrew Dunstan  writes:
> > On 09/26/2017 02:37 PM, Tom Lane wrote:
> >> ... and the buildfarm's not too happy.  It looks like force_parallel_mode
> >> breaks all the regression test cases around unsafe enums; which on
> >> reflection is unsurprising, because parallel workers will not have access
> >> to the parent's blacklist hash, so they will think unsafe values are safe.
> 
> > I think I would mark enum_in and friends as parallel-restricted. Yes I
> > know it would involve a cat version bump, so I'll understand if that's
> > not acceptable, but it seems to me the best of a bad bunch of choices.
> > Second choice might be turning off parallel mode if the hash exists, but
> > I'm unclear how that would work.
> 
> Meh.  I'm starting to slide back to my original opinion that we should
> revert back to 9.6 behavior.  Even if a post-RC1 catversion bump is OK,
> making these sorts of changes a week before GA is not comfort inducing.
> I'm losing faith that we've thought through the issue thoroughly, and
> there's no longer time to catch any remaining oversights through testing.
> 
> Any other votes out there?

Well, I was concerned yesterday that we had a broken build farm so close
to release. (I got consistent regression failures.)  I think PG 11 would
be better for this feature change, so I support reverting this.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread David Rowley
On 25 September 2017 at 23:04, Amit Langote
 wrote:
> By the way, I'm now rebasing these patches on top of [1] and will try to
> merge your refactoring patch in some appropriate way.  Will post more
> tomorrow.
>
> [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9140cf826

Yeah, I see 0001 conflicts with that. I'm going to set this to waiting
on author while you're busy rebasing this.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS][BUG] Cache invalidation for queries that contains const of temporary composite type

2017-09-26 Thread Maksim Milyutin

25.09.17 20:50, Maksim Milyutin wrote:

I have found a problem when trying to sequentially call a function that
casts a constant to the composite type of a temporary table that is
dropped at each transaction termination (i.e. at each function call
completion).



For example, we have the following function 'test':

CREATE OR REPLACE FUNCTION test()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
begin
    create temp table tbl (id int) on commit drop;
    perform json_populate_record(null::tbl, '{ "id": 11 }'::json) as tt;
end;
$function$


On the second and subsequent calls we'll observe the following error:

ERROR:  cache lookup failed for type 16392


I investigated the problem and realized that the result type of the
function *json_populate_record* (the /funcresulttype/ field of the
/FuncExpr/ struct), as well as the type of the first null argument (the
/consttype/ field of the /Const/ struct), refer to the invalid composite
type related to the temporary table 'tbl'.  Namely, they take the value
of the OID obtained from the first initialization of 'tbl'.  The plan of
the query *'perform json_populate_record(null::tbl, '{ "id": 11 }'::json)
as tt'* is cached and is not invalidated at each function call.  This is
because the statement of this query doesn't have any dependency on the
'tbl' relation (the /relationOids/ list of the /CachedPlanSource/ struct).



The attached patch resolves this problem by adding a dependency on the
relation when a Const expression of that relation's composite type is
detected.


The updated patchset contains a more transparent definition of the
composite type for the constant node, and a regression test for the buggy
case described above.


--
Regards,
Maksim Milyutin

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b0c9e94459..482b0d227c 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -78,6 +78,12 @@ typedef struct
(((con)->consttype == REGCLASSOID || (con)->consttype == OIDOID) && \
 !(con)->constisnull)
 
+/*
+ * Check if a Const node has composite type
+ */
+#define ISTABLETYPECONST(con) \
+   (get_typtype((con)->consttype) == TYPTYPE_COMPOSITE)
+
 #define fix_scan_list(root, lst, rtoffset) \
((List *) fix_scan_expr(root, (Node *) (lst), rtoffset))
 
@@ -1410,6 +1416,12 @@ fix_expr_common(PlannerInfo *root, Node *node)
root->glob->relationOids =
lappend_oid(root->glob->relationOids,
			DatumGetObjectId(con->constvalue));
+
+   /* Check whether const has composite type */
+   if (ISTABLETYPECONST(con))
+   root->glob->relationOids =
+   lappend_oid(root->glob->relationOids,
+               get_typ_typrelid(con->consttype));
}
else if (IsA(node, GroupingFunc))
{
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index c2eeff1614..5c0be47778 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -220,6 +220,28 @@ execute p2;
1
 (1 row)
 
+-- Check that invalidation deals with casting const value to temporary
+-- composite type reinitialized on each new transaction
+create function cache_query_with_composite_const() returns void as $$
+begin
+create temp table tbl(id int) on commit drop;
+
+-- Plan of the next query has to be rebuilt on each new call of function
+-- due to casting first argument 'null' to recreated temporary table 'tbl'
+perform json_populate_record(null::tbl, '{"id": 0}'::json);
+end$$ language plpgsql;
+select cache_query_with_composite_const();
+ cache_query_with_composite_const 
+--
+ 
+(1 row)
+
+select cache_query_with_composite_const();
+ cache_query_with_composite_const 
+--
+ 
+(1 row)
+
 -- Check DDL via SPI, immediately followed by SPI plan re-use
 -- (bug in original coding)
 create function cachebug() returns void as $$
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index cb2a551487..41be0d6bf4 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -140,6 +140,21 @@ create temp sequence seq;
 
 execute p2;
 
+-- Check that invalidation deals with casting const value to temporary
+-- composite type reinitialized on each new transaction
+
+create function cache_query_with_composite_const() returns void as $$
+begin
+create temp table tbl(id int) on commit drop;
+
+-- Plan of the next query has to be rebuilt on each new call of function
+-- due to casting first argument 'null' to recreated temporary table 'tbl'
+perform json_populate_record(null::tbl, '{"id": 0}'::json);
+end$$ language plpgsql;
+
+select cache_query_with_composite_const();
+select cache_query_with_composite_const();
+
 -- Check 

Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Tom Lane
Jeff Janes  writes:
> To add insult to injury, when v10 pg_ctl does restart a pre-10 server and
> it sits there for a long time waiting for it to start up even though it has
> already started up, if I hit ctrl-C because I assume something is horribly
> wrong, it then goes ahead and kills the successfully started-and-running
> server.

Really?  The server should have detached itself from your terminal
group long before that.  What platform is this?

regards, tom lane




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Tom Lane
Andrew Dunstan  writes:
> On 09/26/2017 02:37 PM, Tom Lane wrote:
>> ... and the buildfarm's not too happy.  It looks like force_parallel_mode
>> breaks all the regression test cases around unsafe enums; which on
>> reflection is unsurprising, because parallel workers will not have access
>> to the parent's blacklist hash, so they will think unsafe values are safe.

> I think I would mark enum_in and friends as parallel-restricted. Yes I
> know it would involve a cat version bump, so I'll understand if that's
> not acceptable, but it seems to me the best of a bad bunch of choices.
> Second choice might be turning off parallel mode if the hash exists, but
> I'm unclear how that would work.

Meh.  I'm starting to slide back to my original opinion that we should
revert back to 9.6 behavior.  Even if a post-RC1 catversion bump is OK,
making these sorts of changes a week before GA is not comfort inducing.
I'm losing faith that we've thought through the issue thoroughly, and
there's no longer time to catch any remaining oversights through testing.

Any other votes out there?

regards, tom lane




[HACKERS] Re: Patch to address concerns about ICU collcollate stability in v10 (Was: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?)

2017-09-26 Thread Peter Geoghegan
On Sun, Sep 24, 2017 at 9:24 PM, Peter Geoghegan  wrote:
> * Documents the aforementioned keyword collation attribute restriction
> on ICU versions before ICU 54. This was needed anyway. We only claim
> for Postgres collations what the ICU docs claim for ICU collators,
> even though there is reason to believe that some ICU versions before
> ICU 54 actually can do better.

Following some digging, it looks like the restriction actually only
needs to apply to versions prior to ICU 4.6. The internal function
_canonicalize() contains the functionality needed for Postgres to avoid
converting to BCP 47 itself on the fly (this was previously believed
to be the case only in ICU 54):

https://ssl.icu-project.org/trac/browser/tags/release-50-1-2/icu4c/source/common/uloc.cpp#L1614

(This is uloc.c on earlier ICU versions.)

(The call stack here is that from ucol_open(), ucol_open_internal() is
called, which calls uloc_getBaseName(), which calls _canonicalize().)

ICU gained that _hasBCP47Extension() branch in ICU 46. The
_canonicalize function has hardly changed since, despite the rewrite
from C to C++ (I checked up until ICU 55, the reasonably current
version I've done most testing on). This is totally consistent with
what both Peter and I have observed, even though the ucol_open() docs
suggest that that stuff only became available within ICU 54. I think
that ucol_open() was updated to mention BCP 47 at the same time as it
mentioned the lifting of the restrictions on extended keywords (when
you no longer had to use ucol_setAttribute()), confusing two basically
unrelated issues.

I suggest that as a compromise, my proposal can be changed in either
of the following ways:

* Do the same thing as I propose, except only introduce the mapping
from/to language tags when ICU version < 46 is in use. This would mean
that we'd be doing that far less in practice.

Or:

* Never do *any* mapping/language tag conversion when opening a
collator, but still consistently canonicalize as BCP 47 on all ICU
versions. This can be done by only supporting versions that have the
required _hasBCP47Extension() branch (ICU >= 46). We'd desupport the
old ICU 42 and ICU 44 versions.

Support for ICU 42 and ICU 44 came fairly late, in commit eccead9 on
August 5th. Both Tom and I expressed reservations about that at the
time. It doesn't seem too late to desupport those 2 versions, leaving
us with a fix that is even less invasive. We actually wouldn't
technically be changing anything about canonicalization at all, this
way. We'd be changing our understanding of when a restriction applies
(not < 54, < 46), at which point the "collcollate =
U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name" thing that I take
particular issue with becomes "collcollate = U_ICU_VERSION_MAJOR_NUM
>= 46 ? langtag : name".

This is equivalent to "collcollate = langtag" (which is what it would
actually look like), since we're desupporting earlier versions.

-- 
Peter Geoghegan




Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Jeff Janes
On Tue, Sep 26, 2017 at 12:29 PM, Andres Freund  wrote:

> Hi,
>
> On 2017-09-26 11:59:42 -0700, Jeff Janes wrote:
> > Should the release notes have a compatibility entry about pg_ctl restart,
> > being used against a running pre-10 server, no longer being able to
> detect
> > when startup is complete?
> >
> > I don't know if cross-version use of pg_ctl restart was ever officially
> > supported, but the current behavior is rather confusing (waiting for a
> long
> > time, and then reporting failure, even though it started successfully).
>
> I'm actually tempted to just make pg_ctl verify the right version of
> postgres is being used. Maybe I'm missing something, but what's the
> use-case for allowing it, and every couple releases have some breakage?
>

The use case for me is that I have multiple versions of postgres running on
the same machine, and don't want to check which pg_ctl is in my path each time
I change one of their configurations and need to restart it.  Or at least,
I never had to check before, as pg_ctl goes out of its way to restart it
with the same bin/postgres that the server is currently running with,
rather than restarting with the bin/postgres currently in the path, or
whichever bin/postgres is in the same directory as the bin/pg_ctl.  (Which
is itself not an unmitigated win, but it is what I'm used to).

Admittedly, this is a much bigger problem with hacking/testing than it is
with production use, but still I do have production machines with mixed
versions running, and can see myself screwing this up a few times once one
of them gets upgraded to v10, even with an incompatibility warning in the
release notes.  But at least I had fair warning.

To add insult to injury, when v10 pg_ctl does restart a pre-10 server and
it sits there for a long time waiting for it to start up even though it has
already started up, if I hit ctrl-C because I assume something is horribly
wrong, it then goes ahead and kills the successfully started-and-running
server.

If we do want to do a version check and fail out, I think it should check
before shutting it down, rather than shutting it down and then refusing to
start.

Cheers,

Jeff


Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Andres Freund
On 2017-09-26 15:40:39 -0400, Tom Lane wrote:
> I'm not really feeling the need to insert a version check though.

It's only a mild preference here.


> Also, what would you check exactly?  Inquiring into what
> "postgres --version" returns is not very conclusive about what is
> actually running in the data directory pg_ctl has been pointed at.

Reading PG_VERSION ought to do the trick.
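
Something along these lines, say (a sketch only, not actual pg_ctl code;
the caller would pass the major version string pg_ctl was built for):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Sketch: read the first line of <pgdata>/PG_VERSION and compare it with
 * the expected major version string. */
static bool
datadir_version_matches(const char *pgdata, const char *expected_major)
{
	char		path[1024];
	char		line[64];
	FILE	   *fp;

	snprintf(path, sizeof(path), "%s/PG_VERSION", pgdata);
	fp = fopen(path, "r");
	if (fp == NULL)
		return false;			/* missing or unreadable data directory */
	if (fgets(line, sizeof(line), fp) == NULL)
		line[0] = '\0';
	fclose(fp);

	line[strcspn(line, "\r\n")] = '\0';		/* strip the newline */
	return strcmp(line, expected_major) == 0;
}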

Greetings,

Andres Freund




Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Tom Lane
Andres Freund  writes:
> On 2017-09-26 11:59:42 -0700, Jeff Janes wrote:
>> I don't know if cross-version use of pg_ctl restart was ever officially
>> supported, but the current behavior is rather confusing (waiting for a long
>> time, and then reporting failure, even though it started successfully).

> I'm actually tempted to just make pg_ctl verify the right version of
> postgres is being used. Maybe I'm missing something, but what's the
> use-case for allowing it, and every couple releases have some breakage?

At a high level the use case is

yum upgrade postgresql
service postgresql restart

In practice, that wouldn't work across a major-version upgrade anyway,
because of data directory incompatibility.  It should work for minor
versions though, so the version check couldn't be strict.

I'm not really feeling the need to insert a version check though.
It seems like it's more likely to reject cases that would have worked
than to do anything helpful.  The API that pg_ctl relies on is one
that we don't change very often, viz the contents of postmaster.pid.
The fact that we did change it this time is the source of Jeff's
surprise.

Also, what would you check exactly?  Inquiring into what
"postgres --version" returns is not very conclusive about what is
actually running in the data directory pg_ctl has been pointed at.

regards, tom lane




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Andrew Dunstan


On 09/26/2017 02:37 PM, Tom Lane wrote:
> I wrote:
>> Pushed; sorry for the delay.
> ... and the buildfarm's not too happy.  It looks like force_parallel_mode
> breaks all the regression test cases around unsafe enums; which on
> reflection is unsurprising, because parallel workers will not have access
> to the parent's blacklist hash, so they will think unsafe values are safe.
>
> Now, as long as parallel workers are read-only, perhaps this matters
> little; they would not be allowed to write unsafe values into tables
> anyway.  I'm concerned though about whether it might be possible for a
> parallel worker to return an unsafe value to the parent (in OID form)
> and then the parent writes it into a table.  If we can convince ourselves
> that's not possible, it might be okay to just turn off force_parallel_mode
> for these test cases.
>
> A safer answer would be to mark enum_in() and other callers of
> check_safe_enum_use() as parallel-restricted.  That'd require a
> post-RC1 catversion bump, which seems pretty unpleasant, but
> none of the other answers are nice either.
>
> Transmitting the blacklist hash to workers would be a good long-term
> answer, but I don't want to try to shoehorn it in for v10.
>
> Another idea is that maybe the existence of a blacklist hash should
> be enough to turn off parallel mode altogether ... but ugh.
>
> Or maybe we're back to "revert the whole feature, go back to 9.6
> behavior".
>
> Thoughts?


I think I would mark enum_in and friends as parallel-restricted. Yes I
know it would involve a cat version bump, so I'll understand if that's
not acceptable, but it seems to me the best of a bad bunch of choices.
Second choice might be turning off parallel mode if the hash exists, but
I'm unclear how that would work.

cheers

andrew

-- 

Andrew Dunstan                 https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Andres Freund
Hi,

On 2017-09-26 11:59:42 -0700, Jeff Janes wrote:
> Should the release notes have a compatibility entry about pg_ctl restart,
> being used against a running pre-10 server, no longer being able to detect
> when startup is complete?
> 
> I don't know if cross-version use of pg_ctl restart was ever officially
> supported, but the current behavior is rather confusing (waiting for a long
> time, and then reporting failure, even though it started successfully).

I'm actually tempted to just make pg_ctl verify the right version of
postgres is being used. Maybe I'm missing something, but what's the
use-case for allowing it, and every couple releases have some breakage?

Greetings,

Andres Freund




Re: [HACKERS] Row Level Security Documentation

2017-09-26 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 26 September 2017 at 00:42, Stephen Frost  wrote:
> > That's a relatively minor point, however, and I do feel that this patch
> > is a definite improvement.  Were you thinking of just applying this for
> > v10, or back-patching all or part of it..?
> 
> I was planning on back-patching it to 9.5, taking out the parts
> relating the restrictive policies as appropriate. Currently the 9.5
> and 9.6 docs are identical, as are 10 and HEAD, and 9.5/9.6 only
> differs from 10/HEAD in a couple of places where they mention
> restrictive policies. IMO we should stick to that, making any
> improvements available in the back-branches. I was also thinking the
> same about the new summary table, although I haven't properly reviewed
> that yet.

Makes sense to me.

+1

Thanks!

Stephen




Re: [HACKERS] Row Level Security Documentation

2017-09-26 Thread Dean Rasheed
On 26 September 2017 at 00:42, Stephen Frost  wrote:
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>> Attached is a proposed patch...
>
> I've taken a look through this and generally agree with it.

Thanks for looking at this.

> the bits inside  ...  tags should be
> consistently capitalized (you had one case of 'All' that I noticed)

Yes, that's a typo. Will fix.

> I wonder if it'd be better to have the "simple" description *first*
> instead of later on, eg, where you have "Thus the overall result is
> that" we might want to try and reword things to decribe it as "Overall,
> it works thusly, ..., and the specifics are ...".

OK, that makes sense. I'll try it that way round.

> That's a relatively minor point, however, and I do feel that this patch
> is a definite improvement.  Were you thinking of just applying this for
> v10, or back-patching all or part of it..?

I was planning on back-patching it to 9.5, taking out the parts
relating the restrictive policies as appropriate. Currently the 9.5
and 9.6 docs are identical, as are 10 and HEAD, and 9.5/9.6 only
differs from 10/HEAD in a couple of places where they mention
restrictive policies. IMO we should stick to that, making any
improvements available in the back-branches. I was also thinking the
same about the new summary table, although I haven't properly reviewed
that yet.

Regards,
Dean




[HACKERS] v10 pg_ctl compatibility

2017-09-26 Thread Jeff Janes
Should the release notes have a compatibility entry about pg_ctl restart,
being used against a running pre-10 server, no longer being able to detect
when startup is complete?

I don't know if cross-version use of pg_ctl restart was ever officially
supported, but the current behavior is rather confusing (waiting for a long
time, and then reporting failure, even though it started successfully).

Cheers,

Jeff


Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-09-26 Thread Bossart, Nathan
On 9/25/17, 12:42 AM, "Michael Paquier"  wrote:
> +   if (!IsAutoVacuumWorkerProcess())
> +   ereport(WARNING,
> + (errmsg("skipping \"%s\" --- relation no longer exists",
> + relation->relname)));
> I like the use of WARNING here, but we could just as well use a LOG to be
> consistent with the case where a lock acquisition is skipped.

It looks like the LOG statement is only emitted for autovacuum, so maybe
we should keep this at WARNING for consistency with the permission checks
below it.

> +* going to commit this transaction and begin a new one between 
> now
> +* and then.
> +*/
> +   relid = RangeVarGetRelid(relinfo->relation, NoLock, false);
> We will likely have to wait that the matters discussed in
> https://www.postgresql.org/message-id/25023.1506107...@sss.pgh.pa.us
> are settled.

Makes sense.

> +VACUUM FULL vactst, vactst, vactst, vactst;
> This is actually a waste of cycles.

I'll clean this up in v22.

Nathan




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Tom Lane
I wrote:
> Pushed; sorry for the delay.

... and the buildfarm's not too happy.  It looks like force_parallel_mode
breaks all the regression test cases around unsafe enums; which on
reflection is unsurprising, because parallel workers will not have access
to the parent's blacklist hash, so they will think unsafe values are safe.

Now, as long as parallel workers are read-only, perhaps this matters
little; they would not be allowed to write unsafe values into tables
anyway.  I'm concerned though about whether it might be possible for a
parallel worker to return an unsafe value to the parent (in OID form)
and then the parent writes it into a table.  If we can convince ourselves
that's not possible, it might be okay to just turn off force_parallel_mode
for these test cases.

A safer answer would be to mark enum_in() and other callers of
check_safe_enum_use() as parallel-restricted.  That'd require a
post-RC1 catversion bump, which seems pretty unpleasant, but
none of the other answers are nice either.

Transmitting the blacklist hash to workers would be a good long-term
answer, but I don't want to try to shoehorn it in for v10.

Another idea is that maybe the existence of a blacklist hash should
be enough to turn off parallel mode altogether ... but ugh.

Or maybe we're back to "revert the whole feature, go back to 9.6
behavior".

Thoughts?

regards, tom lane




[HACKERS] User-perspective knowledge about wait events

2017-09-26 Thread Schneider
On Mon, Aug 14, 2017 at 11:58 PM, Michael Paquier
 wrote:
> As $subject has been touched on two threads recently
> (https://www.postgresql.org/message-id/CAB7nPqTbHLcHFn6m11tfpwAdgz8BmnBza2jjN9AK=sdx_kb...@mail.gmail.com
> and 
> https://www.postgresql.org/message-id/20170808213537.wkmmagf2a6i3hjyi@alvherre.pgsql),
> the list of wait event and their category are becoming harder to
> maintain because of their number and the fact that:
> 1) It is easy to add a wait event and...
> 2) It is easy to not update its documentation.
> 3) the documentation tables get easily broken.

The documentation provides nice "developer-perspective" descriptions
of the wait events. It's a great start. If you have some familiarity
with the code, or if you're willing to dig into it, then the
descriptions get you going.

However I think that it would be immensely helpful to start gathering
knowledge somewhere on wait events from the user perspective.  Listing
a few of the wait events which users will probably see most often
along with practical suggestions of what users could further examine
or even what they could change to make improvements on their database
systems.

When Postgres 10 goes GA and users start adopting it, we will start
getting a real-world sense of which wait events are most commonly
observed.  But I don't think it's too early to stub out a wiki page
for gathering user-level community knowledge on this.  As collective
knowledge about wait event meaning and action matures, the best advice
might even be considered for eventual incorporation back into the
official documentation.

I do realize there are some big pitfalls in the area of documenting
performance optimization. Workloads vary significantly and it's all too
easy for sound bites to devolve into "golden rules" or "best
practices" and then become mantras and silver bullets.

Nonetheless I believe it is possible to make some basic, scientific
and well-qualified observations. Not only is this immensely useful but
in fact it's critical to users' ability to have a good experience when
they dramatically ramp up their usage of PostgreSQL.

One particular section of Oracle's perf tuning doc actually isn't a
bad example in giving a few qualified and actionable user-perspective
wait event descriptions:
http://docs.oracle.com/database/122/TGDBA/instance-tuning-using-performance-views.htm#TGDBA94472

I'm just making this up off the top of my head, so I'm sure it will be
bad... but in the interest of starting a conversation, here's an example
"stub/starter":

^-^-^-^-^-^-^-^-^-^-^-^
https://wiki.postgresql.org/wiki/Wait_Events

Lock:tuple - this means that the SQL is waiting because some
application has previously locked the tuple (row) with a different
database connection and has not yet committed or rolled back that
transaction.  If you frequently see this wait event then you probably
need to look for application-level optimizations to decrease
insert/update/delete/lock contention on the same rows.

IO:DataFileRead - this means that the SQL is waiting for data to be
physically read into PostgreSQL's internal memory cache.  If you
frequently see this wait event then you might want to examine the SQL
statements which cause the most I/O and look for optimizations.  Can
you accomplish the same result while examining less data?  If not, is
there an index that might make the processing more efficient?  Can the
SQL be restructured to generate a more optimal plan?  If no
application-level optimizations are possible, then can the I/O
subsystem be optimized somehow?

IO:XactSync and IO:WALWrite - [what's the exact difference between
these?]  This probably means that the SQL is waiting for WAL data to
either be asynchronously written to disk, or for the final fsync to
complete before returning from a transaction's COMMIT call.  If you
frequently see this wait event then you might want to review the
content and frequency of updates and inserts generated by your
application.  If your application cannot be refactored to remove
unneeded DML then you might examine the WAL I/O path for opportunities
to optimize or increase capacity.


Notes about scaling out/up hardware:

IO-related waits can often be somewhat remediated by scaling out or
scaling up your hardware. However it's generally worthwhile to first
review your application for optimization opportunities.  A small
application-level change can make orders-of-magnitude greater
improvements over hardware scaling!

Note that locking and concurrency related waits can often become even
worse with scaling out/up.  (Waiting for something on another machine
can be slower than waiting for something on another processor.)  Also,
scaling the wrong hardware component can exacerbate IO-related waits
if you don't scale in a way that directly addresses the actual
bottleneck.

^-^-^-^-^-^-^-^-^-^-^-^

What do others think about this?  Is the wiki.postgresql.org the right
place for something like this, or should it just 

Re: [HACKERS] [Proposal] Make the optimiser aware of partitions ordering

2017-09-26 Thread Julien Rouhaud
On Tue, Sep 26, 2017 at 2:56 PM, Robert Haas  wrote:
> On Sat, Sep 23, 2017 at 6:29 AM, Julien Rouhaud  wrote:
>> That's true, but numCols, sortColdIdx etc are also used to display the
>> sort key in an explain.  If an append can return sorted data, it
>> should also display the sort information, so I think these fields are
>> still required in an Append node.
>
> I don't think so.  An index scan doesn't output that information, nor
> does a nested loop which inherits a sort order from its outer path.  I
> think the rule is that a plan node which takes steps to get the data
> into a certain order might want to output something about that, but a
> plan node which somehow gets that ordering without taking any explicit
> action doesn't print anything.

Oh, ok that indeed makes sense.  As I said I'll remove all the useless
noise and send an updated patch.  Thanks again.




Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-26 Thread Tom Lane
Thomas Munro  writes:
> See attached, which also removes the ENOSYS stuff which I believe to
> be now useless.  Does this make sense?  Survives make check-world and
> my simple test procedure on a 3.10.0-327.36.1.el7.x86_64 system.

Thanks.  Works on my RHEL6 box too, so pushed.

This certainly explains the "could not resize ...: Success" oddity
we see in the buildfarm.  It's not so clear why those critters are
failing in the first place.  I hope that what is happening is that
they are getting EINTR results and failing to retry.  (Although
my RHEL6 man page doesn't mention EINTR as a possible failure,
I see that's been corrected in later editions.)  If not, we should
at least get a better fix on the nature of the failure with this
patch.  Awaiting buildfarm results ...

regards, tom lane




Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Henry
It seems test_decoding.c could be easily changed to support JSON by using
the built-in PostgreSQL functions (json.c's composite_to_json) to convert a
Datum into JSON. Its use of OidOutputFunctionCall could be modified to emit
arrays and composite types as JSON. This might be enough to enable
downstream frameworks to parse the output (without having to code to the
terse, positional composite structure format).

It could be a minimal change to have in core, using the built-in JSON
support with no additional libraries. I have not made changes to this code,
but it seems like it should work.
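For instance, a rough sketch (not a patch -- the helper name is made up,
and the exact header that provides the row_to_json() prototype varies
across branches) of routing a composite Datum through the built-in JSON
code:

#include "fmgr.h"
#include "utils/builtins.h"
#include "utils/json.h"

/*
 * Hypothetical helper: return a palloc'd JSON string for a composite Datum,
 * usable in tuple_to_stringinfo() instead of OidOutputFunctionCall() when
 * the column's type is composite.
 */
static char *
composite_datum_to_json_cstring(Datum composite)
{
    Datum       json = DirectFunctionCall1(row_to_json, composite);

    return text_to_cstring(DatumGetTextPP(json));
}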

Thank you,
Henry

On Tue, Sep 26, 2017 at 9:37 AM Alvaro Hernandez  wrote:

>
>
> On 26/09/17 17:50, Craig Ringer wrote:
>
> On 26 September 2017 at 22:14, Magnus Hagander 
> wrote:
>
>>
>>
>> On Tue, Sep 26, 2017 at 2:16 PM, Alvaro Hernandez  wrote:
>>
>>>
>>>
>>>
>>> But what about earlier versions? Any chance it could be backported
>>> down to 9.4? If that would be acceptable, I could probably help/do that...
>>
>>
>> The likelihood is zero if you mean backported into core of earlier
>> versions.
>>
>
> Right. We don't add features to back branches.
>
>
> Yeah, I know the policy. But asking is free ;) and in my opinion this
> would be a very good reason to have an exception, if there would be a clear
> desire to have a single, unified, production quality output plugin across
> all PostgreSQL versions. At least I tried ;)
>
>
>
>
>
>>
>> If you mean backported as a standalone extension that could be installed
>> on a previous version, probably. I'm not sure if it relies on any internals
>> not present before that would make it harder, but it would probably at
>> least be possible.
>>
>>
> All the pub/sub stuff is new and hooked into syscache etc. So you'd be
> doing a bunch of emulation/shims using user catalogs. Not impossible, but
> probably irritating and verbose. And you'd have none of the DDL required to
> manage it, so you'd need SQL-function equivalents.
>
> I suspect you'd be better off tweaking pglogical to speak the same
> protocol as pg10, since the pgoutput protocol is an evolution of
> pglogical's protocol. Then using pglogical on older versions.
>
>
>
> Given all this, if I would be doing an app based on logical decoding,
> I think I will stick to test_decoding for <10
>
>
> Thanks,
>
>
> Álvaro
>
>
> --
>
> Alvaro Hernandez
>
>
> ---
> OnGres
>
>


Re: [HACKERS] md5 still listed as an option in pg_hba.conf.sample

2017-09-26 Thread Mark Dilger

> On Sep 26, 2017, at 10:36 AM, Bruce Momjian  wrote:
> 
> On Tue, Sep 26, 2017 at 10:23:55AM -0700, Mark Dilger wrote:
>> The comment that I think needs updating is:
>> 
>> # METHOD can be "trust", "reject", "md5", "password", "scram-sha-256",
>> # "gss", "sspi", "ident", "peer", "pam", "ldap", "radius" or "cert".
>> 
>> The "md5" option no longer works, as discussed in other threads.
> 
> Uh, I think that "md5" still works just fine.

Yes, sorry for the noise.

I was under the impression that md5 was removed for this release, per other
threads that I must not have followed closely enough.




Re: [HACKERS] md5 still listed as an option in pg_hba.conf.sample

2017-09-26 Thread Bruce Momjian
On Tue, Sep 26, 2017 at 10:23:55AM -0700, Mark Dilger wrote:
> The comment that I think needs updating is:
> 
> # METHOD can be "trust", "reject", "md5", "password", "scram-sha-256",
> # "gss", "sspi", "ident", "peer", "pam", "ldap", "radius" or "cert".
> 
> The "md5" option no longer works, as discussed in other threads.

Uh, I think that "md5" still works just fine.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] md5 still listed as an option in pg_hba.conf.sample

2017-09-26 Thread Robert Haas
On Tue, Sep 26, 2017 at 1:23 PM, Mark Dilger  wrote:
> The comment that I think needs updating is:
>
> # METHOD can be "trust", "reject", "md5", "password", "scram-sha-256",
> # "gss", "sspi", "ident", "peer", "pam", "ldap", "radius" or "cert".
>
> The "md5" option no longer works, as discussed in other threads.

It works for me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] md5 still listed as an option in pg_hba.conf.sample

2017-09-26 Thread Mark Dilger
The comment that I think needs updating is:

# METHOD can be "trust", "reject", "md5", "password", "scram-sha-256",
# "gss", "sspi", "ident", "peer", "pam", "ldap", "radius" or "cert".

The "md5" option no longer works, as discussed in other threads.

mark






Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Tom Lane
I wrote:
> Andrew Dunstan  writes:
>> OK, that seems to be the consensus. So let's apply the blacklist patch
>> and then separately remove the 'created in the same transaction' test.
>> We'll need to adjust the regression tests and docs accordingly.

> Agreed.  I'll work on that in a little bit.

Pushed; sorry for the delay.

I noticed that the blacklist mechanism effectively removed the prohibition
against using a renamed enum value later in the same transaction, so I
added a regression test for that.  Also, as committed, I used RENAME TYPE
rather than ALTER OWNER in the test cases requiring an updated pg_type
row.  That way we don't need to create a role, even a transient one, which
is a good thing in terms of not risking collisions with other sessions.

regards, tom lane




Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Robert Haas
On Tue, Sep 26, 2017 at 10:57 AM, Jesper Pedersen
 wrote:
> One could advocate (*cough*) that the hash partition patch [1] should be
> merged first in order to find other instances where other CommitFest
> entries don't account for hash partitions at the moment in their method
> signatures; Beena noted something similar in [2]. I know that you said
> otherwise [3], but this is CommitFest 1, so there is time for a revert
> later, and hash partitions are already useful in internal testing.

Well, that's a fair point.  I was assuming that committing things in
that order would cause me to win the "least popular committer" award
at least for that day, but maybe not.  It's certainly not ideal to
have to juggle that patch along and keep rebasing it over other
changes when it's basically done, and just waiting on other
improvements to land.  Anybody else wish to express an opinion?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Alvaro Hernandez



On 26/09/17 17:50, Craig Ringer wrote:
On 26 September 2017 at 22:14, Magnus Hagander  wrote:




On Tue, Sep 26, 2017 at 2:16 PM, Alvaro Hernandez  wrote:




    But what about earlier versions? Any chance it could be
backported down to 9.4? If that would be acceptable, I could
probably help/do that...


The likelihood is zero if you mean backported into core of earlier
versions.


Right. We don't add features to back branches.


    Yeah, I know the policy. But asking is free ;) and in my opinion 
this would be a very good reason to have an exception, if there would be 
a clear desire to have a single, unified, production quality output 
plugin across all PostgreSQL versions. At least I tried ;)





If you mean backported as a standalone extension that could be
installed on a previous version, probably. I'm not sure if it
relies on any internals not present before that would make it
harder, but it would probably at least be possible.


All the pub/sub stuff is new and hooked into syscache etc. So you'd be 
doing a bunch of emulation/shims using user catalogs. Not impossible, 
but probably irritating and verbose. And you'd have none of the DDL 
required to manage it, so you'd need SQL-function equivalents.


I suspect you'd be better off tweaking pglogical to speak the same 
protocol as pg10, since the pgoutput protocol is an evolution of 
pglogical's protocol. Then using pglogical on older versions.



    Given all this, if I would be doing an app based on logical 
decoding, I think I will stick to test_decoding for <10



    Thanks,

    Álvaro


--

Alvaro Hernandez


---
OnGres



Re: [HACKERS] Enhancements to passwordcheck

2017-09-26 Thread Bossart, Nathan
On 9/26/17, 2:38 AM, "Albe Laurenz"  wrote:
> Nathan Bossart wrote:
 passwordcheck.force_new_password

>>> Does it mean a password different from the old one? +1. It could be
>>> different from the last 3 passwords but we don't store a password
>>> history.
>> 
>> Yes.  As Michael pointed out, this might be better to do as a separate
>> effort since we'll almost certainly need to introduce a way to store
>> password history.
>
> That increases the number of passwords stored on the server and
> consequently the damage when that list is stolen.
> Of course the old passwords are invalid, but if someone cracks them
> they could still try them on other systems the person uses.
>
> I think we should accept such a risk only if the benefits are clear, and
> my opinion has always been that if you forbid password reuse, people
> tend to come up with password generation schemes that are no better
> than the original passwords.

Right.  However, without this, they may not change the password at
all, which would make the expiry functionality less effective.  I
suppose there's not a great way to guard against these sorts of
password generation schemes without being able to judge the proposed
password against the previous password, too.

Perhaps the max_expiry_period parameter should be left out for now
as well.

>> One interesting design challenge will be how to handle pre-hashed
>> passwords, since the number of checks we can do on those is pretty
>> limited.  I'm currently thinking of a parameter that can be used to
>> block, allow, or force pre-hashed passwords.  If we take that route,
>> perhaps we will also need to ensure that PostgreSQL fails to start when
>> invalid combinations are specified (e.g. pre-hashed passwords are forced
>> and min_password_length is nonzero).  Thoughts?
>
> As was pointed out in the original discussion
> d960cb61b694cf459dcfb4b0128514c203937...@exadv11.host.magwien.gv.at
> the weak point of "passwordcheck" is that it does not work very well
> for encrypted passwords.
> The only saving grace is that you can at least check against
> username equals password.

Thanks for linking the original thread.  There are a lot of
interesting points.  I wonder if enhanced password checking in core
or contrib might be received differently with the introduction of
SCRAM authentication, since the weaknesses of MD5 were often cited.

> Disabling pre-hashed passwords in order to allow better password
> checks is a problem rather than a solution, because it exposes you
> to password theft of the clear-text password.  I think we shouldn't
> go there.
>
> The overall opinion in the above thread was that if you *really* care
> about security, you don't use database passwords, but external
> authentication with a centralized identity management system.
>
> So I think it is fine to extend "passwordcheck", but we shouldn't
> take it serious enough to reduce security elsewhere in order to
> improve the module.

I understand the points made here, but not allowing configurability
here really hinders the module's ability to enforce much of
anything.  However, I did say I wanted to avoid controversial
parameters, so I'll plan to count this one out as well.

This leaves us with the following proposed parameters for now:

passwordcheck.min_password_length
passwordcheck.min_uppercase_letters
passwordcheck.min_lowercase_letters
passwordcheck.min_numbers
passwordcheck.min_special_chars
passwordcheck.special_chars
passwordcheck.allow_username_in_password
passwordcheck.use_cracklib
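
To make the first of these concrete, here is a rough sketch (not the
actual patch) of how a hypothetical min_password_length GUC could be
enforced inside passwordcheck's existing hook, using the v10 hook
signature; the DefineCustomIntVariable() registration is omitted:

static int  min_password_length = 8;    /* hypothetical GUC */

static void
check_password(const char *username, const char *shadow_pass,
               PasswordType password_type, Datum validuntil_time,
               bool validuntil_null)
{
    if (password_type == PASSWORD_TYPE_PLAINTEXT)
    {
        if (strlen(shadow_pass) < min_password_length)
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_PASSWORD),
                     errmsg("password is too short")));

        /* character-class, username, and cracklib checks would follow */
    }
    else
    {
        /*
         * Pre-hashed (MD5/SCRAM) passwords can't be checked against most
         * of these rules; that's where a block/allow/force knob would
         * come into play.
         */
    }
}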

Nathan




Re: [HACKERS] Bug with pg_basebackup and 'shared' tablespace

2017-09-26 Thread Martín Marqués
El 13/09/17 a las 14:17, Pierre Ducroquet escribió:
> +    bool        firstfile = 1;

You are still assigning 1 to a bool (which is not incorrect) instead of
true or false as Michael mentioned before.
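
That is, the declaration would ideally read something like:

    bool        firstfile = true;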

P.S.: I didn't go through the whole thread and patch in depth, so I will
not comment further.

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Jesper Pedersen

On 09/26/2017 10:33 AM, Robert Haas wrote:

On Tue, Sep 26, 2017 at 9:00 AM, Jesper Pedersen
 wrote:

Could you share your thoughts on the usage of PartitionAppendInfo's
min_datum_idx / max_datum_idx ? Especially in relation to hash partitions.


This brings up something that I've kind of been thinking about.  There
are sort of four cases when it comes to partition pruning:

1. There is exactly one matching partition.  For example, this happens
when there is an equality constraint on every partition column.

2. There are multiple matching partitions which are consecutive.  For
example, there is a single level of range partitioning with no default
partition and the single partitioning column is constrained by < > <=
or >=.

3. There are multiple matching partitions which are not consecutive.
This case is probably rare, but it can happen if there is a default
partition, if there are list partitions with multiple bounds that are
interleaved (e.g. p1 allows (1, 4), p2 allows (2), p3 allows (3, 5),
and the query allows values >= 4 and <= 5), if the query involves OR
conditions, or if there are multiple levels of partitioning (e.g.
partition by a, subpartition by b, put a range constraint on a and an
equality constraint on b).

4. There are no matching partitions.

One of the goals of this algorithm is to be fast.  The obvious way to
cater to case (3) is to iterate through all partitions and test
whether each one works, returning a Bitmapset, but that is O(n).
Admittedly, it might be O(n) with a pretty small constant factor, but
it still seems like exactly the sort of thing that we want to avoid
given the desire to scale to higher partition counts.

I propose that we create a structure that looks like this:

struct foo {
int min_partition;
int max_partition;
Bitmapset *extra_partitions;
};

This indicates that all partitions from min_partition to max_partition
need to be scanned, and in addition any partitions in extra_partitions
need to be scanned.  Assuming that we only consider cases where all
partition keys or a leading subset of the partition keys are
constrained, we'll generally be able to get by with just setting
min_partition and max_partition, but extra_partitions can be used to
handle default partitions and interleaved list bounds.  For equality
on all partitioning columns, we can do a single bsearch of the bounds
to identify the target partition at a given partitioning level, and
the same thing works for a single range-bound.  If there are two
range-bounds (< and > or <= and >= or whatever) we need to bsearch
twice.  The default partition, if any and if matched, must also be
included.  When there are multiple levels of partitioning things get a
bit more complex -- if someone wants to knock out a partition that
breaks up the range, we might need to shrink the main range to cover
part of it and kick the other indexes out to extra_partitions.

But the good thing is that in common cases with only O(lg n) effort we
can return O(1) data that describes what will be scanned.  In cases
where that's not practical we expend more effort but still prune with
maximal effectiveness.



For OLTP style applications 1) would be the common case, and with hash 
partitions it would be one equality constraint.


So, changing the method signature to use a data type as you described 
above instead of the explicit min_datum_idx / max_datum_idx output 
parameters would be more clear.


One could advocate (*cough*) that the hash partition patch [1] should be 
merged first in order to find other instances where other CommitFest 
entries don't account for hash partitions at the moment in their 
method signatures; Beena noted something similar in [2]. I know that you 
said otherwise [3], but this is CommitFest 1, so there is time for a 
revert later, and hash partitions are already useful in internal testing.


[1] https://commitfest.postgresql.org/14/1059/
[2] 
https://www.postgresql.org/message-id/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A%40mail.gmail.com

[3] http://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html

Best regards,
 Jesper




Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Craig Ringer
On 26 September 2017 at 22:14, Magnus Hagander  wrote:

>
>
> On Tue, Sep 26, 2017 at 2:16 PM, Alvaro Hernandez  wrote:
>
>>
>>
>>
>> But what about earlier versions? Any chance it could be backported
>> down to 9.4? If that would be acceptable, I could probably help/do that...
>
>
> The likelihood is zero if you mean backported into core of earlier
> versions.
>

Right. We don't add features to back branches.


>
> If you mean backported as a standalone extension that could be installed
> on a previous version, probably. I'm not sure if it relies on any internals
> not present before that would make it harder, but it would probably at
> least be possible.
>
>
All the pub/sub stuff is new and hooked into syscache etc. So you'd be
doing a bunch of emulation/shims using user catalogs. Not impossible, but
probably irritating and verbose. And you'd have none of the DDL required to
manage it, so you'd need SQL-function equivalents.

I suspect you'd be better off tweaking pglogical to speak the same protocol
as pg10, since the pgoutput protocol is an evolution of pglogical's
protocol. Then using pglogical on older versions.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Euler Taveira
2017-09-26 2:46 GMT-03:00 Alvaro Hernandez :
> I think that's awesome. Now... back to my original question: what is the
> *list* of output plugins supported by managed PostgreSQL solutions? So far
> it looks like wal2json for 9.5-9.6 on RDS, and nothing else (it may just be
> not complete, but in the best case this list won't be unfortunately
> long...).
>
I can't speak for other plugin authors, but wal2json, being one of the
first plugins available, is pretty popular. I've heard that it is used in
replication and audit solutions. It doesn't support all of the logical
decoding features (for example, origin filtering), but that is on my
roadmap.


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consulting, Development, 24x7 Support and Training




Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Robert Haas
On Tue, Sep 26, 2017 at 9:00 AM, Jesper Pedersen
 wrote:
> Could you share your thoughts on the usage of PartitionAppendInfo's
> min_datum_idx / max_datum_idx ? Especially in relation to hash partitions.

This brings up something that I've kind of been thinking about.  There
are sort of four cases when it comes to partition pruning:

1. There is exactly one matching partition.  For example, this happens
when there is an equality constraint on every partition column.

2. There are multiple matching partitions which are consecutive.  For
example, there is a single level of range partitioning with no default
partition and the single partitioning column is constrained by < > <=
or >=.

3. There are multiple matching partitions which are not consecutive.
This case is probably rare, but it can happen if there is a default
partition, if there are list partitions with multiple bounds that are
interleaved (e.g. p1 allows (1, 4), p2 allows (2), p3 allows (3, 5),
and the query allows values >= 4 and <= 5), if the query involves OR
conditions, or if there are multiple levels of partitioning (e.g.
partition by a, subpartition by b, put a range constraint on a and an
equality constraint on b).

4. There are no matching partitions.

One of the goals of this algorithm is to be fast.  The obvious way to
cater to case (3) is to iterate through all partitions and test
whether each one works, returning a Bitmapset, but that is O(n).
Admittedly, it might be O(n) with a pretty small constant factor, but
it still seems like exactly the sort of thing that we want to avoid
given the desire to scale to higher partition counts.

I propose that we create a structure that looks like this:

struct foo {
   int min_partition;
   int max_partition;
   Bitmapset *extra_partitions;
};

This indicates that all partitions from min_partition to max_partition
need to be scanned, and in addition any partitions in extra_partitions
need to be scanned.  Assuming that we only consider cases where all
partition keys or a leading subset of the partition keys are
constrained, we'll generally be able to get by with just setting
min_partition and max_partition, but extra_partitions can be used to
handle default partitions and interleaved list bounds.  For equality
on all partitioning columns, we can do a single bsearch of the bounds
to identify the target partition at a given partitioning level, and
the same thing works for a single range-bound.  If there are two
range-bounds (< and > or <= and >= or whatever) we need to bsearch
twice.  The default partition, if any and if matched, must also be
included.  When there are multiple levels of partitioning things get a
bit more complex -- if someone wants to knock out a partition that
breaks up the range, we might need to shrink the main range to cover
part of it and kick the other indexes out to extra_partitions.

But the good thing is that in common cases with only O(lg n) effort we
can return O(1) data that describes what will be scanned.  In cases
where that's not practical we expend more effort but still prune with
maximal effectiveness.
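
For illustration only -- the names are invented, and partition_bound_bsearch()
stands in for whatever lookup the real patch ends up using -- the
all-columns-equality case might reduce to something like:

struct foo
prune_for_equality(PartitionBoundInfo boundinfo, Datum *values)
{
    struct foo  result;
    bool        is_equal;
    int         offset = partition_bound_bsearch(boundinfo, values, &is_equal);

    if (offset >= 0 && is_equal)
        result.min_partition = result.max_partition = boundinfo->indexes[offset];
    else
    {
        /* case 4: no matching partition, represented as an empty range */
        result.min_partition = 1;
        result.max_partition = 0;
    }
    result.extra_partitions = NULL;     /* default partition would go here */

    return result;
}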

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Magnus Hagander
On Tue, Sep 26, 2017 at 2:16 PM, Alvaro Hernandez  wrote:

>
>
> On 26/09/17 12:57, Petr Jelinek wrote:
>
>> On 26/09/17 09:26, Alvaro Hernandez wrote:
>>
>>> On 26/09/17 10:03, Craig Ringer wrote:
>>>
 On 26 September 2017 at 14:08, Alvaro Hernandez  wrote:
  - If you stick to in-core plugins, then you need to support at
  least three different output formats if you want to support 9.4+:
  test_decoding (and pray it works!), pgoutput, and the "new"
  in-core plugin that was proposed at the beginning of this thread,
  if that would see the light.


 The only practical way will IMO be to have whatever new plugin it also
 have an out-of-core version maintained for older Pg versions, where it
 can be installed.

  But only in-core plugins help for general-purpose solutions.


 I still don't agree there. If there's enough need/interest/adoption
 you can get cloud vendors on board, they'll feel the customer
 pressure. It's not our job to create that pressure and do their work
 for them.

>>>  Don't want to get into a loop, but as I said before it's
>>> chicken-and-egg. But nobody is asking core to do their work. As much as
>>> I love it, I think logical decoding is a bit half-baked until there is a
>>> single, quality, in-core plugin, as it discourages its usage, because of
>>> the reasons I stated.
>>>
>>> Well, in that case it's all good as PG10 has that.
>>
>>
> Even though it's not fully documented, I agree this could fill this
> gap for 10+ (I assume this plugin will be maintained going forward, at
> least to support logical replication).
>
> But what about earlier versions? Any chance it could be backported
> down to 9.4? If that would be acceptable, I could probably help/do that...


The likelihood is zero if you mean backported into core of earlier versions.

If you mean backported as a standalone extension that could be installed on
a previous version, probably. I'm not sure if it relies on any internals
not present before that would make it harder, but it would probably at
least be possible.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Page Scan Mode in Hash Index

2017-09-26 Thread Robert Haas
On Mon, Sep 25, 2017 at 12:25 AM, Amit Kapila  wrote:
> I think your proposal makes sense.  Your patch looks good but you
> might want to tweak the comments atop _hash_kill_items ("However,
> having pin on the overflow page doesn't guarantee that vacuum won't
> delete any items.).  That part of the comment has been written to
> indicate that we have to check the LSN in this function unconditionally.

OK, committed.

And I think that's the last of the hash index work.  Thanks to Amit
Kapila, Ashutosh Sharma, Mithun Cy, Kuntal Ghosh, Jesper Pedersen, and
Dilip Kumar for all the patches and reviews and to Jeff Janes and
others for additional code review and testing!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Improve catcache/syscache performance.

2017-09-26 Thread Jesper Pedersen

On 09/26/2017 06:41 AM, tushar wrote:

On 09/22/2017 11:45 AM, Andres Freund wrote:

Here's a variant that cleans up the previous changes a bit, and adds
some further improvements:


I tested with different pgbench options with master vs. patch and found
an improvement.  I have applied the 0001 and 0003 patches on PG head; patch
0002 was already committed.


Virtual Machine configuration - Centos 6.5 x64 / 16 GB RAM / 8 VCPU core 
processor


Scaling factor=30

pgbench -M prepared -T 200 postgres

PG Head   -  tps = 902.225954 (excluding connections establishing).
PG HEAD+patch -  tps = 1001.896381 (10.97+% vs. head)


pgbench -M prepared -T 300 postgres

PG Head   -  tps = 920.108333 (excluding connections establishing).
PG HEAD+patch -  tps = 1023.89542 (11.19+% vs. head)

pgbench -M prepared -T 500 postgres

PG Head   -  tps = 995.178227 (excluding connections establishing)
PG HEAD+patch -  tps = 1078.3 (+8.34% vs. head)


Later I modified the create_many_cols.sql file (previously attached) and,
instead of only using int, mixed in varchar/int4/numeric/float, and ran
pgbench with different time durations.


pgbench -M prepared -f /tmp/pgbench-many-cols.sql -T 300  postgres

PG Head   -  tps =  5540.143877 (excluding connections establishing).
PG HEAD+patch -  tps =  5679.713493 (2.50+% vs. head)


pgbench -M prepared -f /tmp/pgbench-many-cols.sql -T 500  postgres

PG Head   -  tps = 5519.212709 (excluding connections establishing).
PG HEAD+patch -  tps = 5967.059155 (8.11+% vs. head)


pgbench -M prepared -f /tmp/pgbench-many-cols.sql -T 700  postgres

PG Head   -  tps = 5640.314495(excluding connections establishing).
PG HEAD+patch -  tps = 6012.223147 (6.59+% vs. head)



I'm also seeing a speedup on a 2S/28C/56T/256Gb + 2 x RAID10 SSD machine 
using -M prepared, -M prepared -N and -M prepared -S scenarios with 
various scale factors, and custom queries.


Small typo in 0002- / commit 791961:

diff --git a/src/include/utils/hashutils.h b/src/include/utils/hashutils.h
index 35281689e8..366bd0e78b 100644
--- a/src/include/utils/hashutils.h
+++ b/src/include/utils/hashutils.h
@@ -22,7 +22,7 @@ hash_combine(uint32 a, uint32 b)


 /*
- * Simple inline murmur hash implementation hashing a 32 bit ingeger, for
+ * Simple inline murmur hash implementation hashing a 32 bit integer, for
  * performance.
  */
 static inline uint32

Best regards,
 Jesper




Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Jesper Pedersen

Hi Amit,

On 09/15/2017 04:50 AM, Amit Langote wrote:

On 2017/09/15 11:16, Amit Langote wrote:

I will post rebased patches later today, although I think the overall
design of the patch on the planner side of things is not quite there yet.
Of course, your and others' feedback is greatly welcome.


Rebased patches attached.  Because Dilip complained earlier today about
clauses of the form (const op var) not causing partition-pruning, I've
added code to commute the clause where it is required.  Some other
previously mentioned limitations remain -- no handling of OR clauses, no
elimination of redundant clauses for given partitioning column, etc.

A note about 0001: this patch overlaps with
0003-Canonical-partition-scheme.patch from the partitionwise-join patch
series that Ashutosh Bapat posted yesterday [1].  Because I implemented
the planner-portion of this patch based on what 0001 builds, I'm posting
it here.  It might actually turn out that we will review and commit
0003-Canonical-partition-scheme.patch on that thread, but meanwhile apply
0001 if you want to play with the later patches.  I would certainly like
to review  0003-Canonical-partition-scheme.patch myself, but won't be able
to immediately (see below).



Could you share your thoughts on the usage of PartitionAppendInfo's 
min_datum_idx / max_datum_idx ? Especially in relation to hash partitions.


I'm looking at get_partitions_for_keys.

Best regards,
 Jesper




Re: [HACKERS] [Proposal] Make the optimiser aware of partitions ordering

2017-09-26 Thread Robert Haas
On Sat, Sep 23, 2017 at 6:29 AM, Julien Rouhaud  wrote:
> That's true, but numCols, sortColdIdx etc are also used to display the
> sort key in an explain.  If an append can return sorted data, it
> should also display the sort information, so I think these fields are
> still required in an Append node.

I don't think so.  An index scan doesn't output that information, nor
does a nested loop which inherits a sort order from its outer path.  I
think the rule is that a plan node which takes steps to get the data
into a certain order might want to output something about that, but a
plan node which somehow gets that ordering without taking any explicit
action doesn't print anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-09-26 Thread Robert Haas
On Tue, Sep 26, 2017 at 5:06 AM, Masahiko Sawada  wrote:
> Based on the review comment from Robert, I'm planning to make a big
> change to the architecture of this patch so that a backend process
> works together with a dedicated background worker that is responsible
> for resolving the foreign transactions. Usage of this feature will be
> almost the same as what this patch has been doing, except for adding a
> new GUC parameter that controls the number of resolver processes to
> launch. That is, we can have multiple resolver processes to keep
> latency down.

Multiple resolver processes is useful but gets a bit complicated.  For
example, if process 1 has a connection open to foreign server A and
process 2 does not, and a request arrives that needs to be handled on
foreign server A, what happens?  If process 1 is already busy doing
something else, probably we want process 2 to try to open a new
connection to foreign server A and handle the request.  But if process
1 and 2 are both idle, ideally we'd like 1 to get that request rather
than 2.  That seems a bit difficult to get working though.  Maybe we
should just ignore such considerations in the first version.

> * Resolver processes
> 1. Fetch PGPROC entry from the shmem queue and get its XID (say, XID-a).
> 2. Get the fdw_xact_state entry from shmem hash by XID-a.
> 3. Iterate fdw_xact entries using the index, and resolve the foreign
> transactions.
> 3-a. If even one foreign transaction failed to resolve, raise an error.
> 4. Change the waiting backend state to FDWXACT_COMPLETED and release it.

Comments:

- Note that any error we raise here won't reach the user; this is a
background process.  We don't want to get into a loop where we just
error out repeatedly forever -- at least not if there's any other
reasonable choice.

- I suggest that we ought to track the status for each XID separately
on each server rather than just track the XID status overall.  That
way, if transaction resolution fails on one server, we don't keep
trying to reconnect to the others.

- If we go to resolve a remote transaction and find that no such
remote transaction exists, what should we do?  I'm inclined to think
that we should regard that as if we had succeeded in resolving the
transaction.  Certainly, if we've retried the server repeatedly, it
might be that we previously succeeded in resolving the transaction but
then the network connection was broken before we got the success
message back from the remote server.  But even if that's not the
scenario, I think we should assume that the DBA or some other system
resolved it and therefore we don't need to do anything further.  If we
assume anything else, then we just go into an infinite error loop,
which isn't useful behavior.  We could log a message, though (for
example, LOG: unable to resolve foreign transaction ... because it
does not exist).
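
To illustrate that last point -- all of these names are hypothetical --
the per-server handling inside a resolver might look roughly like:

switch (ResolveOneForeignXact(entry))
{
    case FDWXACT_RESOLVE_OK:
        entry->status = FDWXACT_RESOLVED;
        break;
    case FDWXACT_RESOLVE_NOT_FOUND:
        ereport(LOG,
                (errmsg("unable to resolve foreign transaction \"%s\" on server %u because it does not exist; assuming it is already resolved",
                        entry->fdwxact_id, entry->serverid)));
        entry->status = FDWXACT_RESOLVED;
        break;
    case FDWXACT_RESOLVE_FAILED:
        /* leave this entry for a later retry; don't error out repeatedly */
        entry->status = FDWXACT_RETRY_LATER;
        break;
}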

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Partition-wise aggregation/grouping

2017-09-26 Thread Ashutosh Bapat
Hi Jeevan,
I have started reviewing these patches.

0001 looks fine. There might be some changes that will be needed, but
those will be clear when I review the patch that uses this
refactoring.

0002
+ *
+ * If targetlist is provided, we use it else use targetlist from the root.
  */
 static double
 get_number_of_groups(PlannerInfo *root,
 double path_rows,
-grouping_sets_data *gd)
+grouping_sets_data *gd,
+List *tlist)
 {
Query  *parse = root->parse;
double  dNumGroups;
+   List   *targetList = (tlist == NIL) ? parse->targetList : tlist;

Maybe we should just pass a targetlist always: instead of passing NIL,
pass parse->targetList directly. That would save us one conditional
assignment. Maybe passing NIL is required for the patches that use
this refactoring, but that's not clear as-is in this patch.
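
That is, a caller with no replacement targetlist would simply pass the
parse tree's list, something like (with path_rows standing in for whatever
row estimate the caller already passes):

    dNumGroups = get_number_of_groups(root, path_rows, gd, parse->targetList);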

0003
In the documentation of enable_partition_wise_aggregate, we should
probably explain why the default is off or, like the partition_wise_join
GUC, explain the consequences of turning it off. I doubt that something
like partition_wise_agg_cost_factor would be acceptable, but we can
discuss this at a later stage. Meanwhile it may be worthwhile to fix
the reason why we would require this GUC at all: if regular aggregation
has a lower cost than partition-wise aggregation in most cases, then we
probably need to fix the cost model.

I will continue reviewing rest of the patches.

On Mon, Sep 18, 2017 at 12:37 PM, Jeevan Chalke
 wrote:
>
>
> On Tue, Sep 12, 2017 at 6:21 PM, Jeevan Chalke
>  wrote:
>>
>>
>>
>> On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi
>>  wrote:
>>>
>>>
>>> Hi Jeevan,
>>>
>>> I have started testing partition-wise-aggregate and got one observation,
>>> please take a look.
>>> with the v2 patch, here if I change target list order, query is not
>>> picking full partition-wise-aggregate.
>>
>>
>> Thanks Rajkumar for reporting this.
>>
>> I am looking into this issue and will post updated patch with the fix.
>
>
> The logic for checking whether the partition keys lead the group by keys
> needs to be updated here. The group by expressions can appear in any order
> without affecting the final result, so requiring the partition keys to be
> the leading group by keys is not necessary for full aggregation.
> Instead we must ensure that the partition keys are part of the group by
> keys to compute full aggregation on a partition.
>
> Attached, revised patch-set with above fix.
>
> Also, in test-cases, I have removed DROP/ANALYZE commands on child
> relations and also removed VERBOSE from the EXPLAIN.
>
> Notes:
> HEAD: 8edacab209957520423770851351ab4013cb0167
> Partition-wise Join patch-set version: v32
>
> Thanks
>
> --
> Jeevan Chalke
> Principal Software Engineer, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Dilip Kumar
On Tue, Sep 26, 2017 at 2:45 PM, Amit Langote
 wrote:
> On 2017/09/25 20:21, Dilip Kumar wrote:

> I see.  So, in the run-time pruning case, only the work of extracting
> bounding values is deferred to execution time.  Matching clauses with the
> partition key still occurs during planning time.  Only that the clauses
> that require run-time pruning are not those in rel->baserestrictinfo.

Right.
>
>> After separating out the matching clause we will do somewhat similar
>> processing what "get_rel_partitions" is doing. But, at optimizer time
>> for PARAM we will not have Datum values for rightop, so we will keep
>> track of the PARAM itself.
>
> I guess information about which PARAMs map to which partition keys will be
> kept in the plan somehow.

Yes.
>
>> And, finally at runtime when we get the PARAM value we can prepare
>> minkey and maxkey and call get_partitions_for_keys function.
>
> Note that get_partitions_for_keys() is not planner code, nor is it bound
> with any other planning code.  It's callable from executor without much
> change.  Maybe you already know that though.

Yes, Right.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Fix number skipping in to_number

2017-09-26 Thread Nathan Wagner
On Mon, Sep 25, 2017 at 07:52:19PM +0100, Oliver Ford wrote:

> Thanks for your review. The issue is that Oracle throws errors on many
> more input cases than Postgres does, so making it exactly like Oracle
> could break a lot of existing users. E.g. to_number ('123,000', '999')
> returns '123' on Postgres, but throws an error on Oracle. So making it
> exactly Oracle-like could break existing users who might rely on the
> current behavior.

I wouldn't use to_number for anything other than Oracle compatibility,
and then just so I didn't have to wade through all the ported Oracle
code.  I would use a regex or some such to clean up the number, and then
cast the result.  For an input string of '123,000' I might do a
translate('123,000', ',', '')::integer or perhaps use regexp_replace().
Either way I would want a more positive decision as to what was valid or
not, based on the input.

> My view is that we shouldn't deliberately introduce errors in order to be
> exactly like Oracle if we don't currently and there's a sane use case for
> current behavior. Do you have any examples of results that are different
> between Oracle and Postgres, and you think the Oracle result makes more
> sense?

Not really, other than I think an error report might make more sense.
'123,000' doesn't really match the format of '999'.  If anything it
seems like we're guessing rather than validating input.  It is
surprising (to me at least) that

to_char(to_number('123,000', '999'), '999')

doesn't give us the original input (in the sense that identical formats
don't preserve the original string).  So I'm not sure the current
behavior is a sane use case, but perhaps more people are using
to_number() to get *some* numeric result, rather than wanting it to
be like Oracle.  I would generally prefer to throw an exception instead
of getting a number I wasn't expecting, but I can see cases where that
might not be the case.

-- 
nw




[HACKERS] Logical Replication - test_decoding - unchanged-toast-datum

2017-09-26 Thread Abhinav Singh
Hello,

I am currently using PostgreSQL Community version 9.4.9 and then using this
instance, I am doing logical replication(using replication slots). I have
created the replication slots using the following query:

SELECT xlog_position FROM pg_create_logical_replication_
slot('cjkimqvfuvixqyjd_00016389_e6f7c975_a311_4067_bcf1_a6accb57ab37',
'test_decoding')

So the issue that I am facing is because of the updates that are being done
to my table. I was able to reproduce the same issue again using the
following sample:

_
1. Table on the source(which is RDS PostgreSQL):

CREATE TABLE public.toast_test1
(
id SERIAL PRIMARY KEY NOT NULL,
is_not_toast INT,
is_toast VARCHAR(32767)
);
CREATE UNIQUE INDEX toast_test_id_uindex1 ON public.toast_test1 (id);

2. Insert some values:

INSERT INTO public.toast_test1
(is_not_toast, is_toast) VALUES
(0, (SELECT string_agg(series::text, ',')
FROM generate_series(1, 1000) AS series));

So basically, every time you execute the above query, a new row will be
inserted. Execute the same query 4-5 times.

3. So now I started my replication.

4. If for example, I am doing an update using the below mentioned query on
my source instance:

UPDATE public.toast_test SET is_not_toast = 1;

5. On the target, when I do a SELECT *, I see that the column with the
character varying datatype has changed to 'unchanged-toast-datum'.

6. So on further checking the replication slot at the time, when I issued
an update, I can see this:

postgres2@t1=> SELECT * FROM pg_logical_slot_get_changes('cjkimqvfuvixqyjd_00016389_e6f7c975_a311_4067_bcf1_a6accb57ab37', NULL, NULL);
  location   |  xid  | data
-------------+-------+--------------------------------------------------------------------------------------------------------------------------------
 3D/95003D58 | 17974 | BEGIN 17974
 3D/950049D0 | 17974 | table public.toast_test1: UPDATE: id[integer]:1 is_not_toast[integer]:1 is_toast[character varying]:unchanged-toast-datum
 3D/95004A78 | 17974 | COMMIT 17974
(3 rows)


---

Even setting the REPLICA IDENTITY to FULL for this table did not help.
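
(For reference, the REPLICA IDENTITY change mentioned above was presumably
done with something like

ALTER TABLE public.toast_test1 REPLICA IDENTITY FULL;

which, as far as I understand, only changes what is logged for the old
tuple; an unchanged TOASTed value is still not re-logged with the new
tuple.)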

_

Kindly review and please share your comments on this matter.


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Alvaro Hernandez



On 26/09/17 10:55, Craig Ringer wrote:
On 26 September 2017 at 15:26, Alvaro Hernandez wrote:



    That's better than nothing. But as much as interoperable json
may be, people still need to talk the (binary) replication
protocol to use it.


No, they don't.

They can use the SQL interface to logical decoding.


    AFAIK the SQL interface was also designed for testing, not for 
production use...




We could enhance that with a few small changes to make it a lot more 
useful too. Most importantly, allow a long-polling model, where you 
can wait if there's nothing to consume rather than getting an 
immediate empty result-set.


    Oh, that's a completely different perspective! Do we have any 
support for long-polling style queries? (Not that I know of.) It's indeed 
a cool thing: SQL queries that return "live" results as soon as they 
happen. I don't see this restricted to logical decoding. Actually, this 
is what PipelineDB, among other things, seems to do. +1 for that, but I 
believe it's a different story.




I expect the walsender protocol to be dealt with by client drivers, 
not user applications, much like you never really deal with the 
regular libpq protocol in your app. PgJDBC and psycopg2 already 
support it. It'd be nice if libpq offered some helper interfaces for C 
apps, and I'd help review any such patch.


    Fair enough.


    Álvaro


--

Alvaro Hernandez


---
OnGres



Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Alvaro Hernandez



On 26/09/17 12:57, Petr Jelinek wrote:

On 26/09/17 09:26, Alvaro Hernandez wrote:

On 26/09/17 10:03, Craig Ringer wrote:

On 26 September 2017 at 14:08, Alvaro Hernandez wrote:
 - If you stick to in-core plugins, then you need to support at
 least three different output formats if you want to support 9.4+:
 test_decoding (and pray it works!), pgoutput, and the "new"
 in-core plugin that was proposed at the beginning of this thread,
 if that would see the light.


The only practical way will IMO be to have whatever new plugin it also
have an out-of-core version maintained for older Pg versions, where it
can be installed.
   


 But only in-core plugins help for general-purpose solutions.


I still don't agree there. If there's enough need/interest/adoption
you can get cloud vendors on board, they'll feel the customer
pressure. It's not our job to create that pressure and do their work
for them.

     Don't want to get into a loop, but as I said before it's
chicken-and-egg. But nobody is asking core to do their work. As much as
I love it, I think logical decoding is a bit half-baked until there is a
single, quality, in-core plugin, as it discourages its usage, because of
the reasons I stated.


Well, in that case it's all good as PG10 has that.



    Even though it's not fully documented, I agree this could fill 
this gap for 10+ (I assume this plugin will be maintained onwards, at 
least to support logical replication).


    But what about earlier versions? Any chance it could be backported 
down to 9.4? If that would be acceptable, I could probably help/do that...



    Álvaro

--

Alvaro Hernandez


---
OnGres





Re: [HACKERS] Improve catcache/syscache performance.

2017-09-26 Thread amul sul
On Fri, Sep 22, 2017 at 11:47 AM, Andres Freund  wrote:

> Hi,
>
> On 2017-09-20 18:26:50 +0530, amul sul wrote:
> > Patch 0007:
>

> Other than these concern, patch looks pretty reasonable to me.
>
> I'd appreciate if you could have a look at the new version as well.
>
>

I have reviewed the newer version; it looks good to me.

Here are a few cosmetic comments for the 0003 patch:

1.
 982 +   ct->tuple.t_data = (HeapTupleHeader)
 983 +   MAXALIGN(((char *) ct) + sizeof(CatCTup));

It would be nice if you added a comment explaining this address alignment.

2.
 947 /*
 948 -* If there are any out-of-line toasted fields in the tuple, expand them
 949 -* in-line.  This saves cycles during later use of the catcache entry, and
 950 -* also protects us against the possibility of the toast tuples being
 951 -* freed before we attempt to fetch them, in case of something using a
 952 -* slightly stale catcache entry.
 953  */

An empty comment block is left behind in CatalogCacheCreateEntry().

3.
 411 +/*
 412 + * CatalogCacheCompareTuple
 413 + *
 414 + * Compare a tuple to the passed arguments.
 415 + */
 416 +static inline bool
 417 +CatalogCacheCompareTuple(const CatCache *cache, int nkeys,
 418 +const Datum *cachekeys,
 419 +const Datum *searchkeys)

The prolog comment needs an adjustment.

Regards,
Amul


Re: [HACKERS] Improve catcache/syscache performance.

2017-09-26 Thread tushar

On 09/22/2017 11:45 AM, Andres Freund wrote:

Here's a variant that cleans up the previous changes a bit, and adds
some further improvements:


I tested different pgbench options with master vs. patch and found an 
improvement.  I have applied the 0001 and 0003 patches on PG HEAD; patch 
0002 was already committed.

Virtual Machine configuration - Centos 6.5 x64 / 16 GB RAM / 8 VCPU core 
processor

Scaling factor=30

pgbench -M prepared -T 200 postgres

PG Head   -  tps = 902.225954 (excluding connections establishing).
PG HEAD+patch -  tps = 1001.896381 (10.97+% vs. head)


pgbench -M prepared -T 300 postgres

PG Head   -  tps = 920.108333 (excluding connections establishing).
PG HEAD+patch -  tps = 1023.89542 (11.19+% vs. head)

pgbench -M prepared -T 500 postgres

PG Head   -  tps = 995.178227 (excluding connections establishing)
PG HEAD+patch -  tps = 1078.3 (+8.34% vs. head)


Later I modified the create_many_cols.sql file (previously attached): instead 
of only using int, I mixed in varchar/int4/numeric/float and ran pgbench 
with different time durations.


pgbench -M prepared -f /tmp/pgbench-many-cols.sql -T 300  postgres

PG Head   -  tps =  5540.143877 (excluding connections establishing).
PG HEAD+patch -  tps =  5679.713493 (2.50+% vs. head)


pgbench -M prepared -f /tmp/pgbench-many-cols.sql -T 500  postgres

PG Head   -  tps = 5519.212709 (excluding connections establishing).
PG HEAD+patch -  tps = 5967.059155 (8.11+% vs. head)


pgbench -M prepared -f /tmp/pgbench-many-cols.sql -T 700  postgres

PG Head   -  tps = 5640.314495(excluding connections establishing).
PG HEAD+patch -  tps = 6012.223147 (6.59+% vs. head)

-- 
regards, tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company







Re: [HACKERS] logical replication and statistics

2017-09-26 Thread Masahiko Sawada
On Tue, Sep 26, 2017 at 6:57 PM, Pavel Stehule  wrote:
>
>
> 2017-09-26 11:56 GMT+02:00 Pavel Stehule :
>>
>>
>>
>> 2017-09-26 11:51 GMT+02:00 Masahiko Sawada :
>>>
>>> On Tue, Sep 26, 2017 at 2:50 AM, Pavel Stehule 
>>> wrote:
>>> >
>>> >
>>> > 2017-09-25 19:23 GMT+02:00 Petr Jelinek :
>>> >>
>>> >> On 25/09/17 19:19, Tom Lane wrote:
>>> >> > Pavel Stehule  writes:
>>> >> >> I had two instances on one server with different port. I am sure,
>>> >> >> so
>>> >> >> replication was functional. Only one issue is statistics
>>> >> >
>>> >> >> Master:
>>> >> >
>>> >> >> CREATE TABLE foo(id int primary key, a int);
>>> >> >> CREATE PUBLICATION test_pub FOR TABLE foo;
>>> >> >> INSERT INTO foo VALUES(1, 200);
>>> >> >
>>> >> >> slave
>>> >> >
>>> >> >> CREATE TABLE foo(id int primary key, a int);
>>> >> >> CREATE SUBSCRIPTION test_sub CONNECTION 'port=5432' PUBLICATION
>>> >> >> test_pub;
>>> >> >
>>> >> >> That was all
>>> >> >
>>> >> > In this example, nothing's been done yet by the actual replication
>>> >> > apply process, only by the initial table sync.  Maybe that accounts
>>> >> > for your not seeing stats?
>>> >> >
>>> >>
>>> >> The main replication worker should still be running though. The output
>>> >> of pg_stat_replication should only be empty if there is nothing
>>> >> running.
>>> >>
>>> >
>>> > I did some inserts, updates, ..
>>> >
>>> > I can recheck it - it was done on 10 RC
>>>
>>> I guess CREATE SUBSCRIPTION failed for whatever reason (e.g, wal_level
>>> < logical on the master). Didn't you get errors from CREATE
>>> SUBSCRIPTION?
>>
>>
>> sorry I had wal_level = logical
>
>
> but if I remember - maybe I had this level only on master
>>
>>

Hmm, has a logical replication slot been created on the master?
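
A quick way to check the relevant pieces on the master would be something
like:

SHOW wal_level;                      -- should be 'logical'
SELECT slot_name, plugin, active FROM pg_replication_slots;
SELECT pid, application_name, state FROM pg_stat_replication;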

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-09-26 Thread Ants Aasma
On Tue, Aug 15, 2017 at 5:00 AM, Craig Ringer  wrote:
> On 22 March 2017 at 01:17, Robert Haas  wrote:
>>
>> On Sun, Mar 12, 2017 at 10:20 PM, Thomas Munro
>>  wrote:
>> > Maybe someone can think of a clever way for an extension to insert a
>> > wait for a user-supplied LSN *before* acquiring a snapshot so it can
>> > work for the higher levels, or maybe the hooks should go into core
>> > PostgreSQL so that the extension can exist as an external project not
>> > requiring a patched PostgreSQL installation, or maybe this should be
>> > done with new core syntax that extends transaction commands.  Do other
>> > people have views on this?
>>
>> IMHO, trying to do this using a function-based interface is a really
>> bad idea for exactly the reasons you mention.  I don't see why we'd
>> resist the idea of core syntax here; transactions are a core part of
>> PostgreSQL.
>>
>> There is, of course, the question of whether making LSNs such a
>> user-visible thing is a good idea in the first place, but that's a
>> separate question from issue of what syntax for such a thing is best.
>
>
> (I know this is old, but):
>
> That ship sailed a long time ago unfortunately, they're all over
> pg_stat_replication and pg_replication_slots and so on. They're already
> routinely used for monitoring replication lag in bytes, waiting for a peer
> to catch up, etc.

(continuing the trend of resurrecting old topics)

Exposing this interface as WAITLSN will encode the assumption that
visibility order matches LSN order. This removes any chance of fixing, for
example, the visibility order of async vs. sync transactions. Just renaming
it so the token is an opaque commit visibility token that merely happens to
be an LSN would still allow for progress in transaction management. For
example, making PostgreSQL distributed will likely want timestamp
and/or vector clock based visibility rules.
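
For context, the existing LSN exposure mentioned above is already used for
things like byte-based lag monitoring; a sketch in v10 terms, run on the
primary:

SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;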

Regards,
Ants Aasma




Re: [HACKERS] logical replication and statistics

2017-09-26 Thread Pavel Stehule
2017-09-26 11:56 GMT+02:00 Pavel Stehule :

>
>
> 2017-09-26 11:51 GMT+02:00 Masahiko Sawada :
>
>> On Tue, Sep 26, 2017 at 2:50 AM, Pavel Stehule 
>> wrote:
>> >
>> >
>> > 2017-09-25 19:23 GMT+02:00 Petr Jelinek :
>> >>
>> >> On 25/09/17 19:19, Tom Lane wrote:
>> >> > Pavel Stehule  writes:
>> >> >> I had two instances on one server with different port. I am sure, so
>> >> >> replication was functional. Only one issue is statistics
>> >> >
>> >> >> Master:
>> >> >
>> >> >> CREATE TABLE foo(id int primary key, a int);
>> >> >> CREATE PUBLICATION test_pub FOR TABLE foo;
>> >> >> INSERT INTO foo VALUES(1, 200);
>> >> >
>> >> >> slave
>> >> >
>> >> >> CREATE TABLE foo(id int primary key, a int);
>> >> >> CREATE SUBSCRIPTION test_sub CONNECTION 'port=5432' PUBLICATION
>> >> >> test_pub;
>> >> >
>> >> >> That was all
>> >> >
>> >> > In this example, nothing's been done yet by the actual replication
>> >> > apply process, only by the initial table sync.  Maybe that accounts
>> >> > for your not seeing stats?
>> >> >
>> >>
>> >> The main replication worker should still be running though. The output
>> >> of pg_stat_replication should only be empty if there is nothing
>> running.
>> >>
>> >
>> > I did some inserts, updates, ..
>> >
>> > I can recheck it - it was done on 10 RC
>>
>> I guess CREATE SUBSCRIPTION failed for whatever reason (e.g, wal_level
>> < logical on the master). Didn't you get errors from CREATE
>> SUBSCRIPTION?
>>
>
> sorry I had wal_level = logical
>

but if I remember - maybe I had this level only on master

>
> Pavel
>
>>
>> Regards,
>>
>> --
>> Masahiko Sawada
>> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> NTT Open Source Software Center
>>
>
>


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Petr Jelinek
On 26/09/17 09:26, Alvaro Hernandez wrote:
> On 26/09/17 10:03, Craig Ringer wrote:
>> On 26 September 2017 at 14:08, Alvaro Hernandez wrote:
>> - If you stick to in-core plugins, then you need to support at
>> least three different output formats if you want to support 9.4+:
>> test_decoding (and pray it works!), pgoutput, and the "new"
>> in-core plugin that was proposed at the beginning of this thread,
>> if that would see the light.
>>
>>
>> The only practical way will IMO be to have whatever new plugin it also
>> have an out-of-core version maintained for older Pg versions, where it
>> can be installed.
>>   
>>
>> But only in-core plugins help for general-purpose solutions.
>>
>>
>> I still don't agree there. If there's enough need/interest/adoption
>> you can get cloud vendors on board, they'll feel the customer
>> pressure. It's not our job to create that pressure and do their work
>> for them.
> 
>     Don't want to get into a loop, but as I said before it's
> chicken-and-egg. But nobody is asking core to do their work. As much as
> I love it, I think logical decoding is a bit half-baked until there is a
> single, quality, in-core plugin, as it discourages its usage, because of
> the reasons I stated.
> 

Well, in that case it's all good as PG10 has that.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical replication and statistics

2017-09-26 Thread Pavel Stehule
2017-09-26 11:51 GMT+02:00 Masahiko Sawada :

> On Tue, Sep 26, 2017 at 2:50 AM, Pavel Stehule 
> wrote:
> >
> >
> > 2017-09-25 19:23 GMT+02:00 Petr Jelinek :
> >>
> >> On 25/09/17 19:19, Tom Lane wrote:
> >> > Pavel Stehule  writes:
> >> >> I had two instances on one server with different port. I am sure, so
> >> >> replication was functional. Only one issue is statistics
> >> >
> >> >> Master:
> >> >
> >> >> CREATE TABLE foo(id int primary key, a int);
> >> >> CREATE PUBLICATION test_pub FOR TABLE foo;
> >> >> INSERT INTO foo VALUES(1, 200);
> >> >
> >> >> slave
> >> >
> >> >> CREATE TABLE foo(id int primary key, a int);
> >> >> CREATE SUBSCRIPTION test_sub CONNECTION 'port=5432' PUBLICATION
> >> >> test_pub;
> >> >
> >> >> That was all
> >> >
> >> > In this example, nothing's been done yet by the actual replication
> >> > apply process, only by the initial table sync.  Maybe that accounts
> >> > for your not seeing stats?
> >> >
> >>
> >> The main replication worker should still be running though. The output
> >> of pg_stat_replication should only be empty if there is nothing running.
> >>
> >
> > I did some inserts, updates, ..
> >
> > I can recheck it - it was done on 10 RC
>
> I guess CREATE SUBSCRIPTION failed for whatever reason (e.g, wal_level
> < logical on the master). Didn't you get errors from CREATE
> SUBSCRIPTION?
>

sorry I had wal_level = logical

Pavel

>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
>


Re: [HACKERS] logical replication and statistics

2017-09-26 Thread Masahiko Sawada
On Tue, Sep 26, 2017 at 2:50 AM, Pavel Stehule  wrote:
>
>
> 2017-09-25 19:23 GMT+02:00 Petr Jelinek :
>>
>> On 25/09/17 19:19, Tom Lane wrote:
>> > Pavel Stehule  writes:
>> >> I had two instances on one server with different port. I am sure, so
>> >> replication was functional. Only one issue is statistics
>> >
>> >> Master:
>> >
>> >> CREATE TABLE foo(id int primary key, a int);
>> >> CREATE PUBLICATION test_pub FOR TABLE foo;
>> >> INSERT INTO foo VALUES(1, 200);
>> >
>> >> slave
>> >
>> >> CREATE TABLE foo(id int primary key, a int);
>> >> CREATE SUBSCRIPTION test_sub CONNECTION 'port=5432' PUBLICATION
>> >> test_pub;
>> >
>> >> That was all
>> >
>> > In this example, nothing's been done yet by the actual replication
>> > apply process, only by the initial table sync.  Maybe that accounts
>> > for your not seeing stats?
>> >
>>
>> The main replication worker should still be running though. The output
>> of pg_stat_replication should only be empty if there is nothing running.
>>
>
> I did some inserts, updates, ..
>
> I can recheck it - it was done on 10 RC

I guess CREATE SUBSCRIPTION failed for whatever reason (e.g, wal_level
< logical on the master). Didn't you get errors from CREATE
SUBSCRIPTION?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] path toward faster partition pruning

2017-09-26 Thread Amit Langote
On 2017/09/25 20:21, Dilip Kumar wrote:
> On Mon, Sep 25, 2017 at 3:34 PM, Amit Langote
>  wrote:
> 
>> Thanks for looking at the patches and the comments.
> 
>> It's not clear to me whether get_rel_partitions() itself, as it is, is
>> callable from outside the planner, because its signature contains
>> RelOptInfo.  We have the RelOptInfo in the signature, because we want to
>> mark certain fields in it so that latter planning steps can use them.  So,
>> get_rel_partitions()'s job is not just to match clauses and find
>> partitions, but also to perform certain planner-specific tasks of
>> generating information that the later planning steps will want to use.
>> That may turn out to be unnecessary, but until we know that, let's not try
>> to export get_rel_partitions() itself out of the planner.
>>
>> OTOH, the function that your refactoring patch separates out to match
>> clauses to partition keys and extract bounding values seems reusable
>> outside the planner and we should export it in such a way that it can be
>> used in the executor.  Then, the hypothetical executor function that does
>> the pruning will first call the planner's clause-matching function,
>> followed by calling get_partitions_for_keys() in partition.c to get the
>> selected partitions.
> 
> Thanks for your reply.
> 
> Actually, we are still planning to call get_matching_clause at the
> optimizer time only.  Since we can not use get_rel_partitions function
> directly for runtime pruning because it does all the work (find
> matching clause, create minkey and maxkey and call
> get_partitions_for_keys) during planning time itself.
>
> For runtime pruning, we are planning to first get_matching_clause
> function during optimizer time to identify the clause which is
> matching with partition keys, but for PARAM_EXEC case we can not
> depend upon baserelrestriction instead we will get the from join
> clause, that's the reason I have separated out get_matching_clause.
> But it will still be used during planning time.

I see.  So, in the run-time pruning case, only the work of extracting
bounding values is deferred to execution time.  Matching clauses with the
partition key still occurs during planning time.  Only that the clauses
that require run-time pruning are not those in rel->baserestrictinfo.

> After separating out the matching clause we will do somewhat similar
> processing what "get_rel_partitions" is doing. But, at optimizer time
> for PARAM we will not have Datum values for rightop, so we will keep
> track of the PARAM itself.

I guess information about which PARAMs map to which partition keys will be
kept in the plan somehow.

> And, finally at runtime when we get the PARAM value we can prepare
> minkey and maxkey and call get_partitions_for_keys function.

Note that get_partitions_for_keys() is not planner code, nor is it bound
with any other planning code.  It's callable from executor without much
change.  Maybe you already know that though.
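
To make the distinction concrete, here is a small sketch (table and values
are illustrative).  The first query compares the partition key with a
constant, so it can be pruned from rel->baserestrictinfo at plan time; in
the second, the comparison value is only known at execution time (a
parameter, or a value coming from the other side of a join), which is the
run-time pruning case discussed here:

CREATE TABLE orders (order_date date, amount numeric)
  PARTITION BY RANGE (order_date);
CREATE TABLE orders_2017_q3 PARTITION OF orders
  FOR VALUES FROM ('2017-07-01') TO ('2017-10-01');
CREATE TABLE orders_2017_q4 PARTITION OF orders
  FOR VALUES FROM ('2017-10-01') TO ('2018-01-01');

-- plan-time prunable: constant vs. partition key
SELECT * FROM orders WHERE order_date = DATE '2017-11-15';

-- run-time case: the value arrives only during execution
PREPARE q(date) AS SELECT * FROM orders WHERE order_date = $1;
EXECUTE q('2017-11-15');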

Thanks,
Amit





Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-09-26 Thread Alexander Korotkov
On Mon, Jan 23, 2017 at 2:56 PM, Ivan Kartyshov 
wrote:

> How to use it
> ==
> WAITLSN ‘LSN’ [, timeout in ms];
> WAITLSN_INFINITE ‘LSN’;
> WAITLSN_NO_WAIT ‘LSN’;


Adding a suffix to the command name looks weird.  We don't do so for any
other command.
I propose the following syntax options.

WAITLSN lsn;
WAITLSN lsn TIMEOUT delay;
WAITLSN lsn INFINITE;
WAITLSN lsn NOWAIT;

To me that looks rather better.  What do you think?
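
For illustration, the variants above would be used roughly like this (the
LSN and timeout values are made up, and this is proposed rather than
existing syntax):

WAITLSN '0/303EC60';
WAITLSN '0/303EC60' TIMEOUT 10000;
WAITLSN '0/303EC60' INFINITE;
WAITLSN '0/303EC60' NOWAIT;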

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-09-26 Thread Masahiko Sawada
On Tue, Aug 1, 2017 at 1:40 AM, Robert Haas  wrote:
> On Thu, Jul 27, 2017 at 8:25 AM, Ashutosh Bapat
>  wrote:
>> The remote transaction can be committed/aborted only after the fate of
>> the local transaction is decided. If we commit remote transaction and
>> abort local transaction, that's not good. AtEOXact* functions are
>> called immediately after that decision in post-commit/abort phase. So,
>> if we want to commit/abort the remote transaction immediately it has
>> to be done in post-commit/abort processing. Instead if we delegate
>> that to the remote transaction resolved backend (introduced by the
>> patches) the delay between local commit and remote commits depends
>> upon when the resolve gets a chance to run and process those
>> transactions. One could argue that that delay would anyway exist when
>> post-commit/abort processing fails to resolve remote transaction. But
>> given the real high availability these days, in most of the cases
>> remote transaction will be resolved in the post-commit/abort phase. I
>> think we should optimize for most common case. Your concern is still
>> valid, that we shouldn't raise an error or do anything critical in
>> post-commit/abort phase. So we should device a way to send
>> COMMIT/ABORT prepared messages to the remote server in asynchronous
>> fashion carefully avoiding errors. Recent changes to 2PC have improved
>> performance in that area to a great extent. Relying on resolver
>> backend to resolve remote transactions would erode that performance
>> gain.
>
> I think there are two separate but interconnected issues here.  One is
> that if we give the user a new command prompt without resolving the
> remote transaction, then they might run a new query that sees their
> own work as committed, which would be bad.  Or, they might commit,
> wait for the acknowledgement, and then tell some other session to go
> look at the data, and find it not there.  That would also be bad.  I
> think the solution is likely to do something like what we did for
> synchronous replication in commit
> 9a56dc3389b9470031e9ef8e45c95a680982e01a -- wait for the remote
> transaction to be resolved (by the background process) but allow an
> interrupt to escape the wait-loop.
>
> The second issue is that having the resolver resolve transactions
> might be slower than doing it in the foreground.  I don't necessarily
> see a reason why that should be a big problem.  I mean, the resolver
> might need to establish a separate connection, but if it keeps that
> connection open for a while (say, 5 minutes) in case further
> transactions arrive then it won't be an issue except on really
> low-volume system which isn't really a case I think we need to worry
> about very much.  Also, the hand-off to the resolver might take some
> time, but that's equally true for sync rep and we're living with it
> there.  Anything else is presumably just the resolver itself being
> inefficient which seems like something that can simply be fixed.
>
> FWIW, I don't think the present resolver implementation is likely to
> be what we want.  IIRC, it's just calling an SQL function which
> doesn't seem like a good approach.  Ideally we should stick an entry
> into a shared memory queue and then ping the resolver via SetLatch,
> and it can directly invoke an FDW method on the data from the shared
> memory queue.  It should be possible to set things up so that a user
> who wishes to do so can run multiple copies of the resolver thread at
> the same time, which would be a good way to keep latency down if the
> system is very busy with distributed transactions.
>

Based on Robert's review comments, I'm planning to make a big change to
the architecture of this patch so that a backend process works together
with a dedicated background worker that is responsible for resolving the
foreign transactions. Usage of this feature will be almost the same as
with the current patch, except for a new GUC parameter that controls the
number of resolver processes launched. That is, we can have multiple
resolver processes to keep latency down.

From a technical point of view, the processing of a transaction involving
multiple foreign servers will be changed as follows.

* Backend processes
1. In PreCommit phase, prepare the transaction on foreign servers and
save fdw_xact entries into the array on shmem. Also create a
fdw_xact_state entry on shmem hash that has the index of each fdw_xact
entry.
2. Local commit/abort.
3. Change its process state to FDWXACT_WAITING and enqueue the MyProc
to the shmem queue.
4. Ping to the resolver process via SetLatch.
5. Wait to be waken up.

* Resolver processes
1. Fetch PGPROC entry from the shmem queue and get its XID (say, XID-a).
2. Get the fdw_xact_state entry from shmem hash by XID-a.
3. Iterate fdw_xact entries using the index, and resolve the foreign
transactions.
3-a. If even one foreign transaction failed to resolve, raise an 

Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Magnus Hagander
On Tue, Sep 26, 2017 at 7:42 AM, Alvaro Hernandez  wrote:

>
>
> On 25/09/17 22:13, Magnus Hagander wrote:
>
> On Mon, Sep 25, 2017 at 8:20 PM, Alvaro Hernandez  wrote:
>
>>
>>
>> On 25/09/17 20:18, Andres Freund wrote:
>>
>>> On 2017-09-24 13:36:56 +0300, Alvaro Hernandez wrote:
>>>
  However, if DMS uses it for what I'd call production use, I assume it is
 actually production quality. I bet they do enough testing, and don't ship
 software to potentially millions of customers if it doesn't work well.
 So... first, I'd consider this a sign of robustness.

>>> You've been in software for how long? ... ;)  There's quite mixed
>>> experiences with DMS.
>>>
>>
>> Actually long enough to understand that if someone "big" calls it
>> production quality, we should not be pickier and assume it is --whether it
>> is or not. People will accept it as such, and that's good enough.
>>
>
> Historically the fact that we have been pickier than many of the "someone
> big":s is exactly how we ended up with the codebase and relative stability
> we have today.
>
> Just because someone is big doesn't mean they know what's right. In fact,
> more often than not the opposite turns out to be true.
>
>
>
> Note that I'm not here supporting test_decoding. I'm just saying it is
> all that is available in-core for 9.4-9.6, and it seems someone with
> potentially a lot of users tested it and is using it in its own solution.
> Would I like an in-core, well-tested, performant and efficient plugin with
> an easy-to-parse format for 9.4-9.6? My answer would be an immediate
> 'yes'. But since this is not going to happen, it is lucky for us that
> test_decoding is good enough, because otherwise there would not be a good
> solution on this front.
>

I am not saying we shouldn't have that. I am saying that the argument "if
someone big calls it production quality, we should not be pickier and
assume it is" is incorrect.

And yes, I have used test_decoding in production multiple times. And yes,
there are good reasons why it's called *test* decoding, and should only be
used in production in fairly simple cases :)


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Shaky coding for vacuuming partitioned relations

2017-09-26 Thread Amit Langote
On 2017/09/26 11:12, Michael Paquier wrote:
> On Tue, Sep 26, 2017 at 10:54 AM, Amit Langote
>  wrote:
>> I think that's right, although, I don't see any new RangeVar created under
>> vacuum() at the moment.  Maybe, you're referring to the Nathan's patch
>> that perhaps does that.
> 
> Yes, you can check what it does here (last version):
> 766556dd-aa3c-42f7-acf4-5dc97f41f...@amazon.com

Thanks, will look.

Regards,
Amit





Re: [HACKERS] Shaky coding for vacuuming partitioned relations

2017-09-26 Thread Amit Langote
On 2017/09/26 11:14, Michael Paquier wrote:
> On Tue, Sep 26, 2017 at 10:55 AM, Amit Langote wrote:
>> On 2017/09/26 9:51, Michael Paquier wrote:
>>> On Tue, Sep 26, 2017 at 8:48 AM, Michael Paquier wrote:
 Something like that looks like a good compromise for v10. I would
 rather see a more complete fix with each relation name reported
 correctly on HEAD though. The information provided would be useful for
 users when using autovacuum when skipping a relation because no lock
 could be taken on it.
>>>
>>> Actually, perhaps this should be tracked as an open item? A simple fix
>>> is leading to the path that no information is better than misleading
>>> information in this case.
>>
>> +1.
> 
> Let's track it then and spawn a separate thread with a patch. Do you
> want to work on it or should I? The solution proposed by Tom seems
> like the correct answer. I am adding an item for now, we could always
> link it to a thread later on.

I assume you mean Tom's solution wherein we pass a null RangeVar for
tables that were not mentioned in the command (that is, for partitions of
a partitioned table that was mentioned in the command or for tables read
from pg_class when a user ran VACUUM without mentioning any table name).

Please feel free to come up with the patch for the same, if you have time.
I'll be glad to review.

> Let's also track the problem that has been reported on this thread.

Yes.  Just to be clear, the original problem this thread was started for
is that get_rel_oids() may emit a less user-friendly "cache lookup failed
for relation " error, which it shouldn't.  We should fix the locking
like the patch you posted does, to avoid having to come across this
syscache lookup error.
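
For reference, the scenario is simply something like this (names
illustrative): the partitions are collected internally even though only the
parent was named in the command, so they have no RangeVar of their own.

CREATE TABLE p (a int) PARTITION BY LIST (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES IN (1);
-- processes p1 as well; if p1 is dropped concurrently it should be skipped
-- with a sensible message rather than a "cache lookup failed" error
VACUUM p;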

Thanks,
Amit





Re: [HACKERS] Repetitive code in RI triggers

2017-09-26 Thread Daniel Gustafsson
> On 26 Sep 2017, at 10:51, Maksim Milyutin  wrote:
> 
> On 19.09.2017 11:09, Daniel Gustafsson wrote:
> 
>> Removing reviewer Maksim Milyutin from patch entry due to inactivity and
>> community account email bouncing.  Maksim: if you are indeed reviewing this
>> patch, then please update the community account and re-add to the patch 
>> entry.
>> 
>> cheers ./daniel
> 
> Daniel, thanks for noticing. I have updated my account and re-added to the 
> patch entry.

Great, thanks!

> Ildar, your patch is conflicting with the current HEAD of master branch, 
> please update it.

I’ve changed status to Waiting on Author based on this.

cheers ./daniel



Re: [HACKERS] Repetitive code in RI triggers

2017-09-26 Thread Maksim Milyutin

On 19.09.2017 11:09, Daniel Gustafsson wrote:


Removing reviewer Maksim Milyutin from patch entry due to inactivity and
community account email bouncing.  Maksim: if you are indeed reviewing this
patch, then please update the community account and re-add to the patch entry.

cheers ./daniel


Daniel, thanks for noticing. I have updated my account and re-added to 
the patch entry.


Ildar, your patch is conflicting with the current HEAD of master branch, 
please update it.


--
Regards,
Maksim Milyutin





Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-09-26 Thread Amit Langote
On 2017/09/16 1:57, Amit Langote wrote:
> On Sat, Sep 16, 2017 at 12:59 AM, Robert Haas  wrote:
>> I believe the intended advantage of the current system is that if you
>> specify multiple operations in a single ALTER TABLE command, you only
>> do one scan rather than having a second scan per operation.  If that's
>> currently working, we probably don't want to make it stop working.
> 
> OK.
> 
> How about squash Jeevan's and my patch, so both
> check_default_allows_bound() and ValidatePartitionConstraints() know
> to scan default partition's children and there won't be any surprises
> in the regression test output as you found after applying just the
> Jeevan's patch.  Unfortunately, I'm not able to post such a patch
> right now.

I guess we don't need to squash, as they could be seen as implementing
different features. Reordering the patches helps though.  So, apply them
in this order:

1. My patch to teach ValidatePartitionConstraints() to skip scanning
   a partition's own partitions, which optimizes ATTACH PARTITION
   command's partition constraint validation scan (this also covers the
   case of scanning the default partition to validate its updated
   constraint when attaching a new partition)

2. Jeevan's patch to teach check_default_allows_bound() to skip scanning
   the default partition's own partitions, which covers the case of
   scanning the default partition to validate its updated constraint when
   adding a new partition using CREATE TABLE

Attached 0001 and 0002 are ordered that way.
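
To make the two cases concrete, a sketch of the DDL involved (names
illustrative, using the development-branch partitioning syntax):

CREATE TABLE t (a int) PARTITION BY LIST (a);
CREATE TABLE t_def PARTITION OF t DEFAULT PARTITION BY LIST (a);
CREATE TABLE t_def_other PARTITION OF t_def FOR VALUES IN (10, 11);

-- case 1: ATTACH PARTITION must validate the default partition's updated
-- constraint, scanning t_def's children unless their constraints already
-- imply it
CREATE TABLE t_1 (a int);
ALTER TABLE t ATTACH PARTITION t_1 FOR VALUES IN (1);

-- case 2: CREATE TABLE .. PARTITION OF goes through
-- check_default_allows_bound(), which likewise should be able to skip
-- scanning t_def's children
CREATE TABLE t_2 PARTITION OF t FOR VALUES IN (2);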

In addition to implementing the features mentioned in 1 and 2 above, the
patches also modify the INFO message to mention "updated partition
constraint for default partition \"%s\"", instead of "partition constraint
for table \"%s\"", when the default partition is involved.  That's so that
it's consistent with the error message that would be emitted by either
check_default_allows_bound() or ATRewriteTable() when the scan finds a row
that violates updated default partition constraint, viz. the following:
"updated partition constraint for default partition \"%s\" would be
violated by some row"

Thoughts?

Thanks,
Amit
From 166cc082b9d652e186df4ef7b6c41976a22e55a8 Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 7 Aug 2017 10:51:47 +0900
Subject: [PATCH 1/2] Teach ValidatePartitionConstraints to skip validation in
 more cases

In cases where the table being attached is a partitioned table and
the table itself does not have constraints that would allow validation
on the whole table to be skipped, we can still skip the validations
of individual partitions if they each happen to have the requisite
constraints.

Idea by Robert Haas.
---
 src/backend/commands/tablecmds.c  | 26 +++---
 src/test/regress/expected/alter_table.out | 17 +++--
 src/test/regress/sql/alter_table.sql  | 10 ++
 3 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 563bcda30c..2d4dcd7556 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13635,9 +13635,14 @@ ValidatePartitionConstraints(List **wqueue, Relation 
scanrel,
 */
if (PartConstraintImpliedByRelConstraint(scanrel, partConstraint))
{
-   ereport(INFO,
-   (errmsg("partition constraint for table \"%s\" 
is implied by existing constraints",
-   
RelationGetRelationName(scanrel;
+   if (!validate_default)
+   ereport(INFO,
+   (errmsg("partition constraint for table 
\"%s\" is implied by existing constraints",
+   
RelationGetRelationName(scanrel;
+   else
+   ereport(INFO,
+   (errmsg("updated partition constraint 
for default partition \"%s\" is implied by existing constraints",
+   
RelationGetRelationName(scanrel;
return;
}
 
@@ -13678,6 +13683,21 @@ ValidatePartitionConstraints(List **wqueue, Relation 
scanrel,
/* There can never be a whole-row reference here */
if (found_whole_row)
elog(ERROR, "unexpected whole-row reference 
found in partition key");
+
+   /* Can we skip scanning this part_rel? */
+   if (PartConstraintImpliedByRelConstraint(part_rel, 
my_partconstr))
+   {
+   if (!validate_default)
+   ereport(INFO,
+   (errmsg("partition 
constraint for table \"%s\" is implied by existing constraints",
+

Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Craig Ringer
On 26 September 2017 at 15:26, Alvaro Hernandez  wrote:

>
> That's better than nothing. But as much as interoperable json may be,
> people still need to talk the (binary) replication protocol to use it.
>

No, they don't.

They can use the SQL interface to logical decoding.

We could enhance that with a few small changes to make it a lot more useful
too. Most importantly, allow a long-polling model, where you can wait if
there's nothing to consume rather than getting an immediate empty
result-set.
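
For anyone following along, the SQL interface in question is just this
(slot name and plugin are illustrative):

SELECT pg_create_logical_replication_slot('my_slot', 'test_decoding');
-- returns immediately, possibly with zero rows, hence the wish for a
-- long-polling variant:
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);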

I expect the walsender protocol to be dealt with by client drivers, not
user applications, much like you never really deal with the regular libpq
protocol in your app. PgJDBC and psycopg2 already support it. It'd be nice
if libpq offered some helper interfaces for C apps, and I'd help review any
such patch.

Nothing against binary output, but as you noted, we can likely use pgoutput
for that to some extent at least. I think the pressing need is json, going
by the zillion plugins out there for it.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] tablespaces inside $PGDATA considered harmful

2017-09-26 Thread Mark Kirkwood

On 29/04/15 09:35, Bruce Momjian wrote:


On Fri, Apr 24, 2015 at 01:05:03PM -0400, Bruce Momjian wrote:

This way, both pg_dump and pg_upgrade will issue warnings, though, of
course, those warnings can be ignored.  I am hopeful these two warnings
will be sufficient and we will not need make these errors, with the
possible inconvenience it will cause.  I am still afraid that someone
will ignore the new errors pg_dump would generate and lose data.  I just
don't remember enough cases where we threw new errors on _data_ restore.

Frankly, those using pg_upgrade already will have to move the old
tablespaces out of the old cluster if they ever want to delete those
clusters, so I am hopeful these additional warnings will help eliminate
this practice, which is already cumbersome and useless.  I am not
planning to revisit this for 9.6.




(resurrecting an old thread) I encountered this the other day: a 
customer had created tablespaces with directories inside 
$PGDATA/pg_tblspc. This is just pathological - e.g. (v11 checkout with 
PGDATA=/data0/pgdata/11):


bench=# CREATE TABLESPACE space1 LOCATION '/data0/pgdata/11/pg_tblspc/space1';
WARNING:  tablespace location should not be inside the data directory
CREATE TABLESPACE
bench=# ALTER TABLE pgbench_accounts SET  TABLESPACE space1;
ALTER TABLE

Ok, so I've been warned:

$ pg_basebackup -D .
WARNING:  could not read symbolic link "pg_tblspc/space1": Invalid argument
pg_basebackup: directory "/data0/pgdata/11/pg_tblspc/space1" exists but is not empty
pg_basebackup: removing contents of data directory "."

So pg_basebackup is completely broken by this construction - should we 
not prohibit the creation of tablespace directories under $PGDATA (or at 
least $PGDATA/pg_tblspc) at this point?
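
As a stop-gap, spotting such setups is at least easy; a sketch
(pg_tablespace_location() returns an empty string for pg_default/pg_global,
so those drop out of the result):

SELECT spcname, pg_tablespace_location(oid) AS location
FROM pg_tablespace
WHERE pg_tablespace_location(oid) LIKE current_setting('data_directory') || '%';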


regards

Mark






Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-26 Thread Amit Langote
On 2017/09/26 16:30, Michael Paquier wrote:
> On Tue, Sep 26, 2017 at 4:22 PM, Amit Langote
>  wrote:
>>> Except that small thing, the patches do their duty.
>>
>> Thanks, revised patches attached.
> 
> Cool, let's switch it back to a ready for committer status then.

Sure, thanks.

Regards,
Amit





Re: [HACKERS] Enhancements to passwordcheck

2017-09-26 Thread Albe Laurenz
Nathan Bossart wrote:
>>> passwordcheck.force_new_password
>>>
>> Does it mean a password different from the old one? +1. It could be
>> different from the last 3 passwords but we don't store a password
>> history.
> 
> Yes.  As Michael pointed out, this might be better to do as a separate
> effort since we'll almost certainly need to introduce a way to store
> password history.

That increases the number of passwords stored on the server and
consequently the damage when that list is stolen.
Of course the old passwords are invalid, but if someone cracks them
they could still try them on other systems the person uses.

I think we should accept such a risk only if the benefits are clear, and
my opinion has always been that if you forbid password reuse, people
tend to come up with password generation schemes that are no better
than the original passwords.

> One interesting design challenge will be how to handle pre-hashed
> passwords, since the number of checks we can do on those is pretty
> limited.  I'm currently thinking of a parameter that can be used to
> block, allow, or force pre-hashed passwords.  If we take that route,
> perhaps we will also need to ensure that PostgreSQL fails to start when
> invalid combinations are specified (e.g. pre-hashed passwords are forced
> and min_password_length is nonzero).  Thoughts?

As was pointed out in the original discussion
d960cb61b694cf459dcfb4b0128514c203937...@exadv11.host.magwien.gv.at
the weak point of "passwordcheck" is that it does not work very well
for encrypted passwords.
The only saving grace is that you can at least check against
username equals password.

Disabling pre-hashed passwords in order to allow better password
checks is a problem rather than a solution, because it exposes you
to password theft of the clear-text password.  I think we shouldn't
go there.

The overall opinion in the above thread was that if you *really* care
about security, you don't use database passwords, but external
authentication with a centralized identity management system.

So I think it is fine to extend "passwordcheck", but we shouldn't
take it serious enough to reduce security elsewhere in order to
improve the module.

Yours,
Laurenz Albe



Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-26 Thread Michael Paquier
On Tue, Sep 26, 2017 at 4:22 PM, Amit Langote
 wrote:
>> Except that small thing, the patches do their duty.
>
> Thanks, revised patches attached.

Cool, let's switch it back to a ready for committer status then.
-- 
Michael




Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Alvaro Hernandez



On 26/09/17 10:03, Craig Ringer wrote:
On 26 September 2017 at 14:08, Alvaro Hernandez wrote:




    OK, let me try to do that. I believe data integration is a
priority.


Definitely agree so far.

- If you want to develop your own output plugin, then your market
is reduced as you have to exclude all the managed solutions
(until, and only if, you would convince them to include your
plugin... highly unlikely, very difficult). And probably another %
of enterprise environments which will hesitate to link your own
plugin to the running production database. Last but not least, you
need to compile and test (and testing this is far from easy) on
multiple OSes/architectures.


Right. You probably don't want one output plugin per application.

This doesn't mean it has to be in-core, just flexible and share-able 
by many, and adopted by cloud vendors. Like say PostGIS.


    That would be nice. But this is chicken-and-egg: an out-of-core 
plugin probably won't be used by many if applications like the ones I 
was mentioning do not exist. And developing such an application is much 
less interesting if a significant part of your market is excluded from 
your app.


    However, it could work the other way around: a sufficiently good 
in-core base plugin could foster applications and an ecosystem, which, 
once adopted by users, could push much more easily for other, more 
advanced out-of-core plugins; those would be accepted more readily, 
since the tools built on them would already have significant traction. 
But I don't see it working the other way around.




- If you stick to in-core plugins, then you need to support at
least three different output formats if you want to support 9.4+:
test_decoding (and pray it works!), pgoutput, and the "new"
in-core plugin that was proposed at the beginning of this thread,
if that would see the light.


The only practical way will IMO be to have whatever new plugin it also 
have an out-of-core version maintained for older Pg versions, where it 
can be installed.


But only in-core plugins help for general-purpose solutions.


I still don't agree there. If there's enough need/interest/adoption 
you can get cloud vendors on board, they'll feel the customer 
pressure. It's not our job to create that pressure and do their work 
for them.


    Don't want to get into a loop, but as I said before it's 
chicken-and-egg. But nobody is asking core to do their work. As much as 
I love it, I think logical decoding is a bit half-baked until there is a 
single, quality, in-core plugin, as it discourages its usage, because of 
the reasons I stated.





I see nothing wrong with a plugin starting off out of core and being 
adopted+adapted later, assuming it's done well.


That said, I'm all in favour of a generic json output plugin that 
shares infrastructure with logical replication, so people who are on 
inflexible environments have a fallback option. I just don't care to 
write it.


    That's better than nothing. But as interoperable as json may be, 
people still need to talk the (binary) replication protocol to use it. 
So once you talk the binary protocol, why not talk binary also for the 
output plugin and have a much more efficient output? Again, nothing 
against json, but if a new plugin were included in-core, I'd say json 
plus binary. Or just document pgoutput, as it could be good enough.


    Cheers,

    Álvaro


--

Alvaro Hernandez


---
OnGres



Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-26 Thread Amit Langote
On 2017/09/26 12:17, Michael Paquier wrote:
> On Mon, Sep 25, 2017 at 3:48 PM, Amit Langote wrote:
>> So, ISTM, comments that the patches add should all say that setting the
>> meta pages' pd_lower to the correct value helps to pass those pages to
>> xlog.c as compressible standard layout pages, regardless of whether they
>> are actually passed that way.  Even if the patches do take care of the
>> latter as well.
>>
>> Did I miss something?
> 
> Not that I think of.

Thanks.

> Buffer  metabuffer;
> +   Pagemetapage;
> SpGistMetaPageData *metadata;
> 
> metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
> +   metapage = BufferGetPage(metabuffer);
> No need to define metapage here and to call BufferGetPage() as long as
> the lock on the buffer is not taken.

Ah, okay.

Moved those additions inside the if (ConditionalLockBuffer(metabuffer)) block.

> Except that small thing, the patches do their duty.

Thanks, revised patches attached.

Regards,
Amit
From e82e787bc9533977e73456b0782ed78354637bda Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 23 Jun 2017 11:20:41 +0900
Subject: [PATCH 1/5] Set pd_lower correctly in the GIN metapage.

Also tell xlog.c to treat the metapage like a standard page, so any
hole in it is compressed.
---
 src/backend/access/gin/ginfast.c   | 20 ++--
 src/backend/access/gin/gininsert.c |  4 ++--
 src/backend/access/gin/ginutil.c   | 17 -
 src/backend/access/gin/ginxlog.c   | 25 ++---
 4 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 59e435465a..359de85dfc 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -399,6 +399,14 @@ ginHeapTupleFastInsert(GinState *ginstate, 
GinTupleCollector *collector)
/*
 * Write metabuffer, make xlog entry
 */
+
+   /*
+* Set pd_lower just past the end of the metadata.  This is essential,
+* because without doing so, metadata will be lost if xlog.c compresses
+* the page.
+*/
+   ((PageHeader) metapage)->pd_lower =
+   ((char *) metadata + sizeof(GinMetaPageData)) - (char 
*) metapage;
MarkBufferDirty(metabuffer);
 
if (needWal)
@@ -407,7 +415,7 @@ ginHeapTupleFastInsert(GinState *ginstate, 
GinTupleCollector *collector)
 
memcpy(, metadata, sizeof(GinMetaPageData));
 
-   XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT);
+   XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT | 
REGBUF_STANDARD);
XLogRegisterData((char *) , sizeof(ginxlogUpdateMeta));
 
recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_UPDATE_META_PAGE);
@@ -572,6 +580,13 @@ shiftList(Relation index, Buffer metabuffer, BlockNumber 
newHead,
metadata->nPendingHeapTuples = 0;
}
 
+   /*
+* Set pd_lower just past the end of the metadata.  This is 
essential,
+* because without doing so, metadata will be lost if xlog.c 
compresses
+* the page.
+*/
+   ((PageHeader) metapage)->pd_lower =
+   ((char *) metadata + sizeof(GinMetaPageData)) - (char 
*) metapage;
MarkBufferDirty(metabuffer);
 
for (i = 0; i < data.ndeleted; i++)
@@ -586,7 +601,8 @@ shiftList(Relation index, Buffer metabuffer, BlockNumber 
newHead,
XLogRecPtr  recptr;
 
XLogBeginInsert();
-   XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT);
+   XLogRegisterBuffer(0, metabuffer,
+  REGBUF_WILL_INIT | 
REGBUF_STANDARD);
for (i = 0; i < data.ndeleted; i++)
XLogRegisterBuffer(i + 1, buffers[i], 
REGBUF_WILL_INIT);
 
diff --git a/src/backend/access/gin/gininsert.c 
b/src/backend/access/gin/gininsert.c
index 5378011f50..c9aa4ee147 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -348,7 +348,7 @@ ginbuild(Relation heap, Relation index, IndexInfo 
*indexInfo)
Pagepage;
 
XLogBeginInsert();
-   XLogRegisterBuffer(0, MetaBuffer, REGBUF_WILL_INIT);
+   XLogRegisterBuffer(0, MetaBuffer, REGBUF_WILL_INIT | 
REGBUF_STANDARD);
XLogRegisterBuffer(1, RootBuffer, REGBUF_WILL_INIT);
 
recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_CREATE_INDEX);
@@ -447,7 +447,7 @@ ginbuildempty(Relation index)
START_CRIT_SECTION();
GinInitMetabuffer(MetaBuffer);
MarkBufferDirty(MetaBuffer);
-   log_newpage_buffer(MetaBuffer, false);
+   log_newpage_buffer(MetaBuffer, true);
GinInitBuffer(RootBuffer, 

Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-26 Thread Amit Langote
On 2017/09/26 11:34, Amit Kapila wrote:
> On Mon, Sep 25, 2017 at 12:18 PM, Amit Langote wrote:
>> So, ISTM, comments that the patches add should all say that setting the
>> meta pages' pd_lower to the correct value helps to pass those pages to
>> xlog.c as compressible standard layout pages, regardless of whether they
>> are actually passed that way.  Even if the patches do take care of the
>> latter as well.
>>
>> Did I miss something?
>>
>> Looking at Amit K's updated patches for btree and hash, it seems that he
>> updated the comment to say that setting pd_lower to the correct value is
>> *essential*, because those pages are passed as REGBUF_STANDARD pages and
>> hence will be compressed.  That seems to contradict what I wrote above,
>> but I think that's not wrong, too.
>>
> 
> I think you are missing that there are many cases where we use only
> REGBUF_STANDARD for meta-pages (cf. hash index).  For btree, where the
> advantage is in fewer cases, I have explicitly stated those cases.

I do see that there are some places that use only REGBUF_STANDARD.  I also
understand that specifying this flag is a necessary condition for
XLogRecordAssemble() to perform the hole compression, if it is to be
performed at all.  ISTM, the hole is compressed only if we write the FP
image.  However, reasons for why FP image needs to be written, AFAICS, are
independent of whether the hole is (should be) compressed or not.  Whether
compression should occur or not depends on whether the REGBUF_STANDARD
flag is passed, that is, whether a caller is sure that the page is of
standard layout.  The last part is only true if page's pd_lower is always
set correctly.
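
To make that concrete, here is a minimal sketch of the interaction I mean
(mine, not code taken from any of the posted patches; it assumes the
metapage / metabuffer variables used in the GIN patch upthread):

    /*
     * Sketch only: register a GIN metapage as a standard-layout page.
     * With REGBUF_STANDARD, XLogRecordAssemble() may omit the "hole"
     * between pd_lower and pd_upper from the full-page image, so the
     * metadata must sit below pd_lower to survive that compression.
     */
    GinMetaPageData *metadata = GinPageGetMeta(metapage);

    ((PageHeader) metapage)->pd_lower =
        ((char *) metadata + sizeof(GinMetaPageData)) - (char *) metapage;
    MarkBufferDirty(metabuffer);

    XLogBeginInsert();
    XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT | REGBUF_STANDARD);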

Perhaps we should focus on writing exactly why setting pd_lower to the
correct value is essential.  I think that's a good change, so I adopted it
from your patch.  The *only* reason it's essential to set pd_lower to the
correct value, as I see it, is that xloginsert.c *might* compress the hole
in the page, and its pd_lower had better be correct if we want compression
to preserve the metadata content (in meta pages).  OTOH, it's not clear to
me why that comment should also explain *why* the compression would occur
or why the FP image would be written in the first place.  Whether or not
an FP image is written has nothing to do with whether we should set
pd_lower correctly, does it?  Also, since we want to repeat the same
comment in multiple places where we now set pd_lower (the gin, brin, and
spgist patches have many more new places where a meta page's pd_lower is
set), keeping it to the point would be good.

But as you said a few times, we should leave it to the committer to decide
the exact text of these comments, which I think is fine.

> Do you still have confusion?

This is an area of PG code I don't have much experience with and is also a
bit complex, so sorry if I'm saying things that are not quite right.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Craig Ringer
On 26 September 2017 at 14:08, Alvaro Hernandez  wrote:


>
>>
> OK, let me try to do that. I believe data integration is a priority.


Definitely agree so far.


> - If you want to develop your own output plugin, then your market is
> reduced as you have to exclude all the managed solutions (until, and only
> if, you would convince them to include your plugin... highly unlikely, very
> difficult). And probably another % of enterprise environments which will
> hesitate to link your own plugin to the running production database. Last
> but not least, you need to compile and test (and testing this is far from
> easy) on multiple OSes/architectures.
>

Right. You probably don't want one output plugin per application.

This doesn't mean it has to be in-core, just flexible and share-able by
many, and adopted by cloud vendors. Like say PostGIS.


>
> - If you stick to in-core plugins, then you need to support at least three
> different output formats if you want to support 9.4+: test_decoding (and
> pray it works!), pgoutput, and the "new" in-core plugin that was proposed
> at the beginning of this thread, if that would see the light.
>

The only practical way, IMO, is for whatever new plugin emerges to also
have an out-of-core version maintained for older Pg versions, where it can
be installed.


> But only in-core plugins help for general-purpose solutions.


I still don't agree there. If there's enough need/interest/adoption you can
get cloud vendors on board; they'll feel the customer pressure. It's not
our job to create that pressure and do their work for them.

I see nothing wrong with a plugin starting off out of core and being
adopted+adapted later, assuming it's done well.

That said, I'm all in favour of a generic json output plugin that shares
infrastructure with logical replication, so people in inflexible
environments have a fallback option. I just don't care to write it.
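
For reference, the skeleton such a plugin needs against the existing output
plugin API is small.  A rough sketch (the my_* names are invented here;
error handling and the optional callbacks are omitted):

#include "postgres.h"

#include "lib/stringinfo.h"
#include "replication/logical.h"
#include "replication/output_plugin.h"
#include "utils/rel.h"

PG_MODULE_MAGIC;

extern void _PG_output_plugin_init(OutputPluginCallbacks *cb);

static void
my_begin(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
{
    /* nothing to emit at transaction start in this sketch */
}

static void
my_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
          Relation relation, ReorderBufferChange *change)
{
    /*
     * Emit one trivial JSON object per change; a real plugin would
     * serialize the tuple contents here.
     */
    OutputPluginPrepareWrite(ctx, true);
    appendStringInfo(ctx->out, "{\"table\": \"%s\"}",
                     RelationGetRelationName(relation));
    OutputPluginWrite(ctx, true);
}

static void
my_commit(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
          XLogRecPtr commit_lsn)
{
    /* nothing extra to emit at commit in this sketch */
}

void
_PG_output_plugin_init(OutputPluginCallbacks *cb)
{
    cb->begin_cb = my_begin;
    cb->change_cb = my_change;
    cb->commit_cb = my_commit;
}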

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] coverage analysis improvements

2017-09-26 Thread Michael Paquier
On Fri, Sep 22, 2017 at 11:34 PM, Peter Eisentraut
 wrote:
> Apparently, rmgr.c doesn't contain any instrumentable code.  I don't see
> this warning, but it might depend on tool versions and compiler options.

Even on HEAD I am seeing the same problem; it is just not obvious because
the quiet mode of lcov is not used, but the warnings are there, as in this
report:
https://www.postgresql.org/message-id/ec92eb95-26de-6da8-9862-ded3ff678c5c%40BlueTreble.com
So let me withdraw my concerns about this portion of the patch.

Except for the plperl patch, I don't have more comments to offer about
this patch set. It would be nice to make configure a bit smarter for
lcov and gcov detection by not hard-failing if gcov can be found but
not lcov. It is after all possible to run coverage without lcov, like
on ArchLinux. Still that's a different request than what this patch
set is doing, so this is not a blocker for this patch set.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] visual studio 2017 build support

2017-09-26 Thread Haribabu Kommi
On Mon, Sep 25, 2017 at 10:12 PM, Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

>
> On 09/25/2017 12:25 AM, Haribabu Kommi wrote:
> >
> > Thanks for pointing it out, I missed to check the Build tools support
> > section.
> > Here I attached the updated patch with the change in documentation to
> > include the 2008 R2 SP1 operating system also.
> >
>
> Thanks, committed and backpatched to 9.6.  It would be nice to get
> buildfarm members going that support VS2015 and VS2017.


Thanks.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] SERIALIZABLE with parallel query

2017-09-26 Thread Haribabu Kommi
On Mon, Sep 25, 2017 at 6:57 PM, Thomas Munro  wrote:

> On Mon, Sep 25, 2017 at 8:37 PM, Haribabu Kommi
>  wrote:
> > After I tune the GUC to go with a sequential scan, I still do not get
> > the error in session 2 for the update operation the way it used to for
> > a parallel sequential scan, and it even takes many commands until S1
> > commits.
>
> Hmm.  Then this requires more explanation because I don't expect a
> difference.  I did some digging and realised that the error detail
> message "Reason code: Canceled on identification as a pivot, during
> write." was reached in a code path that requires
> SxactIsPrepared(writer) and also MySerializableXact == writer, which
> means that the process believes it is committing.  Clearly something
> is wrong.  After some more digging I realised that
> ParallelWorkerMain() calls EndParallelWorkerTransaction() which calls
> CommitTransaction() which calls
> PreCommit_CheckForSerializationFailure().  Since the worker is
> connected to the leader's SERIALIZABLEXACT, that finishes up being
> marked as preparing to commit (not true!), and then the leader gets
> confused during that write, causing a serialization failure to be
> raised sooner (though I can't explain why it should be raised then
> anyway, but that's another topic).  Oops.  I think the fix here is
> just not to do that in a worker (the worker's CommitTransaction()
> doesn't really mean what it says).
>
> Here's a version with a change that makes that conditional.  This way
> your test case behaves the same as non-parallel mode.
>
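
For readers following along, the conditional change described above might
look roughly like the following (my own sketch, not the actual patch;
IsParallelWorker() and PreCommit_CheckForSerializationFailure() come from
access/parallel.h and storage/predicate.h):

    /*
     * Sketch only: a parallel worker shares the leader's SERIALIZABLEXACT
     * and its CommitTransaction() is not a real top-level commit, so skip
     * the pre-commit serialization check there; the leader performs the
     * real check when it actually commits.
     */
    if (!IsParallelWorker())
        PreCommit_CheckForSerializationFailure();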

With the latest patch, I didn't find any problems.



> > I will continue my review on the latest patch and share any updates.
>
> Thanks!


The patch looks good, and I don't have any comments for the code.
The test that the patch adds does not generate a true parallelism
scenario; I feel it would be better to change it to a test that generates
a parallel sequential/index/bitmap scan.
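
Something along these lines is what I have in mind to push the plan toward
a parallel sequential scan in the test (just a sketch; the table name is a
placeholder):

    -- make a parallel plan as cheap as possible
    SET parallel_setup_cost = 0;
    SET parallel_tuple_cost = 0;
    SET min_parallel_table_scan_size = 0;
    SET max_parallel_workers_per_gather = 2;

    -- expect a Gather node over a Parallel Seq Scan
    EXPLAIN (COSTS OFF) SELECT count(*) FROM test_table;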

Regards,
Hari Babu
Fujitsu Australia

