Re: [HACKERS] parallelize queries containing initplans

2017-10-12 Thread Amit Kapila
On Wed, Oct 11, 2017 at 9:24 PM, Robert Haas  wrote:
> On Mon, Oct 9, 2017 at 5:56 AM, Amit Kapila  wrote:
>> How about always returning false for PARAM_EXTERN?
>
> Yeah, I think that's what we should do.  Let's do that first as a
> separate patch, which we might even want to back-patch, then return to
> this.
>

Sure, I have started a new thread, as this fix led to some other
failures.  I hope that is okay.

https://www.postgresql.org/message-id/CAA4eK1%2B_BuZrmVCeua5Eqnm4Co9DAXdM5HPAOE2J19ePbR912Q%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




[HACKERS] Parallel safety for extern params

2017-10-12 Thread Amit Kapila
As discussed in a nearby thread [1] (parallelize queries containing
initplans), it appears that there are cases where queries referring to
PARAM_EXTERN params are treated as parallel-restricted even though
they should be parallel-safe.  I have done some further investigation
and found that in most cases they are indeed treated as
parallel-restricted (due to tests in max_parallel_hazard_walker).  The
only cases where such queries are treated as parallel-safe are when
eval_const_expressions_mutator converts the param to a Const.  So
whenever the generic plan is selected, parallelism won't be used.  One
simple example is:

create table t1(c1 int, c2 char(500));
insert into t1 values(generate_series(1,1),'aaa');
analyze t1;

set force_parallel_mode = on;
regression=# prepare t1_select(int) as Select c1 from t1 where c1 < $1;
PREPARE
regression=# explain (costs off, analyze) execute t1_select(1);
QUERY PLAN
---
 Gather (actual time=35.252..44.727 rows= loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Seq Scan on t1 (actual time=0.364..5.638 rows= loops=1)
 Filter: (c1 < 1)
 Rows Removed by Filter: 1
 Planning time: 2135.514 ms
 Execution time: 52.886 ms
(9 rows)

The next four executions will give the same plan; however, the sixth
execution will give the plan below:
regression=# explain (costs off, analyze) execute t1_select(10005);
  QUERY PLAN
--
 Seq Scan on t1 (actual time=0.049..6.188 rows=1 loops=1)
   Filter: (c1 < $1)
 Planning time: 22577.404 ms
 Execution time: 7.612 ms
(4 rows)

Attached patch fix_parallel_safety_for_extern_params_v1.patch fixes
this problem.  Note that, for now, I have added a very simple test to
the regression tests to cover the prepared statement case, but we
might want to add something related to generic plan selection as I
have shown above.  I am just not sure if it is a good idea to have
multiple executions just to get the generic plan.
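
(For reference, the multiple-execution sequence such a test would need
looks roughly like the following; the plancache only starts considering
a generic plan from the sixth execution of a prepared statement:

    prepare t1_select(int) as select c1 from t1 where c1 < $1;
    execute t1_select(1);   -- executions 1..5 get custom plans
    execute t1_select(1);
    execute t1_select(1);
    execute t1_select(1);
    execute t1_select(1);
    explain (costs off) execute t1_select(1);  -- sixth: may go generic

Encoding that heuristic in the tests is exactly what I'm hesitant
about.)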

After fixing this problem, when I ran the regression tests with
force_parallel_mode = regress, I saw multiple other failures.  All the
failures boil down to two kinds of cases:

1. There was an assumption while extracting simple expressions that
the target list of a Gather node can contain only Consts or Vars.  Now
that the above patch allows extern params as parallel-safe, that
assumption no longer holds true.  We need to handle Params as well.
Attached patch fix_simple_expr_interaction_gather_v1.patch handles
that case.
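
(The shape of the fix, as a sketch rather than the exact patch text:
wherever the simple-expression check walks the Gather node's target
list, it must now accept Params in addition to Consts and Vars, e.g.

    /* a targetlist entry under Gather may be a Const, Var, or Param */
    if (!(IsA(tle->expr, Const) ||
          IsA(tle->expr, Var) ||
          IsA(tle->expr, Param)))
        return false;       /* not a simple expression */

The attached patch has the real change.)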

2. We don't allow execution to use parallelism if the plan can be
executed multiple times.  This has been enforced in ExecutePlan, but
it seems we fail to handle the case where we are already in parallel
mode by the time that condition is enforced.  As a result, parallelism
is used when it shouldn't be (when the same plan is executed multiple
times), leading to a crash.  One way to fix this is to temporarily
exit parallel mode in such cases and re-enter it in the same state
once the current execution is over.  Attached patch
fix_parallel_mode_nested_execution_v1.patch fixes this problem.
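
(A minimal sketch of that idea, using the existing entry points in
access/transam/xact.c; the nested-execution call below is just a
placeholder, not a real function:

    bool    save_parallel_mode = IsInParallelMode();

    if (save_parallel_mode)
        ExitParallelMode();     /* the nested run must not use workers */

    execute_nested_plan(queryDesc);     /* placeholder */

    if (save_parallel_mode)
        EnterParallelMode();    /* restore the caller's state */

The actual patch is necessarily more careful than this.)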

Thanks to Kuntal, who helped investigate the regression failures that
resulted from making PARAM_EXTERN params parallel-safe.



[1] - 
https://www.postgresql.org/message-id/CA%2BTgmobSN66o2_c76rY12JvS_sZa17zpKvpuyG-QzRvVLYE8Vg%40mail.gmail.com
-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_parallel_safety_for_extern_params_v1.patch
Description: Binary data


fix_simple_expr_interaction_gather_v1.patch
Description: Binary data


fix_parallel_mode_nested_execution_v1.patch
Description: Binary data



Re: [HACKERS] Still another race condition in recovery TAP tests

2017-10-12 Thread Noah Misch
On Fri, Oct 06, 2017 at 05:57:24PM +0800, Craig Ringer wrote:
> On 6 October 2017 at 14:03, Noah Misch  wrote:
> > On Fri, Sep 08, 2017 at 10:32:03PM -0400, Tom Lane wrote:
> >> (I do kinda wonder why we rolled our own RecursiveCopy; surely there's
> >> a better implementation in CPAN?)
> >
> > Fewer people will test as we grow the list of modules they must first 
> > install.
> 
> Meh, I don't buy that. At worst, all we have to do is provide a script
> that fetches them, from distro repos if possible, and failing that
> from CPAN.
> 
> With cpanminus, that's pretty darn simple too.

If the tree had such a script and it were reliable, then yes, it would matter
little whether the script procured one module or five.




Re: [HACKERS] BLK_DONE state in XLogReadBufferForRedoExtended

2017-10-12 Thread Michael Paquier
On Thu, Oct 12, 2017 at 10:47 PM, Amit Kapila  wrote:
> Today, I was trying to think about cases when we can return BLK_DONE
> in XLogReadBufferForRedoExtended.  One thing that occurred to me is
> that it can happen during the replay of WAL if full_page_writes is
> off.  Basically, when full_page_writes is on, crash recovery will
> always first restore the full-page image of a page and then apply the
> WAL corresponding to that page, so we will never hit the case where
> the LSN of the page is greater than the LSN of the WAL record.
>
> Are there other cases for which we can get BLK_DONE state?  Is it
> mentioned somewhere in code/comments which I am missing?

Remember the thread about the meta page optimization... Some index AMs
use XLogInitBufferForRedo() to redo their meta pages, and those don't
have an FPW, so during crash recovery you may end up not replaying a
meta page if the LSN in its page header is newer than what's being
replayed.  That's another case where BLK_DONE can be reached, even if
full_page_writes is on.
-- 
Michael




Re: [HACKERS] Log LDAP "diagnostic messages"?

2017-10-12 Thread Thomas Munro
On Fri, Oct 13, 2017 at 3:59 PM, Peter Eisentraut
 wrote:
> On 9/24/17 07:00, Thomas Munro wrote:
>> Fair point.  In that case there are a few others we should consider
>> moving down too for consistency, like in the attached.
>
>> Thanks, that is much tidier.  Done that way in the attached.
>>
>> Here also is a small addition to your TAP test which exercises the
>> non-NULL code path because slapd rejects TLS by default with a
>> diagnostic message.  I'm not sure if this is worth adding, since it
>> doesn't actually verify that the code path is reached (though you can
>> see that it is from the logs).
>
> Committed.

Thanks, and thanks also for follow-up commit 7d1b8e75.  Looks like the
new macro arrived in OpenLDAP 2.4 but RHEL5 shipped with 2.3.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] oversight in EphemeralNamedRelation support

2017-10-12 Thread Tom Lane
Thomas Munro  writes:
> On Fri, Oct 13, 2017 at 12:46 PM, Tom Lane  wrote:
>> Yeah, I agree --- personally I'd never write a query like that.  But
>> the fact that somebody ran into it when v10 has been out for barely
>> a week suggests that people are doing it.

> Not exactly -- Julien's bug report was about a *qualified* reference
> being incorrectly rejected.

Nonetheless, he was using a CTE name equivalent to the name of the
query's target table.  That's already confusing IMV ... and it does
not seem unreasonable to guess that he only qualified the target
because it stopped working unqualified.

regards, tom lane




Re: [HACKERS] oversight in EphemeralNamedRelation support

2017-10-12 Thread Thomas Munro
On Fri, Oct 13, 2017 at 12:46 PM, Tom Lane  wrote:
> Thomas Munro  writes:
>> On Fri, Oct 13, 2017 at 10:01 AM, Tom Lane  wrote:
>>> The CTE was simply not part of the available namespace for the INSERT's
>>> target, so it found the regular table instead.  v10 has thus broken
>>> cases that used to work.  I think that's a bug.
>
>> Hmm.  Yeah.  I have to say it's a bit surprising that the following
>> refers to two different objects:
>>   with mytable as (select 1 as x) insert into mytable select * from mytable
>
> Yeah, I agree --- personally I'd never write a query like that.  But
> the fact that somebody ran into it when v10 has been out for barely
> a week suggests that people are doing it.

Not exactly -- Julien's bug report was about a *qualified* reference
being incorrectly rejected.

>>> I think we need to either remove that call from setTargetTable entirely,
>>> or maybe adjust it so it can only find ENRs not CTEs.
>
>> I think it'd be better to find and reject ENRs only.  The alternative
>> would be to make ENRs invisible to DML, which seems arbitrary and
>> weird (even though it might be more consistent with our traditional
>> treatment of CTEs).  One handwavy reason I'd like them to remain
>> visible to DML (and be rejected) is that I think they're a bit like
>> table variables (see SQL Server), and someone might eventually want to
>> teach them, or something like them, how to be writable.
>
> Yeah, that would be the argument for making them visible.  I'm not
> sure how likely it is that we'll ever actually have writable ENRs,
> but I won't stand in the way if you want to do it like that.

I hope so :-)  It might be a nice way to get cheap, locally scoped
temporary tables, among other things.

Okay, here's Julien's patch tweaked to reject just the ENR case.  This
takes us back to the 9.6 behaviour where CTEs don't hide tables in
this context.  I also removed the schema qualification in his
regression test so we don't break that again.  This way, his query
from the first message in the thread works with the schema
qualification (the bug he reported) and without it (the bug or at
least incompatible change from 9.6 you discovered).

I considered testing for a NULL return from parserOpenTable() instead
of the way the patch has it, since parserOpenTable() already had an
explicit test for ENRs, but its coding would give preference to an
unqualified table of the same name.  I considered moving the test for
an ENR match higher up in parserOpenTable(), and that might have been
OK, but then I realised no code in the tree actually tests for its
undocumented NULL return value anyway.  I think that NULL-returning
branch is dead code, and all tests pass without it.  Shouldn't we just
remove it, as in the attached?

I renamed the ENR used in plpgsql.sql's
transition_table_level2_bad_usage_func() and with.sql's "sane
response" test, because they both confused matters by using an ENR
with the name "d" which is also the name of an existing table.  For
example, if you start with unpatched master, rename
transition_table_level2_bad_usage_func()'s ENR to "dx" and simply
remove the check for ENRs from setTargetTable() as you originally
suggested, you'll get a segfault because the NULL return from
parserOpenTable() wasn't checked.  If you leave
transition_table_level2_bad_usage_func()'s ENR name as "d" it'll
quietly access the wrong table instead, which is misleading.

-- 
Thomas Munro
http://www.enterprisedb.com
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 9ff80b8b40..2d56a07774 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -184,9 +184,12 @@ setTargetTable(ParseState *pstate, RangeVar *relation,
 	RangeTblEntry *rte;
 	int			rtindex;
 
-	/* So far special relations are immutable; so they cannot be targets. */
-	rte = getRTEForSpecialRelationTypes(pstate, relation);
-	if (rte != NULL)
+	/*
+	 * ENRs hide tables of the same name, so we need to check for them first.
+	 * In contrast, CTEs don't hide tables.
+	 */
+	if (relation->schemaname == NULL &&
+		scanNameSpaceForENR(pstate, relation->relname))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("relation \"%s\" cannot be the target of a modifying statement",
@@ -1072,6 +1075,12 @@ transformRangeTableSample(ParseState *pstate, RangeTableSample *rts)
 }
 
 
+/*
+ * getRTEForSpecialRelationTypes
+ *
+ * If the given RangeVar is a CTE reference or an EphemeralNamedRelation,
+ * return the transformed RangeVar, otherwise return NULL.
+ */
 static RangeTblEntry *
 getRTEForSpecialRelationTypes(ParseState *pstate, RangeVar *rv)
 {
@@ -1079,6 +1088,13 @@ getRTEForSpecialRelationTypes(ParseState *pstate, RangeVar *rv)
 	Index

Re: [HACKERS] Log LDAP "diagnostic messages"?

2017-10-12 Thread Peter Eisentraut
On 9/24/17 07:00, Thomas Munro wrote:
> Fair point.  In that case there are a few others we should consider
> moving down too for consistency, like in the attached.

> Thanks, that is much tidier.  Done that way in the attached.
> 
> Here also is a small addition to your TAP test which exercises the
> non-NULL code path because slapd rejects TLS by default with a
> diagnostic message.  I'm not sure if this is worth adding, since it
> doesn't actually verify that the code path is reached (though you can
> see that it is from the logs).

Committed.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] advanced partition matching algorithm for partition-wise join

2017-10-12 Thread Ashutosh Bapat
On Thu, Oct 12, 2017 at 9:46 PM, Robert Haas  wrote:
> On Wed, Oct 11, 2017 at 7:08 AM, Ashutosh Bapat
>  wrote:
>> Here's updated patch set based on the basic partition-wise join
>> committed. The patchset applies on top of the patch to optimize the
>> case of dummy partitioned tables [1].
>>
>> Right now, the advanced partition matching algorithm bails out when
>> either of the joining relations has a default partition.
>
> So is that something you are going to fix?
>

Yes, if time permits.  I had left the patch unattended while basic
partition-wise join was getting committed.  Now that it's committed, I
have rebased it.  It still has TODOs, and some work is required to
improve it.  But for the patch to be really complete, we have to deal
with the problem of missing partitions described before.  I am fine
with collaborating if someone else wants to pick it up.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Slow synchronous logical replication

2017-10-12 Thread Craig Ringer
On 12 October 2017 at 16:09, Konstantin Knizhnik
 wrote:
>

> Is the CREATE TABLE and INSERT done in the same transaction?
>
> No. The table was created in a separate transaction.
> Moreover, the same effect will take place if the table is created
> before the start of replication.
> The problem in this case seems to be caused by spilling the decoded
> transaction to a file by ReorderBufferSerializeTXN.

Yeah. That's known to perform sub-optimally, and it also uses way more
memory than it should.

Your design compounds that by spilling transactions it will then
discard, and doing so multiple times.

To make your design viable you likely need some kind of cache of
serialized reorder buffer transactions, where you don't rebuild one if
it's already been generated. And likely a fair bit of optimisation on
the serialisation.

Or you might want a table- and even a row-filter that can be run
during decoding, before appending to the ReorderBuffer, to let you
skip changes early. Right now this can only be done at the transaction
level, based on replication origin. Of course, if you do this you
can't do the caching thing.
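
For context, the only early filtering hook that exists today is the
origin filter in output_plugin.h; a row- or table-level analogue would
have to be something new, along these lines (the second typedef is
invented, purely illustrative):

    /* exists today: skip whole transactions by replication origin */
    typedef bool (*LogicalDecodeFilterByOriginCB) (struct LogicalDecodingContext *ctx,
                                                   RepOriginId origin_id);

    /* hypothetical: skip individual changes before they are buffered */
    typedef bool (*LogicalDecodeFilterChangeCB) (struct LogicalDecodingContext *ctx,
                                                 Oid relid,
                                                 ReorderBufferChange *change);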

> Unfortunately it is not quite clear how to make the walsender smarter
> and let it skip transactions not affecting its publication.

You'd need more hooks to be implemented by the output plugin.

> I'm really not sure that it is possible to skip over WAL. But the
> particular problem with invalidation records etc. can be solved by
> always having the WAL sender process those records.
> I.e. if a backend is inserting an invalidation record or some other
> record which should always be processed by the WAL sender, it can
> always promote the LSN of this record to the WAL sender.
> So the WAL sender will skip only those WAL records which are safe to
> skip (insert/update/delete records not affecting this publication).

That sounds like a giant layering violation too.

I suggest focusing on reducing the amount of work done when reading
WAL, not trying to jump over whole ranges of WAL.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-10-12 Thread Ashutosh Bapat
On Thu, Oct 12, 2017 at 10:49 PM, Robert Haas  wrote:
> On Wed, Oct 11, 2017 at 10:43 PM, Ashutosh Bapat
>  wrote:
>> You are suggesting that a dummy partitioned table be treated as an
>> un-partitioned table and the above-suggested optimization applied. A
>> join between a partitioned and an unpartitioned table is partitioned
>> by the keys of only the partitioned table. An unpartitioned table
>> doesn't have any keys, so this is fine. But a dummy partitioned table
>> does have keys. Recording them as keys of the join relation helps
>> when it joins to other relations. Furthermore, a join between a
>> partitioned and an unpartitioned table doesn't require any equi-join
>> condition on the partition keys of the partitioned table, but a join
>> between partitioned tables is considered to be partitioned by keys on
>> both sides only when there is an equi-join. So, when implementing a
>> partitioned join between a partitioned and an unpartitioned table, we
>> will have to make a special case to record partition keys when the
>> unpartitioned side is actually a dummy partitioned table. That might
>> be awkward.
>
> It seems to me that what we really need here is to move all of this
> stuff into a separate struct:
>
> /* used for partitioned relations */
> PartitionScheme part_scheme;/* Partitioning scheme. */
> int nparts; /* number of
> partitions */
> struct PartitionBoundInfoData *boundinfo;   /* Partition bounds */
> struct RelOptInfo **part_rels;  /* Array of RelOptInfos of partitions,
>
>   * stored in the same order of bounds */
> List  **partexprs;  /* Non-nullable partition key
> expressions. */
> List  **nullable_partexprs; /* Nullable partition key
> expressions. */
>

In a very early patch I had PartitionOptInfo to hold all of this.
RelOptInfo then had a pointer to a PartitionOptInfo, if it was
partitioned.  When a relation can be partitioned in multiple ways,
like what you describe, or because a join by re-partitioning is
efficient, RelOptInfo would have a list of those.  But the
representation needs to be thought through.  I am wondering whether
this should be modelled like IndexOptInfo.  I am not sure.  This is a
topic for a much larger discussion.

I think we are digressing.  We were discussing my patch to handle a
dummy partitioned relation, whose children are not marked dummy and do
not have pathlists set.  Do you still think that we should leave that
aside?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] replace GrantObjectType with ObjectType

2017-10-12 Thread Stephen Frost
Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> It seems to me that having ACL_OBJECT_* symbols alongside OBJECT_*
> symbols is not useful and leads to duplication.  Digging around in the
> past suggests that we used to have a lot of these command-specific
> symbols but got rid of them over time, except that the ACL stuff was
> never touched.  The attached patch accomplishes that.

I'm generally supportive of this, but I'm not entirely thrilled with how
this ends up conflating TABLEs and RELATIONs.  From the GRANT
perspective, there's no distinction, and that was clear from the
language used and how things were handled, but the OBJECT enum has that
distinction.  This change just makes VIEWs be OBJECT_TABLE even though
they actually aren't tables and there even is an OBJECT_VIEW value.
This commit may be able to grok that and manage it properly, but later
hackers might miss that.

I would also suggest that the naming be consistent with the other bits
of the GRANT system (eg: ACL_ALL_RIGHTS_NAMESPACE would be changed to
ACL_ALL_RIGHTS_SCHEMA, to match OBJECT_SCHEMA).

I also echo the concern raised by Alvaro.

Thanks!

Stephen




Re: [HACKERS] Discussion on missing optimizations

2017-10-12 Thread Stephen Frost
Laurenz,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> Robert Haas wrote:
> > One trick that some system use is avoid replanning as much as we do
> > by, for example, saving plans in a shared cache and reusing them even
> > in other sessions.  That's hard to do in our architecture because the
> > controlling GUCs can be different in every session and there's not
> > even any explicit labeling of which GUCs control planner behavior. But
> > if you had it, then extra planning cycles would be, perhaps, more
> > tolerable.
> 
> From my experience with Oracle I would say that that is a can of worms.
> 
> Perhaps it really brings the performance benefits they claim, but
> a) there have been a number of bugs where the wrong plan got used
>(you have to keep several plans for the same statement around,
>since - as you say - different sessions have different environments)

I'm not sure this is really a fair strike against this concept- bugs
happen, even bugs in planning, and what we need is more testing, imv, to
reduce the number and minimize the risk as much as we can.

> b) it is a frequent problem that this shared memory area grows
>too large if the application does not use prepared statements
>but dynamic SQL with varying constants.

This just requires that the memory area be managed somehow, not unlike
how our shared buffers have to deal with evicting old pages.  There's
a challenge there around making sure that it doesn't make the
performance of the system much worse than not having a cache at all,
naturally, but given that a lot of people use pg_stat_statements to
good effect and without much in the way of complaints, it seems like it
might be possible to make it work reasonably (just imagining a generic
plan being attached to pg_stat_statements with some information about
whether the generic plan works well or not, blah blah, hand waving goes
here).

Thanks!

Stephen




Re: [HACKERS] postgres_fdw super user checks

2017-10-12 Thread Stephen Frost
Robert, all,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Oct 5, 2017 at 1:02 PM, Jeff Janes  wrote:
> > I don't see a reason to block a directly-logged-in superuser from using a
> > mapping.  I asked in the closed list whether the current (released)
> > behavior was a security bug, and the answer was no.  And I don't know why
> > else to block superusers from doing something other than as a security bug.
> > Also it would create a backwards compatibility hazard to revoke the ability
> > now.
> 
> Well, my thought was that we ought to be consistent about whose
> authorization matters.  If we're using the view owner's credentials in
> general, then we also (defensibly, anyway) ought to use the view
> owner's superuser-ness to decide whether to enforce this restriction.
> Using either the view owner's superuser-ness or the session user's
> superuser-ness kind of puts you halfway in the middle.  The view
> owner's rights are what matters mostly, but your own rights also
> matter a little bit around the edges.  That's a little strange.
> 
> I don't have violently strong opinions about this - does anyone else
> have a view?

I haven't been following this closely, but I tend to agree with you- if
we're using the view owner's privileges then that's what everything
should be based on, not whether the caller is a superuser or not.

Consider a security-definer function.  Clearly, such a function should
always run as the owner of the function, even if the caller is a
superuser.  Running as the caller instead of the owner just because the
caller is a superuser, which would allow the function to access more,
clearly isn't a good idea, imv.
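
As a toy illustration (names invented):

    create table secrets (s text);
    revoke all on secrets from public;

    create function peek() returns setof text
        language sql security definer
        as 'select s from secrets';

The body of peek() always runs with its owner's rights; whether the
caller happens to be a superuser doesn't enter into it.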

Yes, that means that sometimes when superusers run things they get
permission denied errors.  That's always been the case, and is correct.

Thanks!

Stephen




Re: [HACKERS] Pluggable storage

2017-10-12 Thread Robert Haas
On Thu, Oct 12, 2017 at 8:00 PM, Haribabu Kommi
 wrote:
> Currently I have added a snapshot_satisfies API to find out whether a
> tuple satisfies the visibility requirements, with different types of
> visibility routines.  I feel these are somehow enough to develop
> different storage methods like UNDO.
> The storage methods can decide internally how to provide the visibility.
>
> + amroutine->snapshot_satisfies[MVCC_VISIBILITY] = HeapTupleSatisfiesMVCC;
> + amroutine->snapshot_satisfies[SELF_VISIBILITY] = HeapTupleSatisfiesSelf;
> + amroutine->snapshot_satisfies[ANY_VISIBILITY] = HeapTupleSatisfiesAny;
> + amroutine->snapshot_satisfies[TOAST_VISIBILITY] = HeapTupleSatisfiesToast;
> + amroutine->snapshot_satisfies[DIRTY_VISIBILITY] = HeapTupleSatisfiesDirty;
> + amroutine->snapshot_satisfies[HISTORIC_MVCC_VISIBILITY] =
> HeapTupleSatisfiesHistoricMVCC;
> + amroutine->snapshot_satisfies[NON_VACUUMABLE_VISIBILTY] =
> HeapTupleSatisfiesNonVacuumable;
> +
> + amroutine->snapshot_satisfiesUpdate = HeapTupleSatisfiesUpdate;
> + amroutine->snapshot_satisfiesVacuum = HeapTupleSatisfiesVacuum;
>
> Currently no changes are carried out in the snapshot logic, as that is
> kept separate from the storage API.

That seems like a strange choice of API.  I think it should be more
integrated with the scan logic.  For example, if I'm doing an index
scan and I get a TID, then I should be able to just say "here's a
TID, give me any tuples associated with that TID that are visible to
the scan snapshot".  Then for the current heap it will do
heap_hot_search_buffer, and for zheap it will walk the undo chain and
return the relevant tuple from the chain.
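
Hypothetically, such a storage-API callback might look something like
this (name and signature invented here for illustration only):

    /* Return the version of the tuple at 'tid' that is visible to
     * 'snapshot', if any.  The heap AM would implement this with
     * heap_hot_search_buffer(); a zheap-like AM would walk the undo
     * chain instead. */
    typedef bool (*tuple_fetch_visible_function) (Relation relation,
                                                  ItemPointer tid,
                                                  Snapshot snapshot,
                                                  TupleTableSlot *slot);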

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Determine state of cluster (HA)

2017-10-12 Thread Joshua D. Drake

-Hackers,

I had a long call with a firm developing a front-end proxy/cache/HA
solution for Postgres today.  Essentially the software is a replacement
for PGPool in its entirety, but it also supports analytics etc.  When I
asked them about pain points, they talked about the issue below, and I
was wondering if this is a problem we would like to solve:


 Per your request, here is our failover issue.

1.  In a modern devops environment, the database should be able to scale 
and morph over time based on need.
2.  Tools that are leveraging the database should be able to easily 
discover and potentially control (with permissions) the database. 
Currently, you can discover the master and what nodes are syncing off of 
it, but on a failure, a tool can't easily discover what orchestration 
has done on the back-end to make the cluster whole again, i.e. from the 
slave, you can't discover the master reliably and easily.


The logic that our code now uses is to:

1.  Find the master
2.  Add replication nodes per the master's configuration.

To find a master, we start with a list of candidate nodes that MAY be a 
master at any point, and:

1. issue "SELECT pg_is_in_recovery()" to find out if it is a slave
a. If so, use "SELECT pg_read_file('recovery.conf')" to extract the host
b. Attempt to connect to the host directly; if that fails...
c. go through the slave, using dblink to connect to the master by
hostname, i.e. select * from dblink('" + connInfo + "
dbname=postgres', 'select inet_server_addr()') AS t(inet_server_addr
inet).  This is necessary in the event the hostname used in the
recovery.conf file is not resolvable from the outside.
d. Use the dblink connection to ID the master node via select 
inet_server_addr();

e. connect to the IP provided by the master.
f.  Repeat through nodes until we get a master.
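
The probe queries used in steps 1(a)-(d) above are roughly the
following (the dblink connection string is a placeholder):

    SELECT pg_is_in_recovery();              -- true on a slave
    SELECT pg_read_file('recovery.conf');    -- extract the upstream host
    SELECT * FROM dblink('host=upstream-host dbname=postgres',
                         'select inet_server_addr()')
             AS t(inet_server_addr inet);    -- master address as seen from the slave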

Issues:
1.  The dblink call doesn't have a way to specify a timeout, so we have 
to use Java futures to limit how long this may take to a reasonable 
amount of time;
2.  NAT mapping may result in us detecting IP ranges that are not 
accessible to the application nodes.
3.  there is no easy way to monitor for state changes as they happen, 
which would allow faster failovers; everything has to be polled rather 
than being event-driven;
4.  It doesn't support cascading replication very well, although we 
could augment the logic to allow us to map the relationship between nodes.
5.  There is no way to connect to a db node with something akin to 
SQL-Server's "application intent" flags, to allow a connection to be 
rejected if we wish it to be a read/write connection.  This helps detect 
the state of the node directly without having to ask any further 
questions of the node, and makes it easier to "stall" during connection 
until a proper connection can be made.
6.  The master, on shutdown, will not actually close and disable 
connections as it shuts down; instead, it will issue an error that it is 
shutting down as it does so.


Fundamentally, the biggest issue is that it is very hard to determine 
the state of the cluster by asking all the nodes, in particular in the 
case of a failure.  Some state information is lost that is necessary to 
talk to the cluster moving forward in a reliable manner.



--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own.   *




Re: [HACKERS] UPDATE of partition key

2017-10-12 Thread Amit Langote
On 2017/10/13 6:18, Robert Haas wrote:
> Is anybody still reviewing the main patch here?  (It would be good if
> the answer is "yes".)

I am going to try to look at the latest version over the weekend and early
next week.

Thanks,
Amit





Re: [HACKERS] Continuous integration on Windows?

2017-10-12 Thread Andrew Dunstan


On 10/12/2017 06:46 PM, Thomas Munro wrote:
> On Fri, Oct 13, 2017 at 10:57 AM, Andrew Dunstan
>  wrote:
>> Actually, that didn't take too long.
>>
>> No testing yet, but this runs a build successfully:
>> 
>>
>> See results at 
> Excellent!  Thanks for looking at this, Andrew.  It's going to be
> really useful for removing surprises.
>
> It would be nice to finish up with a little library of control files
> like this for different CI vendors, possibly with some different
> options (different compilers, different word size, add valgrind, ...).
> I don't know if it would ever make sense to have standardised CI
> control files in the tree -- many projects do -- but it's very easy to
> carry a commit that adds them in development branches but drop it as
> part of the format-patch-and-post process.
>


Well, as you can see here the appveyor.yml file can live outside the
tree that's being built.


What would be good, though, would be to split build.pl into two so you
wouldn't need the auxiliary file I had to create from it.


cheers

andrew

-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-10-12 Thread Amit Langote
On 2017/10/13 4:18, Robert Haas wrote:
> On Thu, Oct 5, 2017 at 9:29 PM, Amit Langote
>  wrote:
>> Attached a patch to modify the INFO messages in check_default_allows_bound.
> 
> Committed.  However, I didn't see a reason to adopt the comment change
> you proposed, so I left that part out.

Okay, thanks.

Regards,
Amit





Re: [HACKERS] Aggregate transition state merging vs. hypothetical set functions

2017-10-12 Thread Tom Lane
David Rowley  writes:
> On 13 October 2017 at 12:41, Tom Lane  wrote:
>> Yeah, we would probably also want to check the flag in nodeWindowAgg.
>> Not sure exactly how that should play out --- maybe we end up with
>> a tri-valued property "works as normal agg without merging, works
>> as normal agg with merging, works as window agg".

> hmm, maybe I'm lacking imagination here, but surely the final function
> is either destructive or it's not? I can't understand what the
> difference is between nodeAgg.c calling the finalfn multiple times on
> the same state and nodeWindowAgg.c doing it. Maybe there's something
> I'm not accounting for that you are?

nodeWindowAgg is doing something more: not only is it calling the finalfn
repeatedly, but it's continuing to mutate the transition state in between.
The ordered-set aggs provide a counterexample to considering that to be
equivalent to state merging.  The OSAs can cope with state merging as
long as they have a flag to make sure only the first finalfn does
tuplesort_performsort ... but that's not good enough to make them workable
as window aggs.  Once we sort, we can't absorb more rows into the
tuplesort object.
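
With state merging, a sort-once flag in the transition state is enough
-- a sketch, with an assumed field name:

    if (!osastate->sort_done)       /* 'sort_done' is illustrative */
    {
        tuplesort_performsort(osastate->sortstate);
        osastate->sort_done = true;
    }

but that is precisely what doesn't help in the window-agg case, since
after tuplesort_performsort() the state cannot absorb further input
rows.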

regards, tom lane




Re: [HACKERS] Pluggable storage

2017-10-12 Thread Haribabu Kommi
On Fri, Oct 13, 2017 at 8:23 AM, Robert Haas  wrote:

> On Thu, Oct 12, 2017 at 4:38 PM, Alexander Korotkov
>  wrote:
> > It's probably that we imply different meanings by "MVCC implementation".
> > While writing "MVCC implementation" I meant that, for instance, an
> > alternative storage may implement UNDO chains to store versions of the
> > same row.  Correspondingly, it may not have any analogue of our HOT.
>
> Yes, the zheap project on which EnterpriseDB is working has precisely
> this characteristic.
>
> > However I imply that alternative storage would share our "MVCC model".
> > So, it should share our transactional model including transactions,
> > subtransactions, snapshots etc.  Therefore, if alternative storage is
> > transactional, then in particular it should be able to fetch a tuple
> > with a given TID according to a given snapshot.  However, how it's
> > implemented internally is a black box for us.  Thus, we don't insist
> > that a tuple should have a different TID after update; we don't insist
> > there is any analogue of HOT; we don't insist alternative storage needs
> > vacuum (or even if it needs vacuum, it might be performed in a
> > completely different way) and so on.
>
> Fully agreed.


Currently I have added a snapshot_satisfies API to find out whether a
tuple satisfies the visibility requirements, with different types of
visibility routines.  I feel these are somehow enough to develop
different storage methods like UNDO.
The storage methods can decide internally how to provide the visibility.

+ amroutine->snapshot_satisfies[MVCC_VISIBILITY] = HeapTupleSatisfiesMVCC;
+ amroutine->snapshot_satisfies[SELF_VISIBILITY] = HeapTupleSatisfiesSelf;
+ amroutine->snapshot_satisfies[ANY_VISIBILITY] = HeapTupleSatisfiesAny;
+ amroutine->snapshot_satisfies[TOAST_VISIBILITY] = HeapTupleSatisfiesToast;
+ amroutine->snapshot_satisfies[DIRTY_VISIBILITY] = HeapTupleSatisfiesDirty;
+ amroutine->snapshot_satisfies[HISTORIC_MVCC_VISIBILITY] =
HeapTupleSatisfiesHistoricMVCC;
+ amroutine->snapshot_satisfies[NON_VACUUMABLE_VISIBILTY] =
HeapTupleSatisfiesNonVacuumable;
+
+ amroutine->snapshot_satisfiesUpdate = HeapTupleSatisfiesUpdate;
+ amroutine->snapshot_satisfiesVacuum = HeapTupleSatisfiesVacuum;

Currently no changes are carried out in the snapshot logic, as that is
kept separate from the storage API.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Aggregate transition state merging vs. hypothetical set functions

2017-10-12 Thread David Rowley
On 13 October 2017 at 12:41, Tom Lane  wrote:
> David Rowley  writes:
>> If the user defines their normal aggregate as not safe for merging,
>> then surely it'll not be suitable to be used as a window function
>> either, since the final function will also be called there multiple
>> times per state.
>
> Yeah, we would probably also want to check the flag in nodeWindowAgg.
> Not sure exactly how that should play out --- maybe we end up with
> a tri-valued property "works as normal agg without merging, works
> as normal agg with merging, works as window agg".  But this would
> arguably be an improvement over the current situation.  Right now
> I'm sure there are user-written aggs out there that will just crash
> if used as a window agg, and the authors don't have much choice because
> the performance costs of not modifying the transition state in the
> finalfn are higher than they're willing to bear.  At least with a
> flag they could ensure that the case will fail cleanly.

hmm, maybe I'm lacking imagination here, but surely the final function
is either destructive or it's not? I can't understand what the
difference is between nodeAgg.c calling the finalfn multiple times on
the same state and nodeWindowAgg.c doing it. Maybe there's something
I'm not accounting for that you are?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] oversight in EphemeralNamedRelation support

2017-10-12 Thread Tom Lane
Thomas Munro  writes:
> On Fri, Oct 13, 2017 at 10:01 AM, Tom Lane  wrote:
>> The CTE was simply not part of the available namespace for the INSERT's
>> target, so it found the regular table instead.  v10 has thus broken
>> cases that used to work.  I think that's a bug.

> Hmm.  Yeah.  I have to say it's a bit surprising that the following
> refers to two different objects:
>   with mytable as (select 1 as x) insert into mytable select * from mytable

Yeah, I agree --- personally I'd never write a query like that.  But
the fact that somebody ran into it when v10 has been out for barely
a week suggests that people are doing it.

>> I think we need to either remove that call from setTargetTable entirely,
>> or maybe adjust it so it can only find ENRs not CTEs.

> I think it'd be better to find and reject ENRs only.  The alternative
> would be to make ENRs invisible to DML, which seems arbitrary and
> weird (even though it might be more consistent with our traditional
> treatment of CTEs).  One handwavy reason I'd like them to remain
> visible to DML (and be rejected) is that I think they're a bit like
> table variables (see SQL Server), and someone might eventually want to
> teach them, or something like them, how to be writable.

Yeah, that would be the argument for making them visible.  I'm not
sure how likely it is that we'll ever actually have writable ENRs,
but I won't stand in the way if you want to do it like that.

regards, tom lane




Re: [HACKERS] Aggregate transition state merging vs. hypothetical set functions

2017-10-12 Thread Tom Lane
David Rowley  writes:
> On 13 October 2017 at 12:08, Tom Lane  wrote:
>> Therefore, I think we need to bite the bullet and provide an aggregate
>> property (CREATE AGGREGATE argument / pg_aggregate column) that tells
>> whether the aggregate supports transition state merging.  Likely this
>> should have been in the state-merging patch to begin with, but better
>> late than never.

> Are you considering that this is an option only for ordered-set
> aggregates or for all?

All.

> If the user defines their normal aggregate as not safe for merging,
> then surely it'll not be suitable to be used as a window function
> either, since the final function will also be called there multiple
> times per state.

Yeah, we would probably also want to check the flag in nodeWindowAgg.
Not sure exactly how that should play out --- maybe we end up with
a tri-valued property "works as normal agg without merging, works
as normal agg with merging, works as window agg".  But this would
arguably be an improvement over the current situation.  Right now
I'm sure there are user-written aggs out there that will just crash
if used as a window agg, and the authors don't have much choice because
the performance costs of not modifying the transition state in the
finalfn are higher than they're willing to bear.  At least with a
flag they could ensure that the case will fail cleanly.
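
In CREATE AGGREGATE terms the new property might look something like
this (the option name is only a strawman):

    CREATE AGGREGATE my_agg (anyelement) (
        sfunc = my_sfunc,
        stype = internal,
        finalfunc = my_final,
        mergeable = false   -- proposed: no transition-state merging
    );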

regards, tom lane




Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2017-10-12 Thread Peter Geoghegan
On Tue, Aug 15, 2017 at 10:54 AM, Robert Haas  wrote:
>> Or at least make the filtering optional.
>
> I don't think "filtering" is the right way to think about it.  It's
> just labeling each combination of bits with the meaning appropriate to
> that combination of bits.

I do. -1 to not just showing what's on the page -- if the
HEAP_XMIN_COMMITTED and HEAP_XMIN_ABORTED bits are set, then I think
we should show them. Yeah, I accept that there is a real danger of
confusing people with that. Unfortunately, I think that displaying
HEAP_XMIN_FROZEN will cause even more confusion. I don't think that
HEAP_XMIN_FROZEN is an abstraction at all. It's a notational
convenience.

I don't think it's our place to "interpret" the bits. Are we *also*
going to show HEAP_XMIN_FROZEN when xmin is physically set to
FrozenTransactionId? Where does it end?

I think that we should prominently document that HEAP_XMIN_COMMITTED
|HEAP_XMIN_ABORTED == HEAP_XMIN_FROZEN, rather than trying to hide
complexity that we have no business hiding in a tool like pageinspect.
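
For reference, the actual t_infomask bit definitions in htup_details.h
(where the "aborted" bit is spelled HEAP_XMIN_INVALID) are:

    #define HEAP_XMIN_COMMITTED  0x0100
    #define HEAP_XMIN_INVALID    0x0200
    #define HEAP_XMIN_FROZEN     (HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID)

so a "frozen" xmin really is nothing more than both bits set at once.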

-- 
Peter Geoghegan




Re: [HACKERS] [COMMITTERS] pgsql: Fix traversal of half-frozen update chains

2017-10-12 Thread Alvaro Herrera
Peter Geoghegan wrote:
> On Fri, Oct 6, 2017 at 8:29 AM, Alvaro Herrera  
> wrote:

> > /*
> >  * When a tuple is frozen, the original Xmin is lost, but we know it's a
> >  * committed transaction.  So unless the Xmax is InvalidXid, we don't 
> > know
> >  * for certain that there is a match, but there may be one; and we must
> >  * return true so that a HOT chain that is half-frozen can be walked
> >  * correctly.
> >  *
> >  * We no longer freeze tuples this way, but we must keep this in order 
> > to
> >  * interpret pre-pg_upgrade pages correctly.
> >  */
> > if (TransactionIdEquals(xmin, FrozenTransactionId) &&
> > TransactionIdIsValid(xmax))
> > return true;
> >
> > return false;
> > }
> 
> Wouldn't this last "if" test, to cover the pg_upgrade case, be better
> targeted by comparing *raw* xmin to FrozenTransactionId? You're using
> the potentially distinct xmin value returned by
> HeapTupleHeaderGetXmin() for the test here. I think we should be
> directly targeting tuples frozen on or before 9.4 (prior to
> pg_upgrade) instead.

Yes, agreed, I should change that.  Thanks for continuing to think about
this.
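
Presumably something along these lines (just a sketch):

    if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
                            FrozenTransactionId) &&
        TransactionIdIsValid(xmax))
        return true;

so that only tuples whose xmin was physically overwritten with
FrozenTransactionId -- the pre-9.4 freezing convention -- take this
path.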

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] oversight in EphemeralNamedRelation support

2017-10-12 Thread Thomas Munro
On Fri, Oct 13, 2017 at 10:01 AM, Tom Lane  wrote:
> Thomas Munro  writes:
>> Before that, CTE used as modify targets produced a different error message:
>
>> postgres=# WITH d AS (SELECT 42) INSERT INTO d VALUES (1);
>> ERROR:  relation "d" does not exist
>> LINE 1: WITH d AS (SELECT 42) INSERT INTO d VALUES (1);
>>   ^
>
> Well, I think that is a poorly chosen example.  Consider this instead:
> pre-v10, you could do this:
>
> regression=# create table mytable (f1 int);
> CREATE TABLE
> regression=# with mytable as (select 1 as x) insert into mytable values(1);
> INSERT 0 1
> regression=# select * from mytable;
>  f1
> 
>   1
> (1 row)
>
> The CTE was simply not part of the available namespace for the INSERT's
> target, so it found the regular table instead.  v10 has thus broken
> cases that used to work.  I think that's a bug.

Hmm.  Yeah.  I have to say it's a bit surprising that the following
refers to two different objects:

  with mytable as (select 1 as x) insert into mytable select * from mytable

Obviously the spec is useless here since this is non-standard (at a
guess they'd probably require a qualifier there to avoid parsing as a
 if they allowed DML after ).  As you said
it's worked like that for several releases, so whatever I might think
about someone who deliberately writes such queries, the precedent
probably trumps naive notions about WITH creating a single consistent
lexical scope.

> There may or may not be a case for allowing ENRs to capture names that
> would otherwise refer to ordinary tables; I'm not sure.  But I see very
> little case for allowing CTEs to capture such references, because surely
> we are never going to allow that to do anything useful, and we have
> several years of precedent now that they don't capture.
>
> I think we need to either remove that call from setTargetTable entirely,
> or maybe adjust it so it can only find ENRs not CTEs.

I think it'd be better to find and reject ENRs only.  The alternative
would be to make ENRs invisible to DML, which seems arbitrary and
weird (even though it might be more consistent with our traditional
treatment of CTEs).  One handwavy reason I'd like them to remain
visible to DML (and be rejected) is that I think they're a bit like
table variables (see SQL Server), and someone might eventually want to
teach them, or something like them, how to be writable.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Aggregate transition state merging vs. hypothetical set functions

2017-10-12 Thread David Rowley
On 13 October 2017 at 12:08, Tom Lane  wrote:
> Therefore, I think we need to bite the bullet and provide an aggregate
> property (CREATE AGGREGATE argument / pg_aggregate column) that tells
> whether the aggregate supports transition state merging.  Likely this
> should have been in the state-merging patch to begin with, but better
> late than never.
>
> The main thing that probably has to be hashed out before we can write
> that patch is what the default should be for user-created aggregates.
> I am inclined to think that we should err on the side of safety and
> default it to false (no merge support).  You could argue that the
> lack of complaints since 9.6 came out is sufficient evidence that
> defaulting to true would be all right, but I'm not sure.

Are you considering that this is an option only for ordered-set
aggregates or for all?

If the user defines their normal aggregate as not safe for merging,
then surely it'll not be suitable to be used as a window function
either, since the final function will also be called there multiple
times per state.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Aggregate transition state merging vs. hypothetical set functions

2017-10-12 Thread Tom Lane
I started to look into fixing orderedsetaggs.c so that we could revert
52328727b, and soon found a rather nasty problem.  Although the plain
OSAs seem amenable to supporting multiple finalfn calls on the same
transition state, the "hypothetical set" functions are not at all.
What they do is to accumulate all the regular aggregate input into
a tuplesort, then add the direct arguments as an additional "hypothetical"
row, and finally sort the result.  There's no realistic way of undoing
the addition of the hypothetical row, so these finalfns simply can't
share tuplesort state.

You could imagine accumulating the regular input into a tuplestore
and then copying it into a tuplesort in each finalfn call.  But that
seems pretty icky, and I'm not sure it'd really be any more performant
than just keeping separate tuplesort objects for each aggregate.

So my conclusion is that we *must* teach nodeAgg.c not to merge
transition states for these functions.  But I do not want some
hard-wired rule that all and only hypothetical-set aggregates
are nonmergeable.  Our notion of an AGGKIND_HYPOTHETICAL aggregate
is mostly a syntactical one about requiring there to be direct
argument(s) matching the aggregated argument(s) in type.  I do not
think that that should translate to an assumption about how the
aggregate implementation works (indeed, the CREATE AGGREGATE man
page says in so many words that HYPOTHETICAL affects only type
matching, not runtime behavior).  Moreover, having seen this example,
I think it's likely that there are other cases in which it's more
trouble than it's worth for an aggregate to support merging of
transition states.

Therefore, I think we need to bite the bullet and provide an aggregate
property (CREATE AGGREGATE argument / pg_aggregate column) that tells
whether the aggregate supports transition state merging.  Likely this
should have been in the state-merging patch to begin with, but better
late than never.

The main thing that probably has to be hashed out before we can write
that patch is what the default should be for user-created aggregates.
I am inclined to think that we should err on the side of safety and
default it to false (no merge support).  You could argue that the
lack of complaints since 9.6 came out is sufficient evidence that
defaulting to true would be all right, but I'm not sure.

Comments?

regards, tom lane




Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-12 Thread David Rowley
On 13 October 2017 at 04:56, Alvaro Herrera  wrote:
> I pushed your original fix.

Thanks for committing

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Continuous integration on Windows?

2017-10-12 Thread Thomas Munro
On Fri, Oct 13, 2017 at 10:57 AM, Andrew Dunstan
 wrote:
> Actually, that didn't take too long.
>
> No testing yet, but this runs a build successfully:
> 
>
> See results at 

Excellent!  Thanks for looking at this, Andrew.  It's going to be
really useful for removing surprises.

It would be nice to finish up with a little library of control files
like this for different CI vendors, possibly with some different
options (different compilers, different word size, add valgrind, ...).
I don't know if it would ever make sense to have standardised CI
control files in the tree -- many projects do -- but it's very easy to
carry a commit that adds them in development branches but drop it as
part of the format-patch-and-post process.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Continuous integration on Windows?

2017-10-12 Thread Andres Freund
On 2017-10-12 17:57:11 -0400, Andrew Dunstan wrote:
> No testing yet, but this runs a build successfully:
> 
> 
> See results at 

"Time Elapsed 00:04:36.37"

I'd expected building would take a lot longer than that... Neat.

Greetings,

Andres Freund




Re: [HACKERS] Continuous integration on Windows?

2017-10-12 Thread Andrew Dunstan


On 10/12/2017 04:14 PM, Andrew Dunstan wrote:
>
> On 10/11/2017 11:04 PM, Thomas Munro wrote:
>> Hi hackers,
>>
>> I don't use Windows myself, but I'd rather avoid submitting patches
>> that fail to build, build with horrible warnings or blow up on that
>> fine operating system.  I think it would be neat to be able to have
>> experimental branches of PostgreSQL built and tested on Windows
>> automatically just by pushing them to a watched public git repo.  Just
>> like Travis CI and several others do for the major GNU/Linux distros,
>> it seems there is at least one Windows-based CI company that
>> generously offers free testing to open source projects:
>> https://www.appveyor.com (hat tip to Ilmari for the pointer).  I
>> wonder... has anyone here with Microsoft know-how ever tried to
>> produce an appveyor.yml file that would do a MSVC build and
>> check-world?
>>
>
> Interesting.
>
> I'm taking a look.
>
> A couple of things not in their pre-built images that we'll need are
> flex and bison. We might be able to overcome that with chocolatey, which
> is installed; I haven't tested that yet.
>
> getting a working appveyor.yml will take a little while, though.



Actually, that didn't take too long.

No testing yet, but this runs a build successfully:


See results at 

cheers

andrew

-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] [POC] hash partitioning

2017-10-12 Thread Andres Freund
On 2017-10-12 17:27:52 -0400, Robert Haas wrote:
> On Thu, Oct 12, 2017 at 4:20 PM, Andres Freund  wrote:
> >> In other words, it's not utterly fixed in stone --- we invented
> >> --load-via-partition-root primarily to cope with circumstances that
> >> could change hash values --- but we sure don't want to be changing it
> >> with any regularity, or for a less-than-excellent reason.
> >
> > Yea, that's what I expected. It'd probably good for somebody to run
> > smhasher or such on the output of the combine function (or even better,
> > on both the 32 and 64 bit variants) in that case.
> 
> Not sure how that test suite works exactly, but presumably the
> characteristics in practice will depend on the behavior of the hash
> functions used as input to the combine function - so the behavior could
> be good for an (int, int) key but bad for a (text, date) key, or
> whatever.

I don't think that's true, unless you have really bad hash functions on
the component hashes.  A hash combine function can't really do
anything about badly hashed input; what you want is that it doesn't
*reduce* the quality of the hash by combining.
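
To make that point concrete, here's a tiny standalone illustration (a
sketch of my own, not PostgreSQL code): a naive XOR combiner maps every
pair of equal hashes to zero, discarding information, while a mixing
combiner of the shape discussed in this thread does not.

#include <stdint.h>
#include <stdio.h>

/* a deliberately bad combiner: combine(x, x) == 0 for every x */
static uint32_t
combine_xor(uint32_t a, uint32_t b)
{
	return a ^ b;
}

/* same shape as the 32-bit hash_combine() quoted elsewhere in this digest */
static uint32_t
combine_mix(uint32_t a, uint32_t b)
{
	a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
	return a;
}

int
main(void)
{
	/* XOR collapses all equal inputs to the same value; mixing doesn't */
	printf("xor: %08x %08x\n",
		   (unsigned) combine_xor(0xdeadbeef, 0xdeadbeef),
		   (unsigned) combine_xor(0x12345678, 0x12345678));
	printf("mix: %08x %08x\n",
		   (unsigned) combine_mix(0xdeadbeef, 0xdeadbeef),
		   (unsigned) combine_mix(0x12345678, 0x12345678));
	return 0;
}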

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] hash partitioning

2017-10-12 Thread Robert Haas
On Thu, Oct 12, 2017 at 4:20 PM, Andres Freund  wrote:
>> In other words, it's not utterly fixed in stone --- we invented
>> --load-via-partition-root primarily to cope with circumstances that
>> could change hash values --- but we sure don't want to be changing it
>> with any regularity, or for a less-than-excellent reason.
>
> Yea, that's what I expected. It'd probably good for somebody to run
> smhasher or such on the output of the combine function (or even better,
> on both the 32 and 64 bit variants) in that case.

Not sure how that test suite works exactly, but presumably the
characteristics in practice will depend on the behavior of the hash
functions used as input to the combine function - so the behavior could
be good for an (int, int) key but bad for a (text, date) key, or
whatever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable storage

2017-10-12 Thread Robert Haas
On Thu, Oct 12, 2017 at 4:38 PM, Alexander Korotkov
 wrote:
> It's probably that we attach different meanings to "MVCC implementation".
> While writing "MVCC implementation" I meant that, for instance, an
> alternative storage may implement UNDO chains to store versions of the
> same row.  Correspondingly, it may not have any analogue of our HOT.

Yes, the zheap project on which EnterpriseDB is working has precisely
this characteristic.

> However, I imply that an alternative storage would share our "MVCC
> model".  So, it should share our transactional model, including
> transactions, subtransactions, snapshots etc.  Therefore, if an
> alternative storage is transactional, then in particular it should be
> able to fetch the tuple with a given TID according to a given snapshot.
> However, how that is implemented internally is a black box for us.
> Thus, we don't insist that a tuple should have a different TID after an
> update; we don't insist there is any analogue of HOT; we don't insist
> the alternative storage needs vacuum (or even if it needs vacuum, it
> might be performed in a completely different way) and so on.

Fully agreed.

> During conversations with you at PGCon and other conferences I had the
> impression that you share this view on pluggable storages and MVCC.
> Probably, we just express this view in different words.  Or,
> alternatively, I might understand you terribly wrong.

No, it sounds like we are on the same page.  I'm only hoping that we
don't end with a bunch of storage engines that each use a different
XID space or something icky like that.  I don't think the API should
try to cater to that sort of development.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPDATE of partition key

2017-10-12 Thread Robert Haas
On Wed, Oct 4, 2017 at 9:51 AM, Amit Khandekar  wrote:
> Preparatory patches :
> 0001-Prepare-for-re-using-UPDATE-result-rels-during-tuple.patch
> 0002-Prevent-a-redundant-ConvertRowtypeExpr-node.patch
> Main patch :
> update-partition-key_v20.patch

Committed 0001 with a few tweaks and 0002 unchanged.  Please check
whether everything looks OK.

Is anybody still reviewing the main patch here?  (It would be good if
the answer is "yes".)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] oversight in EphemeralNamedRelation support

2017-10-12 Thread Tom Lane
Thomas Munro  writes:
> On Fri, Oct 13, 2017 at 8:50 AM, Tom Lane  wrote:
>> Hm.  I actually think the bug here is that 18ce3a4ab introduced
>> anything into setTargetTable at all.  There was never previously
>> any assumption that the target could be anything but a regular
>> table, so we just ignored CTEs there, and I do not think the
>> new behavior is an improvement.

> Before that, CTEs used as modify targets produced a different error message:

> postgres=# WITH d AS (SELECT 42) INSERT INTO d VALUES (1);
> ERROR:  relation "d" does not exist
> LINE 1: WITH d AS (SELECT 42) INSERT INTO d VALUES (1);
>   ^

Well, I think that is a poorly chosen example.  Consider this instead:
pre-v10, you could do this:

regression=# create table mytable (f1 int);
CREATE TABLE
regression=# with mytable as (select 1 as x) insert into mytable values(1);
INSERT 0 1
regression=# select * from mytable;
 f1 
----
  1
(1 row)

The CTE was simply not part of the available namespace for the INSERT's
target, so it found the regular table instead.  v10 has thus broken
cases that used to work.  I think that's a bug.

There may or may not be a case for allowing ENRs to capture names that
would otherwise refer to ordinary tables; I'm not sure.  But I see very
little case for allowing CTEs to capture such references, because surely
we are never going to allow that to do anything useful, and we have
several years of precedent now that they don't capture.

I think we need to either remove that call from setTargetTable entirely,
or maybe adjust it so it can only find ENRs not CTEs.
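
For concreteness, the two rules proposed here can be sketched like this
(a self-contained toy; every name below is a hypothetical stand-in, not
the real parser API, which works on ParseState and RangeVar):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* hypothetical stubs standing in for the parser's namespace lookups */
static bool
name_is_cte(const char *name)
{
	return strcmp(name, "some_cte") == 0;
}

static bool
name_is_enr(const char *name)
{
	return strcmp(name, "some_enr") == 0;
}

/*
 * Returns which special namespace captures the name, or NULL to fall
 * through to the ordinary table lookup.
 */
static const char *
special_rel_lookup(const char *schemaname, const char *relname,
				   bool for_target_table)
{
	if (schemaname != NULL)
		return NULL;			/* qualified names are never special */
	if (!for_target_table && name_is_cte(relname))
		return "CTE";			/* CTEs never capture DML target names */
	if (name_is_enr(relname))
		return "ENR";
	return NULL;
}

int
main(void)
{
	const char *hit = special_rel_lookup(NULL, "some_cte", true);

	printf("%s\n", hit ? hit : "ordinary table");	/* ordinary table */
	return 0;
}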

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show precise repos version for dev builds?

2017-10-12 Thread Andres Freund
Hi,

On 2017-10-12 22:50:47 +0200, Fabien COELHO wrote:
> The make dependencies ensure that the header file is regenerated on each
> build with a phony target, and the C file is thus recompiled and linked into
> the executables on each build. It means that all executables are linked on
> each rebuild, even if not necessary, though.

That'd be quite painful, consider e.g. people using LTO.  If done right
however, that should be avoidable to some degree. When creating the
version file, only replace its contents if the contents differ - that's
just a few lines of scripting.
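
A sketch of such a helper, written in C for portability (this is an
assumption of mine, not an existing PostgreSQL tool): it rewrites the
target file only when the new contents differ, so the file's mtime --
and everything that depends on it -- is left alone on no-op rebuilds.

#include <stdio.h>
#include <string.h>

int
main(int argc, char **argv)
{
	const char *path;
	const char *newtext;
	char		old[8192] = {0};
	size_t		len = 0;
	FILE	   *fp;

	if (argc != 3)
	{
		fprintf(stderr, "usage: %s FILE TEXT\n", argv[0]);
		return 1;
	}
	path = argv[1];
	newtext = argv[2];

	/* read whatever is there now (missing file counts as "different") */
	fp = fopen(path, "r");
	if (fp)
	{
		len = fread(old, 1, sizeof(old) - 1, fp);
		fclose(fp);
	}

	if (len == strlen(newtext) && memcmp(old, newtext, len) == 0)
		return 0;				/* unchanged: don't touch the file */

	fp = fopen(path, "w");
	if (!fp)
		return 1;
	fputs(newtext, fp);
	fclose(fp);
	return 0;
}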

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show precise repos version for dev builds?

2017-10-12 Thread Fabien COELHO


Hello Tom,


> I've seen issues with a number of tools. The one I can remember most
> clearly is check_postgres.pl.  Nobody's going to argue that this is
> pretty code, but last time I tested (9.4-era, admittedly) it exploded
> messily with extra-version.


> FWIW, Salesforce tried to do something similar to Peter's example
> while I was there.  It did not work as well as we'd hoped :-( because
> what got baked into the built executables was the latest git commit hash
> as of the time you'd last run configure, not what was current as of the
> latest "make".  Not to mention that you might have built from an
> uncommitted state.  We tried to find a fix for the former problem that
> didn't create lots of overhead, without much success.


My 0.02€:

For a research project we regenerate a header file with a string
containing the working-copy status information:


 // file version.h
 #define REV ""

and there is a very small C file which is recompiled with a constant 
string based on the version:


 // version.c
 #include "version.h"
 const char * version = REV;

The make dependencies ensure that the header file is regenerated on each 
build with a phony target, and the C file is thus recompiled and linked 
into the executables on each build. It means that all executables are 
linked on each rebuild, even if not necessary, though.



> No idea what to do about the latter.


"svnversion" adds a "M" for modified on the status. There is an option 
with "git describe" to get something similar:


git describe --long --always --all --dirty

Also there is a need for a fallback if this fails, to get "version>" instead.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Columnar storage support

2017-10-12 Thread legrand legrand
Thanks a lot for all thoses informations regarding this "Feature"
development.

I'll try to test VOPS soon,
and see if monetdb_fdw supports filter and aggregate pushdown ;o)

PAscal





--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable storage

2017-10-12 Thread Alexander Korotkov
On Wed, Oct 11, 2017 at 11:08 PM, Robert Haas  wrote:

> On Mon, Oct 9, 2017 at 10:22 AM, Alexander Korotkov
>  wrote:
> > For me, it's crucial point that pluggable storages should be able to have
> > different MVCC implementation, and correspondingly have full control over
> > its interactions with indexes.
> > Thus, it would be good if we would get consensus on that point.  I'd like
> > other discussion participants to comment whether they agree/disagree and
> > why.
> > Any comments?
>
> I think it's good for new storage managers to have full control over
> interactions with indexes.  I'm not sure about the MVCC part.  I think
> it would be legitimate to want a storage manager to ignore MVCC
> altogether - e.g. to build a non-transactional table.  I don't know
> that it would be a very good idea to have two different full-fledged
> MVCC implementations, though.  Like Tom says, that would be
> replicating a lot of the awfulness of the MySQL model.


It's probably that we attach different meanings to "MVCC implementation".
While writing "MVCC implementation" I meant that, for instance, an
alternative storage may implement UNDO chains to store versions of the
same row.  Correspondingly, it may not have any analogue of our HOT.

However, I imply that an alternative storage would share our "MVCC
model".  So, it should share our transactional model, including
transactions, subtransactions, snapshots etc.  Therefore, if an
alternative storage is transactional, then in particular it should be
able to fetch the tuple with a given TID according to a given snapshot.
However, how that is implemented internally is a black box for us.
Thus, we don't insist that a tuple should have a different TID after an
update; we don't insist there is any analogue of HOT; we don't insist
the alternative storage needs vacuum (or even if it needs vacuum, it
might be performed in a completely different way) and so on.

During conversations with you at PGCon and other conferences I had the
impression that you share this view on pluggable storages and MVCC.
Probably, we just express this view in different words.  Or,
alternatively, I might understand you terribly wrong.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] oversight in EphemeralNamedRelation support

2017-10-12 Thread Thomas Munro
On Fri, Oct 13, 2017 at 8:50 AM, Tom Lane  wrote:
> Julien Rouhaud  writes:
>> On Mon, Oct 9, 2017 at 10:43 PM, Thomas Munro
>>  wrote:
>>> I suppose we could consider moving the schemaname check into
>>> getRTEForSpecialRelationType(), since otherwise both callers need to
>>> do that (and as you discovered, one forgot).
>
>> Thanks for the feedback.  That was my first idea, but I assumed there
>> could be future use for this function on qualified RangeVar if it
>> wasn't done this way.
>
>> I agree it'd be much safer, so v2 attached, check moved in
>> getRTEForSpecialRelationType().
>
> Hm.  I actually think the bug here is that 18ce3a4ab introduced
> anything into setTargetTable at all.  There was never previously
> any assumption that the target could be anything but a regular
> table, so we just ignored CTEs there, and I do not think the
> new behavior is an improvement.
>
> So my proposal is to rip out the getRTEForSpecialRelationTypes
> check there.  I tend to agree that getRTEForSpecialRelationTypes
> should probably contain an explicit check for unqualified name
> rather than relying on its caller ... but that's a matter of
> future-proofing not a bug fix.

That check arrived in v11 revision of the patch:

https://www.postgresql.org/message-id/CACjxUsPfUUa813oDvJRx2wuiqHXO3VsCLQzcuy0r%3DUEyS-xOjQ%40mail.gmail.com

Before that, CTEs used as modify targets produced a different error message:

postgres=# WITH d AS (SELECT 42) INSERT INTO d VALUES (1);
ERROR:  relation "d" does not exist
LINE 1: WITH d AS (SELECT 42) INSERT INTO d VALUES (1);
  ^

... but ENRs used like that caused a crash.  The change to
setTargetTable() went in to prevent that (and improved the CTE case's
error message semi-incidentally).  To take it out, we'll need a new check
somewhere else to prevent the crash.  Where?

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] hash partitioning

2017-10-12 Thread Andres Freund
On 2017-10-12 16:06:11 -0400, Robert Haas wrote:
> On Thu, Oct 12, 2017 at 3:43 PM, Andres Freund  wrote:
> > Are we going to rely on the combine function to stay the same
> > forever after?
> 
> If we change them, it will be a pg_upgrade compatibility break for
> anyone using hash-partitioned tables with more than one partitioning
> column.  Dump and reload will also break unless
> --load-via-partition-root is used.
> 
> In other words, it's not utterly fixed in stone --- we invented
> --load-via-partition-root primarily to cope with circumstances that
> could change hash values --- but we sure don't want to be changing it
> with any regularity, or for a less-than-excellent reason.

Yea, that's what I expected. It'd probably good for somebody to run
smhasher or such on the output of the combine function (or even better,
on both the 32 and 64 bit variants) in that case.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Continuous integration on Windows?

2017-10-12 Thread Andrew Dunstan


On 10/11/2017 11:04 PM, Thomas Munro wrote:
> Hi hackers,
>
> I don't use Windows myself, but I'd rather avoid submitting patches
> that fail to build, build with horrible warnings or blow up on that
> fine operating system.  I think it would be neat to be able to have
> experimental branches of PostgreSQL built and tested on Windows
> automatically just by pushing them to a watched public git repo.  Just
> like Travis CI and several others do for the major GNU/Linux distros,
> it seems there is at least one Windows-based CI company that
> generously offers free testing to open source projects:
> https://www.appveyor.com (hat tip to Ilmari for the pointer).  I
> wonder... has anyone here with Microsoft know-how ever tried to
> produce an appveyor.yml file that would do a MSVC build and
> check-world?
>


Interesting.

I'm taking a look.

A couple of things not in their pre-built images that we'll need are
flex and bison. We might be able to overcome that with chocolatey, which
is installed; I haven't tested that yet.

Getting a working appveyor.yml will take a little while, though.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] hash partitioning

2017-10-12 Thread Robert Haas
On Thu, Oct 12, 2017 at 3:43 PM, Andres Freund  wrote:
> Are we going to rely on the combine function to stay the same
> forever after?

If we change them, it will be a pg_upgrade compatibility break for
anyone using hash-partitioned tables with more than one partitioning
column.  Dump and reload will also break unless
--load-via-partition-root is used.

In other words, it's not utterly fixed in stone --- we invented
--load-via-partition-root primarily to cope with circumstances that
could change hash values --- but we sure don't want to be changing it
with any regularity, or for a less-than-excellent reason.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] oversight in EphemeralNamedRelation support

2017-10-12 Thread Tom Lane
Julien Rouhaud  writes:
> On Mon, Oct 9, 2017 at 10:43 PM, Thomas Munro
>  wrote:
>> I suppose we could consider moving the schemaname check into
>> getRTEForSpecialRelationType(), since otherwise both callers need to
>> do that (and as you discovered, one forgot).

> Thanks for the feedback.  That was my first idea, but I assumed there
> could be future use for this function on qualified RangeVar if it
> wasn't done this way.

> I agree it'd be much safer, so v2 attached, check moved in
> getRTEForSpecialRelationType().

Hm.  I actually think the bug here is that 18ce3a4ab introduced
anything into setTargetTable at all.  There was never previously
any assumption that the target could be anything but a regular
table, so we just ignored CTEs there, and I do not think the
new behavior is an improvement.

So my proposal is to rip out the getRTEForSpecialRelationTypes
check there.  I tend to agree that getRTEForSpecialRelationTypes
should probably contain an explicit check for unqualified name
rather than relying on its caller ... but that's a matter of
future-proofing not a bug fix.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] hash partitioning

2017-10-12 Thread Andres Freund
On 2017-10-12 10:05:26 -0400, Robert Haas wrote:
> On Thu, Oct 12, 2017 at 9:08 AM, amul sul  wrote:
> > How about combining high 32 bits and the low 32 bits separately as shown 
> > below?
> >
> > static inline uint64
> > hash_combine64(uint64 a, uint64 b)
> > {
> >     return (((uint64) hash_combine((uint32) (a >> 32), (uint32) (b >> 32)) << 32)
> >             | hash_combine((uint32) a, (uint32) b));
> > }
> 
> I doubt that's the best approach, but I don't have something specific
> to recommend.

Yea, that doesn't look great. There's basically no intermixing between
low and high 32 bits going on.  We probably should just expand the
concept of the 32-bit function:

static inline uint32
hash_combine32(uint32 a, uint32 b)
{
/* 0x9e3779b9 is the golden ratio reciprocal */
a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
return a;
}

to something roughly like:

static inline uint64
hash_combine64(uint64 a, uint64 b)
{
/* 0x49A0F4DD15E5A8E3 is 64bit random data */
a ^= b + 0x49A0F4DD15E5A8E3 + (a << 54) + (a >> 7);
return a;
}

In contrast to the 32 bit version's fancy use of the golden ratio
reciprocal as a constant I went brute force, and just used 64bit of
/dev/random. From my understanding the important property is that bits
are independent from each other, nothing else.

The shift widths are fairly random, but they should bring in enough bit
perturbation when mixing in only 32 bits of hash value (i.e.
0x).
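
Failing a full smhasher run, even a crude bit-balance check would catch
gross problems. Here's a sketch of one (the methodology, and the use of
splitmix64 as a stand-in for well-hashed component inputs, are my
assumptions; smhasher tests far more than this):

#include <stdint.h>
#include <stdio.h>
#include <math.h>

/* the 64-bit combiner proposed above */
static inline uint64_t
hash_combine64(uint64_t a, uint64_t b)
{
	a ^= b + 0x49A0F4DD15E5A8E3ULL + (a << 54) + (a >> 7);
	return a;
}

/* splitmix64: used here only to simulate well-hashed component inputs */
static uint64_t
splitmix64(uint64_t x)
{
	x += 0x9E3779B97F4A7C15ULL;
	x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
	x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
	return x ^ (x >> 31);
}

int
main(void)
{
	long		ones[64] = {0};
	long		n = 1000000;
	double		worst = 0.0;

	for (long i = 0; i < n; i++)
	{
		uint64_t	h = hash_combine64(splitmix64(i), splitmix64(~i));

		for (int bit = 0; bit < 64; bit++)
			if (h & (UINT64_C(1) << bit))
				ones[bit]++;
	}

	/* each output bit should be set about half the time */
	for (int bit = 0; bit < 64; bit++)
	{
		double		bias = fabs((double) ones[bit] / n - 0.5);

		if (bias > worst)
			worst = bias;
	}
	printf("worst per-bit bias: %.4f (0 is ideal)\n", worst);
	return 0;
}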

Are we going to rely on the combine function to stay the same
forever after?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
Well, for this given query it is possible. I haven't come across any such
query yet.

Possibly because I am more concerned about the TPC-DS and TPC-H benchmarks,
where it's less likely to occur.

On 13 October 2017 at 00:52, Tom Lane  wrote:

> Gourav Kumar  writes:
> > A Join clause/predicate will only mention 2 relations. It can't have 3 or
> > more relations.
>
> Really?  What of, say,
>
> select ... from a,b,c where (a.x + b.y) = c.z;
>
> regards, tom lane
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Tom Lane
Gourav Kumar  writes:
> A Join clause/predicate will only mention 2 relations. It can't have 3 or
> more relations.

Really?  What of, say,

select ... from a,b,c where (a.x + b.y) = c.z;

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-10-12 Thread Robert Haas
On Thu, Oct 5, 2017 at 9:29 PM, Amit Langote
 wrote:
> Attached a patch to modify the INFO messages in check_default_allows_bound.

Committed.  However, I didn't see a reason to adopt the comment change
you proposed, so I left that part out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw super user checks

2017-10-12 Thread Robert Haas
On Thu, Oct 5, 2017 at 1:02 PM, Jeff Janes  wrote:
> I don't see a reason to block a directly-logged-in superuser from using a
> mapping.  I asked in the closed list whether the current (released)
> behavior was a security bug, and the answer was no.  And I don't know why
> else to block superusers from doing something other than as a security bug.
> Also it would create a backwards compatibility hazard to revoke the ability
> now.

Well, my thought was that we ought to be consistent about whose
authorization matters.  If we're using the view owner's credentials in
general, then we also (defensibly, anyway) ought to use the view
owner's superuser-ness to decide whether to enforce this restriction.
Using either the view owner's superuser-ness or the session user's
superuser-ness kind of puts you halfway in the middle.  The view
owner's rights are what matters mostly, but your own rights also
matter a little bit around the edges.  That's a little strange.

I don't have violently strong opinions about this - does anyone else
have a view?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
A Join clause/predicate will only mention 2 relations. It can't have 3 or
more relations.

On 12 October 2017 at 23:14, Tom Lane  wrote:

> Gourav Kumar  writes:
> > My objective is to construct join graph from a given query.
> > A join graph, has a node for each relation involved in a join, and an
> edge
> > between two relations if they share a join predicate among them.
>
> Hm, well, you could adapt the logic in have_relevant_joinclause() and
> have_relevant_eclass_joinclause().  Or maybe you could just use them
> as-is ... depends on what you have in mind to do with join clauses
> that mention 3 or more relations.
>
> regards, tom lane
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] Parallel Append implementation

2017-10-12 Thread Robert Haas
On Wed, Oct 11, 2017 at 8:51 AM, Amit Khandekar  wrote:
> [ new patch ]

+ parallel_append
+ Waiting to choose the next subplan during Parallel Append plan
+ execution.
+
+

Probably needs to update the morerows value of some earlier entry.

+   enable_parallelappend configuration parameter

How about enable_parallel_append?

+ * pa_finished : workers currently executing the subplan. A worker which

The way the colon is used here is not a standard comment style for PostgreSQL.

+ * Go on to the "next" subplan. If no more subplans, return the empty
+ * slot set up for us by ExecInitAppend.
+ * Note: Parallel-aware Append follows different logic for choosing the
+ * next subplan.

Formatting looks wrong, and moreover I don't think this is the right
way of handling this comment anyway.  Move the existing comment inside
the if (!node->padesc) block and leave it unchanged; the else block
explains the differences for parallel append.

+ *ExecAppendEstimate
+ *
+ *estimates the space required to serialize Append node.

Ugh, this is wrong, but I notice it follows various other
equally-wrong comments for other parallel-aware node types. I guess
I'll go fix that.  We are not actually serializing the Append node here.

I do not think that it's a good idea to call
exec_append_parallel_next() from ExecAppendInitializeDSM,
ExecAppendReInitializeDSM, and ExecAppendInitializeWorker.  We want to
postpone selecting which plan to run until we're actually ready to run
that plan.  Otherwise, for example, the leader might seize a
non-partial plan (if only such plans are included in the Parallel
Append) when it isn't really necessary for it to do so.  If the
workers would've reached the plans and started returning tuples to the
leader before it grabbed a plan, oh well, too bad.  The leader's still
claimed that plan and must now run it.

I concede that's not a high-probability scenario, but I still maintain
that it is better for processes not to claim a subplan until the last
possible moment.  I think we need to initialize as_whichplan as
PA_INVALID_PLAN and then fix it when ExecProcNode() is called for the
first time.

+if (!IsParallelWorker())

This is not a great test, because it would do the wrong thing if we
ever allowed an SQL function called from a parallel worker to run a
parallel query of its own.  Currently that's not allowed but we might
want to allow it someday.  What we really want to test is whether
we're the leader for *this* query.  Maybe use a flag in the
AppendState for that, and set it correctly in
ExecAppendInitializeWorker.

I think maybe the loop in exec_append_parallel_next should look more like this:

    /* Pick the next plan. */
    state->as_whichplan = padesc->pa_next_plan;
    if (state->as_whichplan != PA_INVALID_PLAN)
    {
        int         nextplan = state->as_whichplan;

        /*
         * Mark non-partial plans done immediately so that they can't be
         * picked again.
         */
        if (nextplan < first_partial_plan)
            padesc->pa_finished[nextplan] = true;

        /* Figure out what plan the next worker should pick. */
        do
        {
            /*
             * If we've run through all the plans, loop back through
             * partial plans only.
             */
            if (++nextplan >= state->as_nplans)
                nextplan = first_partial_plan;

            /* No plans remaining or tried them all?  Then give up. */
            if (nextplan == state->as_whichplan || nextplan >= state->as_nplans)
            {
                nextplan = PA_INVALID_PLAN;
                break;
            }
        } while (padesc->pa_finished[nextplan]);

        /* Store the calculated next plan back into shared memory. */
        padesc->pa_next_plan = nextplan;
    }

This might not be exactly right and the comments may need work, but
here are a couple of points:

- As you have it coded, the loop exit condition is whichplan !=
PA_INVALID_PLAN, but that's probably an uncommon case and you have two
other ways out of the loop.  It's easier to understand the code if the
loop condition corresponds to the most common way of exiting the loop,
and any break is for some corner case.

- Don't need a separate call to exec_append_get_next_plan; it's all
handled here (and, I think, pretty compactly).

- No need for pa_first_plan any more.  Looping back to
first_partial_plan is a fine substitute, because by the time anybody
loops around, pa_first_plan would equal first_partial_plan anyway
(unless there's a bug).

- In your code, the value in shared memory is the point at which to
start the search for the next plan.  Here, I made it the value that
the next worker should adopt without question.  Another option would
be to make it the value that the last worker adopted.  I think that
either that option or what I did above are slightly better than what
you have, because as you have it, you've got to use the
increment-with-looping logic in two different places whereas either of
those options needs it in only one place, which makes this a little
simpler.

Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Tom Lane
Gourav Kumar  writes:
> My objective is to construct join graph from a given query.
> A join graph, has a node for each relation involved in a join, and an edge
> between two relations if they share a join predicate among them.

Hm, well, you could adapt the logic in have_relevant_joinclause() and
have_relevant_eclass_joinclause().  Or maybe you could just use them
as-is ... depends on what you have in mind to do with join clauses
that mention 3 or more relations.
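
For instance, roughly like this (the dump helper itself is hypothetical;
have_relevant_joinclause() and the simple_rel_array fields are the real
planner ones, and this assumes it runs inside the planner once
root->simple_rel_array has been populated):

#include "postgres.h"
#include "optimizer/paths.h"	/* have_relevant_joinclause() */

/*
 * Hypothetical debugging helper (not part of PostgreSQL): emit one
 * join-graph edge per pair of base rels that share a relevant join
 * clause or EquivalenceClass.
 */
static void
dump_join_graph(PlannerInfo *root, FILE *fp)
{
	int			i,
				j;

	for (i = 1; i < root->simple_rel_array_size; i++)
	{
		RelOptInfo *rel1 = root->simple_rel_array[i];

		if (rel1 == NULL || rel1->reloptkind != RELOPT_BASEREL)
			continue;

		for (j = i + 1; j < root->simple_rel_array_size; j++)
		{
			RelOptInfo *rel2 = root->simple_rel_array[j];

			if (rel2 == NULL || rel2->reloptkind != RELOPT_BASEREL)
				continue;

			if (have_relevant_joinclause(root, rel1, rel2))
				fprintf(fp, "%d -- %d\n", i, j);
		}
	}
}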

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On markers of changed data

2017-10-12 Thread Robert Haas
On Fri, Oct 6, 2017 at 10:34 AM, Michael Paquier
 wrote:
> On Fri, Oct 6, 2017 at 11:22 PM, Tom Lane  wrote:
>> I'd say no:
>>
>> 1. You don't know the granularity of the filesystem's timestamps, at least
>> not without making unportable assumptions.
>>
>> 2. There's no guarantee that the system clock can't be set backwards.
>>
>> 3. It's not uncommon for filesystems to have optimizations whereby they
>> skip or delay some updates of file mtimes.  (I think this is usually
>> optional, but you couldn't know whether it's turned on.)
>>
>> #2 is probably the worst of these problems.
>
> Or upwards. A simple example of things depending on clock changes is
> for example VM snapshotting. Any logic not depending on monotonic
> timestamps, with things like clock_gettime(CLOCK_MONOTONIC) is a lot
> of fun to investigate until you know that they are not using any
> monotonic logic... So the answer is *no*, do not depend on FS-level
> timestamps. The only sane method for Postgres is really to scan the
> page header LSNs, and of course you already know that.

It might also be worth noting that the FSM is not WAL-logged, so it
probably either doesn't have page LSNs or they are not meaningful.
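
For illustration, that LSN test amounts to something like the sketch
below (my assumption of how a backup tool might phrase it, not actual
PostgreSQL code; 'page' is assumed to be a consistent copy of a
WAL-logged page, which, per the above, excludes the FSM):

#include "postgres.h"
#include "storage/bufpage.h"

/*
 * Sketch only: decide whether a page needs to be included in an
 * incremental backup, given the start LSN of the previous backup.
 */
static bool
page_changed_since(Page page, XLogRecPtr prev_backup_start_lsn)
{
	/* an uninitialized (all-zeroes) page has no LSN; always copy it */
	if (PageIsNew(page))
		return true;

	return PageGetLSN(page) >= prev_backup_start_lsn;
}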

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Help required to debug pg_repack breaking logical replication

2017-10-12 Thread Robert Haas
On Sat, Oct 7, 2017 at 2:37 PM, Daniele Varrazzo
 wrote:
> (with a
> custom addition to update the relfrozenxid which seems backwards to me
> as it sets the older frozen xid on the new table [3]).
>
> [3] https://github.com/reorg/pg_repack/issues/152

Wow.  That's really bad.  It will corrupt your database.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-10-12 Thread Robert Haas
On Wed, Oct 11, 2017 at 10:43 PM, Ashutosh Bapat
 wrote:
> You are suggesting that a dummy partitioned table be treated as an
> un-partitioned table and apply above suggested optimization. A join
> between a partitioned and unpartitioned table is partitioned by the
> keys of only partitioned table. An unpartitioned table doesn't have
> any keys, so this is fine. But a dummy partitioned table does have
> keys. Recording them as keys of the join relation helps when it joins
> to other relations. Furthermore a join between partitioned and
> unpartitioned table doesn't require any equi-join condition on
> partition keys of partitioned table but a join between partitioned
> tables is considered to be partitioned by keys on both sides only when
> there is an equi-join. So, when implementing a partitioned join
> between a partitioned and an unpartitioned table, we will have to make
> a special case to record partition keys when the unpartitioned side is
> actually a dummy partitioned table. That might be awkward.

It seems to me that what we really need here is to move all of this
stuff into a separate struct:

/* used for partitioned relations */
PartitionScheme part_scheme;	/* Partitioning scheme. */
int			nparts;			/* number of partitions */
struct PartitionBoundInfoData *boundinfo;	/* Partition bounds */
struct RelOptInfo **part_rels;	/* Array of RelOptInfos of partitions,
								 * stored in the same order of bounds */
List	  **partexprs;		/* Non-nullable partition key expressions. */
List	  **nullable_partexprs; /* Nullable partition key expressions. */

...and then have a RelOptInfo carry a pointer to a list of those
structures.  That lets us consider multiple possible partition schemes
for the same relation.  For instance, suppose that a user joins four
relations, P1, P2, Q1, and Q2.  P1 and P2 are compatibly partitioned.
Q1 and Q2 are compatibly partitioned (but not compatible with P1 and
P2).

Furthermore, let's suppose that the optimal join order begins with a
join between P1 and Q1.  When we construct the paths for that joinrel,
we can either join all of P1 to all of Q1 (giving up on partition-wise
join), or we can join each partition of P1 to all of Q1 (producing a
result partitioned compatibly with P1 and allowing for a future
partition-wise join to P2), or we can join each partition of Q1 to all
of P1 (producing a result partitioned compatibly with Q1 and allowing
for a future partition-wise join to Q2).  Any of those could win
depending on the details.  With the data structure as it is today,
we'd have to choose whether to mark the joinrel as partitioned like P1
or like Q1, but that's not really what we need here.
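
Concretely, the reshaping might look something like this (a sketch only;
all new names here are invented, not an existing PostgreSQL API):

#include "postgres.h"
#include "nodes/relation.h"

/* hypothetical container for the per-scheme partitioning info above */
typedef struct RelPartitionInfo
{
	PartitionScheme part_scheme;	/* partitioning scheme */
	int			nparts;			/* number of partitions */
	struct PartitionBoundInfoData *boundinfo;	/* partition bounds */
	struct RelOptInfo **part_rels;	/* RelOptInfos of partitions, in
									 * bound order */
	List	  **partexprs;		/* non-nullable partition key exprs */
	List	  **nullable_partexprs; /* nullable partition key exprs */
} RelPartitionInfo;

/*
 * RelOptInfo would then carry, say:
 *
 *		List	   *partition_infos;	(list of RelPartitionInfo *)
 *
 * so the P1/Q1 joinrel above could keep one entry partitioned like P1
 * and another partitioned like Q1.
 */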

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
What is meant by an "unstructured" join condition?

Thanks,
Gourav

On 12 October 2017 at 22:47, Gourav Kumar  wrote:

> My objective is to construct join graph from a given query.
> A join graph, has a node for each relation involved in a join, and an edge
> between two relations if they share a join predicate among them.
>
> To do this I first tried to use the make_join_rel() function
>  - There I checked if root->join_cur_level is 2, and if so, just wrote
> the relation names to a file.
>  - But this strategy failed, because if there is a Cartesian product
> between two relations somewhere, then they can't have an edge in the
> join graph.
>  - So, along with writing the relation names, I need to know if they
> share a join predicate among them or not.
>
>
> On 12 October 2017 at 22:08, Tom Lane  wrote:
>
>> Gourav Kumar  writes:
>> > I have the RelOptInfo data structure for the relations which are to be
>> > joined but when I check their joininfo, it is empty.
>>
>> You aren't telling us anything much about the case you're studying,
>> but if the join clauses have the form of equality comparisons, they
>> likely got converted into EquivalenceClass data structures instead.
>> These days the joininfo lists only contain "unstructured" join
>> conditions.
>>
>> regards, tom lane
>>
>
>
>
> --
> Thanks,
> Gourav Kumar
> Computer Science and Automation
> Indian Institute of Science
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
My objective is to construct join graph from a given query.
A join graph, has a node for each relation involved in a join, and an edge
between two relations if they share a join predicate among them.

To do this I first tried to use the make_join_rel() function
 - There I checked if root->join_cur_level is 2, and if so, just wrote the
relation names to a file.
 - But this strategy failed, because if there is a Cartesian product
between two relations somewhere, then they can't have an edge in the join
graph.
 - So, along with writing the relation names, I need to know if they
share a join predicate among them or not.


On 12 October 2017 at 22:08, Tom Lane  wrote:

> Gourav Kumar  writes:
> > I have the RelOptInfo data structure for the relations which are to be
> > joined but when I check their joininfo, it is empty.
>
> You aren't telling us anything much about the case you're studying,
> but if the join clauses have the form of equality comparisons, they
> likely got converted into EquivalenceClass data structures instead.
> These days the joininfo lists only contain "unstructured" join
> conditions.
>
> regards, tom lane
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] [COMMITTERS] pgsql: Add port/strnlen support to libpq and ecpg Makefiles.

2017-10-12 Thread Tom Lane
I wrote:
> On reflection, let's just go with the solution of building libpgport_lib.a
> with the right flags (fPIC + threading) and leave libpgport.a alone.
> That way we don't need a debate about whether there's an efficiency
> cost worth worrying about.

I looked into this and got discouraged after realizing that not only does
libpq pull in and recompile a bundle of src/port/ files, but also from
src/common/, and even some random backend files like encnames.c and
wchar.c.  We could possibly apply the same build-a-third-way technique to
src/common/, but we'd still end up with messiness for the encoding files.

Perhaps what this is telling us is that encnames.c and wchar.c ought to be
moved to src/common/.

On top of that, there are also some interactions with the MSVC build
system that I don't especially want to debug or untangle.

In short, this is looking like a whole lot more work than we could
ever hope to save by not having to maintain these file lists anymore.
So I'm not planning to pursue it.  If someone else wants to, though,
have at it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Tom Lane
Gourav Kumar  writes:
> I have the RelOptInfo data structure for the relations which are to be
> joined but when I check their joininfo, it is empty.

You aren't telling us anything much about the case you're studying,
but if the join clauses have the form of equality comparisons, they
likely got converted into EquivalenceClass data structures instead.
These days the joininfo lists only contain "unstructured" join
conditions.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
Hi Ashutosh,

I have the RelOptInfo data structure for the relations which are to be
joined but when I check their joininfo, it is empty.
Does baserestrictinfo contain base predicates?

Thanks
Gourav.

On 11 October 2017 at 12:00, Ashutosh Bapat  wrote:

> On Tue, Oct 10, 2017 at 7:29 PM, Gourav Kumar 
> wrote:
> > Hi all,
> >
> > When you fire a query in postgresql, it will first parse the query and
> > create the data structures for storing various aspects of the query and
> > executing the query. (Like RangeTblEntry, PlannerInfo, RangeOptInfo
> etc.).
> >
> > I want to know how postgresql stores the join predicates of a query.
> > Like which data structure is used to store the join predicates.
> >
> > How can we find the join predicates applied on a relation from its
> > relid, Oid or RangeTblEntry?
> >
>
> Every relation has a RelOptInfo associated with it. Predicates
> applicable to it are stored in this RelOptInfo as a list. For base
> relations (simple tables) it's in baserestrictinfo. The join
> predicates applicable are in joininfo. You can get RelOptInfo of a
> given simple table using find_base_rel().
>
> HTH.
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>


Re: [HACKERS] advanced partition matching algorithm for partition-wise join

2017-10-12 Thread Robert Haas
On Wed, Oct 11, 2017 at 7:08 AM, Ashutosh Bapat
 wrote:
> Here's updated patch set based on the basic partition-wise join
> committed. The patchset applies on top of the patch to optimize the
> case of dummy partitioned tables [1].
>
> Right now, the advanced partition matching algorithm bails out when
> either of the joining relations has a default partition.

So is that something you are going to fix?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GUC for cleanup indexes threshold.

2017-10-12 Thread Robert Haas
On Tue, Oct 10, 2017 at 5:55 AM, Pavel Golub  wrote:
> DP> The new status of this patch is: Ready for Committer
>
> Seems like we may also be going to hit it, and it would be cool to have
> this vacuum issue solved for the next PG version.

Exactly which patch on this thread is someone proposing for commit?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Extended statistics is not working on Vars hidden under a RelabelType

2017-10-12 Thread Alvaro Herrera
David Rowley wrote:

> The reason Tomas coded it the way it was coded is due to the fact that
> there's already code that works exactly the same way in
> clauselist_selectivity(). Personally, I don't particularly like that
> code, but I'd rather not invent a new way to do the same thing.

I pushed your original fix.  I still maintain that this should be
written differently, but I'm not going to try to change that in a hurry
and together with a bugfix.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix a typo in execReplication.c

2017-10-12 Thread Robert Haas
On Thu, Oct 12, 2017 at 6:55 AM, Petr Jelinek
 wrote:
> Thanks for the patch, looks correct to me.

Committed and back-patched to v10.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Discussion on missing optimizations

2017-10-12 Thread David Rowley
On 13 October 2017 at 03:00, Tom Lane  wrote:
> Laurenz Albe  writes:
>> Robert Haas wrote:
>>> One trick that some system use is avoid replanning as much as we do
>>> by, for example, saving plans in a shared cache and reusing them even
>>> in other sessions.  That's hard to do in our architecture because the
>>> controlling GUCs can be different in every session and there's not
>>> even any explicit labeling of which GUCs control planner behavior. But
>>> if you had it, then extra planning cycles would be, perhaps, more
>>> tolerable.
>
>> From my experience with Oracle I would say that that is a can of worms.
>
> Yeah, I'm pretty suspicious of the idea too.  We've had an awful lot of
> bad experience with local plan caching, to the point where people wonder
> why we don't just auto-replan every time.  How would a shared cache
> make that better?  (Even assuming it was otherwise free, which it
> surely won't be.)

One idea I had when working on unique joins was that we could use it
to determine if a plan could only return 0 or 1 row by additionally
testing the baserestrictinfo of each rel to see if an equality OpExpr
with a Const matches a unique constraint, which would serve as a
guarantee that only 0 or 1 row would match. I thought that these
single-row plans could be useful as a start for some pro-active plan
cache. Such plans should be stable, as they're not prone to any data
skew that could cause a plan change. You might think very few plans
would fit the bill, but you'd likely find that the majority of UPDATE
and DELETE queries run in an OLTP application would be eligible, and
likely some decent percentage of SELECTs too.

I had imagined this would be some backend-local cache that saved MRU
plans up to some size of memory defined by a GUC, where 0 would
disable the feature. I never got any further than those thoughts.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query started showing wrong result after Ctrl+c

2017-10-12 Thread Tom Lane
tushar  writes:
> On 10/12/2017 03:46 PM, Marko Tiikkaja wrote:
>> The subquery:
>>     select n from tv limit 1
>> could in theory return any row due to the lack of ORDER BY. What I'm 
>> guessing happened is that you're seeing a synchronized sequential scan 
>> in follow-up queries.  Add an ORDER BY.

> Bang on.  After adding the order by clause I am getting the same result
> consistently.  But why did I get a different result after canceling the
> query?

If you let the query run to completion, the syncscan start pointer will
return to the start of the table.  Cancelling it partway through allows
the syncscan pointer to be left pointing somewhere in the middle of the
table.

If you turn off synchronize_seqscans, you should find that you always
get the physically-first table row from that subselect.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Discussion on missing optimizations

2017-10-12 Thread Robert Haas
On Thu, Oct 12, 2017 at 10:00 AM, Tom Lane  wrote:
>> From my experience with Oracle I would say that that is a can of worms.
>
> Yeah, I'm pretty suspicious of the idea too.  We've had an awful lot of
> bad experience with local plan caching, to the point where people wonder
> why we don't just auto-replan every time.  How would a shared cache
> make that better?  (Even assuming it was otherwise free, which it
> surely won't be.)

Obviously it wouldn't.  But it might make other things better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] hash partitioning

2017-10-12 Thread Robert Haas
On Thu, Oct 12, 2017 at 9:08 AM, amul sul  wrote:
> How about combining high 32 bits and the low 32 bits separately as shown 
> below?
>
> static inline uint64
> hash_combine64(uint64 a, uint64 b)
> {
>     return (((uint64) hash_combine((uint32) (a >> 32), (uint32) (b >> 32)) << 32)
>             | hash_combine((uint32) a, (uint32) b));
> }

I doubt that's the best approach, but I don't have something specific
to recommend.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Discussion on missing optimizations

2017-10-12 Thread Tom Lane
Laurenz Albe  writes:
> Robert Haas wrote:
>> One trick that some system use is avoid replanning as much as we do
>> by, for example, saving plans in a shared cache and reusing them even
>> in other sessions.  That's hard to do in our architecture because the
>> controlling GUCs can be different in every session and there's not
>> even any explicit labeling of which GUCs control planner behavior. But
>> if you had it, then extra planning cycles would be, perhaps, more
>> tolerable.

> From my experience with Oracle I would say that that is a can of worms.

Yeah, I'm pretty suspicious of the idea too.  We've had an awful lot of
bad experience with local plan caching, to the point where people wonder
why we don't just auto-replan every time.  How would a shared cache
make that better?  (Even assuming it was otherwise free, which it
surely won't be.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] BLK_DONE state in XLogReadBufferForRedoExtended

2017-10-12 Thread Amit Kapila
Today, I was trying to think about cases when we can return BLK_DONE
in XLogReadBufferForRedoExtended.  One thing that occurred to me is
that it can happen during the replay of WAL if full_page_writes is
off.  Basically, when full_page_writes is on, then during crash
recovery, we will always first restore the full page image of the page
and then apply the WAL corresponding to that page, so we will never hit
the case where the lsn of the page is greater than the lsn of the WAL
record.

Are there other cases in which we can get the BLK_DONE state?  Is it
mentioned somewhere in the code/comments that I am missing?
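
For reference, the test that produces BLK_DONE is roughly the following
(a paraphrase of my own, not a verbatim quote of the xlogutils.c code;
details may differ by branch):

#include "postgres.h"
#include "access/xlogutils.h"	/* XLogRedoAction */
#include "storage/bufmgr.h"

/* paraphrase of the check inside XLogReadBufferForRedoExtended() */
static XLogRedoAction
classify_block(XLogRecPtr record_lsn, Buffer buf)
{
	/*
	 * If the page's LSN is already at or past the record being replayed,
	 * its effects are on the page and redo can be skipped.
	 */
	if (record_lsn <= PageGetLSN(BufferGetPage(buf)))
		return BLK_DONE;

	return BLK_NEEDS_REDO;
}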

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-12 Thread David Rowley
On 13 October 2017 at 02:17, Alvaro Herrera  wrote:
> I propose this slightly larger change.

hmm, this is not right. You're not checking that there's a Var below
the RelabelType.

I tried with:

explain select * from ab where (a||a)::varchar = '' and b = '';

and your code assumed the OpExpr was a Var.

The reason Tomas coded it the way it was coded is due to the fact that
there's already code that works exactly the same way in
clauselist_selectivity(). Personally, I don't particularly like that
code, but I'd rather not invent a new way to do the same thing.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Extended statistics is not working on Vars hidden under a RelabelType

2017-10-12 Thread Alvaro Herrera
Tomas Vondra wrote:
> On 10/10/2017 05:03 AM, David Rowley wrote:
> > Basically, $subject is causing us not to properly find matching
> > extended stats in this case.
> > 
> > The attached patch fixes it.
> > 
> > The following test cases is an example of the misbehaviour. Note
> > rows=1 vs rows=98 in the Gather node.
> 
> Thanks for noticing this. The patch seems fine to me.

I propose this slightly larger change.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 2e7c0ad6ba..0a51639d9d 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -736,12 +736,9 @@ pg_dependencies_send(PG_FUNCTION_ARGS)
  * dependency_is_compatible_clause
  *		Determines if the clause is compatible with functional dependencies
  *
- * Only OpExprs with two arguments using an equality operator are supported.
- * When returning True attnum is set to the attribute number of the Var within
- * the supported clause.
- *
- * Currently we only support Var = Const, or Const = Var. It may be possible
- * to expand on this later.
+ * Returns true, and sets *attnum to the attribute number of the Var in the
+ * clause, if the clause is compatible with functional dependencies on the
+ * given relation.  Otherwise, return false.
  */
 static bool
 dependency_is_compatible_clause(Node *clause, Index relid, AttrNumber *attnum)
@@ -762,39 +759,47 @@ dependency_is_compatible_clause(Node *clause, Index relid, AttrNumber *attnum)
 	if (is_opclause(rinfo->clause))
 	{
 		OpExpr	   *expr = (OpExpr *) rinfo->clause;
+		Node	   *left;
+		Node	   *right;
 		Var		   *var;
-		bool		varonleft = true;
-		bool		ok;
 
-		/* Only expressions with two arguments are considered compatible. */
+		/* Only expressions with two arguments are considered compatible ... */
 		if (list_length(expr->args) != 2)
 			return false;
 
-		/* see if it actually has the right */
-		ok = (NumRelids((Node *) expr) == 1) &&
-			(is_pseudo_constant_clause(lsecond(expr->args)) ||
-			 (varonleft = false,
-			  is_pseudo_constant_clause(linitial(expr->args))));
-
-		/* unsupported structure (two variables or so) */
-		if (!ok)
+		/* ... and only if they operate on a single relation */
+		if (NumRelids((Node *) expr) != 1)
 			return false;
 
 		/*
-		 * If it's not "=" operator, just ignore the clause, as it's not
-		 * compatible with functional dependencies.
-		 *
-		 * This uses the function for estimating selectivity, not the operator
-		 * directly (a bit awkward, but well ...).
+		 * See which one is the pseudo-constant one, if any. If neither is, the
+		 * clause is not compatible.
 		 */
-		if (get_oprrest(expr->opno) != F_EQSEL)
+		left = (Node *) linitial(expr->args);
+		right = (Node *) lsecond(expr->args);
+
+		if (IsA(left, Var) || IsA(left, RelabelType))
+		{
+			if (!is_pseudo_constant_clause(right))
+				return false;
+			var = (Var *) left;
+		}
+		else if (IsA(right, Var) || IsA(right, RelabelType))
+		{
+			if (!is_pseudo_constant_clause(left))
+				return false;
+			var = (Var *) right;
+		}
+		else
 			return false;
 
-		var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
-
-		/* We only support plain Vars for now */
-		if (!IsA(var, Var))
-			return false;
+		/*
+		 * We may ignore any RelabelType node above the operand.  (There won't
+		 * be more than one, since eval_const_expressions() has been applied
+		 * already.)
+		 */
+		if (IsA(var, RelabelType))
+			var = (Var *) ((RelabelType *) var)->arg;
 
 		/* Ensure var is from the correct relation */
 		if (var->varno != relid)
@@ -808,6 +813,16 @@ dependency_is_compatible_clause(Node *clause, Index relid, AttrNumber *attnum)
 		if (!AttrNumberIsForUserDefinedAttr(var->varattno))
 			return false;
 
+		/*
+		 * If it's not "=" operator, just ignore the clause, as it's not
+

Re: [HACKERS] Parallel Bitmap Heap Scans segfaults due to (tbm->dsa==NULL) on PostgreSQL 10

2017-10-12 Thread Dilip Kumar
On Thu, Oct 12, 2017 at 6:37 PM, Tomas Vondra
 wrote:
>
>
> On 10/12/2017 02:40 PM, Dilip Kumar wrote:
>> On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra
>>  wrote:
>>> Hi,
>>>
>>> It seems that Q19 from TPC-H is consistently failing with segfaults due
>>> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>>>
>>> I'm not very familiar with how the dsa is initialized and passed around,
>>> but I only see the failures when the bitmap is constructed by a mix of
>>> BitmapAnd and BitmapOr operations.
>>>
>> I think I have got the issue: bitmap_subplan_mark_shared is not
>> properly pushing the isshared flag down to the lower-level bitmap index
>> node, and because of that tbm_create is passed a NULL dsa while
>> creating the tidbitmap.  So this problem comes up in a very specific
>> combination of BitmapOr and BitmapAnd, when a BitmapAnd is the first
>> subplan of the BitmapOr.  If a BitmapIndex is the first subplan under
>> the BitmapOr then there is no problem, because the BitmapOr node will
>> create the tbm by itself and isshared is set for the BitmapOr.
>>
>> Attached patch fixes the issue for me.  I will thoroughly test this
>> patch with other scenarios as well.  Thanks for reporting.
>>
>
> Yep, this fixes the failures for me.
>
Thanks for confirming.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] hash partitioning

2017-10-12 Thread amul sul
On Thu, Oct 12, 2017 at 6:31 AM, Robert Haas  wrote:
> On Tue, Oct 10, 2017 at 7:07 AM, amul sul  wrote:
>> How about the attached patch(0003)?
>> Also, the dim variable is renamed to natts.
>
> I'm not sure I believe this comment:
>
> +/*
> + * We arrange the partitions in the ascending order of their modulus
> + * and remainders.  Also every modulus is factor of next larger
> + * modulus.  This means that the index of a given partition is same 
> as
> + * the remainder of that partition.  Also entries at (remainder + N *
> + * modulus) positions in indexes array are all same for (modulus,
> + * remainder) specification for any partition.  Thus datums array 
> from
> + * both the given bounds are same, if and only if their indexes array
> + * will be same.  So, it suffices to compare indexes array.
> + */
>
> I am particularly not sure that I believe that the index of a
> partition must be the same as the remainder.  It doesn't seem like
> that would be true when there is more than one modulus or when some
> partitions are missing.
>

Looks like the explanation in the comment is not good enough; I will think on this.

Here are the links for the previous discussion:
1] 
https://postgr.es/m/CAFjFpRfHqSGBjNgJV2p%2BC4Yr5Qxvwygdsg4G_VQ6q9NTB-i3MA%40mail.gmail.com
2] 
https://postgr.es/m/CAFjFpRdeESKFkVGgmOdYvmD3d56-58c5VCBK0zDRjHrkq_VcNg%40mail.gmail.com


> +if (offset < 0)
> +{
> +next_modulus = DatumGetInt32(datums[0][0]);
> +valid_modulus = (next_modulus % spec->modulus) == 0;
> +}
> +else
> +{
> +prev_modulus = DatumGetInt32(datums[offset][0]);
> +valid_modulus = (spec->modulus % prev_modulus) == 0;
> +
> +if (valid_modulus && (offset + 1) < ndatums)
> +{
> +next_modulus =
> DatumGetInt32(datums[offset + 1][0]);
> +valid_modulus = (next_modulus %
> spec->modulus) == 0;
> +}
> +}
>
> I don't think this is quite right.  It checks the new modulus against
> prev_modulus whenever prev_modulus is defined, which is correct, but
> it doesn't check the new modulus against the next_modulus except when
> offset < 0.  But actually that check needs to be performed, I think,
> whenever the new modulus is less than the greatest modulus seen so
> far.
>
It does.  See the "if (valid_modulus && (offset + 1) < ndatums)" block in
the else part of the snippet that you are referring to.

For example, a new modulus of 25 or 150 is not accepted for a hash
partition bound with existing moduli 10, 50, 200 (see the sketch below).
Will cover this test as well.
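
-- a hypothetical sketch, assuming table hp already has partitions
-- with moduli 10, 50 and 200:
CREATE TABLE hp_25 PARTITION OF hp
  FOR VALUES WITH (MODULUS 25, REMAINDER 3);
-- rejected: 25 is not a multiple of the next lower modulus, 10
CREATE TABLE hp_150 PARTITION OF hp
  FOR VALUES WITH (MODULUS 150, REMAINDER 3);
-- rejected: 150 is a multiple of 50, but not a factor of the next
-- higher modulus, 200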

> + * For a partitioned table defined as:
> + *CREATE TABLE simple_hash (a int, b char(10)) PARTITION BY HASH (a, b);
> + *
> + * CREATE TABLE p_p1 PARTITION OF simple_hash
> + *FOR VALUES WITH (MODULUS 2, REMAINDER 1);
> + * CREATE TABLE p_p2 PARTITION OF simple_hash
> + *FOR VALUES WITH (MODULUS 4, REMAINDER 2);
> + * CREATE TABLE p_p3 PARTITION OF simple_hash
> + *FOR VALUES WITH (MODULUS 8, REMAINDER 0);
> + * CREATE TABLE p_p4 PARTITION OF simple_hash
> + *FOR VALUES WITH (MODULUS 8, REMAINDER 4);
> + *
> + * This function will return one of the following in the form of an
> + * expression:
> + *
> + * for p_p1: satisfies_hash_partition(2, 1, hash_fn_1_extended(a, HASH_SEED),
> + * hash_fn_2_extended(b,
> HASH_SEED))
> + * for p_p2: satisfies_hash_partition(4, 2, hash_fn_1_extended(a, HASH_SEED),
> + * hash_fn_2_extended(b,
> HASH_SEED))
> + * for p_p3: satisfies_hash_partition(8, 0, hash_fn_1_extended(a, HASH_SEED),
> + * hash_fn_2_extended(b,
> HASH_SEED))
> + * for p_p4: satisfies_hash_partition(8, 4, hash_fn_1_extended(a, HASH_SEED),
> + * hash_fn_2_extended(b,
> HASH_SEED))
>
> I think instead of this lengthy example you should try to explain the
> general rule.  Maybe something like: the partition constraint for a
> hash partition is always a call to the built-in function
> satisfies_hash_partition().  The first two arguments are the modulus
> and remainder for the partition; the remaining arguments are the hash
> values computed for each column of the partition key using the
> extended hash function from the appropriate opclass.
>
Okay, will add this.

> +static uint64
> +mix_hash_value(int nkeys, Datum *hash_array, bool *isnull)
>
How about combining the high 32 bits and the low 32 bits separately, as
shown below?

static inline uint64
hash_combine64(uint64 a, uint64 b)
{
	/* combine the two high halves, then the two low halves */
	return (((uint64) hash_combine((uint32) (a >> 32), (uint32) (b >> 32)) << 32)
			| hash_combine((uint32) a, (uint32) b));
}

Re: [HACKERS] Parallel Bitmap Heap Scans segfaults due to (tbm->dsa==NULL) on PostgreSQL 10

2017-10-12 Thread Tomas Vondra


On 10/12/2017 02:40 PM, Dilip Kumar wrote:
> On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra
>  wrote:
>> Hi,
>>
>> It seems that Q19 from TPC-H is consistently failing with segfaults due
>> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>>
>> I'm not very familiar with how the dsa is initialized and passed around,
>> but I only see the failures when the bitmap is constructed by a mix of
>> BitmapAnd and BitmapOr operations.
>>
> I think I have got the issue: bitmap_subplan_mark_shared is not
> properly pushing the isshared flag down to the lower-level bitmap index
> node, and because of that tbm_create is passed a NULL dsa while
> creating the tidbitmap.  So this problem comes up in a very specific
> combination of BitmapOr and BitmapAnd, when a BitmapAnd is the first
> subplan of the BitmapOr.  If a BitmapIndex is the first subplan under
> the BitmapOr then there is no problem, because the BitmapOr node will
> create the tbm by itself and isshared is set for the BitmapOr.
> 
> Attached patch fixes the issue for me.  I will thoroughly test this
> patch with other scenarios as well.  Thanks for reporting.
> 

Yep, this fixes the failures for me.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Bitmap Heap Scans segfaults due to (tbm->dsa==NULL) on PostgreSQL 10

2017-10-12 Thread Dilip Kumar
On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra
 wrote:
> Hi,
>
> It seems that Q19 from TPC-H is consistently failing with segfaults due
> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>
> I'm not very familiar with how the dsa is initialized and passed around,
> but I only see the failures when the bitmap is constructed by a mix of
> BitmapAnd and BitmapOr operations.
>
I think I have got the issue: bitmap_subplan_mark_shared is not
properly pushing the isshared flag down to the lower-level bitmap index
node, and because of that tbm_create is passed a NULL dsa while
creating the tidbitmap.  So this problem comes up in a very specific
combination of BitmapOr and BitmapAnd, when a BitmapAnd is the first
subplan of the BitmapOr.  If a BitmapIndex is the first subplan under
the BitmapOr then there is no problem, because the BitmapOr node will
create the tbm by itself and isshared is set for the BitmapOr.
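
A predicate shape that can produce this pattern, sketched from the Q19
report elsewhere in this thread (the actual plan shape of course depends
on the available indexes and costing):

SELECT * FROM part
WHERE (p_size BETWEEN 1 AND 5
       AND p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG'))
   OR (p_size BETWEEN 1 AND 10
       AND p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK'));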

Attached patch fixes the issue for me.  I will thoroughly test this
patch with other scenarios as well.  Thanks for reporting.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 5c934f2..cc7590b 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -4922,7 +4922,11 @@ bitmap_subplan_mark_shared(Plan *plan)
 		bitmap_subplan_mark_shared(
    linitial(((BitmapAnd *) plan)->bitmapplans));
 	else if (IsA(plan, BitmapOr))
+	{
 		((BitmapOr *) plan)->isshared = true;
+		bitmap_subplan_mark_shared(
+  linitial(((BitmapOr *) plan)->bitmapplans));
+	}
 	else if (IsA(plan, BitmapIndexScan))
 		((BitmapIndexScan *) plan)->isshared = true;
 	else

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fwd: [BUGS] BUG #14850: Implement optinal additinal parameter for 'justify' date/time function

2017-10-12 Thread KES
-------- Forwarded message --------
11.10.2017, 17:12, "Pavel Stehule" :

Hi

2017-10-11 12:35 GMT+02:00  :

The following bug has been logged on the website:

Bug reference:      14850
Logged by:          Eugen Konkov
Email address:      kes-...@yandex.ru
PostgreSQL version: 10.0
Operating system:   Linux mint 18: Linux work 4.4.0-57-generic #78-Ubu
Description:

Hi. I am trying to do the following math:

select extract( month from justify_days( timestamp '2016-01-31' +interval '1
month' -timestamp '2016-01-31') );
 date_part
---
         0
(1 row)

I expect `1` but get `0`. But here everything is right:

>Adjust interval so 30-day time periods are represented as months

https://www.postgresql.org/docs/9.6/static/functions-datetime.html

But with the ability to set the justification date, the math would be
more precise.

Please implement the following feature:

select extract( month from justify_days( timestamp '2016-01-31' +interval '1
month' -timestamp '2016-01-31'), timestamp '2016-01-31' );
 date_part
---
         1
(1 row)

This is useful when I try to calculate how many months are left between
service start and end dates.

This is not a bug, so pgsql-hackers or pgsql-general are better places
for this discussion.  I think your request makes sense, and it should be
registered in the ToDo list: https://wiki.postgresql.org/wiki/Todo

You can try to contact people from the PostgreSQL Pro company about an
implementation.

Regards

Pavel

Thank you.


--
Sent via pgsql-bugs mailing list (pgsql-b...@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

-------- End of forwarded message --------



[HACKERS] Parallel Bitmap Heap Scans segfaults due to (tbm->dsa==NULL) on PostgreSQL 10

2017-10-12 Thread Tomas Vondra
Hi,

It seems that Q19 from TPC-H is consistently failing with segfaults due
to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).

I'm not very familiar with how the dsa is initialized and passed around,
but I only see the failures when the bitmap is constructed by a mix of
BitmapAnd and BitmapOr operations.

Another interesting observation is that setting force_parallel_mode=on
may not be enough - there really need to be multiple parallel workers,
which is why the minimal query sets cpu_tuple_cost=1.
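
For instance, settings along these lines (a sketch; the attached minimal
query relies on cpu_tuple_cost):

SET force_parallel_mode = on;
SET cpu_tuple_cost = 1;   -- make a plan with multiple workers attractive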

Attached is a bunch of files:

1) details for "full" query:

* query.sql
* plan.txt
* backtrace.txt

2) details for the "minimal" query triggering the issue:

* query-minimal.sql
* plan-minimal.txt
* backtrace-minimal.txt



regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Program terminated with signal 6, Aborted.
#0  0x7fe21265d1f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64
(gdb) bt
#0  0x7fe21265d1f7 in raise () from /lib64/libc.so.6
#1  0x7fe21265e8e8 in abort () from /lib64/libc.so.6
#2  0x008468e7 in ExceptionalCondition 
(conditionName=conditionName@entry=0x9d213a "!(tbm->dsa != ((void *)0))", 
errorType=errorType@entry=0x88fc69 "FailedAssertion", 
fileName=fileName@entry=0x9d2014 "tidbitmap.c", 
lineNumber=lineNumber@entry=800) at assert.c:54
#3  0x0065b04f in tbm_prepare_shared_iterate (tbm=tbm@entry=0x2b244e8) 
at tidbitmap.c:800
#4  0x0062294a in BitmapHeapNext (node=node@entry=0x2adf118) at 
nodeBitmapHeapscan.c:155
#5  0x00616d7a in ExecScanFetch (recheckMtd=0x623050 
<BitmapHeapRecheck>, accessMtd=0x622250 <BitmapHeapNext>, node=0x2adf118) at 
execScan.c:97
#6  ExecScan (node=0x2adf118, accessMtd=0x622250 <BitmapHeapNext>, 
recheckMtd=0x623050 <BitmapHeapRecheck>) at execScan.c:147
#7  0x00624c75 in ExecProcNode (node=0x2adf118) at 
../../../src/include/executor/executor.h:250
#8  gather_getnext (gatherstate=0x2aded50) at nodeGather.c:281
#9  ExecGather (pstate=0x2aded50) at nodeGather.c:215
#10 0x00610d12 in ExecProcNode (node=0x2aded50) at 
../../../src/include/executor/executor.h:250
#11 ExecutePlan (execute_once=<optimized out>, dest=0x2b09220, 
direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', 
operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x2aded50, 
estate=0x2adeb00) at execMain.c:1721
#12 standard_ExecutorRun (queryDesc=0x2a3bdf0, direction=<optimized out>, 
count=0, execute_once=<optimized out>) at execMain.c:363
#13 0x0074b50b in PortalRunSelect (portal=portal@entry=0x2a34050, 
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, 
dest=dest@entry=0x2b09220) at pquery.c:932
#14 0x0074ca18 in PortalRun (portal=portal@entry=0x2a34050, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', 
run_once=run_once@entry=1 '\001', dest=dest@entry=0x2b09220, 
altdest=altdest@entry=0x2b09220, 
completionTag=completionTag@entry=0x7ffc8dad21c0 "") at pquery.c:773
#15 0x0074875b in exec_simple_query (
query_string=0x2a96ff0 "select\n*\nfrom\npart\nwhere\n(\n   
 p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')\nand p_size 
between 1 and 5\n)\nor\n(\np_container in ('MED BAG', 'MED 
B"...) at postgres.c:1099
#16 0x00749a03 in PostgresMain (argc=<optimized out>, 
argv=argv@entry=0x2a44048, dbname=0x2a43eb0 "test", username=<optimized out>) 
at postgres.c:4088
#17 0x0047665f in BackendRun (port=0x2a37cc0) at postmaster.c:4357
#18 BackendStartup (port=0x2a37cc0) at postmaster.c:4029
#19 ServerLoop () at postmaster.c:1753
#20 0x006d70d9 in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x2a14b20) at postmaster.c:1361
#21 0x004774c1 in main (argc=3, argv=0x2a14b20) at main.c:228

   QUERY PLAN   



 Gather
   Workers Planned: 2
   ->  Parallel Bitmap Heap Scan on part
 Recheck Cond: (((p_size <= 5) AND (p_size >= 1) AND (p_container = ANY 
('{"SM CASE","SM BOX","SM PACK","SM PKG"}'::bpchar[]))) OR ((p_container = ANY 
('{"MED BAG","MED BOX","MED PKG","MED PACK"}'::bpchar[])) AND (p_size <= 10) 
AND (p_size >= 1)))
 ->  BitmapOr
   ->  BitmapAnd
 ->  Bitmap Index Scan on part_p_size_idx
   Index Cond: ((p_size <= 5) AND (p_size >= 1))
 ->  Bitmap Index Scan on 
part_p_container_p_brand_p_partkey_idx
   Index Cond: (p_container = ANY ('{"SM 

Re: [HACKERS] Fix a typo in execReplication.c

2017-10-12 Thread Petr Jelinek
On 12/10/17 00:52, Masahiko Sawada wrote:
> On Thu, Oct 12, 2017 at 5:02 AM, Robert Haas  wrote:
>> On Mon, Oct 9, 2017 at 10:59 PM, Masahiko Sawada  
>> wrote:
>>> Attached a patch for $subject. ISTM these are mistakes of copy-and-paste.
>>
>> Committed, but isn't the code itself wrong as well in the DELETE case?
>>
>> /* BEFORE ROW DELETE Triggers */
>> if (resultRelInfo->ri_TrigDesc &&
>> resultRelInfo->ri_TrigDesc->trig_update_before_row)
>> {
>> skip_tuple = !ExecBRDeleteTriggers(estate, epqstate, resultRelInfo,
>>&searchslot->tts_tuple->t_self,
>>NULL);
>> }
>>
>> Why not trig_delete_before_row?
>>

Indeed, that's another copy-pasto.
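
Presumably the fix is just the one-line condition change, i.e. something
like:

/* BEFORE ROW DELETE Triggers */
if (resultRelInfo->ri_TrigDesc &&
	resultRelInfo->ri_TrigDesc->trig_delete_before_row)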

> 
> Thank you!
> I think you're right. It should be trig_delete_before_row and be
> back-patched to 10. Attached patch fixes it.
> 

Thanks for the patch, looks correct to me.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query started showing wrong result after Ctrl+c

2017-10-12 Thread tushar

On 10/12/2017 03:46 PM, Marko Tiikkaja wrote:

The subquery:

    select n from tv limit 1

could in theory return any row due to the lack of ORDER BY. What I'm 
guessing happened is that you're seeing a synchronized sequential scan 
in follow-up queries.  Add an ORDER BY.


Bang on.  After adding an ORDER BY clause I am getting the same result 
consistently.  But why did I get a different result only after canceling 
the query?


test=# \c f2
You are now connected to database "f2" as user "centos".
f2=# create table tv(n int,n1 char(100));
CREATE TABLE
f2=# insert into tv values (generate_series(1,100),'aaa');
INSERT 0 100
f2=# insert into tv values (generate_series(1,100),'a');
INSERT 0 990001
f2=# analyze tv;
ANALYZE
f2=# vacuum tv;
VACUUM
f2=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * from 
(select n from tv  limit 1) c)) as c  ;

 n
---
 1
(1 row)

f2=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * from 
(select n from tv  limit 1) c)) as c  ;

 n
---
 1
(1 row)

f2=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * from 
(select n from tv  limit 1) c)) as c  ;

 n
---
 1
(1 row)

f2=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * from 
(select n from tv  limit 1) c)) as c  ;

 n
---
 1
(1 row)

f2=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * from 
(select n from tv  limit 1) c)) as c  ;

 n
---
 1
(1 row)

f2=#

Even after restarting the server, I am getting the same result.

But after canceling an operation, the next run gives a different result?

f2=# SELECT  *  FROM ( SELECT n   from  tv  where n!=ALL (select * from 
(select n from tv) c)) as c  ;

^CCancel request sent
ERROR:  canceling statement due to user request
f2=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * from 
(select n from tv  limit 1) c)) as c  ;

  n
--
 3713
(1 row)

--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query started showing wrong result after Ctrl+c

2017-10-12 Thread Marko Tiikkaja
On Thu, Oct 12, 2017 at 12:03 PM, tushar 
wrote:

> postgres=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * from
> (select n from tv limit 1) c)) as c  ;
>   n
> --
>  3713
> (1 row)
>
> This time , query is started showing wrong result.  Is this an expected
> behavior and if yes -then how to get the correct result ?


The subquery:

select n from tv limit 1

could in theory return any row due to the lack of ORDER BY.  What I'm
guessing happened is that you're seeing a synchronized sequential scan in
follow-up queries.  Add an ORDER BY.
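
Something like this (a sketch) pins the subquery down:

select n from tv order by n limit 1

For testing, the synchronized-scan behaviour can also be switched off
with SET synchronize_seqscans = off, but an ORDER BY is the proper fix.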


.m


[HACKERS] Query started showing wrong result after Ctrl+c

2017-10-12 Thread tushar

Hi,

Steps to reproduce -

\\ PG HEAD / PG v10 sources.  Connect to a psql terminal and create the 
following objects


create table tv(n int,n1 char(100));
insert into tv values (generate_series(1,100),'aaa');
insert into tv values (generate_series(1,100),'a');
analyze tv;
vacuum tv;

\\1st  query

postgres=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * 
from (select n from tv limit 1) c)) as c  ;

 n
---
 1
(1 row)

\\2nd query
postgres=# SELECT  *  FROM ( SELECT n   from  tv  where n!=ALL (select * 
from (select n from tv) c)) as c  ;  (query was taking time, so pressed 
Ctrl-C)
^C2017-10-12 10:54:49.004 BST [9073] ERROR:  canceling statement due to 
user request
2017-10-12 10:54:49.004 BST [9073] STATEMENT:  SELECT  *  FROM ( SELECT 
n   from  tv  where n!=ALL (select * from (select n from tv) c)) as c  ;
2017-10-12 10:54:49.004 BST [9129] FATAL:  terminating connection due to 
administrator command
2017-10-12 10:54:49.004 BST [9129] STATEMENT:  SELECT  *  FROM ( SELECT 
n   from  tv  where n!=ALL (select * from (select n from tv) c)) as c  ;
2017-10-12 10:54:49.004 BST [9130] FATAL:  terminating connection due to 
administrator command
2017-10-12 10:54:49.004 BST [9130] STATEMENT:  SELECT  *  FROM ( SELECT 
n   from  tv  where n!=ALL (select * from (select n from tv) c)) as c  ;

Cancel request sent
2017-10-12 10:54:49.005 BST [9058] LOG:  background worker "parallel 
worker" (PID 9129) exited with exit code 1
2017-10-12 10:54:49.005 BST [9058] LOG:  background worker "parallel 
worker" (PID 9130) exited with exit code 1

ERROR:  canceling statement due to user request

\\again fired 1st query

postgres=# vacuum ANALYZE tv;
VACUUM
postgres=# SELECT  *  FROM ( SELECT n   from  tv  where n= (select * 
from (select n from tv limit 1) c)) as c  ;

  n
--
 3713
(1 row)

This time, the query has started showing a wrong result.  Is this 
expected behavior, and if yes, how do I get the correct result?


--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-10-12 Thread Pavel Stehule
2017-09-19 20:49 GMT+02:00 Merlin Moncure :

> On Tue, Sep 19, 2017 at 1:37 PM, Robert Haas 
> wrote:
> > On Tue, Sep 19, 2017 at 12:45 PM, Pavel Stehule 
> wrote:
> >>> You can already set a GUC with function scope.  I'm not getting your
> >>> point.
> >>
> >> yes, it is true. But implementation of #option is limited to PLpgSQL -
> so
> >> there is not any too much questions - GUC is global - there is lot of
> >> points:
> >>
> >> * what is correct impact on PREPARE
> >> * what is correct impact on EXECUTE
> >> * what should be done if this GUC is changed ..
> >
> > For better or for worse, as a project we've settled on GUCs as a way
> > to control behavior.  I think it makes more sense to try to apply that
> > option to new behaviors we want to control than to invent some new
> > system.
>
> This seems very sensible.
>
> We also have infrastructure at the SQL level (SET) to manage the GUC.
> Tom upthread (for pretty good reasons) extending SET to pl/pgsql
> specific scoping but TBH I'm struggling as to why we need to implement
> new syntax for this; the only thing missing is being able to scope SET
> statements to a code block FWICT.
>
>
Here is a GUC-based patch for controlling the plan cache.  It looks like
this code is working.

It is hard to create regression tests. Any ideas?  One possibility is
sketched below.
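
-- a sketch of a possible test: EXPLAIN can distinguish the two plan
-- kinds, since a generic plan keeps $1 in the filter while a custom
-- plan folds in the literal (the statement name below is made up):
SET plancache_mode = force_generic_plan;
PREPARE p(int) AS SELECT count(*) FROM pg_class WHERE relpages > $1;
EXPLAIN (COSTS OFF) EXECUTE p(0);   -- Filter: (relpages > $1)

DEALLOCATE p;
SET plancache_mode = force_custom_plan;
PREPARE p(int) AS SELECT count(*) FROM pg_class WHERE relpages > $1;
EXPLAIN (COSTS OFF) EXECUTE p(0);   -- Filter: (relpages > 0)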

Regards

Pavel



> merlin
>
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index ad8a82f1e3..cc99cf6dcc 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -106,6 +106,8 @@ static void PlanCacheRelCallback(Datum arg, Oid relid);
 static void PlanCacheFuncCallback(Datum arg, int cacheid, uint32 hashvalue);
 static void PlanCacheSysCallback(Datum arg, int cacheid, uint32 hashvalue);
 
+/* GUC parameter */
+int	plancache_mode;
 
 /*
  * InitPlanCache: initialize module during InitPostgres.
@@ -1031,6 +1033,12 @@ choose_custom_plan(CachedPlanSource *plansource, ParamListInfo boundParams)
 	if (IsTransactionStmtPlan(plansource))
 		return false;
 
+	/* See if settings wants to force the decision */
+	if (plancache_mode & PLANCACHE_FORCE_GENERIC_PLAN)
+		return false;
+	if (plancache_mode & PLANCACHE_FORCE_CUSTOM_PLAN)
+		return true;
+
 	/* See if caller wants to force the decision */
 	if (plansource->cursor_options & CURSOR_OPT_GENERIC_PLAN)
 		return false;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae22185fbd..4ce275e39d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -403,6 +403,13 @@ static const struct config_enum_entry force_parallel_mode_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry plancache_mode_options[] = {
+	{"default", PLANCACHE_DEFAULT, false},
+	{"force_generic_plan", PLANCACHE_FORCE_GENERIC_PLAN, false},
+	{"force_custom_plan", PLANCACHE_FORCE_CUSTOM_PLAN, false},
+	{NULL, 0, false}
+};
+
 /*
  * password_encryption used to be a boolean, so accept all the likely
  * variants of "on", too. "off" used to store passwords in plaintext,
@@ -3916,6 +3923,16 @@ static struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"plancache_mode", PGC_USERSET, QUERY_TUNING_OTHER,
+			gettext_noop("Forces use of custom or generic plans."),
+			gettext_noop("It can control query plan cache.")
+		},
		&plancache_mode,
+		PLANCACHE_DEFAULT, plancache_mode_options,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 87fab19f3c..962895cc1a 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -143,7 +143,6 @@ typedef struct CachedPlan
 	MemoryContext context;		/* context containing this CachedPlan */
 } CachedPlan;
 
-
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -182,4 +181,16 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 			  QueryEnvironment *queryEnv);
 extern void ReleaseCachedPlan(CachedPlan *plan, bool useResOwner);
 
+/* possible values for plancache_mode */
+typedef enum
+{
+	PLANCACHE_DEFAULT,
+	PLANCACHE_FORCE_GENERIC_PLAN,
+	PLANCACHE_FORCE_CUSTOM_PLAN
+}			PlanCacheMode;
+
+
+/* GUC parameter */
+extern int plancache_mode;
+
 #endif			/* PLANCACHE_H */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Discussion on missing optimizations

2017-10-12 Thread Alvaro Herrera
Andres Freund wrote:
> On 2017-10-08 11:28:09 -0400, Tom Lane wrote:
> > Adam Brusselback  writes:
> > > On another note:
> > >> turning ORs into UNIONs
> > 
> > > This is another one which would be incredibly useful for me.  I've had
> > > to do this manually for performance reasons far too often.
> > 
> > Well, maybe you could sign up to help review the open patch for that then:
> > https://commitfest.postgresql.org/15/1001/
> > 
> > The reason that's not in v10 is we haven't been able to convince
> > ourselves whether it's 100% correct.
> 
> Unfortunately it won't help in this specific case (no support for UNION,
> just UNION ALL), but I thought it might be interesting to reference
> https://medium.com/@uwdb/introducing-cosette-527898504bd6
> here.

Interesting.  I thought about a completely different approach -- use a
fuzzer which runs each generated query on two servers, one patched and
one not, and compares the results.  Would it be possible to tweak
sqlsmith to do this?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replace GrantObjectType with ObjectType

2017-10-12 Thread Alvaro Herrera
Michael Paquier wrote:
> On Thu, Oct 12, 2017 at 7:55 AM, Peter Eisentraut
>  wrote:
> > It seems to me that having ACL_OBJECT_* symbols alongside OBJECT_*
> > symbols is not useful and leads to duplication.  Digging around in the
> > past suggests that we used to have a lot of these command-specific
> > symbols but got rid of them over time, except that the ACL stuff was
> > never touched.  The attached patch accomplishes that.

+1 for this.

> -bool
> -EventTriggerSupportsGrantObjectType(GrantObjectType objtype)
> -{
> -   switch (objtype)
> -   {
> -   case ACL_OBJECT_DATABASE:
> -   case ACL_OBJECT_TABLESPACE:
> -   /* no support for global objects */
> -   return false;
> By removing that, if any GRANT/REVOKE support is added for another
> object type, then EventTriggerSupportsObjectType would return true by
> default instead of getting a compilation failure.

Yeah, we've found it useful to remove default: clauses in some switch
blocks so that we get compile warnings when we forget to add a new type
(c.f. commit e84c0195980f).  Let's not add any more of those.
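
A minimal standalone illustration of the idea (not PostgreSQL code; the
enum and names are made up): with no default: label, -Wswitch flags any
enum member the switch forgets to handle.

#include <stdio.h>

typedef enum { OBJECT_TABLE, OBJECT_SEQUENCE } ObjectType;

static const char *
objtype_name(ObjectType t)
{
	switch (t)
	{
		case OBJECT_TABLE:
			return "table";
		case OBJECT_SEQUENCE:
			return "sequence";
			/* no default: adding a new ObjectType member produces a
			 * -Wswitch warning here until it is handled explicitly */
	}
	return "???";
}

int
main(void)
{
	printf("%s\n", objtype_name(OBJECT_TABLE));
	return 0;
}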

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow synchronous logical replication

2017-10-12 Thread Konstantin Knizhnik



On 12.10.2017 04:23, Craig Ringer wrote:

On 12 October 2017 at 00:57, Konstantin Knizhnik
 wrote:


The reason for such behavior is obvious: the wal sender has to decode the
huge transaction generated by the insert although it has no relation to
this publication.

It does. Though I wouldn't expect anywhere near the kind of drop you
report, and haven't observed it here.

Is the CREATE TABLE and INSERT done in the same transaction?


No. The table was created in a separate transaction.
Moreover, the same effect takes place if the table is created before the 
start of replication.
The problem in this case seems to be caused by the decoded transaction 
being spilled to a file by ReorderBufferSerializeTXN.

Please look at two profiles:
http://garret.ru/lr1.svg  corresponds to normal work if pgbench with 
synchronous replication to one replica,
http://garret.ru/lr2.svg - the with concurrent execution of huge insert 
statement.


And here is output of pgbench (at fifth second insert is started):

progress: 1.0 s, 10020.9 tps, lat 0.791 ms stddev 0.232
progress: 2.0 s, 10184.1 tps, lat 0.786 ms stddev 0.192
progress: 3.0 s, 10058.8 tps, lat 0.795 ms stddev 0.301
progress: 4.0 s, 10230.3 tps, lat 0.782 ms stddev 0.194
progress: 5.0 s, 10335.0 tps, lat 0.774 ms stddev 0.192
progress: 6.0 s, 4535.7 tps, lat 1.591 ms stddev 9.370
progress: 7.0 s, 419.6 tps, lat 20.897 ms stddev 55.338
progress: 8.0 s, 105.1 tps, lat 56.140 ms stddev 76.309
progress: 9.0 s, 9.0 tps, lat 504.104 ms stddev 52.964
progress: 10.0 s, 14.0 tps, lat 797.535 ms stddev 156.082
progress: 11.0 s, 14.0 tps, lat 601.865 ms stddev 93.598
progress: 12.0 s, 11.0 tps, lat 658.276 ms stddev 138.503
progress: 13.0 s, 9.0 tps, lat 784.120 ms stddev 127.206
progress: 14.0 s, 7.0 tps, lat 870.944 ms stddev 156.377
progress: 15.0 s, 8.0 tps, lat .578 ms stddev 140.987
progress: 16.0 s, 7.0 tps, lat 1258.750 ms stddev 75.677
progress: 17.0 s, 6.0 tps, lat 991.023 ms stddev 229.058
progress: 18.0 s, 5.0 tps, lat 1063.986 ms stddev 269.361

It seems to be an effect of large transactions.
The presence of several channels of synchronous logical replication 
reduces performance, but not by that much.

Below are results on another machine, with pgbench at scale 10.

Configuration               TPS
standalone                  15k
1 async logical replica     13k
1 sync logical replica      10k
3 async logical replicas    13k
3 sync logical replicas      8k





Only partly true. The output plugin can register a transaction origin
filter and use that to say it's entirely uninterested in a
transaction. But this only works based on filtering by origins. Not
tables.
Yes, I know about the origin filtering mechanism (we are using it in 
multimaster).
But I am speaking about the standard pgoutput.c output plugin: its 
pgoutput_origin_filter always returns false.
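
For reference, a sketch of that callback as it currently stands in
pgoutput.c (it ignores the origin entirely):

static bool
pgoutput_origin_filter(LogicalDecodingContext *ctx,
					   RepOriginId origin_id)
{
	return false;
}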




I imagine we could call another hook in output plugins, "do you care
about this table", and use it to skip some more work for tuples that
particular decoding session isn't interested in. Skip adding them to
the reorder buffer, etc. No such hook currently exists, but it'd be an
interesting patch for Pg11 if you feel like working on it.


Unfortunately it is not quite clear how to make the WAL sender smarter and
let it skip transactions not affecting its publication.

As noted, it already can do so by origin. Mostly. We cannot totally
skip over WAL, since we need to process various invalidations etc. See
ReorderBufferSkip.
The problem is that before the end of a transaction we do not know 
whether it touches this publication or not.

So filtering by origin will not work in this case.

I am really not sure that it is possible to skip over WAL. But the 
particular problem with invalidation records etc. can be solved by 
always processing such records in the WAL sender.
I.e., if a backend inserts an invalidation record or some other record 
which must always be processed by the WAL sender, it can promote the 
LSN of this record to the WAL sender.
So the WAL sender will skip only those WAL records which are safe to 
skip (insert/update/delete records not affecting this publication).


I wonder if there can be some other problems with skipping part of a 
transaction in the WAL sender.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Omission in GRANT documentation

2017-10-12 Thread Laurenz Albe
Tom Lane wrote:
>> But types also have the USAGE privilege for PUBLIC by default:
> 
> Yup, that's an oversight.
> 
>> Hence I propose the attached documentation patch.
> 
> Pushed, with a bit of additional wordsmithing.

Thanks for taking the time.

Yours,
Laurenz Albe



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Discussion on missing optimizations

2017-10-12 Thread Laurenz Albe
Robert Haas wrote:
> One trick that some system use is avoid replanning as much as we do
> by, for example, saving plans in a shared cache and reusing them even
> in other sessions.  That's hard to do in our architecture because the
> controlling GUCs can be different in every session and there's not
> even any explicit labeling of which GUCs control planner behavior. But
> if you had it, then extra planning cycles would be, perhaps, more
> tolerable.

From my experience with Oracle I would say that that is a can of worms.

Perhaps it really brings the performance benefits they claim, but
a) there have been a number of bugs where the wrong plan got used
   (you have to keep several plans for the same statement around,
   since - as you say - different sessions have different environments)
b) it is a frequent problem that this shared memory area grows
   too large if the application does not use prepared statements
   but dynamic SQL with varying constants.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replace GrantObjectType with ObjectType

2017-10-12 Thread Michael Paquier
On Thu, Oct 12, 2017 at 7:55 AM, Peter Eisentraut
 wrote:
> It seems to me that having ACL_OBJECT_* symbols alongside OBJECT_*
> symbols is not useful and leads to duplication.  Digging around in the
> past suggests that we used to have a lot of these command-specific
> symbols but got rid of them over time, except that the ACL stuff was
> never touched.  The attached patch accomplishes that.

That's always welcome:
 14 files changed, 203 insertions(+), 254 deletions(-)

This needs an update:
$ git grep GrantObjectType
src/tools/pgindent/typedefs.list:GrantObjectType

-static const char *stringify_grantobjtype(GrantObjectType objtype);
-static const char *stringify_adefprivs_objtype(GrantObjectType objtype);
+static const char *stringify_grantobjtype(ObjectType objtype);
+static const char *stringify_adefprivs_objtype(ObjectType objtype);
stringify_grantobjtype should be renamed to stringify_objtype.

-bool
-EventTriggerSupportsGrantObjectType(GrantObjectType objtype)
-{
-   switch (objtype)
-   {
-   case ACL_OBJECT_DATABASE:
-   case ACL_OBJECT_TABLESPACE:
-   /* no support for global objects */
-   return false;
By removing that, if any GRANT/REVOKE support is added for another
object type, then EventTriggerSupportsObjectType would return true by
default instead of getting a compilation failure. I think that a
comment would be appropriate here:
GrantStmt  *stmt = (GrantStmt *) parsetree;

-   if (EventTriggerSupportsGrantObjectType(stmt->objtype))
+   if (EventTriggerSupportsObjectType(stmt->objtype))
ProcessUtilitySlow(pstate, pstmt, queryString,
Something like, "This checks for more object types than currently
supported by the GRANT statement"... Or at least something to outline
that risk.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers