Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Hm.  That would behave less than desirably if getObjectClass() could
>> return a value that wasn't a member of the enum, but it's hard to
>> credit that happening.  I guess I'd vote for removing the default:
>> case from all of the places that have "switch (getObjectClass(object))",
>> as forgetting to add an entry seems like a much larger hazard.

> Ignoring the one in alter.c's AlterObjectNamespace_oid, which only
> handles a small subset of the enum values, that gives the following
> warnings:

> /pgsql/source/master/src/backend/catalog/dependency.c: In function 
> 'doDeletion':
> /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: 
> enumeration value 'OCLASS_ROLE' not handled in switch [-Wswitch]
>   switch (getObjectClass(object))
>   ^
> /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: 
> enumeration value 'OCLASS_DATABASE' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: 
> enumeration value 'OCLASS_TBLSPACE' not handled in switch [-Wswitch]

> /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: 
> enumeration value 'OCLASS_SUBSCRIPTION' not handled in switch [-Wswitch]

Hm.  I think it's intentional that shared objects are not handled there,
but I wonder whether the lack of SUBSCRIPTION represents a bug.
Anyway I think it would be fine to add those to the switch as explicit
error cases.


> /pgsql/source/master/src/backend/commands/tablecmds.c: In function 
> 'ATExecAlterColumnType':
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
> enumeration value 'OCLASS_AM' not handled in switch [-Wswitch]
>switch (getObjectClass())
>^
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
> enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
> enumeration value 'OCLASS_EVENT_TRIGGER' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
> enumeration value 'OCLASS_PUBLICATION' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
> enumeration value 'OCLASS_PUBLICATION_REL' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
> enumeration value 'OCLASS_SUBSCRIPTION' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
> enumeration value 'OCLASS_TRANSFORM' not handled in switch [-Wswitch]

All of those ought to go into the category that prints
"unexpected object depending on column", I think; certainly
"unrecognized object class" is not a better report, and the fact that
we've evidently not touched this switch in the last few additions of
object types provides fodder for the idea that the default: is unhelpful.
(Not to mention that not having OCLASS_STATISTIC_EXT here is an outright
bug as of today...)  So losing the default: case here seems good to me.

Alternatively, we could drop the explicit listing of oclasses we're not
expecting to see, and let the default: case be "unexpected object
depending on column".  Treating the default separately is only of value if
we'd expect getObjectDescription to fail, but there's no good reason for
this code to be passing judgment on whether that might happen.

What do we want this to do about statistics objects?  We could either
throw a feature-not-implemented error or try to update the stats
object for the new column type.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Tom Lane wrote:
> >> Oh, I see your patch also fixes missing code in getObjectDescription().
> >> We need that.  Is there a decent way to get the compiler to warn about
> >> that oversight?
> 
> > We could remove the default clause.  That results in the expected
> > warning, and no others (i.e. the switch was already complete):
> 
> > /pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: 
> > enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]
> 
> Hm.  That would behave less than desirably if getObjectClass() could
> return a value that wasn't a member of the enum, but it's hard to
> credit that happening.  I guess I'd vote for removing the default:
> case from all of the places that have "switch (getObjectClass(object))",
> as forgetting to add an entry seems like a much larger hazard.

Ignoring the one in alter.c's AlterObjectNamespace_oid, which only
handles a small subset of the enum values, that gives the following
warnings:

/pgsql/source/master/src/backend/catalog/dependency.c: In function 'doDeletion':
/pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: 
enumeration value 'OCLASS_ROLE' not handled in switch [-Wswitch]
  switch (getObjectClass(object))
  ^
/pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: 
enumeration value 'OCLASS_DATABASE' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: 
enumeration value 'OCLASS_TBLSPACE' not handled in switch [-Wswitch]

/pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: 
enumeration value 'OCLASS_SUBSCRIPTION' not handled in switch [-Wswitch]

/pgsql/source/master/src/backend/commands/tablecmds.c: In function 
'ATExecAlterColumnType':
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
enumeration value 'OCLASS_AM' not handled in switch [-Wswitch]
   switch (getObjectClass())
   ^
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
enumeration value 'OCLASS_EVENT_TRIGGER' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
enumeration value 'OCLASS_PUBLICATION' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
enumeration value 'OCLASS_PUBLICATION_REL' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
enumeration value 'OCLASS_SUBSCRIPTION' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: 
enumeration value 'OCLASS_TRANSFORM' not handled in switch [-Wswitch]

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Oh, I see your patch also fixes missing code in getObjectDescription().
>> We need that.  Is there a decent way to get the compiler to warn about
>> that oversight?

> We could remove the default clause.  That results in the expected
> warning, and no others (i.e. the switch was already complete):

> /pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: 
> enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]

Hm.  That would behave less than desirably if getObjectClass() could
return a value that wasn't a member of the enum, but it's hard to
credit that happening.  I guess I'd vote for removing the default:
case from all of the places that have "switch (getObjectClass(object))",
as forgetting to add an entry seems like a much larger hazard.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Tom Lane wrote:
> >> Oh, sorry --- I already pushed something about this.
> 
> > That's fine, thanks.
> 
> Oh, I see your patch also fixes missing code in getObjectDescription().
> We need that.  Is there a decent way to get the compiler to warn about
> that oversight?

We could remove the default clause.  That results in the expected
warning, and no others (i.e. the switch was already complete):

/pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: 
enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Oh, sorry --- I already pushed something about this.

> That's fine, thanks.

Oh, I see your patch also fixes missing code in getObjectDescription().
We need that.  Is there a decent way to get the compiler to warn about
that oversight?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Tom Lane wrote:
> >> Although I've not done anything about it here, I'm not happy about the
> >> handling of dependencies for stats objects.
> 
> > Here are two patches regarding handling of dependencies.
> 
> Oh, sorry --- I already pushed something about this.

That's fine, thanks.

I'm glad that this thread got you interested in look over the extended
statistics stuff.  Thanks for the input.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Although I've not done anything about it here, I'm not happy about the
>> handling of dependencies for stats objects.

> Here are two patches regarding handling of dependencies.

Oh, sorry --- I already pushed something about this.

> The first one
> implements your first suggestion: add a NORMAL dependency on each
> column, and do away with RemoveStatisticsExt.  This works well and
> should uncontroversial.

No, I wanted an AUTO dependency there, for exactly the reason you
mention:

> If we only do this, then DROP TABLE needs to say CASCADE if there's a
> statistics object in the table.

I don't think we really want that, do we?  A stats object can't be of
any value if the underlying table is gone.  Perhaps that calculus
would change for cross-table stats but I don't see why.

> This seems pointless to me, so the
> second patch throws in an additional dependency on the table as a whole,
> AUTO this time, so that if the table is dropped, the statistics goes
> away without requiring cascade.  There is no point in forcing CASCADE
> for this case, ISTM.  This works well too, but I admit there might be
> resistance to doing it.  OTOH this is how CREATE INDEX works.

Uh, no it isn't.  Indexes have auto dependencies on the individual
columns they index, and *not* on the table as a whole.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Alvaro Herrera
Tom Lane wrote:

> Although I've not done anything about it here, I'm not happy about the
> handling of dependencies for stats objects.  I do not think that cloning
> RemoveStatistics as RemoveStatisticsExt is sane at all.  The former is
> meant to deal with cleaning up pg_statistic rows that we know will be
> there, so there's no real need to expend pg_depend overhead to track them.
> For objects that are only loosely connected, the right thing is to use
> the dependency system; in particular, this design has no real chance of
> working well with cross-table stats.  Also, it's really silly to have
> *both* this hard-wired mechanism and a pg_depend entry; the latter is
> surely redundant if you have the former.  IMO we should revert
> RemoveStatisticsExt and instead handle things by making stats objects
> auto-dependent on the individual column(s) they reference (not the whole
> table).
> 
> I'm also of the opinion that having an AUTO dependency, rather than
> a NORMAL dependency, on the stats object's schema is the wrong semantics.
> There isn't any other case where you can drop a non-empty schema without
> saying CASCADE, and I'm mystified why this case should act that way.

Here are two patches regarding handling of dependencies.  The first one
implements your first suggestion: add a NORMAL dependency on each
column, and do away with RemoveStatisticsExt.  This works well and
should uncontroversial.

If we only do this, then DROP TABLE needs to say CASCADE if there's a
statistics object in the table.  This seems pointless to me, so the
second patch throws in an additional dependency on the table as a whole,
AUTO this time, so that if the table is dropped, the statistics goes
away without requiring cascade.  There is no point in forcing CASCADE
for this case, ISTM.  This works well too, but I admit there might be
resistance to doing it.  OTOH this is how CREATE INDEX works.

(I considered what would happen with a stats object covering multiple
tables.  I think it's reasonable to drop the stats too in that case,
under the rationale that the object is no longer useful.  Not really
sure about this.)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 06fad759888d5060f9b4cf4d4f4fdec777350027 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 12 May 2017 17:16:26 -0300
Subject: [PATCH 1/2] fix dependencies problem

---
 src/backend/catalog/heap.c  | 74 -
 src/backend/catalog/objectaddress.c | 20 +
 src/backend/commands/statscmds.c| 18 
 src/test/regress/expected/stats_ext.out | 13 --
 src/test/regress/sql/stats_ext.sql  |  9 ++--
 5 files changed, 45 insertions(+), 89 deletions(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index ab3d83f29b..ba63939022 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1615,10 +1615,7 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
 
if (attnum > 0)
-   {
RemoveStatistics(relid, attnum);
-   RemoveStatisticsExt(relid, attnum);
-   }
 
relation_close(rel, NoLock);
 }
@@ -1873,7 +1870,6 @@ heap_drop_with_catalog(Oid relid)
 * delete statistics
 */
RemoveStatistics(relid, 0);
-   RemoveStatisticsExt(relid, 0);
 
/*
 * delete attribute tuples
@@ -2784,76 +2780,6 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
heap_close(pgstatistic, RowExclusiveLock);
 }
 
-
-/*
- * RemoveStatisticsExt --- remove entries in pg_statistic_ext for a relation
- *
- * If attnum is zero, remove all entries for rel; else remove only the
- * one(s) involving that column.
- */
-void
-RemoveStatisticsExt(Oid relid, AttrNumber attnum)
-{
-   Relationpgstatisticext;
-   SysScanDesc scan;
-   ScanKeyData key;
-   HeapTuple   tuple;
-
-   /*
-* Scan pg_statistic_ext to delete relevant tuples
-*/
-   pgstatisticext = heap_open(StatisticExtRelationId, RowExclusiveLock);
-
-   ScanKeyInit(,
-   Anum_pg_statistic_ext_stxrelid,
-   BTEqualStrategyNumber, F_OIDEQ,
-   ObjectIdGetDatum(relid));
-
-   scan = systable_beginscan(pgstatisticext,
- 
StatisticExtRelidIndexId,
- true, NULL, 1, );
-
-   while (HeapTupleIsValid(tuple = systable_getnext(scan)))
-   {
-   booldelete = false;
-
-   if (attnum == 0)
-   delete = true;
-   else if (attnum != 0)
-   {
-   Form_pg_statistic_ext   staForm;
-   int i;
-
-   

Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Tom Lane
I wrote:
> Tomas Vondra  writes:
>> On 5/12/17 4:46 AM, Tom Lane wrote:
>>> Although I've not done anything about it here, I'm not happy about the
>>> handling of dependencies for stats objects.

>> Yeah, it's a bit frankensteinian ...

> I'm prepared to create a fix for that, but it'd be easier to commit the
> current patch first, to avoid merge conflicts.

Done.  I noted that we weren't making an owner dependency either :-(.

Also, we might want to think about letting stats objects be extension
members sometime.  As things stand, if an extension script does CREATE
STATISTICS, the stats object will just be "loose" rather than bound
into the extension.

We still have docs and nomenclature issues to work on.  Anyone want
to take point on those?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Tom Lane wrote:
> >> I'm prepared to create a fix for that, but it'd be easier to commit the
> >> current patch first, to avoid merge conflicts.
> 
> > It seems we're mostly in agreement regarding the parts I was touching.
> > Do you want to push your version of the patch?
> 
> It's still almost all your work, so I figured you should push it.

OK, I'll do that shortly then.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> I'm prepared to create a fix for that, but it'd be easier to commit the
>> current patch first, to avoid merge conflicts.

> It seems we're mostly in agreement regarding the parts I was touching.
> Do you want to push your version of the patch?

It's still almost all your work, so I figured you should push it.

>> Hm.  OK, but then that example is pretty misleading, because it leads
>> the reader to suppose that the planner can tell the difference between
>> the selectivities of the two queries.  Maybe what's lacking is an
>> explanation of how you'd use this statistics type.

> I think we should remove the complex example from that page, and instead
> refer the reader to chapters 14 and 69.

Seems reasonable.  If we did add enough material there to be a standalone
description, it would be duplicative probably.  However, that does put
the onus on the other chapters to offer a complete definition, rather
than leaving details to the man page as we do in many other cases.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Alvaro Herrera
Tom Lane wrote:
> Tomas Vondra  writes:

> >> Although I've not done anything about it here, I'm not happy about the
> >> handling of dependencies for stats objects.
> 
> > Yeah, it's a bit frankensteinian ...
> 
> I'm prepared to create a fix for that, but it'd be easier to commit the
> current patch first, to avoid merge conflicts.

It seems we're mostly in agreement regarding the parts I was touching.
Do you want to push your version of the patch?

> >> Lastly, I tried the example given in the CREATE STATISTICS reference page,
> >> and it doesn't seem to work.
> 
> > I assume you're talking about the functional dependencies and in that 
> > case that's expected behavior, because f. dependencies do assume the 
> > conditions are "consistent" with the functional dependencies.
> 
> Hm.  OK, but then that example is pretty misleading, because it leads
> the reader to suppose that the planner can tell the difference between
> the selectivities of the two queries.  Maybe what's lacking is an
> explanation of how you'd use this statistics type.

I think we should remove the complex example from that page, and instead
refer the reader to chapters 14 and 69.  The CREATE STATISTICS page can
just give some overview of what the syntax looks like, and all
explanations of complex topics such as "what does it mean for a query to
be consistent with the functional dependencies" can be given at length
elsewhere.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Tom Lane
Tomas Vondra  writes:
> On 5/12/17 4:46 AM, Tom Lane wrote:
>> If people agree that that reads better, we should make an effort to
>> propagate that terminology elsewhere in the docs, and into error messages,
>> objectaddress.c output, etc.

> I'm fine with the 'statistics object' wording. I've been struggling with 
> this bit while working on the patch, and I agree it sounds better and 
> getting it consistent is definitely worthwhile.

OK.  We can work on that as a followon patch.

>> Although I've not done anything about it here, I'm not happy about the
>> handling of dependencies for stats objects.

> Yeah, it's a bit frankensteinian ...

I'm prepared to create a fix for that, but it'd be easier to commit the
current patch first, to avoid merge conflicts.

>> Lastly, I tried the example given in the CREATE STATISTICS reference page,
>> and it doesn't seem to work.

> I assume you're talking about the functional dependencies and in that 
> case that's expected behavior, because f. dependencies do assume the 
> conditions are "consistent" with the functional dependencies.

Hm.  OK, but then that example is pretty misleading, because it leads
the reader to suppose that the planner can tell the difference between
the selectivities of the two queries.  Maybe what's lacking is an
explanation of how you'd use this statistics type.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-12 Thread Tomas Vondra



On 5/12/17 4:46 AM, Tom Lane wrote:

Alvaro Herrera  writes:

Hmm, yeah, makes sense.  Here's a patch for this approach.


...

Also, while reading the docs changes, I got rather ill from the
inconsistent and not very grammatical handling of "statistics" as a
noun, particularly the inability to decide from one sentence to the next
if that is singular or plural.  In the attached, I fixed this in the
ref/*_statistics.sgml files by always saying "statistics object" instead.
If people agree that that reads better, we should make an effort to
propagate that terminology elsewhere in the docs, and into error messages,
objectaddress.c output, etc.



I'm fine with the 'statistics object' wording. I've been struggling with 
this bit while working on the patch, and I agree it sounds better and 
getting it consistent is definitely worthwhile.


>

Although I've not done anything about it here, I'm not happy about the
handling of dependencies for stats objects.  I do not think that cloning
RemoveStatistics as RemoveStatisticsExt is sane at all.  The former is
meant to deal with cleaning up pg_statistic rows that we know will be
there, so there's no real need to expend pg_depend overhead to track them.
For objects that are only loosely connected, the right thing is to use
the dependency system; in particular, this design has no real chance of
working well with cross-table stats.  Also, it's really silly to have
*both* this hard-wired mechanism and a pg_depend entry; the latter is
surely redundant if you have the former.  IMO we should revert
RemoveStatisticsExt and instead handle things by making stats objects
auto-dependent on the individual column(s) they reference (not the whole
table).

I'm also of the opinion that having an AUTO dependency, rather than
a NORMAL dependency, on the stats object's schema is the wrong semantics.
There isn't any other case where you can drop a non-empty schema without
saying CASCADE, and I'm mystified why this case should act that way.



Yeah, it's a bit frankensteinian ...

>

Lastly, I tried the example given in the CREATE STATISTICS reference page,
and it doesn't seem to work.  Without the stats object, the two queries
are both estimated at one matching row, whereas they really produce 100
and 0 rows respectively.  With the stats object, they seem to both get
estimated at ~100 rows, which is a considerable improvement for one case
but a considerable disimprovement for the other.  If this is not broken,
I'd like to know why not.  What's the point of an expensive extended-
stats mechanism if it can't get this right?



I assume you're talking about the functional dependencies and in that 
case that's expected behavior, because f. dependencies do assume the 
conditions are "consistent" with the functional dependencies.


This is an inherent limitation of functional dependencies, and perhaps a 
price for the simplicity of this statistics type. If your application 
executes queries that are likely not consistent with this, don't use 
functional dependencies.


The simplicity is why dependencies were implemented first, mostly to 
introduce all the infrastructure. The other statistics types (MCV, 
histograms) don't have this limitation, but those did not make it into pg10.



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-11 Thread Tom Lane
Alvaro Herrera  writes:
> Hmm, yeah, makes sense.  Here's a patch for this approach.

I did some code review on this patch.  The attached update makes the
following hopefully-uncontroversial changes:

* fix inconsistencies in the set of fields in CreateStatsStmt

* get rid of struct CreateStatsArgument, which seems to be a leftover

* tighten up parse checks in CreateStatistics(), and improve comments

* add missing check for ownership of target table; the documentation
  says this is checked, and surely it is a security bug that it's not

* adjust regression test to compensate for the above, because it
  fails otherwise

* change end of regression test to suppress cascade drop messages;
  it's astonishing that this test hasn't caused intermittent buildfarm
  failures due to unstable drop order

I did not review the rest of the regression test changes, nor the
tab-complete changes.  I can however report that DROP STATISTICS 
doesn't successfully complete any stats object names.

Also, while reading the docs changes, I got rather ill from the
inconsistent and not very grammatical handling of "statistics" as a
noun, particularly the inability to decide from one sentence to the next
if that is singular or plural.  In the attached, I fixed this in the
ref/*_statistics.sgml files by always saying "statistics object" instead.
If people agree that that reads better, we should make an effort to
propagate that terminology elsewhere in the docs, and into error messages,
objectaddress.c output, etc.

Although I've not done anything about it here, I'm not happy about the
handling of dependencies for stats objects.  I do not think that cloning
RemoveStatistics as RemoveStatisticsExt is sane at all.  The former is
meant to deal with cleaning up pg_statistic rows that we know will be
there, so there's no real need to expend pg_depend overhead to track them.
For objects that are only loosely connected, the right thing is to use
the dependency system; in particular, this design has no real chance of
working well with cross-table stats.  Also, it's really silly to have
*both* this hard-wired mechanism and a pg_depend entry; the latter is
surely redundant if you have the former.  IMO we should revert
RemoveStatisticsExt and instead handle things by making stats objects
auto-dependent on the individual column(s) they reference (not the whole
table).

I'm also of the opinion that having an AUTO dependency, rather than
a NORMAL dependency, on the stats object's schema is the wrong semantics.
There isn't any other case where you can drop a non-empty schema without
saying CASCADE, and I'm mystified why this case should act that way.

Lastly, I tried the example given in the CREATE STATISTICS reference page,
and it doesn't seem to work.  Without the stats object, the two queries
are both estimated at one matching row, whereas they really produce 100
and 0 rows respectively.  With the stats object, they seem to both get
estimated at ~100 rows, which is a considerable improvement for one case
but a considerable disimprovement for the other.  If this is not broken,
I'd like to know why not.  What's the point of an expensive extended-
stats mechanism if it can't get this right?

regards, tom lane

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index b10b734..32e17ee 100644
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
*** WHERE tablename = 'road';
*** 1132,1139 
   To inspect functional dependencies on a statistics
   stts, you may do this:
  
! CREATE STATISTICS stts WITH (dependencies)
!ON (zip, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxname, stxkeys, stxdependencies
FROM pg_statistic_ext
--- 1132,1139 
   To inspect functional dependencies on a statistics
   stts, you may do this:
  
! CREATE STATISTICS stts (dependencies)
!ON zip, city FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxname, stxkeys, stxdependencies
FROM pg_statistic_ext
*** EXPLAIN (ANALYZE, TIMING OFF) SELECT * F
*** 1219,1226 
   Continuing the above example, the n-distinct coefficients in a ZIP
   code table may look like the following:
  
! CREATE STATISTICS stts2 WITH (ndistinct)
!ON (zip, state, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxkeys AS k, stxndistinct AS nd
FROM pg_statistic_ext
--- 1219,1226 
   Continuing the above example, the n-distinct coefficients in a ZIP
   code table may look like the following:
  
! CREATE STATISTICS stts2 (ndistinct)
!ON zip, state, city FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxkeys AS k, stxndistinct AS nd
FROM pg_statistic_ext
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml
index 11580bf..ef847b9 100644
*** a/doc/src/sgml/planstats.sgml
--- b/doc/src/sgml/planstats.sgml
*** EXPLAIN (ANALYZE, TIMING OFF) SELECT * F
*** 

Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-11 Thread Tom Lane
Alvaro Herrera  writes:
> BTW the new castNode() family of macros don't work with Value nodes
> (because the tags are different depending on what's stored, but each
> type does not have its own struct.  Oh well.)

Yeah.  Value nodes are pretty much of a wart --- it's not clear to me
that they add anything compared to just having a separate node type
for each of the possible subclasses.  It's never bothered anyone enough
to try to replace them, though.

I'll take a look at the patch later.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-11 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Tom Lane wrote:
> >> Have you thought further about the upthread suggestion to just borrow
> >> SELECT's syntax lock stock and barrel?
> 
> > Bison seems to like the productions below.  Is this what you had in
> > mind?  These mostly mimic joined_table and table_ref, stripping out the
> > rules that we don't need.
> 
> I'd suggest just using the from_list production and then complaining
> at runtime if what you get is too complicated.  Otherwise, you have
> to maintain a duplicate set of productions, and you're going to be
> unable to throw anything more informative than "syntax error" when
> somebody tries to exceed the implementation limits.

Hmm, yeah, makes sense.  Here's a patch for this approach.  I ended up
using on ON again for the list of columns.  I suppose the checks in
CreateStatistics() could still be improved, but I'd go ahead and push
this tomorrow morning and we can hammer those details later on, if it's
still needed.  Better avoid shipping beta with outdated grammar ...

BTW the new castNode() family of macros don't work with Value nodes
(because the tags are different depending on what's stored, but each
type does not have its own struct.  Oh well.)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index b10b734b90..32e17ee5f8 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1132,8 +1132,8 @@ WHERE tablename = 'road';
  To inspect functional dependencies on a statistics
  stts, you may do this:
 
-CREATE STATISTICS stts WITH (dependencies)
-   ON (zip, city) FROM zipcodes;
+CREATE STATISTICS stts (dependencies)
+   ON zip, city FROM zipcodes;
 ANALYZE zipcodes;
 SELECT stxname, stxkeys, stxdependencies
   FROM pg_statistic_ext
@@ -1219,8 +1219,8 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 
AND b = 10;
  Continuing the above example, the n-distinct coefficients in a ZIP
  code table may look like the following:
 
-CREATE STATISTICS stts2 WITH (ndistinct)
-   ON (zip, state, city) FROM zipcodes;
+CREATE STATISTICS stts2 (ndistinct)
+   ON zip, state, city FROM zipcodes;
 ANALYZE zipcodes;
 SELECT stxkeys AS k, stxndistinct AS nd
   FROM pg_statistic_ext
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml
index 11580bfd22..ef847b9633 100644
--- a/doc/src/sgml/planstats.sgml
+++ b/doc/src/sgml/planstats.sgml
@@ -526,7 +526,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 
AND b = 1;
 multivariate statistics on the two columns:
 
 
-CREATE STATISTICS stts WITH (dependencies) ON (a, b) FROM t;
+CREATE STATISTICS stts (dependencies) ON a, b FROM t;
 ANALYZE t;
 EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
   QUERY PLAN   
@@ -569,7 +569,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP 
BY a, b;
 calculation, the estimate is much improved:
 
 DROP STATISTICS stts;
-CREATE STATISTICS stts WITH (dependencies, ndistinct) ON (a, b) FROM t;
+CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;
 ANALYZE t;
 EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
QUERY PLAN  
  
diff --git a/doc/src/sgml/ref/create_statistics.sgml 
b/doc/src/sgml/ref/create_statistics.sgml
index edbcf5840b..84ff52df04 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -22,8 +22,8 @@ PostgreSQL documentation
  
 
 CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
-WITH ( option [= value] [, ... ] )
-ON ( column_name, 
column_name [, ...])
+[ ( statistic_type [, ... ] ) 
]
+ON column_name, column_name [, ...]
 FROM table_name
 
 
@@ -75,6 +75,19 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na

 

+statistic_type
+
+ 
+  A statistic type to be enabled for this statistics.  Currently
+  supported types are ndistinct, which enables
+  n-distinct coefficient tracking,
+  and dependencies, which enables functional
+  dependencies.
+ 
+
+   
+
+   
 column_name
 
  
@@ -94,42 +107,6 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na

 
   
-
-  
-   Parameters
-
- 
-  statistics parameters
- 
-
-   
-The WITH clause can specify options
-for the statistics. Available options are listed below.
-   
-
-   
-
-   
-dependencies (boolean)
-
- 
-  Enables functional dependencies for the statistics.
- 
-
-   
-
-   
-ndistinct (boolean)
-
- 
-  Enables ndistinct coefficients for the statistics.
- 
-
-   
-
-   
-
-  
  
 
  
@@ -158,7 +135,7 @@ CREATE TABLE t1 (
 INSERT INTO 

Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-11 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Have you thought further about the upthread suggestion to just borrow
>> SELECT's syntax lock stock and barrel?

> Bison seems to like the productions below.  Is this what you had in
> mind?  These mostly mimic joined_table and table_ref, stripping out the
> rules that we don't need.

I'd suggest just using the from_list production and then complaining
at runtime if what you get is too complicated.  Otherwise, you have
to maintain a duplicate set of productions, and you're going to be
unable to throw anything more informative than "syntax error" when
somebody tries to exceed the implementation limits.

> Note that I had to put back the FOR keyword in there; without the FOR,
> that is "CREATE STATS any_name opt_name_list expr_list FROM" there was
> one shift/reduce conflict, which I tried to solve by splitting
> opt_name_list using two rules (one with "'(' name_list ')'" and one
> without), but that gave rise to two reduce/reduce conflicts, and I didn't
> see any way to deal with them.

That's fine.  I was going to suggest using ON after the stats type list
just because it seemed to read better.  FOR is about as good.

> (Note that I ended up realizing that stats_type_list is unnecessary
> since we can use name_list.)

Check.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-11 Thread Alvaro Herrera
Tom Lane wrote:

> Have you thought further about the upthread suggestion to just borrow
> SELECT's syntax lock stock and barrel?  That is, it'd look something
> like
> 
> CREATE STATISTICS name [(list of stats types)] expression-list FROM ...
> [ WHERE ... ] [ WITH (options) ]
> 
> This would certainly support any FROM stuff we might want later.
> Yeah, you'd need to reject any features you weren't supporting at
> execution, but doing that would allow issuing a friendlier message than
> "syntax error" anyway.
> 
> If you don't have the cycles for that, I'd be willing to look into it.

Bison seems to like the productions below.  Is this what you had in
mind?  These mostly mimic joined_table and table_ref, stripping out the
rules that we don't need.

Note that I had to put back the FOR keyword in there; without the FOR,
that is "CREATE STATS any_name opt_name_list expr_list FROM" there was
one shift/reduce conflict, which I tried to solve by splitting
opt_name_list using two rules (one with "'(' name_list ')'" and one
without), but that gave rise to two reduce/reduce conflicts, and I didn't
see any way to deal with them.

(Note that I ended up realizing that stats_type_list is unnecessary
since we can use name_list.)


CreateStatsStmt:
CREATE opt_if_not_exists STATISTICS any_name
opt_name_list FOR expr_list FROM stats_joined_table
{
CreateStatsStmt *n = 
makeNode(CreateStatsStmt);
n->defnames = $4;
n->stat_types = $6;
n->if_not_exists = $2;

stmt->relation = $9;
stmt->keys = $7;
$$ = (Node *)n;
}
;

stats_joined_table:
  '(' stats_joined_table ')'
{
}
| stats_table_ref CROSS JOIN stats_table_ref
{
}
| stats_table_ref join_type JOIN stats_table_ref 
join_qual
{
}
| stats_table_ref JOIN stats_table_ref join_qual
{
}
| stats_table_ref NATURAL join_type JOIN stats_table_ref
{
}
| stats_table_ref NATURAL JOIN table_ref
{
}
;

stats_table_ref:
relation_expr opt_alias_clause
{
}
| stats_joined_table
{
}
| '(' stats_joined_table ')' alias_clause
{
}
;



-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-11 Thread Alvaro Herrera
Tom Lane wrote:

> Have you thought further about the upthread suggestion to just borrow
> SELECT's syntax lock stock and barrel?  That is, it'd look something
> like
> 
> CREATE STATISTICS name [(list of stats types)] expression-list FROM ...
> [ WHERE ... ] [ WITH (options) ]
> 
> This would certainly support any FROM stuff we might want later.
> Yeah, you'd need to reject any features you weren't supporting at
> execution, but doing that would allow issuing a friendlier message than
> "syntax error" anyway.

Hmm, no, I hadn't looked at this angle.

> If you don't have the cycles for that, I'd be willing to look into it.

I'll give it a try today, and pass the baton if I'm unable to.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-11 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Hmm ... I'm not sure that I buy that particular argument.  If you're
>> concerned that the grammar could not handle "FROM x JOIN y USING (z)",
>> wouldn't it also have a problem with "FROM x JOIN y ON (z)"?

> Tomas spent some time trying to shoehorn the whole join syntax into the
> FROM clause, but stopped once he realized that the joined_table
> production uses table_ref, which allow things like TABLESAMPLE, SRFs,
> LATERAL, etc which presumably we don't want to accept in CREATE STATS.
> I didn't look into it any further.  But because of the other
> considerations, I did end up changing the ON to FOR.

Have you thought further about the upthread suggestion to just borrow
SELECT's syntax lock stock and barrel?  That is, it'd look something
like

CREATE STATISTICS name [(list of stats types)] expression-list FROM ...
[ WHERE ... ] [ WITH (options) ]

This would certainly support any FROM stuff we might want later.
Yeah, you'd need to reject any features you weren't supporting at
execution, but doing that would allow issuing a friendlier message than
"syntax error" anyway.

If you don't have the cycles for that, I'd be willing to look into it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-11 Thread Alvaro Herrera
Tom Lane wrote:

> Hmm ... I'm not sure that I buy that particular argument.  If you're
> concerned that the grammar could not handle "FROM x JOIN y USING (z)",
> wouldn't it also have a problem with "FROM x JOIN y ON (z)"?
> 
> It might work anyway, since the grammar should know whether ON or USING
> is needed to complete the JOIN clause.  But I think you'd better check
> whether the complete join syntax works there, even if we're not going
> to support it now.

Tomas spent some time trying to shoehorn the whole join syntax into the
FROM clause, but stopped once he realized that the joined_table
production uses table_ref, which allow things like TABLESAMPLE, SRFs,
LATERAL, etc which presumably we don't want to accept in CREATE STATS.
I didn't look into it any further.  But because of the other
considerations, I did end up changing the ON to FOR.

So the attached is the final version which I intend to push shortly.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
***
*** 1132,1138  WHERE tablename = 'road';
   To inspect functional dependencies on a statistics
   stts, you may do this:
  
! CREATE STATISTICS stts WITH (dependencies)
 ON (zip, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxname, stxkeys, stxdependencies
--- 1132,1138 
   To inspect functional dependencies on a statistics
   stts, you may do this:
  
! CREATE STATISTICS stts (dependencies)
 ON (zip, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxname, stxkeys, stxdependencies
***
*** 1219,1225  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 
AND b = 10;
   Continuing the above example, the n-distinct coefficients in a ZIP
   code table may look like the following:
  
! CREATE STATISTICS stts2 WITH (ndistinct)
 ON (zip, state, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxkeys AS k, stxndistinct AS nd
--- 1219,1225 
   Continuing the above example, the n-distinct coefficients in a ZIP
   code table may look like the following:
  
! CREATE STATISTICS stts2 (ndistinct)
 ON (zip, state, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxkeys AS k, stxndistinct AS nd
*** a/doc/src/sgml/planstats.sgml
--- b/doc/src/sgml/planstats.sgml
***
*** 526,532  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND 
b = 1;
  multivariate statistics on the two columns:
  
  
! CREATE STATISTICS stts WITH (dependencies) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN  
 
--- 526,532 
  multivariate statistics on the two columns:
  
  
! CREATE STATISTICS stts (dependencies) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN  
 
***
*** 569,575  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY 
a, b;
  calculation, the estimate is much improved:
  
  DROP STATISTICS stts;
! CREATE STATISTICS stts WITH (dependencies, ndistinct) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
 QUERY PLAN 
   
--- 569,575 
  calculation, the estimate is much improved:
  
  DROP STATISTICS stts;
! CREATE STATISTICS stts (dependencies, ndistinct) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
 QUERY PLAN 
   
*** a/doc/src/sgml/ref/create_statistics.sgml
--- b/doc/src/sgml/ref/create_statistics.sgml
***
*** 22,29  PostgreSQL documentation
   
  
  CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
! WITH ( option [= 
value] [, ... ] )
! ON ( column_name, 
column_name [, ...])
  FROM table_name
  
  
--- 22,29 
   
  
  CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
! [ ( statistic_type [, ... ] 
) ]
! FOR ( column_name, 
column_name [, ...])
  FROM table_name
  
  
***
*** 75,80  CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
--- 75,93 
 
  
 
+ statistic_type
+ 
+  
+   A statistic type to be enabled for this statistics.  Currently
+   supported types are ndistinct, which enables
+   n-distinct coefficient tracking,
+   and dependencies, which enables functional
+   dependencies.
+  
+ 
+
+ 
+
  column_name
  
   
***
*** 94,135  CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
 
  

- 
-   
-Parameters
- 
-  
-   

Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-04 Thread Sven R. Kunze

On 04.05.2017 23:13, Tom Lane wrote:

I'm not against what you've done here, because I had no love for USING
in this context anyway; it conveys approximately nothing to the mind
about what is in the list it's introducing.  But I'm concerned whether
we're boxing ourselves in by using ON.

Actually, "ON" doesn't seem all that mnemonic either.  Maybe "FOR"
would be a good substitute, if it turns out that "ON" has a problem?


The whole syntax reminds me of a regular SELECT clause. So, SELECT?


Also considering the most generic form of statistic support mentioned in 
[1], one could even thing about allowing aggregates, windowing functions 
etc, aka the full SELECT clause in the future.



Sven


[1] 
https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=vbj78v...@mail.gmail.com 




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-04 Thread Tom Lane
Alvaro Herrera  writes:
> Here's a patch implementing this idea.  From gram.y's comment, the
> support syntax is now:

>   
> /*
>*
>*QUERY :
> !  *CREATE STATISTICS stats_name [(stat types)] arguments
> ! 
> !  *where 'arguments' can be one or more of:
> !  *{ ON (columns)
> !  *  | FROM relations
> !  *  | WITH (options)
> !  *  | WHERE expression }

> Note that I removed the USING keyword in the stat types list, and also
> made it mandatory that that list appears immediately after the new stats
> name.  This should make it possible to have USING in the relation list
> (the FROM clause), if we allow explicit multiple relations with join
> syntax there.  The other options can appear in any order.

Hmm ... I'm not sure that I buy that particular argument.  If you're
concerned that the grammar could not handle "FROM x JOIN y USING (z)",
wouldn't it also have a problem with "FROM x JOIN y ON (z)"?

It might work anyway, since the grammar should know whether ON or USING
is needed to complete the JOIN clause.  But I think you'd better check
whether the complete join syntax works there, even if we're not going
to support it now.

I'm not against what you've done here, because I had no love for USING
in this context anyway; it conveys approximately nothing to the mind
about what is in the list it's introducing.  But I'm concerned whether
we're boxing ourselves in by using ON.

Actually, "ON" doesn't seem all that mnemonic either.  Maybe "FOR"
would be a good substitute, if it turns out that "ON" has a problem?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-04 Thread Alvaro Herrera
Here's a patch implementing this idea.  From gram.y's comment, the
support syntax is now:

  /*
   *
   *QUERY :
!  *CREATE STATISTICS stats_name [(stat types)] arguments
! 
!  *where 'arguments' can be one or more of:
!  *{ ON (columns)
!  *  | FROM relations
!  *  | WITH (options)
!  *  | WHERE expression }

Note that I removed the USING keyword in the stat types list, and also
made it mandatory that that list appears immediately after the new stats
name.  This should make it possible to have USING in the relation list
(the FROM clause), if we allow explicit multiple relations with join
syntax there.  The other options can appear in any order.

Also, both WITH and WHERE are accepted by the grammar, but immediately
throw "feature not implemented" error at parse time.

I was on the fence about adding copy/equal/out support for the new
StatisticArgument node; it seems pointless because that node does not
leave gram.y anyway.

Unless there are objections, I'll push this tomorrow.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
***
*** 1132,1138  WHERE tablename = 'road';
   To inspect functional dependencies on a statistics
   stts, you may do this:
  
! CREATE STATISTICS stts WITH (dependencies)
 ON (zip, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxname, stxkeys, stxdependencies
--- 1132,1138 
   To inspect functional dependencies on a statistics
   stts, you may do this:
  
! CREATE STATISTICS stts (dependencies)
 ON (zip, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxname, stxkeys, stxdependencies
***
*** 1219,1225  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 
AND b = 10;
   Continuing the above example, the n-distinct coefficients in a ZIP
   code table may look like the following:
  
! CREATE STATISTICS stts2 WITH (ndistinct)
 ON (zip, state, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxkeys AS k, stxndistinct AS nd
--- 1219,1225 
   Continuing the above example, the n-distinct coefficients in a ZIP
   code table may look like the following:
  
! CREATE STATISTICS stts2 (ndistinct)
 ON (zip, state, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxkeys AS k, stxndistinct AS nd
*** a/doc/src/sgml/planstats.sgml
--- b/doc/src/sgml/planstats.sgml
***
*** 526,532  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND 
b = 1;
  multivariate statistics on the two columns:
  
  
! CREATE STATISTICS stts WITH (dependencies) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN  
 
--- 526,532 
  multivariate statistics on the two columns:
  
  
! CREATE STATISTICS stts (dependencies) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN  
 
***
*** 569,575  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY 
a, b;
  calculation, the estimate is much improved:
  
  DROP STATISTICS stts;
! CREATE STATISTICS stts WITH (dependencies, ndistinct) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
 QUERY PLAN 
   
--- 569,575 
  calculation, the estimate is much improved:
  
  DROP STATISTICS stts;
! CREATE STATISTICS stts (dependencies, ndistinct) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
 QUERY PLAN 
   
*** a/doc/src/sgml/ref/create_statistics.sgml
--- b/doc/src/sgml/ref/create_statistics.sgml
***
*** 22,28  PostgreSQL documentation
   
  
  CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
! WITH ( option [= 
value] [, ... ] )
  ON ( column_name, 
column_name [, ...])
  FROM table_name
  
--- 22,28 
   
  
  CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
! [ ( statistic_type [, ... ] 
) ]
  ON ( column_name, 
column_name [, ...])
  FROM table_name
  
***
*** 75,80  CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
--- 75,93 
 
  
 
+ statistic_type
+ 
+  
+   A statistic type to be enabled for this statistics.  Currently
+   supported types are ndistinct, which enables
+   n-distinct coefficient tracking,
+   and 

Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Simon Riggs
On 3 May 2017 at 23:31, Andrew Dunstan  wrote:

>> It also seems like we don't need to have *both* fully-reserved keywords
>> introducing each clause *and* parentheses around the lists.  Maybe
>> dropping the parens around the stats-types list and the column-names
>> list would help to declutter?  (But I'd keep parens around the WITH
>> options, for consistency with other statements.)

+1

>> One other point is that as long as we've got reserved keywords introducing
>> each clause, there isn't actually an implementation reason why we couldn't
>> accept the clauses in any order.  Not sure I want to document it that way,
>> but it might not be a bad thing if the grammar was forgiving about whether
>> you write the USING or ON part first ...
>
> +1 for allowing arbitrary order of clauses.

+1

> I would document it with the
> USING clause at the end, and have that be what psql supports and pg_dump
> produces. Since there are no WITH options now we should leave that out
> until it's required.

Let's record the target syntax in parser comments so we can just slot
things in when needed later, without rediscussion.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Tomas Vondra



On 5/3/17 11:36 PM, Alvaro Herrera wrote:

Andrew Dunstan wrote:


On 05/03/2017 04:42 PM, Tom Lane wrote:



One other point is that as long as we've got reserved keywords introducing
each clause, there isn't actually an implementation reason why we couldn't
accept the clauses in any order.  Not sure I want to document it that way,
but it might not be a bad thing if the grammar was forgiving about whether
you write the USING or ON part first ...


+1 for allowing arbitrary order of clauses. I would document it with the
USING clause at the end, and have that be what psql supports and pg_dump
produces. Since there are no WITH options now we should leave that out
until it's required.


Ok, sounds good to me.  Unless there are objections I'm going to have a
shot at implementing this.  Thanks for the discussion.



Works for me. Do you also plan to remove the parentheses for the USING 
clause?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Alvaro Herrera
Andrew Dunstan wrote:

> On 05/03/2017 04:42 PM, Tom Lane wrote:

> > One other point is that as long as we've got reserved keywords introducing
> > each clause, there isn't actually an implementation reason why we couldn't
> > accept the clauses in any order.  Not sure I want to document it that way,
> > but it might not be a bad thing if the grammar was forgiving about whether
> > you write the USING or ON part first ...
> 
> +1 for allowing arbitrary order of clauses. I would document it with the
> USING clause at the end, and have that be what psql supports and pg_dump
> produces. Since there are no WITH options now we should leave that out
> until it's required.

Ok, sounds good to me.  Unless there are objections I'm going to have a
shot at implementing this.  Thanks for the discussion.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Andrew Dunstan


On 05/03/2017 04:42 PM, Tom Lane wrote:
> Alvaro Herrera  writes:
>> Yawn.  So, we have not achieved the stated goal which was to get rid of
>> the ugly clause in the middle of the command; moreover we have gained
>> *yet another* clause in the middle of the command.  Is this really an
>> improvement?  We're trading this
>> CREATE STATISTICS s1 WITH (dependencies, ndistinct, options) ON (a, b) 
>> FROM t1 WHERE partial-stuff;
>> for this:
>> CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON 
>> (a, b) FROM t1 WHERE partial-stuff;
>> Can we decide by a show of hands, please, whether we're really on board
>> with this change?
> That seems totally messy :-(
>
> I can't see any good reason why "WITH (options)" can't be at the end of
> the query.  WITH is a fully reserved word, there is not going to be any
> danger of a parse problem from having it follow the FROM or WHERE clauses.
> And the end is where we have other instances of "WITH (options)".
>
> It also seems like we don't need to have *both* fully-reserved keywords
> introducing each clause *and* parentheses around the lists.  Maybe
> dropping the parens around the stats-types list and the column-names
> list would help to declutter?  (But I'd keep parens around the WITH
> options, for consistency with other statements.)
>
> One other point is that as long as we've got reserved keywords introducing
> each clause, there isn't actually an implementation reason why we couldn't
> accept the clauses in any order.  Not sure I want to document it that way,
> but it might not be a bad thing if the grammar was forgiving about whether
> you write the USING or ON part first ...
>
>   


+1 for allowing arbitrary order of clauses. I would document it with the
USING clause at the end, and have that be what psql supports and pg_dump
produces. Since there are no WITH options now we should leave that out
until it's required.

cheers

andrew

-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Tom Lane
Alvaro Herrera  writes:
> Yawn.  So, we have not achieved the stated goal which was to get rid of
> the ugly clause in the middle of the command; moreover we have gained
> *yet another* clause in the middle of the command.  Is this really an
> improvement?  We're trading this
> CREATE STATISTICS s1 WITH (dependencies, ndistinct, options) ON (a, b) 
> FROM t1 WHERE partial-stuff;
> for this:
> CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON 
> (a, b) FROM t1 WHERE partial-stuff;

> Can we decide by a show of hands, please, whether we're really on board
> with this change?

That seems totally messy :-(

I can't see any good reason why "WITH (options)" can't be at the end of
the query.  WITH is a fully reserved word, there is not going to be any
danger of a parse problem from having it follow the FROM or WHERE clauses.
And the end is where we have other instances of "WITH (options)".

It also seems like we don't need to have *both* fully-reserved keywords
introducing each clause *and* parentheses around the lists.  Maybe
dropping the parens around the stats-types list and the column-names
list would help to declutter?  (But I'd keep parens around the WITH
options, for consistency with other statements.)

One other point is that as long as we've got reserved keywords introducing
each clause, there isn't actually an implementation reason why we couldn't
accept the clauses in any order.  Not sure I want to document it that way,
but it might not be a bad thing if the grammar was forgiving about whether
you write the USING or ON part first ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Alvaro Herrera
Stephen Frost wrote:

> > Here I add one, which seems to work for me.

Pushed it -- I also added another one which specifies options, to test
WITH handling in ruleutils.

> > Considering that Stephen missed a terminating semicolon for test with
> > create_order 96 (the last one prior to my commit) in commit
> > 31a8b77a9244, I propose that we change whatever is concatenating those
> > strings append a terminating semicolon.  (Surely we don't care about two
> > tests stanzas writing a single SQL command by omitting the semicolon
> > terminator.)
> 
> Whoops, sorry about that.  Yes, we could pretty easily add that.  The
> create SQL is built up at the bottom of 002_pg_dump.pl:
> 
> $create_sql .= $tests{$test}->{create_sql};

Okay, I added it.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Simon Riggs wrote:
> 
> > 2.
> > USING keyword, no brackets
> > CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1
> > WHERE partial-stuff;
> > 
> > and if there are options, use the WITH for the optional parameters like this
> > CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON
> > (a, b) FROM t1 WHERE partial-stuff;
> > 
> > 
> > I think I like (2)
> 
> OK, sounds sensible.

Yawn.  So, we have not achieved the stated goal which was to get rid of
the ugly clause in the middle of the command; moreover we have gained
*yet another* clause in the middle of the command.  Is this really an
improvement?  We're trading this
CREATE STATISTICS s1 WITH (dependencies, ndistinct, options) ON (a, b) FROM 
t1 WHERE partial-stuff;
for this:
CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON (a, 
b) FROM t1 WHERE partial-stuff;

Can we decide by a show of hands, please, whether we're really on board
with this change?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Stephen Frost
Alvaro,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Alvaro Herrera wrote:
> 
> > In the meantime, I noticed that pg_dump support for extstats is not
> > covered, which I'll go fix next.
> 
> Here I add one, which seems to work for me.
> 
> Considering that Stephen missed a terminating semicolon for test with
> create_order 96 (the last one prior to my commit) in commit
> 31a8b77a9244, I propose that we change whatever is concatenating those
> strings append a terminating semicolon.  (Surely we don't care about two
> tests stanzas writing a single SQL command by omitting the semicolon
> terminator.)

Whoops, sorry about that.  Yes, we could pretty easily add that.  The
create SQL is built up at the bottom of 002_pg_dump.pl:

$create_sql .= $tests{$test}->{create_sql};

> I wonder if there's any rationale to the create_order numbers.  Surely
> we only care for objects that depend on others.

Yes, it was just a way to manage those dependencies.  If there's value
in doing something more complicated then we could certainly do that, but
I'm not sure why it would be necessary to add that complexity.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Alvaro Herrera
Alvaro Herrera wrote:

> In the meantime, I noticed that pg_dump support for extstats is not
> covered, which I'll go fix next.

Here I add one, which seems to work for me.

Considering that Stephen missed a terminating semicolon for test with
create_order 96 (the last one prior to my commit) in commit
31a8b77a9244, I propose that we change whatever is concatenating those
strings append a terminating semicolon.  (Surely we don't care about two
tests stanzas writing a single SQL command by omitting the semicolon
terminator.)

I wonder if there's any rationale to the create_order numbers.  Surely
we only care for objects that depend on others.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 1dcb16de656d48c1004d4f4a12475438618a5c72 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 3 May 2017 14:18:22 -0300
Subject: [PATCH] Add pg_dump tests for CREATE STATISTICS

---
 src/bin/pg_dump/t/002_pg_dump.pl | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index ccd0ed6a53..79108cd331 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -1053,7 +1053,7 @@ my %tests = (
all_runs  => 1,
catch_all => 'ALTER TABLE ... commands',
create_order => 96,
-   create_sql => 'ALTER TABLE dump_test.test_table CLUSTER ON 
test_table_pkey',
+   create_sql => 'ALTER TABLE dump_test.test_table CLUSTER ON 
test_table_pkey;',
regexp=> qr/^
\QALTER TABLE test_table CLUSTER ON test_table_pkey;\E\n
/xm,
@@ -4920,6 +4920,40 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL 
WITH FUNCTION pg_catalog
role => 1,
section_post_data=> 1, }, },
 
+   'CREATE STATISTICS extended_stats_no_using' => {
+   all_runs => 1,
+   catch_all=> 'CREATE ... commands',
+   create_order => 97,
+   create_sql   => 'CREATE STATISTICS 
dump_test.test_ext_stats_no_using
+   ON (col1, col2) FROM 
dump_test.test_fifth_table;',
+   regexp => qr/^
+   \QCREATE STATISTICS dump_test.test_ext_stats_no_using 
ON (col1, col2) FROM test_fifth_table;\E
+   /xms,
+   like => {
+   binary_upgrade  => 1,
+   clean   => 1,
+   clean_if_exists => 1,
+   createdb=> 1,
+   defaults=> 1,
+   exclude_test_table  => 1,
+   exclude_test_table_data => 1,
+   no_blobs=> 1,
+   no_privs=> 1,
+   no_owner=> 1,
+   only_dump_test_schema   => 1,
+   pg_dumpall_dbprivs  => 1,
+   schema_only => 1,
+   section_post_data   => 1,
+   test_schema_plus_blobs  => 1,
+   with_oids   => 1, },
+   unlike => {
+   exclude_dump_test_schema => 1,
+   only_dump_test_table => 1,
+   pg_dumpall_globals   => 1,
+   pg_dumpall_globals_clean => 1,
+   role => 1,
+   section_pre_data => 1, }, },
+
'CREATE SEQUENCE test_table_col1_seq' => {
all_runs  => 1,
catch_all => 'CREATE ... commands',
-- 
2.11.0


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-03 Thread Alvaro Herrera
Simon Riggs wrote:

> 2.
> USING keyword, no brackets
> CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1
> WHERE partial-stuff;
> 
> and if there are options, use the WITH for the optional parameters like this
> CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON
> (a, b) FROM t1 WHERE partial-stuff;
> 
> 
> I think I like (2)

OK, sounds sensible.

Note that the USING list is also optional -- if you don't specify it, we
default to creating all stat types.  Also note that we currently don't
have any option other than stat types, so the WITH would always be empty
-- in other words we should remove it until we implement some option.

(I can readily see two possible options to implement for pg11, so the
omission of the WITH clause would be temporary:
 1. sample size to use instead of the per-column values
 2. whether to forcibly collect stats for all column for this stat
 object even if the column has gotten a SET STATISTICS 0
Surely there can be others.)

Patch attached that adds the USING clause replacing the WITH clause,
which is also optional and only accepts statistic types (it doesn't
accept "foo = OFF" anymore, as it seems pointless, but I'm open to
accepting it if people care about it.)

(This patch removes WITH, but I verified that bison accepts having both.
The second attached reversed patch is what I used for removal.)

In the meantime, I noticed that pg_dump support for extstats is not
covered, which I'll go fix next.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index b10b734b90..3406b7a1cd 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1132,7 +1132,7 @@ WHERE tablename = 'road';
  To inspect functional dependencies on a statistics
  stts, you may do this:
 
-CREATE STATISTICS stts WITH (dependencies)
+CREATE STATISTICS stts USING (dependencies)
ON (zip, city) FROM zipcodes;
 ANALYZE zipcodes;
 SELECT stxname, stxkeys, stxdependencies
@@ -1219,7 +1219,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 
AND b = 10;
  Continuing the above example, the n-distinct coefficients in a ZIP
  code table may look like the following:
 
-CREATE STATISTICS stts2 WITH (ndistinct)
+CREATE STATISTICS stts2 USING (ndistinct)
ON (zip, state, city) FROM zipcodes;
 ANALYZE zipcodes;
 SELECT stxkeys AS k, stxndistinct AS nd
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml
index f4430eb23c..16c433c3a2 100644
--- a/doc/src/sgml/planstats.sgml
+++ b/doc/src/sgml/planstats.sgml
@@ -526,7 +526,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 
AND b = 1;
 multivariate statistics on the two columns:
 
 
-CREATE STATISTICS stts WITH (dependencies) ON (a, b) FROM t;
+CREATE STATISTICS stts USING (dependencies) ON (a, b) FROM t;
 ANALYZE t;
 EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
   QUERY PLAN   
@@ -569,7 +569,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP 
BY a, b;
 calculation, the estimate is much improved:
 
 DROP STATISTICS stts;
-CREATE STATISTICS stts WITH (dependencies, ndistinct) ON (a, b) FROM t;
+CREATE STATISTICS stts USING (dependencies, ndistinct) ON (a, b) FROM t;
 ANALYZE t;
 EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
QUERY PLAN  
  
diff --git a/doc/src/sgml/ref/create_statistics.sgml 
b/doc/src/sgml/ref/create_statistics.sgml
index edbcf5840b..ff6ed0668f 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
  
 
 CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
-WITH ( option [= value] [, ... ] )
+USING ( statistic_type [, ... 
] )
 ON ( column_name, 
column_name [, ...])
 FROM table_name
 
@@ -103,14 +103,14 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
  
 

-The WITH clause can specify options
-for the statistics. Available options are listed below.
+The USING clause can specify types of statistics
+to be enabled. Available types are listed below.

 

 

-dependencies (boolean)
+dependencies
 
  
   Enables functional dependencies for the statistics.
@@ -119,7 +119,7 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na

 

-ndistinct (boolean)
+ndistinct
 
  
   Enables ndistinct coefficients for the statistics.
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 0b9c33e30a..f4d1712091 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -194,23 +194,22 @@ CreateStatistics(CreateStatsStmt *stmt)
stxkeys = 

Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-22 Thread David Rowley
On 22 April 2017 at 21:30, Simon Riggs  wrote:
> CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1
> WHERE partial-stuff;

+1

Seems much more CREATE INDEX like, and that's pretty good given that
most of the complaints so far were about it bearing enough resemblance
to CREATE INDEX

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-22 Thread Pavel Stehule
2017-04-22 11:30 GMT+02:00 Simon Riggs :

> On 21 April 2017 at 01:21, Tomas Vondra 
> wrote:
> > On 04/21/2017 12:13 AM, Tom Lane wrote:
> >>
> >> Alvaro Herrera  writes:
> >>>
> >>> Simon just pointed out that having the WITH clause appear in the middle
> >>> of the CREATE STATISTICS command looks odd; apparently somebody else
> >>> already complained on list about the same.  Other commands put the WITH
> >>> clause at the end, so perhaps we should do likewise in the new command.
> >>
> >>
> >>> Here's a patch to implement that.  I verified that if I change
> >>> qualified_name to qualified_name_list, bison does not complain about
> >>> conflicts, so this new syntax should support extension to multiple
> >>> relations without a problem.
> >>
> >>
> >> Yeah, WITH is fully reserved, so as long as the clause looks like
> >> WITH ( stuff... ) you're pretty much gonna be able to drop it
> >> wherever you want.
> >>
> >>> Discuss.
> >>
> >>
> >> +1 for WITH at the end; the existing syntax looks weird to me too.
> >>
> >
> > -1 from me
> >
> > I like the current syntax more, and  WHERE ... WITH seems a bit weird to
> me.
> > But more importantly, one thing Dean probably considered when proposing
> the
> > current syntax was that we may add support for partial statistics, pretty
> > much like partial indexes. And we don't allow WITH at the end (after
> WHERE)
> > for indexes:
> >
> > test=# create index on t (a) where a < 100 with (fillfactor=10);
> > ERROR:  syntax error at or near "with"
> > LINE 1: create index on t (a) where a < 100 with (fillfactor=10);
> > ^
> > test=# create index on t (a) with (fillfactor=10) where a < 100;
>
> OK, didn't know about WHERE clause; makes sense.
>
> Currently WITH is supported in two places, which feels definitely
> wrong. The WITH clause is used elsewhere to provide optional
> parameters, and if there are none present it is optional.
>
> OK, so lets try...
>
> 1.
> No keyword at all, just list of statistics we store (i.e. just lose the
> WITH)
> CREATE STATISTICS s1 (dependencies, ndistinct) ON (a, b) FROM t1 WHERE
> partial-stuff;
>
> and if there are options, use the WITH for the optional parameters like
> this
> CREATE STATISTICS s1 (dependencies, ndistinct) WITH (options) ON (a,
> b) FROM t1 WHERE partial-stuff;
>
> 2.
> USING keyword, no brackets
> CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1
> WHERE partial-stuff;
>
> and if there are options, use the WITH for the optional parameters like
> this
> CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON
> (a, b) FROM t1 WHERE partial-stuff;
>
>
> I think I like (2)
>

+1

Regards

Pavel


>
> --
> Simon Riggshttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-22 Thread Simon Riggs
On 21 April 2017 at 01:21, Tomas Vondra  wrote:
> On 04/21/2017 12:13 AM, Tom Lane wrote:
>>
>> Alvaro Herrera  writes:
>>>
>>> Simon just pointed out that having the WITH clause appear in the middle
>>> of the CREATE STATISTICS command looks odd; apparently somebody else
>>> already complained on list about the same.  Other commands put the WITH
>>> clause at the end, so perhaps we should do likewise in the new command.
>>
>>
>>> Here's a patch to implement that.  I verified that if I change
>>> qualified_name to qualified_name_list, bison does not complain about
>>> conflicts, so this new syntax should support extension to multiple
>>> relations without a problem.
>>
>>
>> Yeah, WITH is fully reserved, so as long as the clause looks like
>> WITH ( stuff... ) you're pretty much gonna be able to drop it
>> wherever you want.
>>
>>> Discuss.
>>
>>
>> +1 for WITH at the end; the existing syntax looks weird to me too.
>>
>
> -1 from me
>
> I like the current syntax more, and  WHERE ... WITH seems a bit weird to me.
> But more importantly, one thing Dean probably considered when proposing the
> current syntax was that we may add support for partial statistics, pretty
> much like partial indexes. And we don't allow WITH at the end (after WHERE)
> for indexes:
>
> test=# create index on t (a) where a < 100 with (fillfactor=10);
> ERROR:  syntax error at or near "with"
> LINE 1: create index on t (a) where a < 100 with (fillfactor=10);
> ^
> test=# create index on t (a) with (fillfactor=10) where a < 100;

OK, didn't know about WHERE clause; makes sense.

Currently WITH is supported in two places, which feels definitely
wrong. The WITH clause is used elsewhere to provide optional
parameters, and if there are none present it is optional.

OK, so lets try...

1.
No keyword at all, just list of statistics we store (i.e. just lose the WITH)
CREATE STATISTICS s1 (dependencies, ndistinct) ON (a, b) FROM t1 WHERE
partial-stuff;

and if there are options, use the WITH for the optional parameters like this
CREATE STATISTICS s1 (dependencies, ndistinct) WITH (options) ON (a,
b) FROM t1 WHERE partial-stuff;

2.
USING keyword, no brackets
CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1
WHERE partial-stuff;

and if there are options, use the WITH for the optional parameters like this
CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON
(a, b) FROM t1 WHERE partial-stuff;


I think I like (2)

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-22 Thread Dean Rasheed
On 21 April 2017 at 01:21, Tomas Vondra  wrote:
> On 04/21/2017 12:13 AM, Tom Lane wrote:
>>
>> Alvaro Herrera  writes:
>>>
>>> Simon just pointed out that having the WITH clause appear in the middle
>>> of the CREATE STATISTICS command looks odd; apparently somebody else
>>> already complained on list about the same.  Other commands put the WITH
>>> clause at the end, so perhaps we should do likewise in the new command.
>>
>> +1 for WITH at the end; the existing syntax looks weird to me too.
>
> -1 from me
>

Yeah, I'm still marginally in favour of the current syntax because
it's a bit more consistent with the CREATE VIEW syntax which puts the
WITH (options) clause before the query, and assuming that multi-table
and partial statistics support do get added in the future, the thing
that the statistics get created on is going to look more like a query.

OTOH, a few people have now commented that the syntax looks weird, and
not just because of the placement of the WITH clause. I don't have any
better ideas, but it's not too late to change if anyone else does...

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-21 Thread Tels
Moin,

On Fri, April 21, 2017 7:04 am, David Rowley wrote:
> On 21 April 2017 at 22:30, Tels  wrote:
>> I'd rather see:
>>
>>  CREATE STATISTICS stats_name ON table(col);
>>
>> as this both mirrors CREATE INDEX and foreign keys with REFERENCES. It
>> could also be extended to both more columns, expressions or other tables
>> like so:
>>
>>  CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b);
>>
>> and even:
>>
>>  CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b) WITH (options)
>> WHERE t2.a > 4;
>
> How would you express a join condition with that syntax?
>
>> This looks easy to remember, since it compares to:
>>
>>  CREATE INDEX idx_name ON t2(a,b) WITH (options) WHERE t2.a > 4;
>>
>> Or am I'm missing something?
>
> Sadly yes, you are, and it's not the first time.
>
> I seem to remember mentioning this to you already in [1].
>
> Please, can you read over [2], it mentions exactly what you're
> proposing and why it's not any good.
>
> [1]
> https://www.postgresql.org/message-id/cakjs1f9hmet+7adiceau8heompob5pkkcvyzliezje3dvut...@mail.gmail.com
> [2]
> https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=vbj78v...@mail.gmail.com

Ah, ok, thank you, somehow I missed your answer the first time. So, just
ignore me :)

Best wishes,

Tels


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-21 Thread David Rowley
On 21 April 2017 at 22:30, Tels  wrote:
> I'd rather see:
>
>  CREATE STATISTICS stats_name ON table(col);
>
> as this both mirrors CREATE INDEX and foreign keys with REFERENCES. It
> could also be extended to both more columns, expressions or other tables
> like so:
>
>  CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b);
>
> and even:
>
>  CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b) WITH (options)
> WHERE t2.a > 4;

How would you express a join condition with that syntax?

> This looks easy to remember, since it compares to:
>
>  CREATE INDEX idx_name ON t2(a,b) WITH (options) WHERE t2.a > 4;
>
> Or am I'm missing something?

Sadly yes, you are, and it's not the first time.

I seem to remember mentioning this to you already in [1].

Please, can you read over [2], it mentions exactly what you're
proposing and why it's not any good.

[1] 
https://www.postgresql.org/message-id/cakjs1f9hmet+7adiceau8heompob5pkkcvyzliezje3dvut...@mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=vbj78v...@mail.gmail.com

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-21 Thread Tels
Moin,

On Thu, April 20, 2017 8:21 pm, Tomas Vondra wrote:
> On 04/21/2017 12:13 AM, Tom Lane wrote:
>> Alvaro Herrera  writes:
>>> Simon just pointed out that having the WITH clause appear in the middle
>>> of the CREATE STATISTICS command looks odd; apparently somebody else
>>> already complained on list about the same.  Other commands put the WITH
>>> clause at the end, so perhaps we should do likewise in the new command.
>>
>>> Here's a patch to implement that.  I verified that if I change
>>> qualified_name to qualified_name_list, bison does not complain about
>>> conflicts, so this new syntax should support extension to multiple
>>> relations without a problem.
>>
>> Yeah, WITH is fully reserved, so as long as the clause looks like
>> WITH ( stuff... ) you're pretty much gonna be able to drop it
>> wherever you want.
>>
>>> Discuss.
>>
>> +1 for WITH at the end; the existing syntax looks weird to me too.
>>
>
> -1 from me
>
> I like the current syntax more, and  WHERE ... WITH seems a bit weird to
> me. But more importantly, one thing Dean probably considered when
> proposing the current syntax was that we may add support for partial
> statistics, pretty much like partial indexes. And we don't allow WITH at
> the end (after WHERE) for indexes:
>
> test=# create index on t (a) where a < 100 with (fillfactor=10);
> ERROR:  syntax error at or near "with"
> LINE 1: create index on t (a) where a < 100 with (fillfactor=10);
>  ^
> test=# create index on t (a) with (fillfactor=10) where a < 100;

While I'm not sure about where to put the WITH, so to speak, I do favour
consistency.

So I'm inclinded to keep the syntax like for the "create index".

More importantly however, I'd rather see the syntax on the "ON (column)
FROM relname" to be changed to match the existing examples. (Already wrote
this to Simon, not sure if my email made it to the list)

So instead:

 CREATE STATISTICS stats_name ON (columns) FROM relname

I'd rather see:

 CREATE STATISTICS stats_name ON table(col);

as this both mirrors CREATE INDEX and foreign keys with REFERENCES. It
could also be extended to both more columns, expressions or other tables
like so:

 CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b);

and even:

 CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b) WITH (options)
WHERE t2.a > 4;

This looks easy to remember, since it compares to:

 CREATE INDEX idx_name ON t2(a,b) WITH (options) WHERE t2.a > 4;

Or am I'm missing something?

Regards,

Tels


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-20 Thread Tomas Vondra

On 04/21/2017 12:13 AM, Tom Lane wrote:

Alvaro Herrera  writes:

Simon just pointed out that having the WITH clause appear in the middle
of the CREATE STATISTICS command looks odd; apparently somebody else
already complained on list about the same.  Other commands put the WITH
clause at the end, so perhaps we should do likewise in the new command.



Here's a patch to implement that.  I verified that if I change
qualified_name to qualified_name_list, bison does not complain about
conflicts, so this new syntax should support extension to multiple
relations without a problem.


Yeah, WITH is fully reserved, so as long as the clause looks like
WITH ( stuff... ) you're pretty much gonna be able to drop it
wherever you want.


Discuss.


+1 for WITH at the end; the existing syntax looks weird to me too.



-1 from me

I like the current syntax more, and  WHERE ... WITH seems a bit weird to 
me. But more importantly, one thing Dean probably considered when 
proposing the current syntax was that we may add support for partial 
statistics, pretty much like partial indexes. And we don't allow WITH at 
the end (after WHERE) for indexes:


test=# create index on t (a) where a < 100 with (fillfactor=10);
ERROR:  syntax error at or near "with"
LINE 1: create index on t (a) where a < 100 with (fillfactor=10);
^
test=# create index on t (a) with (fillfactor=10) where a < 100;


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-04-20 Thread Tom Lane
Alvaro Herrera  writes:
> Simon just pointed out that having the WITH clause appear in the middle
> of the CREATE STATISTICS command looks odd; apparently somebody else
> already complained on list about the same.  Other commands put the WITH
> clause at the end, so perhaps we should do likewise in the new command.

> Here's a patch to implement that.  I verified that if I change
> qualified_name to qualified_name_list, bison does not complain about
> conflicts, so this new syntax should support extension to multiple
> relations without a problem.

Yeah, WITH is fully reserved, so as long as the clause looks like
WITH ( stuff... ) you're pretty much gonna be able to drop it
wherever you want.

> Discuss.

+1 for WITH at the end; the existing syntax looks weird to me too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WITH clause in CREATE STATISTICS

2017-04-20 Thread Alvaro Herrera
Simon just pointed out that having the WITH clause appear in the middle
of the CREATE STATISTICS command looks odd; apparently somebody else
already complained on list about the same.  Other commands put the WITH
clause at the end, so perhaps we should do likewise in the new command.

Here's a patch to implement that.  I verified that if I change
qualified_name to qualified_name_list, bison does not complain about
conflicts, so this new syntax should support extension to multiple
relations without a problem.

Discuss.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
commit 237909c440054e97c09dc6e7711a4ca92892465b
Author: Alvaro Herrera 
AuthorDate: Thu Apr 20 18:19:06 2017 -0300
CommitDate: Thu Apr 20 18:19:06 2017 -0300

Put WITH clause at the end

diff --git a/doc/src/sgml/ref/create_statistics.sgml 
b/doc/src/sgml/ref/create_statistics.sgml
index edbcf5840b..910d6be8ab 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -22,9 +22,9 @@ PostgreSQL documentation
  
 
 CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
-WITH ( option [= value] [, ... ] )
 ON ( column_name, 
column_name [, ...])
 FROM table_name
+WITH ( option [= value] [, ... ] )
 
 
  
@@ -158,7 +158,7 @@ CREATE TABLE t1 (
 INSERT INTO t1 SELECT i/100, i/500
  FROM generate_series(1,100) s(i);
 
-CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t1;
+CREATE STATISTICS s1 ON (a, b) FROM t1 WITH (dependencies);
 
 ANALYZE t1;
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 89d2836c49..92e9aa8b28 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3847,28 +3847,28 @@ ExistingIndex:   USING INDEX index_name 
{ $$ = $3; }
 /*
  *
  * QUERY :
- * CREATE STATISTICS stats_name WITH (options) ON 
(columns) FROM relname
+ * CREATE STATISTICS stats_name ON (columns) FROM 
relname WITH (options)
  *
  */
 
 
-CreateStatsStmt:   CREATE STATISTICS any_name opt_reloptions ON '(' 
columnList ')' FROM qualified_name
+CreateStatsStmt:   CREATE STATISTICS any_name ON '(' columnList ')' FROM 
qualified_name opt_reloptions
{
CreateStatsStmt *n = 
makeNode(CreateStatsStmt);
n->defnames = $3;
-   n->relation = $10;
-   n->keys = $7;
-   n->options = $4;
+   n->relation = $9;
+   n->keys = $6;
+   n->options = $10;
n->if_not_exists = 
false;
$$ = (Node *)n;
}
-   | CREATE STATISTICS IF_P NOT EXISTS 
any_name opt_reloptions ON '(' columnList ')' FROM qualified_name
+   | CREATE STATISTICS IF_P NOT EXISTS 
any_name ON '(' columnList ')' FROM qualified_name opt_reloptions
{
CreateStatsStmt *n = 
makeNode(CreateStatsStmt);
n->defnames = $6;
-   n->relation = $13;
-   n->keys = $10;
-   n->options = $7;
+   n->relation = $12;
+   n->keys = $9;
+   n->options = $13;
n->if_not_exists = true;
$$ = (Node *)n;
}
diff --git a/src/test/regress/expected/stats_ext.out 
b/src/test/regress/expected/stats_ext.out
index 0d6f65e604..7dc6011e6b 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -389,7 +389,7 @@ EXPLAIN (COSTS OFF)
 (2 rows)
 
 -- create statistics
-CREATE STATISTICS func_deps_stat WITH (dependencies) ON (a, b, c) FROM 
functional_dependencies;
+CREATE STATISTICS func_deps_stat ON (a, b, c) FROM