Re: Fix the synopsis of pg_md5_hash

2024-03-14 Thread Tatsuro Yamada
Hi, Daniel and Michael,

On Thu, Mar 14, 2024 at 09:32:55AM +0100, Daniel Gustafsson wrote:
> > On 14 Mar 2024, at 07:02, Tatsuro Yamada  wrote:
> >> So, I created a patch to fix them.
> >
> > Thanks, applied.
>
> Oops.  Thanks.
> --
> Michael
>

Thank you guys!

Regards,
Tatsuro Yamada
NTT Open Source Software Center


Fix the synopsis of pg_md5_hash

2024-03-14 Thread Tatsuro Yamada
Hi,

The synopsis of pg_md5_hash() seems wrong in two ways:
- s/int/bool/
- "errstr" is missing
So, I created a patch to fix them.

src/common/md5_common.c
==
*  SYNOPSIS  #include "md5.h"
*int pg_md5_hash(const void *buff, size_t len, char *hexsum)
...
bool
pg_md5_hash(const void *buff, size_t len, char *hexsum, const char **errstr)
==

Please find attached file.

Regards,
Tatsuro Yamada
NTT Open Source Software Center



fix_synopsis_of_pg_md5_hash.patch
Description: fix_synopsis_of_pg_md5_hash.patch


Showing applied extended statistics in explain Part 2

2024-02-29 Thread Tatsuro Yamada
Hi,

The original patch, made by Tomas, improves the usability of extended
statistics, so I rebased it on 362de947 and would like to restart
developing it.

The previous thread [1] pointed out some issues to solve. I'll address
them as best I can, but I think this feature is worthwhile even with
some limitations.
Please find the attached file.

[1] 
https://www.postgresql.org/message-id/flat/8081617b-d80f-ae2b-b79f-ea7e926f9fcf%40enterprisedb.com

Regards,
Tatsuro Yamada
NTT Open Source Software Center



0001-show-stats-in-explain-rebased-on-362de947.patch
Description: 0001-show-stats-in-explain-rebased-on-362de947.patch


Re: Add psql command to list constraints

2022-03-28 Thread Tatsuro Yamada

Hi Dag,


The patch adds the command "\dco" to list constraints in psql. This
seems useful to me.


Thank you!

 

The patch applies cleanly to HEAD, although some hunks have rather large
offsets.

As far as I can tell, the "\dco" command works as documented.

I have however found the following issues with the patch:

* A TAB character has been added to doc/src/sgml/ref/psql-ref.sgml -
   this should be replaced with spaces.


Fixed.



* The call to listConstraints at src/bin/psql/command.c line 794 refers
   to &cmd[2]; this should rather be &cmd[3].

* The patch kills the "\dc" command in src/bin/psql/command.c
   This can be fixed by adding the following at line 800:
   else
 success =
   listConversions(pattern, show_verbose, show_system);



Oh, you are right! Fixed.

 

Another comment is that the "\dco" command outputs quite a lot of
information, which only fits in a wide terminal window. Would it be an
idea to only display the columns "Schema" and "Name" by default, and
use "+" to specify inclusion of the columns "Definition" and "Table".
 


I fixed the output columns as you proposed.

The current status of this patch is:

  - Addressed Dag's comments
  - Not implemented yet:
- Tab completion
- Regression test
- NOT NULL constraint, and so on (based on pg_attribute)

Please find attached new patch.


Thanks,
Tatsuro Yamada




diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index caabb06..125ae3d 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1388,6 +1388,26 @@ testdb=
 
 
   
+\dco[cfptuxS+] [ pattern ]
+
+
+Lists constraints.
+If pattern
+is specified, only entries whose name matches the pattern are listed.
+The modifiers c (check), f 
(foreign key),
+p (primary key), t (trigger), 
+u (unique), x (exclusion) can be 
+appended to the command, filtering the kind of constraints to list. 
+By default, only user-created constraints are shown; supply the 
+S modifier to include system objects. 
+If + is appended to the command name, each object
+is listed with its associated description.
+
+
+  
+
+
+  
 \dC[+] [ pattern ]
 
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 079f4a1..05ae25e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -780,7 +780,26 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
success = describeTablespaces(pattern, 
show_verbose);
break;
case 'c':
-   success = listConversions(pattern, 
show_verbose, show_system);
+   if (strncmp(cmd, "dco", 3) == 0) /* Constraint 
*/
+   switch (cmd[3])
+   {
+   case '\0':
+   case '+':
+   case 'S':
+   case 'c':
+   case 'f':
+   case 'p':
+   case 't':
+   case 'u':
+   case 'x':
+   success = 
listConstraints(&cmd[3], pattern, show_verbose, show_system);
+   break;
+   default:
+   status = 
PSQL_CMD_UNKNOWN;
+   break;
+   }
+   else
+   success = listConversions(pattern, 
show_verbose, show_system);
break;
case 'C':
success = listCasts(pattern, show_verbose);
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4dddf08..7acd25a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -19,6 +19,7 @@
 #include "catalog/pg_attribute_d.h"
 #include "catalog/pg_cast_d.h"
 #include "catalog/pg_class_d.h"
+#include "catalog/pg_constraint_d.h"
 #include "catalog/pg_default_acl_d.h"
 #include "common.h"
 #include "common/logging.h"
@@ -4599,6 +4600,109 @@ listExtendedStats(const char *pattern)
 }
 
 /*
+ * \dco
+ *
+ * Describes constraints
+ *
+ * As with \d,

Re: Add psql command to list constraints

2022-03-28 Thread Tatsuro Yamada

Hi All,


In the interests of trying to clean up the CF and keep things moving
I'll mark the patch rejected.


Thank you for managing the commitfest, and for the comments from many
hackers. I apologize for not being able to email more often due to my
busy day job.

First of all, I understand to a certain extent your opinion that we
can use \d and look at the constraints on a table-by-table basis as a
way to check them.
However, suppose we want to do a reverse lookup from a constraint to
its table. In that case, there are two ways: (1) use "\d" to look up
all tables, or (2) execute a SELECT statement against pg_constraint.
I think the proposed command is more convenient than either.

From another perspective, looking at the coverage of psql's
meta-commands, constraints seem to be the only object type without a
dedicated listing command. Adding one would improve that coverage.

I think there is room for other discussions about this patch. Still,
at least there are people (including myself) who think it is useful.
I don't think there is anything wrong with this patch that would
complicate the code or cause performance degradation, so I would like to
continue developing it for those who want to use it.

However, I understand that it will not be ready in time for PG15, so
I would like to move forward with PG16. Therefore, "Waiting on Author"
would describe the patch's status better than "Rejected".

P.S.
I'll send a new patch addressing Dag's comments in the next email.

Thanks,
Tatsuro Yamada






Re: Showing applied extended statistics in explain

2022-01-18 Thread Tatsuro Yamada
Hi Tomas!

>> What exactly use case do you have in mind?
>Well, my goal was to help users investigating the plan/estimates,
>because once you create multiple "candidate" statistics objects it may
>get quite tricky to determine which of them were applied, in what order,
>etc. It's not all that straightforward, given the various heuristics we
>use to pick the "best" statistics, apply dependencies last, etc. And I
>don't quite want to teach the users those rules, because I consider them
>mostly implementation details that we may want to tweak in the future.

You are right. This feature will be very useful for plan tuning with
extended statistics. Therefore, I would like to help develop it. :-D


>5) It includes just statistics name + clauses, but maybe we should
>include additional info (e.g estimate for that combination of clauses).

I thought the above sentence was about what level to aim for in the
initial version. Ideally, we would like to include additional
information. However, it is clear that the more things we add, the
longer it will take to develop. So, I think it is better to commit the
initial version at a minimal level to provide it to users quickly.

As long as the extended statistics names are displayed in the EXPLAIN
result, it is possible to figure out which extended statistics were
used. That information alone is valuable to the user.


> 4) The info is collected always, but I guess we should do that only when
>in explain mode. Not sure how expensive it is.

In pg_plan_advsr, which I created, I used ProcessUtility_hook to
detect the EXPLAIN command: it scans all queries for T_ExplainStmt and
sets a flag. I guess we can't use this method in Postgres core, right?

If we have to collect the extra info for all query executions, we need
to reduce the overhead, i.e. collect only the minimum necessary. To do
that, it would be better to display less extended-stats info in the
EXPLAIN results; for example, displaying only the statistics names
would be fine, I guess. (I haven't fully understood your patch yet, so
sorry if I'm wrong.)


>6) The clauses in the grouping query are transformed to AND list, which
>is wrong. This is easy to fix, I was lazy to do that in a PoC patch.

6) is related to 5).
If we agree not to show the quals of extended stats in the EXPLAIN
result, we can solve this problem by removing that code. Is that okay?


>2) The deparsing is modeled (i.e. copied) from how we deal with index
>quals, but it's having issues with nested OR clauses, because there are
>nested RestrictInfo nodes and the deparsing does not expect that.
>
>3) It does not work for functional dependencies, because we effectively
>"merge" all functional dependencies and apply the entries. Not sure how
>to display this, but I think it should show the individual dependencies
>actually applied.
>
>7) It does not show statistics for individual expressions. I suppose
>examine_variable could add it to the rel somehow, and maybe we could do
>that with index expressions too?

I'm not sure about 2), 3), and 7) above, so I'd like to see some
concrete examples of queries. I would like to include them in the test
patterns for regression testing.


To be fixed:

* StatisticExtInfo should have a _copy etc. node functionality now,
 right? I've hit "unrecognized node type" with a prepared statement
 because it's missing.

* Is there any particular reason behind choosing only some scan nodes to
 display extended stats? E.g. BitmapHeapScan is missing.

* (New) In the case of Group by, we should put Extended stats info under the
 Hash/Group Aggregate node (not under Scan node).

* (New) We have to create a regression test including the above test
patterns.


Attached patch:

It is rebased on the current head of master because there were many
failed hunks when I applied the previous patch.

I'll create regression tests first.

Regards,
Tatsuro Yamada


0001-show-stats-in-explain-rebased.patch
Description: Binary data


Re: \dP and \dX use ::regclass without "pg_catalog."

2022-01-10 Thread Tatsuro Yamada

Hi justin,

On 2022/01/08 6:56, Justin Pryzby wrote:

On Fri, Jan 07, 2022 at 08:08:57PM +0900, Michael Paquier wrote:

On Fri, Jan 07, 2022 at 06:30:30PM +0900, Tatsuro Yamada wrote:

We should prefix them with pg_catalog as well.
Are you planning to make a patch?
If not, I'll make a patch later since that's where \dX is.


If any of you can make a patch, that would be great.  Thanks!


I'd propose these.



Thanks!

Regards,
Tatsuro Yamada





Re: \dP and \dX use ::regclass without "pg_catalog."

2022-01-07 Thread Tatsuro Yamada

Hi Justin,

On 2022/01/07 11:22, Justin Pryzby wrote:

But, we missed two casts to ::text which don't use pg_catalog.
Evidently the cast is to allow stable sorting.


Ah, you are right.

We should prefix them with pg_catalog as well.
Are you planning to make a patch?
If not, I'll make a patch later since that's where \dX is.

Regards,
Tatsuro Yamada






Re: Add psql command to list constraints

2021-11-15 Thread Tatsuro Yamada

Hi David,

Thanks for your comments.


Okay, but you agree that there are DBAs and users who want to see the
list of constraints, I think.


My opinion is this doesn't exist because there isn't any demand for it.



I don't know if this is a good example, but if you look at Stack
Overflow, people asking how to list constraints appear regularly. (The
other day, I also wanted to see such a list, and I arrived at the page
below.) So I think the only reason this hasn't been implemented is
that nobody brought the request to -hackers; the demand is there.

https://stackoverflow.com/questions/62987794/how-to-list-all-constraints-of-a-table-in-postgresql


Regards,
Tatsuro Yamada










Re: Add psql command to list constraints

2021-11-15 Thread Tatsuro Yamada

Hi Justin,


Thanks for your comments!

 

Currently, DBAs need the table name to see the constraint information.


Or, they can query pg_constraint or information_schema: check_constraints,
table_constraints.



Yeah, right.
If they can use the meta-command instead of a long query against
pg_constraint or information_schema (and also pg_attribute), it would
be helpful, I believe. :-D




-   success = listConversions(pattern, 
show_verbose, show_system);
+   if (strncmp(cmd, "dco", 3) == 0) /* Constraint 
*/
+   switch (cmd[3])
+   {
+   case '\0':
+   case '+':


Does "+" do anything ?



No, it doesn't. Removed.


 

+++ b/src/bin/psql/help.c
@@ -231,6 +231,7 @@ slashUsage(unsigned short int pager)



fprintf(output, _("  \\db[+]  [PATTERN]  list tablespaces\n"));
fprintf(output, _("  \\dc[S+] [PATTERN]  list conversions\n"));
+   fprintf(output, _("  \\dco[S] [PATTERN]  list constraint\n"));


Should be plural "constraints".

I think "exclude" should be called "exclusion" ("exclude" sounded to me like
you're going to provide a way to "exclude" types of constraints, like "xc"
would show everything except check constraints).



Thanks! Fixed both.


Attached file is new patch. It includes:

  - Fix help message s/constraint/constraints/
  - s/Exclude/Exclusion/
  - Remove unused modifier "+"
  - Add document for \dco



Thanks,
Tatsuro Yamada

From eee92ee549e49d0b5aef438aff10236611db410e Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Mon, 15 Nov 2021 17:58:31 +0900
Subject: [PATCH] Add psql command to list constraints POC3

- Fix help message s/constraint/constraints/
- s/Exclude/Exclusion/
- Remove unused modifier "+"
- Add document for \dco
---
 doc/src/sgml/ref/psql-ref.sgml | 18 
 src/bin/psql/command.c | 18 +++-
 src/bin/psql/describe.c| 98 ++
 src/bin/psql/describe.h|  3 ++
 src/bin/psql/help.c|  1 +
 5 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 48248f7..c6704d7 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1388,6 +1388,24 @@ testdb=
 
 
   
+\dco[cfptuxS] [ pattern ]
+
+
+Lists constraints.
+If pattern
+is specified, only entries whose name matches the pattern are listed.
+The modifiers c (check), f 
(foreign key),
+p (primary key), t (trigger), 
+u (unique), x (exclusion) can be 
+appended to the command, filtering the kind of constraints to list. 
+By default, only user-created constraints are shown; supply the 
+   S modifier to include system objects. 
+
+
+  
+
+
+  
 \dC[+] [ pattern ]
 
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 3de9d09..0379610 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -769,7 +769,23 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
success = describeTablespaces(pattern, 
show_verbose);
break;
case 'c':
-   success = listConversions(pattern, 
show_verbose, show_system);
+   if (strncmp(cmd, "dco", 3) == 0) /* Constraint 
*/
+   switch (cmd[3])
+   {
+   case '\0':
+   case 'S':
+   case 'c':
+   case 'f':
+   case 'p':
+   case 'u':
+   case 't':
+   case 'x':
+   success = 
listConstraints(&cmd[2], pattern, show_system);
+   break;
+   default:
+   status = 
PSQL_CMD_UNKNOWN;
+   break;
+   }
break;
case 'C':
success = listCasts(pattern, show_verbose);
diff --git a/src/bin/psql/d

Re: Add psql command to list constraints

2021-11-15 Thread Tatsuro Yamada

Hi Justin,

Thanks for your comments and review!


Maybe it ought to be possible to choose the type of constraints to show.
Similar to how \dt shows tables and \di shows indexes and \dti shows
tables+inds, you could run \dcoc for check constraints and \dcof for foreign
keys.  But I think "\dco" is too long of a prefix...



Yeah, agreed.
I added a filter based on the type of constraints:
  - c for check
  - f for foreign key
  - p for primary key
  - t for trigger
  - u for unique
  - x for exclusion
The modifiers c, f, p, u, t, and x can also be combined.

The following are examples of \dcop, \dcof, and \dcopf.


postgres=# \dcop
List of constraints
 Schema | Name |Definition|  Table
+--+--+-
 public | t03_pk1_pkey | PRIMARY KEY (product_no) | t03_pk1
 public | t04_fk_pkey  | PRIMARY KEY (order_id)   | t04_fk
(2 rows)

postgres=# \dcof
List of constraints
 Schema |  Name  |   Definition 
   | Table
++-+
 public | t04_fk_product_no_fkey | FOREIGN KEY (product_no) REFERENCES 
t03_pk1(product_no) | t04_fk
(1 row)

postgres=# \dcopf
 List of constraints
 Schema |  Name  |   Definition 
   |  Table
++-+-
 public | t03_pk1_pkey   | PRIMARY KEY (product_no) 
   | t03_pk1
 public | t04_fk_pkey| PRIMARY KEY (order_id)   
   | t04_fk
 public | t04_fk_product_no_fkey | FOREIGN KEY (product_no) REFERENCES 
t03_pk1(product_no) | t04_fk
(3 rows)



I too think \dco is a long name, so I'd like to get suggestions for
something shorter. :)


 

+   initPQExpBuffer(&buf);
+   printfPQExpBuffer(&buf,
+ "SELECT \n"
+ "n.nspname AS \"%s\", \n"
+ "cst.conname AS \"%s\", \n"
+ "pg_catalog.pg_get_constraintdef(cst.oid) AS 
\"%s\", \n"
+ "c.relname AS \"%s\" \n"
+ "FROM pg_constraint cst \n"
+ "JOIN pg_namespace n ON n.oid = 
cst.connamespace \n"
+ "JOIN pg_class c ON c.oid = cst.conrelid 
\n",


You should write "pg_catalog." prefix for the tables (in addition to the
function).



Oops, You are right. Fixed.

 

Rather than join to pg_class, you can write conrelid::pg_catalog.regclass,
since regclass is supported since at least v7.3 (but ::regnamespace was
introduced in v9.5, so the join against pg_namespace is still necessary).
https://www.postgresql.org/docs/9.5/datatype-oid.html

+   myopt.title = _("List of constsraints");


spelling: constraints



Thanks! Fixed.

 

I'm not confident that if I would use this, so let's wait to see if someone
else wants to give a +1.



Okay, but you agree that there are DBAs and users who want to see the
list of constraints, I think.

Currently, DBAs need the table name to see the constraint information.
However, with this feature, you can see its definition and table name
from the constraint name.
For example, it will be easier to understand how many foreign key
constraints are in the DB. The \d command also displays the constraints
but does not list them, so this feature is more beneficial for those who
want to check them.


Attached new patch includes:

  -  Add a filter by contype
  - Add pg_catalog prefix
  - Fix typo
  - Fix help message to add \dco

Not implemented yet:
  - NOT NULL constraint, and so on (based on pg_attribute)
  - Tab completion
  - Regression test
  - Document

Any comments welcome! :-D


Thanks,
Tatsuro Yamada
From ce46a3fa7252109348876ab9efff8bafcb119730 Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Mon, 15 Nov 2021 17:58:31 +0900
Subject: [PATCH] Add psql command to list constraints POC2

- Add a filter by contype
- Add pg_catalog prefix
- Fix typo
- Fix help message to add \dco
---
 src/bin/psql/command.c  | 19 +-
 src/bin/psql/describe.c | 98 +
 src/bin/psql/describe.h |  3 ++
 src/bin/psql/help.c |  1 +
 4 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 3de9d09..92c61bc 100

Add psql command to list constraints

2021-11-14 Thread Tatsuro Yamada

Hi,

I have been wondering why there is no meta-command for listing
constraints in psql, so I created a POC patch, drawing on my
experience developing the \dX command in PG14.

This feature is helpful for DBAs when they want to check or
modify the definition of constraints.

The current status of the POC patch is as follows:

  - Add "\dco" command to list constraints from pg_constraint

  - Not implemented yet:
- NOT NULL constraint, and so on (based on pg_attribute)
- Tab completion
- Regression test
- Document

The following are test results (see attached test_list_con.sql):

postgres=# \dco
 List of constsraints
 Schema |  Name   |   Definition
|  Table
+-+-+--
 public | t01_chk_price_check | CHECK ((price > (0)::numeric))  
| t01_chk
 public | t02_uniq_product_no_key | UNIQUE (product_no) 
| t02_uniq
 public | t03_pk1_pkey| PRIMARY KEY (product_no)
| t03_pk1
 public | t03_pk2_product_no_key  | UNIQUE (product_no) 
| t03_pk2
 public | t04_fk_pkey | PRIMARY KEY (order_id)  
| t04_fk
 public | t04_fk_product_no_fkey  | FOREIGN KEY (product_no) REFERENCES 
t03_pk1(product_no) | t04_fk
 public | t05_ex_c_excl   | EXCLUDE USING gist (c WITH &&)  
| t05_ex
(7 rows)



I have the following two questions that need to be discussed.

Questions:
(1) What string should be assigned as the meta-command for this feature?
   Currently, \dc and \dC are not available, so I tentatively
   assigned \dco. However, I do not have a strong opinion, so please
   let me know if you have any suggestions.

(2) About domain constraints
   There is the \dD command to show a list of domain constraints,
   so I think this feature should not include them. Is that okay?


If I can get "+1" for this new feature development, I would like to
improve the patch by adding NOT NULL constraints, and so on.
Any advice or comments would be appreciated.


Thanks,
Tatsuro Yamada
-- check
CREATE TABLE t01_chk (
product_no integer,
name text,
price numeric CHECK (price > 0)
);

-- unique
CREATE TABLE t02_uniq (
product_no integer UNIQUE,
name text,
price numeric
);

-- primary key
CREATE TABLE t03_pk1 (
product_no integer PRIMARY KEY,
name text,
price numeric
);

CREATE TABLE t03_pk2 (
product_no integer UNIQUE NOT NULL,
name text,
price numeric
);

-- foreign key
CREATE TABLE t04_fk (
order_id integer PRIMARY KEY,
product_no integer REFERENCES t03_pk1 (product_no),
quantity integer
);

-- exclude
CREATE TABLE t05_ex (
c circle,
EXCLUDE USING gist (c WITH &&)
);

-- trigger
CREATE OR REPLACE FUNCTION trigger_notice_ab() RETURNS trigger as $$
  begin
raise notice 'trigger % on % % % for %: (a,b)=(%,%)',
TG_NAME, TG_TABLE_NAME, TG_WHEN, TG_OP, TG_LEVEL,
NEW.a, NEW.b;
if TG_LEVEL = 'ROW' then
   return NEW;
end if;
return null;
  end;
  $$ language plpgsql;

CREATE TABLE t06_trg (a int, b int);

CREATE CONSTRAINT TRIGGER t06_con_trig AFTER insert ON t06_trg DEFERRABLE
FOR each row EXECUTE PROCEDURE trigger_notice_ab();

-- domain
CREATE DOMAIN t07_posint AS integer CHECK (VALUE > 0);
CREATE TABLE t07_dm (id t07_posint);

-- not null
CREATE TABLE t08_notnull (
a varchar NOT NULL
);

-- default
CREATE TABLE t09_default (
a varchar DEFAULT 'a'
);

-- Tests
\dco
\dco t01_chk_price_check
\dco t02*
\dcoS pg_stat*

-- Visibility tests
 visible
create schema con_s1;
create TABLE con_s1.t10_test (a int PRIMARY KEY);
set search_path to public, con_s1;
\dco

 not visible
create role regress_constraint nosuperuser;
set role regress_constraint;
\dco
reset role;

-- clean-up
DROP TABLE t01_chk,
t02_uniq,
t03_pk1,
t03_pk2,
t04_fk,
t05_ex,
t06_trg,
t07_dm,
t08_notnull,
t09_default,
con_s1.t10_test;

drop domain t07_posint;
drop schema con_s1;
drop role regress_constraint;


diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 49d4c0e3ce..c450972f27 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -769,7 +769,10 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
success = describeTablespaces(pattern, 
show_verbose);
break;
case 'c':
-   success = listConversions(pattern, 
show_verbose, show_system);
+   if (strncmp(cmd, "dco", 

Re: Question about psql meta-command with schema option doesn't use visibilityrule

2021-11-07 Thread Tatsuro Yamada

Hi David,


I have a question about the specification of the permission check
(visibilityrule) for psql meta-commands with a schema option.

Visibility means search_path, not permission.  If s_a is not in the
search_path, its objects are not visible unqualified but can be seen
(via the catalog) when schema-qualified.


Thanks for your comments! I understood them:
 - that all users can see the system catalog (pg_catalog.*) is by
   specification, so it is not a bug
 - visibility and permission are not the same (I had confused them before, oops)

Regards,
Tatsuro Yamada





Question about psql meta-command with schema option doesn't use visibilityrule

2021-11-07 Thread Tatsuro Yamada

Hi,

I have a question about the specification of the permission check
(visibilityrule) for psql meta-commands with a schema option.

According to the source code [1], no visibility check is applied when
a schema pattern is given. As a result, a role that has not been
granted access can see other roles' object names.
We might say that's okay because it's only a name, not contents (data),
but it seems undesirable to me.

The following is a reproducer using \dX commands.
Note: It is not only \dX but also \d because it uses the same
permission check function (processSQLNamePattern).

The reproduction procedure (including some results):

-- Create role a, b as non-superuser
create role a nosuperuser;
create role b nosuperuser;
grant CREATE on database postgres to a;

-- Create schema s_a, table hoge, and its extend stats by role a
set role a;
create schema s_a;
create table s_a.hoge(a int, b int);
create statistics s_a.hoge_ext on a,b from s_a.hoge;
set search_path to public, s_a;

-- Run \dX and \dX s_a.* by role a: OK (since schema s_a was created by role a)
\dX
   List of extended statistics
 Schema |   Name   |   Definition   | Ndistinct | Dependencies |   MCV
+--++---+--+-
 s_a| hoge_ext | a, b FROM hoge | defined   | defined  | defined
(1 row)

\dX s_a.*
   List of extended statistics
 Schema |   Name   |   Definition   | Ndistinct | Dependencies |   MCV
+--++---+--+-
 s_a| hoge_ext | a, b FROM hoge | defined   | defined  | defined
(1 row)

-- Run \dX by role b: OK
--  (not displayed is fine since role b can't see info of role a)
reset role;
set role b;
\dX
 List of extended statistics
 Schema | Name | Definition | Ndistinct | Dependencies | MCV
+--++---+--+-
(0 rows)

-- Run \dX with schema by role b: OK?? (It should be NG?)
-- this case is a point in my question
\dX s_a.*
 List of extended statistics
 Schema |   Name   | Definition | Ndistinct | Dependencies |   MCV
+--++---+--+-
 s_a| hoge_ext | a, b FROM s_a.hoge | defined   | defined  | defined
(1 row)

-- clean-up
reset role;
drop schema s_a cascade;
revoke CREATE on DATABASE postgres FROM a;
drop role a;
drop role b;


From the above results, I expected "\dX s_a.*" to show nothing, the
same as "\dX", but the info is displayed. I'm wondering about this
behavior.

I'm maybe missing something, but if this is a problem, I'll send a
patch. Any comments are welcome!


[1]: processSQLNamePattern in src/fe_utils/string_utils.c
if (schemabuf.len > 2)
{
/* We have a schema pattern, so constrain the schemavar */

/* Optimize away a "*" pattern */
if (strcmp(schemabuf.data, "^(.*)$") != 0 && schemavar)
{
WHEREAND();
appendPQExpBuffer(buf, "%s OPERATOR(pg_catalog.~) ", schemavar);
appendStringLiteralConn(buf, schemabuf.data, conn);
if (PQserverVersion(conn) >= 12)
appendPQExpBufferStr(buf, " COLLATE pg_catalog.default");
appendPQExpBufferChar(buf, '\n');
}
}
else
{
/* No schema pattern given, so select only visible objects */
if (visibilityrule)
{
WHEREAND();
appendPQExpBuffer(buf, "%s\n", visibilityrule);
}
}



Thanks,
Tatsuro Yamada






Re: Add prefix pg_catalog to pg_get_statisticsobjdef_columns() in describe.c (\dX)

2021-09-28 Thread Tatsuro Yamada

Hi Magnus!



I found other functions in describe.c to which we should add the
"pg_catalog" prefix. This fix is similar to the following commit.

Yup, that's correct. Applied and backpatched to 14 (but it won't be in the 14.0 
release).



Thank you!


Regards,
Tatsuro Yamada






Add prefix pg_catalog to pg_get_statisticsobjdef_columns() in describe.c (\dX)

2021-09-26 Thread Tatsuro Yamada

Hi,

I found other functions in describe.c to which we should add the
"pg_catalog" prefix. This fix is similar to the following commit.

=
commit 359bcf775550aa577c86ea30a6d071487fcca1ed
Author: Alvaro Herrera 
Date:   Sat Aug 28 12:04:15 2021 -0400

psql \dX: reference regclass with "pg_catalog." prefix
=

Please find attached the patch.

Regards,
Tatsuro Yamada
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 90ff649..a33d77c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2881,7 +2881,7 @@ describeOneTableDetails(const char *schemaname,
  
"stxrelid::pg_catalog.regclass, "
  
"stxnamespace::pg_catalog.regnamespace AS nsp, "
  "stxname,\n"
- 
"pg_get_statisticsobjdef_columns(oid) AS columns,\n"
+ 
"pg_catalog.pg_get_statisticsobjdef_columns(oid) AS columns,\n"
  "  'd' = any(stxkind) 
AS ndist_enabled,\n"
  "  'f' = any(stxkind) 
AS deps_enabled,\n"
  "  'm' = any(stxkind) 
AS mcv_enabled,\n"
@@ -4734,7 +4734,7 @@ listExtendedStats(const char *pattern)
if (pset.sversion >= 14)
appendPQExpBuffer(&buf,
  "pg_catalog.format('%%s FROM 
%%s', \n"
- "  
pg_get_statisticsobjdef_columns(es.oid), \n"
+ "  
pg_catalog.pg_get_statisticsobjdef_columns(es.oid), \n"
  "  
es.stxrelid::pg_catalog.regclass) AS \"%s\"",
  gettext_noop("Definition"));
else


Re: list of extended statistics on psql (\dX)

2021-07-26 Thread Tatsuro Yamada

Hi Tomas and Justin,

On 2021/07/27 4:26, Tomas Vondra wrote:

Hi,

I've pushed the last version of the fix, including the regression tests etc. 
Backpatch to 14, where \dX was introduced.



Thank you!


Regards,
Tatsuro Yamada









Re: list of extended statistics on psql (\dX)

2021-07-07 Thread Tatsuro Yamada

Hi Tomas and Justin,

On 2021/06/07 4:47, Tomas Vondra wrote:

Here's a slightly more complete patch, tweaking the regression tests a
bit to detect this.



I tested your patch on PG14beta2 and PG15devel.
And they work fine.
===
 All 209 tests passed.
===

Next time I create a feature on psql, I will be careful to add
a check for schema visibility rules. :-D

Thanks,
Tatsuro Yamada







Re: Farewell greeting

2021-06-27 Thread Tatsuro Yamada

Hi Tsunakawa-san,

On 2021/06/27 16:41, tsunakawa.ta...@fujitsu.com wrote:

I'm moving to another company from July 1st.  I may possibly not be allowed to 
do open source activities there, so let me say goodbye once.  Thank you all for 
your help.  I really enjoyed learning PostgreSQL and participating in its 
development.

It's a pity that I may not be able to be part of PostgreSQL's great history until 
it becomes the most popular database (in the DB-Engines ranking).  However, if 
possible, I'd like to come back as just MauMau.

I hope you all will succeed.



Good luck in your future career! :-D
See you at the Japanese PG developers' meetup.


Thanks,
Tatsuro Yamada





Re: Duplicate history file?

2021-06-09 Thread Tatsuro Yamada

Hi,

On 2021/06/09 16:23, Fujii Masao wrote:

On 2021/06/09 15:58, Tatsuro Yamada wrote:

This may not be important at this time since it is a
PoC patch, but I would like to inform you that there
was a line that contained multiple spaces instead of tabs.

$ git diff --check
src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
+


Even with the patch, if "test ! -f ..." is used in archive_command,
you may still *easily* get the trouble that WAL archiving keeps failing?


Thanks for your comment.

Yes, it may solve the error when using the test command, but it is
dangerous to continue using the cp command, which is listed as an
example of an archive command.

 

Instead, we should consider and document a "better" command for
archive_command, or implement something like a pg_archivecopy command
in core (as far as I remember, there was a discussion about
this feature before...)?



I agree with that idea.
Since archiving is important for all users, I think there should be
either a better and safer command in the documentation, or an archive
command (pg_archivecopy?) that we provide as a community, as you said.
I am curious about the conclusions of past discussions. :)


Regards,
Tatsuro Yamada






Re: Duplicate history file?

2021-06-09 Thread Tatsuro Yamada

Hi


Thank you for fixing the patch.
The new patch works well in my environment. :-D


This may not be important at this time since it is a
PoC patch, but I would like to inform you that there
was a line that contained multiple spaces instead of tabs.

$ git diff --check
src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
+

Regards,
Tatsuro Yamada





Re: Duplicate history file?

2021-06-08 Thread Tatsuro Yamada

Hi Horiguchi-san,

On 2021/06/09 11:47, Kyotaro Horiguchi wrote:

On 2021/06/08 18:19, Tatsuro Yamada wrote:

I've tried your patch. Unfortunately, it didn't seem to have any good
effect on the script I sent to reproduce the problem.


Oops! The patch forgot about history files.

I checked the attached with your repro script and it works fine.



Thank you for fixing the patch.
The new patch works well in my environment. :-D


Regards,
Tatsuro Yamada





Re: Duplicate history file?

2021-06-08 Thread Tatsuro Yamada

Hi Horiguchi-san,


This thread should have been started here:

https://www.postgresql.org/message-id/20210531.165825.921389284096975508.horikyota.ntt%40gmail.com


(To recap: In a replication set using archive, startup tries to
restore WAL files from archive before checking pg_wal directory for
the desired file.  The behavior itself is intentionally designed and
reasonable. However, the restore code notifies of a restored file
regardless of whether it has been already archived or not.  If
archive_command is written so as to return error for overwriting as we
suggest in the documentation, that behavior causes archive failure.)

After playing with this, I see the problem just by restarting a
standby even in a simple archive-replication set after making
not-special prerequisites.  So I think this is worth fixing.

With this patch, KeepFileRestoredFromArchive compares the contents of
just-restored file and the existing file for the same segment only
when:

  - archive_mode = always
  and - the file to restore already exists in pg_wal
  and - it has a .done and/or .ready status file.

which doesn't happen usually.  Then the function skips archive
notification if the contents are identical.  The included TAP test is
working both on Linux and Windows.
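
The conditions quoted above can be modeled as plain shell logic. This is only a sketch for readers of the archive — the real check lives in C inside KeepFileRestoredFromArchive, and the pg_wal/archive_status layout used here is an assumption of the model, not something taken from the patch:

```shell
#!/bin/bash
# Hedged model of the skip condition described above: compare contents
# only when archive_mode=always, the segment already exists in pg_wal,
# and a .done/.ready status file is present; report "skip the archive
# notification" (exit 0) only when the restored file is byte-identical.
should_skip_notification() {
    local archive_mode="$1" pg_wal="$2" restored="$3" seg="$4"
    [ "$archive_mode" = always ] || return 1                    # condition 1
    [ -f "$pg_wal/$seg" ] || return 1                           # condition 2
    [ -f "$pg_wal/archive_status/$seg.done" ] ||
        [ -f "$pg_wal/archive_status/$seg.ready" ] || return 1  # condition 3
    cmp -s "$restored" "$pg_wal/$seg"                           # identical -> skip
}
```

With this model, a standby restart in an archive-replication set (the case described above) maps to "segment present, .done present, contents identical", so the notification is skipped instead of producing a duplicate archival.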



Thank you for the analysis and the patch.
I'll try the patch tomorrow.

I just noticed that this thread is still tied to another thread
(it's not an independent thread). To fix that, it may be better to
create a new thread again. 



I've tried your patch. Unfortunately, it didn't seem to have any good
effect on the script I sent to reproduce the problem.

I understand that, as Stefan says, the test and cp commands have
problems and should not be used for archive commands. Maybe this is not
a big problem for the community.
Nevertheless, even if we do not improve the feature, I think it is a
good idea to state explicitly in the documentation, for new users, that
archiving may fail under certain conditions.

I'd like to hear the opinions of experts on the archive command.

P.S.
My customer's problem has already been solved, so it's ok. I've
emailed -hackers with the aim of preventing users from encountering
the same problem.


Regards,
Tatsuro Yamada






Re: Duplicate history file?

2021-06-07 Thread Tatsuro Yamada

On 2021/06/07 17:32, Kyotaro Horiguchi wrote:

I just noticed that this thread is still tied to another thread
(it's not an independent thread). To fix that, it may be better to
create a new thread again.

Mmm. Maybe my mailer automatically inserted an In-Reply-To field for the
cited message.  Do we (the two of us) bother re-launching a new
thread?



The reason I suggested it was that I thought it might be
confusing if the threads were not independent when registered in
a commitfest. If that is not a problem, then I'm fine with it as
is. :-D


(You can freely do that, too:p)


I should have told you that I would be happy to create a new thread.

The reason I didn't create a new thread is that I didn't want people to
think I had hijacked the thread. :)



Hmm. I found that the pgsql-hackers archive treats the new thread as a
part of the old thread, so CF-app would do the same.

Anyway I re-launched a new standalone thread.

https://www.postgresql.org/message-id/20210607.173108.348241508233844279.horikyota.ntt%40gmail.com


Thank you!


Regards,
Tatsuro Yamada






Re: Duplicate history file?

2021-06-07 Thread Tatsuro Yamada

On 2021/06/07 16:31, Kyotaro Horiguchi wrote:

At Mon, 07 Jun 2021 16:13:08 +0900, Tatsuro Yamada 
 wrote in

I just noticed that this thread is still tied to another thread
(it's not an independent thread). To fix that, it may be better to
create a new thread again.


Mmm. Maybe my mailer automatically inserted an In-Reply-To field for the
cited message.  Do we (the two of us) bother re-launching a new
thread?



The reason I suggested it was that I thought it might be
confusing if the threads were not independent when registered in
a commitfest. If that is not a problem, then I'm fine with it as is. :-D

Regards,
Tatsuro Yamada






Re: Duplicate history file?

2021-06-07 Thread Tatsuro Yamada

Hi Horiguchi-san,



(To recap: In a replication set using archive, startup tries to
restore WAL files from archive before checking pg_wal directory for
the desired file.  The behavior itself is intentionally designed and
reasonable. However, the restore code notifies of a restored file
regardless of whether it has been already archived or not.  If
archive_command is written so as to return error for overwriting as we
suggest in the documentation, that behavior causes archive failure.)

After playing with this, I see the problem just by restarting a
standby even in a simple archive-replication set after making
not-special prerequisites.  So I think this is worth fixing.

With this patch, KeepFileRestoredFromArchive compares the contents of
just-restored file and the existing file for the same segment only
when:

  - archive_mode = always
  and - the file to restore already exists in pg_wal
  and - it has a .done and/or .ready status file.

which doesn't happen usually.  Then the function skips archive
notification if the contents are identical.  The included TAP test is
working both on Linux and Windows.



Thank you for the analysis and the patch.
I'll try the patch tomorrow.

I just noticed that this thread is still tied to another thread
(it's not an independent thread). To fix that, it may be better to
create a new thread again.


Regards,
Tatsuro Yamada






Re: Duplicate history file?

2021-06-07 Thread Tatsuro Yamada

Hi Horiguchi-san,


Regarding "test ! -f",
I am wondering how many people are using the test command for
archive_command. If I remember correctly, the guide provided by
NTT OSS Center that we are using does not recommend using the test
command.


I think, as the PG-REX documentation says, the simple cp works well as
far as the assumption of PG-REX - that no double failure happens, and
that the instructions are followed - holds.



I believe that this assumption started to be wrong after
archive_mode=always was introduced. As far as I can tell, it doesn't
happen when it's archive_mode=on.



On the other hand, I found that the behavior happens more generally.

If a standby with archive_mode=always crashes, it starts recovery from
the last checkpoint. If the checkpoint was in an archived segment, the
restarted standby will fetch the already-archived segment from the
archive and then fail to archive it. (The attached).

So, your fear stated upthread is applicable for wider situations. The
following suggestion is rather harmful for the archive_mode=always
setting.

https://www.postgresql.org/docs/14/continuous-archiving.html

The archive command should generally be designed to refuse to
overwrite any pre-existing archive file. This is an important safety
feature to preserve the integrity of your archive in case of
administrator error (such as sending the output of two different
servers to the same archive directory).


I'm not sure how we should treat this..  Since the archive must store
files actually applied to the server data, just being already archived
cannot be the reason for omitting archiving.  We need to make sure the
new file is byte-identical to the already-archived version. We could
compare just the *restored* file to the same file in pg_wal, but it might
be too much of a penalty for the benefit. (Attached second file.)



Thanks for creating the patch!

 

Otherwise the documentation would need someting like the following if
we assume the current behavior.


The archive command should generally be designed to refuse to
overwrite any pre-existing archive file. This is an important safety
feature to preserve the integrity of your archive in case of
administrator error (such as sending the output of two different
servers to the same archive directory).

+ For standby with the setting archive_mode=always, there's a case where
+ the same file is archived more than once.  For safety, it is
+ recommended that when the destination file exists, the archive_command
+ returns zero if it is byte-identical to the source file.



Agreed.
That is the same solution I mentioned earlier.
If possible, it would also be good to mention it in postgresql.conf (though
that might be overkill?!).
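
The recommended behavior could be sketched as a small helper. This is a sketch only — the function name and the temp-file rename are assumptions of this sketch, not something proposed in the thread or the documentation:

```shell
#!/bin/bash
# Hedged sketch of an archive_command helper along the lines suggested
# above: re-archiving a byte-identical file (e.g. a history file under
# archive_mode=always) succeeds, differing content is refused, and the
# copy goes through a temporary name so a crash mid-copy cannot leave a
# truncated file under the final name.
safe_archive() {
    local src="$1" dst="$2"
    if [ -f "$dst" ]; then
        cmp -s "$src" "$dst"    # identical -> 0 (success), differing -> fail
        return
    fi
    local tmp="$dst.tmp.$$"
    cp "$src" "$tmp" && mv "$tmp" "$dst"
}
```

Wired into postgresql.conf, it would look something like `archive_command = '/path/to/safe_archive.sh %p /mnt/server/archivedir/%f'` (with the function body as the script).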


Regards,
Tatsuro Yamada






Re: Duplicate history file?

2021-05-31 Thread Tatsuro Yamada

Hi Horiguchi-san,

On 2021/05/31 16:58, Kyotaro Horiguchi wrote:

So, I started a thread for this topic diverged from the following
thread.

https://www.postgresql.org/message-id/4698027d-5c0d-098f-9a8e-8cf09e36a...@nttcom.co.jp_1


So, what should we do for the user? I think we should put some notes
in postgresql.conf or in the documentation. For example, something
like this:


I'm not sure about the exact configuration you have in mind, but that
would happen on the cascaded standby in the case where the upstream
promotes. In this case, the history file for the new timeline is
archived twice.  walreceiver triggers archiving of the new history
file at the time of the promotion, then startup does the same when it
restores the file from archive.  Is it what you complained about?



Thank you for creating a new thread and explaining this.
We are not using cascade replication in our environment, but I think
the situation is similar. As an overview, when I do a promote,
the archive_command fails due to the history file.

I've created a reproduction script that includes building replication,
and I'll share it with you. (I used Robert's test.sh as a reference
for creating the reproduction script. Thanks)

The scenario (sr_test_historyfile.sh) is as follows.

#1 Start pgprimary as a main
#2 Create standby
#3 Start pgstandby as a standby
#4 Execute archive command
#5 Shutdown pgprimary
#6 Start pgprimary as a standby
#7 Promote pgprimary
#8 Execute archive_command again, but it fails since a duplicate history
   file exists (see pgstandby.log)

Note that this may not be appropriate if you consider it as a recovery
procedure for replication configuration. However, I'm sharing it as it is
because this seems to be the procedure used in the customer's environment 
(PG-REX).

 

The same workaround using the alternative archive script works for the
case.

We could check pg_wal before fetching from the archive; however, archiving
is not controlled so strictly that duplicate archiving never happens, and
I think we'd rather accept possible duplicate archiving than risk holes in
the archive. (so we suggest the "test ! -f" script)



Note: If you use archive_mode=always, the archive_command on the
standby side should not use "test ! -f".



It could be one workaround. However, I would suggest not to overwrite
existing files (with a file with different content) to protect archive
from corruption.

We might need to write that in the documentation...


I think you're right: replacing it with an alternative archive script
that includes the cmp command will resolve the error. The reason is that
I checked with the diff command that the history files are identical.

=
$ diff -s pgprimary/arc/0002.history  pgstandby/arc/0002.history
Files pgprimary/arc/0002.history and pgstandby/arc/0002.history are 
identical
=

Regarding "test ! -f",
I am wondering how many people are using the test command for
archive_command. If I remember correctly, the guide provided by
NTT OSS Center that we are using does not recommend using the test command.


Regards,
Tatsuro Yamada

2021-06-01 13:11:50.168 JST [3041] LOG:  starting PostgreSQL 14beta1 on 
x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 
64-bit
2021-06-01 13:11:50.169 JST [3041] LOG:  listening on IPv6 address "::1", port 
5450
2021-06-01 13:11:50.169 JST [3041] LOG:  listening on IPv4 address "127.0.0.1", 
port 5450
2021-06-01 13:11:50.171 JST [3041] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5450"
2021-06-01 13:11:50.176 JST [3042] LOG:  database system was shut down at 
2021-06-01 13:11:50 JST
2021-06-01 13:11:50.179 JST [3041] LOG:  database system is ready to accept 
connections
2021-06-01 13:11:56.890 JST [3041] LOG:  received fast shutdown request
2021-06-01 13:11:56.891 JST [3041] LOG:  aborting any active transactions
2021-06-01 13:11:56.891 JST [3041] LOG:  background worker "logical replication 
launcher" (PID 3049) exited with exit code 1
2021-06-01 13:11:56.891 JST [3043] LOG:  shutting down
2021-06-01 13:11:56.946 JST [3041] LOG:  database system is shut down
2021-06-01 13:12:00.023 JST [3120] LOG:  starting PostgreSQL 14beta1 on 
x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 
64-bit
2021-06-01 13:12:00.023 JST [3120] LOG:  listening on IPv6 address "::1", port 
5450
2021-06-01 13:12:00.023 JST [3120] LOG:  listening on IPv4 address "127.0.0.1", 
port 5450
2021-06-01 13:12:00.025 JST [3120] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5450"
2021-06-01 13:12:00.028 JST [3121] LOG:  database system was shut down at 
2021-06-01 13:11:56 JST
cp: cannot stat ‘arc/0002.history’: No such file or directory
2021-06-01 13:12:00.031 JST [3121] LOG:  entering standby mode
cp: cannot stat ‘arc/00010007’: No such file or directory
2021-06-01 13:12:00.035 JST [3121] LOG:  consistent reco

Re: Race condition in recovery?

2021-05-30 Thread Tatsuro Yamada

Hi Horiguchi-san,


(Why me?)


Because the story was also related to PG-REX, which you are
also involved in developing. Perhaps off-list instead of
-hackers would have been better, but I emailed -hackers because
the same problem could be encountered by PostgreSQL users who
do not use PG-REX.

 

In a project I helped with, I encountered an issue where
the archive command kept failing. I thought this issue was
related to the problem in this thread, so I'm sharing it here.
If I should create a new thread, please let me know.

* Problem
   - The archive_command always fails.


Although I think the configuration is kind of broken, it can be seen
as mimicking the case of a shared archive, where primary and
standby share the same archive directory.



To be precise, the environment of this reproduction script is
different from our actual environment. I tried to make it as
simple as possible to reproduce the problem.
(In order to make it look like the actual environment, you have
to build a PG-REX environment.)

A simple replication environment might be enough, so I'll try to
recreate a script that is closer to the actual environment later.

 

Basically we need to use an archive command like the following in that
case to avoid this kind of failure. The script returns "success"
when the target file is found but is identical to the source file. I
don't find such a description in the documentation, and haven't
bothered digging into the mailing-list archive.

==
#! /bin/bash
# $1: source WAL file (%p), $2: destination in the archive (%f)

if [ -f "$2" ]; then
    # The target is already archived: succeed only if the existing
    # file is byte-identical to the source.
    cmp -s "$1" "$2"
    if [ $? != 0 ]; then
        exit 1
    fi
    exit 0
fi

cp "$1" "$2"
==


Thanks for your reply.
Since the above behavior is different from the behavior of the
test command in the following example in postgresql.conf, I think
we should write a note about this example.

# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

Let me describe the problem we faced.
- When archive_mode=always, archive_command is (sometimes) executed
  in a situation where the history file already exists on the standby
  side.

- In this case, if "test ! -f" is written in the archive_command of
  postgresql.conf on the standby side, the command will keep failing.

  Note that this problem does not occur when archive_mode=on.

So, what should we do for the user? I think we should put some notes
in postgresql.conf or in the documentation. For example, something
like this:


Note: If you use archive_mode=always, the archive_command on the standby side should not 
use "test ! -f".
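
As a standalone illustration of this note (plain files only, no PostgreSQL involved; the history-file name and content are made up for the demo):

```shell
#!/bin/bash
# The documented 'test ! -f ... && cp ...' style succeeds only while the
# target is absent; re-running it for the *same, identical* file -- which
# is what happens to history files under archive_mode=always -- fails.
# A cmp-based check instead reports success for identical content.
wal=$(mktemp -d); arc=$(mktemp -d)
printf '1\t0/16E9CF8\tbefore promote\n' > "$wal/00000002.history"

# First archival: target absent, the documented command succeeds.
if test ! -f "$arc/00000002.history" && cp "$wal/00000002.history" "$arc/"; then rc1=0; else rc1=1; fi
# Re-archival of the *identical* file: 'test ! -f' now fails the command.
if test ! -f "$arc/00000002.history" && cp "$wal/00000002.history" "$arc/"; then rc2=0; else rc2=1; fi
# A cmp-based command instead treats the identical re-archival as success.
if cmp -s "$wal/00000002.history" "$arc/00000002.history"; then rc3=0; else rc3=1; fi

echo "first=$rc1 retry=$rc2 cmp=$rc3"   # -> first=0 retry=1 cmp=0
```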




Regards,
Tatsuro Yamada






Re: Race condition in recovery?

2021-05-27 Thread Tatsuro Yamada

Hi Horiguchi-san,

In a project I helped with, I encountered an issue where
the archive command kept failing. I thought this issue was
related to the problem in this thread, so I'm sharing it here.
If I should create a new thread, please let me know.

* Problem
  - The archive_command always fails.

* Conditions under which the problem occurs (parameters)
  - archive_mode=always
  - Using the test command in archive_command
 
* Probable cause

  - I guess that it is because the .history file already exists
    and the test command fails
    (but if we use archive_mode=on, archive_command succeeds).

* How to reproduce
  - Attached is a script to reproduce the problem.
  Note: the script will remove $PGDATA when it started

The test command is listed as an example of archive_command usage
in postgresql.conf, and the project faced this problem because it used
the example as is. If this behavior is intended, it would be
better not to present the test command as a usage example.
Or maybe there should be a note that the test command should not be used
when archive_mode=always. Maybe I'm missing something, sorry.

Regards,
Tatsuro Yamada
set -v
pg_ctl stop
rm -rf $PGDATA
initdb --encoding=UTF8 --no-locale
mkdir $PGDATA/arc

cat <<@EOF >> $PGDATA/postgresql.conf
archive_mode = always
archive_command = 'test ! -f arc/%f && cp %p arc/%f'

restore_command = 'cp arc/%f %p'
recovery_target_timeline = 'latest'
@EOF

touch $PGDATA/standby.signal
pg_ctl start
sleep 3
pg_ctl promote
sleep 3
psql -c "create table test (a int);"
# These are the trigger for archive_command
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();"  # OK
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();"  # OK
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();"  # OK

sleep 1

touch $PGDATA/standby.signal
pg_ctl stop
pg_ctl start
echo " NG  (If we use archive_mode=on, archive_command is successful)"
sleep 10
pg_ctl promote
sleep 3
# These are the trigger for archive_command
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();"  # NG: archive_command failed
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();"  # NG: archive_command failed
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();"  # NG: archive_command failed
waiting for server to shut down done
server stopped
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /home/postgres/PG140/DATA ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Tokyo
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok


Success. You can now start the database server using:

pg_ctl -D /home/postgres/PG140/DATA -l logfile start

waiting for server to start2021-05-28 12:19:06.378 JST [8215] LOG:  
starting PostgreSQL 14beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 
20150623 (Red Hat 4.8.5-39), 64-bit
2021-05-28 12:19:06.378 JST [8215] LOG:  listening on IPv6 address "::1", port 
1400
2021-05-28 12:19:06.379 JST [8215] LOG:  listening on IPv4 address "127.0.0.1", 
port 1400
2021-05-28 12:19:06.381 JST [8215] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.1400"
2021-05-28 12:19:06.386 JST [8216] LOG:  database system was shut down at 
2021-05-28 12:19:06 JST
cp: cannot stat ‘arc/0002.history’: No such file or directory
2021-05-28 12:19:06.393 JST [8216] LOG:  entering standby mode
cp: cannot stat ‘arc/00010001’: No such file or directory
2021-05-28 12:19:06.402 JST [8216] LOG:  consistent recovery state reached at 
0/16E9CF8
2021-05-28 12:19:06.402 JST [8216] LOG:  invalid record length at 0/16E9CF8: 
wanted 24, got 0
2021-05-28 12:19:06.402 JST [8215] LOG:  database system is ready to accept 
read only connections
cp: cannot stat ‘arc/0002.history’: No such file or directory
cp: cannot stat ‘arc/00010001’: No such file or directory
cp: cannot stat ‘arc/0002.history’: No such file or directory
 done
server started
waiting for server to promotecp: cannot stat 
‘arc/00010001’: No such file or directory
2021-05-28 12:19:09.462 JST [8216] LOG:  received promote request
2021-05-28 12:19:09.463 JST [8216] LOG:  redo is not required
cp: cannot stat ‘arc/00010001’: No such file or directory
cp: cannot st

Re: Typo in jsonfuncs.c

2021-04-08 Thread Tatsuro Yamada

Hi Julien and Amit Kapila,

On 2021/04/08 17:33, Julien Rouhaud wrote:

On Thu, Apr 08, 2021 at 10:06:56AM +0900, Tatsuro Yamada wrote:

Hi,

I found a typo in jsonfuncs.c, probably.
   s/an an/an/
Please find attached patch.


For the archives' sake, this has been pushed as of 8ffb003591.



Julien, thanks for the info! :-D
Also, thanks for taking your time to push this, Amit.
 
Regards,

Tatsuro Yamada





Typo in jsonfuncs.c

2021-04-07 Thread Tatsuro Yamada

Hi,

I found a typo in jsonfuncs.c, probably.
  s/an an/an/
Please find attached patch.

Thanks,
Tatsuro Yamada

diff --git a/src/backend/utils/adt/jsonfuncs.c 
b/src/backend/utils/adt/jsonfuncs.c
index 511467280f..9961d27df4 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -4882,7 +4882,7 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2,
  * case if target is an array. The assignment index will not be restricted by
  * number of elements in the array, and if there are any empty slots between
  * last element of the array and a new one they will be filled with nulls. If
- * the index is negative, it still will be considered an an index from the end
+ * the index is negative, it still will be considered an index from the end
  * of the array. Of a part of the path is not present and this part is more
  * than just one last element, this flag will instruct to create the whole
  * chain of corresponding objects and insert the value.


Re: Huge memory consumption on partitioned table with FKs

2021-03-11 Thread Tatsuro Yamada

On 2021/03/12 2:44, Tom Lane wrote:

I wrote:

Now, maybe it's a coincidence that husky failed on a
partitioned-foreign-key test right after this patch went in, but I bet
not.  Since husky runs CLOBBER_CACHE_ALWAYS, it looks to me like we've
overlooked some cache-reset scenario or other.


After reproducing it here, that *is* a coincidence.  I shall now go beat
up on the correct blame-ee, instead.



I did "make installcheck-parallel" on 7bb97211a, just in case. It was 
successful. :-D

Regards,
Tatsuro Yamada





Re: Huge memory consumption on partitioned table with FKs

2021-03-10 Thread Tatsuro Yamada

On 2021/03/11 9:39, Amit Langote wrote:

On Thu, Mar 11, 2021 at 4:25 AM Tom Lane  wrote:

Amit Langote  writes:

On Wed, Mar 10, 2021 at 8:37 AM Tom Lane  wrote:

Hmm.  So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child.  We just have to
be sure we pass the right parameter values.



Right.


I did some cosmetic fooling with this (mostly, rewriting the comments
YA time) and pushed it.


Perfect.   Thanks for your time on this.



Thanks for fixing the problem! :-D


Regards,
Tatsuro Yamada





Re: Is it useful to record whether plans are generic or custom?

2021-01-27 Thread Tatsuro Yamada

Hi Toricoshi-san,

On 2021/01/12 20:36, torikoshia wrote:

I suppose it would be normal practice to store past results of
pg_stat_statements for future comparisons.
If this is the case, I think that if we only add the number of
generic plan execution, it will give us a hint to notice the cause
of performance degradation due to changes in the plan between
generic and custom.

For example, if there is a clear difference in the number of times
the generic plan is executed between before and after performance
degradation as below, it would be natural to check if there is a
problem with the generic plan.

...

Attached a patch that just adds a generic call counter to
pg_stat_statements.



I'd like to use the view when we face a performance problem and need to
find the reason. If we monitor generic_calls at fixed points over time
(should I say time-series analysis?), we can notice counter changes and
tell whether the generic plan is the suspect or not. So the patch helps
DBAs, I believe.

Regards,
Tatsuro Yamada





Re: Is it useful to record whether plans are generic or custom?

2021-01-27 Thread Tatsuro Yamada

Horiguchi-san,

On 2020/12/04 15:37, Kyotaro Horiguchi wrote:

And I'm also struggling with the following.

| However, I also began to wonder how effective it would be to just
| distinguish between generic and custom plans.  Custom plans can
| include all sorts of plans, and, thinking of cache validation, generic
| plans can also include various plans.

| Considering this, I'm starting to feel that it would be better to
| not just keeping whether generic or cutom but the plan itself as
| discussed in the below thread.


FWIW, that seems to me to be like some existing extension modules,
pg_stat_plans or pg_store_plans.  The former is faster but may lose
plans; the latter doesn't lose plans but is slower.  I feel that we'd
better consider a simpler feature if we are intending it to be part of
a contrib module.


There is also pg_show_plans.
Ideally, it would be better to be able to track all of the plan changes
via some view, since plan stability is important for DBAs when
they use PostgreSQL in mission-critical systems.
I prefer that the feature be released as a contrib module. :-D

Regards,
Tatsuro Yamada
 






Re: Is it useful to record whether plans are generic or custom?

2021-01-27 Thread Tatsuro Yamada

Torikoshi-san,

On 2020/12/04 15:03, torikoshia wrote:


ISTM now that creating pg_stat_statements_xxx views
both for generic and custom plans is better than my PoC patch.

And I'm also struggling with the following.

| However, I also began to wonder how effective it would be to just
| distinguish between generic and custom plans.  Custom plans can
| include all sorts of plans, and, thinking of cache validation, generic
| plans can also include various plans.

| Considering this, I'm starting to feel that it would be better to
| not just keeping whether generic or cutom but the plan itself as
| discussed in the below thread.


Yamada-san,

Do you think it's effective just to distinguish between generic
and custom plans?


Sorry for the super delayed reply.

Ah, it's okay.
It would be better to check both kinds of information in a single view
from a usability perspective. However, it's also fine to divide the
information into two views to use memory effectively.

Regards,
Tatsuro Yamada





Re: Is it useful to record whether plans are generic or custom?

2021-01-27 Thread Tatsuro Yamada

On 2020/12/04 14:29, Fujii Masao wrote:


On 2020/11/30 15:24, Tatsuro Yamada wrote:

Hi Torikoshi-san,



In this patch, exposing new columns is mandatory, but I think
it's better to make it optional by adding a GUC something
like 'pgss.track_general_custom_plans.

I also feel it makes the number of columns too many.
Just adding the total time may be sufficient.



I think this feature is useful for DBA. So I hope that it gets
committed to PG14. IMHO, many columns are Okay because DBA can
select specific columns by their query.
Therefore, it would be better to go with the current design.


But that design may waste lots of memory. No? For example, when
plan_cache_mode=force_custom_plan, the memory used for the columns
for generic plans is not used.

Regards,



Sorry for the super delayed reply.
I don't think so, because I suppose that a DBA uses plan_cache_mode only
when facing an inefficient execution plan. And the parameter will be used
as a session-level GUC parameter, not a database-level one.


Regards,
Tatsuro Yamada







Re: simplifying foreign key/RI checks

2021-01-26 Thread Tatsuro Yamada

Hi Amit-san,


+   case TM_Invisible:
+   elog(ERROR, "attempted to lock invisible tuple");
+   break;
+
+   case TM_SelfModified:
+   case TM_BeingModified:
+   case TM_WouldBlock:
+   elog(ERROR, "unexpected table_tuple_lock status: %u", res);
+   break;

+   default:
+   elog(ERROR, "unrecognized table_tuple_lock status: %u", res);

All of these are meant as debugging elog()s for cases that won't
normally occur.  IIUC, the discussion at the linked thread excludes
those from consideration.


Thanks for your explanation.
Ah, I reread the thread, and I now realize that user-visible log messages
are the target to replace. I understand that those elog() calls are for
cases that won't normally occur. Sorry for the noise.

Regards,
Tatsuro Yamada





Re: simplifying foreign key/RI checks

2021-01-26 Thread Tatsuro Yamada

Hi Amit-san,

On 2021/01/25 18:19, Amit Langote wrote:

On Mon, Jan 25, 2021 at 9:24 AM Corey Huinker  wrote:

On Sun, Jan 24, 2021 at 6:51 AM Amit Langote  wrote:

Here's v5.


v5 patches apply to master.
Suggested If/then optimization is implemented.
Suggested case merging is implemented.
Passes make check and make check-world yet again.
Just to confirm, we don't free the RI_CompareHashEntry because it points to an 
entry in a hash table which is TopMemoryContext aka lifetime of the session, 
correct?


Right.


Anybody else want to look this patch over before I mark it Ready For Committer?


Would be nice to have others look it over.  Thanks.



Thanks for creating the patch!

I tried to review the patch. Here is my comment.

* According to this thread [1], it might be better to replace elog() with
  ereport() in the patch.

[1]: 
https://www.postgresql.org/message-id/flat/92d6f545-5102-65d8-3c87-489f71ea0a37%40enterprisedb.com

Thanks,
Tatsuro Yamada





Re: list of extended statistics on psql

2021-01-20 Thread Tatsuro Yamada

Hi Tomas and hackers,

On 2021/01/21 7:00, Tomas Vondra wrote:

I created the patches, and my test results on PG10, 11, 12, and 14 are fine.

  0001:
    - Fix query to use pg_statistic_ext only
    - Replace statuses "requested" and "built" with "defined"
    - Remove the size columns
    - Fix document
    - Add schema name as a filter condition on the query

  0002:
    - Fix all results of \dX
    - Add a new test case run by a non-superuser

Please find attached files. :-D


Thanks, I've pushed this. I had to tweak the regression tests a bit, for two 
reasons:

1) to change user in regression tests, don't use \connect, but SET ROLE and 
RESET ROLE

2) roles in regression tests should use names with regress_ prefix
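
For instance (a sketch with hypothetical role and table names), a
regression-test snippet following both rules might look like:

```sql
-- 2) roles created in regression tests use the regress_ prefix
CREATE ROLE regress_stats_user;
GRANT SELECT ON stts_t1 TO regress_stats_user;

-- 1) switch users with SET ROLE instead of \connect
SET ROLE regress_stats_user;
\dX
RESET ROLE;

DROP ROLE regress_stats_user;
```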



Thanks for reviewing many times and committing the feature!

I understood 1) and 2). I'll keep them in mind when developing the next patch.
Also, if possible, could you add Justin to the commit message as a reviewer?
I revised the document partly based on his comments.

Finally, as extended statistics become more widely used, this feature becomes
more useful. I hope it helps DBAs. :-D


Thanks,
Tatsuro Yamada






Re: list of extended statistics on psql

2021-01-19 Thread Tatsuro Yamada

Hi Tomas,

On 2021/01/20 11:35, Tatsuro Yamada wrote:

Apologies for all the extra work - I haven't realized this flaw when pushing 
for showing more stuff :-(


Don't worry about it. We didn't notice the problem even though multiple people
reviewed it on -hackers. Let's keep moving forward. :-D

I'll send a patch including a regression test on the next patch.



I created the patches, and my test results on PG10, 11, 12, and 14 are fine.

  0001:
    - Fix query to use pg_statistic_ext only
    - Replace statuses "requested" and "built" with "defined"
    - Remove the size columns
    - Fix document
    - Add schema name as a filter condition on the query

  0002:
    - Fix all results of \dX
    - Add a new test case run by a non-superuser

Please find attached files. :-D


Regards,
Tatsuro Yamada
From 1aac3df2af2f6c834ffab10ddd1be1dee5970eb3 Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Wed, 20 Jan 2021 15:33:04 +0900
Subject: [PATCH 2/2] psql \dX regression test

Add a test by non-superuser
---
 src/test/regress/expected/stats_ext.out | 116 
 src/test/regress/sql/stats_ext.sql  |  37 ++
 2 files changed, 153 insertions(+)

diff --git a/src/test/regress/expected/stats_ext.out 
b/src/test/regress/expected/stats_ext.out
index f094731e32..0ff4e51055 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -1727,6 +1727,122 @@ INSERT INTO tststats.priv_test_tbl
 CREATE STATISTICS tststats.priv_test_stats (mcv) ON a, b
   FROM tststats.priv_test_tbl;
 ANALYZE tststats.priv_test_tbl;
+-- Check printing info about extended statistics by \dX
+create table stts_t1 (a int, b int);
+create statistics stts_1 (ndistinct) on a, b from stts_t1;
+create statistics stts_2 (ndistinct, dependencies) on a, b from stts_t1;
+create statistics stts_3 (ndistinct, dependencies, mcv) on a, b from stts_t1;
+create table stts_t2 (a int, b int, c int);
+create statistics stts_4 on b, c from stts_t2;
+create table stts_t3 (col1 int, col2 int, col3 int);
+create statistics stts_hoge on col1, col2, col3 from stts_t3;
+create schema stts_s1;
+create schema stts_s2;
+create statistics stts_s1.stts_foo on col1, col2 from stts_t3;
+create statistics stts_s2.stts_yama (dependencies, mcv) on col1, col3 from 
stts_t3;
+insert into stts_t1 select i,i from generate_series(1,100) i;
+analyze stts_t1;
+\dX
+  List of extended statistics
+  Schema  |  Name  |  Definition  | 
Ndistinct | Dependencies |   MCV   
+--++--+---+--+-
+ public   | func_deps_stat | a, b, c FROM functional_dependencies |
   | defined  | 
+ public   | mcv_lists_arrays_stats | a, b, c FROM mcv_lists_arrays|
   |  | defined
+ public   | mcv_lists_bool_stats   | a, b, c FROM mcv_lists_bool  |
   |  | defined
+ public   | mcv_lists_stats| a, b, d FROM mcv_lists   |
   |  | defined
+ public   | stts_1 | a, b FROM stts_t1| 
defined   |  | 
+ public   | stts_2 | a, b FROM stts_t1| 
defined   | defined  | 
+ public   | stts_3 | a, b FROM stts_t1| 
defined   | defined  | defined
+ public   | stts_4 | b, c FROM stts_t2| 
defined   | defined  | defined
+ public   | stts_hoge  | col1, col2, col3 FROM stts_t3| 
defined   | defined  | defined
+ stts_s1  | stts_foo   | col1, col2 FROM stts_t3  | 
defined   | defined  | defined
+ stts_s2  | stts_yama  | col1, col3 FROM stts_t3  |
   | defined  | defined
+ tststats | priv_test_stats| a, b FROM tststats.priv_test_tbl |
   |  | defined
+(12 rows)
+
+\dX stts_?
+   List of extended statistics
+ Schema |  Name  |Definition | Ndistinct | Dependencies |   MCV   
+++---+---+--+-
+ public | stts_1 | a, b FROM stts_t1 | defined   |  | 
+ public | stts_2 | a, b FROM stts_t1 | defined   | defined  | 
+ public | stts_3 | a, b FROM stts_t1 | defined   | defined  | defined
+ public | stts_4 | b, c FROM stts_t2 | defined   | defined  | defined
+(4 rows)
+
+\dX *stts_hoge
+   List of extended statistics
+ Schema |   Name|  Definition   | Ndistinct | Dependencies 
|   MCV   
++---+---+---+--+-
+ public | stts_hoge | col1, col2, col3 FROM stts_t3 | defined   | defined  
| defined
+(1 row)
+
+\dX+
+  List of

Re: list of extended statistics on psql

2021-01-19 Thread Tatsuro Yamada

Hi Tomas,

On 2021/01/19 11:52, Tomas Vondra wrote:



I'm going to create the WIP patch to use the above query.
Any comments welcome. :-D


Yes, I think using this simpler query makes sense. If we decide we need 
something more elaborate, we can improve that by in future PostgreSQL versions 
(after adding view/function to core), but I'd leave that as a work for the 
future.



I see, thanks!



Apologies for all the extra work - I haven't realized this flaw when pushing 
for showing more stuff :-(



Don't worry about it. We didn't notice the problem even though multiple people
reviewed it on -hackers. Let's keep moving forward. :-D

I'll send a patch including a regression test on the next patch.

Regards,
Tatsuro Yamada





Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-01-19 Thread Tatsuro Yamada

Hi Julien,



Rebase only, thanks to the cfbot!  V16 attached.


I tested the v16 patch on a0efda88a by using "make installcheck-parallel", and
my result is as follows. The attached file is regression.diffs.



Sorry, my environment was not suitable for the test when I sent my previous 
email.
I fixed my environment and tested it again, and it was a success. See below:

===
 All 202 tests passed.
===

Regards,
Tatsuro Yamada






Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-01-19 Thread Tatsuro Yamada

Hi Julien,


Rebase only, thanks to the cfbot!  V16 attached.


I tested the v16 patch on a0efda88a by using "make installcheck-parallel", and
my result is as follows. The attached file is regression.diffs.


 1 of 202 tests failed.


The differences that caused some tests to fail can be viewed in the
file "/home/postgres/PG140/src/test/regress/regression.diffs".  A copy of the 
test summary that you see
above is saved in the file 
"/home/postgres/PG140/src/test/regress/regression.out".


src/test/regress/regression.diffs
-
diff -U3 /home/postgres/PG140/src/test/regress/expected/rules.out 
/home/postgres/PG140/src/test/regress/results/rules.out
--- /home/postgres/PG140/src/test/regress/expected/rules.out2021-01-20 
08:41:16.383175559 +0900
+++ /home/postgres/PG140/src/test/regress/results/rules.out 2021-01-20 
08:43:46.589171774 +0900
@@ -1760,10 +1760,9 @@
 s.state,
 s.backend_xid,
 s.backend_xmin,
-s.queryid,
 s.query,
 s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
...

Thanks,
Tatsuro Yamada
diff -U3 /home/postgres/PG140/src/test/regress/expected/rules.out 
/home/postgres/PG140/src/test/regress/results/rules.out
--- /home/postgres/PG140/src/test/regress/expected/rules.out2021-01-20 
08:41:16.383175559 +0900
+++ /home/postgres/PG140/src/test/regress/results/rules.out 2021-01-20 
08:52:10.891159065 +0900
@@ -1760,10 +1760,9 @@
 s.state,
 s.backend_xid,
 s.backend_xmin,
-s.queryid,
 s.query,
 s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
  LEFT JOIN pg_database d ON ((s.datid = d.oid)))
  LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_all_indexes| SELECT c.oid AS relid,
@@ -1875,7 +1874,7 @@
 s.gss_auth AS gss_authenticated,
 s.gss_princ AS principal,
 s.gss_enc AS encrypted
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_progress_analyze| SELECT s.pid,
 s.datid,
@@ -2032,7 +2031,7 @@
 w.sync_priority,
 w.sync_state,
 w.reply_time
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM ((pg_s

Re: list of extended statistics on psql

2021-01-18 Thread Tatsuro Yamada

Hi,


The above query is simple enough, so it would be better to use the following query:

# This query works on PG10 or later
SELECT
    es.stxnamespace::pg_catalog.regnamespace::text AS "Schema",
    es.stxname AS "Name",
    pg_catalog.format('%s FROM %s',
        (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname), ', ')
         FROM pg_catalog.unnest(es.stxkeys) s(attnum)
         JOIN pg_catalog.pg_attribute a
           ON (es.stxrelid = a.attrelid
               AND a.attnum = s.attnum
               AND NOT a.attisdropped)),
        es.stxrelid::regclass) AS "Definition",
    CASE WHEN 'd' = any(es.stxkind) THEN 'defined'
    END AS "Ndistinct",
    CASE WHEN 'f' = any(es.stxkind) THEN 'defined'
    END AS "Dependencies",
    CASE WHEN 'm' = any(es.stxkind) THEN 'defined'
    END AS "MCV"
FROM pg_catalog.pg_statistic_ext es
ORDER BY 1, 2;

 Schema |    Name    |    Definition    | Ndistinct | Dependencies |   MCV
--------+------------+------------------+-----------+--------------+---------
 public | hoge1_ext  | a, b FROM hoge1  | defined   | defined      | defined
 public | hoge_t_ext | a, b FROM hoge_t | defined   | defined      | defined
(2 rows)


I'm going to create the WIP patch to use the above query.
Any comments welcome. :-D



The attached patch is a WIP patch.

The changes are:
  - Use pg_statistic_ext only
  - Remove these statuses: "requested" and "built"
  - Add new status: "defined"
  - Remove the size columns
  - Fix document

I'll create and send the regression test in the next patch if there is
no objection. Is that okay?

Regards,
Tatsuro Yamada

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..36a79d9e3f 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,26 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the
+pattern are listed.
+
+
+
+The column of the kind of extended stats (e.g. Ndistinct) shows its 
status.
+NULL means that it doesn't exists. "defined" means that it is declared.
+You can use pg_stats_ext if you'd like to know whether 
+ANALYZE was run and statistics are available 
to the
+planner.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 303e7c3ad8..c98e3d31d0 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -928,6 +928,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
else
success = listExtensions(pattern);
break;
+   case 'X':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern);
+   break;
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index caf97563f4..899fe5d85c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4392,6 +4392,89 @@ listEventTriggers(const char *pattern, bool verbose)
return true;
 }
 
+/*
+ * \dX
+ *
+ * Describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 10)
+   {
+   charsverbuf[32];
+
+   pg_log_error("The server (version %s) does not support extended 
statistics.",
+formatPGVersionNumber(pset.sversion, 
false,
+   
   sverbuf, sizeof(sverbuf)));
+   return true;
+   }
+
+   initPQExpBuffer();
+   printfPQExpBuffer(,
+ "SELECT \n"
+ 
"es.stxnamespace::pg_catalog.regnamespace::text AS \"%s\", \n"
+ "es.stxname AS \"%s\", \n"
+ "pg_catalog.format('%%s FROM %%s', \n"
+ "  (SELECT 
pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ') \n"
+ "   FROM 
pg_catalog.unnest(es.stxkeys) s(attnum) \n"
+  

Re: list of extended statistics on psql

2021-01-18 Thread Tatsuro Yamada

Hi Tomas,


As for how to deal with this, I can think of about three ways:

1) simplify the command to only print information from pg_statistic_ext (so on 
information about which stats are built or sizes)

2) extend pg_stats_ext with necessary information (e.g. sizes)

3) create a new system view, with necessary information (so that pg_stats_ext 
does not need to be modified)

4) add functions returning the necessary information, possibly only for 
statistics the user can access (similarly to what pg_stats_ext does)

Options 2-4 have the obvious disadvantage that this won't work on older 
releases (we can't add views or functions there). So I'm leaning towards #1 
even if that means we have to remove some of the details. We can consider 
adding that for new releases, though.



Thanks for the useful advice. I'll go with option 1).
The following query was created using pg_stats_ext instead of
pg_statistic_ext and pg_statistic_ext_data. However, I was confused about
how to write the part of the query that calculates the MCV size, because
there are four columns related to MCV: most_common_vals,
most_common_val_nulls, most_common_freqs, and most_common_base_freqs.
Currently, I don't know how to calculate the size of the MCV using these
four columns. Thoughts? :-)


Well, my suggestion was to use pg_statistic_ext, because that lists all 
statistics, while pg_stats_ext is filtering statistics depending on access 
privileges. I think that's more appropriate for \dX, the contents should not 
change depending on the user.

Also, let me clarify - with option (1) we'd not show the sizes at all. The size 
of the formatted statistics may be very different from the on-disk 
representation, so I see no point in showing it in \dX.

We might show other stats (e.g. number of MCV items, or the fraction of data 
represented by the MCV list), but the user can inspect pg_stats_ext if needed.

What we might do is to show those stats when a superuser is running this 
command, but I'm not sure that's a good idea (or how difficult would it be to 
implement).



Thanks for clarifying.
I see that your suggestion was to use pg_statistic_ext, not pg_stats_ext.
And we don't need the size of stats.

If that's the case, we also can't get the status of the stats on PG12 or later,
because we can't use pg_statistic_ext_data, as you know. Therefore, it would be
better to replace the query with the old query that I sent five months ago, like
this:

# the old query
SELECT
stxnamespace::pg_catalog.regnamespace AS "Schema",
stxrelid::pg_catalog.regclass AS "Table",
stxname AS "Name",
(SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
 FROM pg_catalog.unnest(stxkeys) s(attnum)
 JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
 a.attnum = s.attnum AND NOT attisdropped)) AS "Columns",
'd' = any(stxkind) AS "Ndistinct",
'f' = any(stxkind) AS "Dependencies",
'm' = any(stxkind) AS "MCV"
FROM pg_catalog.pg_statistic_ext stat
ORDER BY 1,2;

 Schema | Table  |Name| Columns | Ndistinct | Dependencies | MCV
+++-+---+--+-
 public | hoge1  | hoge1_ext  | a, b| t | t| t
 public | hoge_t | hoge_t_ext | a, b| t | t| t
(2 rows)


The above query is simple enough, so it would be better to use the following query:

# This query works on PG10 or later
SELECT
    es.stxnamespace::pg_catalog.regnamespace::text AS "Schema",
    es.stxname AS "Name",
    pg_catalog.format('%s FROM %s',
        (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname), ', ')
         FROM pg_catalog.unnest(es.stxkeys) s(attnum)
         JOIN pg_catalog.pg_attribute a
           ON (es.stxrelid = a.attrelid
               AND a.attnum = s.attnum
               AND NOT a.attisdropped)),
        es.stxrelid::regclass) AS "Definition",
    CASE WHEN 'd' = any(es.stxkind) THEN 'defined'
    END AS "Ndistinct",
    CASE WHEN 'f' = any(es.stxkind) THEN 'defined'
    END AS "Dependencies",
    CASE WHEN 'm' = any(es.stxkind) THEN 'defined'
    END AS "MCV"
FROM pg_catalog.pg_statistic_ext es
ORDER BY 1, 2;

 Schema |    Name    |    Definition    | Ndistinct | Dependencies |   MCV
--------+------------+------------------+-----------+--------------+---------
 public | hoge1_ext  | a, b FROM hoge1  | defined   | defined      | defined
 public | hoge_t_ext | a, b FROM hoge_t | defined   | defined      | defined
(2 rows)


I'm going to create the WIP patch to use the above query.
Any comments welcome. :-D

Thanks,
Tatsuro Yamada





Re: list of extended statistics on psql

2021-01-17 Thread Tatsuro Yamada

Hi Justin,

On 2021/01/18 1:52, Justin Pryzby wrote:

On Sun, Jan 17, 2021 at 03:31:57PM +0100, Tomas Vondra wrote:

I've reverted the commit - once we find the right way to handle this, I'll
get it committed again.


Please consider these doc changes for the next iteration.

commit 1a69f648ce6c63ebb37b6d8ec7c6539b3cb70787
Author: Justin Pryzby 
Date:   Sat Jan 16 17:47:35 2021 -0600

 doc review: psql \dX 891a1d0bca262ca78564e0fea1eaa5ae544ea5ee


Thanks for your comments!
It helps a lot since I'm not a native speaker.
I'll fix the document based on your suggestions in the next patch.

 

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index aaf55df921..a678a69dfb 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1928,15 +1928,15 @@ testdb=
  is specified, only those extended statistics whose names match the
  pattern are listed.
  If + is appended to the command name, each extended
-statistics is listed with its size.
+statistic is listed with its size.


Agreed.

   

  
-The column of the kind of extended stats (e.g. Ndistinct) shows some 
statuses.
+The column of the kind of extended stats (e.g. Ndistinct) shows its 
status.
  "requested" means that it needs to collect statistics by ANALYZE.
  "built" means ANALYZE was


Agreed.



-finished, and the planner can use it. NULL means that it doesn't 
exists.
+run, and statistics are available to the planner. NULL means that it 
doesn't exist.



Agreed.


Thanks,
Tatsuro Yamada





Re: list of extended statistics on psql

2021-01-17 Thread Tatsuro Yamada

Hi Tomas and Shinoda-san,

On 2021/01/17 23:31, Tomas Vondra wrote:



On 1/17/21 3:01 AM, Tomas Vondra wrote:

On 1/17/21 2:41 AM, Shinoda, Noriyoshi (PN Japan FSIP) wrote:

Hi, hackers.

I tested this committed feature.
It doesn't seem to be available to non-superusers due to the inability to 
access pg_statistic_ext_data.
Is this the expected behavior?



Ugh. I overlooked testing the case where the user doesn't have the superuser
privilege. A user without the privilege was able to access pg_statistic_ext,
so I assumed it could also access pg_statistic_ext_data. Oops.



Hmmm, that's a good point. Bummer we haven't noticed that earlier.

I wonder what the right fix should be - presumably we could do something like 
pg_stats_ext (we can't use that view directly, because it formats the data, so 
the sizes are different).

But should it list just the stats the user has access to, or should it list 
everything and leave the inaccessible fields NULL?



I've reverted the commit - once we find the right way to handle this, I'll get 
it committed again.

As for how to deal with this, I can think of about three ways:

1) simplify the command to only print information from pg_statistic_ext (so on 
information about which stats are built or sizes)

2) extend pg_stats_ext with necessary information (e.g. sizes)

3) create a new system view, with necessary information (so that pg_stats_ext 
does not need to be modified)

4) add functions returning the necessary information, possibly only for 
statistics the user can access (similarly to what pg_stats_ext does)

Options 2-4 have the obvious disadvantage that this won't work on older 
releases (we can't add views or functions there). So I'm leaning towards #1 
even if that means we have to remove some of the details. We can consider 
adding that for new releases, though.



Thanks for the useful advice. I'll go with option 1).
The following query was created using pg_stats_ext instead of
pg_statistic_ext and pg_statistic_ext_data. However, I was confused about
how to write the part of the query that calculates the MCV size, because
there are four columns related to MCV: most_common_vals,
most_common_val_nulls, most_common_freqs, and most_common_base_freqs.
Currently, I don't know how to calculate the size of the MCV using these
four columns. Thoughts? :-)

===
\connect postgres hoge
create table hoge_t(a int, b int);
insert into hoge_t select i,i from generate_series(1,100) i;
create statistics hoge_t_ext on a, b from hoge_t;


SELECT
    es.statistics_schemaname AS "Schema",
    es.statistics_name AS "Name",
    pg_catalog.format('%s FROM %s',
        (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(s.attname), ', ')
         FROM pg_catalog.unnest(es.attnames) s(attname)),
        es.tablename) AS "Definition",
    CASE WHEN es.n_distinct IS NOT NULL THEN 'built'
         WHEN 'd' = any(es.kinds) THEN 'requested'
    END AS "Ndistinct",
    CASE WHEN es.dependencies IS NOT NULL THEN 'built'
         WHEN 'f' = any(es.kinds) THEN 'requested'
    END AS "Dependencies",
    CASE WHEN es.most_common_vals IS NOT NULL THEN 'built'
         WHEN 'm' = any(es.kinds) THEN 'requested'
    END AS "MCV",
    CASE WHEN es.n_distinct IS NOT NULL THEN
         pg_catalog.pg_size_pretty(pg_catalog.length(es.n_distinct)::bigint)
         WHEN 'd' = any(es.kinds) THEN '0 bytes'
    END AS "Ndistinct_size",
    CASE WHEN es.dependencies IS NOT NULL THEN
         pg_catalog.pg_size_pretty(pg_catalog.length(es.dependencies)::bigint)
         WHEN 'f' = any(es.kinds) THEN '0 bytes'
    END AS "Dependencies_size"
FROM pg_catalog.pg_stats_ext es
ORDER BY 1, 2;

-[ RECORD 1 ]-+-
Schema| public
Name  | hoge_t_ext
Definition| a, b FROM hoge_t
Ndistinct | requested
Dependencies  | requested
MCV   | requested
Ndistinct_size| 0 bytes
Dependencies_size | 0 bytes

analyze hoge_t;

-[ RECORD 1 ]-+-
Schema| public
Name  | hoge_t_ext
Definition| a, b FROM hoge_t
Ndistinct | built
Dependencies  | built
MCV   | built
Ndistinct_size| 13 bytes
Dependencies_size | 40 bytes
===

Thanks,
Tatsuro Yamada







Re: list of extended statistics on psql

2021-01-17 Thread Tatsuro Yamada

Hi Tomas,

On 2021/01/17 8:32, Tomas Vondra wrote:

I've pushed this, keeping the "requested". If we decide that some other term is 
a better choice, we can tweak that later of course.

Thanks Tatsuro-san for the patience!


Thanks for taking the time to review the patches.
I believe this feature is useful for DBAs when they use extended stats,
for example during the execution-plan tuning phase.

I know the patch was reverted, so I'll keep working on the patch
in the second iteration. :-D

Regards,
Tatsuro Yamada





Re: list of extended statistics on psql

2021-01-17 Thread Tatsuro Yamada

Hi Julien,

On 2021/01/15 17:47, Julien Rouhaud wrote:

Hello Yamada-san,

I reviewed the patch and don't have specific complaints, it all looks good!

I'm however thinking about the "requested" status.  I'm wondering if it could
lead to people think that an ANALYZE is scheduled and will happen soon.
Maybe "defined" or "declared" might be less misleading, or even "waiting for
analyze"?



Thanks for reviewing the patch.
Yeah, "waiting for analyze" would be preferable, but it is a little long to use
in the columns. Unfortunately, I couldn't come up with a more suitable term, so
I'm keeping it as is.

Regards,
Tatsuro Yamada





Re: list of extended statistics on psql

2021-01-12 Thread Tatsuro Yamada

Hi Tomas,

On 2021/01/12 20:08, Tomas Vondra wrote:


On 1/12/21 2:57 AM, Tatsuro Yamada wrote:

Hi Tomas,

On 2021/01/09 9:01, Tomas Vondra wrote:

...>

While working on that, I realized that 'defined' might be a bit
ambiguous, I initially thought it means 'NOT NULL' (which it does not).
I propose to change it to 'requested' instead. Tatsuro, do you agree, or
do you think 'defined' is better?


Regarding the status of extended stats, I think the following:

 - "defined": it only shows that the extended stats are defined. We can't
   know whether they still need to be analyzed or not. I agree this name
   was ambiguous, so we should replace it with a more suitable name.
 - "requested": it shows that the extended stats still need something. Of
   course, we know they need ANALYZE, since we wrote the patch. However, I
   feel there is still a little ambiguity for DBAs. To solve this, it would
   be better to write an explanation of the status in the document. For
   example:

==
The column for each kind of extended stats (e.g. Ndistinct) shows its status.
"requested" means that it still needs ANALYZE to gather the data. "built"
means ANALYZE has finished and the planner can use the statistics. NULL means
that it doesn't exist.
==

What do you think? :-D



Yes, that seems reasonable to me. Will you provide an updated patch?



Sounds good. I'll send the updated patch today.


Thanks,
Tatsuro Yamada







Re: list of extended statistics on psql

2021-01-11 Thread Tatsuro Yamada

Hi Tomas,

On 2021/01/09 9:01, Tomas Vondra wrote:

On 1/8/21 1:14 AM, Tomas Vondra wrote:

On 1/8/21 12:52 AM, Tatsuro Yamada wrote:

On 2021/01/08 0:56, Tomas Vondra wrote:

On 1/7/21 3:47 PM, Alvaro Herrera wrote:

On 2021-Jan-07, Tomas Vondra wrote:

On 1/7/21 1:46 AM, Tatsuro Yamada wrote:



I overlooked the check for MCV in the logic building query
because I created the patch as a new feature on PG14.
I'm not sure whether we should do back patch or not. However, I'll
add the check on the next patch because it is useful if you decide to
do the back patch on PG10, 11, 12, and 13.


BTW perhaps a quick look at the other \d commands would show if
there are
precedents. I didn't have time for that.


Yes, we do promise that new psql works with older servers.



Yeah, makes sense. That means we need add the check for 12 / MCV.



Ah, I got it.
I fixed the patch to work with older servers by adding version checks, and I
tested the \dX command on older servers (PG10 - 13). The results look fine.

0001:
   Added check code to handle pre-PG12, which has neither MCV nor
    pg_statistic_ext_data.
0002:
   This patch is the same as the previous patch (not changed).

Please find the attached files.



OK, thanks. I'll take a look and probably push tomorrow. FWIW I plan to
squash the patches into a single commit.



Attached is a patch I plan to commit - 0001 is the last submitted
version with a couple minor tweaks, mostly in docs/comments, and small
rework of branching to be more like the other functions in describe.c.


Thanks for revising the patch.
I reviewed 0001, and the branching and comments look good to me.
However, I added a table alias to the column name passed to
processSQLNamePattern() in the patch: s/"stxname"/"es.stxname"/



While working on that, I realized that 'defined' might be a bit
ambiguous, I initially thought it means 'NOT NULL' (which it does not).
I propose to change it to 'requested' instead. Tatsuro, do you agree, or
do you think 'defined' is better?


Regarding the status of extended stats, I think the following:

 - "defined": it only shows that the extended stats are defined. We can't
   know whether they still need to be analyzed or not. I agree this name
   was ambiguous, so we should replace it with a more suitable name.
 - "requested": it shows that the extended stats still need something. Of
   course, we know they need ANALYZE, since we wrote the patch. However, I
   feel there is still a little ambiguity for DBAs. To solve this, it would
   be better to write an explanation of the status in the document. For
   example:

==
The column for each kind of extended stats (e.g. Ndistinct) shows its status.
"requested" means that it still needs ANALYZE to gather the data. "built"
means ANALYZE has finished and the planner can use the statistics. NULL means
that it doesn't exist.
==

What do you think? :-D
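
As a sketch of how the "requested"/"built" distinction can be observed at the
SQL level on PG12 or later (hypothetical table and statistics names; this
assumes the built data is stored in pg_statistic_ext_data):

```sql
CREATE TABLE t (a int, b int);
CREATE STATISTICS t_stats (ndistinct) ON a, b FROM t;

-- Before ANALYZE, the ndistinct data is NULL: the statistics are "requested".
SELECT s.stxname, d.stxdndistinct IS NOT NULL AS built
FROM pg_statistic_ext s
JOIN pg_statistic_ext_data d ON d.stxoid = s.oid
WHERE s.stxname = 't_stats';

-- After ANALYZE, the same query reports built = true.
ANALYZE t;
```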


Thanks,
Tatsuro Yamada
>From 3d2f4ef2ecba9fd7987df665237add6fc4ec03c1 Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Thu, 7 Jan 2021 14:28:20 +0900
Subject: [PATCH 1/2] psql \dX: list extended statistics objects

The new command lists extended statistics objects, possibly with their
sizes. All past releases with extended statistics are supported.

Author: Tatsuro Yamada
Reviewed-by: Julien Rouhaud, Alvaro Herrera, Tomas Vondra
Discussion: 
https://postgr.es/m/c027a541-5856-75a5-0868-341301e1624b%40nttcom.co.jp_1
---
 doc/src/sgml/ref/psql-ref.sgml  |  14 +++
 src/bin/psql/command.c  |   3 +
 src/bin/psql/describe.c | 150 
 src/bin/psql/describe.h |   3 +
 src/bin/psql/help.c |   1 +
 src/bin/psql/tab-complete.c |   4 +-
 src/test/regress/expected/stats_ext.out |  94 +++
 src/test/regress/sql/stats_ext.sql  |  31 +
 8 files changed, 299 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..d01acc92b8 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the
+pattern are listed.
+If + is appended to the command name, each extended
+statistics is listed with its size.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 303e7c3ad8..c5ebc1c3f4 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -928,6 +928,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			else
 				success =

Re: list of extended statistics on psql

2021-01-07 Thread Tatsuro Yamada

Hi,

On 2021/01/08 0:56, Tomas Vondra wrote:

On 1/7/21 3:47 PM, Alvaro Herrera wrote:

On 2021-Jan-07, Tomas Vondra wrote:


On 1/7/21 1:46 AM, Tatsuro Yamada wrote:



I overlooked the MCV check in the query-building logic
because I created the patch as a new feature for PG14.
I'm not sure whether we should back-patch it or not. However, I'll
add the check in the next patch, because it will be useful if you decide to
back-patch to PG10, 11, 12, and 13.


BTW perhaps a quick look at the other \d commands would show if there are
precedents. I didn't have time for that.


Yes, we do promise that new psql works with older servers.



Yeah, makes sense. That means we need to add the check for 12 / MCV.



Ah, I got it.
I fixed the patch to work with older servers by adding the version checks, and
I tested the \dX command on older servers (PG10 - 13).
The results look fine.

0001:
 Added the check code to handle pre-PG12 servers, which have neither MCV
  stats nor pg_statistic_ext_data.
0002:
 This patch is the same as the previous patch (not changed).

Please find the attached files.
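For reference, the catalog split that makes the version check necessary:
pg_statistic_ext_data (and MCV lists) only exist on PG12 and later, so a query
along these lines cannot be sent to PG10/11. This is an illustrative sketch,
not the exact query the patch builds:

```sql
-- Works on PG12+ only: statistics data lives in pg_statistic_ext_data.
SELECT s.stxname,
       d.stxdndistinct IS NOT NULL AS ndistinct_built
  FROM pg_statistic_ext s
  JOIN pg_statistic_ext_data d ON d.stxoid = s.oid;
-- On PG10/11 this fails because pg_statistic_ext_data does not exist,
-- so there \dX has to build its query from pg_statistic_ext alone.
```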



I wonder whether the column names added by \dX+ are fine; for example,
"Ndistinct_size" and "Dependencies_size". They are long names,
but are they acceptable?



Seems acceptable - I don't have a better idea. 


I see, thanks!


Thanks,
Tatsuro Yamada
From c8be51a52a381a6e9c7be62022f5fc48b5915bd0 Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Thu, 7 Jan 2021 14:28:20 +0900
Subject: [PATCH] Add \dX command on psql

This patch includes the version check in the logic building query.
---
 doc/src/sgml/ref/psql-ref.sgml |  14 
 src/bin/psql/command.c |   3 +
 src/bin/psql/describe.c| 141 +
 src/bin/psql/describe.h|   3 +
 src/bin/psql/help.c|   1 +
 src/bin/psql/tab-complete.c|   4 +-
 6 files changed, 165 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..fd860776af 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the pattern
+are listed.
+If + is appended to the command name, each extended statistics
+is listed with its size.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 303e7c3ad8..c5ebc1c3f4 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -928,6 +928,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			else
 				success = listExtensions(pattern);
 			break;
+		case 'X':			/* Extended Statistics */
+			success = listExtendedStats(pattern, show_verbose);
+			break;
 		case 'y':			/* Event Triggers */
 			success = listEventTriggers(pattern, show_verbose);
 			break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 52c6de51b6..0ccd9ed286 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4401,6 +4401,147 @@ listEventTriggers(const char *pattern, bool verbose)
return true;
 }
 
+/*
+ * \dX
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern, bool verbose)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 10)
+   {
+       char        sverbuf[32];
+
+       pg_log_error("The server (version %s) does not support extended statistics.",
+                    formatPGVersionNumber(pset.sversion, false,
+                                          sverbuf, sizeof(sverbuf)));
+       return true;
+   }
+
+   initPQExpBuffer(&buf);
+   printfPQExpBuffer(&buf,
+                     "SELECT \n"
+                     "es.stxnamespace::pg_catalog.regnamespace::text AS \"%s\", \n"
+                     "es.stxname AS \"%s\", \n"
+                     "pg_catalog.format('%%s FROM %%s', \n"
+                     "  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ') \n"
+                     "   FROM pg_catalog.unne

Re: list of extended statistics on psql

2021-01-06 Thread Tatsuro Yamada

Hi Tomas,


Thanks, and Happy new year to you too.

I almost pushed this, but I have one more question. listExtendedStats first checks the
server version, and errors out for pre-10 servers. Shouldn't the query-building logic
check the server version too, to decide whether to check the MCV stats? Otherwise it
won't work on 10 and 11, which do not support the "mcv" stats.

I don't recall exactly what our policy is regarding new psql with older server versions,
but it seems strange to check for 10.0 and then fail anyway for "supported"
versions.


Thanks for your comments.

I overlooked the MCV check in the query-building logic
because I created the patch as a new feature for PG14.
I'm not sure whether we should back-patch it or not. However, I'll
add the check in the next patch, because it will be useful if you decide to
back-patch to PG10, 11, 12, and 13.

I wonder whether the column names added by \dX+ are fine; for example,
"Ndistinct_size" and "Dependencies_size". They are long names,
but are they acceptable?

Regards,
Tatsuro Yamada






Re: list of extended statistics on psql

2021-01-04 Thread Tatsuro Yamada

Hi,


I rebased the patch set on the master (7e5e1bba03), and the regression
test is good. Therefore, I changed the status of the patch: "needs review". 


Happy New Year!

I rebased my patches on HEAD.
Please find attached files. :-D

Thanks,
Tatsuro Yamada

From 9f36a9df0c2803f5554b951e37ba5969c2744e3b Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Tue, 5 Jan 2021 12:34:16 +0900
Subject: [PATCH 2/2] Add regression test for \dX on psql

---
 src/test/regress/expected/stats_ext.out | 94 +
 src/test/regress/sql/stats_ext.sql  | 31 +++
 2 files changed, 125 insertions(+)

diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 7bfeaf85f0..8c8a0afcf6 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -1725,6 +1725,100 @@ INSERT INTO tststats.priv_test_tbl
 CREATE STATISTICS tststats.priv_test_stats (mcv) ON a, b
   FROM tststats.priv_test_tbl;
 ANALYZE tststats.priv_test_tbl;
+-- Check printing info about extended statistics by \dX
+create table stts_t1 (a int, b int);
+create statistics stts_1 (ndistinct) on a, b from stts_t1;
+create statistics stts_2 (ndistinct, dependencies) on a, b from stts_t1;
+create statistics stts_3 (ndistinct, dependencies, mcv) on a, b from stts_t1;
+create table stts_t2 (a int, b int, c int);
+create statistics stts_4 on b, c from stts_t2;
+create table stts_t3 (col1 int, col2 int, col3 int);
+create statistics stts_hoge on col1, col2, col3 from stts_t3;
+create schema stts_s1;
+create schema stts_s2;
+create statistics stts_s1.stts_foo on col1, col2 from stts_t3;
+create statistics stts_s2.stts_yama (dependencies, mcv) on col1, col3 from 
stts_t3;
+insert into stts_t1 select i,i from generate_series(1,100) i;
+analyze stts_t1;
+\dX
+  List of extended statistics
+  Schema  |          Name          |              Definition              | Ndistinct | Dependencies |   MCV
+----------+------------------------+--------------------------------------+-----------+--------------+---------
+ public   | func_deps_stat         | a, b, c FROM functional_dependencies |           | built        |
+ public   | mcv_lists_arrays_stats | a, b, c FROM mcv_lists_arrays        |           |              | built
+ public   | mcv_lists_bool_stats   | a, b, c FROM mcv_lists_bool          |           |              | built
+ public   | mcv_lists_stats        | a, b, d FROM mcv_lists               |           |              | built
+ public   | stts_1                 | a, b FROM stts_t1                    | built     |              |
+ public   | stts_2                 | a, b FROM stts_t1                    | built     | built        |
+ public   | stts_3                 | a, b FROM stts_t1                    | built     | built        | built
+ public   | stts_4                 | b, c FROM stts_t2                    | defined   | defined      | defined
+ public   | stts_hoge              | col1, col2, col3 FROM stts_t3        | defined   | defined      | defined
+ stts_s1  | stts_foo               | col1, col2 FROM stts_t3              | defined   | defined      | defined
+ stts_s2  | stts_yama              | col1, col3 FROM stts_t3              |           | defined      | defined
+ tststats | priv_test_stats        | a, b FROM tststats.priv_test_tbl     |           |              | built
+(12 rows)
+
+\dX stts_?
+   List of extended statistics
+ Schema |  Name  |Definition | Ndistinct | Dependencies |   MCV   
+++---+---+--+-
+ public | stts_1 | a, b FROM stts_t1 | built |  | 
+ public | stts_2 | a, b FROM stts_t1 | built | built| 
+ public | stts_3 | a, b FROM stts_t1 | built | built| built
+ public | stts_4 | b, c FROM stts_t2 | defined   | defined  | defined
+(4 rows)
+
+\dX *stts_hoge
+   List of extended statistics
+ Schema |   Name    |          Definition           | Ndistinct | Dependencies |   MCV
+--------+-----------+-------------------------------+-----------+--------------+---------
+ public | stts_hoge | col1, col2, col3 FROM stts_t3 | defined   | defined      | defined
+(1 row)
+
+\dX+
+                                                   List of extended statistics
+  Schema  |          Name          |              Definition              | Ndistinct | Dependencies |   MCV   | Ndistinct_size | Dependencies_size |  MCV_size
+----------+------------------------+--------------------------------------+-----------+--------------+---------+----------------+-------------------+------------
+ public   | func_deps_stat         | a, b, c FROM functional_dependencies |           | built        |         |                | 106 bytes         |
+ public   | mcv_lists_arrays_stats | a, b, c FROM mcv_li

Re: Is it useful to record whether plans are generic or custom?

2020-11-29 Thread Tatsuro Yamada

Hi Torikoshi-san,



In this patch, exposing the new columns is mandatory, but I think
it's better to make it optional by adding a GUC, something
like 'pgss.track_general_custom_plans'.

I also feel it makes the number of columns too many.
Just adding the total time may be sufficient.



I think this feature is useful for DBAs, so I hope it gets
committed to PG14. IMHO, having many columns is okay, because DBAs
can select just the specific columns they need in their queries.
Therefore, it would be better to go with the current design.

I ran the regression tests with your patch applied on 7e5e1bba03, and
unfortunately they failed. See below:

===
 122 of 201 tests failed, 1 of these failures ignored.
===
...
2020-11-30 09:45:18.160 JST [12977] LOG:  database system was not
properly shut down; automatic recovery in progress


Regards,
Tatsuro Yamada






Re: list of extended statistics on psql

2020-11-29 Thread Tatsuro Yamada

Hi Tomas and hackers,


I don't prefer long names, but I'll replace the names to make them clearer.
For example, s/N_size/Ndistinct_size/.

Please find attached patches:
   - 0001: Replace column names
   - 0002: Recreate regression test based on 0001



I rebased the patch set on the master (7e5e1bba03), and the regression
test is good. Therefore, I changed the status of the patch: "needs review".

I know that you proposed the new extended statistics [1], and it probably
conflicts with this patch. I hope my patch gets committed before yours,
to avoid the time spent recreating it. :-)


[1] https://www.postgresql.org/message-id/flat/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com

Thanks,
Tatsuro Yamada



From 91c09db61c2891cf83b3151f51348dfd02e09744 Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Mon, 30 Nov 2020 11:09:00 +0900
Subject: [PATCH 1/2] Add \dX command on psql (rebased on 7e5e1bba03)

---
 doc/src/sgml/ref/psql-ref.sgml |  14 ++
 src/bin/psql/command.c |   3 ++
 src/bin/psql/describe.c| 100 +
 src/bin/psql/describe.h|   3 ++
 src/bin/psql/help.c|   1 +
 src/bin/psql/tab-complete.c|   4 +-
 6 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..fd860776af 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the pattern
+are listed.
+If + is appended to the command name, each extended statistics
+is listed with its size.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 38b52d..46a6d0df76 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -928,6 +928,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			else
 				success = listExtensions(pattern);
 			break;
+		case 'X':			/* Extended Statistics */
+			success = listExtendedStats(pattern, show_verbose);
+			break;
 		case 'y':			/* Event Triggers */
 			success = listEventTriggers(pattern, show_verbose);
 			break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 14150d05a9..7dac038f1b 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4401,6 +4401,106 @@ listEventTriggers(const char *pattern, bool verbose)
return true;
 }
 
+/*
+ * \dX
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern, bool verbose)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 10)
+   {
+       char        sverbuf[32];
+
+       pg_log_error("The server (version %s) does not support extended statistics.",
+                    formatPGVersionNumber(pset.sversion, false,
+                                          sverbuf, sizeof(sverbuf)));
+       return true;
+   }
+
+   initPQExpBuffer(&buf);
+   printfPQExpBuffer(&buf,
+                     "SELECT \n"
+                     "es.stxnamespace::pg_catalog.regnamespace::text AS \"%s\", \n"
+                     "es.stxname AS \"%s\", \n"
+                     "pg_catalog.format('%%s FROM %%s', \n"
+                     "  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ') \n"
+                     "   FROM pg_catalog.unnest(es.stxkeys) s(attnum) \n"
+                     "   JOIN pg_catalog.pg_attribute a \n"
+                     "   ON (es.stxrelid = a.attrelid \n"
+                     "   AND a.attnum = s.attnum \n"
+                     "   AND NOT a.attisdropped)), \n"
+                     "es.stxrelid::regclass) AS \"%s\", \n"
+                     "CASE WHEN esd.stxdndistinct IS NOT NULL THEN 'built' \n"
+                     " WHEN 'd' = any(es.stxkind) THEN 

Huge memory consumption on partitioned table with FKs

2020-11-23 Thread Tatsuro Yamada

Hi Hackers,

My company (NTT Comware) and the NTT OSS Center were verifying
partitioned tables on PG14dev, and we ran into a problem: huge memory
consumption when we created a foreign key constraint on a partitioned
table with 500 partitions and then inserted some data.

We investigated a little using the "pg_backend_memory_contexts" view
and found that the "CachedPlan" context of the backends grew dramatically.
See below:

Without FKs
==
CachedPlan 0kB

With FKs (the problem is here)
==
CachedPlan 710MB


Please find the attached file to reproduce the problem.

We know two things so far:
  - Each backend holds its own copy of the CachedPlan
  - The more partitions there are, the more memory the CachedPlan consumes

If there are many backends, total consumption is roughly the size of the
CachedPlan times the number of backends, which may cause a memory shortage
and trigger the OOM killer. We think the affected versions are PG12 and later.
I believe it would be better to fix this problem. Any thoughts?
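As a rough back-of-the-envelope check of that multiplication (the 710 MB figure
is the one observed above; 100 backends is a purely hypothetical count):

```sql
-- Roughly 69 GB of CachedPlan in total for 100 backends at ~710 MB each.
SELECT pg_size_pretty(710::bigint * 1024 * 1024 * 100);
```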

Regards,
Tatsuro Yamada
DROP TABLE IF EXISTS pr CASCADE;
DROP TABLE IF EXISTS ps CASCADE;
CREATE TABLE ps (c1 INT PRIMARY KEY) PARTITION BY RANGE(c1);
CREATE TABLE pr (c1 INT, c2 INT REFERENCES ps(c1)) PARTITION BY RANGE(c1);

-- Show memory usage of 'Cached%' 
SELECT name, sum(used_bytes) as bytes, pg_size_pretty(sum(used_bytes)) FROM 
pg_backend_memory_contexts
WHERE name LIKE 'Cached%' GROUP BY name;

-- Procedure for creating partitioned table
CREATE OR REPLACE PROCEDURE part_make(tbl text, num int) AS $$
 DECLARE width int := 10; next int :=1;
 BEGIN
   FOR i in 1..num LOOP
   EXECUTE 'CREATE TABLE ' || tbl || '_' || i || ' partition of ' || tbl
   || ' FOR VALUES FROM (' || next || ') TO (' || i * width || ');';
   next := i * width;
   END LOOP;
 END;
$$ LANGUAGE plpgsql;

-- Create partitioned tables named ps and pr; each has 500 partitions.
CALL part_make('ps', 500);
CALL part_make('pr', 500);

-- Insert data
INSERT INTO ps SELECT generate_series(1,4999);
INSERT INTO pr SELECT i, i from generate_series(1,4999)i;

-- Show memory usage of 'Cached%' again
-- You can see 'CachedPlan 710MB'
SELECT name, sum(used_bytes) as bytes, pg_size_pretty(sum(used_bytes)) FROM 
pg_backend_memory_contexts
WHERE name LIKE 'Cached%' GROUP BY name;



Re: list of extended statistics on psql

2020-11-16 Thread Tatsuro Yamada

Hi Tomas,

Thanks for your comments and also revising patches.

On 2020/11/16 3:22, Tomas Vondra wrote:

It's better to always post the whole patch series, so that cfbot can
test it properly. Sending just 0003 separately kind of breaks that.


I now understand how cfbot works, so I'll take care of that
when I send patches. Thanks.



Also, 0003 seems to only tweak the .sql file, not the expected output,
and there actually seem to be two places that mistakenly use \dx (so
listing extensions) instead of \dX. I've fixed both issues in the
attached patches.


Oops, sorry about that.

 

However, I think the 0002 tests are better/sufficient - I prefer to keep
it compact, not interleaving with the tests testing various other stuff.
So I don't intend to commit 0003, unless there's something that I don't
see for some reason.


Agreed. With 0002 it is easier to modify test cases and check results than with 0003.
Therefore, I'll go with 0002.

 

The one remaining thing I'm not sure about is the naming of the columns with
the size of statistics - N_size, D_size and M_size do not seem very clear.
Any clearer naming will however make the tables wider, though :-/


Yeah, I think so too, but I couldn't come up with suitable names for
the columns when I created the patch.
I don't prefer long names, but I'll replace the names to make them clearer.
For example, s/N_size/Ndistinct_size/.

Please find attached patches:
  - 0001: Replace column names
  - 0002: Recreate regression test based on 0001


Regards,
Tatsuro Yamada

From 85fe05c3020cd595ae8d5c2cc6f695b39f4a6e03 Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Tue, 17 Nov 2020 13:30:57 +0900
Subject: [PATCH 2/2] Recreate regression test

---
 src/test/regress/expected/stats_ext.out | 94 +
 src/test/regress/sql/stats_ext.sql  | 31 +++
 2 files changed, 125 insertions(+)

diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 4c3edd213f..27ca54a8f3 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -1550,6 +1550,100 @@ INSERT INTO tststats.priv_test_tbl
 CREATE STATISTICS tststats.priv_test_stats (mcv) ON a, b
   FROM tststats.priv_test_tbl;
 ANALYZE tststats.priv_test_tbl;
+-- Check printing info about extended statistics by \dX
+create table stts_t1 (a int, b int);
+create statistics stts_1 (ndistinct) on a, b from stts_t1;
+create statistics stts_2 (ndistinct, dependencies) on a, b from stts_t1;
+create statistics stts_3 (ndistinct, dependencies, mcv) on a, b from stts_t1;
+create table stts_t2 (a int, b int, c int);
+create statistics stts_4 on b, c from stts_t2;
+create table stts_t3 (col1 int, col2 int, col3 int);
+create statistics stts_hoge on col1, col2, col3 from stts_t3;
+create schema stts_s1;
+create schema stts_s2;
+create statistics stts_s1.stts_foo on col1, col2 from stts_t3;
+create statistics stts_s2.stts_yama (dependencies, mcv) on col1, col3 from 
stts_t3;
+insert into stts_t1 select i,i from generate_series(1,100) i;
+analyze stts_t1;
+\dX
+  List of extended statistics
+  Schema  |          Name          |              Definition              | Ndistinct | Dependencies |   MCV
+----------+------------------------+--------------------------------------+-----------+--------------+---------
+ public   | func_deps_stat         | a, b, c FROM functional_dependencies |           | built        |
+ public   | mcv_lists_arrays_stats | a, b, c FROM mcv_lists_arrays        |           |              | built
+ public   | mcv_lists_bool_stats   | a, b, c FROM mcv_lists_bool          |           |              | built
+ public   | mcv_lists_stats        | a, b, d FROM mcv_lists               |           |              | built
+ public   | stts_1                 | a, b FROM stts_t1                    | built     |              |
+ public   | stts_2                 | a, b FROM stts_t1                    | built     | built        |
+ public   | stts_3                 | a, b FROM stts_t1                    | built     | built        | built
+ public   | stts_4                 | b, c FROM stts_t2                    | defined   | defined      | defined
+ public   | stts_hoge              | col1, col2, col3 FROM stts_t3        | defined   | defined      | defined
+ stts_s1  | stts_foo               | col1, col2 FROM stts_t3              | defined   | defined      | defined
+ stts_s2  | stts_yama              | col1, col3 FROM stts_t3              |           | defined      | defined
+ tststats | priv_test_stats        | a, b FROM tststats.priv_test_tbl     |           |              | built
+(12 rows)
+
+\dX stts_?
+   List of extended statistics
+ Schema |  Name  |Definition | Ndistinct | Dependencies |   MCV   
+++---+---+--+-
+ public | stts_1 | a, b FROM stts_

Re: list of extended statistics on psql

2020-11-10 Thread Tatsuro Yamada

Hi,

 

2) The test is failing intermittently because it's executed in parallel
with stats_ext test, which is also creating extended statistics. So
depending on the timing the \dX may list some of the stats_ext stuff.
I'm not sure what to do about this. Either this part needs to be moved
to a separate test executed in a different group, or maybe we should
simply move it to stats_ext.


I thought all tests related to meta-commands lived in psql.sql, but I
realized that's not true; for example, the test for \dRp is not in
psql.sql. Therefore, I moved the \dX regression test to stats_ext.sql
to avoid failures when the tests run in parallel.

The attached patches are the following:
  - 0001-v8-Add-dX-command-on-psql.patch
  - 0002-Add-regression-test-of-dX-to-stats_ext.sql.patch

However, I feel the \dX test is not elegant, so I'm going to try
creating another one, since it would be better to fit the context
of the existing extended statistics tests.


I tried creating another version of the regression test (0003),
in which "\dX" is run after ANALYZE commands or after SELECT ... FROM pg_statistic_ext.

Please find the attached file:
  - 0003-Add-regression-test-of-dX-to-stats_ext.sql-another-ver

Both regression tests, 0002 and 0003, are okay with me, I think.
Could you choose one?

Regards,
Tatsuro Yamada

From b22ebf34fc09e246f8d4cf408f76a6753f3d6bcb Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Tue, 10 Nov 2020 16:47:45 +0900
Subject: [PATCH] Add regression test of \dX to stats_ext.sql (another version)

---
 src/test/regress/sql/stats_ext.sql | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 9781e590a3..c590dcc90a 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -49,6 +49,7 @@ CREATE STATISTICS regress_schema_2.ab1_a_b_stats ON a, b FROM ab1;
 
 -- Let's also verify the pg_get_statisticsobjdef output looks sane.
 SELECT pg_get_statisticsobjdef(oid) FROM pg_statistic_ext WHERE stxname = 
'ab1_a_b_stats';
+\dX
 
 DROP STATISTICS regress_schema_2.ab1_a_b_stats;
 
@@ -60,9 +61,10 @@ ALTER TABLE ab1 DROP COLUMN a;
 \d ab1
 -- Ensure statistics are dropped when table is
 SELECT stxname FROM pg_statistic_ext WHERE stxname LIKE 'ab1%';
+\dX
 DROP TABLE ab1;
 SELECT stxname FROM pg_statistic_ext WHERE stxname LIKE 'ab1%';
-
+\dX
 -- Ensure things work sanely with SET STATISTICS 0
 CREATE TABLE ab1 (a INTEGER, b INTEGER);
 ALTER TABLE ab1 ALTER a SET STATISTICS 0;
@@ -73,13 +75,16 @@ ALTER TABLE ab1 ALTER a SET STATISTICS -1;
 -- setting statistics target 0 skips the statistics, without printing any message, so check catalog
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS 0;
 \d ab1
+\dX+ ab1_a_b_stats
 ANALYZE ab1;
 SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxname = 'ab1_a_b_stats'
AND d.stxoid = s.oid;
+\dX+ ab1_a_b_stats
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
 \d+ ab1
+\dX+ ab1_a_b_stats
 -- partial analyze doesn't build stats either
 ANALYZE ab1 (a);
 ANALYZE ab1;
@@ -93,6 +98,7 @@ CREATE TABLE ab1c () INHERITS (ab1);
 INSERT INTO ab1 VALUES (1,1);
 CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
+\dX+ ab1_a_b_stats
 DROP TABLE ab1 CASCADE;
 
 -- Verify supported object types for extended statistics
@@ -171,6 +177,7 @@ SELECT s.stxkind, d.stxdndistinct
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxrelid = 'ndistinct'::regclass
AND d.stxoid = s.oid;
+\dX+ s10
 
 -- minor improvement, make sure the ctid does not break the matching
 SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct GROUP BY 
ctid, a, b');
@@ -202,6 +209,7 @@ SELECT s.stxkind, d.stxdndistinct
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxrelid = 'ndistinct'::regclass
AND d.stxoid = s.oid;
+\dX+ s10
 
 -- correct estimates
 SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct GROUP BY a, b');
@@ -220,6 +228,7 @@ SELECT s.stxkind, d.stxdndistinct
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxrelid = 'ndistinct'::regclass
AND d.stxoid = s.oid;
+\dX+
 
 -- dropping the statistics results in under-estimates
 SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct GROUP BY a, b');
@@ -261,6 +270,7 @@ SELECT * FROM check_estimated_rows('SELECT * FROM functional_dependencies WHERE
 CREATE STATISTICS func_deps_stat (dependencies) ON a, b, c FROM functional_dependencies;
 
 ANALYZE functional_dependencies;
+\dX+ func_deps_stat
 
 SELECT * FROM check_estimated_rows('SELECT * FROM functional_dependencies WHERE a = 1 AND b = ''1''');
 
@@ -274,6 +284,7 @@ INSERT INTO functional_dependencies (a, b, c, filler1)
  SELECT mod(i,100), mod(i,50), mod(i,25), i FROM generate_series(1,5000) s(i);
 
 ANALYZE functional_dependencies;
+\dX+ func_deps_stat
 
 SELECT * FROM check_estimated_ro

Re: list of extended statistics on psql

2020-11-09 Thread Tatsuro Yamada

Hi Tomas,


I took a look at this today, and I think the code is ready, but the
regression test needs a bit more work:


Thanks for taking the time to look at this. :-D



1) It's probably better to use somewhat more specific names for the
objects, especially when created in public schema. It decreases the
chance of a collision with other tests (which may be hard to notice
because of timing). I suggest we use "stts_" prefix or something like
that, per the attached 0002 patch. (0001 is just the v7 patch)


I agree with your comment. Thanks.




2) The test is failing intermittently because it's executed in parallel
with stats_ext test, which is also creating extended statistics. So
depending on the timing the \dX may list some of the stats_ext stuff.
I'm not sure what to do about this. Either this part needs to be moved
to a separate test executed in a different group, or maybe we should
simply move it to stats_ext.


I thought all tests related to meta-commands lived in psql.sql, but I
realized that's not true; for example, the test for \dRp is not in
psql.sql. Therefore, I moved the \dX regression test to stats_ext.sql
to avoid failures when the tests run in parallel.

The attached patches are the following:
 - 0001-v8-Add-dX-command-on-psql.patch
 - 0002-Add-regression-test-of-dX-to-stats_ext.sql.patch

However, I feel the \dX test is not elegant, so I'm going to try
creating another one, since it would be better to fit the context
of the existing extended statistics tests.

Regards,
Tatsuro Yamada


From 11c088cb43cdb13e445e246d0529d5721cf3bb10 Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Tue, 10 Nov 2020 11:39:14 +0900
Subject: [PATCH] Add regression test of \dX to stats_ext.sql

---
 src/test/regress/expected/stats_ext.out | 94 +
 src/test/regress/sql/stats_ext.sql  | 31 +++
 2 files changed, 125 insertions(+)

diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 4c3edd213f..0ec4e24960 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -1550,6 +1550,100 @@ INSERT INTO tststats.priv_test_tbl
 CREATE STATISTICS tststats.priv_test_stats (mcv) ON a, b
   FROM tststats.priv_test_tbl;
 ANALYZE tststats.priv_test_tbl;
+-- Check printing info about extended statistics by \dX
+create table stts_t1 (a int, b int);
+create statistics stts_1 (ndistinct) on a, b from stts_t1;
+create statistics stts_2 (ndistinct, dependencies) on a, b from stts_t1;
+create statistics stts_3 (ndistinct, dependencies, mcv) on a, b from stts_t1;
+create table stts_t2 (a int, b int, c int);
+create statistics stts_4 on b, c from stts_t2;
+create table stts_t3 (col1 int, col2 int, col3 int);
+create statistics stts_hoge on col1, col2, col3 from stts_t3;
+create schema stts_s1;
+create schema stts_s2;
+create statistics stts_s1.stts_foo on col1, col2 from stts_t3;
+create statistics stts_s2.stts_yama (dependencies, mcv) on col1, col3 from 
stts_t3;
+insert into stts_t1 select i,i from generate_series(1,100) i;
+analyze stts_t1;
+\dX
+  List of extended statistics
+  Schema  |  Name  |  Definition  | 
N_distinct | Dependencies |   Mcv   
+--++--++--+-
+ public   | func_deps_stat | a, b, c FROM functional_dependencies |
| built| 
+ public   | mcv_lists_arrays_stats | a, b, c FROM mcv_lists_arrays|
|  | built
+ public   | mcv_lists_bool_stats   | a, b, c FROM mcv_lists_bool  |
|  | built
+ public   | mcv_lists_stats| a, b, d FROM mcv_lists   |
|  | built
+ public   | stts_1 | a, b FROM stts_t1| 
built  |  | 
+ public   | stts_2 | a, b FROM stts_t1| 
built  | built| 
+ public   | stts_3 | a, b FROM stts_t1| 
built  | built| built
+ public   | stts_4 | b, c FROM stts_t2| 
defined| defined  | defined
+ public   | stts_hoge  | col1, col2, col3 FROM stts_t3| 
defined| defined  | defined
+ stts_s1  | stts_foo   | col1, col2 FROM stts_t3  | 
defined| defined  | defined
+ stts_s2  | stts_yama  | col1, col3 FROM stts_t3  |
| defined  | defined
+ tststats | priv_test_stats| a, b FROM tststats.priv_test_tbl |
|  | built
+(12 rows)
+
+\dX stts_?
+List of extended statistics
+ Schema |  Name  |Definition | N_distinct | Dependencies |   Mcv   
+++---++--+-
+ public | stts_1 | a, b FROM stts_

Re: list of extended statistics on psql

2020-11-03 Thread Tatsuro Yamada

Hi,


I addressed it, so the sizes of extended statistics are now shown with units.

Changes:

   - Use pg_size_pretty to show the size of extended stats by \dX+



I rebased the patch on the head and also added tab-completion.
Any feedback is welcome.


Preparing for tests:
===
create table t1 (a int, b int);
create statistics stts_1 (dependencies) on a, b from t1;
create statistics stts_2 (dependencies, ndistinct) on a, b from t1;
create statistics stts_3 (dependencies, ndistinct, mcv) on a, b from t1;

create table t2 (a int, b int, c int);
create statistics stts_4 on b, c from t2;

create table hoge (col1 int, col2 int, col3 int);
create statistics stts_hoge on col1, col2, col3 from hoge;

create schema foo;
create schema yama;
create statistics foo.stts_foo on col1, col2 from hoge;
create statistics yama.stts_yama (ndistinct, mcv) on col1, col3 from hoge;

insert into t1 select i,i from generate_series(1,100) i;
analyze t1;

Result of \dX:
==
postgres=# \dX
  List of extended statistics
 Schema |   Name    |         Definition         | N_distinct | Dependencies |   Mcv
--------+-----------+----------------------------+------------+--------------+---------
 foo    | stts_foo  | col1, col2 FROM hoge       | defined    | defined      | defined
 public | stts_1    | a, b FROM t1               |            | built        |
 public | stts_2    | a, b FROM t1               | built      | built        |
 public | stts_3    | a, b FROM t1               | built      | built        | built
 public | stts_4    | b, c FROM t2               | defined    | defined      | defined
 public | stts_hoge | col1, col2, col3 FROM hoge | defined    | defined      | defined
 yama   | stts_yama | col1, col3 FROM hoge       | defined    |              | defined
(7 rows)

Result of \dX+:
===
postgres=# \dX+
   List of extended statistics
 Schema |   Name    |         Definition         | N_distinct | Dependencies |   Mcv   |  N_size  |  D_size  |   M_size
--------+-----------+----------------------------+------------+--------------+---------+----------+----------+------------
 foo    | stts_foo  | col1, col2 FROM hoge       | defined    | defined      | defined | 0 bytes  | 0 bytes  | 0 bytes
 public | stts_1    | a, b FROM t1               |            | built        |         |          | 40 bytes |
 public | stts_2    | a, b FROM t1               | built      | built        |         | 13 bytes | 40 bytes |
 public | stts_3    | a, b FROM t1               | built      | built        | built   | 13 bytes | 40 bytes | 6126 bytes
 public | stts_4    | b, c FROM t2               | defined    | defined      | defined | 0 bytes  | 0 bytes  | 0 bytes
 public | stts_hoge | col1, col2, col3 FROM hoge | defined    | defined      | defined | 0 bytes  | 0 bytes  | 0 bytes
 yama   | stts_yama | col1, col3 FROM hoge       | defined    |              | defined | 0 bytes  |          | 0 bytes

Results of Tab-completion:
===
postgres=# \dX 
foo. pg_toast.stts_2   stts_hoge
information_schema.  public.  stts_3   yama.
pg_catalog.  stts_1   stts_4

postgres=# \dX+ 
foo. pg_toast.stts_2   stts_hoge
information_schema.  public.  stts_3   yama.
pg_catalog.  stts_1   stts_4


Regards,
Tatsuro Yamada
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..fd860776af 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+If + is appended to the command name, each extended 
statistics
+is listed with its size.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c7a83d5dfc..c6f1653cb7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -930,6 +930,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
else
success = listExtensions(pattern);
break;
+   case 'X':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern, 
show_verbose);
+   break;
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
diff --git a/src/bin

Re: list of extended statistics on psql

2020-10-28 Thread Tatsuro Yamada

Hi Tomas,

On 2020/10/29 4:06, Tomas Vondra wrote:

On Wed, Oct 28, 2020 at 03:07:56PM +0900, Tatsuro Yamada wrote:

Hi Michael-san and Hackers,

On 2020/09/30 15:19, Michael Paquier wrote:

On Thu, Sep 17, 2020 at 02:55:31PM +0900, Michael Paquier wrote:

Could you provide at least a rebased version of the patch?  The CF bot
is complaning here.


Not seeing this answered after two weeks, I have marked the patch as
RwF for now.
--
Michael



Sorry for the delayed reply.

I re-based the patch on the current head and did some
refactoring.
I think the size of extended stats is not useful for DBAs.
Should I remove it?



I think it's an interesting / useful information, I'd keep it (in the
\dX+ output only, of course). But I think it needs to print the size
similarly to \d+, i.e. using pg_size_pretty - that'll include the unit
and make it more readable for large stats.



Thanks for your comment.
I addressed it, so the size of extended stats is now shown with its unit.

Changes:

  - Use pg_size_pretty to show the size of extended stats by \dX+

Result of \dX+:
===
   Schema    |    Name    |   Definition    | N_distinct | Dependencies |   Mcv   |  N_Size  |  D_Size  |   M_Size
-------------+------------+-----------------+------------+--------------+---------+----------+----------+------------
 hoge1schema | hoge1_ext  | a, b FROM hoge1 | built      | built        | built   | 13 bytes | 40 bytes | 6126 bytes
 public      | hoge1_ext1 | a, b FROM hoge1 | defined    | defined      | defined | 0 bytes  | 0 bytes  | 0 bytes
 public      | hoge1_ext2 | a, b FROM hoge1 | defined    |              |         | 0 bytes  |          |
(3 rows)

Please find the attached patch.

Regards,
Tatsuro Yamada

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..fd860776af 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+If + is appended to the command name, each extended 
statistics
+is listed with its size.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c7a83d5dfc..c6f1653cb7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -930,6 +930,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
else
success = listExtensions(pattern);
break;
+   case 'X':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern, 
show_verbose);
+   break;
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 07d640021c..1807324542 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4397,6 +4397,106 @@ listEventTriggers(const char *pattern, bool verbose)
return true;
 }
 
+/*
+ * \dX
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern, bool verbose)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 10)
+   {
+   charsverbuf[32];
+
+   pg_log_error("The server (version %s) does not support extended 
statistics.",
+formatPGVersionNumber(pset.sversion, 
false,
+   
   sverbuf, sizeof(sverbuf)));
+   return true;
+   }
+
+   initPQExpBuffer();
+   printfPQExpBuffer(,
+ "SELECT \n"
+ 
"es.stxnamespace::pg_catalog.regnamespace::text AS \"%s\", \n"
+ "es.stxname AS \"%s\", \n"
+ "pg_catalog.format('%%s FROM %%s', \n"
+ "  (SELECT 
pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ') \n"
+ "   FROM 
pg_catalog.unnest(es.stxkeys) s(attnum) \n"
+ "   JOIN pg_catalog.pg_attribute a \n"
+ "   ON (es.stxrelid = a.attrelid \n"
+  

Re: list of extended statistics on psql

2020-10-28 Thread Tatsuro Yamada

Hi Tomas,

On 2020/10/29 4:07, Tomas Vondra wrote:

On Wed, Oct 28, 2020 at 04:20:25PM +0900, Tatsuro Yamada wrote:

Hi,


Results of \dX and \dX+:

postgres=# \dX
    List of extended statistics
    Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv
-+---+-++--+-
  public  | hoge1_ext | a, b FROM hoge1 | defined    | defined  | 
defined
  hoge1schema | hoge1_ext | a, b FROM hoge1 | built  | built    | built
(2 rows)



I used "ORDER BY 1, 2" in the query, but I realized the ordering of the
result was wrong, so I fixed it in the attached patch.
Please find the patch file. :-D



Thanks. I'll take a look at the beginning of the 2020-11 commitfest, and
I hope to get this committed.



Thanks for your reply and I'm glad to hear that.

I'm going to revise the patch as needed to get this committed in
the next commitfest.

Regards,
Tatsuro Yamada







Re: list of extended statistics on psql

2020-10-28 Thread Tatsuro Yamada

Hi,


Results of \dX and \dX+:

postgres=# \dX
     List of extended statistics
     Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv
-+---+-++--+-
   public  | hoge1_ext | a, b FROM hoge1 | defined    | defined  | 
defined
   hoge1schema | hoge1_ext | a, b FROM hoge1 | built  | built    | built
(2 rows)



I used "ORDER BY 1, 2" in the query, but I realized the ordering of the
result was wrong, so I fixed it in the attached patch.
Please find the patch file. :-D

Regards,
Tatsuro Yamada


diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..fd860776af 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+If + is appended to the command name, each extended 
statistics
+is listed with its size.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c7a83d5dfc..c6f1653cb7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -930,6 +930,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
else
success = listExtensions(pattern);
break;
+   case 'X':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern, 
show_verbose);
+   break;
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 07d640021c..484f011792 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4397,6 +4397,98 @@ listEventTriggers(const char *pattern, bool verbose)
return true;
 }
 
+/*
+ * \dX
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern, bool verbose)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 10)
+   {
+   charsverbuf[32];
+
+   pg_log_error("The server (version %s) does not support extended 
statistics.",
+formatPGVersionNumber(pset.sversion, 
false,
+   
   sverbuf, sizeof(sverbuf)));
+   return true;
+   }
+
+   initPQExpBuffer();
+   printfPQExpBuffer(,
+ "SELECT \n"
+ 
"es.stxnamespace::pg_catalog.regnamespace::text AS \"%s\", \n"
+ "es.stxname AS \"%s\", \n"
+ "pg_catalog.format('%%s FROM %%s', \n"
+ "  (SELECT 
pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ') \n"
+ "   FROM 
pg_catalog.unnest(es.stxkeys) s(attnum) \n"
+ "   JOIN pg_catalog.pg_attribute a \n"
+ "   ON (es.stxrelid = a.attrelid \n"
+ "   AND a.attnum = s.attnum \n"
+ "   AND NOT a.attisdropped)), \n"
+ "es.stxrelid::regclass) AS \"%s\", \n"
+ "CASE WHEN esd.stxdndistinct IS NOT 
NULL THEN 'built' \n"
+ " WHEN 'd' = any(es.stxkind) THEN 
'defined' \n"
+ "END AS \"%s\", \n"
+ "CASE WHEN esd.stxddependencies IS 
NOT NULL THEN 'built' \n"
+ " WHEN 'f' = any(es.stxkind) THEN 
'defined' \n"
+ "END AS \"%s\", \n"
+ "CASE WHEN esd.stxdmcv IS NOT NULL 
THEN 'built' \n"
+ " WHEN 'm' = any(es.stxkind) THEN 
'defined' \

Re: list of extended statistics on psql

2020-10-28 Thread Tatsuro Yamada

Hi Michael-san and Hackers,

On 2020/09/30 15:19, Michael Paquier wrote:

On Thu, Sep 17, 2020 at 02:55:31PM +0900, Michael Paquier wrote:

Could you provide at least a rebased version of the patch?  The CF bot
is complaning here.


Not seeing this answered after two weeks, I have marked the patch as
RwF for now.
--
Michael



Sorry for the delayed reply.

I re-based the patch on the current head and did some
refactoring.
I think the size of extended stats is not useful for DBAs.
Should I remove it?

Changes:

   - Use a keyword "defined" instead of "not built"
   - Use COALESCE function for size for extended stats

Results of \dX and \dX+:

postgres=# \dX
                      List of extended statistics
   Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv
-------------+-----------+-----------------+------------+--------------+---------
 public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined
 hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built
(2 rows)

postgres=# \dX+
                                 List of extended statistics
   Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv   | N_size | D_size | M_size
-------------+-----------+-----------------+------------+--------------+---------+--------+--------+--------
 public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined |      0 |      0 |      0
 hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built   |     13 |     40 |   6126
(2 rows)

Query of \dX+:
==
 SELECT
 stxnamespace::pg_catalog.regnamespace AS "Schema",
 stxname AS "Name",
 pg_catalog.format('%s FROM %s',
   (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ')
FROM pg_catalog.unnest(es.stxkeys) s(attnum)
JOIN pg_catalog.pg_attribute a
ON (es.stxrelid = a.attrelid
AND a.attnum = s.attnum
AND NOT a.attisdropped)),
 es.stxrelid::regclass) AS "Definition",
 CASE WHEN esd.stxdndistinct IS NOT NULL THEN 'built'
  WHEN 'd' = any(stxkind) THEN 'defined'
 END AS "N_distinct",
 CASE WHEN esd.stxddependencies IS NOT NULL THEN 'built'
  WHEN 'f' = any(stxkind) THEN 'defined'
 END AS "Dependencies",
 CASE WHEN esd.stxdmcv IS NOT NULL THEN 'built'
  WHEN 'm' = any(stxkind) THEN 'defined'
 END AS "Mcv",
 COALESCE(pg_catalog.length(stxdndistinct), 0) AS "N_size",
 COALESCE(pg_catalog.length(stxddependencies), 0) AS "D_size",
 COALESCE(pg_catalog.length(stxdmcv), 0) AS "M_size"
 FROM pg_catalog.pg_statistic_ext es
 LEFT JOIN pg_catalog.pg_statistic_ext_data esd
 ON es.oid = esd.stxoid
     INNER JOIN pg_catalog.pg_class c
 ON es.stxrelid = c.oid
 ORDER BY 1, 2;


Regards,
Tatsuro Yamada

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..fd860776af 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+If + is appended to the command name, each extended 
statistics
+is listed with its size.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c7a83d5dfc..c6f1653cb7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -930,6 +930,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
else
success = listExtensions(pattern);
break;
+   case 'X':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern, 
show_verbose);
+   break;
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 07d640021c..e0ca32d34b 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4397,6 +4397,98 @@ listEventTriggers(const char *pattern, bool verbose)
return true;
 }
 
+/*
+ * \dX
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern, bool verbose)
+{
+   PQEx

Re: [spam] Re: list of extended statistics on psql

2020-10-27 Thread Tatsuro Yamada

Hi Michael-san and Hackers,

On 2020/09/30 15:19, Michael Paquier wrote:

On Thu, Sep 17, 2020 at 02:55:31PM +0900, Michael Paquier wrote:

Could you provide at least a rebased version of the patch?  The CF bot
is complaning here.


Not seeing this answered after two weeks, I have marked the patch as
RwF for now.
--
Michael



Sorry for the delayed reply.

I re-based the patch on the current head and did some
refactoring.
I think the size of extended stats is not useful for DBAs.
Should I remove it?

Changes:

  - Use a keyword "defined" instead of "not built"
  - Use COALESCE function for size for extended stats

Results of \dX and \dX+:

postgres=# \dX
                     List of extended statistics
   Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv
-------------+-----------+-----------------+------------+--------------+---------
 public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined
 hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built
(2 rows)

postgres=# \dX+
                                List of extended statistics
   Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv   | N_size | D_size | M_size
-------------+-----------+-----------------+------------+--------------+---------+--------+--------+--------
 public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined |      0 |      0 |      0
 hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built   |     13 |     40 |   6126
(2 rows)

Query of \dX+:
==
SELECT
stxnamespace::pg_catalog.regnamespace AS "Schema",
stxname AS "Name",
pg_catalog.format('%s FROM %s',
  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ')
   FROM pg_catalog.unnest(es.stxkeys) s(attnum)
   JOIN pg_catalog.pg_attribute a
   ON (es.stxrelid = a.attrelid
   AND a.attnum = s.attnum
   AND NOT a.attisdropped)),
es.stxrelid::regclass) AS "Definition",
CASE WHEN esd.stxdndistinct IS NOT NULL THEN 'built'
 WHEN 'd' = any(stxkind) THEN 'defined'
END AS "N_distinct",
CASE WHEN esd.stxddependencies IS NOT NULL THEN 'built'
 WHEN 'f' = any(stxkind) THEN 'defined'
END AS "Dependencies",
CASE WHEN esd.stxdmcv IS NOT NULL THEN 'built'
 WHEN 'm' = any(stxkind) THEN 'defined'
END AS "Mcv",
COALESCE(pg_catalog.length(stxdndistinct), 0) AS "N_size",
COALESCE(pg_catalog.length(stxddependencies), 0) AS "D_size",
COALESCE(pg_catalog.length(stxdmcv), 0) AS "M_size"
FROM pg_catalog.pg_statistic_ext es
LEFT JOIN pg_catalog.pg_statistic_ext_data esd
ON es.oid = esd.stxoid
    INNER JOIN pg_catalog.pg_class c
ON es.stxrelid = c.oid
ORDER BY 1, 2;


Regards,
Tatsuro Yamada
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..fd860776af 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1918,6 +1918,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+If + is appended to the command name, each extended 
statistics
+is listed with its size.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c7a83d5dfc..c6f1653cb7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -930,6 +930,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
else
success = listExtensions(pattern);
break;
+   case 'X':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern, 
show_verbose);
+   break;
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 07d640021c..e0ca32d34b 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4397,6 +4397,98 @@ listEventTriggers(const char *pattern, bool verbose)
return true;
 }
 
+/*
+ * \dX
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern, bool verbose)
+{
+   PQExpBufferData buf;
+   PGresult   *res;

Re: list of extended statistics on psql

2020-09-02 Thread Tatsuro Yamada

Hi,


 >> I wonder if trying to list info about all stats from the statistics
 >> object in a single line is necessary. Maybe we should split the info
 >> into one line per statistics, so for example
 >>
 >>     CREATE STATISTICS s (mcv, ndistinct, dependencies) ON ...
 >>
 >> would result in three lines in the \dX output. The statistics name would
 >> identify which lines belong together, but other than that the pieces are
 >> mostly independent.
 >
 >Yeah, that's what I'm suggesting.  I don't think we need to repeat the
 >name/definition for each line though.
 >
 >It might be useful to know how does pspg show a single entry that's
 >split in three lines, though.
 >

Ah, I didn't realize you're proposing that - I assumed it's broken
simply to make it readable, or something like that. I think the lines
are mostly independent, so I'd suggest to include the name of the object
on each line. The question is whether this independence will remain true
in the future - for example histograms would be built only on data not
represented by the MCV list, so there's a close dependency there.

Not sure about pspg, and I'm not sure it matters too much.


pspg mostly ignores multiline rows - the horizontal cursor always covers one row.
There is only one case where pspg detects multiline rows - sorting, where pspg
ensures the content of multiline rows stays correct when they are displayed in a
different order than the input.




Let me summarize the discussion so far.
Is my understanding right? Please correct it if anything is wrong.


* Summary

  1. "\dX[+]" doesn't display the Size of extended stats since the size is
  useful only for the development process of the stats.

  2. each row should have stats name, definition, type, and status.
 For example:

 statname |   definition | type  |
--+--+---+
 someobj  | (a, b) FROM tab  | n-distinct: built |
 someobj  | (a, b) FROM tab  | func-dependencies: built  |
 someobj  | (a, b) FROM tab  | mcv: built|
 sttshoge | (a, b) FROM hoge | n-distinct: required  |
 sttshoge | (a, b) FROM hoge | func-dependencies:required|
 sttscross| (a, b) FROM t1,t2| n-distinct: required  |


My opinion is below:

  For 1., agreed. I will remove it in the next patch.
  For 2., I feel the design is not elegant, so I'd like to change it.
The reasons are:

- I think that even if the number of types doubled in the future,
  each type would be better presented as a column, not a row.
  Repeated items (the stats name and definition) should be removed.
  It's okay to have many columns in the future, as with "\df+", because we
  can use "\x" mode for display if we need it.

- The type column holds two kinds of data: the stats type and its status.
  The data-modeling principle of "one fact in one place" in the RDBMS
  world suggests it would be better to split them.
  I'd like to suggest the below design of the view.

 statname |   definition | n-distinct | func-dependencies | mcv   |
--+--++---+---|
 someobj  | (a, b) FROM tab  | built  | built | built |
 sttshoge | (a, b) FROM hoge | required   | required      |       |
 sttscross| (a, b) FROM t1,t2| required   |   |   |


Any thoughts?


Thanks,
Tatsuro Yamada







Re: v13: show extended stats target in \d

2020-08-31 Thread Tatsuro Yamada

Hi Justin,

On 2020/08/31 14:00, Justin Pryzby wrote:

The stats target can be set since commit d06215d03, but wasn't shown by psql.
  ALTER STATISISTICS .. SET STATISTICS n.

Normal (1-D) stats targets are shown in \d+ table.
Stats objects are shown in \d (no plus).
Arguably, this should be shown only in "verbose" mode (\d+).


I tested your patch on 13beta3 and head (ab3c6d41).
It looks good to me. :-D

Thanks,
Tatsuro Yamada





Re: list of extended statistics on psql

2020-08-30 Thread Tatsuro Yamada

On 2020/08/31 1:59, Tom Lane wrote:

Tomas Vondra  writes:

On Sun, Aug 30, 2020 at 12:33:29PM -0400, Alvaro Herrera wrote:

I wonder how to report that.  Knowing that psql \-commands are not meant
for anything other than human consumption, maybe we can use a format()
string that says "built: %d bytes" when \dX+ is used (for each stat type),
and just "built" when \dX is used.  What do people think about this?


Seems a little too cute to me.


I'd use the same approach as \d+, i.e. a separate column with the size.
Maybe that'd mean too many columns, though.


psql already has \d commands with so many columns that you pretty much
have to use \x mode to make them legible; \df+ for instance.  I don't
mind if \dX+ is also in that territory.  It'd be good though if plain
\dX can fit in a normal terminal window.



Hmm. How about these instead of "built: %d bytes"?
I added three columns (N_size, D_size, M_size) to show size. See below:

===
 postgres=# \dX
                         List of extended statistics
 Schema |   Name    |         Definition         | N_distinct | Dependencies |    Mcv
--------+-----------+----------------------------+------------+--------------+-----------
 public | stts_1    | a, b FROM t1               |            | built        |
 public | stts_2    | a, b FROM t1               | built      | built        |
 public | stts_3    | a, b FROM t1               | built      | built        | built
 public | stts_4    | b, c FROM t2               | not built  | not built    | not built
 public | stts_hoge | col1, col2, col3 FROM hoge | not built  | not built    | not built
(5 rows)

postgres=# \dX+
                                      List of extended statistics
 Schema |   Name    |         Definition         | N_distinct | Dependencies |    Mcv    | N_size | D_size | M_size
--------+-----------+----------------------------+------------+--------------+-----------+--------+--------+--------
 public | stts_1    | a, b FROM t1               |            | built        |           |        |     40 |
 public | stts_2    | a, b FROM t1               | built      | built        |           |     13 |     40 |
 public | stts_3    | a, b FROM t1               | built      | built        | built     |     13 |     40 |   6126
 public | stts_4    | b, c FROM t2               | not built  | not built    | not built |        |        |
 public | stts_hoge | col1, col2, col3 FROM hoge | not built  | not built    | not built |        |        |
===

I used this query to get results of "\dX+".
===
SELECT
 stxnamespace::pg_catalog.regnamespace AS "Schema",
 stxname AS "Name",
 format('%s FROM %s',
   (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
FROM pg_catalog.unnest(stxkeys) s(attnum)
JOIN pg_catalog.pg_attribute a
ON (stxrelid = a.attrelid
AND a.attnum = s.attnum
AND NOT attisdropped)),
 stxrelid::regclass) AS "Definition",
 CASE WHEN esd.stxdndistinct IS NOT NULL THEN 'built'
  WHEN 'd' = any(stxkind) THEN 'not built'
 END AS "N_distinct",
 CASE WHEN esd.stxddependencies IS NOT NULL THEN 'built'
  WHEN 'f' = any(stxkind) THEN 'not built'
 END AS "Dependencies",
 CASE WHEN esd.stxdmcv IS NOT NULL THEN 'built'
  WHEN 'm' = any(stxkind) THEN 'not built'
 END AS "Mcv",
   pg_catalog.length(stxdndistinct) AS "N_size",
   pg_catalog.length(stxddependencies) AS "D_size",
   pg_catalog.length(stxdmcv) AS "M_size"
   FROM pg_catalog.pg_statistic_ext es
   INNER JOIN pg_catalog.pg_class c
   ON stxrelid = c.oid
   LEFT JOIN pg_catalog.pg_statistic_ext_data esd
   ON es.oid = esd.stxoid
   ORDER BY 1, 2;
===
 


Attached patch includes:

   - Replace "Columns" and "Table" column with "Definition"
   - Show the status (built/not built/null) of extended stats by
 using pg_statistic_ext_data
   - Add "\dX+" command to show size of extended stats

Please find the attached file! :-D


Thanks,
Tatsuro Yamada

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index fc16e6c..5664c22 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1893,6 +1893,20 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+If + is appended to the command name, each extended 
statistics
+is listed with its size.
+
+
+  
 
   
  

Re: list of extended statistics on psql

2020-08-30 Thread Tatsuro Yamada

Hi Alvaro,


IMO the per-type columns should show both the type being enabled as
well as it being built.


Hmm. I'm not sure how to get the status (enabled or disabled) of
extended stats. :(
Could you explain it more?


pg_statistic_ext_data.stxdndistinct is not null if the stats have been
built. (I'm not sure whether there's an easier way to determine this.)



Ah.. I see! Thank you.



I suggest to do this

   Name| Schema | Definition   | Ndistinct | Dependencies | MCV
---++--+---+--+-
 stts_1| public | (a, b) FROM t1   | f | t| f


I suppose that the current column order is sufficient if there is
no improvement of extended stats on PG14. Do you know any plan to
improve extended stats such as to allow it to cross multiple tables on PG14?


I suggest that changing it in the future is going to be an uphill
battle, so better get it right from the get go, without requiring a
future restructure.



I understand your suggestions. I'll replace the "Columns" and "Table" columns with 
a "Definition" column.



Currently, I use this query to get Extended stats info from pg_statistic_ext.


Maybe something like this would do

SELECT
stxnamespace::pg_catalog.regnamespace AS "Schema",
stxname AS "Name",
format('%s FROM %s',
 (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
  FROM pg_catalog.unnest(stxkeys) s(attnum)
  JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
  a.attnum = s.attnum AND NOT attisdropped)),
  stxrelid::regclass) AS "Definition",
  CASE WHEN stxdndistinct IS NOT NULL THEN 'built' WHEN 'd' = any(stxkind) THEN 'enabled, 
not built' END AS "n-distinct",
  CASE WHEN stxddependencies IS NOT NULL THEN 'built' WHEN 'f' = any(stxkind) THEN 
'enabled, not built' END AS "functional dependencies",
  CASE WHEN stxdmcv IS NOT NULL THEN 'built' WHEN 'm' = any(stxkind) THEN 
'enabled, not built' END AS mcv
 FROM pg_catalog.pg_statistic_ext es
 INNER JOIN pg_catalog.pg_class c
 ON stxrelid = c.oid
 LEFT JOIN pg_catalog.pg_statistic_ext_data esd ON es.oid = esd.stxoid
 ORDER BY 1, 2, 3;


Great! It helped me a lot to understand your suggestions correctly. Thanks. :-D
I got the below results by your query.


create table t1 (a int, b int);
create statistics stts_1 (dependencies) on a, b from t1;
create statistics stts_2 (dependencies, ndistinct) on a, b from t1;
create statistics stts_3 (dependencies, ndistinct, mcv) on a, b from t1;
create table t2 (a int, b int, c int);
create statistics stts_4 on b, c from t2;
create table hoge (col1 int, col2 int, col3 int);
create statistics stts_hoge on col1, col2, col3 from hoge;

insert into t1 select i,i from generate_series(1,100) i;
analyze t1;


Your query gave this result:

 Schema |   Name    |         Definition         |     n-distinct     | functional dependencies |        mcv
--------+-----------+----------------------------+--------------------+-------------------------+--------------------
 public | stts_1    | a, b FROM t1               |                    | built                   |
 public | stts_2    | a, b FROM t1               | built              | built                   |
 public | stts_3    | a, b FROM t1               | built              | built                   | built
 public | stts_4    | b, c FROM t2               | enabled, not built | enabled, not built      | enabled, not built
 public | stts_hoge | col1, col2, col3 FROM hoge | enabled, not built | enabled, not built      | enabled, not built
(5 rows)


I guess "enabled, not built" is a little redundant. The status would be
better with three patterns: "built", "not built" or nothing (NULL), like these:

  - "built": the extended stats are defined and built (collected by the ANALYZE command)
  - "not built": the extended stats are defined but have not been built yet
  - nothing (NULL): the extended stats are not defined

What do you think about it?


I will send a new patch including:

  - Replace "Columns" and "Table" column with "Definition"
  - Show the status (built/not built/null) of extended stats by using
pg_statistic_ext_data

Thanks,
Tatsuro Yamada







Re: list of extended statistics on psql

2020-08-27 Thread Tatsuro Yamada

Hi Alvaro!

It's been ages since we created a progress reporting feature together. :-D


+1 good idea


+1 that's a good idea.  Please add it to the next commitfest!


+1 for the general idea, and +1 for \dX being the syntax to use


Thank you for voting!



IMO the per-type columns should show both the type being enabled as
well as it being built.

Hmm. I'm not sure how to get the status (enabled or disabled) of
extended stats. :(
Could you explain it more?



Also, the stat obj name column should be first, followed by a single
column listing both table and columns that it applies to.  Keep in mind
that in the future we might want to add stats that cross multiple tables
-- that's why the CREATE syntax is the way it is.  So we should give
room for that in psql's display too.


My understanding of your suggestions is the following, right?

* The Current column order:
===
  Schema | Table |  Name  | Columns | Ndistinct | Dependencies | MCV
+---++-+---+--+-
  public | t1| stts_1 | a, b| f | t| f
  public | t1| stts_2 | a, b| t | t| f
  public | t1| stts_3 | a, b| t | t| t
  public | t2| stts_4 | b, c| t | t| t
===

* The suggested column order is like this:
===
   Name| Schema | Table | Columns  | Ndistinct | Dependencies | MCV
---++---+--+---+--+-
 stts_1| public | t1| a, b | f | t| f
 stts_2| public | t1| a, b | t | t| f
 stts_3| public | t1| a, b | t | t| t
 stts_4| public | t2| b, c | t | t| t
===

*  In the future, extended stats that cross multiple tables might be
   shown like this (t1, t2):
===
   Name| Schema | Table  | Columns  | Ndistinct | Dependencies | MCV
---+++--+---+--+-
 stts_5| public | t1, t2 | a, b | f | t| f
===

If so, I can easily revise the column order as you suggested.
However, I don't yet know how to show extended stats that cross
multiple tables, or how to show their status.

I suppose that the current column order is sufficient if extended
stats don't improve in PG14. Do you know of any plan to improve
extended stats, such as allowing them to cross multiple tables, in PG14?


In addition,
Currently, I use this query to get Extended stats info from pg_statistic_ext.

SELECT
stxnamespace::pg_catalog.regnamespace AS "Schema",
c.relname AS "Table",
stxname AS "Name",
(SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
 FROM pg_catalog.unnest(stxkeys) s(attnum)
 JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
 a.attnum = s.attnum AND NOT attisdropped)) AS "Columns",
'd' = any(stxkind) AS "Ndistinct",
'f' = any(stxkind) AS "Dependencies",
'm' = any(stxkind) AS "MCV"
FROM pg_catalog.pg_statistic_ext
INNER JOIN pg_catalog.pg_class c
    ON stxrelid = c.oid
ORDER BY 1, 2, 3;

Thanks,
Tatsuro Yamada







Re: list of extended statistics on psql

2020-08-27 Thread Tatsuro Yamada

Hi Julien!
 


Thanks also for the documentation and regression tests.  This overall looks
good, I just have two comments:



Thank you for reviewing the patch! :-D



- there's a whitespace issue in the documentation part:

add_list_extended_stats_for_psql_by_dX_command.patch:10: tab in indent.
  
warning: 1 line adds whitespace errors.



Oops, I forgot to use "git diff --check". I fixed it.

 

- You're sorting the output on schema, table, extended statistics and columns
   but I think the last one isn't required since extended statistics names are
   unique.



You are right.
The sort key "columns" was not necessary so I removed it.

Attached new patch includes the above two fixes:

  - Fix whitespace issue in the documentation part
  - Remove unnecessary sort key from the query
 (ORDER BY 1, 2, 3, 4 -> ORDER BY 1, 2, 3)


Thanks,
Tatsuro Yamada

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index fc16e6c..8b0568c 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1893,6 +1893,18 @@ testdb=
 
 
   
+  
+  
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9902a4a..dcc9fba 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -929,6 +929,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
else
success = listExtensions(pattern);
break;
+   case 'X':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern);
+   break;
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index d81f157..e43241f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4368,6 +4368,74 @@ listEventTriggers(const char *pattern, bool verbose)
 }
 
 /*
+ * \dX
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 10)
+   {
+   charsverbuf[32];
+
+   pg_log_error("The server (version %s) does not support extended 
statistics.",
+formatPGVersionNumber(pset.sversion, 
false,
+   
   sverbuf, sizeof(sverbuf)));
+   return true;
+   }
+
+   initPQExpBuffer();
+   printfPQExpBuffer(,
+ "SELECT \n"
+ 
"stxnamespace::pg_catalog.regnamespace AS \"%s\", \n"
+ "c.relname AS \"%s\", \n"
+ "stxname AS \"%s\", \n"
+ "(SELECT 
pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ') \n"
+ " FROM pg_catalog.unnest(stxkeys) 
s(attnum) \n"
+ " JOIN pg_catalog.pg_attribute a ON 
(stxrelid = a.attrelid AND \n"
+ " a.attnum = s.attnum AND NOT 
attisdropped)) AS \"%s\", \n"
+ "'d' = any(stxkind) AS \"%s\", \n"
+ "'f' = any(stxkind) AS \"%s\", \n"
+ "'m' = any(stxkind) AS \"%s\"  \n"
+ "FROM pg_catalog.pg_statistic_ext \n"
+ "INNER JOIN pg_catalog.pg_class c \n"
+ "ON stxrelid = c.oid \n",
+ gettext_noop("Schema"),
+ gettext_noop("Table"),
+ gettext_noop("Name"),
+ gettext_noop("Columns"),
+ gettext_noop("Ndistinct"),
+  

Re: list of extended statistics on psql

2020-08-27 Thread Tatsuro Yamada

Hi Julien and Pavel!


How about using \dX rather than \dz?


Thanks for your suggestion!
I'll replace it if I get consensus. :-D






I re-read the help message for the \d* commands and realized it's better to
use "\dX".
There are already cases where commands differ only in letter case,
so I did the same here. Please find the attached patch. :-D
 
For example:

==
  \da[S]  [PATTERN]  list aggregates
  \dA[+]  [PATTERN]  list access methods
==

Attached patch uses "\dX" instead of "\dz":
==
  \dx[+]  [PATTERN]  list extensions
  \dX [PATTERN]  list extended statistics
==

The results of the regression test for this feature are as follows:
==
-- check printing info about extended statistics
create table t1 (a int, b int);
create statistics stts_1 (dependencies) on a, b from t1;
create statistics stts_2 (dependencies, ndistinct) on a, b from t1;
create statistics stts_3 (dependencies, ndistinct, mcv) on a, b from t1;
create table t2 (a int, b int, c int);
create statistics stts_4 on b, c from t2;
create table hoge (col1 int, col2 int, col3 int);
create statistics stts_hoge on col1, col2, col3 from hoge;

\dX
  List of extended statistics
 Schema | Table |   Name| Columns  | Ndistinct | Dependencies | MCV
+---+---+--+---+--+-
 public | hoge  | stts_hoge | col1, col2, col3 | t | t| t
 public | t1| stts_1| a, b | f | t| f
 public | t1| stts_2| a, b | t | t| f
 public | t1| stts_3| a, b | t | t| t
 public | t2| stts_4| b, c | t | t| t
(5 rows)

\dX stts_?
List of extended statistics
 Schema | Table |  Name  | Columns | Ndistinct | Dependencies | MCV
+---++-+---+--+-
 public | t1| stts_1 | a, b| f | t| f
 public | t1| stts_2 | a, b| t | t| f
 public | t1| stts_3 | a, b| t | t| t
 public | t2| stts_4 | b, c| t | t| t
(4 rows)

\dX *hoge
  List of extended statistics
 Schema | Table |   Name| Columns  | Ndistinct | Dependencies | MCV
+---+---+--+---+--+-
 public | hoge  | stts_hoge | col1, col2, col3 | t | t| t
(1 row)
==


Thanks,
Tatsuro Yamada


diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index fc16e6c..ace8e5f 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1893,6 +1893,18 @@ testdb=
 
 
   
+  
+ 
+\dX [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+
+
+  
 
   
 \dy[+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9902a4a..dcc9fba 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -929,6 +929,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
else
success = listExtensions(pattern);
break;
+   case 'X':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern);
+   break;
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index d81f157..77b3074 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4368,6 +4368,74 @@ listEventTriggers(const char *pattern, bool verbose)
 }
 
 /*
+ * \dX
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 10)
+   {
+   charsverbuf[32];
+
+   pg_log_error("The server (version %s) does not support extended 
statistics.",
+formatPGVersionNumber(pset.sversion, 
false,
+   
   sverbuf, sizeof(sverbuf)));
+   return true;
+   }
+
+   i

Re: list of extended statistics on psql

2020-08-24 Thread Tatsuro Yamada

Hi!


+1 good idea


+1 that's a good idea.  Please add it to the next commitfest!


Thanks!



You have a typo:

+if (pset.sversion < 1)
+{
+charsverbuf[32];
+
+pg_log_error("The server (version %s) does not support
extended statistics.",
+ formatPGVersionNumber(pset.sversion, false,
+   sverbuf, sizeof(sverbuf)));
+return true;
+}

the version test is missing a 0, the feature looks otherwise ok.


Ouch, I fixed it in the attached patch.

The new patch includes:

 - Fix the version number check (1 -> 10)
 - Fix the query that gets extended stats info so the sort order is stable
 - Add [Pattern] handling, e.g., \dz stts*
 - Add documentation and a regression test for \dz
 

How about using \dX rather than \dz?


Thanks for your suggestion!
I'll replace it if I get consensus. :-D

Thanks,
Tatsuro Yamada
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index fc16e6c..d4c9de9 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1909,6 +1909,18 @@ testdb=
   
 
   
+\dz [ pattern ]
+
+
+Lists extended statistics.
+If pattern
+is specified, only those extended statistics whose names match the 
pattern
+are listed.
+
+
+  
+
+  
 \e or \edit  
 filename  
 line_number  

 
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9902a4a..dc36c98 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -932,6 +932,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
+   case 'z':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern);
+   break;
default:
status = PSQL_CMD_UNKNOWN;
}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index d81f157..c9d9e08 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4368,6 +4368,74 @@ listEventTriggers(const char *pattern, bool verbose)
 }
 
 /*
+ * \dz
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 10)
+   {
+   charsverbuf[32];
+
+   pg_log_error("The server (version %s) does not support extended 
statistics.",
+formatPGVersionNumber(pset.sversion, 
false,
+   
   sverbuf, sizeof(sverbuf)));
+   return true;
+   }
+
+   initPQExpBuffer();
+   printfPQExpBuffer(,
+ "SELECT \n"
+ 
"stxnamespace::pg_catalog.regnamespace AS \"%s\", \n"
+ "c.relname AS \"%s\", \n"
+ "stxname AS \"%s\", \n"
+ "(SELECT 
pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ') \n"
+ " FROM pg_catalog.unnest(stxkeys) 
s(attnum) \n"
+ " JOIN pg_catalog.pg_attribute a ON 
(stxrelid = a.attrelid AND \n"
+ " a.attnum = s.attnum AND NOT 
attisdropped)) AS \"%s\", \n"
+ "'d' = any(stxkind) AS \"%s\", \n"
+ "'f' = any(stxkind) AS \"%s\", \n"
+ "'m' = any(stxkind) AS \"%s\"  \n"
+ "FROM pg_catalog.pg_statistic_ext \n"
+ "INNER JOIN pg_catalog.pg_class c \n"
+ "ON stxrelid = c.oid \n",
+ gettext_noop("Schema"),
+ gettext_noop("Table"),
+ gettext_noop("Name"),
+ gettext_noop("Columns"),
+ gettext_noop("Ndistinct"),
+ gettext_noop("Dependencies"),
+   

list of extended statistics on psql

2020-08-23 Thread Tatsuro Yamada

Hi!

I created a POC patch that adds a "\dz" command to psql for showing a list of
extended statistics. I believe this feature helps DBAs and users who
would like to see all extended statistics easily. :-D

I don't have a strong opinion on assigning "\dz". I would prefer "\dx" or
"\de*" over "\dz", but those are already assigned, so I used "\dz"
instead.

Please find the attached patch.
Any comments are welcome!

For Example:
===
CREATE TABLE t1 (a INT, b INT);
CREATE STATISTICS stts1 (dependencies) ON a, b FROM t1;
CREATE STATISTICS stts2 (dependencies, ndistinct) ON a, b FROM t1;
CREATE STATISTICS stts3 (dependencies, ndistinct, mcv) ON a, b FROM t1;
ANALYZE t1;

CREATE TABLE t2 (a INT, b INT, c INT);
CREATE STATISTICS stts4 ON b, c FROM t2;
ANALYZE t2;

postgres=# \dz
List of extended statistics
 Schema | Table | Name  | Columns | Ndistinct | Dependencies | MCV
+---+---+-+---+--+-
 public | t1| stts1 | a, b| f | t| f
 public | t1| stts2 | a, b| t | t| f
 public | t1| stts3 | a, b| t | t| t
 public | t2| stts4 | b, c| t | t| t
(4 rows)

postgres=# \?
...
  \dy [PATTERN]  list event triggers
  \dz [PATTERN]  list extended statistics
  \l[+]   [PATTERN]  list databases
...
===

For now, I haven't written documentation or a regression test for this.
I'll create them later.

Thanks,
Tatsuro Yamada


diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9902a4a..dc36c98 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -932,6 +932,9 @@ exec_command_d(PsqlScanState scan_state, bool 
active_branch, const char *cmd)
case 'y':   /* Event Triggers */
success = listEventTriggers(pattern, 
show_verbose);
break;
+   case 'z':   /* Extended Statistics 
*/
+   success = listExtendedStats(pattern);
+   break;
default:
status = PSQL_CMD_UNKNOWN;
}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index d81f157..8128b1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4368,6 +4368,67 @@ listEventTriggers(const char *pattern, bool verbose)
 }
 
 /*
+ * \dz
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern)
+{
+   PQExpBufferData buf;
+   PGresult   *res;
+   printQueryOpt myopt = pset.popt;
+
+   if (pset.sversion < 1)
+   {
+   charsverbuf[32];
+
+   pg_log_error("The server (version %s) does not support extended 
statistics.",
+formatPGVersionNumber(pset.sversion, 
false,
+   
   sverbuf, sizeof(sverbuf)));
+   return true;
+   }
+
+   initPQExpBuffer();
+   printfPQExpBuffer(,
+ "SELECT "
+ 
"stxnamespace::pg_catalog.regnamespace AS \"%s\", "
+ "stxrelid::pg_catalog.regclass AS 
\"%s\", "
+ "stxname AS \"%s\", "
+ "(SELECT 
pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ') "
+ "FROM pg_catalog.unnest(stxkeys) 
s(attnum) "
+ "JOIN pg_catalog.pg_attribute a ON 
(stxrelid = a.attrelid AND "
+ "a.attnum = s.attnum AND NOT 
attisdropped)) AS \"%s\", "
+ "'d' = any(stxkind) AS \"%s\", "
+ "'f' = any(stxkind) AS \"%s\", "
+ "'m' = any(stxkind) AS \"%s\"  "
+ "FROM pg_catalog.pg_statistic_ext 
stat ",
+ gettext_noop("Schema"),
+ gettext_noop("Table"),
+ gettext_noop("Name"),
+ gettext_noop("Columns"),
+ gettext_noop("Ndistinct"),
+ gettext_noop("Dependencies"

Re: Is it useful to record whether plans are generic or custom?

2020-05-21 Thread Tatsuro Yamada

Hi Torikoshi-san!

On 2020/05/21 17:10, Kyotaro Horiguchi wrote:

At Thu, 21 May 2020 12:18:16 +0900, Fujii Masao wrote in



On 2020/05/20 21:56, Atsushi Torikoshi wrote:

On Wed, May 20, 2020 at 1:32 PM Kyotaro Horiguchi wrote:
 At Tue, 19 May 2020 22:56:17 +0900, Atsushi Torikoshi wrote in
  > On Sat, May 16, 2020 at 6:01 PM legrand legrand wrote:
  >
  > BTW, I'd also appreciate other opinions about recording the number
  > of generic and custom plans on pg_stat_statements.
 If you/we just want to know how a prepared statement is executed,
 couldn't we show that information in pg_prepared_statements view?
 =# select * from pg_prepared_statements;
 -[ RECORD 1 ]---+
 name| stmt1
 statement   | prepare stmt1 as select * from t where b = $1;
 prepare_time| 2020-05-20 12:01:55.733469+09
 parameter_types | {text}
 from_sql| t
 exec_custom | 5    <- existing num_custom_plans
 exec_total  | 40   <- new member of CachedPlanSource
Thanks, Horiguchi-san!
Adding counters to pg_prepared_statements seems useful when we want
to know how prepared statements were executed in the current session.


I like the idea of exposing more CachedPlanSource fields in
pg_prepared_statements. I agree it's useful, e.g., for debugging
purposes.
This is why I implemented a similar feature in my extension.
Please see [1] for details.


Thanks. I'm not sure plan_cache_mode should be a part of the view.
Cost numbers would look better if it is cooked a bit.  Is it worth
being in core?

=# select * from pg_prepared_statements;
-[ RECORD 1 ]---+
name| p1
statement   | prepare p1 as select a from t where a = $1;
prepare_time| 2020-05-21 15:41:50.419578+09
parameter_types | {integer}
from_sql| t
calls   | 7
custom_calls| 5
plan_generation | 6
generic_cost| 4.3105
custom_cost | 9.31

Perhaps plan_generation is not needed there.


I tried creating a PoC patch too, so I'm sharing it.
Please find the attached file.

# Test case
prepare count as select count(*) from pg_class where oid >$1;
execute count(1); select * from pg_prepared_statements;

-[ RECORD 1 ]---+--
name| count
statement   | prepare count as select count(*) from pg_class where oid >$1;
prepare_time| 2020-05-21 17:41:16.134362+09
parameter_types | {oid}
from_sql| t
is_generic_plan | f <= False

You can see the following result when you execute it six times.

-[ RECORD 1 ]---+--
name| count
statement   | prepare count as select count(*) from pg_class where oid >$1;
prepare_time| 2020-05-21 17:41:16.134362+09
parameter_types | {oid}
from_sql    | t
is_generic_plan | t <= True


Thanks,
Tatsuro Yamada
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80d6df8ac1..63de4fdb78 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -723,7 +723,7 @@ pg_prepared_statement(PG_FUNCTION_ARGS)
 * build tupdesc for result tuples. This must match the definition of 
the
 * pg_prepared_statements view in system_views.sql
 */
-   tupdesc = CreateTemplateTupleDesc(5);
+   tupdesc = CreateTemplateTupleDesc(6);
TupleDescInitEntry(tupdesc, (AttrNumber) 1, "name",
   TEXTOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 2, "statement",
@@ -734,6 +734,8 @@ pg_prepared_statement(PG_FUNCTION_ARGS)
   REGTYPEARRAYOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 5, "from_sql",
   BOOLOID, -1, 0);
+   TupleDescInitEntry(tupdesc, (AttrNumber) 6, "is_generic_plan",
+  BOOLOID, -1, 0);
 
/*
 * We put all the tuples into a tuplestore in one scan of the hashtable.
@@ -755,8 +757,8 @@ pg_prepared_statement(PG_FUNCTION_ARGS)
hash_seq_init(_seq, prepared_queries);
while ((prep_stmt = hash_seq_search(_seq)) != NULL)
{
-   Datum   values[5];
-   boolnulls[5];
+   Datum   values[6];
+   boolnulls[6];
 
MemSet(nulls, 0, sizeof(nulls));
 
@@ -766,6 +768,7 @@ pg_prepared_statement(PG_FUNCTION_ARGS)
values[3] = 
build_regtype_array(prep_stmt->plansource->param_ty

Re: PG 13 release notes, first draft

2020-05-11 Thread Tatsuro Yamada

On 2020/05/12 9:34, Bruce Momjian wrote:

Could you add "Vinayak Pokale" as a co-author of the following feature, since
I sometimes consulted his old patch while creating mine [1]?

===
E.1.3.1.6. System Views

-  Add system view pg_stat_progress_analyze to report analyze progress (Álvaro 
Herrera, Tatsuro Yamada)

+  Add system view pg_stat_progress_analyze to report analyze progress (Álvaro 
Herrera, Tatsuro Yamada, Vinayak Pokale)
===


Done.



Hi Bruce,

Thank you!


Regards,
Tatsuro Yamada








Re: PG 13 release notes, first draft

2020-05-11 Thread Tatsuro Yamada

Hi Bruce,

On 2020/05/05 12:16, Bruce Momjian wrote:

I have committed the first draft of the PG 13 release notes.  You can
see them here:

https://momjian.us/pgsql_docs/release-13.html

It still needs markup, word wrap, and indenting.  The community doc
build should happen in a few hours.


Thanks for working on this! :-D

Could you add "Vinayak Pokale" as a co-author of the following feature, since
I sometimes consulted his old patch while creating mine [1]?

===
E.1.3.1.6. System Views

-  Add system view pg_stat_progress_analyze to report analyze progress (Álvaro 
Herrera, Tatsuro Yamada)

+  Add system view pg_stat_progress_analyze to report analyze progress (Álvaro 
Herrera, Tatsuro Yamada, Vinayak Pokale)
===

[1]
https://www.postgresql.org/message-id/20190813140127.GA4933%40alvherre.pgsql

Thanks,
Tatsuro Yamada





Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2020-04-07 Thread Tatsuro Yamada

Hi Julien,

On 2020/04/02 22:25, Julien Rouhaud wrote:

New conflict, rebased v9 attached.


I tested the patch on the head (c7654f6a3) and
the result was fine. See below:

$ make installcheck-world
=
 All 1 tests passed.
=


Regards,
Tatsuro Yamada






Re: progress report for ANALYZE

2020-01-27 Thread Tatsuro Yamada

Hi,

On 2020/01/24 23:44, Amit Langote wrote:

On Fri, Jan 24, 2020 at 6:47 AM Alvaro Herrera  wrote:

On 2020-Jan-22, Tatsuro Yamada wrote:

P.S.
Next up is progress reporting for query execution?!


Actually, I think it's ALTER TABLE.


+1.  Existing infrastructure might be enough to cover ALTER TABLE's
needs, whereas we will very likely need to build entirely different
infrastructure for tracking the progress for SQL query execution.


Yeah, I agree.
I will create a little POC patch after reading tablecmds.c and alter.c to
investigate how to report progress. :)

Regards,
Tatsuro Yamada







Re: progress report for ANALYZE

2020-01-21 Thread Tatsuro Yamada

Hi Alvaro and All reviewers,

On 2020/01/16 2:11, Alvaro Herrera wrote:

I just pushed this after some small extra tweaks.

Thanks, Yamada-san, for seeing this to completion!


Thanks for reviewing and committing the patch!
Hope this helps DBAs. :-D

P.S.
Next up is progress reporting for query execution?!
To create it, I guess we need to improve the
progress reporting infrastructure.

Thanks,
Tatsuro Yamada






Re: progress report for ANALYZE

2019-12-19 Thread Tatsuro Yamada

Hi All,


*All* phases are repeated in this case, not just "finalizing
analyze", because ANALYZE repeatedly runs for each partition after the
parent partitioned table's ANALYZE finishes.  ANALYZE's documentation
mentions that analyzing a partitioned table also analyzes all of its
partitions, so users should expect to see the progress information for
each partition.  So, I don't think we need to clarify that if only in
one phase's description.  Maybe we can add a note after the phase
description table which mentions this implementation detail about
partitioned tables.  Like this:

   
    
 Note that when ANALYZE is run on a partitioned table,
 all of its partitions are also recursively analyzed as also mentioned on
 .  In that case, ANALYZE
 progress is reported first for the parent table, whereby its inheritance
 statistics are collected, followed by that for each partition.
    
   



Ah, you are right: all phases are repeated, so the fix shouldn't be limited
to a single phase's description.



Some more comments on the documentation:

+   Number of computed extended stats.  This counter only advances
when the phase
+   is computing extended stats.

Number of computed extended stats -> Number of extended stats computed



Will fix.



+   Number of analyzed child tables.  This counter only advances
when the phase
+   is computing extended stats.

Regarding, "Number of analyzed child table", note that we don't
"analyze" child tables in this phase, only scan its blocks to collect
samples for parent's ANALYZE.  Also, the 2nd sentence is wrong -- you
meant "when the phase is acquiring inherited sample
rows.  I suggest to write this as follows:

Number of child tables scanned.  This counter only advances when the phase
is acquiring inherited sample rows.



Oops, I will fix it. :)




+ OID of the child table currently being scanned.
+   It might be different from relid when analyzing tables that
have child tables.

I suggest:

OID of the child table currently being scanned.  This field is only valid when
the phase is computing extended stats.



Will fix.



+   The command is currently scanning the
current_relid
+   to obtain samples.

I suggest:

The command is currently scanning the table given by
current_relid to obtain sample rows.



Will fix.



+   The command is currently scanning the
current_child_table_relid
+   to obtain samples.

I suggest (based on phase description pg_stat_progress_create_index
phase descriptions):

The command is currently scanning child tables to obtain sample rows.  Columns
child_tables_total,
child_tables_done, and
current_child_table_relid contain the progress
information for this phase.



Will fix.



+    
+ computing stats

I think the phase name should really be "computing statistics", that
is, use the full word.



Will fix.



+ 
+   The command is computing stats from the samples obtained
during the table scan.
+ 
+    

So I suggest:

The command is computing statistics from the sample rows obtained during
the table scan



Will fix.



+ computing extended stats
+ 
+   The command is computing extended stats from the samples
obtained in the previous phase.
+ 

I suggest:

The command is computing extended statistics from the sample rows obtained
during the table scan.



Will fix.



I fixed the document based on Amit's comments. :)
Please find the attached file.


Thanks,
Tatsuro Yamada







diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a3c5f86b7e..a5da126b8f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -360,6 +360,14 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
   
  
 
+ 
+  
pg_stat_progress_analyzepg_stat_progress_analyze
+  One row for each backend (including autovacuum worker processes) 
running
+   ANALYZE, showing current progress.
+   See .
+  
+ 
+
  
   
pg_stat_progress_clusterpg_stat_progress_cluster
   One row for each backend running
@@ -3502,7 +3510,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
PostgreSQL has the ability to report the 
progress of
certain commands during command execution.  Currently, the only commands
which support progress reporting are CREATE INDEX,
-   VACUUM and
+   VACUUM, ANALYZE and
CLUSTER. This may be expanded in the future.
   
 
@@ -3948,6 +3956,179 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
  
 
+ 
+  ANALYZE Progress Reporting
+
+  
+   Whenever ANALYZE is running, the
+   pg_stat_progress_analyze view will contain a
+   row for each backend that is currently running that command.  The tables
+   below describe the information that will be reported and provide
+   information about how to interpret it.
+  
+
+  
+   pg_stat_progress_analyze View
+   
+
+
+  Column
+  Type
+  Description
+ 
+
+
+   
+   

Re: progress report for ANALYZE

2019-12-17 Thread Tatsuro Yamada

Hi Amit-san,



  >>> 1)

For now, I'm not sure whether current_child_table_relid should be set to zero
when the current phase changes from "acquiring inherited sample rows" to
"computing stats". See below.


In the upthread discussion [1], Robert asked to *not* do such things,
that is, resetting some values due to phase change.  I'm not sure his
point applies to this case too though.


//Below, "computing stats" is for the partitioned table hoge,
//therefore it would be easier to understand if the second
//column from the left were set to zero.


On second thought, maybe we should not reset, because that might be
considered loss of information.  To avoid confusion, we should simply
document that current_child_table_relid is only valid during the
"acquiring inherited sample rows" phase.



Okay, agreed. :)

 

2)
There are many "finalizing analyze" phases, one per relid, in the case
of partitioned tables. Would it be better to fix the document, or
to reduce them to one?


It would be better to modify the documentation of the "finalizing analyze" phase.

# Before modify
 The command is updating pg_class. When this phase is completed,
 ANALYZE will end.

# Modified
 The command is updating pg_class. When this phase is completed,
 ANALYZE will end. In the case of a partitioned table,
 it might be shown for each partition.

What do you think about that? I'm going to fix it if you agree. :)


*All* phases are repeated in this case, not just "finalizing
analyze", because ANALYZE repeatedly runs for each partition after the
parent partitioned table's ANALYZE finishes.  ANALYZE's documentation
mentions that analyzing a partitioned table also analyzes all of its
partitions, so users should expect to see the progress information for
each partition.  So, I don't think we need to clarify that if only in
one phase's description.  Maybe we can add a note after the phase
description table which mentions this implementation detail about
partitioned tables.  Like this:

   

 Note that when ANALYZE is run on a partitioned table,
 all of its partitions are also recursively analyzed as also mentioned on
 .  In that case, ANALYZE
 progress is reported first for the parent table, whereby its inheritance
 statistics are collected, followed by that for each partition.

   



Ah, you are right: all phases are repeated, so the fix shouldn't be limited
to a single phase's description.



Some more comments on the documentation:

+   Number of computed extended stats.  This counter only advances
when the phase
+   is computing extended stats.

Number of computed extended stats -> Number of extended stats computed



Will fix.

 

+   Number of analyzed child tables.  This counter only advances
when the phase
+   is computing extended stats.

Regarding, "Number of analyzed child table", note that we don't
"analyze" child tables in this phase, only scan its blocks to collect
samples for parent's ANALYZE.  Also, the 2nd sentence is wrong -- you
meant "when the phase is acquiring inherited sample
rows.  I suggest to write this as follows:

Number of child tables scanned.  This counter only advances when the phase
is acquiring inherited sample rows.



Oops, I will fix it. :)


 

+ OID of the child table currently being scanned.
+   It might be different from relid when analyzing tables that
have child tables.

I suggest:

OID of the child table currently being scanned.  This field is only valid when
the phase is computing extended stats.



Will fix.



+   The command is currently scanning the
current_relid
+   to obtain samples.

I suggest:

The command is currently scanning the table given by
current_relid to obtain sample rows.



Will fix.

 

+   The command is currently scanning the
current_child_table_relid
+   to obtain samples.

I suggest (based on phase description pg_stat_progress_create_index
phase descriptions):

The command is currently scanning child tables to obtain sample rows.  Columns
child_tables_total,
child_tables_done, and
current_child_table_relid contain the progress
information for this phase.



Will fix.



+
+ computing stats

I think the phase name should really be "computing statistics", that
is, use the full word.



Will fix.

 

+ 
+   The command is computing stats from the samples obtained
during the table scan.
+ 
+

So I suggest:

The command is computing statistics from the sample rows obtained during
the table scan



Will fix.

 

+ computing extended stats
+ 
+   The command is computing extended stats from the samples
obtained in the previous phase.
+ 

I suggest:

The command is computing extended statistics from the sample rows obtained
during the table scan.



Will fix.


Thanks for all your useful suggestions above. They help me a lot. :)


Cheers!
Tatsuro Yamada





Re: progress report for ANALYZE

2019-12-05 Thread Tatsuro Yamada
ample_rows(), then each partition to collect its
own statistics.  Note that this recursive application to ANALYZE to
partitions (child tables) only occurs for partitioned tables, not for
legacy inheritance.


Thanks for your explanation.
I now understand analyzing partitioned tables a little better.



It would be better to modify the documentation of the "finalizing analyze" phase.

  # Before modify
   The command is updating pg_class. When this phase is completed,
   ANALYZE will end.

  # Modified
   The command is updating pg_class. When this phase is completed,
   ANALYZE will end. In the case of a partitioned table,
   it might be shown for each partition.

What do you think about that? I'm going to fix it if you agree. :)

Thanks,
Tatsuro Yamada







Re: progress report for ANALYZE

2019-12-04 Thread Tatsuro Yamada

Hi Amit-san,

Thanks for your comments!


Attached is the revised patch. :)

I wonder about two things below. What do you think?

1)
For now, I'm not sure whether current_child_table_relid should be set to zero
when the current phase changes from "acquiring inherited sample rows" to
"computing stats". See below.


In the upthread discussion [1], Robert asked to *not* do such things,
that is, resetting some values due to phase change.  I'm not sure his
point applies to this case too though.


Yeah, understood.
I'll check the target relid of "computing stats" by re-reading the
ANALYZE code later. :)

 

2)
There are many "finalizing analyze" phases, one per relid, in the case
of partitioned tables. Would it be better to fix the document, or
to reduce them to one?


-
   finalizing analyze
   
 The command is updating pg_class. When this phase is completed,
 ANALYZE will end.
-


When a partitioned table is analyzed, its partitions are analyzed too.
So, the ANALYZE command effectively runs N + 1 times if there are N
partitions -- first analyze partitioned table to collect "inherited"
statistics by collecting row samples using
acquire_inherited_sample_rows(), then each partition to collect its
own statistics.  Note that this recursive application to ANALYZE to
partitions (child tables) only occurs for partitioned tables, not for
legacy inheritance.


Thanks for your explanation.
Now I understand analyzing partitioned tables a little better.
Below is my understanding. Is it right?

==
In the case of partitioned table (N = 3)

 - Partitioned table name: p (relid is 100)
 - Partitioning table names: p1, p2, p3 (relids are 201, 202 and 203)

For now, we can get the following results by executing "analyze p;".

Num Phase relid current_child_table_relid
 1  acquire inherited sample rows  100   201
 2  acquire inherited sample rows  100   202
 3  acquire inherited sample rows  100   203
 4  computing stats100   0
 5  finalizing analyze 100   0

 6  acquiring sample rows  201   0
 7  computing stats201   0
 8  finalizing analyze 201   0

 9  acquiring sample rows  202   0
10  computing stats202   0
11  finalizing analyze 202   0

12  acquiring sample rows  203   0
13  computing stats203   0
14  finalizing analyze 203   0
==========
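For what it's worth, the 14 steps in the table above can be modeled with a short sketch. This is my own illustration, not PostgreSQL source code; the relids 100 and 201, 202, 203 are the made-up OIDs from the example, and the phase strings follow the table.

```python
# Hypothetical sketch (not PostgreSQL source): models the order in which
# pg_stat_progress_analyze reports phases when ANALYZE runs on a
# partitioned table, i.e. N + 1 passes for N partitions as described above.

def analyze_phases(parent_relid, child_relids):
    """Return (phase, relid, current_child_table_relid) tuples in order."""
    steps = []
    # Pass 1: inherited statistics on the parent, sampling each child in turn.
    for child in child_relids:
        steps.append(("acquire inherited sample rows", parent_relid, child))
    steps.append(("computing stats", parent_relid, 0))
    steps.append(("finalizing analyze", parent_relid, 0))
    # Passes 2 .. N + 1: each partition is then analyzed on its own.
    for child in child_relids:
        steps.append(("acquiring sample rows", child, 0))
        steps.append(("computing stats", child, 0))
        steps.append(("finalizing analyze", child, 0))
    return steps

print(len(analyze_phases(100, [201, 202, 203])))  # 14, matching the table
```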


Thanks,
Tatsuro Yamada





Re: progress report for ANALYZE

2019-11-29 Thread Tatsuro Yamada

Hi Alvaro and Amit!

On 2019/11/29 9:54, Tatsuro Yamada wrote:

Hi Alvaro!


Hmmm... I understand your opinion but I'd like to get more opinions too. :)
Do you prefer these column names? See below:


Here's my take on it:

  
   pid
   datid
   datname
   relid
   phase
   sample_blks_total
   sample_blks_scanned
   ext_stats_total
   ext_stats_computed
   child_tables_total
   child_tables_done
   current_child_table_relid

It seems to make sense to keep using the "child table" terminology in
that last column; but since the column carries an OID then as Robert
said it should have "relid" in the name.  For the other two "child
tables" columns, not using "relid" is appropriate because what they have
is not relids.

I think there should be an obvious correspondence in columns that are
closely related, which there isn't if you use "sample" in one and "heap"
in the other, so I'd go for "sample" in both.



Thanks for the comment.
Oops, you are right; I overlooked that they are not relids.
I agree with you and Amit, so I'll send a revised patch in the next
mail. :-)

The next patch will include:
  - New column definitions for the view (as above)
  - Renamed phase name: s/acquiring inh sample rows/acquiring inherited 
sample rows/
  - Updated document


Attached patch is the revised patch. :)

I wonder about two things below. What do you think?

1)
For now, I'm not sure whether current_child_table_relid should be set to zero
when the current phase changes from "acquiring inherited sample rows" to
"computing stats". See below.

2)
In the case of partitioned tables, there are many "finalizing analyze" phases,
one per relid. Would it be better to fix the document, or would it be
better to reduce it to one?


-
 finalizing analyze
 
   The command is updating pg_class. When this phase is completed,
   ANALYZE will end.
-



-
# \d pg_stat_progress_analyze
  View "pg_catalog.pg_stat_progress_analyze"
  Column   |  Type   | Collation | Nullable | Default
---+-+---+--+-
 pid   | integer |   |  |
 datid | oid |   |  |
 datname   | name|   |  |
 relid | oid |   |  |
 phase | text|   |  |
 sample_blks_total | bigint  |   |  |
 sample_blks_scanned   | bigint  |   |  |
 ext_stats_total   | bigint  |   |  |
 ext_stats_computed| bigint  |   |  |
 child_tables_total| bigint  |   |  |
 child_tables_done | bigint  |   |  |
 current_child_table_relid | oid |   |  |
-




-
# select * from pg_stat_progress_analyze ; \watch 0.0001

19309|13583|postgres|36081|acquiring inherited sample rows|0|0|0|0|0|0|0
19309|13583|postgres|36081|acquiring inherited sample rows|45|17|0|0|4|0|36084
19309|13583|postgres|36081|acquiring inherited sample rows|45|35|0|0|4|0|36084
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|0|36084
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|0|36084
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|0|36084
19309|13583|postgres|36081|acquiring inherited sample rows|45|3|0|0|4|1|36087
19309|13583|postgres|36081|acquiring inherited sample rows|45|22|0|0|4|1|36087
19309|13583|postgres|36081|acquiring inherited sample rows|45|38|0|0|4|1|36087
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|1|36087
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|1|36087
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|1|36087
19309|13583|postgres|36081|acquiring inherited sample rows|45|16|0|0|4|2|36090
19309|13583|postgres|36081|acquiring inherited sample rows|45|34|0|0|4|2|36090
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|2|36090
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|2|36090
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|2|36090
19309|13583|postgres|36081|acquiring inherited sample rows|45|10|0|0|4|3|36093
19309|13583|postgres|36081|acquiring inherited sample rows|45|29|0|0|4|3|36093
19309|13583|postgres|36081|acquiring inherited sample rows|45|43|0|0|4|3|36093
19309|13583|postgres|36081|acquiring inherited sample rows|45|45|0|0|4|3|36093
19309|13583|postgres|36081|acquiring inherited 

Re: progress report for ANALYZE

2019-11-28 Thread Tatsuro Yamada

Hi Michael,

On 2019/11/27 13:25, Michael Paquier wrote:

On Wed, Nov 27, 2019 at 12:45:41PM +0900, Tatsuro Yamada wrote:

Fixed.


Patch was waiting on input from author, so I have switched it back to
"Needs review", and moved it to next CF while on it as you are working
on it.


Thanks for your CF manager work.
I will do my best to get this committed in the next CF, because the
progress reporting feature is useful for DBAs, as you know. :)


Thanks,
Tatsuro Yamada





Re: progress report for ANALYZE

2019-11-28 Thread Tatsuro Yamada

Hi Alvaro!


Hmmm... I understand your opinion but I'd like to get more opinions too. :)
Do you prefer these column names? See below:


Here's my take on it:

  
   pid
   datid
   datname
   relid
   phase
   sample_blks_total
   sample_blks_scanned
   ext_stats_total
   ext_stats_computed
   child_tables_total
   child_tables_done
   current_child_table_relid

It seems to make sense to keep using the "child table" terminology in
that last column; but since the column carries an OID then as Robert
said it should have "relid" in the name.  For the other two "child
tables" columns, not using "relid" is appropriate because what they have
is not relids.

I think there should be an obvious correspondence in columns that are
closely related, which there isn't if you use "sample" in one and "heap"
in the other, so I'd go for "sample" in both.



Thanks for the comment.
Oops, you are right; I overlooked that they are not relids.
I agree with you and Amit, so I'll send a revised patch in the next
mail. :-)

The next patch will include:
 - New column definitions for the view (as above)
 - Renamed phase name: s/acquiring inh sample rows/acquiring inherited 
sample rows/
 - Updated document


Thanks,
Tatsuro Yamada









Re: progress report for ANALYZE

2019-11-28 Thread Tatsuro Yamada

Hi Amit-san,

On 2019/11/28 10:59, Amit Langote wrote:

Yamada-san,

Thank you for updating the patch.

On Wed, Nov 27, 2019 at 12:46 PM Tatsuro Yamada
 wrote:

But I just remembered I replaced column name "*_table" with "*_relid"
based on Robert's comment three months ago, see below:


/me reviews.

+  scanning_table

I think this should be retitled to something that ends in 'relid',
like all of the corresponding cases in existing progress views.
Perhaps 'active_relid' or 'current_relid'.


So, it would be better to use the same rule for child_tables_total and
child_tables_done. Thus I changed these column names and added them
to the view.


Robert's comment seems OK for a column that actually reports an OID,
but for columns that report counts, names containing "relids" sound a
bit strange to me.  So, I prefer child_tables_total /
child_tables_done over child_relids_total / child_relids_done.  Would
like to hear more opinions.


Hmmm... I understand your opinion but I'd like to get more opinions too. :)
Do you prefer these column names? See below:


 pid
 datid
 datname
 relid
 phase
 sample_blks_total
 heap_blks_scanned
 ext_stats_total
 ext_stats_computed
 child_tables_total  <= Renamed: s/relid/table/
 child_tables_done   <= Renamed: s/relid/table/
 current_child_table <= Renamed: s/relid/table/




Also, inheritance tree stats are created *after* creating single table
stats, so I think that it would be better to have a distinct phase
name for that, say "acquiring inherited sample rows".  In
do_analyze_rel(), you can select which of two phases to set based on
whether inh is true or not.


Okay! I'll also add the new phase "acquiring inherited sample rows" on
the next patch.



Fixed.

I tried to abbreviate it to "acquiring inh sample rows" because I thought
"acquiring inherited sample rows" is a little long for the phase name.


I think phase names should contain full English words, because they
are supposed to be descriptive.  Users are very likely to not
understand what "inh" means without looking up the docs.



Okay, I will fix it.
   s/acquiring inh sample rows/acquiring inherited sample rows/


Thanks,
Tatsuro Yamada






Re: progress report for ANALYZE

2019-11-26 Thread Tatsuro Yamada

Hi Amit-san!


I think include_children and current_relid are not enough to
understand the progress of analyzing inheritance trees, because even
with current_relid being updated, I can't tell how many more there
will be.  I think it'd be better to show the total number of children
and the number of children processed, just like
pg_stat_progress_create_index does for partitions.  So, instead of
include_children and current_relid, I think it's better to have
child_tables_total, child_tables_done, and current_child_relid, placed
last in the set of columns.


Ah, I understood.
I'll check what pg_stat_progress_create_index does for partitions,
and will create a new patch. 


Fixed.

But I just remembered I replaced column name "*_table" with "*_relid"
based on Robert's comment three months ago, see below:


/me reviews.

+  scanning_table

I think this should be retitled to something that ends in 'relid',
like all of the corresponding cases in existing progress views.
Perhaps 'active_relid' or 'current_relid'.


So, it would be better to use the same rule for child_tables_total and
child_tables_done. Thus I changed these column names and added them
to the view. I also removed include_children and current_relid.
The following columns are new version.


 pid
 datid
 datname
 relid
 phase
 sample_blks_total
 heap_blks_scanned
 ext_stats_total <= Added (based on Alvaro's comment)
 ext_stats_computed  <= Renamed
 child_relids_total  <= Added
 child_relids_done   <= Added
 current_child_relid <= Added



Also, inheritance tree stats are created *after* creating single table
stats, so I think that it would be better to have a distinct phase
name for that, say "acquiring inherited sample rows".  In
do_analyze_rel(), you can select which of two phases to set based on
whether inh is true or not.  For partitioned tables, the progress
output will immediately switch to this phase, because partitioned
table itself is empty so there's nothing to do in the "acquiring
sample rows" phase.

That's all for now.



Okay! I'll also add the new phase "acquiring inherited sample rows" on
the next patch. 



Fixed.

I tried to abbreviate it to "acquiring inh sample rows" because I thought
"acquiring inherited sample rows" is a little long for the phase name.

The attached WIP patch includes these fixes:
  - Removed columns: include_children and current_relid
  - Added new columns: child_relids_total, child_relids_done and 
current_child_relid
  - Added new phase "acquiring inh sample rows"

  Note: the document is not updated, I'll fix it later. :)

The attached testcase.sql creates a base table and a partitioned table.
You can easily check the patch by using the following procedures.

Terminal #1
--
\a \t
select * from pg_stat_progress_analyze; \watch 0.0001
--

Terminal #2
------
\i testcase.sql
analyze hoge;
analyze hoge2;
--

Thanks,
Tatsuro Yamada



diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a3c5f86b7e..462d3a5415 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -360,6 +360,14 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
   
  
 
+ 
+  
pg_stat_progress_analyzepg_stat_progress_analyze
+  One row for each backend (including autovacuum worker processes) 
running
+   ANALYZE, showing current progress.
+   See .
+  
+ 
+
  
   
pg_stat_progress_clusterpg_stat_progress_cluster
   One row for each backend running
@@ -3502,7 +3510,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
PostgreSQL has the ability to report the 
progress of
certain commands during command execution.  Currently, the only commands
which support progress reporting are CREATE INDEX,
-   VACUUM and
+   VACUUM, ANALYZE and
CLUSTER. This may be expanded in the future.
   
 
@@ -3948,6 +3956,140 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
  
 
+ 
+  ANALYZE Progress Reporting
+
+  
+   Whenever ANALYZE is running, the
+   pg_stat_progress_analyze view will contain a
+   row for each backend that is currently running that command.  The tables
+   below describe the information that will be reported and provide
+   information about how to interpret it.
+  
+
+  
+   pg_stat_progress_analyze View
+   
+
+
+  Column
+  Type
+  Description
+ 
+
+
+   
+
+ pid
+ integer
+ Process ID of backend.
+
+ 
+  datid
+  oid
+  OID of the database to which this backend is connected.
+ 
+ 
+  datname
+  name
+  Name of the database to which this backend is connected.
+ 
+ 
+  relid
+  oid
+  OID of the table being analyzed.
+ 
+ 
+  include_children
+  boolean
+  Whether scanning through child tables.
+ 
+ 
+  current_r

Re: progress report for ANALYZE

2019-11-26 Thread Tatsuro Yamada

Hi Amit-san,


On Wed, Nov 27, 2019 at 11:04 AM Tatsuro Yamada
 wrote:

Regarding the other total-number columns,
I'll create another patch to add these columns "index_vacuum_total" and
"index_rebuild_count" on the other views. :)


Maybe you meant "index_rebuild_total"?


Yeah, you are right! :)


Thanks,
Tatsuro Yamada





Re: progress report for ANALYZE

2019-11-26 Thread Tatsuro Yamada

Hi Alvaro!

On 2019/11/26 21:22, Alvaro Herrera wrote:

On 2019-Nov-26, Tatsuro Yamada wrote:


I wonder whether we need the total number of ext stats on
pg_stat_progress_analyze or not. As you might know, there is the same
counter on pg_stat_progress_vacuum and pg_stat_progress_cluster.
For example, index_vacuum_count and index_rebuild_count.


Would it be better to add the total number column to these views? :)


Yeah, I think it would be good to add that.



Thanks for your comment!
Okay, I'll add the column "ext_stats_total" to the
pg_stat_progress_analyze view in the next patch. :)

Regarding the other total-number columns,
I'll create another patch to add these columns "index_vacuum_total" and
"index_rebuild_count" on the other views. :)

Thanks,
Tatsuro Yamada



 
 






Re: progress report for ANALYZE

2019-11-25 Thread Tatsuro Yamada

Hi Amit-san,



Related to the above,
I wonder whether we need the total number of ext stats on
pg_stat_progress_analyze or not. As you might know, there is the same
counter on pg_stat_progress_vacuum and pg_stat_progress_cluster.
For example, index_vacuum_count and index_rebuild_count.
Would it be added to the total number column to these views? :)



Oops, I made a mistake. :(

What I'd like to say was:
Would it be better to add the total number column to these views? :)

Thanks,
Tatsuro Yamada





Re: progress report for ANALYZE

2019-11-25 Thread Tatsuro Yamada

Hi Amit-san!

Thanks for your comments!



I have looked at the patch and here are some comments.

I think include_children and current_relid are not enough to
understand the progress of analyzing inheritance trees, because even
with current_relid being updated, I can't tell how many more there
will be.  I think it'd be better to show the total number of children
and the number of children processed, just like
pg_stat_progress_create_index does for partitions.  So, instead of
include_children and current_relid, I think it's better to have
child_tables_total, child_tables_done, and current_child_relid, placed
last in the set of columns.


Ah, I understood.
I'll check what pg_stat_progress_create_index does for partitions,
and will create a new patch. :)

Related to the above,
I wonder whether we need the total number of ext stats on
pg_stat_progress_analyze or not. As you might know, there is the same
counter on pg_stat_progress_vacuum and pg_stat_progress_cluster.
For example, index_vacuum_count and index_rebuild_count.
Would it be added to the total number column to these views? :)



Also, inheritance tree stats are created *after* creating single table
stats, so I think that it would be better to have a distinct phase
name for that, say "acquiring inherited sample rows".  In
do_analyze_rel(), you can select which of two phases to set based on
whether inh is true or not.  For partitioned tables, the progress
output will immediately switch to this phase, because partitioned
table itself is empty so there's nothing to do in the "acquiring
sample rows" phase.

That's all for now.



Okay! I'll also add the new phase "acquiring inherited sample rows" on
the next patch. :)


Thanks,
Tatsuro Yamada





Re: progress report for ANALYZE

2019-11-05 Thread Tatsuro Yamada

Hi Alvaro!

On 2019/11/05 22:38, Alvaro Herrera wrote:

On 2019-Nov-05, Tatsuro Yamada wrote:


==
[Session1]
\! pgbench -i
create statistics pg_ext1 (dependencies) ON aid, bid from pgbench_accounts;
create statistics pg_ext2 (mcv) ON aid, bid from pgbench_accounts;
create statistics pg_ext3 (ndistinct) ON aid, bid from pgbench_accounts;


Wow, it takes a long time to compute these ...

Hmm, you normally wouldn't define stats that way; you'd do this instead:

create statistics pg_ext1 (dependencies, mcv,ndistinct) ON aid, bid from 
pgbench_accounts;


I'd like to say it's just an example test case. But I take
your advice. Thanks! :)

 

I'm not sure if this has an important impact in practice.  What I'm
saying is that I'm not sure that "number of ext stats" is necessarily a
useful number as shown.  I wonder if it's possible to count the number
of items that have been computed for each stats object.  So if you do
this

create statistics pg_ext1 (dependencies, mcv) ON aid, bid from pgbench_accounts;
create statistics pg_ext2 (ndistinct,histogram) ON aid, bid from 
pgbench_accounts;

then the counter goes to 4.  But I also wonder if we need to publish
_which_ type of ext stats is currently being built, in a separate
column.



Hmm... I have never seen a lot of extended stats on a table (with many columns),
but I suppose there will be in the near future, because extended stats are the only
solution for correcting row estimation errors in vanilla PostgreSQL. Therefore, it
would be better to add the counter to the view, I think.

I revised the patch as follows because I realized that counting the types of ext
stats is not useful for users.

 - The attached new patch counts the number of ext stats instead of the types of ext 
stats.

So we can see the counter go to "2" if we create the above ext stats (pg_ext1
and
pg_ext2) and analyze as you wrote. :)
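As a side note, the two counting schemes discussed above (per statistics object, versus per computed item as in Alvaro's example) can be illustrated with a tiny sketch. The object names and stats kinds below are just the made-up examples from this thread, not anything read from a catalog.

```python
# Hypothetical illustration (not PostgreSQL source): two ways the
# ext stats counter could advance for the example statistics objects.
ext_stats = {
    "pg_ext1": ["dependencies", "mcv"],
    "pg_ext2": ["ndistinct", "histogram"],
}

# Counting statistics objects (what the revised patch does):
objects_computed = len(ext_stats)
# Counting the individual stats kinds computed per object:
items_computed = sum(len(kinds) for kinds in ext_stats.values())

print(objects_computed, items_computed)  # 2 4
```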


Thanks,
Tatsuro Yamada

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index e6c49eebad..c84d9a0775 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -360,6 +360,14 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
   
  
 
+ 
+  
pg_stat_progress_analyzepg_stat_progress_analyze
+  One row for each backend (including autovacuum worker processes) 
running
+   ANALYZE, showing current progress.
+   See .
+  
+ 
+
  
   
pg_stat_progress_clusterpg_stat_progress_cluster
   One row for each backend running
@@ -3481,7 +3489,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
PostgreSQL has the ability to report the 
progress of
certain commands during command execution.  Currently, the only commands
which support progress reporting are CREATE INDEX,
-   VACUUM and
+   VACUUM, ANALYZE and
CLUSTER. This may be expanded in the future.
   
 
@@ -3927,6 +3935,140 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
  
 
+ 
+  ANALYZE Progress Reporting
+
+  
+   Whenever ANALYZE is running, the
+   pg_stat_progress_analyze view will contain a
+   row for each backend that is currently running that command.  The tables
+   below describe the information that will be reported and provide
+   information about how to interpret it.
+  
+
+  
+   pg_stat_progress_analyze View
+   
+
+
+  Column
+  Type
+  Description
+ 
+
+
+   
+
+ pid
+ integer
+ Process ID of backend.
+
+ 
+  datid
+  oid
+  OID of the database to which this backend is connected.
+ 
+ 
+  datname
+  name
+  Name of the database to which this backend is connected.
+ 
+ 
+  relid
+  oid
+  OID of the table being analyzed.
+ 
+ 
+  include_children
+  boolean
+  Whether scanning through child tables.
+ 
+ 
+  current_relid
+  oid
+  OID of the table currently being scanned.
+It might be different from relid when analyzing tables that have child 
tables.
+  
+ 
+ 
+  phase
+  text
+  Current processing phase. See 
+ 
+ 
+ sample_blks_total
+ bigint
+ 
+   Total number of heap blocks that will be sampled.
+ 
+ 
+
+ heap_blks_scanned
+ bigint
+ 
+   Number of heap blocks scanned.
+ 
+
+
+ ext_compute_count
+ bigint
+ 
+   Number of computed extended stats.  This counter only advances when the 
phase
+   is computing extended stats.
+ 
+
+   
+   
+  
+
+  
+   ANALYZE phases
+   
+
+
+  Phase
+  Description
+ 
+
+   
+
+ initializing
+ 
+   The command is preparing to begin scanning the heap.  This phase is
+   expected to be very brief.
+ 
+
+
+ acquiring sample rows
+ 
+   The command is currently scanning the 
current_relid
+   to obtain samples.
+ 
+
+
+

Re: progress report for ANALYZE

2019-11-05 Thread Tatsuro Yamada

Hi Alvaro, vignesh,

I rebased the patch on 2a4d96eb, and added a new column
"ext_compute_count" to the pg_stat_progress_analyze view to
report the number of computed extended stats.
It is like "index_vacuum_count" in the vacuum progress reporter or
"index_rebuild_count" in the cluster progress reporter. :)

Please find attached file: v7.

And the following is a test result:
==
[Session1]
\! pgbench -i
create statistics pg_ext1 (dependencies) ON aid, bid from pgbench_accounts;
create statistics pg_ext2 (mcv) ON aid, bid from pgbench_accounts;
create statistics pg_ext3 (ndistinct) ON aid, bid from pgbench_accounts;

[Session2]
# \a \t
# select * from pg_stat_progress_analyze ; \watch 0.0001

27064|13583|postgres|16405|initializing|f|0|0|0|0
27064|13583|postgres|16405|acquiring sample rows|f|16405|1640|0|0
27064|13583|postgres|16405|acquiring sample rows|f|16405|1640|23|0
27064|13583|postgres|16405|acquiring sample rows|f|16405|1640|64|0
27064|13583|postgres|16405|acquiring sample rows|f|16405|1640|1640|0
27064|13583|postgres|16405|computing stats|f|16405|1640|1640|0
27064|13583|postgres|16405|computing stats|f|16405|1640|1640|0
27064|13583|postgres|16405|computing extended stats|f|16405|1640|1640|0
27064|13583|postgres|16405|computing extended stats|f|16405|1640|1640|1
27064|13583|postgres|16405|computing extended stats|f|16405|1640|1640|2
27064|13583|postgres|16405|computing extended stats|f|16405|1640|1640|3
27064|13583|postgres|16405|finalizing analyze|f|16405|1640|1640|3

Note:
 The result on Session2 was shortened for readability.
 If you'd like to check the whole result, you can see the attached file: "hoge.txt".
==

Thanks,
Tatsuro Yamada



diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index e6c49eebad..c84d9a0775 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -360,6 +360,14 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
   
  
 
+ 
+  
pg_stat_progress_analyzepg_stat_progress_analyze
+  One row for each backend (including autovacuum worker processes) 
running
+   ANALYZE, showing current progress.
+   See .
+  
+ 
+
  
   
pg_stat_progress_clusterpg_stat_progress_cluster
   One row for each backend running
@@ -3481,7 +3489,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
PostgreSQL has the ability to report the 
progress of
certain commands during command execution.  Currently, the only commands
which support progress reporting are CREATE INDEX,
-   VACUUM and
+   VACUUM, ANALYZE and
CLUSTER. This may be expanded in the future.
   
 
@@ -3927,6 +3935,140 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
  
 
+ 
+  ANALYZE Progress Reporting
+
+  
+   Whenever ANALYZE is running, the
+   pg_stat_progress_analyze view will contain a
+   row for each backend that is currently running that command.  The tables
+   below describe the information that will be reported and provide
+   information about how to interpret it.
+  
+
+  
+   pg_stat_progress_analyze View
+   
+
+
+  Column
+  Type
+  Description
+ 
+
+
+   
+
+ pid
+ integer
+ Process ID of backend.
+
+ 
+  datid
+  oid
+  OID of the database to which this backend is connected.
+ 
+ 
+  datname
+  name
+  Name of the database to which this backend is connected.
+ 
+ 
+  relid
+  oid
+  OID of the table being analyzed.
+ 
+ 
+  include_children
+  boolean
+  Whether scanning through child tables.
+ 
+ 
+  current_relid
+  oid
+  OID of the table currently being scanned.
+It might be different from relid when analyzing tables that have child 
tables.
+  
+ 
+ 
+  phase
+  text
+  Current processing phase. See 
+ 
+ 
+ sample_blks_total
+ bigint
+ 
+   Total number of heap blocks that will be sampled.
+ 
+ 
+
+ heap_blks_scanned
+ bigint
+ 
+   Number of heap blocks scanned.
+ 
+
+
+ ext_compute_count
+ bigint
+ 
+   Number of computed extended stats.  This counter only advances when the 
phase
+   is computing extended stats.
+ 
+
+   
+   
+  
+
+  
+   ANALYZE phases
+   
+
+
+  Phase
+  Description
+ 
+
+   
+
+ initializing
+ 
+   The command is preparing to begin scanning the heap.  This phase is
+   expected to be very brief.
+ 
+
+
+ acquiring sample rows
+ 
+   The command is currently scanning the 
current_relid
+   to obtain samples.
+ 
+
+
+ computing stats
+ 
+   The command is computing stats from the samples obtained during the 
table scan.
+ 
+
+
+ computing extended stats
+ 
+   The command is computing extended stats from the

Re: progress report for ANALYZE

2019-11-01 Thread Tatsuro Yamada

Hi vignesh!


On 2019/09/17 20:51, vignesh C wrote:

On Thu, Sep 5, 2019 at 2:31 AM Alvaro Herrera  wrote:


There were some minor problems in v5 -- bogus Docbook as well as
outdated rules.out, small "git diff --check" complaint about whitespace.
This v6 (on today's master) fixes those, no other changes.


+ 
+   The command is preparing to begin scanning the heap.  This phase is
+   expected to be very brief.
+ 
In the above, after "." there is an extra space; is this intentional? I
noticed that in a lot of places there are a couple of spaces and
sometimes a single space across this file.

Like below, where there is a single space after ".":
  Time when this process' current transaction was started, or null
   if no transaction is active. If the current
   query is the first of its transaction, this column is equal to the
   query_start column.
  



Sorry for the late reply.

Probably, it is intentional because there are many extra space
in other documents. See below:

# Results of grep
=
$ grep '\.  ' doc/src/sgml/monitoring.sgml | wc -l
114
$ grep '\.  ' doc/src/sgml/information_schema.sgml | wc -l
184
$ grep '\.  ' doc/src/sgml/func.sgml | wc -l
577
=

Therefore, I'm going to leave it as is. :)


Thanks,
Tatsuro Yamada






Re: tab complete for explain SETTINGS

2019-09-26 Thread Tatsuro Yamada

On 2019/09/27 11:20, Justin Pryzby wrote:

Here's to hoping this is the worst omission in v12.

Justin



Hi Justin,

I share my test result of your patch.

I used two commits, REL_12_RC1 and HEAD, and got a hunk below:

#REL_12_RC1 (17822c0e4f5ab8093e78f665c9e44766ae648a44)
=
$ patch -p1 
