Re: Make EXPLAIN generate a generic plan for a parameterized query

2022-12-27 Thread Michel Pelletier
On Wed, Dec 7, 2022 at 3:23 AM Laurenz Albe 
wrote:

> On Tue, 2022-12-06 at 10:17 -0800, Andres Freund wrote:
> > On 2022-10-29 10:35:26 +0200, Laurenz Albe wrote:
> > > > > Here is a patch that
> > > > > implements it with an EXPLAIN option named GENERIC_PLAN.
> >
> > This fails to build the docs:
> >
> > https://cirrus-ci.com/task/5609301511766016
> >
> > [17:47:01.064] ref/explain.sgml:179: parser error : Opening and ending
> tag mismatch: likeral line 179 and literal
> > [17:47:01.064]   ANALYZE, since a statement with
> unknown parameters
> > [17:47:01.064] ^
>
> *blush* Here is a fixed version.
>

I built and tested this patch for review and it works well, although I got
the following warning when building:

analyze.c: In function 'transformStmt':
analyze.c:2919:35: warning: 'generic_plan' may be used uninitialized in
this function [-Wmaybe-uninitialized]
 2919 |         pstate->p_generic_explain = generic_plan;
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
analyze.c:2909:25: note: 'generic_plan' was declared here
 2909 |         bool            generic_plan;
      |                         ^~~~~~~~~~~~
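
Presumably initializing the flag at its declaration, e.g.:

    bool		generic_plan = false;

would silence it, though I haven't checked whether transformStmt is in
fact guaranteed to assign it on every path.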


Re: Proposal to use JSON for Postgres Parser format

2022-10-31 Thread Michel Pelletier
On Mon, Oct 31, 2022 at 6:15 AM Matthias van de Meent <
boekewurm+postg...@gmail.com> wrote:

> On Mon, 31 Oct 2022 at 13:46, Alexander Korotkov 
> wrote:
> > On Fri, Oct 28, 2022 at 4:27 PM Andrew Dunstan 
> wrote:
> >> On 2022-10-27 Th 19:38, Andres Freund wrote:
> >> > Hi,
> >> >
> >> > On 2022-09-19 22:29:15 -0400, Tom Lane wrote:
> >> >> Maybe a compromise could be found whereby we provide a conversion
> function
> >> >> that converts whatever the catalog storage format is to some JSON
> >> >> equivalent.  That would address the needs of external code that
> doesn't want
> >> >> to write a custom parser, while not tying us directly to JSON.
> >> > +1
> >>
> >> Agreed.
> >
> > +1
> >
> > Michel, it seems that you now have a green light to implement a
> > node-to-json function.
>
> I think that Tom's proposal that we +1'd is for a pg_node_tree to json
> SQL function / cast, which is tangentially related to the "nodeToJson
> / changing the storage format of pg_node_tree to json" proposal, but
> not the same.
>

I agree.


> I will add my +1 to Tom's proposal for that function/cast, but I'm not
> sure on changing the storage format of pg_node_tree to json.
>

I'm going to spike on this function and will get back to the thread with
any updates.
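
As a starting point I'm imagining something like this (the function name
is hypothetical, nothing is implemented yet):

    -- render the stored node tree of a rewrite rule as JSON
    SELECT pg_node_tree_to_json(ev_action) FROM pg_rewrite LIMIT 1;

i.e. a plain SQL-callable conversion over pg_node_tree, leaving the
storage format itself untouched.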

Thank you!

-Michel


Re: Proposal to use JSON for Postgres Parser format

2022-10-27 Thread Michel Pelletier
On Wed, Sep 21, 2022 at 11:04 AM Matthias van de Meent <
boekewurm+postg...@gmail.com> wrote:

> On Tue, 20 Sept 2022 at 17:29, Alvaro Herrera 
> wrote:
> >
> > On 2022-Sep-20, Matthias van de Meent wrote:
> >
> > > Allow me to add: compressability
> > >
> > > In the thread surrounding [0] there were complaints about the size of
> > > catalogs, and specifically the template database. Significant parts of
> > > that (688kB of the 8080kB in a fresh PG14 database) are in pg_rewrite, which
> > > consists mostly of serialized Nodes. If we're going to replace our
> > > current NodeToText infrastructure, we'd better know we can effectively
> > > compress this data.
> >
> > True.  Currently, the largest ev_action values compress pretty well.  I
> > think if we wanted this to be more succinct, we would have to invent some
> > binary format -- perhaps something like Protocol Buffers: it'd be stored
> > in the binary format in catalogs, but for output it would be converted
> > into something easy to read (we already do this for
> > pg_statistic_ext_data for example).  We'd probably lose compressibility,
> > but that'd be okay because the binary format would already remove most
> > of the redundancy by nature.
> >
> > Do we want to go there?
>
> I don't think that a binary format would be much better for
> debugging/fixing than an optimization of the current textual format
> when combined with compression.


I agree, JSON is not perfect, but it compresses and it's usable
everywhere.  My personal need for this is purely developer experience
(as Tom pointed out, a "niche" need for sure), but we are starting to do
some serious work with Dan Lynch's plpgsql deparser tool to generate RLS
policies from meta schema models, and having the same format come out of
the parser would give us a complete end-to-end solution, especially if
we can get this data from a function in a ddl_command_start event trigger.
Dan also writes a popular deparser for JavaScript, and unifying the formats
across these tools would be a big win for us.
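
For a sense of scale, a rough way to measure the current footprint
(pg_column_size reports the post-TOAST, i.e. compressed, size):

    -- total compressed size of all serialized rule trees
    SELECT pg_size_pretty(sum(pg_column_size(ev_action)))
    FROM pg_rewrite;

which, per Matthias' numbers above, is well under a megabyte on a fresh
database.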


> As I mentioned in that thread, there
> is a lot of improvement possible with the existing format, and I think
> any debugging of serialized nodes would greatly benefit from using a
> textual format.
>

Agreed.


> Then again, I also agree that this argument doesn't hold its weight
> when storage and output formats are going to be different. I trust
> that any new tooling introduced as a result of this thread will be
> better than what we have right now.
>

Separating the storage and output formats seems like a lot of work to me
for what might not be a huge improvement over compressed JSON, given that
we are unlikely to be dealing with more than a few megabytes of parsed SQL.


> As for best format: I don't know. The current format is usable, and a
> better format would not store any data for default values. JSON can do
> that, but I could think of many formats that could do the same (Smile,
> BSON, xml, etc.).
>
> I do not think that protobuf is the best choice for storage, though,
> because it has its own rules on what it considers a default value and
> what it does or does not serialize: zero is considered the only
> default for numbers, as is the empty string for text, etc.
> I think it is all right for general use, but with e.g. `location: -1`
> in just about every parse node we'd probably want to select our own
> values to ignore during (de)serialization of fields.
>

Agreed.

Thank you to everyone who has contributed to this thread; I'm pleased it
sparked such a spirited debate, and I apologize for the delay in getting
back to everyone.

I'd like to spike on a proposed patch that:

  - Converts the existing text format to JSON (or possibly jsonb,
considering feedback from this thread)
  - Can be stored compressed
  - Can be passed to a ddl_command_start event trigger with a function.
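
For the third point, the developer-facing shape I have in mind is roughly
this (the accessor function is hypothetical, purely to illustrate the idea):

    CREATE FUNCTION log_parse_tree() RETURNS event_trigger AS $$
    BEGIN
        -- pg_current_command_json() is a hypothetical accessor that
        -- would return the parse tree of the triggering command
        RAISE NOTICE '%', pg_current_command_json();
    END $$ LANGUAGE plpgsql;

    CREATE EVENT TRIGGER parse_logger ON ddl_command_start
        EXECUTE FUNCTION log_parse_tree();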

Thoughts?

-Michel


Proposal to use JSON for Postgres Parser format

2022-09-19 Thread Michel Pelletier
Hello hackers,

As noted in the source:

https://github.com/postgres/postgres/blob/master/src/include/nodes/pg_list.h#L6-L11

 * Once upon a time, parts of Postgres were written in Lisp and used real
 * cons-cell lists for major data structures.  When that code was rewritten
 * in C, we initially had a faithful emulation of cons-cell lists, which
 * unsurprisingly was a performance bottleneck.  A couple of major rewrites
 * later, these data structures are actually simple expansible arrays;
 * but the "List" name and a lot of the notation survives.

The Postgres parser format as described in the wiki page:

https://wiki.postgresql.org/wiki/Query_Parsing

looks almost, but not quite, entirely like JSON:

SELECT * FROM foo where bar = 42 ORDER BY id DESC LIMIT 23;

(
   {SELECT
   :distinctClause <>
   :intoClause <>
   :targetList (
      {RESTARGET
      :name <>
      :indirection <>
      :val
         {COLUMNREF
         :fields (
            {A_STAR
            }
         )
         :location 7
         }
      :location 7
      }
   )
   :fromClause (
      {RANGEVAR
      :schemaname <>
      :relname foo
      :inhOpt 2
      :relpersistence p
      :alias <>
      :location 14
      }
   )
   ... and so on
)

This non-standard format is useful for visual inspection and perhaps
simple parsing.  The parsers that do exist for it are generally specific
to a single language.  If there were a standard way to parse queries,
tools like code generators and analysis tools could build on the many
libraries that already handle JSON well.  Future potential would include
exposing this data to ddl_command_start event triggers.  Providing a
JSON Schema would also aid tools that want to validate or transform the
JSON with rule-based systems.
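
To make the idea concrete, the fragment above might render as JSON
roughly like this (a hypothetical sketch, with the empty <> fields
omitted; actual key names and structure would be up for debate):

    {"SELECT": {
        "targetList": [
            {"RESTARGET": {
                "val": {"COLUMNREF": {"fields": [{"A_STAR": {}}],
                                      "location": 7}},
                "location": 7}}],
        "fromClause": [
            {"RANGEVAR": {"relname": "foo",
                          "inhOpt": 2,
                          "relpersistence": "p",
                          "location": 14}}]
    }}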

I would like to propose a discussion about switching Postgres from this
custom format to JSON in a future major release.  The format in question
is generated by macros and functions found in
`src/backend/nodes/readfuncs.c` and `src/backend/nodes/outfuncs.c`, and
converting them to emit valid JSON would be relatively straightforward.

One downside is that this would not be a forward-compatible binary
change across releases.  Since it is unlikely that very much code relies
on this custom format, this should not be a huge problem for most users.

Thoughts?

-Michel


Re: PATCH: Add Table Access Method option to pgbench

2022-07-12 Thread Michel Pelletier
On Thu, 30 Jun 2022 at 18:09, Michael Paquier  wrote:

> On Fri, Jul 01, 2022 at 10:06:49AM +0900, Michael Paquier wrote:
> > And the conclusion back then is that one can already achieve this by
> > using PGOPTIONS:
> > PGOPTIONS='-c default_table_access_method=wuzza' pgbench [...]
> >
> > So there is no need to complicate more pgbench, particularly when it
> > comes to partitioned tables where USING is not supported.  Your patch
> > touches this area of the client code to bypass the backend error.
>
> Actually, it could be a good thing to mention that directly in the
> docs of pgbench.
>

I've attached a documentation patch that mentions and links to the
PGOPTIONS documentation per your suggestion.  I'll keep the other patch
on the back burner; perhaps in the future there will be demand for a
command line option as more TAMs are created.

Thanks,

-Michel


> --
> Michael
>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 37fd80388c..e2d728e0c4 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -317,7 +317,7 @@ UPDATE pg_settings SET setting = reset_val WHERE name = 'configuration_parameter
 

 
-   
+   
 Parameter Interaction via the Shell
 
  
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2acf55c2ac..f15825c293 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1018,6 +1018,17 @@ pgbench  options  d
always, auto and
never.
   
+
+  
+   The environment variable PGOPTIONS specifies database
+   configuration options that are passed to PostgreSQL via the command line
+   (See ).  For example, a hypothetical
+   default Table Access Method for the tables that pgbench creates
+   called wuzza can be specified with:
+
+PGOPTIONS='-c default_table_access_method=wuzza'
+
+  
  
 
  


Re: PATCH: Add Table Access Method option to pgbench

2022-06-30 Thread Michel Pelletier
I've got CI set up and building, and the tests now pass; I was missing a
CASCADE in my test.  New patch attached:



On Thu, 30 Jun 2022 at 10:50, Michel Pelletier  wrote:

> On Thu, 30 Jun 2022 at 09:51, Justin Pryzby  wrote:
>
>> On Thu, Jun 30, 2022 at 09:09:17AM -0700, Michel Pelletier wrote:
>> > This change was originally authored by Alexander Korotkov, I have
>> updated
>> > it and added a test to the pgbench runner.  I'm hoping to make the
>> deadline
>> > for this currently open Commit Fest?
>>
>> This is failing check-world
>> http://cfbot.cputube.org/michel-pelletier.html
>>
>> BTW, you can test your patches the same as cfbot does (before mailing the
>> list)
>> on 4 OSes by pushing a branch to a github account.  See
>> ./src/tools/ci/README
>>
> Ah, that's very helpful, thank you!  This is my first patch submission,
> so sorry for any mixups.
>
> -Michel
>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index fbb74bdc4c..80b57e8a87 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -223,6 +223,11 @@ double		throttle_delay = 0;
  */
 int64		latency_limit = 0;
 
+/*
+ * tableam selection
+ */
+char	   *tableam = NULL;
+
 /*
  * tablespace selection
  */
@@ -890,6 +895,7 @@ usage(void)
 		   "  --partition-method=(range|hash)\n"
 		   "                           partition pgbench_accounts with this method (default: range)\n"
 		   "  --partitions=NUM         partition pgbench_accounts into NUM parts (default: 0)\n"
+		   "  --tableam=TABLEAM        create tables using the specified Table Access Method\n"
 		   "  --tablespace=TABLESPACE  create tables in the specified tablespace\n"
 		   "  --unlogged-tables        create tables as unlogged tables\n"
 		   "\nOptions to select what to run:\n"
@@ -4705,14 +4711,34 @@ createPartitions(PGconn *con)
 				appendPQExpBufferStr(&query, "maxvalue");
 
 			appendPQExpBufferChar(&query, ')');
+
+			if (tableam != NULL)
+			{
+				char	   *escape_tableam;
+
+				escape_tableam = PQescapeIdentifier(con, tableam, strlen(tableam));
+				appendPQExpBuffer(&query, " using %s", escape_tableam);
+				PQfreemem(escape_tableam);
+			}
 		}
 		else if (partition_method == PART_HASH)
+		{
 			printfPQExpBuffer(&query,
 							  "create%s table pgbench_accounts_%d\n"
 							  "  partition of pgbench_accounts\n"
 							  "  for values with (modulus %d, remainder %d)",
 							  unlogged_tables ? " unlogged" : "", p,
 							  partitions, p - 1);
+
+			if (tableam != NULL)
+			{
+				char	   *escape_tableam;
+
+				escape_tableam = PQescapeIdentifier(con, tableam, strlen(tableam));
+				appendPQExpBuffer(&query, " using %s", escape_tableam);
+				PQfreemem(escape_tableam);
+			}
+		}
 		else	/* cannot get there */
 			Assert(0);
 
@@ -4799,10 +4825,20 @@ initCreateTables(PGconn *con)
 		if (partition_method != PART_NONE && strcmp(ddl->table, "pgbench_accounts") == 0)
 			appendPQExpBuffer(&query,
 							  " partition by %s (aid)", PARTITION_METHOD[partition_method]);
-		else if (ddl->declare_fillfactor)
+		else
 		{
+			if (tableam != NULL)
+			{
+				char	   *escape_tableam;
+
+				escape_tableam = PQescapeIdentifier(con, tableam, strlen(tableam));
+				appendPQExpBuffer(&query, " using %s", escape_tableam);
+				PQfreemem(escape_tableam);
+			}
+
 			/* fillfactor is only expected on actual tables */
-			appendPQExpBuffer(&query, " with (fillfactor=%d)", fillfactor);
+			if (ddl->declare_fillfactor)
+				appendPQExpBuffer(&query, " with (fillfactor=%d)", fillfactor);
 		}
 
 		if (tablespace != NULL)
@@ -6556,6 +6592,7 @@ main(int argc, char **argv)
 		{"failures-detailed", no_argument, NULL, 13},
 		{"max-tries", required_argument, NULL, 14},
 		{"verbose-errors", no_argument, NULL, 15},
+		{"tableam", required_argument, NULL, 16},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -6898,6 +6935,10 @@ main(int argc, char **argv)
 				benchmarking_option_set = true;
 				verbose_errors = true;
 				break;
+			case 16:			/* tableam */
+				initialization_option_set = true;
+				tableam = pg_strdup(optarg);
+				break;
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2c0dc36965..c33df7b829 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1418,6 +1418,23 @@ SELECT pg_advisory_unlock_all();
 # Clean up
 $node->safe_psql('postgres', 'DROP TABLE first_client_table, xy;');
 
+# Test table access method
+$node->safe_psql('postgres', 'CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;');
+
+# Initialize pgbench table am
+$node->pgbench(
+	'-i --tableam=heap2', 0,
+	[qr{^$}],
+	[
+		qr{creating tables},
+		qr{vacuuming},
+		qr{creating primary keys},
+		qr{done in \d+\.\d\d s }
+	],
+	'pgbench test tableam options');
+
+# Clean up
+$node->safe_psql('postgres', 'DROP ACCESS METHOD heap2 CASCADE;');
 
 # done
 $node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');

Re: PATCH: Add Table Access Method option to pgbench

2022-06-30 Thread Michel Pelletier
On Thu, 30 Jun 2022 at 09:51, Justin Pryzby  wrote:

> On Thu, Jun 30, 2022 at 09:09:17AM -0700, Michel Pelletier wrote:
> > This change was originally authored by Alexander Korotkov, I have updated
> > it and added a test to the pgbench runner.  I'm hoping to make the
> deadline
> > for this currently open Commit Fest?
>
> This is failing check-world
> http://cfbot.cputube.org/michel-pelletier.html
>
> BTW, you can test your patches the same as cfbot does (before mailing the
> list)
> on 4 OSes by pushing a branch to a github account.  See
> ./src/tools/ci/README
>
Ah, that's very helpful, thank you!  This is my first patch submission,
so sorry for any mixups.

-Michel


PATCH: Add Table Access Method option to pgbench

2022-06-30 Thread Michel Pelletier
Hello!

This patch adds a `--tableam=TABLEAM` option to the pgbench command line,
which allows the user to specify the Table Access Method used for the
tables created during `-i` initialization.
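
For example, given an access method heap2 (the test below creates one
using the stock heap handler):

    pgbench -i --tableam=heap2

creates the standard pgbench tables with a trailing "using heap2" clause.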

This change was originally authored by Alexander Korotkov; I have updated
it and added a test to the pgbench runner.  I'm hoping to make the
deadline for the currently open Commit Fest.

My goal is to add a couple more regression tests, but the implementation
is complete.

Thanks in advance for any comments or questions!

-Michel
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index fbb74bdc4..80b57e8a8 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -223,6 +223,11 @@ double		throttle_delay = 0;
  */
 int64		latency_limit = 0;
 
+/*
+ * tableam selection
+ */
+char	   *tableam = NULL;
+
 /*
  * tablespace selection
  */
@@ -890,6 +895,7 @@ usage(void)
 		   "  --partition-method=(range|hash)\n"
 		   "                           partition pgbench_accounts with this method (default: range)\n"
 		   "  --partitions=NUM         partition pgbench_accounts into NUM parts (default: 0)\n"
+		   "  --tableam=TABLEAM        create tables using the specified Table Access Method\n"
 		   "  --tablespace=TABLESPACE  create tables in the specified tablespace\n"
 		   "  --unlogged-tables        create tables as unlogged tables\n"
 		   "\nOptions to select what to run:\n"
@@ -4705,14 +4711,34 @@ createPartitions(PGconn *con)
 				appendPQExpBufferStr(&query, "maxvalue");
 
 			appendPQExpBufferChar(&query, ')');
+
+			if (tableam != NULL)
+			{
+				char	   *escape_tableam;
+
+				escape_tableam = PQescapeIdentifier(con, tableam, strlen(tableam));
+				appendPQExpBuffer(&query, " using %s", escape_tableam);
+				PQfreemem(escape_tableam);
+			}
 		}
 		else if (partition_method == PART_HASH)
+		{
 			printfPQExpBuffer(&query,
 							  "create%s table pgbench_accounts_%d\n"
 							  "  partition of pgbench_accounts\n"
 							  "  for values with (modulus %d, remainder %d)",
 							  unlogged_tables ? " unlogged" : "", p,
 							  partitions, p - 1);
+
+			if (tableam != NULL)
+			{
+				char	   *escape_tableam;
+
+				escape_tableam = PQescapeIdentifier(con, tableam, strlen(tableam));
+				appendPQExpBuffer(&query, " using %s", escape_tableam);
+				PQfreemem(escape_tableam);
+			}
+		}
 		else	/* cannot get there */
 			Assert(0);
 
@@ -4799,10 +4825,20 @@ initCreateTables(PGconn *con)
 		if (partition_method != PART_NONE && strcmp(ddl->table, "pgbench_accounts") == 0)
 			appendPQExpBuffer(&query,
 							  " partition by %s (aid)", PARTITION_METHOD[partition_method]);
-		else if (ddl->declare_fillfactor)
+		else
 		{
+			if (tableam != NULL)
+			{
+				char	   *escape_tableam;
+
+				escape_tableam = PQescapeIdentifier(con, tableam, strlen(tableam));
+				appendPQExpBuffer(&query, " using %s", escape_tableam);
+				PQfreemem(escape_tableam);
+			}
+
 			/* fillfactor is only expected on actual tables */
-			appendPQExpBuffer(&query, " with (fillfactor=%d)", fillfactor);
+			if (ddl->declare_fillfactor)
+				appendPQExpBuffer(&query, " with (fillfactor=%d)", fillfactor);
 		}
 
 		if (tablespace != NULL)
@@ -6556,6 +6592,7 @@ main(int argc, char **argv)
 		{"failures-detailed", no_argument, NULL, 13},
 		{"max-tries", required_argument, NULL, 14},
 		{"verbose-errors", no_argument, NULL, 15},
+		{"tableam", required_argument, NULL, 16},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -6898,6 +6935,10 @@ main(int argc, char **argv)
 				benchmarking_option_set = true;
 				verbose_errors = true;
 				break;
+			case 16:			/* tableam */
+				initialization_option_set = true;
+				tableam = pg_strdup(optarg);
+				break;
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2c0dc3696..eed976d7e 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1418,6 +1418,23 @@ SELECT pg_advisory_unlock_all();
 # Clean up
 $node->safe_psql('postgres', 'DROP TABLE first_client_table, xy;');
 
+# Test table access method
+$node->safe_psql('postgres', 'CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;');
+
+# Initialize pgbench table am
+$node->pgbench(
+	'-i --tableam=heap2', 0,
+	[qr{^$}],
+	[
+		qr{creating tables},
+		qr{vacuuming},
+		qr{creating primary keys},
+		qr{done in \d+\.\d\d s }
+	],
+	'pgbench test tableam options');
+
+# Clean up
+$node->safe_psql('postgres', 'DROP ACCESS METHOD heap2;');
 
 # done
 $node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');


Re: Proposal for adding an example "expanded" data type to contrib

2021-11-08 Thread Michel Pelletier
On Mon, Nov 8, 2021 at 8:07 AM Alvaro Herrera 
wrote:

> On 2021-Nov-08, Michel Pelletier wrote:
>
> > I posted this last week to psql-general, and Pavel Stehule suggested it
> > would be a good candidate for inclusion in the contrib collection and
> that
> > I write this proposal to suggest the extension get into the next commit
> > fest.
>
> Hmm, I think src/test/modules/ would be a better place for it than
> contrib/ -- alongside plsample, which has a similar intention.
>

Ah, excellent suggestion; that dir had slipped my mind.  I see the readme
says:

"src/test/modules contains PostgreSQL extensions that are primarily or
entirely intended for testing PostgreSQL and/or to serve as example code."

so that seems quite appropriate, thanks!

-Michel


> --
> Álvaro Herrera  Valdivia, Chile  —
> https://www.EnterpriseDB.com/
>


Proposal for adding an example "expanded" data type to contrib

2021-11-08 Thread Michel Pelletier
Hello,

Expanded data types (types whose in-memory working representation differs
from their on-disk format) are pretty essential to a couple of projects
that I work on, and while the prose documentation is great at explaining
the rationale and giving a technical overview, there isn't a complete,
self-contained, simple example of an expanded data type.  Readers of the
documentation are instead encouraged to study the array type code.

To wrap my own head around it, and to have a sort of template to quickly
spin up future expanded types, I wrote a simple extension that does nothing
useful other than show how to create and handle an expanded type.

https://github.com/michelp/pgexpanded
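
For anyone skimming, the skeleton that every expanded type shares looks
roughly like this (a condensed, hypothetical sketch of what the extension
spells out in full; names like ExpandedThing are made up):

    #include "postgres.h"
    #include "utils/expandeddatum.h"
    #include "utils/memutils.h"

    /* in-memory form; the ExpandedObjectHeader must come first */
    typedef struct ExpandedThing
    {
        ExpandedObjectHeader hdr;
        int         value;          /* the "working" representation */
    } ExpandedThing;

    /* report how many bytes the flattened (on-disk) form needs */
    static Size
    thing_get_flat_size(ExpandedObjectHeader *eohptr)
    {
        return VARHDRSZ + sizeof(int);
    }

    /* serialize the in-memory form into a caller-allocated varlena */
    static void
    thing_flatten_into(ExpandedObjectHeader *eohptr,
                       void *result, Size allocated_size)
    {
        ExpandedThing *thing = (ExpandedThing *) eohptr;

        SET_VARSIZE(result, allocated_size);
        memcpy(VARDATA(result), &thing->value, sizeof(int));
    }

    static const ExpandedObjectMethods thing_methods = {
        thing_get_flat_size,
        thing_flatten_into
    };

    /* build a new expanded object and hand back a read-write Datum */
    static Datum
    make_expanded_thing(int value)
    {
        ExpandedThing *thing;
        MemoryContext objcxt;

        /* each expanded object lives in its own memory context */
        objcxt = AllocSetContextCreate(CurrentMemoryContext,
                                       "expanded thing",
                                       ALLOCSET_DEFAULT_SIZES);
        thing = (ExpandedThing *) MemoryContextAlloc(objcxt,
                                                     sizeof(ExpandedThing));
        EOH_init_header(&thing->hdr, &thing_methods, objcxt);
        thing->value = value;
        return EOHPGetRWDatum(&thing->hdr);
    }

The two callbacks are the entire contract; everything else is the type's
own input/output and accessor functions.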

I posted this last week to psql-general, and Pavel Stehule suggested it
would be a good candidate for inclusion in the contrib collection and that
I write this proposal to suggest the extension get into the next commit
fest.

I have not contributed to the core of Postgres before, so please excuse
my ignorance about the process.  Does anyone have any comments or
objections to me pushing this further along?

-Michel