Re: Improve conditional compilation for direct I/O alignment checks

2024-05-26 Thread Junwang Zhao
On Sun, May 26, 2024 at 3:16 PM Junwang Zhao  wrote:
>
> This patch refactors the alignment checks for direct I/O to preprocess phase,
> thereby reducing some CPU cycles.
>
> --
> Regards
> Junwang Zhao

Patch v2 with some additional minor polishment of the comments in `mdwriteback`.

-- 
Regards
Junwang Zhao


v2-0001-Improve-conditional-compilation-for-direct-I-O-al.patch
Description: Binary data


Improve conditional compilation for direct I/O alignment checks

2024-05-26 Thread Junwang Zhao
This patch refactors the alignment checks for direct I/O to preprocess phase,
thereby reducing some CPU cycles.

-- 
Regards
Junwang Zhao


0001-Improve-conditional-compilation-for-direct-I-O-align.patch
Description: Binary data


Re: Make COPY format extendable: Extract COPY TO format implementations

2024-02-01 Thread Junwang Zhao
On Fri, Feb 2, 2024 at 2:21 PM Michael Paquier  wrote:
>
> On Fri, Feb 02, 2024 at 09:40:56AM +0900, Sutou Kouhei wrote:
> > Thanks. It'll help us.
>
> I have done a review of v10, see v11 attached which is still WIP, with
> the patches for COPY TO and COPY FROM merged together.  Note that I'm
> thinking to merge them into a single commit.
>
> @@ -74,11 +75,11 @@ typedef struct CopyFormatOptions
>  boolconvert_selectively;/* do selective binary conversion? */
>  CopyOnErrorChoice on_error; /* what to do when error happened */
>  List   *convert_select; /* list of column names (can be NIL) */
> +constCopyToRoutine *to_routine;/* callback routines for COPY 
> TO */
>  } CopyFormatOptions;
>
> Adding the routines to the structure for the format options is in my
> opinion incorrect.  The elements of this structure are first processed
> in the option deparsing path, and then we need to use the options to
> guess which routines we need.  A more natural location is cstate
> itself, so as the pointer to the routines is isolated within copyto.c

I agree CopyToRoutine should be placed into CopyToStateData, but
why set it after ProcessCopyOptions, the implementation of
CopyToGetRoutine doesn't make sense if we want to support custom
format in the future.

Seems the refactor of v11 only considered performance but not
*extendable copy format*.

> and copyfrom_internal.h.  My point is: the routines are an
> implementation detail that the centralized copy.c has no need to know
> about.  This also led to a strange separation with
> ProcessCopyOptionFormatFrom() and ProcessCopyOptionFormatTo() to fit
> the hole in-between.
>
> The separation between cstate and the format-related fields could be
> much better, though I am not sure if it is worth doing as it
> introduces more duplication.  For example, max_fields and raw_fields
> are specific to text and csv, while binary does not care much.
> Perhaps this is just useful to be for custom formats.

I think those can be placed in format specific fields by utilizing the opaque
space, but yeah, this will introduce duplication.

>
> copyapi.h needs more documentation, like what is expected for
> extension developers when using these, what are the arguments, etc.  I
> have added what I had in mind for now.
>
> +typedef char *(*PostpareColumnValue) (CopyFromState cstate, char *string, 
> int m);
>
> CopyReadAttributes and PostpareColumnValue are also callbacks specific
> to text and csv, except that they are used within the per-row
> callbacks.  The same can be said about CopyAttributeOutHeaderFunction.
> It seems to me that it would be less confusing to store pointers to
> them in the routine structures, where the final picture involves not
> having multiple layers of APIs like CopyToCSVStart,
> CopyAttributeOutTextValue, etc.  These *have* to be documented
> properly in copyapi.h, and this is much easier now that cstate stores
> the routine pointers.  That would also make simpler function stacks.
> Note that I have not changed that in the v11 attached.
>
> This business with the extra callbacks required for csv and text is my
> main point of contention, but I'd be OK once the model of the APIs is
> more linear, with everything in Copy{From,To}State.  The changes would
> be rather simple, and I'd be OK to put my hands on it.  Just,
> Sutou-san, would you agree with my last point about these extra
> callbacks?
> --
> Michael

If V7 and V10 have no performance reduction, then I think V6 is also
good with performance, since most of the time goes to CopyToOneRow
and CopyFromOneRow.

I just think we should take the *extendable* into consideration at
the beginning.

-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-31 Thread Junwang Zhao
On Thu, Feb 1, 2024 at 11:56 AM Michael Paquier  wrote:
>
> On Thu, Feb 01, 2024 at 11:43:07AM +0800, Junwang Zhao wrote:
> > The first 6 rounds are like 10% better than the later 9 rounds, is this 
> > normal?
>
> Even with HEAD?  Perhaps you have some OS cache eviction in play here?
> FWIW, I'm not seeing any of that with longer runs after 7~ tries in a
> loop of 15.

Yeah, with HEAD. I'm on ubuntu 22.04, I did not change any gucs, maybe I should
set a higher shared_buffers? But I dought that's related ;(


> --
> Michael



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-31 Thread Junwang Zhao
Hi Michael,

On Thu, Feb 1, 2024 at 9:58 AM Michael Paquier  wrote:
>
> On Wed, Jan 31, 2024 at 02:39:54PM +0900, Michael Paquier wrote:
> > Thanks, I'm looking into that now.
>
> I have much to say about the patch, but for now I have begun running
> some performance tests using the patches, because this thread won't
> get far until we are sure that the callbacks do not impact performance
> in some kind of worst-case scenario.  First, here is what I used to
> setup a set of tables used for COPY FROM and COPY TO (requires [1] to
> feed COPY FROM's data to the void, and note that default values is to
> have a strict control on the size of the StringInfos used in the copy
> paths):
> CREATE EXTENSION blackhole_am;
> CREATE OR REPLACE FUNCTION create_table_cols(tabname text, num_cols int)
> RETURNS VOID AS
> $func$
> DECLARE
>   query text;
> BEGIN
>   query := 'CREATE UNLOGGED TABLE ' || tabname || ' (';
>   FOR i IN 1..num_cols LOOP
> query := query || 'a_' || i::text || ' int default 1';
> IF i != num_cols THEN
>   query := query || ', ';
> END IF;
>   END LOOP;
>   query := query || ')';
>   EXECUTE format(query);
> END
> $func$ LANGUAGE plpgsql;
> -- Tables used for COPY TO
> SELECT create_table_cols ('to_tab_1', 1);
> SELECT create_table_cols ('to_tab_10', 10);
> INSERT INTO to_tab_1 SELECT FROM generate_series(1, 1000);
> INSERT INTO to_tab_10 SELECT FROM generate_series(1, 1000);
> -- Data for COPY FROM
> COPY to_tab_1 TO '/tmp/to_tab_1.bin' WITH (format binary);
> COPY to_tab_10 TO '/tmp/to_tab_10.bin' WITH (format binary);
> COPY to_tab_1 TO '/tmp/to_tab_1.txt' WITH (format text);
> COPY to_tab_10 TO '/tmp/to_tab_10.txt' WITH (format text);
> -- Tables used for COPY FROM
> SELECT create_table_cols ('from_tab_1', 1);
> SELECT create_table_cols ('from_tab_10', 10);
> ALTER TABLE from_tab_1 SET ACCESS METHOD blackhole_am;
> ALTER TABLE from_tab_10 SET ACCESS METHOD blackhole_am;
>
> Then I have run a set of tests using HEAD, v7 and v10 with queries
> like that (adapt them depending on the format and table):
> COPY to_tab_1 TO '/dev/null' WITH (FORMAT text) \watch count=5
> SET client_min_messages TO error; -- for blackhole_am
> COPY from_tab_1 FROM '/tmp/to_tab_1.txt' with (FORMAT 'text') \watch count=5
> COPY from_tab_1 FROM '/tmp/to_tab_1.bin' with (FORMAT 'binary') \watch count=5
>
> All the patches have been compiled with -O2, without assertions, etc.
> Postgres is run in tmpfs mode, on scissors, without fsync.  Unlogged
> tables help a bit in focusing on the execution paths as we don't care
> about WAL, of course.  I have also included v7 in the test of tests,
> as this version uses more simple per-row callbacks.
>
> And here are the results I get for text and binary (ms, average of 15
> queries after discarding the three highest and three lowest values):
>   test   | master |  v7  | v10
> -++--+--
>  from_bin_1col   | 1575   | 1546 | 1575
>  from_bin_10col  | 5364   | 5208 | 5230
>  from_text_1col  | 1690   | 1715 | 1684
>  from_text_10col | 4875   | 4793 | 4757
>  to_bin_1col | 1717   | 1730 | 1731
>  to_bin_10col| 7728   | 7707 | 7513
>  to_text_1col| 1710   | 1730 | 1698
>  to_text_10col   | 5998   | 5960 | 5987
>
> I am getting an interesting trend here in terms of a speedup between
> HEAD and the patches with a table that has 10 attributes filled with
> integers, especially for binary and text with COPY FROM.  COPY TO
> binary also gets nice numbers, while text looks rather stable.  Hmm.
>
> These were on my buildfarm animal, but we need to be more confident
> about all this.  Could more people run these tests?  I am going to do
> a second session on a local machine I have at hand and see what
> happens.  Will publish the numbers here, the method will be the same.
>
> [1]: https://github.com/michaelpq/pg_plugins/tree/main/blackhole_am
> --
> Michael

I'm running the benchmark, but I got some strong numbers:

postgres=# \timing
Timing is on.
postgres=# COPY to_tab_10 TO '/dev/null' WITH (FORMAT binary) \watch count=15
COPY 1000
Time: 3168.497 ms (00:03.168)
COPY 1000
Time: 3255.464 ms (00:03.255)
COPY 1000
Time: 3270.625 ms (00:03.271)
COPY 1000
Time: 3285.112 ms (00:03.285)
COPY 1000
Time: 3322.304 ms (00:03.322)
COPY 1000
Time: 3341.328 ms (00:03.341)
COPY 1000
Time: 3621.564 ms (00:03.622)
COPY 1000
Time: 3700.911 ms (00:03.701)
COPY 1000
Time: 3717.992 ms (00:03.718)
COPY 1000
Time: 3708.350 ms (00:03.708)
COPY 1000
Time: 3704.367 ms (00:03.704)
COPY 1000
Time: 3724.281 ms (00:03.724)
COPY 1000
Time: 3703.335 ms (00:03.703)
COPY 1000
Time: 3728.629 ms (00:03.729)
COPY 1000
Time: 3758.135 ms (00:03.758)

The first 6 rounds are like 10% better than the later 9 rounds, is this normal?

-- 
Regards
Junwang Zhao




Re: Emitting JSON to file using COPY TO

2024-01-31 Thread Junwang Zhao
Hi Vignesh,

On Wed, Jan 31, 2024 at 5:50 PM vignesh C  wrote:
>
> On Sat, 27 Jan 2024 at 11:25, Junwang Zhao  wrote:
> >
> > Hi hackers,
> >
> > Kou-san(CCed) has been working on *Make COPY format extendable[1]*, so
> > I think making *copy to json* based on that work might be the right 
> > direction.
> >
> > I write an extension for that purpose, and here is the patch set together
> > with Kou-san's *extendable copy format* implementation:
> >
> > 0001-0009 is the implementation of extendable copy format
> > 00010 is the pg_copy_json extension
> >
> > I also created a PR[2] if anybody likes the github review style.
> >
> > The *extendable copy format* feature is still being developed, I post this
> > email in case the patch set in this thread is committed without knowing
> > the *extendable copy format* feature.
> >
> > I'd like to hear your opinions.
>
> CFBot shows that one of the test is failing as in [1]:
> [05:46:41.678] /bin/sh: 1: cannot open
> /tmp/cirrus-ci-build/contrib/pg_copy_json/sql/test_copy_format.sql: No
> such file
> [05:46:41.678] diff:
> /tmp/cirrus-ci-build/contrib/pg_copy_json/expected/test_copy_format.out:
> No such file or directory
> [05:46:41.678] diff:
> /tmp/cirrus-ci-build/contrib/pg_copy_json/results/test_copy_format.out:
> No such file or directory
> [05:46:41.678] # diff command failed with status 512: diff
> "/tmp/cirrus-ci-build/contrib/pg_copy_json/expected/test_copy_format.out"
> "/tmp/cirrus-ci-build/contrib/pg_copy_json/results/test_copy_format.out"
> > "/tmp/cirrus-ci-build/contrib/pg_copy_json/results/test_copy_format.out.diff"
> [05:46:41.678] Bail out!make[2]: *** [../../src/makefiles/pgxs.mk:454:
> check] Error 2
> [05:46:41.679] make[1]: *** [Makefile:96: check-pg_copy_json-recurse] Error 2
> [05:46:41.679] make: *** [GNUmakefile:71: check-world-contrib-recurse] Error 2
>
> Please post an updated version for the same.

Thanks for the reminder, the patch set I posted is not for commit but
for further discussion.

I will post more information about the *extendable copy* feature
when it's about to be committed.

>
> [1] - https://cirrus-ci.com/task/5322439115145216
>
> Regards,
> Vignesh



-- 
Regards
Junwang Zhao




Re: UUID v7

2024-01-30 Thread Junwang Zhao
Hi Andrey,

On Tue, Jan 30, 2024 at 5:56 PM Andrey M. Borodin  wrote:
>
>
>
> > On 30 Jan 2024, at 12:28, Sergey Prokhorenko 
> >  wrote:
> >
> >
> > I think this phrase is outdated: "This function can optionally accept a 
> > timestamp used instead of current time.
> > This allows implementation of k-way sotable identifiers.”
> Fixed.
>
> > This phrase is wrong: "Both functions return a version 4 (random) UUID.”
> This applies to functions gen_random_uuid() and uuidv4().
> >
> > For this phrase the reason is unclear and the phrase is most likely 
> > incorrect:
> > if large batches of UUIDs are generated at the
> > +   same time it's possible that some UUIDs will store a time that is 
> > slightly later
> > +   than their actual generation time
>
> I’ve rewritten this phrase, hope it’s more clear now.
>
>
> Best regards, Andrey Borodin.

+Datum
+uuid_extract_var(PG_FUNCTION_ARGS)
+{
+ pg_uuid_t  *uuid = PG_GETARG_UUID_P(0);
+ uint16_t result;
+ result = uuid->data[8] >> 6;
+
+ PG_RETURN_UINT16(result);
+}
\ No newline at end of file

It's always good to add a newline at the end of a  source file, though
this might be nitpicky.

-- 
Regards
Junwang Zhao




Re: UUID v7

2024-01-29 Thread Junwang Zhao
e the clock value needs to
> > be to the actual time.
>
> I see no reason why we cannot make stronger guarantees about the
> timestamps that we use to generate UUIDs with our uuidv7() function.
> And then we can update the documentation for
> uuid_extract_time to something like this:
>
> > This function extracts a timestamptz from UUID versions 1, 6 and 7. For 
> > other
> > versions and variants this function returns NULL. The extracted timestamp
> > does not necessarily equate to the time of UUID generation. How close it is
> > to the actual time depends on the implementation that generated to UUID.
> > The uuidv7() function provided PostgreSQL will normally store the actual 
> > time of
> > generation to in the UUID, but if large batches of UUIDs are generated at 
> > the
> > same time it's possible that some UUIDs will store a time that is slightly 
> > later
> > than their actual generation time.
>
>


-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Junwang Zhao
On Mon, Jan 29, 2024 at 2:03 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Sat, 27 Jan 2024 14:15:02 +0800,
>   Junwang Zhao  wrote:
>
> > I have been working on a *COPY TO JSON* extension since yesterday,
> > which is based on your V6 patch set, I'd like to give you more input
> > so you can make better decisions about the implementation(with only
> > pg-copy-arrow you might not get everything considered).
>
> Thanks!
>
> > 0009 is some changes made by me, I changed CopyToGetFormat to
> > CopyToSendCopyBegin because pg_copy_json need to send different bytes
> > in SendCopyBegin, get the format code along is not enough
>
> Oh, I haven't cared about the case.
> How about the following API instead?
>
> static void
> SendCopyBegin(CopyToState cstate)
> {
> StringInfoData buf;
>
> pq_beginmessage(, PqMsg_CopyOutResponse);
> cstate->opts.to_routine->CopyToFillCopyOutResponse(cstate, );
> pq_endmessage();
> cstate->copy_dest = COPY_FRONTEND;
> }
>
> static void
> CopyToJsonFillCopyOutResponse(CopyToState cstate, StringInfoData )
> {
> int16   format = 0;
>
> pq_sendbyte(, format);  /* overall format */
> /*
>  * JSON mode is always one non-binary column
>  */
> pq_sendint16(, 1);
> pq_sendint16(, format);
> }

Make sense to me.

>
> > 00010 is the pg_copy_json extension, I think this should be a good
> > case which can utilize the *extendable copy format* feature
>
> It seems that it's convenient that we have one more callback
> for initializing CopyToState::opaque. It's called only once
> when Copy{To,From}Routine is chosen:
>
> typedef struct CopyToRoutine
> {
> void(*CopyToInit) (CopyToState cstate);
> ...
> };

I like this, we can alloc private data in this hook.

>
> void
> ProcessCopyOptions(ParseState *pstate,
>CopyFormatOptions *opts_out,
>bool is_from,
>void *cstate,
>List *options)
> {
> ...
> foreach(option, options)
> {
> DefElem*defel = lfirst_node(DefElem, option);
>
> if (strcmp(defel->defname, "format") == 0)
> {
> ...
> opts_out->to_routine = 
> opts_out->to_routine->CopyToInit(cstate);
> ...
> }
> }
> ...
> }
>
>
> >  maybe we
> > should delete copy_test_format if we have this extension as an
> > example?
>
> I haven't read the COPY TO format json thread[1] carefully
> (sorry), but we may add the JSON format as a built-in
> format. If we do it, copy_test_format is useful to test the
> extension API.

Yeah, maybe, I have no strong opinion here, pg_copy_json is
just a toy extension for discussion.

>
> [1] 
> https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com
>
>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Junwang Zhao
On Mon, Jan 29, 2024 at 11:22 AM Masahiko Sawada  wrote:
>
> On Mon, Jan 29, 2024 at 12:10 PM Junwang Zhao  wrote:
> >
> > On Mon, Jan 29, 2024 at 10:42 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Fri, Jan 26, 2024 at 6:02 PM Junwang Zhao  wrote:
> > > >
> > > > On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei  
> > > > wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > In 
> > > > > 
> > > > >   "Re: Make COPY format extendable: Extract COPY TO format 
> > > > > implementations" on Fri, 26 Jan 2024 16:41:50 +0800,
> > > > >   Junwang Zhao  wrote:
> > > > >
> > > > > > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > > > > > single option, and store the options in the opaque field,  but it 
> > > > > > can not
> > > > > > check the relation of two options, for example, considering json 
> > > > > > format,
> > > > > > the `header` option can not be handled by these two functions.
> > > > > >
> > > > > > I want to find a way when the user specifies the header option, 
> > > > > > customer
> > > > > > handler can error out.
> > > > >
> > > > > Ah, you want to use a built-in option (such as "header")
> > > > > value from a custom handler, right? Hmm, it may be better
> > > > > that we call CopyToProcessOption()/CopyFromProcessOption()
> > > > > for all options including built-in options.
> > > > >
> > > > Hmm, still I don't think it can handle all cases, since we don't know
> > > > the sequence of the options, we need all the options been parsed
> > > > before we check the compatibility of the options, or customer
> > > > handlers will need complicated logic to resolve that, which might
> > > > lead to ugly code :(
> > > >
> > >
> > > Does it make sense to pass only non-builtin options to the custom
> > > format callback after parsing and evaluating the builtin options? That
> > > is, we parse and evaluate only the builtin options and populate
> > > opts_out first, then pass each rest option to the custom format
> > > handler callback. The callback can refer to the builtin option values.
> >
> > Yeah, I think this makes sense.
> >
> > > The callback is expected to return false if the passed option is not
> > > supported. If one of the builtin formats is specified and the rest
> > > options list has at least one option, we raise "option %s not
> > > recognized" error.  IOW it's the core's responsibility to ranse the
> > > "option %s not recognized" error, which is in order to raise a
> > > consistent error message. Also, I think the core should check the
> > > redundant options including bultiin and custom options.
> >
> > It would be good that core could check all the redundant options,
> > but where should core do the book-keeping of all the options? I have
> > no idea about this, in my implementation of pg_copy_json extension,
> > I handle redundant options by adding a xxx_specified field for each
> > xxx.
>
> What I imagined is that while parsing the all specified options, we
> evaluate builtin options and we add non-builtin options to another
> list. Then when parsing a non-builtin option, we check if this option
> already exists in the list. If there is, we raise the "option %s not
> recognized" error.". Once we complete checking all options, we pass
> each option in the list to the callback.

LGTM.

>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Junwang Zhao
On Mon, Jan 29, 2024 at 10:42 AM Masahiko Sawada  wrote:
>
> On Fri, Jan 26, 2024 at 6:02 PM Junwang Zhao  wrote:
> >
> > On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei  wrote:
> > >
> > > Hi,
> > >
> > > In 
> > >   "Re: Make COPY format extendable: Extract COPY TO format 
> > > implementations" on Fri, 26 Jan 2024 16:41:50 +0800,
> > >   Junwang Zhao  wrote:
> > >
> > > > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > > > single option, and store the options in the opaque field,  but it can 
> > > > not
> > > > check the relation of two options, for example, considering json format,
> > > > the `header` option can not be handled by these two functions.
> > > >
> > > > I want to find a way when the user specifies the header option, customer
> > > > handler can error out.
> > >
> > > Ah, you want to use a built-in option (such as "header")
> > > value from a custom handler, right? Hmm, it may be better
> > > that we call CopyToProcessOption()/CopyFromProcessOption()
> > > for all options including built-in options.
> > >
> > Hmm, still I don't think it can handle all cases, since we don't know
> > the sequence of the options, we need all the options been parsed
> > before we check the compatibility of the options, or customer
> > handlers will need complicated logic to resolve that, which might
> > lead to ugly code :(
> >
>
> Does it make sense to pass only non-builtin options to the custom
> format callback after parsing and evaluating the builtin options? That
> is, we parse and evaluate only the builtin options and populate
> opts_out first, then pass each rest option to the custom format
> handler callback. The callback can refer to the builtin option values.

Yeah, I think this makes sense.

> The callback is expected to return false if the passed option is not
> supported. If one of the builtin formats is specified and the rest
> options list has at least one option, we raise "option %s not
> recognized" error.  IOW it's the core's responsibility to ranse the
> "option %s not recognized" error, which is in order to raise a
> consistent error message. Also, I think the core should check the
> redundant options including bultiin and custom options.

It would be good that core could check all the redundant options,
but where should core do the book-keeping of all the options? I have
no idea about this, in my implementation of pg_copy_json extension,
I handle redundant options by adding a xxx_specified field for each
xxx.

>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-26 Thread Junwang Zhao
On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Fri, 26 Jan 2024 16:41:50 +0800,
>   Junwang Zhao  wrote:
>
> > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > single option, and store the options in the opaque field,  but it can not
> > check the relation of two options, for example, considering json format,
> > the `header` option can not be handled by these two functions.
> >
> > I want to find a way when the user specifies the header option, customer
> > handler can error out.
>
> Ah, you want to use a built-in option (such as "header")
> value from a custom handler, right? Hmm, it may be better
> that we call CopyToProcessOption()/CopyFromProcessOption()
> for all options including built-in options.
>
Hmm, still I don't think it can handle all cases, since we don't know
the sequence of the options, we need all the options been parsed
before we check the compatibility of the options, or customer
handlers will need complicated logic to resolve that, which might
lead to ugly code :(

>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-26 Thread Junwang Zhao
On Fri, Jan 26, 2024 at 4:32 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Fri, 26 Jan 2024 16:18:14 +0800,
>   Junwang Zhao  wrote:
>
> > In the current implementation, there is no way that one can check
> > incompatibility
> > options in ProcessCopyOptions, we can postpone the check in CopyFromStart
> > or CopyToStart, but I think it is a little bit late. Do you think
> > adding an extra
> > check for incompatible options hook is acceptable (PFA)?
>
> Thanks for the suggestion! But I think that a custom handler
> can do it in
> CopyToProcessOption()/CopyFromProcessOption(). What do you
> think about this? Or could you share a sample COPY TO/FROM
> WITH() SQL you think?

CopyToProcessOption()/CopyFromProcessOption() can only handle
single option, and store the options in the opaque field,  but it can not
check the relation of two options, for example, considering json format,
the `header` option can not be handled by these two functions.

I want to find a way when the user specifies the header option, customer
handler can error out.

>
>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-26 Thread Junwang Zhao
On Thu, Jan 25, 2024 at 4:52 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Thu, 25 Jan 2024 13:36:03 +0900,
>   Masahiko Sawada  wrote:
>
> > I've experimented with a similar optimization for csv
> > and text format; have different callbacks for text and csv format and
> > remove "if (cstate->opts.csv_mode)" branches. I've attached a patch
> > for that. Here are results:
> >
> > HEAD w/ 0001 patch + remove branches:
> > binary 2824.502 ms
> > text 2715.264 ms
> > csv 2803.381 ms
> >
> > The numbers look better now. I'm not sure these are within a noise
> > range but it might be worth considering having different callbacks for
> > text and csv formats.
>
> Wow! Interesting. I tried the approach before but I didn't
> see any difference by the approach. But it may depend on my
> environment.
>
> I'll import the approach to the next patch set so that
> others can try the approach easily.
>
>
> Thanks,
> --
> kou

Hi Kou-san,

In the current implementation, there is no way that one can check
incompatibility
options in ProcessCopyOptions, we can postpone the check in CopyFromStart
or CopyToStart, but I think it is a little bit late. Do you think
adding an extra
check for incompatible options hook is acceptable (PFA)?


-- 
Regards
Junwang Zhao


0001-add-check-incomptiblity-options-hooks.patch
Description: Binary data


Re: make dist using git archive

2024-01-22 Thread Junwang Zhao
On Tue, Jan 23, 2024 at 2:36 AM Peter Eisentraut  wrote:
>
> On 22.01.24 13:10, Junwang Zhao wrote:
> > I played this with meson build on macOS, the packages are generated
> > in source root but not build root, I'm sure if this is by design but I think
> > polluting *working directory* is not good.
>
> Yes, it's not good, but I couldn't find a way to make it work.
>
> This is part of the complications with meson I referred to.  The
> @BUILD_ROOT@ placeholder in custom_target() is apparently always a
> relative path, but it doesn't know that git -C changes the current
> directory.
>
> > Another thing I'd like to point out is, should we also introduce *git 
> > commit*
> > or maybe *git tag* to package name, something like:
> >
> > git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
> > postgresql-17devel-`git rev-parse --short HEAD`.tar.gz
> > git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
> > postgresql-`git describe --tags`.tar.gz
>
> I'm not sure why we would need it built-in.  It can be done by hand, of
> course.

If this is only used by the release phase, one can do this by hand.

*commit id/tag* in package name can be used to identify the git source,
which might be useful for cooperation between QA and dev team,
but surely there are better ways for this, so I do not have a strong
opinion here.

>


-- 
Regards
Junwang Zhao




Re: make dist using git archive

2024-01-22 Thread Junwang Zhao
Hi,

On Mon, Jan 22, 2024 at 3:32 PM Peter Eisentraut  wrote:
>
> One of the goals is to make the creation of the distribution tarball
> more directly traceable to the git repository.  That is why we removed
> the "make distprep" step.
>
> Here I want to take another step in that direction, by changing "make
> dist" to directly use "git archive", rather than the custom shell script
> it currently runs.
>
> The simple summary is that it would run
>
> git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
> postgresql-17devel.tar.gz
>
> (with appropriate version numbers, of course), and that's the tarball we
> would ship.
>
> There are analogous commands for other compression formats.
>
> The actual command gets subtly more complicated if you need to run this
> in a separate build directory.  In my attached patch, the make version
> doesn't support vpath at the moment, just so that it's easier to
> understand for now.  The meson version is a bit hairy.
>
> I have studied and tested this quite a bit, and I have found that the
> archives produced this way are deterministic and reproducible, meaning
> for a given commit the result file should always be bit-for-bit identical.
>
> The exception is that if you use a git version older than 2.38.0, gzip
> records the platform in the archive, so you'd get a different output on
> Windows vs. macOS vs. "UNIX" (everything else).  In git 2.38.0, this was
> changed so that everything is recorded as "UNIX" now.  This is just
> something to keep in mind.  This issue is specific to the gzip format,
> it does not affect other compression formats.
>
> Meson has its own distribution building command (meson dist), but opted
> against using that.  The main problem is that the way they have
> implemented it, it is not deterministic in the above sense.  (Another
> point is of course that we probably want a "make" version for the time
> being.)
>
> But the target name "dist" in meson is reserved for that reason, so I
> needed to call the custom target "pgdist".
>
> I did take one idea from meson: It runs a check before git archive that
> the checkout is clean.  That way, you avoid mistakes because of
> uncommitted changes.  This works well in my "make" implementation.  In
> the meson implementation, I had to find a workaround, because a
> custom_target cannot have a dependency on a run_target.  As also
> mentioned above, the whole meson implementation is a bit ugly.
>
> Anyway,  with the attached patch you can do
>
>  make dist
>
> or
>
>  meson compile -C build pgdist

I played this with meson build on macOS, the packages are generated
in source root but not build root, I'm sure if this is by design but I think
polluting *working directory* is not good.

Another thing I'd like to point out is, should we also introduce *git commit*
or maybe *git tag* to package name, something like:

git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-17devel-`git rev-parse --short HEAD`.tar.gz
git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-`git describe --tags`.tar.gz

>
> and it produces the same set of tarballs as before, except it's done
> differently.
>
> The actual build scripts need some fine-tuning, but the general idea is
> correct, I think.

I think this is a good idea, thanks for working on this.


-- 
Regards
Junwang Zhao




Re: make pg_ctl more friendly

2024-01-17 Thread Junwang Zhao
Hi Alvaro,

On Wed, Jan 17, 2024 at 4:54 PM Alvaro Herrera  wrote:
>
> I think this needs more comments.  First, in the WaitPMResult enum, we
> currently have three values -- READY, STILL_STARTING, FAILED.  These are
> all pretty self-explanatory.  But this patch adds SHUTDOWN_IN_RECOVERY,
> and that's not at all self-explanatory.  Did postmaster start or not?
> The enum value's name doesn't make that clear.  So I'd do something like
>
> typedef enum
> {
> POSTMASTER_READY,
> POSTMASTER_STILL_STARTING,
> /*
>  * postmaster no longer running, because it stopped after recovery
>  * completed.
>  */
> POSTMASTER_SHUTDOWN_IN_RECOVERY,
> POSTMASTER_FAILED,
> } WaitPMResult;
>
> Maybe put the comments in wait_for_postmaster_start instead.

I put the comments in WaitPMResult since we need to add two
of those if in wait_for_postmaster_start.

>
> Secondly, the patch proposes to add new text to be returned by
> do_start() when this happens, which would look like this:
>
>   waiting for server to start...  shut down while in recovery
>   update recovery target settings for startup again if needed
>
> Is this crystal clear?  I'm not sure.  How about something like this?
>
>   waiting for server to start...  done, but not running
>   server shut down because of recovery target settings

Agreed.
>
> variations on the first phrase:
>
> "done, no longer running"
> "done, but no longer running"
> "done, automatically shut down"
> "done, automatically shut down after recovery"

I chose the last one because it gives information to users.
See V5, thanks for the wording ;)

>
> --
> Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
> "Now I have my system running, not a byte was off the shelf;
> It rarely breaks and when it does I fix the code myself.
> It's stable, clean and elegant, and lightning fast as well,
> And it doesn't cost a nickel, so Bill Gates can go to hell."



-- 
Regards
Junwang Zhao


v5-0001-PITR-shutdown-should-not-report-error-by-pg_ctl.patch
Description: Binary data


Re: make pg_ctl more friendly

2024-01-15 Thread Junwang Zhao
Hi Nathan,

On Tue, Jan 16, 2024 at 5:39 AM Nathan Bossart  wrote:
>
> +   POSTMASTER_RECOVERY_SHUTDOWN,
>
> Perhaps this should be POSTMASTER_SHUTDOWN_IN_RECOVERY to match the state
> in the control file?

Agreed

>
> +   case POSTMASTER_RECOVERY_SHUTDOWN:
> +   print_msg(_("PITR shutdown\n"));
> +   print_msg(_("update configuration for startup 
> again if needed\n"));
> +   break;
>
> I'm not sure I agree that this is a substantially friendlier message.  From
> a quick skim of the thread, it seems like you want to avoid sending a scary
> error message if Postgres was intentionally shut down while in recovery.
> If I got this particular message, I think I would be worried that something
> went wrong during my point-in-time restore, and I'd be scrambling to figure
> out what configuration this message wants me to update.
>
> If I'm correctly interpreting the intent here, it might be worth fleshing
> out the messages a bit more.  For example, instead of "PITR shutdown,"
> perhaps we could say "shut down while in recovery."

Make sense. Fixed. See V4

> And maybe we should
> point to the specific settings in the latter message.

I've changed this latter message to:
update recovery target settings for startup again if needed

What do you think?

>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com



-- 
Regards
Junwang Zhao


v4-0001-PITR-shutdown-should-not-report-error-by-pg_ctl.patch
Description: Binary data


Re: speed up a logical replica setup

2024-01-13 Thread Junwang Zhao
Hi Euler,

On Fri, Jan 12, 2024 at 6:16 AM Euler Taveira  wrote:
>
> On Thu, Jan 11, 2024, at 9:18 AM, Hayato Kuroda (Fujitsu) wrote:
>
> I have been concerned that the patch has not been tested by cfbot due to the
> application error. Also, some comments were raised. Therefore, I created a 
> patch
> to move forward.
>
>
> Let me send an updated patch to hopefully keep the CF bot happy. The following
> items are included in this patch:
>
> * drop physical replication slot if standby is using one [1].
> * cleanup small changes (copyright, .gitignore) [2][3]
> * fix getopt_long() options [2]
> * fix format specifier for some messages
> * move doc to Server Application section [4]
> * fix assert failure
> * ignore duplicate database names [2]
> * store subscriber server log into a separate file
> * remove MSVC support
>
> I'm still addressing other reviews and I'll post another version that includes
> it soon.
>
> [1] 
> https://www.postgresql.org/message-id/e02a2c17-22e5-4ba6-b788-de696ab74f1e%40app.fastmail.com
> [2] 
> https://www.postgresql.org/message-id/CALDaNm1joke42n68LdegN5wCpaeoOMex2EHcdZrVZnGD3UhfNQ%40mail.gmail.com
> [3] 
> https://www.postgresql.org/message-id/TY3PR01MB98895BA6C1D72CB8582CACC4F5682%40TY3PR01MB9889.jpnprd01.prod.outlook.com
> [4] 
> https://www.postgresql.org/message-id/TY3PR01MB988978C7362A101927070D29F56A2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
>

+ 
+  pg_subscriber
+  create a new logical replica from a standby server
+ 
I'm a bit confused about this wording because we are converting a standby
to a logical replica but not creating a new logical replica and leaving the
standby as is. How about:

convert a standby replica to a logical replica

+  
+   The pg_subscriber should be run at the target
+   server. The source server (known as publisher server) should accept logical
+   replication connections from the target server (known as subscriber server).
+   The target server should accept local logical replication connection.
+  

What is *local logical replication*? I can't find any clue in the patch, can you
give me some hint?


>
> --
> Euler Taveira
> EDB   https://www.enterprisedb.com/
>


-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-11 Thread Junwang Zhao
Hi,

On Wed, Jan 10, 2024 at 2:20 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Fri, 22 Dec 2023 10:58:05 +0800,
>   Junwang Zhao  wrote:
>
> >> 1. Add an opaque space for custom COPY TO handler
> >>* Add CopyToState{Get,Set}Opaque()
> >>
> >> https://github.com/kou/postgres/commit/5a610b6a066243f971e029432db67152cfe5e944
> >>
> >> 2. Export CopyToState::attnumlist
> >>* Add CopyToStateGetAttNumList()
> >>
> >> https://github.com/kou/postgres/commit/15fcba8b4e95afa86edb3f677a7bdb1acb1e7688
> >>
> >> 3. Export CopySend*()
> >>* Rename CopySend*() to CopyToStateSend*() and export them
> >>* Exception: CopySendEndOfRow() to CopyToStateFlush() because
> >>  it just flushes the internal buffer now.
> >>
> >> https://github.com/kou/postgres/commit/289a5640135bde6733a1b8e2c412221ad522901e
> >>
> > I guess the purpose of these helpers is to avoid expose CopyToState to
> > copy.h,
>
> Yes.
>
> > but I
> > think expose CopyToState to user might make life easier, users might want 
> > to use
> > the memory contexts of the structure (though I agree not all the
> > fields are necessary
> > for extension handers).
>
> OK. I don't object it as I said in another e-mail:
> https://www.postgresql.org/message-id/flat/20240110.120644.1876591646729327180.kou%40clear-code.com#d923173e9625c20319750155083cbd72
>
> >> 2. Need an opaque space like IndexScanDesc::opaque does
> >>
> >>* A custom COPY TO handler needs to keep its data
> >
> > I once thought users might want to parse their own options, maybe this
> > is a use case for this opaque space.
>
> Good catch! I forgot to suggest a callback for custom format
> options. How about the following API?
>
> 
> ...
> typedef bool (*CopyToProcessOption_function) (CopyToState cstate, DefElem 
> *defel);
>
> ...
> typedef bool (*CopyFromProcessOption_function) (CopyFromState cstate, DefElem 
> *defel);
>
> typedef struct CopyToFormatRoutine
> {
> ...
> CopyToProcessOption_function process_option_fn;
> } CopyToFormatRoutine;
>
> typedef struct CopyFromFormatRoutine
> {
> ...
> CopyFromProcessOption_function process_option_fn;
> } CopyFromFormatRoutine;
> 
>
> 
> diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
> index e7597894bf..1aa8b62551 100644
> --- a/src/backend/commands/copy.c
> +++ b/src/backend/commands/copy.c
> @@ -416,6 +416,7 @@ void
>  ProcessCopyOptions(ParseState *pstate,
>CopyFormatOptions *opts_out,
>bool is_from,
> +  void *cstate, /* CopyToState* for 
> !is_from, CopyFromState* for is_from */
>List *options)
>  {
> boolformat_specified = false;
> @@ -593,11 +594,19 @@ ProcessCopyOptions(ParseState *pstate,
>  parser_errposition(pstate, 
> defel->location)));
> }
> else
> -   ereport(ERROR,
> -   (errcode(ERRCODE_SYNTAX_ERROR),
> -errmsg("option \"%s\" not 
> recognized",
> -   defel->defname),
> -parser_errposition(pstate, 
> defel->location)));
> +   {
> +   bool processed;
> +   if (is_from)
> +   processed = 
> opts_out->from_ops->process_option_fn(cstate, defel);
> +   else
> +   processed = 
> opts_out->to_ops->process_option_fn(cstate, defel);
> +   if (!processed)
> +   ereport(ERROR,
> +   
> (errcode(ERRCODE_SYNTAX_ERROR),
> +errmsg("option \"%s\" not 
> recognized",
> +   
> defel->defname),
> +parser_errposition(pstate, 
> defel->location)));
> +   }
> }
>
> /*
> 

Looks good.

>
> > For the name, I thought private_data might be a better candidate than
>

Re: make pg_ctl more friendly

2024-01-09 Thread Junwang Zhao
Hi Nazir,

On Tue, Jan 9, 2024 at 9:23 PM Nazir Bilal Yavuz  wrote:
>
> Hi,
>
> Thank you for working on this! I agree that the current message is not 
> friendly.
>
> On Thu, 9 Nov 2023 at 10:19, Junwang Zhao  wrote:
> >
> > On Thu, Nov 9, 2023 at 3:08 PM Junwang Zhao  wrote:
> > >
> > > After a PITR shutdown, the db state should be *shut down in recovery*, 
> > > try the
> > > patch attached.
> > >
> >
> > previous patch has some format issues, V2 attached.
>
> v2-0001-PITR-shutdown-should-not-report-error-by-pg_ctl.patch:
>
> -   "Examine the log output.\n"),
> +   "Examine the log output\n"),
>   progname);
>
> I don't think that this is needed.
There seems to be no common sense for the ending dot when using
write_stderr, so I will leave this not changed.

>
> Other than that, the patch looks good and I confirm that after PITR shutdown:
>
> "PITR shutdown"
> "update configuration for startup again if needed"
>
> message shows up, instead of:
>
> "pg_ctl: could not start server"
> "Examine the log output.".
>
> nitpick: It would be better if the order of the error message cases
> and enums is the same ( i.e. 'POSTMASTER_RECOVERY_SHUTDOWN' before
> 'POSTMASTER_FAILED' in enum )
Agreed, fixed in V3

>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft



-- 
Regards
Junwang Zhao


v3-0001-PITR-shutdown-should-not-report-error-by-pg_ctl.patch
Description: Binary data


Re: Change GUC hashtable to use simplehash?

2024-01-07 Thread Junwang Zhao
Hi John,

On Mon, Jan 8, 2024 at 10:44 AM John Naylor  wrote:
>
> On Sat, Jan 6, 2024 at 9:01 AM jian he  wrote:
> >
> > latency average = 147.782 ms
> > select * from bench_cstring_hash_unaligned(10);
> > latency average = 101.179 ms
> > select * from bench_cstring_hash_aligned(10);
> > latency average = 101.219 ms
>
> Thanks for testing again! This looks closer to my results. It doesn't
> show improvement for the aligned case, but it's not worse, either.
>
> There is still some polishing to be done, mostly on comments/examples,
> but I think it's mostly there. I'll return to it by next week.
>
>

+ * Portions Copyright (c) 2018-2023, PostgreSQL Global Development Group

A kind reminder, it's already 2024 :)

I'm also curious why the 2018, is there any convention for that?

-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-29 Thread Junwang Zhao
On Fri, Dec 29, 2023 at 6:00 PM Andrey M. Borodin  wrote:
>
>
>
> > On 28 Dec 2023, at 21:02, Junwang Zhao  wrote:
> >
> > Seems V5~V17 doesn't work as expected for Nikolay's case:
> >
>
> Yeah, that's a problem.
> > So I propose the following change, what do you think?
> This breaks COMMIT AND CHAIN.
>
> PFA v18: I've added a test for Nik's case and for COMMIT AND CHAIN. Now we 
> need to fix stuff to pass this tests (I've crafted output).
> We also need test for patchset step "Try to enable transaction_timeout before 
> next command".
>
> Thanks!

After exploring the code, I found scheduling the timeout in
`StartTransaction` might be a reasonable idea, all the chain
commands will call this function.

What concerns me is that it is also called by StartParallelWorkerTransaction,
I'm not sure if we should enable this timeout for parallel execution.

Thought?

>
>
> Best regards, Andrey Borodin.



-- 
Regards
Junwang Zhao


v19-0002-Use-test-from-Li-Japin-Also-add-tests-for-multip.patch
Description: Binary data


v19-0003-Try-to-enable-transaction_timeout-before-next-co.patch
Description: Binary data


v19-0001-Introduce-transaction_timeout.patch
Description: Binary data


v19-0004-fix-reschedule-timeout-for-each-commmand.patch
Description: Binary data


Re: Transaction timeout

2023-12-28 Thread Junwang Zhao
Hey Andrey,

On Sun, Dec 24, 2023 at 1:14 AM Andrey M. Borodin  wrote:
>
>
>
> > On 22 Dec 2023, at 10:39, Japin Li  wrote:
> >
> >
> > I try to split the test for transaction timeout, and all passed on my CI 
> > [1].
>
>
> I like the refactoring you did in timeout.spec. I thought it is impossible, 
> because permutations would try to reinitialize FATALed sessions. But, 
> obviously, tests work the way you refactored it.
> However I don't think ignoring test failures on Windows without understanding 
> root cause is a good idea.
> Let's get back to v13 version of tests, understand why it failed, apply your 
> test refactorings afterwards. BTW are you sure that v14 refactorings are 
> functional equivalent of v13 tests?
>
> To go with this plan I attach slightly modified version of v13 tests in v16 
> patchset. The only change is timing in "sleep_there" step. I suspect that 
> failure was induced by more coarse timer granularity on Windows. Tests were 
> giving only 9 milliseconds for a timeout to entirely wipe away backend from 
> pg_stat_activity. This saves testing time, but might induce false positive 
> test flaps. So I've raised wait times to 100ms. This seems too much, but I do 
> not have other ideas how to ensure tests stability. Maybe 50ms would be 
> enough, I do not know. Isolation runs ~50 seconds now. I'm tempted to say 
> that 200ms for timeouts worth it.
>
>
> As to 2nd step "Try to enable transaction_timeout during transaction", I 
> think this makes sense. But if we are doing so, shouldn't we also allow to 
> enable idle_in_transaction timeout in a same manner? Currently we only allow 
> to disable other timeouts... Also, if we are already in transaction, 
> shouldn't we also subtract current transaction span from timeout?
> I think making this functionality as another step of the patchset was a good 
> idea.
>
> Thanks!
Seems V5~V17 doesn't work as expected for Nikolay's case:

postgres=# set transaction_timeout to '2s';
SET
postgres=# begin; select pg_sleep(1); select pg_sleep(1); select
pg_sleep(1); select pg_sleep(1); select pg_sleep(1); commit;
BEGIN

The reason for this seems the timer has been refreshed for each
command, xact_started along can not indicate it's a new
transaction or not, there is a TransactionState contains some
infos.

So I propose the following change, what do you think?

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a2611cf8e6..cffd2c44d0 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2746,7 +2746,7 @@ start_xact_command(void)
StartTransactionCommand();

/* Schedule or reschedule transaction timeout */
-   if (TransactionTimeout > 0)
+   if (TransactionTimeout > 0 &&
!get_timeout_active(TRANSACTION_TIMEOUT))
        enable_timeout_after(TRANSACTION_TIMEOUT,
TransactionTimeout);

xact_started = true;

>
>
> Best regards, Andrey Borodin.



-- 
Regards
Junwang Zhao




Re: Proposal to include --exclude-extension Flag in pg_dump

2023-12-25 Thread Junwang Zhao
Hi

On Mon, Dec 25, 2023 at 6:22 PM Ayush Vatsa  wrote:
>
> Added a CF entry for the same https://commitfest.postgresql.org/46/4721/
>
> Regards
> Ayush Vatsa
> Amazon Web Services (AWS)
>
> On Mon, 25 Dec 2023 at 15:48, Ayush Vatsa  wrote:
>>
>> Hi PostgreSQL Community,
>> Recently I have been working on pg_dump regarding my project and wanted to 
>> exclude an extension from the dump generated. I wonder why it doesn't have 
>> --exclude-extension type of support whereas --extension exists!
>> Since I needed that support, I took the initiative to contribute to the 
>> community by adding the --exclude-extension flag.
>> Attached is the patch for the same. Looking forward to your feedback.
>>
>> Regards
>> Ayush Vatsa
>> Amazon Web services (AWS)

  printf(_("  -e, --extension=PATTERN  dump the specified
extension(s) only\n"));
+ printf(_("  --exclude-extension=PATTERN  do NOT dump the specified
extension(s)\n"));
  printf(_("  -E, --encoding=ENCODING  dump the data in encoding
ENCODING\n"));

long options should not mess with short options, does the following
make sense to you?

 printf(_(" --enable-row-security enable row security (dump only
content user has\n"
 "access to)\n"));
+ printf(_("  --exclude-extension=PATTERN  do NOT dump the specified
extension(s)\n"));
 printf(_(" --exclude-table-and-children=PATTERN\n"

-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-24 Thread Junwang Zhao
Hi Andrey,

On Sun, Dec 24, 2023 at 1:14 AM Andrey M. Borodin  wrote:
>
>
>
> > On 22 Dec 2023, at 10:39, Japin Li  wrote:
> >
> >
> > I try to split the test for transaction timeout, and all passed on my CI 
> > [1].
>
>
> I like the refactoring you did in timeout.spec. I thought it is impossible, 
> because permutations would try to reinitialize FATALed sessions. But, 
> obviously, tests work the way you refactored it.
> However I don't think ignoring test failures on Windows without understanding 
> root cause is a good idea.
> Let's get back to v13 version of tests, understand why it failed, apply your 
> test refactorings afterwards. BTW are you sure that v14 refactorings are 
> functional equivalent of v13 tests?
>
> To go with this plan I attach slightly modified version of v13 tests in v16 
> patchset. The only change is timing in "sleep_there" step. I suspect that 
> failure was induced by more coarse timer granularity on Windows. Tests were 
> giving only 9 milliseconds for a timeout to entirely wipe away backend from 
> pg_stat_activity. This saves testing time, but might induce false positive 
> test flaps. So I've raised wait times to 100ms. This seems too much, but I do 
> not have other ideas how to ensure tests stability. Maybe 50ms would be 
> enough, I do not know. Isolation runs ~50 seconds now. I'm tempted to say 
> that 200ms for timeouts worth it.
>
>
> As to 2nd step "Try to enable transaction_timeout during transaction", I 
> think this makes sense. But if we are doing so, shouldn't we also allow to 
> enable idle_in_transaction timeout in a same manner? Currently we only allow 
> to disable other timeouts... Also, if we are already in transaction, 
> shouldn't we also subtract current transaction span from timeout?
idle_in_transaction_session_timeout is already the behavior Japin suggested,
it is enabled before backend sends *read for query* to client.

> I think making this functionality as another step of the patchset was a good 
> idea.
>
> Thanks!
>
>
> Best regards, Andrey Borodin.



-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-22 Thread Junwang Zhao
On Sat, Dec 23, 2023 at 11:17 AM Japin Li  wrote:
>
> a
> On Sat, 23 Dec 2023 at 10:40, Japin Li  wrote:
> > On Sat, 23 Dec 2023 at 08:32, Japin Li  wrote:
> >> On Fri, 22 Dec 2023 at 23:30, Junwang Zhao  wrote:
> >>> On Fri, Dec 22, 2023 at 10:44 PM Japin Li  wrote:
> >>>>
> >>>>
> >>>> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao  wrote:
> >>>> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li  wrote:
> >>>> >> I try to set idle_in_transaction_session_timeout after begin 
> >>>> >> transaction,
> >>>> >> it changes immediately, so I think transaction_timeout should also be 
> >>>> >> take
> >>>> >> immediately.
> >>>> >
> >>>> > Ah, right, idle_in_transaction_session_timeout is set after the set
> >>>> > command finishes and before the backend send *ready for query*
> >>>> > to the client, so the value of the GUC is already set before
> >>>> > next command.
> >>>> >
> >>>>
> >>>> I mean, is it possible to set transaction_timeout before next comand?
> >>>>
> >>> Yeah, it's possible, set transaction_timeout in the when it first
> >>> goes into *idle in transaction* mode, see the attached files.
> >>>
> >>
> >> Thanks for updating the patch, LGTM.
> >
> > Sorry for the noise!
> >
> > Read the previous threads, I find why the author enable transaction_timeout
> > in start_xact_command().
> >
> > The v15 patch cannot handle COMMIT AND CHAIN, see [1]. For example:
> >
> > SET transaction_timeout TO '2s'; BEGIN; SELECT 1, pg_sleep(1); COMMIT AND 
> > CHAIN; SELECT 2, pg_sleep(1); COMMIT;
> >
> > The transaction_timeout do not reset when executing COMMIT AND CHAIN.
> >
> > [1] 
> > https://www.postgresql.org/message-id/a906dea1-76a1-4f26-76c5-a7efad3ef5b8%40oss.nttdata.com
>
> Attach v16 to solve this. Any suggestions?

I've checked this with *COMMIT AND CHAIN* and *ABORT AND CHAIN*,
both work as expected. Thanks for the update.

>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.
>


-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-22 Thread Junwang Zhao
On Sat, Dec 23, 2023 at 10:40 AM Japin Li  wrote:
>
>
> On Sat, 23 Dec 2023 at 08:32, Japin Li  wrote:
> > On Fri, 22 Dec 2023 at 23:30, Junwang Zhao  wrote:
> >> On Fri, Dec 22, 2023 at 10:44 PM Japin Li  wrote:
> >>>
> >>>
> >>> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao  wrote:
> >>> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li  wrote:
> >>> >> I try to set idle_in_transaction_session_timeout after begin 
> >>> >> transaction,
> >>> >> it changes immediately, so I think transaction_timeout should also be 
> >>> >> take
> >>> >> immediately.
> >>> >
> >>> > Ah, right, idle_in_transaction_session_timeout is set after the set
> >>> > command finishes and before the backend send *ready for query*
> >>> > to the client, so the value of the GUC is already set before
> >>> > next command.
> >>> >
> >>>
> >>> I mean, is it possible to set transaction_timeout before next comand?
> >>>
> >> Yeah, it's possible, set transaction_timeout in the when it first
> >> goes into *idle in transaction* mode, see the attached files.
> >>
> >
> > Thanks for updating the patch, LGTM.
>
> Sorry for the noise!
>
> Read the previous threads, I find why the author enable transaction_timeout
> in start_xact_command().
>
> The v15 patch cannot handle COMMIT AND CHAIN, see [1]. For example:

I didn't read the previous threads, sorry for that, let's stick to v14.

>
> SET transaction_timeout TO '2s'; BEGIN; SELECT 1, pg_sleep(1); COMMIT AND 
> CHAIN; SELECT 2, pg_sleep(1); COMMIT;
>
> The transaction_timeout do not reset when executing COMMIT AND CHAIN.
>
> [1] 
> https://www.postgresql.org/message-id/a906dea1-76a1-4f26-76c5-a7efad3ef5b8%40oss.nttdata.com
>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.



-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-22 Thread Junwang Zhao
On Fri, Dec 22, 2023 at 10:44 PM Japin Li  wrote:
>
>
> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao  wrote:
> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li  wrote:
> >> I try to set idle_in_transaction_session_timeout after begin transaction,
> >> it changes immediately, so I think transaction_timeout should also be take
> >> immediately.
> >
> > Ah, right, idle_in_transaction_session_timeout is set after the set
> > command finishes and before the backend send *ready for query*
> > to the client, so the value of the GUC is already set before
> > next command.
> >
>
> I mean, is it possible to set transaction_timeout before next comand?
>
Yeah, it's possible, set transaction_timeout in the when it first
goes into *idle in transaction* mode, see the attached files.

>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.



-- 
Regards
Junwang Zhao


v15-0001-Introduce-transaction_timeout.patch
Description: Binary data


v15-0002-set-transaction_timeout-before-next-command.patch
Description: Binary data


Re: Transaction timeout

2023-12-22 Thread Junwang Zhao
On Fri, Dec 22, 2023 at 10:25 PM Japin Li  wrote:
>
>
> On Fri, 22 Dec 2023 at 20:29, Junwang Zhao  wrote:
> > On Fri, Dec 22, 2023 at 1:39 PM Japin Li  wrote:
> >>
> >>
> >> On Tue, 19 Dec 2023 at 22:06, Japin Li  wrote:
> >> > On Tue, 19 Dec 2023 at 18:27, Andrey M. Borodin  
> >> > wrote:
> >> >>> On 19 Dec 2023, at 13:26, Andrey M. Borodin  
> >> >>> wrote:
> >> >>>
> >> >>> I don’t have Windows machine, so I hope CF bot will pick this.
> >> >>
> >> >> I used Github CI to produce version of tests that seems to be is stable 
> >> >> on Windows.
> >> >
> >> > It still failed on Windows Server 2019 [1].
> >> >
> >> > diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out 
> >> > C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out
> >> > --- C:/cirrus/src/test/isolation/expected/timeouts.out2023-12-19 
> >> > 10:34:30.354721100 +
> >> > +++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out  
> >> > 2023-12-19 10:38:25.877981600 +
> >> > @@ -100,7 +100,7 @@
> >> >  step stt3_check_stt2: SELECT count(*) FROM pg_stat_activity WHERE 
> >> > application_name = 'isolation/timeouts/stt2'
> >> >  count
> >> >  -
> >> > -0
> >> > +1
> >> >  (1 row)
> >> >
> >> >  step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET 
> >> > statement_timeout = '10s'; SET lock_timeout = '10s'; SET 
> >> > transaction_timeout = '10s';
> >> >
> >> > [1] 
> >> > https://api.cirrus-ci.com/v1/artifact/task/4707530400595968/testrun/build/testrun/isolation/isolation/regression.diffs
> >>
> >> Hi,
> >>
> >> I try to split the test for transaction timeout, and all passed on my CI 
> >> [1].
> >>
> >> OTOH, I find if I set transaction_timeout in a transaction, it will not 
> >> take
> >> effect immediately.  For example:
> >>
> >> [local]:2049802 postgres=# BEGIN;
> >> BEGIN
> >> [local]:2049802 postgres=*# SET transaction_timeout TO '1s';
> > when this execute, TransactionTimeout is still 0, this command will
> > not set timeout
> >> SET
> >> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;  -- wait 
> >> 10s
> > when this command get execute, start_xact_command will enable the timer
>
> Thanks for your exaplantion, got it.
>
> >>relname
> >> --
> >>  pg_statistic
> >> (1 row)
> >>
> >> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;
> >> FATAL:  terminating connection due to transaction timeout
> >> server closed the connection unexpectedly
> >> This probably means the server terminated abnormally
> >> before or while processing the request.
> >> The connection to the server was lost. Attempting reset: Succeeded.
> >>
> >> It looks odd.  Does this is expected? I'm not read all the threads,
> >> am I missing something?
> >
> > I think this is by design, if you debug statement_timeout, it's the same
> > behaviour, the timeout will be set for each command after the second
> > command was called, you just aren't aware of this.
> >
>
> I try to set idle_in_transaction_session_timeout after begin transaction,
> it changes immediately, so I think transaction_timeout should also be take
> immediately.

Ah, right, idle_in_transaction_session_timeout is set after the set
command finishes and before the backend send *ready for query*
to the client, so the value of the GUC is already set before
next command.

I bet you must have checked this ;)

>
> > I doubt people will set this in a transaction.
>
> Maybe not,
>
>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.



-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-22 Thread Junwang Zhao
On Fri, Dec 22, 2023 at 1:39 PM Japin Li  wrote:
>
>
> On Tue, 19 Dec 2023 at 22:06, Japin Li  wrote:
> > On Tue, 19 Dec 2023 at 18:27, Andrey M. Borodin  
> > wrote:
> >>> On 19 Dec 2023, at 13:26, Andrey M. Borodin  wrote:
> >>>
> >>> I don’t have Windows machine, so I hope CF bot will pick this.
> >>
> >> I used Github CI to produce version of tests that seems to be is stable on 
> >> Windows.
> >
> > It still failed on Windows Server 2019 [1].
> >
> > diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out 
> > C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out
> > --- C:/cirrus/src/test/isolation/expected/timeouts.out2023-12-19 
> > 10:34:30.354721100 +
> > +++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out  
> > 2023-12-19 10:38:25.877981600 +
> > @@ -100,7 +100,7 @@
> >  step stt3_check_stt2: SELECT count(*) FROM pg_stat_activity WHERE 
> > application_name = 'isolation/timeouts/stt2'
> >  count
> >  -
> > -0
> > +1
> >  (1 row)
> >
> >  step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET 
> > statement_timeout = '10s'; SET lock_timeout = '10s'; SET 
> > transaction_timeout = '10s';
> >
> > [1] 
> > https://api.cirrus-ci.com/v1/artifact/task/4707530400595968/testrun/build/testrun/isolation/isolation/regression.diffs
>
> Hi,
>
> I try to split the test for transaction timeout, and all passed on my CI [1].
>
> OTOH, I find if I set transaction_timeout in a transaction, it will not take
> effect immediately.  For example:
>
> [local]:2049802 postgres=# BEGIN;
> BEGIN
> [local]:2049802 postgres=*# SET transaction_timeout TO '1s';
when this execute, TransactionTimeout is still 0, this command will
not set timeout
> SET
> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;  -- wait 10s
when this command get execute, start_xact_command will enable the timer
>relname
> --
>  pg_statistic
> (1 row)
>
> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;
> FATAL:  terminating connection due to transaction timeout
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> It looks odd.  Does this is expected? I'm not read all the threads,
> am I missing something?

I think this is by design, if you debug statement_timeout, it's the same
behaviour, the timeout will be set for each command after the second
command was called, you just aren't aware of this.

I doubt people will set this in a transaction.
>
> [1] https://cirrus-ci.com/build/6574686130143232
>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.
>


-- 
Regards
Junwang Zhao




Re: [meson] expose buildtype debug/optimization info to pg_config

2023-12-22 Thread Junwang Zhao
Hi,

On Fri, Dec 15, 2023 at 10:20 PM Andres Freund  wrote:
>
> Hi,
>
> On 2023-12-14 17:24:58 +0800, Junwang Zhao wrote:
> > On Thu, Dec 14, 2023 at 4:50 PM Peter Eisentraut  
> > wrote:
> > >
> > > On 12.12.23 11:40, Junwang Zhao wrote:
> > > > build system using configure set VAL_CFLAGS with debug and
> > > > optimization flags, so pg_config will show these infos. Some
> > > > extensions depend on the mechanism.
> > > >
> > > > This patch exposes these flags with a typo fixed together.
> > >
> > > I have committed the typo fix.
> > >
> > > But I would like to learn more about the requirements of extensions in
> > > this area.  This seems a bit suspicious to me.
> >
> > This is what I found when building citus against an installation
> > of meson debug build pg instance, since the CFLAGS doesn't
> > contain -g flag, the binary doesn't include the debug information,
> > which is different behavior from configure building system.
>
> Hm. I'm not sure it's the right call to make extensions build the same way as
> the main postgres install with regard to optimization and debug info. So I
> feel a bit hesitant around generating -g and particularly -Ox. But it's
> historically what we've done...
>
> If we want to do so, I think this should not check buildtype, but debug.

I'm confused which *debug* do you mean, can you be more specific?
>
>
> > Another issue I found is that some C++
> > extensions(ajust/parquet_fdw for example) don't build against
> > the meson generated pgxs.mk, since it doesn't set the CXX
> > command. CXX is only set when llvm option is enabled, which
> > is different from old building system.
>
> I wanted to skip the C++ tests when we don't need C++, because it makes
> configure take longer. But I could be convinced that we should always at least
> determine the C++ compiler for Makefile.global.

The first idea that came to my mind is using the *project* command
to set [`c`, `cpp`], but this might be a little bit confusing for somebody.

Then I tried another way by adding a 'pgxscpp' option to let the user
choose whether he will set the C++ compiler for Makefile.global.
It works but may not be an ideal way, see the attached.


>
> Greetings,
>
> Andres Freund



-- 
Regards
Junwang Zhao


0001-PGXS-determine-C-compiler-for-Makefile.global.patch
Description: Binary data


Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-21 Thread Junwang Zhao
On Thu, Dec 21, 2023 at 5:35 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Mon, 11 Dec 2023 23:31:29 +0900,
>   Masahiko Sawada  wrote:
>
> > I've sketched the above idea including a test module in
> > src/test/module/test_copy_format, based on v2 patch. It's not splitted
> > and is dirty so just for discussion.
>
> I implemented a sample COPY TO handler for Apache Arrow that
> supports only integer and text.
>
> I needed to extend the patch:
>
> 1. Add an opaque space for custom COPY TO handler
>* Add CopyToState{Get,Set}Opaque()
>
> https://github.com/kou/postgres/commit/5a610b6a066243f971e029432db67152cfe5e944
>
> 2. Export CopyToState::attnumlist
>* Add CopyToStateGetAttNumList()
>
> https://github.com/kou/postgres/commit/15fcba8b4e95afa86edb3f677a7bdb1acb1e7688
>
> 3. Export CopySend*()
>* Rename CopySend*() to CopyToStateSend*() and export them
>* Exception: CopySendEndOfRow() to CopyToStateFlush() because
>  it just flushes the internal buffer now.
>
> https://github.com/kou/postgres/commit/289a5640135bde6733a1b8e2c412221ad522901e
>
I guess the purpose of these helpers is to avoid expose CopyToState to
copy.h, but I
think expose CopyToState to user might make life easier, users might want to use
the memory contexts of the structure (though I agree not all the
fields are necessary
for extension handers).

> The attached patch is based on the Sawada-san's patch and
> includes the above changes. Note that this patch is also
> dirty so just for discussion.
>
> My suggestions from this experience:
>
> 1. Split COPY handler to COPY TO handler and COPY FROM handler
>
>* CopyFormatRoutine is a bit tricky. An extension needs
>  to create a CopyFormatRoutine node and
>  a CopyToFormatRoutine node.
>
>* If we just require "copy_to_${FORMAT}(internal)"
>  function and "copy_from_${FORMAT}(internal)" function,
>  we can remove the tricky approach. And it also avoid
>  name collisions with other handler such as tablesample
>  handler.
>  See also:
>  
> https://www.postgresql.org/message-id/flat/20231214.184414.2179134502876898942.kou%40clear-code.com#af71f364d0a9f5c144e45b447e5c16c9
>
> 2. Need an opaque space like IndexScanDesc::opaque does
>
>* A custom COPY TO handler needs to keep its data

I once thought users might want to parse their own options, maybe this
is a use case for this opaque space.

For the name, I thought private_data might be a better candidate than
opaque, but I do not insist.
>
> 3. Export CopySend*()
>
>* If we like minimum API, we just need to export
>  CopySendData() and CopySendEndOfRow(). But
>  CopySend{String,Char,Int32,Int16}() will be convenient
>  custom COPY TO handlers. (A custom COPY TO handler for
>  Apache Arrow doesn't need them.)

Do you use the arrow library to control the memory? Is there a way that
we can let the arrow use postgres' memory context? I'm not sure this
is necessary, just raise the question for discussion.
>
> Questions:
>
> 1. What value should be used for "format" in
>PgMsg_CopyOutResponse message?
>
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/copyto.c;h=c66a047c4a79cc614784610f385f1cd0935350f3;hb=9ca6e7b9411e36488ef539a2c1f6846ac92a7072#l144
>
>It's 1 for binary format and 0 for text/csv format.
>
>Should we make it customizable by custom COPY TO handler?
>If so, what value should be used for this?
>
> 2. Do we need more tries for design discussion for the first
>    implementation? If we need, what should we try?
>
>
> Thanks,
> --
> kou

+PG_FUNCTION_INFO_V1(copy_testfmt_handler);
+Datum
+copy_testfmt_handler(PG_FUNCTION_ARGS)
+{
+ bool is_from = PG_GETARG_BOOL(0);
+ CopyFormatRoutine *cp = makeNode(CopyFormatRoutine);;
+

extra semicolon.

-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-19 Thread Junwang Zhao
On Wed, Dec 20, 2023 at 9:58 AM Thomas wen
 wrote:
>
> Hi Junwang Zhao
>  #should we invalidate lock_timeout? Or maybe just document this.
> I think you mean when lock_time is greater than trasaction-time invalidate 
> lock_timeout or  needs to be logged ?
>
I mean the interleaving of the gucs, which is lock_timeout and the new
introduced transaction_timeout,
if lock_timeout >= transaction_timeout, seems no need to enable lock_timeout.
>
>
>
> Best whish
> ____
> 发件人: Junwang Zhao 
> 发送时间: 2023年12月20日 9:48
> 收件人: Andrey M. Borodin 
> 抄送: Japin Li ; 邱宇航 ; Fujii Masao 
> ; Andrey Borodin ; Andres 
> Freund ; Michael Paquier ; 
> Nikolay Samokhvalov ; pgsql-hackers 
> ; pgsql-hackers@lists.postgresql.org 
> 
> 主题: Re: Transaction timeout
>
> On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao  wrote:
> >
> > On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin  
> > wrote:
> > >
> > >
> > >
> > > > On 19 Dec 2023, at 13:26, Andrey M. Borodin  
> > > > wrote:
> > > >
> > > > I don’t have Windows machine, so I hope CF bot will pick this.
> > >
> > > I used Github CI to produce version of tests that seems to be is stable 
> > > on Windows.
> > > Sorry for the noise.
> > >
> > >
> > > Best regards, Andrey Borodin.
> >
> > +   
> > +If transaction_timeout is shorter than
> > +idle_in_transaction_session_timeout or
> > statement_timeout
> > +transaction_timeout will invalidate longer 
> > timeout.
> > +   
> >
> > When transaction_timeout is *equal* to idle_in_transaction_session_timeout
> > or statement_timeout, idle_in_transaction_session_timeout and 
> > statement_timeout
> > will also be invalidated, the logic in the code seems right, though
> > this document
> > is a little bit inaccurate.
> >
>
> Unlike statement_timeout, this timeout can only 
> occur
> while waiting for locks.  Note that if
> statement_timeout
> is nonzero, it is rather pointless to set
> lock_timeout to
> the same or larger value, since the statement timeout would always
> trigger first.  If log_min_error_statement is set 
> to
> ERROR or lower, the statement that timed out will 
> be
>     logged.
>    
>
> There is a note about statement_timeout and lock_timeout, set both
> and lock_timeout >= statement_timeout is pointless, but this logic seems not
> implemented in the code. I am wondering if lock_timeout >= 
> transaction_timeout,
> should we invalidate lock_timeout? Or maybe just document this.
>
> > --
> > Regards
> > Junwang Zhao
>
>
>
> --
> Regards
> Junwang Zhao
>
>


-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-19 Thread Junwang Zhao
On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao  wrote:
>
> On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin  
> wrote:
> >
> >
> >
> > > On 19 Dec 2023, at 13:26, Andrey M. Borodin  wrote:
> > >
> > > I don’t have Windows machine, so I hope CF bot will pick this.
> >
> > I used Github CI to produce version of tests that seems to be is stable on 
> > Windows.
> > Sorry for the noise.
> >
> >
> > Best regards, Andrey Borodin.
>
> +   
> +If transaction_timeout is shorter than
> +idle_in_transaction_session_timeout or
> statement_timeout
> +transaction_timeout will invalidate longer 
> timeout.
> +   
>
> When transaction_timeout is *equal* to idle_in_transaction_session_timeout
> or statement_timeout, idle_in_transaction_session_timeout and 
> statement_timeout
> will also be invalidated, the logic in the code seems right, though
> this document
> is a little bit inaccurate.
>
   
Unlike statement_timeout, this timeout can only occur
while waiting for locks.  Note that if
statement_timeout
is nonzero, it is rather pointless to set
lock_timeout to
the same or larger value, since the statement timeout would always
trigger first.  If log_min_error_statement is set to
ERROR or lower, the statement that timed out will be
logged.
   

There is a note about statement_timeout and lock_timeout, set both
and lock_timeout >= statement_timeout is pointless, but this logic seems not
implemented in the code. I am wondering if lock_timeout >= transaction_timeout,
should we invalidate lock_timeout? Or maybe just document this.

> --
> Regards
> Junwang Zhao



-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-19 Thread Junwang Zhao
On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin  wrote:
>
>
>
> > On 19 Dec 2023, at 13:26, Andrey M. Borodin  wrote:
> >
> > I don’t have Windows machine, so I hope CF bot will pick this.
>
> I used Github CI to produce version of tests that seems to be is stable on 
> Windows.
> Sorry for the noise.
>
>
> Best regards, Andrey Borodin.

+   
+If transaction_timeout is shorter than
+idle_in_transaction_session_timeout or
statement_timeout
+transaction_timeout will invalidate longer timeout.
+   

When transaction_timeout is *equal* to idle_in_transaction_session_timeout
or statement_timeout, idle_in_transaction_session_timeout and statement_timeout
will also be invalidated, the logic in the code seems right, though
this document
is a little bit inaccurate.

-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-14 Thread Junwang Zhao
On Fri, Dec 15, 2023 at 12:45 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Fri, 15 Dec 2023 11:27:30 +0800,
>   Junwang Zhao  wrote:
>
> >> > Adding a prefix or suffix would be one option but to give extensions
> >> > more flexibility, another option would be to support format = 'custom'
> >> > and add the "handler" option to specify a copy handler function name
> >> > to call. For example, COPY ... FROM ... WITH (FORMAT = 'custom',
> >> > HANDLER = 'arrow_copy_handler').
> >>
> > I like the prefix/suffix idea, easy to implement. *custom* is not a FORMAT,
> > and user has to know the name of the specific handler names, not
> > intuitive.
>
> Ah! I misunderstood this idea. "custom" is the special
> format to use "HANDLER". I thought that we can use it like
>
>(FORMAT = 'arrow', HANDLER = 'arrow_copy_handler_impl1')
>
> and
>
>(FORMAT = 'arrow', HANDLER = 'arrow_copy_handler_impl2')
>
> .
>
> >> Interesting. If we use this option, users can choose an COPY
> >> FORMAT implementation they like from multiple
> >> implementations. For example, a developer may implement a
> >> COPY FROM FORMAT = 'json' handler with PostgreSQL's JSON
> >> related API and another developer may implement a handler
> >> with simdjson[1] which is a fast JSON parser. Users can
> >> choose whichever they like.
> > Not sure about this, why not move Json copy handler to contrib
> > as an example for others, any extensions share the same format
> > function name and just install one? No bound would implement
> > another CSV or TEXT copy handler IMHO.
>
> I should have used a different format not JSON as an example
> for easy to understand. I just wanted to say that extension
> developers can implement another implementation without
> conflicting another implementation.

Yeah, I can see the value of the HANDLER option now. The possibility
of two extensions for the same format using same hanlder name should
be rare I guess ;)
>
>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-14 Thread Junwang Zhao
On Fri, Dec 15, 2023 at 8:53 AM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Fri, 15 Dec 2023 05:19:43 +0900,
>   Masahiko Sawada  wrote:
>
> > To avoid collisions, extensions can be created in a
> > different schema than public.
>
> Thanks. I didn't notice it.
>
> > And note that built-in format copy handler doesn't need to
> > declare its handler function.
>
> Right. I know it.
>
> > Adding a prefix or suffix would be one option but to give extensions
> > more flexibility, another option would be to support format = 'custom'
> > and add the "handler" option to specify a copy handler function name
> > to call. For example, COPY ... FROM ... WITH (FORMAT = 'custom',
> > HANDLER = 'arrow_copy_handler').
>
I like the prefix/suffix idea, easy to implement. *custom* is not a FORMAT,
and user has to know the name of the specific handler names, not
intuitive.

> Interesting. If we use this option, users can choose an COPY
> FORMAT implementation they like from multiple
> implementations. For example, a developer may implement a
> COPY FROM FORMAT = 'json' handler with PostgreSQL's JSON
> related API and another developer may implement a handler
> with simdjson[1] which is a fast JSON parser. Users can
> choose whichever they like.
Not sure about this, why not move Json copy handler to contrib
as an example for others, any extensions share the same format
function name and just install one? No bound would implement
another CSV or TEXT copy handler IMHO.
>
> But specifying HANDLER = '...' explicitly is a bit
> inconvenient. Because only one handler will be installed in
> most use cases. In the case, users don't need to choose one
> handler.
>
> If we choose this option, it may be better that we also
> provide a mechanism that can work without HANDLER. Searching
> a function by name like tablesample method does is an option.
>
>
> [1]: https://github.com/simdjson/simdjson
>
>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: [meson] expose buildtype debug/optimization info to pg_config

2023-12-14 Thread Junwang Zhao
Hi Peter,

Thanks for looking into this.

On Thu, Dec 14, 2023 at 4:50 PM Peter Eisentraut  wrote:
>
> On 12.12.23 11:40, Junwang Zhao wrote:
> > build system using configure set VAL_CFLAGS with debug and
> > optimization flags, so pg_config will show these infos. Some
> > extensions depend on the mechanism.
> >
> > This patch exposes these flags with a typo fixed together.
>
> I have committed the typo fix.
>
> But I would like to learn more about the requirements of extensions in
> this area.  This seems a bit suspicious to me.

This is what I found when building citus against an installation
of meson debug build pg instance, since the CFLAGS doesn't
contain -g flag, the binary doesn't include the debug information,
which is different behavior from configure building system.

Another issue I found is that some C++
extensions(ajust/parquet_fdw for example) don't build against
the meson generated pgxs.mk, since it doesn't set the CXX
command. CXX is only set when llvm option is enabled, which
is different from old building system.

I don't insist we make Meson the same behaviour with old building
system, I just think the issues I raised might stop developers try
the fancy new building system. And the fix I post might not be
ideal, you and Andres might have better solutions.

>


-- 
Regards
Junwang Zhao




Re: Simplify newNode()

2023-12-13 Thread Junwang Zhao
On Thu, Dec 14, 2023 at 9:34 AM Zhang Mingli  wrote:
>
> Hi,
>
> LGTM.
>
> + Assert(size >= sizeof(Node)); /* need the tag, at least */
> + result = (Node *) palloc0fast(size);
> + result->type = tag;
>
> + return result;
> +}
>
> How about moving the comments /* need the tag, at least */ after result->type 
> = tag; by the way?

I don't think so, the comment has the meaning of the requested size
should at least the size
of Node, which contains just a NodeTag.

typedef struct Node
{
NodeTag type;
} Node;

>
>
>
> Zhang Mingli
> www.hashdata.xyz



-- 
Regards
Junwang Zhao




[meson] expose buildtype debug/optimization info to pg_config

2023-12-12 Thread Junwang Zhao
build system using configure set VAL_CFLAGS with debug and
optimization flags, so pg_config will show these infos. Some
extensions depend on the mechanism.

This patch exposes these flags with a typo fixed together.

-- 
Regards
Junwang Zhao


0001-meson-expose-buildtype-debug-optimization-info-to-pg.patch
Description: Binary data


Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-11 Thread Junwang Zhao
On Mon, Dec 11, 2023 at 10:32 PM Masahiko Sawada  wrote:
>
> On Mon, Dec 11, 2023 at 7:19 PM Michael Paquier  wrote:
> >
> > On Mon, Dec 11, 2023 at 10:57:15AM +0900, Masahiko Sawada wrote:
> > > IIUC we cannot create two same name functions with the same arguments
> > > but a different return value type in the first place. It seems to me
> > > to be an overkill to change such a design.
> >
> > Agreed to not touch the logictics of LookupFuncName() for the sake of
> > this thread.  I have not checked the SQL specification, but I recall
> > that there are a few assumptions from the spec embedded in the lookup
> > logic particularly when it comes to specify a procedure name without
> > arguments.
> >
> > > Another idea is to encapsulate copy_to/from_handler by a super class
> > > like copy_handler. The handler function is called with an argument,
> > > say copyto, and returns copy_handler encapsulating either
> > > copy_to/from_handler depending on the argument.
> >
> > Yep, that's possible as well and can work as a cross-check between the
> > argument and the NodeTag assigned to the handler structure returned by
> > the function.
> >
> > At the end, the final result of the patch should IMO include:
> > - Documentation about how one can register a custom copy_handler.
> > - Something in src/test/modules/, minimalistic still useful that can
> > be used as a template when one wants to implement their own handler.
> > The documentation should mention about this module.
> > - No need for SQL functions for all the in-core handlers: let's just
> > return pointers to them based on the options given.
>
> Agreed.
>
> > It would be probably cleaner to split the patch so as the code is
> > refactored and evaluated with the in-core handlers first, and then
> > extended with the pluggable facilities and the function lookups.
>
> Agreed.
>
> I've sketched the above idea including a test module in
> src/test/module/test_copy_format, based on v2 patch. It's not splitted
> and is dirty so just for discussion.
>
The test_copy_format extension doesn't use the fields of CopyToState and
CopyFromState in this patch, I think we should move CopyFromStateData
and CopyToStateData to commands/copy.h, what do you think?

The framework in the patch LGTM.

>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-11 Thread Junwang Zhao
On Sat, Dec 9, 2023 at 7:38 PM Hannu Krosing  wrote:
>
> Hi Junwang
>
> Please also see my presentation slides from last years PostgreSQL
> Conference in Berlin (attached)

I read through the slides, really promising ideas, it's will be great
if we can get there at last.

>
> The main Idea is to make not just "format", but also "transport" and
> "stream processing" extendable via virtual function tables.
The code is really coupled, it is not easy to do all of these in one round,
it will be great if you have a POC patch.

>
> Btw, will any of you here be in Prague next week ?
> Would be a good opportunity to discuss this in person.
Sorry, no.

>
>
> Best Regards
> Hannu
>
> On Sat, Dec 9, 2023 at 9:39 AM Junwang Zhao  wrote:
> >
> > On Sat, Dec 9, 2023 at 10:43 AM Hayato Kuroda (Fujitsu)
> >  wrote:
> > >
> > > Dear Junagn, Sutou-san,
> > >
> > > Basically I agree your point - improving a extendibility is good.
> > > (I remember that this theme was talked at Japan PostgreSQL conference)
> > > Below are my comments for your patch.
> > >
> > > 01. General
> > >
> > > Just to confirm - is it OK to partially implement APIs? E.g., only COPY 
> > > TO is
> > > available. Currently it seems not to consider a case which is not 
> > > implemented.
> > >
> > For partially implements, we can leave the hook as NULL, and check the NULL
> > at *ProcessCopyOptions* and report error if not supported.
> >
> > > 02. General
> > >
> > > It might be trivial, but could you please clarify how users can extend? 
> > > Is it OK
> > > to do below steps?
> > >
> > > 1. Create a handler function, via CREATE FUNCTION,
> > > 2. Register a handler, via new SQL (CREATE COPY HANDLER),
> > > 3. Specify the added handler as COPY ... FORMAT clause.
> > >
> > My original thought was option 2, but as Michael point, option 1 is
> > the right way
> > to go.
> >
> > > 03. General
> > >
> > > Could you please add document-related tasks to your TODO? I imagined like
> > > fdwhandler.sgml.
> > >
> > > 04. General - copyright
> > >
> > > For newly added files, the below copyright seems sufficient. See 
> > > applyparallelworker.c.
> > >
> > > ```
> > >  * Copyright (c) 2023, PostgreSQL Global Development Group
> > > ```
> > >
> > > 05. src/include/catalog/* files
> > >
> > > IIUC, 8000 or higher OIDs should be used while developing a patch. 
> > > src/include/catalog/unused_oids
> > > would suggest a candidate which you can use.
> >
> > Yeah, I will run renumber_oids.pl at last.
> >
> > >
> > > 06. copy.c
> > >
> > > I felt that we can create files per copying methods, like 
> > > copy_{text|csv|binary}.c,
> > > like indexes.
> > > How do other think?
> >
> > Not sure about this, it seems others have put a lot of effort into
> > splitting TO and From.
> > Also like to hear from others.
> >
> > >
> > > 07. fmt_to_name()
> > >
> > > I'm not sure the function is really needed. Can we follow like 
> > > get_foreign_data_wrapper_oid()
> > > and remove the funciton?
> >
> > I have referenced some code from greenplum, will remove this.
> >
> > >
> > > 08. GetCopyRoutineByName()
> > >
> > > Should we use syscache for searching a catalog?
> > >
> > > 09. CopyToFormatTextSendEndOfRow(), CopyToFormatBinaryStart()
> > >
> > > Comments still refer CopyHandlerOps, whereas it was renamed.
> > >
> > > 10. copy.h
> > >
> > > Per foreign.h and fdwapi.h, should we add a new header file and move some 
> > > APIs?
> > >
> > > 11. copy.h
> > >
> > > ```
> > > -/* These are private in commands/copy[from|to].c */
> > > -typedef struct CopyFromStateData *CopyFromState;
> > > -typedef struct CopyToStateData *CopyToState;
> > > ```
> > >
> > > Are above changes really needed?
> > >
> > > 12. CopyFormatOptions
> > >
> > > Can we remove `bool binary` in future?
> > >
> > > 13. external functions
> > >
> > > ```
> > > +extern void CopyToFormatTextStart(CopyToState cstate, TupleDesc tupDesc);
> > > +extern void CopyToFormatTextOneRow(CopyToState cstate, TupleTableSlot 
> > > *slo

Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-09 Thread Junwang Zhao
On Sun, Dec 10, 2023 at 4:44 AM Sutou Kouhei  wrote:
>
> Hi,
>
> Thanks for reviewing our latest patch!
>
> In
>  
> 
>   "RE: Make COPY format extendable: Extract COPY TO format implementations" 
> on Sat, 9 Dec 2023 02:43:49 +,
>   "Hayato Kuroda (Fujitsu)"  wrote:
>
> > (I remember that this theme was talked at Japan PostgreSQL conference)
>
> Yes. I should have talked to you more at the conference...
> I will do it next time!
>
>
> Can we discuss how to proceed this improvement?
>
> There are 2 approaches for it:
>
> 1. Do the followings concurrently:
>a. Implementing small changes that got a consensus and
>   merge them step-by-step
>   (e.g. We got a consensus that we need to extract the
>   current format related routines.)
>b. Discuss design
>
>(v1-v3 patches use this approach.)
>
> 2. Implement one (large) complete patch set with design
>discussion and merge it
>
>(v4- patches use this approach.)
>
> Which approach is preferred? (Or should we choose another
> approach?)
>
> I thought that 1. is preferred because it will reduce review
> cost. So I chose 1.
>
> If 2. is preferred, I'll use 2. (I'll add more changes to
> the latest patch.)
>
I'm ok with both, and I'd like to work with you for the parquet
extension, excited about this new feature, thanks for bringing
this up.

Forgive me for making so much noise about approach 2, I
just want to hear about more suggestions of the final shape
of this feature.

>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-09 Thread Junwang Zhao
On Sat, Dec 9, 2023 at 10:43 AM Hayato Kuroda (Fujitsu)
 wrote:
>
> Dear Junagn, Sutou-san,
>
> Basically I agree your point - improving a extendibility is good.
> (I remember that this theme was talked at Japan PostgreSQL conference)
> Below are my comments for your patch.
>
> 01. General
>
> Just to confirm - is it OK to partially implement APIs? E.g., only COPY TO is
> available. Currently it seems not to consider a case which is not implemented.
>
For partially implements, we can leave the hook as NULL, and check the NULL
at *ProcessCopyOptions* and report error if not supported.

> 02. General
>
> It might be trivial, but could you please clarify how users can extend? Is it 
> OK
> to do below steps?
>
> 1. Create a handler function, via CREATE FUNCTION,
> 2. Register a handler, via new SQL (CREATE COPY HANDLER),
> 3. Specify the added handler as COPY ... FORMAT clause.
>
My original thought was option 2, but as Michael point, option 1 is
the right way
to go.

> 03. General
>
> Could you please add document-related tasks to your TODO? I imagined like
> fdwhandler.sgml.
>
> 04. General - copyright
>
> For newly added files, the below copyright seems sufficient. See 
> applyparallelworker.c.
>
> ```
>  * Copyright (c) 2023, PostgreSQL Global Development Group
> ```
>
> 05. src/include/catalog/* files
>
> IIUC, 8000 or higher OIDs should be used while developing a patch. 
> src/include/catalog/unused_oids
> would suggest a candidate which you can use.

Yeah, I will run renumber_oids.pl at last.

>
> 06. copy.c
>
> I felt that we can create files per copying methods, like 
> copy_{text|csv|binary}.c,
> like indexes.
> How do other think?

Not sure about this, it seems others have put a lot of effort into
splitting TO and From.
Also like to hear from others.

>
> 07. fmt_to_name()
>
> I'm not sure the function is really needed. Can we follow like 
> get_foreign_data_wrapper_oid()
> and remove the funciton?

I have referenced some code from greenplum, will remove this.

>
> 08. GetCopyRoutineByName()
>
> Should we use syscache for searching a catalog?
>
> 09. CopyToFormatTextSendEndOfRow(), CopyToFormatBinaryStart()
>
> Comments still refer CopyHandlerOps, whereas it was renamed.
>
> 10. copy.h
>
> Per foreign.h and fdwapi.h, should we add a new header file and move some 
> APIs?
>
> 11. copy.h
>
> ```
> -/* These are private in commands/copy[from|to].c */
> -typedef struct CopyFromStateData *CopyFromState;
> -typedef struct CopyToStateData *CopyToState;
> ```
>
> Are above changes really needed?
>
> 12. CopyFormatOptions
>
> Can we remove `bool binary` in future?
>
> 13. external functions
>
> ```
> +extern void CopyToFormatTextStart(CopyToState cstate, TupleDesc tupDesc);
> +extern void CopyToFormatTextOneRow(CopyToState cstate, TupleTableSlot *slot);
> +extern void CopyToFormatTextEnd(CopyToState cstate);
> +extern void CopyFromFormatTextStart(CopyFromState cstate, TupleDesc tupDesc);
> +extern bool CopyFromFormatTextNext(CopyFromState cstate, ExprContext 
> *econtext,
> +
> Datum *values, bool *nulls);
> +extern void CopyFromFormatTextErrorCallback(CopyFromState cstate);
> +
> +extern void CopyToFormatBinaryStart(CopyToState cstate, TupleDesc tupDesc);
> +extern void CopyToFormatBinaryOneRow(CopyToState cstate, TupleTableSlot 
> *slot);
> +extern void CopyToFormatBinaryEnd(CopyToState cstate);
> +extern void CopyFromFormatBinaryStart(CopyFromState cstate, TupleDesc 
> tupDesc);
> +extern bool CopyFromFormatBinaryNext(CopyFromState cstate,
> ExprContext *econtext,
> +
>   Datum *values, bool *nulls);
> +extern void CopyFromFormatBinaryErrorCallback(CopyFromState cstate);
> ```
>
> FYI - If you add files for {text|csv|binary}, these declarations can be 
> removed.
>
> Best Regards,
> Hayato Kuroda
> FUJITSU LIMITED
>

Thanks for all the valuable suggestions.

-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-07 Thread Junwang Zhao
On Fri, Dec 8, 2023 at 3:27 AM Masahiko Sawada  wrote:
>
> On Fri, Dec 8, 2023 at 1:39 AM Andrew Dunstan  wrote:
> >
> >
> > On 2023-12-07 Th 03:37, Junwang Zhao wrote:
> > >
> > > The point of this refactor (from my view) is to make it possible to add 
> > > new
> > > copy handlers in extensions, just like access method. As Andres suggested,
> > > a system catalog like *pg_copy_handler*, if we split TO and FROM into two
> > > sets of routines, does that mean we have to create two catalog(
> > > pg_copy_from_handler and pg_copy_to_handler)?
> >
> >
> >
> > Surely not. Either have two fields, one for the TO handler and one for
> > the FROM handler, or a flag on each row indicating if it's a FROM or TO
> > handler.

If we wrap the two fields into a single structure, that will still be in
copy.h, which I think is not necessary. A single routing wrapper should
be enough, the actual implementation still stays separate
copy_[to/from].c files.

>
> True.
>
> But why do we need a system catalog like pg_copy_handler in the first
> place? I imagined that an extension can define a handler function
> returning a set of callbacks and the parser can lookup the handler
> function by name, like FDW and TABLESAMPLE.
>
I can see FDW related utility commands but no TABLESAMPLE related,
and there is a pg_foreign_data_wrapper system catalog which has
a *fdwhandler* field.

If we want extensions to create a new copy handler, I think
something like pg_copy_hander should be necessary.

> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

I go one step further to implement the pg_copy_handler, attached V5 is
the implementation with some changes suggested by Kou.

You can also review this on this github pull request [1].

[1]: https://github.com/zhjwpku/postgres/pull/1/files

-- 
Regards
Junwang Zhao


v5-0001-Extract-COPY-handlers.patch
Description: Binary data


Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-07 Thread Junwang Zhao
On Thu, Dec 7, 2023 at 1:05 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Wed, 6 Dec 2023 22:07:51 +0800,
>   Junwang Zhao  wrote:
>
> > Should we extract both *copy to* and *copy from* for the first step, in that
> > case we can add the pg_copy_handler catalog smoothly later.
>
> I don't object it (mixing TO/FROM changes to one patch) but
> it may make review difficult. Is it acceptable?
>
> FYI: I planed that I implement TO part, and then FROM part,
> and then unify TO/FROM parts if needed. [1]

I'm fine with step by step refactoring, let's just wait for more
suggestions.

>
> > Attached V4 adds 'extract copy from' and it passed the cirrus ci,
> > please take a look.
>
> Thanks. Here are my comments:
>
> > + /*
> > + * Error is relevant to a particular line.
> > + *
> > + * If line_buf still contains the correct line, print 
> > it.
> > + */
> > + if (cstate->line_buf_valid)
>
> We need to fix the indentation.
>
> > +CopyFromFormatBinaryStart(CopyFromState cstate, TupleDesc tupDesc)
> > +{
> > + FmgrInfo   *in_functions;
> > + Oid*typioparams;
> > + Oid in_func_oid;
> > + AttrNumber  num_phys_attrs;
> > +
> > + /*
> > +  * Pick up the required catalog information for each attribute in the
> > +  * relation, including the input function, the element type (to pass 
> > to
> > +  * the input function), and info about defaults and constraints. 
> > (Which
> > +  * input function we use depends on text/binary format choice.)
> > +  */
> > + num_phys_attrs = tupDesc->natts;
> > + in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> > + typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));
>
> We need to update the comment because defaults and
> constraints aren't picked up here.
>
> > +CopyFromFormatTextStart(CopyFromState cstate, TupleDesc tupDesc)
> ...
> > + /*
> > +  * Pick up the required catalog information for each attribute in the
> > +  * relation, including the input function, the element type (to pass 
> > to
> > +  * the input function), and info about defaults and constraints. 
> > (Which
> > +  * input function we use depends on text/binary format choice.)
> > +  */
> > + in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> > + typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));
>
> ditto.
>
>
> > @@ -1716,15 +1776,6 @@ BeginCopyFrom(ParseState *pstate,
> >   ReceiveCopyBinaryHeader(cstate);
> >   }
>
> I think that this block should be moved to
> CopyFromFormatBinaryStart() too. But we need to run it after
> we setup inputs such as data_source_cb, pipe and filename...
>
> +/* Routines for a COPY HANDLER implementation. */
> +typedef struct CopyHandlerOps
> +{
> +   /* Called when COPY TO is started. This will send a header. */
> +   void(*copy_to_start) (CopyToState cstate, TupleDesc 
> tupDesc);
> +
> +   /* Copy one row for COPY TO. */
> +   void(*copy_to_one_row) (CopyToState cstate, 
> TupleTableSlot *slot);
> +
> +   /* Called when COPY TO is ended. This will send a trailer. */
> +   void(*copy_to_end) (CopyToState cstate);
> +
> +   void(*copy_from_start) (CopyFromState cstate, TupleDesc 
> tupDesc);
> +   bool(*copy_from_next) (CopyFromState cstate, ExprContext 
> *econtext,
> +  Datum 
> *values, bool *nulls);
> +   void(*copy_from_error_callback) (CopyFromState cstate);
> +   void(*copy_from_end) (CopyFromState cstate);
> +} CopyHandlerOps;
>
> It seems that "copy_" prefix is redundant. Should we use
> "to_start" instead of "copy_to_start" and so on?
>
> BTW, it seems that "COPY FROM (FORMAT json)" may not be implemented. [2]
> We may need to care about NULL copy_from_* cases.
>
>
> > I added a hook *copy_from_end* but this might be removed later if not used.
>
> It may be useful to clean up resources for COPY FROM but the
> patch doesn't call the copy_from_end. How about removing it
> for now? We can add it and call it from EndCopyFrom() later?
> Because it's not needed for now.
>
> I think that we should focus on refactoring instead of
> adding a new feature in this patch.
>
>
> [1]: 
> https://www.postgresql.org/message-id/20231204.153548.2126325458835528809.kou%40clear-code.com
> [2]: 
> https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com
>
>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-07 Thread Junwang Zhao
On Thu, Dec 7, 2023 at 8:39 AM Michael Paquier  wrote:
>
> On Wed, Dec 06, 2023 at 10:07:51PM +0800, Junwang Zhao wrote:
> > I read the thread[1] you posted and I think Andres's suggestion sounds 
> > great.
> >
> > Should we extract both *copy to* and *copy from* for the first step, in that
> > case we can add the pg_copy_handler catalog smoothly later.
> >
> > Attached V4 adds 'extract copy from' and it passed the cirrus ci,
> > please take a look.
> >
> > I added a hook *copy_from_end* but this might be removed later if not used.
> >
> > [1]: 
> > https://www.postgresql.org/message-id/20180211211235.5x3jywe5z3lkgcsr%40alap3.anarazel.de
>
> I was looking at the differences between v3 posted by Sutou-san and
> v4 from you, seeing that:
>
> +/* Routines for a COPY HANDLER implementation. */
> +typedef struct CopyHandlerOps
>  {
>  /* Called when COPY TO is started. This will send a header. */
> -void(*start) (CopyToState cstate, TupleDesc tupDesc);
> +void(*copy_to_start) (CopyToState cstate, TupleDesc tupDesc);
>
>  /* Copy one row for COPY TO. */
> -void(*one_row) (CopyToState cstate, TupleTableSlot *slot);
> +void(*copy_to_one_row) (CopyToState cstate, TupleTableSlot 
> *slot);
>
>  /* Called when COPY TO is ended. This will send a trailer. */
> -void(*end) (CopyToState cstate);
> -} CopyToFormatOps;
> +void(*copy_to_end) (CopyToState cstate);
> +
> +void(*copy_from_start) (CopyFromState cstate, TupleDesc tupDesc);
> +bool(*copy_from_next) (CopyFromState cstate, ExprContext 
> *econtext,
> +Datum *values, bool *nulls);
> +void(*copy_from_error_callback) (CopyFromState cstate);
> +void(*copy_from_end) (CopyFromState cstate);
> +} CopyHandlerOps;
>
> And we've spent a good deal of time refactoring the copy code so as
> the logic behind TO and FROM is split.  Having a set of routines that
> groups both does not look like a step in the right direction to me,

The point of this refactor (from my view) is to make it possible to add new
copy handlers in extensions, just like access method. As Andres suggested,
a system catalog like *pg_copy_handler*, if we split TO and FROM into two
sets of routines, does that mean we have to create two catalog(
pg_copy_from_handler and pg_copy_to_handler)?

> and v4 is an attempt at solving two problems, while v3 aims to improve
> one case.  It seems to me that each callback portion should be focused
> on staying in its own area of the code, aka copyfrom*.c or copyto*.c.
> --
> Michael



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-06 Thread Junwang Zhao
On Wed, Dec 6, 2023 at 8:32 PM Daniel Verite  wrote:
>
> Sutou Kouhei wrote:
>
> > * 2022-04: Apache Arrow [2]
> > * 2018-02: Apache Avro, Apache Parquet and Apache ORC [3]
> >
> > (FYI: I want to add support for Apache Arrow.)
> >
> > There were discussions how to add support for more formats. [3][4]
> > In these discussions, we got a consensus about making COPY
> > format extendable.
>
>
> These formats seem all column-oriented whereas COPY is row-oriented
> at the protocol level [1].
> With regard to the procotol, how would it work to support these formats?
>

They have kind of *RowGroup* concepts, a bunch of rows goes to a RowBatch
and the data of the same column goes together.

I think they should fit the COPY semantics and there are some  FDW out there for
these modern formats, like [1]. If we support COPY to deal with the
format, it will
be easier to interact with them(without creating
server/usermapping/foreign table).

[1]: https://github.com/adjust/parquet_fdw

>
> [1] https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-COPY
>
>
> Best regards,
> --
> Daniel Vérité
> https://postgresql.verite.pro/
> Twitter: @DanielVerite
>
>


-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-06 Thread Junwang Zhao
On Wed, Dec 6, 2023 at 3:28 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Wed, 6 Dec 2023 15:11:34 +0800,
>   Junwang Zhao  wrote:
>
> > For the extra curly braces, I mean the following code block in
> > CopyToFormatBinaryStart:
> >
> > + {<-- I thought this is useless?
> > + /* Generate header for a binary copy */
> > + int32 tmp;
> > +
> > + /* Signature */
> > + CopySendData(cstate, BinarySignature, 11);
> > + /* Flags field */
> > + tmp = 0;
> > + CopySendInt32(cstate, tmp);
> > + /* No header extension */
> > + tmp = 0;
> > + CopySendInt32(cstate, tmp);
> > + }
>
> Oh, I see. I've removed and attach the v3 patch. In general,
> I don't change variable name and so on in this patch. I just
> move codes in this patch. But I also removed the "tmp"
> variable for this case because I think that the name isn't
> suitable for larger scope. (I think that "tmp" is acceptable
> in a small scope like the above code.)
>
> New code:
>
> /* Generate header for a binary copy */
> /* Signature */
> CopySendData(cstate, BinarySignature, 11);
> /* Flags field */
> CopySendInt32(cstate, 0);
> /* No header extension */
> CopySendInt32(cstate, 0);
>
>
> Thanks,
> --
> kou

Hi Kou,

I read the thread[1] you posted and I think Andres's suggestion sounds great.

Should we extract both *copy to* and *copy from* for the first step, in that
case we can add the pg_copy_handler catalog smoothly later.

Attached V4 adds 'extract copy from' and it passed the cirrus ci,
please take a look.

I added a hook *copy_from_end* but this might be removed later if not used.

[1]: 
https://www.postgresql.org/message-id/20180211211235.5x3jywe5z3lkgcsr%40alap3.anarazel.de
-- 
Regards
Junwang Zhao


v4-0001-Extract-COPY-handlers.patch
Description: Binary data


Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-05 Thread Junwang Zhao
On Wed, Dec 6, 2023 at 2:19 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Wed, 6 Dec 2023 11:18:35 +0800,
>   Junwang Zhao  wrote:
>
> > For the modern formats(parquet, orc, avro, etc.), will they be
> > implemented as extensions or in core?
>
> I think that they should be implemented as extensions
> because they will depend of external libraries and may not
> use C. For example, C++ will be used for Apache Parquet
> because the official Apache Parquet C++ implementation
> exists but the C implementation doesn't.
>
> (I can implement an extension for Apache Parquet after we
> complete this feature. I'll implement an extension for
> Apache Arrow with the official Apache Arrow C++
> implementation. And it's easy that we convert Apache Arrow
> data to Apache Parquet with the official Apache Parquet
> implementation.)
>
> > The patch looks good except for a pair of extra curly braces.
>
> Thanks for the review! I attach the v2 patch that removes
> extra curly braces for "if (isnull)".
>
For the extra curly braces, I mean the following code block in
CopyToFormatBinaryStart:

+ {<-- I thought this is useless?
+ /* Generate header for a binary copy */
+ int32 tmp;
+
+ /* Signature */
+ CopySendData(cstate, BinarySignature, 11);
+ /* Flags field */
+ tmp = 0;
+ CopySendInt32(cstate, tmp);
+ /* No header extension */
+ tmp = 0;
+ CopySendInt32(cstate, tmp);
+ }

>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-05 Thread Junwang Zhao
On Wed, Dec 6, 2023 at 10:45 AM Sutou Kouhei  wrote:
>
> Hi,
>
> Thanks for replying to this proposal!
>
> In <20231205182458.GC2757816@nathanxps13>
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Tue, 5 Dec 2023 12:24:58 -0600,
>   Nathan Bossart  wrote:
>
> > I think it makes sense to do this part independently, but we should be
> > careful to design this with the follow-up tasks in mind.
>
> OK. I'll keep updating the "TODOs" section in the original
> e-mail. It also includes design in the follow-up tasks. We
> can discuss the design separately from the patches
> submitting. (The current submitted patch just focuses on
> refactoring but we can discuss the final design.)
>
> > I assume the performance concerns stem from the use of
> > function pointers.  Or was there something else?
>
> I think so too.
>
> The original e-mail that mentioned the performance concern
> [1] didn't say about the reason but the use of function
> pointers might be concerned.
>
> If the currently supported formats ("text", "csv" and
> "binary") are implemented as an extension, it may have more
> concerns but we will keep them as built-in formats for
> compatibility. So I think that no more concerns exist for
> these formats.
>

For the modern formats(parquet, orc, avro, etc.), will they be
implemented as extensions or in core?

The patch looks good except for a pair of extra curly braces.

>
> [1]: 
> https://www.postgresql.org/message-id/flat/3741749.1655952719%40sss.pgh.pa.us#2bb7af4a3d2c7669f9a49808d777a20d
>
>
> Thanks,
> --
> kou
>
>


-- 
Regards
Junwang Zhao




Re: Re: How to solve the problem of one backend process crashing and causing other processes to restart?

2023-11-13 Thread Junwang Zhao
On Mon, Nov 13, 2023 at 5:14 PM yuansong  wrote:
>
> Enhancing the overall fault tolerance of the entire system for this feature 
> is quite important. No one can avoid bugs, and I don't believe that this 
> approach is a more advanced one. It might be worth considering adding it to 
> the roadmap so that interested parties can conduct relevant research.
>
> The current major issue is that when one process crashes, resetting all 
> connections has a significant impact on other connections. Is it possible to 
> only disconnect the crashed connection and have the other connections simply 
> roll back the current transaction without reconnecting? Perhaps this problem 
> can be mitigated through the use of a connection pool.

It's not about the other connections, it's that the crashed connection
has no way to rollback.

>
> If we want to implement this feature, would it be sufficient to clean up or 
> restore the shared memory and disk changes caused by the crashed backend? 
> Besides clearing various known locks, what else needs to be changed? Does 
> anyone have any insights? Please help me. Thank you.
>
>
>
>
>
>
>
> At 2023-11-13 13:53:29, "Laurenz Albe"  wrote:
> >On Sun, 2023-11-12 at 21:55 -0500, Tom Lane wrote:
> >> yuansong  writes:
> >> > In PostgreSQL, when a backend process crashes, it can cause other backend
> >> > processes to also require a restart, primarily to ensure data 
> >> > consistency.
> >> > I understand that the correct approach is to analyze and identify the
> >> > cause of the crash and resolve it. However, it is also important to be
> >> > able to handle a backend process crash without affecting the operation of
> >> > other processes, thus minimizing the scope of negative impact and
> >> > improving availability. To achieve this goal, could we mimic the Oracle
> >> > process by introducing a "pmon" process dedicated to rolling back crashed
> >> > process transactions and performing resource cleanup? I wonder if anyone
> >> > has attempted such a strategy or if there have been previous discussions
> >> > on this topic.
> >>
> >> The reason we force a database-wide restart is that there's no way to
> >> be certain that the crashed process didn't corrupt anything in shared
> >> memory.  (Even with the forced restart, there's a window where bad
> >> data could reach disk before we kill off the other processes that
> >> might write it.  But at least it's a short window.)  "Corruption"
> >> here doesn't just involve bad data placed into disk buffers; more
> >> often it's things like unreleased locks, which would block other
> >> processes indefinitely.
> >>
> >> I seriously doubt that anything like what you're describing
> >> could be made reliable enough to be acceptable.  "Oracle does
> >> it like this" isn't a counter-argument: they have a much different
> >> (and non-extensible) architecture, and they also have an army of
> >> programmers to deal with minutiae like undoing resource acquisition.
> >> Even with that, you'd have to wonder about the number of bugs
> >> existing in such necessarily-poorly-tested code paths.
> >
> >Yes.
> >I think that PostgreSQL's approach is superior: rather than investing in
> >code to mitigate the impact of data corruption caused by a crash, invest
> >in quality code that doesn't crash in the first place.
> >
> >Euphemistically naming a crash "ORA-600 error" seems to be part of
> >their strategy.
> >
> >Yours,
> >Laurenz Albe
> >



-- 
Regards
Junwang Zhao




Re: make pg_ctl more friendly

2023-11-08 Thread Junwang Zhao
On Thu, Nov 9, 2023 at 3:08 PM Junwang Zhao  wrote:
>
> On Thu, Nov 9, 2023 at 9:57 AM Crisp Lee  wrote:
> >
> > Hi,
> >
> > I know it. But my question is not that. I did a PITR operation with 
> > recovery_target_name and recovery_target_action('shutdown'). The PITR 
> > process was very short and the PITR was done before pg_ctl check. The 
> > postmaster shutdown due to recovery_target_action, and there was no crash. 
> > But pg_ctl told me about startup failure.  I think the startup had 
> > succeeded and the result was not a exception. pg_ctl should tell users 
> > about detailed messages.
> >
> > On Thu, Nov 9, 2023 at 9:32 AM Andres Freund  wrote:
> >>
> >> Hi,
> >>
> >> On 2023-11-09 09:29:32 +0800, Crisp Lee wrote:
> >> > How to judge from 'DB_SHUTDOWNED' that PITR ends normally? 
> >> > 'DB_SHUTDOWNED'
> >> > is just a state, it could not give more meaning, so I reuse the
> >> > recovery.done.
> >>
> >> DB_SHUTDOWNED cannot be encountered while recovery is ongoing. If there 
> >> was a
> >> hard crash, you'd see DB_IN_ARCHIVE_RECOVERY or such, if the command was 
> >> shut
> >> down orderly before PITR has finished, you'd see DB_SHUTDOWNED_IN_RECOVERY.
> >>
> >> - Andres
>
> After a PITR shutdown, the db state should be *shut down in recovery*, try the
> patch attached.
>

previous patch has some format issues, V2 attached.

>
> --
> Regards
> Junwang Zhao



-- 
Regards
Junwang Zhao


v2-0001-PITR-shutdown-should-not-report-error-by-pg_ctl.patch
Description: Binary data


Re: make pg_ctl more friendly

2023-11-08 Thread Junwang Zhao
On Thu, Nov 9, 2023 at 9:57 AM Crisp Lee  wrote:
>
> Hi,
>
> I know it. But my question is not that. I did a PITR operation with 
> recovery_target_name and recovery_target_action('shutdown'). The PITR process 
> was very short and the PITR was done before pg_ctl check. The postmaster 
> shutdown due to recovery_target_action, and there was no crash. But pg_ctl 
> told me about startup failure.  I think the startup had succeeded and the 
> result was not a exception. pg_ctl should tell users about detailed messages.
>
> On Thu, Nov 9, 2023 at 9:32 AM Andres Freund  wrote:
>>
>> Hi,
>>
>> On 2023-11-09 09:29:32 +0800, Crisp Lee wrote:
>> > How to judge from 'DB_SHUTDOWNED' that PITR ends normally? 'DB_SHUTDOWNED'
>> > is just a state, it could not give more meaning, so I reuse the
>> > recovery.done.
>>
>> DB_SHUTDOWNED cannot be encountered while recovery is ongoing. If there was a
>> hard crash, you'd see DB_IN_ARCHIVE_RECOVERY or such, if the command was shut
>> down orderly before PITR has finished, you'd see DB_SHUTDOWNED_IN_RECOVERY.
>>
>> - Andres

After a PITR shutdown, the db state should be *shut down in recovery*, try the
patch attached.


-- 
Regards
Junwang Zhao


0001-PITR-shutdown-should-not-report-error-by-pg_ctl.patch
Description: Binary data


Re: Don't pass NULL pointer to strcmp().

2023-11-01 Thread Junwang Zhao
On Wed, Nov 1, 2023 at 5:25 PM Xing Guo  wrote:
>
> Hi hackers,
>
> I found that there's a nullable pointer being passed to strcmp() and
> can make the server crash. It can be reproduced on the latest master
> branch by crafting an extension[1]. Patch for fixing it is attatched.
>
> [1] https://github.com/higuoxing/guc_crash/tree/pg
>

Can we set a string guc to NULL? If not, `*lconf->variable == NULL` would
be unnecessary.

> --
> Best Regards,
> Xing



-- 
Regards
Junwang Zhao




Re: Simplify xlogreader.c with XLogRec* macros

2023-10-31 Thread Junwang Zhao
On Tue, Oct 31, 2023 at 5:23 PM 邱宇航  wrote:
>
> Hello hackers,
>
> Commit 3f1ce97 refactored XLog record access macros, but missed in a few 
> places. I fixed this, and patch is attached.
>
> --
> Yuhang Qiu
>
>
>

@@ -2036,8 +2035,8 @@ RestoreBlockImage(XLogReaderState *record, uint8
block_id, char *page)
  char*ptr;
  PGAlignedBlock tmp;

- if (block_id > record->record->max_block_id ||
- !record->record->blocks[block_id].in_use)
+ if (block_id > XLogRecMaxBlockId(record) ||
+ !XLogRecGetBlock(record, block_id)->in_use)

I thought these can also be rewrite to:

if (!XLogRecHasBlockRef(record, block_id))


-- 
Regards
Junwang Zhao




Re: Add trailing commas to enum definitions

2023-10-24 Thread Junwang Zhao
On Tue, Oct 24, 2023 at 4:34 AM Nathan Bossart  wrote:
>
> On Mon, Oct 23, 2023 at 05:55:32PM +0800, Junwang Zhao wrote:
> > On Mon, Oct 23, 2023 at 2:37 PM Peter Eisentraut  
> > wrote:
> >> Since C99, there can be a trailing comma after the last value in an enum
> >
> > C99 allows us to do this doesn't mean we must do this, this is not
> > inconsistent IMHO, and this will pollute the git log messages, people
> > may *git blame* the file and see the reason for the introduction of the
> > line.
>
> I suspect that your concerns about git-blame could be resolved by adding
> this commit to .git-blame-ignore-revs.  From a long-term perspective, I
> think standardizing on the trailing comma style will actually improve
> git-blame because patches won't need to add a comma to the previous line
> when adding a value.

make sense, +1 from me now.

>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com



-- 
Regards
Junwang Zhao




Re: Add trailing commas to enum definitions

2023-10-23 Thread Junwang Zhao
On Mon, Oct 23, 2023 at 2:37 PM Peter Eisentraut  wrote:
>
> Since C99, there can be a trailing comma after the last value in an enum

C99 allows us to do this doesn't mean we must do this, this is not
inconsistent IMHO, and this will pollute the git log messages, people
may *git blame* the file and see the reason for the introduction of the
line.

There are a lot of 'typedef struct' as well as 'struct', which is not
inconsistent either just like the *enum* case.

> definition.  A lot of new code has been introducing this style on the
> fly.  I have noticed that some new patches are now taking an
> inconsistent approach to this.  Some add the last comma on the fly if
> they add a new last value, some are trying to preserve the existing
> style in each place, some are even dropping the last comma if there was
> one.  I figured we could nudge this all in a consistent direction if we
> just add the trailing commas everywhere once.  See attached patch; it
> wasn't actually that much.
>
> I omitted a few places where there was a fixed "last" value that will
> always stay last.  I also skipped the header files of libpq and ecpg, in
> case people want to use those with older compilers.  There were also a
> small number of cases where the enum type wasn't used anywhere (but the
> enum values were), which ended up confusing pgindent a bit.



-- 
Regards
Junwang Zhao




Re: [dynahash] do not refill the hashkey after hash_search

2023-09-14 Thread Junwang Zhao
On Wed, Sep 13, 2023 at 5:28 PM John Naylor
 wrote:
>
>
> On Wed, Sep 13, 2023 at 3:46 PM Junwang Zhao  wrote:
> >
> > On Wed, Sep 13, 2023 at 4:22 PM John Naylor
> >  wrote:
>
> > > - memset(part_entry, 0, sizeof(LogicalRepPartMapEntry));
> > > - part_entry->partoid = partOid;
> > > + Assert(part_entry->partoid == partOid);
> > > + memset(entry, 0, sizeof(LogicalRepRelMapEntry));
> > >
> > > This is making an assumption that the non-key part of 
> > > LogicalRepPartMapEntry will never get new members. Without knowing much 
> > > about this code, it seems like a risk in the abstract.
> >
> > What do you mean by 'the non-key part of LogicalRepPartMapEntry will
> > never get new members'?
>
> I mean, if this struct:
>
> > typedef struct LogicalRepPartMapEntry
> > {
> > Oid partoid; /* LogicalRepPartMap's key */
> > LogicalRepRelMapEntry relmapentry;
> > } LogicalRepPartMapEntry;
>
> ...gets a new member, it will not get memset when memsetting "relmapentry".

ok, I see. I will leave this case as it was.

>
> > > Taking a quick look, I didn't happen to see any existing asserts of this 
> > > sort, so the patch doesn't seem to be making things more "normal". I did 
> > > see a few instances of /* hash_search already filled in the key */, so if 
> > > we do anything at all here, we might prefer that.
> >
> > There are some code using assert for this sort, for example in
> > *ReorderBufferToastAppendChunk*:
>
> > and in *rebuild_database_list*, tom commented that the key has already
> > been filled, which I think
> > he was trying to tell people no need to assign the key again.
>
> Okay, we have examples of each.
>
> --
> John Naylor
> EDB: http://www.enterprisedb.com

Add a v2 with some change to fix warnings about unused-parameter.

I will add this to Commit Fest.

-- 
Regards
Junwang Zhao


v2-0001-do-not-refill-the-hashkey-after-hash_search.patch
Description: Binary data


Re: [dynahash] do not refill the hashkey after hash_search

2023-09-13 Thread Junwang Zhao
On Wed, Sep 13, 2023 at 4:22 PM John Naylor
 wrote:
>
>
> On Wed, Sep 13, 2023 at 1:47 PM Junwang Zhao  wrote:
> >
> > Hi hackers,
> >
> > It's not necessary to fill the key field for most cases, since
> > hash_search has already done that for you. For developer that
> > using memset to zero the entry structure after enter it, fill the
> > key field is a must, but IMHO that is not good coding style, we
> > really should not touch the key field after insert it into the
> > dynahash.
>
> - memset(part_entry, 0, sizeof(LogicalRepPartMapEntry));
> - part_entry->partoid = partOid;
> + Assert(part_entry->partoid == partOid);
> + memset(entry, 0, sizeof(LogicalRepRelMapEntry));
>
> This is making an assumption that the non-key part of LogicalRepPartMapEntry 
> will never get new members. Without knowing much about this code, it seems 
> like a risk in the abstract.

What do you mean by 'the non-key part of LogicalRepPartMapEntry will
never get new members'?

typedef struct LogicalRepPartMapEntry
{
Oid partoid; /* LogicalRepPartMap's key */
LogicalRepRelMapEntry relmapentry;
} LogicalRepPartMapEntry;

partoid has already been filled by hash_search with HASH_ENTER action,
so I think the
above code should have the same effects.

>
> > This patch fixed some most abnormal ones, instead of refilling the
> > key field of primitive types, adding some assert might be a better
> > choice.
>
> Taking a quick look, I didn't happen to see any existing asserts of this 
> sort, so the patch doesn't seem to be making things more "normal". I did see 
> a few instances of /* hash_search already filled in the key */, so if we do 
> anything at all here, we might prefer that.

There are some code using assert for this sort, for example in
*ReorderBufferToastAppendChunk*:

```
ent = (ReorderBufferToastEnt *)
hash_search(txn->toast_hash, _id, HASH_ENTER, );

if (!found)
{
Assert(ent->chunk_id == chunk_id);   <--- this
line, by Robert Haas
ent->num_chunks = 0;
ent->last_chunk_seq = 0;
```

and in *rebuild_database_list*, tom commented that the key has already
been filled, which I think
he was trying to tell people no need to assign the key again.

```
/* we assume it isn't found because the hash was just created */
db = hash_search(dbhash, , HASH_ENTER, NULL);

/* hash_search already filled in the key */ <--- this
line, by Tom Lane
db->adl_score = score++;
/* next_worker is filled in later */
```

>
> - hash_search(uncommitted_enums, serialized++, HASH_ENTER, NULL);
> + (void) hash_search(uncommitted_enums, serialized++, HASH_ENTER, NULL);
>
> I prefer explicit (void) for new code, but others may disagree. I don't think 
> we have a preferred style for this, so changing current usage will just cause 
> unnecessary code churn.
>

What I am concerned about is that if we change the key after
hash_search with HASH_ENTER action, there
are chances that if we assign a wrong value, it will be impossible to
match that entry again.

> --
> John Naylor
> EDB: http://www.enterprisedb.com



-- 
Regards
Junwang Zhao




[dynahash] do not refill the hashkey after hash_search

2023-09-13 Thread Junwang Zhao
Hi hackers,

It's not necessary to fill the key field for most cases, since
hash_search has already done that for you. For developer that
using memset to zero the entry structure after enter it, fill the
key field is a must, but IMHO that is not good coding style, we
really should not touch the key field after insert it into the
dynahash.

This patch fixed some most abnormal ones, instead of refilling the
key field of primitive types, adding some assert might be a better
choice.


-- 
Regards
Junwang Zhao


0001-do-not-refill-the-hashkey-after-hash_search.patch
Description: Binary data


Re: [BackendXidGetPid] only access allProcs when xid matches

2023-09-07 Thread Junwang Zhao
On Thu, Sep 7, 2023 at 5:07 PM Ashutosh Bapat
 wrote:
>
> Forgot to attach the patch.

LGTM

Should I change the status to ready for committer now?

>
> On Thu, Sep 7, 2023 at 1:22 PM Ashutosh Bapat
>  wrote:
> >
> > Hi Junwang,
> > We leave a line blank after variable declaration as in the attached patch.
> >
> > Otherwise the patch looks good to me.
> >
> > The function modified by the patch is only used by extension
> > pgrowlocks. Given that the function will be invoked as many times as
> > the number of locked rows in the relation, the patch may show some
> > improvement and thus be more compelling. One way to measure
> > performance is to create a table with millions of rows, SELECT all
> > rows with FOR SHARE/UPDATE clause. Then run pgrowlock() on that
> > relation. This will invoke the given function a million times. That
> > way we might be able to catch some miniscule improvement per row.
> >
> > If the performance is measurable, we can mark the CF entry as ready
> > for committer.
> >
> > --
> > Best Wishes,
> > Ashutosh Bapat
> >
> > On Thu, Aug 10, 2023 at 1:48 PM Junwang Zhao  wrote:
> > >
> > > On Thu, Aug 10, 2023 at 4:11 PM Ashutosh Bapat
> > >  wrote:
> > > >
> > > > Please add this to commitfest so that it's not forgotten.
> > > >
> > >
> > > Added [1], thanks
> > >
> > > [1]: https://commitfest.postgresql.org/44/4495/
> > >
> > > > On Wed, Aug 9, 2023 at 8:37 PM Junwang Zhao  wrote:
> > > > >
> > > > > On Wed, Aug 9, 2023 at 10:46 PM Ashutosh Bapat
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, Aug 9, 2023 at 9:30 AM Junwang Zhao  
> > > > > > wrote:
> > > > > > >
> > > > > > > In function `BackendXidGetPid`, when looping every proc's
> > > > > > > TransactionId, there is no need to access its PGPROC since 
> > > > > > > there
> > > > > > > is shared memory access: `arrayP->pgprocnos[index]`.
> > > > > > >
> > > > > > > Though the compiler can optimize this kind of inefficiency, I
> > > > > > > believe we should ship with better code.
> > > > > > >
> > > > > >
> > > > > > Looks good to me. However, I would just move the variable 
> > > > > > declaration
> > > > > > with their assignments inside the if () rather than combing the
> > > > > > expressions. It more readable that way.
> > > > >
> > > > > yeah, make sense, also checked elsewhere using the original style,
> > > > > attachment file
> > > > > keep that style, thanks ;)
> > > > >
> > > > > >
> > > > > > --
> > > > > > Best Wishes,
> > > > > > Ashutosh Bapat
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards
> > > > > Junwang Zhao
> > > >
> > > >
> > > >
> > > > --
> > > > Best Wishes,
> > > > Ashutosh Bapat
> > >
> > >
> > >
> > > --
> > > Regards
> > > Junwang Zhao
>
>
>
> --
> Best Wishes,
> Ashutosh Bapat



-- 
Regards
Junwang Zhao




Re: [BackendXidGetPid] only access allProcs when xid matches

2023-09-07 Thread Junwang Zhao
On Thu, Sep 7, 2023 at 4:04 PM Michael Paquier  wrote:
>
> On Thu, Sep 07, 2023 at 01:22:07PM +0530, Ashutosh Bapat wrote:
> > Otherwise the patch looks good to me.
> >
> > The function modified by the patch is only used by extension
> > pgrowlocks. Given that the function will be invoked as many times as
> > the number of locked rows in the relation, the patch may show some
> > improvement and thus be more compelling. One way to measure
> > performance is to create a table with millions of rows, SELECT all
> > rows with FOR SHARE/UPDATE clause. Then run pgrowlock() on that
> > relation. This will invoke the given function a million times. That
> > way we might be able to catch some miniscule improvement per row.
> >
> > If the performance is measurable, we can mark the CF entry as ready
> > for committer.
>
> So, is the difference measurable?  Assuming that your compiler does
I have checked my compiler, this patch give them same assembly code
as before.
> not optimize that, my guess is no because the cycles are going to be
> eaten by the other system calls in pgrowlocks.  You could increase the
> number of proc entries and force a large loop around
> BackendXidGetPid() to see a difference of runtime, but that's going
> to require a lot more proc entries to see any kind of difference
> outside the scope of a usual ~1% (somewhat magic number!) noise with
> such micro benchmarks.
>
> -intpgprocno = arrayP->pgprocnos[index];
> -PGPROC   *proc = [pgprocno];
> -
>  if (other_xids[index] == xid)
>  {
> +PGPROC   *proc = [arrayP->pgprocnos[index]];
>  result = proc->pid;
>  break;
>
> Saying that, moving the declarations into the inner loop is usually a
> good practice, but I would just keep two variables instead of one for
> the sake of readability.  That's a nit, sure.

I remember I split this into two lines in v2 patch. Whatever, I attached
a v3 with a suggestion from Ashutosh Bapat.

> --
> Michael



-- 
Regards
Junwang Zhao


v3-0001-BackendXidGetPid-only-access-allProcs-when-xid-ma.patch
Description: Binary data


Re: Should we use MemSet or {0} for struct initialization?

2023-08-31 Thread Junwang Zhao
On Thu, Aug 31, 2023 at 7:07 PM John Naylor
 wrote:
>
> > On Thu, Aug 31, 2023 at 5:34 PM Richard Guo  wrote:
> > >
> > > While working on a bug in expandRecordVariable() I noticed that in the
> > > switch statement for case RTE_SUBQUERY we initialize struct ParseState
> > > with {0} while for case RTE_CTE we do that with MemSet.  I understand
> > > that there is nothing wrong with this, just cannot get away with the
> > > inconsistency inside the same function (sorry for the nitpicking).
> > >
> > > Do we have a preference for how to initialize structures?  From 9fd45870
> > > it seems that we prefer to {0}.  So here is a trivial patch doing that.
>
> It seems to have been deliberately left that way in the wake of that commit, 
> see:
>
> https://www.postgresql.org/message-id/87d2e5f8-3c37-d185-4bbc-1de163ac4b10%40enterprisedb.com
>
> (If so, it deserves a comment to keep people from trying to change it...)
>
> > > And with a rough scan the MemSet calls in pg_stat_get_backend_subxact()
> > > can also be replaced with {0}, so include that in the patch too.
>
> I _believe_ that's harmless to change.
>
> On Thu, Aug 31, 2023 at 4:57 PM Junwang Zhao  wrote:
>
> > If the struct has padding or aligned, {0} only guarantee the struct
> > members initialized to 0, while memset sets the alignment/padding
> > to 0 as well, but since we will not access the alignment/padding, so
> > they give the same effect.
>
> See above -- if it's used as a hash key, for example, you must clear 
> everything.

Yeah, if memcmp was used as the key comparison function, there is a problem.

>
> > I bet {0} should be faster since there is no function call, but I'm not
> > 100% sure ;)
>
> Neither has a function call. MemSet is a PG macro. You're thinking of memset, 
> the libc library function, but a decent compiler can easily turn that into 
> something else for fixed-size inputs.

good to know, thanks.

>
> --
> John Naylor
> EDB: http://www.enterprisedb.com
>
>


-- 
Regards
Junwang Zhao




Re: Should we use MemSet or {0} for struct initialization?

2023-08-31 Thread Junwang Zhao
On Thu, Aug 31, 2023 at 5:34 PM Richard Guo  wrote:
>
> While working on a bug in expandRecordVariable() I noticed that in the
> switch statement for case RTE_SUBQUERY we initialize struct ParseState
> with {0} while for case RTE_CTE we do that with MemSet.  I understand
> that there is nothing wrong with this, just cannot get away with the
> inconsistency inside the same function (sorry for the nitpicking).
>
> Do we have a preference for how to initialize structures?  From 9fd45870
> it seems that we prefer to {0}.  So here is a trivial patch doing that.
> And with a rough scan the MemSet calls in pg_stat_get_backend_subxact()
> can also be replaced with {0}, so include that in the patch too.
>
> Thanks
> Richard

If the struct has padding or aligned, {0} only guarantee the struct
members initialized to 0, while memset sets the alignment/padding
to 0 as well, but since we will not access the alignment/padding, so
they give the same effect.

I bet {0} should be faster since there is no function call, but I'm not
100% sure ;)


-- 
Regards
Junwang Zhao




Re: [PATCH v1] PQputCopyEnd never returns 0, fix the inaccurate comment

2023-08-29 Thread Junwang Zhao
On Tue, Aug 29, 2023 at 6:40 AM Michael Paquier  wrote:
>
> On Mon, Aug 28, 2023 at 09:46:07PM +0800, Junwang Zhao wrote:
> > Yeah, it makes sense to me, or maybe just `PQputCopyEnd(...) == -1`,
> > let's wait for some other opinions.
>
> One can argue that PQputCopyEnd() returning 0 could be possible in an
> older version of libpq these callers are linking to, but this has
> never existed from what I can see (just checked down to 8.2 now).
> Anyway, changing these callers may create some backpatching conflicts,
> so I'd let them as they are, and just fix the comment.

sure, thanks for the explanation.

> --
> Michael



-- 
Regards
Junwang Zhao




Re: [PATCH v1] PQputCopyEnd never returns 0, fix the inaccurate comment

2023-08-28 Thread Junwang Zhao
On Mon, Aug 28, 2023 at 7:48 PM Aleksander Alekseev
 wrote:
>
> Hi Junwang,
>
> > PQputCopyEnd returns 1 or -1, never 0, I guess the comment was
> > copy/paste from PQputCopyData's comment, this should be fixed.
>
> The patch LGTM but I wonder whether we should also change all the
> existing calls of PQputCopyEnd() from:
>
> ```
> PQputCopyEnd(...) <= 0
> ```
>
> ... to:
>
> ```
> PQputCopyEnd(...) < 0
> ```
>
> Given the circumstances, checking for equality to zero seems to be at
> least strange.
>

Yeah, it makes sense to me, or maybe just `PQputCopyEnd(...) == -1`,
let's wait for some other opinions.

>
> On top of that, none of the PQputCopyData() callers cares whether the
> function returns 0 or -1, both are treated the same way. I suspect the
> function does some extra work no one asked to do and no one cares
> about. Perhaps this function should be refactored too for consistency.
>
> Thoughts?
>
> --
> Best regards,
> Aleksander Alekseev



-- 
Regards
Junwang Zhao




[PATCH v1] PQputCopyEnd never returns 0, fix the inaccurate comment

2023-08-28 Thread Junwang Zhao
PQputCopyEnd returns 1 or -1, never 0, I guess the comment was
copy/paste from PQputCopyData's comment, this should be fixed.

-- 
Regards
Junwang Zhao


v1-0001-PQputCopyEnd-never-returns-0-fix-the-inaccurate-c.patch
Description: Binary data


Re: [dsm] comment typo

2023-08-21 Thread Junwang Zhao
On Mon, Aug 21, 2023 at 5:16 PM Daniel Gustafsson  wrote:
>
> > On 18 Aug 2023, at 11:10, Junwang Zhao  wrote:
> >
> > In the following sentence, I believe either 'the' or 'a' should be kept, not
> > both. I here keep the 'the', but feel free to change.
>
> >  * handle: The handle of an existing object, or for DSM_OP_CREATE, the
> > - *a new handle the caller wants created.
> > + *new handle the caller wants created.
>
> Since the handle doesn't exist for DSM_OP_CREATE, both "a handle" and "the
> handle" seems a tad misleading, how about "the identifier for the new handle 
> the
> caller wants created"?
>

Sounds great 

> --
> Daniel Gustafsson
>


-- 
Regards
Junwang Zhao




[dsm] comment typo

2023-08-18 Thread Junwang Zhao
In the following sentence, I believe either 'the' or 'a' should be kept, not
both. I here keep the 'the', but feel free to change.

---
 src/backend/storage/ipc/dsm_impl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/storage/ipc/dsm_impl.c
b/src/backend/storage/ipc/dsm_impl.c
index 6399fa2ad5..19a9cfc8ac 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -137,7 +137,7 @@ int min_dynamic_shared_memory;
  * Arguments:
  * op: The operation to be performed.
  * handle: The handle of an existing object, or for DSM_OP_CREATE, the
- *a new handle the caller wants created.
+ *new handle the caller wants created.
  * request_size: For DSM_OP_CREATE, the requested size.  Otherwise, 0.
  * impl_private: Private, implementation-specific data.  Will be a pointer
  *to NULL for the first operation on a shared memory segment within this
-- 

-- 
Regards
Junwang Zhao




[question] difference between vm_extend and fsm_extend

2023-08-10 Thread Junwang Zhao
Hey hacks,

I'm looking the code of free space map and visibility map, both the module
call  `ExtendBufferedRelTo` to extend its file when necessary, what confused
me is that `vm_extend` has an extra step that send a shared message to force
other backends to close other smgr references.

My question is why does `fsm_extend` not need the invalidation step?

-- 
Regards
Junwang Zhao




Re: [BackendXidGetPid] only access allProcs when xid matches

2023-08-10 Thread Junwang Zhao
On Thu, Aug 10, 2023 at 4:11 PM Ashutosh Bapat
 wrote:
>
> Please add this to commitfest so that it's not forgotten.
>

Added [1], thanks

[1]: https://commitfest.postgresql.org/44/4495/

> On Wed, Aug 9, 2023 at 8:37 PM Junwang Zhao  wrote:
> >
> > On Wed, Aug 9, 2023 at 10:46 PM Ashutosh Bapat
> >  wrote:
> > >
> > > On Wed, Aug 9, 2023 at 9:30 AM Junwang Zhao  wrote:
> > > >
> > > > In function `BackendXidGetPid`, when looping every proc's
> > > > TransactionId, there is no need to access its PGPROC since there
> > > > is shared memory access: `arrayP->pgprocnos[index]`.
> > > >
> > > > Though the compiler can optimize this kind of inefficiency, I
> > > > believe we should ship with better code.
> > > >
> > >
> > > Looks good to me. However, I would just move the variable declaration
> > > with their assignments inside the if () rather than combing the
> > > expressions. It more readable that way.
> >
> > yeah, make sense, also checked elsewhere using the original style,
> > attachment file
> > keep that style, thanks ;)
> >
> > >
> > > --
> > > Best Wishes,
> > > Ashutosh Bapat
> >
> >
> >
> > --
> > Regards
> > Junwang Zhao
>
>
>
> --
> Best Wishes,
> Ashutosh Bapat



-- 
Regards
Junwang Zhao




Re: [BackendXidGetPid] only access allProcs when xid matches

2023-08-09 Thread Junwang Zhao
On Wed, Aug 9, 2023 at 10:46 PM Ashutosh Bapat
 wrote:
>
> On Wed, Aug 9, 2023 at 9:30 AM Junwang Zhao  wrote:
> >
> > In function `BackendXidGetPid`, when looping every proc's
> > TransactionId, there is no need to access its PGPROC since there
> > is shared memory access: `arrayP->pgprocnos[index]`.
> >
> > Though the compiler can optimize this kind of inefficiency, I
> > believe we should ship with better code.
> >
>
> Looks good to me. However, I would just move the variable declaration
> with their assignments inside the if () rather than combing the
> expressions. It more readable that way.

yeah, make sense, also checked elsewhere using the original style,
attachment file
keep that style, thanks ;)

>
> --
> Best Wishes,
> Ashutosh Bapat



-- 
Regards
Junwang Zhao


v2-0001-BackendXidGetPid-only-access-allProcs-when-xid-ma.patch
Description: Binary data


[BackendXidGetPid] only access allProcs when xid matches

2023-08-08 Thread Junwang Zhao
In function `BackendXidGetPid`, when looping every proc's
TransactionId, there is no need to access its PGPROC since there
is shared memory access: `arrayP->pgprocnos[index]`.

Though the compiler can optimize this kind of inefficiency, I
believe we should ship with better code.


-- 
Regards
Junwang Zhao


0001-BackendXidGetPid-only-access-allProcs-when-xid-match.patch
Description: Binary data


[PATCH] [zh_CN.po] fix a typo in simplified Chinese translation file

2023-08-01 Thread Junwang Zhao
add the missing leading `l` for log_statement_sample_rate

-- 
Regards
Junwang Zhao


0001-zh_CN.po-fix-a-typo-in-simplified-Chinese-translatio.patch
Description: Binary data


Re: pg_upgrade fails with in-place tablespace

2023-07-31 Thread Junwang Zhao
On Mon, Jul 31, 2023 at 5:36 PM Rui Zhao  wrote:
>
> Hello postgres hackers,
> Recently I encountered an issue: pg_upgrade fails when dealing with in-place 
> tablespace. As we know, pg_upgrade uses pg_dumpall to dump objects and 
> pg_restore to restore them. The problem seems to be that pg_dumpall is 
> dumping in-place tablespace as relative path, which can't be restored.
>
> Here is the error message of pg_upgrade:
> psql:/home/postgres/postgresql/src/bin/pg_upgrade/tmp_check/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20230729T210058.329/dump/pg_upgrade_dump_globals.sql:36:
>  ERROR:  tablespace location must be an absolute path
>
> To help reproduce the failure, I have attached a tap test. The test also 
> fails with tablespace regression, and it change the default value of 
> allow_in_place_tablespaces to true only for debug, so it may not be fit for 
> production. However, it is enough to reproduce this failure.
> I have also identified a solution for this problem, which I have included in 
> the patch. The solution has two modifications:
>   1) Make the function pg_tablespace_location returns path "" with in-place 
> tablespace, rather than relative path. Because the path of the in-place 
> tablespace is always 'pg_tblspc/'.
>   2) Only check the tablespace with an absolute path in pg_upgrade.
>
>   There are also other solutions, such as supporting the creation of 
> relative-path tablespace in function CreateTableSpace. But do we really need 
> relative-path tablespace? I think in-place tablespace is enough by now. My 
> solution may be more lightweight and harmless.
>
> Thank you for your attention to this matter.
>
> Best regards,
> Rui Zhao

Seems like allow_in_place_tablespaces is a developer only guc, and it
is not for end user usage.
check this commit 7170f2159fb21b62c263acd458d781e2f3c3f8bb


-- 
Regards
Junwang Zhao




Re: table_open/table_close with different lock mode

2023-07-21 Thread Junwang Zhao
On Fri, Jul 21, 2023 at 2:57 PM Gurjeet Singh  wrote:
>
> On Thu, Jul 20, 2023 at 11:38 PM Junwang Zhao  wrote:
> >
> > On Fri, Jul 21, 2023 at 2:26 PM Michael Paquier  wrote:
> > >
> > > On Fri, Jul 21, 2023 at 02:05:56PM +0800, Junwang Zhao wrote:
> > > > I noticed there are some places calling table_open with
> > > > RowExclusiveLock but table_close with NoLock, like in function
> > > > toast_save_datum.
> > > >
> > > > Can anybody explain the underlying logic, thanks in advance.
> > >
> > > This rings a bell.  This is a wanted behavior, see commit f99870d and
> > > its related thread:
> > > https://postgr.es/m/17268-d2fb426e0895a...@postgresql.org
> > >
> >
> > I see this patch, so all the locks held by a transaction will be released
> > at the commit phase, right? Can you show me where the logic is located?
>
> The NoLock is simple a marker that tells the underlying machinery to
> not bother releasing any locks. As a matter of fact, you can pass
> NoLock in *_open() calls, too, to indicate that you don't want any new
> locks, perhaps because the transaction has already taken an
> appropriate lock on the object.
>
> As for lock-releasing codepath at transaction end, see
> CommitTransaction() in xact.c, and specifically at the
> ResourceOwnerRelease() calls in there.
>

Great, thanks for the thorough explanation, will look into the code :)

> Best regards,
> Gurjeet
> http://Gurje.et



-- 
Regards
Junwang Zhao




Re: table_open/table_close with different lock mode

2023-07-21 Thread Junwang Zhao
On Fri, Jul 21, 2023 at 2:26 PM Michael Paquier  wrote:
>
> On Fri, Jul 21, 2023 at 02:05:56PM +0800, Junwang Zhao wrote:
> > I noticed there are some places calling table_open with
> > RowExclusiveLock but table_close with NoLock, like in function
> > toast_save_datum.
> >
> > Can anybody explain the underlying logic, thanks in advance.
>
> This rings a bell.  This is a wanted behavior, see commit f99870d and
> its related thread:
> https://postgr.es/m/17268-d2fb426e0895a...@postgresql.org
>

I see this patch, so all the locks held by a transaction will be released
at the commit phase, right? Can you show me where the logic is located?

> The tests added by this commit in src/test/isolation/ will show the
> difference in terms of the way the toast values get handled with and
> without the change.
> --
> Michael



-- 
Regards
Junwang Zhao




table_open/table_close with different lock mode

2023-07-21 Thread Junwang Zhao
Hey hackers,

I noticed there are some places calling table_open with
RowExclusiveLock but table_close with NoLock, like in function
toast_save_datum.

Can anybody explain the underlying logic, thanks in advance.

-- 
Regards
Junwang Zhao




Re: session username in default psql prompt?

2023-05-27 Thread Junwang Zhao
I'd like to take this if this is approved ;)

On Sat, May 27, 2023 at 8:52 PM Andrew Dunstan  wrote:
>
> I don't recall if this has come up before.
>
> I'm sometimes mildly annoyed when I get on a new system and find the username 
> missing in my psql prompt. Or if a customer shows me a screen and I have to 
> ask "which user is this". If we're dealing with several roles it can get 
> confusing. My usual .psqlrc has
>
>\set PROMPT1 '%n@%~%R%x%# '
>
> So my suggestion is that we prepend '%n@' to the default psql PROMPT1 (and 
> maybe PROMPT2).
>
> I realize it's not exactly earth-shattering, but I think it's a bit more 
> user-friendly.
>
> (Would be a good beginner project, the code change would be tiny but there 
> are lots of docs / examples that we might want to change if we did it.)
>
>
> cheers
>
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com



-- 
Regards
Junwang Zhao




[pg_rewind] use the passing callback instead of global function

2023-04-25 Thread Junwang Zhao
`local_traverse_files` and `libpq_traverse_files` both have a
callback parameter but instead use the global process_source_file
which is no good for function encapsulation.

-- 
Regards
Junwang Zhao


0001-pg_rewind-use-the-passing-callback-instead-of-global.patch
Description: Binary data


Re: Use INT_MAX for wal size related gucs's max value

2023-04-18 Thread Junwang Zhao
These gucs are always used with ConvertToXSegs, to calculate the count of
wal segments(see the following code snip), and wal_segment_size can be
configured by initdb as a value of a power of 2 between 1 and 1024 (megabytes),
so I think INT_MAX should be safe here.

/*
* Convert values of GUCs measured in megabytes to equiv. segment count.
* Rounds down.
*/
#define ConvertToXSegs(x, segsize) XLogMBVarToSegs((x), (segsize))

/*
* Convert values of GUCs measured in megabytes to equiv. segment count.
* Rounds down.
*/
#define XLogMBVarToSegs(mbvar, wal_segsz_bytes) \
((mbvar) / ((wal_segsz_bytes) / (1024 * 1024)))

On Wed, Apr 19, 2023 at 11:33 AM Tom Lane  wrote:
>
> Junwang Zhao  writes:
> > The wal size related gucs use the MB unit, so we should just use
> > INT_MAX instead of MAX_KILOBYTES as the max value.
>
> The point of MAX_KILOBYTES is to avoid overflow when the value
> is multiplied by 1kB.  It does seem like that might not be
> appropriate for these values, but that doesn't mean that we can
> blithely go to INT_MAX.  Have you chased down how they are used?
>
>     regards, tom lane



-- 
Regards
Junwang Zhao




Use INT_MAX for wal size related gucs's max value

2023-04-18 Thread Junwang Zhao
The wal size related gucs use the MB unit, so we should just use
INT_MAX instead of MAX_KILOBYTES as the max value.

-- 
Regards
Junwang Zhao


0001-use-INT_MAX-for-wal-size-related-max-value.patch
Description: Binary data


fix a typo in file src/backend/utils/adt/xid8funcs.c comment

2023-03-24 Thread Junwang Zhao
%s/pg_current_xact/pg_current_xact_id

-- 
Regards
Junwang Zhao


0001-typo-replace-pg_current_xact-with-pg_current_xact_id.patch
Description: Binary data


[meson] add missing pg_attribute_aligned for MSVC in meson build

2022-10-13 Thread Junwang Zhao
Hi Andres,

Commit ec3c9cc add pg_attribute_aligned in MSVC[1],
which was pushed one day before the meson commits,
so meson build missed this feature.

[1]: 
https://www.postgresql.org/message-id/caaaqye-hbtzvr3msomtk+hyw2s0e0oapzmw8icsmytma+mn...@mail.gmail.com

-- 
Regards
Junwang Zhao


0001-meson-add-missing-pg_attribute_aligned-for-MSVC.patch
Description: Binary data


Re: Non-decimal integer literals

2022-10-11 Thread Junwang Zhao
On Tue, Oct 11, 2022 at 4:59 PM Peter Eisentraut
 wrote:
>
> On 11.10.22 05:29, Junwang Zhao wrote:
> > What do you think if we move these code into a static inline function? like:
> >
> > static inline char*
> > process_digits(char *ptr, int32 *result)
> > {
> > ...
> > }
>
> How would you handle the different ways each branch checks for valid
> digits and computes the value of each digit?
>

Didn't notice that, sorry for the noise ;(


-- 
Regards
Junwang Zhao




Re: Non-decimal integer literals

2022-10-10 Thread Junwang Zhao
Hi Peter,

  /* process digits */
+ if (ptr[0] == '0' && (ptr[1] == 'x' || ptr[1] == 'X'))
+ {
+ ptr += 2;
+ while (*ptr && isxdigit((unsigned char) *ptr))
+ {
+ int8 digit = hexlookup[(unsigned char) *ptr];
+
+ if (unlikely(pg_mul_s16_overflow(tmp, 16, )) ||
+ unlikely(pg_sub_s16_overflow(tmp, digit, )))
+ goto out_of_range;
+
+ ptr++;
+ }
+ }
+ else if (ptr[0] == '0' && (ptr[1] == 'o' || ptr[1] == 'O'))
+ {
+ ptr += 2;
+
+ while (*ptr && (*ptr >= '0' && *ptr <= '7'))
+ {
+ int8 digit = (*ptr++ - '0');
+
+ if (unlikely(pg_mul_s16_overflow(tmp, 8, )) ||
+ unlikely(pg_sub_s16_overflow(tmp, digit, )))
+ goto out_of_range;
+ }
+ }
+ else if (ptr[0] == '0' && (ptr[1] == 'b' || ptr[1] == 'B'))
+ {
+ ptr += 2;
+
+ while (*ptr && (*ptr >= '0' && *ptr <= '1'))
+ {
+ int8 digit = (*ptr++ - '0');
+
+ if (unlikely(pg_mul_s16_overflow(tmp, 2, )) ||
+ unlikely(pg_sub_s16_overflow(tmp, digit, )))
+ goto out_of_range;
+ }
+ }
+ else
+ {
  while (*ptr && isdigit((unsigned char) *ptr))
  {
  int8 digit = (*ptr++ - '0');
@@ -128,6 +181,7 @@ pg_strtoint16(const char *s)
  unlikely(pg_sub_s16_overflow(tmp, digit, )))
  goto out_of_range;
  }
+ }

What do you think if we move these code into a static inline function? like:

static inline char*
process_digits(char *ptr, int32 *result)
{
...
}

On Mon, Oct 10, 2022 at 10:17 PM Peter Eisentraut
 wrote:
>
> On 16.02.22 11:11, Peter Eisentraut wrote:
> > The remaining patches are material for PG16 at this point, and I will
> > set the commit fest item to returned with feedback in the meantime.
>
> Time to continue with this.
>
> Attached is a rebased and cleaned up patch for non-decimal integer
> literals.  (I don't include the underscores-in-numeric literals patch.
> I'm keeping that for later.)
>
> Two open issues from my notes:
>
> Technically, numeric_in() should be made aware of this, but that seems
> relatively complicated and maybe not necessary for the first iteration.
>
> Taking another look around ecpg to see how this interacts with C-syntax
> integer literals.  I'm not aware of any particular issues, but it's
> understandably tricky.
>
> Other than that, this seems pretty complete as a start.



-- 
Regards
Junwang Zhao




[PATCH v1] [meson] fix some typo to make it more readable

2022-10-05 Thread Junwang Zhao
Hi Andres,

Seems there are some typo in file src/backend/meson.build comment, pls
have a look.

-- 
Regards
Junwang Zhao


0001-meson-fix-some-typo-to-make-it-more-readable.patch
Description: Binary data


[PATCH v1] [meson] add a default option prefix=/usr/local/pgsql

2022-09-30 Thread Junwang Zhao
autoconf set PREFIX to /usr/local/pgsql, so I think we should
do the same in meson build.

This will group all the targets generated by postgres in the same directory.

-- 
Regards
Junwang Zhao


0001-meson-add-a-default-option-prefix-usr-local-pgsql.patch
Description: Binary data


Re: [patch] Adding an assertion to report too long hash table name

2022-09-29 Thread Junwang Zhao
LGTM
+1

On Thu, Sep 29, 2022 at 9:38 AM Xiaoran Wang  wrote:
>
> Hi,
>
> The max size for the shmem hash table name is SHMEM_INDEX_KEYSIZE - 1.
> but when the caller uses a longer hash table name, it doesn't report any 
> error, instead
> it just uses the first SHMEM_INDEX_KEYSIZE -1 chars as the hash table name.
>
> I created some shmem hash tables with the same prefix which was longer than
> SHMEM_INDEX_KEYSIZE - 1, and the size of those hash tables were the same,
> then only one hash table was created. But I thought those hash tables were 
> created
> successfully.
>
> I know this is a corner case, but it's difficult to figure it out when run 
> into it. So I add
> an assertion to prevent it.
>
>
> Thanks,
> Xiaoran



-- 
Regards
Junwang Zhao




Re: [PATCH] polish the error message of creating proc

2022-09-21 Thread Junwang Zhao
On Wed, Sep 21, 2022 at 10:53 PM Tom Lane  wrote:
>
> Junwang Zhao  writes:
> > I noticed that there are some translations under the backend/po directory,
> > can we just change
> > msgid "function \"%s\" already exists with same argument types"
> > to
> > msgid "%s \"%s\" already exists with same argument types" ?
>
> No.  This doesn't satisfy our message translation guidelines [1].
> The fact that there are other messages that aren't up to project
> standard isn't a license to create more.
>
> More generally: there are probably dozens, if not hundreds, of
> messages in the backend that say "function" but nowadays might
> also be talking about a procedure.  I'm not sure there's value
> in improving just one of them.
>
> I am pretty sure that we made an explicit decision some time back
> that it is okay to say "function" when the object could also be
> an aggregate or window function.  So you could at least cut this
> back to just handling "procedure" and "function".  Or you could
> change it to "routine" as Julien suggests, but I think a lot of
> people will not think that's an improvement.

Yeah, make sense, will leave it as is.

>
> regards, tom lane
>
> [1] https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES



-- 
Regards
Junwang Zhao




Re: [PATCH] polish the error message of creating proc

2022-09-21 Thread Junwang Zhao
On Wed, Sep 21, 2022 at 8:17 PM Julien Rouhaud  wrote:
>
> Hi,
>
> On Wed, Sep 21, 2022 at 07:45:01PM +0800, Junwang Zhao wrote:
> > when a error occurs when creating proc, it should point out the
> > specific proc kind instead of just printing "function".
>
> You should have kept the original thread in copy (1), or at least mention it.

Yeah, thanks for pointing that out, will do that next time.

>
> > diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
> > index a9fe45e347..58af4b48ce 100644
> > --- a/src/backend/catalog/pg_proc.c
> > +++ b/src/backend/catalog/pg_proc.c
> > @@ -373,7 +373,11 @@ ProcedureCreate(const char *procedureName,
> > if (!replace)
> > ereport(ERROR,
> > 
> > (errcode(ERRCODE_DUPLICATE_FUNCTION),
> > -errmsg("function \"%s\"
> > already exists with same argument types",
> > +errmsg("%s \"%s\" already
> > exists with same argument types",
> > +
> > (oldproc->prokind == PROKIND_AGGREGATE ? "aggregate function" :
> > +
> > oldproc->prokind == PROKIND_PROCEDURE ? "procedure" :
> > +
> > oldproc->prokind == PROKIND_WINDOW ? "window function" :
> > +"function"),
> > procedureName)));
> > if (!pg_proc_ownercheck(oldproc->oid, proowner))
> > aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_FUNCTION,
>
> You can't put part of the message in parameter, as the resulting string isn't
> translatable.  You should either use "routine" as a generic term or provide 3
> different full messages.

I noticed that there are some translations under the backend/po directory,
can we just change
msgid "function \"%s\" already exists with same argument types"
to
msgid "%s \"%s\" already exists with same argument types" ?

>
> [1] 
> https://www.postgresql.org/message-id/29ea5666.6ce8.1835f4b4992.coremail.qtds_...@126.com



-- 
Regards
Junwang Zhao




[PATCH] polish the error message of creating proc

2022-09-21 Thread Junwang Zhao
when a error occurs when creating proc, it should point out the
specific proc kind instead of just printing "function".

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index a9fe45e347..58af4b48ce 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -373,7 +373,11 @@ ProcedureCreate(const char *procedureName,
if (!replace)
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_FUNCTION),
-errmsg("function \"%s\"
already exists with same argument types",
+errmsg("%s \"%s\" already
exists with same argument types",
+
(oldproc->prokind == PROKIND_AGGREGATE ? "aggregate function" :
+
oldproc->prokind == PROKIND_PROCEDURE ? "procedure" :
+
oldproc->prokind == PROKIND_WINDOW ? "window function" :
+"function"),
procedureName)));
if (!pg_proc_ownercheck(oldproc->oid, proowner))
aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_FUNCTION,

-- 
Regards
Junwang Zhao


0001-polish-the-error-message-of-creating-proc.patch
Description: Binary data


Re: Modernizing our GUC infrastructure

2022-09-05 Thread Junwang Zhao
ah, yes, that makes sense ;)

On Tue, Sep 6, 2022 at 10:48 AM Tom Lane  wrote:
>
> Junwang Zhao  writes:
> >   /*
> > - * Create table with 20% slack
> > + * Create hash table with 20% slack
> >   */
> >   size_vars = num_vars + num_vars / 4;
>
> > Should we change 20% to 25%, I thought that might be
> > a typo.
>
> No ... 20% of the allocated space is spare.
>
>     regards, tom lane



-- 
Regards
Junwang Zhao




Re: Modernizing our GUC infrastructure

2022-09-05 Thread Junwang Zhao
Hi Tom,

@@ -5836,74 +5865,106 @@ build_guc_variables(void)
  }

  /*
- * Create table with 20% slack
+ * Create hash table with 20% slack
  */
  size_vars = num_vars + num_vars / 4;

Should we change 20% to 25%, I thought that might be
a typo.

On Tue, Sep 6, 2022 at 6:28 AM Tom Lane  wrote:
>
> Attached is a patch series that attempts to modernize our GUC
> infrastructure, in particular removing the performance bottlenecks
> it has when there are lots of GUC variables.  I wrote this because
> I am starting to question the schema-variables patch [1] --- that's
> getting to be quite a large patch and I grow less and less sure
> that it's solving a problem our users want solved.  I think what
> people actually want is better support of the existing mechanism
> for ad-hoc session variables, namely abusing custom GUCs for that
> purpose.  One of the big reasons we have been resistant to formally
> supporting that is fear of the poor performance guc.c would have
> with lots of variables.  But we already have quite a lot of them:
>
> regression=# select count(*) from pg_settings;
>  count
> ---
>353
> (1 row)
>
> and more are getting added all the time.  I think this patch series
> could likely be justified just in terms of positive effect on core
> performance, never mind user-added GUCs.
>
> 0001 and 0002 below are concerned with converting guc.c to store its
> data in a dedicated memory context (GUCMemoryContext) instead of using
> raw malloc().  This is not directly a performance improvement, and
> I'm prepared to drop the idea if there's a lot of pushback, but I
> think it would be a good thing to do.  The only hard reason for using
> malloc() there was the lack of ability to avoid throwing elog(ERROR)
> on out-of-memory in palloc().  But mcxt.c grew that ability years ago.
> Switching to a dedicated context would greatly improve visibility and
> accountability of GUC-related allocations.  Also, the 0003 patch will
> switch guc.c to relying on a palloc-based hashtable, and it seems a
> bit silly to have part of the data structure in palloc and part in
> malloc.  However 0002 is a bit invasive, in that it forces code
> changes in GUC check callbacks, if they want to reallocate the new
> value or create an "extra" data structure.  My feeling is that not
> enough external modules use those facilities for this to pose a big
> problem.  However, the ones that are subject to it will have a
> non-fun time tracking down why their code is crashing.  (The recent
> context-header changes mean that you get a very obscure failure when
> trying to pfree() a malloc'd chunk -- for me, that typically ends
> in an assertion failure in generation.c.  Can we make that less
> confusing?)
>
> 0003 replaces guc.c's bsearch-a-sorted-array lookup infrastructure
> with a dynahash hash table.  (I also looked at simplehash, but it
> has no support for not elog'ing on OOM, and it increases the size
> of guc.o by 10KB or so.)  This fixes the worse-than-O(N^2) time
> needed to create N new GUCs, as in
>
> do $$
> begin
>   for i in 1..1 loop
> perform set_config('foo.bar' || i::text, i::text, false);
>   end loop;
> end $$;
>
> On my machine, this takes about 4700 ms in HEAD, versus 23 ms
> with this patch.  However, the places that were linearly scanning
> the array now need to use hash_seq_search, so some other things
> like transaction shutdown (AtEOXact_GUC) get slower.
>
> To address that, 0004 adds some auxiliary lists that link together
> just the variables that are interesting for specific purposes.
> This is helpful even without considering the possibility of a
> lot of user-added GUCs: in a typical session, for example, not
> many of those 353 GUCs have non-default values, and even fewer
> get modified in any one transaction (typically, anyway).
>
> As an example of the speedup from 0004, these DO loops:
>
> create or replace function func_with_set(int) returns int
> strict immutable language plpgsql as
> $$ begin return $1; end $$
> set enable_seqscan = false;
>
> do $$
> begin
>   for i in 1..10 loop
> perform func_with_set(i);
>   end loop;
> end $$;
>
> do $$
> begin
>   for i in 1..10 loop
> begin
>   perform func_with_set(i);
> exception when others then raise;
> end;
>   end loop;
> end $$;
>
> take about 260 and 320 ms respectively for me, in HEAD with
> just the stock set of variables.  But after creating 1
> GUCs with the previous DO loop, they're up to about 3200 ms.
> 0004 brings that back down to being indistinguishable from the
> speed with few GUCs.
>
> So I think this is good cleanup in its own right, plus it
> removes one major objection to considering user-defined GUCs
> as a supported feature.
>
> regards, tom lane
>
> [1] 
> https://www.postgresql.org/message-id/flat/CAFj8pRD053CY_N4%3D6SvPe7ke6xPbh%3DK50LUAOwjC3jm8Me9Obg%40mail.gmail.com
>


-- 
Regards
Junwang Zhao




Re: [PATCH v1] fix potential memory leak in untransformRelOptions

2022-09-01 Thread Junwang Zhao
got it, thanks.

Tom Lane 于2022年9月2日 周五01:13写道:

> Junwang Zhao  writes:
> > I'm a little confused when we should call *pfree* and when we should not.
> > A few lines before there is a call *text_to_cstring* in which it invokes
> > *pfree* to free the unpacked text [0]. I'm just thinking that since *s*
> has
> > been duplicated, we should free it, that's where the patch comes from.
>
> By and large, the server is designed so that small memory leaks don't
> matter: the space will be reclaimed when the current memory context
> is deleted, and most code runs in reasonably short-lived contexts.
> Individually pfree'ing such allocations is actually a net negative,
> because it costs cycles and code space.
>
> There are places where a leak *does* matter, but unless you can
> demonstrate that this is one, it's not worth changing.
>
>     regards, tom lane
>
-- 
Regards
Junwang Zhao


Re: [PATCH v1] fix potential memory leak in untransformRelOptions

2022-09-01 Thread Junwang Zhao
On Thu, Sep 1, 2022 at 10:10 PM Tom Lane  wrote:
>
> Junwang Zhao  writes:
> >   result = lappend(result, makeDefElem(pstrdup(s), val, -1));
> > + pfree(s);
>
> I wonder why it's pstrdup'ing s in the first place.
>
Maybe it's pstrdup'ing s so that the caller should take care of the free?

I'm a little confused when we should call *pfree* and when we should not.
A few lines before there is a call *text_to_cstring* in which it invokes
*pfree* to free the unpacked text [0]. I'm just thinking that since *s* has
been duplicated, we should free it, that's where the patch comes from.

[0]:
```
char *
text_to_cstring(const text *t)
{
/* must cast away the const, unfortunately */
text *tunpacked = pg_detoast_datum_packed(unconstify(text *, t));
int len = VARSIZE_ANY_EXHDR(tunpacked);
char *result;

result = (char *) palloc(len + 1);
memcpy(result, VARDATA_ANY(tunpacked), len);
result[len] = '\0';

if (tunpacked != t)
pfree(tunpacked);

return result;
}
```

> regards, tom lane



-- 
Regards
Junwang Zhao




[PATCH v1] fix potential memory leak in untransformRelOptions

2022-09-01 Thread Junwang Zhao
*TextDatumGetCString* calls palloc to alloc memory for the option
text datum, in some cases the the memory is allocated in
*TopTransactionContext*, this may cause memory leak for a long
running backend.
---
 src/backend/access/common/reloptions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/backend/access/common/reloptions.c
b/src/backend/access/common/reloptions.c
index 609329bb21..6076677aef 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -1360,6 +1360,7 @@ untransformRelOptions(Datum options)
  val = (Node *) makeString(pstrdup(p));
  }
  result = lappend(result, makeDefElem(pstrdup(s), val, -1));
+ pfree(s);
  }

  return result;
-- 
2.33.0

-- 
Regards
Junwang Zhao


0001-fix-potential-memory-leak-in-untransformRelOptions.patch
Description: Binary data


use NoLock instead of the magic number 0

2022-09-01 Thread Junwang Zhao
Inside *add_local_<>_reloption*, we should pass NoLock instead of
the magic 0 to init_<>_reloption, which makes more sense.


-- 
Regards
Junwang Zhao


0001-use-NoLock-instead-of-the-magic-0.patch
Description: Binary data


  1   2   >