On Fri, May 9, 2025 at 2:41 AM Sutou Kouhei <k...@clear-code.com> wrote:
>
> Hi,
>
> In <cakfquwardxanal+qct6lzraem4rwkswv9v+viv_mcr+rex3...@mail.gmail.com>
>   "Re: Make COPY format extendable: Extract COPY TO format implementations"
>   on Sat, 3 May 2025 22:27:36 -0700,
>   "David G. Johnston" <david.g.johns...@gmail.com> wrote:
>
> > In any case, I’m doubtful either of us can make a convincing enough
> > argument to sway the other fully. Both options are plausible, IMO. Others
> > need to chime in.
>
> I may misunderstand but here is the current summary, right?

Thank you for summarizing the discussion.

>
> Proposed approaches to register custom COPY formats:
> a. Create a function that has the same name of custom COPY
>    format
> b. Call a register function from _PG_init()
>
> FYI: I proposed c. approach that uses a. but it always
> requires schema name for format name in other e-mail.

With approach (c), do you mean that we require users to change all
FORMAT option values, for example from 'text' to 'pg_catalog.text',
after the upgrade? Or do we exempt the built-in formats?

>
> Users can register the same format name:
> a. Yes
>    * Users can distinct the same format name by schema name
>    * If format name doesn't have schema name, the used
>      format depends on search_path
>    * Pros:
>      * Using schema for it is consistent with other
>        PostgreSQL mechanisms
>      * Custom format never conflict with built-in
>        format. For example, an extension register "xml" and
>        PostgreSQL adds "xml" later, they are never
>        conflicted because PostgreSQL's "xml" is registered
>        to pg_catalog.
>    * Cons: Different format may be used with the same
>      input. For example, "jsonlines" may choose
>      "jsonlines" implemented by extension X or implemented
>      by extension Y when search_path is different.
> b. No
>    * Users can use "${schema}.${name}" for format name
>      that mimics PostgreSQL's builtin schema (but it's just
>      a string)
>
>
> Built-in formats (text/csv/binary) should be able to
> overwritten by extensions:
> a. (The current patch is no but David's answer is) Yes
>    * Pros: Users can use drop-in replacement faster
>      implementation without changing input
>    * Cons: Users may overwrite them accidentally.
>      It may break pg_dump result.
>      (This is called as "backward incompatibility.")
> b. No

The summary matches my understanding. I think the second point is
important. If we go with a tablesample-like API, I agree with David's
point that all FORMAT values including the built-in formats should
depend on the search_path value.
While it provides a similar user experience to other database objects,
there is a possibility that a COPY with a built-in format could work
differently on v19 than on v18 or earlier, depending on the search_path
value.

> Are there any missing or wrong items?

I think approach (b) provides more flexibility than (a) in terms of
API design, since with (a) we need to do everything through one handler
function and its callbacks.

> If we can summarize
> the current discussion here correctly, others will be able
> to chime in this discussion. (At least I can do it.)

+1

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
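
For readers following the search_path concern raised above, here is a minimal Python sketch of how an unqualified FORMAT value might resolve if formats lived in schemas. It is only a toy model, not PostgreSQL code: the registry contents, the `ext_x` and `ext_y` schema names, and the `resolve_format` helper are all hypothetical. The one behavior borrowed from PostgreSQL is that `pg_catalog` is searched first unless it is listed explicitly in search_path.

```python
# Hypothetical model of schema-based format lookup. Not PostgreSQL code.

# (schema, format name) -> label of the implementation that would be used
REGISTRY = {
    ("pg_catalog", "text"): "built-in text",
    ("pg_catalog", "csv"): "built-in csv",
    ("ext_x", "jsonlines"): "jsonlines from extension X",
    ("ext_y", "jsonlines"): "jsonlines from extension Y",
    ("ext_x", "text"): "text replacement from extension X",
}


def resolve_format(name, search_path):
    """Return the implementation chosen for a FORMAT value."""
    if "." in name:
        # Schema-qualified name: search_path is not consulted.
        schema, bare = name.split(".", 1)
        return REGISTRY[(schema, bare)]
    # pg_catalog is searched first unless it is listed explicitly.
    path = list(search_path)
    if "pg_catalog" not in path:
        path.insert(0, "pg_catalog")
    for schema in path:
        if (schema, name) in REGISTRY:
            return REGISTRY[(schema, name)]
    raise LookupError(f"COPY format {name!r} not found")


# Same input, different search_path, different implementation:
print(resolve_format("jsonlines", ["ext_x", "ext_y"]))
print(resolve_format("jsonlines", ["ext_y", "ext_x"]))

# A built-in name is shadowed only when pg_catalog is listed after ext_x:
print(resolve_format("text", ["ext_x"]))
print(resolve_format("text", ["ext_x", "pg_catalog"]))
```

Under this model, a plain `FORMAT text` keeps its built-in meaning with a default-style search_path, but changes meaning once `pg_catalog` is placed explicitly after a schema that defines its own `text`. That is the v18-versus-v19 behavior difference discussed in the message.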