Hi, In <CAEG8a3LSRhK601Bn50u71BgfNWm4q3kv-o-KEq=hrbylby_...@mail.gmail.com> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Wed, 6 Dec 2023 22:07:51 +0800, Junwang Zhao <zhjw...@gmail.com> wrote:
> Should we extract both *copy to* and *copy from* for the first step, in that > case we can add the pg_copy_handler catalog smoothly later. I don't object it (mixing TO/FROM changes to one patch) but it may make review difficult. Is it acceptable? FYI: I planed that I implement TO part, and then FROM part, and then unify TO/FROM parts if needed. [1] > Attached V4 adds 'extract copy from' and it passed the cirrus ci, > please take a look. Thanks. Here are my comments: > + /* > + * Error is relevant to a particular line. > + * > + * If line_buf still contains the correct line, print it. > + */ > + if (cstate->line_buf_valid) We need to fix the indentation. > +CopyFromFormatBinaryStart(CopyFromState cstate, TupleDesc tupDesc) > +{ > + FmgrInfo *in_functions; > + Oid *typioparams; > + Oid in_func_oid; > + AttrNumber num_phys_attrs; > + > + /* > + * Pick up the required catalog information for each attribute in the > + * relation, including the input function, the element type (to pass to > + * the input function), and info about defaults and constraints. (Which > + * input function we use depends on text/binary format choice.) > + */ > + num_phys_attrs = tupDesc->natts; > + in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo)); > + typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid)); We need to update the comment because defaults and constraints aren't picked up here. > +CopyFromFormatTextStart(CopyFromState cstate, TupleDesc tupDesc) ... > + /* > + * Pick up the required catalog information for each attribute in the > + * relation, including the input function, the element type (to pass to > + * the input function), and info about defaults and constraints. (Which > + * input function we use depends on text/binary format choice.) > + */ > + in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo)); > + typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid)); ditto. > @@ -1716,15 +1776,6 @@ BeginCopyFrom(ParseState *pstate, > ReceiveCopyBinaryHeader(cstate); > } I think that this block should be moved to CopyFromFormatBinaryStart() too. But we need to run it after we setup inputs such as data_source_cb, pipe and filename... +/* Routines for a COPY HANDLER implementation. */ +typedef struct CopyHandlerOps +{ + /* Called when COPY TO is started. This will send a header. */ + void (*copy_to_start) (CopyToState cstate, TupleDesc tupDesc); + + /* Copy one row for COPY TO. */ + void (*copy_to_one_row) (CopyToState cstate, TupleTableSlot *slot); + + /* Called when COPY TO is ended. This will send a trailer. */ + void (*copy_to_end) (CopyToState cstate); + + void (*copy_from_start) (CopyFromState cstate, TupleDesc tupDesc); + bool (*copy_from_next) (CopyFromState cstate, ExprContext *econtext, + Datum *values, bool *nulls); + void (*copy_from_error_callback) (CopyFromState cstate); + void (*copy_from_end) (CopyFromState cstate); +} CopyHandlerOps; It seems that "copy_" prefix is redundant. Should we use "to_start" instead of "copy_to_start" and so on? BTW, it seems that "COPY FROM (FORMAT json)" may not be implemented. [2] We may need to care about NULL copy_from_* cases. > I added a hook *copy_from_end* but this might be removed later if not used. It may be useful to clean up resources for COPY FROM but the patch doesn't call the copy_from_end. How about removing it for now? We can add it and call it from EndCopyFrom() later? Because it's not needed for now. I think that we should focus on refactoring instead of adding a new feature in this patch. [1]: https://www.postgresql.org/message-id/20231204.153548.2126325458835528809.kou%40clear-code.com [2]: https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com Thanks, -- kou