Hi,

In <CAEG8a3LSRhK601Bn50u71BgfNWm4q3kv-o-KEq=hrbylby_...@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on 
Wed, 6 Dec 2023 22:07:51 +0800,
  Junwang Zhao <zhjw...@gmail.com> wrote:

> Should we extract both *copy to* and *copy from* for the first step, in that
> case we can add the pg_copy_handler catalog smoothly later.

I don't object it (mixing TO/FROM changes to one patch) but
it may make review difficult. Is it acceptable?

FYI: I planed that I implement TO part, and then FROM part,
and then unify TO/FROM parts if needed. [1]

> Attached V4 adds 'extract copy from' and it passed the cirrus ci,
> please take a look.

Thanks. Here are my comments:

> +             /*
> +                     * Error is relevant to a particular line.
> +                     *
> +                     * If line_buf still contains the correct line, print it.
> +                     */
> +             if (cstate->line_buf_valid)

We need to fix the indentation.

> +CopyFromFormatBinaryStart(CopyFromState cstate, TupleDesc tupDesc)
> +{
> +     FmgrInfo   *in_functions;
> +     Oid                *typioparams;
> +     Oid                     in_func_oid;
> +     AttrNumber      num_phys_attrs;
> +
> +     /*
> +      * Pick up the required catalog information for each attribute in the
> +      * relation, including the input function, the element type (to pass to
> +      * the input function), and info about defaults and constraints. (Which
> +      * input function we use depends on text/binary format choice.)
> +      */
> +     num_phys_attrs = tupDesc->natts;
> +     in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> +     typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));

We need to update the comment because defaults and
constraints aren't picked up here.

> +CopyFromFormatTextStart(CopyFromState cstate, TupleDesc tupDesc)
...
> +     /*
> +      * Pick up the required catalog information for each attribute in the
> +      * relation, including the input function, the element type (to pass to
> +      * the input function), and info about defaults and constraints. (Which
> +      * input function we use depends on text/binary format choice.)
> +      */
> +     in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> +     typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));

ditto.


> @@ -1716,15 +1776,6 @@ BeginCopyFrom(ParseState *pstate,
>               ReceiveCopyBinaryHeader(cstate);
>       }

I think that this block should be moved to
CopyFromFormatBinaryStart() too. But we need to run it after
we setup inputs such as data_source_cb, pipe and filename...

+/* Routines for a COPY HANDLER implementation. */
+typedef struct CopyHandlerOps
+{
+       /* Called when COPY TO is started. This will send a header. */
+       void            (*copy_to_start) (CopyToState cstate, TupleDesc 
tupDesc);
+
+       /* Copy one row for COPY TO. */
+       void            (*copy_to_one_row) (CopyToState cstate, TupleTableSlot 
*slot);
+
+       /* Called when COPY TO is ended. This will send a trailer. */
+       void            (*copy_to_end) (CopyToState cstate);
+
+       void            (*copy_from_start) (CopyFromState cstate, TupleDesc 
tupDesc);
+       bool            (*copy_from_next) (CopyFromState cstate, ExprContext 
*econtext,
+                                                                  Datum 
*values, bool *nulls);
+       void            (*copy_from_error_callback) (CopyFromState cstate);
+       void            (*copy_from_end) (CopyFromState cstate);
+} CopyHandlerOps;

It seems that "copy_" prefix is redundant. Should we use
"to_start" instead of "copy_to_start" and so on?

BTW, it seems that "COPY FROM (FORMAT json)" may not be implemented. [2]
We may need to care about NULL copy_from_* cases.


> I added a hook *copy_from_end* but this might be removed later if not used.

It may be useful to clean up resources for COPY FROM but the
patch doesn't call the copy_from_end. How about removing it
for now? We can add it and call it from EndCopyFrom() later?
Because it's not needed for now.

I think that we should focus on refactoring instead of
adding a new feature in this patch.


[1]: 
https://www.postgresql.org/message-id/20231204.153548.2126325458835528809.kou%40clear-code.com
[2]: 
https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com


Thanks,
-- 
kou


Reply via email to