On Tue, Nov 25, 2025 at 10:48 PM 河田達也 <[email protected]> wrote:
> > One question: should TRIM_SPACE remove only the literal space character
> > (' '), or should it also trim other whitespace characters (e.g., tab,
> > newline, those recognized by isspace())?
> 3. Characters to trim
>    I plan to trim only the ASCII space character (' ') to keep the behavior
>    simple and avoid ambiguity.
>    Support for additional whitespace characters could be considered later if
>    there is consensus.

Understood.


> >   I'm kind of down on it, because it's inevitably going to add
> >   processing overhead to every COPY operation whether the feature
> >   is used or not.  I don't find it likely to be sufficiently
> > useful to justify that universal cost.
> 4. Performance / overhead concerns
>    Thank you for raising this point.
>    I fully agree that the feature must not introduce overhead when TRIM_SPACE
>    is disabled.
>
>    The trimming logic will be executed only when the option is explicitly
>    enabled, so there will be just a single additional conditional check.
>    While this does technically add some processing overhead, it is expected
>    to be slight in practice for typical CSV loads and unlikely to be a concern.

I also agree that this feature probably won't add noticeable overhead to
COPY when it isn't used, but it would still be good to measure
the performance impact of the patch.


> > COPY is not a general-purpose filter or ETL tool, and we try
> > to make it one at our peril.
>    My intention is not to expand COPY into an ETL tool, but rather to
>    provide a small convenience option similar to FORCE_NULL or ON_ERROR,
>    to help users avoid common issues caused by unintended leading or
>    trailing spaces in CSV/text files.

I see Tom's point. There might be many possible features that could process
COPY input/output, and we can't reasonably add all of them to PostgreSQL core.
So the question is which ones belong in core. On second thought,
perhaps features that are difficult or impossible to implement outside the core
are the ones that can be considered for COPY itself. Otherwise, it might be
better to avoid expanding COPY unnecessarily. Anyway, I'd like to hear more
opinions on this.

As for TRIM_SPACE, it seems possible to implement it as an external module
and call it via COPY PROGRAM. Is this true?
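
To make the question concrete, here is a rough sketch of what such an
external filter might look like. It is only an illustration, not a tested
proposal: the script name trim_space.py and the function trim_csv are
hypothetical, and it assumes that every field, quoted or not, has only
leading and trailing ASCII spaces removed.

```python
import csv
import sys


def trim_csv(src, dst):
    """Copy CSV rows from src to dst, stripping only ASCII spaces.

    Assumption: quoted and unquoted fields are treated alike, and tabs,
    newlines and other whitespace are left untouched.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        # str.strip(" ") removes only the ' ' character from both ends.
        writer.writerow([field.strip(" ") for field in row])


if __name__ == "__main__":
    # Act as a stdin-to-stdout filter so the script can sit in a pipeline.
    trim_csv(sys.stdin, sys.stdout)
```

It could then be invoked with something like
COPY t FROM PROGRAM 'python3 trim_space.py < /path/to/data.csv' WITH (FORMAT csv),
where the table name and paths are placeholders. One caveat with this
particular sketch: round-tripping through Python's csv module does not
preserve the original quoting, so a quoted empty string and an unquoted
empty field (NULL in COPY's CSV format) may no longer be distinguishable
after filtering.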

Regards,

-- 
Fujii Masao

