Re: Fixing backslash dot for COPY FROM...CSV

Daniel Verite Tue, 16 Jan 2024 06:43:06 -0800

Robert Haas wrote:

> Part of my hesitancy, I suppose, is that I don't
> understand why we even have this strange convention of making \.
> terminate the input in the first place -- I mean, why wouldn't that be
> done in some kind of out-of-band way, rather than including a special
> marker in the data?


The v3 protocol added the out-of-band method, but the v2 protocol
did not have it, and as far as I understand, this is the reason why
CopyReadLineText() must interpret \. as an end-of-data marker.
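
To illustrate the out-of-band method, here is a minimal Python sketch of
v3-style framing: each CopyData message is the byte 'd' followed by an
Int32 length and the payload, and CopyDone is the byte 'c' with a length
of 4. The function names frame_copy_stream and read_copy_stream are
hypothetical, made up for this example; this is not libpq or server code.

```python
import struct


def frame_copy_stream(chunks):
    """Encode chunks as CopyData messages followed by one CopyDone."""
    out = bytearray()
    for chunk in chunks:
        # 'd' + Int32 length (counts its own 4 bytes) + payload
        out += b"d" + struct.pack("!I", len(chunk) + 4) + chunk
    # CopyDone: 'c' + Int32 length of 4, no payload
    out += b"c" + struct.pack("!I", 4)
    return bytes(out)


def read_copy_stream(buf):
    """Return the concatenated payload of all CopyData up to CopyDone."""
    data = bytearray()
    pos = 0
    while pos < len(buf):
        kind = buf[pos:pos + 1]
        (length,) = struct.unpack("!I", buf[pos + 1:pos + 5])
        if kind == b"c":
            # End of data is signalled by the message type, not the payload.
            return bytes(data)
        if kind != b"d":
            raise ValueError("unexpected message type %r" % kind)
        data += buf[pos + 5:pos + 1 + length]
        pos += 1 + length
    raise ValueError("stream ended without CopyDone")


# A payload that contains a line consisting only of backslash-dot.
payload = b"a,b\n\\.\nc,d\n"
roundtrip = read_copy_stream(frame_copy_stream([payload]))
```

Since the end of the stream is carried by the message type, a \. line
inside the payload does not need any special meaning in this model.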

The v2 protocol was removed in PostgreSQL 14:
https://www.postgresql.org/docs/release/14.0/
<quote>
  Remove server and libpq support for the version 2 wire protocol (Heikki
Linnakangas)
  This was last used as the default in PostgreSQL 7.3 (released in 2002).
</quote>

Also, I hadn't noticed this before, but the current documentation has
this passage, which is relevant to this patch:

https://www.postgresql.org/docs/current/protocol-changes.html
"Summary of Changes since Protocol 2.0"
<quote>
  COPY data is now encapsulated into CopyData and CopyDone
  messages. There is a well-defined way to recover from errors during
  COPY. The special “\.” last line is not needed anymore, and is not
  sent during COPY OUT. (It is still recognized as a terminator during
  COPY IN, but its use is deprecated and will eventually be removed.)
</quote>

Essentially, what the present patch does on the server side is
stop recognizing "\." as a terminator, as this paragraph says, but
it does so for CSV only, not for TEXT.
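
As a rough illustration of that rule, here is a toy Python model. The
helpers is_end_of_data and copy_lines are hypothetical names invented
for this sketch; the real logic lives in CopyReadLineText() and differs
in detail.

```python
def is_end_of_data(line, fmt):
    """Toy rule: does this input line terminate the COPY data?"""
    if fmt == "csv":
        # With the patch, backslash-dot is never special in CSV.
        return False
    if fmt == "text":
        # In TEXT, a line consisting only of backslash-dot still ends the data.
        return line.rstrip("\r\n") == "\\."
    raise ValueError("unknown format: %r" % fmt)


def copy_lines(lines, fmt):
    """Return the lines that would be treated as data for this format."""
    kept = []
    for line in lines:
        if is_end_of_data(line, fmt):
            break
        kept.append(line)
    return kept


lines = ["a\n", "\\.\n", "b\n"]
text_result = copy_lines(lines, "text")
csv_result = copy_lines(lines, "csv")
```

In this model the TEXT reader stops at the second line, while the CSV
reader keeps all three lines as data.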

> Hmm. Looking at the rest of the patch, it seems like you're removing
> the logic that prevents us from interpreting
> 
> \. lksdghksdhgjskdghjs
> 
> as an end-of-file while in CSV mode. But I would have thought based on
> what problem you're trying to fix that you would have wanted to keep
> that logic and further restrict it so that it only applies when not
> within a quoted string.
> 
> Maybe I'm misunderstanding what bug you're trying to fix?

The fix is that \. is no longer recognized as special in CSV, whether
alone on a line or not, and whether in a quoted section or not.
It's always interpreted as data, as it would have been in
the first place, I imagine, if the v2 protocol could have handled
it. This is why the patch consists mostly of removing code and
simplifying comments.
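
For comparison, here is how a generic CSV reader treats such input.
This sketch uses Python's standard csv module as a stand-in; it is not
PostgreSQL's parser, and the sample rows are made up for illustration.

```python
import csv
import io

# Backslash-dot appears on its own line inside a quoted field,
# and again as an unquoted field value.
raw = 'id,note\n1,"first line\n\\.\nlast line"\n2,\\.\n'

rows = list(csv.reader(io.StringIO(raw, newline="")))

for row in rows:
    print(row)
```

With the module's default dialect the backslash has no special meaning,
so both occurrences of \. come back as ordinary field content.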


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
