On Sat, May 20, 2023 at 09:16:30AM +0200, Joel Jacobson wrote:
> On Fri, May 19, 2023, at 18:06, Daniel Verite wrote:
> > COPY FROM file CSV somewhat differs as your example shows,
> > but it still mishandles \. when unquoted. For instance, consider this
> > file to load with COPY t FROM
Kirk Wolak wrote:
> We do NOT do "CSV", we mimic pg_dump.
pg_dump uses the text format (as opposed to csv), where
\. on a line by itself cannot appear in the data, so there's
no problem. The problem is limited to the csv format.
Best regards,
--
Daniel Vérité
On Mon, May 22, 2023 at 12:13 PM Daniel Verite wrote:
> Joel Jacobson wrote:
>
> > Is there a valid reason why \. is needed for COPY FROM filename?
> > It seems to me it would only be necessary for the COPY FROM STDIN case,
> > since files have a natural end-of-file and a known file
Joel Jacobson wrote:
> Is there a valid reason why \. is needed for COPY FROM filename?
> It seems to me it would only be necessary for the COPY FROM STDIN case,
> since files have a natural end-of-file and a known file size.
Looking at CopyReadLineText() over at [1], I don't see a
On Fri, May 19, 2023, at 18:06, Daniel Verite wrote:
> COPY FROM file CSV somewhat differs as your example shows,
> but it still mishandles \. when unquoted. For instance, consider this
> file to load with COPY t FROM '/tmp/t.csv' WITH CSV
> $ cat /tmp/t.csv
> line 1
> \.
> line 3
> line 4
>
Joel Jacobson wrote:
> I understand its necessity for STDIN, given that the end of input needs to
> be explicitly defined.
> However, for files, we have a known file size and the end-of-file can be
> detected without the need for special markers.
>
> Also, is the difference in how
On Thu, May 18, 2023, at 18:48, Daniel Verite wrote:
> Joel Jacobson wrote:
>> OTOH, one would then need to inspect the TSV file doesn't contain \. on an
>> empty line...
>
> Note that this is the case for valid CSV contents, since backslash-dot
> on a line by itself is both an end-of-data marker
Joel Jacobson wrote:
> I've been using that trick myself many times in the past, but thanks to this
> deep-dive into this topic, it looks to me like TEXT would be a better format
> fit when dealing with unquoted TSV files, or?
>
> OTOH, one would then need to inspect the TSV file doesn't
On 2023-05-18 Th 02:19, Joel Jacobson wrote:
On Thu, May 18, 2023, at 08:00, Joel Jacobson wrote:
> 1. How about adding a `WITHOUT QUOTE` or `QUOTE NONE` option in conjunction
> with `COPY ... WITH CSV`?
More ideas:
[ QUOTE 'quote_character' | UNQUOTED ]
or
[ QUOTE 'quote_character' |
On Thu, May 18, 2023, at 08:35, Pavel Stehule wrote:
> Maybe there is another third implementation in Libre Office.
>
> Generally TSV is not well specified, and then the implementations are not
> consistent.
Thanks Pavel, that was a very interesting case indeed:
Libre Office (tested on Mac)
On Thu, May 18, 2023 at 8:01, Joel Jacobson wrote:
> On Thu, May 18, 2023, at 00:18, Kirk Wolak wrote:
> > Here you go. Not horrible handling. (I use DataGrip so I saved it from there
> > directly as TSV, just for an extra datapoint).
> >
> > FWIW, if you copy/paste in windows, the data,
On Thu, May 18, 2023, at 08:00, Joel Jacobson wrote:
> 1. How about adding a `WITHOUT QUOTE` or `QUOTE NONE` option in conjunction
> with `COPY ... WITH CSV`?
More ideas:
[ QUOTE 'quote_character' | UNQUOTED ]
or
[ QUOTE 'quote_character' | NO_QUOTE ]
Thinking about it, I recall another hack;
On Thu, May 18, 2023, at 00:18, Kirk Wolak wrote:
> Here you go. Not horrible handling. (I use DataGrip so I saved it from there
> directly as TSV, just for an extra datapoint).
>
> FWIW, if you copy/paste in windows, the data, the field with the tab gets
> split into another column in Excel. But
On Wed, May 17, 2023 at 5:47 PM Joel Jacobson wrote:
> On Wed, May 17, 2023, at 19:42, Andrew Dunstan wrote:
> > You can use CSV mode pretty reliably for TSV files. The trick is to use a
> > quoting char that shouldn't appear, such as E'\x01' as well as setting the
> > delimiter to E'\t'. Yes,
On Wed, May 17, 2023, at 19:42, Andrew Dunstan wrote:
> You can use CSV mode pretty reliably for TSV files. The trick is to use a
> quoting char that shouldn't appear, such as E'\x01' as well as setting the
> delimiter to E'\t'. Yes, it's far from obvious.
I've been using that trick myself many
On 2023-05-16 Tu 13:15, Joel Jacobson wrote:
On Tue, May 16, 2023, at 13:43, Joel Jacobson wrote:
>If we made midfield quoting a CSV error, those users who are currently mistaken
>about their TSV/TEXT files being CSV while also having balanced quotes in their
>data, would encounter an error
On Tue, May 16, 2023, at 13:43, Joel Jacobson wrote:
>If we made midfield quoting a CSV error, those users who are currently mistaken
>about their TSV/TEXT files being CSV while also having balanced quotes in their
>data, would encounter an error rather than a silent failure, which I believe
On Sun, May 14, 2023, at 16:58, Andrew Dunstan wrote:
> And if people do follow the method you describe then their input with
> unescaped quotes will be rejected 999 times out of 1000. It's only cases where
> the field happens to have an even number of embedded quotes, like Joel's
> somewhat
On 2023-05-13 Sa 23:11, Greg Stark wrote:
On Sat, 13 May 2023 at 09:46, Tom Lane wrote:
Andrew Dunstan writes:
I could see an argument for a STRICT mode which would disallow partially
quoted fields, although I'd like some evidence that we're dealing with a
real problem here. Is there really
On Sat, 13 May 2023 at 09:46, Tom Lane wrote:
>
> Andrew Dunstan writes:
> > I could see an argument for a STRICT mode which would disallow partially
> > quoted fields, although I'd like some evidence that we're dealing with a
> > real problem here. Is there really a CSV producer that produces
Andrew Dunstan writes:
> I could see an argument for a STRICT mode which would disallow partially
> quoted fields, although I'd like some evidence that we're dealing with a
> real problem here. Is there really a CSV producer that produces output
> like that you showed in your example? And if
On 2023-05-13 Sa 04:20, Joel Jacobson wrote:
On Fri, May 12, 2023, at 21:57, Andrew Dunstan wrote:
Maybe this is unexpected by you, but it's not by me. What other sane
interpretation of that data could there be? And what CSV producer
outputs such horrible content? As you've noted, ours
On Fri, May 12, 2023, at 21:57, Andrew Dunstan wrote:
> Maybe this is unexpected by you, but it's not by me. What other sane
> interpretation of that data could there be? And what CSV producer outputs
> such horrible content? As you've noted, ours certainly does not. Our rules
> are clear:
On 2023-05-11 Th 10:03, Joel Jacobson wrote:
Hi hackers,
I've come across an unexpected behavior in our CSV parser that I'd like to
bring up for discussion.
% cat example.csv
id,rating,review
1,5,"Great product, will buy again."
2,3,"I bought this for my 6" laptop but it didn't fit my 8"
On Thu, 11 May 2023 at 10:04, Joel Jacobson wrote:
>
> The parser currently accepts quoting within an unquoted field. This can lead
> to data misinterpretation when the quote is part of the field data (e.g.,
> for inches, like in the example).
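
To make the quoted point concrete, here is a toy model of a field splitter that toggles its quoted state on every quote character, including ones that appear in the middle of an unquoted field. It is only a sketch of the behavior described above, not PostgreSQL's actual parser; the function name split_csv_line and the sample lines are made up for illustration, and doubled-quote escaping ("") is deliberately left out.

```python
def split_csv_line(line, delimiter=",", quote='"'):
    """Toy splitter (hypothetical, not PostgreSQL code).

    Every quote character flips the in-quote state, even mid-field,
    and the quote character itself is dropped from the data.
    Returns (fields, still_in_quote).
    """
    fields, current, in_quote = [], [], False
    for ch in line:
        if ch == quote:
            in_quote = not in_quote          # consumed, never stored
        elif ch == delimiter and not in_quote:
            fields.append("".join(current))  # delimiter ends the field
            current = []
        else:
            current.append(ch)               # includes delimiters inside quotes
    fields.append("".join(current))
    return fields, in_quote


# Made-up line with two inch marks: the quotes balance, so nothing fails,
# but two fields are merged and both inch marks disappear.
print(split_csv_line('3,6" laptop, 8" sleeve,ok'))
# (['3', '6 laptop, 8 sleeve', 'ok'], False)

# With a single inch mark the quote never closes; a real loader would
# report an unterminated quoted field here instead of loading bad data.
print(split_csv_line('3,6" laptop,ok'))
# (['3', '6 laptop,ok'], True)
```

Under this model only the balanced case goes unnoticed, which lines up with the remark elsewhere in the thread that input with unescaped quotes is usually rejected unless the field happens to contain an even number of them.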
I think you're thinking about it differently than
On Thu, May 11, 2023 at 16:04, Joel Jacobson wrote:
> Hi hackers,
>
> I've come across an unexpected behavior in our CSV parser that I'd like to
> bring up for discussion.
>
> % cat example.csv
> id,rating,review
> 1,5,"Great product, will buy again."
> 2,3,"I bought this for my 6" laptop