Patch applied. Thanks.
---
Andrew Dunstan wrote:
I wrote:
If it bothers you that much, I'd make a flag, cleared at the start of
each COPY, and then where we test for CR or LF in CopyAttributeOutCSV,
if
I wrote:
If it bothers you that much, I'd make a flag, cleared at the start of
each COPY, and then where we test for CR or LF in CopyAttributeOutCSV,
if the flag is not set then set it and issue the warning.
I didn't realise until Bruce told me just now that I was on the hook for
this. I
Andrew Dunstan [EMAIL PROTECTED] writes:
+ if (!embedded_line_warning && (c == '\n' || c == '\r'))
+ {
+ embedded_line_warning = true;
+ elog(WARNING,
+ "CSV fields with embedded linefeed or carriage
Tom Lane wrote:
Andrew Dunstan [EMAIL PROTECTED] writes:
+ if (!embedded_line_warning && (c == '\n' || c == '\r'))
+ {
+ embedded_line_warning = true;
+ elog(WARNING,
+ "CSV fields with embedded linefeed or carriage return "
+ "characters might not be able to be reimported");
+
On Tue, 30 Nov 2004, Greg Stark wrote:
Andrew Dunstan [EMAIL PROTECTED] writes:
The advantage of having it in COPY is that it can be done serverside
direct from the file system. For massive bulk loads that might be a
plus, although I don't know what the protocol+socket overhead is.
Greg Stark wrote:
Personally I find the current CSV support inadequate. It seems pointless to
support CSV if it can't load data exported from Excel, which seems like the
main use case.
OK, I'm starting to get mildly annoyed now. We have identified one
failure case connected with multiline
Greg Stark wrote:
Personally I find the current CSV support inadequate. It seems
pointless to
support CSV if it can't load data exported from Excel, which seems like
the
main use case.
OK, I'm starting to get mildly annoyed now. We have identified one
failure case connected
[EMAIL PROTECTED] wrote:
I am normally more of a lurker on these lists, but I thought you had
better know
that when we developed CSV import/export for an application at my last
company
we discovered that Excel can't always even read the CSV that _it_ has
output!
(With embedded newlines a
Andrew Dunstan [EMAIL PROTECTED] writes:
FWIW, I don't make a habit of using multiline fields in my spreadsheets - and
some users I have spoken to aren't even aware that you can have them at all.
Unfortunately I don't get a choice. I offer a field on the web site where
users can upload an
Andrew Dunstan wrote:
Greg Stark wrote:
Personally I find the current CSV support inadequate. It seems pointless to
support CSV if it can't load data exported from Excel, which seems like the
main use case.
OK, I'm starting to get mildly annoyed now. We have identified one
Bruce Momjian wrote:
I am wondering if one good solution would be to pre-process the input
stream in copy.c to convert newline to \n and carriage return to \r and
double data backslashes and tell copy.c to interpret those like it does
for normal text COPY files. That way, the changes to copy.c
Andrew Dunstan wrote:
Bruce Momjian wrote:
I am wondering if one good solution would be to pre-process the input
stream in copy.c to convert newline to \n and carriage return to \r and
double data backslashes and tell copy.c to interpret those like it does
for normal text COPY files.
Bruce Momjian [EMAIL PROTECTED] writes:
Tom Lane wrote:
Which we do not have, because pg_dump doesn't use CSV. I do not think
this is a must-fix, especially not if the proposed fix introduces
inconsistencies elsewhere.
Sure, pg_dump doesn't use it but COPY should be able to load anything it
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
Tom Lane wrote:
Which we do not have, because pg_dump doesn't use CSV. I do not think
this is a must-fix, especially not if the proposed fix introduces
inconsistencies elsewhere.
Sure, pg_dump doesn't use it but COPY should be
Bruce Momjian wrote:
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
Tom Lane wrote:
Which we do not have, because pg_dump doesn't use CSV. I do not think
this is a must-fix, especially not if the proposed fix introduces
inconsistencies elsewhere.
Sure, pg_dump
Andrew Dunstan wrote:
OK, then should we disallow dumping out data in CSV format that we can't
load? Seems like the least we should do for 8.0.
As Tom rightly points out, having data make the round trip was not the
goal of the exercise. Excel, for example, has no trouble reading
Bruce Momjian wrote:
Andrew Dunstan wrote:
OK, then should we disallow dumping out data in CSV format that we can't
load? Seems like the least we should do for 8.0.
As Tom rightly points out, having data make the round trip was not the
goal of the exercise. Excel, for
Bruce Momjian wrote:
Andrew Dunstan wrote:
OK, then should we disallow dumping out data in CSV format that we can't
load? Seems like the least we should do for 8.0.
As Tom rightly points out, having data make the round trip was not the
goal of the exercise. Excel, for example, has no
Bruce Momjian wrote:
Also, can you explain why we can't read across a newline to the next
quote? Is it a problem with the way our code is structured or is it a
logical problem? Someone mentioned multibyte encodings but I don't
understand how that applies here.
In a CSV file, each line is a
Bruce Momjian [EMAIL PROTECTED] writes:
Also, can you explain why we can't read across a newline to the next
quote? Is it a problem with the way our code is structured or is it a
logical problem?
It's a structural issue in the sense that we separate the act of
dividing the input into rows
On Mon, 29 Nov 2004, Andrew Dunstan wrote:
Longer term I'd like to be able to have a command parameter that
specifies certain fields as multiline and for those relax the line end
matching restriction (and for others forbid multiline altogether). That
would be a TODO for 8.1 though, along
Kris Jurka wrote:
On Mon, 29 Nov 2004, Andrew Dunstan wrote:
Longer term I'd like to be able to have a command parameter that
specifies certain fields as multiline and for those relax the line end
matching restriction (and for others forbid multiline altogether). That
would be a
Kris Jurka [EMAIL PROTECTED] writes:
Endlessly extending the COPY command doesn't seem like a winning
proposition to me and I think if we aren't comfortable telling every user
to write a script to pre/post-process the data we should instead provide a
bulk loader/unloader that transforms
Tom Lane wrote:
Kris Jurka [EMAIL PROTECTED] writes:
Endlessly extending the COPY command doesn't seem like a winning
proposition to me and I think if we aren't comfortable telling every user
to write a script to pre/post-process the data we should instead provide a
bulk
Tom Lane wrote:
Kris Jurka [EMAIL PROTECTED] writes:
Endlessly extending the COPY command doesn't seem like a winning
proposition to me and I think if we aren't comfortable telling every user
to write a script to pre/post-process the data we should instead provide a
bulk loader/unloader
Andrew Dunstan [EMAIL PROTECTED] writes:
The advantage of having it in COPY is that it can be done serverside direct
from the file system. For massive bulk loads that might be a plus, although I
don't know what the protocol+socket overhead is.
Actually even if you use client-side COPY it's
OK, what solutions do we have for this? Not being able to load dumped
data is a serious bug. I have added this to the open items list:
* fix COPY CSV with \r,\n in data
My feeling is that if we are in a quoted string we just process whatever
characters we find, even passing through an
Bruce Momjian [EMAIL PROTECTED] writes:
OK, what solutions do we have for this? Not being able to load dumped
data is a serious bug.
Which we do not have, because pg_dump doesn't use CSV. I do not think
this is a must-fix, especially not if the proposed fix introduces
inconsistencies
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
OK, what solutions do we have for this? Not being able to load dumped
data is a serious bug.
Which we do not have, because pg_dump doesn't use CSV. I do not think
this is a must-fix, especially not if the proposed fix introduces
Bruce Momjian said:
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
OK, what solutions do we have for this? Not being able to load
dumped data is a serious bug.
Which we do not have, because pg_dump doesn't use CSV. I do not think
this is a must-fix, especially not if the
This example should fail on data line 2 or 3 on any platform,
regardless of the platform's line-end convention, although I haven't
tested on Windows.
cheers
andrew
[EMAIL PROTECTED] inst]$ bin/psql -e -f csverr.sql ; od -c
/tmp/csverrtest.csv
create table csverrtest (a int, b text, c int);
On Nov 12, 2004, at 12:20 AM, Tom Lane wrote:
Patrick B Kelly [EMAIL PROTECTED] writes:
I may not be explaining myself well or I may fundamentally
misunderstand how copy works.
Well, you're definitely ignoring the character-set-conversion issue.
I was not trying to ignore the character set and
On Nov 10, 2004, at 6:10 PM, Andrew Dunstan wrote:
The last really isn't an option, because the whole point of CSVs is to
play with other programs, and my understanding is that those that
understand multiline fields (e.g. Excel) expect them not to be
escaped, and do not produce them escaped.
Patrick B Kelly wrote:
On Nov 10, 2004, at 6:10 PM, Andrew Dunstan wrote:
The last really isn't an option, because the whole point of CSVs is
to play with other programs, and my understanding is that those that
understand multiline fields (e.g. Excel) expect them not to be
escaped, and do not
Andrew Dunstan [EMAIL PROTECTED] writes:
Patrick B Kelly wrote:
Actually, when I try to export a sheet with multi-line cells from
excel, it tells me that this feature is incompatible with the CSV
format and will not include them in the CSV file.
It probably depends on the version. I have
Tom Lane wrote:
Andrew Dunstan [EMAIL PROTECTED] writes:
Patrick B Kelly wrote:
Actually, when I try to export a sheet with multi-line cells from
excel, it tells me that this feature is incompatible with the CSV
format and will not include them in the CSV file.
It probably
Tom Lane [EMAIL PROTECTED] writes:
I would vote in favor of removing the current code that attempts to
support unquoted newlines, and waiting to see if there are complaints.
Uhm. *raises hand*
I agree with your argument but one way or another I have to load these CSVs
I'm given. And like it
On Thu, Nov 11, 2004 at 03:38:16PM -0500, Greg Stark wrote:
Tom Lane [EMAIL PROTECTED] writes:
I would vote in favor of removing the current code that attempts
to support unquoted newlines, and waiting to see if there are
complaints.
Uhm. *raises hand*
I agree with your argument
On Nov 11, 2004, at 2:56 PM, Andrew Dunstan wrote:
Tom Lane wrote:
Andrew Dunstan [EMAIL PROTECTED] writes:
Patrick B Kelly wrote:
Actually, when I try to export a sheet with multi-line cells from
excel, it tells me that this feature is incompatible with the CSV
format and will not include them
Patrick B Kelly wrote:
What about just coding a FSM into
backend/commands/copy.c:CopyReadLine() that does not process any
flavor of NL characters when it is inside of a data field?
It would be a major change - the routine doesn't read data a field at a
time, and has no idea if we are even
Patrick B Kelly [EMAIL PROTECTED] writes:
What about just coding a FSM into
backend/commands/copy.c:CopyReadLine() that does not process any flavor
of NL characters when it is inside of a data field?
CopyReadLine has no business tracking that. One reason why not is that
it is dealing with
On Nov 11, 2004, at 6:16 PM, Tom Lane wrote:
Patrick B Kelly [EMAIL PROTECTED] writes:
What about just coding a FSM into
backend/commands/copy.c:CopyReadLine() that does not process any
flavor
of NL characters when it is inside of a data field?
CopyReadLine has no business tracking that. One
Patrick B Kelly wrote:
My suggestion is to simply have CopyReadLine recognize these two
states (in-field and out-of-field) and execute the current logic only
while in the second state. It would not be too hard but as you
mentioned it is non-trivial.
We don't know what state we expect the
On Nov 11, 2004, at 10:07 PM, Andrew Dunstan wrote:
Patrick B Kelly wrote:
My suggestion is to simply have CopyReadLine recognize these two
states (in-field and out-of-field) and execute the current logic only
while in the second state. It would not be too hard but as you
mentioned it is
Can I see an example of such a failure line?
---
Andrew Dunstan wrote:
Darcy Buskermolen has drawn my attention to unfortunate behaviour of
COPY CSV with fields containing embedded line end chars if the embedded
Patrick B Kelly [EMAIL PROTECTED] writes:
I may not be explaining myself well or I may fundamentally
misunderstand how copy works.
Well, you're definitely ignoring the character-set-conversion issue.
regards, tom lane
Darcy Buskermolen has drawn my attention to unfortunate behaviour of
COPY CSV with fields containing embedded line end chars if the embedded
sequence isn't the same as those of the file containing the CSV data. In
that case we error out when reading the data in. This means there are
cases