Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
Still applies; just badly worded. On Tue, May 21, 2019 at 8:47 AM Dominique Devienne wrote: > On Tue, May 21, 2019 at 5:36 PM Shawn Wagner > wrote: > > > I have a handy script that can handle that sort of input with extra > spaces > > (With the --strip option), and other stuff that csv .import doesn't > always > > deal well with: > > > > https://github.com/shawnw/useful_sqlite_extensions/tree/master/tools > > > Doc says from *unquoted fields*. > Here it's about a space between the comma and the leading double-quote of a > *quoted field*. > So still applies? --DD > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
On Tuesday, 21 May, 2019 07:55, Richard Hipp wrote: >On 5/21/19, Patrick Sherrill wrote: >> I don’t know about ‘valid’ csv, it has been a moving target for >>decades. >> White space as far as my recollection should not be considered in >>parsing a csv. >I'm going by RFC 4180. https://tools.ietf.org/html/rfc4180. On page >2 it says: "Spaces are considered part of a field and should not be >ignored." and also the RFC also says in section 5 Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes at all). If fields are not enclosed with double quotes, then double quotes may not appear inside the fields. So, since white space is not ignored and the first character of the field is not a quote (it is a space), then the quote inside the field is an error and should presumably toss some sort of a parse exception ... --- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
On Tue, May 21, 2019 at 5:36 PM Shawn Wagner wrote: > I have a handy script that can handle that sort of input with extra spaces > (With the --strip option), and other stuff that csv .import doesn't always > deal well with: > > https://github.com/shawnw/useful_sqlite_extensions/tree/master/tools Doc says from *unquoted fields*. Here it's about a space between the comma and the leading double-quote of a *quoted field*. So still applies? --DD ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
I have a handy script that can handle that sort of input with extra spaces (With the --strip option), and other stuff that csv .import doesn't always deal well with: https://github.com/shawnw/useful_sqlite_extensions/tree/master/tools On Tue, May 21, 2019 at 6:14 AM Faheem Mitha wrote: > > Hi, > > I'm seeing the same bug reported here, in an issue from 2009, supposedly > fixed in 2014. > > https://www.sqlite.org/src/tktview?name=c25aab7e7e with the title: > The ".import" command does not parse CSV correctly. > > I'm using Debian buster, with SQLite version 3.27.2-2. > > Here is a simple reproduction recipe. Consider the following simple CSV > file. > > ### > comma.csv > ### > somestuff, "some,stuff" > > Then one just needs to do > > sqlite3 comma.db 'create table comma(foo, bar);' '.mode csv' '.import > comma.csv comma' '.exit' > > Which gives the error message: > > expected 2 columns but found 3 - extras ignored. > > As mentioned in that ticket, this is per the > https://tools.ietf.org/html/rfc4180 standard. > > Pandas handles this case fine. I've not checked R. > > Should I attempt to reopen that ticket? > > Regards, Faheem > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
On 5/21/19, Patrick Sherrill wrote: > I don’t know about ‘valid’ csv, it has been a moving target for decades. > White space as far as my recollection should not be considered in parsing a > csv. I'm going by RFC 4180. https://tools.ietf.org/html/rfc4180. On page 2 it says: "Spaces are considered part of a field and should not be ignored." -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
Patrick Sherrill, on Tuesday, May 21, 2019 09:38 AM, wrote... >I don’t know about ‘valid’ csv, it has been a moving target for decades. White >space as far as my recollection should not be considered in parsing a csv. In wikipedia [1], under Specification, #3, it states, "with the records divided into fields separated by delimiters (typically a single reserved character such as comma, semicolon, or tab; sometimes the delimiter may include optional spaces)," I don't know how the RFC agrees with wikipedia, but, it looks like spaces should be ignore when importing. However, the RFC [2], section 4 says, "... Spaces are considered part of a field and should not be ignored... " So, who is right? [1] https://en.wikipedia.org/wiki/Comma-separated_values [2] https://tools.ietf.org/html/rfc4180#section-2 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
On 21 May 2019, at 2:38pm, Patrick Sherrill wrote: > I don’t know about ‘valid’ csv, it has been a moving target for decades. > White space as far as my recollection should not be considered in parsing a > csv. You can do it, but you have to be consistent. There is a problem with this specific line: somestuff, "some,stuff" CSV files are expected to come in two different format. One is to have double-quotes around all text values: "somestuff", "some,stuff" In this form, the comma is just, and obviously, a separator. Blank space around separators is not a problem and parsers can identify it and ignore it as required. The other commonly-found CSV format is this: somestuff, otherstuff In this format, strings are not indicated with the use of double quotes. In this case the separator is a comma. Space characters around it are not blank space, they're part of the value. Because if they aren't there's no way to supply a string that begins or ends with a space. So the second value parsed from this line starts with a space character. Given those two possibilities, How would you expect a parser to figure out this line ? somestuff, "some,stuff" It's neither one format nor the other. Parsers might handle this however they want. It is permitted for a parser to look at the first character, decide that the line starts with a string with no delimiter, and identify the three strings on that line as somestuff "some <-- this value begins with stuff" <-- this value ends with If you need include commas in your strings, delimit all your strings with double quotes and the problem will go away. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
I don’t know about ‘valid’ csv, it has been a moving target for decades. White space as far as my recollection should not be considered in parsing a csv. My 2 cents. Pat... Sent from my iPhone > On May 21, 2019, at 9:28 AM, Richard Hipp wrote: > >> On 5/21/19, Faheem Mitha wrote: >> The ".import" command does not parse CSV correctly. >> somestuff, "some,stuff" > > This is not valid CSV. There is an extra space character after the > comma and before the double-quote. > > -- > D. Richard Hipp > d...@sqlite.org > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CSV import does not handle fields with a comma surrounded by double
On 5/21/19, Faheem Mitha wrote: > The ".import" command does not parse CSV correctly. > somestuff, "some,stuff" This is not valid CSV. There is an extra space character after the comma and before the double-quote. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] CSV import does not handle fields with a comma surrounded by double
Hi, I'm seeing the same bug reported here, in an issue from 2009, supposedly fixed in 2014. https://www.sqlite.org/src/tktview?name=c25aab7e7e with the title: The ".import" command does not parse CSV correctly. I'm using Debian buster, with SQLite version 3.27.2-2. Here is a simple reproduction recipe. Consider the following simple CSV file. ### comma.csv ### somestuff, "some,stuff" Then one just needs to do sqlite3 comma.db 'create table comma(foo, bar);' '.mode csv' '.import comma.csv comma' '.exit' Which gives the error message: expected 2 columns but found 3 - extras ignored. As mentioned in that ticket, this is per the https://tools.ietf.org/html/rfc4180 standard. Pandas handles this case fine. I've not checked R. Should I attempt to reopen that ticket? Regards, Faheem ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users