On 04.02.20 15:42, Simon Slavin wrote:
> On 4 Feb 2020, at 12:18pm, Robert M. Münch <robert.mue...@saphirion.com> wrote:
>
>> - sep=';': field separator character (different from default ',')
>
> If you provide this facility, please don't add it to anything called 'csv'
> since the 'c' stands for 'comma'.
>
> For those playing along at home, csv files using semi-colon are a result of a
> bug in Excel. Windows has a setting for a 'list separator'. The two most
> usual values are ',' and ';'. The CSV export filter in Excel takes its
> separator from this field rather than always using a comma, because it was
> written by someone who wasn't aware of, didn't understand, or was intentionally
> trying to disrupt the standard. Decades after being told about the bug,
> Microsoft hasn't fixed it.
>
> There are a couple of other errors in Excel's CSV filters including how strings
> are quoted and how a blank value differs from a zero-length string. The best
> way I've seen to handle this was to add a new filter to your software, similar
> to 'csv', called something like 'exceltext' which did things the Excel way.
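To make the 'exceltext' idea concrete, here is a minimal sketch using Python's standard csv module. It only covers the semicolon separator; the dialect name 'exceltext' is borrowed from the suggestion above, the helper parse_exceltext is made up for illustration, and nothing here reproduces Excel's other quirks (blank versus zero-length strings, for example).

```python
import csv
import io

# Hypothetical dialect: like ordinary CSV, but with ';' as the separator.
# This is an assumption about what an 'exceltext' filter might look like,
# not a description of Excel's actual export rules.
csv.register_dialect(
    "exceltext",
    delimiter=";",
    quotechar='"',
    doublequote=True,        # a literal quote inside a field is written as ""
    lineterminator="\r\n",
    quoting=csv.QUOTE_MINIMAL,
)

def parse_exceltext(text):
    """Parse semicolon-separated text into a list of rows."""
    # newline="" leaves line endings untouched so the csv reader sees them.
    return list(csv.reader(io.StringIO(text, newline=""), dialect="exceltext"))

sample = 'name;note\r\nBob;"a;b ""q"""\r\n'
rows = parse_exceltext(sample)
```

Keeping this as a separate, named dialect leaves the plain comma-separated 'csv' path untouched, which is the point of the suggestion.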
Believe it or not, there is no binding standard for the CSV format. The
closest anyone has come is RFC 4180, which is informational only and
says as much itself.
In section 2:
"While there are various specifications and implementations for the
CSV format (for ex. [4], [5], [6] and [7]), there is no formal
specification in existence, which allows for a wide variety of
interpretations of CSV files."
https://tools.ietf.org/html/rfc4180#section-2
In section 3, under "Interoperability considerations":
"Due to lack of a single specification, there are considerable
differences among implementations. Implementors should "be
conservative in what you do, be liberal in what you accept from
others" (RFC 793 [8]) when processing CSV files."
https://tools.ietf.org/html/rfc4180#section-3
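One way to be "liberal in what you accept" is to guess the separator before parsing. The following is only a sketch in Python: guess_delimiter and read_rows are names I made up, the candidate list is an assumption, and counting separators in the first line is a simple heuristic that can be fooled by unusual files.

```python
import csv
import io

def guess_delimiter(header_line, candidates=(",", ";", "\t")):
    """Return the candidate that occurs most often outside double quotes.

    Ties (including "no candidate found") fall back to the first
    candidate, which is the comma here.
    """
    counts = dict.fromkeys(candidates, 0)
    in_quotes = False
    for ch in header_line:
        if ch == '"':
            # An escaped quote ("") toggles twice, so the state stays correct.
            in_quotes = not in_quotes
        elif not in_quotes and ch in counts:
            counts[ch] += 1
    return max(candidates, key=lambda c: counts[c])

def read_rows(text):
    """Parse text using the separator guessed from its first line."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    delimiter = guess_delimiter(first_line)
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

semicolon_rows = read_rows("a;b;c\n1,5;2;3\n")
comma_rows = read_rows('a,b\n"x;y",2\n')
```

On output the conservative half of the advice still applies: write one well-defined dialect and do not guess.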
That being said, the problem with trying to enforce the comma as the
sole delimiter character is that a large part of the non-English-speaking
world uses the comma as the decimal separator. The "work-around" for
that, of course, would be to enclose every field that contains a comma
(or simply all fields) in double quote characters. But, as we know, the
800-pound gorilla in the room doesn't necessarily do that...
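A small Python sketch of the decimal-comma problem, using only the standard csv module. The sample data and the helper names dump and load are invented for illustration.

```python
import csv
import io

# One numeric field written with a decimal comma.
rows = [["item", "price"], ["widget", "3,14"]]

def dump(rows, **fmt):
    """Serialize rows with csv.writer using the given format options."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **fmt).writerows(rows)
    return buf.getvalue()

def load(text, delimiter):
    """Parse text back into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# Naive comma join without quoting: the decimal comma splits the field.
naive = "\n".join(",".join(r) for r in rows) + "\n"
broken = load(naive, ",")          # second row has three fields

# Work-around 1: keep the comma delimiter but quote every field.
quoted = dump(rows, delimiter=",", quoting=csv.QUOTE_ALL)

# Work-around 2: switch the delimiter to ';' so the comma is ordinary data.
semicolon = dump(rows, delimiter=";")
```

Both work-arounds round-trip the data; only the unquoted comma version loses the field boundary.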
I agree that a configurable separator would be a very good option to
have. In the meantime, check out libcsv on GitHub:
https://github.com/rgamble/libcsv
It adheres as closely as it can to what standards there are, and you can
choose your own delimiter and quote character if you like. Of course,
you have to do some programming to use it, but it's really easy to use.
And it is very fast, since it does just one thing and does it well.
HTH,
Bob Hairgrove