Hello,

On 2017-06-23 22:12, Mahmoud Al-Qudsi wrote:
> I think you and I are on the same page here, Clemens? I abhor the BOM, but the question is whether or not SQLite will cater to the fact that the bigger names in the industry appear hell-bent on shoving it in users’ documents by default.
>
> Given that ‘.import’ and ‘.mode csv’ are “user mode” commands, perhaps leeway can be shown in breaking with standards for the sake of compatibility and sanity?
IMHO, this is not a good way to show leeway. The Unicode Standard has enough bad things in it already; there is no need to turn one of its good things into a bad one.

Should SQLite disregard one <EF BB BF> sequence, or all <EF BB BF> sequences, or at most 2, 3, or 10 of them at the beginning of a file? Such a stream can be produced by a chain of conversions done by a mix of conforming and ``breaking the standard for the sake of compatibility'' converters.

To be clear: I understand your point very well - ``let's ignore an optional BOM at the beginning'' - but I want to show that there is no limit to such thinking. Why just one optional BOM? You have not pointed out what the compatibility is with. The next step is to ignore N BOMs for the sake of compatibility with breaking the standard for the sake of compatibility with breaking the standard for the sake of... lim = \infty. I cannot see any sanity here.

The standard says: ``Only UTF-16/32 (not even UTF-16/32LE/BE) encoding forms can contain BOM''. Let's conform to this.

Certainly, there are no objections to extending the import functionality so that it ignores the initial 0xFEFF. However, in its basic form, an import should allow ZWNBSP as the first character in order to conform to the standard.

-- 
best regards

Cezary H. Noweta
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users