Re: [sqlite] UTF8-BOM not disregarded in CSV import

Cezary H. Noweta Sun, 25 Jun 2017 12:16:55 -0700

Hello,

On 2017-06-23 22:12, Mahmoud Al-Qudsi wrote:

I think you and I are on the same page here, Clemens? I abhor the
BOM, but the question is whether or not SQLite will cater to the fact
that the bigger names in the industry appear hell-bent on shoving it
in users’ documents by default.

Given that ‘.import’ and ‘.mode csv’ are “user mode” commands,
perhaps leeway can be shown in breaking with standards for the sake
of compatibility and sanity?


IMHO, this is not a good way to show a leeway. The Unicode Standard has
enough bad things in itself. It is not necessary to transform a good
Unicode's thing into a bad one.

Should SQLite disregard one <EF BB BF> sequence, or all <EF BB BF>
sequences, or at most 2, 3, 10 ones at the beginning of a file? Such
stream can be produced by a sequence of conversions done by a mix of
conforming and ``breaking the standard for the sake of compatibility''
converters.

To be clear: I understand your point very well - ``let's ignore optional
BOM at the beginning'', but I want to show that there is no limit in
such thinking. Why one optional? You have not pointed out what
compatibility with. The next step is to ignore N BOMs for the sake of
compatibility with breaking the standard for the sake of compatibility
with breaking the standard for the sake of... lim = \infty. I cannot see
any sanity here.

The standard says: ``Only UTF-16/32 (even not UTF-16/32LE/BE) encoding
forms can contain BOM''. Let's conform to this.

Certainly, there are no objections to extend an import's functionality
in such a way that it ignores the initial 0xFEFF. However, an import
should allow ZWNBSP as the first character, in its basic form, to be
conforming to the standard.

-- best regards

Cezary H. Noweta
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] UTF8-BOM not disregarded in CSV import

Reply via email to