On 18 April 2017 at 16:28, Hick Gunter <h...@scigames.at> wrote:

> Richard Hipp wrote:
> > I think the OP is referring to a problem that comes up because the field
> > width and precision of a printf() format are measured in bytes, not
> > characters, and if the input is multi-byte UTF then it is possible for a
> > single character to be cut in half, resulting in goofy output.
> >
> > I checked in a fix for this yesterday. See
> > https://www.sqlite.org/src/timeline?c=f508aff8
>
> Damn UTF8 ... my "favorite" error is a data source defined as having ISO
> encoding actually containing (2 byte) UTF8 characters, which, after calling
> iconv() ISO->UTF8, yield invalid codepoints (2 x 2 bytes) that produce
> goofy output on terminals and cause mysterious crashes within XML parsers,
> preferably called from within perl procedures that silently terminate
> without any indication of the cause.
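For anyone following along, the two failure modes quoted above can be reproduced in a few lines. This is only a rough sketch in Python, not the SQLite fix and not a real iconv() call: Python's built-in codecs stand in for iconv, and the helper names (truncate_bytes, truncate_chars, is_valid_utf8) are made up for illustration. The first part cuts a string at a byte count and at a character count; the second decodes bytes that are already UTF-8 as ISO-8859-1 and re-encodes them.

```python
# Failure 1: truncating by bytes instead of by characters.
text = "naïve"                  # 5 characters
raw = text.encode("utf-8")      # 6 bytes: "ï" occupies two (0xC3 0xAF)


def truncate_bytes(data: bytes, limit: int) -> bytes:
    """Hypothetical helper: cut at a byte count, as a byte-oriented precision would."""
    return data[:limit]


def truncate_chars(data: bytes, limit: int) -> bytes:
    """Hypothetical helper: cut at a character count by decoding first."""
    return data.decode("utf-8")[:limit].encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


cut_bytes = truncate_bytes(raw, 3)   # the "ï" is split after its first byte
cut_chars = truncate_chars(raw, 3)   # the "ï" survives intact

# Failure 2: data that is already UTF-8 but labeled ISO-8859-1.
# Python's codecs stand in for iconv ISO->UTF8 here (an assumption of this sketch).
original = "é".encode("utf-8")       # 2 bytes
double_encoded = original.decode("iso-8859-1").encode("utf-8")  # 2 x 2 bytes

print(cut_bytes, is_valid_utf8(cut_bytes))
print(cut_chars, is_valid_utf8(cut_chars))
print(original, double_encoded, double_encoded.decode("utf-8"))
```

In this sketch the double-encoded result is still well-formed UTF-8; it is simply the wrong text, since each original byte has been turned into its own two-byte character. Whether a given terminal or parser then misbehaves depends on what the mislabeled source actually contained.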
Right, so

1. the encoding is mislabeled
2. iconv is invoked to change the encoding
3. garbage in, garbage out
4. this is utf-8's fault?

The elegance of utf-8 lies in the fact that (a) the entire unicode space is
representable using the one encoding and (b) it's backwards compatible with
ASCII. A corollary of (a) is that in a utf-8 system no transcoding is
required - i.e. as soon as you invoke iconv you are complaining about *other*
encodings, not utf-8 :P

-Rowan

(I did find the cascade of errors amusing :) )

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users