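NUL (00) is a valid UTF-8 character, but FF is never valid (extending the lead-byte pattern, even FE would already imply a 7-byte sequence carrying 36 bits). Practically, any byte from F8 up is invalid in UTF-8, and strictly speaking so are F5-F7, since RFC 3629 caps Unicode at U+10FFFF. The one special sequence worth mentioning is the BOM, https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8 which is ordinary, legal UTF-8: EF BB BF (decimal 239 187 191).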
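To make the byte ranges above concrete, here is a minimal sketch in C. The names utf8_encode and utf8_byte_can_occur are made up for illustration (they are not SQLite APIs), and the checks assume the RFC 3629 limits (code points up to U+10FFFF, no surrogates).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8 into out[4].
 * Returns the number of bytes written, or 0 if cp is a surrogate
 * or lies above U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                        /* 0xxxxxxx, includes NUL */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 11110xxx + 3 continuation bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: can byte b appear anywhere in well-formed UTF-8?
 * C0 and C1 would only start overlong 2-byte forms, and F5..FF would
 * start sequences beyond U+10FFFF, so none of them ever occur. */
static int utf8_byte_can_occur(unsigned char b)
{
    return b <= 0xF4 && b != 0xC0 && b != 0xC1;
}
```

Encoding 0xFEFF with utf8_encode writes the three BOM bytes EF BB BF, while utf8_byte_can_occur(0xFF) and utf8_byte_can_occur(0xF8) both return 0.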
// 0xFEFF -> EF, 80|3B (= BB), 80|3F (= BF)

  Many Windows <https://en.wikipedia.org/wiki/Microsoft_Windows> programs (including Windows Notepad <https://en.wikipedia.org/wiki/Notepad_(Windows)>) add the bytes 0xEF, 0xBB, 0xBF at the start of any document saved as UTF-8. [...]

(Not that a BOM is even required for UTF-8, because the bytes are already in a fixed order.)

----------

But anyway, FF could be used as a string terminator instead of 00. It is never legal in any UTF-8 sequence. None of F8, F9, FA, FB, FC, FD, FE, FF is used: F8 would start a 5-byte encoding, which covers more code points than Unicode has allocated. It could potentially be useful to leave a little extra room for longer sequences, so I would avoid F8 (and F9, FA, FB) and stick to FC-FF for possible control characters.

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
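The FF-terminator idea above could be sketched roughly as follows in C. This is only an illustration: ff_strlen and FF_TERMINATOR are hypothetical names, not anything SQLite provides, and the sketch assumes the buffer really does end with an FF byte.

```c
#include <stddef.h>

/* Assumption for this sketch: strings end with 0xFF, a byte that can
 * never occur inside well-formed UTF-8, instead of with 0x00. */
#define FF_TERMINATOR 0xFF

/* Hypothetical strlen() analogue: counts the bytes before the FF
 * terminator. Embedded NUL bytes are counted like any other byte. */
static size_t ff_strlen(const unsigned char *s)
{
    size_t n = 0;
    while (s[n] != FF_TERMINATOR)
        n++;
    return n;
}
```

With this convention a buffer such as 'a', 00, EF, BB, BF, 'b', FF has length 6: the embedded NUL and the BOM bytes are ordinary payload, and only FF ends the string.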