Re: [sqlite] Feature request: more robust handling of invalid UTF-16 data

2020-02-16 Thread Maks Verver
*Richard:* the issue with the JSON extension seems unrelated to the issue that I reported originally, which relates to the SQLite C API (specifically, the sqlite3_bind_text16() and sqlite3_bind_text16() functions). My issue is still not fixed. I've expanded my original sample code to make it

Re: [sqlite] Feature request: more robust handling of invalid UTF-16 data

2020-01-14 Thread Richard Hipp
On 1/14/20, Richard Hipp wrote:
> I'm having trouble reproducing this.

I went back to version 3.30.1 and I was able to reproduce it. So I bisected and found the following:

https://sqlite.org/src/timeline?c=51027f08c0478f1b

--
D. Richard Hipp
d...@sqlite.org

Re: [sqlite] Feature request: more robust handling of invalid UTF-16 data

2020-01-14 Thread Richard Hipp
On 1/13/20, Dennis Snell wrote:
> We have a JSON document like this which we store in a table.
>
> {"content": "\ud83c\udd70\ud83c(null)\udd71","tags":[]}
>
> The JSON is well-formed but the sequence of UTF-16 code points is invalid.
>
> When sqlite reads this data two types of further

Re: [sqlite] Feature request: more robust handling of invalid UTF-16 data

2020-01-14 Thread Detlef Golze
Subject: Re: [sqlite] Feature request: more robust handling of invalid UTF-16 data I’d like to raise this issue again and give my support for what Maks Verver recommended in https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg110107.html Independently I came to this bug while

Re: [sqlite] Feature request: more robust handling of invalid UTF-16 data

2020-01-13 Thread Dennis Snell
I’d like to raise this issue again and give my support for what Maks Verver recommended in https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg110107.html Independently I came to this bug while working on an issue in Simplenote’s Android app where our data was being corrupted

[sqlite] Feature request: more robust handling of invalid UTF-16 data

2018-05-21 Thread Maks Verver
*Background:* UTF-16 is an encoding which allows most characters to be encoded in a single 16-bit code unit. Characters outside the Basic Multilingual Plane (i.e. code points between 0x10000 and 0x10FFFF) require two code units: a high surrogate between 0xD800 and 0xDBFF, followed by a low
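
To make the surrogate rules concrete, here is a minimal sketch in C of how a caller might check a UTF-16 buffer before handing it to SQLite. The function names `first_unpaired_surrogate` and `combine_surrogates` are hypothetical helpers invented for this illustration; they are not part of the SQLite API and do not describe how SQLite itself handles such input. The sketch assumes the standard low surrogate range of 0xDC00 to 0xDFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of SQLite: returns the index of the first
 * unpaired surrogate in a buffer of n UTF-16 code units, or n if every
 * surrogate is part of a valid high+low pair. */
static size_t first_unpaired_surrogate(const uint16_t *s, size_t n) {
    size_t i = 0;
    while (i < n) {
        uint16_t u = s[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: must be immediately followed by a low one. */
            if (i + 1 < n && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
                i += 2;
                continue;
            }
            return i;
        }
        if (u >= 0xDC00 && u <= 0xDFFF) {
            /* Low surrogate with no preceding high surrogate. */
            return i;
        }
        i++;
    }
    return n;
}

/* Hypothetical helper: combines a valid high/low surrogate pair into the
 * code point it encodes (always in the range 0x10000..0x10FFFF). */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo) {
    return 0x10000u + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
}
```

Applied to the escape sequence quoted earlier in this thread, the pair 0xD83C 0xDD70 is valid, while the 0xD83C that follows it is a high surrogate with no low surrogate after it, so the check reports index 2.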