Well, I believe this is the relevant bit from the docs for binding:
https://www.sqlite.org/c3ref/bind_blob.html

"If a non-negative fourth parameter is provided to sqlite3_bind_text() or
sqlite3_bind_text16() or sqlite3_bind_text64() then that parameter must be
the byte offset where the NUL terminator would occur assuming the string
were NUL terminated. If any NUL characters occur at byte offsets less than
the value of the fourth parameter then the resulting string value will
contain embedded NULs. The result of expressions involving strings with
embedded NULs is undefined."

-----Original Message-----
From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of Barry Smith
Sent: Monday, January 13, 2020 1:54 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Unexplained table bloat

On the original topic...

How does one end up with a database in this state? I.e with a binary value that contains 0x00 bytes followed by other bytes but a type of TEXT?

If the definition of a text string in SQLite is that it ends at the first 0x00 byte, then it seems that anything stored as a text string should adhere to that. So a database with a TEXT value that contains characters after the first 0x00 should be considered corrupt. Given that to retrieve the actual contents of the cell it must be cast to BLOB, why not force the storage of any string that contains 0x00 as a BLOB in the first place?

What am I missing here?

On 13 Jan 2020, at 6:02 am, Simon Slavin <slav...@bigfraud.org> wrote:
>
> On 13 Jan 2020, at 9:26am, Dominique Devienne <ddevie...@gmail.com> wrote:
>
>> Which implies length(text_val) is O(N), while
>> length(blob_val) is O(1),
>> something I never quite realized.
>
> For this reason, and others discussed downthread, some languages which store
> Unicode strings store the number of graphemes as well as its contents. So
> functions which care about the … let's call it "width" … just retrieve that
> number rather than having to parse the string to figure out the length.
>
> In a Unicode string 'length' can mean
>
> 1) octet count (number of 8-bit bytes used to store the string)
> 2) number of code points (basic unicode unit)
> 3) number of code units (how code points get arranged in UTF8, UTF16, etc.,
> not as simple as it looks)
> 4) length in graphemes (space-using units)
> 5) length in glyphs (font-rendering units)
>
> and probably others I've forgotten. Not to mention that I simplified the
> definitions of the above and may have got them wrong.
>
> An application centred around rendering text (e.g. vector graphics drawing
> apps) might have each piece of text stored with all five of those numbers,
> just to save it from having to constantly recalculate them.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
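
For anyone who wants to reproduce the state discussed in this thread, here is a rough sketch. It stores a string with an embedded NUL as TEXT and compares length() on the TEXT value with length() on the same bytes cast to BLOB. It uses Python's standard sqlite3 module rather than the C API, purely for brevity. The table name t, the column v and the helper text_lengths() are made up for illustration. It assumes the module binds a str as TEXT with its full byte length, and it relies on length() counting characters only up to the first NUL for TEXT, as the SQLite documentation for length() describes.

```python
import sqlite3


def text_lengths(value: str) -> tuple:
    """Store `value` as TEXT and report how SQLite measures it.

    Returns (typeof(v), length(v), length(CAST(v AS BLOB))).
    The table `t` and column `v` are illustrative names only.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (v TEXT)")
        # Assumption: the sqlite3 module binds the str as TEXT using its
        # full byte length, so an embedded NUL and the bytes after it are
        # stored rather than truncated.
        conn.execute("INSERT INTO t (v) VALUES (?)", (value,))
        row = conn.execute(
            "SELECT typeof(v), length(v), length(CAST(v AS BLOB)) FROM t"
        ).fetchone()
        return row
    finally:
        conn.close()


if __name__ == "__main__":
    # Plain ASCII string: both lengths should agree.
    print(text_lengths("abcdef"))
    # Same string with a NUL in the middle: the two lengths diverge.
    print(text_lengths("abc\x00def"))
```

In the second call the value is still typed as text, length(v) stops at the NUL, and the BLOB cast reports every stored byte. That gap between the visible length and the stored size is the sort of thing that makes a table look bloated for no obvious reason.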