Re: [sqlite] builtin functions and strings with embedded nul characters

R Smith Mon, 04 Jul 2016 04:08:18 -0700


On 2016/07/04 10:22 AM, Rob Golsteijn wrote:

@Clemens,

It is indeed documented that the behaviour is undefined when using a bind_text variant. I missed that part of documentation.


Hi Rob,

The behaviour is undefined in ALL instances where you pass null characters through C strings because of a C string peculiarity, not because of a shortcoming of SQLite.

I think you are missing an important bit in all of this: C strings are the problem, because they treat a null character as the terminator. It has nothing to do with how SQL stores data. SQLite will store it with all bytes intact, but you typically retrieve or set it via some C calls using a C API, and this is where the problem is. So whenever you want to push strings into the DB or get them out, and they do contain char(0) characters, you need to read them into/from byte streams, arrays, blobs, hex-encoded strings, or some other method that does not pass through a standard C string, because at that moment, unless you force the length, the string will be shortened at the first zero byte found.
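To make the "force the length" point concrete, here is a minimal sketch in plain C. The names byte_span, c_string_length and copy_span are hypothetical helpers made up for illustration, not part of SQLite. The same idea applies when you hand a byte count to calls such as sqlite3_bind_blob() and read the size back with sqlite3_column_bytes() instead of relying on the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: a buffer that carries its own length,
   so embedded zero bytes are just data. */
typedef struct {
    const unsigned char *data;
    size_t len;
} byte_span;

/* Length as a NUL-terminated C string sees it:
   counting stops at the first zero byte. */
size_t c_string_length(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Copy exactly src.len bytes, zero bytes included.
   Returns the number of bytes copied, or 0 if dst is too small. */
size_t copy_span(unsigned char *dst, size_t dst_size, byte_span src)
{
    assert(dst != NULL);
    if (src.len > dst_size) {
        return 0;
    }
    if (src.len > 0) {
        memcpy(dst, src.data, src.len);
    }
    return src.len;
}
```

With the 7-byte literal "abc\0def", c_string_length() only ever sees the first three bytes, while copy_span() moves all seven because the length travels with the pointer.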


Thus, it is not the implementation that needs changing, but the usage.

I had this problem in a similar situation where I tried to store MBCS strings with 16-bit chars and 32-bit (4-byte) character strings. Apart from the enormous waste in 99% of characters, the trailing bytes were all zero bytes (so character 'A' would be represented by 0x41 00 00 00), and if you try to store 'ABC' like that into the DB and then read it with just a C string, you end up with just A - but the DB still contains the full 'ABC', it's only your own string that doesn't know it. More wicked still, SQLite likely pushes the entire string to the memory location internally, so it is really there, but whatever function next operates on that string will only regard everything up to that first zero byte thanks to C, and SQLite cannot help that.
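The 'ABC' case can be sketched like this, assuming little-endian 4-byte characters. ascii_to_utf32le and visible_as_c_string are hypothetical names invented for this example; the second one mimics what a C-string function would see, but uses a bounded memchr() so it never reads past the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode plain ASCII text as 4-byte little-endian characters.
   Returns the number of bytes written, or 0 if out is too small. */
size_t ascii_to_utf32le(const char *ascii, unsigned char *out, size_t out_size)
{
    assert(ascii != NULL && out != NULL);
    size_t n = strlen(ascii);
    if (n * 4 > out_size) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        out[4 * i]     = (unsigned char)ascii[i]; /* low byte holds the code */
        out[4 * i + 1] = 0;                       /* the rest is padding     */
        out[4 * i + 2] = 0;
        out[4 * i + 3] = 0;
    }
    return n * 4;
}

/* Number of bytes a NUL-terminated reader would treat as "the string":
   everything before the first zero byte, bounded by len. */
size_t visible_as_c_string(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    const unsigned char *zero = memchr(buf, 0, len);
    return zero ? (size_t)(zero - buf) : len;
}
```

Encoding "ABC" yields 12 bytes (41 00 00 00 42 00 00 00 43 00 00 00), yet a C-string reader stops after the first byte, which is why only the A comes back.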

If SQLite could fix this, it wouldn't be documented as undefined; it would have been fixed.

As far as documenting the above... Any C developer reading this would probably giggle and think "Thanks captain obvious!", because this is really first-week stuff in a C-Programming-101 course. However, coming from other compiling platforms, this may not be very obvious. I prefer how Lazarus/Delphi does it (wrt the Pascal variant options as opposed to C++), where a string is a record with first the encoding, the length and then the actual bytes given. You never have to walk the memory to figure out the length and never care about null characters, it's all mapped in one place - but it does add overhead for small-ish strings, and they have that stupid convention where the first character index is at 1 and not 0 - yuck, so pros and cons for all.
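For readers who have not seen the record-style layout, a rough C approximation could look like the following. This is only a sketch of the idea (encoding, length, then bytes); counted_string and counted_string_new are hypothetical names, and the field sizes are assumptions, not the actual Lazarus/Delphi memory layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed string: the header says how many bytes
   follow, so no scan for a terminator is ever needed. */
typedef struct {
    unsigned short encoding;  /* assumed code page tag, e.g. 65001 for UTF-8 */
    size_t length;            /* number of payload bytes                     */
    unsigned char bytes[];    /* payload, may contain zero bytes (C99)       */
} counted_string;

/* Allocate a counted_string holding a copy of length bytes from src.
   Returns NULL on allocation failure; the caller frees with free(). */
counted_string *counted_string_new(unsigned short encoding,
                                   const void *src, size_t length)
{
    assert(src != NULL || length == 0);
    counted_string *s = malloc(sizeof *s + length);
    if (s == NULL) {
        return NULL;
    }
    s->encoding = encoding;
    s->length = length;
    if (length > 0) {
        memcpy(s->bytes, src, length);
    }
    return s;
}
```

The header is where the overhead for small-ish strings comes from: even an empty string still pays for the encoding and length fields.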


Cheers,
Ryan


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
