> On Apr 12, 2019, at 12:58 PM, x <tam118...@hotmail.com> wrote:
> 
> I’ve been asking myself if I could have done the above more efficiently as 
> sqlite’s converting the original string then I’m converting it and copying 
> it. While thinking about that I started to wonder how c++ handled utf8/16. 
> E.g. To access the i’th character does it have to rattle through all previous 
> I-1 characters to find the start of character i, how pointer arithmetic was 
> handled when pointing to utf8/16 chars etc.
> 

Basically, if you are dealing with a variable width encoding (UTF-8/UTF-16), 
then finding the nth character requires scanning the string counting beginning 
of characters. If this is an important operation, you pay the cost of 
conversion and work in UCS-4. On the other hand, UTF-8 has a lot of nice 
properties such that it can be a fairly seamless upgrade for processing plain 
ASCII text, and if reasonably efficient for typical text. (There are a number 
of complications if you try to support ALL of Unicode, like the composed 
characters, where you use several code-point together to define a single 
character), where you need to decide how you want to normalize and need some 
big character tables for the instructions of how to do this.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to