Dear all

An update is now available on Bitbucket (see the change log message below) that introduces support for UTF-8 characters of up to 4 bytes (with Tcl 8.6). It should also work with 6-byte UTF-8 when Tcl 8.7 is compiled accordingly (by setting TCL_UTF_MAX).

One can now use, for example, emoticons in SQL queries:

    db_0or1row ... {select 1 from cr_items where name = '😈'}

or as values of bind variables:

    set x 😈
    db_0or1row ... {select 1 from cr_items where name = :x}

... but not as names of bind variables (these have the same restricted syntax as before,
in essence no funny characters).
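
As a side illustration of why such characters need 4-byte support, here is a minimal sketch in Python (not Tcl, and not part of the change itself) that prints the UTF-8 bytes of the emoticon used in the queries above. The variable names are made up for the example.

```python
# Illustration only: inspect the UTF-8 encoding of the emoticon used above.
devil = "\U0001F608"  # U+1F608, a code point outside the Basic Multilingual Plane

encoded = devil.encode("utf-8")   # bytes object holding the UTF-8 form
byte_count = len(encoded)         # number of bytes in the encoding
hex_bytes = encoded.hex(" ")      # space-separated hex dump (Python 3.8+)

print(byte_count)  # 4
print(hex_bytes)   # f0 9f 98 88
```

The first two bytes, 0xf0 0x9f, are the ones named in the PostgreSQL error message quoted further down.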

The code is already running at openacs.org.

all the best

-gn


Added support for UTF-8 characters up to 4 bytes

These changes add proper export of UTF-8 for Unicode symbols taking up
to 4 bytes. For the western world the biggest interest is probably in
emoticons. The change is implemented with performance in mind. The
properly encoded byte strings are cached in Tcl_Objs, so that only the
values of bind variables (which probably have different values per call)
have to be re-encoded at call time. This should keep the performance
penalty small (on some of our servers we see a daily average of 1500 SQL
operations per second, with peaks above 10K).
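
The caching idea can be sketched roughly as follows. This is a Python analogy under stated assumptions, not the actual implementation (which caches the byte strings in Tcl_Objs inside the server); the function names encode_sql_template and prepare_call are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def encode_sql_template(sql):
    """Encode the constant SQL text once; repeated calls are served from the cache."""
    return sql.encode("utf-8")


def prepare_call(sql, bind_values):
    """Return the cached encoded SQL plus freshly encoded bind values.

    Hypothetical helper: only the bind values, which usually differ
    from call to call, are encoded on every invocation.
    """
    template = encode_sql_template(sql)
    encoded_values = {name: value.encode("utf-8")
                      for name, value in bind_values.items()}
    return template, encoded_values


sql = "select 1 from cr_items where name = :x"
first = prepare_call(sql, {"x": "\U0001F608"})
second = prepare_call(sql, {"x": "plain"})

# One miss (first call) and one hit (second call) for the SQL text.
cache_info = encode_sql_template.cache_info()
print(cache_info.hits, cache_info.misses)
```

In this sketch the SQL text is encoded a single time, while each call still pays for encoding its own bind values, which mirrors the trade-off described above.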

The names of bind variables still follow the same rules as before (no
emoticons as variable names).

On 16.11.21 16:39, Wolfgang Winkler via naviserver-devel wrote:

the fix worked, thank you Gustaf! But we still have a problem with emojis when writing them to the database. The error we get is:

Database operation "dml" failed (exception ERROR, "ERROR:  invalid byte sequence for encoding "UTF8": 0xf0 0x9f

_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel
