This seems to work OK as a sqlite function.


// Assume values[0] and values[1] are supplied and not NULL,
// and that both hold well-formed UTF-8.
// Count occurrences of the single character values[1] in values[0].

const unsigned char *c = sqlite3_value_text(values[0]);
const unsigned char *Sep = sqlite3_value_text(values[1]);
int Byte1, Count = 0, NrBytes, NrSepBytes = (int)strlen((const char *)Sep);

while (*c)
{
       // The high nibble of the lead byte gives the length of the character at c
       Byte1 = (*c) >> 4;
       if ((Byte1 & 8) == 0) NrBytes = 1;  // 0xxx: ASCII
       else if (Byte1 == 15) NrBytes = 4;  // 1111
       else if (Byte1 == 14) NrBytes = 3;  // 1110
       else NrBytes = 2;                   // 110x (lead bytes 0xC0..0xDF)

       // c is at the first byte of a character; compare it with Sep
       if (NrBytes == NrSepBytes && memcmp(c, Sep, NrBytes) == 0) Count++;

       c += NrBytes;
}

sqlite3_result_int(ctx, Count);
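
The same idea can be sketched as a standalone helper with no SQLite calls, which makes it easier to test on its own. This is only a sketch: the names utf8_char_len and count_utf8_char are hypothetical, and it assumes the input is well-formed UTF-8 (it merely avoids running past the end if the last character is truncated).

```c
#include <stddef.h>
#include <string.h>

/* Length in bytes of the UTF-8 character whose lead byte is b.
 * A stray continuation or invalid byte is treated as a 1-byte
 * character so the scan always advances. */
static size_t utf8_char_len(unsigned char b)
{
    if (b < 0x80) return 1;            /* 0xxxxxxx */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;                          /* continuation or invalid byte */
}

/* Count how often the single UTF-8 character sep occurs in s.
 * Hypothetical helper, not part of the SQLite API. */
static int count_utf8_char(const char *s, const char *sep)
{
    size_t sep_len = strlen(sep);
    size_t remaining = strlen(s);
    int count = 0;

    while (remaining > 0) {
        size_t n = utf8_char_len((unsigned char)*s);
        if (n > remaining) n = remaining;  /* truncated final character */
        if (n == sep_len && memcmp(s, sep, n) == 0) count++;
        s += n;
        remaining -= n;
    }
    return count;
}
```

For example, count_utf8_char("a,b,c", ",") returns 2. Inside an SQLite function one would presumably pass it the results of sqlite3_value_text (cast to const char *) and hand the count to sqlite3_result_int.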



________________________________
From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> on behalf of 
Scott Robison <sc...@casaderobison.com>
Sent: Friday, April 12, 2019 8:40:19 PM
To: SQLite mailing list
Subject: Re: [sqlite] Help with sqlite3_value_text

On Fri, Apr 12, 2019, 1:06 PM Keith Medcalf <kmedc...@dessus.com> wrote:

>
> Actually you would have to convert the strings to UCS-4.  UTF-16 is a
> variable-length encoding.  An actual "unicode character" is (at this
> present moment in time, though perhaps not tomorrow) 4 bytes (64-bits).
>

That is some impressive compression! :)

Regardless, even if you use UCS-4, you still have the issue of combining
characters. Unicode is complex, as has been observed.
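
To make the combining-character point concrete, here is a small sketch (all names are hypothetical) showing two UTF-8 encodings of what a reader sees as the same "é". One is a single code point, the other is a base letter followed by a combining accent, so a per-character comparison of one against the other will not match unless the strings are normalized first.

```c
#include <stddef.h>

/* Count Unicode code points in a UTF-8 string by counting every byte
 * that is not a continuation byte (10xxxxxx). Assumes well-formed input. */
static size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80) n++;
    return n;
}

/* Two encodings that render as the same "é". */
static const char E_PRECOMPOSED[] = "\xC3\xA9";  /* U+00E9 */
static const char E_COMBINING[]   = "e\xCC\x81"; /* U+0065 followed by U+0301 */
```

Here utf8_code_points(E_PRECOMPOSED) is 1 and utf8_code_points(E_COMBINING) is 2, and the byte sequences differ in length (2 versus 3 bytes), so neither a byte-wise nor a code-point-wise search for one form will find the other.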
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
