Igor Tandetnik, on Monday, November 11, 2019 11:02 AM, wrote...
>
> On 11/11/2019 10:49 AM, Jose Isaias Cabrera wrote:
> > So, yes, it's bulky, but, if you want to count characters in languages such as
> > Arabic, Hebrew, Chinese, Japanese, etc., the easiest way is to convert that string
> > to UTF32, and do a string count of that UTF32 variable.
>
> Between ligatures and combining diacritics, the number of Unicode codepoints in a
> string has little practical meaning. E.g. it is not necessarily correlated with the
> width of the string as displayed on the screen or on paper; or with the number of
> graphemes a human would say the string contains, if asked.

That could be true, but have you tried to display just a specific number of characters from a UTF8 string containing Hebrew, Arabic, Chinese, or Japanese text? (See below.)

> > Most people have to figure out what Unicode they are using, count the bytes, divide
> > by... and on, and on. Not me, I just take that UTF8, or UTF16 string, convert it to
> > UTF32, and do a count.
>
> And then what do you do with that count? What do you use it for?

Say that I am writing a report and I only want to print the first 20 characters of a string. That would be something like,

    if (var.length > 20)
    {
      writefln(var[0 .. 20]);
    }
    else
    {
      // pad with 20 spaces, then cut the padded string to 20
      writefln((var ~ "                    ")[0 .. 20]);
    }

If var is declared as UTF8, and it holds a Chinese string or text in some other multi-byte language, this will never print 20 Chinese characters. It will print fewer, because .length and the slice count bytes, not characters. If I convert that UTF8 string to UTF32, then each multi-byte character fits in one UTF32 code unit. So,

    auto var32 = std.utf.toUTF32(var);
    if (var32.length > 20)
    {
      writefln(var32[0 .. 20]);
    }
    else
    {
      // same padding, as a UTF32 literal
      writefln((var32 ~ "                    "d)[0 .. 20]);
    }

This will always print 20 characters (code points), whether these are ASCII or multi-byte language characters.

Thanks.

josé
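
For anyone following the thread, here is a minimal sketch (not from the original message) of the two ways of counting being discussed: by code point, which is what the UTF32 conversion gives, and by grapheme, which is what Igor is describing. It assumes a current D2 compiler with Phobos; the function names firstCodePoints and firstGraphemes are made up for illustration. Both walk the UTF8 string in place instead of building a UTF32 copy.

```d
import std.uni : graphemeStride;
import std.utf : stride;

/// Hypothetical helper: prefix of `s` holding at most `n` code points.
/// The result is a slice of the UTF8 input; nothing is copied.
string firstCodePoints(string s, size_t n)
{
    size_t i = 0;
    for (; n > 0 && i < s.length; --n)
        i += stride(s, i);          // bytes used by the code point at index i
    return s[0 .. i];
}

/// Hypothetical helper: same idea, but counting grapheme clusters
/// (a base character together with its combining marks).
string firstGraphemes(string s, size_t n)
{
    size_t i = 0;
    for (; n > 0 && i < s.length; --n)
        i += graphemeStride(s, i);  // bytes used by the grapheme at index i
    return s[0 .. i];
}

unittest
{
    // Three CJK characters are three code points (nine bytes).
    assert(firstCodePoints("日本語のテキスト", 3) == "日本語");

    // "noël" written as 'e' followed by a combining diaeresis (U+0308).
    string noel = "noe\u0308l";
    assert(firstCodePoints(noel, 3) == "noe");        // the diaeresis is cut off
    assert(firstGraphemes(noel, 3) == "noe\u0308");   // the diaeresis stays with 'e'

    // Asking for more than the string holds returns the whole string.
    assert(firstCodePoints("abc", 20) == "abc");
}
```

In the report example, writeln(firstCodePoints(var, 20)) would print the same text as the UTF32 slice. Neither function pads short strings, and neither says anything about how wide the result is on screen.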