Yep, I hear you Henry. I wanted to be sure you were aware of the differences. My concern is that when a UTF-8 sequence is broken up, the display should be something a bit more understandable for the user. I was playing around with this recently while trying to develop a web-based solution, which is a lot easier because there is no constraint of a fixed font size. Here is a video of what I ended up with:
https://youtu.be/eN9H-rMk1No

The key verb in the process was this one, which breaks a list of point numbers into appropriate chunks to generate characters:

utf_vts_=: 3 : 0"1
if. y-:'' do. return. end.
NB. try a 1-item prefix first; on error fall through to 2, then 3 items
try. ((utf@:((1<.#)}.]));~((3 u: ":)@: (7 u: a.{~ (1<.#) {. ]))) y
catch.
  try. ((utf@:((2<.#)}.]));~((3 u: ":)@: (7 u: a.{~ (2<.#) {. ]))) y
  catch.
    try. ((utf@:((3<.#)}.]));~((3 u: ":)@: (7 u: a.{~ (3<.#) {. ]))) y
    NB. no prefix worked: box the first item as is and carry on with the rest
    catch. ({. ; utf@}.) y
    end.
  end.
end.
)

Thanks for all of your work on this. Handling the variable widths and uneven box boundaries within your constraints seems quite a challenge.

By the way, the most recent version of the beta that I could download was Beta-7 on the Darwin platform, so I have not had a chance to try out the solution to the CJK character widths.

Cheers, bob

> On Jul 5, 2016, at 2:52 AM, Henry Rich <henryhr...@gmail.com> wrote:
>
> This 3 4 $ 'a', 'ఝ', 'b'
> is a strange beast: The string requires 5 bytes, so you have chopped 3 copies
> of it into 3 4-byte sections. ": formats each line separately; the second
> and third lines contain partial UTF-8 sequences which are displayed as
> unknown. I think it's right; at least it's reasonable (formatting of invalid
> UTF-8 is undefined).
>
> Playing with UTF-8 is hard; easier to convert to unicodes first, using u: .
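
To make the byte arithmetic above concrete, here is a minimal sketch. The names s, bytes, rows, wide, codes and grid are only illustrative and do not come from the thread, and it assumes the literal is read as UTF-8, as it would be in a J script or session. It shows the 5 bytes being cut into 4-byte rows, and then the same reshape done after converting to unicode with u: so that no sequence is split.

```j
NB. Sketch only: all names here are illustrative assumptions.
s =: 'a', 'ఝ', 'b'       NB. a UTF-8 literal: 1 + 3 + 1 = 5 bytes
bytes =: a. i. s         NB. byte values: 97 224 176 157 98
rows =: 3 4 $ bytes      NB. cyclic reshape of the bytes into 3 rows of 4
NB. rows is
NB.    97 224 176 157    complete: a followed by all 3 bytes of the character
NB.    98  97 224 176    ends with only 2 bytes of the 3-byte sequence
NB.   157  98  97 224    starts with a stray trailing byte, ends with a lone lead byte

wide =: 7 u: s           NB. convert the UTF-8 bytes to a 3-item unicode string
codes =: 3 u: wide       NB. code points: 97 3101 98
grid =: 3 4 $ wide       NB. reshape by character, so every row holds whole characters
```

Reshaping wide instead of s means each row holds four whole characters, which should avoid the partial sequences in the second and third rows.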
>
> Henry Rich
>
> On 7/5/2016 12:21 AM, robert therriault wrote:
>>    JVERSION
>> Engine: j804/j64/darwin
>> Release: commercial/2015-12-21 18:06:25
>> Library: 8.04.15
>> Qt IDE: 1.4.10/5.4.2
>> Platform: Darwin 64
>> Installer: J804 install
>> InstallPath: /applications/j64-804
>> Contact: www.jsoftware.com
>>
>>    3 4 $ 'a', 'ఝ', 'b'
>> aఝ
>> ba��
>> �ba�
>>
>>    JVERSION
>> Engine: j805/j64/darwin
>> Beta-7: commercial/2016-06-22T14:29:03
>> Library: 8.04.15
>> Qt IDE: 1.4.9/5.4.2
>> Platform: Darwin 64
>> Installer: J804 install
>> InstallPath: /users/bobtherriault/j64-804
>> Contact: www.jsoftware.com
>>
>>    3 4 $ 'a', 'ఝ', 'b'
>> aఝ
>> baà°
>> baà
>>
>> I can't seem to download the newest beta, but this behaviour in the previous
>> one does not seem correct and may be the same in the newest version (I don't
>> expect that the CJK changes would affect this behaviour). Introducing new
>> characters instead of the � to show non-displayable characters would be
>> confusing to the user, and the third line containing 3 characters instead of
>> the expected 4 seems incorrect.
>>
>> Cheers, bob
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm