Hmm..

So I should have used 3 u: 7 u: instead of using 3 u: directly.

Thinking about this further:

N=: ".0 :0-.LF
 117 205 158 204 153 204 157 205 153 204 151 204 151 205 150 110
 105 204 187 99 204 184 204 152 204 188 204 173 205 150 204 172
 204 178 111 205 158 100 204 168 204 174 204 186 204 170 204 169
 205 150 101 205 152 204 171 204 163
)

   (4 u: N) -: N {a.
1

But 4 u: N does not look like N{a.

But I guess that should be ok, because something analogous
happens with numbers:

   M=:2^31
   M
2.14748e9
   x:M
2147483648
   M-:x:M
1

I guess I have to keep in mind that it can never be perfect.
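
For the record, here is a minimal sketch of the difference between
3 u: and 3 u: 7 u: using just the first three byte values of N. The
name B is introduced here only for illustration, and I am assuming a
J session with the standard library loaded (for echo):

```j
NB. B is a hypothetical name: the first three bytes of N, that is,
NB. 'u' followed by one two-byte UTF-8 sequence.
B =: 117 205 158 { a.

echo 3 u: B            NB. raw bytes as integers: 117 205 158
echo 3 u: 7 u: B       NB. decode UTF-8 first, then code points: 117 862
echo 862 = 16b035e     NB. 1, the second constant in Dan's list
echo B -: 8 u: 7 u: B  NB. 1, encoding back to UTF-8 recovers the bytes
```

So 3 u: on its own reports the UTF-8 bytes, while 3 u: 7 u: reports
the code points that Dan spelled out as 16bXXXX constants.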

Thanks,

-- 
Raul

On Wed, Sep 10, 2014 at 2:26 PM, Dan Bron <[email protected]> wrote:
> Raul wrote:
>>  Rather than give you a screenshot, here's what inspired my original message:
>>
>>  https://twitter.com/0xabad1dea/status/509748597668446208
>
> When I want to embed Unicode reliably in my J programs, I typically spell
> out the codepoints as numeric constants, using "16bXXXX" in place of
> "U+XXXX" (bearing in mind that hexadecimal values in J's constant notation
> must be in lowercase; that is, 16b01ab, not 16b01AB).
>
> Then, I play around with variations on ucp, utf8, 3&u:, 4&u: until I get
> the results I'm expecting (at the very least, in terms of the length of
> the string, which may not be the number of distinct characters a human
> would identify, but should at least be [significantly] lower than the
> number of UTF-8 bytes).
>
> Using your example, I might do something along the lines below.
>
> -Dan
>
> require 'printf ~system\extras\util\browser.ijs'
>
> UC =: ucp 4 u: 0 ". LF&=`(,:& ' ')} noun define -. TAB
>         16b0075 16b035e 16b0319 16b031d 16b0359 16b0317
>         16b0317 16b0356 16b006e 16b0069 16b033b 16b0063
>         16b0338 16b0318 16b033c 16b032d 16b0356 16b032c
>         16b0332 16b006f 16b035e 16b0064 16b0328 16b032e
>         16b033a 16b032a 16b0329 16b0356 16b0065 16b0358
>         16b032b 16b0323 16b0020
> )
>
> HTML =: noun define -. TAB,CR
>         <!DOCTYPE html>
>         <html lang="en">
>           <head>
>             <meta charset="utf-8">
>                 <title>Unicode test page</title>
>           </head>
>           <body>
>              %s
>           </body>
>         </html>
> )
>
>
> FN =: jpath '~temp\unicode.html'
>
> FN fwrite~ HTML sprintf < utf8 UC
> launch_jbrowser_ FN
>
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm