Aha! I was thinking about using cut, but I didn't realize it had a dyadic
case.

I made iscombining rank zero just for clarity; when someone writes another
verb for it, it may or may not need an explicit "0.

Marshall

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Henry Rich
Sent: Monday, April 25, 2011 10:03 AM
To: Programming forum
Subject: Re: [Jprogramming] 32 bit wide unicode characters?

Or just

(<;.1~ -.@iscombining)

iscombining needn't be rank 0.  e. is applied to 0-cells when the right
operand has rank 1.
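
A minimal sketch of both points, reusing seq and the two code points from the quoted message below. Those two code points are only a stand-in for a real list of combining characters, which this thread does not supply, and the name tounities is taken from that same message.

```j
NB. Sketch only: 8413 773 are the two example code points from the thread,
NB. not a complete list of combining characters.
iscombining =: e.&8413 773            NB. no explicit "0 needed
seq =: 97 115 8413 100 102 773

-.@iscombining seq                    NB. fret list: 1 1 0 1 1 0

NB. Dyadic cut: each 1 in the fret list starts a new box.
(<;.1~ -.@iscombining) seq            NB. boxes: 97 | 115 8413 | 100 | 102 773

NB. With raze (;) as the obverse, reversing "under" tounities keeps each
NB. combining character attached to its base character.
tounities =: (<;.1~ -.@iscombining) :. ;
|.&.tounities seq                     NB. 102 773 100 115 8413 97
```

Note that the cut version assumes the sequence starts with a noncombining character, since cut discards any items before the first fret.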

Henry Rich

On 4/25/2011 9:44 AM, Marshall Lochbaum wrote:
> This solution uses key:
>
>     iscombining=.e.&8413 773"0
>     seq=. 97 115 8413 100 102 773
>
>     (</.~ [: +/\ -.@iscombining) seq
> ┌──┬────────┬───┬───────┐
> │97│115 8413│100│102 773│
> └──┴────────┴───┴───────┘
>
> where the verb iscombining tells you whether a character is a 
> combining character. The result of ([: +/\ -.@iscombining) gives you a 
> list of indices which increase by one at each noncombining character:
>     +/\ -.@iscombining seq
> 1 2 2 3 4 4
>
> Key uses these indices to sort seq into boxes. I would set
>     tounities=.  (</.~ [: +/\ -.@iscombining) :. ;
> so that the verb to reverse is just
>     |.&.tounities
>
> Obviously, the verb iscombining needs to be improved to actually 
> recognize all combining characters. I don't know which characters are 
> combining, so I will let someone else do this.
>
> Marshall
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Alex Gian
> Sent: Monday, April 25, 2011 6:42 AM
> To: Programming forum
> Subject: Re: [Jprogramming] 32 bit wide unicode characters?
>
> Not sure if I'm on the right track here, but if I had to do this
> -- And I may in the future, as I sometimes have to deal with Greek 
> Polytonic, where _several_ combining characters may follow a vowel -- 
> I would somehow box all the unities (where a unity is either a letter 
> with no diacriticals or a letter followed by all its
> diacriticals/combiners) and then reverse the boxed contents.
>   97 ; 115  8413 ; 100 ; 102  773
> Voila - combining glyphs preserved.
>
> All I need now is a selective boxing filter... any suggestions?
> I haven't quite mastered boxing/unboxing yet, and I get some weird results.
>
>
> On Fri, 2011-04-22 at 11:33 -0400, Raul Miller wrote:
>> On Fri, Apr 22, 2011 at 11:16 AM, Roger Hui<[email protected]>  wrote:
>>> The URL you cited appeared to be for 20000 to 20FFF in hex?
>>
>> Yes.
>>
>>     3&u:inv 20086
>> refers to the wrong character.
>>
>> Hypothetically speaking,
>>     3&u:inv 16b20086
>> would refer to the right character, except that it does not refer to 
>> any character.
>>
>> Thanks,
>>
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
