That fixed it.

I found that the culprit is probably the step that eliminates the
leading 'one' (by idiom) for 11 to 19, using the mask t:
  t=. 0 0 1 0*.0 1-:1 2{d
  ' ' -.~ ; ((0=y){.ZH10) , (p*.s) # t}.&.> (d{ZH10) ,&.> b #&.> ZH4

However, each Chinese character takes 3 bytes in UTF-8, so dropping just
one byte leaves 2 garbage bytes. Another workaround might therefore be to
amend t to become

  t=. 0 0 3 0*.0 1-:1 2{d

and leave the Chinese constants in their original UTF-8.
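
The byte arithmetic behind this can be checked outside J. What follows is
a minimal Python sketch, not the essay's J code; the sample string and the
helper name drop_bytes are made up for illustration. It shows that
dropping one leading byte from a UTF-8 Chinese string leaves stray bytes,
while dropping three removes exactly one character.

```python
# Illustrative only: the sample text and the helper name are assumptions,
# not taken from the jwiki essay.
sample = "一十二"             # "one ten two", the unidiomatic form of 12
raw = sample.encode("utf-8")  # each BMP Chinese character becomes 3 bytes


def drop_bytes(data: bytes, n: int) -> str:
    """Drop n leading bytes, then decode; invalid bytes become U+FFFD."""
    return data[n:].decode("utf-8", errors="replace")


print(len(raw))            # 9 bytes for 3 characters
print(drop_bytes(raw, 3))  # removes exactly the leading character
print(drop_bytes(raw, 1))  # two stray continuation bytes remain in front
```

With a one-byte drop, the two leftover bytes are not valid UTF-8 on their
own, which matches the garbage bytes seen in front of 10 to 19.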

On Thu, 15 Oct 2009, Roger Hui wrote:
> Fixed.  Please verify that it now works.
> 
> 
> 
> ----- Original Message -----
> From: bill lam <[email protected]>
> Date: Thursday, October 15, 2009 1:56
> Subject: [Jprogramming] jwiki essay/number in words, chinese version
> To: JProgramming <[email protected]>
> 
> > In jwiki essay/number in words, the Chinese version produces extra
> > garbage bytes at the beginning for some numbers, e.g. 10 to 19 and
> > also 122000 and 123456. Because they are illegal bytes, the J
> > session will not display them. As a test: each Chinese character in
> > the BMP takes 3 bytes in UTF-8, so the number of bytes in the
> > resulting words should be divisible by 3.
> > 
> >    +/ 0 ~: 3&|@#...@zh("0) i.1e6
> > 109000
> > 
> > As a workaround, convert the Chinese literals into wide characters, e.g.
> > 
> > ZH10=: 7&u:&.> <;._1 ' 零 一 二 三 四 五 六 七 八 九'  NB. i.10

-- 
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
