On Fri, 5 Jul 2002 17:22:57 -0700
"Arun v" <[EMAIL PROTECTED]> wrote:

> Hi
> 
>  I'm a newbie to this Unicode world.
>  I'm developing an ECMAScript interpreter according to the ECMA-262
> standard, which states that the input given to the ECMA interpreter
>  will be in the UTF-16 transformation format (normalized to Unicode
> Normalization Form C).
> 
>  I have a C program in Linux (which acts as the scanner for the
> interpreter); now I want to make it aware of UTF-16 transformed
>  input. (I need not do any transformation or normalization, but I must
> make my program understand the UTF-16 encoding.)

Do the identifiers and variable names of ECMAScript itself need to be in
Unicode, or just the strings? If it's just the strings, then decode each
one in your tokenizer (or I suppose you could build up each string in your
state machine). You will then need basic string operations for UTF-16.

> P.S.: also suggest me some good online resources on Unicode and UTF-16

See this FAQ. Its focus is UTF-8 on Unix, but UTF-16 follows the same
principles, and it will be a good jumping-off point.

  http://www.cl.cam.ac.uk/~mgk25/unicode.html

Mike

-- 
http://www.eskimo.com/~miallen/c/jus.c

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
