In a message dated 27/06/2003 10:41:34 Pacific Daylight Time, [EMAIL PROTECTED] writes:
Sorry for being away for most of this month... am working my way
through 200+ sword-related e-mails and saw this one:

>NEW CHINESE TEXTS:  It seems in our older Union texts, we added spaces
>between every character to help with line wraps and word breaks. 
I think the right thing to do is to change your layout engine to support correct Chinese line wrapping, instead of adding spaces (which should not be there) to work around a limitation in the layout engine.
>Is this needed in the new NCV texts?  It seems they have spaces included at
>certain places. 
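
To make the line-wrapping suggestion above concrete, here is a minimal sketch of wrapping Chinese text without inserted spaces. It assumes pure Chinese text in which every character is one column wide, and the two punctuation sets are a small illustrative subset, not a complete rule set. The name wrap_chinese is hypothetical and is not part of any existing layout engine.

```python
# Closing punctuation must not start a line; opening punctuation must not
# end one. Both sets are an illustrative subset (assumption, not complete).
NO_BREAK_BEFORE = set("，。、；：？！）》」』")
NO_BREAK_AFTER = set("（《「『")


def can_break(prev, nxt):
    """True if a line break is allowed between characters prev and nxt."""
    return nxt not in NO_BREAK_BEFORE and prev not in NO_BREAK_AFTER


def wrap_chinese(text, width):
    """Greedily wrap text into lines of at most `width` characters."""
    lines = []
    start = 0
    while len(text) - start > width:
        # Try the longest line first, then back off to a legal break point.
        end = start + width
        while end > start + 1 and not can_break(text[end - 1], text[end]):
            end -= 1
        lines.append(text[start:end])
        start = end
    if start < len(text):
        lines.append(text[start:])
    return lines


for line in wrap_chinese("起初，神创造天地。", 4):
    print(line)
```

With this approach the stored text needs no extra spaces: a break may fall between any two characters except where it would strand punctuation.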

Chinese texts usually don't have spaces except after punctuation
marks.
They don't have spaces after punctuation either. No spaces, period.
I'll install NCV and take a look at the spaces it has.

>I noticed this using the Hanzi dictionary which always
>tried to lookup a 'word' instead of an individual glyph.
Chinese does have the concept of a "word", but it is very different from the concept of a word in Latin-script languages.
First, spaces are not used to separate words.
Second, there is no easy way to parse a word.
Third, a word can be a single character or be composed of 2-6 characters.
Fourth, there are compound words, so sometimes there is no easy way to tell where a word ends, even if you are a native Chinese speaker.
 
Google implements very good Chinese search. Maybe you should look at how they do it.
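
As a rough illustration of the word-boundary problem described above, one simple and widely known technique is forward maximum matching against a word list. This is only a sketch: the function name segment and the tiny dictionary are made up for the example, the limit of 6 characters comes from the word lengths mentioned above, and this is not a claim about how Google or SWORD segments text.

```python
def segment(text, dictionary, max_len=6):
    """Split text into words by forward maximum matching.

    `dictionary` is a set of known words (hypothetical sample data below).
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


sample_dictionary = {"起初", "创造", "天地"}
print(segment("起初神创造天地", sample_dictionary))
```

Because the match is greedy, it can pick the wrong boundary where compound words overlap, which is the ambiguity the fourth point refers to.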

I didn't do anything to make it look up a 'word'; in fact, I don't know
how to make it look up an individual glyph only ;-). It is often not
very useful to look up only one character (imagine looking up "foot"
and "ball" vs. looking up "football": the first lets you somewhat guess
the meaning, but the second gives the exact information). So it should
be possible to select a few characters and look them up in the
dictionary with the mouse or keyboard. However, for "standard lookup"
(i.e. without text being selected), looking up only the current
character instead of the whole 'word' would probably be more useful,
since with most modules the 'word' is going to be the whole line.
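
The lookup behaviour proposed above could be sketched roughly as follows. The function name lookup_key and the (start, end) selection format are assumptions made for this example and do not correspond to any existing SWORD API.

```python
def lookup_key(text, cursor, selection=None):
    """Return the string to look up in the dictionary.

    `selection` is a hypothetical (start, end) pair of character offsets,
    or None when nothing is selected.
    """
    if selection is not None:
        start, end = selection
        if start < end:
            # The user selected a few characters: look up exactly those.
            return text[start:end]
    # "Standard lookup": only the single character under the cursor.
    return text[cursor:cursor + 1]


verse = "起初神创造天地"
print(lookup_key(verse, 3))          # no selection
print(lookup_key(verse, 3, (3, 5)))  # two characters selected
```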

Greetings,
   Christian
