> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Philippe Verdy
> Sent: Monday, May 10, 2004 9:09 AM
> From: "Michael Everson" <[EMAIL PROTECTED]>
> > Japanese is different; the users all use both scripts all the time.
>
> And there are occurrences in Japanese of Katakana suffixes or particles
> added to Latin or Han words, notably to people's names and trademarks...
> I've seen many texts where Han and Katakana are mixed in the same "word"
> (where it would be inappropriate to insert a word-break between runs of
> Han and Katakana particles.)
I believe you mean hiragana, not katakana, and kanji, not Han. Katakana are used for transliteration and are not typically joined to kanji, whereas hiragana are ubiquitously joined to kanji, since Japanese particles do not ordinarily have a kanji representation. I have not seen katakana joined to kanji (or romaji), and I suspect that this does not occur.
> My first implementation allowed line-breaks after each Han character,
> but an exception was made after users requested not to do that after
> Han and before Katakana (even though a line break is allowed between
> two Han characters), or after Latin and Katakana. So a simple approach
> that allows linebreaks between distinct scripts is deceptive. Am I
> wrong, or are my users wrong and want it as a presentation preference?
I believe, but am not certain, that it is correct not to break between kanji and following hiragana, whereas you can break between kanji and katakana.
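To make the two candidate rules concrete, here is a rough Python sketch of the kind of implementation described in the quoted text: a break is allowed after every Han character and at any change of script, unless the (left, right) script pair is listed as an exception. Everything in it is an assumption made for illustration: the names script_of, break_opportunities, USER_REQUESTED and KANJI_HIRAGANA are hypothetical, the script guess is a crude test on Unicode character names, and reading "after Latin and Katakana" as "after Latin and before Katakana" is my interpretation. It is not the implementation under discussion and it is not the Unicode line breaking algorithm.

```python
import unicodedata


def script_of(ch):
    """Crude script guess from the Unicode character name (illustration only)."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "han"
    for script in ("HIRAGANA", "KATAKANA", "LATIN"):
        if name.startswith(script):
            return script.lower()
    return "other"


# Exceptions the users asked for, as I read the quoted text (assumption).
USER_REQUESTED = frozenset({("han", "katakana"), ("latin", "katakana")})

# The alternative rule suggested in the reply: keep kanji with following hiragana.
KANJI_HIRAGANA = frozenset({("han", "hiragana")})


def break_opportunities(text, no_break=frozenset()):
    """Return each index i where a line break is allowed before text[i]."""
    breaks = []
    for i in range(1, len(text)):
        left = script_of(text[i - 1])
        right = script_of(text[i])
        if (left, right) in no_break:
            continue  # this script pair is configured as nonbreaking
        # Base rule: break after any Han character or at any script change.
        if left == "han" or left != right:
            breaks.append(i)
    return breaks
```

With no exceptions, break_opportunities("東京タワー") gives [1, 2]; with no_break=USER_REQUESTED it gives [1], so the katakana run stays attached to the kanji before it. Likewise break_opportunities("食べる") gives [1] by default and [] with no_break=KANJI_HIRAGANA. Keeping the exceptions in a table of script pairs means the choice between the two rules is data rather than code, which seems to fit if this turns out to be a presentation preference.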
But all this leads me to finally ask: what does "script" mean? It seems clear to me that although the term has been used throughout the Phoenician debate, not everyone is using it the same way. I know that there is a definition of "script" that is used for encoding purposes, but can I find it written anywhere, or is it more of an ephemeral thing?
Thanks,
/|/|ike

