Re: Eastern language keyboard events

2000-01-24 Thread teunis

On Sun, 23 Jan 2000, Steve Cheng wrote:

 On Sat, Jan 22, 2000 at 04:05:06PM -0800, Jon M. Taylor wrote:
 
  Nice, eh?  The unfortunate truth is that this crazy input system
  is pretty much required, due to the highly contextualized nature of the
  Japanese language.  The Kanji for 'Zen' (for example) can have over 20
  completely different meanings when used in different grammatical contexts.  
  Unless you keep track of the running context, it is impossible to
  accurately translate subsequently entered Kanji syllables into Kanji
  words.  This more or less requires a full Japanese grammar engine be
  embedded into the input protocol itself |-/.
 
 There are free programs which do this kana-to-kanji conversion (Wnn,
 Canna, etc.).
 
  As far as Chinese and Korean are concerned, I really don't know
 that much.  I think that the Chinese and Japanese kanji are mostly the
 same, but China does not have a phonemic written alphabet.  And written
 Vietnamese is all phonetic (the Roman alphabet with a bunch of phonetic
 modifications to the basic Roman characters).
 
 Chinese input methods are even more difficult.  There is a phonetic
 syllabary ("bopomofo"), but it is never used in normal writing, only in
 learning, and only for the Mandarin dialect, I think.  (I speak Cantonese
 and read Chinese, but still don't know the syllabary.)
 
 Various Chinese input methods use the strokes/radical of the kanji... I
 don't know how to use them myself and have not seen any documentation on
 it, so I can't tell you much.  Using a tablet and (CJK) writing
 recognition software for input is quite popular here, but unfortunately
 the products are all for Windows :(

Now I -have- seen how Chinese input methods are typically done...  at
least as far as IBM is concerned with their North American distrib.  (I
stopped by and visited a large IBM store and checked it out there a ways
back :)

Anyways, you have about 5 keys (IIRC) mapped to different strokes.  A key
is used to bring up the entry window; you tap strokes, and a menu gives
choices of valid characters matching that stroke sequence.  I like it
better than the various Japanese systems I've seen - although you need to
know how many strokes each character has and what order they come in.
(this is how I learned to write the Chinese characters so... :)

Anyways, once you've selected the character (glyph) you want, it's
inserted wherever you happen to be.

Now this -was- under OS/2 so Linux may handle it differently.  IIRC there
were no extra keys either so I'm not sure how they triggered the character
entry.  Though there could have been a key for it.
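The stroke-entry scheme described above - a handful of stroke keys, an accumulator, and a menu of matching characters - can be sketched as follows. The five stroke classes follow the common "wubihua" grouping, and the tiny candidate table is purely illustrative; real IMEs ship full per-character stroke databases:

```python
# Five stroke classes: 1 = horizontal, 2 = vertical, 3 = left-falling,
# 4 = dot/right-falling, 5 = turn/hook.
# Toy stroke table - a real system has thousands of entries.
STROKES = {
    '一': [1],
    '二': [1, 1],
    '十': [1, 2],
    '人': [3, 4],
    '八': [3, 4],
}

def candidates(entered):
    """Characters whose stroke sequence starts with what was typed."""
    n = len(entered)
    return [ch for ch, s in STROKES.items() if s[:n] == entered]

# Typing the two strokes of 十 (horizontal, then vertical):
assert candidates([1]) == ['一', '二', '十']   # menu narrows as you type
assert candidates([1, 2]) == ['十']
# An ambiguous sequence still leaves a menu to pick from:
assert candidates([3, 4]) == ['人', '八']
```

Once a candidate is picked from the menu, the composed character is inserted at the cursor, exactly as described above.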

G'day, eh? :)
- Teunis



Re: Eastern language keyboard events

2000-01-23 Thread Andreas Beck

 ok. Do you think that this accumulation is something gii should handle ?
 I mean, it would certainly be clean to just expose the unicode once it is
 completed. Then the client would only see one single key press even though
 there were quite a couple.

Though I agree it would be the cleanest solution, the menu stuff would
require heavy interaction with the main application to keep chaos from
breaking out on screen.

 since this seems indeed to be very complex, might I suggest that we handle
 this inside berlin ? This way, we could define ourselves helper applets which
 present contextual help for composing text. This would be extremely
 difficult on a lower level.

Yes - the main point is the display of completion menus and stuff. This is
very hard to handle on lower levels, especially if you allow multithreading.
We'd basically have to lock areas for the menus, which essentially can't be
done without application help, and it would also couple at least LibGII
to LibGGI, which isn't a good thing either.

As (according to the other post here) there seems to be a Japanese input
mapper available, it would probably be best if LibGII just reported the
unicode values from the keyboard and left the rest to upper layers such as
Berlin, which would use said mapper internally.

CU, ANdy

-- 
= Andreas Beck|  Email :  [EMAIL PROTECTED] =



Re: Eastern language keyboard events

2000-01-23 Thread Steve Cheng

On Sat, Jan 22, 2000 at 04:05:06PM -0800, Jon M. Taylor wrote:

   Nice, eh?  The unfortunate truth is that this crazy input system
 is pretty much required, due to the highly contextualized nature of the
 Japanese language.  The Kanji for 'Zen' (for example) can have over 20
 completely different meanings when used in different grammatical contexts.  
 Unless you keep track of the running context, it is impossible to
 accurately translate subsequently entered Kanji syllables into Kanji
 words.  This more or less requires a full Japanese grammar engine be
 embedded into the input protocol itself |-/.

There are free programs which do this kana-to-kanji conversion (Wnn,
Canna, etc.).

   As far as Chinese and Korean are concerned, I really don't know
 that much.  I think that the Chinese and Japanese kanji are mostly the
 same, but China does not have a phonemic written alphabet.  And written
 Vietnamese is all phonetic (the Roman alphabet with a bunch of phonetic
 modifications to the basic Roman characters).

Chinese input methods are even more difficult.  There is a phonetic
syllabary ("bopomofo"), but it is never used in normal writing, only in
learning, and only for the Mandarin dialect, I think.  (I speak Cantonese
and read Chinese, but still don't know the syllabary.)

Various Chinese input methods use the strokes/radical of the kanji... I
don't know how to use them myself and have not seen any documentation on
it, so I can't tell you much.  Using a tablet and (CJK) writing
recognition software for input is quite popular here, but unfortunately
the products are all for Windows :(

-- 
Steve Cheng

email: [EMAIL PROTECTED]
www: http://shell.ipoline.com/~elmert/



Re: Eastern language keyboard events

2000-01-22 Thread Stefan Seefeld

"Jon M. Taylor" wrote:

 * Method 0: Romaji.  Standard Roman character input.
 
 * Method 1: Key-per-Hiragana-character.  Hiragana is a native Japanese
 phonetic alphabet, with each key/modifier mapped in a similar manner to
 roman alphabets.  Fairly easy to handle.
 
 * Method 2: Key-per-Katakana-character.  Another(!) native Japanese
 phonetic alphabet, used with foreign words.  Just needs another
 key/modifier lookup table.  Also fairly easy to handle.
 
 * Method 3: Inline contextualized phoneme-set-to-Kanji-grammar based
 lookup (!!!).  This is the hard one.  This is what Andy was referring to
 below, and it is _weird_.  Essentially it works as follows:
 
 1: You use the Hiragana or Katakana(?) alphabet to enter a string of
 phonemes which comprise a _spoken_ Kanji syllable.  Each syllabic Kanji
 can be composed of 1, 2, 3 or sometimes 4 phonemes.  You then need a way
 to tell the software side that you are finished entering your Kanji
 syllable, which on Japanese Windows is done by hitting the spacebar.  So:
 
 Hiragana 'Z' phoneme +
 Hiragana 'e' phoneme +
 Hiragana 'n' phoneme +
 Spacebar =
 --
 'Zen' Kanji syllabic glyph.
 
 I *think* that this stage can be handled with a simple keypress
 accumulator and table lookup.

ok. Do you think that this accumulation is something gii should handle ?
I mean, it would certainly be clean to just expose the unicode once it is
completed. Then the client would only see one single key press even though
there were quite a couple.
The alternative is that berlin handles this internally. However, I think it
would be coherent with the rest if it were already done there. Key-to-unicode
mapping is handled outside, as you told me, so I would assume that this is a
general rule.
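The keypress accumulator and table lookup Jon describes - and the "client sees one single key press" behaviour discussed here - might look roughly like this. The class and the candidate table are made-up names for illustration; a real system would consult a full kana-to-kanji dictionary (Wnn, Canna) instead:

```python
# Toy candidate table: composed phoneme string -> committed character.
# Only the first candidate is kept here; real IMEs offer a choice menu.
CANDIDATES = {
    'zen': '禅',
    'nin': '忍',
}

class Accumulator:
    """Collects raw key events; emits one character on commit."""
    def __init__(self):
        self.pending = []

    def feed(self, key):
        """Returns a committed character, or None while still composing."""
        if key == ' ':                   # spacebar = commit, as on JP Windows
            word = ''.join(self.pending)
            self.pending = []
            return CANDIDATES.get(word)
        self.pending.append(key)
        return None

acc = Accumulator()
events = [acc.feed(k) for k in 'zen ']
# Three composing keypresses yield nothing; the commit yields one char:
assert events == [None, None, None, '禅']
```

The client above only ever sees the final committed value, which is the "single key press" semantics being discussed.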

 2: Now that we have a way to turn phonemic Katakana or Hiragana into Kanji
 syllables, we next need a way to turn the Kanji syllables into higher
 level conceptually-mapped Kanji glyphs or glyph strings ("words").  As

Do you really mean Glyph ? I assume you know that the character-to-glyph
mapping is quite complex. So I would assume the syllables to be mapped to
characters, not glyphs.

[...]

 Nice, eh?  The unfortunate truth is that this crazy input system
 is pretty much required, due to the highly contextualized nature of the
 Japanese language.  The Kanji for 'Zen' (for example) can have over 20
 completely different meanings when used in different grammatical contexts.
 Unless you keep track of the running context, it is impossible to
 accurately translate subsequently entered Kanji syllables into Kanji
 words.  This more or less requires a full Japanese grammar engine be
 embedded into the input protocol itself |-/.

since this seems indeed to be very complex, might I suggest that we handle this
inside berlin ? This way, we could define ourselves helper applets which present
contextual help for composing text. This would be extremely difficult on a lower
level.

 Luckily, although this is all quite complex, I do not think it
 impossible.  One or more LibGII translation modules will need to sit in
 the input stream and perform the various translation steps, while also
 sending events back and forth to the higher-level LibGGI code which
 handles the display updating, highlighting, autocompletion, etc.  And I
 would be surprised if there were not already some open source
 Japanese grammar engine code out there, given how much has been done
 already WRT Japanese locale support.

ok. This would be a different approach. For us, however, this would require
a tighter coupling between GII and berlin. berlin would essentially need to
install callbacks into gii so that the desired functionality would be achieved
by cooperation. I think handling this exclusively in berlin is by far the easiest.

 My knowledge is secondhand, but Mitch (my roommate) knows all of
 this quite well, so I can quickly find out whatever I do not already know.
 Let me know if I can be of further help to anyone here.

Thanks.

Stefan

___  
  
Stefan Seefeld
Departement de Physique
Universite de Montreal
email: [EMAIL PROTECTED]

___

  ...ich hab' noch einen Koffer in Berlin...



Re: Eastern language keyboard events

2000-01-22 Thread Dan Hollis

On Sat, 22 Jan 2000, Jon M. Taylor wrote:
   Luckily, although this is all quite complex, I do not think it
 impossible.  One or more LibGII translation modules will need to sit in
 the input stream and perform the various translation steps, while also
 sending events back and forth to the higher-level LibGGI code which
 handles the display updating, highlighting, autocompletion, etc.  And I
 would be surprised if there were not already some open source
 Japanese grammar engine code out there, given how much has been done
 already WRT Japanese locale support.

You don't need to do anything. Just write a frontend that speaks to Canna.

Canna is the dictionary/grammar parser. kinput2 is merely an X11 front end
to it. Communication to Canna is done over unix domain sockets.
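The transport Dan describes - a frontend exchanging requests with the IME server over a unix domain socket - has roughly this shape. Canna's actual wire protocol and socket path are not reproduced here; a stand-in echo server and a placeholder request fill in for them:

```python
import os
import socket
import tempfile
import threading

path = os.path.join(tempfile.mkdtemp(), 'ime.sock')
ready = threading.Event()

def serve():
    # Stand-in for the IME server; Canna itself speaks its own protocol.
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    ready.set()                       # socket exists; client may connect
    conn, _ = srv.accept()
    conn.sendall(conn.recv(1024))     # echo the one request back
    conn.close()
    srv.close()

t = threading.Thread(target=serve)
t.start()
ready.wait()

# Client side: roughly what a Canna frontend like kinput2 does, with
# real protocol messages in place of this placeholder request.
cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
cli.connect(path)
cli.sendall(b'convert:zen')
reply = cli.recv(1024)
cli.close()
t.join()
assert reply == b'convert:zen'
```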

Linux already works quite well with Canna/kinput2+X11. Many distributions
include preconfigured RPMs for this. Debian too, I think.

If you plan to support the Japanese language, this is where you should start.

http://hikari.tlug.gr.jp/~craigoda/writings/linux-nihongo/linux-nihongo.html

The sections of particular interest are 'Japanese Encoding Methods' and
'Japanese Input'. Ignore the parts on Wnn, though. Canna and kinput2 are
what everyone uses.

If you implement anything, do it the way kinput2 does it. It's well
understood and everyone will be used to it. No one likes having to
re-learn input methods.

BTW default encoding method for Japanese on Linux is EUC.
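As a quick illustration of why EUC works well here: ASCII bytes pass through unchanged, while each kana/kanji becomes two bytes with the high bit set, so it coexists with 7-bit tools. A sketch using Python's euc_jp codec (used here purely for demonstration):

```python
# "Linux" followed by で日本語 ("...in Japanese")
text = 'Linuxで日本語'
raw = text.encode('euc_jp')

# ASCII survives as single bytes; each Japanese character takes two,
# with the high bit set so they can't be mistaken for ASCII:
assert raw[:5] == b'Linux'
assert len(raw) == 5 + 2 * 4
assert all(b >= 0x80 for b in raw[5:])
assert raw.decode('euc_jp') == text
```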

-Dan