Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-29 Thread Silvan Jegen



On July 26, 2022 3:29:15 PM GMT+03:00, a...@sdf.org wrote:
>> Silvan Jegen wrote:
>> ktrans seems to be quite different actually. According to the
>> documentation it uses the Cangjie input method
>I was really surprised when I read this and of course, this is not true. I 
>suppose you meant ctrans.

Ah, my bad. I must have confused the two.


Cheers,
Silvan

> https://git.sansfontieres.com/~romi/ktrans/tree/front/item/README.kenji


--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Ma5d0eb2bc4af1d14fe1e7e30
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-26 Thread adr
> Silvan Jegen wrote:
> ktrans seems to be quite different actually. According to the
> documentation it uses the Cangjie input method
I was really surprised when I read this and of course, this is not true. I 
suppose you meant ctrans.

https://git.sansfontieres.com/~romi/ktrans/tree/front/item/README.kenji

--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M0049de1a1058af72e04fe22c


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread LdBeth
> In <288YQ7Y33V3RF.38NPGPX4H2CHU@homearch.localdomain> 
>   "Silvan Jegen"  wrote:
SJ> andp...@foxmail.com wrote:
>> On Friday, 22 July 2022, at 2:09 PM, Silvan Jegen wrote:
>> > Ah, I didn't know that! I also don't know anyone who does office work
>> > in a place where traditional Chinese characters are used though ...
>>
>> They would use RIME, https://rime.im a free software widely
>> recognized among Chinese users who are not satisfied with default
>> Pinyin. But unfortunately that thing is written in C++ so making a
>> port is unlikely.
SJ> Funnily enough I use Rime on my Linux machine to input Simplified
SJ> Chinese. I honestly just switched a Rime input setting to something that
SJ> looks like pinyin but the suggestions seem better to me than the old
SJ> IME that I used ... I should probably invest some time in understanding
SJ> how the thing actually is supposed to be used (documentation in English
SJ> seems sparse and my Chinese sucks).

RIME became popular because most other Pinyin-based IMEs on the
market suck for traditional Chinese input: their suggestion
dictionaries were usually converted directly from the simplified Chinese
versions, but mapping simplified Chinese to traditional Chinese is
very context sensitive. A byproduct of RIME is the OpenCC
https://github.com/BYVoid/OpenCC library, which can handle all the
trivia of this kind of conversion.
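
To illustrate the context sensitivity, here is a minimal Python sketch. It is
not OpenCC's API or algorithm; the two tiny tables and the function names
(naive_s2t, phrase_s2t) are made up for this example. The simplified
character 发 corresponds to both 發 and 髮 in traditional Chinese, so a
per-character substitution picks the wrong one for "hair" (头发), while a
phrase table consulted first gets it right.

```python
# Hypothetical, hand-picked tables for illustration only; a real converter
# needs far larger dictionaries.
CHAR_TABLE = {"头": "頭", "发": "發", "展": "展"}
PHRASE_TABLE = {"头发": "頭髮", "发展": "發展"}


def naive_s2t(text):
    """Convert character by character, ignoring context."""
    return "".join(CHAR_TABLE.get(ch, ch) for ch in text)


def phrase_s2t(text, max_len=2):
    """Try the longest known phrase first, then fall back to single characters."""
    out = []
    i = 0
    while i < len(text):
        # Candidate phrase lengths from max_len down to 2.
        for n in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + n]
            if chunk in PHRASE_TABLE:
                out.append(PHRASE_TABLE[chunk])
                i += n
                break
        else:
            # No phrase matched here: convert one character on its own.
            out.append(CHAR_TABLE.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(naive_s2t("头发"), phrase_s2t("头发"), phrase_s2t("发展"))
```

The per-character version turns 头发 into 頭發, which is wrong; the
phrase-aware version gives 頭髮 and still converts 发展 to 發展.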

The SC support for RIME was contributed by the community, I think, and the
author of RIME uses Cangjie. Cangjie was not officially designed for
simplified Chinese but was extended to be able to handle that. I heard
rumors that the author refused to add a switch to prioritize
simplified Chinese characters for Cangjie in RIME, so an external
dictionary is used if users want to have that behavior.

---
LDB

--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M7654c6f7091bf0a32c7e3bca


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread Silvan Jegen
andp...@foxmail.com wrote:
> On Friday, 22 July 2022, at 2:09 PM, Silvan Jegen wrote:
> > Ah, I didn't know that! I also don't know anyone who does office work
> > in a place where traditional Chinese characters are used though ...
>
> They would use RIME, https://rime.im a free software widely
> recognized among Chinese users who are not satisfied with default
> Pinyin. But unfortunately that thing is written in C++ so making a
> port is unlikely.

Funnily enough I use Rime on my Linux machine to input Simplified
Chinese. I honestly just switched a Rime input setting to something that
looks like pinyin but the suggestions seem better to me than the old
IME that I used ... I should probably invest some time in understanding
how the thing actually is supposed to be used (documentation in English
seems sparse and my Chinese sucks).

--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M9e59e41273b1269646ab8584


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread andpuke
On Friday, 22 July 2022, at 2:09 PM, Silvan Jegen wrote:
> Ah, I didn't know that! I also don't know anyone who does office work
> in a place where traditional Chinese characters are used though ...

They would use RIME, https://rime.im, free software widely recognized among
Chinese users who are not satisfied with the default Pinyin. But unfortunately
that thing is written in C++, so making a port is unlikely.

ldb
--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mc5ba1baecec99ea1967578b2


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread Jacob Moody
On 7/22/22 12:06, Sebastian Higgins wrote:
> A few things:
> 
> 1.  Cangjie is still widely used in places that use traditional Chinese 
> characters. You would still be required to be good at it if you apply for 
> text-heavy office jobs in these places.
> 2.  Radical-based/shape-based methods were extremely popular when 
> prediction technology wasn't as good (which meant Pinyin was significantly 
> slower). It wasn't until the late 2000s to early 2010s that this situation 
> changed.
> 3.  Pinyin without prediction is slow because of what we call the 重码 (lit. 
> "overlap of encoding") problem. For Pinyin the encoding overlaps because many 
> characters may have the same Pinyin; the purpose of all shape-based methods is 
> to reduce the overlap problem and thus increase the input speed.
> 4.  ctrans uses Cangjie because (1) implementing shape-based methods is 
> much, much simpler than implementing phonetic-based methods, because most (if 
> not all) of the job is table lookup; (2) if we were to use the same UI (or lack 
> thereof) as ktrans, the overlap-of-encoding problem of Pinyin would very 
> probably drive you nuts when using it; (3) it is the input method the author 
> uses, though I do admit using Cangjie for simplified Chinese input is kinda 
> peculiar.
> 
> Source: me, a native Chinese speaker who learned Wubi (a 
> shape-based method for simplified Chinese) in primary school.

I had made a naive attempt at getting ktrans to support a form
of Chinese input. Admittedly, my interest was mostly in stress testing
my rewrite of the hashmap used in ktrans; throwing a ~100k character
dictionary at it seemed like a fun way to test it. The dictionary I
imported was one that used a Wubi-based mapping for characters, posted
by jxy to the 9front mailing list a week or so ago.

If anyone is curious the dictionary itself can be found here:
https://raw.githubusercontent.com/fcitx/fcitx-table-data/master/wbx.txt
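
For anyone who wants to poke at such a table outside ktrans, here is a rough
Python sketch of loading it into a hash map. It assumes that each data line
is a lowercase code and a word separated by whitespace, and that anything
else (headers, section markers) can be skipped; that layout, the load_table
name and the SAMPLE lines are assumptions for this example, not taken from
the file or from ktrans.

```python
from collections import defaultdict


def load_table(lines):
    """Map each input code to the list of words it can produce."""
    table = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # not a "code word" pair: header, comment or blank line
        code, word = fields
        if not (code.isascii() and code.isalpha() and code.islower()):
            continue  # assumed: codes are plain lowercase ASCII letters
        table[code].append(word)
    return table


# Hypothetical sample in the assumed layout, not copied from wbx.txt.
SAMPLE = """;fcitx Version 0x03 Table file
KeyCode=abcdefghijklmnopqrstuvwxy
[Data]
a 工
aaaa 工
q 我
"""

table = load_table(SAMPLE.splitlines())
print(len(table), table["a"], table["q"])
```

A code maps to a list rather than a single word because several words can
share one code, and a lookup then has to offer all of them.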

This has been super interesting to me from a learning perspective.

Thanks for the insight!
Jacob Moody

--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mcf3888dbfc4013192d8c471e


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread andpuke
On Wednesday, 20 July 2022, at 11:15 PM, cigar562hfsp952fans wrote:
> I've often wondered that.  What input methods do Chinese speakers use?
> What do Chinese keyboards look like?  How do they find/select the
> character they want?  Are different sets of characters available on
> different computers, or are input methods standardized?  I wonder.

Most Chinese speakers just use standard "British and American keyboards". There 
are keycaps engraved with Wubi or Cangjie or Bopomofo (or Zhuyin), but they are 
all compatible with QWERTY.

On Thursday, 21 July 2022, at 1:58 AM, sirjofri wrote:
> I was more referring to computers built without any American influence at 
> all, so no ANSI, no ASCII, no LTR, probably different keycodes...

Cangjie was the first solution to Chinese processing with *personal computers* 
(in the Apple ][ era it was sold as extension boards).
There used to be other encoding methods, such as ones using only the numpad 
(Four-Corner Method) or special keyboards (the Ming Kwai typewriter); an input 
method for Chinese was even invented in the US, 
https://patents.google.com/patent/US2412777A, 
but they have almost disappeared.

There are a few other considerations with regard to adopting Cangjie besides those in 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mf1934dc65975e0ca3989d488/ctrans-chinese-language-input-for-plan9:

1. Cangjie is copyright free and related IMEs are distributed as free software, 
while Wubi (at least its newer versions) is patented.
2. Personally, I have noticed that the order of strokes has changed over the last 
10 years or so, and similarly the pronunciation of certain characters has also 
changed over time.

Best wishes
---
ldb
--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M63fcb9504cafbca55334


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread Silvan Jegen
Heyhey!

Sebastian Higgins  wrote:
> A few things:
> 
> 1.  Cangjie is still widely used in places that use traditional
> Chinese characters. You would still be required to be good at it if
> you apply for text-heavy office jobs in these places.

Ah, I didn't know that! I also don't know anyone who does office work
in a place where traditional Chinese characters are used though ...


> 2.  Radical-based/shape-based methods were extremely popular when
> prediction technology wasn't as good (which meant Pinyin was
> significantly slower). It wasn't until the late 2000s to early 2010s
> that this situation changed.

At least in Japan I have never met anyone using a
radical-based/shape-based input method. I have not even met anyone using
direct kana input, only input through romaji. That said, maybe an earlier
generation used it more commonly ...


> 3.  Pinyin without prediction is slow because of what we call the
> 重码 (lit. "overlap of encoding") problem. For Pinyin the encoding
> overlaps because many characters may have the same Pinyin; the purpose
> of all shape-based methods is to reduce the overlap problem and thus
> increase the input speed.

Yeah, it's due to the high homophone count. Only the tones differ and
these are not supported in Pinyin input methods (as far as I know ...)


> 4. ctrans uses Cangjie because (1) implementing shape-based methods
> is much, much simpler than implementing phonetic-based methods, because
> most (if not all) of the job is table lookup; (2) if we were to use the
> same UI (or lack thereof) as ktrans, the overlap-of-encoding problem
> of Pinyin would very probably drive you nuts when using it; (3) it is
> the input method the author uses, though I do admit using Cangjie for
> simplified Chinese input is kinda peculiar.
> 
> Source: me, a native Chinese speaker who learned Wubi
> (a shape-based method for simplified Chinese) in primary school.

Thanks for the insights. I appreciate it!


Cheers,
Silvan

--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M977f609261cd764b55ad5dbf


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread Sebastian Higgins
A few things:

1.  Cangjie is still widely used in places that use traditional Chinese 
characters. You would still be required to be good at it if you apply for 
text-heavy office jobs in these places.
2.  Radical-based/shape-based methods were extremely popular when 
prediction technology wasn't as good (which meant Pinyin was significantly 
slower). It wasn't until the late 2000s to early 2010s that this situation 
changed.
3.  Pinyin without prediction is slow because of what we call the 重码 (lit. 
"overlap of encoding") problem. For Pinyin the encoding overlaps because many 
characters may have the same Pinyin; the purpose of all shape-based methods is 
to reduce the overlap problem and thus increase the input speed.
4.  ctrans uses Cangjie because (1) implementing shape-based methods is much, 
much simpler than implementing phonetic-based methods, because most (if not 
all) of the job is table lookup; (2) if we were to use the same UI (or lack 
thereof) as ktrans, the overlap-of-encoding problem of Pinyin would very 
probably drive you nuts when using it; (3) it is the input method the author 
uses, though I do admit using Cangjie for simplified Chinese input is kinda 
peculiar.

Source: me, a native Chinese speaker who learned Wubi (a shape-based 
method for simplified Chinese) in primary school.
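
The overlap problem and the "it's mostly table lookup" point can be shown
with a small Python sketch. The two tables are hand-picked samples, and the
names PINYIN, SHAPE, candidates and worst_overlap are invented for this
example; this is not how ctrans or ktrans is implemented. Real shape-based
tables also contain some overlapping codes, just far fewer.

```python
# Toneless Pinyin: one code, many characters (sample entries only).
PINYIN = {
    "shi": ["是", "十", "时", "事", "市"],
    "ming": ["明", "名", "命"],
}

# A few Cangjie codes: each one here names exactly one character.
SHAPE = {
    "A": ["日"],
    "B": ["月"],
    "AB": ["明"],
    "DD": ["林"],
}


def candidates(table, code):
    """Input by table lookup: return every character stored under a code."""
    return table.get(code, [])


def worst_overlap(table):
    """Largest number of characters sharing a single code."""
    return max(len(chars) for chars in table.values())


print(candidates(PINYIN, "ming"), candidates(SHAPE, "AB"))
print(worst_overlap(PINYIN), worst_overlap(SHAPE))
```

With the phonetic table the user still has to pick 明 out of several
candidates after typing "ming"; with the shape table "AB" already selects it,
so no candidate list (and no prediction) is needed.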


--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mf1934dc65975e0ca3989d488


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread adr
Yep, Cangjie is one of those shape-based input methods I was talking about, 
more appropriate for the traditional Chinese characters used in Taiwan, Hong Kong, 
etc. South Korea still uses kanji similar to traditional Chinese, but I don't 
know what input method they use. Note that in mainland China people use Pinyin 
because they imposed the use of simplified Chinese characters.  It surprises me 
to hear that ktrans uses Cangjie: Japanese keyboards let you input kana 
directly, and the use of kana to write without kanji is common, especially in 
books for kids, so it seems more natural to me to make a kana->kanji conversion 
(or romaji->kana->kanji on Western keyboards). But I'm not Japanese, maybe 
Cangjie is faster, I've never tried.
--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M9f10d9140a5f0838d615958f


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-22 Thread Silvan Jegen
a...@sdf.org wrote:
> > I stumbled onto an instructive video on youtube not that long ago. I'm
> > sure there are a few you'll be able to search for. If I understand
> > correctly, it's a combination of entering the phoneme by the nearest
> > Latin letter, then select from a diminishing range of suitable options
> > on the screen.
> 
> There are other input methods based on the shape of the
> characters. Some are better with traditional Chinese characters,
> others with simplified characters, it's complicated... Let's see if some
> Chinese comrade shares with us his daily life experience. Japanese
> is input by writing kana directly with a Japanese keyboard or by romaji
> with roman characters on western keyboards (ka -> か) and then
> transformed to kanji when necessary. There are different IMEs, but the
> principle is the same. I suppose that ktrans is similar, I haven't
> tried yet.

ktrans seems to be quite different actually. According to the
documentation it uses the Cangjie input method [0], which is based on
so-called "radicals". These are the more basic elements that Chinese
characters are made of (note that the "radicals" chosen for Cangjie are
not identical to the 214 radicals that are commonly used to classify
Chinese characters. For the latter see [1]).

Every one of these 24 Cangjie radicals gets mapped to an ASCII character
and their combinations then uniquely identify a Chinese character (the
wiki page at [0] illustrates the approach very well).
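
As a rough illustration of that mapping, here is a small Python sketch. The
letter-to-radical assignment follows the usual Cangjie key layout as I
understand it (the X and Z keys, which have special roles, are left out); the
CHARACTERS table holds only a handful of entries, and the function names are
invented for this example.

```python
# The 24 Cangjie radicals and the ASCII key each one is typed with.
RADICALS = dict(zip("ABCDEFGHIJKLMNOPQRSTUVWY",
                    "日月金木水火土竹戈十大中一弓人心手口尸廿山女田卜"))

# Tiny sample of code -> character entries; a real table has tens of thousands.
CHARACTERS = {"A": "日", "B": "月", "AB": "明", "D": "木", "DD": "林"}


def radicals_for(code):
    """Spell out which radicals a typed key sequence stands for."""
    return "".join(RADICALS[key] for key in code.upper())


def lookup(code):
    """Return the character for a code, or None if the code is unknown."""
    return CHARACTERS.get(code.upper())


print(radicals_for("ab"), lookup("ab"))
print(radicals_for("dd"), lookup("dd"))
```

Typing "ab" names the radicals 日 and 月, which together form 明; typing "dd"
names 木 twice, giving 林.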

This input method seems to be old and I have never seen a Chinese person
use it. From what I understand, most Chinese people nowadays just write
text in Pinyin (a Latin transliteration of the Chinese pronunciation)
and then the IME helps you choose the correct combination of Chinese
characters (potentially taking the context of the text already written
into account).


Cheers,

Silvan

[0] https://en.wikipedia.org/wiki/Cangjie_input_method
[1] https://en.wikipedia.org/wiki/Kangxi_radical

--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M9589c3997fe9cf5b52b599d5


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-21 Thread adr
> I stumbled onto an instructive video on youtube not that long ago. I'm
> sure there are a few you'll be able to search for. If I understand
> correctly, it's a combination of entering the phoneme by the nearest
> Latin letter, then select from a diminishing range of suitable options
> on the screen.

There are other input methods based on the shape of the characters. Some are 
better with traditional Chinese characters, others with simplified characters, 
it's complicated... Let's see if some Chinese comrade shares with us his daily 
life experience. Japanese is input by writing kana directly with a Japanese 
keyboard or by romaji with roman characters on western keyboards (ka -> か) 
and then transformed to kanji when necessary. There are different IMEs, but the 
principle is the same. I suppose that ktrans is similar, I haven't tried yet.
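
The romaji -> kana step can be sketched in a few lines of Python. This is
only an illustration with a deliberately tiny kana table and a made-up
function name (romaji_to_kana); it is not how ktrans or any particular IME
does it, and the later kana -> kanji step, which needs a dictionary, is not
shown.

```python
# A small subset of the romaji -> hiragana table, enough for the examples.
KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "na": "な", "ni": "に", "nu": "ぬ", "ne": "ね", "no": "の",
    "ho": "ほ", "ji": "じ", "shi": "し", "n": "ん",
}


def romaji_to_kana(text):
    """Greedy conversion: always take the longest romaji chunk that matches."""
    out = []
    i = 0
    while i < len(text):
        for n in (3, 2, 1):
            chunk = text[i:i + n]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += len(chunk)
                break
        else:
            # Unknown input is passed through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(romaji_to_kana("kana"), romaji_to_kana("kanji"), romaji_to_kana("nihon"))
```

Preferring the longest match is what makes "na" in "kana" come out as な
rather than ん followed by あ; a lone "n" only becomes ん when no longer
chunk fits.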

adr
--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M428fc6fd31a9ffdb29d773bc


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-21 Thread adr
> I know that the russian tech was very
> isolated compared to modern technology.

The most interesting for me are the Setun ternary computers designed by Nikolay 
Brusentsov in the late '50s, running a Forth-like system. They did a lot of 
research and came to the conclusion that Forth was _the_ language. They saw 
Forth as a discovery by Chuck Moore, not an invention (to give him more credit, 
no less). The binary computers that became popular (M-3, Ural, etc.) were slowly 
replaced by clones of Western computers (PDP-11, Intel, VAX, etc.). The operating 
systems were mostly clones too. The computers of the '80s and '90s in schools 
and homes were clones of PC, Apple, Z80. The Spectrum clones were very popular. 
Asian computer technology was imported from the Western or Soviet worlds, so 
they had to add devices or methods to enter their own characters (look for some 
crazy keyboard built in Taiwan). The early input methods (from the '70s?) were 
pretty much like the ones we use today. As far as I know, there wasn't any 
Asian computer created without Western or Soviet influence.

adr
--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mfe79a57631b9a0b4b7b839e8


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-21 Thread Lucio De Re
On 7/21/22, cigar562hfsp952f...@icebubble.org wrote:
> sirjofri  writes:
>
>> I'm pretty sure that pure Chinese computers would look different.
>
> I've often wondered that.  What input methods do Chinese speakers use?
> What do Chinese keyboards look like?  How do they find/select the
> character they want?  Are different sets of characters available on
> different computers, or are input methods standardized?  I wonder.
>
I stumbled onto an instructive video on youtube not that long ago. I'm
sure there are a few you'll be able to search for. If I understand
correctly, it's a combination of entering the phoneme by the nearest
Latin letter, then selecting from a diminishing range of suitable options
on the screen.

The video focused more specifically on how this need, to which the
Chinese, Japanese and Koreans reacted somewhat differently, caused
the Chinese to make great strides in computing.

Lucio.



--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M0df0a84a156b182c700ca96c


Re: [9fans] Re: ctrans - Chinese language input for Plan9

2022-07-21 Thread sirjofri



21.07.2022 04:44:53 cigar562hfsp952f...@icebubble.org:
> sirjofri  writes:
>
>> I'm pretty sure that pure Chinese computers would look different.
>
> I've often wondered that.  What input methods do Chinese speakers use?
> What do Chinese keyboards look like?  How do they find/select the
> character they want?  Are different sets of characters available on
> different computers, or are input methods standardized?  I wonder.

I was more referring to computers built without any American influence at 
all, so no ANSI, no ASCII, no LTR, probably different keycodes...

I can't give you an answer as I'm not from an Asian culture (although I 
studied it a little) and it's hard to answer anyway since I'm heavily 
influenced by American computers. I'd really need a few years studying 
those cultures heavily to be able to describe a possible tendency.

One could look, though, at early Russian (and maybe even Chinese, if 
there is any) space technology. I know that the Russian tech was very 
isolated compared to modern technology.


sirjofri

--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M817c5719a75708c69b3cfd05


[9fans] Re: ctrans - Chinese language input for Plan9

2022-07-20 Thread cigar562hfsp952fans
sirjofri  writes:

> I'm pretty sure that pure Chinese computers would look different.

I've often wondered that.  What input methods do Chinese speakers use?
What do Chinese keyboards look like?  How do they find/select the
character they want?  Are different sets of characters available on
different computers, or are input methods standardized?  I wonder.

--
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mfd7cc77a83bcefbc998c371e