Here are 14 ideas on language, information and intelligence. Two of them 
(OEE and FLPM) are newly added. An HTML version is at 
http://www.bytecool.com/ideas.html .

Foreign Language Learning

    * Automatic Code-Switching (ACS) - The computer automatically selects a few 
words in a user's native language communication (such as a web page being 
viewed), and supplements or even replaces them with their foreign language 
counterparts, thus naturally building up his vocabulary. For example, if a 
sentence
              他是一个好学生。
      (Chinese for "He is a good student.") appears in a Chinese person's Web 
browser, the computer can insert student after 学生 (optionally with additional 
information such as student's pronunciation):
              他是一个好学生 (student)。
      After several such exposures, the computer can directly replace 
future occurrences of 学生 with student:
              他是一个好 student。
      Ambiguous words such as the 看 (Chinese for "see", "look", "watch", 
"read", etc.) in
              他在电视前看书。
      (Chinese for "He is reading a book in front of the TV.") can also be 
handled automatically by listing all translations possible in the context:
              他在电视前看 (阅读: read; 观看: watch) 书。
      Practice is also possible:
              他在电视前 [read? watch?] 书。
      Because the computer would only teach and/or practice foreign language 
elements at a small number of positions in the native language article the user 
is viewing, the user wouldn't find it too intrusive. Automatic code-switching 
can also teach grammatical knowledge in similar ways.
    * Progressive Word Acquisition (PWA) - In ACS, long words are optionally 
split into small segments (usually two syllables long) and taught 
progressively, and even practiced progressively. For example, when
              科罗拉多州
      (Chinese for "Colorado") first appears in a Chinese person's Web browser, 
the computer inserts Colo' after it (optionally with Colo's pronunciation):
              科罗拉多州 (Colo')
      When 科罗拉多州 appears for the second time, the computer may decide to test 
the user's memory of Colo', so it replaces 科罗拉多州 with
              Colo' (US state)
      Note that a hint such as "US state" is necessary in order to 
differentiate this Colo' from other words beginning with Colo. For the third 
occurrence of 科罗拉多州, the computer teaches the full form, Colorado, by inserting 
it after the Chinese occurrence:
              科罗拉多州 (Colorado)
      On the fourth occurrence, the computer may replace 科罗拉多州 entirely with
              Colorado
      Not only can the foreign language element (Colorado) emerge gradually; 
the original native language element (科罗拉多州) can also gradually fade out, 
either visually or semantically (e.g. 科罗拉多州 -> 美国某州 "a certain US state" -> 
地名 "place name" -> ∅). This keeps the learner from suddenly losing the Chinese 
clue, while also engaging him in active recall of the occurrence's complete 
meaning (科罗拉多州) with gradually reduced clues.
    * Subword Familiarization (SWF) - Again in ACS, word roots (e.g. pro-, 
scrib-) and meaningless word fragments (e.g. -ot) are optionally treated as two 
special kinds of standalone words and taught and practiced in the user's 
incoming native language information. Meaningless fragments are considered 
abbreviations and acronyms derived from real, meaningful words. Getting the 
learner familiar with all these subword units can facilitate the acquisition of 
longer, real words that contain them.
    * Phonetics-Enhanced English (PEE) - The computer can add non-intrusive 
diacritical marks (e.g. the mark in á) above normal English words to better 
reflect their pronunciations. Unlike radical spelling reform proposals, a 
word's original literal form is always preserved. Unlike IPA transcriptions 
placed above the words, diacritical marks are closely integrated with the 
letters, so a learner can "read once and learn both the literal and the 
phonetic form." When typing English, the learner still uses only the original 
literal form.
    * Orthography-Enhanced English (OEE) - Sometimes spelling a word based on 
its pronunciation can be hard, even for native speakers. For example, is it 
"Lawrence" or "Lawrance"? Is it "porridge" or "porrige"? We can slightly change 
a word's visual form to help recall its correct spelling. For example, when the 
computer displays a word that has the "-ance" suffix (e.g. "instance"), it can 
lower the letter "a" a little, much as Intel's "intel" trademark has a 
lowered "e". Such a visual form can help people recall that the unclear 
letter in "inst*nce" is "a", because "a" is always lowered in "-ance". Similarly, 
we can let the computer display "porridge" in a new form by adding an arc 
(Unicode U+035C) below "dg" to indicate this sound corresponds to two letters 
instead of one.
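
The teach-then-replace schedule described under ACS above can be sketched 
roughly as follows. This is only a minimal illustration, not a definitive 
implementation: the names CodeSwitcher, GLOSSARY and TEACH_EXPOSURES are 
hypothetical, the glossary is a one-entry toy, and the threshold of three 
teaching exposures is an assumed value. The same per-word counter could 
presumably drive the PWA stages as well.

```python
from collections import defaultdict

# Hypothetical toy glossary: native word -> foreign counterpart.
GLOSSARY = {"学生": "student"}

# Assumed threshold: annotate the first few exposures, then replace outright.
TEACH_EXPOSURES = 3


class CodeSwitcher:
    def __init__(self, glossary, teach_exposures=TEACH_EXPOSURES):
        self.glossary = glossary
        self.teach_exposures = teach_exposures
        self.seen = defaultdict(int)  # per-user exposure count for each word

    def render(self, sentence):
        """Annotate or replace the first occurrence of each glossary word."""
        for native, foreign in self.glossary.items():
            if native not in sentence:
                continue
            self.seen[native] += 1
            if self.seen[native] <= self.teach_exposures:
                switched = f"{native} ({foreign})"  # teaching phase
            else:
                switched = f" {foreign}"  # replacement phase, spaced as in the example
            sentence = sentence.replace(native, switched, 1)
        return sentence
```

Calling render() repeatedly on 他是一个好学生。 yields the annotated form for the 
first TEACH_EXPOSURES calls and the replaced form afterwards.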

Computer-Assisted Foreign Language Writing

    * Input-Driven Syntax Aid (IDSA) - As a non-native English user inputs a 
word, e.g. search, the computer prompts the word's sentence-building syntax 
patterns, e.g.
              v. search:  n. searcher search~ [n. search scope] [for n. search target]
      so he can now write a syntactically valid sentence like "I'm searching 
the room for the cat."
    * Input-Driven Ontology Aid (IDOA) - As a non-native English user inputs a 
word, e.g. badminton, things (entities) and relations that normally co-exist 
with the word in the same scenario or domain are prompted as a systematic 
ontology graph by the computer, e.g. entities like racquet, shuttlecock and 
playing court, relations like alternate, serve and strike, and even 
full-scripted composition templates like template: a badminton game. The 
benefits of the ontology aid are twofold. First, the ontology helps the user 
verify that the "seed word", badminton, is a valid concept in the intended 
scenario (or context); second, the ontology pre-emptively exposes other valid 
words in this context to the user, preventing him from using a wrong word, e.g. 
bat (instead of racquet), from the very beginning.
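
A rough sketch of how the two input-driven aids above might look up their 
prompts. The tables are hand-written toys holding only the examples from the 
text, and the names SYNTAX_FRAMES, ONTOLOGY and prompt_for are hypothetical; a 
real aid would presumably draw on a full lexicon and ontology.

```python
# Hypothetical hand-written entries, taken from the examples in the text.
SYNTAX_FRAMES = {
    "search": "v. search:  n. searcher search~ [n. search scope] [for n. search target]",
}

ONTOLOGY = {
    "badminton": {
        "entities": ["racquet", "shuttlecock", "playing court"],
        "relations": ["alternate", "serve", "strike"],
        "templates": ["a badminton game"],
    },
}


def prompt_for(word):
    """Return the hints to display once the user has typed `word`."""
    hints = []
    key = word.lower()
    if key in SYNTAX_FRAMES:
        hints.append(SYNTAX_FRAMES[key])  # IDSA: syntax pattern for the word
    for kind, items in ONTOLOGY.get(key, {}).items():
        hints.append(f"{kind}: {', '.join(items)}")  # IDOA: co-occurring concepts
    return hints
```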

Foreign Language Reading without Learning that Language

    * Full-Automatic Layered-Quality Machine Translation (FALQ-MT) - Lexical 
and syntactic ambiguities are translated into fuzzy concepts and structures 
instead of precise but error-prone results: less information is better than 
misinformation. If the reader can't guess the meaning of a fuzzy occurrence 
from its context and feels that the occurrence is important, he can "zoom in" 
to see more detailed translation possibilities.
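
One way to picture the layered output is a coarse gloss shown by default, with 
finer candidates available on "zoom in". The sketch below is only an 
assumption-laden illustration: FuzzyTranslation, LEXICON and translate are 
hypothetical names, the lexicon is a toy, and the cover term "look at" for 看 is 
my own assumed gloss, not something a real system is known to produce.

```python
from dataclasses import dataclass, field


@dataclass
class FuzzyTranslation:
    gloss: str                                   # coarse rendering shown by default
    details: list = field(default_factory=list)  # finer candidates shown on zoom-in

    def zoom_in(self):
        """Return the detailed candidates, or the gloss itself if unambiguous."""
        return self.details or [self.gloss]


# Hypothetical toy lexicon. The coarse gloss "look at" is an assumed cover term.
LEXICON = {
    "看": FuzzyTranslation("look at", ["see", "look", "watch", "read"]),
    "书": FuzzyTranslation("book"),
}


def translate(tokens):
    """Render each token by its default gloss; unknown tokens pass through marked."""
    return [LEXICON[t].gloss if t in LEXICON else f"<{t}>" for t in tokens]
```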

Foreign Language Writing without Learning that Language

    * Formal Language Writing and Machine Translation (FLW) - A person not 
knowing a target language can generate information in that language by 
composing in a formal language based on his native vocabulary and having the 
composition machine-translated. Tools such as the input-driven syntax aid and 
input-driven ontology aid can be borrowed to assist the person in formal 
language writing. Manual word sense disambiguation (WSD) can be conducted after 
the composition is finished, domain by domain, because it is 
cognitively easier for the writer to focus on a single domain at a time and 
answer a series of questions of the form "Does <word_i> belong to this domain?" Another 
approach to manual WSD is to borrow the idea in Machine Translation with 
Natural Disambiguation.
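
The domain-by-domain questioning above might be organized along these lines. 
This is a sketch under stated assumptions: the sense inventory SENSES and the 
function disambiguate are hypothetical, and the callback belongs_to_domain 
merely stands in for the writer's yes/no answers.

```python
# Hypothetical sense inventory: word -> {domain: sense label}.
SENSES = {
    "bat": {"sports": "bat (club)", "zoology": "bat (animal)"},
    "court": {"sports": "court (playing area)", "law": "court (tribunal)"},
}


def disambiguate(words, belongs_to_domain):
    """Ask 'Does <word> belong to this domain?' one domain at a time.

    `belongs_to_domain(word, domain)` stands in for the writer's yes/no answer.
    """
    chosen = {}
    domains = sorted({d for w in words for d in SENSES.get(w, {})})
    for domain in domains:  # the writer focuses on a single domain at a time
        for word in words:
            if word in chosen or domain not in SENSES.get(word, {}):
                continue  # already resolved, or no sense in this domain
            if belongs_to_domain(word, domain):
                chosen[word] = SENSES[word][domain]
    return chosen
```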

Ontology-Based Resource Sharing

    * Wikipedia-Based Resource Sharing (WP-RES) - A useful property of 
Wikipedia is that each Wikipedia article or category can serve as a unique 
address, or "coordinates", for the topic it corresponds to. With this property, 
we can enable people with the same interest to rendezvous at the same Wikipedia 
page and therefore talk with each other. People could also register resources 
at a Wikipedia page's External Links section so that other people with the same 
interest can find them. People could even "subscribe" to a Wikipedia page for 
new and updated resources and opportunities on that topic.
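
The register-and-subscribe idea above amounts to a small publish/subscribe 
registry keyed by article title. A minimal sketch follows; TopicRegistry is a 
hypothetical class for illustration and does not correspond to any existing 
Wikipedia feature or API.

```python
from collections import defaultdict


class TopicRegistry:
    """Resources and subscribers keyed by Wikipedia article title (hypothetical)."""

    def __init__(self):
        self.resources = defaultdict(list)
        self.subscribers = defaultdict(list)

    def subscribe(self, title, callback):
        """Ask to be notified of new resources registered under `title`."""
        self.subscribers[title].append(callback)

    def register(self, title, resource):
        """Attach a resource to a topic and notify everyone subscribed to it."""
        self.resources[title].append(resource)
        for notify in self.subscribers[title]:
            notify(title, resource)
```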

Ontology-Based Problem-Solving Skills Sharing

    * Wikipedia: From Knowledgebase to Strategybase (STRABASE) - If we're 
solving a problem, say, a math problem, we choose a seemingly promising 
strategy from the "strategy base" in our minds, according to the problem's 
main type and characteristic conditions. Such a "strategy base" is something we 
can build up externally using a wiki. A "strategy" is a special kind of 
knowledge that caters to certain problem characteristics and provides certain 
problem-solving frameworks. The wiki can store and categorize strategies and 
domain knowledge by their intended problem types and characteristics, so the 
human can better evaluate, select and apply strategies relevant to his problem.
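
Selecting strategies by problem type and characteristics, as described above, 
could be sketched like this. The entries in STRATEGIES are illustrative 
examples of my own, and Strategy and candidates are hypothetical names; the 
scoring rule (count of matching characteristics) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    problem_type: str
    characteristics: frozenset


# Hypothetical entries for illustration only.
STRATEGIES = [
    Strategy("mathematical induction", "proof", frozenset({"statement over natural numbers"})),
    Strategy("proof by contradiction", "proof", frozenset({"negative claim"})),
    Strategy("substitution", "integration", frozenset({"composite function"})),
]


def candidates(problem_type, characteristics):
    """Rank strategies of the right type by how many characteristics they match."""
    traits = set(characteristics)
    scored = [
        (len(s.characteristics & traits), s.name)
        for s in STRATEGIES
        if s.problem_type == problem_type
    ]
    # Highest score first; strategies matching nothing are dropped.
    return [name for score, name in sorted(scored, key=lambda p: -p[0]) if score > 0]
```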

Miscellaneous

    * Chinese Pinyin Input Method Revisited (PYIME) - Today's Chinese pinyin 
input methods inherit the single-row candidate window from the DOS era. If we 
categorize candidate characters into multiple rows according to some criteria, 
the user can more easily home in on his desired character. For example, each 
row contains characters that have the same phonetic radical, and one row reads 
"马 吗 妈 码 玛", while another row reads "麻 嘛 䗫". Rows can also correspond to the 
five possible tones in Mandarin (four tones plus the neutral tone), since most 
mainland Chinese users don't type tones. In either scheme, there can be a 
special first row for the most frequently used words and characters.
    * A Politically Correct New Name for English (ARCS) - As technology like 
automatic code-switching would make English a much cheaper commodity for 
non-native people to acquire, for the first time it will become possible for 
most people in the world to use decent English. But nationalist sentiments can 
discourage some people from adopting English. While everybody recognizes 
logically that all natural languages are made of equally random syllables, 
emotionally people can still feel it is unequal that 
one language is more international than others. A reason for this paradox is 
that languages are named by their nations of origin: English, French, Spanish, 
etc. Therefore, we can use a "renaming" technique to better reflect a 
language's random nature rather than nationalist connotation. Actually, the 
word "language" itself already has a strong nationalist connotation, and I 
propose the term "code system" to eliminate that connotation. As for English, 
let's rename it as "A Random Code System", or ARCS for short.
    * Foreign Language Proficiency Measurement (FLPM) - How does a non-native 
speaker describe his language level to a native speaker in an understandable 
manner? The computer can test his proficiency and compare it with that of 
native speakers of different ages. A description like "My English is at the 
level of a 10-year-old American child" should be readily understood by a 
native speaker.
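
Returning to the multi-row candidate window proposed under PYIME, the grouping 
by phonetic radical might be sketched as below. The lookup table 
PHONETIC_COMPONENT is a hypothetical hand-written toy covering only the 
characters from the example, and candidate_rows is an assumed name; a real 
input method would need a full character decomposition database.

```python
# Hypothetical hand-written table: candidate character -> its phonetic component.
PHONETIC_COMPONENT = {
    "马": "马", "吗": "马", "妈": "马", "码": "马", "玛": "马",
    "麻": "麻", "嘛": "麻", "䗫": "麻",
}


def candidate_rows(candidates, frequent=()):
    """Lay out candidates as rows: frequent ones first, then one row per component."""
    rows = [list(frequent)] if frequent else []
    groups = {}
    for ch in candidates:
        if ch in frequent:
            continue  # already shown in the special first row
        # Characters missing from the table get a row of their own.
        groups.setdefault(PHONETIC_COMPONENT.get(ch, ch), []).append(ch)
    rows.extend(groups.values())
    return rows
```

Passing frequent=("吗", "嘛") puts those two characters in the special first row 
and leaves the remaining candidates grouped by component below it.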


_______________________________________________
Mt-list mailing list
