--- On Thu, 9/4/08, Pei Wang <[EMAIL PROTECTED]> wrote:

> I guess you still see NARS as using model-theoretic
> semantics, so you
> call it "symbolic" and contrast it with system
> with sensors. This is
> not correct --- see
> http://nars.wang.googlepages.com/wang.semantics.pdf and
> http://nars.wang.googlepages.com/wang.AI_Misconceptions.pdf

I mean NARS is symbolic in the sense that you write statements in Narsese like 
"raven -> bird <0.97, 0.92>" (probability=0.97, confidence=0.92). I realize 
that the meanings of "raven" and "bird" are determined by their relations to 
other symbols in the knowledge base and that the probability and confidence 
change with experience. But in practice you are still going to write statements 
like this because it is the easiest way to build the knowledge base. You aren't 
going to specify the brightness of millions of pixels in a vision system in 
Narsese, and there is no mechanism I am aware of to collect this knowledge from 
a natural language text corpus. There is no mechanism to add new symbols to the 
knowledge base through experience. You have to explicitly add them.

> You have made this point on "CPU power" several
> times, and I'm still
> not convinced that the bottleneck of AI is hardware
> capacity. Also,
> there is no reason to believe an AGI must be designed in a
> "biologically plausible" way.

Natural language has evolved to be learnable on a massively parallel network of 
slow computing elements. This should be apparent when we compare successful 
language models with unsuccessful ones. Artificial language processors usually 
consist of tokenization, parsing, and semantic analysis phases. This pipeline 
does not work on natural language because artificial languages have precise 
specifications and natural languages do not. No two humans use exactly the same 
language, nor does the same human at two points in time. Rather, language is 
learnable by example, so that each message causes the language of the receiver 
to be a little more like that of the sender.

Children learn semantics before syntax, the opposite of the order in which you 
would build an artificial language interpreter. An example of a successful 
language model is a search engine. We know that most of the meaning of a text 
document depends only on the words it contains, ignoring word order. A search 
engine matches the semantics of the query with the semantics of a document 
mostly by matching words, but also by matching semantically related words like 
"water" to "wet".

Here is an example of a computationally intensive but biologically plausible 
language model. A semantic model is a word-word matrix A such that A_ij is the 
degree to which words i and j are related, which you can think of as the 
probability of finding i and j together in a sliding window over a huge text 
corpus. However, semantic relatedness is a fuzzy equivalence relation, meaning 
it is reflexive, symmetric, and transitive. If i is related to j and j to k, 
then i is related to k. Deriving transitive relations in A, also known as 
latent semantic analysis, is performed by singular value decomposition, 
factoring A = USV where S is diagonal, then discarding the small terms of S, 
which has the effect of lossy compression. Typically, A has about 10^6 elements 
and we keep only a few hundred elements of S. Fortunately there is a parallel 
algorithm that incrementally updates the matrices as the system learns: a 
3-layer neural network where S is the hidden layer (which can grow) and U and 
V are weight matrices [1].
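The batch version of this compression step can be sketched as follows (a toy sketch with invented co-occurrence counts, not a real corpus, using a batch SVD in place of the incremental Hebbian algorithm of [1]):

```python
import numpy as np

# Hypothetical symmetric co-occurrence matrix: A[i][j] = degree to which
# words i and j are found together in a sliding window. Here "water" and
# "rain" each co-occur with "wet" but never directly with each other.
words = ["water", "wet", "rain", "raven", "bird"]
A = np.array([
    [9.0, 4.0, 0.0, 0.0, 0.0],
    [4.0, 6.0, 5.0, 0.0, 0.0],
    [0.0, 5.0, 7.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 5.0, 4.0],
    [0.0, 0.0, 0.0, 4.0, 6.0],
])

# Latent semantic analysis: factor A = U S V^T, then discard all but the
# k largest singular values -- a lossy compression of the matrix.
U, S, Vt = np.linalg.svd(A)
k = 2
A_k = (U[:, :k] * S[:k]) @ Vt[:k, :]

# The rank-k reconstruction infers the transitive relation: "water" and
# "rain" become related through their shared neighbor "wet", even though
# their original entry in A was zero.
i, j = words.index("water"), words.index("rain")
print(A[i, j], round(A_k[i, j], 2))
```

In the incremental version of [1], U and V are instead learned as the weight matrices of the network while text streams in, so the full matrix A never has to be stored.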

Traditional language processing has failed because the task of converting 
natural language statements like "ravens are birds" to formal language is 
itself an AI problem. It requires humans who have already learned what ravens 
are and how to form and recognize grammatically correct sentences so they 
understand all of the hundreds of ways to express the same statement. You need 
a human-level understanding of the logic to realize that "ravens are coming" 
doesn't mean "ravens -> coming". If you solve the translation problem, then you 
must have already solved the natural language problem. You can't take a 
shortcut directly to the knowledge base, tempting as it might be. You have to 
learn the language first, going through all the childhood stages. I would have 
hoped we had learned a lesson from Cyc.

1. Gorrell, Genevieve (2006), "Generalized Hebbian Algorithm for Incremental 
Singular Value Decomposition in Natural Language Processing", Proceedings of 
EACL 2006, Trento, Italy.
http://www.aclweb.org/anthology-new/E/E06/E06-1013.pdf

-- Matt Mahoney, [EMAIL PROTECTED]



