--- On Fri, 9/5/08, Pei Wang <[EMAIL PROTECTED]> wrote:

> Like too many existing AI works, my disagreement with you is not that much
> on the solution you proposed (I can see the value), but on the problem you
> specified as the goal of AI. For example, I have no doubt about the
> theoretical and practical values of compression, but don't think it has
> much to do with intelligence.

In http://cs.fit.edu/~mmahoney/compression/rationale.html I explain why text 
compression is an AI problem. To summarize: if you know the probability 
distribution of text, then you can compute P(A|Q) for any question Q and answer 
A, which is what you need to pass the Turing test. Compression lets you 
precisely measure the accuracy of your estimate of P. Compression (actually, 
word perplexity) has been used since the early 1990s to measure the quality of 
language models for speech recognition, since it correlates well with word 
error rate.
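To make the link between modeling and compression concrete (this sketch is mine, not from the post above): an ideal arithmetic coder needs -log2 P(x) bits to encode text x, so a model's cross-entropy on a text is exactly its ideal compressed size, and perplexity is just 2 raised to the bits per symbol. A minimal illustration with an add-one-smoothed character bigram model:

```python
import math
from collections import Counter, defaultdict

def train_bigram(text):
    """Count character bigrams to estimate P(next | prev)."""
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def cross_entropy_bits(counts, text, alphabet):
    """Total bits an ideal arithmetic coder would need under the model:
    sum of -log2 P(next | prev), with add-one (Laplace) smoothing."""
    total = 0.0
    V = len(alphabet)
    for a, b in zip(text, text[1:]):
        c = counts[a]
        p = (c[b] + 1) / (sum(c.values()) + V)
        total += -math.log2(p)
    return total

train = "the cat sat on the mat. the cat ate the rat."
alphabet = sorted(set(train))
model = train_bigram(train)

bits = cross_entropy_bits(model, train, alphabet)
n = len(train) - 1  # number of predicted characters
print(f"{bits:.1f} bits total, {bits/n:.2f} bits/char, "
      f"perplexity {2**(bits/n):.2f}")
```

A better model assigns higher probability to the text, which shows up directly as fewer bits, i.e. better compression.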

The purpose of this work is not to solve general intelligence, such as the 
universal intelligence proposed by Legg and Hutter [1]. That is not computable, 
so you have to make a somewhat arbitrary choice of test environments, i.e. of 
which problems you are going to solve. I believe the goal of AGI should be to 
do useful work for humans, so I am making a not-so-arbitrary choice: solve a 
problem that is central to what most people regard as useful intelligence.

I had hoped that my work would lead to an elegant theory of AI, but that hasn't 
been the case. Rather, the best compression programs were developed as a series 
of thousands of hacks and tweaks, e.g. change a 4 to a 5 because it gives 
0.002% better compression on the benchmark. The result is an opaque mess. I 
guess I should have seen it coming, since it is predicted by information theory 
(e.g. [2]).

Nevertheless the architectures of the best text compressors are consistent with 
cognitive development models, i.e. phoneme (or letter) sequences -> lexical -> 
semantics -> syntax, which are themselves consistent with layered neural 
architectures. I already described a neural semantic model in my last post. I 
also did work supporting Hutchens and Alder showing that lexical models can be 
learned from n-gram statistics, consistent with the observation that babies 
learn the rules for segmenting continuous speech before they learn any words 
[3].
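The idea behind [3], that word boundaries can be recovered from n-gram statistics alone, can be sketched as follows (a toy illustration of mine, not the actual method in the paper): boundaries tend to fall where the branching entropy of the next character is high, i.e. where the text becomes hard to predict. With an invented corpus and threshold:

```python
import math
from collections import Counter, defaultdict

def successor_entropy(text, k=2):
    """For each k-character context, entropy of the next-character distribution."""
    counts = defaultdict(Counter)
    for i in range(len(text) - k):
        counts[text[i:i + k]][text[i + k]] += 1
    ent = {}
    for ctx, c in counts.items():
        total = sum(c.values())
        ent[ctx] = -sum((n / total) * math.log2(n / total) for n in c.values())
    return ent

def segment(text, k=2, threshold=1.0):
    """Insert a boundary after position i when the branching entropy of the
    preceding k-gram exceeds the threshold -- high uncertainty about what
    comes next is a cue for a word boundary."""
    ent = successor_entropy(text, k)
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        ctx = text[max(0, i - k + 1):i + 1]
        if len(ctx) == k and ent.get(ctx, 0) > threshold:
            out.append(" ")
    return "".join(out)

corpus = "thecatsatonthematthecatatetherat" * 3
result = segment(corpus)
print(result)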

I agree, and it should also be clear that semantics is learned before grammar, 
contrary to the order in which artificial languages are processed. Grammar 
requires semantics, but not the other way around: search engines work using 
semantics alone. Yet we cannot parse sentences like "I ate pizza with Bob", "I 
ate pizza with pepperoni", or "I ate pizza with chopsticks" without semantics.
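The pizza examples can be made concrete: deciding whether the "with"-phrase attaches to the verb or the object is a semantic decision, resolvable from co-occurrence statistics rather than grammar rules. A toy sketch (the counts below are invented for illustration, standing in for corpus statistics):

```python
# Invented co-occurrence counts standing in for corpus statistics.
cooc = {
    ("pizza", "pepperoni"): 50, ("ate", "pepperoni"): 2,
    ("pizza", "chopsticks"): 1, ("ate", "chopsticks"): 30,
    ("pizza", "Bob"): 3,        ("ate", "Bob"): 20,
}

def attach(verb, obj, pp_noun):
    """Attach the 'with'-phrase to whichever head co-occurs more with its noun."""
    if cooc.get((obj, pp_noun), 0) > cooc.get((verb, pp_noun), 0):
        return obj
    return verb

for noun in ("pepperoni", "chopsticks", "Bob"):
    print(f"I ate pizza with {noun}: attaches to {attach('ate', 'pizza', noun)!r}")
```

No grammar rule distinguishes the three sentences; only the statistics of what pizzas, eating, and chopsticks go with does.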

My benchmark does not prove that there aren't better language models, but it is 
strong evidence. It represents the work of about 100 researchers who have tried 
and failed to find models that are more accurate, faster, or less memory 
intensive. The resource requirements seem to increase as we go up the chain 
from n-grams to grammar, contrary to what symbolic approaches would predict. 
This is why I argue that AI is limited by a lack of hardware, not a lack of 
theory.

1. Legg, Shane, and Marcus Hutter (2006), "A Formal Measure of Machine 
Intelligence," Proc. Annual Machine Learning Conference of Belgium and The 
Netherlands (Benelearn-2006), Ghent, 2006.
http://www.vetta.org/documents/ui_benelearn.pdf

2. Legg, Shane (2006), "Is There an Elegant Universal Theory of Prediction?," 
Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle Institute for 
Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
http://www.vetta.org/documents/IDSIA-12-06-1.pdf

3. Mahoney, M. (2000), "A Note on Lexical Acquisition in Text without Spaces,"
http://cs.fit.edu/~mmahoney/dissertation/lex1.html


-- Matt Mahoney, [EMAIL PROTECTED]


