On Fri, Sep 5, 2008 at 6:15 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Fri, 9/5/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>
>> NARS indeed can learn semantics before syntax --- see
>> http://nars.wang.googlepages.com/wang.roadmap.pdf
>
> Yes, I see this corrects many of the problems with Cyc and with traditional 
> language models. I didn't see a description of a mechanism for learning new 
> terms in your other paper. Clearly this could be added, although I believe it 
> should be a statistical process.

I don't have a separate paper on term composition, so you'd have to
read my book. It is indeed a statistical process, in the sense that
most of the composed terms won't turn out to be useful, and so will be
gradually forgotten. Only the "useful patterns" will be kept for a
long time, in the form of compound terms.
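
To make the kind of bookkeeping I have in mind concrete, here is a
minimal sketch (illustrative only -- the class, function names, and
decay rule are placeholders, not the actual NARS data structures or
control mechanism):

# Illustrative sketch only -- not the actual NARS representation or
# inference rules. Compound terms gain "usefulness" whenever they help
# an inference; a periodic decay step forgets the ones that never do.

class CompoundTerm:
    def __init__(self, components):
        self.components = tuple(components)  # e.g. ("bird", "swimmer")
        self.usefulness = 0.1                # new compounds start low

memory = {}

def compose(components):
    """Create (or retrieve) a compound term built from existing terms."""
    key = tuple(components)
    return memory.setdefault(key, CompoundTerm(components))

def reward(term, amount=0.2):
    """Call when the compound turns out to be a useful pattern."""
    term.usefulness = min(1.0, term.usefulness + amount)

def decay_and_forget(rate=0.9, threshold=0.05):
    """Periodic maintenance: decay all terms, drop the useless ones."""
    for key in list(memory):
        memory[key].usefulness *= rate
        if memory[key].usefulness < threshold:
            del memory[key]                  # gradually forgotten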

> I am interested in determining the computational cost of language modeling. 
> The evidence I have so far is that it is high. I believe the algorithmic 
> complexity of a model is 10^9 bits. This is consistent with Turing's 1950 
> prediction that AI would require this much memory, with Landauer's estimate 
> of human long term memory, and is about how much language a person processes 
> by adulthood assuming an information content of 1 bit per character as 
> Shannon estimated in 1950. This is why I use a 1 GB data set in my 
> compression benchmark.
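
(To make the arithmetic behind that figure explicit: the exposure rate
and time span below are my own rough assumptions, and only the 1
bit/character figure comes from Shannon's estimate.)

# Rough back-of-the-envelope check of the ~10^9-bit figure quoted above.
chars_per_day = 150_000                    # assumed language exposure, chars/day
years = 20                                 # "by adulthood"
chars_total = chars_per_day * 365 * years  # ~1.1e9 characters
bits_per_char = 1                          # Shannon's ~1 bit/char estimate
info_bits = chars_total * bits_per_char    # ~1.1e9 bits

print(f"{chars_total:.2e} chars  ->  {info_bits:.2e} bits")
# At one byte per character, ~10^9 characters is also ~1 GB of raw
# text, which is why a 1 GB benchmark data set matches the estimate.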

I see your point, though I think analyzing this problem in terms of
computational complexity is not the right way to go, because the
process does not follow a predetermined algorithm. Instead, language
learning is an incremental process, without a well-defined beginning
and end.

> However there is a 3 way tradeoff between CPU speed, memory, and model 
> accuracy (as measured by compression ratio). I added two graphs to my 
> benchmark at http://cs.fit.edu/~mmahoney/compression/text.html (below the 
> main table) which shows this clearly. In particular the size-memory tradeoff 
> is an almost perfectly straight line (with memory on a log scale) over tests 
> of 104 compressors. These tests suggest to me that CPU and memory are indeed 
> bottlenecks to language modeling. The best models in my tests use simple 
> semantic and grammatical models, well below adult human level. The 3 top 
> programs on the memory graph map words to tokens using dictionaries that 
> group semantically and syntactically related words together, but only one 
> (paq8hp12any) uses a semantic space of more than one dimension. All have 
> large vocabularies, although not implausibly large for an educated person. 
> Other top programs like nanozipltcb and WinRK use smaller dictionaries and
>  strictly lexical models. Lesser programs model only at the n-gram level.
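
(To make the "straight line with memory on a log scale" concrete, a
fit of that form would look roughly like the sketch below; the data
points are placeholders, not the actual benchmark results from the
page above.)

# Least-squares fit of compressed size vs. log2(memory), using made-up
# placeholder points; the real numbers are in the benchmark tables.
import math

points = [(16, 290), (64, 265), (256, 240), (1024, 215), (4096, 190)]  # (memory MB, size MB)

xs = [math.log2(mem) for mem, _ in points]
ys = [size for _, size in points]

n = len(points)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(f"size ~ {slope:.1f} * log2(memory_MB) + {intercept:.1f}")
# A roughly constant negative slope is what "straight line with memory
# on a log scale" means: each doubling of memory buys a fixed gain.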

As with many existing AI works, my disagreement with you is not so
much about the solution you proposed (I can see its value) as about
the problem you specified as the goal of AI. For example, I have no
doubt about the theoretical and practical value of compression, but I
don't think it has much to do with intelligence. I don't think this
kind of issue can be handled efficiently in an email discussion like
this one. I've been thinking about writing a paper to compare my ideas
with the ideas represented by AIXI, which is closely related to yours,
though that project hasn't gotten enough priority on my to-do list.
Hopefully I'll find the time to make my position on this topic clear.

> I don't yet have an answer to my question, but I believe efficient 
> human-level NLP will require hundreds of GB or perhaps 1 TB of memory. The 
> slowest programs are already faster than real time, given that equivalent 
> learning in humans would take over a decade. I think you could use existing 
> hardware in a speed-memory tradeoff to get real time NLP, but it would not be 
> practical for doing experiments where each source code change requires 
> training the model from scratch. Model development typically requires 
> thousands of tests.

I guess we are exploring very different paths in NLP, and it is still
too early to tell which one will do better.

Pei

