--- Tom McCabe <[EMAIL PROTECTED]> wrote:

> If such neural systems can actually spit out sensible
> analyses of natural language, it would obviously be a
> huge discovery and could probably be sold to a good
> number of people as a commercial product. So why
> aren't more people investing in this, if you've
> already got working software that just needs a
> suitable supercomputer?
> 
>  - Tom

I don't have working software because I don't have a supercomputer.  Sure, I
could build scaled-down toy systems on my PC.  A lot of people did that in the
1980s.  It would not be anything new.

Google does have a supercomputer, and they are working on the problem.  Google
already does a good job of handling natural language questions -- not quite at
the Turing-test level, but pretty good.  Google's approach seems to be brute
force, because they can do it that way.  A neural model ought to recognize a
sentence as correct by recognizing grammatical and semantic patterns learned
from prior training data.  Google can recognize a sentence as correct by
counting exact matches in its huge database.
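To make the contrast concrete, here is a toy sketch (my own illustration, and
nothing to do with Google's actual system) of judging a sentence by counting
exact n-gram matches in a corpus:

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Toy "database"; the real thing would be a web-scale index.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
counts = Counter(ngrams(corpus, 3))

def score(sentence):
    """Fraction of the sentence's trigrams seen verbatim in the corpus."""
    grams = ngrams(sentence.split(), 3)
    if not grams:
        return 0.0
    return sum(1 for g in grams if counts[g] > 0) / len(grams)

print(score("the cat sat on the rug"))  # 1.0: every trigram is attested
print(score("rug the on sat cat"))      # 0.0: word salad, no matches
```

A neural model trained on the same corpus would instead have to generalize to
word sequences it has never seen verbatim.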

I think a neural language model can be built with more than 1 PC but fewer than
a million.  I don't know how many.  A lot depends on how well our neural
models approximate real neurons.  Our models use fully connected, feedforward
networks with real-valued synapses and sigmoid activation responses that vary
smoothly in time or in discrete steps.  Real networks have lots of feedback,
pulse-train activation responses, and for all we know might have binary
synapses.  The learning rule first proposed by Hebb in 1949 was used
successfully to solve many problems in the 1980s, but not language.
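For readers who haven't seen it, a minimal sketch of Hebbian learning (using
Oja's normalized variant, since the plain 1949 rule lets weights diverge; the
data distribution here is my own arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def oja_step(w, x, lr=0.01):
    """Hebbian update: strengthen synapses whose inputs fire together
    with the output.  Oja's decay term keeps the weights bounded."""
    y = w @ x                       # postsynaptic activity
    return w + lr * y * (x - y * w)

# Inputs have one dominant direction, [1, 0.3]; the Hebbian weight
# vector converges (up to sign) toward that principal component.
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(5000):
    x = np.array([1.0, 0.3]) * rng.normal() + rng.normal(size=2) * 0.1
    w = oja_step(w, x)

print(w)  # roughly proportional to [1, 0.3], with unit length
```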

Suppose that memory works as follows (we don't really know).  During the day,
all high level learning related to vision and language takes place in the
hippocampus by probabilistic switching of redundant, binary synapses.  At this
time, the cortex is effectively read-only.  When we sleep, memory is
transferred to the cortex by the physical growth of excitatory synapses (or
inhibitory or both) directed by recalling fragments of the day's events during
REM sleep (dreaming).  During deep sleep there is a reverse process in which
memories are erased from the hippocampus.
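One piece of this hypothesis is easy to sketch: an effective real-valued
weight encoded as the on-fraction of many redundant binary synapses, learned
by probabilistic switching.  The mechanism below is entirely my own toy
illustration, not a validated model of the hippocampus:

```python
import numpy as np

rng = np.random.default_rng(2)

K = 256                          # redundant binary synapses per "weight"
syn = rng.random(K) < 0.5        # start with about half switched on
target = 0.9                     # effective weight to be encoded

for _ in range(5000):
    i = rng.integers(K)                  # pick one synapse at random
    syn[i] = rng.random() < target       # switch it on with prob. target

print(round(syn.mean(), 2))      # on-fraction settles near 0.9
```

The point is only that graded quantities can be stored in, and learned by,
populations of purely binary elements.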

Two questions.  Is this model correct?  And can we get the same effect using
our usual feedforward, continuous networks with real-valued synapses (that
don't sleep), as small-scale experiments in other domains suggest?  Some
counterexamples come to mind.  I already mentioned how pulse trains are used
to transmit phase information for stereophonic sound perception.  Here is
another.  A small number of real-valued synapses can approximate a larger
number of binary synapses, but clearly there are some functions that could not
be modeled this way, such as complex Boolean functions.  Are there any such
functions important to language?  We don't know.
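Both halves of that point can be sketched in a few lines (my own toy numbers
throughout).  One real-valued weight can stand in for the average of many
redundant binary synapses, but XOR -- the simplest "complex" Boolean function
-- already escapes any single weighted-sum-and-threshold unit:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# One real-valued weight w summarizes 1000 redundant binary synapses:
# switch each on with probability w and the population mean matches w.
w = 0.73
population = rng.random(1000) < w
print(round(float(population.mean()), 2))   # close to 0.73

# The substitution only captures averaged, linear behaviour.  A grid
# search over weights and thresholds finds no single linear unit that
# computes XOR of two binary inputs (none exists over all the reals).
found = any(
    all((w1 * x1 + w2 * x2 + b > 0) == bool(x1 ^ x2)
        for x1, x2 in itertools.product([0, 1], repeat=2))
    for w1 in np.linspace(-2, 2, 41)
    for w2 in np.linspace(-2, 2, 41)
    for b in np.linspace(-2, 2, 41)
)
print(found)   # False: XOR is not linearly separable
```

Whether language depends on functions of the second kind is exactly the open
question.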


> --- Matt Mahoney <[EMAIL PROTECTED]> wrote:
> 
> > --- Tom McCabe <[EMAIL PROTECTED]> wrote:
> > > --- Matt Mahoney <[EMAIL PROTECTED]> wrote:
> > > > Personally, I would experiment with
> > > > neural language models that I can't currently
> > > > implement because I lack the
> > > > computing power.
> > > 
> > > Could you please describe these models?
> > 
> > Essentially models in which neurons (with time
> > delays) respond to increasingly
> > abstract language concepts: letters, syllables,
> > words, grammatical roles,
> > phrases, and sentence structures.  This is not
> > really new.  Models like these
> > have been proposed in the 1980s but were never
> > fully implemented due to lack
> > of computing power.  These constraints resulted in
> > connectionist systems in
> > which each concept mapped to a single neuron.  Such
> > models can't learn well. 
> > There is no mechanism for adding to the vocabulary,
> > for instance.  I believe
> > you need at least hundreds of neurons per concept,
> > where each neuron may
> > correlate weakly with hundreds of different
> > concepts.  Exactly how many, I
> > don't know.  That is why I need to experiment.
> > 
> > One problem that bothers me is the disconnect
> > between the information
> > theoretic estimates of the size of a language model,
> > about 10^9 bits, and
> > models based on neuroanatomy, perhaps 10^14 bits. 
> > Experiments might tell us
> > what's wrong with our neural models.  But how to do
> > such experiments?  A fully
> > connected network of 10^9 connections trained on
> > 10^9 bits of data would
> > require about 10^18 operations, about a year on a
> > PC.  There are optimizations
> > I could do, such as activating only a small fraction
> > of the neurons at one
> > time, but if the model fails, is it because of these
> > optimizations or because
> > you really do need 10^14 connections, or the
> > training data is bad, or
> > something else?
> > 
> > 
> > -- Matt Mahoney, [EMAIL PROTECTED]


-- Matt Mahoney, [EMAIL PROTECTED]

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=4007604&user_secret=8eb45b07
