--- "Dr. Matthias Heger" <[EMAIL PROTECTED]> wrote:

> I think it is even more complicated. The flow of signals in the brain
> does not move only from low levels to high levels.
> The modules communicate in both directions. And as far as I know
> there is already evidence for this from cognitive science.

Yes, of course, and there are signals within layers and skipping
layers, so the whole "layer" concept is not sharply defined.  It is
just an approximation to aid understanding.

> How
> Many
> Apples
> APPLIES
> Are
> On 
> The
> Tree

This is a good example of a problem that a neural language model can
solve.  The approximate model is

  phonemes -> words -> semantics -> grammar

where the phoneme sequence activates both the "apples" and "applies"
neurons at the word level.  The ambiguity is resolved by feedback from
the semantics level via the learned association (apple - tree), and
from the grammar level via the learned link (apple - NOUN) and the
grammatical pattern (how many NOUN are).

There is no need to explicitly code any of this knowledge.  It is all
learnable from a large corpus of text.  For example, by counting Google
hits we can infer the semantic relation (apple - tree):

apple = 457,000,000
apply = 343,000,000
P(apple) = 0.57

apple tree = 6,690,000
tree apple = 1,060,000 (total 7,750,000)
apply tree = 1,020,000
tree apply = 1,050,000 (total 2,070,000)
P(apple | tree) = 0.79
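The arithmetic above can be sketched in a few lines (the counts are
hard-coded from the hit numbers quoted above; this is an illustrative
calculation, not a query to any search API):

```python
# Estimate word probabilities from the quoted web hit counts.
hits = {"apple": 457_000_000, "apply": 343_000_000}

# Prior probability of "apple" versus "apply".
p_apple = hits["apple"] / (hits["apple"] + hits["apply"])

# Co-occurrence counts with "tree", both word orders summed.
near_tree = {
    "apple": 6_690_000 + 1_060_000,   # "apple tree" + "tree apple"
    "apply": 1_020_000 + 1_050_000,   # "apply tree" + "tree apply"
}

# Conditional probability of "apple" given "tree" in context.
p_apple_given_tree = near_tree["apple"] / (near_tree["apple"] + near_tree["apply"])

print(f"P(apple) = {p_apple:.2f}")               # 0.57
print(f"P(apple | tree) = {p_apple_given_tree:.2f}")  # 0.79
```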

Grammar is a little harder.  We could count:

"how many apples are" = 1050
"how many applies are" = 1

But a human-sized training corpus (about 1 GB) would be too small to
gather these statistics.  To solve this problem, we note that words can
be clustered by their immediate context, and these clusters correspond
to grammatical roles.  For example, given a pattern like "how many X
are", X is likely to be a plural noun, and there are enough such
patterns to learn that "apples" is a plural noun.
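A minimal sketch of this clustering idea, using a tiny made-up corpus
(in practice this would run over gigabytes of text):

```python
# Cluster words by their immediate (left, right) context.
from collections import defaultdict

corpus = ("how many apples are on the tree . "
          "how many cars are on the road . "
          "how many birds are in the sky . "
          "the rules still apply here . "
          "these terms apply to everyone .").split()

# Build a context vector for each word: counts of (left, right) word pairs.
contexts = defaultdict(lambda: defaultdict(int))
for i in range(1, len(corpus) - 1):
    contexts[corpus[i]][(corpus[i - 1], corpus[i + 1])] += 1

def similarity(w1, w2):
    """Number of shared (left, right) contexts between two words."""
    return sum(min(contexts[w1][c], contexts[w2][c])
               for c in contexts[w1] if c in contexts[w2])

# "apples", "cars", and "birds" all occur in "how many _ are",
# so they cluster together as plural nouns; "apply" does not.
print(similarity("apples", "cars"))   # 1 (shares the ("many", "are") context)
print(similarity("apples", "apply"))  # 0
```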

I know that latent semantic analysis (LSA) has been used to identify
clusters of semantically related words, but I don't know whether it has
been applied to discovering grammatical roles.  LSA uses singular value
decomposition to compress a word-word association matrix: it factors
the matrix and discards the small singular values from the middle
diagonal matrix, similar to a neural network with a small hidden layer.
For semantics, a nonzero element in the word-word matrix means the two
words occur near each other in running text.  For grammar, it would
mean the words appear in the same immediate context.
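The SVD compression step can be sketched with NumPy on a hypothetical
co-occurrence matrix (the words and counts below are made up for
illustration):

```python
# LSA-style compression: factor a word-word co-occurrence matrix with
# SVD and keep only the largest singular values, analogous to forcing
# the associations through a small hidden layer.
import numpy as np

words = ["apple", "apply", "tree", "rule"]
# Hypothetical co-occurrence counts, rows/columns indexed by `words`.
M = np.array([[0, 0, 8, 0],
              [0, 0, 1, 6],
              [8, 1, 0, 0],
              [0, 6, 0, 0]], dtype=float)

U, s, Vt = np.linalg.svd(M)

k = 2  # keep the k largest singular values
M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The rank-k reconstruction approximates M while discarding the detail
# carried by the small singular values.
print(np.round(M_k, 1))
```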


-- Matt Mahoney, [EMAIL PROTECTED]

-------------------------------------------
agi
Archives: http://www.listbox.com/member/archive/303/=now
RSS Feed: http://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
http://www.listbox.com/member/?member_id=8660244&id_secret=101455710-f059c4
Powered by Listbox: http://www.listbox.com
