On Tue, Feb 26, 2019 at 5:19 PM Linas Vepstas <[email protected]>
wrote:

>
> On Mon, Feb 25, 2019 at 9:04 PM Rob Freeman <[email protected]>
> wrote:
>
>> ...
>> You mean you have no knowledge of attempts at distributional learning of
>> grammar from the '90s?
>>
>
> Sure. Yes, I suppose I mean that. Is there something you can recommend?
> Should I just google David Powers and Hinrich Schuetze and start reading
> randomly, or is there something particularly juicy that I should look at?
>

I can't imagine you haven't googled already, so I can only guess you intend
to make some distinction.

Hinrich Schuetze was the first lead I found when I started searching in
1994: "Dimensions of Meaning", "Distributional Part-of-Speech Tagging".
Steven Finch was another. He googles up in a proceedings: "Grammatical
Inference: Learning Syntax from Sentences: Third International Colloquium,
ICGI-96..., Volume 3".

There is more about distributional models of meaning. I'm trying to
remember what the big sub-field was called... Latent Semantic Analysis.
Very mainstream. That carried on to Dominic Widdows and "Geometry and
Meaning" in 2004. Widdows went on to be central in organizing the
"Quantum Interaction" conference, I believe. That relates to some of the
work by Coecke et al., with formalisms borrowed from quantum theory.

But focusing on grammar... Of course I know Powers from a particular
project. I wrote him off in 1998 because he was still trying to learn
categories, and I had decided it was impossible. Googling now, I didn't
realize his work went back quite so far. It all came to nil, I guess...
This is of historical interest:

"ITK Research Memo, December 1991. SHOE: The Extraction of Hierarchical
Structure for Machine Learning of Natural Language." D. Powers & W.
Daelemans.

https://pdfs.semanticscholar.org/596a/713366155d907c8340a44fa0d80489e4491e.pdf


"Powers (1984, 1989) has already shown that word classes and associated
rules may be learned either by statistical clustering techniques or
self-organizing neural models, using untagged data, thus achieving
completely unsupervised learning"
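As a toy illustration of what that quoted claim means in practice (my own
minimal sketch, not Powers' actual method): represent each word in untagged
text by counts of its left and right neighbours, then greedily group words
whose context vectors are similar. The corpus and the 0.6 similarity
threshold here are arbitrary choices for the example.

```python
# Toy distributional clustering of word classes from untagged text.
# Words with similar left/right neighbour distributions get grouped.
from collections import defaultdict
from math import sqrt

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "a cat ran on a mat . a dog ran on a rug .").split()

# Context vector: counts of (side, neighbour) pairs for each word.
contexts = defaultdict(lambda: defaultdict(int))
for i, w in enumerate(corpus):
    if i > 0:
        contexts[w][("L", corpus[i - 1])] += 1
    if i < len(corpus) - 1:
        contexts[w][("R", corpus[i + 1])] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Greedy single-link clustering: a word joins the first existing
# cluster containing a sufficiently similar member, else starts its own.
clusters = []
for w in contexts:
    for c in clusters:
        if any(cosine(contexts[w], contexts[m]) > 0.6 for m in c):
            c.append(w)
            break
    else:
        clusters.append([w])
```

On this little corpus the groups that fall out are roughly word classes:
{the, a}, {cat, dog}, {sat, ran}, {mat, rug}, with "on" and "." left as
singletons. Nothing was tagged; the classes emerge from distribution alone,
which is the point of the quoted passage.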

Browsing the bibliography at random, this is quite interesting. Another
chaos-and-language lead to follow up:

Nicolis, John S. (1991b) "Chaotic Dynamics of Linguistic Processes at the
Syntactical, Semantic and Pragmatic Levels: Zipf's Law, Pattern Recognition
and Decision Making under Uncertainty and Conflict", Proceedings of
QUALICO '91, University of Trier, September 23-27, 1991.

In the late '90s everything was about finding the right set of "features".
I think it came to a head with ever more complex statistical models: first
Hidden Markov Models, then probabilistic context-free grammars.

I wasn't attending closely, because I already believed clear symbolic
categories were impossible.

Of more interest to me was a separate thread of analogy-based grammar, a
form of distributed representation, which took off separately, because of
course no one was finding nice clean symbolic categories: Daelemans'
memory-based learning, Skousen's "Analogical Modeling of Language". Rens
Bod took off on another tangent of that, statistical combination of
subtrees (Data-Oriented Parsing), which seemed to get a lot of funding for
a while.

Oh, Menno van Zaanen had something he called "Alignment-Based Learning",
which he traced broadly back to Zellig Harris, Chomsky's teacher. I had
quite a lot to do with him.

You can see the symbolic learning work terminating with Bengio, who
retreated from learned symbolic categories to vector representations
around 2003 with his neural probabilistic language model, which has been
the most successful line.

It's fun to reminisce.

There's a bunch of other threads to it too.

But you must be making some narrower distinction, one that makes your
attempt to learn rules unique?

-Rob

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Ta6fce6a7b640886a-Ma31c62b3691456d066298ff0
Delivery options: https://agi.topicbox.com/groups/agi/subscription