On Tue, Feb 26, 2019 at 5:19 PM Linas Vepstas <[email protected]> wrote:
> On Mon, Feb 25, 2019 at 9:04 PM Rob Freeman <[email protected]> wrote:
>> ...
>> You mean you have no knowledge of attempts at distributional learning of
>> grammar from the '90s?
>
> Sure. Yes, I suppose I mean that. Is there something you can recommend?
> Should I just google David Powers and Hinrich Schuetze and start reading
> randomly, or is there something particularly juicy that I should look at?

I can't imagine you haven't googled already, so I can only guess you intend to make some distinction.

Hinrich Schuetze was the first lead I found when I started searching in 1994: "Dimensions of Meaning" and "Distributional Part-of-Speech Tagging". Steven Finch was another. He turns up in a proceedings volume: "Grammatical Inference: Learning Syntax from Sentences: Third International Colloquium, ICGI-96..., Volume 3".

There is more on distributional models of meaning. I'm trying to remember what the big sub-field was called... Latent Semantic Analysis. Very mainstream. That carried on to Dominic Widdows and "Geometry and Meaning" in 2004. Widdows went on to be central in organizing the "Quantum Interaction" conference, I believe. That relates to some of the work by Coecke et al., with quantum field theory formalisms.

But focusing on grammar... Of course I know Powers from a particular project. I wrote him off in 1998 because he was still trying to learn categories, and I had decided that was impossible. Googling now, I didn't realize his work went back quite so far. It all came to nil, I guess... This is of historical interest:

"ITK Research Memo, December 1991. SHOE: The Extraction of Hierarchical Structure for Machine Learning of Natural Language." D. Powers & W. Daelemans.
https://pdfs.semanticscholar.org/596a/713366155d907c8340a44fa0d80489e4491e.pdf

"Powers (1984, 1989) has already shown that word classes and associated rules may be learned either by statistical clustering techniques or self-organizing neural models, using untagged data, thus achieving completely unsupervised learning."

Browsing the bibliography at random, this is quite interesting. Another chaos-and-language lead to follow up:

Nicolis, John S. (1991b) "Chaotic Dynamics of Linguistic Processes at the Syntactical, Semantic and Pragmatic Levels: Zipf's Law, Pattern Recognition and Decision Making under Uncertainty and Conflict", Proceedings of QUALICO '91, University of Trier, September 23-27, 1991.

In the late '90s everything was about finding the right set of "features". I think it came to a head with ever more complex statistical models, first Hidden Markov Models, then probabilistic context-free grammars. I wasn't following closely, because I already believed clean symbolic categories were impossible.

Of more interest to me was a separate thread of analogy-based grammar, a form of distributed representation, which took off separately, because of course no one was finding nice clean symbolic categories: Daelemans's memory-based learning, Skousen's "Analogical Modeling of Language". Rens Bod took off on another tangent of that, the statistical combination of tree fragments (Data-Oriented Parsing), which seemed to get a lot of funding for a while. Oh, and Menno van Zaanen had something he called "Alignment-Based Learning", which he traced broadly back to Zellig Harris, Chomsky's teacher. I had quite a lot to do with him.

You can see the symbolic learning work terminating with Bengio, who retreated from learned symbolic categories to vector representations around 2003 with his Neural Probabilistic Language Model, which has been the most successful.
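
Incidentally, the distributional idea in that Powers quote is easy enough to make concrete. Below is a minimal sketch, my own toy illustration rather than anyone's actual method from the papers above: build left/right context vectors for each word from untagged text, then greedily merge the most distributionally similar words into classes. The corpus and similarity threshold are arbitrary choices for the example.

# Toy sketch of unsupervised word-class learning by distributional
# clustering. Hypothetical illustration only; corpus and threshold
# are made up, not taken from Powers or Schuetze.
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "a cat ran to the dog . "
    "a dog ran to the cat ."
).split()

# Build left/right co-occurrence vectors for each word, untagged data only.
contexts = defaultdict(Counter)
for i, w in enumerate(corpus):
    if i > 0:
        contexts[w]["L:" + corpus[i - 1]] += 1
    if i < len(corpus) - 1:
        contexts[w]["R:" + corpus[i + 1]] += 1

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def cluster_vector(cluster):
    # Summed context vector for a set of words.
    v = Counter()
    for w in cluster:
        v.update(contexts[w])
    return v

# Greedy agglomerative clustering: repeatedly merge the most
# distributionally similar pair until similarity drops below threshold.
clusters = [[w] for w in contexts]
THRESHOLD = 0.4  # arbitrary cut-off for this toy example

while True:
    best, pair = THRESHOLD, None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            s = cosine(cluster_vector(clusters[i]), cluster_vector(clusters[j]))
            if s > best:
                best, pair = s, (i, j)
    if pair is None:
        break
    i, j = pair
    clusters[i].extend(clusters.pop(j))

for c in clusters:
    print(sorted(c))
# On this corpus {cat, dog} and {the, a} merge: word classes emerge
# from distribution alone, with no tags or supervision.

Schuetze's "Distributional Part-of-Speech Tagging" does essentially this at scale, if I remember right, compressing the context vectors with SVD before clustering.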
It's fun to reminisce. There's a bunch of other threads to it too. But you must be making some narrow distinction, which makes your attempt to learn rules unique??

-Rob
