On Sat, Oct 30, 2021 at 06:17, Linas Vepstas <[email protected]> wrote:
>
> Hi!
>
> The slide deck that I presented is available at
>
> https://github.com/opencog/learn/blob/master/learn-lang-diary/recognizing-patterns.pdf
>
> and a transcript of what I was going to say is at
>
> https://github.com/opencog/learn/blob/master/learn-lang-diary/recognizing-patterns-notes
Very interesting. What are those acronyms?

- MI = Mutual Information?
- MST parses = Maximum Spanning Tree? According to Wikipedia, "in the mathematical field of graph theory, a spanning tree T of an undirected graph G is a subgraph that is a tree which includes all of the vertices of G." The maximum spanning tree is then the spanning tree whose total edge weight is maximal. It looks somewhat like a space-filling curve, except that it is structured.
- GUE?

I was wondering why the algorithm only takes adjacent word pairs into account, unlike Link Grammar, which draws connections across a sentence, jumping over intermediate words... Oops! Then you mention skip-grams (https://en.wikipedia.org/wiki/N-gram#Skip-gram), so my guess was wrong. This is unlike what is written in Combinatory Linguistics by Cem Bozşahin, which stresses the need to build a phrase structure grammar from adjacent words (https://en.wikipedia.org/wiki/Phrase_structure_grammar), as opposed to a dependency grammar (https://en.wikipedia.org/wiki/Dependency_grammar), or is that a categorial grammar? It is unclear to me what is what, and whether it matters.

Quoting the transcript:

> * We can learn the rules of reasoning; they are not God-given (aka
>   hard-coded by some programmer.)
> * They can be learned, and I've described an algorithm for learning
>   them.

Awesome.

To summarize the presentation: you claim that it is possible, with a machine learning algorithm, to build a grammar for natural languages in a completely *unsupervised* way, that is, without annotations, by mining existing corpus materials, thereby creating links between words that form a tree or graph. That graph is annotated with words, hence it is explainable. You also claim the algorithm may be used to infer grammars from other sources such as audio, video, etc. You also claim that it is a very simple, well-trodden path in terms of math, already used in industry.
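To check my own understanding of the MI + MST combination, here is a minimal sketch of what I imagine the pipeline looks like (my assumption, not necessarily your actual algorithm): score each word pair by pointwise mutual information estimated from co-occurrence counts, then pick the spanning tree over the sentence's words that maximizes total MI. All counts and the sentence are toy data I made up.

```python
import math
from itertools import combinations

# Toy co-occurrence counts for unordered word pairs (hypothetical data).
pair_counts = {
    ("the", "cat"): 10, ("cat", "sat"): 8, ("sat", "mat"): 6,
    ("the", "mat"): 9, ("the", "sat"): 2, ("cat", "mat"): 1,
}
word_counts = {"the": 30, "cat": 12, "sat": 10, "mat": 11}
pair_total = sum(pair_counts.values())
word_total = sum(word_counts.values())

def mi(w1, w2):
    """Pointwise mutual information of an unordered word pair."""
    n = pair_counts.get((w1, w2), pair_counts.get((w2, w1), 0))
    if n == 0:
        return float("-inf")  # never co-occurred: unusable link
    p_pair = n / pair_total
    p1 = word_counts[w1] / word_total
    p2 = word_counts[w2] / word_total
    return math.log2(p_pair / (p1 * p2))

def mst_parse(words):
    """Kruskal-style maximum spanning tree over MI-weighted word pairs."""
    edges = sorted(combinations(words, 2), key=lambda e: mi(*e), reverse=True)
    parent = {w: w for w in words}  # union-find forest

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path compression
            w = parent[w]
        return w

    tree = []
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb and mi(a, b) > float("-inf"):
            parent[ra] = rb  # merge the two components
            tree.append((a, b))
    return tree

links = mst_parse(["the", "cat", "sat", "mat"])
print(links)  # three links connecting all four words
```

If that is roughly right, then the "parse" is just the set of tree links, and skip-grams would enter as a way of counting non-adjacent pairs when estimating the MI in the first place.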
As far as I understand, you connected the dots, but there are still known unknowns, such as a normal distribution that appears out of the blue. In simpler words, you shed light (structure) onto the void (the unstructured).

Let me know if I got this correctly.

Thanks for sharing.

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAL7_Mo91ekF_fxrGUzyDRxLPW30RkdKh4u32Fq4L5SyMt-7XqA%40mail.gmail.com.
