Interesting! I will read it through tomorrow; the rest of my day today seems eaten by other stuff...
I am not surprised that PCA stinks as a classifier...

Regarding "hidden multivariate logistic regression", as you hint at the end of your document... it seems you are gradually inching toward my suggestion of using neural nets here...

My most recent suggestion to Ruiting has been to explore the following code/algorithm, which looks like a nicer way of finding word2vec-style condensed representations for multiple senses of words: https://arxiv.org/pdf/1502.07257.pdf

The only code I could find for this is in Julia, https://github.com/sbos/AdaGram.jl, but it looks not that complicated... We would need to modify that Julia code to work on the data from the MST parses, rather than on word sequences as it does now.

However, we haven't gotten to experimenting with that yet, because we are still getting stuck on weird Guile problems in trying to get the MST parsing done... we (Curtis) can get through MST-parsing maybe 800-1500 sentences before it crashes (and it doesn't crash when examined with GDB, which is frustrating...).

-- Ben

On Mon, Jun 19, 2017 at 3:30 PM, Linas Vepstas <linasveps...@gmail.com> wrote:
> Hi Ben,
>
> Here's this week's update on results from the natural language datasets. In
> short, the datasets seem to be of high quality, based on a sampling of the
> cosine similarity between words. Looks really nice.
>
> Naive PCA stinks as a classifier; I'm looking for something nicer, perhaps
> based on first principles, and a bit less ad hoc.
>
> Since you had the guts to use the words "algebraic topology" in a recent
> email, I call your bluff and raise: this report includes a brief sketch
> pointing out that every language, natural or otherwise, has an associated
> cohomology theory. The path from here to there goes by means of sheaves,
> which is semi-obvious, because every book on algebraic topology, or at
> least differential topology, explains the steps.
>
> The part that's new to me was the sudden realization that the "disjuncts"
> and "connector sets" of Link Grammar are in fact just the sheaves (germs,
> stalks) of a graph. The Link Grammar dictionary, say, for the English
> language, is a sheaf with a probability distribution on it.
>
> BTW, this clarifies why Link Grammar looks so damned modal-logic-ish. I
> noticed this long ago, and always thought it was mysterious and weird and
> interesting. Well, it turns out that, for some folks, this is old news:
> apparently, when the language is first-order logic, the sheafification of
> first-order logic gives you Kripke-Joyal semantics; this was spotted in
> 1965. So I'm guessing that this is generic: take any language, any formal
> language or a natural language, look at it from the point of view of
> sheaves, and then observe that the gluing axioms mean that modal logic
> describes how the sections glue together. I think that's pretty cool.
>
> So, can you find a grad student to work out the details? The thesis title
> would be "The Cohomology of the English Language". It would fill in all
> the details in the above paragraphs.
>
> --linas

--
Ben Goertzel, PhD
http://goertzel.org

"I am God! I am nothing, I'm play, I am freedom, I am life. I am the
boundary, I am the peak." -- Alexander Scriabin
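[Editorial note: the dataset sanity check Linas describes, sampling cosine similarities between word vectors, can be sketched roughly as below. The vectors and counts here are toy values invented for illustration; in the actual setup they would be disjunct/connector-set counts drawn from the MST-parse dataset.]

```python
import math

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy word vectors: counts of hypothetical contexts observed per word.
vectors = {
    "cat":  [4, 0, 3, 1],
    "dog":  [3, 1, 3, 0],
    "runs": [0, 5, 0, 2],
}

# Spot-check sampled pairs: grammatically similar words should score
# near 1, dissimilar words near 0.
print(cosine(vectors["cat"], vectors["dog"]))   # high (similar nouns)
print(cosine(vectors["cat"], vectors["runs"]))  # low (noun vs. verb)
```

If the dataset is of high quality, such sampled pairs should separate cleanly, which is the kind of evidence the report above appeals to.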