Hi Linas,

Nice stuff!
A quick comment regarding:

*** More precisely, the factorization of the language model appears to
require three important ingredients:

- A way of decomposing word-vectors into sums of word-sense vectors,
- A way of performing biclustering, so as to split the bipartite graph
  p(w, d) into left, central and right components, holding the left and
  right parts to be sparse,
- Using an information-theoretic similarity metric, to preserve the
  probabilistic interpretation of the contingency table p(w, d). ***

The first of these is, of course, what Adagram attempts to do ... and
Andres has experimented with a variant of Adagram that replaces standard
SkipGram with "SkipGram-in-a-parse" (to be done after a round of e.g. MST
parsing) ... But improvements on Adagram that incorporate broader context
will be valuable ...

The third of these makes total sense and is fortunately not a huge deal, as
most clustering algorithms can work with whatever similarity metric you
throw at them ...

The second of these is the most interesting to me ... basically it seems
you want to cluster (word, disjunct) pairs in a way that has high
"clustering quality" along both the word dimension and the disjunct
dimension [i.e. so that the words are divided into meaningful clusters, and
the disjuncts are divided into meaningful clusters, even though the words
and disjuncts are distinct and are stuck onto each other in various
combinations]. This is interesting and could be attempted via many possible
algorithms, including of course k-means-like iterative algorithms or
EM-like estimation algorithms ... (Or, as you note, evolutionary learning
methods.) Oleg may have some views on this ...

-- Ben

On Sat, Aug 4, 2018 at 12:05 PM, Linas Vepstas <[email protected]> wrote:

> Hi Anton,
>
> Attached please find this week's new-and-improved version of the
> neural-nets-vs-symbolic-parsing document.
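(Aside: the "cluster along both the word axis and the disjunct axis" idea
could be sketched roughly like this. A toy, purely illustrative alternating
scheme on a made-up count matrix -- the data, the initialization, and the
mean-count assignment rule are all assumptions for the demo, not the actual
algorithm under discussion:)

```python
# Toy word-by-disjunct count matrix (illustrative numbers, not real
# Link Grammar data).  Rows: dog, cat, run, jump.  Columns: four disjuncts.
counts = [
    [9, 0, 7, 1],
    [8, 1, 9, 0],
    [0, 9, 1, 8],
    [1, 7, 0, 9],
]

def bicluster(m, k=2, iters=10):
    """Alternately reassign rows and columns, k-means style, so that both
    the word axis and the disjunct axis end up clustered."""
    nrows, ncols = len(m), len(m[0])
    clab = [j * k // ncols for j in range(ncols)]  # crude initial column split
    rlab = [0] * nrows
    for _ in range(iters):
        # Assign each row to the column-cluster with the highest mean count.
        for i in range(nrows):
            def row_score(c):
                vals = [m[i][j] for j in range(ncols) if clab[j] == c]
                return sum(vals) / len(vals) if vals else 0.0
            rlab[i] = max(range(k), key=row_score)
        # Symmetrically assign each column against the row clusters.
        for j in range(ncols):
            def col_score(c):
                vals = [m[i][j] for i in range(nrows) if rlab[i] == c]
                return sum(vals) / len(vals) if vals else 0.0
            clab[j] = max(range(k), key=col_score)
    return rlab, clab

word_clusters, disjunct_clusters = bicluster(counts)
print(word_clusters, disjunct_clusters)
```

On this toy data the scheme recovers the planted structure: words {0, 1}
group with disjuncts {0, 2}, and words {2, 3} with disjuncts {1, 3}.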
> Improvements over the last version include:
>
> More words spent explaining why:
>
> - k-means clustering is identical to matrix factorization -- this is not
>   my result, it's an old result; I'm recapping it because you need to
>   understand it to understand the next step, which is bicategorization.
> - Bicategorization is the right thing to do -- because that is what the
>   link-grammar dicts already do! Bicategorization is also an old
>   algorithm, from 2003 -- but if you look at it carefully, you can see
>   that the Link Grammar dicts are ***exactly*** bicategorized contingency
>   matrices. That is to say: ordinary old-school linguists who manually
>   write dependency grammars do so in a format that is naturally the same
>   format as the output of a k-means bicategorization.
> - An information-theoretic divergence is much better than a cosine
>   distance. This is a lot more subtle, I suppose, because it requires you
>   to see a vector dot-product as something that is invariant not under
>   rotations (well, it is, but that misses the point), but rather under
>   Markov transformations, which preserve probabilities. This is because
>   all of the vectors are rows and columns of a probability distribution.
>   Thus, cosine distance is "wrong", and Kullback-Leibler divergence is
>   correct. Again -- this is an old result, from 2003, but all of the
>   people doing ordinary off-the-shelf k-means are oblivious to it,
>   because their data is not a joint probability. I try to spell this out
>   in great detail, and to provide all of the explicit formulas you need
>   to do this.
>
> This paper is still not yet done, but I think it lays out the groundwork
> much more nicely than before. I am hoping that it is not hard to read --
> again, I tried to mostly simplify everything. I hope it's not
> oversimplified.
>
> Anyway, I think it's a lot more promising, a much better direction to go
> in than triadic k-means. It's probably simpler, too.
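(A tiny illustration of the KL-vs-cosine point above, under toy
assumptions: each row of the contingency table p(w, d), once normalized,
is a conditional probability distribution p(d | w), so an
information-theoretic divergence applies to it directly. The counts and
word labels here are made up for the demo:)

```python
import math

# Toy contingency table of (word, disjunct) counts -- illustrative only.
# Normalizing a row turns it into p(d | w), a probability distribution.
counts = {
    "dog": [6, 3, 1],
    "cat": [5, 4, 1],
    "run": [1, 1, 8],
}

def row_prob(row):
    """Normalize a row of counts into a probability distribution."""
    total = sum(row)
    return [x / total for x in row]

def kl(p, q):
    """Kullback-Leibler divergence D(p || q), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cosine(a, b):
    """Cosine similarity, for comparison; it ignores the fact that the
    vectors are probability distributions."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

p_dog, p_cat, p_run = (row_prob(counts[w]) for w in ("dog", "cat", "run"))

# "dog" and "cat" have nearly the same conditional distribution, so the
# divergence between them is small; "dog" vs. "run" is large.
print("D(dog||cat) =", kl(p_dog, p_cat))
print("D(dog||run) =", kl(p_dog, p_run))
print("cos(dog,cat) =", cosine(p_dog, p_cat))
```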
>
> --linas
>
> On Sun, Jul 22, 2018 at 12:01 AM, Anton Kolonin @ Gmail
> <[email protected]> wrote:
>
>> Hi Linas, thanks, I will look into that.
>>
>> In the meantime, below, the guys are getting close with "triadic
>> K-means":
>>
>> http://aclweb.org/anthology/P18-2010
>>
>> They use "FrameNet 1.7" and the "dataset of polysemous verb classes by
>> Korhonen" for evaluation.
>>
>> If we get these, we may compare to what extent we are doing better.
>>
>> Cheers,
>> -Anton
>>
>> 20.07.2018 1:46, Linas Vepstas wrote:
>>
>> Due to the obvious confusion that the sheaves paper caused for everyone,
>> I have started work on a different way of explaining it. This one takes,
>> as its starting point, a description of the word2vec algorithms, and
>> explains how word2vec can be viewed as a sheaf. So, if you are more
>> comfortable with that viewpoint, this might be a better way of grokking
>> the concept.
>>
>> The paper is very much an early draft; I've already re-written the final
>> 2-3 pages since last night. The title is likely to change. The
>> introduction will change. But maybe the middle bits will help clarify
>> these issues.
>>
>> Here:
>> https://github.com/opencog/opencog/raw/master/opencog/nlp/learn/learn-lang-diary/skippy.pdf
>>
>> -- Linas
>>
>> ---------- Forwarded message ----------
>> From: Anton Kolonin (Google Docs)
>> <[email protected]>
>> Date: Thu, Mar 1, 2018 at 4:16 AM
>> Subject: Unsupervised Lang... - This comes from works of +linasvepsta...
>> To: [email protected] >> >> >> Anton Kolonin mentioned you in a comment on Unsupervised Language >> Learning (ULL) Design Draft >> <https://docs.google.com/document/d/14MpKLH5_5eVI39PRZuWLZHa1aUS73pJZNZzgigCWwWg/edit?disco=AAAABqvKTUk&ts=5a97d2e3&usp=comment_email_document> >> [image: Anton Kolonin] >> *Anton Kolonin* >> Section - collection of adjacent Seeds from single sentence, series of >> adjacent sentence or entire single text Sheaf - unclearly defined >> combination of Sections and Lexical Entries representing particular corpus >> >> This comes from works of [email protected] - clearer definition >> may get required and potential use should be explored further >> Open >> <https://docs.google.com/document/d/14MpKLH5_5eVI39PRZuWLZHa1aUS73pJZNZzgigCWwWg/edit?disco=AAAABqvKTUk&usp=comment_email_discussion&ts=5a97d2e3> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA 94043, USA >> <https://maps.google.com/?q=1600+Amphitheatre+Parkway,+Mountain+View,+CA+94043,+USA&entry=gmail&source=g> >> >> You have received this email because you are mentioned in this thread.Change >> what Google Docs sends you. >> <https://docs.google.com/document/u/112611842306597061311/docos/notify?id=14MpKLH5_5eVI39PRZuWLZHa1aUS73pJZNZzgigCWwWg&title=Unsupervised+Language+Learning+%28ULL%29+Design+Draft>You >> can reply to this email to reply to the discussion. 
>> >> >> >> -- >> cassette tapes - analog TV - film cameras - you >> >> >> -- >> -Anton Kolonin >> skype: akolonin >> cell: >> [email protected]https://aigents.comhttps://www.youtube.com/aigentshttps://www.facebook.com/aigentshttps://plus.google.com/+Aigentshttps://medium.com/@aigentshttps://steemit.com/@aigentshttps://golos.blog/@aigentshttps://vk.com/aigents >> >> > > > -- > cassette tapes - analog TV - film cameras - you > -- Ben Goertzel, PhD http://goertzel.org "The dewdrop world / Is the dewdrop world / And yet, and yet …" -- Kobayashi Issa -- You received this message because you are subscribed to the Google Groups "opencog" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/opencog. To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CACYTDBetkfDno126_YqtfOHmb_qAAtSDWbUQ1nTrYc8bkPe%2BuQ%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
