I don't think we want an arithmetic average of distance and MI; maybe more like

f(1) = C > 1
f(1) > f(2) > f(3) > f(4)
f(4) = f(5) = ... = 1

and then score each link by f(distance) * MI. I.e., we count the MI significantly more if the distance is small... but if MI is large and distance is large, we still count the MI a lot... (Of course the decreasing function f becomes the thing to tune here...)

On Tue, May 7, 2019 at 12:58 AM Anton Kolonin @ Gmail <[email protected]> wrote:
>
> Andres, can you upload the sequential parses that you have evaluated and
> provide them in the comments to the cells?
>
> Ben, I think the 0.67-0.72 corresponds to the naive impression that 2/3-3/4
> of word-to-word connections in English are "sequential" and the rest are
> not. For Russian and Portuguese, it would be somewhat less, I guess.
>
> What you suggest here ("used *both* the sequential parse *and* some fancier
> hierarchical parse as inputs to clustering and grammar learning? I.e. don't
> throw out the information of simple before-and-after co-occurrence, but
> augment it with information from the statistically inferred dependency parse
> tree") can (I guess) be implemented simply in the existing MST-Parser, given
> the changes that Andres and Claudia made a year ago.
>
> That could be tried with the "distance_vs_MI" blending parameter in the
> MST-Parser code, which accounts for word-to-word distance. So if
> distance_vs_MI=1.0 we would get "sequential parses", distance_vs_MI=0.0
> would produce "pure MST parses", distance_vs_MI=0.7 would provide "English
> parses", and distance_vs_MI=0.5 would provide "Russian parses". Does that
> make sense, Andres?
>
> Ben, do you want to let Andres try this -- get parses with different
> distance_vs_MI values in the range 0.0-1.0 and see what happens?
>
> This could be tried both ways, using traditional MI or DNN-MI, BTW.
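[Editor's note: for concreteness, here is a minimal Python sketch of the two scoring ideas in this thread -- the multiplicative f(distance) * MI weighting Ben proposes above, and the linear distance_vs_MI blend Anton describes. The function names, the constant C, the linear shape of f, and the 1/distance proximity term are illustrative assumptions, not the actual MST-Parser code.]

```python
def f(distance, C=2.0):
    """Hypothetical decreasing weight satisfying Ben's constraints:
    f(1) = C > 1, strictly decreasing through f(4), and flat at 1
    for all distances >= 4 (so large-distance MI still counts fully)."""
    if distance >= 4:
        return 1.0
    # Illustrative choice: linear descent from C at distance 1 to 1 at distance 4.
    return C - (C - 1.0) * (distance - 1) / 3.0


def link_score(mi, distance):
    """Ben's multiplicative scheme: boost MI for nearby word pairs,
    but never shrink it below the raw MI at large distances."""
    return f(distance) * mi


def blended_score(mi, distance, distance_vs_mi=0.7):
    """Anton's linear blend (illustrative form): distance_vs_mi=1.0 gives
    purely sequential parses (adjacency dominates), 0.0 gives pure
    MI-driven MST parses, intermediate values mix the two."""
    proximity = 1.0 / distance  # 1.0 for adjacent words, smaller when far apart
    return distance_vs_mi * proximity + (1.0 - distance_vs_mi) * mi
```

Under this sketch the tunable knobs are exactly the ones discussed above: the shape of f (and its ceiling C) for the multiplicative scheme, and the single distance_vs_MI scalar for the blend.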
> Cheers,
> -Anton
>
> 06.05.2019 12:30, Ben Goertzel:
>
> > On Sun, May 5, 2019 at 10:15 PM Anton Kolonin @ Gmail <[email protected]> wrote:
> >>
> >> Hi Linas, I am re-reading your emails and updating our TODO issues from
> >> some of them.
> >>
> >> Not sure about this one:
> >>
> >> > Did Deniz Yuret falsify his thesis data? He got better than 80%
> >> > accuracy; we should too.
> >>
> >> I don't recall Deniz Yuret comparing MST-parses to
> >> LG-English-grammar-parses.
> >
> > Linas: Where does the >80% figure come from?
> >
> > This paper of Yuret's
> >
> > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.129.5016&rep=rep1&type=pdf
> >
> > cites 53% accuracy compared against "dependency parses derived from
> > dependency-grammar-izing Penn Treebank parses on WSJ text"... It was
> > written after his PhD thesis. Is there more recent work by Yuret that
> > gives massively better results? If so, I haven't seen it.
> >
> > Spitkovsky's more recent work on unsupervised grammar induction seems to
> > have gotten better statistics than this, but it used radically different
> > methods.
> >
> >> a) Seemingly "worse than LG-English" "sequential parses" provide a
> >> seemingly better "LG grammar" -- that may be some mistake, so we will
> >> have to double-check this.
> >
> > Anton -- have you looked at the inferred grammar for this case, to see
> > how much sense it makes conceptually?
> >
> > Using sequential parses is basically just using co-occurrence rather than
> > syntactic information.
> >
> > I wonder what would happen if you used *both* the sequential parse *and*
> > some fancier hierarchical parse as inputs to clustering and grammar
> > learning? I.e. don't throw out the information of simple before-and-after
> > co-occurrence, but augment it with information from the statistically
> > inferred dependency parse tree...
> > -- Ben
>
> --
> -Anton Kolonin
> skype: akolonin
> cell: +79139250058
> [email protected]
> https://aigents.com
> https://www.youtube.com/aigents
> https://www.facebook.com/aigents
> https://medium.com/@aigents
> https://steemit.com/@aigents
> https://golos.blog/@aigents
> https://vk.com/aigents
>
> --
> You received this message because you are subscribed to the Google Groups
> "lang-learn" group.

--
Ben Goertzel, PhD
http://goertzel.org

"Listen: This world is the lunatic's sphere, / Don't always agree it's real. /
Even with my feet upon it / And the postman knowing my door / My address is
somewhere else." -- Hafiz
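[Editor's note: Ben's suggestion of feeding *both* the sequential parse and the statistically inferred dependency parse into clustering and grammar learning can be sketched as merging the two edge sets into one weighted edge multiset. This is a hedged illustration under assumed representations -- the function name, the tuple-based edge encoding, and the seq_weight parameter are all hypothetical, not existing lang-learn code.]

```python
def combined_edges(sequence, mst_edges, seq_weight=0.3):
    """Merge sequential (adjacency) links with MST parse links into one
    weighted edge dict, so clustering sees both co-occurrence and
    dependency information rather than throwing either away."""
    edges = {}
    # Sequential parse: every adjacent word pair contributes a link.
    for i in range(len(sequence) - 1):
        pair = (sequence[i], sequence[i + 1])
        edges[pair] = edges.get(pair, 0.0) + seq_weight
    # MST parse: statistically inferred dependency links get the remainder.
    for pair in mst_edges:
        edges[pair] = edges.get(pair, 0.0) + (1.0 - seq_weight)
    return edges
```

A pair linked by both parses (adjacent words that are also dependency-linked) accumulates weight from both sources, which is one way of letting the two signals reinforce each other downstream.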
