Hi, I think everyone understands that
*** The third claim, "the Linas claim", that you love to reject, is that "when ULL is given a non-lexical input, it will converge to the SAME lexical output, provided that your sampling size is large enough". ***

but it's not clear for what cases feasible-sized corpora are "large enough" ...

ben

On Sat, Jun 22, 2019 at 11:18 PM Linas Vepstas <linasveps...@gmail.com> wrote:
>
> Hi Anton,
>
> On Sat, Jun 22, 2019 at 2:32 AM Anton Kolonin @ Gmail <akolo...@gmail.com> wrote:
>>
>> CAUTION: *** the parses in the folder with the dict files are not the inputs, but the outputs - they are produced on the basis of the grammar in the same folder. I am listing the input parses below !!! ***
>
> I did not look either at your inputs or your outputs; they are irrelevant for my purposes. It is enough for me to know that you trained on some texts from Project Gutenberg. When I evaluate the quality of your dictionaries, I do not use your inputs, or outputs, or software; I have an independent tool for the evaluation of your dictionaries.
>
> It would be very useful if you kept track of how many word-pairs were counted during training. There are two important statistics to track: the number of unique word-pairs, and the total number observed, with multiplicity. These two numbers are important summaries of the size of the training set. There are two other important numbers: the number of *unique* words that occurred on the left side of a pair, and the number of unique words that occurred on the right side of a pair. These two will be almost equal, but not quite. It would be very useful for me to know these four numbers: the first two characterize the *size* of your training set; the second two characterize the size of the vocabulary.
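To make that bookkeeping concrete, here is a minimal sketch in Python of the four counts being asked for, together with the "honest" MI that can be computed from them. The tokenization, the two-word window, and all the names here are illustrative assumptions, not the actual pipeline code:

    import math
    from collections import Counter

    def count_pairs(sentences, window=2):
        """Count (left, right) word-pairs within a small window."""
        pairs = Counter()
        for words in sentences:
            for i, left in enumerate(words):
                for right in words[i + 1 : i + 1 + window]:
                    pairs[(left, right)] += 1
        return pairs

    def four_numbers(pairs):
        """Unique pairs and total observations (training-set size);
        unique left-words and unique right-words (vocabulary size)."""
        return (len(pairs),
                sum(pairs.values()),
                len({l for l, _ in pairs}),
                len({r for _, r in pairs}))

    def pair_mi(pairs):
        """Honest MI: log2( p(l,r) / (p(l,*) * p(*,r)) ), from raw counts."""
        total = sum(pairs.values())
        left, right = Counter(), Counter()
        for (l, r), n in pairs.items():
            left[l] += n
            right[r] += n
        return {(l, r): math.log2(n * total / (left[l] * right[r]))
                for (l, r), n in pairs.items()}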
>>
>> - row 63, learned NOT from parses produced by the DNN, BUT from honest MST-Parses; however, the MI-values for it were extracted from the DNN and made specific to the context of every sentence, so each pair of words could have different MI-values in different sentences:
>
> OK, look: MI has a very precise definition. You cannot use some other number you computed, and then call it "MI". Call it something else. Call it "DLA" -- Deep Learning Affinity. Affinity, because the word "information" also has a very precise definition: it is a negative log-base-2 of a probability. If it is not that, then it cannot be called "information". Call it "BW" -- Bertram Weights, if I understand correctly.
>
> So, if I understand correctly, you computed some kind of DLA/BW number for word-pairs, and then performed an MST parse using those numbers?
>
>> exported in the new "ull" format invented by Man Hin:
>
> Side-comment -- you guys seem to be confused about what the atomspace is, and what it is good for. The **whole idea** of the atomspace is that it is a "one size fits all" format, so that you do not have to "invent" new formats. There is a reason why databases, and graph databases, are popular. Inventing new file formats is a well-paved road to hell.
>
>> Regarding what you call "the breakthroughs":
>>
>> > Results from the ull-lgeng dataset indicate that the ULL pipeline is a high-fidelity transducer of grammars. The grammar that is pushed in is effectively the same as the grammar that falls out. If this can be reproduced for other grammars, e.g. Stanford, McParseface or some HPSG grammar, then one has a reliable way of tuning the pipeline. After it is tuned to maximize fidelity on known grammars, then, when applied to unknown grammars, it can be assumed to be working correctly, so that whatever comes out must in fact be correct.
>>
>> That has been worked on according to the plan set up back in 2017. I am glad that you accept the results. Unfortunately, the MST-Parser is not built into the pipeline yet, but it is on the way.
>>
>> If someone like you could help with the outstanding work items, it would be appreciated, because we are short-handed now.
>>
>> > The relative lack of differences between the ull-dnn-mi and the ull-sequential datasets suggests that the accuracy of the so-called "MST parse" is relatively unimportant. Any parse giving results with better-than-random outputs can be used to feed the pipeline. What matters is that a lot of observation counts need to be accumulated, so that junky parses cancel each other out, on average, while good ones add up and occur with high frequency. That is, if you want a good signal, then integrate long enough that the noise cancels out.
>>
>> I would disagree (and I guess Ben may disagree as well) given the existing evidence with the "full reference corpus".
>
> I think you are mis-interpreting your own results. The "existing evidence" proves the opposite of what you believe. (I suspect Ben is too busy to think about this very deeply.)
>
>> If you compare F1 for LG-English parses with MST > 2 on the "MWC-Study" tab, you will find the F1 on LG-English parses is decent, so it is not that "parses do not matter"; it is rather that "MST-Parses are even less accurate than sequential ones".
>
> You are mis-understanding what I said; I think you are also mis-understanding what your own data is saying.
>
> The F1-for-LG-English is high for two reasons: (1) natural-language grammar has the "decomposition property" (aka the "lexical property"), and (2) you are comparing the decomposition provided by LG to LG itself.
>
> The "decomposition property" states that "grammar is lexical". Natural language is "lexical" when its structure can be described by a "lexis" -- a dictionary whose headings are words, and whose entries are word-definitions of some kind -- disjuncts for LG; something else for Stanford/McParseface/HPSG/etc.
>
> If you take some lexical grammar (Stanford/McParseface/whatever), generate a bunch of parses, run them through the ULL pipeline, and learn a new lexis, then, ideally, if your software works well, that *new* lexis should come close to the original input lexis. And indeed, that is what you are finding with F1-for-LG-English.
>
> Your F1-for-LG-English results indicate that if you use LG as input, then ULL correctly learns the LG lexis. That is a good thing. I believe that ULL will also be able to do this for any lexis... provided that you take enough samples. (There is a lot of evidence that your sample sizes are much too small.)
>
> Let's assume, now, that you take Stanford parses, run them through ULL, learn a dict, and then measure F1-for-Stanford against parses made by Stanford. The F1 should be high. Ideally, it should be 1.0. If you measure that learned lexis against LG, it will be lower - maybe 0.9, maybe 0.8, maybe as low as 0.65. That is because Stanford is not LG; there is no particular reason for these two to agree, other than in some general outline: they probably mostly agree on subjects, objects and determiners, but will disagree on other details (aux verbs, "to be", etc.)
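A sketch of how such an F1 score can be computed, assuming each parse has been reduced to a set of links between word positions; this illustrates the scoring idea only, not the independent evaluation tool mentioned above:

    def parse_f1(reference_links, candidate_links):
        """F1 between two parses of the same sentence, each given as a
        set of (i, j) word-index links; links are treated as undirected."""
        ref = {tuple(sorted(l)) for l in reference_links}
        cand = {tuple(sorted(l)) for l in candidate_links}
        if not ref or not cand:
            return 0.0
        hits = len(ref & cand)
        precision = hits / len(cand)
        recall = hits / len(ref)
        return 2 * precision * recall / (precision + recall) if hits else 0.0

    # e.g. LG links vs. links produced by a learned lexis:
    # parse_f1({(0, 1), (1, 3), (2, 3)}, {(0, 1), (2, 3), (1, 2)})  # -> 0.667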
>
> Do you see what I mean now? The ULL pipeline should preserve the lexical structure of language. If you use lexis X as input, then ULL should generate something very similar to lexis X as output. You've done this for X==LG. Do it for X==Stanford, X==McParseface, etc. If you do, you should see F1=1.0 for each of these (well, something close to F1=1.0).
>
> Now for part two: what happens when X==sequential, what happens when X==DNN-MI (aka "Bertram weights"), and what happens when X=="honest MI"?
>
> Let's analyze X==sequential first. First of all, this is not a lexical grammar. Second of all, it is true that for English, and for just about *any* language, "sequential" is a reasonably accurate approximation of the "true grammar". People have actually measured this. I can give you a reference that gives numbers for the accuracy of "sequential" for 20 different languages. One paper measures "sequential" for Old English, Middle English, 17th-, 18th-, 19th- and 20th-century English, and finds that English becomes more and more sequential over time! Cool!
>
> If you train on X==sequential and learn a lexis, and then compare that lexis to LG, you might find that F1=0.55 or F1=0.6 -- this is not a surprise. If you compare it to Stanford, McParseface, etc. you will also get F1=0.5 or 0.6 -- that is because English is kind-of sequential.
>
> If you train on X==sequential and learn a lexis, and then compare that lexis to "sequential", you will get ... kind-of-crap, unless your training dataset is extremely large, in which case you might approach F1=1.0. However, you will need an absolutely immense training corpus to get this -- many terabytes and many CPU-years of training. The problem is that "sequential" is not lexical. It can be made approximately lexical, but that lexis would have to be huge.
>
> What about X==DNN-Bert and X==MI? Well, neither of those is lexical, either. So you are using a non-lexical grammar source, and attempting to extract a lexis out of it. What will you get? Well -- you'll get ... something. It might be kind-of-ish LG-like. It might be kind-of-ish Stanford-like. Maybe kind-of-ish HPSG-like. If your training set is big enough (and your training sets are not big enough) you should get at least 0.65 or 0.7, maybe even 0.8 if you are lucky, and I will be surprised if you get much better than that.
>
> What does this mean? Well, the first claim is that "ULL preserves lexical grammars", and that seems to be true. The second claim is that "when ULL is given a non-lexical input, it will converge to some kind of lexical output".
>
> The third claim, "the Linas claim", that you love to reject, is that "when ULL is given a non-lexical input, it will converge to the SAME lexical output, provided that your sampling size is large enough". Normally, this is followed by the question: "what non-lexical input makes it converge the fastest?" If you don't believe the third claim, then this is a nonsense question. If you do believe the third claim, then information theory supplies an answer: the maximum-entropy input will converge the fastest. If you believe this answer, then the next question is "what is the maximum-entropy input?" -- and I believe that it is honest-MI+weighted-clique. Then there is claim four: the weighted clique can be approximated by MST.
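To make claim four concrete, here is a toy maximum-spanning-tree parse over pair-MI weights (a plain Prim-style loop; a real MST parser also has to deal with unseen pairs, planarity constraints and so on, so treat this purely as a sketch):

    def mst_parse(words, mi):
        """Maximum spanning tree over word positions, with edges weighted
        by the MI of the word pair; returns a set of (i, j) links.
        A weighted-clique parse would instead keep *every* positive-MI
        link between the words, not just the tree."""
        def weight(i, j):
            return mi.get((words[i], words[j]),
                          mi.get((words[j], words[i]), float("-inf")))
        in_tree = {0}
        links = set()
        while len(in_tree) < len(words):
            i, j = max(((i, j) for i in in_tree
                        for j in range(len(words)) if j not in in_tree),
                       key=lambda e: weight(*e))
            links.add((i, j))
            in_tree.add(j)
        return links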
>
> It is now becoming clear to me that MST is kind of a mistake, and that a weighted clique would probably be better, faster-converging. Maybe. The problem with all of this is rate of convergence, sample-set size, and amount of computation. It is easy to invent a theoretically ideal NP-complete algorithm; it's much harder to find something that runs fast.
>
> Anyway, since you don't believe my third claim, I have a proposal. You won't like it. The proposal is to create a training set that is 10x bigger than your current one, and one that is 100x bigger than your current one. Then run "sequential", "honest-MI" and "DNN-Bert" on each. All three of these will start to converge to the same lexis. How quickly? I don't know. It might take a training set that is 1000x larger. But that should be enough; larger than that will surely not be needed. (Famous last words. Sometimes things just converge slowly...)
>
> -- Linas
>
>> Still, we have got a "surprise-surprise" with the "gold reference corpus". Note, it still says "parses do matter, but MST-Parses are as bad or as good as sequential, and both are still not good enough". Also note that it has been obtained on just 4 sentences, which is not reliable evidence.
>>
>> Now we are working full-throttle on proving your claim with the "silver reference corpus" - stay tuned...
>>
>> Cheers,
>>
>> -Anton
>>
>> 22.06.2019 5:38, Linas Vepstas:
>>
>> Anton,
>>
>> It's not clear if you fully realize this yet or not, but you have not just one but two major breakthroughs here. I will explain them shortly, but first, can you send me your MST dictionary? Of the three that you'd sent earlier, none had the MST results in them.
>>
>> OK, on to the major breakthroughs... I describe exactly what they are in the attached PDF. It supersedes the PDF I had sent out earlier, which contained invalid/incorrect data. This new PDF explains exactly what works, what you've found. Again, it's important, and I'm very excited by it. I hope Ben is paying attention; he should understand this. This really paves the way to forward motion.
>>
>> BTW, your datasets that "rock"? Actually, they suck when tested out-of-training-set. This is probably the third, but more minor, discovery: the Gutenberg training set offers poor coverage of modern English, and your training set is also way too small. All this is fixable, and is overshadowed by the important results.
>>
>> Let me quote myself for the rest of this email. This is quoted from the PDF. Read the whole PDF; it makes a few other points you should understand.
>>
>> ull-lgeng
>>
>> Based on LG-English parses: obtained from
>> http://langlearn.singularitynet.io/data/aglushchenko_parses/GCB-FULL-ALE-dILEd-2019-04-10/context:2_db-row:1_f1-col:11_pa-col:6_word-space:discrete/
>>
>> I believe that this dictionary was generated by replacing the MST step with a parse where linkages are obtained from LG; these are then busted up back into disjuncts. This is an interesting test, because it validates the fidelity of the overall pipeline. It answers the question: "If I pump LG into the pipeline, do I get LG back out?" and the answer seems to be "yes, it does!" This is good news, since it implies that the overall learning process does keep grammars invariant. That is, whatever grammar goes in, that is the grammar that comes out!
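A toy illustration of that "busted up back into disjuncts" step, assuming a parse is just a set of (i, j) links; real LG disjuncts are built from connector types, so plain neighboring words stand in for connectors here:

    def extract_disjuncts(words, links):
        """For each word, collect its connectors: the words it links to
        on the left (suffix '-') and on the right (suffix '+'), in order.
        The resulting word -> set-of-disjuncts map is a toy lexis."""
        lexis = {}
        for k, word in enumerate(words):
            dj = tuple([words[i] + "-" for i, j in sorted(links) if j == k] +
                       [words[j] + "+" for i, j in sorted(links) if i == k])
            lexis.setdefault(word, set()).add(dj)
        return lexis

    # extract_disjuncts(["the", "cat", "sat"], {(0, 1), (1, 2)})
    # -> {'the': {('cat+',)}, 'cat': {('the-', 'sat+')}, 'sat': {('cat-',)}}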
>>
>> This is important, because it demonstrates that the apparatus is actually working as designed, and is, in fact, capable of discovering grammar in data! This suggests several ideas:
>>
>> * First, verify that this really is the case, with a broader class of systems. For example, start with the Stanford Parser and pump it through the system. Then compare the output not to LG, but to the Stanford Parser. Are the resulting linkages (the F1 scores) at 80% or better? Is the pipeline preserving the Stanford grammar? I'm guessing it does...
>>
>> * The same, but with Parsey McParseface.
>>
>> * The same, but with some known-high-quality HPSG system.
>>
>> If the above bullet points hold up, then this is a major breakthrough, in that it solves a major problem. The problem is that of evaluating the quality of the grammars generated by the system. To what should they be compared? If we input MST parses, there is no particular reason to believe that they should correspond to LG grammars. One might hope that they would, based, perhaps, on some a-priori hand-waving about how most linguists agree about what the subject and object of a sentence are. One might in fact find that this does hold up to some fair degree, but that is all. Validating grammars is difficult, and seems ad hoc.
>>
>> This result offers an alternative: don't validate the grammar; validate the pipeline itself. If the pipeline is found to be structure-preserving, then it is a good pipeline. If we want to improve or strengthen the pipeline, we now have a reliable way of measuring, free of quibbles and argumentation: if it can transfer an input grammar to an output grammar with high fidelity, with low loss and low noise, then it is a quality pipeline. It instructs one how to tune a pipeline for quality: work with these known grammars (LG/Stanford/McParse/HPSG) and fiddle with the pipeline, attempting to maximize the scores. Build the highest-fidelity, lowest-noise pipeline possible.
>>
>> This allows one to move forward. If one believes that probability and statistics are the correct way of discerning reality, then that's it: if one has a high-fidelity corpus-to-grammar transducer, then whatever grammar falls out is, necessarily, a priori, a correct grammar. Statistics doesn't lie. This is an important breakthrough for the project.
>>
>> ull-sequential
>>
>> Based on "sequential" parses: obtained from
>> http://langlearn.singularitynet.io/data/aglushchenko_parses/GCB-FULL-SEQ-dILEd-2019-05-16-94/GL_context:2_db-row:1_f1-col:11_pa-col:6_word-space:discrete/
>>
>> I believe that this dictionary was generated by replacing the MST step with a parse where there are links between neighboring words, and then extracting disjuncts that way. This is an interesting test, as it leverages the fact that most links really are between neighboring words. The sharp drawback is that it forces each word to have an arity of exactly two, which is clearly incorrect.
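The "sequential" baseline is trivial to state in code; a sketch, using the same (i, j) link convention as the snippets above:

    def sequential_parse(words):
        """Link every word to its immediate right neighbor: the arity-two
        baseline that the ull-sequential dictionary is built from."""
        return {(i, i + 1) for i in range(len(words) - 1)}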
>>
>> ull-dnn-mi
>>
>> Based on "DNN-MI-linked MST-Parses": obtained from
>> http://langlearn.singularitynet.io/data/aglushchenko_parses/GCB-GUCH-SUMABS-dILEd-2019-05-21-94/GL_context:2_db-row:1_f1-col:11_pa-col:6_word-space:discrete/
>>
>> I believe that this dictionary was generated by replacing the MST step with a parse where some sort of neural net is used to obtain the parse.
>>
>> Comparing either of these to the ull-sequential dictionary indicates that precision is worse, recall is worse, and F1 is worse. This vindicates some statements I had made earlier: the quality of the results at the MST-like step of the process matters relatively little for the final outcome. Almost anything that generates disjuncts with slightly-better-than-random accuracy will do. The key to learning is to accumulate many disjuncts: just as in radio signal processing, or any kind of frequentist statistics, integrate over a large sample, hoping that the noise will cancel out, while the invariant signal is repeatedly observed and boosted.
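That "integrate until the noise cancels" point can be sketched with the toy pieces above: accumulate disjunct counts over many (possibly junky) parses, and keep only those that recur. The cutoff rule here is an illustrative assumption, and the sketch reuses extract_disjuncts() from earlier:

    from collections import Counter

    def accumulate_lexis(parsed_corpus, min_count=2):
        """parsed_corpus yields (words, links) pairs, e.g. produced by
        sequential_parse() or mst_parse() above. Disjuncts that recur
        across many parses pile up counts; one-off noise is discarded."""
        counts = Counter()
        for words, links in parsed_corpus:
            for word, djs in extract_disjuncts(words, links).items():
                for dj in djs:
                    counts[(word, dj)] += 1
        lexis = {}
        for (word, dj), n in counts.items():
            if n >= min_count:
                lexis.setdefault(word, set()).add(dj)
        return lexis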
>>
>> On Thu, Jun 20, 2019 at 11:11 PM Anton Kolonin @ Gmail <akolo...@gmail.com> wrote:
>>>
>>> It turns out the difference between applying MWC to both GL and GT (lower block) and applying it to GT only (upper block) is marginal - applying it to GL makes the results 1% better.
>>>
>>> So far, testing on full LG-English parses (including partially parsed ones) as a reference:
>>>
>>> As we know, MWC=2 is much better than MWC=1, with no further improvement beyond that.
>>>
>>> "Sequential parses" rock, MST and "random" parses suck.
>>>
>>> Pearson(parses,grammar) = 1.0
>>>
>>> Alexey is running this with the "silver standard" for MWC=1,2,3,4,5,10
>>>
>>> -Anton
>>
>> --
>> cassette tapes - analog TV - film cameras - you
>>
>> --
>> -Anton Kolonin
>> skype: akolonin
>> cell: +79139250058
>> akolo...@aigents.com
>> https://aigents.com
>> https://www.youtube.com/aigents
>> https://www.facebook.com/aigents
>> https://medium.com/@aigents
>> https://steemit.com/@aigents
>> https://golos.blog/@aigents
>> https://vk.com/aigents
>
> --
> cassette tapes - analog TV - film cameras - you

--
Ben Goertzel, PhD
http://goertzel.org

"Listen: This world is the lunatic's sphere, / Don't always agree it's real. / Even with my feet upon it / And the postman knowing my door / My address is somewhere else." -- Hafiz