"Replacing MST by DNN/BERT" is a strange way to put it...

DNN/BERT builds a pretty complex and comprehensive language model,
far beyond what is done by calculating MI values and the like.

The extraction of a parse DAG satisfying syntactic constraints (no
crossing links, coverage of all words in the sentence, a connected
graph) is a conceptually simple step, and indeed nobody is spending
much time on this step...

The question of how to assign a quantitative weight to the relation
between two word-instances in a sentence, taking into account the
specific context of that sentence as well as the history of
co-utilization of those words (or other similar words), is less
conceptually simple, and this is one place where I think DNN language
models can help.

Using MST or similar parsing based on numbers exported from DNN
language models is one way of extracting symbolic-ish structured
knowledge from these big messy subsymbolic probabilistic language
models...
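To make the "MST over numbers exported from a language model" idea concrete, here is a minimal sketch. The pairwise scores below are made up for illustration; in practice they would be MI values or weights derived from a DNN language model, and the function name `mst_parse` is hypothetical. Note this toy version ignores the no-crossing-links constraint discussed above.

```python
# Sketch: extract a maximum spanning tree over word pairs, given
# pairwise scores (e.g. MI values, or numbers exported from a DNN
# language model). The score matrix here is invented for illustration.

def mst_parse(words, score):
    """Greedy Prim-style maximum spanning tree over word indices.
    score[i][j] is the (symmetric) link weight between words i and j."""
    n = len(words)
    in_tree = {0}                 # start from the first word
    links = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree:
                    if best is None or score[i][j] > score[best[0]][best[1]]:
                        best = (i, j)
        i, j = best
        links.append((words[i], words[j]))
        in_tree.add(j)
    return links

# Toy example: three words, hand-made scores standing in for MI/DNN output.
words = ["she", "saw", "wood"]
score = [[0.0, 2.5, 0.1],
         [2.5, 0.0, 1.8],
         [0.1, 1.8, 0.0]]
print(mst_parse(words, score))   # [('she', 'saw'), ('saw', 'wood')]
```

Swapping the score source (MI counts vs. DNN-exported numbers) leaves this extraction step untouched, which is part of why the step itself is conceptually simple.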

The DNNs in use now, like BERT, do not really satisfy me on a
theoretical or conceptual level, but they have been tuned to work
pretty nicely and have been implemented pretty efficiently on
multi-GPU hardware. So, given this, and given the quality of the
recent practical results obtained with them, I consider it well
worth exploring how to use them as tools in our pursuit of grammar
and semantics learning.

-- Ben

On Mon, Apr 1, 2019 at 2:07 PM Linas Vepstas <[email protected]> wrote:
>
>
>
> On Sun, Mar 31, 2019 at 10:51 PM Anton Kolonin @ Gmail <[email protected]> 
> wrote:
>>
>> Hi Linas, I like this thread more and more :-)
>
> I don't. I use a lot of CAPITALIZED WORDS below.  There is a deep and dark 
> fundamental misunderstanding, and I am sometimes at wits end trying to figure 
> out why, and how to explain things in an understandable fashion.
>>
>> >But somehow, I suspect... Isn't this why OpenCog has "unified rule engine" 
>> >(URE) instead of link grammar at its core,
>>
>> Linas, the approach to the "extraction of phrasemes" goal was discussed 
>> exactly in terms of MST->GL->URE last fall in the Hong Kong discussion: 
>> https://docs.google.com/document/d/13YyqtGud0GAbVaFcc94kAd2LhGf7jTr5XDYgiuC294c/edit
>>
>> That is:
>>
>> 1) Do MST-parsing to get word links and proto-disjuncts
>>
>> 2) Do Grammar Learning to cluster and infer word categories and rules 
>> with disjuncts
>>
>> 3) Do URE-kind-of-thing to build the rules into "phrasemes" or "sections" or 
>> "patterns".
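A toy sketch of how step 1 feeds step 2: given an (unlabeled) parse, i.e. a set of word-pair links, one can read off a proto-disjunct for each word as its sequence of left- and right-pointing connectors. The function name, the link format, and the connector naming are illustrative, not the actual pipeline's format.

```python
# Sketch: derive proto-disjuncts from the links of a single parse.
# Connectors are named by the linked word: '-' means a link to the
# left, '+' a link to the right (in rough analogy to LG notation).

def proto_disjuncts(sentence, links):
    """links: pairs of word indices (i, j) with i < j."""
    words = sentence.split()
    djs = {}
    for w_idx, word in enumerate(words):
        left = sorted(i for i, j in links if j == w_idx)    # links from the left
        right = sorted(j for i, j in links if i == w_idx)   # links to the right
        dj = [words[i] + "-" for i in left] + [words[j] + "+" for j in right]
        djs[word] = " & ".join(dj)
    return djs

print(proto_disjuncts("she saw wood", [(0, 1), (1, 2)]))
# {'she': 'saw+', 'saw': 'she- & wood+', 'wood': 'saw-'}
```

Clustering (step 2) would then replace the word-named connectors with learned category labels.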
>
> Yes.
>>
>> However, your current discourse and our current results just show that "no 
>> one is able to do reasonable MST-parsing", so the above is just a waste of 
>> time, correct?
>
> No. Very much no.  I'm saying the opposite of that. You can replace MST by 
> almost *ANYTHING* else, and the quality of your results WILL NOT CHANGE!
>
> If the quality of your results depends on the quality of MST, you are DOING 
> SOMETHING WRONG!
>
> I'm utterly flabbergasted. I don't know how many more times I can say this: 
> stop wasting time on this unimportant step!
>>
>> As we speak, Ben, Alexey, Sergey and Asuares are trying to use 
>> DNN/BERT magic to do step 1.
>
> I want to call this "a complete waste of time". It will almost surely not 
> improve the quality of the results!  I don't understand why four smart people 
> think that replacing MST by BERT will make any difference at all!  It should 
> not matter!  Nothing depends on this step! Anything at all, anything with a 
> probability better than random chance, is sufficient!  Why isn't this obvious?
>
> If Ben is reading this: I recall talking to Ben about this in an ice-cream 
> shop in Berlin, at an AGI conference, and he seemed to understand back then. 
>  I have no idea why he changed his mind.  I really do not understand why 
> everyone spends so much time obsessing about MST. Is this a "color of the 
> bike shed" problem?  https://en.wikipedia.org/wiki/Law_of_triviality
>
> MST-vs.-BERT==color-of-bike-shed
>
> Just use MST. It's simple. It works. It gives good results.  Stop trying to 
> improve it.  The interesting problems are elsewhere!  Just use MST, and move 
> on to the good stuff!
>>
>> To my mind, that may be possible only if the DNN/BERT magic does the trick 
>> with steps 2 and 3 done under the hood. If that is done, then we don't need 
>> to do 2 and 3 after we have the DNN/BERT-based model, because we can simply 
>> "milk out" the grammar rules from the DNN/BERT mycelium. And we don't need 
>> ULL either, by the way, because we just need DNN/BERT and rows of different 
>> sorts of milking machines around it.
>
> So why are you bothering to work on ULL?
>>
>> So, instead of solving the problem of constructing the pipeline for learning 
>> grammar from raw text we need to solve the problem of milking the grammar 
>> out of DNN/BERT model trained on these texts, right?
>
> Because I don't think that you know how to milk lexical functions out of 
> DNN/BERT. We've wasted more than a year talking about MST.  Instead of 
> endlessly talking about MST, you could have JUST USED IT, WITHOUT ANY 
> MODIFICATIONS, gotten good results, and spent the year working on something 
> interesting!
>
> Again: replacing MST with DNN/BERT or anything else will NOT IMPROVE the 
> accuracy!  You'll have exactly the same accuracy as before, and if your 
> accuracy improves, it is because you are doing something wrong!
>
>> However, either way, we need to understand the algorithmic machinery of how 
>> links assemble into disjuncts and disjuncts assemble into sections, through 
>> the universe-scale combinatorial explosion.
>
> No. That is the OPPOSITE of what ACTUALLY HAPPENS!!!!
>>
>> And I agree that clustering and categorizing words and links (and then 
>> disjuncts and sections, right?) is part of the process - explicitly in the 
>> ULL pipeline or implicitly deep in the DNN/BERT darkness.
>
> It is NOT DEEP AND DARK.  I wrote not one but TWO PAPERS on this, CASTING 
> LIGHT ON THAT DARKNESS
>
> I'm frustrated to the 43rd degree that I cannot seem to have a reasonable 
> conversation with any other human being about any of this.
>
> -- Linas
>
>> Cheers,
>>
>> -Anton
>>
>>
>> 01.04.2019 9:17, Linas Vepstas:
>>
>>
>>
>> On Thu, Mar 28, 2019 at 10:22 AM Ivan V. <[email protected]> wrote:
>>>
>>> Linas Vepstas wrote:
>>>
>>> >... knowledge extraction can be done generically, and not just on language.
>>>
>>> If link grammar would be Turing complete, this might be possible right away.
>>
>>
>> In my experience, thinking about Turing completeness is unproductive and a 
>> distraction.
>>
>>> But somehow, I suspect... Isn't this why OpenCog has "unified rule engine" 
>>> (URE) instead of link grammar at its core,
>>
>>
>> No. It has the rule-engine because back then, I did not understand sheaves.  
>> I'm starting to think that the rule engine is a strategic mistake. The 
>> original idea is that rule-application is the main conceptual abstraction of 
>> term-rewriting.  One rewrites, or proves theorems by applying sequences of 
>> rules.  It turns out that discovering the right sequence is hard. Finding 
>> correct long sequences is hard - a combinatorial explosion.
>>
>> The openpsi system addresses some of these issues. Unfortunately, its 
>> current implementation is a tangle of rule-selection mechanisms and 
>> theories of human psychology. It's probably better than the URE, but is 
>> currently not as powerful.
>>
>> I'm trying to position the theory of sheaves as a replacement for the URE, 
>> and as the natural generalization of openpsi, but I've successfully 
>> sabotaged myself in these efforts.
>>
>>>
>>> and with URE things get much more complicated. I'm sorry, but that is still 
>>> a Gordian knot to me, considering all of my modest knowledge.
>>
>>
>> We all have modest knowledge. That is the nature of the human condition.
>>
>>>
>>> On the other hand, if someone really smart would provide automatic grammar 
>>> extraction by means of unrestricted grammar, I believe that would be it.
>>
>>
>> Yes, that is the goal of the language-learning project.  However, as noted 
>> in my last email (on the link-grammar list), it is not enough to just learn 
>> a semi-Thue system, declare victory, and go home.  The example I gave there:
>>
>>   "I think that you should give that car a second look"
>>   "you should really give that song a second listen"
>>   "maybe you should give Sue a second chance".
>>
>> Learning to parse these "set phrases" or phrasemes is equivalent to learning 
>> a semi-Thue system; however, it's not enough: one must also realize that all 
>> three are forms of advice-giving, having "conserved" or "fixed" regions "x 
>> YOU SHOULD y GIVE z SECOND w", where z is highly variable, with millions of 
>> variations, and w has only a few dozen allowed variations.  Note that the 
>> words "fixed", "conserved", and "variable" are used in genetics, proteomics, 
>> and antibody structure. It's the same idea.
>>
>> The goal of learning lexical functions (LFs) is to learn that all three are 
>> advice-giving forms, and also to learn what x, y, z, w are, and what can be 
>> plugged in for them.  So, although a super-whiz-bang grammar learner capable 
>> of learning context-sensitive languages should be able to learn "x YOU 
>> SHOULD y GIVE z SECOND w", it still will not know the *meaning* of this 
>> phrase.  To know the *meaning*, you have to know the acceptable ranges (as 
>> fuzzy sets) of x, y, z, w.
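The fixed-skeleton-with-slots idea can be sketched crudely. A hand-written regex stands in for the learned lexical function here; the function name `advice_slots` is invented, and a real learner would also have to learn the allowed (fuzzy-set) ranges of each slot rather than accept anything.

```python
import re

# Sketch: the three advice-giving sentences share the fixed skeleton
# "x YOU SHOULD y GIVE z A SECOND w"; a crude regex extracts the slots.

def advice_slots(sentence):
    pat = re.compile(
        r"^(?P<x>.*?)\byou should\s+(?:(?P<y>\w+)\s+)?give\s+"
        r"(?P<z>.+?)\s+a second\s+(?P<w>\w+)")
    m = pat.search(sentence.lower())
    return m.groupdict() if m else None

for s in ["I think that you should give that car a second look",
          "you should really give that song a second listen",
          "maybe you should give Sue a second chance"]:
    print(advice_slots(s))
```

The regex only recognizes the skeleton; it says nothing about *which* fillers are acceptable, which is exactly the gap between recognizing the phraseme and knowing its meaning.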
>>
>> To conclude, thinking about Turing-completeness is a waste of time, because 
>> Turing completeness only tells you that "x YOU SHOULD y GIVE z SECOND w" is 
>> recursively enumerable; it does not tell you what it actually means.
>>
>> Put another way:  having a universal Turing machine is not the same as 
>> knowing how some particular program works. Automagically learning a 
>> context-sensitive grammar is not enough to know what that grammar is 
>> "saying/doing".
>>
>> -- Linas
>>
>>>
>>>
>>> Thank you,
>>> Ivan V.
>>>
>>>
>>> čet, 28. ožu 2019. u 07:58 Anton Kolonin @ Gmail <[email protected]> 
>>> napisao je:
>>>>
>>>> Ben, Linas,
>>>>
>>>> >But we know that MST parsing is shit.  Stop wasting time on MST or trying 
>>>> >to "improve" it.
>>>>
>>>> I think that sounds like support for the concept of "dumb explosive 
>>>> parsing" advocated 1+ year ago:
>>>>
>>>> https://docs.google.com/document/d/14MpKLH5_5eVI39PRZuWLZHa1aUS73pJZNZzgigCWwWg/edit#heading=h.aqo9bumb3doy
>>>>
>>>> I also agree with the rest of Linas's reasoning in this thread. I would 
>>>> consider giving it a try starting next month, if we don't have a 
>>>> breakthrough with DNN-MI-milking-based MST-Parsing by that time.
>>>>
>>>> > can be done generically, and not just on language
>>>>
>>>> I think everyone in bio-informatics dreams of extracting secrets of "dark 
>>>> side of the genome" with something like that ;-)
>>>>
>>>> Cheers,
>>>>
>>>> -Anton
>>>>
>>>>
>>>> 28.03.2019 1:24, Linas Vepstas пишет:
>>>>
>>>> Hi Anton,
>>>>
>>>> I've cc'ed the link-grammar mailing list, because I describe below some 
>>>> concepts for word-sense disambiguation. I'm also cc'ing the opencog 
>>>> mailing list and ivan vodisek, because after studying hilbert systems, I 
>>>> think he's ready to think about how knowledge extraction can be done 
>>>> generically, and not just on language.
>>>>
>>>> -- Linas
>>>>
>>>> On Mon, Mar 25, 2019 at 1:39 AM Anton Kolonin @ Gmail <[email protected]> 
>>>> wrote:
>>>>>
>>>>> Hi Linas,
>>>>>
>>>>> >I'd call it "interesting", but maybe not "golden"
>>>>>
>>>>> These are randomly selected sentences from "Gutenberg Children" corpus:
>>>>>
>>>>> http://langlearn.singularitynet.io/data/cleaned/English/Gutenberg-Children-Books/lower_LGEng_token/
>>>>>
>>>>> "Gutenberg Children silver standard" is LG-English parses:
>>>>>
>>>>> http://langlearn.singularitynet.io/data/parses/English/Gutenberg-Children-Books/test/GCB-LG-English-clean.ull
>>>>>
>>>>> "Gutenberg Children gold standard" is a subset of the "silver standard": 
>>>>> a semi-random selection of sentences, skipping direct speech, with 
>>>>> manual verification of the links.
>>>>>
>>>>> So as long as we are training on "Gutenberg Children" corpus, having the 
>>>>> test on the same "Gutenberg Children" seems reasonable, right?
>>>>
>>>>
>>>> Yes. You still need to verify that each word in the "golden" corpus occurs 
>>>> at least N=10 or 20 times in the training corpus. The dependency of 
>>>> accuracy on N is not known in general, but it is very clear that if a word 
>>>> occurs only N=3 times in the training corpus, then whatever is learned 
>>>> about it will be of very low quality.
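The coverage check described above is easy to automate. A minimal sketch, with the function name and the two toy word lists invented for illustration (real use would read the corpora from files):

```python
from collections import Counter

# Sketch: before scoring against a "golden" corpus, check that every
# word in it was seen at least N times in the training corpus.

def undertrained_words(golden_words, training_words, n_min=10):
    """Return the golden-corpus words seen fewer than n_min times in training."""
    counts = Counter(training_words)
    return sorted({w for w in golden_words if counts[w] < n_min})

# Stand-in data: 'whinnying' is rare, 'horse' is absent from training.
training = ["the", "dog", "saw", "wood"] * 12 + ["whinnying"] * 3
golden = ["the", "dog", "saw", "the", "whinnying", "horse"]
print(undertrained_words(golden, training))   # ['horse', 'whinnying']
```

Sentences containing any flagged word could then be excluded from the evaluation, or reported separately.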
>>>>
>>>>>
>>>>> But thanks, we may have to put more effort into the removal of ancient 
>>>>> constructions and words, even if these are present in the corpus.
>>>>
>>>> If you consistently train on 19th century literature, and then evaluate 
>>>> 19th-century literature comprehension, that's fine.  Just don't expect it 
>>>> to work for 21st century blog posts.
>>>>
>>>> The strongest effect will be the N=number of observations effect.
>>>>
>>>>>
>>>>> >Anyway -- you only indicate pair-wise word-links. Is the omission of 
>>>>> >disjuncts intentional?
>>>>>
>>>>> If you have all the links in the sentence, you can construct all of the 
>>>>> disjuncts with no ambiguity, correct?
>>>>
>>>> No, but only because you did not indicate the link-type.  The whole point 
>>>> of the clustering step is to obtain a link-type; if you discard it, you 
>>>> will never get better-than-MST results. The link-type is critical for 
>>>> obtaining the word-classes.  The whole point of learning is to learn the 
>>>> word-classes; you've learned very little if you know only word-pairs.
>>>>
>>>> Consider this example:
>>>>
>>>> I saw wood
>>>> I saw some wood
>>>>
>>>> A solution that would be "almost perfect" (or "golden") would be this:
>>>>
>>>> saw: {performer-of-actions}- & {sculptable-mass}+;
>>>> saw: {observer}-  & {viewable-thing}+;
>>>>
>>>> These disambiguate the two different senses of the word "saw".  It's 
>>>> impossible to have word-sense disambiguation without actually having these 
>>>> disjuncts.  The word-pairs alone are not sufficient to report the 
>>>> link-type connecting the words.  Clustering gives the other dictionary 
>>>> entries:
>>>>
>>>> I: {performer-of-actions}+ or {observer}+;
>>>> wood: {sculptable-mass}- or ({quantity-determiner}- & {viewable-thing}-);
>>>> some: {quantity-determiner}+;
>>>>
>>>> Thus, the pronoun "I" also belongs to two different word-sense categories: 
>>>> performers and observers.  Compare to:
>>>>
>>>> "The chainsaw saws wood"  -- a "chainsaw" can be  a "performer of actions" 
>>>> but cannot be an "observer".
>>>> "The dog saw some wood" -- dogs can be observers. They can perform some 
>>>> actions, like running and jumping, but they cannot saw, hammer, cut, or stab.
>>>>
>>>> The link-type is absolutely crucial to understanding a word.  The 
>>>> language-learning project is all about learning the link-types. Without 
>>>> correct link-type assignments, you cannot have correct parses.
>>>>
>>>> ... which is 100% of the problem with MST.  The problem with MST is not so 
>>>> much that "it's not accurate" -- sure, it is not terribly accurate. But 
>>>> even if MST or some MST-replacement were 100% accurate, it would still be 
>>>> "wrong", because it fails to indicate the link-type.  If you want to 
>>>> understand a sentence, you MUST know the link-types!
>>>>
>>>> Otherwise, you just have "green ideas sleep furiously", which parses, but 
>>>> only because the link types have been erased, or made stupid.  Here's a 
>>>> stupid grammar:
>>>>
>>>> ideas:  {adjective}- & {verb}+;
>>>> green: {adjective}+;
>>>>
>>>> which allows "green ideas" to parse.  But of course, this is wrong; it 
>>>> should have been:
>>>>
>>>> ideas: {noospheric-modifier}- & {concept-manipulating-verb}+;
>>>> green: {physical-object-modifier}+;
>>>>
>>>> and now it is clear that "green ideas" cannot parse, because the 
>>>> link-types clash.
>>>>
>>>> * If you cluster down to 5 or 6 clusters (adjective, verb, noun ...) you 
>>>> will get very low quality grammars.
>>>>
>>>> * If you cluster to 200 or 300 clusters, you get sort-of-OK grammars. This 
>>>> is what deep-learning/neural-nets do: this is why the deep-learning 
>>>> systems seem to give nice results: 200 or 300 features is enough to start 
>>>> having adequate functional distinctions (e.g. the famous "king - 
>>>> male+female=queen" example, or "paris-france+germany=berlin" example)
>>>>
>>>> * If you cluster to 3K to 8K clusters, you start having a quite decent 
>>>> model of language
>>>>
>>>> * Note that wordnet has 117K "synsets".
>>>>
>>>> Note that in the above example:
>>>> wood: {sculptable-mass}- or ({quantity-determiner}- & {viewable-thing}-);
>>>>
>>>> the things in the curly-braces are effectively "synsets".
>>>>
>>>> The next set of goal-posts is to have disjuncts, of maybe low-medium 
>>>> quality, and use these to extract ontologies.  e.g.
>>>> {sculptable-mass} is-a {mass} is-a {physical-thing} is-a {thing}
>>>>
>>>> You can try to do this by clustering but there are probably better ways of 
>>>> discovering ontology.
>>>>
>>>>
>>>>>
>>>>> >Also -- no hint of any word-classes or part-of-speech tagging? This is 
>>>>> >surely important to evaluate as well, or is this to be done in some 
>>>>> >other way?  i.e. to evaluate if "Pivi" was correctly clustered with 
>>>>> >other given names?  Or that lama/llama was clustered with other 
>>>>> >four-legged animals?
>>>>>
>>>>> We don't have that in MST-Parsing, right? We need this corpus to assess 
>>>>> the quality of the MST-Parsing, so we don't need part-of-speech 
>>>>> information for that.
>>>>
>>>> But we know that MST parsing is shit.  Stop wasting time on MST or trying 
>>>> to "improve" it. We already know that it is close to a high-entropy path 
>>>> to structure; trying to squeeze a few more percent of entropy is not worth 
>>>> the effort, not at this time.  Focus on finding a high-entropy structure 
>>>> extraction algorithm, don't waste time on MST.
>>>>
>>>> You should be focusing on extracting disjuncts, word-classes, word-senses, 
>>>> and trying to improve the quality of those.  If you obtain a high-entropy 
>>>> path to these structures, the quality of your parses will automatically 
>>>> improve.  Focus on the entropy numbers. Try to maximize that.
>>>>
>>>>> The clustering is able to do that anyway - see the graphs in the end of 
>>>>> the last year report:
>>>>>
>>>>> https://docs.google.com/document/d/1gxl-hIqPQCYPb9NNkyA3sBYUyfwvJFvT1hZ5ZpXsaPc/edit#heading=h.twoiv52o0tou
>>>>>
>>>>> >Also -- I can't tell -- is it free of loops, or are loops allowed?  
>>>>> >Allowing loops tends to provide stronger, more accurate parses.  Loops 
>>>>> >act as constraints.
>>>>>
>>>>> Loops and crossing links are not allowed in the MST-Parser now. If we 
>>>>> allow them in the test corpus, how could that make the assessment of 
>>>>> MST-Parses better?
>>>>>
>>>>> Note that we ARE working with MST-Parses now, according to Ben's 
>>>>> directions.
>>>>
>>>>
>>>> Not to say bad things about Ben, but I'm certain he has not actually 
>>>> thought about this problem very much. He is very, very busy doing other 
>>>> things; he is not thinking about this stuff.  I have repeatedly tried to 
>>>> explain the issues to him, and it's quite clear that he is far away from 
>>>> understanding them, and from working at the level that I would like to 
>>>> have you and your team work at.
>>>>
>>>> I'm trying to have you make small, quantified baby-steps, to verify the 
>>>> accuracy of your methods and data.  What I'm seeing is that you are 
>>>> attempting to make giant-steps, without verification, and then getting 
>>>> low-quality results, without understanding the root causes for them.  You 
>>>> can't dig yourself out of a ditch, and digging harder and more furiously 
>>>> won't raise the accuracy of the parse results.
>>>>
>>>> --linas
>>>>
>>>>> We have your MST-Parser-less idea on the map but we are NOT trying it now:
>>>>>
>>>>> https://github.com/singnet/language-learning/issues/170
>>>>>
>>>>> We may try it after we explore the account for costs
>>>>>
>>>>> https://github.com/singnet/language-learning/issues/183
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Anton
>>>>>
>>>>> 24.03.2019 9:24, Linas Vepstas пишет:
>>>>>
>>>>> Also, BTW, link-grammar cannot parse "I just stood there, my hand on the 
>>>>> knob, trembling like a leaf." correctly. It is one of a class of 
>>>>> sentences it does not know about, which is maybe OK, because ideally 
>>>>> the learned grammar will be able to do this. But today, LG cannot.
>>>>>
>>>>> --linas
>>>>>
>>>>> On Sat, Mar 23, 2019 at 9:12 PM Linas Vepstas <[email protected]> 
>>>>> wrote:
>>>>>>
>>>>>> Anton,
>>>>>>
>>>>>> It's certainly an unusual corpus, and it might give you rather low 
>>>>>> scores. I'd call it "interesting", but maybe not "golden". Although I 
>>>>>> suppose it depends on your training corpus.  Here are some problems that 
>>>>>> pop out:
>>>>>>
>>>>>> First sentence --
>>>>>> "the old beast was whinnying on his shoulder" -- the word "whinnying" is 
>>>>>> a fairly rare English verb -- you could read half-a-million wikipedia 
>>>>>> articles, and not see it once. You could read lots of 19th-century or 
>>>>>> early-20th century cowboy/adventure novels, (like what you'd find on 
>>>>>> Project Gutenberg) and maybe see it a fair amount. Even then, to 
>>>>>> "whinny on a shoulder" seems bizarre. I guess he's hugging the horse? 
>>>>>> How often does that happen, in any cowboy novel? "to whinny on 
>>>>>> something" is an extremely rare construction.  It will work only if 
>>>>>> you've correctly categorized "whinny" as a verb that can take a 
>>>>>> preposition.  Are your clustering algos good enough yet to correctly 
>>>>>> cluster rare words into appropriate verb categories?
>>>>>>
>>>>>> Second sentence .. "Jims" is a very uncommon name. Frankly, I've never 
>>>>>> heard of it as a name before.  Your training data is going to be 
>>>>>> extremely slim on this. And lack of training data means poor statistics, 
>>>>>> which means low scores.  Unless -- again, your clustering code is good 
>>>>>> enough to place "Jims" in a "proper name" cluster...
>>>>>>
>>>>>> "the lama snuffed blandly" -- "snuffed" is a very uncommon, almost 
>>>>>> archaic verb. These days, everyone spells "llama" with two l's, not one. 
>>>>>> Unless you're talking about Buddhist monks, it's a typo.
>>>>>>
>>>>>> "you understand?" is awkward: common in speech, uncommon in writing. 
>>>>>> It's unlikely that you'll have enough training data for this.
>>>>>>
>>>>>> "Willard" is an uncommon name. Does your training corpus have a 
>>>>>> sufficient number of mentions of Willard? Do you have clustering working 
>>>>>> well enough to stick "Willard" into a cluster with other names?
>>>>>>
>>>>>> "it is so with Sammy Jay" is clearly archaic English.
>>>>>>
>>>>>> "he hasn't any relations here" is clearly archaic, an olde-fashioned 
>>>>>> construction.
>>>>>>
>>>>>> "Pivi said not one word" - again, a clearly old-fashioned construction. 
>>>>>> Does the training set contain enough examples of "Pivi" to recognize it 
>>>>>> as a name? Are names clustering correctly?
>>>>>>
>>>>>> Any sentence with an inversion is going to sound old-fashioned. All of 
>>>>>> the sentences in that corpus sound old-fashioned, which maybe is OK if 
>>>>>> you are training on 19th-century Gutenberg texts, but it's certainly 
>>>>>> not modern English.  Even when I was a child, and I read those old 
>>>>>> crumbly-yellow paper adventure books, part of the fun was that no one 
>>>>>> actually talked that way -- not at school, not at home, not on TV. It 
>>>>>> was clearly from a different time and place -- an adventure.
>>>>>>
>>>>>> Anyway -- you only indicate pair-wise word-links. Is the omission of 
>>>>>> disjuncts intentional? Also -- no hint of any word-classes or 
>>>>>> part-of-speech tagging? This is surely important to evaluate as well, or 
>>>>>> is this to be done in some other way?  i.e. to evaluate if "Pivi" was 
>>>>>> correctly clustered with other given names?  Or that lama/llama was 
>>>>>> clustered with other four-legged animals?
>>>>>>
>>>>>> Also -- I can't tell -- is it free of loops, or are loops allowed?  
>>>>>> Allowing loops tends to provide stronger, more accurate parses.  Loops 
>>>>>> act as constraints.
>>>>>>
>>>>>> -- Linas
>>>>>>
>>>>>> On Thu, Mar 21, 2019 at 11:09 PM Anton Kolonin @ Gmail 
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Linas, Andes, and whoever else understands both LG and English well enough.
>>>>>>>
>>>>>>> Attached are the first 100 sentences of the GC "gold standard", 
>>>>>>> manually checked based on LG parses.
>>>>>>>
>>>>>>> We are expecting more to come in the next two weeks.
>>>>>>>
>>>>>>> To enable that, please have a cursory review of the corpus and let us 
>>>>>>> know if there are corrections still needed, so that your corrections 
>>>>>>> can be used as a reference to fix the rest and keep going further.
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> -Anton
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "lang-learn" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>>>> an email to [email protected].
>>>>>>> To post to this group, send email to [email protected].
>>>>>>> To view this discussion on the web visit 
>>>>>>> https://groups.google.com/d/msgid/lang-learn/bde76364-a578-4ab8-8ac5-2f49f794072b%40gmail.com.
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> cassette tapes - analog TV - film cameras - you
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -Anton Kolonin
>>>>> skype: akolonin
>>>>> cell: +79139250058
>>>>> [email protected]
>>>>> https://aigents.com
>>>>> https://www.youtube.com/aigents
>>>>> https://www.facebook.com/aigents
>>>>> https://medium.com/@aigents
>>>>> https://steemit.com/@aigents
>>>>> https://golos.blog/@aigents
>>>>> https://vk.com/aigents
>>>>
>>>>
>>>>
>>>>
>>
>>
>>
>>
>
>
>



-- 
Ben Goertzel, PhD
http://goertzel.org

"Listen: This world is the lunatic's sphere,  /  Don't always agree
it's real.  /  Even with my feet upon it / And the postman knowing my
door / My address is somewhere else." -- Hafiz

-- 
You received this message because you are subscribed to the Google Groups 
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/opencog/CACYTDBc86WkLfWNDtO%2BUE8tv5fDAg1P22i3KeWqW7uV1xb-0gQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
