[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-06-15 Thread Linas Vepstas
On Mon, May 6, 2019 at 12:30 AM Ben Goertzel wrote: > > > On Sun, May 5, 2019 at 10:15 PM Anton Kolonin @ Gmail > wrote: > >> Hi Linas, I am re-reading your emails and updating our TODO issues from >> some of them. >> >> Not sure about this one: >> >Did Deniz Yuret falsify his thesis data? He

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-05-09 Thread andres
Anton, sequential and random parses are in D56 and D57. Or do you want specifically the ones for GS and SS? If so, please tell me where you want them, to avoid messing with your file structure. Yes, the mix of distance and MI is what we have been doing when we use the distance

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-05-07 Thread Ben Goertzel
I don't think we want an arithmetic average of distance and MI, maybe more like f(1) = C > 1, f(1) > f(2) > f(3) > f(4), f(4) = f(5) = ... = 1, and then f(distance) * MI, i.e. maybe we count the MI significantly more if the distance is small... but if MI is large and distance is large, we still
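
A minimal sketch of the weighting suggested in the message above: a factor f(distance) that equals some C > 1 at distance 1, decreases up to distance 4, stays at 1 from there on, and multiplies the MI. The linear interpolation between C and 1, the default C = 2.0, and the names distance_weight and weighted_mi are assumptions made for illustration; the message only fixes the endpoints and the ordering.

```python
def distance_weight(distance: int, c: float = 2.0, cutoff: int = 4) -> float:
    """Return f(distance): c at distance 1, decreasing to 1.0 at `cutoff` and beyond.

    The linear shape between 1 and `cutoff` is an assumption; the thread only
    requires f(1) = C > 1, f(1) > f(2) > f(3) > f(4), and f(4) = f(5) = ... = 1.
    """
    if distance < 1:
        raise ValueError("distance must be >= 1")
    if c <= 1.0:
        raise ValueError("c must be greater than 1")
    if distance >= cutoff:
        return 1.0
    # Linear interpolation from c (at distance 1) down to 1.0 (at cutoff).
    return c - (c - 1.0) * (distance - 1) / (cutoff - 1)


def weighted_mi(mi: float, distance: int, c: float = 2.0) -> float:
    """Scale a word-pair MI value by the distance weight: f(distance) * MI."""
    return distance_weight(distance, c) * mi
```

With this shape, a large MI at a large distance is kept unchanged (factor 1), while the same MI at distance 1 is multiplied by C.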

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-05-05 Thread Ben Goertzel
On Sun, May 5, 2019 at 10:15 PM Anton Kolonin @ Gmail wrote: > Hi Linas, I am re-reading your emails and updating our TODO issues from > some of them. > > Not sure about this one: > >Did Deniz Yuret falsify his thesis data? He got better than 80% accuracy; > we should too. > > I don't recall

Re: [opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-05-01 Thread Linas Vepstas
On Fri, Apr 26, 2019 at 11:57 AM Sarah Weaver wrote: > Hey did my last message show up in spam again? :P > The above is the full text of what I received from you, and nothing more. --linas -- cassette tapes - analog TV - film cameras - you

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-05-01 Thread Linas Vepstas
On Wed, Apr 24, 2019 at 9:31 PM Anton Kolonin @ Gmail wrote: > Ben, Linas, here is full set of results generated by Alexey: > > Results update: > My gut intuition is that the most interesting numbers would be this: > MWC(GT) MSL(GT) PA F1 > > 5 2 > 5 3 > 5 4 > 5 5

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-05-01 Thread Linas Vepstas
Hi Anton, sorry for very late reply. On Tue, Apr 23, 2019 at 8:25 PM Anton Kolonin @ Gmail wrote: > Linas, how would you "weight the disjuncts"? > > We know how to weight the words (by frequency), and word pairs (by MI). > > But how would you weight the disjuncts? > That is a very good
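
For reference, the word-pair weighting mentioned in the quoted question (MI) can be sketched as below. This uses the standard pointwise mutual information computed from pair counts; whether the project uses exactly this definition is an assumption, and the function name pair_mi and the input format are hypothetical. It does not address how to weight disjuncts.

```python
import math
from collections import Counter


def pair_mi(pair_counts: dict) -> dict:
    """Pointwise MI (in bits) for each observed (left, right) word pair.

    MI(l, r) = log2( N(l, r) * N(*, *) / (N(l, *) * N(*, r)) ),
    where N(l, *) and N(*, r) are marginal counts over all observed pairs.
    Only pairs with a positive count should be passed in.
    """
    total = sum(pair_counts.values())
    left = Counter()
    right = Counter()
    for (l, r), n in pair_counts.items():
        left[l] += n
        right[r] += n
    return {
        (l, r): math.log2(n * total / (left[l] * right[r]))
        for (l, r), n in pair_counts.items()
    }
```

Pairs that co-occur more often than their marginal counts predict get a positive score, and statistically independent pairs score zero.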

Re: [opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-26 Thread Sarah Weaver
Hey did my last message show up in spam again? :P On Tue, Apr 23, 2019 at 4:45 PM Linas Vepstas wrote: > Hi Ben, > > On Tue, Apr 23, 2019 at 5:09 AM Ben Goertzel wrote: > >> *** >> Ah, well, hmm. It appears I had misunderstood. I did not realize that >> the input was 100% correct but

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-23 Thread Linas Vepstas
Hi Ben, On Tue, Apr 23, 2019 at 5:09 AM Ben Goertzel wrote: > *** > Ah, well, hmm. It appears I had misunderstood. I did not realize that > the input was 100% correct but unlabelled parses. In this case, > obtaining 100% accuracy is NOT surprising, it's actually just a proof > that the code is

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-23 Thread Linas Vepstas
On Tue, Apr 23, 2019 at 5:00 AM Ben Goertzel wrote: > > On Mon, Apr 22, 2019 at 11:18 PM Anton Kolonin @ Gmail < > akolo...@gmail.com> wrote: > >> > >> > >> We are going to repeat the same experiment with MST-Parses during this > week. > > > > > > The much more interesting experiment is to see

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-23 Thread Ben Goertzel
*** Ah, well, hmm. It appears I had misunderstood. I did not realize that the input was 100% correct but unlabelled parses. In this case, obtaining 100% accuracy is NOT surprising, it's actually just a proof that the code is reasonably bug-free. *** It's a proof that the algorithms embodied in

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-23 Thread Ben Goertzel
> On Mon, Apr 22, 2019 at 11:18 PM Anton Kolonin @ Gmail > wrote: >> >> >> We are going to repeat the same experiment with MST-Parses during this week. > > > The much more interesting experiment is to see what happens when you give it > a known percentage of intentionally-bad unlabelled parses.

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-22 Thread Linas Vepstas
On Mon, Apr 22, 2019 at 11:18 PM Anton Kolonin @ Gmail wrote: > > We are going to repeat the same experiment with MST-Parses during this > week. > The much more interesting experiment is to see what happens when you give it a known percentage of intentionally-bad unlabelled parses. I claim that

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-22 Thread Linas Vepstas
On Mon, Apr 22, 2019 at 10:48 PM Ben Goertzel wrote: > *** > Thank you! This is fairly impressive: it says that if the algo heard > a word five or more times, that was sufficient for it to deduce the > correct grammatical form! > *** > > Yes. What we can see overall is that, with the current

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-22 Thread Linas Vepstas
On Mon, Apr 15, 2019 at 9:02 PM Anton Kolonin @ Gmail wrote: > Ben, > > I'd be curious to see some examples of the sentences used in > > *** > 5 0 100.00% 1.00 - sentences with each word occurring 5+ > 10 0 100.00% 1.00 - sentences with each word occurring 10+ > 50

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-22 Thread Ben Goertzel
*** Thank you! This is fairly impressive: it says that if the algo heard a word five or more times, that was sufficient for it to deduce the correct grammatical form! *** Yes. What we can see overall is that, with the current algorithms Anton's team is using: If we have "correct" unlabeled

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-22 Thread Linas Vepstas
On Mon, Apr 15, 2019 at 11:18 AM Anton Kolonin @ Gmail wrote: > > > 1) Identical Lexical Entries (ILE) algorithm is "over-fitting" in fact, > so there is still a way to go before being able to learn "generalized grammars"; > Can you explain in detail what "Identical lexical entries" are? I can guess,

[opencog-dev] Re: Testing the same unsupervisedly learned grammars on different kinds of corpora

2019-04-22 Thread Linas Vepstas
Hi Anton, On Mon, Apr 15, 2019 at 11:18 AM Anton Kolonin @ Gmail wrote: > Ben, Linas, > > Let me comment on latest results, given LG-English parses are given as > input for Grammar Learner using Identical Lexical Entries (ILE) > algorithm and compared against the same input LG-English parses -