Hi Ben,

On Tue, Apr 23, 2019 at 5:09 AM Ben Goertzel <[email protected]> wrote:
> ***
> Ah, well, hmm. It appears I had misunderstood. I did not realize that
> the input was 100% correct but unlabelled parses. In this case,
> obtaining 100% accuracy is NOT surprising; it's actually just a proof
> that the code is reasonably bug-free.
> ***
>
> It's a proof that the algorithms embodied in this portion of the code
> are actually up to the task. Not just a proof that the code is
> relatively bug-free, except in a broad sense of "bug" as "algorithm
> that doesn't fulfill the intended goals".

Recently, one week of my time was sucked into a black hole. I read all
six papers from the latest Event Horizon Telescope announcement. Five
and a half of those papers are devoted to describing the EHT, and to
proving that it works correctly. The actual results are just one photo,
and a few paragraphs explaining the photo. And that is what made it
into the mainstream press.

I'd like to see the same mind-set here: a lot more effort put into
characterizing exactly what it is that is being done, and proving that
it works as expected, where "expected == intuitive explanation of why
it works". So, yes, characterizing the stage that moves from unlabeled
parses to labeled parses is really important. If you want to sound like
a professional scientist, then write that up in detail, i.e. prove that
your experimental equipment works. That's what the EHT people did; we
can do it too.

> ***
> Such proofs are good to have, but it's not theoretically interesting.
> ***
>
> I think it's theoretically somewhat interesting, because there are a
> lot of possible ways to do clustering and grammar-rule learning, and
> now we know a specific combination of clustering algorithm and
> grammar-rule learning algorithm that actually works (if the input
> dependency parses are good).

Yes. Despite all the spreadsheets, PDFs and github issues that Anton
has aimed my way, I still do not understand what this "specific
combination of clustering algorithm and grammar rule learning
algorithm" actually is.
I've got a vague impression, but not enough of one to be able to
reproduce that work. Which is funny, because, as an insider, I wrote
half the code that is being used as ingredients. So I should be in a
prime position to understand what is being done ... but I don't. This
still needs to be fixed. It deserves an EHT-level-quality write-up.

> Then the approach would be

I don't want to comment on this part, because I've already commented on
it before. If there is an accuracy problem, it has got nothing to do
with the accuracy of MST. The accuracy of MST should NOT affect the
final results! If the accuracy of MST is impacting the final results,
then some other part of the pipeline is not working correctly!

In a real radio telescope, the very first transistor in the antenna
dominates the signal-to-noise ratio, and provides about 3 dB of
amplification. 3 dB is equal to one binary bit: 10^0.3 == 2^1, i.e. two
to the power one, a one-bit decrease in entropy. All the data
processing happens after that first transistor.

MST is like that first transistor. It's gonna be shitty. If the
downstream stages (the disjunct processing) aren't working right, then
you get no worthwhile results. Focus on the downstream; characterize
the operation of the downstream. Quit obsessing over MST; it's a waste
of time.

--linas

--
cassette tapes - analog TV - film cameras - you

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA34BzxwmJMeMLT%2Byd_ih14RE6Y3S86XMPKEtCTG7URQKmA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
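P.S. For anyone who wants to sanity-check the decibel arithmetic in the transistor analogy, here it is as a few lines of plain Python (standard library only, nothing OpenCog-specific; the variable names are just for illustration):

```python
import math

# Decibels measure power ratios: dB = 10 * log10(P_out / P_in).
# So a 3 dB gain is a power ratio of 10**(3/10), which is almost
# exactly 2 -- a doubling, i.e. one binary bit:
power_ratio = 10 ** 0.3        # ~1.995, close to 2
bits = math.log2(power_ratio)  # ~0.997, close to 1 bit

print(round(power_ratio, 3), round(bits, 3))  # prints: 1.995 0.997
```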
