> On Mon, Apr 22, 2019 at 11:18 PM Anton Kolonin @ Gmail <[email protected]> wrote:
>>
>> We are going to repeat the same experiment with MST-Parses during this week.
>
> The much more interesting experiment is to see what happens when you give it
> a known percentage of intentionally-bad unlabelled parses. I claim that this
> step provides natural error-reduction and error-correction, but I don't know
> how much.
If we assume, roughly, that "insufficient data" has an effect similar to "noisy data", then the effect of adding intentionally-bad parses may be similar to the effect of having insufficient examples of the words involved... which we already know from Anton's experiments: accuracy degrades smoothly but steeply as the number of examples decreases below adequacy.

***

My claim is that this mechanism acts as an "amplifier" and a "noise filter" -- that it can take low-quality MST parses as input and still generate high-quality results. In fact, I make an even stronger claim: you can throw *really low quality data* at it -- something even worse than MST -- and it will still return high-quality grammars.

This can be explicitly tested now: take the 100% perfect unlabelled parses, and artificially introduce 1%, 5%, 10%, 20%, 30%, 40% and 50% random errors into them. What is the accuracy of the learned grammar? I claim that you can introduce 30% errors and still learn a grammar with greater than 80% accuracy. I think this is a very important point -- a key point -- but I cannot prove it.

***

Hmmm. So I am pretty sure you are right, given enough data. However, whether this holds at the magnitudes of data we are now looking at (the Gutenberg Children's Corpus, for example) is less clear to me.

Also, the current MST parses are much worse than "30% errors" relative to correct parses. So even if what you say is correct, it doesn't remove the need to improve the MST parses...

But you are right -- this will be an interesting and important set of experiments to run. Anton, I suggest you add it to the to-do list...

-- Ben
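[A minimal sketch of the noise-injection step proposed above, assuming an unlabelled parse is represented as a set of (i, j) word-index pairs. The function name and representation are illustrative only -- this is not the actual pipeline code.]

```python
import random

def corrupt_parse(links, error_rate, sentence_length, rng=random):
    """Replace a given fraction of the links in an unlabelled parse
    with random (but distinct) word-pair links.

    links           -- set of (i, j) word-index pairs with i < j
    error_rate      -- fraction of links to corrupt, e.g. 0.30
    sentence_length -- number of words in the sentence
    """
    links = list(links)
    n_bad = round(len(links) * error_rate)
    for k in rng.sample(range(len(links)), n_bad):
        # Draw random pairs until we find one not already in the parse,
        # so the corrupted parse keeps the same number of distinct links.
        while True:
            i, j = sorted(rng.sample(range(sentence_length), 2))
            if (i, j) not in links:
                links[k] = (i, j)
                break
    return set(links)

# Sweep the error rates proposed in the thread over a toy 10-word parse:
gold = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)}
for rate in (0.01, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50):
    noisy = corrupt_parse(gold, rate, 10, rng=random.Random(42))
    print(rate, len(gold & noisy), "of", len(gold), "links kept")
```

Feeding each corrupted corpus through the grammar learner and scoring the result would then give the accuracy-vs-noise curve the claim is about.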
