Piaget, Logan, et al.: We have had some interesting discussions about which parsing method is best and fastest, but is it even possible?!
My own big wake-up call came many years ago, when I recorded a class I presented and had it transcribed with the instructions "don't edit it, just transcribe what I said." The transcript was FULL of fragments, missing words, and even misstatements, but the class had had NO problem grokking what I said. Similarly, take any unedited posting (you can easily recognize editing by the absence of ANY spelling errors) and try hand-diagramming its sentences. Written sentences will be better than spoken ones, but you will still have problems with around half of them.

Several early NL projects set out with dictionaries that identified every part of speech each word could be, and programmatically searched for a set of assignments under which each sentence would hang together. Unfortunately, few sentences had exactly one solution, and the presence of any presumed (omitted) words fractured the entire process. More recently, "ontological" approaches have attempted to sub-divide the parts of speech, e.g. identifying whether a particular noun can have color, weight, etc., to assist in assigning the targets of adjectives and adverbs.

The present consensus seems to be that speech is made to a particular audience with a particular body of presumed knowledge available to fill in the gaps, and that an automated listener/reader will NOT be able to understand "plain English" without real-world experience similar to an intended reader's. Without that experience, lots of gaps and disambiguation errors will persist regardless of how much programming effort is expended. Language translation can skirt many or most of these issues by preserving the semantic ambiguities in the translation, letting the reader/listener figure out what the computer failed to figure out.

No, there will never be "full understanding," if for no other reason than that some of what I say simply doesn't make sense. Instead, what can be done, and what is needed for present applications, are various forms of partial understanding.
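To make the dictionary-driven approach concrete, here is a toy sketch (purely illustrative, my own construction, not any of those projects' actual code) of why exhaustive part-of-speech assignment fails to yield a unique reading. The mini-lexicon and the crude "hangs together" grammar check are both hypothetical:

```python
from itertools import product

# Hypothetical mini-dictionary: word -> the parts of speech it could be.
LEXICON = {
    "time":  {"NOUN", "VERB"},
    "flies": {"NOUN", "VERB"},
    "like":  {"VERB", "PREP"},
    "an":    {"DET"},
    "arrow": {"NOUN"},
}

def hangs_together(tags):
    """Crude well-formedness check: exactly one VERB, and every
    determiner must be immediately followed by a NOUN."""
    if tags.count("VERB") != 1:
        return False
    for i, t in enumerate(tags):
        if t == "DET" and (i + 1 >= len(tags) or tags[i + 1] != "NOUN"):
            return False
    return True

def consistent_readings(sentence):
    """Enumerate every tag assignment under which the sentence 'hangs
    together' -- the exhaustive search the early NL projects relied on."""
    words = sentence.split()
    options = [sorted(LEXICON[w]) for w in words]
    return [tags for tags in product(*options) if hangs_together(list(tags))]

# Even this five-word sentence admits three consistent readings,
# not one -- which is exactly the problem described above.
for reading in consistent_readings("time flies like an arrow"):
    print(reading)
```

Note that one of the three surviving readings treats "time flies" as noun+verb and another as verb+noun (the imperative "time flies!"), so the ambiguity is genuinely semantic, not just a defect of the toy grammar.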
You can see this by throwing some numerical problems at WolframAlpha.com and watching how it parses them. It picks out key words and tries ways of relating them to each other. Similarly, DrEliza.com picks out key words and phrases that are associated with the symptoms and conditions it knows about.

The MOST important part of "understanding" is often identifying what the writer does NOT know (and the computer does know), a sort of reverse analysis. I refer to these as "statements of ignorance," and they are an important part of DrEliza.com.

My parsing proposal was made as a component in a larger system in support of problem solving and sales (it is just one box among many in figure 1 of my patent application), but the approach appears to be general purpose and applicable to other applications. Given that a universal parser appears to be impossible until it can walk among us, and even then will have some problems, each application must consider what it needs to obtain from the text/speech to do its job. So, when comparing the performance of parsers, it is important to disambiguate just WHAT is being performed, e.g. just WHAT is "parsing," and which applications a particular approach will work best for.

Logan, what do you see as the "best fit" applications for reverse ascent descent parsing?

Piaget, what do you see as the "best fit" applications for LA parsing?

Any thoughts?

Steve

-------------------------------------------
AGI Archives: https://www.listbox.com/member/archive/303/=now
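P.S. A minimal sketch of the kind of keyword/phrase spotting I mean, in the spirit of what DrEliza.com does but NOT its actual implementation. The condition names and phrase lists are hypothetical stand-ins:

```python
# Hypothetical phrase table: condition -> phrases that suggest it.
# A real system would be far larger and handle negation, misspellings, etc.
CONDITION_PHRASES = {
    "chronic fatigue": ["always tired", "no energy", "exhausted"],
    "insomnia":        ["can't sleep", "awake all night"],
}

def spot_conditions(text):
    """Return (sorted) conditions whose associated phrases appear in the
    text -- partial understanding via keyword/phrase spotting, with no
    attempt at a full sentence parse."""
    text = text.lower()
    return sorted(
        condition
        for condition, phrases in CONDITION_PHRASES.items()
        if any(phrase in text for phrase in phrases)
    )

print(spot_conditions("I am always tired and I can't sleep at night"))
# -> ['chronic fatigue', 'insomnia']
```

Note that this extracts what the application needs from ungrammatical, fragmentary input that would defeat a full parser, which is the point: the "parsing" being performed is defined by the application.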
