Steve, 
LA parsing gives you both the syntactic parse and the semantic parse 
concurrently.  
I don't know where you got the idea that you don't get the semantics as well. 
The semantics comes as frames in the form of "proplets", i.e., semantic frame 
attributes.
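For readers following along: in Hausser's Database Semantics, a proplet is a flat, order-free bundle of attribute-value pairs, and words are connected by address (shared values plus a proposition number) rather than by tree embedding. A toy sketch follows; the attribute names echo the books' convention, but this particular data structure is my own illustration, not PM's or Hausser's implementation:

```python
# A toy "proplet": a flat attribute-value record that carries syntactic
# and semantic information for one word.  Proplets are linked by address
# (shared values and a proposition number), not by tree embedding.
def make_proplet(sur, core_attr, core_value, prn, fnc=None, arg=None):
    """Build a proplet for the surface form `sur`."""
    return {
        "sur": sur,             # surface (the word as it appeared)
        core_attr: core_value,  # core attribute, e.g. "noun" or "verb"
        "fnc": fnc,             # functor this word serves (for nouns)
        "arg": arg,             # arguments this word takes (for verbs)
        "prn": prn,             # proposition number shared by the set
    }

# "Julia sleeps" as two cross-referencing proplets:
julia = make_proplet("Julia", "noun", "julia", prn=1, fnc="sleep")
sleeps = make_proplet("sleeps", "verb", "sleep", prn=1, arg=["julia"])
```

Because the proplets reference each other by value (`fnc`/`arg`) and share a proposition number, they can sit in a flat database and be retrieved without walking a parse tree.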
~PM

Date: Sat, 23 Mar 2013 08:57:34 -0700
Subject: Re: [agi] 40 years of parsing NL...
From: [email protected]
To: [email protected]

PM (and Logan),

You said in a previous posting that you have experience with L-A. What have you 
(or others) done with it?

I ask because once you sidestep semantic units, it seems to me like you have 
thrown the baby out with the bathwater, at least for the usual applications 
needing some degree of "understanding". Maybe I just haven't noticed a good 
application that doesn't need semantic units, or I haven't understood a good 
way to live without them. Sure, you can "parse" while ignoring them, but then of 
what use is the resulting parse?!!!


Idioms (of which there are thousands) are a sort of ill-behaved semantic unit. 
How do you handle idioms while sidestepping semantic units?
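For concreteness, one common way to treat idioms as semantic units is a greedy longest-match scan over word sequences before (or alongside) parsing; the idiom table below is invented purely for illustration:

```python
IDIOMS = {  # invented examples; a real table would hold thousands
    ("kick", "the", "bucket"): "die",
    ("throw", "in", "the", "towel"): "give_up",
}
MAX_LEN = max(len(k) for k in IDIOMS)

def tag_idioms(words):
    """Greedy longest-match scan: replace each idiom span with one unit."""
    out, i = [], 0
    while i < len(words):
        # Try the longest possible idiom starting at position i first.
        for n in range(min(MAX_LEN, len(words) - i), 1, -1):
            key = tuple(words[i:i + n])
            if key in IDIOMS:
                out.append(IDIOMS[key])
                i += n
                break
        else:
            out.append(words[i])
            i += 1
    return out

tag_idioms("he did kick the bucket".split())  # → ["he", "did", "die"]
```

The catch, of course, is exactly the one raised above: this only works if the idiom table survives as a first-class data structure, rather than being dissolved into grammar rules.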

Logan: Have you been following this discussion? RADP is close enough to what I 
am planning to have the same semantic unit needs. Can you help make sense of 
this? 


What (if anything) am I missing here?

Steve
=================
On Fri, Mar 22, 2013 at 7:08 PM, Steve Richfield <[email protected]> 
wrote:

PM,

On Fri, Mar 22, 2013 at 5:27 PM, Piaget Modeler <[email protected]> 
wrote:

Actually, it's more than making a chatbot. It's having a real robot respond to 
a person based on linking utterances (made by either the robot or the person) to 
the current context (milieu entities and events). 



I think before you make your Worldcomp presentation it would behoove you to 
read the NEWCAT and Computation of Language books so that you can adequately 
articulate the differences in your approach.



We seem to be talking past each other here. My presentation at Worldcomp need 
not compare with anything, most especially character-based methods that don't 
seem to even recognize what parsing applications need from a parser, let alone 
squarely address how to provide what those applications need. There is 
SO much that these methods don't, at first glance, address.



Each parsing method seems to need a champion, and you seem to be the resident 
champion for L-A grammar here. I know you want to just send me some hyperlinks 
and tell me to go away and read some books, but here on this forum we each 
learn our own particular areas, and defend against stones tossed by people 
defending nearby areas. I tossed a stone your way when I claimed blinding 
speed. You tossed a stone back when you explained that all that was needed to 
parse was to move about through the L-A map of English grammar. I tossed the 
stone back, pointing out that losing the semantic elements (many of which are 
idioms that don't make much grammatical sense) throws the baby out with the 
bathwater, because applications (other than machine translation) are only 
interested in semantics, not syntax. Dragging semantics out of a parse tree is 
a really BIG job, requiring the SAME tests as other parsing methods. Sure, you 
produce a parse in a hurry by not doing the job of other parsers, but then 
doing that job loses the speed advantage.



To illustrate some of the challenges, I took a large idiom dictionary and tried 
looking up idioms that I commonly use in everyday speech, and only found about 
half of them. So much for quality control. How does L-A deal with idioms? Once 
you have discarded the low-level semantic elements as part of putting words 
into parse trees, recognizing idioms could become quite difficult. Further, 
many idioms are ungrammatical. Are you planning to include idioms as part of 
the map of the language?!!!



Anyway, I **DO** want to understand L-A enough to see if it is significant, or 
have you understand my method enough to be able to compare the two, so we can 
both see the relationships between these two VERY different things.



Steve



Date: Fri, 22 Mar 2013 15:30:59 -0700
Subject: Re: [agi] 40 years of parsing NL...
From: [email protected]
To: [email protected]

PM,

This guy is talking about a different approach for making a chatbot - right? If 
so, he doesn't show any indication of knowing about present chatbots. Present 
technology is to have a variety of sentence skeletons, into which appropriate 
words and phrases are placed, which seems to work quite well.
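The skeleton-filling approach described above can be sketched in a few lines; the skeletons and slot fillers here are invented placeholders, not any particular chatbot's data:

```python
import random

# Invented sentence skeletons with named slots.
SKELETONS = [
    "I think {topic} is {opinion}.",
    "Have you considered {topic}?",
]
# Invented phrases appropriate to each slot.
FILLERS = {
    "topic": ["parsing", "semantics"],
    "opinion": ["overrated", "essential"],
}

def respond():
    """Pick a skeleton and fill each slot with an appropriate phrase."""
    skeleton = random.choice(SKELETONS)
    # str.format ignores keyword arguments for slots a skeleton lacks.
    return skeleton.format(**{k: random.choice(v) for k, v in FILLERS.items()})
```

The appeal of the technique is that grammaticality is guaranteed by the skeletons themselves, so no parsing is needed on the generation side at all.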

I would think that promoting a technology would best be done with FREE 
documents and other supporting material. I already have the 10,000 most 
commonly used words in a file in order of frequency of use, if you or anyone 
else wants a copy.

I believe that my approach will be fast enough to keep up with the Internet, 
and I haven't seen any other approach that promises such blinding speed. In 
theory, all I need do is get the word out, and wait for folks at Google, Yahoo, 
and Facebook to discover it, which is my present plan.

I also plan to present this at the next WORLDCOMP conference.

BTW, ***THANKS*** for holding my feet to the fire!!!  I plan to adapt these 
discussions into the paper I present at WORLDCOMP.

Steve
===================

On Fri, Mar 22, 2013 at 1:39 PM, Piaget Modeler <[email protected]> 
wrote:

Roland's next step:  
http://www.amazon.com/Computational-Linguistics-Talking-Robots-Processing/dp/3642224318/ref=sr_1_1?ie=UTF8&qid=1363984424&sr=8-1&keywords=talking+robots+roland+hausser

Computational Linguistics and Talking Robots: Processing Content in Database 
Semantics

Publication Date: September 14, 2011 | ISBN-10: 3642224318 | ISBN-13: 
978-3642224317 | Edition: 2011



The practical task of building a talking robot requires a theory of how natural 
language communication works. Conversely, the best way to computationally 
verify a theory of natural language communication is to demonstrate its 
functioning concretely in the form of a talking robot, the epitome of 
human–machine communication.

To build an actual robot requires hardware that provides appropriate 
recognition and action interfaces, and because such hardware is hard to 
develop, the approach in this book is theoretical: the author presents an 
artificial cognitive agent with language as a software system called Database 
Semantics (DBS). Because a theoretical approach does not have to deal with the 
technical difficulties of hardware engineering, there is no reason to simplify 
the system. Instead, the software components of DBS aim at completeness of 
function and of data coverage in word form recognition, syntactic–semantic 
interpretation and inferencing, leaving the procedural implementation of 
elementary concepts for later.

In this book the author first examines the universals of natural language and 
explains the Database Semantics approach. Then in Part I he examines the 
following natural language communication issues: using external surfaces; the 
cycle of natural language communication; memory structure; autonomous control; 
and learning. In Part II he analyzes the coding of content according to the 
aspects: semantic relations of structure; simultaneous amalgamation of content; 
graph-theoretical considerations; computing perspective in dialogue; and 
computing perspective in text. The book ends with a concluding chapter, a 
bibliography and an index.

The book will be of value to researchers, graduate students and engineers in 
the areas of artificial intelligence and robotics, in particular those who deal 
with natural language processing.

For you, Steve, the next step is to write a book about your approach and sell 
it for $100 a pop, or $75 for the e-book, and do a book tour (if possible).

Then gain some early adopters and market traction.
The point is to make money WHILE promoting your idea. 

Cheers,
~PM
Date: Fri, 22 Mar 2013 12:13:23 -0700
Subject: [agi] 40 years of parsing NL...
From: [email protected]
To: [email protected]

Piaget, Logan, et al,

We have had some interesting discussions about which method is best and 
fastest, but is it even possible?!!!

My own big wake-up call came many years ago, when I recorded a class I 
presented, and had it transcribed with instructions "don't edit it, just 
transcribe what I said". It was FULL of fragments, missing words, and even 
misstatements, but the class had NO problem grokking what I had said.

Similarly, just take any unedited posting (you can easily recognize editing by 
the lack of ANY spelling errors) and try hand-diagramming its sentences. They 
will be better than spoken sentences, but still, you will have problems with 
around half of them.

Several early NL projects set out with dictionaries that identified every part 
of speech that each word could be, and programmatically set about identifying a 
set of assumptions wherein each sentence would hang together. Unfortunately, 
few sentences had exactly one solution, and the presence of any presumed words 
fractured the entire process.

More recently, "ontological" approaches have attempted to sub-divide the parts 
of speech, e.g. identifying whether a particular noun can have color, weight, 
etc., to assist in assigning the targets of adjectives and adverbs.
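Such ontological checks amount to selectional restrictions: before attaching an adjective to a noun, test whether the noun's class supports that kind of property. A hypothetical sketch (the tiny ontology and adjective table are invented):

```python
# Invented ontology: which attribute classes each noun supports.
ONTOLOGY = {
    "rock": {"color", "weight"},
    "idea": set(),              # abstract nouns take neither
}
# Invented mapping from adjectives to the attribute class they assign.
ADJECTIVE_CLASS = {"green": "color", "heavy": "weight"}

def can_modify(adjective, noun):
    """True if the noun's ontological class licenses the adjective."""
    return ADJECTIVE_CLASS.get(adjective) in ONTOLOGY.get(noun, set())

can_modify("heavy", "rock")   # plausible attachment
can_modify("green", "idea")   # "colorless green ideas": rejected
```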

The present consensus seems to be that speech is made to a particular audience 
with a particular set of presumed knowledge to use to fill in the gaps, and an 
automated listener/reader will NOT be able to understand "plain English" 
without real-world experience similar to an intended reader's. Without that 
experience, lots of gaps and disambiguation errors will persist regardless of 
how much programming effort is expended.

Language translation can skirt many/most of these issues, by preserving the 
semantic ambiguities in the translation, to let the reader/listener figure out 
what the computer failed to figure out.

No, there will never ever be "full understanding", if for no other reason than 
that some of what I say simply doesn't make sense. Instead, what can be done, 
and what is needed for present applications, are various forms of partial 
understanding. You can see this by throwing some numerical problems at 
WolframAlpha.com and watching how it parses them. It picks out key words and 
tries ways of relating them to each other. Similarly, DrEliza.com picks out key 
words and phrases that are associated with symptoms and conditions it knows 
about.
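Keyword-and-phrase spotting of the kind described above can be sketched as a simple pattern table; the symptom phrases and conditions below are invented stand-ins, not DrEliza.com's actual data:

```python
import re

# Invented symptom phrases mapped to the conditions they suggest.
PATTERNS = {
    "cold hands": "poor circulation",
    "always tired": "possible CFS",
}

def spot(text):
    """Return every condition whose trigger phrase appears in the text."""
    found = []
    for phrase, condition in PATTERNS.items():
        # Word boundaries keep "cold hands" from matching "scold handsome".
        if re.search(r"\b" + re.escape(phrase) + r"\b", text.lower()):
            found.append(condition)
    return found

spot("I have cold hands and I'm always tired.")
```

Note that no sentence-level parse is ever built; the "understanding" is exactly as deep as the pattern table, which is the trade-off this whole thread is about.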

The MOST important part of "understanding" is often identifying what the writer 
does NOT know (and the computer does know), sort of a reverse analysis. I refer 
to these as "statements of ignorance", and they are an important part of 
DrEliza.com.
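A caricature of the "statements of ignorance" idea, assuming the system keeps a list of factors it knows are relevant to a topic (all data here is invented): report the relevant factors the writer never mentions.

```python
# Invented knowledge base: factors the system knows matter for a topic.
RELEVANT = {"hypoglycemia": {"diet", "sleep", "adrenal function", "caffeine"}}

def statements_of_ignorance(topic, text):
    """Return the relevant factors the writer never mentions."""
    mentioned = {f for f in RELEVANT[topic] if f in text.lower()}
    return RELEVANT[topic] - mentioned

statements_of_ignorance("hypoglycemia", "My diet is fine and I sleep well.")
```

A real system would need stemming and synonym handling rather than raw substring tests, but the reverse-analysis shape is the same: subtract what was said from what should have been said.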

My parsing proposal was made as a component in a larger system in support of 
problem solving and sales (it is just one box among many in figure 1 in my 
patent application). My approach appears to be general purpose and applicable 
to other applications. Given that a universal parser appears to be impossible 
until it can walk among us, and even then will have some problems, each 
application must consider what it needs to obtain from the text/speech to do 
its job.

So, when comparing the performance of parsers, it is important to disambiguate 
just WHAT is being performed, e.g. just WHAT is "parsing", and which 
applications a particular approach will work best for.

Logan, what do you see as the "best fit" applications for reverse ascent 
descent parsing?

Piaget, what do you see as the "best fit" applications for LA parsing?

Any thoughts?

Steve

-- 
Full employment can be had with the stroke of a pen. Simply institute a six-hour 
workday. That will easily create enough new jobs to bring back full employment.


-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/21088071-f452e424
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=21088071&id_secret=21088071-58d57657
Powered by Listbox: http://www.listbox.com
