I am very interested in parsing the constructions used in WordNet and 
Wiktionary glosses (i.e. definitions).  Here are some samples from WordNet 
online http://wordnet.princeton.edu/perl/webwn .  The glosses are 
parenthesized, and examples are in italics for those of you with rich text 
email editors.

(1) very simple patterns

ruble - (the basic unit of money in Tajikistan)
ruble, rouble (the basic unit of money in Russia)
lira, Maltese lira (the basic unit of money on Malta; equal to 100 cents)
lira, Turkish lira (the basic unit of money in Turkey)
lira, Italian lira (formerly the basic unit of money in Italy; equal to 100 
centesimi)
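
A rough sketch of how the simple pattern above might be captured, assuming the wrapped gloss lines have already been joined into one string. The regular expression and the slot names (place, former, subunit, subunit_count) are my own illustration, not anything WordNet provides:

```python
import re

# Hypothetical pattern covering only the "basic unit of money" glosses above.
CURRENCY_GLOSS = re.compile(
    r"^(?P<former>formerly )?the basic unit of money (?:in|on) (?P<place>[^;]+)"
    r"(?:; equal to (?P<count>\d+) (?P<subunit>\w+))?$"
)


def parse_currency_gloss(gloss):
    """Return a dict of slots, or None if the gloss does not fit the pattern."""
    match = CURRENCY_GLOSS.match(gloss.strip())
    if match is None:
        return None
    count = match.group("count")
    return {
        "place": match.group("place"),
        "former": match.group("former") is not None,
        "subunit": match.group("subunit"),
        "subunit_count": int(count) if count else None,
    }


if __name__ == "__main__":
    print(parse_currency_gloss("the basic unit of money in Russia"))
    print(parse_currency_gloss(
        "formerly the basic unit of money in Italy; equal to 100 centesimi"))
```

Anything that does not fit the template simply returns None, which is where a more general construction grammar would have to take over.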

(2) complex constructions

break (terminate) "She interrupted her pregnancy"; "break a lucky streak"; 
"break the cycle of poverty"
break, separate, split up, fall apart, come apart (become separated into pieces 
or fragments) "The figurine broke"; "The freshly baked loaf fell apart"
break (render inoperable or ineffective) "You broke the alarm clock when you 
took it apart!"
break, bust (ruin completely) "He busted my radio!"
break (destroy the integrity of; usually by force; cause to separate into 
pieces or fragments) "He broke the glass plate"; "She broke the match"
transgress, offend, infract, violate, go against, breach, break (act in 
disregard of laws, rules, contracts, or promises) "offend all laws of 
humanity"; "violate the basic laws or human civilization"; "break a law"; 
"break a promise"
break, break out, break away (move away or escape suddenly) "The horses broke 
from the stable"; "Three inmates broke jail"; "Nobody can break out--this 
prison is high security"
break (scatter or part) "The clouds broke after the heavy downpour"
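
Before any real parsing of the gloss text, each entry has to be split into its synonyms, its gloss, and its quoted examples. Here is a minimal sketch of that first step, assuming one entry per string (wrapped lines already joined) and no nested parentheses inside the gloss. The function name and the dictionary keys are hypothetical:

```python
import re

# Layout assumed from the samples: lemmas, optional " - ", (gloss), then examples.
ENTRY = re.compile(
    r"^(?P<lemmas>[^(]+?)\s*(?:-\s*)?\((?P<gloss>.*?)\)\s*(?P<examples>.*)$"
)


def parse_entry(line):
    """Split one entry into lemmas, gloss clauses, and example sentences."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    gloss = match.group("gloss")
    return {
        "lemmas": [lemma.strip() for lemma in match.group("lemmas").split(",")],
        "gloss": gloss,
        # Semicolons separate the clauses of a complex gloss.
        "gloss_clauses": [clause.strip() for clause in gloss.split(";")],
        # Examples are the double-quoted strings that follow the gloss.
        "examples": re.findall(r'"([^"]*)"', match.group("examples")),
    }


if __name__ == "__main__":
    print(parse_entry('break, bust (ruin completely) "He busted my radio!"'))
```

The hard part, of course, is what happens to each gloss clause afterwards; this only isolates the text that the grammar needs to handle.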

Having a dialog system gives one the ability to query a contributing user about 
otherwise confusing or circular glosses.  Plus one can always recurse into a 
session to understand a word or phrase used in a containing gloss.
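
The recursion could be sketched roughly as follows. This is only an illustration under my own assumptions: glosses are plain strings in a dictionary, "understanding" a word just means adding it to a set, and a circular or missing gloss becomes a question for the user. None of these names come from an existing system:

```python
def learn(word, glosses, known, in_progress=None):
    """Understand the words in a gloss before the word itself.

    Returns the words that would have to be asked of the user because
    their glosses are missing or circular.
    """
    if in_progress is None:
        in_progress = set()
    if word in known:
        return []
    if word in in_progress or word not in glosses:
        # Circular or missing gloss: fall back to asking the user.
        return [word]
    in_progress.add(word)
    questions = []
    for gloss_word in glosses[word].lower().split():
        questions.extend(learn(gloss_word, glosses, known, in_progress))
    in_progress.discard(word)
    # Assumes any questions raised above were answered in the dialog.
    known.add(word)
    return questions


if __name__ == "__main__":
    glosses = {"ruble": "unit of money", "money": "currency", "currency": "money"}
    known = {"unit", "of"}
    print(learn("ruble", glosses, known))
```

In this toy data "money" and "currency" define each other, so the sketch ends up asking about "money" once and then treats the rest as understood.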

And after the commonly occurring word senses from Wiktionary / WordNet glosses 
are understood and incorporated in the KB, it's on to Wikipedia.  For the 
latter, I'm closely monitoring the Cyc Foundation effort to link OpenCyc with 
Wikipedia topics.

-Steve

Stephen L. Reed 
Artificial Intelligence Researcher
http://texai.org/blog
http://texai.org
3008 Oak Crest Ave.
Austin, Texas, USA 78704
512.791.7860

----- Original Message ----
From: William Pearson <[EMAIL PROTECTED]>
To: [email protected]
Sent: Thursday, January 10, 2008 3:04:34 PM
Subject: Re: [agi] Incremental Fluid Construction Grammar released

 On 10/01/2008, Benjamin Goertzel <[EMAIL PROTECTED]> wrote:
> On Jan 10, 2008 10:26 AM, William Pearson <[EMAIL PROTECTED]>  wrote:
> > On 10/01/2008, Benjamin Goertzel <[EMAIL PROTECTED]> wrote:
> > > > I'll be a lot more interested when people start creating NLP  systems
> > > > that are syntactically and semantically processing statements  *about*
> > > > words, sentences and other linguistic structures and adding  syntactic
> > > > and semantic rules based on those sentences.
> >
> > Note the new emphasis ;-) You example didn't have statements  *about*
> > words, but new rules were inferred from word usage.
>
> Well, here's the thing.
>
> Dictionary text and English-grammar-textbook text are highly  ambiguous and
> complex English... so you'll need a very sophisticated NLP system to  be able
> to grok them...

Firstly, so what? Why not allow for the fact that there will hopefully
be a sophisticated NLP system in the system at some point? Give it the
hooks to use dictionary style acquisition, even if it won't for the
first x years of development. We are aiming for adult human-level in
the end, right? Not just a 5 year old.

It will make adding French or another language a whole lot quicker,
when it comes to that level. Retrofitting the ability may or may not
be easy at that stage. It would be better to figure out whether it is
easy or not before settling on an architecture. My hunch is that it
is not easy.

Secondly, I'm not buying that it is any more complex than dealing with
other domains. You easily get equal complexity dealing with
non-linguistic stuff. For example,

This is a battery
A battery can be part of a machine
Putting a battery in the battery holder gives the machine power

is as complex, if not more so, than

un- is a prefix
A prefix is the front part of a word
Adding un- to a word is equivalent to saying "not word."

What the system does after processing these different sets of
sentences is vastly different. A difference worth exploring before
settling on an architecture, IMO.

Not building the potential to have a capability into a baby-based AI,
even if it is not initially used, means when the AI is grown up it
still won't be able to have that capability. Unless you are relying on
it getting to the self-modifying code phase before the
asking-what-words-mean phase.


  Will

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&;







      