Re: [OpenCog] Re: [agi] constructivist issues

2008-10-22 Thread Mark Waser
 I think this would be a relatively pain-free way to communicate with an AI 
 that lacks the common sense to carry out disambiguation and reference 
 resolution reliably.   Also, the log of communication would provide a nice 
 training DB for it to use in studying disambiguation.

Awesome.  Like I said, it's a piece of something that I'm trying currently.  If 
I get positive results, I'm certainly not going to hide the fact.  ;-)

(or, it could turn into a learning experience like my attempts with Simplified 
English and Basic English :-)
  - Original Message - 
  From: Ben Goertzel 
  To: agi@v2.listbox.com 
  Cc: [EMAIL PROTECTED] 
  Sent: Wednesday, October 22, 2008 12:27 PM
  Subject: [OpenCog] Re: [agi] constructivist issues



  This is the standard Lojban dictionary

  http://jbovlaste.lojban.org/

  I am not so worried about word meanings; they can always be handled via 
reference to WordNet, with usages like run_1, run_2, etc. ... or, as you say, by 
using rarer, less ambiguous words.

  Prepositions are more worrisome; however, I suppose they can be handled in a 
similar way, e.g. by defining an ontology of preposition meanings like with_1, 
with_2, with_3, etc.

  In fact, we had someone spend a couple of months integrating existing 
resources into a preposition-meaning ontology like this a while back ... the 
so-called PrepositionWordNet ... or, as it eventually came to be called, the 
LARDict or LogicalArgumentRelationshipDictionary ...

  I think it would be feasible to tweak RelEx to recognize these sorts of 
subscripts, and in this way to recognize a highly controlled English that would 
be unproblematic to map semantically...

  We would then say e.g.

  I ate dinner with_2 my fork

  I live in_2 Maryland

  I have lived_6 for_3 41 years

  (where I suppress all _1's, so that e.g. ate means ate_1)
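
  A minimal sketch of how such subscripted tokens could be read, assuming a 
token is a plain word optionally followed by _N. This is an illustration in 
Python, not RelEx code; the name parse_senses and the token pattern are 
hypothetical. Untagged words default to sense 1, as in the rule above.

```python
import re

# Hypothetical token format: letters, then an optional _<sense number>.
TOKEN = re.compile(r"^(?P<word>[A-Za-z]+)(?:_(?P<sense>\d+))?$")


def parse_senses(sentence):
    """Return (word, sense) pairs; an untagged word defaults to sense 1."""
    pairs = []
    for raw in sentence.split():
        match = TOKEN.match(raw)
        if match is None:
            # Numbers and other non-word tokens carry no sense tag.
            pairs.append((raw, None))
            continue
        sense = int(match.group("sense") or 1)
        pairs.append((match.group("word"), sense))
    return pairs


print(parse_senses("I have lived_6 for_3 41 years"))
```

  The sense numbers only mean something relative to a sense inventory (WordNet 
for words, a preposition ontology for prepositions); the sketch does not look 
anything up.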

  This works because RelEx already happily parses the syntax of all simple 
sentences, so the only real hassle to deal with is disambiguation.   We could 
use similar hacking for reference resolution, temporal sequencing, etc.

  The terrorists_v1 robbed_v2 my house.   After that_v2, the jerks_v1 urinated 
in_3 my yard.  
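
  The reference tags in this example can be grouped mechanically. The sketch 
below assumes that words sharing the same _vN tag refer to the same entity or 
event, which is one reading of the example; the function name 
coreference_groups is hypothetical, and sense tags such as in_3 are 
deliberately ignored here.

```python
import re
from collections import defaultdict

# Matches word_vN reference tags only; plain sense tags like in_3 do not match.
REF = re.compile(r"([A-Za-z]+)_v(\d+)")


def coreference_groups(text):
    """Group words by their _vN tag, preserving order of first appearance."""
    groups = defaultdict(list)
    for word, index in REF.findall(text):
        groups[int(index)].append(word)
    return dict(groups)


text = ("The terrorists_v1 robbed_v2 my house. "
        "After that_v2, the jerks_v1 urinated in_3 my yard.")
print(coreference_groups(text))
# {1: ['terrorists', 'jerks'], 2: ['robbed', 'that']}
```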

  I think this would be a relatively pain-free way to communicate with an AI 
that lacks the common sense to carry out disambiguation and reference 
resolution reliably.   Also, the log of communication would provide a nice 
training DB for it to use in studying disambiguation.

  -- Ben G



  On Wed, Oct 22, 2008 at 12:00 PM, Mark Waser [EMAIL PROTECTED] wrote:

 IMHO that is an almost hopeless approach; ambiguity is too integral to 
English or any natural language ... e.g. preposition ambiguity

Actually, I've been making pretty good progress.  You just always use big 
words and never use small words and/or you use a specific phrase as a word.  
Ambiguous prepositions just disambiguate to one of three/four/five/more 
possible unambiguous words/phrases.
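
One way to picture this substitution is a lookup table from each ambiguous 
small word to its unambiguous replacements. The sketch below is only an 
illustration: the replacement phrases and the names UNAMBIGUOUS and rewrite 
are invented for the example and are not taken from any actual word list.

```python
# Hypothetical table: these replacement phrases are illustrations only.
UNAMBIGUOUS = {
    "with": ["accompanied by", "by means of", "possessing"],
    "in": ["inside of", "located within", "during"],
}


def rewrite(sentence, choices):
    """Replace each ambiguous word using a {word: option index} choice map."""
    out = []
    for word in sentence.split():
        key = word.lower()
        options = UNAMBIGUOUS.get(key)
        if options is None:
            out.append(word)  # Not in the table, so kept as-is.
        elif key in choices:
            out.append(options[choices[key]])
        else:
            raise ValueError(f"no replacement chosen for ambiguous word {word!r}")
    return " ".join(out)


print(rewrite("I ate dinner with my fork", {"with": 1}))
# I ate dinner by means of my fork
```

The choice of option still has to come from somewhere (here, the writer), 
which is the hard part on the input side.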

The problem is that most previous subsets (Simplified English, Basic 
English) actually *favored* the small, tremendously over-used/ambiguous words 
(because you got so much more bang for the buck with them).

Try only using big unambiguous words and see if you still have the same 
opinion.  

 If you want to take this sort of approach, you'd better start with 
Lojban instead.  Learning Lojban is a pain, but far less pain than you'll 
have trying to make a disambiguated subset of English.

My first reaction is . . . . Take a Lojban dictionary and see if you can 
come up with an unambiguous English word or very short phrase for each Lojban 
word.  If you can do it, my approach will work, and it will have the advantage 
that the output can be read by anyone (i.e. it's the equivalent of me having 
done it in Lojban and then added a Lojban-to-English translation on the end).  
The input is still *very* problematical, though (thus the need for a 
semantically-driven English-subset translator).  If you can't do it, then my 
approach won't work.

Can you do it?  Why or why not?  If you can, do you still believe that my 
approach won't work?  Oh, wait . . . . a Lojban-to-English dictionary *does* 
attempt to come up with an unambiguous English word or very short phrase for 
each Lojban word.  :-)

Actually, hmm . . . . a Lojban dictionary would probably help me focus my 
efforts a bit better and highlight things that I may have missed . . . . do you 
have a preferred dictionary or resource?  (Google has too many for me to do a 
decent perusal quickly)



  - Original Message - 
  From: Ben Goertzel 
  To: agi@v2.listbox.com 
  Sent: Wednesday, October 22, 2008 11:11 AM
  Subject: Re: [agi] constructivist issues







Personally, rather than starting with NLP, I think that we're going to 
need to start with a formal language that is a disambiguated subset of English 


  IMHO that is an almost hopeless approach, ambiguity is too integral to 
English or any natural language ... e.g

Re: [OpenCog] Re: [agi] constructivist issues

2008-10-22 Thread Mark Waser
 Well, I am confident my approach with subscripts to handle disambiguation 
 and reference resolution would work, in conjunction with the existing 
 link-parser/RelEx framework...
 If anyone wants to implement it, it seems like just some hacking with the 
 open-source Java RelEx code...

Like what I called a semantically-driven English-subset translator?

Oh, I'm pretty confident that it will work as well . . . . after the La Brea tar 
pit of implementations . . . . (exactly how little semantic-related coding do 
you think will be necessary? ;-)



  - Original Message - 
  From: Ben Goertzel 
  To: agi@v2.listbox.com 
  Cc: [EMAIL PROTECTED] 
  Sent: Wednesday, October 22, 2008 1:06 PM
  Subject: Re: [OpenCog] Re: [agi] constructivist issues



  Well, I am confident my approach with subscripts to handle disambiguation and 
reference resolution would work, in conjunction with the existing 
link-parser/RelEx framework...

  If anyone wants to implement it, it seems like just some hacking with the 
open-source Java RelEx code...

  ben g

