Re: Re: Re: [agi] Understanding Natural Language

Ben Goertzel Sat, 25 Nov 2006 16:03:34 -0800

The point of using Lojban for proto-AGIs is to enable productive,
interactive conversations with AGIs at a fairly early stage in their
development ...


Of course, mining masses of online English text is a better way for
the system to gain general knowledge about science, politics, human
psychology and the like...

But perhaps the deeper understanding an AGI needs to meaningfully
interpret all that data in the masses of online English text can best
be imparted to the AGI by interacting with it in a simulation world
and conversing with it about its experiences in a language that it
can interpret more readily (Lojban).

Lojban does not solve any of the hard problems of AGI, but it may
decrease the amount of pragmatic effort required to turn a theoretical
solution to the hard problems of AGI into a practical solution.

-- BenG

On 11/25/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:

Andrii (lOkadin) Zvorygin <[EMAIL PROTECTED]> wrote:
>> Even if we were able to constrain the grammar, you still have the
>> problem that people will still make ungrammatical statements, misspell
>> words, omit words, and so on.
>Amazing you should mention such valid points against natural languages.

This misses the point.  Where are you going to get 1 GB of Lojban text to train 
your language model?  If you require that all text pass through a syntax 
checker for errors, you will greatly increase the cost of generating your 
training data.  This is not a trivial problem.  It is a big part of why 
programmers can only write 10 lines of code per day on projects 1/1000 the size 
of a language model.  Then when you have built the model, you will still have a 
system that is intolerant of errors and hard to use.  Your language model needs 
to have a better way to deal with inconsistency than to report errors and make 
more work for the user.

>Lojban already exceeds many natural languages in its ability to express.

How so?  In English I can use mathematical notation understood by others to 
express complex ideas.  I could even invent a new branch of mathematics, 
introduce appropriate notation, and express ideas in it.

There are cognitive limits to what natural language can express, such as the 
inability to describe a person's face (as well as a picture would), or to 
describe a novel odor, or to convey learned physical skills such as swimming or 
riding a bicycle.  One could conceivably introduce notation to describe such 
things in any natural or artificial language, but that does not solve the 
problem.  Your neural circuitry has limits; it allows you to connect a face to 
a name but not to a description.  Any such notation might be usable by machines 
but not by humans.

-- Matt Mahoney, [EMAIL PROTECTED]

----- Original Message ----
From: Andrii (lOkadin) Zvorygin <[EMAIL PROTECTED]>
To: [email protected]
Sent: Saturday, November 25, 2006 5:01:04 AM
Subject: Re: Re: [agi] Understanding Natural Language

On 11/24/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> Andrii (lOkadin) Zvorygin <[EMAIL PROTECTED]> wrote:
> >I personally don't understand why everyone seems to insist on using
> >ambiguous illogical languages to express things when there are viable
> >alternatives available.
>
> I think because an AGI needs to communicate in languages that people already
> know.
> I don't understand how artificial languages like Lojban contribute to
> this goal.
> We should focus our efforts instead on learning and modeling existing languages.
>
> I understand that artificial languages like Lojban and Esperanto and Attempto
> have simple grammars.
> I don't believe they would stay that way if they were widely used for
> person to person communication (as opposed to machine interfaces).
Lojban grammar is easily extensible and forwards compatible.
You can add features to the language through CMAvo and GISmu.
Lojban already exceeds many natural languages in its ability to
express.  There are very crucial parts of communication that English
lacks, such as logical connectives and attitudinals.

>Languages evolve over time, both in individuals, and more slowly in
>social groups.
Are you implying languages evolve faster in individuals?
>A language model is not a simple set of rules.
A natural language model is not.
An artificial language is constructed with rules that were
created by individual humans, as opposed to groups of humans. Lojban was
especially designed to be logical, unlike Esperanto.
That makes such languages recreatable by individual humans and, depending
on your definition, "simple".
>It is a probability distribution described by a large set of patterns
>such as words, word associations, grammatical structures and
>sentences.
The approach of a world of the blind to seeing is to feel at things.
Sometimes they wonder if there is not another way.
>Each time you read or hear a message, the probabilities for the
>observed patterns are increased a little and new patterns are added.
>In a social setting, these probabilities tend to converge by
>consensus as this knowledge is shared.
I agree this is a wonderful solution to predicting what the vocabulary
of a language group is.
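
To make the quoted description concrete, here is a minimal Python sketch of
the idea that each observed message nudges the probabilities of its patterns
upward and adds new patterns. It is only a toy unigram (single-word) counter,
not anyone's actual language model; the class and method names are
hypothetical and chosen for illustration.

```python
from collections import Counter


class UnigramModel:
    """Toy word-frequency model (hypothetical illustration only)."""

    def __init__(self) -> None:
        self.counts = Counter()  # word -> number of times observed
        self.total = 0           # total number of words observed

    def observe(self, message: str) -> None:
        """Update the model with one message: seen words gain count,
        unseen words are added as new patterns."""
        words = message.lower().split()
        self.counts.update(words)
        self.total += len(words)

    def probability(self, word: str) -> float:
        """Relative frequency of a word among everything observed so far."""
        if self.total == 0:
            return 0.0
        return self.counts[word.lower()] / self.total


model = UnigramModel()
model.observe("the cat sat")
model.observe("the cat")
print(model.probability("cat"))  # 2 of the 5 observed words are "cat"
print(model.probability("dog"))  # never observed, so 0.0
```

A real model would track word associations and grammatical structures as
well, and would smooth the estimates so that unseen words do not get exactly
zero probability.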
>Formal definitions of artificial languages do not capture this type
>of knowledge, the thousands or millions of new words, idioms, shared
>knowledge and habits of usage.
sa'u (simply speaking) Artificial languages lack a historic/cultural user base.
Do I even need to reply to that? zo'o.ui.u'i (last statement humorously,
while happy in an amused kind of way)
>
> Even if we were able to constrain the grammar, you still have the problem
> that people will still make ungrammatical statements, misspell words, omit words,
> and so on.
Amazing you should mention such valid points against natural languages.

* ungrammatical statements:
If they were ungrammatical they wouldn't parse in the universal Lojban
parser (all Lojban parsers can be universal Lojban parsers as long as
they follow the few simple grammar rules).
* misspell words:
     In Lojban words have a very strict formation,
     mu'a (for example): GISmu are in either CCVCV or CVCCV formation;
all others are also syntactically unambiguous.
     Additionally, words in Lojban are specifically designed not to
sound similar to each other, so chances are a misspelled word still
looks/sounds just like the original word.
     If a parse error occurs (rare for Lojban users, usually typos) the
user can always be notified.
* omit words:
I gave an example of some GISmu before; basically they have
predefined places, so you can always ask a specific question about
omitted information by simply putting a "ma" for the SUMti (argument)
which you wish to know, or "mo" for the SELbri (function).
* and so on.
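
As a rough illustration of the word-shape point in the list above, the
CCVCV / CVCCV check can be sketched in a few lines of Python. This is an
assumption-laden sketch, not an official Lojban tool: it tests only the
consonant/vowel pattern, not the permitted consonant clusters and not
membership in the official GISmu list. The function name is hypothetical.

```python
import re

# Lojban consonants and vowels as they occur in GISmu (assumption: "y" and
# the apostrophe never appear in GISmu, so they are left out here).
CONSONANTS = "bcdfgjklmnprstvxz"
VOWELS = "aeiou"

C = f"[{CONSONANTS}]"
V = f"[{VOWELS}]"

# A GISmu-shaped word is exactly five letters: CVCCV or CCVCV.
GISMU_SHAPE = re.compile(f"(?:{C}{V}{C}{C}{V}|{C}{C}{V}{C}{V})")


def has_gismu_shape(word: str) -> bool:
    """Return True if the word matches the CVCCV or CCVCV letter pattern."""
    return GISMU_SHAPE.fullmatch(word.lower()) is not None


for w in ["klama", "gismu", "lojban", "klam"]:
    print(w, has_gismu_shape(w))
```

Here "klama" (CCVCV) and "gismu" (CVCCV) pass, while "lojban" and "klam"
fail on length alone. A typo that breaks the pattern is rejected
immediately, which is the kind of early parse error described above.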
>A language model must be equipped to deal with this.
go'i.ui(repetition of your statement as confirmation and happiness)
>It means evaluating lots of soft constraints from a huge database for
>error correction, just like we do to resolve ambiguity in natural
>language.
If "It" stands for "resolving ambiguity in natural
languages" OR (logical connective) "resolving ambiguity in ambiguous
languages", I agree.
> -- Matt Mahoney, [EMAIL PROTECTED]
mu'omi'eLOKadin (Over to you, my name is Lokadin.)

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303



