Alex,
Thanks for the explanation (and relating your work to NARS). Now hopefully I
can talk meaningfully on the topic. ;-)
I agree that this is a proper design for your current goal, but in the long
run, I feel that a major factor is missing, not just in your design but in
the whole "statistical NLP" movement: the purpose of communication. Here the
goal is "to talk like a human (statistically speaking)", which is fine for
certain situations (say, machine translation), but not for others (say,
dialogue understanding and production). In general, it is "NLP without
thinking". As a result, even if the system can say whatever people usually
say in similar situations (and therefore passes the Turing Test), I still
won't call it "intelligent", because its speech doesn't serve any adaptive
purpose (this is related to my previous comment on the definition of
intelligence).
For example, if your chatbot were so advanced that it could participate in
our current conversation, I might be fooled into believing it is human, but
I'm afraid I wouldn't think it smart, or enjoy talking with it, because it
just says what most people say on these topics, which is already boring to
me. To make it creative, merely adding random choices is far from enough.
NLP must be integrated with general-purpose reasoning and learning, and
serve the overall goals of the system (as claimed by the "speech act"
school).
I fully understand that it is too early to attempt these things in
commercial software, but we should move in this direction.
The following is a memo I wrote recently for a graduate student to explore
the possibility of supporting NLP with NARS. The basic idea is similar to
what we tried before in Webmind, though the details are quite different.
Comments are welcome.
Pei
----------------------------------------------------------------------------
Natural Language Processing in NARS
==========================
Pei Wang
Sept 18, 2002
1. Basic ideas
Using NARS for NLP differs from traditional NLP in the following aspects:
*. Syntactic processing is not separated from semantic processing. Instead of
parsing a sentence according to a grammar, the system depends on its
linguistic knowledge, which looks like "sentence templates". In this kind of
knowledge, both syntactic (grammatical) and semantic factors are involved.
*. Linguistic knowledge is always "true to a degree": it usually has
counterexamples, and can be revised by new evidence.
*. The inference capacity of NARS is used consistently for NLP. The basic
assumption is: the "core logic" of an intelligent system, which is
responsible for reasoning and learning in general, is also responsible for
NLP. However, for the sake of efficiency, it is possible to "compile"
several inference steps into a "macro-step" implemented by NLP-specific
rules.
*. Unlike corpus-based NLP, in NARS the linguistic experience (raw data) is
not all available at the beginning. Instead, it arrives piece by piece while
the system is running.
2. Knowledge representation
For now, we only need to introduce one relation for linguistic knowledge,
"symbolize", with the definition that sym(Phrase, Term) means that an
English phrase "Phrase" represents a NARS term "Term". As special cases,
words and sentences are phrases, and terms can be compound.
In standard NARS representation, "sym(Phrase, Term)" is written as
(*, Phrase, Term) --> sym
where (*, x1, x2, ...) is a NARS "product", representing a sequence of
terms, and
"-->" is the inheritance relation of NARS. In general, "symbolize" is a
multi-valued
many-to-many relation.
An English sentence is represented as a sequence of words, which are NARS
terms. Therefore, "A bird is an animal." is represented by
(*, "a", "bird", "is", "a", "animal")
For now we don't distinguish "A" and "a", or "a" and "an", and we ignore
tense and number.
For example, we have
#1 (*, "bird", bird) --> sym
#2 (*, "animal", animal) --> sym
#3 (*, (*, "a", "bird", "is", "a", "animal"), (bird --> animal)) --> sym
The linguistic knowledge involved will be like
#4 (&&, ((*, W1, C1) --> sym), ((*, W2, C2) --> sym))
   <=> ((*, (*, "a", W1, "is", "a", W2), (C1 --> C2)) --> sym)
where "&&" is "AND", and "<=>" is "if and only if".
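For concreteness, the statements #1-#3 can be sketched as nested data
structures. The following Python fragment is only an illustration of the
encoding (the helper names "prod" and "inh" are mine, not NARS notation);
quoted strings stand for English words and bare strings for NARS terms:

```python
# Illustrative encoding of NARS statements as nested Python tuples.
# prod(...) builds a product (*, x1, x2, ...); inh(s, p) builds "s --> p".
# Words carry literal quotes ('"bird"') to distinguish them from terms
# ('bird'). The helper names are assumptions for this sketch only.

def prod(*args):
    return ("*",) + args

def inh(subject, predicate):
    return ("-->", subject, predicate)

# #1: (*, "bird", bird) --> sym
s1 = inh(prod('"bird"', "bird"), "sym")

# #2: (*, "animal", animal) --> sym
s2 = inh(prod('"animal"', "animal"), "sym")

# #3: (*, (*, "a", "bird", "is", "a", "animal"), (bird --> animal)) --> sym
sentence = prod('"a"', '"bird"', '"is"', '"a"', '"animal"')
s3 = inh(prod(sentence, inh("bird", "animal")), "sym")
```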
3. Inference processes
NLP in NARS can be further divided into three major types of processes:
(1) Understanding: to get the internal representation of an English
sentence.
For instance, given the above #1, #2, and #4, the system should be able to
answer
the question
(*, (*, "a", "bird", "is", "a", "animal"), X) --> sym
with
X = (bird --> animal)
The inference procedure:
{#1, #4} |- #5: ((*, W2, C2) --> sym) <=> ((*, (*, "a", "bird", "is", "a",
W2), (bird --> C2)) --> sym)
{#2, #5} |- #3, which answers the question with X = (bird --> animal)
Both steps use unification and deduction.
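The two deduction steps can be compressed into a single matching pass,
roughly as follows. This is a minimal sketch under my own encoding: the
dictionary plays the role of #1 and #2, the hard-coded template is #4, and
the function name is mine:

```python
# Sketch of the understanding direction: match the incoming word sequence
# against the template of #4, look up the concept for each open word slot
# (the role of #1 and #2), and assemble the NARS statement (C1 --> C2).
# This folds the two unification-and-deduction steps into one function.

sym = {"bird": "bird", "animal": "animal"}  # word -> term, i.e. #1 and #2

def understand(words):
    # Template from #4: ("a", W1, "is", "a", W2) symbolizes (C1 --> C2).
    if (len(words) == 5 and words[0] == "a"
            and words[2] == "is" and words[3] == "a"):
        w1, w2 = words[1], words[4]
        if w1 in sym and w2 in sym:
            return ("-->", sym[w1], sym[w2])
    return None  # no template matched

print(understand(("a", "bird", "is", "a", "animal")))
# ('-->', 'bird', 'animal'), i.e. X = (bird --> animal)
```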
(2) Generation: to get the English representation of a NARS statement.
For instance, given the above #1, #2, and #4, the system should be able to
answer
the question
(*, X, (bird --> animal)) --> sym
with
X = (*, "a", "bird", "is", "a", "animal")
The inference procedure:
{#1, #4} |- #5: ((*, W2, C2) --> sym) <=> ((*, (*, "a", "bird", "is", "a",
W2), (bird --> C2)) --> sym)
{#2, #5} |- #3, which answers the question with X = (*, "a", "bird", "is",
"a", "animal")
Both steps use unification and deduction. Here we see that understanding and
generating are closely related to each other.
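The symmetry can be made visible in the same sketch style: generation is the
template of #4 applied in the opposite direction, with the word table
inverted (again, the encoding and names are my own, not NARS):

```python
# Sketch of the generation direction: invert the word->term table from
# #1 and #2, then fill the template of #4 from a statement (C1 --> C2).

sym = {"bird": "bird", "animal": "animal"}          # word -> term (#1, #2)
words = {term: word for word, term in sym.items()}  # term -> word, inverted

def generate(statement):
    copula, c1, c2 = statement
    if copula == "-->" and c1 in words and c2 in words:
        # Template from #4, run in reverse.
        return ("a", words[c1], "is", "a", words[c2])
    return None  # statement not expressible with known words

print(generate(("-->", "bird", "animal")))
# ('a', 'bird', 'is', 'a', 'animal')
```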
(3) Learning: to get linguistic knowledge from examples.
For instance, given the above #1, #2, and #3, the system should be able to
derive #4.
The inference procedure:
{#2, #3} |- #5: ((*, W2, C2) --> sym) <=> ((*, (*, "a", "bird", "is", "a",
W2), (bird --> C2)) --> sym)
{#5, #1} |- #4
Both steps use induction (with variable introduction). This procedure is
the reverse of the previous procedures.
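The variable-introduction step can be sketched as structural replacement:
wherever the word/term pair of #2 occurs inside #3, substitute fresh
variables W2 and C2. This is a toy version; real NARS induction also assigns
the result a truth value, which is omitted here:

```python
# Toy sketch of induction by variable introduction: replace every
# occurrence of a constant inside a nested-tuple statement with a variable.
# Applied with #2's word/term pair, this turns #3 into the body of #5;
# a second pass with #1's pair would then yield the template #4.

def abstract(structure, constant, variable):
    """Recursively substitute `variable` for `constant`."""
    if structure == constant:
        return variable
    if isinstance(structure, tuple):
        return tuple(abstract(x, constant, variable) for x in structure)
    return structure

# #3, with words written as quoted strings and terms as bare strings:
s3 = ("-->",
      ("*",
       ("*", '"a"', '"bird"', '"is"', '"a"', '"animal"'),
       ("-->", "bird", "animal")),
      "sym")

# Abstract "animal" (the word) to W2 and animal (the term) to C2:
body5 = abstract(abstract(s3, '"animal"', "W2"), "animal", "C2")
print(body5)
# ('-->', ('*', ('*', '"a"', '"bird"', '"is"', '"a"', 'W2'),
#          ('-->', 'bird', 'C2')), 'sym')
```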
4. Tasks
Initially, the tasks in this project include the following:
(1) To verify that the formal language used in NARS is sufficient for
representing
meaningful English sentences.
(2) To verify that the inference rules used in NARS are sufficient for the
above NLP processes.
(3) To explore the possibility of improving the efficiency of the system by
"compiling"
rule sequences into macro-rules.
One possibility is to write a stand-alone prototype in Prolog.
----- Original Message -----
From: "Alexander E. Richter" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, November 17, 2002 5:38 PM
Subject: RE: [agi] Language translation by computer
At 15:56 17.11.02 -0500, Ben wrote:
>...
>a) learning about language (how to comprehend & produce it)
We are using data from our human2human chat rooms; this data is used to
train a hidden Markov model supertagger.
We train two things: normal utterances and discourse.
It learns about the things a user tells: his job, where he comes from, his
mood, etc. We try to cluster this and find correlations, e.g. young girls
like horses, and a horse has a name. Next time another girl talks about her
horse, the bot asks about the horse's name ...
We try to transfer simple things into a MultiNet database with probability
info (like NARS):
A dolphin is a fish
(dolphin SUB fish) + (* CTXT <folk theory>)
It's easy to translate this MultiNet into other languages:
Der Delphin ist ein Fisch. [German: "The dolphin is a fish."]
We use the positive/negative feedback of the human chatter to rate the
bot-answers.
(dolphin SUB fish) becomes
(dolphin SUB fish) + (* CTXT <folk theory>)
This works for chat smalltalk.
We are cheating with parsers and AIML, because a bot that learns without
cheating is too boring for chatters. We try to reduce the cheating (i.e.,
the narrow-AI tools) in favor of AGI-ish ones.
We can feed easy texts for kids into the system; many sentences are parsed
correctly, and a few are transferred correctly into the MultiNet DB. We try
to get feedback from chatters; that's supervised learning by chatters.
For a new, similar language the bootstrap process is much easier.
> ...
>idiotic human behavior is another question. And whether there's a business
>point is yet another question... presumably you've found that there is!]
Earning money with this is imho better than searching for funding, and more fun.
>A USA Today article is a whole different matter. ...
ACK
>Could a narrow-AI program produce English translations of foreign news
>articles that were worth reading? Probably, though no tech is there yet --
>but that's an easier problem. Structural issues and choices of what to say
>are carried over from language to language, and a good reader can ignore
>occasional mistranslations & continual stylistic infelicities.
A Systran translation from the German Bildzeitung to English is funny but
readable.
Leilas Alltag ist anstrengend. Fast 70 Babys und deren Mütter hat sie
zusammen mit ihren Kollegen betreut, bei mehr als 50 Geburten war sie
selbst dabei. Leila will, dass die Mütter sich ohne Druck darüber klar
werden dürfen, ob sie ihre Babys zur Adoption freigeben oder ob es nicht
doch einen anderen Weg gibt. Mit Erfolg: 60 Prozent der Mütter entscheiden
sich für ihr Kind.
Leilas everyday life is arduous. It cared for almost 70 babies and their
mothers together with its colleagues, participated with more than 50 births
it. Leila wants the fact that the mothers without printing over it to
become clear to be allowed itself whether they release their babies for
adoption or whether it gives another way not nevertheless. With success: 60
per cent of the mothers decide for their child.
This translation has some problems, none too difficult for a narrow-AI
system to solve:
Druck = pressure (not printing)
Using short sentences gives good results; now it's time to make the
sentences longer. More world knowledge is needed.
cu Alex
-------
To unsubscribe, change your address, or temporarily deactivate your
subscription,
please go to http://v2.listbox.com/member/
-------