On Fri, Feb 23, 2018 at 11:26 AM, Amirouche Boubekki
<[email protected]> wrote:
>
>>  > The goal of the atomspace is to eliminate human-curated datasets.
>>
>> Music to my ears. "Curated" means "detached from the actual source and
>> context of knowledge."
>
> Not always. Curated means fixed, patched and edited by a human being
> supervisor that knows best, until the correction is delivered in code. That
> is a chance to avoid structural bias like racist bots.

Ah!  Now this last is a very interesting philosophical observation.
This is not quite the correct mailing list within which to discuss
this, but it overlaps onto a large number of political and
mathematical issues that are very interesting to me. So here I go.

Political - if this were a human, not a bot, what amount of racism
should be tolerated?  Speech, thought, action are interconnected. For
example: the American constitution enshrines freedom of speech, and
the freedom to practice religion. But clearly, we have lost our
freedom of speech: say the wrong thing about Islam, you get bombed.
Should we restrain freedom of religion?

Religion is a form of thought. What about freedom of thought? You can
think murderous thoughts, but if you commit murder, you are socially
unwanted (usually).  The ability to commit murder is correlated with
the absence of certain neural circuitry in the brain having to do with
empathy. Some humans lack these neurons, and thus are prone to be
psychopaths.  Those who do have those neurons and commit (or even
witness) murder end up with PTSD.

The mathematical issues first arise if you think of bots as
approximating humans.  It's trivial to create a bot that prints random
dictionary words.  It's a bit harder, but not too hard, to create a bot
that spews random dictionary words assembled into grammatical sentences:
just run the random word sequences through a grammar-checker, e.g.
link-grammar, and reject the ungrammatical ones; don't print them.
Since most random word-sequences are not grammatical, this is not
CPU-efficient, so better algorithms avoid obviously-ungrammatical
word-sequences by working at higher abstraction layers.  What
Microsoft did was just one single step beyond this:  spew random
grammatically correct sentences, using a probability weighting based
on recently heard utterances. The system was too simple, and the gamers
gamed it: they trained up the probability weights to spew racist
remarks.
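
To make the progression above concrete, here is a minimal Python sketch
of random words, rejection by a grammar check, and weighting by recent
input. Everything in it is an assumption made for illustration: the
eight-word LEXICON, the single accepted PATTERN, and the is_grammatical
function are toy stand-ins, not link-grammar and not what Microsoft
actually ran.

```python
import random
from collections import Counter

# Hypothetical toy lexicon with part-of-speech tags; a real system would
# use a full dictionary and a real parser such as link-grammar.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN", "robot": "NOUN",
    "sees": "VERB", "likes": "VERB", "chases": "VERB",
}
# The only sentence shape this toy checker accepts.
PATTERN = ["DET", "NOUN", "VERB", "DET", "NOUN"]


def random_words(rng, n, weights=None):
    """Draw n words at random; weights (word -> count) bias the draw."""
    words = list(LEXICON)
    # Add-one smoothing so that unheard words can still be drawn.
    w = [weights.get(x, 0) + 1 for x in words] if weights else None
    return rng.choices(words, weights=w, k=n)


def is_grammatical(words):
    """Toy stand-in for a grammar checker: match the tag pattern."""
    return [LEXICON.get(w) for w in words] == PATTERN


def grammatical_sentence(rng, weights=None, max_tries=100_000):
    """Rejection sampling: draw sequences, discard ungrammatical ones."""
    for tries in range(1, max_tries + 1):
        words = random_words(rng, len(PATTERN), weights)
        if is_grammatical(words):
            return words, tries
    raise RuntimeError("no grammatical sequence found")


def weights_from(utterances):
    """Count known words in recently heard utterances."""
    return Counter(
        w for u in utterances for w in u.lower().split() if w in LEXICON
    )


if __name__ == "__main__":
    rng = random.Random()
    words, tries = grammatical_sentence(rng)
    print(" ".join(words), f"(after {tries} tries)")
    heard = weights_from(["the robot chases the robot"] * 5)
    words, tries = grammatical_sentence(rng, weights=heard)
    print(" ".join(words), f"(weighted, after {tries} tries)")
```

The number of tries shows why plain rejection is wasteful: in this toy
setup only about 1 in 300 unweighted draws passes the check. The
weights also show how the system can be gamed: whoever supplies the
"recently heard" utterances decides which words come out most often.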

OK, suppose we can go one step beyond what Microsoft did: spew random
sentences, that are created by means of "logical deduction" or
"reasoning" applied to "knowledge" obtained from some database (e.g.
wikipedia, or from a triple store). This could certainly wow some
people, as it would demonstrate a robot capable of logical inference.
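
As a rough illustration of that step, the Python sketch below applies
one inference rule to a tiny triple store and prints a randomly chosen
derived fact as a sentence. The TRIPLES data, the "is_a" predicate, the
single transitivity rule and the verbalize function are all assumptions
invented for the example; it is not how opencog or the atomspace does
inference.

```python
import random

# Hypothetical toy triple store: (subject, predicate, object).
TRIPLES = {
    ("Socrates", "is_a", "human"),
    ("human", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
}


def infer(triples):
    """Apply one rule, transitivity of is_a, until nothing new appears."""
    known = set(triples)
    while True:
        new = {
            (a, "is_a", d)
            for (a, p, b) in known
            for (c, q, d) in known
            if p == q == "is_a" and b == c
        } - known
        if not new:
            return known
        known |= new


def with_article(noun):
    """Prefix a common noun with 'a' or 'an'."""
    return ("an " if noun[0] in "aeiou" else "a ") + noun


def verbalize(triple):
    """Render an is_a triple as an English sentence."""
    subject, _, obj = triple
    if subject[0].islower():  # treat lowercase subjects as common nouns
        subject = with_article(subject)
    sentence = f"{subject} is {with_article(obj)}."
    return sentence[0].upper() + sentence[1:]


def random_inference(rng, triples):
    """Say one randomly chosen fact that was derived, not stored."""
    derived = sorted(infer(triples) - set(triples))
    return verbalize(rng.choice(derived))


if __name__ == "__main__":
    print(random_inference(random.Random(), TRIPLES))
```

Every sentence this prints is a conclusion that is not stored in the
database, which is the whole trick. It also makes plain how much the
output depends on what was put into the triples in the first place.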

So: this last is where your comment about "structural bias like racist
bots" starts getting interesting. To recap:

Step 0: random word sequences
Step 1: random but grammatically correct word sequences
Step 2: random grammatical sentences weighted by recent input  <-- the
Microsoft bot
Step 3: grammatical sentences from random "logical inferences" <--
what opencog is currently attempting
...
Step n: crazy shit people say and do
...
Step p: crazy shit societies, cultures and civilizations do

What are the values of n and p?  Some might argue that perhaps they
are 4 and 5; others might argue that they are higher.

My point is: a curated database might make step 3 simpler. It's
hopeless for step 4.

For a commercial product, curated data is super-important: Alexa and
Siri and Cortana are operating at the step 2/3 level with carefully
curated databases of capitalist value: locations of restaurants,
household products, luxury goods.

The Russian twitter-bots, as well as Cambridge Analytica and the
Facebook black-ops division are working at the step 2/3 level with
carefully curated databases of psychological profiles and political
propaganda.

Scientists in general (and Ben in particular) would love to operate at
the step 2/3 level with carefully curated databases of scientific
knowledge, e.g. anti-aging, life-extension info.  I'm getting old too.
Medical breakthroughs are not happening fast enough, for me.

So, yes, curated data is vitally important for commercial, political
and scientific reasons.  It's just that it does not really get us to
steps 4 and 5, which are the steps along which AGI lies.  The dream of AGI
is to take those steps, without the curated bullshit (racism,
religion, capitalism) that humankind generates, and yet also avoid the
creation of a crisis that would threaten humanity/civilization.

Linas.

-- 
cassette tapes - analog TV - film cameras - you
