Derek Zahn wrote:
Richard Loosemore writes:
> You must remember that the complexity is not a massive part of the
> system, just a small-but-indispensable part.
>
> I think this sometimes causes confusion: did you think that I meant
> that the whole thing would be so opaque that I could not understand
> *anything* about the behavior of the system? Like, all the
> characteristics of the system would be one huge emergent property, with
> us having no idea about where the intelligence came from?
No, certainly not. I think the confusion here involves the distinction
between Friendliness with a capital F (meaning a formal theory of what
the term means and an intelligent system built to provably maintain that
property in the mathematical, not verbal, sense), and friendliness with
a lower case f, which relies on more human types of reasoning.
Derek,
Your post raises several issues that I will try to get to in due course,
but I want to deal with one of them quickly (if I can).
I am attacking the very notion that there really is such a thing as
mathematical Friendliness with a capital F, something that can be proved
formally rather than established by less formal means.
I am also stating that while this mythical provable Friendliness does
not really exist (i.e., such a proof will never be possible), there is
something else that gives us exactly what we want, even though it is not
a mathematical proof.
Here is why. According to quantum mechanics there is a finite, non-zero
probability that the Sun could suddenly quantum-tunnel itself to a new
position inside the perfume department of Bloomingdale's.
There is no formal proof that it will not do this. There is no
possibility of such a formal proof.
But we accept that we do not need to worry about this happening because
we have an idea of what the probability is. In essence, we know that
for the Sun to do that, each atom in it would have to do the same thing
all at once, and since the probability of each individual event is so
small, and since they are all multiplied together, the overall
probability is stupidly small.
Now, of course I exaggerate for comedy, but the fact is that if you can
make the event "An AGI reneges on the motivations designed into it"
dependent on a very large number of improbable events all happening at
once, then you can multiply the probabilities and arrive at a situation
where the overall probability is vanishingly small.
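Just to make the arithmetic concrete, here is a minimal sketch of that multiplication (the numbers are purely hypothetical, chosen for illustration; nothing in my design fixes them). Because the product of thousands of small probabilities underflows ordinary floating point, it is computed in log space:

```python
import math

def joint_failure_log10(p: float, n: int) -> float:
    """log10 of the probability that n independent events,
    each with probability p, all occur at once: log10(p**n)."""
    return n * math.log10(p)

# Hypothetical example: 10,000 independent constraints, each
# failing (going unnoticed) with probability 0.1.
log_p = joint_failure_log10(0.1, 10_000)
print(f"joint failure probability ~ 10^{log_p:.0f}")
```

Run it and the joint probability comes out around 10^-10000, which is the sense in which I mean "vanishingly small": the exponent scales linearly with the number of independent constraints that would all have to fail together.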
Do you agree that if we could get such a connection between the
probabilities, we would be home and dry? That we need not care about
"proving" the Friendliness if we can show that the probability of
failure is simply too low to be plausible?
Right, now consider the nature of the design I propose: the
motivational system never has a single point of failure, because
everything that happens is multiply constrained (and on a massive scale:
far more than is the case even in our own brains). Once the system is
set up to behave according to a diffuse set of checks and balances (tens
of thousands of ideas about what is "right", rather than one single
directive), it can never wander far from that set of constraints without
noticing the departure immediately.
Would you agree that IF such a design were feasible, you would not be
able to think of any way to bollix it?
Let's pause the discussion there: I want to know if you can see any
problems within the assumptions I have laid down.
Richard Loosemore.
-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=48442747-6d3c9d