It *might* get stuck in bad territory, but can you make an argument why there is a *significant* chance of that happening?

Not off the top of my head. I'm just playing it "better safe than sorry" since, as far as I can tell, there *may* be a significant chance of it happening.

Also, I'm not concerned about it getting *stuck* in bad territory, I am more concerned about just transiting bad territory and destroying humanity on the way through.

One thing that I think most of us will agree on is that even if things did work as Eliezer intended, they could still go very wrong if it turns out that the vast majority of people -- when smarter, more the people they wish they could be, as if they had grown up further together ... -- are extremely unfriendly in approximately the same way (so that their extrapolated volition is coherent and may be acted upon). Our meanderings through state space would then head into very undesirable territory. (This is the "people turn out to be evil and screw it all up" scenario.) Your approach has a similar weakness, though, since it would fail under the "seemingly friendly people turn out to be evil and screw it all up before there are non-human intelligent friendlies to save us" scenario.

But my approach has the advantage that it "proves" that Friendliness is in those evil people's self-interest, so *maybe* we can convert them before they do us in.

I'm not claiming that my approach is perfect or fool-proof. I'm just claiming that it's better than anything else thus far proposed.

Which, if either, of 'including all of humanity' rather than just 'friendly humanity', or 'excluding non-human friendlies (initially)' do you see as the greater risk?

I see 'excluding non-human friendlies (initially)' as a tremendously greater risk. I think that the proportionality aspect of Friendliness will keep the non-Friendly portion of humanity safe as we move towards Friendliness.

Actually, let me rephrase your question and turn it around -- Which, if either, of 'not protecting all of humanity from Friendlies rather than just friendly humanity' or 'being actively unfriendly' do you see as the greater risk?

Or is there some other aspect of Eliezer's approach that especially concerns you and motivates your alternative approach?

The lack of self-reinforcing stability under errors and/or outside forces is also especially concerning and was my initial motivation for my vision.

Thanks for continuing to answer my barrage of questions.

No. Thank you for the continued intelligent feedback. I'm disappointed by all the people who aren't interested in participating until they can get a link to the final paper without any effort. This is still very much a work in progress with respect to the best way to present it, and the only way I can improve it is with decent feedback -- which is therefore *much* appreciated.



