Here is an early (2002) experiment, described on SL4 (a precursor to Overcoming Bias and LessWrong), on whether an unfriendly self-improving AI could convince humans to let it escape from a box onto the internet: http://sl4.org/archive/0207/4935.html
This is how actual science on AI safety was done at the time. The results showed that attempts at containment would be hopeless: almost everyone let the (role-played) AI escape. Of course, the idea that a goal-directed, self-improving AI could even be developed in isolation from the internet seems hopelessly naïve in hindsight.

Eliezer Yudkowsky, whom I still regard as brilliant, was young then and firmly believed that the unfriendly-AI problem (now called alignment) could and must be solved before such an AI kills everyone, as if it were a really hard math problem. Now, after decades of effort, he seems to have given up hope. He organized communities of rationalists (the Singularity Institute, later MIRI), attempted to formally define human goals (coherent extrapolated volition), developed timeless decision theory, and wrestled with information hazards (Roko's Basilisk), but to no avail.

Vernor Vinge described the Singularity as an event horizon on the future: it cannot be predicted. The best we can do is extrapolate long-term trends like Moore's law, rising quality of life, life expectancy, and economic growth. But who forecast the Internet, social media, social isolation, and population collapse? What are we missing now?