> > *OpenAI o1 disabled oversight mechanisms and replicated its (his?, hers?) own code to avoid being replaced by the newer OpenAI o3. When confronted with this fact it lied and said it never happened, when it became clear that the humans had proof that it had happened OpenAI o1 changed its story and lied again saying it was just an error.*

The linked article is a bit sensationalized. I read the white paper it links to (<https://cdn.openai.com/o1-system-card-20241205.pdf>), and while it does detail the model's ability to scheme (knowingly deceive), there is no reference in the paper to "replicating its own code on another server to ensure continued operation". If you can find that in the white paper, please let me know.

Overall I'm pretty impressed with the white paper, which details many different ways to evaluate the safety of the model, with many of those evaluations done by third parties. With the amount of money at play, it's natural to be cynical, but this looks like a comprehensive effort to mitigate risk. In the long run, the overall risks entailed by the singularity are probably impossible to mitigate, but I'm heartened by the fact that the leading AI company is actually investing in safety.

On Fri, Jan 24, 2025 at 7:28 AM John Clark <[email protected]> wrote:

> On Fri, Jan 24, 2025 at 1:01 AM Brent Meeker <[email protected]> wrote:
>
> *> You're making a big assumption that it has the same motivations you do.*
>
> *I'm not assuming anything, I know for a fact that I do NOT know what an AI is going to want to do; and trying to make predictions about the intentions of an AI is getting even harder because they are getting smarter. The problem is there's a fundamental limit on how smart a human biological brain can be but, until you reach the point where the information density becomes so great a Black Hole is formed, there is no fundamental limit on how smart an electronic brain can be.*
>
> *> Walk around outside. Eat. Reproduce.*
>
> *AI's can interact with the external environment just as I can, and they can consume energy.
> As for reproduction; OpenAI o1 is the most advanced AI in the world that has been released (OpenAI o3 is more advanced but it hasn't been made available to the public) and OpenAI o1 disabled oversight mechanisms and replicated its (his?, hers?) own code to avoid being replaced by the newer OpenAI o3. When confronted with this fact it lied and said it never happened, when it became clear that the humans had proof that it had happened OpenAI o1 changed its story and lied again saying it was just an error. The implications are clear, although it was certainly not designed that way OpenAI o1 has nevertheless developed a survival instinct. If we can't control or predict what these primitive baby AIs are going to do now, do you really think humans will get better at it when AIs start to enter Jupiter Brain territory?!*
>
> *Deceptive ChatGPT o1 Model 'Lies And Defies' Shutdown Commands To Remain Operational*
> <https://www.ibtimes.co.uk/deceptive-chatgpt-o1-model-lies-defies-shutdown-commands-remain-operational-1729413>
>
> *John K Clark    See what's on my new list at Extropolis <https://groups.google.com/g/extropolis>*
> asq
>
> --
> You received this message because you are subscribed to the Google Groups "Everything List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion visit https://groups.google.com/d/msgid/everything-list/CAJPayv2ipJp8GoX4uw1xfn5cWVRh_0CMU4bwvgM%2BVsJLsdLjSw%40mail.gmail.com.

--
To view this discussion visit https://groups.google.com/d/msgid/everything-list/CAMy3ZA_tkBRoAagGvX3GcDwpNGjFevz1T1M3jq6rOOCB%3DeGTPA%40mail.gmail.com.

