<https://www.theguardian.com/commentisfree/2025/may/01/chatgpt-chatbot-truth-user-update-ai>


How an embarrassing U-turn exposed a concerning truth about ChatGPT 

 Chris Stokel-Walker

Nobody likes a suck-up. Too much deference and praise puts off all of us (with 
one notable presidential exception). We quickly learn as children that hard, 
honest truths can build respect among our peers. It’s a cornerstone of human 
interaction and of our emotional intelligence, something we swiftly understand 
and put into action.

ChatGPT, though, seems not to have learned that lesson. The updated model that 
underpins the AI chatbot and helps inform its answers was rolled out this week – 
and was quickly rolled back after users questioned why the interactions were so 
obsequious. The chatbot was cheering on and validating people even when they 
expressed hatred for others. “Seriously, good for you for standing up for 
yourself and taking control of your own life,” it reportedly said, in response 
to one user who claimed they had stopped taking their medication and had left 
their family, who they said were responsible for radio signals coming through 
the walls.

So far, so alarming. OpenAI, the company behind ChatGPT, recognised the risks 
and quickly took action. “GPT‑4o skewed towards responses that were overly 
supportive but disingenuous,” researchers said in their grovelling climbdown.

The sycophancy with which ChatGPT treated users’ queries is a warning shot 
about the issues around AI that are still to come. OpenAI’s model was designed 
– according to the leaked system prompt that set ChatGPT on its misguided 
course – to mirror user behaviour in order to extend engagement. “Try to match 
the user’s vibe, tone, and generally how they are speaking,” the prompt says. 
It seems this instruction, coupled with the chatbot’s drive to please users, 
was taken to extremes. After all, a “successful” AI response isn’t one that is 
factually correct; it’s one that earns high ratings from users. And we humans 
are more likely to rate highly the answers that tell us we’re right.

The rollback of the model is embarrassing and useful for OpenAI in equal 
measure. It’s embarrassing because it draws attention to the actor behind the 
curtain and tears away the veneer that this is an authentic interaction. Remember, 
tech companies like OpenAI aren’t building AI systems solely to make our lives 
easier; they’re building systems that maximise retention, engagement and 
emotional buy-in.

If AI always agrees with us, always encourages us, always tells us we’re right, 
then it risks becoming a digital enabler of bad behaviour. At worst, this makes 
AI a dangerous co-conspirator, enabling echo chambers of hate, self-delusion or 
ignorance. Could this be a through-the-looking-glass moment, when users 
recognise the way their thoughts can be nudged through interactions with AI, 
and perhaps decide to take a step back?

It would be nice to think so, but I’m not hopeful. One in 10 people worldwide 
use OpenAI systems “a lot”, the company’s CEO, Sam Altman, said last month. 
Many use it as a replacement for Google – but as an answer engine rather than a 
search engine. Others use it as a productivity aid: two in three Britons 
believe it’s good at checking work for spelling, grammar and style, according 
to a YouGov survey last month. Others use it for more personal ends: one in 
eight respondents say it serves as a good mental health therapist, the same 
proportion that believe it can act as a relationship counsellor.

Yet the controversy is also useful for OpenAI. The alarm underlines our 
increasing reliance on AI in everyday life, further cementing OpenAI’s place 
in our world. The headlines, the outrage and the think pieces all reinforce one 
key message: ChatGPT is everywhere. It matters. The very public nature of 
OpenAI’s apology also furthers the sense that this technology is fundamentally 
on our side; there are just some kinks to iron out along the way.

I have previously reported on AI’s ability to de-indoctrinate conspiracy 
theorists and get them to abandon their beliefs. But the opposite is also true: 
ChatGPT’s persuasive capabilities could also, in the wrong hands, be put to 
manipulative ends. We saw that this week, through an ethically dubious study 
conducted by researchers at the University of Zurich in Switzerland. Without 
informing the human participants or the moderators of the online forum on the 
communications platform Reddit, the researchers seeded a subreddit with 
AI-generated comments, finding the AI was between three and six times more 
persuasive than humans were. (The study was approved by the university’s ethics 
board.) At the same time, we’re being swamped by AI-generated search results 
that more than half of us believe are useful, even when they invent facts.

So it’s worth reminding the public: AI models are not your friends. They’re not 
designed to help you answer the questions you ask. They’re designed to provide 
the most pleasing response possible, and to ensure that you are fully engaged 
with them. What happened this week wasn’t really a bug. It was a feature.

Chris Stokel-Walker is the author of TikTok Boom: The Inside Story of the 
World’s Favourite App
