
ChatGPT Answers Programming Questions Incorrectly 52% of the Time: Study

To make matters worse, programmers in the study would often overlook the 
misinformation.

By Matt Novak | Published Yesterday
https://gizmodo.com/chatgpt-answers-wrong-programming-openai-52-study-1851499417

Artificial intelligence chatbots like OpenAI’s ChatGPT are being sold as 
revolutionary tools that can help workers become more efficient at their jobs, 
perhaps replacing those people entirely in the future.

But a stunning new study has found ChatGPT answers computer programming 
questions incorrectly 52% of the time.

The research from Purdue University, first spotted by news outlet Futurism, was 
presented earlier this month at the Computer-Human Interaction Conference in 
Hawaii and looked at 517 programming questions on Stack Overflow that were then 
fed to ChatGPT.

“Our analysis shows that 52% of ChatGPT answers contain incorrect information 
and 77% are verbose,” the new study explained. “Nonetheless, our user study 
participants still preferred ChatGPT answers 35% of the time due to their 
comprehensiveness and well-articulated language style.”

Disturbingly, programmers in the study didn’t always catch the mistakes being 
produced by the AI chatbot.

“However, they also overlooked the misinformation in the ChatGPT answers 39% of 
the time,” according to the study.

“This implies the need to counter misinformation in ChatGPT answers to 
programming questions and raise awareness of the risks associated with 
seemingly correct answers.”

Obviously, this is just one study, which is available to read online, but it 
points to issues that anyone who’s been using these tools can relate to. Large 
tech companies are pouring billions of dollars into AI right now in an effort 
to deliver the most reliable chatbots.

Meta, Microsoft, and Google are all in a race to dominate an emerging space 
that has the potential to radically reshape our relationship with the internet. 
But there are a number of hurdles standing in the way.

Chief among those problems is that AI is frequently unreliable, especially if a 
given user asks a truly unique question.

Google’s new AI-powered Search is constantly spouting garbage that’s often 
scraped from unreliable sources. In fact, there have been multiple times this 
week when Google Search has presented satirical articles from The Onion as 
dependable information.

For its part, Google defends itself by insisting wrong answers are anomalies.

“The examples we’ve seen are generally very uncommon queries, and aren’t 
representative of most people’s experiences,” a Google spokesperson told 
Gizmodo over email earlier this week. “The vast majority of AI Overviews 
provide high-quality information, with links to dig deeper on the web.”

But that defense, that wrong answers only surface for “uncommon queries,” is 
frankly laughable. Are users only supposed to ask these chatbots the most 
mundane questions? How is that acceptable when these tools are supposed to be 
revolutionary?

OpenAI didn’t immediately respond to a request for comment on Friday about the 
new study on ChatGPT answers. Gizmodo will update this post if we hear back.

--

STUDY FINDS THAT 52 PERCENT OF CHATGPT ANSWERS TO PROGRAMMING QUESTIONS ARE 
WRONG

By SHARON ADARLO | MAY 23, 2:16 PM EDT
https://futurism.com/the-byte/study-chatgpt-answers-wrong

Not So Smart

In recent years, computer programmers have flocked to chatbots like OpenAI's 
ChatGPT to help them code, dealing a blow to places like Stack Overflow, which 
had to lay off nearly 30 percent of its staff last year.

The only problem? A team of researchers from Purdue University presented 
research this month at the Computer-Human Interaction conference that shows 
that 52 percent of programming answers generated by ChatGPT are incorrect.

That's a staggeringly large proportion for a program that people are relying on 
to be accurate and precise, underlining what other end users like writers and 
teachers are experiencing: AI platforms like ChatGPT often hallucinate totally 
incorrect answers out of thin air.

For the study, the researchers looked over 517 questions on Stack Overflow and 
analyzed ChatGPT's attempts to answer them.

"We found that 52 percent of ChatGPT answers contain misinformation, 77 percent 
of the answers are more verbose than human answers, and 78 percent of the 
answers suffer from different degrees of inconsistency to human answers," they 
wrote.

Robot vs Human

The team also performed a linguistic analysis of 2,000 randomly selected 
ChatGPT answers and found they were "more formal and analytical" while 
portraying "less negative sentiment" — the sort of bland and cheery tone AI 
tends to produce.
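
As an aside, a comparison like that can be sketched in a few lines of Python. 
The snippet below uses NLTK's off-the-shelf VADER sentiment analyzer; it is 
only an illustration of the general technique, not the Purdue team's actual 
pipeline, and the two sample answers are invented.

    # Illustrative sketch only: not the researchers' actual method. Uses
    # NLTK's VADER analyzer to score sentiment; the sample texts are made up.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    sia = SentimentIntensityAnalyzer()

    # Hypothetical stand-ins for a ChatGPT answer and a human answer.
    chatgpt_answer = ("Certainly! A dictionary is a great fit here, since "
                      "it gives you average O(1) lookups.")
    human_answer = ("Don't use a list for this, it's slow. Use a dict.")

    for label, text in [("ChatGPT", chatgpt_answer), ("Human", human_answer)]:
        scores = sia.polarity_scores(text)  # keys: neg, neu, pos, compound
        print(f"{label}: negative={scores['neg']:.2f}, "
              f"compound={scores['compound']:.2f}")

The compound score ranges from -1 (most negative) to +1 (most positive); a 
real comparison would aggregate scores over many answers, as the study did 
with its sample of 2,000.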

What's especially troubling is that many human programmers seem to prefer the 
ChatGPT answers. The Purdue researchers polled 12 programmers (admittedly a 
small sample size) and found that they preferred the ChatGPT answers 35 
percent of the time and overlooked the AI-generated mistakes in them 39 
percent of the time.

Why is this happening? It might just be that ChatGPT is more polite than people 
online.

"The follow-up semi-structured interviews revealed that the polite language, 
articulated and text-book style answers, and comprehensiveness are some of the 
main reasons that made ChatGPT answers look more convincing, so the 
participants lowered their guard and overlooked some misinformation in ChatGPT 
answers," the researchers wrote.

The study demonstrates that ChatGPT still has major flaws — but that's cold 
comfort to people laid off from Stack Overflow or programmers who have to fix 
AI-generated mistakes in code.
