Send Link mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
https://mailman.anu.edu.au/mailman/listinfo/link
or, via email, send a message with subject or body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Link digest..."
Today's Topics:
1. Re: OpenAI's o3 Model: Breakthrough or Breakdown? (David)
2. Re: OpenAI's o3 Model: Breakthrough or Breakdown? (Scott Howard)
----------------------------------------------------------------------
Message: 1
Date: Mon, 28 Apr 2025 13:17:52 +1000
From: David <[email protected]>
To: Link list <[email protected]>
Subject: Re: [LINK] OpenAI's o3 Model: Breakthrough or Breakdown?
Message-ID: <7139879.9J7NaK4W3v@ulysses>
Content-Type: text/plain; charset="UTF-8"
On Friday, 25 April 2025 12:43:03 AEST Antony Barry wrote:
> Reviewers were wowed by its [OpenAI's o3 Model] "agentic" behavior and its
> seemingly superhuman vision and reasoning. Yet these strengths come with
> concerning caveats. Despite its advancements, o3 hallucinates more than
> twice as often as its predecessor and only achieves 48.3% accuracy in
> financial analysis. This dichotomy - cutting-edge progress paired with
> unreliability - exemplifies what experts are calling AI's "jagged frontier."
This seems to me to demonstrate a fundamental issue with AI machines: they are
still very large correlation processors but have not been developed far enough
to distinguish between "empirical correlations" and logical rules. Thus "2 plus 2
usually makes 4" is an empirical correlation, but "2+2=4" in an appropriate
mathematical context is a logical rule. (And we understand "He added 2 and 2
and got 5" isn't an appropriate context!)
Making these machines ever bigger & faster will increase the rate of so-called
hallucinations, and trying to fix the problem by filtering with hard logic is
rather self-defeating. So how do humans manage the feat so easily, and
apparently with so little energy input?
Leaving aside the energy question for a moment, the human neocortex has been
evolving into its modern form for (say) 200 million years. It has also been
physically evolving and absorbing "training" data for that time. Most
importantly, humans are sentient beings, so we actually experience the world
directly; we're not just machines processing abstract symbols (though we can do
that too); and we're organised into social groups so we can educate one another.
Accumulated physical evolution and associated "training" seem to me to be
primary, not processing bandwidth.
Scientific American ran an article recently on the estimated bit-rate of the
human brain:
QUOTE:
People often feel that their inner thoughts and feelings are much richer than
they are capable of expressing in real time. Entrepreneur Elon Musk is so
bothered by what he calls this "bandwidth problem," in fact, that one of his
long-term goals is to create an interface that lets the human brain communicate
directly with a computer, unencumbered by the slow speed of speaking or
writing.
If Musk succeeded, he would probably be disappointed. According to recent
research published in Neuron, human beings remember, make decisions and imagine
things at a fixed, excruciatingly slow speed of about 10 bits per second. In
contrast, human sensory systems gather data at about one billion bits per
second.
This biological paradox, highlighted in the new study, probably contributes to
the false feeling that our mind can engage in seemingly infinite thoughts
simultaneously, a phenomenon the researchers deem "the Musk illusion."
UNQUOTE
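For scale, the back-of-envelope arithmetic on those two quoted figures (purely
illustrative) puts them eight orders of magnitude apart:

# Ratio of the two rates quoted above.
sensory_rate = 1_000_000_000  # ~10^9 bits/s gathered by the sensory systems
thought_rate = 10             # ~10 bits/s of deliberate thought, per the study
print(f"{sensory_rate // thought_rate:,}x")  # 100,000,000x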
_David Lochrin_
------------------------------
Message: 2
Date: Mon, 28 Apr 2025 14:57:33 +1000
From: Scott Howard <[email protected]>
To: [email protected]
Cc: Link list <[email protected]>
Subject: Re: [LINK] OpenAI's o3 Model: Breakthrough or Breakdown?
Message-ID:
<cacnpsnxe2s9nvmxt1kqgcq2vg_pxnubxjf-d8wt+nqjkpne...@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Mon, Apr 28, 2025 at 1:30 PM David <[email protected]> wrote:
> This seems to me to demonstrate a fundamental issue with AI machines: they
> are still very large correlation processors but have not been developed far
> enough to distinguish between "empirical correlations" and logical rules. Thus "2
> plus 2 usually makes 4" is an empirical correlation, but "2+2=4" in an
> appropriate mathematical context is a logical rule. (And we understand "He
> added 2 and 2 and got 5" isn't an appropriate context!)
This is one of the many areas where the idea of 'agentic AI' comes in. If
you ask a human to add two numbers, what do they do? For a simple case like
the one you've mentioned they'd probably just do it in their head, but more
generally the easy answer is to outsource that calculation to something
designed explicitly for that type of task - such as a calculator. In an
agentic AI world, the role of the AI isn't to add those two numbers together,
but to hand the job to an agent (such as a calculator, via an API) that does
the work on its behalf.
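To make that concrete, here's a minimal sketch of such a calculator tool in
Python; the names are mine, and a real agent would invoke it over an API
rather than in-process:

import ast
import operator

# Map AST operator nodes to exact arithmetic (a deliberately tiny subset).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator_tool(expression: str) -> float:
    """Evaluate arithmetic exactly, as a rule, instead of predicting tokens."""
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval"))

print(calculator_tool("2 + 2"))  # 4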
Mix this with the newer reasoning models, which are much better at working
out the best path to an answer, and the answers you get from AI systems
nowadays for this type of question are significantly better than they were
only a few months ago. Instead of simply looking at "2 plus 2" and trying
to guess what comes next, the recent models are able to look at that
statement, determine it's a calculation, decide that the best way to solve
it is with a calculator, and then call a calculator agent to actually do
the work and get the answer. If the calculation were even more complex,
they might instead decide that the best option is to write Python or R
code to solve it, run that code, and then return the answer.
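A toy of that "write code, run it, return the answer" path (the string below
stands in for whatever the model would actually generate, and a real system
would run it in a proper sandbox, not a bare exec):

# Pretend model output for "sum the squares of 1..100".
generated = "result = sum(n * n for n in range(1, 101))"

namespace = {}
exec(generated, namespace)   # execute the generated snippet
print(namespace["result"])   # 338350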
Scott
------------------------------
Subject: Digest Footer
_______________________________________________
Link mailing list
[email protected]
https://mailman.anu.edu.au/mailman/listinfo/link
------------------------------
End of Link Digest, Vol 389, Issue 14
*************************************