Date: Tue, 14 May 2024 20:48:41 +1000
From: Mez Breeze via NetBehaviour <netbehaviour@lists.netbehaviour.org>
To: NetBehaviour for networked distributed creativity
<netbehaviour@lists.netbehaviour.org>
Cc: Mez Breeze <netwur...@gmail.com>
Subject: [NetBehaviour] ChatGPT-4o: Breakthrough or Bust?
Hihi All.
So I've just written this Patreon post all about OpenAI's latest
release, ChatGPT-4o. I'm curious to see what peeps here make of it:
--
_ChatGPT-4o: Breakthrough or Bust?_
ChatGPT-4o [or Omni as the cool kids at OpenAI term it] sauntered into
the release-spotlight earlier today, with OpenAI writing on their
website that it's:
"...a step towards much more natural human-computer interaction - it
accepts as input any combination of text, audio, and image and
generates any combination of text, audio, and image outputs. It can
respond to audio inputs in as little as 232 milliseconds, with an
average of 320 milliseconds, which is similar to human response time in
a conversation."
Released alongside a bevy of slick videos, ChatGPT-4o is touted as the
next big thing, with multiple video promos parading its new multimodal
chops. One such video shows one version of GPT-4o narrating visual
scenes to its visually impaired AI mate [while it asks questions in
turn], all while the human testers fidget impatiently on the sidelines,
barely masking their urge to fast-forward to the good bits.
The grand unveiling of GPT-4o could've been lifted straight from a
sci-fi script. We watched [some in awe, some with cynicism] as these
videos attempted to paint a future where AIs chat about the décor in an
empty room and then sing a duet about the process. And yet it's hard
not to chuckle at the not-so-subtle desperation in the human testers,
who seem hell-bent on skipping/interrupting the more stilted
voice-scripted parts of the AI dialogue and hurrying to reach the
'money shot' of the demo [i.e. the concrete feature they were trying to
showcase]. It begs the question: was the fanfare a bit premature? After
all, this wasn't the GPT-5 release party everyone had RSVP'd to.
Diving into the nitty-gritty, the AI's voice tech is frankly
impressive, reminding us of those Figure 01 Speech-to-Speech Reasoning
clips from a while back. The voices are spot-on: the AI sounds like a
person, with natural human-emulated cadences, sub-vocalisations, and
tonalities [the laughter is weirdly realistic]. What was far less
impressive is the replication of bias in the gendered aspects of the
simulated speech, with the female voices being far more
sexualised/flirty than the male ones. It's especially disappointing
given the potential ramifications for how this will impact users and
perpetuate current gender biases. In 2024, one would hope we'd be past
such gendered gimmicks [but...no].
Now let's talk timing: releasing ChatGPT-4o for free might seem like a
generous move, but the cynics among us might sniff something fishy.
Rumour has it OpenAI's nearly chewed through the entire web for data to
feed its language models, so why not throw open the gates and let the
global crowd serve up fresh fodder? It's a clever ploy for scraping up
every last crumb of human interaction to power their data-hungry tech.
Let's not be too harsh, though. OpenAI's latest toy is a bit of a
marvel: in some ways it's like watching a new species come to life. But
as we ooh and ahh over this latest OpenAI release, are we so dazzled by
the prospect of talking chat [and vision-processing] bots that we
overlook the less glamorous implications, like privacy erosion and data
exploitation? [And let's not kid ourselves, wasn't everyone really
hanging out for GPT-5?]
In essence, ChatGPT-4o is a bit of a mixed bag. It's an impressive
party trick, sure, but when the release-glitter settles we're left
pondering what it really brings to the table. It's a step forward no
doubt, but also a sidestep, a fancy detour on the road to more profound
innovations. The real kicker isn't what this AI can do but what it
represents in the grand scheme of things [all up a blend of
breakthrough and bias, of marvels and missed opportunities]. So let's
marvel at the spectacle but remain skeptical of the smoke and mirrors.
After all [in the current world of AI release-hype] every new release
is a double-edged sword, and it's up to us to suss out which edge to
grab.
--
| mezbreezedesign.com