On Wed, Jul 5, 2023 at 7:05 PM Matt Mahoney <[email protected]> wrote:
>...
> LLMs do have something to say about consciousness. If a machine passes the 
> Turing test, then it is conscious as far as you can tell.

I see no reason to accept the Turing test as a definition of
consciousness. Who ever suggested that? Even Turing never suggested
that, to my knowledge. The Turing test is a weak, non-explanatory
definition even of intelligence. It doesn't say much to me about
intelligence. It says nothing to me about consciousness. I don't even
know if you're conscious.

What does amuse me about LLMs is how large they become. Especially
amusing in the context of the Hutter Prize, which I recall you
administered for a time.

I recall the goal of the Hutter Prize was to compress text, on the
assumption that the compression would be an abstraction of meaning.

I argued it was the wrong goal: that meaning would turn out to be an
expansion of data.

And now, what do we find? We find that LLMs just seem to get bigger
all the time. That more training, far from compressing more
efficiently, just keeps on generating new parameters. In fact training
for longer generates a number of parameters roughly equivalent to just
adding more data.
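
To put rough numbers on that expansion (a quick back-of-the-envelope
sketch in Python; the figures are approximate, and the comparison is
loose, since a GPT-3-scale model is trained on far more text than the
Wikipedia extract the Hutter Prize asks entrants to compress):

# Parameter storage of a GPT-3-scale model vs. the Hutter Prize corpus
# (enwik9, ~1 GB of Wikipedia text). Rough, published ballpark figures.
params = 175e9                 # GPT-3's published parameter count
bytes_per_param = 2            # assuming 16-bit parameters
model_bytes = params * bytes_per_param

enwik9_bytes = 1e9             # ~1 GB corpus

print(f"Parameter storage: ~{model_bytes / 1e9:.0f} GB")
print(f"enwik9 corpus:     ~{enwik9_bytes / 1e9:.0f} GB")
print(f"Model is ~{model_bytes / enwik9_bytes:.0f}x the size of the corpus")

On that rough comparison, the "model of the text" is hundreds of times
larger than the text the prize wanted compressed.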

I asked about this online: is there any evidence for a ceiling? The
best evidence I could find was for a model called Chinchilla. It seems
that Chinchilla, at 70B parameters trained on 1.4T tokens, slightly
outperformed a 280B model (Gopher) trained on roughly 4.5x fewer
tokens (300B vs 1.4T).

So roughly 4x the parameters gave much the same result as roughly 4x
the data!
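
For what it's worth, here is a rough sanity check of that comparison
(a sketch using the common approximation that training compute is
about 6 * parameters * tokens; the numbers are the published ballpark
figures for Gopher and Chinchilla, nothing exact):

# Gopher (280B params, 300B tokens) vs. Chinchilla (70B params,
# 1.4T tokens), compared on approximate training compute C ~ 6*N*D.
def train_flops(params: float, tokens: float) -> float:
    """Approximate training compute in FLOPs."""
    return 6.0 * params * tokens

gopher = train_flops(280e9, 300e9)       # ~5.0e23 FLOPs
chinchilla = train_flops(70e9, 1.4e12)   # ~5.9e23 FLOPs

print(f"Gopher:     {gopher:.2e} FLOPs")
print(f"Chinchilla: {chinchilla:.2e} FLOPs")
print(f"Ratio:      {chinchilla / gopher:.2f}x")

Roughly the same compute budget: about 4x fewer parameters, about 4.7x
more tokens, and slightly better results. Parameters and data trading
off more or less one for one.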

Is training compressing the data, or expanding it?

In practice it seems people are going with more data, which is much
easier than doing more training. But the two seem to be much the same
thing. And none of it says anything about what would happen if you
just kept training forever. Eternally better performance with
eternally more "parameters"? Nobody knows.

Anyway, the success of the enormously BIG, LARGE language models, with
no ceiling yet in sight, seems to knock into a cocked hat, once and
for all, the whole conception of the Hutter Prize: that intelligence
is compression, and that the model with the smallest number of
parameters would turn out to be the best.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T42db51de471cbcb9-Md910373c6afaf37948f6942d
Delivery options: https://agi.topicbox.com/groups/agi/subscription
