On Thu, Jul 6, 2023 at 7:58 PM Matt Mahoney <[email protected]> wrote:
> ...
> The LTCB and Hutter prize entries model grammar and semantics to some extent
> but never developed to the point of constructing world models enabling them
> to reason about physics or psychology or solve novel math and coding
> problems. We now know this is possible in larger models without grounding in
> nonverbal sensory data, even though we don't understand how it happened.
We don't understand how it happened. No. We don't really understand much at all. It's all been a process of hacking at some very old ideas about training to fixed categories, sped up by GPUs developed for the game market, and most recently enhanced by the accidental discovery that context, through "attention", seems to be central.

That distributional analysis of language might yield categories useful for broader reasoning always seemed plausible to me, so I find no shock in it with LLMs. I just think LLMs are limited by not being able to find novelty. Their categories are fixed at the time of training. In reality I think the categories can shift and be novel, that they are chaotic attractors.

Anyway, I encourage those who are inclined to think about theory to focus on the fact that simply allowing the size of the models to increase seems to be of central importance.
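For anyone who wants a concrete picture of what "context, through attention" means mechanically, here is a minimal sketch of scaled dot-product attention in NumPy. The names and shapes are purely illustrative, not taken from any particular model; it only shows how every output position becomes a context-weighted mixture of the whole sequence.

import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Each output row is a mixture of all the value vectors,
    # weighted by query-key similarity, so the full context
    # influences every position.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V

# Toy example: 4 "tokens" with 8-dimensional embeddings.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(attention(Q, K, V).shape)   # prints (4, 8)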
