GPT-3 is like a super brain that memorizes the internet. RETRO is like a normal brain (1B parameters) that can look things up on Google. The retrieval approach gets the same performance for much less computation.
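Here is a toy sketch of the retrieve-then-read idea: instead of storing facts in the weights, keep them in an external database and look up nearest neighbors at prediction time. This is not RETRO's actual implementation (the paper uses frozen BERT embeddings and chunked cross-attention over a database of roughly 2 trillion tokens); the corpus, embed(), and retrieve() below are made up for illustration.

from collections import Counter
import math

# Stand-in for RETRO's trillion-token retrieval database.
DATABASE = [
    "the eiffel tower is in paris",
    "the great wall is in china",
    "water boils at 100 degrees celsius",
]

def embed(text):
    # Crude bag-of-words "embedding" standing in for frozen BERT vectors.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for missing words
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    # Return the k nearest chunks; RETRO does this per 64-token chunk.
    q = embed(query)
    return sorted(DATABASE, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# A real model would cross-attend to the retrieved neighbors; the point is
# that the answer lives in the database rather than in the parameters.
print(retrieve("where is the eiffel tower"))
# -> ['the eiffel tower is in paris']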
The data is interesting. Adding parameters always helps, and adding data always helps. Even the worst models predict at about human level, roughly 1 bit per byte. If there is an upper limit on prediction accuracy, we still don't know what it is; that question goes well beyond my own research. (A quick back-of-the-envelope on the 1 bit per byte figure follows the quoted thread below.)

On Sat, Dec 11, 2021, 12:08 PM James Bowery <[email protected]> wrote:

> My apologies if the image didn't come through.
>
> The reference is to:
> https://deepmind.com/research/publications/2021/improving-language-models-by-retrieving-from-trillions-of-tokens
>
> On Sat, Dec 11, 2021 at 10:22 AM <[email protected]> wrote:
>
>> ?
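As promised above, the back-of-the-envelope behind the 1 bit per byte figure. The loss and bytes-per-token numbers below are hypothetical, not taken from the paper; the point is only the unit conversion from cross-entropy loss to bits per byte and compression ratio.

import math

loss_nats_per_token = 2.8   # hypothetical validation loss, in nats/token
bytes_per_token = 4.0       # hypothetical average for a subword tokenizer

bits_per_token = loss_nats_per_token / math.log(2)
bits_per_byte = bits_per_token / bytes_per_token
compression_ratio = 8.0 / bits_per_byte   # raw text is 8 bits per byte

print(f"{bits_per_byte:.2f} bits/byte, {compression_ratio:.1f}x compression")
# 1 bit per byte corresponds to about 8x compression, which is in the range
# of Shannon's 1951 estimate of human prediction performance on English text.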
