Wait. I might have real proof for my "store all the main strings on GPU, no network needed" idea.
1) GPT-2 was trained on 40GB of text; the network is 6.2GB after training.

2) Humans mostly know short sentences around 8 words long. We don't store/remember every 100-word stretch of everything we read (picture a 100-word window that slides over one word at a time across the entire 40GB of text, taking a snapshot at each position).

3) If we don't even use BPE, and just store every 5-word phrase that occurs at least 2 times in the 40GB, I calculated that's about 20 billion strings, roughly 100GB of text (16x larger than GPT-2's storage). But we haven't done anything else yet to get that number down; I'm sure there's something simple. A rough sketch of the counting step is below.
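To make point 3 concrete, here is a minimal Python sketch of the counting step, assuming a hypothetical corpus.txt and per-line tokenization by whitespace. It is illustration only: at 40GB scale a single in-memory Counter won't fit, so you'd shard the corpus or use disk-based/approximate counting instead.

```python
# Sketch of point 3: count every 5-word phrase in a corpus, keep the ones
# seen at least twice, and estimate how much raw text the kept set adds
# up to. "corpus.txt" is a hypothetical filename (an assumption); an
# exact in-memory Counter will not scale to 40GB as-is.

from collections import Counter

N = 5           # phrase length in words
MIN_COUNT = 2   # keep phrases that occur at least twice

def five_grams(words, n=N):
    """Slide an n-word window one word at a time, like the 100-word
    window described in point 2, but with n=5."""
    for i in range(len(words) - n + 1):
        yield " ".join(words[i:i + n])

counts = Counter()
with open("corpus.txt", encoding="utf-8") as f:   # hypothetical file
    for line in f:  # sketch: phrases are not counted across line breaks
        counts.update(five_grams(line.split()))

kept = {p: c for p, c in counts.items() if c >= MIN_COUNT}

# Rough storage estimate: total bytes of the kept phrases themselves.
raw_bytes = sum(len(p) for p in kept)
print(f"{len(kept):,} phrases kept, ~{raw_bytes / 1e9:.1f} GB of raw text")
```

Whether the kept set really comes out near 20 billion strings / ~100GB depends on the corpus; the point of the sketch is just that the whole pipeline is a few lines, which leaves plenty of room for the "something simple" that shrinks it further.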
