I will release the new code soon. My research seems to suggest, though it still needs testing to confirm it isn't crazy, that storing all of the Byte Pair Encoding "elements" as-is on GPU, with no network on GPU (I'm temporarily avoiding putting the network itself on GPU), doesn't hurt memory, and the main algorithm can still run with no changes.
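To make the store-and-broadcast idea concrete, here is a minimal NumPy sketch of what "input broadcasts to all stored features to collect predictions" could look like. All names, the dot-product scoring, and the feature/prediction layout are my own assumptions for illustration, not the actual algorithm; on GPU the array would simply live in a CuPy or torch tensor instead.

```python
import numpy as np

# Hypothetical layout: every stored BPE "element" is one row of a
# fixed-width feature matrix, paired with a prediction (e.g. a next byte).
rng = np.random.default_rng(0)
n_stored, dim = 1000, 64
stored_features = rng.standard_normal((n_stored, dim))   # all stored elements
stored_predictions = rng.integers(0, 256, size=n_stored)  # one guess per element

def predict(query, k=5):
    """Broadcast one query vector against every stored feature at once
    and collect predictions from the k closest matches.
    Scoring is assumed to be dot-product similarity."""
    scores = stored_features @ query   # one broadcasted matmul over all rows
    top = np.argsort(scores)[-k:]      # indices of the k best matches
    return stored_predictions[top]     # their predictions, to be merged/voted

candidates = predict(rng.standard_normal(dim))
```

The point of the sketch is that inference is a single matrix-vector product over the whole store, with no network to train or route through.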
> If this actually works, a large dataset can be stored VERY fast and inference
> would be VERY fast, as the input would simply broadcast to all stored
> features to collect predictions. And the code is much simpler (huge bonus).
> This also avoids the need to make a network on GPU work, because it would be
> trickier to "get" the input to signal correctly, and difficult to manage the
> width of the network, last I thought about how to bring it back down (many
> tiny routers that make the network's width smaller again). GPT seems to do
> something like that too: it makes the width smaller. What do you all think?

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T6cf3be509c7cd2f2-Me05b6030e4cbd2b211d0c343
Delivery options: https://agi.topicbox.com/groups/agi/subscription
