I will improve my translation function as I score it; it works fine for now. Before I add it to my AI code, I want to verify it won't take too long to compute. As I already mentioned, it should work great, maybe better than GPT's, on 40GB of text. For the Hutter Prize/LTCB it obviously needs to update the web of relations it stores 10 or 100 times over the course of the 1GB enwik9.txt file, which may be costly if that means a 100 times longer wait. However, others score 15MB, so it should be possible, especially if done online and without using a dictionary for "Byte Pair Encoding".
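To put rough numbers on that cost worry, here is a minimal sketch of how the total update time could be estimated before committing to 10 or 100 updates. Everything in it is an assumption for illustration: the function names are hypothetical, the updates are assumed to be evenly spaced through the file, each update is assumed to cost time proportional to the amount of text seen so far, and the 60 seconds per full rebuild is a placeholder, not a measurement of my code.

```python
# enwik9 is the first 10^9 bytes of an English Wikipedia dump.
ENWIK9_BYTES = 1_000_000_000


def update_positions(total_bytes, num_updates):
    """Byte offsets at which the relation web is rebuilt (assumed evenly spaced)."""
    step = total_bytes // num_updates
    return [step * i for i in range(1, num_updates + 1)]


def total_update_seconds(num_updates, seconds_per_full_rebuild,
                         total_bytes=ENWIK9_BYTES):
    """Estimate total rebuild time across the whole file.

    Assumption: one rebuild over the full file takes
    `seconds_per_full_rebuild`, and a rebuild at an earlier offset
    costs proportionally less because less text has been seen.
    """
    total = 0.0
    for pos in update_positions(total_bytes, num_updates):
        total += seconds_per_full_rebuild * (pos / total_bytes)
    return total


if __name__ == "__main__":
    # Placeholder timing: 60 s for a rebuild over the full 1GB (not measured).
    for n in (10, 100):
        est = total_update_seconds(n, 60.0)
        print(f"{n} updates: about {est:.0f} s of rebuild time")
```

Under this linear-cost assumption, 100 updates come out roughly nine times as expensive as 10 updates, not 100 times, because the early updates only cover a small part of the file. If the real update cost grows faster than linearly, the gap would be larger, so the per-rebuild time should be measured on a smaller file first.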
After I *verify* it won't take too many resources, I may go back to my BPE and weight mixing, because I should finish them first. Having *verified* it, I won't waste my efforts only to find out later that translation takes ages to compute. I kinda already verified it, but I should verify it more.
My roadmap is roughly this. Current score: 19MB (I need to get to 15MB at least). I may get 18.6MB with the current code if I get all the juice out of it. I may get 17.6MB if I tweak the hell out of the mixing function to perfect it. The last 2.6MB will probably come largely from the translated matching and priming systems. Obviously I have more ideas, so if I'm still not at 15MB and am at 16MB after all this, I still know how to get it down from there. This could all be done in 1 year if I move quickly enough.
The real test will be seeing whether it generates text like GPT-2 as we get closer. If it still seems far off after getting closer to 15MB, that's a sign. But currently it looks very promising. I'm not sure what is up with the current leaders on the LTCB: where are their text completions? Why am I the only one doing the job here? Does no one else care to post results or explain why they can't generate?
There are a lot of possible matches when you consider hole matches, delay, and translation. You want to get the big/likely matches, the main juice, the patterns.
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T01fa5e447808d368-M8b3cf3dcdb03a81405482482
Delivery options: https://agi.topicbox.com/groups/agi/subscription
