Oops, I had some new abilities ON, which means the no-preprocessor mode had a worse score (because it could not use the new abilities without retraining weights, which takes time, lol), making it look like the shuffled preprocessor alone accomplished a lot. New tests:
*Let me compare to the above run's numbers, maybe no change! We see a 0.5"MB" increase going up to the shuffled run for the smaller input, and 0.37"MB" for the 1MB input. For the run below, it is 0.38"MB" and 0.27"MB". The amount shaved off just by going down to the shuffled run is 3.5"MB" and 1.9"MB" for the above post, and 3.1"MB" and 1.6"MB" for the below. So just looking at the below visually, it still looks good! Woho.*

without preprocessor           28,081    240,741
with [shuffled] preprocessor   25,018    224,552
with preprocessor              24,641    221,854

@Matt: "Dictionary preprocessing helps by reducing the input size, which reduces memory usage," "and by effectively increasing the context length without increasing the number of contexts that need to be mixed." I'm going to reverse engineer it to be online (on the fly).

Exact matches longer than 16 letters don't help much, let alone as much as the above, and this is because it is rare to find many, or even one, 20-letter context match. So it isn't increasing the context size for me (I ran the tests above with no hole or delay matching or ahead-of-time predictions, only exact matching, priming, and exponential functions, meaning I did not do longer matching, e.g. 20 letters, by using hole matching etc.).

There are only 2 possible explanations for why the cmix preprocessor is still shaving so much off my score:

1) Because it is predicting multiple letters when it predicts a letter, similar to Byte Pair Encoding style.

2) Because it is paying attention to context matches at the spaces and common joints to get the predicted next letters, ex. it asks what the next letter is for "we walked", "walked", "ed", and then no context match, instead of "walking", "alking", "lking", "king", "ing", "ng", "g", and no context match.

I'm going to be testing those 2 to see which, or if both, are helping, and how to improve them.
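Matt's point about dictionary preprocessing can be sketched in a few lines of Python. This is a toy illustration only: the three-word dictionary, the code bytes, and the function names are my own invention, not cmix's actual transform (which uses a large English dictionary, escape bytes for uncovered text, and word-boundary handling that this sketch skips).

```python
# Toy dictionary preprocessing sketch: common words are replaced by
# 1-byte codes, shrinking the input. Hypothetical dictionary, not cmix's.
DICT = {"walked": "\x01", "the": "\x02", "and": "\x03"}

def preprocess(text: str) -> str:
    # Substitute each dictionary word with its code, longest word first
    # so shorter entries never clobber part of a longer one.
    for word in sorted(DICT, key=len, reverse=True):
        text = text.replace(word, DICT[word])
    return text

def postprocess(coded: str) -> str:
    # Invert the substitution to recover the original text exactly.
    for word, code in DICT.items():
        coded = coded.replace(code, word)
    return coded

sample = "the dog walked and the cat walked"
coded = preprocess(sample)
assert postprocess(coded) == sample   # lossless round trip
print(len(sample), len(coded))        # → 33 17
```

Since each code stands for a whole word, an N-byte context over the coded stream spans far more of the original text, which is the "effectively increasing the context length" effect in Matt's quote, with no extra contexts to mix.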
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T192296c5c5a27230-M8f52a461e2f7528aea290490
Delivery options: https://agi.topicbox.com/groups/agi/subscription
