I love doing math when it is simple, elegant math. Let's now take the top three candidates and see if one is better. http://mattmahoney.net/dc/text.html
From the LTCB table, the three entries (compressed enwik8 size, decompressor code size, time, RAM, units as listed there):

cmix:              compressed 14,838,332 | code 208,961 | time 602,867 | RAM 25,738
(second entry):    compressed 15,010,414 | code  42,944 | time  86,305 | RAM  6,319
durilca'kingsize:  compressed 16,209,167 | code 407,477 | time   1,797 | RAM 13,000

My score for each is (100,000,000 - (compressed + code)) / RAM, times 100,000,000, times cmix's time divided by its own time:

cmix: (100,000,000 - (14,838,332 + 208,961)) / 25,738 ≈ 3,300; × 100,000,000 ≈ 330,000,000,000; × (602,867 / 602,867) ≈ 330,000,000,000

(second entry): (100,000,000 - (15,010,414 + 42,944)) / 6,319 ≈ 13,443; × 100,000,000 ≈ 1,344,300,000,000; × (602,867 / 86,305) ≈ 9,390,400,000,000

durilca'kingsize: (100,000,000 - (16,209,167 + 407,477)) / 13,000 ≈ 6,414; × 100,000,000 ≈ 641,400,000,000; × (602,867 / 1,797) ≈ 215,180,000,000,000

So the scores above show durilca'kingsize coming out about 652 times ahead of cmix (215,180,000,000,000 / 330,000,000,000 ≈ 652) - the equivalent, on this measure, of 652 times more data training. (There's a small Python check of this arithmetic at the end of this post.) Their compression scores (enwik8 and enwik9) are:

cmix: 14,838,332 and 115,714,367
durilca'kingsize: 16,209,167 and 127,377,411

I'm thankful each has two scores (enwik8 and enwik9), because this shows us that, regardless of how fast it ran or how much memory it used, durilca got the 1,000,000,000 bytes down to 127,377,411. Using just 10 times more data, we see a curve of its learning rate: durilca's compression factors (original size divided by compressed size) are 6.2x on enwik8 and 7.9x on enwik9; cmix's are 6.7x and 8.6x. So if we guess durilca's other points - 7.9x for 1,000,000,000 bytes, 6.2x for 100,000,000, and for the smaller inputs guessed compressed sizes of roughly 2,100,000 (for 10,000,000 bytes), 280,000 (1,000,000), 40,000 (100,000), 5,400 (10,000), 670 (1,000), ... - that comes to about a 4.8x compression factor at 10,000,000 bytes for durilca. So the sequence is 4.8x, 6.2x (~1.4 increase), 7.9x (~1.7 increase), and it seems completely plausible that just 10 times more data, let alone 652x more, would give us at least 1.0 more, i.e. 8.9x - more than cmix's 8.6x! So it's way better; at 652x more data it's maybe like 18x or more! (A sketch of this extrapolation also follows at the end of this post.) So durilca'kingsize should truly be at the top of the Hutter Prize board and the LTCB board then ... as this should be the ultimate equation. I haven't tested the others, but durilca looked nice in speed, memory, and compression. I do see others below it that might be better.

You might think compression shows how smart it is, but ... more data makes it smarter; it's not just about how much it can compress a specified amount of data. If it can find rare patterns, great, it'll be more accurate, and it probably will find many and get the score down a wee bit. Same for "more data". Now note: if you have a truly fantastic dataset that contains all patterns, with the rarest appearing only a few times, this would be evident, because whether you add more data or find those rarer patterns the old, dumber algorithm missed, you get the same effect: more data passes over more rare patterns and therefore the model gets to know what to predict for them better.

Hmm, it may be possible that what I did above in my calculations DOES make it a better predictor, but somehow it inflates the rank, even though we SHOULD include it! Because if we feed it 100x more data it doesn't get 100x better at rare patterns, just at the common patterns; it isn't covering the harder problems even with a perfect dataset, because it may pass over 100x more of, say, last-name examples - e.g. Joe Sands has a dad Greg Sands, etc. - where none share the names except the "has a son/mom/dad" part. If the AI doesn't pick that up, it would fail every such test case instead of acing them all, no?
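To make the score arithmetic at the top of this post easy to re-check, here is a minimal Python sketch. The formula is my own proposal, not an official LTCB or Hutter Prize metric; the second entry is unnamed above, and units are whatever the LTCB page lists.

# Sketch of the ranking formula proposed above (my own formula, not an
# official LTCB/Hutter Prize metric): bytes saved on enwik8 per unit of
# RAM, scaled by input size and by speed relative to cmix.

ENWIK8 = 100_000_000   # enwik8 input size in bytes
CMIX_TIME = 602_867    # cmix's time, used as the speed baseline

entries = {
    # name: (compressed enwik8 bytes, code bytes, time, RAM) - from LTCB
    "cmix":             (14_838_332, 208_961, 602_867, 25_738),
    "(second entry)":   (15_010_414,  42_944,  86_305,  6_319),
    "durilca'kingsize": (16_209_167, 407_477,   1_797, 13_000),
}

def score(compressed, code, time, ram):
    # bytes saved per unit of RAM, times input size, times speed ratio
    return (ENWIK8 - (compressed + code)) / ram * ENWIK8 * (CMIX_TIME / time)

scores = {name: score(*stats) for name, stats in entries.items()}
for name, s in scores.items():
    print(f"{name:18} {s:,.0f}")

ratio = scores["durilca'kingsize"] / scores["cmix"]
print(f"durilca'kingsize vs cmix: {ratio:.0f}x")

Running it reproduces the three scores above (to rounding) and prints the ~652x ratio between durilca'kingsize and cmix.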

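And here is a small Python sketch of the learning-curve extrapolation. Only the enwik8 and enwik9 points are measured LTCB results for durilca'kingsize; the smaller-input compressed sizes are my rough guesses from above, so the fit only shows the trend.

# Sketch of the learning-curve extrapolation above: compression factor
# (original size / compressed size) vs. input size for durilca'kingsize.
# The two largest points are measured; the rest are my rough guesses.

points = [
    # (input bytes, compressed bytes)
    (1_000,                 670),   # guessed
    (10_000,              5_400),   # guessed
    (100_000,            40_000),   # guessed
    (1_000_000,         280_000),   # guessed
    (10_000_000,      2_100_000),   # guessed
    (100_000_000,    16_209_167),   # enwik8, measured
    (1_000_000_000, 127_377_411),   # enwik9, measured
]

# compression factor = original size / compressed size
factors = [size / comp for size, comp in points]
for (size, _), f in zip(points, factors):
    print(f"{size:>13,} bytes -> {f:.2f}x")

# gain in the factor per 10x more data (the ~1.4 and ~1.7 steps above)
gains = [b - a for a, b in zip(factors, factors[1:])]
print("per-10x gains:", ", ".join(f"{g:+.2f}" for g in gains))

# if the last gain merely repeated, 10x more data (10 GB) would give:
print(f"naive next step: {factors[-1] + gains[-1]:.2f}x")

Even the flat continuation (about 9.5x at 10,000,000,000 bytes) is past cmix's 8.6x on enwik9, which is the comparison I'm making above; whether the per-decade gains keep growing all the way out to 652x more data is the open question.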