I love doing math when it's simple, elegant math.
Let's now take the 3 most likely best candidates and see if one is better.
http://mattmahoney.net/dc/text.html

14,838,332 compressed size
208,961 decompressor size
602,867 time (s)
25,738 RAM (MB)
(100,000,000 - (14,838,332 + 208,961)) / 25,738 = ~3,300
3,300 * 100,000,000 = 330,000,000,000
330,000,000,000 * 1 = 330,000,000,000 (slowest entry, so its time factor is 1)

15,010,414 compressed size
42,944 decompressor size
86,305 time (s)
6,319 RAM (MB)
(100,000,000 - (15,010,414 + 42,944)) / 6,319 = ~13,443
13,443 * 100,000,000 = 1,344,300,000,000
1,344,300,000,000 * (602,867 / 86,305) = ~9,390,000,000,000

16,209,167 compressed size
407,477 decompressor size
1,797 time (s)
13,000 RAM (MB)
(100,000,000 - (16,209,167 + 407,477)) / 13,000 = ~6,414
6,414 * 100,000,000 = 641,400,000,000
641,400,000,000 * (602,867 / 1,797) = ~215,180,000,000,000
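The three calculations above can be put in one place. This is just a sketch of the score as I'm computing it, assuming the LTCB columns are compressed size (bytes), decompressor size (bytes), time (seconds), and RAM (MB); only cmix and durilca'kingsize are named in this post, so the middle entry is left unnamed. Each entry's time is normalized against the slowest one (cmix).

```python
INPUT = 100_000_000  # enwik8

# Figures copied from the table rows above.
entries = {
    "cmix":             dict(comp=14_838_332, code=208_961, time=602_867, ram=25_738),
    "(second entry)":   dict(comp=15_010_414, code=42_944,  time=86_305,  ram=6_319),
    "durilca'kingsize": dict(comp=16_209_167, code=407_477, time=1_797,   ram=13_000),
}

slowest = max(e["time"] for e in entries.values())  # cmix's 602,867 s

scores = {}
for name, e in entries.items():
    # Bytes saved (net of the decompressor) per MB of RAM used...
    saved_per_mb = (INPUT - (e["comp"] + e["code"])) / e["ram"]
    # ...scaled by the input size and by how much faster than cmix it ran.
    scores[name] = saved_per_mb * INPUT * (slowest / e["time"])

for name, s in scores.items():
    print(f"{name:18s} {s:,.0f}")
```

Run it and durilca'kingsize comes out on top by a wide margin, mostly because its time factor (602,867 / 1,797) is so large.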

I show above that durilca'kingsize scores about 652 times higher than cmix 
(215,180,000,000,000 vs 330,000,000,000).

Their compression scores (enwik8 / enwik9) are:
cmix:             14,838,332 / 115,714,367
durilca'kingsize: 16,209,167 / 127,377,411
I'm thankful each has two scores (enwik8 and enwik9), because this goes to 
show us that, regardless of how fast it ran or how much memory it used, 
durilca got the 1,000,000,000 bytes down to 127,377,411. Using just 10 times 
more data we see a curve of its learning rate: 16,209,167 and 127,377,411 
work out to compression ratios (original size / compressed size) of about 
6.2x and 7.9x; cmix's are about 6.7x and 8.6x. So if we guess the rest of 
the curve for durilca, i.e. 7.9x at 1,000,000,000 bytes and 6.2x at 
100,000,000 bytes, the smaller inputs might compress to roughly 2,100,000, 
280,000, 40,000, 5,400, 670, ... so about a 4.7x ratio at 10,000,000 bytes 
for durilca. So the sequence goes: 4.7, 6.2 (~1.5 increase), 7.9 (~1.7 
increase), and it seems completely plausible that just 10 times more data, 
let alone 652x more, will give us at least 1.0 more, i.e. 8.9+, beating 
cmix's 8.6! So it's way better, it's like 18.0 or more!
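The learning-curve guess can be sketched numerically from the two measured points; note the one-decade projection is my rough linear extrapolation, not a measurement:

```python
# Compressed sizes at the two measured input sizes (from the table above).
durilca = {100_000_000: 16_209_167, 1_000_000_000: 127_377_411}
cmix    = {100_000_000: 14_838_332, 1_000_000_000: 115_714_367}

# Compression ratio = original size / compressed size.
ratios_d = {n: n / c for n, c in durilca.items()}   # ~6.17x, ~7.85x
ratios_c = {n: n / c for n, c in cmix.items()}      # ~6.74x, ~8.64x

# Gain per 10x of data, then project one decade (10 GB) further at the
# same per-decade gain -- the optimistic assumption the argument rests on.
gain_per_decade = ratios_d[1_000_000_000] - ratios_d[100_000_000]   # ~1.68
projected_10gb = ratios_d[1_000_000_000] + gain_per_decade          # ~9.5

print(round(gain_per_decade, 2), round(projected_10gb, 2))
```

Under that straight-line assumption durilca's projected ratio already passes cmix's measured 8.6x at enwik9, which is the whole point of the extrapolation.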

So durilca'kingsize should truly be at the top of the Hutter Prize board and 
the LTCB board then ... as this should be the ultimate equation. I haven't 
tested the others, but durilca looked nice in speed, memory, and 
compression. I do see others below it that might be better.

You might think compression shows how smart it is, but more data makes it 
smarter too; it's not just about how much it can compress a specified amount 
of data. If it can find rare patterns, great, it'll be more accurate; it 
will probably find many and get the score down a wee bit. Same for "more 
data". Now note, if you have a truly fantastic dataset that has all the 
patterns, with the rarest appearing only a few times, then this would be 
evident: whether you add more data or find the rarer patterns that the old, 
dumber algorithm didn't pick up, you get the same effect. The extra data 
passes over more of the rare patterns, and so the model gets to know better 
what to predict for them.

Hmm, it may be possible that what I did above in my calculations DOES make 
it a better predictor, but somehow it inflates the rank too much, even 
though we SHOULD include it! Because if we feed it 100x more data it doesn't 
get 100x better at rare patterns, just at the common patterns; it isn't 
covering the harder problems even with a perfect dataset. It may pass over 
100x more last-name examples, e.g. "Joe Sands has a dad, Greg Sands", etc., 
where none share the names, only the "has a son/mom/dad" part, and if the 
AI doesn't pick that relation up, it would fail every test case here 
instead of acing them all, no?
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T6761a13445e5864b-M53c17962e9be5df9bc23ecc4