[agi] rules...found something slightly undesired I think...very interesting

immortal . discoveries Fri, 28 May 2021 15:05:15 -0700

So the Hutter Prize contest rules allow only CPU usage, not GPU. I assume you 
can use multiple CPU cores for parallelization.


Matt's contest (LTCB) allows GPU usage and unlimited cores, memory, and time:
"Timing information, when available, may vary widely depending on the test 
machine used."

Both are correct, with a small issue I think I found. We know an algorithm can 
train on more data or use more cores. The contests restrict the data to one 
dataset, enwik9, at a fixed size of 1GB; we don't need to benchmark AI on more 
data to see whose is better. Nor do we need more cores. But this isn't to say 
1 core is all we need and that we can then just multiply the usage over 100000 
cores. We should make sure the algorithm CAN run in parallel, so the rules 
should allow e.g. 4 CPU cores. The Hutter Prize does this, as I read it: you 
can use e.g. 4 CPU cores and only enwik9, all limited in amount.

Matt's goes too far (really it is a good thing, but the problem is he doesn't 
go all the way, keep reading): there is no core limit. This is not good; it's 
like using more data. My AI can get less error, hence a better ratio, if it 
trains on 100GB of text, since it does better on bigger data, and the same 
holds if I use more cores. It doesn't really tell us whose AI is better, only 
who has more cash at home to use more cores or to train on more data or for 
more time. Matt's allows unlimited cores but a limited data size, so why can't 
I show my ratio from using 100GB?

Now, I DO agree the HP contest should limit cores and the amount of data used, 
to see whose AI is better, and I also agree we should have an unlimited 
contest like Matt's that shows how good a predictor can be. But Matt's needs 
to start allowing unlimited dataset sizes; currently he only allows unlimited 
core usage. If you have more cash, you can use more cores and get a better 
predictor from having more compute. This, right here, is not public equality: 
only the rich can get the best score, so the contest is no longer public, it 
is about what is possible on Earth! Hence Matt's contest should allow 100GB+ 
usage too, so we can show how good a predictor can be. Why unlimited cores but 
not unlimited data sizes?

We /can/ compare ratios. Notice how Fabrice Bellard's entry scores about 15MB 
on 100MB and about 110MB on 1GB. On 1GB that is a compressed size of about 11% 
of the input. Roughly speaking, averaged over all the prompts it saw, it 
predicted the actual answer blindfolded well enough to save 89% of the size. 
It will likely do even better on 1TB of text.
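
To make the ratio comparison concrete, here is a small sketch of the 
arithmetic. It assumes enwik8 is 10^8 bytes and enwik9 is 10^9 bytes, and it 
uses the rounded sizes quoted above (15MB and 110MB), not exact leaderboard 
values. The function name compression_stats is made up for illustration.

```python
# Sketch: turn "compressed size on dataset size" into comparable numbers.
# The sizes below are the rounded figures from this post (assumption),
# not exact benchmark results.

ENWIK8_BYTES = 10**8  # 100MB dataset
ENWIK9_BYTES = 10**9  # 1GB dataset


def compression_stats(compressed_bytes: int, original_bytes: int) -> dict:
    """Return ratio, saving, and bits per input byte for one result."""
    ratio = compressed_bytes / original_bytes
    return {
        "ratio": ratio,              # compressed size / original size
        "saving": 1.0 - ratio,       # fraction of the size removed
        "bits_per_byte": 8 * ratio,  # average output bits per input byte
    }


small = compression_stats(15_000_000, ENWIK8_BYTES)
large = compression_stats(110_000_000, ENWIK9_BYTES)

for name, stats in (("100MB", small), ("1GB", large)):
    print(
        f"{name}: ratio={stats['ratio']:.2f} "
        f"saving={stats['saving']:.0%} "
        f"bits/byte={stats['bits_per_byte']:.2f}"
    )
```

The ratio drops from 0.15 on 100MB to 0.11 on 1GB, which is the "does better 
on bigger data" effect. Note the saving is a size measure, not literally the 
fraction of symbols guessed right.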

So I'm going to add to my Guide that we should use one contest for finding 
better AI (HP...) and another contest for finding the best implementation of 
AI (Matt's only half matches this criterion :( ...). Matt, simply start adding 
10GB+ benchmarks; it is easy to take the top algorithms and get some stats 
right away. Beowulf clusters and supercomputers should also be allowed, for 
intense parallelization... Korrelan seems to do this, and so do 
supercomputers, and I think OpenAI.

My job is obviously the HP contest. I could, sooner rather than later, try 
large GPU usage, but really this is not my game; it is a rich man's job. I can 
get richer by doing the HP contest (showing my AI is smarter, not that I am 
rich or have more cores). In that case my AI may appear worse than Bellard's 
score, for now. Doing any elaborate core usage, or even using a GPU, would 
mostly be a waste of time for finding smarter AI, just like using more data 
is.

Then again, hmm, Matt's contest is about scaling using more cores, a rich 
man's job, what we can do on Earth... but maybe a limited dataset size of 1GB 
is fine for that. I mean, his test is about who has more money, and that is 
already clear at 1GB, so why use 10GB? The result would only change if you had 
more cash for more memory and compute. Scoring well on 10GB says no more about 
speed than scoring well on 1GB does (except in a contest of who has more time 
to train AI), and hmm, the same goes for memory. So it's a contest of who has 
more cash and who spent longer training their AI, but the latter requires an 
unlimited dataset size; otherwise Matt's contest is fine as is. But since it's 
not a whose-AI-is-smarter "contest" and is a who-is-richer contest, why is it 
any weirder to see it as a who-spent-longer-training contest (hence use 1TB of 
text)?

And if you had 1 trillion cores and big RAM or cache, and used enwik[3], IOW 
1KB of data, you couldn't really use all your cores, assuming you can look far 
ahead at the future and are no longer using compression (evaluation) for 
training. So... perhaps using more data, e.g. 1TB of text, shows how rich you 
are.
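
A toy sketch of that last point, under a loud assumption: the work splits into 
one independent task per byte of data, so cores beyond the number of bytes sit 
idle. Real compressors predict sequentially and do not split this cleanly; 
usable_cores and utilization are hypothetical helpers just to show the bound.

```python
# Toy bound: you cannot keep more cores busy than you have work items.
# ASSUMPTION (not from any contest rules): one independent task per
# `bytes_per_task` bytes of training data.


def usable_cores(cores: int, data_bytes: int, bytes_per_task: int = 1) -> int:
    """Cores that can have at least one task, given the data size."""
    tasks = max(1, data_bytes // bytes_per_task)
    return min(cores, tasks)


def utilization(cores: int, data_bytes: int, bytes_per_task: int = 1) -> float:
    """Fraction of the available cores that get any work at all."""
    return usable_cores(cores, data_bytes, bytes_per_task) / cores


TRILLION_CORES = 10**12

for label, size in (("1KB", 10**3), ("1GB", 10**9), ("1TB", 10**12)):
    busy = usable_cores(TRILLION_CORES, size)
    frac = utilization(TRILLION_CORES, size)
    print(f"{label}: busy cores={busy} utilization={frac:.1e}")
```

Under this assumption only the 1TB case keeps every one of the trillion cores 
busy, which is the sense in which a bigger dataset "shows how rich you are".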
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Te9633f76cfbb22e5-M7890e697c4d229f7091aa213
Delivery options: https://agi.topicbox.com/groups/agi/subscription
