ARC-AGI-78794312-SUPER-DUPER-MEGA-EXTRA! Only that will achieve 
consciousness!++

On Saturday, 21 December 2024 at 16:01:40 UTC+2 John Clark wrote:

> On Sat, Dec 21, 2024 at 5:14 AM PGC <[email protected]> wrote:
>
>> * > there is a statement that when the system is scaled up dramatically 
>> (172 times more compute resources), it manages to score 87.5%. The 
>> difference between the 75.7% result and 87.5% result is thus explained by a 
>> large disparity in the computational budget used for training or inference.*
>>
> *Yes, and if O3 had been given even more time it would've scored even 
> higher; to me that indicates that the fundamental problem of AGI has been 
> solved, and now it's just a question of optimizing things to make them more 
> efficient. And if history is any guide that won't take long: today, much 
> smaller, more compute-efficient models can equal the performance of the huge, 
> compute-hungry state-of-the-art models of just a few months ago. *
>
> *It's bizarre to realize that just a month and a half ago the majority of 
> people in the USA thought the major problems facing the country were the 
> trivial issues of illegal immigration and transsexual bathrooms, and that's 
> why Donald Trump will be the most powerful hominid on earth during the most 
> critical period in the entire history of the Homo sapiens species.  *
>  
>>
>> *> the model was explicitly trained on the very same data (or a 
>> substantial subset of it) against which it was later tested. The text 
>> itself says: “trained on the ARC-AGI-1 Public Training set”*
>>
>
> *I don't see how the fact that O3 was trained on the ARC-AGI-1 Public 
> Training set could be considered cheating when the ARC people are the ones 
> who released the ARC-AGI-1 Public Training set for the precise purpose, as 
> its name indicates, of training AIs.*
>
> *> Beyond the bare mention of “trained on the ARC-AGI-1 Public Training 
>> set,” there is an implied process of repeated tuning or hyperparameter 
>> searches.*
>>
> *Yes, because that's what "training an AI" means!  *
>
> *> children’s ability to adapt to novel tasks and generalize without being 
>> artificially “trained” on the same data is a key part of the skepticism:*
>>
>
> *Human children need to go to school, so do newly born childish AIs. *
>
>  
>
>>
>> *> a quote from the blog:"Passing ARC-AGI does not equate to achieving 
>> AGI, and, as a matter of fact, I don't think o3 is AGI yet." *
>
>
> *The average human taking the ARC test will receive a score of about 50%; 
> some exceptionally talented humans can get a score of around 80%. 
> About one year ago, back in the stone age when the best AIs only scored 
> about 2% on the ARC test, Francois Chollet, the author of the above 
> quote and the originator of the ARC test, said that if a computer got a 
> score above 75% he would consider it an AGI. But now that O3 can get a 
> score of 87.5% if it thinks for a long time and 75.7% if it is only allowed 
> a short time to think, Chollet has done what all AI skeptics have done 
> since the 1960s: he has moved the goalposts. * 
>  
>
>> *> Furthermore, early data points suggest that the upcoming ARC-AGI-2 
>> benchmark will still pose a significant challenge to o3,*
>>
>
> *Yes, I'm certain computers will find it more difficult to get a high 
> score on ARC-AGI-2, but human beings will find this new test even 
> more difficult than computers do. Today's benchmarks are becoming obsolete 
> because computers are rapidly maxing them out; that's why we need ARC-AGI-2: 
> it will be very useful in comparing one AGI to another AGI.*
>
> *John K Clark    See what's on my new list at  Extropolis 
> <https://groups.google.com/g/extropolis>*

-- 
You received this message because you are subscribed to the Google Groups 
"Everything List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/everything-list/beed5d08-f934-42ad-8309-c147b655b71bn%40googlegroups.com.
