I'm sure the Hollywood meme machine isn't worried about such quibbles as the "HIV virus" idiom populating its marquees.
On Tue, Jul 7, 2020 at 1:43 AM Ben Goertzel <b...@goertzel.org> wrote:
> On Mon, Jul 6, 2020 at 11:41 PM Ben Goertzel <b...@goertzel.org> wrote:
> >
> > The COIN Criterion ... sounds like money, it's got to be good...
>
> Maybe we can fund the competition by making a Hollywood-style thriller
> about some Bitcoin criminals... lots of potential here...
>
> > On Mon, Jul 6, 2020 at 9:13 PM James Bowery <jabow...@gmail.com> wrote:
> > >
> > > On Fri, Jul 3, 2020 at 6:53 PM Ben Goertzel <b...@goertzel.org> wrote:
> > >>
> > >> ...Under what conditions is it the case that, for prediction based on
> > >> a dataset using realistically limited resources, the smallest of the
> > >> available programs that precisely predicts the training data actually
> > >> gives the best predictions on the test data?
> > >
> > > If I may refine this a bit to head off misunderstanding at the outset
> > > of this project:
> > >
> > > The CIC* (Compression Information Criterion) hypothesis is that, among
> > > existing models of a process, each producing an executable archive of
> > > the same training data under the same computational constraints, the
> > > one that produces the smallest executable archive will in general be
> > > the most accurate on the test data.
> > >
> > > Run a number of experiments and for each:
> > > 1 Select a nontrivial
> > >   1.1 computational resource level as a constraint
> > >   1.2 real-world dataset -- no less than 1 GB gzipped
> > > 2 Divide the data into training and test sets
> > > 3 For each competing model:
> > >   3.1 Provide the training set
> > >   3.2 Record the length of the executable archive the model produces
> > >   3.3 Append the test set to the training set
> > >   3.4 Record the length of the executable archive the model produces
> > > 4 Produce two rank orders of the models:
> > >   4.1 by training-set executable archive size
> > >   4.2 by training-plus-test-set executable archive size
> > > 5 Record the differences between the training and test rank orders
> > >
> > > The lower the average difference, the more general the criterion.
> > >
> > > It should be possible to run similar tests of other model selection
> > > criteria, and thereby rank-order the model selection criteria
> > > themselves.
> > >
> > > *We're going to need a catchy acronym to keep up with:
> > >
> > > AIC (Akaike Information Criterion)
> > > BIC (Bayesian Information Criterion)...
> > > ...aka SIC (Schwarz Information Criterion)...
> > > ...aka MDL or MDLP (both travestic abuses of "Minimum Description
> > > Length [Principle]" that should be forever cast into the bottomless pit)
> > > HQIC (Hannan-Quinn Information Criterion)
> > > KIC (Kullback Information Criterion)
> > > etc. etc.
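To make steps 4-5 above concrete, here is a minimal Python sketch of the
rank-order comparison, assuming each competing model is reduced to its two
measured executable-archive sizes. The model names and byte counts are
invented placeholders, not data from any actual experiment:

# Steps 4-5 of the quoted protocol: build the two rank orders and report
# the mean absolute rank difference (smaller = the training-set ranking
# generalizes better to training+test).
def rank_order(sizes: dict[str, int]) -> dict[str, int]:
    """Map each model name to its rank (0 = smallest archive)."""
    ordered = sorted(sizes, key=sizes.get)
    return {name: rank for rank, name in enumerate(ordered)}

def mean_rank_difference(train_sizes: dict[str, int],
                         train_plus_test_sizes: dict[str, int]) -> float:
    """Average |training rank - training+test rank| over all models."""
    train_ranks = rank_order(train_sizes)
    full_ranks = rank_order(train_plus_test_sizes)
    return sum(abs(train_ranks[m] - full_ranks[m])
               for m in train_ranks) / len(train_ranks)

if __name__ == "__main__":
    # Step 3.2: archive sizes (bytes) from the training set alone (invented).
    train = {"model_A": 212_000_000, "model_B": 198_000_000, "model_C": 205_000_000}
    # Step 3.4: archive sizes after appending the test set (invented).
    train_plus_test = {"model_A": 240_000_000, "model_B": 221_000_000, "model_C": 236_000_000}
    print(mean_rank_difference(train, train_plus_test))  # 0.0 -> rank orders agree

Averaging this score over the datasets and resource levels of step 1 gives the
generality measure; a rank-correlation statistic such as Kendall's tau could
be substituted for the absolute-rank-difference metric if preferred.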
>
> --
> Ben Goertzel, PhD
> http://goertzel.org
>
> "The only people for me are the mad ones, the ones who are mad to
> live, mad to talk, mad to be saved, desirous of everything at the same
> time, the ones who never yawn or say a commonplace thing, but burn,
> burn, burn like fabulous yellow roman candles exploding like spiders
> across the stars." -- Jack Kerouac