Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
But what is Bits Per Character evaluation testing? How does it work? -- Artificial General Intelligence List: AGI Permalink: https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M17bef63d367de5a6688fe7f4
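To answer the question: bits per character is the model's average cross-entropy over the text in base 2, i.e. the average number of bits an ideal arithmetic coder would spend per character given the model's predictions. A minimal sketch (the function name and toy probabilities are illustrative, not from any benchmark):

```python
import math

def bits_per_character(probs):
    """BPC = -(1/N) * sum over the text of log2 p(char | context),
    where each p is the probability the model assigned to the
    character that actually came next."""
    return -sum(math.log2(p) for p in probs) / len(probs)

# A model that gives every correct character probability 0.5
# scores exactly 1 bit per character.
print(bits_per_character([0.5, 0.5, 0.5, 0.5]))
```

An arithmetic coder driven by the same model would compress the text to roughly (BPC x length / 8) bytes, which is why compressed size and BPC are interchangeable metrics.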

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Indeed, if someone removes a dozen character/word types from the training/test set when using Perplexity as the evaluation, they can get a better score. To work with/compare on enwik8 you must either compress it losslessly or train on it in full and test on it in full
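The effect is easy to demonstrate with a uniform baseline, whose perplexity equals the vocabulary size: simply dropping symbol types lowers the score with no modeling improvement at all. A sketch (the vocabulary sizes below are illustrative, not enwik8's actual counts):

```python
import math

def perplexity_uniform(vocab_size):
    """Perplexity of a uniform model: every symbol gets p = 1/V,
    so cross-entropy is log2(V) bits and perplexity is exactly V."""
    cross_entropy = -math.log2(1.0 / vocab_size)
    return 2.0 ** cross_entropy

# Removing a dozen rare symbol types shrinks V and "improves"
# perplexity even though the model itself learned nothing new.
print(perplexity_uniform(205), perplexity_uniform(193))
```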

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread Matt Mahoney
On Sun, Mar 22, 2020, 8:57 AM wrote: > 1 more question Matt: > https://openai.com/blog/better-language-models/ > They say "enwik8 - bits per character (–) - OURS: 0.93 - LAST RECORD 0.99" > But but... enwik8 is 100MB! 0.99 alone is about 100 Mbit / 8 = 12.5MB. The best > compression is 14.8MB though.

Re: [agi] Symbolic AI: Concepts

2020-03-22 Thread Mike Archbold
Go Dawgs! I played collision football for 9 seasons, probably 70 games and 3x practices, at least concussions and kept playing. One time the world turned purple. It's amazing the amount of trauma the brain can take and still work. There's a limit, though. College football is cracking down on

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Thanks JB. So my refined conclusion is that Perplexity is worse than Lossless Compression, because Lossless Compression forces you to learn online, etc., which was amazing for me to code. And the Perplexity test dataset is OK if it is actually different, but it still can be quite similar in some ways in
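That online constraint can be made concrete: in a compression-style (prequential) evaluation, the model must predict each symbol before seeing it, and only then update on it. A minimal sketch using a hypothetical order-0 count model with add-one smoothing (not any actual Hutter Prize entry):

```python
import math
from collections import Counter

def online_bpc(data, alphabet=256):
    """Prequential (online) evaluation over a byte string: score each
    byte BEFORE updating the model on it, exactly as a lossless
    compressor must.  Model: order-0 counts with add-one smoothing."""
    counts, total, bits = Counter(), 0, 0.0
    for b in data:
        p = (counts[b] + 1) / (total + alphabet)  # predict first...
        bits += -math.log2(p)
        counts[b] += 1                            # ...then learn
        total += 1
    return bits / len(data)

# Repetitive data becomes cheap to predict as the model learns online.
print(online_bpc(b"ab" * 50), online_bpc(b"abcdefghij" * 10))
```

There is no train/test split to game here: every byte is test data at the moment it is predicted, and training data the moment after.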

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread Alan Grimes via AGI
immortal.discover...@gmail.com wrote: > > It's [imperative] you understand that all AI find/create patterns > because it lets them solve unseen problems the programmer never set > them to answer. And all of physics has patterns. The reason Earth will > become a fractal pattern of nanobot units

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread James Bowery
Marcus Hutter implicitly addresses perplexity in this Hutter Prize FAQ entry: http://www.hutter1.net/prize/hfaq.htm#xvalid Why aren't cross-validation or train/test-set used for evaluation? A common way of evaluating machine learning algorithms is to split the data into a training set and a test

[agi] Symbolic AI: Concepts

2020-03-22 Thread A.T. Murray
When I addressed the Board of Regents of the University of Washington on December 12, 2019 -- see https://www.washington.edu/regents/minutes/meeting-minutes-for-2019 -- I told them that I was protesting against Husky Football brain injuries because I had spent my adult life studying the human

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
1 more question Matt: https://openai.com/blog/better-language-models/ They say "enwik8 - bits per character (–) - OURS: 0.93 - LAST RECORD 0.99" But but... enwik8 is 100MB! 0.99 alone is about 100 Mbit / 8 = 12.5MB. The best compression is 14.8MB though. What are they doing here? 0.93 is 11,625,000
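Part of the apparent contradiction is likely that reported BPC normally excludes the size of the model/decompressor itself, while a compressed-size record like 14.8MB must include everything needed to reconstruct the file. The raw arithmetic, though, works out as quoted (a sketch; the helper name is made up):

```python
# Worked arithmetic behind the question: bits-per-character times the
# number of characters gives total bits; divide by 8 for bytes.
ENWIK8_CHARS = 100_000_000  # enwik8 is 10^8 bytes

def bpc_to_bytes(bpc, n_chars=ENWIK8_CHARS):
    return bpc * n_chars / 8

print(bpc_to_bytes(0.99))  # ~12,375,000 bytes, about 12.4 MB
print(bpc_to_bytes(0.93))  # ~11,625,000 bytes, the figure quoted above
```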

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Another question Matt. I see you have a large page on compressors, but I don't see a comparison of the compressors' top-n predictions (generated text not in enwik8). This would show how well the better compressors generate realistic text resembling the dataset. I know the output probably

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread stefan.reich.maker.of.eye via AGI
> Data compression won't solve AGI.  I thought you had tried to convince us otherwise earlier... :) -- Artificial General Intelligence List: AGI Permalink: https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M646707e88d06bae5f550dbc7

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Ah, you can train a net on one dataset and then test on a different dataset, but you can never be sure the test set is on topic; it has to be different lol! With Lossless Compression evaluation, the predictor also predicts the next token, and we store the accuracy error, but it is of the

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Also see the below link about Perplexity Evaluation for AI! As I said, Lossless Compression evaluation in the Hutter Prize is *the best*, and you can see it really is the same thing, prediction accuracy, except that it allows errors. https://planspace.org/2013/09/23/perplexity-what-it-is-and-what-yours-is/
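The linked post's point can be stated in one line: per-character perplexity and bits-per-character are the same measurement in different units. A sketch:

```python
import math

def bpc_from_perplexity(ppl):
    """Per-character perplexity and bits-per-character are the same
    quantity in different units: BPC = log2(perplexity)."""
    return math.log2(ppl)

def perplexity_from_bpc(bpc):
    return 2.0 ** bpc

# A per-character perplexity of 2 corresponds to exactly 1 bit/char,
# and 0.99 BPC corresponds to a per-character perplexity near 1.99.
print(bpc_from_perplexity(2.0), perplexity_from_bpc(0.99))
```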

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
That's spot on JB, a 3D physics engine would help predict movies. But we must work with larger "objects" where we can, not atoms. Still works! All the physics sim would lack is the fact that the cat, seeing the mouse, is likely going *to* jump on it before actually springing off the floor. So you

Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
With image compression, and building an understanding of the world first and then adding language on top: humans regularly don't talk about microscopic levels or the walls of a bread bag, etc.; we say "bread". We learn to segment objects in vision; vision has noise in images and there's never an exact