But what is Bits Per Character evaluation testing? How does it work?
--
Artificial General Intelligence List: AGI
Permalink:
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M17bef63d367de5a6688fe7f4
Delivery options:
Indeed, if someone removes a dozen character/word types from the training/test
set while using perplexity as the evaluation, they can get a better score. To
work with or compare against enwik8 you must either decompress it losslessly,
or train on it in full and test on the full test set.
On Sun, Mar 22, 2020, 8:57 AM wrote:
> 1 more question Matt:
> https://openai.com/blog/better-language-models/
> They say "enwik8 - bits per character (–) - OURS: 0.93 - LAST RECORD 0.99"
> But...enwik8 is 100 MB! At 0.99 bpc that is about 100 MB / 8 ≈ 12.4 MB. The
> best lossless compression is 14.8 MB, though.
Go Dawgs!
I played collision football for 9 seasons, probably 70 games and three times
as many practices, suffered concussions and kept playing. One time the world
turned purple. It's amazing the amount of trauma the brain can take
and still work. There's a limit, though. College football is cracking
down on
Thanks JB. So my refined conclusion is that perplexity is worse than lossless
compression, because lossless compression forces you to learn online, etc.,
which was amazing for me to code. And the perplexity test dataset is OK if it
is actually different, but it can still be quite similar in some ways in
immortal.discover...@gmail.com wrote:
>
> It's [imperative] you understand that all AIs find/create patterns
> because that lets them solve unseen problems the programmer never set
> them up to answer. And all of physics has patterns. The reason Earth will
> become a fractal pattern of nanobot units
Marcus Hutter implicitly addresses perplexity in this Hutter Prize FAQ
entry:
http://www.hutter1.net/prize/hfaq.htm#xvalid
Why aren't cross-validation or train/test sets used for evaluation?
A common way of evaluating machine learning algorithms is to split the data
into a training set and a test
When I addressed the Board of Regents of the University of Washington on
December 12, 2019 -- see
https://www.washington.edu/regents/minutes/meeting-minutes-for-2019 -- I
told them that I was protesting against Husky Football brain injuries
because I had spent my adult life studying the human
1 more question Matt:
https://openai.com/blog/better-language-models/
They say "enwik8 - bits per character (–) - OURS: 0.93 - LAST RECORD 0.99"
But...enwik8 is 100 MB! At 0.99 bpc that is about 100 MB / 8 ≈ 12.4 MB. The
best lossless compression is 14.8 MB, though. What are they doing here? 0.93
bpc is 11,625,000
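The arithmetic in the question above can be checked directly. As a minimal sketch (assuming one "character" per byte of enwik8, which is 10^8 bytes):

```python
# Sketch of the arithmetic behind bits-per-character (BPC) scores on enwik8.
# enwik8 is the first 10^8 bytes of English Wikipedia; a BPC score converts
# to an implied compressed size as bpc * n_chars / 8 bytes.

ENWIK8_BYTES = 100_000_000  # 10^8 bytes, treated as one character each

def implied_size_bytes(bpc: float, n_chars: int = ENWIK8_BYTES) -> int:
    """Compressed size in bytes implied by a bits-per-character score."""
    return round(bpc * n_chars / 8)

print(implied_size_bytes(0.99))  # 12375000 bytes, ~12.4 MB
print(implied_size_bytes(0.93))  # 11625000 bytes, ~11.6 MB
```

Note this implied size omits the size of the model itself, which is one reason such BPC figures can come in below the best actual lossless compressors.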
Another question Matt.
I see you have a large page on compressors, but I don't see a comparison of
the compressors' top-n predictions (generated text not in enwik8). This would
show how well the better compressors generate realistic text resembling the
dataset. I know the output probably
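The comparison asked for above could be approximated with any predictive model. As a purely hypothetical sketch (a toy character-bigram model standing in for a real compressor's predictor, emitting its top-1 prediction at each step):

```python
from collections import Counter, defaultdict

# Toy stand-in for a compressor's predictor: count character bigrams,
# then greedily emit the most likely next character to see what text
# the model "hallucinates" outside the training data.

def train_bigram(text):
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, seed, length):
    out = seed
    for _ in range(length):
        nxt = counts.get(out[-1])
        if not nxt:
            break  # no successor ever seen for this character
        out += nxt.most_common(1)[0][0]  # top-1 prediction
    return out

model = train_bigram("the cat sat on the mat, the cat ran")
print(generate(model, "t", 10))
```

A stronger context model (like the ones in real compressors) would produce correspondingly more realistic continuations, which is the comparison being suggested.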
> Data compression won't solve AGI.
I thought you had tried to convince us otherwise earlier... :)
Permalink:
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M646707e88d06bae5f550dbc7
Ah, you can train a net on one dataset and then test on a different dataset,
but you can never be sure the dataset is on topic; it has to be different,
lol! With lossless compression evaluation, the predictor also predicts the
next token, and we store the accuracy error, but it is of the
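The "store the accuracy error" idea can be made concrete. Under ideal arithmetic coding, encoding a symbol to which the model assigned probability p costs -log2(p) bits, so the total compressed size directly measures prediction quality. A minimal sketch:

```python
import math

# Lossless-compression view of prediction: an ideal arithmetic coder
# spends -log2(p) bits on a symbol the model gave probability p, so
# summing these code lengths gives the (model-free) compressed size.

def ideal_compressed_bits(probabilities):
    """probabilities: p(actual next symbol) at each step, in order."""
    return sum(-math.log2(p) for p in probabilities)

# A confident, correct predictor compresses well; a uniform byte model
# does not (8 bits per symbol).
confident = ideal_compressed_bits([0.9] * 100)    # ~15.2 bits total
uniform = ideal_compressed_bits([1 / 256] * 100)  # 800 bits total
print(confident, uniform)
```

This is why lossless compression "allows errors": a wrong prediction is not fatal, it just costs more bits.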
Also see the link below about perplexity evaluation for AI. As I said,
lossless compression evaluation as in the Hutter Prize is *the best*, and you
can see it really is the same thing, prediction accuracy, except that it
allows errors.
https://planspace.org/2013/09/23/perplexity-what-it-is-and-what-yours-is/
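The equivalence claimed above can be sketched numerically: per-character perplexity and bits per character are the same cross-entropy on different scales, with bpc = log2(perplexity). A minimal sketch:

```python
import math

# Perplexity and bits-per-character are two scales for one quantity.
# Given the probabilities a model assigned to the actual characters,
# cross-entropy H = -(1/N) * sum(log2 p_i); then bpc = H and
# perplexity = 2**H.

def bpc_and_perplexity(char_probs):
    """char_probs: model probabilities of the characters that occurred."""
    n = len(char_probs)
    bpc = -sum(math.log2(p) for p in char_probs) / n
    return bpc, 2 ** bpc

# Toy case: a model that always assigns the true character probability 0.5
bpc, ppl = bpc_and_perplexity([0.5] * 8)
print(bpc, ppl)  # 1.0 bits/char, perplexity 2.0
```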
That's spot on, JB, a 3D physics engine would help predict movies. But we must
work with larger "objects" where we can, not atoms. It still works! All the
physics sim would lack is the fact that the cat, seeing the mouse, is likely
going TO jump on it, before actually springing off the floor. So you
With image compression, and building an understanding of the world first and
then adding language on top: humans usually don't talk about the microscopic
level or the walls of a bread bag etc.; we just say "bread". We learn to
segment objects in vision; vision has noise in images and there's never an
exact