Ah, you can train a net on one dataset and then test it on a different dataset, 
but you can never be sure the test dataset is on topic; it has to be different. 
With lossless compression evaluation, the predictor also predicts the next 
token, and we store the prediction error, but on the *same* dataset, meaning 
the model can fully understand that dataset. This is still safe, because we 
include the code size and the compressed error size, and check that the 
compression is the best it can get. Speed matters too, and working memory 
size, because brute force would work but is slowest.
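
The compression score above can be sketched as follows. This is a toy 
illustration, not any contest's official scoring code: it assumes the 
predictor reports a probability for each actual next symbol, and uses the fact 
that an ideal arithmetic coder spends -log2(p) bits on a symbol it predicted 
with probability p. The total score then includes the (de)compressor's own 
code size, as described above.

```python
import math

def compressed_size_bits(probs_for_actual):
    # Ideal arithmetic-coded size of the data: -log2(p) bits per symbol,
    # where p is the probability the predictor assigned to the symbol
    # that actually occurred. Better prediction -> smaller size.
    return sum(-math.log2(p) for p in probs_for_actual)

def total_score_bits(code_size_bytes, probs_for_actual):
    # Contest-style score: size of the compressor/decompressor program
    # plus the compressed data. Including the code size prevents cheating
    # by hiding the dataset inside the program itself.
    return code_size_bytes * 8 + compressed_size_bits(probs_for_actual)
```

For example, a predictor that assigns probability 0.5 to every correct symbol 
pays exactly 1 bit per symbol, and a 100-byte program compressing two such 
symbols scores 802 bits total.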

Since both evaluations test the predictor's accuracy and know the right symbol 
to predict, we can see the error, but we can't know the best possible 
compression/accuracy, so the contest will never stop. I think this is true 
of perplexity too: it gets, say, 90% of letters or words predicted exactly, 
but how many *can* it get right? 100%? Maybe with a large enough training 
dataset it will do better, but that doesn't mean it understands the data any 
better. With compression you can also do better with a bigger dataset, but 
you can at least keep the dataset size static and focus on compressing, i.e. 
understanding, the data better. I guess with perplexity you can keep your 
training set static too. So yes, both can keep the dataset the same size and 
improve prediction toward an unknown limit.
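
Perplexity and compressed size are two views of the same quantity. A minimal 
sketch, assuming the same per-symbol probabilities as before: perplexity is 
2 raised to the average bits per symbol, so lowering perplexity on a fixed 
dataset is the same thing as compressing it further.

```python
import math

def perplexity(probs_for_actual):
    # Perplexity = 2 ** (average -log2(p) per symbol), i.e. the
    # effective number of equally likely choices the model is
    # left with at each step. Lower is better; 1.0 is perfect.
    n = len(probs_for_actual)
    avg_bits = sum(-math.log2(p) for p in probs_for_actual) / n
    return 2 ** avg_bits
```

A predictor that always gives the correct symbol probability 0.5 has 
perplexity 2, and its ideal compressed size is 1 bit per symbol; the two 
numbers move together.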

My conclusion is that perplexity isn't focused on the very dataset the model 
is digesting, but on a different "test" dataset, which is bad. Right, Matt?
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M7f38f959969b1087b0d8cde5
Delivery options: https://agi.topicbox.com/groups/agi/subscription