@Matt and others,

So:
So far my enwik8 (100MB) lossless compression score is 20,085,564 bytes (using 
my AI, which predicts and stores patterns). I haven't exhausted my ideas and 
insights yet, so expect it to improve a lot.

BTW:
Many of the top scores on the Large Text Compression Benchmark, even the new 
nncp despite its Transformer architecture, use a pre-processor that often 
shaves off ~0.75MB (as shown on Matt's helpful long benchmark page), e.g. by 
replacing common words with codes that are smaller to compress. So you could 
say my score is roughly 20.09MB - 0.75MB = 19.34MB. While this may be the 
right thing to do, I believe the AI itself should solve it better, so I refuse 
to do it for now. 
Re-arranging enwik8 so that related articles sit together, or ordering them by 
a high-D method, is also used. I don't see why the original order should hurt 
prediction, though: why can't the AI just detect a topic change after a few 
words?
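
For anyone who hasn't seen the word-replacement trick mentioned above, here is 
a minimal Python sketch of the idea, not the pre-processor any benchmark entry 
actually uses. The function names (build_dictionary, encode, decode) and the 
choice of byte codes 0x80 and up are my own assumptions, and it only works on 
pure lowercase-ASCII input; enwik8 is UTF-8, so a real pre-processor would need 
escape codes for bytes that are already in use.

```python
import re
from collections import Counter

def build_dictionary(text, max_words=64):
    """Map the most frequent words (3+ letters) to single-byte codes.

    Assumption: the input is pure ASCII, so bytes 0x80..0xBF are free to use.
    """
    words = re.findall(r"[a-z]+", text)
    frequent = Counter(w for w in words if len(w) >= 3).most_common(max_words)
    return {word: bytes([0x80 + i]) for i, (word, _) in enumerate(frequent)}

def encode(text, table):
    """Replace every dictionary word with its one-byte code."""
    data = text.encode("ascii")

    def swap(match):
        word = match.group(0)
        return table.get(word.decode("ascii"), word)

    return re.sub(rb"[a-z]+", swap, data)

def decode(data, table):
    """Invert encode(): expand each code byte back into its word."""
    inverse = {code[0]: word.encode("ascii") for word, code in table.items()}
    return b"".join(inverse.get(b, bytes([b])) for b in data).decode("ascii")

text = "the cat saw the dog and the dog saw the cat"
table = build_dictionary(text)
packed = encode(text, table)
restored = decode(packed, table)
```

The packed bytes then go to the real compressor. As far as I understand it, 
the gain comes less from the shorter input and more from handing the model 
one symbol per word instead of a run of letters.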

Questions:
Matt, I saw the methods you use on your page (listed below), and I don't fully 
understand many of them. I collect methods, so this is really interesting. 
Could you give an actual example for each of the ones below, using a sentence 
to show how it works, so there is no doubt about how to understand it? For 
example, when you say "ISSE", show a toy example like this: "Sally walked the 
dog, the dog saw a cat, the cat saw a dog" > cat and dog have similar 
predictions and/or appear close together, so it makes sense to predict 
dog>meowed even if you have only seen cat>meowed, since there is evidence that 
cat and dog are interchangeable. It would also be awesome if you know roughly 
how many MB each one may shave off enwik8.
SSE (Secondary Symbol Estimation) --- like you can do this a 2nd time!!!
ISSE (Indirect Secondary Symbol Estimation) --- no you didn't
SEE (Secondary Escape Estimation) --- I may actually get this a bit, unsure 
though! Isn't this for when you exhaust all orders?
ICM (Indirect Context Model) --- as if this is not covered above :/ No clue 
what this is.
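
To make clear what kind of worked example I'm asking for, here is a rough 
Python sketch of my cat/dog guess above. It is only my guess written out, and I 
am not claiming it is what ISSE or any of the listed methods does. The function 
names, the follower-overlap similarity measure and the 0.3 threshold are all 
made up for illustration.

```python
from collections import Counter, defaultdict

def next_word_counts(words):
    """counts[w][n] = how often word n directly followed word w."""
    counts = defaultdict(Counter)
    for word, following in zip(words, words[1:]):
        counts[word][following] += 1
    return counts

def similarity(counts, a, b):
    """Overlap between the followers of a and b, from 0.0 to 1.0.

    Hypothetical measure: shared follower counts / combined follower counts.
    """
    shared = sum((counts[a] & counts[b]).values())
    combined = sum((counts[a] | counts[b]).values())
    return shared / combined if combined else 0.0

def predict(counts, word, min_sim=0.3):
    """Score next words for `word`, borrowing evidence from similar words."""
    scores = Counter(counts[word])
    for other in list(counts):
        if other == word:
            continue
        sim = similarity(counts, word, other)
        if sim >= min_sim:
            # Borrow the other word's followers, weighted by similarity.
            for following, n in counts[other].items():
                scores[following] += sim * n
    return scores

words = "the cat saw a dog . the dog saw a cat . the cat meowed".split()
counts = next_word_counts(words)
scores = predict(counts, "dog")
```

Here "dog" was never followed by "meowed", but because cat and dog share 
followers ("saw" and "."), scores["meowed"] ends up above zero. That is the 
level of toy example that would help me for each method.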
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T23ab994ac902fe7e-Mb25840abc26bf56c4f0a53af