So, I'm unsure how to convert that to compression. However, I AM able to simply score the results below, just not to compare them against a compression metric. They seem fine; this is how well it predicted letters, and notice the score rises the more data experience it has:
10MB   0.5505950784146294
1MB    0.5310066708037171
100KB  0.47092919877903217
10KB   0.46603868078103533
1KB    0.5588715348677878

I might be able to use this alone as my evaluation. As for why it sits around 0.5 when it seems like a random guess would get 0.5: remember we are looking at over 10,000,000 cases of letter prediction, and across all of them, out of 256 possible letters, the model assigned the correct one an average probability of 0.55, with the rest getting e.g. 0.1, 0.12, 0.04, 0.01, 0.02..., summing to 1.0 for each prediction's set of probabilities. If I were predicting the next bit, or WORD, or letter, these are simply different metrics (I'd have far fewer samples to average if predicting the next sentence or word, and it would be much harder with 3,000 choices versus 2 choices for bit prediction). That makes comparison hard, but the AI field has the same issue, and if they can resolve it, I probably can too. Why do lossless compression then? (Besides it maybe being a universal way to allow comparison when one AI predicts bits and mine predicts words.)

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Td13a829978c4c9f3-M0f2fdfab9b3dd99b02ec5782
Delivery options: https://agi.topicbox.com/groups/agi/subscription
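On the conversion question: one rough way to turn these scores into a compression estimate is the ideal arithmetic-coding cost, -log2(p) bits per symbol. This is a sketch under a loudly stated assumption: it plugs the *average* probability into -log2, which (by Jensen's inequality) underestimates the true cross-entropy you'd get by averaging -log2(p) per symbol, so treat it as a ballpark, not the real compressed size.

```python
import math

def bits_per_char(p_correct):
    # Ideal arithmetic-coding cost, in bits, of a symbol whose
    # correct value was predicted with probability p_correct.
    return -math.log2(p_correct)

# Scores quoted above: average probability assigned to the
# correct next letter, per amount of training data.
scores = {
    "10MB": 0.5505950784146294,
    "1MB":  0.5310066708037171,
    "100KB": 0.47092919877903217,
    "10KB": 0.46603868078103533,
    "1KB":  0.5588715348677878,
}

for size, p in scores.items():
    bpc = bits_per_char(p)
    ratio = 8.0 / bpc  # raw text is 8 bits per character
    print(f"{size}: ~{bpc:.3f} bits/char, ~{ratio:.1f}x compression")
```

So a score of 0.55 corresponds to roughly 0.86 bits per character against 8 bits raw, i.e. on the order of 9x compression, with the caveat above about averaging.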

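On comparing a bit predictor to a word predictor: a common trick is to normalize everything to bits per character, so the prediction unit stops mattering. A minimal sketch, with the average word length (5 characters, space included) as an assumed illustrative value, not anything measured:

```python
import math

def bits_per_char_from_word_model(p_word, avg_word_len=5.0):
    # Cost of predicting a whole word, spread over its characters.
    # avg_word_len is an assumed value for illustration only.
    return -math.log2(p_word) / avg_word_len

def bits_per_char_from_bit_model(p_bit):
    # A bit predictor pays its per-bit cost 8 times per character.
    return -math.log2(p_bit) * 8.0

# A word model giving the right word p=0.1 and a bit model giving
# the right bit p=0.9 can now be compared on the same scale.
word_bpc = bits_per_char_from_word_model(0.1)
bit_bpc = bits_per_char_from_bit_model(0.9)
```

This is the same idea behind reporting bits-per-byte on shared benchmarks: once both models are priced in bits per character, the 2-choice vs 3,000-choice difficulty difference is already accounted for.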