Yes Stefan, if you saw 'cat ran' 3 times and 'cat slept' 1 time, your 
predictions for 'cat' are cat>(ran=75% likely, slept=25% likely). In Generate 
Mode you predict ran 75% of the time; in Compress Mode you always emit the full 
distribution (ran=75% likely, slept=25% likely). There's no other way to know 
what to say after cat - cat what? You take the words that follow it, each with 
its count (seen 3 and 1 times), and give each an up-to-date % score: divide 
each count by the total, so cat ran (3) and cat slept (1) become a probability 
distribution of 75% and 25%.
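A minimal sketch of that counts-to-percentages step (the table layout and the helper name are just for illustration):

```python
from collections import Counter

def next_word_distribution(bigram_counts, context):
    """Normalize the raw follow-up counts for a context into probabilities."""
    followers = bigram_counts[context]          # e.g. {'ran': 3, 'slept': 1}
    total = sum(followers.values())             # 3 + 1 = 4
    return {word: count / total for word, count in followers.items()}

# 'cat ran' seen 3 times, 'cat slept' seen 1 time
counts = {'cat': Counter({'ran': 3, 'slept': 1})}
print(next_word_distribution(counts, 'cat'))    # {'ran': 0.75, 'slept': 0.25}
```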

If you want to use Perplexity like everyone in the field does, or Lossless 
Compression, you need the %s. In Perplexity you add up errors, e.g. 
cat>ran/slept is predicted 75% and 25%, and the true answer (say the next word 
here in the file) is cat>ran, so you add a 25% error to your error score! If 
you used the native easy way - the raw counts 3 and 1 per prediction - you'd 
have a problem: 6 and 2 is the same distribution but not normalized, so 6 and 2 
would give a different error score, which is untrue - it's still 75% and 25%.
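A quick sketch of why normalization matters here (taking "1 minus the probability of the true next word" as the error, which is an assumption about the exact scoring the post has in mind):

```python
def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def error(counts, true_word):
    """Error = 1 - P(true word), per the informal scheme above."""
    return 1.0 - normalize(counts)[true_word]

# 3:1 and 6:2 are the same distribution once normalized...
print(normalize({'ran': 3, 'slept': 1}))   # {'ran': 0.75, 'slept': 0.25}
print(normalize({'ran': 6, 'slept': 2}))   # {'ran': 0.75, 'slept': 0.25}

# ...so both give the same 25% error when the file really says 'ran'
print(error({'ran': 3, 'slept': 1}, 'ran'))  # 0.25
print(error({'ran': 6, 'slept': 2}, 'ran'))  # 0.25
```

Raw counts, by contrast, would score the 6:2 table differently from the 3:1 table even though they encode the exact same prediction.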

So you need the %s at least once for evaluation, ya (and for Generate Mode, I 
think - though you could mix (perhaps a sloppy way) the predictions for the 
contexts c, ca, cat without using %s, giving you a single merged set of counts, 
no %s, and only convert to %s one time at the end). My line of code turns a set 
of %s into weights for Generate Mode: prediction = random.choices(predict[0], 
weights=predict[1], k=1). It can also use raw counts directly - random.choices 
treats weights as relative, so it normalizes them on its own. So once, yes, for 
eval; possibly once to generate; and mine does it for each damn set of 
predictions for each context order, to know how much weight to give a set of 
weights.
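A sketch of that generate step, merging raw counts from contexts of different lengths into one set of weights (the toy tables and the simple sum-the-counts merge rule are just one illustrative choice, not the post's exact scheme):

```python
import random
from collections import Counter

# Toy follow-up counts for three context orders: 'a cat', 'cat', and the
# empty context. These tables are made up for illustration.
tables = {
    'a cat': Counter({'ran': 2}),
    'cat':   Counter({'ran': 3, 'slept': 1}),
    '':      Counter({'ran': 4, 'slept': 2, 'sat': 1}),
}

def merged_counts(contexts):
    """Sum the raw counts from every matching context order - no %s needed."""
    merged = Counter()
    for ctx in contexts:
        merged += tables.get(ctx, Counter())
    return merged

merged = merged_counts(['a cat', 'cat', ''])
words, weights = list(merged), list(merged.values())

# random.choices treats weights as relative, so raw counts work as-is
prediction = random.choices(words, weights=weights, k=1)
print(prediction)
```

Note this flat sum gives every context order equal say; weighting longer (more specific) contexts more heavily before merging is the part the post says it does per set of predictions.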
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Tf856e4082d9ea09a-M0d694caddf46ef15ce2c288a
Delivery options: https://agi.topicbox.com/groups/agi/subscription
