Predicting words, but actually with letters:

Predicting words (as I explained earlier) is apparently not only better but also 
allows for related-word priming. I may have found a way to predict like this 
using letter prediction. Remember my first post. Say the dataset is the three 
words fa, fu, sz, and the test set runs through each word once. If we predict 
letters naively it is worse: the first letter is f or s, and guessing 50/50 we 
pay 50% error each time, three times for the three words. Two of the test words 
start with f, so the second letter is a or u at 50/50 and we pay 50% twice more; 
after s, z is 100% certain, so that costs nothing. Total error paid: 
50% + 50% + 50% + 50% + 50% = 2.5. If instead we predict whole words, each word 
has probability 1/3, so we are 66% wrong three times: 66% + 66% + 66% = 2.0. 
But here's how to get that with letters, I think: change the first-letter 
prediction to f = 99.9%, s = 0.1% (keeping a/u at 50/50). Now twice we pay only 
0.1% (for the two f words), once we pay 99.9% (when s was the correct letter), 
then 50% twice for a/u, which gives a total to pay of 
0.1% + 0.1% + 99.9% + 50% + 50% = 2.001. O>O !

But why does that even work, what gives? The a/u part makes sense, it really is 
50/50. But the f/s part... why 99.9% and 0.1%? Nearly eliminating the other 
option is what leaves the 50/50 splits to do the rest, I guess...
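The arithmetic above can be checked in a few lines of Python (my own toy sketch, not anyone's real model; "error paid" per prediction here means 1 minus the probability assigned to the correct symbol, summed over the test set):

```python
# Toy check of the fa/fu/sz arithmetic above (a sketch of the post's scoring).
# "Error paid" per prediction = 1 - probability assigned to the correct symbol.

def letter_error(words, first_probs, second_probs):
    """Total error for a letter-level predictor over a test set.
    first_probs: probability assigned to each first letter.
    second_probs: first letter -> probabilities for the second letter."""
    total = 0.0
    for w in words:
        total += 1 - first_probs[w[0]]           # pay for the first letter
        total += 1 - second_probs[w[0]][w[1]]    # pay for the second letter
    return total

words = ["fa", "fu", "sz"]
branch = {"f": {"a": 0.5, "u": 0.5}, "s": {"z": 1.0}}  # a/u 50/50, z certain

naive = letter_error(words, {"f": 0.5, "s": 0.5}, branch)       # = 2.5
word = sum(1 - 1 / 3 for _ in words)                            # ~ 2.0
skewed = letter_error(words, {"f": 0.999, "s": 0.001}, branch)  # ~ 2.001
print(naive, word, skewed)
```

So the skewed letter predictor lands at 2.001, almost exactly the word predictor's 2.0, matching the sums in the text.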

Let's try aa, ab, ba, ba. Word prediction: 75% + 75% + 50% + 50% = 2.5. Letter 
prediction: 50% wrong four times on the first letter, plus 50% wrong twice on 
the second letter after a (after b the second letter is always a, so it's free) 
= 3.0. So it doesn't work on even distributions, but maybe those are rare, at 
least for the early layers of a network... I'll think on it more later if needed.
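The same scoring on this even dataset can be sketched with empirical frequencies for both predictors (again my own toy code, not from the thread):

```python
# Score the even dataset aa, ab, ba, ba under both schemes, using
# empirical frequencies as the predicted probabilities.
from collections import Counter

words = ["aa", "ab", "ba", "ba"]
n = len(words)

# Word predictor: each word is predicted with its frequency in the set.
freq = Counter(words)
word_total = sum(1 - freq[w] / n for w in words)  # 0.75+0.75+0.5+0.5 = 2.5

# Letter predictor: first-letter frequency, then the conditional
# second-letter frequency given the first letter.
first = Counter(w[0] for w in words)
second = {c: Counter(w[1] for w in words if w[0] == c) for c in first}
letter_total = sum(
    (1 - first[w[0]] / n)
    + (1 - second[w[0]][w[1]] / sum(second[w[0]].values()))
    for w in words
)  # first letters 50% wrong x4, a->? 50% wrong x2, b->a free: 3.0
print(word_total, letter_total)
```

Here the letter predictor's 3.0 really is worse than the word predictor's 2.5, which is the even-distribution failure the text describes.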
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T90b7756a48658254-Mf9740a048c3bcbf341904f03