Imagine a GPT-2 prompt only one word long, e.g. 'scratching'. It could be tokenized as 'scratch' + 'ing', so say we have only 'scratch'. The word to predict next is one seen after it in the data, and the more frequent one is more likely to be chosen. 'Scratch' can also match related words ('itch', 'scrape', etc.), and then we look at which word frequently follows those. So the word that is both most closely related and most frequent is chosen.

scratch the x64
scratch it x58
scratch back x38
itch my x76
itch shoulder x50

So 'my' is predicted, because itch ≈ scratch strongly and 'my' is seen after 'itch' 76 times. But we aren't done yet. Each candidate word/token is now given an extra score: its relational score to the story words in your input prompt ('scratch'; no 'itch' exists there). So we may end up picking, say, 'shoulder'. We may also then translate 'shoulder' to adapt it, e.g. to 'stomach' if your story was about a stomach.
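The scoring described above can be sketched in a few lines. Everything here is hypothetical: the bigram counts come from the example list, and the similarity values (scratch 1.0, itch 0.9, scrape 0.8) are made-up stand-ins for whatever relatedness measure is used; this is not GPT-2's actual mechanism.

```python
# Bigram counts from the example in the post (hypothetical data).
counts = {
    ("scratch", "the"): 64,
    ("scratch", "it"): 58,
    ("scratch", "back"): 38,
    ("itch", "my"): 76,
    ("itch", "shoulder"): 50,
}

# Assumed similarity of each stored context word to the prompt word "scratch".
similarity = {"scratch": 1.0, "itch": 0.9, "scrape": 0.8}

def predict(prompt_word):
    """Score each candidate next word by frequency times context similarity."""
    scores = {}
    for (left, right), freq in counts.items():
        sim = similarity.get(left, 1.0 if left == prompt_word else 0.0)
        scores[right] = max(scores.get(right, 0.0), freq * sim)
    return max(scores, key=scores.get)

print(predict("scratch"))  # "my": 76 * 0.9 = 68.4 beats "the": 64 * 1.0
```

With these numbers, 'my' wins even though 'the' is the most frequent word after 'scratch' itself, matching the example in the post.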

If we have two words in our generated story so far, 'scratch stomach', we can predict the next word using either word, but we want to use both. So we look for matching nodes/related nodes in the data, including positional rearrangements, e.g. 'scratch stomach', 'itch bladder', 'bladder itch', etc., and the more distantly they are positioned, the weaker the match, of course.
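A sketch of matching a two-word context against stored phrases, including the positional-rearrangement idea. The phrase frequencies, similarity values, and the 0.5 reordering penalty are all assumptions for illustration.

```python
# Hypothetical stored two-word phrases and their frequencies.
phrases = {("scratch", "stomach"): 12, ("itch", "bladder"): 9,
           ("bladder", "itch"): 4}
# Assumed word-pair similarities (symmetric).
similarity = {("scratch", "itch"): 0.9, ("stomach", "bladder"): 0.7}

def sim(a, b):
    if a == b:
        return 1.0
    return similarity.get((a, b), similarity.get((b, a), 0.0))

def match_score(context, phrase, freq):
    """Match both words; a reversed phrase still matches, with a penalty."""
    same_order = sim(context[0], phrase[0]) * sim(context[1], phrase[1])
    reordered = sim(context[0], phrase[1]) * sim(context[1], phrase[0])
    order_penalty = 0.5  # assumed cost for positional rearrangement
    return freq * max(same_order, reordered * order_penalty)

context = ("scratch", "stomach")
for phrase, freq in phrases.items():
    print(phrase, round(match_score(context, phrase, freq), 2))
```

Here 'bladder itch' still scores above zero against 'scratch stomach' only because the reversed ordering is tried with a penalty, which is the "positional rearranging" the text describes.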

So say we have 'first I go into the lake and stay there, second I freeze cold in the winter'. The word 'first' then casts a vote on the next word, via translation (it could be '1st', etc.), frequency, and relation, but it is far back and there are other words in between, so its vote definitely counts less, being perhaps 1/10th the weight and fading in energy by this time; yet it still has a strong vote.
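The fading-with-distance idea can be sketched as a geometric decay. The 0.8 decay rate is an arbitrary assumption chosen so a far-back word ends up with roughly the "1/10th the weight" mentioned above; it is not a real model parameter.

```python
def vote_weight(position, current_position, decay=0.8):
    """Weight falls off geometrically with distance from the slot being predicted."""
    distance = current_position - position
    return decay ** distance

story = "first I go into the lake and stay there second I freeze".split()
next_slot = len(story)  # the position of the word we are about to predict
for pos, word in enumerate(story):
    if word in ("first", "second"):
        print(word, round(vote_weight(pos, next_slot), 3))
```

With this decay, 'second' (3 words back) votes far more strongly than 'first' (12 words back), but 'first' still contributes a nonzero vote.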

Say we find 'stomach itch' but with a word in the middle or one word missing, e.g. 'stomach really itch' or just 'stomach'; this also scores lower, but still scores.
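A sketch of that gap-tolerant matching. The two penalty values (0.7 for an inserted word, 0.5 for a missing word) are made up for illustration; only the idea that imperfect matches still score something comes from the text.

```python
def gap_score(stored, observed, gap_penalty=0.7, missing_penalty=0.5):
    """Score an observed phrase against a stored one, tolerating small gaps."""
    # Exact match: full score.
    if stored == observed:
        return 1.0
    # One extra word in the middle, e.g. "stomach really itch".
    if (len(observed) == len(stored) + 1
            and observed[0] == stored[0] and observed[-1] == stored[-1]):
        return gap_penalty
    # One word missing, e.g. just "stomach".
    if (len(observed) == len(stored) - 1
            and all(w in stored for w in observed)):
        return missing_penalty
    return 0.0

print(gap_score(("stomach", "itch"), ("stomach", "itch")))           # 1.0
print(gap_score(("stomach", "itch"), ("stomach", "really", "itch"))) # 0.7
print(gap_score(("stomach", "itch"), ("stomach",)))                  # 0.5
```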

What if I say 'second, this is that, third, i am this, but i won't say the next because i refuse to repeat myself'? Here I pay attention in order to avoid saying the word 'fourth'.
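That suppression could be sketched as a penalty applied when a refusal cue appears in the prompt. The cue phrases, the hardcoded target 'fourth', and the 0.1 penalty factor are all assumptions; real models have no such explicit rule.

```python
# Assumed phrases that signal the speaker refuses to continue the pattern.
refusal_cues = ("won't say", "refuse to repeat")

def adjust(candidate_scores, prompt):
    """Heavily penalize the obvious continuation when the prompt signals refusal."""
    if any(cue in prompt for cue in refusal_cues):
        if "fourth" in candidate_scores:
            candidate_scores["fourth"] *= 0.1  # assumed suppression factor
    return candidate_scores

scores = {"fourth": 0.9, "anyway": 0.3}
prompt = "second, this is that, third, i am this, but i won't say the next"
best = max(adjust(scores, prompt), key=scores.get)
print(best)  # "anyway", since "fourth" is penalized to 0.09
```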
Artificial General Intelligence List: AGI