One of the only differences if I try vision is that instead of storing/matching a 
group of symbols, ex. thanksgivin>g, and getting the next predicted letter that 
way, I must do so in 2 dimensions, since images are 2D. Not hard to do so far. 
Also, instead of thanksgivin>g, it would look like this (if we had 10 pixel 
shades, i.e. 0-9): 361736472>5. Now, 5 is similar to 6 in brightness, and if all 
pixels are 2 shades brighter then there really is no error: it is still an image 
of a turkey, just brighter. So if we see a turkey again, it will be 'similar' to 
that string of numbers, or the same but all 2 shades brighter. So that is not too 
ambitious to code either. You'd also want to handle rotation/scale/location like 
this, using the described relative-mismatch pattern. The relativeness function 
actually works in text too; you may see h e l l o. So: I'd need to add 2D context 
and a similar delay for pixels. I actually already have the delay working, it's 
just in time/proximity for the image/sequence, not in pixel brightness level 
itself. So lots here is familiar already, really; not much change to the code is 
needed...
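The brightness-shift matching described above might be sketched roughly like this. Everything here is my own guess at one concrete form of the idea: the name relative_match, the uniform-offset estimate, and the scoring formula are all assumptions, not the author's actual code.

```python
# Minimal sketch of "relative mismatch" for shade strings, assuming
# 10 shades (0-9). Two strings that differ only by a uniform brightness
# offset (e.g. "all pixels 2 shades brighter") should count as equal.

def relative_match(a, b):
    """Score two equal-length shade strings in [0, 1], ignoring a
    uniform brightness offset between them."""
    if len(a) != len(b):
        return 0.0
    # Per-pixel brightness differences, and the single offset that
    # would best align the two strings.
    diffs = [int(y) - int(x) for x, y in zip(a, b)]
    offset = round(sum(diffs) / len(diffs))
    # Error left over after removing that uniform offset.
    residual = sum(abs(d - offset) for d in diffs)
    return 1.0 - residual / (9 * len(a))  # 1.0 = perfect relative match

turkey        = "361736472"
turkey_bright = "583958694"  # same turkey, every shade +2
assert relative_match(turkey, turkey_bright) == 1.0
```

A shifted copy scores a perfect 1.0, while an unrelated string scores lower, which is the "similar or same just all 2 shades brighter" behavior.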

So turning my text predictor into an image predictor isn't that hard once it 
gets like GPT, I think. Also, I had been thinking of throwing away all image 
data except edges, since if there are lines on the sides then the inside is the 
same shade; lines show the change of shade. But one problem: it seems confusing, 
and similar in performance to just using and predicting all the pixels. I mean, 
you need to output them all anyway; the other way you'd simply fill in holes 
with shade, and some may be open holes that graduate in shade, which seems tough 
to fill in! Or maybe I'm overlooking something.
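The "fill in holes with shade" step could look something like this for a single row, assuming we kept only the edge pixels and must reconstruct the interior. The function name fill_row and the linear interpolation are my own illustration; a flat region gets a constant shade, and an "open hole that graduates in shade" gets a gradient.

```python
# Sketch: reconstruct a row of pixels from its kept edge pixels,
# linearly interpolating shade between consecutive edges (shades 0-9).

def fill_row(edges, width):
    """edges: {position: shade} for the kept edge pixels in one row.
    Returns the full row of length `width` with interiors filled in."""
    xs = sorted(edges)
    row = [0] * width
    for left, right in zip(xs, xs[1:]):
        for x in range(left, right + 1):
            # Fraction of the way from the left edge to the right edge.
            t = (x - left) / (right - left)
            row[x] = round(edges[left] + t * (edges[right] - edges[left]))
    return row

# Same shade on both edges -> flat fill (the easy case):
print(fill_row({0: 3, 8: 3}, 9))  # [3, 3, 3, 3, 3, 3, 3, 3, 3]
# Different shades -> a gradient (the "graduating" case):
print(fill_row({0: 1, 8: 9}, 9))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The flat case is trivial, but as the post notes, real gradients need not be linear, which is exactly why filling open holes seems tough.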
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T5b614d3e3bb8e0da-Me6ea1e6105d9451419976755