Consider video. Humans recognize object features like apple, stem, curve, even a dot. Some apples are green, some are distorted, and some may look like a piano (almost never, lol). Such an object has the shape of a piano but the color and zigzag texture of an apple, so depending on which you focus on, you will notice one in isolation. Of course, fully recognizing 'apple' requires not just texture and color but also shape, so 'apple' doesn't activate fully in our case here, because the object has the shape of a piano. But you can still be fairly sure, for various reasons, that it is most likely an apple, and edible.

We can recognize an apple being thrown as 'motion' and as 'motion of apple'. Humans seem to write text the way they see, as in "I lift my arm and throw a ball to Tom". So humans see a sequence of features like apple...moving...hits wall...apple disappears...wall remains. Humans also remember important features like food and will talk about them on their own, creating related data. Humans use text as their gateway to say what they are thinking. Apples, motion, and shapes appear in many places on Earth; there is recurrence.

Vision can store a sequence in an image or in a video of images. Position matters less in a single frame, unless you pay attention to, let's say, a pirate map that paints a trail to an X, like a>b>c>d>X. In that case you see a sequence in a single image! However, how long did you look at the image? Ah, so it was a video after all, of what you saw, in order. You always see one feature at a time. It may be 'apple' or 'multiple fruit', and when it is 'apple' it may slightly activate 'multiple fruit' if the surroundings seep into your attention by accident.
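To make the piano-shaped apple concrete, here is a toy sketch of partial activation. Everything in it is an assumption for illustration only: the cue names, the prototype values, and the simple "fraction of matching cues" score are hypothetical, not a claim about how the brain or any real vision system computes this.

```python
# Toy model: a concept "activates" in proportion to how many of its
# cues (shape, color, texture) the observed object matches.
# All cue names and values below are hypothetical.

PROTOTYPES = {
    "apple": {"shape": "round", "color": "red", "texture": "zigzag"},
    "piano": {"shape": "piano", "color": "black", "texture": "glossy"},
}


def activation(observed, prototype, attend_to=None):
    """Return the fraction of attended cues on which observed matches prototype.

    attend_to is an optional set of cue names; None means attend to all cues.
    """
    cues = [c for c in prototype if attend_to is None or c in attend_to]
    if not cues:
        return 0.0
    matches = sum(observed.get(c) == prototype[c] for c in cues)
    return matches / len(cues)


# An object with the shape of a piano but the color and texture of an apple.
piano_shaped_apple = {"shape": "piano", "color": "red", "texture": "zigzag"}

# Attending to everything: 'apple' activates only partially (2 of 3 cues),
# but still more than 'piano' (1 of 3 cues).
apple_score = activation(piano_shaped_apple, PROTOTYPES["apple"])
piano_score = activation(piano_shaped_apple, PROTOTYPES["piano"])

# Focusing on shape alone, you notice the piano in isolation.
shape_only = activation(piano_shaped_apple, PROTOTYPES["piano"], {"shape"})

# Focusing on color and texture alone, you notice the apple in isolation.
surface_only = activation(
    piano_shaped_apple, PROTOTYPES["apple"], {"color", "texture"}
)

print(apple_score, piano_score, shape_only, surface_only)
```

With all cues attended, 'apple' wins without activating fully; narrowing attention to one kind of cue flips which concept you notice, which is the point of the example.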
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T43ab26814eaa1bdd-M9b4078cce5eed1185a5d64d0
