Steve,

I wouldn't say that's an accurate description of what I wrote. What I wrote
was a way to think about how to solve computer vision.

My approach to artificial intelligence is a "Neat" approach. See
http://en.wikipedia.org/wiki/Neats_vs._scruffies The paper you attached is a
"Scruffy" approach. Neat approaches are characterized by deliberate
algorithms that are analogous to the problem and can sometimes be shown to
be provably correct. An example of a Neat approach is the use of features in
the paper I mentioned. One can describe why the features are calculated and
manipulated the way they are. An example of a scruffies approach would be
neural nets, where you don't know the rules by which it comes up with an
answer and such approaches are not very scalable. Neural nets require
manually created training data and the knowledge generated is not in a form
that can be used for other tasks. The knowledge isn't portable.

I also wouldn't say I switched from absolute values to rates of change.
That's not at all what I'm saying here.

Dave

On Wed, Aug 4, 2010 at 2:32 PM, Steve Richfield
<steve.richfi...@gmail.com>wrote:

> David,
>
> It appears that you may have reinvented the wheel. See the attached
> article. There is LOTS of evidence, along with some good math, suggesting
> that our brains work on rates of change rather than absolute values. Then,
> temporal learning, which is otherwise very difficult, falls out as the
> easiest of things to do.
>
> In effect, your proposal shifts from absolute values to rates of change.
>
> Steve
> ===================
> On Tue, Aug 3, 2010 at 8:52 AM, David Jones <davidher...@gmail.com> wrote:
>
>> I've suddenly realized that computer vision of real images is very much
>> solvable and that it is now just a matter of engineering. I was so stuck
>> before because you can't make the simple assumptions in screenshot computer
>> vision that you can in real computer vision. This makes experience probably
>> necessary to effectively learn from screenshots. Objects in real images do
>> not change drastically in appearance, position, or other dimensions in
>> unpredictable ways.
>>
>> The reason I came to the conclusion that it's a lot easier than I thought
>> is that I found a way to describe why existing solutions work, how they work
>> and how to come up with even better solutions.
>>
>> I've also realized that I don't actually have to implement it, which is
>> the most difficult part: even if you know that a solution to part of the
>> problem has certain properties and issues, implementing it takes a lot of
>> time. Instead, I can assume I have a less-than-perfect solution with the
>> properties I predict from other experiments, and then solve the problem
>> without actually implementing every last detail.
>>
>> *First*, existing methods find observations that are likely true by
>> themselves. They find data patterns that are very unlikely to occur by
>> coincidence, such as many features moving together over several frames of a
>> video and over a statistically significant distance. They use thresholds to
>> ensure that the observed changes are likely transformations of the original
>> property observed or to ensure the statistical significance of an
>> observation. These observations are highly likely to be true and are not
>> coincidences or noise.
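>>
>> To make that first point concrete, here is a rough sketch in Python of
>> what I mean by a pattern that is very unlikely to be coincidence: a group
>> of features that all move with roughly the same displacement across
>> frames, over a distance well above the noise level. The threshold values
>> here are made up for illustration, not tuned.
>>
>> import numpy as np
>>
>> def coherent_motion(tracks, noise_sigma=0.5, min_distance=10.0,
>>                     max_spread=1.0):
>>     """tracks: array of shape (n_features, n_frames, 2) holding (x, y)
>>     positions. True if the features move together over a statistically
>>     significant distance."""
>>     # Total motion of each feature from the first to the last frame.
>>     displacements = tracks[:, -1, :] - tracks[:, 0, :]
>>     mean_disp = displacements.mean(axis=0)
>>     # The group must move much farther than the measurement noise ...
>>     moved_far = (np.linalg.norm(mean_disp)
>>                  > max(min_distance, 5 * noise_sigma))
>>     # ... and each feature must agree with the group's motion.
>>     spread = np.linalg.norm(displacements - mean_disp, axis=1).max()
>>     return moved_far and spread < max_spread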
>>
>> *Second*, they make sure that the other possible explanations of the
>> observations are very unlikely. This is usually done with an absolute
>> threshold on the best match and a second, difference threshold between the
>> best match and the second-best match. This makes sure that the second-best
>> match is much farther away than the best match. This is important because
>> it's not enough to find a very likely match
>> if there are 1000 very likely matches. You have to be able to show that the
>> other matches are very unlikely, otherwise the specific match you pick may
>> be just a tiny bit better than the others, and the confidence of that match
>> would be very low.
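>>
>> Here is a minimal sketch of that second check: accept the best match only
>> if it is good enough in absolute terms and clearly better than the
>> runner-up. The 0.2 and 0.7 values are illustrative, not tuned.
>>
>> def accept_match(distances, absolute_threshold=0.2, ratio=0.7):
>>     """distances: match distances from one feature to all candidates
>>     (lower is better). Returns the index of the accepted match, or None."""
>>     if len(distances) < 2:
>>         return None
>>     order = sorted(range(len(distances)), key=lambda i: distances[i])
>>     best, second = distances[order[0]], distances[order[1]]
>>     # The best match must be likely on its own ...
>>     if best > absolute_threshold:
>>         return None
>>     # ... and every alternative must be much worse, otherwise the match
>>     # is ambiguous and its confidence is low.
>>     if best > ratio * second:
>>         return None
>>     return order[0]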
>>
>>
>> So, my initial design plans are as follows. Note: I will probably not
>> actually implement the system because the engineering part dominates the
>> time. I'd rather convert real videos to pseudo test cases or simulation test
>> cases and then write a pseudo design and algorithm that can solve it. This
>> would show that it works without actually spending the time needed to
>> implement it. It's more important for me to prove it works and show what it
>> can do than to actually do it. If I can prove it, there will be sufficient
>> motivation for others to do it with more resources and manpower than I have
>> at my disposal.
>>
>> *My Design*
>> *First*, we use high-speed cameras and lidar systems to gather sufficient
>> data with very low uncertainty because the changes possible between data
>> points can be assumed to be very low, allowing our thresholds to be much
>> smaller, which eliminates many possible errors and ambiguities.
>>
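>> As a toy example of why the frame rate matters (the speed and frame rate
>> here are assumptions for illustration only): the faster the camera, the
>> less an object can possibly have moved between frames, so the matching
>> threshold can be very tight.
>>
>> def max_displacement_per_frame(max_speed_m_per_s, frames_per_second):
>>     """Upper bound on how far an object can move between two frames."""
>>     return max_speed_m_per_s / frames_per_second
>>
>> # e.g. a car at 30 m/s seen by a 240 fps camera:
>> print(max_displacement_per_frame(30.0, 240))  # 0.125 m per frame
>>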
>> *Second*, we have to gain experience from high-confidence observations.
>> These are gathered as follows:
>> 1) Describe allowable transformations (thresholds) and what they mean, such
>> as the change in size and position of an object based on the frame rate of a
>> camera. Another might be the allowable change in hue and contrast because of
>> lighting changes. With a high-frame-rate camera, if you can find a match
>> that is within these high-confidence thresholds in multiple dimensions
>> (size, position, color, etc.), then you have a high-confidence match.
>> 2) Find data patterns that are very unlikely to occur by coincidence, such
>> as many features moving together over several frames of a video and over a
>> statistically significant distance. These observations are highly likely
>> to be true and are not coincidences or noise.
>> 3) Most importantly, make sure the matches we find are highly likely on
>> their own and unlikely to be coincidental.
>> 4) Second most importantly, make sure that any other possible matches or
>> alternative explanations are very unlikely in terms of distance (measured in
>> multiple dimensions and weighted by the certainty of those observations).
>> These should also be in terms of the thresholds we used previously because
>> those define acceptable changes in a normalized way.
>>
>> *That is a rough description of the idea: find matches that are highly
>> likely on their own, while it is very unlikely that they are incorrect,
>> coincidental, or mismatched. A rough sketch of this matching step follows
>> below.*
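>>
>> Sketching steps 1-4 in Python (the dimensions, weights, and thresholds
>> below are placeholders, not tuned values): each dimension's observed
>> change is normalized by its allowed threshold, so "acceptable change"
>> means the same thing whether it is size, position, or color, and a
>> candidate is accepted only if it is within the thresholds and clearly
>> better than every alternative.
>>
>> def normalized_distance(observed_change, thresholds, weights):
>>     """observed_change / thresholds / weights: dicts keyed by dimension
>>     (e.g. 'position', 'size', 'hue'). A value of 1.0 in any dimension
>>     means the change is exactly at its allowed threshold."""
>>     total = 0.0
>>     for dim, change in observed_change.items():
>>         total += weights[dim] * (abs(change) / thresholds[dim]) ** 2
>>     return total ** 0.5
>>
>> def best_unambiguous_match(candidates, thresholds, weights, ratio=0.7):
>>     """candidates: list of observed-change dicts, one per candidate match.
>>     Returns the index of the accepted candidate, or None."""
>>     scores = [normalized_distance(c, thresholds, weights)
>>               for c in candidates]
>>     order = sorted(range(len(scores)), key=lambda i: scores[i])
>>     if not order or scores[order[0]] > 1.0:
>>         return None  # no high-confidence match at all (steps 1, 3)
>>     if len(order) > 1 and scores[order[0]] > ratio * scores[order[1]]:
>>         return None  # ambiguous: runner-up too close (step 4)
>>     return order[0]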
>>
>> *Third*, we use experience, when we have it, in combination with the
>> algorithm I just described. If we can find unlikely coincidences between our
>> experience and our raw sensory observations, we can use this to look
>> specifically for those important observations the experience predicts and
>> verify them, which will in turn give us higher-confidence inferences.
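>>
>> A small sketch of that idea, under a deliberately simple constant-velocity
>> assumption (the real "experience" would of course be much richer than
>> this): predict where an object should appear next, then treat an
>> observation that lands inside a tight window around the prediction as a
>> verified, high-confidence match.
>>
>> def predicted_position(previous_positions):
>>     """Constant-velocity prediction from the last two observed positions."""
>>     (x1, y1), (x2, y2) = previous_positions[-2], previous_positions[-1]
>>     return (2 * x2 - x1, 2 * y2 - y1)
>>
>> def verify_prediction(prediction, observation, tolerance=2.0):
>>     """If the new observation lands where experience predicted, within a
>>     small tolerance, that coincidence raises our confidence in the match."""
>>     dx = observation[0] - prediction[0]
>>     dy = observation[1] - prediction[1]
>>     return (dx * dx + dy * dy) ** 0.5 <= tolerance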
>>
>> Once we have solved the correspondence problem like this, we can perform
>> higher reasoning and learning.
>>
>> Dave
>


