I've experimented a little with this.
Basically, there are a few different tasks, which are not necessarily
done in this order, but:
1. Segment the image into the areas you care about (the hand) and the
areas you don't
2. Detect the position/motion
3. Determine coordinates
4. Output coordinates to something
There are a few techniques for #1. One is color, as you suggest, but
that's probably going to be fairly problematic as people and clothing
come in many colors. Another is shape... look for a hand-shaped
thing, or require the user to hold their fingers in a "V". Another is
to do step 2 first, then look for the "end" of the motion, which would
presumably be the fingertips. Another is to ignore this step
altogether and take average motion for the whole scene and hope it's
exact enough.
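To illustrate the color approach, here's a minimal sketch in pure Python, assuming frames arrive as rows of RGB tuples. The skin-tone bounds are made up for illustration and would need real tuning; production skin detection usually keys in a chroma space like YCbCr rather than raw RGB, which is part of why color alone is problematic.

```python
def color_mask(frame, lo=(90, 40, 20), hi=(255, 200, 170)):
    """Rough color-key segmentation: 1 where a pixel's RGB falls inside
    a (made-up, illustrative) skin-tone box, 0 elsewhere."""
    return [
        [1 if all(l <= c <= h for c, l, h in zip(px, lo, hi)) else 0
         for px in row]
        for row in frame
    ]
```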
For #2, someone suggested the optical flow patch. The optical flow
algorithm is pretty handy in computer vision, as it can do everything
from detecting motion of things in the scene to determining whether
things are moving closer or further away. Its limitation is that it
can only detect motion in the direction of the brightness gradient.
(So it can't easily detect a ball spinning or a pool cue moving toward
a cue ball.) A simple diff also works for detecting motion --
especially if you don't care as much about the direction and speed of
the motion as you do its location.
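The simple-diff idea can be sketched like this, on grayscale frames represented as nested lists of brightness values (the threshold of 30 is an arbitrary placeholder you'd tune against camera noise):

```python
def diff_mask(prev, curr, threshold=30):
    """Binary motion mask: 1 wherever the brightness change between two
    consecutive grayscale frames exceeds the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for c, p in zip(crow, prow)]
        for crow, prow in zip(curr, prev)
    ]
```

Note that this tells you where motion happened, not which way it went -- exactly the trade-off described above.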
Another option for #2 is to actually correlate the points from one
frame to the next using something like the KLT feature tracking
algorithm (Kanade-Lucas-Tomasi). At one point I tried to write a QC
plug-in to do that, but it's hacky and slow and I never refined it
(http://www.samkass.com/blog/page2/page2.html).
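For a sense of what the Lucas-Kanade step inside KLT does, here's a toy single-window flow solve in pure Python. This is only a sketch of the core least-squares step on the brightness-constancy equation Ix*u + Iy*v = -It; the real KLT tracker adds corner selection, image pyramids, and iterative refinement per tracked feature.

```python
def lucas_kanade(prev, curr):
    """Estimate one (u, v) flow vector for a whole window by solving
    the 2x2 normal equations of Ix*u + Iy*v = -It in least squares.
    Returns None when the gradient structure is degenerate (the
    aperture problem -- the same limitation mentioned above)."""
    h, w = len(prev), len(prev[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # spatial gradient x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # spatial gradient y
            it = curr[y][x] - prev[y][x]                  # temporal gradient
            sxx += ix * ix; sxy += ix * iy; syy += iy * iy
            sxt += ix * it; syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None
    u = (sxy * syt - syy * sxt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v
```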
For #3, you talked about mucking with contrast and saturation to draw
out the motion. Another option is to convert everything to black and
white with a threshold, do some noise reduction, then just look for
edges. In any case, figuring out which are the "interesting"
coordinates may end up being the hardest part. You could write a
filter that looks for areas of higher curvature on the diff image and
assume those are fingertips. Or use the average optical flow values
to offset the "current" location to a "new" location... it won't be
exact, but you'll be able to "sweep" your hand in a direction and the
mouse will move that way.
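One cheap way to pick the "interesting" coordinate is the centroid of the motion mask, smoothed over time so the cursor doesn't jitter frame to frame. A sketch (the 0.5 smoothing factor is an arbitrary placeholder):

```python
def motion_centroid(mask):
    """Centroid (x, y) of the nonzero pixels in a binary motion mask,
    or None if nothing moved this frame."""
    count = sx = sy = 0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                count += 1
                sx += x
                sy += y
    if count == 0:
        return None
    return sx / count, sy / count

def smooth(prev_xy, new_xy, alpha=0.5):
    """Exponential smoothing of the tracked point between frames."""
    if prev_xy is None:
        return new_xy
    return tuple(p + alpha * (n - p) for p, n in zip(prev_xy, new_xy))
```

The centroid won't sit on a fingertip -- it lands in the middle of whatever moved -- but it's enough for the "sweep your hand and the mouse follows" behavior described above.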
For #4, you'll need a custom QC plug-in. Unless things have changed
(I haven't done any serious QC hacking in quite a while), there's no
way to output coordinates from an image input. So put your
Objective-C hat on and take a look at the examples...
--Sam
On Jan 17, 2008, at 10:35 AM, Johnson, Mark P. - Duluth wrote:
I'm trying to figure out the best way to track hands as though a
person in front of a kiosk were using a mouse (or two).
My strategy so far is to grab an image from a web cam when the Quartz
composition launches. Using a trick I saw in PhotoBooth, you can then
compare the current frame of captured video to the background and
find the difference -- that gives you a quick mask of something
entering the screen.
Next, I would up contrast and saturation to exaggerate differences.
By comparing a frame from a moment before with the current frame, I
could then discover motion.
The trick is to get the "most moved" area and perhaps the smallest
area -- assuming that a finger is going to be smaller than a head,
and is going to have more movement (not always true -- but the user
should get the hang of it in short order).
So I would need to figure out not only the difference between the
prior image and the current image for what has moved -- I also need
to use a false color to get an idea that that tan blob over there (a
hand) has moved more than the blue jacket. I'm not completely sure
how to do that. It sort of overlaps some of the other color-lookup
discussions, where you don't want to check the color of each strand
of hair, because lighting conditions can quickly change -- you want
to "blob" the median values of a region. AFTER you discover the
motion of a common region, you can track perhaps a highlight of the
most-moved point (like a finger). Luminosity would be valuable for
using the Outline sketch effects to define regions, and then again
for highlighting that gives dimension.
I'm thinking out loud -- maybe someone has come up with a much easier
strategy to turn a camera into a mouse. I've seen these with video
displays at malls and I'm pretty sure they are using infrared on the
camera to ignore their own projection. I may need to get a real video
camera and hack it to grab only infrared -- I'd love advice on that.
I know there is some patch that allows for finding the pixel in a row
or column with the greatest or least value. With so many new patches
I've lost track of it -- but after I can somehow visually create an
image with the "most moved" regions, I can find an x and y position
and use that as though it were a mouse.
CIKernel functions that do these operations would, I'm assuming, be
faster -- so if anyone can point me to prior work or a way to do
this, I'd love it.
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Quartzcomposer-dev mailing list ([email protected]
)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/quartzcomposer-dev/samkass%40samkass.com
This email sent to [EMAIL PROTECTED]