I've experimented a little with this.

Basically, there are a few different tasks, which are not necessarily done in this order:

1. Segment the image into the areas you care about (the hand) and the areas you don't
2. Detect the position/motion
3. Determine coordinates
4. Output coordinates to something

There are a few techniques for #1. One is color, as you suggest, but that's probably going to be fairly problematic as people and clothing come in many colors. Another is shape... look for a hand-shaped thing, or require the user to hold their fingers in a "V". Another is to do step 2 first, then look for the "end" of the motion, which would presumably be the fingertips. Another is to ignore this step altogether and take average motion for the whole scene and hope it's exact enough.
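
As a rough illustration of the background-diff idea behind #1, here's the logic in plain Python rather than as a QC patch (grayscale frames as lists of rows; the threshold of 30 is an arbitrary assumption you'd tune for your lighting):

```python
def segment_by_background_diff(background, frame, threshold=30):
    """Return a binary mask: 1 where the frame differs from the background.

    `background` and `frame` are grayscale images as lists of rows of
    pixel values. Anything differing by more than `threshold` is assumed
    to be something that entered the scene (e.g. a hand).
    """
    mask = []
    for bg_row, fr_row in zip(background, frame):
        mask.append([1 if abs(b - f) > threshold else 0
                     for b, f in zip(bg_row, fr_row)])
    return mask
```

The same per-pixel compare is what you'd express as a CIKernel in practice; this just shows the shape of it.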

For #2, someone suggested the optical flow patch. The optical flow algorithm is pretty handy in computer vision, as it can do everything from detecting motion of things in the scene to determining whether things are moving closer or further away. Its limitation is that it can only detect motion in the direction of the brightness gradient. (So it can't easily detect a ball spinning or a pool cue moving toward a cue ball.) A simple diff also works for detecting motion -- especially if you don't care as much about the direction and speed of the motion as you do its location.
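
A minimal sketch of the simple-diff approach -- locate the motion by taking the centroid of the changed pixels between two frames (again Python for illustration; the threshold is an assumption you'd tune):

```python
def motion_centroid(prev, curr, threshold=20):
    """Diff two grayscale frames and return the centroid (x, y) of the
    pixels that changed, or None if nothing moved."""
    xs, ys = [], []
    for y, (p_row, c_row) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(p_row, c_row)):
            if abs(p - c) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```

Note this gives you a location but no direction or speed, which is exactly the trade-off mentioned above.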

Another option for #2 is to actually correlate the points from one frame to the next using something like the KLT feature tracking algorithm (Kanade-Lucas-Tomasi). At one point I tried to write a QC plug-in to do that, but it's hacky and slow and I never refined it (http://www.samkass.com/blog/page2/page2.html ).
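
For flavor, here's a much cruder stand-in for KLT -- brute-force template matching by sum of squared differences over a small search window. (Real KLT uses image gradients to avoid this exhaustive search, and handles sub-pixel motion; this sketch also assumes the patch and the whole search window stay inside the image.)

```python
def track_patch(prev, curr, px, py, size=3, search=2):
    """Find where the size x size patch at (px, py) in `prev` moved to in
    `curr`, by brute-force SSD search within +/- `search` pixels.
    Assumes all candidate patches lie fully inside the image."""
    def patch(img, x, y):
        return [row[x:x + size] for row in img[y:y + size]]

    ref = patch(prev, px, py)
    best, best_ssd = (px, py), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = patch(curr, px + dx, py + dy)
            ssd = sum((a - b) ** 2
                      for r_ref, r_cand in zip(ref, cand)
                      for a, b in zip(r_ref, r_cand))
            if ssd < best_ssd:
                best_ssd, best = ssd, (px + dx, py + dy)
    return best
```

Even this naive version conveys the idea: pick distinctive patches in one frame and find them again in the next.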

For #3, you talked about mucking with contrast and saturation to draw out the motion. Another option is to put everything in black and white with a threshold, do some noise reduction, then just look for edges. In any case, figuring out which are the "interesting" coordinates may end up being the hardest part. You could do a filter that looked for areas of higher curvature on the diff image and assume those are fingertips. Or use the average optical flow values to offset the "current" location to a "new" location... it won't be exact, but you'll be able to "sweep" your hand in a direction and the mouse will move that way.
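
A toy version of the threshold-then-find-fingertip idea: after thresholding the diff into a binary mask, just take the topmost moved pixel as the fingertip guess. That's far cruder than the curvature test described above, but it shows the shape of the logic (and works surprisingly often when the hand enters from the bottom of the frame).

```python
def fingertip_guess(mask):
    """Given a binary motion mask (1 = moved), return the topmost moved
    pixel as an (x, y) fingertip guess, or None if nothing moved."""
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                return (x, y)
    return None
```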

For #4, you'll need a custom QC plug-in. Unless things have changed (I haven't done any serious QC hacking in quite a while), there's no way to output coordinates from an image input. So put your Objective-C hat on and take a look at the examples...

        --Sam


On Jan 17, 2008, at 10:35 AM, Johnson, Mark P. - Duluth wrote:

I'm trying to figure out the best way to track hands as though a person in
front of a kiosk were using a mouse (or two).

My strategy so far is to grab an image from a web cam when the Quartz composition launches. Using a trick I saw in Photo Booth, you can then compare the current frame of captured video to the background and find the difference --
that gives you a quick mask of something entering the screen.

Next, I would up contrast and saturation, to exaggerate differences. By comparing a frame from a moment before with a current frame, I could then
discover motion.

The trick is to get the "most moved" area and perhaps the smallest area -- assuming that a finger is going to be smaller than a head, and is going to have more movement (not always true -- but the user should get the hang of
it in short order).

So I would need to figure out not only the difference between the prior image and the current image for what has moved -- I need to use a false color to get an idea that that tan blob over there (a hand) has moved more than the blue jacket. I'm not completely sure how to do that. It sort of overlaps some of the other color-lookup discussions, where you don't want to check the color of each strand of hair, because lighting conditions can quickly change -- you want to "blob" the median values of a region. AFTER you discover the motion of a common region, you can track perhaps a highlight of the most-moved point (like a finger). Luminosity would be valuable for using the Outline sketch effects to define regions, and then again for highlighting that gives dimension.

I'm thinking out loud -- maybe someone has come up with a much easier strategy to turn a camera into a mouse. I've seen these with video displays at malls, and I'm pretty sure they are using infrared on the camera to ignore their own projection. I may need to get a real video camera and hack it to grab only infrared -- I'd love advice on that.

I know there is some patch that allows for finding the pixel in a row or column with the greatest or least value. With so many new patches I've lost track of it -- but after I can somehow visually create an image with the "most moved" regions, I can find an x and y position and use that as though it were a mouse.

I'm assuming CIKernel functions that do these operations would be faster -- so if anyone can point me to prior work or a way to do this, I'd love it.

_______________________________________________
Do not post admin requests to the list. They will be ignored.
Quartzcomposer-dev mailing list ([email protected] )
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/quartzcomposer-dev/samkass%40samkass.com

This email sent to [EMAIL PROTECTED]
