Hello,

I had an interesting private mail exchange on autofocus with Alex - thanks 
for helping me to find the reason why I couldn't get it to work, Alex!

WRT implementing autofocus, I'm reposting some ideas I had, in case anybody 
is interested:

I think very good performance is a primary requirement - at 960x720 and 
15fps, luvcview already uses all my CPU. :-/ There are probably more 
efficient JPEG decoders than luvcview's, but still...

Here are some ideas on how to get the autofocus to require little CPU. I
mostly have MJPEG rather than YUV output in mind. (With YUV, you will need
to do a DCT yourself to be able to use the same algorithms.)
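
For the YUV case, here is a minimal sketch of the transform you'd need: a
naive 2-D DCT-II of one 8x8 block of Y samples, i.e. the same transform
JPEG applies. A real implementation would use a fast factorisation; this
only shows the maths:

  #include <math.h>

  /* Naive 2-D DCT-II of one 8x8 block of luminance samples, as used by
   * JPEG. in[] holds 64 Y values, out[] receives the 64 frequency
   * coefficients; out[0] is the DC term (mean brightness). */
  static void dct_8x8(const unsigned char in[64], double out[64])
  {
      for (int v = 0; v < 8; v++) {
          for (int u = 0; u < 8; u++) {
              double sum = 0.0;
              for (int y = 0; y < 8; y++)
                  for (int x = 0; x < 8; x++)
                      sum += (in[8*y + x] - 128)        /* level shift */
                           * cos((2*x + 1) * u * M_PI / 16.0)
                           * cos((2*y + 1) * v * M_PI / 16.0);
              double cu = (u == 0) ? 1.0 / sqrt(2.0) : 1.0;
              double cv = (v == 0) ? 1.0 / sqrt(2.0) : 1.0;
              out[8*v + u] = 0.25 * cu * cv * sum;
          }
      }
  }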

* The biggest speedup would come from concentrating only on the centre
area of the image, and not even decoding the rest.

* You only need to decode the Y (luminance) component; the colour
information is lower-resolution anyway.

* You do not need to completely decode the JPEG pictures! I'm certain
there is a very good correlation between sharp details in the picture and
the presence of high-frequency coefficients in the JPEG data. Basically,
the frequency analysis of the picture has already been done for you, for
free, by the MJPEG compression!

* Most of the time while the camera is in use, there will be no need to
adjust the focus. The adjustment algorithm should only kick in when there
are significant changes (movement) in the centre of the image. A cheap
check for larger-scale changes is possible by looking at the JPEG DC
coefficients, which specify the mean brightness of each 8x8 square of the
picture (see the sketch below). As long as there is no movement, this
check need not even be done for every frame, but maybe only every 0.5 sec
or so.
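
To make the last point concrete, here is a rough sketch of such a check.
How you get at the per-block DC values depends entirely on your JPEG
parser; the array arguments and the threshold are made up for
illustration:

  #include <stdlib.h>

  /* Cheap movement check on JPEG DC coefficients. dc_cur and dc_prev
   * hold the per-block DC values of the region of interest for the
   * current and previous frame; n is the number of 8x8 blocks. Returns
   * nonzero if the mean absolute DC change per block exceeds the
   * threshold, i.e. if something moved. */
  static int scene_changed(const int *dc_cur, const int *dc_prev,
                           int n, int threshold)
  {
      long diff = 0;
      if (n <= 0)
          return 0;
      for (int i = 0; i < n; i++)
          diff += labs((long)dc_cur[i] - (long)dc_prev[i]);
      return diff / n > threshold;
  }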

Some ideas on heuristics for making the refocusing itself fast:

The JPEG coefficients may also be suitable for measuring how far out of
focus the image is. (I can describe this in more detail on request, as
it's a bit hard to explain.) For each frame, you can create an array which
maps the size of visible details (1..8 pixels) to the number of
occurrences of a detail of that size in the region of interest (the
centre of the image).
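
Here is my guess at how such an array could be filled from the
coefficients of one 8x8 block. The mapping from coefficient index to
detail size (8 divided by the dominant frequency) is only a rough
approximation, and the threshold is arbitrary:

  #include <math.h>

  /* Count "details" of each size (1..8 pixels) in one 8x8 block.
   * coeffs[] holds the 64 dequantised DCT coefficients in row-major
   * order; hist[1..8] accumulates the number of significant
   * coefficients whose dominant frequency corresponds to details of
   * roughly that size. */
  static void accumulate_details(const double coeffs[64], double thresh,
                                 unsigned hist[9])
  {
      for (int v = 0; v < 8; v++) {
          for (int u = 0; u < 8; u++) {
              if (u == 0 && v == 0)
                  continue;               /* DC carries no detail */
              if (fabs(coeffs[8*v + u]) > thresh) {
                  int f = u > v ? u : v;  /* dominant spatial frequency */
                  int size = (int)(8.0 / f + 0.5);  /* approx. size in px */
                  hist[size]++;
              }
          }
      }
  }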

When the image begins to go out of focus, you will notice that the
coefficients shift toward low frequencies - details with a size of 1, 2 or
more pixels will disappear from the image altogether. The amount by which
to adjust the focus should be deducible from just how large the smallest
remaining detail is.
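
Given such a histogram, the size of the smallest remaining detail is easy
to read off. How to translate that size into an actual focus step is left
open here:

  /* Return the size (in pixels) of the smallest detail still present.
   * 1 means the image is sharp; larger values suggest a larger focus
   * adjustment is needed. */
  static int smallest_detail(const unsigned hist[9])
  {
      for (int size = 1; size <= 8; size++)
          if (hist[size] > 0)
              return size;
      return 8;   /* nothing above the threshold: badly defocused */
  }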

But there is a problem: You now know that the image is out of focus and by 
what amount, but you don't know whether the subject has moved nearer to the 
webcam or further away. I think there is no way to find this out in 100% of 
the cases, but the following heuristic might help to get the initial 
direction of searching (towards macro vs. infinity) right:

Look at the overall amount of change in the whole region of interest. If
some object has moved near the camera, the differences between successive
frames will get larger and larger - so move the focus towards macro. If
the amount of change is diminishing from one frame to the next, something
is moving away - move the focus towards infinity. In the middle ground
(few changes, or the same amount of change) there's no way of telling
which is better, I think.
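
As a sketch, the heuristic might look like this - change_prev and
change_cur would be the amount of inter-frame change (e.g. the summed DC
difference from the movement check above) for the last two frame pairs,
and the margin is a made-up fudge factor:

  enum focus_dir { DIR_UNKNOWN, DIR_MACRO, DIR_INFINITY };

  /* Guess the initial search direction from how the amount of
   * inter-frame change develops over time. */
  static enum focus_dir guess_direction(long change_prev, long change_cur,
                                        long margin)
  {
      if (change_cur > change_prev + margin)
          return DIR_MACRO;      /* changes growing: subject approaching */
      if (change_cur < change_prev - margin)
          return DIR_INFINITY;   /* changes shrinking: subject receding */
      return DIR_UNKNOWN;        /* middle ground: no way to tell */
  }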

Once you have started to adjust the focus, you will quickly find out
whether it was the right direction by watching the number of small
details, as described above.
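
Putting it together, the search itself could be a simple hill climb like
the following sketch. set_focus() and measure_sharpness() are stand-ins
for the actual driver call and for a sharpness metric such as the count of
1-2 pixel details from the histogram above:

  extern void set_focus(int position);      /* driver interface */
  extern unsigned measure_sharpness(void);  /* e.g. small-detail count */

  /* Step the lens from the current position; if the first step makes
   * things worse, reverse once, and stop as soon as sharpness no
   * longer improves. */
  static void refocus(int pos, int step)
  {
      unsigned best = measure_sharpness();
      int reversed = 0;

      for (;;) {
          set_focus(pos + step);
          unsigned s = measure_sharpness();
          if (s > best) {
              pos += step;        /* keep going in this direction */
              best = s;
          } else if (!reversed) {
              step = -step;       /* initial guess was wrong */
              reversed = 1;
          } else {
              set_focus(pos);     /* past the peak: back to the best spot */
              break;
          }
      }
  }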

Well, those were just some thoughts, I don't know whether it would all
work. :)

All the best,

  Richard

-- 
  __   _
  |_) /|  Richard Atterer     |  GnuPG key: 888354F7
  | \/¯|  http://atterer.net  |  08A9 7B7D 3D13 3EF2 3D25  D157 79E6 F6DC 8883 54F7
  ¯ '` ¯
