Hello, I had an interesting private mail exchange on autofocus with Alex - thanks for helping me to find the reason why I couldn't get it to work, Alex!
WRT implementing autofocus, I'm reposting some ideas I had, in case anybody is interested:

I think very good performance is a primary requirement - at 960x720 and 15fps, luvcview already uses all my CPU. :-/ There are probably more efficient JPEG decoders than luvcview's, but still... Here are some ideas on how to make the autofocus require little CPU. I mostly have MJPEG rather than YUV output in mind. (With YUV, you would need to do a DCT yourself to be able to use the same algorithms.)

* The biggest speedup would be achieved by concentrating only on the centre area of the image, and not even decoding the rest.

* You only need to decode the Y part (luminance); the colour info is lower-resolution anyway. (The first sketch below shows how these two points could look with libjpeg.)

* You do not need to completely decode the JPEG pictures! I'm certain there is a very good correlation between sharp details in the picture and the presence of high-frequency coefficients in the JPEG data. Basically, frequency analysis of the picture has already been done for you, for free, by the MJPEG compression! (The second sketch below turns this into a sharpness score.)

* Most of the time while the camera is in use, there will be no need to adjust the focus. The adjustment algorithm should only kick in when there are significant changes (movement) in the centre of the image. A cheap check for larger-scale changes is possible by looking at the JPEG DC coefficients, which give the mean brightness of each 8x8 square of the picture. As long as there is no movement, this check need not even be done for every frame, but maybe only every .5 sec or so. (See the third sketch below.)

Some ideas on heuristics for making the refocusing itself fast:

The JPEG coefficients may also be suitable for measuring how far out of focus the image is. I can describe this in detail on request, as it's a bit hard to explain, but for each frame, you can create an array which maps from the size of visible details (1..8 pixels) to the number of occurrences of details of that size in the region of interest (the centre of the image). When the image begins to go out of focus, you will notice that all coefficients shift toward low frequencies - details with a size of 1, 2 or more pixels will disappear from the image altogether. The amount by which to adjust the focus should be deducible from just how large the smallest remaining detail is. (The fourth sketch below builds such a histogram.)

But there is a problem: You now know that the image is out of focus and by what amount, but you don't know whether the subject has moved nearer to the webcam or further away. I think there is no way to find this out in 100% of the cases, but the following heuristic might help to get the initial direction of searching (towards macro vs. infinity) right: Look at the overall amount of change in the whole region of interest. If some object has moved near the camera, the differences between successive frames will get larger and larger - so move the focus towards macro. If the amount of change is diminishing from one frame to the next, something is moving away - move the focus towards infinity. In the middle ground (few changes, or a constant amount of change), there's no way of telling which is better, I think. Once you have started to adjust the focus, you will quickly find out whether the initial direction was right by watching the amount of small details, as described above. (The last sketch below puts these heuristics together.)
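First sketch: decoding only the Y channel of the centre region. This is rough and untested; it assumes libjpeg with jpeg_mem_src() (present in libjpeg v8 and libjpeg-turbo), and the ROI size is an arbitrary choice. Error handling is omitted.

/* Decode only the luminance channel of an MJPEG frame, keeping just
 * the centre region.  With JCS_GRAYSCALE output from a YCbCr source,
 * libjpeg skips chroma upsampling and colour conversion entirely. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <jpeglib.h>

#define ROI 256  /* side length of the centre region of interest */

unsigned char *decode_centre_luma(const unsigned char *jpg,
                                  unsigned long len,
                                  int *roi_w, int *roi_h)
{
    struct jpeg_decompress_struct cinfo;
    struct jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);
    jpeg_mem_src(&cinfo, (unsigned char *)jpg, len);
    jpeg_read_header(&cinfo, TRUE);

    cinfo.out_color_space = JCS_GRAYSCALE;  /* Y only */
    jpeg_start_decompress(&cinfo);

    int w = cinfo.output_width, h = cinfo.output_height;
    int rw = w < ROI ? w : ROI, rh = h < ROI ? h : ROI;
    int x0 = (w - rw) / 2, y0 = (h - rh) / 2;
    unsigned char *roi = malloc((size_t)rw * rh);
    unsigned char *row = malloc(w);

    /* libjpeg decodes whole scanlines, so every row is read, but
     * rows outside the ROI are thrown away immediately. */
    while (cinfo.output_scanline < cinfo.output_height) {
        int y = cinfo.output_scanline;
        JSAMPROW rp = row;
        jpeg_read_scanlines(&cinfo, &rp, 1);
        if (y >= y0 && y < y0 + rh)
            memcpy(roi + (size_t)(y - y0) * rw, row + x0, rw);
    }
    jpeg_finish_decompress(&cinfo);
    jpeg_destroy_decompress(&cinfo);
    free(row);
    *roi_w = rw;
    *roi_h = rh;
    return roi;
}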
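Second sketch: a sharpness score straight from the DCT coefficients, with no IDCT and no pixel output. jpeg_read_coefficients() - the same entry point jpegtran is built on - hands you the entropy-decoded, still-quantized coefficient blocks. Which frequencies count as "high" and how large the centre region is are arbitrary choices here; since the values are still quantized, the score is only meaningful relative to other frames from the same camera with the same quantization tables.

#include <stdio.h>
#include <jpeglib.h>

long sharpness_score(struct jpeg_decompress_struct *cinfo,
                     jvirt_barray_ptr *coef_arrays)
{
    jpeg_component_info *comp = &cinfo->comp_info[0];  /* Y component */
    JDIMENSION bw = comp->width_in_blocks, bh = comp->height_in_blocks;
    JDIMENSION x0 = bw / 4, x1 = bw - bw / 4;  /* centre half of the */
    JDIMENSION y0 = bh / 4, y1 = bh - bh / 4;  /* block grid         */
    long score = 0;

    for (JDIMENSION by = y0; by < y1; by++) {
        JBLOCKARRAY rows = (*cinfo->mem->access_virt_barray)
            ((j_common_ptr)cinfo, coef_arrays[0], by, 1, FALSE);
        for (JDIMENSION bx = x0; bx < x1; bx++) {
            JCOEFPTR blk = rows[0][bx];
            /* Blocks are stored in natural (row-major) order:
             * blk[v*8+u], u = horizontal frequency, v = vertical.
             * Count magnitudes of the high-frequency quadrants. */
            for (int v = 0; v < 8; v++)
                for (int u = 0; u < 8; u++)
                    if (u >= 4 || v >= 4)
                        score += blk[v * 8 + u] < 0 ? -blk[v * 8 + u]
                                                    :  blk[v * 8 + u];
        }
    }
    return score;
}

Usage: after jpeg_read_header(), call jvirt_barray_ptr *coefs = jpeg_read_coefficients(&cinfo); then sharpness_score(&cinfo, coefs), and jpeg_finish_decompress() once you're done with the arrays.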
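Third sketch: the cheap movement check from DC coefficients only. Each DC term is proportional to the mean brightness of one 8x8 square, so summing |DC_now - DC_prev| over the centre blocks measures large-scale change. The static buffer, its size, and the idea of leaving thresholding to the caller are all just illustration.

#include <stdlib.h>
#include <stdio.h>
#include <jpeglib.h>

#define MAX_BLOCKS 4096  /* generous upper bound for the ROI */

long dc_change(struct jpeg_decompress_struct *cinfo,
               jvirt_barray_ptr *coef_arrays)
{
    static JCOEF prev[MAX_BLOCKS];  /* DC terms of the previous frame */
    jpeg_component_info *comp = &cinfo->comp_info[0];
    JDIMENSION bw = comp->width_in_blocks, bh = comp->height_in_blocks;
    JDIMENSION x0 = bw / 4, x1 = bw - bw / 4;
    JDIMENSION y0 = bh / 4, y1 = bh - bh / 4;
    long diff = 0;
    int i = 0;

    for (JDIMENSION by = y0; by < y1; by++) {
        JBLOCKARRAY rows = (*cinfo->mem->access_virt_barray)
            ((j_common_ptr)cinfo, coef_arrays[0], by, 1, FALSE);
        for (JDIMENSION bx = x0; bx < x1 && i < MAX_BLOCKS; bx++, i++) {
            JCOEF dc = rows[0][bx][0];  /* coefficient (0,0) */
            diff += labs((long)dc - prev[i]);
            prev[i] = dc;
        }
    }
    return diff;  /* caller compares against a tuned threshold */
}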
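Fourth sketch: the detail-size histogram, for one 8x8 block of Y coefficients (natural order, obtained as in the second sketch). The mapping assumed here: a coefficient at frequency (u,v) corresponds to structure roughly 8/max(u,v) pixels across.

/* Accumulate detail sizes for one block.  hist[s] counts details of
 * size s pixels, s = 1..8; sum it over all ROI blocks, then the
 * smallest s with a count above some noise threshold is "the smallest
 * remaining detail". */
void accumulate_detail_histogram(const short blk[64], long hist[9])
{
    for (int v = 0; v < 8; v++) {
        for (int u = 0; u < 8; u++) {
            int f = u > v ? u : v;  /* dominant frequency */
            if (f == 0)
                continue;           /* DC: no detail information */
            if (blk[v * 8 + u] != 0)
                hist[8 / f]++;      /* f=1 -> 8px ... f=7 -> 1px */
        }
    }
}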
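Last sketch: putting the heuristics together. set_focus_relative() is a hypothetical stand-in for whatever focus control the driver ends up exposing, sharpness comes from a measure like sharpness_score() above, and the whole decision logic is just the heuristic from the text, untested.

extern void set_focus_relative(int steps);  /* >0: macro, <0: infinity */

void refocus_step(long change_prev, long change_now,
                  long sharp_prev, long sharp_now,
                  int *direction, int *searching)
{
    if (!*searching) {
        /* Initial guess: growing inter-frame change suggests the
         * subject is approaching (go towards macro); shrinking change
         * suggests it is receding (go towards infinity). */
        *direction = change_now > change_prev ? +1 : -1;
        *searching = 1;
    } else if (sharp_now < sharp_prev) {
        /* Sharpness got worse: the guess was wrong, or we overshot,
         * so reverse direction. */
        *direction = -*direction;
    }
    set_focus_relative(*direction);
    /* A real loop would also stop once sharpness peaks, and scale the
     * step size by how large the smallest remaining detail is. */
}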
Well, those were just some thoughts, I don't know whether it would all work. :)

All the best,

  Richard

--
  __   _
  |_) /|  Richard Atterer     |  GnuPG key: 888354F7
  | \/¯|  http://atterer.net  |  08A9 7B7D 3D13 3EF2 3D25  D157 79E6 F6DC 8883 54F7
  ¯ '` ¯
