Hi Marco,
On Friday 20 June 2008, Marco Argiolas wrote:
> Hi Richard,
>
> > You shouldn't have allocated extra memory in the first place. Instead of
> > modifying the whole frame at once, you should instead replace the call to
> > memcpy() in uvc_video_decode_data() and revert the image there.
> > Computation of boundaries and memory offsets might be a bit more
> > difficult, but you will
> > speed up the process and you'll avoid allocating an extra buffer.
>
> Thank you for this suggestion,
> I haven't thought about this solution before.
>
> > The purpose of GFP_ATOMIC is not to speed up memory allocation, but to
> > make sure that memory allocation will not sleep. This is required here, as
> > uvc_video_decode_data() is called in interrupt context where you're not
> > allowed to sleep.
>
> So, in this way I'm sure memory allocation won't sleep and I can say it is
> the fastest way (and also the only one suggested in this case) to allocate
> memory. Is it right?
A call to kmalloc with GFP_ATOMIC will probably faster than with GFP_KERNEL
(although I'm not sure there are not some corner cases where the second could
be faster than the first). This is however not the reason why you can't use
GFP_KERNEL here: code running in interrupt context is not allowed to sleep,
so must use GFP_ATOMIC. This is not a speed issue.
> > As explained above, you should modify the existing memcpy() instead of
> > adding a new one. The statement your mention will then not have to be
> > modified.
>
> Here it is the new patch of "uvc_video_decode_data()" (I was going crazy
> with the computation of all offset) and as attachement you can find the
> modded "uvc_video.c" file:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> $ diff -u original modded
> --- original 2008-06-20 19:15:28.000000000 +0200
> +++ modded 2008-06-20 19:11:31.000000000 +0200
> @@ -4,6 +4,11 @@
> struct uvc_video_queue *queue = &video->queue;
> unsigned int maxlen, nbytes;
> void *mem;
> + unsigned int
> + pixel_size,
> + to_be_copied,
> + shift_right,
> + buf_offset;
>
> if (len <= 0)
> return;
> @@ -12,11 +17,29 @@
> maxlen = buf->buf.length - buf->buf.bytesused;
> mem = queue->mem + buf->buf.m.offset + buf->buf.bytesused;
> nbytes = min((unsigned int)len, maxlen);
> - memcpy(mem, data, nbytes);
> - buf->buf.bytesused += nbytes;
>
> - if (len > maxlen) {
> + pixel_size = video->streaming->cur_frame->wWidth *
> video->streaming->format->bpp / 8;
pixel_size is misnamed as it stores the size of a whole length of video data.
> + to_be_copied = 0;
> + while( nbytes > 0 ) { // Are there other bytes to be copied?
Please respect the kernel coding style (Documentation/CodingStyle). Don't put
spaces inside parentheses, add a space between 'while' and the opening
parenthesis, don't put any statement on the same line as the 'else' keyword
and keep you code 80 characters wide.
> + shift_right = buf->buf.bytesused % pixel_size;
> + if ( nbytes > pixel_size - shift_right)
> + to_be_copied = pixel_size - shift_right ;
> + else to_be_copied = nbytes;
> +
> + if (shift_right > 0 )
> + buf_offset = to_be_copied - shift_right ;
> + else buf_offset = pixel_size;
> + memcpy( queue->mem + buf->buf.m.offset + pixel_size *
> video->streaming->cur_frame->wHeight - buf->buf.bytesused - buf_offset,
> + data,
> + to_be_copied ) ;
If I'm not mistaken this will only work when the data packets contain entire
lines of image data. This is not guaranteed. To reverse the image you
shouldn't use memcpy but instead roll your own memcpy version that will
decrement the destination pointer instead of incrementing it. It shouldn't be
more complex than that. Just fill the video buffer backwards starting at the
end instead of forward starting at the beginning.
> +
> + data += to_be_copied;
> + buf->buf.bytesused += to_be_copied ;
In the end all of nbytes will have been copied, so you can increment
buf->buf.bytesused in one go outside the loop.
> + nbytes -= to_be_copied;
> + }
>
> + if (len >= maxlen) {
Don't modify this (see my previous e-mail).
> uvc_trace(UVC_TRACE_FRAME, "Frame complete (overflow).\n");
> buf->state = UVC_BUF_STATE_DONE;
> }
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Unluckly also if now I don't allocate any memory, with bigger resolutions
> the final video seems to be a little bit slower than with the allocation
> (previous solution).
Can you benchmark that ? I haven't
> Please if you can/need try both my solutions and let me know your
> results!!! Which one is the fastest?
> Please let also me know if you think I made some mistakes that slow down
> this last version of uvcvideo without memory allocation.
>
> Thank you a lot for your help!!
>
> Marco Argiolas
_______________________________________________
Linux-uvc-devel mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/linux-uvc-devel