Re: [PATCH v5 0/9] Asynchronous UVC
Hi Pavel, On Tuesday, 27 November 2018 22:17:30 EET Pavel Machek wrote: > On Tue 2018-11-06 21:27:11, Kieran Bingham wrote: > > From: Kieran Bingham > > > > The Linux UVC driver has long provided adequate performance capabilities > > for web-cams and low data rate video devices in Linux while resolutions > > were low. > > > > Modern USB cameras are now capable of high data rates thanks to USB3 with > > 1080p, and even 4k capture resolutions supported. > > > > Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO > > (isochronous transfers) can generate more data than an embedded ARM core > > is able to process on a single core, resulting in frame loss. > > > > A large part of this performance impact is from the requirement to > > ‘memcpy’ frames out from URB packets to destination frames. This > > unfortunate requirement is due to the UVC protocol allowing a variable > > length header, and thus it is not possible to provide the target frame > > buffers directly. > > > > Extra throughput is possible by moving the actual memcpy actions to a work > > queue, and moving the memcpy out of interrupt context thus allowing work > > tasks to be scheduled across multiple cores. > > Hmm. Doing memcpy() on many cores is improvement but... not really. > Would it be possible to improve kernel<->user interface, so it says > "data is in this buffer, and it starts here" and so that memcpy in > kernel is not neccessary? Unfortunately not, as the UVC protocol segments the frame in a large number of small packets, each prefixed with a variable-length header. It's a poorly designed protocol from that point of view. -- Regards, Laurent Pinchart
Re: [PATCH v5 0/9] Asynchronous UVC
On Tue 2018-11-06 21:27:11, Kieran Bingham wrote: > From: Kieran Bingham > > The Linux UVC driver has long provided adequate performance capabilities for > web-cams and low data rate video devices in Linux while resolutions were low. > > Modern USB cameras are now capable of high data rates thanks to USB3 with > 1080p, and even 4k capture resolutions supported. > > Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO > (isochronous transfers) can generate more data than an embedded ARM core is > able to process on a single core, resulting in frame loss. > > A large part of this performance impact is from the requirement to > ‘memcpy’ frames out from URB packets to destination frames. This unfortunate > requirement is due to the UVC protocol allowing a variable length header, and > thus it is not possible to provide the target frame buffers directly. > > Extra throughput is possible by moving the actual memcpy actions to a work > queue, and moving the memcpy out of interrupt context thus allowing work tasks > to be scheduled across multiple cores. Hmm. Doing memcpy() on many cores is improvement but... not really. Would it be possible to improve kernel<->user interface, so it says "data is in this buffer, and it starts here" and so that memcpy in kernel is not neccessary? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
Re: [PATCH v5 0/9] Asynchronous UVC
Hi Laurent, On 08/11/2018 17:01, Laurent Pinchart wrote: > Hi Kieran, > > On Wednesday, 7 November 2018 01:31:29 EET Laurent Pinchart wrote: >> On Tuesday, 6 November 2018 23:27:11 EET Kieran Bingham wrote: >>> From: Kieran Bingham >>> >>> The Linux UVC driver has long provided adequate performance capabilities >>> for web-cams and low data rate video devices in Linux while resolutions >>> were low. >>> >>> Modern USB cameras are now capable of high data rates thanks to USB3 with >>> 1080p, and even 4k capture resolutions supported. >>> >>> Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO >>> (isochronous transfers) can generate more data than an embedded ARM core >>> is able to process on a single core, resulting in frame loss. >>> >>> A large part of this performance impact is from the requirement to >>> ‘memcpy’ frames out from URB packets to destination frames. This >>> unfortunate requirement is due to the UVC protocol allowing a variable >>> length header, and thus it is not possible to provide the target frame >>> buffers directly. >>> >>> Extra throughput is possible by moving the actual memcpy actions to a work >>> queue, and moving the memcpy out of interrupt context thus allowing work >>> tasks to be scheduled across multiple cores. >>> >>> This series has been tested on both the ZED and BRIO cameras on arm64 >>> platforms, and with thanks to Randy Dunlap, a Dynex 1.3MP Webcam, a Sonix >>> USB2 Camera, and a built in Toshiba Laptop camera, and with thanks to >>> Philipp Zabel for testing on a Lite-On internal Laptop Webcam, Logitech >>> C910 (USB2 isoc), Oculus Sensor (USB3 isoc), and Microsoft HoloLens >>> Sensors (USB3 bulk). >>> >>> As far as I am aware iSight devices, and devices which use UVC to encode >>> data (output device) have not yet been tested - but should find no ill >>> effect (at least not until they are tested of course :D ) >> >> :-D >> >> I'm not sure whether anyone is still using those devices with Linux. I >> wouldn't be surprised if we realized down the road that they already don't >> work. >> >>> Tested-by: Randy Dunlap >>> Tested-by: Philipp Zabel >>> >>> v2: >>> - Fix race reported by Guennadi >>> >>> v3: >>> - Fix similar race reported by Laurent >>> - Only queue work if required (encode/isight do not queue work) >>> - Refactor/Rename variables for clarity >>> >>> v4: >>> - (Yet another) Rework of the uninitialise path. >>> >>>This time to hopefully clean up the shutdown races for good. >>>use usb_poison_urb() to halt all URBs, then flush the work queue >>>before freeing. >>> >>> - Rebase to latest linux-media/master >>> >>> v5: >>> - Provide lockdep validation >>> - rename uvc_queue_requeue -> uvc_queue_buffer_requeue() >>> - Fix comments and periods throughout >>> - Rebase to media/v4.20-2 >>> - Use GFP_KERNEL allocation in uvc_video_copy_data_work() >>> - Fix function documentation for uvc_video_copy_data_work() >>> - Add periods to the end of sentences >>> - Rename 'decode' variable to 'op' in uvc_video_decode_data() >>> - Move uvc_urb->async_operations initialisation to before use >>> - Move async workqueue to match uvc_streaming lifetime instead of >>> >>>streamon/streamoff >>> >>> - bracket the for_each_uvc_urb() macro >>> >>> - New patches added to series: >>> media: uvcvideo: Split uvc_video_enable into two >>> media: uvcvideo: Rename uvc_{un,}init_video() >>> media: uvcvideo: Utilise for_each_uvc_urb iterator >>> >>> Kieran Bingham (9): >>> media: uvcvideo: Refactor URB descriptors >>> media: uvcvideo: Convert decode functions to use new context structure >>> media: uvcvideo: Protect queue internals with helper >>> media: uvcvideo: queue: Simplify spin-lock usage >>> media: uvcvideo: queue: Support asynchronous buffer handling >>> media: uvcvideo: Move decode processing to process context >>> media: uvcvideo: Split uvc_video_enable into two >> >> I've taken the above patches in my tree. > > I'm afraid I'll have to drop them :-( If I disconnect the camera while in use, > I get the following oops: Argh - that's what we get for that 'last minute' change of the lifetime of the workqueue. Ok - I'll see what we can do instead. -- Kieran > > [ 237.514625] usb 2-1.4: USB disconnect, device number 5 > [ 237.516123] uvcvideo: Failed to resubmit video URB (-19). > [ 237.549470] BUG: unable to handle kernel paging request at 4bec0091 > [ 237.549476] PGD 0 P4D 0 > [ 237.549481] Oops: [#1] PREEMPT SMP PTI > [ 237.549485] CPU: 3 PID: 5332 Comm: luvcview Tainted: GW O > 4.18.16-gentoo #1 > [ 237.549487] Hardware name: Dell Inc. Latitude E6420/0K0DNP, BIOS A08 > 10/18/2011 > [ 237.549493] RIP: 0010:flush_workqueue_prep_pwqs+0x49/0x120 > [ 237.549494] Code: 47 48 85 c0 0f 85 e1 00 00 00 c7 45 48 01 00 00 00 48 8b > 45 00 4c 8d 78 90 48 39 c5 0f 84 d0 00 00 00 c6 44 24 07 00 4d 63 f4 <4d> 8b > 2f 4c 89 ef e8 5c b3 89 00 45 85 e4 78
Re: [PATCH v5 0/9] Asynchronous UVC
Hi Kieran, On Wednesday, 7 November 2018 01:31:29 EET Laurent Pinchart wrote: > On Tuesday, 6 November 2018 23:27:11 EET Kieran Bingham wrote: > > From: Kieran Bingham > > > > The Linux UVC driver has long provided adequate performance capabilities > > for web-cams and low data rate video devices in Linux while resolutions > > were low. > > > > Modern USB cameras are now capable of high data rates thanks to USB3 with > > 1080p, and even 4k capture resolutions supported. > > > > Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO > > (isochronous transfers) can generate more data than an embedded ARM core > > is able to process on a single core, resulting in frame loss. > > > > A large part of this performance impact is from the requirement to > > ‘memcpy’ frames out from URB packets to destination frames. This > > unfortunate requirement is due to the UVC protocol allowing a variable > > length header, and thus it is not possible to provide the target frame > > buffers directly. > > > > Extra throughput is possible by moving the actual memcpy actions to a work > > queue, and moving the memcpy out of interrupt context thus allowing work > > tasks to be scheduled across multiple cores. > > > > This series has been tested on both the ZED and BRIO cameras on arm64 > > platforms, and with thanks to Randy Dunlap, a Dynex 1.3MP Webcam, a Sonix > > USB2 Camera, and a built in Toshiba Laptop camera, and with thanks to > > Philipp Zabel for testing on a Lite-On internal Laptop Webcam, Logitech > > C910 (USB2 isoc), Oculus Sensor (USB3 isoc), and Microsoft HoloLens > > Sensors (USB3 bulk). > > > > As far as I am aware iSight devices, and devices which use UVC to encode > > data (output device) have not yet been tested - but should find no ill > > effect (at least not until they are tested of course :D ) > > :-D > > I'm not sure whether anyone is still using those devices with Linux. I > wouldn't be surprised if we realized down the road that they already don't > work. > > > Tested-by: Randy Dunlap > > Tested-by: Philipp Zabel > > > > v2: > > - Fix race reported by Guennadi > > > > v3: > > - Fix similar race reported by Laurent > > - Only queue work if required (encode/isight do not queue work) > > - Refactor/Rename variables for clarity > > > > v4: > > - (Yet another) Rework of the uninitialise path. > > > >This time to hopefully clean up the shutdown races for good. > >use usb_poison_urb() to halt all URBs, then flush the work queue > >before freeing. > > > > - Rebase to latest linux-media/master > > > > v5: > > - Provide lockdep validation > > - rename uvc_queue_requeue -> uvc_queue_buffer_requeue() > > - Fix comments and periods throughout > > - Rebase to media/v4.20-2 > > - Use GFP_KERNEL allocation in uvc_video_copy_data_work() > > - Fix function documentation for uvc_video_copy_data_work() > > - Add periods to the end of sentences > > - Rename 'decode' variable to 'op' in uvc_video_decode_data() > > - Move uvc_urb->async_operations initialisation to before use > > - Move async workqueue to match uvc_streaming lifetime instead of > > > >streamon/streamoff > > > > - bracket the for_each_uvc_urb() macro > > > > - New patches added to series: > > media: uvcvideo: Split uvc_video_enable into two > > media: uvcvideo: Rename uvc_{un,}init_video() > > media: uvcvideo: Utilise for_each_uvc_urb iterator > > > > Kieran Bingham (9): > > media: uvcvideo: Refactor URB descriptors > > media: uvcvideo: Convert decode functions to use new context structure > > media: uvcvideo: Protect queue internals with helper > > media: uvcvideo: queue: Simplify spin-lock usage > > media: uvcvideo: queue: Support asynchronous buffer handling > > media: uvcvideo: Move decode processing to process context > > media: uvcvideo: Split uvc_video_enable into two > > I've taken the above patches in my tree. I'm afraid I'll have to drop them :-( If I disconnect the camera while in use, I get the following oops: [ 237.514625] usb 2-1.4: USB disconnect, device number 5 [ 237.516123] uvcvideo: Failed to resubmit video URB (-19). [ 237.549470] BUG: unable to handle kernel paging request at 4bec0091 [ 237.549476] PGD 0 P4D 0 [ 237.549481] Oops: [#1] PREEMPT SMP PTI [ 237.549485] CPU: 3 PID: 5332 Comm: luvcview Tainted: GW O 4.18.16-gentoo #1 [ 237.549487] Hardware name: Dell Inc. Latitude E6420/0K0DNP, BIOS A08 10/18/2011 [ 237.549493] RIP: 0010:flush_workqueue_prep_pwqs+0x49/0x120 [ 237.549494] Code: 47 48 85 c0 0f 85 e1 00 00 00 c7 45 48 01 00 00 00 48 8b 45 00 4c 8d 78 90 48 39 c5 0f 84 d0 00 00 00 c6 44 24 07 00 4d 63 f4 <4d> 8b 2f 4c 89 ef e8 5c b3 89 00 45 85 e4 78 1d 41 83 7f 14 ff 75 [ 237.549525] RSP: 0018:c9000154fc50 EFLAGS: 00010296 [ 237.549527] RAX: 4bec0101 RBX: 0002 RCX: 0002 [ 237.549529] RDX: 0002 RSI: 0001 RDI:
Re: [PATCH v5 0/9] Asynchronous UVC
Hi Kieran, Thank you for the patches. On Tuesday, 6 November 2018 23:27:11 EET Kieran Bingham wrote: > From: Kieran Bingham > > The Linux UVC driver has long provided adequate performance capabilities for > web-cams and low data rate video devices in Linux while resolutions were > low. > > Modern USB cameras are now capable of high data rates thanks to USB3 with > 1080p, and even 4k capture resolutions supported. > > Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO > (isochronous transfers) can generate more data than an embedded ARM core is > able to process on a single core, resulting in frame loss. > > A large part of this performance impact is from the requirement to > ‘memcpy’ frames out from URB packets to destination frames. This unfortunate > requirement is due to the UVC protocol allowing a variable length header, > and thus it is not possible to provide the target frame buffers directly. > > Extra throughput is possible by moving the actual memcpy actions to a work > queue, and moving the memcpy out of interrupt context thus allowing work > tasks to be scheduled across multiple cores. > > This series has been tested on both the ZED and BRIO cameras on arm64 > platforms, and with thanks to Randy Dunlap, a Dynex 1.3MP Webcam, a Sonix > USB2 Camera, and a built in Toshiba Laptop camera, and with thanks to > Philipp Zabel for testing on a Lite-On internal Laptop Webcam, Logitech > C910 (USB2 isoc), Oculus Sensor (USB3 isoc), and Microsoft HoloLens Sensors > (USB3 bulk). > > As far as I am aware iSight devices, and devices which use UVC to encode > data (output device) have not yet been tested - but should find no ill > effect (at least not until they are tested of course :D ) :-D I'm not sure whether anyone is still using those devices with Linux. I wouldn't be surprised if we realized down the road that they already don't work. > Tested-by: Randy Dunlap > Tested-by: Philipp Zabel > > v2: > - Fix race reported by Guennadi > > v3: > - Fix similar race reported by Laurent > - Only queue work if required (encode/isight do not queue work) > - Refactor/Rename variables for clarity > > v4: > - (Yet another) Rework of the uninitialise path. >This time to hopefully clean up the shutdown races for good. >use usb_poison_urb() to halt all URBs, then flush the work queue >before freeing. > - Rebase to latest linux-media/master > > v5: > - Provide lockdep validation > - rename uvc_queue_requeue -> uvc_queue_buffer_requeue() > - Fix comments and periods throughout > - Rebase to media/v4.20-2 > - Use GFP_KERNEL allocation in uvc_video_copy_data_work() > - Fix function documentation for uvc_video_copy_data_work() > - Add periods to the end of sentences > - Rename 'decode' variable to 'op' in uvc_video_decode_data() > - Move uvc_urb->async_operations initialisation to before use > - Move async workqueue to match uvc_streaming lifetime instead of >streamon/streamoff > - bracket the for_each_uvc_urb() macro > > - New patches added to series: > media: uvcvideo: Split uvc_video_enable into two > media: uvcvideo: Rename uvc_{un,}init_video() > media: uvcvideo: Utilise for_each_uvc_urb iterator > > Kieran Bingham (9): > media: uvcvideo: Refactor URB descriptors > media: uvcvideo: Convert decode functions to use new context structure > media: uvcvideo: Protect queue internals with helper > media: uvcvideo: queue: Simplify spin-lock usage > media: uvcvideo: queue: Support asynchronous buffer handling > media: uvcvideo: Move decode processing to process context > media: uvcvideo: Split uvc_video_enable into two I've taken the above patches in my tree. > media: uvcvideo: Rename uvc_{un,}init_video() > media: uvcvideo: Utilise for_each_uvc_urb iterator And I've sent review comments for these two. > drivers/media/usb/uvc/uvc_driver.c | 2 +- > drivers/media/usb/uvc/uvc_isight.c | 6 +- > drivers/media/usb/uvc/uvc_queue.c | 110 +--- > drivers/media/usb/uvc/uvc_video.c | 282 +++--- > drivers/media/usb/uvc/uvcvideo.h | 65 ++- > 5 files changed, 331 insertions(+), 134 deletions(-) > > base-commit: dafb7f9aef2fd44991ff1691721ff765a23be27b -- Regards, Laurent Pinchart