Re: [PATCH v5 0/9] Asynchronous UVC

2018-11-27 Thread Laurent Pinchart
Hi Pavel,

On Tuesday, 27 November 2018 22:17:30 EET Pavel Machek wrote:
> On Tue 2018-11-06 21:27:11, Kieran Bingham wrote:
> > From: Kieran Bingham 
> > 
> > The Linux UVC driver has long provided adequate performance capabilities
> > for web-cams and low data rate video devices in Linux while resolutions
> > were low.
> > 
> > Modern USB cameras are now capable of high data rates thanks to USB3 with
> > 1080p, and even 4k capture resolutions supported.
> > 
> > Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO
> > (isochronous transfers) can generate more data than an embedded ARM core
> > is able to process on a single core, resulting in frame loss.
> > 
> > A large part of this performance impact is from the requirement to
> > ‘memcpy’ frames out from URB packets to destination frames. This
> > unfortunate requirement is due to the UVC protocol allowing a variable
> > length header, and thus it is not possible to provide the target frame
> > buffers directly.
> > 
> > Extra throughput is possible by moving the actual memcpy actions to a work
> > queue, and moving the memcpy out of interrupt context thus allowing work
> > tasks to be scheduled across multiple cores.
> 
> Hmm. Doing memcpy() on many cores is improvement but... not really.
> Would it be possible to improve kernel<->user interface, so it says
> "data is in this buffer, and it starts here" and so that memcpy in
> kernel is not neccessary?

Unfortunately not, as the UVC protocol segments the frame in a large number of 
small packets, each prefixed with a variable-length header. It's a poorly 
designed protocol from that point of view.

-- 
Regards,

Laurent Pinchart





Re: [PATCH v5 0/9] Asynchronous UVC

2018-11-27 Thread Pavel Machek
On Tue 2018-11-06 21:27:11, Kieran Bingham wrote:
> From: Kieran Bingham 
> 
> The Linux UVC driver has long provided adequate performance capabilities for
> web-cams and low data rate video devices in Linux while resolutions were low.
> 
> Modern USB cameras are now capable of high data rates thanks to USB3 with
> 1080p, and even 4k capture resolutions supported.
> 
> Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO
> (isochronous transfers) can generate more data than an embedded ARM core is
> able to process on a single core, resulting in frame loss.
> 
> A large part of this performance impact is from the requirement to
> ‘memcpy’ frames out from URB packets to destination frames. This unfortunate
> requirement is due to the UVC protocol allowing a variable length header, and
> thus it is not possible to provide the target frame buffers directly.
> 
> Extra throughput is possible by moving the actual memcpy actions to a work
> queue, and moving the memcpy out of interrupt context thus allowing work tasks
> to be scheduled across multiple cores.

Hmm. Doing memcpy() on many cores is improvement but... not really.
Would it be possible to improve kernel<->user interface, so it says
"data is in this buffer, and it starts here" and so that memcpy in
kernel is not neccessary?
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [PATCH v5 0/9] Asynchronous UVC

2018-11-09 Thread Kieran Bingham
Hi Laurent,

On 08/11/2018 17:01, Laurent Pinchart wrote:
> Hi Kieran,
> 
> On Wednesday, 7 November 2018 01:31:29 EET Laurent Pinchart wrote:
>> On Tuesday, 6 November 2018 23:27:11 EET Kieran Bingham wrote:
>>> From: Kieran Bingham 
>>>
>>> The Linux UVC driver has long provided adequate performance capabilities
>>> for web-cams and low data rate video devices in Linux while resolutions
>>> were low.
>>>
>>> Modern USB cameras are now capable of high data rates thanks to USB3 with
>>> 1080p, and even 4k capture resolutions supported.
>>>
>>> Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO
>>> (isochronous transfers) can generate more data than an embedded ARM core
>>> is able to process on a single core, resulting in frame loss.
>>>
>>> A large part of this performance impact is from the requirement to
>>> ‘memcpy’ frames out from URB packets to destination frames. This
>>> unfortunate requirement is due to the UVC protocol allowing a variable
>>> length header, and thus it is not possible to provide the target frame
>>> buffers directly.
>>>
>>> Extra throughput is possible by moving the actual memcpy actions to a work
>>> queue, and moving the memcpy out of interrupt context thus allowing work
>>> tasks to be scheduled across multiple cores.
>>>
>>> This series has been tested on both the ZED and BRIO cameras on arm64
>>> platforms, and with thanks to Randy Dunlap, a Dynex 1.3MP Webcam, a Sonix
>>> USB2 Camera, and a built in Toshiba Laptop camera, and with thanks to
>>> Philipp Zabel for testing on a Lite-On internal Laptop Webcam, Logitech
>>> C910 (USB2 isoc), Oculus Sensor (USB3 isoc), and Microsoft HoloLens
>>> Sensors (USB3 bulk).
>>>
>>> As far as I am aware iSight devices, and devices which use UVC to encode
>>> data (output device) have not yet been tested - but should find no ill
>>> effect (at least not until they are tested of course :D )
>>
>> :-D
>>
>> I'm not sure whether anyone is still using those devices with Linux. I
>> wouldn't be surprised if we realized down the road that they already don't
>> work.
>>
>>> Tested-by: Randy Dunlap 
>>> Tested-by: Philipp Zabel 
>>>
>>> v2:
>>>  - Fix race reported by Guennadi
>>>
>>> v3:
>>>  - Fix similar race reported by Laurent
>>>  - Only queue work if required (encode/isight do not queue work)
>>>  - Refactor/Rename variables for clarity
>>>
>>> v4:
>>>  - (Yet another) Rework of the uninitialise path.
>>>  
>>>This time to hopefully clean up the shutdown races for good.
>>>use usb_poison_urb() to halt all URBs, then flush the work queue
>>>before freeing.
>>>  
>>>  - Rebase to latest linux-media/master
>>>
>>> v5:
>>>  - Provide lockdep validation
>>>  - rename uvc_queue_requeue -> uvc_queue_buffer_requeue()
>>>  - Fix comments and periods throughout
>>>  - Rebase to media/v4.20-2
>>>  - Use GFP_KERNEL allocation in uvc_video_copy_data_work()
>>>  - Fix function documentation for uvc_video_copy_data_work()
>>>  - Add periods to the end of sentences
>>>  - Rename 'decode' variable to 'op' in uvc_video_decode_data()
>>>  - Move uvc_urb->async_operations initialisation to before use
>>>  - Move async workqueue to match uvc_streaming lifetime instead of
>>>  
>>>streamon/streamoff
>>>  
>>>  - bracket the for_each_uvc_urb() macro
>>>  
>>>  - New patches added to series:
>>> media: uvcvideo: Split uvc_video_enable into two
>>> media: uvcvideo: Rename uvc_{un,}init_video()
>>> media: uvcvideo: Utilise for_each_uvc_urb iterator
>>>
>>> Kieran Bingham (9):
>>>   media: uvcvideo: Refactor URB descriptors
>>>   media: uvcvideo: Convert decode functions to use new context structure
>>>   media: uvcvideo: Protect queue internals with helper
>>>   media: uvcvideo: queue: Simplify spin-lock usage
>>>   media: uvcvideo: queue: Support asynchronous buffer handling
>>>   media: uvcvideo: Move decode processing to process context
>>>   media: uvcvideo: Split uvc_video_enable into two
>>
>> I've taken the above patches in my tree.
> 
> I'm afraid I'll have to drop them :-( If I disconnect the camera while in use,
> I get the following oops:

Argh - that's what we get for that 'last minute' change of the lifetime
of the workqueue.

Ok - I'll see what we can do instead.
--
Kieran



> 
> [  237.514625] usb 2-1.4: USB disconnect, device number 5
> [  237.516123] uvcvideo: Failed to resubmit video URB (-19).
> [  237.549470] BUG: unable to handle kernel paging request at 4bec0091
> [  237.549476] PGD 0 P4D 0 
> [  237.549481] Oops:  [#1] PREEMPT SMP PTI
> [  237.549485] CPU: 3 PID: 5332 Comm: luvcview Tainted: GW  O  
> 4.18.16-gentoo #1
> [  237.549487] Hardware name: Dell Inc. Latitude E6420/0K0DNP, BIOS A08 
> 10/18/2011
> [  237.549493] RIP: 0010:flush_workqueue_prep_pwqs+0x49/0x120
> [  237.549494] Code: 47 48 85 c0 0f 85 e1 00 00 00 c7 45 48 01 00 00 00 48 8b 
> 45 00 4c 8d 78 90 48 39 c5 0f 84 d0 00 00 00 c6 44 24 07 00 4d 63 f4 <4d> 8b 
> 2f 4c 89 ef e8 5c b3 89 00 45 85 e4 78 

Re: [PATCH v5 0/9] Asynchronous UVC

2018-11-08 Thread Laurent Pinchart
Hi Kieran,

On Wednesday, 7 November 2018 01:31:29 EET Laurent Pinchart wrote:
> On Tuesday, 6 November 2018 23:27:11 EET Kieran Bingham wrote:
> > From: Kieran Bingham 
> > 
> > The Linux UVC driver has long provided adequate performance capabilities
> > for web-cams and low data rate video devices in Linux while resolutions
> > were low.
> > 
> > Modern USB cameras are now capable of high data rates thanks to USB3 with
> > 1080p, and even 4k capture resolutions supported.
> > 
> > Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO
> > (isochronous transfers) can generate more data than an embedded ARM core
> > is able to process on a single core, resulting in frame loss.
> > 
> > A large part of this performance impact is from the requirement to
> > ‘memcpy’ frames out from URB packets to destination frames. This
> > unfortunate requirement is due to the UVC protocol allowing a variable
> > length header, and thus it is not possible to provide the target frame
> > buffers directly.
> > 
> > Extra throughput is possible by moving the actual memcpy actions to a work
> > queue, and moving the memcpy out of interrupt context thus allowing work
> > tasks to be scheduled across multiple cores.
> > 
> > This series has been tested on both the ZED and BRIO cameras on arm64
> > platforms, and with thanks to Randy Dunlap, a Dynex 1.3MP Webcam, a Sonix
> > USB2 Camera, and a built in Toshiba Laptop camera, and with thanks to
> > Philipp Zabel for testing on a Lite-On internal Laptop Webcam, Logitech
> > C910 (USB2 isoc), Oculus Sensor (USB3 isoc), and Microsoft HoloLens
> > Sensors (USB3 bulk).
> > 
> > As far as I am aware iSight devices, and devices which use UVC to encode
> > data (output device) have not yet been tested - but should find no ill
> > effect (at least not until they are tested of course :D )
>
> :-D
> 
> I'm not sure whether anyone is still using those devices with Linux. I
> wouldn't be surprised if we realized down the road that they already don't
> work.
> 
> > Tested-by: Randy Dunlap 
> > Tested-by: Philipp Zabel 
> > 
> > v2:
> >  - Fix race reported by Guennadi
> > 
> > v3:
> >  - Fix similar race reported by Laurent
> >  - Only queue work if required (encode/isight do not queue work)
> >  - Refactor/Rename variables for clarity
> > 
> > v4:
> >  - (Yet another) Rework of the uninitialise path.
> >  
> >This time to hopefully clean up the shutdown races for good.
> >use usb_poison_urb() to halt all URBs, then flush the work queue
> >before freeing.
> >  
> >  - Rebase to latest linux-media/master
> > 
> > v5:
> >  - Provide lockdep validation
> >  - rename uvc_queue_requeue -> uvc_queue_buffer_requeue()
> >  - Fix comments and periods throughout
> >  - Rebase to media/v4.20-2
> >  - Use GFP_KERNEL allocation in uvc_video_copy_data_work()
> >  - Fix function documentation for uvc_video_copy_data_work()
> >  - Add periods to the end of sentences
> >  - Rename 'decode' variable to 'op' in uvc_video_decode_data()
> >  - Move uvc_urb->async_operations initialisation to before use
> >  - Move async workqueue to match uvc_streaming lifetime instead of
> >  
> >streamon/streamoff
> >  
> >  - bracket the for_each_uvc_urb() macro
> >  
> >  - New patches added to series:
> > media: uvcvideo: Split uvc_video_enable into two
> > media: uvcvideo: Rename uvc_{un,}init_video()
> > media: uvcvideo: Utilise for_each_uvc_urb iterator
> > 
> > Kieran Bingham (9):
> >   media: uvcvideo: Refactor URB descriptors
> >   media: uvcvideo: Convert decode functions to use new context structure
> >   media: uvcvideo: Protect queue internals with helper
> >   media: uvcvideo: queue: Simplify spin-lock usage
> >   media: uvcvideo: queue: Support asynchronous buffer handling
> >   media: uvcvideo: Move decode processing to process context
> >   media: uvcvideo: Split uvc_video_enable into two
> 
> I've taken the above patches in my tree.

I'm afraid I'll have to drop them :-( If I disconnect the camera while in use,
I get the following oops:

[  237.514625] usb 2-1.4: USB disconnect, device number 5
[  237.516123] uvcvideo: Failed to resubmit video URB (-19).
[  237.549470] BUG: unable to handle kernel paging request at 4bec0091
[  237.549476] PGD 0 P4D 0 
[  237.549481] Oops:  [#1] PREEMPT SMP PTI
[  237.549485] CPU: 3 PID: 5332 Comm: luvcview Tainted: GW  O  
4.18.16-gentoo #1
[  237.549487] Hardware name: Dell Inc. Latitude E6420/0K0DNP, BIOS A08 
10/18/2011
[  237.549493] RIP: 0010:flush_workqueue_prep_pwqs+0x49/0x120
[  237.549494] Code: 47 48 85 c0 0f 85 e1 00 00 00 c7 45 48 01 00 00 00 48 8b 
45 00 4c 8d 78 90 48 39 c5 0f 84 d0 00 00 00 c6 44 24 07 00 4d 63 f4 <4d> 8b 2f 
4c 89 ef e8 5c b3 89 00 45 85 e4 78 1d 41 83 7f 14 ff 75 
[  237.549525] RSP: 0018:c9000154fc50 EFLAGS: 00010296
[  237.549527] RAX: 4bec0101 RBX: 0002 RCX: 0002
[  237.549529] RDX: 0002 RSI: 0001 RDI: 

Re: [PATCH v5 0/9] Asynchronous UVC

2018-11-06 Thread Laurent Pinchart
Hi Kieran,

Thank you for the patches.

On Tuesday, 6 November 2018 23:27:11 EET Kieran Bingham wrote:
> From: Kieran Bingham 
> 
> The Linux UVC driver has long provided adequate performance capabilities for
> web-cams and low data rate video devices in Linux while resolutions were
> low.
> 
> Modern USB cameras are now capable of high data rates thanks to USB3 with
> 1080p, and even 4k capture resolutions supported.
> 
> Cameras such as the Stereolabs ZED (bulk transfers) or the Logitech BRIO
> (isochronous transfers) can generate more data than an embedded ARM core is
> able to process on a single core, resulting in frame loss.
> 
> A large part of this performance impact is from the requirement to
> ‘memcpy’ frames out from URB packets to destination frames. This unfortunate
> requirement is due to the UVC protocol allowing a variable length header,
> and thus it is not possible to provide the target frame buffers directly.
> 
> Extra throughput is possible by moving the actual memcpy actions to a work
> queue, and moving the memcpy out of interrupt context thus allowing work
> tasks to be scheduled across multiple cores.
> 
> This series has been tested on both the ZED and BRIO cameras on arm64
> platforms, and with thanks to Randy Dunlap, a Dynex 1.3MP Webcam, a Sonix
> USB2 Camera, and a built in Toshiba Laptop camera, and with thanks to
> Philipp Zabel for testing on a Lite-On internal Laptop Webcam, Logitech
> C910 (USB2 isoc), Oculus Sensor (USB3 isoc), and Microsoft HoloLens Sensors
> (USB3 bulk).
> 
> As far as I am aware iSight devices, and devices which use UVC to encode
> data (output device) have not yet been tested - but should find no ill
> effect (at least not until they are tested of course :D )

:-D

I'm not sure whether anyone is still using those devices with Linux. I 
wouldn't be surprised if we realized down the road that they already don't 
work.

> Tested-by: Randy Dunlap 
> Tested-by: Philipp Zabel 
> 
> v2:
>  - Fix race reported by Guennadi
> 
> v3:
>  - Fix similar race reported by Laurent
>  - Only queue work if required (encode/isight do not queue work)
>  - Refactor/Rename variables for clarity
> 
> v4:
>  - (Yet another) Rework of the uninitialise path.
>This time to hopefully clean up the shutdown races for good.
>use usb_poison_urb() to halt all URBs, then flush the work queue
>before freeing.
>  - Rebase to latest linux-media/master
> 
> v5:
>  - Provide lockdep validation
>  - rename uvc_queue_requeue -> uvc_queue_buffer_requeue()
>  - Fix comments and periods throughout
>  - Rebase to media/v4.20-2
>  - Use GFP_KERNEL allocation in uvc_video_copy_data_work()
>  - Fix function documentation for uvc_video_copy_data_work()
>  - Add periods to the end of sentences
>  - Rename 'decode' variable to 'op' in uvc_video_decode_data()
>  - Move uvc_urb->async_operations initialisation to before use
>  - Move async workqueue to match uvc_streaming lifetime instead of
>streamon/streamoff
>  - bracket the for_each_uvc_urb() macro
> 
>  - New patches added to series:
> media: uvcvideo: Split uvc_video_enable into two
> media: uvcvideo: Rename uvc_{un,}init_video()
> media: uvcvideo: Utilise for_each_uvc_urb iterator
> 
> Kieran Bingham (9):
>   media: uvcvideo: Refactor URB descriptors
>   media: uvcvideo: Convert decode functions to use new context structure
>   media: uvcvideo: Protect queue internals with helper
>   media: uvcvideo: queue: Simplify spin-lock usage
>   media: uvcvideo: queue: Support asynchronous buffer handling
>   media: uvcvideo: Move decode processing to process context
>   media: uvcvideo: Split uvc_video_enable into two

I've taken the above patches in my tree.

>   media: uvcvideo: Rename uvc_{un,}init_video()
>   media: uvcvideo: Utilise for_each_uvc_urb iterator

And I've sent review comments for these two.

>  drivers/media/usb/uvc/uvc_driver.c |   2 +-
>  drivers/media/usb/uvc/uvc_isight.c |   6 +-
>  drivers/media/usb/uvc/uvc_queue.c  | 110 +---
>  drivers/media/usb/uvc/uvc_video.c  | 282 +++---
>  drivers/media/usb/uvc/uvcvideo.h   |  65 ++-
>  5 files changed, 331 insertions(+), 134 deletions(-)
> 
> base-commit: dafb7f9aef2fd44991ff1691721ff765a23be27b

-- 
Regards,

Laurent Pinchart