Re: [PATCH v5 09/12] media: uvcvideo: Implement UVC_QUIRK_PRIVACY_DURING_STREAM

2021-04-19 Thread Tomasz Figa
On Wed, Dec 23, 2020 at 9:56 PM Ricardo Ribalda  wrote:
>
> Hi again
>
> On Wed, Dec 23, 2020 at 9:31 AM Ricardo Ribalda  wrote:
> >
> > Hi Laurent
> >
> > On Wed, Dec 23, 2020 at 9:05 AM Laurent Pinchart
> >  wrote:
> > >
> > > Hi Ricardo,
> > >
> > > On Tue, Dec 22, 2020 at 09:04:19PM +0100, Ricardo Ribalda wrote:
> > > > On Tue, Dec 22, 2020 at 11:30 AM Laurent Pinchart wrote:
> > > > > On Mon, Dec 21, 2020 at 05:48:16PM +0100, Ricardo Ribalda wrote:
> > > > > > Some devices, can only read the privacy_pin if the device is
> > > > >
> > > > > s/devices,/devices/
> > > > >
> > > > > > streaming.
> > > > > >
> > > > > > > This patch implements a quirk for such devices, in order to avoid invalid
> > > > > > > reads and/or spurious events.
> > > > > >
> > > > > > Signed-off-by: Ricardo Ribalda 
> > > > > > ---
> > > > > >  drivers/media/usb/uvc/uvc_driver.c | 57 
> > > > > > --
> > > > > >  drivers/media/usb/uvc/uvc_queue.c  |  3 ++
> > > > > >  drivers/media/usb/uvc/uvcvideo.h   |  4 +++
> > > > > >  3 files changed, 61 insertions(+), 3 deletions(-)
> > > > > >
> > > > > > > diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
> > > > > > index 72516101fdd0..7af37d4bd60a 100644
> > > > > > --- a/drivers/media/usb/uvc/uvc_driver.c
> > > > > > +++ b/drivers/media/usb/uvc/uvc_driver.c
> > > > > > @@ -7,6 +7,7 @@
> > > > > >   */
> > > > > >
> > > > > >  #include 
> > > > > > +#include 
> > > > > >  #include 
> > > > > >  #include 
> > > > > >  #include 
> > > > > > @@ -1472,6 +1473,17 @@ static int uvc_parse_control(struct 
> > > > > > uvc_device *dev)
> > > > > >  /* 
> > > > > > -
> > > > > >   * Privacy GPIO
> > > > > >   */
> > > > >
> > > > > There should be a blank line here.
> > > > >
> > > > > > +static bool uvc_gpio_is_streaming(struct uvc_device *dev)
> > > > > > +{
> > > > > > + struct uvc_streaming *streaming;
> > > > > > +
> > > > > > > + list_for_each_entry(streaming, &dev->streams, list) {
> > > > > > > + if (uvc_queue_streaming(&streaming->queue))
> > > > > > + return true;
> > > > > > + }
> > > > > > +
> > > > > > + return false;
> > > > > > +}
> > > > > >
> > > > > >
> > > > >
> > > > > > But not two blank lines here.
> > > > >
> > > > > >  static u8 uvc_gpio_update_value(struct uvc_device *dev,
> > > > > > @@ -1499,7 +1511,12 @@ static int uvc_gpio_get_cur(struct 
> > > > > > uvc_device *dev, struct uvc_entity *entity,
> > > > > >   if (cs != UVC_CT_PRIVACY_CONTROL || size < 1)
> > > > > >   return -EINVAL;
> > > > > >
> > > > > > + if ((dev->quirks & UVC_QUIRK_PRIVACY_DURING_STREAM) &&
> > > > > > + !uvc_gpio_is_streaming(dev))
> > > > > > + return -EBUSY;
> > > > > > +
> > > > > >   *(uint8_t *)data = uvc_gpio_update_value(dev, entity);
> > > > > > +
> > > > > >   return 0;
> > > > > >  }
> > > > > >
> > > > > > @@ -1528,19 +1545,50 @@ static struct uvc_entity 
> > > > > > *uvc_gpio_find_entity(struct uvc_device *dev)
> > > > > >   return NULL;
> > > > > >  }
> > > > > >
> > > > > > -static irqreturn_t uvc_gpio_irq(int irq, void *data)
> > > > > > +void uvc_privacy_gpio_event(struct uvc_device *dev)
> > > > > >  {
> > > > > > - struct uvc_device *dev = data;
> > > > > >   struct uvc_entity *unit;
> > > > > >
> > > > > > +
> > > > > >   unit = uvc_gpio_find_entity(dev);
> > > > > >   if (!unit)
> > > > > > - return IRQ_HANDLED;
> > > > > > + return;
> > > > > >
> > > > > >   uvc_gpio_update_value(dev, unit);
> > > > > > +}
> > > > > > +
> > > > > > +static irqreturn_t uvc_gpio_irq(int irq, void *data)
> > > > > > +{
> > > > > > + struct uvc_device *dev = data;
> > > > > > +
> > > > > > + /* Ignore privacy events during streamoff */
> > > > > > + if (dev->quirks & UVC_QUIRK_PRIVACY_DURING_STREAM)
> > > > > > + if (!uvc_gpio_is_streaming(dev))
> > > > > > + return IRQ_HANDLED;
> > > > >
> > > > > > I'm still a bit concerned about race conditions. When stopping the stream,
> > > > > > vb2_queue.streaming is set to 0 after calling the driver's .stop_streaming()
> > > > > > handler. This means that the device will cut power before
> > > > > > uvc_gpio_is_streaming() can detect that streaming has stopped, and the
> > > > > > GPIO could thus trigger an IRQ.
> > > >
> > > > On the affected devices I have not seen this. I guess it takes some
> > > > time to discharge. Anyway I am implementing a workaround. Tell me if
> > > > it is too ugly.
> > > >
> > > > > > You mentioned that devices have a pull-up or pull-down on the GPIO line.
> > > > > > As there are only two devices affected, do you know if it's a pull-up or
> > > > > > pull-down? Would it be worse to expose that state to userspace than to
> > > > > > return -EBUSY when reading the control?
> > 

Re: [PATCH v2 2/2] media: staging/intel-ipu3: Fix set_fmt error handling

2021-04-08 Thread Tomasz Figa
On Mon, Mar 15, 2021 at 01:34:06PM +0100, Ricardo Ribalda wrote:
> If there is an error during a set_fmt, do not overwrite the previous
> sizes with the invalid config.
> 
> [   38.662975] ipu3-imgu :00:05.0: swiotlb buffer is full (sz: 4096 bytes)
> [   38.662980] DMA: Out of SW-IOMMU space for 4096 bytes at device :00:05.0
> [   38.663010] general protection fault:  [#1] PREEMPT SMP
> 
> Cc: sta...@vger.kernel.org
> Fixes: 6d5f26f2e045 ("media: staging/intel-ipu3-v4l: reduce kernel stack usage")
> Signed-off-by: Ricardo Ribalda 
> ---
>  drivers/staging/media/ipu3/ipu3-v4l2.c | 25 ++---
>  1 file changed, 14 insertions(+), 11 deletions(-)

Reviewed-by: Tomasz Figa 

Best regards,
Tomasz


Re: [PATCH v9 22/22] uvc: use vb2 ioctl and fop helpers

2021-04-01 Thread Tomasz Figa
Hi Ricardo,

On Fri, Mar 26, 2021 at 7:00 PM Ricardo Ribalda  wrote:
>
> From: Hans Verkuil 
>
> When uvc was written the vb2 ioctl and file operation helpers didn't exist.
>
> This patch switches uvc over to those helpers, which removes a lot of boilerplate
> code and simplifies VIDIOC_G/S_PRIORITY handling and allows us to drop the
> 'privileges' scheme, since that's now handled inside the vb2 helpers.
>
> This makes it possible for uvc to pass the v4l2-compliance streaming tests.
>
> Signed-off-by: Hans Verkuil 

Thanks for the patch. Did you perhaps miss adding your sign-off?

Also, see my comments inline.

[snip]
> @@ -1166,11 +969,6 @@ static int uvc_ioctl_s_parm(struct file *file, void *fh,
>  {
> struct uvc_fh *handle = fh;
> struct uvc_streaming *stream = handle->stream;
> -   int ret;
> -
> -   ret = uvc_acquire_privileges(handle);
> -   if (ret < 0)
> -   return ret;

Why is it okay not to acquire the privileges here?

>
> return uvc_v4l2_set_streamparm(stream, parm);
>  }

Best regards,
Tomasz


Re: [PATCH 0/8] videobuf2: support new noncontiguous DMA API

2021-03-24 Thread Tomasz Figa
Hi Sergey,

On Tue, Mar 02, 2021 at 09:46:16AM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
>   RFC
> 
>   The series adds support for new noncontiguous DMA API [0] and
> adds V4L2_FLAG_MEMORY_NON_COHERENT UAPI. This is similar to previous
> V4L2_FLAG_MEMORY_NON_CONSISTENT (which was renamed), but the patch set
> goes a bit further this time and also does some videobuf2 API
> refactorings along the way.
> 
> A corresponding v4l2-compliance patch will be posted shortly.
> 
> [0] https://lore.kernel.org/lkml/20210301085236.947011-2-...@lst.de/
> 
> Sergey Senozhatsky (8):
>   videobuf2: rework vb2_mem_ops API
>   videobuf2: inverse buffer cache_hints flags
>   videobuf2: split buffer cache_hints initialisation
>   videobuf2: move cache_hints handling to allocators
>   videobuf2: add V4L2_FLAG_MEMORY_NON_COHERENT flag
>   videobuf2: add queue memory coherency parameter
>   videobuf2: handle V4L2_FLAG_MEMORY_NON_COHERENT flag
>   videobuf2: handle non-contiguous DMA allocations
> 
>  .../userspace-api/media/v4l/buffer.rst|  40 +++-
>  .../media/v4l/vidioc-create-bufs.rst  |   7 +-
>  .../media/v4l/vidioc-reqbufs.rst  |  16 +-
>  .../media/common/videobuf2/videobuf2-core.c   | 135 +-
>  .../common/videobuf2/videobuf2-dma-contig.c   | 175 ++
>  .../media/common/videobuf2/videobuf2-dma-sg.c |  39 ++--
>  .../media/common/videobuf2/videobuf2-v4l2.c   |  47 ++---
>  .../common/videobuf2/videobuf2-vmalloc.c  |  30 +--
>  drivers/media/dvb-core/dvb_vb2.c  |   2 +-
>  drivers/media/v4l2-core/v4l2-compat-ioctl32.c |   9 +-
>  drivers/media/v4l2-core/v4l2-ioctl.c  |   5 +-
>  include/media/videobuf2-core.h|  57 +++---
>  include/uapi/linux/videodev2.h|  13 +-
>  13 files changed, 396 insertions(+), 179 deletions(-)
> 
> -- 
> 2.30.1.766.gb4fecdf3b7-goog
> 

Just some minor nits for patch 8. Otherwise, with Hans's comments
addressed:

Acked-by: Tomasz Figa 

Thanks for the great job.

Best regards,
Tomasz



Re: [PATCH 8/8] videobuf2: handle non-contiguous DMA allocations

2021-03-24 Thread Tomasz Figa
On Tue, Mar 02, 2021 at 09:46:24AM +0900, Sergey Senozhatsky wrote:
> This adds support for new noncontiguous DMA API, which
> requires allocators to have two execution branches: one
> for the current API, and one for the new one.
> 
> Signed-off-by: Sergey Senozhatsky 
> [hch: untested conversion to the new API]
> Signed-off-by: Christoph Hellwig 
> ---
>  .../common/videobuf2/videobuf2-dma-contig.c   | 141 +++---
>  1 file changed, 117 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index 1e218bc440c6..d6a9f7b682f3 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -17,6 +17,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -42,8 +43,14 @@ struct vb2_dc_buf {
>   struct dma_buf_attachment   *db_attach;
>  
>   struct vb2_buffer   *vb;
> + unsigned intnon_coherent_mem:1;
>  };
>  
> +static bool vb2_dc_is_coherent(struct vb2_dc_buf *buf)
> +{
> + return !buf->non_coherent_mem;
> +}

nit: Given that this is just a simple negated return, do we need a dedicated
helper for it?

> +
>  /*/
>  /*scatterlist table functions*/
>  /*/
> @@ -78,12 +85,21 @@ static void *vb2_dc_cookie(struct vb2_buffer *vb, void *buf_priv)
>  static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
>  {
>   struct vb2_dc_buf *buf = buf_priv;
> - struct dma_buf_map map;
> - int ret;
>  
> - if (!buf->vaddr && buf->db_attach) {
> - ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
> - buf->vaddr = ret ? NULL : map.vaddr;
> + if (buf->vaddr)
> + return buf->vaddr;
> +
> + if (buf->db_attach) {
> + struct dma_buf_map map;
> +
> + if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
> + buf->vaddr = map.vaddr;
> + }
> +
> + if (!vb2_dc_is_coherent(buf)) {

I believe it's not possible for both buf->db_attach and
!vb2_dc_is_coherent() to be true, but nevertheless the code can be
misleading for the reader. Would it make sense to just return early in the
if that handles db_attach?
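
To make the suggestion concrete, here is a rough userspace model of the
control flow with the early return. It is only a sketch: struct model_buf,
model_dmabuf_vmap() and model_vmap_noncontiguous() are made-up stand-ins for
struct vb2_dc_buf, dma_buf_vmap() and dma_vmap_noncontiguous(), and none of
this is meant as the final driver code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct vb2_dc_buf. */
struct model_buf {
	void *vaddr;           /* cached kernel mapping, NULL if not mapped yet */
	bool is_dmabuf;        /* stands in for buf->db_attach != NULL */
	bool non_coherent_mem; /* same meaning as the new bit in the patch */
};

/* Dummy objects whose addresses play the role of the two mappings. */
static char dmabuf_mapping[1];
static char noncontig_mapping[1];

/* Stubs standing in for dma_buf_vmap() and dma_vmap_noncontiguous(). */
static void *model_dmabuf_vmap(void) { return dmabuf_mapping; }
static void *model_vmap_noncontiguous(void) { return noncontig_mapping; }

static void *model_vaddr(struct model_buf *buf)
{
	if (buf->vaddr)
		return buf->vaddr;

	if (buf->is_dmabuf) {
		/* Imported buffer: map through the exporter and stop here. */
		buf->vaddr = model_dmabuf_vmap();
		return buf->vaddr;
	}

	/* Only buffers allocated by this allocator can reach this point. */
	if (buf->non_coherent_mem)
		buf->vaddr = model_vmap_noncontiguous();

	return buf->vaddr;
}
```

With the early return, the reader no longer has to convince themselves that
the DMABUF branch and the non-coherent branch cannot both run.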

> + buf->vaddr = dma_vmap_noncontiguous(buf->dev,
> + buf->size,
> + buf->dma_sgt);
>   }
>  
>   return buf->vaddr;
> @@ -101,13 +117,26 @@ static void vb2_dc_prepare(void *buf_priv)
>   struct vb2_dc_buf *buf = buf_priv;
>   struct sg_table *sgt = buf->dma_sgt;
>  
> + /* This takes care of DMABUF and user-enforced cache sync hint */
>   if (buf->vb->skip_cache_sync_on_prepare)
>   return;
>  
> + /*
> +  * Coherent MMAP buffers do not need to be synced, unlike coherent
> +  * USERPTR and non-coherent MMAP buffers.

USERPTR buffers are always considered non-coherent.

> +  */
> + if (buf->vb->memory == V4L2_MEMORY_MMAP && vb2_dc_is_coherent(buf))
> + return;
> +
>   if (!sgt)
>   return;
>  
> + /* For both USERPTR and non-coherent MMAP */
>   dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
> +
> + /* Non-coherent MMAP only */
> + if (!vb2_dc_is_coherent(buf) && buf->vaddr)
> + flush_kernel_vmap_range(buf->vaddr, buf->size);
>  }
>  
>  static void vb2_dc_finish(void *buf_priv)
> @@ -115,19 +144,46 @@ static void vb2_dc_finish(void *buf_priv)
>   struct vb2_dc_buf *buf = buf_priv;
>   struct sg_table *sgt = buf->dma_sgt;
>  
> + /* This takes care of DMABUF and user-enforced cache sync hint */
>   if (buf->vb->skip_cache_sync_on_finish)
>   return;
>  
> + /*
> +  * Coherent MMAP buffers do not need to be synced, unlike coherent
> +  * USERPTR and non-coherent MMAP buffers.

Ditto.
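
Put differently, the decision both hunks are trying to express could be
modelled as below. This is a userspace sketch under my reading of the patch,
not a proposal for the actual code: enum model_memory and model_needs_sync()
are invented names, skip_hint stands for vb->skip_cache_sync_on_prepare or
vb->skip_cache_sync_on_finish, and has_sgt stands for buf->dma_sgt being
non-NULL.

```c
#include <stdbool.h>

/* Hypothetical mirror of the memory types that matter here. */
enum model_memory { MODEL_MMAP, MODEL_USERPTR, MODEL_DMABUF };

/*
 * Decide whether prepare()/finish() has to sync caches.
 * The skip hint is assumed to already cover DMABUF and the
 * user-provided cache hints, as the comment in the patch says.
 */
static bool model_needs_sync(enum model_memory memory, bool non_coherent_mem,
			     bool skip_hint, bool has_sgt)
{
	if (skip_hint)
		return false;

	/* Only MMAP buffers can be coherent; USERPTR is always synced. */
	if (memory == MODEL_MMAP && !non_coherent_mem)
		return false;

	return has_sgt;
}
```

Note that the coherency bit is never consulted for USERPTR, which is why the
comment should not talk about "coherent USERPTR" buffers.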

> +  */
> + if (buf->vb->memory == V4L2_MEMORY_MMAP && vb2_dc_is_coherent(buf))
> + return;
> +
>   if (!sgt)
>   return;
>  
> + /* For both USERPTR and non-coherent MMAP */
>   dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
> +
> + /* Non-coherent MMAP only */
> + if (!vb2_dc_is_coherent(buf) && buf->vaddr)
> + invalidate_kernel_vmap_range(buf->vaddr, buf->size);
>  }
>  
>  /*/
>  /*callbacks for MMAP buffers */
>  /*/
>  
> +static void __vb2_dc_put(struct vb2_dc_buf *buf)
> +{
> + if (vb2_dc_is_coherent(buf)) {
> + dma_free_attrs(buf->dev, buf->size, buf->cookie,
> +buf->dma_addr, buf->attrs);
> + return;
> + }
> +
> + if (buf->vaddr)
> + 

Re: [PATCHv3 5/6] media: uvcvideo: add UVC 1.5 ROI control

2021-03-23 Thread Tomasz Figa
On Wed, Mar 24, 2021 at 11:52 AM Sergey Senozhatsky
 wrote:
>
> On (21/03/24 11:34), Tomasz Figa wrote:
> > On Wed, Mar 24, 2021 at 11:31 AM Sergey Senozhatsky
> >  wrote:
> [..]
> > > > Adjusting the rectangle to something supported by the hardware is
> > > > mentioned explicitly in the V4L2 API documentation and is what drivers
> > > > have to implement. Returning an error on invalid value is not a
> > > > correct behavior here (and similarly for many other operations, e.g.
> > > > S_FMT).
> > >
> > > Well, in this particular case we are talking about user-space that wants
> > > to set an ROI rectangle that knowingly violates the device's GET_MAX and
> > > overflows the UVC ROI rectangle u16 value range. That's a clear bug in user-space.
> > > Do we want to pretend that user-space does the correct thing and fix up
> > > stuff behind the scenes?
> > >
> >
> > That's how the API is defined. There is a valid use case for this -
> > you don't need to run QUERY_CTRL if all you need is setting the
> > biggest possible rectangle, just set it to (0, 0), (INT_MAX, INT_MAX).
>
> I guess in our case we need to talk about rectangle,auto-controls tuple
> that we send to firmware
>
> rect {
> (0, 0), (INT_MAX, INT_MAX)
> }
> auto-controls {
> INT_MAX
> }
>
> For ROI user-space also must provide valid auto-controls value, which
> normally requires GET_MIN/GET_MAX discovery.
>
> v4l2 selection API mentions only rectangle adjustments and errnos like
> -ERANGE also mention "It is not possible to adjust struct v4l2_rect r
> rectangle to satisfy all constraints given in the flags argument".
>
> So in case when auto-controls is out of supported range (out of
> GET_MIN, GET_MAX range) there is no way for us to tell user-space that
> auto-controls is wrong. We probably need to silently pick the first
> supported value, but not sure how well this will work out in the end.

Shouldn't the autocontrol selection be done via a separate bitmask
control rather than some custom flags in the selection API?
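
Whichever interface ends up carrying it, treating the value as a bitmask
also changes how it should be validated: with a mask, not with a range
check. A small userspace sketch of the difference follows. The ROI_AUTO_*
names and bit positions are made up for the example and are not the
definitions from the patch; the supported mask is assumed to come from the
device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define ROI_AUTO_EXPOSURE        (1u << 0)
#define ROI_AUTO_IRIS            (1u << 1)
#define ROI_AUTO_HIGHER_QUALITY  (1u << 7)

/* A magnitude check in the style of "flags > HIGHEST_FLAG". */
static bool flags_ok_by_magnitude(uint32_t flags)
{
	return flags <= ROI_AUTO_HIGHER_QUALITY;
}

/* A mask check: accept any combination of bits the device advertises. */
static bool flags_ok_by_mask(uint32_t flags, uint32_t supported)
{
	return (flags & ~supported) == 0;
}
```

The magnitude check rejects ROI_AUTO_IRIS | ROI_AUTO_HIGHER_QUALITY but
accepts an unsupported lower bit, while the mask check gets both cases right.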

Best regards,
Tomasz


Re: [PATCHv3 5/6] media: uvcvideo: add UVC 1.5 ROI control

2021-03-23 Thread Tomasz Figa
On Wed, Mar 24, 2021 at 11:31 AM Sergey Senozhatsky
 wrote:
>
> On (21/03/24 11:14), Tomasz Figa wrote:
> > > > > +static int uvc_ioctl_s_roi(struct file *file, void *fh,
> > > > > +  struct v4l2_selection *sel)
> > > > > +{
> > > > > +   struct uvc_fh *handle = fh;
> > > > > +   struct uvc_streaming *stream = handle->stream;
> > > > > +   struct uvc_roi_rect *roi;
> > > > > +   int ret;
> > > > > +
> > > > > +   if (!validate_roi_bounds(stream, sel))
> > > > > +   return -E2BIG;
> > > >
> > > > Not sure if this is the correct approach or if we should convert the
> > > > value to the closest valid...
> > >
> > > Well, at this point we know that ROI rectangle dimensions are out of
> > > sane value range. I'd rather tell user-space about integer overflow.
> >
> > Adjusting the rectangle to something supported by the hardware is
> > mentioned explicitly in the V4L2 API documentation and is what drivers
> > have to implement. Returning an error on invalid value is not a
> > correct behavior here (and similarly for many other operations, e.g.
> > S_FMT).
>
> Well, in this particular case we are talking about user-space that wants
> to set an ROI rectangle that knowingly violates the device's GET_MAX and
> overflows the UVC ROI rectangle u16 value range. That's a clear bug in user-space.
> Do we want to pretend that user-space does the correct thing and fix up
> stuff behind the scenes?
>

That's how the API is defined. There is a valid use case for this -
you don't need to run QUERY_CTRL if all you need is setting the
biggest possible rectangle, just set it to (0, 0), (INT_MAX, INT_MAX).

> > > Looking for the closest ROI rectangle that suffices can be rather
> > > tricky. It may sound like we can just use BOUNDARIES_MAX, but this
> > > is what Firmware D returns for GET_MAX
> > >
> > > ioctl(V4L2_SEL_TGT_ROI_BOUNDS_MAX)
> > >
> > > 0, 0, 65535, 65535
> >
> > Perhaps the frame size would be the correct bounds?
>
> I can check that.


Re: [PATCHv3 5/6] media: uvcvideo: add UVC 1.5 ROI control

2021-03-23 Thread Tomasz Figa
On Wed, Mar 24, 2021 at 11:01 AM Sergey Senozhatsky
 wrote:
>
> On (21/03/23 17:16), Ricardo Ribalda wrote:
> [..]
> > > +static bool validate_roi_bounds(struct uvc_streaming *stream,
> > > +   struct v4l2_selection *sel)
> > > +{
> > > +   if (sel->r.left > USHRT_MAX ||
> > > +   sel->r.top > USHRT_MAX ||
> > > +   (sel->r.width + sel->r.left) > USHRT_MAX ||
> > > +   (sel->r.height + sel->r.top) > USHRT_MAX ||
> > > +   !sel->r.width || !sel->r.height)
> > > +   return false;
> > > +
> > > +   if (sel->flags > V4L2_SEL_FLAG_ROI_AUTO_HIGHER_QUALITY)
> > > +   return false;
> >
> > Is it not allowed V4L2_SEL_FLAG_ROI_AUTO_IRIS |
> > V4L2_SEL_FLAG_ROI_AUTO_HIGHER_QUALITY   ?
>
> Good question.
>
> I don't know. Depends on what HIGHER_QUALITY can stand for (UVC doesn't
> specify). But overall it seems like features there are mutually
> exclusive. E.g. AUTO_FACE_DETECT and AUTO_DETECT_AND_TRACK.
>
>
> I think it'll be better to replace this with
>
> if (sel->flags > USHRT_MAX)
> return false;
>
> so that we don't let overflow happen and accidentally enable/disable
> some of the features.
>
> > > +
> > > +   return true;
> > > +}
> > > +
> > > +static int uvc_ioctl_s_roi(struct file *file, void *fh,
> > > +  struct v4l2_selection *sel)
> > > +{
> > > +   struct uvc_fh *handle = fh;
> > > +   struct uvc_streaming *stream = handle->stream;
> > > +   struct uvc_roi_rect *roi;
> > > +   int ret;
> > > +
> > > +   if (!validate_roi_bounds(stream, sel))
> > > +   return -E2BIG;
> >
> > Not sure if this is the correct approach or if we should convert the
> > value to the closest valid...
>
> Well, at this point we know that ROI rectangle dimensions are out of
> sane value range. I'd rather tell user-space about integer overflow.

Adjusting the rectangle to something supported by the hardware is
mentioned explicitly in the V4L2 API documentation and is what drivers
have to implement. Returning an error on invalid value is not a
correct behavior here (and similarly for many other operations, e.g.
S_FMT).

https://www.kernel.org/doc/html/v4.8/media/uapi/v4l/vidioc-g-selection.html

>
> Looking for the closest ROI rectangle that suffices can be rather
> tricky. It may sound like we can just use BOUNDARIES_MAX, but this
> is what Firmware D returns for GET_MAX
>
> ioctl(V4L2_SEL_TGT_ROI_BOUNDS_MAX)
>
> 0, 0, 65535, 65535

Perhaps the frame size would be the correct bounds?
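
As a rough illustration of "adjust instead of reject", assuming the frame
size is indeed the right bound, something along these lines would do. It is
a userspace sketch: struct roi_rect and roi_clamp() are hypothetical names
(the struct just mirrors the layout of struct v4l2_rect), and the frame
dimensions are assumed to be between 1 and 65535.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct v4l2_rect. */
struct roi_rect {
	int32_t left;
	int32_t top;
	uint32_t width;
	uint32_t height;
};

/* Clamp v into [lo, hi]; 64-bit so that intermediate values cannot overflow. */
static int64_t clamp64(int64_t v, int64_t lo, int64_t hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Adjust r in place so that it lies inside a frame_w x frame_h frame
 * and is at least 1x1. frame_w and frame_h are assumed to be >= 1.
 */
static void roi_clamp(struct roi_rect *r, uint32_t frame_w, uint32_t frame_h)
{
	int64_t left = clamp64(r->left, 0, (int64_t)frame_w - 1);
	int64_t top = clamp64(r->top, 0, (int64_t)frame_h - 1);
	int64_t width = clamp64(r->width, 1, (int64_t)frame_w - left);
	int64_t height = clamp64(r->height, 1, (int64_t)frame_h - top);

	r->left = (int32_t)left;
	r->top = (int32_t)top;
	r->width = (uint32_t)width;
	r->height = (uint32_t)height;
}
```

With this, a request for (0, 0), (INT_MAX, INT_MAX) simply becomes the full
frame, and the adjusted rectangle is what gets reported back to user-space.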

Best regards,
Tomasz


Re: [PATCH] media: venus: use contig vb2 ops

2021-03-01 Thread Tomasz Figa
On Mon, Mar 1, 2021 at 7:22 PM Stanimir Varbanov
 wrote:
>
>
>
> On 3/1/21 11:23 AM, Tomasz Figa wrote:
> > Hi Alex, Stanimir,
> >
> > On Wed, Dec 16, 2020 at 12:15 PM Tomasz Figa  wrote:
> >>
> >> On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne  wrote:
> >>>
> >>> Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> >>>> Hi Tomasz,
> >>>>
> >>>> On 12/15/20 1:47 PM, Tomasz Figa wrote:
> >>>>> On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> >>>>>  wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Cc: Robin
> >>>>>>
> >>>>>> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> >>>>>>> This driver uses the SG vb2 ops, but effectively only ever accesses the
> >>>>>>> first entry of the SG table, indicating that it expects a flat layout.
> >>>>>>> Switch it to use the contiguous ops to make sure this expected invariant
> >>>>>>
> >>>>>> Under what circumstances will the sg table have nents > 1? I came down to
> >>>>>> [1] but not sure I got it right.
> >>>>>>
> >>>>>> I'm afraid that for systems with a low amount of system memory and when
> >>>>>> the memory becomes fragmented, the driver will not work. That's why I
> >>>>>> started with the sg allocator.
> >>>>>
> >>>>> It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> >>>>> in terms of the DMA (aka IOVA) address space. In other words, it
> >>>>> guarantees that having one DMA address and length fully describes the
> >>>>
> >>>> Ahh, I missed that part. Looks like I misunderstood videobuf2 contig
> >>>> allocator.
> >>>
> >>> I'm learning every day too, but I'm surprised I don't see a call to
> >>> vb2_dma_contig_set_max_seg_size() in this driver (I could also just have missed
> >>> a patch when overlooking this thread)?
> >>>
> >>> The reason I'm asking, doc says it should be called by drivers supporting IOMMU,
> >>> which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-mdp,
> >>> s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I'm posting it; worst case it's
> >>> all covered and we are good, otherwise perhaps a downstream patch didn't make it?
> >>>
> >>> /**
> >>>  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
> >>>  * @dev:device for configuring DMA parameters
> >>>  * @size:   size of DMA max segment size to set
> >>>  *
> >>>  * To allow mapping the scatter-list into a single chunk in the DMA
> >>>  * address space, the device is required to have the DMA max segment
> >>>  * size parameter set to a value larger than the buffer size. Otherwise,
> >>>  * the DMA-mapping subsystem will split the mapping into max segment
> >>>  * size chunks. This function sets the DMA max segment size
> >>>  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
> >>>  * address space.
> >>>  * This code assumes that the DMA-mapping subsystem will merge all
> >>>  * scatterlist segments if this is really possible (for example when
> >>>  * an IOMMU is available and enabled).
> >>>  * Ideally, this parameter should be set by the generic bus code, but it
> >>>  * is left with the default 64KiB value due to historical limitations in
> >>>  * other subsystems (like limited USB host drivers) and there is no good
> >>>  * place to set it to the proper value.
> >>>  * This function should be called from the drivers, which are known to
> >>>  * operate on platforms with IOMMU and provide access to shared buffers
> >>>  * (either USERPTR or DMABUF). This should be done before initializing
> >>>  * videobuf2 queue.
> >>>  */
> >>
> >> It does call dma_set_max_seg_size() directly:
> >> https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230
> >>
> >> Actually, why do we even need a vb2 helper for this?
> >>
> >
> > What's the plan for this patch?
>
> It will be part of v5.12.

Great, thanks!
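
For anyone reading the archive later, the "contig in terms of the DMA (aka
IOVA) address space" point quoted above can be pictured with a tiny userspace
model. struct model_seg and model_dma_contiguous() are invented for this
note and are not kernel structures; the addresses in the usage example are
arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped scatterlist entry. */
struct model_seg {
	uint64_t phys;     /* where the pages live in system memory */
	uint64_t dma_addr; /* address the device uses (an IOVA behind an IOMMU) */
	uint64_t len;
};

/* True if the device sees one flat range, whatever the physical layout. */
static bool model_dma_contiguous(const struct model_seg *segs, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (segs[i].dma_addr != segs[i - 1].dma_addr + segs[i - 1].len)
			return false;
	}
	return n > 0;
}
```

Three pages scattered across physical memory but mapped at 0x10000000,
0x10001000 and 0x10002000 pass the check, so a single address and length
describe the buffer to the device even though system memory is fragmented.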


Re: [PATCH] media: venus: use contig vb2 ops

2021-03-01 Thread Tomasz Figa
Hi Alex, Stanimir,

On Wed, Dec 16, 2020 at 12:15 PM Tomasz Figa  wrote:
>
> On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne  wrote:
> >
> > Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> > > Hi Tomasz,
> > >
> > > On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > > > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > Cc: Robin
> > > > >
> > > > > On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > > > > > > This driver uses the SG vb2 ops, but effectively only ever accesses the
> > > > > > > first entry of the SG table, indicating that it expects a flat layout.
> > > > > > > Switch it to use the contiguous ops to make sure this expected invariant
> > > > > >
> > > > > > Under what circumstances will the sg table have nents > 1? I came down to
> > > > > > [1] but not sure I got it right.
> > > > > >
> > > > > > I'm afraid that for systems with a low amount of system memory and when
> > > > > > the memory becomes fragmented, the driver will not work. That's why I
> > > > > > started with the sg allocator.
> > > >
> > > > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > > > in terms of the DMA (aka IOVA) address space. In other words, it
> > > > guarantees that having one DMA address and length fully describes the
> > >
> > > > Ahh, I missed that part. Looks like I misunderstood videobuf2 contig
> > > allocator.
> >
> > > I'm learning every day too, but I'm surprised I don't see a call to
> > > vb2_dma_contig_set_max_seg_size() in this driver (I could also just have missed
> > > a patch when overlooking this thread)?
> > >
> > > The reason I'm asking, doc says it should be called by drivers supporting IOMMU,
> > > which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-mdp,
> > > s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I'm posting it; worst case it's
> > > all covered and we are good, otherwise perhaps a downstream patch didn't make it?
> >
> > /**
> >  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
> >  * @dev:device for configuring DMA parameters
> >  * @size:   size of DMA max segment size to set
> >  *
> >  * To allow mapping the scatter-list into a single chunk in the DMA
> >  * address space, the device is required to have the DMA max segment
> >  * size parameter set to a value larger than the buffer size. Otherwise,
> >  * the DMA-mapping subsystem will split the mapping into max segment
> >  * size chunks. This function sets the DMA max segment size
> >  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
> >  * address space.
> >  * This code assumes that the DMA-mapping subsystem will merge all
> >  * scatterlist segments if this is really possible (for example when
> >  * an IOMMU is available and enabled).
> >  * Ideally, this parameter should be set by the generic bus code, but it
> > >  * is left with the default 64KiB value due to historical limitations in
> > >  * other subsystems (like limited USB host drivers) and there is no good
> >  * place to set it to the proper value.
> >  * This function should be called from the drivers, which are known to
> >  * operate on platforms with IOMMU and provide access to shared buffers
> >  * (either USERPTR or DMABUF). This should be done before initializing
> >  * videobuf2 queue.
> >  */
>
> It does call dma_set_max_seg_size() directly:
> https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230
>
> Actually, why do we even need a vb2 helper for this?
>

What's the plan for this patch?
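
As a side note on the kernel-doc quoted above: the effect of the default
64KiB limit is easy to see with some arithmetic. The helper below is only a
simplified userspace model of "split the mapping into max segment size
chunks"; model_nr_segments() is a made-up name and the real DMA-mapping code
is more involved.

```c
#include <stddef.h>

/*
 * Number of DMA segments needed for `size` bytes when no single
 * segment may exceed `max_seg` bytes. max_seg must be non-zero.
 */
static size_t model_nr_segments(size_t size, size_t max_seg)
{
	return size / max_seg + (size % max_seg != 0);
}
```

For example, a 1920x1080 buffer at 2 bytes per pixel (4147200 bytes) needs 64
segments with a 64KiB limit, but only one once the limit is raised above the
buffer size, which is what a driver expecting a flat layout relies on.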

Best regards,
Tomasz

> >
> > regards,
> > Nicolas
> >
> > >
> > > > buffer. This seems to be the requirement of the hardware/firmware
> > > > handled by the venus driver. If the device is behind an IOMMU, which
> > > > is the case for the SoCs in question, the underlying DMA ops will
> > > > actually allocate a discontiguous set of pages, so it has nothing to
> > > > > do with system memory amount or fragmentation. If for some reason the
> > > > IOMMU can't be used, there is no way around, the memory needs to b

Re: [PATCH 6/7] dma-iommu: implement ->alloc_noncontiguous

2021-02-16 Thread Tomasz Figa
Hi Christoph


On Tue, Feb 2, 2021 at 6:51 PM Christoph Hellwig  wrote:
>
> Implement support for allocating a non-contiguous DMA region.
>
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/iommu/dma-iommu.c | 35 +++
>  1 file changed, 35 insertions(+)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 85cb004d7a44c6..4e0b170d38d57a 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -718,6 +718,7 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
> goto out_free_sg;
>
> sgt->sgl->dma_address = iova;
> +   sgt->sgl->dma_length = size;
> return pages;
>
>  out_free_sg:
> @@ -755,6 +756,36 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> return NULL;
>  }
>
> +#ifdef CONFIG_DMA_REMAP
> +static struct sg_table *iommu_dma_alloc_noncontiguous(struct device *dev,
> +   size_t size, enum dma_data_direction dir, gfp_t gfp)
> +{
> +   struct dma_sgt_handle *sh;
> +
> +   sh = kmalloc(sizeof(*sh), gfp);
> +   if (!sh)
> +   return NULL;
> +
> +   sh->pages = __iommu_dma_alloc_noncontiguous(dev, size, &sh->sgt, gfp,
> +   PAGE_KERNEL, 0);

When working on the videobuf2 integration with Sergey I noticed that
we always pass 0 as DMA attrs here, which removes the ability for
drivers to use DMA_ATTR_ALLOC_SINGLE_PAGES.

It's quite important from a system stability point of view, because by
default the iommu_dma allocator would prefer big order allocations for
TLB locality reasons. For many devices, though, it doesn't really
affect the performance, because of random access patterns, so single
pages are good enough and reduce the risk of allocation failures or
latency due to fragmentation.

Do you think we could add the attrs parameter to the
dma_alloc_noncontiguous() API?
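
To give a feel for the trade-off, here is a simplified userspace model of a
greedy allocator that prefers the largest power-of-two chunk that still
fits. It is not the dma-iommu code: model_nr_allocs() and its max_order
parameter are invented for this illustration, with max_order == 0 playing
the role of the single-pages behaviour.

```c
#include <stddef.h>

/*
 * Count how many separate page allocations are made for nr_pages pages
 * when each allocation is the largest power-of-two chunk that still fits,
 * capped at 2^max_order pages. max_order is assumed to be well below the
 * bit width of size_t.
 */
static size_t model_nr_allocs(size_t nr_pages, unsigned int max_order)
{
	size_t allocs = 0;

	while (nr_pages) {
		unsigned int order = max_order;

		/* Shrink the chunk until it fits into what is left. */
		while (((size_t)1 << order) > nr_pages)
			order--;

		nr_pages -= (size_t)1 << order;
		allocs++;
	}

	return allocs;
}
```

For a 1000-page buffer, a cap of order 9 gives 6 allocations (512, 256, 128,
64, 32 and 8 pages), each of which needs a physically contiguous block,
whereas the single-page variant makes 1000 order-0 allocations that do not
depend on finding large free blocks.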

Best regards,
Tomasz


Re: [PATCH v5 06/27] dt-bindings: mediatek: Add binding for mt8192 IOMMU

2021-02-09 Thread Tomasz Figa
On Mon, Feb 1, 2021 at 7:44 PM Robin Murphy  wrote:
>
> On 2021-01-29 11:45, Tomasz Figa wrote:
> > On Mon, Jan 25, 2021 at 4:34 PM Yong Wu  wrote:
> >>
> >> On Mon, 2021-01-25 at 13:18 +0900, Tomasz Figa wrote:
> >>> On Wed, Jan 20, 2021 at 4:08 PM Yong Wu  wrote:
> >>>>
> >>>> On Wed, 2021-01-20 at 13:15 +0900, Tomasz Figa wrote:
> >>>>> On Wed, Jan 13, 2021 at 3:45 PM Yong Wu  wrote:
> >>>>>>
> >>>>>> On Wed, 2021-01-13 at 14:30 +0900, Tomasz Figa wrote:
> >>>>>>> On Thu, Dec 24, 2020 at 8:35 PM Yong Wu  wrote:
> >>>>>>>>
> >>>>>>>> On Wed, 2020-12-23 at 17:18 +0900, Tomasz Figa wrote:
> >>>>>>>>> On Wed, Dec 09, 2020 at 04:00:41PM +0800, Yong Wu wrote:
> >>>>>>>>>> This patch adds descriptions for mt8192 IOMMU and SMI.
> >>>>>>>>>>
> >>>>>>>>>> mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
> >>>>>>>>>> table format. The M4U-SMI HW diagram is as below:
> >>>>>>>>>>
> >>>>>>>>>>                          EMI
> >>>>>>>>>>                           |
> >>>>>>>>>>                          M4U
> >>>>>>>>>>                           |
> >>>>>>>>>>                      SMI Common
> >>>>>>>>>>                           |
> >>>>>>>>>>   +-------+------+------+-----------------+--------+
> >>>>>>>>>>   |       |      |      |      ..         |        |
> >>>>>>>>>>   |       |      |      |                 |        |
> >>>>>>>>>> larb0   larb1  larb2  larb4    ..      larb19   larb20
> >>>>>>>>>> disp0   disp1   mdp   vdec               IPE      IPE
> >>>>>>>>>>
> >>>>>>>>>> All the connections are HW fixed, SW can NOT adjust it.
> >>>>>>>>>>
> >>>>>>>>>> mt8192 M4U supports 0~16GB iova range. We preassign different engines
> >>>>>>>>>> into different iova ranges:
> >>>>>>>>>>
> >>>>>>>>>> domain-id  module     iova-range   larbs
> >>>>>>>>>>    0       disp       0 ~ 4G       larb0/1
> >>>>>>>>>>    1       vcodec     4G ~ 8G      larb4/5/7
> >>>>>>>>>>    2       cam/mdp    8G ~ 12G     larb2/9/11/13/14/16/17/18/19/20
> >>>>>>>>>
> >>>>>>>>> Why do we preassign these addresses in DT? Shouldn't it be a user's or
> >>>>>>>>> integrator's decision to split the 16 GB address range into sub-ranges
> >>>>>>>>> and define which larbs those sub-ranges are shared with?
> >>>>>>>>
> >>>>>>>> The problem is that we can't split the 16GB range with the larb as the unit.
> >>>>>>>> For example, ccu0 below (larb13 port9/10) is an independent
> >>>>>>>> range (domain), while the other ports in larb13 are in another domain.
> >>>>>>>>
> >>>>>>>> disp/vcodec/cam/mdp don't have special iova requirements; they could
> >>>>>>>> access any range. vcodec could also be located in 8G~12G. It doesn't care
> >>>>>>>> where its iova is located. Here I preassign like this, following our
> >>>>>>>> internal project setting.
> >>>>>>>
> >>>>>>> Let me try to understand this a bit more. Given the split you're
> >>>>>>> proposing, is there actually any isolation enforced between particular

Re: add a new dma_alloc_noncontiguous API v2

2021-02-08 Thread Tomasz Figa
Hi Christoph,

On Mon, Feb 8, 2021 at 3:49 AM Christoph Hellwig  wrote:
>
> Any comments?
>

Sorry for the delay. The whole series looks very good to me. Thanks a lot.

Reviewed-by: Tomasz Figa 

Best regards,
Tomasz

> On Tue, Feb 02, 2021 at 10:51:03AM +0100, Christoph Hellwig wrote:
> > Hi all,
> >
> > this series adds the new noncontiguous DMA allocation API requested by
> > various media driver maintainers.
> >
> > Changes since v1:
> >  - document that flush_kernel_vmap_range and invalidate_kernel_vmap_range
> >must be called once an allocation is mapped into KVA
> >  - add dma-debug support
> >  - remove the separate dma_handle argument, and instead create fully formed
> >DMA mapped scatterlists
> >  - use a directional allocation in uvcvideo
> >  - call invalidate_kernel_vmap_range from uvcvideo
> > ___
> > iommu mailing list
> > io...@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/iommu
> ---end quoted text---


Re: [PATCH v5 06/27] dt-bindings: mediatek: Add binding for mt8192 IOMMU

2021-01-24 Thread Tomasz Figa
On Wed, Jan 20, 2021 at 4:08 PM Yong Wu  wrote:
>
> On Wed, 2021-01-20 at 13:15 +0900, Tomasz Figa wrote:
> > On Wed, Jan 13, 2021 at 3:45 PM Yong Wu  wrote:
> > >
> > > On Wed, 2021-01-13 at 14:30 +0900, Tomasz Figa wrote:
> > > > On Thu, Dec 24, 2020 at 8:35 PM Yong Wu  wrote:
> > > > >
> > > > > On Wed, 2020-12-23 at 17:18 +0900, Tomasz Figa wrote:
> > > > > > On Wed, Dec 09, 2020 at 04:00:41PM +0800, Yong Wu wrote:
> > > > > > > > This patch adds descriptions for mt8192 IOMMU and SMI.
> > > > > > >
> > > > > > > mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor 
> > > > > > > translation
> > > > > > > table format. The M4U-SMI HW diagram is as below:
> > > > > > >
> > > > > > > >                           EMI
> > > > > > > >                            |
> > > > > > > >                           M4U
> > > > > > > >                            |
> > > > > > > >                       ------------
> > > > > > > >                        SMI Common
> > > > > > > >                       ------------
> > > > > > > >                            |
> > > > > > > >   +-------+------+------+----------------------+-------+
> > > > > > > >   |       |      |      |       ......         |       |
> > > > > > > >   |       |      |      |                      |       |
> > > > > > > > larb0   larb1  larb2  larb4     ......      larb19   larb20
> > > > > > > > disp0   disp1   mdp    vdec                   IPE      IPE
> > > > > > >
> > > > > > > All the connections are HW fixed, SW can NOT adjust it.
> > > > > > >
> > > > > > > mt8192 M4U support 0~16GB iova range. we preassign different 
> > > > > > > engines
> > > > > > > into different iova ranges:
> > > > > > >
> > > > > > > > domain-id  module     iova-range   larbs
> > > > > > > >    0       disp       0 ~ 4G       larb0/1
> > > > > > > >    1       vcodec     4G ~ 8G      larb4/5/7
> > > > > > > >    2       cam/mdp    8G ~ 12G     larb2/9/11/13/14/16/17/18/19/20
> > > > > >
> > > > > > Why do we preassign these addresses in DT? Shouldn't it be a user's 
> > > > > > or
> > > > > > integrator's decision to split the 16 GB address range into 
> > > > > > sub-ranges
> > > > > > and define which larbs those sub-ranges are shared with?
> > > > >
> > > > > The problem is that we can't split the 16GB range with the larb as 
> > > > > unit.
> > > > > The example is the below ccu0(larb13 port9/10) is a independent
> > > > > range(domain), the others ports in larb13 is in another domain.
> > > > >
> > > > > disp/vcodec/cam/mdp don't have special iova requirement, they could
> > > > > access any range. vcodec also can locate 8G~12G. it don't care about
> > > > > where its iova locate. here I preassign like this following with our
> > > > > internal project setting.
> > > >
> > > > Let me try to understand this a bit more. Given the split you're
> > > > proposing, is there actually any isolation enforced between particular
> > > > domains? For example, if I program vcodec with a DMA address from
> > > > the 0-4G range, would the IOMMU actually generate a fault, even if
> > > > disp had some memory mapped at that address?
> > >
> > > In this case. we will get fault in current SW setting.
> > >
> >
> > Okay, thanks.
> >
> > > >
> > > > >
> > > > > Why set this in DT?, this is only for simplifying the code. Assume we
> > > > > put it in the platform data. We have up to 32 larbs, each larb has up 
> > > > > to
> > > > > 32 ports, each port may be in different iommu domains. we should have 
> > > > > a
> > > > > big array for this..however we only use a macro to get the domain in 
> > > > > the
> > > > > DT method.
> > > > >
> > > > > When replying this mail, I h

Re: [PATCH v5 06/27] dt-bindings: mediatek: Add binding for mt8192 IOMMU

2021-01-19 Thread Tomasz Figa
On Wed, Jan 13, 2021 at 3:45 PM Yong Wu  wrote:
>
> On Wed, 2021-01-13 at 14:30 +0900, Tomasz Figa wrote:
> > On Thu, Dec 24, 2020 at 8:35 PM Yong Wu  wrote:
> > >
> > > On Wed, 2020-12-23 at 17:18 +0900, Tomasz Figa wrote:
> > > > On Wed, Dec 09, 2020 at 04:00:41PM +0800, Yong Wu wrote:
> > > > > > This patch adds descriptions for mt8192 IOMMU and SMI.
> > > > >
> > > > > mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor 
> > > > > translation
> > > > > table format. The M4U-SMI HW diagram is as below:
> > > > >
> > > > > >                           EMI
> > > > > >                            |
> > > > > >                           M4U
> > > > > >                            |
> > > > > >                       ------------
> > > > > >                        SMI Common
> > > > > >                       ------------
> > > > > >                            |
> > > > > >   +-------+------+------+----------------------+-------+
> > > > > >   |       |      |      |       ......         |       |
> > > > > >   |       |      |      |                      |       |
> > > > > > larb0   larb1  larb2  larb4     ......      larb19   larb20
> > > > > > disp0   disp1   mdp    vdec                   IPE      IPE
> > > > >
> > > > > All the connections are HW fixed, SW can NOT adjust it.
> > > > >
> > > > > mt8192 M4U support 0~16GB iova range. we preassign different engines
> > > > > into different iova ranges:
> > > > >
> > > > > > domain-id  module     iova-range   larbs
> > > > > >    0       disp       0 ~ 4G       larb0/1
> > > > > >    1       vcodec     4G ~ 8G      larb4/5/7
> > > > > >    2       cam/mdp    8G ~ 12G     larb2/9/11/13/14/16/17/18/19/20
> > > >
> > > > Why do we preassign these addresses in DT? Shouldn't it be a user's or
> > > > integrator's decision to split the 16 GB address range into sub-ranges
> > > > and define which larbs those sub-ranges are shared with?
> > >
> > > The problem is that we can't split the 16GB range with the larb as unit.
> > > The example is the below ccu0(larb13 port9/10) is a independent
> > > range(domain), the others ports in larb13 is in another domain.
> > >
> > > disp/vcodec/cam/mdp don't have special iova requirement, they could
> > > access any range. vcodec also can locate 8G~12G. it don't care about
> > > where its iova locate. here I preassign like this following with our
> > > internal project setting.
> >
> > Let me try to understand this a bit more. Given the split you're
> > proposing, is there actually any isolation enforced between particular
> > domains? For example, if I program vcodec with a DMA address from
> > the 0-4G range, would the IOMMU actually generate a fault, even if
> > disp had some memory mapped at that address?
>
> In this case. we will get fault in current SW setting.
>

Okay, thanks.

> >
> > >
> > > Why set this in DT?, this is only for simplifying the code. Assume we
> > > put it in the platform data. We have up to 32 larbs, each larb has up to
> > > 32 ports, each port may be in different iommu domains. we should have a
> > > big array for this..however we only use a macro to get the domain in the
> > > DT method.
> > >
> > > When replying this mail, I happen to see there is a "dev->dev_range_map"
> > > which has "dma-range" information, I think I could use this value to get
> > > which domain the device belong to. then no need put domid in DT. I will
> > > test this.
> >
> > My feeling is that the only part that needs to be enforced statically
> > is the reserved IOVA range for CCUs. The other ranges should be
> > determined dynamically, although I think I need to understand better
> > how the hardware and your proposed design work to tell what would be
> > likely the best choice here.
>
> I have removed the domid patch in v6. and get the domain id in [27/33]
> in v6..
>
> About the other ranges should be dynamical, the commit message [30/33]
> of v6 should be helpful. the problem is that we have a bank_sel setting
> for the iova[32:33]. currently we preassign this value. thus, all the
> ranges are fixed. If you adjust this setting, you can let 

Re: [PATCH v5 06/27] dt-bindings: mediatek: Add binding for mt8192 IOMMU

2021-01-12 Thread Tomasz Figa
On Thu, Dec 24, 2020 at 8:35 PM Yong Wu  wrote:
>
> On Wed, 2020-12-23 at 17:18 +0900, Tomasz Figa wrote:
> > On Wed, Dec 09, 2020 at 04:00:41PM +0800, Yong Wu wrote:
> > > This patch adds descriptions for mt8192 IOMMU and SMI.
> > >
> > > mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
> > > table format. The M4U-SMI HW diagram is as below:
> > >
> > >                           EMI
> > >                            |
> > >                           M4U
> > >                            |
> > >                       ------------
> > >                        SMI Common
> > >                       ------------
> > >                            |
> > >   +-------+------+------+----------------------+-------+
> > >   |       |      |      |       ......         |       |
> > >   |       |      |      |                      |       |
> > > larb0   larb1  larb2  larb4     ......      larb19   larb20
> > > disp0   disp1   mdp    vdec                   IPE      IPE
> > >
> > > All the connections are HW fixed, SW can NOT adjust it.
> > >
> > > mt8192 M4U support 0~16GB iova range. we preassign different engines
> > > into different iova ranges:
> > >
> > > domain-id  module     iova-range   larbs
> > >    0       disp       0 ~ 4G       larb0/1
> > >    1       vcodec     4G ~ 8G      larb4/5/7
> > >    2       cam/mdp    8G ~ 12G     larb2/9/11/13/14/16/17/18/19/20
> >
> > Why do we preassign these addresses in DT? Shouldn't it be a user's or
> > integrator's decision to split the 16 GB address range into sub-ranges
> > and define which larbs those sub-ranges are shared with?
>
> The problem is that we can't split the 16GB range with the larb as unit.
> The example is the below ccu0(larb13 port9/10) is a independent
> range(domain), the others ports in larb13 is in another domain.
>
> disp/vcodec/cam/mdp don't have special iova requirement, they could
> access any range. vcodec also can locate 8G~12G. it don't care about
> where its iova locate. here I preassign like this following with our
> internal project setting.

Let me try to understand this a bit more. Given the split you're
proposing, is there actually any isolation enforced between particular
domains? For example, if I program vcodec with a DMA address from
the 0-4G range, would the IOMMU actually generate a fault, even if
disp had some memory mapped at that address?

>
> Why set this in DT?, this is only for simplifying the code. Assume we
> put it in the platform data. We have up to 32 larbs, each larb has up to
> 32 ports, each port may be in different iommu domains. we should have a
> big array for this..however we only use a macro to get the domain in the
> DT method.
>
> When replying this mail, I happen to see there is a "dev->dev_range_map"
> which has "dma-range" information, I think I could use this value to get
> which domain the device belong to. then no need put domid in DT. I will
> test this.

My feeling is that the only part that needs to be enforced statically
is the reserved IOVA range for CCUs. The other ranges should be
determined dynamically, although I think I need to understand better
how the hardware and your proposed design work to tell what would be
likely the best choice here.

Best regards,
Tomasz

>
> Thanks.
> >
> > Best regards,
> > Tomasz
> >
> > > >    3       CCU0       0x4000_ ~ 0x43ff_   larb13: port 9/10
> > > >    4       CCU1       0x4400_ ~ 0x47ff_   larb14: port 4/5
> > >
> > > The iova range for CCU0/1(camera control unit) is HW requirement.
> > >
> > > Signed-off-by: Yong Wu 
> > > Reviewed-by: Rob Herring 
> > > ---
> > >  .../bindings/iommu/mediatek,iommu.yaml|  18 +-
> > >  include/dt-bindings/memory/mt8192-larb-port.h | 240 ++
> > >  2 files changed, 257 insertions(+), 1 deletion(-)
> > >  create mode 100644 include/dt-bindings/memory/mt8192-larb-port.h
> > >
> [snip]


Re: [PATCH v5 04/27] dt-bindings: memory: mediatek: Add domain definition

2021-01-12 Thread Tomasz Figa
On Thu, Dec 24, 2020 at 8:27 PM Yong Wu  wrote:
>
> On Wed, 2020-12-23 at 17:15 +0900, Tomasz Figa wrote:
> > Hi Yong,
> >
> > On Wed, Dec 09, 2020 at 04:00:39PM +0800, Yong Wu wrote:
> > > In the latest SoC, there are several HW IP require a special iova
> > > range, mainly CCU and VPU has this requirement. Take CCU as a example,
> > > CCU require its iova locate in the range(0x4000_ ~ 0x43ff_).
> >
> > Is this really a domain? Does the address range come from the design of
> > the IOMMU?
>
> It is not a really a domain. The address range comes from CCU HW
> requirement. That HW can only access this iova range. thus I create a
> special iommu domain for it.
>

I guess it's the IOMMU/DT maintainers who have the last word here, but
shouldn't DT just specify the hardware characteristics and then the
kernel configure the hardware appropriately, possibly based on some
other configuration interface (e.g. command line parameters or sysfs)?

How I'd do this is rather than enforcing those arbitrary decisions
onto the DT bindings, I'd add properties to the master devices (e.g.
CCU) that specify which IOVA range they can operate on. Then, the
exact split of the complete address space would be done at runtime,
based on kernel configuration, command line parameters and possibly
sysfs attributes if things could be reconfigured dynamically.

Best regards,
Tomasz

> >
> > Best regards,
> > Tomasz
> >
> > >
> > > In this patch we add a domain definition for the special port. In the
> > > example of CCU, If we preassign CCU port in domain1, then iommu driver
> > > will prepare a independent iommu domain of the special iova range for it,
> > > then the iova got from dma_alloc_attrs(ccu-dev) will locate in its special
> > > range.
> > >
> > > This is a preparing patch for multi-domain support.
> > >
> > > Signed-off-by: Yong Wu 
> > > Acked-by: Krzysztof Kozlowski 
> > > Acked-by: Rob Herring 
> > > ---
> > >  include/dt-bindings/memory/mtk-smi-larb-port.h | 9 -
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/dt-bindings/memory/mtk-smi-larb-port.h 
> > > b/include/dt-bindings/memory/mtk-smi-larb-port.h
> > > index 7d64103209af..2d4c973c174f 100644
> > > --- a/include/dt-bindings/memory/mtk-smi-larb-port.h
> > > +++ b/include/dt-bindings/memory/mtk-smi-larb-port.h
> > > @@ -7,9 +7,16 @@
> > >  #define __DT_BINDINGS_MEMORY_MTK_MEMORY_PORT_H_
> > >
> > > >  #define MTK_LARB_NR_MAX         32
> > > +#define MTK_M4U_DOM_NR_MAX 8
> > > +
> > > +#define MTK_M4U_DOM_ID(domid, larb, port)  \
> > > +   (((domid) & 0x7) << 16 | (((larb) & 0x1f) << 5) | ((port) & 0x1f))
> > > +
> > > +/* The default dom id is 0. */
> > > +#define MTK_M4U_ID(larb, port) MTK_M4U_DOM_ID(0, larb, port)
> > >
> > > -#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
> > > >  #define MTK_M4U_TO_LARB(id)    (((id) >> 5) & 0x1f)
> > > >  #define MTK_M4U_TO_PORT(id)    ((id) & 0x1f)
> > > +#define MTK_M4U_TO_DOM(id) (((id) >> 16) & 0x7)
> > >
> > >  #endif
> > > --
> > > 2.18.0
> > >
> > > ___
> > > iommu mailing list
> > > io...@lists.linux-foundation.org
> > > https://lists.linuxfoundation.org/mailman/listinfo/iommu
>
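
For reference, the ID packing introduced by the quoted patch can be tried out in user space. The sketch below copies the macros from the diff above; the m4u_encode() and m4u_decode() helpers and struct m4u_port_id are hypothetical additions for illustration only and are not part of the patch. It packs the CCU0 example from the thread (larb13, port 9) with domain 3 and decodes it again.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as they appear in the quoted patch (mtk-smi-larb-port.h). */
#define MTK_LARB_NR_MAX		32
#define MTK_M4U_DOM_NR_MAX	8

#define MTK_M4U_DOM_ID(domid, larb, port) \
	(((domid) & 0x7) << 16 | (((larb) & 0x1f) << 5) | ((port) & 0x1f))

/* The default dom id is 0. */
#define MTK_M4U_ID(larb, port)	MTK_M4U_DOM_ID(0, larb, port)

#define MTK_M4U_TO_LARB(id)	(((id) >> 5) & 0x1f)
#define MTK_M4U_TO_PORT(id)	((id) & 0x1f)
#define MTK_M4U_TO_DOM(id)	(((id) >> 16) & 0x7)

/* Hypothetical helper type: decoded view of one packed ID. */
struct m4u_port_id {
	unsigned int dom;
	unsigned int larb;
	unsigned int port;
};

/*
 * Hypothetical helper: the macro silently masks out-of-range values,
 * so check the ranges explicitly before packing.
 */
static uint32_t m4u_encode(unsigned int dom, unsigned int larb,
			   unsigned int port)
{
	assert(dom < MTK_M4U_DOM_NR_MAX && larb < MTK_LARB_NR_MAX && port < 32);
	return MTK_M4U_DOM_ID(dom, larb, port);
}

/* Hypothetical helper: unpack all three fields at once. */
static struct m4u_port_id m4u_decode(uint32_t id)
{
	struct m4u_port_id p = {
		.dom = MTK_M4U_TO_DOM(id),
		.larb = MTK_M4U_TO_LARB(id),
		.port = MTK_M4U_TO_PORT(id),
	};
	return p;
}
```

With domain 0, MTK_M4U_DOM_ID() yields the same value as the old MTK_M4U_ID(larb, port) for in-range arguments, because the domain field sits above the larb and port bits.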


Re: [RFC PATCH v3 0/6] Restricted DMA

2021-01-12 Thread Tomasz Figa
On Wed, Jan 13, 2021 at 12:56 PM Florian Fainelli  wrote:
>
>
>
> On 1/12/2021 6:29 PM, Tomasz Figa wrote:
> > Hi Florian,
> >
> > On Wed, Jan 13, 2021 at 3:01 AM Florian Fainelli  
> > wrote:
> >>
> >> On 1/11/21 11:48 PM, Claire Chang wrote:
> >>> On Fri, Jan 8, 2021 at 1:59 AM Florian Fainelli  
> >>> wrote:
> >>>>
> >>>> On 1/7/21 9:42 AM, Claire Chang wrote:
> >>>>
> >>>>>> Can you explain how ATF gets involved and to what extent it does help,
> >>>>>> besides enforcing a secure region from the ARM CPU's perspective? Does
> >>>>>> the PCIe root complex not have an IOMMU but can somehow be denied 
> >>>>>> access
> >>>>>> to a region that is marked NS=0 in the ARM CPU's MMU? If so, that is
> >>>>>> still some sort of basic protection that the HW enforces, right?
> >>>>>
> >>>>> We need the ATF support for memory MPU (memory protection unit).
> >>>>> Restricted DMA (with reserved-memory in dts) makes sure the predefined 
> >>>>> memory
> >>>>> region is for PCIe DMA only, but we still need MPU to locks down PCIe 
> >>>>> access to
> >>>>> that specific regions.
> >>>>
> >>>> OK so you do have a protection unit of some sort to enforce which region
> >>>> in DRAM the PCIE bridge is allowed to access, that makes sense,
> >>>> otherwise the restricted DMA region would only be a hint but nothing you
> >>>> can really enforce. This is almost entirely analogous to our systems 
> >>>> then.
> >>>
> >>> Here is the example of setting the MPU:
> >>> https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132
> >>>
> >>>>
> >>>> There may be some value in standardizing on an ARM SMCCC call then since
> >>>> you already support two different SoC vendors.
> >>>>
> >>>>>
> >>>>>>
> >>>>>> On Broadcom STB SoCs we have had something similar for a while however
> >>>>>> and while we don't have an IOMMU for the PCIe bridge, we do have a
> >>>>>> basic protection mechanism whereby we can configure a region in DRAM to
> >>>>>> be PCIe read/write and CPU read/write which then gets used as the PCIe
> >>>>>> inbound region for the PCIe EP. By default the PCIe bridge is not
> >>>>>> allowed access to DRAM so we must call into a security agent to allow
> >>>>>> the PCIe bridge to access the designated DRAM region.
> >>>>>>
> >>>>>> We have done this using a private CMA area region assigned via Device
> >>>>>> Tree, assigned with a and requiring the PCIe EP driver to use
> >>>>>> dma_alloc_from_contiguous() in order to allocate from this device
> >>>>>> private CMA area. The only drawback with that approach is that it
> >>>>>> requires knowing how much memory you need up front for buffers and DMA
> >>>>>> descriptors that the PCIe EP will need to process. The problem is that
> >>>>>> it requires driver modifications and that does not scale over the 
> >>>>>> number
> >>>>>> of PCIe EP drivers, some we absolutely do not control, but there is no
> >>>>>> need to bounce buffer. Your approach scales better across PCIe EP
> >>>>>> drivers however it does require bounce buffering which could be a
> >>>>>> performance hit.
> >>>>>
> >>>>> Only the streaming DMA (map/unmap) needs bounce buffering.
> >>>>
> >>>> True, and typically only on transmit since you don't really control
> >>>> where the sk_buff are allocated from, right? On RX since you need to
> >>>> hand buffer addresses to the WLAN chip prior to DMA, you can allocate
> >>>> them from a pool that already falls within the restricted DMA region, 
> >>>> right?
> >>>>
> >>>
> >>> Right, but applying bounce buffering to RX will make it more secure.
> >>> The device won't be able to modify the content after unmap. Just like what
> >>> iommu_unmap does.
> >>
> >> Sure, however the goals of using bounce buffe

Re: [RFC PATCH v3 0/6] Restricted DMA

2021-01-12 Thread Tomasz Figa
Hi Florian,

On Wed, Jan 13, 2021 at 3:01 AM Florian Fainelli  wrote:
>
> On 1/11/21 11:48 PM, Claire Chang wrote:
> > On Fri, Jan 8, 2021 at 1:59 AM Florian Fainelli  
> > wrote:
> >>
> >> On 1/7/21 9:42 AM, Claire Chang wrote:
> >>
> >>>> Can you explain how ATF gets involved and to what extent it does help,
> >>>> besides enforcing a secure region from the ARM CPU's perspective? Does
> >>>> the PCIe root complex not have an IOMMU but can somehow be denied access
> >>>> to a region that is marked NS=0 in the ARM CPU's MMU? If so, that is
> >>>> still some sort of basic protection that the HW enforces, right?
> >>>
> >>> We need the ATF support for memory MPU (memory protection unit).
> >>> Restricted DMA (with reserved-memory in dts) makes sure the predefined 
> >>> memory
> >>> region is for PCIe DMA only, but we still need MPU to locks down PCIe 
> >>> access to
> >>> that specific regions.
> >>
> >> OK so you do have a protection unit of some sort to enforce which region
> >> in DRAM the PCIE bridge is allowed to access, that makes sense,
> >> otherwise the restricted DMA region would only be a hint but nothing you
> >> can really enforce. This is almost entirely analogous to our systems then.
> >
> > Here is the example of setting the MPU:
> > https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132
> >
> >>
> >> There may be some value in standardizing on an ARM SMCCC call then since
> >> you already support two different SoC vendors.
> >>
> >>>
> >>>>
> >>>> On Broadcom STB SoCs we have had something similar for a while however
> >>>> and while we don't have an IOMMU for the PCIe bridge, we do have a
> >>>> basic protection mechanism whereby we can configure a region in DRAM to
> >>>> be PCIe read/write and CPU read/write which then gets used as the PCIe
> >>>> inbound region for the PCIe EP. By default the PCIe bridge is not
> >>>> allowed access to DRAM so we must call into a security agent to allow
> >>>> the PCIe bridge to access the designated DRAM region.
> >>>>
> >>>> We have done this using a private CMA area region assigned via Device
> >>>> Tree, assigned with a and requiring the PCIe EP driver to use
> >>>> dma_alloc_from_contiguous() in order to allocate from this device
> >>>> private CMA area. The only drawback with that approach is that it
> >>>> requires knowing how much memory you need up front for buffers and DMA
> >>>> descriptors that the PCIe EP will need to process. The problem is that
> >>>> it requires driver modifications and that does not scale over the number
> >>>> of PCIe EP drivers, some we absolutely do not control, but there is no
> >>>> need to bounce buffer. Your approach scales better across PCIe EP
> >>>> drivers however it does require bounce buffering which could be a
> >>>> performance hit.
> >>>
> >>> Only the streaming DMA (map/unmap) needs bounce buffering.
> >>
> >> True, and typically only on transmit since you don't really control
> >> where the sk_buff are allocated from, right? On RX since you need to
> >> hand buffer addresses to the WLAN chip prior to DMA, you can allocate
> >> them from a pool that already falls within the restricted DMA region, 
> >> right?
> >>
> >
> > Right, but applying bounce buffering to RX will make it more secure.
> > The device won't be able to modify the content after unmap. Just like what
> > iommu_unmap does.
>
> Sure, however the goals of using bounce buffering equally applies to RX
> and TX in that this is the only layer sitting between a stack (block,
> networking, USB, etc.) and the underlying device driver that scales well
> in order to massage a dma_addr_t to be within a particular physical range.
>
> There is however room for improvement if the drivers are willing to
> change their buffer allocation strategy. When you receive Wi-Fi frames
> you need to allocate buffers for the Wi-Fi device to DMA into, and that
> happens ahead of the DMA transfers by the Wi-Fi device. At buffer
> allocation time you could very well allocate these frames from the
> restricted DMA region without having to bounce buffer them since the
> host CPU is in control over where and when to DMA into.
>

That is, however, still a trade-off between saving that one copy and
protection from the DMA tampering with the packet contents when the
kernel is reading them. Notice how the copy effectively makes a
snapshot of the contents, guaranteeing that the kernel has a
consistent view of the packet, which is not true if the DMA could
modify the buffer contents in the middle of CPU accesses.

Best regards,
Tomasz
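
To make the snapshot argument concrete, here is a minimal user-space sketch. It is only a model: struct rx_packet, PKT_MAX and rx_snapshot() are hypothetical names, and the "shared" buffer stands in for memory the device can still write to. The point is that once the bytes are copied into a private buffer, later writes to the shared buffer no longer change what the parser sees.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PKT_MAX 64

/* Hypothetical private copy of one received packet. */
struct rx_packet {
	unsigned char data[PKT_MAX];
	size_t len;
};

/*
 * Bounce-style receive: copy the device-visible bytes into a private
 * buffer once, so that all later parsing works on a stable snapshot.
 * Returns 0 on success, -1 if the packet does not fit.
 */
static int rx_snapshot(struct rx_packet *pkt, const unsigned char *shared,
		       size_t len)
{
	assert(pkt && shared);

	if (len > PKT_MAX)
		return -1;

	memcpy(pkt->data, shared, len);
	pkt->len = len;
	return 0;
}
```

Parsing the shared buffer in place would save the memcpy(), but every field read would then race with a device that is still allowed to write there, which is the trade-off described above.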

> The issue is that each network driver may implement its own buffer
> allocation strategy, some may simply call netdev_alloc_skb() which gives
> zero control over where the buffer comes from unless you play tricks
> with NUMA node allocations and somehow declare that your restricted DMA
> region is a different NUMA node. If the driver allocates pages and then
> attaches a SKB to 

Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2021-01-08 Thread Tomasz Figa
On Wed, Dec 23, 2020 at 9:04 PM Helen Koike  wrote:
>
> Hi Tomasz,
>
> On 12/21/20 12:13 AM, Tomasz Figa wrote:
> > On Thu, Dec 17, 2020 at 10:20 PM Helen Koike  
> > wrote:
> >>
> >> Hi Tomasz,
> >>
> >> Thanks for your comments, I have a few questions below.
> >>
> >> On 12/16/20 12:13 AM, Tomasz Figa wrote:
> >>> On Tue, Dec 15, 2020 at 11:37 PM Helen Koike  
> >>> wrote:
> >>>>
> >>>> Hi Tomasz,
> >>>>
> >>>> On 12/14/20 7:46 AM, Tomasz Figa wrote:
> >>>>> On Fri, Dec 4, 2020 at 4:52 AM Helen Koike  
> >>>>> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Please see my 2 points below (about v4l2_ext_buffer and another about 
> >>>>>> timestamp).
> >>>>>>
> >>>>>> On 12/3/20 12:11 PM, Hans Verkuil wrote:
> >>>>>>> On 23/11/2020 18:40, Helen Koike wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 11/23/20 12:46 PM, Tomasz Figa wrote:
> >>>>>>>>> On Tue, Nov 24, 2020 at 12:08 AM Helen Koike 
> >>>>>>>>>  wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Hans,
> >>>>>>>>>>
> >>>>>>>>>> Thank you for your review.
> >>>>>>>>>>
> >>>>>>>>>> On 9/9/20 9:27 AM, Hans Verkuil wrote:
> >>>>>>>>>>> Hi Helen,
> >>>>>>>>>>>
> >>>>>>>>>>> Again I'm just reviewing the uAPI.
> >>>>>>>>>>>
> >>>>>>>>>>> On 04/08/2020 21:29, Helen Koike wrote:
> > [snip]
> >>>
> >>>>
> >>>> Output: userspace fills plane information, informing in which memory 
> >>>> buffer each
> >>>> plane was placed (Or should this be pre-determined by the 
> >>>> driver?)
> >>>>
> >>>> For MMAP
> >>>> ---
> >>>> userspace performs EXT_CREATE_BUF ioctl to reserve a buffer "index" 
> >>>> range in
> >>>> that mode, to be used in EXT_QBUF and EXT_DQBUF
> >>>>
> >>>> Should the API allow userspace to select how many memory buffers it 
> >>>> wants?
> >>>> (maybe not)
> >>>
> >>> I think it does allow that - it accepts the v4l2_ext_format struct.
> >>
> >> hmmm, I thought v4l2_ext_format would describe color planes, and not 
> >> memory planes.
> >> Should it describe memory planes instead? Since planes are defined by the 
> >> pixelformat.
> >> But is this information relevant to ext_{set/get/try} format?
> >>
> >
> > Good point. I ended up assuming the current convention, where giving
> > an M format would imply num_memory_planes == num_color_planes and
> > non-M format num_memory_planes == 1. Sounds like we might want
> > something like a flags field and that could have bits defined to
> > select that. I think it would actually be useful for S_FMT as well,
> > because that's what REQBUFS would use.
>
> Would this flag select between memory and color planes?
> I didn't understand how this flag would be useful to S_FMT, could you
> please clarify?

I mean a flag that decides the plane layout between the 2 possible
options (all planes in their own buffers at offsets 0 vs all planes in
one buffer one after another), rather than giving too much flexibility
for MMAP buffers, which isn't necessary anyway, because DMABUF can be
used if more flexibility is needed.

Best regards,
Tomasz
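
A rough sketch of the two layouts such a flag would select between, in plain C. All names here (enum plane_layout, struct plane_placement, place_planes()) are hypothetical and do not exist in the V4L2 uAPI or in the proposed series; alignment and padding, which real hardware would likely require, are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout selector; no such flag exists in V4L2 today. */
enum plane_layout {
	LAYOUT_SEPARATE_BUFFERS,	/* each plane in its own buffer, offset 0 */
	LAYOUT_SINGLE_BUFFER,		/* all planes back to back in one buffer */
};

/* Where one color plane ends up: which memory buffer, at which offset. */
struct plane_placement {
	unsigned int membuf_index;
	size_t offset;
};

/* Fill out[0..num_planes-1] for the chosen layout. */
static void place_planes(enum plane_layout layout, const size_t *sizes,
			 unsigned int num_planes, struct plane_placement *out)
{
	size_t offset = 0;
	unsigned int i;

	assert(sizes && out);

	for (i = 0; i < num_planes; i++) {
		if (layout == LAYOUT_SEPARATE_BUFFERS) {
			out[i].membuf_index = i;
			out[i].offset = 0;
		} else {
			out[i].membuf_index = 0;
			out[i].offset = offset;
			offset += sizes[i];
		}
	}
}
```

Anything beyond these two cases (for example two planes sharing one buffer and a third in its own) is what the DMABUF path would cover.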

>
> Thanks
> Helen
>
> >
> >>>
> >>>>
> >>>> userspace performs EXT_QUERY_MMAP_BUF to get the mmap offset/cookie and 
> >>>> length
> >>>> for each memory buffer.
> >>>>
> >>>> On EXT_QBUF, userspace doesn't need to fill membuf information. Should 
> >>>> the
> >>>> mmap offset and length be filled by the kernel and returned to userspace 
> >>>> here
> >>>> as well? I'm leaning towards: no.
> >>>
> >>> Yeah, based on my comment above, I think the answer should be no.
> >>>

Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once

2021-01-08 Thread Tomasz Figa
On Wed, Dec 23, 2020 at 8:00 PM Robin Murphy  wrote:
>
> On 2020-12-23 08:56, Tomasz Figa wrote:
> > On Wed, Dec 16, 2020 at 06:36:06PM +0800, Yong Wu wrote:
> >> In current iommu_unmap, this code is:
> >>
> >>  iommu_iotlb_gather_init(&iotlb_gather);
> >>  ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
> >>  iommu_iotlb_sync(domain, &iotlb_gather);
> >>
> >> We could gather the whole iova range in __iommu_unmap, and then do tlb
> >> synchronization in the iommu_iotlb_sync.
> >>
> >> This patch implement this, Gather the range in mtk_iommu_unmap.
> >> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
> >> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
> >> could be regardless of granule size.
> >>
> >> In this way, gather->start is impossible ULONG_MAX, remove the checking.
> >>
> >> This patch aims to do tlb synchronization *once* in the iommu_unmap.
> >>
> >> Signed-off-by: Yong Wu 
> >> ---
> >>   drivers/iommu/mtk_iommu.c | 8 +---
> >>   1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> >> index db7d43adb06b..89cec51405cd 100644
> >> --- a/drivers/iommu/mtk_iommu.c
> >> +++ b/drivers/iommu/mtk_iommu.c
> >> @@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain 
> >> *domain,
> >>struct iommu_iotlb_gather *gather)
> >>   {
> >>  struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >> +unsigned long long end = iova + size;
> >>
> >> +if (gather->start > iova)
> >> +gather->start = iova;
> >> +if (gather->end < end)
> >> +gather->end = end;
> >
> > I don't know how common the case is, but what happens if
> > gather->start...gather->end is a disjoint range from iova...end? E.g.
> >
> >   | gather      | ..XXX... | iova |
> >   |             |          |      |
> >   gather->start |          iova   |
> >            gather->end          end
> >
> > We would also end up invalidating the TLB for the XXX area, which could
> > affect the performance.
>
> Take a closer look at iommu_unmap() - the gather data is scoped to each
> individual call, so that can't possibly happen.
>
> > Also, why is the existing code in __arm_v7s_unmap() not enough? It seems
> > to call io_pgtable_tlb_add_page() already, so it should be batching the
> > flushes.
>
> Because if we leave io-pgtable in charge of maintenance it will also
> inject additional invalidations and syncs for the sake of strictly
> correct walk cache maintenance. Apparently we can get away without that
> on this hardware, so the fundamental purpose of this series is to
> sidestep it.
>
> It's proven to be cleaner overall to devolve this kind of "non-standard"
> TLB maintenance back to drivers rather than try to cram yet more
> special-case complexity into io-pgtable itself. I'm planning to clean up
> the remains of the TLBI_ON_MAP quirk entirely after this.

(Sorry, I sent an empty email accidentally.)

I see, thanks for clarifying. The patch looks good to me then.

Best regards,
Tomasz
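
The range widening from the quoted hunk can be modelled in user space to see both points made above. This is a simplified sketch: struct gather_range and the gather_*() helpers are hypothetical stand-ins for struct iommu_iotlb_gather and the driver code, and the exclusive end (iova + size) follows the quoted patch. With a freshly initialised range and a single unmap, the result is exactly [iova, iova + size); only if two disjoint ranges were added to the same gather would the gap between them be covered as well.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct iommu_iotlb_gather. */
struct gather_range {
	unsigned long start;
	unsigned long end;
};

/* Empty range: the first add always replaces both bounds. */
static void gather_init(struct gather_range *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
}

/* Same min/max widening as the quoted mtk_iommu_unmap() hunk. */
static void gather_add(struct gather_range *g, unsigned long iova,
		       size_t size)
{
	unsigned long end = iova + size;

	if (g->start > iova)
		g->start = iova;
	if (g->end < end)
		g->end = end;
}

/* Length that would be handed to the TLB flush; needs at least one add. */
static size_t gather_length(const struct gather_range *g)
{
	assert(g->end >= g->start);
	return g->end - g->start;
}
```

Since iommu_unmap() initialises the gather on every call, only the single-range case arises there.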

>
> Robin.
>
> >>  return dom->iop->unmap(dom->iop, iova, size, gather);
> >>   }
> >>
> >> @@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain 
> >> *domain,
> >>  struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >>  size_t length = gather->end - gather->start;
> >>
> >> -if (gather->start == ULONG_MAX)
> >> -return;
> >> -
> >>  mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
> >> dom->data);
> >>   }
> >> --
> >> 2.18.0
> >>
> >> ___
> >> iommu mailing list
> >> io...@lists.linux-foundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once

2021-01-08 Thread Tomasz Figa
On Wed, Dec 23, 2020 at 8:00 PM Robin Murphy  wrote:
>
> On 2020-12-23 08:56, Tomasz Figa wrote:
> > On Wed, Dec 16, 2020 at 06:36:06PM +0800, Yong Wu wrote:
> >> In current iommu_unmap, this code is:
> >>
> >>  iommu_iotlb_gather_init(&iotlb_gather);
> >>  ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
> >>  iommu_iotlb_sync(domain, &iotlb_gather);
> >>
> >> We could gather the whole iova range in __iommu_unmap, and then do tlb
> >> synchronization in the iommu_iotlb_sync.
> >>
> >> This patch implement this, Gather the range in mtk_iommu_unmap.
> >> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
> >> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
> >> could be regardless of granule size.
> >>
> >> In this way, gather->start is impossible ULONG_MAX, remove the checking.
> >>
> >> This patch aims to do tlb synchronization *once* in the iommu_unmap.
> >>
> >> Signed-off-by: Yong Wu 
> >> ---
> >>   drivers/iommu/mtk_iommu.c | 8 +---
> >>   1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> >> index db7d43adb06b..89cec51405cd 100644
> >> --- a/drivers/iommu/mtk_iommu.c
> >> +++ b/drivers/iommu/mtk_iommu.c
> >> @@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain 
> >> *domain,
> >>struct iommu_iotlb_gather *gather)
> >>   {
> >>  struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >> +unsigned long long end = iova + size;
> >>
> >> +if (gather->start > iova)
> >> +gather->start = iova;
> >> +if (gather->end < end)
> >> +gather->end = end;
> >
> > I don't know how common the case is, but what happens if
> > gather->start...gather->end is a disjoint range from iova...end? E.g.
> >
> >   |     gather      | ..XXX... |     iova     |
> >   |                 |          |              |
> >   gather->start     |          iova           |
> >                     gather->end               end
> >
> > We would also end up invalidating the TLB for the XXX area, which could
> > affect the performance.
>
> Take a closer look at iommu_unmap() - the gather data is scoped to each
> individual call, so that can't possibly happen.
>
> > Also, why is the existing code in __arm_v7s_unmap() not enough? It seems
> > to call io_pgtable_tlb_add_page() already, so it should be batching the
> > flushes.
>
> Because if we leave io-pgtable in charge of maintenance it will also
> inject additional invalidations and syncs for the sake of strictly
> correct walk cache maintenance. Apparently we can get away without that
> on this hardware, so the fundamental purpose of this series is to
> sidestep it.
>
> It's proven to be cleaner overall to devolve this kind of "non-standard"
> TLB maintenance back to drivers rather than try to cram yet more
> special-case complexity into io-pgtable itself. I'm planning to clean up
> the remains of the TLBI_ON_MAP quirk entirely after this.
>
> Robin.
>
> >>  return dom->iop->unmap(dom->iop, iova, size, gather);
> >>   }
> >>
> >> @@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain 
> >> *domain,
> >>  struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >>  size_t length = gather->end - gather->start;
> >>
> >> -if (gather->start == ULONG_MAX)
> >> -return;
> >> -
> >>  mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
> >> dom->data);
> >>   }
> >> --
> >> 2.18.0
> >>
> >> ___
> >> iommu mailing list
> >> io...@lists.linux-foundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 18/27] iommu/mediatek: Add power-domain operation

2021-01-08 Thread Tomasz Figa
On Tue, Dec 29, 2020 at 8:06 PM Yong Wu  wrote:
>
> On Wed, 2020-12-23 at 17:36 +0900, Tomasz Figa wrote:
> > On Wed, Dec 09, 2020 at 04:00:53PM +0800, Yong Wu wrote:
> > > In the previous SoC, the M4U HW is in the EMI power domain which is
> > > always on. the latest M4U is in the display power domain which may be
> > > turned on/off, thus we have to add pm_runtime interface for it.
> > >
> > > When the engine work, the engine always enable the power and clocks for
> > > smi-larb/smi-common, then the M4U's power will always be powered on
> > > automatically via the device link with smi-common.
> > >
> > > Note: we don't enable the M4U power in iommu_map/unmap for tlb flush.
> > > If its power already is on, of course it is ok. if the power is off,
> > > the main tlb will be reset while M4U power on, thus the tlb flush while
> > > m4u power off is unnecessary, just skip it.
> > >
> > > > There will be one case that pm runtime status is not expected when tlb
> > > flush. After boot, the display may call dma_alloc_attrs before it call
> > > pm_runtime_get(disp-dev), then the m4u's pm status is not active inside
> > > the dma_alloc_attrs. Since it only happens after boot, the tlb is clean
> > > at that time, I also think this is ok.
> > >
> > > Signed-off-by: Yong Wu 
> > > ---
> > >  drivers/iommu/mtk_iommu.c | 41 +--
> > >  1 file changed, 35 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > index 6fe3ee2b2bf5..0e9c03cbab32 100644
> > > --- a/drivers/iommu/mtk_iommu.c
> > > +++ b/drivers/iommu/mtk_iommu.c
> > > @@ -184,6 +184,8 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
> > > struct mtk_iommu_data *data = cookie;
> > >
> > > for_each_m4u(data) {
> > > +   if (!pm_runtime_active(data->dev))
> > > +   continue;
> >
> > > Is it guaranteed that the device stays active after this check? The
> > > status could be active here, but the process might then be preempted and
> > > the device powered down before the writes below.
> >
> > Shouldn't we do something like below?
> >
> > >         ret = pm_runtime_get_if_active();
> > >         if (!ret)
> > >                 continue;
> > >         if (ret < 0)
> > >                 // handle error
> > >
> > >         // Flush
> > >
> > >         pm_runtime_put();
>
> Make sense. Thanks. There is a comment in arm_smmu.c "avoid touching
> dev->power.lock in fastpaths". To avoid this here too(we have many SoC
> don't have power-domain). then the code will be like:
>
>         bool has_pm = !!data->dev->pm_domain;
>
>         if (has_pm) {
>                 if (pm_runtime_get_if_in_use(data->dev) <= 0)
>                         continue;
>         }
>
> 
>
>         if (has_pm)
>                 pm_runtime_put(data->dev);

Looks good to me, thanks.

> >
> > Similar comment to the other places being changed by this patch.
> >
> > > writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
> > >data->base + data->plat_data->inv_sel_reg);
> > > writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
> > > @@ -200,6 +202,10 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned 
> > > long iova, size_t size,
> > > u32 tmp;
> > >
> > > for_each_m4u(data) {
> > > +   /* skip tlb flush when pm is not active. */
> > > +   if (!pm_runtime_active(data->dev))
> > > +   continue;
> > > +
> > > > spin_lock_irqsave(&data->tlb_lock, flags);
> > > writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
> > >data->base + data->plat_data->inv_sel_reg);
> [snip]


Re: [PATCH v2] media: ov8856: Fix Bayer format dependance on mode

2021-01-08 Thread Tomasz Figa
Hi Robert,

On Thu, Jan 7, 2021 at 11:21 PM Robert Foss  wrote:
>
> The Bayer GRBG10 mode used for earlier modes 3280x2460 and
> 1640x1232 isn't the mode output by the sensor for the
> 3264x2448 and 1632x1224 modes.
>
> Switch from MEDIA_BUS_FMT_SGRBG10_1X10 to MEDIA_BUS_FMT_SBGGR10_1X10
> for 3264x2448 & 1632x1224 modes.
>
> Signed-off-by: Robert Foss 
> ---
>
> Changes since v1:
>  - Sakari: Added mode information to ov8856_mode struct
>  - Sakari: enum_mbus_code updated
>
>  drivers/media/i2c/ov8856.c | 24 ++--
>  1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/media/i2c/ov8856.c b/drivers/media/i2c/ov8856.c
> index 2f4ceaa80593..7cd83564585c 100644
> --- a/drivers/media/i2c/ov8856.c
> +++ b/drivers/media/i2c/ov8856.c
> @@ -126,6 +126,9 @@ struct ov8856_mode {
>
> /* Sensor register settings for this resolution */
> const struct ov8856_reg_list reg_list;
> +
> +   /* MEDIA_BUS_FMT for this mode */
> +   u32 code;
>  };
>
>  static const struct ov8856_reg mipi_data_rate_720mbps[] = {
> @@ -942,6 +945,11 @@ static const char * const ov8856_test_pattern_menu[] = {
> "Bottom-Top Darker Color Bar"
>  };
>
> +static const u32 ov8856_formats[] = {
> +   MEDIA_BUS_FMT_SBGGR10_1X10,
> +   MEDIA_BUS_FMT_SGRBG10_1X10,
> +};
> +
>  static const s64 link_freq_menu_items[] = {
> OV8856_LINK_FREQ_360MHZ,
> OV8856_LINK_FREQ_180MHZ
> @@ -974,6 +982,7 @@ static const struct ov8856_mode supported_modes[] = {
> .regs = mode_3280x2464_regs,
> },
> .link_freq_index = OV8856_LINK_FREQ_720MBPS,
> +   .code = MEDIA_BUS_FMT_SGRBG10_1X10,
> },
> {
> .width = 3264,
> @@ -986,6 +995,7 @@ static const struct ov8856_mode supported_modes[] = {
> .regs = mode_3264x2448_regs,
> },
> .link_freq_index = OV8856_LINK_FREQ_720MBPS,
> +   .code = MEDIA_BUS_FMT_SBGGR10_1X10,
> },
> {
> .width = 1640,
> @@ -998,6 +1008,7 @@ static const struct ov8856_mode supported_modes[] = {
> .regs = mode_1640x1232_regs,
> },
> .link_freq_index = OV8856_LINK_FREQ_360MBPS,
> +   .code = MEDIA_BUS_FMT_SGRBG10_1X10,
> },
> {
> .width = 1632,
> @@ -1010,6 +1021,7 @@ static const struct ov8856_mode supported_modes[] = {
> .regs = mode_1632x1224_regs,
> },
> .link_freq_index = OV8856_LINK_FREQ_360MBPS,
> +   .code = MEDIA_BUS_FMT_SBGGR10_1X10,
> }
>  };
>
> @@ -1281,8 +1293,8 @@ static void ov8856_update_pad_format(const struct 
> ov8856_mode *mode,
>  {
> fmt->width = mode->width;
> fmt->height = mode->height;
> -   fmt->code = MEDIA_BUS_FMT_SGRBG10_1X10;
> fmt->field = V4L2_FIELD_NONE;
> +   fmt->code = mode->code;
>  }
>
>  static int ov8856_start_streaming(struct ov8856 *ov8856)
> @@ -1519,11 +1531,10 @@ static int ov8856_enum_mbus_code(struct v4l2_subdev 
> *sd,
>  struct v4l2_subdev_pad_config *cfg,
>  struct v4l2_subdev_mbus_code_enum *code)
>  {
> -   /* Only one bayer order GRBG is supported */
> -   if (code->index > 0)
> +   if (code->index >= ARRAY_SIZE(ov8856_formats))
> return -EINVAL;
>
> -   code->code = MEDIA_BUS_FMT_SGRBG10_1X10;
> +   code->code = ov8856_formats[code->index];
>
> return 0;
>  }
> @@ -1532,10 +1543,11 @@ static int ov8856_enum_frame_size(struct v4l2_subdev 
> *sd,
>   struct v4l2_subdev_pad_config *cfg,
>   struct v4l2_subdev_frame_size_enum *fse)
>  {
> -   if (fse->index >= ARRAY_SIZE(supported_modes))
> +   if ((fse->code != ov8856_formats[0]) &&
> +   (fse->code != ov8856_formats[1]))

Shouldn't this be validated against the current mode? I guess it's the
question about which part of the state takes precedence - the mbus
code or the frame size.

Best regards,
Tomasz
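
To make the precedence question concrete, here is a minimal userspace sketch of one possible answer: let the mbus code take precedence and enumerate only the frame sizes whose mode carries that code. The struct, the FMT_* constants and enum_frame_size() below are hypothetical stand-ins, not the driver's code, and the mode list is only what is visible in the hunks above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two media bus codes; values are arbitrary. */
enum { FMT_SGRBG10 = 1, FMT_SBGGR10 = 2 };

struct mode {
	uint32_t width;
	uint32_t height;
	uint32_t code;	/* Bayer order produced by this mode */
};

/* Mode table mirroring the sizes and codes quoted in the patch. */
static const struct mode modes[] = {
	{ 3280, 2464, FMT_SGRBG10 },
	{ 3264, 2448, FMT_SBGGR10 },
	{ 1640, 1232, FMT_SGRBG10 },
	{ 1632, 1224, FMT_SBGGR10 },
};

/*
 * Return the index-th mode whose code matches, or NULL when the index runs
 * past the matching modes (a driver would return -EINVAL at that point).
 */
static const struct mode *enum_frame_size(uint32_t code, unsigned int index)
{
	size_t i;

	for (i = 0; i < sizeof(modes) / sizeof(modes[0]); i++) {
		if (modes[i].code != code)
			continue;
		if (index-- == 0)
			return &modes[i];
	}
	return NULL;
}
```

With this reading, asking for SBGGR10 yields only 3264x2448 and 1632x1224. Validating against the currently selected mode instead would be the other option; the sketch does not argue for either.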


Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once

2020-12-23 Thread Tomasz Figa
On Wed, Dec 16, 2020 at 06:36:06PM +0800, Yong Wu wrote:
> In current iommu_unmap, this code is:
> 
>   iommu_iotlb_gather_init(&iotlb_gather);
>   ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
>   iommu_iotlb_sync(domain, &iotlb_gather);
> 
> We could gather the whole iova range in __iommu_unmap, and then do tlb
> synchronization in the iommu_iotlb_sync.
> 
> This patch implement this, Gather the range in mtk_iommu_unmap.
> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
> could be regardless of granule size.
> 
> In this way, gather->start is impossible ULONG_MAX, remove the checking.
> 
> This patch aims to do tlb synchronization *once* in the iommu_unmap.
> 
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index db7d43adb06b..89cec51405cd 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain 
> *domain,
> struct iommu_iotlb_gather *gather)
>  {
>   struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> + unsigned long long end = iova + size;
>  
> + if (gather->start > iova)
> + gather->start = iova;
> + if (gather->end < end)
> + gather->end = end;

I don't know how common the case is, but what happens if
gather->start...gather->end is a disjoint range from iova...end? E.g.

   |     gather      | ..XXX... |     iova     |
   |                 |          |              |
   gather->start     |          iova           |
                     gather->end               end

We would also end up invalidating the TLB for the XXX area, which could
affect the performance.

Also, why is the existing code in __arm_v7s_unmap() not enough? It seems
to call io_pgtable_tlb_add_page() already, so it should be batching the
flushes.
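
To illustrate the disjoint-range concern, here is a small userspace model of the min/max merge in the hunk above. The struct and helper names are hypothetical and only mirror the start/end fields. Whether two disjoint ranges can ever reach the same gather depends on how the caller scopes it, which this sketch does not model.

```c
#include <limits.h>

/* Simplified userspace model of the start/end fields of a gather. */
struct gather_model {
	unsigned long start;	/* lowest address gathered so far */
	unsigned long end;	/* upper bound gathered so far */
};

/* Empty gather: start at the maximum, end at zero. */
static void gather_init(struct gather_model *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
}

/* Same min/max merge as in the quoted mtk_iommu_unmap() hunk. */
static void gather_add(struct gather_model *g, unsigned long iova,
		       unsigned long size)
{
	unsigned long end = iova + size;

	if (g->start > iova)
		g->start = iova;
	if (g->end < end)
		g->end = end;
}

/* Number of bytes a single flush of [start, end) would cover. */
static unsigned long gather_span(const struct gather_model *g)
{
	return g->end - g->start;
}
```

Adding 0x1000..0x2000 and then 0x5000..0x6000 to one gather gives a span of 0x5000 bytes, although only 0x2000 bytes were unmapped; the gap in between is the XXX area in the diagram.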

>   return dom->iop->unmap(dom->iop, iova, size, gather);
>  }
>  
> @@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain 
> *domain,
>   struct mtk_iommu_domain *dom = to_mtk_domain(domain);
>   size_t length = gather->end - gather->start;
>  
> - if (gather->start == ULONG_MAX)
> - return;
> -
>   mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
>  dom->data);
>  }
> -- 
> 2.18.0
> 
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 18/27] iommu/mediatek: Add power-domain operation

2020-12-23 Thread Tomasz Figa
On Wed, Dec 09, 2020 at 04:00:53PM +0800, Yong Wu wrote:
> In the previous SoC, the M4U HW is in the EMI power domain which is
> always on. the latest M4U is in the display power domain which may be
> turned on/off, thus we have to add pm_runtime interface for it.
> 
> When the engine work, the engine always enable the power and clocks for
> smi-larb/smi-common, then the M4U's power will always be powered on
> automatically via the device link with smi-common.
> 
> Note: we don't enable the M4U power in iommu_map/unmap for tlb flush.
> If its power already is on, of course it is ok. if the power is off,
> the main tlb will be reset while M4U power on, thus the tlb flush while
> m4u power off is unnecessary, just skip it.
> 
> There will be one case that pm runtime status is not expected when tlb
> flush. After boot, the display may call dma_alloc_attrs before it call
> pm_runtime_get(disp-dev), then the m4u's pm status is not active inside
> the dma_alloc_attrs. Since it only happens after boot, the tlb is clean
> at that time, I also think this is ok.
> 
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 41 +--
>  1 file changed, 35 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 6fe3ee2b2bf5..0e9c03cbab32 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -184,6 +184,8 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
>   struct mtk_iommu_data *data = cookie;
>  
>   for_each_m4u(data) {
> + if (!pm_runtime_active(data->dev))
> + continue;

Is it guaranteed that the device stays active after this check? The
status could be active here, but the process might then be preempted and
the device powered down before the writes below.

Shouldn't we do something like below?

        ret = pm_runtime_get_if_active();
        if (!ret)
                continue;
        if (ret < 0)
                // handle error

        // Flush

        pm_runtime_put();

Similar comment to the other places being changed by this patch.
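
As a rough illustration of why taking a reference closes the window, here is a userspace model of the pattern. It is not the kernel runtime PM API: struct pm_model and the model_* helpers are hypothetical and only mimic the idea that a held usage count keeps a suspend request from succeeding.

```c
#include <stdbool.h>

/* Userspace stand-in for a device's runtime PM state; not the kernel API. */
struct pm_model {
	bool active;	/* powered and accessible */
	int usage;	/* outstanding references that block suspend */
};

/* Take a reference only if the device is already active. */
static int model_get_if_active(struct pm_model *pm)
{
	if (!pm->active)
		return 0;
	pm->usage++;
	return 1;
}

/* Drop a reference taken by model_get_if_active(). */
static void model_put(struct pm_model *pm)
{
	pm->usage--;
}

/* A suspend request succeeds only when nobody holds a reference. */
static bool model_try_suspend(struct pm_model *pm)
{
	if (pm->usage > 0)
		return false;
	pm->active = false;
	return true;
}

/* Flush pattern: skip when powered off, otherwise pin power across the write. */
static bool model_flush(struct pm_model *pm)
{
	if (!model_get_if_active(pm))
		return false;	/* per the commit message, the TLB is reset on power-on */
	/* Register writes would go here; a suspend request cannot succeed now. */
	model_put(pm);
	return true;
}
```

With a bare status check there is nothing like the usage count to hold the device up between the check and the register writes, which is the gap the question above is about.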

>   writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
>  data->base + data->plat_data->inv_sel_reg);
>   writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
> @@ -200,6 +202,10 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
> iova, size_t size,
>   u32 tmp;
>  
>   for_each_m4u(data) {
> + /* skip tlb flush when pm is not active. */
> + if (!pm_runtime_active(data->dev))
> + continue;
> +
>   spin_lock_irqsave(&data->tlb_lock, flags);
>   writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
>  data->base + data->plat_data->inv_sel_reg);
> @@ -384,6 +390,8 @@ static int mtk_iommu_attach_device(struct iommu_domain 
> *domain,
>  {
>   struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
>   struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> + struct device *m4udev = data->dev;
> + bool pm_enabled = pm_runtime_enabled(m4udev);
>   int ret;
>  
>   if (!data)
> @@ -391,12 +399,25 @@ static int mtk_iommu_attach_device(struct iommu_domain 
> *domain,
>  
>   /* Update the pgtable base address register of the M4U HW */
>   if (!data->m4u_dom) {
> + if (pm_enabled) {
> + ret = pm_runtime_get_sync(m4udev);
> + if (ret < 0) {
> + pm_runtime_put_noidle(m4udev);
> + return ret;
> + }
> + }
>   ret = mtk_iommu_hw_init(data);
> - if (ret)
> + if (ret) {
> + if (pm_enabled)
> + pm_runtime_put(m4udev);
>   return ret;
> + }
>   data->m4u_dom = dom;
>   writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
>  data->base + REG_MMU_PT_BASE_ADDR);
> +
> + if (pm_enabled)
> + pm_runtime_put(m4udev);
>   }
>  
>   mtk_iommu_config(data, dev, true);
> @@ -747,10 +768,13 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>   if (dev->pm_domain) {
>   struct device_link *link;
>  
> + pm_runtime_enable(dev);
> +
>   link = device_link_add(data->smicomm_dev, dev,
>  DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);
>   if (!link) {
>   dev_err(dev, "Unable link %s.\n", 
> dev_name(data->smicomm_dev));
> + pm_runtime_disable(dev);
>   return -EINVAL;
>   }
>   }
> @@ -785,8 +809,10 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>  out_sysfs_remove:
>   iommu_device_sysfs_remove(&data->iommu);
>  out_link_remove:
> - if (dev->pm_domain)
> + if (dev->pm_domain) {
>   

Re: [PATCH v5 17/27] iommu/mediatek: Add pm runtime callback

2020-12-23 Thread Tomasz Figa
On Wed, Dec 09, 2020 at 04:00:52PM +0800, Yong Wu wrote:
> This patch adds pm runtime callback.
> 
> In pm runtime case, all the registers backup/restore and bclk are
> controlled in the pm_runtime callback, then pm_suspend is not needed in
> this case.
> 
> runtime PM is disabled when suspend, thus we call
> pm_runtime_status_suspended instead of pm_runtime_suspended.
> 
> And, m4u doesn't have its special pm runtime domain in previous SoC, in
> this case dev->power.runtime_status is RPM_SUSPENDED defaultly,

This sounds wrong and could lead to hard-to-debug errors when the driver
is changed in the future. Would it be possible to make the behavior
consistent across the SoCs instead, so that the runtime PM status is
ACTIVE when needed, even on SoCs without an IOMMU PM domain?

> thus add
> a "dev->pm_domain" checking for the SoC that has pm runtime domain.
> 
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 22 --
>  1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 5614015e5b96..6fe3ee2b2bf5 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -808,7 +808,7 @@ static int mtk_iommu_remove(struct platform_device *pdev)
>   return 0;
>  }
>  
> -static int __maybe_unused mtk_iommu_suspend(struct device *dev)
> +static int __maybe_unused mtk_iommu_runtime_suspend(struct device *dev)
>  {
>   struct mtk_iommu_data *data = dev_get_drvdata(dev);
>   struct mtk_iommu_suspend_reg *reg = &data->reg;
> @@ -826,7 +826,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device 
> *dev)
>   return 0;
>  }
>  
> -static int __maybe_unused mtk_iommu_resume(struct device *dev)
> +static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
>  {
>   struct mtk_iommu_data *data = dev_get_drvdata(dev);
>   struct mtk_iommu_suspend_reg *reg = &data->reg;
> @@ -853,7 +853,25 @@ static int __maybe_unused mtk_iommu_resume(struct device 
> *dev)
>   return 0;
>  }
>  
> +static int __maybe_unused mtk_iommu_suspend(struct device *dev)
> +{
> + /* runtime PM is disabled when suspend in pm_runtime case. */
> + if (dev->pm_domain && pm_runtime_status_suspended(dev))
> + return 0;
> +
> + return mtk_iommu_runtime_suspend(dev);
> +}
> +
> +static int __maybe_unused mtk_iommu_resume(struct device *dev)
> +{
> + if (dev->pm_domain && pm_runtime_status_suspended(dev))
> + return 0;
> +
> + return mtk_iommu_runtime_resume(dev);
> +}

Wouldn't it be enough to just use pm_runtime_force_suspend() and
pm_runtime_force_resume() as system sleep ops?

> +
>  static const struct dev_pm_ops mtk_iommu_pm_ops = {
> + SET_RUNTIME_PM_OPS(mtk_iommu_runtime_suspend, mtk_iommu_runtime_resume, 
> NULL)
>   SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(mtk_iommu_suspend, mtk_iommu_resume)
>  };
>  
> -- 
> 2.18.0
> 
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 16/27] iommu/mediatek: Add device link for smi-common and m4u

2020-12-23 Thread Tomasz Figa
On Wed, Dec 09, 2020 at 04:00:51PM +0800, Yong Wu wrote:
> In the latest SoC, M4U has its special power domain. thus, If the engine
> begin to work, it should help enable the power for M4U firstly.
> Currently if the engine work, it always enable the power/clocks for
> smi-larbs/smi-common. This patch adds device_link for smi-common and M4U.
> then, if smi-common power is enabled, the M4U power also is powered on
> automatically.
> 
> Normally M4U connect with several smi-larbs and their smi-common always
> are the same, In this patch it get smi-common dev from the first smi-larb
> device(i==0), then add the device_link only while m4u has power-domain.
> 
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 30 --
>  drivers/iommu/mtk_iommu.h |  1 +
>  2 files changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 09c8c58feb78..5614015e5b96 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -706,7 +707,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>   return larb_nr;
>  
>   for (i = 0; i < larb_nr; i++) {
> - struct device_node *larbnode;
> + struct device_node *larbnode, *smicomm_node;
>   struct platform_device *plarbdev;
>   u32 id;
>  
> @@ -732,6 +733,26 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>  
>   component_match_add_release(dev, &match, release_of,
>   compare_of, larbnode);
> + if (i != 0)
> + continue;

How about using the last larb instead and moving the code below outside
of the loop?

> + smicomm_node = of_parse_phandle(larbnode, "mediatek,smi", 0);
> + if (!smicomm_node)
> + return -EINVAL;
> +
> + plarbdev = of_find_device_by_node(smicomm_node);
> + of_node_put(smicomm_node);
> + data->smicomm_dev = &plarbdev->dev;
> + }
> +
> + if (dev->pm_domain) {
> + struct device_link *link;
> +
> + link = device_link_add(data->smicomm_dev, dev,
> +DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);
> + if (!link) {
> + dev_err(dev, "Unable link %s.\n", 
> dev_name(data->smicomm_dev));
> + return -EINVAL;
> + }
>   }
>  
>   platform_set_drvdata(pdev, data);
> @@ -739,7 +760,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>   ret = iommu_device_sysfs_add(&data->iommu, dev, NULL,
>                                "mtk-iommu.%pa", &ioaddr);
>   if (ret)
> - return ret;
> + goto out_link_remove;
>  
>   iommu_device_set_ops(&data->iommu, &mtk_iommu_ops);
>   iommu_device_set_fwnode(&data->iommu, &pdev->dev.of_node->fwnode);
> @@ -763,6 +784,9 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>   iommu_device_unregister(&data->iommu);
>  out_sysfs_remove:
>   iommu_device_sysfs_remove(&data->iommu);
> +out_link_remove:
> + if (dev->pm_domain)
> + device_link_remove(data->smicomm_dev, dev);
>   return ret;
>  }
>  
> @@ -777,6 +801,8 @@ static int mtk_iommu_remove(struct platform_device *pdev)
>   bus_set_iommu(&platform_bus_type, NULL);
>  
>   clk_disable_unprepare(data->bclk);
> + if (pdev->dev.pm_domain)
> + device_link_remove(data->smicomm_dev, &pdev->dev);
>   devm_free_irq(&pdev->dev, data->irq, data);
>   component_master_del(&pdev->dev, &mtk_iommu_com_ops);
>   return 0;
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index d0c93652bdbe..5e03a029c4dc 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -68,6 +68,7 @@ struct mtk_iommu_data {
>  
>   struct iommu_device iommu;
>   const struct mtk_iommu_plat_data *plat_data;
> + struct device   *smicomm_dev;
>  
>   struct dma_iommu_mapping*mapping; /* For mtk_iommu_v1.c */
>  
> -- 
> 2.18.0
> 
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 15/27] iommu/mediatek: Add fail handle for sysfs_add and device_register

2020-12-23 Thread Tomasz Figa
On Wed, Dec 09, 2020 at 04:00:50PM +0800, Yong Wu wrote:
> Add fail handle for iommu_device_sysfs_add and iommu_device_register.
> 
> Fixes: b16c0170b53c ("iommu/mediatek: Make use of iommu_device_register 
> interface")
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 39478cfbe0f1..09c8c58feb78 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -746,7 +746,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>  
>   ret = iommu_device_register(&data->iommu);
>   if (ret)
> - return ret;
> + goto out_sysfs_remove;
>  
>   spin_lock_init(&data->tlb_lock);
>   list_add_tail(&data->list, &m4ulist);
> @@ -754,7 +754,16 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>   if (!iommu_present(&platform_bus_type))
>   bus_set_iommu(&platform_bus_type, &mtk_iommu_ops);
>  
> - return component_master_add_with_match(dev, &mtk_iommu_com_ops, match);
> + ret = component_master_add_with_match(dev, &mtk_iommu_com_ops, match);
> + if (ret)
> + goto out_dev_unreg;
> + return ret;
> +
> +out_dev_unreg:

Shouldn't the other operations be undone as well? I can see that
bus_set_iommu() is called and an entry is added to m4ulist above.
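
A minimal userspace sketch of what a full unwind could look like, using flags in place of the real calls so the effect is observable. The struct, the label out_bus_unset and the fail_at parameter are hypothetical. Whether the bus ops should really be cleared depends on whether another instance installed them first, which the sketch ignores.

```c
#include <stdbool.h>

/* Hypothetical probe steps, tracked as flags so the unwinding is observable. */
struct probe_state {
	bool sysfs_added;
	bool registered;
	bool on_list;
	bool bus_ops_set;
};

/* fail_at selects which step fails (1..3); 0 means every step succeeds. */
static int model_probe(struct probe_state *s, int fail_at)
{
	int ret = -1;

	if (fail_at == 1)		/* sysfs add fails: nothing to undo */
		return ret;
	s->sysfs_added = true;

	if (fail_at == 2)		/* device registration fails */
		goto out_sysfs_remove;
	s->registered = true;

	s->on_list = true;		/* stands in for the list_add_tail() */
	s->bus_ops_set = true;		/* stands in for the bus_set_iommu() */

	if (fail_at == 3)		/* adding the component master fails */
		goto out_bus_unset;
	return 0;

	/* Undo in reverse order of setup; each label falls through to the next. */
out_bus_unset:
	s->bus_ops_set = false;
	s->on_list = false;
	s->registered = false;
out_sysfs_remove:
	s->sysfs_added = false;
	return ret;
}
```

After any failing step, every flag is back to false, which is the property the question above asks about for the real probe path.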

> + iommu_device_unregister(&data->iommu);
> +out_sysfs_remove:
> + iommu_device_sysfs_remove(&data->iommu);
> + return ret;
>  }
>  
>  static int mtk_iommu_remove(struct platform_device *pdev)
> -- 
> 2.18.0
> 
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 09/27] iommu/io-pgtable-arm-v7s: Extend PA34 for MediaTek

2020-12-23 Thread Tomasz Figa
On Wed, Dec 09, 2020 at 04:00:44PM +0800, Yong Wu wrote:
> MediaTek extend the bit5 in lvl1 and lvl2 descriptor as PA34.
> 
> Signed-off-by: Yong Wu 
> Acked-by: Will Deacon 
> Reviewed-by: Robin Murphy 
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 9 +++--
>  drivers/iommu/mtk_iommu.c  | 2 +-
>  include/linux/io-pgtable.h | 4 ++--
>  3 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> b/drivers/iommu/io-pgtable-arm-v7s.c
> index e880745ab1e8..4d0aa079470f 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -112,9 +112,10 @@
>  #define ARM_V7S_TEX_MASK 0x7
>  #define ARM_V7S_ATTR_TEX(val)            (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
>  
> -/* MediaTek extend the two bits for PA 32bit/33bit */
> +/* MediaTek extend the bits below for PA 32bit/33bit/34bit */
>  #define ARM_V7S_ATTR_MTK_PA_BIT32        BIT(9)
>  #define ARM_V7S_ATTR_MTK_PA_BIT33        BIT(4)
> +#define ARM_V7S_ATTR_MTK_PA_BIT34        BIT(5)
>  
>  /* *well, except for TEX on level 2 large pages, of course :( */
>  #define ARM_V7S_CONT_PAGE_TEX_SHIFT  6
> @@ -194,6 +195,8 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, 
> int lvl,
>   pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
>   if (paddr & BIT_ULL(33))
>   pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> + if (paddr & BIT_ULL(34))
> + pte |= ARM_V7S_ATTR_MTK_PA_BIT34;
>   return pte;
>  }
>  
> @@ -218,6 +221,8 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int 
> lvl,
>   paddr |= BIT_ULL(32);
>   if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
>   paddr |= BIT_ULL(33);
> + if (pte & ARM_V7S_ATTR_MTK_PA_BIT34)
> + paddr |= BIT_ULL(34);
>   return paddr;
>  }
>  
> @@ -754,7 +759,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
> io_pgtable_cfg *cfg,
>   if (cfg->ias > ARM_V7S_ADDR_BITS)
>   return NULL;
>  
> - if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
> + if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 35 : ARM_V7S_ADDR_BITS))
>   return NULL;
>  
>   if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 6451d83753e1..ec3c87d4b172 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -320,7 +320,7 @@ static int mtk_iommu_domain_finalise(struct 
> mtk_iommu_domain *dom)
>   IO_PGTABLE_QUIRK_ARM_MTK_EXT,
>   .pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
>   .ias = 32,
> - .oas = 34,
> + .oas = 35,

Shouldn't this be set according to the real hardware capabilities,
instead of always setting it to 35?

Best regards,
Tomasz
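
For reference, an output address size of 35 means physical address bits 0..34. The userspace sketch below models how the quoted hunks fold bits 32, 33 and 34 into PTE bits 9, 4 and 5. The model_* names and the 4 KiB address mask are assumptions for illustration; the real functions take a level and a config argument.

```c
#include <stdint.h>

/* PTE bit positions taken from the quoted defines. */
#define PTE_PA_BIT32	(1u << 9)
#define PTE_PA_BIT33	(1u << 4)
#define PTE_PA_BIT34	(1u << 5)
#define PTE_ADDR_MASK	0xfffff000u	/* assumed 4 KiB page descriptor */

/* Fold a physical address of up to 35 bits into a 32-bit descriptor. */
static uint32_t model_paddr_to_pte(uint64_t paddr)
{
	uint32_t pte = (uint32_t)paddr & PTE_ADDR_MASK;

	if (paddr & (1ull << 32))
		pte |= PTE_PA_BIT32;
	if (paddr & (1ull << 33))
		pte |= PTE_PA_BIT33;
	if (paddr & (1ull << 34))
		pte |= PTE_PA_BIT34;
	return pte;
}

/* Recover the physical address from the descriptor. */
static uint64_t model_pte_to_paddr(uint32_t pte)
{
	uint64_t paddr = pte & PTE_ADDR_MASK;

	if (pte & PTE_PA_BIT32)
		paddr |= 1ull << 32;
	if (pte & PTE_PA_BIT33)
		paddr |= 1ull << 33;
	if (pte & PTE_PA_BIT34)
		paddr |= 1ull << 34;
	return paddr;
}
```

A SoC whose hardware stops at bit 33 would never set PTE bit 5, so a per-SoC oas value could keep the page table code from accepting addresses the hardware cannot emit.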

>   .tlb = &mtk_iommu_flush_ops,
>   .iommu_dev = data->dev,
>   };
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 4cde111e425b..1ae0757f4f94 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -77,8 +77,8 @@ struct io_pgtable_cfg {
>*  TLB maintenance when mapping as well as when unmapping.
>*
>* IO_PGTABLE_QUIRK_ARM_MTK_EXT: (ARM v7s format) MediaTek IOMMUs extend
> -  *  to support up to 34 bits PA where the bit32 and bit33 are
> -  *  encoded in the bit9 and bit4 of the PTE respectively.
> +  *  to support up to 35 bits PA where the bit32, bit33 and bit34 are
> +  *  encoded in the bit9, bit4 and bit5 of the PTE respectively.
>*
>* IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
>*  on unmap, for DMA domains using the flush queue mechanism for
> -- 
> 2.18.0
> 
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 06/27] dt-bindings: mediatek: Add binding for mt8192 IOMMU

2020-12-23 Thread Tomasz Figa
On Wed, Dec 09, 2020 at 04:00:41PM +0800, Yong Wu wrote:
> This patch adds descriptions for mt8192 IOMMU and SMI.
> 
> mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
> table format. The M4U-SMI HW diagram is as below:
> 
>                           EMI
>                            |
>                           M4U
>                            |
>                       ------------
>                        SMI Common
>                       ------------
>                            |
>   +-------+------+------+----------------------+-------+
>   |       |      |      |       ..             |       |
>   |       |      |      |                      |       |
> larb0   larb1  larb2  larb4     ..          larb19   larb20
> disp0   disp1   mdp    vdec                   IPE      IPE
> 
> All the connections are HW fixed, SW can NOT adjust it.
> 
> mt8192 M4U support 0~16GB iova range. we preassign different engines
> into different iova ranges:
> 
> domain-id  module     iova-range    larbs
>    0       disp       0 ~ 4G        larb0/1
>    1       vcodec     4G ~ 8G       larb4/5/7
>    2       cam/mdp    8G ~ 12G      larb2/9/11/13/14/16/17/18/19/20
Why do we preassign these addresses in DT? Shouldn't it be a user's or
integrator's decision to split the 16 GB address range into sub-ranges
and define which larbs those sub-ranges are shared with?

Best regards,
Tomasz

>    3       CCU0       0x4000_ ~ 0x43ff_  larb13: port 9/10
>    4       CCU1       0x4400_ ~ 0x47ff_  larb14: port 4/5
> 
> The iova range for CCU0/1(camera control unit) is HW requirement.
> 
> Signed-off-by: Yong Wu 
> Reviewed-by: Rob Herring 
> ---
>  .../bindings/iommu/mediatek,iommu.yaml|  18 +-
>  include/dt-bindings/memory/mt8192-larb-port.h | 240 ++
>  2 files changed, 257 insertions(+), 1 deletion(-)
>  create mode 100644 include/dt-bindings/memory/mt8192-larb-port.h
> 
> diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml 
> b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
> index ba6626347381..0f26fe14c8e2 100644
> --- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
> @@ -76,6 +76,7 @@ properties:
>- mediatek,mt8167-m4u  # generation two
>- mediatek,mt8173-m4u  # generation two
>- mediatek,mt8183-m4u  # generation two
> +  - mediatek,mt8192-m4u  # generation two
>  
>- description: mt7623 generation one
>  items:
> @@ -115,7 +116,11 @@ properties:
>dt-binding/memory/mt6779-larb-port.h for mt6779,
>dt-binding/memory/mt8167-larb-port.h for mt8167,
>dt-binding/memory/mt8173-larb-port.h for mt8173,
> -  dt-binding/memory/mt8183-larb-port.h for mt8183.
> +  dt-binding/memory/mt8183-larb-port.h for mt8183,
> +  dt-binding/memory/mt8192-larb-port.h for mt8192.
> +
> +  power-domains:
> +maxItems: 1
>  
>  required:
>- compatible
> @@ -133,11 +138,22 @@ allOf:
>- mediatek,mt2701-m4u
>- mediatek,mt2712-m4u
>- mediatek,mt8173-m4u
> +  - mediatek,mt8192-m4u
>  
>  then:
>required:
>  - clocks
>  
> +  - if:
> +  properties:
> +compatible:
> +  enum:
> +- mediatek,mt8192-m4u
> +
> +then:
> +  required:
> +- power-domains
> +
>  additionalProperties: false
>  
>  examples:
> diff --git a/include/dt-bindings/memory/mt8192-larb-port.h 
> b/include/dt-bindings/memory/mt8192-larb-port.h
> new file mode 100644
> index ..ec1ac2ba7094
> --- /dev/null
> +++ b/include/dt-bindings/memory/mt8192-larb-port.h
> @@ -0,0 +1,240 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2020 MediaTek Inc.
> + *
> + * Author: Chao Hao 
> + * Author: Yong Wu 
> + */
> +#ifndef _DT_BINDINGS_MEMORY_MT8192_LARB_PORT_H_
> +#define _DT_BINDINGS_MEMORY_MT8192_LARB_PORT_H_
> +
> +#include 
> +
> +/*
> + * MM IOMMU:
> + * domain 0: display: larb0, larb1.
> + * domain 1: vcodec: larb4, larb5, larb7.
> + * domain 2: CAM/MDP: larb2, larb9, larb11, larb13, larb14, larb16,
> + *   larb17, larb18, larb19, larb20,
> + * domain 3: CCU0: larb13 - port9/10.
> + * domain 4: CCU1: larb14 - port4/5.
> + *
> + * larb3/6/8/10/12/15 is null.
> + */
> +
> +/* larb0 */
> +#define M4U_PORT_L0_DISP_POSTMASK0   MTK_M4U_DOM_ID(0, 0, 0)
> +#define M4U_PORT_L0_OVL_RDMA0_HDR    MTK_M4U_DOM_ID(0, 0, 1)
> +#define M4U_PORT_L0_OVL_RDMA0        MTK_M4U_DOM_ID(0, 0, 2)
> +#define M4U_PORT_L0_DISP_RDMA0       MTK_M4U_DOM_ID(0, 0, 3)
> +#define M4U_PORT_L0_DISP_WDMA0       MTK_M4U_DOM_ID(0, 0, 4)
> +#define M4U_PORT_L0_DISP_FAKE0       MTK_M4U_DOM_ID(0, 0, 5)
> +
> +/* larb1 */
> +#define M4U_PORT_L1_OVL_2L_RDMA0_HDR 

Re: [PATCH v5 04/27] dt-bindings: memory: mediatek: Add domain definition

2020-12-23 Thread Tomasz Figa
Hi Yong,

On Wed, Dec 09, 2020 at 04:00:39PM +0800, Yong Wu wrote:
> In the latest SoC, several HW IPs require a special iova range; mainly
> CCU and VPU have this requirement. Take CCU as an example: CCU requires
> its iova to be located in the range (0x4000_ ~ 0x43ff_).

Is this really a domain? Does the address range come from the design of
the IOMMU?

Best regards,
Tomasz

> 
> In this patch we add a domain definition for the special port. In the
> example of CCU, if we preassign the CCU port to domain1, the iommu driver
> will prepare an independent iommu domain with the special iova range for
> it, and the iova obtained from dma_alloc_attrs(ccu-dev) will be located in
> that special range.
> 
> This is a preparatory patch for multi-domain support.
> 
> Signed-off-by: Yong Wu 
> Acked-by: Krzysztof Kozlowski 
> Acked-by: Rob Herring 
> ---
>  include/dt-bindings/memory/mtk-smi-larb-port.h | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/include/dt-bindings/memory/mtk-smi-larb-port.h 
> b/include/dt-bindings/memory/mtk-smi-larb-port.h
> index 7d64103209af..2d4c973c174f 100644
> --- a/include/dt-bindings/memory/mtk-smi-larb-port.h
> +++ b/include/dt-bindings/memory/mtk-smi-larb-port.h
> @@ -7,9 +7,16 @@
>  #define __DT_BINDINGS_MEMORY_MTK_MEMORY_PORT_H_
>  
>  #define MTK_LARB_NR_MAX  32
> +#define MTK_M4U_DOM_NR_MAX   8
> +
> +#define MTK_M4U_DOM_ID(domid, larb, port)\
> + (((domid) & 0x7) << 16 | (((larb) & 0x1f) << 5) | ((port) & 0x1f))
> +
> +/* The default dom id is 0. */
> +#define MTK_M4U_ID(larb, port)   MTK_M4U_DOM_ID(0, larb, port)
>  
> -#define MTK_M4U_ID(larb, port)   (((larb) << 5) | (port))
>  #define MTK_M4U_TO_LARB(id)  (((id) >> 5) & 0x1f)
>  #define MTK_M4U_TO_PORT(id)  ((id) & 0x1f)
> +#define MTK_M4U_TO_DOM(id)   (((id) >> 16) & 0x7)
>  
>  #endif
> -- 
> 2.18.0
> 
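
For illustration, here is a minimal user-space sketch of how the ID encoding
in the patch above packs and unpacks its three fields. The macros are copied
from the quoted diff; m4u_id_selftest() is a hypothetical helper that exists
only to exercise them. The (3, 13, 9) triple assumes the "domain 3: CCU0:
larb13 - port9/10" assignment from the mt8192 header comment.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as they appear in the quoted patch. */
#define MTK_M4U_DOM_ID(domid, larb, port) \
	(((domid) & 0x7) << 16 | (((larb) & 0x1f) << 5) | ((port) & 0x1f))
#define MTK_M4U_ID(larb, port)	MTK_M4U_DOM_ID(0, larb, port)
#define MTK_M4U_TO_LARB(id)	(((id) >> 5) & 0x1f)
#define MTK_M4U_TO_PORT(id)	((id) & 0x1f)
#define MTK_M4U_TO_DOM(id)	(((id) >> 16) & 0x7)

/* Hypothetical self-test: encode one port and decode every field again. */
static void m4u_id_selftest(void)
{
	uint32_t id = MTK_M4U_DOM_ID(3, 13, 9);

	/* domain in bits 16-18, larb in bits 5-9, port in bits 0-4 */
	assert(id == 0x301a9);
	assert(MTK_M4U_TO_DOM(id) == 3);
	assert(MTK_M4U_TO_LARB(id) == 13);
	assert(MTK_M4U_TO_PORT(id) == 9);

	/* Domain 0 yields the same value as the old (larb << 5) | port. */
	assert(MTK_M4U_ID(13, 9) == ((13 << 5) | 9));
}
```

Since the domain sits above bit 15, IDs built with the old two-argument
MTK_M4U_ID() keep their numeric values for in-range larb and port numbers,
which is presumably why existing headers would not need to change.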


Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2020-12-20 Thread Tomasz Figa
On Thu, Dec 17, 2020 at 10:20 PM Helen Koike  wrote:
>
> Hi Tomasz,
>
> Thanks for your comments, I have a few questions below.
>
> On 12/16/20 12:13 AM, Tomasz Figa wrote:
> > On Tue, Dec 15, 2020 at 11:37 PM Helen Koike  
> > wrote:
> >>
> >> Hi Tomasz,
> >>
> >> On 12/14/20 7:46 AM, Tomasz Figa wrote:
> >>> On Fri, Dec 4, 2020 at 4:52 AM Helen Koike  
> >>> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> Please see my 2 points below (one about v4l2_ext_buffer and another
> >>>> about timestamp).
> >>>>
> >>>> On 12/3/20 12:11 PM, Hans Verkuil wrote:
> >>>>> On 23/11/2020 18:40, Helen Koike wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 11/23/20 12:46 PM, Tomasz Figa wrote:
> >>>>>>> On Tue, Nov 24, 2020 at 12:08 AM Helen Koike 
> >>>>>>>  wrote:
> >>>>>>>>
> >>>>>>>> Hi Hans,
> >>>>>>>>
> >>>>>>>> Thank you for your review.
> >>>>>>>>
> >>>>>>>> On 9/9/20 9:27 AM, Hans Verkuil wrote:
> >>>>>>>>> Hi Helen,
> >>>>>>>>>
> >>>>>>>>> Again I'm just reviewing the uAPI.
> >>>>>>>>>
> >>>>>>>>> On 04/08/2020 21:29, Helen Koike wrote:
[snip]
> >
> >>
> >> Output: userspace fills plane information, informing in which memory
> >> buffer each plane was placed (Or should this be pre-determined by the
> >> driver?)
> >>
> >> For MMAP
> >> ---
> >> userspace performs EXT_CREATE_BUF ioctl to reserve a buffer "index" range
> >> in that mode, to be used in EXT_QBUF and EXT_DQBUF
> >>
> >> Should the API allow userspace to select how many memory buffers it wants?
> >> (maybe not)
> >
> > I think it does allow that - it accepts the v4l2_ext_format struct.
>
> hmmm, I thought v4l2_ext_format would describe color planes, and not memory
> planes. Should it describe memory planes instead? Since planes are defined
> by the pixelformat. But is this information relevant to ext_{set/get/try}
> format?
>

Good point. I ended up assuming the current convention, where giving
an M format would imply num_memory_planes == num_color_planes and
non-M format num_memory_planes == 1. Sounds like we might want
something like a flags field and that could have bits defined to
select that. I think it would actually be useful for S_FMT as well,
because that's what REQBUFS would use.
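
To make the convention concrete, a rough sketch of the two ways the number of
memory planes could be derived. Everything here is hypothetical: the flag name
EXT_FMT_FLAG_SEPARATE_MEM_PLANES and both helpers are made up for
illustration and are not part of any V4L2 header.

```c
#include <stdbool.h>

/* Hypothetical flag bit; no such definition exists in videodev2.h. */
#define EXT_FMT_FLAG_SEPARATE_MEM_PLANES	(1u << 0)

/* Current convention: the "M" variant of a format selects the layout. */
static unsigned int mem_planes_current(bool is_m_format,
				       unsigned int num_color_planes)
{
	return is_m_format ? num_color_planes : 1;
}

/* Possible alternative: one pixelformat, layout selected by a flag. */
static unsigned int mem_planes_with_flag(unsigned int flags,
					 unsigned int num_color_planes)
{
	return (flags & EXT_FMT_FLAG_SEPARATE_MEM_PLANES) ?
		num_color_planes : 1;
}
```

With NV12 (two color planes) the first helper gives 2 for NV12M and 1 for
NV12; the second would give the same two answers from a single pixelformat
plus the flag.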

> >
> >>
> >> userspace performs EXT_QUERY_MMAP_BUF to get the mmap offset/cookie and
> >> length for each memory buffer.
> >>
> >> On EXT_QBUF, userspace doesn't need to fill membuf information. Should the
> >> mmap offset and length be filled by the kernel and returned to userspace
> >> here as well? I'm leaning towards: no.
> >
> > Yeah, based on my comment above, I think the answer should be no.
> >
> >>
> >> If the answer is no, then here is my proposal:
> >> --
> >>
> >> /* If MMAP, drivers decide how many memory buffers to allocate */
> >> int ioctl( int fd, VIDIOC_EXT_CREATE_BUFS, struct v4l2_ext_buffer *argp )
> >>
> >> /* Returns -EINVAL if not MMAP */
> >> int ioctl( int fd, VIDIOC_EXT_MMAP_QUERYBUF,
> >>            struct v4l2_ext_mmap_querybuf *argp )
> >>
> >> /* userspace fills v4l2_ext_buffer.membufs if DMA-fd or Userptr, leave it
> >>  * zero for MMAP
> >>  * Should userspace also fill v4l2_ext_buffer.planes?
> >>  */
> >> int ioctl( int fd, VIDIOC_EXT_QBUF, struct v4l2_ext_buffer *argp )
> >>
> >> /* v4l2_ext_buffer.membufs is set to zero by the driver */
> >> int ioctl( int fd, VIDIOC_EXT_DQBUF, struct v4l2_ext_buffer *argp )
> >>
> >> (I omitted reserved fields below)
> >>
> >> struct v4l2_ext_create_buffers {
> >>         __u32   index;
> >>         __u32   count;
> >>         __u32   memory;
> >>         __u32   capabilities;
> >>         struct v4l2_ext_pix_format  format;
> >> };
> >>
> >> struct v4l2_ext_mma

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Tomasz Figa
On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne  wrote:
>
> Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> > Hi Tomasz,
> >
> > On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > Cc: Robin
> > > >
> > > > On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > > > > This driver uses the SG vb2 ops, but effectively only ever accesses
> > > > > the first entry of the SG table, indicating that it expects a flat
> > > > > layout. Switch it to use the contiguous ops to make sure this
> > > > > expected invariant
> > > >
> > > > Under what circumstances will the sg table have nents > 1? I traced it
> > > > down to [1] but am not sure I got it right.
> > > >
> > > > I'm afraid that on systems with a low amount of system memory, once
> > > > the memory becomes fragmented, the driver will not work. That's why I
> > > > started with the sg allocator.
> > >
> > > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > > in terms of the DMA (aka IOVA) address space. In other words, it
> > > guarantees that having one DMA address and length fully describes the
> >
> > Ahh, I missed that part. Looks like I misunderstood the videobuf2 contig
> > allocator.
>
> I'm learning every day too, but I'm surprised I don't see a call to
> vb2_dma_contig_set_max_seg_size() in this driver (I could also just have
> missed a patch when looking over this thread)?
>
> The reason I'm asking is that the doc says it should be called by drivers
> supporting IOMMU, which seems to be the case for such drivers (MFC,
> exynos4-is, exynos-gsc, mtk-mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe,
> ..). I'm posting it anyway; worst case it's all covered and we are good,
> otherwise perhaps a downstream patch didn't make it?
>
> /**
>  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
>  * @dev:    device for configuring DMA parameters
>  * @size:   size of DMA max segment size to set
>  *
>  * To allow mapping the scatter-list into a single chunk in the DMA
>  * address space, the device is required to have the DMA max segment
>  * size parameter set to a value larger than the buffer size. Otherwise,
>  * the DMA-mapping subsystem will split the mapping into max segment
>  * size chunks. This function sets the DMA max segment size
>  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
>  * address space.
>  * This code assumes that the DMA-mapping subsystem will merge all
>  * scatterlist segments if this is really possible (for example when
>  * an IOMMU is available and enabled).
>  * Ideally, this parameter should be set by the generic bus code, but it
>  * is left with the default 64KiB value due to historical limitations in
>  * other subsystems (like limited USB host drivers) and there is no good
>  * place to set it to the proper value.
>  * This function should be called from the drivers, which are known to
>  * operate on platforms with IOMMU and provide access to shared buffers
>  * (either USERPTR or DMABUF). This should be done before initializing
>  * videobuf2 queue.
>  */

It does call dma_set_max_seg_size() directly:
https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230

Actually, why do we even need a vb2 helper for this?
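
As a back-of-the-envelope model of what the kernel-doc above describes, the
sketch below counts how many chunks a buffer would be split into for a given
max segment size. It is a deliberately simplified user-space illustration, not
the DMA-mapping code; dma_chunks() is a hypothetical name and the real
behaviour depends on the IOMMU implementation.

```c
#include <stddef.h>

/*
 * Number of chunks needed if no single chunk may exceed max_seg bytes.
 * Returns 0 for an empty buffer or a zero segment limit.
 */
static size_t dma_chunks(size_t buf_size, size_t max_seg)
{
	if (buf_size == 0 || max_seg == 0)
		return 0;
	return (buf_size + max_seg - 1) / max_seg;
}
```

With the default 64KiB limit, a 1920x1080 NV12 frame (3110400 bytes) would
need 48 chunks in this model, while any limit at or above the buffer size
gives exactly one, which is what a "contig" user needs.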

>
> regards,
> Nicolas
>
> >
> > > buffer. This seems to be the requirement of the hardware/firmware
> > > handled by the venus driver. If the device is behind an IOMMU, which
> > > is the case for the SoCs in question, the underlying DMA ops will
> > > actually allocate a discontiguous set of pages, so it has nothing to
> > > do with system memory amount or fragmentation. If for some reason the
> > > IOMMU can't be used, there is no way around it: the memory needs to be
> > > contiguous because of the hardware/firmware/driver expectation.
> > >
> > > On the other hand, the vb2-dma-sg allocator doesn't have any
> > > contiguity guarantees for the DMA, or any other, address space. The
> > > current code works fine, because it calls dma_map_sg() on the whole
> > > set of pages and that ends up mapping it contiguously in the IOVA
> > > space, but that's just an implementation detail, not an API guarantee.
> >
> > It was good to know. Thanks for the explanation.
> >
> > >
> > > Best regards,
> > > 

Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2020-12-15 Thread Tomasz Figa
On Tue, Dec 15, 2020 at 11:37 PM Helen Koike  wrote:
>
> Hi Tomasz,
>
> On 12/14/20 7:46 AM, Tomasz Figa wrote:
> > On Fri, Dec 4, 2020 at 4:52 AM Helen Koike  
> > wrote:
> >>
> >> Hi,
> >>
> >> Please see my 2 points below (one about v4l2_ext_buffer and another
> >> about timestamp).
> >>
> >> On 12/3/20 12:11 PM, Hans Verkuil wrote:
> >>> On 23/11/2020 18:40, Helen Koike wrote:
> >>>>
> >>>>
> >>>> On 11/23/20 12:46 PM, Tomasz Figa wrote:
> >>>>> On Tue, Nov 24, 2020 at 12:08 AM Helen Koike 
> >>>>>  wrote:
> >>>>>>
> >>>>>> Hi Hans,
> >>>>>>
> >>>>>> Thank you for your review.
> >>>>>>
> >>>>>> On 9/9/20 9:27 AM, Hans Verkuil wrote:
> >>>>>>> Hi Helen,
> >>>>>>>
> >>>>>>> Again I'm just reviewing the uAPI.
> >>>>>>>
> >>>>>>> On 04/08/2020 21:29, Helen Koike wrote:
> >>>>>>>> From: Hans Verkuil 
> >>>>>>>>
> >>>>>>>> Those extended buffer ops have several purposes:
> >>>>>>>> 1/ Fix y2038 issues by converting the timestamp into a u64 counting
> >>>>>>>>the number of ns elapsed since 1970
> >>>>>>>> 2/ Unify single/multiplanar handling
> >>>>>>>> 3/ Add a new start offset field to each v4l2 plane buffer info struct
> >>>>>>>>to support the case where a single buffer object is storing all
> >>>>>>>>planes data, each one being placed at a different offset
> >>>>>>>>
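
For point 1/ above, a small sketch of the conversion being described: a
(seconds, microseconds) pair, as carried by struct timeval, turned into a
single 64-bit nanosecond count since 1970. The helper name timeval_to_ns() is
hypothetical and the code is a user-space illustration only, not the patch's
implementation.

```c
#include <stdint.h>

/* Convert seconds + microseconds since the epoch to nanoseconds. */
static uint64_t timeval_to_ns(int64_t sec, int64_t usec)
{
	return (uint64_t)sec * 1000000000ull + (uint64_t)usec * 1000ull;
}
```

An unsigned 64-bit nanosecond counter covers roughly 584 years from 1970, so
the first second that no longer fits a signed 32-bit time_t (2147483648) is
represented without any trouble.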
> >>>>>>>> New hooks are created in v4l2_ioctl_ops so that drivers can start
> >>>>>>>> using these new objects.
> >>>>>>>>
> >>>>>>>> The core takes care of converting new ioctls requests to old ones
> >>>>>>>> if the driver does not support the new hooks, and vice versa.
> >>>>>>>>
> >>>>>>>> Note that the timecode field is gone, since there don't seem to be
> >>>>>>>> any in-kernel users. It can be added back in the reserved area if
> >>>>>>>> needed, or we can use the Request API to collect more metadata
> >>>>>>>> information from the frame.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Hans Verkuil 
> >>>>>>>> Signed-off-by: Boris Brezillon 
> >>>>>>>> Signed-off-by: Helen Koike 
> >>>>>>>> ---
> >>>>>>>> Changes in v5:
> >>>>>>>> - migrate memory from v4l2_ext_buffer to v4l2_ext_plane
> >>>>>>>> - return mem_offset to struct v4l2_ext_plane
> >>>>>>>> - change sizes and reorder fields to avoid holes in the struct and
> >>>>>>>>   make it the same for 32 and 64 bits
> >>>>>>>>
> >>>>>>>> Changes in v4:
> >>>>>>>> - Use v4l2_ext_pix_format directly in the ioctl, drop
> >>>>>>>> v4l2_ext_format, making V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] the
> >>>>>>>> only valid types.
> >>>>>>>> - Drop VIDIOC_EXT_EXPBUF, since the only difference from
> >>>>>>>> VIDIOC_EXPBUF was that with VIDIOC_EXT_EXPBUF we could export
> >>>>>>>> multiple planes at once. I think we can add this later, so I
> >>>>>>>> removed it from this RFC to simplify it.
> >>>>>>>> - Remove num_planes field from struct v4l2_ext_buffer
> >>>>>>>> - Add flags field to struct v4l2_ext_create_buffers
> >>>>>>>> - Reformulate struct v4l2_ext_plane
> >>>>>>>> - Fix some bugs caught by v4l2-compliance
> >>>>>>>> - Rebased on top of media/master (post 5.8-rc1)
> >>>>>>>>
> >>>>>>>> Changes in v3:
> >>>>>>>> - Rebased on top of media/master 

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Tomasz Figa
On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
 wrote:
>
> Hi,
>
> Cc: Robin
>
> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > This driver uses the SG vb2 ops, but effectively only ever accesses the
> > first entry of the SG table, indicating that it expects a flat layout.
> > Switch it to use the contiguous ops to make sure this expected invariant
>
> Under what circumstances will the sg table have nents > 1? I traced it
> down to [1] but am not sure I got it right.
>
> I'm afraid that on systems with a low amount of system memory, once
> the memory becomes fragmented, the driver will not work. That's why I
> started with the sg allocator.

It is exactly the opposite. The vb2-dma-contig allocator is "contig"
in terms of the DMA (aka IOVA) address space. In other words, it
guarantees that having one DMA address and length fully describes the
buffer. This seems to be the requirement of the hardware/firmware
handled by the venus driver. If the device is behind an IOMMU, which
is the case for the SoCs in question, the underlying DMA ops will
actually allocate a discontiguous set of pages, so it has nothing to
do with system memory amount or fragmentation. If for some reason the
IOMMU can't be used, there is no way around it: the memory needs to be
contiguous because of the hardware/firmware/driver expectation.

On the other hand, the vb2-dma-sg allocator doesn't have any
contiguity guarantees for the DMA, or any other, address space. The
current code works fine, because it calls dma_map_sg() on the whole
set of pages and that ends up mapping it contiguously in the IOVA
space, but that's just an implementation detail, not an API guarantee.
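
To illustrate the distinction, here is a toy user-space model (not kernel
code; all names are made up for the example). It maps the same scattered
physical pages once "behind an IOMMU", where each page gets the next
consecutive IOVA, and once without one, where the DMA address is simply the
physical address. Only in the first case does a single address plus length
describe the whole buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u

/* One mapped segment as the device sees it: DMA address and length. */
struct seg {
	uint64_t dma_addr;
	size_t len;
};

/* Toy mapping: consecutive IOVAs with an IOMMU, physical addresses without. */
static void map_pages(const uint64_t *phys, size_t n, bool use_iommu,
		      uint64_t iova_base, struct seg *out)
{
	for (size_t i = 0; i < n; i++) {
		out[i].dma_addr = use_iommu ? iova_base + i * PAGE_SZ : phys[i];
		out[i].len = PAGE_SZ;
	}
}

/* True if every segment starts exactly where the previous one ends. */
static bool dma_contiguous(const struct seg *segs, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (segs[i].dma_addr != segs[i - 1].dma_addr + segs[i - 1].len)
			return false;
	return n > 0;
}
```

A driver that only ever programs segs[0].dma_addr into the hardware is
relying on dma_contiguous() being true, which is the guarantee
vb2-dma-contig provides and vb2-dma-sg does not.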

Best regards,
Tomasz

>
> [1]
> https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
>
> > is always enforced. Since the device is supposed to be behind an IOMMU
> > this should have little to no practical consequences beyond making the
> > driver not rely on a particular behavior of the SG implementation.
> >
> > Reported-by: Tomasz Figa 
> > Signed-off-by: Alexandre Courbot 
> > ---
> > Hi everyone,
> >
> > It probably doesn't hurt to fix this issue before some actual issue happens.
> > I have tested this patch on Chrome OS and playback was just as fine as with
> > the SG ops.
> >
> >  drivers/media/platform/Kconfig  | 2 +-
> >  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
> >  drivers/media/platform/qcom/venus/vdec.c| 6 +++---
> >  drivers/media/platform/qcom/venus/venc.c| 6 +++---
> >  4 files changed, 9 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > index 35a18d388f3f..d9d7954111f2 100644
> > --- a/drivers/media/platform/Kconfig
> > +++ b/drivers/media/platform/Kconfig
> > @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
> >   depends on INTERCONNECT || !INTERCONNECT
> >   select QCOM_MDT_LOADER if ARCH_QCOM
> >   select QCOM_SCM if ARCH_QCOM
> > - select VIDEOBUF2_DMA_SG
> > + select VIDEOBUF2_DMA_CONTIG
> >   select V4L2_MEM2MEM_DEV
> >   help
> > This is a V4L2 driver for Qualcomm Venus video accelerator
> > diff --git a/drivers/media/platform/qcom/venus/helpers.c 
> > b/drivers/media/platform/qcom/venus/helpers.c
> > index 50439eb1ffea..859d260f002b 100644
> > --- a/drivers/media/platform/qcom/venus/helpers.c
> > +++ b/drivers/media/platform/qcom/venus/helpers.c
> > @@ -7,7 +7,7 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> > +#include 
> >  #include 
> >  #include 
> >
> > @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
> >   struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
> >   struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> >   struct venus_buffer *buf = to_venus_buffer(vbuf);
> > - struct sg_table *sgt;
> > -
> > - sgt = vb2_dma_sg_plane_desc(vb, 0);
> > - if (!sgt)
> > - return -EFAULT;
> >
> >   buf->size = vb2_plane_size(vb, 0);
> > - buf->dma_addr = sg_dma_address(sgt->sgl);
>
> Can we do it:
>
>         if (WARN_ON(sgt->nents > 1))
>                 return -EFAULT;
>
> I understand that logically using dma-sg when the flat layout is
> expected by the hardware is wrong, but I haven't seen issues until now.
>
> > + buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> >
> >   if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
> >   list_add_tail(&buf->reg_list, &inst->registeredbu

Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2020-12-15 Thread Tomasz Figa
On Mon, Dec 14, 2020 at 10:24 PM Helen Koike  wrote:
>
> Hi Tomasz,
>
> Thank you for your comments,
>
> On 12/14/20 7:36 AM, Tomasz Figa wrote:
> > On Tue, Nov 24, 2020 at 5:33 AM Helen Koike  
> > wrote:
> >>
> >> Hi Tomasz,
> >>
> >>
> >> On 11/20/20 8:14 AM, Tomasz Figa wrote:
> >>> Hi Helen,
> >>>
> >>> On Tue, Aug 04, 2020 at 04:29:34PM -0300, Helen Koike wrote:
> >>>> From: Hans Verkuil 
> >>>>
> >>>> Those extended buffer ops have several purposes:
> >>>> 1/ Fix y2038 issues by converting the timestamp into a u64 counting
> >>>>the number of ns elapsed since 1970
> >>>> 2/ Unify single/multiplanar handling
> >>>> 3/ Add a new start offset field to each v4l2 plane buffer info struct
> >>>>to support the case where a single buffer object is storing all
> >>>>planes data, each one being placed at a different offset
> >>>>
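
For point 3/ above, a sketch of what per-plane start offsets look like when
one buffer object holds every plane. It lays out NV12 (a Y plane followed by
an interleaved CbCr plane) assuming the stride equals the width and both
dimensions are even. struct plane_info and nv12_layout() are hypothetical
names for illustration, not the fields proposed in the patch.

```c
#include <stdint.h>

/* Hypothetical per-plane descriptor; field names are illustrative only. */
struct plane_info {
	uint32_t offset;	/* start of the plane inside the buffer */
	uint32_t length;	/* size of the plane in bytes */
};

/* Fill both NV12 planes for one buffer and return the total size. */
static uint32_t nv12_layout(uint32_t width, uint32_t height,
			    struct plane_info planes[2])
{
	planes[0].offset = 0;
	planes[0].length = width * height;	/* 8-bit luma */
	planes[1].offset = planes[0].length;	/* chroma follows luma */
	planes[1].length = width * height / 2;	/* 4:2:0 subsampled CbCr */
	return planes[1].offset + planes[1].length;
}
```

For 640x480 this puts the chroma plane at offset 307200 in a 460800-byte
buffer, which is exactly the information the existing per-plane structures
cannot express when both planes share one memory buffer.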
> >>>> New hooks are created in v4l2_ioctl_ops so that drivers can start using
> >>>> these new objects.
> >>>>
> >>>> The core takes care of converting new ioctls requests to old ones
> >>>> if the driver does not support the new hooks, and vice versa.
> >>>>
> >>>> Note that the timecode field is gone, since there don't seem to be
> >>>> any in-kernel users. It can be added back in the reserved area if
> >>>> needed, or we can use the Request API to collect more metadata
> >>>> information from the frame.
> >>>>
> >>>
> >>> Thanks for the patch. Please see my comments inline.
> >>
> >> Thank you for your detailed review, please see my comments below.
> >>
> >>>
> >>>> Signed-off-by: Hans Verkuil 
> >>>> Signed-off-by: Boris Brezillon 
> >>>> Signed-off-by: Helen Koike 
> >>>> ---
> >>>> Changes in v5:
> >>>> - migrate memory from v4l2_ext_buffer to v4l2_ext_plane
> >>>> - return mem_offset to struct v4l2_ext_plane
> >>>> - change sizes and reorder fields to avoid holes in the struct and make
> >>>>   it the same for 32 and 64 bits
> >>>>
> >>>> Changes in v4:
> >>>> - Use v4l2_ext_pix_format directly in the ioctl, drop v4l2_ext_format,
> >>>> making V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] the only valid types.
> >>>> - Drop VIDIOC_EXT_EXPBUF, since the only difference from VIDIOC_EXPBUF
> >>>> was that with VIDIOC_EXT_EXPBUF we could export multiple planes at once.
> >>>> I think we can add this later, so I removed it from this RFC to
> >>>> simplify it.
> >>>> - Remove num_planes field from struct v4l2_ext_buffer
> >>>> - Add flags field to struct v4l2_ext_create_buffers
> >>>> - Reformulate struct v4l2_ext_plane
> >>>> - Fix some bugs caught by v4l2-compliance
> >>>> - Rebased on top of media/master (post 5.8-rc1)
> >>>>
> >>>> Changes in v3:
> >>>> - Rebased on top of media/master (post 5.4-rc1)
> >>>>
> >>>> Changes in v2:
> >>>> - Add reserved space to v4l2_ext_buffer so that new fields can be added
> >>>>   later on
> >>>> ---
> >>>>  drivers/media/v4l2-core/v4l2-dev.c   |  29 ++-
> >>>>  drivers/media/v4l2-core/v4l2-ioctl.c | 353 +--
> >>>>  include/media/v4l2-ioctl.h   |  26 ++
> >>>>  include/uapi/linux/videodev2.h   |  90 +++
> >>>>  4 files changed, 476 insertions(+), 22 deletions(-)
> >>>>
> >>>> diff --git a/drivers/media/v4l2-core/v4l2-dev.c 
> >>>> b/drivers/media/v4l2-core/v4l2-dev.c
> >>>> index e1829906bc086..cb21ee8eb075c 100644
> >>>> --- a/drivers/media/v4l2-core/v4l2-dev.c
> >>>> +++ b/drivers/media/v4l2-core/v4l2-dev.c
> >>>> @@ -720,15 +720,34 @@ static void determine_valid_ioctls(struct 
> >>>> video_device *vdev)
> >>>>  SET_VALID_IOCTL(ops, VIDIOC_TRY_FMT, 
> >>>> vidioc_try_fmt_sdr_out);
> >>>>  }
> >>>>
> >>>> +if (is_vid || is_tch) {
> >>>> +/* ioctls valid for video and touch */
> >>>> +if (ops->vidioc_querybuf || ops->vidioc_ext_querybu

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Tomasz Figa
On Mon, Dec 14, 2020 at 9:57 PM Alexandre Courbot  wrote:
>
> This driver uses the SG vb2 ops, but effectively only ever accesses the
> first entry of the SG table, indicating that it expects a flat layout.
> Switch it to use the contiguous ops to make sure this expected invariant
> is always enforced. Since the device is supposed to be behind an IOMMU
> this should have little to no practical consequences beyond making the
> driver not rely on a particular behavior of the SG implementation.
>
> Reported-by: Tomasz Figa 
> Signed-off-by: Alexandre Courbot 
> ---
> Hi everyone,
>
> It probably doesn't hurt to fix this issue before some actual issue happens.
> I have tested this patch on Chrome OS and playback was just as fine as with
> the SG ops.
>
>  drivers/media/platform/Kconfig  | 2 +-
>  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
>  drivers/media/platform/qcom/venus/vdec.c| 6 +++---
>  drivers/media/platform/qcom/venus/venc.c| 6 +++---
>  4 files changed, 9 insertions(+), 14 deletions(-)
>

Reviewed-by: Tomasz Figa 

Best regards,
Tomasz

> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> index 35a18d388f3f..d9d7954111f2 100644
> --- a/drivers/media/platform/Kconfig
> +++ b/drivers/media/platform/Kconfig
> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
> depends on INTERCONNECT || !INTERCONNECT
> select QCOM_MDT_LOADER if ARCH_QCOM
> select QCOM_SCM if ARCH_QCOM
> -   select VIDEOBUF2_DMA_SG
> +   select VIDEOBUF2_DMA_CONTIG
> select V4L2_MEM2MEM_DEV
> help
>   This is a V4L2 driver for Qualcomm Venus video accelerator
> diff --git a/drivers/media/platform/qcom/venus/helpers.c 
> b/drivers/media/platform/qcom/venus/helpers.c
> index 50439eb1ffea..859d260f002b 100644
> --- a/drivers/media/platform/qcom/venus/helpers.c
> +++ b/drivers/media/platform/qcom/venus/helpers.c
> @@ -7,7 +7,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>
> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
> struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
> struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> struct venus_buffer *buf = to_venus_buffer(vbuf);
> -   struct sg_table *sgt;
> -
> -   sgt = vb2_dma_sg_plane_desc(vb, 0);
> -   if (!sgt)
> -   return -EFAULT;
>
> buf->size = vb2_plane_size(vb, 0);
> -   buf->dma_addr = sg_dma_address(sgt->sgl);
> +   buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>
> if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
> list_add_tail(&buf->reg_list, &inst->registeredbufs);
> diff --git a/drivers/media/platform/qcom/venus/vdec.c 
> b/drivers/media/platform/qcom/venus/vdec.c
> index 8488411204c3..3fb277c81aca 100644
> --- a/drivers/media/platform/qcom/venus/vdec.c
> +++ b/drivers/media/platform/qcom/venus/vdec.c
> @@ -13,7 +13,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>
>  #include "hfi_venus_io.h"
>  #include "hfi_parser.h"
> @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> *src_vq,
> src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> src_vq->ops = &vdec_vb2_ops;
> -   src_vq->mem_ops = &vb2_dma_sg_memops;
> +   src_vq->mem_ops = &vb2_dma_contig_memops;
> src_vq->drv_priv = inst;
> src_vq->buf_struct_size = sizeof(struct venus_buffer);
> src_vq->allow_zero_bytesused = 1;
> @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> *src_vq,
> dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> dst_vq->ops = &vdec_vb2_ops;
> -   dst_vq->mem_ops = &vb2_dma_sg_memops;
> +   dst_vq->mem_ops = &vb2_dma_contig_memops;
> dst_vq->drv_priv = inst;
> dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> dst_vq->allow_zero_bytesused = 1;
> diff --git a/drivers/media/platform/qcom/venus/venc.c 
> b/drivers/media/platform/qcom/venus/venc.c
> index 1c61602c5de1..a09550cd1dba 100644
> --- a/drivers/media/platform/qcom/venus/venc.c
> +++ b/drivers/media/platform/qcom/venus/venc.c
> @@ -10,7 +10,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> *src_vq,
> src_vq->

Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2020-12-14 Thread Tomasz Figa
On Fri, Dec 4, 2020 at 4:52 AM Helen Koike  wrote:
>
> Hi,
>
> Please see my 2 points below (one about v4l2_ext_buffer and another about
> timestamp).
>
> On 12/3/20 12:11 PM, Hans Verkuil wrote:
> > On 23/11/2020 18:40, Helen Koike wrote:
> >>
> >>
> >> On 11/23/20 12:46 PM, Tomasz Figa wrote:
> >>> On Tue, Nov 24, 2020 at 12:08 AM Helen Koike  
> >>> wrote:
> >>>>
> >>>> Hi Hans,
> >>>>
> >>>> Thank you for your review.
> >>>>
> >>>> On 9/9/20 9:27 AM, Hans Verkuil wrote:
> >>>>> Hi Helen,
> >>>>>
> >>>>> Again I'm just reviewing the uAPI.
> >>>>>
> >>>>> On 04/08/2020 21:29, Helen Koike wrote:
> >>>>>> From: Hans Verkuil 
> >>>>>>
> >>>>>> Those extended buffer ops have several purposes:
> >>>>>> 1/ Fix y2038 issues by converting the timestamp into a u64 counting
> >>>>>>the number of ns elapsed since 1970
> >>>>>> 2/ Unify single/multiplanar handling
> >>>>>> 3/ Add a new start offset field to each v4l2 plane buffer info struct
> >>>>>>to support the case where a single buffer object is storing all
> >>>>>>planes data, each one being placed at a different offset
> >>>>>>
> >>>>>> New hooks are created in v4l2_ioctl_ops so that drivers can start using
> >>>>>> these new objects.
> >>>>>>
> >>>>>> The core takes care of converting new ioctls requests to old ones
> >>>>>> if the driver does not support the new hooks, and vice versa.
> >>>>>>
> >>>>>> Note that the timecode field is gone, since there don't seem to be
> >>>>>> any in-kernel users. It can be added back in the reserved area if
> >>>>>> needed, or we can use the Request API to collect more metadata
> >>>>>> information from the frame.
> >>>>>>
> >>>>>> Signed-off-by: Hans Verkuil 
> >>>>>> Signed-off-by: Boris Brezillon 
> >>>>>> Signed-off-by: Helen Koike 
> >>>>>> ---
> >>>>>> Changes in v5:
> >>>>>> - migrate memory from v4l2_ext_buffer to v4l2_ext_plane
> >>>>>> - return mem_offset to struct v4l2_ext_plane
> >>>>>> - change sizes and reorder fields to avoid holes in the struct and make
> >>>>>>   it the same for 32 and 64 bits
> >>>>>>
> >>>>>> Changes in v4:
> >>>>>> - Use v4l2_ext_pix_format directly in the ioctl, drop v4l2_ext_format,
> >>>>>> making V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] the only valid types.
> >>>>>> - Drop VIDIOC_EXT_EXPBUF, since the only difference from VIDIOC_EXPBUF
> >>>>>> was that with VIDIOC_EXT_EXPBUF we could export multiple planes at
> >>>>>> once. I think we can add this later, so I removed it from this RFC to
> >>>>>> simplify it.
> >>>>>> - Remove num_planes field from struct v4l2_ext_buffer
> >>>>>> - Add flags field to struct v4l2_ext_create_buffers
> >>>>>> - Reformulate struct v4l2_ext_plane
> >>>>>> - Fix some bugs caught by v4l2-compliance
> >>>>>> - Rebased on top of media/master (post 5.8-rc1)
> >>>>>>
> >>>>>> Changes in v3:
> >>>>>> - Rebased on top of media/master (post 5.4-rc1)
> >>>>>>
> >>>>>> Changes in v2:
> >>>>>> - Add reserved space to v4l2_ext_buffer so that new fields can be added
> >>>>>>   later on
> >>>>>> ---
> >>>>>>  drivers/media/v4l2-core/v4l2-dev.c   |  29 ++-
> >>>>>>  drivers/media/v4l2-core/v4l2-ioctl.c | 353 +--
> >>>>>>  include/media/v4l2-ioctl.h   |  26 ++
> >>>>>>  include/uapi/linux/videodev2.h   |  90 +++
> >>>>>>  4 files changed, 476 insertions(+), 22 deletions(-)
> >>>>>>
> >>>>>
> >>>>> 
> >>>>>
> >>>>>> diff --git a/include/uapi/linux/videodev2.h 
> >>>>>> b/include/ua

Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2020-12-14 Thread Tomasz Figa
On Fri, Dec 4, 2020 at 12:11 AM Hans Verkuil  wrote:
>
> On 23/11/2020 18:40, Helen Koike wrote:
> >
> >
> > On 11/23/20 12:46 PM, Tomasz Figa wrote:
> >> On Tue, Nov 24, 2020 at 12:08 AM Helen Koike  
> >> wrote:
> >>>
> >>> Hi Hans,
> >>>
> >>> Thank you for your review.
> >>>
> >>> On 9/9/20 9:27 AM, Hans Verkuil wrote:
> >>>> Hi Helen,
> >>>>
> >>>> Again I'm just reviewing the uAPI.
> >>>>
> >>>> On 04/08/2020 21:29, Helen Koike wrote:
> >>>>> From: Hans Verkuil 
> >>>>>
> >>>>> Those extended buffer ops have several purposes:
> >>>>> 1/ Fix y2038 issues by converting the timestamp into a u64 counting
> >>>>>the number of ns elapsed since 1970
> >>>>> 2/ Unify single/multiplanar handling
> >>>>> 3/ Add a new start offset field to each v4l2 plane buffer info struct
> >>>>>to support the case where a single buffer object is storing all
> >>>>>planes data, each one being placed at a different offset
> >>>>>
> >>>>> New hooks are created in v4l2_ioctl_ops so that drivers can start using
> >>>>> these new objects.
> >>>>>
> >>>>> The core takes care of converting new ioctls requests to old ones
> >>>>> if the driver does not support the new hooks, and vice versa.
> >>>>>
> >>>>> Note that the timecode field is gone, since there don't seem to be
> >>>>> any in-kernel users. It can be added back in the reserved area if
> >>>>> needed, or we can use the Request API to collect more metadata
> >>>>> information from the frame.
> >>>>>
> >>>>> Signed-off-by: Hans Verkuil 
> >>>>> Signed-off-by: Boris Brezillon 
> >>>>> Signed-off-by: Helen Koike 
> >>>>> ---
> >>>>> Changes in v5:
> >>>>> - migrate memory from v4l2_ext_buffer to v4l2_ext_plane
> >>>>> - return mem_offset to struct v4l2_ext_plane
> >>>>> - change sizes and reorder fields to avoid holes in the struct and make
> >>>>>   it the same for 32 and 64 bits
> >>>>>
> >>>>> Changes in v4:
> >>>>> - Use v4l2_ext_pix_format directly in the ioctl, drop v4l2_ext_format,
> >>>>> making V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] the only valid types.
> >>>>> - Drop VIDIOC_EXT_EXPBUF, since the only difference from VIDIOC_EXPBUF
> >>>>> was that with VIDIOC_EXT_EXPBUF we could export multiple planes at once.
> >>>>> I think we can add this later, so I removed it from this RFC to 
> >>>>> simplify it.
> >>>>> - Remove num_planes field from struct v4l2_ext_buffer
> >>>>> - Add flags field to struct v4l2_ext_create_buffers
> >>>>> - Reformulate struct v4l2_ext_plane
> >>>>> - Fix some bugs caught by v4l2-compliance
> >>>>> - Rebased on top of media/master (post 5.8-rc1)
> >>>>>
> >>>>> Changes in v3:
> >>>>> - Rebased on top of media/master (post 5.4-rc1)
> >>>>>
> >>>>> Changes in v2:
> >>>>> - Add reserved space to v4l2_ext_buffer so that new fields can be added
> >>>>>   later on
> >>>>> ---
> >>>>>  drivers/media/v4l2-core/v4l2-dev.c   |  29 ++-
> >>>>>  drivers/media/v4l2-core/v4l2-ioctl.c | 353 +--
> >>>>>  include/media/v4l2-ioctl.h   |  26 ++
> >>>>>  include/uapi/linux/videodev2.h   |  90 +++
> >>>>>  4 files changed, 476 insertions(+), 22 deletions(-)
> >>>>>
> >>>>
> >>>> 
> >>>>
> >>>>> diff --git a/include/uapi/linux/videodev2.h 
> >>>>> b/include/uapi/linux/videodev2.h
> >>>>> index 7123c6a4d9569..334cafdd2be97 100644
> >>>>> --- a/include/uapi/linux/videodev2.h
> >>>>> +++ b/include/uapi/linux/videodev2.h
> >>>>> @@ -996,6 +996,41 @@ struct v4l2_plane {
> >>>>>  __u32   reserved[11];
> >>>>>  };
> >>>>>
> >>>>> +/**
> >>>>> + * struct v4l2_ext_plane - 

Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2020-12-14 Thread Tomasz Figa
On Tue, Nov 24, 2020 at 5:33 AM Helen Koike  wrote:
>
> Hi Tomasz,
>
>
> On 11/20/20 8:14 AM, Tomasz Figa wrote:
> > Hi Helen,
> >
> > On Tue, Aug 04, 2020 at 04:29:34PM -0300, Helen Koike wrote:
> >> From: Hans Verkuil 
> >>
> >> Those extended buffer ops have several purposes:
> >> 1/ Fix y2038 issues by converting the timestamp into a u64 counting
> >>    the number of ns elapsed since 1970
> >> 2/ Unify single/multiplanar handling
> >> 3/ Add a new start offset field to each v4l2 plane buffer info struct
> >>    to support the case where a single buffer object is storing all
> >>    planes data, each one being placed at a different offset
> >>
> >> New hooks are created in v4l2_ioctl_ops so that drivers can start using
> >> these new objects.
> >>
> >> The core takes care of converting new ioctls requests to old ones
> >> if the driver does not support the new hooks, and vice versa.
> >>
> >> Note that the timecode field is gone, since there don't seem to be any
> >> in-kernel users. It can be added back in the reserved area if needed, or
> >> the Request API can be used to collect more metadata information from
> >> the frame.
> >>
> >
> > Thanks for the patch. Please see my comments inline.
>
> Thank you for your detailed review, please see my comments below.
>
> >
> >> Signed-off-by: Hans Verkuil 
> >> Signed-off-by: Boris Brezillon 
> >> Signed-off-by: Helen Koike 
> >> ---
> >> Changes in v5:
> >> - migrate memory from v4l2_ext_buffer to v4l2_ext_plane
> >> - return mem_offset to struct v4l2_ext_plane
> >> - change sizes and reorder fields to avoid holes in the struct and make
> >>   it the same for 32 and 64 bits
> >>
> >> Changes in v4:
> >> - Use v4l2_ext_pix_format directly in the ioctl, drop v4l2_ext_format,
> >> making V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] the only valid types.
> >> - Drop VIDIOC_EXT_EXPBUF, since the only difference from VIDIOC_EXPBUF
> >> was that with VIDIOC_EXT_EXPBUF we could export multiple planes at once.
> >> I think we can add this later, so I removed it from this RFC to simplify 
> >> it.
> >> - Remove num_planes field from struct v4l2_ext_buffer
> >> - Add flags field to struct v4l2_ext_create_buffers
> >> - Reformulate struct v4l2_ext_plane
> >> - Fix some bugs caught by v4l2-compliance
> >> - Rebased on top of media/master (post 5.8-rc1)
> >>
> >> Changes in v3:
> >> - Rebased on top of media/master (post 5.4-rc1)
> >>
> >> Changes in v2:
> >> - Add reserved space to v4l2_ext_buffer so that new fields can be added
> >>   later on
> >> ---
> >>  drivers/media/v4l2-core/v4l2-dev.c   |  29 ++-
> >>  drivers/media/v4l2-core/v4l2-ioctl.c | 353 +--
> >>  include/media/v4l2-ioctl.h   |  26 ++
> >>  include/uapi/linux/videodev2.h   |  90 +++
> >>  4 files changed, 476 insertions(+), 22 deletions(-)
> >>
> >> diff --git a/drivers/media/v4l2-core/v4l2-dev.c 
> >> b/drivers/media/v4l2-core/v4l2-dev.c
> >> index e1829906bc086..cb21ee8eb075c 100644
> >> --- a/drivers/media/v4l2-core/v4l2-dev.c
> >> +++ b/drivers/media/v4l2-core/v4l2-dev.c
> >> @@ -720,15 +720,34 @@ static void determine_valid_ioctls(struct 
> >> video_device *vdev)
> >>  SET_VALID_IOCTL(ops, VIDIOC_TRY_FMT, vidioc_try_fmt_sdr_out);
> >>  }
> >>
> >> +if (is_vid || is_tch) {
> >> +/* ioctls valid for video and touch */
> >> +if (ops->vidioc_querybuf || ops->vidioc_ext_querybuf)
> >> +set_bit(_IOC_NR(VIDIOC_EXT_QUERYBUF), valid_ioctls);
> >> +if (ops->vidioc_qbuf || ops->vidioc_ext_qbuf)
> >> +set_bit(_IOC_NR(VIDIOC_EXT_QBUF), valid_ioctls);
> >> +if (ops->vidioc_dqbuf || ops->vidioc_ext_dqbuf)
> >> +set_bit(_IOC_NR(VIDIOC_EXT_DQBUF), valid_ioctls);
> >> +if (ops->vidioc_create_bufs || ops->vidioc_ext_create_bufs)
> >> +set_bit(_IOC_NR(VIDIOC_EXT_CREATE_BUFS), 
> >> valid_ioctls);
> >> +if (ops->vidioc_prepare_buf || ops->vidioc_ext_prepare_buf)
> >> +set_bit(_IOC_NR(VIDIOC_EXT_PREPARE_BUF), 
> >> valid_ioctls);
> >
> > nit: Could we stick to the SET_VALID_IOCTL() macro and 

Re: [PATCH v5 1/7] media: v4l2: Extend pixel formats to unify single/multi-planar handling (and more)

2020-12-14 Thread Tomasz Figa
On Thu, Nov 19, 2020 at 10:43 PM Helen Koike  wrote:
>
>
>
> On 11/19/20 7:08 AM, Helen Koike wrote:
> > Hi Tomasz,
> >
> > On 11/19/20 2:45 AM, Tomasz Figa wrote:
> >> On Sat, Nov 14, 2020 at 11:21:26AM -0300, Helen Koike wrote:
> >>> Hi Tomasz,
> >>>
> >>> On 10/2/20 4:49 PM, Tomasz Figa wrote:
> >>>> Hi Helen,
> >>>>
> >>>> On Tue, Aug 04, 2020 at 04:29:33PM -0300, Helen Koike wrote:
> >> [snip]
> >>>>> +static void v4l_print_ext_pix_format(const void *arg, bool write_only)
> >>>>> +{
> >>>>> + const struct v4l2_ext_pix_format *pix = arg;
> >>>>> + unsigned int i;
> >>>>> +
> >>>>> + pr_cont("type=%s, width=%u, height=%u, format=%c%c%c%c, modifier 
> >>>>> %llx, field=%s, colorspace=%d, ycbcr_enc=%u, quantization=%u, 
> >>>>> xfer_func=%u\n",
> >>>>> + prt_names(pix->type, v4l2_type_names),
> >>>>> + pix->width, pix->height,
> >>>>> + (pix->pixelformat & 0xff),
> >>>>> + (pix->pixelformat >>  8) & 0xff,
> >>>>> + (pix->pixelformat >> 16) & 0xff,
> >>>>> + (pix->pixelformat >> 24) & 0xff,
> >>>>> + pix->modifier, prt_names(pix->field, v4l2_field_names),
> >>>>> + pix->colorspace, pix->ycbcr_enc,
> >>>>> + pix->quantization, pix->xfer_func);
> >>>>> + for (i = 0; i < VIDEO_MAX_PLANES && pix->plane_fmt[i].sizeimage; i++)
> >>>>
> >>>> This is going to print 8 lines every time. Maybe we could skip 0-sized
> >>>> planes or something?
> >>>
> >>> I'm already checking pix->plane_fmt[i].sizeimage in the loop, it shouldn't
> >>> print 8 lines every time.
> >>>
> >>
> >> Oops, how could I not notice it. Sorry for the noise.
> >>
> >> [snip]
> >>>>> +int v4l2_ext_pix_format_to_format(const struct v4l2_ext_pix_format *e,
> >>>>> +   struct v4l2_format *f, bool mplane_cap,
> >>>>> +   bool strict)
> >>>>> +{
> >>>>> + const struct v4l2_plane_ext_pix_format *pe;
> >>>>> + struct v4l2_plane_pix_format *p;
> >>>>> + unsigned int i;
> >>>>> +
> >>>>> + memset(f, 0, sizeof(*f));
> >>>>> +
> >>>>> + /*
> >>>>> +  * Make sure no modifier is required before doing the
> >>>>> +  * conversion.
> >>>>> +  */
> >>>>> + if (e->modifier && strict &&
> >>>>
> >>>> Do we need the explicit check for e->modifier != 0 if we have to
> >>>> check for the 2 specific values below anyway?
> >>>
> >>> We don't, since DRM_FORMAT_MOD_LINEAR is zero.
> >>>
> >>> But I wanted to make it explicit that we don't support modifiers in
> >>> this conversion. I can remove this check, no problem.
> >>>
> >>
> >> Yes, please. I think the double checking is confusing for the reader.
> >
> > ok.
> >
> >>
> >>>>
> >>>>> + e->modifier != DRM_FORMAT_MOD_LINEAR &&
> >>>>> + e->modifier != DRM_FORMAT_MOD_INVALID)
> >>>>> + return -EINVAL;
> >>>>> +
> >>>>> + if (!e->plane_fmt[0].sizeimage && strict)
> >>>>> + return -EINVAL;
> >>>>
> >>>> Why is this incorrect?
> >>>
> >>> !sizeimage for the first plane means that there are no planes in ef.
> >>> strict is true if the result for the conversion should be returned to
> >>> userspace and it is not some internal handling.
> >>>
> >>> So if there are no planes, we would return an incomplete v4l2_format
> >>> struct to userspace.
> >>>
> >>> But this is not very clear, I'll improve this for the next version.
> >>>
> >>
> >> So I can see 2 cases here:
> >>
> >> 1) Userspace gives ext struct and driver accepts legacy.
> >>
> >> In this 

Re: [PATCH v5 3/7] media: videobuf2: Expose helpers to implement the _ext_fmt and _ext_buf hooks

2020-12-14 Thread Tomasz Figa
Hi Helen,

On Tue, Aug 04, 2020 at 04:29:35PM -0300, Helen Koike wrote:
> The VB2 layer is used by a lot of drivers. Patch it to support the
> _EXT_PIX_FMT and _EXT_BUF ioctls in order to ease conversion of existing
> drivers to these new APIs.
> 
> Note that internally, the VB2 core is now only using ext structs and old
> APIs are supported through conversion wrappers.

We decided to only support V4L2_BUF_TYPE_VIDEO* for the ext structs. Still,
existing drivers may use vb2 with the other, legacy, buf types. How would
they be handled with this change?

> 
> Signed-off-by: Boris Brezillon 
> Signed-off-by: Helen Koike 
> ---
> Changes in v5:
> - Update with new format and buffer structs
> - Updated commit message with the uAPI prefix
> 
> Changes in v4:
> - Update with new format and buffer structs
> - Fix some bugs caught by v4l2-compliance
> - Rebased on top of media/master (post 5.8-rc1)
> 
> Changes in v3:
> - Rebased on top of media/master (post 5.4-rc1)
> 
> Changes in v2:
> - New patch
> ---
>  .../media/common/videobuf2/videobuf2-core.c   |   2 +
>  .../media/common/videobuf2/videobuf2-v4l2.c   | 560 ++
>  include/media/videobuf2-core.h|   6 +-
>  include/media/videobuf2-v4l2.h|  21 +-
>  4 files changed, 345 insertions(+), 244 deletions(-)
> 
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c 
> b/drivers/media/common/videobuf2/videobuf2-core.c
> index f544d3393e9d6..d719b1e9c148b 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -1270,6 +1270,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>   vb->planes[plane].length = 0;
>   vb->planes[plane].m.fd = 0;
>   vb->planes[plane].data_offset = 0;
> + vb->planes[plane].dbuf_offset = 0;
>  
>   /* Acquire each plane's memory */
>   mem_priv = call_ptr_memop(vb, attach_dmabuf,
> @@ -1313,6 +1314,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>   vb->planes[plane].length = planes[plane].length;
>   vb->planes[plane].m.fd = planes[plane].m.fd;
>   vb->planes[plane].data_offset = planes[plane].data_offset;
> + vb->planes[plane].dbuf_offset = planes[plane].dbuf_offset;
>   }
>  
>   if (reacquired) {
> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c 
> b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> index 30caad27281e1..911681d24b3ae 100644
> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -56,72 +57,39 @@ module_param(debug, int, 0644);
>V4L2_BUF_FLAG_TIMECODE | \
>V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF)
>  
> -/*
> - * __verify_planes_array() - verify that the planes array passed in struct
> - * v4l2_buffer from userspace can be safely used
> - */
> -static int __verify_planes_array(struct vb2_buffer *vb, const struct 
> v4l2_buffer *b)
> -{
> - if (!V4L2_TYPE_IS_MULTIPLANAR(b->type))
> - return 0;
> -
> - /* Is memory for copying plane information present? */
> - if (b->m.planes == NULL) {
> - dprintk(vb->vb2_queue, 1,
> - "multi-planar buffer passed but planes array not 
> provided\n");
> - return -EINVAL;
> - }
> -
> - if (b->length < vb->num_planes || b->length > VB2_MAX_PLANES) {
> - dprintk(vb->vb2_queue, 1,
> - "incorrect planes array length, expected %d, got %d\n",
> - vb->num_planes, b->length);
> - return -EINVAL;
> - }
> -
> - return 0;
> -}
> -
>  static int __verify_planes_array_core(struct vb2_buffer *vb, const void *pb)
>  {
> - return __verify_planes_array(vb, pb);
> + return 0;
>  }
>  
>  /*
>   * __verify_length() - Verify that the bytesused value for each plane fits in
>   * the plane length and that the data offset doesn't exceed the bytesused 
> value.
>   */
> -static int __verify_length(struct vb2_buffer *vb, const struct v4l2_buffer 
> *b)
> +static int __verify_length(struct vb2_buffer *vb,
> +const struct v4l2_ext_buffer *b)
>  {
>   unsigned int length;
>   unsigned int bytesused;
> - unsigned int plane;
> + unsigned int i;
>  
>   if (V4L2_TYPE_IS_CAPTURE(b->type))
>   return 0;
>  
> - if (V4L2_TYPE_IS_MULTIPLANAR(b->type)) {
> - for (plane = 0; plane < vb->num_planes; ++plane) {
> - length = (b->memory == VB2_MEMORY_USERPTR ||
> -   b->memory == VB2_MEMORY_DMABUF)
> -? b->m.planes[plane].length
> - : vb->planes[plane].length;
> - bytesused = b->m.planes[plane].bytesused
> - 

Re: [PATCH v3 5/6] media: uvcvideo: Use dma_alloc_noncontiguos API

2020-12-09 Thread Tomasz Figa
On Wed, Dec 9, 2020 at 10:05 PM Robin Murphy  wrote:
>
> On 2020-12-09 11:12, Christoph Hellwig wrote:
> > On Tue, Dec 08, 2020 at 01:54:00PM +0900, Tomasz Figa wrote:
> >> From the media perspective, it would be good to have the vmap
> >> optional, similarly to the DMA_ATTR_NO_KERNEL_MAPPING attribute for
> >> coherent allocations. Actually, only a minority of the media drivers
> >> need a kernel mapping of the DMA buffers. Most of them only need to
> >> map the buffers to userspace.
> >>
> >> Nevertheless, that minority actually happens to be quite widely used,
> >> e.g. the uvcvideo driver, so we can't go to the other extreme and just
> >> drop the vmap altogether.
> >
> > My main problem is that the DMA_ATTR_NO_KERNEL_MAPPING makes a mess
> > of an API.  I'd much rather have low-level API that returns the
> > discontiguous allocations and another one that vmaps them rather
> > than starting to overload arguments like in dma_alloc_attrs with
> > DMA_ATTR_NO_KERNEL_MAPPING.

Okay, makes sense. Actually, a separate mapping function makes it
possible to defer creating the mapping to when (and if) it is really
needed.
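
To illustrate the deferred-mapping idea, here is a minimal userspace
sketch. Everything in it is hypothetical (struct lazy_buf and the
lazy_buf_*() helpers are made-up names, and calloc() merely stands in
for the discontiguous allocation); it is not the DMA API, only the
shape of "allocate first, map on first use":

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer: backing memory exists up front, the CPU mapping does not. */
struct lazy_buf {
	void *backing;          /* stands in for the discontiguous pages */
	size_t size;
	void *vaddr;            /* stands in for the kernel mapping, NULL until needed */
	unsigned int map_calls; /* how many times a mapping was actually created */
};

/* Allocate the backing memory only; no mapping is created here. */
int lazy_buf_alloc(struct lazy_buf *buf, size_t size)
{
	buf->backing = calloc(1, size);
	if (!buf->backing)
		return -1;
	buf->size = size;
	buf->vaddr = NULL;
	buf->map_calls = 0;
	return 0;
}

/* Create the mapping on first use and reuse it afterwards. */
void *lazy_buf_vaddr(struct lazy_buf *buf)
{
	if (!buf->vaddr) {
		/* A real driver would call its separate mapping function here. */
		buf->vaddr = buf->backing;
		buf->map_calls++;
	}
	return buf->vaddr;
}

/* Release the backing memory and forget the mapping. */
void lazy_buf_free(struct lazy_buf *buf)
{
	free(buf->backing);
	buf->backing = NULL;
	buf->vaddr = NULL;
}
```

A driver that only maps buffers to userspace would never call
lazy_buf_vaddr() in this model and so would never pay for the mapping.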

>
> Agreed - if iommu-dma's dma_alloc_coherent() ends up as little more than
> a thin wrapper around those two functions I think that would be a good
> sign. It also seems like it might be a good idea for this API to use
> scatterlists rather than page arrays as its fundamental format, to help
> reduce impedance with dma-buf -

True.

> if we can end up with a wider redesign
> that also gets rid of dma_get_sgtable(), all the better!

That would also require taking care of the old dma_alloc_attrs() API.
I guess it could return an already mapped sgtable, with only 1 mapped
entry and as many following entries as needed to describe the physical
pages. To be honest, I'd say this is far out of scope of this
discussion, though.

Best regards,
Tomasz


Re: [PATCH] media: ov8856: Remove 3280x2464 mode

2020-12-08 Thread Tomasz Figa
On Fri, Nov 27, 2020 at 10:38 PM Robert Foss  wrote:
>
> Thanks for digging into this everyone!
>
> Assuming Tomasz doesn't find any stretching, I think we can conclude
> that this mode works, and should be kept. Thanks Dongchun for parsing
> the datasheet and finding the Bayer mode issue for the two other
> recently added resolutions.

I checked the raw output and it actually seems to have 3296x2464
non-black pixels. The rightmost 16 seem to be a copy of the ones from
3248. That might be just some padding from the output DMA, though.

Generally all the datasheets I've seen still suggest that only the
middle 3264x2448 are active pixels to be output, so this warrants
double checking with Omnivision. Let me see what we can do about
this.

Best regards,
Tomasz

>
> On Fri, 27 Nov 2020 at 11:26, Tomasz Figa  wrote:
> >
> > On Thu, Nov 26, 2020 at 7:00 PM Robert Foss  wrote:
> > >
> > > On Wed, 25 Nov 2020 at 08:32, Tomasz Figa  wrote:
> > > >
> > > > Hi Bingbu,
> > > >
> > > > On Wed, Nov 25, 2020 at 1:15 PM Bingbu Cao  
> > > > wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 11/24/20 6:20 PM, Robert Foss wrote:
> > > > > > On Tue, 24 Nov 2020 at 10:42, Bingbu Cao 
> > > > > >  wrote:
> > > > > >>
> > > > > >> Hi, Robert
> > > > > >>
> > > > > >> I remember that the full size of ov8856 image sensor is 3296x2480 
> > > > > >> and we can get the 3280x2464
> > > > > >> frames based on current settings.
> > > > > >>
> > > > > >> Do you have any issues with this mode?
> > > > > >
> > > > > > As far as I can tell using the 3280x2464 mode actually yields an
> > > > > > output resolution that is 3264x2448.
> > > > > >
> > > > > > What does your hardware setup look like? And which revision of the
> > > > > > sensor are you using?
> > > > > >
> > > > >
> > > > > > Robert, the sensor revision I am using is v1.1. I just checked
> > > > > > the actual output pixels on our hardware; the output resolution
> > > > > > with the 2464 mode is 3280x2464, with no black pixels.
> > > > > >
> > > > > > As Tomasz said, some ISPs require extra pixel padding. According
> > > > > > to the ov8856 datasheet, the central 3264x2448 pixels are
> > > > > > *suggested* to be output from the pixel array, and the boundary
> > > > > > pixels can be used for additional processing. In my
> > > > > > understanding, the 32 dummy lines are not black lines.
> > > >
> > > > The datasheet says that only 3264x2448 are active pixels. What pixel
> > > > values are you seeing outside of that central area? In the datasheet,
> > > > those look like "optically black" pixels, which are not 100% black,
> > > > but rather as if the sensor cells didn't receive any light - noise can
> > > > be still there.
> > > >
> > >
> > > I've been developing support for some Qcom ISP functionality, and
> > > during the course of this I ran into the issue I was describing, where
> > > the 3280x2464 mode actually outputs 3264x2448.
> > >
> > > I can think of two reasons for this, either ISP driver bugs on my end
> > > or the fact that the sensor is being run outside of the specification
> > > and which could be resulting in differences between how the ov8856
> > > sensors behave.
> >
> > I just confirmed and we're indeed using this mode in a number of our
> > projects based on the Intel ISP and it seems to be producing a proper
> > image with all pixels of the 3280x2464 matrix having proper values.
> > I'm now double checking whether this isn't some processing done by the
> > ISP, but I suspect the quality would be bad if it stretched the
> > central 3264x2448 part into the 3280x2464 frame.
> >
> > Best regards,
> > Tomasz


Re: [PATCH v3 5/6] media: uvcvideo: Use dma_alloc_noncontiguos API

2020-12-07 Thread Tomasz Figa
Hi Christoph,

On Tue, Dec 1, 2020 at 11:49 PM Christoph Hellwig  wrote:
>
> On Tue, Dec 01, 2020 at 12:36:58PM +0900, Sergey Senozhatsky wrote:
> > Not that I have any sound experience in this area, but the helper
> > probably won't hurt. Do you also plan to add vmap() to that helper
> > or dma_alloc_noncontiguous()/sg_alloc_table_from_pages() only?
>
> Yes, I think adding the vmap is useful, and it probably makes sense
> to do that unconditionally.  I'd also include the fallback to
> dma_alloc_pages when the noncontig version isn't supported in the
> helper.

From the media perspective, it would be good to have the vmap
optional, similarly to the DMA_ATTR_NO_KERNEL_MAPPING attribute for
coherent allocations. Actually, only a minority of the media drivers
need a kernel mapping of the DMA buffers. Most of them only need to
map the buffers to userspace.

Nevertheless, that minority actually happens to be quite widely used,
e.g. the uvcvideo driver, so we can't go to the other extreme and just
drop the vmap altogether.

In any case, Sergey is going to share a preliminary patch on how the
current API would be used in the V4L2 videobuf2 framework. That should
give us more input on how such a helper could look.

Other than that, again, thanks a lot for helping with this.

Best regards,
Tomasz


Re: media: i2c: add OV02A10 image sensor driver

2020-12-03 Thread Tomasz Figa
On Fri, Dec 4, 2020 at 11:47 AM Dongchun Zhu  wrote:
>
> Hi Andy,
>
> On Thu, 2020-12-03 at 20:10 +0200, Andy Shevchenko wrote:
> > On Thu, Dec 3, 2020 at 8:03 PM Colin Ian King  
> > wrote:
> >
> > > Static analysis on linux-next with Coverity has detected an issue with
> > > the following commit:
> >
> > If you want to fix it properly, see my comments below...
> >
> > > 529 static int ov02a10_s_stream(struct v4l2_subdev *sd, int on)
> > > 530 {
> > > 531struct ov02a10 *ov02a10 = to_ov02a10(sd);
> > > 532struct i2c_client *client =
> > > > v4l2_get_subdevdata(&ov02a10->subdev);
> > >
> > >1. var_decl: Declaring variable ret without initializer.
> > >
> > > 533int ret;
> > > 534
> > > > 535mutex_lock(&ov02a10->mutex);
> > > 536
> > >
> > >2. Condition ov02a10->streaming == on, taking true branch.
> > >
> > > 537if (ov02a10->streaming == on)
> > >
> > >3. Jumping to label unlock_and_return.
> > >
> > > 538goto unlock_and_return;
> > > 539
> > > 540if (on) {
> > > > 541ret = pm_runtime_get_sync(&client->dev);
> > > 542if (ret < 0) {
> >
> > > > 543pm_runtime_put_noidle(&client->dev);
> > > 544goto unlock_and_return;
> >
> > Instead of two above:
>
> From the document, pm_runtime_put_noidle is to decrease the runtime PM
> usage counter of a device unless it is 0 already; while pm_runtime_put
> would additionally run pm_request_idle to turn off the power if usage
> counter is zero.
>
> So here maybe we can really use pm_runtime_put instead of
> pm_runtime_put_noidle, although it seems that 'pm_runtime_get_sync' and
> 'pm_runtime_put_noidle' often appear in pairs.
>

In an error state (e.g. if pm_runtime_get_sync() fails),
pm_runtime_put() would decrement the usage counter and call rpm_idle()
which would instantly return an error code. The end result would be
the same, except that pm_runtime_put() would return a non-zero error
code, but we ignore it anyway. However strange it looks, this seems to
be an API guarantee, so Andy's suggestion is correct.
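
As a rough illustration of why both variants leave the usage counter
balanced, here is a small userspace model. It is only a sketch under
stated assumptions: struct fake_dev, struct fake_sensor and the
fake_*() functions are hypothetical stand-ins, not the runtime PM API,
and the -EINVAL returned by fake_put() is merely how this model
represents "an error code". The control flow follows Andy's suggested
layout, with the unlock_and_return label moved onto the success path:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime PM state of one device. */
struct fake_dev {
	int usage_count;   /* models the runtime PM usage counter */
	bool resume_fails; /* forces the resume step to fail */
	bool powered;
};

/* Models pm_runtime_get_sync(): the counter is bumped even on failure. */
int fake_get_sync(struct fake_dev *dev)
{
	dev->usage_count++;
	if (dev->resume_fails)
		return -EIO;
	dev->powered = true;
	return 0;
}

/* Models pm_runtime_put(): drop the counter, then try to idle the device. */
int fake_put(struct fake_dev *dev)
{
	if (dev->usage_count > 0)
		dev->usage_count--;
	if (dev->resume_fails)
		return -EINVAL; /* idle attempt bails out in the error state */
	if (dev->usage_count == 0)
		dev->powered = false;
	return 0;
}

/* Hypothetical sensor state, loosely shaped like the driver in the report. */
struct fake_sensor {
	struct fake_dev dev;
	bool streaming;
	bool start_fails; /* forces the stream start to fail */
};

int fake_s_stream(struct fake_sensor *s, bool on)
{
	int ret;

	/* mutex_lock() would go here */
	if (s->streaming == on)
		goto unlock_and_return;

	if (on) {
		ret = fake_get_sync(&s->dev);
		if (ret < 0)
			goto err_rpm_put;
		if (s->start_fails) {
			ret = -EIO;
			goto err_rpm_put;
		}
	} else {
		fake_put(&s->dev);
	}

	s->streaming = on;
unlock_and_return:
	/* mutex_unlock() would go here */
	return 0;

err_rpm_put:
	fake_put(&s->dev); /* return value ignored, as in the driver */
	/* mutex_unlock() would go here */
	return ret;
}
```

With this layout ret is only read after it has been assigned, so the
uninitialized-use path from the report does not exist in the model.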

Best regards,
Tomasz

> >goto err_rpm_put;
> >
> > > 545}
> > > 546
> > > 547ret = __ov02a10_start_stream(ov02a10);
> > > 548if (ret) {
> > > 549__ov02a10_stop_stream(ov02a10);
> > > 550ov02a10->streaming = !on;
> > > 551goto err_rpm_put;
> > > 552}
> > > 553} else {
> > > 554__ov02a10_stop_stream(ov02a10);
> > > > 555pm_runtime_put(&client->dev);
> > > 556}
> > > 557
> > > 558ov02a10->streaming = on;
> >
> > (1)
> >
> > > > 559mutex_unlock(&ov02a10->mutex);
> > > 560
> > > 561return 0;
> > > 562
> > > 563 err_rpm_put:
> > > > 564pm_runtime_put(&client->dev);
> >
> > > 565 unlock_and_return:
> >
> > Should be moved to (1).
> >
> > > > 566mutex_unlock(&ov02a10->mutex);
> > > 567
> > >
> > > Uninitialized scalar variable (UNINIT)
> > > 4. uninit_use: Using uninitialized value ret.
> > >
> > > 568return ret;
> > > 569 }
> > >
> > > Variable ret has not been initialized, so the error return value is a
> > > garbage value. It should be initialized with some appropriate negative
> > > error code, or ret could be removed and the return should return a
> > > literal value of a error code.
> > >
> > > I was unsure what value is appropriate to fix this, so instead I'm
> > > reporting this issue.
> >
>


Re: [PATCH] media: vb2: always set buffer cache sync hints

2020-11-27 Thread Tomasz Figa
On Sat, Nov 28, 2020 at 1:35 AM Sergey Senozhatsky
 wrote:
>
> On (20/11/27 15:56), Hans Verkuil wrote:
> > Yes.
> >
> > BTW, wouldn't it be sufficient to change this code to:
> >
> >   if (!q->allow_cache_hints && q->memory != VB2_MEMORY_DMABUF) {
> >   vb->need_cache_sync_on_prepare = 1;
> >   vb->need_cache_sync_on_finish = 1;
> >   }
>
> I think it would be sufficient.

Does it matter at this point if allow_cache_hints is set or not?
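
For reference, the condition in Hans's snippet can be tried out in
isolation. This is only a sketch with stand-in types: struct
fake_queue, struct fake_buffer and enum fake_memory are hypothetical,
not the vb2 structures, and the function models nothing beyond the
quoted "if":

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical memory types, mirroring only what the snippet distinguishes. */
enum fake_memory {
	FAKE_MEMORY_MMAP,
	FAKE_MEMORY_USERPTR,
	FAKE_MEMORY_DMABUF,
};

struct fake_queue {
	bool allow_cache_hints;
	enum fake_memory memory;
};

struct fake_buffer {
	bool need_cache_sync_on_prepare;
	bool need_cache_sync_on_finish;
};

/*
 * Same condition as the quoted snippet: force both syncs unless the queue
 * allows cache hints or uses DMABUF memory. Otherwise the flags are left
 * exactly as the caller set them.
 */
void fake_force_cache_sync(const struct fake_queue *q, struct fake_buffer *vb)
{
	if (!q->allow_cache_hints && q->memory != FAKE_MEMORY_DMABUF) {
		vb->need_cache_sync_on_prepare = true;
		vb->need_cache_sync_on_finish = true;
	}
}
```

In this model a DMABUF queue is never forced to sync, whatever
allow_cache_hints says.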

Best regards,
Tomasz


Re: [PATCH] media: ov8856: Remove 3280x2464 mode

2020-11-27 Thread Tomasz Figa
On Thu, Nov 26, 2020 at 7:00 PM Robert Foss  wrote:
>
> On Wed, 25 Nov 2020 at 08:32, Tomasz Figa  wrote:
> >
> > Hi Bingbu,
> >
> > On Wed, Nov 25, 2020 at 1:15 PM Bingbu Cao  
> > wrote:
> > >
> > >
> > >
> > > On 11/24/20 6:20 PM, Robert Foss wrote:
> > > > On Tue, 24 Nov 2020 at 10:42, Bingbu Cao  
> > > > wrote:
> > > >>
> > > >> Hi, Robert
> > > >>
> > > >> I remember that the full size of ov8856 image sensor is 3296x2480 and 
> > > >> we can get the 3280x2464
> > > >> frames based on current settings.
> > > >>
> > > >> Do you have any issues with this mode?
> > > >
> > > > As far as I can tell using the 3280x2464 mode actually yields an
> > > > output resolution that is 3264x2448.
> > > >
> > > > What does your hardware setup look like? And which revision of the
> > > > sensor are you using?
> > > >
> > >
> > > > Robert, the sensor revision I am using is v1.1. I just checked the
> > > > actual output pixels on our hardware; the output resolution with the
> > > > 2464 mode is 3280x2464, with no black pixels.
> > > >
> > > > As Tomasz said, some ISPs require extra pixel padding. According to
> > > > the ov8856 datasheet, the central 3264x2448 pixels are *suggested* to
> > > > be output from the pixel array, and the boundary pixels can be used
> > > > for additional processing. In my understanding, the 32 dummy lines
> > > > are not black lines.
> >
> > The datasheet says that only 3264x2448 are active pixels. What pixel
> > values are you seeing outside of that central area? In the datasheet,
> > those look like "optically black" pixels, which are not 100% black,
> > but rather as if the sensor cells didn't receive any light - noise can
> > be still there.
> >
>
> I've been developing support for some Qcom ISP functionality, and
> during the course of this I ran into the issue I was describing, where
> the 3280x2464 mode actually outputs 3264x2448.
>
> I can think of two reasons for this, either ISP driver bugs on my end
> or the fact that the sensor is being run outside of the specification
> and which could be resulting in differences between how the ov8856
> sensors behave.

I just confirmed and we're indeed using this mode in a number of our
projects based on the Intel ISP and it seems to be producing a proper
image with all pixels of the 3280x2464 matrix having proper values.
I'm now double checking whether this isn't some processing done by the
ISP, but I suspect the quality would be bad if it stretched the
central 3264x2448 part into the 3280x2464 frame.

Best regards,
Tomasz


Re: [PATCH] media: ov8856: Remove 3280x2464 mode

2020-11-24 Thread Tomasz Figa
Hi Bingbu,

On Wed, Nov 25, 2020 at 1:15 PM Bingbu Cao  wrote:
>
>
>
> On 11/24/20 6:20 PM, Robert Foss wrote:
> > On Tue, 24 Nov 2020 at 10:42, Bingbu Cao  wrote:
> >>
> >> Hi, Robert
> >>
> >> I remember that the full size of ov8856 image sensor is 3296x2480 and we 
> >> can get the 3280x2464
> >> frames based on current settings.
> >>
> >> Do you have any issues with this mode?
> >
> > As far as I can tell using the 3280x2464 mode actually yields an
> > output resolution that is 3264x2448.
> >
> > What does your hardware setup look like? And which revision of the
> > sensor are you using?
> >
>
> Robert, the sensor revision I am using is v1.1. I just checked the actual
> output pixels on our hardware; the output resolution with the 2464 mode
> is 3280x2464, with no black pixels.
>
> As Tomasz said, some ISPs require extra pixel padding. According to the
> ov8856 datasheet, the central 3264x2448 pixels are *suggested* to be
> output from the pixel array, and the boundary pixels can be used for
> additional processing. In my understanding, the 32 dummy lines are not
> black lines.

The datasheet says that only 3264x2448 are active pixels. What pixel
values are you seeing outside of that central area? In the datasheet,
those look like "optically black" pixels, which are not 100% black,
but rather as if the sensor cells didn't receive any light - noise can
be still there.

Best regards,
Tomasz


Re: [PATCH] media: ov8856: Remove 3280x2464 mode

2020-11-24 Thread Tomasz Figa
Hi Robert,

On Tue, Nov 17, 2020 at 12:52 AM Robert Foss  wrote:
>
> Remove the 3280x2464 mode as it can't be reproduced and yields
> an output resolution of 3264x2448 instead of the desired one.
>
> Furthermore the 3264x2448 resolution is the highest resolution
> that the product brief lists.
>
> Since 3280x2464 neither works correctly nor seems to be supported
> by the sensor, let's remove it.
>

Let me check which modes are used by our projects. For one I'm sure
it's the 3264, but not sure about the other.

To be fair, 3280 sounds like a valid setup, with black pixels on the
edges. It's sometimes needed to add the black pixels either due to ISP
requirements or to obtain the black pixel values.
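
To put numbers on the "black pixels on the edges" case, here is a
small sketch of the arithmetic. It assumes the 3264x2448 area is
centered inside the 3280x2464 mode, which is an assumption and not
something confirmed in this thread; struct frame_size, struct border
and centered_border() are hypothetical names:

```c
#include <assert.h>

struct frame_size {
	unsigned int width;
	unsigned int height;
};

struct border {
	unsigned int left_right; /* pixels on each of the left and right sides */
	unsigned int top_bottom; /* lines on each of the top and bottom sides */
};

/* Per-side border when `inner` is centered inside `outer`. */
struct border centered_border(struct frame_size outer, struct frame_size inner)
{
	struct border b = {
		.left_right = (outer.width - inner.width) / 2,
		.top_bottom = (outer.height - inner.height) / 2,
	};
	return b;
}
```

For outer = 3280x2464 and inner = 3264x2448 this gives
(3280 - 3264) / 2 = 8 pixels per side horizontally and
(2464 - 2448) / 2 = 8 lines per side vertically.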

Best regards,
Tomasz

> Signed-off-by: Robert Foss 
> ---
>  drivers/media/i2c/ov8856.c | 202 -
>  1 file changed, 202 deletions(-)
>
> diff --git a/drivers/media/i2c/ov8856.c b/drivers/media/i2c/ov8856.c
> index 2f4ceaa80593..3365d19a303d 100644
> --- a/drivers/media/i2c/ov8856.c
> +++ b/drivers/media/i2c/ov8856.c
> @@ -148,196 +148,6 @@ static const struct ov8856_reg mipi_data_rate_360mbps[] 
> = {
> {0x031e, 0x0c},
>  };
>
> -static const struct ov8856_reg mode_3280x2464_regs[] = {
> -   {0x3000, 0x20},
> -   {0x3003, 0x08},
> -   {0x300e, 0x20},
> -   {0x3010, 0x00},
> -   {0x3015, 0x84},
> -   {0x3018, 0x72},
> -   {0x3021, 0x23},
> -   {0x3033, 0x24},
> -   {0x3500, 0x00},
> -   {0x3501, 0x9a},
> -   {0x3502, 0x20},
> -   {0x3503, 0x08},
> -   {0x3505, 0x83},
> -   {0x3508, 0x01},
> -   {0x3509, 0x80},
> -   {0x350c, 0x00},
> -   {0x350d, 0x80},
> -   {0x350e, 0x04},
> -   {0x350f, 0x00},
> -   {0x3510, 0x00},
> -   {0x3511, 0x02},
> -   {0x3512, 0x00},
> -   {0x3600, 0x72},
> -   {0x3601, 0x40},
> -   {0x3602, 0x30},
> -   {0x3610, 0xc5},
> -   {0x3611, 0x58},
> -   {0x3612, 0x5c},
> -   {0x3613, 0xca},
> -   {0x3614, 0x20},
> -   {0x3628, 0xff},
> -   {0x3629, 0xff},
> -   {0x362a, 0xff},
> -   {0x3633, 0x10},
> -   {0x3634, 0x10},
> -   {0x3635, 0x10},
> -   {0x3636, 0x10},
> -   {0x3663, 0x08},
> -   {0x3669, 0x34},
> -   {0x366e, 0x10},
> -   {0x3706, 0x86},
> -   {0x370b, 0x7e},
> -   {0x3714, 0x23},
> -   {0x3730, 0x12},
> -   {0x3733, 0x10},
> -   {0x3764, 0x00},
> -   {0x3765, 0x00},
> -   {0x3769, 0x62},
> -   {0x376a, 0x2a},
> -   {0x376b, 0x30},
> -   {0x3780, 0x00},
> -   {0x3781, 0x24},
> -   {0x3782, 0x00},
> -   {0x3783, 0x23},
> -   {0x3798, 0x2f},
> -   {0x37a1, 0x60},
> -   {0x37a8, 0x6a},
> -   {0x37ab, 0x3f},
> -   {0x37c2, 0x04},
> -   {0x37c3, 0xf1},
> -   {0x37c9, 0x80},
> -   {0x37cb, 0x16},
> -   {0x37cc, 0x16},
> -   {0x37cd, 0x16},
> -   {0x37ce, 0x16},
> -   {0x3800, 0x00},
> -   {0x3801, 0x00},
> -   {0x3802, 0x00},
> -   {0x3803, 0x06},
> -   {0x3804, 0x0c},
> -   {0x3805, 0xdf},
> -   {0x3806, 0x09},
> -   {0x3807, 0xa7},
> -   {0x3808, 0x0c},
> -   {0x3809, 0xd0},
> -   {0x380a, 0x09},
> -   {0x380b, 0xa0},
> -   {0x380c, 0x07},
> -   {0x380d, 0x88},
> -   {0x380e, 0x09},
> -   {0x380f, 0xb8},
> -   {0x3810, 0x00},
> -   {0x3811, 0x00},
> -   {0x3812, 0x00},
> -   {0x3813, 0x01},
> -   {0x3814, 0x01},
> -   {0x3815, 0x01},
> -   {0x3816, 0x00},
> -   {0x3817, 0x00},
> -   {0x3818, 0x00},
> -   {0x3819, 0x10},
> -   {0x3820, 0x80},
> -   {0x3821, 0x46},
> -   {0x382a, 0x01},
> -   {0x382b, 0x01},
> -   {0x3830, 0x06},
> -   {0x3836, 0x02},
> -   {0x3862, 0x04},
> -   {0x3863, 0x08},
> -   {0x3cc0, 0x33},
> -   {0x3d85, 0x17},
> -   {0x3d8c, 0x73},
> -   {0x3d8d, 0xde},
> -   {0x4001, 0xe0},
> -   {0x4003, 0x40},
> -   {0x4008, 0x00},
> -   {0x4009, 0x0b},
> -   {0x400a, 0x00},
> -   {0x400b, 0x84},
> -   {0x400f, 0x80},
> -   {0x4010, 0xf0},
> -   {0x4011, 0xff},
> -   {0x4012, 0x02},
> -   {0x4013, 0x01},
> -   {0x4014, 0x01},
> -   {0x4015, 0x01},
> -   {0x4042, 0x00},
> -   {0x4043, 0x80},
> -   {0x4044, 0x00},
> -   {0x4045, 0x80},
> -   {0x4046, 0x00},
> -   {0x4047, 0x80},
> -   {0x4048, 0x00},
> -   {0x4049, 0x80},
> -   {0x4041, 0x03},
> -   {0x404c, 0x20},
> -   {0x404d, 0x00},
> -   {0x404e, 0x20},
> -   {0x4203, 0x80},
> -   {0x4307, 0x30},
> -   {0x4317, 0x00},
> -   {0x4503, 0x08},
> -   {0x4601, 0x80},
> -   {0x4800, 0x44},
> -   {0x4816, 0x53},
> -   {0x481b, 0x58},
> -   {0x481f, 0x27},
> -   {0x4837, 0x16},
> -   {0x483c, 0x0f},
> -   {0x484b, 0x05},
> -   {0x5000, 0x57},
> -   {0x5001, 0x0a},
> -   {0x5004, 0x04},
> -   {0x502e, 0x03},
> -   

Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2020-11-23 Thread Tomasz Figa
On Tue, Nov 24, 2020 at 12:08 AM Helen Koike  wrote:
>
> Hi Hans,
>
> Thank you for your review.
>
> On 9/9/20 9:27 AM, Hans Verkuil wrote:
> > Hi Helen,
> >
> > Again I'm just reviewing the uAPI.
> >
> > On 04/08/2020 21:29, Helen Koike wrote:
> >> From: Hans Verkuil 
> >>
> >> Those extended buffer ops have several purposes:
> >> 1/ Fix y2038 issues by converting the timestamp into an u64 counting
> >>the number of ns elapsed since 1970
> >> 2/ Unify single/multiplanar handling
> >> 3/ Add a new start offset field to each v4l2 plane buffer info struct
> >>to support the case where a single buffer object is storing all
> >>planes data, each one being placed at a different offset
> >>
> >> New hooks are created in v4l2_ioctl_ops so that drivers can start using
> >> these new objects.
> >>
> >> The core takes care of converting new ioctls requests to old ones
> >> if the driver does not support the new hooks, and vice versa.
> >>
> >> Note that the timecode field is gone, since there doesn't seem to be
> >> in-kernel users. It can be added back in the reserved area if needed or
> >> use the Request API to collect more metadata information from the
> >> frame.
> >>
> >> Signed-off-by: Hans Verkuil 
> >> Signed-off-by: Boris Brezillon 
> >> Signed-off-by: Helen Koike 
> >> ---
> >> Changes in v5:
> >> - migrate memory from v4l2_ext_buffer to v4l2_ext_plane
> >> - return mem_offset to struct v4l2_ext_plane
> >> - change sizes and reorder fields to avoid holes in the struct and make
> >>   it the same for 32 and 64 bits
> >>
> >> Changes in v4:
> >> - Use v4l2_ext_pix_format directly in the ioctl, drop v4l2_ext_format,
> >> making V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] the only valid types.
> >> - Drop VIDIOC_EXT_EXPBUF, since the only difference from VIDIOC_EXPBUF
> >> was that with VIDIOC_EXT_EXPBUF we could export multiple planes at once.
> >> I think we can add this later, so I removed it from this RFC to simplify 
> >> it.
> >> - Remove num_planes field from struct v4l2_ext_buffer
> >> - Add flags field to struct v4l2_ext_create_buffers
> >> - Reformulate struct v4l2_ext_plane
> >> - Fix some bugs caught by v4l2-compliance
> >> - Rebased on top of media/master (post 5.8-rc1)
> >>
> >> Changes in v3:
> >> - Rebased on top of media/master (post 5.4-rc1)
> >>
> >> Changes in v2:
> >> - Add reserved space to v4l2_ext_buffer so that new fields can be added
> >>   later on
> >> ---
> >>  drivers/media/v4l2-core/v4l2-dev.c   |  29 ++-
> >>  drivers/media/v4l2-core/v4l2-ioctl.c | 353 +--
> >>  include/media/v4l2-ioctl.h   |  26 ++
> >>  include/uapi/linux/videodev2.h   |  90 +++
> >>  4 files changed, 476 insertions(+), 22 deletions(-)
> >>
> >
> > 
> >
> >> diff --git a/include/uapi/linux/videodev2.h 
> >> b/include/uapi/linux/videodev2.h
> >> index 7123c6a4d9569..334cafdd2be97 100644
> >> --- a/include/uapi/linux/videodev2.h
> >> +++ b/include/uapi/linux/videodev2.h
> >> @@ -996,6 +996,41 @@ struct v4l2_plane {
> >>  __u32   reserved[11];
> >>  };
> >>
> >> +/**
> >> + * struct v4l2_ext_plane - extended plane buffer info
> >> + * @buffer_length:  size of the entire buffer in bytes, should fit
> >> + *  @offset + @plane_length
> >> + * @plane_length:   size of the plane in bytes.
> >> + * @mem_offset: If V4L2_MEMORY_MMAP is used, then it can be a 
> >> "cookie"
> >> + *  that should be passed to mmap() called on the video 
> >> node.
> >> + * @userptr:when memory is V4L2_MEMORY_USERPTR, a 
> >> userspace pointer pointing
> >> + *  to this plane.
> >> + * @dmabuf_fd:  when memory is V4L2_MEMORY_DMABUF, a 
> >> userspace file descriptor
> >> + *  associated with this plane.
> >> + * @offset: offset in the memory buffer where the plane starts.
> >> + * @memory: enum v4l2_memory; the method, in which the actual 
> >> video
> >> + *  data is passed
> >> + * @reserved:   extra space reserved for future fields, must 
> >> be set to 0.
> >> + *
> >> + *
> >> + * Buffers consist of one or more planes, e.g. an YCbCr buffer with two 
> >> planes
> >> + * can have one plane for Y, and another for interleaved CbCr components.
> >> + * Each plane can reside in a separate memory buffer, or even in
> >> + * a completely separate memory node (e.g. in embedded devices).
> >> + */
> >> +struct v4l2_ext_plane {
> >> +__u32 buffer_length;
> >> +__u32 plane_length;
> >> +union {
> >> +__u32 mem_offset;
> >> +__u64 userptr;
> >> +__s32 dmabuf_fd;
> >> +} m;
> >> +__u32 offset;
> >
> > I'd rename this plane_offset. I think some reordering would make this 
> > struct easier
> > to understand:
> >
> > struct v4l2_ext_plane {
> >   __u32 buffer_length;
> >   __u32 plane_offset;
> >   __u32 plane_length;
> >   __u32 memory;
> >   union {
> >

Re: [PATCH v5 7/7] media: docs: add documentation for the Extended API

2020-11-20 Thread Tomasz Figa
On Fri, Nov 20, 2020 at 9:24 PM Hans Verkuil  wrote:
>
> On 20/11/2020 12:06, Tomasz Figa wrote:
> > Hi Helen,
> >
> > On Tue, Aug 04, 2020 at 04:29:39PM -0300, Helen Koike wrote:
> >> Add documentation and update references in current documentation for the
> >> Extended API.
> >>
> >
> > Thank you for the patch. Please see my comments inline.
> >
> >> Signed-off-by: Helen Koike 
> >> ---
> >> Changes in v5:
> >> - new patch
> >>
> >>  .../userspace-api/media/v4l/buffer.rst|   5 +
> >>  .../userspace-api/media/v4l/common.rst|   1 +
> >>  .../userspace-api/media/v4l/dev-capture.rst   |   5 +
> >>  .../userspace-api/media/v4l/dev-output.rst|   5 +
> >>  .../userspace-api/media/v4l/ext-api.rst   | 107 +
> >>  .../userspace-api/media/v4l/format.rst|  16 +-
> >>  .../userspace-api/media/v4l/user-func.rst |   5 +
> >>  .../media/v4l/vidioc-ext-create-bufs.rst  |  95 
> >>  .../media/v4l/vidioc-ext-prepare-buf.rst  |  62 ++
> >>  .../media/v4l/vidioc-ext-qbuf.rst | 204 ++
> >>  .../media/v4l/vidioc-ext-querybuf.rst |  79 +++
> >>  .../media/v4l/vidioc-g-ext-pix-fmt.rst| 117 ++
> >>  12 files changed, 697 insertions(+), 4 deletions(-)
> >>  create mode 100644 Documentation/userspace-api/media/v4l/ext-api.rst
> >>  create mode 100644 
> >> Documentation/userspace-api/media/v4l/vidioc-ext-create-bufs.rst
> >>  create mode 100644 
> >> Documentation/userspace-api/media/v4l/vidioc-ext-prepare-buf.rst
> >>  create mode 100644 
> >> Documentation/userspace-api/media/v4l/vidioc-ext-qbuf.rst
> >>  create mode 100644 
> >> Documentation/userspace-api/media/v4l/vidioc-ext-querybuf.rst
> >>  create mode 100644 
> >> Documentation/userspace-api/media/v4l/vidioc-g-ext-pix-fmt.rst
> >>
> >> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst 
> >> b/Documentation/userspace-api/media/v4l/buffer.rst
> >> index 57e752aaf414a..c832bedd64e4c 100644
> >> --- a/Documentation/userspace-api/media/v4l/buffer.rst
> >> +++ b/Documentation/userspace-api/media/v4l/buffer.rst
> >> @@ -27,6 +27,11 @@ such as pointers and sizes for each plane, are stored in
> >>  struct :c:type:`v4l2_plane` instead. In that case,
> >>  struct :c:type:`v4l2_buffer` contains an array of plane structures.
> >>
> >> +.. note::
> >> +
> >> +The :ref:`ext_api` version can also be used, and it is
> >> +preferable when applicable.
> >
> > Would rephrasing this as below make it a bit more definitive?
> >
> >   For modern applications, these operations are replaced by their
> >   :ref:`ext_api` counterparts, which should be used instead.
>
> You can't say that, since especially in the beginning userspace will be 
> running
> on older kernels that do not support this.
>
> This will work: "should be used instead, if supported by the driver."
>

With the wrappers that the patches provide, all drivers would support
the new API, so this boils down to the kernel version only, not
specific drivers.

Agreed, though, that application developers must be made aware that
the new API is only available in new kernels and that the old API must
be used for compatibility with old kernels.
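
For what it's worth, the compatibility handling in an application could
look roughly like the sketch below. This is only an illustration under
assumptions: the callback type and all mock_* names are made up so the
snippet builds on its own, and a real application would also have to
convert between the extended and legacy structures, which is glossed
over here. It relies on V4L2 reporting ENOTTY for ioctls the kernel does
not know.

```c
#include <errno.h>

/* Hypothetical callback type standing in for a wrapper around ioctl(). */
typedef int (*mock_ioctl_fn)(void *arg);

/* Models an old kernel: the extended ioctl is unknown, so ENOTTY is reported. */
static int mock_ext_qbuf_old_kernel(void *arg)
{
	(void)arg;
	errno = ENOTTY;
	return -1;
}

/* Models the legacy ioctl succeeding; records that it was used. */
static int mock_legacy_qbuf(void *arg)
{
	*(int *)arg = 1;
	return 0;
}

/* Try the extended operation first and fall back to the legacy one. */
static int mock_queue_buffer(mock_ioctl_fn ext_qbuf, mock_ioctl_fn legacy_qbuf,
			     void *arg)
{
	int ret = ext_qbuf(arg);

	/* Only fall back when the kernel does not know the new ioctl. */
	if (ret < 0 && errno == ENOTTY)
		ret = legacy_qbuf(arg);

	return ret;
}
```

Any other error from the extended call is passed through unchanged, so
real failures are not hidden by the fallback.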

> >
> >> +
> >>  Dequeued video buffers come with timestamps. The driver decides at which
> >>  part of the frame and with which clock the timestamp is taken. Please
> >>  see flags in the masks ``V4L2_BUF_FLAG_TIMESTAMP_MASK`` and
> >> diff --git a/Documentation/userspace-api/media/v4l/common.rst 
> >> b/Documentation/userspace-api/media/v4l/common.rst
> >> index 7d81c58a13cd7..3430e0bdad667 100644
> >> --- a/Documentation/userspace-api/media/v4l/common.rst
> >> +++ b/Documentation/userspace-api/media/v4l/common.rst
> >> @@ -59,6 +59,7 @@ applicable to all devices.
> >>  ext-ctrls-detect
> >>  fourcc
> >>  format
> >> +ext-api
> >>  planar-apis
> >>  selection-api
> >>  crop
> >> diff --git a/Documentation/userspace-api/media/v4l/dev-capture.rst 
> >> b/Documentation/userspace-api/media/v4l/dev-capture.rst
> >> index 44d3094093ab6..5077639787d92 100644
> >> --- a/Documentation/userspace-api/media/v4l/dev-capture.rst
> >> +++ b/Documentation/userspace-api/media/v4l/dev-capture.rst
> >> @@ -102,6 +102,11 @@ and :ref:`VIDIOC_

Re: [PATCH v6 09/17] media/videbuf1|2: Mark follow_pfn usage as unsafe

2020-11-20 Thread Tomasz Figa
On Fri, Nov 20, 2020 at 9:08 PM Hans Verkuil  wrote:
>
> On 20/11/2020 11:51, Daniel Vetter wrote:
> > On Fri, Nov 20, 2020 at 11:39 AM Hans Verkuil  wrote:
> >>
> >> On 20/11/2020 10:18, Daniel Vetter wrote:
> >>> On Fri, Nov 20, 2020 at 9:28 AM Hans Verkuil  wrote:
> >>>>
> >>>> On 20/11/2020 09:06, Hans Verkuil wrote:
> >>>>> On 19/11/2020 15:41, Daniel Vetter wrote:
> >>>>>> The media model assumes that buffers are all preallocated, so that
> >>>>>> when a media pipeline is running we never miss a deadline because the
> >>>>>> buffers aren't allocated or available.
> >>>>>>
> >>>>>> This means we cannot fix the v4l follow_pfn usage through
> >>>>>> mmu_notifier, without breaking how this all works. The only real fix
> >>>>>> is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and
> >>>>>> tell everyone to cut over to dma-buf memory sharing for zerocopy.
> >>>>>>
> >>>>>> userptr for normal memory will keep working as-is, this only affects
> >>>>>> the zerocopy userptr usage enabled in 50ac952d2263 ("[media]
> >>>>>> videobuf2-dma-sg: Support io userptr operations on io memory").
> >>>>>>
> >>>>>> Acked-by: Tomasz Figa 
> >>>>>
> >>>>> Acked-by: Hans Verkuil 
> >>>>
> >>>> Actually, cancel this Acked-by.
> >>>>
> >>>> So let me see if I understand this right: VM_IO | VM_PFNMAP mappings can
> >>>> move around. There is a mmu_notifier that can be used to be notified when
> >>>> that happens, but that can't be used with media buffers since those 
> >>>> buffers
> >>>> must always be available and in the same place.
> >>>>
> >>>> So follow_pfn is replaced by unsafe_follow_pfn to signal that what is 
> >>>> attempted
> >>>> is unsafe and unreliable.
> >>>>
> >>>> If CONFIG_STRICT_FOLLOW_PFN is set, then unsafe_follow_pfn will fail, if 
> >>>> it
> >>>> is unset, then it writes a warning to the kernel log but just continues 
> >>>> while
> >>>> still unsafe.
> >>>>
> >>>> I am very much inclined to just drop VM_IO | VM_PFNMAP support in the 
> >>>> media
> >>>> subsystem. For vb2 there is a working alternative in the form of dmabuf, 
> >>>> and
> >>>> frankly for vb1 I don't care. If someone really needs this for a vb1 
> >>>> driver,
> >>>> then they can do the work to convert that driver to vb2.
> >>>>
> >>>> I've added Mauro to the CC list and I'll ping a few more people to see 
> >>>> what
> >>>> they think, but in my opinion support for USERPTR + VM_IO | VM_PFNMAP
> >>>> should just be killed off.
> >>>>
> >>>> If others would like to keep it, then frame_vector.c needs a comment 
> >>>> before
> >>>> the 'while' explaining why the unsafe_follow_pfn is there and that using
> >>>> dmabuf is the proper alternative to use. That will make it easier for
> >>>> developers to figure out why they see a kernel warning and what to do to
> >>>> fix it, rather than having to dig through the git history for the reason.
> >>>
> >>> I'm happy to add a comment, but otherwise if you all want to ditch
> >>> this, can we do this as a follow up on top? There's quite a bit of
> >>> code that can be deleted and I'd like to not hold up this patch set
> >>> here on that - it's already a fairly sprawling pain touching about 7
> >>> different subsystems (ok only 6-ish now since the s390 patch landed).
> >>> For the comment, is the explanation next to unsafe_follow_pfn not good
> >>> enough?
> >>
> >> No, because that doesn't mention that you should use dma-buf as a 
> >> replacement.
> >> That's really the critical piece of information I'd like to see. That 
> >> doesn't
> >> belong in unsafe_follow_pfn, it needs to be in frame_vector.c since it's
> >> vb2 specific.
> >
> > Ah makes sense, I'll add that.
> >
> >>>
> >>> So ... can I get you to un-cancel your ack?
> >>
> >> Hmm, I really would like to see support for this 

Re: [PATCH v5 2/7] media: v4l2: Add extended buffer operations

2020-11-20 Thread Tomasz Figa
Hi Helen,

On Tue, Aug 04, 2020 at 04:29:34PM -0300, Helen Koike wrote:
> From: Hans Verkuil 
> 
> Those extended buffer ops have several purposes:
> 1/ Fix y2038 issues by converting the timestamp into an u64 counting
>the number of ns elapsed since 1970
> 2/ Unify single/multiplanar handling
> 3/ Add a new start offset field to each v4l2 plane buffer info struct
>to support the case where a single buffer object is storing all
>planes data, each one being placed at a different offset
> 
> New hooks are created in v4l2_ioctl_ops so that drivers can start using
> these new objects.
> 
> The core takes care of converting new ioctls requests to old ones
> if the driver does not support the new hooks, and vice versa.
> 
> Note that the timecode field is gone, since there doesn't seem to be
> in-kernel users. It can be added back in the reserved area if needed or
> use the Request API to collect more metadata information from the
> frame.
> 

Thanks for the patch. Please see my comments inline.

> Signed-off-by: Hans Verkuil 
> Signed-off-by: Boris Brezillon 
> Signed-off-by: Helen Koike 
> ---
> Changes in v5:
> - migrate memory from v4l2_ext_buffer to v4l2_ext_plane
> - return mem_offset to struct v4l2_ext_plane
> - change sizes and reorder fields to avoid holes in the struct and make
>   it the same for 32 and 64 bits
> 
> Changes in v4:
> - Use v4l2_ext_pix_format directly in the ioctl, drop v4l2_ext_format,
> making V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] the only valid types.
> - Drop VIDIOC_EXT_EXPBUF, since the only difference from VIDIOC_EXPBUF
> was that with VIDIOC_EXT_EXPBUF we could export multiple planes at once.
> I think we can add this later, so I removed it from this RFC to simplify it.
> - Remove num_planes field from struct v4l2_ext_buffer
> - Add flags field to struct v4l2_ext_create_buffers
> - Reformulate struct v4l2_ext_plane
> - Fix some bugs caught by v4l2-compliance
> - Rebased on top of media/master (post 5.8-rc1)
> 
> Changes in v3:
> - Rebased on top of media/master (post 5.4-rc1)
> 
> Changes in v2:
> - Add reserved space to v4l2_ext_buffer so that new fields can be added
>   later on
> ---
>  drivers/media/v4l2-core/v4l2-dev.c   |  29 ++-
>  drivers/media/v4l2-core/v4l2-ioctl.c | 353 +--
>  include/media/v4l2-ioctl.h   |  26 ++
>  include/uapi/linux/videodev2.h   |  90 +++
>  4 files changed, 476 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/v4l2-dev.c 
> b/drivers/media/v4l2-core/v4l2-dev.c
> index e1829906bc086..cb21ee8eb075c 100644
> --- a/drivers/media/v4l2-core/v4l2-dev.c
> +++ b/drivers/media/v4l2-core/v4l2-dev.c
> @@ -720,15 +720,34 @@ static void determine_valid_ioctls(struct video_device 
> *vdev)
>   SET_VALID_IOCTL(ops, VIDIOC_TRY_FMT, vidioc_try_fmt_sdr_out);
>   }
>  
> + if (is_vid || is_tch) {
> + /* ioctls valid for video and touch */
> + if (ops->vidioc_querybuf || ops->vidioc_ext_querybuf)
> + set_bit(_IOC_NR(VIDIOC_EXT_QUERYBUF), valid_ioctls);
> + if (ops->vidioc_qbuf || ops->vidioc_ext_qbuf)
> + set_bit(_IOC_NR(VIDIOC_EXT_QBUF), valid_ioctls);
> + if (ops->vidioc_dqbuf || ops->vidioc_ext_dqbuf)
> + set_bit(_IOC_NR(VIDIOC_EXT_DQBUF), valid_ioctls);
> + if (ops->vidioc_create_bufs || ops->vidioc_ext_create_bufs)
> + set_bit(_IOC_NR(VIDIOC_EXT_CREATE_BUFS), valid_ioctls);
> + if (ops->vidioc_prepare_buf || ops->vidioc_ext_prepare_buf)
> + set_bit(_IOC_NR(VIDIOC_EXT_PREPARE_BUF), valid_ioctls);

nit: Could we stick to the SET_VALID_IOCTL() macro and just call it twice,
once for the new and once for the legacy callback?
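
To illustrate what I mean, here is a minimal userspace sketch, not the
actual kernel code. The ops struct, the ioctl numbers, mock_set_bit() and
the macro body are simplified stand-ins made up so that the snippet
compiles on its own (the real macro takes the ioctl command and applies
_IOC_NR() to it). The point is only that invoking the macro once per
callback gives the same result as the open-coded "a || b" check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the relevant part of struct v4l2_ioctl_ops (assumption). */
struct mock_ioctl_ops {
	int (*vidioc_qbuf)(void);
	int (*vidioc_ext_qbuf)(void);
};

/* Hypothetical ioctl numbers, for illustration only. */
enum { MOCK_NR_QBUF = 15, MOCK_NR_EXT_QBUF = 106, MOCK_NR_MAX = 128 };

static bool valid_ioctls[MOCK_NR_MAX];

/* Simplified model of the kernel's set_bit(). */
static void mock_set_bit(unsigned int nr, bool *map)
{
	map[nr] = true;
}

/* Modelled on the macro in v4l2-dev.c: mark the ioctl valid if the op is set. */
#define SET_VALID_IOCTL(ops, nr, op)				\
	do {							\
		if ((ops)->op)					\
			mock_set_bit((nr), valid_ioctls);	\
	} while (0)

/* Placeholder callback a caller can use to populate the ops struct. */
static int dummy_op(void)
{
	return 0;
}

static void mock_determine_valid_ioctls(const struct mock_ioctl_ops *ops)
{
	/* Called twice: once for the extended, once for the legacy callback. */
	SET_VALID_IOCTL(ops, MOCK_NR_EXT_QBUF, vidioc_ext_qbuf);
	SET_VALID_IOCTL(ops, MOCK_NR_EXT_QBUF, vidioc_qbuf);

	SET_VALID_IOCTL(ops, MOCK_NR_QBUF, vidioc_qbuf);
	SET_VALID_IOCTL(ops, MOCK_NR_QBUF, vidioc_ext_qbuf);
}
```

With only the legacy callback set, both the legacy and the extended
ioctl end up marked valid, which matches the behaviour of the open-coded
version in the patch.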

> + }
> +
>   if (is_vid || is_vbi || is_sdr || is_tch || is_meta) {
>   /* ioctls valid for video, vbi, sdr, touch and metadata */
>   SET_VALID_IOCTL(ops, VIDIOC_REQBUFS, vidioc_reqbufs);
> - SET_VALID_IOCTL(ops, VIDIOC_QUERYBUF, vidioc_querybuf);
> - SET_VALID_IOCTL(ops, VIDIOC_QBUF, vidioc_qbuf);
>   SET_VALID_IOCTL(ops, VIDIOC_EXPBUF, vidioc_expbuf);
> - SET_VALID_IOCTL(ops, VIDIOC_DQBUF, vidioc_dqbuf);
> - SET_VALID_IOCTL(ops, VIDIOC_CREATE_BUFS, vidioc_create_bufs);
> - SET_VALID_IOCTL(ops, VIDIOC_PREPARE_BUF, vidioc_prepare_buf);
> + if (ops->vidioc_querybuf || ops->vidioc_ext_querybuf)
> + set_bit(_IOC_NR(VIDIOC_QUERYBUF), valid_ioctls);
> + if (ops->vidioc_qbuf || ops->vidioc_ext_qbuf)
> + set_bit(_IOC_NR(VIDIOC_QBUF), valid_ioctls);
> + if (ops->vidioc_dqbuf || ops->vidioc_ext_dqbuf)
> + set_bit(_IOC_NR(VIDIOC_DQBUF), valid_ioctls);
> + if (ops->vidioc_create_bufs || ops->vidioc_ext_create_bufs)
> + set_bit(_IOC_NR(VIDIOC_CREATE_BUFS), 

Re: [PATCH v5 7/7] media: docs: add documentation for the Extended API

2020-11-20 Thread Tomasz Figa
Hi Helen,

On Tue, Aug 04, 2020 at 04:29:39PM -0300, Helen Koike wrote:
> Add documentation and update references in current documentation for the
> Extended API.
> 

Thank you for the patch. Please see my comments inline.

> Signed-off-by: Helen Koike 
> ---
> Changes in v5:
> - new patch
> 
>  .../userspace-api/media/v4l/buffer.rst|   5 +
>  .../userspace-api/media/v4l/common.rst|   1 +
>  .../userspace-api/media/v4l/dev-capture.rst   |   5 +
>  .../userspace-api/media/v4l/dev-output.rst|   5 +
>  .../userspace-api/media/v4l/ext-api.rst   | 107 +
>  .../userspace-api/media/v4l/format.rst|  16 +-
>  .../userspace-api/media/v4l/user-func.rst |   5 +
>  .../media/v4l/vidioc-ext-create-bufs.rst  |  95 
>  .../media/v4l/vidioc-ext-prepare-buf.rst  |  62 ++
>  .../media/v4l/vidioc-ext-qbuf.rst | 204 ++
>  .../media/v4l/vidioc-ext-querybuf.rst |  79 +++
>  .../media/v4l/vidioc-g-ext-pix-fmt.rst| 117 ++
>  12 files changed, 697 insertions(+), 4 deletions(-)
>  create mode 100644 Documentation/userspace-api/media/v4l/ext-api.rst
>  create mode 100644 
> Documentation/userspace-api/media/v4l/vidioc-ext-create-bufs.rst
>  create mode 100644 
> Documentation/userspace-api/media/v4l/vidioc-ext-prepare-buf.rst
>  create mode 100644 Documentation/userspace-api/media/v4l/vidioc-ext-qbuf.rst
>  create mode 100644 
> Documentation/userspace-api/media/v4l/vidioc-ext-querybuf.rst
>  create mode 100644 
> Documentation/userspace-api/media/v4l/vidioc-g-ext-pix-fmt.rst
> 
> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst 
> b/Documentation/userspace-api/media/v4l/buffer.rst
> index 57e752aaf414a..c832bedd64e4c 100644
> --- a/Documentation/userspace-api/media/v4l/buffer.rst
> +++ b/Documentation/userspace-api/media/v4l/buffer.rst
> @@ -27,6 +27,11 @@ such as pointers and sizes for each plane, are stored in
>  struct :c:type:`v4l2_plane` instead. In that case,
>  struct :c:type:`v4l2_buffer` contains an array of plane structures.
>  
> +.. note::
> +
> +The :ref:`ext_api` version can also be used, and it is
> +preferable when applicable.

Would rephrasing this as below make it a bit more definitive?

For modern applications, these operations are replaced by their
:ref:`ext_api` counterparts, which should be used instead.

> +
>  Dequeued video buffers come with timestamps. The driver decides at which
>  part of the frame and with which clock the timestamp is taken. Please
>  see flags in the masks ``V4L2_BUF_FLAG_TIMESTAMP_MASK`` and
> diff --git a/Documentation/userspace-api/media/v4l/common.rst 
> b/Documentation/userspace-api/media/v4l/common.rst
> index 7d81c58a13cd7..3430e0bdad667 100644
> --- a/Documentation/userspace-api/media/v4l/common.rst
> +++ b/Documentation/userspace-api/media/v4l/common.rst
> @@ -59,6 +59,7 @@ applicable to all devices.
>  ext-ctrls-detect
>  fourcc
>  format
> +ext-api
>  planar-apis
>  selection-api
>  crop
> diff --git a/Documentation/userspace-api/media/v4l/dev-capture.rst 
> b/Documentation/userspace-api/media/v4l/dev-capture.rst
> index 44d3094093ab6..5077639787d92 100644
> --- a/Documentation/userspace-api/media/v4l/dev-capture.rst
> +++ b/Documentation/userspace-api/media/v4l/dev-capture.rst
> @@ -102,6 +102,11 @@ and :ref:`VIDIOC_S_FMT ` ioctl, even if 
> :ref:`VIDIOC_S_FMT   requests and always returns default parameters as :ref:`VIDIOC_G_FMT 
> ` does.
>  :ref:`VIDIOC_TRY_FMT ` is optional.
>  
> +.. note::
> +
> +The :ref:`ext_api` version can also be used, and it is
> +preferable when applicable.
> +
>  
>  Reading Images
>  ==
> diff --git a/Documentation/userspace-api/media/v4l/dev-output.rst 
> b/Documentation/userspace-api/media/v4l/dev-output.rst
> index e4f2a1d8b0fc7..f8f40c708e49f 100644
> --- a/Documentation/userspace-api/media/v4l/dev-output.rst
> +++ b/Documentation/userspace-api/media/v4l/dev-output.rst
> @@ -99,6 +99,11 @@ and :ref:`VIDIOC_S_FMT ` ioctl, even if 
> :ref:`VIDIOC_S_FMT   requests and always returns default parameters as :ref:`VIDIOC_G_FMT 
> ` does.
>  :ref:`VIDIOC_TRY_FMT ` is optional.
>  
> +.. note::
> +
> +The :ref:`ext_api` version can also be used, and it is
> +preferable when applicable.
> +
>  
>  Writing Images
>  ==
> diff --git a/Documentation/userspace-api/media/v4l/ext-api.rst 
> b/Documentation/userspace-api/media/v4l/ext-api.rst
> new file mode 100644
> index 0..da2a82960d22f
> --- /dev/null
> +++ b/Documentation/userspace-api/media/v4l/ext-api.rst
> @@ -0,0 +1,107 @@
> +.. Permission is granted to copy, distribute and/or modify this
> +.. document under the terms of the GNU Free Documentation License,
> +.. Version 1.1 or any later version published by the Free Software
> +.. Foundation, with no Invariant Sections, no Front-Cover Texts
> +.. and no Back-Cover Texts. A copy of the license is included at
> +.. 

Re: [PATCH v6 09/17] media/videbuf1|2: Mark follow_pfn usage as unsafe

2020-11-20 Thread Tomasz Figa
On Fri, Nov 20, 2020 at 5:28 PM Hans Verkuil  wrote:
>
> On 20/11/2020 09:06, Hans Verkuil wrote:
> > On 19/11/2020 15:41, Daniel Vetter wrote:
> >> The media model assumes that buffers are all preallocated, so that
> >> when a media pipeline is running we never miss a deadline because the
> >> buffers aren't allocated or available.
> >>
> >> This means we cannot fix the v4l follow_pfn usage through
> >> mmu_notifier, without breaking how this all works. The only real fix
> >> is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and
> >> tell everyone to cut over to dma-buf memory sharing for zerocopy.
> >>
> >> userptr for normal memory will keep working as-is, this only affects
> >> the zerocopy userptr usage enabled in 50ac952d2263 ("[media]
> >> videobuf2-dma-sg: Support io userptr operations on io memory").
> >>
> >> Acked-by: Tomasz Figa 
> >
> > Acked-by: Hans Verkuil 
>
> Actually, cancel this Acked-by.
>
> So let me see if I understand this right: VM_IO | VM_PFNMAP mappings can
> move around. There is a mmu_notifier that can be used to be notified when
> that happens, but that can't be used with media buffers since those buffers
> must always be available and in the same place.
>
> So follow_pfn is replaced by unsafe_follow_pfn to signal that what is 
> attempted
> is unsafe and unreliable.
>
> If CONFIG_STRICT_FOLLOW_PFN is set, then unsafe_follow_pfn will fail, if it
> is unset, then it writes a warning to the kernel log but just continues while
> still unsafe.
>
> I am very much inclined to just drop VM_IO | VM_PFNMAP support in the media
> subsystem. For vb2 there is a working alternative in the form of dmabuf, and
> frankly for vb1 I don't care. If someone really needs this for a vb1 driver,
> then they can do the work to convert that driver to vb2.
>
> I've added Mauro to the CC list and I'll ping a few more people to see what
> they think, but in my opinion support for USERPTR + VM_IO | VM_PFNMAP
> should just be killed off.
>
> If others would like to keep it, then frame_vector.c needs a comment before
> the 'while' explaining why the unsafe_follow_pfn is there and that using
> dmabuf is the proper alternative to use. That will make it easier for
> developers to figure out why they see a kernel warning and what to do to
> fix it, rather than having to dig through the git history for the reason.

I'm all for dropping that code.

Best regards,
Tomasz

>
> Regards,
>
> Hans
>
> >
> > Thanks!
> >
> >   Hans
> >
> >> Signed-off-by: Daniel Vetter 
> >> Cc: Jason Gunthorpe 
> >> Cc: Kees Cook 
> >> Cc: Dan Williams 
> >> Cc: Andrew Morton 
> >> Cc: John Hubbard 
> >> Cc: Jérôme Glisse 
> >> Cc: Jan Kara 
> >> Cc: Dan Williams 
> >> Cc: linux...@kvack.org
> >> Cc: linux-arm-ker...@lists.infradead.org
> >> Cc: linux-samsung-...@vger.kernel.org
> >> Cc: linux-me...@vger.kernel.org
> >> Cc: Pawel Osciak 
> >> Cc: Marek Szyprowski 
> >> Cc: Kyungmin Park 
> >> Cc: Tomasz Figa 
> >> Cc: Laurent Dufour 
> >> Cc: Vlastimil Babka 
> >> Cc: Daniel Jordan 
> >> Cc: Michel Lespinasse 
> >> Signed-off-by: Daniel Vetter 
> >> --
> >> v3:
> >> - Reference the commit that enabled the zerocopy userptr use case to
> >>   make it abundantly clear that this patch only affects that, and not
> >>   normal memory userptr. The old commit message already explained that
> >>   normal memory userptr is unaffected, but I guess that was not clear
> >>   enough.
> >> ---
> >>  drivers/media/common/videobuf2/frame_vector.c | 2 +-
> >>  drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
> >>  2 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/media/common/videobuf2/frame_vector.c 
> >> b/drivers/media/common/videobuf2/frame_vector.c
> >> index a0e65481a201..1a82ec13ea00 100644
> >> --- a/drivers/media/common/videobuf2/frame_vector.c
> >> +++ b/drivers/media/common/videobuf2/frame_vector.c
> >> @@ -70,7 +70,7 @@ int get_vaddr_frames(unsigned long start, unsigned int 
> >> nr_frames,
> >>  break;
> >>
> >>  while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
> >> -err = follow_pfn(vma, start, &nums[ret]);
> >> +err = unsafe_follow_pfn(vma, start, &nums[ret]);
> >>  

Re: [PATCH v5 1/7] media: v4l2: Extend pixel formats to unify single/multi-planar handling (and more)

2020-11-18 Thread Tomasz Figa
On Sat, Nov 14, 2020 at 11:21:26AM -0300, Helen Koike wrote:
> Hi Tomasz,
> 
> On 10/2/20 4:49 PM, Tomasz Figa wrote:
> > Hi Helen,
> > 
> > On Tue, Aug 04, 2020 at 04:29:33PM -0300, Helen Koike wrote:
[snip]
> >> +static void v4l_print_ext_pix_format(const void *arg, bool write_only)
> >> +{
> >> +  const struct v4l2_ext_pix_format *pix = arg;
> >> +  unsigned int i;
> >> +
> >> +  pr_cont("type=%s, width=%u, height=%u, format=%c%c%c%c, modifier %llx, 
> >> field=%s, colorspace=%d, ycbcr_enc=%u, quantization=%u, xfer_func=%u\n",
> >> +  prt_names(pix->type, v4l2_type_names),
> >> +  pix->width, pix->height,
> >> +  (pix->pixelformat & 0xff),
> >> +  (pix->pixelformat >>  8) & 0xff,
> >> +  (pix->pixelformat >> 16) & 0xff,
> >> +  (pix->pixelformat >> 24) & 0xff,
> >> +  pix->modifier, prt_names(pix->field, v4l2_field_names),
> >> +  pix->colorspace, pix->ycbcr_enc,
> >> +  pix->quantization, pix->xfer_func);
> >> +  for (i = 0; i < VIDEO_MAX_PLANES && pix->plane_fmt[i].sizeimage; i++)
> > 
> > This is going to print 8 lines every time. Maybe we could skip 0-sized
> > planes or something?
> 
> I'm already checking pix->plane_fmt[i].sizeimage in the loop, it shouldn't
> print 8 lines every time.
> 

Oops, how could I not notice it. Sorry for the noise.

[snip]
> >> +int v4l2_ext_pix_format_to_format(const struct v4l2_ext_pix_format *e,
> >> +struct v4l2_format *f, bool mplane_cap,
> >> +bool strict)
> >> +{
> >> +  const struct v4l2_plane_ext_pix_format *pe;
> >> +  struct v4l2_plane_pix_format *p;
> >> +  unsigned int i;
> >> +
> >> +  memset(f, 0, sizeof(*f));
> >> +
> >> +  /*
> >> +   * Make sure no modifier is required before doing the
> >> +   * conversion.
> >> +   */
> >> +  if (e->modifier && strict &&
> > 
> > Do we need the explicit check for e->modifier != 0 if we have to check for
> > the 2 specific values below anyway?
> 
> We don't, since DRM_FORMAT_MOD_LINEAR is zero.
> 
> But I wanted to make it explicit we don't support modifiers in this 
> conversion.
> But I can remove this check, no problem.
> 

Yes, please. I think the double checking is confusing for the reader.
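
Something along these lines is what I have in mind. It is a standalone
sketch rather than a drop-in replacement: mock_check_modifier() is a
hypothetical helper (the real check is inline in the conversion
function), and the two modifier values are redefined locally to match
drm_fourcc.h so the snippet builds outside the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as in drm_fourcc.h, redefined so the sketch is standalone. */
#define DRM_FORMAT_MOD_LINEAR	0ULL
#define DRM_FORMAT_MOD_INVALID	0x00ffffffffffffffULL

/* Hypothetical helper; returns 0 if the modifier can be converted. */
static int mock_check_modifier(uint64_t modifier, bool strict)
{
	/*
	 * No separate "modifier != 0" test: DRM_FORMAT_MOD_LINEAR is 0,
	 * so that case is already covered by the first comparison.
	 */
	if (strict &&
	    modifier != DRM_FORMAT_MOD_LINEAR &&
	    modifier != DRM_FORMAT_MOD_INVALID)
		return -EINVAL;

	return 0;
}
```

A comment above the check can still state that modifiers are not
supported by this conversion, which keeps the intent explicit without
the redundant test.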

> > 
> >> +  e->modifier != DRM_FORMAT_MOD_LINEAR &&
> >> +  e->modifier != DRM_FORMAT_MOD_INVALID)
> >> +  return -EINVAL;
> >> +
> >> +  if (!e->plane_fmt[0].sizeimage && strict)
> >> +  return -EINVAL;
> > 
> > Why is this incorrect?
> 
> !sizeimage for the first plane means that there are no planes in ef.
> strict is true if the result for the conversion should be returned to 
> userspace
> and it is not some internal handling.
> 
> So if there are no planes, we would return an incomplete v4l2_format struct
> to userspace.
> 
> But this is not very clear, I'll improve this for the next version.
> 

So I can see 2 cases here:

1) Userspace gives ext struct and driver accepts legacy.

In this case, the kernel needs to adjust the structure to be correct.
-EINVAL is only valid if

"The struct v4l2_format type field is invalid or the requested buffer type not 
supported."

as per the current uAPI documentation.

2) Driver gives ext struct and userspace accepts legacy.

If at this point we get a struct with no planes, that sounds like a
driver bug, so rather than signaling -EINVAL to the userspace, we should
probably WARN()?

Or am I getting something wrong? :)

[snip]
> >> +{
> >> +  const struct v4l2_plane_pix_format *p;
> >> +  struct v4l2_plane_ext_pix_format *pe;
> >> +  unsigned int i;
> >> +
> >> +  memset(e, 0, sizeof(*e));
> >> +
> >> +  switch (f->type) {
> >> +  case V4L2_BUF_TYPE_VIDEO_CAPTURE:
> >> +  case V4L2_BUF_TYPE_VIDEO_OUTPUT:
> >> +  e->width = f->fmt.pix.width;
> >> +  e->height = f->fmt.pix.height;
> >> +  e->pixelformat = f->fmt.pix.pixelformat;
> >> +  e->field = f->fmt.pix.field;
> >> +  e->colorspace = f->fmt.pix.colorspace;
> >> +  if (f->fmt.pix.flags)
> >> +  pr_warn("Ignoring pixelformat flags 0x%x\n",
> >> +  f->fm

Re: [PATCH 8/8] WIP: add a dma_alloc_contiguous API

2020-11-10 Thread Tomasz Figa
On Tue, Nov 10, 2020 at 6:33 PM Ricardo Ribalda  wrote:
>
> Hi Christoph
>
> On Tue, Nov 10, 2020 at 10:25 AM Christoph Hellwig  wrote:
> >
> > On Mon, Nov 09, 2020 at 03:53:55PM +0100, Ricardo Ribalda wrote:
> > > Hi Christoph
> > >
> > > I have started now to give a try to your patchset. Sorry for the delay.
> > >
> > > For uvc I have prepared this patch:
> > > https://github.com/ribalda/linux/commit/9094fe223fe38f8c8ff21366d893b43cbbdf0113
> > >
> > > > I have tested successfully on an x86_64 notebook..., yes I know there
> > > is no change for that platform :).
> > > I am trying to get hold of an arm device that can run the latest
> > > kernel from upstream.
> > >
> > > > In the meantime, if you could take a look at the patch to verify that
> > > > this is the way you expect drivers to use your API, I would
> > > > appreciate it.
> >
> > > This looks pretty reasonable.
> >
>
> Great
>

Thanks Christoph for taking a look quickly.

> Also FYI, I managed to boot an ARM device with that tree. But I could
> not test the uvc driver (it was a remote device with no usb device
> attached)
>
> Hopefully I will be able to test it for real this week.
>
> Any suggestions for how to measure performance difference?

Some time ago Kieran (+CC) shared a patch to add extra statistics for
packet processing and payload assembly, with results of various
approaches summarized in a spreadsheet:
https://docs.google.com/spreadsheets/d/1uPdbdVcebO9OQ0LQ8hR2LGIEySWgSnGwwhzv7LPXAlU/edit#gid=0

That and just simple CPU usage comparison would be enough.

>
> Thanks!
>
> > Note that ifdef  CONFIG_DMA_NONCOHERENT in the old code doesn't actually
> > work, as that option is an internal thing just for mips and sh..

In what sense does it not actually work? Last time I checked, some
platforms actually defined CONFIG_DMA_NONCOHERENT, so those would
instead use the kmalloc() + dma_map() path. I don't have any
background on why that was added or whether it needs to be preserved,
though. Kieran, Laurent, do you have any insight?

Best regards,
Tomasz


Re: [PATCH v5 05/15] mm/frame-vector: Use FOLL_LONGTERM

2020-11-02 Thread Tomasz Figa
On Fri, Oct 30, 2020 at 3:38 PM Daniel Vetter  wrote:
>
> On Fri, Oct 30, 2020 at 3:11 PM Tomasz Figa  wrote:
> >
> > On Fri, Oct 30, 2020 at 11:08 AM Daniel Vetter  
> > wrote:
> > >
> > > > This is used by media/videobuf2 for persistent dma mappings, not just
> > > for a single dma operation and then freed again, so needs
> > > FOLL_LONGTERM.
> > >
> > > Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
> > > locking issues. Rework the code to pull the pup path out from the
> > > mmap_sem critical section as suggested by Jason.
> > >
> > > By relying entirely on the vma checks in pin_user_pages and follow_pfn
> > > (for vm_flags and vma_is_fsdax) we can also streamline the code a lot.
> > >
> > > Signed-off-by: Daniel Vetter 
> > > Cc: Jason Gunthorpe 
> > > Cc: Pawel Osciak 
> > > Cc: Marek Szyprowski 
> > > Cc: Kyungmin Park 
> > > Cc: Tomasz Figa 
> > > Cc: Mauro Carvalho Chehab 
> > > Cc: Andrew Morton 
> > > Cc: John Hubbard 
> > > Cc: Jérôme Glisse 
> > > Cc: Jan Kara 
> > > Cc: Dan Williams 
> > > Cc: linux...@kvack.org
> > > Cc: linux-arm-ker...@lists.infradead.org
> > > Cc: linux-samsung-...@vger.kernel.org
> > > Cc: linux-me...@vger.kernel.org
> > > Signed-off-by: Daniel Vetter 
> > > --
> > > v2: Streamline the code and further simplify the loop checks (Jason)
> > >
> > > v5: Review from Tomasz:
> > > - fix page counting for the follow_pfn case by resetting ret
> > > > - drop gup_flags parameter, now unused
> > > ---
> > >  .../media/common/videobuf2/videobuf2-memops.c |  3 +-
> > >  include/linux/mm.h|  2 +-
> > >  mm/frame_vector.c | 53 ++-
> > >  3 files changed, 19 insertions(+), 39 deletions(-)
> > >
> >
> > Thanks, looks good to me now.
> >
> > Acked-by: Tomasz Figa 
> >
> > From reading the code, this is quite unlikely to introduce any
> > behavior changes, but just to be safe, did you have a chance to test
> > this with some V4L2 driver?
>
> Nah, unfortunately not.

I believe we don't have any setup that could exercise the IO/PFNMAP
user pointers, but it should be possible to exercise the basic userptr
path by enabling the virtual (fake) video driver, vivid or
CONFIG_VIDEO_VIVID, in your kernel and then using yavta [1] with
--userptr and --capture= (and possibly some more
options) to grab a couple of frames from the test pattern generator.

Does it sound like something that you could give a try? Feel free to
ping me on IRC (tfiga on #v4l or #dri-devel) if you need any help.

[1] https://git.ideasonboard.org/yavta.git

Best regards,
Tomasz

> -Daniel
>
> >
> > Best regards,
> > Tomasz
> >
> > > diff --git a/drivers/media/common/videobuf2/videobuf2-memops.c 
> > > b/drivers/media/common/videobuf2/videobuf2-memops.c
> > > index 6e9e05153f4e..9dd6c27162f4 100644
> > > --- a/drivers/media/common/videobuf2/videobuf2-memops.c
> > > +++ b/drivers/media/common/videobuf2/videobuf2-memops.c
> > > @@ -40,7 +40,6 @@ struct frame_vector *vb2_create_framevec(unsigned long 
> > > start,
> > > unsigned long first, last;
> > > unsigned long nr;
> > > struct frame_vector *vec;
> > > -   unsigned int flags = FOLL_FORCE | FOLL_WRITE;
> > >
> > > first = start >> PAGE_SHIFT;
> > > last = (start + length - 1) >> PAGE_SHIFT;
> > > @@ -48,7 +47,7 @@ struct frame_vector *vb2_create_framevec(unsigned long 
> > > start,
> > > vec = frame_vector_create(nr);
> > > if (!vec)
> > > return ERR_PTR(-ENOMEM);
> > > -   ret = get_vaddr_frames(start & PAGE_MASK, nr, flags, vec);
> > > +   ret = get_vaddr_frames(start & PAGE_MASK, nr, vec);
> > > if (ret < 0)
> > > goto out_destroy;
> > > /* We accept only complete set of PFNs */
> > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > index ef360fe70aaf..d6b8e30dce2e 100644
> > > --- a/include/linux/mm.h
> > > +++ b/include/linux/mm.h
> > > @@ -1765,7 +1765,7 @@ struct frame_vector {
> > >  struct frame_vector *frame_vector_create(unsigned int nr_frames);
> > >  void frame_vector_destroy(struct frame_vector *vec);
> > >  int get_vaddr_frames(unsigned long start, unsigned int n

Re: [PATCH v5 05/15] mm/frame-vector: Use FOLL_LONGTERM

2020-10-30 Thread Tomasz Figa
On Fri, Oct 30, 2020 at 11:08 AM Daniel Vetter  wrote:
>
> This is used by media/videobuf2 for persistent dma mappings, not just
> for a single dma operation and then freed again, so needs
> FOLL_LONGTERM.
>
> Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
> locking issues. Rework the code to pull the pup path out from the
> mmap_sem critical section as suggested by Jason.
>
> By relying entirely on the vma checks in pin_user_pages and follow_pfn
> (for vm_flags and vma_is_fsdax) we can also streamline the code a lot.
>
> Signed-off-by: Daniel Vetter 
> Cc: Jason Gunthorpe 
> Cc: Pawel Osciak 
> Cc: Marek Szyprowski 
> Cc: Kyungmin Park 
> Cc: Tomasz Figa 
> Cc: Mauro Carvalho Chehab 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Signed-off-by: Daniel Vetter 
> --
> v2: Streamline the code and further simplify the loop checks (Jason)
>
> v5: Review from Tomasz:
> - fix page counting for the follow_pfn case by resetting ret
> - drop gup_flags parameter, now unused
> ---
>  .../media/common/videobuf2/videobuf2-memops.c |  3 +-
>  include/linux/mm.h|  2 +-
>  mm/frame_vector.c | 53 ++-----
>  3 files changed, 19 insertions(+), 39 deletions(-)
>

Thanks, looks good to me now.

Acked-by: Tomasz Figa 

From reading the code, this is quite unlikely to introduce any
behavior changes, but just to be safe, did you have a chance to test
this with some V4L2 driver?

Best regards,
Tomasz

> diff --git a/drivers/media/common/videobuf2/videobuf2-memops.c 
> b/drivers/media/common/videobuf2/videobuf2-memops.c
> index 6e9e05153f4e..9dd6c27162f4 100644
> --- a/drivers/media/common/videobuf2/videobuf2-memops.c
> +++ b/drivers/media/common/videobuf2/videobuf2-memops.c
> @@ -40,7 +40,6 @@ struct frame_vector *vb2_create_framevec(unsigned long 
> start,
> unsigned long first, last;
> unsigned long nr;
> struct frame_vector *vec;
> -   unsigned int flags = FOLL_FORCE | FOLL_WRITE;
>
> first = start >> PAGE_SHIFT;
> last = (start + length - 1) >> PAGE_SHIFT;
> @@ -48,7 +47,7 @@ struct frame_vector *vb2_create_framevec(unsigned long 
> start,
> vec = frame_vector_create(nr);
> if (!vec)
> return ERR_PTR(-ENOMEM);
> -   ret = get_vaddr_frames(start & PAGE_MASK, nr, flags, vec);
> +   ret = get_vaddr_frames(start & PAGE_MASK, nr, vec);
> if (ret < 0)
> goto out_destroy;
> /* We accept only complete set of PFNs */
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ef360fe70aaf..d6b8e30dce2e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1765,7 +1765,7 @@ struct frame_vector {
>  struct frame_vector *frame_vector_create(unsigned int nr_frames);
>  void frame_vector_destroy(struct frame_vector *vec);
>  int get_vaddr_frames(unsigned long start, unsigned int nr_pfns,
> -unsigned int gup_flags, struct frame_vector *vec);
> +struct frame_vector *vec);
>  void put_vaddr_frames(struct frame_vector *vec);
>  int frame_vector_to_pages(struct frame_vector *vec);
>  void frame_vector_to_pfns(struct frame_vector *vec);
> diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> index 10f82d5643b6..f8c34b895c76 100644
> --- a/mm/frame_vector.c
> +++ b/mm/frame_vector.c
> @@ -32,13 +32,12 @@
>   * This function takes care of grabbing mmap_lock as necessary.
>   */
>  int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> -unsigned int gup_flags, struct frame_vector *vec)
> +struct frame_vector *vec)
>  {
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma;
> int ret = 0;
> int err;
> -   int locked;
>
> if (nr_frames == 0)
> return 0;
> @@ -48,40 +47,26 @@ int get_vaddr_frames(unsigned long start, unsigned int 
> nr_frames,
>
> start = untagged_addr(start);
>
> -   mmap_read_lock(mm);
> -   locked = 1;
> -   vma = find_vma_intersection(mm, start, start + 1);
> -   if (!vma) {
> -   ret = -EFAULT;
> -   goto out;
> -   }
> -
> -   /*
> -* While get_vaddr_frames() could be used for transient (kernel
> -* controlled lifetime) pinning of memory pages all current
> -* users establish long term (userspace con

Re: [PATCH] media: i2c: imx258: correct mode to GBGB/RGRG

2020-10-28 Thread Tomasz Figa
On Wed, Oct 28, 2020 at 11:15 AM Krzysztof Kozlowski  wrote:
>
> On Wed, 28 Oct 2020 at 11:03, Sakari Ailus  
> wrote:
> >
> > On Wed, Oct 28, 2020 at 10:56:55AM +0100, Krzysztof Kozlowski wrote:
> > > On Wed, 28 Oct 2020 at 10:45, Krzysztof Kozlowski  wrote:
> > > >
> > > > On Wed, 28 Oct 2020 at 10:43, Yeh, Andy  wrote:
> > > > >
> > > > > But the sensor settings for the original submission is to output GRBG 
> > > > > Bayer RAW.
> > > > >
> > > > > Regards, Andy
> > > >
> > > > No, not to my knowledge. There are no settings for color output
> > > > because it is fixed to GBGB/RGRG. I was looking a lot into this driver
> > > > (I have few other problems with it, already few other patches posted)
> > > > and I could not find a setting for this in datasheet. If you know the
> > > > setting for the other color - can you point me to it?
> > >
> > > And except the datasheet which mentions the specific format, the
> > > testing confirms it. With original color the pictures are pink/purple.
> > > With proper color, the pictures are correct (with more green color as
> > > expected for bayer).
> >
> > Quoting the driver's start_streaming function:
> >
> > /* Set Orientation be 180 degree */
> > ret = imx258_write_reg(imx258, REG_MIRROR_FLIP_CONTROL,
> >IMX258_REG_VALUE_08BIT, 
> > REG_CONFIG_MIRROR_FLIP);
>
> I understand that you think it will replace the lines and columns and
> the first line will be RG, instead of GB or actually BG because it
> flips horizontal and vertical? So why does it not work?

Any chance your SoC capture interface performs this flipping on its own as well?
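
To spell out why I am asking, here is a small userspace sketch of how mirroring and flipping change the Bayer order. It assumes the flips are applied at readout on an even-aligned window with no compensating one-pixel shift; the function names are made up for this mail and are not from the driver.

```c
#include <assert.h>

/*
 * A Bayer order is written as the 2x2 tile read row by row,
 * e.g. "GBRG" means row 0 = G B and row 1 = R G.
 */

/* Horizontal mirror: swap the two columns of the tile. */
static void bayer_hflip(const char in[4], char out[5])
{
	assert(in != out);	/* in-place use would clobber the input */
	out[0] = in[1];
	out[1] = in[0];
	out[2] = in[3];
	out[3] = in[2];
	out[4] = '\0';
}

/* Vertical flip: swap the two rows of the tile. */
static void bayer_vflip(const char in[4], char out[5])
{
	assert(in != out);
	out[0] = in[2];
	out[1] = in[3];
	out[2] = in[0];
	out[3] = in[1];
	out[4] = '\0';
}

/* 180 degree rotation is a horizontal mirror followed by a vertical flip. */
static void bayer_rotate180(const char in[4], char out[5])
{
	char tmp[5];

	bayer_hflip(in, tmp);
	bayer_vflip(tmp, out);
}
```

Under these assumptions a native GBRG tile rotated by 180 degrees reads as GRBG, and rotating a second time gives GBRG back. So if something downstream of the sensor flipped the image again, the order seen in memory would match the unflipped sensor.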

>
> BTW, this nicely points that the comment around
> device_property_read_u32() for rotation is a little bit misleading :)
>

Are you referring to the comment below?

/*
* Check that the device is mounted upside down. The driver only
* supports a single pixel order right now.
*/
ret = device_property_read_u32(&client->dev, "rotation", &val);
if (ret || val != 180)
return -EINVAL;

What's misleading about it?

> > if (ret) {
> > dev_err(&client->dev, "%s failed to set orientation\n",
> > __func__);
> > return ret;
> > }
> >
> > Could it be you're taking pictures of pink objects? ;-)
>
> I can send a few sample pictures taken with GStreamer (RAW8, not
> original RAW10)...
>
> Best regards,
> Krzysztof


Re: [PATCH] ASoC: Intel: kbl_rt5663_max98927: Fix kabylake_ssp_fixup function

2020-10-26 Thread Tomasz Figa
On Wed, Oct 14, 2020 at 08:02:26PM +0100, Mark Brown wrote:
> On Wed, Oct 14, 2020 at 02:16:24PM +0000, Tomasz Figa wrote:
> 
> > Fixes a boot crash on a HP Chromebook x2:
> > 
> > [   16.582225] BUG: kernel NULL pointer dereference, address: 
> > 0050
> > [   16.582231] #PF: supervisor read access in kernel mode
> > [   16.582233] #PF: error_code(0x) - not-present page
> > [   16.582234] PGD 0 P4D 0
> > [   16.582238] Oops:  [#1] PREEMPT SMP PTI
> > [   16.582241] CPU: 0 PID: 1980 Comm: cras Tainted: G C
> > 5.4.58 #1
> > [   16.582243] Hardware name: HP Soraka/Soraka, BIOS 
> > Google_Soraka.10431.75.0 08/30/2018
> 
> Please think hard before including complete backtraces in upstream
> reports, they are very large and contain almost no useful information
> relative to their size so often obscure the relevant content in your
> message. If part of the backtrace is usefully illustrative (it often is
> for search engines if nothing else) then it's usually better to pull out
> the relevant sections.

Okay, I'll trim things down next time. Somehow I was convinced it's a
common practice.

Best regards,
Tomasz


Re: [PATCH v4 05/15] mm/frame-vector: Use FOLL_LONGTERM

2020-10-26 Thread Tomasz Figa
Hi Daniel,

On Mon, Oct 26, 2020 at 11:58:08AM +0100, Daniel Vetter wrote:
> This is used by media/videobuf2 for persistent dma mappings, not just
> for a single dma operation and then freed again, so needs
> FOLL_LONGTERM.
> 
> Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
> locking issues. Rework the code to pull the pup path out from the
> mmap_sem critical section as suggested by Jason.
> 
> By relying entirely on the vma checks in pin_user_pages and follow_pfn
> (for vm_flags and vma_is_fsdax) we can also streamline the code a lot.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Jason Gunthorpe 
> Cc: Pawel Osciak 
> Cc: Marek Szyprowski 
> Cc: Kyungmin Park 
> Cc: Tomasz Figa 
> Cc: Mauro Carvalho Chehab 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Signed-off-by: Daniel Vetter 
> --
> v2: Streamline the code and further simplify the loop checks (Jason)
> ---
>  mm/frame_vector.c | 50 ++-
>  1 file changed, 15 insertions(+), 35 deletions(-)
> 

Thank you for the patch. Please see my comments inline.

> diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> index 10f82d5643b6..d44779e56313 100644
> --- a/mm/frame_vector.c
> +++ b/mm/frame_vector.c
> @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int 
> nr_frames,
>   struct vm_area_struct *vma;
>   int ret = 0;
>   int err;
> - int locked;
>  
>   if (nr_frames == 0)
>   return 0;
> @@ -48,40 +47,25 @@ int get_vaddr_frames(unsigned long start, unsigned int 
> nr_frames,
>  
>   start = untagged_addr(start);
>  
> - mmap_read_lock(mm);
> - locked = 1;
> - vma = find_vma_intersection(mm, start, start + 1);
> - if (!vma) {
> - ret = -EFAULT;
> - goto out;
> - }
> -
> - /*
> -  * While get_vaddr_frames() could be used for transient (kernel
> -  * controlled lifetime) pinning of memory pages all current
> -  * users establish long term (userspace controlled lifetime)
> -  * page pinning. Treat get_vaddr_frames() like
> -  * get_user_pages_longterm() and disallow it for filesystem-dax
> -  * mappings.
> -  */
> - if (vma_is_fsdax(vma)) {
> - ret = -EOPNOTSUPP;
> - goto out;
> - }
> -
> - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
> + ret = pin_user_pages_fast(start, nr_frames,
> +   FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
> +   (struct page **)(vec->ptrs));
> + if (ret > 0) {
>   vec->got_ref = true;
>   vec->is_pfns = false;
> - ret = pin_user_pages_locked(start, nr_frames,
> - gup_flags, (struct page **)(vec->ptrs), &locked);

Should we drop the gup_flags argument, since it's ignored now?

> - goto out;
> + goto out_unlocked;
>   }
>  

Should we initialize ret with 0 here, since pin_user_pages_fast() can
return a negative error code, but below we use it as a counter for the
looked up frames?
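
A minimal userspace sketch of the pattern I have in mind follows. fake_pin_fast() and lookup_frames() are hypothetical stand-ins for pin_user_pages_fast() and the follow_pfn() loop, and the PFN values are made up.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the fast path: page count on success, negative errno on failure. */
static int fake_pin_fast(int nr_frames, int fail)
{
	return fail ? -EFAULT : nr_frames;
}

/* Stand-in for get_vaddr_frames(): fast path first, PFN lookup as fallback. */
static int lookup_frames(unsigned long *pfns, int nr_frames, int fast_path_fails)
{
	int ret;

	assert(nr_frames > 0);

	ret = fake_pin_fast(nr_frames, fast_path_fails);
	if (ret > 0)
		return ret;	/* pages pinned, nothing more to do */

	/*
	 * ret may hold a negative errno here. It is reused below as an
	 * index and a counter, so restart it from zero first.
	 */
	ret = 0;

	while (ret < nr_frames) {
		pfns[ret] = 0x1000UL + (unsigned long)ret;	/* made-up PFN */
		ret++;
	}

	return ret;
}
```

Without the reset, the loop would start indexing the array at a negative offset and the final count would be off by the errno value.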

Best regards,
Tomasz

> + mmap_read_lock(mm);
>   vec->got_ref = false;
>   vec->is_pfns = true;
>   do {
>   unsigned long *nums = frame_vector_pfns(vec);
>  
> + vma = find_vma_intersection(mm, start, start + 1);
> + if (!vma)
> + break;
> +
>   while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
>   err = follow_pfn(vma, start, &nums[ret]);
>   if (err) {
> @@ -92,17 +76,13 @@ int get_vaddr_frames(unsigned long start, unsigned int 
> nr_frames,
>   start += PAGE_SIZE;
>   ret++;
>   }
> - /*
> -  * We stop if we have enough pages or if VMA doesn't completely
> -  * cover the tail page.
> -  */
> - if (ret >= nr_frames || start < vma->vm_end)
> + /* Bail out if VMA doesn't completely cover the tail page. */
> + if (start < vma->vm_end)
>   break;
> - vma = find_vma_intersection(mm, start, start + 1);
> - } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
> + } while (ret < nr_frames);
>  out:
> - if (locked)
> - mmap_read_unlock(mm);
> + mmap_read_unlock(mm);
> +out_unlocked:
>   if (!ret)
>   ret = -EFAULT;
>   if (ret > 0)
> -- 
> 2.28.0
> 


Re: [PATCH v4 06/15] media: videobuf2: Move frame_vector into media subsystem

2020-10-26 Thread Tomasz Figa
On Mon, Oct 26, 2020 at 11:58:09AM +0100, Daniel Vetter wrote:
> It's the only user. This also garbage collects the CONFIG_FRAME_VECTOR
> symbol from all over the tree (well just one place, somehow omap media
> driver still had this in its Kconfig, despite not using it).
> 
> Reviewed-by: John Hubbard 
> Acked-by: Mauro Carvalho Chehab 
> Signed-off-by: Daniel Vetter 
> Cc: Jason Gunthorpe 
> Cc: Pawel Osciak 
> Cc: Marek Szyprowski 
> Cc: Kyungmin Park 
> Cc: Tomasz Figa 
> Cc: Mauro Carvalho Chehab 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: Daniel Vetter 
> Signed-off-by: Daniel Vetter 
> --
> v3:
> - Create a new frame_vector.h header for this (Mauro)
> ---
>  drivers/media/common/videobuf2/Kconfig|  1 -
>  drivers/media/common/videobuf2/Makefile   |  1 +
>  .../media/common/videobuf2}/frame_vector.c|  2 +
>  drivers/media/platform/omap/Kconfig   |  1 -
>  include/linux/mm.h| 42 -
>  include/media/frame_vector.h  | 47 +++
>  include/media/videobuf2-core.h|  1 +
>  mm/Kconfig|  3 --
>  mm/Makefile   |  1 -
>  9 files changed, 51 insertions(+), 48 deletions(-)
>  rename {mm => drivers/media/common/videobuf2}/frame_vector.c (99%)
>  create mode 100644 include/media/frame_vector.h
> 

Acked-by: Tomasz Figa 

Best regards,
Tomasz

> diff --git a/drivers/media/common/videobuf2/Kconfig 
> b/drivers/media/common/videobuf2/Kconfig
> index edbc99ebba87..d2223a12c95f 100644
> --- a/drivers/media/common/videobuf2/Kconfig
> +++ b/drivers/media/common/videobuf2/Kconfig
> @@ -9,7 +9,6 @@ config VIDEOBUF2_V4L2
>  
>  config VIDEOBUF2_MEMOPS
>   tristate
> - select FRAME_VECTOR
>  
>  config VIDEOBUF2_DMA_CONTIG
>   tristate
> diff --git a/drivers/media/common/videobuf2/Makefile 
> b/drivers/media/common/videobuf2/Makefile
> index 77bebe8b202f..54306f8d096c 100644
> --- a/drivers/media/common/videobuf2/Makefile
> +++ b/drivers/media/common/videobuf2/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  videobuf2-common-objs := videobuf2-core.o
> +videobuf2-common-objs += frame_vector.o
>  
>  ifeq ($(CONFIG_TRACEPOINTS),y)
>videobuf2-common-objs += vb2-trace.o
> diff --git a/mm/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
> similarity index 99%
> rename from mm/frame_vector.c
> rename to drivers/media/common/videobuf2/frame_vector.c
> index d44779e56313..6590987c14bd 100644
> --- a/mm/frame_vector.c
> +++ b/drivers/media/common/videobuf2/frame_vector.c
> @@ -8,6 +8,8 @@
>  #include 
>  #include 
>  
> +#include 
> +
>  /**
>   * get_vaddr_frames() - map virtual addresses to pfns
>   * @start:   starting user address
> diff --git a/drivers/media/platform/omap/Kconfig 
> b/drivers/media/platform/omap/Kconfig
> index f73b5893220d..de16de46c0f4 100644
> --- a/drivers/media/platform/omap/Kconfig
> +++ b/drivers/media/platform/omap/Kconfig
> @@ -12,6 +12,5 @@ config VIDEO_OMAP2_VOUT
>   depends on VIDEO_V4L2
>   select VIDEOBUF2_DMA_CONTIG
>   select OMAP2_VRFB if ARCH_OMAP2 || ARCH_OMAP3
> - select FRAME_VECTOR
>   help
> V4L2 Display driver support for OMAP2/3 based boards.
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 16b799a0522c..acd60fbf1a5a 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1743,48 +1743,6 @@ int account_locked_vm(struct mm_struct *mm, unsigned 
> long pages, bool inc);
>  int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
>   struct task_struct *task, bool bypass_rlim);
>  
> -/* Container for pinned pfns / pages */
> -struct frame_vector {
> - unsigned int nr_allocated;  /* Number of frames we have space for */
> - unsigned int nr_frames; /* Number of frames stored in ptrs array */
> - bool got_ref;   /* Did we pin pages by getting page ref? */
> - bool is_pfns;   /* Does array contain pages or pfns? */
> - void *ptrs[];   /* Array of pinned pfns / pages. Use
> -  * pfns_vector_pages() or pfns_vector_pfns()
> -  * for access */
> -};
> -
> -struct frame_vector *frame_vector_create(unsigned int nr_frames);
> -void frame_vector_destroy(struct frame_vector *vec);
> -int get_vaddr_frames(unsigned long start, unsign

Re: [PATCH v4 09/15] media/videbuf1|2: Mark follow_pfn usage as unsafe

2020-10-26 Thread Tomasz Figa
Hi Daniel,

On Mon, Oct 26, 2020 at 11:58:12AM +0100, Daniel Vetter wrote:
> The media model assumes that buffers are all preallocated, so that
> when a media pipeline is running we never miss a deadline because the
> buffers aren't allocated or available.
> 
> This means we cannot fix the v4l follow_pfn usage through
> mmu_notifier, without breaking how this all works. The only real fix
> is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and
> tell everyone to cut over to dma-buf memory sharing for zerocopy.
> 
> userptr for normal memory will keep working as-is, this only affects
> the zerocopy userptr usage enabled in 50ac952d2263 ("[media]
> videobuf2-dma-sg: Support io userptr operations on io memory").

Note that this is true only for the videobuf2 change. The videobuf1 code
has always been like this and does not support normal memory in the
dma_contig variant (because normal memory is rarely physically contiguous).

If my understanding is correct that CONFIG_STRICT_FOLLOW_PFN is not
enabled by default, we stay backwards compatible, with only those who
decide to turn it on risking breakage.

I agree that this is a good first step towards deprecating this legacy
code, so:

Acked-by: Tomasz Figa 

Of course the last word goes to Mauro. :)

Best regards,
Tomasz

> 
> Signed-off-by: Daniel Vetter 
> Cc: Jason Gunthorpe 
> Cc: Kees Cook 
> Cc: Dan Williams 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: Pawel Osciak 
> Cc: Marek Szyprowski 
> Cc: Kyungmin Park 
> Cc: Tomasz Figa 
> Cc: Laurent Dufour 
> Cc: Vlastimil Babka 
> Cc: Daniel Jordan 
> Cc: Michel Lespinasse 
> Signed-off-by: Daniel Vetter 
> --
> v3:
> - Reference the commit that enabled the zerocopy userptr use case to
>   make it abundantly clear that this patch only affects that, and not
>   normal memory userptr. The old commit message already explained that
>   normal memory userptr is unaffected, but I guess that was not clear
>   enough.
> ---
>  drivers/media/common/videobuf2/frame_vector.c | 2 +-
>  drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/common/videobuf2/frame_vector.c 
> b/drivers/media/common/videobuf2/frame_vector.c
> index 6590987c14bd..e630494da65c 100644
> --- a/drivers/media/common/videobuf2/frame_vector.c
> +++ b/drivers/media/common/videobuf2/frame_vector.c
> @@ -69,7 +69,7 @@ int get_vaddr_frames(unsigned long start, unsigned int 
> nr_frames,
>   break;
>  
>   while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
> - err = follow_pfn(vma, start, &nums[ret]);
> + err = unsafe_follow_pfn(vma, start, &nums[ret]);
>   if (err) {
>   if (ret == 0)
>   ret = err;
> diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c 
> b/drivers/media/v4l2-core/videobuf-dma-contig.c
> index 52312ce2ba05..821c4a76ab96 100644
> --- a/drivers/media/v4l2-core/videobuf-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
> @@ -183,7 +183,7 @@ static int videobuf_dma_contig_user_get(struct 
> videobuf_dma_contig_memory *mem,
>   user_address = untagged_baddr;
>  
>   while (pages_done < (mem->size >> PAGE_SHIFT)) {
> - ret = follow_pfn(vma, user_address, &this_pfn);
> + ret = unsafe_follow_pfn(vma, user_address, &this_pfn);
>   if (ret)
>   break;
>  
> -- 
> 2.28.0
> 


Re: [PATCH 2/2] venus: venc: fix handlig of S_SELECTION and G_SELECTION

2020-10-22 Thread Tomasz Figa
On Thu, Oct 22, 2020 at 6:37 AM  wrote:
>
> Hi Tomasz,
>
> On 2020-10-13 19:09, Tomasz Figa wrote:
> > Hi Vikash,
> >
> > On Tue, Oct 13, 2020 at 02:56:21PM +0530, vgaro...@codeaurora.org
> > wrote:
> >>
> >> On 2020-10-08 19:51, Tomasz Figa wrote:
> >> > On Wed, Oct 7, 2020 at 9:33 PM  wrote:
> >> > >
> >> > > Hi Tomasz,
> >> > >
> >> > > On 2020-10-01 20:47, Tomasz Figa wrote:
> >> > > > On Thu, Oct 1, 2020 at 3:32 AM Stanimir Varbanov
> >> > > >  wrote:
> >> > > >>
> >> > > >> Hi Tomasz,
> >> > > >>
> >> > > >> On 9/25/20 11:55 PM, Tomasz Figa wrote:
> >> > > >> > Hi Dikshita, Stanimir,
> >> > > >> >
> >> > > >> > On Thu, Sep 24, 2020 at 7:31 PM Dikshita Agarwal
> >> > > >> >  wrote:
> >> > > >> >>
> >> > > >> >> From: Stanimir Varbanov 
> >> > > >> >>
> >> > > >> >> - return correct width and height for G_SELECTION
> >> > > >> >> - if requested rectangle wxh doesn't match with capture port wxh
> >> > > >> >>   adjust the rectangle to supported wxh.
> >> > > >> >>
> >> > > >> >> Signed-off-by: Dikshita Agarwal 
> >> > > >> >> ---
> >> > > >> >>  drivers/media/platform/qcom/venus/venc.c | 20 
> >> > > >> >> 
> >> > > >> >>  1 file changed, 12 insertions(+), 8 deletions(-)
> >> > > >> >>
> >> > > >> >> diff --git a/drivers/media/platform/qcom/venus/venc.c 
> >> > > >> >> b/drivers/media/platform/qcom/venus/venc.c
> >> > > >> >> index 7d2aaa8..a2cc12d 100644
> >> > > >> >> --- a/drivers/media/platform/qcom/venus/venc.c
> >> > > >> >> +++ b/drivers/media/platform/qcom/venus/venc.c
> >> > > >> >> @@ -463,13 +463,13 @@ static int venc_g_fmt(struct file *file, 
> >> > > >> >> void *fh, struct v4l2_format *f)
> >> > > >> >> switch (s->target) {
> >> > > >> >> case V4L2_SEL_TGT_CROP_DEFAULT:
> >> > > >> >> case V4L2_SEL_TGT_CROP_BOUNDS:
> >> > > >> >> -   s->r.width = inst->width;
> >> > > >> >> -   s->r.height = inst->height;
> >> > > >> >> -   break;
> >> > > >> >> -   case V4L2_SEL_TGT_CROP:
> >> > > >> >> s->r.width = inst->out_width;
> >> > > >> >> s->r.height = inst->out_height;
> >> > > >> >> break;
> >> > > >> >> +   case V4L2_SEL_TGT_CROP:
> >> > > >> >> +   s->r.width = inst->width;
> >> > > >> >> +   s->r.height = inst->height;
> >> > > >> >> +   break;
> >> > > >> >> default:
> >> > > >> >> return -EINVAL;
> >> > > >> >> }
> >> > > >> >> @@ -490,10 +490,14 @@ static int venc_g_fmt(struct file *file, 
> >> > > >> >> void *fh, struct v4l2_format *f)
> >> > > >> >>
> >> > > >> >> switch (s->target) {
> >> > > >> >> case V4L2_SEL_TGT_CROP:
> >> > > >> >> -   if (s->r.width != inst->out_width ||
> >> > > >> >> -   s->r.height != inst->out_height ||
> >> > > >> >> -   s->r.top != 0 || s->r.left != 0)
> >> > > >> >> -   return -EINVAL;
> >> > > >> >> +   if (s->r.width != inst->width ||
> >> > > >> >> +   s->r.height != inst->height ||
> >> > > >> >> +   s->r.top != 0 || s->r.left != 0) {
> >> > > >> >> +   s->r.top 

Re: [PATCH] media: staging: rkisp1: cap: refactor enable/disable stream to allow multistreaming

2020-10-16 Thread Tomasz Figa
On Fri, Oct 16, 2020 at 4:28 PM Dafna Hirschfeld
 wrote:
>
> Hi,
>
> Am 15.10.20 um 21:57 schrieb Helen Koike:
> > Allow streaming from self picture path and main picture path at the same
> > time.
> >
> > Take care for s_stream() callbacks to not be called twice.
> > When starting a stream, s_stream(true) shouldn't be called for the isp
> > and the sensor if the other stream is already enabled (since it was
> > already called).
> > When stopping a stream, s_stream(false) shouldn't be called for isp and
> > the sensor if the other stream is still enabled.
> >
> > Remove the callback function scheme for navigating through the topology,
> > simplifying the code, improving readability, while calling
> > media_pipeline_{start,stop}() in the right order.
> >
> > Remove multistreaming item from the TODO list.
> >
> > Signed-off-by: Helen Koike 
> >
> > ---
> > Hello,
> >
> > Since we didn't reach an agreement on the helpers in the core[1], I'm
> > sending this patch to fix this limitation only for rkisp1.
> >
> > [1] 
> > https://patchwork.linuxtv.org/project/linux-media/cover/20200415013044.1778572-1-helen.ko...@collabora.com/
> >
> > If we decide to add the helpers in the future, we can clean up drivers
> > even more, but I don't want to block this feature.
> >
> > Overview of the patch:
> > ==
> >
> > * Rename rkisp1_stream_{start,stop}() to
> >rkisp1_cap_stream_{enable,disable}() to clarify the difference between
> >other stream enable/disable functions
> >
> > * Implement rkisp1_pipeline_stream_{enable,disable}() to replace
> >rkisp1_pipeline_{enable,disable}_cb() and rkisp1_pipeline_sink_walk(),
> >which were removed.
> >
> > * Call rkisp1_cap_stream_{enable,disable}() from
> >rkisp1_pipeline_stream_{enable,disable}() for better
> >unwind handling and function name semantics.
> >
> > * Call media_pipeline_{start,stop}() in the right order.
> >
> > * Remove item from TODO list (I also reviewed the use of the
> >is_streaming var in the code and lgtm).
> >
> > This patch was tested on rockpi4 board with:
> > 
> >
> > "media-ctl" "-d" "platform:rkisp1" "-r"
> > "media-ctl" "-d" "platform:rkisp1" "-l" "'imx219 4-0010':0 -> 
> > 'rkisp1_isp':0 [1]"
> > "media-ctl" "-d" "platform:rkisp1" "-l" "'rkisp1_isp':2 -> 
> > 'rkisp1_resizer_selfpath':0 [1]"
> > "media-ctl" "-d" "platform:rkisp1" "-l" "'rkisp1_isp':2 -> 
> > 'rkisp1_resizer_mainpath':0 [1]"
> >
> > "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" '"imx219 4-0010":0 
> > [fmt:SRGGB10_1X10/1640x1232]'
> >
> > "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" '"rkisp1_isp":0 
> > [fmt:SRGGB10_1X10/1640x1232 crop: (0,0)/1600x1200]'
> > "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" '"rkisp1_isp":2 
> > [fmt:YUYV8_2X8/1600x1200 crop: (0,0)/1500x1100]'
> >
> > "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" 
> > '"rkisp1_resizer_selfpath":0 [fmt:YUYV8_2X8/1500x1100 crop: 
> > (300,400)/1400x1000]'
> > "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" 
> > '"rkisp1_resizer_selfpath":1 [fmt:YUYV8_2X8/900x800]'
> >
> > "v4l2-ctl" "-z" "platform:rkisp1" "-d" "rkisp1_selfpath" "-v" 
> > "width=900,height=800,"
> > "v4l2-ctl" "-z" "platform:rkisp1" "-d" "rkisp1_selfpath" "-v" 
> > "pixelformat=422P"
> >
> > "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" 
> > '"rkisp1_resizer_mainpath":0 [fmt:YUYV8_2X8/1500x1100 crop: 
> > (300,400)/1400x1000]'
> > "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" 
> > '"rkisp1_resizer_mainpath":1 [fmt:YUYV8_2X8/900x800]'
> >
> > "v4l2-ctl" "-z" "platform:rkisp1" "-d" "rkisp1_mainpath" "-v" 
> > "width=900,height=800,"
> > "v4l2-ctl" "-z" "platform:rkisp1" "-d" "rkisp1_mainpath" "-v" 
> > "pixelformat=422P"
> >
> > sleep 1
> >
> > time v4l2-ctl "-z" "platform:rkisp1" "-d" "rkisp1_mainpath" "--stream-mmap" 
> > "--stream-count" "100" &
> > time v4l2-ctl "-z" "platform:rkisp1" "-d" "rkisp1_selfpath" "--stream-mmap" 
> > "--stream-count" "100" &
> >
> > wait
> > echo "Completed"
> >
> > Thanks
> > Helen
> > ---
> >   drivers/staging/media/rkisp1/TODO |   3 -
> >   drivers/staging/media/rkisp1/rkisp1-capture.c | 227 +-
> >   2 files changed, 113 insertions(+), 117 deletions(-)
> >
> > diff --git a/drivers/staging/media/rkisp1/TODO 
> > b/drivers/staging/media/rkisp1/TODO
> > index e7c8398fc2cef..a2dd0ad951c25 100644
> > --- a/drivers/staging/media/rkisp1/TODO
> > +++ b/drivers/staging/media/rkisp1/TODO
> > @@ -1,9 +1,6 @@
> >   * Fix pad format size for statistics and parameters entities.
> >   * Fix checkpatch errors.
> >   * Add uapi docs. Remember to add documentation of how quantization is 
> > handled.
> > -* streaming paths (mainpath and selfpath) check if the other path is 
> > streaming
> > -in several places of the code, review this, specially that it doesn't seem 
> > it
> > -supports streaming from both paths at the same time.
> >
> >   NOTES:
> >   * All v4l2-compliance test must pass.
> > diff --git 

Re: [PATCH v5 8/9] arm64: dts: rockchip: add isp0 node for rk3399

2020-10-14 Thread Tomasz Figa
On Wed, Oct 14, 2020 at 6:27 PM Helen Koike  wrote:
>
> Hi Tomasz,
>
> On 9/26/20 10:00 AM, Tomasz Figa wrote:
> > Hi Helen,
> >
> > On Wed, Jul 22, 2020 at 12:55:32PM -0300, Helen Koike wrote:
> >> From: Shunqian Zheng 
> >>
> >> RK3399 has two ISPs, but only isp0 was tested.
> >> Add isp0 node in rk3399 dtsi
> >>
> >> Verified with:
> >> make ARCH=arm64 dtbs_check 
> >> DT_SCHEMA_FILES=Documentation/devicetree/bindings/media/rockchip-isp1.yaml
> >>
> >> Signed-off-by: Shunqian Zheng 
> >> Signed-off-by: Jacob Chen 
> >> Signed-off-by: Helen Koike 
> >>
> >> ---
> >>
> >> V4:
> >> - update clock names
> >>
> >> V3:
> >> - clean up clocks
> >>
> >> V2:
> >> - re-order power-domains property
> >>
> >> V1:
> >> This patch was originally part of this patchset:
> >>
> >> https://patchwork.kernel.org/patch/10267431/
> >>
> >> The only difference is:
> >> - add phy properties
> >> - add ports
> >> ---
> >>  arch/arm64/boot/dts/rockchip/rk3399.dtsi | 25 
> >>  1 file changed, 25 insertions(+)
> >>
> >> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi 
> >> b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> >> index dba9641947a3a..ed8ba75dbbce8 100644
> >> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> >> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> >> @@ -1721,6 +1721,31 @@ vopb_mmu: iommu@ff903f00 {
> >>  status = "disabled";
> >>  };
> >>
> >> +isp0: isp0@ff91 {
> >> +compatible = "rockchip,rk3399-cif-isp";
> >> +reg = <0x0 0xff91 0x0 0x4000>;
> >> +interrupts = ;
> >> +clocks = < SCLK_ISP0>,
> >> + < ACLK_ISP0_WRAPPER>,
> >> + < HCLK_ISP0_WRAPPER>;
> >> +clock-names = "isp", "aclk", "hclk";
> >> +iommus = <_mmu>;
> > >> +phys = <&mipi_dphy_rx0>;
> >> +phy-names = "dphy";
> >> +power-domains = < RK3399_PD_ISP0>;
> >
> > Should this have status = "disabled" too? The mipi_dphy_rx0 node is
> > disabled by default too, so in the default configuration the driver
> > would always fail to probe.
>
> I'm wondering what the overall guideline is here.
> Since the isp and mipi_dphy are always present in the rk3399, shouldn't they
> always be enabled?
> Or, since they are only useful if a sensor is present, should we let the
> board dts enable them?

I don't have a strong opinion. I'm fine with enabling both by default
as well, as it shouldn't hurt.

That said, I recall some alternative CIF IP block being present on
this SoC as well (and patches posted recently), which AFAIR can't be
activated at the same time as the ISP, so perhaps both of the
alternatives should be disabled by default?

Best regards,
Tomasz


[PATCH] ASoC: Intel: kbl_rt5663_max98927: Fix kabylake_ssp_fixup function

2020-10-14 Thread Tomasz Figa
This is a copy of commit 5c5f1baee85a ("ASoC: Intel:
kbl_rt5663_rt5514_max98927: Fix kabylake_ssp_fixup function") applied to
the kbl_rt5663_max98927 board file.

Original explanation of the change:

kabylake_ssp_fixup function uses snd_soc_dpcm to identify the
codecs DAIs. The HW parameters are changed based on the codec DAI of the
stream. The earlier approach to get snd_soc_dpcm was using container_of()
macro on snd_pcm_hw_params.

The structures have been modified over time and snd_soc_dpcm does not have
snd_pcm_hw_params as a reference but as a copy. This causes the current
driver to crash when used.

This patch changes the way snd_soc_dpcm is extracted. snd_soc_pcm_runtime
holds 2 dpcm instances (one for playback and one for capture). 2 codecs
on the SSP are dmic (capture) and speakers (playback). Based on the
stream direction, snd_soc_dpcm is extracted from snd_soc_pcm_runtime.
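To illustrate the explanation above, here is a minimal userspace sketch, not the driver code. container_of() is only valid when the pointer refers to a member embedded in the enclosing object; once the parameters are passed around as a copy, the enclosing object has to be found another way, for example by stream direction. All struct and function names below are hypothetical stand-ins, not the real ALSA structures.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for the hw params and dpcm structures. */
struct hw_params {
	int rate;
	int channels;
};

struct dpcm {
	int id;
	struct hw_params params;	/* embedded member */
};

/*
 * Valid only if p points at the 'params' member embedded in a struct dpcm.
 * If p points at a copy of the params, the result is a bogus pointer.
 */
static struct dpcm *dpcm_from_params(struct hw_params *p)
{
	return container_of(p, struct dpcm, params);
}

enum { STREAM_PLAYBACK, STREAM_CAPTURE, STREAM_COUNT };

/* Hypothetical runtime holding one dpcm per stream direction. */
struct runtime {
	struct dpcm *dpcm[STREAM_COUNT];
};

/* Lookup by direction does not depend on where the params live. */
static struct dpcm *dpcm_from_runtime(struct runtime *rt, int stream)
{
	return rt->dpcm[stream];
}
```

The second helper mirrors the idea of the fix: the lookup no longer relies on the memory layout around the params argument.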

Fixes a boot crash on a HP Chromebook x2:

[   16.582225] BUG: kernel NULL pointer dereference, address: 0050
[   16.582231] #PF: supervisor read access in kernel mode
[   16.582233] #PF: error_code(0x) - not-present page
[   16.582234] PGD 0 P4D 0
[   16.582238] Oops:  [#1] PREEMPT SMP PTI
[   16.582241] CPU: 0 PID: 1980 Comm: cras Tainted: G C5.4.58 #1
[   16.582243] Hardware name: HP Soraka/Soraka, BIOS Google_Soraka.10431.75.0 
08/30/2018
[   16.582247] RIP: 0010:kabylake_ssp_fixup+0x19/0xbb 
[snd_soc_kbl_rt5663_max98927]
[   16.582250] Code: c6 6f c5 80 c0 44 89 f2 31 c0 e8 3e c9 4c d6 eb de 0f 1f 
44 00 00 55 48 89 e5 41 57 41 56 53 48 89 f3 48 8b 46 c8 48 8b 4e d0 <48> 8b 49 
10 4c 8b 78 10 4c 8b 31 4c 89 f7 48 c7 c6 4b c2 80 c0 e8
[   16.582252] RSP: :af7e81e0b958 EFLAGS: 00010282
[   16.582254] RAX: 96f13e0d RBX: af7e81e0ba00 RCX: 0040
[   16.582256] RDX: af7e81e0ba00 RSI: af7e81e0ba00 RDI: a3b208558028
[   16.582258] RBP: af7e81e0b970 R08: a3b203b54160 R09: af7e81e0ba00
[   16.582259] R10:  R11: c080b345 R12: a3b209fb6e00
[   16.582261] R13: a3b1b1a47838 R14: a3b1e6197f28 R15: af7e81e0ba00
[   16.582263] FS:  7eb3f25aaf80() GS:a3b236a0() 
knlGS:
[   16.582265] CS:  0010 DS:  ES:  CR0: 80050033
[   16.582267] CR2: 0050 CR3: 000246bc8006 CR4: 003606f0
[   16.582269] Call Trace:
[   16.582275]  snd_soc_link_be_hw_params_fixup+0x21/0x68
[   16.582278]  snd_soc_dai_hw_params+0x25/0x94
[   16.582282]  soc_pcm_hw_params+0x2d8/0x583
[   16.582288]  dpcm_be_dai_hw_params+0x172/0x29e
[   16.582291]  dpcm_fe_dai_hw_params+0x9f/0x12f
[   16.582295]  snd_pcm_hw_params+0x137/0x41c
[   16.582298]  snd_pcm_hw_params_user+0x3c/0x71
[   16.582301]  snd_pcm_common_ioctl+0x2c6/0x565
[   16.582304]  snd_pcm_ioctl+0x32/0x36
[   16.582307]  do_vfs_ioctl+0x506/0x783
[   16.582311]  ksys_ioctl+0x58/0x83
[   16.582313]  __x64_sys_ioctl+0x1a/0x1e
[   16.582316]  do_syscall_64+0x54/0x7e
[   16.582319]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   16.582322] RIP: 0033:0x7eb3f1886157
[   16.582324] Code: 8a 66 90 48 8b 05 11 dd 2b 00 64 c7 00 26 00 00 00 48 c7 
c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 
f0 ff ff 73 01 c3 48 8b 0d e1 dc 2b 00 f7 d8 64 89 01 48
[   16.582326] RSP: 002b:77559818 EFLAGS: 0246 ORIG_RAX: 
0010
[   16.582329] RAX: ffda RBX: 5acc9188b140 RCX: 7eb3f1886157
[   16.582330] RDX: 77559940 RSI: c2604111 RDI: 001e
[   16.582332] RBP: 77559840 R08: 0004 R09: 
[   16.582333] R10:  R11: 0246 R12: bb80
[   16.582335] R13: 5acc91702e80 R14: 77559940 R15: 5acc91702e80
[   16.582337] Modules linked in: rfcomm cmac algif_hash algif_skcipher af_alg 
uinput hid_google_hammer snd_soc_kbl_rt5663_max98927 snd_soc_hdac_hdmi 
snd_soc_dmic snd_soc_skl_ssp_clk snd_soc_skl snd_soc_sst_ipc snd_soc_sst_dsp 
snd_soc_hdac_hda snd_soc_acpi_intel_match snd_soc_acpi snd_hda_ext_core 
snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core ipu3_cio2 ipu3_imgu(C) 
videobuf2_v4l2 videobuf2_common videobuf2_dma_sg videobuf2_memops 
snd_soc_rt5663 snd_soc_max98927 snd_soc_rl6231 ov5670 ov13858 acpi_als 
v4l2_fwnode dw9714 fuse xt_MASQUERADE iio_trig_sysfs cros_ec_light_prox 
cros_ec_sensors cros_ec_sensors_core cros_ec_sensors_ring 
industrialio_triggered_buffer kfifo_buf industrialio cros_ec_sensorhub 
cdc_ether usbnet btusb btrtl btintel btbcm bluetooth ecdh_generic ecc lzo_rle 
lzo_compress iwlmvm zram iwl7000_mac80211 r8152 mii iwlwifi cfg80211 joydev
[   16.584243] gsmi: Log Shutdown Reason 0x03
[   16.584246] CR2: 0050
[   16.584248] ---[ end trace c8511d090c11edff ]---

Suggested-by: Łukasz Majczak 
Fixes: 2e5894d73789e ("ASoC: pcm: Add support for DAI multicodec")
Signed-off-by: Tomasz F

Re: [PATCH 8/8] WIP: add a dma_alloc_contiguous API

2020-10-14 Thread Tomasz Figa
+CC Ricardo who will be looking into using this in the USB stack (UVC
camera driver).

On Wed, Sep 30, 2020 at 6:09 PM Christoph Hellwig  wrote:
>
> Add a new API that returns a virtually non-contiguous array of pages
> and dma address.  This API is only implemented for dma-iommu and will
> not be implemented for non-iommu DMA API instances that have to allocate
> contiguous memory.  It is up to the caller to check if the API is
> available.
>
> The intent is that media drivers can use this API if either:
>
>  - no kernel mapping or only temporary kernel mappings are required.
>That is as a better replacement for DMA_ATTR_NO_KERNEL_MAPPING
>  - a kernel mapping is required for cached and DMA mapped pages, but
>the driver also needs the pages to e.g. map them to userspace.
>In that sense it is a replacement for some aspects of the recently
>removed and never fully implemented DMA_ATTR_NON_CONSISTENT
>
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/iommu/dma-iommu.c   | 73 +
>  include/linux/dma-mapping.h |  9 +
>  kernel/dma/mapping.c| 35 ++
>  3 files changed, 93 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 7922f545cd5eef..158026a856622c 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -565,23 +565,12 @@ static struct page **__iommu_dma_alloc_pages(struct 
> device *dev,
> return pages;
>  }
>
> -/**
> - * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
> - * @dev: Device to allocate memory for. Must be a real device
> - *  attached to an iommu_dma_domain
> - * @size: Size of buffer in bytes
> - * @dma_handle: Out argument for allocated DMA handle
> - * @gfp: Allocation flags
> - * @prot: pgprot_t to use for the remapped mapping
> - * @attrs: DMA attributes for this allocation
> - *
> - * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
> +/*
> + * If size is less than PAGE_SIZE, then a full CPU page will be allocated,
>   * but an IOMMU which supports smaller pages might not map the whole thing.
> - *
> - * Return: Mapped virtual address, or NULL on failure.
>   */
> -static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> -   dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
> +static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
> +   size_t size, dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
> unsigned long attrs)
>  {
> struct iommu_domain *domain = iommu_get_dma_domain(dev);
> @@ -593,7 +582,6 @@ static void *iommu_dma_alloc_remap(struct device *dev, 
> size_t size,
> struct page **pages;
> struct sg_table sgt;
> dma_addr_t iova;
> -   void *vaddr;
>
> *dma_handle = DMA_MAPPING_ERROR;
>
> @@ -636,17 +624,10 @@ static void *iommu_dma_alloc_remap(struct device *dev, 
> size_t size,
> < size)
> goto out_free_sg;
>
> -   vaddr = dma_common_pages_remap(pages, size, prot,
> -   __builtin_return_address(0));
> -   if (!vaddr)
> -   goto out_unmap;
> -
> *dma_handle = iova;
> sg_free_table(&sgt);
> -   return vaddr;
> +   return pages;
>
> -out_unmap:
> -   __iommu_dma_unmap(dev, iova, size);
>  out_free_sg:
> sg_free_table(&sgt);
>  out_free_iova:
> @@ -656,6 +637,46 @@ static void *iommu_dma_alloc_remap(struct device *dev, 
> size_t size,
> return NULL;
>  }
>
> +static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> +   dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
> +   unsigned long attrs)
> +{
> +   struct page **pages;
> +   void *vaddr;
> +
> +   pages = __iommu_dma_alloc_noncontiguous(dev, size, dma_handle, gfp,
> +   prot, attrs);
> +   if (!pages)
> +   return NULL;
> +   vaddr = dma_common_pages_remap(pages, size, prot,
> +   __builtin_return_address(0));
> +   if (!vaddr)
> +   goto out_unmap;
> +   return vaddr;
> +
> +out_unmap:
> +   __iommu_dma_unmap(dev, *dma_handle, size);
> +   __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
> +   return NULL;
> +}
> +
> +#ifdef CONFIG_DMA_REMAP
> +static struct page **iommu_dma_alloc_noncontiguous(struct device *dev,
> +   size_t size, dma_addr_t *dma_handle, gfp_t gfp,
> +   unsigned long attrs)
> +{
> +   return __iommu_dma_alloc_noncontiguous(dev, size, dma_handle, gfp,
> +  PAGE_KERNEL, attrs);
> +}
> +
> +static void iommu_dma_free_noncontiguous(struct device *dev, size_t size,
> +   struct page **pages, dma_addr_t dma_handle)
> +{
> +   __iommu_dma_unmap(dev, dma_handle, size);
> +   

Re: [PATCH] v4l: Add source change event for colorimetry

2020-10-13 Thread Tomasz Figa
On Tue, Oct 13, 2020 at 4:53 PM Stanimir Varbanov
 wrote:
>
>
>
> On 10/13/20 5:07 PM, Tomasz Figa wrote:
> > On Tue, Oct 13, 2020 at 3:53 PM Stanimir Varbanov
> >  wrote:
> >>
> >>
> >>
> >> On 10/13/20 4:40 PM, Tomasz Figa wrote:
> >>> On Tue, Oct 13, 2020 at 11:03 AM Stanimir Varbanov
> >>>  wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On 7/2/20 2:52 PM, Stanimir Varbanov wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Once we have this event, there is still an open question: how will the
> >>>>> client know the data buffer on which the new colorspace is valid/applied?
> >>>>>
> >>>>> The options could be:
> >>>>>  * a new buffer flag and
> >>>>>  * some information in the v4l2_event structure.
> >>>>>
> >>>>> Thoughts?
> >>>>
> >>>> Kindly ping.
> >>>>
> >>>
> >>> The event itself sounds good to me, but how do we know which buffer is
> >>> the first to have the new colorimetry?
> >>
> >> I think Hans has a very good idea to have width/height and colorspace
> >> specifiers in v4l2_ext_buffer [1].
> >>
> >> [1] https://lkml.org/lkml/2020/9/9/531
> >>
> >
> > Hmm, I think that would basically make the event obsolete and without
> > solving that problem I suspect the event is not very useful, because
> > one could already receive and display (incorrectly) some buffers
> > before realizing that they have different colorimetry
>
> Yes, I agree. I wasn't sure whether Hans's idea would be well received, so
> I pinged.
>
> >
> > I believe for now we would have to handle this like a resolution
> > change - flush the CAPTURE queue and the next buffer after the flush
> > would have the new colorimetry. With the new API we could optimize the
>
> I'm not sure what you mean by flushing the capture queue. Dequeue until the
> LAST flag (EPIPE), stop streaming, call g_fmt on the capture queue, decide
> what has changed, and start streaming again?

Yes, although there is no strict need to stop streaming; other ways are
defined as well, e.g. DEC_CMD_START.

Of course we would need to make appropriate changes to the spec and so
I'd just unify it with the resolution change sequence. I think we
could rename it to "Stream parameter change".
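A rough sketch of what a unified "stream parameter change" check could look like on the client side. The bit values are copied locally so the sketch builds without kernel headers; the colorimetry bit is the one proposed in this thread and is an assumption, not mainline uapi. The helper names are hypothetical. The assumption that only a resolution change may need new CAPTURE buffers follows the proposed documentation text, which says queue re-negotiation is not needed for a colorimetry change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the source change bits. */
#define SRC_CH_RESOLUTION	(1u << 0)	/* V4L2_EVENT_SRC_CH_RESOLUTION */
#define SRC_CH_COLORIMETRY	(1u << 1)	/* proposed in this thread */

/*
 * Any known stream parameter change is handled the same way: drain the
 * CAPTURE queue until the LAST buffer, then query the format again.
 */
static bool needs_drain(uint32_t changes)
{
	return (changes & (SRC_CH_RESOLUTION | SRC_CH_COLORIMETRY)) != 0;
}

/* Assumption: only a resolution change may require new CAPTURE buffers. */
static bool needs_buffer_realloc(uint32_t changes)
{
	return (changes & SRC_CH_RESOLUTION) != 0;
}
```

After the drain, a client that does not need new buffers could resume with DEC_CMD_START instead of a full streamoff/streamon cycle, as mentioned above.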

>
> > decoding by getting rid of the flushes and relying on the in-bound
> > information.
> >
> > Best regards,
> > Tomasz
> >
> >>>
> >>> Best regards,
> >>> Tomasz
> >>>
> >>>>>
> >>>>> On 7/2/20 1:00 PM, Stanimir Varbanov wrote:
> >>>>>> This event indicates that the source colorspace is changed
> >>>>>> during run-time. The client has to retrieve the new colorspace
> >>>>>> identifiers by getting the format (G_FMT).
> >>>>>>
> >>>>>> Signed-off-by: Stanimir Varbanov 
> >>>>>> ---
> >>>>>>  .../userspace-api/media/v4l/vidioc-dqevent.rst| 11 ++-
> >>>>>>  .../userspace-api/media/videodev2.h.rst.exceptions|  1 +
> >>>>>>  include/uapi/linux/videodev2.h|  1 +
> >>>>>>  3 files changed, 12 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst 
> >>>>>> b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >>>>>> index a9a176d5256d..3f69c753db58 100644
> >>>>>> --- a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >>>>>> +++ b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >>>>>> @@ -381,7 +381,16 @@ call.
> >>>>>>  that many Video Capture devices are not able to recover from a 
> >>>>>> temporary
> >>>>>>  loss of signal and so restarting streaming I/O is required in 
> >>>>>> order for
> >>>>>>  the hardware to synchronize to the video signal.
> >>>>>> -
> >>>>>> +* - ``V4L2_EVENT_SRC_CH_COLORIMETRY``
> >>>>>> +  - 0x0002
> >>>>>> +  - This event gets triggered when a colorspace change is 
> >>>>>> detected at
> >>>>>> +an input. By colorspa

Re: [PATCH] v4l: Add source change event for colorimetry

2020-10-13 Thread Tomasz Figa
On Tue, Oct 13, 2020 at 3:53 PM Stanimir Varbanov
 wrote:
>
>
>
> On 10/13/20 4:40 PM, Tomasz Figa wrote:
> > On Tue, Oct 13, 2020 at 11:03 AM Stanimir Varbanov
> >  wrote:
> >>
> >> Hi,
> >>
> >> On 7/2/20 2:52 PM, Stanimir Varbanov wrote:
> >>> Hi,
> >>>
> >>> Once we have this event, there is still an open question: how will the
> >>> client know the data buffer on which the new colorspace is valid/applied?
> >>>
> >>> The options could be:
> >>>  * a new buffer flag and
> >>>  * some information in the v4l2_event structure.
> >>>
> >>> Thoughts?
> >>
> >> Kindly ping.
> >>
> >
> > The event itself sounds good to me, but how do we know which buffer is
> > the first to have the new colorimetry?
>
> I think Hans has a very good idea to have width/height and colorspace
> specifiers in v4l2_ext_buffer [1].
>
> [1] https://lkml.org/lkml/2020/9/9/531
>

Hmm, I think that would basically make the event obsolete and without
solving that problem I suspect the event is not very useful, because
one could already receive and display (incorrectly) some buffers
before realizing that they have different colorimetry.

I believe for now we would have to handle this like a resolution
change - flush the CAPTURE queue and the next buffer after the flush
would have the new colorimetry. With the new API we could optimize the
decoding by getting rid of the flushes and relying on the in-bound
information.

Best regards,
Tomasz

> >
> > Best regards,
> > Tomasz
> >
> >>>
> >>> On 7/2/20 1:00 PM, Stanimir Varbanov wrote:
> >>>> This event indicates that the source colorspace is changed
> >>>> during run-time. The client has to retrieve the new colorspace
> >>>> identifiers by getting the format (G_FMT).
> >>>>
> >>>> Signed-off-by: Stanimir Varbanov 
> >>>> ---
> >>>>  .../userspace-api/media/v4l/vidioc-dqevent.rst| 11 ++-
> >>>>  .../userspace-api/media/videodev2.h.rst.exceptions|  1 +
> >>>>  include/uapi/linux/videodev2.h|  1 +
> >>>>  3 files changed, 12 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst 
> >>>> b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >>>> index a9a176d5256d..3f69c753db58 100644
> >>>> --- a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >>>> +++ b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >>>> @@ -381,7 +381,16 @@ call.
> >>>>  that many Video Capture devices are not able to recover from a 
> >>>> temporary
> >>>>  loss of signal and so restarting streaming I/O is required in order 
> >>>> for
> >>>>  the hardware to synchronize to the video signal.
> >>>> -
> >>>> +* - ``V4L2_EVENT_SRC_CH_COLORIMETRY``
> >>>> +  - 0x0002
> >>>> +  - This event gets triggered when a colorspace change is detected 
> >>>> at
> >>>> +an input. By colorspace change here we include also changes in the
> >>>> +colorspace specifiers (transfer function, Y'CbCr encoding and
> >>>> +quantization). This event can come from an input or from video 
> >>>> decoder.
> >>>> +Once the event has been send to the client the driver has to update
> >>>> +the colorspace specifiers internally so that they could be 
> >>>> retrieved by
> >>>> +client. In that case queue re-negotiation is not needed as this 
> >>>> change
> >>>> +only reflects on the interpretation of the data.
> >>>>
> >>>>  Return Value
> >>>>  
> >>>> diff --git 
> >>>> a/Documentation/userspace-api/media/videodev2.h.rst.exceptions 
> >>>> b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >>>> index ca05e4e126b2..54fc21af852d 100644
> >>>> --- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >>>> +++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >>>> @@ -492,6 +492,7 @@ replace define V4L2_EVENT_CTRL_CH_FLAGS 
> >>>> ctrl-changes-flags
> >>>>  replace define V4L2_EVENT_CTRL_CH_RANGE ctrl-changes-flags
> >>>>
> >>>>  replace define V4L2_EVENT_SRC_CH_RESOLUTION src-changes-flags
> >>>> +replace define V4L2_EVENT_SRC_CH_COLORIMETRY src-changes-flags
> >>>>
> >>>>  replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ 
> >>>> :c:type:`v4l2_event_motion_det`
> >>>>
> >>>> diff --git a/include/uapi/linux/videodev2.h 
> >>>> b/include/uapi/linux/videodev2.h
> >>>> index 303805438814..b5838bc4e3a3 100644
> >>>> --- a/include/uapi/linux/videodev2.h
> >>>> +++ b/include/uapi/linux/videodev2.h
> >>>> @@ -2351,6 +2351,7 @@ struct v4l2_event_frame_sync {
> >>>>  };
> >>>>
> >>>>  #define V4L2_EVENT_SRC_CH_RESOLUTION(1 << 0)
> >>>> +#define V4L2_EVENT_SRC_CH_COLORIMETRY   (1 << 1)
> >>>>
> >>>>  struct v4l2_event_src_change {
> >>>>  __u32 changes;
> >>>>
> >>>
> >>
> >> --
> >> regards,
> >> Stan
>
> --
> regards,
> Stan


Re: [PATCH] v4l: Add source change event for colorimetry

2020-10-13 Thread Tomasz Figa
On Tue, Oct 13, 2020 at 11:03 AM Stanimir Varbanov
 wrote:
>
> Hi,
>
> On 7/2/20 2:52 PM, Stanimir Varbanov wrote:
> > Hi,
> >
> > Once we have this event, there is still an open question: how will the
> > client know the data buffer on which the new colorspace is valid/applied?
> >
> > The options could be:
> >  * a new buffer flag and
> >  * some information in the v4l2_event structure.
> >
> > Thoughts?
>
> Kindly ping.
>

The event itself sounds good to me, but how do we know which buffer is
the first to have the new colorimetry?

Best regards,
Tomasz

> >
> > On 7/2/20 1:00 PM, Stanimir Varbanov wrote:
> >> This event indicates that the source colorspace is changed
> >> during run-time. The client has to retrieve the new colorspace
> >> identifiers by getting the format (G_FMT).
> >>
> >> Signed-off-by: Stanimir Varbanov 
> >> ---
> >>  .../userspace-api/media/v4l/vidioc-dqevent.rst| 11 ++-
> >>  .../userspace-api/media/videodev2.h.rst.exceptions|  1 +
> >>  include/uapi/linux/videodev2.h|  1 +
> >>  3 files changed, 12 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst 
> >> b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >> index a9a176d5256d..3f69c753db58 100644
> >> --- a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >> +++ b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> >> @@ -381,7 +381,16 @@ call.
> >>  that many Video Capture devices are not able to recover from a 
> >> temporary
> >>  loss of signal and so restarting streaming I/O is required in order 
> >> for
> >>  the hardware to synchronize to the video signal.
> >> -
> >> +* - ``V4L2_EVENT_SRC_CH_COLORIMETRY``
> >> +  - 0x0002
> >> +  - This event gets triggered when a colorspace change is detected at
> >> +an input. By colorspace change here we include also changes in the
> >> +colorspace specifiers (transfer function, Y'CbCr encoding and
> >> +quantization). This event can come from an input or from video 
> >> decoder.
> >> +Once the event has been send to the client the driver has to update
> >> +the colorspace specifiers internally so that they could be retrieved 
> >> by
> >> +client. In that case queue re-negotiation is not needed as this change
> >> +only reflects on the interpretation of the data.
> >>
> >>  Return Value
> >>  
> >> diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions 
> >> b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >> index ca05e4e126b2..54fc21af852d 100644
> >> --- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >> +++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >> @@ -492,6 +492,7 @@ replace define V4L2_EVENT_CTRL_CH_FLAGS 
> >> ctrl-changes-flags
> >>  replace define V4L2_EVENT_CTRL_CH_RANGE ctrl-changes-flags
> >>
> >>  replace define V4L2_EVENT_SRC_CH_RESOLUTION src-changes-flags
> >> +replace define V4L2_EVENT_SRC_CH_COLORIMETRY src-changes-flags
> >>
> >>  replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ 
> >> :c:type:`v4l2_event_motion_det`
> >>
> >> diff --git a/include/uapi/linux/videodev2.h 
> >> b/include/uapi/linux/videodev2.h
> >> index 303805438814..b5838bc4e3a3 100644
> >> --- a/include/uapi/linux/videodev2.h
> >> +++ b/include/uapi/linux/videodev2.h
> >> @@ -2351,6 +2351,7 @@ struct v4l2_event_frame_sync {
> >>  };
> >>
> >>  #define V4L2_EVENT_SRC_CH_RESOLUTION(1 << 0)
> >> +#define V4L2_EVENT_SRC_CH_COLORIMETRY   (1 << 1)
> >>
> >>  struct v4l2_event_src_change {
> >>  __u32 changes;
> >>
> >
>
> --
> regards,
> Stan


Re: [PATCH 2/2] venus: venc: fix handling of S_SELECTION and G_SELECTION

2020-10-13 Thread Tomasz Figa
Hi Vikash,

On Tue, Oct 13, 2020 at 02:56:21PM +0530, vgaro...@codeaurora.org wrote:
> 
> On 2020-10-08 19:51, Tomasz Figa wrote:
> > On Wed, Oct 7, 2020 at 9:33 PM  wrote:
> > > 
> > > Hi Tomasz,
> > > 
> > > On 2020-10-01 20:47, Tomasz Figa wrote:
> > > > On Thu, Oct 1, 2020 at 3:32 AM Stanimir Varbanov
> > > >  wrote:
> > > >>
> > > >> Hi Tomasz,
> > > >>
> > > >> On 9/25/20 11:55 PM, Tomasz Figa wrote:
> > > >> > Hi Dikshita, Stanimir,
> > > >> >
> > > >> > On Thu, Sep 24, 2020 at 7:31 PM Dikshita Agarwal
> > > >> >  wrote:
> > > >> >>
> > > >> >> From: Stanimir Varbanov 
> > > >> >>
> > > >> >> - return correct width and height for G_SELECTION
> > > >> >> - if requested rectangle wxh doesn't match with capture port wxh
> > > >> >>   adjust the rectangle to supported wxh.
> > > >> >>
> > > >> >> Signed-off-by: Dikshita Agarwal 
> > > >> >> ---
> > > >> >>  drivers/media/platform/qcom/venus/venc.c | 20 
> > > >> >>  1 file changed, 12 insertions(+), 8 deletions(-)
> > > >> >>
> > > >> >> diff --git a/drivers/media/platform/qcom/venus/venc.c 
> > > >> >> b/drivers/media/platform/qcom/venus/venc.c
> > > >> >> index 7d2aaa8..a2cc12d 100644
> > > >> >> --- a/drivers/media/platform/qcom/venus/venc.c
> > > >> >> +++ b/drivers/media/platform/qcom/venus/venc.c
> > > >> >> @@ -463,13 +463,13 @@ static int venc_g_fmt(struct file *file, void 
> > > >> >> *fh, struct v4l2_format *f)
> > > >> >> switch (s->target) {
> > > >> >> case V4L2_SEL_TGT_CROP_DEFAULT:
> > > >> >> case V4L2_SEL_TGT_CROP_BOUNDS:
> > > >> >> -   s->r.width = inst->width;
> > > >> >> -   s->r.height = inst->height;
> > > >> >> -   break;
> > > >> >> -   case V4L2_SEL_TGT_CROP:
> > > >> >> s->r.width = inst->out_width;
> > > >> >> s->r.height = inst->out_height;
> > > >> >> break;
> > > >> >> +   case V4L2_SEL_TGT_CROP:
> > > >> >> +   s->r.width = inst->width;
> > > >> >> +   s->r.height = inst->height;
> > > >> >> +   break;
> > > >> >> default:
> > > >> >> return -EINVAL;
> > > > >> >> }
> > > >> >> @@ -490,10 +490,14 @@ static int venc_g_fmt(struct file *file, void 
> > > >> >> *fh, struct v4l2_format *f)
> > > >> >>
> > > >> >> switch (s->target) {
> > > >> >> case V4L2_SEL_TGT_CROP:
> > > >> >> -   if (s->r.width != inst->out_width ||
> > > >> >> -   s->r.height != inst->out_height ||
> > > >> >> -   s->r.top != 0 || s->r.left != 0)
> > > >> >> -   return -EINVAL;
> > > >> >> +   if (s->r.width != inst->width ||
> > > >> >> +   s->r.height != inst->height ||
> > > >> >> +   s->r.top != 0 || s->r.left != 0) {
> > > >> >> +   s->r.top = 0;
> > > >> >> +   s->r.left = 0;
> > > >> >> +   s->r.width = inst->width;
> > > >> >> +   s->r.height = inst->height;
> > > >> >
> > > >> > What's the point of exposing the selection API if no selection can
> > > >> > actually be done?
> > > >>
> > > >> If someone can guarantee that dropping of s_selection will not break
> > > >> userspace applications I'm fine with removing it.
> > > >
> > > > Indeed the specification could be made more clear about this. The
> > > > visible rectangle configuration 

Re: [PATCH] v4l2-ctrl: add control for thumbnails

2020-10-13 Thread Tomasz Figa
On Tue, Oct 13, 2020 at 2:52 PM Stanimir Varbanov
 wrote:
>
> Hi,
>
> On 6/4/20 3:57 PM, Tomasz Figa wrote:
> > On Thu, Jun 4, 2020 at 2:56 PM Hans Verkuil  
> > wrote:
> >>
> >> On 04/06/2020 14:34, Stanimir Varbanov wrote:
> >>> Hi Hans,
> >>>
> >>> On 6/4/20 12:08 PM, Hans Verkuil wrote:
> >>>> On 04/06/2020 11:02, Stanimir Varbanov wrote:
> >>>>> Hi Hans,
> >>>>>
> >>>>> On 5/27/20 12:53 AM, Stanimir Varbanov wrote:
> >>>>>> Hi Hans,
> >>>>>>
> >>>>>> On 5/26/20 3:04 PM, Hans Verkuil wrote:
> >>>>>>> On 26/05/2020 10:54, Stanimir Varbanov wrote:
> >>>>>>>> Add v4l2 control for decoder thumbnail.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Stanimir Varbanov 
> >>>>>>>> ---
> >>>>>>>>  Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst | 7 
> >>>>>>>> +++
> >>>>>>>>  drivers/media/v4l2-core/v4l2-ctrls.c  | 2 ++
> >>>>>>>>  include/uapi/linux/v4l2-controls.h| 2 ++
> >>>>>>>>  3 files changed, 11 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git 
> >>>>>>>> a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst 
> >>>>>>>> b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> >>>>>>>> index d0d506a444b1..e838e410651b 100644
> >>>>>>>> --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> >>>>>>>> +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> >>>>>>>> @@ -3726,6 +3726,13 @@ enum 
> >>>>>>>> v4l2_mpeg_video_hevc_size_of_length_field -
> >>>>>>>>  disables generating SPS and PPS at every IDR. Setting it to one 
> >>>>>>>> enables
> >>>>>>>>  generating SPS and PPS at every IDR.
> >>>>>>>>
> >>>>>>>> +``V4L2_CID_MPEG_VIDEO_DECODER_THUMBNAIL (button)``
> >>>>>>>> +Instructs the decoder to produce immediate output. The decoder 
> >>>>>>>> should
> >>>>>>>> +consume first input buffer for progressive stream (or first two 
> >>>>>>>> buffers
> >>>>>>>> +for interlace). Decoder should not allocate more output buffers 
> >>>>>>>> that it
> >>>>>>>> +is required to consume one input frame. Usually the decoder 
> >>>>>>>> input
> >>>>>>>> +buffers will contain only I/IDR frames but it is not mandatory.
> >>>>>>>
> >>>>>>> This is very vague. It doesn't explain why the control is called 'THUMBNAIL',
> >>>>>>> but more importantly it doesn't explain how this relates to normal decoding.
> >>>>>>
> >>>>>> If in the normal decode the capture queue buffers are 5, in the
> >>>>>> thumbnail mode the number of buffers will be only 1 (if the bitstream is
> >>>>>> progressive) and this will guarantee low memory usage. The other
> >>>>>> difference is that the decoder will produce decoded frames (without
> >>>>>> errors) only for I/IDR (sync frames).
> >>>>
> >>>> Isn't this really a "DECODE_SYNC_FRAMES_ONLY" control? That's what it does,
> >>>> right? Skip any B/P frames and only decode sync frames.
> >>>
> >>> Yes, it is.
> >>> To me V4L2_CID_MPEG_VIDEO_DECODE_SYNC_FRAMES sounds better. If you are
> >>> fine I can send a new patch.
> >>>
> >>> The definition of "sync frames" is a bit difficult for codec-agnostic
> >>> controls. Would "INTRA" sound better, e.g. DECODE_INTRA_FRAMES (ONLY)?
> >>
> >> INTRA is better. DECODE_INTRA_FRAMES_ONLY is a good name, I think.
> >>
> >> Thumbnail creation can be given as an example in the description of the
> >> control, but that's just a use-case.
> >
> > How about the othe

Re: [PATCH v2 09/17] mm: Add unsafe_follow_pfn

2020-10-10 Thread Tomasz Figa
Hi Mauro,

On Fri, Oct 9, 2020 at 2:37 PM Mauro Carvalho Chehab
 wrote:
>
> Em Fri, 9 Oct 2020 09:21:11 -0300
> Jason Gunthorpe  escreveu:
>
> > On Fri, Oct 09, 2020 at 12:34:21PM +0200, Mauro Carvalho Chehab wrote:
> > > Hi,
> > >
> > > Em Fri,  9 Oct 2020 09:59:26 +0200
> > > Daniel Vetter  escreveu:
> > >
> > > > > Way back it was a reasonable assumption that iomem mappings never
> > > > change the pfn range they point at. But this has changed:
> > > >
> > > > - gpu drivers dynamically manage their memory nowadays, invalidating
> > > > ptes with unmap_mapping_range when buffers get moved
> > > >
> > > > > - contiguous dma allocations have moved from dedicated carveouts to
> > > > cma regions. This means if we miss the unmap the pfn might contain
> > > > pagecache or anon memory (well anything allocated with GFP_MOVEABLE)
> > > >
> > > > - even /dev/mem now invalidates mappings when the kernel requests that
> > > > iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
> > > > ("/dev/mem: Revoke mappings when a driver claims the region")
> > > >
> > > > Accessing pfns obtained from ptes without holding all the locks is
> > > > therefore no longer a good idea.
> > > >
> > > > Unfortunately there's some users where this is not fixable (like v4l
> > > > userptr of iomem mappings) or involves a pile of work (vfio type1
> > > > iommu). For now annotate these as unsafe and splat appropriately.
> > > >
> > > > This patch adds an unsafe_follow_pfn, which later patches will then
> > > > roll out to all appropriate places.
> > >
> > > NACK, as this breaks an existing userspace API on media.
> >
> > It doesn't break it. You get a big warning the thing is broken and it
> > keeps working.
> >
> > We can't leave such a huge security hole open - it impacts other
> > subsystems, distros need to be able to run in a secure mode.
>
> Well, if distros disable it, then apps will break.
>

Do we have any information on userspace that actually needs this functionality?

Note that we're _not_ talking here about the complete USERPTR
functionality, but rather just the very corner case of carveout memory
not backed by struct pages.

Given that the current in-tree ways of reserving carveout memory, such
as shared-dma-pool, actually give memory backed by struct pages, do we
even have a source of such legacy memory in the kernel today?

I think that given that this is a very niche functionality, we could
have it disabled by default for security reasons and if someone
_really_ (i.e. there is no replacement) needs it, they probably need
to use a custom kernel build anyway for their exotic hardware setup
(with PFN-backed carveout memory), so they can enable it.

> > > While I agree that using the userptr on media is something that
> > > new drivers may not support, as DMABUF is a better way of
> > > handling it, changing this for existing ones is a big no,
> > > > as it may break userspace.
> >
> > media community needs to work to fix this, not pretend it is OK to
> > keep going as-is.
>
> > Dealing with security issues is the one case where an uABI break might
> > be acceptable.
> >
> > If you want to NAK it then you need to come up with the work to do
> > something here correctly that will support the old drivers without the
> > kernel taint.
> >
> > Unfortunately making things uncomfortable for the subsystem is the big
> > hammer the core kernel needs to use to actually get this security work
> > done by those responsible.
>
>
> I'm not pretending that this is ok. Just pointing that the approach
> taken is NOT OK.
>
> I'm not a mm/ expert, but, from what I understood from Daniel's patch
> description is that this is unsafe *only if*  __GFP_MOVABLE is used.
>
> Well, no drivers inside the media subsystem uses such flag, although
> they may rely on some infrastructure that could be using it behind
> the bars.
>
> If this is the case, the proper fix seems to have a GFP_NOT_MOVABLE
> flag that it would be denying the core mm code to set __GFP_MOVABLE.
>
> Please let address the issue on this way, instead of broken an
> userspace API that it is there since 1991.

Note that USERPTR as a whole has generally been considered deprecated
in V4L2 for many years and people have been actively discouraged from
using it. And, still, we're just talking here about the very rare corner
case, not the whole USERPTR API.

Best regards,
Tomasz


Re: [PATCH v2 09/17] mm: Add unsafe_follow_pfn

2020-10-10 Thread Tomasz Figa
Hi Daniel,

On Fri, Oct 9, 2020 at 7:52 PM Daniel Vetter  wrote:
>
> On Fri, Oct 9, 2020 at 2:48 PM Jason Gunthorpe  wrote:
> >
> > On Fri, Oct 09, 2020 at 02:37:23PM +0200, Mauro Carvalho Chehab wrote:
> >
> > > I'm not a mm/ expert, but, from what I understood from Daniel's patch
> > > description is that this is unsafe *only if*  __GFP_MOVABLE is used.
> >
> > No, it is unconditionally unsafe. The CMA movable mappings are
> > specific VMAs that will have bad issues here, but there are other
> > types too.
> >
> > The only way to do something at a VMA level is to have a list of OK
> > > VMAs, eg because they were created via a special mmap helper from the
> > media subsystem.
> >
> > > Well, no drivers inside the media subsystem uses such flag, although
> > > they may rely on some infrastructure that could be using it behind
> > > the bars.
> >
> > It doesn't matter, nothing prevents the user from calling media APIs
> > on mmaps it gets from other subsystems.
>
> I think a good first step would be to disable userptr of non struct
> page backed storage going forward for any new hw support. Even on
> existing drivers. dma-buf sharing has been around for long enough now
> that this shouldn't be a problem. Unfortunately right now this doesn't
> seem to exist, so the entire problem keeps getting perpetuated.
>
> > > If this is the case, the proper fix seems to have a GFP_NOT_MOVABLE
> > > flag that it would be denying the core mm code to set __GFP_MOVABLE.
> >
> > We can't tell from the VMA these kinds of details..
> >
> > > It has to go the other direction, every mmap that might be used as a
> > userptr here has to be found and the VMA specially created to allow
> > its use. At least that is a kernel only change, but will need people
> > with the HW to do this work.
>
> I think the only reasonable way to keep this working is:
> - add a struct dma_buf *vma_tryget_dma_buf(struct vm_area_struct *vma);
> - add dma-buf export support to fbdev and v4l

I assume you mean V4L2 and not the obsolete V4L that is emulated in
the userspace by libv4l. If so, every video device that uses videobuf2
gets DMA-buf export for free and there is nothing needed to enable it.

We probably still have a few legacy drivers using videobuf (non-2),
but IMHO those should be safe to put behind some disabled-by-default
Kconfig symbol or even completely drop, as the legacy framework has
been deprecated for many years already.

> - roll this out everywhere we still need it.
>
> Realistically this just isn't going to happen. And anything else just
> reimplements half of dma-buf, which is kinda pointless (you need
> minimally refcounting and some way to get at a promise of a permanent
> sg list for dma, plus probably the vmap for kernel cpu access).
>
> > > Please let address the issue on this way, instead of broken an
> > > userspace API that it is there since 1991.
> >
> > It has happened before :( It took 4 years for RDMA to undo the uAPI
> > breakage caused by a security fix for something that was a 15 years
> > old bug.
>
> Yeah we have a bunch of these on the drm side too. Some of them are
> really just "you have to upgrade userspace", and there's no real fix
> for the security nightmare without that.

I think we need to phase out such userspace indeed. The Kconfig symbol
allows enabling the unsafe functionality for anyone who still needs
it, so I think it's not entirely a breakage.

Best regards,
Tomasz


Re: [PATCH 2/2] venus: venc: fix handlig of S_SELECTION and G_SELECTION

2020-10-08 Thread Tomasz Figa
On Wed, Oct 7, 2020 at 9:33 PM  wrote:
>
> Hi Tomasz,
>
> On 2020-10-01 20:47, Tomasz Figa wrote:
> > On Thu, Oct 1, 2020 at 3:32 AM Stanimir Varbanov
> >  wrote:
> >>
> >> Hi Tomasz,
> >>
> >> On 9/25/20 11:55 PM, Tomasz Figa wrote:
> >> > Hi Dikshita, Stanimir,
> >> >
> >> > On Thu, Sep 24, 2020 at 7:31 PM Dikshita Agarwal
> >> >  wrote:
> >> >>
> >> >> From: Stanimir Varbanov 
> >> >>
> >> >> - return correct width and height for G_SELECTION
> >> >> - if requested rectangle wxh doesn't match with capture port wxh
> >> >>   adjust the rectangle to supported wxh.
> >> >>
> >> >> Signed-off-by: Dikshita Agarwal 
> >> >> ---
> >> >>  drivers/media/platform/qcom/venus/venc.c | 20 
> >> >>  1 file changed, 12 insertions(+), 8 deletions(-)
> >> >>
> >> >> diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
> >> >> index 7d2aaa8..a2cc12d 100644
> >> >> --- a/drivers/media/platform/qcom/venus/venc.c
> >> >> +++ b/drivers/media/platform/qcom/venus/venc.c
> >> >> @@ -463,13 +463,13 @@ static int venc_g_fmt(struct file *file, void *fh, struct v4l2_format *f)
> >> >> switch (s->target) {
> >> >> case V4L2_SEL_TGT_CROP_DEFAULT:
> >> >> case V4L2_SEL_TGT_CROP_BOUNDS:
> >> >> -   s->r.width = inst->width;
> >> >> -   s->r.height = inst->height;
> >> >> -   break;
> >> >> -   case V4L2_SEL_TGT_CROP:
> >> >> s->r.width = inst->out_width;
> >> >> s->r.height = inst->out_height;
> >> >> break;
> >> >> +   case V4L2_SEL_TGT_CROP:
> >> >> +   s->r.width = inst->width;
> >> >> +   s->r.height = inst->height;
> >> >> +   break;
> >> >> default:
> >> >> return -EINVAL;
> >> >> }
> >> >> @@ -490,10 +490,14 @@ static int venc_g_fmt(struct file *file, void *fh, struct v4l2_format *f)
> >> >>
> >> >> switch (s->target) {
> >> >> case V4L2_SEL_TGT_CROP:
> >> >> -   if (s->r.width != inst->out_width ||
> >> >> -   s->r.height != inst->out_height ||
> >> >> -   s->r.top != 0 || s->r.left != 0)
> >> >> -   return -EINVAL;
> >> >> +   if (s->r.width != inst->width ||
> >> >> +   s->r.height != inst->height ||
> >> >> +   s->r.top != 0 || s->r.left != 0) {
> >> >> +   s->r.top = 0;
> >> >> +   s->r.left = 0;
> >> >> +   s->r.width = inst->width;
> >> >> +   s->r.height = inst->height;
> >> >
> >> > What's the point of exposing the selection API if no selection can
> >> > actually be done?
> >>
> >> If someone can guarantee that dropping of s_selection will not break
> >> userspace applications I'm fine with removing it.
> >
> > Indeed the specification could be made more clear about this. The
> > visible rectangle configuration is described as optional, so I'd
> > consider the capability to be optional as well.
> >
> > Of course it doesn't change the fact that something that is optional
> > in the API may be mandatory for some specific integrations, like
> > Chrome OS or Android.
> >
> >>
> >> I implemented g/s_selection with the idea to add crop functionality
> >> later because with current firmware interface it needs more work.
> >
> > I suggested one thing internally, but not sure if it was understood
> > correctly:
> >
> > Most of the encoders only support partial cropping, with the rectangle
> > limited to top = 0 and left = 0, in other words, only setting the
> > visible width and height. This can be easily implemented on most of
> > the hardware, even those that don't have dedicated cropping
> >

Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-07 Thread Tomasz Figa
On Wed, Oct 7, 2020 at 4:23 PM Daniel Vetter  wrote:
>
> On Wed, Oct 7, 2020 at 4:12 PM Tomasz Figa  wrote:
> >
> > On Wed, Oct 7, 2020 at 4:09 PM Daniel Vetter  wrote:
> > >
> > > On Wed, Oct 7, 2020 at 3:34 PM Tomasz Figa  wrote:
> > > >
> > > > On Wed, Oct 7, 2020 at 3:06 PM Jason Gunthorpe  wrote:
> > > > >
> > > > > On Wed, Oct 07, 2020 at 02:58:33PM +0200, Daniel Vetter wrote:
> > > > > > > On Wed, Oct 7, 2020 at 2:48 PM Tomasz Figa  wrote:
> > > > > > > >
> > > > > > > > On Wed, Oct 7, 2020 at 2:44 PM Jason Gunthorpe  wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Oct 07, 2020 at 02:33:56PM +0200, Marek Szyprowski wrote:
> > > > > > > > > > Well, it was in vb2_get_vma() function, but now I see that it has been
> > > > > > > > > > lost in fb639eb39154 and 6690c8c78c74 some time ago...
> > > > > > > > >
> > > > > > > > > There is no guarantee that holding a get on the file says anything
> > > > > > > > > about the VMA. This needed to check that the file was some special
> > > > > > > > > kind of file that promised the VMA layout and file lifetime are
> > > > > > > > > connected.
> > > > > > > > >
> > > > > > > > > Also, cloning a VMA outside the mm world is just really bad. That
> > > > > > > > > would screw up many assumptions the drivers make.
> > > > > > > > >
> > > > > > > > > If it is all obsolete I say we hide it behind a default n config
> > > > > > > > > symbol and taint the kernel if anything uses it.
> > > > > > > > >
> > > > > > > > > Add a big comment above the follow_pfn to warn others away from this
> > > > > > > > > code.
> > > > > > > >
> > > > > > > > Sadly it's just verbally declared as deprecated and not formally noted
> > > > > > > > anyway. There are a lot of userspace applications relying on user
> > > > > > > > pointer support.
> > > > > > >
> > > > > > > userptr can stay, it's the userptr abuse for zerocopy buffer sharing
> > > > > > > which doesn't work anymore. At least without major surgery (you'd need
> > > > > > > an mmu notifier to zap mappings and recreate them, and that pretty
> > > > > > > much breaks the v4l model of preallocating all buffers to make sure we
> > > > > > > never underflow the buffer queue). And static mappings are not coming
> > > > > > > back I think, we'll go ever more into the direction of dynamic
> > > > > > > mappings and moving stuff around as needed.
> > > > >
> > > > > Right, and to be clear, the last time I saw a security flaw of this
> > > > > magnitude from a subsystem badly mis-designing itself, Linus's
> > > > > knee-jerk reaction was to propose to remove the whole subsystem.
> > > > >
> > > > > Please don't take status-quo as acceptable, V4L community has to work
> > > > > to resolve this, uABI breakage or not. The follow_pfn related code
> > > > > must be compiled out of normal distro kernel builds.
> > > >
> > > > I think the userptr zero-copy hack should be able to go away indeed,
> > > > given that we now have CMA that allows having carveouts backed by
> > > > struct pages and having the memory represented as DMA-buf normally.
> > >
> > > Not sure whether there's a confusion here: dma-buf supports memory not
> > > backed by struct page.
> > >
> >
> > That's new to me. The whole API relies on sg_tables a lot, which in
> > turn rely on struct page pointers to describe the physical memory.
>
> You're not allowed to look at struct page pointers from the importer
> side, those might not be there. Which isn't the prettiest thing, but
> it works. And even if there's a struct page, you're still not allowed
> to look at it, sinc

Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-07 Thread Tomasz Figa
On Wed, Oct 7, 2020 at 4:09 PM Daniel Vetter  wrote:
>
> On Wed, Oct 7, 2020 at 3:34 PM Tomasz Figa  wrote:
> >
> > On Wed, Oct 7, 2020 at 3:06 PM Jason Gunthorpe  wrote:
> > >
> > > On Wed, Oct 07, 2020 at 02:58:33PM +0200, Daniel Vetter wrote:
> > > > On Wed, Oct 7, 2020 at 2:48 PM Tomasz Figa  wrote:
> > > > >
> > > > > On Wed, Oct 7, 2020 at 2:44 PM Jason Gunthorpe  wrote:
> > > > > >
> > > > > > On Wed, Oct 07, 2020 at 02:33:56PM +0200, Marek Szyprowski wrote:
> > > > > > > Well, it was in vb2_get_vma() function, but now I see that it has 
> > > > > > > been
> > > > > > > lost in fb639eb39154 and 6690c8c78c74 some time ago...
> > > > > >
> > > > > > > There is no guarantee that holding a get on the file says anything
> > > > > > about the VMA. This needed to check that the file was some special
> > > > > > kind of file that promised the VMA layout and file lifetime are
> > > > > > connected.
> > > > > >
> > > > > > Also, cloning a VMA outside the mm world is just really bad. That
> > > > > > would screw up many assumptions the drivers make.
> > > > > >
> > > > > > If it is all obsolete I say we hide it behind a default n config
> > > > > > symbol and taint the kernel if anything uses it.
> > > > > >
> > > > > > Add a big comment above the follow_pfn to warn others away from this
> > > > > > code.
> > > > >
> > > > > Sadly it's just verbally declared as deprecated and not formally noted
> > > > > anyway. There are a lot of userspace applications relying on user
> > > > > pointer support.
> > > >
> > > > > userptr can stay, it's the userptr abuse for zerocopy buffer sharing
> > > > which doesn't work anymore. At least without major surgery (you'd need
> > > > an mmu notifier to zap mappings and recreate them, and that pretty
> > > > much breaks the v4l model of preallocating all buffers to make sure we
> > > > never underflow the buffer queue). And static mappings are not coming
> > > > back I think, we'll go ever more into the direction of dynamic
> > > > mappings and moving stuff around as needed.
> > >
> > > Right, and to be clear, the last time I saw a security flaw of this
> > > magnitude from a subsystem badly mis-designing itself, Linus's
> > > knee-jerk reaction was to propose to remove the whole subsystem.
> > >
> > > Please don't take status-quo as acceptable, V4L community has to work
> > > to resolve this, uABI breakage or not. The follow_pfn related code
> > > must be compiled out of normal distro kernel builds.
> >
> > I think the userptr zero-copy hack should be able to go away indeed,
> > given that we now have CMA that allows having carveouts backed by
> > struct pages and having the memory represented as DMA-buf normally.
>
> Not sure whether there's a confusion here: dma-buf supports memory not
> backed by struct page.
>

That's new to me. The whole API relies on sg_tables a lot, which in
turn rely on struct page pointers to describe the physical memory.

> > How about the regular userptr use case, though?
> >
> > The existing code resolves the user pointer into pages by following
> > the get_vaddr_frames() -> frame_vector_to_pages() ->
> > sg_alloc_table_from_pages() / vm_map_ram() approach.
> > get_vaddr_frames() seems to use pin_user_pages() behind the scenes if
> > the vma is not an IO or a PFNMAP, falling back to follow_pfn()
> > otherwise.
>
> Yeah pin_user_pages is fine, it's just the VM_IO | VM_PFNMAP vma that
> don't work.

Ack.

> >
> > Is your intention to drop get_vaddr_frames() or we could still keep
> > using it and if vec->is_pfns is true:
> > a) if CONFIG_VIDEO_LEGACY_PFN_USERPTR is set, taint the kernel
> > b) otherwise just undo and fail?
>
> I'm typing that patch series (plus a pile more) right now.

Cool, thanks!

We also need to bring back the vma_open() that somehow disappeared
around 4.2, as Marek found.

Best regards,
Tomasz


Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-07 Thread Tomasz Figa
On Wed, Oct 7, 2020 at 3:06 PM Jason Gunthorpe  wrote:
>
> On Wed, Oct 07, 2020 at 02:58:33PM +0200, Daniel Vetter wrote:
> > On Wed, Oct 7, 2020 at 2:48 PM Tomasz Figa  wrote:
> > >
> > > On Wed, Oct 7, 2020 at 2:44 PM Jason Gunthorpe  wrote:
> > > >
> > > > On Wed, Oct 07, 2020 at 02:33:56PM +0200, Marek Szyprowski wrote:
> > > > > Well, it was in vb2_get_vma() function, but now I see that it has been
> > > > > lost in fb639eb39154 and 6690c8c78c74 some time ago...
> > > >
> > > > > There is no guarantee that holding a get on the file says anything
> > > > about the VMA. This needed to check that the file was some special
> > > > kind of file that promised the VMA layout and file lifetime are
> > > > connected.
> > > >
> > > > Also, cloning a VMA outside the mm world is just really bad. That
> > > > would screw up many assumptions the drivers make.
> > > >
> > > > If it is all obsolete I say we hide it behind a default n config
> > > > symbol and taint the kernel if anything uses it.
> > > >
> > > > Add a big comment above the follow_pfn to warn others away from this
> > > > code.
> > >
> > > Sadly it's just verbally declared as deprecated and not formally noted
> > > anyway. There are a lot of userspace applications relying on user
> > > pointer support.
> >
> > > userptr can stay, it's the userptr abuse for zerocopy buffer sharing
> > which doesn't work anymore. At least without major surgery (you'd need
> > an mmu notifier to zap mappings and recreate them, and that pretty
> > much breaks the v4l model of preallocating all buffers to make sure we
> > never underflow the buffer queue). And static mappings are not coming
> > back I think, we'll go ever more into the direction of dynamic
> > mappings and moving stuff around as needed.
>
> Right, and to be clear, the last time I saw a security flaw of this
> magnitude from a subsystem badly mis-designing itself, Linus's
> knee-jerk reaction was to propose to remove the whole subsystem.
>
> Please don't take status-quo as acceptable, V4L community has to work
> to resolve this, uABI breakage or not. The follow_pfn related code
> must be compiled out of normal distro kernel builds.

I think the userptr zero-copy hack should be able to go away indeed,
given that we now have CMA that allows having carveouts backed by
struct pages and having the memory represented as DMA-buf normally.

How about the regular userptr use case, though?

The existing code resolves the user pointer into pages by following
the get_vaddr_frames() -> frame_vector_to_pages() ->
sg_alloc_table_from_pages() / vm_map_ram() approach.
get_vaddr_frames() seems to use pin_user_pages() behind the scenes if
the vma is not an IO or a PFNMAP, falling back to follow_pfn()
otherwise.

Is your intention to drop get_vaddr_frames(), or could we still keep
using it and, if vec->is_pfns is true:
a) if CONFIG_VIDEO_LEGACY_PFN_USERPTR is set, taint the kernel
b) otherwise just undo and fail?
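
To make the question concrete, the two options could be modeled as in
the userspace sketch below. Everything in it is an illustrative stand-in
rather than kernel code: struct frame_vec_model, userptr_check_frames(),
the legacy_enabled argument (standing in for
CONFIG_VIDEO_LEGACY_PFN_USERPTR), the tainted flag and the -EFAULT return
value are all assumptions.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the frame vector; only the fields needed here. */
struct frame_vec_model {
	bool is_pfns;   /* true if the frames came from follow_pfn() */
	bool held;      /* true while the frames are still referenced */
};

/* Stand-in for the kernel taint state. */
static bool tainted;

/*
 * Decide what to do with a freshly obtained frame vector.
 * legacy_enabled models CONFIG_VIDEO_LEGACY_PFN_USERPTR being set.
 * Returns 0 if the buffer may be used, a negative errno otherwise.
 */
static int userptr_check_frames(struct frame_vec_model *vec, bool legacy_enabled)
{
	if (!vec->is_pfns)
		return 0;           /* regular pinned pages: nothing special */

	if (legacy_enabled) {
		tainted = true;     /* option a): allow, but taint */
		return 0;
	}

	vec->held = false;          /* option b): undo ... */
	return -EFAULT;             /* ... and fail (error code is a guess) */
}
```

With the symbol disabled, a PFN-backed user pointer is simply rejected,
while struct-page-backed user pointers keep working unchanged.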

Best regards,
Tomasz


Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-07 Thread Tomasz Figa
On Sat, Oct 3, 2020 at 1:31 AM Jason Gunthorpe  wrote:
>
> On Fri, Oct 02, 2020 at 08:16:48PM +0200, Daniel Vetter wrote:
> > On Fri, Oct 2, 2020 at 8:06 PM Jason Gunthorpe  wrote:
> > > On Fri, Oct 02, 2020 at 07:53:03PM +0200, Daniel Vetter wrote:
> > > > For $reasons I've stumbled over this code and I'm not sure the change
> > > > to the new gup functions in 55a650c35fea ("mm/gup: frame_vector:
> > > > convert get_user_pages() --> pin_user_pages()") was entirely correct.
> > > >
> > > > This here is used for long term buffers (not just quick I/O) like
> > > > RDMA, and John notes this in his patch. But I thought the rule for
> > > > these is that they need to add FOLL_LONGTERM, which John's patch
> > > > didn't do.
> > > >
> > > > There is already a dax specific check (added in b7f0554a56f2 ("mm:
> > > > fail get_vaddr_frames() for filesystem-dax mappings")), so this seems
> > > > like the prudent thing to do.
> > > >
> > > > Signed-off-by: Daniel Vetter 
> > > > Cc: Andrew Morton 
> > > > Cc: John Hubbard 
> > > > Cc: Jérôme Glisse 
> > > > Cc: Jan Kara 
> > > > Cc: Dan Williams 
> > > > Cc: linux...@kvack.org
> > > > Cc: linux-arm-ker...@lists.infradead.org
> > > > Cc: linux-samsung-...@vger.kernel.org
> > > > Cc: linux-me...@vger.kernel.org
> > > > Hi all,
> > > >
> > > > I stumbled over this and figured typing this patch can't hurt. Really
> > > > just to maybe learn a few things about how gup/pup is supposed to be
> > > > used (we have a bit of that in drivers/gpu), this here isn't really
> > > > > related to anything I'm doing.
> > >
> > > FOLL_FORCE is a pretty big clue it should be FOLL_LONGTERM, IMHO
> >
> > Since you're here ... I've noticed that ib sets FOLL_FORCE when the ib
> > verb access mode indicates possible writes. I'm not really clear on
> > > why FOLL_WRITE isn't enough and why you need to be able to write
> > through a vma that's write protected currently.
>
> Ah, FOLL_FORCE | FOLL_WRITE means *read* confusingly enough, and the
> only reason you'd want this version for read is if you are doing
> longterm stuff. I wrote about this recently:
>
> https://lore.kernel.org/linux-mm/20200928235739.gu9...@ziepe.ca/
>
> > > Since every driver does this wrong anything that uses this is creating
> > > terrifying security issues.
> > >
> > > IMHO this whole API should be deleted :(
> >
> > Yeah that part I just tried to conveniently ignore. I guess this dates
> > > back to a time when ioremaps were at best fixed, and there wasn't
> > anything like a gpu driver dynamically managing vram around, resulting
> > in random entirely unrelated things possibly being mapped to that set
> > of pfns.
>
> No, it was always wrong. Prior to GPU like cases the lifetime of the
> PTE was tied to the vma and when the vma becomes free the driver can
> move the things in the PTEs to 'free'. Easy to trigger use-after-free
> issues and devices like RDMA have security contexts attached to these
> PTEs so it becomes a serious security bug to do something like this.
>
> > The underlying follow_pfn is also used in other places within
> > drivers/media, so this doesn't seem to be an accident, but actually
> > intentional.
>
> Looking closely, there are very few users, most *seem* pointless, but
> maybe there is a crazy reason?
>
> The sequence
>   get_vaddr_frames();
>   frame_vector_to_pages();
>   sg_alloc_table_from_pages();
>
> Should be written
>   pin_user_pages_fast(FOLL_LONGTERM);
>   sg_alloc_table_from_pages()
>
> There is some 'special' code in frame_vector_to_pages() that tries to
> get a struct page for things from a VM_IO or VM_PFNMAP...
>
> Oh snap, that is *completely* broken! If the first VMA is IO|PFNMAP
> then get_vaddr_frames() iterates over all VMAs in the range, of any
> kind and extracts the PTEs then blindly references them! This means it
> can be used to use after free normal RAM struct pages!! Gah!
>
> Wow. Okay. That has to go.
>
> So, I *think* we can assume there are no sane cases where
> frame_vector_to_pages() succeeds but pin_user_pages() wasn't called.
>
> That means the users here:
>  - habanalabs:  Hey Oded can you fix this up?
>
>  - gpu/exynos: Daniel can you get someone there to stop using it?
>
>  - media/videobuf via vb2_dma_sg_get_userptr()
>
> Should all be switched to the standard pin_user_pages sequence
> above.
>
> That leaves the only interesting places as vb2_dc_get_userptr() and
> vb2_vmalloc_get_userptr() which both completely fail to follow the
> REQUIRED behavior in the function's comment about checking PTEs. It
> just DMA maps them. Badly broken.

Note that vb2_vmalloc is only used for in-kernel CPU usage, e.g. the
contents being copied by the driver between vb2 buffers and some
hardware FIFO or other dedicated buffers. The memory does not go to
any hardware DMA.

Could you elaborate on what "the REQUIRED behavior" is? I can see that
both follow the get_vaddr_frames() -> frame_vector_to_pages() flow, as
you mentioned. Perhaps the only change needed is switching to

Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-07 Thread Tomasz Figa
On Wed, Oct 7, 2020 at 2:44 PM Jason Gunthorpe  wrote:
>
> On Wed, Oct 07, 2020 at 02:33:56PM +0200, Marek Szyprowski wrote:
> > Well, it was in vb2_get_vma() function, but now I see that it has been
> > lost in fb639eb39154 and 6690c8c78c74 some time ago...
>
> There is no guarantee that holding a get on the file says anything
> about the VMA. This needed to check that the file was some special
> kind of file that promised the VMA layout and file lifetime are
> connected.
>
> Also, cloning a VMA outside the mm world is just really bad. That
> would screw up many assumptions the drivers make.
>
> If it is all obsolete I say we hide it behind a default n config
> symbol and taint the kernel if anything uses it.
>
> Add a big comment above the follow_pfn to warn others away from this
> code.

Sadly it's just verbally declared as deprecated and not formally noted
anywhere. There are a lot of userspace applications relying on user
pointer support.


Re: [PATCH 8/8] WIP: add a dma_alloc_contiguous API

2020-10-07 Thread Tomasz Figa
On Wed, Oct 7, 2020 at 8:21 AM Christoph Hellwig  wrote:
>
> On Tue, Oct 06, 2020 at 10:56:04PM +0200, Tomasz Figa wrote:
> > > Yes.  And make sure the API isn't implemented when VIVT caches are
> > > used, but that isn't really different from the current interface.
> >
> > Okay, thanks. Let's see if we can make necessary changes to the videobuf2.
> >
> > +Sergey Senozhatsky for awareness too.
>
> I can defer the changes a bit to see if you'd really much prefer
> the former interface.  I think for now the most important thing is
> that it works properly for the potential users, and the prime one is
> videobuf2 for now.  drm also seems like a big potential user, but I
> had a really hard time getting the developers to engage in API
> development.

My initial feeling is that it should work, but we'll give you a
definitive answer once we prototype it. :)

We might actually give it a try in the USB HCD subsystem as well, to
implement usb_alloc_noncoherent(), as an optimization for drivers
which have to perform multiple random accesses to the URB buffers. I
think you might recall discussing this by the way of the pwc and
uvcvideo camera drivers.

Best regards,
Tomasz


Re: [PATCH 8/8] WIP: add a dma_alloc_contiguous API

2020-10-06 Thread Tomasz Figa
On Mon, Oct 5, 2020 at 10:26 AM Christoph Hellwig  wrote:
>
> On Fri, Oct 02, 2020 at 05:50:40PM +, Tomasz Figa wrote:
> > Hi Christoph,
> >
> > On Wed, Sep 30, 2020 at 06:09:17PM +0200, Christoph Hellwig wrote:
> > > Add a new API that returns a virtually non-contiguous array of pages
> > > and dma address.  This API is only implemented for dma-iommu and will
> > > not be implemented for non-iommu DMA API instances that have to allocate
> > > contiguous memory.  It is up to the caller to check if the API is
> > > available.
> >
> > Would you mind shedding some more light on what made the previous attempt
> > not work well? I liked the previous API because it was more consistent with
> > the regular dma_alloc_coherent().
>
> The problem is that with a dma_alloc_noncoherent that can return pages
> not in the kernel mapping we can't just use virt_to_page to fill in
> scatterlists or mmap the buffer to userspace, but would need new helpers
> and another two methods.
>
> > >  - no kernel mapping or only temporary kernel mappings are required.
> > >That is as a better replacement for DMA_ATTR_NO_KERNEL_MAPPING
> > >  - a kernel mapping is required for cached and DMA mapped pages, but
> > >the driver also needs the pages to e.g. map them to userspace.
> > >In that sense it is a replacement for some aspects of the recently
> > >removed and never fully implemented DMA_ATTR_NON_CONSISTENT
> >
> > What's the expected allocation and mapping flow with the latter? Would that be
> >
> > pages = dma_alloc_noncoherent(...)
> > vaddr = vmap(pages, ...);
> >
> > ?
>
> Yes.  With the vmap step optional for replacements of
> DMA_ATTR_NO_KERNEL_MAPPING, which is another nightmare to deal with.
>
> > Would one just use the usual dma_sync_for_{cpu,device}() for cache
> > invalidate/clean, while keeping the mapping in place?
>
> Yes.  And make sure the API isn't implemented when VIVT caches are
> used, but that isn't really different from the current interface.

Okay, thanks. Let's see if we can make the necessary changes to videobuf2.

+Sergey Senozhatsky for awareness too.

Best regards,
Tomasz


Re: [PATCH v8 6/6] at24: Support probing while off

2020-10-06 Thread Tomasz Figa
Hi Sakari,

On Thu, Sep 3, 2020 at 10:15 AM Sakari Ailus
 wrote:
>
> In certain use cases (where the chip is part of a camera module, and the
> camera module is wired together with a camera privacy LED), powering on
> the device during probe is undesirable. Add support for the at24 to
> execute probe while being powered off. For this to happen, a hint in form
> of a device property is required from the firmware.
>
> Signed-off-by: Sakari Ailus 
> ---
>  drivers/misc/eeprom/at24.c | 43 +++---
>  1 file changed, 26 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
> index 8f5de5f10bbea..2d24e33788d7d 100644
> --- a/drivers/misc/eeprom/at24.c
> +++ b/drivers/misc/eeprom/at24.c
> @@ -595,6 +595,7 @@ static int at24_probe(struct i2c_client *client)
> bool i2c_fn_i2c, i2c_fn_block;
> unsigned int i, num_addresses;
> struct at24_data *at24;
> +   bool low_power;
> struct regmap *regmap;
> bool writable;
> u8 test_byte;
> @@ -733,25 +734,30 @@ static int at24_probe(struct i2c_client *client)
>
> i2c_set_clientdata(client, at24);
>
> -   err = regulator_enable(at24->vcc_reg);
> -   if (err) {
> -   dev_err(dev, "Failed to enable vcc regulator\n");
> -   return err;
> -   }
> +   low_power = acpi_dev_state_low_power(&client->dev);
> +   if (!low_power) {
> +   err = regulator_enable(at24->vcc_reg);
> +   if (err) {
> +   dev_err(dev, "Failed to enable vcc regulator\n");
> +   return err;
> +   }
>
> -   /* enable runtime pm */
> -   pm_runtime_set_active(dev);
> +   pm_runtime_set_active(dev);
> +   }
> pm_runtime_enable(dev);
>

What's the guarantee that at this point the runtime PM wouldn't
suspend the device? Notice that the nvmem device is already exposed to
the userspace, which could trigger pm runtime gets and puts (and thus
idles as well).

Best regards,
Tomasz

> /*
> -* Perform a one-byte test read to verify that the
> -* chip is functional.
> +* Perform a one-byte test read to verify that the chip is functional,
> +* unless powering on the device is to be avoided during probe (i.e.
> +* it's powered off right now).
>  */
> -   err = at24_read(at24, 0, &test_byte, 1);
> -   if (err) {
> -   pm_runtime_disable(dev);
> -   regulator_disable(at24->vcc_reg);
> -   return -ENODEV;
> +   if (!low_power) {
> +   err = at24_read(at24, 0, &test_byte, 1);
> +   if (err) {
> +   pm_runtime_disable(dev);
> +   regulator_disable(at24->vcc_reg);
> +   return -ENODEV;
> +   }
> }
>
> pm_runtime_idle(dev);
> @@ -771,9 +777,11 @@ static int at24_remove(struct i2c_client *client)
> struct at24_data *at24 = i2c_get_clientdata(client);
>
> pm_runtime_disable(&client->dev);
> -   if (!pm_runtime_status_suspended(&client->dev))
> -   regulator_disable(at24->vcc_reg);
> -   pm_runtime_set_suspended(&client->dev);
> +   if (!acpi_dev_state_low_power(&client->dev)) {
> +   if (!pm_runtime_status_suspended(&client->dev))
> +   regulator_disable(at24->vcc_reg);
> +   pm_runtime_set_suspended(&client->dev);
> +   }
>
> return 0;
>  }
> @@ -810,6 +818,7 @@ static struct i2c_driver at24_driver = {
> .probe_new = at24_probe,
> .remove = at24_remove,
> .id_table = at24_ids,
> +   .flags = I2C_DRV_FL_ALLOW_LOW_POWER_PROBE,
>  };
>
>  static int __init at24_init(void)
> --
> 2.20.1
>


Re: [PATCH v8 6/6] at24: Support probing while off

2020-10-06 Thread Tomasz Figa
On Tue, Oct 6, 2020 at 1:20 PM Tomasz Figa  wrote:
>
> Hi Sakari,
>
> On Thu, Sep 3, 2020 at 10:15 AM Sakari Ailus
>  wrote:
> >
> > In certain use cases (where the chip is part of a camera module, and the
> > camera module is wired together with a camera privacy LED), powering on
> > the device during probe is undesirable. Add support for the at24 to
> > execute probe while being powered off. For this to happen, a hint in form
> > of a device property is required from the firmware.
> >
> > Signed-off-by: Sakari Ailus 
> > ---
> >  drivers/misc/eeprom/at24.c | 43 +++---
> >  1 file changed, 26 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
> > index 8f5de5f10bbea..2d24e33788d7d 100644
> > --- a/drivers/misc/eeprom/at24.c
> > +++ b/drivers/misc/eeprom/at24.c
> > @@ -595,6 +595,7 @@ static int at24_probe(struct i2c_client *client)
> > bool i2c_fn_i2c, i2c_fn_block;
> > unsigned int i, num_addresses;
> > struct at24_data *at24;
> > +   bool low_power;
> > struct regmap *regmap;
> > bool writable;
> > u8 test_byte;
> > @@ -733,25 +734,30 @@ static int at24_probe(struct i2c_client *client)
> >
> > i2c_set_clientdata(client, at24);
> >
> > -   err = regulator_enable(at24->vcc_reg);
> > -   if (err) {
> > -   dev_err(dev, "Failed to enable vcc regulator\n");
> > -   return err;
> > -   }
> > > +   low_power = acpi_dev_state_low_power(&client->dev);
> > +   if (!low_power) {
> > +   err = regulator_enable(at24->vcc_reg);
> > +   if (err) {
> > +   dev_err(dev, "Failed to enable vcc regulator\n");
> > +   return err;
> > +   }
> >
> > -   /* enable runtime pm */
> > -   pm_runtime_set_active(dev);
> > +   pm_runtime_set_active(dev);
> > +   }
> > pm_runtime_enable(dev);
> >
>
> What's the guarantee that at this point the runtime PM wouldn't
> suspend the device? Notice that the nvmem device is already exposed to
> the userspace, which could trigger pm runtime gets and puts (and thus
> idles as well).
>
> Best regards,
> Tomasz
>
> > /*
> > -* Perform a one-byte test read to verify that the
> > -* chip is functional.
> > > +* Perform a one-byte test read to verify that the chip is functional,
> > +* unless powering on the device is to be avoided during probe (i.e.
> > +* it's powered off right now).
> >  */
> > > -   err = at24_read(at24, 0, &test_byte, 1);

Actually never mind. Someone pointed out to me that at24_read() calls
pm_runtime_get_sync() internally, so we should be fine. Sorry for the
noise.

Best regards,
Tomasz

> > -   if (err) {
> > -   pm_runtime_disable(dev);
> > -   regulator_disable(at24->vcc_reg);
> > -   return -ENODEV;
> > +   if (!low_power) {
> > > +   err = at24_read(at24, 0, &test_byte, 1);
> > +   if (err) {
> > +   pm_runtime_disable(dev);
> > +   regulator_disable(at24->vcc_reg);
> > +   return -ENODEV;
> > +   }
> > }
> >
> > pm_runtime_idle(dev);
> > @@ -771,9 +777,11 @@ static int at24_remove(struct i2c_client *client)
> > struct at24_data *at24 = i2c_get_clientdata(client);
> >
> > > pm_runtime_disable(&client->dev);
> > > -   if (!pm_runtime_status_suspended(&client->dev))
> > > -   regulator_disable(at24->vcc_reg);
> > > -   pm_runtime_set_suspended(&client->dev);
> > > +   if (!acpi_dev_state_low_power(&client->dev)) {
> > > +   if (!pm_runtime_status_suspended(&client->dev))
> > > +   regulator_disable(at24->vcc_reg);
> > > +   pm_runtime_set_suspended(&client->dev);
> > +   }
> >
> > return 0;
> >  }
> > @@ -810,6 +818,7 @@ static struct i2c_driver at24_driver = {
> > .probe_new = at24_probe,
> > .remove = at24_remove,
> > .id_table = at24_ids,
> > +   .flags = I2C_DRV_FL_ALLOW_LOW_POWER_PROBE,
> >  };
> >
> >  static int __init at24_init(void)
> > --
> > 2.20.1
> >


Re: [PATCH v5 1/7] media: v4l2: Extend pixel formats to unify single/multi-planar handling (and more)

2020-10-02 Thread Tomasz Figa
Hi Helen,

On Tue, Aug 04, 2020 at 04:29:33PM -0300, Helen Koike wrote:
> This is part of the multiplanar and singleplanar unification process.
> v4l2_ext_pix_format is supposed to work for both cases.
> 
> We also add the concept of modifiers already employed in DRM to expose
> HW-specific formats (like tiled or compressed formats) and allow
> exchanging this information with the DRM subsystem in a consistent way.
> 
> Note that only V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] are accepted in
> v4l2_ext_format, other types will be rejected if you use the
> {G,S,TRY}_EXT_PIX_FMT ioctls.
> 
> New hooks have been added to v4l2_ioctl_ops to support those new ioctls
> in drivers, but, in the meantime, the core takes care of converting
> {S,G,TRY}_EXT_PIX_FMT requests into {S,G,TRY}_FMT so that old drivers can
> still work if the userspace app/lib uses the new ioctls.
> The conversion is also done the other way around to allow userspace
> apps/libs using {S,G,TRY}_FMT to work with drivers implementing the
> _ext_ hooks.
> 

Thank you for the patch. Please see my comments inline.

> Signed-off-by: Boris Brezillon 
> Signed-off-by: Helen Koike 
> ---
> Changes in v5:
> - change sizes and reorder fields to avoid holes in the struct and make
>   it the same for 32 and 64 bits
> - removed __attribute__ ((packed)) from uapi structs
> - Fix doc warning from make htmldocs
> - Updated commit message with EXT_PIX prefix for the ioctls.
> 
> Changes in v4:
> - Use v4l2_ext_pix_format directly in the ioctl, drop v4l2_ext_format,
> making V4L2_BUF_TYPE_VIDEO_[OUTPUT,CAPTURE] the only valid types.
> - Add reserved fields
> - Removed num_planes from struct v4l2_ext_pix_format
> - Removed flag field from struct v4l2_ext_pix_format, since the only
>   defined value is V4L2_PIX_FMT_FLAG_PREMUL_ALPHA only used by vsp1,
>   where we can use modifiers, or add it back later through the reserved
>   bits.
> - In v4l2_ext_format_to_format(), check if modifier is != MOD_LINEAR &&
>   != MOD_INVALID
> - Fix type assignment in v4l_g_fmt_ext_pix()
> - Rebased on top of media/master (post 5.8-rc1)
> 
> Changes in v3:
> - Rebased on top of media/master (post 5.4-rc1)
> 
> Changes in v2:
> - Move the modifier in v4l2_ext_format (was formerly placed in
>   v4l2_ext_plane)
> - Fix a few bugs in the converters and add a strict parameter to
>   allow conversion of uninitialized/mis-initialized objects
> ---
>  drivers/media/v4l2-core/v4l2-dev.c   |  21 +-
>  drivers/media/v4l2-core/v4l2-ioctl.c | 585 +++
>  include/media/v4l2-ioctl.h   |  34 ++
>  include/uapi/linux/videodev2.h   |  56 +++
>  4 files changed, 615 insertions(+), 81 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/v4l2-dev.c b/drivers/media/v4l2-core/v4l2-dev.c
> index a593ea0598b55..e1829906bc086 100644
> --- a/drivers/media/v4l2-core/v4l2-dev.c
> +++ b/drivers/media/v4l2-core/v4l2-dev.c
> @@ -607,25 +607,37 @@ static void determine_valid_ioctls(struct video_device *vdev)
>   set_bit(_IOC_NR(VIDIOC_ENUM_FMT), valid_ioctls);
>   if ((is_rx && (ops->vidioc_g_fmt_vid_cap ||
>  ops->vidioc_g_fmt_vid_cap_mplane ||
> +ops->vidioc_g_ext_pix_fmt_vid_cap ||
>  ops->vidioc_g_fmt_vid_overlay)) ||
>   (is_tx && (ops->vidioc_g_fmt_vid_out ||
>  ops->vidioc_g_fmt_vid_out_mplane ||
> -ops->vidioc_g_fmt_vid_out_overlay)))
> +ops->vidioc_g_ext_pix_fmt_vid_out ||
> +ops->vidioc_g_fmt_vid_out_overlay))) {
>set_bit(_IOC_NR(VIDIOC_G_FMT), valid_ioctls);
> +  set_bit(_IOC_NR(VIDIOC_G_EXT_PIX_FMT), valid_ioctls);

Is it expected to allow the new ioctls for drivers which implement the old
vid_out_overlay callbacks?

> + }
>   if ((is_rx && (ops->vidioc_s_fmt_vid_cap ||
>  ops->vidioc_s_fmt_vid_cap_mplane ||
> +ops->vidioc_s_ext_pix_fmt_vid_cap ||
>  ops->vidioc_s_fmt_vid_overlay)) ||
>   (is_tx && (ops->vidioc_s_fmt_vid_out ||
>  ops->vidioc_s_fmt_vid_out_mplane ||
> -ops->vidioc_s_fmt_vid_out_overlay)))
> +ops->vidioc_s_ext_pix_fmt_vid_out ||
> +ops->vidioc_s_fmt_vid_out_overlay))) {
>set_bit(_IOC_NR(VIDIOC_S_FMT), valid_ioctls);
> +  set_bit(_IOC_NR(VIDIOC_S_EXT_PIX_FMT), valid_ioctls);
> + }
>   if ((is_rx && (ops->vidioc_try_fmt_vid_cap ||
>  ops->vidioc_try_fmt_vid_cap_mplane ||
> +ops->vidioc_try_ext_pix_fmt_vid_cap ||
>  ops->vidioc_try_fmt_vid_overlay)) ||
>   (is_tx && 

Re: [PATCH 1/2] mm/frame-vec: Drop gup_flags from get_vaddr_frames()

2020-10-02 Thread Tomasz Figa
On Fri, Oct 2, 2020 at 7:53 PM Daniel Vetter  wrote:
>
> FOLL_WRITE | FOLL_FORCE is really the only reasonable thing to do for
> simple dma device that can't guarantee write protection. Which is also
> what all the callers are using.
>
> So just simplify this.
>
> Signed-off-by: Daniel Vetter 
> Cc: Inki Dae 
> Cc: Joonyoung Shim 
> Cc: Seung-Woo Kim 
> Cc: Kyungmin Park 
> Cc: Kukjin Kim 
> Cc: Krzysztof Kozlowski 
> Cc: Pawel Osciak 
> Cc: Marek Szyprowski 
> Cc: Tomasz Figa 
> Cc: Andrew Morton 
> Cc: Oded Gabbay 
> Cc: Omer Shpigelman 
> Cc: Tomer Tayar 
> Cc: Greg Kroah-Hartman 
> Cc: Pawel Piskorski 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: linux...@kvack.org
> ---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c   | 3 +--
>  drivers/media/common/videobuf2/videobuf2-memops.c | 3 +--
>  drivers/misc/habanalabs/common/memory.c   | 3 +--
>  include/linux/mm.h| 2 +-
>  mm/frame_vector.c | 4 ++--
>  5 files changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index 967a5cdc120e..ac452842bab3 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -480,8 +480,7 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d,
> goto err_free;
> }
>
> -   ret = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE,
> -   g2d_userptr->vec);
> +   ret = get_vaddr_frames(start, npages, g2d_userptr->vec);
> if (ret != npages) {
> DRM_DEV_ERROR(g2d->dev,
>   "failed to get user pages from userptr.\n");
> diff --git a/drivers/media/common/videobuf2/videobuf2-memops.c b/drivers/media/common/videobuf2/videobuf2-memops.c
> index 6e9e05153f4e..9dd6c27162f4 100644
> --- a/drivers/media/common/videobuf2/videobuf2-memops.c
> +++ b/drivers/media/common/videobuf2/videobuf2-memops.c
> @@ -40,7 +40,6 @@ struct frame_vector *vb2_create_framevec(unsigned long start,
> unsigned long first, last;
> unsigned long nr;
> struct frame_vector *vec;
> -   unsigned int flags = FOLL_FORCE | FOLL_WRITE;
>
> first = start >> PAGE_SHIFT;
> last = (start + length - 1) >> PAGE_SHIFT;
> @@ -48,7 +47,7 @@ struct frame_vector *vb2_create_framevec(unsigned long start,
> vec = frame_vector_create(nr);
> if (!vec)
> return ERR_PTR(-ENOMEM);
> -   ret = get_vaddr_frames(start & PAGE_MASK, nr, flags, vec);
> +   ret = get_vaddr_frames(start & PAGE_MASK, nr, vec);
> if (ret < 0)
> goto out_destroy;
> /* We accept only complete set of PFNs */

For drivers/media/common/videobuf2/:

Acked-by: Tomasz Figa 

Best regards,
Tomasz


Re: [PATCH 8/8] WIP: add a dma_alloc_contiguous API

2020-10-02 Thread Tomasz Figa
Hi Christoph,

On Wed, Sep 30, 2020 at 06:09:17PM +0200, Christoph Hellwig wrote:
> Add a new API that returns a virtually non-contiguous array of pages
> and dma address.  This API is only implemented for dma-iommu and will
> not be implemented for non-iommu DMA API instances that have to allocate
> contiguous memory.  It is up to the caller to check if the API is
> available.

Would you mind shedding some more light on what made the previous attempt
not work well? I liked the previous API because it was more consistent with
the regular dma_alloc_coherent().

> 
> The intent is that media drivers can use this API if either:

FWIW, the USB subsystem also has similar needs, and so do some DRM drivers
using DMA API rather than IOMMU API directly. Basically I believe that all
the users removed in your previous series relied on custom downstream
patches to make DMA_ATTR_NON_CONSISTENT work and could finally be made to
work upstream using this API.

> 
>  - no kernel mapping or only temporary kernel mappings are required.
>    That is as a better replacement for DMA_ATTR_NO_KERNEL_MAPPING
>  - a kernel mapping is required for cached and DMA mapped pages, but
>    the driver also needs the pages to e.g. map them to userspace.
>    In that sense it is a replacement for some aspects of the recently
>    removed and never fully implemented DMA_ATTR_NON_CONSISTENT

What's the expected allocation and mapping flow with the latter? Would that be

pages = dma_alloc_noncoherent(...)
vaddr = vmap(pages, ...);

?

Would one just use the usual dma_sync_for_{cpu,device}() for cache
invalidate/clean, while keeping the mapping in place?

Best regards,
Tomasz


Re: [PATCH 2/2] venus: venc: fix handlig of S_SELECTION and G_SELECTION

2020-10-01 Thread Tomasz Figa
On Thu, Oct 1, 2020 at 3:32 AM Stanimir Varbanov
 wrote:
>
> Hi Tomasz,
>
> On 9/25/20 11:55 PM, Tomasz Figa wrote:
> > Hi Dikshita, Stanimir,
> >
> > On Thu, Sep 24, 2020 at 7:31 PM Dikshita Agarwal
> >  wrote:
> >>
> >> From: Stanimir Varbanov 
> >>
> >> - return correct width and height for G_SELECTION
> >> - if requested rectangle wxh doesn't match with capture port wxh
> >>   adjust the rectangle to supported wxh.
> >>
> >> Signed-off-by: Dikshita Agarwal 
> >> ---
> >>  drivers/media/platform/qcom/venus/venc.c | 20 
> >>  1 file changed, 12 insertions(+), 8 deletions(-)
> >>
> > >> diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
> >> index 7d2aaa8..a2cc12d 100644
> >> --- a/drivers/media/platform/qcom/venus/venc.c
> >> +++ b/drivers/media/platform/qcom/venus/venc.c
> >> @@ -463,13 +463,13 @@ static int venc_g_fmt(struct file *file, void *fh, 
> >> struct v4l2_format *f)
> >> switch (s->target) {
> >> case V4L2_SEL_TGT_CROP_DEFAULT:
> >> case V4L2_SEL_TGT_CROP_BOUNDS:
> >> -   s->r.width = inst->width;
> >> -   s->r.height = inst->height;
> >> -   break;
> >> -   case V4L2_SEL_TGT_CROP:
> >> s->r.width = inst->out_width;
> >> s->r.height = inst->out_height;
> >> break;
> >> +   case V4L2_SEL_TGT_CROP:
> >> +   s->r.width = inst->width;
> >> +   s->r.height = inst->height;
> >> +   break;
> >> default:
> >> return -EINVAL;
> > >> }
> >> @@ -490,10 +490,14 @@ static int venc_g_fmt(struct file *file, void *fh, 
> >> struct v4l2_format *f)
> >>
> >> switch (s->target) {
> >> case V4L2_SEL_TGT_CROP:
> >> -   if (s->r.width != inst->out_width ||
> >> -   s->r.height != inst->out_height ||
> >> -   s->r.top != 0 || s->r.left != 0)
> >> -   return -EINVAL;
> >> +   if (s->r.width != inst->width ||
> >> +   s->r.height != inst->height ||
> >> +   s->r.top != 0 || s->r.left != 0) {
> >> +   s->r.top = 0;
> >> +   s->r.left = 0;
> >> +   s->r.width = inst->width;
> >> +   s->r.height = inst->height;
> >
> > What's the point of exposing the selection API if no selection can
> > actually be done?
>
> If someone can guarantee that dropping of s_selection will not break
> userspace applications I'm fine with removing it.

Indeed, the specification could be made clearer about this. The
visible rectangle configuration is described as optional, so I'd
consider the capability to be optional as well.

Of course it doesn't change the fact that something that is optional
in the API may be mandatory for some specific integrations, like
Chrome OS or Android.

>
> I implemented g/s_selection with the idea to add crop functionality
> later because with current firmware interface it needs more work.

I suggested one thing internally, but I'm not sure if it was understood correctly:

Most of the encoders only support partial cropping, with the rectangle
limited to top = 0 and left = 0, in other words, only setting the
visible width and height. This can be easily implemented on most of
the hardware, even those that don't have dedicated cropping
capability, by configuring the hardware as follows:

stride = CAPTURE format width (or bytesperline)
width = CROP width
height = CROP height

I believe Android requires the hardware to support stride and AFAIK
this hardware is also commonly used on Android, so perhaps it's
possible to achieve the above without any firmware changes?

Best regards,
Tomasz
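
The stride/width/height configuration suggested above can be sketched in
plain userspace C. This is a minimal illustration under stated assumptions,
not the venus driver's code: struct toy_rect, struct toy_enc_cfg and
toy_enc_cfg_from_crop() are hypothetical names, and the stride is expressed
in pixels for simplicity rather than as bytesperline.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical crop rectangle, mirroring the left/top/width/height fields. */
struct toy_rect {
	uint32_t left, top, width, height;
};

/* Hypothetical hardware configuration derived from format and crop. */
struct toy_enc_cfg {
	uint32_t stride;  /* line length of the buffer, in pixels here */
	uint32_t width;   /* visible width handed to the encoder */
	uint32_t height;  /* visible height handed to the encoder */
};

/*
 * Partial cropping only: the rectangle must start at (0, 0) and fit inside
 * the format. Returns 0 and fills *cfg on success, -1 if the rectangle
 * cannot be expressed this way.
 */
static int toy_enc_cfg_from_crop(uint32_t fmt_width, uint32_t fmt_height,
				 const struct toy_rect *crop,
				 struct toy_enc_cfg *cfg)
{
	if (crop->left != 0 || crop->top != 0)
		return -1;
	if (crop->width == 0 || crop->height == 0)
		return -1;
	if (crop->width > fmt_width || crop->height > fmt_height)
		return -1;

	cfg->stride = fmt_width;    /* stride = format width */
	cfg->width = crop->width;   /* width  = CROP width */
	cfg->height = crop->height; /* height = CROP height */
	return 0;
}
```

For example, a 1920x1088 buffer with a 1920x1080 crop at (0, 0) would yield
stride 1920, width 1920 and height 1080, while any rectangle with a non-zero
left or top offset is rejected in this sketch.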


Re: [PATCH v8 0/6] Support running driver's probe for a device powered off

2020-09-28 Thread Tomasz Figa
On Mon, Sep 28, 2020 at 4:18 PM Rafael J. Wysocki  wrote:
>
> On Sun, Sep 27, 2020 at 9:44 PM Tomasz Figa  wrote:
> >
> > On Sun, Sep 27, 2020 at 9:39 PM Wolfram Sang  wrote:
> > >
> > >
> > > > I think we might be overly complicating things. IMHO the series as is
> > > > with the "i2c_" prefix removed from the flags introduced would be
> > > > reusable as is for any other subsystem that needs it. Of course, for
> > > > now, the handling of the flag would remain implemented only in the I2C
> > > > subsystem.
> > >
> > > Just to be clear: you are suggesting to remove "i2c" from the DSD
> > > binding "i2c-allow-low-power-probe". And you are not talking about
> > > moving I2C_DRV_FL_ALLOW_LOW_POWER_PROBE to struct device_driver? I
> > > recall the latter has been NACKed by gkh so far.
> > >
> >
> > I'd also drop "I2C_" from "I2C_DRV_FL_ALLOW_LOW_POWER_PROBE", but all
> > the implementation would remain where it is in the code. IOW, I'm just
> > suggesting a naming change to avoid proliferating duplicate flags of
> > the same meaning across subsystems.
>
> But that would indicate that the property was recognized by other
> subsystems which wouldn't be the case, so it would be confusing.
>
> That's why it cannot be documented as a general property ATM too.

I guess that's true. Well, this is kAPI in the end, so if we have more
subsystems, it could always be renamed. So feel free to ignore my
previous comment.

Best regards,
Tomasz


Re: [PATCH v8 0/6] Support running driver's probe for a device powered off

2020-09-27 Thread Tomasz Figa
On Sun, Sep 27, 2020 at 9:39 PM Wolfram Sang  wrote:
>
>
> > I think we might be overly complicating things. IMHO the series as is
> > with the "i2c_" prefix removed from the flags introduced would be
> > reusable as is for any other subsystem that needs it. Of course, for
> > now, the handling of the flag would remain implemented only in the I2C
> > subsystem.
>
> Just to be clear: you are suggesting to remove "i2c" from the DSD
> binding "i2c-allow-low-power-probe". And you are not talking about
> moving I2C_DRV_FL_ALLOW_LOW_POWER_PROBE to struct device_driver? I
> recall the latter has been NACKed by gkh so far.
>

I'd also drop "I2C_" from "I2C_DRV_FL_ALLOW_LOW_POWER_PROBE", but all
the implementation would remain where it is in the code. IOW, I'm just
suggesting a naming change to avoid proliferating duplicate flags of
the same meaning across subsystems.


Re: [PATCH 17/18] dma-iommu: implement ->alloc_noncoherent

2020-09-26 Thread Tomasz Figa
On Sat, Sep 26, 2020 at 4:14 PM Christoph Hellwig  wrote:
>
> On Fri, Sep 25, 2020 at 06:46:22PM +, Tomasz Figa wrote:
> > > +static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
> > > +   dma_addr_t *handle, enum dma_data_direction dir, gfp_t gfp)
> > > +{
> > > +   if (!gfpflags_allow_blocking(gfp)) {
> > > +   struct page *page;
> > > +
> > > +   page = dma_common_alloc_pages(dev, size, handle, dir, gfp);
> > > +   if (!page)
> > > +   return NULL;
> > > +   return page_address(page);
> > > +   }
> > > +
> > > +   return iommu_dma_alloc_remap(dev, size, handle, gfp | __GFP_ZERO,
> > > +PAGE_KERNEL, 0);
> >
> > iommu_dma_alloc_remap() makes use of the DMA_ATTR_ALLOC_SINGLE_PAGES attribute
> > to optimize the allocations for devices which don't care about how contiguous
> > the backing memory is. Do you think we could add an attrs argument to this
> > function and pass it there?
> >
> > As ARM is being moved to the common iommu-dma layer as well, we'll probably
> > make use of the argument to support the DMA_ATTR_NO_KERNEL_MAPPING attribute to
> > conserve the vmalloc area.
>
> We could probably add it.  However I wonder why this is something the
> drivers should care about.  Isn't this really something that should
> be a kernel-wide policy for a given system?

There are IOMMUs out there which support huge pages and those can
benefit *some* hardware depending on what kind of accesses they
perform, possibly on a per-buffer basis. At the same time, order > 0
allocations can be expensive, significantly affecting allocation
latency, so for devices which don't care about huge pages anyone would
prefer simple single-page allocations. Currently the drivers know
best whether the hardware they drive would care. There are some
decision factors listed in the documentation [1].

I can imagine cases where drivers might not be best placed to decide
about this - for example, the workload could vary depending on the
userspace or a product decision regarding the performance vs
allocation latency, but we haven't seen such cases in practice yet.

[1] https://www.kernel.org/doc/html/latest/core-api/dma-attributes.html?highlight=dma_attr_alloc_single_pages#dma-attr-alloc-single-pages

Best regards,
Tomasz
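
The allocation-latency trade-off described above can be made concrete with a
toy chunk counter. This is a simplified userspace sketch, not the iommu-dma
implementation: toy_count_chunks() and TOY_MAX_ORDER are hypothetical, and
the model assumes every higher-order allocation succeeds, which a real page
allocator does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ORDER 4 /* hypothetical cap: chunks of up to 16 pages */

/*
 * Returns how many separate chunks are needed to back `count` pages.
 * With single_pages set (modelling DMA_ATTR_ALLOC_SINGLE_PAGES) every
 * chunk is order 0; otherwise the largest order that still fits is used.
 */
static size_t toy_count_chunks(size_t count, bool single_pages)
{
	size_t chunks = 0;

	while (count > 0) {
		unsigned int order = single_pages ? 0 : TOY_MAX_ORDER;

		/* Fall back to smaller orders until the chunk fits. */
		while (((size_t)1 << order) > count)
			order--;

		count -= (size_t)1 << order;
		chunks++;
	}
	return chunks;
}
```

In this model a 37-page buffer is backed by four chunks (16 + 16 + 4 + 1
pages) when higher orders are allowed, and by 37 order-0 chunks when single
pages are requested. The former is friendlier to IOMMUs with huge-page
support; the latter avoids potentially expensive order > 0 allocations.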


Re: [PATCH v5 0/9] move Rockchip ISP bindings out of staging / add ISP DT nodes for RK3399

2020-09-26 Thread Tomasz Figa
Hi Helen,

On Wed, Jul 22, 2020 at 12:55:24PM -0300, Helen Koike wrote:
> Move the bindings out of drivers/staging and place them in
> Documentation/devicetree/bindings instead.
> 
> Also, add DT nodes for RK3399 and verify with make ARCH=arm64 dtbs_check
> and make ARCH=arm64 dt_binding_check.
> 
> Tested by verifying images streamed from Scarlet Chromebook
> 
> Changes in v5:
> - Drop unit addresses in dt-bindings example for simplification and fix
> errors as suggested by Rob Herring in previous version
> - Fix typos
> - Re-write clock organization with if/then schema
>

Besides one comment to patch 8/9,

Reviewed-by: Tomasz Figa 

Best regards,
Tomasz


Re: [PATCH v5 8/9] arm64: dts: rockchip: add isp0 node for rk3399

2020-09-26 Thread Tomasz Figa
Hi Helen,

On Wed, Jul 22, 2020 at 12:55:32PM -0300, Helen Koike wrote:
> From: Shunqian Zheng 
> 
> RK3399 has two ISPs, but only isp0 was tested.
> Add isp0 node in rk3399 dtsi
> 
> Verified with:
> make ARCH=arm64 dtbs_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/media/rockchip-isp1.yaml
> 
> Signed-off-by: Shunqian Zheng 
> Signed-off-by: Jacob Chen 
> Signed-off-by: Helen Koike 
> 
> ---
> 
> V4:
> - update clock names
> 
> V3:
> - clean up clocks
> 
> V2:
> - re-order power-domains property
> 
> V1:
> This patch was originally part of this patchset:
> 
> https://patchwork.kernel.org/patch/10267431/
> 
> The only difference is:
> - add phy properties
> - add ports
> ---
>  arch/arm64/boot/dts/rockchip/rk3399.dtsi | 25 
>  1 file changed, 25 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> index dba9641947a3a..ed8ba75dbbce8 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> @@ -1721,6 +1721,31 @@ vopb_mmu: iommu@ff903f00 {
>   status = "disabled";
>   };
>  
> + isp0: isp0@ff91 {
> + compatible = "rockchip,rk3399-cif-isp";
> + reg = <0x0 0xff91 0x0 0x4000>;
> + interrupts = ;
> + clocks = < SCLK_ISP0>,
> +  < ACLK_ISP0_WRAPPER>,
> +  < HCLK_ISP0_WRAPPER>;
> + clock-names = "isp", "aclk", "hclk";
> + iommus = <_mmu>;
> + phys = <_dphy_rx0>;
> + phy-names = "dphy";
> + power-domains = < RK3399_PD_ISP0>;

Should this have status = "disabled" too? The mipi_dphy_rx0 node is
disabled by default too, so in the default configuration the driver
would always fail to probe.

Best regards,
Tomasz


Re: [PATCH v8 0/6] Support running driver's probe for a device powered off

2020-09-26 Thread Tomasz Figa
Hi Sakari,

On Thu, Sep 03, 2020 at 11:15:44AM +0300, Sakari Ailus wrote:
> 
> Hi all,
> 
> These patches enable calling (and finishing) a driver's probe function
> without powering on the respective device on busses where the practice is
> to power on the device for probe. While it generally is a driver's job to
> check that the device is there, there are cases where it might be
> undesirable. (In this case it stems from a combination of hardware design
> and user expectations; see below.) The downside with this change is that
> if there is something wrong with the device, it will only be found at the
> time the device is used. In this case (the camera sensors + EEPROM in a
> sensor) I don't see any tangible harm from that though.
> 
> An indication both from the driver and the firmware is required to allow
> the device's power state to remain off during probe (see the first patch).
> 
> 
> The use case is such that there is a privacy LED next to an integrated
> user-facing laptop camera, and this LED is there to signal the user that
> the camera is recording a video or capturing images. That LED also happens
> to be wired to one of the power supplies of the camera, so whenever you
> power on the camera, the LED will be lit, whether images are captured from
> the camera --- or not. There's no way to implement this differently
> without additional software control (allowing of which is itself a
> hardware design decision) on most CSI-2-connected camera sensors as they
> simply have no pin to signal the camera streaming state.
> 
> This is also what happens during driver probe: the camera will be powered
> on by the I²C subsystem calling dev_pm_domain_attach() and the device is
> already powered on when the driver's own probe function is called. To the
> user this is visible during the boot process as a blink of the privacy LED,
> suggesting that the camera is recording without the user having used an
> application to do that. From the end user's point of view the behaviour is
> not expected and for someone unfamiliar with internal workings of a
> computer surely seems quite suspicious --- even if images are not being
> actually captured.
> 
> I've tested these on linux-next master. They also apply to Wolfram's
> i2c/for-next branch, there's a patch that affects the I²C core changes
> here (see below). The patches apart from that apply to Bartosz's
> at24/for-next as well as Mauro's linux-media master branch.

Besides the suggestion to make the definitions added less specific to i2c
(but still keeping the implementation so for now), feel free to add:

Reviewed-by: Tomasz Figa 

Best regards,
Tomasz


Re: [PATCH v8 0/6] Support running driver's probe for a device powered off

2020-09-26 Thread Tomasz Figa
On Mon, Sep 14, 2020 at 12:47:27PM +0300, Sakari Ailus wrote:
> Hi Luca,
> 
> On Mon, Sep 14, 2020 at 09:58:24AM +0200, Luca Ceresoli wrote:
> > Hi Sakari,
> > 
> > On 11/09/20 15:01, Sakari Ailus wrote:
> > > Hi Luca,
> > > 
> > > On Fri, Sep 11, 2020 at 02:49:26PM +0200, Luca Ceresoli wrote:
> > >> Hi Sakari,
> > >>
> > >> On 03/09/20 10:15, Sakari Ailus wrote:
> > >>>
> > >>> Hi all,
> > >>>
> > >>> These patches enable calling (and finishing) a driver's probe function
> > >>> without powering on the respective device on busses where the practice is
> > >>> to power on the device for probe. While it generally is a driver's job to
> > >>> check that the device is there, there are cases where it might be
> > >>> undesirable. (In this case it stems from a combination of hardware design
> > >>> and user expectations; see below.) The downside with this change is that
> > >>> if there is something wrong with the device, it will only be found at the
> > >>> time the device is used. In this case (the camera sensors + EEPROM in a
> > >>> sensor) I don't see any tangible harm from that though.
> > >>>
> > >>> An indication both from the driver and the firmware is required to allow
> > >>> the device's power state to remain off during probe (see the first patch).
> > >>>
> > >>>
> > >>> The use case is such that there is a privacy LED next to an integrated
> > >>> user-facing laptop camera, and this LED is there to signal the user that
> > >>> the camera is recording a video or capturing images. That LED also happens
> > >>> to be wired to one of the power supplies of the camera, so whenever you
> > >>> power on the camera, the LED will be lit, whether images are captured from
> > >>> the camera --- or not. There's no way to implement this differently
> > >>> without additional software control (allowing of which is itself a
> > >>> hardware design decision) on most CSI-2-connected camera sensors as they
> > >>> simply have no pin to signal the camera streaming state.
> > >>>
> > >>> This is also what happens during driver probe: the camera will be powered
> > >>> on by the I²C subsystem calling dev_pm_domain_attach() and the device is
> > >>> already powered on when the driver's own probe function is called. To the
> > >>> user this is visible during the boot process as a blink of the privacy LED,
> > >>> suggesting that the camera is recording without the user having used an
> > >>> application to do that. From the end user's point of view the behaviour is
> > >>> not expected and for someone unfamiliar with internal workings of a
> > >>> computer surely seems quite suspicious --- even if images are not being
> > >>> actually captured.
> > >>>
> > >>> I've tested these on linux-next master. They also apply to Wolfram's
> > >>> i2c/for-next branch, there's a patch that affects the I²C core changes
> > >>> here (see below). The patches apart from that apply to Bartosz's
> > >>> at24/for-next as well as Mauro's linux-media master branch.
> > >>
> > >> Apologies for having joined this discussion this late.
> > > 
> > > No worries. But thanks for the comments.
> > > 
> > >>
> > >> This patchset seems a good base to cover a different use case, where I
> > >> also cannot access the physical device at probe time.
> > >>
> > >> I'm going to try these patches, but in my case there are a few
> > >> differences that need a better understanding.
> > >>
> > >> First, I'm using device tree, not ACPI. In addition to adding OF support
> > >> similar to the work you've done for ACPI, I think instead of
> > >> acpi_dev_state_low_power() we should have a function that works for both
> > >> ACPI and DT.
> > > 
> > > acpi_dev_state_low_power() is really ACPI specific: it does tell the ACPI
> > > power state of the device during probe or remove. It is not needed on DT
> > > since the power state of the device is controlled directly by the driver.
> > > On I²C ACPI devices, it's the framework that powers them on for probe.
> > 
> > I see, thanks for clarifying. I'm not used to ACPI so I didn't get that.
> > 
> > > You could have a helper function on DT to tell a driver what to do in
> > > probe, but the functionality in that case is unrelated.
> > 
> > So in case of DT we might think of a function that just tells whether
> > the device is marked to allow low-power probe, but it's just an info
> > from DT:
> > 
> > > int mydriver_probe(struct i2c_client *client)
> > > {
> > > 	...
> > > 	low_power = of_dev_state_low_power(&client->dev);
> > > 	if (!low_power) {
> > > 		mydriver_initialize(); /* power+clocks, write regs */
> > > 	}
> > > 	...
> > > }
> > 
> > ...and, if (low_power), call mydriver_initialize() at first usage.
> > 
> > I'm wondering whether this might make sense in mainline.
> 
> Quite possibly, if there are drivers that would need it.
> 
> The function should probably be called differently though as what it does
> is quite different 

Re: [PATCH 2/2] venus: venc: fix handlig of S_SELECTION and G_SELECTION

2020-09-25 Thread Tomasz Figa
Hi Dikshita, Stanimir,

On Thu, Sep 24, 2020 at 7:31 PM Dikshita Agarwal wrote:
>
> From: Stanimir Varbanov 
>
> - return correct width and height for G_SELECTION
> - if requested rectangle wxh doesn't match with capture port wxh
>   adjust the rectangle to supported wxh.
>
> Signed-off-by: Dikshita Agarwal 
> ---
>  drivers/media/platform/qcom/venus/venc.c | 20 
>  1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
> index 7d2aaa8..a2cc12d 100644
> --- a/drivers/media/platform/qcom/venus/venc.c
> +++ b/drivers/media/platform/qcom/venus/venc.c
> @@ -463,13 +463,13 @@ static int venc_g_fmt(struct file *file, void *fh, struct v4l2_format *f)
> switch (s->target) {
> case V4L2_SEL_TGT_CROP_DEFAULT:
> case V4L2_SEL_TGT_CROP_BOUNDS:
> -   s->r.width = inst->width;
> -   s->r.height = inst->height;
> -   break;
> -   case V4L2_SEL_TGT_CROP:
> s->r.width = inst->out_width;
> s->r.height = inst->out_height;
> break;
> +   case V4L2_SEL_TGT_CROP:
> +   s->r.width = inst->width;
> +   s->r.height = inst->height;
> +   break;
> default:
> return -EINVAL;
> }
> @@ -490,10 +490,14 @@ static int venc_g_fmt(struct file *file, void *fh, struct v4l2_format *f)
>
> switch (s->target) {
> case V4L2_SEL_TGT_CROP:
> -   if (s->r.width != inst->out_width ||
> -   s->r.height != inst->out_height ||
> -   s->r.top != 0 || s->r.left != 0)
> -   return -EINVAL;
> +   if (s->r.width != inst->width ||
> +   s->r.height != inst->height ||
> +   s->r.top != 0 || s->r.left != 0) {
> +   s->r.top = 0;
> +   s->r.left = 0;
> +   s->r.width = inst->width;
> +   s->r.height = inst->height;

What's the point of exposing the selection API if no selection can
actually be done?

Best regards,
Tomasz
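
For comparison with the hunk quoted above, which resets every mismatching
request to the full frame: below is a minimal userspace sketch of a handler
that honours a crop request by clamping it into the bounds rectangle. This
is an assumption made for illustration, not code from the venus driver or
the V4L2 core. `struct rect` is a hypothetical stand-in for the kernel's
rectangle type and `crop_clamp_to_bounds()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a V4L2-style rectangle; not the kernel type. */
struct rect {
	int32_t left;
	int32_t top;
	uint32_t width;
	uint32_t height;
};

/*
 * Hypothetical helper: adjust a requested crop rectangle so that it lies
 * entirely inside the bounds rectangle. The size is limited first, then
 * the position is moved so that all four edges stay within bounds.
 */
static void crop_clamp_to_bounds(struct rect *r, const struct rect *bounds)
{
	int64_t max_left, max_top;

	assert(bounds->width > 0 && bounds->height > 0);

	/* A zero-sized request is bumped to the smallest non-empty size. */
	if (r->width == 0)
		r->width = 1;
	if (r->height == 0)
		r->height = 1;

	/* The crop cannot be larger than the bounds. */
	if (r->width > bounds->width)
		r->width = bounds->width;
	if (r->height > bounds->height)
		r->height = bounds->height;

	/* Keep the left and top edges inside the bounds. */
	if (r->left < bounds->left)
		r->left = bounds->left;
	if (r->top < bounds->top)
		r->top = bounds->top;

	/* Pull the rectangle back if its right or bottom edge overshoots. */
	max_left = (int64_t)bounds->left + bounds->width - r->width;
	max_top = (int64_t)bounds->top + bounds->height - r->height;
	if (r->left > max_left)
		r->left = (int32_t)max_left;
	if (r->top > max_top)
		r->top = (int32_t)max_top;
}
```

With bounds of 1920x1080 at (0,0), a 640x480 request at (1800,1000) would
come back as 640x480 at (1280,600) instead of being reset to the full frame.
Whether the venus encoder hardware can crop at all is not stated in this
thread, so the sketch only shows what an adjusting handler could look like.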

