Re: [PATCH v2] media: s5p-mfc: Add support for V4L2_MEMORY_DMABUF type

2017-11-03 Thread Tobias Jakobi
Tobias Jakobi wrote:
> Nicolas Dufresne wrote:
>> Le vendredi 03 novembre 2017 à 09:11 +0100, Marek Szyprowski a écrit :
>>> MFC driver supports only MMAP operation mode mainly due to the hardware
>>> restrictions of the addresses of the DMA buffers (MFC v5 hardware can
>>> access buffers only in 128MiB memory region starting from the base address
>>> of its firmware). When IOMMU is available, this requirement is easily
>>> fulfilled even for the buffers located anywhere in the memory - typically
>>> by mapping them in the DMA address space as close as possible to the
>>> firmware. Later hardware revisions don't have this limitations at all.
>>>
>>> The second limitation of the MFC hardware related to the memory buffers
>>> is constant buffer address. Once the hardware has been initialized for
>>> operation on given buffer set, the addresses of the buffers cannot be
>>> changed.
>>>
>>> With the above assumptions, a limited support for USERPTR and DMABUF
>>> operation modes can be added. The main requirement is to have all buffers
>>> known when starting hardware. This has been achieved by postponing
>>> hardware initialization once all the DMABUF or USERPTR buffers have been
>>> queued for the first time. Once then, buffers cannot be modified to point
>>> to other memory area.
>>
>> I am concerned about enabling this support with existing userspace.
>> Userspace application will be left with some driver with this
>> limitation and other drivers without. I think it is harmful to enable
>> that feature without providing userspace the ability to know.
> I voiced similar concern in my previous mail. I have to admit that I know very
> little about V4L2, but when I investigated this some time ago I saw that
> previously used/queued DMABUF buffers can be removed again, and, under certain
> conditions are 'reacquired', which issues a buf_cleanup() -- which does 
> nothing
> atm. My guess is that, in the worst case, one could trigger an IOMMU pagefault
> (or simply memory corruption, if not under IOMMU) this way.
> 
> Like Marek says, the hw needs to know DMA adresses of all available buffers on
> hw init. Hence a proper implementation would need to trigger some
> re-initialization once these adresses changes (DMABUFs 
> removed/reacquired/etc.).

Forgot to mention something here. Back when I looked at the code, I saw that the
MFC state machine has a state when the resolution of the video changes. IIRC
then this also triggers a reconfiguration of the hw buffers. I thought that
maybe one could re-use the states in the event of changed buffers. I didn't get
very far there, mainly because documentation of the whole state machine and the
interaction with the hw core is very sparse...

- Tobias




> These extra constraints which are introduced here, IMO they violate the API 
> then.
> 
> 
> 
>> This is specially conflicting with let's say UVC driver providing
>> buffers, since UVC driver implementing CREATE_BUFS ioctl. So even if
>> userspace start making an effort to maintain the same DMABuf for the
>> same buffer index, if a new buffer is created, we won't be able to use
>> it.
>>
>>>
>>> This patch also removes unconditional USERPTR operation mode from encoder
>>> video node, because it doesn't work with v5 MFC hardware without IOMMU
>>> being enabled.
>>>
>>> In case of MFC v5 a bidirectional queue flag has to be enabled as a
>>> workaround of the strange hardware behavior - MFC performs a few writes
>>> to source data during the operation.
>>
>> Do you have more information about this ? It is quite terrible, since
>> if you enable buffer importation, the buffer might be multi-plex across
>> multiple encoder instance. That is another way this feature can be
>> harmful to existing userspace.
> IIRC the hw stores some "housekeeping" data in the input buffers used during
> decoding. So input buffers are not strictly "input", but also output, in the
> following sense. If the buffer has total size N, and the data stored inside 
> has
> size n (with n < N), then the hw uses the remaining N-n bytes at the end of 
> the
> buffer for this housekeeping data. I'm not sure what happens if n equals N, or
> if that's even possible.
> 
> With best wishes,
> Tobias
> 
> 
>>
>>>
>>> Signed-off-by: Seung-Woo Kim <sw0312@samsung.com>
>>> [mszyprow: adapted to v4.14 code base, rewrote and extended commit message,
>>>  added checks for changing buffer addresses, added bidirectional queue
>>>  flags and comments]
>&

Re: [PATCH v2] media: s5p-mfc: Add support for V4L2_MEMORY_DMABUF type

2017-11-03 Thread Tobias Jakobi
Nicolas Dufresne wrote:
> Le vendredi 03 novembre 2017 à 09:11 +0100, Marek Szyprowski a écrit :
>> MFC driver supports only MMAP operation mode mainly due to the hardware
>> restrictions of the addresses of the DMA buffers (MFC v5 hardware can
>> access buffers only in 128MiB memory region starting from the base address
>> of its firmware). When IOMMU is available, this requirement is easily
>> fulfilled even for the buffers located anywhere in the memory - typically
>> by mapping them in the DMA address space as close as possible to the
>> firmware. Later hardware revisions don't have this limitations at all.
>>
>> The second limitation of the MFC hardware related to the memory buffers
>> is constant buffer address. Once the hardware has been initialized for
>> operation on given buffer set, the addresses of the buffers cannot be
>> changed.
>>
>> With the above assumptions, a limited support for USERPTR and DMABUF
>> operation modes can be added. The main requirement is to have all buffers
>> known when starting hardware. This has been achieved by postponing
>> hardware initialization once all the DMABUF or USERPTR buffers have been
>> queued for the first time. Once then, buffers cannot be modified to point
>> to other memory area.
> 
> I am concerned about enabling this support with existing userspace.
> Userspace application will be left with some driver with this
> limitation and other drivers without. I think it is harmful to enable
> that feature without providing userspace the ability to know.
I voiced similar concern in my previous mail. I have to admit that I know very
little about V4L2, but when I investigated this some time ago I saw that
previously used/queued DMABUF buffers can be removed again, and, under certain
conditions are 'reacquired', which issues a buf_cleanup() -- which does nothing
atm. My guess is that, in the worst case, one could trigger an IOMMU pagefault
(or simply memory corruption, if not under IOMMU) this way.

Like Marek says, the hw needs to know DMA adresses of all available buffers on
hw init. Hence a proper implementation would need to trigger some
re-initialization once these adresses changes (DMABUFs removed/reacquired/etc.).
These extra constraints which are introduced here, IMO they violate the API 
then.



> This is specially conflicting with let's say UVC driver providing
> buffers, since UVC driver implementing CREATE_BUFS ioctl. So even if
> userspace start making an effort to maintain the same DMABuf for the
> same buffer index, if a new buffer is created, we won't be able to use
> it.
> 
>>
>> This patch also removes unconditional USERPTR operation mode from encoder
>> video node, because it doesn't work with v5 MFC hardware without IOMMU
>> being enabled.
>>
>> In case of MFC v5 a bidirectional queue flag has to be enabled as a
>> workaround of the strange hardware behavior - MFC performs a few writes
>> to source data during the operation.
> 
> Do you have more information about this ? It is quite terrible, since
> if you enable buffer importation, the buffer might be multi-plex across
> multiple encoder instance. That is another way this feature can be
> harmful to existing userspace.
IIRC the hw stores some "housekeeping" data in the input buffers used during
decoding. So input buffers are not strictly "input", but also output, in the
following sense. If the buffer has total size N, and the data stored inside has
size n (with n < N), then the hw uses the remaining N-n bytes at the end of the
buffer for this housekeeping data. I'm not sure what happens if n equals N, or
if that's even possible.

With best wishes,
Tobias


> 
>>
>> Signed-off-by: Seung-Woo Kim 
>> [mszyprow: adapted to v4.14 code base, rewrote and extended commit message,
>>  added checks for changing buffer addresses, added bidirectional queue
>>  flags and comments]
>> Signed-off-by: Marek Szyprowski 
>> ---
>> v2:
>> - fixed copy/paste bug, which broke encoding support (thanks to Marian
>>   Mihailescu for reporting it)
>> - added checks for changing buffers DMA addresses
>> - added bidirectional queue flags
>>
>> v1:
>> - inital version
>> ---
>>  drivers/media/platform/s5p-mfc/s5p_mfc.c |  23 +-
>>  drivers/media/platform/s5p-mfc/s5p_mfc_dec.c | 111 
>> +++
>>  drivers/media/platform/s5p-mfc/s5p_mfc_enc.c |  64 +++
>>  3 files changed, 147 insertions(+), 51 deletions(-)
>>
>> diff --git a/drivers/media/platform/s5p-mfc/s5p_mfc.c 
>> b/drivers/media/platform/s5p-mfc/s5p_mfc.c
>> index 1839a86cc2a5..f1ab8d198158 100644
>> --- a/drivers/media/platform/s5p-mfc/s5p_mfc.c
>> +++ b/drivers/media/platform/s5p-mfc/s5p_mfc.c
>> @@ -754,6 +754,7 @@ static int s5p_mfc_open(struct file *file)
>>  struct s5p_mfc_dev *dev = video_drvdata(file);
>>  struct s5p_mfc_ctx *ctx = NULL;
>>  struct vb2_queue *q;
>> +unsigned int io_modes;
>>  int ret = 0;
>>  
>>  mfc_debug_enter();
>> @@ 

Re: [PATCH] media: s5p-mfc: Add support for V4L2_MEMORY_DMABUF type

2017-10-20 Thread Tobias Jakobi
Hey Marek and others,

just wanted to point out that I've also played around with Seung-Woo' patch for
a while. However this patch alone is very much incomplete.

In particular this is missing:
- At least v5 MFC hw needs source buffers to be also writable, so dma mapping
needs to be setup bidirectional.
- Like with mmap, all buffers need to be setup before decoding can begin. This
is due to how MFC hw gets initialized. dmabufs that are added later are not
going to be used before MFC hw isn't reinitialized.
- Removing dmabufs, or replacing them seems to be impossible with the current
code architecture.

I had extended samsung-utils with some C++ app to test stuff when I was looking
into this. You can find the code here:
https://github.com/tobiasjakobi/samsung-utils/tree/devel/v4l2-mfc-drm-direct

Now here is what happens. I allocate N buffer objects in DRM land to be used as
destination for the MFC decoder. The BOs are exported, so that I can then use
them in V4L2 space. I have to queue n (with n < N) buffers before I can start
the MFC engine.

If I do start the engine at that point (n buffers queued), I soon get an IOMMU
pagefault. I need to queue all N buffers before anything works at all. Queueing
a buffer the first time also registers it, and this has to happen before the MFC
hw is initialized.

In particular I can't just allocate more buffers from DRM and use them here
_after_ decoding has started.

To me it looks like the MFC code was never written with dmabuf in mind. It's
centered around a static memory setup that is fixed before decoding begins.

Anyway, just my 2 cents :)

- Tobias


Marek Szyprowski wrote:
> From: Seung-Woo Kim 
> 
> There is memory constraint for the buffers in V5 of the MFC hardware, but
> when IOMMU is used, then this constraint is meaningless. Other version of
> the MFC hardware don't have such limitations. So in such cases the driver
> is able to use buffers placed anywhere in the system memory, thus USERPTR
> and DMABUF operation modes can be also enabled.
> 
> This patch also removes USERPTR operation mode from encoder node, as it
> doesn't work with v5 MFC hardware without IOMMU being enabled.
> 
> Signed-off-by: Seung-Woo Kim 
> [mszyprow: adapted to v4.14 code base and updated commit message]
> Signed-off-by: Marek Szyprowski 
> ---
>  drivers/media/platform/s5p-mfc/s5p_mfc.c | 14 --
>  drivers/media/platform/s5p-mfc/s5p_mfc_dec.c | 73 
> 
>  drivers/media/platform/s5p-mfc/s5p_mfc_enc.c | 24 ++---
>  3 files changed, 89 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/media/platform/s5p-mfc/s5p_mfc.c 
> b/drivers/media/platform/s5p-mfc/s5p_mfc.c
> index cf68aed59e0d..f975523dc040 100644
> --- a/drivers/media/platform/s5p-mfc/s5p_mfc.c
> +++ b/drivers/media/platform/s5p-mfc/s5p_mfc.c
> @@ -754,6 +754,7 @@ static int s5p_mfc_open(struct file *file)
>   struct s5p_mfc_dev *dev = video_drvdata(file);
>   struct s5p_mfc_ctx *ctx = NULL;
>   struct vb2_queue *q;
> + unsigned int io_modes;
>   int ret = 0;
>  
>   mfc_debug_enter();
> @@ -839,16 +840,21 @@ static int s5p_mfc_open(struct file *file)
>   if (ret)
>   goto err_init_hw;
>   }
> +
> + io_modes = VB2_MMAP;
> + if (exynos_is_iommu_available(>plat_dev->dev) || !IS_TWOPORT(dev))
> + io_modes |= VB2_USERPTR | VB2_DMABUF;
> +
>   /* Init videobuf2 queue for CAPTURE */
>   q = >vq_dst;
>   q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
>   q->drv_priv = >fh;
>   q->lock = >mfc_mutex;
>   if (vdev == dev->vfd_dec) {
> - q->io_modes = VB2_MMAP;
> + q->io_modes = io_modes;
>   q->ops = get_dec_queue_ops();
>   } else if (vdev == dev->vfd_enc) {
> - q->io_modes = VB2_MMAP | VB2_USERPTR;
> + q->io_modes = io_modes;
>   q->ops = get_enc_queue_ops();
>   } else {
>   ret = -ENOENT;
> @@ -872,10 +878,10 @@ static int s5p_mfc_open(struct file *file)
>   q->drv_priv = >fh;
>   q->lock = >mfc_mutex;
>   if (vdev == dev->vfd_dec) {
> - q->io_modes = VB2_MMAP;
> + q->io_modes = io_modes;
>   q->ops = get_dec_queue_ops();
>   } else if (vdev == dev->vfd_enc) {
> - q->io_modes = VB2_MMAP | VB2_USERPTR;
> + q->io_modes = io_modes;
>   q->ops = get_enc_queue_ops();
>   } else {
>   ret = -ENOENT;
> diff --git a/drivers/media/platform/s5p-mfc/s5p_mfc_dec.c 
> b/drivers/media/platform/s5p-mfc/s5p_mfc_dec.c
> index 8937b0af7cb3..efe65fce4880 100644
> --- a/drivers/media/platform/s5p-mfc/s5p_mfc_dec.c
> +++ b/drivers/media/platform/s5p-mfc/s5p_mfc_dec.c
> @@ -546,14 +546,27 @@ static int reqbufs_capture(struct s5p_mfc_dev *dev, 
> struct s5p_mfc_ctx *ctx,
>   goto out;
>   }
>  
> - 

Re: [RFC 0/4] Exynos DRM: add Picture Processor extension

2017-04-26 Thread Tobias Jakobi
Hey,

Nicolas Dufresne wrote:
> Le mercredi 26 avril 2017 à 18:52 +0200, Tobias Jakobi a écrit :
>> Hello again,
>>
>>
>> Nicolas Dufresne wrote:
>>> Le mercredi 26 avril 2017 à 01:21 +0300, Sakari Ailus a écrit :
>>>> Hi Marek,
>>>>
>>>> On Thu, Apr 20, 2017 at 01:23:09PM +0200, Marek Szyprowski wrote:
>>>>> Hi Laurent,
>>>>>
>>>>> On 2017-04-20 12:25, Laurent Pinchart wrote:
>>>>>> Hi Marek,
>>>>>>
>>>>>> (CC'ing Sakari Ailus)
>>>>>>
>>>>>> Thank you for the patches.
>>>>>>
>>>>>> On Thursday 20 Apr 2017 11:13:36 Marek Szyprowski wrote:
>>>>>>> Dear all,
>>>>>>>
>>>>>>> This is an updated proposal for extending EXYNOS DRM API with generic
>>>>>>> support for hardware modules, which can be used for processing image 
>>>>>>> data
>>>>>>> from the one memory buffer to another. Typical memory-to-memory 
>>>>>>> operations
>>>>>>> are: rotation, scaling, colour space conversion or mix of them. This is 
>>>>>>> a
>>>>>>> follow-up of my previous proposal "[RFC 0/2] New feature: Framebuffer
>>>>>>> processors", which has been rejected as "not really needed in the DRM
>>>>>>> core":
>>>>>>> http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg146286.html
>>>>>>>
>>>>>>> In this proposal I moved all the code to Exynos DRM driver, so now this
>>>>>>> will be specific only to Exynos DRM. I've also changed the name from
>>>>>>> framebuffer processor (fbproc) to picture processor (pp) to avoid 
>>>>>>> confusion
>>>>>>> with fbdev API.
>>>>>>>
>>>>>>> Here is a bit more information what picture processors are:
>>>>>>>
>>>>>>> Embedded SoCs are known to have a number of hardware blocks, which 
>>>>>>> perform
>>>>>>> such operations. They can be used in paralel to the main GPU module to
>>>>>>> offload CPU from processing grapics or video data. One of example use of
>>>>>>> such modules is implementing video overlay, which usually requires color
>>>>>>> space conversion from NV12 (or similar) to RGB32 color space and 
>>>>>>> scaling to
>>>>>>> target window size.
>>>>>>>
>>>>>>> The proposed API is heavily inspired by atomic KMS approach - it is also
>>>>>>> based on DRM objects and their properties. A new DRM object is 
>>>>>>> introduced:
>>>>>>> picture processor (called pp for convenience). Such objects have a set 
>>>>>>> of
>>>>>>> standard DRM properties, which describes the operation to be performed 
>>>>>>> by
>>>>>>> respective hardware module. In typical case those properties are a 
>>>>>>> source
>>>>>>> fb id and rectangle (x, y, width, height) and destination fb id and
>>>>>>> rectangle. Optionally a rotation property can be also specified if
>>>>>>> supported by the given hardware. To perform an operation on image data,
>>>>>>> userspace provides a set of properties and their values for given fbproc
>>>>>>> object in a similar way as object and properties are provided for
>>>>>>> performing atomic page flip / mode setting.
>>>>>>>
>>>>>>> The proposed API consists of the 3 new ioctls:
>>>>>>> - DRM_IOCTL_EXYNOS_PP_GET_RESOURCES: to enumerate all available picture
>>>>>>>   processors,
>>>>>>> - DRM_IOCTL_EXYNOS_PP_GET: to query capabilities of given picture
>>>>>>>   processor,
>>>>>>> - DRM_IOCTL_EXYNOS_PP_COMMIT: to perform operation described by given
>>>>>>>   property set.
>>>>>>>
>>>>>>> The proposed API is extensible. Drivers can attach their own, custom
>>>>>>> properties to add support for more advanced picture processing (for 
>>>>>>> example
>>>>>>> blending).
>>>>>>>
>>>>&

Re: [RFC 0/4] Exynos DRM: add Picture Processor extension

2017-04-26 Thread Tobias Jakobi
Hello again,


Nicolas Dufresne wrote:
> Le mercredi 26 avril 2017 à 01:21 +0300, Sakari Ailus a écrit :
>> Hi Marek,
>>
>> On Thu, Apr 20, 2017 at 01:23:09PM +0200, Marek Szyprowski wrote:
>>> Hi Laurent,
>>>
>>> On 2017-04-20 12:25, Laurent Pinchart wrote:
 Hi Marek,

 (CC'ing Sakari Ailus)

 Thank you for the patches.

 On Thursday 20 Apr 2017 11:13:36 Marek Szyprowski wrote:
> Dear all,
>
> This is an updated proposal for extending EXYNOS DRM API with generic
> support for hardware modules, which can be used for processing image data
> from the one memory buffer to another. Typical memory-to-memory operations
> are: rotation, scaling, colour space conversion or mix of them. This is a
> follow-up of my previous proposal "[RFC 0/2] New feature: Framebuffer
> processors", which has been rejected as "not really needed in the DRM
> core":
> http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg146286.html
>
> In this proposal I moved all the code to Exynos DRM driver, so now this
> will be specific only to Exynos DRM. I've also changed the name from
> framebuffer processor (fbproc) to picture processor (pp) to avoid 
> confusion
> with fbdev API.
>
> Here is a bit more information what picture processors are:
>
> Embedded SoCs are known to have a number of hardware blocks, which perform
> such operations. They can be used in paralel to the main GPU module to
> offload CPU from processing grapics or video data. One of example use of
> such modules is implementing video overlay, which usually requires color
> space conversion from NV12 (or similar) to RGB32 color space and scaling 
> to
> target window size.
>
> The proposed API is heavily inspired by atomic KMS approach - it is also
> based on DRM objects and their properties. A new DRM object is introduced:
> picture processor (called pp for convenience). Such objects have a set of
> standard DRM properties, which describes the operation to be performed by
> respective hardware module. In typical case those properties are a source
> fb id and rectangle (x, y, width, height) and destination fb id and
> rectangle. Optionally a rotation property can be also specified if
> supported by the given hardware. To perform an operation on image data,
> userspace provides a set of properties and their values for given fbproc
> object in a similar way as object and properties are provided for
> performing atomic page flip / mode setting.
>
> The proposed API consists of the 3 new ioctls:
> - DRM_IOCTL_EXYNOS_PP_GET_RESOURCES: to enumerate all available picture
>   processors,
> - DRM_IOCTL_EXYNOS_PP_GET: to query capabilities of given picture
>   processor,
> - DRM_IOCTL_EXYNOS_PP_COMMIT: to perform operation described by given
>   property set.
>
> The proposed API is extensible. Drivers can attach their own, custom
> properties to add support for more advanced picture processing (for 
> example
> blending).
>
> This proposal aims to replace Exynos DRM IPP (Image Post Processing)
> subsystem. IPP API is over-engineered in general, but not really 
> extensible
> on the other side. It is also buggy, with significant design flaws - the
> biggest issue is the fact that the API covers memory-2-memory picture
> operations together with CRTC writeback and duplicating features, which
> belongs to video plane. Comparing with IPP subsystem, the PP framework is
> smaller (1807 vs 778 lines) and allows driver simplification (Exynos
> rotator driver smaller by over 200 lines).
> 
> Just a side note, we have written code in GStreamer using the Exnynos 4
> FIMC IPP driver. I don't know how many, if any, deployment still exist
> (Exynos 4 is relatively old now), but there exist userspace for the
> FIMC driver.
I was searching for this code, but I didn't find anything. Are you sure
you really mean the FIMC IPP in Exynos DRM, and not just the FIMC driver
from the V4L2 subsystem?


With best wishes,
Tobias



> We use this for color transformation (from tiled to
> linear) and scaling. The FIMC driver is in fact quite stable in
> upstream kernel today. The GScaler V4L2 M2M driver on Exynos 5 is
> largely based on it and has received some maintenance to properly work
> in GStreamer. unlike this DRM API, you can reuse the same userspace
> code across multiple platforms (which we do already). We have also
> integrated this driver in Chromium in the past (not upstream though).
> 
> I am well aware that the blitter driver has not got much attention
> though. But again, V4L2 offers a generic interface to userspace
> application. Fixing this driver could enable some work like this one:
> 
> https://bugzilla.gnome.org/show_bug.cgi?id=772766
> 
> This work in progress feature is a generic hardware accelerated video
> 

Re: [RFC 0/4] Exynos DRM: add Picture Processor extension

2017-04-26 Thread Tobias Jakobi
Hello everyone,


Nicolas Dufresne wrote:
> Le mercredi 26 avril 2017 à 01:21 +0300, Sakari Ailus a écrit :
>> Hi Marek,
>>
>> On Thu, Apr 20, 2017 at 01:23:09PM +0200, Marek Szyprowski wrote:
>>> Hi Laurent,
>>>
>>> On 2017-04-20 12:25, Laurent Pinchart wrote:
 Hi Marek,

 (CC'ing Sakari Ailus)

 Thank you for the patches.

 On Thursday 20 Apr 2017 11:13:36 Marek Szyprowski wrote:
> Dear all,
>
> This is an updated proposal for extending EXYNOS DRM API with generic
> support for hardware modules, which can be used for processing image data
> from the one memory buffer to another. Typical memory-to-memory operations
> are: rotation, scaling, colour space conversion or mix of them. This is a
> follow-up of my previous proposal "[RFC 0/2] New feature: Framebuffer
> processors", which has been rejected as "not really needed in the DRM
> core":
> http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg146286.html
>
> In this proposal I moved all the code to Exynos DRM driver, so now this
> will be specific only to Exynos DRM. I've also changed the name from
> framebuffer processor (fbproc) to picture processor (pp) to avoid 
> confusion
> with fbdev API.
>
> Here is a bit more information what picture processors are:
>
> Embedded SoCs are known to have a number of hardware blocks, which perform
> such operations. They can be used in paralel to the main GPU module to
> offload CPU from processing grapics or video data. One of example use of
> such modules is implementing video overlay, which usually requires color
> space conversion from NV12 (or similar) to RGB32 color space and scaling 
> to
> target window size.
>
> The proposed API is heavily inspired by atomic KMS approach - it is also
> based on DRM objects and their properties. A new DRM object is introduced:
> picture processor (called pp for convenience). Such objects have a set of
> standard DRM properties, which describes the operation to be performed by
> respective hardware module. In typical case those properties are a source
> fb id and rectangle (x, y, width, height) and destination fb id and
> rectangle. Optionally a rotation property can be also specified if
> supported by the given hardware. To perform an operation on image data,
> userspace provides a set of properties and their values for given fbproc
> object in a similar way as object and properties are provided for
> performing atomic page flip / mode setting.
>
> The proposed API consists of the 3 new ioctls:
> - DRM_IOCTL_EXYNOS_PP_GET_RESOURCES: to enumerate all available picture
>   processors,
> - DRM_IOCTL_EXYNOS_PP_GET: to query capabilities of given picture
>   processor,
> - DRM_IOCTL_EXYNOS_PP_COMMIT: to perform operation described by given
>   property set.
>
> The proposed API is extensible. Drivers can attach their own, custom
> properties to add support for more advanced picture processing (for 
> example
> blending).
>
> This proposal aims to replace Exynos DRM IPP (Image Post Processing)
> subsystem. IPP API is over-engineered in general, but not really 
> extensible
> on the other side. It is also buggy, with significant design flaws - the
> biggest issue is the fact that the API covers memory-2-memory picture
> operations together with CRTC writeback and duplicating features, which
> belongs to video plane. Comparing with IPP subsystem, the PP framework is
> smaller (1807 vs 778 lines) and allows driver simplification (Exynos
> rotator driver smaller by over 200 lines).
> 
> Just a side note, we have written code in GStreamer using the Exnynos 4
> FIMC IPP driver. I don't know how many, if any, deployment still exist
> (Exynos 4 is relatively old now), but there exist userspace for the
> FIMC driver. We use this for color transformation (from tiled to
> linear) and scaling. The FIMC driver is in fact quite stable in
> upstream kernel today. The GScaler V4L2 M2M driver on Exynos 5 is
> largely based on it and has received some maintenance to properly work
> in GStreamer. unlike this DRM API, you can reuse the same userspace
> code across multiple platforms (which we do already). We have also
> integrated this driver in Chromium in the past (not upstream though).
> 
> I am well aware that the blitter driver has not got much attention
> though. But again, V4L2 offers a generic interface to userspace
> application. Fixing this driver could enable some work like this one:
> 
> https://bugzilla.gnome.org/show_bug.cgi?id=772766
> 
> This work in progress feature is a generic hardware accelerated video
> mixer. It has been tested with IMX.6 v4l2 m2m blitter driver (which I
> believe is in staging right now). Again, unlike the exynos/drm, this
> code could be reused between platforms.
> 
> In general, 

Re: [PATCH 1/2] drm/exynos: g2d: Add support for old S5Pv210 type

2016-05-25 Thread Tobias Jakobi
Hey Krzysztof,


Krzysztof Kozlowski wrote:
> On 05/24/2016 06:05 PM, Tobias Jakobi wrote:
>> Hello Krzysztof,
>>
>>
>> Krzysztof Kozlowski wrote:
>>> On 05/24/2016 03:49 PM, Tobias Jakobi wrote:
>>>> Hello Krzysztof,
>>>>
>>>> are you sure that these are the only differences. Because AFAIK there
>>>> are quite a few more:
>>>> - DMA submission of commands
>>>> - blend mode / rounding
>>>> - solid fill
>>>> - YCrCb support
>>>> - and probably more
>>>>
>>>> One would need to add least split the command list parser into a v3 and
>>>> v41 version to accomodate for the differences. In fact userspace/libdrm
>>>> would need to know which hw type it currently uses, but you currently
>>>> always return 4.1 in the corresponding ioctl.
>>>
>>> Eh, so probably my patch does not cover fully the support for v3 G2D. I
>>> looked mostly at the differences between v3 and v4 in the s5p-g2d driver
>>> itself. However you are right that this might be not sufficient because
>>> exynos-g2d moved forward and is different than s5p-g2d.
>>>
>>>> Krzysztof Kozlowski wrote:
>>>>> The non-DRM s5p-g2d driver supports two versions of G2D: v3.0 on
>>>>> S5Pv210 and v4.x on Exynos 4x12 (or newer). The driver for 3.0 device
>>>>> version is doing two things differently:
>>>>> 1. Before starting the render process, it invalidates caches (pattern,
>>>>>source buffer and mask buffer). Cache control is not present on v4.x
>>>>>device.
>>>>> 2. Scalling is done through StretchEn command (in BITBLT_COMMAND_REG
>>>>>register) instead of SRC_SCALE_CTRL_REG as in v4.x. However the
>>>>>exynos_drm_g2d driver does not implement the scalling so this
>>>>>difference can be eliminated.
>>>> Huh? Where did you get this from? Scaling works with the DRM driver.
>>>
>>> I was looking for the usage of scaling reg (as there is no scaling
>>> command). How the scaling is implemented then?
>> Like you said above the drivers work completly different. The DRM one
>> receives a command list that is constructed by userspace (libdrm
>> mostly), copies it to a contiguous buffer and passes the memory address
>> of that buffer to the engine which then works on it. Of course
>> everything is slightly more complex.
>>
>> You don't see any reference to scaling in the driver because the scaling
>> regs don't need any kind of specific validation.
>>
>> If you want to know how the command list is constructed, the best way is
>> to look into libdrm. The Exynos specific tests actually cover scaling.
> 
> Thanks for explanations. The patch is insufficient then and it requires
> much more effort. Please drop the series as of now.
If you intend to add support for v3 hardware we might want to join
forces. I have an (ongoing) rewrite of the driver to enable additional
functionality and to make it more suitable to provide EXA for a X11 DDX.
However this breaks the existing API and rewrites quite a bit of the driver

Here's what's there so far.
https://github.com/tobiasjakobi/linux-odroid-public/compare/e35ca9aca1214c5e104e6906c1d9affeb80fe5df...3d1ddb86db73b0d664f3e339709e8e8dacdc8e91

And here's the DDX:
https://github.com/tobiasjakobi/xf86-video-armsoc/commits/g2d


With best wishes,
Tobias



> 
> Best regards,
> Krzysztof
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] drm/exynos: g2d: Add support for old S5Pv210 type

2016-05-24 Thread Tobias Jakobi
Hello Krzysztof,


Krzysztof Kozlowski wrote:
> On 05/24/2016 03:49 PM, Tobias Jakobi wrote:
>> Hello Krzysztof,
>>
>> are you sure that these are the only differences. Because AFAIK there
>> are quite a few more:
>> - DMA submission of commands
>> - blend mode / rounding
>> - solid fill
>> - YCrCb support
>> - and probably more
>>
>> One would need to add least split the command list parser into a v3 and
>> v41 version to accomodate for the differences. In fact userspace/libdrm
>> would need to know which hw type it currently uses, but you currently
>> always return 4.1 in the corresponding ioctl.
> 
> Eh, so probably my patch does not cover fully the support for v3 G2D. I
> looked mostly at the differences between v3 and v4 in the s5p-g2d driver
> itself. However you are right that this might be not sufficient because
> exynos-g2d moved forward and is different than s5p-g2d.
> 
>> Krzysztof Kozlowski wrote:
>>> The non-DRM s5p-g2d driver supports two versions of G2D: v3.0 on
>>> S5Pv210 and v4.x on Exynos 4x12 (or newer). The driver for 3.0 device
>>> version is doing two things differently:
>>> 1. Before starting the render process, it invalidates caches (pattern,
>>>source buffer and mask buffer). Cache control is not present on v4.x
>>>device.
>>> 2. Scalling is done through StretchEn command (in BITBLT_COMMAND_REG
>>>register) instead of SRC_SCALE_CTRL_REG as in v4.x. However the
>>>exynos_drm_g2d driver does not implement the scalling so this
>>>difference can be eliminated.
>> Huh? Where did you get this from? Scaling works with the DRM driver.
> 
> I was looking for the usage of scaling reg (as there is no scaling
> command). How the scaling is implemented then?
Like you said above the drivers work completly different. The DRM one
receives a command list that is constructed by userspace (libdrm
mostly), copies it to a contiguous buffer and passes the memory address
of that buffer to the engine which then works on it. Of course
everything is slightly more complex.

You don't see any reference to scaling in the driver because the scaling
regs don't need any kind of specific validation.

If you want to know how the command list is constructed, the best way is
to look into libdrm. The Exynos specific tests actually cover scaling.


With best wishes,
Tobias


> Thanks for feedback!
> 
> Best regards,
> Krzysztof
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] drm/exynos: g2d: Add support for old S5Pv210 type

2016-05-24 Thread Tobias Jakobi
Hello Krzysztof,

are you sure that these are the only differences. Because AFAIK there
are quite a few more:
- DMA submission of commands
- blend mode / rounding
- solid fill
- YCrCb support
- and probably more

One would need to add least split the command list parser into a v3 and
v41 version to accomodate for the differences. In fact userspace/libdrm
would need to know which hw type it currently uses, but you currently
always return 4.1 in the corresponding ioctl.


Krzysztof Kozlowski wrote:
> The non-DRM s5p-g2d driver supports two versions of G2D: v3.0 on
> S5Pv210 and v4.x on Exynos 4x12 (or newer). The driver for 3.0 device
> version is doing two things differently:
> 1. Before starting the render process, it invalidates caches (pattern,
>source buffer and mask buffer). Cache control is not present on v4.x
>device.
> 2. Scalling is done through StretchEn command (in BITBLT_COMMAND_REG
>register) instead of SRC_SCALE_CTRL_REG as in v4.x. However the
>exynos_drm_g2d driver does not implement the scalling so this
>difference can be eliminated.
Huh? Where did you get this from? Scaling works with the DRM driver.


With best wishes,
Tobias


> After adding support for v3.0 to exynos_drm_g2d driver, the old driver
> can be removed.
> 
> Cc: Kyungmin Park 
> Cc: Kamil Debski 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 37 
> +++--
>  drivers/gpu/drm/exynos/exynos_drm_g2d.h |  7 +++
>  2 files changed, 42 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index 493552368295..44d8b28e9d98 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -38,6 +39,7 @@
>  #define G2D_SOFT_RESET   0x
>  #define G2D_INTEN0x0004
>  #define G2D_INTC_PEND0x000C
> +#define G2D_CACHECTL 0x0018 /* S5PV210 specific */
>  #define G2D_DMA_SFR_BASE_ADDR0x0080
>  #define G2D_DMA_COMMAND  0x0084
>  #define G2D_DMA_STATUS   0x008C
> @@ -78,6 +80,11 @@
>  #define G2D_INTP_GCMD_FIN(1 << 1)
>  #define G2D_INTP_SCMD_FIN(1 << 0)
>  
> +/* G2D_CACHECTL, S5PV210 specific */
> +#define G2D_CACHECTL_PATCACHE(BIT(2))
> +#define G2D_CACHECTL_SRCBUFFER   (BIT(1))
> +#define G2D_CACHECTL_MASKBUFFER  (BIT(0))
> +
>  /* G2D_DMA_COMMAND */
>  #define G2D_DMA_HALT (1 << 2)
>  #define G2D_DMA_CONTINUE (1 << 1)
> @@ -245,6 +252,7 @@ struct g2d_data {
>  
>   unsigned long   current_pool;
>   unsigned long   max_pool;
> + enum exynos_drm_g2d_typetype;
>  };
>  
>  static int g2d_init_cmdlist(struct g2d_data *g2d)
> @@ -1125,6 +1133,13 @@ int exynos_g2d_set_cmdlist_ioctl(struct drm_device 
> *drm_dev, void *data,
>   cmdlist->data[cmdlist->last++] = G2D_SRC_BASE_ADDR;
>   cmdlist->data[cmdlist->last++] = 0;
>  
> + if (g2d->type == EXYNOS_DRM_G2D_TYPE_3X) {
> + cmdlist->data[cmdlist->last++] = G2D_CACHECTL;
> + cmdlist->data[cmdlist->last++] = G2D_CACHECTL_PATCACHE |
> +  G2D_CACHECTL_SRCBUFFER |
> +  G2D_CACHECTL_MASKBUFFER;
> + }
> +
>   /*
>* 'LIST_HOLD' command should be set to the DMA_HOLD_CMD_REG
>* and GCF bit should be set to INTEN register if user wants
> @@ -1369,10 +1384,20 @@ static int g2d_probe(struct platform_device *pdev)
>   struct exynos_drm_subdrv *subdrv;
>   int ret;
>  
> + /* Sanity check, we can be instatiated only from DT */
> + if (!dev->of_node)
> + return -EINVAL;
> +
>   g2d = devm_kzalloc(dev, sizeof(*g2d), GFP_KERNEL);
>   if (!g2d)
>   return -ENOMEM;
>  
> + g2d->type = (enum exynos_drm_g2d_type)of_device_get_match_data(dev);
> + if (g2d->type == EXYNOS_DRM_G2D_TYPE_UNKNOWN) {
> + dev_err(dev, "failed to get type of device\n");
> + return -EINVAL;
> + }
> +
>   g2d->runqueue_slab = kmem_cache_create("g2d_runqueue_slab",
>   sizeof(struct g2d_runqueue_node), 0, 0, NULL);
>   if (!g2d->runqueue_slab)
> @@ -1535,8 +1560,16 @@ static const struct dev_pm_ops g2d_pm_ops = {
>  };
>  
>  static const struct of_device_id exynos_g2d_match[] = {
> - { .compatible = "samsung,exynos5250-g2d" },
> - { .compatible = "samsung,exynos4212-g2d" },
> + {
> + .compatible = "samsung,exynos5250-g2d",
> + .data = (void *)EXYNOS_DRM_G2D_TYPE_4X,
> + }, {
> +

Re: exynos4 / g2d

2014-02-10 Thread Tobias Jakobi
Hello!


Sachin Kamat wrote:
 +cc linux-media list and some related maintainers

 Hi,

 On 10 February 2014 00:22, Tobias Jakobi tjak...@math.uni-bielefeld.de 
 wrote:
 Hello!

 I noticed while here
 (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/arch/arm/boot/dts/exynos4x12.dtsi?id=3a0d48f6f81459c874165ffb14b310c0b5bb0c58)
 the necessary entry for the dts was made, on the drm driver side
 (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/exynos/exynos_drm_g2d.c)
 this was never added.

 Shouldn't samsung,exynos4212-g2d go into exynos_g2d_match as well?
 The DRM version of G2D driver does not support Exynos4 based G2D IP
 yet. The support for this IP
 is available only in the V4L2 version of the driver. Please see the file:
 drivers/media/platform/s5p-g2d/g2d.c

That doesn't make sense to me. From the initial commit message of the
DRM code:
The G2D is a 2D graphic accelerator that supports Bit Block Transfer.
This G2D driver is exynos drm specific and supports only G2D(version
4.1) of later Exynos series from Exynos4X12 because supporting DMA.
(https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/gpu/drm/exynos/exynos_drm_g2d.c?id=d7f1642c90ab5eb2d7c48af0581c993094f97e1a)

In fact, this doesn't even mention the Exynos5?!

Greets,
Tobias

--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: exynos4 / g2d

2014-02-10 Thread Tobias Jakobi
Sachin Kamat wrote:
 On 10 February 2014 14:28, Inki Dae inki@samsung.com wrote:
 2014-02-10 17:44 GMT+09:00 Sachin Kamat sachin.ka...@linaro.org:
 +cc Joonyoung Shim

 Hi,

 On 10 February 2014 13:58, Tobias Jakobi tjak...@math.uni-bielefeld.de 
 wrote:
 Hello!


 Sachin Kamat wrote:
 +cc linux-media list and some related maintainers

 Hi,

 On 10 February 2014 00:22, Tobias Jakobi tjak...@math.uni-bielefeld.de 
 wrote:
 Hello!

 I noticed while here
 (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/arch/arm/boot/dts/exynos4x12.dtsi?id=3a0d48f6f81459c874165ffb14b310c0b5bb0c58)
 the necessary entry for the dts was made, on the drm driver side
 (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/exynos/exynos_drm_g2d.c)
 this was never added.

 Shouldn't samsung,exynos4212-g2d go into exynos_g2d_match as well?
 The DRM version of G2D driver does not support Exynos4 based G2D IP
 yet. The support for this IP
 is available only in the V4L2 version of the driver. Please see the file:
 drivers/media/platform/s5p-g2d/g2d.c

 That doesn't make sense to me. From the initial commit message of the
 DRM code:
 The G2D is a 2D graphic accelerator that supports Bit Block Transfer.
 This G2D driver is exynos drm specific and supports only G2D(version
 4.1) of later Exynos series from Exynos4X12 because supporting DMA.
 (https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/gpu/drm/exynos/exynos_drm_g2d.c?id=d7f1642c90ab5eb2d7c48af0581c993094f97e1a)

 In fact, this doesn't even mention the Exynos5?!
 It does say later Exynos series from Exynos4X12 which technically
 includes Exynos5 and
 Right, supported.

 does not include previous Exynos series SoCs like 4210, etc.
 Anyway, I haven't tested this driver on Exynos4 based platforms and
 hence cannot confirm if it
 supports 4x12 in the current form. I leave it to the original author
 and Inki to comment about it.

 Just add samsung,exynos4212-g2d to exynos_g2d_match if you want to
 use g2d driver on exynos4212 SoC. We already tested this driver on
 Exynos4x12 SoC also. We didn't just post dt support patch for
 exynos4x12 series.
 If you prefer I could add that and send a patch. I wouldn't be able to
 test it though.

Thanks for the feedback!

The reason for bringing this up is the following. I'm trying to get the
G2D block working through the DRM interface on a Hardkernel ODROID-X2,
which is Exynos4412 based. Hardkernel provides a modified 3.8.y kernel,
which they somehow support. The problem is that the G2D doesn't work
there. I can provide more specific details on that, if wanted.

So my plan was to check if the G2D at least works with a more recent
kernel. I'm currently working on getting 3.13.y vanilla running on the
device and then want to test the G2D block there (e.g. with the test
tool that is included in the libdrm tree -- this is also the one that
fails on the odroid kernel).

Obviously this would be huge waste of time, if you guys would tell me
that the G2D isn't working (at least through the DRM) on the Exynos4412
platform anyway.

With best regards,
Tobias

--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html