Re: [Mesa-dev] [PATCH v2] st/va: ensure linear memory for dmabuf
Am 08.06.2016 um 15:53 schrieb Julien Isorce: To go back to "add a bind flag to struct pipe_video_buffer instead ", the alternative is to bring back the first version of the patch but according to the first review, it was duplication of bind flag between pipe_video_buffer and pipe_resource so it would require quite of big of refactoring unless I miss understood the comment. >> I obviously have missed that discussion, but yes that is exactly the way we should go. The duplication is unproblematic as far as I can see. The v1 patch and discussion is here: https://patchwork.freedesktop.org/patch/66382/ Thanks for pointing that out! That's more than 6 month ago and I barely remember what I had for lunch yesterday :). In that v1 should I just replace the specific "bool disable_tiling" by a generic bind flag ? Yeah, that's what I wanted to say with my comment back then. Regards, Christian. On 8 June 2016 at 14:07, Christian König> wrote: Am 08.06.2016 um 14:20 schrieb Julien Isorce: On 8 June 2016 at 09:16, Christian König > wrote: Am 02.06.2016 um 16:00 schrieb Julien Isorce: In order to do zero-copy between two different devices the memory should not be tiled. This is currently no way to set pipe_resource template's flag from pipe_video_buffer template. So disabled_tiling is added. Choosed "disable" prefix so that CALLOC keeps tiling enabled by default. Tested with GStreamer on a laptop that has 2 GPUs: 1- gstvaapidecode: HW decoding and dmabuf export with nouveau driver on Nvidia GPU. 2- glimagesink: EGLImage imports dmabuf on Intel GPU. Note that tiling is working if 1 and 2 are done on the same GPU. So it is up to the application to set or not the flag: VA_SURFACE_EXTBUF_DESC_ENABLE_TILING Signed-off-by: Julien Isorce > Thx for the review. NAK, it won't be possible to use the resulting video buffer with hardware decoding on AMD hardware. But I restrict to these format: +case VA_FOURCC_RGBA: +case VA_FOURCC_RGBX: +case VA_FOURCC_BGRA: +case VA_FOURCC_BGRX: So if the vaapi user request a linear layout, it will fail if not one of these formats. So basically for now it requires vpp. Yeah, ok that should work for now but is clearly not a good idea. Please add a bind flag to struct pipe_video_buffer instead so that we can specify if linear layout is requested or not. Do you mean that resource = pscreen->resource_create(pscreen, ) does not honor the bind flag of the template. Maybe I can just checked if it was effective after that call, i.e. checking presence of PIPE_BIND_LINEAR in resources[0]->bind. No, for resource_create() the flag should be honored. But we shouldn't be using resource_create() to create the different planes of video buffers if possible. We should use create_video_buffer() on the pipe context if possible. This way the hardware driver can still reject the request if this would result in a surface which can't be decoded to. For now it requires vpp since I explicitly restricted linear layout request to the rgbs format above. The reason behind is that vaapi is limited to export 1 fd per surface. Problem is that for at least nouveau, it uses 1 pipe resource per plane, and NV12 has 2 planes. In the spec the problem comes from the fact that a VAImage has only one VABufferID. It would require to define a new VABufferType which represents an array of VAImageBufferType, something like that. Yeah, I know that is one of the many deficits VA-API unfortunately has. It should work with the Radeon implementation, but only because UVD requires both planes to be in the same buffer object. To go back to "add a bind flag to struct pipe_video_buffer instead ", the alternative is to bring back the first version of the patch but according to the first review, it was duplication of bind flag between pipe_video_buffer and pipe_resource so it would require quite of big of refactoring unless I miss understood the comment. I obviously have missed that discussion, but yes that is exactly the way we should go. The duplication is unproblematic as far as I can see. Also in vdpau, the function vlVdpOutputSurfaceCreate is using PIPE_BIND_LINEAR flag and resource_create call, like in the v2 of my patch. Yeah, but that isn't a video buffer you try to create here. VA-API unfortunately doesn't
Re: [Mesa-dev] [PATCH v2] st/va: ensure linear memory for dmabuf
To go back to "add a bind flag to struct pipe_video_buffer instead ", the alternative is to bring back the first version of the patch but according to the first review, it was duplication of bind flag between pipe_video_buffer and pipe_resource so it would require quite of big of refactoring unless I miss understood the comment. >> I obviously have missed that discussion, but yes that is exactly the way we should go. The duplication is unproblematic as far as I can see. The v1 patch and discussion is here: https://patchwork.freedesktop.org/patch/66382/ In that v1 should I just replace the specific "bool disable_tiling" by a generic bind flag ? On 8 June 2016 at 14:07, Christian Königwrote: > Am 08.06.2016 um 14:20 schrieb Julien Isorce: > > > > On 8 June 2016 at 09:16, Christian König wrote: > >> Am 02.06.2016 um 16:00 schrieb Julien Isorce: >> >>> In order to do zero-copy between two different devices >>> the memory should not be tiled. >>> >>> This is currently no way to set pipe_resource template's flag >>> from pipe_video_buffer template. So disabled_tiling is added. >>> >>> Choosed "disable" prefix so that CALLOC keeps tiling enabled >>> by default. >>> >>> Tested with GStreamer on a laptop that has 2 GPUs: >>> 1- gstvaapidecode: >>> HW decoding and dmabuf export with nouveau driver on Nvidia GPU. >>> 2- glimagesink: >>> EGLImage imports dmabuf on Intel GPU. >>> >>> Note that tiling is working if 1 and 2 are done on the same GPU. >>> So it is up to the application to set or not the flag: >>> VA_SURFACE_EXTBUF_DESC_ENABLE_TILING >>> >>> Signed-off-by: Julien Isorce < >>> j.iso...@samsung.com> >>> >> >> > Thx for the review. > > >> NAK, it won't be possible to use the resulting video buffer with hardware >> decoding on AMD hardware. >> > > But I restrict to these format: > > +case VA_FOURCC_RGBA: > +case VA_FOURCC_RGBX: > +case VA_FOURCC_BGRA: > +case VA_FOURCC_BGRX: > > So if the vaapi user request a linear layout, it will fail if not one of > these formats. So basically for now it requires vpp. > > > Yeah, ok that should work for now but is clearly not a good idea. > > > >> >> Please add a bind flag to struct pipe_video_buffer instead so that we can >> specify if linear layout is requested or not. >> > > Do you mean that resource = pscreen->resource_create(pscreen, ) does > not honor the bind flag of the template. > Maybe I can just checked if it was effective after that call, i.e. > checking presence of PIPE_BIND_LINEAR in > resources[0]->bind. > > > No, for resource_create() the flag should be honored. But we shouldn't be > using resource_create() to create the different planes of video buffers if > possible. We should use create_video_buffer() on the pipe context if > possible. > > > >> >> This way the hardware driver can still reject the request if this would >> result in a surface which can't be decoded to. >> > > For now it requires vpp since I explicitly restricted linear layout > request to the rgbs format above. The reason behind is > that vaapi is limited to export 1 fd per surface. Problem is that for at > least nouveau, it uses 1 pipe resource per plane, and > NV12 has 2 planes. > > In the spec the problem comes from the fact that a VAImage has only one > VABufferID. It would require to define > a new VABufferType which represents an array of VAImageBufferType, > something like that. > > > Yeah, I know that is one of the many deficits VA-API unfortunately has. It > should work with the Radeon implementation, but only because UVD requires > both planes to be in the same buffer object. > > > > To go back to "add a bind flag to struct pipe_video_buffer instead ", the > alternative is to bring back the first version > of the patch but according to the first review, it was duplication of bind > flag between pipe_video_buffer > and pipe_resource so it would require quite of big of refactoring unless I > miss understood the comment. > > > I obviously have missed that discussion, but yes that is exactly the way > we should go. The duplication is unproblematic as far as I can see. > > Also in vdpau, the function vlVdpOutputSurfaceCreate is using > PIPE_BIND_LINEAR flag and resource_create call, > like in the v2 of my patch. > > > Yeah, but that isn't a video buffer you try to create here. VA-API > unfortunately doesn't properly distinct between those two things. > > Regards, > Christian. > > > > Not sure if I replied to your concerns, let me know. > > Thx > Julien > > >> >> Regards, >> Christian. >> >> >> --- >>> src/gallium/state_trackers/va/surface.c | 60 >>> - >>> 1 file changed, 59 insertions(+), 1 deletion(-) >>> >>> diff --git a/src/gallium/state_trackers/va/surface.c >>> b/src/gallium/state_trackers/va/surface.c >>> index 8a6a397..b04ced4 100644 >>> --- a/src/gallium/state_trackers/va/surface.c >>> +++
Re: [Mesa-dev] [PATCH v2] st/va: ensure linear memory for dmabuf
Am 08.06.2016 um 14:20 schrieb Julien Isorce: On 8 June 2016 at 09:16, Christian König> wrote: Am 02.06.2016 um 16:00 schrieb Julien Isorce: In order to do zero-copy between two different devices the memory should not be tiled. This is currently no way to set pipe_resource template's flag from pipe_video_buffer template. So disabled_tiling is added. Choosed "disable" prefix so that CALLOC keeps tiling enabled by default. Tested with GStreamer on a laptop that has 2 GPUs: 1- gstvaapidecode: HW decoding and dmabuf export with nouveau driver on Nvidia GPU. 2- glimagesink: EGLImage imports dmabuf on Intel GPU. Note that tiling is working if 1 and 2 are done on the same GPU. So it is up to the application to set or not the flag: VA_SURFACE_EXTBUF_DESC_ENABLE_TILING Signed-off-by: Julien Isorce > Thx for the review. NAK, it won't be possible to use the resulting video buffer with hardware decoding on AMD hardware. But I restrict to these format: +case VA_FOURCC_RGBA: +case VA_FOURCC_RGBX: +case VA_FOURCC_BGRA: +case VA_FOURCC_BGRX: So if the vaapi user request a linear layout, it will fail if not one of these formats. So basically for now it requires vpp. Yeah, ok that should work for now but is clearly not a good idea. Please add a bind flag to struct pipe_video_buffer instead so that we can specify if linear layout is requested or not. Do you mean that resource = pscreen->resource_create(pscreen, ) does not honor the bind flag of the template. Maybe I can just checked if it was effective after that call, i.e. checking presence of PIPE_BIND_LINEAR in resources[0]->bind. No, for resource_create() the flag should be honored. But we shouldn't be using resource_create() to create the different planes of video buffers if possible. We should use create_video_buffer() on the pipe context if possible. This way the hardware driver can still reject the request if this would result in a surface which can't be decoded to. For now it requires vpp since I explicitly restricted linear layout request to the rgbs format above. The reason behind is that vaapi is limited to export 1 fd per surface. Problem is that for at least nouveau, it uses 1 pipe resource per plane, and NV12 has 2 planes. In the spec the problem comes from the fact that a VAImage has only one VABufferID. It would require to define a new VABufferType which represents an array of VAImageBufferType, something like that. Yeah, I know that is one of the many deficits VA-API unfortunately has. It should work with the Radeon implementation, but only because UVD requires both planes to be in the same buffer object. To go back to "add a bind flag to struct pipe_video_buffer instead ", the alternative is to bring back the first version of the patch but according to the first review, it was duplication of bind flag between pipe_video_buffer and pipe_resource so it would require quite of big of refactoring unless I miss understood the comment. I obviously have missed that discussion, but yes that is exactly the way we should go. The duplication is unproblematic as far as I can see. Also in vdpau, the function vlVdpOutputSurfaceCreate is using PIPE_BIND_LINEAR flag and resource_create call, like in the v2 of my patch. Yeah, but that isn't a video buffer you try to create here. VA-API unfortunately doesn't properly distinct between those two things. Regards, Christian. Not sure if I replied to your concerns, let me know. Thx Julien Regards, Christian. --- src/gallium/state_trackers/va/surface.c | 60 - 1 file changed, 59 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/va/surface.c b/src/gallium/state_trackers/va/surface.c index 8a6a397..b04ced4 100644 --- a/src/gallium/state_trackers/va/surface.c +++ b/src/gallium/state_trackers/va/surface.c @@ -507,7 +507,9 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format, int i; int memory_type; int expected_fourcc; + int linear_layout; VAStatus vaStatus; + const enum pipe_format *resource_formats; if (!ctx) return VA_STATUS_ERROR_INVALID_CONTEXT; @@ -529,6 +531,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format, memory_attibute = NULL; memory_type = VA_SURFACE_ATTRIB_MEM_TYPE_VA; expected_fourcc = 0; + resource_formats = NULL; for (i = 0; i < num_attribs && attrib_list;
Re: [Mesa-dev] [PATCH v2] st/va: ensure linear memory for dmabuf
On 8 June 2016 at 09:16, Christian Königwrote: > Am 02.06.2016 um 16:00 schrieb Julien Isorce: > >> In order to do zero-copy between two different devices >> the memory should not be tiled. >> >> This is currently no way to set pipe_resource template's flag >> from pipe_video_buffer template. So disabled_tiling is added. >> >> Choosed "disable" prefix so that CALLOC keeps tiling enabled >> by default. >> >> Tested with GStreamer on a laptop that has 2 GPUs: >> 1- gstvaapidecode: >> HW decoding and dmabuf export with nouveau driver on Nvidia GPU. >> 2- glimagesink: >> EGLImage imports dmabuf on Intel GPU. >> >> Note that tiling is working if 1 and 2 are done on the same GPU. >> So it is up to the application to set or not the flag: >> VA_SURFACE_EXTBUF_DESC_ENABLE_TILING >> >> Signed-off-by: Julien Isorce >> > > Thx for the review. > NAK, it won't be possible to use the resulting video buffer with hardware > decoding on AMD hardware. > But I restrict to these format: +case VA_FOURCC_RGBA: +case VA_FOURCC_RGBX: +case VA_FOURCC_BGRA: +case VA_FOURCC_BGRX: So if the vaapi user request a linear layout, it will fail if not one of these formats. So basically for now it requires vpp. > > Please add a bind flag to struct pipe_video_buffer instead so that we can > specify if linear layout is requested or not. > Do you mean that resource = pscreen->resource_create(pscreen, ) does not honor the bind flag of the template. Maybe I can just checked if it was effective after that call, i.e. checking presence of PIPE_BIND_LINEAR in resources[0]->bind. > > This way the hardware driver can still reject the request if this would > result in a surface which can't be decoded to. > For now it requires vpp since I explicitly restricted linear layout request to the rgbs format above. The reason behind is that vaapi is limited to export 1 fd per surface. Problem is that for at least nouveau, it uses 1 pipe resource per plane, and NV12 has 2 planes. In the spec the problem comes from the fact that a VAImage has only one VABufferID. It would require to define a new VABufferType which represents an array of VAImageBufferType, something like that. To go back to "add a bind flag to struct pipe_video_buffer instead ", the alternative is to bring back the first version of the patch but according to the first review, it was duplication of bind flag between pipe_video_buffer and pipe_resource so it would require quite of big of refactoring unless I miss understood the comment. Also in vdpau, the function vlVdpOutputSurfaceCreate is using PIPE_BIND_LINEAR flag and resource_create call, like in the v2 of my patch. Not sure if I replied to your concerns, let me know. Thx Julien > > Regards, > Christian. > > > --- >> src/gallium/state_trackers/va/surface.c | 60 >> - >> 1 file changed, 59 insertions(+), 1 deletion(-) >> >> diff --git a/src/gallium/state_trackers/va/surface.c >> b/src/gallium/state_trackers/va/surface.c >> index 8a6a397..b04ced4 100644 >> --- a/src/gallium/state_trackers/va/surface.c >> +++ b/src/gallium/state_trackers/va/surface.c >> @@ -507,7 +507,9 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned >> int format, >> int i; >> int memory_type; >> int expected_fourcc; >> + int linear_layout; >> VAStatus vaStatus; >> + const enum pipe_format *resource_formats; >>if (!ctx) >> return VA_STATUS_ERROR_INVALID_CONTEXT; >> @@ -529,6 +531,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned >> int format, >> memory_attibute = NULL; >> memory_type = VA_SURFACE_ATTRIB_MEM_TYPE_VA; >> expected_fourcc = 0; >> + resource_formats = NULL; >>for (i = 0; i < num_attribs && attrib_list; i++) { >> if ((attrib_list[i].type == VASurfaceAttribPixelFormat) && >> @@ -569,8 +572,27 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned >> int format, >> return VA_STATUS_ERROR_UNSUPPORTED_RT_FORMAT; >> } >> + /* The application will clear the TILING flag when the surface is >> +* intended to be exported as dmabuf. */ >> + linear_layout = memory_attibute && >> + !(memory_attibute->flags & VA_SURFACE_EXTBUF_DESC_ENABLE_TILING); >> + >> switch (memory_type) { >> case VA_SURFACE_ATTRIB_MEM_TYPE_VA: >> + if (linear_layout) { >> + switch (expected_fourcc) { >> +case VA_FOURCC_RGBA: >> +case VA_FOURCC_RGBX: >> +case VA_FOURCC_BGRA: >> +case VA_FOURCC_BGRX: >> + if (memory_attibute->num_planes != 1) >> + return VA_STATUS_ERROR_INVALID_PARAMETER; >> + break; >> +default: >> + return VA_STATUS_ERROR_INVALID_PARAMETER; >> + } >> + } >> + >> break; >> case VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME: >> if (!memory_attibute) >> @@ -587,6
Re: [Mesa-dev] [PATCH v2] st/va: ensure linear memory for dmabuf
Am 02.06.2016 um 16:00 schrieb Julien Isorce: In order to do zero-copy between two different devices the memory should not be tiled. This is currently no way to set pipe_resource template's flag from pipe_video_buffer template. So disabled_tiling is added. Choosed "disable" prefix so that CALLOC keeps tiling enabled by default. Tested with GStreamer on a laptop that has 2 GPUs: 1- gstvaapidecode: HW decoding and dmabuf export with nouveau driver on Nvidia GPU. 2- glimagesink: EGLImage imports dmabuf on Intel GPU. Note that tiling is working if 1 and 2 are done on the same GPU. So it is up to the application to set or not the flag: VA_SURFACE_EXTBUF_DESC_ENABLE_TILING Signed-off-by: Julien IsorceNAK, it won't be possible to use the resulting video buffer with hardware decoding on AMD hardware. Please add a bind flag to struct pipe_video_buffer instead so that we can specify if linear layout is requested or not. This way the hardware driver can still reject the request if this would result in a surface which can't be decoded to. Regards, Christian. --- src/gallium/state_trackers/va/surface.c | 60 - 1 file changed, 59 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/va/surface.c b/src/gallium/state_trackers/va/surface.c index 8a6a397..b04ced4 100644 --- a/src/gallium/state_trackers/va/surface.c +++ b/src/gallium/state_trackers/va/surface.c @@ -507,7 +507,9 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format, int i; int memory_type; int expected_fourcc; + int linear_layout; VAStatus vaStatus; + const enum pipe_format *resource_formats; if (!ctx) return VA_STATUS_ERROR_INVALID_CONTEXT; @@ -529,6 +531,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format, memory_attibute = NULL; memory_type = VA_SURFACE_ATTRIB_MEM_TYPE_VA; expected_fourcc = 0; + resource_formats = NULL; for (i = 0; i < num_attribs && attrib_list; i++) { if ((attrib_list[i].type == VASurfaceAttribPixelFormat) && @@ -569,8 +572,27 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format, return VA_STATUS_ERROR_UNSUPPORTED_RT_FORMAT; } + /* The application will clear the TILING flag when the surface is +* intended to be exported as dmabuf. */ + linear_layout = memory_attibute && + !(memory_attibute->flags & VA_SURFACE_EXTBUF_DESC_ENABLE_TILING); + switch (memory_type) { case VA_SURFACE_ATTRIB_MEM_TYPE_VA: + if (linear_layout) { + switch (expected_fourcc) { +case VA_FOURCC_RGBA: +case VA_FOURCC_RGBX: +case VA_FOURCC_BGRA: +case VA_FOURCC_BGRX: + if (memory_attibute->num_planes != 1) + return VA_STATUS_ERROR_INVALID_PARAMETER; + break; +default: + return VA_STATUS_ERROR_INVALID_PARAMETER; + } + } + break; case VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME: if (!memory_attibute) @@ -587,6 +609,13 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format, if (expected_fourcc) { templat.buffer_format = VaFourccToPipeFormat(expected_fourcc); templat.interlaced = 0; + + if (linear_layout) { + resource_formats = vl_video_buffer_formats(pscreen, +templat.buffer_format); + if (!resource_formats) +return VA_STATUS_ERROR_INVALID_PARAMETER; + } } else { templat.buffer_format = pscreen->get_video_param ( @@ -621,7 +650,36 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format, switch (memory_type) { case VA_SURFACE_ATTRIB_MEM_TYPE_VA: - surf->buffer = drv->pipe->create_video_buffer(drv->pipe, ); + if (linear_layout) { +struct pipe_resource resource_template; +struct pipe_resource *resources[VL_NUM_COMPONENTS]; +int depth = 1; +int array_size = 1; + +memset(resources, 0, sizeof(resources)); + +assert(resource_formats); +vl_video_buffer_template(_template, , + resource_formats[0], depth, array_size, + PIPE_USAGE_DEFAULT, 0); + +/* Shared because VASurfaceAttribExternalBuffers is used. */ +assert(memory_attibute); +resource_template.bind |= PIPE_BIND_LINEAR | PIPE_BIND_SHARED; + +resources[0] = pscreen->resource_create(pscreen, +_template); +if (!resources[0]) { +FREE(surf); +goto no_res; +} + +surf->buffer = vl_video_buffer_create_ex2(drv->pipe, , + resources); + } else { +