Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-12 Thread Michel Dänzer
On 12/01/17 12:09 AM, Cheng, Tony wrote:
> The Vblank interrupt fires as soon as the last line of the active
> region is scanned out.
> The VSync interrupt fires at the vsync.
> The VUpdate interrupt fires when the HW is ready to scan out a new frame;
> this includes latching the double-buffered registers, starting memory
> requests, etc.
> 
> We use VUpdate to accommodate FreeSync: with FreeSync the blank region
> is variable, and a frame can be terminated as soon as a new surface
> address is written to the register.  If we use the vblank interrupt, it
> will be fired too early and might not stretch frame time properly.

What does "fired too early" and "might not stretch frame time properly"
mean exactly?

Is it that the timestamp reported to userspace, which is supposed to
correspond to when scanning out the next frame starts, would be
incorrect? Or something else?


Apart from being too late for some use cases of the
DRM_IOCTL_WAIT_VBLANK ioctl, there's another issue with the VUPDATE
interrupt: Because it's processed after the PFLIP interrupt, the DRM
vblank sequence number is only incremented after a corresponding page
flip completion event is sent to userspace, so the vblank sequence
number in the event is 1 too low.
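
To make the ordering problem concrete, here is a minimal sketch in C. It is
a toy model, not the amdgpu or DRM core code: struct crtc_model,
handle_vblank_source(), handle_pflip() and run_frame() are hypothetical
names made up for illustration. It only shows that the sequence number
stamped into the flip completion event depends on which handler runs first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one CRTC; not taken from any real driver. */
struct crtc_model {
    uint32_t vblank_seq;     /* DRM-style vblank sequence counter */
    int flip_pending;        /* non-zero while a page flip awaits completion */
    uint32_t flip_event_seq; /* sequence number reported in the flip event */
};

/* Handler for whichever interrupt drives the vblank counter
 * (VBLANK or VUPDATE in the discussion above). */
static void handle_vblank_source(struct crtc_model *c)
{
    c->vblank_seq++;
}

/* Page flip completion handler: stamps the event with the current counter. */
static void handle_pflip(struct crtc_model *c)
{
    if (c->flip_pending) {
        c->flip_event_seq = c->vblank_seq;
        c->flip_pending = 0;
    }
}

/* Simulate one frame boundary with a flip pending and return the sequence
 * number carried by the flip event. counter_first selects the ordering. */
static uint32_t run_frame(uint32_t start_seq, int counter_first)
{
    struct crtc_model c = { start_seq, 1, 0 };

    if (counter_first) {
        handle_vblank_source(&c);
        handle_pflip(&c);
    } else {
        handle_pflip(&c);
        handle_vblank_source(&c);
    }
    assert(!c.flip_pending);
    return c.flip_event_seq;
}
```

Starting from sequence 41, run_frame(41, 1) reports 42 in the event, while
run_frame(41, 0), the ordering described for PFLIP followed by VUPDATE,
reports 41, i.e. 1 too low.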

I'd like to help come up with a solution for this, but I need to
understand the concerns about the VBLANK interrupt vs FreeSync better.


> I think us DAL guys might not have a full understanding of the DRM
> vblank machinery.  Is there some document we can go read up on to make
> sure all our assumptions are correct?

I'm not sure, but I've been involved with it pretty much since the
beginning, so I can explain some things about it. :)


> From our perspective it seems some of the DRM vblank machinery (or the
> way we implement it) is redundant, as our HW can automatically do things
> that we currently queue a work item for, if we configure the HW correctly.

Do you mean amdgpu_flip_work_func? I'm not sure how the HW could
automatically wait for fences to signal before executing the flip, but
maybe it could delay the flip until the frame count passes a threshold?
Or are you thinking of something else?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-12 Thread Michel Dänzer
On 11/01/17 08:41 PM, Andy Furniss wrote:
> 
> Pure luck noticing this because I haven't tested modesetting driver for
> ages, but -
> 
> These patches also break full screen vdpau playback when using that.
> 
> Result is a screen of mostly junk with a hint of the vid - looks like
> when direct scan out fails on wayland due to tiling mismatch.

Yeah, it's the same effect, due to the scanout buffer using tiling
parameters which aren't supported by the display hardware. In this case,
it happens because the modesetting driver attempts to use page flipping
anyway, and the kernel driver doesn't catch it but pretends that it works.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Harry Wentland

On 2017-01-11 12:50 AM, Michel Dänzer wrote:
> On 10/01/17 09:07 PM, Andy Furniss wrote:
>> Andy Furniss wrote:
>>
>>> Though recent testing shows this is not true with DAL/DC on 3.7 -
>>> todo test DC on new drm-next branch.
>>
>> todo done, DC for some reason on both amd-staging-4.7 and
>> amd-staging-drm-next is "slower" = the tear region is 2 to 3 times
>> larger than non DC kernel with powerplay auto. With high it is smaller
>> but still present.
>
> This particular issue is because DC uses the GPU's VUPDATE interrupt
> instead of the VBLANK interrupt to drive the DRM vblank machinery. The
> result is that userspace is only notified of a vertical blank period
> when it's already over, so it doesn't get a chance to do anything inside
> the vertical blank period.

Adding Tony for comment on why DC behaves the way it does.

Harry


[Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Nayan Deshmukh
dri3 allows us to send the handle of a texture directly to X,
so this patch lets a state tracker send its texture directly
to X to be used as a back buffer, avoiding an extra copy.

v2: use clip width/height to display a portion of the surface
v3: remove redundant variables, fix wrapping, rename variables,
    handle vaapi path
v3.1: clip_width/height is needed for every frame, so use a global
      variable instead of maintaining it for each buffer
v4: in the single-GPU case we can cache the buffers, as applications
    use a constant number of buffers, and so avoid calls to the
    present extension for every frame

Reviewed and Suggested-by: Leo Liu 
Acked-by: Christian König 
Tested-by: Andy Furniss 
Signed-off-by: Nayan Deshmukh 
---
 configure.ac  |   2 +-
 src/gallium/auxiliary/vl/vl_winsys.h  |   5 ++
 src/gallium/auxiliary/vl/vl_winsys_dri3.c | 126 ++
 3 files changed, 115 insertions(+), 18 deletions(-)

diff --git a/configure.ac b/configure.ac
index d1ffb57..1b3507c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2078,7 +2078,7 @@ if test "x$enable_xvmc" = xyes -o \
 "x$enable_va" = xyes; then
 if test x"$enable_dri3" = xyes; then
 PKG_CHECK_MODULES([VL], [xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED
- x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
+ xcb-xfixes x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
 else
 PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
 fi
diff --git a/src/gallium/auxiliary/vl/vl_winsys.h b/src/gallium/auxiliary/vl/vl_winsys.h
index 26db9f2..e1f9b27 100644
--- a/src/gallium/auxiliary/vl/vl_winsys.h
+++ b/src/gallium/auxiliary/vl/vl_winsys.h
@@ -59,6 +59,11 @@ struct vl_screen
void *
(*get_private)(struct vl_screen *vscreen);
 
+   void
+   (*set_back_texture_from_output)(struct vl_screen *vscreen,
+   struct pipe_resource *buffer,
+   uint32_t width, uint32_t height);
+
struct pipe_screen *pscreen;
struct pipe_loader_device *dev;
 };
diff --git a/src/gallium/auxiliary/vl/vl_winsys_dri3.c b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
index 2929928..a810dea 100644
--- a/src/gallium/auxiliary/vl/vl_winsys_dri3.c
+++ b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "loader.h"
 
@@ -71,9 +72,12 @@ struct vl_dri3_screen
xcb_special_event_t *special_event;
 
struct pipe_context *pipe;
+   struct pipe_resource *output_texture;
+   uint32_t clip_width, clip_height;
 
struct vl_dri3_buffer *back_buffers[BACK_BUFFER_NUM];
int cur_back;
+   int next_back;
 
struct u_rect dirty_areas[BACK_BUFFER_NUM];
 
@@ -105,7 +109,8 @@ dri3_free_back_buffer(struct vl_dri3_screen *scrn,
xcb_free_pixmap(scrn->conn, buffer->pixmap);
xcb_sync_destroy_fence(scrn->conn, buffer->sync_fence);
xshmfence_unmap_shm(buffer->shm_fence);
-   pipe_resource_reference(&buffer->texture, NULL);
+   if (!scrn->output_texture)
+      pipe_resource_reference(&buffer->texture, NULL);
if (buffer->linear_texture)
pipe_resource_reference(&buffer->linear_texture, NULL);
FREE(buffer);
@@ -236,29 +241,31 @@ dri3_alloc_back_buffer(struct vl_dri3_screen *scrn)
templ.format = PIPE_FORMAT_B8G8R8X8_UNORM;
templ.target = PIPE_TEXTURE_2D;
templ.last_level = 0;
-   templ.width0 = scrn->width;
-   templ.height0 = scrn->height;
+   templ.width0 = (scrn->output_texture) ?
+  scrn->output_texture->width0 : scrn->width;
+   templ.height0 = (scrn->output_texture) ?
+   scrn->output_texture->height0 : scrn->height;
templ.depth0 = 1;
templ.array_size = 1;
 
if (scrn->is_different_gpu) {
-  buffer->texture = scrn->base.pscreen->resource_create(scrn->base.pscreen,
-&templ);
+  buffer->texture = (scrn->output_texture) ? scrn->output_texture :
+  scrn->base.pscreen->resource_create(scrn->base.pscreen, &templ);
   if (!buffer->texture)
  goto unmap_shm;
 
   templ.bind |= PIPE_BIND_SCANOUT | PIPE_BIND_SHARED |
 PIPE_BIND_LINEAR;
-  buffer->linear_texture = scrn->base.pscreen->resource_create(scrn->base.pscreen,
-  &templ);
+  buffer->linear_texture =
+  scrn->base.pscreen->resource_create(scrn->base.pscreen, &templ);
   pixmap_buffer_texture = buffer->linear_texture;
 
   if (!buffer->linear_texture)
  goto no_linear_texture;
} else {
   templ.bind |= PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;
-  buffer->texture = scrn->base.pscreen->resource_create(scrn->base.pscreen,
-&templ);
+

Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Nayan Deshmukh
On Wed, Jan 11, 2017 at 9:25 PM, Andy Furniss  wrote:
> Nayan Deshmukh wrote:
>>
>> Hi Andy,
>>
>> Can you try this patch? This should help with the tearing.
>
>
> Patch seems to be good - I get page flipping again so DC, modesetting
> and "normal" setup all work OK.
>
Great.

Thanks for the help Michel.

Christian, I will resend the series with the changes. Please review the
other 2 patches.

Regards,
Nayan
>
>>
>> diff --git a/src/gallium/state_trackers/vdpau/output.c
>> b/src/gallium/state_trackers/vdpau/output.c
>> index 48e3133..98a8011 100644
>> --- a/src/gallium/state_trackers/vdpau/output.c
>> +++ b/src/gallium/state_trackers/vdpau/output.c
>> @@ -82,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
>>  res_tmpl.depth0 = 1;
>>  res_tmpl.array_size = 1;
>>  res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
>> -   PIPE_BIND_SHARED;
>> +   PIPE_BIND_SHARED | PIPE_BIND_SCANOUT;
>>  res_tmpl.usage = PIPE_USAGE_DEFAULT;
>>
>>  pipe_mutex_lock(dev->mutex);
>>
>> Regards,
>> Nayan
>>

Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Andy Furniss

Nayan Deshmukh wrote:
> Hi Andy,
>
> Can you try this patch? This should help with the tearing.

Patch seems to be good - I get page flipping again so DC, modesetting
and "normal" setup all work OK.

> diff --git a/src/gallium/state_trackers/vdpau/output.c
> b/src/gallium/state_trackers/vdpau/output.c
> index 48e3133..98a8011 100644
> --- a/src/gallium/state_trackers/vdpau/output.c
> +++ b/src/gallium/state_trackers/vdpau/output.c
> @@ -82,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
>  res_tmpl.depth0 = 1;
>  res_tmpl.array_size = 1;
>  res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
> -   PIPE_BIND_SHARED;
> +   PIPE_BIND_SHARED | PIPE_BIND_SCANOUT;
>  res_tmpl.usage = PIPE_USAGE_DEFAULT;
>
>  pipe_mutex_lock(dev->mutex);
>
> Regards,
> Nayan



Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Cheng, Tony
The Vblank interrupt fires as soon as the last line of the active region is
scanned out.
The VSync interrupt fires at the vsync.
The VUpdate interrupt fires when the HW is ready to scan out a new frame; this
includes latching the double-buffered registers, starting memory requests, etc.

We use VUpdate to accommodate FreeSync: with FreeSync the blank region is
variable, and a frame can be terminated as soon as a new surface address is
written to the register.  If we use the vblank interrupt, it will be fired too
early and might not stretch frame time properly.
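
As a rough illustration of the three interrupt points, the sketch below
computes how many blanking lines remain in a frame after each interrupt
fires. Everything in it is an assumption made for illustration: the names
(struct vtiming, irq_line, blank_lines_left) are hypothetical, the timing
numbers are a standard 1080p60 mode, and the VUpdate position (last line of
the frame) is only a guess at "HW is ready to scan out a new frame", not a
value from the hardware documentation.

```c
#include <assert.h>

enum irq_kind { IRQ_VBLANK, IRQ_VSYNC, IRQ_VUPDATE };

/* Vertical timing in scanlines. vupdate_line is an assumed position. */
struct vtiming {
    int vactive;      /* number of active lines */
    int vsync_start;  /* line where the vsync pulse starts */
    int vtotal;       /* total lines per frame, active plus blanking */
    int vupdate_line; /* assumed line where VUpdate fires */
};

/* Scanline at which the given interrupt fires in this model. */
static int irq_line(const struct vtiming *t, enum irq_kind kind)
{
    switch (kind) {
    case IRQ_VBLANK:  return t->vactive;      /* right after the last active line */
    case IRQ_VSYNC:   return t->vsync_start;
    case IRQ_VUPDATE: return t->vupdate_line;
    }
    assert(0 && "unknown interrupt kind");
    return -1;
}

/* Blanking lines left before the next frame's active region starts. */
static int blank_lines_left(const struct vtiming *t, enum irq_kind kind)
{
    return t->vtotal - irq_line(t, kind);
}
```

With { 1080, 1084, 1125, 1124 }, the model leaves 45 lines of blanking after
Vblank, 41 after VSync and 1 after VUpdate, so anything woken up by VUpdate
has essentially no blanking period left to work in.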

Dal/dc does not manage any interrupts, as dc is architected to behave more like
a helper.  Dal/amdgpu_dm is the glue code; it does interrupt registration and
interrupt handling.  I think us DAL guys might not have a full understanding of
the DRM vblank machinery.  Is there some document we can go read up on to make
sure all our assumptions are correct?  From our perspective it seems some of
the DRM vblank machinery (or the way we implement it) is redundant, as our HW
can automatically do things that we currently queue a work item for, if we
configure the HW correctly.

-Original Message-
From: Wentland, Harry 
Sent: Wednesday, January 11, 2017 9:51 AM
To: Michel Dänzer <mic...@daenzer.net>; Andy Furniss <adf.li...@gmail.com>; 
Nayan Deshmukh <nayan26deshm...@gmail.com>
Cc: ML mesa-dev <mesa-dev@lists.freedesktop.org>; Cheng, Tony 
<tony.ch...@amd.com>
Subject: Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back 
buffers(v4)

On 2017-01-11 12:50 AM, Michel Dänzer wrote:
> On 10/01/17 09:07 PM, Andy Furniss wrote:
>> Andy Furniss wrote:
>>
>>> Though recent testing shows this is not true with DAL/DC on 3.7 - 
>>> todo test DC on new drm-next branch.
>>
>> todo done, DC for some reason on both amd-staging-4.7 and 
>> amd-staging-drm-next is "slower" = the tear region is 2 to 3 times 
>> larger than non DC kernel with powerplay auto. With high it is 
>> smaller but still present.
>
> This particular issue is because DC uses the GPU's VUPDATE interrupt 
> instead of the VBLANK interrupt to drive the DRM vblank machinery. The 
> result is that userspace is only notified of a vertical blank period 
> when it's already over, so it doesn't get a chance to do anything 
> inside the vertical blank period.
>
>

Adding Tony for comment on why DC behaves the way it does.

Harry


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Nayan Deshmukh
Hi Andy,

Can you try this patch? This should help with the tearing.

diff --git a/src/gallium/state_trackers/vdpau/output.c
b/src/gallium/state_trackers/vdpau/output.c
index 48e3133..98a8011 100644
--- a/src/gallium/state_trackers/vdpau/output.c
+++ b/src/gallium/state_trackers/vdpau/output.c
@@ -82,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
res_tmpl.depth0 = 1;
res_tmpl.array_size = 1;
res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
-   PIPE_BIND_SHARED;
+   PIPE_BIND_SHARED | PIPE_BIND_SCANOUT;
res_tmpl.usage = PIPE_USAGE_DEFAULT;

pipe_mutex_lock(dev->mutex);

Regards,
Nayan



Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Andy Furniss

Michel Dänzer wrote:
> On 11/01/17 05:13 PM, Nayan Deshmukh wrote:
>> On Wed, Jan 11, 2017 at 12:44 PM, Michel Dänzer wrote:
>>> On 10/01/17 06:53 PM, Nayan Deshmukh wrote:
>>>> On Sat, Jan 7, 2017 at 12:42 PM, Michel Dänzer wrote:
>>>>> On 06/01/17 05:50 AM, Andy Furniss wrote:
>>>>>> Christian König wrote:
>>>>>>> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>>>>>>>> dri3 allows us to send handle of a texture directly to X
>>>>>>>> so this patch allows a state tracker to directly send its
>>>>>>>> texture to X to be used as back buffer and avoids extra
>>>>>>>> copying
>>>>>>>>
>>>>>>>> v2: use clip width/height to display a portion of the surface
>>>>>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>>>>>     handle vaapi path
>>>>>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>>>>>       to maintain it for each buffer instead use a global variable
>>>>>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>>>>>     use constant number of buffer and we can avoid calls to present
>>>>>>>>     extension for every frame
>>>>>>>>
>>>>>>>> Suggested-by: Leo Liu
>>>>>>>> Signed-off-by: Nayan Deshmukh
>>>>>>>
>>>>>>> Acked-by: Christian König .
>>>>>>>
>>>>>>> Andy & Leo, did you guys already have a chance to test it? To me it looks
>>>>>>> like this should work now.
>>>>>>
>>>>>> Well there is still the tearing issue from losing pageflips.
>>>>>>
>>>>>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>>>>>> just tested dal and it's not even fixable running that.
>>>>>>
>>>>>> I guess that may not count as an issue with these patches as such if
>>>>>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>>>>>> regression until that happens.
>>>>>
>>>>> Somebody should track down why the buffers sent for presentation in this
>>>>> case don't use the same tiling parameters as buffers used for GL via DRI3.
>>>>
>>>> I can look into this, but I don't know where to look exactly. Can you give
>>>> some pointers to get started?
>>>
>>> Looking at src/gallium/auxiliary/vl/vl_winsys_dri3.c and the patches
>>> again, my guess is that it's due to PIPE_BIND_SCANOUT not being set when
>>> creating the buffers that are now being directly sent to the X server
>>> for presentation.
>>
>> So the only way to avoid this is to have a PIPE_BIND_SCANOUT for the
>> output surfaces of the state tracker. Will introducing
>> PIPE_BIND_SCANOUT lead to performance loss for these surfaces?
>
> Potentially, but I doubt it'll make a big difference for this use case.
> In the future, there might be a feedback mechanism which allows
> re-allocating the buffer with/out PIPE_BIND_SCANOUT according to the
> current circumstances, but for now it's probably better to set it (at
> least in cases where we don't know that the buffer can never be scanned
> out directly) to allow for page flipping.

Pure luck noticing this because I haven't tested modesetting driver for 
ages, but -


These patches also break full screen vdpau playback when using that.

Result is a screen of mostly junk with a hint of the vid - looks like
when direct scan out fails on wayland due to tiling mismatch.



Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Michel Dänzer
On 11/01/17 05:13 PM, Nayan Deshmukh wrote:
> On Wed, Jan 11, 2017 at 12:44 PM, Michel Dänzer  wrote:
>> On 10/01/17 06:53 PM, Nayan Deshmukh wrote:
>>> On Sat, Jan 7, 2017 at 12:42 PM, Michel Dänzer  wrote:
>>>> On 06/01/17 05:50 AM, Andy Furniss wrote:
>>>>> Christian König wrote:
>>>>>> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>>>>>>> dri3 allows us to send handle of a texture directly to X
>>>>>>> so this patch allows a state tracker to directly send its
>>>>>>> texture to X to be used as back buffer and avoids extra
>>>>>>> copying
>>>>>>>
>>>>>>> v2: use clip width/height to display a portion of the surface
>>>>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>>>>     handle vaapi path
>>>>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>>>>       to maintain it for each buffer instead use a global variable
>>>>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>>>>     use constant number of buffer and we can avoid calls to present
>>>>>>>     extension for every frame
>>>>>>>
>>>>>>> Suggested-by: Leo Liu
>>>>>>> Signed-off-by: Nayan Deshmukh
>>>>>>
>>>>>> Acked-by: Christian König .
>>>>>>
>>>>>> Andy & Leo, did you guys already have a chance to test it? To me it looks
>>>>>> like this should work now.
>>>>>
>>>>> Well there is still the tearing issue from losing pageflips.
>>>>>
>>>>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>>>>> just tested dal and it's not even fixable running that.
>>>>>
>>>>> I guess that may not count as an issue with these patches as such if
>>>>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>>>>> regression until that happens.
>>>>
>>>> Somebody should track down why the buffers sent for presentation in this
>>>> case don't use the same tiling parameters as buffers used for GL via DRI3.
>>>>
>>> I can look into this, but I don't know where to look exactly. Can you give
>>> some pointers to get started?
>>
>> Looking at src/gallium/auxiliary/vl/vl_winsys_dri3.c and the patches
>> again, my guess is that it's due to PIPE_BIND_SCANOUT not being set when
>> creating the buffers that are now being directly sent to the X server
>> for presentation.
>>
> So the only way to avoid this is to have a PIPE_BIND_SCANOUT for the
> output surfaces of the state tracker. Will introducing
> PIPE_BIND_SCANOUT lead to performance loss for these surfaces?

Potentially, but I doubt it'll make a big difference for this use case.
In the future, there might be a feedback mechanism which allows
re-allocating the buffer with/out PIPE_BIND_SCANOUT according to the
current circumstances, but for now it's probably better to set it (at
least in cases where we don't know that the buffer can never be scanned
out directly) to allow for page flipping.
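
The policy above ("set it unless we know the buffer can never be scanned
out") could be sketched like this. This is only an illustration: the BIND_*
constants are stand-in values, not the real PIPE_BIND_* definitions from
Gallium, and output_surface_bind() is a hypothetical helper, not something
in the vdpau state tracker.

```c
#include <assert.h>

/* Stand-in flag values for illustration only; the real PIPE_BIND_* flags
 * are defined by Gallium and have different values. */
enum {
    BIND_SAMPLER_VIEW  = 1 << 0,
    BIND_RENDER_TARGET = 1 << 1,
    BIND_SHARED        = 1 << 2,
    BIND_SCANOUT       = 1 << 3
};

/* Hypothetical helper: choose bind flags for an output surface.
 * never_scanned_out is non-zero only when the caller knows for sure that
 * the surface will never be presented directly (no page flipping). */
static unsigned output_surface_bind(int never_scanned_out)
{
    unsigned bind = BIND_SAMPLER_VIEW | BIND_RENDER_TARGET | BIND_SHARED;

    /* Default to scanout-capable so the X server can page flip to it. */
    if (!never_scanned_out)
        bind |= BIND_SCANOUT;

    return bind;
}
```

When in doubt the caller passes 0, so the surface stays eligible for page
flipping at the cost of whatever layout restrictions the driver applies to
scanout buffers.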


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-11 Thread Nayan Deshmukh
On Wed, Jan 11, 2017 at 12:44 PM, Michel Dänzer  wrote:
> On 10/01/17 06:53 PM, Nayan Deshmukh wrote:
>> On Sat, Jan 7, 2017 at 12:42 PM, Michel Dänzer  wrote:
>>> On 06/01/17 05:50 AM, Andy Furniss wrote:
>>>> Christian König wrote:
>>>>> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>>>>>> dri3 allows us to send handle of a texture directly to X
>>>>>> so this patch allows a state tracker to directly send its
>>>>>> texture to X to be used as back buffer and avoids extra
>>>>>> copying
>>>>>>
>>>>>> v2: use clip width/height to display a portion of the surface
>>>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>>>     handle vaapi path
>>>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>>>       to maintain it for each buffer instead use a global variable
>>>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>>>     use constant number of buffer and we can avoid calls to present
>>>>>>     extension for every frame
>>>>>>
>>>>>> Suggested-by: Leo Liu
>>>>>> Signed-off-by: Nayan Deshmukh
>>>>>
>>>>> Acked-by: Christian König .
>>>>>
>>>>> Andy & Leo, did you guys already have a chance to test it? To me it looks
>>>>> like this should work now.
>>>>
>>>> Well there is still the tearing issue from losing pageflips.
>>>>
>>>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>>>> just tested dal and it's not even fixable running that.
>>>>
>>>> I guess that may not count as an issue with these patches as such if
>>>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>>>> regression until that happens.
>>>
>>> Somebody should track down why the buffers sent for presentation in this
>>> case don't use the same tiling parameters as buffers used for GL via DRI3.
>>>
>> I can look into this, but I don't know where to look exactly. Can you give
>> some pointers to get started?
>
> Looking at src/gallium/auxiliary/vl/vl_winsys_dri3.c and the patches
> again, my guess is that it's due to PIPE_BIND_SCANOUT not being set when
> creating the buffers that are now being directly sent to the X server
> for presentation.
>
So the only way to avoid this is to have a PIPE_BIND_SCANOUT for the
output surfaces of the state tracker. Will introducing PIPE_BIND_SCANOUT
lead to performance loss for these surfaces?

It will probably depend on the way drivers handle PIPE_BIND_SCANOUT.

Regards,
Nayan.

>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Michel Dänzer
On 10/01/17 06:53 PM, Nayan Deshmukh wrote:
> On Sat, Jan 7, 2017 at 12:42 PM, Michel Dänzer  wrote:
>> On 06/01/17 05:50 AM, Andy Furniss wrote:
>>> Christian König wrote:
>>>> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>>>>> dri3 allows us to send handle of a texture directly to X
>>>>> so this patch allows a state tracker to directly send its
>>>>> texture to X to be used as back buffer and avoids extra
>>>>> copying
>>>>>
>>>>> v2: use clip width/height to display a portion of the surface
>>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>>     handle vaapi path
>>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>>       to maintain it for each buffer instead use a global variable
>>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>>     use constant number of buffer and we can avoid calls to present
>>>>>     extension for every frame
>>>>>
>>>>> Suggested-by: Leo Liu
>>>>> Signed-off-by: Nayan Deshmukh
>>>>
>>>> Acked-by: Christian König.
>>>>
>>>> Andy & Leo did you guys already had a chance to test it? To me it looks
>>>> like this should work now.
>>>
>>> Well there is still the tearing issue from loosing pageflips.
>>>
>>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>>> just tested dal and it's not even fixable running that.
>>>
>>> I guess that may not count as an issue with these patches as such if
>>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>>> regression until that happens.
>>
>> Somebody should track down why the buffers sent for presentation in this
>> case don't use the same tiling parameters as buffers used for GL via DRI3.
>>
> I can look into this, but I don't know where to look exactly. Can you
> give some pointers to get started.

Looking at src/gallium/auxiliary/vl/vl_winsys_dri3.c and the patches
again, my guess is that it's due to PIPE_BIND_SCANOUT not being set when
creating the buffers that are now being directly sent to the X server
for presentation.
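
The guess above can be illustrated with a small sketch. It is not the
vl_winsys_dri3.c code: the struct and the bit values are stand-ins made up
for the example (the real PIPE_BIND_* definitions live in Gallium's
p_defines.h), and the helper names are hypothetical. It only shows the idea
that a surface handed directly to the X server would need the scanout bit
in its bind flags at creation time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values for illustration only; the real PIPE_BIND_* values
 * are defined by Gallium and may differ. */
#define SKETCH_BIND_SAMPLER_VIEW  (1u << 0)
#define SKETCH_BIND_RENDER_TARGET (1u << 1)
#define SKETCH_BIND_SCANOUT       (1u << 2)
#define SKETCH_BIND_SHARED        (1u << 3)

/* Hypothetical, reduced stand-in for a pipe_resource template. */
struct sketch_templ {
   uint32_t width0, height0;
   uint32_t bind;
};

/* Build the template for a state tracker output surface. When the surface
 * will be presented directly, request scanout and sharing up front. */
static struct sketch_templ
sketch_output_templ(uint32_t width, uint32_t height, bool presented_directly)
{
   struct sketch_templ templ = {0};

   assert(width > 0 && height > 0);
   templ.width0 = width;
   templ.height0 = height;
   templ.bind = SKETCH_BIND_SAMPLER_VIEW | SKETCH_BIND_RENDER_TARGET;
   if (presented_directly)
      templ.bind |= SKETCH_BIND_SCANOUT | SKETCH_BIND_SHARED;
   return templ;
}

/* True if the template asked for a scanout-capable layout. */
static bool
sketch_has_scanout(const struct sketch_templ *templ)
{
   return (templ->bind & SKETCH_BIND_SCANOUT) != 0;
}
```

Whether the extra flag changes tiling or costs performance is up to the
driver; the sketch makes no claim about that.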


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Michel Dänzer
On 10/01/17 09:07 PM, Andy Furniss wrote:
> Andy Furniss wrote:
> 
>> Though recent testing shows this is not true with DAL/DC on 3.7 -
>> todo test DC on new drm-next branch.
> 
> todo done, DC for some reason on both amd-staging-4.7 and
> amd-staging-drm-next is "slower" = the tear region is 2 to 3 times
> larger than non DC kernel with powerplay auto. With high it is smaller
> but still present.

This particular issue is because DC uses the GPU's VUPDATE interrupt
instead of the VBLANK interrupt to drive the DRM vblank machinery. The
result is that userspace is only notified of a vertical blank period
when it's already over, so it doesn't get a chance to do anything inside
the vertical blank period.
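
A rough way to picture the difference, as a sketch under stated
assumptions: the timing struct and function below are hypothetical, the
1080/1125 line counts are just a common 1080p example, and VUPDATE is
modelled as arriving at the very end of the blanking period, which is a
simplification of the description above.

```c
#include <assert.h>

/* Hypothetical mode timing: vactive visible lines out of vtotal lines. */
struct sketch_timing {
   int vactive;
   int vtotal;
};

/* Number of blanking lines still ahead when userspace is notified at
 * scanline `line` (0-based from the start of the frame). */
static int
sketch_blank_lines_left(const struct sketch_timing *t, int line)
{
   assert(t->vactive > 0 && t->vtotal > t->vactive);
   assert(line >= 0 && line <= t->vtotal);

   if (line < t->vactive)
      return t->vtotal - t->vactive; /* whole blanking period still ahead */
   return t->vtotal - line;          /* remainder of the blanking period */
}
```

With vactive = 1080 and vtotal = 1125, a notification at line 1080 (the
VBLANK case) leaves 45 blanking lines to work in, while one at line 1125
(the modelled VUPDATE case) leaves none.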


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Alex Deucher
On Tue, Jan 10, 2017 at 12:56 PM, Andy Furniss  wrote:
> Alex Deucher wrote:
>>
>> On Tue, Jan 10, 2017 at 4:50 AM, Nayan Deshmukh
>>  wrote:
>>>
>>> On Fri, Jan 6, 2017 at 2:20 AM, Andy Furniss  wrote:

>>>> Christian König wrote:
>>>>>
>>>>> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>>>>>>
>>>>>> dri3 allows us to send handle of a texture directly to X
>>>>>> so this patch allows a state tracker to directly send its
>>>>>> texture to X to be used as back buffer and avoids extra
>>>>>> copying
>>>>>>
>>>>>> v2: use clip width/height to display a portion of the surface
>>>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>>>     handle vaapi path
>>>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>>>       to maintain it for each buffer instead use a global variable
>>>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>>>     use constant number of buffer and we can avoid calls to present
>>>>>>     extension for every frame
>>>>>>
>>>>>> Suggested-by: Leo Liu
>>>>>> Signed-off-by: Nayan Deshmukh
>>>>>
>>>>> Acked-by: Christian König.
>>>>>
>>>>> Andy & Leo did you guys already had a chance to test it? To me it looks
>>>>> like this should work now.
>>>>
>>>> Well there is still the tearing issue from losing pageflips.
>>>>
>>>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>>>> just tested dal and it's not even fixable running that.
>>>>
>>>> I guess that may not count as an issue with these patches as such if
>>>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>>>> regression until that happens.

>>>
>>> That's bad. It should have improved the speed due to less copying
>>> involved.
>>> But it seems there are some problems in the patch. It may be that somehow
>>> we
>>> make calls to present extension on every frame.
>>>
>>
>> This is not the fault of your patches.  They reduce the copying
>> involved with generates less GPU activity which causes the GPU to not
>> ramp up the clocks as high.  For multi-media especially, we really
>> need to add a kernel interface to request a minimum clock floor for
>> specific contexts.
>
>
> Hmm, are these hidden clocks?
>
> echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level
>
> With dri2 it's still OK fullscreen, with opengl perf is hurt but it
> doesn't tear fullscreen - just can't make the framerate so player drops
> or slow-mo depending on its settings.
>
> Clearly clocks play a part in that on a non DC kernel high will "fix"
> but even then that's one test at 1080p. I tried 2160p framebuffer and
> it doesn't quite fix that. Going more extreme 4320p it's worse of course
> but full screen dri2 and opengl still won't tear.
>

Thanks for clarifying.  I didn't realize the tearing and performance
were intertwined.

Alex


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Andy Furniss

Alex Deucher wrote:

> On Tue, Jan 10, 2017 at 4:50 AM, Nayan Deshmukh wrote:
>> On Fri, Jan 6, 2017 at 2:20 AM, Andy Furniss wrote:
>>> Christian König wrote:
>>>> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>>>>> dri3 allows us to send handle of a texture directly to X
>>>>> so this patch allows a state tracker to directly send its
>>>>> texture to X to be used as back buffer and avoids extra
>>>>> copying
>>>>>
>>>>> v2: use clip width/height to display a portion of the surface
>>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>>     handle vaapi path
>>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>>       to maintain it for each buffer instead use a global variable
>>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>>     use constant number of buffer and we can avoid calls to present
>>>>>     extension for every frame
>>>>>
>>>>> Suggested-by: Leo Liu
>>>>> Signed-off-by: Nayan Deshmukh
>>>>
>>>> Acked-by: Christian König.
>>>>
>>>> Andy & Leo did you guys already had a chance to test it? To me it looks
>>>> like this should work now.
>>>
>>> Well there is still the tearing issue from losing pageflips.
>>>
>>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>>> just tested dal and it's not even fixable running that.
>>>
>>> I guess that may not count as an issue with these patches as such if
>>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>>> regression until that happens.
>>
>> That's bad. It should have improved the speed due to less copying involved.
>> But it seems there are some problems in the patch. It may be that somehow we
>> make calls to present extension on every frame.
>
> This is not the fault of your patches.  They reduce the copying
> involved with generates less GPU activity which causes the GPU to not
> ramp up the clocks as high.  For multi-media especially, we really
> need to add a kernel interface to request a minimum clock floor for
> specific contexts.


Hmm, are these hidden clocks?

echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level

With dri2 it's still OK fullscreen, with opengl perf is hurt but it
doesn't tear fullscreen - just can't make the framerate so player drops
or slow-mo depending on its settings.

Clearly clocks play a part, in that on a non-DC kernel "high" will "fix"
it, but even then that's one test at 1080p. I tried a 2160p framebuffer and
"high" doesn't quite fix that. Going more extreme, at 4320p it's worse of
course, but fullscreen dri2 and opengl still won't tear.
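
For anyone scripting this kind of test, the echo above can also be done
from C. This is a minimal sketch, not an existing tool: the function name
is made up, the caller passes the sysfs path (the card0 path above is just
this setup), and writing to it normally needs root. "low", "high" and
"auto" are the levels mentioned in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a performance level string to the file at `path`.
 * Returns 0 on success and -1 if the file cannot be opened or written. */
static int
sketch_force_perf_level(const char *path, const char *level)
{
   FILE *f;
   int ok;

   assert(path != NULL && level != NULL);

   f = fopen(path, "w");
   if (!f)
      return -1;

   ok = fputs(level, f) >= 0;
   if (fclose(f) != 0)
      ok = 0;
   return ok ? 0 : -1;
}
```

Called with the power_dpm_force_performance_level path and "high", it
should be equivalent to the echo, and calling it again with "auto" hands
control back to the driver.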



Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Alex Deucher
On Tue, Jan 10, 2017 at 4:50 AM, Nayan Deshmukh
 wrote:
> On Fri, Jan 6, 2017 at 2:20 AM, Andy Furniss  wrote:
>> Christian König wrote:
>>>
>>> Am 04.01.2017 um 18:13 schrieb Nayan Deshmukh:

>>>> dri3 allows us to send handle of a texture directly to X
>>>> so this patch allows a state tracker to directly send its
>>>> texture to X to be used as back buffer and avoids extra
>>>> copying
>>>>
>>>> v2: use clip width/height to display a portion of the surface
>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>     handle vaapi path
>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>       to maintain it for each buffer instead use a global variable
>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>     use constant number of buffer and we can avoid calls to present
>>>>     extension for every frame
>>>>
>>>> Suggested-by: Leo Liu
>>>> Signed-off-by: Nayan Deshmukh
>>>
>>>
>>> Acked-by: Christian König .
>>>
>>> Andy & Leo did you guys already had a chance to test it? To me it looks
>>> like this should work now.
>>
>>
>> Well there is still the tearing issue from loosing pageflips.
>>
>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>> just tested dal and it's not even fixable running that.
>>
>> I guess that may not count as an issue with these patches as such if
>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>> regression until that happens.
>>
>
> That's bad. It should have improved the speed due to less copying involved.
> But it seems there are some problems in the patch. It may be that somehow we
> make calls to present extension on every frame.
>

This is not the fault of your patches.  They reduce the copying
involved, which generates less GPU activity, which causes the GPU to not
ramp up the clocks as high.  For multimedia especially, we really
need to add a kernel interface to request a minimum clock floor for
specific contexts.

Alex

> Andy you are using dri3 for testing, right?
>
> Leo, did you also experience tearing issues?
>
> Regards,
> Nayan


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Andy Furniss

Andy Furniss wrote:


> Though recent testing shows this is not true with DAL/DC on 3.7 -
> todo test DC on new drm-next branch.


todo done, DC for some reason on both amd-staging-4.7 and
amd-staging-drm-next is "slower" = the tear region is 2 to 3 times
larger than non DC kernel with powerplay auto. With high it is smaller
but still present.




Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Andy Furniss

Nayan Deshmukh wrote:

> On Fri, Jan 6, 2017 at 2:20 AM, Andy Furniss wrote:
>> Christian König wrote:
>>> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>>>> dri3 allows us to send handle of a texture directly to X
>>>> so this patch allows a state tracker to directly send its
>>>> texture to X to be used as back buffer and avoids extra
>>>> copying
>>>>
>>>> v2: use clip width/height to display a portion of the surface
>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>     handle vaapi path
>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>       to maintain it for each buffer instead use a global variable
>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>     use constant number of buffer and we can avoid calls to present
>>>>     extension for every frame
>>>>
>>>> Suggested-by: Leo Liu
>>>> Signed-off-by: Nayan Deshmukh
>>>
>>> Acked-by: Christian König.
>>>
>>> Andy & Leo did you guys already had a chance to test it? To me it looks
>>> like this should work now.
>>
>> Well there is still the tearing issue from losing pageflips.
>>
>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>> just tested dal and it's not even fixable running that.
>>
>> I guess that may not count as an issue with these patches as such if
>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>> regression until that happens.
>
> That's bad. It should have improved the speed due to less copying involved.
> But it seems there are some problems in the patch. It may be that somehow we
> make calls to present extension on every frame.

Tiling issue that Michel suggested?

FWIW in windowed playback everything (dri2/3/opengl) has this, I assume
because I don't get pageflipping then. I wouldn't notice though because
even if the player opens the window at the top of the screen, the window
border makes the vid low enough to miss the tear.
Though recent testing shows this is not true with DAL/DC on 3.7 -
todo test DC on new drm-next branch.

> Andy you are using dri3 for testing, right?

Yes, there is no (fullscreen) tearing if I startx with DRI3 disabled.

> Leo, did you also experience tearing issues?

It's quite possible that most won't see this -

I use a non-compositing desktop (fluxbox), so I guess unless people use
unredirect full screen windows they may still get gl page flipping?

It's only the top of the screen and you won't notice on many videos
unless there is a lot of horizontal panning.

> Regards,
> Nayan





Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Nayan Deshmukh
On Sat, Jan 7, 2017 at 12:42 PM, Michel Dänzer  wrote:
> On 06/01/17 05:50 AM, Andy Furniss wrote:
>> Christian König wrote:
>>> Am 04.01.2017 um 18:13 schrieb Nayan Deshmukh:
>>>> dri3 allows us to send handle of a texture directly to X
>>>> so this patch allows a state tracker to directly send its
>>>> texture to X to be used as back buffer and avoids extra
>>>> copying
>>>>
>>>> v2: use clip width/height to display a portion of the surface
>>>> v3: remove redundant variables, fix wrapping, rename variables
>>>>     handle vaapi path
>>>> v3.1: we need clip_width/height for every frame so we don't need
>>>>       to maintain it for each buffer instead use a global variable
>>>> v4: In case of single gpu we can cache the buffers as applications
>>>>     use constant number of buffer and we can avoid calls to present
>>>>     extension for every frame
>>>>
>>>> Suggested-by: Leo Liu
>>>> Signed-off-by: Nayan Deshmukh
>>>
>>> Acked-by: Christian König .
>>>
>>> Andy & Leo did you guys already had a chance to test it? To me it looks
>>> like this should work now.
>>
>> Well there is still the tearing issue from loosing pageflips.
>>
>> Maybe different GPUs don't see this. I can fix by forcing perf but I
>> just tested dal and it's not even fixable running that.
>>
>> I guess that may not count as an issue with these patches as such if
>> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
>> regression until that happens.
>
> Somebody should track down why the buffers sent for presentation in this
> case don't use the same tiling parameters as buffers used for GL via DRI3.
>
I can look into this, but I don't know where to look exactly. Can you give me
some pointers to get started?

Regards,
Nayan
>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-10 Thread Nayan Deshmukh
On Fri, Jan 6, 2017 at 2:20 AM, Andy Furniss  wrote:
> Christian König wrote:
>>
>> Am 04.01.2017 um 18:13 schrieb Nayan Deshmukh:
>>>
>>> dri3 allows us to send handle of a texture directly to X
>>> so this patch allows a state tracker to directly send its
>>> texture to X to be used as back buffer and avoids extra
>>> copying
>>>
>>> v2: use clip width/height to display a portion of the surface
>>> v3: remove redundant variables, fix wrapping, rename variables
>>>  handle vaapi path
>>> v3.1: we need clip_width/height for every frame so we don't need
>>>to maintain it for each buffer instead use a global variable
>>> v4: In case of single gpu we can cache the buffers as applications
>>>  use constant number of buffer and we can avoid calls to present
>>>  extension for every frame
>>>
>>> Suggested-by: Leo Liu 
>>> Signed-off-by: Nayan Deshmukh 
>>
>>
>> Acked-by: Christian König .
>>
>> Andy & Leo did you guys already had a chance to test it? To me it looks
>> like this should work now.
>
>
> Well there is still the tearing issue from loosing pageflips.
>
> Maybe different GPUs don't see this. I can fix by forcing perf but I
> just tested dal and it's not even fixable running that.
>
> I guess that may not count as an issue with these patches as such if
> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
> regression until that happens.
>

That's bad. It should have improved the speed, since less copying is involved.
But it seems there are some problems in the patch. It may be that we somehow
make calls to the present extension on every frame.

Andy, you are using dri3 for testing, right?

Leo, did you also experience tearing issues?

Regards,
Nayan


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-06 Thread Michel Dänzer
On 06/01/17 05:50 AM, Andy Furniss wrote:
> Christian König wrote:
>> Am 04.01.2017 um 18:13 schrieb Nayan Deshmukh:
>>> dri3 allows us to send handle of a texture directly to X
>>> so this patch allows a state tracker to directly send its
>>> texture to X to be used as back buffer and avoids extra
>>> copying
>>>
>>> v2: use clip width/height to display a portion of the surface
>>> v3: remove redundant variables, fix wrapping, rename variables
>>>  handle vaapi path
>>> v3.1: we need clip_width/height for every frame so we don't need
>>>to maintain it for each buffer instead use a global variable
>>> v4: In case of single gpu we can cache the buffers as applications
>>>  use constant number of buffer and we can avoid calls to present
>>>  extension for every frame
>>>
>>> Suggested-by: Leo Liu 
>>> Signed-off-by: Nayan Deshmukh 
>>
>> Acked-by: Christian König .
>>
>> Andy & Leo did you guys already had a chance to test it? To me it looks
>> like this should work now.
> 
> Well there is still the tearing issue from loosing pageflips.
> 
> Maybe different GPUs don't see this. I can fix by forcing perf but I
> just tested dal and it's not even fixable running that.
> 
> I guess that may not count as an issue with these patches as such if
> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
> regression until that happens.

Somebody should track down why the buffers sent for presentation in this
case don't use the same tiling parameters as buffers used for GL via DRI3.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-05 Thread Andy Furniss

Christian König wrote:

> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>> dri3 allows us to send handle of a texture directly to X
>> so this patch allows a state tracker to directly send its
>> texture to X to be used as back buffer and avoids extra
>> copying
>>
>> v2: use clip width/height to display a portion of the surface
>> v3: remove redundant variables, fix wrapping, rename variables
>>     handle vaapi path
>> v3.1: we need clip_width/height for every frame so we don't need
>>       to maintain it for each buffer instead use a global variable
>> v4: In case of single gpu we can cache the buffers as applications
>>     use constant number of buffer and we can avoid calls to present
>>     extension for every frame
>>
>> Suggested-by: Leo Liu
>> Signed-off-by: Nayan Deshmukh
>
> Acked-by: Christian König.
>
> Andy & Leo did you guys already had a chance to test it? To me it looks
> like this should work now.

Well, there is still the tearing issue from losing pageflips.

Maybe different GPUs don't see this. I can fix by forcing perf but I
just tested dal and it's not even fixable running that.

I guess that may not count as an issue with these patches as such if
xorg/xf86-video-amdgpu can work around, but it's a very noticeable
regression until that happens.



Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-05 Thread Leo Liu



On 01/05/2017 05:21 AM, Christian König wrote:

> On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
>> dri3 allows us to send handle of a texture directly to X
>> so this patch allows a state tracker to directly send its
>> texture to X to be used as back buffer and avoids extra
>> copying
>>
>> v2: use clip width/height to display a portion of the surface
>> v3: remove redundant variables, fix wrapping, rename variables
>>     handle vaapi path
>> v3.1: we need clip_width/height for every frame so we don't need
>>       to maintain it for each buffer instead use a global variable
>> v4: In case of single gpu we can cache the buffers as applications
>>     use constant number of buffer and we can avoid calls to present
>>     extension for every frame
>>
>> Suggested-by: Leo Liu
>> Signed-off-by: Nayan Deshmukh
>
> Acked-by: Christian König.
>
> Andy & Leo did you guys already had a chance to test it? To me it
> looks like this should work now.

Nayan and I worked offline on this for the last couple of weeks, and the
patch set has been tested.

This patch is
Reviewed-by: Leo Liu

Regards,
Leo

> If in need I can setup a test system as well.
>
> Regards,
> Christian.



Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-05 Thread Christian König

On 04.01.2017 at 18:13, Nayan Deshmukh wrote:
> dri3 allows us to send handle of a texture directly to X
> so this patch allows a state tracker to directly send its
> texture to X to be used as back buffer and avoids extra
> copying
>
> v2: use clip width/height to display a portion of the surface
> v3: remove redundant variables, fix wrapping, rename variables
>     handle vaapi path
> v3.1: we need clip_width/height for every frame so we don't need
>       to maintain it for each buffer instead use a global variable
> v4: In case of single gpu we can cache the buffers as applications
>     use constant number of buffer and we can avoid calls to present
>     extension for every frame
>
> Suggested-by: Leo Liu
> Signed-off-by: Nayan Deshmukh

Acked-by: Christian König.

Andy & Leo, did you guys already have a chance to test it? To me it looks
like this should work now.

If needed, I can set up a test system as well.

Regards,
Christian.



[Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-04 Thread Nayan Deshmukh
dri3 allows us to send handle of a texture directly to X
so this patch allows a state tracker to directly send its
texture to X to be used as back buffer and avoids extra
copying

v2: use clip width/height to display a portion of the surface
v3: remove redundant variables, fix wrapping, rename variables
handle vaapi path
v3.1: we need clip_width/height for every frame so we don't need
  to maintain it for each buffer instead use a global variable
v4: In case of single gpu we can cache the buffers as applications
use constant number of buffer and we can avoid calls to present
extension for every frame

Suggested-by: Leo Liu 
Signed-off-by: Nayan Deshmukh 
---
 configure.ac  |   2 +-
 src/gallium/auxiliary/vl/vl_winsys.h  |   5 ++
 src/gallium/auxiliary/vl/vl_winsys_dri3.c | 126 ++
 3 files changed, 115 insertions(+), 18 deletions(-)

diff --git a/configure.ac b/configure.ac
index 799f5eb..94aac34 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2078,7 +2078,7 @@ if test "x$enable_xvmc" = xyes -o \
 "x$enable_va" = xyes; then
 if test x"$enable_dri3" = xyes; then
 PKG_CHECK_MODULES([VL], [xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED
- x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
+ xcb-xfixes x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
 else
 PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
 fi
diff --git a/src/gallium/auxiliary/vl/vl_winsys.h b/src/gallium/auxiliary/vl/vl_winsys.h
index 26db9f2..e1f9b27 100644
--- a/src/gallium/auxiliary/vl/vl_winsys.h
+++ b/src/gallium/auxiliary/vl/vl_winsys.h
@@ -59,6 +59,11 @@ struct vl_screen
void *
(*get_private)(struct vl_screen *vscreen);
 
+   void
+   (*set_back_texture_from_output)(struct vl_screen *vscreen,
+   struct pipe_resource *buffer,
+   uint32_t width, uint32_t height);
+
struct pipe_screen *pscreen;
struct pipe_loader_device *dev;
 };
diff --git a/src/gallium/auxiliary/vl/vl_winsys_dri3.c b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
index 2929928..a810dea 100644
--- a/src/gallium/auxiliary/vl/vl_winsys_dri3.c
+++ b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "loader.h"
 
@@ -71,9 +72,12 @@ struct vl_dri3_screen
xcb_special_event_t *special_event;
 
struct pipe_context *pipe;
+   struct pipe_resource *output_texture;
+   uint32_t clip_width, clip_height;
 
struct vl_dri3_buffer *back_buffers[BACK_BUFFER_NUM];
int cur_back;
+   int next_back;
 
struct u_rect dirty_areas[BACK_BUFFER_NUM];
 
@@ -105,7 +109,8 @@ dri3_free_back_buffer(struct vl_dri3_screen *scrn,
xcb_free_pixmap(scrn->conn, buffer->pixmap);
xcb_sync_destroy_fence(scrn->conn, buffer->sync_fence);
xshmfence_unmap_shm(buffer->shm_fence);
-   pipe_resource_reference(&buffer->texture, NULL);
+   if (!scrn->output_texture)
+  pipe_resource_reference(&buffer->texture, NULL);
if (buffer->linear_texture)
pipe_resource_reference(&buffer->linear_texture, NULL);
FREE(buffer);
@@ -236,29 +241,31 @@ dri3_alloc_back_buffer(struct vl_dri3_screen *scrn)
templ.format = PIPE_FORMAT_B8G8R8X8_UNORM;
templ.target = PIPE_TEXTURE_2D;
templ.last_level = 0;
-   templ.width0 = scrn->width;
-   templ.height0 = scrn->height;
+   templ.width0 = (scrn->output_texture) ?
+  scrn->output_texture->width0 : scrn->width;
+   templ.height0 = (scrn->output_texture) ?
+   scrn->output_texture->height0 : scrn->height;
templ.depth0 = 1;
templ.array_size = 1;
 
if (scrn->is_different_gpu) {
-  buffer->texture = scrn->base.pscreen->resource_create(scrn->base.pscreen,
- &templ);
+  buffer->texture = (scrn->output_texture) ? scrn->output_texture :
+ scrn->base.pscreen->resource_create(scrn->base.pscreen, &templ);
   if (!buffer->texture)
  goto unmap_shm;
 
   templ.bind |= PIPE_BIND_SCANOUT | PIPE_BIND_SHARED |
 PIPE_BIND_LINEAR;
-  buffer->linear_texture = scrn->base.pscreen->resource_create(scrn->base.pscreen,
-  &templ);
+  buffer->linear_texture =
+  scrn->base.pscreen->resource_create(scrn->base.pscreen, &templ);
   pixmap_buffer_texture = buffer->linear_texture;
 
   if (!buffer->linear_texture)
  goto no_linear_texture;
} else {
   templ.bind |= PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;
-  buffer->texture = scrn->base.pscreen->resource_create(scrn->base.pscreen,
- &templ);
+  buffer->texture = (scrn->output_texture) ? scrn->output_texture :
+